Data Structures (1) - 61-72

The document discusses deques and hash tables, explaining that a deque is a double-ended queue allowing insertion and deletion from both ends, and can function as both a stack and a queue. It also covers hash tables, which enable O(1) data retrieval using a hash function to map data to specific indices in an array, while addressing collision handling through methods like separate chaining. Additionally, the document outlines characteristics of a good hash function and basic operations of hash tables.


UNIT - 4

DEQUES AND HASHING


Deque:
Deque stands for Double Ended Queue. In an ordinary queue, insertion happens at one end
while deletion happens at the other end. The end at which insertion happens is known as
the rear end, while the end at which deletion happens is known as the front end.

A deque is a linear data structure in which the insertion and deletion operations are
performed at both ends. We can say that a deque is a generalized form of the queue. Let
us look at some properties of the deque. A deque can be used both as a stack and as a queue,
since it permits insertion and deletion at both ends. If insertion and deletion are
restricted to a single end, the deque follows the LIFO rule, in which both insertion and
deletion are performed at the same end; thus, we
conclude that a deque can be used as a stack.

If insertion is performed at one end and deletion at the other, the deque behaves like a
queue, which follows the FIFO rule: an element is inserted at one
end and deleted from the other. Hence, we conclude that a deque can also be used as a
queue.
There are two restricted variants of the deque: the input-restricted queue and the
output-restricted queue. Input-restricted queue: some restrictions are applied to insertion.
In an input-restricted queue, insertion is allowed at only one end, while deletion is allowed
at both ends.

Output-restricted queue: some restrictions are applied

to the deletion operation. In an output-restricted queue, deletion is allowed

at only one end, while insertion is possible at both ends.

Operations on Deque
The following are the operations applied on deque:
 Insert at front
 Delete from front
 Insert at rear
 Delete from rear

Besides insertion and deletion, we can also perform the peek operation on a deque. Through
the peek operation, we can read the front and the rear element of the deque without removing it.
We can perform two more operations on a deque:
isFull(): This function returns a true value if the deque is full; otherwise, it returns a false value.
isEmpty(): This function returns a true value if the deque is empty; otherwise, it returns a false
value.
Memory Representation
The deque can be implemented using two data structures, i.e., a circular array and a doubly
linked list. To implement the deque using a circular array, we first need to know
what a circular array is.
Implementation of Deque using a circular array:
The following are the steps to perform the operations on the
deque: Enqueue operation
1. Initially, the deque is empty, so both front and rear are set to -1,
i.e., f = -1 and r = -1.
2. As the deque is empty, inserting an element from either the front or the rear end gives
the same result. Suppose we insert element 1; then front becomes
0, and rear also becomes 0.

3. Suppose we want to insert the next element at the rear. To insert an element
at the rear end, we first need to increment rear, i.e., rear = rear + 1. Now rear points
to the second element, and front points to the first element.

4. Suppose we again insert an element at the rear end. To insert the
element, we first increment rear, and rear now points to the third element.

5. If we want to insert an element at the front end, we need to decrement the value of
front by 1. But if we decrement front by 1, front points to location -1, which is not a valid
location in an array. So instead we set front to (n - 1), which is 4 when n is 5. Once front is
set, we insert the value as shown in the figure below:

Dequeue Operation
1. Suppose front points to the last element of the array, and we want
to perform a delete operation from the front. To delete an element from the front, we would
set front = front + 1. At present, the value of front is 4, and
incrementing it gives 5, which is not a valid index. Therefore,
we conclude that if front points to the last element, then front is wrapped around to 0
on a delete operation.

2. If we want to delete the element from the rear end, then we need to decrement the rear value by 1,
i.e., rear = rear - 1, as shown in the figure below:

3. If rear points to the first element, and we want to delete the
element from the rear end, then we need to set rear = n - 1, where n is the size of the array, as
shown in the figure below:
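The steps above can be sketched in C. This is a minimal sketch assuming a fixed capacity of 5; the function and variable names are illustrative, and each operation reports success or failure instead of printing:

```c
#include <stdio.h>

#define N 5  /* capacity of the circular array (assumed for this sketch) */

int deque[N];
int front = -1, rear = -1;  /* -1 means the deque is empty */

/* Insert at the rear: increment rear, wrapping around with modulo. */
int insert_rear(int x) {
    if ((rear + 1) % N == front && front != -1) return 0;  /* full */
    if (front == -1) front = rear = 0;                     /* first element */
    else rear = (rear + 1) % N;
    deque[rear] = x;
    return 1;
}

/* Insert at the front: decrement front, wrapping -1 around to N - 1. */
int insert_front(int x) {
    if ((rear + 1) % N == front && front != -1) return 0;  /* full */
    if (front == -1) front = rear = 0;
    else front = (front - 1 + N) % N;
    deque[front] = x;
    return 1;
}

/* Delete from the front: increment front, wrapping N - 1 around to 0. */
int delete_front(int *out) {
    if (front == -1) return 0;                             /* empty */
    *out = deque[front];
    if (front == rear) front = rear = -1;                  /* last element */
    else front = (front + 1) % N;
    return 1;
}

/* Delete from the rear: decrement rear, wrapping 0 around to N - 1. */
int delete_rear(int *out) {
    if (front == -1) return 0;
    *out = deque[rear];
    if (front == rear) front = rear = -1;
    else rear = (rear - 1 + N) % N;
    return 1;
}
```

Inserting 1 and 2 at the rear and then 0 at the front reproduces step 5 above: front wraps from 0 to 4.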

Applications of Deque
The deque can be used as both a stack and a queue; consequently, it can perform both undo and redo
operations.
It can be used as a palindrome checker: if we read the string
from both ends and the two readings match, the string is a palindrome.
It can be used for multiprocessor scheduling. Suppose we have two processors, and each
processor has one process to execute. Each processor is assigned a process or a
task, and each process contains multiple threads. Each processor keeps a deque that contains
threads that are ready to execute. The processor executes a thread, and if
a thread creates a child thread, that child is inserted at the front of the deque
of the parent process. Suppose processor P2 has finished the execution of all of its
threads; it then takes a thread from the rear end of processor P1's deque and adds it to the front end of
its own deque. Processor P2 will take threads from the front end; thus, deletion
takes place at both ends, i.e., front and rear. This is known as the A-steal algorithm for
scheduling.
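The palindrome application above can be sketched by treating the string as a deque of characters and comparing the elements removed from the two ends. This is a sketch; is_palindrome is an illustrative name, and two indices stand in for the deque's front and rear:

```c
#include <string.h>

/* Treats the string as a deque of characters: repeatedly compare the
 * element taken from the front with the element taken from the rear.
 * Returns 1 if the string is a palindrome, 0 otherwise. */
int is_palindrome(const char *s) {
    int front = 0;
    int rear = (int)strlen(s) - 1;
    while (front < rear) {
        if (s[front] != s[rear])
            return 0;  /* the two ends disagree */
        front++;       /* delete from the front */
        rear--;        /* delete from the rear */
    }
    return 1;
}
```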

Hash Tables :
Introduction:

We've seen searches that allow you to look through data in O(n) time, and searches that allow you to look
through data in O(log n) time, but imagine a way to find exactly what you want in O(1) time. Think it's not
possible? Think again! Hash tables allow the storage and retrieval of data in O(1) average time.

At its most basic level, a hash table data structure is just an array. Data is stored into this array at specific
indices designated by a hash function. A hash function is a mapping between the set of input data and a set
of integers.
With hash tables, there always exists the possibility that two data elements will hash to the same integer
value. When this happens, a collision results (two data members try to occupy the same place in the hash
table array), and methods have been devised to deal with such situations. In this guide, we will cover two
methods, linear probing and separate chaining, focusing on the latter.

A hash table is made up of two parts: an array (the actual table where the data to be searched is stored) and a
mapping function, known as a hash function. The hash function is a mapping from the input space to the
integer space that defines the indices of the array. In other words, the hash function provides a way for
assigning numbers to the input data such that the data can then be stored at the array index corresponding to
the assigned number.
Let's take a simple example. First, we start with a hash table array of strings (we'll use strings as the data
being stored and searched in this example). Let's say the hash table size is 12:
Next we need a hash function. There are many possible ways to construct a hash function. We'll discuss
these possibilities more in the next section. For now, let's assume a simple hash function that takes a string
as input. The returned hash value will be the sum of the ASCII characters that make up the string, mod the
size of the table:

int hash(char *str, int table_size) {
    int sum = 0;  /* must be initialized to 0, or the result is undefined */

    /* Make sure a valid string passed in */
    if (str == NULL)
        return -1;

    /* Sum up all the characters in the string */
    for (; *str; str++)
        sum += *str;

    /* Return the sum mod the table size */
    return sum % table_size;
}

We run "Steve" through the hash function, and find that hash("Steve", 12) yields 3:
Figure %: The hash table after inserting "Steve"

Let's try another string: "Spark". We run the string through the hash function and find
that hash("Spark",12) yields 6. Fine. We insert it into the hash table:

Figure %: The hash table after inserting "Spark"


Let's try another: "Notes". We run "Notes" through the hash function and find that hash("Notes",12) is 3. Ok.
We insert it into the hash table:

Figure %: A hash table collision


What happened? A hash function doesn't guarantee that every input will map to a different output. There is
always the chance that two inputs will hash to the same output. This indicates that both elements should be
inserted at the same place in the array, and this is impossible. This phenomenon is known as a collision.
There are many algorithms for dealing with collisions, such as linear probing and separate chaining. While
each of the methods has its advantages, we will only discuss separate chaining here.
Separate chaining requires a slight modification to the data structure. Instead of storing the data elements
right into the array, they are stored in linked lists. Each slot in the array then points to one of these linked
lists. When an element hashes to a value, it is added to the linked list at that index in the array. Because a
linked list has no limit on length, collisions are no longer a problem. If more than one element hashes to the
same value, then both
are stored in that linked list.
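The scheme above can be sketched in C, reusing the character-sum hash function from the earlier example. This is a minimal sketch under stated assumptions: the node structure and the names insert_chain and search_chain are illustrative, and malloc error checking is omitted:

```c
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 12   /* matches the example table size used above */

/* Each slot points to the head of a linked list of colliding strings. */
struct Node {
    char *str;
    struct Node *next;
};
static struct Node *table[TABLE_SIZE];   /* all heads start out NULL */

/* The character-sum hash from the example above. */
int hash(const char *str, int table_size) {
    int sum = 0;
    for (; *str; str++)
        sum += *str;
    return sum % table_size;
}

/* Insert at the head of the hashed slot's list; a collision simply
 * extends the list, so insertion never fails. */
void insert_chain(const char *str) {
    int i = hash(str, TABLE_SIZE);
    struct Node *n = malloc(sizeof *n);
    n->str = malloc(strlen(str) + 1);
    strcpy(n->str, str);
    n->next = table[i];
    table[i] = n;
}

/* Walk the hashed slot's list: O(1) on average, O(n) in the worst case. */
int search_chain(const char *str) {
    for (struct Node *n = table[hash(str, TABLE_SIZE)]; n != NULL; n = n->next)
        if (strcmp(n->str, str) == 0)
            return 1;
    return 0;
}
```

After inserting "Steve" and "Notes", both are found even when they land in the same list, because a lookup walks the chain rather than stopping at the first occupant.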

Let's look at the above example again, this time with our modified data structure:

Figure %: Modified table for separate chaining

Again, let's try adding "Steve" which hashes to 3:

Figure %: After adding "Steve" to the table

And "Spark" which hashes to 6:
Problem : How does a hash table allow for O(1) searching? What is the worst case efficiency of a look up
in a hash table using separate chaining?

A hash table uses hash functions to compute an integer value for data. This integer value can then be
used as an index into an array, giving us constant-time access to the requested data. However, using
separate chaining, we won't always achieve the best and average case efficiency of O(1). If we have too
small a hash table for the data set size and/or a bad hash function, elements can start to build up at one index in
the array. Theoretically, all n elements could end up in the same linked list. Therefore, a search in the
worst case is equivalent to looking up a data element in a linked list, something we already know to be O(n)
time. However, with a good hash function and a well-created hash table, the chances of this happening are,
for all intents and purposes, ignorable.

Problem : The bigger the ratio between the size of the hash table
and the number of data elements, the less chance there is for collision. What is a drawback to making the
hash table big enough so that the chance of collision is ignorable?
Wasted memory space

Problem : How could a linked list and a hash table be combined to allow someone to run through the list
from item to item while still maintaining the ability to access an individual element in O(1) time?

Hash Functions

As mentioned briefly in the previous section, there are multiple ways of constructing a hash function.
Remember that a hash function takes the data as input (often a string), and returns an integer in the range of
possible indices into the hash table. Every hash function must do that, including the bad ones. So what
makes for a good hash function?

Characteristics of a Good Hash Function

There are four main characteristics of a good hash function:


1) The hash value is fully determined by the data being hashed.
2) The hash function uses all the input data.
3) The hash function "uniformly" distributes the data across the entire set of possible hash
values.
4) The hash function generates very different hash values for similar strings.

Let's examine why each of these is important:

Rule 1: If something else besides the input data is used to determine the hash, then the hash value is not as
dependent upon the input data, thus allowing for a worse distribution of the hash values.

Rule 2: If the hash function doesn't use all the input data, then slight variations to the input data would
cause an inappropriate number of similar hash values, resulting in too many collisions.
Rule 3: If the hash function does not uniformly distribute the data across the entire set of possible hash
values, a large number of collisions will result, cutting down on the efficiency of the hash table.
Rule 4: In real world applications, many data sets contain very similar data elements. We would like these elements to still be distributed evenly over the hash table, which is why similar strings should produce very different hash values.

A Hash Table is a data structure which stores data in an associative manner. In a hash table, data is stored
in an array format, where each data value has its own unique index value. Access of data becomes very fast
if we know the index of the desired data.

Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of
the size of the data. A hash table uses an array as a storage medium and uses a hashing technique to generate an index
at which an element is to be inserted or located.

Hashing
Hashing is a technique to convert a range of key values into a range of indices of an array. We are going
to use the modulo operator to map keys to indices. Consider an example of a hash table of size 20, where the
following items are to be stored. Items are in the (key,value) format.
 (1,20)

 (2,70)

 (42,80)

 (4,25)

 (12,44)

 (14,32)

 (17,11)

 (13,78)

 (37,98)
Sr.No. Key Hash Array Index

1 1 1 % 20 = 1 1

2 2 2 % 20 = 2 2

3 42 42 % 20 = 2 2

4 4 4 % 20 = 4 4

5 12 12 % 20 = 12 12

6 14 14 % 20 = 14 14

7 17 17 % 20 = 17 17

8 13 13 % 20 = 13 13

9 37 37 % 20 = 17 17

Linear Probing
As we can see, the hashing technique may yield an index of the array that is already in use.
In such a case, we can search for the next empty location in the array by looking into the
next cell until we find an empty cell. This technique is called linear probing.

Array after linear probing:

Sr.No.  Key  Hash          Array Index  Index After Linear Probing
1       1    1 % 20 = 1    1            1
2       2    2 % 20 = 2    2            2
3       42   42 % 20 = 2   2            3
4       4    4 % 20 = 4    4            4
5       12   12 % 20 = 12  12           12
6       14   14 % 20 = 14  14           14
7       17   17 % 20 = 17  17           17
8       13   13 % 20 = 13  13           13
9       37   37 % 20 = 17  17           18

Basic Operations
Following are the basic primary operations of a hash table.
Search − Searches for an element in a hash table.
Insert − Inserts an element into a hash table.
Delete − Deletes an element from a hash table.

DataItem
Define a data item having some data and a key, based on which the search is to be conducted in
a hash table.
struct DataItem {
    int data;
    int key;
};

Hash Method
Define a hashing method to compute the hash code of the key of the data item.
int hashCode(int key){
return key % SIZE;
}

Search Operation
Whenever an element is to be searched, compute the hash code of the key passed and locate the
element using that hash code as an index in the array. Use linear probing to move to the next cell if the
element is not found at the computed hash code.

Insert Operation
Whenever an element is to be inserted, compute the hash code of the key passed and locate the index
using that hash code as an index in the array. Use linear probing to find an empty location if an element is
already present at the computed hash code.

Delete Operation
Whenever an element is to be deleted, compute the hash code of the key passed and locate the index
using that hash code as an index in the array. Use linear probing to move to the next cell if the
element is not found at the computed hash code. When found, store a dummy item there to keep the
performance of the hash table intact.
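The three operations above can be sketched together in C, reusing the DataItem structure and hashCode method defined earlier. This is a minimal sketch: the dummy-item mechanics, the probe limit, and the name delete_item are illustrative choices, and insert assumes the table never becomes completely full:

```c
#include <stdlib.h>

#define SIZE 20

struct DataItem {
    int data;
    int key;
};

static struct DataItem *hashArray[SIZE];         /* NULL marks an empty slot */
static struct DataItem deletedItem = { -1, -1 }; /* dummy marking deleted slots */

int hashCode(int key) {
    return key % SIZE;
}

/* Insert: probe forward until an empty or deleted slot is found.
 * Assumes the table is never completely full. */
void insert(int key, int data) {
    struct DataItem *item = malloc(sizeof *item);
    item->key = key;
    item->data = data;
    int i = hashCode(key);
    while (hashArray[i] != NULL && hashArray[i] != &deletedItem)
        i = (i + 1) % SIZE;                      /* linear probing */
    hashArray[i] = item;
}

/* Search: probe until the key is found or an empty slot ends the search.
 * Deleted slots do not stop the probe, which is why the dummy is needed. */
struct DataItem *search(int key) {
    int i = hashCode(key);
    for (int probes = 0; probes < SIZE && hashArray[i] != NULL; probes++) {
        if (hashArray[i] != &deletedItem && hashArray[i]->key == key)
            return hashArray[i];
        i = (i + 1) % SIZE;
    }
    return NULL;
}

/* Delete: locate the item's slot and store the dummy item there. */
struct DataItem *delete_item(int key) {
    int i = hashCode(key);
    for (int probes = 0; probes < SIZE && hashArray[i] != NULL; probes++) {
        if (hashArray[i] != &deletedItem && hashArray[i]->key == key) {
            struct DataItem *found = hashArray[i];
            hashArray[i] = &deletedItem;
            return found;
        }
        i = (i + 1) % SIZE;
    }
    return NULL;
}
```

With keys 2 and 42 (which both hash to 2), deleting 2 leaves the dummy in slot 2, so a later search for 42 still probes past it and succeeds.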

Open Addressing
Like separate chaining, open addressing is a method for handling collisions. In open addressing, all
elements are stored in the hash table itself. So at any point, the size of the table must be greater than
or equal to the total number of keys (note that we can increase the table size by copying old data if
needed).
Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k.
Search(k): Keep probing until the slot's key becomes equal to k or an empty slot is
reached.
Delete(k): The delete operation is interesting. If we simply empty a key's slot, then a later search may fail.
So slots of deleted keys are marked specially as "deleted". An
insert can insert an item in a deleted slot, but a search doesn't stop at a deleted slot.
Open addressing is done in the following ways:
Open Addressing is done in the following ways:

a) Linear Probing: In linear probing, we linearly probe for the next slot; the gap
between two probes is 1, as seen in the example below. Let hash(x) be the slot index
computed using a hash function and S be the table size.
If slot hash(x) % S is full, then we try (hash(x) + 1) % S.
If (hash(x) + 1) % S is also full, then we try (hash(x) + 2) % S.
If (hash(x) + 2) % S is also full, then we try (hash(x) + 3) % S.
Let us consider a simple hash function "key mod 7" and a sequence of keys 50, 700, 76, 85, 92,
73, 101.
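Under these assumptions ("key mod 7", table size 7, linear probing), the final slot of each key in the sequence can be traced with a small sketch; insert_lp is an illustrative name:

```c
#define S 7

static int slots[S] = { -1, -1, -1, -1, -1, -1, -1 };  /* -1 marks an empty slot */

/* Insert with hash "key mod 7" and linear probing; returns the slot
 * where the key finally lands. Assumes at least one slot is free. */
int insert_lp(int key) {
    int i = key % S;
    while (slots[i] != -1)
        i = (i + 1) % S;   /* slot occupied: try the next one */
    slots[i] = key;
    return i;
}
```

Inserting the keys in order: 50 and 700 and 76 land directly in slots 1, 0, and 6; 85 and 92 collide at slot 1 and probe on to 2 and 3; 73 collides at 3 and lands in 4; 101 collides at 3 and lands in 5.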

Challenges in Linear Probing:

1. Primary Clustering: One of the problems with linear probing is primary
clustering: many consecutive elements form groups, and it starts taking time to find
a free slot or to search for an element.
2. Secondary Clustering: Secondary clustering is less severe; two records have
the same collision chain (probe sequence) only if their initial position is the same.
b) Quadratic Probing: We look for the i²-th slot in the i-th iteration. Let hash(x) be the
slot index computed using the hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*1) % S.
If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S.
If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S.

c) Double Hashing: We use another hash function hash2(x) and look for the i*hash2(x)
slot in the i-th iteration.
Let hash(x) be the slot index computed using the hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S.
If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S.
If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S.
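Since the three strategies differ only in how the i-th probe index is computed, they can be summarized in three one-line helpers (a sketch; the function names are illustrative, and hash2 must never evaluate to 0 or double hashing would stop moving):

```c
/* i-th probe index for each open-addressing strategy, for a table of size S.
 * h is hash(x), h2 is hash2(x), and i counts the probes starting at 0. */
int linear_probe(int h, int i, int S)              { return (h + i) % S; }
int quadratic_probe(int h, int i, int S)           { return (h + i * i) % S; }
int double_hash_probe(int h, int h2, int i, int S) { return (h + i * h2) % S; }
```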

Comparison:
Linear probing has the best cache performance but suffers from clustering. Another advantage of linear
probing is that it is easy to compute. Quadratic probing lies between the two in terms of cache performance
and clustering. Double hashing has poor cache performance but no clustering; it requires more computation
time, as two hash functions need to be computed.

Separate Chaining vs. Open Addressing

1. Chaining is simpler to implement; open addressing requires more computation.

2. In chaining, the hash table never fills up (we can always add more elements to a chain); in open addressing, the table may become full.

3. Chaining is less sensitive to the hash function or load factor; open addressing requires extra care to avoid clustering and a high load factor.

4. Chaining is mostly used when it is unknown how many and how frequently keys may be inserted or deleted; open addressing is used when the frequency and number of keys is known.

5. Cache performance of chaining is not good, as keys are stored in linked lists; open addressing provides better cache performance, as everything is stored in the same table.

6. Chaining wastes space (some parts of the hash table are never used); in open addressing, a slot can be used even if no input maps to it.
