Data Structures
Mrs. N.HEMALATHA
ASST.PROFESSOR
Objectives:
Unit-1: Introduction and overview: Asymptotic Notations, One Dimensional array- Multi Dimensional array-
pointer arrays.
Linked lists: Definition- Single linked list- Circular linked list- Double linked list- Circular Double linked list-
Application of linked lists.
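Unit-2: Stacks: Introduction- Definition- Representation of Stack- Operations on Stacks- Applications of Stacks.
Queues: Introduction, Definition- Representations of Queues- Various Queue Structures- Applications of Queues.
Tables: Hash tables.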
Unit-3:Trees: Basic Terminologies- Definition and Concepts- Representations of Binary Tree- Operation on a
Binary Tree- Types of Binary Trees-Binary Search Tree, Heap Trees, Height Balanced Trees, B. Trees, Red Black
Trees. Graphs: Introduction- Graph terminologies- Representation of graphs- Operations on Graphs- Application
of Graph Structures: Shortest path problem- topological sorting.
Unit-4:Sorting : Sorting Techniques- Sorting by Insertion: Straight Insertion sort- List insertion sort- Binary
insertion sort- Sorting by selection: Straight selection sort- Heap Sort- Sorting by Exchange- Bubble Sort- Shell
Sort-Quick Sort-External Sorts: Merging Order Files-Merging Unorder Files- Sorting Process.
Unit-5:Searching: List Searches- Sequential Search- Variations on Sequential Searches- Binary Search-
Analyzing Search Algorithm- Hashed List Searches- Basic Concepts- Hashing Methods- Collision Resolutions-
Open Addressing- Linked List Collision Resolution- Bucket Hashing.
Text Books:
Reference Books:
1. Fundamentals of Data Structures in C – Horowitz, Sahni, Anderson- Freed, Universities Press, Second Edition.
3. Data structures and Algorithms using C++, Ananda Rao Akepogu and Radhika Raju Palagiri, Pearson
Education.
Unit-1 Introduction and overview: Asymptotic Notations, One Dimensional array , Multi
Dimensional array- pointer arrays. Linked lists: Definition- Single linked list- Circular
linked list- Double linked list- Circular Double linked list
The term DATA STRUCTURE is used to describe the way data is stored, and the term
algorithm is used to describe the way data is processed. Data structures and algorithms are
interrelated. Choosing a data structure affects the kind of algorithm we might use, and
choosing an algorithm affects the data structures we use.
An Algorithm is a finite sequence of instructions, each of which has a clear meaning
and can be performed with a finite amount of effort in a finite length of time. No matter what
the input values may be, an algorithm terminates after executing a finite number of
instructions.
Data Structures
Primitive data structures are the basic data structures that are operated upon directly by machine
instructions. They have different representations on different computers. Integers, floating point
numbers, character constants, string constants and pointers come under this category.
Non-primitive data structures are more complicated data structures and are derived from
primitive data structures. They emphasize grouping data items of the same or different types, with a
relationship between the data items. Arrays, lists and files come under this category. Figure 1.1
shows the classification of data structures.
Data structures: Organization of data
The collection of data you work with in a program has some kind of structure or organization.
No matter how complex your data structures are, they can be broken down into two fundamental
types:
• Contiguous
• Non-Contiguous.
In contiguous structures, items of data are kept together in memory (either in RAM or in a
file). An array is an example of a contiguous structure, since each element in the array is
located next to one or two other elements. In contrast, items in a non-contiguous structure are
scattered in memory, but are linked to each other in some way. A linked list is an example of a
non-contiguous data structure. Here, the nodes of the list are linked together using pointers
stored in each node. Figure 1.2 illustrates the difference between contiguous and
non-contiguous structures.
Contiguous structures:
Contiguous structures can be broken down further into two kinds: those that contain
data items of all the same size, and those where the size may differ. Figure 1.3 shows an example
of each kind. The first kind is called the array. Figure 1.3(a) shows an example of an array of
numbers. In an array, each element is of the same type, and thus has the same size.
The second kind of contiguous structure is called a structure (struct). Figure 1.3(b) shows a simple
structure consisting of a person's name and age. In a struct, elements may be of different data
types and thus may have different sizes.
For example, a person's age can be represented with a simple integer that occupies, say, two
bytes of memory. But his or her name, represented as a string of characters, may require many
bytes and may even be of varying length. Coupled with the atomic types (that is, the single
data-item built-in types such as integer, float and pointers), arrays and structs provide all the
"mortar" you need to build more exotic forms of data structure, including the non-contiguous
forms.
Figure 1.3: Examples of contiguous structures.
(a) Array
int arr[3] = {1, 2, 3};
(b) struct
struct cust_data
{
    int age;
    char name[20];
};
Figure: examples of non-contiguous data structures: (a) Linked List, (b) Tree, (c) Graph.
1.2 Arrays
An array is a kind of data structure that can store a fixed-size sequential collection of
elements of the same type. An array is used to store a collection of data, but it is often more
useful to think of an array as a collection of variables of the same type.
Instead of declaring individual variables, such as number0, number1, ..., and number99,
you declare one array variable such as numbers and use numbers[0], numbers[1], and ...,
numbers[99] to represent individual variables. A specific element in an array is accessed by
an index.
All arrays consist of contiguous memory locations. The lowest address corresponds to the
first element and the highest address to the last element.
Declaring Arrays
To declare an array in C, a programmer specifies the type of the elements and the number
of elements required by an array as follows −
double balance[10];
When an array is initialized, the number of values between braces { } cannot be larger than
the number of elements that we declare for the array between square brackets [ ].
If you omit the size of the array, an array just big enough to hold the initialization is
created. Therefore, an initializer with five values and empty square brackets creates exactly
the same array as a declaration of size 5 with the same five values. Following is an
example to assign a single element of the array −
balance[4] = 50.0;
The above statement assigns the value 50.0 to the 5th element in the array. All arrays
have 0 as the index of their first element, which is also called the base index, and the last
index of an array is the total size of the array minus 1. Shown below is the pictorial
representation of the array we discussed above.
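The declaration and initialization rules above can be tried out with a small sketch. The initial values, the ARRAY_LEN macro and the helper names sum_elements() and implicit_size_demo() are illustrative assumptions, not part of these notes.

```c
#include <stddef.h>

/* Number of elements in a true array (does not work on a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: adds up the first n elements of values. */
double sum_elements(const double *values, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += values[i];          /* each element is reached by its index */
    return total;
}

/* Hypothetical helper: shows that omitting the size lets the
   initializer decide how many elements the array has. */
size_t implicit_size_demo(void)
{
    double balance[] = {1000.0, 2.0, 3.4, 7.0, 50.0};  /* size inferred as 5 */
    balance[4] = 50.0;               /* index 4 is the 5th (last) element */
    return ARRAY_LEN(balance);
}
```

Here implicit_size_demo() returns 5, the number of initializers, and the valid indices run from 0 to 4.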
Types of Arrays
1. Single Dimensional Array:
1. A single or one dimensional array is used to represent and store data in a linear form.
2. An array having only one subscript variable is called a One-Dimensional array.
3. It is also called a Single Dimensional Array or Linear Array.
Syntax :
int arr[5]={ 1, 2, 3, 4, 5 };
Assuming that the base address of arr is 1000 and each integer requires two bytes, the five
elements will be stored at addresses 1000, 1002, 1004, 1006 and 1008.
Here the name arr gives the base address: in an expression it acts as a constant pointer to the
first element, arr[0]. Therefore arr evaluates to the address of arr[0], i.e. 1000.
int *p;
p = arr;
Now we can access every element of array arr using p++ to move from one element to
another.
NOTE : A pointer that has been incremented can also be moved back with p--, provided it stays
within the array (or one position past its last element).
Pointer to Array
As studied above, we can use a pointer to point to an array, and then we can use that pointer
to access the array. Let's look at an example:
int i;
int a[5] = {1, 2, 3, 4, 5};
int *p = a; // same as int*p = &a[0]
for (i=0; i<5; i++)
{
printf("%d", *p);
p++;
}
In the above program, dereferencing the pointer p prints all the values stored in the array one by one.
We can also use the base address (a in the above case) as a pointer and print all the values.
The same idea applies to a multi-dimensional array. As we know now, the name of the array gives its
base address. In a[i][j], a gives the base address of the array, and a+0+0 has the same address
value, namely the address of the element a[0][0].
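A short sketch of pointer movement over an array follows. The function names sum_forward(), copy_reversed() and same_base_address() are hypothetical helpers chosen for illustration; the sketch assumes the pointer always stays within the array or one position past its end.

```c
#include <stddef.h>

/* Walks a pointer forward with p++ and adds up n elements. */
int sum_forward(const int *a, size_t n)
{
    const int *p = a;                /* same as &a[0] */
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += *p;
        p++;                         /* move to the next element */
    }
    return total;
}

/* Walks a pointer backward with p-- and copies the elements in reverse. */
void copy_reversed(const int *a, size_t n, int *out)
{
    const int *p = a + n;            /* one past the last element */
    for (size_t i = 0; i < n; i++) {
        p--;                         /* step back one element */
        out[i] = *p;
    }
}

/* Returns 1 when the name of a 2-D array and &m[0][0] hold the same address. */
int same_base_address(void)
{
    int m[2][3] = {{1, 2, 3}, {4, 5, 6}};
    return (void *)m == (void *)&m[0][0];
}
```

For a[5] = {1, 2, 3, 4, 5}, sum_forward(a, 5) gives 15 and copy_reversed() fills its output with 5, 4, 3, 2, 1.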
A single linked list is one in which all nodes are linked together in some sequential
manner. Hence, it is also called as linear linked list.
A double linked list is one in which all nodes are linked together by multiple links
which helps in accessing both the successor node (next node) and predecessor node
(previous node) from any arbitrary node within the list. Therefore each node in a double
linked list has two link fields (pointers) to point to the left node (previous) and the right
node (next). This helps to traverse in forward direction and backward direction.
A circular linked list is one, which has no beginning and no end. A single linked list
can be made a circular linked list by simply storing address of the very first node in the link
field of the last node.
A circular double linked list is one, which has both the successor pointer and
predecessor pointer in the circular manner.
A linked list allocates space for each element separately in its own block of memory called a
"node". The list gets an overall structure by using pointers to connect all its nodes together
like the links in a chain. Each node contains two fields; a "data" field to store whatever
element, and a "next" field which is a pointer used to link to the next node. Each node is
allocated in the heap using malloc(), so the node memory continues to exist until it is
explicitly de-allocated using free(). The front of the list is a pointer to the “start” node.
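Figure: a single linked list in memory. The start pointer (on the STACK) holds 100, the address
of the first node of the list. The nodes are in the HEAP: each node stores the data and the
address of the next node. Node at 100: data 10, next 200. Node at 200: data 20, next 300.
Node at 300: data 30, next 400. Node at 400: data 40, next X (NULL).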
The beginning of the linked list is stored in a "start" pointer which points to the first node.
The first node contains a pointer to the second node. The second node contains a pointer to
the third node, and so on. The last node in the list has its next field set to NULL to mark the
end of the list. Code can access any node in the list by starting at the start and following the
next pointers.
The start pointer is an ordinary local pointer variable, so it is drawn separately on the left
top to show that it is in the stack. The list nodes are drawn on the right to show that they are
allocated in the heap.
The start pointer is made to point to the new node by assigning it the address of
the new node. Repeat the above steps 'n' times.
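A minimal sketch of node creation and list building is given here, assuming a node that holds an int. The names make_node(), create_list() and free_list() are hypothetical; unlike the getnode() used in these notes, make_node() takes the data as a parameter instead of reading it from the user.

```c
#include <stdlib.h>

/* One node of a single linked list: a data field and a next pointer. */
typedef struct slinklist {
    int data;
    struct slinklist *next;
} node;

node *start = NULL;                  /* points to the first node of the list */

/* Allocates a node in the heap; returns NULL if malloc() fails. */
node *make_node(int data)
{
    node *newnode = malloc(sizeof *newnode);
    if (newnode == NULL)
        return NULL;
    newnode->data = data;
    newnode->next = NULL;            /* a new node is not linked to anything yet */
    return newnode;
}

/* Appends n nodes holding values[0..n-1]; returns how many were created. */
int create_list(const int *values, int n)
{
    int made = 0;
    for (int i = 0; i < n; i++) {
        node *newnode = make_node(values[i]);
        if (newnode == NULL)
            break;
        if (start == NULL) {
            start = newnode;         /* first node: start points to it */
        } else {
            node *temp = start;
            while (temp->next != NULL)
                temp = temp->next;   /* walk to the last node */
            temp->next = newnode;
        }
        made++;
    }
    return made;
}

/* Releases every node with free() and leaves the list empty. */
void free_list(void)
{
    while (start != NULL) {
        node *temp = start;
        start = start->next;
        free(temp);
    }
}
```

Calling create_list() with the values 10, 20, 30, 40 builds the four-node list drawn in the figure, with the next field of the last node set to NULL.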
Insertion of a Node:
One of the most primitive operations that can be done in a singly linked list is the
insertion of a node. Memory is to be allocated for the new node (in a similar way that is done
while creating a list) before reading the data. The new node will contain empty data field and
empty next field. The data field of the new node is then stored with the information read
from the user. The next field of the new node is assigned to NULL. The new node can then
be inserted at three different places namely:
• Inserting a node at the beginning.
• Inserting a node at the end.
• Inserting a node at intermediate position.
Inserting a node at the beginning:
The following steps are to be followed to insert a new node at the beginning of the list:
• Get the new node using getnode().
newnode = getnode();
• If the list is empty then start = newnode.
• If the list is not empty, follow the steps given below:
newnode -> next = start;
start = newnode;
The function insert_at_beg() is used for inserting a node at the beginning.
void insert_at_beg()
{
    node *newnode;
    newnode = getnode();
    if(start == NULL)
    {
        start = newnode;
    }
    else
    {
        newnode -> next = start;
        start = newnode;
    }
}
Inserting a node at the end:
The following steps are followed to insert a new node at the end of the list:
• Get the new node using getnode()
newnode = getnode();
• If the list is empty then start = newnode.
• If the list is not empty, follow the steps given below:
temp = start;
while(temp -> next != NULL)
    temp = temp -> next;
temp -> next = newnode;
The function insert_at_end() is used for inserting a node at the end.
void insert_at_end()
{
    node *newnode, *temp;
    newnode = getnode();
    if(start == NULL)
    {
        start = newnode;
    }
    else
    {
        temp = start;
        while(temp -> next != NULL)
            temp = temp -> next;
        temp -> next = newnode;
    }
}
Inserting a node at intermediate position:
The following steps are followed, to insert a new node in an intermediate position in the
list:
• Ensure that the specified position is in between the first node and the last node. If not,
the specified position is invalid. This is done by the countnode() function.
• Store the starting address (which is in the start pointer) in the temp and prev pointers.
Then advance the temp pointer up to the specified position, with the prev pointer
following one node behind.
• After reaching the specified position, follow the steps given below:
prev -> next = newnode;
newnode -> next = temp;
The function insert_at_mid(), is used for inserting a node in the intermediate position.
void insert_at_mid()
{
node *newnode, *temp, *prev;
int pos, nodectr, ctr = 1;
newnode = getnode();
printf("\n Enter the position: ");
scanf("%d", &pos);
nodectr = countnode(start);
if(pos > 1 && pos < nodectr)
{
temp = prev = start;
while(ctr < pos)
{
prev = temp;
temp = temp -> next;
ctr++;
}
prev -> next = newnode;
newnode -> next = temp;
}
else
{
printf("position %d is not a middle position", pos);
free(newnode); /* release the node that was not inserted */
}
}
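The position check in insert_at_mid() relies on countnode(), which is not listed in these notes. A minimal sketch of such a counting function is shown here; the node layout and the helper is_middle_position() are assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed node layout: an int data field and a next pointer. */
typedef struct slinklist {
    int data;
    struct slinklist *next;
} node;

/* Counts the nodes by following the next pointers until NULL is reached. */
int countnode(const node *st)
{
    int count = 0;
    while (st != NULL) {
        count++;
        st = st->next;
    }
    return count;
}

/* Hypothetical helper: 1 when pos lies strictly between the first
   and the last node, which is the test used by insert_at_mid(). */
int is_middle_position(const node *st, int pos)
{
    return pos > 1 && pos < countnode(st);
}
```

With three nodes in the list, only position 2 passes the test; positions 1 and 3 are handled by the routines for the beginning and the end.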
Deletion of a node:
Another primitive operation that can be done in a singly linked list is the deletion
of a node. The memory of the deleted node is to be released. A node can be deleted
from three different places in the list, namely:
• Deleting a node at the beginning.
• Deleting a node at the end.
• Deleting a node at intermediate position.
Deleting a node at the beginning:
The following steps are followed, to delete a node at the beginning of the list:
• If the list is empty then display the 'Empty List' message.
• If the list is not empty, follow the steps given below:
temp = start;
start = start -> next;
free(temp);
The function delete_at_beg() is used for deleting the first node in the list.
void delete_at_beg()
{
    node *temp;
    if(start == NULL)
    {
        printf("\n No nodes exist..");
        return;
    }
    else
    {
        temp = start;
        start = temp -> next;
        free(temp);
        printf("\n Node deleted ");
    }
}
Deleting a node at the end:
The following steps are followed to delete a node at the end of the list:
• If the list is empty then display the 'Empty List' message.
• If the list is not empty, follow the steps given below:
temp = prev = start;
while(temp -> next != NULL)
{
prev = temp;
temp = temp -> next;
}
prev -> next = NULL;
free(temp);
The function delete_at_last() is used for deleting the last node in the list.
void delete_at_last()
{
    node *temp, *prev;
    if(start == NULL)
    {
        printf("\n Empty List..");
        return;
    }
    else
    {
        temp = start;
        prev = start;
        while(temp -> next != NULL)
        {
            prev = temp;
            temp = temp -> next;
        }
        if(temp == start)
            start = NULL; /* the only node in the list is being deleted */
        else
            prev -> next = NULL;
        free(temp);
        printf("\n Node deleted ");
    }
}
Deleting a node at Intermediate position:
The following steps are followed to delete a node from an intermediate position in the list
(the list must contain more than two nodes).
• If the list is empty then display the 'Empty List' message.
• If the list is not empty, follow the steps given below.
if(pos > 1 && pos < nodectr)
{
temp = prev = start; ctr = 1;
while(ctr < pos)
{
prev = temp;
temp = temp -> next;
ctr++;
}
prev -> next = temp -> next;
free(temp);
printf("\n node deleted..");
}
In a circular single linked list, the following steps are followed to delete a node at the end of the list:
• If the list is empty, display a message 'Empty List'.
• If the list is not empty, follow the steps given below:
temp = start;
prev = start;
while(temp -> next != start)
{
prev = temp;
temp = temp -> next;
}
prev -> next = start;
free(temp);
• After deleting the node, if the list is empty then start = NULL.
Circular Double Linked List:
A circular double linked list has both successor pointer and predecessor pointer in
circular manner. The objective behind considering circular double linked list is to simplify
the insertion and deletion operations performed on double linked list. In circular double
linked list the right link of the right most node points back to the start node and left link of
the first node points to the last node.
The basic operations in a circular double linked list are:
• Creation.
• Insertion.
• Deletion.
• Traversing.
Creating a Circular Double Linked List with ‘n’ number of nodes:
The following steps are to be followed to create 'n' number of nodes:
• Get the new node using getnode().
newnode = getnode();
• If the list is empty, then do the following
start = newnode;
newnode -> left = start;
newnode ->right = start;
• If the list is not empty, follow the steps given below:
newnode -> left = start -> left;
newnode -> right = start;
start -> left->right = newnode;
start -> left = newnode;
Inserting a node at the beginning:
The following steps are to be followed to insert a new node at the beginning of the list:
• Get the new node using getnode().
newnode=getnode();
• If the list is empty, then
start = newnode;
newnode -> left = start;
newnode -> right = start;
• If the list is not empty, follow the steps given below:
newnode -> left = start -> left;
newnode -> right = start;
start -> left -> right = newnode;
start -> left = newnode;
start = newnode;
Inserting a node at the end:
The following steps are followed to insert a new node at the end of the list:
• Get the new node using getnode()
newnode=getnode();
• If the list is empty, then
start = newnode;
newnode -> left = start;
newnode -> right = start;
• If the list is not empty follow the steps given below:
newnode -> left = start -> left;
newnode -> right = start;
start -> left -> right = newnode;
start -> left = newnode;
The function cdll_insert_end(), is used for inserting a node at the end. Figure 3.8.3 shows
inserting a node into the circular linked list at the end.
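The steps for insertion at the end and at the beginning of a circular double linked list can be sketched as below. This is an illustrative version only: the node layout, make_node() and the int parameters and return values are assumptions, since the cdll_insert_end() of these notes is not listed.

```c
#include <stdlib.h>

/* Assumed node layout: left link, data, right link. */
typedef struct cdlinklist {
    struct cdlinklist *left;
    int data;
    struct cdlinklist *right;
} node;

node *start = NULL;

/* Allocates an unlinked node; returns NULL if malloc() fails. */
node *make_node(int data)
{
    node *newnode = malloc(sizeof *newnode);
    if (newnode == NULL)
        return NULL;
    newnode->data = data;
    newnode->left = NULL;
    newnode->right = NULL;
    return newnode;
}

/* Inserts at the end; returns 1 on success, 0 if no memory is available. */
int cdll_insert_end(int data)
{
    node *newnode = make_node(data);
    if (newnode == NULL)
        return 0;
    if (start == NULL) {
        start = newnode;             /* a single node links to itself */
        newnode->left = start;
        newnode->right = start;
    } else {
        newnode->left = start->left; /* old last node becomes the predecessor */
        newnode->right = start;
        start->left->right = newnode;
        start->left = newnode;
    }
    return 1;
}

/* Inserts at the beginning: same links as above, then start moves back. */
int cdll_insert_beg(int data)
{
    if (!cdll_insert_end(data))
        return 0;
    start = start->left;             /* the new node is now the first node */
    return 1;
}
```

The only difference between the two routines is the final update of start, which mirrors the extra step start = newnode in the steps for insertion at the beginning.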
Inserting a node at an intermediate position:
The following steps are followed, to insert a new node in an intermediate position in the list:
Deleting a node at an intermediate position:
The following steps are followed to delete a node from an intermediate position in the list
(the list must contain more than two nodes).
• If the list is empty then display the 'Empty List' message.
• If the list is not empty, follow the steps given below:
• Get the position of the node to delete.
• Ensure that the specified position is in between first node and last node.
If not, specified position is invalid.
• Then perform the following steps:
if(pos > 1 && pos < nodectr)
{
temp = start; i = 1;
while(i < pos)
{
temp = temp -> right ;
i++;
}
temp -> right -> left = temp -> left;
temp -> left -> right = temp -> right;
free(temp);
printf("\n node deleted..");
nodectr--;
}
Unit-2
Stacks: Introduction-Definition-Representation of Stack-Operations on Stacks-
Applications of Stacks.
Queues: Introduction, Definition- Representations of Queues- Various Queue
Structures- Applications of Queues. Tables: Hash tables.
STACK
A stack is a list of elements in which an element may be inserted or deleted only at
one end, called the top of the stack. Stacks are sometimes known as LIFO (last in, first out)
lists.
As items can be added or removed only at the top, the last item added to a
stack is the first item to be removed.
The two basic operations associated with stacks are:
• Push: is the term used to insert an element into a stack.
• Pop: is the term used to delete an element from a stack.
All insertions and deletions take place at the same end, so the last element added to the stack
will be the first element removed from the stack. When a stack is created, the stack base
remains fixed while the stack top changes as elements are added and removed. The most
accessible element is the top and the least accessible element is the bottom of the stack.
Representation of Stack:
Let us consider a stack with a capacity of 6 elements. This is called the size of the
stack. The number of elements to be added should not exceed the maximum size of the
stack. If we attempt to add a new element beyond the maximum size, we will encounter a
stack overflow condition. Similarly, we cannot remove elements beyond the base of the
stack; attempting to do so gives a stack underflow condition. When an element is
added to a stack, the operation is performed by push().
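Figure: push operations on a stack with slots 0 to 4: the empty stack, then the stack after
inserting 11, 22 and 33. TOP moves up by one slot with each push.
Figure: pop operations on the same stack: starting from the stack holding 11, 22 and 33, each
POP removes the top element and moves TOP down by one slot until the stack is empty.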
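A minimal array-based sketch of push and pop with overflow and underflow checks follows. The Stack type, stack_init() and the convention that top is the index of the next free slot are assumptions made for illustration; MAX is set to 6 to match the capacity discussed above.

```c
#include <stdbool.h>

#define MAX 6                        /* size of the stack */

typedef struct {
    int items[MAX];
    int top;                         /* index of the next free slot; 0 means empty */
} Stack;

void stack_init(Stack *s)
{
    s->top = 0;
}

/* Returns false on stack overflow (no room for another element). */
bool push(Stack *s, int value)
{
    if (s->top == MAX)
        return false;
    s->items[s->top] = value;
    s->top++;
    return true;
}

/* Returns false on stack underflow (nothing to remove). */
bool pop(Stack *s, int *out)
{
    if (s->top == 0)
        return false;
    s->top--;
    *out = s->items[s->top];
    return true;
}
```

Pushing 11, 22 and 33 and then popping three times returns 33, 22 and 11, which is the LIFO order shown in the figures.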
Linked List Implementation of Stack:
We can represent a stack as a linked list. In a stack push and pop operations are
performed at one end called top. We can perform similar operations at one end of list using
top pointer. The linked stack looks as shown in figure
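Figure: a linked stack. Each node has a data field and a next field. Nodes holding 10, 20, 30
and 40 are stored at addresses 100, 200, 300 and 400 (10|200, 20|300, 30|400, 40|X).
start holds 100, the address of the first node, and top holds 400, the address of the last node.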
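A linked stack can be sketched as below. Note one deliberate difference from the figure: this sketch links each new node in front of the old top, so that both push and pop take constant time, whereas the figure keeps top at the last node reached from start. The type stack_node and the function signatures are assumptions.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct stack_node {
    int data;
    struct stack_node *next;         /* node below this one in the stack */
} stack_node;

/* Pushes value; returns false if malloc() fails. */
bool push(stack_node **top, int value)
{
    stack_node *newnode = malloc(sizeof *newnode);
    if (newnode == NULL)
        return false;
    newnode->data = value;
    newnode->next = *top;            /* old top is now below the new node */
    *top = newnode;
    return true;
}

/* Pops into *out; returns false on underflow (empty stack). */
bool pop(stack_node **top, int *out)
{
    if (*top == NULL)
        return false;
    stack_node *temp = *top;
    *out = temp->data;
    *top = temp->next;
    free(temp);                      /* release the removed node */
    return true;
}
```

An empty stack is represented by a NULL top pointer, and there is no fixed maximum size: overflow can only occur when malloc() fails.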
Algebraic Expressions:
An algebraic expression is a legal combination of operators and operands. Operand
is the quantity on which a mathematical operation is performed. Operand may be a variable
like x, y, z or a constant like 5, 4, 6 etc. Operator is a symbol which signifies a mathematical
or logical operation between the operands. Examples of familiar operators include +, -, *, /,
^ etc.
An algebraic expression can be represented using three different notations. They are infix,
postfix and prefix notations:
Infix: It is the form of an arithmetic expression in which we fix (place) the arithmetic
operator in between the two operands.
Example: (A + B) * (C - D)
Prefix: It is the form of an arithmetic expression in which we fix (place) the arithmetic
operator before (pre) its two operands.
The prefix notation is also called Polish notation.
Example: * + A B - C D
Postfix: It is the form of an arithmetic expression in which we fix (place) the arithmetic
operator after (post) its two operands. The postfix notation is also called suffix
notation and is also referred to as reverse Polish notation.
Example: A B + C D - *
The three important features of postfix expression are:
1. The operands maintain the same order as in the equivalent infix expression.
2. The parentheses are not needed to designate the expression un-ambiguously.
3. While evaluating the postfix expression the priority of the operators is no longer
relevant.
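The third feature is what makes postfix easy to evaluate with a stack: operands are pushed, and each operator pops its two operands and pushes the result. A minimal sketch follows, assuming single-digit operands and only the operators +, -, * and /; the function name eval_postfix() and its return convention are hypothetical.

```c
#include <ctype.h>

#define MAX 64                       /* capacity of the operand stack */

/* Evaluates a postfix string such as "2 3 + 5 1 - *".
   Returns 0 and stores the value in *result on success, -1 on error. */
int eval_postfix(const char *expr, long *result)
{
    long stack[MAX];
    int top = 0;                     /* number of operands on the stack */

    for (const char *p = expr; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isdigit((unsigned char)*p)) {
            if (top == MAX)
                return -1;           /* stack overflow */
            stack[top++] = *p - '0'; /* push the operand */
            continue;
        }
        if (top < 2)
            return -1;               /* an operator needs two operands */
        long b = stack[--top];       /* right operand is popped first */
        long a = stack[--top];
        long r;
        switch (*p) {
        case '+': r = a + b; break;
        case '-': r = a - b; break;
        case '*': r = a * b; break;
        case '/':
            if (b == 0)
                return -1;           /* division by zero */
            r = a / b;
            break;
        default:
            return -1;               /* unknown symbol */
        }
        stack[top++] = r;            /* push the result */
    }
    if (top != 1)
        return -1;                   /* malformed expression */
    *result = stack[0];
    return 0;
}
```

With A = 2, B = 3, C = 5 and D = 1, the postfix example A B + C D - * becomes "2 3 + 5 1 - *" and evaluates to 20, the same value as the infix form (A + B) * (C - D), without any use of parentheses or precedence rules.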
We consider five binary operations: +, -, *, / and $ or ↑ (exponentiation). For these binary
operations, the following is the order of precedence (highest to lowest):
OPERATOR                        PRECEDENCE      VALUE
Exponentiation ($ or ↑ or ^)    Highest         3
*, /                            Next highest    2
+, -                            Lowest          1
Example 1:
SYMBOL          PREFIX STRING   STACK   REMARKS
C               C
-               C               -
B               B C             -
+               B C             - +
A               A B C           - +
End of string   - + A B C               The input is now empty. Pop the output symbols from the stack until it is empty.
A queue is another special kind of list, where items are inserted at one end called the
rear and deleted at the other end called the front. Another name for a queue is a “FIFO” or
“First-in-first-out” list.
The operations for a queue are analogous to those for a stack; the difference is that the
insertions go at the end of the list, rather than the beginning. We shall use the following
operations on queues:
• enqueue: which inserts an element at the end (rear) of the queue.
• dequeue: which deletes an element at the front of the queue.
Representation of Queue:
Let us consider a queue, which can hold maximum of five elements. Initially the queue is
empty.
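Figure: empty queue with slots 0 to 4; FRONT = REAR = 0.
Now, insert 11 to the queue. Then the queue status will be:
Figure: queue holding 11 in slot 0; FRONT = 0, REAR = REAR + 1 = 1.
Next, insert 22 and 33 to the queue. Then the queue status will be:
Figure: queue holding 11, 22, 33 in slots 0 to 2; FRONT = 0, REAR = 3 (REAR = REAR + 1 after each insertion).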
Circular Queue:
A more efficient queue representation is obtained by regarding the array Q[MAX] as
circular. Any number of items can pass through the queue over time, provided that no more
than MAX are stored at once. This implementation of a
queue is called a circular queue because it uses its storage array as if it were a circle instead
of a linear list.
There are two problems associated with a linear queue. They are:
• Time consuming: linear time has to be spent in shifting the elements to the
beginning of the queue.
• Signaling queue full: even if the queue still has vacant positions.
For example, let us consider a linear queue status as follows:
Figure: linear queue holding 33, 44, 55 in slots 2 to 4; FRONT = 2, REAR = 5.
Next, insert another element, say 66, to the queue. We cannot insert 66 because the
rear has reached the maximum size of the queue (i.e., 5). There will be a queue full signal,
even though slots 0 and 1 are vacant. The queue status is unchanged:
Figure: linear queue holding 33, 44, 55 in slots 2 to 4; FRONT = 2, REAR = 5.
This difficulty can be overcome if we treat the queue position with index zero as the position that
comes after the position with index four; that is, if we treat the queue as a circular queue.
In a circular queue, if we reach the end while inserting elements, it is still possible to insert new
elements if the slots at the beginning of the circular queue are empty.
Let us consider a circular queue, which can hold maximum (MAX) of six elements. Initially
the queue is empty.
Figure: empty circular queue with slots 0 to 5 (MAX = 6); FRONT = REAR = 0, COUNT = 0.
Now, insert 11 to the circular queue. Then circular queue status will be:
Figure: circular queue holding 11 in slot 0; FRONT = 0, REAR = (REAR + 1) % 6 = 1, COUNT = 1.
Insert new elements 22, 33, 44 and 55 into the circular queue. The circular queue
status is:
Figure: circular queue holding 11, 22, 33, 44, 55 in slots 0 to 4; FRONT = 0,
REAR = (REAR + 1) % 6 = 5, COUNT = 5.
Now, delete an element. The element deleted is the element at the front of the circular
queue. So, 11 is deleted. The circular queue status is as follows:
Figure: circular queue holding 22, 33, 44, 55 in slots 1 to 4; FRONT = (FRONT + 1) % 6 = 1,
REAR = 5, COUNT = COUNT - 1 = 4.
Again, delete an element. The element to be deleted is always pointed to by the FRONT
pointer. So, 22 is deleted. The circular queue status is as follows:
Figure: circular queue holding 33, 44, 55 in slots 2 to 4; FRONT = (FRONT + 1) % 6 = 2,
REAR = 5, COUNT = COUNT - 1 = 3.
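Now, insert 66 to the circular queue. It is stored in slot 5 and REAR wraps around to 0.
The circular queue status is as follows:
Figure: circular queue holding 33, 44, 55, 66 in slots 2 to 5; FRONT = 2,
REAR = (REAR + 1) % 6 = 0, COUNT = COUNT + 1 = 4.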
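The walkthrough above can be reproduced with a small array-based sketch that keeps FRONT, REAR and COUNT exactly as in the figures. The CircularQueue type and the function names cq_init(), cq_enqueue() and cq_dequeue() are assumptions made for illustration.

```c
#include <stdbool.h>

#define MAX 6

typedef struct {
    int items[MAX];
    int front;                       /* index of the element to delete next */
    int rear;                        /* index of the next free slot */
    int count;                       /* number of stored elements */
} CircularQueue;

void cq_init(CircularQueue *q)
{
    q->front = 0;
    q->rear = 0;
    q->count = 0;
}

/* Returns false when the queue is full (COUNT == MAX). */
bool cq_enqueue(CircularQueue *q, int value)
{
    if (q->count == MAX)
        return false;
    q->items[q->rear] = value;
    q->rear = (q->rear + 1) % MAX;   /* wraps from MAX - 1 back to 0 */
    q->count++;
    return true;
}

/* Returns false when the queue is empty (COUNT == 0). */
bool cq_dequeue(CircularQueue *q, int *out)
{
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % MAX;
    q->count--;
    return true;
}
```

Keeping COUNT alongside FRONT and REAR is a design choice: FRONT == REAR occurs both when the queue is empty and when it is full, and COUNT tells the two cases apart.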
Deque:
In the preceding section we saw a queue in which we insert items at one end and
from which we remove items at the other end. In this section we examine an extension of
the queue, which provides a means to insert and remove items at both ends of the queue.
This data structure is a deque. The word deque is an acronym derived from double-ended
queue. Figure 4.5 shows the representation of a deque.
(Figure 4.5: a deque holding 36, 16, 56, 62, 19 from front to rear, with insertion and deletion possible at both the front and the rear.)
A deque provides four operations. Figure 4.6 shows the basic operations on a deque.
• enqueue_front: insert an element at front.
• dequeue_front: delete an element at front.
• enqueue_rear: insert element at rear.
• dequeue_rear: delete element at rear.
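The four operations can be sketched on a circular array as below. The Deque type, the capacity MAX = 8 and the choice to store front and count (computing the rear position from them) are assumptions made for illustration.

```c
#include <stdbool.h>

#define MAX 8

typedef struct {
    int items[MAX];
    int front;                       /* index of the front element */
    int count;                       /* number of stored elements */
} Deque;

void deque_init(Deque *d)
{
    d->front = 0;
    d->count = 0;
}

bool enqueue_front(Deque *d, int value)
{
    if (d->count == MAX)
        return false;                        /* deque is full */
    d->front = (d->front + MAX - 1) % MAX;   /* step back, wrapping below 0 */
    d->items[d->front] = value;
    d->count++;
    return true;
}

bool enqueue_rear(Deque *d, int value)
{
    if (d->count == MAX)
        return false;
    d->items[(d->front + d->count) % MAX] = value;  /* slot after the rear */
    d->count++;
    return true;
}

bool dequeue_front(Deque *d, int *out)
{
    if (d->count == 0)
        return false;                        /* deque is empty */
    *out = d->items[d->front];
    d->front = (d->front + 1) % MAX;
    d->count--;
    return true;
}

bool dequeue_rear(Deque *d, int *out)
{
    if (d->count == 0)
        return false;
    *out = d->items[(d->front + d->count - 1) % MAX];  /* current rear */
    d->count--;
    return true;
}
```

An input restricted deque would simply omit one of the two enqueue functions, and an output restricted deque one of the two dequeue functions.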
There are two variations of deque. They are:
• Input restricted deque (IRD)
• Output restricted deque (ORD)
An Input restricted deque is a deque, which allows insertions at one end but allows
deletions at both ends of the list.
Priority Queue:
A priority queue is a collection of elements such that each element has been assigned a
priority and such that the order in which elements are deleted and processed comes from
the following rules:
1. An element of higher priority is processed before any element of lower priority.
2. Two elements with the same priority are processed according to the order in which
they were added to the queue.
A typical example of a priority queue is a time sharing system: programs of high priority are
processed first, and programs with the same priority form a standard queue. An efficient
implementation of the priority queue uses a heap, which in turn can be used for sorting;
the resulting sort is called heap sort.
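The two rules can be illustrated with a minimal sketch that keeps the elements in a sorted array rather than in the heap mentioned above; insertion therefore takes linear time, but the first-come order among equal priorities is easy to see. The types and function names are hypothetical, and a larger number is assumed to mean a higher priority.

```c
#include <stdbool.h>

#define MAX 16

typedef struct {
    int value;
    int priority;                    /* assumption: larger number = higher priority */
} PQItem;

typedef struct {
    PQItem items[MAX];               /* kept sorted, highest priority at index 0 */
    int count;
} PriorityQueue;

void pq_init(PriorityQueue *q)
{
    q->count = 0;
}

/* Inserts behind every element of higher or equal priority (rule 2). */
bool pq_insert(PriorityQueue *q, int value, int priority)
{
    if (q->count == MAX)
        return false;
    int i = q->count;
    while (i > 0 && q->items[i - 1].priority < priority) {
        q->items[i] = q->items[i - 1];   /* shift lower priorities right */
        i--;
    }
    q->items[i].value = value;
    q->items[i].priority = priority;
    q->count++;
    return true;
}

/* Removes the element of highest priority (rule 1). */
bool pq_remove(PriorityQueue *q, int *value)
{
    if (q->count == 0)
        return false;
    *value = q->items[0].value;
    for (int i = 1; i < q->count; i++)
        q->items[i - 1] = q->items[i];   /* close the gap at the front */
    q->count--;
    return true;
}
```

Because the shifting loop stops at elements of equal priority, a new element is always placed behind earlier elements with the same priority.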
An output restricted deque is a deque, which allows deletions at one end but allows
insertions at both ends of the list.