DSD - Unit 2
LECTURE NOTES
On
Contents:
❖ Syllabus
❖ Unit wise Notes
❖ Question Bank
❖ Two marks with Answers
CS3351 2
VISION
Our Vision is to build a strong teaching & research environment in the field of computer
science and engineering for developing a team of young dynamic computer science engineers,
researchers, future entrepreneurs who are adaptive to respond to the challenges of 21st century.
Our commitment lies in producing disciplined human individuals, capable of contributing
solutions to solve problems faced by our society.
MISSION
PEO-2: To enable the graduates to excel in professional career and /or higher education
by acquiring knowledge in mathematical, computing and engineering principles.
PEO-3: To enable the graduates to be competent to grasp, analyze, design, and create
new products and solutions for real-time problems that are technically advanced,
economically feasible and socially acceptable.
PEO-4: To enable the graduates to pursue a productive career as a member of
multi-disciplinary and cross-functional teams, with an appreciation for the value of ethics and
cultural diversity and an ability to relate engineering issues to broader social context.
PSO1: To Analyze, Design and Develop computer programs / Applications in the areas
related to Web-Technologies, Networking, Algorithms, Cloud Computing, Data analytics,
Computer Vision, Cyber-Security and Intelligent Systems for efficient design of computer-
based and Mobile-based systems of varying complexity.
PSO2: To use modern software tools (like NS2, MATLAB, OpenCV, etc.) for designing,
simulating, analyzing and generating experimental results for real-time problems and case
studies.
PSO3: To Apply Software Engineering practices and strategies for developing Projects
related to emerging technologies.
UNIT I LISTS
Abstract Data Types (ADTs) – List ADT – Array-based implementation – Linked list
implementation – Singly linked lists – Circularly linked lists – Doubly-linked lists –
Applications of lists – Polynomial ADT – Radix Sort – Multilists.
COURSE OUTCOMES:
At the end of this course, the students will be able to:
CO1: Define linear and non-linear data structures.
CO2: Implement linear and non–linear data structure operations.
CO3: Use appropriate linear/non–linear data structure operations for solving a given problem.
CO4: Apply appropriate graph algorithms for graph applications.
CO5: Analyze the various searching and sorting algorithms.
TOTAL: 45 PERIODS
TEXT BOOKS
1. Mark Allen Weiss, Data Structures and Algorithm Analysis in C, 2nd Edition,
Pearson Education, 2005.
2. Kamthane, Introduction to Data Structures in C, 1st Edition, Pearson Education, 2007
REFERENCES
1. Langsam, Augenstein and Tanenbaum, Data Structures Using C and C++, 2nd Edition, Pearson
Education, 2015.
2. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, Introduction to
Algorithms, Fourth Edition, McGraw Hill / MIT Press, 2022.
3. Alfred V. Aho, Jeffrey D. Ullman, John E. Hopcroft, Data Structures and Algorithms, 1st
Edition, Pearson, 2002.
4. Kruse, Data Structures and Program Design in C, 2nd Edition, Pearson Education, 2006.
An abstract data type (ADT) is a set of operations. Abstract data types are mathematical
abstractions; nowhere in an ADT's definition is there any mention of how the set of operations
is implemented. This can be viewed as an extension of modular design.
Objects such as lists, sets, and graphs, along with their operations, can be viewed as
abstract data types, just as integers, reals, and Booleans are data types.
Use of ADT
Reusability of the code: the implementation of these operations is written once in
the program, and any other part of the program that needs to perform an operation on the ADT
can do so by calling the appropriate function.
We will deal with a general list of the form a1, a2, a3, . . . , an. We say that the size of this
list is n. We will call the special list of size 0 a null list. For any list except the null list, we
say that a(i+1) follows (or succeeds) ai (i < n) and that a(i-1) precedes ai (i > 1).
The first element of the list is a1, and the last element is an. We do not define the predecessor
of a1 or the successor of an. The position of element ai in a list is i.
2. Insert and delete, which generally insert and delete some key from some position in
the list;
Example:
1. If the list is 34, 12, 52, 16, 12, then find(52) might return 3;
2. Insert(x,3) might make the list into 34, 12, 52, x, 16, 12 (if we insert after the
position given);
3. Delete (3) might turn that list into 34, 12, x, 16, 12.
1. Array implementation
2. Linked list implementation
Simple Array Implementation of Lists
Even if the array is dynamically allocated, an estimate of the maximum size of the list is
required. Usually this requires a high over-estimate, which wastes considerable space. This
could be a serious limitation, especially if there are many lists of unknown size.
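● Printing the list and find are carried out in linear time, which is as
good as can be expected, and the find_kth operation takes constant time.
● Insertion and deletion are expensive. Inserting at the front requires first pushing the
entire array down one spot to make room, and deleting the first element requires shifting
every remaining element up one, so the worst case of both operations is O(n). Because the
running time for insertions and deletions is so slow and the list size must be known in
advance, simple arrays are generally not used to implement lists.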
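The cost of shifting can be seen in a minimal sketch of an array-based list. The names ArrayList, MAX_LIST, al_insert, al_delete and al_find are assumptions made for this illustration and do not come from the notes; positions here are 0-based, unlike the 1-based positions used in the example above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LIST 100            /* assumed fixed capacity of the array */

typedef struct {
    int items[MAX_LIST];
    size_t size;                /* number of elements currently stored */
} ArrayList;

/* Insert x so that it ends up at index pos (0-based).
   Returns 0 on success, -1 if the list is full or pos is out of range. */
int al_insert(ArrayList *l, size_t pos, int x)
{
    assert(l != NULL);
    if (l->size >= MAX_LIST || pos > l->size)
        return -1;
    for (size_t i = l->size; i > pos; i--)      /* shift the tail right: O(n) */
        l->items[i] = l->items[i - 1];
    l->items[pos] = x;
    l->size++;
    return 0;
}

/* Delete the element at index pos. Returns 0 on success, -1 if out of range. */
int al_delete(ArrayList *l, size_t pos)
{
    assert(l != NULL);
    if (pos >= l->size)
        return -1;
    for (size_t i = pos; i + 1 < l->size; i++)  /* shift the tail left: O(n) */
        l->items[i] = l->items[i + 1];
    l->size--;
    return 0;
}

/* Linear search. Returns the index of the first occurrence of x, or -1. */
int al_find(const ArrayList *l, int x)
{
    assert(l != NULL);
    for (size_t i = 0; i < l->size; i++)
        if (l->items[i] == x)
            return (int)i;
    return -1;
}
```

Accessing the k-th element is just l->items[k], which is why find_kth takes constant time, while both loops above touch up to n elements.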
Linked Lists Implementation
In order to avoid the linear cost of insertion and deletion, we need to ensure that the list
is not stored contiguously, since otherwise entire parts of the list will need to be moved.
Definition:
The linked list consists of a series of structures, which are not necessarily adjacent
in memory. Each structure contains the element and a pointer to a structure containing its
successor. We call this the next pointer. The last cell's next pointer is always NULL.
Consider the list contains five structures, which happen to reside in memory locations
1000, 800, 712, 992, and 692 respectively. The next pointer in the first structure has the value
800, which provides the indication of where the second structure is located.
● The delete command can be executed in one pointer change: the next pointer of the
preceding cell is made to point past the deleted cell. The figure above shows the result
of deleting the third element in the original list.
● The insert command requires obtaining a new cell from the system by using a
malloc call and then changing two pointers.
Programming Details
First, there is no really obvious way to insert at the front of the list from the definitions given.
Second, deleting from the front of the list is a special case, because it changes the start
of the list.
A third problem concerns deletion in general. Although the pointer moves above are
simple, the deletion algorithm requires us to keep track of the cell before the one that
we want to delete.
In order to solve all three problems, we will keep a sentinel node, which is called a header
or dummy node. (The next pointer of the header contains the address of the first node in the
linked list.)
The above figure shows a linked list with a header representing the list a1, a2, . . . , a5.
To avoid the problems associated with deletions, we need a routine find_previous, which will
return the position of the predecessor of the cell we wish to delete. If we use a header, then
if we wish to delete the first element in the list, find_previous will return the position of
the header.
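typedef struct node *node_ptr;
typedef node_ptr LIST, position;
struct node
{
    element_type element;
    node_ptr next;
};
/* Return the position of x in L; NULL if not found. Assumes a header. */
position find( element_type x, LIST L )
{
    position p = L->next;
    while( (p != NULL) && (p->element != x) )
        p = p->next;
    return p;
}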
This routine will delete some element x in list L. We need to decide what to do if x
occurs more than once or not at all. Our routine deletes the first occurrence of x and does
nothing if x is not in the list. First we find p, which is the cell prior to the one containing x, via a
call to find_previous.
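/* Delete the first occurrence of x from list L. Assumes a header. */
void delete( element_type x, LIST L )
{
    position p, tmp_cell;
    p = find_previous( x, L );
    if( p->next != NULL )
    {   /* x is found: delete it */
        tmp_cell = p->next;
        p->next = tmp_cell->next;   /* bypass the cell to be deleted */
        free( tmp_cell );
    }
}
/* Uses a header. If element is not found, then next field of returned value is NULL */
position find_previous( element_type x, LIST L )
{
    position p;
    p = L;
    while( (p->next != NULL) && (p->next->element != x) )
        p = p->next;
    return p;
}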
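The insertion routine inserts an element after the position implied by p. This choice is
arbitrary. It is quite possible to insert the new element into position p (which means before
the element currently in position p), but doing this requires knowledge of the element before
position p.
void insert( element_type x, LIST L, position p )
{
    position tmp_cell = malloc( sizeof( struct node ) );
    if( tmp_cell == NULL )
        fatal_error("Out of space!!!");
    else
    {
        tmp_cell->element = x;
        tmp_cell->next = p->next;
        p->next = tmp_cell;
    }
}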
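Incorrect way to delete a list:
void delete_list( LIST L )
{
    position p;
    p = L->next;   /* header assumed */
    L->next = NULL;
    while( p != NULL )
    {
        free( p );
        p = p->next;   /* error: p has already been freed */
    }
}
Correct way to delete a list:
void delete_list( LIST L )
{
    position p, tmp;
    p = L->next;   /* header assumed */
    L->next = NULL;
    while( p != NULL )
    {
        tmp = p->next;   /* save the successor before freeing p */
        free( p );
        p = tmp;
    }
}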
A linked list is called doubly linked when each cell has two pointers, namely a forward
pointer and a backward pointer. This makes it convenient to traverse the list both forwards
and backwards.
Each cell carries an extra field containing a pointer to the previous cell. The cost of
this is an extra link, which adds to the space requirement and also doubles the cost of insertions
and deletions because there are more pointers to fix.
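The extra pointer bookkeeping can be sketched as follows. This is only an illustration under assumed names (DNode, prev, next, dl_insert_after, dl_delete are not taken from the notes), and it assumes the list has a header node, so every real cell has a predecessor.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DNode {
    int element;
    struct DNode *prev;   /* backward pointer */
    struct DNode *next;   /* forward pointer */
} DNode;

/* Insert x immediately after cell p. Returns the new cell, or NULL if
   memory could not be allocated. Up to four pointers are set. */
DNode *dl_insert_after(DNode *p, int x)
{
    assert(p != NULL);
    DNode *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->element = x;
    n->prev = p;
    n->next = p->next;
    if (p->next != NULL)
        p->next->prev = n;   /* old successor now points back to n */
    p->next = n;
    return n;
}

/* Unlink and free cell p. No find_previous is needed, because p already
   knows its predecessor. p must not be the header. */
void dl_delete(DNode *p)
{
    assert(p != NULL && p->prev != NULL);
    p->prev->next = p->next;
    if (p->next != NULL)
        p->next->prev = p->prev;
    free(p);
}
```

Compared with the singly linked routines, deletion no longer needs a search for the preceding cell, which is the main practical benefit paid for by the extra link.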
Node
Structure declaration
struct node
int Element;
Insertion
Insert(15,L,P)
Deletion:
A linked list is called circular when the next pointer of its last cell points to the first cell,
so that the list forms a ring. It can be singly or doubly linked, and it can be implemented with
or without a header.
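A minimal sketch of a singly linked circular list without a header is shown below. It assumes the list is reached through a pointer to its last cell (so the first cell is tail->next); the names CNode, cl_insert and cl_length are hypothetical and chosen only for this illustration.

```c
#include <stdlib.h>

typedef struct CNode {
    int element;
    struct CNode *next;
} CNode;

/* Insert x at the front (at_end == 0) or at the end (at_end != 0) of the
   circular list whose last cell is tail (NULL for an empty list).
   Returns the new tail; on allocation failure the list is unchanged. */
CNode *cl_insert(CNode *tail, int x, int at_end)
{
    CNode *n = malloc(sizeof *n);
    if (n == NULL)
        return tail;
    n->element = x;
    if (tail == NULL) {
        n->next = n;            /* a single cell points to itself */
        return n;
    }
    n->next = tail->next;       /* new cell points to the first cell */
    tail->next = n;             /* last cell points to the new cell */
    return at_end ? n : tail;   /* at the end, the new cell becomes the tail */
}

/* Count the cells by walking once around the ring. */
size_t cl_length(const CNode *tail)
{
    if (tail == NULL)
        return 0;
    size_t count = 0;
    const CNode *p = tail->next;
    do {
        count++;
        p = p->next;
    } while (p != tail->next);  /* stop when we are back at the first cell */
    return count;
}
```

Because there is no NULL at the end of the ring, a traversal has to stop when it returns to its starting cell, as cl_length does.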
Structure declaration:
struct node
int Element;
Insert at beginning:
Insert in middle:
Insert at Last
Deletion at middle:
Deletion at last:
A doubly circular linked list is a doubly linked list in which forward link of the last node
points to the first node and backward link of first node points to the last node of the list.
Structure Declaration:
struct node
int Element;
Insert at beginning:
Insert at Middle:
Insert at Last:
Deletion
void dele_first(List L)
Deletion at middle:
{ Position P, Temp;
P=FindPrevious(X);
Deletion at last node:
A first example where linked lists are used is called The Polynomial ADT.
Example:
P1: 4X^10 + 5X^5 + 3
P2: 10X^6 - 5X^2 + 2X
typedef struct poly
{
    int coeff;
    int power;
    struct poly *next;
} poly;
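Creation of polynomial:
/* Append newnode to the end of the list starting at head; return the head */
poly *create( poly *head, poly *newnode )
{
    poly *ptr;
    if( head == NULL )
    {
        head = newnode;
        return( head );
    }
    else
    {
        ptr = head;
        while( ptr->next != NULL )
            ptr = ptr->next;
        ptr->next = newnode;
    }
    return( head );
}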
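/* Add the polynomials list1 and list2 into list3.
   Both lists are assumed to be ordered by decreasing power. */
void add( )
{
    poly *ptr1, *ptr2, *newnode;
    ptr1 = list1;
    ptr2 = list2;
    while( ptr1 != NULL && ptr2 != NULL )
    {
        newnode = malloc( sizeof( struct poly ) );
        newnode->next = NULL;
        if( ptr1->power == ptr2->power )
        {
            /* same power: add the coefficients */
            newnode->coeff = ptr1->coeff + ptr2->coeff;
            newnode->power = ptr1->power;
            ptr1 = ptr1->next;
            ptr2 = ptr2->next;
        }
        else if( ptr1->power > ptr2->power )
        {
            /* the term of the first polynomial has the larger power: copy it */
            newnode->coeff = ptr1->coeff;
            newnode->power = ptr1->power;
            ptr1 = ptr1->next;
        }
        else
        {
            /* the term of the second polynomial has the larger power: copy it */
            newnode->coeff = ptr2->coeff;
            newnode->power = ptr2->power;
            ptr2 = ptr2->next;
        }
        list3 = create( list3, newnode );
    }
    /* copy the terms left over once one polynomial is exhausted */
    if( ptr1 == NULL )
        ptr1 = ptr2;
    while( ptr1 != NULL )
    {
        newnode = malloc( sizeof( struct poly ) );
        newnode->coeff = ptr1->coeff;
        newnode->power = ptr1->power;
        newnode->next = NULL;
        list3 = create( list3, newnode );
        ptr1 = ptr1->next;
    }
}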
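        }
        else
        {
            /* subtraction, unequal powers: copy the term with the larger power */
            if( ptr1->power > ptr2->power )
            {
                newnode->coeff = ptr1->coeff;
                newnode->power = ptr1->power;
                newnode->next = NULL;
                list3 = create( list3, newnode );
                ptr1 = ptr1->next;
            }
            else
            {
                newnode->coeff = -( ptr2->coeff );   /* subtracted term: negate it */
                newnode->power = ptr2->power;
                newnode->next = NULL;
                list3 = create( list3, newnode );
                ptr2 = ptr2->next;
            }
        }
    }
}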
POLYNOMIAL DIFFERENTIATION
void diff( )
{
    poly *ptr1, *newnode;
    ptr1 = list1;
    while( ptr1 != NULL )
    {
        if( ptr1->power != 0 )   /* the derivative of a constant term is 0: skip it */
        {
            newnode = malloc( sizeof( struct poly ) );
            newnode->coeff = ptr1->coeff * ptr1->power;
            newnode->power = ptr1->power - 1;
            newnode->next = NULL;
            list3 = create( list3, newnode );
        }
        ptr1 = ptr1->next;
    }
}
Radix Sort
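A second example where linked lists are used is called radix sort. Radix sort is also
known as card sort, because it was used, until the advent of modern computers, to sort old-style
punch cards.
Radix sort is a generalization of bucket sort. In bucket sort, the input a1, a2, . . . , an
consists of n non-negative integers smaller than m. We keep an array called count, of size m, which is
initialized to zero. When ai is read, increment (by one) count[ai]. After all the input is read,
scan the count array, printing out a representation of the sorted list. This algorithm takes
O(m + n) time.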
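The counting step can be sketched as below. This is a minimal illustration, assuming the values lie in the range 0 to m - 1 and that the sorted result is written back into the input array instead of being printed; the name bucket_sort is chosen here and is not from the notes.

```c
#include <assert.h>
#include <stdlib.h>

/* Sort a[0..n-1] in place, assuming every value is in the range 0..m-1.
   Returns 0 on success, -1 if the count array cannot be allocated. */
int bucket_sort(int a[], size_t n, int m)
{
    assert(m > 0);
    size_t *count = calloc((size_t)m, sizeof *count);   /* all zero */
    if (count == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {        /* O(n): read the input */
        assert(a[i] >= 0 && a[i] < m);
        count[a[i]]++;
    }

    size_t k = 0;
    for (int v = 0; v < m; v++)             /* O(m + n): scan the counts */
        for (size_t c = 0; c < count[v]; c++)
            a[k++] = v;

    free(count);
    return 0;
}
```

The first loop touches each input once and the second visits each of the m counters plus one write per element, which is where the O(m + n) bound comes from.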
The following example shows the action of radix sort on 10 numbers. The input is 64, 8,
216, 512, 27, 729, 0, 1, 343, and 125. The first step (Pass 1) bucket sorts by the least significant
digit. The buckets are as shown in the figure below, so the list, sorted by least significant digit, is
0, 1, 512, 343, 64, 125, 216, 27, 8, 729. These are now sorted by the next least significant digit
(the tens digit here).
Pass 2 gives output 0, 1, 8, 512, 216, 125, 27, 729, 343, 64. This list is now sorted with
respect to the two least significant digits. The final pass, shown in the figure, bucket sorts by
the most significant digit (the hundreds digit here).
The final list is 0, 1, 8, 27, 64, 125, 216, 343, 512, and 729.
The running time is O(p(n + b)) where p is the number of passes, n is the number of
elements to sort, and b is the number of buckets. In our case, b = n.
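Buckets after pass 1 (ones digit):
Bucket    0    1    2    3    4    5    6    7    8    9
Contents  0    1    512  343  64   125  216  27   8    729
Buckets after pass 2 (tens digit):
Bucket 0: 0, 1, 8
Bucket 1: 512, 216
Bucket 2: 125, 27, 729
Bucket 4: 343
Bucket 6: 64
(buckets 3, 5, 7, 8, and 9 are empty)
Buckets after pass 3 (hundreds digit):
Bucket 0: 0, 1, 8, 27, 64
Bucket 1: 125
Bucket 2: 216
Bucket 3: 343
Bucket 5: 512
Bucket 7: 729
(buckets 4, 6, 8, and 9 are empty)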
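The passes described above can be sketched in C. Note the assumption: instead of linked-list buckets, this sketch uses a stable counting pass per digit that writes into a scratch array, which yields the same order after each pass. The names radix_sort and BASE are chosen for this illustration only.

```c
#include <stdlib.h>
#include <string.h>

#define BASE 10   /* number of buckets: one per decimal digit */

/* Sort a[0..n-1] by running `passes` bucket-sort passes, starting with the
   least significant decimal digit. Returns 0 on success, -1 on failure. */
int radix_sort(unsigned a[], size_t n, int passes)
{
    if (n == 0)
        return 0;
    unsigned *out = malloc(n * sizeof *out);
    if (out == NULL)
        return -1;

    unsigned divisor = 1;
    for (int p = 0; p < passes; p++, divisor *= BASE) {
        size_t start[BASE + 1] = {0};

        /* count how many keys fall into each bucket */
        for (size_t i = 0; i < n; i++)
            start[(a[i] / divisor) % BASE + 1]++;
        /* turn the counts into the starting index of each bucket */
        for (int d = 0; d < BASE; d++)
            start[d + 1] += start[d];
        /* place keys in input order, so each pass is stable */
        for (size_t i = 0; i < n; i++)
            out[start[(a[i] / divisor) % BASE]++] = a[i];

        memcpy(a, out, n * sizeof *a);
    }
    free(out);
    return 0;
}
```

Stability of each pass matters: it is what keeps the order established by the earlier, less significant digits intact when later digits tie.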
Multilists
A university with 40,000 students and 2,500 courses needs to be able to generate two
types of reports. The first report lists the class registration for each class, and the second report
lists, by student, the classes that each student is registered for.
If we use a two-dimensional array, such an array would have 100 million entries. The average
student registers for about three courses, so only 120,000 of these entries, or roughly 0.1
percent, would actually have meaningful data.
To avoid this waste of memory, linked lists can be used. We use two kinds of linked lists:
one list for each class, containing the students in the class, and one list for each student,
containing the classes the student is registered for.
All lists use a header and are circular. To list all of the students in class C3, we start at
C3 and traverse its list. The first cell belongs to student S1.
Stack Model
A stack is a list with the restriction that inserts and deletes can be performed in only one
position, namely the end of the list called the top. Stacks are sometimes known as LIFO (last
in, first out) lists.
A push onto a full stack, or a pop or top on an empty stack, is generally considered an
error in the stack ADT.
The model depicted in above figure signifies that pushes are input operations and pops
and tops are output.
Implementation of Stacks
1. Array implementation
2. Linked list implementation
Array Implementation : Refer Class
Stacks
The first implementation of a stack uses a singly linked list. We perform a push by
inserting at the front of the list. We perform a pop by deleting the element at the front of the list.
A top operation merely examines the element at the front of the list, returning its value.
Sometimes the pop and top operations are combined into one.
Creating an empty stack is also simple. We merely create a header node; make_null sets
the next pointer to NULL.
The push is implemented as an insertion into the front of a linked list, where the front of
the list serves as the top of the stack.
The top is performed by examining the element in the first position of the list.
It should be clear that all the operations take constant time, because nowhere in any of the
routines is there even a reference to the size of the stack (except for emptiness), much less a
loop that depends on this size.
The drawback of this implementation is that the calls to malloc and free are expensive,
especially in comparison to the pointer manipulation routines. Some of this can be avoided by
using a second stack, which is initially empty. When a cell is to be disposed from the first
stack, it is merely placed on the second stack. Then, when new cells are needed for the first
stack, the second stack is checked first.
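struct node;
typedef struct node *PtrToNode;
typedef PtrToNode Stack;
Stack CreateStack( void );
struct node
{
    Element_type element;
    PtrToNode next;
};
This routine checks whether the stack is empty. It returns true (nonzero) if the next pointer of
the header is NULL, and false (zero) otherwise.
int IsEmpty( Stack S )
{
    return S->next == NULL;
}
This routine creates a stack and returns a pointer to its header node. If no memory is
available, it reports a fatal error.
Stack CreateStack( void )
{
    Stack S;
    S = malloc( sizeof( struct node ) );
    if( S == NULL )
        fatal_error("Out of space!!!");
    S->next = NULL;
    return S;
}
This routine empties an existing stack by popping until nothing is left.
void MakeEmpty( Stack S )
{
    if( S == NULL )
        error("Must use CreateStack first");
    else
        while( !IsEmpty( S ) )
            pop( S );
}
This routine inserts the new element onto the top of the stack.
void push( Element_type x, Stack S )
{
    PtrToNode tmp_cell;
    tmp_cell = malloc( sizeof( struct node ) );
    if( tmp_cell == NULL )
        fatal_error("Out of space!!!");
    else
    {
        tmp_cell->element = x;
        tmp_cell->next = S->next;
        S->next = tmp_cell;
    }
}
This routine returns the element at the top of the stack without removing it.
Element_type top( Stack S )
{
    if( IsEmpty( S ) )
        error("Empty stack");
    else
        return S->next->element;
    return 0;   /* return value used to avoid a compiler warning */
}
This routine removes the element at the top of the stack.
void pop( Stack S )
{
    PtrToNode first_cell;
    if( IsEmpty( S ) )
        error("Empty stack");
    else
    {
        first_cell = S->next;
        S->next = S->next->next;
        free( first_cell );
    }
}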
Associated with each stack is the top of stack, tos, which is -1 for an empty stack. To
push some element x onto the stack, we increment tos and then set STACK[tos] = x, where
STACK is the array representing the actual stack.
To pop, we set the return value to STACK[tos] and then decrement tos.
Notice that these operations are performed in not only constant time, but very fast
constant time.
Error checking:
A pop on an empty stack or a push on a full stack will overflow the array bounds
and cause a crash. The routines must therefore ensure that they do not attempt to pop an empty
stack or push onto a full stack.
Once the maximum size is known, the stack array can be dynamically allocated.
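Stack Declaration
#define EMPTY_TOS (-1)
struct StackRecord
{
    int Capacity;
    int TopofStack;
    ElementType *Array;
};
This routine creates a stack and returns a pointer to it. If memory cannot be allocated, it
reports a fatal error.
Stack CreateStack( int MaxElements )
{
    Stack S;
    S = malloc( sizeof( struct StackRecord ) );
    if( S == NULL )
        fatal_error("Out of space!!!");
    S->Array = malloc( sizeof( ElementType ) * MaxElements );
    if( S->Array == NULL )
        fatal_error("Out of space!!!");
    S->Capacity = MaxElements;
    MakeEmpty( S );
    return( S );
}
This routine removes the stack structure itself by freeing the array and then the stack record.
void DisposeStack( Stack S )
{
    if( S != NULL )
    {
        free( S->Array );
        free( S );
    }
}
This routine makes the stack empty by resetting the top-of-stack index.
void MakeEmpty( Stack S )
{
    S->TopofStack = EMPTY_TOS;
}
This routine inserts the new element onto the top of the stack.
void Push( ElementType X, Stack S )
{
    if( IsFull( S ) )
        error("Full stack");
    else
        S->Array[ ++S->TopofStack ] = X;
}
This routine returns the topmost element without removing it.
ElementType Top( Stack S )
{
    if( !IsEmpty( S ) )
        return S->Array[ S->TopofStack ];
    error("Empty stack");
    return 0;   /* return value used to avoid a compiler warning */
}
This routine removes the topmost element from the stack.
void Pop( Stack S )
{
    if( IsEmpty( S ) )
        error("Empty stack");
    else
        S->TopofStack--;
}
This routine returns as well as removes the topmost element from the stack.
ElementType TopAndPop( Stack S )
{
    if( IsEmpty( S ) )
        error("Empty stack");
    else
        return S->Array[ S->TopofStack-- ];
    return 0;   /* return value used to avoid a compiler warning */
}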
Stack Applications
Balancing Symbols
Compilers check your programs for syntax errors, but frequently a lack of one
symbol (such as a missing brace or comment starter) will cause the compiler to
spill out a hundred lines of diagnostics without identifying the real error.
A useful tool in this situation is a program that checks whether everything is
balanced. Thus, every right brace, bracket, and parenthesis must correspond to
its left counterpart. The sequence [()] is legal, but [(]) is wrong. It is easy to check these
things with a stack. For simplicity, we will just check for balancing of parentheses, brackets,
and braces and ignore any other character that appears.
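One way to carry out this check is sketched below, using the usual stack approach: push every opening symbol, and on every closing symbol pop and compare. The function name is_balanced and the fixed limit MAX_DEPTH are assumptions of this sketch, not part of the notes.

```c
#include <stddef.h>

#define MAX_DEPTH 256   /* assumed limit on nesting depth for this sketch */

/* Return 1 if the parentheses, brackets, and braces in s are balanced,
   0 otherwise. All other characters are ignored. */
int is_balanced(const char *s)
{
    char stack[MAX_DEPTH];
    size_t top = 0;                     /* number of symbols on the stack */

    for (; *s != '\0'; s++) {
        char c = *s;
        if (c == '(' || c == '[' || c == '{') {
            if (top == MAX_DEPTH)
                return 0;               /* nested too deeply for this sketch */
            stack[top++] = c;           /* push the opening symbol */
        } else if (c == ')' || c == ']' || c == '}') {
            if (top == 0)
                return 0;               /* closing symbol with nothing open */
            char open = stack[--top];   /* pop the most recent opening symbol */
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[') ||
                (c == '}' && open != '{'))
                return 0;               /* wrong kind of closing symbol */
        }
    }
    return top == 0;                    /* leftovers mean unclosed symbols */
}
```

For the two sequences mentioned above, is_balanced("[()]") returns 1 and is_balanced("[(])") returns 0, because the ']' meets a '(' on top of the stack.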
Expression:
Types of expression:
Infix Expression:
In an expression, if the operator is placed between the operands, then it is called an infix
expression.
Eg : A+B
Prefix Expression:
In an expression, if the operator is placed before the operands, then it is called a prefix
expression.
Eg : +AB
Postfix Expression:
In an expression, if the operator is placed after the operands, then it is called a postfix
expression.
Eg : AB+
A stack can be used to convert an expression in standard form (otherwise known as infix) into
postfix. We will concentrate on a small version of the general problem by allowing only the
operators +, *, and (, ), and insisting on the usual precedence rules. Suppose we want to
convert the infix expression
a+b*c+(d*e+f)*g.
A correct answer is a b c * + d e * f + g * +.
Algorithm:
6. Finally, if we read the end of input, we pop the stack until it is empty, writing symbols
onto the output.
To see how this algorithm performs, we will convert the infix expression into its postfix
form.
a + b * c + (d * e + f) * g
First, the symbol a is read, so it is passed through to the output. Then '+' is read and pushed onto
the stack. Next b is read and passed through to the output. Then the stack will be as follows.
Next a '*' is read. The top entry on the operator stack has lower precedence than '*', so nothing
is output and '*' is put on the stack. Next, c is read and output.
The next symbol is a '+'. Checking the stack, we find that we will pop a '*' and place it on the
output, pop the other '+', which is not of lower but equal priority, on the stack, and then push
the '+'.
The next symbol read is an '(', which, being of highest precedence, is placed on the stack. Then
d is read and output.
We continue by reading a '*'. Since open parentheses do not get removed except when a closed
parenthesis is being processed, there is no output. Next, e is read and output.
The next symbol read is a '+'. We pop and output '*' and then push '+'. Then we read and output
f.
Now we read a ')', so the stack is emptied back to the '('. We output a '+'.
We read a '*' next; it is pushed onto the stack. Then g is read and output.
The input is now empty, so we pop and output symbols from the stack until it is empty.
As before, this conversion requires only O(n) time and works in one pass through the
input. We can add subtraction and division to this repertoire by assigning subtraction and
addition equal priority and multiplication and division equal priority.
A subtle point is that the expression a - b - c will be converted to ab - c - and not abc - -.
Our algorithm does the right thing, because these operators associate from left to right. This is
not necessarily the case in general, since exponentiation associates right to left:
2^(2^3) = 2^8 = 256, not 4^3 = 64.
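The conversion walked through above can be sketched in C. This is an illustration under stated assumptions: operands are single letters or digits, the operators are +, -, *, / (all left-associative), spaces are skipped, and the names infix_to_postfix and prec are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Precedence: 1 for + and -, 2 for * and /, 0 for anything else. */
static int prec(char op)
{
    if (op == '+' || op == '-') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;
}

/* Convert infix to postfix. out must have room for strlen(infix) + 1 chars.
   Returns 0 on success, -1 on mismatched parentheses or an unknown symbol. */
int infix_to_postfix(const char *infix, char *out)
{
    char stack[128];
    size_t top = 0, n = 0;

    for (const char *s = infix; *s != '\0'; s++) {
        char c = *s;
        if (isspace((unsigned char)c))
            continue;
        if (isalnum((unsigned char)c)) {
            out[n++] = c;                        /* operands go straight out */
        } else if (c == '(') {
            if (top == sizeof stack) return -1;
            stack[top++] = c;
        } else if (c == ')') {
            while (top > 0 && stack[top - 1] != '(')
                out[n++] = stack[--top];         /* empty back to the '(' */
            if (top == 0) return -1;             /* no matching '(' */
            top--;                               /* discard the '(' */
        } else if (strchr("+-*/", c) != NULL) {
            /* pop operators of higher or equal precedence, never past '(' */
            while (top > 0 && stack[top - 1] != '(' &&
                   prec(stack[top - 1]) >= prec(c))
                out[n++] = stack[--top];
            if (top == sizeof stack) return -1;
            stack[top++] = c;
        } else {
            return -1;
        }
    }
    while (top > 0) {                            /* end of input: pop the rest */
        if (stack[top - 1] == '(') return -1;
        out[n++] = stack[--top];
    }
    out[n] = '\0';
    return 0;
}
```

Popping operators of equal precedence (the >= test) is what makes a - b - c come out as ab-c-; a right-associative operator such as exponentiation would need a strict > comparison instead.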
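Algorithm:
When a number is seen, it is pushed onto the stack. When an operator is seen, the operator is
applied to the two numbers (symbols) that are popped from the stack and the result is pushed
onto the stack.
For instance, the postfix expression 6 5 2 3 + 8 * + 3 + * is evaluated as follows.
The first four symbols are placed on the stack. The resulting stack is
Stack, top first: 3, 2, 5, 6
Next a '+' is read, so 3 and 2 are popped from the stack and their sum, 5, is pushed.
Stack, top first: 5, 5, 6
Next 8 is pushed.
Stack, top first: 8, 5, 5, 6
Now a '*' is seen, so 8 and 5 are popped and 5 * 8 = 40 is pushed.
Stack, top first: 40, 5, 6
Next a '+' is seen, so 40 and 5 are popped and 5 + 40 = 45 is pushed.
Stack, top first: 45, 6
Now, 3 is pushed.
Stack, top first: 3, 45, 6
Next a '+' pops 3 and 45 and pushes 45 + 3 = 48.
Stack, top first: 48, 6
Finally, a '*' is seen and 48 and 6 are popped, the result 6 * 48 = 288 is pushed.
Stack, top first: 288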
The time to evaluate a postfix expression is O(n), because processing each element in
the input consists of stack operations and thus takes constant time. The algorithm to do so is
very simple.
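The evaluation procedure can be sketched as follows. The sketch assumes single-digit operands separated by optional spaces and the operators +, -, *, /; the name eval_postfix and the fixed stack size are choices made for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

#define EVAL_STACK_MAX 64   /* assumed limit for this sketch */

/* Evaluate a postfix expression with single-digit operands.
   Stores the value in *result and returns 0, or returns -1 if the
   expression is malformed (or divides by zero). */
int eval_postfix(const char *expr, long *result)
{
    long stack[EVAL_STACK_MAX];
    size_t top = 0;

    for (const char *s = expr; *s != '\0'; s++) {
        char c = *s;
        if (isspace((unsigned char)c))
            continue;
        if (isdigit((unsigned char)c)) {
            if (top == EVAL_STACK_MAX) return -1;
            stack[top++] = c - '0';             /* a number is pushed */
        } else {
            if (top < 2) return -1;             /* operator needs two operands */
            long right = stack[--top];          /* popped first: right operand */
            long left = stack[--top];
            long value;
            switch (c) {
            case '+': value = left + right; break;
            case '-': value = left - right; break;
            case '*': value = left * right; break;
            case '/':
                if (right == 0) return -1;
                value = left / right;
                break;
            default:
                return -1;                      /* unknown symbol */
            }
            stack[top++] = value;               /* the result is pushed */
        }
    }
    if (top != 1) return -1;                    /* exactly one value must remain */
    *result = stack[0];
    return 0;
}
```

Each input symbol causes at most two pops and one push, so the whole evaluation is O(n), matching the analysis above.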
Function Calls
● When a call is made to a new function, all the variables local to the calling routine need
to be saved by the system. Otherwise the new function will overwrite the calling
routine's variables.
● The current location in the routine must be saved so that the new function knows where
to go after it is done.
● The reason that this problem is similar to balancing symbols is that a function call and
function return are essentially the same as an open parenthesis and closed parenthesis,
so the same ideas should work.
● When there is a function call, all the important information that needs to be saved, such
as register values (corresponding to variable names) and the return address is saved "on
a piece of paper" in an abstract way and put at the top of a pile. Then the control is
transferred to the new function, which is free to replace the registers with its values.
● If it makes other function calls, it follows the same procedure. When the function wants
to return, it looks at the "paper" at the top of the pile and restores all the registers. It then
makes the return jump.
● The information saved is called either an activation record or stack frame.
● There is always the possibility that you will run out of stack space by having too many
simultaneously active functions. Running out of stack space is always a fatal error.
● In normal events, you should not run out of stack space; doing so is usually an
indication of runaway recursion. On the other hand, some perfectly legal and seemingly
innocuous program can cause you to run out of stack space.
A bad use of recursion: printing a linked list
void print_list( LIST L )   /* no header */
{
    if( L != NULL )
    {
        print_element( L->element );
        print_list( L->next );
    }
}
● The above routine prints out a linked list. It is perfectly legal and actually correct. It
properly handles the base case of an empty list, and the recursion is fine. This program
can be proven correct.
● Activation records are typically large because of all the information they contain, so for
a long list this program is likely to run out of stack space. This program is an example of
an extremely bad use of recursion known as tail recursion. Tail recursion refers to a
recursive call at the last line.
● Tail recursion can be mechanically eliminated by changing the recursive call to a goto
preceded by one assignment per function argument.
● This simulates the recursive call because nothing needs to be saved -- after the recursive
call finishes, there is really no need to know the saved values. Because of this, we can
just go to the top of the function with the values that would have been used in a
recursive call.
The below program is the improved version. Removal of tail recursion is so simple that some
compilers do it automatically.
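void print_list( LIST L )   /* no header */
{
top:
    if( L != NULL )
    {
        print_element( L->element );
        L = L->next;   /* the assignment that replaces the argument of the recursive call */
        goto top;
    }
}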
Recursion can always be completely removed, but doing so can be quite tedious. Although
non-recursive programs are generally faster than recursive programs, the speed advantage rarely
justifies the lack of clarity that results from removing the recursion.
A queue is also a list, in which insertion is done at one end, whereas deletion is performed at
the other end. Insertion takes place at the rear of the queue and deletion at the front of the
queue. A queue is also called a FIFO (First In First Out) list, which means that the element
inserted first will be removed first from the queue.
Queue Model
1. enqueue, which inserts an element at the end of the list (called the rear)
2. dequeue, which deletes (and returns) the element at the start of the list (known as
the front).
Abstract model of a queue
● Like stacks, both the linked list and array implementations give fast O(1) running times
for every operation. The linked list implementation is straightforward and left as an
exercise. We will now discuss an array implementation of queues.
● For each queue data structure, we keep an array, QUEUE[], and the positions q_front
and q_rear, which represent the ends of the queue. We also keep track of the number of
elements that are actually in the queue, q_size.
The following figure shows a queue in some intermediate state.
● By the way, the cells that are blanks have undefined values in them. In particular, the
first two cells have elements that used to be in the queue.
● To enqueue an element x, we increment q_size and q_rear, then set QUEUE[q_rear] = x.
● To dequeue an element, we set the return value to QUEUE[q_front], decrement
q_size, and then increment q_front. There is one potential problem with this
implementation: after 10 enqueues, the queue appears to be full, since q_rear is now 10,
and the next enqueue would be in a nonexistent position.
● However, there might only be a few elements in the queue, because several elements
may have already been dequeued.
● The simple solution is that whenever q_front or q_rear gets to the end of the array, it is
wrapped around to the beginning. This is known as a circular array implementation.
If incrementing either q_rear or q_front causes it to go past the array, the value is reset to the
first position in the array.
There are two warnings about the circular array implementation of queues.
● First, it is important to check the queue for emptiness, because a dequeue when the
queue is empty will return an undefined value.
● Secondly, some programmers use different ways of representing the front and rear of a
queue. For instance, some do not use an entry to keep track of the size, because they rely
on the base case that when the queue is empty, q_rear = q_front - 1.
If the size is not part of the structure, then if the array size is A_SIZE, the queue is full when
there are A_SIZE -1 elements.
In applications where you are sure that the number of enqueues is not larger than the
size of the queue, the wraparound is not necessary.
Notice that q_rear is preinitialized to 1 before q_front. The final operation we will write
is the enqueue routine.
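struct QueueRecord
{
    int Capacity;
    int Front;
    int Rear;
    int Size;
    ElementType *Array;
};
/* Return true if the queue holds no elements */
int IsEmpty( Queue Q )
{
    return( Q->Size == 0 );
}
/* Rear starts one position before Front, so the first enqueue lands on Front */
void MakeEmpty( Queue Q )
{
    Q->Size = 0;
    Q->Front = 1;
    Q->Rear = 0;
}
/* Advance an index by one, wrapping around to 0 at the end of the array */
static int succ( int value, Queue Q )
{
    if( ++value == Q->Capacity )
        value = 0;
    return value;
}
void enqueue( ElementType x, Queue Q )
{
    if( isfull( Q ) )
        error("Full queue");
    else
    {
        Q->Size++;
        Q->Rear = succ( Q->Rear, Q );
        Q->Array[ Q->Rear ] = x;
    }
}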
Applications of Queues
1. When jobs are submitted to a printer, they are arranged in order of arrival. Then jobs
sent to a line printer are placed on a queue.
2. Lines at ticket counters are queues, because service is first-come first-served.
3. Another example concerns computer networks. There are many network setups of
personal computers in which the disk is attached to one machine, known as the file
server. Users on other machines are given access to files on a first-come first-served
basis, so the data structure is a queue.
Circular Queue:
In a circular queue, the first location of the array logically follows the last one. When the
last location of the array is occupied, a new element is inserted at the first location,
provided that location is free.
Advantages:
To insert an element into the queue, the new rear position is calculated as
rear = (rear + 1) % queue_size, and then Q[rear] = value is set.
Similarly, an element is deleted from the queue by advancing front = (front + 1) % queue_size.
Enqueue:
This routine inserts the new element at the rear position of the circular queue.
Dequeue:
This routine deletes the element from the front of the circular queue.
void CQ_dequeue( )
{
    if( front == -1 )
        printf("Queue is empty");
    else
    {
        Temp = CQueue[front];
        if( front == rear )
            front = rear = -1;   /* the last element was removed */
        else
            front = ( front + 1 ) % maxsize;
    }
}
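The enqueue and dequeue steps can be sketched together as below. The sketch follows the convention used above that front = rear = -1 marks an empty queue; the names CQueue, CQ_MAX and the cq_ functions are assumptions of this illustration, and the capacity of 5 is arbitrary.

```c
#define CQ_MAX 5   /* assumed capacity for this sketch */

typedef struct {
    int data[CQ_MAX];
    int front;     /* index of the oldest element, -1 when empty */
    int rear;      /* index of the newest element, -1 when empty */
} CQueue;

void cq_init(CQueue *q)            { q->front = q->rear = -1; }
int  cq_is_empty(const CQueue *q)  { return q->front == -1; }
int  cq_is_full(const CQueue *q)   { return (q->rear + 1) % CQ_MAX == q->front; }

/* Insert value at the rear. Returns 0 on success, -1 if the queue is full. */
int cq_enqueue(CQueue *q, int value)
{
    if (cq_is_full(q))
        return -1;
    if (cq_is_empty(q))
        q->front = 0;                    /* first element: front becomes valid */
    q->rear = (q->rear + 1) % CQ_MAX;    /* wraps from the last slot to slot 0 */
    q->data[q->rear] = value;
    return 0;
}

/* Remove the front element into *value. Returns 0, or -1 if the queue is empty. */
int cq_dequeue(CQueue *q, int *value)
{
    if (cq_is_empty(q))
        return -1;
    *value = q->data[q->front];
    if (q->front == q->rear)
        q->front = q->rear = -1;         /* the last element was removed */
    else
        q->front = (q->front + 1) % CQ_MAX;
    return 0;
}
```

With this convention all CQ_MAX slots can be used, because the empty state is represented by -1 instead of by comparing front and rear.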
Priority Queue:
In a priority queue, an element with high priority is served before an element with
lower priority.
If two elements have the same priority, they are served according to their order in the
queue.
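One simple way to get this behaviour is a linked list kept sorted by priority, sketched below. The assumptions are that a larger number means a higher priority and that the list is kept in descending priority order; the names PQNode, pq_insert and pq_remove are hypothetical. Other implementations (such as heaps) are possible and usually faster.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct PQNode {
    int value;
    int priority;              /* assumed: larger number = higher priority */
    struct PQNode *next;
} PQNode;

/* Insert value so the list stays ordered by descending priority.
   A new element goes after existing elements of equal priority, so equal
   priorities are served in arrival order. Returns the (possibly new) head. */
PQNode *pq_insert(PQNode *head, int value, int priority)
{
    PQNode *n = malloc(sizeof *n);
    if (n == NULL)
        return head;           /* list left unchanged on allocation failure */
    n->value = value;
    n->priority = priority;

    PQNode **link = &head;     /* pointer to the link we may have to rewrite */
    while (*link != NULL && (*link)->priority >= priority)
        link = &(*link)->next;
    n->next = *link;
    *link = n;
    return head;
}

/* Remove the highest-priority element (the first cell) and store its value.
   The list must not be empty. Returns the new head. */
PQNode *pq_remove(PQNode *head, int *value)
{
    assert(head != NULL);
    PQNode *rest = head->next;
    *value = head->value;
    free(head);
    return rest;
}
```

Insertion is O(n) in this sketch because of the search for the right place, while removal is O(1).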
A tree is a non-linear data structure in which data are stored in a hierarchical manner. It is
also defined as a collection of nodes. The collection can be empty. Otherwise, a tree consists of
a distinguished node r, called the root, and zero or more (sub)trees T1, T2, . . . , Tk, each of
whose roots are connected by a directed edge to r.
The root of each subtree is said to be a child of r, and r is the parent of each subtree root.
A tree is a collection of n nodes, one of which is the root, and n - 1 edges. That there are n - 1
edges follows from the fact that each edge connects some node to its parent and every node
except the root has one parent.
Generic tree
A tree
Terms in Tree
✔ Nodes with the same parent are siblings; thus K, L, and M are all siblings.
Grandparent and grandchild relations can be defined in a similar manner.
✔ A path from node n1 to nk is defined as a sequence of nodes n1, n2, . . . , nk such that ni
is the parent of ni+1 for 1 ≤ i < k.
✔ The length of this path is the number of edges on the path, namely k - 1.
✔ There is a path of length zero from every node to itself.
✔ For any node ni, the depth of ni is the length of the unique path from the
root to ni. Thus, the root is at depth 0.
✔ The height of ni is the length of the longest path from ni to a leaf. Thus all leaves are
at height 0.
✔ The height of a tree is equal to the height of the root.
Example: For the above tree,
Note:
✔ The depth of a tree is equal to the depth of the deepest leaf; this is always
equal to the height of the tree.
✔ If there is a path from n1 to n2, then n1 is an ancestor of n2 and n2 is a
descendant of n1. If n1 ≠ n2, then n1 is a proper ancestor of n2 and n2 is a
proper descendant of n1.
✔ In a tree there is exactly one path from the root to each node.
Based on the number of children allowed for each node, trees are classified into two types.
1. Binary tree
2. General tree
Binary tree
A tree in which every node has at most two children, that is, zero, one, or two children,
is called a binary tree.
Eg:
General Tree
A tree in which a node can have any number of children is called a general tree.
Eg:
Implementation of Trees
1. Array Implementation
2. Linked List implementation
Apart from these two methods, it can also be represented by First Child and Next
sibling Representation.
One way to implement a tree would be to have in each node, besides its data, a pointer
to each child of the node. However, since the number of children per node can vary so greatly
and is not known in advance, it might be infeasible to make the children direct links in the data
structure, because there would be too much wasted space. The solution is simple: Keep the
children of each node in a linked list of tree nodes.
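typedef struct tree_node *tree_ptr;
struct tree_node
{
    element_type element;
    tree_ptr first_child;     /* leftmost child, or NULL */
    tree_ptr next_sibling;    /* next child of the same parent, or NULL */
};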
First child/next sibling representation of the tree shown in the below Figure
Arrows that point downward are first_child pointers. Arrows that go left to right are
next_sibling pointers. Null pointers are not drawn, because there are too many. In the above
tree, node E has both a pointer to a sibling (F) and a pointer to a child (I), while some nodes
have neither.
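The first child/next sibling idea can be tried out with a small sketch. This is only an illustration: element_type is assumed to be char, and the helpers new_node, add_child, count_children and count_nodes are hypothetical names that do not appear in these notes.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_node *tree_ptr;
struct tree_node {
    char element;            /* element_type is assumed to be char here */
    tree_ptr first_child;    /* leftmost child, or NULL */
    tree_ptr next_sibling;   /* next child of the same parent, or NULL */
};

/* Hypothetical helper: allocate a node with no children and no siblings. */
tree_ptr new_node(char element)
{
    tree_ptr n = malloc(sizeof *n);
    assert(n != NULL);
    n->element = element;
    n->first_child = NULL;
    n->next_sibling = NULL;
    return n;
}

/* Hypothetical helper: append child as the last child of parent. */
void add_child(tree_ptr parent, tree_ptr child)
{
    if (parent->first_child == NULL) {
        parent->first_child = child;
    } else {
        tree_ptr c = parent->first_child;
        while (c->next_sibling != NULL)
            c = c->next_sibling;
        c->next_sibling = child;
    }
}

/* Count the children of a node by walking its sibling chain. */
int count_children(tree_ptr node)
{
    int count = 0;
    for (tree_ptr c = node->first_child; c != NULL; c = c->next_sibling)
        count++;
    return count;
}

/* Count every node in the tree rooted at node. */
int count_nodes(tree_ptr node)
{
    if (node == NULL)
        return 0;
    int total = 1;
    for (tree_ptr c = node->first_child; c != NULL; c = c->next_sibling)
        total += count_nodes(c);
    return total;
}
```

Each node stores exactly two pointers regardless of how many children it has, which is the space saving the representation is after.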
Tree Traversals
Visiting each node in a tree exactly once is called tree traversal.
The left subtree and the right subtree are traversed recursively.
1. Inorder Traversal
2. Preorder Traversal
3. Postorder Traversal
Inorder traversal:
Rules:
Eg
Preorder traversal:
Rules:
Postorder traversal:
Rules:
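The three traversals can be sketched for a binary tree as follows, assuming the usual rules: inorder visits left subtree, node, right subtree; preorder visits node, left subtree, right subtree; postorder visits left subtree, right subtree, node. The char element type and the helpers make_node and visit are assumptions made for this illustration; "visiting" a node here just appends its element to a string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct tree_node *tree_ptr;
struct tree_node {
    char element;
    tree_ptr left;
    tree_ptr right;
};

/* Hypothetical helper: build a node from an element and two subtrees. */
tree_ptr make_node(char element, tree_ptr left, tree_ptr right)
{
    tree_ptr n = malloc(sizeof *n);
    assert(n != NULL);
    n->element = element;
    n->left = left;
    n->right = right;
    return n;
}

/* Visit: append the node's element to the NUL-terminated string out. */
static void visit(tree_ptr T, char *out)
{
    size_t len = strlen(out);
    out[len] = T->element;
    out[len + 1] = '\0';
}

void inorder(tree_ptr T, char *out)      /* left, node, right */
{
    if (T == NULL)
        return;
    inorder(T->left, out);
    visit(T, out);
    inorder(T->right, out);
}

void preorder(tree_ptr T, char *out)     /* node, left, right */
{
    if (T == NULL)
        return;
    visit(T, out);
    preorder(T->left, out);
    preorder(T->right, out);
}

void postorder(tree_ptr T, char *out)    /* left, right, node */
{
    if (T == NULL)
        return;
    postorder(T->left, out);
    postorder(T->right, out);
    visit(T, out);
}
```

The caller must supply a buffer large enough for one character per node plus the terminating NUL.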
There are many applications for trees. Most important two applications are,
One of the popular uses is the directory structure in many common operating systems,
including UNIX, VAX/VMS, and DOS.
✔ The root of this directory is /usr. (The asterisk next to the name indicates that /usr is
itself a directory.)
✔ /usr has three children, mark, alex, and bill, which are themselves directories. Thus, /usr
contains three directories and no regular files.
✔ The filename /usr/mark/book/ch1.r is obtained by following the leftmost child three
times. Each / after the first indicates an edge; the result is the full pathname.
✔ Two files in different directories can share the same name, because they must have
different paths from the root and thus have different pathnames.
✔ A directory in the UNIX file system is just a file with a list of all its children, so the
directories are structured almost exactly in accordance with the type declaration.
✔ Each directory in the UNIX file system also has one entry that points to itself and
another entry that point to the parent of the directory. Thus, technically, the UNIX file
system is not a tree, but is treelike.
list_directory ( Directory_or_file D )
{
    list_dir ( D, 0 );
}
list_dir ( Directory_or_file D, unsigned int depth )
{
    if ( D is a legitimate entry )
    {
        print_name ( depth, D );
        if ( D is a directory )
            for each child, c, of D
                list_dir( c, depth+1 );
    }
}
✔ The argument to list_dir is some sort of pointer into the tree. As long as the pointer is
valid, the name implied by the pointer is printed out with the appropriate number of
tabs.
✔ If the entry is a directory, then we process all children recursively, one by one. These
children are one level deeper, and thus need to be indented by one extra tab.
This traversal strategy is known as a preorder traversal. In a preorder traversal, work at a
node is performed before (pre) its children are processed. If there are n file names to be output,
then the running time is O (n).
/usr
    mark
        book
            ch1.r
            ch2.r
            ch3.r
        course
            cop3530
                fall88
                    syl.r
                spr89
                    syl.r
                sum89
                    syl.r
        junk.c
    alex
        junk.c
    bill
        work
        course
            cop3212
                fall88
                    grades
                    prog1.r
                    prog2.r
                fall89
                    prog1.r
                    prog2.r
                    grades
To calculate the total number of blocks used by all the files in the tree, the most natural way would be to find the number of blocks contained in the
subdirectories /usr/mark (30), /usr/alex (9), and /usr/bill (32). The total number of blocks is then
the total in the subdirectories (71) plus the one block used by /usr, for a total of 72.
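size_directory( Directory_or_file D ) {
    total_size = 0;
    if( D is a legitimate entry ) {
        total_size = file_size( D );
        if( D is a directory )
            for each child, c, of D
                total_size += size_directory( c );
    }
    return( total_size );
}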
            ch1.r        3
            ch2.r        2
            ch3.r        4
        book            10
                    syl.r    1
                fall88       2
                    syl.r    5
                spr89        6
                    syl.r    2
                sum89        3
            cop3530     12
        course          13
        junk.c           6
    mark                30
        junk.c           8
    alex                 9
        work             1
                    grades   3
                    prog1.r  4
                    prog2.r  1
                fall88       9
                    prog2.r  2
                    prog1.r  7
                    grades   9
                fall89      19
            cop3212     29
        course          30
    bill                32
/usr                    72
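The size computation traced above is a postorder traversal: the total for a directory is known only after all of its children have been totalled. A minimal sketch, assuming each entry is stored in first child/next sibling form; the struct entry type and its field names are hypothetical and are not taken from any real file system API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical directory entry in first child/next sibling form. */
struct entry {
    const char *name;
    unsigned int blocks;          /* blocks used by this entry itself */
    struct entry *first_child;    /* NULL for a file or an empty directory */
    struct entry *next_sibling;
};

/* Postorder: total the children first, then add the entry's own blocks. */
unsigned int size_directory(const struct entry *d)
{
    unsigned int total_size = 0;
    if (d != NULL) {
        for (const struct entry *c = d->first_child; c != NULL; c = c->next_sibling)
            total_size += size_directory(c);
        total_size += d->blocks;
    }
    return total_size;
}
```

For the book directory in the trace, the three files contribute 3 + 2 + 4 blocks and the directory itself contributes 1, giving 10.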
Binary Trees
A binary tree is a tree in which no node can have more than two children.
Figure shows that a binary tree consists of a root and two subtrees, Tl and Tr, both of which could possibly be empty.
Implementation
Because a binary tree node has at most two children, we can keep direct pointers to them. The
declaration of tree nodes is similar in structure to that for doubly linked lists, in that a node is a
structure consisting of the key information plus two pointers (left and right) to other nodes.
typedef struct tree_node *tree_ptr;
struct tree_node
{
    element_type element;
    tree_ptr left;
    tree_ptr right;
};
Expression Trees
When an expression is represented as a binary tree, the tree is called an expression tree. The
leaves of an expression tree are operands, such as constants or variable names, and the other
nodes contain operators. It is possible for nodes to have more than two children. It is also
possible for a node to have only one child, as is the case with the unary minus operator.
We can evaluate an expression tree, T, by applying the operator at the root to the values
obtained by recursively evaluating the left and right subtrees.
In our example, the left subtree evaluates to a + (b * c) and the right subtree evaluates to
((d *e) + f ) *g. The entire tree therefore represents (a + (b*c)) + (((d * e) + f)* g).
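We can produce an (overly parenthesized) infix expression by recursively
producing a parenthesized left expression, then printing out the operator at the
root, and finally recursively producing a parenthesized right expression. This
general strategy (left, node, right) is known as an inorder traversal; it gives the infix
expression.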
An alternate traversal strategy is to recursively print out the left subtree, the
right subtree, and then the operator. If we apply this strategy to our tree above, the output is a b
c * + d e * f + g * +, which is called the postfix expression. This traversal strategy is generally
known as a postorder traversal.
A third traversal strategy is to print out the operator first and then recursively print out
the left and right subtrees. The resulting expression, + + a * b c * + * d e f g, is the less useful
prefix notation, and the traversal strategy is a preorder traversal.
As an example of constructing an expression tree from a postfix expression, suppose the input is
a b + c d e + * *
The first two symbols are operands, so we create one-node trees and push pointers to
them onto a stack.
Next, a '+' is read, so two pointers to trees are popped, a new tree is formed, and a pointer to it is
pushed onto the stack.
Next, c, d, and e are read, and for each a one-node tree is created and a pointer
to the corresponding tree is pushed onto the stack.
Now a '+' is read, so two trees are merged. Continuing, a '*' is read, so we pop two tree pointers
and form a new tree with a '*' as root.
Finally, the last symbol is read, two trees are merged, and a pointer to the final
tree is left on the stack.
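The stack-based construction walked through above can be sketched as follows. This is an illustration under stated assumptions: operands are single letters or digits, every operator is binary, the stack holds at most 64 tree pointers, and the names make_node, build_expression_tree and to_infix are hypothetical. Nodes already allocated are not freed when the input turns out to be malformed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct tree_node *tree_ptr;
struct tree_node {
    char element;
    tree_ptr left, right;
};

static tree_ptr make_node(char element, tree_ptr left, tree_ptr right)
{
    tree_ptr n = malloc(sizeof *n);
    assert(n != NULL);
    n->element = element;
    n->left = left;
    n->right = right;
    return n;
}

/* Build an expression tree from a postfix string.
   Returns NULL if the input is malformed. */
tree_ptr build_expression_tree(const char *postfix)
{
    tree_ptr stack[64];
    int top = 0;
    for (const char *p = postfix; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isalnum((unsigned char)*p)) {        /* operand: push a one-node tree */
            if (top == 64)
                return NULL;
            stack[top++] = make_node(*p, NULL, NULL);
        } else {                                 /* operator: pop two trees, merge, push */
            if (top < 2)
                return NULL;
            tree_ptr right = stack[--top];
            tree_ptr left = stack[--top];
            stack[top++] = make_node(*p, left, right);
        }
    }
    return top == 1 ? stack[0] : NULL;
}

/* Inorder traversal that appends a fully parenthesized infix form to out. */
void to_infix(tree_ptr T, char *out)
{
    if (T == NULL)
        return;
    if (T->left == NULL && T->right == NULL) {
        strncat(out, &T->element, 1);
        return;
    }
    strcat(out, "(");
    to_infix(T->left, out);
    strncat(out, &T->element, 1);
    to_infix(T->right, out);
    strcat(out, ")");
}
```

The first tree popped becomes the right child, because it was pushed last; swapping the two would reverse the operands of non-commutative operators.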
The property that makes a binary tree into a binary search tree is that for every
node, X, in the tree, the values of all the keys in the left subtree are smaller than the key
value in X, and the values of all the keys in the right subtree are larger than the key value
in X.
Notice that this implies that all the elements in the tree can be ordered in some
consistent manner.
In the above figure, the tree on the left is a binary search tree, but the tree on the right is
not. The tree on the right has a node with key 7 in the left subtree of a node with key 6. The
average depth of a binary search tree is O(log n).
typedef struct tree_node *tree_ptr;
struct tree_node
{
    element_type element;
    tree_ptr left;
    tree_ptr right;
};
Make Empty:
This operation is mainly for initialization. Some programmers prefer to initialize the
first element as a one-node tree, but our implementation follows the recursive definition of trees
more closely.
Find
This operation generally requires returning a pointer to the node in tree T that has key x,
or NULL if there is no such node. The structure of the tree makes this simple. If T is NULL, then we
can just return NULL. Otherwise, if the key stored at T is x, we can return T. Otherwise, we make a
recursive call on a subtree of T, either left or right, depending on the relationship of x to the key
stored in T.
tree_ptr makeempty( tree_ptr T )
{
    if( T != NULL )
    {
        makeempty( T->left );
        makeempty( T->right );
        free( T );
    }
    return NULL;
}
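tree_ptr find( element_type x, tree_ptr T )
{
    if( T == NULL )
        return NULL;
    else if( x < T->element )
        return find( x, T->left );
    else if( x > T->element )
        return find( x, T->right );
    else
        return T;
}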
These routines return the position of the smallest and largest elements in the
tree, respectively.
To perform a findmin, start at the root and go left as long as there is a left child. The
stopping point is the smallest element.
The findmax routine is the same, except that branching is to the right child.
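/* Recursive implementation of find_min */
tree_ptr find_min( tree_ptr T )
{
    if( T == NULL )
        return NULL;
    else if( T->left == NULL )
        return( T );
    else
        return( find_min( T->left ) );
}
/* Recursive implementation of find_max */
tree_ptr find_max( tree_ptr T )
{
    if( T == NULL )
        return NULL;
    else if( T->right == NULL )
        return( T );
    else
        return( find_max( T->right ) );
}
/* Nonrecursive implementation of find_min (alternative to the recursive version) */
tree_ptr find_min( tree_ptr T )
{
    if( T != NULL )
        while( T->left != NULL )
            T = T->left;
    return( T );
}
/* Nonrecursive implementation of find_max (alternative to the recursive version) */
tree_ptr find_max( tree_ptr T )
{
    if( T != NULL )
        while( T->right != NULL )
            T = T->right;
    return( T );
}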
Insert
To insert x into tree T, proceed down the tree. If x is found, do nothing. Otherwise,
insert x at the last spot on the path traversed.
To insert 5, we traverse the tree as though a find were occurring. At the node with key 4,
we need to go right, but there is no subtree, so 5 is not in the tree, and this is the correct spot.
Insertion routine
Since T points to the root of the tree, and the root changes on the first insertion, insert is
written as a function that returns a pointer to the root of the new tree.
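tree_ptr insert( element_type x, tree_ptr T )
{
    if( T == NULL )
    {
        T = malloc( sizeof( struct tree_node ) );   /* create a one-node tree */
        if( T == NULL )
            fatal_error("Out of space!!!");
        else
        {
            T->element = x;
            T->left = T->right = NULL;
        }
    }
    else if( x < T->element )
        T->left = insert( x, T->left );
    else if( x > T->element )
        T->right = insert( x, T->right );
    return T;   /* if x is already in the tree, nothing is done */
}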
Delete
Once we have found the node to be deleted, we need to consider several possibilities.
If the node is a leaf, it can be deleted immediately.
If the node has one child, the node can be deleted after its parent adjusts a pointer to
bypass the node.
The complicated case deals with a node with two children. The general strategy is to replace the key of this node with
the smallest key of the right subtree and recursively delete that node. Because the smallest node
in the right subtree cannot have a left child, the second delete is an easy one.
In the example, the node to be deleted is the left child of the root; the key value is 2. It is replaced with
the smallest key in its right subtree (3), and then that node is deleted as before.
If the number of deletions is expected to be small, a popular strategy to
use is lazy deletion: when an element is to be deleted, it is left in the tree and merely marked as
being deleted.
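tree_ptr delete( element_type x, tree_ptr T )
{
    tree_ptr tmp_cell;
    if( T == NULL )
        error("Element not found");
    else if( x < T->element )          /* go left */
        T->left = delete( x, T->left );
    else if( x > T->element )          /* go right */
        T->right = delete( x, T->right );
    else if( T->left && T->right )     /* found: two children */
    {
        /* replace with the smallest key in the right subtree */
        tmp_cell = find_min( T->right );
        T->element = tmp_cell->element;
        T->right = delete( T->element, T->right );
    }
    else                               /* found: one child or none */
    {
        tmp_cell = T;
        if( T->left == NULL )          /* Only a right child */
            T = T->right;
        else if( T->right == NULL )    /* Only a left child */
            T = T->left;
        free( tmp_cell );
    }
    return T;
}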
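The find, findmin, insert and delete routines can be combined into one compilable sketch. It assumes int keys and uses the hypothetical names bst_find, bst_find_min, bst_insert and bst_delete so that nothing clashes with library names; deleting a key that is absent is treated as a no-op here rather than an error.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_node *tree_ptr;
struct tree_node {
    int element;               /* element_type is assumed to be int here */
    tree_ptr left, right;
};

/* Return the node holding x, or NULL if x is not in the tree. */
tree_ptr bst_find(int x, tree_ptr T)
{
    while (T != NULL && x != T->element)
        T = (x < T->element) ? T->left : T->right;
    return T;
}

/* Go left as long as there is a left child. */
tree_ptr bst_find_min(tree_ptr T)
{
    if (T != NULL)
        while (T->left != NULL)
            T = T->left;
    return T;
}

/* Insert x and return the (possibly new) root; duplicates are ignored. */
tree_ptr bst_insert(int x, tree_ptr T)
{
    if (T == NULL) {
        T = malloc(sizeof *T);
        assert(T != NULL);
        T->element = x;
        T->left = T->right = NULL;
    } else if (x < T->element) {
        T->left = bst_insert(x, T->left);
    } else if (x > T->element) {
        T->right = bst_insert(x, T->right);
    }
    return T;
}

/* Delete x and return the (possibly new) root. */
tree_ptr bst_delete(int x, tree_ptr T)
{
    if (T == NULL)
        return NULL;                       /* not found: nothing to do */
    if (x < T->element) {
        T->left = bst_delete(x, T->left);
    } else if (x > T->element) {
        T->right = bst_delete(x, T->right);
    } else if (T->left != NULL && T->right != NULL) {
        /* two children: copy the smallest key of the right subtree, then delete it there */
        tree_ptr min = bst_find_min(T->right);
        T->element = min->element;
        T->right = bst_delete(T->element, T->right);
    } else {
        /* zero or one child: bypass the node */
        tree_ptr old = T;
        T = (T->left != NULL) ? T->left : T->right;
        free(old);
    }
    return T;
}
```

Because insert and delete return the root of the resulting subtree, the caller always writes `t = bst_insert(x, t)` or `t = bst_delete(x, t)`.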
✔ Intuitively, all of the operations of the previous section, except makeempty, should take O(log n)
time, because in constant time we descend a level in the tree, thus operating on a tree
that is now roughly half as large.
✔ More precisely, the running time of all the operations, except makeempty, is O(d), where d is the depth
of the node containing the accessed key.
✔ The average depth over all nodes in a tree is O(log n), on the assumption that all insertion sequences are equally likely.
✔ The sum of the depths of all nodes in a tree is known as the internal path length.
AVL Trees
An alternative approach is to forgo the balance condition and allow the tree to be arbitrarily deep, but after every
operation, a restructuring rule is applied that tends to make future operations efficient. These
types of data structures are generally classified as self-adjusting.
An AVL tree is identical to a binary search tree, except that for every node in the tree,
the height of the left and right subtrees can differ by at most 1. (The height of an empty tree is
defined to be -1.)
An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance
condition. The simplest idea is to require that the left and right subtrees have the same height.
The balance condition must be easy to maintain, and it ensures that the depth of the tree is
O(log n).
The above figure shows a bad binary tree. Requiring balance at the root is not enough.
In Figure, the tree on the left is an AVL tree, but the tree on the right is not.
Thus, all the tree operations can be performed in O(log n) time, except possibly insertion.
When we do an insertion, we need to update all the balancing information for the nodes
on the path back to the root, but the reason that insertion is difficult is that inserting a node
could violate the AVL tree property.
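Inserting a node into an AVL tree could destroy the balance condition.
Let us call the unbalanced node α. A violation due to insertion might occur in four cases:
1. An insertion into the left subtree of the left child of α
2. An insertion into the right subtree of the left child of α
3. An insertion into the left subtree of the right child of α
4. An insertion into the right subtree of the right child of α
Cases 1 and 4 are fixed by a single rotation; cases 2 and 3 are fixed by a double rotation.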
Types of rotation
1. Single Rotation
2. Double Rotation
The two trees in the above Figure contain the same elements and are both binary search
trees.
First of all, in both trees k1 < k2. Second, all elements in the subtree X are smaller than
k1 in both trees. Third, all elements in subtree Z are larger than k2. Finally, all elements in
subtree Y are in between k1 and k2. The conversion of one of the above trees to the other is
known as a rotation.
If an insertion causes some node in an AVL tree to lose the balance
property, we do a rotation at that node.
The basic algorithm is to start at the node inserted and travel up the tree, updating the balance
information at every node on the path.
In the above figure, after the insertion of a new key into the original AVL tree on the left, node 8
becomes unbalanced. Thus, we do a single rotation between 7 and 8, obtaining the tree on the
right.
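Routine:
/* Single rotation with the left child: k2 has a left child k1. */
Position singlerotatewithleft( Position k2 )
{
    Position k1;
    k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    k2->height = max( height( k2->left ), height( k2->right ) ) + 1;
    k1->height = max( height( k1->left ), k2->height ) + 1;
    return k1;   /* new root */
}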
Suppose we start with an initially empty AVL tree and insert the keys 1 through 7 in
sequential order. The first problem occurs when it is time to insert key 3, because the AVL
property is violated at the root. We perform a single rotation between the root and its right child
to fix the problem. The tree is shown in the following figure, before and after the rotation.
A dashed line indicates the two nodes that are the subject of the rotation. Next, we insert the
key 4, which causes no problems, but the insertion of 5 creates a violation at node 3, which is
fixed by a single rotation.
Next, we insert 6. This causes a balance problem for the root, since its left subtree is of
height 0, and its right subtree would be height 2. Therefore, we perform a single rotation at the
root between 2 and 4.
The rotation is performed by making 2 a child of 4 and making 4's original left subtree
the new right subtree of 2. Every key in this subtree must lie between 2 and 4, so this
transformation makes sense. The next key we insert is 7, which causes another rotation.
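Routine:
/* Single rotation with the right child: k1 has a right child k2. */
Position singlerotatewithright( Position k1 )
{
    Position k2;
    k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    k1->height = max( height( k1->left ), height( k1->right ) ) + 1;
    k2->height = max( height( k2->right ), k1->height ) + 1;
    return k2;   /* new root */
}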
Double Rotation
In the above diagram, suppose we insert keys 8 through 15 in reverse order. Inserting 15 is
easy, since it does not destroy the balance property, but inserting 14 causes a height imbalance
at node 7.
As the diagram shows, the single rotation has not fixed the height imbalance. The problem is
that the height imbalance was caused by a node inserted into the tree containing the middle
elements (tree Y in Fig. (Right-left) double rotation) at the same time as the other trees had
identical heights. The fix is called a double rotation, which is similar to a single rotation
but involves four subtrees instead of three.
In our example, the double rotation is a right-left double rotation and involves 7, 15, and
14. Here, k3 is the node with key 7, k1 is the node with key 15, and k2 is the node with key 14.
Next we insert 13, which require a double rotation. Here the double rotation is again a
right-left double rotation that will involve 6, 14, and 7 and will restore the tree. In this case, k3
is the node with key 6, k1 is the node with key 14, and k2 is the node with key 7. Subtree A is
the tree rooted at the node with key 5, subtree B is the empty subtree that was originally the left
child of the node with key 7, subtree C is the tree rooted at the node with key 13, and finally,
subtree D is the tree rooted at the node with key 15.
If 12 is now inserted, there is an imbalance at the root. Since 12 is not between
4 and 7, we know that the single rotation will work. Insertion of 11 will require a single
rotation:
To insert 10, a single rotation needs to be performed, and the same is true for the
subsequent insertion of 9. We insert 8 without a rotation, creating the almost perfectly balanced
tree.
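Position doublerotatewithleft( Position k3 )
{
    k3->left = singlerotatewithright( k3->left );
    return singlerotatewithleft( k3 );
}
Position doublerotatewithright( Position k1 )
{
    k1->right = singlerotatewithleft( k1->right );
    return singlerotatewithright( k1 );
}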
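typedef struct avlnode *Position;
typedef struct avlnode *avltree;
struct avlnode
{
    elementtype element;
    avltree left;
    avltree right;
    int height;
};
/* Height of a node; an empty tree has height -1 */
int height( Position p )
{
    if( p == NULL )
        return -1;
    else
        return p->height;
}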
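The pieces above (the height field, the single rotations, and the double rotations built from them) can be combined into an insertion routine. The following is a minimal sketch, assuming int keys; the names avl_insert, rotate_with_left, rotate_with_right and max_int are hypothetical. A double rotation is written as two single rotations.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct avlnode *avltree;
struct avlnode {
    int element;
    avltree left, right;
    int height;
};

int height(avltree p) { return p == NULL ? -1 : p->height; }
int max_int(int a, int b) { return a > b ? a : b; }

/* Single rotation with the left child; returns the new subtree root. */
avltree rotate_with_left(avltree k2)
{
    avltree k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    k2->height = max_int(height(k2->left), height(k2->right)) + 1;
    k1->height = max_int(height(k1->left), k2->height) + 1;
    return k1;
}

/* Single rotation with the right child; returns the new subtree root. */
avltree rotate_with_right(avltree k1)
{
    avltree k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    k1->height = max_int(height(k1->left), height(k1->right)) + 1;
    k2->height = max_int(height(k2->right), k1->height) + 1;
    return k2;
}

/* Insert x, rebalance on the way back up, and return the new root. */
avltree avl_insert(int x, avltree T)
{
    if (T == NULL) {
        T = malloc(sizeof *T);
        assert(T != NULL);
        T->element = x;
        T->left = T->right = NULL;
        T->height = 0;
        return T;
    }
    if (x < T->element) {
        T->left = avl_insert(x, T->left);
        if (height(T->left) - height(T->right) == 2) {
            if (x > T->left->element)            /* left-right case: double rotation */
                T->left = rotate_with_right(T->left);
            T = rotate_with_left(T);
        }
    } else if (x > T->element) {
        T->right = avl_insert(x, T->right);
        if (height(T->right) - height(T->left) == 2) {
            if (x < T->right->element)           /* right-left case: double rotation */
                T->right = rotate_with_left(T->right);
            T = rotate_with_right(T);
        }
    }
    T->height = max_int(height(T->left), height(T->right)) + 1;
    return T;
}
```

Inserting the keys 1 through 7 in order with this sketch gives a tree rooted at 4 with height 2, matching the sequence of single rotations described in the text.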
B-Trees
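The AVL tree and the splay tree are binary; there is also a popular search tree that is not binary. This
tree is known as a B-tree.
A B-tree of order m is a tree with the following structural properties:
● The root is either a leaf or has between 2 and m children.
● All nonleaf nodes (except the root) have between ⌈m/2⌉ and m children.
● All leaves are at the same depth.
All data is stored at the leaves. Contained in each interior node are pointers
p1, p2, . . . , pm to the children, and values k1, k2, . . . , km - 1, representing the smallest key
found in the subtrees p2, p3, . . . , pm respectively. Some of these pointers might be NULL, and
the corresponding ki would then be undefined.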
For every node, all the keys in subtree p1 are smaller than the keys in subtree p2, and so on.
The leaves contain all the actual data, which is either the keys themselves or pointers to records
containing the keys.
A B-tree of order 4 is more popularly known as a 2-3-4 tree, and a B-tree of order 3 is
known as a 2-3 tree
We have drawn interior nodes (nonleaves) in ellipses, which contain the two pieces of
data for each node. A dashed line as a second piece of information in an interior node indicates
that the node has only two children. Leaves are drawn in boxes, which contain the keys. The
keys in the leaves are ordered.
To perform a find, we start at the root and branch in one of (at most) three directions,
depending on the relation of the key we are looking for to the two values stored at the node.
To insert a previously unseen key x, we follow the path as though we were performing a find.
When we get to a leaf node, we have found the correct place to put x. Thus, to insert a
node with key 18, we can just add it to a leaf without causing any violations of the 2-3 tree
properties. The result is shown in the following figure.
If we now try to insert 1 into the tree, we find that the node where it belongs is already
full. Placing our new key into this node would give it a fourth element which is not allowed.
This can be solved by making two nodes of two keys each and adjusting the information in the
parent.
To insert 19 into the current tree, we again make two nodes of two keys each and obtain the following tree.
This tree has an internal node with four children, but we only allow three per node.
Again split this node into two nodes with two children. Now this node might be one of three
children itself, and thus splitting it would create a problem for its parent but we can keep on
splitting nodes on the way up to the root until we either get to the root or find a node with only
two children.
If we now insert an element with key 28, we create a leaf with four keys, which is
split into two leaves of two keys each.
This creates an internal node with four children, which is then split into two nodes with two children each.
There are other ways to handle a node that becomes overloaded. For example, to insert 70 into the tree above, we could move 58 to the leaf containing 41 and 52, place
70 with 59 and 61, and adjust the entries in the internal nodes.
Deletion in B-Tree
We can perform deletion by finding the key to be deleted and removing it.
● If this key was one of only two keys in a node, then its removal leaves only one key. We
can fix this by combining this node with a sibling. If the sibling has three keys, we can
steal one and have both nodes with two keys.
● If the sibling has only two keys, we combine the two nodes into a single node with three
keys. The parent of this node now loses a child, so we might have to percolate this
strategy all the way to the top.
● If the root loses its second child, then the root is also deleted and the tree becomes one
level shallower.
When a node is split, we repeat the splitting up the tree until we find a parent with fewer than m children. If we split the root, we
create a new root with two children.
The worst-case running time for each of the insert and delete operations is thus
O(m log_m n) = O((m / log m) log n), but a find takes only O(log n).
A priority queue is a queue in which the elements are dequeued according to their
priority rather than their order of arrival.
● Jobs sent to a line printer are generally placed on a queue. For instance, one job might
be particularly important, so that it might be desirable to allow that job to be run as soon
as the printer is available.
● In a multiuser environment, the operating system scheduler must decide which of
several processes to run. Generally a process is only allowed to run for a fixed period of
time. One algorithm uses a queue. Jobs are initially placed at the end of the queue. The
scheduler will repeatedly take the first job on the queue, run it until either it finishes or
its time limit is up, and place it at the end of the queue. This strategy is generally not
appropriate, because very short jobs will seem to take a long time because of the wait
involved to run. Generally, it is important that short jobs finish as fast as possible. This
is called Shortest Job First (SJF) scheduling. This particular application seems to require a
special kind of queue, known as a priority queue.
Basic model of a priority queue
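A priority queue is a data structure that allows at least the following two operations:
1. insert, which adds an element to the queue
2. delete_min, which finds, returns, and removes the minimum element
A priority queue can be implemented in the following ways: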
1. Array Implementation
2. Linked list Implementation
3. Binary Search Tree implementation
4. Binary Heap Implementation
Array Implementation
Drawbacks:
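1. Memory may be wasted, because the maximum size of the array must be
defined in advance.
2. If the array is kept sorted, an insertion takes O(N) time, because elements must be shifted to make room.
3. If new elements are simply added at the end of the array, delete_min takes O(N) time, because the whole array must be searched.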
Another way of implementing priority queues would be to use a binary search tree. This
gives an O(log n) average running time for both operations.
Another way of implementing priority queues would be to use a binary heap. This gives
an O(log n) worst-case running time for both operations, and insertion takes O(1) time on average.
Binary Heap
Like binary search trees, heaps have two properties, namely, a structure property and a heap
order property. As with AVL trees, an operation on a heap can destroy one of the properties,
so a heap operation must not terminate until all heap properties are in order.
1. Structure Property
2. Heap Order Property
Structure Property
A heap is a binary tree that is completely filled, with the possible exception of the
bottom level, which is filled from left to right. Such a tree is known as a complete binary
tree.
A complete binary tree of height h has between 2^h and 2^(h+1) - 1 nodes. This implies that
the height of a complete binary tree is ⌊log n⌋, which is clearly O(log n).
For any element in array position i, the left child is in position 2i, the right
child is in the cell after the left child (2i + 1), and the parent is in position
⌊i/2⌋.
The only problem with this implementation is that an estimate of the maximum heap
size is required in advance.
Types of Binary Heap
Min Heap
A binary heap is said to be a min heap if, for any node X in the heap, the key value of
X is smaller than or equal to the keys of all of its descendants.
Max Heap
A binary heap is said to be a max heap if, for any node X in the heap, the key
value of X is larger than or equal to the keys of all of its descendants.
Since we want to be able to find the minimum quickly, it makes sense that the smallest element should
be at the root. If we consider that any subtree should also be a heap, then any node should be
smaller than all of its descendants.
Applying this logic, we arrive at the heap order property. In a heap, for every node X,
the key in the parent of X is smaller than (or equal to) the key in X.
Similarly we can declare a (max) heap, which enables us to efficiently find and remove
the maximum element, by changing the heap order property. Thus, a priority queue can be used
to find either a minimum or a maximum.
By the heap order property, the minimum element can always be found at the root.
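struct heapstruct
{
    int capacity;              /* maximum number of elements */
    int size;                  /* current number of elements */
    element_type *elements;    /* elements[0] is reserved for the sentinel */
};
typedef struct heapstruct *priorityQ;
priorityQ create_pq( int max_elements )
{
    priorityQ H;
    H = malloc( sizeof( struct heapstruct ) );
    if( H == NULL )
        fatal_error("Out of space!!!");
    /* allocate the array plus one extra slot for the sentinel */
    H->elements = malloc( ( max_elements + 1 ) * sizeof( element_type ) );
    if( H->elements == NULL )
        fatal_error("Out of space!!!");
    H->capacity = max_elements;
    H->size = 0;
    H->elements[0] = MIN_DATA;
    return H;
}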
It is easy to perform the two required operations. All the work involves ensuring that the
heap order property is maintained.
1. Insert
2. Deletemin
Insert
To insert an element x into the heap, we create a hole in the next available location,
since otherwise the tree will not be complete.
If x can be placed in the hole without violating heap order, then we do so and are done.
Otherwise we slide the element that is in the hole's parent node into the hole, thus bubbling the
hole up toward the root. We continue this process until x can be placed in the hole.
Figure shows that to insert 14, we create a hole in the next available heap location.
Inserting 14 in the hole would violate the heap order property, so 31 is slid down into the hole.
This strategy is continued until the correct location for 14 is found. This general strategy
is known as a percolate up; the new element is percolated up the heap until the correct location
is found.
We could have implemented the percolation in the insert routine by performing repeated
swaps until the correct order was established, but a swap requires three assignment statements.
If an element is percolated up d levels, the number of assignments performed by the swaps
would be 3d. Our method uses d + 1 assignments.
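/* H->elements[0] is a sentinel */
void insert( element_type x, priorityQ H )
{
    int i;
    if( is_full( H ) )
        error("Priority queue is full");
    else
    {
        i = ++H->size;
        while( H->elements[i/2] > x )     /* percolate the hole up */
        {
            H->elements[i] = H->elements[i/2];
            i /= 2;
        }
        H->elements[i] = x;
    }
}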
If the element to be inserted is the new minimum, it will be pushed all the way to the
top. The time to do the insertion could be as much as O (log n), if the element to be inserted is
the new minimum and is percolated all the way to the root. On average, the percolation terminates early.
Deletemin
Deletemins are handled in a similar manner as insertions. Finding the minimum is easy;
the hard part is removing it.
When the minimum is removed, a hole is created at the root. Since the heap now
becomes one smaller, it follows that the last element x in the heap must move somewhere in the
heap. If x can be placed in the hole, then we are done. This is unlikely, so we slide the smaller
of the hole's children into the hole, thus pushing the hole down one level. We repeat this step
until x can be placed in the hole. This general strategy is known as a percolate down.
In Figure, after 13 is removed, we must now try to place 31 in the heap. 31 cannot be
placed in the hole, because this would violate heap order. Thus, we place the smaller child (14)
in the hole, sliding the hole down one level. We repeat this again, placing 19 into the hole and
creating a new hole one level deeper. We then place 26 in the hole and create a new hole on the
bottom level. Finally, we are able to place 31 in the hole.
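element_type delete_min( priorityQ H )
{
    int i, child;
    element_type min_element, last_element;
    if( is_empty( H ) )
    {
        error("Priority queue is empty");
        return H->elements[0];
    }
    min_element = H->elements[1];
    last_element = H->elements[H->size--];
    for( i = 1; i*2 <= H->size; i = child )
    {
        /* find the smaller child */
        child = i*2;
        if( ( child != H->size ) && ( H->elements[child+1] < H->elements[child] ) )
            child++;
        /* percolate the hole down one level */
        if( last_element > H->elements[child] )
            H->elements[i] = H->elements[child];
        else
            break;
    }
    H->elements[i] = last_element;
    return min_element;
}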
The worst-case running time for this operation is O(log n). On average, the element that
is placed at the root is percolated almost to the bottom of the heap, so the average running time
is O (log n).
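The insert and deletemin routines can be exercised with a self-contained sketch. It assumes int keys and uses INT_MIN from <limits.h> as the sentinel in position 0; the names pq_create, pq_insert and pq_delete_min are hypothetical, and a full queue is reported through the return value instead of an error routine.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

struct heapstruct {
    int capacity;
    int size;
    int *elements;      /* elements[0] is the sentinel; the heap is in 1..size */
};
typedef struct heapstruct *priorityQ;

priorityQ pq_create(int max_elements)
{
    priorityQ H = malloc(sizeof *H);
    assert(H != NULL);
    H->elements = malloc((max_elements + 1) * sizeof *H->elements);
    assert(H->elements != NULL);
    H->capacity = max_elements;
    H->size = 0;
    H->elements[0] = INT_MIN;   /* sentinel: no key is smaller than this */
    return H;
}

/* Percolate up. Returns 1 on success, 0 if the queue is full. */
int pq_insert(int x, priorityQ H)
{
    if (H->size == H->capacity)
        return 0;
    int i = ++H->size;                      /* hole in the next available location */
    while (H->elements[i / 2] > x) {        /* slide the parent down into the hole */
        H->elements[i] = H->elements[i / 2];
        i /= 2;
    }
    H->elements[i] = x;
    return 1;
}

/* Percolate down. Precondition: the queue is not empty. */
int pq_delete_min(priorityQ H)
{
    assert(H->size > 0);
    int min_element = H->elements[1];
    int last_element = H->elements[H->size--];
    int i, child;
    for (i = 1; i * 2 <= H->size; i = child) {
        child = i * 2;                      /* pick the smaller child */
        if (child != H->size && H->elements[child + 1] < H->elements[child])
            child++;
        if (last_element > H->elements[child])
            H->elements[i] = H->elements[child];
        else
            break;
    }
    H->elements[i] = last_element;
    return min_element;
}
```

Calling pq_delete_min repeatedly until the queue is empty returns the keys in nondecreasing order.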
1. Decreasekey
2. Increasekey
3. Delete
4. Buildheap
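Decreasekey
The decreasekey(x, ∆, H) operation lowers the value of the key at position x by a positive
amount ∆. Since this might violate the heap order, it must be fixed by a percolate up.
USE:
This operation could be useful to system administrators: they can make their programs
run with highest priority.
Increasekey
The increasekey(x, ∆, H) operation increases the value of the key at position x by a positive
amount ∆. This is done with a percolate down.
USE:
Many schedulers automatically drop the priority of a process that is consuming excessive CPU time.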
Delete
The delete(x, H) operation removes the node at position x from the heap. This is done
by first performing decreasekey(x, ∞, H) and then performing deletemin(H). When a process is
terminated by a user, it must be removed from the priority queue.
Buildheap
The buildheap(H) operation takes as input n keys and places them into an empty heap.
This can be done with n successive inserts. Since each insert will take O(1) average and O(log
n) worst-case time, the total running time of this algorithm would be O(n) average but O(n log
n) worst-case.
UNIT IV – GRAPHS
Graph
A graph G = (V, E) consists of a set of vertices, V, and a set of edges, E. Each
edge is a pair (v, w), where v, w ∈ V. Edges are sometimes referred to as arcs.
[Figure: a graph in which A, B, C, D and E are vertices, joined by edges (arcs)]
Types of graph
1. Directed Graph
If the pair is ordered, then the graph is directed. In other words, if all the edges are
directionally oriented, the graph is called a directed graph. Directed graphs are
sometimes referred to as digraphs.
2. Undirected Graph
In a graph, if none of the edges is directionally oriented, then the graph is called an
undirected graph. In an undirected graph with edge (v,w), and hence (w,v), w is adjacent to v
and v is adjacent to w.
3. Mixed Graph
In a graph, if some edges are directed and others are undirected, then it is
called a mixed graph.
Path
A path in a graph is a sequence of vertices w1, w2, w3, . . . , wn such that (wi, wi+1) ∈ E
for 1 ≤ i < n.
Path length
The length of a path is the number of edges on the path, which is equal to n – 1 where n
is the no of vertices.
Loop
A path from a vertex to itself; if this path contains no edges, then the path length is 0. If
the graph contains an edge (v,v) from a vertex to itself, then the path v, v is sometimes referred
to as a loop.
Simple Path
A simple path is a path such that all vertices are distinct, except that the first and last
could be the same.
Cycle
A path of length at least 1 that starts and ends at the same vertex is known as a cycle.
Cyclic Graph
A graph that contains at least one cycle is called a cyclic graph.
Acyclic Graph
A directed graph is acyclic if it has no cycles. A directed acyclic graph is also referred
as DAG.
Connected Graph
An undirected graph is connected if there is a path from every vertex to every other
vertex.
A directed graph is called strongly connected if there is a path from every vertex to
every other vertex.
If a directed graph is not strongly connected, but the underlying graph (without direction
to the arcs) is connected, then the graph is said to be weakly connected.
Complete graph
A complete graph is a graph in which there is an edge between every pair of vertices.
Weighted Graph
If a value (a weight or cost) is assigned to each
and every edge of a graph, then it is known as a weighted graph, also called a network.
An example of a real-life situation that can be modeled by a graph is the airport system.
Each airport is a vertex, and two vertices are connected by an edge if there is a nonstop flight
from the airports that are represented by the vertices. The edge could have a weight,
representing the time, distance, or cost of the flight.
Representation of Graphs
Now we can number the vertices, starting at 1. The graph shown in above figure represents 7
vertices and 12 edges.
One simple way to represent a graph is to use a two-dimensional array. This is known as
an adjacency matrix representation.
For each edge (u, v), we set a[u][v]= 1; otherwise the entry in the array is 0. If the edge has a
weight associated with it, then we can set a[u][v] equal to the weight and use either a very large
or a very small weight as a sentinel to indicate nonexistent edges.
a[u][v] = 1 if (u, v) ∈ E, and 0 otherwise.
Adjacency lists
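Adjacency lists are the standard way to represent graphs. For each vertex, we keep a list of all
adjacent vertices, so the space requirement is O(|E| + |V|). Undirected graphs can be
similarly represented; each edge (u, v) appears in two lists, so the space usage essentially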
doubles. A common requirement in graph algorithms is to find all vertices adjacent to some
given vertex v, and this can be done, in time proportional to the number of such vertices found,
by a simple scan down the appropriate adjacency list.
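An adjacency list representation can be sketched as an array of linked lists, one list per vertex. The type and function names below (struct edge_node, struct graph, graph_init, add_edge, add_undirected_edge, out_degree, is_adjacent) and the fixed NUM_VERTEX of 7 are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_VERTEX 7

struct edge_node {
    int to;                      /* the adjacent vertex */
    struct edge_node *next;
};

struct graph {
    struct edge_node *adj[NUM_VERTEX + 1];   /* vertices are numbered 1..NUM_VERTEX */
};

void graph_init(struct graph *g)
{
    for (int v = 0; v <= NUM_VERTEX; v++)
        g->adj[v] = NULL;
}

/* Add the directed edge (u, v) at the front of u's list. */
void add_edge(struct graph *g, int u, int v)
{
    struct edge_node *e = malloc(sizeof *e);
    assert(e != NULL);
    e->to = v;
    e->next = g->adj[u];
    g->adj[u] = e;
}

/* An undirected edge appears in two lists. */
void add_undirected_edge(struct graph *g, int u, int v)
{
    add_edge(g, u, v);
    add_edge(g, v, u);
}

/* Scan u's list; the time is proportional to the number of adjacent vertices. */
int out_degree(const struct graph *g, int u)
{
    int count = 0;
    for (const struct edge_node *e = g->adj[u]; e != NULL; e = e->next)
        count++;
    return count;
}

int is_adjacent(const struct graph *g, int u, int v)
{
    for (const struct edge_node *e = g->adj[u]; e != NULL; e = e->next)
        if (e->to == v)
            return 1;
    return 0;
}
```

Inserting at the front of the list keeps add_edge at constant time; the order of vertices within a list is then the reverse of insertion order.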
Topological Sort
A topological sort is an ordering of vertices in a directed acyclic graph, such that if there
is a path from vi to vj, then vj appears after vi in the ordering.
It is clear that a topological ordering is not possible if the graph has a cycle, since for
two vertices v and w on the cycle, v precedes w and w precedes v.
In the above graph v1, v2, v5, v4, v3, v7, v6 and v1, v2, v5, v4, v7, v3, v6 are both
topological orderings.
First, find any vertex with no incoming edges (Source vertex). We can then print this
vertex, and remove it, along with its edges, from the graph.
To formalize this, we define the indegree of a vertex v as the number of edges (u, v). We
compute the indegrees of all vertices in the graph. Assuming that the indegree array is
initialized and that the graph is read into an adjacency list, we can then apply the algorithm
below to generate a topological ordering.
Indegree before dequeue number:
Vertex    1    2    3    4    5    6    7
v1        0    0    0    0    0    0    0
v2        1    0    0    0    0    0    0
v3        2    1    1    1    0    0    0
v4        3    2    1    0    0    0    0
v5        1    1    0    0    0    0    0
v6        3    3    3    3    2    1    0
v7        2    2    2    1    0    0    0
Enqueue   v1   v2   v5   v4   v3   v7   v6
Dequeue   v1   v2   v5   v4   v3   v7   v6
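void topsort( graph G )
{
    unsigned int counter;
    vertex v, w;
    for( counter = 0; counter < NUM_VERTEX; counter++ )
    {
        v = find_new_vertex_of_indegree_zero( );
        if( v == NOT_A_VERTEX )
            break;              /* no such vertex: the graph has a cycle */
        top_num[v] = counter;
        for each w adjacent to v
            indegree[w]--;
    }
}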
Explanation
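void topsort( graph G )
{
    QUEUE Q;
    unsigned int counter;
    vertex v, w;
    Q = create_queue( NUM_VERTEX );
    makeempty( Q );
    counter = 0;
    for each vertex v
        if( indegree[v] == 0 )
            enqueue( v, Q );
    while( !isempty( Q ) )
    {
        v = dequeue( Q );
        top_num[v] = ++counter;   /* assign next number */
        for each w adjacent to v
            if( --indegree[w] == 0 )
                enqueue( w, Q );
    }
    if( counter != NUM_VERTEX )
        error( "Graph has a cycle" );
}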
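The queue-based routine can be tried on a concrete graph. The following is a minimal C sketch, not the listing from these notes: it assumes an adjacency-matrix representation, 0-based indices (index 0 stands for v1), and a hypothetical function name topsort_queue. The queue is a plain array, which is sufficient because each vertex is enqueued at most once.

```c
#include <assert.h>

#define NUM_VERTEX 7

/* Fills order[] with a topological ordering of the graph stored in adj[][]
   and returns how many vertices were ordered. A return value smaller than
   NUM_VERTEX means the graph contains a cycle. */
int topsort_queue(int adj[NUM_VERTEX][NUM_VERTEX], int order[NUM_VERTEX])
{
    int indegree[NUM_VERTEX] = {0};
    int queue[NUM_VERTEX];          /* each vertex is enqueued at most once */
    int head = 0, tail = 0, counter = 0;
    int v, w;

    /* indegree[w] = number of edges (v, w) */
    for (v = 0; v < NUM_VERTEX; v++)
        for (w = 0; w < NUM_VERTEX; w++)
            if (adj[v][w])
                indegree[w]++;

    /* start with every vertex that has no incoming edges */
    for (v = 0; v < NUM_VERTEX; v++)
        if (indegree[v] == 0)
            queue[tail++] = v;

    while (head < tail) {
        v = queue[head++];
        order[counter++] = v;
        /* removing v lowers the indegree of every vertex adjacent to v */
        for (w = 0; w < NUM_VERTEX; w++)
            if (adj[v][w] && --indegree[w] == 0)
                queue[tail++] = w;
    }
    assert(counter <= NUM_VERTEX);
    return counter;
}
```

If the matrix is filled with an edge set that matches the indegree table (v1 to v2, v3, v4; v2 to v4, v5; v3 to v6; v4 to v3, v6, v7; v5 to v4, v7; v7 to v6, which is an assumption about the missing figure), the ordering produced is v1, v2, v5, v4, v3, v7, v6, the same sequence as the Dequeue row.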
Graph Traversal:
Visiting each and every vertex in the graph exactly once is called graph traversal.
1. If a path exists from one node to another node, walk across the edge – exploring
the edge
2. If no path exists from a specific node to any other node, return to the
previous node where we have been before – backtracking
Starting at some vertex V, we process V and then recursively traverse all the vertices adjacent
to V. This process continues until all the vertices are processed. If a vertex cannot be reached
by a further recursive call, it is reached by backtracking to an earlier vertex. If vertex W is
first visited from V, then the edge (V, W) is a tree edge. Edges that are not included in the
tree are represented by back edges. At the end of this process, the tree edges form a tree
called the DFS tree.
void dfs( vertex v )
{
    visited[v] = TRUE;
    for each w adjacent to v
        if( !visited[w] ) dfs( w );
}
The (global) boolean array visited[ ] is initialized to FALSE. By recursively calling the
procedures only on nodes that have not been visited, we guarantee that we do not loop
indefinitely.
* An efficient way of implementing this is to begin the depth-first search at v1. If we need to
restart the depth-first search, we examine the sequence vk, vk+1, . . . for an unmarked
vertex, where vk-1 is the vertex where the last depth-first search was started.
An undirected graph
a. We start at vertex A. Then we mark A as visited and call dfs(B) recursively. dfs(B)
marks B as visited and calls dfs(C) recursively.
b. dfs(C) marks C as visited and calls dfs(D) recursively.
c. dfs(D) sees both A and B, but both these are marked, so no recursive calls are made.
dfs(D) also sees that C is adjacent but marked, so no recursive call is made there, and
dfs(D) returns back to dfs(C).
d. dfs(C) sees B adjacent, ignores it, finds a previously unseen vertex E adjacent, and thus
calls dfs(E).
e. dfs(E) marks E, ignores A and C, and returns to dfs(C).
f. dfs(C) returns to dfs(B). dfs(B) ignores both A and D and returns.
g. dfs(A) ignores both D and E and returns.
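The trace above can be reproduced with a short C sketch. It is only an illustration under stated assumptions: the graph is stored in an adjacency matrix, the vertices A to E are numbered 0 to 4, and the edge set (A-B, A-D, A-E, B-C, B-D, C-D, C-E) is inferred from the trace because the figure is not available here. The names add_edge, visit_order and visit_count are hypothetical helpers.

```c
#include <assert.h>

#define NV 5                 /* vertices A..E stored as 0..4 */

int adj[NV][NV];             /* adjacency matrix of the undirected graph */
int visited[NV];             /* global visited[] array, initially all 0 */
int visit_order[NV];         /* vertices in the order dfs() marks them */
int visit_count;

/* Records the undirected edge (u, v) in both directions. */
void add_edge(int u, int v)
{
    assert(u >= 0 && u < NV && v >= 0 && v < NV);
    adj[u][v] = adj[v][u] = 1;
}

/* Marks v, then recursively visits every unmarked vertex adjacent to v. */
void dfs(int v)
{
    int w;
    visited[v] = 1;
    visit_order[visit_count++] = v;
    for (w = 0; w < NV; w++)          /* for each w adjacent to v */
        if (adj[v][w] && !visited[w])
            dfs(w);
}
```

Calling dfs(0) on this edge set marks the vertices in the order A, B, C, D, E, matching steps a to g. Scanning neighbours in increasing index order is a choice of this sketch; a different adjacency order gives a different but equally valid depth-first order.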
Depth-first search of the graph
Tree edge
The root of the tree is A, the first vertex visited. Each edge (v, w) in the graph is present in
the tree. If, when we process (v, w), we find that w is unmarked, or if, when we process (w, v),
we find that v is unmarked, we indicate this with a tree edge.
If when we process (v, w), we find that w is already marked, and when processing (w, v), we
find that v is already marked, we draw a dashed line, which we will call a back edge, to
indicate that this "edge" is not really part of the tree.
Here, starting from some vertex v, all of its adjacent vertices are processed. After all the
adjacent vertices are processed, one of the adjacent vertices is selected and the process
continues from it. Any vertex that is still unvisited is then picked up and visited in the same way.
visited[v]= true;
visited[w] = true;
S.No  DFS                                        BFS
1     Backtracking is possible from a dead end.  Backtracking is not possible.
Order of traversal:
A → B → C → D → E → F → G → H
Biconnectivity is used in computer networks. If the nodes are computers and the edges are links,
then when any one computer goes down, network mail between the remaining computers is
unaffected, provided the network is biconnected.
Articulation points
If a graph is not biconnected, the vertices whose removal would disconnect the graph
are known as articulation points.
The removal of C would disconnect G, and the removal of D would disconnect E and F,
from the rest of the graph.
● First, starting at any vertex, we perform a depth-first search and number the
nodes as they are visited.
● For each vertex v, we call this preorder number num (v). Then, for every vertex
v in the depth-first search spanning tree, we compute the lowest-numbered
vertex, which we call low(v), that is reachable from v by taking zero or more
tree edges and then possibly one back edge (in that order).
By the definition of low, low(v) is the minimum of
1. num(v)
2. the lowest num(w) among all back edges (v, w)
3. the lowest low(w) among all tree edges (v, w)
The first condition is the option of taking no edges, the second way is to choose no tree
edges and a back edge, and the third way is to choose some tree edges and possibly a back edge.
The depth-first search tree in the above Figure shows the preorder number first, and then the
lowest-numbered vertex reachable under the rule described above.
The lowest-numbered vertex reachable by A, B, and C is vertex 1 (A), because they can all take
tree edges to D and then one back edge back to A. The low values of all the other vertices are
found in the same way.
● The root is an articulation point if and only if it has more than one child, because if it
has two children, removing the root disconnects nodes in different subtrees, and if it has
only one child, removing the root merely disconnects the root.
● Any other vertex v is an articulation point if and only if v has some child w such that
low(w) >= num(v). Notice that this condition is always satisfied at the root;
hence the need for a special test.
We examine the articulation points that the algorithm determines, namely C and D. D has a
child E, and low (E)>= num (D), since both are 4. Thus, there is only one way for E to get to
any node above D, and that is by going through D.
void assignnum( vertex v )
{
    vertex w;
    num[v] = counter++;
    visited[v] = TRUE;
    for each w adjacent to v
        if( !visited[w] )
        {
            parent[w] = v;
            assignnum( w );
        }
}
vertex w;
{
assignlow( w );
else
Testing for articulation points in one depth-first search (test for the root is omitted)
void findart( vertex v )
vertex w;
visited[v] = TRUE;
parent[w] = v;
findart( w );
else
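Since the listings above survive only in part, the following self-contained C sketch shows one way to combine the numbering, the low computation and the articulation test in a single depth-first search. It is a sketch under assumptions, not the notes' own routine: the graph is an adjacency matrix, the vertices A to G are numbered 0 to 6, the edge set (A-B, A-D, B-C, C-D, C-G, D-E, D-F, E-F) is inferred from the discussion of C, D, E, F and G above, and the root test is included through a child counter.

```c
#include <assert.h>

#define NV 7
enum { A, B, C, D, E, F, G };

int adj[NV][NV];
int visited[NV], num[NV], low[NV], parent[NV], is_art[NV];
int counter;

void add_edge(int u, int v) { adj[u][v] = adj[v][u] = 1; }

/* One depth-first search that assigns num and low and marks articulation points. */
void findart(int v)
{
    int w, children = 0;
    visited[v] = 1;
    low[v] = num[v] = counter++;                    /* Rule 1 */
    for (w = 0; w < NV; w++) {
        if (!adj[v][w])
            continue;
        if (!visited[w]) {                          /* tree edge (v, w) */
            parent[w] = v;
            children++;
            findart(w);
            if (parent[v] != -1 && low[w] >= num[v])
                is_art[v] = 1;                      /* test for a non-root vertex */
            if (low[w] < low[v])
                low[v] = low[w];                    /* Rule 3 */
        } else if (parent[v] != w) {                /* back edge (v, w) */
            if (num[w] < low[v])
                low[v] = num[w];                    /* Rule 2 */
        }
    }
    if (parent[v] == -1 && children > 1)            /* special test for the root */
        is_art[v] = 1;
}

/* Resets the bookkeeping arrays and runs the search from root. */
void find_articulation_points(int root)
{
    int v;
    assert(root >= 0 && root < NV);
    for (v = 0; v < NV; v++) {
        visited[v] = is_art[v] = 0;
        parent[v] = -1;
    }
    counter = 1;
    findart(root);
}
```

Started at A on this edge set, the sketch numbers D as 4, computes low(E) = 4, and marks exactly C and D, in agreement with the discussion above.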
Euler Circuits
We must find a path in the graph that visits every edge exactly once. If we are to solve
the "extra challenge," then we must find a cycle that visits every edge exactly once. This graph
problem was solved in 1736 by Euler and marked the beginning of graph theory. The problem
is thus commonly referred to as an Euler path or Euler tour or Euler circuit problem,
depending on the specific problem statement.
Consider the three figures as shown below. A popular puzzle is to reconstruct these
figures using a pen, drawing each line exactly once. The pen may not be lifted from the paper
while the drawing is being performed. As an extra challenge, make the pen finish at the same
point at which it started.
Three drawings
1. The first figure can be drawn only if the starting point is the lower left- or right-hand
corner, and it is not possible to finish at the starting point.
2. The second figure is easily drawn with the finishing point the same as the starting point.
3. The third figure cannot be drawn at all within the parameters of the puzzle.
We can convert this problem to a graph theory problem by assigning a vertex to each
intersection. Then the edges can be assigned in the natural manner, as in figure.
The first observation that can be made is that an Euler circuit, which must end on its starting
vertex, is possible only if the graph is connected and each vertex has an even degree (number of
edges). This is because, on the Euler circuit, a vertex is entered and then left.
If exactly two vertices have odd degree, an Euler tour, which must visit every edge but need not
return to its starting vertex, is still possible if we start at one of the odd-degree vertices and
finish at the other.
If more than two vertices have odd degree, then an Euler tour is not possible.
That is, any connected graph, all of whose vertices have even degree, must have an
Euler circuit
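The degree conditions above can be checked mechanically. The C sketch below is an illustration only: it assumes an undirected graph without self-loops stored as an adjacency matrix of edge counts, it assumes the graph is already known to be connected (connectivity is not tested), and the names classify_euler and enum euler_kind are hypothetical.

```c
#include <assert.h>

#define MAXV 8

enum euler_kind { EULER_NONE, EULER_TOUR, EULER_CIRCUIT };

/* adj[u][v] holds the number of edges between u and v.
   The graph is assumed to be connected and to have no self-loops. */
enum euler_kind classify_euler(int adj[MAXV][MAXV], int n)
{
    int u, v, odd = 0;
    assert(n >= 0 && n <= MAXV);
    for (u = 0; u < n; u++) {
        int degree = 0;
        for (v = 0; v < n; v++)
            degree += adj[u][v];
        if (degree % 2 != 0)
            odd++;                  /* count the odd-degree vertices */
    }
    if (odd == 0)
        return EULER_CIRCUIT;       /* every vertex has even degree */
    if (odd == 2)
        return EULER_TOUR;          /* start at one odd vertex, finish at the other */
    return EULER_NONE;              /* more than two odd-degree vertices */
}
```

For example, a triangle is classified as EULER_CIRCUIT, a three-vertex path as EULER_TOUR, and a star with three leaves (four odd-degree vertices) as EULER_NONE.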
The main problem is that we might visit a portion of the graph and return to the starting point
prematurely. If all the edges coming out of the start vertex have been used up, then part of the
graph is untraversed.
The easiest way to fix this is to find the first vertex on this path that has an untraversed edge,
and perform another depth-first search. This will give another circuit, which can be spliced into
the original. This is continued until all edges have been traversed.
Suppose we start at vertex 5, and traverse the circuit 5, 4, 10, 5. Then we are stuck,
and most of the graph is still untraversed. The situation is shown in the Figure.
We then continue from vertex 4, which still has unexplored edges. A depth-first search might
come up with the path 4, 1, 3, 7, 4, 11, 10, 7, 9, 3, 4. If we splice this path into the previous path
of 5, 4, 10, 5, then we get a new path of 5, 4, 1, 3, 7 ,4, 11, 10, 7, 9, 3, 4, 10, 5.
The next vertex on the path that has untraversed edges is vertex 3. A possible circuit would then
be 3, 2, 8, 9, 6, 3. When spliced in, this gives the path 5, 4, 1, 3, 2, 8, 9, 6, 3, 7, 4, 11, 10, 7, 9, 3,
4, 10, 5.
On this path, the next vertex with an untraversed edge is 9, and the algorithm finds the circuit 9,
12, 10, 9. When this is added to the current path, a circuit of 5, 4, 1, 3, 2, 8, 9, 12, 10, 9, 6, 3, 7,
4, 11, 10, 7, 9, 3, 4, 10, 5 is obtained. As all the edges are traversed, the algorithm terminates
Then the Euler Path for the above graph is 5, 4, 1, 3, 2, 8, 9, 12, 10, 9, 6, 3, 7, 4, 11, 10,
7, 9, 3, 4, 10, 5
A cut vertex is a vertex that when removed (with its boundary edges) from a graph creates more
components than previously in the graph.
A cut edge is an edge that when removed (the vertices stay in place) from a graph creates
more components than previously in the graph.
Find the cut vertices and cut edges for the following graphs
Answers
31) The cut vertex is c. There are no cut edges.
32) The cut vertices are c and d. The cut edge is (c,d)
33) The cut vertices are b,c,e and i. The cut edges are: (a,b),(b,c),(c,d),(c,e),(e,i),(i,h)
Applications of graph:
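Minimum Spanning Tree
Definition:
A minimum spanning tree of an undirected graph G is a tree formed from graph edges that
connects all the vertices of G at the lowest total cost.
The number of edges in the minimum spanning tree is |V| - 1. The minimum spanning
tree is a tree because it is acyclic, and it is spanning because it covers every vertex.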
Application:
1. Prim's Algorithm
2. Kruskal's Algorithm
Kruskal's Algorithm
A second greedy strategy is continually to select the edges in order of smallest weight
and accept an edge if it does not cause a cycle.
Procedure
● When the algorithm terminates, there is only one tree, and this is the minimum
spanning tree.
● The algorithm terminates when enough edges are accepted.
At any point in the process, two vertices belong to the same set if and only if they are connected
in the current spanning forest. Thus, each vertex is initially in its own set.
● If u and v are in the same set, the edge is rejected: since they are already
connected, adding (u, v) would form a cycle.
● Otherwise, the edge is accepted, and a union is performed on the two sets containing u
and v.
Edge      Weight  Action
(v1,v4)   1       Accepted
(v6,v7)   1       Accepted
(v1,v2)   2       Accepted
(v3,v4)   2       Accepted
(v2,v4)   3       Rejected
(v1,v3)   4       Rejected
(v4,v7)   4       Accepted
(v3,v6)   5       Rejected
(v5,v7)   6       Accepted
( ));
Edge e;
Vertex U, V;
edgesaccepted++;
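Only fragments of the Kruskal routine appear above, so here is a compact C sketch of the same idea. It is an assumption-laden illustration rather than the notes' listing: edges are sorted with qsort instead of being taken from a priority queue, the disjoint sets are kept in a simple parent array with path halving, and the names Edge, find_set and kruskal are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VERTICES 64

typedef struct { int u, v, weight; } Edge;

static int set_parent[MAX_VERTICES];

/* Returns the representative of the set containing x (with path halving). */
static int find_set(int x)
{
    while (set_parent[x] != x) {
        set_parent[x] = set_parent[set_parent[x]];
        x = set_parent[x];
    }
    return x;
}

static int compare_weight(const void *a, const void *b)
{
    const Edge *ea = a, *eb = b;
    return (ea->weight > eb->weight) - (ea->weight < eb->weight);
}

/* Sorts edges[] by weight, accepts every edge that joins two different sets,
   and returns the total weight of the accepted edges. */
int kruskal(Edge *edges, int num_edges, int num_vertices, int *edges_accepted)
{
    int i, total = 0;
    assert(num_vertices > 0 && num_vertices <= MAX_VERTICES);
    for (i = 0; i < num_vertices; i++)
        set_parent[i] = i;                  /* each vertex starts in its own set */
    qsort(edges, (size_t)num_edges, sizeof(Edge), compare_weight);
    *edges_accepted = 0;
    for (i = 0; i < num_edges && *edges_accepted < num_vertices - 1; i++) {
        int ru = find_set(edges[i].u);
        int rv = find_set(edges[i].v);
        if (ru != rv) {                     /* different sets: no cycle is formed */
            set_parent[ru] = rv;            /* union of the two sets */
            total += edges[i].weight;
            (*edges_accepted)++;
        }
    }
    return total;
}
```

Run on the nine edges listed in the table above (with v1 to v7 stored as 0 to 6), the sketch accepts six edges with a total weight of 16, which agrees with the Accepted rows: 1 + 1 + 2 + 2 + 4 + 6.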
Hashing
Hashing is a technique in which insertions, deletions, and searches are performed in constant
average time.
Hash Function
Each key is mapped into some number in the range 0 to Tablesize - 1 and placed in the
appropriate cell. The mapping is called a hash function.
In this example, john hashes to 3, phil hashes to 4, dave hashes to 6, and mary hashes to 7.
If the input keys are integers, then hash function will be key mod Tablesize. Usually,
the keys are strings; in this case, the hash function needs to be chosen carefully.
One option is to add up the ASCII values of the characters in the string.
int hash_val = 0;
while( *key != '\0' )        /* add up the character codes */
    hash_val += *key++;
return hash_val % tablesize;
Another hash function assumes that the key has at least two characters plus the NULL terminator.
It computes (key[0] + 27*key[1] + 729*key[2]) mod Tablesize. The value 27 represents the number
of letters in the English alphabet, plus the blank, and 729 is 27^2.
int hash_val = 0;
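The two string hash functions described above can be written out in full as follows. This is a minimal sketch: the names hash_sum and hash_first3 and the unsigned types are choices made here, not taken from the notes.

```c
#include <assert.h>

/* Adds up the character codes of the key and reduces modulo the table size. */
unsigned int hash_sum(const char *key, unsigned int tablesize)
{
    unsigned int hash_val = 0;
    assert(tablesize > 0);
    while (*key != '\0')
        hash_val += (unsigned char)*key++;
    return hash_val % tablesize;
}

/* Uses only the first three characters. The key must hold at least two
   characters plus the terminator, so key[2] is always readable. */
unsigned int hash_first3(const char *key, unsigned int tablesize)
{
    assert(tablesize > 0);
    return ((unsigned char)key[0] + 27u * (unsigned char)key[1]
            + 729u * (unsigned char)key[2]) % tablesize;
}
```

With ASCII codes, hash_sum("ab", 10) is (97 + 98) mod 10 = 5. The first function spreads keys poorly when the table is much larger than the typical sum of character codes, which is why the choice of hash function for strings needs care.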
When two different keys hash to the same position in the hash table, the second key would
overwrite the first. This situation is known as a collision.
Collision resolution
If, when inserting an element, it hashes to the same value as an already inserted element,
then we have a collision and need to resolve it.
The first strategy, commonly known as either open hashing, or separate chaining, is to
keep a list of all elements that hash to the same value. For convenience, our lists have headers.
hash(x) = x mod 10. (The table size is 10)
To perform an insert, we traverse down the appropriate list to check whether the
element is already in place. If the element turns out to be new, it can be inserted either at the
front of the list or at the end of the list. Here, new elements are inserted at the front of the list.
The hash table structure contains the actual size and an array of linked lists, which
are dynamically allocated when the table is initialized.
Type declaration
struct listnode
{
    elementtype element;
    position next;
};
struct hashtbl
{
    int tablesize;
    LIST *thelists;
};
HASHTABLE H;
int i;
return NULL;
if( H == NULL )
fatalerror("Out of space!!!");
fatalerror("Out of space!!!");
fatalerror("Out of space!!!");
else
H->thelists[i]->next = NULL;
return H;
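position find( elementtype key, HASHTABLE H )
{
    position p;
    LIST L;
    L = H->thelists[ hash( key, H->tablesize ) ];
    p = L->next;
    while( (p != NULL) && (p->element != key) )
        p = p->next;
    return p;
}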
{
position pos, newcell; LIST L;
fatalerror("Out of space!!!");
else
newcell->next = L->next;
newcell->element =
} } }
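Because the initialization and insert routines above are incomplete, the following self-contained C sketch shows separate chaining end to end. It makes several simplifying assumptions that differ from the notes: keys are non-negative integers, the table size is fixed at 10 with hash(x) = x mod 10, the chains have no header nodes (an empty chain is a NULL pointer), and chain_length and free_table are hypothetical helpers added for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLESIZE 10

struct listnode {
    int element;
    struct listnode *next;
};

/* One chain per cell; a NULL pointer is an empty chain (no header nodes). */
struct hashtbl {
    struct listnode *thelists[TABLESIZE];
};

unsigned int hash(int key)
{
    assert(key >= 0);               /* keys are assumed to be non-negative */
    return (unsigned int)key % TABLESIZE;
}

/* Returns the node holding key, or NULL if key is not in the table. */
struct listnode *find(int key, const struct hashtbl *H)
{
    struct listnode *p = H->thelists[hash(key)];
    while (p != NULL && p->element != key)
        p = p->next;
    return p;
}

/* Returns 1 if key was inserted, 0 if it was already present or malloc failed. */
int insert(int key, struct hashtbl *H)
{
    struct listnode *newcell;
    unsigned int h = hash(key);
    if (find(key, H) != NULL)
        return 0;
    newcell = malloc(sizeof *newcell);
    if (newcell == NULL)
        return 0;
    newcell->element = key;
    newcell->next = H->thelists[h];     /* insert at the front of the chain */
    H->thelists[h] = newcell;
    return 1;
}

/* Counts the nodes in the chain of the given cell. */
int chain_length(unsigned int cell, const struct hashtbl *H)
{
    int n = 0;
    const struct listnode *p;
    for (p = H->thelists[cell]; p != NULL; p = p->next)
        n++;
    return n;
}

/* Frees every node and leaves all chains empty. */
void free_table(struct hashtbl *H)
{
    int i;
    for (i = 0; i < TABLESIZE; i++) {
        while (H->thelists[i] != NULL) {
            struct listnode *next = H->thelists[i]->next;
            free(H->thelists[i]);
            H->thelists[i] = next;
        }
    }
}
```

A table declared as `struct hashtbl H = {{0}};` starts with ten empty chains. Inserting 1 and then 81 places both in cell 1, with 81 at the front because new elements go to the head of the chain.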
Separate chaining has the disadvantage of requiring pointers. This tends to slow the
algorithm down a bit because of the time required to allocate new cells, and also essentially
requires the implementation of a second data structure.
In a closed hashing system, if a collision occurs, alternate cells are tried until an empty
cell is found. More formally, cells h0(x), h1(x), h2(x), . . . are tried in succession, where hi(x) =
(hash(x) + F(i)) mod tablesize, with F(0) = 0. The function F is the collision resolution
strategy. Because all the data goes inside the table, a bigger table is needed for closed hashing
than for open hashing. Generally, the load factor λ should be below 0.5 for closed hashing.
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
Linear Probing
The below Figure shows the result of inserting keys {89, 18, 49, 58, 69} into a closed
table using the same hash function as before and the collision resolution strategy, F (i) = i. The
first collision occurs when 49 is inserted; it is put in the next available spot, namely 0, which is
open. 58 collides with 18, 89, and then 49 before an empty cell is found three away. The
collision for 69 is handled in a similar manner. As long as the table is big enough, a free cell
can always be found, but the time to do so can get quite large. Worse, even if the table is
relatively empty, blocks of occupied cells start forming. This effect, known as primary
clustering, means that any key that hashes into the cluster will require several attempts to
resolve the collision, and then it will add to the cluster.
Although we will not perform the calculations here, it can be shown that the expected number
of probes using linear probing is roughly 1/2 (1 + 1/(1 - λ)^2) for insertions and unsuccessful
searches and 1/2 (1 + 1/(1 - λ)) for successful searches, where λ is the load factor. A simpler
analysis that ignores clustering assumes that each probe is independent of the previous
probes. These assumptions are satisfied by a
random collision resolution strategy and are reasonable unless λ is very close to 1.
Quadratic Probing
Quadratic probing is a collision resolution method that eliminates the primary clustering
problem of linear probing. Quadratic probing is what you would expect: the collision function is
quadratic. The popular choice is F(i) = i^2. The figure below shows the resulting closed table with
this collision function on the same input used in the linear probing example.
When 49 collides with 89, the next position attempted is one cell away. This cell is
empty, so 49 is placed there. Next, 58 collides at position 8. Then the cell one away is tried, but
another collision occurs. A vacant cell is found at the next cell tried, which is 2^2 = 4 away. 58 is
thus placed in cell 2. The same thing happens for 69.
For linear probing it is a bad idea to let the hash table get nearly full, because
performance degrades. For quadratic probing, the situation is even more drastic: There is no
guarantee of finding an empty cell once the table gets more than half full, or even before the
table gets half full if the table size is not prime. This is because at most half of the table can be
used as alternate locations to resolve collisions.
struct hash_entry
{
    element_type element;
    enum kind_of_entry info;    /* legitimate, empty, or deleted */
};
struct hash_tbl
{
    unsigned int table_size;
    cell *the_cells;
};
hashtable H;
int i;
{
return NULL;
if( H == NULL )
fatal_error("Out of space!!!");
fatal_error("Out of space!!!");
for( i = 0; i < H->table_size; i++ )
    H->the_cells[i].info = empty;
return H;
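position find( element_type key, hashtable H )
{
    position i, current_pos;
    i = 0;
    current_pos = hash( key, H->table_size );
    while( (H->the_cells[current_pos].element != key) &&
           (H->the_cells[current_pos].info != empty) )
    {
        current_pos += 2*(++i) - 1;   /* next quadratic probe */
        if( current_pos >= H->table_size )
            current_pos -= H->table_size;
    }
    return current_pos;
}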
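void insert( element_type key, hashtable H )
{
    position pos = find( key, H );
    if( H->the_cells[pos].info != legitimate )
    { /* ok to insert here */
        H->the_cells[pos].info = legitimate;
        H->the_cells[pos].element = key;
} }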
Double Hashing
The last collision resolution method we will examine is double hashing. For double hashing,
one popular choice is F(i) = i * h2(x). This formula says that we apply a second hash function to x
and probe at a distance h2(x), 2 h2(x), . . ., and so on. A function such as h2(x) = R - (x mod
R), with R a prime smaller than H_SIZE, will work well. If we choose R = 7, then the figure below
shows the results of inserting the same keys as before.
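The three closed hashing strategies differ only in the function F, which the following C sketch makes explicit by passing F as a function pointer. It is an illustration under assumptions: keys are non-negative integers, the table has 10 cells with hash(x) = x mod 10 as in the examples, an empty cell is marked with -1, and the names probe_fn, make_empty and insert are chosen here.

```c
#include <assert.h>

#define TABLESIZE 10
#define EMPTY (-1)
#define R 7                      /* a prime smaller than TABLESIZE */

/* A collision resolution strategy F(i); key is used only by double hashing. */
typedef int (*probe_fn)(int i, int key);

int linear_probe(int i, int key)      { (void)key; return i; }
int quadratic_probe(int i, int key)   { (void)key; return i * i; }
int double_hash_probe(int i, int key) { return i * (R - key % R); }

void make_empty(int table[TABLESIZE])
{
    int i;
    for (i = 0; i < TABLESIZE; i++)
        table[i] = EMPTY;
}

/* Tries cells (hash(key) + F(i)) mod TABLESIZE for i = 0, 1, 2, ...
   Returns the cell used, or -1 if no free cell was found in TABLESIZE probes. */
int insert(int table[TABLESIZE], int key, probe_fn F)
{
    int i;
    assert(key >= 0);
    for (i = 0; i < TABLESIZE; i++) {
        int pos = (key % TABLESIZE + F(i, key)) % TABLESIZE;
        if (table[pos] == EMPTY || table[pos] == key) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;
}
```

Inserting 89, 18, 49, 58, 69 in that order places 49, 58, 69 in cells 0, 1, 2 with linear probing, in cells 0, 2, 3 with quadratic probing, and in cells 6, 3, 0 with double hashing (R = 7), while 89 and 18 always land in cells 9 and 8.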
Rehashing
If the table gets too full, the running time for the operations will start taking too long
and inserts might fail for closed hashing with quadratic resolution. This can happen if there are
too many deletions intermixed with insertions.
A solution, then, is to build another table that is about twice as big and scan down the
entire original hash table, computing the new hash value for each element and inserting it in
the new table.
As an example, suppose the elements 13, 15, 24, and 6 are inserted into a closed hash
table of size 7. The hash function is h(x) = x mod 7. Suppose linear probing is used to resolve
collisions.
If 23 is inserted into the table, the resulting table in below figure will be over 70 percent
full. Because the table is so full, a new table is created.
The size of this table is 17, because this is the first prime that is at least twice as large as the
old table size. The new hash function is then h(x) = x mod 17.
The old table is scanned, and elements 6, 15, 23, 24, and 13 are inserted into the new
table. The resulting table appears as below.
This entire operation is called rehashing. This is obviously a very expensive operation –
the running time is O(n).
Rehashing routines
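hashtable rehash( hashtable H )
{
    unsigned int i, old_size;
    cell *old_cells;
    old_cells = H->the_cells;
    old_size = H->table_size;
    H = initialize_table( 2*old_size );     /* get a new, empty table */
    for( i = 0; i < old_size; i++ )         /* reinsert from the old table */
        if( old_cells[i].info == legitimate )
            insert( old_cells[i].element, H );
    free( old_cells );
    return H;
}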
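The worked example (table of size 7 growing to size 17) can be replayed with the C sketch below. It is a simplified illustration, not the routine above: cells store non-negative integer keys directly with -1 meaning empty, linear probing is used as in the example, and the helpers is_prime and next_prime are assumptions introduced here to pick the new table size.

```c
#include <assert.h>
#include <stdlib.h>

#define EMPTY (-1)

struct hash_tbl {
    int table_size;
    int *the_cells;          /* EMPTY or a non-negative key */
};

struct hash_tbl *initialize_table(int table_size)
{
    int i;
    struct hash_tbl *H = malloc(sizeof *H);
    if (H == NULL)
        return NULL;
    H->the_cells = malloc(sizeof(int) * (size_t)table_size);
    if (H->the_cells == NULL) {
        free(H);
        return NULL;
    }
    H->table_size = table_size;
    for (i = 0; i < table_size; i++)
        H->the_cells[i] = EMPTY;
    return H;
}

/* Linear probing; assumes the table is never completely full. */
int insert(int key, struct hash_tbl *H)
{
    int pos;
    assert(key >= 0);
    pos = key % H->table_size;
    while (H->the_cells[pos] != EMPTY && H->the_cells[pos] != key)
        pos = (pos + 1) % H->table_size;
    H->the_cells[pos] = key;
    return pos;
}

int is_prime(int n)
{
    int d;
    if (n < 2)
        return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

int next_prime(int n)
{
    while (!is_prime(n))
        n++;
    return n;
}

/* Builds a table about twice as large, reinserts every key, frees the old table. */
struct hash_tbl *rehash(struct hash_tbl *H)
{
    int i;
    struct hash_tbl *bigger = initialize_table(next_prime(2 * H->table_size));
    if (bigger == NULL)
        return H;                        /* keep the old table on failure */
    for (i = 0; i < H->table_size; i++)
        if (H->the_cells[i] != EMPTY)
            insert(H->the_cells[i], bigger);
    free(H->the_cells);
    free(H);
    return bigger;
}
```

After inserting 13, 15, 24, 6 and 23 into a table of size 7 and calling rehash, the new table has size 17 and holds 6, 23, 24, 13 and 15 in cells 6, 7, 8, 13 and 15. Every key is rehashed once, which is the O(n) cost mentioned above.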
Extendible Hashing
If the amount of data is too large to fit in main memory, then the main consideration is the
number of disk accesses required to retrieve data. As before, we assume that at any point we have n records to store; the
value of n changes over time. Furthermore, at most m records fit in one disk block. We will use
m = 4 in this section.
If either open hashing or closed hashing is used, the major problem is that collisions
could cause several blocks to be examined during a find, even for a well-distributed hash table.
Furthermore, when the table gets too full, an extremely expensive rehashing step must
be performed, which requires O(n) disk accesses.
If the time to perform this step could be reduced, then we would have a practical scheme. This
is exactly the strategy used by extendible hashing.
Let us suppose, for the moment, that our data consists of several six-bit integers. The
root of the "tree" contains four pointers determined by the leading two bits of the data. Each
leaf has up to m = 4 elements.
It happens that in each leaf the first two bits are identical; this is indicated by the
number in parentheses.
To be more formal, D will represent the number of bits used by the root, which is
sometimes known as the directory. The number of entries in the directory is thus 2^D. dL is the
number of leading bits that all the elements of some leaf have in common. dL will depend on the
particular leaf, and dL <= D.
Suppose that we want to insert the key 100100. This would go into the third leaf, but as the
third leaf is already full, there is no room. We thus split this leaf into two leaves, which are now
determined by the first three bits. This requires increasing the directory size to 3.
If the key 000000 is now inserted, then the first leaf is split, generating two leaves with dL = 3.
Since D = 3, the only change required in the directory is the updating of the 000 and 001
pointers.
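The directory lookup in this example amounts to taking the leading D bits of the key. A small C sketch of that single step follows; it assumes six-bit keys as in the example, and the names KEY_BITS, directory_index and directory_size are hypothetical. Leaf splitting and directory doubling are not shown.

```c
#include <assert.h>

#define KEY_BITS 6           /* the example uses six-bit integers */

/* Directory entry selected by the leading D bits of a KEY_BITS-bit key. */
unsigned int directory_index(unsigned int key, unsigned int D)
{
    assert(D <= KEY_BITS);
    assert(key < (1u << KEY_BITS));
    return key >> (KEY_BITS - D);
}

/* Number of directory entries for a directory of D bits: 2^D. */
unsigned int directory_size(unsigned int D)
{
    assert(D <= KEY_BITS);
    return 1u << D;
}
```

For the key 100100 (decimal 36), D = 2 selects entry 10 (the third of four entries), and D = 3 selects entry 100 (index 4 of eight entries).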
Department of CSE
Question Bank
UNIT I – LIST
PART – A
1. What is an abstract data type? (Nov / Dec 05),(May/ Jun 06) & (Nov/ Dec 06)
2. What is list? Mention its types.
3. What is singly linked list? Draw its structure.
4. What is doubly linked list? Draw its structure.
5. What is circular linked list? Draw its structure.
6. What are the advantage of Doubly Linked list over Singly linked list (Nov/Dec05)
7. What is header node and null node?
8. Mention the methods of implementing the list.
9. What is data structure? Give examples.
10. What is stack? Mention the sequence of retrieving the data in it.
11. What is an expression? List its types.
12. How the postfix expressions are evaluated?
13. What are the applications of stack? (Nov/ Dec 2006)
14. What is queue? Mention the sequence of retrieving the data in it.( May/ Jun 06)
15. What are the applications of queue? (Nov/ Dec 2006)
16. What is priority Queue? (Nov/ Dec 2006)
17. What is circular Queue?
18. How the stack is used for checking the balancing condition for symbols?
PART – B
1. What is list? Mention the different way of implementation of list. Explain the Linked
list implementation of List operations.
2. Explain in detail about the types of linked list with neat structure.
3. What is stack? Explain the Linked List implementation of stack in detail with
routine. (Nov/ Dec 2006), (Nov/ Dec 2005) &(May/ Jun 2006)
4. What are the applications of stack? Explain the conversion of infix expression to
postfix expression in detail with example. (May/ Jun 2006)
5. Explain how the postfix expressions are evaluated in detail with example.
6. What is Queue? Explain the array implementation of queue in detail with routine. (Nov/
Dec 2006) & (May/ Jun 2006)
7. Explain the array implementation of queue in detail with routine.(Nov/ Dec 05)
8. Explain the Priority queue implementation. Write necessary algorithms.(Nov/ Dec 06)
9. Explain the cursor implementation of Linked List in detail with routine.
10. Explain the implementation of Circular Queue.
11. Explain the use of stack in function call with routine.
PART – A
PART – B
1. Explain the different tree traversals of a tree in detail with example. (Nov/ Dec
2004) & (Nov/ Dec 2005)
2. What is binary tree? Explain the implementation of expression tree using
inorder notation?
3. Explain the Insertion and Deletion Routine of Binary Tree.
4. a) What is an expression tree? Create an expression tree for the
given expression (a + b * c) + ((d * e + f) * g). (Apr / May 2005)
5. Explain the conversion of Infix to Postfix expression and evaluate the expression for
the same. (Nov/ Dec 2004)
6. Construct binary search tree for the following set of given numbers and find
the preorder, inorder and postorder traversal of a tree
1, 8, 11, 32, 23, 45, 5, 9, 11, 12, 41, 63, 44 & 17. (Nov/ Dec 2005)
PART – A
PART – B
PART A
1. Define equivalence relation.
4. Define equivalence class.
6. Define hashing.
9. What is extendible hashing?
PART B
UNIT V- GRAPHS
PART – A
PART – B
1. Define the following terms in graph. (Also Important for two marks.)
(i) Path (ii)Path length (iii)Loop (iv)Cyclic Graph (v)Acyclic Graph
(vi)Strongly connected (vii)Weakly connected (viii)Complete graph (ix) Disjoint
(x)Weighted graph (xi)Digraph / directed graph (xii)Undirected graph
(xiii)Adjacent vertices (xiv) Connected graph (xv)Degree, indegree,
outdegree (xvi)Adjacency matrix or incidence matrix (xvii) Source and sink vertex
2. Explain topological sorting in detail with algorithm and example.(May/ Jun 06)
3. Explain Dijkstra’s algorithm in detail with algorithm and example. (Apr /
May 2006) (May/ Jun 2006)
4. Explain the algorithm for finding the shortest path for unweighted graph.
(Apr / May 2006), (Apr/May 2004)
5. Explain Prim’s algorithm( Minimum Spanning Tree) in detail with algorithm
and example. (Apr / May 2006), (May/ Jun 2006)
6. Explain Kruskal’s algorithm in detail with algorithm and example.
7. Explain the application of DFS in detail with example. (Apr / May 2006)
COMPUTER SCIENCE AND ENGINEERING
CS3351 - DATA STRUCTURES
QUESTION BANK
UNIT I
2 MARKS
1. Explain the term data structure.
The data structure can be defined as the collection of elements and all the possible
operations which are required for that set of elements. Formally, a data structure can be
defined as a set of domains D, a set of functions F, and a set of axioms
A. This triple (D, F, A) denotes the data structure d.
2. What do you mean by non-linear data structure? Give example.
The non-linear data structure is the kind of data structure in which the data may be
arranged in hierarchical fashion. For example- Trees and graphs.
3. What do you mean by linear data structure? Give example.
The linear data structure is the kind of data structure in which the data is
linearly arranged. For example- stacks, queues, linked list.
4. Enlist the various operations that can be performed on data structure.
Various operations that can be performed on the data structure are
• Create
• Insertion of element
• Deletion of element
• Searching for the desired element
• Sorting the elements in the data structure
• Reversing the list of elements.
5. What is abstract data type? What are all not concerned in an ADT?
The abstract data type is a triple of D (a set of domains), F (a set of functions), and A
(a set of axioms) in which only what is to be done is mentioned but how it is to be done is not
mentioned. Thus an ADT is not concerned with implementation details.
6. List out the areas in which data structures are applied extensively.
Following are the areas in which data structures are applied extensively.
• Operating system- the data structures like priority queues are used
12. State the properties of LIST abstract data type with suitable example.
Various properties of LIST abstract data type are
(i) It is a linear data structure in which the elements are arranged adjacent to each
other.
(ii) It allows a single-variable polynomial to be stored.
(iii) If the LIST is implemented using dynamic memory, then it is called a linked list.
Examples of LIST are stacks, queues, and linked lists.
13. State the advantages of circular lists over doubly linked list.
In a circular list the next pointer of the last node points to the head node, whereas in a doubly
linked list each node has two pointers: a previous pointer and a next pointer. The
main advantage of a circular list over a doubly linked list is that with the help of a single pointer
field we can access the head node quickly. Hence some amount of memory is saved, because in a
circular list only one pointer is reserved.
14. What are the advantages of doubly linked list over singly linked list?
The doubly linked list has two pointer fields. One field is previous link field and
another is next link field. Because of these two pointer fields we can access any node
efficiently whereas in singly linked list only one pointer field is there which stores forward
pointer.
15. Why is the linked list used for polynomial arithmetic?
We can have separate coefficient and exponent fields for representing each term of
polynomial. Hence there is no limit for exponent. We can have any number as an exponent.
16. What is the advantage of linked list over arrays?
The linked list makes use of dynamic memory allocation. Hence the user can
allocate or deallocate memory as per his requirements. On the other hand, the array
makes use of static memory allocation. Hence there are chances of wastage of
memory or shortage of memory for allocation.
➢ Change: the implementation of the ADT can be changed without making changes
ADT.
20. What is static linked list? State any two applications of it.
➢ The linked list structure which can be represented using arrays is called static linked list.
UNIT II
2MARKS
1. Define Stack
A stack is an ordered list in which all insertions (push operations) and deletions (pop
operations) are made at one end, called the top. The topmost element is pointed to by top. The
top is initialized to -1 when the stack is created, that is, when the stack is empty. In a stack S
= (a1, ..., an), a1 is the bottommost element and element ai is on top of element ai-1. A stack is
also referred to as a Last In First Out (LIFO) list.
2. What are the various Operations performed on the Stack?
The various operations that are performed on the stack are
Example: ab+, where a & b are operands and ‘+’ is addition operator.
6. What do you mean by fully parenthesized expression? Give example.
A pair of parentheses has the same parenthetical level as that of the operator to which it
corresponds. Such an expression is called a fully parenthesized expression.
Ex: (a+((b*c) + (d * e)))
7. Write the postfix form for the expression -A+B-C+D?
A-B+C-D+
The condition for testing an empty queue is rear=front-1. In linked list implementation of
queue the condition for an empty queue is the header node link field is NULL.
13. Write down the function to insert an element into a queue, in which the
queue is implemented as an array. (May 10)
Q – Queue
X – element to be added to the queue Q
IsFull(Q) – Checks and returns true if Queue Q is full
Q->Size – Number of elements in the queue Q
Q->Rear – Points to the last element of the queue Q
Q->Array – array used to store queue elements

void enqueue (int X, Queue Q) {
    if(IsFull(Q))
        Error ("Full queue");
    else {
        Q->Size++;
        Q->Rear = Q->Rear+1;
        Q->Array[ Q->Rear ]=X;
    }
}
16 MARKS
1. Write an algorithm for Push and Pop operations on Stack using Linked list. (8)
and explain.
4. Explain linear linked implementation of Stack and Queue?
a. Write an ADT to implement stack of size N using an array. The elements in
the stack are to be integers. The operations to be supported are PUSH, POP
and DISPLAY. Take into account the exceptions of stack overflow and stack
underflow. (8)
b. A circular queue has a size of 5 and has 3 elements 10, 20 and 40, where F=2
and R=4. After inserting 50 and 60, what are the values of F and R? What happens
when we try to insert 30 at this stage? Delete 2 elements from the queue and
insert 70, 80 & 90. Show the sequence of steps with necessary diagrams with the values of F
& R. (8 Marks)
5. Write the algorithm for converting infix expression to postfix (polish) expression?
6. Explain priority queue ADT in detail?
7. Write a function called ‘push’ that takes two parameters: an integer variable and a stack
into which it would push this element and returns a 1 or a 0 to show success of addition or
failure.
8. What is a DeQueue? Explain its operation with example?
9. Explain the array implementation of queue ADT in detail?
10. Explain the addition and deletion operations performed on a circular queue with
necessary algorithms.(8) (Nov 09)
UNIT III
1. Define tree
A tree is a non-linear data structure that represents a hierarchical relationship among
data items. It is a collection of nodes with one distinguished node called the root and zero or
more non-empty subtrees T1, T2, ..., Tk, each of whose roots is connected by a directed edge
from the root.
5. Define sibling?
Nodes with the same parent are called siblings.
6. Define binary tree?
A binary tree is a finite set of data items that is either empty or consists of a single
item called the root and two disjoint binary trees called the left subtree and the right subtree.
The maximum degree of any node is two.
• Syntax analysis
A B-tree is a tree data structure that keeps data sorted and allows searches, insertions,
and deletions in logarithmic amortized time. Unlike self-balancing binary search trees, it is
optimized for systems that read and write large blocks of data. It is most commonly used in
databases and file systems.
Important properties of a B-tree:
• B-tree nodes have many more than two children.
• A B-tree node may contain more than just a single element.
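The two properties above can be illustrated with a node layout and a lookup routine. This is a rough sketch only: MAX_KEYS, struct BTreeNode and btree_contains are hypothetical names, and insertion, deletion and node splitting are deliberately left out.

```c
#include <stddef.h>
#include <stdbool.h>

#define MAX_KEYS 3                  /* hypothetical: up to 3 keys, 4 children */

struct BTreeNode {
    int num_keys;                               /* keys currently stored */
    int keys[MAX_KEYS];                         /* kept in ascending order */
    struct BTreeNode *children[MAX_KEYS + 1];   /* all NULL in a leaf */
};

/* Search for key, descending one level per loop iteration. */
bool btree_contains(const struct BTreeNode *node, int key) {
    while (node != NULL) {
        int i = 0;
        /* Find the first key in this node that is not smaller than key. */
        while (i < node->num_keys && key > node->keys[i])
            i++;
        if (i < node->num_keys && node->keys[i] == key)
            return true;
        /* Otherwise continue in the child between keys[i-1] and keys[i]. */
        node = node->children[i];
    }
    return false;
}
```

Each node holds several keys and one more child pointer than keys, so a single node read narrows the search much more than one step in a binary search tree would.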
UNIT IV
PART A
1. Write the definition of weighted graph?
A graph in which weights are assigned to every edge is called a weighted graph.
2. Define Graph?
A graph G consists of a nonempty set V, which is the set of nodes of the graph, a set E,
which is the set of edges of the graph, and a mapping from the set of edges E to a set of pairs of
elements of V. It can also be represented as G = (V, E).
3. Define adjacency matrix?
When a directed graph is not strongly connected but the underlying graph is connected, then the
graph is said to be weakly connected.
17. Name the different ways of representing a graph? Give examples (Nov 10)
a. Adjacency matrix
b. Adjacency list
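As an example of the two representations, the sketch below records the same undirected edges both ways. Keeping both in one struct is purely for illustration; NUM_VERTICES, struct Graph, graph_init, graph_add_edge and graph_free are assumed names, not something the notes define.

```c
#include <stdlib.h>

#define NUM_VERTICES 4              /* hypothetical graph size */

struct AdjNode {
    int vertex;                     /* neighbouring vertex */
    struct AdjNode *next;           /* next neighbour in the list */
};

struct Graph {
    int matrix[NUM_VERTICES][NUM_VERTICES];   /* adjacency matrix */
    struct AdjNode *list[NUM_VERTICES];       /* adjacency lists */
};

void graph_init(struct Graph *g) {
    for (int u = 0; u < NUM_VERTICES; u++) {
        g->list[u] = NULL;
        for (int v = 0; v < NUM_VERTICES; v++)
            g->matrix[u][v] = 0;
    }
}

/* Push vertex `to` onto the front of the list of `from`; 0 on success. */
static int list_push(struct Graph *g, int from, int to) {
    struct AdjNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->vertex = to;
    node->next = g->list[from];
    g->list[from] = node;
    return 0;
}

/* Record the undirected edge u-v in both representations. */
int graph_add_edge(struct Graph *g, int u, int v) {
    if (list_push(g, u, v) != 0 || list_push(g, v, u) != 0)
        return -1;
    g->matrix[u][v] = 1;
    g->matrix[v][u] = 1;
    return 0;
}

void graph_free(struct Graph *g) {
    for (int u = 0; u < NUM_VERTICES; u++) {
        while (g->list[u] != NULL) {
            struct AdjNode *next = g->list[u]->next;
            free(g->list[u]);
            g->list[u] = next;
        }
    }
}
```

The matrix answers "is u adjacent to v?" in constant time but always uses V x V cells, while the lists use space proportional to the number of edges.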
18. What is an undirected acyclic graph?
When every edge in an acyclic graph is undirected, it is called an undirected acyclic graph.
It is also called a forest.
24. What are the two traversal strategies used in traversing a graph?
a. Breadth first search
b. Depth first search
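The two strategies can be sketched on an adjacency matrix as follows. NUM_V, bfs and dfs are assumed names, the graph size is fixed for brevity, and each function writes the visiting order into order[] and returns how many vertices were reached.

```c
#include <stdbool.h>

#define NUM_V 5                     /* hypothetical number of vertices */

/* Breadth first search: visit vertices level by level using a queue. */
int bfs(int adj[NUM_V][NUM_V], int start, int order[NUM_V]) {
    bool visited[NUM_V] = { false };
    int queue[NUM_V];
    int head = 0, tail = 0, count = 0;

    visited[start] = true;
    queue[tail++] = start;
    while (head < tail) {
        int u = queue[head++];
        order[count++] = u;
        for (int v = 0; v < NUM_V; v++) {
            if (adj[u][v] && !visited[v]) {
                visited[v] = true;      /* mark when enqueued, not when dequeued */
                queue[tail++] = v;
            }
        }
    }
    return count;
}

/* Recursive helper: go as deep as possible before backtracking. */
static void dfs_visit(int adj[NUM_V][NUM_V], int u, bool visited[NUM_V],
                      int order[NUM_V], int *count) {
    visited[u] = true;
    order[(*count)++] = u;
    for (int v = 0; v < NUM_V; v++)
        if (adj[u][v] && !visited[v])
            dfs_visit(adj, v, visited, order, count);
}

/* Depth first search from start. */
int dfs(int adj[NUM_V][NUM_V], int start, int order[NUM_V]) {
    bool visited[NUM_V] = { false };
    int count = 0;
    dfs_visit(adj, start, visited, order, &count);
    return count;
}
```

On the graph with edges 0-1, 0-2, 1-3 and 2-4, starting at 0, this BFS visits 0 1 2 3 4 while this DFS visits 0 1 3 2 4.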
25. Articulation Points (or Cut Vertices) in a Graph
A vertex in an undirected connected graph is an articulation point (or cut vertex) if
removing it (and the edges through it) disconnects the graph. Articulation points
represent vulnerabilities in a connected network: single points whose failure would split the
network into 2 or more disconnected components. They are useful for designing reliable networks.
For a disconnected undirected graph, an articulation point is a vertex whose removal
increases the number of connected components.
Following are some example graphs with articulation points encircled with red color.
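The definition can be checked directly by removing each vertex in turn and counting connected components. This is a brute-force sketch under assumed names (NV, count_components, is_articulation_point); it is not the efficient single-DFS method usually taught for this problem.

```c
#include <stdbool.h>

#define NV 5                        /* hypothetical number of vertices */

/* Mark every vertex reachable from u without passing through `skip`. */
static void mark(int adj[NV][NV], int u, int skip, bool seen[NV]) {
    seen[u] = true;
    for (int v = 0; v < NV; v++)
        if (v != skip && adj[u][v] && !seen[v])
            mark(adj, v, skip, seen);
}

/* Count connected components, ignoring vertex `skip` (-1 ignores none). */
int count_components(int adj[NV][NV], int skip) {
    bool seen[NV] = { false };
    int components = 0;
    for (int u = 0; u < NV; u++) {
        if (u != skip && !seen[u]) {
            mark(adj, u, skip, seen);
            components++;
        }
    }
    return components;
}

/* v is an articulation point if removing it increases the component count. */
bool is_articulation_point(int adj[NV][NV], int v) {
    return count_components(adj, v) > count_components(adj, -1);
}
```

For a graph made of the triangle 0-1-2 with the path 2-3-4 attached, vertices 2 and 3 are reported as articulation points and 0, 1 and 4 are not.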
16 MARKS
1. Explain the various representation of graph with example in detail?
4. What is topological sort? Write an algorithm to perform topological sort?(8) (Nov 09)
5. (i) Write an algorithm to determine the biconnected components in the given graph.
(10) (May 10)
(ii) Determine the biconnected components in a graph. (6)
UNIT – V
2 MARKS
The bubble sort gets its name because as array elements are sorted they gradually
“bubble” to their proper positions, like bubbles rising in a glass of soda.
5. What number is always sorted to the top of the list by each pass of the
Bubble sort algorithm?
Each pass through the list places the next largest value in its proper place. In essence, each item
“bubbles” up to the location where it belongs.
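A short sketch makes the effect of each pass visible. The function names bubble_pass and bubble_sort are assumptions; the early exit when a pass makes no swap is a common refinement, not something stated in the notes.

```c
/* One pass over the first `len` elements: adjacent out-of-order pairs are
 * swapped, which carries the largest of them to index len - 1.
 * Returns 1 if any swap occurred, 0 otherwise. */
int bubble_pass(int a[], int len) {
    int swapped = 0;
    for (int i = 0; i + 1 < len; i++) {
        if (a[i] > a[i + 1]) {
            int tmp = a[i];
            a[i] = a[i + 1];
            a[i + 1] = tmp;
            swapped = 1;
        }
    }
    return swapped;
}

/* Repeat passes over a shrinking unsorted prefix; stop early when a pass
 * makes no swap, since the array is then already sorted. */
void bubble_sort(int a[], int n) {
    for (int len = n; len > 1; len--) {
        if (!bubble_pass(a, len))
            break;
    }
}
```

Starting from 3 9 1 7 2, the first pass gives 3 1 7 2 9 (9 is in place) and the second gives 1 3 2 7 9 (7 is in place).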
Ans: 3 9 46 16 28 14
9. How does insertion sort algorithm work?
In every iteration, an element is compared with the elements before it until its correct
position is found. Space is then created for it by shifting the larger elements one position
up, and the element is inserted at that position. This procedure is repeated for all the
elements in the list until the list is sorted.
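The procedure described above can be sketched in C as follows; the function name insertion_sort and the ascending order are assumptions made for the example.

```c
/* Sort a[0..n-1] in ascending order by insertion. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];             /* element to insert into a[0..i-1] */
        int j = i - 1;
        /* Shift larger elements one position up to make space. */
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;             /* insert at the position found */
    }
}
```

After the i-th iteration the first i + 1 elements are in sorted order, which is why the left part of the array is often called the sorted section.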
10. What operation does the insertion sort use to move numbers from the unsorted
section to the sorted section of the list?
The Insertion Sort uses the swap operation since it is ordering numbers within a single
list.
11. How many key comparisons and assignments does an insertion sort make in its worst
case?
The worst case performance in insertion sort occurs when the elements of the input
array are in descending order. In that case, the first pass requires one comparison, the second
pass requires two comparisons, the third pass three comparisons, ..., the kth pass requires k
comparisons, and finally the last pass requires (n-1) comparisons. Therefore, the total number
of comparisons is:
1 + 2 + 3 + ... + (n-1) = n(n-1)/2 = O(n^2)
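The worst-case sum 1 + 2 + ... + (n-1) = n(n-1)/2 can be checked by counting key comparisons while sorting. The sketch below uses an assumed function name, insertion_sort_count, and counts only comparisons between array elements, not loop-index tests.

```c
/* Insertion sort that returns the number of key comparisons performed. */
long insertion_sort_count(int a[], int n) {
    long comparisons = 0;
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0) {
            comparisons++;          /* one key comparison: a[j] against key */
            if (a[j] <= key)
                break;              /* correct position found */
            a[j + 1] = a[j];        /* shift the larger element up */
            j--;
        }
        a[j + 1] = key;
    }
    return comparisons;
}
```

For the descending input 6 5 4 3 2 1 (n = 6) the count is 6 x 5 / 2 = 15, while for an already sorted input of the same length it is only n - 1 = 5.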
12. Which sorting algorithm is best if the list is already sorted? Why?
Insertion sort, as there is no movement of data if the list is already
sorted and the complexity is of the order O(N).
13. Which sorting algorithm is easily adaptable to singly linked lists? Why?
Insertion sort is easily adaptable to a singly linked list. In this method there is an array
link of pointers, one for each of the original array elements. Thus the array can be thought of as
a linear linked list pointed to by an external pointer first, initialized to 0. To insert the kth
element, the linked list is traversed until the proper position for x[k] is found, or until the end
of the list is reached. At that point x[k] can be inserted into the list by merely adjusting the
pointers, without shifting any elements in the array, which reduces insertion time.
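A possible sketch of this list insertion follows, using array indices as the "pointers". The function name list_insertion_sort and the use of -1 as the end-of-list marker are assumptions; unlike the description above, the sketch starts from an empty list (first = -1) and inserts x[0] like any other element, which gives the same result.

```c
/* Build link[] so that first -> link[first] -> ... visits x[] in ascending
 * order. x[] itself is never modified. Returns the index of the smallest
 * element, or -1 when n == 0. */
int list_insertion_sort(const int x[], int link[], int n) {
    int first = -1;                 /* empty list */
    for (int k = 0; k < n; k++) {
        int prev = -1;
        int cur = first;
        /* Traverse until the proper position for x[k] or the end of the list. */
        while (cur != -1 && x[cur] <= x[k]) {
            prev = cur;
            cur = link[cur];
        }
        /* Insert k between prev and cur by adjusting links only. */
        link[k] = cur;
        if (prev == -1)
            first = k;
        else
            link[prev] = k;
    }
    return first;
}
```

Following the links from the returned index lists the elements in sorted order, and no array element is ever moved.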
14. Why is Shell sort known as a diminishing increment sort?
The distance between compared elements decreases as the sorting
algorithm runs, until the last phase, in which adjacent elements are compared. In each step, the
sortedness of the sequence is increased, until in the last step it is completely sorted.
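The diminishing increments can be seen in a short sketch. The increment sequence n/2, n/4, ..., 1 used here is an assumption for illustration (other sequences are common), as is the function name shell_sort.

```c
/* Shell sort with increments n/2, n/4, ..., 1. */
void shell_sort(int a[], int n) {
    for (int gap = n / 2; gap > 0; gap /= 2) {
        /* Insertion sort on elements that are `gap` positions apart. */
        for (int i = gap; i < n; i++) {
            int tmp = a[i];
            int j = i;
            while (j >= gap && a[j - gap] > tmp) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = tmp;
        }
    }
}
```

The final round, with gap equal to 1, is an ordinary insertion sort over adjacent elements, but by then the earlier rounds have left few elements far from their final positions.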
15. Which of the following sorting methods would be especially suitable to sort a list L
consisting of a sorted list followed by a few “random” elements?
Insertion sort is suitable to sort a list L consisting of a sorted list followed by a few
“random” elements, since the sorted part needs almost no data movement and only the few
trailing elements have to be shifted into place.
16. What is the output of quick sort after the 3rd iteration given the following
sequence?
24 56 47 35 10 90 82 31
24. Why do we need a Hash function as a data structure as compared to any other
data structure? (May 10)
Hashing is a technique used for performing insertions, deletions, and finds in
constant average time.
25. What are the important factors to be considered in designing the hash function?
(Nov 10)
• To avoid a lot of collisions, the table size should be prime.
• For string data, if the keys are very long, the hash function will take a long time to compute.
26. What do you mean by hash table?
The hash table data structure is merely an array of some fixed size, containing the keys.
A key is a string with an associated value. Each key is mapped into some number in the range 0
to tablesize-1 and placed in the appropriate cell.
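A minimal sketch of such a table for string keys is given below. The names TABLE_SIZE, hash_key, table_insert and table_contains are assumptions, the multiplier 31 is just one common choice, and linear probing is used here to resolve collisions although the notes do not say which strategy is intended. Only keys are stored; the table keeps the caller's pointers and does not copy the strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 11               /* hypothetical prime table size */

/* Map a string key to a cell index in the range 0 .. TABLE_SIZE - 1. */
unsigned hash_key(const char *key) {
    unsigned h = 0;
    while (*key != '\0')
        h = h * 31u + (unsigned char)*key++;
    return h % TABLE_SIZE;
}

struct HashTable {
    const char *cells[TABLE_SIZE];  /* NULL marks an empty cell */
};

void table_init(struct HashTable *t) {
    for (int i = 0; i < TABLE_SIZE; i++)
        t->cells[i] = NULL;
}

/* Insert key using linear probing; returns false only if the table is full. */
bool table_insert(struct HashTable *t, const char *key) {
    unsigned start = hash_key(key);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        unsigned pos = (start + i) % TABLE_SIZE;
        if (t->cells[pos] == NULL) {
            t->cells[pos] = key;
            return true;
        }
        if (strcmp(t->cells[pos], key) == 0)
            return true;            /* key already present */
    }
    return false;
}

/* Look up key by probing from its home cell until an empty cell is met. */
bool table_contains(const struct HashTable *t, const char *key) {
    unsigned start = hash_key(key);
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        unsigned pos = (start + i) % TABLE_SIZE;
        if (t->cells[pos] == NULL)
            return false;
        if (strcmp(t->cells[pos], key) == 0)
            return true;
    }
    return false;
}
```

When few keys collide, both operations touch only a cell or two, which is where the constant average time of hashing comes from.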