Data Structures
CS3301 - DATA STRUCTURES
(REGULATION R2021 – III SEMESTER)
UNIT I LISTS 9
Abstract Data Types (ADTs) – List ADT – Array-based implementation – Linked list
implementation – Singly linked lists – Circularly linked lists – Doubly-linked lists –
Applications of lists – Polynomial ADT – Radix Sort – Multi lists.
UNIT III TREES 9
Tree ADT – Tree Traversals - Binary Tree ADT – Expression trees – Binary Search Tree
ADT – AVL Trees – Priority Queue (Heaps) – Binary Heap.
UNIT V SEARCHING, SORTING AND HASHING TECHNIQUES 9
Searching – Linear Search – Binary Search. Sorting – Bubble sort – Selection sort – Insertion
sort – Shell sort – Merge Sort – Hashing – Hash Functions – Separate Chaining – Open
Addressing – Rehashing – Extendible Hashing.
TOTAL: 45 PERIODS
OUTCOMES:
At the end of the course, the student should be able to:
Define linear and non-linear data structures.
Implement linear and non-linear data structure operations.
Use appropriate linear/non-linear data structure operations for solving a given problem.
Apply appropriate graph algorithms for graph applications.
Analyze the various searching and sorting algorithms.
TEXT BOOKS:
1. Mark Allen Weiss, “Data Structures and Algorithm Analysis in C”, 2nd Edition, Pearson
Education, 1997.
2. Kamthane, “Introduction to Data Structures in C”, 1st Edition, Pearson Education, 2007.
REFERENCES:
1. Langsam, Augenstein and Tanenbaum, “Data Structures Using C and C++”, 2nd Edition,
Pearson Education, 2015.
2. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, “Introduction to
Algorithms”, Second Edition, McGraw Hill, 2002.
3. Alfred V. Aho, Jeffrey D. Ullman, John E. Hopcroft, “Data Structures and Algorithms”,
1st Edition, Pearson, 2002.
4. Kruse, “Data Structures and Program Design in C”, 2nd Edition, Pearson Education, 2006.
UNIT I LISTS 9
Abstract Data Types (ADTs) – List ADT – Array-based implementation – Linked list
implementation – Singly linked lists – Circularly linked lists – Doubly-linked lists –
Applications of lists – Polynomial ADT – Radix Sort – Multilists.
Other limitations are:
Printing the list and find are carried out in linear time, which is as good as
can be expected, and the find_kth operation takes constant time.
Insertion and deletion are expensive. Inserting at position 0 requires shifting
the entire array down one spot to make room, and deleting the first element
requires shifting every remaining element up one, so both take O(n) time in
the worst case. In addition, the list size must be known in advance.
Linked Lists Implementation
In order to avoid the linear cost of insertion and deletion, we need to ensure that
the list is not stored contiguously, since otherwise entire parts of the list will need to
be moved.
Definition:
The linked list consists of a series of structures, which are not necessarily
adjacent in memory. Each structure contains the element and a pointer to a
structure containing its successor. We call this the next pointer. The last cell's
next pointer is always NULL.
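The definition above can be written down as a C declaration. The following is a minimal sketch, assuming the element type is int (the notes never fix it); the helper list_length and the typedef name node_ptr are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

typedef int element_type;        /* assumption: the notes do not fix this type */

struct node {
    element_type element;        /* the stored element */
    struct node *next;           /* successor cell; NULL in the last cell */
};

typedef struct node *node_ptr;
typedef node_ptr LIST;           /* a list is a pointer to its first cell */
typedef node_ptr position;       /* a position is a pointer to one cell */

/* Hypothetical helper: count the cells by following next pointers to NULL. */
int list_length(LIST L)
{
    int n = 0;
    for (position p = L; p != NULL; p = p->next)
        n++;
    return n;
}
```

Because the cells need not be adjacent in memory, the only way to reach the k-th cell is to follow k next pointers, which is why traversal is linear in the list length.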
Structure of linked list
Deletion from a linked list
Insertion into a linked list
The delete command can be executed in one pointer change. The above
diagram shows the result of deleting the third element in the original list.
The insert command requires obtaining a new cell from the system
by using a malloc call and then changing two pointers.
Programming Details
First, it is difficult to insert at the front of the list from the definitions given.
Second, deleting from the front of the list is a special case, because it changes
the start of the list.
A third problem concerns deletion in general. Although the pointer moves
above are simple, the deletion algorithm requires us to keep track of the cell
before the one that we want to delete.
In order to solve all three problems, we keep a sentinel node, which is called a
header or dummy node. (The header node's next pointer holds the address of the
first node in the linked list.)
Linked list with a header
Function to test whether a linked list is empty
int is_empty( LIST L )
{
    return( L->next == NULL );
}
Function to test whether current position is the last in a linked list
int is_last( position p, LIST L )
{
    return( p->next == NULL );
}
Function to find the element in the list
/* Return position of x in L; NULL if not found */
position find( element_type x, LIST L )
{
    position p;
    p = L->next;
    while( (p != NULL) && (p->element != x) )
        p = p->next;
    return p;
}
Function to delete an element in the list
This routine will delete some element x in list L. We need to decide what to do
if x occurs more than once or not at all. Our routine deletes the first occurrence of x
and does nothing if x is not in the list. First we find p, which is the cell prior to the one
containing x, via a call to find_previous.
/* Delete from a list. Cell pointed to by p->next is wiped out. */
/* Assume that the position is legal. Assume use of a header node. */
void delete( element_type x, LIST L )
{
    position p, tmp_cell;
    p = find_previous( x, L );
    if( p->next != NULL )            /* Implicit assumption of header use */
    {                                /* x is found: delete it */
        tmp_cell = p->next;
        p->next = tmp_cell->next;    /* bypass the cell to be deleted */
        free( tmp_cell );
    }
}
Function to find previous position of an element in the list
The find_previous routine is similar to find.
/* Uses a header. If element is not found, then next field of returned value is NULL */
position find_previous( element_type x, LIST L )
{
    position p;
    p = L;
    while( (p->next != NULL) && (p->next->element != x) )
        p = p->next;
    return p;
}
Function to insert an element in the list
The insertion routine inserts an element after the position implied by p. This choice
is arbitrary: it is quite possible to insert the new element into position p (which means
before the element currently in position p), but doing so requires knowing the cell
before position p.
/* Insert (after legal position p). Header implementation assumed. */
void insert( element_type x, LIST L, position p )
{
    position tmp_cell;
    tmp_cell = (position) malloc( sizeof(struct node) );
    if( tmp_cell == NULL )
        fatal_error( "Out of space!!!" );
    else
    {
        tmp_cell->element = x;
        tmp_cell->next = p->next;
        p->next = tmp_cell;
    }
}
Function to delete the list
/* Incorrect way to delete a list */
void delete_list( LIST L )
{
    position p;
    p = L->next;        /* header assumed */
    L->next = NULL;
    while( p != NULL )
    {
        free( p );
        p = p->next;    /* error: p is read after it has been freed */
    }
}
Function to delete the list
/* Correct way to delete a list */
void delete_list( LIST L )
{
    position p, tmp;
    p = L->next;        /* header assumed */
    L->next = NULL;
    while( p != NULL )
    {
        tmp = p->next;  /* save the successor before freeing p */
        free( p );
        p = tmp;
    }
}
Doubly Linked Lists
A linked list is called doubly linked when each node has two pointers, namely a forward
and a backward pointer. This makes it convenient to traverse lists both forward and
backward. Each node carries an extra field containing a pointer to the previous cell.
The cost of this is an extra link, which adds to the space requirement and also doubles
the cost of insertions and deletions because there are more pointers to fix.
Node
A doubly linked list
Structure declaration
struct node
{
    int Element;
    struct node *FLINK;
    struct node *BLINK;
};
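The pointer fixes that insertion and deletion require can be sketched as follows. This is only an illustration under stated assumptions: the list has a header node, it is not circular (the last FLINK is NULL), and the function names dll_insert_after and dll_delete are hypothetical.

```c
#include <stdlib.h>

struct node {
    int Element;
    struct node *FLINK;   /* forward link: next cell */
    struct node *BLINK;   /* backward link: previous cell */
};
typedef struct node *Position;

/* Insert X immediately after cell P. Returns the new cell, or NULL if
   malloc fails. Four links are set instead of the two of a singly list. */
Position dll_insert_after(int X, Position P)
{
    Position n = malloc(sizeof(struct node));
    if (n == NULL)
        return NULL;
    n->Element = X;
    n->FLINK = P->FLINK;
    n->BLINK = P;
    if (P->FLINK != NULL)
        P->FLINK->BLINK = n;
    P->FLINK = n;
    return n;
}

/* Unlink and free cell P. P must not be the header, so P->BLINK exists.
   No find_previous call is needed because P already knows its predecessor. */
void dll_delete(Position P)
{
    P->BLINK->FLINK = P->FLINK;
    if (P->FLINK != NULL)
        P->FLINK->BLINK = P->BLINK;
    free(P);
}
```

The backward link is what removes the need to search for the previous cell before a deletion, at the price of the extra pointer updates.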
Insertion
Insert(15, L, P)
Deletion:
Circularly Linked Lists
A linked list is called circular when the pointer of its last cell points back to the first
cell, so that the list forms a circle. It can be singly circular or doubly circular,
with or without a header.
Singly circular linked list:
Structure declaration:
struct node
{
    int Element;
    struct node *Next;
};
Insert at beginning:
void Insert_beg(int X, List L)
Insert in middle:
void insert_mid(int X, List L, Position P)
Insert at last
Deletion at first node:
Deletion at middle
Deletion at last:
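Only the routine headings survive above, so here is a minimal sketch of how the singly circular operations could look. It assumes a header node whose Next points to itself when the list is empty; the names cl_create, cl_insert_after, cl_insert_last and cl_delete are hypothetical and are not the routines of the original notes.

```c
#include <stdlib.h>

struct node {
    int Element;
    struct node *Next;
};
typedef struct node *List;
typedef struct node *Position;

/* Create an empty circular list: the header points to itself. */
List cl_create(void)
{
    List L = malloc(sizeof(struct node));
    if (L != NULL)
        L->Next = L;
    return L;
}

/* Insert X right after cell P; passing the header inserts at the beginning.
   Returns 1 on success, 0 if malloc fails. */
int cl_insert_after(int X, Position P)
{
    Position n = malloc(sizeof(struct node));
    if (n == NULL)
        return 0;
    n->Element = X;
    n->Next = P->Next;
    P->Next = n;
    return 1;
}

/* Insert X at the end: the last cell is the one whose Next is the header. */
int cl_insert_last(int X, List L)
{
    Position p = L;
    while (p->Next != L)
        p = p->Next;
    return cl_insert_after(X, p);
}

/* Delete the first cell containing X. Returns 1 if a cell was removed. */
int cl_delete(int X, List L)
{
    Position p = L;
    while (p->Next != L && p->Next->Element != X)
        p = p->Next;
    if (p->Next == L)
        return 0;               /* wrapped around: X is not in the list */
    Position tmp = p->Next;
    p->Next = tmp->Next;
    free(tmp);
    return 1;
}
```

The end-of-list test changes from comparing against NULL to comparing against the header, which is the main difference from the non-circular routines.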
Doubly Circular Linked List
A doubly circular linked list is a doubly linked list in which the forward link of the
last node points to the first node and the backward link of the first node points to the
last node of the list.
Structure Declaration:
struct node
{
    int Element;
    struct node *FLINK;
    struct node *BLINK;
};
Insert at beginning:
Insert at Middle:
Insert at Last:
Deletion
Deleting First node
void dele_first(List L)
Deletion at middle:
void dele_mid(int X, List L)
{ Position P, Temp;
P = FindPrevious(X);
Deletion at Last node:
Applications of linked lists
Three applications that use linked lists are:
1. The Polynomial ADT
2. Radix sort
3. Multilists
1) Polynomial ADT:
To overcome the disadvantage of the array implementation, an alternative is to
use a singly linked list.
Each term in the polynomial is contained in one cell, and the cells are sorted in
decreasing order of exponents.
Example:
P1: 4X^10 + 5X^5 + 3
P2: 10X^6 - 5X^2 + 2X
Structure declarations for linked list implementation of the polynomial ADT:
struct link
{
    int coeff;
    int pow;
    struct link *next;
};
struct link *poly1 = NULL, *poly2 = NULL, *poly = NULL;
Procedure to add two polynomials
/* Assumption: each list ends with one unused trailing cell whose next is NULL, */
/* so a cell holds a real term exactly when its next pointer is not NULL.       */
void polyadd( struct link *poly1, struct link *poly2, struct link *poly )
{
    while( poly1->next != NULL && poly2->next != NULL )
    {
        if( poly1->pow > poly2->pow )
        {
            poly->pow = poly1->pow;
            poly->coeff = poly1->coeff;
            poly1 = poly1->next;
        }
        else if( poly1->pow < poly2->pow )
        {
            poly->pow = poly2->pow;
            poly->coeff = poly2->coeff;
            poly2 = poly2->next;
        }
        else
        {
            poly->pow = poly1->pow;
            poly->coeff = poly1->coeff + poly2->coeff;
            poly1 = poly1->next;
            poly2 = poly2->next;
        }
        poly->next = (struct link *) malloc( sizeof(struct link) );
        poly = poly->next;
        poly->next = NULL;
    }
    /* Copy every term left in whichever polynomial is not yet exhausted. */
    while( poly1->next != NULL )
    {
        poly->coeff = poly1->coeff;
        poly->pow = poly1->pow;
        poly1 = poly1->next;
        poly->next = (struct link *) malloc( sizeof(struct link) );
        poly = poly->next;
        poly->next = NULL;
    }
    while( poly2->next != NULL )
    {
        poly->coeff = poly2->coeff;
        poly->pow = poly2->pow;
        poly2 = poly2->next;
        poly->next = (struct link *) malloc( sizeof(struct link) );
        poly = poly->next;
        poly->next = NULL;
    }
}
Finally we get the polynomial C as
11 14 -3 10 2 8
2 8 10 6
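SUBTRACTION OF TWO POLYNOMIALS
void sub()
{
    poly *ptr1, *ptr2, *newnode;
    ptr1 = list1;
    ptr2 = list2;
    while( ptr1 != NULL && ptr2 != NULL )
    {
        newnode = malloc( sizeof(struct poly) );
        newnode->next = NULL;
        if( ptr1->power == ptr2->power )
        {
            newnode->coeff = ptr1->coeff - ptr2->coeff;
            newnode->power = ptr1->power;
            ptr1 = ptr1->next;
            ptr2 = ptr2->next;
        }
        else if( ptr1->power > ptr2->power )
        {
            newnode->coeff = ptr1->coeff;
            newnode->power = ptr1->power;
            ptr1 = ptr1->next;
        }
        else
        {
            newnode->coeff = -(ptr2->coeff);   /* term appears only in list2 */
            newnode->power = ptr2->power;
            ptr2 = ptr2->next;
        }
        list3 = create( list3, newnode );
    }
    while( ptr1 != NULL )   /* leftover terms of list1 are copied unchanged */
    {
        newnode = malloc( sizeof(struct poly) );
        newnode->coeff = ptr1->coeff;
        newnode->power = ptr1->power;
        newnode->next = NULL;
        list3 = create( list3, newnode );
        ptr1 = ptr1->next;
    }
    while( ptr2 != NULL )   /* leftover terms of list2 are negated */
    {
        newnode = malloc( sizeof(struct poly) );
        newnode->coeff = -(ptr2->coeff);
        newnode->power = ptr2->power;
        newnode->next = NULL;
        list3 = create( list3, newnode );
        ptr2 = ptr2->next;
    }
}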
POLYNOMIAL DIFFERENTIATION
void diff()
{
    poly *ptr1, *newnode;
    ptr1 = list1;
    while( ptr1 != NULL )
    {
        if( ptr1->power != 0 )   /* the derivative of a constant term is 0, so skip it */
        {
            newnode = malloc( sizeof(struct poly) );
            newnode->coeff = ptr1->coeff * ptr1->power;
            newnode->power = ptr1->power - 1;
            newnode->next = NULL;
            list3 = create( list3, newnode );
        }
        ptr1 = ptr1->next;
    }
}
Radix Sort
A second example where linked lists are used is called radix sort. Radix sort is
also known as card sort, because it was used, until the advent of modern computers,
to sort old-style punch cards.
If we have n integers in the range 1 to m (or 0 to m - 1), we can use this
information to obtain a fast sort known as bucket sort. We keep an array called count,
of size m, which is initialized to zero. Thus, count has m cells (or buckets), which are
initially empty.
When a_i is read, increment (by one) count[a_i]. After all the input is read, scan
the count array, printing out a representation of the sorted list. This algorithm takes
O(m + n) time; if m = Θ(n), then bucket sort takes O(n) time.
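The bucket sort just described can be sketched in a few lines of C. This is an illustration only, assuming every input value lies in the range 0 to m - 1 and m is at least 1; the function name bucket_sort and its return convention are hypothetical.

```c
#include <stdlib.h>

/* Sort a[0..n-1] in place, assuming 0 <= a[i] < m and m >= 1.
   Returns 0 on success, -1 if the count array cannot be allocated. */
int bucket_sort(int a[], int n, int m)
{
    int *count = calloc((size_t)m, sizeof(int));   /* m buckets, all zero */
    if (count == NULL)
        return -1;

    for (int i = 0; i < n; i++)        /* read the input: O(n) */
        count[a[i]]++;

    int k = 0;
    for (int v = 0; v < m; v++)        /* scan the buckets: O(m + n) */
        while (count[v]-- > 0)
            a[k++] = v;

    free(count);
    return 0;
}
```

The two loops make the O(m + n) bound visible: one pass over the n inputs, then one pass over the m buckets that writes n values in total.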
The following example shows the action of radix sort on 10 numbers. The input is 64,
8, 216, 512, 27, 729, 0, 1, 343, and 125. The first step (Pass 1) bucket sorts by the least
significant digit. The buckets are as shown in the figure below, so the list, sorted by least
significant digit, is 0, 1, 512, 343, 64, 125, 216, 27, 8, 729. These are now sorted by
the next least significant digit (the tens digit here).
Pass 2 gives output 0, 1, 8, 512, 216, 125, 27, 729, 343, 64. This list is now sorted with
respect to the two least significant digits. The final pass, shown in Figure 3.26, bucket-
sorts by most significant digit.
The final list is 0, 1, 8, 27, 64, 125, 216, 343, 512, and 729.
The running time is O(p(n + b)) where p is the number of passes, n is the number of
elements to sort, and b is the number of buckets. In our case, b = n.
Buckets after first step of radix sort (bucket: contents in arrival order)
0: 0
1: 1
2: 512
3: 343
4: 64
5: 125
6: 216
7: 27
8: 8
9: 729
Buckets after the second pass of radix sort
0: 0, 1, 8
1: 512, 216
2: 125, 27, 729
3: (empty)
4: 343
5: (empty)
6: 64
7, 8, 9: (empty)
Buckets after the last pass of radix sort
0: 0, 1, 8, 27, 64
1: 125
2: 216
3: 343
4: (empty)
5: 512
6: (empty)
7: 729
8, 9: (empty)
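The passes traced above can be reproduced with a short program. The sketch below is not the linked-list version the notes have in mind: to stay short it replaces the bucket lists with a stable counting pass per digit, which yields the same order after every pass. The name radix_sort and the parameter passes (the number of decimal digits to process) are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define BASE 10   /* number of buckets b: one per decimal digit */

/* LSD radix sort of n non-negative ints that have at most `passes`
   decimal digits. Returns 0 on success, -1 if allocation fails. */
int radix_sort(int a[], int n, int passes)
{
    if (n <= 0)
        return 0;
    int *out = malloc((size_t)n * sizeof(int));
    if (out == NULL)
        return -1;

    int divisor = 1;
    for (int p = 0; p < passes; p++, divisor *= BASE) {
        int count[BASE + 1] = {0};

        for (int i = 0; i < n; i++)              /* size of each bucket */
            count[(a[i] / divisor) % BASE + 1]++;
        for (int d = 0; d < BASE; d++)           /* count[d] = start of bucket d */
            count[d + 1] += count[d];
        for (int i = 0; i < n; i++)              /* stable distribution */
            out[count[(a[i] / divisor) % BASE]++] = a[i];

        memcpy(a, out, (size_t)n * sizeof(int));
    }
    free(out);
    return 0;
}
```

Calling radix_sort on the ten example numbers with passes = 3 performs the three passes shown in the bucket listings. Stability of each pass is essential: elements that share a digit must keep the order established by the previous pass.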
Multilists
A university with 40,000 students and 2,500 courses needs to be able to
generate two types of reports. The first report lists the class registration for each class,
and the second report lists, by student, the classes that each student is registered for.
If we use a two-dimensional array, such an array would have 100 million entries. The
average student registers for about three courses, so only 120,000 of these entries, or
roughly 0.1 percent, would actually have meaningful data.
To avoid this waste of memory, linked lists can be used. We use two kinds of list:
one list per class contains the students in that class, and one list per student contains
the classes the student is registered for.
All lists use a header and are circular. To list all of the students in class C3, we
start at C3 and traverse its list. The first cell belongs to student S1.
Multilist implementation for registration problem
Linked List Implementation of Multilists:
Multilists can be used to represent the above scenario.
o One list to represent each class, containing the students in the class.
o One list to represent each student, containing the classes the student is
registered for.
All lists use a header and are circular.
To list all the students in class C3:
o Start the traversal at C3 and traverse its list (by going right).
o The first cell belongs to student S1.
o The next cell belongs to student S3. By continuing this, it is found that
student S4 and student S5 also belong to the class C3.
In a similar manner, for any student, all of the classes in which the student
is registered can be determined.
Advantage of Using Linked List:
o Saves memory space.
Disadvantage of Using Linked List:
o Saves memory space only at the expense of time.
UNIT II STACKS AND QUEUES 9
Stack ADT – Operations – Applications – Balancing Symbols – Evaluating arithmetic
expressions – Infix to Postfix conversion – Function Calls – Queue ADT – Operations –
Circular Queue – DeQueue – Applications of Queues.
The Stack ADT
Stack Model
A stack is a list with the restriction that inserts and deletes can be performed in
only one position, namely the end of the list called the top. Stacks are sometimes
known as LIFO (last in, first out) lists.
The fundamental operations on a stack are:
1. Push, which is equivalent to an insert.
2. Pop, which deletes the most recently inserted element.
3. Top, which returns the topmost element of the stack without removing it.
Error conditions
A Push onto a full stack, and a Pop or Top on an empty stack, are generally considered
errors in the stack ADT.
Stack model: input to a stack is by Push, output is by Pop
The model depicted in the above figure signifies that pushes are input
operations and pops and tops are output.
Stack model: only the top element is accessible
Implementation of Stacks
Since a stack is a list, any list implementation will do. Two popular implementations are:
1. Array implementation
2. Linked list implementation
Linked List Implementation of Stacks
The first implementation of a stack uses a singly linked list. We perform a push
by inserting at the front of the list. We perform a pop by deleting the element at the
front of the list. A top operation merely examines the element at the front of the list,
returning its value. Sometimes the pop and top operations are combined into one.
Creating an empty stack is also simple. We merely create a header node;
make_null sets the next pointer to NULL.
The push is implemented as an insertion into the front of a linked list, where the
front of the list serves as the top of the stack.
The top is performed by examining the element in the first position of the list. The
pop will delete from the front of the list.
It should be clear that all the operations take constant time, because none of the
routines even refers to the size of the stack (except to test for emptiness), much less
contains a loop that depends on this size.
Drawbacks and solution
The drawback of this implementation is that the calls to malloc and free are expensive,
especially in comparison to the pointer manipulation routines. Some of this can be
avoided by using a second stack, which is initially empty. When a cell is to be disposed
from the first stack, it is merely placed on the second stack. Then, when new cells are
needed for the first stack, the second stack is checked first.
Type declaration for linked list implementation of the stack ADT
struct Node;
typedef struct Node *PtrToNode;
typedef PtrToNode Stack;
int IsEmpty( Stack S );
Stack CreateStack( void );
void DisposeStack( Stack S );
void MakeEmpty( Stack S );
void Push( ElementType X, Stack S );
ElementType Top( Stack S );
void Pop( Stack S );
struct Node
{
    ElementType Element;
    PtrToNode Next;
};
Routine to test whether a stack is empty - linked list implementation
This routine checks whether the stack is empty. It returns a nonzero value if the
stack is empty and 0 otherwise.
int IsEmpty( Stack S )
{
    return( S->Next == NULL );
}
Routine to create an empty stack - linked list implementation
This routine creates a stack (a header node) and returns a pointer to it. If no memory
is available it reports a fatal error.
Stack CreateStack( void )
{
    Stack S;
    S = malloc( sizeof(struct Node) );
    if( S == NULL )
        fatal_error( "Out of space!!!" );
    S->Next = NULL;    /* the header of an empty stack has no successor */
    return S;
}
Routine to make the stack empty - linked list implementation
This routine removes every element from the stack, leaving only the header.
void MakeEmpty( Stack S )
{
    if( S == NULL )
        error( "Must use CreateStack first" );
    else
        while( !IsEmpty( S ) )
            Pop( S );
}
Routine to push onto a stack - linked list implementation
This routine inserts the new element onto the top of the stack.
void Push( ElementType X, Stack S )
{
    PtrToNode TmpCell;
    TmpCell = malloc( sizeof(struct Node) );
    if( TmpCell == NULL )
        fatal_error( "Out of space!!!" );
    else
    {
        TmpCell->Element = X;
        TmpCell->Next = S->Next;
        S->Next = TmpCell;
    }
}
Routine to return top element in a stack - linked list implementation
This routine returns the topmost element of the stack without removing it.
ElementType Top( Stack S )
{
    if( !IsEmpty( S ) )
        return S->Next->Element;
    error( "Empty stack" );
    return 0;    /* return value used to avoid a compiler warning */
}
Routine to pop from a stack - linked list implementation
This routine deletes the topmost element from the stack.
void Pop( Stack S )
{
    PtrToNode FirstCell;
    if( IsEmpty( S ) )
        error( "Empty stack" );
    else
    {
        FirstCell = S->Next;
        S->Next = S->Next->Next;
        free( FirstCell );
    }
}
Array implementation of Stacks
An alternative implementation avoids pointers by using an array. One problem
here is that we need to declare an array size ahead of time. Generally this is not
a problem if the actual number of elements in the stack is known in advance: it is
usually easy to declare the array to be large enough without wasting too much space.
Associated with each stack is the top of stack, tos, which is -1 for an empty
stack. To push some element x onto the stack, we increment tos and then set
STACK[tos] = x, where STACK is the array representing the actual stack.
To pop, we set the return value to STACK[tos] and then decrement tos.
Notice that these operations are performed in not only constant time, but
very fast constant time.
Error checking:
One issue that affects the efficiency of implementing stacks is error testing. The linked
list implementation carefully checked for errors.
In the array implementation, a pop on an empty stack or a push on a full stack will
overflow the array bounds and cause a crash, so the routines must ensure that they do
not attempt to pop an empty stack or push onto a full stack.
A Stack is defined as a pointer to a structure. The structure
contains the TopOfStack and Capacity fields.
Once the maximum size is known, the stack array can be dynamically allocated.
Stack Declaration
struct StackRecord;
typedef struct StackRecord *Stack;
int IsEmpty( Stack S );
int IsFull( Stack S );
Stack CreateStack( int MaxElements );
void DisposeStack( Stack S );
void MakeEmpty( Stack S );
void Push( ElementType X, Stack S );
ElementType Top( Stack S );
void Pop( Stack S );
ElementType TopAndPop( Stack S );
struct StackRecord
{
    int Capacity;
    int TopOfStack;
    ElementType *Array;
};
#define EmptyTOS (-1)    /* Signifies an empty stack */
#define MinStackSize (5)
Routine to create an empty stack - array implementation
This routine creates a stack and returns a pointer to it. If the requested size is too
small or no memory is available, it reports an error.
Stack CreateStack( int MaxElements )
{
    Stack S;
    if( MaxElements < MinStackSize )
        error( "Stack size is too small" );
    S = malloc( sizeof(struct StackRecord) );
    if( S == NULL )
        fatal_error( "Out of space!!!" );
    S->Array = malloc( sizeof(ElementType) * MaxElements );
    if( S->Array == NULL )
        fatal_error( "Out of space!!!" );
    S->Capacity = MaxElements;
    MakeEmpty( S );
    return( S );
}
Routine for freeing stack - array implementation
This routine frees the stack: first the array that holds the elements, then the
stack structure itself.
void DisposeStack( Stack S )
{
    if( S != NULL )
    {
        free( S->Array );
        free( S );
    }
}
Routine to test whether a stack is empty - array implementation
This routine checks whether the stack is empty.
int IsEmpty( Stack S )
{
    return( S->TopOfStack == EmptyTOS );
}
Routine to make the stack empty - array implementation
This routine makes the stack empty by resetting the top of stack.
void MakeEmpty( Stack S )
{
    S->TopOfStack = EmptyTOS;
}
Routine to push onto a stack - array implementation
This routine inserts the new element onto the top of the stack using the stack pointer.
void Push( ElementType X, Stack S )
{
    if( IsFull( S ) )
        error( "Full stack" );
    else
        S->Array[ ++S->TopOfStack ] = X;
}
Routine to return top of stack - array implementation
This routine returns the topmost element of the stack without removing it.
ElementType Top( Stack S )
{
    if( !IsEmpty( S ) )
        return S->Array[ S->TopOfStack ];
    error( "Empty stack" );
    return 0;    /* return value used to avoid a compiler warning */
}
Routine to pop from a stack - array implementation
This routine deletes the topmost element from the stack.
void Pop( Stack S )
{
    if( IsEmpty( S ) )
        error( "Empty stack" );
    else
        S->TopOfStack--;
}
Routine to give top element and pop a stack - array implementation
This routine returns as well as removes the topmost element from the stack.
ElementType TopAndPop( Stack S )
{
    if( !IsEmpty( S ) )
        return S->Array[ S->TopOfStack-- ];
    error( "Empty stack" );
    return 0;    /* return value used to avoid a compiler warning */
}
Stack Applications
A stack is used for the following applications.
1. Reversing a string
2. Towers of Hanoi problem
3. Balancing symbols
4. Conversion of infix to postfix expression
5. Conversion of infix to prefix expression
6. Evaluation of postfix expression
7. Function calls
Balancing Symbols
Compilers check your programs for syntax errors, but frequently a lack of one symbol
(such as a missing brace or comment starter) will cause the compiler to
spill out a hundred lines of diagnostics without identifying the real error.
A useful tool in this situation is a program that checks whether everything is
balanced. Thus, every right brace, bracket, and parenthesis must correspond to
its left counterpart.
The sequence [()] is legal, but [(]) is wrong. It is easy to check these things. For
simplicity, we will just check for balancing of parentheses, brackets, and braces and
ignore any other character that appears.
The simple algorithm uses a stack and is as follows:
Make an empty stack.
Read characters until end of file.
If the character is an opening symbol, push it onto the stack.
If it is a closing symbol, then
    if the stack is empty, report an error;
    otherwise, pop the stack.
    If the symbol popped is not the corresponding opening symbol, then report an
    error.
At end of file, if the stack is not empty, report an error.
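The algorithm above translates almost line for line into C. The following is a minimal sketch that checks a string rather than a file and uses a fixed-size array as the stack; the name is_balanced, the helper opener_for and the limit MAX_DEPTH are assumptions introduced for illustration.

```c
#define MAX_DEPTH 1024   /* assumption: nesting deeper than this is rejected */

/* Return the opening symbol that matches a closing symbol, or '\0'
   if c is not a closing symbol. */
static char opener_for(char c)
{
    switch (c) {
    case ')': return '(';
    case ']': return '[';
    case '}': return '{';
    default:  return '\0';
    }
}

/* Returns 1 if every (, [ and { in s is properly matched, 0 otherwise.
   All other characters are ignored. */
int is_balanced(const char *s)
{
    char stack[MAX_DEPTH];
    int tos = -1;                          /* -1 signifies an empty stack */

    for (; *s != '\0'; s++) {
        char c = *s;
        if (c == '(' || c == '[' || c == '{') {
            if (tos + 1 == MAX_DEPTH)
                return 0;                  /* stack full: treated as an error */
            stack[++tos] = c;              /* opening symbol: push */
        } else if (opener_for(c) != '\0') {
            if (tos == -1)
                return 0;                  /* closing symbol, empty stack */
            if (stack[tos--] != opener_for(c))
                return 0;                  /* popped symbol does not match */
        }
    }
    return tos == -1;                      /* leftovers mean unclosed symbols */
}
```

For example, is_balanced("[()]") accepts the legal sequence from the text, while is_balanced("[(])") rejects the illegal one because the ']' pops a '('.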
Expression:
An expression is defined as a collection of operands and operators. The operators
can be arithmetic, logical or Boolean operators.
Rules for expression
No two operands should be adjacent.
No two operators should be adjacent.
Types of expression:
Based on the position of the operator, expressions are classified into three types.
1. Infix Expression / Standard notation
2. Prefix Expression / Polish notation
3. Postfix Expression / Reverse Polish notation
Infix Expression:
In an expression, if the operator is placed in between the operands, then it is called an
Infix Expression.
Eg: A + B
Prefix Expression:
In an expression, if the operator is placed before the operands, then it is
called a Prefix Expression.
Eg: + A B
Postfix Expression:
In an expression, if the operator is placed after the operands, then it is called a
Postfix Expression.
Eg: A B +
Conversion of infix to Postfix Expressions
A stack is used to convert an expression in standard form (otherwise known as
infix) into postfix. We will concentrate on a small version of the general problem by
allowing only the operators +, *, and (, ), and insisting on the usual precedence rules.
Suppose we want to convert the infix expression
a + b * c + ( d * e + f ) * g.
A correct answer is a b c * + d e * f + g * +.
Algorithm:
1. We start with an initially empty stack.
2. When an operand is read, it is immediately placed onto the output.
3. Operators are not immediately placed onto the output, so they must be saved
somewhere. The correct thing to do is to place operators that have been seen,
but not placed on the output, onto the stack. We will also stack left parentheses
when they are encountered.
4. If we see a right parenthesis, then we pop the stack, writing symbols until we
encounter a (corresponding) left parenthesis, which is popped but not output.
5. If we see any other symbol ('+', '*', '('), then we pop entries from the stack until
we find an entry of lower priority. One exception is that we never remove a '('
from the stack except when processing a ')'. For the purposes of this operation,
'+' has lowest priority and '(' highest. When the popping is done, we push the
operator onto the stack.
6. Finally, if we read the end of input, we pop the stack until it is empty, writing
symbols onto the output.
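The six steps can be sketched in C as follows. This is an illustration under stated assumptions rather than a definitive implementation: operands are single letters or digits, the operators are + - * / with the usual left-to-right associativity, and the names infix_to_postfix, prec and STACK_MAX are hypothetical.

```c
#include <ctype.h>

#define STACK_MAX 256   /* assumption: at most this many pending operators */

/* Priority of a binary operator; 0 means "not an operator". */
static int prec(char op)
{
    if (op == '+' || op == '-') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;
}

/* Convert infix `in` to postfix in `out` (which needs strlen(in) + 1 bytes).
   Returns 0 on success, -1 on mismatched parentheses or stack overflow. */
int infix_to_postfix(const char *in, char *out)
{
    char stack[STACK_MAX];
    int tos = -1, k = 0;

    for (; *in != '\0'; in++) {
        char c = *in;
        if (isalnum((unsigned char)c)) {
            out[k++] = c;                        /* step 2: operand to output */
        } else if (c == '(') {
            if (tos + 1 == STACK_MAX) return -1;
            stack[++tos] = c;                    /* step 3: stack the '(' */
        } else if (c == ')') {
            while (tos >= 0 && stack[tos] != '(')
                out[k++] = stack[tos--];         /* step 4: pop back to '(' */
            if (tos < 0) return -1;              /* no matching '(' */
            tos--;                               /* pop '(' without output */
        } else if (prec(c) > 0) {
            /* step 5: pop until an entry of lower priority (or a '(') */
            while (tos >= 0 && stack[tos] != '(' && prec(stack[tos]) >= prec(c))
                out[k++] = stack[tos--];
            if (tos + 1 == STACK_MAX) return -1;
            stack[++tos] = c;
        }
        /* any other character (for example a space) is ignored */
    }
    while (tos >= 0) {                           /* step 6: flush the stack */
        if (stack[tos] == '(') return -1;        /* unclosed '(' */
        out[k++] = stack[tos--];
    }
    out[k] = '\0';
    return 0;
}
```

Popping entries of equal priority as well as higher priority is what makes a - b - c come out as ab-c- rather than abc--.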
To see how this algorithm performs, we will convert the infix expression
into its postfix form.
a + b * c + ( d * e + f ) * g
First, the symbol a is read, so it is passed through to the output. Then '+' is read and
pushed onto the stack. Next b is read and passed through to the output. Then the stack will
be as follows.
Next a '*' is read. The top entry on the operator stack has lower precedence than '*', so
nothing is output and '*' is put on the stack. Next, c is read and output.
The next symbol is a '+'. Checking the stack, we pop the '*' and place it on the output,
pop the other '+', which is of equal (not lower) priority, and then push the new '+'.
The next symbol read is an '(', which, being of highest precedence, is placed on the
stack. Then d is read and output.
We continue by reading a '*'. Since open parentheses do not get removed except when
a closed parenthesis is being processed, there is no output. Next, e is read and output.
The next symbol read is a '+'. We pop and output '*' and then push '+'. Then we read
and output f.
Now we read a ')', so the stack is emptied back to the '('. We output a '+'.
We read a '*' next; it is pushed onto the stack. Then g is read and output.
The input is now empty, so we pop and output symbols from the stack until it is empty.
As before, this conversion requires only O(n) time and works in one pass
through the input. We can add subtraction and division to this repertoire by assigning
subtraction and addition equal priority and multiplication and division equal priority.
A subtle point is that the expression a - b - c will be converted to ab - c - and not abc - -.
Our algorithm does the right thing, because these operators associate from left to right.
This is not necessarily the case in general, since exponentiation associates right to left:
2^(2^3) = 2^8 = 256, not 4^3 = 64.
Evaluation of a Postfix Expression
Algorithm:
When a number is seen, it is pushed onto the stack.
When an operator is seen, the operator is applied to the two numbers (symbols)
that are popped from the stack and the result is pushed onto the stack.
For example, the postfix expression 6 5 2 3 + 8 * + 3 + * is evaluated as follows:
The first four symbols are placed on the stack. The resulting stack is
Top of Stack   3
               2
               5
               6
Next a '+' is read, so 3 and 2 are popped from the stack and their sum, 5, is pushed.
Top of Stack   5
               5
               6
Next 8 is pushed.
Top of Stack   8
               5
               5
               6
Now a '*' is seen, so 8 and 5 are popped and 8 * 5 = 40 is pushed.
Top of Stack   40
               5
               6
Next a '+' is seen, so 40 and 5 are popped and 40 + 5 = 45 is pushed.
Top of Stack   45
               6
Now, 3 is pushed.
Top of Stack   3
               45
               6
Next '+' pops 3 and 45 and pushes 45 + 3 = 48.
Top of Stack   48
               6
Finally, a '*' is seen and 48 and 6 are popped; the result 6 * 48 = 288 is pushed.
Top of Stack   288
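The evaluation traced above can be sketched in C. The sketch assumes single-digit operands (as in the example), integer arithmetic, and the operators + - * /; the name eval_postfix and its error convention are hypothetical.

```c
#include <ctype.h>

#define EVAL_STACK_MAX 128   /* assumption: enough for short expressions */

/* Evaluate a postfix expression of single-digit operands.
   Stores the value in *result and returns 0; returns -1 if the
   expression is malformed or divides by zero. */
int eval_postfix(const char *expr, long *result)
{
    long stack[EVAL_STACK_MAX];
    int tos = -1;

    for (; *expr != '\0'; expr++) {
        char c = *expr;
        if (isspace((unsigned char)c))
            continue;
        if (isdigit((unsigned char)c)) {
            if (tos + 1 == EVAL_STACK_MAX) return -1;
            stack[++tos] = c - '0';            /* a number is pushed */
        } else {
            if (tos < 1) return -1;            /* an operator needs two operands */
            long right = stack[tos--];         /* popped first: right operand */
            long left = stack[tos--];
            long v;
            switch (c) {
            case '+': v = left + right; break;
            case '-': v = left - right; break;
            case '*': v = left * right; break;
            case '/':
                if (right == 0) return -1;
                v = left / right;
                break;
            default:
                return -1;                     /* unknown symbol */
            }
            stack[++tos] = v;                  /* the result is pushed */
        }
    }
    if (tos != 0) return -1;                   /* exactly one value must remain */
    *result = stack[0];
    return 0;
}
```

The order of the two pops matters for subtraction and division: the first value popped is the right operand.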
FunctionCalls
When a call is made to a new function, all the variables local to the calling
routine need to be saved by the system. Otherwise the new function will
overwrite the calling routine's variables.
The current location in the routine must be saved so that the new function
knows where to go after it is done.
The reason that this problem is similar to balancing symbols is that a function
call and function return are essentially the same as an open parenthesis and
closed parenthesis, so the same ideas should work.
When there is a function call, all the important information that needs to be
saved, such as register values (corresponding to variable names) and the return
address, is saved "on a piece of paper" in an abstract way and put at the top of a
pile. Then the control is transferred to the new function, which is free to replace the
registers with its values.
If it makes other function calls, it follows the same procedure. When the
function wants to return, it looks at the "paper" at the top of the pile and restores
all the registers. It then makes the return jump.
The information saved is called either an activation record or stack frame.
There is always the possibility that you will run out of stack space by having
too many simultaneously active functions. Running out of stack space is always a fatal
error.
In normal events, you should not run out of stack space; doing so is usually an
indication of runaway recursion. On the other hand, some perfectly legal and
seemingly innocuous program can cause you to run out of stack space.
A bad use of recursion: printing a linked list
void    /* Not using a header */
print_list( LIST L )
{
    if( L != NULL )
    {
        print_element( L->element );
        print_list( L->next );
    }
}
The Queue ADT
A queue is also a list, in which insertion is done at one end, whereas deletion is
performed at the other end. Insertion is at the rear end of the queue and deletion is
at the front of the queue. It is also called a FIFO (First In First Out) list, which means
the element inserted first will be removed first from the queue.
Queue Model
The basic operations on a queue are:
1. enqueue, which inserts an element at the end of the list (called the rear)
2. dequeue, which deletes (and returns) the element at the start of the
list (known as the front).
Abstract model of a queue
ArrayImplementationofQueues
Like stacks, both the linked list and array implementations give fast O(1)
running times for every operation. The linked list implementation is
straightforward and left as an exercise. We will now discuss an array
implementation of queues.
For each queue data structure, we keep an array, QUEUE[], and the positions
q_front and q_rear, which represent the ends of the queue. We also keep track
of the number of elements that are actually in the queue, q_size.
Thefollowingfigureshowsaqueueinsomeintermediatestate.
By the way, the cells that are blanks have undefined values in them. In
particular, the first two cells have elements that used to be in the queue.
Type declarations for queue - array implementation
struct QueueRecord
{
    int Capacity;
    int Front;
    int Rear;
    int Size;    /* Current # of elements in Q */
    ElementType *Array;
};
typedef struct QueueRecord *Queue;
Routine to test whether a queue is empty - array implementation
int isempty( Queue Q )
{
    return( Q->Size == 0 );
}
Routine to make an empty queue - array implementation
void makeempty( Queue Q )
{
    Q->Size = 0;
    Q->Front = 0;     /* the first element enqueued lands in cell 0 */
    Q->Rear = -1;     /* succ( -1 ) is 0, so Rear meets Front on the first enqueue */
}
Routines to enqueue - array implementation
static int succ( int value, Queue Q )
{
    if( ++value == Q->Capacity )
        value = 0;    /* wrap around to the start of the array */
    return value;
}
void enqueue( ElementType x, Queue Q )
{
    if( isfull( Q ) )
        error( "Full queue" );
    else
    {
        Q->Size++;
        Q->Rear = succ( Q->Rear, Q );
        Q->Array[ Q->Rear ] = x;
    }
}
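The notes show enqueue but not the matching dequeue. The following self-contained sketch shows how the two could fit together, assuming ElementType is int, an empty queue has Front = 0 and Rear = -1, and errors are reported through return codes instead of an error routine; CreateQueue, Enqueue, Dequeue and Succ are hypothetical names for this illustration.

```c
#include <stdlib.h>

typedef int ElementType;     /* assumption: the notes do not fix this type */

struct QueueRecord {
    int Capacity;
    int Front;
    int Rear;
    int Size;                /* current number of elements in the queue */
    ElementType *Array;
};
typedef struct QueueRecord *Queue;

/* Allocate an empty queue with room for MaxElements items (NULL on failure). */
Queue CreateQueue(int MaxElements)
{
    Queue Q = malloc(sizeof(struct QueueRecord));
    if (Q == NULL)
        return NULL;
    Q->Array = malloc(sizeof(ElementType) * (size_t)MaxElements);
    if (Q->Array == NULL) {
        free(Q);
        return NULL;
    }
    Q->Capacity = MaxElements;
    Q->Size = 0;
    Q->Front = 0;
    Q->Rear = -1;
    return Q;
}

/* Next index after Value, wrapping around at the end of the array. */
static int Succ(int Value, Queue Q)
{
    if (++Value == Q->Capacity)
        Value = 0;
    return Value;
}

/* Insert X at the rear. Returns 0 on success, -1 if the queue is full. */
int Enqueue(ElementType X, Queue Q)
{
    if (Q->Size == Q->Capacity)
        return -1;
    Q->Size++;
    Q->Rear = Succ(Q->Rear, Q);
    Q->Array[Q->Rear] = X;
    return 0;
}

/* Remove the front element into *Out. Returns 0 on success, -1 if empty. */
int Dequeue(Queue Q, ElementType *Out)
{
    if (Q->Size == 0)
        return -1;
    *Out = Q->Array[Q->Front];
    Q->Size--;
    Q->Front = Succ(Q->Front, Q);
    return 0;
}
```

Keeping Size explicitly avoids the ambiguity between a full and an empty queue, which otherwise look identical when only Front and Rear are compared.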
Applications of Queues
The applications are:
1. When jobs are submitted to a printer, they are arranged in order of arrival. Thus jobs
sent to a line printer are placed on a queue.
2. Lines at ticket counters are queues, because service is first-come first-served.
3. Another example concerns computer networks. There are many network setups
of personal computers in which the disk is attached to one machine, known as
the file server. Users on other machines are given access to files on a first-come
first-served basis, so the data structure is a queue.
Circular Queue:
In a circular queue, the first location is treated as coming after the last location.
When the rear reaches the last location of the array, the next element is inserted at
the first location of the queue, provided that location is free.
Advantages:
It overcomes the problem of unutilized space in a linear queue when it is implemented
as an array.
Dequeue:
This routine deletes the element from the front of the circular queue.
void CQ_dequeue( )
{
    if( front == -1 && rear == -1 )
        printf( "Queue is empty" );
    else
    {
        temp = CQueue[ front ];
        if( front == rear )    /* the last element was removed */
            front = rear = -1;
        else
            front = ( front + 1 ) % maxsize;
    }
}
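The dequeue routine above can be paired with a matching enqueue. The following is a minimal sketch, not the notes' own code: the names cq_enqueue and cq_dequeue, the int element type, and the capacity MAXSIZE are assumptions made for illustration. It keeps the same convention that front == rear == -1 means the queue is empty.

```c
#include <stddef.h>

#define MAXSIZE 4                     /* assumed capacity for this sketch */

static int cqueue[MAXSIZE];
static int front = -1, rear = -1;     /* -1 marks an empty queue */

/* Returns 1 on success, 0 if the queue is full. */
int cq_enqueue(int x)
{
    if ((rear + 1) % MAXSIZE == front)
        return 0;                     /* next slot is the front: full */
    if (front == -1)
        front = rear = 0;             /* first element */
    else
        rear = (rear + 1) % MAXSIZE;  /* wrap around past the last cell */
    cqueue[rear] = x;
    return 1;
}

/* Stores the removed element in *out and returns 1, or returns 0 if empty. */
int cq_dequeue(int *out)
{
    if (front == -1 || out == NULL)
        return 0;
    *out = cqueue[front];
    if (front == rear)
        front = rear = -1;            /* queue became empty */
    else
        front = (front + 1) % MAXSIZE;
    return 1;
}
```

Because both indices advance modulo MAXSIZE, a cell freed at the start of the array is reused as soon as rear reaches the end, which is the advantage described above.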
Priority Queue:
In a priority queue, an element with higher priority is served before an element
with lower priority.
If two elements have the same priority, they are served according to their order
in the queue.
Two types of priority queue.
UNIT III TREES 9
Tree ADT – Tree Traversals – Binary Tree ADT – Expression trees – Binary Search Tree ADT –
AVL Trees – Priority Queue (Heaps) – Binary Heap.
TREES
A tree is a non-linear data structure in which data are stored in a hierarchical manner. It is also defined
as a collection of nodes. The collection can be empty. Otherwise, a tree consists of a
distinguished node r, called the root, and zero or more (sub) trees T1, T2, . . . , Tk, each of
whose roots are connected by a directed edge to r.
The root of each subtree is said to be a child of r, and r is the parent of each subtree
root. A tree is a collection of n nodes, one of which is the root, and n - 1 edges. That there are n - 1
edges follows from the fact that each edge connects some node to its parent and every node
except the root has one parent.
Generic tree
A tree
Terms in Tree
In the tree in the above figure, the root is A.
Node F has A as a parent and K, L, and M as children.
Each node may have an arbitrary number of children, possibly zero.
Nodes with no children are known as leaves.
The leaves in the tree above are B, C, H, I, P, Q, K, L, M, and N.
Nodes with the same parent are siblings; thus K, L, and M are all siblings.
Grandparent and grandchild relations can be defined in a similar manner.
A path from node n1 to nk is defined as a sequence of nodes n1, n2, . . . , nk such that ni
is the parent of ni+1 for 1 ≤ i < k.
The length of this path is the number of edges on the path, namely k - 1.
There is a path of length zero from every node to itself.
For any node ni, the depth of ni is the length of the unique path from the root to ni.
Thus, the root is at depth 0.
The height of ni is the length of the longest path from ni to a leaf. Thus all leaves are at height 0.
The height of a tree is equal to the height of the root.
Example: For the above tree,
E is at depth 1 and height 2;
F is at depth 1 and height 1; the height of the tree is 3.
Note:
The depth of a tree is equal to the depth of the deepest leaf; this is always
equal to the height of the tree.
If there is a path from n1 to n2, then n1 is an ancestor of n2 and n2 is a
descendant of n1. If n1 ≠ n2, then n1 is a proper ancestor of n2 and n2 is a
proper descendant of n1.
In a tree there is exactly one path from the root to each node.
Types of Tree
Based on the number of children of each node, trees are classified into two types.
1. Binary tree
2. General tree
Binary tree
A tree in which every node has at most two children (zero, one, or
two) is called a binary tree.
Eg:
General Tree
Implementation of Trees
A tree can be implemented by two methods.
1. Array implementation
2. Linked list implementation
Apart from these two methods, it can also be represented by the first child and
next sibling representation.
One way to implement a tree would be to have in each node, besides its data, a pointer
to each child of the node. However, since the number of children per node can vary so greatly and is
not known in advance, it might be infeasible to make the children direct links in the data
structure, because there would be too much wasted space. The solution is simple: Keep the
children of each node in a linked list of tree nodes.
Node declarations for trees
typedef struct tree_node *tree_ptr;
struct tree_node
{
    element_type element;
    tree_ptr first_child;
    tree_ptr next_sibling;
};
Arrows that point downward are first_child pointers. Arrows that go left to right are
next_sibling pointers. Null pointers are not drawn, because there are too many. In the above
tree, node E has both a pointer to a sibling (F) and a pointer to a child (I), while some nodes
have neither.
Tree Traversals
Visiting each node in a tree exactly once is called a tree
traversal. The left subtree and right subtree are traversed recursively.
Types of Tree Traversal:
1. Inorder Traversal
2. Preorder Traversal
3. Postorder Traversal
Inorder traversal:
Rules:
Traverse the left subtree recursively
Process the node
Traverse the right subtree recursively
Eg
Inorder traversal: a + b * c + d * e + f * g
Preorder traversal:
Rules:
Process the node
Traverse the left subtree recursively
Traverse the right subtree recursively
Preorder traversal: + + a * b c * + * d e f g
Postorder traversal:
Rules:
Traverse the left subtree recursively
Traverse the right subtree recursively
Process the node
Postorder traversal: a b c * + d e * f + g * +
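The three sets of rules differ only in where the node is processed relative to the two recursive calls. The sketch below is an illustration under assumptions, not code from the notes: the struct node type, the buffer-based interface, and the helper traverse_example are hypothetical names. It builds the expression tree for (a + b * c) + ((d * e + f) * g) by hand and records the three traversal orders.

```c
#include <stddef.h>

struct node {
    char symbol;
    struct node *left, *right;
};

/* Each routine appends visited symbols to buf and returns the new length. */
static size_t inorder(const struct node *t, char *buf, size_t n)
{
    if (t == NULL) return n;
    n = inorder(t->left, buf, n);      /* left subtree  */
    buf[n++] = t->symbol;              /* process node  */
    return inorder(t->right, buf, n);  /* right subtree */
}

static size_t preorder(const struct node *t, char *buf, size_t n)
{
    if (t == NULL) return n;
    buf[n++] = t->symbol;              /* process node first */
    n = preorder(t->left, buf, n);
    return preorder(t->right, buf, n);
}

static size_t postorder(const struct node *t, char *buf, size_t n)
{
    if (t == NULL) return n;
    n = postorder(t->left, buf, n);
    n = postorder(t->right, buf, n);
    buf[n++] = t->symbol;              /* process node last */
    return n;
}

/* Fills three NUL-terminated strings (each needs at least 14 bytes). */
void traverse_example(char *in, char *pre, char *post)
{
    struct node a = {'a', NULL, NULL}, b = {'b', NULL, NULL};
    struct node c = {'c', NULL, NULL}, d = {'d', NULL, NULL};
    struct node e = {'e', NULL, NULL}, f = {'f', NULL, NULL};
    struct node g = {'g', NULL, NULL};
    struct node bc   = {'*', &b, &c};      /* b * c           */
    struct node abc  = {'+', &a, &bc};     /* a + b * c       */
    struct node de   = {'*', &d, &e};      /* d * e           */
    struct node def  = {'+', &de, &f};     /* d * e + f       */
    struct node defg = {'*', &def, &g};    /* (d * e + f) * g */
    struct node root = {'+', &abc, &defg};

    in[inorder(&root, in, 0)] = '\0';
    pre[preorder(&root, pre, 0)] = '\0';
    post[postorder(&root, post, 0)] = '\0';
}
```

Without the spaces, the three strings correspond to the inorder, preorder, and postorder sequences listed above.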
Tree Traversals with an Application
There are many applications for trees. Two important applications are:
1. Listing a directory in a hierarchical file system
2. Calculating the size of a directory
The root of this directory is /usr. (The asterisk next to the name indicates that /usr
is itself a directory.)
/usr has three children, mark, alex, and bill, which are themselves directories. Thus,
/usr contains three directories and no regular files.
The filename /usr/mark/book/ch1.r is obtained by following the leftmost child
three times. Each / after the first indicates an edge; the result is the full pathname.
Two files in different directories can share the same name, because they must have different
paths from the root and thus have different pathnames.
A directory in the UNIX file system is just a file with a list of all its children, so the
directories are structured almost exactly in accordance with the type declaration.
Each directory in the UNIX file system also has one entry that points to itself and
another entry that points to the parent of the directory. Thus, technically, the UNIX file
system is not a tree, but is treelike.
Routine to list a directory in a hierarchical file system
void list_directory( Directory_or_file D )
{
    list_dir( D, 0 );
}
void list_dir( Directory_or_file D, unsigned int depth )
{
    if( D is a legitimate entry )
    {
        print_name( depth, D );
        if( D is a directory )
            for each child, c, of D
                list_dir( c, depth+1 );
    }
}
The logic of the algorithm is as follows.
The argument to list_dir is some sort of pointer into the tree. As long as the pointer is
valid, the name implied by the pointer is printed out with the appropriate number of
tabs.
If the entry is a directory, then we process all children recursively, one by one. These
children are one level deeper, and thus need to be indented an extra space.
This traversal strategy is known as a preorder traversal. In a preorder traversal, work at a
node is performed before (pre) its children are processed. If there are n file names to be
output, then the running time is O (n).
The (preorder) directory listing
/usr
    mark
        book
            ch1.r
            ch2.r
            ch3.r
        course
            cop3530
                fall88
                    syl.r
                spr89
                    syl.r
                sum89
                    syl.r
        junk.c
    alex
        junk.c
    bill
        work
        course
            cop3212
                fall88
                    grades
                    prog1.r
                    prog2.r
                fall89
                    prog1.r
                    prog2.r
                    grades
Routine to calculate the size of a directory
unsigned int size_directory( Directory_or_file D )
{
    unsigned int total_size;
    total_size = 0;
    if( D is a legitimate entry )
    {
        total_size = file_size( D );
        if( D is a directory )
            for each child, c, of D
                total_size += size_directory( c );
    }
    return( total_size );
}
Size of the UNIX Directory
            ch1.r            3
            ch2.r            2
            ch3.r            4
        book                10
                    syl.r    1
                fall88       2
                    syl.r    5
                spr89        6
                    syl.r    2
                sum89        3
            cop3530         12
        course              13
        junk.c               6
    mark                    30
        junk.c               8
    alex                     9
        work                 1
                    grades   3
                    prog1.r  4
                    prog2.r  1
                fall88       9
                    prog2.r  2
                    prog1.r  7
                    grades   9
                fall89      19
            cop3212         29
        course              30
    bill                    32
/usr                        72
If D is not a directory, then size_directory merely returns the number of blocks used
by D. Otherwise, the number of blocks used by D is added to the number of blocks
(recursively) found in all of the children.
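The two pseudocode routines can be made concrete on an in-memory tree. The sketch below is only an illustration: struct entry, its blocks field, and the function names are assumptions, and a real program would read the file system instead. It stores children with the first child and next sibling representation described earlier. Listing prints a node before its children (preorder), while the size of a directory is known only after its children have been totalled (postorder).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory entry. */
struct entry {
    const char *name;
    unsigned blocks;              /* blocks used by this entry itself */
    struct entry *first_child;    /* NULL for a regular file */
    struct entry *next_sibling;
};

/* Preorder: print the entry, then each child one level deeper. */
void list_dir(const struct entry *d, unsigned depth)
{
    if (d == NULL)
        return;
    for (unsigned i = 0; i < depth; i++)
        putchar('\t');
    printf("%s\n", d->name);
    for (const struct entry *c = d->first_child; c != NULL; c = c->next_sibling)
        list_dir(c, depth + 1);
}

/* Postorder: add the entry's own blocks to the totals of its children. */
unsigned size_directory(const struct entry *d)
{
    unsigned total = 0;
    if (d != NULL) {
        total = d->blocks;
        for (const struct entry *c = d->first_child; c != NULL; c = c->next_sibling)
            total += size_directory(c);
    }
    return total;
}
```

For the book directory in the table above (three files of 3, 2, and 4 blocks, plus one block for the directory itself), size_directory gives 10.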
Binary Trees
A binary tree is a tree in which no node can have more than two children.
Implementation
Because a node in a binary tree has at most two children, we can keep direct pointers to them. The
declaration of tree nodes is similar in structure to that for doubly linked lists, in that a node is
a structure consisting of the key information plus two pointers (left and right) to other nodes.
Expression Trees
When an expression is represented as a binary tree, it is called an expression tree.
The leaves of an expression tree are operands, such as constants or variable names, and the
other nodes contain operators. It is possible for nodes to have more than two children. It is
also possible for a node to have only one child, as is the case with the unary minus operator.
We can evaluate an expression tree, T, by applying the operator at the root to the values
obtained by recursively evaluating the left and right subtrees.
In our example, the left subtree evaluates to a + (b * c) and the right subtree evaluates
to ((d * e) + f) * g. The entire tree therefore represents (a + (b * c)) + (((d * e) + f) * g).
We can produce an (overly parenthesized) infix expression by recursively
producing a parenthesized left expression, then printing out the operator at the
root, and finally recursively producing a parenthesized right expression. This
general strategy (left, node, right) is known as an inorder traversal; it gives the infix
expression.
An alternate traversal strategy is to recursively print out the left subtree, the
right subtree, and then the operator. If we apply this strategy to our tree above, the output is a
b c * + d e * f + g * +, which is called the postfix expression. This traversal strategy is
generally known as a postorder traversal.
A third traversal strategy is to print out the operator first and then recursively print out the left
and right subtrees. The resulting expression, + + a * b c * + * d e f g, is the less useful prefix
notation, and the traversal strategy is a preorder traversal.
Expression tree for (a + b * c) + ((d * e + f) * g)
Constructing an Expression Tree
Algorithm to convert a postfix expression into an expression tree
1. Read the postfix expression one symbol at a time.
2. If the symbol is an operand, then
a. We create a one-node tree and push a pointer to it onto a stack.
3. If the symbol is an operator,
a. We pop pointers to two trees T1 and T2 from the stack (T1 is popped first) and form a
new tree whose root is the operator and whose left and right children point to
T2 and T1 respectively.
b. A pointer to this new tree is then pushed onto the stack.
Suppose the input is
a b + c d e + * *
The first two symbols are operands, so we create one-node trees and push pointers to
them onto a stack.
Next, a + is read, so the two pointers are popped, a new tree is formed, and a pointer
to it is pushed onto the stack.
Next, c, d, and e are read, and for each a one-node tree is created and a pointer to the
corresponding tree is pushed onto the stack.
Now a + is read, so the trees for d and e are merged. Then a * is read, so we pop two
tree pointers and form a new tree with * as its root.
Finally, the last symbol is read, two trees are merged, and a pointer to the final tree is
left on the stack.
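The algorithm above can be sketched in C as follows. This is an illustrative version under assumptions: operands are single letters or digits, every other non-blank symbol is treated as a binary operator, and the names struct enode, build_expression_tree, and to_infix are hypothetical. The stack is a small fixed array of node pointers.

```c
#include <ctype.h>
#include <stdlib.h>

struct enode {
    char symbol;
    struct enode *left, *right;
};

static struct enode *new_node(char symbol, struct enode *l, struct enode *r)
{
    struct enode *n = malloc(sizeof *n);
    if (n == NULL)
        abort();                       /* out of space */
    n->symbol = symbol;
    n->left = l;
    n->right = r;
    return n;
}

/* Returns the root of the tree, or NULL if the postfix string is malformed.
   (Nodes already allocated are leaked on error; acceptable for a sketch.) */
struct enode *build_expression_tree(const char *postfix)
{
    struct enode *stack[64];
    int top = 0;

    for (const char *p = postfix; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isalnum((unsigned char)*p)) {          /* operand: one-node tree */
            if (top == 64)
                return NULL;
            stack[top++] = new_node(*p, NULL, NULL);
        } else {                                   /* operator */
            if (top < 2)
                return NULL;
            struct enode *t1 = stack[--top];       /* popped first: right child */
            struct enode *t2 = stack[--top];       /* popped second: left child */
            stack[top++] = new_node(*p, t2, t1);
        }
    }
    return top == 1 ? stack[0] : NULL;
}

/* Inorder traversal producing a fully parenthesized infix string.
   Appends to buf starting at index n and returns the new length. */
size_t to_infix(const struct enode *t, char *buf, size_t n)
{
    if (t == NULL)
        return n;
    if (t->left == NULL && t->right == NULL) {
        buf[n++] = t->symbol;
        return n;
    }
    buf[n++] = '(';
    n = to_infix(t->left, buf, n);
    buf[n++] = t->symbol;
    n = to_infix(t->right, buf, n);
    buf[n++] = ')';
    return n;
}
```

For the input a b + c d e + * * used above, the inorder output is ((a+b)*(c*(d+e))).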
The Search Tree ADT - Binary Search Tree
The property that makes a binary tree into a binary search tree is that for every
node, X, in the tree, the values of all the keys in the left subtree are smaller than the key
value in X, and the values of all the keys in the right subtree are larger than the key
value in X.
Notice that this implies that all the elements in the tree can be ordered in some
consistent manner.
In the above figure, the tree on the left is a binary search tree, but the tree on the right is
not. The tree on the right has a node with key 7 in the left subtree of a node with key 6. The
average depth of a binary search tree is O(log n).
Binary search tree declarations
typedef struct tree_node *tree_ptr;
struct tree_node
{
    element_type element;
    tree_ptr left;
    tree_ptr right;
};
typedef tree_ptr SEARCH_TREE;
Make Empty:
This operation is mainly for initialization. Some programmers prefer to initialize the
first element as a one-node tree, but our implementation follows the recursive definition of
trees more closely.
Find
This operation generally requires returning a pointer to the node in tree T that has key
x, or NULL if there is no such node. The structure of the tree makes this simple. If T is NULL, then
we can just return NULL. Otherwise, if the key stored at T is x, we can return T. Otherwise, we
make a recursive call on a subtree of T, either left or right, depending on the relationship of x
to the key stored in T.
Routine to make an empty tree
SEARCH_TREE makeempty( SEARCH_TREE T )
{
    if( T != NULL )
    {
        makeempty( T->left );
        makeempty( T->right );
        free( T );
    }
    return NULL;
}
Routine for Find operation
tree_ptr find( element_type x, SEARCH_TREE T )
{
    if( T == NULL )
        return NULL;
    if( x < T->element )
        return( find( x, T->left ) );
    else
    if( x > T->element )
        return( find( x, T->right ) );
    else
        return T;
}
FindMin & FindMax:
These routines return the position of the smallest and largest elements in the tree,
respectively.
To perform a findmin, start at the root and go left as long as there is a left child. The
stopping point is the smallest element.
The findmax routine is the same, except that branching is to the right child.
Recursive implementation of FindMin for binary search trees
tree_ptr findmin( SEARCH_TREE T )
{
    if( T == NULL )
        return NULL;
    else
    if( T->left == NULL )
        return( T );
    else
        return( findmin( T->left ) );
}
Recursive implementation of FindMax for binary search trees
tree_ptr findmax( SEARCH_TREE T )
{
    if( T == NULL )
        return NULL;
    else
    if( T->right == NULL )
        return( T );
    else
        return( findmax( T->right ) );
}
Nonrecursive implementation of FindMin for binary search trees
tree_ptr findmin( SEARCH_TREE T )
{
    if( T != NULL )
        while( T->left != NULL )
            T = T->left;
    return( T );
}
Nonrecursive implementation of FindMax for binary search trees
tree_ptr findmax( SEARCH_TREE T )
{
    if( T != NULL )
        while( T->right != NULL )
            T = T->right;
    return( T );
}
Insert
To insert x into tree T, proceed down the tree. If x is found, do nothing. Otherwise,
insert x at the last spot on the path traversed.
SEARCH_TREE insert( element_type x, SEARCH_TREE T )
{
    if( T == NULL )
    {   /* Create and return a one-node tree */
        T = (SEARCH_TREE) malloc( sizeof( struct tree_node ) );
        if( T == NULL )
            fatal_error( "Out of space!!!" );
        else
        {
            T->element = x;
            T->left = T->right = NULL;
        }
    }
    else
    if( x < T->element )
        T->left = insert( x, T->left );
    else
    if( x > T->element )
        T->right = insert( x, T->right );
    /* else x is in the tree already. We'll do nothing */
    return T;
}
Delete
Once we have found the node to be deleted, we need to consider several possibilities. If the
node is a leaf, it can be deleted immediately.
If the node has one child, the node can be deleted after its parent adjusts a pointer to
bypass the node.
The complicated case is a node with two children. The general strategy is to replace the key of this node with
the smallest key of the right subtree and recursively delete that node. Because the smallest
node in the right subtree cannot have a left child, the second
delete is an easy one.
In the example, the node to be deleted is the left child of the root; the key value is 2. It is replaced
with the smallest key in its right subtree (3), and then that node is deleted as before.
Deletion of a node (4) with one child, before and after
Deletion of a node (2) with two children, before and after
Deletion routine for binary search trees
SEARCH_TREE delete( element_type x, SEARCH_TREE T )
{
    tree_ptr tmp_cell;
    if( T == NULL )
        error( "Element not found" );
    else
    if( x < T->element )    /* Go left */
        T->left = delete( x, T->left );
    else
    if( x > T->element )    /* Go right */
        T->right = delete( x, T->right );
    else    /* Found element to be deleted */
    if( T->left && T->right )    /* Two children */
    {
        /* Replace with the smallest key in the right subtree */
        tmp_cell = findmin( T->right );
        T->element = tmp_cell->element;
        T->right = delete( T->element, T->right );
    }
    else    /* One child or none */
    {
        tmp_cell = T;
        if( T->left == NULL )    /* Only a right child, or no children */
            T = T->right;
        else
        if( T->right == NULL )    /* Only a left child */
            T = T->left;
        free( tmp_cell );
    }
    return T;
}
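The find, insert, and delete routines can be collected into one small compilable unit. The version below is a sketch under assumptions rather than the notes' code: it fixes the element type to int, uses the hypothetical names struct tnode, bst_insert, bst_find, bst_find_min, and bst_delete, and silently ignores deletion of a key that is not present.

```c
#include <stdlib.h>

struct tnode {
    int element;
    struct tnode *left, *right;
};

struct tnode *bst_insert(int x, struct tnode *t)
{
    if (t == NULL) {                      /* empty spot: create a one-node tree */
        t = malloc(sizeof *t);
        if (t == NULL)
            abort();                      /* out of space */
        t->element = x;
        t->left = t->right = NULL;
    } else if (x < t->element) {
        t->left = bst_insert(x, t->left);
    } else if (x > t->element) {
        t->right = bst_insert(x, t->right);
    }                                     /* else: already present, do nothing */
    return t;
}

struct tnode *bst_find(int x, struct tnode *t)
{
    while (t != NULL && x != t->element)
        t = (x < t->element) ? t->left : t->right;
    return t;
}

struct tnode *bst_find_min(struct tnode *t)
{
    if (t != NULL)
        while (t->left != NULL)
            t = t->left;
    return t;
}

struct tnode *bst_delete(int x, struct tnode *t)
{
    if (t == NULL)
        return NULL;                      /* key not found: nothing to delete */
    if (x < t->element) {
        t->left = bst_delete(x, t->left);
    } else if (x > t->element) {
        t->right = bst_delete(x, t->right);
    } else if (t->left != NULL && t->right != NULL) {
        /* Two children: copy up the smallest key of the right subtree,
           then delete that key from the right subtree. */
        t->element = bst_find_min(t->right)->element;
        t->right = bst_delete(t->element, t->right);
    } else {
        /* Zero or one child: splice the node out. */
        struct tnode *old = t;
        t = (t->left != NULL) ? t->left : t->right;
        free(old);
    }
    return t;
}
```

Inserting 6, 2, 8, 1, 4, 3 and then deleting 2 reproduces the two-children case described above: the key 2 is replaced by 3, and the old node holding 3 is removed.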
Average-CaseAnalysisofBST
All of the operations of the previous section, except makeempty, should take O(log n)
time, because in constant time we descend a level in the tree, thus operating on a tree
that is now roughly half as large.
The running time of all the operations, except makeempty is O(d), where d is the
depth of the node containing the accessed key.
The average depth over all nodes in a tree is O(log n).
The sum of the depths of all nodes in a tree is known as the internal path length.
AVL Trees
One alternative is to give up the balance condition and allow the tree to be arbitrarily deep, but after every
operation, a restructuring rule is applied that tends to make future operations efficient. These
types of data structures are generally classified as self-adjusting.
An AVL tree is identical to a binary search tree, except that for every node in the tree,
the height of the left and right subtrees can differ by at most 1. (The height of an empty tree is
defined to be -1.)
An AVL (Adelson-Velskii and Landis) tree is a binary search tree with a balance
condition. The simplest idea is to require that the left and right subtrees have the same height;
the AVL condition relaxes this to a difference of at most 1.
The balance condition must be easy to maintain, and it ensures that the depth of the tree is
O(log n).
A violation of the AVL property due to insertion can be repaired by a simple modification at
the node α. This modification is called a rotation.
Types of rotation
1. Single Rotation
2. Double Rotation
The two trees in the above figure contain the same elements and are both binary
search trees.
First of all, in both trees k1 < k2. Second, all elements in the subtree X are smaller
than k1 in both trees. Third, all elements in subtree Z are larger than k2. Finally, all elements
in subtree Y are in between k1 and k2. The conversion of one of the above trees to the other is
known as a rotation.
In an AVL tree, if an insertion causes some node to lose the balance
property, do a rotation at that node.
The basic algorithm is to start at the node inserted and travel up the tree, updating the balance
information at every node on the path.
In the above figure, after the insertion of the new key into the original AVL tree on the left, node 8
becomes unbalanced. Thus, we do a single rotation between 7 and 8, obtaining the tree on the
right.
Routine:
static Position singlerotatewithleft( Position k2 )
{
    Position k1;
    k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    /* A node's height is one more than that of its taller subtree */
    k2->height = max( height( k2->left ), height( k2->right ) ) + 1;
    k1->height = max( height( k1->left ), k2->height ) + 1;
    return k1;
}
Single Rotation (case 4) – Single rotate with right
(Refer diagram from Class note)
Suppose we start with an initially empty AVL tree and insert the keys 1 through 7 in
sequential order. The first problem occurs when it is time to insert key 3, because the AVL
property is violated at the root. We perform a single rotation between the root and its right
child to fix the problem. The tree is shown in the following figure, before and after the
rotation.
A dashed line indicates the two nodes that are the subject of the rotation. Next, we insert the
key 4, which causes no problems, but the insertion of 5 creates a violation at node 3, which is
fixed by a single rotation.
Next, we insert 6. This causes a balance problem for the root, since its left subtree
is of height 0, and its right subtree would be of height 2. Therefore, we perform a single rotation at the
root between 2 and 4.
The rotation is performed by making 2 a child of 4 and making 4's original left subtree the new
right subtree of 2. Every key in this subtree must lie between 2 and 4, so this transformation
makes sense. The next key we insert is 7, which causes another rotation.
Routine:
static Position singlerotatewithright( Position k1 )
{
    Position k2;
    k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    /* A node's height is one more than that of its taller subtree */
    k1->height = max( height( k1->left ), height( k1->right ) ) + 1;
    k2->height = max( height( k2->right ), k1->height ) + 1;
    return k2;
}
Double Rotation
(Right-left) double rotation
(Left-right) double rotation
In the above diagram, suppose we insert keys 8 through 15 in reverse order. Inserting 15 is
easy, since it does not destroy the balance property, but inserting 14 causes a height
imbalance at node 7.
As the diagram shows, the single rotation has not fixed the height imbalance. The problem is
that the height imbalance was caused by a node inserted into the tree containing the middle
elements (tree Y in Fig. (Right-left) double rotation) at the same time as the other trees had
identical heights. The fix is called a double rotation, which is similar to a single
rotation but involves four subtrees instead of three.
In our example, the double rotation is a right-left double rotation and involves 7, 15,
and 14. Here, k3 is the node with key 7, k1 is the node with key 15, and
k2 is the node with key 14.
Next we insert 13, which requires a double rotation. Here the double rotation is again a
right-left double rotation that will involve 6, 14, and 7 and will restore the tree. In this case,
k3 is the node with key 6, k1 is the node with key 14, and k2 is the node with key 7. Subtree
A is the tree rooted at the node with key 5, subtree B is the empty subtree that was originally
the left child of the node with key 7, subtree C is the tree rooted at the node with key 13, and
finally, subtree D is the tree rooted at the node with key 15.
If 12 is now inserted, there is an imbalance at the root. Since 12 is not between
4 and 7, we know that the single rotation will work. Insertion of 11 will require a single
rotation:
To insert 10, a single rotation needs to be performed, and the same is true for the
subsequent insertion of 9. We insert 8 without a rotation, creating the almost perfectly
balanced tree.
Routine for double rotation with left (Case 2)
static Position doublerotatewithleft( Position k3 )
{
    /* Rotate between k1 and k2, then between k3 and k2 */
    k3->left = singlerotatewithright( k3->left );
    return singlerotatewithleft( k3 );
}
Routine for double rotation with right (Case 3)
static Position doublerotatewithright( Position k1 )
{
    /* Rotate between k3 and k2, then between k1 and k2 */
    k1->right = singlerotatewithleft( k1->right );
    return singlerotatewithright( k1 );
}
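The notes give the rotations but not the insertion routine that decides which one to apply. The following is one possible sketch, with assumptions stated plainly: keys are int, the names struct avl and avl_insert are hypothetical, and the height of an empty tree is -1 as defined above. After a recursive insertion into a subtree, the heights of the two subtrees are compared. A difference of 2 triggers a single rotation if the new key went to the outside of the taller subtree, and a double rotation if it went to the inside.

```c
#include <stdlib.h>

struct avl {
    int key, height;
    struct avl *left, *right;
};

static int height(const struct avl *t) { return t ? t->height : -1; }
static int max_int(int a, int b) { return a > b ? a : b; }
static void fix_height(struct avl *t)
{
    t->height = max_int(height(t->left), height(t->right)) + 1;
}

static struct avl *rotate_with_left(struct avl *k2)
{
    struct avl *k1 = k2->left;
    k2->left = k1->right;
    k1->right = k2;
    fix_height(k2);                 /* k2 is now below k1, so fix it first */
    fix_height(k1);
    return k1;
}

static struct avl *rotate_with_right(struct avl *k1)
{
    struct avl *k2 = k1->right;
    k1->right = k2->left;
    k2->left = k1;
    fix_height(k1);
    fix_height(k2);
    return k2;
}

static struct avl *double_with_left(struct avl *k3)
{
    k3->left = rotate_with_right(k3->left);
    return rotate_with_left(k3);
}

static struct avl *double_with_right(struct avl *k1)
{
    k1->right = rotate_with_left(k1->right);
    return rotate_with_right(k1);
}

struct avl *avl_insert(int x, struct avl *t)
{
    if (t == NULL) {
        t = malloc(sizeof *t);
        if (t == NULL)
            abort();                /* out of space */
        t->key = x;
        t->height = 0;
        t->left = t->right = NULL;
        return t;
    }
    if (x < t->key) {
        t->left = avl_insert(x, t->left);
        if (height(t->left) - height(t->right) == 2)
            t = (x < t->left->key) ? rotate_with_left(t) : double_with_left(t);
    } else if (x > t->key) {
        t->right = avl_insert(x, t->right);
        if (height(t->right) - height(t->left) == 2)
            t = (x > t->right->key) ? rotate_with_right(t) : double_with_right(t);
    }                               /* equal keys are ignored */
    fix_height(t);
    return t;
}
```

Inserting 1 through 7 in order with this sketch yields a perfectly balanced tree rooted at 4, matching the sequence of single rotations traced above.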
PRIORITY QUEUES (HEAPS)
A priority queue is a queue in which the elements are dequeued based on their
priority.
A priority queue is used in the following situations:
Jobs sent to a line printer are generally placed on a queue. One job might
be particularly important, so that it might be desirable to allow that job to be run as
soon as the printer is available.
In a multiuser environment, the operating system scheduler must decide which of
several processes to run. Generally a process is only allowed to run for a fixed period
of time. One algorithm uses a queue. Jobs are initially placed at the end of the queue.
The scheduler will repeatedly take the first job on the queue, run it until either it
finishes or its time limit is up, and place it at the end of the queue. This strategy is
generally not appropriate, because very short jobs will seem to take a long time
because of the wait involved to run. Generally, it is important that short jobs finish as
fast as possible. Running the shortest job first is called Shortest Job First (SJF). This particular application
seems to require a special kind of queue, known as a priority queue.
Basic model of a priority queue
A priority queue is a data structure that allows at least the following two operations:
1. Insert, the equivalent of enqueue.
2. Deletemin, which removes the minimum element in the heap; it is the equivalent of the
queue's dequeue operation.
Implementations of Priority Queue
1. Array implementation
2. Linked list implementation
3. Binary search tree implementation
4. Binary heap implementation
Array Implementation
Drawbacks:
1. Memory is wasted, because the maximum size of the array must
be defined in advance.
2. Insertion takes O(N) time if the array is kept in sorted order, because elements must be shifted to make room.
3. Delete_min will also take O(N) time.
Linked list Implementation
It overcomes the first two problems of the array implementation. But the delete_min operation takes
O(N) time, similar to the array implementation.
Binary Search Tree implementation
Another way of implementing priority queues would be to use a binary search tree.
This gives an O(log n) average running time for both operations.
Binary Heap Implementation
Another way of implementing priority queues would be to use a binary heap. This
gives an O(log n) worst-case running time for both operations; insert takes O(1) time on average.
Binary Heap
Like binary search trees, heaps have two properties, namely, a structure property and a
heap order property. As with AVL trees, an operation on a heap can destroy one of the
properties, so a heap operation must not terminate until all heap properties are in order.
1. Structure Property
2. Heap Order Property
Structure Property
A heap is a binary tree that is completely filled, with the possible exception of the
bottom level, which is filled from left to right. Such a tree is known as a complete binary
tree.
A complete Binary Tree
A complete binary tree of height h has between 2^h and 2^(h+1) - 1 nodes. This
implies that the height of a complete binary tree is ⌊log n⌋, which is clearly O(log n).
Array implementation of complete binary tree
Note:
For any element in array position i, the left child is in position 2i, the right child is in
the cell after the left child (2i + 1), and the parent is in position
⌊i/2⌋.
The only problem with this implementation is that an estimate of the maximum heap
size is required in advance.
Types of Binary Heap
Min Heap
A binary heap is said to be a min heap if, for any node X in the heap, the key
value of X is smaller than (or equal to) the keys of all of its descendants.
Max Heap
A binary heap is said to be a max heap if, for any node X in the heap, the key
value of X is larger than (or equal to) the keys of all of its descendants.
Since we want to be able to find the minimum quickly, it makes sense that the smallest element should
be at the root. If we consider that any subtree should also be a heap, then any node should be
smaller than all of its descendants.
Applying this logic, we arrive at the heap order property. In a heap, for every node X
other than the root, the key in the parent of X is smaller than (or equal to) the key in X.
Similarly we can declare a (max) heap, which enables us to efficiently find and
remove the maximum element, by changing the heap order property. Thus, a priority queue
can be used to find either a minimum or a maximum.
By the heap order property, the minimum element can always be found at the root.
Declaration for priority queue
struct heapstruct
{
    int capacity;
    int size;
    element_type *elements;
};
typedef struct heapstruct *priorityQ;
Create routine of priority queue
priorityQ create( int max_elements )
{
    priorityQ H;
    if( max_elements < MIN_PQ_SIZE )
        error( "Priority queue size is too small" );
    H = (priorityQ) malloc( sizeof( struct heapstruct ) );
    if( H == NULL )
        fatal_error( "Out of space!!!" );
    /* Allocate the array plus one extra cell for the sentinel */
    H->elements = (element_type *) malloc( ( max_elements+1 ) * sizeof( element_type ) );
    if( H->elements == NULL )
        fatal_error( "Out of space!!!" );
    H->capacity = max_elements;
    H->size = 0;
    H->elements[0] = MIN_DATA;
    return H;
}
BasicHeap Operations
It is easy to perform the two required operations. All the work involves ensuring that
the heap order property is maintained.
1. Insert
2. Deletemin
Insert
To insert an element x into the heap, we create a hole in the next available location,
since otherwise the tree will not be complete.
If x can be placed in the hole without violating heap order, then we do so and are
done. Otherwise we slide the element that is in the hole's parent node into the hole, thus
bubbling the hole up toward the root. We continue this process until x can be placed in the
hole.
The figure shows that to insert 14, we create a hole in the next available heap location.
Inserting 14 in the hole would violate the heap order property, so 31 is slid down into the
hole.
This strategy is continued until the correct location for 14 is found. This general
strategy is known as a percolate up; the new element is percolated up the heap until the
correct location is found.
We could have implemented the percolation in the insert routine by performing
repeated swaps until the correct order was established, but a swap requires three assignment
statements. If an element is percolated up d levels, the number of assignments performed by
the swaps would be 3d. Our method uses d + 1 assignments.
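Routine to insert into a binary heap
void insert( element_type x, priorityQ H )
{
    int i;
    if( is_full( H ) )
        error( "Priority queue is full" );
    else
    {
        i = ++H->size;
        /* elements[0] holds the sentinel MIN_DATA, so the loop stops at the root */
        while( H->elements[i/2] > x )
        {
            H->elements[i] = H->elements[i/2];
            i /= 2;
        }
        H->elements[i] = x;
    }
}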
If the element to be inserted is the new minimum, it will be pushed all the way to the
top. The time to do the insertion could be as much as O(log n), if the element to be inserted is
the new minimum and is percolated all the way to the root.
Deletemin
Deletemin is handled in a similar manner to insertion. Finding the minimum is easy;
the hard part is removing it.
When the minimum is removed, a hole is created at the root. Since the heap now
becomes one smaller, it follows that the last element x in the heap must move somewhere in
the heap. If x can be placed in the hole, then we are done. This is unlikely, so we slide the
smaller of the hole's children into the hole, thus pushing the hole down one level. We repeat
this step until x can be placed in the hole. This general strategy is known as a percolate
down.
In the figure, after 13 is removed, we must now try to place 31 in the heap. 31 cannot be
placed in the hole, because this would violate heap order. Thus, we place the smaller child
(14) in the hole, sliding the hole down one level. We repeat this again, placing 19 into the
hole and creating a new hole one level deeper. We then place 26 in the hole and create a new
hole on the bottom level. Finally, we are able to place 31 in the hole.
Routine to perform deletemin in a binary heap
element_type delete_min( priorityQ H )
{
    int i, child;
    element_type min_element, last_element;
    if( is_empty( H ) )
    {
        error( "Priority queue is empty" );
        return H->elements[0];
    }
    min_element = H->elements[1];
    last_element = H->elements[H->size--];
    for( i=1; i*2 <= H->size; i=child )
    {
        /* Find the smaller child */
        child = i*2;
        if( ( child != H->size ) && ( H->elements[child+1] < H->elements[child] ) )
            child++;
        /* Percolate one level */
        if( last_element > H->elements[child] )
            H->elements[i] = H->elements[child];
        else
            break;
    }
    H->elements[i] = last_element;
    return min_element;
}
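The create, insert, and delete_min routines can be combined into one self-contained unit. The version below is a sketch under assumptions: elements are int, INT_MIN plays the role of the sentinel MIN_DATA, errors are reported through return values instead of error(), and the names struct heap, heap_create, heap_insert, and heap_delete_min are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

struct heap {
    int capacity, size;
    int *elements;                 /* elements[0] holds the sentinel */
};

struct heap *heap_create(int max_elements)
{
    struct heap *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    h->elements = malloc((max_elements + 1) * sizeof *h->elements);
    if (h->elements == NULL) {
        free(h);
        return NULL;
    }
    h->capacity = max_elements;
    h->size = 0;
    h->elements[0] = INT_MIN;      /* sentinel: never larger than any key */
    return h;
}

/* Percolate up: returns 1 on success, 0 if the heap is full. */
int heap_insert(struct heap *h, int x)
{
    if (h->size == h->capacity)
        return 0;
    int i = ++h->size;             /* hole at the next available location */
    while (h->elements[i / 2] > x) {
        h->elements[i] = h->elements[i / 2];   /* slide parent down */
        i /= 2;
    }
    h->elements[i] = x;
    return 1;
}

/* Percolate down: stores the minimum in *min and returns 1, or 0 if empty. */
int heap_delete_min(struct heap *h, int *min)
{
    if (h->size == 0)
        return 0;
    *min = h->elements[1];
    int last = h->elements[h->size--];
    int i, child;
    for (i = 1; i * 2 <= h->size; i = child) {
        child = i * 2;
        if (child != h->size && h->elements[child + 1] < h->elements[child])
            child++;               /* pick the smaller child */
        if (last > h->elements[child])
            h->elements[i] = h->elements[child];
        else
            break;
    }
    h->elements[i] = last;
    return 1;
}
```

Repeatedly calling heap_delete_min returns the keys in nondecreasing order, which is a convenient way to check both routines together.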
The worst-case running time for this operation is O(log n). On average, the element
that is placed at the root is percolated almost to the bottom of the heap, so the average running time is
O(log n).
Other Heap Operations
The other heap operations are
1. Decreasekey
2. Increasekey
3. Delete
4. Buildheap
Decreasekey
The decreasekey(x, ∆, H) operation lowers the value of the key at position x by a positive
amount ∆. Since this might violate the heap order, it must be fixed by a percolate up.
USE:
This operation could be useful to system administrators: they can make their programs run
with highest priority.
Increasekey
The increasekey(x, ∆, H) operation increases the value of the key at position x by
a positive amount ∆. This is done with a percolate down.
USE:
Many schedulers automatically drop the priority of a process that is consuming excessive
CPU time.
Delete
The delete(x, H) operation removes the node at position x from the heap. This is done
by first performing decreasekey(x, ∞, H), which moves the key to the root, and then performing deletemin(H). When a process
is terminated by a user, it must be removed from the priority queue.
Buildheap
The buildheap(H) operation takes as input n keys and places them into an empty heap. This
can be done with n successive inserts. Since each insert will take O(1) average and O(log n)
worst-case time, the total running time of this algorithm would be O(n) average but O(n log
n) worst-case.
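These four operations can be sketched on a bare array that holds a min heap in positions 1 to n. The code below is an illustration under assumptions, not the notes' implementation: keys are int, position 0 is unused (no sentinel), INT_MIN stands in for the infinite decrease used by delete, and all function names are hypothetical. build_heap follows the n successive inserts described above rather than any faster method.

```c
#include <limits.h>

/* Move a[i] up until its parent is no larger. */
static void percolate_up(int a[], int i)
{
    int x = a[i];
    for (; i > 1 && a[i / 2] > x; i /= 2)
        a[i] = a[i / 2];
    a[i] = x;
}

/* Move a[i] down until both children are no smaller. */
static void percolate_down(int a[], int n, int i)
{
    int x = a[i], child;
    for (; i * 2 <= n; i = child) {
        child = i * 2;
        if (child != n && a[child + 1] < a[child])
            child++;
        if (x > a[child])
            a[i] = a[child];
        else
            break;
    }
    a[i] = x;
}

void decrease_key(int a[], int pos, int delta)
{
    a[pos] -= delta;               /* key got smaller: may need to move up */
    percolate_up(a, pos);
}

void increase_key(int a[], int n, int pos, int delta)
{
    a[pos] += delta;               /* key got larger: may need to move down */
    percolate_down(a, n, pos);
}

/* Removes the element at pos; returns the new heap size. */
int heap_delete(int a[], int n, int pos)
{
    a[pos] = INT_MIN;              /* stands in for decreasekey(pos, infinity) */
    percolate_up(a, pos);          /* the key is now at the root */
    a[1] = a[n--];                 /* deletemin: last element fills the hole */
    if (n > 0)
        percolate_down(a, n, 1);
    return n;
}

/* Builds a heap in place by n successive inserts. */
void build_heap(int a[], int n)
{
    for (int i = 2; i <= n; i++)
        percolate_up(a, i);        /* a[1..i-1] is already a heap */
}
```

Each call touches at most one root-to-leaf path, so decrease_key, increase_key, and heap_delete take O(log n) time in the worst case.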
B-Trees
AVL trees and splay trees are binary; there is a popular search tree that is not binary.
This tree is known as a B-tree.
A B-tree of order m is a tree with the following structural properties:
a. The root is either a leaf or has between 2 and m children.
b. All nonleaf nodes (except the root) have between ⌈m/2⌉ and m children.
c. All leaves are at the same depth.
All data is stored at the leaves. Contained in each interior node are pointers
p1, p2, . . . , pm to the children, and values k1, k2, . . . , km - 1, representing the smallest key
found in the subtrees p2, p3, . . . , pm respectively. Some of these pointers might be NULL,
and the corresponding ki would then be undefined.
For every node, all the keys in subtree p1 are smaller than the keys in subtree p2, and so on.
The leaves contain all the actual data, which is either the keys themselves or pointers to
records containing the keys.
Thenumberofkeys inaleafis alsobetweenm/2and m.
AnexampleofaB-tree oforder4
If we now try to insert 1 into the tree, we find that the node where it belongs is
already full. Placing our new key into this node would give it a fourth element which is not
allowed. This can be solved by making two nodes of two keys each and adjusting the
information in the parent.
If we now insert an element with key 28, we create a leaf with four keys, which is
split into two leaves of two keys each.
This creates an internal node with four children, which is then split into two nodes.
Splitting is not the only way to handle a full leaf. For example, to insert 70 into the tree above, we could move 58 to the leaf containing 41 and 52, place
70 with 59 and 61, and adjust the entries in the internal nodes.
Deletion in B-Tree
To delete a key, we first find it and remove it. If this key was one of only two keys in a node, then its removal leaves only one key.
We can fix this by combining this node with a sibling. If the sibling has three keys, we
can steal one and have both nodes with two keys.
If the sibling has only two keys, we combine the two nodes into a single node with
three keys. The parent of this node now loses a child, so we might have to percolate
this strategy all the way to the top.
If the root loses its second child, then the root is also deleted and the tree becomes one level
shallower.
For insertion, splitting is repeated up the tree until we find a parent with fewer than m children. If we split the root, we
create a new root with two children.
The depth of a B-tree is at most ⌈log_⌈m/2⌉ n⌉.
The worst-case running time for each of the insert and delete operations is thus O(m
log_m n) = O((m / log m) log n), but a find takes only O(log n).
Definitions
Graph
A graph G = (V, E) consists of a set of vertices, V, and a set of edges, E. Each edge is a
pair (v, w), where v, w ∈ V. Edges are sometimes referred to as arcs.
[Figure: a graph whose vertices are A, B, C, D and E; the lines joining them are its edges (arcs)]
Types of graph
1. Directed Graph
If the pair is ordered, then the graph is directed. In other words, if all the edges are
directionally oriented, then the graph is called a directed graph. Directed graphs are
sometimes referred to as digraphs.
Vertex w is adjacent to v if and only if (v, w) ∈ E.
[Figure: a directed graph on the vertices A, B, C, D and E]
2. Undirected Graph
If none of the edges of a graph is directionally oriented, then the graph is called an
undirected graph. In an undirected graph with edge (v, w), and hence (w, v), w is adjacent to v
and v is adjacent to w.
[Figure: an undirected graph on the vertices 1, 2, 3, 4 and 5]
3. Mixed Graph
If some edges of a graph are directionally oriented and others are not, then it is
called a mixed graph.
[Figure: a mixed graph on the vertices S, T, U and V]
Path
A path in a graph is a sequence of vertices w1, w2, w3, ..., wn such that (wi, wi+1) ∈
E for 1 ≤ i < n.
Path length
The length of a path is the number of edges on the path, which is equal to n - 1, where
n is the number of vertices on the path.
Loop
A path may run from a vertex to itself; if this path contains no edges, then the path length is 0.
If the graph contains an edge (v, v) from a vertex to itself, then the path v, v is sometimes
referred to as a loop.
Simple Path
A simple path is a path such that all vertices are distinct, except that the first and last
could be the same.
[Figure: a graph on the vertices A, B, C, D and E]
Example: A->C->D->E
Cycle
In a graph, if a path of length at least 1 starts and ends at the same vertex, then it is known as a cycle.
Example: A->C->D->E->A
Cyclic Graph
A directed graph is said to be a cyclic graph if it contains a cycle.
Acyclic Graph
A directed graph is acyclic if it has no cycles. A directed acyclic graph is also referred to as a
DAG.
Connected Graph
An undirected graph is connected if there is a path from every vertex to every other
vertex.
Strongly Connected Graph
A directed graph is called strongly connected if there is a path from every vertex to every
other vertex.
Weakly Connected Graph
If a directed graph is not strongly connected, but the underlying graph (without
direction to the arcs) is connected, then the graph is said to be weakly connected.
Complete Graph
A complete graph is a graph in which there is an edge between every pair of vertices.
Weighted Graph
If a value (its weight or cost, typically a positive number) is assigned to
every edge of a graph, then it is known as a weighted graph, also called a network.
An example of a real-life situation that can be modeled by a graph is the airport
system. Each airport is a vertex, and two vertices are connected by an edge if there is a
nonstop flight from the airports that are represented by the vertices. The edge could have a
weight, representing the time, distance, or cost of the flight.
Indegree and Outdegree
Indegree: The number of edges entering (coming towards) a vertex is called its indegree.
Outdegree: The number of edges exiting (going out from) a vertex is called its outdegree.
Degree: The number of edges incident on a vertex is called the degree of the vertex.
Degree = Indegree + Outdegree
Source/Start Vertex: A vertex whose indegree is zero is called a source vertex.
Sink/Destination Vertex: A vertex whose outdegree is zero is called a sink vertex.
Representation of Graphs
1. Adjacency matrix / Incidence matrix
2. Adjacency linked list / Incidence linked list
Adjacency matrix
We will consider directed graphs. (Fig. 1)
We number the vertices, starting at 1. The graph shown in the above figure has 7
vertices and 12 edges.
One simple way to represent a graph is to use a two-dimensional array. This is known
as an adjacency matrix representation.
For each edge (u, v), we set a[u][v] = 1; otherwise the entry in the array is 0. If the edge has a
weight associated with it, then we can set a[u][v] equal to the weight and use either a very
large or a very small weight as a sentinel to indicate nonexistent edges.
The advantage is that it is extremely simple; the drawback is that the space requirement is Θ(|V|^2),
which is wasteful when the graph has few edges.
For a directed graph
A[u][v] = 1, if there is an edge from u to v
        = 0, otherwise
For an undirected graph
A[u][v] = 1, if there is an edge between u and v
        = 0, otherwise
For a weighted graph
A[u][v] = value, if there is an edge from u to v
        = ∞, if there is no edge between u and v
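A small sketch of the weighted adjacency matrix follows. The names NV, NO_EDGE and the matrix_* functions are illustrative assumptions; INT_MAX plays the role of the ∞ sentinel for a missing edge, and vertices are numbered from 0 rather than 1.

```c
#include <assert.h>
#include <limits.h>

#define NV 4                  /* assumed number of vertices */
#define NO_EDGE INT_MAX       /* sentinel standing in for "infinity" */

/* Mark every entry as "no edge". */
void matrix_init(int a[NV][NV]) {
    for (int u = 0; u < NV; u++)
        for (int v = 0; v < NV; v++)
            a[u][v] = NO_EDGE;
}

/* Record the directed edge (u, v) with weight w. */
void matrix_add_edge(int a[NV][NV], int u, int v, int w) {
    assert(u >= 0 && u < NV && v >= 0 && v < NV);
    a[u][v] = w;
}

/* Count the edges leaving u; this scans a full row, so it costs O(|V|). */
int matrix_outdegree(int a[NV][NV], int u) {
    int d = 0;
    for (int v = 0; v < NV; v++)
        if (a[u][v] != NO_EDGE)
            d++;
    return d;
}
```

For an undirected graph, one would call matrix_add_edge twice, once for (u, v) and once for (v, u).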
Adjacency lists
Adjacency lists are the standard way to represent graphs. For each vertex, we keep a list of
all adjacent vertices, so the space requirement is O(|E| + |V|). Undirected graphs can be
similarly represented; each edge (u, v) appears in two lists, so the space usage essentially
doubles. A common requirement in graph algorithms is to find all vertices adjacent to some
given vertex v, and this can be done, in time proportional to the number of such vertices
found, by a simple scan down the appropriate adjacency list.
[Figure: an adjacency list representation of a graph (see above fig 5.1)]
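As a rough illustration of the adjacency list idea, the sketch below keeps one singly linked list per vertex. The type and function names (AdjNode, ListGraph, lg_*) are hypothetical, and the sketch simply aborts if memory runs out.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct AdjNode {
    int vertex;
    struct AdjNode *next;
} AdjNode;

typedef struct {
    int nv;
    AdjNode **head;   /* head[u] is the list of vertices adjacent to u */
} ListGraph;

ListGraph *lg_create(int nv) {
    assert(nv > 0);
    ListGraph *g = malloc(sizeof *g);
    if (g == NULL) abort();
    g->nv = nv;
    g->head = calloc((size_t)nv, sizeof *g->head);   /* all lists start empty */
    if (g->head == NULL) abort();
    return g;
}

/* Add the directed edge (u, v) at the front of u's list. */
void lg_add_edge(ListGraph *g, int u, int v) {
    assert(u >= 0 && u < g->nv && v >= 0 && v < g->nv);
    AdjNode *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->vertex = v;
    n->next = g->head[u];
    g->head[u] = n;
}

/* Scan u's list; the cost is proportional to the number of neighbours. */
int lg_outdegree(const ListGraph *g, int u) {
    int d = 0;
    for (const AdjNode *p = g->head[u]; p != NULL; p = p->next)
        d++;
    return d;
}

void lg_destroy(ListGraph *g) {
    for (int u = 0; u < g->nv; u++) {
        AdjNode *p = g->head[u];
        while (p != NULL) {
            AdjNode *next = p->next;
            free(p);
            p = next;
        }
    }
    free(g->head);
    free(g);
}
```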
Topological Sort
A topological sort is an ordering of vertices in a directed acyclic graph, such that if there is a
path from vi to vj, then vj appears after vi in the ordering.
It is clear that a topological ordering is not possible if the graph has a cycle, since for
two vertices v and w on the cycle, v precedes w and w precedes v.
[Figure: a directed acyclic graph on the vertices v1 to v7]
In the above graph, v1, v2, v5, v4, v3, v7, v6 and v1, v2, v5, v4, v7, v3, v6
are both topological orderings.
Indegree of each vertex before each dequeue
Vertex     1    2    3    4    5    6    7
v1         0    0    0    0    0    0    0
v2         1    0    0    0    0    0    0
v3         2    1    1    1    0    0    0
v4         3    2    1    0    0    0    0
v5         1    1    0    0    0    0    0
v6         3    3    3    3    2    1    0
v7         2    2    2    1    0    0    0
Enqueue    v1   v2   v5   v4   v3   v7   v6
Dequeue    v1   v2   v5   v4   v3   v7   v6
Simple Topological Ordering Routine
void topsort( graph G )
{
    unsigned int counter;
    vertex v, w;
    for( counter = 0; counter < NUM_VERTEX; counter++ )
    {
        v = find_new_vertex_of_indegree_zero( );
        if( v == NOT_A_VERTEX )    /* comparison, not assignment */
        {
            error("Graph has a cycle");
            break;
        }
        top_num[v] = counter;
        for each w adjacent to v
            indegree[w]--;
    }
}
Explanation
The function find_new_vertex_of_indegree_zero scans the indegree array looking for a
vertex with indegree 0 that has not already been assigned a topological number. It returns
NOT_A_VERTEX if no such vertex exists; this indicates that the graph has a cycle.
Routine to perform Topological Sort
void topsort( graph G )
{
    QUEUE Q;
    unsigned int counter;
    vertex v, w;
    Q = create_queue( NUM_VERTEX );
    makeempty( Q );
    counter = 0;
    for each vertex v
        if( indegree[v] == 0 )
            enqueue( v, Q );
    while( !isempty( Q ) )
    {
        v = dequeue( Q );
        top_num[v] = ++counter;    /* assign next number */
        for each w adjacent to v
            if( --indegree[w] == 0 )
                enqueue( w, Q );
    }
    if( counter != NUM_VERTEX )
        error("Graph has a cycle");
    dispose_queue( Q );            /* free the memory */
}
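The queue-based routine can be made runnable as in the sketch below. It is only an illustration under assumptions: the graph is held in an adjacency matrix, MAXV is an arbitrary bound, vertices are numbered from 0, and a plain array serves as the queue.

```c
#include <assert.h>

#define MAXV 16   /* assumed upper bound on the number of vertices */

/* Fills order[0..n-1] with a topological ordering and returns 1,
   or returns 0 if the graph has a cycle. adj[v][w] != 0 means edge (v, w). */
int topsort(int n, int adj[MAXV][MAXV], int order[]) {
    int indegree[MAXV] = {0};
    int queue[MAXV], head = 0, tail = 0, count = 0;

    assert(n > 0 && n <= MAXV);
    for (int v = 0; v < n; v++)
        for (int w = 0; w < n; w++)
            if (adj[v][w])
                indegree[w]++;

    for (int v = 0; v < n; v++)          /* enqueue every vertex of indegree 0 */
        if (indegree[v] == 0)
            queue[tail++] = v;

    while (head < tail) {
        int v = queue[head++];
        order[count++] = v;
        for (int w = 0; w < n; w++)
            if (adj[v][w] && --indegree[w] == 0)
                queue[tail++] = w;
    }
    return count == n;                   /* fewer than n outputs means a cycle */
}
```

With an assumed edge set chosen to match the indegree table above (v1 to v2, v3, v4; v2 to v4, v5; v3 to v6; v4 to v3, v6, v7; v5 to v4, v7; v7 to v6), this sketch produces the ordering v1, v2, v5, v4, v3, v7, v6.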
Graph Traversal:
Visiting each and every vertex in the graph exactly once is called graph traversal.
There are two types of graph traversal.
1. Depth First Traversal/Search (DFS)
2. Breadth First Traversal/Search (BFS)
Depth First Traversal/Search (DFS)
Depth-first search is a generalization of preorder traversal. Starting at some vertex, v,
we process v and then recursively traverse all vertices adjacent to v. If this process is
performed on a tree, then all tree vertices are systematically visited in a total of O(|E|) time,
since |E| = Θ(|V|).
We need to be careful to avoid cycles. To do this, when we visit a vertex v, we mark it visited,
since now we have been there, and recursively call depth-first search on all adjacent vertices
that are not already marked.
The two key points of depth-first search
1. If an edge leads from the current node to an unvisited node, walk across the edge - exploring
the edge
2. If no edge leads from the current node to an unvisited node, return to the
previous node where we have been before - backtracking
Procedure for DFS
Starting at some vertex V, we process V and then recursively traverse all the vertices adjacent to
V. This process continues until all the vertices are processed. When a vertex has no unvisited
neighbours left, the search backtracks to the previous vertex and continues from there. If vertex W is first visited from V,
then the edge (V, W) is a tree edge. Edges that are not included in the tree
are represented by back edges. At the end of this process, it will have constructed a tree called the
DFS tree.
Routine to perform a depth-first search
void dfs( vertex v )
{
    visited[v] = TRUE;
    for each w adjacent to v
        if( !visited[w] )
            dfs( w );
}
[Figure: an undirected graph on the vertices A, B, C, D and E]
Steps to construct the depth-first spanning tree
a. We start at vertex A. Then we mark A as visited and call dfs(B) recursively. dfs(B)
marks B as visited and calls dfs(C) recursively.
b. dfs(C) marks C as visited and calls dfs(D) recursively.
c. dfs(D) sees both A and B, but both these are marked, so no recursive calls are made.
dfs(D) also sees that C is adjacent but marked, so no recursive call is made there, and
dfs(D) returns back to dfs(C).
d. dfs(C) sees B adjacent, ignores it, finds a previously unseen vertex E adjacent, and
thus calls dfs(E).
e. dfs(E) marks E, ignores A and C, and returns to dfs(C).
f. dfs(C) returns to dfs(B). dfs(B) ignores both A and D and returns.
g. dfs(A) ignores both D and E and returns.
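The walk-through above can be reproduced with a small runnable version of dfs. This is a sketch under assumptions: the graph is stored in an adjacency matrix, the extra parameters (visited, order, count) are added here only to record the visiting order, and the edge set A-B, A-D, A-E, B-C, B-D, C-D, C-E is inferred from the steps, since the figure itself is not available.

```c
#include <assert.h>

#define MAXV 16   /* assumed upper bound on the number of vertices */

/* Recursive depth-first search from v on an n-vertex graph.
   visited[] must be zeroed and *count set to 0 by the caller;
   order[] receives the vertices in the order they are first visited. */
void dfs(int n, int adj[MAXV][MAXV], int v,
         int visited[], int order[], int *count) {
    assert(v >= 0 && v < n);
    visited[v] = 1;
    order[(*count)++] = v;
    for (int w = 0; w < n; w++)
        if (adj[v][w] && !visited[w])
            dfs(n, adj, w, visited, order, count);
}
```

With A to E numbered 0 to 4 and neighbours tried in alphabetical order, the assumed graph is visited in the order A, B, C, D, E, matching steps a to g.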
[Figure: depth-first search of the graph; solid lines are tree edges and dashed arrows are back edges]
The root of the tree is A, the first vertex visited. Each edge (v, w) in the graph is present in
the tree. If, when we process (v, w), we find that w is unmarked, or if, when we process (w,
v), we find that v is unmarked, we indicate this with a tree edge.
If when we process (v, w), we find that w is already marked, and when processing (w, v), we
find that v is already marked, we draw a dashed line, which we will call a back edge, to
indicate that this "edge" is not really part of the tree.
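Breadth First Traversal (BFS)
Breadth-first traversal starts from some vertex v and processes all the vertices adjacent to v.
After all the adjacent vertices are processed, one of them is selected and its unvisited adjacent
vertices are processed in turn, and so on, level by level. Vertices waiting to be expanded are kept in a FIFO queue.
If some vertex is still unvisited when the queue becomes empty, the traversal is restarted from that vertex.
Routine:
void BFS( vertex v )
{
    visited[v] = true;
    enqueue( v, Q );
    while( !isempty( Q ) )
    {
        v = dequeue( Q );
        for each w adjacent to v
            if( !visited[w] )
            {
                visited[w] = true;
                enqueue( w, Q );
            }
    }
}
[Figure: example BFS of the above graph, on the vertices A, B, E, D and C]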
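A runnable breadth-first traversal needs an explicit FIFO queue. The sketch below is illustrative only: it assumes an adjacency matrix, uses a plain array as the queue, and the names bfs and MAXV are not from the notes.

```c
#include <assert.h>

#define MAXV 16   /* assumed upper bound on the number of vertices */

/* Breadth-first traversal from start; order[] receives the vertices in
   visiting order and the number of vertices reached is returned. */
int bfs(int n, int adj[MAXV][MAXV], int start, int order[]) {
    int visited[MAXV] = {0};
    int queue[MAXV], head = 0, tail = 0, count = 0;

    assert(n > 0 && n <= MAXV && start >= 0 && start < n);
    visited[start] = 1;
    queue[tail++] = start;
    while (head < tail) {
        int v = queue[head++];             /* dequeue the oldest vertex */
        order[count++] = v;
        for (int w = 0; w < n; w++)
            if (adj[v][w] && !visited[w]) {
                visited[w] = 1;            /* mark when enqueued, not when dequeued */
                queue[tail++] = w;
            }
    }
    return count;
}
```

On an assumed undirected graph with edges A-B, A-D, A-E, B-C, B-D, C-D, C-E (A to E numbered 0 to 4), the traversal from A visits A, B, D, E, C: all neighbours of A first, then the vertex two edges away.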
Difference between DFS & BFS
S. No   DFS                                        BFS
1       Backtracking is possible from a            Backtracking is not possible.
        dead end.
2       Vertices from which exploration is         The vertices to be explored are
        incomplete are processed in a LIFO         organized as a FIFO queue.
        order.
3       Search is done in one particular           The vertices in the same level are
        direction at a time.                       maintained in parallel (left to right,
                                                   alphabetical ordering).
4       [Figure: example graph]                    [Figure: example graph]
        Order of traversal: A B C D E              Order of traversal: A B C D E F G H
Bi-connectivity / Biconnected Graph:
A connected undirected graph is biconnected if there are no vertices whose
removal disconnects the rest of the graph. If a graph is not biconnected, the vertices whose
removal would disconnect it are known as articulation points.
A depth-first search numbers the vertices in the order in which they are visited; this preorder number is num(v).
For every vertex v, low(v) is the lowest-numbered vertex reachable from v by taking zero or more
tree edges and then possibly one back edge. It is the minimum of
1. num(v)
2. the lowest num(w) among all back edges (v, w)
3. the lowest low(w) among all tree edges (v, w)
The depth-first search tree in the above Figure shows the preorder number first, and then the
lowest-numbered vertex reachable under the rule described above.
The lowest-numbered vertex reachable by A, B, and C is vertex 1 (A), because they can all
take tree edges to D and then one back edge back to A. The low values of all other
vertices are found in the same way.
[Figure: depth-first tree that results if depth-first search starts at C]
To find articulation points:
The root is an articulation point if and only if it has more than one child, because if it
has two children, removing the root disconnects nodes in different subtrees, and if it
has only one child, removing the root merely disconnects the root.
Any other vertex v is an articulation point if and only if v has some child w such that
low(w) >= num(v). Notice that this condition is always satisfied at the root, which is why the root needs a special test.
We examine the articulation points that the algorithm determines, namely C and D. D has a
child E, and low(E) >= num(D), since both are 4. Thus, there is only one way for E to get to
any node above D, and that is by going through D.
Similarly, C is an articulation point, because low(G) >= num(C).
Routine to assign num to vertices
void assignnum( vertex v )
{
    vertex w;
    num[v] = counter++;
    visited[v] = TRUE;
    for each w adjacent to v
        if( !visited[w] )
        {
            parent[w] = v;
            assignnum( w );
        }
}
Routine to compute low and to test for articulation
void assignlow( vertex v )
{
    vertex w;
    low[v] = num[v];                           /* Rule 1 */
    for each w adjacent to v
    {
        if( num[w] > num[v] )                  /* forward edge */
        {
            assignlow( w );
            if( low[w] >= num[v] )
                printf( "%v is an articulation point\n", v );
            low[v] = min( low[v], low[w] );    /* Rule 3 */
        }
        else
            if( parent[v] != w )               /* back edge */
                low[v] = min( low[v], num[w] );    /* Rule 2 */
    }
}
Testing for articulation points in one depth-first search (test for the root is omitted)
void findart( vertex v )
{
    vertex w;
    visited[v] = TRUE;
    low[v] = num[v] = counter++;               /* Rule 1 */
    for each w adjacent to v
    {
        if( !visited[w] )                      /* forward edge */
        {
            parent[w] = v;
            findart( w );
            if( low[w] >= num[v] )
                printf( "%v is an articulation point\n", v );
            low[v] = min( low[v], low[w] );    /* Rule 3 */
        }
        else
            if( parent[v] != w )               /* back edge */
                low[v] = min( low[v], num[w] );    /* Rule 2 */
    }
}
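A runnable sketch of the one-pass routine, with the root test added, is given below. It is an illustration under assumptions: the ArtState structure, the adjacency matrix, the is_art result array and the convention that num 0 means "unvisited" are all choices made here, not part of the notes.

```c
#include <assert.h>

#define MAXV 16   /* assumed upper bound on the number of vertices */

typedef struct {
    int n;                       /* number of vertices */
    int adj[MAXV][MAXV];         /* undirected adjacency matrix */
    int num[MAXV], low[MAXV];    /* num[v] == 0 means "not yet visited" */
    int parent[MAXV];            /* -1 for the DFS root */
    int is_art[MAXV];            /* 1 if v is an articulation point */
    int counter;
} ArtState;

static int min_int(int a, int b) { return a < b ? a : b; }

static void find_art(ArtState *s, int v) {
    int children = 0;
    s->low[v] = s->num[v] = ++s->counter;                 /* Rule 1 */
    for (int w = 0; w < s->n; w++) {
        if (!s->adj[v][w])
            continue;
        if (s->num[w] == 0) {                             /* tree edge */
            s->parent[w] = v;
            children++;
            find_art(s, w);
            if (s->parent[v] != -1 && s->low[w] >= s->num[v])
                s->is_art[v] = 1;                         /* non-root test */
            s->low[v] = min_int(s->low[v], s->low[w]);    /* Rule 3 */
        } else if (s->parent[v] != w) {                   /* back edge */
            s->low[v] = min_int(s->low[v], s->num[w]);    /* Rule 2 */
        }
    }
    if (s->parent[v] == -1 && children > 1)               /* root test */
        s->is_art[v] = 1;
}

/* Runs the search from root; assumes the graph is connected. */
void articulation_points(ArtState *s, int root) {
    assert(root >= 0 && root < s->n);
    for (int v = 0; v < s->n; v++) {
        s->num[v] = s->low[v] = 0;
        s->parent[v] = -1;
        s->is_art[v] = 0;
    }
    s->counter = 0;
    find_art(s, root);
}
```

On an assumed seven-vertex graph with edges A-B, A-D, B-C, C-D, C-G, D-E, D-F, E-F (chosen to agree with the discussion above), a search from A marks exactly C and D, and gives low(E) = num(D) = 4.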
Euler Circuits
Consider the three figures shown below. A popular puzzle is to reconstruct these
figures using a pen, drawing each line exactly once. The pen may not be lifted from the paper
while the drawing is being performed. As an extra challenge, make the pen finish at the same
point at which it started.
[Figure: three drawings]
1. The first figure can be drawn only if the starting point is the lower left- or right-hand
corner, and it is not possible to finish at the starting point.
2. The second figure is easily drawn with the finishing point the same as the starting point.
3. The third figure cannot be drawn at all within the parameters of the puzzle.
We can convert this problem to a graph theory problem by assigning a vertex to each
intersection. Then the edges can be assigned in the natural manner, as in the figure.
After this conversion, we must find a path in the graph that visits every edge exactly once. If we are to solve the
"extra challenge," then we must find a cycle that visits every edge exactly once. This graph
problem was solved in 1736 by Euler and marked the beginning of graph theory. The
problem is thus commonly referred to as an Euler path or Euler tour or Euler circuit
problem, depending on the specific problem statement.
The first observation that can be made is that an Euler circuit, which must end on its starting
vertex, is possible only if the graph is connected and each vertex has an even degree (number
of edges). This is because, on the Euler circuit, a vertex is entered and then left.
If exactly two vertices have odd degree, an Euler tour, which must visit every edge but need
not return to its starting vertex, is still possible if we start at one of the odd-degree vertices
and finish at the other.
If more than two vertices have odd degree, then an Euler tour is not possible.
It turns out that the necessary condition is also sufficient. That is, any connected graph, all of whose vertices have even degree, must have an Euler
circuit.
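The degree conditions above translate into a short check. The sketch below is only an illustration: the function name, the edge-list format and the three result constants are assumptions, and the graph is assumed to be connected, since connectivity is not tested here.

```c
#include <assert.h>

#define MAXV 16   /* assumed upper bound on the number of vertices */

enum { EULER_NONE = 0, EULER_PATH = 1, EULER_CIRCUIT = 2 };

/* Classifies a connected undirected graph given as m edges over n vertices
   by counting the vertices of odd degree. */
int euler_kind(int n, int edges[][2], int m) {
    int degree[MAXV] = {0};
    int odd = 0;

    assert(n > 0 && n <= MAXV);
    for (int i = 0; i < m; i++) {
        degree[edges[i][0]]++;
        degree[edges[i][1]]++;
    }
    for (int v = 0; v < n; v++)
        if (degree[v] % 2 != 0)
            odd++;

    if (odd == 0) return EULER_CIRCUIT;   /* every vertex has even degree */
    if (odd == 2) return EULER_PATH;      /* start at one odd vertex, end at the other */
    return EULER_NONE;
}
```

A triangle has all degrees even and therefore an Euler circuit; a three-vertex chain has two odd-degree end points and only an Euler tour; a star with three leaves has four odd-degree vertices and neither.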
As an example, consider the graph in the Figure.
The basic algorithm is to perform a depth-first search. The main problem is that we might visit a portion of the graph and return to the starting point
prematurely. If all the edges coming out of the start vertex have been used up, then part of the graph is
untraversed.
The easiest way to fix this is to find the first vertex on this path that has an untraversed edge,
and perform another depth-first search. This will give another circuit, which can be spliced
into the original. This is continued until all edges have been traversed.
Suppose we start at vertex 5, and traverse the circuit 5, 4, 10, 5. Then we are stuck, and most
of the graph is still untraversed. The situation is shown in the Figure.
We then continue from vertex 4, which still has unexplored edges. A depth-first search might
come up with the path 4, 1, 3, 7, 4, 11, 10, 7, 9, 3, 4. If we splice this path into the previous
path of 5, 4, 10, 5, then we get a new path of 5, 4, 1, 3, 7, 4, 11, 10, 7, 9, 3, 4, 10, 5. The graph that
remains after this is shown in the Figure.
The next vertex on the path that has untraversed edges is vertex 3. A possible circuit
would then be 3, 2, 8, 9, 6, 3. When spliced in, this gives the path 5, 4, 1, 3, 2, 8, 9, 6, 3, 7, 4, 11, 10,
7, 9, 3, 4, 10, 5.
The graph that remains is in the Figure.
On this path, the next vertex with an untraversed edge is 9, and the algorithm finds the circuit
9, 12, 10, 9. When this is added to the current path, a circuit of 5, 4, 1, 3, 2, 8, 9, 12, 10, 9, 6,
3, 7, 4, 11, 10, 7, 9, 3, 4, 10, 5 is obtained. As all the edges are traversed, the algorithm terminates with an
Euler circuit.
The Euler circuit for the above graph is therefore 5, 4, 1, 3, 2, 8, 9, 12, 10, 9, 6, 3, 7, 4, 11,
10, 7, 9, 3, 4, 10, 5.
Cut vertex and edges
A cut vertex is a vertex that, when removed (with its boundary edges) from a graph, creates more
components than previously in the graph.
A cut edge is an edge that, when removed (the vertices stay in place) from a graph, creates more
components than previously in the graph.
Answers
31) The cut vertex is c. There are no cut edges.
32) The cut vertices are c and d. The cut edge is (c, d).
33) The cut vertices are b, c, e and i. The cut edges are: (a, b), (b, c), (c, d), (c, e), (e, i), (i, h).
Applications of graph:
Minimum Spanning Tree
Definition:
A minimum spanning tree of an undirected graph G is a tree formed from graph edges that
connects all the vertices of G at lowest total cost. A minimum spanning tree exists if and only
if G is connected.
The number of edges in the minimum spanning tree is |V| - 1. The minimum spanning
tree is a tree because it is acyclic; it is spanning because it covers every vertex.
Application:
House wiring with a minimum length of cable reduces the cost of the wiring.
[Figure: a graph G and its minimum spanning tree]
There are two algorithms to find the minimum spanning tree
1. Prim's Algorithm
2. Kruskal's Algorithm
Kruskal's Algorithm
This greedy strategy is to continually select the edges in order of smallest weight
and accept an edge if it does not cause a cycle.
Formally, Kruskal's algorithm maintains a forest. A forest is a collection of trees.
Procedure
Initially, there are |V| single-node trees.
Adding an edge merges two trees into one.
When the algorithm terminates, there is only one tree, and this is the minimum
spanning tree.
The algorithm terminates when enough edges (|V| - 1) are accepted.
To decide whether an edge (u, v) should be accepted, the vertices of each tree are kept in a set.
If u and v are in the same set, the edge is rejected, because since they are already connected,
adding (u, v) would form a cycle.
Otherwise, the edge is accepted, and a union is performed on the two sets containing
u and v.
Action of Kruskal's algorithm on G
Edge        Weight   Action
(v1,v4)     1        Accepted
(v6,v7)     1        Accepted
(v1,v2)     2        Accepted
(v3,v4)     2        Accepted
(v2,v4)     3        Rejected
(v1,v3)     4        Rejected
(v4,v7)     4        Accepted
(v3,v6)     5        Rejected
(v5,v7)     6        Accepted
[Figure: Kruskal's algorithm after each stage]
Routine for Kruskal's algorithm
void Graph::kruskal( )
{
    int edgesaccepted = 0;
    DISJSET ds( NUMVERTEX );
    PRIORITY_QUEUE<Edge> pq( getedges( ) );
    Edge e;
    Vertex U, V;
    while( edgesaccepted < NUMVERTEX - 1 )
    {
        pq.deletemin( e );             // e = (U, V)
        Settype Uset = ds.find( U );
        Settype Vset = ds.find( V );
        if( Uset != Vset )
        {
            // accept the edge
            edgesaccepted++;
            ds.setunion( Uset, Vset );
        }
    }
}
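The routine above relies on a priority queue and a disjoint-set class that are not shown. A self-contained C sketch of the same idea follows; it is an illustration under assumptions: edges are sorted with qsort instead of being drawn from a priority queue, the disjoint sets are a plain parent array with path halving, weights are assumed non-negative, and all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXV 16   /* assumed upper bound on the number of vertices */

typedef struct { int u, v, w; } Edge;

static int parent_of[MAXV];            /* disjoint-set forest */

static int ds_find(int x) {
    while (parent_of[x] != x) {
        parent_of[x] = parent_of[parent_of[x]];   /* path halving */
        x = parent_of[x];
    }
    return x;
}

static int cmp_edge(const void *a, const void *b) {
    const Edge *x = a, *y = b;
    return (x->w > y->w) - (x->w < y->w);
}

/* Sorts edges[] in place, stores the accepted edges in mst[] and returns
   their total weight, or -1 if the graph is not connected. */
int kruskal(int n, Edge edges[], int m, Edge mst[]) {
    int accepted = 0, total = 0;

    assert(n > 0 && n <= MAXV);
    for (int v = 0; v < n; v++)
        parent_of[v] = v;              /* |V| single-node trees */
    qsort(edges, (size_t)m, sizeof edges[0], cmp_edge);

    for (int i = 0; i < m && accepted < n - 1; i++) {
        int ru = ds_find(edges[i].u);
        int rv = ds_find(edges[i].v);
        if (ru != rv) {                /* different trees: accept the edge */
            parent_of[ru] = rv;        /* union of the two sets */
            mst[accepted++] = edges[i];
            total += edges[i].w;
        }                              /* same tree: edge would form a cycle */
    }
    return accepted == n - 1 ? total : -1;
}
```

Run on the nine edges listed in the table above (v1 to v7 numbered 0 to 6), the sketch accepts six edges with total weight 1 + 1 + 2 + 2 + 4 + 6 = 16.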
Dijkstra's Algorithm
If the graph is weighted, the problem becomes harder, but we can still use the ideas from
the unweighted case.
The general method to solve the single-source shortest-path problem is known as Dijkstra's
algorithm.
Each vertex is marked as either known or unknown. A tentative distance dv is kept for each
vertex; it is the shortest path length from s to v using only known vertices as intermediates.
We also record pv, the last vertex to cause a change to dv.
Dijkstra's algorithm proceeds in stages, just like the unweighted shortest-path
algorithm. At each stage, Dijkstra's algorithm selects a vertex v, which has the smallest dv
among all the unknown vertices, and declares that the shortest path from s to v is known.
[Figure: the directed graph G]
In the above graph, assume that the start node, s, is v1. The first vertex selected is v1, with
path length 0. This vertex is marked known. Now that v1 is known, some entries need to be adjusted.
Initial configuration table
v Known dv pv
v1 0 0 0
v2 0 ∞ 0
v3 0 ∞ 0
v4 0 ∞ 0
v5 0 ∞ 0
v6 0 ∞ 0
v7 0 ∞ 0
The vertices adjacent to v1 are v2 and v4. Both these vertices get their entries adjusted, as indicated
below.
After v1 is declared known
v Known dv pv
v1 1 0 0
v2 0 2 v1
v3 0 ∞ 0
v4 0 1 v1
v5 0 ∞ 0
v6 0 ∞ 0
v7 0 ∞ 0
Next, v4 is selected and marked known. Vertices v3, v5, v6, and v7 are adjacent, and all of them get their entries adjusted.
After v4 is declared known
v Known dv pv
v1 1 0 0
v2 0 2 v1
v3 0 3 v4
v4 1 1 v1
v5 0 3 v4
v6 0 9 v4
v7 0 5 v4
Next, v2 is selected and marked known; no distances change.
After v2 is declared known
v Known dv pv
v1 1 0 0
v2 1 2 v1
v3 0 3 v4
v4 1 1 v1
v5 0 3 v4
v6 0 9 v4
v7 0 5 v4
The next vertex selected is v5 at cost 3. v7 is the only adjacent vertex, but it is not
adjusted, because 3 + 6 > 5. Then v3 is selected, and the distance for v6 is adjusted down to 3
+ 5 = 8.
After v5 and v3 are declared known
v Known dv pv
v1 1 0 0
v2 1 2 v1
v3 1 3 v4
v4 1 1 v1
v5 1 3 v4
v6 0 8 v3
v7 0 5 v4
Next v7 is selected; v6 gets updated down to 5 + 1 = 6. The resulting table is
After v7 is declared known
v Known dv pv
v1 1 0 0
v2 1 2 v1
v3 1 3 v4
v4 1 1 v1
v5 1 3 v4
v6 0 6 v7
v7 1 5 v4
Finally, v6 is selected and marked known, and the algorithm terminates.
After v6 is declared known
v Known dv pv
v1 1 0 0
v2 1 2 v1
v3 1 3 v4
v4 1 1 v1
v5 1 3 v4
v6 1 6 v7
v7 1 5 v4
Vertex class for Dijkstra's algorithm
struct Vertex
{
    List adj;          /* adjacency list */
    Bool known;
    disttype dist;
    Vertex path;
};
#define NOTAVERTEX 0
Routine for Dijkstra's algorithm
void Graph::dijkstra( Vertex s )
{
    for each vertex v
    {
        v.dist = INFINITY;
        v.known = FALSE;
    }
    s.dist = 0;
    for( ; ; )
    {
        v = smallest unknown distance vertex;
        if( v == NOTAVERTEX )
            break;
        v.known = TRUE;
        for each w adjacent to v
            if( !w.known )
                if( v.dist + Cv,w < w.dist )
                {
                    decrease( w.dist to v.dist + Cv,w );
                    w.path = v;
                }
    }
}
Routine to print the actual shortest path
void Graph::printpath( Vertex v )
{
    if( v.path != NOTAVERTEX )
    {
        printpath( v.path );
        cout << " to ";
    }
    cout << v;
}
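The pseudocode for Dijkstra's algorithm can be turned into a compact C sketch. It is illustrative only and rests on assumptions: the graph is a cost matrix in which 0 means "no edge", all weights are positive, the smallest unknown vertex is found by a linear scan (so the running time is O(|V|^2)), and vertices are numbered from 0.

```c
#include <assert.h>
#include <limits.h>

#define MAXV 16        /* assumed upper bound on the number of vertices */
#define INF INT_MAX    /* stands in for an infinite tentative distance */

/* Single-source shortest paths from s. On return dist[v] is the path length
   (INF if unreachable) and path[v] the previous vertex (-1 if none). */
void dijkstra(int n, int cost[MAXV][MAXV], int s, int dist[], int path[]) {
    int known[MAXV] = {0};

    assert(n > 0 && n <= MAXV && s >= 0 && s < n);
    for (int v = 0; v < n; v++) {
        dist[v] = INF;
        path[v] = -1;
    }
    dist[s] = 0;

    for (;;) {
        int v = -1;                    /* smallest unknown distance vertex */
        for (int u = 0; u < n; u++)
            if (!known[u] && dist[u] != INF && (v == -1 || dist[u] < dist[v]))
                v = u;
        if (v == -1)
            break;                     /* no reachable unknown vertex is left */
        known[v] = 1;

        for (int w = 0; w < n; w++)
            if (cost[v][w] > 0 && !known[w] && dist[v] + cost[v][w] < dist[w]) {
                dist[w] = dist[v] + cost[v][w];
                path[w] = v;
            }
    }
}
```

Using only the edges mentioned in the walk-through (v1 to v2 with cost 2 and to v4 with cost 1; v4 to v3, v5, v6, v7 with costs 2, 2, 8, 4; v5 to v7 with cost 6; v3 to v6 with cost 5; v7 to v6 with cost 1; the original figure may contain further edges), the sketch yields the final table: distances 0, 2, 3, 1, 3, 6, 5 for v1 to v7.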
Routine for Hash Function
INDEX hash( char *key, int tablesize )
{
    unsigned int hash_val = 0;
    while( *key != '\0' )
        hash_val += *key++;
    return( hash_val % tablesize );
}
Collision:
A collision occurs when the hash value of a record being inserted hashes to an address that already contains a
different record, i.e. when two key values hash to the same position.
Example
The values 89 and 39 hash to the same address, 9, if the table size is 10.
Type declarations for a separate chaining (open hashing) hash table
struct listnode
{
    elementtype element;
    position next;
};
struct hashtbl
{
    int tablesize;
    LIST *thelists;
};
Routine for Find operation
position find( elementtype key, HASHTABLE H )
{
    position p;
    LIST L;
    L = H->thelists[ hash( key, H->tablesize ) ];
    p = L->next;
    while( ( p != NULL ) && ( p->element != key ) )
        p = p->next;
    return p;
}
Routine for Insert Operation
void insert( elementtype key, HASHTABLE H )
{
    position pos, newcell;
    LIST L;
    pos = find( key, H );
    if( pos == NULL )              /* key is not already present */
    {
        newcell = (position) malloc( sizeof( struct listnode ) );
        if( newcell == NULL )
            fatalerror("Out of space!!!");
        else
        {
            L = H->thelists[ hash( key, H->tablesize ) ];
            newcell->next = L->next;
            newcell->element = key;
            L->next = newcell;
        }
    }
}
Closed Hashing (Open Addressing)
Separate chaining has the disadvantage of requiring pointers. This tends to slow the algorithm
down a bit because of the time required to allocate new cells, and also essentially requires the
implementation of a second data structure.
Closed hashing, also known as open addressing, is an alternative to resolving collisions with linked
lists.
In a closed hashing system, if a collision occurs, alternate cells are tried until an empty cell is found.
More formally, cells h0(x), h1(x), h2(x), . . . are tried in succession, where hi(x) = (hash(x) +
F(i)) mod tablesize, with F(0) = 0.
The function F is the collision resolution strategy. Because all the data goes inside the table, a bigger
table is needed for closed hashing than for open hashing. Generally, the load factor λ should be
below 0.5 for closed hashing.
Three common collision resolution strategies are
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
Linear Probing
In linear probing, F is a linear function of i, typically F(i) = i. This amounts to trying cells
sequentially (with wraparound) in search of an empty cell.
F(i) = i.
The Figure below shows the result of inserting keys {89, 18, 49, 58, 69} into a closed table using the
same hash function as before and the linear probing collision resolution strategy. The first
collision occurs when 49 is inserted; it is put in the next available spot, namely 0, which is open. 58
collides with 18, 89, and then 49 before an empty cell is found three away.
[Figure: closed hash table with linear probing after inserting {89, 18, 49, 58, 69}]
Quadratic Probing
Quadratic probing is a collision resolution method that eliminates the primary clustering
problem of linear probing. Quadratic probing is what you would expect: the collision function is
quadratic. The popular choice is F(i) = i^2.
When 49 collides with 89, the next position attempted is one cell away. This cell is empty, so
49 is placed there. Next 58 collides at position 8. Then the cell one away is tried, but another
collision occurs. A vacant cell is found at the next cell tried, which is 2^2 = 4 away. 58 is thus placed in
cell 2.
[Figure: closed hash table with quadratic probing after inserting {89, 18, 49, 58, 69}]
Double Hashing
The last collision resolution method we will examine is double hashing. For double hashing,
one popular choice is F(i) = i * h2(x). This formula says that we apply a second hash function
to x and probe at a distance h2(x), 2h2(x), . . ., and so on. A function such as h2(x) = R - (x mod R), with
R a prime smaller than H_SIZE, will work well.
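The three strategies differ only in the function F, which the sketch below passes in as a function pointer. This is an illustration under assumptions: keys are non-negative integers hashed with key mod tablesize, EMPTY marks a free cell, R = 7 is used for double hashing, and the names probe_insert, f_linear, f_quadratic and f_double are hypothetical.

```c
#include <assert.h>

#define EMPTY (-1)   /* assumed marker for a free cell */

typedef int (*ProbeFn)(int i, int key);   /* F(i); key is needed for double hashing */

static int f_linear(int i, int key)    { (void)key; return i; }
static int f_quadratic(int i, int key) { (void)key; return i * i; }
static int f_double(int i, int key)    { return i * (7 - key % 7); }   /* R = 7 */

/* Tries cells h_i = (hash(key) + F(i)) mod size for i = 0, 1, 2, ...
   Returns the cell used, or -1 if no empty cell was found in size tries. */
int probe_insert(int table[], int size, int key, ProbeFn f) {
    assert(size > 0 && key >= 0);
    for (int i = 0; i < size; i++) {
        int pos = (key % size + f(i, key)) % size;
        if (table[pos] == EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return -1;
}
```

Inserting 89, 18, 49, 58, 69 into an empty table of size 10 places them in cells 9, 8, 0, 1, 2 with linear probing, in cells 9, 8, 0, 2, 3 with quadratic probing, and in cells 9, 8, 6, 3, 0 with double hashing and R = 7.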
Rehashing
If the table gets too full, the operations will start taking too long and
inserts might fail for closed hashing with quadratic resolution. This can happen if there are too many
deletions intermixed with insertions.
A solution, then, is to build another table that is about twice as big and scan down the entire original
hash table, computing the new hash value for each element and inserting it in the new table.
As an example, suppose the elements 13, 15, 24, and 6 are inserted into a closed hash table of size 7.
The hash function is h(x) = x mod 7. Suppose linear probing is used to resolve collisions.
Rehashing routine
HASH_TABLE rehash( HASH_TABLE H )
{
    unsigned int i, old_size;
    cell *old_cells;
    old_cells = H->the_cells;
    old_size = H->table_size;
    /* Get a new, empty table */
    H = initialize_table( 2 * old_size );
    /* Scan through old table, reinserting into new */
    for( i = 0; i < old_size; i++ )
        if( old_cells[i].info == legitimate )
            insert( old_cells[i].element, H );
    free( old_cells );
    return H;
}
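The example with 13, 15, 24 and 6 can be traced with the following sketch. It is an illustration under assumptions: the table stores non-negative integer keys with linear probing, EMPTY marks a free cell, the new size is taken to be the first prime at or above twice the old size, and the Table type and table_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define EMPTY (-1)   /* assumed marker for a free cell */

typedef struct {
    int size;
    int *cells;
} Table;

/* Smallest prime >= n, so that the new table size is prime. */
static int next_prime(int n) {
    for (;; n++) {
        int is_prime = n > 1;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0) { is_prime = 0; break; }
        if (is_prime)
            return n;
    }
}

Table *table_create(int size) {
    assert(size > 0);
    Table *t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->size = size;
    t->cells = malloc((size_t)size * sizeof *t->cells);
    if (t->cells == NULL) abort();
    for (int i = 0; i < size; i++)
        t->cells[i] = EMPTY;
    return t;
}

/* Linear probing insert; assumes the table is never completely full. */
int table_insert(Table *t, int key) {
    int pos = key % t->size;
    while (t->cells[pos] != EMPTY && t->cells[pos] != key)
        pos = (pos + 1) % t->size;
    t->cells[pos] = key;
    return pos;
}

/* Builds a table about twice as big and reinserts every stored key. */
Table *table_rehash(Table *old) {
    Table *t = table_create(next_prime(2 * old->size));
    for (int i = 0; i < old->size; i++)
        if (old->cells[i] != EMPTY)
            table_insert(t, old->cells[i]);
    free(old->cells);
    free(old);
    return t;
}
```

In the size-7 table, 13, 15, 24 and 6 land in cells 6, 1, 3 and 0 (6 collides with 13 and wraps around). After rehashing, the table has 17 cells and the keys sit at 6 mod 17 = 6, 24 mod 17 = 7, 13 and 15.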
Extendible Hashing
If the amount of data is too large to fit in main memory, then the main consideration is the number of disk accesses
required to retrieve data. As before, we assume that at any point we have n records to store;
the value of n changes over time. Furthermore, at most m records fit in one disk block. We
will use m = 4 in this section. To be more formal, D will represent the number of bits used by
the root, which is sometimes known as the directory. The number of entries in the directory is thus 2^D.
dL is the
number of leading bits that all the elements of some leaf have in common. dL will depend on the
particular leaf, and dL <= D.
Suppose that we want to insert the key 100100. This would go into the third leaf, but as the
third leaf is already full, there is no room. We thus split this leaf into two leaves, which are now
determined by the first three bits. This requires increasing the directory size to D = 3 bits.
If the key 000000 is now inserted, then the first leaf is split, generating two leaves with dL = 3.
Since D = 3, the only change required in the directory is the updating of the 000 and 001 pointers.
2 MARKS
1. Explain the term data structure.
The data structure can be defined as the collection of elements and all the possible
operations which are required for that set of elements. Formally, a data structure can be
defined as a set of domains D, a set of functions F and a set of axioms A. This triple
(D, F, A) denotes the data structure d.
2. What do you mean by non-linear data structure? Give example.
The non-linear data structure is the kind of data structure in which the data may be
arranged in hierarchical fashion. For example: trees and graphs.
3. What do you mean by linear data structure? Give example.
The linear data structure is the kind of data structure in which the data is linearly arranged.
For example: stacks, queues, linked lists.
4. Enlist the various operations that can be performed on data structure.
Various operations that can be performed on the data structure are
• Create
• Insertion of element
• Deletion of element
• Searching for the desired element
• Sorting the elements in the data structure
• Reversing the list of elements.
5. What is abstract data type? What are all not concerned in an ADT?
The abstract data type is a triple of D - set of domains, F - set of functions and A -
axioms, in which only what is to be done is mentioned but how it is to be done is not
mentioned. Thus ADT is not concerned with implementation details.
6. List out the areas in which data structures are applied extensively.
Following are the areas in which data structures are applied extensively.
• Operating system - data structures like priority queues are
used for scheduling the jobs in the operating system.
• Compiler design - the tree data structure is used in parsing the source
program. The stack data structure is used in handling recursive calls.
• Database management system - the file data structure is used in
database management systems. Sorting and searching techniques
can be applied on the data in the file.
• Numerical analysis package - the array is used to perform the
numerical analysis on the given set of data.
• Graphics - the array and the linked list are useful in graphics applications.
• Artificial intelligence - graphs and trees are used for
applications like building expression trees and game playing.
7. What is a linked list?
A linked list is a set of nodes where each node has two fields, 'data' and 'link'. The
data field is used to store the actual piece of information and the link field is used to store the address
of the next node.
8. What are the pitfalls encountered in singly linked list?
Following are the pitfalls encountered in singly linked list
• The singly linked list has only a forward pointer and no backward link is provided.
Hence the traversing of the list is possible only in one direction. Backward
traversing is not possible.
• Insertion and deletion operations are less efficient because for inserting an
element at a desired position the list needs to be traversed. Similarly, traversing of the list
is required for locating the element which needs to be deleted.
9. Define doubly linked list.
Doubly linked list is a kind of linked list in which each node has two link fields. One
link field stores the address of the previous node and the other link field stores the address of
the next node.
10. Write down the steps to modify a node in linked lists.
➢ Search the corresponding node in the linked list.
➢ Replace the original value of that node by a new value.
➢ Display the message "the node is modified".
20. Whatisstaticlinkedlist?Stateanytwoapplicationsofit.
➢ Thelinkedliststructurewhichcanberepresentedusingarraysiscalledstaticlinked list.
➢ Itiseasytoimplement,hencefor creationofsmalldatabases,it isuseful.
➢ Thesearchingofanyrecordisefficient,hencetheapplicationsinwhichtherecord need to
be searched quickly, the static linked list are used.
16 MARKS
1. Explain the insertion operation in a linked list. How are nodes inserted after a specified node?
2. Write an algorithm to insert a node at the beginning of a list.
3. Discuss the merge operation in circular linked lists.
4. What are the applications of linked lists in dynamic storage management?
5. How can a polynomial expression be represented using a linked list?
6. What are the benefits and limitations of linked lists?
7. Define the deletion operation from a linked list.
8. What are the different types of data structure?
9. Explain the operation of traversing a linked list. Write the algorithm and give
an example.
UNIT II
2 MARKS
1. Define Stack
A stack is an ordered list in which all insertions (push operation) and deletions (pop
operation) are made at one end, called the top. The topmost element is pointed to by top. The
top is initialized to -1 when the stack is created, that is, when the stack is empty. In a stack S = (a1, ..., an),
a1 is the bottommost element and element ai is on top of element ai-1. A stack is also referred
to as a Last In First Out (LIFO) list.
2. What are the various operations performed on the stack?
The various operations that are performed on the stack are
CREATE(S) – Creates S as an empty stack.
PUSH(S,X) – Adds the element X to the top of the stack.
POP(S) – Deletes the topmost element from the stack.
TOP(S) – Returns the value of the top element of the stack.
ISEMPTY(S) – Returns true if the stack is empty, else false.
ISFULL(S) – Returns true if the stack is full, else false.
3. How do you test for an empty stack?
In the array implementation of a stack, the condition for an empty stack is top = -1, where top
points to the topmost element of the stack. In the linked list
implementation of a stack, the condition for an empty stack is that the link field of the header node is
NULL.
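The stack operations and the empty test top = -1 can be sketched for the array implementation as follows. The capacity `STACK_MAX`, the `struct Stack` layout and the function names are assumptions made for this illustration; returning a success flag from push and pop is one possible way to report overflow and underflow.

```c
#include <stdbool.h>

#define STACK_MAX 100   /* capacity chosen for illustration */

struct Stack {
    int items[STACK_MAX];
    int top;             /* index of the topmost element, -1 when empty */
};

void stack_create(struct Stack *s) { s->top = -1; }

bool stack_is_empty(const struct Stack *s) { return s->top == -1; }

bool stack_is_full(const struct Stack *s) { return s->top == STACK_MAX - 1; }

/* Returns true on success, false on stack overflow. */
bool stack_push(struct Stack *s, int x)
{
    if (stack_is_full(s))
        return false;
    s->items[++s->top] = x;
    return true;
}

/* Stores the popped value in *out; returns false on stack underflow. */
bool stack_pop(struct Stack *s, int *out)
{
    if (stack_is_empty(s))
        return false;
    *out = s->items[s->top--];
    return true;
}

/* Stores the top value in *out without removing it. */
bool stack_top(const struct Stack *s, int *out)
{
    if (stack_is_empty(s))
        return false;
    *out = s->items[s->top];
    return true;
}
```

Pushing 1 and then 2 and popping twice yields 2 and then 1, which is the Last In First Out behaviour described in the definition.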
4. Name two applications of stack?
Nested and recursive function calls can be implemented using a stack. Conversion of an infix expression to a postfix
expression can be implemented using a stack. Evaluation of a postfix expression can be
implemented using a stack.
5. Define a suffix expression.
The notation in which the operator is written after its operands is called suffix notation.
Suffix notation format: operand operand operator
Example: ab+, where a and b are operands and ‘+’ is the addition operator.
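Evaluating a suffix (postfix) expression is a standard use of a stack: operands are pushed, and each operator pops its two operands and pushes the result. The sketch below assumes single-digit operands and the four operators + - * /; the function name `eval_postfix` and the fixed stack size are choices made only for this example.

```c
#include <ctype.h>

/* Evaluates a postfix string of single-digit operands and + - * /.
   Returns 1 and stores the value in *result on success, 0 on a
   malformed expression or division by zero. */
int eval_postfix(const char *expr, int *result)
{
    int stack[64];
    int top = -1;
    const char *p;

    for (p = expr; *p != '\0'; p++) {
        if (isspace((unsigned char)*p))
            continue;
        if (isdigit((unsigned char)*p)) {
            if (top == 63)
                return 0;               /* stack overflow */
            stack[++top] = *p - '0';
        } else {
            int a, b;
            if (top < 1)
                return 0;               /* operator needs two operands */
            b = stack[top--];           /* right operand is on top */
            a = stack[top--];
            switch (*p) {
            case '+': stack[++top] = a + b; break;
            case '-': stack[++top] = a - b; break;
            case '*': stack[++top] = a * b; break;
            case '/':
                if (b == 0)
                    return 0;
                stack[++top] = a / b;
                break;
            default:
                return 0;               /* unknown symbol */
            }
        }
    }
    if (top != 0)
        return 0;                       /* leftover operands */
    *result = stack[top];
    return 1;
}
```

For the input "23+" the routine pushes 2 and 3, applies ‘+’, and leaves 5 as the only value on the stack.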
6. What do you mean by a fully parenthesized expression? Give an example.
A pair of parentheses has the same parenthetical level as that of the operator to which it
corresponds. Such an expression is called a fully parenthesized expression.
Ex: (a+((b*c)+(d*e)))
7. Write the postfix form for the expression -A+B-C+D?
A-B+C-D+
8. What are the postfix and prefix forms of the expression A+B*(C-D)/(P-R)?
Postfix form: ABCD-*PR-/+
Prefix form: +A/*B-CD-PR
9. Explain the usage of stack in recursive algorithm implementation?
In recursive algorithms, a stack is used to store the return address when a recursive
call is encountered, and also to store the values of all the parameters essential to the current
state of the function.
10. Define Queues.
A queue is an ordered list in which all insertions take place at one end, called the rear,
while all deletions take place at the other end, called the front. Rear is initialized to -1 and
front is initialized to 0. A queue is also referred to as a First In First Out (FIFO) list.
11. What are the various operations performed on the Queue?
The various operations performed on the queue are
CREATE(Q) – Creates Q as an empty queue.
ENQUEUE(Q,X) – Adds the element X to the queue.
DEQUEUE(Q) – Deletes an element from the queue.
ISEMPTY(Q) – Returns true if the queue is empty, else false.
ISFULL(Q) – Returns true if the queue is full, else false.
12. How do you test for an empty Queue?
The condition for testing an empty queue is rear = front - 1. In the linked list implementation of
a queue, the condition for an empty queue is that the link field of the header node is NULL.
13. Write down the function to insert an element into a queue, in which the queue is
implemented as an array. (May 10)
Q – Queue
X – element to be added to the queue Q
IsFull(Q) – Checks and returns true if queue Q is full
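A minimal sketch of such an insertion routine for a linear array queue is given below, using the conventions stated earlier (rear initialized to -1, front to 0, empty when rear = front - 1). The capacity `QUEUE_MAX`, the `struct Queue` layout and the function names are assumptions of this example rather than a fixed interface.

```c
#include <stdbool.h>

#define QUEUE_MAX 5    /* small capacity chosen for illustration */

struct Queue {
    int items[QUEUE_MAX];
    int front;          /* index of the next element to delete */
    int rear;           /* index of the last inserted element  */
};

void queue_create(struct Queue *q) { q->front = 0; q->rear = -1; }

bool queue_is_empty(const struct Queue *q) { return q->rear == q->front - 1; }

bool queue_is_full(const struct Queue *q) { return q->rear == QUEUE_MAX - 1; }

/* Insert X at the rear; returns false if the queue is full. */
bool enqueue(struct Queue *q, int x)
{
    if (queue_is_full(q))
        return false;
    q->items[++q->rear] = x;
    return true;
}

/* Delete from the front into *out; returns false if the queue is empty. */
bool dequeue(struct Queue *q, int *out)
{
    if (queue_is_empty(q))
        return false;
    *out = q->items[q->front++];
    return true;
}
```

In this linear form the slots freed by dequeue are not reused; a circular queue removes that limitation by wrapping the indices around the array.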
16 MARKS
1. Write an algorithm for Push and Pop operations on Stack using Linked list. (8)
2. Explain the linked list implementation of stack ADT in detail?
3. Define an efficient representation of two stacks in a given area of memory with n
words and explain.
4. Explain linear linked implementation of Stack and Queue?
a. Write an ADT to implement a stack of size N using an array. The elements in
the stack are to be integers. The operations to be supported are PUSH, POP
and DISPLAY. Take into account the exceptions of stack overflow and stack
underflow. (8)
b. A circular queue has a size of 5 and has 3 elements 10, 20 and 40 where F=2 and
R=4. After inserting 50 and 60, what is the value of F and R? Trying to insert
30 at this stage, what happens? Delete 2 elements from the
queue and insert 70, 80 & 90. Show the sequence of steps with necessary diagrams with the value of F &
R. (8 Marks)
5. Write the algorithm for converting infix expression to postfix (polish) expression?
6. Explain in detail about priority queue ADT?
7. Write a function called ‘push’ that takes two parameters: an integer variable and a
stack into which it would push this element, and returns a 1 or a 0 to show success of addition or failure.
8. What is a DeQueue? Explain its operation with example?
9. Explain the array implementation of queue ADT in detail?
10. Explain the addition and deletion operations performed on a circular queue
with necessary algorithms. (8) (Nov 09)
UNIT III
1. Define tree
A tree is a non-linear data structure which can be used to store data items in a sorted sequence. It
represents a hierarchical relationship among data items. It is a collection of nodes
which has a distinguished node called the root and zero or more non-empty subtrees T1, T2,
…, Tk, each of whose roots is connected by a directed edge from the root.
2. Define Height of tree?
The height of a node n is the length of the longest path from n to a leaf. Thus all leaves
have height zero. The height of a tree is equal to the height of its root.
3. Define Depth of tree?
For any node n, the depth of n is the length of the unique path from the root to node n.
Thus for the root the depth is always zero.
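Height and depth can both be computed recursively. The sketch below assumes a binary tree whose nodes carry `left` and `right` pointers; the struct layout and the function names `tree_height` and `node_depth` are introduced only for this illustration, as is the convention that an empty tree has height -1.

```c
#include <stddef.h>

struct TreeNode {
    int key;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Height of the subtree rooted at t, counted in edges.
   A leaf has height 0; an empty tree is given height -1 by convention. */
int tree_height(const struct TreeNode *t)
{
    int hl, hr;
    if (t == NULL)
        return -1;
    hl = tree_height(t->left);
    hr = tree_height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* Depth of the node holding key in the tree rooted at t (root has depth 0).
   Returns -1 if the key is not present. Both subtrees are searched,
   so the tree need not be ordered. */
int node_depth(const struct TreeNode *t, int key)
{
    int d;
    if (t == NULL)
        return -1;
    if (t->key == key)
        return 0;
    d = node_depth(t->left, key);
    if (d < 0)
        d = node_depth(t->right, key);
    return d < 0 ? -1 : d + 1;
}
```

With this convention the height of a tree equals the largest depth of any of its leaves.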
4. What is the length of the path in a tree?
The length of a path is the number of edges on the path. In a tree there is exactly
one path from the root to each node.
5. Define sibling?
Nodes with the same parent are called siblings.
6. Define binary tree?
A binary tree is a finite set of data items which is either empty or consists of a single
item called the root and two disjoint binary trees called the left subtree and the right subtree. The maximum degree of any node is two.
7. What are the two methods of binary tree implementation?
The two methods to implement a binary tree are
a. Linear representation
b. Linked representation
8. What are the applications of binary tree?
A binary tree is used in data processing, for example in
a. File index schemes
b. Hierarchical database management systems
9. List out a few of the applications of the tree data structure?
• The manipulation of arithmetic expressions
• Searching operations
• Implementing the file system of several popular operating systems
• Symbol table construction
• Syntax analysis
10. Define expression tree?
An expression tree is a binary tree in which the leaf (terminal) nodes are operands and the
non-terminal (intermediate) nodes are operators.
11. Define tree traversal and mention the type of traversals?
Visiting each and every node in the tree exactly once is called tree
traversal. The three types of tree traversal are
1. Inorder traversal
2. Preorder traversal
3. Postorder traversal
12. Define in-order traversal?
In-order traversal entails the following steps:
a. Traverse the left subtree
b. Visit the root node
c. Traverse the right subtree
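The three steps map directly onto a recursive routine. In this sketch the node layout and the function name `inorder` are assumptions, and "visiting" a node is taken to mean appending its key to an output array so that the result can be inspected.

```c
#include <stddef.h>

struct TreeNode {
    int key;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Writes the keys of the tree rooted at t into out[] in in-order,
   starting at index *count, and advances *count. The caller must
   make sure out[] is large enough. */
void inorder(const struct TreeNode *t, int out[], int *count)
{
    if (t == NULL)
        return;
    inorder(t->left, out, count);    /* a. traverse the left subtree  */
    out[(*count)++] = t->key;        /* b. visit the root node        */
    inorder(t->right, out, count);   /* c. traverse the right subtree */
}
```

Moving the visit step before or after the two recursive calls gives preorder and postorder traversal respectively.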
13. Define threaded binary tree.
A binary tree is threaded by making all right child pointers that would
normally be null point to the in-order successor of the node, and all left child pointers
that would normally be null
point to the in-order predecessor of the node.
14. What are the types of threaded binary tree?
i. Right-in threaded binary tree
ii. Left-in threaded binary tree
iii. Fully-in threaded binary tree
15. Define Binary Search Tree.
A binary search tree is a binary tree in which, for every node X in the tree, the values of all the keys
in its left subtree are smaller than the key value in X and the values of all the keys in its right
subtree are larger than the key value in X.
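The ordering property is what makes insertion and search simple: at each node the key is compared and the routine moves left or right. The following is a sketch under assumed names (`struct BstNode`, `bst_insert`, `bst_find`); duplicate keys are simply ignored here, which is one of several possible policies.

```c
#include <stdlib.h>

struct BstNode {
    int key;
    struct BstNode *left;
    struct BstNode *right;
};

/* Inserts key and returns the (possibly new) root of the subtree.
   Duplicate keys are ignored. If malloc fails the tree is unchanged. */
struct BstNode *bst_insert(struct BstNode *t, int key)
{
    if (t == NULL) {
        struct BstNode *n = malloc(sizeof *n);
        if (n != NULL) {
            n->key = key;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (key < t->key)
        t->left = bst_insert(t->left, key);     /* smaller keys go left  */
    else if (key > t->key)
        t->right = bst_insert(t->right, key);   /* larger keys go right  */
    return t;
}

/* Returns the node holding key, or NULL if it is absent. */
struct BstNode *bst_find(struct BstNode *t, int key)
{
    while (t != NULL && t->key != key)
        t = (key < t->key) ? t->left : t->right;
    return t;
}
```

Inserting 45, 26, 10, 60, 70, 30, 40 in that order into an empty tree leaves 45 at the root with 26 as its left child and 60 as its right child.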
16. What is an AVL Tree?
AVL stands for Adelson-Velskii and Landis. An AVL tree is a binary search tree which
has the following properties:
1. The sub-trees of every node differ in height by at most one.
2. Every sub-tree is an AVL tree.
Search time is O(log n). Addition and deletion operations also take O(log n) time.
17. List out the steps involved in deleting a node from a binary search tree.
▪ Deleting a node that is a leaf (i.e. it has no children)
▪ Deleting a node with one child
▪ Deleting a node with two children
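The three cases can be handled by one recursive routine. The sketch below uses assumed names (`struct BstNode`, `bst_delete`, `bst_min`) and, for the two-children case, replaces the key by its in-order successor (the smallest key of the right subtree); using the in-order predecessor instead would be equally valid.

```c
#include <stdlib.h>

struct BstNode {
    int key;
    struct BstNode *left;
    struct BstNode *right;
};

/* Smallest key in a non-empty subtree: keep going left. */
static struct BstNode *bst_min(struct BstNode *t)
{
    while (t->left != NULL)
        t = t->left;
    return t;
}

/* Deletes key from the subtree rooted at t and returns the new root.
   Nodes are assumed to have been allocated with malloc. */
struct BstNode *bst_delete(struct BstNode *t, int key)
{
    if (t == NULL)
        return NULL;                         /* key not found */
    if (key < t->key) {
        t->left = bst_delete(t->left, key);
    } else if (key > t->key) {
        t->right = bst_delete(t->right, key);
    } else if (t->left != NULL && t->right != NULL) {
        /* two children: copy the in-order successor, then delete it */
        struct BstNode *succ = bst_min(t->right);
        t->key = succ->key;
        t->right = bst_delete(t->right, succ->key);
    } else {
        /* leaf or one child: splice the node out */
        struct BstNode *child = (t->left != NULL) ? t->left : t->right;
        free(t);
        return child;
    }
    return t;
}
```

The leaf case and the one-child case share a branch because a leaf is just a node whose only "child" is NULL.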
18. What is a ‘B’ Tree?
A B-tree is a tree data structure that keeps data sorted and allows searches,
insertions, and deletions in logarithmic amortized time. Unlike self-balancing binary search
trees, it is optimized for systems that read and write large blocks of data. It is most
commonly used in database and file systems.
Important properties of a B-tree:
• B-tree nodes have many more than two children.
• A B-tree node may contain more than just a single element.
19. What are binomial heaps?
A binomial heap is a collection of binomial trees that satisfies the following
binomial-heap properties:
1. No two binomial trees in the collection have the same size.
2. Each node in each tree has a key.
3. Each binomial tree in the collection is heap-ordered in the sense that
each non-root has a key strictly less than the key of its
parent.
The number of trees in a binomial heap is O(log n).
20. Define complete binary tree.
A binary tree is said to be complete if all its levels, possibly except the last, have the maximum number of nodes and if all the
nodes in the last level appear as far left as possible.
16 MARKS
1. Explain the AVL tree insertion and deletion with suitable example.
2. Describe the algorithms used to perform single and double rotation on AVL tree.
3. Explain about B-Tree with suitable example.
4. Explain about B+ trees with suitable algorithm.
5. Write short notes on
i. Binomial heaps ii. Fibonacci heaps
6. Explain the tree traversal techniques with an example.
7. Construct an expression tree for the expression (a+b*c) + ((d*e+f)*g). Give the outputs
when you apply inorder, preorder and postorder traversals.
8. How to insert and delete an element into a binary search tree and write down the code
for the insertion routine with an example.
9. What are threaded binary trees? Write an algorithm for inserting a node in a threaded binary tree.
10. Create a binary search tree for the following numbers, starting from an empty binary search
tree:
45, 26, 10, 60, 70, 30, 40. Delete keys 10, 60 and 45 one after the other and show the trees at each stage.
UNIT IV
PART A
1. Write the definition of weighted graph?
A graph in which weights are assigned to every edge is called a weighted graph.
2. Define Graph?
A graph G consists of a nonempty set V which is the set of nodes of the graph, a set E
which is the set of edges of the graph, and a mapping from the set of edges E to a set of pairs
of elements of V. It can also be represented as G = (V, E).
3. Define adjacency matrix?
The adjacency matrix is an n x n matrix A whose elements aij are given by
aij = 1 if the edge (vi, vj) exists, otherwise 0.
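The definition translates directly into a two-dimensional array. In the sketch below the bound `MAX_NODES`, the `struct Graph` layout and the function names are assumptions made for illustration; nodes are numbered from 0.

```c
#include <string.h>

#define MAX_NODES 8   /* upper bound chosen for illustration */

struct Graph {
    int n;                            /* number of nodes, n <= MAX_NODES */
    int adj[MAX_NODES][MAX_NODES];    /* adj[i][j] = 1 if edge (vi, vj) exists */
};

/* Creates a graph with n nodes and no edges. */
void graph_init(struct Graph *g, int n)
{
    g->n = n;
    memset(g->adj, 0, sizeof g->adj);
}

/* Adds the directed edge (vi, vj). */
void graph_add_edge(struct Graph *g, int i, int j)
{
    g->adj[i][j] = 1;
}

/* Adds an undirected edge by storing it in both directions. */
void graph_add_undirected_edge(struct Graph *g, int i, int j)
{
    g->adj[i][j] = 1;
    g->adj[j][i] = 1;
}

/* Returns 1 if the edge (vi, vj) exists, otherwise 0. */
int graph_has_edge(const struct Graph *g, int i, int j)
{
    return g->adj[i][j];
}
```

For an undirected graph the matrix is symmetric, so only half of it carries independent information; the matrix always needs n x n cells regardless of how many edges exist.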
4. Define adjacent nodes?
Any two nodes which are connected by an edge in a graph are called adjacent nodes. For
example, if an edge x ∈ E is associated with a pair of nodes
(u, v) where u, v ∈ V, then we say that the edge x connects the nodes u and v.
5. What is a directed graph?
A graph in which every edge is directed is called a directed graph.
6. What is an undirected graph?
A graph in which every edge is undirected is called an undirected graph.
7. What is a loop?
An edge of a graph which connects a node to itself is called a loop or sling.
8. What is a simple graph?
A simple graph is a graph which has not more than one edge between a pair of nodes.
9. What is a weighted graph?
A graph in which weights are assigned to every edge is called a weighted graph.
10. Define indegree and outdegree of a graph?
Out degree: In a directed graph, for any node v, the number of edges which have v as their initial node is
called the out degree of the node v.
In degree: The number of edges which have v as their terminal node is called the in degree of the node v.
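With an adjacency matrix, the out degree of a node is the sum of its row (edges leaving it) and the in degree is the sum of its column (edges entering it). The sketch below assumes the matrix is stored row by row in a flat array; the function names are hypothetical.

```c
/* adj is an n x n adjacency matrix stored row by row:
   adj[i * n + j] is 1 if there is an edge from node i to node j. */

/* Out degree of v: edges that have v as their initial node (row v). */
int out_degree(const int *adj, int n, int v)
{
    int j, d = 0;
    for (j = 0; j < n; j++)
        d += adj[v * n + j];
    return d;
}

/* In degree of v: edges that have v as their terminal node (column v). */
int in_degree(const int *adj, int n, int v)
{
    int i, d = 0;
    for (i = 0; i < n; i++)
        d += adj[i * n + v];
    return d;
}
```

For a graph with edges 0→1, 0→2 and 1→2, node 0 has out degree 2 and in degree 0, while node 2 has out degree 0 and in degree 2.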
11. Define path in a graph?
The path in a graph is the route taken to reach a terminal node from a starting node.
12. What is a simple path?
i. A path in a digraph in which the edges are distinct is called a simple path.
ii. It is also called edge simple.
23. Define biconnected graph?
A graph is called biconnected if there is no single node whose removal causes the graph to
break into two or more pieces. A node whose removal causes the graph to become
disconnected is called a cut vertex.
24. What are the two traversal strategies used in traversing a graph?
a. Breadth first search
b. Depth first search
25. Articulation Points (or Cut Vertices) in a Graph
A vertex in an undirected connected graph is an articulation point (or cut vertex) if
removing it (and the edges through it) disconnects the graph. Articulation points represent
vulnerabilities in a connected network: single points whose failure would split the network
into 2 or more disconnected components. They are useful for designing reliable networks.
For a disconnected undirected graph, an articulation point is a vertex whose removal
increases the number of connected components.
The following are some example graphs with articulation points encircled in red.
16 MARKS
1. Explain the various representations of graph with example in detail?
2. Explain Breadth First Search algorithm with example?
3. Explain Depth first and breadth first traversal?
4. What is topological sort? Write an algorithm to perform topological sort? (8) (Nov 09)
5. (i) Write an algorithm to determine the biconnected components in the given graph. (10)
(May 10)
(ii) Determine the biconnected components in a graph. (6)
6. Explain the various applications of Graphs.
UNIT V
2 MARKS
1. What is meant by Sorting?
Sorting is the ordering of data in an increasing or decreasing fashion according to some linear
relationship among the data items.
2. List the different sorting algorithms.
• Bubble sort
• Selection sort
• Insertion sort
• Shell sort
• Quick sort
• Radix sort
• Heap sort
• Merge sort
3. Why is bubble sort called so?
The bubble sort gets its name because as array elements are sorted they gradually
“bubble” to their proper positions, like bubbles rising in a glass of soda.
4. State the logic of bubble sort algorithm.
The bubble sort repeatedly compares adjacent elements of an array. The first and
second elements are compared and swapped if out of order. Then the second and third
elements are compared and swapped if out of order. This sorting process continues
until the last two
elements of the array are compared and swapped if out of order.
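The logic above can be sketched in C as follows. The function name `bubble_sort` and its return value (the number of passes made) are choices of this example; the `swapped` flag lets the routine stop as soon as a full pass makes no exchange.

```c
#include <stdbool.h>

/* Sorts a[0..n-1] in ascending order and returns the number of passes made. */
int bubble_sort(int a[], int n)
{
    int passes = 0;
    bool swapped = true;      /* flag: did the last pass swap anything? */

    while (swapped) {
        int i;
        swapped = false;
        /* after `passes` passes the last `passes` elements are in place */
        for (i = 0; i + 1 < n - passes; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        passes++;
    }
    return passes;
}
```

On an already sorted array the first pass makes no swap, so the routine returns after a single pass.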
5. What number is always sorted to the top of the list by each pass of the Bubble sort
algorithm?
Each pass through the list places the next largest value in its proper place. In essence, each
item “bubbles” up to the location where it belongs.
6. When does the Bubble Sort Algorithm stop?
The bubble sort stops when it examines the entire array and finds that no
"swaps" are needed. The bubble sort keeps track of the occurring swaps by the use of a flag.
7. State the logic of selection sort algorithm.
It finds the lowest value in the unsorted part of the collection and moves it to the left end of that part. This is repeated until
the complete collection is sorted.
8. What is the output of selection sort after the 2nd iteration given the
following sequence? 16 3 46 9 28 14
Ans: 3 9 46 16 28 14
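The answer can be checked with a small routine that stops after a given number of passes. The function name `selection_sort_passes` and its `passes` parameter are introduced only for this illustration; passing n or more for `passes` sorts the whole array.

```c
/* Runs at most `passes` iterations of selection sort on a[0..n-1].
   Pass i finds the smallest element of a[i..n-1] and swaps it into a[i]. */
void selection_sort_passes(int a[], int n, int passes)
{
    int i, j;
    for (i = 0; i < n - 1 && i < passes; i++) {
        int min = i;
        for (j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        if (min != i) {
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }
}
```

For 16 3 46 9 28 14, the first pass swaps 3 with 16 (giving 3 16 46 9 28 14) and the second pass swaps 9 with 16 (giving 3 9 46 16 28 14).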
9. How does insertion sort algorithm work?
In every iteration an element is compared with the elements before it. When the
suitable position for the element is found, space is
created for it by shifting the other elements one position up, and the element is inserted
at the suitable position. This procedure is repeated for all the elements in
the list until we get the sorted elements.
10. What operation does the insertion sort use to move numbers from the unsorted
section to the sorted section of the list?
The insertion sort uses the swap operation since it is ordering numbers within
a single list.
11. How many key comparisons and assignments does an insertion sort make in its worst case?
The worst case performance in insertion sort occurs when the elements of the input
array are in descending order. In that case, the first pass requires one comparison, the
second pass requires two comparisons, the third pass three comparisons, …, the kth pass requires k comparisons,
and finally the last pass requires (n-1) comparisons. Therefore, the total number of
comparisons is:
f(n) = 1 + 2 + 3 + … + k + … + (n-2) + (n-1) = n(n-1)/2 = O(n^2)
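The count n(n-1)/2 can be observed by instrumenting a straightforward insertion sort. In this sketch the function name `insertion_sort` and the returned comparison counter are assumptions added for illustration; the sort shifts larger elements one position up rather than swapping.

```c
/* Sorts a[0..n-1] in ascending order and returns the number of
   key comparisons (a[j] against key) that were made. */
long insertion_sort(int a[], int n)
{
    long comparisons = 0;
    int i;
    for (i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0) {
            comparisons++;
            if (a[j] <= key)
                break;             /* position found */
            a[j + 1] = a[j];       /* shift the larger element one position up */
            j--;
        }
        a[j + 1] = key;
    }
    return comparisons;
}
```

For the descending input 5 4 3 2 1 the passes make 1, 2, 3 and 4 comparisons, 10 in total, which equals n(n-1)/2 for n = 5. For an already sorted input each pass makes a single comparison, n-1 in total.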
12. Which sorting algorithm is best if the list is already sorted? Why?
Insertion sort, as there is no movement of data if the list is already
sorted and the complexity is of the order O(N).
13. Which sorting algorithm is easily adaptable to singly linked lists? Why?
Insertion sort is easily adaptable to a singly linked list. In this method there is an array
link of pointers, one for each of the original array elements. Thus the array can be thought
of as a linear linked list pointed to by an external pointer first, initialized to 0. To insert the kth
element, the linked list is traversed until the proper position for x[k] is found, or until the end
of the list is reached. At that point x[k] can be inserted into the list by merely adjusting the
pointers without shifting any elements in the array, which reduces insertion time.
14. Why is Shell Sort known as diminishing increment sort?
The distance between comparisons decreases as the sorting
algorithm runs, until the last phase in which adjacent elements are compared. In each step,
the sortedness of the sequence is increased, until in the last step it is completely sorted.
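A minimal sketch of the idea is shown below. It assumes the increment sequence n/2, n/4, ..., 1 (Shell's original increments), which is only one of several sequences in use; the function name `shell_sort` is chosen for this example.

```c
/* Shell sort with the increment sequence n/2, n/4, ..., 1
   (chosen here for simplicity; other sequences are possible). */
void shell_sort(int a[], int n)
{
    int gap, i, j;
    for (gap = n / 2; gap > 0; gap /= 2) {
        /* gapped insertion sort: elements `gap` apart are compared */
        for (i = gap; i < n; i++) {
            int tmp = a[i];
            for (j = i; j >= gap && a[j - gap] > tmp; j -= gap)
                a[j] = a[j - gap];
            a[j] = tmp;
        }
    }
}
```

The final phase with gap 1 is an ordinary insertion sort, but by then the array is nearly sorted, so few elements have to move far.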
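15. Which of the following sorting methods would be especially suitable to
sort a list L consisting of a sorted list followed by a few “random” elements?
Insertion sort is suitable to sort a list L consisting of a sorted list followed by a few
“random” elements, because the sorted part needs no data movement and only the few trailing elements have to be moved into place. Quick sort with a first- or last-element pivot performs poorly on such nearly sorted input.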
16. What is the output of quick sort after the 3rd iteration given the following
sequence?
24 56 47 35 10 90 82 31
17. Mention the different ways to select a pivot element.
The different ways to select a pivot element are
• Pick the first element as pivot
• Pick the last element as pivot
• Pick the middle element as pivot
• Median-of-three elements: pick three elements, find the median x of these elements,
and use that median as the pivot
• Randomly pick an element as pivot
18. What is divide-and-conquer strategy?
• Divide a problem into two or more subproblems
• Solve the subproblems recursively
• Obtain the solution to the original problem by combining these solutions
the use of a hash table or binary search tree will result in more efficient searching, but more
often than not an array or linked list will be used. It is necessary to understand good ways of
searching data structures not designed to support efficient search.
21. What is linear search?
In linear search the list is searched sequentially and the position is returned if the key
element to be searched is available in the list, otherwise -1 is returned. The search
starts at the beginning of an array and moves to the end, testing for a match at each item.
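The description corresponds to a single loop over the array. The function name `linear_search` and the zero-based position it returns are conventions assumed for this sketch.

```c
/* Returns the index of the first element equal to key, or -1 if absent. */
int linear_search(const int a[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)      /* test each item in turn */
        if (a[i] == key)
            return i;
    return -1;                   /* reached the end without a match */
}
```

No ordering of the data is required, but in the worst case all n items are examined.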
22. What is Binary search?
A binary search, also called a dichotomizing search, is a digital scheme for locating a
specific object in a large set. Each object in the set is given a key. The number of keys is
always a power of 2. If there are 32 items in a list, for example, they might be numbered 0 through 31
(binary 00000 through 11111). If there are, say, only 29 items, they can be numbered 0
through 28 (binary 00000 through 11100), with the numbers 29 through 31 (binary
11101, 11110, and 11111) as dummy keys.
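In its common array form, binary search works on a sorted array and halves the search range at every step, so no dummy keys are needed. The sketch below assumes an array sorted in ascending order; the function name `binary_search` is chosen for this example.

```c
/* Searches the sorted array a[0..n-1] for key.
   Returns its index, or -1 if the key is not present. */
int binary_search(const int a[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   /* avoids overflow of low + high */
        if (a[mid] == key)
            return mid;
        if (a[mid] < key)
            low = mid + 1;                  /* key can only be in the right half */
        else
            high = mid - 1;                 /* key can only be in the left half  */
    }
    return -1;
}
```

Each comparison discards half of the remaining range, so about log2(n) comparisons suffice; for 32 items that is roughly five steps, matching the five-bit keys in the description above.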
23. Define hash function?
A hash function takes an identifier and computes the address of that identifier in the hash
table using some function.
24. Why do we need a Hash function as a data structure as compared to any other data
structure? (May 10)
Hashing is a technique used for performing insertions, deletions, and finds in constant average
time.
25. What are the important factors to be considered in designing the hash function? (Nov 10)
• To avoid a lot of collisions the table size should be prime.
• For string data, if keys are very long, the hash function will take long to compute.
26. What do you mean by hash table?
The hash table data structure is merely an array of some fixed size, containing the
keys. A key is a string with an associated value. Each key is mapped into some number in
the range 0 to tablesize-1 and placed in the appropriate cell.
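Mapping a string key into the range 0 to tablesize-1 can be sketched as follows. The multiplier 31 and the function name `hash` are assumptions of this example; it is one common choice of hash function, not the only one.

```c
/* Maps a string key to a cell index in the range 0 .. table_size-1.
   Uses Horner's rule with multiplier 31, a common (not the only) choice.
   Unsigned arithmetic wraps around on overflow, which is well defined in C. */
unsigned int hash(const char *key, unsigned int table_size)
{
    unsigned int h = 0;
    while (*key != '\0')
        h = h * 31 + (unsigned char)*key++;
    return h % table_size;
}
```

Every character of the key is examined, which is why very long keys make the function slow to compute; a prime table size such as 11 helps spread the resulting indices.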
16 MARKS
1. Write an algorithm to implement Bubble sort with suitable example.
2. Explain any two techniques to overcome hash collision.
3. Write an algorithm to implement insertion sort with suitable example.
4. Write an algorithm to implement selection sort with suitable example.
5. Write an algorithm to implement radix sort with suitable example.
6. Write an algorithm for binary search with suitable example.
7. Discuss the common collision resolution strategies used in closed hashing system.
8. Given the input {4371, 1323, 6173, 4199, 4344, 9679, 1989} and a hash function of h(X) = X
(mod 10), show the resulting:
a. Separate Chaining hash table
b. Open addressing hash table using linear probing
9. Explain Re-hashing and Extendible hashing.
10. Show the result of inserting the keys 2, 3, 5, 7, 11, 13, 15, 6, 4 into an initially
empty extendible hashing data structure with M=3. (8) (Nov 10)
11. What are the advantages and disadvantages of various collision resolution
strategies? (6)
*** ALL THE BEST ***