Data Structure Complete Notes
Using “C”
Lecture-01 Introduction to Data Structure
Lecture-02 Search Operation
Lecture-03 Sparse Matrix and its Representations
Lecture-04 Stack
Lecture-05 Stack Applications
Lecture-06 Queue
Lecture-07 Linked List
Lecture-08 Polynomial List
Lecture-09 Doubly Linked List
Lecture-10 Circular Linked List
Lecture-11 Memory Allocation
Lecture-12 Infix to Postfix Conversion
Lecture-13 Binary Tree
Lecture-14 Special Forms of Binary Trees
Lecture-15 Tree Traversal
Lecture-16 AVL Trees
Lecture-17 B+-tree
Lecture-18 Binary Search Tree (BST)
Lecture-19 Graphs Terminology
Lecture-20 Depth First Search
Lecture-21 Breadth First Search
Lecture-22 Graph Representation
Lecture-23 Topological Sorting
Lecture-24 Bubble Sort
Lecture-25 Insertion Sort
Lecture-26 Selection Sort
Lecture-27 Merge Sort
Lecture-28 Quick Sort
Lecture-29 Heap Sort
Lecture-30 Radix Sort
Lecture-31 Binary Search
Lecture-32 Hashing
Lecture-33 Hashing Functions
Module-1
Lecture-01
Introduction to Data Structures
In computer terms, a data structure is a specific way to store and organize data in a
computer's memory so that the data can be used efficiently later. Data may be
arranged in many different ways; the logical or mathematical model of a
particular organization of data is termed a data structure. The choice of a particular
data model depends on two factors -
Firstly, it must be rich enough in structure to reflect the actual relationships of
the data with the real-world object.
Secondly, the structure should be simple enough that the data can be processed
efficiently whenever necessary.
Categories of Data Structure:
The data structure can be subdivided into two major types:
Linear Data Structure
Non-linear Data Structure
Linear Data Structure:
A data structure is said to be linear if its elements form a sequence, that is, a specific
linear order. There are basically two techniques for representing such a linear structure
in memory.
The first is to represent the linear relationship among the elements by means of
sequential memory locations. These linear structures are termed arrays.
The second technique is to represent the linear relationship among the elements
by using pointers or links. These linear structures are termed linked lists.
The common examples of linear data structures are:
Arrays
Queues
Stacks
Linked lists
Non-linear Data Structure:
This structure is mostly used for representing data that contains a hierarchical relationship
among various elements.
Examples of non-linear data structures are listed below:
Graphs
Family of trees
Table of contents
Tree: In this case, data often contain a hierarchical relationship among various
elements. The data structure that reflects this relationship is termed a rooted tree graph
or a tree.
Graph: In this case, data sometimes hold a relationship between pairs of elements
which does not necessarily follow a hierarchical structure. Such a data structure is
termed a graph.
An array is a container which can hold a fixed number of items, and these items should be of
the same type. Most data structures make use of arrays to implement their
algorithms. Following are the important terms to understand the concept of an array.
Element − Each item stored in an array is called an element.
Index − Each location of an element in an array has a numerical index, which is
used to identify the element.
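Array Representation: (Storage structure)
Arrays can be declared in various ways in different languages. For illustration, let's take the C
array declaration.
When a C array is given an initializer that lists fewer values than its size, the remaining
elements take the following default values:
Data Type    Default Value
bool         false
char         0
int          0
float        0.0f
double       0.0
void         (no value)
wchar_t      0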
Insertion Operation
Insert operation is to insert one or more data elements into an array. Based on the
requirement, a new element can be added at the beginning, end, or any given index of
the array.
Here, we see a practical implementation of the insertion operation, where we add data at
a given index of the array −
Algorithm
Let LA be a linear array (unordered) with N elements and let K be a positive integer such
that K <= N. Following is the algorithm where ITEM is inserted into the Kth position of LA −
1. Start
2. Set J = N
3. Set N = N + 1
4. Repeat steps 5 and 6 while J >= K
5. Set LA[J+1] = LA[J]
6. Set J = J - 1
7. Set LA[K] = ITEM
8. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>

#define MAX 10   /* capacity; must be larger than n so the new item fits */

int main() {
   int LA[MAX] = {1,3,5,7,8};
   int item = 10, k = 3, n = 5;
   int i = 0, j = n - 1;   /* j starts at the last used index */

   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }

   /* shift the elements at index k and beyond one place to the right */
   while(j >= k) {
      LA[j+1] = LA[j];
      j = j - 1;
   }

   LA[k] = item;
   n = n + 1;

   printf("The array elements after insertion :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
When we compile and execute the above program, it produces the following result −
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after insertion :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 10
LA[4] = 7
LA[5] = 8
Deletion Operation
Deletion refers to removing an existing element from the array and re-organizing all
elements of the array.
Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K <= N.
Following is the algorithm to delete an element available at the Kth position of LA.
1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set LA[J] = LA[J+1]
5. Set J = J + 1
6. Set N = N - 1
7. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>

int main() {
   int LA[] = {1,3,5,7,8};
   int k = 3, n = 5;
   int i, j;

   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }

   /* the Kth element sits at index k-1; shift its successors left over it */
   j = k;
   while(j < n) {
      LA[j-1] = LA[j];
      j = j + 1;
   }

   n = n - 1;

   printf("The array elements after deletion :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
When we compile and execute the above program, it produces the following result −
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after deletion :
LA[0] = 1
LA[1] = 3
LA[2] = 7
LA[3] = 8
Lecture-02
Search Operation
You can perform a search for an array element based on its value or its index.
Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K <= N.
Following is the algorithm to find an element with a value of ITEM using sequential
search.
1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF LA[J] is equal to ITEM THEN GOTO STEP 6
5. Set J = J + 1
6. PRINT J, ITEM
7. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>

int main() {
   int LA[] = {1,3,5,7,8};
   int item = 5, n = 5;
   int i = 0, j = 0;

   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }

   while(j < n) {
      if(LA[j] == item) {
         break;
      }
      j = j + 1;
   }

   /* j == n means the loop ended without a match */
   if(j < n) {
      printf("Found element %d at position %d\n", item, j+1);
   } else {
      printf("Element %d not found\n", item);
   }
   return 0;
}
When we compile and execute the above program, it produces the following result −
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
Found element 5 at position 3
Update Operation
Update operation refers to updating an existing element of the array at a given index.
Algorithm
Consider LA is a linear array with N elements and K is a positive integer such that K <= N.
Following is the algorithm to update an element available at the Kth position of LA.
1. Start
2. Set LA[K-1] = ITEM
3. Stop
Example
Following is the implementation of the above algorithm −
#include <stdio.h>

int main() {
   int LA[] = {1,3,5,7,8};
   int k = 3, n = 5, item = 10;
   int i;

   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }

   LA[k-1] = item;

   printf("The array elements after updation :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
When we compile and execute the above program, it produces the following result −
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after updation :
LA[0] = 1
LA[1] = 3
LA[2] = 10
LA[3] = 7
LA[4] = 8
Lecture-03
Sparse Matrix and its representations
A matrix is a two-dimensional data object made of m rows and n columns, therefore
having a total of m x n values. If most of the elements of the matrix have the value 0, then it is
called a sparse matrix.
Why use a sparse matrix instead of a simple matrix?
Storage: There are fewer non-zero elements than zeros, so less
memory is needed if only those elements are stored.
Computing time: Computing time can be saved by logically designing a data
structure that traverses only the non-zero elements.
Example:
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
Representing a sparse matrix by a 2D array wastes a lot of memory, as the
zeroes in the matrix are of no use in most cases. So, instead of storing the zeroes along with
the non-zero elements, we store only the non-zero elements, each as a
triple (Row, Column, Value).
A sparse matrix can be represented in many ways. Following are two common representations:
1. Array representation
2. Linked list representation
Method 1: Using Arrays
#include <stdio.h>

int main()
{
   // Assume 4x5 sparse matrix
   int sparseMatrix[4][5] =
   {
      {0, 0, 3, 0, 4},
      {0, 0, 5, 7, 0},
      {0, 0, 0, 0, 0},
      {0, 2, 6, 0, 0}
   };

   // Count the non-zero elements
   int size = 0;
   for (int i = 0; i < 4; i++)
      for (int j = 0; j < 5; j++)
         if (sparseMatrix[i][j] != 0)
            size++;

   // Row 0 holds row indices, row 1 column indices, row 2 values
   int compactMatrix[3][size];

   // Making of new matrix
   int k = 0;
   for (int i = 0; i < 4; i++)
      for (int j = 0; j < 5; j++)
         if (sparseMatrix[i][j] != 0)
         {
            compactMatrix[0][k] = i;
            compactMatrix[1][k] = j;
            compactMatrix[2][k] = sparseMatrix[i][j];
            k++;
         }

   for (int i = 0; i < 3; i++)
   {
      for (int j = 0; j < size; j++)
         printf("%d ", compactMatrix[i][j]);
      printf("\n");
   }
   return 0;
}
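Method 2: Using Linked Lists
The notes name the linked list representation but give no code for it. The following is a minimal sketch, assuming one list node per non-zero element that stores the (row, column, value) triple. The names SparseNode, sparse_append, sparse_from_dense, sparse_count and sparse_free are illustrative and not part of the original notes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One non-zero entry of the matrix, stored as a (row, column, value) triple. */
struct SparseNode {
    int row, col, value;
    struct SparseNode *next;
};

/* Append a triple at the end of the list; returns the (possibly new) head. */
struct SparseNode *sparse_append(struct SparseNode *head, int row, int col, int value) {
    struct SparseNode *node = malloc(sizeof *node);
    assert(node != NULL);               /* sketch: abort on allocation failure */
    node->row = row;
    node->col = col;
    node->value = value;
    node->next = NULL;
    if (head == NULL)
        return node;
    struct SparseNode *tail = head;
    while (tail->next != NULL)          /* walk to the last node */
        tail = tail->next;
    tail->next = node;
    return head;
}

/* Build the list from a dense matrix stored row by row (rows x cols ints). */
struct SparseNode *sparse_from_dense(const int *dense, int rows, int cols) {
    struct SparseNode *head = NULL;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            if (dense[i * cols + j] != 0)
                head = sparse_append(head, i, j, dense[i * cols + j]);
    return head;
}

/* Number of stored (non-zero) elements. */
int sparse_count(const struct SparseNode *head) {
    int count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Print one "row col value" triple per line. */
void sparse_print(const struct SparseNode *head) {
    for (; head != NULL; head = head->next)
        printf("%d %d %d\n", head->row, head->col, head->value);
}

/* Release every node of the list. */
void sparse_free(struct SparseNode *head) {
    while (head != NULL) {
        struct SparseNode *next = head->next;
        free(head);
        head = next;
    }
}
```

For the 4x5 example matrix, sparse_from_dense(&sparseMatrix[0][0], 4, 5) yields a list of six nodes. Appending walks the whole list each time, which keeps the sketch short; keeping a tail pointer would avoid the repeated traversal.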
Lecture-04
STACK
A stack is an Abstract Data Type (ADT), commonly used in most programming languages. It is
named stack as it behaves like a real-world stack, for example, a deck of cards or a pile of
plates.
A real-world stack allows operations at one end only. For example, we can place or remove a
card or plate from the top of the stack only. Likewise, Stack ADT allows all data operations at
one end only. At any given time, we can only access the top element of a stack.
This feature makes it LIFO data structure. LIFO stands for Last-in-first-out. Here, the element
which is placed (inserted or added) last, is accessed first. In stack terminology, insertion
operation is called PUSH operation and removal operation is called POP operation.
Stack Representation
[Figure: a stack and its operations]
A stack can be implemented by means of Array, Structure, Pointer, and Linked List. Stack can
either be a fixed size one or it may have a sense of dynamic resizing. Here, we are going to
implement stack using arrays, which makes it a fixed size stack implementation.
Basic Operations
Stack operations may involve initializing the stack, using it and then de-initializing it. Apart from
these basics, a stack is used for the following two primary operations −
push() − Pushing (storing) an element on the stack.
pop() − Removing (accessing) an element from the stack.
To use a stack efficiently, we need to check the status of the stack as well. For this purpose,
the following functionality is added to stacks −
peek() − get the top data element of the stack, without removing it.
isFull() − check if the stack is full.
isEmpty() − check if the stack is empty.
At all times, we maintain a pointer to the last PUSHed data on the stack. As this pointer always
represents the top of the stack, it is named top. The top pointer provides the top value of the
stack without actually removing it.
First we should learn about procedures to support stack functions −
peek()
Algorithm of peek() function −
begin procedure peek
   return stack[top]
end procedure
Implementation of peek() function in C programming language −
Example
int peek() {
   return stack[top];
}
isfull()
Algorithm of isfull() function −
begin procedure isfull
   if top equals to MAXSIZE - 1
      return true
   else
      return false
   endif
end procedure
Implementation of isfull() function in C programming language −
Example
bool isfull() {
   /* top is the index of the last pushed element, so the last valid index is MAXSIZE - 1 */
   if(top == MAXSIZE - 1)
      return true;
   else
      return false;
}
isempty()
Algorithm of isempty() function −
begin procedure isempty
   if top less than 1
      return true
   else
      return false
   endif
end procedure
Implementation of isempty() function in C programming language is slightly different. We
initialize top at -1, as the index in an array starts from 0. So we check if the top is below zero,
i.e. equal to -1, to determine if the stack is empty. Here's the code −
Example
bool isempty() {
   if(top == -1)
      return true;
   else
      return false;
}
Push Operation
The process of putting a new data element onto the stack is known as a Push Operation. Push
operation involves a series of steps −
Step 1 − Checks if the stack is full.
Step 2 − If the stack is full, produces an error and exits.
Step 3 − If the stack is not full, increments top to point to the next empty space.
Step 4 − Adds the data element to the stack location where top is pointing.
Step 5 − Returns success.
If a linked list is used to implement the stack, then in step 3, we need to allocate space dynamically.
Algorithm for PUSH Operation
A simple algorithm for the Push operation can be derived as follows −
begin procedure push: stack, data
   if stack is full
      return null
   endif
   top ← top + 1
   stack[top] ← data
end procedure
Implementation of this algorithm in C is very easy. See the following code −
Example
void push(int data) {
   if(!isfull()) {
      top = top + 1;
      stack[top] = data;
   } else {
      printf("Could not insert data, Stack is full.\n");
   }
}
Pop Operation
Accessing the content while removing it from the stack is known as a Pop Operation. In
an array implementation of the pop() operation, the data element is not actually removed; instead top is
decremented to a lower position in the stack to point to the next value. But in a linked-list
implementation, pop() actually removes the data element and deallocates memory space. A Pop
operation may involve the following steps −
Step 1 − Checks if the stack is empty.
Step 2 − If the stack is empty, produces an error and exits.
Step 3 − If the stack is not empty, accesses the data element at which top is pointing.
Step 4 − Decreases the value of top by 1.
Step 5 − Returns success.
Algorithm for Pop Operation
A simple algorithm for the Pop operation can be derived as follows −
begin procedure pop: stack
   if stack is empty
      return null
   endif
   data ← stack[top]
   top ← top - 1
   return data
end procedure
Implementation of this algorithm in C is as follows −
Example
int pop() {
   int data;
   if(!isempty()) {
      data = stack[top];
      top = top - 1;
      return data;
   } else {
      printf("Could not retrieve data, Stack is empty.\n");
      return -1;   /* sentinel value; assumes -1 is never stored as data */
   }
}
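The text above notes that a linked-list stack allocates space on push and deallocates it on pop. A minimal sketch of that variant follows, assuming a singly linked node type. The names StackNode, ll_push, ll_pop and ll_isempty are illustrative, and failure is reported through a bool return value rather than a printed message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One stack element; the top of the stack is the head of the list. */
struct StackNode {
    int data;
    struct StackNode *next;
};

/* The stack is empty when the top pointer is NULL. */
bool ll_isempty(const struct StackNode *top) {
    return top == NULL;
}

/* Push: allocate a node dynamically and make it the new top.
   Returns false if memory could not be allocated. */
bool ll_push(struct StackNode **top, int data) {
    assert(top != NULL);
    struct StackNode *node = malloc(sizeof *node);
    if (node == NULL)
        return false;
    node->data = data;
    node->next = *top;   /* new node sits above the old top */
    *top = node;
    return true;
}

/* Pop: copy the top value into *out, unlink the node and free its memory.
   Returns false if the stack is empty. */
bool ll_pop(struct StackNode **top, int *out) {
    assert(top != NULL && out != NULL);
    if (*top == NULL)
        return false;
    struct StackNode *node = *top;
    *out = node->data;
    *top = node->next;
    free(node);
    return true;
}
```

There is no isfull() here: the only limit is available memory, which ll_push reports by returning false.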
Lecture-05
Stack Applications
Expression evaluation
In particular we will consider arithmetic expressions. Understand that there are boolean and
logical expressions that can be evaluated in the same way. Control structures can also be treated
similarly in a compiler.
This study of arithmetic expression evaluation is an example of problem solving where you solve a
simpler problem and then transform the actual problem to the simpler one.
Aside: The NP-Complete problems. There is a set of apparently intractable problems, such as finding
the shortest tour through all the vertices of a graph (Traveling Salesman Problem), bin packing, and
integer linear programming, that are similar enough that if a polynomial-time solution is ever found
for one of them (exponential solutions abound), it can be transformed into a polynomial-time
solution for all of them.
Infix, Prefix and Postfix Notation
We are accustomed to writing arithmetic expressions with the operation between the two
operands: a+b or c/d. If we write a+b*c, however, we have to apply precedence rules to avoid
ambiguous evaluation (add first or multiply first?).
There's no real reason to put the operation between the variables or values. It can just as well
precede or follow the operands. You should note the advantage of prefix and postfix: the need for
precedence rules and parentheses is eliminated.
Infix            Prefix      Postfix
a+b              +ab         ab+
a+b*c            +a*bc       abc*+
(a+b)*(c-d)      *+ab-cd     ab+cd-*
b*b-4*a*c
40-3*5+1
Postfix expressions are easily evaluated with the aid of a stack.
Postfix Evaluation Algorithm
Assume we have a string of operands and operators. An informal, by-hand process is:
1. Scan the expression left to right
2. Skip values or variables (operands)
3. When an operator is found, apply the operation to the preceding two operands
4. Replace the two operands and operator with the calculated value (three symbols are
replaced with one operand)
5. Continue scanning until only a value remains, which is the result of the expression
The time complexity is O(n) because each operand is scanned once, and each operation is
performed once.
A more formal algorithm:
create a new stack
while(input stream is not empty){
   token = getNextToken();
   if(token instanceof operand){
      push(token);
   } else if (token instanceof operator) {
      op2 = pop();
      op1 = pop();
      result = calc(token, op1, op2);
      push(result);
   }
}
return pop();
Demonstration with 2 3 4 + * 5 -
Infix transformation to Postfix
This process uses a stack as well. We have to hold information that's expressed inside
parentheses while scanning to find the closing ')'. We also have to hold information on operations
that are of lower precedence on the stack. The algorithm is:
1. Create an empty stack and an empty postfix output string/stream
2. Scan the infix input string/stream left to right
3. If the current input token is an operand, simply append it to the output string (note in the
examples above that the operands remain in the same order)
4. If the current input token is an operator, pop off all operators that have equal or higher
precedence (stopping at any '(') and append them to the output string; push the operator onto
the stack. The order of popping is the order in the output.
5. If the current input token is '(', push it onto the stack
6. If the current input token is ')', pop off all operators and append them to the output string
until a '(' is popped; discard the '('.
7. If the end of the input string is found, pop all operators and append them to the output
string.
This algorithm doesn't handle errors in the input, although careful analysis of parentheses or lack
of parentheses could point to such error determination.
Apply the algorithm to the above expressions.
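The seven steps can be sketched in C as follows, assuming single-character operands (letters or digits), the four left-associative operators + - * /, and a fixed-size operator stack. Like the algorithm, the sketch does not detect malformed input. The names infix_to_postfix and precedence are illustrative.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Higher number binds tighter; anything else (including '(') gets 0. */
static int precedence(char op) {
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;
}

/* Converts an infix expression with single-character operands to postfix.
   out must have room for strlen(infix) + 1 characters. */
void infix_to_postfix(const char *infix, char *out) {
    char stack[128];                    /* operator stack; assumed capacity */
    int top = -1;
    size_t n = 0;
    assert(infix != NULL && out != NULL);
    assert(strlen(infix) < sizeof stack);

    for (const char *p = infix; *p != '\0'; p++) {
        char c = *p;
        if (isspace((unsigned char)c))
            continue;
        if (isalnum((unsigned char)c)) {
            out[n++] = c;               /* step 3: operands go straight to the output */
        } else if (c == '(') {
            stack[++top] = c;           /* step 5 */
        } else if (c == ')') {
            while (top >= 0 && stack[top] != '(')
                out[n++] = stack[top--];
            if (top >= 0)
                top--;                  /* step 6: discard the '(' */
        } else {
            /* step 4: pop operators of equal or higher precedence, then push */
            while (top >= 0 && stack[top] != '(' &&
                   precedence(stack[top]) >= precedence(c))
                out[n++] = stack[top--];
            stack[++top] = c;
        }
    }
    while (top >= 0)                    /* step 7: flush the remaining operators */
        out[n++] = stack[top--];
    out[n] = '\0';
}
```

Applied to the table above, "a+b*c" becomes "abc*+" and "(a+b)*(c-d)" becomes "ab+cd-*".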
Backtracking
Backtracking is used in algorithms in which there are steps along some path (state) from some
starting point to some goal.
Find your way through a maze.
Find a path from one point in a graph (roadmap) to another point.
Play a game in which there are moves to be made (checkers, chess).
In all of these cases, there are choices to be made among a number of options. We need some
way to remember these decision points in case we want/need to come back and try the
alternative.
Consider the maze. At a point where a choice is made, we may discover that the choice leads to a
dead-end. We want to retrace back to that decision point and then try the other (next) alternative.
Again, stacks can be used as part of the solution. Recursion is another, typically more favored,
solution, which is actually implemented by a stack.
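As an illustration of the maze case, the following sketch searches a small grid with an explicit stack of pending cells. It assumes a fixed 4x4 grid in which 0 marks an open cell and 1 a wall. The names Cell, maze_has_path, ROWS and COLS are illustrative and not part of the notes.

```c
#include <assert.h>
#include <stdbool.h>

#define ROWS 4
#define COLS 4

struct Cell {
    int r, c;
};

/* Returns true if goal can be reached from start through open cells (0),
   moving up, down, left or right. Walls are marked 1. */
bool maze_has_path(int maze[ROWS][COLS], struct Cell start, struct Cell goal) {
    struct Cell stack[ROWS * COLS];             /* each cell is pushed at most once */
    bool visited[ROWS][COLS] = {{false}};
    int top = -1;
    static const int dr[4] = {0, 1, 0, -1};
    static const int dc[4] = {1, 0, -1, 0};

    assert(maze[start.r][start.c] == 0);
    stack[++top] = start;
    visited[start.r][start.c] = true;

    while (top >= 0) {
        /* Popping returns to the most recently remembered alternative;
           after a dead end this is the backtracking step. */
        struct Cell cur = stack[top--];
        if (cur.r == goal.r && cur.c == goal.c)
            return true;
        for (int d = 0; d < 4; d++) {
            int nr = cur.r + dr[d];
            int nc = cur.c + dc[d];
            if (nr < 0 || nr >= ROWS || nc < 0 || nc >= COLS)
                continue;                       /* outside the grid */
            if (maze[nr][nc] != 0 || visited[nr][nc])
                continue;                       /* wall, or already remembered */
            visited[nr][nc] = true;
            stack[++top] = (struct Cell){nr, nc};
        }
    }
    return false;                               /* every alternative was exhausted */
}
```

Marking a cell as visited when it is pushed guarantees the stack never holds more than ROWS * COLS entries, so the fixed-size array is sufficient.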
Memory Management
Any modern computer environment uses a stack as the primary memory management model for
a running program. Whether it's native code (x86, Sun, VAX) or JVM, a stack is at the center of
the run-time environment for Java, C++, Ada, FORTRAN, etc.
The discussion of JVM in the text is consistent with NT, Solaris, VMS, Unix runtime
environments.
Each program that is running in a computer system has its own memory allocation with a
typical layout.
[Figure: typical memory layout of a running program]
Call and return process
When a method/function is called
1. An activation record (AR) is created; its size depends on the number and size of the local
variables and parameters.
2. The Base Pointer value is saved in the special location reserved for it
3. The Program Counter value is saved in the Return Address location
4. The Base Pointer is now reset to the new base (top of the call stack prior to the creation of
the AR)
5. The Program Counter is set to the location of the first bytecode of the method being
called
6. The calling parameters are copied into the Parameter region
7. Local variables are initialized in the local variable region
While the method executes, the local variables and parameters are simply found by adding a constant
associated with each variable/parameter to the Base Pointer.
When a method returns
1. Get the program counter from the activation record and replace what's in the PC
2. Get the base pointer value from the AR and replace what's in the BP
3. Pop the AR entirely from the stack.
Lecture-06
QUEUE
Queue is an abstract data structure, somewhat similar to stacks. Unlike stacks, a queue is open
at both its ends. One end is always used to insert data (enqueue) and the other is used to
remove data (dequeue). Queue follows First-In-First-Out methodology, i.e., the data item stored
first will be accessed first.
A real-world example of a queue can be a single-lane one-way road, where the vehicle that enters first
exits first. More real-world examples can be seen as queues at ticket windows and
bus stops.
Queue Representation
As we now understand, in a queue we access both ends for different reasons.
[Figure: queue representation as a data structure]
isfull()
Algorithm of isfull() function −
begin procedure isfull
   if rear equals to MAXSIZE - 1
      return true
   else
      return false
   endif
end procedure
Implementation of isfull() function in C programming language −
Example
bool isfull() {
   if(rear == MAXSIZE - 1)
      return true;
   else
      return false;
}
isempty()
Algorithm of isempty() function −
Algorithm
begin procedure isempty
   if front is less than MIN OR front is greater than rear
      return true
   else
      return false
   endif
end procedure
If the value of front is less than MIN or 0, it tells that the queue is not yet initialized, hence empty.
Here's the C programming code −
Example
bool isempty() {
   if(front < 0 || front > rear)
      return true;
   else
      return false;
}
Enqueue Operation
Queues maintain two data pointers, front and rear. Therefore, their operations are comparatively
more difficult to implement than those of stacks.
The following steps should be taken to enqueue (insert) data into a queue −
Step 1 − Check if the queue is full.
Step 2 − If the queue is full, produce an overflow error and exit.
Step 3 − If the queue is not full, increment the rear pointer to point to the next empty space.
Step 4 − Add the data element to the queue location where rear is pointing.
Step 5 − Return success.
Sometimes, we also check to see if a queue is initialized or not, to handle any unforeseen situations.
Algorithm for enqueue operation
procedure enqueue(data)
   if queue is full
      return overflow
   endif
   rear ← rear + 1
   queue[rear] ← data
   return true
end procedure
Implementation of enqueue() in C programming language −
Example
int enqueue(int data) {
   if(isfull())
      return 0;

   rear = rear + 1;
   queue[rear] = data;

   return 1;
}
Dequeue Operation
Accessing data from the queue is a process of two tasks − access the data where front is
pointing and remove the data after access. The following steps are taken to perform the dequeue
operation −
Step 1 − Check if the queue is empty.
Step 2 − If the queue is empty, produce an underflow error and exit.
Step 3 − If the queue is not empty, access the data where front is pointing.
Step 4 − Increment the front pointer to point to the next available data element.
Step 5 − Return success.
Algorithm for dequeue operation
procedure dequeue
   if queue is empty
      return underflow
   end if
   data = queue[front]
   front ← front + 1
   return data
end procedure
Implementation of dequeue() in C programming language −
Example
int dequeue() {
   if(isempty())
      return 0;   /* note: indistinguishable from a stored 0 */

   int data = queue[front];
   front = front + 1;

   return data;
}
Lecture-07
LINKED LIST
A linked list is a sequence of data structures, which are connected together via links.
Linked List is a sequence of links which contains items. Each link contains a connection
to another link. Linked list is the second most-used data structure after array. Following
are the important terms to understand the concept of Linked List.
Link − Each link of a linked list can store a data called an element.
Next − Each link of a linked list contains a link to the next link called Next.
LinkedList − A Linked List contains the connection link to the first link called First.
Linked List Representation
Linked list can be visualized as a chain of nodes, where every node points to the
next node.
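Insertion Operation
To insert a new node between two existing nodes, the new node is first pointed at the
right-hand node, and the left-hand node is then pointed at the new node. This will put the
new node in the middle of the two.
Similar steps should be taken if the node is being inserted at the beginning of the list.
While inserting it at the end, the second last node of the list should point to the new
node and the new node will point to NULL.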
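A minimal sketch of inserting into the middle of a list, assuming a node type with a single data field. The names ListNode and insert_after are illustrative; the program later in this lecture only inserts at the front.

```c
#include <assert.h>
#include <stdlib.h>

struct ListNode {
    int data;
    struct ListNode *next;
};

/* Inserts a new node holding data directly after left.
   Returns the new node, or NULL if memory could not be allocated. */
struct ListNode *insert_after(struct ListNode *left, int data) {
    assert(left != NULL);
    struct ListNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->data = data;
    node->next = left->next;   /* first: the new node points to the right-hand node */
    left->next = node;         /* then: the left-hand node points to the new node */
    return node;
}
```

The order of the two pointer assignments matters: setting left->next first would lose the only reference to the rest of the list. When left is the last node, left->next is NULL, so the new node correctly ends up pointing to NULL.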
Deletion Operation
Deletion is also a more than one step process. First, locate the target node to be
removed, by using searching algorithms. The node before the target node is then pointed
at the node after the target node.
This will remove the link that was pointing to the target node. Now, using the following
code, we will remove what the target node is pointing at.
TargetNode.next −> NULL;
If we need to use the deleted node, we can keep it in memory; otherwise we can simply
deallocate memory and wipe off the target node completely.
Reverse Operation
This operation is a thorough one. We need to make the head node point to the last node
and reverse the whole linked list.
First, we traverse to the end of the list. The last node should be pointing to NULL. Now, we
shall make it point to its previous node.
We have to make sure that the last node is not lost. So we'll have some temp
node, which looks like the head node pointing to the last node. Now, we shall make all
left side nodes point to their previous nodes one by one.
Except the node (first node) pointed to by the head node, all nodes should point to their
predecessor, making it their new successor. The first node will point to NULL.
The linked list is now reversed.
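Program:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

struct node {
   int data;
   int key;
   struct node *next;
};

struct node *head = NULL;
struct node *current = NULL;

//display the list
void printList() {
   struct node *ptr = head;
   printf("\n[ ");

   //start from the beginning
   while(ptr != NULL) {
      printf("(%d,%d) ", ptr->key, ptr->data);
      ptr = ptr->next;
   }

   printf("]");
}

//insert link at the first location
void insertFirst(int key, int data) {
   //create a link
   struct node *link = (struct node*) malloc(sizeof(struct node));

   link->key = key;
   link->data = data;

   //point it to old first node
   link->next = head;

   //point first to new first node
   head = link;
}

//delete first item
struct node* deleteFirst() {
   //save reference to first link
   struct node *tempLink = head;

   //mark next to first link as first
   head = head->next;

   //return the deleted link
   return tempLink;
}

//is list empty
bool isEmpty() {
   return head == NULL;
}

int length() {
   int length = 0;
   struct node *current;

   for(current = head; current != NULL; current = current->next) {
      length++;
   }

   return length;
}

//find a link with given key
struct node* find(int key) {
   struct node *current = head;

   //walk the list until the key matches or the list ends
   while(current != NULL && current->key != key) {
      current = current->next;
   }

   //NULL if the key was not found
   return current;
}

//delete a link with given key
struct node* delete(int key) {
   struct node *current = head;
   struct node *previous = NULL;

   //search for the key, remembering the previous link
   while(current != NULL && current->key != key) {
      previous = current;
      current = current->next;
   }

   //key not present (or list empty)
   if(current == NULL) {
      return NULL;
   }

   //found a match, update the link
   if(current == head) {
      //change first to point to next link
      head = head->next;
   } else {
      //bypass the current link
      previous->next = current->next;
   }

   return current;
}

//bubble sort on the data field
void sort() {
   int i, j, k, tempKey, tempData;
   struct node *current;
   struct node *next;

   int size = length();
   k = size;

   for(i = 0; i < size - 1; i++, k--) {
      current = head;
      next = head->next;

      for(j = 1; j < k; j++) {
         if(current->data > next->data) {
            tempData = current->data;
            current->data = next->data;
            next->data = tempData;

            tempKey = current->key;
            current->key = next->key;
            next->key = tempKey;
         }

         current = current->next;
         next = next->next;
      }
   }
}

void reverse(struct node** head_ref) {
   struct node* prev = NULL;
   struct node* current = *head_ref;
   struct node* next;

   while(current != NULL) {
      next = current->next;
      current->next = prev;
      prev = current;
      current = next;
   }

   *head_ref = prev;
}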
int main() {
   insertFirst(1,10);
   insertFirst(2,20);
   insertFirst(3,30);
   insertFirst(4,1);
   insertFirst(5,40);
   insertFirst(6,56);

   printf("Original List: ");

   //print list
   printList();

   while(!isEmpty()) {
      struct node *temp = deleteFirst();
      printf("\nDeleted value:");
      printf("(%d,%d) ", temp->key, temp->data);
   }

   printf("\nList after deleting all items: ");
   printList();
   insertFirst(1,10);
   insertFirst(2,20);
   insertFirst(3,30);
   insertFirst(4,1);
   insertFirst(5,40);
   insertFirst(6,56);

   printf("\nRestored List: ");
   printList();
   printf("\n");

   struct node *foundLink = find(4);

   if(foundLink != NULL) {
      printf("Element found: ");
      printf("(%d,%d) ", foundLink->key, foundLink->data);
      printf("\n");
   } else {
      printf("Element not found.");
   }

   delete(4);
   printf("List after deleting an item: ");
   printList();
   printf("\n");
   foundLink = find(4);

   if(foundLink != NULL) {
      printf("Element found: ");
      printf("(%d,%d) ", foundLink->key, foundLink->data);
      printf("\n");
   } else {
      printf("Element not found.");
   }

   printf("\n");
   sort();

   printf("List after sorting the data: ");
   printList();

   reverse(&head);
   printf("\nList after reversing the data: ");
   printList();
   return 0;
}
If we compile and run the above program, it will produce the following result −
Output
Original List:
[ (6,56) (5,40) (4,1) (3,30) (2,20) (1,10) ]
Deleted value:(6,56)
Deleted value:(5,40)
Deleted value:(4,1)
Deleted value:(3,30)
Deleted value:(2,20)
Deleted value:(1,10)
List after deleting all items:
[ ]
Restored List:
[ (6,56) (5,40) (4,1) (3,30) (2,20) (1,10) ]
Element found: (4,1)
List after deleting an item:
[ (6,56) (5,40) (3,30) (2,20) (1,10) ]
Element not found.
List after sorting the data:
[ (1,10) (2,20) (3,30) (5,40) (6,56) ]
List after reversing the data:
[ (6,56) (5,40) (3,30) (2,20) (1,10) ]
Lecture-08
Polynomial List
A polynomial p(x) is an expression in the variable x of the form (ax^n + bx^(n-1) + ... +
jx + k), where a, b, c, ..., k are real numbers and 'n' is a non-negative
integer, which is called the degree of the polynomial.
An important characteristic of a polynomial is that each term in the polynomial expression
consists of two parts:
one is the coefficient
other is the exponent
Example:
10x^2 + 26x, here 10 and 26 are the coefficients and 2, 1 are their exponent values.
Points to keep in mind while working with polynomials:
The sign of each coefficient and exponent is stored within the coefficient and the
exponent itself
Additional terms having an equal exponent are possible
The storage allocation for each term in the polynomial must be done in
ascending or descending order of their exponent
Representation of Polynomial
A polynomial can be represented in various ways. These are:
By the use of arrays
By the use of Linked List
Representation of Polynomials using Arrays
There may arise some situation where you need to evaluate many polynomial
expressions and perform basic arithmetic operations like addition and subtraction with
them. For this you will need a way to represent those polynomials. The
simple way is to represent a polynomial with degree 'n' by storing the coefficients of its n+1
terms in an array. So every array element will consist of two values:
Coefficient and
Exponent
Representation of Polynomial Using Linked Lists
A polynomial can be thought of as an ordered list of non-zero terms. Each non-zero term
is a two-tuple which holds two pieces of information:
The exponent part
The coefficient part
Adding two polynomials using Linked List
Given two polynomials represented by linked lists, write a function that adds these lists,
that is, adds the coefficients of the terms which have the same power.
Example:
Input:
1st number = 5x^2 + 4x^1 + 2x^0
2nd number = 5x^1 + 5x^0
Output:
5x^2 + 9x^1 + 7x^0
Input:
1st number = 5x^3 + 4x^2 + 2x^0
2nd number = 5x^1 + 5x^0
Output:
5x^3 + 4x^2 + 5x^1 + 7x^0
#include <stdio.h>
#include <stdlib.h>

// One term of a polynomial. Every list ends with an empty sentinel node.
struct Node
{
   int coeff;
   int pow;
   struct Node *next;
};

// Store the term (x, y) in the sentinel at the end of the list, then append a new sentinel
void create_node(int x, int y, struct Node **temp)
{
   struct Node *r, *z;
   z = *temp;
   if (z == NULL)
   {
      r = (struct Node*)malloc(sizeof(struct Node));
      *temp = r;
   }
   else
   {
      // walk to the sentinel node at the end of the list
      r = z;
      while (r->next != NULL)
         r = r->next;
   }
   r->coeff = x;
   r->pow = y;
   r->next = (struct Node*)malloc(sizeof(struct Node));
   r = r->next;
   r->next = NULL;
}

// Add poly1 and poly2 term by term and store the result in poly
void polyadd(struct Node *poly1, struct Node *poly2, struct Node *poly)
{
   while (poly1->next && poly2->next)
   {
      if (poly1->pow > poly2->pow)
      {
         poly->pow = poly1->pow;
         poly->coeff = poly1->coeff;
         poly1 = poly1->next;
      }
      else if (poly1->pow < poly2->pow)
      {
         poly->pow = poly2->pow;
         poly->coeff = poly2->coeff;
         poly2 = poly2->next;
      }
      else
      {
         // equal powers: add the coefficients
         poly->pow = poly1->pow;
         poly->coeff = poly1->coeff + poly2->coeff;
         poly1 = poly1->next;
         poly2 = poly2->next;
      }
      poly->next = (struct Node*)malloc(sizeof(struct Node));
      poly = poly->next;
      poly->next = NULL;
   }
   // copy the remaining terms of whichever list is not exhausted
   while (poly1->next || poly2->next)
   {
      if (poly1->next)
      {
         poly->pow = poly1->pow;
         poly->coeff = poly1->coeff;
         poly1 = poly1->next;
      }
      else if (poly2->next)
      {
         poly->pow = poly2->pow;
         poly->coeff = poly2->coeff;
         poly2 = poly2->next;
      }
      poly->next = (struct Node*)malloc(sizeof(struct Node));
      poly = poly->next;
      poly->next = NULL;
   }
}

// Print all terms; the trailing sentinel is skipped
void show(struct Node *node)
{
   while (node->next != NULL)
   {
      printf("%dx^%d", node->coeff, node->pow);
      node = node->next;
      if (node->next != NULL)
         printf(" + ");
   }
}

int main()
{
   struct Node *poly1 = NULL, *poly2 = NULL, *poly = NULL;

   // Create first list of 5x^2 + 4x^1 + 2x^0
   create_node(5, 2, &poly1);
   create_node(4, 1, &poly1);
   create_node(2, 0, &poly1);

   // Create second list of 5x^1 + 5x^0
   create_node(5, 1, &poly2);
   create_node(5, 0, &poly2);

   printf("1st Number: ");
   show(poly1);
   printf("\n2nd Number: ");
   show(poly2);

   poly = (struct Node*)malloc(sizeof(struct Node));
   poly->next = NULL;

   // Function add two polynomial numbers
   polyadd(poly1, poly2, poly);

   printf("\nAdded polynomial: ");
   show(poly);
   return 0;
}
Output:
1st Number: 5x^2 + 4x^1 + 2x^0
2nd Number: 5x^1 + 5x^0
Added polynomial: 5x^2 + 9x^1 + 7x^0
Lecture-09
Doubly Linked List
A Doubly Linked List (DLL) contains an extra pointer, typically called the previous pointer,
together with the next pointer and data which are there in a singly linked list.
Following is the representation of a DLL node in C language.
/* Node of a doubly linked list */
struct Node {
   int data;
   struct Node* next; // Pointer to next node in DLL
   struct Node* prev; // Pointer to previous node in DLL
};
Following are advantages/disadvantages of a doubly linked list over a singly linked list.
Advantages over singly linked list
1) A DLL can be traversed in both forward and backward direction.
2) The delete operation in a DLL is more efficient if a pointer to the node to be deleted is
given.
3) We can quickly insert a new node before a given node.
In a singly linked list, to delete a node, a pointer to the previous node is needed. To get
this previous node, sometimes the list is traversed. In a DLL, we can get the previous
node using the previous pointer.
Disadvantages over singly linked list
1) Every node of a DLL requires extra space for a previous pointer. It is possible to
implement a DLL with a single pointer though.
2) All operations require an extra pointer previous to be maintained. For example, in
insertion, we need to modify previous pointers together with next pointers. For
example, in the functions for insertion at different positions, we need 1 or 2 extra steps
to set the previous pointer.
Insertion
A node can be added in four ways:
1) At the front of the DLL
2) After a given node
3) At the end of the DLL
4) Before a given node
1) Add a node at the front: (A 5 steps process)
The new node is always added before the head of the given Linked List, and the newly
added node becomes the new head of the DLL. For example, if the given Linked List is
10, 15, 20, 25 and we add an item 5 at the front, then the Linked List becomes 5, 10, 15, 20, 25.
Let us call the function that adds at the front of the list push(). push() must receive
a pointer to the head pointer, because push must change the head pointer to point to
the new node.
2) Add a node after a given node: (A 7 steps process)
We are given a pointer to a node as prev_node, and the new node is inserted after the
given node.
3) Add a node at the end: (A 7 steps process)
The new node is always added after the last node of the given Linked List. For example,
if the given DLL is 5, 10, 15, 20, 25 and we add an item 30 at the end, then the DLL becomes
5, 10, 15, 20, 25, 30. Since a Linked List is typically represented by its head, we have to
traverse the list till the end and then change the next of the last node to the new node.
4) Add a node before a given node:
Steps
Let the pointer to this given node be next_node and the data of the new node to be
added be new_data.
1. Check if next_node is NULL or not. If it's NULL, return from the function, because a
new node cannot be added before a NULL.
2. Allocate memory for the new node, let it be called new_node.
3. Set new_node->data = new_data.
4. Set the previous pointer of new_node to the previous node of next_node:
new_node->prev = next_node->prev
5. Set the previous pointer of next_node to new_node: next_node->prev =
new_node
6. Set the next pointer of new_node to next_node: new_node->next = next_node
7. If the previous node of new_node is not NULL, then set the next pointer of this
previous node to new_node: new_node->prev->next = new_node. Otherwise new_node has
no predecessor, so it becomes the new head of the list.
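The four insertion cases can be sketched in C as follows. This is a minimal illustration rather than a definitive implementation: the names insertAfter, append and insertBefore, and the extra head_ref parameter of insertBefore (needed so the head can change when inserting before the first node), are assumptions not taken from the notes.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

/* 1) Insert at the front; head_ref is a pointer to the head pointer. */
void push(struct Node **head_ref, int new_data)
{
    struct Node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL) return;
    new_node->data = new_data;
    new_node->prev = NULL;
    new_node->next = *head_ref;
    if (*head_ref != NULL)
        (*head_ref)->prev = new_node;
    *head_ref = new_node;
}

/* 2) Insert after prev_node. */
void insertAfter(struct Node *prev_node, int new_data)
{
    struct Node *new_node;
    if (prev_node == NULL) return;
    new_node = malloc(sizeof *new_node);
    if (new_node == NULL) return;
    new_node->data = new_data;
    new_node->next = prev_node->next;
    new_node->prev = prev_node;
    prev_node->next = new_node;
    if (new_node->next != NULL)
        new_node->next->prev = new_node;
}

/* 3) Insert at the end: walk to the last node, then link the new one. */
void append(struct Node **head_ref, int new_data)
{
    struct Node *new_node = malloc(sizeof *new_node);
    struct Node *last = *head_ref;
    if (new_node == NULL) return;
    new_node->data = new_data;
    new_node->next = NULL;
    if (*head_ref == NULL) {
        new_node->prev = NULL;
        *head_ref = new_node;
        return;
    }
    while (last->next != NULL)
        last = last->next;
    last->next = new_node;
    new_node->prev = last;
}

/* 4) Insert before next_node (steps 1 to 7 above). */
void insertBefore(struct Node **head_ref, struct Node *next_node, int new_data)
{
    struct Node *new_node;
    if (next_node == NULL) return;
    new_node = malloc(sizeof *new_node);
    if (new_node == NULL) return;
    new_node->data = new_data;
    new_node->prev = next_node->prev;
    next_node->prev = new_node;
    new_node->next = next_node;
    if (new_node->prev != NULL)
        new_node->prev->next = new_node;
    else
        *head_ref = new_node;   /* inserted before the old head */
}
```

Starting from an empty list, calling append(&head, 10), push(&head, 5) and append(&head, 25), then inserting 15 after the node holding 10 and 20 before the node holding 25, gives the list 5, 10, 15, 20, 25, which can be walked in either direction.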
Lecture-10
Circular Linked List
A circular linked list is a linked list where all nodes are connected to form a circle. There is no NULL
at the end. A circular linked list can be a singly circular linked list or a doubly circular
linked list.
Advantages of Circular Linked Lists:
1) Any node can be a starting point. We can traverse the whole list by starting from any
point. We just need to stop when the first visited node is visited again.
2) Useful for implementation of a queue. Unlike the usual linked-list implementation, we don't need to
maintain two pointers for front and rear if we use a circular linked list. We can maintain a
pointer to the last inserted node, and front can always be obtained as next of last.
3) Circular lists are useful in applications that repeatedly go around the list. For example,
when multiple applications are running on a PC, it is common for the operating system to
put the running applications on a list and then to cycle through them, giving each of
them a slice of time to execute, and then making them wait while the CPU is given to
another application. It is convenient for the operating system to use a circular list so that
when it reaches the end of the list it can cycle around to the front of the list.
4) Circular Doubly Linked Lists are used for implementation of advanced data structures like
the Fibonacci Heap.
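Advantage 1 can be sketched as a traversal that starts at any node and stops when that node comes up again. The helper names appendCircular and countNodes are hypothetical and used only for this illustration.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Hypothetical helper: add a value at the end and return the new last node. */
struct Node *appendCircular(struct Node *last, int data)
{
    struct Node *temp = malloc(sizeof *temp);
    if (temp == NULL) return last;
    temp->data = data;
    if (last == NULL) {
        temp->next = temp;        /* a single node points to itself */
    } else {
        temp->next = last->next;  /* new node points to the front */
        last->next = temp;
    }
    return temp;
}

/* Visit every node exactly once, starting from any node; return the count. */
int countNodes(const struct Node *start)
{
    const struct Node *p = start;
    int count = 0;
    if (start == NULL) return 0;
    do {
        count++;
        p = p->next;
    } while (p != start);         /* stop when the first visited node is reached again */
    return count;
}
```

Because the list has no NULL terminator, the do-while form matters: the loop body must run once before the stop condition is tested, otherwise a traversal starting at start would end immediately.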
Insertion in an empty List
Inserting a node T:
After insertion, T is the last node, so the pointer last points to node T. Node T is both the first and
the last node, so T points to itself.
Function to insert a node in an empty List:
struct Node *addToEmpty(struct Node *last, int data)
{
    // This function is only for an empty list
    if (last != NULL)
        return last;
    // Creating a node dynamically (the parameter is reused, not redeclared).
    last = (struct Node *)malloc(sizeof(struct Node));
    // Assigning the data.
    last->data = data;
    // Note: list was empty. We link the single node
    // to itself.
    last->next = last;
    return last;
}
Insertion at the beginning of the list
To insert a node at the beginning of the list, follow these steps:
1. Create a node, say T.
2. Make T->next = last->next.
3. last->next = T.
Function to insert a node at the beginning of the List:
struct Node *addBegin(struct Node *last, int data)
{
    if (last == NULL)
        return addToEmpty(last, data);
    // Creating a node dynamically.
    struct Node *temp = (struct Node *)malloc(sizeof(struct Node));
    // Assigning the data.
    temp->data = data;
    // Adjusting the links.
    temp->next = last->next;
    last->next = temp;
    return last;
}
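Insertion at the end of the list
To insert a node at the end of the list, follow these steps:
1. Create a node, say T.
2. Make T->next = last->next.
3. last->next = T.
4. last = T.
Function to insert a node at the end of the List:
// The name addEnd is assumed here, by analogy with addBegin.
struct Node *addEnd(struct Node *last, int data)
{
    if (last == NULL)
        return addToEmpty(last, data);
    // Creating a node dynamically.
    struct Node *temp = (struct Node *)malloc(sizeof(struct Node));
    // Assigning the data.
    temp->data = data;
    // Adjusting the links.
    temp->next = last->next;
    last->next = temp;
    last = temp;
    return last;
}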
Insertion in between the nodes
To insert a node in between two nodes, follow these steps:
1. Create a node, say T.
2. Search for the node after which T needs to be inserted; say that node is P.
3. Make T->next = P->next.
4. P->next = T.
Suppose 12 needs to be inserted after the node having value 10.
Function to insert a node after a given item in the List:
struct Node *addAfter(struct Node *last, int data, int item)
{
    if (last == NULL)
        return NULL;
    struct Node *temp, *p;
    p = last->next;
    // Searching the item.
    do
    {
        if (p->data == item)
        {
            temp = (struct Node *)malloc(sizeof(struct Node));
            // Assigning the data.
            temp->data = data;
            // Adjusting the links.
            temp->next = p->next;
            // Adding newly allocated node after p.
            p->next = temp;
            // Checking for the last node.
            if (p == last)
                last = temp;
            return last;
        }
        p = p->next;
    } while (p != last->next);
    // Item not found: the list is unchanged.
    return last;
}
Lecture-11
Memory Allocation
1. Static Memory Allocation:
When memory is allocated at compile time, it is called ‘Static Memory
Allocation’. This memory is fixed and cannot be increased or decreased after
allocation. If more memory is allocated than required, memory is wasted. If
less memory is allocated than required, the program will not run successfully.
So the exact memory requirements must be known in advance.
2. Dynamic Memory Allocation:
When memory is allocated at run (execution) time, it is called ‘Dynamic Memory
Allocation’. This memory is not fixed and is allocated according to our requirements.
Thus there is no wastage of memory, and there is no need to know the exact memory
requirements in advance.
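The difference can be sketched in C with a fixed-size array next to blocks obtained from malloc() and realloc(). The names static_scores, make_sequence and grow_sequence are made up for this illustration.

```c
#include <stdlib.h>

#define STATIC_SIZE 5            /* fixed when the program is compiled */
int static_scores[STATIC_SIZE];  /* static allocation: the size cannot change */

/* Dynamic allocation: n is only known at run time.
   Returns a block holding 1..n, or NULL on failure; the caller must free() it. */
int *make_sequence(size_t n)
{
    int *seq = malloc(n * sizeof *seq);
    if (seq == NULL) return NULL;
    for (size_t i = 0; i < n; i++)
        seq[i] = (int)(i + 1);
    return seq;
}

/* Grow an existing block from old_n to new_n elements with realloc(). */
int *grow_sequence(int *seq, size_t old_n, size_t new_n)
{
    int *bigger = realloc(seq, new_n * sizeof *bigger);
    if (bigger == NULL) return NULL;   /* the old block is still valid */
    for (size_t i = old_n; i < new_n; i++)
        bigger[i] = (int)(i + 1);
    return bigger;
}
```

A block obtained this way stays allocated until it is passed to free(), so the program itself decides how long the memory is held.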
Garbage Collection
Whenever a node is deleted, some memory space becomes reusable. This memory
space should be available for future use. One way to do this is to immediately insert the
free space into the availability list. But this method may be time consuming for the operating
system. So another method is used, which is called ‘Garbage Collection’. In this
method the OS collects the deleted space from time to time onto the
availability list. This process happens in two steps. In the first step, the OS goes through all
the lists and tags all those cells which are currently being used. In the second step, the
OS goes through the memory again, collects the untagged space and adds this collected
space to the availability list. Garbage collection may occur when only a small amount of free
space is left in the system, when no free space is left in the system, or when the CPU is idle and
has time to do the garbage collection.
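The two steps can be sketched on a small pool of cells linked by array index. The pool layout, the names mark and sweep, and the tagged field are assumptions made for this illustration, not part of any real operating system interface.

```c
#include <stddef.h>

#define POOL_SIZE 8
#define NIL (-1)

/* One memory cell; cells are linked through array indices. */
struct Cell {
    int data;
    int next;     /* index of the next cell, or NIL */
    int tagged;   /* set during the first (tagging) step */
};

struct Cell pool[POOL_SIZE];
int avail = NIL;  /* head of the availability list */

/* Step 1: tag every cell reachable from the heads of the lists in use. */
void mark(const int *heads, size_t nheads)
{
    for (int i = 0; i < POOL_SIZE; i++)
        pool[i].tagged = 0;
    for (size_t h = 0; h < nheads; h++)
        for (int c = heads[h]; c != NIL && !pool[c].tagged; c = pool[c].next)
            pool[c].tagged = 1;
}

/* Step 2: scan the whole pool and put untagged cells on the availability list.
   Returns the number of cells collected. */
int sweep(void)
{
    int collected = 0;
    avail = NIL;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].tagged) {
            pool[i].next = avail;
            avail = i;
            collected++;
        }
    }
    return collected;
}
```

With two lists in use, 0 -> 2 -> 5 and 1 -> 3, the cells 4, 6 and 7 are untagged after mark() and end up on the availability list after sweep().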
Compaction
One preferable solution to garbage collection is compaction. The process of moving all
marked nodes to one end of memory and all available memory to the other end is called
compaction. An algorithm which performs compaction is called a compacting algorithm.
Lecture-12
Infix to Postfix Conversion
#include <stdio.h>
#include <ctype.h>

char stack[20];
int top = -1;

void push(char x)
{
    stack[++top] = x;
}

char pop()
{
    if (top == -1)
        return -1;
    else
        return stack[top--];
}

// Precedence of an operator; '(' is lowest so an operator never pops it
int priority(char x)
{
    if (x == '(')
        return 0;
    if (x == '+' || x == '-')
        return 1;
    if (x == '*' || x == '/')
        return 2;
    return -1;
}

int main()
{
    char exp[20];
    char *e, x;
    printf("Enter the expression :: ");
    scanf("%19s", exp);
    e = exp;
    while (*e != '\0')
    {
        if (isalnum(*e))
            printf("%c", *e);      // operands go straight to the output
        else if (*e == '(')
            push(*e);
        else if (*e == ')')
        {
            // pop until the matching '('
            while ((x = pop()) != '(')
                printf("%c", x);
        }
        else
        {
            // pop operators of higher or equal precedence first
            while (top != -1 && priority(stack[top]) >= priority(*e))
                printf("%c", pop());
            push(*e);
        }
        e++;
    }
    while (top != -1)
    {
        printf("%c", pop());
    }
    return 0;
}
OUTPUT:
Enter the expression :: a+b*c
abc*+
Enter the expression :: (a+b)*c+(d-a)
ab+c*da-+
Evaluate POSTFIX Expression Using Stack
#include <stdio.h>
#include <ctype.h>

int stack[20];
int top = -1;

void push(int x)
{
    stack[++top] = x;
}

int pop()
{
    return stack[top--];
}

int main()
{
    char exp[20];
    char *e;
    int n1, n2, n3 = 0, num;
    printf("Enter the expression :: ");
    scanf("%19s", exp);
    e = exp;
    while (*e != '\0')
    {
        if (isdigit(*e))
        {
            // single-digit operand: convert the character to its value
            num = *e - '0';
            push(num);
        }
        else
        {
            // operator: n1 is the right operand, n2 the left operand
            n1 = pop();
            n2 = pop();
            switch (*e)
            {
            case '+':
                n3 = n1 + n2;
                break;
            case '-':
                n3 = n2 - n1;
                break;
            case '*':
                n3 = n1 * n2;
                break;
            case '/':
                n3 = n2 / n1;
                break;
            }
            push(n3);
        }
        e++;
    }
    printf("\nThe result of expression %s = %d\n\n", exp, pop());
    return 0;
}
Output:
Enter the expression :: 245+*
The result of expression 245+* = 18
Lecture-13
Binary Tree
A binary tree consists of a finite set of nodes that is either empty, or consists of one
specially designated node called the root of the binary tree, and the elements of two
disjoint binary trees called the left subtree and right subtree of the root.
Note that the definition above is recursive: we have defined a binary tree in terms of
binary trees. This is appropriate since recursion is an innate characteristic of tree
structures.
Diagram 1: A binary tree
Binary Tree Terminology
Tree terminology is generally derived from the terminology of family trees (specifically,
the type of family tree called a lineal chart).
Each root is said to be the parent of the roots of its subtrees.
Two nodes with the same parent are said to be siblings; they are the children of
their parent.
The root node has no parent.
A great deal of tree processing takes advantage of the relationship between a
parent and its children, and we commonly say a directed edge (or simply an edge)
extends from a parent to its children. Thus edges connect a root with the roots of
each subtree. An undirected edge extends in both directions between a parent and
a child.
Grandparent and grandchild relations can be defined in a similar manner; we
could also extend this terminology further if we wished (designating nodes as
cousins, as an uncle or aunt, etc.).
Other Tree Terms
Lecture-14
Special Forms of Binary Trees
There are a few special forms of binary tree worth mentioning.
If every non-leaf node in a binary tree has nonempty left and right subtrees, the tree is
termed a strictly binary tree. Or, to put it another way, all of the nodes in a strictly binary
tree are of degree zero or two, never degree one. A strictly binary tree with N leaves
always contains 2N - 1 nodes.
Some texts call this a "full" binary tree.
A complete binary tree of depth d is the strictly binary tree all of whose leaves are at level
d.
The total number of nodes in a complete binary tree of depth d equals 2^(d+1) - 1. Since all
leaves in such a tree are at level d, the tree contains 2^d leaves and, therefore, 2^d - 1
internal nodes.
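These counts follow from adding up the nodes level by level, since level i of a complete binary tree holds 2^i nodes (the root being at level 0):

```latex
\begin{align*}
\text{total nodes} &= \sum_{i=0}^{d} 2^{i} = 2^{d+1} - 1, \\
\text{leaves} &= 2^{d}, \\
\text{internal nodes} &= \left(2^{d+1} - 1\right) - 2^{d} = 2^{d} - 1 .
\end{align*}
```

For example, with d = 2 the tree has 7 nodes: 4 leaves and 3 internal nodes, which also agrees with the 2N - 1 rule for a strictly binary tree with N = 4 leaves.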
Diagram 2: A complete binary tree
A binary tree of depth d is an almost complete binary tree if:
Each leaf in the tree is either at level d or at level d - 1.
For any node nd in the tree with a right descendant at level d, all the left descendants of
nd that are leaves are also at level d.
Diagram 3: An almost complete binary tree
An almost complete strictly binary tree with N leaves has 2N - 1 nodes (as does any
other strictly binary tree). An almost complete binary tree with N leaves that is not
strictly binary has 2N nodes. There are two distinct almost complete binary trees with N leaves,
one of which is strictly binary and one of which is not.
There is only a single almost complete binary tree with N nodes. This tree is strictly
binary if and only if N is odd.
Representing Binary Trees in Memory
Array Representation
For a complete or almost complete binary tree, storing the binary tree as an array may
be a good choice.
One way to do this is to store the root of the tree in the first element of the array. Then,
for each node in the tree that is stored at subscript k, the node's left child can be stored
at subscript 2k+1 and the right child can be stored at subscript 2k+2. For example, the
almost complete binary tree shown in Diagram 2 can be stored in an array using this technique.
Linked Representation
If a binary tree is not complete or almost complete, a better choice for storing it is to use a
linked representation similar to the linked list structures covered earlier in the semester:
Each tree node has two pointers (usually named left and right). The tree class has a pointer to the root
node of the tree (labeled root in the diagram above).
Any pointer in the tree structure that does not point to a node will normally contain
the value NULL. A linked tree with N nodes will always contain N + 1 null links.
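Both representations can be sketched in a few lines of C. The struct name TreeNode and the function names below are assumptions made for this illustration; count_null_links() lets the N + 1 property be checked on any linked tree.

```c
#include <stddef.h>

struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Array representation: subscripts of the relatives of the node at subscript k. */
size_t left_child(size_t k)  { return 2 * k + 1; }
size_t right_child(size_t k) { return 2 * k + 2; }
size_t parent(size_t k)      { return (k - 1) / 2; }   /* only valid for k > 0 */

/* Linked representation: number of nodes in the tree. */
int count_nodes(const struct TreeNode *root)
{
    if (root == NULL) return 0;
    return 1 + count_nodes(root->left) + count_nodes(root->right);
}

/* Linked representation: number of NULL links.
   An empty tree counts as one NULL link (the root pointer itself). */
int count_null_links(const struct TreeNode *root)
{
    if (root == NULL) return 1;
    return count_null_links(root->left) + count_null_links(root->right);
}
```

For a tree with 4 nodes, count_null_links() returns 5, as the N + 1 rule predicts.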
Lecture-15
Tree Traversal:
Traversal is a process to visit all the nodes of a tree and may print their values too.
Because all nodes are connected via edges (links), we always start from the root (head)
node. That is, we cannot randomly access a node in a tree. There are three ways which
we use to traverse a tree −
In-order Traversal
Pre-order Traversal
Post-order Traversal
Generally, we traverse a tree to search or locate a given item or key in the tree or to print all the
values it contains.
In-order Traversal
In this traversal method, the left subtree is visited first, then the root and later the right
sub-tree. We should always remember that every node may represent a subtree
itself. If a binary search tree is traversed in-order, the output will produce sorted key values in
ascending order.
Pre-order Traversal
In this traversal method, the root node is visited first, then the left subtree and finally the
right subtree.
We start from A, and following pre-order traversal, we first visit A itself and then move to
its left subtree B. B is also traversed pre-order. The process goes on until all the nodes
are visited. The output of pre-order traversal of this tree will be −
A → B → D → E → C → F → G
Algorithm
Until all nodes are traversed: visit the root node, then recursively traverse the left subtree,
then recursively traverse the right subtree.
Post-order Traversal
In this traversal method, the root node is visited last: first the left subtree, then the
right subtree and finally the root node.
We start from A, and following post-order traversal, we first visit the left subtree B. B is
also traversed post-order. The process goes on until all the nodes are visited. The
output of post-order traversal of this tree will be −
D → E → B → F → G → C → A
Algorithm
Until all nodes are traversed: recursively traverse the left subtree, then recursively traverse
the right subtree, then visit the root node.
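The three traversal orders can be sketched recursively in C. The struct TNode and the convention of writing the visited labels into a caller-supplied buffer are assumptions made for this illustration; the tree shape used in the usage note (A with children B and C, B with D and E, C with F and G) is inferred from the traversal outputs quoted in the text.

```c
#include <stddef.h>

struct TNode {
    char label;
    struct TNode *left;
    struct TNode *right;
};

/* Each function appends the visited labels to out and returns the new length. */

size_t inorder(const struct TNode *n, char *out, size_t len)
{
    if (n == NULL) return len;
    len = inorder(n->left, out, len);     /* left subtree */
    out[len++] = n->label;                /* root */
    return inorder(n->right, out, len);   /* right subtree */
}

size_t preorder(const struct TNode *n, char *out, size_t len)
{
    if (n == NULL) return len;
    out[len++] = n->label;                /* root */
    len = preorder(n->left, out, len);    /* left subtree */
    return preorder(n->right, out, len);  /* right subtree */
}

size_t postorder(const struct TNode *n, char *out, size_t len)
{
    if (n == NULL) return len;
    len = postorder(n->left, out, len);   /* left subtree */
    len = postorder(n->right, out, len);  /* right subtree */
    out[len++] = n->label;                /* root */
    return len;
}
```

On the seven-node tree described above, preorder() produces A B D E C F G and postorder() produces D E B F G C A, matching the outputs in the text; inorder() produces D B E A F C G.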
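Lecture-16
AVL Trees
An AVL tree is a binary search tree in which the heights of the two sub-trees of every node
differ by at most one, and every sub-tree is itself an AVL tree.
You need to be careful with this definition: it permits some apparently unbalanced trees!
For example, here are some trees:
Tree / AVL tree?
First tree: Yes. Examination shows that each left sub-tree has a height 1 greater than
each right sub-tree.
Second tree: No. The sub-tree with root 8 has height 4 and the sub-tree
with root 18 has height 2.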
Insertion
As with the red-black tree, insertion is somewhat complex and involves a number of
cases. Implementations of AVL tree insertion may be found in many textbooks: they rely
on adding an extra attribute, the balance factor, to each node. This factor indicates
whether the tree is left-heavy (the height of the left sub-tree is 1 greater than the right
sub-tree), balanced (both sub-trees are the same height) or right-heavy (the height of the
right sub-tree is 1 greater than the left sub-tree). If the balance would be destroyed by an
insertion, a rotation is performed to correct the balance.
A new item has been added to the left subtree of node 1, causing its height to become 2
greater than 2's right sub-tree (shown in green). A right-rotation is performed
to correct the imbalance.
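A right rotation can be sketched as follows. This is only the rotation step, not a full AVL insertion; the struct AvlNode, its stored height field and the function names are assumptions made for this illustration.

```c
#include <stddef.h>

struct AvlNode {
    int key;
    int height;               /* height of the sub-tree rooted here; a leaf has height 1 */
    struct AvlNode *left;
    struct AvlNode *right;
};

static int height(const struct AvlNode *n) { return n ? n->height : 0; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Balance factor: positive means left-heavy, negative means right-heavy. */
int balance_factor(const struct AvlNode *n)
{
    return n ? height(n->left) - height(n->right) : 0;
}

/* Right rotation around y; returns the new root of this sub-tree. */
struct AvlNode *rotate_right(struct AvlNode *y)
{
    struct AvlNode *x = y->left;
    y->left = x->right;       /* x's right sub-tree moves under y */
    x->right = y;             /* y becomes x's right child */
    /* y is now below x, so its height must be recomputed first */
    y->height = 1 + max_int(height(y->left), height(y->right));
    x->height = 1 + max_int(height(x->left), height(x->right));
    return x;
}
```

For a left-leaning chain 3, 2, 1 (balance factor 2 at the root), rotate_right() makes 2 the new root with children 1 and 3, and the balance factor returns to 0.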
Lecture-17
B+-tree
In a B+-tree, each node stores up to d references to children and up to d − 1 keys. Each
reference is considered “between” two of the node's keys; it references the root of a
subtree for which all values are between these two keys.
Here is a fairly small tree using 4 as our value for d.
A B+-tree requires that each leaf be the same distance from the root, as in this picture,
where searching for any of the 11 values (all listed on the bottom level) will involve
loading three nodes from the disk (the root block, a second-level block, and a leaf).
In practice, d will be larger: as large, in fact, as it takes to fill a disk block. Suppose a
block is 4KB, our keys are 4-byte integers, and each reference is a 6-byte file offset.
Then we'd choose d to be the largest value so that 4(d − 1) + 6d ≤ 4096; solving this
inequality for d, we end up with d ≤ 410, so we'd use 410 for d. As you can see, d can be large.
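The inequality can be solved step by step, with d − 1 keys of 4 bytes and d references of 6 bytes fitting in a 4096-byte block:

```latex
\begin{align*}
4(d - 1) + 6d &\le 4096 \\
10d - 4 &\le 4096 \\
10d &\le 4100 \\
d &\le 410
\end{align*}
```

With d = 410, a node holds 409 keys and 410 references, which occupy 4 · 409 + 6 · 410 = 1636 + 2460 = 4096 bytes, exactly one block.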
A B+-tree maintains the following invariants:
Every node has one more reference than it has keys.
All leaves are at the same distance from the root.
For every non-leaf node N with k being the number of keys in N: all keys in the first
child's subtree are less than N's first key; and all keys in the ith child's subtree (2 ≤
i ≤ k) are between the (i − 1)th key of N and the ith key of N.
The root has at least two children.
Every non-leaf, non-root node has at least floor(d / 2) children.
Each leaf contains at least floor(d / 2) keys.
Every key from the table appears in a leaf, in left-to-right sorted order.
In our examples, we'll continue to use 4 for d. Looking at our invariants, this requires that
each leaf have at least two keys, and each internal node have at least two children
(and thus at least one key).
2. Insertion algorithm
Descend to the leaf where the key fits.
1. If the node has an empty space, insert the key/reference pair into the node.
2. If the node is already full, split it into two nodes, distributing the keys evenly
between the two nodes. If the node is a leaf, take a copy of the minimum value in
the second of these two nodes and repeat this insertion algorithm to insert it into
the parent node. If the node is a non-leaf, exclude the middle value during the
split and repeat this insertion algorithm to insert this excluded value into the
parent node.
(Figures: the initial tree, and the tree after inserting 20, 13, 15, 10, 11 and 12.)
3. Deletion algorithm
Descend to the leaf where the key exists.
1. Remove the required key and associated reference from the node.
2. If the node still has enough keys and references to satisfy the invariants, stop.
3. If the node has too few keys to satisfy the invariants, but its next oldest or next
youngest sibling at the same level has more than necessary, distribute the keys
between this node and the neighbor. Repair the keys in the level above to
represent that these nodes now have a different “split point” between them; this
involves simply changing a key in the levels above, without deletion or insertion.
4. If the node has too few keys to satisfy the invariant, and the next oldest or next
youngest sibling is at the minimum for the invariant, then merge the node with its
sibling; if the node is a non-leaf, we will need to incorporate the “split key” from
the parent into our merging. In either case, we will need to repeat the removal
algorithm on the parent node to remove the “split key” that previously separated
these merged nodes, unless the parent is the root and we are removing the final
key from the root, in which case the merged node becomes the new root (and the
tree has become one level shorter than before).
(Figures: the initial tree, and the tree after deleting 13, 15 and 1.)
Expression Trees:
Trees are used in many other ways in computer science. Compilers and databases
are two major examples in this regard. In the case of compilers, when the languages are
translated into machine language, tree-like structures are used. We have also seen an
example of an expression tree comprising a mathematical expression. Let’s have more
discussion on expression trees. We will see what the benefits of expression trees are
and how we can build an expression tree. Following is the figure of an expression tree.
In the above tree, the expression on the left side is a + b * c while on the right side, we
have d * e + f * g. If you look at the figure, it becomes evident that the inner nodes
contain operators while leaf nodes have operands. We know that there are two types of
nodes in the tree, i.e. inner nodes and leaf nodes. The leaf nodes are those nodes which
have left and right subtrees as null. You will find these at the bottom level of the tree.
The leaf nodes are connected with the inner nodes. So in trees, we have some inner
nodes and some leaf nodes.
In the above diagram, all the inner nodes (the nodes which have either a left or right child
or both) have operators. In this case, we have + or * as operators, whereas leaf nodes
contain operands only, i.e. a, b, c, d, e, f, g. This tree is binary as the operators are
binary. We have discussed the evaluation of postfix and infix expressions and have seen
that binary operators need two operands. In infix expressions, one operand
is on the left side of the operator and the other is on the right side. For example, with the
+ operator, we write 2 + 4, and in the case of multiplication, we write
5 * 6. We may also have unary operators like negation (-), or NOT in Boolean expressions.
In this example, all the operators are binary. Therefore, this tree is a binary
tree. It is not a Binary Search Tree. In a BST, the values on the left side of a node
are smaller and the values on the right side are greater than the node. Here we have an
expression tree with no sorting process involved.
It is not necessary that every inner node of an expression tree has two children. Suppose we have a
unary operator like negation. In this case, we have a node which has (-) in it and there is
only one node under it. It means just negate that operand.
Let’s talk about the traversal of the expression tree. The inorder traversal may be
executed here.
Lecture-18
Binary Search Tree (BST)
A Binary Search Tree (BST) is a tree in which all the nodes follow the below-mentioned
properties −
The left sub-tree of a node has keys less than its parent node's key.
The right sub-tree of a node has keys greater than or equal to its parent node's
key.
Thus, BST divides all its sub-trees into two segments, the left sub-tree and the right sub-
tree, and can be defined as −
left_subtree (keys) < node (key) ≤ right_subtree (keys)
Representation
BST is a collection of nodes arranged in a way where they maintain BST properties.
Each node has a key and an associated value. While searching, the desired key is
compared to the keys in BST and if found, the associated value is retrieved.
Following is a pictorial representation of BST −
We observe that the root node key (27) has all less-valued keys on the left sub-tree and
the higher valued keys on the right sub-tree.
Basic Operations
Following are the basic operations of a tree −
Search − Searches an element in a tree.
Insert − Inserts an element in a tree.
Pre-order Traversal − Traverses a tree in a pre-order manner.
In-order Traversal − Traverses a tree in an in-order manner.
Post-order Traversal − Traverses a tree in a post-order manner.
Node
Define a node having some data and references to its left and right child nodes.
struct node {
    int data;
    struct node *leftChild;
    struct node *rightChild;
};
Search Operation
Whenever an element is to be searched, start searching from the root node. Then if the
data is less than the key value, search for the element in the left subtree. Otherwise,
search for the element in the right subtree. Follow the same algorithm for each node.
Algorithm
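struct node *search(int data) {
    struct node *current = root;
    printf("Visiting elements: ");
    // Stop at the matching node, or at NULL when the key is not found
    while (current != NULL && current->data != data) {
        printf("%d ", current->data);
        // Go left for smaller keys, right otherwise
        if (data < current->data) {
            current = current->leftChild;
        } else {
            current = current->rightChild;
        }
    }
    return current;
}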
Insert Operation
Whenever an element is to be inserted, first locate its proper location. Start searching
from the root node, then if the data is less than the key value, search for the empty
location in the left subtree and insert the data. Otherwise, search for the empty location in
the right subtree and insert the data.
Algorithm
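void insert(int data) {
    struct node *tempNode = (struct node *)malloc(sizeof(struct node));
    struct node *current;
    struct node *parent;
    tempNode->data = data;
    tempNode->leftChild = NULL;
    tempNode->rightChild = NULL;
    // If the tree is empty, the new node becomes the root
    if (root == NULL) {
        root = tempNode;
        return;
    }
    current = root;
    while (1) {
        parent = current;
        if (data < parent->data) {          // go to left of the tree
            current = current->leftChild;
            if (current == NULL) {          // insert to the left
                parent->leftChild = tempNode;
                return;
            }
        } else {                            // go to right of the tree
            current = current->rightChild;
            if (current == NULL) {          // insert to the right
                parent->rightChild = tempNode;
                return;
            }
        }
    }
}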
Lecture-19
Graphs Terminology
A graph G = (V, E) consists of a set V of vertices and a set E of edges, where each edge
connects a pair of vertices.
In the above graph,
V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}
Then a graph can be:
Directed graph (di-graph) if all the edges are directed
Undirected graph (graph) if all the edges are undirected
Mixed graph if some edges are directed and some are undirected
Illustrate terms on graphs
End-vertices of an edge are the endpoints of the edge.
Two vertices are adjacent if they are endpoints of the same edge.
An edge is incident on a vertex if the vertex is an endpoint of the edge.
Outgoing edges of a vertex are directed edges for which the vertex is the origin.
Incoming edges of a vertex are directed edges for which the vertex is the destination.
Degree of a vertex v, denoted deg(v), is the number of incident edges.
Out-degree, outdeg(v), is the number of outgoing edges.
In-degree, indeg(v), is the number of incoming edges.
Parallel edges or multiple edges are edges of the same type and end-vertices.
Self-loop is an edge with the end vertices the same vertex.
Simple graphs have no parallel edges or self-loops.
Properties
If a graph G has m edges then Σ_{v∈G} deg(v) = 2m
If a di-graph G has m edges then
Σ_{v∈G} indeg(v) = m = Σ_{v∈G} outdeg(v)
If a simple graph G has m edges and n vertices:
If G is also directed then m ≤ n(n-1)
If G is also undirected then m ≤ n(n-1)/2
So a simple graph with n vertices has O(n^2) edges at most
More Terminology
Path is a sequence of alternating vertices and edges such that each successive vertex is
connected by the edge. Frequently only the vertices are listed, especially if there are no
parallel edges.
Cycle is a path that starts and ends at the same vertex.
Simple path is a path with distinct vertices.
Directed path is a path of only directed edges.
Directed cycle is a cycle of only directed edges.
Sub-graph is a subset of vertices and edges.
Spanning sub-graph contains all the vertices.
Connected graph has all pairs of vertices connected by at least one path.
Connected component is a maximal connected sub-graph of an unconnected graph.
Forest is a graph without cycles.
Tree is a connected forest (the previous type of trees are called rooted trees; these are free
trees).
Spanning tree is a spanning subgraph that is also a tree.
More Properties
If G is an undirected graph with n vertices and m edges:
If G is connected then m ≥ n - 1
If G is a tree then m = n - 1
If G is a forest then m ≤ n - 1
Graph Traversal:
1. Depth First Search
2. Breadth First Search
Lecture-20
Depth First Search:
Depth First Search (DFS) algorithm traverses a graph in a depthward motion and uses
a stack to remember where to get the next vertex to start a search when a dead end occurs in
any iteration.
As in the example given above, DFS algorithm traverses from S to A to D to G to E to B
first, then to F and lastly to C. It employs the following rules.
Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a
stack.
Rule 2 − If no adjacent vertex is found, pop a vertex from the stack. (It will
pop all the vertices from the stack which do not have adjacent vertices.)
Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
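The three rules can be sketched in C with an explicit stack over an adjacency matrix. The function name dfs, the MAX_V limit and the choice of the lowest-numbered unvisited neighbour (standing in for "alphabetical order") are assumptions made for this illustration.

```c
#include <stdbool.h>

#define MAX_V 8

/* Iterative DFS on the first n vertices of adj.
   Writes the visit order into order[] and returns the number of vertices visited. */
int dfs(int n, int adj[MAX_V][MAX_V], int start, int order[MAX_V])
{
    bool visited[MAX_V] = { false };
    int stack[MAX_V];
    int top = -1, count = 0;

    /* Visit the start vertex and push it. */
    visited[start] = true;
    order[count++] = start;
    stack[++top] = start;

    while (top != -1) {                        /* Rule 3: until the stack is empty */
        int v = stack[top];
        int next = -1;
        for (int w = 0; w < n; w++) {          /* lowest-numbered unvisited neighbour */
            if (adj[v][w] && !visited[w]) { next = w; break; }
        }
        if (next != -1) {                      /* Rule 1: visit, record, push */
            visited[next] = true;
            order[count++] = next;
            stack[++top] = next;
        } else {                               /* Rule 2: dead end, pop */
            top--;
        }
    }
    return count;
}
```

Each vertex is pushed at most once, so a stack of MAX_V entries is enough. On a five-vertex graph with edges S-A, S-B, S-C, A-D, B-D and C-D (numbered S=0, A=1, B=2, C=3, D=4), the visit order is S, A, D, B, C.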
Initialize the stack.
Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S.
We have three nodes and we can pick any of them. For this example, we shall take the
nodes in alphabetical order.
Visit D, mark it as visited and put it onto the stack. Here, we have B and C nodes,
which are adjacent to D and both are unvisited. However, we shall again choose in
alphabetical order.
We choose B, mark it as visited and put it onto the stack. Here B does not have any
unvisited adjacent node. So, we pop B from the stack.
Lecture-21
Breadth First Search
Initialize the queue.
We start from visiting S (starting node), and mark it as visited.
We then see an unvisited adjacent node from S. In this example, we have three nodes
but alphabetically we choose A, mark it as visited and enqueue it.
From A we have D as the unvisited adjacent node. We mark it as visited and enqueue
it.
At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm
we keep on dequeuing in order to get all unvisited nodes. When the queue gets
emptied, the program is over.
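The procedure can be sketched in C with an array-based queue over an adjacency matrix. The function name bfs and the MAX_V limit are assumptions made for this illustration; neighbours are taken in increasing vertex number, which plays the role of alphabetical order.

```c
#include <stdbool.h>

#define MAX_V 8

/* BFS on the first n vertices of adj.
   Writes the visit order into order[] and returns the number of vertices visited. */
int bfs(int n, int adj[MAX_V][MAX_V], int start, int order[MAX_V])
{
    bool visited[MAX_V] = { false };
    int queue[MAX_V];
    int front = 0, rear = 0, count = 0;

    /* Visit the start vertex and enqueue it. */
    visited[start] = true;
    order[count++] = start;
    queue[rear++] = start;

    while (front != rear) {                /* until the queue is empty */
        int v = queue[front++];            /* dequeue */
        for (int w = 0; w < n; w++) {
            if (adj[v][w] && !visited[w]) {
                visited[w] = true;         /* mark, record and enqueue */
                order[count++] = w;
                queue[rear++] = w;
            }
        }
    }
    return count;
}
```

On a five-vertex graph with edges S-A, S-B, S-C, A-D, B-D and C-D (numbered S=0, A=1, B=2, C=3, D=4), the visit order is S, A, B, C, D: all neighbours of S are visited before D, which is first reached from A.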
Lecture-22
Graph representation
You can represent a graph in many ways. The two most common ways of
representing a graph are as follows:
Adjacency matrix
An adjacency matrix is a V x V binary matrix A. Element A[i][j] is 1 if there is an edge from
vertex i to vertex j, else A[i][j] is 0.
Note: A binary matrix is a matrix in which the cells can have only one of two possible values:
either a 0 or a 1.
The adjacency matrix can also be modified for a weighted graph, in which instead
of storing 0 or 1 in A[i][j], the weight or cost of the edge will be stored.
In an undirected graph, if A[i][j] = 1, then A[j][i] = 1. In a directed graph, if A[i][j] = 1, then A[j][i] may or
may not be 1.
The adjacency matrix provides constant time access (O(1)) to determine if there is an edge
between two nodes. The space complexity of the adjacency matrix is O(V^2).
The adjacency matrix of the following graph is:
i/j : 1 2 3 4
1 : 0 1 0 1
2 : 1 0 1 0
3 : 0 1 0 1
4 : 1 0 1 0
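The matrix above can be stored directly as a two-dimensional array in C. The helper names has_edge, degree and is_symmetric are assumptions made for this sketch; vertices 1 to 4 of the text are stored at indices 0 to 3.

```c
#define V 4

/* Adjacency matrix of the undirected example graph. */
int A[V][V] = {
    {0, 1, 0, 1},
    {1, 0, 1, 0},
    {0, 1, 0, 1},
    {1, 0, 1, 0}
};

/* O(1) edge test; i and j are 1-based vertex numbers as in the text. */
int has_edge(int i, int j) { return A[i - 1][j - 1]; }

/* Degree of vertex i: the number of 1s in row i. */
int degree(int i)
{
    int d = 0;
    for (int j = 0; j < V; j++)
        d += A[i - 1][j];
    return d;
}

/* The matrix of an undirected graph must be symmetric. */
int is_symmetric(void)
{
    for (int i = 0; i < V; i++)
        for (int j = 0; j < V; j++)
            if (A[i][j] != A[j][i])
                return 0;
    return 1;
}
```

This graph has the four edges 1-2, 1-4, 2-3 and 3-4, and the degrees add up to 8, which is twice the number of edges.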
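Adjacency list
The other common representation is the adjacency list, which stores for each vertex i a
list Ai of the vertices adjacent to it. For the undirected graph above:
A1 → 2 → 4
A2 → 1 → 3
A3 → 2 → 4
A4 → 1 → 3
Consider the same directed graph from the adjacency matrix. The adjacency list of
the graph is as follows:
A1 → 2
A2 → 4
A3 → 1 → 4
A4 → 2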
Lecture-23
Topological Sorting:
Topological sorting for a Directed Acyclic Graph (DAG) is a linear ordering of
vertices such that for every directed edge uv, vertex u comes before v in the ordering.
Topological sorting for a graph is not possible if the graph is not a DAG.
For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be
more than one topological sorting for a graph. For example, another topological sorting
of the following graph is “4 5 2 3 1 0”. The first vertex in topological sorting is always a
vertex with in-degree as 0 (a vertex with no in-coming edges).
Algorithm to find Topological Sorting:
In DFS, we start from a vertex, we first print it and then recursively call DFS for its
adjacent vertices. In topological sorting, we use a temporary stack. We don’t print the
vertex immediately, we first recursively call topological sorting for all its adjacent
vertices, then push it to a stack. Finally, print contents of stack. Note that a vertex is
pushed to stack only when all of its adjacent vertices (and their adjacent vertices and so
on) are already in stack.
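The stack-based procedure can be sketched in C as follows. The figure of the example graph is not reproduced in these notes, so the edge set used in the usage note (5→2, 5→0, 4→0, 4→1, 2→3, 3→1) is an assumption chosen to be consistent with the orderings quoted above; the function names are also made up for this illustration.

```c
#include <stdbool.h>

#define N 6

int adj[N][N];   /* adj[u][v] = 1 when there is a directed edge u -> v */

/* Recursive helper: push v only after everything reachable from it. */
static void visit(int v, bool visited[N], int stack[N], int *top)
{
    visited[v] = true;
    for (int w = 0; w < N; w++)
        if (adj[v][w] && !visited[w])
            visit(w, visited, stack, top);
    stack[++(*top)] = v;
}

/* Fills order[] with a topological ordering of the N vertices (graph must be a DAG). */
void topological_sort(int order[N])
{
    bool visited[N] = { false };
    int stack[N];
    int top = -1;
    for (int v = 0; v < N; v++)
        if (!visited[v])
            visit(v, visited, stack, &top);
    for (int i = 0; i < N; i++)
        order[i] = stack[top--];   /* popping gives the reverse of the finishing order */
}
```

With the assumed edges, and vertices tried in the order 0 to 5, the result is 5 4 2 3 1 0, the first ordering mentioned in the text.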
Topological Sorting vs Depth First Traversal (DFS):
In DFS, we print a vertex and then recursively call DFS for its adjacent vertices. In
topological sorting, we need to print a vertex before its adjacent vertices. For example, in
the given graph, the vertex ‘5’ should be printed before vertex ‘0’, but unlike DFS, the
vertex ‘4’ should also be printed before vertex ‘0’. So topological sorting is different
from DFS. For example, a DFS of the shown graph is “5 2 3 1 0 4”, but it is not a
topological sorting.
DynamicProgramming
The Floyd Warshall Algorithmis for solving the All Pairs Shortest Path problem. The
problem is to find shortest distances between every pair of vertices in a given edge
weighted directed Graph.
Example:
Input:
graph[][]={{0,5,INF,10},
{INF,0,3,INF},
{INF,INF,0,1},
{INF,INF,INF,0}}
whichrepresentsthefollowinggraph 10
(0) ------->(3)
| /|\
5| |
| |1
\|/ |
(1) ------->(2)
3
Note thatthe valueofgraph[i][j] is0ifi isequalto j
Andgraph[i][j]isINF(infinite)ifthereis noedgefromvertexitoj.
Output:
Shortestdistance matrix
0 5 8 9
INF 0 3 4
INF INF 0 1
INF INF INF 0
FloydWarshallAlgorithm
We initialize the solution matrix same as the input graph matrix as a first step. Then we
update the solution matrix by considering all vertices as an intermediate vertex. Theidea
is to one by one pick all vertices and update all shortest paths which include the picked
vertex as an intermediate vertex in the shortest path. When we pick vertex number k as
an intermediate vertex, we already have considered vertices{0, 1, 2, .. k-1} as
intermediate vertices. For every pair (i, j) of source and destination vertices respectively,
there are two possible cases.
1) k is not an intermediate vertex in shortest path from i to j. We keep the value of
dist[i][j] as it is.
2) k is an intermediate vertex in shortest path fromi to j. We update the value of dist[i][j]
as dist[i][k] + dist[k][j].
The following figure shows the above optimal substructure property in the all-pairs
shortest path problem.
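The two cases above translate into a triple loop. The following C sketch assumes the 4-vertex example graph shown earlier; the names V, INF and floyd_warshall are chosen for illustration, and INF is assumed to be represented by INT_MAX.

```c
#include <limits.h>

#define V 4            /* number of vertices in the example graph */
#define INF INT_MAX    /* assumed sentinel meaning "no edge" */

/* Computes all-pairs shortest distances; dist starts as a copy of graph. */
void floyd_warshall(int graph[V][V], int dist[V][V])
{
    for (int i = 0; i < V; i++)
        for (int j = 0; j < V; j++)
            dist[i][j] = graph[i][j];

    /* Allow vertex k as an intermediate vertex, one k at a time. */
    for (int k = 0; k < V; k++)
        for (int i = 0; i < V; i++)
            for (int j = 0; j < V; j++) {
                /* Skip unreachable legs so that INF + x never overflows. */
                if (dist[i][k] == INF || dist[k][j] == INF)
                    continue;
                /* Case 2: going through k is shorter, so update. */
                if (dist[i][k] + dist[k][j] < dist[i][j])
                    dist[i][j] = dist[i][k] + dist[k][j];
            }
}
```

Run on the example input, this gives dist[0][2] = 8 and dist[0][3] = 9, matching the output matrix shown in the example. The three nested loops make the running time O(V^3).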
Lecture-24
Bubble Sort
We take an unsorted array for our example. Bubble sort takes Ο(n^2) time, so we're keeping it
short and precise.
Bubble sort starts with the very first two elements, comparing them to check which one
is greater.
The new array should look like this −
We know then that 10 is smaller than 35. Hence they are not sorted.
We swap these values. We find that we have reached the end of the array. After one
iteration, the array should look like this −
To be precise, we are now showing how an array should look after each iteration.
After the second iteration, it should look like this −
Notice that after each iteration, at least one value moves to the end.
And when there's no swap required, bubble sort learns that the array is completely sorted.
Now we should look into some practical aspects of bubble sort.
Algorithm
We assume list is an array of n elements. We further assume that the swap function swaps the
values of the given array elements.
begin BubbleSort(list)
   for all elements of list
      if list[i] > list[i+1]
         swap(list[i], list[i+1])
      end if
   end for
   return list
end BubbleSort
Pseudocode
We observe in the algorithm that Bubble Sort compares each pair of array elements until
the whole array is completely sorted in ascending order. This may cause a few
complexity issues: what if the array needs no more swapping because all the elements are
already ascending?
To ease out the issue, we use one flag variable swapped which will help us see if any
swap has happened or not. If no swap has occurred, i.e. the array requires no more
processing to be sorted, it will come out of the loop.
Pseudocode of the Bubble Sort algorithm can be written as follows −
procedure bubbleSort(list : array of items)
   loop = list.count;
   for i = 0 to loop-1 do:
      swapped = false
      for j = 0 to loop-2 do:
         /* compare the adjacent elements */
         if list[j] > list[j+1] then
            /* swap them */
            swap(list[j], list[j+1])
            swapped = true
         end if
      end for
      /* if no number was swapped that means array is
         sorted now, break the loop. */
      if (not swapped) then
         break
      end if
   end for
   return list
end procedure
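The pseudocode above can be written in C roughly as follows. This is a sketch under the assumption that the items are ints and the length is passed explicitly; the function name bubble_sort is chosen for illustration. It also shortens the inner loop by i on each pass, since the last i elements are already in their final positions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sorts list[0..n-1] in ascending order, stopping early once a pass makes no swap. */
void bubble_sort(int list[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        bool swapped = false;

        /* After pass i, the last i elements are already in place. */
        for (size_t j = 0; j + 1 < n - i; j++) {
            if (list[j] > list[j + 1]) {
                int tmp = list[j];
                list[j] = list[j + 1];
                list[j + 1] = tmp;
                swapped = true;
            }
        }

        /* No swap in this pass means the array is sorted. */
        if (!swapped)
            break;
    }
}
```

On an already sorted array the first pass performs no swap, so the function returns after n-1 comparisons.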
Lecture-25
Insertion Sort
We take an unsorted array for our example.
Insertion sort compares the first two elements.
It finds that both 14 and 33 are already in ascending order. For now, 14 is in the sorted
sub-list.
Insertion sort moves ahead and compares 33 with 27.
And finds that 33 is not in the correct position.
It swaps 33 with 27. It also checks with all the elements of the sorted sub-list. Here we see
that the sorted sub-list has only one element 14, and 27 is greater than 14. Hence, the
sorted sub-list remains sorted after swapping.
These values are not in a sorted order.
So we swap them.
However, swapping makes 27 and 10 unsorted.
Hence, we swap them too.
We swap them again. By the end of the third iteration, we have a sorted sub-list of 4 items.
This process goes on until all the unsorted values are covered in a sorted sub-list. Now
we shall see some programming aspects of insertion sort.
Algorithm
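procedure insertionSort(A : array of items)
   for i = 1 to length(A) - 1 do:
      /* select the value to be inserted */
      valueToInsert = A[i]
      holePosition = i
      while holePosition > 0 and A[holePosition-1] > valueToInsert do:
         A[holePosition] = A[holePosition-1]
         holePosition = holePosition - 1
      end while
      A[holePosition] = valueToInsert
   end for
end procedure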
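A C version of the same hole-shifting loop might look like the sketch below. It assumes int items and an explicit length; the names insertion_sort, hole and value_to_insert mirror the pseudocode but are otherwise illustrative.

```c
#include <stddef.h>

/* Sorts a[0..n-1] in ascending order by growing a sorted sub-list on the left. */
void insertion_sort(int a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int value_to_insert = a[i];
        size_t hole = i;

        /* Shift larger elements of the sorted sub-list one slot to the right. */
        while (hole > 0 && a[hole - 1] > value_to_insert) {
            a[hole] = a[hole - 1];
            hole--;
        }

        /* Drop the saved value into the hole that opened up. */
        a[hole] = value_to_insert;
    }
}
```

Shifting elements and writing the saved value once does the same work as the repeated swaps in the walkthrough, with fewer assignments.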
Lecture-26
Selection Sort
Consider the following depicted array as an example.
For the first position in the sorted list, the whole list is scanned sequentially. For the first
position, where 14 is stored presently, we search the whole list and find that 10 is the
lowest value.
So we swap 14 with 10. After one iteration 10, which happens to be the minimum
value in the list, appears in the first position of the sorted list.
For the second position, where 33 is residing, we start scanning the rest of the list in a
linear manner.
We find that 14 is the second lowest value in the list and it should appear at the second
place. We swap these values.
After two iterations, the two least values are positioned at the beginning in a sorted manner.
Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted
Pseudocode
procedure selection sort
   list : array of items
   n : size of list
   for i = 1 to n - 1
      /* set current element as minimum */
      min = i
      /* check the element to be minimum */
      for j = i+1 to n
         if list[j] < list[min] then
            min = j;
         end if
      end for
      /* swap the minimum element with the current element */
      if min != i then
         swap list[min] and list[i]
      end if
   end for
end procedure
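In C, where arrays are indexed from 0 rather than 1, the pseudocode above can be sketched as follows. The function name selection_sort and the use of int items are assumptions for this example.

```c
#include <stddef.h>

/* Sorts list[0..n-1] in ascending order by repeatedly selecting the minimum. */
void selection_sort(int list[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        /* Assume the current element is the minimum of the unsorted part. */
        size_t min = i;

        /* Scan the rest of the list for a smaller element. */
        for (size_t j = i + 1; j < n; j++)
            if (list[j] < list[min])
                min = j;

        /* Swap the minimum into position i if it is not already there. */
        if (min != i) {
            int tmp = list[min];
            list[min] = list[i];
            list[i] = tmp;
        }
    }
}
```

The two nested loops always run to completion, so selection sort makes about n^2/2 comparisons regardless of the input order, but at most n-1 swaps.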
Lecture-27
Merge Sort
To understand merge sort, we take an unsorted array as the following −
We know that merge sort first divides the whole array iteratively into equal halves
until the atomic values are achieved. We see here that an array of 8 items is divided
into two arrays of size 4.
This does not change the sequence of appearance of items in the original. Now we
divide these two arrays into halves.
We further divide these arrays and we achieve atomic values which can no more be
divided.
Now, we combine them in exactly the same manner as they were broken down. Please
note the color codes given to these lists.
We first compare the element for each list and then combine them into another list in a
sorted manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10
and in the target list of 2 values we put 10 first, followed by 27. We change the order of
19 and 35 whereas 42 and 44 are placed sequentially.
In the next iteration of the combining phase, we compare lists of two data values, and
merge them into a list of four data values placing all in a sorted order.
Now we should learn some programming aspects of merge sorting.
Algorithm
Merge sort keeps on dividing the list into equal halves until it can no more be divided.
By definition, if it is only one element in the list, it is sorted. Then, merge sort combines
the smaller sorted lists keeping the new list sorted too.
Step 1 − if it is only one element in the list it is already sorted, return.
Step 2 − divide the list recursively into two halves until it can no more be divided.
Step 3 − merge the smaller lists into new list in sorted order.
Merge sort works with recursion and we shall see our implementation in the same way.
procedure mergesort( var a as array )
   if ( n == 1 ) return a
   var l1 as array = a[0] ... a[n/2 - 1]
   var l2 as array = a[n/2] ... a[n - 1]
   l1 = mergesort( l1 )
   l2 = mergesort( l2 )
   return merge( l1, l2 )
end procedure
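procedure merge( var a as array, var b as array )
   var c as array
   while ( a and b have elements )
      if ( a[0] > b[0] )
         add b[0] to the end of c
         remove b[0] from b
      else
         add a[0] to the end of c
         remove a[0] from a
      end if
   end while
   while ( a has elements )
      add a[0] to the end of c
      remove a[0] from a
   end while
   while ( b has elements )
      add b[0] to the end of c
      remove b[0] from b
   end while
   return c
end procedure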
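The merge sort procedure described above can be sketched in C as follows. Instead of creating new lists l1 and l2, this version sorts index ranges of one array and merges through a single scratch buffer; that design, and the names merge, sort_range and merge_sort, are assumptions of this sketch rather than the only way to implement the algorithm.

```c
#include <stdlib.h>
#include <string.h>

/* Merges the sorted runs a[lo..mid) and a[mid..hi) using scratch buffer tmp. */
static void merge(int a[], int tmp[], size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;

    /* Take the smaller front element; ties go to the left run (stable). */
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    /* Copy whatever remains in either run. */
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];

    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof a[0]);
}

/* Recursively sorts a[lo..hi). */
static void sort_range(int a[], int tmp[], size_t lo, size_t hi)
{
    if (hi - lo < 2)
        return;                      /* zero or one element is already sorted */
    size_t mid = lo + (hi - lo) / 2;
    sort_range(a, tmp, lo, mid);
    sort_range(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int merge_sort(int a[], size_t n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Each level of recursion does O(n) work in merge and there are about log2(n) levels, which gives the O(n log n) running time of merge sort.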
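Lecture-28
Quick Sort
function partitionFunc(left, right, pivot)
   leftPointer = left - 1
   rightPointer = right
   while True do
      while A[++leftPointer] < pivot do
         //do-nothing
      end while
      while rightPointer > 0 && A[--rightPointer] > pivot do
         //do-nothing
      end while
      if leftPointer >= rightPointer
         break
      else
         swap A[leftPointer], A[rightPointer]
      end if
   end while
   swap A[leftPointer], A[right]
   return leftPointer
end function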
Quick Sort Algorithm
Using the pivot algorithm recursively, we end up with the smallest possible partitions. Each
partition is then processed for quick sort. We define the recursive algorithm for quicksort as
follows −
Step 1 − Make the right-most index value pivot
Step 2 − partition the array using pivot value
Step 3 − quicksort left partition recursively
Step 4 − quicksort right partition recursively
Quick Sort Pseudocode
To get more into it, let us see the pseudocode for the quick sort algorithm −
procedure quickSort(left, right)
   if right - left <= 0
      return
   else
      pivot = A[right]
      partition = partitionFunc(left, right, pivot)
      quickSort(left, partition - 1)
      quickSort(partition + 1, right)
   end if
end procedure
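The four steps above can be sketched in C as follows. The function names partition, quick_sort and swap_ints are chosen for this example; the pivot is the right-most element as in Step 1, and the right-hand scan is bounded by left rather than 0 so that it never leaves the current partition.

```c
/* Exchanges two ints. */
static void swap_ints(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Partitions a[left..right] around pivot = a[right]; returns the pivot's final index. */
static int partition(int a[], int left, int right)
{
    int pivot = a[right];
    int lp = left - 1;   /* scans rightwards for an element >= pivot */
    int rp = right;      /* scans leftwards for an element <= pivot */

    for (;;) {
        while (a[++lp] < pivot) { /* do nothing */ }
        while (rp > left && a[--rp] > pivot) { /* do nothing */ }
        if (lp >= rp)
            break;
        swap_ints(&a[lp], &a[rp]);
    }
    /* Put the pivot between the two partitions. */
    swap_ints(&a[lp], &a[right]);
    return lp;
}

/* Sorts a[left..right] (both bounds inclusive) in ascending order. */
void quick_sort(int a[], int left, int right)
{
    if (right - left <= 0)
        return;
    int p = partition(a, left, right);
    quick_sort(a, left, p - 1);
    quick_sort(a, p + 1, right);
}
```

An array of n elements is sorted by calling quick_sort(a, 0, n - 1). The left scan always stops at index right at the latest, because a[right] equals the pivot and is therefore not smaller than it.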
Lecture-29
Heap Sort
Heap sort is a comparison based sorting technique based on Binary Heap data
structure. It is similar to selection sort where we first find the maximum element and
place the maximum element at the end. We repeat the same process for the remaining
elements.
What is Binary Heap?
Let us first define a Complete Binary Tree. A complete binary tree is a binary tree in
which every level, except possibly the last, is completely filled, and all nodes are as far
left as possible.
A Binary Heap is a Complete Binary Tree where items are stored in a special order such that
the value in a parent node is greater (or smaller) than the values in its two children nodes.
The former is called a max heap and the latter is called a min heap. The heap can be
represented by a binary tree or an array.
Why array based representation for Binary Heap?
Since a Binary Heap is a Complete Binary Tree, it can be easily represented as array
and array based representation is space efficient. If the parent node is stored at index I,
the left child can be calculated by 2 * I + 1 and right child by 2 * I + 2 (assuming the
indexing starts at 0).
Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last
item of the heap followed by reducing the size of heap by 1. Finally, heapify the root of
tree.
3. Repeat step 2 while the size of the heap is greater than 1.
How to build the heap?
Heapify procedure can be applied to a node only if its children nodes are heapified. So
the heapification must be performed in the bottom up order.
Let's understand with the help of an example:
Input data: 4, 10, 3, 5, 1
         4(0)
        /    \
    10(1)    3(2)
    /   \
 5(3)   1(4)
The numbers in bracket represent the indices in the array
representation of data.
Applying heapify procedure to index 1:
         4(0)
        /    \
    10(1)    3(2)
    /   \
 5(3)   1(4)
Applying heapify procedure to index 0:
        10(0)
        /    \
     5(1)    3(2)
    /   \
 4(3)   1(4)
The heapify procedure calls itself recursively to build heap in
top down manner.
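The heapify procedure and the three steps of the heap sort algorithm can be sketched in C as follows. The sketch assumes a max heap of ints stored in an array with the root at index 0, so the children of index i are at 2*i + 1 and 2*i + 2; the function names heapify, heap_sort and swap_ints are illustrative.

```c
#include <stddef.h>

/* Exchanges two ints. */
static void swap_ints(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Sifts a[i] down within a[0..n) until the subtree rooted at i is a max heap.
 * Assumes both child subtrees of i are already max heaps. */
static void heapify(int a[], size_t n, size_t i)
{
    size_t largest = i;
    size_t left = 2 * i + 1;
    size_t right = 2 * i + 2;

    if (left < n && a[left] > a[largest])
        largest = left;
    if (right < n && a[right] > a[largest])
        largest = right;

    if (largest != i) {
        swap_ints(&a[i], &a[largest]);
        heapify(a, n, largest);   /* continue down the affected subtree */
    }
}

/* Sorts a[0..n-1] in increasing order. */
void heap_sort(int a[], size_t n)
{
    /* Step 1: build the max heap bottom-up, starting from the last parent. */
    for (size_t i = n / 2; i-- > 0; )
        heapify(a, n, i);

    /* Steps 2 and 3: move the maximum to the end, shrink the heap, repeat. */
    for (size_t end = n; end-- > 1; ) {
        swap_ints(&a[0], &a[end]);
        heapify(a, end, 0);
    }
}
```

For the input 4, 10, 3, 5, 1 the build phase produces the heap 10, 5, 3, 4, 1 shown in the example, and the sort finishes with 1, 3, 4, 5, 10.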
Radix Sort
The lower bound for comparison based sorting algorithms (Merge Sort, Heap Sort, Quick Sort,
etc.) is Ω(nLogn), i.e., they cannot do better than nLogn.
Counting sort is a linear time sorting algorithm that sorts in O(n+k) time when
elements are in range from 1 to k.
What if the elements are in range from 1 to n^2?
We can’t use counting sort because counting sort will take O(n^2) which is worse than
comparison based sorting algorithms. Can we sort such an array in linear time? Radix Sort is
the answer. The idea of Radix Sort is to do digit by digit sort starting from least
significant digit to most significant digit. Radix sort uses counting sort as a subroutine to
sort.
Lecture-30
Radix Sort
1) Do following for each digit i where i varies from least significant digit to the most
significant digit.
   a) Sort input array using counting sort (or any stable sort) according to the i’th
   digit.
Example:
Original, unsorted list:
170, 45, 75, 90, 802, 24, 2, 66
Sorting by least significant digit (1s place) gives: [*Notice that we keep 802 before 2,
because 802 occurred before 2 in the original list, and similarly for pairs 170 & 90 and 45
& 75.]
170, 90, 802, 2, 24, 45, 75, 66
Sorting by next digit (10s place) gives: [*Notice that 802 again comes before 2 as 802
comes before 2 in the previous list.]
802, 2, 24, 45, 66, 170, 75, 90
Sorting by most significant digit (100s place) gives:
2, 24, 45, 66, 75, 90, 170, 802
What is the running time of Radix Sort?
Let there be d digits in input integers. Radix Sort takes O(d*(n+b)) time where b is the
base for representing numbers, for example, for decimal system, b is 10. What is the
value of d? If k is the maximum possible value, then d would be O(log_b(k)). So overall
time complexity is O((n+b) * log_b(k)), which looks more than the time complexity of
comparison based sorting algorithms for a large k. Let us first limit k. Let k <= n^c where c
is a constant. In that case, the complexity becomes O(n log_b(n)). But it still doesn’t beat
comparison based sorting algorithms.
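The digit-by-digit procedure can be sketched in C as follows, using a stable counting sort for each decimal digit. The sketch assumes non-negative integers stored as unsigned and base b = 10; the function names counting_sort_by_digit and radix_sort are chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Stable counting sort of a[0..n) by the decimal digit selected by exp (1, 10, 100, ...). */
static void counting_sort_by_digit(unsigned a[], unsigned out[], size_t n,
                                   unsigned long long exp)
{
    size_t count[10] = { 0 };

    for (size_t i = 0; i < n; i++)
        count[(a[i] / exp) % 10]++;

    /* Prefix sums: count[d] becomes one past the last output slot for digit d. */
    for (int d = 1; d < 10; d++)
        count[d] += count[d - 1];

    /* Walk backwards so elements with equal digits keep their relative order. */
    for (size_t i = n; i-- > 0; )
        out[--count[(a[i] / exp) % 10]] = a[i];

    memcpy(a, out, n * sizeof a[0]);
}

/* Sorts a[0..n-1] in ascending order. Returns 0 on success, -1 if allocation fails. */
int radix_sort(unsigned a[], size_t n)
{
    if (n < 2)
        return 0;
    unsigned *out = malloc(n * sizeof *out);
    if (out == NULL)
        return -1;

    /* The largest value determines how many digit passes are needed. */
    unsigned max = a[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] > max)
            max = a[i];

    /* One pass per digit, from least significant to most significant. */
    for (unsigned long long exp = 1; max / exp > 0; exp *= 10)
        counting_sort_by_digit(a, out, n, exp);

    free(out);
    return 0;
}
```

On the example list 170, 45, 75, 90, 802, 24, 2, 66 the loop makes three passes (for exp = 1, 10 and 100) and ends with 2, 24, 45, 66, 75, 90, 170, 802.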
Linear Search
Linear search is to check each element one by one in sequence. The following method
linearSearch() searches a target in an array and returns the index of the target; if not
found, it returns -1, which indicates an invalid index.
int linearSearch(int arr[], int target)
{
    for (int i = 0; i < arr.length; i++)
    {
        if (arr[i] == target)
            return i;
    }
    return -1;
}
Linear search loops through each element in the array; each loop body takes constant
time. Therefore, it runs in linear time O(n).
Lecture-31
Binary Search
For sorted arrays, binary search is more efficient than linear search. The process starts
from the middle of the input array:
If the target equals the element in the middle, return its index.
If the target is larger than the element in the middle, search the right half.
If the target is smaller, search the left half.
In the following binarySearch() method, the two index variables first and last indicate the
searching boundary at each round.
int binarySearch(int arr[], int target)
{
    int first = 0, last = arr.length - 1;

    while (first <= last)
    {
        int mid = (first + last) / 2;
        if (target == arr[mid])
            return mid;
        if (target > arr[mid])
            first = mid + 1;
        else
            last = mid - 1;
    }
    return -1;
}
arr: {3, 9, 10, 27, 38, 43, 82}

target: 10
first: 0, last: 6, mid: 3, arr[mid]: 27 -- go left
first: 0, last: 2, mid: 1, arr[mid]: 9 -- go right
first: 2, last: 2, mid: 2, arr[mid]: 10 -- found

target: 40
first: 0, last: 6, mid: 3, arr[mid]: 27 -- go right
first: 4, last: 6, mid: 5, arr[mid]: 43 -- go left
first: 4, last: 4, mid: 4, arr[mid]: 38 -- go right
first: 5, last: 4 -- not found
Binary search divides the array in the middle at each round of the loop. Suppose the
array has length n and the loop runs in t rounds, then we have n * (1/2)^t = 1 since at
each round the array length is divided by 2. Thus t = log2(n). At each round, the loop body
takes constant time. Therefore, binary search runs in logarithmic time O(log n).
The following code implements binary search using recursion. To call the method, we
need to provide the boundary indexes, for example, binarySearch(arr, 0, arr.length - 1, target);
int binarySearch(int arr[], int first, int last, int target)
{
    if (first > last)
        return -1;

    int mid = (first + last) / 2;

    if (target == arr[mid])
        return mid;
    if (target > arr[mid])
        return binarySearch(arr, mid + 1, last, target);
    // target < arr[mid]
    return binarySearch(arr, first, mid - 1, target);
}
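The methods above read the array length from arr.length, which a plain C array does not provide. A C version therefore has to receive the length as a parameter, as in the sketch below. The function name binary_search, the half-open search range and the long return type are assumptions of this example; the midpoint is computed as first + (last - first) / 2 so that the sum of two large indexes cannot overflow.

```c
#include <stddef.h>

/* Returns the index of target in the sorted array arr[0..n), or -1 if it is absent. */
long binary_search(const int arr[], size_t n, int target)
{
    size_t first = 0, last = n;   /* search the half-open range [first, last) */

    while (first < last) {
        size_t mid = first + (last - first) / 2;   /* avoids overflow of first + last */

        if (arr[mid] == target)
            return (long)mid;
        if (arr[mid] < target)
            first = mid + 1;      /* target can only be in the right half */
        else
            last = mid;           /* target can only be in the left half */
    }
    return -1;
}
```

With the array {3, 9, 10, 27, 38, 43, 82} from the trace, binary_search(arr, 7, 10) returns 2 and binary_search(arr, 7, 40) returns -1.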
Lecture-32
Hashing
Introduction
When we put objects into a hash table, it is possible that different objects (by the
equals() method) might have the same hashcode. This is called a collision. Here is an
example of a collision. Two different strings "Aa" and "BB" have the same key:
"Aa" = 'A' * 31 + 'a' = 2112
"BB" = 'B' * 31 + 'B' = 2112
The big attraction of using a hash table is a constant-time performance for the basic
operations add, remove, contains, size. Though, because of collisions, we cannot guarantee
the constant runtime in the worst case. Why? Imagine that all our objects collide into the
same index. Then searching for one of them will be equivalent to searching in a list, which
takes linear runtime. However, we can guarantee an expected constant runtime, if we
make sure that our lists won't become too long. This is usually implemented by
maintaining a load factor that keeps track of the average length of lists. If the load factor
approaches a threshold set in advance, we create a bigger array and rehash all
elements from the old table into the new one.
Another technique of collision resolution is linear probing. If we cannot insert at index
k, we try the next slot k+1. If that one is occupied, we go to k+2, and so on.
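The collision between "Aa" and "BB" and its resolution by linear probing can be sketched in C as follows. The multiplier 31 comes from the arithmetic shown above; the table size TABLE_SIZE, the type ProbeTable and the function names string_hash and probe_insert are hypothetical names introduced for this example. The sketch wraps around to slot 0 after the last slot and does not check for duplicate keys.

```c
#include <stddef.h>

#define TABLE_SIZE 8   /* hypothetical small capacity for this sketch */

/* Polynomial string hash with multiplier 31, matching the "Aa"/"BB" arithmetic. */
unsigned string_hash(const char *s)
{
    unsigned h = 0;
    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h;
}

typedef struct {
    const char *keys[TABLE_SIZE];   /* NULL marks an empty slot */
} ProbeTable;

/* Inserts key using linear probing; returns the slot used, or -1 if the table is full. */
int probe_insert(ProbeTable *t, const char *key)
{
    unsigned k = string_hash(key) % TABLE_SIZE;

    /* Try slots k, k+1, k+2, ... wrapping around at the end of the table. */
    for (unsigned step = 0; step < TABLE_SIZE; step++) {
        unsigned slot = (k + step) % TABLE_SIZE;
        if (t->keys[slot] == NULL) {
            t->keys[slot] = key;
            return (int)slot;
        }
    }
    return -1;
}
```

Both strings hash to 2112, and 2112 mod 8 is 0. Inserting "Aa" into an empty table therefore uses slot 0, and inserting "BB" afterwards finds slot 0 occupied and moves on to slot 1.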
Lecture-33
Hashing Functions
Choosing a good hashing function, h(k), is essential for hash-table based searching. h should
distribute the elements of our collection as uniformly as possible to the "slots" of the
hash table. The key criterion is that there should be a minimum number of collisions.
If the probability that a key, k, occurs in our collection is P(k), then if there are m slots in
our hash table, a uniform hashing function, h(k), would ensure:
(sum of P(k) over all keys k with h(k) = j) = 1/m, for every slot j = 0, 1, ..., m-1
Sometimes, this is easy to ensure. For example, if the keys are randomly distributed in
(0,r], then,
h(k) = floor((mk)/r)
will provide uniform hashing.
Mapping keys to natural numbers
Most hashing functions will first map the keys to some set of natural numbers, say (0,r].
There are many ways to do this, for example if the key is a string of ASCII characters, we
can simply add the ASCII representations of the characters mod 255 to produce a
number in (0,255) - or we could xor them, or we could add them in pairs mod 2^16-1, or
...
Having mapped the keys to a set of natural numbers, we then have a number of possibilities.
1. Use a mod function:
h(k) = k mod m.
When using this method, we usually avoid certain values of m. Powers of 2 are
usually avoided, for k mod 2^b simply selects the b low order bits of k. Unless we
know that all the 2^b possible values of the lower order bits are equally likely, this
will not be a good choice, because some bits of the key are not used in the hash
function.
Prime numbers which are close to powers of 2 seem to be generally good choices
for m.
For example, if we have 4000 elements, and we have chosen an overflow table
organization, but wish to have the probability of collisions quite low, then we might
choose m = 4093. (4093 is the largest prime less than 4096 = 2^12.)
2. Use the multiplication method:
o Multiply the key by a constant A, 0 < A < 1,
o Extract the fractional part of the product,
o Multiply this value by m.
Thus the hash function is:
h(k) = floor(m * (kA - floor(kA)))
In this case, the value of m is not critical and we typically choose a power of 2 so
that we can get the following efficient procedure on most digital computers:
o Choose m = 2^p.
o Multiply the w bits of k by floor(A * 2^w) to obtain a 2w bit product.
o Extract the p most significant bits of the lower half of this product.
It seems that:
A = (sqrt(5)-1)/2 = 0.6180339887
is a good choice (see Knuth, "Sorting and Searching", v. 3 of "The Art of
Computer Programming").
3. Use universal hashing:
A malicious adversary can always choose the keys so that they all hash to the
same slot, leading to an average O(n) retrieval time. Universal hashing seeks to
avoid this by choosing the hashing function randomly from a collection of hash
functions (cf Cormen et al, p 229- ). This makes the probability that the hash
function will generate poor behaviour small and produces good average
performance.
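The mod function and the multiplication method described above can be sketched in C as follows. The sketch assumes 32-bit keys (w = 32) and a table of m = 2^p slots with 1 <= p <= 32; the function names hash_division and hash_multiplication are illustrative. The constant 2654435769 is floor(A * 2^32) for A = (sqrt(5)-1)/2.

```c
#include <stdint.h>

/* Division method: h(k) = k mod m. A prime m such as 4093 is a typical choice. */
uint32_t hash_division(uint32_t k, uint32_t m)
{
    return k % m;
}

/* Multiplication method with w = 32 and m = 2^p slots, for 1 <= p <= 32. */
uint32_t hash_multiplication(uint32_t k, unsigned p)
{
    /* Multiply the w bits of k by floor(A * 2^w) to obtain a 2w-bit product. */
    uint64_t product = (uint64_t)k * 2654435769u;

    /* Keep the lower half (the low w bits) of the product. */
    uint32_t lower_half = (uint32_t)product;

    /* Extract the p most significant bits of the lower half. */
    return lower_half >> (32 - p);
}
```

For example, hash_division(4097, 4093) is 4, and hash_multiplication(1, 12) is 2531, which lies in the range 0 to 4095 as required for a table of 2^12 slots.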