Data Structures and Algorithms Using Python - Subrata Saha
All major algorithms have been discussed and analysed in detail, and the
corresponding codes in Python have been provided. Diagrams and examples
have been extensively used for better understanding. Running time
complexities are also discussed for each algorithm, allowing the student to
better understand how to select the appropriate one.
The book has been written with both undergraduate and graduate students in
mind. Each chapter ends with a large number of problems, including
multiple choice questions, to help consolidate the knowledge gained. It will
also be helpful for competitive engineering examinations in India such as
GATE and NET. As such, the book will be a vital resource for students as
well as professionals who are looking for a handbook on data structures in
Python.
Subrata Saha
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467
www.cambridge.org
A catalogue record for this publication is available from the British Library
ISBN 978-1-009-27697-9 Paperback
Cambridge University Press & Assessment has no responsibility for the
persistence or accuracy of URLs for external or third-party internet websites
referred to in this publication and does not guarantee that any content on
such websites is, or will remain, accurate or appropriate.
To my beloved wife,
Contents
Preface
Acknowledgments
1. DATA STRUCTURE PRELIMINARIES
1.3.1 Array
1.3.2 Linked list
1.3.3 Stack
1.3.4 Queue
1.3.5 Graph
1.3.6 Tree
1.3.7 Heap
Review Exercises
2. INTRODUCTION TO ALGORITHM
2.4.1 Flowchart
2.4.2 Pseudocode
Introduction to Algorithm at a Glance
Review Exercises
3. ARRAY
3.1 Definition
Array at a Glance
Review Exercises
4. PYTHON DATA STRUCTURES
4.1 Lists
4.1.7 Looping in a List
4.2 Tuples
4.3 Sets
4.3.1 Creating a Set
4.3.4 Frozenset
4.4 Dictionaries
4.4.3 Operations on a Dictionary
Multiple Choice Questions
Review Exercises
5. STRINGS
5.1 Introduction
5.4 String Methods
Strings at a Glance
Review Exercises
Problems for Programming
6. RECURSION
6.1 Definition
6.7 Programming Examples
Recursion at a Glance
Review Exercises
7. LINKED LIST
7.1 Definition
7.8.3.3 Inserting a Node after a Specified Node
7.13.3 Deleting the First Node from a Circular Doubly Linked List
7.13.4 Deleting the Last Node from a Circular Doubly Linked List
Review Exercises
8. STACK
8.5.2.3 Evaluation of a Postfix Expression
8.5.4 Recursion
Stack at a Glance
Review Exercises
9. QUEUE
9.2 Operations Associated with Queues
9.3.4.2 Using a Single Circular Linked List with a Single Tail Pointer
9.5.1 DEQue
Queue at a Glance
Review Exercises
10. TREES
10.2 Terminology
10.3.2 Forest
10.9.7 Counting the Total Number of Nodes in a Binary Search Tree
10.9.8 Counting the Number of External Nodes in a Binary Search Tree
10.9.9 Counting the Number of Internal Nodes in a Binary Search Tree
10.15 B Tree
10.16 B+ Tree
10.17 B* Tree
Trees at a Glance
Review Exercises
11. HEAP
11.4.1 Implementing a Priority Queue Using Heap
Heap at a Glance
Review Exercises
12. GRAPH
12.2 Terminology
12.3.2 Incidence Matrix Representation
12.5.1 Prim's Algorithm
Graph at a Glance
Review Exercises
14. HASHING
14.3.2 Chaining
14.4 Rehashing
Hashing at a Glance
Review Exercises
Index
Preface
In computer science and engineering, data structures and algorithms are two
very important topics. A data structure is the logical representation of data,
organized so that insertion, deletion, and retrieval can be done efficiently,
and an algorithm is a step-by-step procedure to solve a problem. By studying
different data structures we come to know their merits and demerits, which
enriches our knowledge and our ability to apply the appropriate data
structure in the proper place when we write new applications. Studying
different standard algorithms gives us much knowledge about solving new
problems. Data structures and algorithms are interrelated and complementary
to each other. By studying both, we can acquire a solid foundation for
writing good code. This comprehensive knowledge also helps in
understanding new frameworks.
In this book different data structures and algorithms are discussed in a lucid
manner so that students can understand the concepts easily. All the relevant
data structures and their operations are discussed with diagrams and
examples for better understanding.
• Problems for practice and a set of multiple choice questions at the end of
each chapter strengthen the students' self-learning.
• This book is written in simple English to make complex concepts easier to
understand.
style and their demand inspired me to write this book. My heartiest thanks
to them.
I would like to thank my student Sayandip Naskar for drawing the figures
of stack and queue.
Finally, I would like to thank all the reviewers of this book for their critical
comments and suggestions. I convey my sincere gratitude to Mr Agnibesh
Das and the entire editing team at Cambridge University Press, India, for
their great work.
Chapter 1
Data Structure Preliminaries
Our basic aim is to write a good program. A program must give correct
results, but a correct program may not be a good program. It should also
possess characteristics such as readability, good documentation, and ease of
debugging and modification. The most important feature, however, is that it
should run efficiently. Efficient means that the program should take minimum
time and minimum space to execute. To achieve this, we need to store and
retrieve data in memory following some logical models. These are called
data structures. Before discussing data structures, let us recapitulate what
data are and what a data type is.
Data are raw facts. Through a program we convert data into the required
information. For example, the marks scored by a student in an examination
are data, but the total marks, average marks, grade, and whether the student
has passed or failed are information. The nature of all data is not the same:
some are whole numbers, some are real numbers containing a fractional
part, some are of character type, and so on. These categories are known as
data types. A data type indicates the nature of the data stored in a particular
variable.
The data types which are defined by the user, i.e., the programmer, according
to the needs of the application are known as user defined data types.
Generally such a data type consists of one or more primitive data types. In C,
examples of user defined data types are structure, union, and enumerated
list. In Python, a user defined data type is created with a class.
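As a rough illustration, the following sketch defines a small user defined data type with a class; the Student class and its fields are purely illustrative and not taken from a later chapter:

class Student:
    # A user defined data type built from primitive data types
    def __init__(self, roll, name, marks):
        self.roll = roll        # int
        self.name = name        # str
        self.marks = marks      # float

s = Student(1, "Asha", 87.5)
print(s.roll, s.name, s.marks)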
Here the term ‘abstract’ means hiding the internal details. Hence, abstract
data type (ADT) represents the type of an object whose internal structure
for storing data and the operations on that data are hidden. It just provides an
interface by which its behavior is understandable but does not have the
implementation details. For example, Stack is an abstract data type which
follows the LIFO operations of data. It has two basic operations – Push for
inserting data into a stack and Pop for retrieving data from a stack. But as
an ADT it is not known whether it is implemented through linked list, list,
or array. ADT does not even disclose in which programming language it is
implemented.
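As a rough sketch of this idea (illustrative code, not a listing from a later chapter), a stack ADT may expose only push and pop while keeping its internal storage hidden; here a Python list happens to be used inside, but the user of the class never needs to know that:

class Stack:
    def __init__(self):
        self.__items = []            # internal representation is hidden
    def push(self, item):            # insert an element at the top
        self.__items.append(item)
    def pop(self):                   # remove and return the top element
        return self.__items.pop()

s = Stack()
s.push(10)
s.push(20)
print(s.pop())    # 20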
Based on allocation style, data structures are of two types: static and
dynamic data structures. If the number of elements in a data structure is
fixed and is defined before compilation, it is known as a static data
structure; an array is an example. But if the number of elements in a data
structure is not fixed, and we may insert or delete elements as and when
required during program execution, it is known as dynamic data structure.
Linked list is an example of dynamic data structure.
1.3.1 Array
Hence, if we declare an array, arr, of size 5, then to access the first element
of the above array we have to write arr[0] as in Python, array index always
starts from 0.
Similarly, to access the next elements we have to write arr[1], arr[2], and so
on.
Figure: the array arr and its elements
1.3.2 Linked List
Figure: a singly linked list in which the Head pointer refers to Node 1, and the three nodes hold the values 10, 20, and 30, the link of the last node being None
1.3.3 Stack
A stack is a linear data structure in which both the insertion and deletion
operations occur only at one end. Generally this end is called the top of the
stack. The insertion operation is commonly known as PUSH and the
deletion operation is known as POP. As both operations occur at one end,
when the elements are pushed into a stack, the elements are stored one after
another and when the elements need to be popped, only the top-most
element can be removed first and then the next element gets the scope of
being popped. Hence, a stack follows the Last In First Out (LIFO) principle.
Figure: a stack holding 10, 20, 30, and 40 from bottom to top; Top refers to 40, and both Push and Pop occur at this end
1.3.4 Queue
A queue is a linear data structure in which insertion takes place at one end, called the rear end, and deletion takes place at the other end, called the front end. The insertion operation is commonly known as Enque and the deletion operation as Deque, so a queue follows the First In First Out (FIFO) principle.
Figure: a queue holding 10, 20, 30, and 40; elements are enqueued at the rear end and dequeued from the front end
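As a quick illustration using Python's standard collections.deque (not a listing from this book), elements are enqueued at the rear end and dequeued from the front end:

from collections import deque

q = deque()
q.append(10)          # Enque at the rear end
q.append(20)
q.append(30)
q.append(40)
print(q.popleft())    # Deque from the front end -> 10
print(q.popleft())    # -> 20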
1.3.5 Graph
A graph is a non-linear data structure consisting of a finite set of vertices and a set of edges, where each edge connects a pair of vertices.
Figure: a graph with six vertices v1 to v6 connected by nine edges e1 to e9
To represent a graph using a linked list there are also two ways. These are the adjacency list and the adjacency multi-list.
In Breadth First Search (BFS) algorithm, starting from a source vertex all
the adjacent vertices are traversed first. Then the adjacent vertices of these
traversed vertices are traversed one by one. This process continues until all
the vertices are traversed. Another graph traversal algorithm is Depth First
Search (DFS) algorithm. In this algorithm starting from the source vertex,
instead of traversing all the adjacent vertices, we need to move deeper and
deeper until we reach a dead end. Then by backtracking we return to the
most recently visited vertex and from that position again we start to move to
a deeper level through unvisited vertices. This process continues until we
reach the goal node or traverse the entire graph.
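A minimal sketch of BFS on a graph stored as an adjacency list follows; the sample graph and vertex names below are illustrative and not taken from the figure above:

from collections import deque

def bfs(graph, source):
    visited = [source]
    q = deque([source])
    while q:
        vertex = q.popleft()
        for neighbour in graph[vertex]:      # traverse all adjacent vertices first
            if neighbour not in visited:
                visited.append(neighbour)
                q.append(neighbour)
    return visited

graph = {'v1': ['v2', 'v3'], 'v2': ['v1', 'v4'], 'v3': ['v1'], 'v4': ['v2']}
print(bfs(graph, 'v1'))    # ['v1', 'v2', 'v3', 'v4']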
1.3.6 Tree
The structure of a tree is recursive by its nature. Each node connected with
the root node may be further considered as a root node with which some
other nodes can be connected and form a sub-tree. Thus, a tree, T, can be
defined as a finite non-empty set of elements among which one is the root
and others are partitioned into trees, known as sub-trees of T.
Figure 1.6 shows a sample tree structure.
Figure 1.6 A sample tree structure
Figure 1.7 A binary tree: the root node has a left sub-tree and a right sub-tree, each with its own root node
Here, A is the root node and the nodes B, D, E, H, and I form the left sub-
tree and the nodes C, F, G, J, and K form the right sub-tree. Again B is the
root node of this left sub-tree whose left sub-tree consists of D, H, and I
nodes and the right sub-tree consists of a single node, E, and so on. If any
node does not have a left sub-tree and/or a right sub-tree, that means it
consists of empty sub-trees.
1.3.7 Heap
We can define a heap as a binary tree which has two properties. These are:
shape property and order property. By the shape property, a heap must be a
complete binary tree. By the order property, there are two types of heaps.
One is max heap and the other is min heap.
By default a heap means it is a max heap. In a max heap, the root should be
larger than or equal to its children. There is no order in between the
children. This is true for its sub-trees also. In a min heap the order is
reversed. Here the root is smaller than or equal to any of its children. Thus
the root of a max heap always provides the largest element of a list whereas
the root of a min heap always provides the smallest element. Figure 1.8
shows a max heap and a min heap.
Figure 1.8 (a) A max heap and (b) a min heap
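Python's standard heapq module maintains a min heap inside an ordinary list; the short illustrative snippet below (values chosen arbitrarily) shows the smallest element at the root, and simulates a max heap by negating the values:

import heapq

values = [60, 10, 20, 40, 70, 30]
heapq.heapify(values)            # arranges the list as a min heap in place
print(values[0])                 # the root is the smallest element -> 10

max_heap = [-v for v in values]  # negate the values to simulate a max heap
heapq.heapify(max_heap)
print(-max_heap[0])              # the largest element -> 70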
We have already discussed that data type represents the type of the data
stored in a variable, whereas data structure represents the logical
representation of data in order to store and retrieve data efficiently. We have
also noticed that using data structures we are able to create ADT which
represents some user defined data type. On the other hand, primitive data
structures are the basic data type in any programming language. Hence,
these are very close in meaning. If we consider the data types in Python, i.e.
integer(int), string(str), list, dictionary(dict), etc., we find that all the data
types are represented as class, which is nothing but a data structure.
• Insertion: By this operation new elements are added in the data structure.
When new elements are added, they follow the properties of the
corresponding data structure. For example, when a new product needs to be
added in the inventory, all the information regarding the new product will
be inserted in the appropriate data structure.
• Merging: By this operation two sorted lists are combined into one list
maintaining the order of the lists. For example, if we have two lists sorted in
ascending order, after merging they become a single list where all the
elements are present in ascending order.
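A small illustrative sketch of the merging operation on two lists already sorted in ascending order is given below (the function name and sample values are not from the book):

def merge(list1, list2):
    merged = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:          # pick the smaller front element
            merged.append(list1[i])
            i += 1
        else:
            merged.append(list2[j])
            j += 1
    merged.extend(list1[i:])              # copy whatever remains
    merged.extend(list2[j:])
    return merged

print(merge([12, 33, 40], [25, 57, 85]))  # [12, 25, 33, 40, 57, 85]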
✓ The data types which are predefined and in-built are known as primitive
or basic data type.
✓ The data types which are defined by the user according to the needs of
the applications are known as user defined data type.
✓ Abstract data type (ADT) represents the type of an object whose internal
structure for storing data and the operations on that data are hidden. It just
provides an interface by which its behavior is understandable but does not
have the implementation details.
✓ Examples of linear data structures are array, linked list, stack, queue, etc.,
whereas graph and tree are examples of non-linear data structures.
a) int
b) char
c) float
d) bool
a) ‘Data Structure’
b) “Data Structure”
c) ‘‘‘Data Structure’’’
d) All of these
a) Stack
b) Queue
d) None of these
a) Array
b) List
c) Queue
d) All of these
a) Heap
b) Graph
c) Tree
d) All of these
a) Heap
b) Stack
c) Queue
d) None of these
a) Heap
b) Linked List
c) B Tree
d) None of these
a) Linear
b) Homogeneous
c) Static
d) All of these
i. Linear
ii. Homogeneous
iii. Static
iv. Dynamic
a) i and ii only
b) i, ii and iv
d) All of these
a) Push
b) Pop
c) Peek
d) Retrieve
11. Which of the given data structures supports all the following
operations?
a) Stack
b) Queue
c) Linked List
d) Graph
a) FIFO
b) LIFO
c) FILO
d) LILO
a) FIFO
b) LIFO
c) FILO
d) LILO
a) Searching
b) Sorting
c) Merging
d) All of these
a) Adjacency Matrix
b) Incidence Matrix
c) Adjacency multi-list
d) All of these
a) Preorder
b) Inorder
c) Level order
d) None of these
Review Exercises
8. What are the advantages and disadvantages of array over linked list?
Chapter 2
Introduction to Algorithm
Algorithm is a very common word in computer science, especially in the
case of procedural programming languages. In this chapter we will learn
about algorithms, different types of algorithms, different approaches to
designing an algorithm, analysis of algorithms, etc. We will also be able to
learn how an algorithm can be written using different control structures.
• Input: Inputs are the values that are supplied externally. Inputs are those
without which we cannot proceed with the problems. Every algorithm must
have zero or any number of inputs.
• Finiteness: Whatever may be the inputs for all possible values, every
algorithm must terminate after executing a finite number of steps.
When we try to write a program, first we need to identify all the tasks.
Some tasks are easily identifiable while some tasks may be hidden within
the problem definition. For that a detailed analysis of the problem is
required. After proper analysis we are able to find what the expected output
is. What are the inputs required to get that output? Using the inputs, how do
we reach the solution? To accomplish this 'how', after
identifying all the tasks we need to arrange them in a proper sequence. This
is called an algorithm and in common words ‘Program Logic’. A small and
well-known program may be written arbitrarily, but for a large program or
programs about which we are not much familiar, we must prepare an
algorithm first.
the instructions of any algorithm step by step. Not only that; the algorithm
we write should be efficient, which means it should use minimum resources
like processor’s time and memory space. To write an algorithm, first we
need to identify the inputs, if any. Suppose we have to print all two digit
prime numbers. In this case, we do not have any input because no extra
information is required; we know what the two digit numbers are and what
the criteria are for any number becoming a prime number. But if our
problem is to find the area of a rectangle, we have to take two inputs –
length and breadth of the rectangle. Unless these two values are provided
externally, we are not able to calculate the area. Consider the following
example.
7. Stop
2.4.1 Flowchart
1. Program flowchart
2. System flowchart
The commonly used program flowchart symbols are the Terminal, Input/Output, Process, Decision, Comment, Flow line, Document, On-page connector, and Off-page connector symbols; the off-page connector is used when the flow continues on separate pages.
• Program logic should depict the flow from top to bottom and from left to
right.
• Each symbol used in a program flowchart should contain only one entry
point and one exit point, with the exception of the decision symbol.
Solution:
Figure: a flowchart that starts, reads the length and breadth of the rectangle, computes peri = 2*(length+breadth), prints the perimeter, and stops.
2.4.2 Pseudocode
However there are some disadvantages also. As there are no standard rules
for writing a pseudocode, the style of writing pseudocode varies from
programmer to programmer and hence may create some difficulties in
understanding. Another drawback of this tool is that writing a pseudocode is
much more difficult for beginners in comparison to drawing a flowchart.
If condition, Then
    Statement or Statements

If condition, Then
    Statement or Statements
Else
    Statement or Statements
Both the If and If–Else constructs may also be nested. The Statement or
Statements may further contain an If or If–Else construct. Here is an
example to show the selection control structure in an algorithm.
Example 2.3: In a shop, a discount of 10% on purchase amount is given
only if purchase amount exceeds ₹5000; otherwise 5% discount is given.
Write an algorithm to find net payable amount.
Solution:
1. Read the purchase amount, Amt
2. If Amt > 5000, Then
    a. Set Discount = Amt * 10 / 100
3. Else
    a. Set Discount = Amt * 5 / 100
4. Set NetAmt = Amt – Discount
5. Print NetAmt
6. Stop
While Condition, Do
    Statement or Statements
The Statement or Statements under a While statement will be executed
repeatedly till the condition associated with the While is true. When the
condition becomes false, the control comes out of the loop. Consider the
following example.
Example 2.4: Write an algorithm to find the factorial of a number.
Solution:
1. Read N
2. Set Fact = 1
3. Set I = 1
4. While I <= N, Do
    a. Set Fact = Fact * I
    b. Set I = I + 1
5. Print Fact
6. Stop
The Do–While statement is an exit control type loop statement. The general
form of a Do–While statement is:
Do
    Statement or Statements
While Condition
The Statement or Statements under a Do–While statement will be executed
repeatedly till the condition associated with the While is true. When the
condition becomes false, the control comes out of the loop. Consider the
following example: Example 2.5: Write an algorithm to display the first n
natural numbers.
Solution:
1. Read N
2. Set I = 1
3. Do
    a. Print I
    b. Set I = I + 1
4. While I <= N
5. Stop
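Python itself has no Do–While construct, so an exit-controlled loop like the one above is usually emulated with while True and break; the following short sketch (illustrative, not a listing from the book) prints the first n natural numbers in that style:

n = int(input("Enter n: "))
i = 1
while True:        # the body executes at least once, as in Do-While
    print(i)
    i = i + 1
    if i > n:      # corresponds to "While I <= N" tested at the bottom
        break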
The general form of a For statement is:
For Counter = StartValue To EndValue [Step StepValue], Do
    Statement or Statements
Example 2.6: Write an algorithm to find the sum of the first n natural
numbers.
Solution:
1. Read N
2. Set Sum = 0
3. For I = 1 to N, Do
    a. Set Sum = Sum + I
4. Print Sum
5. Stop
Example 2.7: Write an algorithm to display the first n odd natural numbers.
Solution:
1. Read N
2. Set I = 1
3. Set C = 0
4. While C < N, Do
    a. Print I
    b. Set I = I + 2
    c. Set C = C + 1
5. Stop
2.7 Best Case, Worst Case, Average Case Time Complexity
When we analyze an algorithm, the calculation of its exact running time is not possible.
The worst case arises when the element is not found at all or is found at the
last comparison (as in a linear search). The worst case time complexity
denotes the maximum time required to execute that algorithm. Its importance
is that it assures us that under no condition will the execution time cross this
value.
There is no specific rule that one we shall opt for; we have to take the
decision based on the situation, constraints, and a comparative study. When
time is the main constraint, we may sacrifice space. On the other hand, if
space is the main constraint, we may need to sacrifice speed. Sometimes it
may so happen that we choose an algorithm that neither shows the best
running time complexity nor the best space complexity, but the overall
performance is very good. So, considering all these things, we need to
decide which algorithm has to be chosen.
Frequency count or step count is the simplest method for finding the
complexity of an algorithm. In this method, the frequency of execution of
each statement or step, i.e., the number of times it is executed, is calculated.
Adding all these values we get a polynomial which represents the total
number of executed statements or steps. Ignoring the coefficients and the
lower-order terms, the term containing the highest power is considered the
order of time complexity.
To illustrate the concept, let us consider the algorithm to find the factorial of
a number:

Algorithm                         Frequency of each statement
Fact(n)                           1
    Set fact = 1                  1
    Set i = 1                     1
    While i <= n, do              n+1
        fact = fact * i           n
        i = i + 1                 n
    Return (fact)                 1
Total frequency count             3n+5
Let us consider another example where a nested loop is used. The problem
is to sort an array using the bubble sort. In the following function, Arr is an
array and n is its size.

Algorithm                                  Frequency of each statement
BubbleSort (Arr, n)                        1
    For i = 1 to n-1, do                   1+n+(n-1) = 2n
        For j = 1 to n-1, do               (1+n+(n-1))*(n-1) = 2n²-2n
            If Arr[j] > Arr[j+1], then     (n-1)*(n-1) = n²-2n+1
                Set temp = Arr[j]          n²-2n+1
                Set Arr[j] = Arr[j+1]      n²-2n+1
                Set Arr[j+1] = temp        n²-2n+1
    Return (Arr)                           1
Total frequency count                      6n²-8n+6
Calculating the frequency count is not so easy for all problems, especially
when the loop index does not increase or decrease linearly. We need to
make an extra effort to calculate the complexity in such cases.
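For example, in the following illustrative loop (not taken from the book) the index doubles on every pass, so the body executes roughly log₂n times and the frequency count is of the order O(log n):

n = 64
i = 1
count = 0
while i <= n:      # i takes the values 1, 2, 4, 8, ...
    count += 1
    i = i * 2
print(count)       # 7, i.e. floor(log2(64)) + 1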
We have already discussed that efficiency indicates how much time and
space are required to execute an algorithm. To formulate the efficiency of an
algorithm we need to find the order of growth of its running time. Based on
this we may also compare it with other algorithms. We generally find that
the 'order of growth' is meaningful for a large input size. Asymptotic
efficiency of an algorithm indicates that the 'order of growth' holds for a
large enough input size. Now we discuss some asymptotic notations.
We may define O(g(n)), read as 'big O of g of n', as the set O(g(n)) = { f(n) :
there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all
n ≥ n0 }, which means that for all sufficiently large input values (n > n0),
f(n) will grow no more than a constant factor times g(n). Therefore, g
provides an asymptotic upper bound for f(n).
Figure 2.2 shows that for all values of n to the right of n0, the value of the
function f(n) is on or below c·g(n).
Figure 2.2 f(n) = O(g(n))
Some examples of functions f(n) and the corresponding Big O notation are shown below.
f(n)                        Big O notation
12                          O(1)
3n + 5                      O(n)
5n² + 6n + 2                O(n²)
n³ – 4n² + 13n + 9          O(n³)
2 log n + 16                O(log n)
Other common orders of growth are O(n log n), O(n²), O(n³), and O(2ⁿ).
Table 2.3 Growth rate of functions
n         log₂n         n         n log₂n      n²        n³        2ⁿ
10        3.32 ≈ 4      10        40           10²       10³       1024
100       6.64 ≈ 7      100       700          10⁴       10⁶       1.26×10³⁰
1000      9.96 ≈ 10     1000      10000        10⁶       10⁹       1.07×10³⁰¹
10000     13.28 ≈ 14    10000     140000       10⁸       10¹²      1.99×10³⁰¹⁰
From Table 2.3 it is observed that the rate of growth is the slowest for the
logarithmic function log₂n and the fastest for the exponential function 2ⁿ.
The rate of growth depends mainly on the highest-order term of the function.
However, the Big O notation suffers from some drawbacks. These are as
follows:
• The Big O notation ignores the coefficients of the terms as well as the
terms with lower powers of n. For example, if T(n) of an algorithm is n³ + 15
and that of another algorithm is 10000n² + 8, then according to the Big O
notation the time complexity of the first algorithm is O(n³), which is slower
than that of the other, whose complexity is O(n²). But in reality this is not
true for n < 10000. Similarly, if T(n) of an algorithm is 3n² + 6n + 5 and that
of another algorithm is 12n² + 108n + 60000, then according to the Big O
notation the time complexity of both algorithms is the same, O(n²). But
practically there is a great difference between these two.
We may define Ω(g(n)), read as 'big omega of g of n', as the set Ω(g(n)) =
{ f(n) : there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for
all n ≥ n0 }, which means that for all sufficiently large input values (n > n0),
f(n) will grow no less than a constant factor times g(n). Therefore, g
provides an asymptotic lower bound for f(n). Figure 2.3 shows that for all
values of n to the right of n0, the value of the function f(n) is on or above
c·g(n).
Figure 2.3 f(n) = Ω(g(n))
The Theta (Θ) notation bounds a function from both above and below: there
exist positive constants c1 and c2 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all
n ≥ n0, which means that for all sufficiently large input values (n > n0), f(n)
will grow no less than a constant factor times g(n) and no more than another
constant factor times g(n). Therefore, g provides an asymptotic tight bound
for f(n). In other words, Θ(g(n)) = { f(n) : there exist positive constants c1,
c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }.
The Theta (Θ) notation provides an asymptotic tight bound for f(n). Figure
2.4 shows an intuitive diagram of functions f(n) and g(n) where f(n) =
Θ(g(n)). For all values of n to the right of n0, the value of the function f(n)
is on or above c1·g(n) and on or below c2·g(n).
Figure 2.4 f(n) = Θ(g(n))
Now we will discuss another two notations. These are little Oh (o) and little
Omega (ω).
Definitions of the Big O and little o are almost the same except that in the
Big O notation the bound 0 ≤ f( n) ≤ c g( n) holds for some constant c > 0,
whereas in little o notation the bound 0 ≤ f( n) < cg( n) holds for all
constant c > 0.
We may define ω(g(n)), read as 'little omega of g of n', as the set ω(g(n)) =
{ f(n) : there exists a constant n0 > 0 such that, for any positive constant
c > 0, 0 ≤ c·g(n) < f(n) for all n ≥ n0 }.
We may notice that the definitions of the big Omega (Ω) and little omega
(ω) are almost the same except that in the big Omega notation the bound 0
≤ cg( n) ≤ f( n) holds for some constant c > 0, whereas in the little omega
notation the bound 0 ≤ cg( n) < f( n) holds for all constant c > 0.
Actually the key operation is done at the ‘combine’ step. At the last level of
recursion each sub-sequence contains a single element. Hence they are
sorted. Now on every two sub-sequences merging operation is applied and
we get a sorted sequence of two elements.
These sub-sequences now move into the upper level and the same merging
operations are applied on them. In this way gradually the entire sequence
becomes sorted.
Figure 2.5 Sorting the sequence 33, 85, 40, 12, 57, 25 using merge sort: the sequence is repeatedly divided into halves until single-element sub-sequences remain; merging then produces the sorted pairs 33 85, 12 40, and 25 57, which are merged into 12 33 40 85 and 25 57, and finally into the fully sorted sequence 12 25 33 40 57 85.
The main advantage of the divide and conquer paradigm is that we can
simplify a large problem easily by breaking it into smaller units. Another
advantage is that, as the sequence is broken into two sub-sequences every
time, the algorithm helps to achieve running time complexity in terms of
O(log n). For example, the running time complexity of merge sort is
O(n log₂n).
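A compact merge sort sketch in Python is shown below as an illustration of the divide and conquer paradigm (it is not one of the book's numbered programs): the sequence is divided, the halves are sorted recursively, and the combine step merges the two sorted halves.

def merge_sort(seq):
    if len(seq) <= 1:                     # a single element is already sorted
        return seq
    mid = len(seq) // 2
    left = merge_sort(seq[:mid])          # divide: sort each half recursively
    right = merge_sort(seq[mid:])
    merged = []                           # combine: merge the two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([33, 85, 40, 12, 57, 25]))   # [12, 25, 33, 40, 57, 85]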
2.12 Dynamic Programming
It is mainly effective for the type of problems that have overlapping sub-
problems. Like divide and conquer algorithms, this also combines the
solutions of sub-problems to find the solution of the original problem. We
may find that recursion repeatedly solves some common sub-sub-problems,
i.e., for specific arguments the function is called several times.
def FibNum(n):
    if n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return FibNum(n - 1) + FibNum(n - 2)
Now, if we want to find the 5th term, the recursion tree of the function call
will look like as shown in Figure 2.6.
Figure 2.6 Recursive calls for the n-th Fibonacci number: FibNum(5) branches into FibNum(4) and FibNum(3), which branch further into FibNum(3) + FibNum(2) and FibNum(2) + FibNum(1), and so on down to the base cases return(1) and return(0).
From the diagram
we can find that to calculate FibNum(5) it recursively calls FibNum(4) and
FibNum(3). FibNum(4) further recursively calls FibNum(3) and
FibNum(2).
Figure 2.7 Recursive calls for the n-th Fibonacci number using dynamic programming: each FibNum value is computed only once and reused from the stored results.
fiboSeries = {1:0, 2:1}
def dynamicFibo(term):
    global fiboSeries
    if term not in fiboSeries:
        fiboSeries[term] = dynamicFibo(term-1) + dynamicFibo(term-2)
    return fiboSeries[term]

term = 10
print(term, "th Fibonacci number is :", dynamicFibo(term))
Output:
10 th Fibonacci number is : 34
A greedy algorithm starts with an empty solution set. Next in each step, it
chooses the best possible solution and tests its feasibility. If it is a feasible
solution, it adds to the solution set; otherwise it is rejected forever. This
process continues until a solution is reached.
Suppose we have to pay an amount of ₹23 using the minimum number of coins, where the available coins are of ₹10, ₹5, ₹2, and ₹1. There is no limit to the number of each type of coins. To find
the solution, we start with an empty set. Now, in each iteration we will
choose the highest-valued coin to minimize the number of coins. So, we
add a coin of ₹10 twice to the solution set and the sum of the set becomes
20. Now we cannot choose a coin of ₹10 or ₹5 as it is not feasible.
So, we choose a coin of ₹2 and the set becomes {10, 10, 2}. Similarly, the
next feasible coin to choose is a coin of ₹1. Hence, the final solution set is
{10, 10, 2, 1}, which is also the optimal solution of the problem. But if the
available coins are of ₹10, ₹6, and ₹1, the greedy method finds the solution
as {10, 10, 1, 1, 1}, which is not an optimal solution. The optimal solution
is {10, 6, 6, 1}.
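The coin-changing steps described above can be expressed as a short greedy routine; the code below is only an illustrative sketch (the function name is not from the book), and it reproduces both behaviours: the optimal set for the first coin system and the non-optimal set for the second.

def greedy_change(amount, coins):
    coins = sorted(coins, reverse=True)   # always try the highest-valued coin first
    solution = []
    for coin in coins:
        while amount >= coin:             # feasibility test
            solution.append(coin)
            amount -= coin
    return solution

print(greedy_change(23, [10, 5, 2, 1]))   # [10, 10, 2, 1]    -> optimal
print(greedy_change(23, [10, 6, 1]))      # [10, 10, 1, 1, 1] -> not optimal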
To make sense of whether a problem can be solved using the greedy method
we need to find two properties within the problem. These are the greedy
choice property and the optimal sub-structure. The greedy choice property
tells that an overall optimal solution is achieved by selecting the local best
solution without considering the previous results coming from sub-
problems. The optimal sub-structure indicates that the optimal solution of
the whole problem contains the optimal solution of the sub-problems.
Both dynamic programming and the greedy method have optimal sub-
structure properties. But they differ in the greedy choice property. In
dynamic programming, choice taken at each step depends on the solutions
of the sub-problems. But the greedy choice does not consider the solutions
of the sub-problems. The greedy method is a top-down approach whereas
dynamic programming is a bottom-up approach.
Though the greedy method may not find the best solution in all cases yet it
shows very good performance for many algorithms like the Knapsack
problem, job scheduling problem, finding minimum spanning tree using
Prim’s and Kruskal’s algorithms, finding shortest path using Dijkstra’s
algorithm, Huffman coding, etc.
Introduction to Algorithm At a glance
✓ An algorithm is a step-by-step procedure to solve any problem.
✓ Best case time complexity denotes the minimum time required to execute
an algorithm.
✓ Greedy method always selects the best option available at that moment.
It always chooses the local optimal solution in anticipation that this choice
leads to the global optimal solution.
a) Definiteness
b) Finiteness
c) Fineness
d) Effectiveness
b) Flowchart
c) Diagram chart
d) None of these
a) Flowchart
b) Pseudocode
c) Both a) and b)
d) None of these
a) Flowchart
b) Pseudocode
c) Both a) and b)
d) None of these
a) Sequence
b) Selection
c) Insertion
d) Iteration
6. Which of the following denotes the minimum time requirement to
execute that algorithm?
a) Big O notation
b) Ω notation
d) ω notation
a) Big O notation
b) Ω notation
c) Θ notation
d) ω notation
10. Which notation provides both upper and lower bound on a function?
a) Big O notation
b) Ω notation
c) Θ notation
d) ω notation
11. Which notation provides an upper bound that is not asymptotically tight
on a function?
a) Big O notation
b) Ω notation
d) ω notation
12. Which notation provides a lower bound that is not asymptotically tight
on a function?
a) Big O notation
b) Ω notation
d) ω notation
13. Which of the following algorithms follows the divide and conquer
design paradigm?
a) Prim’s Algorithm
b) Quick Sort
c) Huffman Encoding
d) Radix Sort
14. Which of the following algorithms follows the greedy method?
a) Prim’s Algorithm
b) Kruskal’s Algorithm
c) Huffman Encoding
d) All of these
d) All of these
Review Exercises
2. What are the different criteria we need to keep in mind to write a good
algorithm?
3. Explain the different approaches to designing an algorithm.
12. Explain with example: best case, worst case and average case time
complexity.
14. What are the different notations used to express time complexity?
16. Explain with example the divide and conquer design paradigm of any
algorithm.
18. What is the greedy method? Mention the algorithms that follow the
greedy method.
19. Compare and contrast dynamic programming and the greedy method.
Chapter 3
Array
students, we cannot use this program for that class. We have to rewrite
the program according to that new class. Thus, this approach of processing
a large set of data is too cumbersome and surely not flexible enough.
Almost all modern high level programming languages provide a more
convenient way of processing such collections. The solution is array or
subscripted variables.
3.1 Definition
The general format to declare an array using the array module is:
arrayName = array(typecode, value_list)
Here,
typecode: specifies the type of the elements to be stored in the array.
value_list: specifies the list of elements which will be the content of the array.
Table 3.1 shows commonly used type codes for declaring an array.
Example:
myArray = arr.array('i', [10, 20, 30])
Table 3.1 Commonly used type codes for declaring an array
Code    C Type                Python Type    Min bytes
'b'     signed char           int            1
'B'     unsigned char         int            1
'u'     Py_UNICODE            Unicode        2
'h'     signed short          int            2
'H'     unsigned short        int            2
'i'     signed int            int            2
'I'     unsigned int          int            2
'l'     signed long           int            4
'L'     unsigned long         int            4
'q'     signed long long      int            8
'Q'     unsigned long long    int            8
'f'     float                 float          4
'd'     double                float          8
array(value_list, [dtype=type])
Example:
#Array of Floats
We can create an array of zeros and ones using the zeros() and ones()
methods.
zeros() method returns an array of given shape and data type containing all
zeros. The general format of the zeros() method is:
numpy.zeros(Shape[, dtype=type])
Similarly, the ones() method returns an array of the given shape and data type containing all ones. Consider the following example:
import numpy as npy
myZerosArray1 = npy.zeros(5)
print(myZerosArray1)
myZerosArray2 = npy.zeros(5, dtype=int)
print(myZerosArray2)
myOnesArray1 = npy.ones(4)
print(myOnesArray1)
myOnesArray2 = npy.ones(4, dtype=int)
print(myOnesArray2)
Output:
[0. 0. 0. 0. 0.]
[0 0 0 0 0]
[1. 1. 1. 1.]
[1 1 1 1]
where Start is the starting value of the range, End indicates the end value
of the range but excluding End value, i.e., up to End-1, Step indicates the
gap between values, and dtype indicates the data type of the array. The
default value of Step is 1.
import numpy as npy
myArray1 = npy.arange(10)
print(myArray1)
myArray2 = npy.arange(1,24,2)
print(myArray2)
myArray3 = npy.arange(1,2,.2)
print(myArray3)
Output:
[0 1 2 3 4 5 6 7 8 9]
[ 1 3 5 7 9 11 13 15 17 19 21 23]
If we declare an array, arr, of size 5, then to access the first element of the
array we have to write arr[0] as in Python array index always starts from 0.
Similarly, to access the next elements we have to write arr[1], arr[2], and so
on.
Figure: the array arr and its elements accessed as arr[0], arr[1], arr[2], arr[3], and arr[4]
arr[1] = 72
num = arr[1]
Thus, the expression arr[1] acts just like a variable of type int. Not only
assignment, it can also be used in input statement, output statement, or in
any arithmetic expression –
Notice that the second element of arr is specified as arr[1], since the first
one is arr[0].
Therefore, the third element will be arr[2], fourth element will be arr[3],
and the last element will be arr[4]. Thus, if we write arr[5], it would be the
sixth element of the array arr and therefore exceeding the size of the array,
which is an error.
import array as arr
myArray=arr.array('i',[0]*60)
for i in range(60):
    print(myArray[i])
arr[0] = a
arr[a] = 75
b = arr [a+2]
Now we will write a complete program that will demonstrate the operations
of the array.
#PRGD3_1: Program to find the highest marks from the marks of
#n students.
import array as arr
n=int(input("Enter number of students: "))
marks=arr.array('i',[0]*n)
for i in range(n):
    marks[i]=int(input("Enter marks: "))
max=marks[0]
for i in range(1,n):
    if marks[i]>max:
        max=marks[i]
print("Highest marks = ",max)
In the above program, first we take the number of students as input. Based
on that inputted number we declare an integer array named marks with the
statement marks=arr.array(‘i’,[0]*n)
Next we take the input to the array. To find the highest marks we store the
first element into the max variable and compare it with the rest elements of
the array. If the array element is larger than max, it will be stored within the
max variable. So, max variable contains the highest marks.
From the above program we can see now with a single statement we are
able to store marks of n number of students. A for loop and an input
statement are able to take input of the marks of n students and similarly a
for loop and an if statement are able to find the maximum of these marks.
This is the advantage of using an array.
Not only a single array, but we can also use as many arrays as required in a
program.
Program 3.2 Write a program that will store the positive numbers first,
then zeros if any, and the negative numbers at the end in a different array
from a group of numbers.
import array as arr
n=int(input("Enter number of elements: "))
elements=arr.array('i',[0]*n)
for i in range(n):
    elements[i]=int(input("Enter element: "))
arranged=arr.array('i',[0]*n)
j=0
for i in range(n):
    if elements[i]>0:
        arranged[j]=elements[i]
        j+=1
for i in range(n):
    if elements[i]==0:
        arranged[j]=elements[i]
        j+=1
for i in range(n):
    if elements[i]<0:
        arranged[j]=elements[i]
        j+=1
for i in range(n):
    print(arranged[i], end=' ')
Apart from this general
convention of accessing element, Python also provides the flexibility to
access array elements from the end. When we access an array from the left
or the beginning, the array index is started from 0. But if we start accessing
elements from the end, array index starts from -1. Python uses negative
index numbers to access elements in the backward direction.
Figure 3.1 Positive and negative index to access array elements: the array arr holds 10, 1234, 563, 27, and 98 at positive index positions 0 to 4 and at negative index positions -5 to -1
Consider
Figure 3.1. Here, arr[-1] returns 98, arr[-2] returns 27, and so on. Hence,
Python gives us the opportunity to access array elements in the forward as
well as in the backward direction.
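A quick illustrative check of both directions of indexing, assuming the same sample values as in Figure 3.1:

import array as arr

a = arr.array('i', [10, 1234, 563, 27, 98])
print(a[0], a[4])      # forward indexing  -> 10 98
print(a[-1], a[-5])    # backward indexing -> 98 10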
Python is enriched by its operator and its vast library functions. We can add
elements into an array, remove elements from an array, modify the content
of an array, extract a portion of an array (slicing), search an element from an
array and concatenate multiple arrays. We can also create a new array by
repeating the elements of an existing array.
We can add one or more elements into an array using append(), extend(),
and insert(). Using append() we can add a single element at the end of an
array whereas extend() helps us to add multiple elements at the end of an
array. On the other hand, insert() helps us to insert an element at the
beginning, end, or at any index position in an array. The following program
explains the operations of these functions: Program 3.3: Write a program
to show the use of append(), insert(), and extend().
#PRGD3_3: Program to show the use of append(), insert(),
#and extend()
import array as arr
elements=arr.array('i',[10,20,30])
print(elements)
elements.append(40)           #Inserts 40 at end
print(elements)
elements.extend([50,60,70])   #Inserts 50, 60, and
                              #70 at end
print(elements)
elements.insert(0,5)          #Inserts 5 at index
                              #position 0
print(elements)
elements.insert(2,15)         #Inserts 15 at index
                              #position 2
print(elements)
Output:
array('i', [10, 20, 30])
array('i', [10, 20, 30, 40])
array('i', [10, 20, 30, 40, 50, 60, 70])
array('i', [5, 10, 20, 30, 40, 50, 60, 70])
array('i', [5, 10, 15, 20, 30, 40, 50, 60, 70])
Elements can be removed from an array using the del statement or remove()
and pop(). Using the del statement we can remove an element of a specific
index position.
If we want to delete the entire array, we need to mention the array name
only with del.
Program 3.4: Write a program to show the use of del, remove(), and pop().
# pop()
elements=arr.array(‘i’,[10,20,30,40,10])
print(elements)
print(elements)
print(elements)
print(elements)
del elements
Output:
Popped element is : 10
array(‘i’, [40])
By using the slicing operation we can extract one or more elements from an
array. This operation can be done using ( : ) operator within the subscript
operator ([ ]). The general format of a slicing operation on an array is:
where Start is the starting index position from which the extraction
operation starts. The extraction operation continues up to the End – 1 index
position, and Step represents the incremented or decremented value needed
to calculate the next index position to extract elements. If Start is omitted,
the beginning of the array is considered, i.e., the default value of Start is 0.
If End is omitted, the end of the array is considered and the default Step
value is 1. If we use negative Step value, elements will be extracted in the
reverse direction. Thus, for negative Step value, the default value of Start
is -1 and that of End indicates the beginning of the array. The following
example illustrates these concepts: Program 3.5 Write a program to show
the slicing operation on an array.
#array
import array as arr
elements=arr.array('i',[10,20,30,40,50])
slice1 = elements[1:4]
print(slice1)
slice2 = elements[:4]
print(slice2)
slice3 = elements[1:]
print(slice3)
slice4 = elements[0:5:2]
print(slice4)
slice4 = elements[::2]
print(slice4)
slice5 = elements[4:1:-1]
print(slice5)
slice6 = elements[-1:-4:-1]
print(slice6)
slice7 = elements[-1:1:-1]
print(slice7)
slice8 = elements[-3::-1]
print(slice8)
slice9 = elements[:-4:-1]
print(slice9)
slice10 = elements[::-1]
print(slice10)
Output:
Before slicing operation : array(‘i’, [10, 20, 30, 40, 50]) New extracted
array is : array(‘i’, [20, 30, 40])
The membership operators in and not in can also be applied on an array for
searching an element. The operator in returns True if the element is found
in the array; otherwise it returns False. The operator not in works as its
reverse. If the element is found in the array, it returns False; otherwise it
returns True. Hence, using the membership operator, whether an element is
present in an array or not can be confirmed. After confirmation we can
determine its position using the index() method. There is another function,
count(), which counts the occurrences of an element, passed as argument with
this method, in an array.
import array as arr
elements=arr.array('i',[10,20,30,40,10])
print(elements)
num=int(input("Enter the element to search: "))
if num in elements:
    posn=elements.index(num)
    print("%d is found at index position "%num, end="")
    print(posn)
    print("%d occurs %d time(s) in the array"
              %(num,elements.count(num)))
else:
    print("%d is not present in the array"%num)
Sample Output:
Arrays are mutable in Python. We can easily update the array elements by
just assigning value at the required index position. With the help of the
slicing operation certain portions of an array can be modified. Consider the
following example: Program 3.7 Write a program to show the updating
operation in an array.
#PRGD3_7: Program to show the updating operation in an
#array
import array as arr
elements=arr.array('i',[10,20,30,40,10])
print(elements)
elements[2]=35                  #Updating value at index
                                #position 2
print(elements)
arr2=arr.array('i',[22,33,44])
elements[1:4]=arr2              #Updating values at index
                                #position 1 to 3
print(elements)
Output:
We can also concatenate two or more arrays. By this operation two or more
arrays can be joined together. Concatenation is done using the + operator.
Consider the following example:
#arrays
import array as arr
arr1=arr.array('i',[10,20,30,40])
arr2=arr.array('i',[5,10,15,20,25])
arr3=arr.array('i',[22,33,44])
print(arr1)
print(arr2)
print(arr3)
myArr=arr1+arr2+arr3
print("Concatenated Array :")
print(myArr)
Output:
Concatenated Array :
array(‘i’, [10, 20, 30, 40, 5, 10, 15, 20, 25, 22, 33, 44])
#array
arr1=arr.array(‘i’,[10,20,30,40])
arr2=arr.array(‘i’,[1])
print(arr1)
print(arr2)
print(arr3)
#zeros
print(myArr)
Output:
After repeting twice: array(‘i’, [10, 20, 30, 40, 10, 20, 30, 40])
Figure: representation of a polynomial in an array, where the coefficient of the term with power i is stored at index position i
import array as arr

def create_poly(poly):
    while True:
        pr=int(input("Enter power: "))
        cof=int(input("Enter coefficient: "))
        poly[pr]=cof
        ch=input("Continue?(y/n): ")
        if ch.upper()=='N':
            break

def display(poly):
    size=len(poly)
    for i in range(size-1,-1,-1):
        if poly[i]!=0:
            print(str(poly[i])+"x^"+str(i),end="+")
    print("\b ")

def add_poly(pol1, pol2):
    l=len(pol1)
    pol3=arr.array('i',[0]*l)
    for i in range(l):
        pol3[i]=pol1[i]+pol2[i]
    return pol3
p1=int(input("Enter the highest power of the 1st polynomial: "))
p2=int(input("Enter the highest power of the 2nd polynomial: "))
p=max(p1,p2)
poly1=arr.array(‘i’,[0]*(p+1))
poly2=arr.array(‘i’,[0]*(p+1))
create_poly(poly1)
create_poly(poly2)
poly3=add_poly(poly1, poly2)
display(poly1)
display(poly2)
display(poly3)
print(myInt2DArray)
9.3]])
print(myFloat2DArray)
myComplex2DArray = npy.array([[5,10,15],[20,25,30]],
dtype=complex)
my2DInt32Array = npy.array([[1,3,5,7],[2,4,6,8]],
dtype=npy.int32)
print(my2DInt32Array)
[[10 20 30]
[40 50 60]]
[4.5 6. 9.3 ]]
[[1 3 5 7]
[2 4 6 8]]
We can create a 2D array of zeros and ones using the zeros() and ones()
methods also. In both methods we need to specify the length of the
dimensions, i.e., the number of rows and columns, as argument. The general
format of the zeros() method is: Numpy.zeros(Shape[, dtype=type][,
order])
where Shape indicates the dimensions of the array, i.e. the number of rows
and columns of the array, dtype indicates the data type of the array, and
order represents whether it follows row major or column major
representation of data in memory.
where Shape indicates the dimensions of the array, i.e. the number of rows
and columns of the array, dtype indicates the data type of the array, and
order represents whether it follows row major or column major
representation of data in memory. Consider the following example:
import numpy as npy
my2DZerosArray = npy.zeros((2,3))
print(my2DZerosArray)
my2DOnesArray = npy.ones((4,3), dtype=int)
print(my2DOnesArray)
’float’)])
print(myMixArray)
Output:
[[0. 0. 0.]
[0. 0. 0.]]
[[1 1 1]
[1 1 1]
[1 1 1]
[1 1 1]]
where Start is the starting value of the range, End indicates the end value
of the range but excluding End value, i.e., up to End-1, step indicates the
gap between values, and dtype indicates the data type of the array.
On the other hand, the reshape() method changes the shape of the array
without changing the data of the array. The number of rows and columns
are needed to be sent as argument with this method. But we have to be
careful that the product of rows and columns must be the same as the total
elements in the array.
my2DArray = npy.arange(1,24,2).reshape(3,4)
print(my2DArray)
[[ 1 3 5 7]
[ 9 11 13 15]
[17 19 21 23]]
Array_name[row][column]
In Python, row number and column number both start from zero, as
described in Figure 3.3. So, to access the second element vertically and the
fourth horizontally from the array named arr2d the expression would be:
arr2d[1][3]
a = arr2d[i+2][j]
We can access the rows and columns of a 2D array like single elements. To
access the rows we need to specify only the row index along with the array
name. Hence, arr2d[0]
represents the first row, arr2d[1] represents the second row, and arr2d[2]
represents the third row. And yes obviously! We can specify negative index
also to access from the end. To access the columns we have to specify a (:)
followed by a (,) followed by the column index within the subscript
operator. The following example illustrates these: import numpy as npy
myInt2DArray = npy.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print(“Entire array:”)
print(myInt2DArray)
print(“First row:”)
print(myInt2DArray[0])
print(“Second row:”)
print(myInt2DArray[1])
print(“Third row:”)
print(myInt2DArray[2])
print(“First & Second row:”)
print(myInt2DArray[0:2])
print(myInt2DArray[1:])
print(myInt2DArray[:0:-1])
print(“First column:”)
print(myInt2DArray[:,0])
print(“Second column:”)
print(myInt2DArray[:,1])
print(“Third column:”)
print(myInt2DArray[:,2])
If a 2D array has m rows and n columns, following the row major representation the first n elements of the first row will be stored in the first n memory locations, then n elements of the second row in the next n memory locations, then n elements of the third row will be stored in the next n memory locations, and
so on. Thus, elements are stored row by row. This is followed in most of the
programming languages like C, C++, Java, and so on. This concept has
been shown in Figure 3.4.
Figure 3.4 Row major representation of a 4×3 two dimensional array: the elements are stored in the order (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2) (3,0) (3,1) (3,2)
Like single
dimensional arrays, two dimensional arrays also store only the base address,
i.e., the starting address of the array. Based on the base address, the
addresses of the other positions can be calculated. Suppose we have a 2D array, arr, of size m×n, the lower index of row is r_ind, that of column is c_ind, and each element occupies size bytes. Then, following the row major representation, the address of the element at the i-th row and j-th column position can be calculated as:
Address of arr[i][j] = Base Address of arr + [ (i – r_ind) × n + (j – c_ind) ] × size
As in C, C++, or Java both the value of r_ind and c_ind is 0, so the above formula is converted to:
Address of arr[i][j] = Base Address of arr + [ i × n + j ] × size
If a 2D array has m rows and
n columns, the first m elements of the first column will be stored in the first
m memory locations, then m elements of the second column will be stored
in the next m memory locations, then m elements of the third column will
be stored in the next m memory locations, and so on. Thus elements are
stored column by column. This is followed in the FORTRAN programming
language. This concept has been shown in Figure 3.5.
Figure 3.5 Column major representation of a 4×3 two dimensional array: the elements are stored in the order (0,0) (1,0) (2,0) (3,0) (0,1) (1,1) (2,1) (3,1) (0,2) (1,2) (2,2) (3,2)
Suppose we
have a 2D array, arr, of size m× n and the lower index of row is r_ind and
that of column is c_ind. Therefore, following the column major
representation, the address of the element at the i-th row and j-th column
position can be calculated as:
Address of arr[i][j] = Base Address of arr + [ (j – c_ind) × m + (i – r_ind) ] × size
As in C, C++, or Java both the value of r_ind and c_ind is 0, so the above formula is converted to:
Address of arr[i][j] = Base Address of arr + [ j × m + i ] × size
Solution:
= 2060 + [ 60 + 6 ] * 2
= 2060 + 66 * 2
= 2060 + 132
= 2192
= 2060 + [ 80 + 11 ] * 2
= 2060 + 91 * 2
= 2060 + 182
= 2242
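The two formulas can also be evaluated with a small helper; the function names below are illustrative and assume a 0-indexed array of m rows and n columns whose elements occupy size bytes each:

def row_major_address(base, n, i, j, size):
    # address of arr[i][j] when elements are stored row by row
    return base + (i * n + j) * size

def col_major_address(base, m, i, j, size):
    # address of arr[i][j] when elements are stored column by column
    return base + (j * m + i) * size

print(row_major_address(2000, 5, 2, 3, 4))    # 2000 + (2*5 + 3)*4 = 2052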
print(newArray)
Output:
print(“Resultant Matrix: ”)
print(newArray)
Output:
Resultant Matrix:
[[220 280]
[490 640]]
newArray=npy.transpose(my2DArray)
print(newArray)
Output:
[[10 40]
[20 50]
[30 60]]
(:) only n is mentioned, it considers the n-th column only. To specify the
range we need to mention the start and end before and after (:)
correspondingly.
import numpy as npy
my2DArray = npy.array([[10,20,30],[40,50,60],[70,80,90],[5,10,15]])
print("my2DArray =")
print(my2DArray)
print(“my2DArray[:2,:2] =” )
print(my2DArray[:2,:2])
print(“my2DArray[:,:2] =”)
print(my2DArray[:,:2])
print(“my2DArray[:2,] =”)
print(my2DArray[:2,])
print(“my2DArray[:,2] =”)
print(my2DArray[:,2])
print(“my2DArray[:2,2] =”)
print(my2DArray[:2,2])
print(“my2DArray[1:3,1:3] =”)
print(my2DArray[1:3,1:3])
Output:
my2DArray =
[[10 20 30]
[40 50 60]
[70 80 90]
[ 5 10 15]]
my2DArray[:2,:2] =
[[10 20]
[40 50]]
my2DArray[:,:2] =
[[10 20]
[40 50]
[70 80]
[ 5 10]]
my2DArray[:2,] =
[[10 20 30]
[40 50 60]]
my2DArray[:,2] =
[30 60 90 15]
my2DArray[:2,2] =
[30 60]
my2DArray[1:3,1:3] =
[[50 60]
[80 90]]
Figure: a 10×6 sparse matrix in which only a few elements (such as 15, 23, 21, and 11) are non-zero
Solution:
In the original matrix, the number of rows is 10, the number of columns is 6,
and the total number of non-zero elements is 7. Hence, the number of rows
in the new sparse matrix will be 7 + 1 = 8: the first row stores the number of
rows, the number of columns, and the number of non-zero elements of the
original matrix (10, 6, 7), and each remaining row stores the row index, the
column index, and the value (such as 15, 23, 21, 5, and 11) of one non-zero
element.
import numpy as npy

def OriginalToSparse(matrix):
    row = len(matrix)
    col = len(matrix[0])
    sparseMatrix = npy.array([[row, col, 0]])
    temp = npy.array([0, 0, 0])
    c = 0
    for i in range(row):
        for j in range(col):
            if matrix[i][j] != 0:
                temp[0] = i
                temp[1] = j
                temp[2] = matrix[i][j]
                sparseMatrix = npy.append(sparseMatrix,[temp],axis=0)
                c += 1
    sparseMatrix[0][2] = c
    return sparseMatrix
myMatrix = npy.array([[0, 0, 0, 6, 0, 0, 0],
[0, 2, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 5, 0, 0, 7, 0],
[4, 0, 0, 0, 0, 0, 0]])
print("\nOriginal Matrix: ")
print(myMatrix)
sparseMatrix = OriginalToSparse(myMatrix)
print("\nSparse Matrix: ")
print(sparseMatrix)
Output:
Original Matrix:
[[0 0 0 6 0 0 0]
[0 2 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 5 0 0 7 0]
[4 0 0 0 0 0 0]]
Sparse Matrix:
[[5 7 5]
[0 3 6]
[1 1 2]
[3 2 5]
[3 5 7]
[4 0 4]]
#general matrix
def SparseToOriginal(sparseMatrix):
    row = sparseMatrix[0][0]
    col = sparseMatrix[0][1]
    count = sparseMatrix[0][2]
    myMatrix = npy.zeros((row, col))
    for i in range(1, count+1):
        r = sparseMatrix[i][0]
        c = sparseMatrix[i][1]
        element = sparseMatrix[i][2]
        myMatrix[r][c] = element
    return myMatrix
sparseMatrix = npy.array([[5, 7, 5],
[0, 3, 6],
[1, 1, 2],
[3, 2, 5],
[3, 5, 7],
[4, 0, 4]])
print("\nSparse Matrix: ")
print(sparseMatrix)
originalMatrix = SparseToOriginal(sparseMatrix)
print("\nOriginal Matrix: ")
print(originalMatrix)
Output:
Sparse Matrix:
[[5 7 5]
[0 3 6]
[1 1 2]
[3 2 5]
[3 5 7]
[4 0 4]]
Original Matrix:
[[0. 0. 0. 6. 0. 0. 0.]
[0. 2. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0.]
[0. 0. 5. 0. 0. 7. 0.]
[4. 0. 0. 0. 0. 0. 0.]]
row1 = Matrix1[0][0]
col1 = Matrix1[0][1]
row2 = Matrix2[0][0]
col2 = Matrix2[0][1]
i=j=1
c=0
sparseMatrix =
npy.append(sparseMatrix,[Matrix1[i]],axis=0)
i+=1
sparseMatrix =
npy.append(sparseMatrix,[Matrix2[j]],axis=0)
j+=1
npy.append(sparseMatrix,[Matrix1[i]],axis=0)
i+=1
sparseMatrix =
npy.append(sparseMatrix,[Matrix2[j]],axis=0)
j+=1
else:
temp[0]= Matrix1[i][0]
temp[1]= Matrix1[i][1]
temp[2]= Matrix1[i][2]+Matrix2[j][2]
sparseMatrix =
Array
73
npy.append(sparseMatrix,[temp],axis=0)
i+=1
j+=1
c+=1
sparseMatrix[0][2] = c
return sparseMatrix
sparseMatrix1 = npy.array([[5, 7, 5],
[0, 2, 6],
[1, 1, 2],
[3, 2, 5],
[3, 5, 7],
[4, 0, 4]])
[0, 3, 6],
[1, 1, 2],
[3, 2, 5],
[3, 5, 7],
[4, 0, 4]])
print(sparseMatrix1)
print(sparseMatrix2)
print(“\nResultant Matrix: ”)
print(finalMatrix)
Output:
1st Sparse Matrix:
[[5 7 5]
[0 2 6]
[1 1 2]
[3 2 5]
[3 5 7]
[4 0 4]]
[[5 7 5]
[0 3 6]
[1 1 2]
[3 2 5]
[3 5 7]
[4 0 4]]
Resultant Matrix:
[[ 5 7 6]
[ 0 2 6]
[ 0 3 6]
[ 1 1 4]
[ 3 2 10]
[ 3 5 14]
[ 4 0 8]]
Here are a few programming examples that will help us understand the
various operations that can be performed on an array.
Program 3.14 Write a program to store n numbers in an array and find their
average.
#their average
import array as arr
n=int(input("Enter number of Elements: "))
elements=arr.array('i')
for i in range(n):
    elements.append(int(input("Enter number: ")))
sum=0
for i in range(n):
    sum+=elements[i]
avg = sum/n
print("Average = ",avg)
Output:
Enter number of Elements: 4
Average = 43.25
Program 3.15 Write a program to input five numbers through the keyboard.
Compute and display the sum of even numbers and the product of odd
numbers.
#numbers.
import array as arr
n=5
myArray=arr.array('i')
for i in range(n):
    myArray.append(int(input("Enter number: ")))
sum=0
prod=1
for i in range(n):
    if myArray[i]%2==0:
        sum+=myArray[i]
    else:
        prod*=myArray[i]
print("Sum of even numbers = ",sum)
print("Product of odd numbers = ",prod)
Output:
import array as arr
n=int(input("Enter number of Elements: "))
elements=arr.array('i')
for i in range(n):
    elements.append(int(input("Enter number "+str(i+1)+": ")))
sum=0
for i in range(n):
    sum+=elements[i]
mean = sum/n
print(mean)
sqrDev = 0
for i in range(n):
    deviation = elements[i]-mean
    sqrDev += deviation*deviation
variance = sqrDev/n
print("Variance = ",variance)
Output:
Enter number 1: 23
Enter number 2: 78
Enter number 3: 57
Enter number 4: 7
Enter number 5: 43
Enter number 6: 39
41.166666666666664
Variance = 518.8055555555555
The next program prints Pascal's triangle:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
. . . . .
import numpy as npy
n=int(input("Enter the number of rows: "))
pascal=npy.zeros((n,n), dtype=int)
for i in range(n):
    for j in range(i+1):
        if j==0 or i==j:
            pascal[i][j] = 1        #1st column
                                    # and diagonal
        else:
            pascal[i][j] = pascal[i-1][j]+pascal[i-1][j-1]
        print("%2d"%(pascal[i][j]), end=' ')
    print()
#Sales
import numpy as npy
row=int(input("Enter number of employees: "))
col=int(input("Enter number of months: "))
sales=npy.zeros((row,col), dtype=int)
for i in range(row):
    for j in range(col):
        sales[i][j]=int(input("Enter sales: "))
print(end='\t\t')
for i in range(1,col+1):
    print('Month%d'%(i), end='\t\t')
print('Total')
for i in range(row):
    tot=0
    print("Employee",i+1, end='\t')
    for j in range(col):
        print("%5d"%(sales[i][j]), end='\t\t')
        tot+=sales[i][j]
    print("%5d"%(tot))
grandTotal=0
print("Total", end='\t\t')
for j in range(col):
    tot=0
    for i in range(row):
        tot+=sales[i][j]
    print("%5d"%(tot), end='\t\t')
    grandTotal+=tot
print("%5d"%(grandTotal))
Output:
              Month1    Month2    Month3    Total
Employee 1        25       365       121      511
Employee 2        23      2356       420     2799
Employee 3       225        65       320      610
Employee 4       110        80       275      465
Total            383      2866      1136     4385
Array at a Glance
✓ The subscript or array index may be any valid integer constant, integer
variable or integer expression.
1. What is an array?
a) Collection of homogeneous data elements.
d) None of these.
d) None of these.
a) 0
b) 1
c) -1
a) Price of cars.
d) Marks of an examination.
6. What is the proper syntax to declare an integer array using array module
after executing the statement: import array as arr
a) myArray=arr.array(‘i’,[10,20,30])
b) myArray=arr.array(‘i’,(10,20,30))
c) myArray=arr.array(‘i’,{10,20,30})
a) narray
b) ndarray
c) nd_array
d) darray
a) zeros()
b) ones()
c) arange()
d) All of these
b) append()
c) extend ()
10.pop() is used
d) to remove and return the last element as well as any other element from
an array.
a) zeroArra=array.array(‘i’,[0]*5)
b) zeroArra=array.array(‘i’,[0*5])
c) zeroArra=array.array([0,0,0,0,0])
a) multiply()
b) * operator
c) dot ()
14. A is an array of size m * n, stored in the row major order. If the address
of the first element in the array is M, the address of the element A(i, j) (A(0,
0) is the first element of the array and each element occupies one location in
memory) is
a) M+(i-j)*m+j-1
b) M+(i -1)*m+i-1
c) M+i*m+j
d) M+(i-1)*n+j-1
Review Exercises
5. What are the different ways by which we can remove elements from an
array in Python?
elements=arr.array(‘i’,[5,10,15,20,25,30,35,40,45,50])
print(elements[2:8])
print(elements[0:10])
print(elements[:8])
print(elements[8:])
print(elements[0:])
print(elements[-8:-2])
print(elements[-8:-2:-1])
print(elements[-2:-8:-1])
print(elements[-2:-8:-2])
print(elements[::-2])
print(elements[::2])
print(myArray1)
myArray2 = npy.arange(5,50,5)
print(myArray2)
myArray3 = npy.arange(.5,5.5,.5)
print(myArray3)
myArray=arr.array(‘f’,[5,10,15,20])
myArray.append(2.5)
print(myArray)
myArray.append(25)
print(myArray)
myArray.insert(0,.5)
print(myArray)
myArray.insert(10,.5)
print(myArray)
myArray.extend([5,.5,5.5])
print(myArray)
elements=arr.array(‘i’,[1,2,2,3,4,2,1,3])
del elements[2]
print(elements)
elements.remove(1)
print(elements)
num=elements.pop()
print(elements)
print(elements)
elements=arr.array(‘f’,[1,1.5,2,2.5,3])
arr1=arr.array(‘f’,[1.2])
elements[1:4]=arr1
print(elements)
arr2=arr.array(‘f’,[1.4,1.6,1.8])
elements[1:4]=arr1
print(elements)
odd =arr.array(‘i’,[1,3,5,7])
even=arr.array(‘i’,[2,4,6])
numbers=odd+even
print(numbers)
print(odd*3)
print(2*even)
10. Write a program to delete an element from the k-th position of an array.
18. Maximum temperatures of each day for 20 cities are recorded for the
month of January. Write a program to find the following:
b) The day in which the highest temperature is recorded for the city.
Chapter
Python Data Structures
4.1 Lists
List is a very important and useful data structure in Python. It has almost all
the functionalities of an array but with more flexibility. Basically a list is a
collection of heterogeneous elements. List items are ordered, which means
elements of a list are stored in a specific index position. List items are
mutable, which indicates that it is possible to change or edit the elements of
a list, and duplicate items are allowed in a list. Another important feature of
a list is that it is dynamic. It can grow or shrink during program execution
according to our requirement.
An empty list can be created in either of the following ways:
myList = [ ]
myList = list( )
List elements are accessed just like array elements, i.e., the elements of a
list are accessed using a list index. Like arrays, in case of lists also the index
value starts from 0. Hence, the first position of a list is 0, second is 1, and so
on. To access a list element we need to mention the list name followed by
the list index enclosed within [ ]. The general format to access a list element
is:
List_name[ index ]
Suppose we have a list, myList = [10, 12, 25, 37, 49], then to access the
first element of the list we have to write myList[0]. Similarly, to access the
next elements we have to write myList[1], myList[2], and so on.
myList[0]   myList[1]   myList[2]   myList[3]   myList[4]
   10          12          25          37          49
myList[1] = 53
num = myList[1]
Thus, the expression myList [1] acts just like a variable of some basic type.
Not only assignment, it can also be used in input statement, output
statement, in any arithmetic expression – everywhere its use is similar to a
basic type variable.
We can use a variable as a subscript or list index to access the list elements.
At runtime this variable may contain several values and thus different list
elements are accessed accordingly. This facility of using variables as
subscripts makes lists equally useful as arrays.
myList[0] = a
myList[a] = 75
# where a is an integer
b = myList[a+2]
# where a is an integer
Negative index :   -5     -4     -3     -2     -1
myList         :   10   1234    563     27     98
Positive index :    0      1      2      3      4
Figure 4.1 Positive and negative index to access list elements
Consider Figure 4.1. Here, myList[-1] returns 98, myList[-2] returns 27, and so on.
Hence, Python gives us the opportunity to access array elements in forward
direction as well as backward direction.
Example 4.1: What will be the output for the following code segment?
Output:
myList[0] = 10
myList[1] = -3
myList[4] = 57
myList[-1] = 57
myList[-4] = -3
Just like an array in Python, we are able to add one or more elements into a
list using append(), extend(), and insert() methods. Using append() we can
add a single element at the end of a list, whereas extend() helps us to add
multiple elements at the end of a list. On the other hand, insert() helps us to
insert an element at the beginning, end, or at any index position in a list.
The following program explains the operations of these functions:
Example 4.2: Write a program to show the use of append(), insert() and
extend()
elements=[10,20,30]
print(elements)
elements.append(40)     #Inserts 40 at the end
print(elements)
elements.extend([50,60,70])     #Inserts 50, 60 and 70 at the end
print(elements)
elements.insert(0,5)
print(elements)
elements.insert(2,15)
elements.insert(-2,15)
print(elements)
Output:
[5, 10, 15, 20, 30, 40, 50, 15, 60, 70]
Elements can be removed from a list using the del statement or remove(),
pop(), and clear(). Using the del statement we can remove an element of a
specific index position. If we want to delete the entire list, we need to
mention the list name only with the del statement. remove() deletes the first
occurrence of an element in a list. But if the element does not exist in the
list, it raises an error. pop() removes an element from a specified index
position. If we do not mention the index position, pop() removes the last
element. In both cases, after removing the element it also returns the
element. To remove all elements from a list, the clear() method is used. We
can also remove a portion of a list by assigning an empty list to a slice of
elements. The following program explains the operations of these functions:
Example 4.3: Write a program to show the use of del, remove(), pop() and
clear().
elements=[10,20,30,40,10,50,60,70]
print(elements)
print(elements)
print(elements)
print(elements)
=“ ”)
print(elements)
#position 1 and 2
”)
print(elements)
print(elements)
del elements
Output:
Before deletion : [10, 20, 30, 40, 10, 50, 60, 70]
Popped element is : 20
[40, 60]
By using slicing operation we can extract one or more elements from a list.
This operation can be done using the ( : ) operator within the subscript
operator ([ ]). The general format of the slicing operation on a list is:
where, Start is the starting index position from which the extraction
operation starts. The extraction operation continues up to the End – 1 index
position and Step represents the incremented or decremented value needed
to calculate the next index position to extract elements. If Start is omitted,
the beginning of the list is considered, i.e., the default value of Start is 0. If
End is omitted, the end of the list is considered and the default Step value
is 1. If we use negative Step value, elements will be extracted in reverse
direction.
Thus, for negative Step value, the default value of Start is -1 and that of
End indicates the beginning of the list. The following example illustrates
these concepts: Example 4.4: Write a program to show the slicing
operation on a list.
elements=[10,20,30,40,50]
print(elements)
slice1 = elements[1:4]
print(slice1)
slice2 = elements[:4]
print(slice2)
slice3 = elements[1:]
print("New extracted List is : ", end=" ")
print(slice3)
slice4 = elements[1:4:2]
slice4 = elements[::2]
print(slice4)
slice5 = elements[4:1:-1]
print(slice5)
slice6 = elements[-1:-4:-1]
print(slice6)
slice7 = elements[-1:1:-1]
print(slice7)
slice8 = elements[-3::-1]
the List
print (“New extracted List is : ”, end =“ ”)
print(slice8)
slice9 = elements[:-4:-1]
print(slice9)
slice10 = elements[::-1]
print(slice10)
Output:
To search an element within a list index() is used. This function returns the
index position of the first occurrence of an element passed as an argument
with this function.
The membership operators in and not in can also be applied on a list for
searching an element. The operator in returns True if the element is found
in the list; otherwise it returns False. The operator not in works just as its
reverse. If the element is found in the list, it returns False; otherwise True.
Hence, using the membership operator it can be confirmed whether an
element is present or not in a list. After confirmation we can determine its
position using the index() method. There is another function, count(), which
counts the occurrence of an element, passed as an argument with this
method, in a list.
elements=[10,20,30,40,10]
print("List is : ", elements)
num=int(input("Enter the element to search: "))
if num in elements:               #membership test confirms presence first
    posn = elements.index(num)    #index of the first occurrence of num
    print("%d found at index position "%(num), end="")
    print(posn)
    print("%d occurs %d time(s) in the list"%(num,elements.count(num)))
else:
    print(num, "is not present in the list")
Output:
elements=[10,20,30,40,10]
print(elements)
print(elements)
#position 1 to 3
3: ”)
print(elements)
#by 3 elements
print(elements)
Output:
Note the last output. Here two existing elements are replaced by three new
elements and thus the number of elements in the list increases.
We can also concatenate two or more lists. By this operation two or more
lists can be joined together. Concatenation is done using the + operator.
Consider the following example: Example 4.7: Write a program to show
the concatenation operation of lists.
myList1=[10,20,30,40]
myList2=[5,10,15,20,25]
myList3=[22,33,44]
print(myList1)
print(myList2)
print(“3rd List : ”, end =“ ”)
print(myList3)
myNewList = myList1 + myList2 + myList3    #Concatenating the 3 lists
print(myNewList)
Concatenated List :
[10, 20, 30, 40, 5, 10, 15, 20, 25, 22, 33, 44]
list1=[10,20,30,40]
list2=[1]
print(list2)
print(list3)
print(list4)
Output:
After repeating twice: [10, 20, 30, 40, 10, 20, 30, 40]
print(“mylist = ”,mylist)
print(“mylist[0] = ”,mylist[0])
print(“mylist[1] = ”,mylist[1])
print(“mylist[2] = ”,mylist[2])
print(“mylist[0][2] = ”,mylist[0][2])
print(“mylist[1][2] = ”,mylist[1][2])
print(“mylist[2][1] = ”,mylist[2][1])
Output:
mylist[0] = Techno
mylist[1] = [8, 4, 6]
mylist[0][2] = c
mylist[1][2] = 6
mylist[2][1] = 3.8
4.1.5 List Functions
Python provides a set of functions that work on iterables, which include all
sequence types (like list, string, tuples, etc.) and some non-sequence types
(like dict, file objects, etc.). In this section we discuss some of these built-in
functions. Here list is used as an argument of these functions. However,
they are applicable on other iterables as well.
The following example illustrates the use of these functions: Example 4.10:
Write a program to show the use of list functions.
print(“mylist = ”,mylist)
print(“Max = ”,max(mylist))
print(“Min = ”,min(mylist))
print(“Sum = ”,sum(mylist))
print(“any(mylist) : ”,any(mylist))
Output:
Max = 371
Min = 4
Sum = 627
all(mylist) : True
any(mylist) : True
We have already discussed some list methods. Here we are discussing some
other methods.
copy( ): It returns a copy of the list for which the method is called. Though we can use ‘=’ to copy the content of a list to another variable, ‘=’ does not copy at all. It just assigns a reference to the same list object, which means any change made through one variable is also reflected in the other; for a true copy we need to use the copy() method.
count( ): It returns the number of occurrences of an element passed as
argument in a list.
sort( ): This method sorts the list elements in ascending order. To sort in
descending order we need to pass the argument reverse = True.
mylist = [2,3,4,2,1,2,3]
print("mylist = ",mylist)
print("Occurrence of 2 in list = ",mylist.count(2))
newlist = mylist.copy()       #true copy of the list
print("Copied list = ",newlist)
reflist = mylist              #only a reference to the same list
mylist.reverse()
print("After reverse() = ",mylist)
mylist.sort()
print("After sort() = ",mylist)
mylist.sort(reverse=True)
print("After sort(reverse=True) = ",mylist)
Output:
mylist = [2, 3, 4, 2, 1, 2, 3]
Occurrence of 2 in list = 3
As a list is iterable, we can iterate through a list very easily using the for
statement.
my_list = [10,20,30,40]
for i in my_list:
print(i, end=‘ ’)
10 20 30 40
We can also iterate through a list using the list index as follows: Example
4.13: Write a program to show how to iterate through a list using the loop
index.
my_list = [10,20,30,40]
for i in range(len(my_list)):
print(my_list[i], end=’ ‘)
Output:
10 20 30 40
If we use the list index to iterate through a list, we may use the while loop
also.
Example 4.14: Write a program to show how to iterate through a list using
the while loop.
my_list = [10,20,30,40]
i=0
while i<len(my_list):
print(my_list[i], end=‘ ’)
i+=1
Output:
10 20 30 40
Though the operations of a list and an array are almost the same, still there
are some differences between them. First of all, an array is a strictly
homogeneous collection, whereas a list may contain heterogeneous
elements. An array is a static data structure.
We cannot change its size during program execution. But a list is dynamic.
We may insert new elements or remove existing elements as and when
required. Now the question is: which one is better? If its answer is not
simple, then what to use and when? If our goal is to accumulate some
elements only, then list is better as it is much more flexible than an array.
But when we need to maintain strictly that elements should be
homogeneous, array is the choice. In mathematics, matrix plays a very
important role. Using NumPy we can do mathematical computations on
arrays and matrices very easily. On the other hand, arrays of array module
use less space and perform much faster. Hence, if our requirement is that
we need not change the array size and it strictly stores homogeneous
elements, use of arrays of array module is the better option. Otherwise it is
better to use Numpy for mathematical computations on arrays and matrices
and list for other uses.
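As a quick illustration of that trade-off, the three containers can be created side by side. This is only a sketch, and the element values are arbitrary.

import array as arr
import numpy as npy

homogeneous = arr.array('i', [10, 20, 30])    #array module: compact, fixed element type
flexible = [10, 'twenty', 30.0]               #list: heterogeneous and resizable
matrix = npy.array([[1, 2], [3, 4]])          #NumPy array: mathematical computations

print(homogeneous[1], flexible[1])
print(matrix * 2)                             #element-wise arithmetic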
4.2 Tuples
A tuple is created by placing comma-separated values within parentheses ( ); even comma-separated values written without parentheses are treated as tuples by default. Like lists, tuples are also ordered and allow duplicate values, but elements of tuples are immutable. We cannot insert new elements, or delete or modify existing elements in a tuple. Hence, a tuple cannot grow or shrink dynamically.
We can also create a tuple without using parenthesis. This is called tuple
packing.
The above statement creates a tuple, myTuple. If we now print the tuple as
print(myTuple), it will show the output: ( 25, “Sudip Chowdhury”, 87.2) In
contrast to packing there is another operation called Unpacking. By this
feature, tuple elements are stored into different variables. For example, the
statement
stores the tuple elements into the variables a, b and c respectively. Thus,
after the execution of this statement, a will contain 25, b will contain “Sudip
Chowdhury”
We can also create an empty tuple. An empty tuple can be created as:
myTuple = ( )
myTuple = tuple( )
But to create a single element tuple we cannot write:
notTuple = (5)
This statement will execute well but the type of the variable notTuple is not
a tuple. It is an integer. To declare a single element tuple we need to write:
myTuple = (5,)
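A short sketch of packing and unpacking, reusing the values mentioned above:

#Packing: the comma-separated values form a tuple automatically
myTuple = 25, "Sudip Chowdhury", 87.2
print(myTuple)            #(25, 'Sudip Chowdhury', 87.2)

#Unpacking: each element is stored into a separate variable
a, b, c = myTuple
print(a)                  #25
print(b)                  #Sudip Chowdhury
print(c)                  #87.2

#A single-element tuple needs the trailing comma
single = (5,)
print(type(single))       #<class 'tuple'>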
Just as arrays and lists, tuple elements are also accessed using index values.
Tuple index also starts from 0. Negative index is also allowed in accessing
tuple elements. It always starts from -1. To access a tuple element we
mention the tuple name followed by the tuple index enclosed within [ ].
Slicing operation is also applicable on tuples. The general format to access
a tuple element is:
Tuple_name[ index ]
The following example illustrates how tuple elements are accessed and the
slicing operation on a tuple:
Output:
my_tuple[2] = 30
my_tuple[-3] = 60
Tuples are immutable. Hence, we cannot use any type of insertion, deletion
or modification operation. However, if any element of a tuple is mutable,
then it can be modified. We can delete an entire tuple using the del
statement. Like lists, tuples also support the concatenation and repetition
operations. Membership operators (in and not in) also work on tuples. We
can also compare two tuples. In case of comparison, corresponding
members of two tuples are compared. If the corresponding members satisfy
the condition, it proceeds for the next set of elements; otherwise it returns
False. It returns True only if for all the corresponding set of members, the
condition is satisfied. The following example illustrates these features:
oddTuple = (1, 3, 5)
evenTuple = (2, 4, 6, 8)
print(“oddTuple : ”, oddTuple)
print(“evenTuple : ”, evenTuple)
newTuple = oddTuple * 3
print(“(1,2,3)<(1,2,3,4)? : ”, (1,2,3)<(1,2,3,4))
print(“(1,20,30)>(10,2,3)? : ”, (1,20,30)>(10,2,3))
print(“(1,20,3)>=(1,2,30)? : ”, (1,20,3)>=(1,2,30))
Output:
oddTuple : (1, 3, 5)
evenTuple : (2, 4, 6, 8)
Is 2 in oddTuple? : False
(1,2,3)<(1,2,3,4)? : True
(1,20,30)>(10,2,3)? : False
(1,20,3)>=(1,2,30)? : True
As tuples are also iterable, we can use basic Python functions that work on
any iterable function. These functions have already been discussed with list.
The following example shows their operation on tuples:
print(“myTuple = ”,myTuple)
print(“Max = ”,max(myTuple))
print(“Min = ”,min(myTuple))
print(“Sum = ”,sum(myTuple))
print(“all(myTuple) : ”,all(myTuple))
print(“any(myTuple) : ”,any(myTuple))
newTuple1 = tuple(“Python”)
Output:
Max = 371
Min = 4
Sum = 627
all(myTuple) : True
any(myTuple) : True
Similar to a list, a tuple may contain another tuple or list as an element. This
is called nested tuple. Accessing the nested tuple elements is just like
accessing nested list elements.
Output:
my_tuple[0] = Techno
my_tuple[1] = (8, 4, 6)
my_tuple[1][2] = 6
my_tuple[2][1] = 3.8
Tuples have only two methods. These are count() and index(). The
operations of these methods are similar to the corresponding method of list.
The count() method counts the occurrence of a particular element supplied
as an argument in a tuple and
index() returns the index position of the first occurrence of an element supplied as argument in a tuple. Consider the following example:
my_tuple=(2,4,5,2,6,2,7)
print("my_tuple = ", my_tuple)
print("Index of 6 = ", my_tuple.index(6))        #index of the first occurrence of 6
print("Occurrence of 6 = ", my_tuple.count(6))   #number of times 6 appears
Output:
my_tuple = (2, 4, 5, 2, 6, 2, 7)
As a tuple is also iterable, we can iterate through a tuple just like a list very
easily using the for statement. Consider the following example:
my_tuple = (10,20,30,40)
for i in my_tuple:
print(i, end=‘ ’)
Output:
10 20 30 40
We can also iterate through a tuple using a tuple index and that can be
performed using the for as well as while loop.
my_tuple = (10,20,30,40)
for i in range(len(my_tuple)):
print(my_tuple[i], end=‘ ’)
Output:
10 20 30 40
for i in square_tuple:
print(i, end=‘ ’)
Output:
Output:
Square of 1 is 1
Square of 2 is 4
Square of 3 is 9
4.3 Sets
Like lists and tuples, an empty set can also be created. Note that writing mySet = { } creates an empty dictionary, not a set, so an empty set must be created with the constructor:
mySet = set( )
A set can also be built from any iterable, for example from a string:
setFromString = set(“AEIOU”)
To add an element into a set the add() method is used. We can add the
elements of a set into other using the update() method. But remember, a set
does not allow duplicate elements. Consider the following example:
mySet1.add(20)
mySet1.update(mySet2)
Output:
After updating by another set: {35, 37, 10, 12, 15, 49, 20, 25}
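A self-contained sketch of add() and update() is shown below; the initial contents of mySet1 and mySet2 are assumptions chosen to be consistent with the output above.

mySet1 = {10, 12, 25, 37, 49}
mySet2 = {12, 15, 35}          #assumed second set

mySet1.add(20)                 #add a single element
print("After adding 20 :", mySet1)

mySet1.update(mySet2)          #add all elements of another set
print("After updating by another set:", mySet1)   #duplicates are ignored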
print(mySet)
print("After deleting 10 : ")
print(mySet)
print(mySet)
print(mySet)
del mySet
Output:
After deleting 10 :
After deleting 12 :
Popped element is : 49
After popping :
{25, 37}
oddSet = {1, 3, 5}
evenSet = {2, 4, 6, 8}
print(“oddSet : ”, oddSet)
print(“evenSet : ”, evenSet)
print(“Is 2 in oddSet? : ”, 2 in oddSet)
Output:
oddSet : {1, 3, 5}
evenSet : {8, 2, 4, 6}
Is 2 in oddSet? : False
union( ): This method finds the union of two sets. The ‘|’ operator also does
the same job.
intersection( ): This method finds the intersection of two sets. Using the
‘&’ operator we can do the same operation.
intersection_update( ): This method updates the set for which this method
is called by the intersection values of two sets.
difference( ): This method finds the difference between two sets. A new set
contains the elements that are in first set (here for which this method is
called) but not in the second set (which is passed as an argument). The
operator ‘–’ also does the same.
difference_update( ): This method updates the set for which this method is
called by the difference between two sets.
isdisjoint( ): This method returns True if the two sets do not have any
common element, i.e., their intersection produces a null set.
issubset( ): This method returns True if every element of the first set (i.e.,
for which this method is called) is also present in the second set (i.e., which
is passed as an argument). The operator ‘<=’ can also be used for this
operation.
issuperset( ): This method returns True if every element of the second set
(i.e., which is passed as an argument) is also present in the first set (i.e., for
which this method is called).
Along with these methods sets are also used with Python’s basic functions
for iterables.
These are max(), min(), len(), sum(), all(), any(), sorted(), etc. The
following example illustrates the operations of set methods:
Example 4.26: Example to show operations of set methods.
students = {‘Jadu’,’Madhu’,’Parna’,’Pulak’,’Ram’,’Rabin’,’
Shreya’,’Shyam’}
unionSet1 = physics.union(chemistry)
unionSet1)
interSet1 = physics.intersection(chemistry)
physics.intersection_update(chemistry)
physics)
diffSet1 = physics.difference(chemistry)
physics.difference_update(chemistry)
physics)
symDiffSet1 = physics.symmetric_difference(chemistry)
in both : ”, symDiffSet1)
physics.issubset(students))
Output:
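The same operations can also be demonstrated with a compact, self-contained sketch; the physics and chemistry sets below are assumed values, not taken from the example above.

physics = {'Ram', 'Shyam', 'Jadu', 'Madhu'}          #assumed sets
chemistry = {'Shyam', 'Madhu', 'Parna', 'Pulak'}

print("Union            :", physics | chemistry)     #or physics.union(chemistry)
print("Intersection     :", physics & chemistry)     #or physics.intersection(chemistry)
print("Only in physics  :", physics - chemistry)     #or physics.difference(chemistry)
print("In exactly one   :", physics ^ chemistry)     #or physics.symmetric_difference(chemistry)
print("Disjoint?        :", physics.isdisjoint(chemistry))
print("Subset?          :", physics.issubset(physics | chemistry))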
print(“mySet = ”,mySet)
print(“Max = ”,max(mySet))
print(“Min = ”,min(mySet))
print(“Sum = ”,sum(mySet))
print(“all(mySet) : ”,all(mySet))
print(“any(mySet) : ”,any(mySet))
copiedSet = mySet.copy()
print(“mySet = ”,mySet)
Output:
Max = 371
Min = 4
Sum = 627
all(mySet) : True
any(mySet) : True
New Set created from string : {‘h’, ‘n’, ‘P’, ‘o’, ‘y’, ‘t’}
4.3.4 Frozenset
A frozenset is an immutable version of a set: once it is created, elements cannot be added to it or removed from it, so methods like add() raise an error.
mySet.add(25)
print(“mySet = ”,mySet)
myFrozenSet=frozenset(mySet)
print(“myFrozenSet = ”,myFrozenSet)
myFrozenSet.add(25)
Output:
4.4 Dictionaries
A dictionary stores data as key–value pairs. The general format to create a dictionary is:
Dictionary_name = { key 1 : value 1,
key 2 : value 2,
key 3 : value 3,
….. }
Hence we may create a dictionary where keys are the roll numbers of
students and names of the students are the corresponding values of the keys
as: studentList = {1:“Shreya”, 2:“Sriparna”, 3:“Subrata”}
A dictionary where attribute and values are the key–value pair may be
defined as: my_dict = {‘Roll’:3, ‘Name’:‘Subrata’}
An empty dictionary can be created as:
mydict = { }
A value is accessed by mentioning its key within [ ]:
Dictionary_name[ Key ]
print(‘Roll : ‘, my_dict[‘Roll’])
print(‘Name : ‘, my_dict[‘Name’])
Output:
Roll : 3
Name : Subrata
To add an element we need not use any methods. This is true for modifying
an element as well. And the fun is that the same statement may add or
modify an element. What we
simply need to do is specify a value against a key. If the key is not in the
dictionary the key–value pair will be added to the dictionary; otherwise the
key will be updated with the new value. The same operation can be done
using the update( ) method. Consider the following example:
#new value
new_dict={‘Marks’: 576}
my_dict.update(new_dict)
Output:
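A fuller, self-contained sketch of adding and modifying keys is given below; the starting dictionary and the extra 'Grade' key are assumptions.

my_dict = {'Roll': 3, 'Name': 'Subrata'}      #assumed starting dictionary

my_dict['Marks'] = 570        #key not present, so the pair is added
print(my_dict)
my_dict['Marks'] = 576        #key present, so the value is modified
print(my_dict)

new_dict = {'Marks': 576, 'Grade': 'A'}       #'Grade' is an assumed extra key
my_dict.update(new_dict)      #update() adds or modifies in one call
print(my_dict)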
del dictionary_name[key]
If the key is present in the dictionary, it will be removed; otherwise it will raise KeyError.
We may also use the pop() method to solve this problem. The pop() method
removes the key, passed as an argument, from a dictionary and returns the
corresponding value. If the key is not present in the dictionary, it will also
raise KeyError. But we can solve this problem by supplying an additional
argument with the pop() method: dictionary_name.pop(key [, d])
where d is the default value that will be returned if the key is not present in
the dictionary.
del my_dict[‘Marks’]
my_dict.clear()
del my_dict
Output:
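A self-contained sketch of the removal operations follows; the dictionary contents are assumptions.

my_dict = {'Roll': 3, 'Name': 'Subrata', 'Marks': 576}

del my_dict['Marks']            #remove one key-value pair
print(my_dict)                  #{'Roll': 3, 'Name': 'Subrata'}

name = my_dict.pop('Name')      #remove the key and get its value back
print(name, my_dict)            #Subrata {'Roll': 3}

print(my_dict.pop('Age', None)) #missing key: the default value avoids KeyError

my_dict.clear()                 #remove every pair, keep the empty dictionary
print(my_dict)                  #{}

del my_dict                     #remove the dictionary object itself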
print(“my_dict : “, my_dict)
Output:
The keys(), values() and items() methods return iterables containing keys,
values and key–value pair respectively as tuples. Thus based on these
iterables we may iterate through a dictionary. Consider the following
example:
my_dict = {'Roll': 1, 'Name': 'Subrata', 'Marks': 675}
print("my_dict : ", my_dict)
for k in my_dict.keys():          #iterate over the keys
    print(k, end=' ')
print()
for v in my_dict.values():        #iterate over the values
    print(v, end=' ')
print("\nItems are : ")
for k, v in my_dict.items():      #iterate over key-value pairs
    print(k, v)
Output:
Items are :
Roll 1
Name Subrata
Marks 675
Like lists and tuples, a dictionary may also be nested. We may insert a
dictionary as a value against a key. Consider the following example. Here
names of the students are used as a key in the dictionary and against each
key a dictionary is inserted which further represents the marks of three
subjects as a pair of key–value. In this nested dictionary, subjects are the
keys and corresponding marks are the values against each key.
my_dict = { ‘Ram’:{‘c’:67,’Java’:82,’Python’:93},
‘Shyam’:{‘c’:82,’Java’:73,’Python’:89},
‘Jadu’:{‘c’:77,’Java’:85,’Python’:90}
print(“Name : ”, k, “\tMarks : ”, v)
Output:
Name : Ram
Name : Shyam
Name : Jadu
get(key[, d] ): Returns the value of the key sent as argument. If key is not
present, returns the default value, d.
setdefault( key[, d] ): It also returns the value of the key sent as argument.
But if the key is not present, it inserts the key with the default value, d.
Along with these methods dictionaries are also used with some of Python’s
basic functions like len(), str(), etc. The following example illustrates the
operations of dictionary methods and functions:
new_dict=dict.fromkeys([1,3,5])
dict)
new_dict=dict.fromkeys([1,3,5],0)
dict)
val = square_dict.get(3)
val = square_dict.setdefault(3)
copied_dict = square_dict.copy()
elements = len(square_dict)
str(square_dict))
Output:
Final Dictionary : {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 7: 49}
Converting the dictionary into string:{1:1, 2:4, 3:9, 4:16, 5:25, 7:49}
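A compact sketch of fromkeys(), get() and setdefault() is given below; the contents of square_dict are assumptions chosen to match the final dictionary shown above.

#fromkeys() builds a dictionary from a sequence of keys
new_dict = dict.fromkeys([1, 3, 5])        #values default to None
print(new_dict)                            #{1: None, 3: None, 5: None}
new_dict = dict.fromkeys([1, 3, 5], 0)     #common default value
print(new_dict)                            #{1: 0, 3: 0, 5: 0}

square_dict = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}   #assumed starting dictionary

print(square_dict.get(3))                  #9
print(square_dict.get(7, 'absent'))        #default returned, key not inserted
print(square_dict.setdefault(7, 49))       #49: key 7 is inserted with value 49

print("Final Dictionary :", square_dict)   #{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 7: 49}
print(str(square_dict))                    #str() converts the dictionary into a string
print(len(square_dict))                    #6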
• Lists and tuples are ordered sets of elements, but the elements of sets and
dictionaries are not ordered.
• Lists and tuples allow duplicate elements but the elements of a set and the
keys of dictionaries must be unique.
• Slicing operation can be done on lists and tuples but not on sets and
dictionaries.
• Lists, tuples and dictionaries can be nested but sets cannot be.
Program 4.1: Write a program to create a list whose elements are divisible
by n but not divisible by n2 for a given range.
#range.
numList=[]
numList.append(i)
print(numList)
Sample Output:
Program 4.2: Write a program to find the largest difference among a list of
elements. Do not use library functions and traverse the list only once.
numList=[]
for i in range(n):
numList.append(num)
max=min=numList[0]
for i in range(1,n):
if numList[i]>max:
max=numList[i]
elif numList[i]<min:
min=numList[i]
numList=[]
for i in range(n):
numList.append(num)
sum=0
for i in numList:
sum+=i
print(“Average = ”, sum/n)
Sample Output:
Average = 34.4
l=len(myList)
for i in range(l):
if num==myList[i]:
print(i, end=‘,’)
numList=[]
for i in range(n):
numList.append(num)
findNum(numList, number)
Sample Output:
#between 1 to 100.
import random
randomList=[]
for i in range(n):
num = random.randint(1,100)
randomList.append(num)
Sample Output:
numList=[]
for i in range(n):
numList.append(num)
oddList=[]
evenList=[]
for i in numList:
if i%2==0:
evenList.append(i)
else:
oddList.append(i)
Sample Output:
def mergeList(m,n):
l1=len(m)
l2=len(n)
i=j=0
newList=[]
while i<l1 and j <l2:
if m[i]<n[j]:
newList.append(m[i])
i+=1
else:
newList.append(n[j])
j+=1
while i<l1:
newList.append(m[i])
i+=1
while j<l2:
newList.append(n[j])
j+=1
return newList
def inputList(lst):
for i in range(n):
lst.append(num)
numList1=[]
numList2=[]
inputList(numList1)
inputList(numList2)
newList=mergeList(numList1,numList2)
Enter element 1: 2
Enter element 2: 4
Enter element 3: 6
Enter element 4: 8
Enter element 5: 12
Enter element 1: 1
Enter element 2: 3
Enter element 3: 5
Enter element 4: 7
Enter element 5: 9
Enter element 6: 11
Program 4.8: Write a program to create two sets such that one will contain
the names of the students who have passed in Economics and the other will
contain the names of those who have passed in Accountancy.
ii. Find the students who have passed in at least one subject iii. Find the
students who have passed in one subject but not in both iv. Find the students
who have passed in Economics but not in Accountancy v. Find the students
who have passed in Accountancy but not in Economics.
economics=set()
for i in range(n):
economics.add(name)
accountancy=set()
for i in range(n):
accountancy.add(name)
accountancy)
| accountancy)
both:”,economics ^ accountancy)
Accountancy:”,economics - accountancy)
Economics:”,accountancy - economics)
Sample Output:
Program 4.9: Write a program to create a dictionary that will contain all
ASCII values and their corresponding characters.
#characters.
asciiDict={}
for i in range(256):
asciiDict[i]=chr(i)
print(asciiDict)
Output:
‘.’, 47: ‘/’, 48: ‘0’, 49: ‘1’, 50: ‘2’, 51: ‘3’, 52: ‘4’, 53: ‘5’, 54: ‘6’, 55: ‘7’,
56: ‘8’, 57: ‘9’, 58: ‘:’, 59: ‘;’, 60: ‘<’, 61:
‘=’, 62: ‘>’, 63: ‘?’, 64: ‘@’, 65: ‘A’, 66: ‘B’, 67: ‘C’, 68: ‘D’, 69: ‘E’, 70:
‘F’, 71: ‘G’, 72: ‘H’, 73: ‘I’, 74: ‘J’, 75: ‘K’, 76: ‘L’, 77: ‘M’, 78: ‘N’, 79:
‘O’, 80: ‘P’, 81: ‘Q’, 82: ‘R’, 83: ‘S’, 84: ‘T’, 85: ‘U’, 86: ‘V’, 87: ‘W’, 88:
‘X’, 89: ‘Y’, 90: ‘Z’, 91: ‘[‘, 92: ‘\\’, 93: ‘]’, 94: ‘^’, 95: ‘_’, 96: ‘`’, 97: ‘a’,
98: ‘b’, 99: ‘c’, 100: ‘d’, 101: ‘e’, 102: ‘f’, 103: ‘g’, 104: ‘h’, 105: ‘i’, 106:
‘j’, 107: ‘k’, 108: ‘l’, 109: ‘m’, 110: ‘n’, 111: ‘o’, 112: ‘p’, 113: ‘q’, 114: ‘r’,
115: ‘s’, 116:
‘t’, 117: ‘u’, 118: ‘v’, 119: ‘w’, 120: ‘x’, 121: ‘y’, 122: ‘z’, 123: ‘{‘, 124: ‘|’,
125: ‘}’, 126: ‘~’, 127: ‘\x7f’, 128: ‘\ x80’, 129: ‘\x81’, 130: ‘\x82’, 131:
‘\x83’, 132: ‘\x84’, 133: ‘\x85’, 134: ‘\x86’, 135: ‘\x87’, 136: ‘\x88’, 137:
‘\x89’, 138: ‘\x8a’, 139: ‘\x8b’, 140: ‘\x8c’, 141: ‘\x8d’, 142: ‘\x8e’, 143:
‘\x8f’, 144: ‘\x90’, 145: ‘\x91’, 146:
‘\x92’, 147: ‘\x93’, 148: ‘\x94’, 149: ‘\x95’, 150: ‘\x96’, 151: ‘\x97’, 152:
‘\x98’, 153: ‘\x99’, 154: ‘\x9a’, 155:
‘\x9b’, 156: ‘\x9c’, 157: ‘\x9d’, 158: ‘\x9e’, 159: ‘\x9f’, 160: ‘\xa0’, 161:
‘¡’, 162: ‘¢’, 163: ‘£’, 164: ‘¤’, 165: ‘¥’, 166: ‘¦’, 167: ‘§’, 168: ‘¨’, 169:
‘©’, 170: ‘ª’, 171: ‘«‘, 172: ‘¬’, 173: ‘\xad’, 174: ‘®’, 175: ‘¯’, 176: ‘°’,
177: ‘±’, 178: ‘²’, 179: ‘³’, 180: ‘´’, 181: ‘µ’, 182: ‘¶’, 183: ‘·’, 184: ‘¸’, 185:
‘¹’, 186: ‘º’, 187: ‘»’, 188: ‘¼’, 189: ‘½’, 190:
‘¾’, 191: ‘¿’, 192: ‘À’, 193: ‘Á’, 194: ‘Â’, 195: ‘Ã’, 196: ‘Ä’, 197: ‘Å’,
198: ‘Æ’, 199: ‘Ç’, 200: ‘È’, 201: ‘É’, 202:
‘Ê’, 203: ‘Ë’, 204: ‘Ì’, 205: ‘Í’, 206: ‘Î’, 207: ‘Ï’, 208: ‘Ð’, 209: ‘Ñ’, 210:
‘Ò’, 211: ‘Ó’, 212: ‘Ô’, 213: ‘Õ’, 214: ‘Ö’, 215: ‘×’, 216: ‘Ø’, 217: ‘Ù’,
218: ‘Ú’, 219: ‘Û’, 220: ‘Ü’, 221: ‘Ý’, 222: ‘Þ’, 223: ‘ß’, 224: ‘à’, 225: ‘á’,
226: ‘â’, 227: ‘ã’, 228: ‘ä’, 229: ‘å’, 230: ‘æ’, 231: ‘ç’, 232: ‘è’, 233: ‘é’,
234: ‘ê’, 235: ‘ë’, 236: ‘ì’, 237: ‘í’, 238: ‘î’, 239:
‘ï’, 240: ‘ð’, 241: ‘ñ’, 242: ‘ò’, 243: ‘ó’, 244: ‘ô’, 245: ‘õ’, 246: ‘ö’, 247:
‘÷’, 248: ‘ø’, 249: ‘ù’, 250: ‘ú’, 251: ‘û’, 252: ‘ü’, 253: ‘ý’, 254: ‘þ’, 255:
‘ÿ’}
import re
lastDate={1:31,2:28,3:31,4:30,5:31,6:30,7:31,8:31,9:30,
10:31,11:30,12:31}
def IncrementDay(date):
dd,mm,yy=map(int,re.split(‘[./-]’,date))
sep=date[2]
lastDate[2]=29
if dd<lastDate[mm]:
dd+=1
else:
dd=1
mm+=1
if mm==13:
mm=1
yy+=1
return(str(dd)+sep+str(mm)+sep+str(yy))
incDt=IncrementDay(dt)
Sample Output:
students={}
def addNew():
name=input(“Enter Name: ”)
students[name]={‘Python’:python,’OS’:os,’DBMS’:dbms}
def modify():
name=input(“Enter Name: ”)
students[name]={‘Python’:python,’OS’:os,’DBMS’:dbms}
def delete():
name=input(“Enter Name: “)
return
del students[name]
def findHighest():
maxP=0
maxO=0
maxD=0
if v[‘Python’]>maxP:
maxP=v[‘Python’]
if v[‘OS’]>maxO:
maxO=v[‘OS’]
if v[‘DBMS’]>maxD:
maxD=v[‘DBMS’]
def display():
print(“\t\tStudent Details”)
print(“---------------------------------------”)
print(“---------------------------------------”)
3d”%(k,v[‘Python’],v[‘OS’],v[‘DBMS’]))
print(“---------------------------------------”)
print()
while(True):
print(“==========================================”)
print(“1. Add new Student”)
print(“6. Exit”)
print(“==========================================”)
if choice==1 :
addNew()
elif choice==2 :
modify()
elif choice==3 :
delete()
elif choice==4 :
findHighest()
elif choice==5 :
display()
elif choice==6 :
print(“\nQuiting.......”)
break
else:
Choice”)
continue
Sample Output:
==========================================
6. Exit
==========================================
Enter your Choice : 1
6. Exit
==========================================
==========================================
==========================================
==========================================
6. Exit
==========================================
Student Details
Name
Python
OS
DBMS
Arijit Mitra
90
66
82
Gautam Dey
67
45
69
Subrata Saha
78
92
89
==========================================
6. Exit
==========================================
Highest marks in OS is = 92
==========================================
6. Exit
==========================================
Quiting.......
def check(var):
    if type(var)==list:
        print(var, "is a List")
    elif type(var)==tuple:
        print(var, "is a Tuple")
    elif type(var)==set:
        print(var, "is a Set")
myList=[1,2,3]
myTuple=(1,2,3)
mySet={1,2,3}
check(myList)
check(myTuple)
check(mySet)
Output:
[1, 2, 3] is a List
(1, 2, 3) is a Tuple
{1, 2, 3} is a Set
Program 4.13: Write a program to sort a list of tuples by the second item.
#second item
def usingLambda(myList):
myList.sort(key=lambda x:x[1])
def usingBubbleSort(myList):
n=len(myList)
for i in range(n-1):
for j in range(n-i-1):
if myList[j][1]>myList[j+1][1]:
myList[j],myList[j+1]=myList[j+1],myList[j]
list1=[(3,5),(7,9),(6,7)]
list2=[(3,9),(4,7),(2,8)]
usingLambda(list1)
usingBubbleSort(list2)
Output:
✓ Operations of list are similar to that of array, but arrays are homogeneous
collections and lists are heterogeneous collections.
✓ Arrays are static, i.e., fixed in size, but lists are dynamic, i.e., they can grow or shrink as and when required.
✓ Both list and tuple elements are accessed with index values.
✓ Both list and tuple support both positive and negative index values.
✓ Python allows all the basic mathematical operations on a set, like union,
intersection, difference and symmetric difference, with the in-built set data
structures.
✓ The membership operators are applicable for all the four above discussed
data structures.
✓ Lists, tuples and dictionaries can be nested but sets cannot be.
a) ( )
b) { }
c) [ ]
a) random.uniform( )
b) random.randint()
c) random.random()
list += ‘de’
print(list)
d) Error
4. What will be the output of the following Python code?
print(even)
d) Error
odd = [1, 3, 5]
print(odd1)
a) [1, 3, 5, 7, 9]
odd = [ 3]
odd1= odd * 4
print(odd1)
a) [12]
b) [3, 3, 3, 3]
a) 0
b) 20
c) 30
d) 40
8. Which function constructs a list from those elements of the list for which
a function returns True?
a) enumerate()
b) map()
c) reduce()
d) filter()
odd = [1, 3, 5]
odd.append([7,9])
print(odd)
a) [1, 3, 5, [7, 9]]
b) [1, 3, 5, 7, 9]
c) 1, 3, 5, 7, 9
d) Error
10. If my_list = [1, 2, 3, 4, 5, 6], what will be the content of my_list after
executing the statement, my_list.insert(-2,-2)
a) [1, -2, 2, 3, 4, 5, 6]
b) [1, 2, -2, 4, 5, 6]
c) [1, 2, 3, 4, -2, 5, 6]
11. Which of the following statements removes the first element from any list, say my_list?
a) my_list.remove(0)
b) my_list.del(0)
c) my_list.clear(0)
d) my_list.pop(0)
a) List
b) Tuple
c) Dictionary
d) Function cannot return multiple values
d) {stream1:”Bca”, stream2:”Mca”}
14. If d={1:4, 2:3, 3:2, 4:1}, then the statement print(d[d[2]]) will print
a) 2:3
b) 2
c) 3
d) Error
c) dict([[1, “A”],[2,”B”]])
d) { }
b) False
c) Equal
d) Error
mySet={1,3,5,7}
mySet.add(4)
mySet.add(5)
print(mySet)
a) {1, 3, 5, 7, 4, 5}
b) {1, 3, 4, 5, 7}
c) Error
Review Exercises
print(mylist[0])
print(mylist[1])
print(mylist[0][2])
print(mylist[1][2])
my_list = [1,2,3,4,5,6,7,8]
print(my_list[2:5])
print(my_list[:5])
print(my_list[5:])
print(my_list[:])
print(my_list[:-5])
print(my_list[-5:])
print(my_list[-2:-5:-1])
print(my_list[::-1])
9. What will be the output of the following code snippet?
print(even)
print(even)
print(even)
print(even)
11. What are the different methods in Python by which elements can be
added in a list?
my_list = [1,2,3,4]
my_list.insert(2,5)
print(my_list)
my_list.insert(0,7)
print(my_list)
my_list.insert(-2,9)
print(my_list)
13. How can we delete multiple items from a list? Explain with an example.
14. Explain different methods that are used to delete items from a list.
15. What is the difference between creating a new list using the ‘=’ operator
and the copy( ) method?
19. What do you mean by packing and unpacking with respect to tuples?
21. Is it possible to modify any tuple element? What happens when the
following statements are executed?
my_tuple[3][0] = 7
print(my_tuple)
31. ‘Dictionaries are not sequences, rather they are mappings.’ Explain.
6. Write a menu driven program to add, edit, delete and display operations
on a list.
7. Write a program to remove all duplicate elements from a list.
11. Suppose a list containing duplicate values. Write a function to find the
frequency of each number in the list.
12. Write a program to remove the elements from the first set that are also in the second set.
Chapter
Strings
In the previous two chapters we have studied about arrays and other in-built
data structures like lists, tuples, sets and dictionaries. In this chapter we will
discuss another in-built data type/data structure of Python, and that is string.
In programming languages, string handling is a very important task and in
data structure we will learn about different algorithms for efficient string
handling.
5.1 Introduction
So, 'India', "India" and '''India''' – all three are valid string constants in Python. Apart from a normal string, there are two other types of strings.
These are escape sequences and raw strings. Like C, C++ and Java
programming, Python also supports escape sequences. An escape sequence
is the combination of a backslash (\) and another character that together
represent a specific meaning escaping from their original meaning when
used individually.
Escape Characters
Represents
\b
Backspace
\f
Form feed
\n
New line
\r
Carriage return
Represents
\t
Horizontal tab
\’
\”
\\
Backslash
\ooo
Octal value
\xhh
Hex value
print(“Hello \nClass!!”)
Output:
Hello
Class!!
print(“Hello \tClass!!”)
Output:
Hello
Class!!
print(‘Cat\rM’)
Output:
Mat
print(‘Cat\b\bu’)
Output:
Cut
Output:
Welcome to the world of ‘Python’
print(“\”India\“ is great.”)
Output:
“India” is great.
print(‘Techno\nIndia’)
Output:
Techno
India
print(r‘Techno\nIndia’)
Output:
Techno\nIndia
print(‘Techno\\India’)
Output:
Techno\India
print(R‘Techno\\India’)
Output:
Techno\\India
The index value -1 represents the last character, -2 represents the last but
one, and so on.
There is a large set of methods and functions that can be used to manipulate
strings. In this section we briefly discuss all these string operations.
5.2.1 Slicing Operations on String
Like lists, slicing operation can also be done on strings. We can extract a
substring from a string using the slicing operation. The general format of
slicing is: String_name [ Start : End : Step ]
where Start is the starting index position from which the extraction
operation begins. The extraction operation continues up to the End – 1
index position and Step represents the incremented or decremented value
needed to calculate the next index position to extract characters. If Start is
omitted, the beginning of the string is considered, i.e., the default value of
Start is 0. If End is omitted, the end of the string is considered and the
default Step value is 1. If we use negative Step value, elements will be
extracted in the reverse direction. Thus, for negative Step value, the default
value of Start is -1 and that of End indicates the beginning of the string. The
following example illustrates these concepts: Example 5.3: Example to
show the slicing operation on strings.
string = ‘Python Programming’
print(“string[0] : ”, string[0])
print(“string[5] : ”, string[5])
print(“string[7:14] : ”, string[7:14])
print(“string[7:] : ”, string[7:])
print(“string[:6] : ”, string[:6])
print(“string[-1] : ”, string[-1])
print(“string[-3] : ”, string[-3])
print(“string[1:6:2] : ”, string[1:6:2])
print(“string[:6:2] : ”, string[:6:2])
print(“string[::2] : ”, string[::2])
print(“string[:6:-1] : ”, string[:6:-1])
print(“string[5::-1] : ”, string[5::-1])
print(“string[::-1] : ”, string[::-1])
print(“string[-2:-5:-1] : ”, string[-2:-5:-1])
Output:
string[0] : P
string[5] : n
string[7:14] : Program
string[7:] : Programming
string[:6] : Python
string[-1] : g
string[-3] : i
string[1:6:2] : yhn
string[:6:2] : Pto
string[:6:-1] : gnimmargorP
string[5::-1] : nohtyP
string[-2:-5:-1] : nim
The operator ‘+’ is used to concatenate two or more strings. The resultant of
this operation is a new string. The operator ‘*’ is used to repeat a string a
certain number of times. The order of the string and the number is not
important. But generally string comes first.
firstName = ‘Subrata’
lastName = ‘Saha’
Output:
------------------------
Hi !! Hi !! Hi !! Hi !!
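A minimal sketch of concatenation with + and repetition with *, with strings assumed so as to be consistent with the output above:

firstName = 'Subrata'
lastName = 'Saha'

fullName = firstName + ' ' + lastName   #concatenation produces a new string
print(fullName)                         #Subrata Saha

print('-' * 24)                         #a row of dashes
print('Hi !! ' * 4)                     #Hi !! Hi !! Hi !! Hi !!
print(4 * 'Hi !! ')                     #the order does not matter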
As strings are iterable like lists, we can iterate through a string very easily
using the for loop. Consider the following example:
myString = ‘Python’
for i in myString:
print(i, end=‘ ’)
Python
We can also iterate through a string using index values as follows, Example
5.6: Write a program to show how to iterate through a string using the index
value.
myString = ‘Python’
for i in range(len(myString)):
print(myString[i], end=‘ ’)
Output:
Python
We can also use the while loop to iterate through a string when using index
values.
myString = ‘Python’
i=0
print(myString[i], end=‘ ’)
i+=1
Output:
Python
Python’s str class provides several methods for properly handling strings.
Here we discuss them briefly.
upper( ): This method converts the entire string into upper case.
lower( ): This method converts the entire string into lower case.
capitalize( ): This method converts the first character of the string into
upper case.
casefold( ): This method converts the entire string into lower case. It is similar to lower() but more powerful, as it is able to convert more characters.
title( ): This method converts the first character of each word into upper
case.
swapcase( ): This method converts each character into its reverse case, i.e.,
upper case letter is converted into to lower case and lower case letter into
upper case.
fillchar fills the extra positions. If fillchar is not specified, extra positions
will be filled up with space.
fillchar fills the extra positions. If fillchar is not specified, extra positions
will be filled up with space.
fillchar fills the extra positions. If fillchar is not specified, extra positions
will be filled up with space.
zfill( width ): This method returns a right justified string of specified width
whose extra positions will be filled up with zeros. This method is used with
digits.
count( Sub [, Start [, End ]] ): This method counts the occurrence of the
substring Sub in the string. We can restrict the search area by mentioning
start and end.
replace( Old, New [, Count] ): This method replaces all occurrences of the substring Old with the substring New. Count restricts the maximum number of replacements.
join( Iterable ): This method joins each element of the iterable using as
delimiter the string for which it is invoked.
split( Sep, Maxsplit ): This method splits the string based on the separator
Sep and returns a list of substrings. If Sep is not specified, it splits on white
space. Maxsplit restricts the maximum number of splitting.
rsplit( Sep, Maxsplit ): This method is the same as split( ) but it starts
splitting from right.
splitlines( [Keepends] ): This method splits the string based on the newline
character and returns a list of substrings. If Keepends is True, the newline
characters will be included with the split substring. By default, Keepends is
False.
isalpha( ): This method checks whether all the characters in the string are letters of the alphabet. If so, it returns True; otherwise it returns False.
isalnum( ): This method checks whether all the characters in the string are
alphanumeric, i.e., either alphabet or numeric. If so, it returns True;
otherwise it returns False.
isdigit( ): This method checks whether all the characters in the string are
digits. If so, it returns True; otherwise it returns False.
isdecimal( ): This method checks whether all the characters in the string are
decimal numbers. If so, it returns True; otherwise it returns False.
isnumeric( ): This method checks whether all the characters in the string
are numeric digits. If so, it returns True; otherwise it returns False.
islower( ): If the string contains at least one letter of the alphabet and all the
letters are in lower case, this method returns True; otherwise it returns
False.
isupper( ): If the string contains at least one letter of the alphabet and all
the letters are in upper case, this method returns True; otherwise it returns
False.
isspace( ): This method checks whether all the characters in the string are
white spaces. If so, it returns True; otherwise it returns False.
istitle( ): If the string is in title case, this method returns True; otherwise it
returns False.
Apart from these methods Python string supports some built-in functions
also. Here are some of them:
len( ): This function returns the length of a string, i.e., number of characters
in the string.
sorted( ): This function returns a list containing the characters of the string
in ascending order.
The following example shows the use of string methods and functions.
Example 5.8: Write a program to show the use of string methods and
functions.
string=‘PYTHON programming’
print(“String : ”,string)
print(“string.capitalize(): ”, string.capitalize())
print(“string.casefold() : ”, string.casefold())
print(“string.upper() : ”, string.upper())
print(“string.lower() : ”, string.lower())
print(“string.title() : ”, string.title())
print(“string.swapcase() : ”, string.swapcase())
print()
string=‘Python’
print(“|String| : ”,‘|’+string+‘|’)
print(“|string.center(10)| : ”,‘|’+string.
center(10)+‘|’)
print(“|string.center(10,‘*’)|: ”,‘|’+string.
center(10,‘*’)+‘|’)
print(“|string.ljust(10)| : ”,‘|’+string.
ljust(10)+‘|’)
print(“|string.rjust(10,‘*’)| : ”,‘|’+string.
rjust(10,‘*’)+‘|’)
print(“|string.zfill(10)| : ”,‘|’+string.
zfill(10)+‘|’)
print(“string.startswith(‘Py’): ”, string.
startswith(‘Py’))
print(“string.endswith(‘on’) : ”, string.endswith(‘on’))
print()
string=‘ Python ’
print(“|String| : ”, ‘|’+string+‘|’)
print(“|string.lstrip()| : ”, ‘|’+string.lstrip()+‘|’)
print(“|string.strip()| : ”, ‘|’+string.strip()+‘|’)
string=‘Python\tProgramming’
print(“string.expandtabs() : ”, string.expandtabs())
print(“string.expandtabs(10): ”, string.expandtabs(10))
print()
string=‘Programming’
print(“String : ”,string)
print(“string.count(‘r’ ) : ”, string.count(‘r’))
print(“string.find(‘r’) : ”, string.find(‘r’))
print(“string.rfind(‘r’) : ”, string.rfind(‘r’))
print(“string.index(‘r’) : ”, string.index(‘r’))
print(“string.rindex(‘r’) : ”, string.rindex(‘r’))
print(“string.replace(‘r’,‘R’) : ”, string.
replace(‘r’,‘R’))
print(“string.replace(‘r’,‘R’,1): ”, string.
replace(‘r’,‘R’,1))
print(“string.join([‘1’,‘2’,‘3’]) : ”, string.
join([‘1’,‘2’,‘3’]))
print(“‘ ’.join([‘1’,‘2’,‘3’]) : ”, ‘
’.join([‘1’,‘2’,‘3’]))
print(“string.split(‘r’) : ”, string.split(‘r’))
print(“string.split(‘r’,1) : ”, string.split(‘r’,1))
print(“string.rsplit(‘r’) : ”, string.rsplit(‘r’))
print(“string.rsplit(‘r’,1): ”, string.rsplit(‘r’,1))
print()
string=‘World\nof\nPython’
print(“string.splitlines() : ”, string.splitlines())
print(“string.splitlines(True) : ”, string.
splitlines(True))
print()
string=‘1234’
print(“String : ”, string)
print(“string.isdecimal() : ”, string.isdecimal())
print(“string.isnumeric() : ”, string.isnumeric())
print()
string=‘Python’
print(“String : ”, string)
print(“string.isalpha() : ”, string.isalpha())
print(“string.isalnum() : ”, string.isalnum())
print(“string.isdigit() : ”, string.isdigit())
print(“string.islower() : ”, string.islower())
print(“string.isupper() : ”, string.isupper())
print(“string.isidentifier() : ”, string.isidentifier())
print(“string.isspace() : ”, string.isspace())
print(“string.istitle() : ”, string.istitle())
print(“len(string) : ”, len(string))
print(“list(enumerate(string)): ”,
list(enumerate(string)))
print(“sorted(string) : ”, sorted(string))
Output:
|String| : |Python|
|string.center(10)| : | Python |
|string.center(10,’*’)|: |**Python**|
|string.ljust(10)| : |Python |
|string.rjust(10,’*’)| : |****Python|
|string.zfill(10)| : |0000Python|
string.startswith(‘Py’): True
string.endswith(‘on’) : True
|String| : | Python |
|string.lstrip()| : |Python |
|string.rstrip()| : | Python|
|string.strip()| : |Python|
string.count(‘r’ ) : 2
string.find(‘r’) : 1
string.rfind(‘r’) : 4
string.index(‘r’) : 1
string.rindex(‘r’) : 4
string.replace(‘r’,’R’) : PRogRamming
string.replace(‘r’,’R’,1): PRogramming
string.join([‘1’,’2’,’3’]) : 1Programming2Programming3
‘ ‘.join([‘1’,’2’,’3’]) : 1 2 3
String : 1234
string.isdecimal() : True
string.isnumeric() : True
String : Python
string.isalpha() : True
string.isalnum() : True
string.isdigit() : False
string.islower() : False
string.isupper() : False
string.isidentifier() : True
string.isspace() : False
string.istitle() : True
len(string) : 6
list(enumerate(string)): [(0, ‘P’), (1, ‘y’), (2, ‘t’), (3, ‘h’), (4, ‘o’), (5, ‘n’)]
The Python string module also provides some useful string constants that
help us in handling string. These are shown in Table 5.2. Remember, to use
these constants first we need to import the string module.
Constant Name
Value
ascii_lowercase
‘abcdefghijklmnopqrstuvwxyz’
ascii_uppercase
‘ABCDEFGHIJKLMNOPQRSTUVWXYZ’
ascii_letters
‘ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz’
digits
‘0123456789’
hexdigits
‘0123456789abcdefABCDEF’
octdigits
‘01234567’
punctuation
!”#$%&‘()*+,-./:;<=>?@[\]^_`{|}~
whitespace
Python allows us to compare two strings using the relational operators just
like comparing numeric values. All the relational operators, i.e., >, <, >=,
<=, == and != or <>, are used to compare strings. At the time of
comparison, characters of corresponding index positions of two strings are
compared. First, characters of 0th index positions of two strings are
compared. If they are equal, characters of the next index positions are
compared. When a mismatch occurs, ASCII values of the mismatched
characters are compared. Suppose String1= “Anand” and String2 = “Anil”.
Here, the first two characters of both the strings are same. Mismatch occurs
at index position 2. At this position String1 contains ‘a’ and String2
contains ‘i’. As the ASCII value of ‘a’ is less than that of ‘i’, the condition
String1<String2
will be True. Some other cases have been shown in the following example:
Example 5.9: Write a program to show string comparison.
print(“‘Java’>‘Python’ : ”, ‘Java’>‘Python’)
print(“‘Java’<‘Python’ : ”, ‘Java’<‘Python’)
print(“‘Anil’>=‘Anand’ : ”, ‘Anil’>=‘Anand’)
print(“‘Python’<=‘PYTHON’ : ”, ‘Python’<=‘PYTHON’)
print(“‘Python’==‘Python’ : ”, ‘Python’==‘Python’)
print(“‘Python’!=‘python’ : ”, ‘Python’!=‘python’)
Output:
‘Java’>‘Python’ : False
‘Java’<‘Python’ : True
‘Anil’>=‘Anand’ : True
‘Python’<=‘PYTHON’ : False
‘Python’==‘Python’ : True
‘Python’!=‘python’ : True
Along with several methods for string handling, Python also provides a
unique feature called RegEx or Regular Expression or simply RE, which is
nothing but a sequence of characters and some meta-characters that forms a
search pattern. Instead of searching a particular word, regular expressions
help us to find a particular pattern in a text, code, file, log or spreadsheet,
i.e., in any type of documents. For example, the string find() method helps
us to search a particular email id; but if we want to find all the email ids in a
document, Regular Expression is the solution. In Python this feature is
available through the re module.
To avail this feature first we need to create a pattern. Within the pattern we
may use several meta-characters and these meta-characters are shown in
Table 5.3.
Character
Description
Example
[]
A set of characters
“[aeiou]”
“\d”
“.oo.”
Starts with
“^hello”
$
Ends with
“world$”
“bel*”
“hel?o”
“hel+o”
{n}
“al{2}”
{n,}
“al{2,}”
{n,m}
“al{2,4}”
to it
Either or
“mca|btech”
()
“(B|C|M)at”
In the table of meta-characters the first column represents the character set.
Table 5.4
Character Set
Description
[aeiou]
[a-e]
between a and e
[^aeiou]
[345]
[0-9]
[0-5][0-9]
[a-zA-Z]
Returns a match for any alphabet character in lower case OR upper case In
character set, +, *, ., |, (), $,{} has no special meaning. Hence, [+] returns a
match
[+]
Different special sequences and their uses have been shown in Table 5.5.
Character
Description
Example
\A
the string
\b
or at the end of a word (the “r” in the beginning is to indicate that the
r“aim\b”
\B
Returns a match where the specified characters are present, but NOT
r“\Baim”
at the beginning (or at the end) of a word (the “r” in the beginning is
r“aim\B”
\d
“\d”
\D
Returns a match where the string DOES NOT contain any digit
“\D”
\s
Returns a match where the string contains any white space character
“\s”
\S
Returns a match where the string DOES NOT contain any white
“\S”
space character
Description
Example
\w
“\w”
character)
\W
Returns a match where the string DOES NOT contain any word
“\W”
character
\Z
Returns a match if the specified characters are at the end of the string
“aim\Z”
Now we will discuss the different functions that are used to search a pattern.
where the first argument Pattern is to find the pattern in the String. The
values of the Flag variable are discussed in Table 5.6.
Flags
Description
Instead of only the end of the string, ‘$’ matches the end of each re.M or
re.MULTILINE
line and instead of only the beginning of the string, ‘^’ matches the
beginning of any line.
re.S or re.DOTALL
Makes the flag \w to match all characters that are considered letters in re.L
or re.LOCALE
re.X
import re
pattern = ‘\d{2}/\d{2}/\d{4}’
if match:
else:
Output:
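A self-contained sketch of re.match() with the same date pattern is shown below; the sample string is an assumption.

import re

string = '15/08/1947 is the Independence Day of India'    #assumed sample text
pattern = r'\d{2}/\d{2}/\d{4}'      #two digits / two digits / four digits

match = re.match(pattern, string)   #match() only looks at the beginning of the string
if match:
    print('Date found :', match.group())
else:
    print('No date found')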
Method
Description
group( )
Returns the string matched by any RE method
start( )
end( )
span()
Returns a tuple containing the (start, end) positions in the match string
Consider the following example which shows the use of methods of the
match object: Example 5.11: Write a program to show the use of methods
of the match object.
import re
pattern = ‘\d{2}/\d{2}/\d{4}’
if match:
else:
Starting Index : 0
End Index : 10
import re
pattern = ‘(\d{2})/(\d{2})/(\d{4})’
if match:
print(“Day : ”, match.group(1))
print(“Month : ”, match.group(2))
print(“Year : ”, match.group(3))
else:
Output:
Day : 15
Month : 08
Year : 1947
Starting Index : 0
End Index : 10
where the first argument Pattern is to find the pattern anywhere in the
String. The values of the Flag variable are discussed in Table 5.6. Consider
the following example: Example 5.13: Write a program to show the use of
search().
import re
23701 ”
# Finding a mobile number in the format, 5 digits followed
if match:
else:
Output:
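re.search() scans the whole string rather than only its beginning. The following is a small sketch; the sample text and the five-digit pattern are assumptions.

import re

string = 'My PIN code is 23701 and my roll number is 42'   #assumed sample text
pattern = r'\b\d{5}\b'              #a standalone group of exactly five digits

match = re.search(pattern, string)  #search() looks anywhere in the string
if match:
    print('Found', match.group(), 'at index', match.start())
else:
    print('No match found')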
findall( ): findall() returns a list containing all matches found in the given
string. The general format of findall() is:
where the first argument Pattern is to find all occurrences of the pattern in
the String. The values of the Flag variable are discussed in Table 5.6.
Consider the following example: Example 5.14: Write a program to show
the use of findall().
import re
15/08/1947.’
if result:
for i in result:
print(i)
else:
12/01/2022
15/08/1947
finditer( ): This function finds all substrings where the regular expression
matches and returns them as an iterator. If no match found, then also it
returns an iterator. The general format of finditer() is:
where the first argument Pattern is to find the all occurrence of the pattern
in the String.
The values of the Flag variable are discussed in Table 5.6. Consider the
following example: Example 5.15: Write a program to show the use of
finditer().
import re
string = ‘Today is 12/01/2022 - Birthday of Swami \
15/08/1947.’
pattern = ‘\d{2}/\d{2}/\d{4}’
for i in result:
Output:
split( ): This function splits the string into substrings based on the positions
where the pattern matches in the original string and returns these substrings
as a list. The general format of split() is:
where the first argument Pattern is to find all occurrences of the pattern in
the String to split the string. The values of the Flag variable are discussed
in Table 5.6. Consider the following example:
import re
pattern = ‘\s+’
result = re.split(pattern, string)
print(result)
Output:
where the first argument Pattern is to find and replace with Repl in the
String all or Max number of occurrences. Consider the following example:
Example 5.17: Write a program to show the use of sub() by replacing ‘/’ or
‘–’ with ‘.’
import re
15-08-1947.’
Output:
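A self-contained sketch of sub() replacing '/' or '-' with '.' is shown below; the sample text is an assumption.

import re

#assumed sample text containing dates written with '/' and with '-'
string = 'Republic Day: 26/01/1950, Independence Day: 15-08-1947.'

result = re.sub('[/-]', '.', string)   #replace every '/' or '-' with '.'
print(result)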
5.7.1 Brute Force Pattern Matching Algorithm
Solution:
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern: A B C
The algorithm starts the comparison from the beginning of the main string,
i.e., from the index position 0. Now the first two characters of both strings
are the same but a mismatch occurs with the third character. So, searching
again starts from index position 1 of the main string.
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern:   A B C
But mismatch occurs. Hence, we start searching again from index position
2.
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern:     A B C
Again a mismatch occurs and we start searching from index position 3.
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern:       A B C
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern:         A B C
Now, the first character of the pattern matches with the character of index
position 4. So, we proceed to compare the second character of the pattern
and the character of index position 5. They also match. So, we proceed
again to compare the third character of the pattern and the character of
index position 6. But this time mismatch occurs. Hence, we need to start
searching again and it will be from index position 5 of the main string.
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern:           A B C
Pattern:             A B C
Pattern:               A B C
Again mismatch occurs and we start searching from index position 6. Again
mismatch occurs and search start from index position 7. This time matching
occurs but the corresponding next characters are not the same. Thus, again
we start searching from index position 8.
Index:   0 1 2 3 4 5 6 7 8 9 10
String:  A B D E A B F A A B C
Pattern:                 A B C
This time all the three characters of the pattern match with the eighth, ninth,
and tenth characters of the main string correspondingly. Hence, the search
operation becomes True and it returns 8 as the index position from where
matching starts.
If the length of the string is n and the length of the pattern is m, in this
algorithm we need to search the pattern almost n times, and in the worst
case, mismatch will occur with the last character of the pattern. Hence, we
need to compare n x m times in the worst case.
So, we can say that the worst case time complexity is O( nm). However, in
the best case, by traversing the first m characters of the main string we can
get the match and, hence, the best case time complexity can be calculated as
O( m).
def Brute_Force(s,p):
    l1 = len(s)
    l2 = len(p)
    for i in range(l1-l2+1):
        flag = True
        for j in range(l2):
            if (s[i+j]!=p[j]):
                flag = False
                break
        if flag:
            return i
    return -1

# driver code (input prompts reconstructed)
string = input("Enter the main string: ")
pattern = input("Enter the pattern: ")
index = Brute_Force(string, pattern)
if index != -1:
    print("Pattern found at index", index)
else:
    print("Pattern not found")
Sample Output:
Enter the main string: ABDEABFAABC
Enter the pattern: ABC
Pattern found at index 8
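To see why the worst case is O(nm), the following rough sketch (not from the text, and separate from the Brute_Force function above) counts the character comparisons made on a deliberately repetitive input; the helper name and the test strings are chosen only for this illustration.
# Counting character comparisons of the brute force search on a worst-case input
def brute_force_count(s, p):
    comparisons = 0
    for i in range(len(s) - len(p) + 1):
        for j in range(len(p)):
            comparisons += 1
            if s[i+j] != p[j]:
                break
    return comparisons

# Almost every window matches the first characters before failing on the last one
print(brute_force_count('A'*20 + 'B', 'AAAB'))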
The main disadvantage of the basic or naïve type search algorithms is that
every time we have to compare each character of the pattern with the main
string. If there is some repetition of the same sequence of characters within
the pattern and that matches with the main string, the algorithm does not
bother about it. But the Knuth Morris Pratt (KMP) pattern matching
algorithm studies the pattern carefully and considers the repetitions, if any,
and hence avoids the back tracking.
Before starting the discussion of this algorithm, first we need to know some
keywords related to this algorithm. These are prefix, suffix and the π table.
To find the prefix, we start from the beginning of the string and take subsets
by considering one, two, three, … characters; similarly, a suffix is a subset of
characters taken from the end of the string. While matching, there
may be a situation where the searching operation fails but a prefix of
the pattern matches with the suffix of the already-matched portion of the original string.
the pattern may match with the suffix of the portion of the original string.
This helps us not to back track fully like the naïve approach; rather we may
now continue from this position even if there is some overlapping portion in
the main string. Next we need to find the π table or LPS
(Longest Prefix that is also a Suffix) table. From the LPS table, we are able
to find the index position of the repeating prefix. To construct the LPS
table, we start searching from the second character (i.e., index position 1)
and compare each and every character with the first character (i.e.,
character of the index position 0). If a match does not occur, we set the
value as 0. But if match occurs, we set the corresponding value as 1 and
check whether the next character also matches with the second character of
the string. If so, we set the value 2
for the character to indicate the longest prefix size as 2. This process
continues and we get the longest prefix in a word that is also repeated
within that word. Here are some examples of a few words and their
corresponding LPS table.
Example 5.19: Construct the LPS table for the following words.
i. CUCUMBER
ii. TOMATO
iii. TICTAC
Solution:
i.
Index:     0 1 2 3 4 5 6 7
String:    C U C U M B E R
LPS Table: 0 0 1 2 0 0 0 0
ii.
Index:     0 1 2 3 4 5
String:    T O M A T O
LPS Table: 0 0 0 0 1 2
iii.
Index:     0 1 2 3 4 5
String:    T I C T A C
LPS Table: 0 0 0 1 0 0
The Knuth Morris Pratt pattern matching algorithm works on the basis of
this LPS table.
In this algorithm, starting from the first position of the main string, each
character of the pattern string is compared with each character of the main
string. For both strings we need to take two different pointers/indicators to
point at the character of the individual string.
If the corresponding characters do not match, only the pointer of the main
string proceeds
to point at the next
character. But if the corresponding characters do match, both pointers
proceed to point at the next character of their corresponding strings. If a mismatch
occurs after matching has started, instead of backtracking over the entire
pattern, the pointer of the pattern string finds the corresponding value from the
LPS table and moves back only to that position so that the common prefix need not be
traversed again. From this position both the pointers start comparing again. If
the comparison fails again, the algorithm checks whether the pointer of the
pattern string is at the beginning of the string or not. If it is at beginning of
the string, pointer of the main string proceeds to point the next character of
the string as there is no chance of overlapping pattern; otherwise the pointer
of the pattern string moves back according to the corresponding value from
the LPS table. When all the characters match with the portion of the main
string, algorithm returns the starting index of the matched portion. But if the
pointer of the main string reaches the end of the string and the pattern is not
found, the algorithm returns -1. Following example illustrates this concept.
Example 5.20: Using the KMP pattern matching algorithm, find the pattern
‘ABABC’ in the string ‘ABABDEABABABC’.
Solution: First we have to create the LPS table for the pattern ‘ABABC’.
Index:     0 1 2 3 4
Pattern:   A B A B C
LPS Table: 0 0 1 2 0
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
Now, String[4] and Pattern[4] are not the same. So, we will check the LPS
table for j-1 and find LPSTable[3] is 2. Hence, j will be reset to 2.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
Now the search process starts again and fails at the immediate comparison as String[4]
and Pattern[2] are not the same. So, we again check the LPS table for j-1 and find
LPSTable[1] is 0. Hence, j will be reset to 0.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
Now the search process starts again and fails at immediate comparison as
String[4]
and Pattern[0] are not same. But now, value of j is 0. So, i will now be
incremented.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
Now, String[5] is ≠ Pattern[0] and j is also 0. Hence, i will be incremented
again.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
This time String[6] and Pattern[0] are equal. So, both i and j will be
incremented by 1. Hence, i becomes 7 and j becomes 1.
0 1 2 3 4 5 6 7 8 9 10 11 12
String :
A B A B D E A B A B A B C
Pattern: A B A B C
Again, String[7] and
Pattern[1] are equal. So, both i and j will be incremented by 1
and i becomes 8 and j becomes 2. This is also true for next two
comparisons. So, i becomes 10 and j becomes 4.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
But now, String[10] and Pattern[4] are not the same. So, we will check the LPS
table for j-1 and find LPSTable[3] is 2. Hence, j will be reset to 2.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
Now, String[10] and Pattern[2] are equal. So, both i and j will be
incremented by 1 and i becomes 11 and j becomes 3. This is also true for
the next comparison. So, i becomes 12 and j becomes 4.
0 1 2 3 4 5 6 7 8 9 10 11 12
String:
A B A B D E A B A B A B C
Pattern: A B A B C
As j has reached the end of the pattern and still matching continues, we can
now say that the pattern is found in the main string and it is at i-j, i.e., at
index position 8.
The main advantage of this algorithm is that we need not move back in the
main string.
This is clear from the previous example as well, where the value of i never
decreases or is reset to some lower value. It is always incremented and if
there is some overlapping portion in the main string we need not set the
pointer of the pattern to 0; we are continuing with the matched prefix of the
pattern and the suffix of the main string. Hence we get the solution in linear
time. If the size of the string is n and the size of the pattern is m, to
construct the LPS table O( m) running time is required and the KMP
algorithm requires O( n) time as generally m is very small in comparison to
n. Hence, the overall time complexity of the KMP algorithm is O( m+ n).
Here String and Pattern are the two strings. Pattern is to be searched within
String.
1. Set L1 = Length of String
2. Set L2 = Length of Pattern
3. Construct the LPSTable for Pattern
4. Set i = 0
5. Set j = 0
6. Repeat while i < L1:
   a. If String[i] == Pattern[j], then
      i. Set i = i + 1
      ii. Set j = j + 1
   b. Else
      i. If j != 0, then
         1. Set j = LPSTable[j-1]
      ii. Else,
         1. Set i = i + 1
   c. If j == L2, then
      i. Return i-j
7. Return -1
# Python implementation of the above algorithm
# Function to construct the LPS table for the pattern
def createLpsTable(pattern):
    l = len(pattern)
    i=1
    j=0
    lpsTable = [0]*l
    while i < l:
        if pattern[i] == pattern[j]:
            lpsTable[i] = j+1
            j+=1
            i+=1
        else:
            if j != 0:
                j=lpsTable[j-1]
            else:
                i+=1
    return lpsTable

# Function to search the pattern in the string using the LPS table
# (function name KMP_Search reconstructed)
def KMP_Search(string, pattern):
    l1 = len(string)
    l2 = len(pattern)
    lpsTable=createLpsTable(pattern)
    i=0
    j=0
    while i < l1:
        if string[i] == pattern[j]:
            i+=1
            j+=1
        else:
            if j != 0:
                j=lpsTable[j-1]
            else:
                i+=1
        if j == l2:
            return(i-j)
    return -1

# driver code (input prompts reconstructed)
string = input("Enter the main string: ")
pattern = input("Enter the pattern: ")
index = KMP_Search(string, pattern)
if index != -1:
    print("Pattern found at index", index)
else:
    print("Pattern not found")
Sample Output:
Enter the main string: ABABDEABABABC
Enter the pattern: ABABC
Pattern found at index 8
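As a quick check (not from the text) that the createLpsTable() function above reproduces the tables worked out in Example 5.19:
print(createLpsTable('CUCUMBER'))   # [0, 0, 1, 2, 0, 0, 0, 0]
print(createLpsTable('TOMATO'))     # [0, 0, 0, 0, 1, 2]
print(createLpsTable('TICTAC'))     # [0, 0, 0, 1, 0, 0]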
# Program to count the number of words in a
#string.
string = input("Enter any string: ")
words=string.split()
count = len(words)
print("No. of words =", count)
Sample Output:
No. of words = 4
# Program to count the number of
#characters in a string.
string = input("Enter any string: ")
c=0
for i in string:
    if i != ' ':
        c+=1
print("No. of characters =", c)
Sample Output:
#a string.
string = input("Enter any string: ")
words=string.split()
newStr = ' '.join(words)
print(newStr)
Sample Output:
# Program to count the number of vowels, consonants,
# digits, and special characters in a string.
import string
strn = input("Enter any string: ")
v=0
c=0
d=0
p=0
for i in strn:
    ch=i.upper()
    if ch in 'AEIOU':
        v+=1
    elif ch.isalpha():
        c+=1
    elif ch.isdigit():
        d+=1
    elif ch in string.punctuation:
        p+=1
print("No. of Vowels =", v)
print("No. of Consonants =", c)
print("No. of Digits =", d)
print("No. of Special Characters =", p)
Sample Output:
No. of Vowels = 6
No. of Consonants = 11
No. of Digits = 8
# Program to check whether a string is a
#palindrome or not.
string = input(“Enter any String: ”)
revString=string[::-1]
if string == revString:
print(“Palindrome”)
else:
print(“Not a Palindrome”)
Sample Output:
Palindrome
Not a Palindrome
# Program to sort a list of names in
#alphabetical order.
names=[]
n = int(input("How many names? "))
for i in range(n):
    name = input("Enter a name: ")
    names.append(name)
for i in range(n-1):
    for j in range(n-1-i):
        if names[j]>names[j+1]:
            temp = names[j]
            names[j]=names[j+1]
            names[j+1]=temp
for i in names:
    print(i)
Sample Output:
Aakash
Aanand
Bikram
Dipesh
Malini
Program 5.9: Write a program that will convert each character (only
letters) of a string into the next letter of the alphabet.
# Program to convert each letter of a string into the next letter of the
#alphabet.
def encrypt(string):
    encStr=''
    for i in string:
        if i.isalpha():
            if i=='z':
                s=97
            elif i=='Z':
                s=65
            else:
                s=ord(i)+1
            encStr+=chr(s)
        else:
            encStr+=i
    return encStr

string = input("Enter any string: ")
print("Encrypted string:", encrypt(string))
Sample Output:
# Program to print the abbreviated form of a
#name.
def abbreviation(name):
    nameList=name.title().split()
    l=len(nameList)
    abbr=''
    for i in range(l-1):
        abbr+=nameList[i][0]+'.'
    abbr+=nameList[l-1]
    return abbr

name = input("Enter any name: ")
abbrName=abbreviation(name)
print("Abbreviated name:", abbrName)
# Program to count the frequency of each word in
#a string.
def countFrequency(string):
    wordDict={}
    words=string.split()
    for w in words:
        wordDict[w]=wordDict.get(w,0)+1
    return wordDict

st = input("Enter any String: ")
countDict=countFrequency(st)
print("Frequency of the words in the String:")
for k, v in countDict.items():
    print(k, ':', v)
Sample Output:
Enter any String: the boy is the best boy in the class
Frequency of the words in the String:
boy : 2
the : 3
is : 1
in : 1
best : 1
class : 1
# Program to convert a binary number to its equivalent
#decimal number.
def bin2dec(binary):
    dec=0
    length=len(binary)
    p=0
    for i in range(length-1,-1,-1):
        dec+=int(binary[i])*2**p
        p+=1
    return dec

binary = input("Enter a binary number: ")
dec=bin2dec(binary)
print("Equivalent decimal number is:", dec)
Sample Output:
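As a small side note (not part of the program above), Python's built-in int() can perform the same conversion directly when the base is given as 2:
print(int('10110', 2))   # 22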
Strings at a Glance
✓ A positive index starts from 0 and is used to access a string from the
beginning.
✓ A negative index starts from -1 and is used to access a string from end.
✓ The operator ‘+’ is used to concatenate two or more strings whereas the
operator ‘*’ is used to repeat a string a certain number of times.
1. The
d) None of these
print(r‘Hello\nStudents!!’)
a) Syntax error
b) Hello\nStudents!!
c) Hello
Students!!
d) None of these
_ = ‘1 2 3 4 5 6’
print(_)
d) 1 2 3 4 5 6
a = ‘1 2 ’
print(a * 2, end=‘ ’)
print(a * 0, end=‘ ’)
print(a * -2)
a) 1 2 1 2 0 -1 -2 -1 -2
b) 1 2 1 2
c) 2 4 0 -2 -4
d) Error
word = “MAKAUT”
i=0
while i in word:
print(‘i’, end = “ ”)
a) no output
b) i i i i i i …
c) M A K A U T
d) Error
‘C’ in “PYTHON”
a) Error
b) 0
c) 4
d) False
i. st[::-1]
ii. st.reverse()
iii. “”.join(reversed(st))
a) i, ii
b) i, iii, iv
c) ii, iii, iv
print(string[5::-1])
a) n Programming
b) on Programming
c) gnimmargorP
d) nohtyP
def change(st):
st[0]=’X’
return st
string =“Python”
a) Xython
b) Python
c) XPython
d) None of these
string=‘Python’
while i in string:
print(i, end = “ ”)
a) P y t h o n
b) Python
c) Pyth o n
d) Error
i. s= ‘Python’
ii. s[2]=‘k’
iv.
del s
a) i, ii
b) i, iii, iv
c) i, iv
a) count()
b) len()
c) both a and b
d) None of these
a) title()
b) capitalize()
c) capitalfirst()
d) All of these
c) file.write(“Python Programming”)
d) Both a and c
Review Exercises
1. What is string?
10. Write down the Brute Force pattern matching algorithm. What is the
time complexity to find a pattern using this algorithm?
11. Using the Brute Force pattern matching algorithm, find the pattern ‘Thu’
in the string ‘This Is The Thukpa’.
12. How can the LPS table be created from any pattern?
i. ABCDEAB
ii. ABABCAD
iii. XYXYZ
14. Using the KMP pattern matching algorithm, find the pattern ‘cucumber’
in the string ‘cute cucumber’.
15. Using the KMP pattern matching algorithm, find the pattern ‘pitpit’ in
the string ‘picture of pipitpit’.
3. Write a program that will print a substring within a string. The program
will take the input of the length of the substring and the starting position in
the string.
5. Write a program that will delete a particular word from a line of text.
6. Write a program that will replace a particular word with another word in
a line of text.
9. Write a function to extract the day, month and year from any date.
Assume that dates are in
‘DDMMYYYY’ format but the separator may be ‘/’, ‘.’ or ‘-’.
10. Write a program to check whether a string contains any email address.
11. Write a program to extract all product codes from a string where the
product code is a 6-character string whose first 3 characters are letters in
upper case and last 3 characters are numeric.
12. Write a program to extract all 10-digit phone numbers with a prefix of
‘+91’.
13. Write a program to check whether a string contains the word ‘BCA’ or
‘MCA’.
15. Write a program to find all the words that end with the pattern ‘tion’ in a
string.
Chapter 6
Recursion
6.1 Definition
def factorial(num):
    if num == 0:
        return 1
    else:
        return num*factorial(num-1)
return (3 * factorial(2)) – this statement will execute. This will continue until num
becomes 0.
Then the return 1 statement will execute and return the control to its calling
point. Then each previous return statement will multiply its own value by the returned result.
In this way, factorial(4) will be calculated as 4*3*2*1*1, i.e. 24.
factorial(4)                          → 24
  return (4 * factorial(3))           → 4 * 6
    return (3 * factorial(2))         → 3 * 2
      return (2 * factorial(1))       → 2 * 1
        return (1 * factorial(0))     → 1 * 1
          return 1                    → 1
# Program to find the factorial of a
#positive integer
def factorial(num):
    if num == 0:
        return 1
    else:
        return num*factorial(num-1)

n = int(input("Enter a positive integer: "))
fact = factorial(n)
print("Factorial of", n, "is", fact)
2. Tail Recursion
3. Binary Recursion
4. Mutual Recursion
5. Nested Recursion
# recursive power function (name and recursive call reconstructed)
def power(a, b):
    if b == 0:
        return 1
    else:
        return a * power(a, b-1)
def FibNum(n):
    if n < 1:
        return -1
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return FibNum(n-1) + FibNum(n-2)

term = int(input("Which term? "))
fib = FibNum(term)
print("Term", term, "of the Fibonacci series is", fib)
In the above function, at a time two recursive functions are called; that is
why it is treated as binary recursion. As we know, the first two terms of
the Fibonacci series is fixed and these are 0 and 1; the terminating condition
is written as:
if n == 1:
return 0
elif n == 2:
return 1
Thus, when the value of n is 1 or 2, the function will return 0 and 1,
respectively. A zero or negative value of n indicates an invalid value, and thus the function
returns -1. The other terms of this series are generated by adding the two
predecessor values of this series. So, to find the nth term of this series, we
need to add (n-1)th and (n-2)th terms and the statement becomes: return
FibNum(n-1) + FibNum(n-2)
# Program to check whether a number is even or odd using mutual
#recursion
def iseven(n):
    if (n==0):
        return True
    else:
        return isodd(n-1)

def isodd(n):
    return not(iseven(n))

num = int(input("Enter any number: "))
if iseven(num):
    print(num, "is Even")
else:
    print(num, "is Odd")
In the above program both functions mutually call each other to check
whether a number is even or odd. In the main function, instead of iseven(),
isodd()can also be used.
# Program to illustrate nested recursion using the Ackermann function
def ackermann(m, n):
    if m == 0:
        return n+1
    elif n == 0:
        return ackermann(m-1,1)
    else:
        return ackermann(m-1,ackermann(m,n-1))

x = int(input("Enter m: "))
y = int(input("Enter n: "))
result = ackermann(x, y)
print("Result = ", result)
6.3 Recursion vs. Iteration
Both recursion and iteration do almost the same job. Both are used when
some repetition of tasks is required. But the way of implementation is
different. Now the question is which one is better or which one is better in
which situation. We know that iteration explicitly uses a repetition structure,
i.e. loop, whereas recursion achieves repetition through several function
calls. If we think about time complexity, recursion is not a better choice.
That is because we know there is a significant overhead always associated
with every function call such as jumping to the function, saving registers,
pushing arguments into the stack, etc., and in recursion these overheads are
multiplied with each function call and, therefore, it slows down the process.
But in iteration there is no such overhead. If we consider space complexity,
with every function call all the local variables will be declared again and
again, which consumes more and more space. But in case of iteration, a
single set of variables are declared and they store different updated values
for each iteration. Hence, the performance of iteration is much better than
recursion when we think about space complexity. So, what is the utility of
recursion? There are some situations where a complex problem can be
easily implemented using recursion, such as implementation of trees, a
sorting algorithm that follows the divide and conquer method, backtracking,
top down approach of dynamic programming, etc.
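The following small sketch (not from the text) contrasts the two approaches on the same task, computing a factorial, to make the difference concrete.
def fact_recursive(n):
    # repetition through function calls; every call adds a stack frame
    if n == 0:
        return 1
    return n * fact_recursive(n - 1)

def fact_iterative(n):
    # repetition through a loop; a single set of variables is reused
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result

print(fact_recursive(5), fact_iterative(5))   # 120 120
Both return the same value, but the recursive version creates one stack frame per call while the iterative version reuses the same two variables throughout.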
Though the puzzle can be played for n number of disks, here we are
discussing the solution of this problem for three disks. To mention the rods
and disks properly we are giving names of them. Initially all the disks are in
rod X; we have to shift all disks from X to rod Z; the intermediate rod is Y.
The disks are named 1, 2, and 3 in the increasing order of size.
# Program to solve the Tower of Hanoi
#problem
def move_disk(p1, p2, p3, n):
    if n<=0:
        print("Invalid Entry")
    elif n==1:
        print("Move disk from", p1, "to", p3)
    else:
        move_disk(p1,p3,p2,n-1)
        move_disk(p1,p2,p3,1)
        move_disk(p2,p1,p3,n-1)

num = int(input("Enter the number of disks: "))
move_disk('X','Y','Z',num)
Output:
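Assuming the reconstructed print format shown above, running the program with three disks would produce the standard seven moves:
Move disk from X to Z
Move disk from X to Y
Move disk from Z to Y
Move disk from X to Z
Move disk from Y to X
Move disk from Y to Z
Move disk from X to Z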
6.4.2 Eight Queen Problem
To find the solution to this puzzle we may use Brute Force algorithm which
exhaustively searches all the possibilities and hence takes more time. Here
we are trying for a better solution that uses recursion:
# Program to solve the Eight Queen problem
#using recursion.
import numpy as np

# Function to check whether it is possible to
#place a queen
def isPossible(board, x, y):
    l=len(board)
    # row checking
    for i in range(l):
        if board[x][i]==1:
            return False
    # column checking
    for i in range(l):
        if board[i][y]==1:
            return False
    for i in range(l):
        for j in range(l):
            if board[i][j]==1:
                #diagonal checking
                if abs(i-x)==abs(j-y):
                    return False
    return True
def getSolution(board):
    l=len(board)
    for i in range(l):
        for j in range(l):
            if board[i][j]==0:
                if isPossible(board,i,j):
                    board[i][j]=1
                    getSolution(board)
                    return board
                board[i][j]=0
    return board
N=8
board=np.zeros([N,N],dtype=int)
row=int(input("Row: "))
col=int(input("Column: "))
board[row, col]=1
board=board.tolist()
board=getSolution(board)
print(np.array(board))
Output:
Row: 2
Column: 3
[[1 0 0 0 0 0 0 0]
[0 0 0 0 0 0 1 0]
[0 0 0 1 0 0 0 0]
[0 0 0 0 0 1 0 0]
[0 0 0 0 0 0 0 1]
[0 1 0 0 0 0 0 0]
[0 0 0 0 1 0 0 0]
[0 0 1 0 0 0 0 0]]
6.5 Advantages and Disadvantages of Recursion
Advantages: The advantages of recursion are as follows:
1. Reduces the lines of code.
Disadvantages: The disadvantages of recursion are as follows:
In iteration the same set of variables are used. But in recursion, every time
the function is called, the complete set of variables are allocated again and
again. So, a significant space cost is associated with recursion.
Again there is also a significant time cost associated with recursion, due to
the overhead required to manage the stack and the relative slowness of
function calls.
= T(n/2) + T(1)
1. Identify the input size of the original problem and smaller sub-problems.
The first two points have already been discussed briefly. Now to solve any
recurrence relation we may follow the Recursion Tree Method or the Master
Theorem.
In the recursion tree method, we need to follow the steps as given here:
1. The recursion tree of the recurrence relation has to be drawn first.
2. Determine the number of levels of the tree and the size of the sub-problem at each level.
3. Calculate the cost of additional operations at each level, which is the sum
of the cost of all nodes of that level.
4. Next find the total cost by adding the cost of all levels.
Consider, for example, the recurrence relation of binary search:
T(n) = T(n/2) + 1
The recursion tree of this relation is a single chain of sub-problems:
T(n) -> T(n/2) -> T(n/2^2) -> T(n/2^3) -> ... -> T(n/2^k)
Size of the sub-problem at level 0 = n
Size of the sub-problem at level 1 = n/2
Size of the sub-problem at level 2 = n/2^2
…..
Size of the sub-problem at level k = n/2^k
Suppose the kth level is the last level and at this level the size of the sub-problem reduces to 1.
Thus, n/2^k = 1
Or, 2^k = n
Or, k = log2 n
Unrolling the recurrence level by level:
T(n) = T(n/2) + 1
     = T(n/2^2) + 2
     = T(n/2^3) + 3
     …..
     = T(n/2^k) + k
     = 1 + log2 n
[At the kth level, the size of the sub-problem becomes 1, so T(n/2^k) = T(1) = 1 and k = log2 n]
From the above equation, we can define the complexity of binary search as O(log n).
Master theorem is used in divide and conquer type algorithms where the
recurrence relation can easily be transformed to the form T(n) = aT(n/b) +
O(n^k), where a>=1 and b>1. Here a denotes the number of sub-problems,
each of size n/b. O(n^k) is the cost of dividing into sub-problems and
combining the results of all the components. There may be three cases:
1. If a > b^k, then T(n) = O(n^(log_b a))
2. If a = b^k, then T(n) = O(n^k log n)
3. If a < b^k, then T(n) = O(n^k)
Program 6.8: Write a program to find the sum of the digits of a number
using recursion.
# Program to find the sum of the
#digits of a number.
def sumOfDigits(num):
    if num == 0:
        return 0
    else:
        return (num%10)+sumOfDigits(num//10)

n = int(input("Enter any number: "))
sod = sumOfDigits(n)
print("Sum of the digits =", sod)
Output:
Program 6.9: Write a recursive function for GCD calculation.
# Recursive function to find the GCD of two
#integers.
def gcd(n1, n2):
    if n1%n2 == 0:
        return n2
    else:
        return gcd(n2, n1%n2)

num1 = int(input("Enter the first number: "))
num2 = int(input("Enter the second number: "))
print("GCD of", num1, "and", num2, "is", gcd(num1, num2))
Sample Output:
GCD of 15 and 25 is 5
def FibNum(n):
    if n < 1:
        return -1
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return FibNum(n-1) + FibNum(n-2)

def FibonacciSeries(num):
    if num > 0:
        fib = FibNum(num)
        FibonacciSeries(num-1)
        print(fib, end=' ')

term = int(input("How many terms? "))
FibonacciSeries(term)
Sample Output:
0 1 1 2 3 5 8 13 21 34
# Program to convert a decimal number to its
#binary equivalent
def d2b(num):
    if num == 0:
        return ''
    else:
        rem = num % 2
        return d2b(num//2) + str(rem)

dec = int(input("Enter a decimal number: "))
bin = d2b(dec)
print("Equivalent binary number is:", bin)
Sample Output:
Recursion at a Glance
✓ The advantage of recursion is that it not only reduces the lines of code
but also helps to implement complex algorithms in an easier way.
a) Linked list
b) Stack
c) Queue
d) Tree
d) None of these.
b) A nested function
d) None of these
6. Which of the following is true in case of recursion when it is compared
with iteration?
i=0
global i
i+=1
if(a<b):
return(f(b,a))
if (b==0):
return(a)
return(i)
a,b=11,12
print(f(a,b))
a) 11 12
b) 2
c) 11
d) 12
def sum(n):
if n>0:
return n+sum(n-1)
else:
return 0
a) Error
b) 9
c) 10
d) 15
def func1(n):
if n>0:
return n*func2(n-1)
else:
return 1
def func2(n):
if n>0:
return n/func1(n-1)
else:
return 1
print(func1(5))
a) 13.333
b) 33.333
c) 3.333
d) 0.333
def func(n):
if n>0:
return func(n//10)+1
else:
return 0
print(func(7007))
a) 0
b) 1
c) 2
d) 4
11. What will be the output of the following Python code?
def func1(n):
if n>0:
return n+func2(n//10)
else:
return 1
def func2(n):
if n>0:
return n-func1(n//10)
else:
return 1
print(func1(123))
a) 0
b) 123
c) 133
d) 134
Review Exercises
1. What do you mean by recursion? What are the advantages in its use?
def func(n):
if n>0:
if n%2:
return func(n//10)+1
else:
return func(n//10)
else:
return 0
n=int(input(“Enter N: ”))
print(“Result=”,func(n))
6. Find the time complexity of the above code.
2. Write a recursive function to find the sum of the first n natural numbers.
3. Write a recursive function to count the number of digits in an integer.
Chapter 7
Linked List
We have already learned that to store a set of elements we have two options:
array and Python’s in-built list. Both provide easy and direct access to the
individual elements. But when we handle a large set of elements there are
some disadvantages with them. First of all, the allocation of an array is
static. Thus we need to mention the size of the array before compilation and
this size cannot be changed throughout the program. On the other hand,
Python list may expand according to requirement but that does not come
without a cost. Adding new elements into a list may require allocations of
new block of memory into which the elements of the original list have to be
copied. Again, if a very large array is required to be defined, then the
program may face a problem of allocation. As an array is a contiguous
allocation, for very large sized arrays, there may be a situation where even
though the total free space is much larger than required, an array cannot be
defined only due to those free spaces not being contiguous. Another problem
related to arrays and lists is in insertion and deletion operations. If we want
to insert an item at the front or at any intermediate position, it cannot be
done very easily as the existing elements need to be shifted to make room.
The same problem is faced in case of deleting an item. The solution to all
these problems is to create a Linked List.
In a linked list we need not allocate memory for the entire list. Here only a
single node is allocated at a time. This allocation takes place at runtime and
as and when required. Thus, unlike an array, a large contiguous memory
chunk is not allocated; instead very small memory pieces are allocated.
When we need to store the first item, we have to allocate space for it only
and it will store the data at this place. Next when the second item needs to be
stored, we will again allocate space for the second item, and so on. As these
memory spaces are not contiguous, to keep track of the memory addresses,
along with the data part we have to store an extra piece of information with
each item: the address of the next item. Thus to store an item we
have to store two things. One is the data part of the item and the other is the
address part which stores the address of the next item. But Python does not
provide the scope of direct access to memory; thus we will store the
reference of the next
item. So, we need to
declare a class accordingly to implement it. In this chapter we discuss the
representation of a linked list and various operations that can be done on it.
7.1 Definition
A linked list is a linear collection of data items that are known as nodes.
Each node contains two parts – one is the data part that stores one or more
data values and the other is the reference part which keeps the reference of
the next node. The first node along with data values keeps the reference of
the second node; the second node keeps the reference of the third node, and
so on. And the last node contains None to indicate the end of the list. In this
way each node is linked with the others to form a list. Thus this type of list is
popularly known as linked list.
A linked list is a dynamic data structure. Hence the size of the linked list
may increase or decrease at run time efficiently. We can create or delete a
node as and when required.
Linear singly linked list is the simplest type of linked list. Here each node
points to the next node in the list and the last node contains None to indicate
the end of the list. The problem of this type of list is that once we move to
the next node, the previous node cannot be traversed again.
Circular singly linked list is similar to linear singly linked list except a
single difference.
Here also each node points to the next node in the list but the last node
instead of containing None points to the first node of the list. The advantage
of this list is that we can traverse circularly in this list. But the disadvantage
is that to return to the previous node we have to traverse the total list
circularly, which is time and effort consuming for large lists.
Two way or doubly linked list solves this problem. Here each node has two
reference parts. One points to the previous node and the other points to the
next node in the list.
Both the ends of the list contain None. The advantage of this list is that we
can traverse forward as well as backward according to our need. The
disadvantage is that we cannot move circularly, i.e. to move from the last
node to the first node we have to traverse the entire list.
Circular doubly linked list is the combination of doubly linked list and
circular linked list.
Here also each node has two reference parts. One points to the previous node
and the other points to the next node in the list. But instead of containing
None the first node points to last node and the last node points to the first
node of the list. The advantage of this list is that we can move in any
direction, i.e. forward, backward as well as circular. Figure 7.2
A linked list is a chain of elements or records called nodes. Each node has at
least two members, one of which holds the data and the other points to the
next node in the list. This is known as a single linked list because the nodes
of this type of list can only point to the next node in the list but not to the
previous. Thus, we can define the class to represent a node as following:
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link
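As a minimal sketch (not from the text), two objects of the Node class above can be linked by hand, relying on the constructor defaults shown:
first = Node(10)          # data part 10, next part None
second = Node(20)         # data part 20, next part None
first.next = second       # the first node now refers to the second node
print(first.data, '->', first.next.data)   # 10 -> 20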
Here the first member of the class represents the data part and the other
member, next, is a reference by which the node will point to the next node.
For simplicity the data part is taken as a single variable. But there may be
more than one item with same or different data types. Thus the general form
of a class to represent any node is as follows:
class Node :
    def __init__(self, Newdata1=None, Newdata2=None, Newdata3=None, ..., link=None):
        self.data1 = Newdata1
        self.data2 = Newdata2
        self.data3 = Newdata3
        ...
        ...
        self.next = link
Here are a few examples of node structure of singly linked list: Example
7.1: The following class can be used to create a list to maintain student
information.
class studentNode :
    def __init__( self, rl, nm, addr, tot, link ) :
        self.roll = rl
        self.name = nm
        self.address = addr
        self.total = tot
        self.next = link
Example 7.2: The following class can be used to create a list to maintain the
list of books in a library.
class bookNode :
def __init__( self,acc_no,name,athr,rate,link ) :
self.accession_no = acc_no
self.title = name
self.author = athr
self.price = rate
self.next = link
[Figure: a singly linked list whose three nodes contain 10, 20, and 30; Head refers to Node 1 and the reference part of the last node is None.]
In all the subsequent cases we use the following class definition for the node.
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link
7.5.1 Creating a Singly Linked List
1. Create a node.
2. Assign the reference of the node to a variable to make it the current node.
3. Take the input of data and assign it to the data part of the current node.
4. Take input for the option whether another node will be created or not.
5. If another node will be created,
   a. Create a new node.
   b. Assign the reference of the new node to the reference part of the current node.
   c. Assign the reference of the new node to a variable to make it the current node.
   d. Go to Step 3.
6. Else
   a. Assign None to the reference part of the current node to mark the end of the list.
def createList(self):
    self.Head = Node()
    cur = self.Head
    while True:
        cur.data = int(input("Enter Data : "))
        ch = input("Continue?(y/n): ")
        if ch.upper()=='Y':
            cur.next = Node()
            cur = cur.next
        else:
            cur.next = None
            # Last node storing None at its reference part
            break
Here first a node is created and inputted data is assigned to its data part. Next
a choice has been taken to decide whether another node will be created or
not. If we want to create another node, a new node needs to be created and
the next part of the current node will point to it otherwise None will be
assigned to the next part of the current node to make it the end node.
After creating a list we need to display it to see whether the values are stored
properly. To display the list we have to traverse the list up to the end. The
algorithm may be described as:
1. Start from the first node.
2. Print the data part of the current node.
3. Move to the next node.
4. If the end of the list is not reached, go to Step 2.
def display(self):
    curNode = self.Head
    while curNode is not None :
        print(curNode.data,end="->")
        curNode = curNode.next
    print("None")
In the above function the reference of the first node of the list, i.e. the
content of Head, has been stored within a variable, curNode. Then it starts to
print the data part of the current node and after printing each time it moves
to the next node until the end of the list is reached. As displaying the
linked list requires traversing the entire list starting from the first node, the
time complexity is O(n).
Here is a complete program that will create a linked list first and then display
it: Program 7.1: Write a program that will create a singly linked list. Also
display its content.
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class singleLinkedList :
    def __init__(self):
        self.Head = None

    def createList(self):
        self.Head = Node()
        cur = self.Head
        while True:
            cur.data = int(input("Enter Data : "))
            ch = input("Continue?(y/n): ")
            if ch.upper()=='Y':
                cur.next = Node()
                cur = cur.next
            else:
                cur.next = None
                break

    def display(self):
        curNode = self.Head
        while curNode is not None :
            print(curNode.data,end="->")
            curNode = curNode.next
        print("None")
head=singleLinkedList()
head.createList()
head.display()
Sample output:
Enter Data : 10
Continue?(y/n): y
Enter Data : 20
Continue?(y/n): y
Continue?(y/n): n
To insert a node in a linked list, a new node has to be created first and
assigned the new value to its data part. Next we have to find the position in
the list where this new node will be inserted and the reference parts of the
neighbouring nodes have to be updated accordingly. The insertion can take
place at the beginning of the list, at the end, or at any intermediate position.
As the element will be inserted at the beginning of the list, the content of the
variable Head will be updated by the new node. Another argument of this
function is the data part of the new node. We can omit it by taking the input
from the function. But sending it as argument is better as it is more
generalized, because input does not always come from keyboard; it may
come from another list or from file, etc.
The general algorithm to insert an element at the beginning of the list may be
defined as follows:
1. Create a new node.
2. Update its data part with the given data, which is passed as argument.
3. Update its reference part with the content of the Head variable.
4. Update the content of the Head variable with the reference of the new node.
class Node:
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class singleLinkedList :
    def __init__(self):
        self.Head = None

    def insert_begin(self,newData):
        self.Head=Node(newData,self.Head)
Figure 7.4 shows the position of pointers after inserting a node at the
beginning of a list.
Figure 7.4 Inserting a node at the beginning of a list
In the above function, a
new node will be created through the constructor. If the list is empty, the
reference part of the new node will store None as Head contains None for
empty lists. Otherwise it will store the reference of the node which was
previously the first node. So, the new node becomes the first node and to
point the first node of the list, Head will store the reference of the new node.
As inserting a new node at the beginning of a linked list does not require
traversing the list at all, but just updating some reference variables, the time
complexity of inserting a new node at the beginning of a linked list is O(1).
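A small usage sketch (not from the text) of the insert_begin() method defined above; the values chosen here are illustrative only. Each value is pushed in front of the previous ones.
mylist = singleLinkedList()
for value in (30, 20, 10):
    mylist.insert_begin(value)

# traverse and print the list: 10->20->30->None
node = mylist.Head
while node is not None:
    print(node.data, end='->')
    node = node.next
print('None')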
The general algorithm to insert an element at the end of a list may be defined as follows:
1. Create a new node.
2. Update its data part with the given data.
3. Update its reference part with None as it will be the last node.
4. If the list is empty, update the content of the Head variable with the new node.
5. Else
   a. Traverse the list to reach the last node.
   b. Update the reference part of the last node with the new node.
def insert_end(self,newData):
    newNode=Node(newData)
    if self.Head is None:
        # For Null list it will be the 1st Node
        self.Head=newNode
    else:
        curNode = self.Head
        while curNode.next is not None :
            curNode = curNode.next
        curNode.next=newNode
Figure 7.5 Inserting a node at the end of a list
The specified node can be of two types. A node can be specified with the
node number of the list, i.e. first node, second node,… nth node, etc. Another
way is to mention the data part, i.e. we need to match the supplied data with
the data of each node. Here we discuss the algorithm for the first case.
Another thing that we need to consider is what happens if the specified node
is not in the list. In this situation we can abort the insertion operation by
specifying that ‘the node does not exist’ or we can insert the new node at the
end. Here we are considering the second case. The following algorithm
describes how a new node can be inserted after the nth node.
1. Create a new node.
2. Update its data part with the given data.
3. If the list is empty, update the content of the Head variable with the new node.
4. Else
   a. Move to the nth node and make it the current node.
   b. If the specified node does not exist, stay at the last node and update the reference
   part of the last node with the new node.
   c. Update the reference part of the new node with the reference part of the
   nth node.
   d. Update the reference part of the nth node with the new node.
The following function shows how a node can be inserted after the nth node.
def insert_after_nth(self,newData,location):
    if self.Head is None :
        self.Head=Node(newData,self.Head)
    else:
        curNode = self.Head
        c=1
        while c < location and curNode.next is not None :
            c+=1
            curNode = curNode.next
        curNode.next=Node(newData,curNode.next)
Figure 7.6 Inserting a node after the second node
To find the time complexity of this operation we need to know how many
nodes to traverse or how many times we need to iterate. But here this
depends on the position which is supplied at run time. Thus, in the best case,
the position is 1 and we need not traverse at all.
In this case, time complexity is O(1). But in the worst case, the node will be
inserted at the end and we have to traverse the entire list. Then time
complexity will be O(n).
Here also, to mean a specified node we are considering the nth node. Thus
our task is to insert a node before the nth node. In this case the new node
may be inserted at the beginning of a list if is required to be inserted before
the first node, otherwise it will be inserted in some intermediate position.
But if the value of n is larger than the node count, the new node will be
inserted at the end.
The following algorithm describes how a new node can be inserted before the nth node:
1. Create a new node.
2. Update its data part with the given data.
3. If the list is empty or n is 1, insert the new node at the beginning of the list.
4. Else
   a. Move to the (n-1)th node and make it the current node.
   b. If the specified node does not exist, stay at the last node and update the
   reference part of the last node with the reference of the new node.
   c. Update the reference part of the new node with the reference part of the
   current node, i.e. (n-1)th node.
   d. Update the reference part of the current node with the new node.
The following function shows how a node can be inserted before the nth node.
def insert_before_nth(self,newData,location):
    if self.Head is None or location == 1:
        self.Head=Node(newData,self.Head)
    else:
        curNode = self.Head
        c=1
        while c < location-1 and curNode.next is not None :
            c+=1
            curNode = curNode.next
        curNode.next=Node(newData,curNode.next)
Here time complexity calculation is the same as in the previous and in the
best case the time complexity is O(1) and in the worst case it is O(n).
To delete a node from a linked list, first we have to find the node that is to be
deleted. Next we have to update the reference part of the predecessor and
successor nodes (if any) and finally de-allocate the memory spaces that are
occupied by the node. Like insertion, there are several cases for deletion too.
These may be :
• Deletion of the first node
• Deletion of the last node
• Deletion of the node whose data part matches with the given data, etc.
The general algorithm to delete the first node of the list may be defined as follows:
1. Update the content of the Head variable with the reference part of the
first node, i.e. the reference of the second node.
2. De-allocate the memory of the deleted node.
def delete_first(self):
    if self.Head is None:
        print("List is empty.")
    else:
        curNode = self.Head
        self.Head=self.Head.next
        del(curNode)
When we need to delete a node except the first node, first we have to move
to the predecessor node of the node that will be deleted. So, to delete the last
node we have to move to the previous node of the last node. Though our task
is to delete the last node, the list may contain only a single node. Then
deletion of this node makes the list empty and thus the content of the Head
variable will be updated with None.
The general algorithm to delete the last node of the list may be defined as
follows: 1. Check whether there is one or more nodes in the list.
2. If the list contains a single node only, update the content of Head variable
with None and de-allocate the memory of the node.
3. Else
   a. Traverse the list to reach the second last node and make it the current node.
   b. De-allocate the memory of the last node.
   c. Update the reference part of the current node with None as it will be the last node now.
The following function shows how the last node can be deleted:
def delete_last(self):
    if self.Head is None:
        print("List is empty.")
    elif self.Head.next is None:   # single node in the list
        del(self.Head)
        self.Head=None
    else:
        curNode = self.Head
        while curNode.next.next is not None :
            curNode = curNode.next
        del(curNode.next)
        curNode.next=None
If there are n number of nodes in a list, to delete the last node we need to
traverse n-1 nodes.
Thus to delete the last node of a list, the time complexity will be O(n).
In the first case, we have to move to the previous node of the nth node, i.e.
the (n-1)th node. Next update the reference part of the (n-1)th node with the
reference part of the deleted node and then de-allocate the memory of the nth
node.
But in the second case we have to compare each node with the argument
data. In this case, the first node, the last node, or any intermediate node may
be deleted. Again if the argument data does not match with any node, no
node will be deleted.
The following algorithm describes how a node whose data part matches with
the given data can be deleted:
1. Check the data of the first node with the given data.
2. If it matches, then
   a. Update the content of the Head variable with the reference part of the first
   node, i.e. the reference of the second node.
   b. De-allocate the memory of the first node.
3. Else
   a. Traverse the list comparing the data part of each node with the given data.
      i. Stay at the previous node of the node whose data part matches with the
      given data.
      ii. Update the reference part of the current node with the reference part of the
      next node (i.e. the node to be deleted).
      iii. De-allocate the memory of the next node (i.e. the node to be deleted).
The following function shows how the node whose data part matches with
the given data can be deleted:
# Function to delete the node that matches the
#given Data
def delete_anynode(self,num):
    if self.Head is None:
        print("List is empty.")
    else:
        curNode = self.Head
        if curNode.data==num: # For 1st Node
            self.Head=self.Head.next
            del(curNode)
        else:
            flag = 0
            while curNode is not None :
                if curNode.data == num:
                    flag = 1
                    break
                prev = curNode
                curNode = curNode.next
            if flag == 0:
                print("Data not found.")
            else:
                prev.next = curNode.next
                del(curNode)
Figure 7.10 Deletion of an intermediate node (whose data value is 20) from a list
The time complexity calculation of this operation depends on the
position of the node in the list. If we delete the first node, i.e. the best case,
the time complexity is O(1). But if we delete the last node or the node is not
found at all, that will be the worst case and we have to traverse the entire list.
Then the time complexity will be O(n).
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class singleLinkedList :
    def __init__(self):
        self.Head = None
    def insert_end(self,newData):
        newNode=Node(newData)
        if self.Head is None:
            self.Head=newNode
        else:
            curNode = self.Head
            while curNode.next is not None :
                curNode = curNode.next
            curNode.next=newNode
    def insert_begin(self,newData):
        self.Head=Node(newData,self.Head)
    def insert_before_nth(self,newData,location):
        if self.Head is None or location == 1:
            self.Head=Node(newData,self.Head)
        else:
            curNode = self.Head
            c=1
            while c < location-1 and curNode.next is not None :
                c+=1
                curNode = curNode.next
            curNode.next=Node(newData,curNode.next)
    def insert_after_nth(self,newData,location):
        if self.Head is None :
            self.Head=Node(newData,self.Head)
        else:
            curNode = self.Head
            c=1
            while c < location and curNode.next is not None :
                c+=1
                curNode = curNode.next
            curNode.next=Node(newData,curNode.next)
    def delete_first(self):
        if self.Head is None:
            print("List is empty.")
        else:
            curNode = self.Head
            self.Head=self.Head.next
            del(curNode)
    def delete_last(self):
        if self.Head is None:
            print("List is empty.")
        elif self.Head.next is None:   # single node in the list
            del(self.Head)
            self.Head=None
        else:
            curNode = self.Head
            while curNode.next.next is not None :
                curNode = curNode.next
            del(curNode.next)
            curNode.next=None
    # Function to delete the node that matches the
    # given Data
    def delete_anynode(self,num):
        if self.Head is None:
            print("List is empty.")
        else:
            curNode = self.Head
            if curNode.data==num:   # For 1st Node
                self.Head=self.Head.next
                del(curNode)
            else:
                flag=0
                while curNode is not None :
                    if curNode.data == num:
                        flag = 1
                        break
                    prev = curNode
                    curNode = curNode.next
                if flag == 0:
                    print("Data not found.")
                else:
                    prev.next = curNode.next
                    del(curNode)
    def display(self):
        if self.Head is None:
            print("Empty List.")
        else:
            curNode = self.Head
            while curNode is not None :
                print(curNode.data,end="->")
                curNode = curNode.next
            print("None")
head=singleLinkedList()
while True:
    # menu labels reconstructed
    print("=====================================")
    print("1.Create List / Insert at End")
    print("2.Insert at Beginning")
    print("3.Insert before a specified Node")
    print("4.Insert after a specified Node")
    print("5.Delete the First Node")
    print("6.Delete the Last Node")
    print("7.Delete the Node with given Data")
    print("8.Display the List")
    print("9.Exit")
    print("=====================================")
    choice=int(input("Enter your choice: "))
    if choice==1 :
        opt='Y'
        while opt.upper()=='Y':
            num=int(input("Enter Data: "))
            head.insert_end(num)
            opt=input("Continue?(y/n): ")
    elif choice==2 :
        num=int(input("Enter Data: "))
        head.insert_begin(num)
    elif choice==3 :
        num=int(input("Enter Data: "))
        loc=int(input("Enter the Position: "))
        head.insert_before_nth(num,loc)
    elif choice==4 :
        num=int(input("Enter Data: "))
        loc=int(input("Enter the Position: "))
        head.insert_after_nth(num,loc)
    elif choice==5 :
        head.delete_first()
    elif choice==6 :
        head.delete_last()
    elif choice==7 :
        num=int(input("Enter the Data to delete: "))
        head.delete_anynode(num)
    elif choice==8 :
        head.display()
    elif choice==9 :
        print("\nQuiting.......")
        break
    else:
        continue
The advantage of using a linked list is that the size of a polynomial may
grow or shrink as all the terms are not present always. Again if the size of
the polynomial is very large, then also it can fit into the memory very easily
as, instead of the total polynomial, each term will be allocated individually.
The general form of a polynomial of degree n is:
a(n)x^n + a(n-1)x^(n-1) + a(n-2)x^(n-2) + … + a(1)x + a(0)
To represent a polynomial, we need to store the coefficient and the power
of each term of the polynomial. Thus we can easily implement it using a
singly linked list whose each node will consist of three elements: coefficient,
power, and a link to the next term. So, the class definition will be:
class Node :
    def __init__(self,newCoef=None,newPower=None,link=None):
        self.coef = newCoef
        self.pwr = newPower
        self.next = link
# Function to insert a term at the end of the Polynomial
def insert_end(self,newCoef,newPower):
    newNode=Node(newCoef,newPower)
    if self.Head is None:
        self.Head=newNode
    else:
        curNode = self.Head
        while curNode.next is not None :
            curNode = curNode.next
        curNode.next=newNode
By repetitively calling the above function we can create a polynomial.
def create_poly(self):
    while True:
        cof=int(input("Enter Coefficient : "))
        pr=int(input("Enter Power : "))
        self.insert_end(cof,pr)
        ch=input("Continue?(y/n): ")
        if ch.upper()=='N':
            break
Now using the above functions we will write a complete program that will
create two polynomials, display them, add these two polynomials, and
finally display the resultant polynomial.
class Node:
    def __init__(self,newCoef=None,newPower=None,link=None):
        self.coef = newCoef
        self.pwr = newPower
        self.next = link
class polynomial:
    def __init__(self):
        self.Head = None

    # Function to insert a term at the end of the
    #Polynomial
    def insert_end(self,newCoef,newPower):
        newNode=Node(newCoef,newPower)
        if self.Head is None:
            self.Head=newNode
        else:
            curNode = self.Head
            while curNode.next is not None :
                curNode = curNode.next
            curNode.next=newNode
    # Function to create a Polynomial
    def create_poly(self):
        while True:
            cof=int(input("Enter Coefficient : "))
            pr=int(input("Enter Power : "))
            self.insert_end(cof,pr)
            ch=input("Continue?(y/n): ")
            if ch.upper()=='N':
                break
    def display(self):
        if self.Head is None:
            print("Empty List.")
        else:
            curNode = self.Head
            while curNode is not None :
                print(str(curNode.coef)+"x^"+str(curNode.pwr), end="+")
                curNode = curNode.next
            print("\b ")
# Function to add two polynomials (terms are assumed to be stored in
# decreasing order of power)
def add_poly(p1, p2):
    pol1=p1.Head
    pol2=p2.Head
    pol3=polynomial()
    while pol1 is not None and pol2 is not None:
        if pol1.pwr > pol2.pwr:
            pol3.insert_end(pol1.coef,pol1.pwr)
            pol1=pol1.next
        elif pol2.pwr > pol1.pwr:
            pol3.insert_end(pol2.coef,pol2.pwr)
            pol2=pol2.next
        else:
            pol3.insert_end(pol1.coef+pol2.coef,pol1.pwr)
            pol1=pol1.next
            pol2=pol2.next
    while pol1 is not None:
        pol3.insert_end(pol1.coef,pol1.pwr)
        pol1=pol1.next
    while pol2 is not None:
        pol3.insert_end(pol2.coef,pol2.pwr)
        pol2=pol2.next
    return pol3
poly1=polynomial()
poly2=polynomial()
poly1.create_poly()
poly2.create_poly()
poly3=add_poly(poly1, poly2)
poly1.display()
poly2.display()
poly3.display()
Sample output:
Enter Power : 4
Continue?(y/n): y
Enter Coefficient : 2
Enter Power : 3
Continue?(y/n): n
Enter Coefficient : 5
Enter Power : 3
Continue?(y/n): y
Enter Coefficient : 4
Enter Power : 1
Continue?(y/n): n
If the first polynomial contains m nodes and the second polynomial contains
n nodes, the time complexity to add the two polynomials would be O(m+n)
as we need to traverse both lists exactly once.
The advantage of this list is that we can traverse circularly in this list.
So, to implement a circular linked list we can use the same class that is used
to implement a singly linked list. Figure 7.12 illustrates the representation of
a circular singly linked list.
All the operations that can be done on a singly linked list can also be done
on a circular linked list. In this section we will discuss these operations on a
circular singly linked list, such as creating and traversing a list, inserting
elements in a list, deleting elements from a list, etc. As to implementing a
circular singly linked list, the same class is required. In all the following
cases we follow the following class definition for the node:
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link
1. Create a node.
2. Update the Head field with this node to make it the first node.
3. Assign the reference of the node to a variable to make it the current node.
4. Take the input of data and assign it to the data part of the current node.
5. Take input for the option whether another node will be created or not.
6. If another node will be created,
   a. Create a new node.
   b. Assign the reference of the new node to the reference part of the current node.
   c. Assign the reference of the new node to the current node to make it the current node.
   d. Go to Step 4.
7. Else
   a. Assign the reference of the first node to the reference part of the current node.
def createList(self):
    self.Head = Node()
    cur = self.Head
    while True:
        cur.data = int(input("Enter Data : "))
        ch = input("Continue?(y/n): ")
        if ch.upper()=='Y':
            cur.next = Node()
            cur = cur.next
        else:
            cur.next = self.Head
            break
In the above function, like in singly linked list creation, first a node is
created and inputted data is assigned to its data part. Next a choice has been
taken to decide whether another node will be created or not. If we want to
create another node, a new node will be created
and the next part of the current node will point to it, otherwise the reference
of the first node will be assigned to the next part of the current node to make
it circular.
After creating a list we need to display it to see whether the values are stored
properly. To display the list we have to traverse the list up to the end. The
algorithm may be described as:
1. Start from the first node.
2. Print the data part of the current node.
3. Move to the next node.
4. Print the data part of the current node.
5. If the control has not reached the first node of the list, go to Step 3.
def display(self):
    curNode = self.Head
    while curNode.next != self.Head:
        print(curNode.data,end="->")
        curNode = curNode.next
    print(curNode.data)
In the above function the reference of the first node of the list, i.e. the
content of Head, has been stored within a variable, curNode. Then it starts to
print the data part of the current node and after printing each time it moves
to the next node until it traverses back to the first node. Like a single linear
linked list, to display the circular linked list we need to traverse the entire
list. Thus, time complexity of this operation is to be calculated as O(n).
Here is a complete program that will create a circular linked list first and
then display it: Program 7.4: Write a program that will create a circular
linked list. Also display its content.
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class circularLinkedList :
    def __init__(self):
        self.Head = None

    def createList(self):
        self.Head = Node()
        cur = self.Head
        while True:
            cur.data = int(input("Enter Data : "))
            ch = input("Continue?(y/n): ")
            if ch.upper()=='Y':
                cur.next = Node()
                cur = cur.next
            else:
                cur.next = self.Head
                break
    def display(self):
        curNode = self.Head
        while curNode.next != self.Head:
            print(curNode.data,end="->")
            curNode = curNode.next
        print(curNode.data)
head=circularLinkedList()
head.createList()
head.display()
Sample output:
Enter Data : 10
Continue?(y/n): y
Enter Data : 20
Continue?(y/n): y
Continue?(y/n): n
Similar to a singly linked list, to insert a node in a circular linked list too, a
new node has to be created first and its data part has to be assigned the new
value. Next we have to find
the position in the list where this new node will be inserted. Then we have to
update the reference part of the predecessor and successor nodes (if any) and
also the new node. The insertion in the list can take place:
• at the beginning of the list
• at the end of the list
• before or after any specified node
As the element will be inserted at the beginning of the list, the content of the
Head variable will be updated by the new node and the new node will point
to the existing first node.
Finally, we need to move to the last node to update the reference part of the
last node by the new node.
The general algorithm to insert an element at the beginning of the circular
linked list may be defined as follows:
1. Create a new node.
2. Update its data part with the given data.
3. If the list is empty,
   a. Update the reference part of the new node with its own reference.
4. Else
   a. Traverse the list to reach the last node.
   b. Update the reference part of the last node with the new node.
   c. Update the reference part of the new node with the reference of the first node.
5. Update the content of the Head variable with the new node.
Figure 7.13 shows the position of pointers after inserting a node at the
beginning of a list.
Figure 7.13 Inserting a node at the beginning of a list
The following is the function of the above algorithm:
def insert_begin(self,newData):
newNode=Node(newData,self.Head)
curNode = self.Head
if self.Head is None:
newNode.next=newNode
else:
while curNode.next!=self.Head:
curNode=curNode.next
curNode.next=newNode
self.Head=newNode
In the above function, if the list is empty, the new node will point to itself.
So, the reference part of the new node will store its own reference.
Otherwise, the new node will point to the existing first node and the last
node will point to the new node. Thus we need to move to the last node first.
Then the reference of this last node will store the reference of the new node
to point to it and the reference part of the new node will store the reference
of the existing first node to become the first node. As the new node becomes
the first node, Head will now store the reference of the new node.
Though we are inserting the new node at the beginning of the list, we need to
reach the end node traversing all the intermediate nodes. Thus time
complexity to insert a node at the beginning of a circular linked list would be
O(n).
When the node is inserted at the end, we may face two situations. First, the
list may be empty and, second, there is an existing list. In the first case we
have to update the Head variable with the new node. But in the second case,
we need not update the Head variable as the new node will be appended at
the end. For that we need to move to the last node and then have to update
the reference part of the last node. To find the last node, we can check the
reference part of the nodes. The node whose reference part contains the
reference of the first node will be the last node.
The general algorithm to insert an element at the end of a circular list may be
defined as follows:
1. Create a new node.
2. Update its data part with the given data.
3. If the list is empty,
   a. Update the reference part of the new node with its own reference.
   b. Update the content of the Head variable with the reference of the new node.
4. Else
   a. Traverse the list to reach the last node.
   b. Update the reference part of the last node with the new node.
   c. Update the reference part of the new node with the first node.
Using the above algorithm we can write the following code:
def insert_end(self,newData):
newNode=Node(newData)
if self.Head is None:
self.Head=newNode
newNode.next=newNode
else:
curNode = self.Head
while curNode.next!=self.Head:
curNode = curNode.next
curNode.next=newNode
newNode.next=self.Head
Here also, to mean a specified node we are considering the nth node. Thus
our task is to insert a node after the nth node. This algorithm is also similar
to the singly linked list. Move to the nth node. If the specified node is not in
the list, stay at the last node. This node will point to the new node and the
new node will point to the next node.
The following algorithm describes how a new node can be inserted after the nth node:
1. Create a new node.
2. Update its data part with the given data.
3. If the list is empty,
   a. Update the reference part of the new node with its own reference.
   b. Update the content of the Head variable with the new node.
4. Else
   a. Move to the nth node and make it the current node.
   b. If the specified node does not exist, stay at the last node and make it the current node.
   c. Update the reference part of the new node with the reference part of the current node.
   d. Update the reference part of the current node with the new node.
The following function shows how a node can be inserted after the nth node:
def insert_after_nth(self,newData,location):
    newNode=Node(newData)
    if self.Head is None :
        self.Head=newNode
        newNode.next=newNode
    else:
        curNode = self.Head
        c=1
        while c < location and curNode.next != self.Head:
            c+=1
            curNode = curNode.next
        newNode.next=curNode.next
        curNode.next=newNode
Time complexity of this operation depends on the position where the node
will be inserted.
If the node is inserted after the first node, that is the best case as we need not traverse the list at all and the time complexity is O(1). But in the worst case the new node will be inserted at the end of the list. In that case we have to traverse all the nodes to reach the last node and the time complexity will be O(n).
Here our task is to insert a node before the nth node. In this case the new node may be inserted at the beginning of the list if it is required to be inserted before the first node; otherwise it will be inserted at some intermediate position. But if the value of n is larger than the node count, the new node will be inserted at the end.
The following algorithm describes how a new node can be inserted before the nth node:
1. Create a new node.
2. Check whether the list is empty.
3. If the list is empty
a. Update the reference part of the new node with its own reference.
b. Update the content of the Head variable with the new node.
4. Else, if n = 1
a. Move to the last node.
b. Update the reference part of the last node with the new node.
c. Update the reference part of the new node with the first node.
d. Update the content of the Head variable with the new node.
5. Else
a. Move to the previous node of the nth node, i.e. the (n-1)th node, and make it the current node.
b. If the specified node does not exist, stay at the last node and make it the current node.
c. Update the reference part of the new node with the reference part of the current node.
d. Update the reference part of the current node with the new node.
The following function shows how a node can be inserted before the nth
node:
def insert_before_nth(self,newData,location):
    newNode=Node(newData)
    if self.Head is None :
        self.Head=newNode
        newNode.next=newNode
    elif location==1:
        curNode = self.Head
        while curNode.next!=self.Head:
            curNode=curNode.next
        curNode.next=newNode
        newNode.next=self.Head
        self.Head=newNode
    else:
        curNode = self.Head
        c=1
        # move to the (n-1)th node, or stop at the last node if n is too large
        while c<=location-2 and curNode.next!=self.Head:
            c+=1
            curNode = curNode.next
        newNode.next=curNode.next
        curNode.next=newNode
Time complexity of this operation also depends on the position where the node will be inserted. As the node is inserted before a given node, if it is inserted before the first node we will need to update the reference part of the last node. Thus we need to move to the last node, which increases the time complexity to O(n). We get the best case when the node is inserted before the second node: in that case we need not update any node other than the first node and the new node, and the time complexity is O(1). If a node is inserted before the last node, the time complexity will again be O(n) as we need to traverse the entire list.
A node may be deleted from a circular linked list in different ways: deletion of the first node, deletion of the last node, or deletion of the node whose data part matches with the given data, etc.
In case of first node deletion, we have to update the content of the Head
variable as now it points to the second node as well as update the reference
part of the last node to point to the existing second node, which will be now
the first node. But if the list contains only a single node, the Head variable
will contain None.
The general algorithm to delete the first node of the list may be defined as follows:
1. If the list contains only a single node
a. Update the content of the Head variable with None.
b. De-allocate the memory of the node.
2. Else
a. Move to the last node.
b. Update the content of the Head variable with the reference part of the existing first node, i.e. with the second node.
c. De-allocate the memory of the existing first node.
d. Update the reference part of the last node with the content of the Head variable.
# Function to delete the first node from the list
def delete_first(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        if curNode.next==curNode:
            self.Head=None
            del(curNode)
        else:
            while curNode.next!=self.Head:
                curNode=curNode.next
            self.Head=self.Head.next
            del(curNode.next)
            curNode.next=self.Head
Figure 7.17 Deletion of the first node from a circular linked list

Though we are deleting the first node, we have to update the reference part of the last node. Thus, we have to traverse the entire list and the time complexity will be O(n).
As we know, to delete any node except the first node, first we have to move to the predecessor of the node that will be deleted. So, to delete the last node we have to move to the node previous to the last node. Though our task is to delete the last node, the list may contain only a single node. Then deletion of this node makes the list empty and thus the content of the Head variable will be updated with the None value.
The general algorithm to delete the last node of the list may be defined as follows:
1. Check whether there is one or more nodes in the list.
2. If the list contains only a single node, update the content of the Head variable with None and de-allocate the memory of the node.
3. Else
a. Move to the node previous to the last node and make it the current node.
b. De-allocate the memory of the last node.
c. Update the reference part of the current node with the first node as it will be the last node now.
Figure 7.18 Deletion of the last node from a circular linked list

The following function shows how the last node can be deleted from a circular linked list:
# Function to delete the last node from the list
def delete_last(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        if curNode.next==curNode:
            self.Head=None
            del(curNode)
        else:
            while curNode.next.next!=self.Head:
                curNode=curNode.next
            del(curNode.next)
            curNode.next=self.Head
The time complexity to delete the last node of a circular linked list is O(n) as
we need to traverse the entire list to reach the last node.
To delete a node whose data part matches a given value, we have to compare each node with the argument data. In this case, the first node, the last node, or any intermediate node may be deleted. Again, if the argument data does not match with any node, no node will be deleted.
The following algorithm describes how a node whose data part matches with the given data can be deleted:
1. Check the data of the first node against the given data.
2. If it matches, then
a. If the list contains only a single node, update the content of the Head variable with None and de-allocate the memory of the node.
b. Else
i. Move to the last node.
ii. Update the content of the Head variable with the reference part of the existing first node, i.e. with the second node.
iii. De-allocate the memory of the existing first node.
iv. Update the reference part of the last node with the content of the Head variable.
3. Else
a. Check the data of each node against the given data and store the reference of the current node in a variable, named 'previous', before moving to the next node.
b. If a matching node is found, then
i. Update the reference part of the previous node with the reference part of the current node (i.e. the node to be deleted).
ii. De-allocate the memory of the current node (i.e. the node to be deleted).
The following function shows how the node whose data part matches with
the given data can be deleted:
# Function to delete the node whose data part matches with the given Data
def delete_anynode(self,num):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        if curNode.data==num:                 # the first node matches
            if curNode.next==curNode:
                self.Head=None
                del(curNode)
            else:
                while curNode.next!=self.Head:
                    curNode=curNode.next
                self.Head=self.Head.next
                del(curNode.next)
                curNode.next=self.Head
        else:
            while curNode.next!=self.Head:
                if curNode.data == num:
                    break
                prev = curNode
                curNode = curNode.next
            if curNode.data!=num:
                print("Data not found...")
            else:
                prev.next = curNode.next
                del(curNode)
Figure 7.19 Deletion of an intermediate node (whose data value is 20) from a circular linked list

Time complexity of this operation also depends on the position from which the node is deleted. If the first node is deleted, we need to update the reference part of the last node. Thus we need to move to the last node, which increases the time complexity to O(n). The time complexity will be the same if the last node is deleted, as then also we need to traverse the entire list.
class Node :
    def __init__(self,Newdata=None,link=None):
        self.data = Newdata
        self.next = link

class circularLinkedList :
    def __init__(self):
        self.Head = None
def insert_end(self,newData):
newNode=Node(newData)
if self.Head is None:
self.Head=newNode
newNode.next=newNode
else:
curNode = self.Head
while curNode.next!=self.Head:
curNode = curNode.next
curNode.next=newNode
newNode.next=self.Head
def insert_begin(self,newData):
    newNode=Node(newData,self.Head)
    if self.Head is None:
        newNode.next=newNode
    else:
        curNode = self.Head
        while curNode.next!=self.Head:
            curNode=curNode.next
        curNode.next=newNode
    self.Head=newNode
def insert_before_nth(self,newData,location):
if self.Head is None :
self.Head=newNode
newNode.next=newNode
elif location==1:
curNode = self.Head
while curNode.next!=self.Head:
curNode=curNode.next
curNode.next=newNode
newNode.next=self.Head
self.Head=newNode
else:
curNode = self.Head
c=1
Head:
c+=1
curNode = curNode.next
newNode.next=curNode.next
curNode.next=newNode
def insert_after_nth(self,newData,location):
newNode=Node(newData)
if self.Head is None :
self.Head=newNode
newNode.next=newNode
else:
curNode = self.Head
c=1
Head:
c+=1
curNode = curNode.next
newNode.next=curNode.next
curNode.next=newNode
#list
def delete_first(self):
if self.Head is None:
else:
curNode = self.Head
if curNode.next==curNode:
self.Head=None
del(curNode)
else:
while curNode.next!=self.Head:
curNode=curNode.next
self.Head=self.Head.next
del(curNode.next)
curNode.next=self.Head
#list
def delete_last(self):
if self.Head is None:
else:
curNode = self.Head
if curNode.next==curNode:
self.Head=None
del(curNode)
else:
while curNode.next.next!=self.Head:
curNode=curNode.next
del(curNode.next)
curNode.next=self.Head
# given Data
def delete_anynode(self,num):
if self.Head is None:
else:
curNode = self.Head
if curNode.next==curNode:
self.Head=None
del(curNode)
else:
while curNode.next!=self.Head:
curNode=curNode.next
self.Head=self.Head.next
del(curNode.next)
while curNode.next!=self.Head:
if curNode.data == num:
break
prev = curNode
curNode = curNode.next
if curNode.data!=num:
else:
prev.next = curNode.next
del(curNode)
def display(self):
    if self.Head is None:
        print("Empty List.")
    else:
        curNode = self.Head
        while curNode.next!=self.Head:
            print(curNode.data,end="->")
            curNode = curNode.next
        print(curNode.data)
head=circularLinkedList()
while True:
print(“=====================================”)
Data”)
print(“=====================================”)
if choice==1 :
opt=‘Y’
while opt.upper()==‘Y’:
head.insert_end(num)
elif choice==2 :
head.insert_begin(num)
elif choice==3 :
head.insert_before_nth(num,loc)
elif choice==4 :
head.insert_after_nth(num,loc)
elif choice==5 :
head.delete_first()
elif choice==6 :
head.delete_last()
elif choice==7 :
”))
head.delete_anynode(num)
elif choice==8 :
head.display()
elif choice==9 :
print(“\nQuiting.......”)
break
else:
continue
As insertion in a queue is done at one end and deletion at the other end, to implement a queue using a singly linked list we need two pointers. But if we use a circular linked list, only one pointer is sufficient. (Details are given in Chapter 9.) Implementation of waiting lists and context switching in an operating system also uses queues.
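As a small preview of that idea (Chapter 9 discusses it in detail), the following sketch builds a queue on a circular list using a single reference to the rear node; the class and member names (CircularQueue, Tail, enqueue, dequeue) are illustrative only and are not taken from this chapter. It assumes the Node class defined earlier.

class CircularQueue:
    def __init__(self):
        self.Tail = None                    # Tail.next is always the front of the queue

    def enqueue(self, newData):             # insert at the rear in O(1) time
        newNode = Node(newData)
        if self.Tail is None:
            newNode.next = newNode          # a single node points to itself
        else:
            newNode.next = self.Tail.next   # new node points to the front
            self.Tail.next = newNode        # old rear points to the new node
        self.Tail = newNode                 # the new node becomes the rear

    def dequeue(self):                      # delete from the front in O(1) time
        if self.Tail is None:
            print("Queue is empty")
            return None
        front = self.Tail.next
        if front is self.Tail:              # only one node in the queue
            self.Tail = None
        else:
            self.Tail.next = front.next     # unlink the front node
        return front.data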
class Node :
    def __init__(self,Newdata=None,link=None):
        self.data = Newdata
        self.next = link

class circularLinkedList :
    def __init__(self):
        self.Head = None
    def createList(self):
        self.Head = Node()
        cur = self.Head
        while True:
            cur.data=input("Enter Name : ")
            ch = input("Continue?(y/n): ")
            if ch.upper()=='Y':
                cur.next = Node()
                cur = cur.next
            else:
                cur.next = self.Head
                break
Jlist=circularLinkedList()
Jlist.createList()
ptr=Jlist.Head
n=int(input("Enter any number: "))
while ptr.next!=ptr:
    for i in range(1,n):
        prev=ptr
        ptr=ptr.next
    prev.next=ptr.next   # remove every nth person from the circle
    ptr=prev.next
print(ptr.data,"will be freed")
Sample output:
Continue?(y/n): y
Continue?(y/n): y
Continue?(y/n): y
Continue?(y/n): n
The disadvantage of a doubly linked list is that it requires extra space. So, it needs to be used judiciously, in situations where traversing in both directions is frequently required.
So, to implement a doubly linked list we have to declare a node with at least
three members, one of which holds the data and the other two point to the
previous and next nodes respectively in the list. Thus we can define the class
to represent the node of a doubly linked list as follows:
class DNode :
    def __init__(self,Newdata=None,plink=None,nlink=None):
        self.previous = plink
        self.data = Newdata
        self.next = nlink
Figure 7.20 Representation of a doubly linked list

In Figure 7.20 the list consists of three nodes with data values 10, 20, and 30. Node3 is the last node of this list; thus the next part of Node3 contains None and its previous part contains the reference of Node2.
All the operations that can be done on a singly linked list or on a circular
linked list can also be done on a doubly linked list. To reduce the
monotonousness of the discussion, here we implement some of these. In all
the subsequent cases we use the following class definition for the node.
class DNode :
    def __init__(self,Newdata=None,plink=None,nlink=None):
        self.previous = plink
        self.data = Newdata
        self.next = nlink
The previous part of the existing first node will contain the reference of the new node.
The general algorithm to insert an element at the beginning of a doubly linked list may be defined as follows:
1. Create a new node.
2. Update its data part with the given data.
3. Update its previous part with None as it will be the first node.
4. Update its next part with the content of the Head variable.
5. Update the previous part of the existing first node, if any, with the new node.
6. Update the content of the Head variable with the new node.
Figure 7.21 shows the position of pointers after inserting a node at the
beginning of a doubly linked list.
Figure 7.21 Inserting a node at the beginning of a doubly linked list

The following function implements the above algorithm:
def insert_begin(self,newData):
    newNode=DNode(newData,None,self.Head)
    if self.Head is not None:
        self.Head.previous=newNode
    self.Head=newNode
In the above function, if
the list is empty, both the previous and next parts of the new node will store
None as it is the only node in the list. Otherwise the previous part of the new
node will contain None and the next part will store the reference of the
existing first node.
So, the new node becomes the first node, and to point to the first node of the
list, Head will store the reference of the new node. The previous first node, if
it exists, now becomes the second node; so its previous part will point to the
new node.
The general algorithm to insert an element at the end of a list may be defined as follows:
1. Create a new node.
2. Update its data part with the given data.
3. Update its next part with None as it will be the last node.
4. If the list is empty
a. Update the previous part of the new node with None.
b. Update the content of the Head variable with the new node.
5. Else
a. Move to the last node.
b. Update the previous part of the new node with the existing last node.
c. Update the next part of the existing last node with the new node.
Figure 7.22 shows the position of pointers after inserting a node at the end of
a doubly linked list.
def insert_end(self,newData):
    newNode=DNode(newData)
    if self.Head is None:
        self.Head=newNode
    else:
        curNode = self.Head
        while curNode.next is not None:
            curNode = curNode.next
        curNode.next=newNode
        newNode.previous=curNode
As the new node will be inserted at the end, its next part should be initialized
with None first. If the list is empty initially, the previous part of the new
node will also store None.
Otherwise, move to the last node by checking whether the next part is None.
Update the previous part of the new node with the existing last node as it
will also point to the last node and update the next part of the existing last
node with the new node.
The following algorithm describes how a new node can be inserted at the nth position:
1. Create a new node.
2. Check whether the list is empty or the node is to be inserted at the first position.
3. If the list is empty or n = 1
a. Update the previous part of the new node with None.
b. Update the next part of the new node with the content of the Head variable.
c. Update the previous part of the existing first node, if any, with the reference of the new node.
d. Update the content of the Head variable with the reference of the new node.
4. Else
a. Move to the previous node of the nth node and make it the current node.
b. If the specified node does not exist, stay at the last node and make it the current node.
c. Update the next part of the new node with the next part of the current node.
d. Update the previous part of the new node with the current node.
e. Update the next part of the current node with the new node.
f. Update the previous part of the node next to the current node, if any, with the new node.
# Function to insert a node before the Nth node
def insert_before_nth(self,newData,location):
    newNode=DNode(newData)
    if self.Head is None or location==1:
        newNode.next=self.Head
        if self.Head is not None:
            self.Head.previous=newNode
        self.Head=newNode
    else:
        curNode = self.Head
        c=1
        while c<=location-2 and curNode.next is not None :
            c+=1
            curNode = curNode.next
        newNode.next=curNode.next
        newNode.previous=curNode
        if curNode.next!=None:
            curNode.next.previous=newNode
        curNode.next=newNode
Figure 7.23 shows the position of nodes and their references after inserting a
node at the third position in a doubly linked list.
In this algorithm as the new node is inserted at the nth position the time
complexity calculation depends on the value of n. When n = 1, we are
inserting at the first position.
Thus we need not traverse any node except the first one. So, the time
complexity is O(1).
This is the best case. But if n denotes the last node, we have to traverse all
the nodes. Then the time complexity would be O(n).
Figure 7.23 Inserting a node at the third position in a doubly linked list
7.11.4 Deleting a Node from a Doubly Linked List
Deletion of a node from a doubly linked list is also similar to that from a singly linked list. We only have to take extra care of the previous part of the next node, if any. Here also we first have to find the node that is to be deleted. Next we have to update the previous and next parts of the predecessor and/or successor nodes, and finally we need to de-allocate the memory space occupied by the node. The different cases discussed here are deletion of the first node, deletion of the last node, and deletion of the nth node.
In case of first node deletion, here also we have to update the content of the
Head variable as now it will point to the second node, and the previous part
of the second node, if any exists, will point to None as it becomes the first
node. But if the list contains only a single node, the Head variable will
contain None. The time complexity of deletion of the first node is O(1).
The general algorithm to delete the first node of the list may be defined as follows:
1. Update the content of the Head variable with the reference part of the first node, i.e. with the second node.
2. If the second node exists, update its previous part with None as it becomes the first node.
3. De-allocate the memory of the deleted node.
Figure 7.24 Deletion of the first node from a doubly linked list

Here is a function to delete the first node of a doubly linked list:
def delete_first(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        self.Head=self.Head.next
        if self.Head is not None:
            self.Head.previous=None
        del(curNode)
As a doubly linked list contains the references of both previous and next nodes, to delete the last node we can move either to the last node or to its previous node; from either position we can delete the last node. Though our task is to delete the last node, the list may contain only a single node. Then deletion of this node makes the list empty and thus the content of the Head variable will be updated with the None value. The time complexity of the deletion of the last node is O(n).
The general algorithm to delete the last node of the list may be defined as follows:
1. Check whether there is one or more nodes in the list.
2. If the list contains only a single node, update the content of the Head variable with None and de-allocate the memory of the node.
3. Else
a. Move to the node previous to the last node and make it the current node.
b. De-allocate the memory of the last node.
c. Update the next part of the current node with None as it will be the last node now.
def delete_last(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    elif self.Head.next is None:
        del(self.Head)
        self.Head=None
    else:
        curNode = self.Head
        while curNode.next.next is not None:
            curNode = curNode.next
        del(curNode.next)
        curNode.next=None
Figure 7.25 Deletion of the last node from a doubly linked list
In the case of singly linked and circular linked lists we discussed the deletion of a node whose data part matches with the given data. In the case of a doubly linked list the same algorithm can be followed, with the necessary extra pointer adjustment for the previous part. Here we will discuss the deletion of the nth node as an example of intermediate node deletion. To delete the nth node, we have to consider two cases: it may be the first node or any other node. If it is the first node, the operations are similar to the deletion of the first node. In the second case we may move either to the nth node or to its previous node; from both positions we can delete the specified node.
The general algorithm to delete the nth node of the list may be defined as follows:
1. If n = 1, then
a. Update the content of the Head variable with the reference part of the first node, i.e. with the second node.
b. If the second node exists, update its previous part with None as it will be the first node now.
c. De-allocate the memory of the deleted node.
2. Else
a. If the specified node does not exist, deletion is not possible.
b. Otherwise, move to the (n-1)th node and make it the current node.
c. Update the reference part of the current node with the reference part of the next node (i.e. the nth node).
d. If the reference part of the nth node is not None, i.e. if the (n+1)th node exists, update the previous part of the (n+1)th node with the current node.
e. De-allocate the memory of the next node (i.e. the nth node).
Figure 7.26 Deletion of the second node from a doubly linked list

Here is a function to delete the nth node of a doubly linked list:
def delete_nth(self,posn):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    elif posn==1:
        curNode = self.Head
        self.Head=curNode.next
        del(curNode)
        if self.Head is not None:
            self.Head.previous=None
        print("Node Deleted Successfully...")
    else:
        curNode = self.Head
        c=1
        while c<=posn-2 and curNode.next is not None :
            c+=1
            curNode = curNode.next
        if curNode.next is None:
            print("Node does not exist...")
        else:
            temp=curNode.next
            curNode.next=temp.next
            if temp.next is not None:
                temp.next.previous=curNode
            del(temp)
            print("Node Deleted Successfully...")
class DNode :
    def __init__(self,Newdata=None,plink=None,nlink=None):
        self.data = Newdata
        self.previous = plink
        self.next = nlink

class doublyLinkedList :
    def __init__(self):
        self.Head = None
def insert_end(self,newData):
newNode=DNode(newData)
if self.Head is None:
self.Head=newNode
else:
curNode = self.Head
curNode = curNode.next
curNode.next=newNode
newNode.previous=curNode
def insert_begin(self,newData):
newNode=DNode(newData,None,self.Head)
self.Head.previous=newNode
self.Head=newNode
# Nth node
def insert_before_nth(self,newData,location):
newNode=DNode(newData)
self.Head.previous=newNode
newNode.next=self.Head
self.Head=newNode
else:
curNode = self.Head
c=1
while c<=location-2 and curNode.next is not
None :
c+=1
curNode = curNode.next
newNode.next=curNode.next
newNode.previous=curNode
if curNode.next!=None:
curNode.next.previous=newNode
curNode.next=newNode
def insert_after_nth(self,newData,location):
newNode=DNode(newData)
if self.Head is None:
self.Head=newNode
else:
curNode = self.Head
c=1
None :
c+=1
curNode = curNode.next
newNode.next=curNode.next
newNode.previous=curNode
if curNode.next!=None:
curNode.next.previous=newNode
curNode.next=newNode
if self.Head is None:
else:
curNode = self.Head
self.Head=self.Head.next
self.Head.previous=None
del(curNode)
def delete_last(self):
if self.Head is None:
print(“Empty List. Deletion not possible...”)
del(self.Head)
self.Head=None
else:
curNode = self.Head
curNode = curNode.next
del(curNode.next)
curNode.next=None
def delete_nth(self,posn):
if self.Head is None:
else:
curNode = self.Head
self.Head=curNode.next
del(curNode)
if self.Head is not None:
self.Head.previous=None
c=1
None :
c+=1
curNode = curNode.next
if curNode.next is None:
else:
temp=curNode.next
curNode.next=temp.next
temp.next.previous=curNode
del(temp)
def display(self):
if self.Head is None:
print(“Empty List.”)
else:
curNode = self.Head
print(“None<=>”,end=“”)
print(curNode.data,end=“<=>”)
curNode = curNode.next
print(“None”)
def rev_display(self):
if self.Head is None:
print(“Empty List.”)
else:
curNode = self.Head
print(“None<=>”,end=“”)
curNode = curNode.next
curNode = curNode.previous
print(“None”)
head=doublyLinkedList()
while True:
print(“=====================================”)
print(“10.Exit”)
print(“=====================================”)
if choice==1 :
opt=‘Y’
while opt.upper()==‘Y’:
head.insert_end(num)
elif choice==2 :
head.insert_begin(num)
elif choice==3 :
head.insert_before_nth(num,loc)
elif choice==4 :
head.insert_after_nth(num,loc)
elif choice==5 :
head.delete_first()
elif choice==6 :
head.delete_last()
elif choice==7 :
Delete:”))
head.delete_nth(num)
elif choice==8 :
head.display()
elif choice==9 :
head.rev_display()
elif choice==10:
print(“\nQuiting.......”)
break
else:
Choice”)
continue
To implement a circular doubly linked list we can use the same class that is used to implement a doubly linked list; whatever changes are needed are made programmatically. Figure 7.27 illustrates the representation of a circular doubly linked list.
In Figure 7.27, the linked list consists of three nodes. The data values of the
nodes are 10, 20, and 30 respectively. Node1 is the first node in the list; thus
its previous part points to the last node of the list, i.e. here Node3, and the
next part points to its successor node, i.e. Node2. Similarly the previous and
next parts of Node2 point to Node1 and Node3
respectively as they are its predecessor and successor nodes. Node3 is the
last node of this list. Thus the next part of Node3 also contains the reference
of the first node, i.e. Node1, and its previous part contains the reference of
Node2.
Figure 7.27 Representation of a circular doubly linked list
class DNode :
    def __init__(self,Newdata=None,plink=None,nlink=None):
        self.data = Newdata
        self.previous = plink
        self.next = nlink
The general algorithm to insert an element at the beginning of a circular doubly linked list may be defined as follows:
1. Create a new node.
2. Check whether the list is empty.
3. If the list is empty
a. Update both the previous and next parts of the new node with its own reference.
4. Else
a. Point to the last node from the first node, as it is the previous node of the first node.
b. Update the previous part of the new node with the last node.
c. Update the next part of the new node with the reference of the existing first node.
d. Update the previous part of the existing first node with the new node.
e. Update the next part of the last node with the new node.
5. Update the content of the Head variable with the new node.
Figure 7.28 shows the position of pointers after inserting a node at the
beginning of a circular doubly linked list.
Figure 7.28 Inserting a node at the beginning of a circular doubly linked list

The following function implements the above algorithm:
def insert_begin(self,newData):
newNode=DNode(newData,None,self.Head)
if self.Head is None:
newNode.next=newNode
newNode.previous=newNode
else:
curNode = self.Head
lastNode=curNode.previous
newNode.previous=lastNode
newNode.next=curNode
curNode.previous=newNode
lastNode.next=newNode
self.Head=newNode
7.13.2 Inserting an Element at the End of a Circular Doubly Linked List
To insert a node at the end of a circular doubly linked list, we may face two
situations.
First, the list may be empty and, second, there is an existing list. For an
empty list we have to update the Head variable with the new node, and the
previous and next parts of the new node will point to itself. But in the second
case, we need not update the Head variable as the new node will be
appended at the end. We just need to move to the last node and then update
both the next part of the last node and the previous part of the first node with
the new node. The previous and next parts of the new node will be updated
with the existing last node and first node respectively.
The time complexity to insert a node at the end of a circular doubly linked
list is also O(1) as we can access the last node staying at the first node. We
need not traverse the entire list.
The general algorithm to insert an element at the end of a circular doubly linked list may be defined as follows:
1. Create a new node.
2. Check whether the list is empty.
3. If the list is empty
a. Update both the previous and next parts of the new node with its own reference.
b. Update the content of the Head variable with the new node.
4. Else
a. Point to the last node from the first node, as it is the previous node of the first node.
b. Update the previous part of the new node with the existing last node.
c. Update the next part of the new node with the first node.
d. Update the previous part of the first node with the new node.
e. Update the next part of the existing last node with the new node.
Using the above algorithm we can write the following code:
def insert_end(self,newData):
newNode=DNode(newData)
if self.Head is None:
self.Head=newNode
newNode.next=newNode
newNode.previous=newNode
else:
curNode = self.Head
lastNode=curNode.previous
newNode.previous=lastNode
newNode.next=curNode
curNode.previous=newNode
lastNode.next=newNode
Figure 7.29 Inserting a node at the end of a circular doubly linked list

Figure 7.29 shows the position of pointers after inserting a node at the end of a circular doubly linked list.
7.13.3 Deleting the First Node from a Circular Doubly Linked List

In case of first node deletion, here also we have to update the content of the Head variable as now it will point to the second node, and the previous part of the second node, if any exists, will point to the last node as it becomes the first node. The next part of the last node now points to this existing second node. But if the list contains only a single node, the Head variable will contain None.
The time complexity to delete the first node from a circular doubly linked
list is also O(1) as we can access the last node staying at the first node. We
need not traverse the entire list.
The general algorithm to delete the first node from a circular doubly linked
list may be defined as follows:
1. If the list contains only a single node, update the content of the Head variable with None and de-allocate the memory of the node.
2. Else
a. Move to the last node by getting the reference of the last node from the previous part of the first node.
b. Update the content of the Head variable with the next part of the first node, i.e. with the existing second node.
c. Update the previous part of the existing second node (which has now become the first node) with the last node.
d. Update the next part of the last node with the existing second node.
e. De-allocate the memory of the deleted node.
Figure 7.30 shows the position of pointers after deleting the first node from a
circular doubly linked list.
Figure 7.30 Deletion of the first node from a circular doubly linked list

Here is a function to delete the first node of a circular doubly linked list:
def delete_first(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        if curNode.next==curNode:
            self.Head=None
        else:
            lastNode=curNode.previous
            self.Head=self.Head.next
            self.Head.previous=lastNode
            lastNode.next=self.Head
        del(curNode)
7.13.4 Deleting the Last Node from a Circular Doubly Linked List

As a doubly linked list contains the references of both the previous and next nodes, to delete the last node we may move to the last node as well as to its previous node. From both positions we can delete the last node. We have to update only the next part of the last node's previous node and the previous part of the first node.
As in a circular doubly linked list we can access the last node staying at the first node, we need not traverse the entire list. Thus the time complexity to delete the last node is also O(1).
The general algorithm to delete the last node from a circular doubly linked list may be defined as follows:
1. If the list contains only a single node, update the content of the Head variable with None and de-allocate the memory of the node.
2. Else
a. Point to the last node from the first node, as it is the previous node of the first node.
b. Update the next part of the last node's previous node with the first node.
c. Update the previous part of the first node with the last node's previous node.
d. De-allocate the memory of the last node.
Figure 7.31 Deletion of the last node from a circular doubly linked list

Figure 7.31 shows the position of pointers after deleting the last node from a circular doubly linked list. Here is a function to delete the last node of a circular doubly linked list:
def delete_last(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        if curNode.next==curNode:
            self.Head=None
            del(curNode)
        else:
            lastNode=curNode.previous
            lastNode.previous.next=curNode
            curNode.previous=lastNode.previous
            del(lastNode)
For a circular doubly linked list, we have discussed the insertion operation at the beginning and the end of a list and the deletion of the first and last nodes. Insertion of a node at any intermediate position and deletion of any intermediate node can be written similarly. To avoid monotony we are not discussing these operations here, but the code of these functions is given in Program 7.8.
class DNode :
    def __init__(self,Newdata=None,plink=None,nlink=None):
        self.data = Newdata
        self.previous = plink
        self.next = nlink

class doublyLinkedList :
    def __init__(self):
        self.Head = None
def insert_end(self,newData):
newNode=DNode(newData)
if self.Head is None:
self.Head=newNode
newNode.next=newNode
newNode.previous=newNode
else:
curNode = self.Head
lastNode=curNode.previous
newNode.previous=lastNode
newNode.next=curNode
curNode.previous=newNode
lastNode.next=newNode
def insert_begin(self,newData):
newNode=DNode(newData,None,self.Head)
if self.Head is None:
newNode.next=newNode
newNode.previous=newNode
else:
curNode = self.Head
lastNode=curNode.previous
newNode.previous=lastNode
newNode.next=curNode
curNode.previous=newNode
lastNode.next=newNode
self.Head=newNode
#Nth node
def insert_before_nth(self,newData,location):
newNode=DNode(newData)
newNode.next=newNode
newNode.previous=newNode
else:
curNode = self.Head
lastNode=curNode.previous
newNode.previous=lastNode
newNode.next=curNode
curNode.previous=newNode
lastNode.next=newNode
self.Head=newNode
else:
curNode = self.Head
c=1
None :
c+=1
curNode = curNode.next
newNode.next=curNode.next
newNode.previous=curNode
curNode.next.previous=newNode
curNode.next=newNode
def insert_after_nth(self,newData,location):
newNode=DNode(newData)
if self.Head is None:
self.Head=newNode
newNode.next=newNode
newNode.previous=newNode
else:
curNode = self.Head
c=1
None :
c+=1
curNode = curNode.next
newNode.next=curNode.next
newNode.previous=curNode
curNode.next.previous=newNode
curNode.next=newNode
if self.Head is None:
else:
curNode = self.Head
if curNode.next==curNode:
self.Head=None
else:
lastNode=curNode.previous
self.Head=self.Head.next
lastNode.next=self.Head
del(curNode)
def delete_last(self):
if self.Head is None:
else:
curNode = self.Head
if curNode.next==curNode:
self.Head=None
del(curNode)
else:
lastNode=curNode.previous
lastNode.previous.next=curNode
curNode.previous=lastNode.previous
del(lastNode)
def delete_nth(self,posn):
if self.Head is None:
else:
curNode = self.Head
if posn==1:
if curNode.next==curNode:
self.Head=None
else:
lastNode=curNode.previous
self.Head=self.Head.next
self.Head.previous=lastNode
lastNode.next=self.Head
del(curNode)
else:
c=1
Head:
c+=1
curNode = curNode.next
if curNode.next==self.Head:
else:
temp=curNode.next
curNode.next=temp.next
temp.next.previous=curNode
del(temp)
print(“Node Deleted Successfully...”)
def display(self):
if self.Head is None:
print(“Empty List.”)
else:
curNode = self.Head
print(curNode.data,end=“<=>”)
curNode = curNode.next
print(curNode.data)
def rev_display(self):
if self.Head is None:
print(“Empty List.”)
else:
firstNode = self.Head
curNode=lastNode=firstNode.previous
while curNode.previous!=lastNode:
print(curNode.data,end=“<=>”)
curNode = curNode.previous
print(curNode.data)
head=doublyLinkedList()
while True:
LIST ”)
print(“=====================================”)
print(" 9.Displaying the list in Reverse Order")
print(“10.Exit”)
print(“=====================================”)
if choice==1 :
opt=‘Y’
while opt.upper()==‘Y’:
head.insert_end(num)
elif choice==2 :
head.insert_begin(num)
elif choice==3 :
head.insert_before_nth(num,loc)
elif choice==4 :
head.insert_after_nth(num,loc)
elif choice==5 :
head.delete_first()
elif choice==6 :
head.delete_last()
elif choice==7 :
Delete : ”))
head.delete_nth(num)
elif choice==8 :
head.display()
elif choice==9 :
head.rev_display()
elif choice==10:
print(“\nQuiting.......”)
break
else:
Choice”)
continue
A header linked list is a linked list with an extra node that contains some
useful information regarding the entire linked list. This node is known as the
header node and may contain any information such as reference of the first
node and/or last node, largest element and/or smallest element in the list,
total number of nodes, mean value of the data elements of the nodes, etc. As
there is no restriction on how many elements there will be or what elements
can be stored, there is no specific structure of the header node. It may vary
from program to program depending on the requirement of the problem.
• Grounded header linked list: The header node points to the first node of the linked list and the reference part of the last node points to None.
• Circular header linked list: The header node points to the first node of the linked list and the reference part of the last node points to the header node.
• Two-way header linked list: The header node points to the first node as well as the last node of the linked list, and the previous part of the first node and the next part of the last node both point to None.
• Two-way circular header linked list: The header node points to both the first node and the last node of the linked list, and the previous part of the first node and the next part of the last node both point to the header node.

Figure 7.35 Two-way circular header linked list
class Node :
    def __init__(self,Newdata=None,link=None):
        self.data = Newdata
        self.next = link
In the following program we are going to store the reference of the first node
of the linked list, the total number of nodes, and the largest element of the
list in the header node. Thus the header node can be defined as:
class groundedHeaderList :
    def __init__(self):
        self.Head = None
        self.count=0
        self.max=None
class Node :
    def __init__(self,Newdata=None,link=None):
        self.data = Newdata
        self.next = link

class groundedHeaderList :
    def __init__(self):
        self.Head = None
        self.count=0
        self.max=None
def insert_end(self,newData):
    newNode=Node(newData)
    if self.Head is None:
        self.Head=newNode
        self.max=newData
    else:
        curNode = self.Head
        while curNode.next is not None:
            curNode = curNode.next
        curNode.next=newNode
        if newData>self.max:
            self.max=newData
    self.count+=1
def insert_begin(self,newData):
    self.Head=Node(newData,self.Head)
    if self.count==0:
        self.max=newData
    else:
        if newData>self.max:
            self.max=newData
    self.count+=1
def insert_nth(self,newData,location):
    if self.Head is None or location==1:
        self.Head=Node(newData,self.Head)
    else:
        curNode = self.Head
        c=1
        while c<=location-2 and curNode.next is not None :
            c+=1
            curNode = curNode.next
        curNode.next=Node(newData,curNode.next)
    if self.count==0:
        self.max=newData
    else:
        if newData>self.max:
            self.max=newData
    self.count+=1
def delete_first(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        firstNode = self.Head
        if self.count==1:
            self.max=None
        else:
            if self.max==firstNode.data:
                curNode=firstNode.next
                self.max=curNode.data
                while curNode is not None:
                    if curNode.data>self.max:
                        self.max=curNode.data
                    curNode = curNode.next
        self.Head=self.Head.next
        del(firstNode)
        self.count-=1
def delete_last(self):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    elif self.count==1:
        del(self.Head)
        self.Head=None
        self.max=None
        self.count=0
    else:
        curNode = self.Head
        self.max=curNode.data
        while curNode.next is not None:
            if curNode.data>self.max:
                self.max=curNode.data
            prevNode=curNode
            curNode = curNode.next
        del(curNode)
        prevNode.next=None
        self.count-=1
#given Data
def delete_anynode(self,num):
if self.Head is None:
else:
curNode = self.Head
if curNode.data==num: # For 1st Node
firstNode = self.Head
if self.count==1:
self.max=None
else:
if self.max==firstNode.data:
curNode=firstNode.next
self.max=curNode.data
if curNode.data>self.max:
self.max=curNode.data
curNode = curNode.next
self.Head=self.Head.next
del(firstNode)
self.count-=1
flag=0
newMax=curNode.data
flag = 1
break
if curNode.data>newMax:
newMax=curNode.data
curNode = curNode.next
if flag == 0:
else:
prev.next = curNode.next
temp=curNode.next
if self.max==curNode.data:
if temp.data>newMax:
newMax=temp.data
temp = temp.next
self.max=newMax
del(curNode)
self.count-=1
print(“Node Deleted Successfully...”)
def display(self):
    if self.Head is None:
        print("Empty List.")
    else:
        curNode = self.Head
        while curNode is not None:
            print(curNode.data,end="->")
            curNode = curNode.next
        print("None")
head=groundedHeaderList()
while True:
print(“============================================”)
Data”)
print(“10.Exit”)
print(“============================================”)
if choice==1 :
opt=‘Y’
while opt.upper()==‘Y’:
head.insert_end(num)
elif choice==2 :
head.insert_begin(num)
elif choice==3 :
head.insert_nth(num,loc)
elif choice==4 :
head.delete_first()
elif choice==5 :
head.delete_last()
elif choice==6 :
”))
head.delete_anynode(num)
elif choice==7 :
head.display()
elif choice==8 :
”,head.count)
elif choice==9 :
if head.count>0:
head.max)
else:
print(“Null List”)
elif choice==10:
print(“\nQuiting.......”)
break
else:
Choice”)
continue
In the above program every time a node is inserted or deleted, the header node is updated correspondingly. Thus, to access these pieces of information, we need not traverse the list. They are available at the header node. Thus, to count the number of nodes or to find the largest element, the time complexity reduces from O(n) to O(1).
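For example, using the groundedHeaderList class shown above (the data values here are made up for illustration), the stored information can be read directly from the header node:

h = groundedHeaderList()
for value in (10, 40, 25):
    h.insert_end(value)
# no traversal is needed; both pieces of information live in the header node
print("Number of nodes :", h.count)   # prints 3
print("Largest element :", h.max)     # prints 40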
In the next program, operations on a two-way header list have been shown.
Here the header node points to the first node as well as the last node. So, we
can insert or delete a node from both ends very easily without traversing the
list. For a two-way list we are using the same class that has been used in the
program of a double linked list, i.e.
class DNode :
def __init__(self,Newdata=None,plink=None,nlink=None):
self.previous = plink
self.data = Newdata
self.next = nlink
The header node contains the references of the first node and the last node.
Here also we are storing the total number of nodes and the largest element of
the list. Thus the class to represent the header node will be
class twoWayHeaderList :
    def __init__(self):
        self.Head = None
        self.count= 0
        self.max = None
        self.Tail = None
class DNode :
    def __init__(self,Newdata=None,plink=None,nlink=None):
        self.previous = plink
        self.data = Newdata
        self.next = nlink

class twoWayHeaderList :
    def __init__(self):
        self.Head = None
        self.count= 0
        self.max = None
        self.Tail = None
def insert_end(self,newData):
newNode=DNode(newData)
if self.Head is None:
self.Head=newNode
self.Tail=newNode
self.max=newData
else:
newNode.previous=self.Tail
self.Tail.next=newNode
self.Tail=newNode
if newData>self.max:
self.max=newData
self.count+=1
def insert_begin(self,newData):
    newNode=DNode(newData,None,self.Head)
    if self.Head is None:
        self.Tail=newNode
        self.max=newData
    else:
        self.Head.previous=newNode
        if newData>self.max:
            self.max=newData
    self.Head=newNode
    self.count+=1
def insert_nth(self,newData,location):
newNode=DNode(newData)
self.Head.previous=newNode
else:
self.Tail=newNode
newNode.next=self.Head
self.Head=newNode
else:
c=1
None :
c+=1
curNode = curNode.next
newNode.next=curNode.next
newNode.previous=curNode
if curNode.next!=None:
curNode.next.previous=newNode
else:
self.Tail=newNode
curNode.next=newNode
if self.count==0:
self.max=newData
else:
if newData>self.max:
self.max=newData
self.count+=1
# Function to delete the first node
def delete_first(self):
if self.Head is None:
else:
firstNode = self.Head
if self.count==1:
self.Tail=None
self.max=None
else:
if self.max==firstNode.data:
curNode=firstNode.next
self.max=curNode.data
if curNode.data>self.max:
self.max=curNode.data
curNode = curNode.next
self.Head=self.Head.next
self.Head.previous=None
del(firstNode)
self.count-=1
def delete_last(self):
if self.Head is None:
del(self.Head)
self.Head=None
self.max=None
self.count=0
self.Tail=None
else:
if self.max==self.Tail.data:
curNode = self.Head
self.max=curNode.data
self.max=curNode.data
curNode = curNode.next
lastNode=self.Tail
self.Tail=self.Tail.previous
self.Tail.next=None
del(lastNode)
self.count-=1
def delete_nth(self,posn):
if self.Head is None:
elif posn>self.count:
else:
curNode = self.Head
if self.count==1:
self.Head=None
self.max=None
self.count=0
self.Tail=None
else:
self.Head=curNode.next
self.Head.previous=None
flag=1
self.count-=1
c=1
while c<=posn-1:
c+=1
curNode = curNode.next
if self.Tail==curNode:
self.Tail=curNode.previous
self.Tail.next=None
else:
curNode.previous.next=curNode.next
curNode.next.previous=curNode.
previous
if self.max==curNode.data:
flag=1
self.count-=1
if flag==1:
temp=self.Head
self.max=temp.data
if temp.data>self.max:
self.max=temp.data
temp=temp.next
del(curNode)
def display(self):
if self.Head is None:
print(“Empty List.”)
else:
curNode = self.Head
print(“None<=>”,end=“”)
while curNode is not None :
print(curNode.data,end=“<=>”)
curNode = curNode.next
print(“None”)
def rev_display(self):
    if self.Head is None:
        print("Empty List.")
    else:
        curNode = self.Tail
        print("None<=>",end="")
        while curNode is not None :
            print(curNode.data,end="<=>")
            curNode = curNode.previous
        print("None")
head=twoWayHeaderList()
while True:
print(“========================================”)
print(“ 1.Create / Appending The List”)
print(“11.Exit”)
print(“========================================”)
if choice==1 :
opt=‘Y’
while opt.upper()==‘Y’:
head.insert_end(num)
elif choice==2 :
num=int(input(“Enter the Data: ”))
head.insert_begin(num)
elif choice==3 :
head.insert_nth(num,loc)
elif choice==4 :
head.delete_first()
elif choice==5 :
elif choice==6 :
Delete : ”))
head.delete_nth(num)
elif choice==7 :
head.display()
elif choice==8 :
head.rev_display()
elif choice==9 :
print(“Total number of Nodes in the List is:
“,head.count)
elif choice==10:
if head.count>0:
head.max)
else:
print(“Null List”)
elif choice==11:
print(“\nQuiting.......”)
break
else:
Choice”)
continue
The basic disadvantage of any type of linked list is that only sequential
access is possible on a linked list. Thus, to find any information from the
linked list, we need to traverse the entire list. This is time consuming. But if
we store the pieces of information that are required frequently in the header
list, the problem can be easily solved. For example, if the reference of last
node is stored within the header node, to insert a node at the end or to delete
the last node we need not traverse the entire list. We can reach directly the
end of the list and operate accordingly. These are the advantages of a header
linked list.
Though there are several advantages of using a linked list, there are also a few disadvantages. Every node needs extra space to store the reference part along with the data. If the data part of the node is very small in size, this extra space is a real overhead. For example, if it contains a single integer, then to store a single item we need to allocate double the space: to store 2 bytes of data we have to allocate 4 bytes (2 bytes for the integer data and 2 bytes for the reference). But if the size of the data part is large, or if it contains many items, then this wastage is negligible. For example, if we want to store data about students like roll number, name, father's name, local address, permanent address, fees paid, etc., and suppose the size of this record structure is 300 bytes, then to store these 300 bytes we have to allocate only 302 bytes.

The following functions are written considering that they are member functions of the singly linked list class that we have already seen in this chapter.
Program 7.11: Write a function to count the number of nodes in a singly linked list.
def count(self):
    c=0
    curNode = self.Head
    while curNode is not None:
        c+=1
        curNode = curNode.next
    return(c)
Program 7.12: Write a function to find the sum of the data part in a singly
linked list.
def calc_sum(self):
    sum=0
    curNode = self.Head
    while curNode is not None:
        sum+=curNode.data
        curNode = curNode.next
    return(sum)
Program 7.13: Write a function to find a node with the given data in a
singly linked list.
def search(self,num):
    curNode = self.Head
    while curNode is not None:
        if curNode.data==num:
            return curNode
        curNode = curNode.next
    return (None)
Program 7.14: Write a function to find the largest node in a singly linked
list.
def largest_node(self):
    if self.Head is None:
        return (None)
    else:
        curNode = self.Head
        max = curNode.data
        maxNode = curNode
        while curNode is not None:
            if curNode.data>max:
                max = curNode.data
                maxNode = curNode
            curNode = curNode.next
        return (maxNode)
Program 7.15: Write a function to reverse a singly linked list.
def reverse_list(self):
    if self.Head is None:
        return
    curNode = self.Head
    nxtNode = curNode.next
    prevNode = None
    curNode.next = None
    while nxtNode is not None:
        prevNode = curNode
        curNode = nxtNode
        nxtNode = curNode.next
        curNode.next = prevNode
    self.Head = curNode
Program 7.16: Write a function to insert a node in a sorted singly linked list so that the list remains sorted.
# Insert a node keeping the list sorted in ascending order
def insert_sorted(self,newData):
    newNode=Node(newData)
    if self.Head is None:
        self.Head=newNode
    elif newData<self.Head.data:
        curNode = self.Head
        newNode.next = curNode
        self.Head=newNode
    else:
        curNode = self.Head
        while curNode is not None:
            if curNode.data>newData:
                break
            prevNode = curNode
            curNode = curNode.next
        newNode.next = curNode
        prevNode.next = newNode
The following functions are written considering that they are the member
functions of the class circularLinkedList that we have already seen in this
chapter.
Program 7.17: Write a function to count the number of nodes in a circular linked list.
def countNode(self):
    if self.Head is None:
        return 0
    else:
        c=1
        curNode = self.Head
        while curNode.next!=self.Head:
            c+=1
            curNode = curNode.next
        return c
Program 7.18: Write a function to find the node with the smallest value in a
circular linked list.
def smallestNode(self):
    if self.Head is None:
        return (None)
    else:
        curNode = self.Head
        min=curNode.data
        temp=curNode
        curNode = curNode.next
        while curNode!=self.Head:
            if curNode.data<min:
                min=curNode.data
                temp=curNode
            curNode = curNode.next
        return temp
Program 7.19: Write a function to delete the nth node from a circular linked
list.
def delete_nth(self,posn):
    if self.Head is None:
        print("Empty List. Deletion not possible...")
    else:
        curNode = self.Head
        if posn==1:
            if curNode.next==curNode:
                self.Head=None
                del(curNode)
            else:
                while curNode.next!=self.Head:
                    curNode=curNode.next
                self.Head=self.Head.next
                del(curNode.next)
                curNode.next=self.Head
        else:
            c=self.countNode()
            if posn>c:
                print("Node does not exist...")
            else:
                c=1
                while c<=posn-1:
                    c+=1
                    prev = curNode
                    curNode = curNode.next
                prev.next = curNode.next
                del(curNode)
The following functions are written considering that they are the member
functions of the class doublyLinkedList that we have already seen in this
chapter.
Program 7.20: Write a function to find the largest difference among the
elements of a doubly linked list.
# Function to find Largest Difference
def largestDifference(self):
    if self.Head is None:
        return 0
    else:
        curNode = self.Head
        max=min=curNode.data
        while curNode is not None:
            if curNode.data>max:
                max=curNode.data
            elif curNode.data<min:
                min=curNode.data
            curNode = curNode.next
        return max-min
Program 7.21: Write a function to insert a node in a sorted doubly linked list so that the list remains sorted.
# Insert a node keeping the doubly linked list sorted in ascending order
def insert_sorted(self,newData):
    newNode=DNode(newData)
    if self.Head is None:
        self.Head=newNode
    elif newData<self.Head.data:
        curNode = self.Head
        newNode.next = curNode
        curNode.previous=newNode
        self.Head=newNode
    else:
        curNode = self.Head
        while curNode is not None:
            if curNode.data>newData:
                break
            prevNode = curNode
            curNode = curNode.next
        newNode.next = curNode
        newNode.previous=prevNode
        prevNode.next = newNode
        if curNode is not None:
            curNode.previous = newNode
Program 7.22: Write a function to remove all nodes from a doubly linked
list.
def remove_all(self):
    if self.Head is None:
        print("Empty List.")
    else:
        self.Head=None
Program 7.23: Write a function to find the average of the data elements of a
circular doubly linked list.
def average(self):
    if self.Head is None:
        return 0
    else:
        curNode = self.Head
        sum=curNode.data
        c=1
        curNode=curNode.next
        while curNode!=self.Head:
            sum+=curNode.data
            c+=1
            curNode = curNode.next
        return sum/c
Program 7.24: Write a function to remove nodes with negative elements
from a circular doubly linked list.
def delNegative(self):
if self.Head is None:
else:
curNode = self.Head
if curNode.next==curNode:
self.Head=None
return
else:
lastNode=curNode.previous
self.Head=self.Head.next
self.Head.previous=lastNode
lastNode.next=self.Head
nextNode=curNode.next
del(curNode)
print(“delete”)
curNode=nextNode
curNode=curNode.next
print(curNode.data)
if curNode.data<0:
prevNode=curNode.previous
prevNode.next=curNode.next
curNode.next.previous=prevNode
temp=curNode
curNode=curNode.next
del(temp)
else:
curNode = curNode.next
✓ A linked list is a dynamic data structure. Hence the size of the linked list may increase or decrease at run time, i.e. we can add or delete nodes as and when required.
✓ Insertion and deletion of nodes at any point in the list is very simple and requires only some pointer adjustments.
✓ In a linear singly linked list each node points to the next node in the list and the last node contains None to indicate the end of the list.
✓ In a circular singly linked list each node points to the next node in the list, but the last node, instead of containing None, points to the first node of the list.
✓ In a two-way or doubly linked list each node has two reference parts: one points to the previous node and the other points to the next node in the list. Both ends of the list contain None.
✓ In a circular doubly linked list each node also has two reference parts, one pointing to the previous node and the other to the next node, but instead of containing None the first node points to the last node and the last node points to the first node of the list.
✓ The disadvantage of a linked list is that it consumes extra space, since each node contains the reference of the next item in the list, and the nodes of the list can be accessed only sequentially.
✓ A header linked list is a linked list with an extra node which contains some useful information regarding the entire linked list.
a) Array
b) Tree
c) Graph
2. The address of the fifth element in a linked list of integers is 1000. Then
the address of the eighth element is
a) 1006
b) 1004
c) 1008
d) Cannot Say
a) Insertion Sort
b) Binary Search
c) Polynomial Manipulation
d) Radix Sort.
4. What is the worst case time complexity to search for an element from the
following logical structure?
Head -> Data 1 -> Data 2 -> ... -> Data n
a) O(n)
b) O(1)
c) O(n2)
d) O(n log n)
b) Singly linked list requires less space than a doubly linked list.
c) Singly linked list requires less time than a doubly linked list for both
insertion and deletion operations.
6. Consider a single circular linked list with a tail pointer. Which of the
following operations requires O(1) time?
a) I and II only
7. Consider a two way header list containing references of first and last
node. Which of the following operations cannot be done in O(1) time?
8. Which of the following is not true when a linked list is compared with an array?
b) Insertion and deletion operations are easy and less time consuming in
linked lists c) Random access is not allowed in case of linked lists
d) Access of elements in linked list takes less time than compared to arrays
9. Suppose we have two reference variables, Head and Tail, to refer to the
first and last nodes of a singly linked list. Which of the following operations
is dependent on the length of the linked list?
10. Which of the following applications makes use of a circular linked list?
d) None of these
12. To insert a node at the beginning of a single circular linked list how
many pointers/ reference fields are needed to be adjusted?
a) 1
b) 2
c) 3
d) 4
13. To delete an intermediate node from a double linked list how many
pointers/ reference fields are needed to be adjusted?
a) 1
b) 2
c) 3
d) 4
14. What is the time complexity to delete the first node of a circular linked
list?
a) O(log n)
b) O(nlogn)
c) O(n)
d) O(1)
15. What is the time complexity to insert a node at the second position of a
double linked list?
a) O(log n)
b) O(nlogn)
c) O(n)
d) O(1)
16. Suppose we have to create a linked list to store some data (roll no.,
name, and fees paid) of students. Which of the following is a valid class
design in Python for this purpose?
a) class student :
roll=0
name= None
fees_paid=0.0
b) class student :
int roll
char name[30]
float fees_paid
c) class student :
name= mname
fees_paid=mfees
d) class student :
self.roll=roll
self.name= name
self.fees_paid=fees_paid
class Node :
self.data = Newdata
self.next = link
self.Head = None
self.Head=Node(newData, self.Head)
a) 2
b) 3
c) 4
d) 5
Review Exercises
7. How many reference adjustments are required to insert a new node at any
intermediate position in a single linked list?
8. How many reference adjustments are required to delete a node from any
intermediate position of a single linked list?
9. What is the time complexity to insert a node at the end of a single linked
list?
10. What is a circular linked list? What advantage do we get from a circular
linked list?
13. What happens if we use a tail pointer reference for a single linear linked list?
14. Compare and contrast between a single linked list and a doubly linked
list.
15. Compare and contrast between a single circular linked list and a doubly
linked list.
16. Compare and contrast between a single circular linked list and a circular
doubly linked list.
18. How many reference adjustments are required to insert a new node in a
circular doubly linked list?
19. How many reference adjustments are required to delete a node from a
circular doubly linked list?
2. Write a function to find the average of the data part in a singly linked list.
3. Write a function to insert a node before a node whose data matches with
given data in a singly linked list.
4. Write a function to insert a node after a node whose data matches with
given data in a singly linked list.
9. Write a function that will split a singly linked list into two different lists,
one of which will contain odd numbers and other will contain even numbers.
13. Write a function to count the number of nodes in a circular linked list.
14. Write a function to insert a node before a node whose data matches with
given data in a circular singly linked list.
15. Write a function to insert a node after a node whose data matches with
given data in a circular singly linked list.
16. Write a function to delete the nth node from a circular singly linked list.
17. Write a program to create a circular linked list using the tail pointer.
Define the necessary functions to add a new node, delete a node, and display
the list.
18. Write a function to insert a node before a node whose data matches with
given data in a doubly linked list.
19. Write a function to insert a node after a node whose data matches with given data in a doubly linked list.
20. Write a function to delete the node whose data matches with given data
in a doubly linked list.
21. Write a function to insert a node before a node whose data matches with
given data in a circular doubly linked list.
22. Write a function to insert a node after a node whose data matches with
given data in a circular doubly linked list.
23. Write a function to delete the node whose data matches with given data
in a circular doubly linked list.
25. Write a program to implement a circular linked list such that insert at end
and delete from beginning operation executes with O(1) time complexity.
26. Write a function to split a circular doubly linked list into two single
circular linked lists, one of which will contain positive numbers and other
will contain negative numbers.
Chapter
Stack
Since, in a stack, the element that was inserted last is always removed first, it is known as a LIFO (Last In First Out) data structure.
• Peek: Get the top element of the stack without removing it.
In some situations we need to check the top element without removing it.
That is why the peek operation is required. Again, if a stack is full, the push
operation cannot be done on the stack. Similarly, pop and peek operations
cannot be performed on an empty stack.
Thus, before these operations we need to check whether the stack is full or
empty.
Operation       top    array
Initial state   -1     (empty)
Push 5           0     5
Push 10          1     5 10
Push 15          2     5 10 15
Pop              1     5 10
Pop              0     5
Considering the above example we can write the algorithm of the push operation as:
1. If top = SIZE – 1, Then
   a. Print "Stack Overflow"
   b. Go to Step 3
2. Else
   a. top = top + 1
   b. array[top] = Element
3. End

Similarly, the algorithm of the pop operation can be written as:
1. If top = -1, Then
   a. Print "Stack Underflow"
   b. Go to Step 3
2. Else
   a. Element = array[top]
   b. top = top – 1
   c. Return Element
3. End
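The two algorithms above translate almost line by line into Python. The following is a small sketch of my own (not one of the book's numbered programs); the names SIZE, top, and array are assumed here simply to mirror the pseudocode, with a plain Python list standing in for the fixed-size array.

# A fixed-size array simulated with a Python list; top = -1 means the stack is empty
SIZE = 5
top = -1
array = [0] * SIZE

def push(element):
    global top
    if top == SIZE - 1:          # Step 1 of the push algorithm: stack is full
        print("Stack Overflow")
    else:
        top = top + 1            # Step 2: advance top and store the element
        array[top] = element

def pop():
    global top
    if top == -1:                # Step 1 of the pop algorithm: stack is empty
        print("Stack Underflow")
        return None
    element = array[top]         # Step 2: read the top element and move top down
    top = top - 1
    return element

push(5); push(10); push(15)
print(pop(), pop())              # prints: 15 10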
8.3.2 Python List Representation
Though Python does not support traditional arrays, the above concept can be easily implemented using Python's built-in list data structure. We have already seen that all the operations related to a stack, i.e. push, pop, and peek, occur at the topmost position. So, we have to decide which end of the list will be considered the top. Though we can use either end for pushing elements onto the stack, appending, i.e. pushing elements at the back end, is more efficient as it operates with O(1) time complexity, whereas inserting an element at the front end requires O(n) time. So, obviously we will consider the end of the list as the top of the stack. The list method append() will help us push an element onto the stack and the method pop() will help us with the stack's pop operation.
Now the basic operations on stack are shown with the diagram in Figure
8.3.
Operation       arr
Initial state   Empty Stack
Push 5          5
Push 10         5 10
Push 15         5 10 15
Pop             5 10
Pop             5
Figure 8.3 Operations on a stack using the list data structure

Considering the given example, we are now able to write push() and pop(). To write the push() operation, our algorithm appends the new element to the list. Thus, the push() function can be written as:

def push(self, item):
    self.items.append(item)

Again, to remove the last element from the list we have the list function pop(). So, we will write the pop() function as:
def pop(self):
return self.items.pop()
The peek() function is similar to pop(). The difference is that here the element will not be removed from the stack; only the top element will be returned. Here is the function:

def peek(self):
    return self.items[len(self.items)-1]
But before invoking these functions from our main program we need to write some other functions for efficient use of stacks in the program. When we want to push an element into a stack it is necessary to check whether the stack is full or not. If the stack is full it is not possible to insert more items, and this situation is known as 'Stack Overflow'. But we are using the list data structure, and a list grows dynamically as long as memory permits. So, we need not check this overflow condition. However, pop and peek cannot be performed on an empty stack, so we also define an isEmpty() function:

def isEmpty(self):
    return self.items == []

In the main program, before popping or peeking we first check for underflow:

if s.isEmpty():
    print("Stack Underflow")
else:
    num = s.pop()

if s.isEmpty():
    print("Stack Underflow")
else:
    num = s.peek()
Program 8.1: Write a program to demonstrate the various operations of a stack using a list.
class Stack:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop()
    def peek(self):
        return self.items[len(self.items)-1]
    def size(self):
        return len(self.items)
    def display(self):
        top = len(self.items) - 1
        print()
        for i in range(top, -1, -1):
            print(' |', format(self.items[i], '>3'), '|')
        print(" -------")
s = Stack()
while True:
    print("============================")
    print("\t1. Push")
    print("\t2. Pop")
    print("\t3. Peek")
    print("\t4. Display")
    print("\t5. Exit")
    print("============================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        num = int(input("Enter the element to push : "))
        s.push(num)
    elif choice == 2:
        if s.isEmpty():
            print("Stack Underflow")
        else:
            num = s.pop()
            print("Item popped = ", num)
    elif choice == 3:
        if s.isEmpty():
            print("Stack Underflow")
        else:
            num = s.peek()
            print("Item at top = ", num)
    elif choice == 4:
        if s.isEmpty():
            print("Stack is Empty")
        else:
            s.display()
    elif choice == 5:
        print("\nQuiting.......")
        break
    else:
        print("Invalid choice. Please Enter Correct Choice")
        continue
The implementation of a stack using a list is quite easy, but it suffers from some problems. A list has a few shortcomings: if the number of elements in the stack grows large, Python may have to reallocate memory to accommodate them, and in that case an individual push() or pop() may take O(n) time. Thus, to implement a stack, a linked list is often the better choice. As the operations of a stack occur at just one end and we never need to traverse the entire list, a singly linked list is a good option for implementing a stack.
To use a linked list, again we have to decide how to design the stack
structure. Though we designed the end of the list as the top of the stack in
case of Python list representation, yet in case of linked list representation it
would be efficient if we considered the front of the list as the top of the
stack as the head points to the first element. To push an element we just
insert a node at the beginning of a list and to pop an element we delete the
first node.
class stackNode :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link
To implement a stack using linked list we will define another class which
will contain a reference of the above stackNode class, say top, to point to
the top element of the stack.
To write the push() operation we can insert the new element at any end of
the list. If we insert it at the end, we need to traverse the entire list each
time, which leads to the complexity of push() being O(n). But if we insert
the new element at the beginning of the list, we need not move at all and the
complexity is reduced to O(1). Thus to implement push() we should follow
the second option, i.e. element should be inserted at the beginning of the
list. The following function represents this:

def push(self, item):
    newNode = stackNode(item, self.top)
    self.top = newNode

Here, first a new node is created. Its data part is set to the new element and its reference part stores the reference of the current first node in the list; then top is updated to point to this new node.
Similarly, the pop() operation simply removes the first node of the list and returns its data:

def pop(self):
    node = self.top
    self.top = self.top.next
    return node.data
# Program to demonstrate the various operations of a stack using a
# linked list.
class stackNode :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Stack :
    def __init__(self):
        self.top = None
    def isEmpty(self):
        return self.top is None
    def push(self, item):
        self.top = stackNode(item, self.top)
    def peek(self):
        return self.top.data
    def pop(self):
        node = self.top
        self.top = self.top.next
        return node.data
    def display(self):
        curNode = self.top
        print()
        while curNode is not None:
            print(' |', format(curNode.data, '>3'), '|')
            curNode = curNode.next
        print(" -------")

s = Stack()
while True:
    print("============================")
    print("\t1. Push")
    print("\t2. Pop")
    print("\t3. Peek")
    print("\t4. Display")
    print("\t5. Exit")
    print("============================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        num = int(input("Enter the element to push : "))
        s.push(num)
    elif choice == 2:
        if s.isEmpty():
            print("Stack Underflow")
        else:
            num = s.pop()
            print("Item popped = ", num)
    elif choice == 3:
        if s.isEmpty():
            print("Stack Underflow")
        else:
            num = s.peek()
            print("Item at top = ", num)
    elif choice == 4:
        if s.isEmpty():
            print("Stack is Empty")
        else:
            s.display()
    elif choice == 5:
        print("\nQuiting.......")
        break
    else:
        print("Invalid choice. Please Enter Correct Choice")
        continue
One solution to this problem is to use multiple stacks in the same array to
share the common space. Figure 8.4 shows this concept.
Figure 8.4 Representation of two stacks using a single array (stack A grows from index 0, 1, 2, … towards the right; stack B grows from index n-1, n-2, n-3, … towards the left)

In Figure 8.4,
it is shown that we may implement two stacks, stack A and stack B, within a
single array. Stack A will grow from left to right and stack B will grow from
right to left.
Within stack A, the first element will be stored at the zeroth position, next
element at the index position 1, then at the index position 2, and so on.
Whereas within stack B, the first element will be stored at the (n-1)th
position, next element at (n-2)th position, then at (n-3)th position, and so
on. The advantage of this scheme is that the intermediate space in between
two stacks is common for both and may be used by any one of them.
Suppose the array size is 10 and in a particular situation stack B contains
three elements. At this position, stack A is free to expand up to array index 6
in the array. Similarly stack B is also able to expand up to the top of stack
A.
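To make the idea concrete, here is a small sketch of one possible implementation (my own, not the book's listing) in which two stacks share a single Python list; topA grows from the left end, topB grows from the right end, and overflow occurs only when the two tops meet.

class TwoStacks:
    def __init__(self, n):
        self.arr = [None] * n
        self.topA = -1          # top index of stack A (grows rightwards)
        self.topB = n           # top index of stack B (grows leftwards)
    def pushA(self, item):
        if self.topA + 1 == self.topB:
            print("Stack Overflow")
        else:
            self.topA += 1
            self.arr[self.topA] = item
    def pushB(self, item):
        if self.topB - 1 == self.topA:
            print("Stack Overflow")
        else:
            self.topB -= 1
            self.arr[self.topB] = item
    def popA(self):
        if self.topA == -1:
            print("Stack Underflow")
            return None
        item = self.arr[self.topA]
        self.topA -= 1
        return item
    def popB(self):
        if self.topB == len(self.arr):
            print("Stack Underflow")
            return None
        item = self.arr[self.topB]
        self.topB += 1
        return item

ts = TwoStacks(10)
ts.pushA(5); ts.pushB(7); ts.pushA(6)
print(ts.popA(), ts.popB())     # prints: 6 7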
This idea can be generalized to any number of stacks: each stack i is bounded by a start index s[i] and an end index e[i] within the array, so that stack 1 occupies the region s[0] to e[0], stack 2 the region s[1] to e[1], and so on up to stack n. Figure 8.5 illustrates this concept.

Figure 8.5 Representation of n stacks within a single array, where each stack occupies the region between its own start and end indices
Suppose a function f1 calls another function f2, and f2 in turn calls a third function f3. When control is transferred in this way, the f1 function will be pushed first into the stack and then the f2 function will be pushed on top of f1. After completion of all tasks of the f3 function, f2 will be popped from the stack and the remaining task of the f2 function will get executed. Similarly, when the execution of f2 is over, f1 will be popped from the stack and executed.
Thus the stack maintains the order of execution in between different
function calls. Another programming situation where a stack is used is
recursion where the same function is called again and again until a certain
condition is satisfied. Each function call is pushed into a stack and after
achieving the terminating condition all the function calls are popped from
the stack one by one but in reverse order of calls as stacks follow the LIFO
order. We can also use a stack in our programs in situations such as reversing a list or reversing the order of a set of elements, in traversing two other important data structures, trees and graphs, and in many more. Here we will discuss some of them.
One common application is to check whether the brackets in an expression are properly balanced. The algorithm is:
1. Create a stack
2. Read each character of the inputted string until end of string
   a. If the read character is an opening bracket, i.e. (, {, or [, then push it into the stack
   b. Else, if it is a closing bracket, i.e. ), }, or ]
      i. If the stack is empty, print an error message like "No matching opening bracket"
      ii. Else, pop from the stack and check whether the popped opening bracket matches the closing bracket; if not, print an error message like "Improper nesting of brackets"
3. If the stack is empty after reading the whole string, print a message that the brackets are balanced
4. Else
   Print error message like "No matching closing bracket"
def chk_parenthesis(expr):
    st = Stack()
    for ch in expr:
        if ch == '(' or ch == '{' or ch == '[':
            st.push(ch)
        elif ch == ')' or ch == '}' or ch == ']':
            if st.isEmpty():
                print("No matching opening parenthesis")
                return
            else:
                if getmatch(ch) != st.pop():
                    print("Improper nesting of Parenthesis")
                    return
    if not st.isEmpty():
        print("No matching closing parenthesis")
    else:
        print("Parentheses are balanced")

def getmatch(chr):
    if chr == ')':
        return '('
    elif chr == '}':
        return '{'
    elif chr == ']':
        return '['
# Program to check whether an expression is properly
# parenthesized
class stackNode :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Stack :
    def __init__(self):
        self.top = None
    def isEmpty(self):
        return self.top is None
    def push(self, item):
        self.top = stackNode(item, self.top)
    def peek(self):
        return self.top.data
    def pop(self):
        node = self.top
        self.top = self.top.next
        return node.data
    def display(self):
        curNode = self.top
        print()
        while curNode is not None:
            print(' |', format(curNode.data, '>3'), '|')
            curNode = curNode.next
        print(" -------")

def chk_parenthesis(expr):
    st = Stack()
    for ch in expr:
        if ch == '(' or ch == '{' or ch == '[':
            st.push(ch)
        elif ch == ')' or ch == '}' or ch == ']':
            if st.isEmpty():
                print("No matching opening parenthesis")
                return
            else:
                if getmatch(ch) != st.pop():
                    print("Improper nesting of Parenthesis")
                    return
    if not st.isEmpty():
        print("No matching closing parenthesis")
    else:
        print("Parentheses are balanced")

def getmatch(chr):
    if chr == ')':
        return '('
    elif chr == '}':
        return '{'
    elif chr == ']':
        return '['

exp = input("Enter the expression : ")
chk_parenthesis(exp)
• Infix Notation
• Prefix Notation
• Postfix Notation
The most common notation is infix notation. In this notation the operator is
placed in between two operands. Examples are:
A+B
X*Y–Z
A + ( B – C ) / D, etc.
Though we are quite familiar with these expressions, it is very difficult for a computer to parse expressions of this type, because the evaluation of an infix expression depends on the precedence and associativity of the operators. The associativity of the arithmetic operators is left to right. So, for operators of the same priority, evaluation is performed from left to right, and for different priorities it is based on priority. Again, brackets override these priorities. On the other hand, prefix and postfix notations are completely bracket-free notations, and to evaluate them we need not bother about operator precedence. Thus these notations are much easier for the computer to evaluate.
Example 8.1: Convert the following infix expressions into its equivalent
prefix expressions: a) A + B * C – D
b) A + ( B – C * D ) / E
Solution: a) A + B * C – D
A + [ *BC ] – D
[ + A*BC ] – D
– + A*BCD
b) A + ( B – C * D ) / E
A + ( B – [ *CD ] ) / E
A + [ – B*CD ] / E
A + [ / – B*CDE ]
+ A / – B*CDE
In postfix notation the operator is placed after the operands. Thus, the infix expression A + B becomes AB+ in postfix notation.
Example 8.2: Convert the following infix expressions into its equivalent
postfix expressions: a) A + B * C – D
b) A + ( B – C * D ) / E
Solution: a) A + B * C – D
A + [ BC* ] – D
[ ABC*+ ] – D
ABC*+ D –
b) A + ( B – C * D ) / E
A + ( B – [ CD* ] ) /E
A + [ BCD* – ] / E
A + [ BCD* – E / ]
ABCD* – E / +
Any infix expression may contain operators, operands, and opening and
closing brackets.
1. Create a stack
2. Read each character from the source string until End of string is reached
   a. If the scanned character is an operand, send it to the target string
   b. If the scanned character is '(', Then push it into the stack
   c. If the scanned character is ')', Then
      i. Pop from stack and send to target string until '(' is encountered
      ii. Pop '(' from stack but do not send to target string
   d. If the scanned character is an operator, Then
      i. While stack is not empty and the Top element of stack is not '(' and Priority of Top element of stack >= Priority of scanned character,
         Pop from stack and send to target string
      [End of while]
      ii. Push the scanned character into the stack
3. Pop from stack and send to target string until the stack becomes empty
The following example illustrates the above algorithm:
Example 8.3: Convert the following infix expression into its equivalent
postfix expression: A + ( B – C * D ) / E
Scan Character   Stack      Target String
A                           A
+                +          A
(                + (        A
B                + (        A B
–                + ( –      A B
C                + ( –      A B C
*                + ( – *    A B C
D                + ( – *    A B C D
)                +          A B C D * –
/                + /        A B C D * –
E                + /        A B C D * – E
End of string               A B C D * – E / +
Based on the above algorithm, the infixToPostfix() function may be written as:

def infixToPostfix(source):
    target = ""
    st = Stack()
    for ch in source:
        if ch == '(':
            st.push(ch)
        elif ch == ')':
            while st.peek() != '(':
                target += st.pop()
            st.pop()
        elif ch == '+' or ch == '-' or ch == '*' or ch == '/' or ch == '%':
            while (not st.isEmpty() and st.peek() != '('
                   and getPriority(st.peek()) >= getPriority(ch)):
                target += st.pop()
            st.push(ch)
        else:
            target += ch
    while not st.isEmpty():
        target += st.pop()
    return target

def getPriority(opr):
    if opr == '*' or opr == '/' or opr == '%':
        return 1
    else:
        return 0
# Program to convert an infix expression into its equivalent
# Postfix expression
class stackNode :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Stack :
    def __init__(self):
        self.top = None
    def isEmpty(self):
        return self.top is None
    def push(self, item):
        self.top = stackNode(item, self.top)
    def peek(self):
        return self.top.data
    def pop(self):
        node = self.top
        self.top = self.top.next
        return node.data

def infixToPostfix(source):
    target = ""
    st = Stack()
    for ch in source:
        if ch == '(':
            st.push(ch)
        elif ch == ')':
            while st.peek() != '(':
                target += st.pop()
            st.pop()
        elif ch == '+' or ch == '-' or ch == '*' or ch == '/' or ch == '%':
            while (not st.isEmpty() and st.peek() != '('
                   and getPriority(st.peek()) >= getPriority(ch)):
                target += st.pop()
            st.push(ch)
        else:
            target += ch
    while not st.isEmpty():
        target += st.pop()
    return target

def getPriority(opr):
    if opr == '*' or opr == '/' or opr == '%':
        return 1
    else:
        return 0

infix = input("Enter an infix expression : ")
postfix = infixToPostfix(infix)
print("The equivalent postfix expression is : ", end="")
print(postfix)
1. Create a stack
2. Read each character of the postfix expression until end of expression is reached
   a. If the read character is an operand, push it into the stack
   b. If the read character is an operator, Then
      i. Pop the top element from the stack into A
      ii. Pop the next element from the stack into B
      iii. Compute C = B <operator> A and push C into the stack
3. Pop from the stack and return the popped value as the result of the expression

Example 8.4: Evaluate the postfix expression 4 9 3 2 * + 5 / – using a stack.
Solution:
Scan Character   Stack        Intermediate Operations
4                4
9                4 9
3                4 9 3
2                4 9 3 2
*                4 9 6        A = 2, B = 3, C = 3 * 2 = 6
+                4 15         A = 6, B = 9, C = 9 + 6 = 15
5                4 15 5
/                4 3          A = 5, B = 15, C = 15 / 5 = 3
–                1            A = 3, B = 4, C = 4 – 3 = 1
End of string                 Return (1)
The evaluatePostfix() function may be written as:

def evaluatePostfix(source):
    st = Stack()
    for ch in source:
        if ch == ' ' or ch == '\t':
            continue
        if ch.isdigit():
            st.push(int(ch))
        else:
            a = st.pop()
            b = st.pop()
            if ch == '+':
                c = b + a
            elif ch == '-':
                c = b - a
            elif ch == '*':
                c = b * a
            elif ch == '/':
                c = b / a
            elif ch == '%':
                c = int(b) % int(a)
            st.push(c)
    return st.pop()
class stackNode :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Stack :
    def __init__(self):
        self.top = None
    def isEmpty(self):
        return self.top is None
    def push(self, item):
        self.top = stackNode(item, self.top)
    def peek(self):
        return self.top.data
    def pop(self):
        node = self.top
        self.top = self.top.next
        return node.data

def evaluatePostfix(source):
    st = Stack()
    for ch in source:
        if ch == ' ' or ch == '\t':
            continue
        if ch.isdigit():
            st.push(int(ch))
        else:
            a = st.pop()
            b = st.pop()
            if ch == '+':
                c = b + a
            elif ch == '-':
                c = b - a
            elif ch == '*':
                c = b * a
            elif ch == '/':
                c = b / a
            elif ch == '%':
                c = int(b) % int(a)
            st.push(c)
    return st.pop()

postfix = input("Enter a postfix expression : ")
result = evaluatePostfix(postfix)
print("The value of the expression = ", result)
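Assuming the sketch above, running it on the single-digit expression from Example 8.4 reproduces the traced result (note that Python's / operator produces a float):

print(evaluatePostfix("4932*+5/-"))   # prints 1.0, matching the trace above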
Example 8.5: Convert the postfix expression 4 9 3 2 * + 5 / – back into its equivalent fully parenthesized infix expression using a stack.
Solution:
Scan Character   Stack                      Intermediate Operations
4                4
9                4 9
3                4 9 3
2                4 9 3 2
*                4 9 (3*2)                  A = 2, B = 3, C = (3*2)
+                4 (9+(3*2))                A = (3*2), B = 9, C = (9+(3*2))
5                4 (9+(3*2)) 5
/                4 ((9+(3*2))/5)            A = 5, B = (9+(3*2)), C = ((9+(3*2))/5)
–                (4–((9+(3*2))/5))          A = ((9+(3*2))/5), B = 4, C = (4–((9+(3*2))/5))
End of string                               Return (4–((9+(3*2))/5))
There are several algorithms for conversion of an infix expression into its
equivalent prefix expression. Here we are using a simple one. The
advantage of this algorithm is that we can use the same algorithm of infix to
postfix conversion with a very minor modification and just two extra steps.
The modification is that when the scan character is an operator, then
elements will be popped first only if priority of the top element of stack is
greater than (not
>=) the priority of the scan character. Thus the algorithm can be described as:
1. Create a stack
2. Reverse the infix string. While reversing, we need to remember that each left parenthesis will be converted to a right parenthesis and each right parenthesis will be converted to a left parenthesis.
3. Read each character from the reversed source string until End of string is reached
   a. If the scanned character is an operand, send it to the target string
   b. If the scanned character is '(', Then push it into the stack
   c. If the scanned character is ')', Then
      i. Pop from stack and send to target string until '(' is encountered
      ii. Pop '(' from stack but do not send to target string
   d. If the scanned character is an operator, Then
      i. While stack is not empty and the Top element of stack is not '(' and Priority of Top element of stack > Priority of scanned character,
         Pop from stack and send to target string
      [End of while]
      ii. Push the scanned character into the stack
4. Pop from stack and send to target string until the stack becomes empty.
5. Reverse the target string to get the required prefix expression.
Example 8.6: Convert the following infix expression into its equivalent
prefix expression: A + ( B – C * D ) / E
Reversed string: E / ( D * C – B ) + A
Scan Character   Stack      Target String
E                           E
/                /          E
(                / (        E
D                / (        E D
*                / ( *      E D
C                / ( *      E D C
–                / ( –      E D C *
B                / ( –      E D C * B
)                /          E D C * B –
+                +          E D C * B – /
A                +          E D C * B – / A
End of string               E D C * B – / A +
# Program to convert an infix expression into its equivalent
# Prefix expression
class stackNode :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Stack :
    def __init__(self):
        self.top = None
    def isEmpty(self):
        return self.top is None
    def push(self, item):
        self.top = stackNode(item, self.top)
    def peek(self):
        return self.top.data
    def pop( self ):
        node = self.top
        self.top = self.top.next
        return node.data

def getPriority(opr):
    if opr == '*' or opr == '/' or opr == '%':
        return 1
    else:
        return 0

def infixToPrefix(source):
    source = reverse(source)
    target = ""
    st = Stack()
    for ch in source:
        if ch == '(':
            st.push(ch)
        elif ch == ')':
            while st.peek() != '(':
                target += st.pop()
            st.pop()
        elif ch == '+' or ch == '-' or ch == '*' or ch == '/' or ch == '%':
            while (not st.isEmpty() and st.peek() != '('
                   and getPriority(st.peek()) > getPriority(ch)):
                target += st.pop()
            st.push(ch)
        else:
            target += ch
    while not st.isEmpty():
        target += st.pop()
    target = reverse(target)
    return target

def reverse(str1):
    str2 = ""
    j = len(str1) - 1
    while j >= 0:
        if str1[j] == '(':
            str2 += ')'
        elif str1[j] == ')':
            str2 += '('
        else:
            str2 += str1[j]
        j = j - 1
    return str2

infix = input("Enter an infix expression : ")
prefix = infixToPrefix(infix)
print("The equivalent prefix expression is : ", prefix)
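As a quick check of the sketch above, the expression of Example 8.6 gives the same answer that was obtained by hand:

print(infixToPrefix("A+(B-C*D)/E"))   # expected output: +A/-B*CDE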
# Program to reverse the elements of an array using a stack
class Stack:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop()
    def peek(self):
        return self.items[len(self.items)-1]
    def size(self):
        return len(self.items)

import array as arr
numbers = arr.array('i', [])
st = Stack()
total = int(input("How many numbers? "))
for i in range(total):
    n = int(input("Enter a number : "))
    numbers.append(n)
print("Original order :")
for i in range(total):
    print(numbers[i])
    st.push(numbers[i])      # push each array element into the stack
# Pop the elements back from the
# stack to array
for i in range(total):
    numbers[i] = st.pop()
print("Reversed order :")
for i in range(total):
    print(numbers[i])
Not only an array, we can also reverse the elements of a linked list. Consider the following program:
# Program to reverse the elements of a singly
# linked list.
class Stack:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop()

class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class SLList :
    def __init__(self):
        self.head = None
    def display(self):
        current = self.head
        while current is not None:
            print(current.data, " ", end="")
            current = current.next
        print()
    def addNode(self, Newdata):
        if self.head == None:
            self.head = Node(Newdata, None)
        else:
            current = self.head
            while current.next is not None:
                current = current.next
            current.next = Node(Newdata, None)

def reverse(lst):
    st = Stack()
    ptr1 = lst.head
    ptr2 = ptr1
    while ptr1:                  # push every data value onto the stack
        st.push(ptr1.data)
        ptr1 = ptr1.next
    while ptr2:                  # pop them back into the nodes in reverse order
        ptr2.data = st.pop()
        ptr2 = ptr2.next

newList = SLList()
total = int(input("How many numbers? "))
for i in range(total):
    n = int(input("Enter a number : "))
    newList.addNode(n)
print("The List is : ")
newList.display()
reverse(newList)
print("The reversed List is : ")
newList.display()
8.5.4 Recursion
def factorial(n):
if n==0:
return 1
else:
return n * factorial(n-1)
num = int(input("Enter a number : "))
fact = factorial(num)
print("Factorial of", num, "=", fact)
Suppose the above function is called for n = 4. The first call of the function is pushed onto the stack, since to evaluate its return statement it must call factorial(3). In the same way factorial(3) calls factorial(2), which calls factorial(1), which in turn calls factorial(0). At this point the stack holds (top first):

factorial(1)
factorial(2)
factorial(3)
factorial(4)

Now n becomes 0 and this call simply returns 1. factorial(1) is then popped from the stack and returns 1*1, i.e. 1, leaving on the stack:

factorial(2)
factorial(3)
factorial(4)

In the same way factorial(2) returns 2*1 = 2 and is popped, leaving factorial(3) and factorial(4); then factorial(3) returns 3*2 = 6, and finally factorial(4) returns 4*6 = 24.
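The same bookkeeping can be made explicit. The following sketch of my own (it is not how Python actually implements recursion, only an illustration) computes the factorial with the list-based Stack class of Program 8.1: the pending values are pushed on the way "down" and multiplied while popping on the way "back up".

def factorial_with_stack(n):
    st = Stack()                 # Stack class from Program 8.1
    while n > 0:                 # push n, n-1, ..., 1, like the chain of recursive calls
        st.push(n)
        n = n - 1
    result = 1
    while not st.isEmpty():      # pop and multiply, like the chain of returns
        result = result * st.pop()
    return result

print(factorial_with_stack(4))   # prints: 24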
Stack at a Glance
✓ A stack is a linear data structure in which both the insertion and deletion operations occur only at one end.
✓ Infix notation, prefix notation, and postfix notation are three notations
used to represent any algebraic expression.
✓ The prefix notation is also known as Polish notation and the postfix
notation is known as reverse Polish notation.
b) Insert
c) Push
d) Pop
2. The process by which an element is removed from a stack is called a)
Delete
b) Remove
c) Push
d) Pop
b) Overflow
c) Empty collection
d) Garbage Collection
b) Overflow
c) User flow
d) Full flow
a) 5
b) 4
c) 3
d) 2
a) 5
b) 2
c) 3
d) Cannot say
a) 24
b) -24
c) 35
d) -35
b) 2
c) 3
d) 4
b) P Q – R S * T + U * * V /
c) P Q – R S * T + * U * V /
d) P Q – R S T * + * U * V /
11. Which data structure is required to check whether the parentheses are
properly placed in an expression?
a) Array
b) Stack
c) Queue
d) Tree
a) Tree
b) Linked list
c) Queue
d) Stack
13. The postfix form of A/B-C*D is
a) /AB*CD-
b) AB/CD*-
c) A/BC-*D
d) ABCD-*/
a) Linked List
b) Stack
c) Queue
d) Tree
a) +A/B%*CDE
b) /+AB%*CDE
c) +A/B*C%DE
d) +A/BC*%DE
b) – +ABC * D
c) – +AB * CD
d) – + * ABCD
a) Array
b) Queue
c) Stack
d) List
b) 368
c) 140
d) 375
19. Convert the following infix expressions into its equivalent postfix
expressions.
(X + Y ^ Z) / (P + Q) – R
a) X Y Z ^ + P Q + / R –
b) X Y Z +^ P Q + / R –
c) X Y Z ^ + P Q /+ R –
d) X Y Z P Q + ^ / + R –
Push(‘+’)
Push(‘-’)
Pop()
Push(‘*’)
Push(‘/’)
Pop()
Pop()
Pop()
Push(‘%’)
a) 0
b) 1
c) 2
d) 3
a) Job scheduling
b) Reversing a string
c) Implementation of recursion
23. If the elements “A”, “B”, “C” and “D” are pushed in a stack and then
popped one by one, what will be the order of popping?
a) ABCD
b) ABDC
c) DCAB
d) DCBA
24. If the following postfix expression is evaluated using a stack, after the first * is evaluated the top two elements of the stack will be
422^/32*+15*-
a) 6, 1
b) 7, 5
c) 3, 2
d) 1, 5
d) All of these
26. How many stacks are needed to implement a queue? Assume no other
data structure like an array or linked list is available.
a) 1
b) 2
c) 3
d) 4
a) P Q – R S + *
b) P Q R S – + *
c) P Q – R S * +
d) P Q – + R S *
b) a b c + e / f - *
c) a b c + e / * f -
d) None of these
Review Exercises
1. What is a stack?
3. What are the operations associated with stack? Explain with example.
b. ( A – B + C ) * D
c. ( A + B ) / ( C – D ) * ( E % F )
d. 4 + 3 * 1 / 6 + 7 - 4 / 2 + 5 * 3
e. A + ( B * C – ( D / E * F ) * G ) * H
b. ( A – B + C ) * D
c. ( A + B ) / ( C – D ) * ( E % F )
d. 4 + 3 * 1 / 6 + 7 - 4 / 2 + 5 * 3
e. A + ( B * C – ( D / E * F ) * G ) * H
b. A B + C D – / E F % *
c. A B C * D – E F G % + / –
a. 5 3 – 2 + 4 *
b. 9 3 + 6 4 – / 7 2 % *
c. 3 5 2 * + 4 –
b. A B + C D – / E F % *
c. A B C * D – E F G % + / –
Chapter 9
Queue
Another important data structure is the queue. Queues are also used extensively in various situations in computer science. In this chapter we will discuss the operations related to queues, how a queue is represented in memory, both the array and the linked representations of queues, different types of queues, and various applications of queues.
Apart from these two basic operations, to use a queue efficiently we need to add some further functionality, such as checking whether the queue is empty, checking whether it is full, and peeking at the front element without removing it. If a queue is full, the enqueue operation cannot be performed, and dequeue and peek cannot be performed on an empty queue. Thus, before these operations, we need to check whether the queue is full or empty.
import array

class Queue:
    def __init__(self, size):
        self.front = -1
        self.rear = -1
        self.arr = array.array('i', [0]*size)
Like a stack, before using the queue we have to initialize it, and this is done by assigning the value -1 to both front and rear to indicate that the queue is empty. As within the array the values of front and rear vary from 0 to SIZE-1, we set the initial value as -1. Now the basic operations on a queue are shown with the diagram in Figure 9.2.
Operation       front   rear   arr (size 5)
Initial state    -1      -1    (empty)
Enqueue 5         0       0    5
Enqueue 10        0       1    5 10
Enqueue 15        0       2    5 10 15
Dequeue           1       2    10 15
Dequeue           2       2    15
Dequeue          -1      -1    (empty)
Enqueue 20        0       0    20
Enqueue 25        0       1    20 25
Enqueue 30        0       2    20 25 30
Dequeue           1       2    25 30
Enqueue 35        1       3    25 30 35
Dequeue           2       3    30 35
Enqueue 40        2       4    30 35 40
Enqueue 45        0       3    30 35 40 45   (elements shifted to the beginning first)

Figure 9.2 Operations on a queue using an array
Considering the above example we can write the algorithm of the Enqueue operation as:
1. If front = 0 And rear = SIZE – 1, Then
   a. Print "Queue Overflow"
   b. Go to Step 4
2. Else
   a. If rear = -1, Then
      i. front = rear = 0
   b. Else
      i. If rear = SIZE – 1, Then
         1. Shift all the elements to the beginning of the array
         2. Set front = 0 and set rear to the new index of the last element
      ii. Else
         1. rear = rear + 1
3. arr[rear] = Element
4. End

Similarly, the algorithm of the Dequeue operation can be written as:
1. If front = -1, Then
   a. Print "Queue Underflow"
   b. Go to Step 3
2. Else
   a. Element = arr[front]
   b. If front = rear, Then
      i. front = rear = -1
   c. Else
      i. front = front + 1
   d. Return Element
3. End
If the queue is full it is not possible to insert more items, and this situation is known as 'Queue Overflow'. The isFull() function for a queue may be written as:

def isFull(self):
    return self.front == 0 and self.rear == self.size()-1

To check whether a queue is full or not, inspecting only the value of rear is not sufficient. There may be situations when rear points to the end of the array but the queue is still not full (consider the example in Figure 9.2 after enqueueing 40); there may be some vacant places at the front side of the queue, and in those situations front will point to some position other than 0. Thus we need to check the value of front as well as rear.
def isEmpty(self):
    return self.front == -1

When a queue is empty the value of both front and rear should be -1; in other cases the value of front is the array index where the first element of the queue resides and the value of rear stores the array index of the last element of the queue. So to write isEmpty() we can check only front, or only rear, or both. Thus all the following versions work the same as the one above:

def isEmpty(self):
    return self.rear == -1

Or

def isEmpty(self):
    return self.front == -1 and self.rear == -1

Or

def isEmpty(self):
    return self.front == -1 or self.rear == -1
Now, we will write the enqueue() function. Based on the algorithm written above we will write the following function:

def enqueue(self, item):
    j = 0
    if self.rear == -1:
        self.front = self.rear = 0
    elif self.rear == self.size()-1:
        for i in range(self.front, self.rear+1):
            self.arr[j] = self.arr[i]
            j = j+1
        self.front = 0
        self.rear = j
    else:
        self.rear += 1
    self.arr[self.rear] = item
In the dequeue() function, the element at the front position is first stored in a temporary variable. Then we check whether this was the only element of the queue; this situation is achieved only when front and rear point to the same position. As on dequeueing this element the queue becomes empty, we need to set both front and rear to -1. Finally, the temporary variable is returned. The dequeue() function can be written as:
written as:
def dequeue(self):
temp = self.arr[self.front]
if self.front==self.rear:
self.front=self.rear=-1
else:
self.front+=1
return temp
Program 9.1: Write a program to demonstrate the various operations of a queue using an array.
import array

class Queue:
    def __init__(self, size):
        self.front = -1
        self.rear = -1
        self.arr = array.array('i', [0]*size)
    def isEmpty(self):
        return self.front == -1
    def isFull(self):
        return self.front == 0 and self.rear == self.size()-1
    def enqueue(self, item):
        j = 0
        if self.rear == -1:
            self.front = self.rear = 0
        elif self.rear == self.size()-1:
            for i in range(self.front, self.rear+1):
                self.arr[j] = self.arr[i]
                j = j+1
            self.front = 0
            self.rear = j
        else:
            self.rear += 1
        self.arr[self.rear] = item
    def dequeue(self):
        temp = self.arr[self.front]
        if self.front == self.rear:
            self.front = self.rear = -1
        else:
            self.front += 1
        return temp
    def peek(self):
        return self.arr[self.front]
    def size(self):
        return len(self.arr)
    def display(self):
        print(self.arr)
        print("------"*self.size() + "-")
        print("| "*(self.front), end="")
        for i in range(self.front, self.rear+1):
            print('|', format(self.arr[i], '>3'), end=" ")
        print("| "*(self.size()-(self.rear+1)), end="")
        print("|")
        print("------"*self.size() + "-")
size = int(input("Enter the size of the queue : "))
q = Queue(size)
while True:
    print("\nPROGRAM TO IMPLEMENT QUEUE ")
    print("============================")
    print("\t1. Enqueue")
    print("\t2. Dequeue")
    print("\t3. Peek")
    print("\t4. Display")
    print("\t5. Exit")
    print("============================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        if q.isFull():
            print("Queue Overflow")
        else:
            num = int(input("Enter the element to insert : "))
            q.enqueue(num)
    elif choice == 2:
        if q.isEmpty():
            print("Queue Underflow")
        else:
            num = q.dequeue()
            print("Item dequeued = ", num)
    elif choice == 3:
        if q.isEmpty():
            print("Queue Underflow")
        else:
            num = q.peek()
            print("Item at front = ", num)
    elif choice == 4:
        if q.isEmpty():
            print("Queue is Empty")
        else:
            q.display()
    elif choice == 5:
        print("\nQuiting.......")
        break
    else:
        print("Invalid choice. Please Enter Correct Choice")
        continue
To solve this problem we can imagine the array to be bent so that the
position next to the last position of the array becomes the first position, just
like in Figure 9.3.
So, the linear queue now becomes the circular queue. The advantage of this
concept is that we need not shift elements to enqueue a new element when
the queue is not full but rear reaches the end of the queue (i.e. SIZE-1
position). We can simply store the new element at the next position, i.e. the
zeroth position of the array. In case of dequeue also after removing the
element at position SIZE-1, front will point to the zeroth position of the
array. In the next section we will see the various operations on a circular
queue and the position of the elements within the array.
Operation       front   rear   arr (size 5)
Initial state    -1      -1    (empty)
Enqueue 5         0       0    5
Enqueue 10        0       1    5 10
Enqueue 15        0       2    5 10 15
Dequeue           1       2    10 15
Dequeue           2       2    15
Dequeue          -1      -1    (empty)
Enqueue 20        0       0    20
Enqueue 25        0       1    20 25
Enqueue 30        0       2    20 25 30
Dequeue           1       2    25 30
Enqueue 35        1       3    25 30 35
Dequeue           2       3    30 35
Enqueue 40        2       4    30 35 40
Enqueue 45        2       0    45 at index 0; 30, 35, 40 at indices 2-4   (rear wraps around)

Figure 9.4 Operations on a circular queue using an array
Considering the above example we can write the algorithm of the Enqueue operation for a circular queue as:
1. If front = 0 And rear = SIZE – 1, Or front = rear + 1, Then
   a. Print "Queue Overflow"
   b. Go to Step 4
2. Else
   a. If rear = -1, Then
      i. front = rear = 0
   b. Else
      i. If rear = SIZE – 1, Then
         1. rear = 0
      ii. Else
         1. rear = rear + 1
3. arr[rear] = Element
4. End

The algorithm of a Dequeue operation for a circular queue can be written as:
1. If front = -1, Then
   a. Print "Queue Underflow"
   b. Go to Step 3
2. Else
   a. Element = arr[front]
   b. If front = rear, Then
      i. front = rear = -1
   c. Else
      i. If front = SIZE – 1, Then
         1. front = 0
      ii. Else
         1. front = front + 1
   d. Return Element
3. End
Accordingly, the isFull() function for a circular queue can be written as:

def isFull(self):
    return (self.front == 0 and self.rear == self.size()-1) or (self.front == self.rear+1)

and the enqueue() function as:

def enqueue(self, item):
    if self.rear == -1:
        self.front = self.rear = 0
    elif self.rear == self.size()-1:
        self.rear = 0
    else:
        self.rear += 1
    self.arr[self.rear] = item

Or in short:

def enqueue(self, item):
    if self.rear == -1:
        self.front = self.rear = 0
    else:
        self.rear = (self.rear+1) % self.size()
    self.arr[self.rear] = item
The above enqueue() function is almost the same as the enqueue() of a linear queue, but when the value of rear becomes SIZE – 1 we set rear = 0 instead of shifting the elements.

Similarly, the dequeue() function for a circular queue will be as follows:

def dequeue(self):
temp = self.arr[self.front]
if self.front==self.rear:
self.front=self.rear=-1
elif self.front==self.size()-1:
self.front=0
else:
self.front+=1
return temp
Or in short:
def dequeue(self):
temp = self.arr[self.front]
if self.front==self.rear:
self.front=self.rear=-1
else:
self.front=(self.front+1)%self.size()
return temp
Program 9.2: Write a program to demonstrate the various operations of a circular queue.

import array

class Queue:
    def __init__(self, size):
        self.front = -1
        self.rear = -1
        self.arr = array.array('i', [0]*size)
    def isEmpty(self):
        return self.front == -1
    def isFull(self):
        return (self.front == 0 and self.rear == self.size()-1) or (self.front == self.rear+1)
    def enqueue(self, item):
        if self.rear == -1:
            self.front = self.rear = 0
        elif self.rear == self.size()-1:
            self.rear = 0
        else:
            self.rear += 1
        self.arr[self.rear] = item
    def dequeue(self):
        temp = self.arr[self.front]
        if self.front == self.rear:
            self.front = self.rear = -1
        elif self.front == self.size()-1:
            self.front = 0
        else:
            self.front += 1
        return temp
    def peek(self):
        return self.arr[self.front]
    def size(self):
        return len(self.arr)
    def display(self):
        print("------"*self.size() + "-")
        if self.front <= self.rear:
            print("| "*(self.front), end="")
            for i in range(self.front, self.rear+1):
                print('|', format(self.arr[i], '>3'), end=" ")
            print("| "*(self.size()-(self.rear+1)), end="")
        else:
            for i in range(0, self.rear+1):
                print('|', format(self.arr[i], '>3'), end=" ")
            print("| "*(self.front-(self.rear+1)), end="")
            for i in range(self.front, self.size()):
                print('|', format(self.arr[i], '>3'), end=" ")
        print("|")
        print("------"*self.size() + "-")
size = int(input("Enter the size of the queue : "))
q = Queue(size)
while(True):
print(“======================================”)
print(“\t1. Enqueue”)
print(“\t2. Dequeue”)
print(“\t3. Peek”)
print(“\t4. Display”)
print(“\t5. Exit”)
    print("======================================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        if q.isFull():
            print("Queue Overflow")
        else:
            num = int(input("Enter the element to insert : "))
            q.enqueue(num)
elif choice==2 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.dequeue()
elif choice==3 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.peek()
elif choice==4 :
if q.isEmpty() :
print(“Queue is Empty”)
else :
q.display()
elif choice==5 :
print(“\nQuiting.......”)
break
    else:
        print("Invalid choice. Please Enter Correct Choice")
        continue
Operation       arr
Initial state   Empty Queue
Enqueue 5       5
Enqueue 10      5 10
Enqueue 15      5 10 15
Dequeue         10 15
Dequeue         15

Figure 9.5 Operations on a queue using the list data structure

Program 9.3: Write a program to demonstrate the various operations of a queue using a list.
class Queue:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def enqueue(self, item):
        self.items.append(item)
    def dequeue(self):
        return self.items.pop(0)
    def peek(self):
        return self.items[0]
    def size(self):
        return len(self.items)
    def display(self):
        rear = len(self.items)
        print("------"*rear + "-")
        for i in range(rear):
            print('|', format(self.items[i], '>3'), end=" ")
        print("|")
        print("------"*rear + "-")
q=Queue()
while(True):
print(“============================”)
print(“\t1. Enqueue”)
print(“\t2. Dequeue”)
print(“\t3. Peek”)
print(“\t4. Display”)
print(“\t5. Exit”)
    print("============================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        num = int(input("Enter the element to insert : "))
        q.enqueue(num)
elif choice==2 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.dequeue()
elif choice==3 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
            num = q.peek()
            print("Item at front = ", num)
    elif choice == 4:
        if q.isEmpty():
print(“Queue is Empty”)
else :
q.display()
elif choice==5 :
print(“\nQuiting.......”)
break
    else:
        print("Invalid choice. Please Enter Correct Choice")
        continue
As the operation of a queue occurs just at two ends, we need not traverse the
entire list. Thus a singly linked list is the better option to implement a queue.
The linked list representation of a queue can be done efficiently in two ways.
We can represent a queue very efficiently with the help of a header linked
list. Here the header node of the linked list will consist of two pointers to
point to the front and rear nodes of the queue (i.e. the first node and last
node of the list). Figure 9.6 illustrates this.
Figure 9.6 Representation of a queue using a header linked list (the header node holds two references, Front and Rear)
From Figure 9.6 we can
see that to implement it we have to declare two different classes
– one for the nodes of a single linked list and another for the header node.
Nodes of the single linked list represent each element in the queue and the
header node keeps track of the front position and the rear position. For the
nodes of a single linked list the class can be defined as:
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

and for the header node the class can be defined as:

class Queue :
    def __init__(self):
        self.front = None
        self.rear = None
The basic operations on a queue using linked list are shown in Figure 9.7.
Figure 9.7 Operations on a queue using a linked list: starting from an empty queue (front = rear = None), the figure shows the states after Enqueue 5, Enqueue 10, Enqueue 15, and one Dequeue
Initially when the queue is empty there will be no nodes in the linked list.
Thus front and rear both contain None. Next 5 is inserted into the queue.
So, a node is created containing 5 and both front and rear point to that node.
Next operation is Enqueue 10. Again a new node is created containing 10
and this node is inserted after the first node in the linked list.
As now it becomes the last node in the list, rear points to this new node.
Similarly, in case of Enqueue 15, a new node is inserted at the end of the list
and rear points to the new node.
Thus for the enqueue operation, we need not traverse the list and the time
complexity of this operation is O(1). When the dequeue operation is done, as
the front points to the first node of the list, the content of this node, i.e. 5, is
returned and the node is deleted from the list. The front pointer now points
to the next node of the list. Thus the time complexity of dequeue is also
O(1).
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Queue :
    def __init__(self):
        self.front = None
        self.rear = None
    def isEmpty(self):
        return self.front is None
    def enqueue(self, item):
        newNode = Node(item, None)
        if self.front is None:
            self.front = self.rear = newNode
        else:
            self.rear.next = newNode
            self.rear = newNode
    def peek(self):
        return self.front.data
    def dequeue(self):
        frontNode = self.front
        if self.front == self.rear:
            self.front = self.rear = None
        else:
            self.front = frontNode.next
        return frontNode.data
    def display(self):
        curNode = self.front
        print()
        while curNode is not None:
            print(curNode.data, '-> ', end="")
            curNode = curNode.next
        print("None")
q=Queue()
while(True):
print(“============================”)
print(“\t1. Enqueue”)
print(“\t2. Dequeue”)
print(“\t3. Peek”)
print(“\t4. Display”)
print(“\t5. Exit”)
    print("============================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        num = int(input("Enter the element to insert : "))
        q.enqueue(num)
elif choice==2 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.dequeue()
elif choice==3 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.peek()
elif choice==4 :
if q.isEmpty() :
print(“Queue is Empty”)
else :
q.display()
elif choice==5 :
print(“\nQuiting.......”)
break
else:
print(“Invalid choice. Please Enter Correct
Choice”)
continue
9.3.4.2 Using a Single Circular Linked List with a Single Tail Pointer
class Node :
    def __init__(self, Newdata=None, link=None):
        self.data = Newdata
        self.next = link

class Queue :
    def __init__(self):
        self.tail = None
    def isEmpty(self):
        return self.tail is None
    def enqueue(self, item):
        newNode = Node(item, None)
        if self.tail is None:
            newNode.next = newNode
        else:
            newNode.next = self.tail.next
            self.tail.next = newNode
        self.tail = newNode
    def peek(self):
        return self.tail.next.data
    def dequeue(self):
        frontNode = self.tail.next
        if frontNode.next == frontNode:
            self.tail = None
        else:
            self.tail.next = frontNode.next
        return frontNode.data
    def display(self):
        curNode = self.tail.next
        print()
        while curNode is not self.tail:
            print(curNode.data, '-> ', end="")
            curNode = curNode.next
        print(curNode.data)
q=Queue()
while(True):
print(“============================”)
print(“\t1. Enqueue”)
print(“\t2. Dequeue”)
print(“\t3. Peek”)
print(“\t4. Display”)
print(“\t5. Exit”)
    print("============================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        num = int(input("Enter the element to insert : "))
        q.enqueue(num)
elif choice==2 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.dequeue()
elif choice==3 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
num=q.peek()
elif choice==4 :
if q.isEmpty() :
print(“Queue is Empty”)
else :
q.display()
elif choice==5 :
print(“\nQuiting.......”)
break
else:
print(“Invalid choice. Please Enter Correct
Choice”)
continue
One solution to this problem is to use multiple queues in the same array to
share the common space. Figure 9.8 shows this concept.
Figure 9.8 Representation of two queues using a single array (queue A occupies positions 0, 1, 2, … from the left and queue B occupies positions n-1, n-2, n-3, … from the right)
In Figure 9.8, it is shown
that we may implement two queues, queue A and queue B, within a single
array. Queue A will grow from left to right and queue B will grow from right
to left. Within queue A, the first element will be stored at the zeroth position,
the next element at the index position 1, then at the index position 2, and so
on. Whereas within the queue B, the first element will be stored at the (n-
1)th position, the next element at the (n-2)th position, then at (n-3)th
position, and so on. The advantage of this scheme is that the intermediate
space in between two queues is common for both and may be used by any
one of them. Suppose the array size is 10 and in a particular situation queue
B contains three elements. Thus for queue B, the front will point to 9 and
the rear will point to 7. At this position, queue A is free to expand up to
array index 6 in the array. Similarly, queue B is also able to expand up to the
rear position of queue A.
This idea can be generalized to any number of queues sharing the same array: each queue i is bounded by a start index s[i] and an end index e[i], so that queue 1 occupies the region s[0] to e[0], queue 2 the region s[1] to e[1], and so on up to queue n.
9.5.1 DEQue
The full form of DEque is Double Ended Queue. It is a special type of queue
where elements can be inserted or deleted from either end. But insertion or
deletion is not allowed in the middle or any intermediate position. Thus the
operations related to deque are: insert from left, insert from right, delete
from left, and delete from right.
• Input restricted deque: In this deque, inputs are restricted, which means elements can be inserted only through one particular end, but elements can be deleted from both ends.
• Output restricted deque: In this deque, outputs are restricted, which means elements can be deleted only through one particular end, but elements can be inserted at both ends.
Methods         Description
append( )       Inserts an element at the right end
appendleft( )   Inserts an element at the left end
pop( )          Removes and returns an element from the right end
popleft( )      Removes and returns an element from the left end
remove(n)       Removes the first occurrence of n
clear( )        Removes all the elements from the deque
count( n )      Counts the number of occurrences of n
reverse( )      Reverses the order of the elements of the deque
The following program shows the basic operations of a deque:

Program 9.6: Write a program to demonstrate the different operations on a deque.

from collections import deque

dq = deque()
dq.append("Sachin")
print("Added at right")
print(dq)
dq.append("Sunil")
print("Added at right")
print(dq)
dq.appendleft("Paji")
print("Added at left")
print(dq)
name = dq.pop()
print(name + " popped")
name = dq.popleft()
print(name + " popped")
print(dq)
dq.append("Sourav")
dq.append("Sehwag")
dq.append("Sachin")
print(dq)
c = dq.count("Sachin")
print(c, "'Sachin' is here")
dq.remove("Sachin")
print(dq)
dq.reverse()
print(dq)
dq.clear()
print(dq)
deque([])
Added at right
deque([‘Sachin’])
Added at right
deque([‘Sachin’, ‘Sunil’])
Added at left
Sunil popped
deque([‘Paji’, ‘Sachin’])
Paji popped
deque([‘Sachin’])
2 ‘Sachin’ is here
deque([])
In this approach the elements are always kept arranged according to their priority, so a new element is inserted at its proper position among the existing elements. With this arrangement the dequeue operation is the same as for a normal queue, as the element with the highest priority will always be at the front position. Thus the time complexity of the enqueue operation will be O(n) whereas that of dequeue is O(1).
On the other hand, we can insert a new element always at the end of the list
following the normal enqueue operation. At the time of dequeue, the element
with highest priority will be searched and removed. Here the time
complexity of the enqueue operation will be O(1) but that of dequeue is
O(n).
In the following program, a priority queue is implemented using the list data
structure.
As deletion of any element from the end of the list requires time complexity
of O(1), here the end of the list has been considered as the front. So, in this
implementation, the time complexity of the dequeue operation is O(1) and
that of enqueue is O(n).
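Both trade-offs can be avoided by keeping the elements in a binary heap, which makes enqueue and dequeue O(log n) each. This is not the implementation used in Program 9.7 below; it is only a sketch for comparison, using Python's standard heapq module (heapq is a min-heap, so the priority is stored negated to make the largest priority come out first, and a running counter keeps FIFO order among equal priorities).

import heapq

class HeapPriorityQueue:
    def __init__(self):
        self._heap = []
        self._count = 0
    def enqueue(self, item, priority):
        # store (-priority, insertion order, item) so the highest priority pops first
        heapq.heappush(self._heap, (-priority, self._count, item))
        self._count += 1
    def dequeue(self):
        return heapq.heappop(self._heap)[2]
    def isEmpty(self):
        return len(self._heap) == 0

pq = HeapPriorityQueue()
pq.enqueue("A", 5); pq.enqueue("B", 3); pq.enqueue("D", 6)
print(pq.dequeue(), pq.dequeue())    # prints: D A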
Program 9.7: Write a program to demonstrate the operations of a priority queue.

class Node :
    def __init__(self, Newdata=None, NewPriority=None):
        self.data = Newdata
        self.priority = NewPriority

class priorityQueue:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def enqueue(self, item):
        rear = len(self.items)
        for i in range(rear):
            if item.priority >= self.items[i].priority:
                self.items.insert(i, item)
                break
        else:
            self.items.append(item)
    def dequeue(self):
        return self.items.pop()
    def peek(self):
        return self.items[self.size()-1]
    def size(self):
        return len(self.items)
    def display(self):
        rear = len(self.items)
        print("---------"*rear + "-")
        for i in range(rear):
            print('|', format(str(self.items[i].data)
                + "(" + str(self.items[i].priority) + ")", '>6'), end=" ")
        print("|")
        print("---------"*rear + "-")
q=priorityQueue()
while(True):
print(“======================================”)
print(“\t1. Enqueue”)
print(“\t2. Dequeue”)
print(“\t3. Peek”)
print(“\t4. Display”)
print(“\t5. Exit”)
    print("=====================================")
    choice = int(input("Enter your choice : "))
    if choice == 1:
        num = int(input("Enter the element to insert : "))
        prio = int(input("Enter its priority : "))
        newNode = Node(num, prio)
        q.enqueue(newNode)
elif choice==2 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
            popNode = q.dequeue()
            print("Item dequeued = ", popNode.data, " with Priority ", popNode.priority)
elif choice==3 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
popNode=q.peek()
elif choice==4 :
if q.isEmpty() :
print(“Queue is Empty”)
else :
q.display()
elif choice==5 :
print(“\nQuiting.......”)
break
    else:
        print("Invalid choice. Please Enter Correct Choice")
        continue
Implementation of a priority queue can also be done using an array, list, or linked list. For an array/list representation, a 2D array/list is a better choice, where each row is treated as a different queue of a different priority. We need not store the priority values with the data part: the zeroth row maintains the queue having priority value 0, the first row maintains the queue having priority value 1, the second row maintains the queue having priority value 2, and so on. For a linked list representation, each queue of a different priority is implemented by a different linked list, and a separate list contains the priority and the address of the first node of each of these lists.
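As a rough illustration of the row-per-priority idea just described (a sketch of my own, assuming the number of priority levels is known in advance and that a larger value means higher priority), each row is an ordinary FIFO queue and dequeue scans the rows from the highest priority downwards:

class MultiQueuePriorityQueue:
    def __init__(self, levels):
        self.rows = [[] for _ in range(levels)]   # one FIFO queue per priority value
    def enqueue(self, item, priority):
        self.rows[priority].append(item)          # O(1): no priority stored with the data
    def dequeue(self):
        for p in range(len(self.rows) - 1, -1, -1):
            if self.rows[p]:
                return self.rows[p].pop(0)        # first element of the highest non-empty row
        return None                               # all rows are empty

pq = MultiQueuePriorityQueue(7)
pq.enqueue("A", 5); pq.enqueue("B", 3); pq.enqueue("D", 6)
print(pq.dequeue(), pq.dequeue())                 # prints: D A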
• Used for serving requests on a single shared resource, like a printer or a disk.
Queue at a Glance
✓ A circular queue is the same as a linear queue with the only difference
being that it is considered to be bent so that the position next to the last
position of the queue becomes the first position. By this consideration the
enqueue operation never requires shifting.
a) Linked list
b) Stack
c) Queue
d) Tree
a) Array
b) Stack
c) Queue
d) Tree
3. Suppose the four elements 20, 10, 40 and 30 are inserted one by one in a linear queue. Now if all the elements are dequeued from the queue, what will be the order?
a) Priority Queue
b) Circular Queue
c) Deque
d) Linear Queue
Which of the following conditions becomes True when the queue becomes
full?
a) Rear == SIZE-1
c) Front == SIZE-1
c) Front = Rear + 1
d) Rear = Front
a) Rear = Front + 1
c) Front = Rear + 1
d) Rear = Front
c) Quick sort
d) Radix sort
d) None of these.
10. The initial configuration of a queue is 10, 20, 30, 40 (10 is at the front
end). To get the configuration 40, 30, 20, 10 one needs a minimum of
a) FIFO
b) LIFO
d) None of these
12. If the elements ‘A’,‘B’,‘C’, and ‘D’ are inserted in a priority queue with
the priority 5, 3, 5, 6
respectively and then removed one by one, in which order will the elements
be removed?
a) ABCD
b) ACBD
c) DACB
d) DCAB
Review Exercises
3. What are the operations associated with a queue? Explain with example.
13. Show the position of a queue of size 5 when implemented using an array
for the following operations. Assume initially the queue is empty.
a. Enqueue 23 and 75
b. Dequeue twice
d. Dequeue
14. Show the position of a circular queue of size 5 for the following
operations. Assume initially the queue is empty.
a. Enqueue 23 and 75
b. Dequeue twice
d. Dequeue
f. Dequeue twice
g. Enqueue 10
h. Deque
15. Show the position of a priority queue of size 5 for the following
operations. Assume initially the queue is empty and higher value is higher
priority.
a. Enqueue 23 and 75 with priority 5
b. Dequeue
d. Dequeue
Chapter 10
Trees
So far we have discussed different linear data structures like arrays, linked
lists, stacks, queues, etc. These are called linear data structures because
elements are arranged in linear fashion, i.e.
one after another. Apart from these there are also some non-linear data
structures. Trees and graphs are the most common non-linear data
structures. In this chapter we will discuss the tree data structure. A tree
structure is mainly used to store data items that have a hierarchical
relationship among them. In this chapter we will discuss the different types
of trees, their representation in memory, and recursive and non-recursive
implementation.
Figure 10.1 An example of a tree, with root A; among its descendants are the nodes B, C, D, E, F, I, J, K, L, M, and P
Node: It is the main component of a tree. It contains the data elements and
the links to other nodes. In the above tree, A, B, C, D, etc. are the nodes.
Root Node: The topmost node or starting node is known as the root node.
Here A is the root node.
Edge: It is the connecter of two nodes. In Figure 10.1, the line between any
two nodes can be considered as an edge. As every two nodes are connected
by a single edge only, there should be exactly n-1 edges in a tree having n
nodes.
Child Node: The immediate successor node of a node is called its child
node. It is also known as its descendant. Here B and C are the child nodes
of A; I, J, and K are the child nodes of D.
Siblings: All the child nodes of a particular parent node are called siblings. Thus B and C are siblings, as are I, J, and K.
Leaf Node: A node that does not have any child node is known as a leaf
node, or terminal node, or external node. I, P, K, L, F, M, etc. are the leaf
nodes.
Internal Node: A node that has any child node is known as an internal node
or non-terminal node. All the nodes except the leaf nodes in a tree are
internal nodes. B, D, E, J, etc. are internal nodes.
Degree: The number of child nodes of a node is called the degree of that
node. Thus the degree of any leaf node is always 0. In the tree in Figure
10.1, the degree of A is 2, degree of D is 3, degree of E is 1, etc. The highest
degree among all nodes in a tree is the degree of that tree. Thus the degree
of the above tree is 3. In other words, the degree or the order of a tree
implies the maximum number of possible child nodes of a particular node.
If the degree or order of a tree is n, the nodes of this tree have a maximum
of n number of children.
Depth: The length of a path starting from the root to a particular node is
called the depth of that node. Here the depth of P is 4 whereas the depth of
E is 2.
Height: The height of a tree is the number of nodes on the path starting
from the root to the deepest node. In other words, total number of levels in a
tree is the height of that tree.
There are several types of trees. Here we will discuss some of them:
• General Tree
• Forest
• Binary Tree
By the term tree we actually mean general tree. Thus it starts with the root
node and this root node may have zero or any number of sub-trees. Except
the root node, every node must have a parent node, and except the leaf
node, every node possesses one or more child nodes. As there is no specific
order or degree of this tree it is treated as a general tree and represented as
an Abstract Data Type (ADT).
10.3.2 Forest
If we remove the root node from a tree, the remaining sub-trees form a forest; in other words, a forest is a collection of disjoint trees. Figure 10.2 (a) shows a tree and Figure 10.2 (b) shows its corresponding forest.
10.3.3 Binary Tree
A binary tree is a tree of degree 2, i.e. each node of this tree can have a
maximum of two children. It is either empty or consists of a root node and
zero or one or two binary trees as children of the root. These are known as
left sub-tree and right sub-tree. The starting node of each sub-tree is
considered as their root node and can have further left sub-tree and right
sub-tree, if any. Figure 10.3 shows a binary tree.
Figure 10.3 A binary tree consisting of a root node, a left sub-tree, and a right sub-tree (each sub-tree has its own root node)
If each node of a binary tree has exactly zero (in case of leaf node) or two
non-empty children, the binary tree is known as a strictly binary tree or 2-
tree. Figure 10.4 shows a strictly binary tree.
Figure 10.4 A strictly binary tree
A binary tree in which all levels have the maximum number of nodes is called a full binary tree. Thus, except for the leaf nodes, all nodes have exactly two child nodes. In a full binary tree, the total number of nodes at the first level is 1, i.e. 2^0; at the second level it is 2, i.e. 2^1; at the third level it is 4, i.e. 2^2; and so on. Thus at the d-th level the total number of nodes is 2^(d-1), and the total number of nodes in a full binary tree of height h can be calculated as:

n = 2^0 + 2^1 + 2^2 + 2^3 + …… + 2^(h-1) = 2^h – 1

From this equation we can also find the height of a full binary tree:

n = 2^h – 1
or, 2^h = n + 1
or, h = log2 (n + 1)

Thus, the height of a full binary tree having n nodes is h = log2 (n + 1), and the total number of nodes in a full binary tree of height h is n = 2^h – 1.
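For example, a full binary tree of height 3 has n = 2^3 – 1 = 7 nodes (1 at the first level, 2 at the second, and 4 at the third), and conversely h = log2(7 + 1) = 3. A two-line check of this pair of formulas:

import math
h = 3
n = 2**h - 1
print(n, int(math.log2(n + 1)))    # prints: 7 3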
Figure 10.7 A binary tree and its corresponding extended binary tree

10.3.8 Binary Search Tree (BST)
A binary search tree (BST) is a binary tree in which, for every node, all the values in its left sub-tree are smaller than the value of that node and all the values in its right sub-tree are greater. The figure below shows an example of a binary search tree containing the values 50, 78, 43, 27, 48, 62, 91, 30, 52, 67, 95, 28, 33, 93, and 63.
Next, * will be evaluated. So, in the next step * will be the root of a sub-tree; the tree built for (B – C) becomes its left sub-tree and D becomes its right sub-tree. Finally, + will be evaluated, with A as its left sub-tree and the sub-tree just built as its right sub-tree, giving the complete expression tree for A + (B – C) * D.
Similarly, to get back the expression from an expression tree, we have to proceed from the bottom. First, we will get (B – C), then ((B – C) * D), and, finally, A + ((B – C) * D).
Figure 10.10 A tournament tree for eight players A to H
In the tournament tree shown in Figure 10.10, the first match was played
between A and B, and A was the winner. Similarly between C and D, E and
F, and G and H, the respective winners were D, F, and G. In the next round,
D and F were the winners and finally F was the winner of the tournament.
In the array representation of a binary tree, the root element is stored at the index position 1 (the zeroth index position is skipped), and the positions of the immediate left and right children of a node stored at index i are calculated as:

Left child position = 2 * i
Right child position = 2 * i + 1

These rules are applicable for sub-trees also. Consider the following example:
As the height of the above binary tree is four, the required size of the array
is (24 – 1) + 1 (as we decided to skip the zeroth index position), i.e. 16.
Now, the root element A will be stored at index position 1. The immediate
left child, B, will be stored at position 2*1, i.e. at index position 2, and the
immediate right child, C, will be stored at position 2*1+1, i.e. at index
position 3 of the array. The node B does not have any child. So, we proceed
to the next node, C, which has two children, D and E. The index position of
D will be 3*2, i.e. 6, and that of E will be 3*2+1=7. Similarly, the positions
of F and G will be 12 and 13 respectively and we will get the following
array:
Index:    1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
Element:  A  B  C  -  -  D  E  -  -  -   -   F   G   -   -

Figure 10.12 Array representation of the binary tree of Figure 10.11
From Figure 10.12, we can find that there are several vacant places in the
array and thus producing wastage of memory. This representation is
effective in case of a full tree or at least a complete tree. Otherwise there
will be wastage. Especially in case of a skew tree the wastage is huge. The
solution to this problem is linked list representation.
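Before moving on to the linked representation, here is a small sketch of my own showing the indexing rule in action: the tree of Figure 10.11 stored in a Python list whose index 0 is left unused, with the children of the node at index i read from positions 2*i and 2*i+1 (the None entries are exactly the vacant positions visible in Figure 10.12).

tree = [None, 'A', 'B', 'C', None, None, 'D', 'E',
        None, None, None, None, 'F', 'G', None, None]

def left(i):
    return 2 * i            # index of the left child of the node stored at index i

def right(i):
    return 2 * i + 1        # index of the right child

i = 3                                            # the node 'C'
print(tree[i], tree[left(i)], tree[right(i)])    # prints: C D E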
In the linked list representation, on the other hand, we need not reserve memory for nodes that are not currently present. As and when a new node needs to be inserted, we can allocate memory dynamically for the node and insert it into the tree. This is
true for deletion also. And this insertion and deletion operations are much
more efficient than array representation, because here we need not shift
items; only a few pointer operations are required. The only disadvantage is
that to store the reference of left and right children, extra space is required
for each node. But this would be negligible if the number of data items is
large.
In linked list representation each element is represented by a node that has
one data part and two reference parts to store the references of the left child
and the right child. If any node does not have any left and/or right child, the
corresponding reference part will contain NULL (in case of python it is
None). The data part may contain any number of data members according
to the requirement. For simplicity, we are considering a single data member
here. So, to represent the node, we may define the class as:

class TreeNode :
    def __init__(self, Newdata=None, lchild=None, rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
Figure 10.13 Linked list representation of the binary tree of Figure 10.11 (each node stores a data value together with references to its left and right children; absent children are None)
10.5 Binary Tree Traversal
Once a binary tree is created, we need to traverse the tree to access the
elements in the tree.
It is the way by which each node of the tree is visited exactly once in a
systematic manner.
Based on the way the nodes of a tree are visited, the traversal techniques can be classified mainly into three categories: preorder, inorder, and postorder traversal. For the binary tree used as the running example in this section, the three traversal paths are:

Preorder: A B C D F G E
Inorder: B A F D G C E
Postorder: B F G D E C A
# Recursive preorder traversal of a
# binary tree
def preorder_rec(curNode):
    if curNode is not None:
        print(curNode.data, end=" ")
        preorder_rec(curNode.left)
        preorder_rec(curNode.right)
Though it is very easy to write the code recursively, we may also write the above function in an iterative way. For iterative implementation, a stack is required. Here is a general algorithm to implement a non-recursive or iterative approach to the preorder traversal of a binary tree:
1. Create a stack.
2. Push None into the stack.
3. Set curNode = Root.
4. While curNode is not None, repeat:
   a. Process (print) the data of curNode.
   b. If curNode has a right child, push the right child into the stack.
   c. If curNode has a left child, set curNode = left child of curNode.
   d. Otherwise, set curNode = the node popped from the stack.
# Iterative (non-recursive) preorder traversal of a
# binary tree
    def preorder(self):
        st = Stack()
        st.push(None)
        curNode = self.Root
        while curNode is not None:
            print(curNode.data, end=" ")
            if curNode.right:
                st.push(curNode.right)
            if curNode.left:
                curNode = curNode.left
            else:
                curNode = st.pop()
# Recursive inorder traversal of a
# binary tree
def inorder_rec(curNode):
    if curNode is not None:
        inorder_rec(curNode.left)
        print(curNode.data, end=" ")
        inorder_rec(curNode.right)
For iterative implementation of inorder traversal, a stack is also required. The general algorithm to implement a non-recursive or iterative approach to the inorder traversal of a binary tree may be defined as:
1. Create a stack.
2. Push None into the stack.
3. Set curNode = Root and set flag as False.
4. While curNode is not None, repeat:
   a. If flag is False, push the nodes along the left chain of curNode into the stack and move curNode to the left-most node of that chain.
   b. Process (print) the data of curNode.
   c. If curNode has a right child, set curNode = right child and set flag as False.
   d. Otherwise, set curNode = the node popped from the stack and set flag as True.

The corresponding function may be written as:

# Iterative (non-recursive) inorder traversal of a
# binary tree
    def inorder(self):
        st = Stack()
        st.push(None)
        curNode = self.Root
        flag = False
        while curNode is not None:
            if not flag:
                while curNode.left is not None:
                    st.push(curNode)
                    curNode = curNode.left
            print(curNode.data, end=" ")
            if curNode.right:
                curNode = curNode.right
                flag = False
            else:
                curNode = st.pop()
                flag = True
# Recursive postorder traversal of a
# binary tree
def postorder_rec(curNode):
    if curNode is not None:
        postorder_rec(curNode.left)
        postorder_rec(curNode.right)
        print(curNode.data, end=" ")
For iterative postorder traversal also, a stack is required. The general algorithm may be described as:
1. Create a stack.
2. Set curNode = Root.
3. While True, repeat:
   a. While curNode is not None,
      i. If curNode has a right child, push the right child into the stack.
      ii. Push curNode into the stack and move to its left child.
   b. Pop a node from the stack into curNode.
   c. If curNode has a right child and the top of the stack is also the reference of the right child of curNode, then pop the right child from the stack, push curNode back, and set curNode = its right child.
   d. Otherwise, process (print) the data of curNode and set curNode = None.
   e. If the stack is empty, stop.

The corresponding function may be written as:

    def postorder(self):
        st = Stack()
        curNode = self.Root
        if curNode is None:
            return
        while True:
            while curNode is not None:
                if curNode.right:
                    st.push(curNode.right)
                st.push(curNode)
                curNode = curNode.left
            curNode = st.pop()
            if (curNode.right and not st.isEmpty()
                    and st.peek() == curNode.right):
                st.pop()
                st.push(curNode)
                curNode = curNode.right
            else:
                print(curNode.data, end=" ")
                curNode = None
            if st.isEmpty():
                break
In level order traversal, all the nodes at a particular level are traversed
before going to the next level. To implement level order traversal, instead of
stack we need a queue. The following algorithm shows how a binary tree
can be traversed level wise.
1. Create a queue.
2. Enqueue the root node into the queue.
3. While the queue is not empty, repeat:
   a. Dequeue a node from the queue into curNode and process (print) its data.
   b. If curNode has a left child, enqueue the left child.
   c. If curNode has a right child, enqueue the right child.

The corresponding function may be written as:

def levelorder(rootNode):
    q = Queue()
    q.enqueue(rootNode)
    while not q.isEmpty():
        curNode = q.dequeue()
        print(curNode.data, end=" ")
        if curNode.left:
            q.enqueue(curNode.left)
        if curNode.right:
            q.enqueue(curNode.right)
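To tie the four traversal routines together, here is a self-contained sketch of my own (using plain recursion and Python's built-in collections.deque instead of the book's Stack and Queue classes) that builds the running example tree and prints all four traversal orders:

from collections import deque

class TreeNode:
    def __init__(self, Newdata=None, lchild=None, rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild

def preorder(n):
    return [n.data] + preorder(n.left) + preorder(n.right) if n else []

def inorder(n):
    return inorder(n.left) + [n.data] + inorder(n.right) if n else []

def postorder(n):
    return postorder(n.left) + postorder(n.right) + [n.data] if n else []

def levelorder(root):
    order, q = [], deque([root])
    while q:
        n = q.popleft()
        order.append(n.data)
        if n.left:
            q.append(n.left)
        if n.right:
            q.append(n.right)
    return order

# The example tree: A has children B and C; C has children D and E; D has children F and G
root = TreeNode('A', TreeNode('B'),
                TreeNode('C', TreeNode('D', TreeNode('F'), TreeNode('G')), TreeNode('E')))
print(preorder(root))    # ['A', 'B', 'C', 'D', 'F', 'G', 'E']
print(inorder(root))     # ['B', 'A', 'F', 'D', 'G', 'C', 'E']
print(postorder(root))   # ['B', 'F', 'G', 'D', 'E', 'C', 'A']
print(levelorder(root))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']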
To reconstruct a binary tree uniquely we need its inorder traversal path together with either its preorder or its postorder path. The purpose of both the preorder and the postorder paths is the same: they identify the root.
For the preorder path, the left-most element is the root, whereas in the
postorder path, the right-most element is the root. Once the root is identified
either from the preorder or the postorder path, the inorder path shows the
elements of the left sub-tree and the right subtree. Elements in the left side
of the root in the inorder path will be in the left sub-tree and the elements in
the right side of the root in the inorder path will be in the right sub-tree.
Example: Construct a binary tree from the following traversal paths.
Preorder : A B C D F G E
Inorder : B A F D G C E
Solution:
The left-most element of the preorder path is the root. So, A is the root here.
Now marking A as the root in the inorder path, we can easily identify that
the left sub-tree consists of only a single node, B, and the right sub-tree
consists of F, D, G, C, and E.
In the left sub-tree, there is a single element, B. So, we have nothing to do.
Now we concentrate on the right sub-tree. To construct the right sub-tree we
have to follow the same technique.
Preorder : C D F G E
Inorder : F D G C E
From preorder we are getting C as the root, and identifying C in the inorder
path we get F, D, and G in the left sub-tree and E in right sub-tree. So, the
tree becomes:
[Figure: the partially constructed tree]
Preorder : D F G
Inorder : F D G
Now, it is clear that D is the root and F will be in the left sub-tree and G
will be in the right sub-tree and the final tree will be:
Figure 10.16 Final tree constructed from the traversal path
Here we consider the preorder and the inorder paths. It is also possible to reconstruct a binary tree from its postorder and inorder paths. The algorithm is the same; the only difference is that the root will be the right-most element in the postorder path.
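The book reconstructs the tree by hand, but the same idea can be expressed in code. The following is only an illustrative sketch (build_tree is a hypothetical helper, not part of the book's listings); it assumes that all key values are distinct.
# Rebuild a binary tree from its preorder and inorder paths (illustrative sketch)
def build_tree(preorder, inorder):
    if not preorder:
        return None
    rootData = preorder[0]                  # left-most element of preorder is the root
    rootNode = TreeNode(rootData)
    pos = inorder.index(rootData)           # split the inorder path around the root
    leftIn, rightIn = inorder[:pos], inorder[pos+1:]
    leftPre = preorder[1:1+len(leftIn)]
    rightPre = preorder[1+len(leftIn):]
    rootNode.left = build_tree(leftPre, leftIn)
    rootNode.right = build_tree(rightPre, rightIn)
    return rootNode
root = build_tree(list("ABCDFGE"), list("BAFDGCE"))
postorder_rec(root)        # prints: B F G D E C A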
10.7 Conversion of a General Tree to a Binary Tree
It is possible to create a binary tree from a general tree. For this conversion
we need to follow three simple rules:
1. Set the root of the general tree as the root of the binary tree.
2. Set the left-most child of any node in the general tree as the left child of
that node in the binary tree.
3. Set the right sibling of any node in the general tree as the right child of
that node in the binary tree.
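The three rules can also be turned into a small function. The sketch below is not from the book; it assumes a general-tree node (GeneralNode, a hypothetical class) that simply stores its children in a Python list.
# Illustrative sketch: converting a general tree to a binary tree
class GeneralNode:
    def __init__(self, data, children=None):
        self.data = data
        self.children = children if children is not None else []
def general_to_binary(gnode):
    if gnode is None:
        return None
    bnode = TreeNode(gnode.data)                           # rule 1: same root
    if gnode.children:
        bnode.left = general_to_binary(gnode.children[0])  # rule 2: left-most child
        prev = bnode.left
        for sibling in gnode.children[1:]:                 # rule 3: right siblings
            prev.right = general_to_binary(sibling)
            prev = prev.right
    return bnode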
Example 10.2: Construct a binary tree from the following general tree.
Solution: The conversion proceeds in eight steps (shown in the accompanying figures), which are explained below.
1. As the root of the general tree is A, first we have to set A as the root of the binary tree.
2. The left-most child of A is B and, being the root, A has no right sibling. Thus B becomes the left child of A, and A gets no right child.
3. The left-most child of B is E and the right sibling of B is C. Thus the left and right children of B are E and C respectively.
4. Similarly, as the left
child of C is H and the right sibling of C is D, the left and right children of
C are H and D respectively.
5. The left-most child of D is I and D does not have any right siblings. Thus
D has only left child, I.
6. E does not have any child but its right sibling is F. So, E has only right
child F.
7. K is the left child of F and the right sibling of F is G. Thus the left and
right children of F are K and G respectively.
8. G and H neither have any children nor have any siblings. I does not have
any children but its right sibling is J. As a result, I has only a right child, J.
Node J also neither has any children nor any siblings. So, we get the final
tree.
Insertion of a new element into, and deletion of an existing element from, a BST is also very fast. The time complexity of these operations is also O(log n). Now we learn how a BST can be constructed. Consider the following example:
Example 10.3: Construct a binary search tree from the following data: 14
15 4 9 7 18 3 5 16 20 17
Solution: The elements are inserted one after another in the given order. The intermediate trees after each insertion are shown in Figure 10.18.
Figure 10.18 Steps to construct a binary search tree
Explanation: The first
element is 14. So, it would be the root. The next element is 15. As it is
larger than the root it is inserted as the right child. Next comes 4; it is
smaller than the root and is inserted as the left child of 14. The next element
is 9. Comparison always starts from the root. So, first it would be compared
with 14. As it is less than the root we move towards the left sub-tree and
find 4. Now 9 is greater than 4, thus 9 will be inserted as the right child of
4. The next element is 7. Again first it would be compared with 14. As it is
less than 14, we move to the left sub-tree and find 4. Now 7 is greater than
4; we move to the right sub-tree of 4 and find 9. As 7 is less than 9, it is
inserted as the left child of 9. The next element is 18. It is larger than 14 and
also larger than 15. Thus, it is inserted as the right child of 15. The next
element is 3. It is smaller than 14 as well as 4. So, it is inserted as the left
child of 4. In this way the rest of the elements are processed and we get the
final tree.
class TreeNode :
    def __init__(self,Newdata=None,lchild=None,rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
10.9.1 Insertion of a New Node in a Binary Search Tree
When we insert a new node in a BST, first we need to check whether the
tree is empty or not. If the tree is empty, the new node will be inserted as
the root node, otherwise we have to find the proper position where the
insertion has to be done. To find the proper position, first the data value of
the new node will be compared with the root. If it is less than the root, the
left sub-tree will be traversed, otherwise the right sub-tree will be traversed.
This process continues moving down until it reaches any leaf node. Finally,
if the data value of the new node is less than this leaf node, the new node
will be inserted as the left child of the leaf node, else the new node will be
inserted as the right child. Remember, the new node always will be inserted
as a leaf node. The time complexity to insert a new node is O(h) where h is
the height of the tree as in the worst case we may have to traverse the
longest path starting from the root. If we want to express it in terms of
number of nodes, i.e. n, it would be O(n) in the worst case (if the tree is a
skew tree) but on average O(log n).
1. Create a new node.
2. Update the data part of the new node with the given value.
3. Update both the reference parts of the left child and the right child with None.
4. If the tree is empty, set the new node as the Root.
5. Else
   a. Set current node = Root and parent node = None.
   b. While current node is not None, do
      i. Set parent node = current node.
      ii. If the data value of the new node is less than the data value of the current node, then set the left child of the current node as the current node.
      iii. Otherwise, set the right child of the current node as the current node.
   c. (Parent node now refers to the leaf node under which the insertion takes place.)
   d. If the data value of the new node is less than the data value of the parent node, then set the new node as the left child of the parent node.
   e. Otherwise, set the new node as the right child of the parent node.
Based on the above algorithm the insertion function may be written as:
def insert(self,newData):
    newNode=TreeNode(newData)
    if self.Root is None:
        self.Root=newNode
    else:
        curNode = self.Root
        parentNode = None
        while curNode is not None:
            parentNode=curNode
            if newData<curNode.data:
                curNode = curNode.left
            else:
                curNode = curNode.right
        if newData<parentNode.data:
            parentNode.left=newNode
        else:
            parentNode.right=newNode
The insertion may also be written as a recursive function:
def insert_rec(curNode,newData):
    if curNode is None:
        return TreeNode(newData)
    elif newData<curNode.data:
        curNode.left=insert_rec(curNode.left, newData)
    else:
        curNode.right=insert_rec(curNode.right, newData)
    return curNode
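As a quick check (not part of the book's listings), the BST of Example 10.3 can be built with insert_rec and verified with an inorder traversal, which must print the keys in ascending order.
root = None
for key in [14, 15, 4, 9, 7, 18, 3, 5, 16, 20, 17]:
    root = insert_rec(root, key)
inorder_rec(root)        # prints: 3 4 5 7 9 14 15 16 17 18 20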
The search operation checks whether a given element is present in the tree or not. Before starting the search operation, first we need to check whether
the tree is empty or not. If it is empty, there is no scope of finding the
element and the process terminates with an appropriate message, otherwise
the search process is started. The search process always starts from the root.
First the data value of the root node is compared with the key value to be
searched. If both values are the same, the process terminates signaling true.
Otherwise, if the key value is less than the data value of the root we need to
move to the left sub-tree, else the right sub-tree. This process continues
until we get the node or reach None. The time complexity of the search
operation is also O(n) in the worst case and O(log n) in the average case.
The general algorithm to search a node in a binary search tree may be defined as follows:
1. Set current node = Root.
2. While current node is not None, do
   a. If the key value to be searched is equal to the data value of the current node, then return the node.
   b. Else, if the key value is less than the data value of the current node, then set the left child of the current node as the current node.
   c. Otherwise, set the right child of the current node as the current node.
3. Return None to indicate that the element is not found.
Based on the above algorithm the search function may be written as:
def search(self, key):
    curNode= self.Root
    while curNode is not None:
        if key == curNode.data :
            return curNode
        elif key < curNode.data:
            curNode = curNode.left
        else:
            curNode = curNode.right
    return curNode
The recursive version of the search function may be defined as:
def search_rec(curNode, key):      # the name follows the _rec convention of the other recursive functions
    if curNode is None or key==curNode.data:
        return curNode
    elif key<curNode.data:
        return search_rec(curNode.left, key)
    else:
        return search_rec(curNode.right, key)
A binary search tree can be traversed in the same ways as a binary tree:
• Preorder traversal
• Inorder traversal
• Postorder traversal
All the algorithms and functions that were discussed for the traversal of a
binary tree are applicable exactly the same way for a BST.
To delete a node from any BST we have to take care of two things. First,
only the required node will be deleted; no other node will be lost. Second,
after deletion the tree that remains possesses the properties of a BST. Thus we need to consider the following three cases:
1. No-child case: This case is very simple. Delete the node and set None to the left pointer of its parent if it is a left child of its parent, otherwise set None to the right pointer. Consider the following example. Suppose we want to delete the node with value 17.
[Figure: deletion of the node with value 17 (no-child case)]
2. One-child case: Delete the node and attach its only child to the parent of the deleted node in place of the deleted node. Consider the following example. Suppose we want to delete the node with value 9.
[Figure: deletion of the node with value 9 (one-child case)]
The node with value 9 is the right child of the node with value 4 and has
only a left sub-tree but no right child, i.e. it is a case of one child. So, after
deletion, the left subtree of the node with value 9 will be the right sub-tree
of the node with value 4.
3. Two-children case: Replace the value of the node to be deleted with the value of its inorder predecessor (i.e. the largest node in its left sub-tree) and then physically delete that predecessor node, which has at most one child. Consider the following example. Suppose we want to delete the root node, with value 14.
Figure 10.21 Deletion of a node with two children
As the root has both a
right sub-tree and left sub-tree, first we find the inorder predecessor of the
root and it is the node with value 7. Now, 14 will be replaced by 7 and the
predecessor node will be deleted physically. From Figure 10.21 we can find
that it is a case of one child and thus the right child reference of its parent
(i.e. the node with value 4) has been updated with the left child of the
predecessor node.
Considering the above cases, now we are able to write the algorithm for the deletion of a node. The general algorithm to delete a node from a BST may be defined as follows:
1. Find the node to be deleted along with its parent.
2. If the node is not found, display an appropriate message and stop.
3. Otherwise,
   a. Set current = the node to be deleted.
4. If both the left child and the right child of current is None (i.e. the node has no child), then
   a. If current is the Root, then
      i. Update Root with None.
   b. Else,
      i. If current is the right child of its parent, update the right child reference of parent with None.
      ii. Else, update the left child reference of parent with None.
   c. Delete current.
5. Else, if either the left child or the right child of current is None (i.e. the node has only one child), then
   a. Set child = the non-None child of current.
   b. If current is the Root, then
      i. Update Root with child.
   c. Else,
      i. If current is the left child of its parent, update the left child reference of parent with child.
      ii. Else, update the right child reference of parent with child.
   d. Delete current.
6. Else (i.e. the node has two children),
   a. Find the largest node in the left sub-tree along with its parent.
   b. Replace the value of the node to be deleted with the value of this largest node.
   c. Delete this largest node following steps 4 and 5.
Based on the above algorithm we may write the following code.
#Function to delete a node from a binary search tree
def delete(self, key):
    parent = None
    curNode= self.Root
    while curNode is not None and key != curNode.data:
        if key < curNode.data:
            parent = curNode
            curNode = curNode.left
        else:
            parent = curNode
            curNode = curNode.right
    if curNode is None:
        print("Node not Found")
        return
    if curNode.left is None and curNode.right is None:     # no-child case
        if parent is not None:
            if parent.right is curNode:
                parent.right = None
            else:
                parent.left = None
        else:
            self.Root = None
        del(curNode)
    elif curNode.left is None or curNode.right is None:    # one-child case
        if curNode.left is None:
            childNode = curNode.right
        else:
            childNode = curNode.left
        if parent is not None:
            if parent.left is curNode:
                parent.left = childNode
            else:
                parent.right= childNode
        else:
            self.Root = childNode
        del(curNode)
    else:                                                   # two-children case
        parentLeft = curNode
        largestLeft = curNode.left
        while largestLeft.right is not None:
            parentLeft = largestLeft
            largestLeft = largestLeft.right
        curNode.data = largestLeft.data
        if parentLeft.right == largestLeft:
            parentLeft.right = largestLeft.left
        else:
            parentLeft.left = largestLeft.left
        del(largestLeft)
The same operation may also be written as a recursive function:
def delete_rec(curNode, key):
    if curNode is None:
        return curNode
    elif key<curNode.data:
        curNode.left=delete_rec(curNode.left, key)
    elif key>curNode.data:
        curNode.right=delete_rec(curNode.right, key)
    else:
        if curNode.left is None:
            return curNode.right
        elif curNode.right is None:
            return curNode.left
        else:
            temp = findLargest_rec(curNode.left)
            curNode.data=temp.data
            curNode.left=delete_rec(curNode.left, temp.data)
    return curNode
To delete a node, first we have to search for the node; for that the running time complexity is O(log n). Next, to delete the node, if the node has 0 or 1 child, the additional work takes constant time, and if the node has two children, we also have to find the largest node in its left sub-tree. But this total searching time (i.e. first finding the node to delete plus finding the largest node in the left sub-tree) cannot be larger than the height of the tree. Thus in all cases, the average time complexity to delete a node from a BST is O(log n).
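For illustration (again not part of the book's listings), the three deletion cases can be exercised on the tree of Example 10.3 using delete_rec; note that delete_rec uses findLargest_rec, which is defined in the next sub-section.
root = None
for key in [14, 15, 4, 9, 7, 18, 3, 5, 16, 20, 17]:
    root = insert_rec(root, key)
root = delete_rec(root, 17)    # no-child case
root = delete_rec(root, 9)     # one-child case
root = delete_rec(root, 14)    # two-children case: 14 is replaced by 7
inorder_rec(root)              # prints: 3 4 5 7 15 16 18 20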
10.9.5 Finding the Largest Node from a Binary Search Tree
In a BST the value in the right sub-tree is larger than its root. Thus the
largest element in a binary search tree would be the right-most element in
the tree. So, to find the largest element we need not compare elements but
have to move to the right sub-tree always until the right sub-tree of any
node is None. Thus the time complexity to find the largest node is also
O(log n).
The general algorithm to find the largest node in a BST may be defined as follows:
1. Set current node = Root.
2. While the right child reference of the current node is not None, do
   a. Set the right child of the current node as the current node.
3. Return the current node.
def findLargestNode(self):
    largestNode = self.Root
    while largestNode.right is not None:
        largestNode = largestNode.right
    return largestNode
The recursive function to find the largest node in a BST may be defined as:
def findLargest_rec(curNode):
    if curNode.right is None:
        return curNode
    else:
        return findLargest_rec(curNode.right)
10.9.6 Finding the Smallest Node from a Binary Search Tree
Similarly,
as in a BST the value in the left sub-tree is smaller than its root, the smallest
element in a BST would be the left-most element in the tree. Thus, to find
the smallest element we need to move to the left sub-tree always until the
left sub-tree of any node is None. Here also the time complexity would be
O(log n).
def findSmallestNode(self):
    smallestNode = self.Root
    while smallestNode.left is not None:
        smallestNode = smallestNode.left
    return smallestNode
The recursive function to find the smallest node in a BST may be defined
as:
def findSmallest_rec(curNode):
    if curNode.left is None:
        return curNode
    else:
        return findSmallest_rec(curNode.left)
To count the total number of nodes of a BST, we recursively count the nodes of the left sub-tree and of the right sub-tree and add 1 (for the root). The following recursive functions count the total, external (leaf), and internal nodes of a BST:
#Recursive function to count the total number of nodes of a BST
def countNode(curNode):
    if curNode is None:
        return 0
    else:
        return countNode(curNode.left)+countNode(curNode.right)+1
#Recursive function to count the external (leaf) nodes of a BST
def countExternal(curNode):
    if curNode is None:
        return 0
    elif curNode.left is None and curNode.right is None:
        return 1
    else:
        return countExternal(curNode.left)+countExternal(curNode.right)
#Recursive function to count the internal (non-leaf) nodes of a BST
def countInternal(curNode):
    if curNode is None or (curNode.left is None and curNode.right is None):
        return 0
    else:
        return countInternal(curNode.left)+countInternal(curNode.right)+1
To find the height of a BST, we have to first find the height of the left sub-
tree and the right sub-tree and have to consider the greater one. With this
greater value, add 1 (for the root) to get the final height of the tree.
The recursive function to find the height of a binary tree may be defined as:
def findHeight(curNode):
    if curNode is None:
        return 0
    else:
        heightLeft =findHeight(curNode.left)
        heightRight=findHeight(curNode.right)
        if heightLeft>heightRight:
            return heightLeft+1
        else:
            return heightRight+1
The mirror image of a BST can be obtained by interchanging the left child
and right child references of each and every node. The following recursive
function shows how the mirror image of a BST can be obtained:
def findMirrorImage(curNode):
    if curNode is not None:
        findMirrorImage(curNode.left)
        findMirrorImage(curNode.right)
        curNode.left, curNode.right = curNode.right, curNode.left
#Program: Implementation of a binary search tree (non-recursive version)
class Stack:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop()
    def peek(self):
        if self.isEmpty():
            return -1
        return self.items[len(self.items)-1]
class TreeNode :
    def __init__(self,Newdata=None,lchild=None,rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
class BST:
    def __init__(self):
        self.Root = None
    def insert(self,newData):
        newNode=TreeNode(newData)
        if self.Root is None:
            self.Root=newNode
        else:
            curNode = self.Root
            parentNode = None
            while curNode is not None:
                parentNode=curNode
                if newData<curNode.data:
                    curNode = curNode.left
                else:
                    curNode = curNode.right
            if newData<parentNode.data:
                parentNode.left=newNode
            else:
                parentNode.right=newNode
    def preorder(self):
        st=Stack()
        st.push(None)
        curNode=self.Root
        while curNode is not None:
            print(curNode.data, end=" ")
            if curNode.right:
                st.push(curNode.right)
            if curNode.left:
                curNode=curNode.left
            else:
                curNode=st.pop()
    def inorder(self):
        st=Stack()
        st.push(None)
        curNode=self.Root
        while curNode is not None:
            while curNode is not None:
                st.push(curNode)
                curNode=curNode.left
            curNode=st.pop()
            flag = True
            while flag and curNode is not None:
                print(curNode.data, end=" ")
                if curNode.right:
                    curNode=curNode.right
                    flag=False
                else:
                    curNode=st.pop()
    def postorder(self):
        st=Stack()
        curNode=self.Root
        if curNode is None:
            return
        while True:
            while curNode is not None:
                if curNode.right:
                    st.push(curNode.right)
                st.push(curNode)
                curNode=curNode.left
            curNode=st.pop()
            if curNode.right and st.peek() is curNode.right:
                st.pop()
                st.push(curNode)
                curNode=curNode.right
            else:
                print(curNode.data, end=" ")
                curNode=None
                if st.isEmpty():
                    break
    def delete(self,key):
        parent = None
        curNode= self.Root
        while curNode is not None and key != curNode.data:
            if key < curNode.data:
                parent = curNode
                curNode = curNode.left
            else:
                parent = curNode
                curNode = curNode.right
        if curNode is None:
            print("Node not Found")
            return
        if curNode.left is None and curNode.right is None:
            if parent is not None:
                if parent.right is curNode:
                    parent.right = None
                else:
                    parent.left = None
            else:
                self.Root = None
            del(curNode)
        elif curNode.left is None or curNode.right is None:
            if curNode.left is None:
                childNode = curNode.right
            else:
                childNode = curNode.left
            if parent is not None:
                if parent.left is curNode:
                    parent.left = childNode
                else:
                    parent.right= childNode
            else:
                self.Root = childNode
            del(curNode)
        else:
            parentLeft = curNode
            largestLeft = curNode.left
            while largestLeft.right is not None:
                parentLeft = largestLeft
                largestLeft = largestLeft.right
            curNode.data = largestLeft.data
            if parentLeft.right == largestLeft:
                parentLeft.right = largestLeft.left
            else:
                parentLeft.left = largestLeft.left
            del(largestLeft)
    def search(self, key):
        curNode= self.Root
        while curNode is not None:
            if key == curNode.data :
                return curNode
            elif key < curNode.data:
                curNode = curNode.left
            else:
                curNode = curNode.right
        return curNode
    def findLargestNode(self):
        largestNode = self.Root
        while largestNode.right is not None:
            largestNode = largestNode.right
        return largestNode
    def findSmallestNode(self):
        smallestNode = self.Root
        while smallestNode.left is not None:
            smallestNode = smallestNode.left
        return smallestNode
bst=BST()
while True:
    print("=======================================")
    print("1.Insert Node")
    print("2.Preorder Traversal")
    print("3.Inorder Traversal")
    print("4.Postorder Traversal")
    print("5.Delete a Node")
    print("6.Search an element")
    print("7.Find the Largest element")
    print("8.Find the Smallest element")
    print("9.Exit")
    print("=======================================")
    choice=int(input("Enter your choice : "))
    if choice==1 :
        num=int(input("Enter the element to insert : "))
        bst.insert(num)
    elif choice==2 :
        print("Preorder : ", end = ' ')
        bst.preorder()
    elif choice==3 :
        print("Inorder : ", end = ' ')
        bst.inorder()
    elif choice==4 :
        print("Postorder : ", end = ' ')
        bst.postorder()
    elif choice==5 :
        num=int(input("Enter the element to delete : "))
        bst.delete(num)
    elif choice==6 :
        num=int(input("Enter the element to search : "))
        findNode=bst.search(num)
        if findNode is None:
            print("Node not found")
        else:
            print("Node found")
    elif choice==7 :
        if bst.Root is None:
            print("Null Tree")
        else:
            max=bst.findLargestNode()
            print("Largest element :", max.data)
    elif choice==8 :
        if bst.Root is None:
            print("Null Tree")
        else:
            min=bst.findSmallestNode()
            print("Smallest element :", min.data)
    elif choice==9 :
        print("\nQuiting.......")
        break
    else:
        print("Invalid Choice")
        continue
#Program: Implementation of a binary search tree using recursive functions
class TreeNode :
    def __init__(self,Newdata=None,lchild=None,rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
def insert_rec(curNode,newData):
    if curNode is None:
        return TreeNode(newData)
    elif newData<curNode.data:
        curNode.left=insert_rec(curNode.left, newData)
    else:
        curNode.right=insert_rec(curNode.right, newData)
    return curNode
def preorder_rec(curNode):
    if curNode is not None:
        print(curNode.data,end=" ")
        preorder_rec(curNode.left)
        preorder_rec(curNode.right)
def inorder_rec(curNode):
    if curNode is not None:
        inorder_rec(curNode.left)
        print(curNode.data,end=" ")
        inorder_rec(curNode.right)
def postorder_rec(curNode):
    if curNode is not None:
        postorder_rec(curNode.left)
        postorder_rec(curNode.right)
        print(curNode.data,end=" ")
def delete_rec(curNode, key):
    if curNode is None:
        return curNode
    elif key<curNode.data:
        curNode.left=delete_rec(curNode.left, key)
    elif key>curNode.data:
        curNode.right=delete_rec(curNode.right, key)
    else:
        if curNode.left is None:
            return curNode.right
        elif curNode.right is None:
            return curNode.left
        else:
            temp = findLargest_rec(curNode.left)
            curNode.data=temp.data
            curNode.left=delete_rec(curNode.left, temp.data)
    return curNode
def search_rec(curNode, key):     # name follows the _rec convention of the other functions
    if curNode is None or key==curNode.data:
        return curNode
    elif key<curNode.data:
        return search_rec(curNode.left, key)
    else:
        return search_rec(curNode.right, key)
def findLargest_rec(curNode):
    if curNode.right is None:
        return curNode
    else:
        return findLargest_rec(curNode.right)
def findSmallest_rec(curNode):
    if curNode.left is None:
        return curNode
    else:
        return findSmallest_rec(curNode.left)
def countNode(curNode):
    if curNode is None:
        return 0
    else:
        return countNode(curNode.left)+countNode(curNode.right)+1
def countExternal(curNode):
    if curNode is None:
        return 0
    elif curNode.left is None and curNode.right is None:
        return 1
    else:
        return countExternal(curNode.left)+countExternal(curNode.right)
def countInternal(curNode):
    if curNode is None or (curNode.left is None and curNode.right is None):
        return 0
    else:
        return countInternal(curNode.left)+countInternal(curNode.right)+1
def findHeight(curNode):
    if curNode is None:
        return 0
    else:
        heightLeft =findHeight(curNode.left)
        heightRight=findHeight(curNode.right)
        if heightLeft>heightRight:
            return heightLeft+1
        else:
            return heightRight+1
def findMirrorImage(curNode):
    if curNode is not None:
        findMirrorImage(curNode.left)
        findMirrorImage(curNode.right)
        curNode.left, curNode.right = curNode.right, curNode.left
root=None
while True:
    print("=======================================")
    print(" 1.Insert Node")
    print(" 2.Preorder Traversal")
    print(" 3.Inorder Traversal")
    print(" 4.Postorder Traversal")
    print(" 5.Delete a Node")
    print(" 6.Search an element")
    print(" 7.Find the Largest element")
    print(" 8.Find the Smallest element")
    print(" 9.Count total number of nodes")
    print("10.Count external nodes")
    print("11.Count internal nodes")
    print("12.Find the height of the tree")
    print("13.Find the mirror image")
    print("14.Exit")
    print("=======================================")
    choice=int(input("Enter your choice : "))
    if choice==1 :
        num=int(input("Enter the element to insert : "))
        root=insert_rec(root,num)
    elif choice==2 :
        print("Preorder : ", end = ' ')
        preorder_rec(root)
    elif choice==3 :
        print("Inorder : ", end = ' ')
        inorder_rec(root)
    elif choice==4 :
        print("Postorder : ", end = ' ')
        postorder_rec(root)
    elif choice==5 :
        num=int(input("Enter the element to delete : "))
        root=delete_rec(root, num)
    elif choice==6 :
        num=int(input("Enter the element to search : "))
        findNode=search_rec(root, num)
        if findNode is None:
            print("Node not found")
        else:
            print("Node found")
    elif choice==7 :
        if root is None:
            print("Null Tree")
        else:
            max=findLargest_rec(root)
            print("Largest element :", max.data)
    elif choice==8 :
        if root is None:
            print("Null Tree")
        else:
            min=findSmallest_rec(root)
            print("Smallest element :", min.data)
    elif choice==9 :
        c=countNode(root)
        print("Total number of nodes :", c)
    elif choice==10:
        c=countExternal(root)
        print("Total number of external nodes :", c)
    elif choice==11:
        c=countInternal(root)
        print("Total number of internal nodes :", c)
    elif choice==12:
        h=findHeight(root)
        print("Height of the tree :", h)
    elif choice==13:
        findMirrorImage(root)
        print("Mirror image created")
    elif choice==14:
        print("\nQuiting.......")
        break
    else:
        print("Invalid Choice")
        continue
#Program: Level order traversal of a binary tree
class Queue:
    def __init__(self):
        self.items = []
    def isEmpty(self):
        return self.items == []
    def enqueue(self, item):
        self.items.append(item)
    def dequeue(self):
        return self.items.pop(0)
    def peek(self):
        return self.items[0]
    def size(self):
        return len(self.items)
class TreeNode :
    def __init__(self,Newdata=None,lchild=None,rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
def insert_rec(curNode,newData):
    if curNode is None:
        return TreeNode(newData)
    elif newData<curNode.data:
        curNode.left=insert_rec(curNode.left, newData)
    else:
        curNode.right=insert_rec(curNode.right, newData)
    return curNode
def levelorder(rootNode):
    q=Queue()
    q.enqueue(rootNode)
    while not q.isEmpty():
        curNode= q.dequeue()
        if curNode is not None:
            print(curNode.data, end=" ")
            q.enqueue(curNode.left)
            q.enqueue(curNode.right)
root=None
while True:
    print("=================================================================")
    print(" 1.Insert Node")
    print(" 2.Level Order Traversal")
    print(" 3.Exit")
    print("=================================================================")
    choice=int(input("Enter your choice : "))
    if choice==1 :
        num=int(input("Enter the element to insert : "))
        root=insert_rec(root,num)
    elif choice==2 :
        print("Level Order : ", end = ' ')
        levelorder(root)
    elif choice==3 :
        print("\nQuiting.......")
        break
    else:
        print("Invalid Choice")
        continue
10.10 Threaded Binary Tree
In a threaded binary tree, the None child references of the nodes are utilised to store threads, i.e. references to the predecessor or the successor of the node in a particular traversal order, so that the tree can be traversed without recursion or an explicit stack. A threaded binary tree may be one-way (a single thread per node) or two-way (threads in both directions), and the threads may follow the inorder or the preorder sequence.
[Figures: linked representations of threaded binary trees; Figure 10.26 shows a two-way preorder threaded binary tree]
class ThreadedNode :
    def __init__(self,Newdata=None,lchild=None,rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
        self.thread = False
The above class represents a node of a one-way threaded binary tree. For
two-way threaded binary tree, two thread attributes are required – one for
the left thread and another for the right thread. Among all threaded binary
trees the most popular is the right threaded inorder binary tree. Hence, here
we are showing all the operations of a threaded binary tree on a right
threaded inorder binary tree. Based on the above class design the schematic
diagram of a right threaded inorder binary tree is shown in Figure 10.28.
Here, T in the thread field denotes True and F denotes False.
The inorder traversal of a right threaded inorder binary tree can be performed without a stack. The general algorithm may be defined as follows:
1. Set curNode = Root.
2. While curNode is not None, do
   a. Move to the left-most node of the current sub-tree, keeping track of its parent in parentNode.
   b. Print the data of this left-most node.
   c. Set parentNode = curNode and move curNode to its right reference.
   d. While the thread part of parentNode is True and curNode is not None, print the data of curNode, set parentNode = curNode, and again move curNode to its right reference.
The corresponding function is:
def inorder(self):
    curNode=self.Root
    while curNode is not None:
        while curNode is not None:          # reach the left-most node
            parentNode=curNode
            curNode=curNode.left
        curNode=parentNode
        print(curNode.data, end=" ")
        parentNode=curNode
        curNode=curNode.right
        while parentNode.thread is True and curNode is not None:
            print(curNode.data, end=" ")
            parentNode=curNode
            curNode=curNode.right
1. While finding the position of the new node, if the data value of the new
node is greater than the current, then we will move to the right child only if
the thread part of current node contains false.
2. If the new node is inserted as the left child of a leaf node, its right child
part will be the thread. Thus the thread part of the new node becomes True
and the right child part will contain the reference of the parent node.
3. If the new node is inserted as the right child of a leaf node, then also its
right child part will be treated as the thread. Thus the thread part of the new
node will be True and for the right child part there may be two cases:
a. If parentNode contains a thread, then the right child part of the new node
contains the content of the right child part of the parent node b. Otherwise,
the right child part of the new node contains None The general algorithm to
insert a new node in a right threaded BST may be defined as follows:
1. Create a new node.
2. Update the data part of the new node with the given value.
3. Update both the reference parts of the left child and the right child with None and set the thread part as False.
4. If the tree is empty, set the new node as the Root.
5. Else
   a. Set current node = Root and parent node = None.
   b. While current node is not None, do
      i. Set parent node = current node.
      ii. If the data value of the new node is less than the data value of the current node, set the left child of the current node as the current node.
      iii. Otherwise,
         • If the thread part of the current node is True, set current node = None.
         • Otherwise, set the right child of the current node as the current node.
   c. (Parent node now refers to the node under which the insertion takes place.)
   d. If the data value of the new node is less than the data value of the parent node, then
      i. Set the new node as the left child of the parent node.
      ii. The right child part of the new node contains the reference of the parent node.
      iii. The thread part of the new node becomes True.
   e. Otherwise,
      i. If the parent node contains a thread, then
         • The thread part of the parent node becomes False and the thread part of the new node becomes True.
         • The right child part of the new node contains the content of the right child part of the parent node.
         • The right child part of the parent node contains the reference of the new node.
      ii. Otherwise,
         • The right child part of the new node contains None and its thread part remains False.
         • The right child part of the parent node contains the reference of the new node.
Using the above algorithm we can write the following code:
#Function to insert a new node in a right threaded inorder binary Search Tree
def insert(self,newData):
    newNode=ThreadedNode(newData)
    if self.Root is None:
        self.Root=newNode
    else:
        curNode = self.Root
        parentNode = None
        while curNode is not None:
            parentNode=curNode
            if newData<curNode.data:
                curNode = curNode.left
            else:
                if curNode.thread is True:
                    curNode=None
                else:
                    curNode = curNode.right
        if newData<parentNode.data:
            parentNode.left=newNode
            newNode.right=parentNode
            newNode.thread=True
        else:
            if parentNode.thread is True:
                parentNode.thread=False
                newNode.thread=True
                newNode.right=parentNode.right
                parentNode.right=newNode
            else:
                newNode.thread=False
                newNode.right =None
                parentNode.right=newNode
Deletion of nodes from a threaded binary search tree is a little bit tricky. Here, whether a node has a right child or not has to be decided from the value of the thread, not from the value of the right child reference. For all cases of deletion it needs
to be kept in mind that the corresponding thread should not be lost and the
value of the thread is properly maintained. Here is the function to delete a
node from a right inorder threaded binary search tree.
#Function to delete a node from a right threaded inorder binary search tree
def delete(self, key):
    parent = None
    curNode= self.Root
    # locate the node to be deleted along with its parent
    while curNode is not None and key != curNode.data:
        if key < curNode.data:
            parent = curNode
            curNode = curNode.left
        else:
            parent = curNode
            if curNode.thread is True:
                curNode=None
            else:
                curNode = curNode.right
    if curNode is None:
        print("Node not Found")
        return
    if curNode.left is None and (curNode.thread is True or curNode.right is None):
        # no-child case
        if parent is not None:
            if parent.right is curNode:
                # the parent becomes the right-most node of its sub-tree and
                # inherits the thread of the deleted node
                parent.right = curNode.right
                parent.thread= True
            else:
                parent.left = None
        else:
            self.Root = None
        del(curNode)
    elif curNode.left is None or curNode.thread is True or curNode.right is None:
        # one-child case
        if curNode.left is None:
            childNode = curNode.right
        else:
            childNode = curNode.left
        # if the only child carries a thread to the deleted node, redirect the
        # thread to the successor of the deleted node
        if curNode.left is childNode and childNode.thread is True:
            childNode.right = curNode.right
        if parent is not None:
            if parent.left is curNode:
                parent.left = childNode
            else:
                parent.right= childNode
        else:
            self.Root = childNode
        del(curNode)
    else:
        # two-children case: replace with the largest node of the left sub-tree
        parentLeft = curNode
        largestLeft = curNode.left
        while largestLeft.thread is False and largestLeft.right is not None:
            parentLeft = largestLeft
            largestLeft = largestLeft.right
        curNode.data = largestLeft.data
        if parentLeft.right is largestLeft:
            if largestLeft.left is None:
                parentLeft.thread=True
                parentLeft.right = largestLeft.right
            else:
                parentLeft.right = largestLeft.left
                if largestLeft.left.thread:
                    largestLeft.left.right = largestLeft.right
        else:
            if largestLeft.left is not None and largestLeft.left.thread:
                largestLeft.left.right = largestLeft.right
            parentLeft.left = largestLeft.left
        del(largestLeft)
#Program: Implementation of a right threaded inorder binary Search Tree
class ThreadedNode :       # Declaration of a node of a threaded binary tree
    def __init__(self,Newdata=None,lchild=None,rchild=None):
        self.left = lchild
        self.data = Newdata
        self.right = rchild
        self.thread = False
class TBST:
    def __init__(self):
        self.Root = None
    def insert(self,newData):
        newNode=ThreadedNode(newData)
        if self.Root is None:
            self.Root=newNode
        else:
            curNode = self.Root
            parentNode = None
            while curNode is not None:
                parentNode=curNode
                if newData<curNode.data:
                    curNode = curNode.left
                else:
                    if curNode.thread is True:
                        curNode=None
                    else:
                        curNode = curNode.right
            if newData<parentNode.data:
                parentNode.left=newNode
                newNode.right=parentNode
                newNode.thread=True
            else:
                if parentNode.thread is True:
                    parentNode.thread=False
                    newNode.thread=True
                    newNode.right=parentNode.right
                    parentNode.right=newNode
                else:
                    newNode.thread=False
                    newNode.right =None
                    parentNode.right=newNode
    def inorder(self):
        curNode=self.Root
        while curNode is not None:
            while curNode is not None:          # reach the left-most node
                parentNode=curNode
                curNode=curNode.left
            curNode=parentNode
            print(curNode.data, end=" ")
            parentNode=curNode
            curNode=curNode.right
            while parentNode.thread is True and curNode is not None:
                print(curNode.data, end=" ")
                parentNode=curNode
                curNode=curNode.right
    def delete(self, key):
        parent = None
        curNode= self.Root
        while curNode is not None and key != curNode.data:
            if key < curNode.data:
                parent = curNode
                curNode = curNode.left
            else:
                parent = curNode
                if curNode.thread is True:
                    curNode=None
                else:
                    curNode = curNode.right
        if curNode is None:
            print("Node not Found")
            return
        if curNode.left is None and (curNode.thread is True or curNode.right is None):
            # no-child case
            if parent is not None:
                if parent.right is curNode:
                    parent.right = curNode.right
                    parent.thread= True
                else:
                    parent.left = None
            else:
                self.Root = None
            del(curNode)
        elif curNode.left is None or curNode.thread is True or curNode.right is None:
            # one-child case
            if curNode.left is None:
                childNode = curNode.right
            else:
                childNode = curNode.left
            if curNode.left is childNode and childNode.thread is True:
                childNode.right = curNode.right
            if parent is not None:
                if parent.left is curNode:
                    parent.left = childNode
                else:
                    parent.right= childNode
            else:
                self.Root = childNode
            del(curNode)
        else:
            # two-children case
            parentLeft = curNode
            largestLeft = curNode.left
            while largestLeft.thread is False and largestLeft.right is not None:
                parentLeft = largestLeft
                largestLeft = largestLeft.right
            curNode.data = largestLeft.data
            if parentLeft.right is largestLeft:
                if largestLeft.left is None:
                    parentLeft.thread=True
                    parentLeft.right = largestLeft.right
                else:
                    parentLeft.right = largestLeft.left
                    if largestLeft.left.thread:
                        largestLeft.left.right = largestLeft.right
            else:
                if largestLeft.left is not None and largestLeft.left.thread:
                    largestLeft.left.right = largestLeft.right
                parentLeft.left = largestLeft.left
            del(largestLeft)
tbst=TBST()
while True:
    print("================================================")
    print("        THREADED BINARY SEARCH TREE")
    print("================================================")
    print("1.Insert Node")
    print("2.Inorder Traversal")
    print("3.Delete a Node")
    print("4.Exit")
    print("================================================")
    choice=int(input("Enter your choice : "))
    if choice==1 :
        num=int(input("Enter the element to insert : "))
        tbst.insert(num)
    elif choice==2 :
        print("Inorder : ", end = ' ')
        tbst.inorder()
    elif choice==3 :
        num=int(input("Enter the element to delete : "))
        tbst.delete(num)
    elif choice==4 :
        print("\nQuiting.......")
        break
    else:
        print("Invalid Choice")
        continue
The AVL tree is a height-balanced binary search tree, named after its inventors – Adelson-Velsky and Landis. The advantage of the AVL tree is that the worst case time complexity to search an element is O(log n). This is also true for the insertion and deletion operations.
A balance factor is associated with every node of an AVL tree: it is the height of the left sub-tree minus the height of the right sub-tree of the node, and for every node it must satisfy |HL - HR| <= 1. If any node possesses any other value as balance factor, the node requires some rotation to restore the balance.
[Figure: an AVL tree with the balance factor written beside each node]
In the above tree, consider the root node, 50. The height of its left sub-tree
is 2 and that of right sub-tree is 3. Thus, the balance factor is 2-3=-1. For
the node 30, the height of the left sub-tree is 0 as it does not have any left
child, and the height of right sub-tree is 1 as
it contains only a single
node. Hence, the balance factor is 0-1=-1. The Balance factor of node 40 is
0 because it is a leaf node. Similarly the balance factor of all other nodes
have been calculated. Since the balance factor of all the nodes of the tree is
either 0, 1, or -1 and the tree possesses the rules of a binary search tree, the
above tree can be treated as an AVL
tree.
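The balance factors can also be computed programmatically with the findHeight function of the previous section. The following small checker is an illustrative sketch (check_avl is a hypothetical helper, not the book's code).
# Print the balance factor of every node and report whether the AVL property holds
def check_avl(curNode):
    if curNode is None:
        return True
    balance = findHeight(curNode.left) - findHeight(curNode.right)
    print(curNode.data, ": balance factor =", balance)
    if balance < -1 or balance > 1:
        return False
    return check_avl(curNode.left) and check_avl(curNode.right)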
All operations on binary search tree are similarly applicable on an AVL tree
except insertion and deletion operations. Hence, we discuss only these two
operations in the following sections.
10.11.1.1 Inserting a Node in an AVL Tree
To insert a node in an AVL tree, first we have to follow the same operation
as we have done in a binary search tree. After inserting the node, the
balance factor of all nodes of the tree has to be calculated to check whether
any node or nodes possess the value beyond the permissible value. If so,
rotation is required to rebalance the tree. Sometimes it may happen that
with a single insertion of a node more than one node’s balance factor
crosses the restricted limit. In those cases, we have to consider the node at
the highest level. For example, if the nodes at level 3 and level 4 both have
the balance factor more than 1 or less than -1, we have to consider the node
at level 4 for rotation. There may be four cases: LL, RR, LR, and RL. We
shall discuss all these cases here.
LL: When a new node is inserted in the left sub-tree of the left sub-tree of
the node in which balance is disturbed.
RR: When a new node is inserted in the right sub-tree of the right sub-tree
of the node in which balance is disturbed.
LR: When a new node is inserted in the right sub-tree of the left sub-tree of
the node in which balance is disturbed.
RL: When a new node is inserted in the left sub-tree of the right sub-tree of
the node in which balance is disturbed.
The rotation for LL and RR cases are similar in nature but the rotations are
in opposite directions. In both cases, a single rotation is required to
rebalance the tree and the rotation needs to apply on the node in which
balance is disturbed. For the LL case, rotation will be rightwards and for the
RR case, rotation will be leftwards.
The rotations for the LR and RL cases are a little bit complicated and are similar in nature. In both cases a double rotation is required: first a rotation is applied on the child of the node whose balance factor crosses the limit, and after that on the node whose balance factor itself crosses the limit. The following examples clarify the cases in detail:
Case 1: Insert nodes in the following order: 30, 20, 10.
[Figure 10.30: step-by-step insertion of 30, 20, 10 and the LL rotation on node 30]
In the above tree, first node 30 is inserted. As it is the first node, it does not
have any child and the balance factor is calculated as 0. Next 20 is inserted.
Following the rules of BST it is inserted as a left child. The balance factor
of the node 20 and 30 are 0 and 1 respectively.
Up to this, the balance factors of the nodes are within limit. But after the
insertion of node 10, the balance factor of node 30 becomes 2. Since the
new node 10 is inserted in the left sub-tree of the left sub-tree of the node
30, i.e. in which balance is disturbed, we have to follow the rotation rule for
the LL case and a right rotation has to be given on node 30 and we get the
balanced tree as shown in Figure 10.30.
Case 2: Insert nodes in the following order: 10, 20, 30.
[Figure 10.31: step-by-step insertion of 10, 20, 30 and the RR rotation on node 10]
In the above tree, after the insertion of nodes 10 and 20, the balance factors
of the nodes are within range. After the insertion of node 30, the balance
factor of node 10 becomes -2.
Since the new node 30 is inserted in the right sub-tree of the right sub-tree
of the node 10, i.e. in which balance is disturbed, we have to follow the
rotation rule for the RR case and a left rotation has to be given on node 10
and we get the balanced tree as shown in Figure 10.31.
Case 3: Insert nodes in the following order: 30, 10, 20.
[Figure 10.32: step-by-step insertion of 30, 10, 20 and the LR rotation (left rotation on node 10 followed by right rotation on node 30)]
Here also, after the insertion of nodes 30 and 10, the balance factors of the
nodes are within range. After the insertion of node 20, the balance factor of
node 30 becomes 2. Since the new node 20 is inserted in the right sub-tree
of the left sub-tree of the node 30, i.e. in which node balance is disturbed,
we have to follow the rotation rule for the LR case. So, first a left rotation
has to be given on the child node of node 30, i.e. left rotation on node 10,
and after that a right rotation on node 30. Finally, we get the balanced tree
as shown in Figure 10.32.
Case 4: Insert nodes in the following order: 10, 30, 20.
[Figure 10.33: step-by-step insertion of 10, 30, 20 and the RL rotation (right rotation on node 30 followed by left rotation on node 10)]
In the above tree, after the insertion of node 20, the balance factor of node
10 becomes -2.
Since the new node 20 is inserted in the left sub-tree of the right sub-tree of
the node 10, i.e. in which balance is disturbed, we have to follow the
rotation rule for the RL case. Thus
Trees 453
first a right rotation will be given on the child node of node 10, i.e. on node
30. Next a left rotation is given on node 10 and the tree becomes balanced
as shown in Figure 10.33.
[Figure: insertion of 20 into an AVL tree in which the balance factors of both node 40 and node 50 exceed the limit; an LL rotation is applied on node 40]
Figure 10.34 Special case when balance factors exceed limit in more than
one node
In the above tree, after the insertion of node 20, the balance factor of node
40 as well as of node 50 becomes 2. As 40 is at a higher level than 50, we
have to give the rotation on node 40 and the new node 20 is inserted in the
left sub-tree of the left sub-tree of node 40; thus we have to follow the
rotation rule for the LL case and a right rotation has to be given on node 40
as shown in Figure 10.34.
[Figure 10.35: insertion of 10 into a larger AVL tree; a right (LL) rotation is applied on node 50 and the right sub-tree of node 40 becomes the left sub-tree of node 50]
In the above example, it
is very clear that after the insertion of node 10, the balance factor of node
50 becomes 2 and it is a case of LL. So, we have to give a right rotation on
node 50.
Now to complete the rotation process, node 40 moves to the root along with
its left subtree. But what happens to its right sub-tree (because now node 50
would be the right child of node 40)? In this type of situation, the right sub-
tree of node 40 would be the left sub-tree of node 50 after rotation. This is
shown in Figure 10.35. This type of situation may arise in case of a left
rotation as well. Then after rotation the left sub-tree of the corresponding
node would be the right sub-tree of its ancestor node.
Example 10.4: Construct an AVL tree by inserting the following elements in the given order: 50, 60, 70, 20, 10, 30, 22, 35, 25, 45.
Solution:
[Figure: step-by-step construction of the AVL tree; an RR rotation is required after inserting 70, an LL rotation after inserting 10, an LR rotation after inserting 30, an RL rotation after inserting 25, and an LR rotation after inserting 45]
To delete a node from an AVL tree, first we have to follow the basic rules of
deleting a node from a binary search tree. Then we will check the balance
factors of all nodes. If the tree remains well balanced, then it is okay.
Otherwise, we have to provide some rotation operation to rebalance the
tree. First we will consider the node where balance is disturbed.
If it is found that the balance factor of more than one node exceeds the
limit, we will have to consider the node with the highest level. Next we will
calculate the height of its children and grandchildren. Based on these, there
is also the chance of four possible cases of rotation.
LL: If height of left child is greater than its right child and height of left
grandchild is greater than right grandchild (here grandchildren are the
children of left child).
RR: If height of right child is greater than its left child and height of right
grandchild is greater than left grandchild (here grandchildren are the
children of right child).
LR: If height of left child is greater than its right child and height of right
grandchild is greater than left grandchild (here grandchildren are the
children of left child).
RL: If height of right child is greater than its left child and height of left
grandchild is greater than right grandchild (here grandchildren are the
children of right child).
Based on these cases, we have to apply same rotation strategies as was done
for insertion.
Remember that, unlike insertion, fixing the problem of one node may not
completely balance an AVL tree. So, we need to check the balance factors
of all nodes once again and, if problem persists, we have to follow the
rotation strategies as discussed above.
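The four rotation cases described above can be expressed compactly in code. The following is only an illustrative sketch and not the book's implementation: it assumes a node class (AVLNode) that stores the height of its sub-tree, and it rebalances on the way back up from a recursive insertion.
class AVLNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.height = 1                       # height of the sub-tree rooted here
def height(node):
    return node.height if node else 0
def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))
def balance_factor(node):
    return height(node.left) - height(node.right)
def rotate_right(y):                          # used for the LL case
    x = y.left
    y.left = x.right                          # x's right sub-tree becomes y's left sub-tree
    x.right = y
    update_height(y)
    update_height(x)
    return x
def rotate_left(x):                           # used for the RR case
    y = x.right
    x.right = y.left                          # y's left sub-tree becomes x's right sub-tree
    y.left = x
    update_height(x)
    update_height(y)
    return y
def avl_insert(node, key):
    if node is None:
        return AVLNode(key)
    if key < node.data:
        node.left = avl_insert(node.left, key)
    else:
        node.right = avl_insert(node.right, key)
    update_height(node)
    bf = balance_factor(node)
    if bf > 1 and key < node.left.data:       # LL case
        return rotate_right(node)
    if bf < -1 and key >= node.right.data:    # RR case
        return rotate_left(node)
    if bf > 1 and key >= node.left.data:      # LR case
        node.left = rotate_left(node.left)
        return rotate_right(node)
    if bf < -1 and key < node.right.data:     # RL case
        node.right = rotate_right(node.right)
        return rotate_left(node)
    return node
root = None
for key in [30, 20, 10]:           # the LL case of Figure 10.30
    root = avl_insert(root, key)   # after the rotation, 20 becomes the root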
Example 10.5: Delete the node 25 from the following AVL tree.
[Figure: the AVL tree from which node 25 is to be deleted]
Solution:
[Figure: deletion of node 25; node 22 becomes unbalanced and a right (LL) rotation is applied on it]
As 25 is in a leaf node, deletion is simple. The node will be deleted and the
right reference of 22 becomes None. But deletion of 25 makes node 22
unbalanced. We need to give a rotation. As it is a case of LL, a right rotation
has to be given on 22 and we get the final balanced AVL tree.
A red–black tree is a self-balancing binary search tree in which every node is coloured either red or black and the following properties are maintained:
• The root node is always black.
• The children of every red node are black, i.e. no two adjacent red nodes are possible.
• All the paths from a particular node to its leaf nodes contain an equal number of black nodes.
To maintain the color of the node an extra bit is used which contains 0 or 1
to indicate the color red or black. In the red–black tree, all the leaf nodes are
considered as external nodes and these nodes do not contain any data. These
are basically left and right references and contain None. So, in our diagram
we are not showing these. The following are some examples of red–black
trees.
[Figure: some examples of red–black trees]
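Only one structural addition over an ordinary BST node is needed: the colour bit mentioned above. A minimal node declaration (an assumption, not the book's code) could look like this.
RED, BLACK = 0, 1            # the extra bit: 0 denotes red, 1 denotes black
class RBNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.parent = None
        self.color = RED     # a newly inserted node is initially coloured red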
When a new node is inserted into a red–black tree, the colour properties may be violated. To restore them, two types of operations are used:
1. Recolor or repaint
2. Rotation
Recolor means changing the color of a node. We have to keep in mind that the color of the root is always black and that no two adjacent nodes can be red. The rotations are the same as the rotations of an AVL tree; any one among the four cases, i.e. LL, RR, LR, and RL, may apply.
The general steps of insertion may be summarised as follows:
1. Insert the new node following the rules of a binary search tree and color it red.
2. If the new node is the root, recolor it black.
3. If the parent of the new node is black, nothing more needs to be done.
4. If the parent of the new node is red, check the color of the parent's sibling (the uncle of the new node).
5. a. If the uncle is red, then
      i. Recolor the parent and the uncle as black and, if the grandparent is not the root, recolor the grandparent as red.
   b. If the uncle is black (or absent), apply the appropriate rotation (LL, RR, LR, or RL) on the grandparent and recolor accordingly.
6. If the grandparent of the new node is not the root, check it. If it violates the color rule, follow step 5 for the grandparent.
Now consider the following example where based on the above algorithm
we will construct a complete red–black tree by inserting a set of key values
sequentially: Example 10.6: Construct a red–black tree from the following
sequence of numbers: 12, 23, 10, 18, 20, 30, 28, 45, 50, 8, 5, and 60.
Solution:
[Figure: step-by-step construction of the red–black tree by inserting 12, 23, 10, 18, 20, 30, 28, 45, 50, 8, 5 and 60; recoloring and/or a rotation (LR, RL, RR or LL) is applied whenever the red–black properties are violated]
Deletion of a node from a red–black tree is a little bit complex. First we will
delete the node following the rules of a binary search tree. Next we have to
check the color of the deleted node. If it is red, the case is very simple, but
if the color of the deleted node is black, the case is quite complicated because it decreases the count of black nodes in that particular path, which may violate the basic property of a red–black tree.
In case of deletion of a node from a binary search tree, we need to handle
three types of cases. These are no-child, one-child, and two-children cases.
But in a two-children case we substitute the value of the node with its
inorder predecessor or successor and physically delete this predecessor or
successor node which is either a leaf node or node with one child.
Thus in case of a red–black tree, after deleting the node we actually need to
handle the cases with no child or one child. To describe them better we are
denoting the node to be deleted as d and its child as c (c may be None when
d has no child). Now we are discussing some cases.
a. If the node d to be deleted (or its child c) is red: simply delete d; if c exists, recolor it black. For example, deleting the red leaf node 23 below causes no violation.
[Figure: deletion of the red node 23 (case I)]
b. If both d and c are black and the sibling of d is black with at least one red child: a suitable rotation is applied. There may be four cases:
i. LL: when both the sibling and its red child are left children of their parents
ii. RR: when both the sibling and its red child are right children of their parents
iii. LR: when the sibling is a left child of its parent and its right child is red
iv. RL: when the sibling is a right child of its parent and its left child is red
In case of LL or RR rotation, the value of the parent node would be shifted to the node to be deleted. The value of the red node and its parent node would be shifted to their parent node and the red node would be actually deleted.
Figure 10.41 Deletion of a node from a red–black tree (case II)
In case of
LR or RL, in first rotation, child will be inserted in between its parent and
grandparent and its color will be changed to black and its parent now
becomes red. Thus after the final rotation, the old sibling node, which now
becomes red, is moved up and becomes black again. Consider the following
example:
Figure 10.42 Deletion of a node from a red–black tree (case III)
c. If the sibling of d is black and both the children of the sibling are black: Recolor the sibling and repeat the operation for the parent if the parent is black.
Figure 10.43 Deletion of a node from a red–black tree (case IV)
d. If the sibling of d is red: If the sibling is a left child, give a right rotation, otherwise give a left rotation on the parent to move the old sibling up, and recolor the old sibling and the parent.
[Figure: deletion of node 10 where the sibling is red; a left rotation is given on the parent followed by recoloring]
To construct a Huffman tree the following steps are followed:
1. Find the frequency of each character in the given string.
2. Consider each character, along with its frequency, as a leaf node and arrange the nodes in ascending order of frequency.
3. Take the two least frequented nodes and form an intermediate node whose children are these two nodes/characters. The value of the node will be the sum of the frequency of these leaf nodes.
4. Repeat step 3 until the total tree is formed. Note that at the time of considering the least frequented nodes, those nodes already used to form an intermediate node are never used again.
5. After formation of the tree assign 0 to each left edge and assign 1 to each right edge.
Let us apply this algorithm on the above string to build the Huffman tree.
Our first task is to find the frequency of each character and arrange them in
ascending order of their frequency.
Character : F D E B C A
Frequency : 2 3 3 4 8 10
These will be considered as leaf nodes of the tree, and the least frequented nodes form the intermediate nodes. The two least frequented nodes are F and D, with values 2 and 3 respectively; they form an intermediate node with value 5.
The next least frequented nodes are E and B with values 3 and 4 respectively; they form an intermediate node with value 7.
The next least frequented nodes are C and A with values 8 and 10 respectively; they form an intermediate node with value 18. Next, the two intermediate nodes with values 5 and 7 are combined to form an intermediate node with value 12.
Now the intermediate nodes contain the least values 12 and 18. Hence, the final tree, whose root has the value 30, is formed.
[Figure: the complete Huffman tree]
Now we assign 0 and 1 to each left and right edge respectively. Hence we get the Huffman tree.
[Figure: the Huffman tree with 0 and 1 labels on the edges]
From this tree we are now able to encode the characters. To get the code of
each character we traverse from the root to the corresponding character and
the corresponding path will be the code for that character. Hence, we get the
code of the characters as:
Character : A    B    C    D    E    F
Code      : 11   011  10   001  010  000
Using these codes, the given string is encoded as:
111010011000111111001100110110101111100000011101110100100011011111101010.
The size of the encoded string can be calculated as follows:
Character    Code    Size in bits    Frequency    Total Size
A            11      2               10           20
B            011     3               4            12
C            10      2               8            16
D            001     3               3            9
E            010     3               3            9
F            000     3               2            6
Thus, total size of the encoded string = 20+12+16+9+9+6 = 72. But along
with the encoded string we also need to store/send the code table. In the
code table, there are 6 ASCII characters, for which 6x8 = 48 bits are
required, and for the code 2+3+2+3+3+3 = 16 bits are required. Hence, the
total size of the code table = 48+16=64 bits. So, to encode the string total
72+64, i.e. 136, bits are required, which is much less than the original string
(240 bits).
To decode the string, we have to use the same code table. We read the bits of the encoded string one by one and traverse the corresponding path starting from the root of the Huffman tree. When we reach a leaf node we get the decoded character. Then we start from the root again and follow the same procedure. In our encoded string, the first 11 leads to the leaf A, so the first decoded character is A, and the rest of the string is decoded in the same way.
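The whole encode/decode cycle can be sketched in a few lines of Python. The following is not the book's code: it builds the Huffman tree with a min-heap (heapq), derives the code table, and decodes an encoded bit string; the example string is arbitrary.
import heapq
from collections import Counter
class HuffNode:
    def __init__(self, freq, char=None, left=None, right=None):
        self.freq, self.char, self.left, self.right = freq, char, left, right
    def __lt__(self, other):                    # lets nodes be compared inside the heap
        return self.freq < other.freq
def build_huffman(text):
    heap = [HuffNode(f, c) for c, f in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)                 # the two least frequented nodes
        b = heapq.heappop(heap)
        heapq.heappush(heap, HuffNode(a.freq + b.freq, None, a, b))
    return heap[0]
def make_codes(node, prefix="", table=None):
    if table is None:
        table = {}
    if node.char is not None:                   # leaf node: one character, one code
        table[node.char] = prefix or "0"
    else:
        make_codes(node.left, prefix + "0", table)    # 0 on every left edge
        make_codes(node.right, prefix + "1", table)   # 1 on every right edge
    return table
def decode(bits, root):
    out, node = [], root
    for b in bits:
        node = node.left if b == "0" else node.right
        if node.char is not None:               # reached a leaf: one character decoded
            out.append(node.char)
            node = root
    return "".join(out)
text = "ABRACADABRA"                            # an arbitrary example string
tree = build_huffman(text)
codes = make_codes(tree)
encoded = "".join(codes[c] for c in text)
print(codes, encoded, decode(encoded, tree))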
M-way search trees are the generalized version of binary search trees. Here
m-way represents multi-way search tree, which means each node of these
trees can have a maximum of m number of children and m-1 key values and
these key values maintain the search order. We can say that the binary
search tree is also an m-way search tree whose value of m = 2. Thus each of
its nodes has a maximum of two children – left sub-tree and right sub-tree –
and the number of key values is 2-1 = 1. But generally, in case of m-way
search trees the value of m is greater than 2. An m-way tree does not mean
the tree has exactly m number of children in each node; rather, it indicates
that the order of the tree is m, which means that each node may have a
maximum of m number of children and the number of key values is one less
than the number of children.
The advantage of m-way search trees is that the search, insertion, and deletion operations are much faster in comparison to a binary search tree. Since the data elements are distributed among m paths, the height of the tree is reduced to log_m(n), where n is the number of key values in the tree. Thus the time complexity of these operations is reduced to O(log_m n). The general structure of a node of an m-way search tree is
R0  K0  R1  K1  R2  ...  K(m-2)  R(m-1)
where each Ki is a key value and each Ri is the reference of a sub-tree, so a node holds at most m references and m-1 key values. The key values in the node are in ascending order, i.e. K0 < K1 < K2 < .... < K(m-2). All the key values in nodes of the sub-tree whose reference is at Ri are less than the key value Ki, for i = 0, 1, 2, ..., m-2. Similarly, all the key values in nodes of the sub-tree whose reference is at R(m-1) are greater than the key value K(m-2). Figure 10.46 shows an m-way search tree of order 4.
[Figure 10.46: an m-way search tree of order 4]
To search a key value x in an m-way search tree, the search starts from the root. Within the root, compare x with the key values of the root. If x < K0, the search continues in the sub-tree referenced by R0; if Ki-1 < x < Ki, it continues in the sub-tree referenced by Ri; and if x is greater than the largest key value, it continues in the sub-tree referenced by R(m-1). This process will be continued until x is matched with some key value or the search ends with an empty sub-tree. Consider the above example, where we want to search the key value 30. As 20 < 30 < 40 at the root, we have to move through the sub-tree whose reference is in between 20 and 40. Now within the root of this sub-tree, 30 is greater than the first key value, 25, but less than the next key value, 32. So, we have to move through the sub-tree whose reference is in between 25 and 32, and we will find 30 in that node.
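The search just described can be sketched in code. This is an illustration only (MWayNode and mway_search are assumed names, not the book's): each node keeps its key values in ascending order in a list keys and its child references in a list children, where children[i] leads to keys smaller than keys[i] and the last reference leads to keys larger than the last key value.
class MWayNode:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children if children is not None else [None] * (len(keys) + 1)
def mway_search(node, x):
    while node is not None:
        i = 0
        while i < len(node.keys) and x > node.keys[i]:
            i += 1
        if i < len(node.keys) and x == node.keys[i]:
            return node                  # key found in this node
        node = node.children[i]          # descend into the proper sub-tree
    return None                          # empty sub-tree reached: key not found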
10.15 B Tree
A B tree is a very popular and efficient m-way search tree. Its application is
found widely in database management systems. The B tree was developed
by Rudolf Bayer and Ed McCreight in 1970.
A B tree of order m possesses the following properties:
1. All the leaf nodes are at the same level.
2. All internal nodes except the root can have a maximum of m children and a minimum of ⌈m/2⌉ children.
3. The root, if it is not a leaf node, has at least two children.
4. The number of key values in a node is one less than its number of children and the key values maintain the search order.
From the first property it is clear that a B tree is a strictly balanced tree and thus the time complexity of insertion, deletion, and search operations is calculated as O(log_m n).
[Figure: an example of a B tree]
To insert a new key value into a B tree, first the appropriate leaf node is located and the key is placed in it in proper order. If the leaf node overflows, i.e. it now contains m key values, the node is split into a left part, a middle element, and a right part.
The middle part is moved upward and inserted into the parent node. If this
parent node is unable to accommodate the new element, the parent node
may be split further and the splitting propagates upwards. This splitting
process may propagate up to the root. If the root splits, a new node with a
single element is created and the height of the B tree increases by one. The
following example illustrates the total insertion process.
Example 10.6: Construct a B tree of order 5 from the following sequence of numbers: 35, 22, 57, 41, 72, 15, 65, 97, 39, 92, 45, 90, 63, 85, 95, 60, 50, 94, 99 and 20.
Solution:
[Figure: step-by-step construction of the B tree of order 5; whenever a node overflows it is split and its middle key value moves up into the parent node]
To delete an element from a B tree, we first locate it. If the element is in a leaf node that contains more than the minimum number of key values, then simply delete the element. But if the element is in a leaf
node and the node does not contain sufficient number of elements, we have
to fill the position by borrowing elements from its siblings via the root. The
largest element of the left sibling or the smallest element of the right sibling
(whichever is available) would be moved into the parent node and the
intervening element from the parent node moves to the node where the
deletion takes place. But if both the left and right siblings do not have
sufficient number of elements, we have to merge the two leaf nodes along
with their intervening element in the parent node into a single leaf node. For
this reason, if the parent node suffers from less than the minimum number
of key values, the process of merging propagates upwards. This may
propagate up to the root, causing the height of the tree to decrease by one. If
the element is in some internal node, the inorder predecessor or successor of
the element has to be fetched from the corresponding leaf node to fill up the
position of the deleted element.
The following example illustrates the various cases that may arise due to
the deletion of elements from a B tree.
Example 10.7: Delete 50, 92, 90 and 35 from the following B tree.
[Figure: the B tree before deletion]
Solution:
[Figure: step-by-step deletion of 50, 92, 90 and 35; keys are borrowed from a sibling via the parent, or nodes are merged, whenever a node underflows]
Searching in a B tree proceeds as in an m-way search tree: starting from the root, at each node we follow the child reference lying between the two key values that enclose the search key. For example, searching for 50 in the original tree above, we reach the sub-tree that contains a single node with key values 45, 50, and 57. On finding the value 50, the search operation is completed with success.
10.16 B+ Tree
A B+ tree is a variation of the B tree in which all the data (key values) are stored only in the leaf nodes, the internal nodes hold index key values that merely direct the search, and the leaf nodes are linked together to support fast sequential access.
Figure 10.50 Example of a B+ tree of order 3
But as we discussed, unlike a B tree, since it stores data values only at the
leaf nodes, we need to follow some deviations in implementation. Again the
leaf nodes are connected through a linked list. So, care has to be taken for
that also. To insert an element, first we will locate the node where the new
element will be inserted. Now if the node has sufficient space to
accommodate the new element, it will be inserted smoothly (maximum
capacity is m-1
key values). Otherwise, the node will be split into two parts. The left part
will contain m
2
– 1 elements ( m
is also considered) and the right part will contain the rest. The smallest 2
element in the right part, i.e. the left-most element of the right part, will be
pushed up into the parent node maintaining the ascending order of the
elements in the parent node. Next, the left part points to the right part and
the right part points to the node that was pointed to by the node of insertion
before splitting. The addition of an element in the parent node follows
exactly the same rules that have been followed in the case of a B tree. The
following example illustrates the insertion operation in detail:
Example 10.8: Construct a B+ tree of order 5 from the following sequence of numbers: 35, 22, 57, 41, 72, 15, 65, 97, 39, 92, 45, 63, 85, 90, 60, 94 and 20.
Solution:
[Figure: step-by-step construction of the B+ tree of order 5; whenever a leaf node overflows it is split and the left-most key of the right part is copied up into the parent node]
To delete an element, we have to delete in two steps. First from a leaf node
and then from an intermediate node, if it exists. After the deletion of an
element from the leaf node if it is found that the node has more than or
equal to the requisite minimum number of key values, then it is okay. Next
we have to check if there is an entry in the intermediate node for that
element. If it is there, the entry should be removed from the node and
replaced with the copy of the left-most element of the immediate right
child. But after deletion if the number of elements in the leaf node becomes
less than the minimum number of key values, the node will be merged with
its left or right siblings and the intermediate index key value will be
removed from the parent node. This may cause underflow for the parent
node. To fulfill the criteria then we have to borrow from the siblings of this
parent node via its parent node (i.e. grandparent node) if they have more
than the minimum number of keys. But if they do not have so, we need to
merge the node with its left or right siblings along with the intervening
element in the parent node. This may cause underflow to its parent node.
The process of merging propagates upward and in extreme cases it may
propagate up to the root, causing the height of the tree to decrease by one.
Now we discuss several cases that may arise at the time of deletion.
Consider the following examples:
Example 10.9: Delete 60, 92, 85, 90, and 94 sequentially from the following B+ tree.
(Diagram: root [65]; internal nodes [22 41 57] and [85 92]; leaves [15 20], [22 35 39], [41 45], [57 60 63], [65 72], [85 90], [92 94 97].)
Solution:
Delete 60:
(Diagram: the B+ tree before and after deleting 60 — the leaf [57 60 63] simply becomes [57 63].)
Figure 10.52 Deletion of a node from a B+ tree (Case I)
This is the simplest case. The node where deletion occurs still has the required minimum number of key values after the element is deleted, and the element has no entry in any intermediate node.
Delete 92:
(Diagram: the B+ tree before and after deleting 92 — the leaf [92 94 97] becomes [94 97] and the parent entry 92 is replaced by 94.)
Figure 10.53 Deletion of a node from a B+ tree (Case II)
In this case, after deleting the element from the node, the node still has the required minimum number of key values. So, there is no problem with the deletion itself. But the element has an entry in its parent node. This entry needs to be removed, and the vacant position is filled by 94, which is now the left-most element in its right child node.
Delete 85:
(Diagram: deleting 85 — the underflowing leaf is merged with its left sibling to give [65 72 90], the entry 85 is removed from the parent, and after rebalancing 57 moves up to the root while 65 moves down.)
Figure 10.54 Deletion of a node from a B+ tree (Case III)
After deleting 85, the node containing 85 underflows. Thus we merge this node with its left sibling and remove the entry of 85 from the parent node. But now the parent node underflows. To balance it again, we borrow an element from its sibling via the root: 65 moves down into this node from the root and 57 moves up from its sibling to the root. The inorder successor node of 57 now becomes the inorder predecessor of 65.
Deletion of 90 does not violate any rule; its entry is simply removed from the node. But deletion of 94 makes the corresponding node underflow. So, this node is merged with its left sibling, and the entry of 94 is removed from its parent node. This in turn makes the parent node underflow, so we need to borrow an element from its sibling. In this situation, however, the node has a single sibling which has exactly the minimum number of required elements. Thus borrowing is not possible, and we merge the node with its sibling along with the intervening element of the parent.
(Diagram: the B+ tree after deleting 90 and then 94 — the final tree has a single internal node [22 41 57 65] with leaves [15 20], [22 35 39], [41 45], [57 63], and [65 72 97].)
Figure 10.55 Deletion of nodes from a B+ tree (Case IV)
10.17 B* Tree
A B* tree is another variation of a B tree where each node except the root
node is at least two-third full rather than half full. The basic idea to
construct a B* tree is to reduce the splitting of nodes. In this tree, when a
node gets full, instead of splitting the node, keys are shared between
adjacent siblings, and when two adjacent siblings become full, both of them
are merged and then split into three. The insertion and deletion operations are broadly similar to those of a B tree but considerably more complex, which is why the B* tree is less popular than the B tree or the B+ tree.
A 2–3 tree is nothing but a B tree of order 3. Each node of a B tree of order
3 can have either two children or three children. That is why sometimes it is
called a 2–3 tree. Similarly, a B
tree of order 4 is also known as a 2–3–4 tree as each node of this tree can
have either 2, 3, or 4 children. The following is an example of a 2–3 tree.
(Diagram: a 2–3 tree with root key 50, internal nodes holding 30 and 60, 90, and leaves holding 23, 42, 56, 73, 84, 97, and 101.)
A Trie tree is a special type of search tree. The term ‘Trie’ comes from the word ‘retrieval’, because it is used for efficient retrieval of a key from a set of key values.
Generally these keys are in the form of strings. Instead of storing key values
in the nodes, it stores a single character and a set of pointers. If we think
about a dictionary, i.e. words are constructed only with letters of the
alphabet, the maximum number of pointers may be 26. To access a key the
tree is traversed from the root through a specific path that matches with the
prefix of a string up to the leaf node. Thus all the children of a particular
node have a common prefix of string. That is why it is also known as a
prefix tree. Application of a Trie tree is mainly found in string matching
operations such as predictive text, auto completing a word, spell checking,
etc. The following example shows the creation of a Trie tree: Example
10.10: Construct a Trie tree for the following:
“A”, “TO”, “THE”, “TED”, “TEN”, “I”, “IN”, “AND”, “INN”, “TEA”,
“THEN”, “THAT”.
Solution:
(Diagram: the resulting Trie — each node stores a single character, and the paths from the root spell out the prefixes A, AN, AND, I, IN, INN, TE, TEA, TED, TEN, TH, THA, THAT, THE, THEN, and TO.)
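The structure described above can be sketched in a few lines of Python. The names used here (TrieNode, insert, search) are our own illustrative choices, and the children dictionary plays the role of the set of character pointers mentioned in the text:
class TrieNode:
    def __init__(self):
        self.children = {}     # one entry per next character
        self.is_word = False   # True if a key ends at this node

def insert(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_word = True

def search(root, word):
    node = root
    for ch in word:
        if ch not in node.children:
            return False
        node = node.children[ch]
    return node.is_word

root = TrieNode()
for w in ["A", "TO", "THE", "TED", "TEN", "I", "IN",
          "AND", "INN", "TEA", "THEN", "THAT"]:
    insert(root, w)
print(search(root, "TEN"), search(root, "TH"))   # True False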
✓ A binary tree is a tree of degree 2, i.e. each node of this tree can have a
maximum of two children.
✓ If each node of a binary tree has exactly zero (in case of leaf node) or
two non-empty children, the binary tree is known as a strictly binary tree or
2-tree.
✓ A complete binary tree is a binary tree whose all levels, except possibly
the last level, have the maximum number of nodes and at the last level all
the nodes appear as far left as possible.
✓ A binary tree whose all levels have the maximum number of nodes is
called a full binary tree.
✓ Preorder, inorder, and postorder are the three main traversal techniques.
✓ AVL tree and red–black tree are two self-balancing binary search trees.
1. If a full binary tree has n leaf nodes, the total number of nodes in the tree
is a) n nodes
b) log n nodes
c) 2n-1 nodes
d) 2n nodes
b) 63
c) 31
d) 16
3. The maximum number of nodes in a complete binary tree of depth k is a)
2k
b) 2k
c) 2k-1
d) None of these.
a) BST
b) AVL Tree
c) Heap
d) B tree.
5. If a binary tree has n leaf nodes, the number of nodes of degree 2 in the
tree is a) log n
b) n
c) n-l
d) 2n
b) n+1
c) 2n
d) 2n+1
7. Which of the following need not be a balance tree?
a) BST
b) AVL Tree
c) Red–black Tree
d) B Tree.
9. If the preorder and postorder traversals of a binary tree generate the same output, the tree can have a maximum of
a) Three nodes
b) One node
c) Two nodes
a) 4
b) 5
c) 6
d) 7
a) Preorder Traversal
b) Inorder Traversal
c) Postorder Traversal
c) Inorder and any one between preorder and postorder traversal paths are
required.
a) Root node
b) Left sub-tree
c) Right sub-tree
d) Sibling nodes
a) n
b) 2n
c) log2 n
d) 2n – 1
e) Binary Tree
f) B Tree
g) B+ Tree
h) B* Tree
a) 2–3 Tree
b) AVL Tree
c) Red–black Tree
a) Binary Tree
c) Huffman Tree
d) B+ Tree
a) BST
b) Threaded Binary Tree
c) Huffman Tree
d) 2–3 Tree
19. It is found that all leaves are linked with a linked list in a) B Tree
b) B+ Tree
c) Red–black Tree
d) 2–3 Tree
b) m 2
m
c) 2
m
d) 2
Review Exercises
a) a + b * c – d / e
b) a – (b – c * d) + (g / h – e) * f
c) (a * b – c ) % (d / e + f)
d) 2 * (a + b ) – 3 * (c – d)
8. How can the following tree be represented in memory when an array will
be used?
I
9. How can a binary tree be represented using a linked list? Explain with an
example.
11. Write down the preorder, inorder, and postorder traversal paths for the
following binary tree: A
Inorde: A E C G F H B D
Inorder: Q S T R P
Postorder: T S R Q P
15. Is it possible to reconstruct a binary tree with the preorder and postorder
traversal sequences of a binary tree? Explain.
16. Write down the preorder, inorder, postorder, and level order traversal
paths for the following binary tree:
J
K
17. Compare and contrast between array representation and linked list
representation of a binary tree.
21. Write a function to search an element from a binary search tree whose
leaves are all tied to a sentinel node.
57, 32, 45, 69, 65, 87, 5, 40, 43, 12, 80.
24. Define an AVL tree and give examples of AVL and non-AVL trees.
25. Draw an AVL tree with the following list of names by clearly
mentioning the different rotations used and the balance factor of each node:
Bikash, Palash, Ishan, Nimai, Lakshmi, Perth, Jasmin, Arijit, Himesh,
Dibyendu.
26. Draw an AVL tree with the following sequence of numbers: 52, 67, 72,
17, 9, 35, 21, 37, 24, 44.
Next delete 35 from the above tree maintaining the AVL structure.
MAR, NOV, MAY, AUG, APR, JAN, DEC, JUL, FEB, JUN, OCT, SEP.
28. What is a red–black tree? Draw a red–black tree from the following
sequence of numbers: 25, 37, 48, 12, 7, 33, 27, 15, 10.
29. What is B tree? Construct a B tree of order 3 with the following data:
85, 69, 42, 12, 10, 37, 53, 71, 94, 99.
35, 10, 52, 40, 27, 5, 19, 66, 78, 97, 44, 38, 82.
25, 32, 10, 15, 67, 7, 11, 19, 22, 45, 97, 81, 5, 88, 20, 17, 34, 30, 100, 52.
i. Delete 97
ii. Delete 67
iii. Delete 15
35. Draw the Huffman tree for encoding the string ‘successiveness’. How
many bits are required to encode the string? Write down the encoded string
too.
Ram, Ramen, Ramesh, Suva, Samir, Subir, Rama, Samu, Raktim, Rakhi,
Suvas, Subinoy.
4. Write a function to find the average of the values of keys of a binary tree.
6. Write a function to find the path from the root to a particular node in a
binary search tree.
Chapter 11
Heap
We can define a heap as a binary tree that has two properties. These are:
shape property and order property. By shape property, a heap must be a
complete binary tree. By order property, there are two types of heaps. One
is max heap and the other is min heap. By default, a heap means it is a max
heap. In max heap, the root should be larger than or equal to its children.
There is no order in between the children. This is true also for its sub-trees.
In min heap the order is the reverse. Here the root is smaller than or equal to
any of its children. Thus the root of a max heap always provides the largest
element of a list whereas the root of a min heap always provides the
smallest element. As a heap is a complete binary tree, its maximum height is O(log2 n), where n is the total number of elements. Thus both the insertion of a new node and the deletion of an existing node can be done in O(log2 n) time.
(Figure 11.1: two example heaps, (a) and (b).)
Like other data structures, a heap also can be represented in memory using
an array (or a list in Python) and a linked list. The main problem in tree
representation using arrays is in wastage of memory. But in case of
complete binary tree representation, that chance is almost nil. Moreover, we
need not waste space for storing references of left and right children. So,
using arrays (or lists in Python) we can represent the heap data structure
efficiently.
So, the root will be stored at index position 1, and for a node stored at index position i, its immediate left and right children will be stored at index positions 2i and 2i + 1, respectively.
These rules are also applicable for sub-trees. The memory representation of
the heaps shown in Figure 11.1 is shown in Figure 11.2.
(Figure 11.2: the array (list) representation of the heaps of Figure 11.1, stored from index position 1.)
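As a quick illustration of these index rules (the helper names below are ours, not from the text), the parent and child positions can be computed directly from an index:
def parent(i):
    return i // 2       # integer division; the root sits at index 1

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

# index 0 is kept unused, as in the Heap class later in this chapter
items = [0, 60, 20, 40, 15, 10]
print(left(1), right(1), parent(5))    # 2 3 2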
The main operations related to a heap are insertion of new elements into a
heap and deletion of an existing node from a heap. Generally elements are
accessed from the root of the heap. These are discussed below.
11.3.1 Inserting a New Element in a Heap
When a new element is inserted, first it is inserted as the last element in the
heap if the heap is not empty. If the heap is empty, obviously the new node
will be inserted as a root. Thus the shape property has been maintained.
Now we need to check whether the order property is maintained or not; if the newly inserted element is greater than its parent, the two are swapped, and this comparison continues up the tree. The following example builds a heap step by step by inserting the values 12, 15, 10, 20, 18, 4, 27, 16, and 23 one by one.
Solution:
Insert 12:
Insert 15:
Insert 10:
Insert 20:
Insert 18:
Insert 4:
Insert 27:
Insert 16:
(Diagrams: the heap after each insertion, ending with the insertion of 23.)
Figure 11.3 Stepwise construction of a heap
Explanation: Here, the first element is 12; thus it is inserted as the root.
The next element is 15. To maintain the complete tree property it is inserted
as the left child of 12. But 12 is less than 15; so, they are swapped. The next
element is 10. To maintain the shape property it is inserted as the right child
of the root. As 10 < 15, the order property is also maintained.
The next element is 20 and it is inserted as the last element in the tree, i.e. at
the leftmost position of the next level. But 20 > 12; so, they are swapped.
Again 20 > 15 as well; thus they are also swapped. The next element is 18,
which is greater than 15, and thus swapped. But 18 < 20; so, we need not
continue this swap operation. In this way, each of the rest of the elements is
also stored in the heap first at the corresponding last position and then order
property has been maintained by performing the required swapping
operation.
The insertion operation can be implemented as the following method of the Heap class (the complete class is listed later in this chapter):
def insert(self, item):
    self.items.append(item)
    start = len(self.items) - 1
    while start > 1:
        parent = start // 2
        if self.items[parent] < self.items[start]:
            self.items[parent], self.items[start] = \
                self.items[start], self.items[parent]
            start = parent
        else:
            break
In a heap, deletion takes place always from the root. But to maintain the
shape property, i.e. to retain it as a complete tree, we always physically
delete the last element and obviously before deleting the last element it
should be copied to the root. Now we need to check whether in any position
the order property is violated or not. For that we start from the root and
check whether the root is greater than both of its children. If not, the root is
swapped with its largest child. Next this child node is compared against its
children, and so on. This process continues up to the leaf. The following
example illustrates the deletion operation in heap:
23
20
18
15
10
12
16
Solution:
27
16
23
20
23
20
18
15
10
18
15
10
12
16
12
23
23
16
20
18
20
18
15
10
16
15
4
10
12
12
Thus 27 is replaced by 16 and the last node, i.e. the node originally
containing the value 16, is now physically deleted. Next we have to check
the order property. So, the node 16 is compared with its children. Between the children, 23 is the larger one and it is also larger than 16; hence 16 and 23 are swapped, and the comparison continues down the tree until the order property is satisfied.
The general algorithm to delete an element from a heap may be defined as:
1. Find the current last position and set it as Last.
2. Replace the value of the root with the value at the Last position.
3. Now delete the last element.
4. Set Current = 1, Left = 2 × Current, and Right = 2 × Current + 1.
5. Repeat while Left is a valid position:
a. Find the larger of the two children of Current.
b. If the value at Current is smaller than that child, swap them, set Current to the child's position and, based on the new value of Current, recalculate Left and Right.
c. Otherwise, stop.
In Python, the deletion can be implemented as the following method of the Heap class:
def delete(self):
    last = len(self.items) - 1
    self.items[1] = self.items[last]
    del self.items[last]
    current = 1
    left = 2 * current
    right = 2 * current + 1
    while left < last:
        max = self.items[left]
        posn = left
        if right < last and self.items[right] > max:
            max = self.items[right]
            posn = right
        if self.items[current] < self.items[posn]:
            self.items[current], self.items[posn] = \
                self.items[posn], self.items[current]
            current = posn
            left = 2 * current
            right = 2 * current + 1
        else:
            break
class Heap:
def __init__(self):
self.items = [0]
def isEmpty(self):
return len(self.items) == 1
def insert(self,item):
self.items.append(item)
start=len(self.items)-1
while start>1:
parent=start//2
if self.items[parent]<self.items[start]:
self.items[parent],self.items[start] =
self.items[start],self.items[parent]
start=parent
else:
break
def delete(self):
last=len(self.items)-1
self.items[1]=self.items[last]
del self.items[last]
current=1
left=2*current
right=2*current+1
while left<last:
max=self.items[left]
posn=left
if right<last and self.items[right]>max:
max=self.items[right]
posn=right
if self.items[current]<self.items[posn]:
self.items[current],self.items[posn] =
self.items[posn],self.items[current]
current=posn
left=2*current
right=2*current+1
else:
break
def display(self):
count=len(self.items)-1
print()
print(“------”*count+“-”)
for i in range(1,count+1):
print(‘|’,format(self.items[i],‘>3’), end=“
”)
print(“|”)
print(“------”*count+“-”)
h=Heap()
while(True):
print(“\t1. Insert”)
print(“\t2. Delete”)
print(“\t3. Display”)
print(“\t4. Exit”)
print(“=========================”)
if choice==1 :
h.insert(num)
elif choice==2 :
if h.isEmpty() :
print(“Heap Underflow”)
else :
h.delete();
elif choice==3 :
if h.isEmpty() :
print(“Heap is Empty”)
else :
h.display()
elif choice==4 :
print(“\nQuiting.......”)
break
else:
Choice”)
continue
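For a quick, non-interactive check, the same class can be exercised with the values used in the construction example above (this short driver is ours, not part of the book's menu program):
h = Heap()
for value in [12, 15, 10, 20, 18, 4, 27, 16, 23]:
    h.insert(value)
print(h.items[1:])   # expected: [27, 23, 20, 18, 15, 4, 10, 12, 16]
h.delete()           # removes the current maximum (27)
print(h.items[1:])   # expected: [23, 18, 20, 16, 15, 4, 10, 12]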
The following program implements a priority queue using a heap. Here the larger value is considered as the higher priority and thus a max heap has been used.
class Node:
def __init__(self, Newdata, NewPriority):
self.data = Newdata
self.priority = NewPriority
class Heap:
def __init__(self):
self.items = [0]
def isEmpty(self):
return len(self.items) == 1
def enqueue(self,item):
self.items.append(item)
start=len(self.items)-1
while start>1:
parent=start//2
if self.items[parent].priority<self.
items[start].priority:
self.items[parent],self.items[start]=
self.items[start],self.items[parent]
start=parent
else:
break
def dequeue(self):
temp=self.items[1]
last=len(self.items)-1
self.items[1]=self.items[last]
del self.items[last]
current=1
left=2*current
right=2*current+1
while left<last:
max=self.items[left].priority
posn=left
if right<last and self.items[right].priority>max:
max=self.items[right].priority
posn=right
if self.items[current].priority<self.
items[posn].priority:
self.items[current],self.items[posn]=
self.items[posn],self.items[current]
current=posn
left=2*current
right=2*current+1
else:
break
return temp
def peek(self):
return self.items[1]
def display(self):
count=len(self.items)-1
print()
print(“---------”*count+“-”)
for i in range(1,count+1):
print(‘|’,format(str(self.items[i].data)+
“(”+str(self.items[i].priority)+“)”,‘>6’), end=“ ”)
print(“|”)
print(“---------”*count+“-”)
q=Heap()
while(True):
print(“===================================”)
print(“\t1. Enqueue”)
print(“\t2. Dequeue”)
print(“\t3. Peek”)
print(“\t4. Display”)
print(“\t5. Exit”)
print(“===================================”)
if choice==1 :
newNode=Node(num,prio)
q.enqueue(newNode)
if q.isEmpty() :
print(“Queue Underflow”)
else :
popNode=q.dequeue()
Priority ”, popNode.priority)
elif choice==3 :
if q.isEmpty() :
print(“Queue Underflow”)
else :
popNode=q.peek()
elif choice==4 :
if q.isEmpty() :
print(“Queue is Empty”)
else :
q.display()
elif choice==5 :
print(“\nQuiting.......”)
break
else:
Choice”)
continue
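A similar non-interactive check of the priority queue (again, our own driver, not part of the menu program):
q = Heap()
for data, prio in [(10, 2), (20, 5), (30, 1)]:
    q.enqueue(Node(data, prio))
top = q.dequeue()
print(top.data, top.priority)   # expected: 20 5 — the highest priority is served first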
Heap at a Glance
✓ A heap is a complete binary tree which must satisfy the order property.
✓ By order property there are two types of heap – Max heap and Min heap.
✓ In min heap, the root is smaller than or equal to any of its child.
1. If the following array represents a heap, the largest element in the heap is
A
a) A
b) G
c) Cannot say
d) None of these
2. If the following array represents a min heap, the smallest element in the
heap is A
C
D
a) A
b) G
c) Cannot say
d) None of these
3. If the following array represents a heap, the smallest element in the heap
is A
a) A
c) Cannot say
d) Either B or C
b) n–1
c) log n
d) n/2
5. The worst case time complexity to search for an element from a heap of n
elements is a) O(n)
b) O(n2)
c) O(log n)
d) O(n log n)
a) binary tree
b) complete tree
d) None of these
a) stack
c) priority queue
d) an increasing order array
9.
30
25
20
18
15
10
12
16
10. If we delete the root node of the above heap, by which value the root
will be replaced?
a) 4
b) 16
c) 12
d) 10
25
20
18
12
10
a) Binary tree
c) Heap
d) None of these
Review Exercises
1. Define a heap.
23 72 35 49 62 87 12 58 93 76
5. Delete three consecutive elements from the heap constructed in question
no. 4.
82 45 70 35 18 49 27 2 23 15
8. Does the sequence “57, 32, 47, 22, 17, 38, 42, 20, 9” represent a max
heap?
9. Does the sequence “5, 12, 27, 22, 17, 38, 42, 28, 39, 19” represent a min
heap?
Chapter 12
Graphs
A graph is a non-linear data structure consisting of a finite set of vertices and a set of edges that connect pairs of vertices.
(Figure 12.1: an undirected graph with six vertices, v1–v6, and nine edges, e1–e9.)
If an edge is directed from vi to vj, we can traverse from vi to vj but not from vj to vi. An edge may also carry directions on both sides; then we may traverse from vi to vj as well as from vj to vi. Figure 12.2 shows a directed graph.
(Diagram: a directed graph with vertices v1–v6 and edges e1–e9.)
Figure 12.2 Directed graph
Weighted Graph: When the edges of a graph are associated with some
value, it is known as a weighted graph. This value may represent the
distance between two places or may represent the cost of transportation
from one place to another, etc. A weighted graph may be of two types:
directed weighted graph and undirected weighted graph. If the edge
contains weight as well as direction, it is known as a directed weighted
graph but if the edge contains weight but there is no direction associated
with that edge, it is known as an undirected weighted graph. Figure 12.3
shows both directed weighted graph and undirected weighted graph.
(Figure 12.3: two weighted graphs, (a) and (b), on the vertices v1–v6 — one with directed edges and one with undirected edges.)
Path: A sequence of vertices in which consecutive vertices are connected by edges is called a path, and any part of a path is also a path. If the end points of a path are the same, the path is called a
closed path. If all the vertices in a path are distinct, the path is called
simple.
Cycle: If a path ends at a vertex from which the path started, the path is
called a cycle. In Figure 12.3(a), v1-> v2-> v3-> v1 is a cycle or v1-> v2->
v4-> v6-> v5-> v3-> v1 is another cycle. But there is no cycle in Figure
12.3(b).
(Diagram: a graph with vertices v1–v5 and edges e1–e10.)
Multiple edges: If there are multiple edges between the same two vertices,
then it is called multiple edges or parallel edges. Figure 12.5 shows multiple
edges, e1, e2, and e3 between the vertices, v1 and v2.
Loop: If any edge has the same end points, then it is called a loop or self-loop. In Figure 12.5, e7 is a loop or self-loop.
(Diagram: a multigraph on the vertices v1–v4, with parallel edges e1, e2, e3 between v1 and v2 and a self-loop e7.)
Figure 12.5 Multigraph with self-loop
Cut vertex: If there is any such vertex whose deletion makes the graph
disconnected, such a vertex is known as a cut vertex.
To represent a graph using a linked list there are also two ways. These are: using an adjacency list and using an adjacency multi-list.
In the adjacency matrix representation of an undirected graph, suppose there is an edge between vi and vj. It indicates that we have an edge from the vertex vi to the vertex vj as well as from vj to vi. Thus both arr[vi][vj] = 1 and arr[vj][vi] = 1 are set.
For a weighted graph, instead of 1 the weight is stored, i.e. the expression will be arr[vi][vj] = w. In the following figure, the adjacency matrices of an undirected graph and a directed graph are shown.
(Matrices: (a) the adjacency matrix of the undirected graph shown in Figure 12.1 and (b) the adjacency matrix of the directed graph shown in Figure 12.2, followed by the adjacency matrices of the corresponding weighted graphs.)
In the incidence matrix representation, each row corresponds to a vertex and each column corresponds to an edge. In case of an undirected graph, if an edge ej is incident to the vertex vi, then the value of arr[vi][ej] will be 1. In case of a directed graph, if an edge ej starts from vertex vi and is incident to vertex vj, then arr[vi][ej] will be 1 and arr[vj][ej] will be –1. In the following figure, the incidence matrices of an undirected graph, a directed graph, an undirected weighted graph, and a directed weighted graph have been shown.
      e1 e2 e3 e4 e5 e6 e7 e8 e9
v1     1  1  0  0  0  0  0  0  0
v2     1  0  1  1  1  0  0  0  0
v3     0  1  1  0  0  1  0  0  0
v4     0  0  0  1  0  0  1  1  0
v5     0  0  0  0  1  1  1  0  1
v6     0  0  0  0  0  0  0  1  1
(a) Incidence matrix of the undirected graph shown in Figure 12.1
(Matrix: (b) the incidence matrix of the directed graph shown in Figure 12.2, in which each column contains 1 for the start vertex and –1 for the end vertex of the edge.)
(Matrices: the incidence matrices of the corresponding undirected and directed weighted graphs, with the edge weights appearing in place of 1 and –1.)
and the reference of the first node of the list will be stored within the cell
that will represent v1. Similarly, the cell for v2 contains the reference of a
linked list that contains 4 nodes for storing v1, v3, v4, and v5. In this way
the entire list will be created. In the Figure 12.8, the adjacency lists of an
undirected graph, a directed graph, an undirected weighted graph, and a
directed weighted graph have been shown.
(Figure 12.8: the adjacency lists of an undirected graph, a directed graph, an undirected weighted graph, and a directed weighted graph — each vertex cell points to a linked list of its adjacent vertices, and in the weighted cases the edge weight is stored alongside each adjacent vertex.)
For each vertex, there is an entry in the vertex table, and for each entry a reference is stored to point to a node of a linked list which stores the information about an adjacent edge. This node structure contains five fields:
M | vi | vj | Link for vi | Link for vj
Here, the first field M is a single-bit flag which denotes whether the edge has been visited or not. The next two fields are vi and vj, the two adjacent vertices of the edge. The last two fields are ‘Link for vi’ and ‘Link for vj’, which are the references of the next edge nodes containing vi and vj, respectively.
(Diagram: the adjacency multi-list of the graph in Figure 12.1 — a table of the six vertices, each pointing to the first edge node in which it appears, and nine edge nodes e1–e9, each storing the flag M, its two end vertices, and the links to the next edge node of each end vertex.)
In the graph, there are six vertices. So, first we create a table containing
labels of six vertices.
Next we need to create nine nodes for nine edges. The nodes are labeled as
e1, e2, … e9.
Next the vi and vj fields of each node are set. For example, v1 and v2 are the adjacent vertices of the edge e1. Thus the vi and vj fields of e1 are set as v1 and v2, respectively. Similarly, for e2 these values are v1 and v3. Now to set the link fields we need to look down.
Consider the node for e1 first. To set the link for v1, we look below and find
v1 is at e2 and v2 is at e3.
Thus in the Link for v1 field e2 is set and in the Link for v2 field e3 is set.
In the next node e2, to set the link for v1, we look below but do not find any
entry for v1. Thus in the Link for v1 field of e2, None is set. The second
vertex of e2, i.e. v3, has an entry in the node e3.
Hence, e3 is set in the Link for v3 field. In this way all the values are set.
Now we update the reference field in the vertices table. As v1 and v2 first
appear in the node e1, both v1 and v2
point to e1. v3 first appeared in e2. So, v3 points to e2. Similarly, we find
that v4, v5, and v6
first appear in e4, e6, and e8 nodes, respectively. Thus they point to the
corresponding node in the vertices table and we get the above adjacency
multi-list.
From the above adjacency multi-list we can easily find the adjacency list
for the vertices as shown below.
Vertex
List of edges
v1
e1, e2
v2
e1, e3, e4
v3
e4, e5, e7 e8
v5
e6, e7
v6
e8, e9
In this section we will discuss how we can insert a vertex in a graph or how
a new edge can be added or how a vertex or edge can be deleted, several
traversing algorithms, printing a graph, etc.
Now, if we want to insert a new vertex, we need to insert a new row as well as a new column, whereas the insertion of a new edge is quite simple. If the graph is undirected and the end vertices of the edge are vi and vj, then we have to consider an edge from vi to vj and another from vj to vi. Thus, if the array/list name is arr, we have to set arr[vi][vj] = 1 as well as arr[vj][vi] = 1; for a weighted graph, instead of 1 it will be arr[vi][vj] = w and arr[vj][vi] = w. For a directed graph, if the source vertex is vi and the end vertex is vj, then only arr[vi][vj] = 1 is set, and for a weighted graph arr[vi][vj] = w.
In the adjacency list representation, for an undirected graph a new node containing vj is inserted in the linked list of the vertex vi and another new node containing vi is inserted in the linked list of vj. For a directed graph, only a single new node containing the end vertex name is inserted in the linked list of the source vertex. For a weighted graph, the weight of the edge is also stored within the node.
To delete an edge from the adjacency matrix we simply reset the corresponding entry as arr[vi][vj] = 0. In case of a directed graph, if the source vertex is vi and the end vertex is vj, only this single entry is reset; for an undirected graph we do the same operation again considering the source vertex as the end vertex and vice versa.
Here are some complete programs showing these insertion and deletion
operations for different types of graphs. The first program is an
implementation of an undirected graph using an adjacency matrix.
# matrix
class Graph:
def __init__(self,size=0):
self.size=size
self.items=[[0 for col in range(size)] for row in range(size)]
def isEmpty(self):
return self.size == 0
def insert_vertex(self):
#Function to insert a
#vertex
for i in range(self.size):
self.items[i].append(0)
self.items.append([0 for col in range(self.size+1)])
self.size+=1
def insert_edge(self,vi,vj):     #Function to insert an edge
self.items[vi][vj]=1
self.items[vj][vi]=1
def delete_vertex(self,v):     #Function to delete a vertex
if v>=self.size:
print(“vertex not found..”)
return
for i in range(self.size):
del self.items[i][v]
del self.items[v]
self.size-=1
print(“Vertex Removed”)
def delete_edge(self,vi,vj):     #Function to delete an edge
if vi>=self.size or vj>=self.size:
return
self.items[vi][vj]=0
self.items[vj][vi]=0
print(“Edge Removed”)
def display(self):
for i in range(self.size):
print(“v{}|”.format(i),end=‘ ’)
for j in range(self.size):
print(self.items[i][j],end=‘ ’)
print(“|”)
g=Graph()
while(True):
MATRIX”)
print(“=============================================”)
print(“\t5. Display”)
print(“\t6. Exit”)
print(“=============================================”)
if choice==1 :
g.insert_vertex()
elif choice==2 :
elif choice==3 :
g.delete_vertex(v)
elif choice==4 :
g.delete_edge(vs,ve)
if g.isEmpty() :
print(“Graph is Empty”)
else :
g.display()
elif choice==6 :
print(“\nQuiting.......”)
break
else:
Choice”)
continue
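The class can also be exercised without the menu. The tiny driver below is our own sketch; vertices are simply the row/column indices 0, 1, 2, …:
g = Graph()
for _ in range(4):          # create vertices 0, 1, 2, 3
    g.insert_vertex()
g.insert_edge(0, 1)
g.insert_edge(1, 2)
g.insert_edge(2, 3)
for row in g.items:         # print the adjacency matrix
    print(row)
g.delete_edge(1, 2)
for row in g.items:
    print(row)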
The above program is on an undirected graph. For a directed graph the
program is almost the same, the only difference being that we need to
consider only v to v , not v to v . For a i
# adjacency list
class Node:
def __init__(self,vName):
self.vName=vName
self.next=None
class Graph:
def __init__(self):
self.size=0
self.items = []
def isEmpty(self):
return self.size == 0
def insert_vertex(self,v):
self.items.append([v,None])
self.size+=1
for i in range(self.size):
if self.items[i][0]==vi:
newNode=Node(vj)
if self.items[i][1] is None:
self.items[i][1]=newNode
else:
last=self.items[i][1]
while last.next:
last=last.next
last.next=newNode
if self.items[i][0]==vj:
newNode=Node(vi)
if self.items[i][1] is None:
self.items[i][1]=newNode
else:
last=self.items[i][1]
while last.next:
last=last.next
last.next=newNode
def search(self,v):
for i in range(self.size):
if self.items[i][0]==v:
return True
return False
def delete_vertex(self,v):
for i in range(self.size):
if self.items[i][0]==v:
curNode=self.items[i][1]
while curNode:
self.delete_edge(curNode.vName,v)
curNode=curNode.next
del self.items[i]
break
self.size-=1
print(“Vertex Removed”)
for i in range(self.size):
if self.items[i][0]==vi:
if curNode.vName==vj:
self.items[i][1]=curNode.next
else:
prev=None
while curNode.vName!=vj:
prev=curNode
curNode=curNode.next
prev.next=curNode.next
del curNode
if self.items[i][0]==vj:
curNode=self.items[i][1]
if curNode.vName==vi:
self.items[i][1]=curNode.next
else:
prev=None
while curNode.vName!=vi:
prev=curNode
curNode=curNode.next
prev.next=curNode.next
del curNode
def display(self):
for i in range(self.size):
print(self.items[i][0],“:”,end=‘ ’)
curNode=self.items[i][1]
while curNode:
print(curNode.vName,end=‘->’)
curNode=curNode.next
print(“None”)
g=Graph()
while(True):
print(“===============================================”)
print(“\t5. Display”)
print(“\t6. Exit”)
print(“===============================================”)
if choice==1 :
if g.search(v):
continue
g.insert_vertex(v)
elif choice==2 :
if not g.search(vs):
continue
if not g.search(ve):
print(“End Vertex not found..”)
continue
g.insert_edge(vs,ve)
elif choice==3 :
if not g.search(v):
continue
g.delete_vertex(v)
elif choice==4 :
if not g.search(vs):
continue
if not g.search(ve):
continue
g.delete_edge(vs,ve)
print(“Edge Removed”)
elif choice==5 :
if g.isEmpty() :
print(“Graph is Empty”)
else :
g.display()
elif choice==6 :
print(“\nQuiting.......”)
break
continue
class Node:
def __init__(self,vName,weight):
self.vName=vName
self.weight=weight
self.next=None
class Graph:
def __init__(self):
self.size=0
self.items = []
def isEmpty(self):
return self.size == 0
def insert_vertex(self,v):
self.items.append([v,None])
self.size+=1
for i in range(self.size):
if self.items[i][0]==vi:
newNode=Node(vj,wt)
if self.items[i][1] is None:
self.items[i][1]=newNode
else:
last=self.items[i][1]
while last.next:
last=last.next
last.next=newNode
def search(self,v):
for i in range(self.size):
if self.items[i][0]==v:
return True
return False
def delete_vertex(self,v):
for i in range(self.size):
curNode=self.items[i][1]
if curNode.vName==v:
self.items[i][1]=curNode.next
else:
prev=None
while curNode:
if curNode.vName==v:
prev.next=curNode.next
del curNode
break
else:
prev=curNode
curNode=curNode.next
for i in range(self.size):
if self.items[i][0]==v:
del self.items[i]
self.size-=1
break
print(“Vertex Removed”)
for i in range(self.size):
if self.items[i][0]==vi:
curNode=self.items[i][1]
if curNode.vName==vj:
self.items[i][1]=curNode.next
else:
prev=None
while curNode.vName!=vj:
prev=curNode
curNode=curNode.next
prev.next=curNode.next
del curNode
def display(self):
for i in range(self.size):
print(self.items[i][0],“:”,end=‘ ’)
curNode=self.items[i][1]
print(curNode.vName,“(”,curNode.weight,“)”,end=‘
-> ’)
curNode=curNode.next
print(“None”)
g=Graph()
while(True):
print(“============================================”)
print(“\t5. Display”)
print(“\t6. Exit”)
print(“============================================”)
choice=int(input(“Enter your Choice : ”))
if choice==1 :
if g.search(v):
continue
g.insert_vertex(v)
elif choice==2 :
if not g.search(vs):
continue
if not g.search(ve):
continue
g.insert_edge(vs,ve,wt)
elif choice==3 :
continue
g.delete_vertex(v)
elif choice==4 :
if not g.search(vs):
continue
if not g.search(ve):
continue
g.delete_edge(vs,ve)
print(“Edge Removed”)
elif choice==5 :
if g.isEmpty() :
print(“Graph is Empty”)
else :
g.display()
elif choice==6 :
print(“\nQuiting.......”)
break
else:
continue
Traversal of a graph means examining or reading data from each and every
vertex and edge of the graph. There are two standard algorithms to traverse
a graph. These are: Breadth First Search (BFS) and Depth First Search (DFS).
In Breadth First Search (BFS) algorithm, starting from a source vertex all
the adjacent vertices are traversed first. Then the adjacent vertices of these
traversed vertices are traversed one by one. This process continues until all
the vertices are traversed. This algorithm is quite similar to the level order
traversal of a tree. But the difference is that a graph may contain cycles;
thus we may traverse a vertex again and again. To solve this problem we
may use a flag variable for each vertex to denote whether the vertex is
previously traversed or not. We may set 0, 1, and 2 to this flag variable;
where 0 denotes the vertex is not traversed at all, 1
denotes the vertex is in the queue, and 2 denotes all the adjacent vertices of
the vertex are
traversed. Similar to the
level order traversal of a tree, here also we need to use the queue data
structure. The general algorithm of BFS is given here:
1. Create a queue and set the flag of every vertex to 0.
2. Enqueue the source vertex and set its flag to 1.
3. Repeat until the queue becomes empty:
a. Dequeue a vertex, v.
b. Enqueue every adjacent vertex of v whose flag is 0 and set the flag of each such vertex to 1.
c. Print v and set its flag to 2.
Now consider the following graph and we want to traverse using the BFS
method considering source vertex as v1.
v2
v4
v1
v3
v5
First we create a queue and a dictionary named flag. Within the dictionary
we insert all the vertices as key and set 0 to all the elements.
Vertex
v1
v2
v3
v4
v5
Flag Value
We start the algorithm by enqueueing the source vertex v1 and update the
flag value of v1
to 1.
Queue
v1
Vertex
v1
v2
v3
v4
v5
Flag Value
Next dequeue from queue and we get v1. Now we traverse the adjacent list
of v1. v2 and v3 are the adjacent vertices of v1 and flag value of both
vertices is 0. Thus we enqueue both vertices and set their corresponding
flags as1. The next task is to print this dequeued element and set its flag
value = 2 as this vertex is completely processed.
Queue
v2
v3
Vertex
v1
v2
v3
v4
v5
Flag Value
v1
This last step repeats until the queue becomes empty. Thus in the next
iteration, v2 is dequeued. Its adjacent vertices are v1, v3, and v4. But the
flag value of v1 is 2, which means it is completely processed; the flag value
of v3 is 1, which means it is already in the queue.
Thus we enqueue only v4 as its flag value is 0. After inserting we change its
flag value as 1.
Queue
v3
v4
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v2
Queue
v4
v5
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v2 v3
Queue
v5
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v2 v3 v4
Queue
Vertex
v1
v2
v3
v4
v5
Flag Value
2
2
v1 v2 v3 v4 v5
Now the queue becomes empty and we get the BFS traversal order of
vertices as v1 v2 v3
v4 v5. Remember, this order is not unique for any graph. Suppose, if we are
processing the vertex v1, instead of v2 one may insert v3 first and then v2.
Then we get a different order.
def BFS(self,s):
q=Queue()
self.flag={}
for i in range(self.size):
self.flag[self.items[i][0]]=0
for i in range(self.size):
if self.items[i][0]==s:
break
q.enqueue(s)
self.flag[s]=1
while not q.isEmpty():    # assuming the Queue class of the queue chapter provides isEmpty()
v=q.dequeue()
for i in range(self.size):
if self.items[i][0]==v:
break
next=self.items[i][1]
while next:
if self.flag[next.vName]==0:
q.enqueue(next.vName)
self.flag[next.vName]=1
next=next.next
print(v,end=‘ ’)
self.flag[v]=2
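Assuming this BFS method is added to the adjacency-list Graph class shown earlier in this chapter (and that the Queue class provides enqueue, dequeue, and isEmpty), the five-vertex example above might be driven as follows; the exact order printed depends on the order in which the edges were inserted:
g = Graph()
for v in ['v1', 'v2', 'v3', 'v4', 'v5']:
    g.insert_vertex(v)
# one possible edge set consistent with the walkthrough above
for vi, vj in [('v1', 'v2'), ('v1', 'v3'), ('v2', 'v3'),
               ('v2', 'v4'), ('v3', 'v4'), ('v3', 'v5')]:
    g.insert_edge(vi, vj)
g.BFS('v1')     # prints: v1 v2 v3 v4 v5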
So, starting from a source vertex we push all the adjacent vertices into the
stack. Then we pop a vertex from the stack and move to that vertex. The
same operation is done for that vertex also. This process continues until the
stack becomes empty. Like BFS, here too we need to use a flag variable for
each vertex to denote whether the vertex is previously traversed or not. We
may set 0, 1, and 2 to these flag variable, where 0 denotes the vertex is not
traversed at all, 1 denotes the vertex is in the stack, and 2 denotes all the
adjacent vertices of the vertex are traversed. The general algorithm of DFS
is given here: 1. Create a stack.
Now consider the following graph and we want to traverse using the DFS
method considering source vertex as v1.
v2
v4
v1
v3
v5
First we create a stack and a dictionary named flag. Within the dictionary
we insert all the vertices as key and set 0 to all the elements.
v1
v2
v3
v4
v5
Flag Value
At the very beginning we push the source vertex v1 into the stack and
update the flag value of v1 to 1.
Stack
v1
Vertex
v1
v2
v3
v4
v5
Flag Value
Next pop from stack and we get v1. Now traversing the adjacent list of v1,
we find v2 and v3 are the adjacent vertices of v1 and flag value of both
vertices is 0. Thus we push both the vertices and set their corresponding
flags as 1. The next task is to print this popped element and set its flag value
= 2 as this vertex is completely processed.
Stack
v2
v3
Vertex
v1
v2
v3
v4
v5
Flag Value
v1
This last step repeats until the stack becomes empty. Thus in the next
iteration, v3 is popped.
Its adjacent vertices are v1, v2, v4, and v5. But the flag value of v1 is 2,
which means it is completely processed, and the flag value of v2 is 1, which
means it is already in the stack.
Thus we push v4 and v5 as their flag values are 0 and update their flag
values to 1. Now we print v3 and set its flag value as 2.
Stack
v2
v4
v5
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v3
Similarly, in the next iteration the popped element is v5. Among its adjacent
vertices there is no vertex whose flag value is 0; thus no element is pushed.
Now v5 is printed and its flag value becomes 2.
Stack
v2
v4
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v3 v5
Stack
v2
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v3 v5 v4
Stack
Vertex
v1
v2
v3
v4
v5
Flag Value
v1 v3 v5 v4 v2
Now the stack becomes empty and we get the DFS traversal order of
vertices as v1 v3 v5
v4 v2. Remember, in case of DFS also this order is not unique for any
graph. At the time of processing the vertex v1, instead of v2 one may insert
v3 first and then v2. Then we get a different order. This is true for all other
vertices as well.
def DFS(self,s):
st=Stack()
self.flag={}
for i in range(self.size):
self.flag[self.items[i][0]]=0
for i in range(self.size):
break
st.push(s)
self.flag[s]=1
v=st.pop()
for i in range(self.size):
if self.items[i][0]==v:
break
next=self.items[i][1]
if self.flag[next.vName]==0:
st.push(next.vName)
self.flag[next.vName]=1
next=next.next
print(v,end=‘ ’)
self.flag[v]=2
A spanning tree of a graph is a subset of that graph where all the vertices of
the graph are connected with the minimum number of edges. As it does not
have any cycle, it is called a tree. Another thing is that all the vertices are
connected through edges; so, we can find a spanning tree only if the graph
is connected. In a connected graph, there may be any number of spanning
trees but at least one spanning tree should exist. In a spanning tree, the
number of edges is always one less than the number of vertices.
If the graph is a weighted graph, then we may find the minimum spanning
tree. A minimum spanning tree of a weighted graph is a subset of the graph
where all the vertices of the graph are connected with the minimum number
of edges in such a way that the sum of the weight of the edges is minimum.
In other words, a minimum spanning tree is the spanning tree whose total
weight of the tree is the minimum compared to all other spanning trees of
the graph. Consider the following weighted graph.
(Diagram: a weighted graph with four vertices, v1–v4.)
(Diagrams: four spanning trees, (a)–(d), of the above graph, with total weights 12, 11, 9, and 10, respectively.)
All the above four trees are the spanning trees of the above graph but the
one in (c) has the smallest total weight. Hence it is the minimum spanning
tree of the above graph.
In the following sections we will discuss two popular algorithms to find the
minimum spanning tree. These are Prim’s algorithm and Kruskal’s
algorithm.
Prim’s algorithm is one of the most popular algorithms to find the minimum
spanning tree of a connected weighted graph. It is a greedy algorithm that
may start with any arbitrary vertex. The algorithm starts with an empty
spanning tree and in each step one edge is added to this spanning tree. After
choosing the starting vertex our task is to find the edge with the smallest
weight among the edges that are connected to that starting vertex. This edge
and the corresponding vertex are then added to the spanning tree. In the
next step we will find the edge with the smallest weight among the edges
that are connected to the vertices that are already in the spanning tree and
obviously not already included in the minimum spanning tree. If the
inclusion of this new smaller edge does not form a cycle within the
spanning tree, we will include the edge and its corresponding vertex. This
process continues until all the vertices of the original graph have been
added to the spanning tree. Prim’s algorithm can be represented as:
1. Create an empty spanning tree, T.
2. Choose any arbitrary vertex as the starting vertex and add it to T.
3. Find the edge with the smallest weight among the edges connected to the starting vertex and add this edge, together with its other end vertex, to T.
4. Repeat while T does not contain all the vertices of the given graph:
a. Consider the vertices that are already in T.
b. Consider the edges that are connected to these vertices but are not included in T.
c. Among these edges, choose the one with the smallest weight that does not form a cycle, and add it and its end vertex to T.
The running time of Prim's algorithm can be calculated as O(V log V + E log V), where E is the number of edges and V is the number of vertices. This can be simplified to O(E log V), since every insertion of a node in the solution path takes logarithmic time.
(Diagram: the weighted graph with six vertices, v1–v6, used in this example.)
Solution:
(Steps 1 and 2: start with an empty tree and add the arbitrary starting vertex v1.)
Step 3: Now we have to consider the edges that are connected to v1 and
among them (v1,v2) is with the smallest weight. This edge and the
corresponding vertex v2 would be added.
(Diagram: the tree after adding the edge (v1, v2).)
Step 4: Now we would consider all the edges that are connected to v1 or v2
except the edge (v1,v2). There are two edges, (v2,v4) and (v1,v3); both have weight 3 and are smaller than the others. We may consider any one among
them. Here, we are considering (v2,v4).
(Diagram: the tree after adding the edge (v2, v4).)
Step 5: Next we need to consider all the edges that are connected to v1, v2,
or v4 except the edges (v1,v2) and (v2,v4). Now (v1,v3) is the smallest one.
So, we add it.
(Diagram: the tree after adding the edge (v1, v3).)
Step 6: In the next step, we consider the edges connected to v1, v2, v3, or
v4, excluding the edges (v1,v2), (v2,v4), and (v1,v3). Now there are two edges, (v2,v3) and (v4,v5); both have weight 4 and are smaller than the others. If we consider (v2,v3), it will form a cycle, so we add the edge (v4,v5) and the vertex v5 instead.
(Diagram: the tree after adding the edge (v4, v5).)
Step 7: Among the edges that are connected to v1, v2, v3, v4, or v5, but not
already included in the spanning tree, (v5, v6) is the smallest and we
include it.
(Diagram: the tree after adding the edge (v5, v6).)
Step 8: As all the vertices of the original graph are now included in the
spanning tree, this is our final spanning tree of the given graph.
def Prims(adjMatrix):
    vertex=len(adjMatrix)
    mstMatrix=[[0 for col in range(vertex)] for row in range(vertex)]
    selectedvertex=[False for i in range(vertex)]
    largeNum = float('inf')
    while(False in selectedvertex):
        minimum = largeNum
        start = 0
        end = 0
        for i in range(0,vertex):
            if selectedvertex[i]:
                for j in range(0+i,vertex):
                    if ((not selectedvertex[j]) and
                            minimum > adjMatrix[i][j] > 0):
                        minimum = adjMatrix[i][j]
                        start, end = i, j
        selectedvertex[end] = True
        mstMatrix[start][end] = minimum
        if minimum == largeNum:
            mstMatrix[start][end] = 0
        mstMatrix[end][start] = mstMatrix[start][end]
    return mstMatrix

v=int(input('Enter the number of vertices: '))
adj=[[0 for col in range(v)] for row in range(v)]
for i in range(0,v):
    for j in range(0+i,v):
        adj[i][j]=int(input('Weight of the edge ({},{}) (0 if none): '.format(i,j)))
        adj[j][i] = adj[i][j]
mst=Prims(adj)
for i in range(v):
    print(mst[i])
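The function can also be tested on a hard-coded matrix. The 4-vertex graph below is our own small example (0 means "no edge"), not the graph from the figure:
adj = [[0, 2, 3, 0],
       [2, 0, 4, 3],
       [3, 4, 0, 4],
       [0, 3, 4, 0]]
for row in Prims(adj):
    print(row)
# The non-zero entries of the printed matrix are the edges of the
# minimum spanning tree: (0,1) with weight 2, (0,2) with weight 3,
# and (1,3) with weight 3, giving a total weight of 8.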
Kruskal's algorithm is another popular greedy algorithm for finding the minimum spanning tree of a connected weighted graph. It can be represented as:
1. Create an empty graph, T, and add all the vertices of the given graph to it.
2. Create a list containing all the edges of the given graph.
3. Sort the list in ascending order of weight.
4. Repeat until T contains (n – 1) edges, where n is the number of vertices:
a. Select the edge with the lowest weight from the list and remove it from the list.
b. If adding this edge to T does not form a cycle, add it; otherwise discard it.
(Diagram: the weighted graph with six vertices, v1–v6, used in this example.)
Solution:
Step 1: Taking an empty graph and adding all the vertices to it.
(Diagram: the six isolated vertices v1–v6.)
Step 2: Next we create a list containing all the edges of the given graph and
sort them in ascending order of weight.
(v1,v2)
(v5,v6)
(v1,v3)
(v2,v4)
(v2,v3)
(v4,v5)
(v2,v5)
(v3,v5)
(v4,v6)
Step 3: Now we have to add the edge that has the least weight. There are
two edges that have the least weight. These are (v1,v2) and (v5,v6). Both
have weight 2. We can choose any one of them. Here we are choosing
(v1,v2). It will be added in the graph and removed from the list.
(Diagram: the graph after adding the edge (v1,v2); the remaining sorted list is (v5,v6), (v1,v3), (v2,v4), (v2,v3), (v4,v5), (v2,v5), (v3,v5), (v4,v6).)
Step 4: Now the edge (v5,v6) has the least weight in the list. Hence it will be added to the graph and removed from the list.
(Diagram: the graph after adding the edge (v5,v6); the remaining list is (v1,v3), (v2,v4), (v2,v3), (v4,v5), (v2,v5), (v3,v5), (v4,v6).)
Step 5: The next edge with lowest weight in the list is (v1,v3). Hence it will
be added in the graph and removed from the list.
(Diagram: the graph after adding the edge (v1,v3); the remaining list is (v2,v4), (v2,v3), (v4,v5), (v2,v5), (v3,v5), (v4,v6).)
Step 6: Now the edge (v2,v4) has the least weight in the list. Hence it will
be added in the graph and removed from the list.
(Diagram: the graph after adding the edge (v2,v4); the remaining list is (v2,v3), (v4,v5), (v2,v5), (v3,v5), (v4,v6).)
Step 7: Again there are two edges that have the least weight in the list.
These are (v2,v3) and (v4,v5). But inclusion of the edge (v2,v3) forms a
cycle. Thus we have to discard it. So, in next step, (v4,v5) will be added in
the graph and removed from the list.
(Diagram: the graph after adding the edge (v4,v5); the remaining list is (v2,v5), (v3,v5), (v4,v6).)
Step 8: Now the number of edges in the graph becomes 5, which is 1 less than the total number of vertices. Hence the algorithm terminates here and we get the required minimum spanning tree.
class Graph:
self.vertex = vertex
self.edges = []
536 Data Structures and Algorithms Using Python def search(self, parent,
i):
if parent[i] == i:
return i
root_a = self.search(parents, a)
root_b = self.search(parents, b)
parents[root_a] = root_b
parents[root_b] = root_a
else:
parents[root_b] = root_a
rank[root_a] += 1
def Kruskals(self):
mst = []
i, e = 0, 0
element: element[2])
parents = []
rank = []
parents.append(node)
rank.append(0)
i=i+1
a = self.search(parents, vi)
b = self.search(parents, vj)
if a != b:
e=e+1
self.doUnion(parents, rank, a, b)
minCost=0
minCost+=wt
print(“Cost of MST=”,minCost)
Graphs 537
v=int(input(‘Enter the number of vertices: ’))
gr = Graph(v)
ch=“y”
while(ch.upper()==“Y”):
gr.add_edge(i,j,wt)
ch=input(‘Continue?(y/n)’)
gr.Kruskals()
The shortest path between two vertices in a graph indicates the path
between these two vertices having the lowest weight; there is no other path
between these vertices with a lower weight than this path. Finding the
shortest path is a very interesting and useful study under graphs. The
shortest path can be found between two vertices or between one source
vertex and any other vertex or between any vertex and any other vertex in a
graph.
Dijkstra’s algorithm is used for the single source shortest path problem.
More precisely it is used to find the shortest path between one vertex and
every other vertex for any weighted graph – directed or undirected. This
algorithm is very useful in networking for routing protocols as well as to
find the shortest path between one city and any other cities or to find the
path to set a waterline system, etc. The disadvantage of this algorithm is
that it may not work properly for negative weights. Dijkstra’s algorithm can
be represented as:
1. Take a square matrix of size equal to the number of vertices + 1.
3. Fill the other columns of the first row with the names of the vertices.
4. Next rows will contain the corresponding shortest distance from the
source vertex.
5. Initially set all columns except the first column of next row, i.e. the
second row, with infinity except the column under source vertex, which will
store 0.
6. Find the smallest value in the row and fill the first column with the vertex
name of the smallest distance.
7. Consider the vertex with the smallest value as u and visit all its adjacent
vertices (v).
Above algorithm creates a table which shows the shortest distance of all
other vertices from the source vertex. To find the shortest path between
source vertex and given vertex, 1. First we have to find the vertex from the
first row.
2. Find the last written value at the corresponding column. This represents
the shortest distance.
4. On that row find the marked smallest value and the corresponding vertex
name.
5. Note this vertex as the previous vertex of the destination vertex in the
path.
6. From this point again move upward and follow the same procedure (i.e.
steps 3 to 5) to get its previous node. This process continues until we reach
the source vertex.
Now consider the following example to find the shortest path using
Dijkstra’s algorithm: Example 12.3: Find the shortest path from vertex v1
to all other vertices in the following graph using Dijkstra’s algorithm:
(Diagram: the weighted graph with six vertices, v1–v6, used in this example.)
Solution:
(Table: the Dijkstra table built row by row — the first row lists the vertices v1–v6, each subsequent row records the current shortest distances from v1, and the first column records the vertex selected at each step.)
From the above table it is clear that the shortest distance from v1 to v2 is 3,
from v1 to v3
From this point we are moving upward and the first ∞ is found in its
previous row. The smallest value in that row is 4 and the corresponding
vertex is v4. Thus we get the previous vertex of v6.
v4 -> v6
From this point we again move upward and find ∞ in the row whose first
column contains v3. Thus we get
v3 -> v4 -> v6
Again moving upward, we find ∞ in its previous row and its first column
contains the source vertex v1. Thus the final shortest path between v1 to v6
is: v1 -> v3 -> v4 -> v6
class Node:
def __init__(self,vName,weight):
self.vName=vName
self.weight=weight
self.next=None
class Graph:
def __init__(self):
self.vertex=0
self.items = []
def isEmpty(self):
return self.vertex == 0
def insert_vertex(self,v):
self.items.append([v,None])
self.vertex+=1
def insert_edge(self,vi,vj,wt):
for i in range(self.vertex):
if self.items[i][0]==vi:
newNode=Node(vj,wt)
if self.items[i][1] is None:
self.items[i][1]=newNode
else:
last=self.items[i][1]
while last.next:
last=last.next
last.next=newNode
def search(self,v):
for i in range(self.vertex):
if self.items[i][0]==v:
return True
return False
def Dijkstra(self,source):
selectedvertices=[]
largeNum = float(‘inf’)
for i in range(self.vertex):
temp[0][i+1]= self.items[i][0]
i=0
while(self.items[i][0]!=source):
i=i+1
temp[1][i+1]=0
row=1
while(row<=self.vertex):
minm=temp[row][1]
j=1
for i in range(1,self.vertex+1):
if temp[row][i]<minm:
minm=temp[row][i]
j=i
minVertex=temp[0][j]
temp[row][0]=minVertex
selectedvertices.append(minVertex)
curNode=self.items[j-1][1]
if row==self.vertex:
break
for i in range(1,self.vertex):
temp[row+1][i]=temp[row][i]
while curNode:
vName=curNode.vName
k=1
while(temp[0][k]!=vName):
k=k+1
cumWt=temp[row][j]+curNode.weight
if cumWt<temp[row][k]:
temp[row+1][k]= cumWt
curNode=curNode.next
row=row+1
for i in range(1,self.vertex+1):
print(temp[i])
vertices:-’)
path=[]
r=row
flag=True
while(r>0):
minm=largeNum
for i in range(1,self.vertex+1):
if temp[r][i]<minm:
minm=temp[r][i]
j=i
path.append(temp[r][0])
if flag:
minDist=minm
flag=False
while(temp[r][j]==curVal):
r=r-1
l=len(path)-1
while(l>=0):
print(path[l]+‘->’,end=‘’)
l=l-1
print(‘\b\b : ’,minDist)
gr = Graph()
for i in range(v):
’.format(i+1))
gr.insert_vertex(vert)
ch=“y”
while(ch.upper()==“Y”):
gr.insert_edge(vs,ve,wt)
ch=input(‘Continue?(y/n)’)
gr.Dijkstra(sr)
As the time complexity to find the single source shortest path using Dijkstra's algorithm in a graph of n vertices is O(n²), for all pairs of vertices it will be O(n²) × n, i.e. O(n³).
Another drawback of this algorithm is that it does not work properly for
negative weight.
(Diagram: a weighted directed graph with four vertices, 1–4.)
The adjacency matrix of the graph is:
        1  2  3  4
   1    0  2  5  0
   2    4  0  0  4
   3    0  2  0  3
   4    5  0  4  0
From this adjacency matrix we prepare the initial matrix, A0. In this matrix, if there is a direct path between two vertices, vi and vj, the distance between vi and vj (i.e. the weight of the edge) is stored; otherwise the entry is ∞. The diagonal elements contain 0. So, we get:
A0 =
        1  2  3  4
   1    0  2  5  ∞
   2    4  0  ∞  4
   3    ∞  2  0  3
   4    5  ∞  4  0
Next we consider vertex 1 as the intermediate vertex and create the matrix A1 by copying the first row and the first column from the A0 matrix; if there is no self-loop, the diagonal elements will contain 0 (zero). The remaining elements are obtained by comparing the existing distance with the distance via vertex 1:
A1[2,3] = min(A0[2,3], A0[2,1] + A0[1,3]) = min(∞, 4 + 5) = 9
A1[2,4] = min(4, 4 + ∞) = 4        A1[3,2] = min(2, ∞ + 2) = 2
A1[3,4] = min(3, ∞ + ∞) = 3        A1[4,2] = min(∞, 5 + 2) = 7
A1[4,3] = min(4, 5 + 5) = 4
So, we get:
A1 =
        1  2  3  4
   1    0  2  5  ∞
   2    4  0  9  4
   3    ∞  2  0  3
   4    5  7  4  0
Now the same operation is done with vertex 2 as the intermediate vertex. This time we copy the elements of the second row and second column of the A1 matrix, and all the diagonal elements remain 0. The rest of the elements are calculated in the same way:
A2[1,3] = min(5, 2 + 9) = 5        A2[1,4] = min(∞, 2 + 4) = 6
A2[3,1] = min(∞, 2 + 4) = 6        A2[3,4] = min(3, 2 + 4) = 3
A2[4,1] = min(5, 7 + 4) = 5        A2[4,3] = min(4, 7 + 9) = 4
So, we get:
A2 =
        1  2  3  4
   1    0  2  5  6
   2    4  0  9  4
   3    6  2  0  3
   4    5  7  4  0
Now we consider vertex 3 as the intermediate vertex and create the A3 matrix. For that we first copy the elements of row 3 and column 3 and the diagonals of the A2 matrix. Calculations for the rest of the elements are as follows:
A3[1,2] = min(2, 5 + 2) = 2        A3[1,4] = min(6, 5 + 3) = 6
A3[2,1] = min(4, 9 + 6) = 4        A3[2,4] = min(4, 9 + 3) = 4
A3[4,1] = min(5, 4 + 6) = 5        A3[4,2] = min(7, 4 + 2) = 6
So, we get:
A3 =
        1  2  3  4
   1    0  2  5  6
   2    4  0  9  4
   3    6  2  0  3
   4    5  6  4  0
Finally we consider vertex 4 as the intermediate vertex and create the A4 matrix. Again we copy the elements of row 4 and column 4 and the diagonals of the A3 matrix, and calculate the rest:
A4[1,2] = min(2, 6 + 6) = 2        A4[1,3] = min(5, 6 + 4) = 5
A4[2,1] = min(4, 4 + 5) = 4        A4[2,3] = min(9, 4 + 4) = 8
A4[3,1] = min(6, 3 + 5) = 6        A4[3,2] = min(2, 3 + 6) = 2
So, the final matrix of shortest distances between every pair of vertices is:
A4 =
        1  2  3  4
   1    0  2  5  6
   2    4  0  8  4
   3    6  2  0  3
   4    5  6  4  0
To write the code we may use the following formula to find the elements of the successive matrices:
Ak[i][j] = min( Ak–1[i][j], Ak–1[i][k] + Ak–1[k][j] )
def FloydWarshall(adjMatrix):
    vertex=len(adjMatrix)
    spMatrix=[[0 for col in range(vertex)] for row in range(vertex)]
    for i in range(vertex):
        for j in range(vertex):
            if i==j:
                spMatrix[i][j]=0
            else:
                spMatrix[i][j]=adjMatrix[i][j]
    for k in range(vertex):
        for i in range(vertex):
            for j in range(vertex):
                spMatrix[i][j]=min(spMatrix[i][j],
                                   spMatrix[i][k]+spMatrix[k][j])
    return spMatrix
INF = float(‘inf’)
adj = [[INF for col in range(v)] for row in range(v)]
ch=“y”
while(ch.upper()== “Y”):
Graphs 549
adj[vs][ve]=wt
ch=input(‘Continue?(y/n)’)
spm=FloydWarshall(adj)
for i in range(v):
print(spm[i])
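As a quick check against the worked example above, the adjacency matrix of that four-vertex graph can be passed to the function directly. The short driver below is our own illustration, not part of the book's program:
INF = float('inf')
adj = [[0,   2,   5,   INF],
       [4,   0,   INF, 4  ],
       [INF, 2,   0,   3  ],
       [5,   INF, 4,   0  ]]
for row in FloydWarshall(adj):
    print(row)
# Expected output, matching matrix A4 above:
# [0, 2, 5, 6]
# [4, 0, 8, 4]
# [6, 2, 0, 3]
# [5, 6, 4, 0]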
Graph at a Glance
✓ If the edges of a graph are associated with some directions, the graph is
known as a directed graph or digraph.
✓ If a path ends at a vertex from which the path started, the path is called
cycle.
✓ If there is a path between any two vertices of a graph, then the graph is
known as a connected graph.
✓ If any edge has the same end points, then it is called a loop or self-loop.
✓ If there is any such vertex whose deletion makes the graph disconnected,
then such a vertex is known as a cut vertex.
✓ In memory, there are two ways to represent a graph using arrays. These
are: using adjacency matrix and using incidence matrix.
✓ To represent a graph using linked lists there are also two ways. These are:
using adjacency list and using adjacency multi-list.
✓ Breadth First Search (BFS) and Depth First Search (DFS) are two most
common graph traversal algorithms.
✓ BFS algorithm uses queue and DFS algorithm uses stack as auxiliary
data structure.
✓ Dijkstra’s algorithm is used for the single source shortest path problem
whereas the Floyd–Warshall algorithm is used to find the shortest path
between all pairs of vertices.
a) e = v + 1
b) v = e + 1
c) v = e
d) v = e – 1
4. Which algorithm is used to find the shortest path among all pairs of
vertices?
a) Dijkstra’s algorithm
b) Floyd–Warshall algorithm
c) Prim’s algorithm
d) Kruskal’s algorithm
5. Which algorithm is used to find the shortest path from a particular vertex
to all other vertices?
a) Dijkstra’s algorithm
b) Floyd–Warshall algorithm
c) Prim’s algorithm
d) Kruskal’s algorithm
a) Greedy method
b) Dynamic programming
d) Backtracking
a) Linked list
b) Stack
c) Queue
d) Tree
a) Linked list
b) Stack
c) Queue
d) Tree
9. The vertex whose in-degree is greater than 0 but out-degree is 0 is known as a) Isolated vertex
b) Source vertex
c) Sink vertex
d) None of these
b) Source vertex
c) Sink vertex
d) None of these
b) Source vertex
c) Sink vertex
d) None of these
a) Greedy method
b) Dynamic Programming
d) Backtracking
a) Greedy method
b) Dynamic Programming
14. If all the vertices of a graph are connected to each other, it is known as
a) Connected graph
b) Complete graph
c) Forest
d) None of these
b) 2n – 1
c) 2n – 1
d) None of these
Review Exercises
1. Define a graph.
[Figure: an undirected graph with vertices v1–v6 and edges e1–e9, referred to by the following questions.]
[Figure: a graph with vertices v1–v5 and edges e1–e8, referred to by the following questions.]
6. Find the adjacency matrix, incidence matrix, and adjacency list of the following weighted graph:
[Figure: a weighted undirected graph with vertices v1–v6, edges e1–e7, and edge weights including 8, 9, 10, 11, 12, and 15.]
7. Find the adjacency matrix, incidence matrix, and adjacency list of the following weighted directed graph:
[Figure: a weighted directed graph with vertices v1–v6, edges e1–e10, and edge weights including 10, 11, 12, 14, 15, and 20.]
8. Consider the graph given in question no. 6. Find the degree of each
vertex.
9. Consider the graph given in question no. 7. Find the in-degree and out-
degree of each vertex.
11. Consider the graph given in question no. 5. Is there any source and
sink? Explain.
12. Consider the graph given in question no. 7. Is there any source and
sink? Explain.
14. Consider the graph given in question no. 7. Find out the DFS and BFS
paths in this graph.
16. Find out the minimum spanning tree from the following graph using Prim's algorithm.
[Figure: a weighted graph with vertices v1–v6, edges e1–e10, and edge weights including 10, 11, 12, 14, 15, and 20.]
17. Find out the minimum spanning tree from the graph given in question
no. 16 using Kruskal’s algorithm.
18. Consider the graph given in question no 16. Find the shortest path a.
from v1 to v6
b. from v5 to v6
c. from v5 to v4
19. Consider the graph given in question no. 16. Find the shortest distance
among all pair of vertices.
7. Write a function to find the shortest path between two given vertices.
Chapter 13
Searching and Sorting
Every algorithm has some merits and demerits. In this chapter we will discuss some important searching algorithms as well as sorting algorithms. We will discuss the working principles of these algorithms in detail, show how they can be implemented in Python, derive their time complexities, and finally compare them with each other.
• Linear Search
• Binary Search
• Interpolation Search
13.1.1 Linear Search
The algorithm which checks each element of a list starting from the first
position is known as Linear Search. As this algorithm checks each element
in a list sequentially it is also known as Sequential Search. When the
elements in a list are unsorted, i.e. not in proper order, this searching
technique is used. Suppose we have the following list of elements:
arr:  25  12  57  40  63  32
And the key element is 40, i.e. we want to search whether 40 is present in
the list or not and if it is present, then we also need to know its position in
the list. In case of linear search, we start checking from the beginning of the
list, i.e. from the index position 0. So, we first compare whether arr[0]==40.
As the comparison fails, we check for the next index position.
Now arr[1] contains 12. So, the comparison arr[1]==40 becomes false. Next
we need to compare with the value of next index position, i.e. arr[2], and
the comparison arr[2]==40
becomes false again. Again we compare the key element with the next
index position, i.e.
whether arr[3]==40 Now the condition becomes true and the search
operation terminates here returning the index position 3.
But what happens if the element is not present in the list at all? Let us see.
Suppose the key element is 50. The search operation starts from index
position 0. The element in the zeroth position is 25 here. So the first
comparison fails and we need to move to the next index
position. Again the comparison arr[1]==50 becomes false and we move to
the next index position. In this way we compare the value of each index
position sequentially and all the comparisons become false. When the last
comparison, i.e. arr[5]==50, becomes false, we can say that the search fails
and the key element, 50, is not present in the list.
Here List is either an array or a Python list and Key_Value is the element to
be searched.
1. Set Index = 0
2. Repeat while Index < length of List:
   a. If List[Index] == Key_Value, then
      i. Return Index
   b. Index = Index + 1
3. Return -1
Here the algorithm is written as a function that returns the corresponding
index position when the element is found in the list; otherwise, it returns -1
to indicate the failure of the search operation.
def linear_search(list, key):
    size = len(list)
    index = 0
    while index < size:
        if list[index] == key:
            return index
        index = index + 1
    return -1

# driver reconstructed: reads the list, searches for a key, and reports the result
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
key = int(input('Enter the element to be searched: '))
position = linear_search(lst, key)
if position == -1:
    print('Element not found')
else:
    print('Element found at position', position)
If we want to move to page number 359 of a book of, say, 500 pages, first we move to an arbitrary page which is at nearly the middle of the book. Suppose the page number of this page is 273. As the required page number is greater than this, it should be in the right half of the book and we need not check between pages 1 and 273.
So we repeat the same operation on the portion that consists of page
numbers 274 to 500. Hence we again move to an arbitrary page that is
nearly at the middle of this portion and, say, it is page number 403. So we
need to check again in between 274 and 402. This process continues until
we reach the required page. This is also true for finding any word from a
dictionary or finding any name from a voter list, etc.
Consider the following example. We have the following list of elements and
the search key element is 32.
arr:    12  25  32  40  57  63  76
index:   0   1   2   3   4   5   6
Initially, Low = 0 and High = 6, so Mid = (Low+High)//2 = (0+6)//2 = 3 and arr[Mid] = 40. As the key element, 32, is less than arr[Mid], we need to search within the left half of the array/list. So, the lower bound of the array/list remains the same and the new upper bound would be High = Mid-1 = 3-1 = 2. Now Mid = (0+2)//2 = 1 and arr[Mid] = 25. As the key element, 32, is greater than arr[Mid], we need to search within the right half of the sub-array/list. So, the upper bound of the sub-array/list remains the same and the new lower bound would be Low = Mid+1 = 1+1 = 2. Now Mid = (2+2)//2 = 2 and arr[Mid] = 32, which matches the key. This indicates that the search element is found at position 2, i.e. the final position of Mid.
Now, we consider another example where the key element is not present in the array/list. Suppose the key element is 35. Proceeding in the same way, we get Low = 0 and High = 6, so Mid = 3 and arr[Mid] = 40. As the key element, 35, is less than arr[Mid], we need to search within the left half of the array/list. Thus, High = Mid-1 = 3-1 = 2. Now Mid = 1 and arr[Mid] = 25. As the key element, 35, is greater than arr[Mid], we need to search within the right half of the sub-array/list. Hence, Low = Mid+1 = 2. Now Mid = 2 and arr[Mid] = 32. As the key element, 35, is again greater than arr[Mid], we set Low = Mid+1 = 3. Now Low > High, so the loop terminates and -1 is returned, indicating that the element is not present in the list.
Here List is either a sorted array or Python list and Key_Value is the element
to be searched.
1. Set Low = 0
2. Set High = length of List – 1
3. Repeat while Low <= High:
   a. Mid = (Low+High)/2
   b. If Key_Value == List[Mid], then
      i. Return Mid
   c. Else if Key_Value < List[Mid], then
      i. High = Mid – 1
   d. Else,
      i. Low = Mid + 1
4. Return -1
def binary_search(list, key):
    low = 0
    high = len(list)-1
    while low <= high:
        mid = (low+high)//2
        if key == list[mid]:
            return mid
        elif key < list[mid]:
            high = mid-1
        else:
            low = mid+1
    return -1

# driver reconstructed: reads a sorted list, searches for a key, and reports the result
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
key = int(input('Enter the element to be searched: '))
position = binary_search(lst, key)
if position == -1:
    print('Element not found')
else:
    print('Element found at position', position)
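As a side note, Python's standard library already provides binary search on sorted lists through the bisect module; the helper below is a small sketch (the function name is ours) showing how bisect_left can reproduce the behaviour of binary_search above.

from bisect import bisect_left

def binary_search_bisect(sorted_list, key):
    i = bisect_left(sorted_list, key)        # index of the first element >= key
    if i < len(sorted_list) and sorted_list[i] == key:
        return i
    return -1

print(binary_search_bisect([12, 25, 32, 40, 57, 63, 76], 32))   # 2
print(binary_search_bisect([12, 25, 32, 40, 57, 63, 76], 35))   # -1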
To analyse the running time, let T(n) be the number of comparisons required in the worst case for a list of n elements. Each comparison halves the portion of the list that remains to be searched, so

T(n) = 1 + T(n/2)
     = 2 + T(n/4)
     = 3 + T(n/8)
     = ……
     = y + T(n/2^y)                ……………….. (1)

Suppose the y-th comparison is the last one. Hence, in the worst case the number of elements in the sub-array/list would be 1, i.e.

n/2^y = 1,  or  n = 2^y

Taking logarithms (base 2) of both sides,

log₂ n = y log₂ 2 = y              [∵ log₂ 2 = 1]
∴ y = log₂ n                       ………………(2)

Putting the value of y in Equation (1), we get

T(n) = log₂ n + T(1) = log₂ n + 1

Hence, the worst case time complexity of binary search is O(log₂ n).
However, the best case is when the key element is at the middle position. In that case, the element is found with the first comparison. Thus the best case time complexity of binary search is O(1).
The working principle of interpolation search is just like binary search, but we can consider it an improved version of binary search. In binary search we always probe the exact middle position. But in real life, when we search for a page in a book, instead of opening it at the middle we use some intuition: if the page number is towards the end of the book, we open it nearer to the end, and if the page number is towards the beginning, we open it nearer to the beginning. Interpolation search estimates the probable position of the key in the same way, using the formula:
Position = low + ((keyValue – Arr[low]) * (high – low) / (Arr[high] – Arr[low]))
arr:    12  25  32  40  57  63  76
index:   0   1   2   3   4   5   6
We start the operation considering low as the starting index and high as the last index of the array/list, and the key element is 32. Thus here
Position = 0 + ((32-12)*(6-0)/(76-12)) = 20*6/64 = 1.875, i.e. Position = 1 (taking the integer part).
As arr[1] = 25 is not the key element and the key element, 32, is greater than arr[1], the new lower bound would be Low = Position+1 = 1+1 = 2. Recomputing,
Position = 2 + ((32-25)*(6-2)/(76-25)) = 2 + 28/51 = 2 (taking the integer part),
and arr[2] = 32, which is the key element. This indicates that the search operation is successful and the search element is found at position 2.
Here List is either a sorted array or Python list and Key_Value is the element
to be searched.
1. Set Low = 0
2. Set High = length of List – 1
3. Repeat while Low <= High and List[Low] <= Key_Value <= List[High]:
   a. Position = Low + ((Key_Value – List[Low]) * (High – Low) / (List[High] – List[Low]))
   b. If Key_Value == List[Position], then
      i. Return Position
   c. Else if Key_Value < List[Position], then
      i. High = Position – 1
   d. Else,
      i. Low = Position + 1
4. Return -1
def interpolation_search(list, key):
    low = 0
    high = len(list)-1
    while low <= high and list[low] <= key <= list[high]:
        # guard reconstructed: when the remaining range has identical end values,
        # the interpolation formula would divide by zero
        if list[low] == list[high]:
            if list[low] == key:
                return low
            else:
                return -1
        position = low + ((key-list[low])*(high-low)//
                          (list[high]-list[low]))
        if key == list[position]:
            return position
        elif key < list[position]:
            high = position-1
        else:
            low = position+1
    return -1

# driver reconstructed
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
key = int(input('Enter the element to be searched: '))
position = interpolation_search(lst, key)
if position == -1:
    print('Element not found')
else:
    print('Element found at position', position)
When the elements are not evenly distributed, interpolation search shows its worst case performance, and the worst case time complexity is O(n). However, the element may be found with the very first probe, so the best case time complexity is O(1).
In bubble sort, each element is compared with its adjacent element and the two are interchanged if they are not in order. At the end of the first pass, we will find that the largest element has moved to the last position. This process continues. As in each pass a single element moves to its required position, n-1 passes will be required to sort the array. The following example illustrates the working principle of bubble sort:
arr:  85  33  57  12  40   2
First arr[0] will be compared with arr[1], i.e. 85 with 33. As the first
element is larger than the next one, the elements will be interchanged and
we will get to the next state.
arr:  33  85  57  12  40   2
Next arr[1] will be compared with arr[2], i.e. 85 with 57. Again 85 is larger
than 57. So, the elements will be interchanged. This process continues.
arr:  33  57  85  12  40   2
arr:  33  57  12  85  40   2
arr:  33  57  12  40  85   2
arr:  33  57  12  40   2  85
This is the end of the first pass and we get the largest element at the last
position. Now the second pass will start.
arr:  33  57  12  40   2  85      (no interchange)
arr:  33  12  57  40   2  85      (interchange)
arr:  33  12  40  57   2  85      (interchange)
arr:  33  12  40   2  57  85      (interchange)
arr:  33  12  40   2  57  85      (no interchange)
This is the end of the second pass and the second largest element, i.e. 57, is
now in its proper position. In this way in each pass a single element will get
its proper position. Resembling the formation of bubbles, in this sorting
technique the array is sorted gradually from the bottom. That is why it is
known as bubble sort. As the sorting process involves exchange of two
elements, it is also known as exchange sort. Other passes are described
below: Third pass:
arr:  12  33  40   2  57  85      (interchange)
arr:  12  33  40   2  57  85      (no interchange)
arr:  12  33   2  40  57  85      (interchange)
arr:  12  33   2  40  57  85      (no interchange)
arr:  12  33   2  40  57  85      (no interchange)
Fourth pass:
arr:  12  33   2  40  57  85      (no interchange)
arr:  12   2  33  40  57  85      (interchange)
arr:  12   2  33  40  57  85      (no interchange)
arr:  12   2  33  40  57  85      (no interchange)
arr:  12   2  33  40  57  85      (no interchange)
Fifth pass:
arr:   2  12  33  40  57  85      (interchange)
arr:   2  12  33  40  57  85      (no interchange)
arr:   2  12  33  40  57  85      (no interchange)
arr:   2  12  33  40  57  85      (no interchange)
arr:   2  12  33  40  57  85      (no interchange)
1. size = len(list)
2. Repeat for i = 0 to size-2:
   a. Repeat for j = 0 to size-2:
      i. if list[j] > list[j+1], then
         1. set temp = list[j]
         2. set list[j] = list[j+1]
         3. set list[j+1] = temp
3. exit
On the basis of this algorithm now we can write the corresponding program.
def bubble_sort(list):
    size = len(list)
    for i in range(size-1):
        for j in range(size-1):
            if list[j] > list[j+1]:
                temp = list[j]
                list[j] = list[j+1]
                list[j+1] = temp

# driver reconstructed
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
bubble_sort(lst)
for i in range(n):
    print(lst[i], end=' ')
In the above code we can make some modification to increase the efficiency
of the program.
In the second pass, as the largest element is already in the last position,
unnecessarily we compare the last two elements. Similarly, in the third pass,
the last two comparisons are unnecessary. We can modify our code in such
a way that in the first pass n-1 comparisons are made, in the second pass n-
2 comparisons are made, in the third pass n-3 comparisons are made, and so
on. In this way, in the last pass, i.e. the (n-1)th pass, only one comparison
will be required. Thus the process speeds up as it proceeds through
successive passes. To implement this concept we can rewrite the function
as:
def bubble_sort(list):
    size = len(list)
    for i in range(size-1):
        for j in range(size-i-1):
            if list[j] > list[j+1]:
                temp = list[j]
                list[j] = list[j+1]
                list[j+1] = temp
Another drawback of this algorithm is that even if the elements are already sorted, or the array becomes sorted at some intermediate stage, we still have to proceed for n-1 passes. But if we observe carefully, once the elements have become sorted, no further interchange takes place. In other words, if no interchange takes place in a pass, we can say that the elements are sorted, and this can easily be implemented by using an extra variable, say, flag. Considering this modification, the modified version of the basic bubble sort technique is given below:
Program 13.5: Modified version of bubble sort algorithm
def modified_bubble_sort(list):
    size = len(list)
    for i in range(size-1):
        flag = True
        for j in range(size-i-1):
            if list[j] > list[j+1]:
                temp = list[j]
                list[j] = list[j+1]
                list[j+1] = temp
                flag = False
        if flag:
            break

# driver reconstructed
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
modified_bubble_sort(lst)
for i in range(n):
    print(lst[i], end=' ')
In the basic bubble sort, the number of comparisons in the successive passes is (n-1), (n-2), …, 2, 1. Hence the total number of comparisons is
T = (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = ½(n² − n)
Hence, the time complexity of bubble sort is O(n²). In the basic bubble sort algorithm, whatever the initial arrangement of the elements, the time complexity is always O(n²). But in modified bubble sort, if the elements are already in sorted order, the algorithm terminates after the first pass. In the first pass, the number of comparisons is n-1. Hence, in the best case, modified bubble sort shows a time complexity of O(n). But in the worst case, the time complexity of modified bubble sort is O(n²).
In this sorting technique, first the smallest element is selected and then this
element is interchanged with the element of the first position. Now the first
element is in its proper
position and the rest of the elements are unsorted. So, consider the rest of the elements. The second element is now logically the first element for the remaining set. Again we
select the smallest element from this remaining set and interchange with the
currently logical first position. This process continues by selecting the
smallest element from the rest and placing it into its proper position by
interchanging with the logical first element. As in each iteration, the
corresponding smallest is selected and placed into its proper position, this
algorithm is known as selection sort. Since in each pass a single element
will move to its required position, n-1 passes will be required to sort the
array/list. The following example illustrates the working principle of
selection sort:
arr:  33  85  57  12  40   2
Here, the smallest element is 2. So, it will be interchanged with the first
element, i.e. 33, and we get the next state.
arr:   2  85  57  12  40  33
Now 2 is placed in its proper position. So, we consider only the rest of the
elements. Among the rest of the elements 12 is the smallest. It will be
interchanged with the second position, which is currently the logical first
position.
arr:   2  12  57  85  40  33
Among the remaining elements 33 is the smallest, so it is interchanged with the element at the third position (the current logical first position):
arr:   2  12  33  85  40  57
Next, 40 is the smallest among the remaining elements and is interchanged with the element at the fourth position:
arr:   2  12  33  40  85  57
Finally, 57 is interchanged with 85 and the array/list becomes sorted:
arr:   2  12  33  40  57  85
Based on the above working principle, we may define the general algorithm
of selection sort as follows:
Selection_Sort(List)
1. Set size = len(list)
2. Repeat for i = 0 to size-2:
   a. min = list[i]
   b. posn = i
   c. for j in range(i+1, size):
      i. if list[j] < min:
         1. min = list[j]
         2. posn = j
   d. if i != posn:
      i. temp = list[i]
      ii. list[i] = list[posn]
      iii. list[posn] = temp
3. exit
def selection_sort(list):
    size = len(list)
    for i in range(size-1):
        min = list[i]
        posn = i
        for j in range(i+1, size):
            if list[j] < min:
                min = list[j]
                posn = j
        if i != posn:
            temp = list[i]
            list[i] = list[posn]
            list[posn] = temp

# driver reconstructed
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
selection_sort(lst)
for i in range(n):
    print(lst[i], end=' ')
In each pass of selection sort, the number of comparisons is (n-1), (n-2), …, 1 respectively. Hence the total number of comparisons is
T = (n-1) + (n-2) + … + 1 = n(n-1)/2 = ½(n² − n)
Thus, the time complexity of selection sort is O(n²) in both the best and the worst case.
In insertion sort, we start from the second element and insert it into its proper position with respect to the first element, so that the first two elements become sorted. Next we consider the third element. Before this element, we have a sorted array/list of two elements. Insert the third element into its proper position in this sorted array/list and we will get the first three elements sorted. This process continues up to the last element, when the array/list becomes fully sorted. As the sorting technique grows by inserting each element into its proper position, it is known as insertion sort. The following example illustrates the working principle of insertion sort:
arr:  85  33  57  12  40
According to this technique the sorting procedure starts from the second position of the list. So, we have to consider the second element, i.e. arr[1], which is 33, and the situation is:
arr:  85  33
To become sorted 85 will be shifted to its next position, i.e. into arr[1], and
33 will be inserted to arr[0] position. So, the new state will be:
arr:  33  85  57  12  40
In the next iteration we have to consider the third element, i.e. arr[2], which is 57. Now the situation is:
arr:  33  85  57
As 57 is less than 85, 85 is shifted one position to the right and 57 is inserted into its proper position. So the new state will be:
arr:  33  57  85  12  40
Next, the fourth element, 12, is considered. As 12 is smaller than all the elements of the sorted part, 85, 57, and 33 are shifted one position to the right and 12 is inserted at the beginning:
arr:  12  33  57  85  40
Finally, the fifth element, 40, is considered. 85 and 57 are shifted to the right, 40 is inserted into its proper position, and the array/list becomes sorted:
arr:  12  33  40  57  85
Based on the above working principle we may define the general algorithm
of insertion sort as follows:
Insertion_Sort(List)
1. Set size = len(list)
2. Repeat for i = 1 to size-1:
   a. temp = list[i]
   b. j = i-1
   c. Repeat while j >= 0 and list[j] > temp:
      i. list[j+1] = list[j]
      ii. j = j-1
   d. list[j+1] = temp
3. exit
Program 13.7: Program to sort elements using insertion sort algorithm
def insertion_sort(list):
    size = len(list)
    for i in range(1, size):
        temp = list[i]
        j = i-1
        while j >= 0 and list[j] > temp:
            list[j+1] = list[j]
            j = j-1
        list[j+1] = temp

# driver reconstructed
lst = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    lst.append(int(input('Enter the element: ')))
insertion_sort(lst)
for i in range(n):
    print(lst[i], end=' ')
In the worst case (when the elements are in reverse order), the number of comparisons in the successive passes is 1, 2, …, (n-1). Hence the total number of comparisons is
T = 1 + 2 + … + (n-1) = n(n-1)/2 = ½(n² − n)
Hence, the worst case time complexity of insertion sort is O(n²). But if the array is already sorted, in each pass the maximum number of comparisons is 1. Hence, the total number of comparisons is T = 1 + 1 + 1 + … up to (n-1) terms = n-1, and the best case time complexity of insertion sort is O(n).
Quick sort is a very efficient algorithm that follows the ‘divide and
conquer’ strategy. The first task in this algorithm is to select the pivot
element. It may be the first element, the last element, the middlemost
element, or any other element. As this choice does not provide any extra
advantages, we are choosing the first element as the pivot element. Next,
the algorithm proceeds with the motivation that this pivot element partitions the array/list into two sub-arrays/lists such that all the elements smaller than the pivot element are on the left side of the pivot element and all the larger elements are on the right side. The same operation is then applied to
both sub-arrays/lists until the size of the sub arrays/lists becomes one or
zero. Consider the following example:
33
85
40
12
57
25
As we select the first element, i.e. the leftmost, as pivot, scanning starts
from the right side.
We also need to set two variables – left and right. They will contain the
starting index and the end index of the array/list. Thus, initially, left = 0 and
right = 6 here.
arr
33
85
40
12
57
25
Left
Right
arr
33
85
40
12
57
25
Left
Right
After interchange we have to change the direction of scanning. Now we
scan from left.
arr
25
85
40
12
57
33
Left
Right
Now, arr[left] < pivot; thus the value of the left variable will be increased by 1 to point to the next element.

arr:  25  85  40  12  57   2  33      (Left at 1, Right at 6)

This time arr[left] = 85 is greater than the pivot, so these values are interchanged. With this interchange we again change the direction of scanning.

arr:  25  33  40  12  57   2  85      (Left at 1, Right at 6)

Now, arr[right] > pivot; thus the value of the right variable will be decreased by 1 to point to the next element. This time arr[right] < pivot. So, we interchange these values.

arr:  25  33  40  12  57   2  85      (Left at 1, Right at 5)
arr:  25   2  40  12  57  33  85      (Left at 1, Right at 5)
Now, arr[left] < pivot; hence the value of the left variable will be increased to point to the next element. This time arr[left] > pivot. So, these values would be interchanged.

arr:  25   2  40  12  57  33  85      (Left at 2, Right at 5)
arr:  25   2  33  12  57  40  85      (Left at 2, Right at 5)

Again, arr[right] > pivot and the value of the right variable will be decreased by 1 to point to the next element.

arr:  25   2  33  12  57  40  85      (Left at 2, Right at 4)

Again, arr[right] > pivot and once again the value of the right variable will be decreased by 1. This time arr[right] = 12 is smaller than the pivot, so these values are interchanged.

arr:  25   2  33  12  57  40  85      (Left at 2, Right at 3)
arr:  25   2  12  33  57  40  85      (Left at 2, Right at 3)
Now, arr[left] < pivot; hence the value of the left variable will be increased.

arr:  25   2  12  33  57  40  85      (Left = Right = 3)

Now the left and the right become equal, which indicates the end of this procedure. The pivot element finds its proper position and divides the array/list into two sub-arrays/lists where all the smaller elements are on the left side and the larger elements on the right side.

Left sub-array/list:   25   2  12
Right sub-array/list:  57  40  85
On the left sub-array/list, again we choose the first element, 25, as the pivot element and scanning starts from the right side. Now, left = 0 and right = 2. As arr[right] = 12 < pivot, we interchange these values.

arr:  12   2  25      (Left at 0, Right at 2)

Now we scan from the left. Since arr[left] < pivot, the value of the left variable will be increased. The next element, 2, is also smaller than the pivot, so the left variable is increased again.

arr:  12   2  25      (Left at 1, Right at 2)
arr:  12   2  25      (Left = Right = 2)

Now the left and the right become equal, which indicates the end of this procedure, and this pivot element, i.e. 25, finds its proper position. But this time there is no element in the right sub-array/list and all the rest of the elements are in the left sub-array/list. So, we need to apply the same procedure on this left sub-array/list.
arr:  12   2      (Left at 0, Right at 1)

As arr[right] = 2 is smaller than the pivot, 12, these values are interchanged and we get:

arr:   2  12      (Left at 0, Right at 1)

Now arr[left] < pivot; thus the value of the left variable will be increased and we get:

arr:   2  12      (Left = Right = 1)

Now the left and the right become equal; thus this procedure is terminated here. As the size of the newly formed left sub-array/list is 1 and of the right sub-array/list is 0, we need not proceed further. Now we have to consider the first right sub-array/list. Now the pivot element is arr[4], which is the leftmost or the first element of the right sub-array/list, and left = 4 and right = 6.
right sub-array/list:  57  40  85      (Left at 4, Right at 6)

Starting scanning from the right side, we find arr[right] > pivot. So, the value of the right variable is decreased by 1.

right sub-array/list:  57  40  85      (Left at 4, Right at 5)

This time arr[right] = 40 is smaller than the pivot, 57, so these values are interchanged and we get:

right sub-array/list:  40  57  85      (Left at 4, Right at 5)

Now we scan from the left and find that arr[left] < pivot; thus the value of the left variable will be increased and we get:

right sub-array/list:  40  57  85      (Left = Right = 5)

As the left and the right become equal, this procedure terminates here and we get two sub-arrays/lists, each of which contains a single element. So, we need not proceed further and we get the final sorted array/list as follows:

arr:   2  12  25  33  40  57  85
Based on this working principle, we may now write the algorithm of quick
sort. As we find that the partitioning procedure is called recursively, we
divide the algorithm into two modules where one module does the process
of partitioning and the other module is the driver module that calls the
partitioning module recursively.
Here Arr is an array or Python list, left is the starting index, and right is the
end index of the array/list or sub-array/list.
1. If left<right, then
a. Set loc=PartitionList(arr,left,right)
b. call Quick_Sort(arr,left,loc-1)
c. call Quick_Sort(arr,loc+1,right)
2. end
Here Arr is an array or Python list, left is the starting index, and right is the
end index of the array/list or sub-array/list.
1. Set loc = left
2. While True, do
   a. While loc != right and arr[loc] <= arr[right], do
      i. Set right = right - 1
   b. if loc == right, then
      i. break
   c. elif arr[loc] > arr[right], then
      i. interchange arr[loc] and arr[right] and set loc = right
   d. While loc != left and arr[loc] >= arr[left], do
      i. set left = left + 1
   e. if loc == left, then
      i. break
   f. elif arr[loc] < arr[left], then
      i. interchange arr[loc] and arr[left] and set loc = left
3. return loc
def partitionlist(arr, left, right):
    loc = left                      # loc tracks the current position of the pivot
    while True:
        # scan from the right towards the pivot
        while loc != right and arr[loc] <= arr[right]:
            right -= 1
        if loc == right:
            break
        elif arr[loc] > arr[right]:
            arr[loc], arr[right] = arr[right], arr[loc]
            loc = right
        # scan from the left towards the pivot
        while loc != left and arr[loc] >= arr[left]:
            left += 1
        if loc == left:
            break
        elif arr[loc] < arr[left]:
            arr[loc], arr[left] = arr[left], arr[loc]
            loc = left
    return loc

def quickSort(arr, left, right):
    if left < right:
        loc = partitionlist(arr, left, right)
        quickSort(arr, left, loc-1)
        quickSort(arr, loc+1, right)

# driver reconstructed
arr = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    arr.append(int(input("Enter number {}: ".format(i+1))))
print(arr)
quickSort(arr, 0, n-1)
print(arr)
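The worst case described below occurs precisely because the first element is always taken as the pivot. A common remedy, shown here as a sketch of ours rather than as the book's method, is to swap a randomly chosen element into the first position before partitioning:

import random

def randomized_quickSort(arr, left, right):
    if left < right:
        p = random.randint(left, right)           # pick a random pivot position
        arr[left], arr[p] = arr[p], arr[left]     # move it to the front
        loc = partitionlist(arr, left, right)     # reuse the partition routine above
        randomized_quickSort(arr, left, loc-1)
        randomized_quickSort(arr, loc+1, right)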
And we can say that in the best case, the time complexity of quick sort is
O(n log n).
But in the worst case, i.e. when the elements are already in sorted order,
pivot elements would divide the array/list into two sub-arrays/lists, one of
which contains zero element and the other contains n-1 elements. Then,
the total number of comparisons is
T = (n-1) + (n-2) + … + 1 = n(n-1)/2 = ½(n² − n)
Hence, in the worst case the time complexity of quick sort is O(n²).
Merge sort is another efficient sorting algorithm that follows the 'divide and conquer' strategy. The basic policy of this algorithm is to merge two sorted arrays. As we may consider every single element as a sorted array of one element, first we divide the array into a set of arrays of single elements and then merge them one by one to get the final sorted array. To divide the array, first we divide it into two halves. Then each half will be divided again into two halves. These halves will then again be divided. This process goes on until the size of each half becomes one. The following example illustrates the complete algorithm. Consider the following unsorted array/list.
Unsorted:              33  85  40  12  57   2  25
Divide:          [33  85  40  12]          [57   2  25]
Divide:        [33  85]    [40  12]      [57   2]   [25]
Divide:       [33] [85]   [40] [12]     [57] [2]    [25]
Merge:         [33  85]    [12  40]      [2  57]    [25]
Merge:           [12  33  40  85]          [2  25  57]
Merge:                 [2  12  25  33  40  57  85]
First we divide the array/list into two halves. As the array/list contains 7
elements, the first half contains 4 elements and the other half contains 3
elements. The first half is then again divided into two halves where each
half contains 2 elements. Next these halves are further divided. From this
first half we get two single elements – one is 33 and the other is 85. Now
merging operation is applied on these two elements and we get a sorted
array/list of 33 and 85. Similarly, the second half is divided into two single
elements and these are 40 and 12.
Based on this working principle, now we may define the merge sort
algorithm. This algorithm is also divided into two modules. One, the driver
module, which divides the array/list recursively, and the other module
merges two sorted arrays/lists that will be called recursively to get back the
undivided array/list that was divided previously at that level but is now in
sorted form.
1. If left<right, then
a. Set mid=(left+right)/2
b. Call Merge_Sort(arr,left,mid)
c. Call Merge_Sort(arr,mid+1,right)
d. MergeList(arr,left,mid,mid+1,right)
2. end
Here Arr is an array or Python list, lst and lend are the starting index and
end index of left array/list or sub-array/list, and rst and rend are the starting
index and end index of the right array/list or sub-array/list.
1. Create an empty list, temp
2. Set i = lst
3. Set j = rst
4. Repeat while i <= lend and j <= rend:
   a. if arr[i] <= arr[j], then append arr[i] to temp and set i = i+1
   b. else, append arr[j] to temp and set j = j+1
5. While i <= lend, do
   a. append arr[i] to temp
   b. set i = i+1
6. While j <= rend, do
   a. append arr[j] to temp
   b. set j = j+1
7. Set j = 0
8. Repeat for i = lst to rend:
   a. set arr[i] = temp[j]
   b. set j = j+1
9. End
def mergeList(arr, lst, lend, rst, rend):
    temp = []
    i = lst
    j = rst
    while i <= lend and j <= rend:
        if arr[i] <= arr[j]:
            temp.append(arr[i])
            i += 1
        else:
            temp.append(arr[j])
            j += 1
    while i <= lend:
        temp.append(arr[i])
        i += 1
    while j <= rend:
        temp.append(arr[j])
        j += 1
    j = 0
    for i in range(lst, rend+1):
        arr[i] = temp[j]
        j += 1

def mergeSort(arr, left, right):
    if left < right:
        mid = (left+right)//2
        mergeSort(arr, left, mid)
        mergeSort(arr, mid+1, right)
        mergeList(arr, left, mid, mid+1, right)

# driver reconstructed
arr = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    arr.append(int(input('Enter the element: ')))
print(arr)
mergeSort(arr, 0, n-1)
print(arr)
As this operation does not depend on the initial order of the elements, there is no distinct best or worst case. Thus the time complexity of merge sort is always O(n log₂ n). However, the disadvantage of this algorithm is that it uses extra space: we need to declare a temporary array/list whose size is the same as the original array/list.
We have already discussed the heap data structure in the previous chapter. Now we shall see one of its applications. As we know, the root of a max heap contains the largest element and the root of a min heap contains the smallest element, and we can use this property to sort an array. When we sort the elements in ascending order, a max heap is used, and if we want to sort in descending order, a min heap is used. In this section we shall discuss the heap sort algorithm to sort an array/list in ascending order. Thus a max heap will be used in our algorithm. The working principle of heap sort is like selection sort: in selection sort the smallest element is selected and placed in its proper position, and in heap sort the largest element is selected and placed in its proper position. The difference is that finding the smallest element in selection sort needs O(n) comparisons, whereas in heap sort O(log₂ n) comparisons are sufficient to restore the heap and bring the next largest element to the root.
In heap sort, first a max heap is built from the initial array/list. Then repeated deletion of the root sorts the array/list. More precisely, as the root contains the largest element, it will be interchanged with the element at the last position and, excluding this last element, we again apply these two operations, i.e. rebuild the heap and delete the root. The following example illustrates this. Consider the following array/list:
33  85  40  12  57   2  25
First we build the max heap from these elements and we get:
85  57  40  12  33   2  25
As now it is a heap and more precisely max heap, the largest element is at
the root and it is at the first position. So, we need to swap it with the last
element.
25  57  40  12  33   2  85
The largest element is now in its proper position. Thus, we ignore this element and logically decrease the array/list size by 1.
25  57  40  12  33   2 | 85
From these remaining elements we again rebuild the heap and we get:
57  33  40  12  25   2 | 85
The first element now will be interchanged with the current last position, i.e. with the sixth element.
 2  33  40  12  25  57 | 85
 2  33  40  12  25 | 57  85
From these remaining elements we again rebuild the heap and we get:
40  33   2  12  25 | 57  85
The first element now will be interchanged with the current last position, i.e. with the fifth element, and the array/list size will be decreased by 1.
25  33   2  12 | 40  57  85
Rebuilding the heap from the remaining elements we get:
33  25   2  12 | 40  57  85
Now we swap the first element and the logical last element, i.e. the fourth element, and decrease the array/list size by 1.
12  25   2 | 33  40  57  85
Rebuilding the heap again we get:
25  12   2 | 33  40  57  85
Now we swap the first element and the logical last element, i.e. the third element, and decrease the array/list size by 1.
 2  12 | 25  33  40  57  85
Rebuilding the heap from the remaining two elements we get:
12   2 | 25  33  40  57  85
Now we swap the first element and the second element and decrease the array/list size by 1.
 2 | 12  25  33  40  57  85
As now the array/list size becomes 1, our operation terminates here and we
get the sorted array/list.
Based on this working principle we may define the basic steps as follows:
1. Create the heap from the initial elements.
2. Interchange the root element with the last element of the heap.
3. Decrease the heap size by 1 and rebuild the heap from the remaining elements.
4. Repeat steps 2 and 3 until the heap size becomes 1.
Considering the basic steps we may now define the general algorithm of heap sort. For ease of implementation, we are dividing the algorithm into two modules. The second module is a recursive module which builds the heap and the first module is the driver module which controls the basic steps of the algorithm.
Heap_Sort(Arr)
1. Set n = len(arr) - 1
2. Set i = n/2
3. while i > 0, do
   a. call heapify(arr, n, i)
   b. set i = i-1
4. set i = n
5. while i > 1, do
   a. interchange arr[1] and arr[i]
   b. call heapify(arr, i-1, 1)
   c. set i = i-1
6. End
Heapify( Arr, n, i)
Here Arr is an array or Python list, i is the index position of the element from where the heapify operation starts, and n is the index position of the last element of the heap.
1. Set largest = i
2. Set left = 2*i
3. Set right = 2*i + 1
4. if left <= n and arr[left] > arr[largest], then
   a. largest = left
5. if right <= n and arr[right] > arr[largest], then
   a. largest = right
6. if largest != i, then
   a. interchange arr[i] and arr[largest]
   b. call heapify(arr, n, largest)
7. End
def heapify(arr, n, i):
    # n is the index of the last element of the heap portion (index 0 is unused)
    largest = i
    left = 2*i
    right = 2*i + 1
    if left <= n and arr[left] > arr[largest]:
        largest = left
    if right <= n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heapSort(arr):
    n = len(arr) - 1
    i = n//2
    while i > 0:                    # build the initial max heap
        heapify(arr, n, i)
        i -= 1
    i = n
    while i > 1:                    # repeatedly move the root to its final position
        arr[1], arr[i] = arr[i], arr[1]
        heapify(arr, i-1, 1)
        i -= 1

# driver reconstructed; index 0 is kept unused so that the heap starts at index 1
arr = [0]
n = int(input('Enter the number of elements: '))
for i in range(n):
    arr.append(int(input('Enter the element: ')))
print(arr)
heapSort(arr)
print(arr)
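For comparison, Python's standard library module heapq provides a ready-made (min-)heap; the small sketch below uses it to sort in ascending order by repeatedly popping the smallest element. This is an illustrative alternative, not the book's implementation.

import heapq

def heap_sort_with_heapq(items):
    h = list(items)
    heapq.heapify(h)                                   # builds a min-heap in O(n)
    return [heapq.heappop(h) for _ in range(len(h))]   # each pop costs O(log n)

print(heap_sort_with_heapq([33, 85, 40, 12, 57, 2, 25]))
# [2, 12, 25, 33, 40, 57, 85]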
Analysis of Heap Sort: In the heap sort algorithm, first we create an initial heap from n elements, which requires at most O(n log₂ n) operations. Next, a loop is executed n-1 times; in each iteration a swap takes place, requiring constant time, and then the heapify function is called. In the heapify function, the root element is moved down up to a certain level. As the maximum height of a complete binary tree is O(log₂ n), the maximum number of operations in each heapify call is O(log₂ n). Hence, the worst case time complexity of heap sort is (n-1) × O(log₂ n), i.e. O(n log₂ n).
Radix sort is a very efficient sorting algorithm that sorts the elements digit
by digit if the elements are numbers and letter by letter if the elements are
strings. When the elements are numbers, the sorting process starts from the
least significant digit to the most significant digit, whereas in case of string,
first it sorts the strings based on the first letter, then the second letter, then
the third letter, and so on. This algorithm is also known as bucket sort since
it requires some buckets. The number of buckets required depends on the
base or radix of the number system. That is why it is called radix sort. Thus
to sort some decimal numbers we need 10 buckets, to sort octal numbers we
need 8 buckets, and to sort some name we need 26 buckets as there are 26
letters in the alphabet. For decimal numbers buckets are numbered as 0 to 9.
To illustrate the working principle consider the following example. Suppose
we have the following list of elements:
57  234  89  367  74  109  35  48  37
In the first pass, the numbers would be sorted based on the digits of the units place. Hence, we consider the units place digit of each number and put it into the corresponding bucket:
bucket 4: 234, 74
bucket 5: 35
bucket 7: 57, 367, 37
bucket 8: 48
bucket 9: 89, 109
(all the other buckets remain empty)
Now we collect the numbers from the buckets sequentially and store them back in the array/list.
234  74  35  57  367  37  48  89  109
We may notice that after this first pass, the numbers are already sorted on their last digit. In the next pass, we repeat the same operation but on the tens place digit. Thus we get:
bucket 0: 109
bucket 3: 234, 35, 37
bucket 4: 48
bucket 5: 57
bucket 6: 367
bucket 7: 74
bucket 8: 89
Again we collect the numbers from the buckets sequentially and store them back in the array/list.
109  234  35  37  48  57  367  74  89
After this second pass, the numbers are sorted based on their last two digits. In the third pass, we repeat the same operation but on the hundreds place digit and we get:
bucket 0: 35, 37, 48, 57, 74, 89
bucket 1: 109
bucket 2: 234
bucket 3: 367
Again we collect the numbers from the buckets and store them back in the array/list.
35  37  48  57  74  89  109  234  367
From the above working principle, it is clear that the number of passes
depends on the number of digits of the largest number. Now we may define
the general algorithm of the radix sort as:
Radix_Sort(Arr)
1. Find the largest element of the list.
2. Count the number of digits of the largest element and store it as digitCount.
3. Set divisor = 1
4. For I = 1 to digitCount, do
   a. Initialize the buckets
   b. Distribute the elements into the buckets according to the digit selected by divisor
   c. Collect the elements from the buckets sequentially and store them back in the list
   d. Set divisor = divisor * 10
5. End
def radixSort(arr):
    n = len(arr)
    Max = arr[0]
    for i in range(1, n):              # find the largest element
        if arr[i] > Max:
            Max = arr[i]
    digitCount = 0
    while Max > 0:                     # count the digits of the largest element
        digitCount += 1
        Max //= 10
    divisor = 1
    for p in range(digitCount):
        bucket = [[] for b in range(10)]        # one bucket for each digit 0-9
        bucketCount = [0 for b in range(10)]
        for i in range(n):
            rem = (arr[i]//divisor) % 10
            bucket[rem].append(arr[i])
            bucketCount[rem] += 1
        k = 0
        for i in range(10):            # collect the elements back from each bucket
            for j in range(bucketCount[i]):
                arr[k] = bucket[i][j]
                k += 1
        divisor *= 10

# driver reconstructed
arr = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    arr.append(int(input('Enter the element: ')))
print(arr)
radixSort(arr)
print(arr)
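As written, the function assumes non-negative integers. A quick check with the list used in the walkthrough above:

data = [57, 234, 89, 367, 74, 109, 35, 48, 37]
radixSort(data)
print(data)     # [35, 37, 48, 57, 74, 89, 109, 234, 367]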
Analysis of Radix Sort: If the number of digits of the largest element in the
list is d, the number of passes to sort the elements is also d. In each pass we
need to access all the elements of the list. If there are n number of elements
in the list, the total number of executions would be O(dn). If the value of d
is not very high, the running time complexity of radix sort is O(n), which
indicates its efficiency. But the main drawback of this algorithm is that it
takes much more space in comparison to other sorting algorithms. To sort
decimal numbers, apart from the list of numbers it requires 10 buckets each
of size n. To sort strings containing only letters, number of buckets required
is 26, each of size n. If the strings contain other characters such as
punctuation marks or digits, the number of buckets will increase.
The shell sort algorithm was invented by Donald Shell in 1959. Its basic principle rests on the observation that the closer the elements already are to sorted order, the better insertion sort performs. Thus shell sort may be considered an improved version of insertion sort.
In this algorithm several passes are used and in each pass equally distanced
elements are sorted. Instead of applying the basic insertion sort algorithm
for the whole set of elements, elements with a certain interval are used for
this purpose and they get sorted. In each pass this interval is reduced by
half. So, at the earlier passes when the elements are fully unsorted, the list
size is very small. So it takes less time.
On the other hand, at the later passes when more elements are involved,
elements are almost sorted. Hence, better performance of insertion sort is
achieved. This policy reduces the overall execution time.
arr:    92  57  61  12  77  48  25   6  34  39  50
index:   0   1   2   3   4   5   6   7   8   9  10
We start the algorithm with gap size as half of the number of elements.
Here, n = 11. So, gap = 11/2 = 5 (considering integer division since
array/list index should be integer). At this first pass, first we will apply
insertion sort on the zeroth , fifth, and tenth elements of the list.
92  57  61  12  77  48  25   6  34  39  50      (positions 0, 5, 10)
And we get:
48  57  61  12  77  50  25   6  34  39  92
Next insertion sort will be applied on the first and sixth elements.
48  57  61  12  77  50  25   6  34  39  92      (positions 1, 6)
And we get:
48  25  61  12  77  50  57   6  34  39  92
Following the sequence, the next insertion sort will be applied on the
second and seventh elements, the third and eighth elements, and the fourth
and ninth elements.
48  25  61  12  77  50  57   6  34  39  92      (positions 2 & 7, 3 & 8, 4 & 9)
And after these operations we get:
48  25   6  12  39  50  57  61  34  77  92
In the next pass, the gap size will be halved. Thus gap = 5/2 = 2 (considering integer division).
Now in the second pass, first insertion sort will be applied on the zeroth,
second, fourth, sixth, eighth, and tenth elements.
48  25   6  12  39  50  57  61  34  77  92      (positions 0, 2, 4, 6, 8, 10)
And we get:
 6  25  34  12  39  50  48  61  57  77  92
The next insertion sort will be applied on the first, third, fifth, seventh, and
ninth elements.
 6  25  34  12  39  50  48  61  57  77  92      (positions 1, 3, 5, 7, 9)
And we get:
 6  12  34  25  39  50  48  61  57  77  92
Now, in the third pass, gap = 2/2 = 1. So, insertion sort will be applied on
the total set.
 6  12  34  25  39  50  48  61  57  77  92
And we get the final sorted list:
 6  12  25  34  39  48  50  57  61  77  92
Based on this working principle, we may now define the algorithm of shell
sort.
Shell_Sort(Arr)
1. Set n = len(arr)
2. Set gap = n/2
3. While gap >= 1, do
   a. Repeat for i = gap to n-1:
      i. Set temp = Arr[i]
      ii. Set j = i
      iii. While j >= gap and Arr[j-gap] > temp, do
           1. Set Arr[j] = Arr[j-gap]
           2. Set j = j – gap
      iv. Set Arr[j] = temp
   b. Set gap = gap/2
4. Exit
def shellSort(arr):
    n = len(arr)
    gap = n//2
    while gap >= 1:
        for i in range(gap, n):
            temp = arr[i]
            j = i
            while j >= gap and arr[j-gap] > temp:
                arr[j] = arr[j-gap]
                j -= gap
            arr[j] = temp
        gap //= 2

# driver reconstructed
arr = []
n = int(input('Enter the number of elements: '))
for i in range(n):
    arr.append(int(input('Enter the element: ')))
print(arr)
shellSort(arr)
print(arr)
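Running it on the list from the walkthrough (with the element 6 included, as reconstructed above) gives:

data = [92, 57, 61, 12, 77, 48, 25, 6, 34, 39, 50]
shellSort(data)
print(data)     # [6, 12, 25, 34, 39, 48, 50, 57, 61, 77, 92]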
In shell sort we are applying insertion sort using gaps such as n/2, n/4, n/8, …, 4, 2, 1. Thus in the best case the time complexity of shell sort is close to O(n). As the earlier passes move the elements towards sorted order, the worst case time complexity does not exceed O(n²).
Algorithm          Best Case        Worst Case
Bubble Sort        O(n²)            O(n²)
Selection Sort     O(n²)            O(n²)
Insertion Sort     O(n)             O(n²)
Quick Sort         O(n log₂ n)      O(n²)
Merge Sort         O(n log₂ n)      O(n log₂ n)
Heap Sort          O(n log₂ n)      O(n log₂ n)
Radix Sort         O(n)             O(d·n)
Shell Sort         O(n)             O(n²)
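It is worth noting that in everyday Python programming one rarely implements these algorithms by hand: the built-in sorted() function and the list.sort() method use Timsort, a hybrid of merge sort and insertion sort with an O(n log₂ n) worst case and an O(n) best case on already-ordered data.

data = [85, 33, 57, 12, 40, 2]
print(sorted(data))                 # returns a new sorted list: [2, 12, 33, 40, 57, 85]
data.sort(reverse=True)             # sorts in place, here in descending order
print(data)                         # [85, 57, 40, 33, 12, 2]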
In all the sorting algorithms discussed so far, the elements are stored in the main memory, i.e. within RAM, and the sorting procedure takes place in the main memory considering all the elements at a time. This type of sorting is known as internal sorting. But when we need to deal with huge amounts of data, all of it cannot be placed in the main memory at a time. We then need to store the data on a secondary storage device, load it into memory part by part, and sort it there. This type of sorting is known as external sorting. An example of external sorting is external merge sort.
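The following sketch illustrates the idea of external merge sort in miniature (the chunk size and the in-memory list stand in for files and are illustrative assumptions of ours, not part of the book's text): the data are sorted one chunk at a time, and the sorted runs are then merged lazily with heapq.merge, which never holds more than one element per run at once.

import heapq

def external_merge_sort(numbers, chunk_size=1000):
    # 1. split the input into chunks that fit in memory and sort each chunk (a "run")
    runs = [sorted(numbers[i:i+chunk_size])
            for i in range(0, len(numbers), chunk_size)]
    # 2. merge all sorted runs; heapq.merge streams the result instead of
    #    loading every run fully at the same time
    return list(heapq.merge(*runs))

print(external_merge_sort([53, 29, 78, 46, 12, 121, 34, 68], chunk_size=3))
# [12, 29, 34, 46, 53, 68, 78, 121]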
✓ The algorithm that checks each and every element of a list one by one
sequentially starting from the first position is known as linear search.
✓ Binary search works on sorted elements only. It is much faster than linear search and the time complexity of binary search is O(log₂ n).
✓ When the amount of data is less such that it easily fits into the main
memory (i.e. RAM), internal sorting is used.
✓ For huge amounts of data, which do not fit completely in the main
memory, external sorting is required.
✓ In basic bubble sort algorithm, whatever may be the initial position of the
elements, the time complexity is always O(n2).
✓ In best case, modified bubble sort shows the time complexity as O(n).
✓ The worst case time complexity of insertion sort is O(n2) whereas the
best case time complexity of insertion sort is O(n).
✓ Quick sort and merge sort are both sorting algorithms that follow the 'divide and conquer' strategy.
✓ Quick sort algorithm proceeds with the motivation that the pivot element
partitions the array/list into two sub-arrays/lists such that all the elements
smaller than the pivot element are on the left side of the pivot element and
all the larger elements are on the right side. The same operation will then be
applied to both the sub-arrays/lists until the size of the sub arrays/lists
become one or zero.
✓ In the best case the time complexity of quick sort is O(n log₂ n), but in the worst case it is O(n²).
✓ Merge sort first divides the list into several lists such that each list
contains a single element and then merge these lists one by one maintaining
the concept of ‘merge two sorted lists into a single sorted list’. The time
complexity of merge sort is always O(n log n) but extra spaces are required.
✓ In heap sort, first a max heap is built from the initial array/list. Then
repeated deletion of the root sorts the array/list. The time complexity of
heap sort is O(n log n).
✓ Radix sort or bucket sort uses the base or radix number of buckets. Its
time complexity is O(n) but uses huge space in comparison to others.
✓ In shell sort several passes are used and in each pass equally distanced
elements are sorted. Its time complexity is between O(n) and O(n2).
b) Backtracking approach
c) Heuristic search
d) Greedy approach
b) Backtracking approach
c) Heuristic search
d) Greedy approach
3. How many swaps are required to sort the given array using bubble sort:
12, 15, 11, 13, 14
a) 5
b) 15
c) 4
d) 14
a) O(n2)
b) O(n)
c) O(n log n)
d) O( log n)
a) O(n2)
b) O(n)
c) O(n log n)
d) O( log n)
a) O( log n)
b) O(n)
c) O(n log n)
d) None of these
a) O(n2)
b) O(n)
c) O(n log n)
d) O( log n)
a) O(n2)
b) O(n)
c) O(n log n)
d) O( log n)
a) O(n2)
b) O(n)
c) O(n log n)
d) O( log n)
a) O(n2)
b) O(n)
c) O(n log n)
d) O( log n)
11. In general which of the following sorting algorithms requires the least
number of assignment operations?
a) Selection sort
b) Insertion sort
c) Quick sort
d) Merge sort
12. When the input array is sorted or nearly sorted, which algorithm shows
the best performance?
a) Selection sort
b) Insertion sort
c) Quick sort
d) Merge sort
13. Suppose we have
two sorted lists: one of size m and the other of size n. What will be the
running time complexity of merging these two lists?
a) O( m)
b) O( n)
c) O( m+n)
d) O( log m + log n)
14. Which of the following sorting algorithms least bother about the
ordering of the elements in the input list?
a) Selection sort
b) Insertion sort
c) Quick sort
d) all of these
Review Exercises
1. What is searching?
Show the steps to search for (a) 83 and (b) 33 using binary search method.
8. Show the steps to sort the following elements using bubble sort
algorithm: 25 10 18 72 40 11 32 9
9. Show the steps to sort the following elements using selection sort
algorithm: 25 10 18 72 40 11 32 9
10. Show the steps to sort the following elements using insertion sort
algorithm: 25 10 18 72 40 11 32 9
11. Show the steps to sort the following elements using quick sort
algorithm: 25 10 18 72 40 11 32 9
12. Show the steps to sort the following elements using merge sort
algorithm: 25 10 18 72 40 11 32 9
13. Show the steps to sort the following elements using heap sort algorithm:
25 10 18 72 40 11 32 9
14. Show the steps to sort the following elements using radix sort
algorithm: 40 525 18 172 11 310 32 9 68 81
15. Show the steps to sort the following elements using shell sort algorithm:
25 37 48 10 56 89 18 5 72 40 31 11 45 32 9
53 29 78 46 12 121 34 68
After two passes, the arrangement of the elements in the list is as follows:
12 29 78 46 53 121 34 68
53 29 78 46 12 121 34 68
After two passes, the arrangement of the elements in the list is as follows:
29 53 78 46 53 121 34 68
53 29 78 46 12 121 34 68
After two passes, the arrangement of the elements in the list is as follows:
29 46 12 53 34 68 78 121
53 29 78 46 12 121 34 68
After two passes, the arrangement of the elements in the list is as follows:
68 34 53 29 12 46 78 121
20. Compare insertion sort, heap sort, and quick sort according to the best
case, worst case, and average case behaviors.
b) Binary search
c) Interpolation search
b) Selection sort
c) Insertion sort
d) Quick sort
e) Merge sort
f) Heap sort
g) Radix sort
h) Shell sort
Interpolation search generally works faster than binary search, and its time complexity is O(log₂(log₂ n)) when the elements are evenly distributed; there is no such guarantee when the elements are not evenly distributed.

Chapter 14
Hashing
Suppose we have to store the data of students whose roll numbers range
from 1 to 60. For this purpose we may take an array/list and store the data
of each student at the corresponding index position which matches with the
roll number. In this situation, to access a student, if we know his/her roll
number, we can directly access his/her data. For example, if we want to
read the data of the student whose roll number is 27, we may directly access
the 27th index position of the array/list. But the problem is that in real life
key values are not always as simple. You may find that in a certain
university someone’s roll number may be 21152122027. This does not
imply that it is a sequential number, nor that this number of students are
admitted in a year in that university. It may be a nomenclature of a key
value where the first two digits may denote the year of registration, the next
three digits are the college code, the next two digits may indicate stream,
and so on. This is true for not only roll numbers but also any other key
value. Thus it is clear that the number of digits does not indicate the total
number of elements. In the above example hardly 10,000 or 20,000
students may take admission in that university in a year. So, it would not be a wise decision to declare an array/list whose size is equal to the last student's roll number just to store the data of the students. Rather, what we can do is declare an array/list of size equal to the total number of elements and map the roll numbers of the students to index positions such that each and every record is accommodated properly and we can also access them directly. This is hashing.
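Python programmers meet hashing every day without noticing it: the built-in dict and set types are hash tables, and every key passed to them is run through the built-in hash() function to decide where the entry is stored. A tiny illustration (the second roll number and the names are invented for the example):

students = {}                          # a dict is a hash table under the hood
students[21152122027] = 'A. Roy'       # the long roll number is hashed to a slot
students[21152122054] = 'B. Sen'
print(students[21152122027])           # direct, average O(1) lookup: A. Roy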
Without hashing, the record with key value k1 would be stored at index position k1, the record with key value k2 at index position k2, and so on. But in the case of hashing we need to use a hash function, say h(k), which uses the key value k and returns an index position where the particular record would be stored. Hence, the record with key value k1 would not necessarily be stored at index k1. In Figure 14.1, for example, the value of k2 is 2 and it is stored at position 2, and so on. But in Figure 14.2, the hash function h(k) has been used. We are assuming h(k1) returns 7 and thus the corresponding record is stored at position 7. Similarly, if h(k2), h(k3), and h(k4) return 1, 5, and 2 respectively, the corresponding records would be stored at positions 1, 5, and 2.
[Figure 14.1: keys from the universe of keys mapped directly to index positions]
[Figure 14.2: Relationship between keys and hash table index positions]
Sometimes it may happen that for two or more key values a hash function
may return identical index value. This is known as collision. Though it is a
problem, it is not possible to avoid collisions completely. Thus the main aim of a hash function is to produce hash values that are as unique as possible within a defined range for a set of key values, by distributing the values evenly over that range. There are several standard hash functions, which we discuss in this section, but we are not restricted to them: we may define our own hash functions. After defining a hash
function we need to pass the probable set of key values in that function and
then we have to study the returned hash values. If it is found that all the
values are unique or almost unique, we accept the hash function. Now we
discuss some popular hash functions.
It is the simplest hash function. When key values are integers and
consecutive and there is no gap in the sequence in the possible set of values,
this method is very effective. If k is the key value and the size of the hash
table is n, the remainder value of k/n is the hash value of k
which is considered as the index of the hash table. Hence, we may define
the hash function,
h(k), as:
h(k) = k mod n
Here, mod is the modulus operator which returns the remainder value of
k/n.
A variant of this method maps keys into the range 1 to n instead of 0 to n-1:
h(k) = (k mod n) + 1
The efficiency of the method largely depends on the choice of n. Studies have shown that any prime number near the size of the hash table is a good choice, whereas any number close to an exact power of 2 is the worst choice.
Example 14.1: Consider a hash table of size 100. Map the keys 329 and 4152 to appropriate locations using the division method.
Solution: Here, the size of the hash table is 100. Thus, n = 100.
h(329) = 329 mod 100 = 29
h(4152) = 4152 mod 100 = 52
∴ The keys 329 and 4152 will be mapped to positions 29 and 52 respectively.
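A minimal Python sketch of the division method (the function name is ours):

def division_hash(key, table_size):
    # remainder of key/table_size gives an index in the range 0 .. table_size-1
    return key % table_size

print(division_hash(329, 100), division_hash(4152, 100))    # 29 52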
This method operates in two steps. In the first step, a constant value, c, is
chosen such that 0 < c < 1. Then the key value, k, is multiplied by c and the
fractional part of this product is considered. In the next step, this fractional
value is further multiplied by the size of the hash table, n. Finally, the floor
value of the result is considered as the hash value. Hence, we may define the hash function, h(k), as:
h(k) = ⌊ n × (k·c mod 1) ⌋
where (k·c mod 1) denotes the fractional part of k·c.
The advantage of this method is that it works with almost any value of c.
However, Knuth has suggested in his study the best value of c is
c = ( √5 – 1) / 2 = 0.61803398874 …
Example 14.2: Consider a hash table of size 100. Calculate the hash values for the keys 329 and 4152 using the multiplication method.
Solution: Here, the size of the hash table is 100. Thus, n = 100. We are considering c = 0.6180339887.
Hence, h(329) = ⌊100 × (329 × 0.6180339887 mod 1)⌋
             = ⌊100 × (203.3331822823 mod 1)⌋
             = ⌊100 × 0.3331822823⌋
             = ⌊33.31822823⌋ = 33
and   h(4152) = ⌊100 × (4152 × 0.6180339887 mod 1)⌋
             = ⌊100 × (2566.0771210824 mod 1)⌋
             = ⌊100 × 0.0771210824⌋
             = ⌊7.71210824⌋ = 7
∴ The hash values of the keys 329 and 4152 are 33 and 7 respectively.
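The same calculation in Python (again a sketch with a name of our choosing), using math.floor for the final step:

import math

def multiplication_hash(key, table_size, c=0.6180339887):
    fractional = (key * c) % 1           # keep only the fractional part of k*c
    return math.floor(table_size * fractional)

print(multiplication_hash(329, 100), multiplication_hash(4152, 100))   # 33 7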
The mid-square method is a very effective hash function. In this method the
key value is squared first. Then the middlemost d digits are taken as the
hash value. Hence, we may define the hash function, h( k), as:
h( k) = middlemost d digits of k 2.
Example 14.3: Consider a hash table of size 100. Map the keys 329 and
4152 to appropriate locations using the mid-square method.
Solution: Here, size of the hash table is 100. Thus, d = 2, i.e. the
middlemost 2 digits need to be selected.
= 30
Note that in both cases, the third and fourth digits from the right are considered.
14.2.4 Folding Method
Example 14.4: Consider a hash table of size 10000. Map the keys
21152122027 and 191880101023 to appropriate locations using the folding
method.
Solution: Here, size of the hash table is 10000. Thus, number of digits of
each part would be 4.
h(21152122027) = 4294
For the key 191880101023, the parts are 1918, 8010, and 1023; adding the parts in the same way gives its hash value.
In this method the key value or some portion of the key and the length of
the key are somehow combined to generate the index position in the hash
table or an intermediate key.
There are several processes of combining keys and the length of the keys.
One common
example is to multiply the first two digits of the key with the length of the
key and then divide the result with the last digit to find the hash value. We
may consider this value or may use it as an intermediate key on which
another hash function may be applied to get the desired hash value.
Example 14.5: Consider a hash table of size 100. Map the keys 329 and
4152 to appropriate locations using the length dependent method.
Solution: For the key 329: first two digits = 32, last digit = 9, length of key = 3.
∴ h(329) = (32 × 3) / 9 = 96 / 9 = 10 (taking the integer part)
For the key 4152: first two digits = 41, last digit = 2, length of key = 4.
∴ h(4152) = (41 × 4) / 2 = 164 / 2 = 82
∴ The keys 329 and 4152 will be mapped to positions 10 and 82 respectively.
Example 14.6: Consider a hash table of size 1000. Map the keys 123456
and 32759 to appropriate locations using the digit analysis method.
Solution: Here, the size of the hash table is 1000. Thus, the number of digits of the hash value would be 3.
After extracting odd positioned digits from the key, we get 135.
After extracting odd positioned digits from the key, we get 379.
∴ The keys 123456 and 32759 will be mapped to positions 351 and 793 respectively.
We have already discussed that when a hash function produces the same hash value for more than one key, a collision occurs. Though the basic target of any hash function is to produce a unique address within a given range, this is hardly ever fully achieved, and collisions may occur. The chances of collision depend on various factors. One major factor is the load factor: the ratio of the number of keys in a hash table to the size of the hash table is known as the load factor. The larger the value of the load factor, the larger the chance of collision. However, if a collision occurs, we have to
resolve the problem. There are several collision resolution techniques. The
most two common techniques are:
• Open addressing
• Chaining.
In open addressing all the key elements are stored within the hash table.
When a collision occurs, a new position is calculated in some free slots
within the hash table. Thus the hash table contains either the key elements
or some sentinel value to indicate that the slot is empty. When we apply a
hash function for a particular key, it returns an index position of the hash
table. If this index position contains a sentinel value, it represents that the
slot is empty and we store the key element at that position. But if the index
position is already occupied by some key element, we will find some free
slots in the hash table moving forward in some systematic manner. The
process of examining slots in the hash table is known as probing.
In linear probing, when collision occurs the key element is stored in the
next available slot.
Hence, linear probing can be represented with the hash function as: h(k,i) =
[h’(k) + i] mod n
where h'(k) is the basic hash function, n is the size of the hash table, and i is the probe number. For a given key k, the first probe is at memory location h'(k). If it is free, the element is stored at this position. Otherwise, the next probes generate the slot numbers h'(k)+1, h'(k)+2, …, wrapping around through 0, 1, …, h'(k)–1 until a free slot is found. The following example illustrates this:
Example 14.7: Consider a hash table of size 10 and the basic hash function h'(k) = k mod n is used. Insert the following keys into the hash table using linear probing: 34, 71, 56, 14, 69, 45, 9.
Solution: Initially all the slots of the hash table are empty.
slot:   0    1    2    3    4    5    6    7    8    9
      [ __ , __ , __ , __ , __ , __ , __ , __ , __ , __ ]
First key, k = 34. h(34,0) = [(34 mod 10) + 0] mod 10 = 4. Slot 4 is free, so 34 is inserted at slot 4.
      [ __ , __ , __ , __ , 34 , __ , __ , __ , __ , __ ]
Next key, k = 71. h(71,0) = [(71 mod 10) + 0] mod 10 = 1. Slot 1 is free, so 71 is inserted at slot 1.
      [ __ , 71 , __ , __ , 34 , __ , __ , __ , __ , __ ]
Next key, k = 56. h(56,0) = [(56 mod 10) + 0] mod 10 = 6. Slot 6 is free, so 56 is inserted at slot 6.
      [ __ , 71 , __ , __ , 34 , __ , 56 , __ , __ , __ ]
Next key, k = 14. h(14,0) = [(14 mod 10) + 0] mod 10 = 4.
Since slot 4 is occupied, the next probe position is calculated as: h(14,1) = [(14 mod 10) + 1] mod 10 = [4+1] mod 10 = 5 mod 10 = 5.
Since slot 5 is free, 14 will be inserted at slot 5.
      [ __ , 71 , __ , __ , 34 , 14 , 56 , __ , __ , __ ]
Next key, k = 69. h(69,0) = [(69 mod 10) + 0] mod 10 = 9. Slot 9 is free, so 69 is inserted at slot 9.
      [ __ , 71 , __ , __ , 34 , 14 , 56 , __ , __ , 69 ]
Next key, k = 45. h(45,0) = [(45 mod 10) + 0] mod 10 = 5.
Since slot 5 is occupied, the next probe position is calculated as: h(45,1) = [(45 mod 10) + 1] mod 10 = [5+1] mod 10 = 6 mod 10 = 6.
Since slot 6 is occupied, the next probe position is calculated as: h(45,2) = [(45 mod 10) + 2] mod 10 = [5+2] mod 10 = 7 mod 10 = 7.
Since slot 7 is free, 45 will be inserted at slot 7.
      [ __ , 71 , __ , __ , 34 , 14 , 56 , 45 , __ , 69 ]
Last key, k = 9. h(9,0) = [(9 mod 10) + 0] mod 10 = 9.
Since slot 9 is occupied, the next probe position is calculated as: h(9,1) = [(9 mod 10) + 1] mod 10 = 10 mod 10 = 0.
Since slot 0 is free, 9 will be inserted at slot 0, and the final hash table is:
      [ 9 , 71 , __ , __ , 34 , 14 , 56 , 45 , __ , 69 ]
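A compact Python sketch of insertion with linear probing (the function and variable names are ours; None marks an empty slot):

def linear_probe_insert(table, key):
    n = len(table)
    for i in range(n):                     # at most n probes
        pos = (key % n + i) % n            # h(k,i) = [h'(k) + i] mod n
        if table[pos] is None:
            table[pos] = key
            return pos
    raise Exception('Hash table is full')

table = [None] * 10
for k in [34, 71, 56, 14, 69, 45, 9]:
    linear_probe_insert(table, k)
print(table)    # [9, 71, None, None, 34, 14, 56, 45, None, 69]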
In quadratic probing, when a collision occurs, a quadratic function of the probe number is used to find the next free slot instead of a linear one. Quadratic probing uses the following hash function:
h(k,i) = [h'(k) + c1·i + c2·i²] mod n,   for i = 0, 1, 2, …, n-1
where h'(k) is the basic hash function, n is the size of the hash table, i is the probe number, and c1 and c2 are two constants with c1 ≠ 0 and c2 ≠ 0. For a given key k, the first probe is at h'(k); whenever the probed slot is occupied, i is increased and the next position is computed.
Example 14.8: Consider a hash table of size 10 and the basic hash function
h'(k) = k mod n is used. Further consider that c1=1 and c2=2. Insert the
following keys into the hash table using quadratic probing: 34, 71, 56, 14,
69, 45, and 9.
Initially all ten slots (0 to 9) of the hash table are empty:
__ __ __ __ __ __ __ __ __ __
First key, k = 34. h(34,0) = [(34 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 4 mod 10 = 4. Since slot 4 is free, 34 is inserted at slot 4.
__ __ __ __ 34 __ __ __ __ __
Next key, k = 71. h(71,0) = [(71 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 1 mod 10 = 1. Since slot 1 is free, 71 is inserted at slot 1.
__ 71 __ __ 34 __ __ __ __ __
Next key, k = 56. h(56,0) = [(56 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 6 mod 10 = 6. Since slot 6 is free, 56 is inserted at slot 6.
__ 71 __ __ 34 __ 56 __ __ __
Next key, k = 14. h(14,0) = [(14 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 4 mod 10 = 4.
Since slot 4 is occupied, the next probe position is calculated as: h(14,1) =
[(14 mod 10) + 1 x 1 + 2 x 1²] mod 10 = [4+1+2] mod 10 = 7 mod 10 = 7.
Since slot 7 is free, 14 will be inserted at slot 7.
__ 71 __ __ 34 __ 56 14 __ __
Next key, k = 69. h(69,0) = [(69 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 9 mod 10 = 9. Since slot 9 is free, 69 is inserted at slot 9.
__ 71 __ __ 34 __ 56 14 __ 69
Next key, k = 45. h(45,0) = [(45 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 5 mod 10 = 5. Since slot 5 is free, 45 is inserted at slot 5.
__ 71 __ __ 34 45 56 14 __ 69
Last key, k = 9. h(9,0) = [(9 mod 10) + 1 x 0 + 2 x 0²] mod 10 = 9 mod 10 = 9.
Since slot 9 is occupied, the next probe position is calculated as: h(9,1) =
[(9 mod 10) + 1 x 1 + 2 x 1²] mod 10 = [9+1+2] mod 10 = 12 mod 10 = 2.
Since slot 2 is free, 9 will be inserted at slot 2.
__ 71 9 __ 34 45 56 14 __ 69
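A minimal Python sketch of the quadratic-probing insertion used in Example 14.8, with the same constants c1 = 1 and c2 = 2, is shown below; the names are illustrative and not taken from the book's listings.

def quadratic_probe_insert(table, key, c1=1, c2=2):
    n = len(table)
    for i in range(n):
        # h(k, i) = [h'(k) + c1*i + c2*i^2] mod n
        slot = (key % n + c1 * i + c2 * i * i) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no free slot found by quadratic probing")

table = [None] * 10
for k in (34, 71, 56, 14, 69, 45, 9):
    quadratic_probe_insert(table, k)
print(table)    # [None, 71, 9, None, 34, 45, 56, 14, None, 69]

Note that, unlike linear probing, a quadratic probe sequence is not guaranteed to visit every slot, which is why the sketch simply raises an error if no free slot is found within n probes.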
In double hashing, the probe sequence is generated using two hash functions:
h(k, i) = [h1(k) + i·h2(k)] mod n, for i = 0, 1, 2, ..., n-1
where h1(k) and h2(k) are two hash functions, n is the size of the hash
table, and i is the probe number. The hash functions are defined as
h1(k) = k mod n and h2(k) = k mod n', where the value of n' is slightly less
than n; we may take n' = n-1 or n-2. For a given key k, we first probe at
memory location h1(k), since i = 0 the first time. If it is free, the
element is stored at this position. Otherwise, the next probes are generated
with the offset value produced by h2(k). Since the offset values are
themselves generated by a hash function, double hashing is free from primary
and secondary clustering, and its performance is very close to the ideal
case of uniform hashing. The following example illustrates the working
principle of double hashing.
Example 14.9: Consider a hash table of size 10 and the hash functions
h1(k) = k mod 10 and h2(k) = k mod 8 are used. Insert the following keys
into the hash table using double hashing: 34, 71, 56, 14, 69, 45, and 9.
Initially all ten slots (0 to 9) of the hash table are empty:
__ __ __ __ __ __ __ __ __ __
First key, k = 34. h(34,0) = [(34 mod 10) + 0 x (34 mod 8)] mod 10 = [4+0] mod 10 = 4 mod 10 = 4. Since slot 4 is free, 34 is inserted at slot 4.
__ __ __ __ 34 __ __ __ __ __
Next key, k = 71. h(71,0) = [(71 mod 10) + 0 x (71 mod 8)] mod 10 = [1+0] mod 10 = 1 mod 10 = 1. Since slot 1 is free, 71 is inserted at slot 1.
__ 71 __ __ 34 __ __ __ __ __
Next key, k = 56. h(56,0) = [(56 mod 10) + 0 x (56 mod 8)] mod 10 = [6+0] mod 10 = 6 mod 10 = 6. Since slot 6 is free, 56 is inserted at slot 6.
__ 71 __ __ 34 __ 56 __ __ __
Next key, k = 14. h(14,0) = [(14 mod 10) + 0 x (14 mod 8)] mod 10 = [4+0] mod 10 = 4 mod 10 = 4.
Since slot 4 is occupied, the next probe position is calculated as: h(14,1) =
[(14 mod 10) + 1 x (14 mod 8)] mod 10 = [4+6] mod 10 = 10 mod 10 = 0.
Since slot 0 is free, 14 will be inserted at slot 0.
14 71 __ __ 34 __ 56 __ __ __
Next key, k = 69. h(69,0) = [(69 mod 10) + 0 x (69 mod 8)] mod 10 = [9+0] mod 10 = 9 mod 10 = 9. Since slot 9 is free, 69 is inserted at slot 9.
14 71 __ __ 34 __ 56 __ __ 69
Next key, k = 45. h(45,0) = [(45 mod 10) + 0 x (45 mod 8)] mod 10 = [5+0] mod 10 = 5 mod 10 = 5. Since slot 5 is free, 45 is inserted at slot 5.
14 71 __ __ 34 45 56 __ __ 69
Last key, k = 9. h(9,0) = [(9 mod 10) + 0 x (9 mod 8)] mod 10 = [9+0] mod 10 = 9 mod 10 = 9.
Since slot 9 is occupied, the next probe position is calculated as: h(9,1) =
[(9 mod 10) + 1 x (9 mod 8)] mod 10 = [9+1] mod 10 = 10 mod 10 = 0.
Since slot 0 is also occupied, the next probe position is calculated as: h(9,2)
= [(9 mod 10) + 2 x (9 mod 8)] mod 10 = [9+2] mod 10 = 11 mod 10 = 1.
Since slot 1 is also occupied, the next probe position is calculated as: h(9,3)
= [(9 mod 10) + 3 x (9 mod 8)] mod 10 = [9+3] mod 10 = 12 mod 10 = 2.
Since slot 2 is free, 9 will be inserted at slot 2.
14 71 9 __ 34 45 56 __ __ 69
Note that for the last key, 9, we had to probe four times. Although double
hashing is a very efficient technique, in this example we still observe some
degraded performance.
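The double-hashing insertion of Example 14.9 can be sketched in Python as follows, assuming h1(k) = k mod 10 and h2(k) = k mod 8 exactly as in the example; the function names are illustrative only.

def double_hash_insert(table, key):
    n = len(table)
    h1 = key % n        # h1(k) = k mod n
    h2 = key % 8        # h2(k) = k mod 8, as used in Example 14.9
    for i in range(n):
        slot = (h1 + i * h2) % n     # h(k, i) = [h1(k) + i*h2(k)] mod n
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no free slot found by double hashing")

table = [None] * 10
for k in (34, 71, 56, 14, 69, 45, 9):
    double_hash_insert(table, k)
print(table)    # [14, 71, 9, None, 34, 45, 56, None, None, 69]

One caveat of this simple sketch: if h2(k) evaluates to 0 and the first slot is occupied, the probe sequence would not advance; the keys in the example do not trigger that case.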
In chaining, when a collision occurs, the elements are not stored in other
free slots of the table. Instead, the hash table maintains a separate linked
list for each slot to store the elements/records. All the elements for which
the hash function returns the same slot of the hash table are put into a
single linked list. Initially, all the slots in the hash table contain a
None or NULL value. When an element is hashed to a particular slot, a linked
list is created containing the element, and the slot stores the address or
reference of the first node (at this point, the only node) of the list. If
another element is later hashed to the same slot, it is inserted as a new
node in the linked list of that slot. How the elements are mapped in the
hash table and stored in linked lists is shown in Figure 14.3.
[Figure 14.3: A universe of keys is mapped by the hash function h(k) into
the slots of a hash table; each occupied slot holds a reference to a linked
list of the actual keys hashed to it, while empty slots hold NULL.]
Example 14.10: Consider a hash table of size 10 and the basic hash
function h’(k) = k mod n is used. Insert the following keys into the hash
table using the chaining method: 34, 71, 56, 14, 69, 45, and 9.
Initially, all ten slots (0 to 9) of the hash table contain NULL.
h(34) = 34 mod 10 = 4. A linked list containing 34 is created and slot 4 refers to it: 34 → NULL.
h(71) = 71 mod 10 = 1. Slot 1 now refers to the list 71 → NULL.
h(56) = 56 mod 10 = 6. Slot 6 now refers to the list 56 → NULL.
h(14) = 14 mod 10 = 4. Slot 4 already refers to a list, so 14 is inserted as a new node in it: 34 → 14 → NULL.
h(69) = 69 mod 10 = 9. Slot 9 now refers to the list 69 → NULL.
h(45) = 45 mod 10 = 5. Slot 5 now refers to the list 45 → NULL.
h(9) = 9 mod 10 = 9. Slot 9 already refers to a list, so 9 is inserted as a new node in it: 69 → 9 → NULL.
The final hash table is:
0 NULL
1 → 71 → NULL
2 NULL
3 NULL
4 → 34 → 14 → NULL
5 → 45 → NULL
6 → 56 → NULL
7 NULL
8 NULL
9 → 69 → 9 → NULL
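A minimal Python sketch of the chaining insertion of Example 14.10 is given below. For brevity it uses Python's built-in lists in place of hand-built linked-list nodes; the names are illustrative only.

def chain_insert(table, key):
    slot = key % len(table)        # h(k) = k mod n
    if table[slot] is None:        # first key hashed to this slot
        table[slot] = [key]
    else:                          # collision: append to this slot's chain
        table[slot].append(key)

table = [None] * 10
for k in (34, 71, 56, 14, 69, 45, 9):
    chain_insert(table, k)
print(table)
# [None, [71], None, None, [34, 14], [45], [56], None, None, [69, 9]]

Each non-empty slot holds the chain of keys hashed to it, matching the final table of Example 14.10.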
14.4 Rehashing
Sometimes it may happen that with exhaustive use the hash table may be
nearly full. At this stage, performance is very much degraded with open
addressing. In case of quadratic probing, a free slot may not be found for
inserting a new record. In this situation a new hash table of size double the
current one is created and existing elements/records are remapped in the
new table. This is called rehashing.
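A minimal Python sketch of rehashing is shown below. The load-factor threshold of 0.7 and the use of linear probing to remap the keys are assumptions made for this illustration, not requirements stated in the text.

def rehash(table):
    # Create a table of double the size and re-insert every existing key.
    new_table = [None] * (2 * len(table))
    n = len(new_table)
    for key in table:
        if key is None:
            continue
        for i in range(n):                 # remap with linear probing (assumed)
            slot = (key % n + i) % n
            if new_table[slot] is None:
                new_table[slot] = key
                break
    return new_table

table = [9, 71, None, None, 34, 14, 56, 45, None, 69]   # 7 keys in 10 slots
if sum(k is not None for k in table) / len(table) >= 0.7:   # load factor check
    table = rehash(table)
print(len(table))   # 20 -- the keys are now spread over a table twice as large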
• Cryptographic hash functions are used for message digest and password
verification.
Hashing at a Glance
✓ When a hash function produces the same hash value for more than one key,
it is known as a collision.
✓ In open addressing all the key elements are stored within the hash table.
✓ Linear probing, quadratic probing, and double hashing are common open
addressing schemes for collision resolution.
✓ In linear probing, when a collision occurs the key element is stored in
the next available slot, using the hash function h(k, i) = [h'(k) + i] mod n
for i = 0, 1, 2, ..., n-1.
✓ In double hashing, two hash functions are used: the first is used to probe
a location in the hash table, and the second is used to find the interval
that is added to the address determined by the first hash function.
✓ In chaining the hash table maintains a separate linked list for each slot to
store the elements.
✓ When an old hash table is nearly full, a new hash table of size double the
current one is created and existing elements/records are remapped in the
new table. This is called rehashing.
a) Open addressing
b) Quadratic probing
c) Folding
d) Chaining
2. The ratio of the number of items in a hash table to the table size is called
a) Load factor
b) Item factor
c) Balance factor
d) All of these
a) Mid-square method
b) Multiplication method
c) Folding
d) Chaining
a) Linear addressing
b) Quadratic probing
c) Double Hashing
d) Rehashing
a) Mid-square method
b) Multiplication method
c) Folding
d) Chaining
6. When multiple elements are mapped to the same location in the hash table, it is called
a) Repetition
b) Replication
c) Collision
d) Duplication
7. If n is the size of a hash table, which one of the following may be a
hash function for implementing linear probing?
9. Suppose the hash function h(k) = k mod 10 is used. Which of the following
statements is true for the following inputs?
a) i only
b) ii only
c) i and ii only
d) iii or iv
10. Consider the size of a hash table as 10 whose starting index is 0 and
initially empty. If the division method is used as a hash function, what will
be the contents of the hash table when the sequence 55, 367, 29, 83, 10,121
is inserted into the table? [‘_’ denotes an empty location in the hash table.]
11. Consider the size of a hash table as 10 whose starting index is 0 and
initially empty. If the division method is used as a basic hash function and
linear probing is used for collision resolution, what will be the contents of
the hash table when the sequence 55, 67, 105, 26, 19, 35, 119 is inserted
into the table? [‘_’ denotes an empty location in the hash table.]
d) None of these.
12. Consider the size of a hash table as 10 whose starting index is 0 and
initially empty. If the division method is used as a basic hash function and
quadratic probing is used for collision resolution, what will be the contents
of the hash table when the sequence 55, 67, 105, 26, 19, 35, 119 is inserted
into the table? [‘_’ denotes an empty location in the hash table.]
d) None of these.
13. Consider a hash table of size 10 whose starting index is 0. Map the key
568 to an appropriate location using the folding method.
a) 9
b) 8
c) 1
d) 0
14. Consider a hash table of size 100 whose starting index is 0. Map the key
56 to an appropriate location using the folding method.
a) 6
b) 56
c) 1
d) 11
Review Exercises
9. What is rehashing?
11. Consider a hash table of size 1000. Map the keys 29 and 5162 to
appropriate locations using the division method.
12. Consider a hash table of size 1000. Map the key 23401 to an appropriate
location using the mid-square method.
13. Consider a hash table of size 100. Map the keys 153249 and 513 to
appropriate locations using the folding method.
14. Consider a hash table of size 1000. Map the keys 57 and 4392 to
appropriate locations using the multiplication method.
15. Consider a hash table of size 10 and the basic hash function h’(k) = k
mod n is used.
Insert the following keys into the hash table using linear probing: 68, 23,
57, 83, 77, 98, 47, 50, and 9.
16. Consider a hash table of size 10 and the basic hash function h'(k) = k
mod n is used.
Further consider that c1=0 and c2=1. Insert the following keys into the hash
table using quadratic probing:
17. Consider a hash table of size 11 and the hash functions h1(k) = k mod 11
and h2(k) = k mod 7 are used. Insert the following keys into the hash table
using double hashing: 29, 56, 73, 43, 89, 51, and 16.
18. Consider a hash table of size 10 and the basic hash function h’(k) = k
mod n is used.
Insert the following keys into the hash table using the chaining method: 37,
46, 92, 87, 29, 66, 69, 96, and 7.
Appendix
Chapter 1: Data Structure Preliminaries
1. b
2. d
3. c
4. d
5. d
6. a
7. b
8. d
9. b
10. d
11. c
12. b
13. a
14. d
15. d
16. d
Chapter 2: Introduction to Algorithm
1. c
2. b
3. a
4. b
5. c
6. a
7. b
8. a
9. b
10. c
11. c
12. d
13. b
14. d
15. d
Chapter 3: Array
1. a
2. a
3. d
4. a
5. b
6. d
7. b
8. d
9. c
10. d
11. a
12. c
13. c
14. c
Chapter 4: Python Data Structures
1. c
2. d
3. a
4. b
5. a
6. b
7. c
8. d
9. a
10. c
11. d
12. b
13. a
14. b
15. b
16. a
17. d
18. b
Chapter 5: Strings
1. a
2. c
3. b
4. d
5. b
6. d
7. d
8. b
9. d
10. d
11. d
12. c
13. b
14. a
15. c
Chapter 6: Recursion
1. b
2. a
3. b
4. d
5. a
6. d
7. b
8. d
9. c
10. d
11. c
Chapter 7: Linked List
1. d
2. d
3. b
4. a
5. d
6. b
7. c
8. d
9. d
10. c
11. b
12. c
13. b
14. c
15. d
16. d
17. b
18. b
Chapter 8: Stack
1. c
2. d
3. a
4. b
5. d
6. c
7. b
8. d
9. d
10. c
11. b
12. d
13. b
14. b
15. a
16. c
17. c
18. a
19. a
20. c
21. b
22. a
23. d
24. a
25. d
26. b
27. a
28. a
Chapter 9: Queue
1. c
2. c
3. a
4. c
5. a
6. c
7. d
8. d
9. a
10. c
11. a
12. c
Chapter 10: Trees
1. c
2. c
3. d
4. d
5. c
6. b
7. a
8. d
9. b
10. c
11. c
12. c
13. a
14. a
15. a
16. d
17. d
18. d
19. b
20. c
Chapter 11: Heap
1. a
2. a
3. b
4. c
5. c
6. c
7. c
8. b
9. b
10. d
Chapter 12: Graph
1. b
2. a
3. c
4. b
5. a
6. a
7. c
8. b
9. c
10. b
11. a
12. b
13. a
14. b
15. a
Chapter 13: Searching and Sorting
1. a
2. a
3. c
4. b
5. d
6. d
7. b
8. a
9. c
10. c
11. a
12. b
13. c
14. a
Chapter 14: Hashing
1. c
2. a
3. d
4. d
5. d
6. c
7. c
8. b
9. c
10. b
11. c
12. a
13. a
14. b
Index
postfix expressions
array, 3–4
dynamic programming
arange() method, 44
characteristics, 15
definition, 41
examples, 74–78
defined, 15
multiplication/repetition, 56–57
example, 17, 19
NumPy package, 43
flowchart, 17
importance of, 16
modular programming, 16
pseudocode, 19–20
time–space trade-off, 25
arrays)
array representation
arithmetic expressions
heap, 488
335–338
queue, 353–360
stack, 312–313
extended, 396
full, 395–396
AVL tree
strictly, 394
definition, 449
construction, 406–408
operations, 450–456
conversion, 408–410
Big O Notation
algorithms, 28
defined, 27
drawbacks, 29
examples, 28
B Tree, 466–471
rate of growth, 28
B* Tree, 476
B+ Tree, 471–476
chaining technique
examples, 410–411
circular queue
description, 360
traversing, 415
representation)
operations, 361–368
coding algorithm; M-way search trees;
applications, 253–255
complete, 395
defined, 233
description, 394
time complexity, 2
tree, 7–9
data type
defined, 1
primitive, 1–2
user defined, 2
description, 612
definiteness, 15
technique)
DEQUE, 5
POP, 4
dictionaries
control structures
examples, 21–24
implementation of, 20
comparison, 122
selection/decision, 21
sequence control, 20
defined, 115
data structures
array, 3–4
graph, 6–7
heap, 9
linear, 2
non-linear, 2
operations, 10
dot() method, 66
queue, 5–6
space complexity, 2
stack, 4–5
frozenset, 114–115
defined, 255
new element, insertion of, 257–261
path algorithm
dynamic programming
recursion algorithms, 32
applications, 549
effectiveness, 15
BFS, 521–525
connected, 505
cycle, 505
defined, 6, 503
DFS, 525–528
directed, 504
finiteness, 15
linked list, 6
loop/self-loop, 506
flowchart
multigraph, 505
example, 19
pictorial/diagrammatic representation, 17
program, 17, 18
path, 505
symbols, 18
representation of, 6
system, 18
sink, 505
source, 505
folding method, 610
undirected, 504
weighted, 504
bubble sort, 26
defined, 25
front end, 5
technique
defined, 606
403–404
input, 15
ENQUE, 5
PUSH, 4
applications, 623
definition, 605–606
advantages, 296
description, 283
disadvantages, 296–297
programs, 284–296
heap
defined, 9
algorithm, 168–174
left sub-tree, 7
matrix addition, 65
linked list, 4. See also circular singly linked list; matrix multiplication, 66
meta-characters
definition, 208
description, 528
examples, 529
single circular linked list, 375–377
modular programming, 16
multigraph, 505
concatenation, 95–96
defined, 85–86
functions, 97–98
multiplication/repetition, 96
nested list, 97
nodes, 4
graph, 503
Ω (Omega) Notation, 29
loop/self-loop, 506
defined, 32–33
program example, 34
recursive calls, 33
definition, 381
multiple, 383–384
single, 381–3
hash table, 612
pseudocode, 19–20
operands, 326
output, 15
stack, 314–317
queue
applications, 384
pattern/substring, 165
definition, 351
PEEK, 5
functionalities, 352
overflow, 5
polynomials, 57–58
underflow, 5
rear end, 5
404–406
recursion, 341–343
advantages, 196
disadvantages, 196
priority queue
problems
sets
defined, 107–108
tail recursion, 189
defined, 158
meta-characters, 158–160
siblings, 392
reshape() method, 61
applications, 228–232
right sub-tree, 7
examples, 210–211
searching operation, 10
searching process
binary, 559–564
description, 557
interpolation, 564–566
linear, 558–559
sorting operation, 10
sorting
bubble, 567–571
heap, 587–591
insertion, 574–577
merge, 583–587
quick, 577–583
radix, 591–594
shell, 594–597
space complexity, 2, 24
len(), 153–156
stacks
lower(), 150–156
examples, 311
functionalities, 312
sorted(), 153–156
representation of, 5
stack overflow, 5
stack underflow, 5
151–156
string methods
capitalize(), 150–156
casefold(), 150–156
upper(), 150–156
string operations
151–156
156
comparison, 157–158
constants, 157
traversing operation, 10
defined, 145
tree
binary tree, 7
defined, 7
degree, 392
sub-modules, 16
sub tree, 7–9, 33, 34, 187, 391, 394, 396, 397,
representation of, 7
system flowcharts, 18
siblings, 392
types of
BST, 396–397
inorder
two-way, 438
tuples
preorder
two-way, 438
defined, 101
methods, 105–106
amortized time, 25
nested, 105
average case, 25
best case, 24
defined, 24
Θ (Theta) Notation, 30
matrix addition, 65
transpose() method, 66
matrix multiplication, 66
transpose of matrix, 66