Introduction to Algorithms and Data Structures

Michael J. Dinneen    Georgy Gimel’farb    Mark C. Wilson

© 2016

(Fourth edition)
Contents

Contents
List of Figures
List of Tables
Preface

I  Introduction to Algorithm Analysis

1  What is Algorithm Analysis?
   1.1  Efficiency of algorithms: first examples
   1.2  Running time for loops and other computations
   1.3  “Big-Oh”, “Big-Theta”, and “Big-Omega” tools
   1.4  Time complexity of algorithms
   1.5  Basic recurrence relations
   1.6  Capabilities and limitations of algorithm analysis
   1.7  Notes

2  Efficiency of Sorting
   2.1  The problem of sorting
   2.2  Insertion sort
   2.3  Mergesort
   2.4  Quicksort
   2.5  Heapsort
   2.6  Data selection
   2.7  Lower complexity bound for sorting
   2.8  Notes

3  Efficiency of Searching
   3.1  The problem of searching
   3.2  Sorted lists and binary search
   3.3  Binary search trees
   3.4  Self-balancing binary and multiway search trees
   3.5  Hash tables
   3.6  Notes

II  Introduction to Graph Algorithms

4  The Graph Abstract Data Type
   4.1  Basic definitions
   4.2  Digraphs and data structures
   4.3  Implementation of digraph ADT operations
   4.4  Notes

5  Graph Traversals and Applications
   5.1  Generalities on graph traversal
   5.2  DFS and BFS
   5.3  Additional properties of depth-first search
   5.4  Additional properties of breadth-first search
   5.5  Priority-first search
   5.6  Acyclic digraphs and topological ordering
   5.7  Connectivity
   5.8  Cycles
   5.9  Maximum matchings
   5.10 Notes

6  Weighted Digraphs and Optimization Problems
   6.1  Weighted digraphs
   6.2  Distance and diameter in the unweighted case
   6.3  Single-source shortest path problem
   6.4  All-pairs shortest path problem
   6.5  Minimum spanning tree problem
   6.6  Hard graph problems
   6.7  Notes

III  Appendices

A  Java code for Searching and Sorting
   A.1  Sorting and selection
   A.2  Search methods

B  Java graph ADT
   B.1  Java adjacency matrix implementation
   B.2  Java adjacency lists implementation
   B.3  Standardized Java graph class
   B.4  Extended graph classes: weighted edges

C  Background on Data Structures
   C.1  Informal discussion of ADTs
   C.2  Notes on a more formal approach

D  Mathematical Background
   D.1  Sets
   D.2  Mathematical induction
   D.3  Relations
   D.4  Basic rules of logarithms
   D.5  L’Hôpital’s rule
   D.6  Arithmetic, geometric, and other series
   D.7  Trees

E  Solutions to Selected Exercises

Bibliography

Index
List of Figures

1.1   Linear-time algorithm to sum an array.
1.2   Quadratic time algorithm to compute sums of an array.
1.3   Linear-time algorithm to compute sums of an array.
1.4   “Big Oh” property: g(n) is O(n).
1.5   Telescoping as a recursive substitution.
2.1   Insertion sort for arrays.
2.2   Recursive mergesort for arrays.
2.3   Linear time merge for arrays.
2.4   Basic array-based quicksort.
2.5   Complete binary tree and its array representation.
2.6   Maximum heap and its array representation.
2.7   Heapsort.
2.8   Basic array-based quickselect.
2.9   Decision tree for n = 3.
3.1   A sequential search algorithm.
3.2   Binary search for the key 42.
3.3   Binary tree representation of a sorted array.
3.4   Binary search with three-way comparisons.
3.5   Faster binary search with two-way comparisons.
3.6   Binary trees: only the leftmost tree is a binary search tree.
3.7   Search and insertion in the binary search tree.
3.8   Removal of the node with key 10 from the binary search tree.
3.9   Binary search trees obtained by permutations of 1, 2, 3, 4.
3.10  Binary search trees of height about log n.
3.11  Binary search trees of height about n.
3.12  Left and right rotations of a BST.
3.13  Multiway search tree of order m = 4.
3.14  2–4 B-tree with the leaf storage size 7.
3.15  Birthday paradox: Pr365(n).
4.1   A graph G1 and a digraph G2.
4.2   A subdigraph and a spanning subdigraph of G2.
4.3   The subdigraph of G2 induced by {1, 2, 3}.
4.4   The reverse of digraph G2.
4.5   The underlying graph of G2.
5.1   Graph traversal schema.
5.2   Node states in the middle of a digraph traversal.
5.3   Decomposition of a digraph in terms of search trees.
5.4   A graph G1 and a digraph G2.
5.5   BFS trees for G1 and G2, rooted at 0.
5.6   DFS trees for G1 and G2, rooted at 0.
5.7   Depth-first search algorithm.
5.8   Recursive DFS visit algorithm.
5.9   Breadth-first search algorithm.
5.10  Priority-first search algorithm (first kind).
5.11  Digraph describing structure of an arithmetic expression.
5.12  Topological orders of some DAGs.
5.13  A digraph and its strongly connected components.
5.14  Structure of a digraph in terms of its strong components.
5.15  Some (di)graphs with different cycle behaviour.
5.16  A bipartite graph.
5.17  A maximal and maximum matching in a bipartite graph.
5.18  An algorithm to find an augmenting path.
5.19  Structure of the graph traversal tree for finding augmenting paths.
6.1   Some weighted (di)graphs.
6.2   Dijkstra’s algorithm, first version.
6.3   Picture for proof of Dijkstra’s algorithm.
6.4   Dijkstra’s algorithm, PFS version.
6.5   Bellman–Ford algorithm.
6.6   Floyd’s algorithm.
6.7   Prim’s algorithm.
6.8   Kruskal’s algorithm.
B.1   Sample output of the graph test program.
D.1   Approximation of an integral by lower rectangles.


List of Tables

1.1   Relative growth of linear and quadratic terms in an expression.
1.2   Relative growth of running time T(n) when the input size increases.
1.3   The largest data sizes n that can be processed by an algorithm.
2.1   Sample execution of insertion sort.
2.2   Number of inversions Ii, comparisons Ci and data moves Mi.
2.3   Partitioning in quicksort with pivot p = 31.
2.4   Inserting a new node with the key 75 in the heap in Figure 2.6.
2.5   Deletion of the maximum key from the heap in Figure 2.6.
2.6   Successive steps of heapsort.
3.1   A map between airport codes and locations.
3.2   Height of the optimal m-ary search tree with n nodes.
3.3   Open addressing with linear probing (OALP).
3.4   Open addressing with double hashing (OADH).
3.5   Birthday paradox: Pr365(n).
3.6   Average search time bounds in hash tables with load factor λ.
4.1   Digraph operations in terms of data structures.
4.2   Comparative worst-case performance of adjacency lists and matrices.
6.1   Illustrating Dijkstra’s algorithm.


Introduction to the Fourth Edition

The fourth edition follows the third edition, incorporating fixes for the errata
discovered in the third edition.

The textbook’s cover page displays the complete list of 88 connected forbidden
minors for graphs with vertex cover at most 6, computed by M. J. Dinneen and
L. Xiong in 2001.

This work is licensed under a Creative Commons Attribution-NonCommercial
3.0 Unported License.

Michael J. Dinneen
Georgy Gimel’farb
Mark C. Wilson

Department of Computer Science


University of Auckland
Auckland, New Zealand

{mjd, georgy, mcw}@cs.auckland.ac.nz

February 2016
Introduction to the Third Edition

The focus for this third edition has been to make an electronic version that students
can read on the tablets and laptops that they bring to lectures. The main changes
from the second edition are:

• Incorporated fixes for the errata discovered in the second edition.


• Made e-book friendly by adding cross referencing links to document elements
and formatted pages to fit on most small screen tablets (i.e., width of margins
minimized).
• Limited the material to the topics currently taught in the University of Auckland’s
CompSci 220 course: removed the two chapters on formal languages (Part III) and
truncated the book title.
• This work is licensed under a Creative Commons Attribution-NonCommercial
3.0 Unported License.

Michael J. Dinneen
Georgy Gimel’farb
Mark C. Wilson

Department of Computer Science


University of Auckland
Auckland, New Zealand

{mjd, georgy, mcw}@cs.auckland.ac.nz

March 2013
Introduction to the Second Edition

Writing a second edition is a thankless task, as is well known to authors. Much of the
time is spent on small improvements that are not obvious to readers. We have taken
considerable efforts to correct a large number of errors found in the first edition, and
to improve explanation and presentation throughout the book, while retaining the
philosophy behind the original. As far as material goes, the main changes are:
• more exercises and solutions to many of them;
• a new section on maximum matching (Section 5.9);
• a new section on string searching (Part III);
• a Java graph library updated to Java 1.6 and freely available for download.
The web site http://www.cs.auckland.ac.nz/textbookCS220/ for the book pro-
vides additional material including source code. Readers finding errors are encour-
aged to contact us after viewing the errata page at this web site.
In addition to the acknowledgments in the first edition, we thank Sonny Datt for
help with updating the Java graph library, Andrew Hay for help with exercise solu-
tions and Cris Calude for comments. Rob Randtoul (PlasmaDesign.co.uk) kindly
allowed us to use his cube artwork for the book’s cover. Finally, we thank
MJD all students who have struggled to learn from the first edition and have given
us feedback, either positive or negative;
GLG my wife Natasha and all the family for their permanent help and support;
MCW my wife Golbon and sons Yusef and Yahya, for their sacrifices during the writ-
ing of this book, and the joy they bring to my life even in the toughest times.

31 October 2008
Introduction to the First Edition

This book is an expanded, and, we hope, improved version of the coursebook for
the course COMPSCI 220 which we have taught several times in recent years at the
University of Auckland.
We have taken the step of producing this book because there is no single text
available that covers the syllabus of the above course at the level required. Indeed,
we are not aware of any other book that covers all the topics presented here. Our
aim has been to produce a book that is straightforward, concise, and inexpensive,
and suitable for self-study (although a teacher will definitely add value, particularly
where the exercises are concerned). It is an introduction to some key areas at the
theoretical end of computer science, which nevertheless have many practical appli-
cations and are an essential part of any computer science student’s education.
The material in the book is all rather standard. The novelty is in the combina-
tion of topics and some of the presentation. Part I deals with the basics of algorithm
analysis, tools that predict the performance of programs without wasting time im-
plementing them. Part II covers many of the standard fast graph algorithms that
have applications in many different areas of computer science and science in gen-
eral. Part III introduces the theory of formal languages, shifting the focus from what
can be computed quickly to what families of strings can be recognized easily by a
particular type of machine.
The book is designed to be read cover-to-cover. In particular Part I should come
first. However, one can read Part III before Part II with little chance of confusion.
To make best use of the book, one must do the exercises. They vary in difficulty
from routine to tricky. No solutions are provided. This policy may be changed in a
later edition.
The prerequisites for this book are similar to those of the above course, namely
two semesters of programming in a structured language such as Java (currently used
at Auckland). The book contains several appendices which may fill in any gaps in
the reader’s background.
A limited bibliography is given. There are so many texts covering some of the
topics here that to list all of them is pointless. Since we are not claiming novelty
of material, references to research literature are mostly unnecessary and we have
omitted them. More advanced books (some listed in our bibliography) can provide
more references as a student’s knowledge increases.
A few explanatory notes to the reader about this textbook are in order.
We describe algorithms using a pseudocode similar to, but not exactly like, many
structured languages such as Java or C++. Loops and control structures are indented
in fairly traditional fashion. We do not formally define our pseudocode or comment
style (this might make an interesting exercise for a reader who has mastered Part III).
We make considerable use of the idea of ADT (abstract data type). An abstract
data type is a mathematically specified collection of objects together with opera-
tions that can be performed on them, subject to certain rules. An ADT is completely
independent of any computer programming implementation and is a mathematical
structure similar to those studied in pure mathematics. Examples in this book in-
clude digraphs and graphs, along with queues, priority queues, stacks, and lists. A
data structure is simply a higher level entity composed of the elementary memory
addresses related in some way. Examples include arrays, arrays of arrays (matrices),
linked lists, doubly linked lists, etc.
The difference between a data structure and an abstract data type is exemplified
by the difference between a standard linear array and what we call a list. An array is
a basic data structure common to most programming languages, consisting of con-
tiguous memory addresses. To find an element in an array, or insert an element, or
delete an element, we directly use the address of the element. There are no secrets
in an array. By contrast, a list is an ADT. A list is specified by a set S of elements
from some universal set U, together with operations insert, delete, size, isEmpty
and so on (the exact definition depends on who is doing the defining). We denote
the result of the operation as S.isEmpty(), for example. The operations must sat-
isfy certain rules, for example: S.isEmpty() returns a boolean value TRUE or FALSE;
S.insert(x, r) requires that x belong to U and r be an integer between 0 and S.size(),
and returns a list; for any admissible x and r we have S.isEmpty(S.insert(x, r)) =
FALSE, etc. We are not interested in how the operations are to be carried out, only
in what they do. Readers familiar with languages that facilitate object-based and
object-oriented programming will recognize ADTs as, essentially, what are called
classes in Java or C++.
A list can be implemented using an array (to be more efficient, we would also
have an extra integer variable recording the array size). The insert operation, for ex-
ample, can be achieved by accessing the correct memory address of the r-th element
of the array, allocating more space at the end of the array, shifting along some ele-
ments by one, and assigning the element to be inserted to the address vacated by the
shifting. We would also update the size variable by 1. These details are unimportant
in many programming applications. However they are somewhat important when
discussing complexity as we do in Part I. While ADTs allow us to concentrate on algo-
rithms without worrying about details of programming implementation, we cannot
ignore data structures forever, simply because some implementations of ADT oper-
ations are more efficient than others.
In summary, we use ADTs to sweep programming details under the carpet as long
as we can, but we must face them eventually.
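As a concrete illustration of the array-backed list just described (this sketch is ours, not the book’s code; it assumes the array has spare capacity and that the caller keeps track of the current size):

    // Insert x at position r (0 <= r <= size) in an array-backed list.
    // Shifts a[r..size-1] one place right, then writes x into the vacated slot.
    static void insert(int[] a, int size, int x, int r) {
        for (int i = size; i > r; i--) {
            a[i] = a[i - 1];          // shift elements right by one
        }
        a[r] = x;                     // place the new element
        // the caller then increases its size variable by 1
    }

The shifting loop makes the cost of a single insert proportional to size − r, which is exactly the kind of implementation detail that matters when we discuss complexity in Part I.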
A book of this type, written by three authors with different writing styles under
some time pressure, will inevitably contain mistakes. We have been helped to mini-
mize the number of errors by the student participants in the COMPSCI 220 course-
book error-finding competition, and our colleagues Joshua Arulanandham and An-
dre Nies, to whom we are very grateful.
Our presentation has benefitted from the input of our colleagues who have taught
COMPSCI 220 in the recent and past years, with special acknowledgement due to
John Hamer and the late Michael Lennon.

10 February 2004
Part I

Introduction to Algorithm Analysis


Chapter 1

What is Algorithm Analysis?

Algorithmic problems are of crucial importance in modern life. Loosely speaking,


they are precisely formulated problems that can be solved in a step-by-step, me-
chanical manner.
Definition 1.1 (informal). An algorithm is a list of unambiguous rules that spec-
ify successive steps to solve a problem. A computer program is a clearly specified
sequence of computer instructions implementing the algorithm.
For example, a sufficiently detailed recipe for making a cake could be thought of
as an algorithm. Problems of this sort are not normally considered as part of com-
puter science. In this book, we deal with algorithms for problems of a more abstract
nature. Important examples, all discussed in this book, and all with a huge number
of practical applications, include: sorting a database, finding an entry in a database,
finding a pattern in a text document, finding the shortest path through a network,
scheduling tasks as efficiently as possible, finding the median of a statistical sample.
Strangely enough, it is very difficult to give simple precise mathematical defini-
tions of algorithms and programs. The existing very deep general definitions are too
complex for our purposes. We trust that the reader of this book will obtain a good
idea of what we mean by algorithm from the examples in this and later chapters.
We often wish to compare different algorithms for the same problem, in order
to select the one best suited to our requirements. The main features of interest are:
whether the algorithm is correct (does it solve the problem for all legal inputs), and
how efficient it is (how much time, memory storage, or other resources it uses).
The same algorithm can be implemented by very different programs written in
different programming languages, by programmers of different levels of skill, and
then run on different computer platforms under different operating systems. In
searching for the best algorithm, general features of algorithms must be isolated
from peculiarities of particular platforms and programs.
To analyse computer algorithms in practice, it is usually sufficient to first specify
elementary operations of a “typical” computer and then represent each algorithm
as a sequence of those operations.
Most modern computers and languages build complex programs from ordinary
arithmetic and logical operations such as standard unary and binary arithmetic op-
erations (negation, addition, subtraction, multiplication, division, modulo opera-
tion, or assignment), Boolean operations, binary comparisons (“equals”, “less than”,
or “greater than”), branching operations, and so on. It is quite natural to use these
basic computer instructions as algorithmic operations, which we will call elemen-
tary operations.
It is not always clear what should count as an elementary operation. For example,
addition of two 64-bit integers should definitely count as elementary, since it can be
done in a fixed time for a given implementation. But for some applications, such as
cryptography, we must deal with much larger integers, which must be represented
in another way. Addition of “big” integers takes a time roughly proportional to the
size of the integers, so it is not reasonable to consider it as elementary. From now on
we shall ignore such problems. For most of the examples in this introductory book
they do not arise. However, they must be considered in some situations, and this
should be borne in mind.

Definition 1.2 (informal). The running time (or computing time) of an algorithm is
the number of its elementary operations.

The actual execution time of a program implementing an algorithm is roughly


proportional to its running time, and the scaling factor depends only on the partic-
ular implementation (computer, programming language, operating system, and so
on).
The memory space required for running an algorithm depends on how many
individual variables (input, intermediate, and output data) are involved simultane-
ously at each computing step. Time and space requirements are almost always inde-
pendent of the programming language or style and characterise the algorithm itself.
From here on, we will measure effectiveness of algorithms and programs mostly in
terms of their time requirements. Any real computer has limits on the size and the
number of data items it can handle.

1.1 Efficiency of algorithms: first examples


If the same problem can be solved by different algorithms, then all other things
being equal, the most efficient algorithm uses least computational resources. The
theory of algorithms in modern computer science clarifies basic algorithmic notions
algorithm linearSum
Input: array a[0..n − 1]
begin
s←0
for i ← 0 step i ← i + 1 until n − 1 do
s ← s + a[i]
end for
return s
end

Figure 1.1: Linear-time algorithm to sum an array.

such as provability, correctness, complexity, randomness, or computability. It stud-


ies whether there exist any algorithms for solving certain problems, and if so, how
fast they can be. In this book, we take only a few small steps into this domain.
To search for the most efficient algorithm, one should mathematically prove cor-
rectness of and determine time/space resources for each algorithm as explicit func-
tions of the size of input data to process. For simplicity of presentation in this book,
we sometimes skip the first step (proof of correctness), although it is very impor-
tant. The focus of this chapter is to introduce methods for estimating the resource
requirements of algorithms.
Example 1.3 (Sum of elements of an array). Let a denote an array of integers for which
the sum s = ∑_{i=0}^{n−1} a[i] is required. To get the sum s, we have to repeat n times the same
elementary operations (fetching from memory and adding a number). Thus, run-
ning time T (n) is proportional to, or linear in n: T (n) = cn. Such algorithms are also
called linear algorithms. The unknown factor c depends on a particular computer,
programming language, compiler, operating system, etc. But the relative change
in running time is just the same as the change in the data size: T (10) = 10T (1), or
T (1000) = 1000T (1), or T (1000) = 10T (100). The linear algorithm in Figure 1.1 imple-
ments a simple loop.
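For readers who prefer runnable code, a minimal Java version of the pseudocode in Figure 1.1 could look as follows (the method name is ours):

    // Returns the sum of the elements of a; performs one fetch and one
    // addition per element, so the running time is linear in a.length.
    static long linearSum(int[] a) {
        long s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }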

Example 1.4 (Sums of subarrays). The problem is to compute, for each subarray
a[ j.. j + m − 1] of size m in an array a of size n, the partial sum of its elements
s[ j] = ∑_{k=0}^{m−1} a[ j + k]; j = 0, . . . , n − m. The total number of these subarrays is n − m + 1.
At first glance, we need to compute n − m + 1 sums, each of m items, so that the running
time is proportional to m(n − m + 1). If m is fixed, the time still depends linearly on n.
But if m grows with n as a fraction of n, such as m = n/2, then T(n) = c(n/2)(n/2 + 1)
= 0.25cn² + 0.5cn. The relative weight of the linear part, 0.5cn, decreases quickly with
respect to the quadratic one as n increases. For example, if T(n) = 0.25n² + 0.5n, we
see in the last column of Table 1.1 the rapid decrease of the ratio of the two terms.
Table 1.1: Relative growth of linear and quadratic terms in an expression.

  n        T(n)       0.25n²     0.5n     0.5n as % of 0.25n²
  10       30         25         5        20.0
  50       650        625        25       4.0
  100      2550       2500       50       2.0
  500      62750      62500      250      0.4
  1000     250500     250000     500      0.2
  5000     6252500    6250000    2500     0.04

Thus, for large n only the quadratic term becomes important and the running
time is roughly proportional to n², or is quadratic in n. Such algorithms are some-
times called quadratic algorithms in terms of relative changes of running time with
respect to changes of the data size: if T(n) ≈ cn² then T(10) ≈ 100T(1), or T(100) ≈
10000T(1), or T(100) ≈ 100T(10).

algorithm slowSums
Input: array a[0..2m − 1]
begin
array s[0..m]
for i ← 0 to m do
s[i] ← 0
for j ← 0 to m − 1 do
s[i] ← s[i] + a[i + j]
end for
end for
return s
end

Figure 1.2: Quadratic time algorithm to compute sums of an array.

The “brute-force” quadratic algorithm has two nested loops (see Figure 1.2). Let
us analyse it to find out whether it can be simplified. It is easily seen that repeated
computations in the innermost loop are unnecessary. Two successive sums s[i] and
s[i − 1] differ only by two elements: s[i] = s[i − 1] + a[i + m − 1] − a[i − 1]. Thus we need
not repeatedly add m items together after getting the very first sum s[0]. Each next
sum is formed from the current one by using only two elementary operations (ad-
dition and subtraction). Thus T (n) = c(m + 2(n − m)) = c(2n − m). In the first paren-
theses, the first term m relates to computing the first sum s[0], and the second term
2(n − m) reflects that n − m other sums are computed with only two operations per
sum. Therefore, the running time for this better organized computation is always
linear in n for each value m, either fixed or growing with n. The time for comput-
ing all the sums of the contiguous subsequences is less than twice that taken for the
single sum of all n items in Example 1.3.
The linear algorithm in Figure 1.3 excludes the innermost loop of the quadratic
algorithm. Now two simple loops, doing m and 2(n − m) elementary operations, re-
spectively, replace the previous nested loop performing m(n − m + 1) operations.

algorithm fastSums
Input: array a[0..2m − 1]
begin
array s[0..m]
s[0] ← 0
for j ← 0 to m − 1 do
s[0] ← s[0] + a[ j]
end for
for i ← 1 to m do
s[i] ← s[i − 1] + a[i + m − 1] − a[i − 1]
end for
return s;
end

Figure 1.3: Linear-time algorithm to compute sums of an array.
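A hedged Java rendering of Figures 1.2 and 1.3 (our translation, assuming the array length is 2m as in the pseudocode) makes the difference between the two loop structures explicit:

    // Quadratic version: each of the m+1 window sums is recomputed from scratch.
    static long[] slowSums(int[] a) {
        int m = a.length / 2;
        long[] s = new long[m + 1];
        for (int i = 0; i <= m; i++) {
            for (int j = 0; j < m; j++) {
                s[i] += a[i + j];
            }
        }
        return s;
    }

    // Linear version: the first window uses m additions, every later one only two.
    static long[] fastSums(int[] a) {
        int m = a.length / 2;
        long[] s = new long[m + 1];
        for (int j = 0; j < m; j++) {
            s[0] += a[j];
        }
        for (int i = 1; i <= m; i++) {
            s[i] = s[i - 1] + a[i + m - 1] - a[i - 1];
        }
        return s;
    }

Counting the additions and subtractions in each method reproduces the operation counts m(n − m + 1) and m + 2(n − m) discussed above.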

Such an outcome is typical for algorithm analysis. In many cases, a careful analy-
sis of the problem allows us to replace a straightforward “brute-force” solution with
a much more effective one. But there are no “standard” ways to reach this goal. To ex-
clude unnecessary computation, we have to perform a thorough investigation of the
problem and find hidden relationships between the input data and desired outputs.
In so doing, we should exploit all the tools we have learnt. This book presents many
examples where analysis tools are indeed useful, but knowing how to analyse and
solve each particular problem is still close to an art. The more examples and tools
are mastered, the more the art is learnt.
Exercises
Exercise 1.1.1. A quadratic algorithm with processing time T(n) = cn² uses 500 ele-
mentary operations for processing 10 data items. How many will it use for processing
1000 data items?
Exercise 1.1.2. Algorithms A and B use exactly T_A(n) = c_A n lg n and T_B(n) = c_B n² ele-
mentary operations, respectively, for a problem of size n. Find the fastest algorithm
for processing n = 2²⁰ data items if A and B spend 10 and 1 operations, respectively,
to process 2¹⁰ ≡ 1024 items.

1.2 Running time for loops and other computations


The above examples show that running time depends considerably on how deeply
the loops are nested and how the loop control variables are changing. Suppose the
control variables change linearly in n, that is, increase or decrease by constant steps.
If the number of elementary operations in the innermost loop is constant, the nested
loops result in polynomial running time T(n) = cn^k where k is the highest level of
nesting and c is some constant. The first three values of k have special names: lin-
ear time, quadratic time, and cubic time for k = 1 (a single loop), k = 2 (two nested
loops), and k = 3 (three nested loops), respectively.
When loop control variables change non-linearly, the running time also varies
non-linearly with n.

Example 1.5. An exponential change i = 1, k, k², . . . , k^(m−1) of the control variable in
the range 1 ≤ i ≤ n results in logarithmic time for a simple loop. The loop executes
m iterations such that k^(m−1) ≤ n < k^m. Thus, m − 1 ≤ log_k n < m, and T(n) = c⌈log_k n⌉.
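A small Java experiment (ours, purely illustrative) confirms this iteration count:

    // Counts the iterations of the loop i = 1, k, k^2, ... while i <= n.
    // The returned m satisfies k^(m-1) <= n < k^m, i.e. m is about log_k(n).
    static int countIterations(long n, int k) {
        int m = 0;
        for (long i = 1; i <= n; i *= k) {
            m++;     // stands for the constant work done per pass
        }
        return m;
    }

For example, countIterations(1000, 2) returns 10, since 2⁹ = 512 ≤ 1000 < 1024 = 2¹⁰.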

Additional conditions for executing inner loops only for special values of the
outer variables also decrease running time.

Example 1.6. Let us roughly estimate the running time of the following nested loops:
m←2
for j ← 1 to n do
if j = m then
m ← 2m
for i ← 1 to n do
. . . constant number of elementary operations
end for
end if
end for

The inner loop is executed k times for j = 2, 4, . . . , 2^k where k < lg n ≤ k + 1. The
total time for the elementary operations is proportional to kn, that is, T(n) = n⌊lg n⌋.
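The estimate can be checked empirically with a short Java fragment (our sketch) that simply counts how often the innermost body runs:

    // Counts executions of the innermost body of Example 1.6.
    static long countInnerOps(int n) {
        long count = 0;
        int m = 2;
        for (int j = 1; j <= n; j++) {
            if (j == m) {
                m = 2 * m;
                for (int i = 1; i <= n; i++) {
                    count++;    // stands for the constant-time body
                }
            }
        }
        return count;           // approximately n * floor(lg n)
    }

For n = 100 the inner loop fires for j = 2, 4, 8, 16, 32, 64, giving count = 600 = 100 · ⌊lg 100⌋.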

Conditional and switch operations like if {condition} then {constant running


time T1 } else {constant running time T2 } involve relative frequencies of the groups
of computations. The running time T satisfies T = ftrue T1 + (1 − ftrue )T2 < max{T1 , T2 }
where ftrue is the relative frequency of the true condition value in the if-statement.
The running time of a function or method call is T = ∑_{i=1}^{k} T_i where T_i is the run-
ning time of statement i of the function and k is the number of statements.
Exercises
Exercise 1.2.1. Is the running time quadratic or linear for the nested loops below?

m←1
for j ← 1 step j ← j + 1 until n do
if j = m then m ← m · (n − 1)
for i ← 0 step i ← i + 1 until n − 1 do
. . . constant number of elementary operations
end for
end if
end for

Exercise 1.2.2. What is the running time for the following code fragment as a func-
tion of n?
for i ← 1 step i ← 2 ∗ i while i < n do
for j ← 1 step j ← 2 ∗ j while j < n do
if j = 2 ∗ i
for k = 0 step k ← k + 1 while k < n do
. . . constant number of elementary operations
end for
else
for k ← 1 step k ← 3 ∗ k while k < n do
. . . constant number of elementary operations
end for
end if
end for
end for

1.3 “Big-Oh”, “Big-Theta”, and “Big-Omega” tools


Two simple concepts separate properties of an algorithm itself from properties
of a particular computer, operating system, programming language, and compiler
used for its implementation. The concepts, briefly outlined earlier, are as follows:

• The input data size, or the number n of individual data items in a single data
instance to be processed when solving a given problem. Obviously, how to
measure the data size depends on the problem: n means the number of items
to sort (in sorting applications), number of nodes (vertices) or arcs (edges) in
graph algorithms, number of picture elements (pixels) in image processing,
length of a character string in text processing, and so on.

• The number of elementary operations taken by a particular algorithm, or its


running time. We assume it is a function f (n) of the input data size n. The
function depends on the elementary operations chosen to build the algorithm.

The running time of a program which implements the algorithm is c f (n) where
c is a constant factor depending on a computer, language, operating system, and
compiler. Even if we don’t know the value of the factor c, we are able to answer
the important question: if the input size increases from n = n1 to n = n2 , how does
the relative running time of the program change, all other things being equal? The
answer is obvious: the running time increases by a factor of
T(n₂)/T(n₁) = c f(n₂) / (c f(n₁)) = f(n₂)/f(n₁).
As we have already seen, the approximate running time for large input sizes gives
enough information to distinguish between a good and a bad algorithm. Also, the
constant c above can rarely be determined. We need some mathematical notation to
avoid having to say “of the order of . . .” or “roughly proportional to . . .”, and to make
this intuition precise.
The standard mathematical tools “Big Oh” (O), “Big Theta” (Θ), and “Big Omega”
(Ω) do precisely this.
Note. Actually, the above letter O is a capital “omicron” (all letters in this notation
are Greek letters). However, since the Greek omicron and the English “O” are indis-
tinguishable in most fonts, we read O() as “Big Oh” rather than “Big Omicron”.
The algorithms are analysed under the following assumption: if the running time
of an algorithm as a function of n differs only by a constant factor from the running
time for another algorithm, then the two algorithms have essentially the same time
complexity. Functions that measure running time, T (n), have nonnegative values
because time is nonnegative, T (n) ≥ 0. The integer argument n (data size) is also
nonnegative.

Definition 1.7 (Big Oh). Let f (n) and g(n) be nonnegative-valued functions defined
on nonnegative integers n. Then g(n) is O( f (n)) (read “g(n) is Big Oh of f (n)”) iff there
exists a positive real constant c and a positive integer n0 such that g(n) ≤ c f (n) for all
n > n0 .

Note. We use the notation “iff ” as an abbreviation of “if and only if”.
In other words, if g(n) is O( f (n)) then an algorithm with running time g(n) runs for
large n at least as fast, to within a constant factor, as an algorithm with running time
f (n). Usually the term “asymptotically” is used in this context to describe behaviour
of functions for sufficiently large values of n. This term means that g(n) for large n
may approach closer and closer to c · f (n). Thus, O( f (n)) specifies an asymptotic
upper bound.
Note. Sometimes the “Big Oh” property is denoted g(n) = O( f (n)), but we should not
assume that the function g(n) is equal to something called “Big Oh” of f (n). This
notation really means g(n) ∈ O( f (n)), that is, g(n) is a member of the set O( f (n)) of
functions which are increasing, in essence, with the same or lesser rate as n tends to
infinity (n → ∞). In terms of graphs of these functions, g(n) is O( f (n)) iff there exists
a constant c such that the graph of g(n) is always below or at the graph of c f (n) after
a certain point, n0 .

Example 1.8. Function g(n) = 100 log₁₀ n in Figure 1.4 is O(n) because the graph of g(n)
is always below the graph of f(n) = n if n > 238, or below that of f(n) = 0.3n if n > 1000, etc.

[Figure 1.4 plots g(n) = 100 log₁₀ n together with f(n) = n and f(n) = 0.3n for 0 ≤ n ≤ 1200; the two thresholds n₀ mark where each line first stays above g(n).]
Figure 1.4: “Big Oh” property: g(n) is O(n).
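The thresholds in Example 1.8 can be found numerically; the following Java sketch (ours) scans for the last point at which g(n) = 100 log₁₀ n still exceeds c·f(n) = c·n:

    // Returns the largest n <= limit with 100*log10(n) > c*n; beyond this
    // point g(n) stays below c*f(n), so it serves as the constant n0.
    static int threshold(double c, int limit) {
        int n0 = 0;
        for (int n = 1; n <= limit; n++) {
            if (100 * Math.log10(n) > c * n) {
                n0 = n;
            }
        }
        return n0;
    }

With c = 1 the result is about 238 and with c = 0.3 about 1000, matching the values quoted in Example 1.8.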

Definition 1.9 (Big Omega). The function g(n) is Ω( f (n)) iff there exists a positive
real constant c and a positive integer n0 such that g(n) ≥ c f (n) for all n > n0 .

“Big Omega” is complementary to “Big Oh” and generalises the concept of “lower
bound” (≥) in the same way as “Big Oh” generalises the concept of “upper bound”
(≤): if g(n) is O( f (n)) then f (n) is Ω(g(n)), and vice versa.

Definition 1.10 (Big Theta). The function g(n) is Θ( f (n)) iff there exist two positive
real constants c1 and c2 and a positive integer n0 such that c1 f (n) ≤ g(n) ≤ c2 f (n) for
all n > n0 .

Whenever two functions, f (n) and g(n), are actually of the same order, g(n) is
Θ( f (n)), they are each “Big Oh” of the other: f (n) is O(g(n)) and g(n) is O( f (n)). In
other words, f (n) is both an asymptotic upper and lower bound for g(n). The “Big
Theta” property means f (n) and g(n) have asymptotically tight bounds and are in
some sense equivalent for our purposes.
In line with the above definitions, g(n) is O( f (n)) iff g(n) grows at most as fast as
f (n) to within a constant factor, g(n) is Ω( f (n)) iff g(n) grows at least as fast as f (n) to
within a constant factor, and g(n) is Θ( f (n)) iff g(n) and f (n) grow at the same rate to
within a constant factor.
“Big Oh”, “Big Theta”, and “Big Omega” notation formally capture two crucial
ideas in comparing algorithms: the exact function, g, is not very important because
it can be multiplied by any arbitrary positive constant, c, and the relative behaviour
of two functions is compared only asymptotically, for large n, but not near the origin
where it may make no sense. Of course, if the constants involved are very large, the
asymptotic behaviour loses practical interest. In most cases, however, the constants
remain fairly small.
In analysing running time, “Big Oh” g(n) ∈ O( f (n)), “Big Omega” g(n) ∈ Ω( f (n)),
and “Big Theta” g(n) ∈ Θ( f (n)) definitions are mostly used with g(n) equal to “exact”
running time on inputs of size n and f (n) equal to a rough approximation to running
time (like log n, n, n², and so on).
To prove that some function g(n) is O( f (n)), Ω( f (n)), or Θ( f (n)) using the defi-
nitions we need to find the constants c, n0 or c1 , c2 , n0 specified in Definitions 1.7,
1.9, 1.10. Sometimes the proof is given only by a chain of inequalities, starting with
f (n). In other cases it may involve more intricate techniques, such as mathemati-
cal induction. Usually the manipulations are quite simple. To prove that g(n) is not
O( f (n)), Ω( f (n)), or Θ( f (n)) we have to show the desired constants do not exist, that
is, their assumed existence leads to a contradiction.

Example 1.11. To prove that linear function g(n) = an + b; a > 0, is O(n), we form
the following chain of inequalities: g(n) ≤ an + |b| ≤ (a + |b|)n for all n ≥ 1. Thus,
Definition 1.7 with c = a + |b| and n0 = 1 shows that an + b is O(n).

“Big Oh” hides constant factors so that both 10⁻¹⁰n and 10¹⁰n are O(n). It is point-
less to write something like O(2n) or O(an + b) because this still means O(n). Also,
only the dominant terms as n → ∞ need be shown as the argument of “Big Oh”, “Big
Omega”, or “Big Theta”.

Example 1.12. The polynomial P₅(n) = a₅n⁵ + a₄n⁴ + a₃n³ + a₂n² + a₁n + a₀ ; a₅ > 0, is
O(n⁵) because P₅(n) ≤ (a₅ + |a₄| + |a₃| + |a₂| + |a₁| + |a₀|)n⁵ for all n ≥ 1.

Example 1.13. The exponential function g(n) = 2^(n+k), where k is a constant, is O(2ⁿ)
because 2^(n+k) = 2^k · 2ⁿ for all n. Generally, m^(n+k) is O(l^n); l ≥ m > 1, because
m^(n+k) ≤ l^(n+k) = l^k · l^n for any constant k.

Example 1.14. For each m > 1, the logarithmic function g(n) = log_m(n) has the same
rate of increase as lg(n) because log_m(n) = log_m(2) lg(n) for all n > 0. Therefore we may
omit the logarithm base when using the “Big-Oh” and “Big Theta” notation: log_m n is
Θ(log n).

Rules for asymptotic notation


Using the definition to prove asymptotic relationships between functions is hard
work. As in calculus, where we soon learn to use various rules (product rule, chain
rule, . . . ) rather than the definition of derivative, we can use some simple rules to
deduce new relationships from old ones.
We present rules for “Big Oh”—similar relationships hold for “Big Omega” and
“Big Theta”.
We will consider the features both informally and formally using the following
notation. Let x and y be functions of a nonnegative integer n. Then z = x + y and
z = xy denote the sum of the functions, z(n) = x(n) + y(n), and the product function:
z(n) = x(n)y(n), respectively, for every value of n. The product function (xy)(n) returns
the product of the values of the functions at n and has nothing in common with the
composition x(y(n)) of the two functions.
Basic arithmetic relationships for “Big Oh” follow from and can be easily proven
with its definition.

Lemma 1.15 (Scaling). For all constants c > 0, c f is O( f ). In particular, f is O( f ).

Proof. The relationship c f (n) ≤ c f (n) obviously holds for all n ≥ 0.

Constant factors are ignored, and only the powers and functions are taken into
account. It is this ignoring of constant factors that motivates such a notation.

Lemma 1.16 (Transitivity). If h is O(g) and g is O( f ), then h is O( f ).

Proof. See Exercise 1.3.6.

Informally, if h grows at most as quickly as g, which grows at most as quickly as f ,


then h grows at most as quickly as f .

Lemma 1.17 (Rule of sums). If g1 is O( f1 ) and g2 is O( f2 ), then g1 +g2 is O(max{ f1 , f2 }).

Proof. See Exercise 1.3.6.

If g is O( f ) and h is O( f ), then g + h is O( f ). In particular, if g is O( f ), then g + f
is O( f ). Informally, the growth rate of a sum is the growth rate of its fastest-growing
term.

Lemma 1.18 (Rule of products). If g1 is O( f1 ) and g2 is O( f2 ), then g1 g2 is O( f1 f2 ).

Proof. See Exercise 1.3.6.


In particular, if g is O( f ), then gh is O( f h). Informally, the product of upper
bounds of functions gives an upper bound for the product of the functions.
Using calculus we can obtain a nice time-saving rule.

Lemma 1.19 (Limit Rule). Suppose limn→∞ f (n)/g(n) exists (may be ∞), and denote
the limit by L. Then

• if L = 0, then f is O(g) and f is not Ω(g);

• if 0 < L < ∞ then f is Θ(g);

• if L = ∞ then f is Ω(g) and f is not O(g).

Proof. If L = 0 then from the definition of limit, in particular there is some n0 such
that f (n)/g(n) ≤ 1 for all n ≥ n0 . Thus f (n) ≤ g(n) for all such n, and f (n) is O(g(n)) by
definition. On the other hand, for each c > 0, it is not the case that f (n) ≥ cg(n) for
all n past some threshold value n1 , so that f (n) is not Ω(g(n)). The other two parts are
proved in the analogous way.

To compute the limit if it exists, the standard L’Hôpital’s rule of calculus is useful
(see Section D.5).
More specific relations follow directly from the basic ones.

Example 1.20. Higher powers of n grow more quickly than lower powers: n^k is O(n^l)
if 0 ≤ k ≤ l. This follows directly from the limit rule since n^k / n^l = n^(k−l) has limit 1 if
k = l and 0 if k < l.

Example 1.21. The growth rate of a polynomial is given by the growth rate of its
leading term (ignoring the leading coefficient by the scaling feature): if Pk (n) is a
polynomial of exact degree k then Pk (n) is Θ(nk ). This follows easily from the limit
rule as in the preceding example.

Example 1.22. Exponential functions grow more quickly than powers: n^k is O(b^n),
for all b > 1, n > 1, and k ≥ 0. The restrictions on b, n, and k merely ensure that
both functions are increasing. This result can be proved by induction or by using the
limit-L’Hôpital approach above.
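As a sketch of the limit argument for the simplest case k = 1 (in the general case one applies L’Hôpital’s rule k times):

\[
  \lim_{n\to\infty}\frac{n}{b^{n}}
    = \lim_{n\to\infty}\frac{1}{b^{n}\ln b}
    = 0,
\]

so, by the Limit Rule (Lemma 1.19), n is O(b^n) and not Ω(b^n).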

Example 1.23. Logarithmic functions grow more slowly than powers: log_b n is O(n^k)
for all b > 1, k > 0. This is the inverse of the preceding feature. Thus, as a result, log n
is O(n) and n log n is O(n²).
Exercises
Exercise 1.3.1. Prove that 10n³ − 5n + 15 is not O(n²).

Exercise 1.3.2. Prove that 10n³ − 5n + 15 is Θ(n³).

Exercise 1.3.3. Prove that 10n³ − 5n + 15 is not Ω(n⁴).

Exercise 1.3.4. Prove that f (n) is Θ(g(n)) if and only if both f (n) is O(g(n)) and f (n) is
Ω(g(n)).

Exercise 1.3.5. Using the definition, show that each function f (n) in Table 1.3 stands
in “Big-Oh” relation to the following one, that is, n is O(n log n), n log n is O(n^1.5), and
so forth.

Exercise 1.3.6. Prove Lemmas 1.16–1.18.

Exercise 1.3.7. Decide on how to reformulate the Rule of Sums (Lemma 1.17) for
“Big Omega” and “Big Theta” notation.

Exercise 1.3.8. Reformulate and prove Lemmas 1.15–1.18 for “Big Omega” notation.

1.4 Time complexity of algorithms


Definition 1.24 (Informal). A function f (n) such that the running time T (n) of a
given algorithm is Θ( f (n)) measures the time complexity of the algorithm.

An algorithm is called polynomial time if its running time T (n) is O(n^k) where k is
some fixed positive integer. A computational problem is considered intractable iff
no deterministic algorithm with polynomial time complexity exists for it. But many
problems are classed as intractable only because a polynomial solution is unknown,
and it is a very challenging task to find such a solution for one of them.

Table 1.2: Relative growth of running time T(n) when the input size increases from n = 8 to
n = 1024, provided that T(8) = 1.

  Function      Notation    n = 8    n = 32    n = 128    n = 1024     T(n)
  Constant      1           1        1         1          1            1
  Logarithmic   lg n        1        1.67      2.67       3.33         lg n / 3
  Log-squared   lg² n       1        2.78      5.44       11.1         lg² n / 9
  Linear        n           1        4         16         128          n / 8
  “n log n”     n lg n      1        6.67      37.3       427          n lg n / 24
  Quadratic     n²          1        16        256        16384        n² / 64
  Cubic         n³          1        64        4096       2097152      n³ / 512
  Exponential   2ⁿ          1        2²⁴       2¹²⁰       2¹⁰¹⁶        2ⁿ⁻⁸
Table 1.2 shows how the running time T (n) of algorithms having different time
complexity, f (n), grows relatively with the increasing input size n. Time complexity
functions are listed in order such that g is O( f ) if g is above f : for example, the linear
function n is O(n log n) and O(n2 ), etc. The asymptotic growth rate does not depend
on the base of the logarithm, but the exact numbers in the table do; we use log₂ = lg
for simplicity.

Table 1.3: The largest data sizes n that can be processed by an algorithm with time complexity
f(n), provided that T(10) = 1 minute.

                       Length of time to run an algorithm
  f(n)      1 minute   1 hour    1 day      1 week     1 year        1 decade
  n         10         600       14 400     100 800    5.26 × 10⁶    5.26 × 10⁷
  n lg n    10         250       3 997      23 100     883 895       7.64 × 10⁶
  n^1.5     10         153       1 275      4 666      65 128        302 409
  n²        10         77        379        1 003      7 249         22 932
  n³        10         39        112        216        807           1 738
  2ⁿ        10         15        20         23         29            32

Table 1.3 is even more expressive in showing how the time complexity of an algo-
rithm affects the size of problems the algorithm can solve (we again use log₂ = lg). A
linear algorithm solving a problem of size n = 10 in exactly one minute will process
about 5.26 million data items per year and 10 times more if we can wait a decade. But
an exponential algorithm with T(10) = 1 minute will deal only with 29 data items af-
ter a year of running and add only 3 more items after a decade. Suppose we have
computers 10,000 times faster (this is approximately the ratio of a week to a minute).
Then we can solve a problem 10,000 times, 100 times, or 21.5 times larger than before
if our algorithm is linear, quadratic, or cubic, respectively. But for exponential algo-
rithms, our progress is much worse: we can add only 13 more input values if T(n) is
Θ(2ⁿ).
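The entries of Table 1.3 are consistent with taking, for each time budget B (in minutes), the largest n satisfying f(n) ≤ B · f(10). A small Java sketch of that calculation (ours; the complexity function is passed as a DoubleUnaryOperator):

    import java.util.function.DoubleUnaryOperator;

    // Largest n such that f(n) <= budgetMinutes * f(10): the biggest input an
    // algorithm with T(10) = 1 minute can finish within the given budget.
    static long largestSolvable(DoubleUnaryOperator f, double budgetMinutes) {
        double limit = budgetMinutes * f.applyAsDouble(10);
        long n = 10;
        while (f.applyAsDouble(n + 1) <= limit) {
            n++;   // a simple scan is adequate for the table's small answers
        }
        return n;
    }

For instance, largestSolvable(x -> x * x, 60) returns 77 and largestSolvable(x -> x * x * x, 60) returns 39, reproducing the one-hour column for n² and n³.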
Therefore, if our algorithm has a constant, logarithmic, log-square, linear, or even
“n log n” time complexity we may be happy and start writing a program with no doubt
that it will meet at least some practical demands. Of course, before taking the plunge,
it is better to check whether the hidden constant c, giving the computation volume
per data item, is sufficiently small in our case. Unfortunately, order relations can be
drastically misleading: for instance, two linear functions 10⁻⁴n and 10¹⁰n are of the
same order O(n), but we should not claim an algorithm with the latter time complex-
ity as a big success.
Therefore, we should follow a simple rule: roughly estimate the computation vol-
ume per data item for the algorithms after comparing their time complexities in a
“Big-Oh” sense! We may estimate the computation volume simply by counting the
number of elementary operations per data item.
In any case we should be very careful even with simple quadratic or cubic algo-
rithms, and especially with exponential algorithms. If the running time is speeded
up in Table 1.3 so that it takes one second per ten data items in all the cases, then we
will still wait about 12 days (2²⁰ ≡ 1,048,576 seconds) for processing only 30 items by
the exponential algorithm. Estimate yourself whether it is practical to wait until 40
items are processed.
In practice, quadratic and cubic algorithms cannot be used if the input size ex-
ceeds tens of thousands or thousands of items, respectively, and exponential algo-
rithms should be avoided whenever possible unless we always have to process data
of very small size. Because even the most ingenious programming cannot make an
inefficient algorithm fast (we would merely change the value of the hidden constant
c slightly, but not the asymptotic order of the running time), it is better to spend more
time to search for efficient algorithms, even at the expense of a less elegant software
implementation, than to spend time writing a very elegant implementation of an
inefficient algorithm.
Worst-case and average-case performance
We have introduced asymptotic notation in order to measure the running time
of an algorithm. This is expressed in terms of elementary operations. “Big Oh”, “Big
Omega” and “Big Theta” notations allow us to state upper, lower and tight asymp-
totic bounds on running time that are independent of inputs and implementation
details. Thus we can classify algorithms by performance, and search for the “best”
algorithms for solving a particular problem.
However, we have so far neglected one important point. In general, the running
time varies not only with the size of the input, but also with the input itself. The ex-
amples in Section 1.4 were unusual in that this was not the case. But later we shall
see many examples where it does occur. For example, some sorting algorithms take
almost no time if the input is already sorted in the desired order, but much longer if
it is not.
If we wish to compare two different algorithms for the same problem, it will be
very complicated to consider their performance on all possible inputs. We need a
simple measure of running time.
The two most common measures of an algorithm are the worst-case running
time, and the average-case running time.
The worst-case running time has several advantages. If we can show, for example,
that our algorithm runs in time O(n log n) no matter what input of size n we consider,
we can be confident that even if we have an “unlucky” input given to our program,
it will not fail to run fairly quickly. For so-called “mission-critical” applications this
is an essential requirement. In addition, an upper bound on the worst-case running
time is usually fairly easy to find.
The main drawback of the worst-case running time as a measure is that it may be
too pessimistic. The real running time might be much lower than an “upper bound”,
the input data causing the worst case may be unlikely to be met in practice, and the
constants c and n0 of the asymptotic notation are unknown and may not be small.
There are many algorithms for which it is difficult to specify the worst-case input.
But even if it is known, the inputs actually encountered in practice may lead to much
lower running times. We shall see later that the most widely used fast sorting algo-
rithm, quicksort, has worst-case quadratic running time, Θ(n^2), but its running time
for “random” inputs encountered in practice is Θ(n log n).
By contrast, the average-case running time is not as easy to define. The use of
the word “average” shows us that probability is involved. We need to specify a prob-
ability distribution on the inputs. Sometimes this is not too difficult. Often we can
assume that every input of size n is equally likely, and this makes the mathematical
analysis easier. But sometimes an assumption of this sort may not reflect the inputs
encountered in practice. Even if it does, the average-case analysis may be a rather
difficult mathematical challenge requiring intricate and detailed arguments. And of
course the worst-case complexity may be very bad even if the average case complex-
ity is good, so there may be considerable risk involved in using the algorithm.
Whichever measure we adopt for a given algorithm, our goal is to show that its
running time is Θ( f ) for some function f and there is no algorithm with running
time Θ(g) for any function g that grows more slowly than f when n → ∞. In this case
our algorithm is asymptotically optimal for the given problem.
Proving that no other algorithm can be asymptotically better than ours is usually
a difficult matter: we must carefully construct a formal mathematical model of a
computer and derive a lower bound on the complexity of every algorithm to solve
the given problem. In this book we will not pursue this topic much. If our analysis
does show that an upper bound for our algorithm matches the lower one for the
problem, then we need not try to invent a faster one.
Exercises
Exercise 1.4.1. Add columns to Table 1.3 corresponding to one century (10 decades)
and one millennium (10 centuries).
Exercise 1.4.2. Add rows to Table 1.2 for algorithms with time complexity f (n) =
lg lg n and f(n) = n^2 lg n.

1.5 Basic recurrence relations


As we will see later, a great many algorithms are based on the following divide-
and-conquer principle:
• divide a large problem into smaller subproblems and recursively solve each
subproblem, then
• combine solutions of the subproblems to solve the original problem.
Running time of such algorithms is determined by a recurrence relation accounting
for the size and number of the subproblems and for the cost of splitting the problem
into them. The recurrence relation defines a function “in terms of itself”, that is, by an
expression that involves the same function. The definition is not circular provided
that the value at a natural number n is defined in terms of values at smaller natural
numbers, and the recursion terminates at some base case below which the function
is not defined.

Example 1.25 (Fibonacci numbers). These are defined by one of the most famous
recurrence relations: F(n) = F(n − 1) + F(n − 2); F(1) = 1, and F(2) = 1. The last two
equations are called the base of the recurrence or initial condition. The recurrence
relation uniquely defines the function F(n) at any number n because any particular
value of the function is easily obtained by generating all the preceding values until
the desired term is produced, for example, F(3) = F(2) + F(1) = 2; F(4) = F(3) +
F(2) = 3, and so forth. Unfortunately, to compute F(10000), we need to perform
9998 additions.
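The bottom-up evaluation just described is easy to program. Here is a small sketch in Java (our own illustration; the class and method names are not from the book) that computes F(n) with n − 2 additions, using BigInteger because Fibonacci numbers quickly outgrow machine integers.

import java.math.BigInteger;

// Bottom-up computation of Fibonacci numbers: F(1) = F(2) = 1,
// F(n) = F(n-1) + F(n-2). Computing F(n) this way needs n - 2 additions.
public class Fibonacci {
    static BigInteger fib(int n) {
        if (n <= 2) return BigInteger.ONE;
        BigInteger prev = BigInteger.ONE, curr = BigInteger.ONE;
        for (int i = 3; i <= n; i++) {
            BigInteger next = prev.add(curr); // one addition per step
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        System.out.println(fib(10));   // 55
        System.out.println(fib(100));  // 354224848179261915075
    }
}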

Example 1.26. One more recurrence relation is T (n) = 2T (n − 1) + 1 with the base
condition T (0) = 0. Here, T (1) = 2 · 0 + 1 = 1, T (2) = 2 · 1 + 1 = 3, T (3) = 2 · 3 + 1 = 7,
T (4) = 2 · 7 + 1 = 15, and so on.

Note. A recurrence relation is sometimes simply called a recurrence. In engineering


it is called a difference equation.
We will frequently meet recurrences in algorithm analysis. It is more convenient
to have an explicit expression, (or closed-form expression) for the function in or-
der to compute it quickly for any argument value n and to compare it with other
functions. The closed-form expression for T (n), that is, what is traditionally called a
“formula”, makes the growth of T (n) as a function of n more apparent. The process
of deriving the explicit expression is called “solving the recurrence relation”.
Our consideration will be restricted to only the two simplest techniques for solv-
ing recurrences: (i) guessing a solution from a sequence of values T (0), T (1), T (2),
. . . , and proving it by mathematical induction (a “bottom-up” approach) and (ii)
“telescoping” the recurrence (a “top-down” approach). Both techniques allow us to
obtain closed forms of some important recurrences that describe performance of
sort and search algorithms. For instance, in Example 1.26 we can simply guess the
closed-form expression T(n) = 2^n − 1 by inspecting the first few terms of the sequence
0, 1, 3, 7, 15 because 0 = 1−1, 1 = 2−1, 3 = 4−1, 7 = 8−1, and 15 = 16−1. But in other
cases these techniques may fail and more powerful mathematical tools beyond the
scope of this book, such as using characteristic equations and generating functions,
should be applied.
Guessing to solve a recurrence
There is no formal way to find a closed-form solution. But after we have guessed
the solution, it may be proven to be correct by mathematical induction (see Sec-
tion D.2).
Example 1.27. For the recurrence T(n) = 2T(n−1) + 1 with the base condition T(0) =
0 in Example 1.26 we guessed the closed-form relationship T(n) = 2^n − 1 by analysing
the starting terms 0, 1, 3, 7, 15. This formula is obviously true for n = 0, because
2^0 − 1 = 0. Now, by the induction hypothesis,
T(n) = 2T(n−1) + 1 = 2(2^(n−1) − 1) + 1 = 2^n − 1
and this is exactly what we need to prove.
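Generating a table of values from the recurrence, as a first step towards a guess, can itself be automated. The following sketch (ours, not from the book) produces T(1), . . . , T(20) directly from the recurrence and compares each value with the guessed closed form 2^n − 1.

// Generate T(n) = 2T(n-1) + 1, T(0) = 0, and compare with the guess 2^n - 1.
public class GuessCheck {
    public static void main(String[] args) {
        long t = 0; // T(0)
        for (int n = 1; n <= 20; n++) {
            t = 2 * t + 1;                // the recurrence
            long guess = (1L << n) - 1;   // the guessed closed form 2^n - 1
            System.out.printf("T(%d) = %d, guess = %d%n", n, t, guess);
        }
    }
}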
The Fibonacci sequence provides a sterner test for our guessing abilities.
Example 1.28. The first few terms of the sequence 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . give no
hint regarding the desired explicit form. Thus let us analyse the recurrence F(n) =
F(n−1) + F(n−2) itself. F(n) is almost doubled every time, so that F(n) < 2^n. The
simplest guess F(n) = c·2^n with c < 1 fails because for any scaling factor c it leads to
the impossible equality 2^n = 2^(n−1) + 2^(n−2), or 4 = 2 + 1. The next guess is that the base 2
of the exponential function should be smaller, φ < 2, that is, F(n) = cφ^n. The resulting
equation φ^n = φ^(n−1) + φ^(n−2) reduces to the quadratic one, φ^2 − φ − 1 = 0, with the two
roots: φ1 = 0.5(1 + √5) and φ2 = 0.5(1 − √5). Because each root solves the recurrence,
the same holds for any linear combination of them, so we know that F(n) = c1·φ1^n +
c2·φ2^n satisfies the recurrence. We choose the constants c1 and c2 to satisfy the base
conditions F(1) = 1 and F(2) = 1: F(1) = c1·φ1 + c2·φ2 = 1 and F(2) = c1·φ1^2 + c2·φ2^2 = 1.
Thus c1 = 1/√5 and c2 = −1/√5, so that

F(n) = (1/√5)·((1 + √5)/2)^n − (1/√5)·((1 − √5)/2)^n ≈ φ^n/√5

where φ = (1 + √5)/2 ≈ 1.618 is the well-known “golden ratio”. The term with (1 − √5)/2 ≈
−0.618 tends to zero when n → ∞, and so F(n) is Θ(φ^n).
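A quick numerical check of this closed form is easy to code. In the sketch below (our own, not from the book) the exact values, computed from the recurrence, coincide with φ^n/√5 rounded to the nearest integer, since the neglected second term has absolute value below 1/2.

// Compare exact Fibonacci numbers with phi^n / sqrt(5) rounded to the nearest integer.
public class BinetCheck {
    public static void main(String[] args) {
        double phi = (1 + Math.sqrt(5)) / 2;
        long fPrev = 1, fCurr = 1;   // F(1), F(2)
        for (int n = 1; n <= 40; n++) {
            long exact;
            if (n <= 2) exact = 1;
            else { long next = fPrev + fCurr; fPrev = fCurr; fCurr = next; exact = fCurr; }
            long closed = Math.round(Math.pow(phi, n) / Math.sqrt(5));
            System.out.printf("F(%d): exact = %d, rounded phi^n/sqrt(5) = %d%n",
                    n, exact, closed);
        }
    }
}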
“Telescoping” of a recurrence
This means a recursive substitution of the same implicit relationship in order to
derive the explicit relationship. Let us apply it to the same recurrence T (n) = 2T (n −
1) + 1 with the base condition T (0) = 0 as in Examples 1.26 and 1.27:
Step 0 initial recurrence: T(n) = 2T(n−1) + 1
Step 1 substitute T(n−1) = 2T(n−2) + 1, that is, replace T(n−1):
T(n) = 2(2T(n−2) + 1) + 1 = 2^2 T(n−2) + 2 + 1
Step 2 substitute T(n−2) = 2T(n−3) + 1:

T(n) = 2^2 (2T(n−3) + 1) + 2 + 1 = 2^3 T(n−3) + 2^2 + 2 + 1

Step 3 substitute T(n−3) = 2T(n−4) + 1:

T(n) = 2^3 (2T(n−4) + 1) + 2^2 + 2 + 1
     = 2^4 T(n−4) + 2^3 + 2^2 + 2 + 1

Step . . . . . .
Step n − 2 . . .

T(n) = 2^(n−1) T(1) + 2^(n−2) + . . . + 2^2 + 2 + 1

Step n − 1 substitute T(1) = 2T(0) + 1:

T(n) = 2^(n−1) (2T(0) + 1) + 2^(n−2) + . . . + 2 + 1
     = 2^n T(0) + 2^(n−1) + 2^(n−2) + . . . + 2 + 1

Because of the base condition T(0) = 0, the explicit formula is:

T(n) = 2^(n−1) + 2^(n−2) + . . . + 2 + 1 = 2^n − 1

As shown in Figure 1.5, rather than successively substitute the terms T(n−1),
T(n−2), . . . , T(2), T(1), it is more convenient to write down a sequence of the scaled
relationships for T(n), 2T(n−1), 2^2 T(n−2), . . . , 2^(n−1) T(1), respectively, then individ-
ually sum the left and right columns, and eliminate similar terms in both sums (the
terms are scaled to facilitate their direct elimination). Such a solution is called tele-
scoping because the recurrence unfolds like a telescopic tube.
Although telescoping is not a powerful technique, it returns the desired explicit
forms of most of the basic recurrences that we need in this book (see Examples 1.29–
1.32 below). But it is helpless in the case of the Fibonacci recurrence because after
proper scaling of terms and reducing similar terms in the left and right sums, tele-
scoping returns just the same initial recurrence.
Example 1.29. T (n) = T (n − 1) + n; T (0) = 1.
This relation arises when a recursive algorithm loops through the input to elimi-
nate one item and is easily solved by telescoping:

T (n) = T (n − 1) + n
T (n − 1) = T (n − 2) + (n − 1)
...
T (1) = T (0) + 1
[Figure 1.5 lays this derivation out in two columns. The left column repeatedly substitutes into the basic recurrence T(n) = 2T(n−1) + 1 (with the base condition T(0) = 0), giving T(n) = 4T(n−2) + 2 + 1, then T(n) = 8T(n−3) + 4 + 2 + 1, and so on down to T(n) = 2^n T(0) + 2^(n−1) + . . . + 4 + 2 + 1. The right column lists the scaled relationships T(n) = 2T(n−1) + 1, 2T(n−1) = 4T(n−2) + 2, 4T(n−2) = 8T(n−3) + 4, . . . , 2^(n−1) T(1) = 2^n T(0) + 2^(n−1); summing the left-side and right-side columns and cancelling the common terms yields the explicit relationship T(n) = 2^n − 1.]

Figure 1.5: Telescoping as a recursive substitution.

By summing left and right columns and eliminating the similar terms, we obtain that
T(n) = T(0) + 1 + 2 + . . . + (n−2) + (n−1) + n = 1 + n(n+1)/2, so that T(n) is Θ(n^2).

Example 1.30. T (n) = T (⌈n/2⌉) + 1; T (1) = 0.


This relation arises for a recursive algorithm that almost halves the input at each
step. Suppose first that n = 2^m. Then the recurrence telescopes as follows:

T(2^m) = T(2^(m−1)) + 1
T(2^(m−1)) = T(2^(m−2)) + 1
...
T(2^1) = T(2^0) + 1

so that T(2^m) = m, or T(n) = lg n, which is Θ(log n).


For general n, the total number of the halving steps cannot be greater than m =
⌈lg n⌉. Therefore, T (n) ≤ ⌈lg n⌉ for all n. This recurrence is usually called the repeated
halving principle.
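The repeated halving principle can be observed directly. The loop below (a sketch of ours; the class and method names are illustrative only) counts how many times ⌈n/2⌉ is applied before reaching 1 and compares the count with ⌈lg n⌉; the count never exceeds it.

// Count the halving steps of T(n) = T(ceil(n/2)) + 1 with T(1) = 0,
// and compare the count with ceil(lg n).
public class RepeatedHalving {
    static int halvingSteps(int n) {
        int steps = 0;
        while (n > 1) {
            n = (n + 1) / 2;   // ceil(n/2)
            steps++;
        }
        return steps;
    }

    static int ceilLg(int n) {         // smallest integer e with 2^e >= n
        int e = 0;
        for (long p = 1; p < n; p <<= 1) e++;
        return e;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 10, 1000, 1 << 20}) {
            System.out.printf("n = %7d: halving steps = %2d, ceil(lg n) = %2d%n",
                    n, halvingSteps(n), ceilLg(n));
        }
    }
}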
Example 1.31. Recurrence T (n) = T (⌈n/2⌉) + n; T (0) = 0.
This relation arises for a recursive algorithm that halves the input after examining
every item in the input for n ≥ 1. Under the same simplifying assumption n = 2^m, the
recurrence telescopes as follows:

T(2^m) = T(2^(m−1)) + n
T(2^(m−1)) = T(2^(m−2)) + n/2
T(2^(m−2)) = T(2^(m−3)) + n/4
...
T(2) = T(1) + 2
T(1) = T(0) + 1

so that T(n) = n + n/2 + n/4 + . . . + 1, which is Θ(n).


In the general case, the solution is also Θ(n) because each recurrence after halv-
ing an odd-size input may add to the above sum at most 1 and the number of these
extra units is at most ⌈lg n⌉.

Example 1.32. Recurrence T (n) = T (⌈n/2⌉) + T (⌊n/2⌋) + n; T (1) = 0.


This relation arises for a recursive algorithm that makes a linear pass through the
input for n ≥ 2 and splits it into two halves. Under the same simplifying assumption
n = 2^m, the recurrence telescopes as follows:

T(2^m) = 2T(2^(m−1)) + 2^m
T(2^(m−1)) = 2T(2^(m−2)) + 2^(m−1)
...
T(2) = 2T(1) + 2

so that

T(2^m)/2^m = T(2^(m−1))/2^(m−1) + 1
T(2^(m−1))/2^(m−1) = T(2^(m−2))/2^(m−2) + 1
...
T(2)/2 = T(1)/1 + 1.

Therefore, T(n)/n = T(1)/1 + m = lg n, so that T(n) = n lg n, which is Θ(n log n).
For general n, T (n) is also Θ(n log n) (see Exercise 1.5.2).

There exist very helpful parallels between the differentiation / integration in cal-
culus and recurrence analysis by telescoping.
• The difference equation T(n) − 2T(n−1) = c, rewritten as T(n) − T(n−1) = T(n−1) +
c, resembles the differential equation dT(x)/dx = T(x). Telescoping of the difference
equation results in the formula T(n) = c(2^n − 1), whereas integration of the
differential equation produces the analogous exponential one, T(x) = ce^x.

• The difference equation T(n) − T(n−1) = cn has the differential analogue dT(x)/dx =
cx, and both equations have similar solutions, T(n) = c·n(n+1)/2 and T(x) = (c/2)x^2, re-
spectively.

• Let us change variables by replacing n and x with m = lg n and y = ln x, so that
n = 2^m and x = e^y, respectively. The difference equation T(n) − T(n/2) = c, where
n = 2^m and n/2 = 2^(m−1), reduces to T(m) − T(m−1) = c. The latter has the differen-
tial analogue dT(y)/dy = c. These two equations result in the similar explicit expres-
sions T(m) = cm and T(y) = cy, respectively, so that T(n) = c lg n and T(x) = c ln x.

The parallels between difference and differential equations may help us in deriving
the desired closed-form solutions of complicated recurrences.

Exercise 1.5.1. Show that the solution in Example 1.31 is also in Ω(n) for general n.

Exercise 1.5.2. Show that the solution T (n) to Example 1.32 is no more than n lg n +
n − 1 for every n ≥ 1. Hint: try induction on n.

Exercise 1.5.3. The running time T(n) of a certain algorithm to process n data items
is given by the recurrence T(n) = kT(n/k) + cn; T(1) = 0, where k is a positive integer
constant and n/k means either ⌈n/k⌉ or ⌊n/k⌋. Derive the explicit expression for T(n)
in terms of c, n, and k, assuming n = k^m with integer m = log_k n, and find the time
complexity of this algorithm in the “Big-Oh” sense.

Exercise 1.5.4. The running time T(n) of a slightly different algorithm is given by the
recurrence T(n) = kT(n/k) + ckn; T(1) = 0. Derive the explicit expression for T(n) in
terms of c, n, and k under the same assumption n = k^m, and find the time complexity of
this algorithm in the “Big-Oh” sense.

1.6 Capabilities and limitations of algorithm analysis


We should neither overestimate nor underestimate the capabilities of algorithm
analysis. Many existing algorithms have been analysed with much more complex
techniques than used in this book and recommended for practical use on the basis
of these studies. Of course, not all algorithms are worthy of study and we should not
suppose that a rough complexity analysis will result immediately in efficient algo-
rithms. But computational complexity permits us to better evaluate basic ideas in
developing new algorithms.
To check whether our analysis is correct, we may code a program and see whether
its observed running time fits predictions. But it is very difficult to differentiate be-
tween, say, Θ(n) and Θ(n log n) algorithms using purely empirical evidence. Also,
“Big-Oh” analysis is not appropriate for small amounts of input and hides the con-
stants which may be crucial for a particular task.

Example 1.33. An algorithm A with running time TA = 2n lg n becomes less efficient
than another algorithm B having running time TB = 1000n only when 2n lg n > 1000n,
or lg n > 500, or n > 2^500 ≈ 10^150, and such amounts of input data simply cannot be
met in practice. Thus, although algorithm B is better in the “Big-Oh” sense, in
practice we should use algorithm A.
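The crossover point can be checked numerically; the sketch below (ours, using double arithmetic because 2^500 is far beyond any integer type) evaluates both running times for a few values of n.

// Compare T_A(n) = 2 n lg n with T_B(n) = 1000 n near the crossover lg n = 500.
public class Crossover {
    public static void main(String[] args) {
        double[] ns = {1e6, 1e9, 1e150, 1e151};
        for (double n : ns) {
            double lgN = Math.log(n) / Math.log(2);
            double tA = 2 * n * lgN;
            double tB = 1000 * n;
            System.out.printf("n = %.1e: T_A = %.3e, T_B = %.3e, %s wins%n",
                    n, tA, tB, (tA < tB ? "A" : "B"));
        }
    }
}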

Large constants have to be taken into account when an algorithm is very com-
plex, or when we must discriminate between cheap or expensive access to input
data items, or when there may be lack of sufficient memory for storing large data
sets, etc. But even when constants and lower-order terms are considered, the per-
formance predicted by our analysis may differ from the empirical results. Recall that
for very large inputs, even the asymptotic analysis may break down, because some
operations (like addition of large numbers) can no longer be considered as elemen-
tary.
In order to analyse algorithm performance we have used a simplified mathemat-
ical model involving elementary operations. In the past, this allowed for fairly accu-
rate analysis of the actual running time of a program implementing a given algorithm.
Unfortunately, the situation has become more complicated in recent years. Sophis-
ticated behaviour of computer hardware such as pipelining and caching means that
the time for elementary operations can vary wildly, making these models less useful
for detailed prediction. Nevertheless, the basic distinction between linear, quadratic,
cubic and exponential time is still as relevant as ever. In other words, the crude
differences captured by the Big-Oh notation give us a very good way of comparing
algorithms; comparing two linear time algorithms, for example, will require more
experimentation.
We can use worst-case and average-case analysis to obtain some meaningful es-
timates of possible algorithm performance. But we must remember that both re-
currences and asymptotic “Big-Oh”, “Big-Omega”, and “Big-Theta” notation are just
mathematical tools used to model certain aspects of algorithms. Like all models,
they are not universally valid and so the mathematical model and the real algorithm
may behave quite differently.
Exercises
Exercise 1.6.1. Algorithms A and B use TA (n) = 5n log10 n and TB (n) = 40n elementary
operations, respectively, for a problem of size n. Which algorithm has better per-
formance in the “Big Oh” sense? Work out exact conditions when each algorithm
outperforms the other.
Exercise 1.6.2. We have to choose one of two algorithms, A and B, to process a
database containing 10^9 records. The average running time of the algorithms is
TA(n) = 0.001n and TB(n) = 500√n, respectively. Which algorithm should be used,
assuming the application is such that we can tolerate the risk of an occasional long
running time?

1.7 Notes
The word algorithm relates to the surname of the great mathematician Muham-
mad ibn Musa al-Khwarizmi, whose life spanned approximately the period 780–850.
His works, translated from Arabic into Latin, for the first time exposed Europeans
to new mathematical ideas such as the Hindu positional decimal notation and step-
by-step rules for addition, subtraction, multiplication, and division of decimal num-
bers. The translation converted his surname into “Algorismus”, and the computa-
tional rules took on this name. Of course, mathematical algorithms existed well
before the term itself. For instance, Euclid’s algorithm for computing the greatest
common divisor of two positive integers was devised over 1000 years before.
The Big-Oh notation was used as long ago as 1894 by Paul Bachmann and then
Edmund Landau for use in number theory. However the other asymptotic notations
Big-Omega and Big-Theta were introduced in 1976 by Donald Knuth (at time of writ-
ing, perhaps the world’s greatest living computer scientist).
Algorithms running in Θ(n log n) time are sometimes called linearithmic, to match
“logarithmic”, “linear”, “quadratic”, etc.
The quadratic equation for φ in Example 1.28 is called the characteristic equa-
tion of the recurrence. A similar technique can be used for solving any constant
coefficient linear recurrence of the form F(n) = ∑_{k=1}^{K} a_k F(n − k), where K is a fixed
positive integer and the a_k are constants.
Chapter 2

Efficiency of Sorting

Sorting rearranges input data according to a particular linear order (see Section D.3
for definitions of order and ordering relations). The most common examples are the
usual dictionary (lexicographic) order on strings, and the usual order on integers.
Once data is sorted, many other problems become easier to solve. Some of these
include: finding an item, finding whether any duplicate items exist, finding the fre-
quency of each distinct item, finding order statistics such as the maximum, mini-
mum, median and quartiles. There are many other interesting applications of sort-
ing, and many different sorting algorithms, each with their own strengths and weak-
nesses. In this chapter we describe and analyse some popular sorting algorithms.

2.1 The problem of sorting


The problem of sorting is to rearrange an input list of keys, which can be com-
pared using a total order ≤, into an output list such that if i and j are keys and i
precedes j in the output list, then i ≤ j. Often the key is a data field in a larger object:
for example, we may wish to sort database records of customers, with the key being
their bank balance. If each object has a different key, then we can simply use the
keys as identifiers for the purpose of sorting: rather than moving large objects we
need only keep a pointer from the key to the object.
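In Java, for instance, this idea corresponds to sorting with a comparator that looks only at the key field. The record type and names below are purely illustrative (our own sketch, requiring a reasonably recent Java version), not code from the book.

import java.util.Arrays;
import java.util.Comparator;

// Sorting larger objects by a key field: the sort compares only the keys
// (here, bank balances) and never looks inside the rest of the record.
public class SortByKey {
    record Customer(String name, long balance) {}

    public static void main(String[] args) {
        Customer[] customers = {
            new Customer("Ada", 1200), new Customer("Bob", 250), new Customer("Cy", 980)
        };
        Arrays.sort(customers, Comparator.comparingLong(Customer::balance));
        System.out.println(Arrays.toString(customers));
    }
}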
There are several important attributes of sorting algorithms.

Definition 2.1. A sorting algorithm is called comparison-based if the only way to


gain information about the total order is by comparing a pair of elements at a time
via the order ≤.
A sorting algorithm is called stable if whenever two objects have the same key in
the input, they appear in the same order in the output.
A sorting algorithm is called in-place if it uses only a fixed additional amount of
working space, independent of the input size.

With a comparison-based sorting algorithm, we cannot use any information about


the keys themselves (such as the fact that they are all small integers, for example),
only their order relation. These algorithms are the most generally applicable and we
shall focus exclusively on them in this book (but see Exercise 2.7.2).
We consider only two elementary operations: a comparison of two items and
a move of an item. The running time of sorting algorithms in practice is usually
dominated by these operations. Every algorithm that we consider will make at most
a constant number of moves for each comparison, so that the asymptotic running
time in terms of elementary operations will be determined by the number of com-
parisons. However, lower order terms will depend on the exact number of moves.
Furthermore, the actual length of time taken by a data move depends on the imple-
mentation of the list. For example, moving an element from the end to the beginning
of an array takes longer than doing the same for a linked list. We shall discuss these
issues later.
The efficiency of a particular sorting algorithm may depend on many factors, for
instance:

• how many items have to be sorted;

• are the items only related by the order relation, or do they have other restric-
tions (for example, are they all integers from the range 1 to 1000);

• to what extent they are pre-sorted;

• can they be placed into an internal (fast) computer memory or must they be
sorted in external (slow) memory, such as on disk (so called external sorting ).

No one algorithm is the best for all possible situations, and so it is important to
understand the strengths and weaknesses of several algorithms.
As far as computer implementation is concerned, sorting makes sense only for
linear data structures. We will consider lists (see Section C.1 for a review of ba-
sic concepts) which have a first element (the head), a last element (the tail) and a
method of accessing the next element in constant time (an iterator). This includes
array-based lists, and singly- and doubly-linked lists. For some applications we will
need a method of accessing the previous element quickly; singly-linked lists do not
provide this. Also, array-based lists allow fast random access. The element at any
given position may be retrieved in constant time, whereas linked list structures do
not allow this.
Exercises
Exercise 2.1.1. The well-known and obvious selection sort algorithm proceeds as
follows. We split the input list into a head and tail sublist. The head (“sorted”) sublist
is initially empty, and the tail (“unsorted”) sublist is the whole list. The algorithm
successively scans through the tail sublist to find the minimum element and moves
it to the end of the head sublist. It terminates when the tail sublist becomes empty.
(Java code for an array implementation is found in Section A.1).
How many comparisons are required by selection sort in order to sort the input
list (6, 4, 2, 5, 3, 1) ?

Exercise 2.1.2. Show that selection sort uses the same number of comparisons on
every input of a fixed size. How many does it use, exactly, for an input of size n?

Exercise 2.1.3. Is selection sort comparison-based? in-place? stable?


Exercise 2.1.4. Give a linear time algorithm to find whether a sorted list contains
any duplicate elements. How would you do this if the list were not sorted?

2.2 Insertion sort


This is the method usually used by cardplayers to sort cards in their hand. Inser-
tion sort is easy to implement, stable, in-place, and works well on small lists and lists
that are close to sorted. However, it is very inefficient for large random lists.
Insertion sort is iterative and works as follows. The algorithm splits a list of size n
into a head (“sorted”) and tail (“unsorted”) sublist.

• The head sublist is initially of size 1.

• Repeat the following step until the tail sublist is empty:

◦ choose the first element x in the tail sublist;


◦ find the last element y in the head sublist not exceeding x;
◦ insert x after y in the head sublist.

Before each step i = 1, 2, . . . , n − 1, the sorted and unsorted parts have i and n −
i elements, respectively. The first element of the unsorted sublist is moved to the
correct position in the sorted sublist by exhaustive backward search, by comparing
it to each element in turn until the right place is reached.

Example 2.2. Table 2.1 shows the execution of insertion sort. Variables Ci and Mi
denote the number of comparisons and number of positions to move backward, re-
spectively, at the ith iteration. Elements in the sorted part are italicized, the currently
sorted element is underlined, and the element to sort next is boldfaced.
Table 2.1: Sample execution of insertion sort.

i Ci Mi Data to sort
25 8 2 91 70 50 20 31 15 65
1 1 1 8 25 2 91 70 50 20 31 15 65
2 2 2 2 8 25 91 70 50 20 31 15 65
3 1 0 2 8 25 91 70 50 20 31 15 65
4 2 1 2 8 25 70 91 50 20 31 15 65
5 3 2 2 8 25 50 70 91 20 31 15 65
6 5 4 2 8 20 25 50 70 91 31 15 65
7 4 3 2 8 20 25 31 50 70 91 15 65
8 7 6 2 8 15 20 25 31 50 70 91 65
9 3 2 2 8 15 20 25 31 50 65 70 91

Analysis of insertion sort


Insertion sort is easily seen to be correct (see Exercise 2.2.2 for formal proof ),
since the head sublist is always sorted, and eventually expands to include all ele-
ments.
It is not too hard to find the worst case for insertion sort: when the input con-
sists of distinct items in reverse sorted order, every element must be compared with
every element preceding it. The number of moves is also maximized by such input.
The best case for insertion sort is when the input is already sorted, when only n − 1
comparisons are needed.

Lemma 2.3. The worst-case time complexity of insertion sort is Θ(n^2).

Proof. Fill in the details yourself — see Exercise 2.2.3.

Since the best case is so much better than the worst, we might hope that on aver-
age, for random input, insertion sort would perform well. Unfortunately, this is not
true.

Lemma 2.4. The average-case time complexity of insertion sort is Θ(n^2).

Proof. We first calculate the average number Ci of comparisons at the ith step. At the
beginning of this step, i elements of the head sublist are already sorted and the next
element has to be inserted into the sorted part. This element will move backward j
steps, for some j with 0 ≤ j ≤ i. If 0 ≤ j ≤ i − 1, the number of comparisons used will
be j + 1. But if j = i (it ends up at the head of the list), there will be only i comparisons
(since no final comparison is needed).
Assuming all possible inputs are equally likely, the value of j will be equally likely
to take any value 0, . . . , i. Thus the expected number of comparisons will be
 
Ci = (1/(i+1)) · (1 + 2 + · · · + (i−1) + i + i) = (1/(i+1)) · (i(i+1)/2 + i) = i/2 + i/(i+1).

(see Section D.6 for the simplification of the sum, if necessary).


The above procedure is performed for i = 1, 2, . . . , n − 1, so that the average total
number C of comparisons is as follows:
C = ∑_{i=1}^{n−1} Ci = ∑_{i=1}^{n−1} ( i/2 + i/(i+1) ) = (1/2) ∑_{i=1}^{n−1} i + ∑_{i=1}^{n−1} i/(i+1)

The first sum is equal to (n−1)n/2. To find the second sum, let us rewrite i/(i+1) as 1 − 1/(i+1),
so that

∑_{i=1}^{n−1} i/(i+1) = ∑_{i=1}^{n−1} ( 1 − 1/(i+1) ) = n − 1 − ∑_{i=1}^{n−1} 1/(i+1) = n − ∑_{i=1}^{n} 1/i = n − Hn

where Hn denotes the n-th harmonic number: Hn ≈ ln n when n → ∞.


Therefore, C = (n−1)n/4 + n − Hn. Now the total number of data moves is at least
zero and at most the number of comparisons. Thus the total number of elementary
operations is Θ(n^2).
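This average can be checked experimentally. The sketch below (our own; it charges one comparison for every probe of the while-loop condition, matching the counting model of the proof) sorts many random arrays of size n = 100 and compares the observed average with (n−1)n/4 + n − Hn.

import java.util.Random;

// Empirical check of the average number of comparisons of insertion sort
// against the formula (n-1)n/4 + n - H_n derived above.
public class InsertionSortAverage {
    static long comparisons;

    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int k = i - 1;
            while (k >= 0) {
                comparisons++;                      // one comparison a[k] > a[k+1]
                if (a[k] <= a[k + 1]) break;
                int tmp = a[k]; a[k] = a[k + 1]; a[k + 1] = tmp;
                k--;
            }
        }
    }

    public static void main(String[] args) {
        int n = 100, trials = 10000;
        Random rnd = new Random(1);
        for (int t = 0; t < trials; t++) {
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = rnd.nextInt();
            insertionSort(a);
        }
        double observed = (double) comparisons / trials;
        double hn = 0;
        for (int i = 1; i <= n; i++) hn += 1.0 / i;
        double predicted = (n - 1.0) * n / 4 + n - hn;
        System.out.printf("observed = %.1f, predicted = %.1f%n", observed, predicted);
    }
}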

The running time of insertion sort is strongly related to inversions. The number
of inversions of a list is one measure of how far it is from being sorted.

Definition 2.5. An inversion in a list a is an ordered pair of positions (i, j) such that
i < j but a[i] > a[ j].

Example 2.6. The list (3, 2, 5) has only one inversion corresponding to the pair (3, 2),
the list (5, 2, 3) has two inversions, namely, (5, 2) and (5, 3), the list (3, 2, 5, 1) has four
inversions (3, 2), (3, 1), (2, 1), and (5, 1), and so on.
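Counting inversions directly from the definition takes one test per ordered pair of positions; the following quadratic-time sketch (ours, with illustrative names) reproduces the counts in this example.

// Count inversions by checking all pairs (i, j) with i < j.  This takes
// Theta(n^2) time, which is fine for small illustrative lists.
public class Inversions {
    static int countInversions(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) count++;
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countInversions(new int[] {3, 2, 5}));     // 1
        System.out.println(countInversions(new int[] {5, 2, 3}));     // 2
        System.out.println(countInversions(new int[] {3, 2, 5, 1}));  // 4
    }
}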

Example 2.7. Table 2.2 shows the number of inversions, Ii , for each element a[i] of
the list in Table 2.1 with respect to all preceding elements a[0], . . . , a[i − 1] (Ci and Mi
are the same as in Table 2.1).

Note that Ii = Mi in Table 2.1. This is not merely a coincidence—it is always true.
See Exercise 2.2.4.
n−1
The total number of inversions I = ∑i=1 Ii in a list to be sorted by insertion sort
is equal to the total number of positions an element moves backward during the
n−1
sort. The total number of data comparisons C = ∑i=1 Ci is also equal to the total
number of inversions plus at most n − 1. For the initial list in Tables 2.1 and 2.2,
Table 2.2: Number of inversions Ii , comparisons Ci and data moves Mi for each element a[i] in
sample list.

Index i 0 1 2 3 4 5 6 7 8 9
List element a[i] 25 8 2 91 70 50 20 31 15 65
Ii 1 2 0 1 2 4 3 6 2
Ci 1 2 1 2 3 5 4 7 3
Mi 1 2 0 1 2 4 3 6 2

I = 21, and insertion sort performs C = 28 comparisons and M = 21 data moves: in


total, 49 elementary operations.
Swapping two neighbours that are out of order removes exactly one inversion,
and a sorted list has no inversions. If an original list has I inversions, insertion sort
has to swap I pairs of neighbours. Because of Θ(n) other operations in the algorithm,
its running time is Θ(n + I). Thus, on nearly sorted lists for which I is Θ(n), insertion
sort runs in linear time. Unfortunately this type of list does not occur often, if we
choose one randomly. As we have seen above, the average number of inversions
for a randomly chosen list must be in Θ(n^2). This shows that more efficient sorting
algorithms must eliminate more than just one inversion between neighbours per
swap. One way to do this is a generalization of insertion sort called Shellsort (see
Exercise 2.2.7).
Implementation of insertion sort
The number of comparisons does not depend on how the list is implemented,
but the number of moves does. The insertion operation can be implemented in a
linked list in constant time, but in an array there is no option but to shift elements to
the right when inserting an element, taking linear time in the worst and average case.
Thus if using an array implementation of a list, we may as well move the element
backward by successive swaps. If using a linked list, we can make fewer swaps by
simply scanning backward. On the other hand, scanning backward is easy in an
array but takes more time in a singly linked list. However, none of these issues affect
the asymptotic Big-Theta running time of the algorithm, just the hidden constants
and lower order terms. The main problem with insertion sort is that it takes too
many comparisons in the worst case, no matter how cleverly we implement it.
Figure 2.1 shows basic pseudocode for arrays.
Exercises
Exercise 2.2.1. Determine the quantities Ci and Mi when insertion sort is run on the
input list (91, 70, 65, 50, 31, 25, 20, 15, 8, 2).

Exercise 2.2.2. Prove by induction that algorithm insertionSort is correct.


algorithm insertionSort
Input: array a[0..n − 1]
begin
for i ← 1 to n − 1 do
k ← i−1
while k ≥ 0 and a[k] > a[k + 1] do
swap(a, k, k + 1)
k ← k−1
end while
end for
end

Figure 2.1: Insertion sort for arrays.

Exercise 2.2.3. Prove that the worst-case time complexity of insertion sort is Θ(n^2)
and the best case is Θ(n).

Exercise 2.2.4. Prove that the number of inversions, Ii , of an element a[i] with respect
to the preceding elements, a[0], . . . , a[i − 1], in the initial list is equal to the number of
positions moved backward by a[i] in the execution of insertion sort.

Exercise 2.2.5. Suppose a sorting algorithm swaps elements a[i] and a[i + gap] of a
list a which were originally out of order. Prove that the number of inversions in the
list is reduced by at least 1 and at most 2 gap − 1.

Exercise 2.2.6. Bubble sort works as follows to sort an array. There is a sorted left
subarray and unsorted right subarray; the left subarray is initially empty. At each
iteration we step through the right subarray, comparing each pair of neighbours in
turn, and swapping them if they are out of order. At the end of each such pass, the
sorted subarray has increased in size by 1, and we repeat the entire procedure from
the beginning of the unsorted subarray. (Java code is found in Section A.1.)
Prove that the average time complexity of bubble sort is Θ(n^2), and that bubble
sort never makes fewer comparisons than insertion sort.

Exercise 2.2.7. Shellsort is a generalization of insertion sort that works as follows.


We first choose an increment sequence . . . ht > ht−1 > . . . > h1 = 1. We start with some
value of t so that ht < n. At each step we form the sublists of the input list a consisting
of elements h := ht apart (for example, the first such list has the elements at position
0, h, 2h, . . . , the next has the elements 1, 1 + h, 1 + 2h, . . . , etc). We sort each of these h
lists using insertion sort (we call this the h-sorting step). We then reduce t by 1 and
continue. Note that the last step is always a simple insertion sort.
Explain why Shellsort is not necessarily slower than insertion sort. Give an input
on which Shellsort uses fewer comparisons overall than insertion sort.
Exercise 2.2.8. Find the total numbers of comparisons and backward moves per-
formed by Shellsort on the input list (91, 70, 65, 50, 31, 25, 20, 15, 8, 2) and compare the
total number of operations with that for insertion sort in Exercise 2.2.1.

2.3 Mergesort
This algorithm exploits a recursive divide-and-conquer approach resulting in a
worst-case running time of Θ(n log n), the best asymptotic behaviour that we have
seen so far. Its best, worst, and average cases are very similar, making it a very good
choice if predictable runtime is important. Versions of mergesort are particularly
good for sorting data with slow access times, such as data that cannot be held in
internal memory or are stored in linked lists.
Mergesort is based on the following basic idea.

• If the size of the list is 0 or 1, return.

• Otherwise, separate the list into two lists of equal or nearly equal size and re-
cursively sort the first and second halves separately.

• Finally, merge the two sorted halves into one sorted list.

Clearly, almost all the work is in the merge step, which we should make as effi-
cient as possible. Obviously any merge must take at least time that is linear in the
total size of the two lists in the worst case, since every element must be looked at in
order to determine the correct ordering. We can in fact achieve a linear time merge,
as we see in the next section.
Analysis of mergesort
Lemma 2.8. Mergesort is correct.

Proof. We use induction on the size n of the list. If n = 0 or 1, the result is obviously
correct. Otherwise, mergesort calls itself recursively on two sublists each of which
has size less than n. By induction, these lists are correctly sorted. Provided that the
merge step is correct, the top level call of mergesort then returns the correct answer.

Almost all the work occurs in the merge steps, so we need to perform those effi-
ciently.

Theorem 2.9. Two input sorted lists A and B of size nA and nB , respectively, can be
merged into an output sorted list C of size nC = nA + nB in linear time.

Proof. We first show that the number of comparisons needed is linear in n. Let i,
j, and k be pointers to current positions in the lists A, B, and C, respectively. Ini-
tially, the pointers are at the first positions, i = 0, j = 0, and k = 0. Each time the
smaller of the two elements A[i] and B[ j] is copied to the current entry C[k], and the
corresponding pointers k and either i or j are incremented by 1. After one of the
input lists is exhausted, the rest of the other list is directly copied to list C. Each
comparison advances the pointer k so that the maximum number of comparisons is
nA + nB − 1.
All other operations also take linear time.

The above proof can be visualized easily if we think of the lists as piles of playing
cards placed face up. At each step, we choose the smaller of the two top cards and
move it to the temporary pile.

Example 2.10. If a = (2, 8, 25, 70, 91) and b = (15, 20, 31, 50, 65), then they are merged into
c = (2, 8, 15, 20, 25, 31, 50, 65, 70, 91) as follows.

Step 1 a[0] = 2 and b[0] = 15 are compared, 2 < 15, and 2 is copied to c, that is, c[0] ← 2,
i ← 0 + 1, and k ← 0 + 1.

Step 2 a[1] = 8 and b[0] = 15 are compared to copy 8 to c, that is, c[1] ← 8, i ← 1 + 1,
and k ← 1 + 1.

Step 3 a[2] = 25 and b[0] = 15 are compared and 15 is copied to c so that c[2] ← 15,
j ← 0 + 1, and k ← 2 + 1.

Step 4 a[2] = 25 and b[1] = 20 are compared and 20 is copied to c: c[3] ← 20, j ← 1 + 1,
and k ← 3 + 1.

Step 5 a[2] = 25 and b[2] = 31 are compared, and 25 is copied to c: c[4] ← 25, i ← 2 + 1,
and k ← 4 + 1.

The process continues as follows: comparing a[3] = 70 and b[2] = 31, a[3] = 70 and
b[3] = 50, and a[3] = 70 and b[4] = 65 results in c[5] ← (b[2] = 31), c[6] ← (b[3] = 50),
and c[7] ← (b[4] = 65), respectively. Because the list b is exhausted, the rest of the list
a is then copied to c, c[8] ← (a[3] = 70) and c[9] ← (a[4] = 91).

We can now see that the running time of mergesort is much better asymptotically
than the naive algorithms that we have previously seen.

Theorem 2.11. The running time of mergesort on an input list of size n is Θ(n log n)
in the best, worst, and average case.

Proof. The number of comparisons used by mergesort on an input of size n satisfies


a recurrence of the form T (n) = T (⌈n/2⌉) + T (⌊n/2⌋) + a(n) where 1 ≤ a(n) ≤ n − 1. It
is straightforward to show as in Example 1.32 that T (n) is Θ(n log n).
The other elementary operations in the divide and combine steps depend on the
implementation of the list, but in each case their number is Θ(n). Thus these opera-
tions satisfy a similar recurrence and do not affect the Θ(n log n) answer.
Implementation of mergesort
It is easier to implement the recursive version above for arrays than for linked
lists, since splitting an array in the middle is a constant time operation. Algorithm
merge in Figure 2.3 follows the above description. The first half of the input array, a,
from the leftmost index l to the middle index m acts as A, the second half from m + 1
to the rightmost index r as B, and a separate temporary array t as C. After merging
the halves, the temporary array is copied back to the original one, a.

algorithm mergeSort
Input: array a[0..n − 1]; array indices l, r; array t[0..n − 1]
sorts the subarray a[l..r]
begin
if l < r then
m ← ⌊(l + r)/2⌋
mergeSort(a, l, m,t)
mergeSort(a, m + 1, r,t)
merge(a, l, m + 1, r,t)
end if
end

Figure 2.2: Recursive mergesort for arrays

It is easy to see that the recursive version simply divides the list until it reaches
lists of size 1, then merges these repeatedly. We can eliminate the recursion in a
straightforward manner. We first merge lists of size 1 into lists of size 2, then lists of
size 2 into lists of size 4, and so on. This is often called straight mergesort .

Example 2.12. Starting with the input list (1, 5, 7, 3, 6, 4, 2) we merge repeatedly. The
merged sublists are shown with parentheses.

Step 0: (1)(5)(7)(3)(6)(4)(2) Step 2: (1,3,5,7)(2,4,6)


Step 1: (1,5)(3,7)(4, 6)(2) Step 3: (1,2,3,4,5,6,7)

This method works particularly well for linked lists, because the merge steps can
be implemented simply by redefining pointers, without using the extra space re-
quired when using arrays (see Exercise 2.3.4).
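For arrays, straight mergesort can be sketched as follows (our own illustration, with a merge routine in the spirit of Figure 2.3; the class and method names are not the book's): it merges runs of width 1, 2, 4, . . . until the whole array is one sorted run.

import java.util.Arrays;

// Straight (bottom-up) mergesort for arrays: repeatedly merge sorted runs
// of width 1, 2, 4, ... into runs of twice the width.
public class StraightMergeSort {
    static void mergeSort(int[] a) {
        int n = a.length;
        int[] t = new int[n];
        for (int width = 1; width < n; width *= 2) {
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(a, lo, mid, hi, t);
            }
        }
    }

    // Merge the sorted runs a[lo..mid-1] and a[mid..hi-1] using buffer t.
    static void merge(int[] a, int lo, int mid, int hi, int[] t) {
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) t[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        while (i < mid) t[k++] = a[i++];
        while (j < hi) t[k++] = a[j++];
        System.arraycopy(t, lo, a, lo, hi - lo);
    }

    public static void main(String[] args) {
        int[] a = {1, 5, 7, 3, 6, 4, 2};
        mergeSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5, 6, 7]
    }
}

On the input of Example 2.12 this produces exactly the runs shown there: width 1 gives (1,5)(3,7)(4,6)(2), width 2 gives (1,3,5,7)(2,4,6), and width 4 gives the sorted list.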
Exercises
Exercise 2.3.1. What is the minimum number of comparisons needed when merg-
ing two nonempty sorted lists of total size n into a single list?

Exercise 2.3.2. Give two sorted lists of size 8 whose merging requires the maximum
number of comparisons.
algorithm merge
Input: array a[0..n − 1]; array indices l, r; array index s; array t[0..n − 1]
merges the two sorted subarrays a[l..s − 1] and a[s..r] into a[l..r]
begin
i ← l; j ← s; k ← l
while i ≤ s − 1 and j ≤ r do
if a[i] ≤ a[ j] then t[k] ← a[i]; k ← k + 1; i ← i + 1
else t[k] ← a[ j]; k ← k + 1; j ← j + 1
end if
end while
while i ≤ s − 1 do copy the rest of the first half
t[k] ← a[i]; k ← k + 1; i ← i + 1
end while
while j ≤ r do copy the rest of the second half
t[k] ← a[ j]; k ← k + 1; j ← j + 1
end while
return a ← t
end

Figure 2.3: Linear time merge for arrays

Exercise 2.3.3. The 2-way merge in this section can be generalized easily to a k-
way merge for any positive integer k. The running time of such a merge is c(k − 1)n.
Assuming that the running time of insertion sort is cn^2 with the same scaling factor
c, analyse the asymptotic running time of the following sorting algorithm (you may
assume that n is an exact power of k).

• Split an initial list of size n into k sublists of size n/k each.


• Sort each sublist separately by insertion sort.
• Merge the sorted sublists into a final sorted list.

Find the optimum value of k to get the fastest sort and compare its worst/average
case asymptotic running time with that of insertion sort and mergesort.

Exercise 2.3.4. Explain how to merge two sorted linked lists in linear time into a
bigger sorted linked list, using only a constant amount of extra space.

2.4 Quicksort
This algorithm is also based on the divide-and-conquer paradigm. Unlike merge-
sort, quicksort dynamically forms subarrays depending on the input, rather than
sorting and merging predetermined subarrays. Almost all the work of mergesort was
in the combining of solutions to subproblems, whereas with quicksort, almost all the
work is in the division into subproblems.
Quicksort is very fast in practice on “random” data and is widely used in software
libraries. Unfortunately it is not suitable for mission-critical applications, because it
has very bad worst case behaviour, and that behaviour can sometimes be triggered
more often than an analysis based on random input would suggest.
Basic quicksort is recursive and consists of the following four steps.

• If the size of the list is 0 or 1, return the list. Otherwise:

• Choose one of the items in the list as a pivot .


• Next, partition the remaining items into two disjoint sublists: reorder the list
by placing all items greater than the pivot to follow it, and all elements less than
the pivot to precede it.

• Finally, return the result of quicksort of the “head” sublist, followed by the
pivot, followed by the result of quicksort of the “tail” sublist.

The first step takes into account that recursive dynamic partitioning may pro-
duce empty or single-item sublists. The choice of a pivot at the next step is most
critical because the wrong choice may lead to quadratic time complexity while a
good choice of pivot equalizes both sublists in size (and leads to “n log n” time com-
plexity). Note that we must specify in any implementation what to do with items
equal to the pivot. The third step is where the main work of the algorithm is done,
and obviously we need to specify exactly how to achieve the partitioning step (we do
this below). The final step involves two recursive calls to the same algorithm, with
smaller input.
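As a point of reference, here is one common way to realize these four steps in Java (our own sketch, using a simple Lomuto-style partition around the last element; the pivot selection and partitioning methods developed below may differ). Items equal to the pivot are placed in the "head" sublist here.

import java.util.Arrays;

// Quicksort with a simple (Lomuto-style) partition around the last element.
public class QuickSortSketch {
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                 // size 0 or 1: already sorted
        int p = partition(a, lo, hi);         // pivot ends up at index p
        quickSort(a, lo, p - 1);              // sort the "head" sublist
        quickSort(a, p + 1, hi);              // sort the "tail" sublist
    }

    // Reorder a[lo..hi] so that elements <= pivot precede it and elements
    // > pivot follow it; return the pivot's final position.  This uses
    // exactly hi - lo comparisons with the pivot.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] <= pivot) { swap(a, i, j); i++; }
        }
        swap(a, i, hi);
        return i;
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] a = {25, 8, 2, 91, 70, 50, 20, 31, 15, 65};
        quickSort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));
    }
}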
Analysis of quicksort
All analysis depends on assumptions about the pivot selection and partitioning
methods used. In particular, in order to partition a list about a pivot element as
described above, we must compare each element of the list to the pivot, so at least
n − 1 comparisons are required. This is the right order: it turns out that there are
several methods for partitioning that use Θ(n) comparisons (we shall see some of
them below).

Lemma 2.13. Quicksort is correct.

Proof. We use mathematical induction on the size of the list. If the size is 1, the al-
gorithm is clearly correct. Suppose then that n ≥ 2 and the algorithm works correctly
on lists of size smaller than n. Suppose that a is a list of size n, p is the pivot ele-
ment, and i is the position of p after partitioning. Due to the partitioning principle
of quicksort, all elements of the head sublist are no greater than p, and all elements
Other documents randomly have
different content
jamais plus volontiers qu'au milieu des dangers, et qui ne
l'abandonna pas en prison. Quoi qu'on en ait dit, il était plein de
cœur. Il aimait ses amis; il n'en a jamais trahi un seul. Il en exigeait
beaucoup, mais il leur donnait beaucoup. Il prodiguait leur sang,
comme le sien, sur les champs de bataille; mais il les poussait et
demandait pour eux encore plus que pour lui. Un autre, après
Rocroy, eût été jaloux de Gassion, qu'on voulait faire passer pour
avoir conseillé la manœuvre qui décida du sort de la journée[200]; lui,
du champ de bataille, demanda pour Gassion le bâton de maréchal
de France, et la charge de maréchal de camp pour Sirot qui, à la tête
de la réserve, avait achevé la victoire. Lorsqu'au combat de la rue
Saint-Antoine, échappé à grand'peine du carnage, harassé de
fatigue, défait, couvert de sang, il arriva l'épée encore à la main chez
Mademoiselle, son premier cri fut, avec un torrent de larmes: «Ah!
Madame, vous voyez un homme qui a perdu tous ses amis!» A
Bruxelles, quand il négocia sa rentrée en France, il mit dans les
conditions de son traité tous ceux qui l'avaient suivi. Après cela il
était prince, et se permettait tout en paroles. Il a fait des vers très
spirituels, mais satiriques et quelque peu soldatesques[201]. Il aima
une fois et à l'espagnole, selon toutes les règles de l'hôtel de
Rambouillet. Tout à l'heure, nous ferons connaître l'objet de cette
passion touchante qui honore à jamais le grand Condé; mais nous
pouvons dire d'avance que l'héroïne était digne du héros.
Représentez-vous ces deux jeunes gens à l'hôtel de Rambouillet.
Condé s'y amusait beaucoup et riait très volontiers avec Voiture et
les beaux-esprits à sa suite; mais son homme était particulièrement
Corneille. Celui-ci qui était pauvre, sans nul ordre, et avait toujours
besoin d'argent, s'est plaint à Segrais, Normand comme lui, que le
prince de Condé qui professait tant d'admiration pour ses ouvrages,
ne lui avait jamais fait de grandes largesses[202]. Mais quelle pension,
je vous prie, eût valu Condé assistant à la première représentation
de Cinna et laissant éclater ses sanglots à ces incomparables vers:

Soyons amis, Cinna, c'est moi qui t'en convie, etc.


Disons aussi en passant que ce même Condé, qui était admirateur
enthousiaste de Corneille, devint l'ami de Bossuet, et défendit
toujours Molière. Il avait pu voir Bossuet presque enfant commencer
sa carrière de prédicateur à l'hôtel de Rambouillet; il avait assisté, il
avait pensé prendre part aux luttes brillantes de son doctorat; sur la
fin de sa vie il recherchait sa conversation, et il a trouvé en lui
l'historien, non-seulement le plus éloquent, mais le plus exact, le
peintre le plus fidèle de Rocroy, surtout le plus digne interprète de ce
grand cœur, principe immortel du bien et du beau en tout genre.

Mlle de Bourbon devint bien vite un des plus brillants ornements de


l'hôtel de Rambouillet. Elle y rencontra la marquise de Sablé, belle
encore, célèbre par son admiration pour les mœurs espagnoles et
par ses amours avec Montmorency. Mme de Sablé guida les premiers
pas de sa jeune amie, la suivit avec un intérêt fidèle dans toutes les
vicissitudes de sa carrière, et vingt-cinq ans après elles se
retrouvèrent ensemble à ce commun rendez-vous des nobles cœurs
désabusés, la religion. Mais Mlle de Bourbon était alors au matin de
la vie, et, sans songer aux orages qui l'attendaient, échappée des
Carmélites elle s'abandonnait à tous les plaisirs qui venaient au-
devant d'elle.
Comme son frère, elle admirait Corneille; mais elle avait un goût
particulier pour Voiture, et ce goût-là ne la quitta jamais. Elle pensa,
elle parla toujours de Voiture comme Mme de Sévigné. Et ce n'est
pas seulement l'agrément de son esprit qui lui plaisait, elle était
touchée sans doute de la sensibilité que nous y avons relevée, et qui
met pour nous Voiture au-dessus de tous ses rivaux. Dans la
fameuse querelle des deux sonnets sur Job et sur Uranie, qui
partagèrent la cour et la ville, les salons et l'Académie, quand tout le
monde était pour Benserade, Mme de Longueville, alors l'arbitre du
goût et de la suprême élégance, prit en main la cause de Voiture et
ramena l'opinion. On a fait un volume sur cette querelle: elle n'est
pas épuisée, et nous la reprendrons plus tard à l'aide de pièces
nouvelles qui, en faisant connaître pour la première fois les motifs de
Mme de Longueville, nous révéleront la délicatesse de son esprit, qui
tenait à celle de son cœur[203].

Mlle de Bourbon fit aussi connaissance à l'hôtel de Rambouillet avec


Chapelain, instruit, modéré, discret, ami sincère de la bonne
littérature, et qui eût pu devenir un écrivain du troisième, peut-être
même du second ordre, ainsi que son ami Pélisson, si, comme le
disait Boileau dont tous les traits d'esprit sont de sérieux jugements,
il se fût contenté d'écrire en prose[204]. Mlle de Bourbon prit de
l'estime pour Chapelain, et, quand elle fut mariée, elle lui fit donner
une assez forte pension par M. de Longueville, pour travailler avec
sécurité à cette fameuse Pucelle qui devait être l'Iliade de la France,
qu'on applaudissait d'avance dans le cénacle de la rue Saint-Thomas-
du-Louvre, et à laquelle la jeune admiratrice de Corneille et de
Voiture avait déjà le bon goût de s'ennuyer.
Parmi les beaux esprits médiocres qu'elle rencontra dans l'illustre
hôtel, était Godeau, petit abbé qu'on appelait dans la maison le nain
de Julie, et qui, devenu évêque de Grasse et de Vence, a entretenu
un commerce de lettres moitié dévotes, moitié galantes, tour à tour
avec Mlle de Bourbon et avec Mme de Longueville[205]. Il y avait aussi
Jacques Esprit, de l'Académie Française, qui joua toute sorte de
rôles: d'abord homme de lettres et commensal du chancelier Séguier
qui le mit à l'Académie, puis tout à coup prêtre de l'Oratoire, puis
redevenu homme du monde et père de famille, qui ne devait pas
être sans mérite, car il eut de son temps l'estime de fort bons juges;
attaché plus tard à l'ambassade de Münster, un des pensionnaires de
M. et de Mme de Longueville, précepteur de leurs neveux, les petits
princes de Conti, tenant une assez grande place dans le salon de
Mme de Sablé, consulté par La Rochefoucauld, passant même pour
un des auteurs des Maximes, et qui aurait gardé peut-être cette
réputation, si l'on n'avait eu l'imprudence d'en imprimer un ouvrage
en 1678[206].

Nous nous ferions scrupule d'oublier à l'hôtel de Rambouillet Mlle de


Scudéry. C'était[207] une personne d'un noble cœur et d'un talent
véritable, écrivant trop vite peut-être et un peu longuement, mais
avec une correction et une politesse qui n'étaient pas communes.
Elle jouissait d'une grande considération et la méritait. Leibnitz a
recherché l'honneur de sa correspondance. Elle faisait des vers fort
goûtés de son temps, et qui nous paraissent encore très agréables.
Ses romans sont si longs, et les épisodes s'y embarrassent tellement
les uns dans les autres, qu'il est impossible de les lire en entier
aujourd'hui; mais ceux qui oseront s'engager dans ce labyrinthe y
rencontreront çà et là des portraits bien faits et très ressemblants,
quoiqu'un peu flattés, d'originaux illustres, à peine déguisés sous des
noms grecs, persans et romains; d'exactes descriptions des plus
beaux lieux et des plus magnifiques palais de France et de Paris,
transportés à Rome ou en Arménie; les grands sentiments alors à la
mode, des tendresses d'un platonisme alambiqué, des conversations
quelquefois un peu fades et toujours très raffinées, mais qui donnent
une bien agréable idée de celles que Mlle de Scudéry tâchait d'imiter.
Un jour, Mme de La Fayette abrégera ces peintures et ces discours,
elle ôtera ces fadeurs et ces langueurs, elle adoucira ces subtilités;
mais elle gardera le charme de ces mœurs héroïques et galantes, et
les esprits délicats qui aujourd'hui encore font leurs délices de Zaïde
et de la Princesse de Clèves, de la Bérénice de Racine, de la Psyché
de Molière et de Corneille, ne liront pas sans plaisir certains
chapitres du Grand Cyrus et de la Clélie. Georges Scudéry lui-même,
insupportable par son amour-propre et son style de matamore, était
un homme d'honneur, très sûr en amitié, et qui, dans les moments
les plus difficiles, devant Mazarin, dont il dépendait, garda
hautement sa fidélité à Condé et à sa sœur[208].
Nous avons dû citer ces divers personnages, parce qu'ils reparaîtront
dans la vie de Mme de Longueville. Dès l'hôtel de Rambouillet, ils
s'attachèrent à Mlle de Bourbon et commencèrent sa réputation, qui
grandit rapidement d'année en année.

Mlle de Bourbon passait tous les hivers à Paris, à l'hôtel de Condé, au


Louvre, au palais Cardinal, dans quelques hôtels de la Place Royale,
surtout à l'hôtel de Rambouillet, parmi les bals, les concerts, les
comédies, les conversations galantes, et partout elle brillait par les
grâces de son esprit et de sa personne. L'été, d'autres plaisirs: elle
allait à Fontainebleau avec la cour, ou chez sa mère, à Chantilly, ou à
Ruel, chez le cardinal de Richelieu et la duchesse d'Aiguillon, ou bien
à Liancourt, chez la duchesse de Liancourt, Jeanne de Schomberg,
ou bien encore à La Barre, près Paris, chez la baronne Du Vigean,
d'une naissance moins relevée, mais d'une très grande fortune, qui
avait la plus aimable famille, deux fils qui furent tour à tour les
camarades du duc d'Enghien, et deux filles recherchées par tout ce
qu'il y avait de grands seigneurs jeunes et galants. Avant comme
après son mariage, Mlle de Bourbon se partageait entre ces diverses
résidences, qui rivalisaient entre elles de magnificence et
d'agrément. Naturellement, c'était auprès de sa mère, à Chantilly,
qu'elle était le plus souvent.
Il faut voir dans Du Cerceau[209] et dans Perelle[210] ce qu'était
Chantilly au commencement et à la fin du XVIIe siècle. Ce vaste et
beau domaine était depuis longtemps aux Montmorency, et il vint
aux Condé par Mme la Princesse, grâce surtout aux victoires du duc
d'Enghien[211]. Il rassemble donc les souvenirs des deux plus grandes
familles militaires de l'ancienne France. Le connétable Anne et Louis
de Bourbon y sont partout, et ces deux ombres couvriront et
protégeront à jamais Chantilly, tant qu'il restera parmi nous quelque
piété patriotique, quelque orgueil national. Les Montmorency ont
transmis aux Condé le charmant château, un peu antérieur à la
renaissance, que Du Cerceau a fait connaître dans tous ses détails.
C'est le grand Condé, dans les dernières années de sa vie, qui,
trouvant alentour les plus beaux bois, une vraie forêt, avec un grand
canal semblable à une rivière, des eaux abondantes et de vastes
jardins, en a tiré les merveilles que le burin de Perelle nous a
conservées, et que Bossuet n'a pu s'empêcher de louer, ces
fontaines, ces cascades, ces grottes, ces pavillons, «ces superbes
allées, ces jets d'eau qui ne se taisaient ni jour ni nuit[212].» Ils se
taisent aujourd'hui. Le mauvais goût du XVIIIe siècle et les
révolutions ont dégradé Chantilly. Un prince digne de son nom avait
entrepris de le rendre à sa beauté première. Il y voulait mettre toute
la fortune que les malheurs de la maison de Condé lui avaient
apportée, et celle qu'il tenait de sa propre maison. Le jeune
capitaine avait rêvé de revenir un jour, après avoir étendu et assuré
la domination française en Afrique, se reposer avec ses lieutenants
dans la demeure sacrée des Montmorency et des Condé, restaurée
et embellie de ses mains. La Providence en a disposé autrement, et
Chantilly attend encore une main réparatrice. Mais revenons au
Chantilly du XVIIe siècle avant l'époque de sa plus grande
magnificence, entre la description de Du Cerceau et celle de Perelle.

C'était déjà un délicieux séjour. Mme la Princesse s'y plaisait
beaucoup, et y passait avec ses enfants presque tous les étés. Elle
emmenait avec elle une petite cour composée des amis de son fils et
des amies de sa fille, avec quelques beaux esprits, et
particulièrement Voiture, dont on ne pouvait se passer. A défaut de
Voiture on avait sa monnaie, Montreuil ou Sarasin, attachés à la
maison de Condé, et successivement secrétaires du prince de Conti.
Ils avaient l'esprit fin et agréable, et Boileau, dans sa lettre à
Perrault, nomme Sarasin après Voiture[213]. M. le Prince, peu sensible
aux douceurs de la campagne, restait ordinairement à Paris pour y
suivre ses affaires. Mme la Princesse ne haïssait pas les
divertissements, et la jeunesse s'y livrait avec ardeur. On faisait la
cour aux dames. Pendant la chaleur du jour, on s'amusait à lire des
romans ou des poésies; le soir on faisait de longues promenades
avec de longues conversations. On vivait à la manière de l'Astrée, en
attendant les aventures du grand Cyrus. Même en 1650, pendant la
captivité des princes et l'exil de Mme de Longueville, parmi les
troubles de la guerre civile et le bruit des armes, Lenet nous raconte
comment on passait le temps à Chantilly[214]: «Les promenades
étoient les plus agréables du monde... Les soirées n'étoient pas
moins divertissantes. On se retiroit dans l'appartement de la
Princesse, où l'on jouoit à divers jeux. Il y avoit souvent de belles
voix, et surtout des conversations agréables, et des récits d'intrigues
de cour ou de galanterie, qui faisoient passer la vie avec autant de
douceur qu'il étoit possible... Ces divertissements étoient troublés
par les mauvaises nouvelles qu'on apportoit ou qu'on écrivoit. C'étoit
un plaisir très grand de voir toutes ces jeunes dames tristes ou
gaies, suivant les visites rares ou fréquentes qui leur venoient, et
suivant la nature des lettres qu'elles recevoient; et, comme on savoit
à peu près les affaires des unes et des autres, il étoit aisé d'y entrer
assez avant pour s'en divertir. On voyoit à tous moments arriver des
visites et des messages qui donnoient de grandes jalousies à celles
qui n'en recevoient point, et tout cela nous attiroit des chansons, des
sonnets et des élégies qui ne divertissoient pas moins les indifférents
que les intéressés. On faisoit des bouts-rimés et des énigmes qui
occupoient le temps aux heures perdues. On voyoit les unes et les
autres se promener sur le bord des étangs, dans les allées du jardin
ou du parc, sur la terrasse ou sur la pelouse, seules ou en troupe,
suivant l'humeur où elles étoient, pendant que d'autres chantoient
un air ou récitoient des vers, ou lisoient des romans sur un balcon,
ou en se promenant ou couchées sur l'herbe. Jamais on n'a vu un si
beau lieu, dans une si belle saison, rempli de meilleure ni de plus
aimable compagnie.»
Mais avant 1650, avant la Fronde, qui divisa toute la société
française, Chantilly était un séjour bien plus agréable encore. Jugez-
en par cette lettre que Sarasin écrivait de Chantilly, au
commencement de 1648, à Mlle de Rambouillet, devenue Mme de
Montausier, qui venait de partir avec son mari pour leur
gouvernement de Saintonge et d'Angoumois[215]:
«Ni tout ce qu'on a dit de l'heureuse contrée
Où messire Honoré[216] fit adorer Astrée,
Ni tout ce qu'on a feint des superbes beautés
De ces grands palais enchantés
Où l'amoureuse Armide et l'amoureuse Alcine
Emprisonnèrent leurs blondins,
Ni les inventions de ces plaisants jardins
Que, malgré Falcrine,
Détruisit le plus fier de tous les Paladins:
Tout cela, quoi qu'en veuillent dire
Les gens qui nous en ont conté,
Est moins beau que le lieu d'où je vous ai daté,
Et d'où je prétends vous écrire
En stile de roman la pure vérité.

«Le bruit que le zéphyr excite parmi les feuilles des bocages quand
la nuit va couvrir la terre agitoit doucement la forêt de Chantilly,
lorsque, dans la grande route, trois nymphes apparurent au solitaire
Tircis. Elles n'étoient pas de ces pauvres nymphes des bois, plus
dignes de pitié que d'envie, qui, pour logis et pour habit, n'ont que
l'écorce des arbres. Leur équipage étoit superbe et leurs vêtements
brillants... La plus âgée, par la majesté de son visage, imprimoit un
profond respect à ceux qui l'approchoient. Celle qui se trouvoit à
côté faisoit éclater une beauté plus accomplie que la peinture, la
sculpture ni la poésie n'en ont pu jamais imaginer. La troisième avoit
cet air aisé et facile que l'on donne aux Grâces.

Aux deux côtés alloient deux demi-dieux,
L'un d'un air doux et l'autre audacieux;
L'un, comme un vrai foudre de guerre,
Par Mars n'étoit pas égalé;
L'autre avecque raison pouvoit être appelé
Les délices de la terre.
C'est-à-dire, Madame, que hier au soir, entre chien et loup, je
rencontrai dans la grande route de Chantilly Mme la Princesse, qui s'y
promenoit, et qui n'eut jamais tant de santé, accompagnée de Mme
de Longueville, qui n'eut jamais tant de beauté, et de Mme de Saint-
Loup[217], qui n'eut jamais tant de gaieté, toutes trois en déshabillé
et en calèche, suivies des princes de Condé et de Conty... Mme la
Princesse m'ayant aperçu m'appela et me dit: «Sarasin, je veux que
vous alliez tout à cette heure écrire à Mme de Montausier que jamais
Chantilly n'a été plus beau, que jamais on n'y a mieux passé le
temps, qu'on ne l'y a jamais davantage souhaitée, et qu'elle se
mocque d'être en Saintonge pendant que nous sommes ici:
Mandez-lui ce que nous faisons,
Mandez-lui ce que nous disons.
J'obéis comme on me commande,
Et voici que je vous le mande.
Quand l'Aurore sortant des portes d'Orient,
Fait voir aux Indiens son visage riant,
Que des petits oiseaux les troupes éveillées
Renouvellent leurs chants sous les vertes feuillées,
Que partout le travail commence avec effort,
A Chantilly l'on dort.
Aussi, lorsque la nuit étend ses sombres voiles,
Que la lune, brillant au milieu des étoiles,
D'une heure pour le moins a passé la minuit,
Que le calme a chassé le bruit,
Que dans tout l'univers tout le monde sommeille,
A Chantilly l'on veille.
Entre ces deux extrémités,
Que nous passons bien notre vie,
Et que la maison de Silvie[218]
A d'aimables diversités!
......................
Ici nous avons la musique
De luths, de violons et de voix;
Nous goûtons les plaisirs des bois,
Et des chiens et du cor et du veneur qui pique.
Tantôt à cheval nous volons,
Et brusquement nous enfilons
La bague au bout de la carrière;
Nous combattons à la barrière;
Nous faisons de jolis tournois, etc.
.........................
Conterai-je dans cet écrit
Les plaisirs innocents que goûte notre esprit?
Dirai-je qu'Ablancourt[219], Calprenède[220] et Corneille[221],
C'est-à-dire vulgairement
Les vers, l'histoire, le romant,
Nous divertissent à merveille,
Et que nos entretiens n'ont rien que de charmant? etc.»

Imaginez par là ce que devait être Chantilly quelques années
auparavant, quand au lieu de la guerre civile, une paix florissante ou
de glorieuses victoires remplissaient tous les cœurs d'allégresse. Le
duc d'Enghien n'y était jamais qu'entouré des jeunes gentilshommes
qui combattaient avec lui à Rocroy, à Fribourg, à Nortlingen, à
Dunkerque, et partageaient ses plaisirs comme ses dangers.
C'étaient le duc de Nemours, tué si vite, et dont le frère, héritier de
son titre, de sa beauté et de sa bravoure, périt aussi dans un duel
affreux au milieu de la Fronde; Coligny, mort également à la fleur de
l'âge dans un duel d'un tout autre caractère; son frère Dandelot,
depuis le duc de Châtillon, un des héros de Lens, qui promettait un
grand homme de guerre et périt à l'attaque de Charenton dans la
première Fronde; Guy de Laval, le fils de la marquise de Sablé, beau,
brave et spirituel, qui se distingua et fut tué au siége de
Dunkerque[222]; La Moussaye, son aide de camp et son principal
officier dans toutes les batailles, mort jeune encore à Stenay en
1650; Chabot, qui épousa la belle et riche héritière des Rohan;
Pisani, le fils de la marquise de Rambouillet, mort aussi l'épée à la
main; les deux Du Vigean, Nangis, Tavannes, tant d'autres parmi
lesquels croissait le jeune Montmorency Bouteville, depuis le duc
maréchal de Luxembourg; toute cette école de Condé différente de
celle de Turenne, à qui le duc d'Enghien souffla de bonne heure son
génie, le coup d'œil qui saisit d'abord le point stratégique d'une
affaire, avec l'audace et l'opiniâtreté dans l'exécution: école
admirable qui commence à Rocroy et d'où sont sortis douze
maréchaux, sans compter tous ces lieutenants généraux qui,
jusqu'au bout du siècle, ont soutenu l'honneur de la France. Telle
était la jeunesse qui s'amusait à Chantilly, et préludait à la gloire par
la galanterie.

On se doute bien que Mlle de Bourbon n'avait pas plus mal choisi que
son frère. Elle s'était liée avec la marquise de Sablé, qui devint l'amie
de toute sa vie; mais, beaucoup plus jeune qu'elle, elle avait des
compagnes sinon plus chères, au moins plus familières: elle s'était
formé une société intime, particulièrement composée de Mlle de
Rambouillet, de Mlles Du Vigean, et de ses deux cousines, Mlles de
Bouteville. Il faut convenir que c'était là un nid de beautés
attrayantes et redoutables, encore unies dans leur gracieuse
adolescence, mais destinées à se séparer bientôt et à devenir rivales
ou ennemies.
Voiture, on le conçoit, prenait grand soin de ces belles demoiselles,
et surtout de Mlle de Bourbon: il la célébrait en vers et en prose, sur
tous les tons et en toute occasion. Même dans ses lettres écrites à
d'autres, il ne tarit pas sur son esprit et sa beauté: «L'esprit de Mlle
de Bourbon, dit-il, peut seul faire douter si sa beauté est la plus
parfaite chose du monde.» Lui aussi, c'est toujours à un ange qu'il
se plaît à la comparer:

De perles, d'astres et de fleurs,
Bourbon, le ciel fit tes couleurs,
Et mit dedans tout ce mélange
L'esprit d'un ange!

Ailleurs:

L'on jugeroit par la blancheur
De Bourbon, et par sa fraîcheur,
Qu'elle a pris naissance des lis, etc.

C'est à elle encore qu'il adresse cette agréable chanson, destinée
sans doute à être chantée à demi-voix dans un bosquet de Chantilly,
devant Mlle de Bourbon endormie:
Notre Aurore merveille
Sommeille;
Qu'on se taise alentour,
Et qu'on ne la réveille
Que pour donner le jour[223]!

Ces dames s'attardaient-elles un peu trop à la campagne quand
Voiture n'y était pas avec elles, il les rappelait à Paris dans des
complaintes burlesquement sentimentales[224].

Mais on ne passait pas tout l'été à Chantilly. Mme la Princesse
possédait dans le voisinage plusieurs autres terres, Merlou ou Mello,
la Versine, Méru, l'Isle-Adam, où elle allait assez fréquemment. Il
fallait bien aussi visiter M. le Cardinal et Mme d'Aiguillon dans leur
belle résidence d'été à Ruel, sur les bords de la Seine, entre Saint-
Germain et Paris[225]. On trouvait là des plaisirs tout différents de
ceux de Chantilly. L'art régnait à Ruel. Il y avait un théâtre comme à
Paris, où le Cardinal faisait représenter des pièces à machines avec
des appareils nouveaux apportés d'Italie. Il donnait de grands ballets
mythologiques comme ceux du Louvre et des fêtes d'une
magnificence presque royale; tandis qu'à Chantilly, bien plus éloigné
de Paris, il y avait sans doute de la grandeur et de l'opulence, mais
une grandeur pleine de calme et une opulence qui mettait surtout à
son service les beautés de la nature. Ruel était tout aussi animé que
le Palais Cardinal. Richelieu y travaillait avec ses ministres; il y
recevait la cour, la France, l'Europe. Les affaires y étaient mêlées aux
divertissements. La duchesse d'Aiguillon était digne de son oncle,
ambitieuse et prudente, dévouée à celui auquel elle devait tout,
partageant ses soucis comme sa fortune, et gouvernant
admirablement sa maison. Elle était encore assez jeune, d'une
beauté régulière, et on ne lui avait pas donné d'intrigue galante. La
calomnie ou la médisance s'était portée sur ses relations avec
Richelieu et même avec Mme Du Vigean. Elle avait plus de sens que
d'esprit, et elle n'était pas le moins du monde précieuse, quoiqu'elle
fréquentât l'hôtel de Rambouillet. Mme la Princesse n'aimait pas
Richelieu: elle ne lui pardonnait pas le sang de son frère
Montmorency, que toutes ses prières et ses larmes n'avaient pu
sauver; mais elle se laissait conduire à la politique de son mari. Il
fallut bien qu'elle donnât les mains au mariage du duc d'Enghien
avec Mlle de Brézé, et elle était sans cesse avec ses enfants au Palais
Cardinal et à Ruel. Elle y était reçue comme elle devait l'être, et les
poëtes de M. le Cardinal célébraient à l'envi la mère et la fille.
Richelieu, comme on le sait, avait cinq poëtes qui tenaient de lui
pension pour travailler à son théâtre: Bois-Robert, Colletet, L'Étoile,
Corneille et Rotrou. On les appelait les cinq auteurs, et ils ont fait en
commun plusieurs pièces, l'Aveugle de Smyrne, la Comédie des
Tuileries, etc. Cela n'empêchait pas qu'il n'y eût auprès de Son
Éminence d'autres poëtes encore: Georges Scudéry, Voiture lui-
même, qui tout attaché qu'il était au duc d'Orléans, faisait aussi sa
cour à Richelieu et célébrait la duchesse d'Aiguillon. C'est à Ruel
qu'un peu plus tard, rencontrant dans une allée la reine Anne et
interpellé par elle de lui faire quelques vers à l'instant même, Voiture
improvisa cette petite pièce, remarquable surtout par la facilité et
l'audace, où il ne craignit pas de lui parler de Buckingham. Mais les
deux favoris du Cardinal étaient Desmarets et Bois-Robert: il les
avait mis dans les affaires, et employait leur plume en toute
occasion, dans le genre léger comme dans le genre sérieux. Il paraît
que Desmarets avait été chargé de faire les honneurs poétiques de
Ruel à Mme la Princesse et à sa fille. On trouve en effet dans le
recueil, aujourd'hui assez rare et fort peu lu, des œuvres du
conseiller du roi et contrôleur des guerres Desmarets, dédiées à
Richelieu et imprimées avec luxe[226], une foule de vers assez
agréables qui se chantaient dans les ballets mythologiques de Ruel,
et dont plusieurs sont adressés à Mlle de Bourbon et à Mme la
Princesse. Dans une Mascarade des Grâces et des Amours
s'adressant à Mme la duchesse d'Aiguillon en présence de Mme la
Princesse et de Mlle de Bourbon, les Grâces disent à celle-ci:
Merveilleuse beauté, race de tant de rois,
Princesse, dont l'éclat fait honte aux immortelles,
Nous ne pensions être que trois,
Et nous trouvons en vous mille grâces nouvelles.

Ce ne sont là que des fadeurs banales, tandis que les deux petites
pièces suivantes ont au moins l'avantage de décrire la personne de
Mlle de Bourbon telle qu'elle était alors, avant son mariage, quelques
années après le portrait de Du Cayer. On y voit Mlle de Bourbon
commençant à tenir les promesses de son adolescence, et
l'angélique visage, que nous a montré rapidement Mme de Motteville,
déjà accompagné des autres attraits de la véritable beauté:
POUR MADEMOISELLE DE BOURBON.

Jeune beauté, merveille incomparable,
Gloire de la cour,
Dont le beau teint et la grâce adorable
Donnent tant d'amour;
Ah! quel espoir de captiver ton âme,
Puisque la flamme
Des plus grands dieux
Ne peut pas mériter un seul trait de tes yeux, etc.

POUR LA MÊME.
Beau teint de lis sur qui la rose éclate,
Attraits doux et perçans
Qui nous charment les sens,
Beaux cheveux blonds, belle bouche incarnate;
Rare beauté, peut-on n'admirer pas
Vos aimables appas?

Sein, qui rendez tant de raisons malades,
Monts de neige et de feux,
Où volent tant de vœux,
Sur qui l'Amour dresse ses embuscades;
Rare beauté, etc.

Grave douceur, taille riche et légère,
Ris qui nous fait mourir
De joie et de désir,
D'où naît l'espoir que ta vertu modère;
Rare beauté, etc.

A quelques lieues de Chantilly était la belle terre de Liancourt, dont
Jeanne de Schomberg, d'abord duchesse de Brissac, puis duchesse
de Liancourt, avait fait un séjour magnifique. C'était une personne
du plus grand mérite, belle, pieuse, fort instruite, qui même a laissé
un écrit remarquable[227] destiné à l'éducation de sa petite-fille. Elle
se complaisait et s'entendait dans les arrangements de maison et
dans les bâtiments somptueux. Elle acheta, rue de Seine, l'ancien
hôtel de Bouillon, et fit élever à sa place l'hôtel de Liancourt, depuis
nommé l'hôtel de La Rochefoucauld, qui s'étendait de la rue de
Seine à la rue des Augustins, dans l'emplacement aujourd'hui occupé
par la rue des Beaux-Arts. «A Liancourt, dit Tallemant[228], la
duchesse avoit fait tout ce qu'on peut pour des allées et des prairies.
Tous les ans elle y ajoutait quelque nouvelle beauté.» Mme la
Princesse allait souvent en visite dans ce charmant voisinage. Une
année que la petite vérole faisait de grands ravages tout autour de
Chantilly et dans les différents domaines de la princesse, Merlou, La
Versine, Méru, elle envoya ses enfants avec toute leur jeune société
passer quelque temps à Liancourt. Il n'y manquait que Mlles Du
Vigean, que leur mère avait rappelées à Paris. Le fils unique de la
maison, La Roche-Guyon, était un des amis du duc d'Enghien; il fut
tué en 1646, en servant sous lui au siége si meurtrier de Mardyk. On
était en automne. Le jour de la Toussaint, ces demoiselles firent
leurs dévotions avec l'exactitude accoutumée. Ensuite on se livra à
d'honnêtes divertissements, et, faute de mieux, dans ces longs
loisirs de la campagne, avec le goût dominant du bel esprit, on se
mit à rimer tant bien que mal, en sorte que le jour de la Toussaint
même on adressa à Merlou, où était Mme la Princesse, la Vie et les
Miracles de sainte Marguerite Charlotte de Montmorency, princesse
de Condé, mis en vers à Liancourt. Ces vers, dit le manuscrit auquel
nous empruntons ces détails[229], furent faits sur-le-champ, et les
auteurs paraissent avoir été Mlle de Bourbon et Mlles de Rambouillet,
de Bouteville et de Brienne.
Il nous reste à prier une sainte vivante,
Une sainte charmante, etc.

..........................
Sitôt qu'elle nacquit, ses beaux yeux sans pareils
Parurent deux soleils;
Son teint fut fait de lis, et sur ses lèvres closes
On vit naître des roses;
Puis elle les ouvrit et fit voir en riant
Des perles d'Orient.
Elle faisoit mourir par un regard aimable
Autant que redoutable;
Puis d'un autre soudain que la sainte jetoit,
Elle ressuscitoit, etc.

On ne pouvait oublier les deux aimables absentes, Mlles Du Vigean,
qui s'ennuyaient à Paris pendant qu'on s'amusait sans elles à
Liancourt. On leur écrivit donc une assez longue lettre en vers, où on
leur dépeignait et le regret de ne pas les voir et les consolations
qu'on se donnait. Ces vers sont tout aussi médiocres que les
précédents, mais il ne faut pas oublier que ce sont des impromptus
de jeunes filles et de grandes dames.
LETTRE[230] DE Mlle DE BOURBON ET DE Mlles DE RAMBOUILLET, DE
BOUTEVILLE ET DE BRIENNE, ENVOYÉE DE LIANCOURT A Mlles DU
VIGEAN, A PARIS.
Quatre nymphes, plus vagabondes
Que celles des bois et des ondes,
A deux qui d'un cœur attristé
Maudissent leur captivité.

Nous qui prétendions en tous lieux
Être incessamment admirées,
Et que, par un trait de nos yeux,
Nous serions partout adorées, etc.
..................

Tout notre empire a disparu;
Tout nous fuit ou nous fait la mine;
A peine étions-nous à Méru,
Qu'il fallut fuir à La Versine.
..................

Là, cette peste des beautés,
Là, cette mort des plus doux charmes,
Pour rabattre nos vanités,
Nous donna de rudes alarmes.

Au bruit de ce mal dangereux,
Chacun fuit et trousse bagage;
Car adieu tous les amoureux
Si nos beautés faisaient naufrage.

Pour sauver les traits de l'amour
En lieu digne de son empire,
Nous arrivons à Liancour,
Où règne Flore avec Zéphyre,
Où cent promenoirs étendus,
Cent fontaines et cent cascades,
Cent prez, cent canaux épandus,
Sont les doux plaisirs des nayades.
Nous pensions dans un si beau lieu
Faire une assez longue demeure;
Mais voici venir Richelieu[231],
Il en faut partir tout à l'heure.

Voilà celles que les mourants[232]
Nommoient les astres de la France;
Mais ce sont des astres errants
Et qui n'ont guère de puissance.

Ce qu'il y a de plus curieux et de plus inattendu, c'est que la manie
de rimer gagna Condé lui-même. Comme nous l'avons dit, il avait
beaucoup d'esprit et de gaieté, et il faisait très volontiers la partie
des beaux esprits qui l'entouraient. Au milieu de la Fronde, quand la
guerre se faisait aussi avec des chansons, il en avait fait plus d'une
marquée au coin de son humeur libre et moqueuse. Dans la
première guerre de Paris, où Condé, fidèle encore aux vrais intérêts
de sa maison, tenait pour la cour, un des chefs les plus ardents du
parti contraire était le comte de Maure, cadet du duc de Mortemart,
oncle de Mme de Montespan, de Mme de Thianges et de l'aimable et
docte abbesse de Fontevrauld, le mari de la spirituelle Anne Doni
d'Attichy, l'intime amie de Mme de Sablé[233]. Le comte opinait
toujours, dans les conseils de la Fronde, pour les résolutions les plus
téméraires. Les Mazarins le tournaient en ridicule et l'accablaient
d'une grêle d'épigrammes. On avait fait contre lui des triolets très
célèbres dans le temps[234]. Condé, à ce qu'assure Tallemant[235],
ajouta le couplet suivant:
C'est un tigre affamé de sang
Que ce brave comte de Maure.
Quand il combat au premier rang,
C'est un tigre affamé de sang.
Mais il n'y combat pas souvent;
C'est pourquoi Condé vit encore.
C'est un tigre affamé de sang
Que ce brave comte de Maure.

Il comptait parmi ses meilleurs lieutenants le comte de Marsin, le
père du maréchal, bien supérieur à son fils, et qui était un véritable
homme de guerre. Condé en faisait le plus grand cas; mais il ne
l'épargnait pas pour cela. Un jour à table, en buvant à sa santé, il
improvisa sur un air alors fort à la mode cette petite chanson[236], qui
n'a jamais été publiée, et qui nous semble jolie et piquante:

Je bois à toi, mon cher Marsin.
Je crois que Mars est ton cousin,
Et Bellone est ta mère.
Je ne dis rien du père,
Car il est incertain.
Tin, tin, trelin, tin, tin, tin.

Enfin tout le monde connaît la chanson sur le comte d'Harcourt
lorsque celui-ci, en 1650, se chargea d'escorter Condé, Conti et
Longueville de Marcoussis au Havre:
Cet homme gros et court,
Si fameux dans l'histoire;
Ce grand comte d'Harcourt,
Tout rayonnant de gloire,
Qui secourut Cazal et qui reprit Turin,
Est devenu recors de Jules Mazarin.

A Liancourt, n'ayant rien à faire, et impatienté de voir sa sœur et ses
belles amies rester si longtemps à l'église le jour de la Toussaint, il
leur décocha cette épigramme[237]:

Donnez-en à garder à d'autres,
Dites cent fois vos patenôtres,
Et marmottez en ce saint jour.
Nous vous estimons trop habiles;
Pour ouïr des propos d'amour
Vous quitteriez bientôt vigiles.

Il avait eu pendant quelque temps avec lui à Liancourt, entre autres
amis, le marquis de Roussillon, excellent officier et homme d'esprit,
dont il est plus d'une fois question dans les lettres de Voiture, et
l'intrépide La Moussaye, qui lui fut fidèle jusqu'au dernier soupir, et
pendant la captivité de Condé alla s'enfermer avec Mme de
Longueville dans la citadelle de Stenay où il mourut. Roussillon et La
Moussaye ayant été forcés de quitter Liancourt pour s'en aller à
Lyon, Condé, comme pour imiter la lettre de sa sœur à Mlles Du
Vigean, en écrivit ou en fit écrire une du même genre à ses deux
amis absents. Nous donnons cette pièce presque entière, parce
qu'elle est de Condé, ou que du moins Condé y a mis la main,
surtout parce qu'elle peint au naturel la vie qu'on menait alors à
Liancourt, à Chantilly et dans toutes les grandes demeures de cette
aristocratie du XVIIe siècle, si mal appréciée, qui, pendant la paix,
honorait et cultivait les arts de l'esprit, qui donna aux lettres La
Rochefoucauld, Retz, Saint-Evremond, Bussi, Saint-Simon, sans
parler de Mme de Sévigné et de Mme de La Fayette, et qui, la guerre
venue, s'élançait sur les champs de bataille et prodiguait son sang
pour le service de la France. Voici les vers du futur vainqueur de
Rocroy.
LETTRE[238] POUR MONSEIGNEUR LE DUC d'ANGUIEN, ÉCRITE DE LIANCOURT A
MM. DE ROUSSILLON ET DE LA MOUSSAYE, A LYON.
Depuis votre départ nous goûtons cent délices
Dans nos doux exercices;
Même pour exprimer nos passe-temps divers,
Nous composons des vers.

Dans un lieu, le plus beau qui soit en tout le monde,
Où tout plaisir abonde,
Où la nature et l'art, étalant leurs beautés,
Font nos félicités;

Une troupe sans pair de jeunes demoiselles,
Vertueuses et belles,
A pour son entretien cent jeunes damoiseaux
Sages, adroits et beaux.

Chacun fait à l'envi briller sa gentillesse,
Sa grâce et son adresse,
Et force son esprit pour plaire à la beauté
Dont il est arrêté.

On leur dit sa langueur dedans les promenades,
A l'entour des cascades,
Et l'on s'estime heureux du seul contentement
De dire son tourment.

Douze des plus galants, dont les voix sont hardies,
Disent des comédies
Sur un riche théâtre, en habits somptueux,
D'un ton majestueux.

On donne tous les soirs de belles sérénades,
On fait des mascarades;
Mais surtout a paru parmi nos passe-temps
Le Ballet du Printemps.

.........................
Les dames bien souvent, aux plus belles journées,
Montent des haquenées;
On vole la perdrix ou l'on chasse le lou
En allant à Merlou.

Les amants à côté leur disent à l'oreille:
O divine merveille!
Laissez les animaux, puisque vos yeux vainqueurs
Prennent assez de cœurs.

.........................

Voilà nos passe-temps, voilà nos exercices,
Nos jeux et nos délices.
Pensiez-vous que d'ici vous eussiez emporté
Notre félicité?

S'écrire en vers était devenu l'amusement de toute cette jeune et
aimable société. En 1640, quand le duc d'Enghien, n'ayant pas vingt
ans, était à Dijon exerçant déjà les fonctions de gouverneur de la
province, on lui adressait de la rue Saint-Thomas-du-Louvre des
épîtres bien ou mal rimées pour lui donner des nouvelles des
intrigues galantes de Paris, et lui en demander de celles qui se
passaient en Bourgogne[239].

«Or, sachez, Monseigneur, que chacun vous renonce,
Si, ce paquet reçu, vous ne faites réponce,
Et si vous n'exprimez avecque de beaux vers
Des dames de Dijon les entretiens divers.
Adieu, vivez content avecque ces galantes.
Nous vous sommes, Seigneur, serviteurs et servantes.
Écrit trois mois avant juillet
Dedans l'hôtel de Rambouillet.»
Et le jeune duc répondait en vers, souvent très mauvais, même pour
des vers de prince, mais qu'on trouvait fort bons à l'hôtel de Condé
et à l'hôtel de Rambouillet[240], parce qu'ils étaient toujours spirituels
et sans aucune prétention. Il faut convenir au moins que de tels
divertissements, dans une jeunesse d'un si haut rang, montraient
quel cas on faisait alors de l'esprit, et nous transportent dans un
monde bien différent du nôtre.
Un sentiment bien naturel nous porte à rechercher quelle a été la
destinée de cette cour de jeunes et braves gentilshommes, de gaies
et charmantes jeunes filles, qui entouraient alors Mlle de Bourbon et
son frère. Nous avons dit celle des hommes: tous se sont illustrés à
la guerre; la plupart sont morts au champ d'honneur. Mais que sont-
elles devenues leurs aimables compagnes, cet essaim de jeunes
beautés que nous avons suivies sur les pas de Mlle de Bourbon à
Chantilly, à Ruel, à Liancourt, ces cinq inséparables amies dont nous
avons publié des vers moins gracieux que leur figure, Mlle de
Rambouillet, Mlle de Brienne, Mlle de Montmorency Bouteville, Mlles
Du Vigean? Elles ont eu les fortunes les plus dissemblables, que
nous allons rapidement indiquer.
Marie Antoinette de Loménie, fille du comte Loménie de Brienne, un
des ministres de Louis XIII et de la reine Anne, épousa, en 1642, le
marquis de Gamaches, qui devint lieutenant général. On peut voir
son portrait tracé par elle-même dans les Divers Portraits de
Mademoiselle, avec ceux de son père et de sa mère. Elle n'a point
fait de bruit; toute sa vie s'est écoulée honnête et pieuse. Elle est
morte à l'âge de quatre-vingts ans, en 1704. Elle a constamment
entretenu avec Mme de Longueville le commerce le plus amical.
C'était la moins brillante des cinq amies: elle a été la plus heureuse.

On sait ce que devint Mlle de Rambouillet[241]. Du plus rare esprit et
d'un agrément infini, mais un peu ambitieuse, après avoir épousé
Montausier en 1645, elle rechercha, ainsi que son mari, les faveurs
de la cour, et elle les obtint en en payant la rançon. Il est assez triste
d'avoir commencé par être, dans sa jeunesse, si sévère à ses