Sorting and Searching
With 87 Figures
Springer-Verlag
Berlin Heidelberg New York Tokyo 1984
Editors
Prof. Dr. Wilfried Brauer
FB Informatik der Universität Hamburg
Rothenbaum-Chaussee 67-69, 2000 Hamburg 13, Germany
Prof. Dr. Grzegorz Rozenberg
Institute of Applied Mathematics and Computer Science
University of Leiden, Wassenaarseweg 80, P. O. Box 9512
2300 RA Leiden, The Netherlands
Prof. Dr. Arto Salomaa
Department of Mathematics, University of Turku
20500 Turku 50, Finland
Author
Prof. Dr. Kurt Mehlhorn
FB 10, Angewandte Mathematik und Informatik
Universität des Saarlandes, 6600 Saarbrücken, Germany
This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting,
reproduction by photocopying machine or similar means, and storage in data banks. Under
§ 54 of the German Copyright Law where copies are made for other than private use a fee is
payable to "Verwertungsgesellschaft Wort", Munich.
© Springer-Verlag Berlin Heidelberg 1984
Softcover reprint of the hardcover 1st edition 1984
The use of registered names, trademarks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
For Ena, Uli, Steffi, and Tim
Preface
The design and analysis of data structures and efficient algorithms has
gained considerable importance in recent years. The concept of
"algorithm" is central in computer science, and "efficiency" is central
in the world of money.
I have organized the material in three volumes and nine chapters.
Vol. 1: Sorting and Searching (chapters I to III)
Vol. 2: Graph Algorithms and NP-Completeness (chapters IV to VI)
Vol. 3: Multi-dimensional Searching and Computational Geometry (chapters VII and VIII)
Volumes 2 and 3 have volume 1 as a common basis but are independent
of each other. Most of volumes 2 and 3 can be understood
without knowing volume 1 in detail. A general knowledge of algorithmic
principles as laid out in chapter I or in many other books on
algorithms and data structures suffices for most parts of volumes 2 and
3. The specific prerequisites for volumes 2 and 3 are listed in the
prefaces to these volumes. In all three volumes we present and analyse
many important efficient algorithms for the fundamental computa-
tional problems in the area. Efficiency is measured by the running
time on a realistic model of a computing machine which we present in
chapter I. Most of the algorithms presented are very recent inven-
tions; after all computer science is a very young field. There are hardly
any theorems in this book which are older than 20 years and at least
fifty percent of the material is younger than 10 years. Moreover, I
have always tried to lead the reader all the way to current research.
We did not just want to present a collection of algorithms; rather we
also wanted to develop the fundamental principles underlying effi-
cient algorithms and their analysis. Therefore, the algorithms are
usually developed starting from a high-level idea and are not pre-
sented as detailed programs. I have always attempted to uncover the
principle underlying the solution. At various occasions several solu-
tions to the same problem are presented and the merits of the various
solutions are compared (e.g. II.1.5, III.7 and V.4). Also, the main
algorithmic paradigms are collected in chapter IX, and an orthogonal
view of the book is developed there. This allows the reader to see
where various paradigms are used to develop algorithms and data
structures. Chapter IX is included in all three volumes.
Developing an efficient algorithm also means asking for the optimum.
Techniques for proving lower bounds are dealt with in sections II.1.6,
we compare the data structures introduced in the first six sections. The
last section covers data structures for subsets of a small universe where
direct access using arrays becomes feasible. In the course of chapter
III we also deal with important algorithmic paradigms, e.g. dynamic
programming, balancing, and amortization.
There are no special prerequisites for volume 1. However, a certain
mathematical maturity and previous exposure to programming and
computer science in general is required. The Vordiplom (= examination
at the end of the second year within the German university
system) in Computer Science certainly suffices.
I. Foundations  1
II. Sorting  40
2. Hashing  118
2.1 Hashing with Chaining  118
2.2 Hashing with Open Addressing  124
2.3 Perfect Hashing  127
2.4 Universal Hashing  139
2.5 Extendible Hashing  145
3. Searching Ordered Sets  151
3.1 Binary Search and Search Trees  152
3.2 Interpolation Search  155
4. Weighted Trees  158
4.1 Optimum Weighted Trees, Dynamic Programming, and Pattern Matching  159
4.2 Nearly Optimal Binary Search Trees  177
5. Balanced Trees  187
5.1 Weight-Balanced Trees  189
5.2 Height-Balanced Trees  199
5.3 Advanced Topics on (a,b)-Trees  213
5.3.1 Mergeable Priority Queues  213
5.3.2 Amortized Rebalancing Cost and Sorting Presorted Files  217
5.3.3 Finger Trees  228
5.3.4 Fringe Analysis  241
6. Dynamic Weighted Trees  249
6.1 Self-Organizing Data Structures and Their Amortized and Average Case Analysis  252
6.1.1 Self-Organizing Linear Lists  256
6.1.2 Splay Trees  263
6.2 D-trees  275
6.3 An Application to Multidimensional Searching  284
7. A Comparison of Search Structures  286
8. Subsets of a Small Universe  289
8.1 The Boolean Array (Bitvector)  289
8.2 The O(log log N) Priority Queue  290
8.3 The Union-Find Problem  298
9. Exercises  305
10. Bibliographic Notes  315
VI. NP-Completeness
1. Turing Machines and Random Access Machines
2. Problems, Languages and Optimization Problems
3. Reductions and NP-complete Problems
4. The Satisfiability Problem is NP-complete
5. More NP-complete Problems
6. Solving NP-complete Problems
6.1 Dynamic Programming
6.2 Branch and Bound
7. Approximation Algorithms
7.1 Approximation Algorithms for the Traveling Salesman Problem
7.2 Approximation Schemes
7.3 Full Approximation Schemes
8. The Landscape of Complexity Classes
9. Exercises
10. Bibliographic Notes
We can now formulate one goal of this book: determine T_A(n) for important
algorithms A. More generally, develop methods for determining
T_A(n). Unfortunately, this goal is beyond our reach for many algorithms
at the moment. We have to confine ourselves to determining upper and
lower bounds for T_A(n), i.e. to asymptotic analysis. A typical claim
will be: T(n) is bounded above by some quadratic function. We write
T(n) = O(n²) and mean that T(n) ≤ c·n² for some c > 0, n₀ and all n ≥ n₀.
Or we claim that T(n) grows at least as fast as n log n. We write
T(n) = Ω(n log n) and mean that there are constants c > 0 and n₀ such
that T(n) ≥ c·n log n for all n ≥ n₀. We come back to this notation in
section I.6.
We can also compare two algorithms A1 and A2 for the same problem. We
say that A1 is faster than A2 if T_A1(n) ≤ T_A2(n) for all n, and that
A1 is asymptotically faster than A2 if lim_{n→∞} T_A1(n)/T_A2(n) = 0. Of
Assume first that we buy a machine which is ten times as fast as the
present one, or alternatively that we are willing to spend 10 hours of
computing time. Then the maximal solvable problem size goes up to 36000
(13500, 1900, 25) for algorithm A (B, C, D). We infer from this example
that buying a faster machine hardly helps if we use a very inefficient
algorithm (algorithm D) and that switching to a faster algorithm has a
more drastic effect on the maximally solvable problem size. More generally,
we infer from this example that asymptotic analysis is a useful concept
and that special considerations are required for small instances (cf.
sections II.1.5 and V.4).
We will next define run time and storage space in precise terms. To do
so we have to introduce a machine model. We want this machine model to
abstract the most important features of existing computers, so as to
make our analysis meaningful for every-day computing, and to be
simple enough to make analysis possible.
The machine has an accumulator α and three index registers γ1, γ2, γ3;
the data store consists of locations 0, 1, 2, 3, ... .
Jump instructions
   goto k,                       k ∈ ℕ₀
   if reg π 0 then goto k,       k ∈ ℕ₀

Arithmetic instructions
   α ← α op π,                   1 ≤ j ≤ 3, i ∈ ℕ₀
   0: γ1 ← p(0)                   1
   1: α ← 1                       1
   2: if γ1 = 0 then goto 6       n + 1
   3: α ← α × 2                   n
   4: γ1 ← γ1 − 1                 n
   5: goto 2                      n
   6: p(1) ← α                    1

Here L(n) denotes the length of the binary representation of n, i.e.

   L(n) = 1                   if n = 0
   L(n) = ⌊log |n|⌋ + 1       otherwise
and this explains the name logarithmic cost measure. The logarithmic
cost measure has to be used if the numbers involved do not fit into
single storage locations anymore. It is used exclusively in chapter VI.
In the following table we use m to denote the number moved in load and
store instructions, and m1 and m2 to denote the numbers operated on in
an arithmetic instruction. The meaning of all other quantities is im-
mediate from the instruction format.
cost measure, and it depends on the data in the logarithmic cost meas-
ure.
We thus have total cost 4n + 6 under the unit cost measure and Θ(n²) under
the logarithmic cost measure (compare I.6. for a definition of Θ).
Note that a cost of Σ_{i=0}^{n-1} (1 + L(2^i) + L(2)) = Θ(n²) arises in line 3. Note
however, that one can improve the cost of computing 2^n to Θ(log n) under
the unit cost measure and Θ(n) under the logarithmic cost measure
(cf. exercise 1). □
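The Θ(log n) bound mentioned above rests on repeated squaring. A minimal
sketch in Python (the book itself works with RAM code and an Algol-like
notation; this fragment is only illustrative):

   def power_of_two(n: int) -> int:
       """Compute 2**n with Theta(log n) multiplications by repeated squaring."""
       result, base = 1, 2
       while n > 0:
           if n & 1:            # current binary digit of the exponent is 1
               result *= base
           base *= base         # square the base
           n >>= 1              # move to the next binary digit
       return result

   assert power_of_two(10) == 1024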
We infer from this example that the cost of a program can differ dras-
tically under the two measures. It is therefore important to always
check whether the unit cost measure can be reasonably used. This will
be the case in chapters II to V, VII, and VIII.
Analogously, we use unit and logarithmic cost measure for storage space
also. In the unit cost measure we count the number of storage locations
and registers which are used in the computation and forget about the
actual contents, in the logarithmic cost measure we sum the lengths of
the binary representations of the contents of registers and storage
locations and maximize over time.
Example continued: Our example program uses registers α and γ1 and locations
0 and 1. Hence its space complexity is 4 under the unit cost
measure. The content of all 4 cells is bounded by 2^n, two of them actually
achieve that value. Hence space complexity is Θ(L(2^n)) = Θ(n)
under the logarithmic cost measure. □
   γ1 ← p(j)
   α ← p(i + γ1)
Simulating Program
cost of the original program and the cost of the simulating program.
We want location 0 to always contain the content of γ1. We achieve
this goal by inserting p(0) ← γ1 after every instruction which modifies
γ1 and by inserting γ1 ← p(0) before every instruction which uses γ1.
For example, we replace α ← p(i + γ1) by γ1 ← p(0); α ← p(i + γ1) and
γ1 ← p(i) by γ1 ← p(i); p(0) ← γ1. This modification increases the cost
only by a constant factor under both measures. Finally, we replace the
instructions using general address substitution as described above,
i.e. we replace e.g. α ← p(i + p(j)) by γ1 ← p(j); α ← p(i + γ1). Note
that we do not have to enclose this piece of code in the brackets
p(0) ← γ1 and γ1 ← p(0) as before because we took care of saving γ1
elsewhere. We thus have (details are left to exercise 3):
     Opcode                          Address
 0   γ1 ← p(·)                       0
 1   α ← ·                           1
 2   if γ1 = 0 then goto ·           6
 3   α ← α × ·                       2
 4   γ1 ← γ1 − ·                     1
 5   goto ·                          2
 6   p(·) ← α                        1        □
The number of opcodes is finite because there are only four registers
and only a finite number of instructions. For the sequel, we assume a
fixed bijection between opcodes and some initial segment of the natural
numbers. We use Num to denote that bijection.
In addition to the RAM instruction set, the RASP instruction set contains
so-called π-instructions. π-instructions operate on the program
store. They are:

   α ← π_k(i)
   α ← π_k(i + γ_j)        i ∈ ℕ₀,
   π_k(i) ← α              1 ≤ k ≤ 2,
   π_k(i + γ_j) ← α        1 ≤ j ≤ 3,
Theorem 3: There is a c > 0 such that every RASP program of time complexity
T(n) can be simulated in ≤ c·T(n) time units on a RAM. This
holds true under both cost measures.
gram store and registers of the RASP are stored in the data store of
the RAM, more precisely, we use location 1 for the accumulator, loca-
tions 2, 3 and 4 for the index registers, locations 5, 8, 11, 14, ...
for the data store, and locations 6, 7, 9, 10, 12, 13, ... for the pro-
gram store. Two adjacent cells are used to hold the two components of
a location of the program store of the RASP. Location 0 is used as an
instruction counter; it always contains the number of the RASP instruc-
tion to be executed next. The interpreter has the following structure:
( 4) goto loop
We leave the details for the reader (exercise 3). According to theorems
2 and 3, time complexity on RAMs and RASPs differ only by a constant
factor. Since we will neglect constant factors anyway in most of what
follows, the choice of machine model is not crucial. We prefer the RAM
model because of its simplicity.
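To make the interpreter structure behind theorem 3 concrete, here is a toy
sketch in Python: the simulated program is kept as data together with an
instruction counter, and a loop fetches, decodes and executes one instruction
per iteration at constant overhead. The instruction encoding (small tuples) is
my own illustration, not the book's Num bijection.

   def run(program, memory, steps=10_000):
       """Interpret a tiny register program stored as data.

       program: list of (opcode, address) pairs:
          ("load", i)   alpha <- memory[i]
          ("store", i)  memory[i] <- alpha
          ("dec", _)    alpha <- alpha - 1
          ("jzero", k)  if alpha == 0 then goto k
          ("goto", k)   and ("halt", _)
       """
       alpha, pc = 0, 0                       # accumulator and instruction counter
       for _ in range(steps):
           op, addr = program[pc]
           if op == "load":    alpha = memory[addr]; pc += 1
           elif op == "store": memory[addr] = alpha; pc += 1
           elif op == "dec":   alpha -= 1; pc += 1
           elif op == "jzero": pc = addr if alpha == 0 else pc + 1
           elif op == "goto":  pc = addr
           elif op == "halt":  return memory
       raise RuntimeError("step limit exceeded")

   # count memory[0] down to zero
   print(run([("load", 0), ("jzero", 5), ("dec", 0), ("store", 0), ("goto", 0), ("halt", 0)],
             {0: 3}))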
So far, RAMs (and RASPs) have no ability to interact with their environ-
ment, i.e. there are no I/O facilities. We will live with that defect
except for chapter VI and always assume that the input (and output)
is stored in the memory in some natural way. For chapter VI on NP-com-
pleteness we have to be more careful. We equip our machines with two
semi-infinite tapes, a read only input tape and a write only output
tape. The input tape contains a sequence of integers. There is one head
on the input tape which is positioned initially on the first element of
the input sequence. Execution of the instruction a ~ Input transfers
the integer under the input head into the accumulator and advances the
input head by one position. The cost of instruction a ~ Input is 1 in
the unit cost measure and 1 + L(n) in the logarithmic cost measure where
n is the integer to be read in. Similarly, the statement Output - a
transfers the content of a onto the output tape. Whenever a RAM attempts
to read from the input tape and there is no element left on the input
tape, the computation blocks. We will then say that the output is undefined
and that the time complexity of that particular computation is the
number of time units consumed until blocking occurred.
I. 2. Randomized Computations
   α ← random
1: α ← random
2: if α ≠ 0 then goto
   Σ_{s ∈ {0,1}^k} T_A(p,s)
We can now define T_A(n) and T_A^av(n) as described above. Of course T_A^av(n)
is only defined with respect to a probability distribution on the inputs.

An argument similar to the one used in lemma 1 shows that the limit in
the definition above always exists. Of course, only the case ε(n) < 1/2
is interesting. Then A gives the desired output with probability larger
Proof: Consider any p ∈ P. Let n = g(p), T = (4/(1 − 2ε))·T_A(n) and
m = 2·(log δ)/log(1 − (1/2 − ε)²). On input p, B computes T, chooses m random
sequences s1, s2, ..., sm of length T each, and simulates A on inputs (p,s1),
..., (p,sm) for up to T time units each. It then outputs whatever the
majority of the simulated runs of A output. Apparently B runs for at
most O((1 + m)·T) = O(m·(1/2 − ε)^{-1}·T_A(n)) time units. Moreover,
f(p) ≠ B(p,s1...sm) iff A(p,si) ≠ f(p) for at least m/2 distinct i's.
Next note that

   |{s ∈ {0,1}^T ; f(p) ≠ A(p,s)}| / 2^T
      ≤ lim_{k→∞} |{s ∈ {0,1}^k ; f(p) ≠ A(p,s)}| / 2^k  +  |{s ∈ {0,1}^T ; T_A(p,s) ≥ T}| / 2^T

since A computes f with error at most ε (and therefore the first term
is bounded by ε) and since for p ∈ P with g(p) = n we have

   T_A(n) ≥ T_A(p) ≥ Σ {T_A(p,s) ; s ∈ {0,1}^T} / 2^T
                   ≥ Σ {T_A(p,s) ; s ∈ {0,1}^T and T_A(p,s) ≥ T} / 2^T
                   ≥ T·|{s ∈ {0,1}^T ; T_A(p,s) ≥ T}| / 2^T

and therefore the second term is bounded by T_A(n)/T = (1 − 2ε)/4. Let
γ = 1/2 − (1/2)·(1/2 − ε). Then

   |{s1...sm ∈ {0,1}^{Tm} ; B(p,s1,...,sm) ≠ f(p)}| / 2^{Tm}
      ≤ Σ_{i=⌈m/2⌉}^{m} (m choose i)·γ^i·(1 − γ)^{m−i}

by definition of m. □
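The amplification argument used in this proof is easy to try out numerically:
run m independent simulations of a base procedure whose single-run error
probability is some γ < 1/2 and take the majority answer. A small illustrative
Python sketch (γ and m are free parameters here, not the book's exact quantities):

   import random

   def noisy_answer(truth, gamma):
       """A single run: returns the correct bit with probability 1 - gamma."""
       return truth if random.random() > gamma else 1 - truth

   def majority_of_runs(truth, gamma, m):
       votes = sum(noisy_answer(truth, gamma) for _ in range(m))
       return 1 if 2 * votes > m else 0

   gamma, trials = 0.35, 20_000
   for m in (1, 5, 25, 125):
       errors = sum(majority_of_runs(1, gamma, m) != 1 for _ in range(trials))
       # the error probability is at most (4*gamma*(1-gamma))**(m/2), so it drops rapidly with m
       print(m, errors / trials)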
The choice of the seed is left to the user and we will not further
discuss it. We might use a physical device or actually toss a coin.
We will now make these concepts precise. Part c) of the definition below
defines in precise terms what we mean by the phrase that a mapping p
passes a statistical test h. We give two variants of the definition
which are geared towards the two applications of pseudo-random number
generators to be described later: fast simulation of randomized algo-
rithms by deterministic algorithms and reducing the number of coin
tosses in randomized computations.
The reader should pause at this point and should try to grasp the
h (s)
P otherwise
the running time of A on p and s can exceed twice the expected value
for at most half of the sequences of coin tosses. Hence
We close this section with a remark on the relation between the ex-
pected time complexity of deterministic algorithms and the running time
of probabilistic algorithms.
   Σ_{s ∈ {0,1}^{a(n)}} T_A(p,s) / 2^{a(n)}

E({T_B(p); p ∈ P and g(p) = n}) ≤ E({T_C(p); p ∈ P and g(p) = n})

   Σ_{s ∈ {0,1}^{a(n)}}
Only a few algorithms in this book are formulated in RAM code, most
algorithms are formulated in a high level Algol-like programming lan-
guage. We feel free to introduce additional statements into the language,
whenever the need arises and whenever translation to RAM code is obvious.
Also we make frequent use of complex data structures such as lists,
stacks, queues, trees and graphs. We choose this very high level de-
scription of many algorithms because it allows us to emphasize the
principles more clearly.
   P1;
   if α = 0 then goto exit;
   P2;
   goto first instruction of P1;
exit:
Records and arrays are the most simple ways of structuring data. An
array A of n elements consists of n variables A[1], ••. ,A[n] of the same
type. An array is stored in n consecutive storage locations, e.g. in
locations BA+1, BA+2, ..., BA+n. Here BA stands for base address. If x
is a variable stored in location i then accessing array element A[x] is
realized by means of index registers. The following piece of code

   γ1 ← p(i)
   α ← p(BA + γ1)

loads A[x] into the accumulator for a cost of 3 in the unit cost measure
and (2 + L(i) + L(BA) + 2·L(x) + L(A[x])) in the logarithmic cost
measure. Again the logarithmic cost is proportional to the length of
the numbers involved.
Queues, stacks, lists and trees are treated in the sections below. They
are all reduced to arrays.
Queues and stacks are used to represent sequences of elements which can
be modified by insertions and deletions. In the case of queues inser-
tions are restricted to the end of the sequence and deletions are re-
stricted to the front of the sequence. A typical example is a waiting
line in a student cafeteria. Queues are also known under the name FIFO
(first in - first out) store. In the case of stacks, insertions and
deletions are restricted to the end of the sequence: LIFO (last in -
first out) store. Very often, the names Push and Pop are used instead of
insertion into and deletion from a stack.
   TOP ← TOP + 1;
   K[TOP] ← a

and the following piece of code deletes an element from the stack and
assigns it to variable x, i.e. it realizes x ← Pop(K).

   S[END] ← a;
   END ← 1 + (END mod n);
   if FRONT = END then error fi
Insertions into and deletions from queues take constant time in the unit
cost measure.
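A minimal Python sketch of the two array realizations just described: a stack
with a top index, and a bounded queue kept in a circular array with FRONT and
END indices (array size n, as above). The names and the error handling mirror
the code fragments and are only illustrative.

   class Stack:
       def __init__(self, n):
           self.K = [None] * (n + 1)   # storage array K[1..n], index 0 unused
           self.top = 0
       def push(self, a):
           self.top += 1               # TOP <- TOP + 1; K[TOP] <- a
           self.K[self.top] = a
       def pop(self):
           x = self.K[self.top]        # x <- K[TOP]; TOP <- TOP - 1
           self.top -= 1
           return x

   class Queue:
       def __init__(self, n):
           self.S = [None] * (n + 1)   # circular array S[1..n]
           self.n = n
           self.front = self.end = 1   # the queue is empty iff FRONT = END
       def insert(self, a):
           self.S[self.end] = a
           self.end = 1 + (self.end % self.n)
           if self.front == self.end:
               raise OverflowError("queue overflow")
       def delete(self):
           x = self.S[self.front]
           self.front = 1 + (self.front % self.n)
           return x

   q = Queue(4); q.insert(10); q.insert(20)
   assert q.delete() == 10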
I. 4.2 Lists
Linear lists are used to represent sequences which can be modified any-
where. In linear lists the elements of a sequence are not stored in
consecutive storage locations, rather each element explicitly points
to its successor.
There are many versions of linear lists: singly linked, doubly linked,
circular, • . • . We discuss singly linked lists here and leave the
others for the exercises. In singly linked linear lists each element
points to its successor. There are two realizations of linear lists:
one by records and one by arrays. We discuss both, although internally
the record representation boils down to the array representation. We
use both representations throughout the book whichever is more conven-
ient. In the record representation an element of a linear list is a
record of type element = record cont: real; next: ↑element end and
HEAD is a variable of type ↑element. HEAD always points to the first
element of the list. The pictorial representation is as given above.
Closer to RAM code is the realization by two arrays. Real array
CONTENT[1..n] contains the contents of the elements and integer array
NEXT[1..n] contains the pointers. HEAD is an integer variable. Our
example list can be stored as follows:
HEAD: 2
          CONTENT      NEXT
     1    element 4     0
     2    element 1     4
     3    element 3     1
     4    element 2     3
Here HEAD = 2 means that row 2 of the arrays contains the first element
of the list, the second element is stored in row 4, . . • . The last ele-
ment of the list is stored in row 1; NEXT[1] = 0 indicates that this
element has no successor.
27
We will next describe insertion into, deletion from and creation of linear
lists. We give two versions of each program, one using records and one
using arrays. In either case we assume that there is a supply of unused
elements and that a call of procedure newr(var p: ↑element)
(newa(var p: integer)) takes a node from the supply and lets p point to
it in the record (array) version, and that a call of procedure
disposer(var p: ↑element) (disposea(var p: integer)) takes the node pointed
to by p and returns it to the supply. Suffixes r and a distinguish
between the representations. We discuss later how the supply of elements
is implemented. In our example a call newa(p) might assign 5 to
p because the fifth row is unused, and a call newr(p) results in p
pointing to a fresh, unused record.
Procedure Create takes one parameter and makes it the head of an empty
list. Again we use suffixes r and a to distinguish the record and the
array version.
and
We summarize in
One often uses linear lists to realize stacks and queues. In par-
ticular, several stacks and queues may share the same supply of unused
nodes. This will guarantee high storage utilization if we have know-
ledge about the total length of all stacks and queues but no knowledge
about individual length. Typical examples can be found in chapter IV.
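A small Python sketch of the two-array realization with a shared supply of
unused nodes (a free list threaded through the NEXT array); the procedure
names follow the newa/disposea convention above, the rest is illustrative.

   class ListStore:
       """Linear lists realized by arrays CONTENT[1..n] and NEXT[1..n].

       Unused rows form a free list; several lists may share one store."""
       def __init__(self, n):
           self.CONTENT = [None] * (n + 1)
           self.NEXT = [0] + list(range(2, n + 2))   # NEXT[i] = i+1 threads the free rows
           self.NEXT[n] = 0                          # 0 marks "no successor" / end of supply
           self.free = 1                             # head of the supply of unused rows

       def newa(self):                               # take a row from the supply
           p = self.free
           if p == 0:
               raise MemoryError("supply exhausted")
           self.free = self.NEXT[p]
           return p

       def disposea(self, p):                        # return row p to the supply
           self.NEXT[p] = self.free
           self.free = p

       def insert_front(self, head, value):          # returns the new head row
           p = self.newa()
           self.CONTENT[p], self.NEXT[p] = value, head
           return p

   store = ListStore(10)
   head = 0                                          # the empty list
   for v in ["element 3", "element 2", "element 1"]:
       head = store.insert_front(head, v)
   p = head
   while p != 0:
       print(store.CONTENT[p]); p = store.NEXT[p]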
I. 4.3 Trees
We use the following terms when we talk about trees. Let T be a tree
with root v and subtrees T_i, 1 ≤ i ≤ m. Let w_i = root(T_i). Then w_i is
the i-th son of v and v is the father of w_i. Descendant (ancestor)
Our example tree is a binary tree. A binary tree with n nodes has n+1
leaves. The 1st (2nd) subtree is also called left (right) subtree.
figure 2
        CONTENT          LSON   RSON
   1    content of v4      8      4
   2    content of v2      3      1
   3    content of b1      0      0
   4    content of b5      0      0
   5    content of v1      2      7
   6    content of b2      0      0
   7    content of v3      6      9
   8    content of b4      0      0
   9    content of b3      0      0
figure 1
figure 3
Preorder traversal: visit the root, traverse the left subtree, traverse
the right subtree: Root, L, R.
Postorder traversal: traverse the left subtree, traverse the right sub-
tree, visit the root: L, R, Root.
Symmetric traversal: traverse the left subtree, visit the root, trav-
erse the right subtree: L, Root, R.
Symord below traverses a tree in symmetrical order and prints the con-
tent of all nodes and leaves.
proc SYMORD(v);
(1)   if v is a leaf
(2)   then print(CONTENT[v])
(3)   else SYMORD(LSON[v]);
(4)        print(CONTENT[v]);
(5)        SYMORD(RSON[v])
      fi
end
I. 5. Recursion
(1')  begin co the main program calls SYMORD(root);
(2')  TOP ← 0;

(16') print(CONTENT[K[TOP+1]]);
      co call SYMORD(RSON[v]);
(17') TOP ← TOP + 2;
(18') K[TOP+1] ← RSON[K[TOP-1]];
(19') K[TOP+2] ← "M2";
end
parameter and in (13') the return address is stored. In line (14') control
is transferred to SYMORD and the recursive call is started. Upon
completion of the call control is transferred to label M1 (line (15')).
The space for the activation record is freed (line (16')) and execution
of SYMORD is resumed. Analogously, one can associate the other instructions
of the non-recursive program with the instructions of the recursive
program.
if K[TOP+2] = "M1"
then goto M1;
if K[TOP+2] = "M2"
then goto M2;
goto HP;
The summation over all calls of a procedure can be done more formally
by means of recursion equations. Let T(v) be the run time of call
SYMORD(v). Then

   T(v) = c1                                     if v is a leaf
   T(v) = c2 + T(LSON[v]) + T(RSON[v])           otherwise
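The same recursion can also be removed mechanically with an explicit stack,
as in the non-recursive program sketched above. A compact Python rendering
(illustrative; the stack holds (node, return-label) pairs much like the array
K of activation records):

   def symord(v, LSON, RSON, CONTENT):
       """Print the contents of the tree rooted at v in symmetric order,
       using an explicit stack instead of recursion.  Leaves have LSON[v] == 0."""
       stack = [(v, "call")]
       while stack:
           v, label = stack.pop()
           if label == "call":
               if LSON[v] == 0:                    # v is a leaf
                   print(CONTENT[v])
               else:
                   stack.append((v, "M1"))         # resume here after the left call
                   stack.append((LSON[v], "call"))
           elif label == "M1":
               print(CONTENT[v])
               stack.append((RSON[v], "call"))     # no work remains after the right call

   # a small example tree stored in arrays (root in row 5)
   LSON = [0, 8, 3, 0, 0, 2, 0, 6, 0, 0]
   RSON = [0, 4, 1, 0, 0, 7, 0, 9, 0, 0]
   CONTENT = [None, "v4", "v2", "b1", "b5", "v1", "b2", "v3", "b4", "b3"]
   symord(5, LSON, RSON, CONTENT)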
I. 6. Order of Growth
In most cases we will not be able to derive exact expressions for the
run time of algorithms; rather we have to be content with order of
growth analysis. We use the following notation.
Definition: Let f: ℕ₀ → ℕ₀ be a function. Then O(f), Ω(f) and Θ(f)
denote the following sets of functions:

   O(f) = {g: ℕ₀ → ℕ₀ ; ∃ c > 0 ∃ n₀: g(n) ≤ c·f(n) for all n ≥ n₀}
   Ω(f) = {g: ℕ₀ → ℕ₀ ; ∃ c > 0 ∃ n₀: g(n) ≥ c·f(n) for all n ≥ n₀}
   Θ(f) = {g: ℕ₀ → ℕ₀ ; ∃ c > 0 ∃ n₀: (1/c)·f(n) ≤ g(n) ≤ c·f(n) for all n ≥ n₀}
□
I. 7. Secondary Storage
I. 8. Exercises
8) A tree is k-ary if every node has exactly k sons. Show that a k-ary
tree with n nodes has exactly k + (k-1) (n-1) leaves.
9) Prove

   f(n) = O(f(n))
   O(f(n)) + O(f(n)) = O(f(n))
   c·O(f(n)) = O(f(n))
   O(f(n))·O(g(n)) = O(f(n)·g(n))
I. 9. Bibliographic Notes
S_{i_1} ≤ S_{i_2} ≤ ... ≤ S_{i_n}.
Apparently, the information associated with the keys does not play
any role in the sorting process. We will therefore never mention it in
our algorithms. There is one further remark in order. Frequently, the
information associated with the keys is much larger than the keys themselves.
It is then very costly to move an object. Therefore it is better
to replace the objects by records consisting of the key and a
pointer to the object and to sort the set of records obtained in this
way. Sometimes it will also suffice to only compute the permutation
i_1, i_2, ..., i_n and to not actually rearrange the set of objects. All
algorithms described below are easily adapted to this modified sorting
problem.
Remark: For this chapter we extend our machine model by two instruc-
tions, which are very helpful in sorting algorithms. One instruction
exchanges the contents of accumulator and some storage cell.
We begin with a very simple sorting method. Select the largest element
from the input sequence, delete it and add it to the front of the output
sequence. The output sequence is empty initially. Repeat this process
until the input sequence is exhausted. We formulate the complete
algorithm in our high level programming language first and then translate
it into machine code. The input sequence is given by array S[1..n].
(1) j ← n;
(2) while j > 1
    do co S[j+1..n] contains the output sequence in increasing order,
          S[1..j] contains the remainder of the input sequence;
(3)    k ← j; max ← j; S ← S[j];
(4)    while k > 1
       do co we search for the maximal element of S[1..j]; S
             is always a largest element of S[k..j];
(5)       k ← k − 1;
(6)       if S[k] > S
(7)       then max ← k; S ← S[k] fi
       od;
(8)    S[max] ← S[j]; S[j] ← S;
(9)    j ← j − 1
    od;
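The same algorithm in Python, kept close to the numbered program above
(lines (1)-(9)); this is only a transliteration for readers who want to run it:

   def selection_sort(S):
       """Sort S[1..n] in place by repeated maximum selection (1-based, S[0] unused)."""
       j = len(S) - 1                      # (1) j <- n
       while j > 1:                        # (2) S[j+1..n] already holds the largest keys
           k, mx, s = j, j, S[j]           # (3) k <- j; max <- j; S <- S[j]
           while k > 1:                    # (4) search for the maximum of S[1..j]
               k -= 1                      # (5)
               if S[k] > s:                # (6)
                   mx, s = k, S[k]         # (7)
           S[mx], S[j] = S[j], s           # (8) exchange S[max] and S[j]
           j -= 1                          # (9)
       return S

   print(selection_sort([None, 4, 5, 3, 1, 2]))   # [None, 1, 2, 3, 4, 5]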
Next we translate into machine code. We use the following storage assignment.
Index register γ1 holds j−1, γ2 holds k−1 and index register
γ3 holds max−1. n is stored in location 0, S is stored in α and array
S[1..n] is stored in locations m+1, ..., m+n. We assume n > 1. In the
right column the number of executions of each instruction is given.
  0   γ1 ← p(0)                 }                        1
  1   γ1 ← γ1 − 1               }  j ← n                 1
  2   γ2 ← γ1                      k ← j                 n − 1
  3   γ3 ← γ1                      max ← j               n − 1
  4   α ← p(m+1+γ3)                S ← S[max]            n − 1
  5   γ2 ← γ2 − 1                  k ← k−1               A
  6   α ← α − p(m+1+γ2)            }                     A
  7   if α ≥ 0 then goto 10        }  line (6)           A
  8   γ3 ← γ2                      max ← k               B
  9   α ← p(m+1+γ3)                S ← S[max]            B
 10   if γ2 > 0 then goto 5           line (4)           A
 11   α ← p(m+1+γ1)                }                     n − 1
 12   p(m+1+γ3) ← α                }  line (8)           n − 1
 13   γ1 ← γ1 − 1                     j ← j−1            n − 1
 14   if γ1 > 0 then goto 2           line (2)           n − 1
Lines 5, 6, 7, 10 are executed for j = n, n−1, ..., 2 and k = j−1, ..., 1.
Hence A = n(n−1)/2. Furthermore B ≤ A. Thus the run time is at most
In each execution of the body of the outer loop we exchange S[max] with
S[j]. Since there are j different possibilities for max, this step transforms
j possible arrangements of keys S_1, ..., S_j into one arrangement.
Hence after the exchange all possible permutations of keys S_1, ..., S_{j−1}
are equally likely. Therefore we always have a random permutation of
keys S_1, ..., S_j during the sorting process. Sequence S_1, ..., S_j is
scanned from right to left. Instruction 8 is executed whenever a new
maximum is found. For example, if 4 5 3 1 2 is sequence S_1, ..., S_j (we
assume w.l.o.g. that {S_1, ..., S_j} = {1, ..., j}; this convention facilitates
the discussion), then the maximum changes twice: from 2 to 3 and
then to 5. Key S_j is equal to i, 1 ≤ i ≤ j, with probability 1/j. If
S_j = j then the maximum is found. If S_j = i < j then we may delete
elements 1, ..., i−1 from the sequence. They certainly do not change the
maximum. Thus we are left with (j−1)−(i−1) = j−i keys out of S_1, ..., S_{j−1}.
The maximum changes once plus whatever number of times it changes in the
remaining sequence. This gives the following recursion for a_j, the expected
number of changes of the maximum:

   a_1 = 0
   a_j = (1/j) · Σ_{i=1}^{j−1} (1 + a_{j−i})     for j > 1
Multiplying by j gives

   j·a_j = j − 1 + Σ_{i=1}^{j−1} a_i

Subtraction yields

   (j+1)·a_{j+1} − j·a_j = 1 + a_j

or

   a_{j+1} = a_j + 1/(j+1)

Thus

   a_j = Σ_{i=2}^{j} 1/i

where H_j = Σ_{i=1}^{j} 1/i is the j-th harmonic number (cf. appendix). We have
thus shown that lines (8) and (9) are executed a_{j₀} times on the average
when the outer loop is executed with j = j₀. Hence

   B = Σ_{j=2}^{n} a_j = Σ_{j=1}^{n} H_j − n = (n+1)·H_n − 2n     (cf. appendix)
     ≤ (n+1)·ln n − (n−1)
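A quick empirical check of this derivation (a_j = H_j − 1, the expected number
of maximum changes when a random permutation of j keys is scanned from right to
left); purely illustrative:

   import random

   def count_max_changes(perm):
       """Number of times the running maximum changes, scanning from right to left."""
       changes, current = 0, perm[-1]
       for x in reversed(perm[:-1]):
           if x > current:
               current = x
               changes += 1
       return changes

   j, trials = 10, 200_000
   avg = sum(count_max_changes(random.sample(range(j), j)) for _ in range(trials)) / trials
   harmonic = sum(1.0 / i for i in range(1, j + 1))
   print(avg, harmonic - 1)     # both are close to H_10 - 1 = 1.9289...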
the second. We can see from these examples that the
trees which we build should have small fan-out. In this
way only little information is lost when a maximum is
removed. Let us consider one more example, the sequence 2 10 3 5 1 7 9
6 4 8. The "ideal" tree has the form
(figure: the tree built for the sequence 2 10 3 5 1 7 9 6 4 8; it already has
the right shape, but the root label does not satisfy the property yet)

(figure: in this tree the 2 is still not larger than its two sons)

(figures: removal of the maximum; the panels show removal of 9, then 8, then 7,
then 6, and the tree after each removal)
In this way we always keep the tree an almost complete binary tree. All
leaves are on two adjacent levels, all levels except the bottom level
are complete, and the leaves on the bottom level are as far to the left
as possible. Trees of this form can be stored in very compact form:
Number the nodes starting at the root (with number 1) according to in-
creasing depth and for fixed depth from left to right. Then the sons
of node k are numbered 2k and 2k+1 and the father of node k is numbered
⌊k/2⌋ (⌊x⌋ is the largest integer ≤ x). Thus we can store the tree in
a single array and save the space for pointers.
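A compact Python sketch of this array representation and of Heapsort's two
phases (build-up and selection), with the sons of node k stored at 2k and 2k+1;
this only illustrates the scheme described above, not the book's exact program:

   def sift_down(S, i, r):
       """Restore the heap property in S[i..r]: every node is >= its sons (1-based)."""
       while 2 * i <= r:
           j = 2 * i                         # left son
           if j < r and S[j + 1] > S[j]:
               j += 1                        # pick the larger son
           if S[i] >= S[j]:
               break
           S[i], S[j] = S[j], S[i]
           i = j

   def heapsort(S):
       n = len(S) - 1                        # S[1..n], S[0] unused
       for i in range(n // 2, 0, -1):        # build-up phase
           sift_down(S, i, n)
       for r in range(n, 1, -1):             # selection phase
           S[1], S[r] = S[r], S[1]           # move the maximum to its final place
           sift_down(S, 1, r - 1)
       return S

   print(heapsort([None, 2, 10, 3, 5, 1, 7, 9, 6, 4, 8]))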
(1) i ← ⌊n/2⌋ + 1; r ← n;
(2) while r ≥ 2 do
    begin co either r = n and then S[1..n] is a heap starting at i
       (build-up phase) or i = 1 and the n−r largest elements are stored
       in increasing order in S[r+1], ..., S[n] (selection phase).
Line 10 is executed in each iteration of the inner loop and the inner
loop is entered at least once in each iteration of the outer loop.
Hence total running time is proportional to the number of comparisons
in line (10). We count the number of comparisons. For simplicity, let
n = 2^k − 1.

   2 · Σ_{i=0}^{k−1} i·2^i

procedure Quicksort(ℓ, r);
   co Quicksort(ℓ, r) sorts the subarray S[ℓ], ..., S[r] into increasing order;
(1) begin i ← ℓ; k ← r + 1; S ← S[ℓ];
(2)   while i < k do
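A runnable Python version of the procedure (partition around S[ℓ], then sort
the two subarrays recursively); the in-place partition below is one standard
way to realize lines (1)-(2) and is an illustration, not the book's exact program:

   def quicksort(S, l, r):
       """Sort S[l..r] (1-based) in place; the first element is the partitioning element."""
       if l >= r:
           return
       pivot = S[l]
       i, k = l, r + 1
       while True:
           i += 1
           while i <= r and S[i] <= pivot:   # scan from the left for an element > pivot
               i += 1
           k -= 1
           while S[k] > pivot:               # scan from the right for an element <= pivot
               k -= 1
           if i >= k:
               break
           S[i], S[k] = S[k], S[i]
       S[l], S[k] = S[k], S[l]               # put the pivot into its final position k
       quicksort(S, l, k - 1)
       quicksort(S, k + 1, r)

   A = [None, 3, 1, 4, 1, 5, 9, 2, 6]
   quicksort(A, 1, len(A) - 1)
   print(A)                                  # [None, 1, 1, 2, 3, 4, 5, 6, 9]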
Let us return to the maximal number QS(n) of key comparisons which are
needed on an array of n elements. We have
   QS(0) = QS(1) = 0
   QS(n) ≤ n + 1 + max {QS(k−1) + QS(n−k); 1 ≤ k ≤ n}
arrays before partitioning which produce the array above by the partitioning
process. The important fact to observe is that this expression
only depends on k but not on the particular permutations i_1, ..., i_{k−1}
and j_{k+1}, ..., j_n. Thus all permutations are equally likely, and hence the
subproblems of size k − 1 and n − k are again random. Let QS_av(n) be the
expected number of comparisons on an input of size n. Then
and
   QS_av(n) = 1/n · Σ_{k=1}^{n} (n+1 + QS_av(k−1) + QS_av(n−k))
            = n+1 + 2/n · Σ_{k=0}^{n−1} QS_av(k)               for n ≥ 2

   n·QS_av(n) = n(n+1) + 2·Σ_{k=0}^{n−1} QS_av(k)              for n ≥ 2
   QS_av(n+1) = 2 + ((n+2)/(n+1))·QS_av(n)

   QS_av(n+1) = Σ_{i=1}^{n+1} ( Π_{j=i+1}^{n+1} c_j ) · b_i  =  2·(n+2) · Σ_{i=2}^{n+2} 1/i
where H_{n+1} is the (n+1)-th harmonic number (cf. appendix). Let us return
to run time. We argued above that the run time of the partitioning
phase is proportional to the number of comparisons and that the total
cost of all other operations is O(n). Thus
(1a) begin i ← ℓ; k ← r + 1;
(1b)   j ← a random element of {0, ..., r − ℓ};
(1c)   interchange S[ℓ] and S[ℓ+j];
(1d)   S ← S[ℓ];
What does that do for us? Let Π be any permutation of the numbers 1, ..., n
and let QS_ran(Π) be the expected number of comparisons in randomized
Quicksort applied to sequence Π(1), Π(2), ..., Π(n). In lines (1a) - (1d)
we choose a random element of Π as the partitioning element, i.e.
S = S[h] with probability 1/n for 1 ≤ h ≤ n. Then subproblems Π1 and Π2

   QS_ran(Π) = 1/n · Σ_{h=1}^{n} (n+1 + QS_ran(Π1(Π,h)) + QS_ran(Π2(Π,h)))
Theorem 4: Let a, b, f: [1,∞) → ℝ⁺, b strictly increasing and b(n) < n
for all n. Then the recurrence

has solution

   T(n) = h(n) · Σ_{i=0}^{rd(n)} g(b^{(i)}(n))

where

and

   g(n) = f(n)/h(n)
labels of all nodes and leaves of the tree. We can determine T(n) by
summing the labels by depth. A node of depth d has label f(b^{(d)}(n)).
If we distribute that label over all leaves below the node then each
leaf receives a portion f(b^{(d)}(n))/h(b^{(d)}(n)) = g(b^{(d)}(n)). Thus

   T(n) = h(n) · Σ_{d=0}^{rd(n)} g(b^{(d)}(n)).

   T(n) = h(n) · Σ_{i=0}^{rd(n)} g(b^{(i)}(n))
        = h(n) · Σ_{i=0}^{rd(n)} g(b^{(−rd(n)+i)}(n₀))
        = h(n) · Σ_{i=0}^{rd(n)} g(b^{(−i)}(n₀))
   T(n) = O(h(n) · Σ_{i=0}^{rd(n)} i^p)

        = O(h(n))                        if p < −1
        = O(h(n) · log rd(n))            if p = −1
        = O(h(n) · rd(n)^{p+1})          if p > −1

   T(n) = Ω(h(n) · Σ_{i=0}^{rd(n)} q^i) = Ω(h(n) · q^{rd(n)})

f) As in part e) we conclude

   T(n) = Θ(h(n) · Σ_{i=0}^{rd(n)} q^i) = Θ(f(n))          □
Corollary 6: Let a, b ∈ ℝ⁺, b > 1, and let f: [1,∞) → ℝ⁺. Then recurrence

In particular,

            O(n^{log_b a})                          if f(n) = O(n^p) for some p < log_b a
            O(n^{log_b a} · (log_b n)^{p+1})        if f(n) = Θ(n^{log_b a} · (log_b n)^p) and p > −1
   T(n) =   O(n^{log_b a} · log log n)              if f(n) = Θ(n^{log_b a} · (log_b n)^p) and p = −1
            O(n^{log_b a})                          if f(n) = Θ(n^{log_b a} · (log_b n)^p) and p < −1

   Σ_{i=0}^{⌊log_b n⌋} a^i · f(n/b^i)

Next note that f(n) = O(n^p) for some p < log_b a implies g(n) =
O(q^{−rd(n)}) for some q > 1 and hence T(n) = O(h(n)). Also, if
f(n) = Θ(n^{log_b a} · (log_b n)^p) then g(n) = Θ((log_b n)^p) = Θ(rd(n)^p). Thus
T(n) is as given by corollary 5, cases b) - d). Finally case f) handles
f(n) = Θ(n^p) with p > log_b a. □
Quicksort's bad worst case behaviour stems from the fact that the size
of the subproblems created by partitioning is badly controlled. Corollary
6 tells us that a = b = 2 would produce an efficient method. How
can we achieve the splitting into two subproblems of size n/2 each?
There is a simple answer: Take the first half of the input sequence and
sort it, take the second half of the input sequence and sort it. Then
merge the two sorted subsequences into a single sorted sequence. These
considerations lead to the following algorithm.
procedure Mergesort(S);
begin  let n = |S|; split S into two subsequences S1 and S2 of
       length ⌈n/2⌉ and ⌊n/2⌋ respectively;
   Mergesort(S1);
   Mergesort(S2);
   suppose that the first recursive call produces sequence
      x_1 ≤ x_2 ≤ ... ≤ x_{⌈n/2⌉} and the second call produces
      y_1 ≤ y_2 ≤ ... ≤ y_{⌊n/2⌋};
   merge the two sequences into a single sorted sequence
end
   i ← 1; j ← 1; k ← 1;
   case
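A Python sketch of the merging step with an explicit comparison count (at most
n − 1 comparisons for two sequences of total length n, matching the bound proved
next); illustrative only:

   def merge(xs, ys):
       """Merge two sorted lists; returns (merged list, number of key comparisons)."""
       z, i, j, comparisons = [], 0, 0, 0
       while i < len(xs) and j < len(ys):
           comparisons += 1                  # one comparison decides the next output element
           if xs[i] <= ys[j]:
               z.append(xs[i]); i += 1
           else:
               z.append(ys[j]); j += 1
       z.extend(xs[i:]); z.extend(ys[j:])    # one sequence is exhausted, copy the rest
       return z, comparisons

   def mergesort(S):
       if len(S) <= 1:
           return S
       mid = (len(S) + 1) // 2               # the first half has ceil(n/2) elements
       left, right = mergesort(S[:mid]), mergesort(S[mid:])
       return merge(left, right)[0]

   print(mergesort([2, 10, 3, 5, 1, 7, 9, 6, 4, 8]))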
Proof: We choose the x- and y-sequence such that the elements are pairwise
different and such that the following relations between the elements
hold: x_1 < y_1 < x_2 < y_2 < ... < x_{⌊n/2⌋} < y_{⌊n/2⌋} (< x_{⌈n/2⌉}). Element
x_{⌈n/2⌉} only exists if n is odd. Let us assume that some algorithm produces
the correct z-sequence and uses less than n − 1 comparisons in
doing so. Then it has not compared a pair x_i, y_i or y_i, x_{i+1}. Let us
assume the former case, i.e. it has not compared x_i with y_i for some i,
1 ≤ i ≤ n/2. Then let us consider the algorithm on an input where the
following relations hold: x_1 < y_1 < x_2 < ... < y_{i−1} < y_i < x_i < x_{i+1} <
... . All comparisons which are made by the algorithm have the same
outcome on the original and the modified input and hence the algorithm
produces the same z-sequence on both inputs, a contradiction. □
   M(1) = 0   and
   M(n) = n − 1 + M(⌈n/2⌉) + M(⌊n/2⌋)     if n > 1

and hence

   M(n) = n·⌈log n⌉ − 2^{⌈log n⌉} + 1

Theorem 8: Mergesort sorts n elements with at most n·⌈log n⌉ − 2^{⌈log n⌉} + 1
comparisons and run time Θ(n log n).
(figure: a merging pattern over sequences S1, ..., S6, e.g.
S7 ← Merge(S6, S4), S8 ← Merge(S2, S5), S9 ← Merge(S7, S8))
Definition: Let T be a binary tree with n leaves v_1, ..., v_n, let CONT:
leaves of T → {w_1, ..., w_n} be a bijection and let d_i be the depth of
leaf v_i. Then

   COST(T)

Let T_opt with labelling CONT_opt be an optimal tree. Let {y_1, ..., y_n} be
the set of leaves of T_opt. Assume w.l.o.g. that CONT_opt(y_i) = w_i for
1 ≤ i ≤ n. Let d_i^opt be the depth of leaf y_i in tree T_opt.
Proof: Assume otherwise, say w_i < w_j and d_i^opt < d_j^opt for some i and j.
If we interchange the labels of leaves y_i and y_j then we obtain a tree
with cost

a contradiction. □
~ Cost(T~lg) + w1 + w2
Cost (Talg )
It remains to analyse the run time of the algorithm. The crucial ob-
servation is:
Lemma 4: Let z1,z2, ••• ,zn_1 be the nodes created by the algorithm in
order. Then CONT[z1] ~ CONT[z2] ~ ••• ~ CONT[Zn_1]. Furhtermore, we
always have V = {vi, ••• ,vn }, I = {Zj, ••• ,zk} for some i ~ n+1,
j ~ k+1 ~ n when entering the body of the loop.
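The algorithm analysed here repeatedly combines the two currently smallest
weights into a new node; a brief Python sketch with a priority queue (heapq),
which also makes the monotonicity claimed in lemma 4 easy to observe. The
list-based bookkeeping is illustrative; the book's algorithm maintains the
sorted sequences V and I explicitly.

   import heapq

   def optimal_merge_tree(weights):
       """Greedily build a tree over the given leaf weights.

       Repeatedly removes the two smallest weights and creates an interior node
       whose weight is their sum; returns the weights of the created nodes
       z1, ..., z_{n-1} (they come out nondecreasing, cf. lemma 4) and the total cost."""
       heap = list(weights)
       heapq.heapify(heap)
       created, cost = [], 0
       while len(heap) > 1:
           a = heapq.heappop(heap)
           b = heapq.heappop(heap)
           created.append(a + b)            # CONT of the new node z_i
           cost += a + b                    # total cost = sum of interior node weights
           heapq.heappush(heap, a + b)
       return created, cost

   print(optimal_merge_tree([2, 3, 5, 7, 11]))   # created node weights are nondecreasing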
                      Maximum        Heapsort       Quicksort       Mergesort     A-Sort
                      selection

# of comparisons
  worst case          n²/2           2n log n       n²/2            n log n       O(n log F/n)
  average case        n²/2           ≈ 2n log n     1.44 n log n    n log n       O(n log F/n)

run time
  worst case          2.5 n²         20 n log n     O(n²)           12 n log n    O(n log F/n)
  average case        2.5 n²         ≈ 20 n log n   9 n log n       12 n log n    O(n log F/n)

storage requirement   n + const      n + const      n + log n       2n + const    5n + const
   if k ≤ (ℓ + r)/2
   then if ℓ < k−1 then Quicksort(ℓ, k−1) fi;
        if k+1 < r then Quicksort(k+1, r) fi
   else if k+1 < r then Quicksort(k+1, r) fi;
        if ℓ < k−1 then Quicksort(ℓ, k−1) fi
   fi
This modification has the following effect. The size of the subproblem
which is solved first has at most 1/2 the size of the original problem.
Therefore there are never more than log n recursive calls pending.
Mergesort requires two arrays of length n and space for some additional
pointers. A-sort is based on (a,b)-trees (cf. III.5.2 and III.5.3).
For a = 2 and b = 3 a node of an (a,b)-tree requires 5 storage locations
and may contain only one key. Hence A-sort requires up to 5n
storage locations.
Average asymptotic run time: The table contains the dominating terms of
the bounds on run time derived above. The relation of the run times of
Quicksort: Mergesort: Heapsort is 1: 1.33: 2.22 which is typical for
many computers. Note however, that the worst case run time of Mergesort
and Heapsort is not much larger than their expected run time. The re-
lative efficiency of Quicksort is based on the following facts: All
comparisons in the partitioning phase are made with the same element
and therefore this element can be kept in a special register. Also an
exchange of keys is required only after every fourth comparison on the
average, after every second in Heapsort and after every comparison in
General sorting algorithms are comparison based, i.e. the only operation
applicable to elements of the universe from which the keys are
drawn is the comparison between two elements with outcome ≤ or >. In
particular, the pair of keys compared at any moment of time depends only
on the outcome of previous comparisons and nothing else. For purposes
of illustration let us consider sorting by simple maximum selection
on sequence S_1, S_2, S_3. The first comparison is S_2 : S_3. If S_3 > S_2
then S_3 is compared with S_1 next, if S_3 ≤ S_2 then S_2 is compared with
S_1 next, ... . We can illustrate the complete sorting algorithm by a
tree, a decision tree.
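For n = 3 the decision tree just described can be written out as nested
comparisons; each root-to-leaf path corresponds to one possible ordering
(a small illustrative Python fragment):

   def sort3(S1, S2, S3):
       """Sort three keys by simple maximum selection; every branch is one comparison."""
       if S3 > S2:                 # first comparison: S2 : S3
           if S3 > S1:             # S3 is the maximum
               return sort2(S1, S2) + [S3]
           else:                   # S1 >= S3 > S2, so S1 is the maximum
               return [S2, S3, S1]
       else:                       # S2 >= S3
           if S2 > S1:             # S2 is the maximum
               return sort2(S1, S3) + [S2]
           else:                   # S1 >= S2 >= S3
               return [S3, S2, S1]

   def sort2(a, b):
       return [a, b] if b > a else [b, a]

   print(sort3(4, 5, 3))           # [3, 4, 5]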
We can now define the worst case and average case complexity of the
sorting problem. For a decision tree T and permutation σ let ℓ_σ^T be the
depth of the leaf which is reached on input S_1, ..., S_n with

where T ranges over all decision trees for sorting n elements and σ
ranges over all permutations of n elements. ℓ_σ^T is the number of comparisons
used on input σ by decision tree T. Thus S(n) is the minimal
worst case complexity of any algorithm and A(n) is the minimal average
case complexity of any algorithm. We prove lower bounds on S(n) and
A(n).
Suppose S(n) ≤ k. A tree of depth ≤ k has at most 2^k leaves. A decision
tree for sorting n elements has at least n! leaves. Thus

   2^{S(n)} ≥ n!

or

   S(n) ≥ ⌈log n!⌉
An upper bound for S(n) comes from the analysis of sorting algorithms.
In particular, we infer from the analysis of Mergesort

   S(n) ≤ n·⌈log n⌉ − 2^{⌈log n⌉} + 1

and hence
= 0 otherwise.

   (1/n!) · Σ_π ℓ_π^T ≥ H(1/n!, ..., 1/n!)

for any tree T. Here H is the entropy of the distribution (cf. III.4.).
Theorem 10: Every decision tree algorithm for sorting n elements needs
≥ ⌈log n!⌉ comparisons in the worst case and ≥ log n! comparisons on
the average.
Theorem 11: Every decision tree algorithm for the element uniqueness
problem of size n needs at least log n! comparisons in the worst case.
For permutation π of n elements let v(π) be the leaf reached on an input
(S_1, ..., S_n) ∈ U^n with S_{π(1)} < S_{π(2)} < ... < S_{π(n)}. Note that this
leaf is well-defined because the computation on any input
(S_1, ..., S_n) ∈ U^n with S_{π(1)} < S_{π(2)} < ... < S_{π(n)} will end in the same
leaf.

Proof: Assume otherwise, i.e. there are permutations π and ρ such that
v(π) = v(ρ). Note that leaf v(π) is labelled yes. We will now construct
an input (S_1, ..., S_n) such that the computation on input (S_1, ..., S_n)
ends in leaf v(π), yet S_i = S_j for some i ≠ j. This is a contradiction.

(S_1, ..., S_n) is constructed as follows. Let w_1, ..., w_t be the nodes on
the path to leaf v(π). Consider partial order P(v(π)) on (S_1, ..., S_n)
defined as follows. For k, 1 ≤ k ≤ t: if w_k is labelled S_i : S_j and the
≤-edge (>-edge) is taken from w_k to w_{k+1} then S_i ≤ S_j (S_i > S_j) in
partial order P(v(π)). Then P(v(π)) is the smallest partial order (with
respect to set inclusion) satisfying these constraints. In particular,
if (R_1, ..., R_n) ∈ U^n is such that R_{σ(1)} < R_{σ(2)} < ... < R_{σ(n)} for
σ ∈ {π, ρ} then (R_1, ..., R_n) satisfies partial order P(v(π)). Hence
P(v(π)) is not a linear order and therefore there are a and b, a ≠ b,
such that S_a, S_b are not related in partial order P(v(π)). Let
(S_1, ..., S_n) ∈ U^n be such that (S_1, ..., S_n) satisfies partial order
P(v(π)) and S_a = S_b. Then the computation on input (S_1, ..., S_n) ends in
leaf v(π) and hence has result yes, a contradiction. □
and

   A_ran(n) = min_T (1/n!) · Σ_Π [ (1/2^{k(n)}) · Σ_{s ∈ {0,1}^{k(n)}} ℓ_{Π,s}^T ]

where T ranges over all random decision trees which solve the sorting
problem of size n and Π ranges over all permutations of n elements.
A_ran (S_ran) is the minimal average (worst) case complexity of any
randomized comparison-based sorting algorithm. We can now use the argument
outlined at the end of section I.2. to show that randomization
cannot improve the complexity of sorting below log n!. We have
since for every fixed sequence s of coin tosses random decision tree T
defines an ordinary decision tree
   ≥ A(n)
of T let D_v = {(S_1, ..., S_n) ∈ ℝ^n; the computation on input (S_1, ..., S_n)
ends in leaf v}. Then {D_v ; v leaf of T} is a partition of ℝ^n and hence

   ∪_{v leaf of T} (D_v ∩ W_i)
Theorem 13: Any rational decision tree for the rank function of n argu-
ments has depth at least log n!
Theorem 14: Any rational decision tree for the searching function of
size n has depth at least log(n+1).
namely Π_{1≤i<j≤n} (x_i − x_j) = 0? This shows that there are problems which
are difficult in the restricted model of decision trees but become
simple if tests between rational functions are allowed and no charge
is made for computing those functions. Note however that the best
algorithm known for computing the product Π_{1≤i<j≤n} (x_i − x_j) requires
Ω(n log n) multiplications and divisions. Hence, if we charge for tests
and algebraic operations then an Ω(n log n) lower bound for the element
uniqueness problem is conceivable. We will use the remainder of the
section to prove such a lower bound. We start by fixing the model of
computation.
We close with the warning that the argument above is at most a hint
towards a proof. □
We are now ready for the lower bound. We use #V to denote the number of
components of set V.
Theorem 15: Let T be an algebraic computation for inputs S_1, ..., S_n
which decides the membership problem for V ⊆ ℝ^n, and let h be the
depth of T. Then

The system Γ involves variables S_1, ..., S_n, Y(v_1), ..., Y(v_r) for some
integer r. Also Γ contains exactly r equalities and some number s of
inequalities. Clearly r + s ≤ h since each node on P adds one equality
or inequality. Let W ⊆ ℝ^{n+r} be the set of solutions to the system Γ.
Then the projection of W onto the first n coordinates is D(v). Since
the projection function is continuous we conclude that #D(v) ≤ #W and
it therefore suffices to show #W ≤ 3^{h+n}.
In order to apply the fact above we transform this system into a set of
equalities in a higher dimensional space. Let t = #W and let p_1, ..., p_t
be points in the different components of W. Let

   ε = min {q_j(p_i) ; 1 ≤ j ≤ m, 1 ≤ i ≤ t}

Then ε > 0. Let W_ε be defined by the system

   q_1(x_1, ..., x_{n+r}) = x²_{n+r+1} + ε, ..., q_m(x_1, ..., x_{n+r}) = x²_{n+r+m} + ε,
   q_{m+1}(x_1, ..., x_{n+r}) = x²_{n+r+m+1}, ..., q_s(x_1, ..., x_{n+r}) = x²_{n+r+s}

where x_{n+r+1}, ..., x_{n+r+s} are new variables, and let Z be its set of
solutions. Then W_ε is the projection of Z onto the first n+r coordinates
and hence #W_ε ≤ #Z. Furthermore, Z is defined by a system of polynomial
equations of degree at most two in n + r + s ≤ n + h variables. Hence

   #Z ≤ 2·(4 − 1)^{n+h−1} ≤ 3^{n+h}

by the fact above. □
we will now argue. For x = (x_1, ..., x_n) ∈ ℝ^n, x_i ≠ x_j for i ≠ j, let σ
be the order type of x, i.e. σ is a permutation with x_{σ(1)} < x_{σ(2)} <
... < x_{σ(n)}. Then it is easy to see that if x, x' ∈ ℝ^n have different
order types then any line connecting them must go through a point with
two equal coordinates, i.e. a point outside V, and hence x and x'
belong to different components of V. This proves #V ≥ n!. Since
log n! = Ω(n log n) the proof is completed by an application of
theorem 15. □
Our second application is the convex hull problem (cf. section VIII.2):
Given n points in the plane, does their convex hull have n vertices?
Theorem 17: The complexity of the convex hull problem in the model of
algebraic decision trees is Ω(n log n).

Proof: Let V = {(x_1, y_1, ..., x_n, y_n); the convex hull of point
set {(x_i, y_i); 1 ≤ i ≤ n} has n vertices} ⊆ ℝ^{2n}. As in the proof of
theorem 16 we assign an order type to every element of V. Let (x_1, y_1, ...
..., x_n, y_n) ∈ V, and let p be any point in the interior of the convex
hull of point set {(x_i, y_i); 1 ≤ i ≤ n}. Then the order type of this
tuple is the circular ordering of the points around p. Note that
there are (n − 1)! different order types and that any line in ℝ^{2n}
connecting tuples of different order types must go through a point outside
V. Hence #V ≥ (n − 1)! and the theorem follows from theorem 15. □
Theorems 16, 17, and 18 show the strength of theorem 15. It can be used
to prove lower bounds for a great variety of algorithmic problems. We
should also note that theorem 15 is not strong enough to imply lower
bounds on the complexity of RAM computations because indirect addressing
is a "non-algebraic" operation (a lower bound on the complexity of RAM
computation will be shown in section II.3) and that it is not strong enough
to imply lower bounds for computations over the integers when integer
division belongs to the instruction repertoire.
We may assume w.l.o.g. that Σ = {1, 2, ..., m} with the usual ordering
relation. The ordering on Σ is extended to an ordering on the words over
Σ as usual.
There are many different ways of storing words. A popular method stores
all words in a common storage area called string space. The characters
of any word are stored in consecutive storage locations. Each string
variable then consists of a pointer into the string space and a location
containing the current length of the word. The figure below shows
an example for x1 = ABD, x2 = DBA and x3.

(figure: the string space containing the characters of the words, with the
string variables x1, x2 and x3 pointing into it)
bucket one  : 321              and hence the input sequence for the
bucket two  :                  second pass is 321, 123, 223, 124, 324.
bucket three: 123, 223         The second pass yields:
bucket four : 124, 324

bucket one  :                  and hence the input sequence for the third
bucket two  : 321, 123,        pass is 321, 123, 223, 124, 324. The third
              223, 124, 324    pass yields:
bucket three:
bucket four :

bucket one  : 123, 124         and hence the final result sequence is
bucket two  : 223              123, 124, 223, 321, 324.
bucket three: 321, 324
bucket four :
There is a very simple method for determining which buckets are non-empty
in the j-th pass, i.e. which letters occur in the j-th position.
Create the set {(j, x_j^i); 1 ≤ j ≤ k, 1 ≤ i ≤ n} of size n·k and sort it by
bucketsort according to the second component and then according to the
first. Then the j-th bucket contains all characters which appear in the
j-th position in sorted order. The cost of the first pass is O(n·k + m),
the cost of the second pass is O(n·k + k). Total cost is thus O(nk + m).

4) We finally sort the words x^i by distribution. All lists in the following
program are used as queues; the head of each list contains a pointer to
the last element. Also x is a string variable and x_j is the j-th letter
of string x.
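A Python sketch of the distribution phase just described (sorting fixed-length
words by their letters from the last position to the first, using one queue per
letter); the words follow the example above, the code itself is only illustrative:

   from collections import deque

   def distribution_sort(words, k, m):
       """Sort words of length k over alphabet {1, ..., m}, letters processed last to first."""
       seq = list(words)
       for j in range(k - 1, -1, -1):                 # passes for positions k, k-1, ..., 1
           buckets = [deque() for _ in range(m + 1)]  # one queue per letter, index 0 unused
           for w in seq:
               buckets[w[j]].append(w)                # append at the rear of the queue
           seq = [w for b in buckets[1:] for w in b]  # concatenate the buckets in order
       return seq

   words = [(3,2,1), (1,2,3), (2,2,3), (1,2,4), (3,2,4)]
   print(distribution_sort(words, 3, 4))
   # [(1,2,3), (1,2,4), (2,2,3), (3,2,1), (3,2,4)]

Unlike the refinement described above, this sketch scans all m buckets in every
pass; precomputing which buckets are non-empty avoids that extra O(m) cost per pass.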
Proof: a) Running time of the first phase is clearly O(n). Let us assume
that t_i elements end up in the i-th bucket, 1 ≤ i ≤ k. Then the
cost of the second phase is O(Σ_i t_i·log t_i) where 0·log 0 = 0 and
Σ_i t_i = n. But Σ_i t_i·log t_i ≤ Σ_i t_i·log n = n·log n.

   k · Σ_{h=2}^{n} h²·(n choose h)·(1/k)^h·(1 − 1/k)^{n−h}
      ≤ k · (n(n−1)/k² + n/k)
      = O(n)

since h²·(n choose h) = (h(h−1) + h)·(n choose h) = n(n−1)·(n−2 choose h−2) + n·(n−1 choose h−1)
and since k ≥ a·n. □
Before we state the precise result we briefly recall the RAM model. It
is convenient to use indirect addressing instead of index registers as
the means for address calculations. We saw in I.1. that this does not
affect running time by more than a constant; however it makes the proof
more readable. The basic operations are (i, k ∈ ℕ):

   goto k
   α ← random

Here x ∸ y = x − y if x ≥ y and x ∸ y = 0 if x < y.
Theorem 1: Let A be any RRAM which solves the sorting problem of size
n. Then there is an x ∈ ℝ^n such that the expected running time of A on
x is at least log n!, i.e. the worst case running time of A is at least
log n!.
Proof: Let A be any RRAM which solves the sorting problem of size n and
stops on all inputs x ∈ ℝ^n and all sequences of coin tosses in at most
t steps. Here t is some integer. The proof is by reduction to the decision
tree model studied in section II.1.6. The first step is to replace
all instructions of the form α ← α ∸ p(i) by the equivalent
sequence if p(i) ≥ α then α ← 0 else α ← α − p(i) fi. After the modification
A uses only true subtractions. In the second step we associate
with program A a (maybe infinite) decision tree T(A) by unfolding. T(A)
is a binary tree whose nodes are labelled by instructions and tests and
some of whose edges are labelled by 0 or 1. More precisely, let A consist
of m instructions. Then T_i(A), 1 ≤ i ≤ m+1, is defined by

   T_i(A)

   T_{i+1}(A)
In the third step we associate with every edge e of the decision tree
a set {p_{e,i}; i ∈ ℕ₀} of polynomials in indeterminates X_1, ..., X_n. The
intended meaning of p_{e,i}(x) is the content of location i (the accumulator
for i = 0) if the input is x ∈ ℝ^n and control has reached edge e.
Polynomials p_{e,i} are defined by induction on the depth of
e. Polynomials p_{e,i} will not have the intended meaning for all inputs
x but only for a sufficiently large set of nice inputs as made precise
below. If p and q are polynomials we write p ≡ q to denote the fact that
p and q define the same rational function. The inductive definition is
started by:

   p_{e,i}(X) = X_i     if 1 ≤ i ≤ n
   p_{e,i}(X) = 0       otherwise
For the induction step let e be an edge leaving node v, let d be the
edge into v and let ins(v) be the label of node v.

(1) if ins(v) is p(i) > α or a coin toss then p_{e,i} = p_{d,i} for all i ≥ 0

(2) if ins(v) is α ← α op p(i) then

   p_{e,j} = p_{d,0} op p_{d,i}     if j = 0
   p_{e,j} = p_{d,j}                otherwise
(3) if ins(v) is α ← i then

   p_{e,j} = i             if j = 0
   p_{e,j} = p_{d,j}       otherwise

(4) if ins(v) is α ← p(i) then

   p_{e,j} = p_{d,i}       if j = 0
   p_{e,j} = p_{d,j}       otherwise

(5) if ins(v) is α ← p(p(k)) and p_{d,k} ≢ i for all i ∈ ℕ₀ then let w be
the closest ancestor (i.e. an ancestor of maximal depth) of v such that
the label of w is p(p(ℓ)) ← α for some ℓ ∈ ℕ and p_{f,ℓ} ≡ p_{d,k}, where f is
the edge entering w. If w does exist then

   p_{e,j} = p_{f,0}       if j = 0
   p_{e,j} = p_{d,j}       otherwise

If w does not exist then

   p_{e,j} = 0             if j = 0
   p_{e,j} = p_{d,j}       otherwise

(6) if ins(v) is p(i) ← α then

   p_{e,j} = p_{d,0}       if j = i
   p_{e,j} = p_{d,j}       otherwise

(7) if ins(v) is p(p(k)) ← α and p_{d,k} ≢ i for all i ∈ ℕ₀ then p_{e,j} =
p_{d,j} for all j.
finitely many (at most n + depth of e) i's with p_{e,i} ≢ 0. This is
easily seen by induction on the depth of e. Let Q_A, a set of polynomials,
be defined by Q_A = I ∪ {p_{e,i}; e is an edge of T(A) and i ≥ 0}. Q_A is a
finite set of polynomials by the remark above. A point x ∈ ℝ^n is
admissible if p(x) ≠ q(x) for any p, q ∈ Q_A with p ≢ q.
Case (5): Let w, f and ℓ be defined as above. Let us consider the case
that w exists first. Let h = p_{f,ℓ}(x) = p_{d,k}(x). Then p_{f,0}(x) was
stored in location h by instruction ins(w). It remains to be shown that
location h was not written into on the path from w to v. Let z be any
node on that path excluding v and w. If ins(z) is not a store instruction
then there is nothing to show. So let us assume that ins(z) is
either p(i) ← α or p(p(i)) ← α. In the former case we have i ≠ h =
p_{d,k}(x) since i ∈ I ⊆ Q_A, x is admissible for Q_A, and p_{d,k} is not constant;
in the latter case we have p_{g,i}(x) ≠ h = p_{f,ℓ}(x), where g is the
edge entering z, since p_{g,i} ∈ Q_A and x is admissible for Q_A. In either
case ins(z) does not write into location h.

If w does not exist then one proves in the same way that nothing was
ever written into location h = p_{d,k}(x) before control reaches edge e.

Case (7): Then ins(v) is p(p(k)) ← α and p_{d,k} ≢ i for all i ∈ ℕ₀. Hence
p_{d,k}(x) ∉ I since x is admissible for Q_A and I ⊆ Q_A. Thus the content
of locations with addresses in I is not changed by ins(v). □

type Π which is admissible for Q_A}|. We show below that p = n!. Let x(Π)
be admissible for Q_A of order type Π. Execution of T(A) on input x(Π)
ends in some leaf with ingoing edge e; of course, the leaf depends on the
sequence s ∈ {0,1}^{2t} of coin tosses used in the computation. We have
p_{e,i}(x(Π)) = x_{Π(i)}(Π) for all i ∈ {1, ..., n}. Since X_1, ..., X_n ∈ Q_A and
x(Π) is admissible for Q_A this is only possible if p_{e,i} = X_{Π(i)} for all
i ∈ {1, ..., n}. Thus for every sequence s ∈ {0,1}^{2t} of coin tosses at
least n! different leaves are reachable, one for each order type. Let
d(Π,s) be the number of nodes of outdegree 2 on the path from the root
to the leaf which is reached on input x(Π) when sequence s of coin
tosses is used. Then

or

   (1/n!) · Σ_Π ( (1/2^{2t}) · Σ_{s ∈ {0,1}^{2t}} d(Π,s) ) ≥ log n!
Proof: Since R(X_{π(1)}, ..., X_{π(n)}) is again a polynomial of degree m it
suffices to prove the claim for π being the identity. This is done by
induction on n. For n = 1 the claim follows from the fact that a
univariate polynomial of degree m has at most m zeroes. For the induction
step let

   R(X) = Σ_{i=0}^{s} R_i(X_1, ..., X_{n−1}) · X_n^i

in X_n is non-zero. □
Similar lower bounds can be shown for related problems, e.g. the nearest
neighbour problem (exercise 20) and the problem of searching an ordered
table (exercise 21). It is open whether theorem 1 stays true if integer
division is added as an additional primitive operation. It should be
noted however that one can sort in linear time (in the unit cost model)
if bitwise boolean operations (componentwise negation and componentwise
AND of register contents, which are imagined to be binary representations
of numbers) and integer division are additional primitives.

Let x_1, x_2, ..., x_n be integers. We first find the maximum, say x_m, compute
x_m ∨ x̄_m (bitwise) and add 1. This produces 10^k where k is the
length of the binary representation of x_m. Note that 10^k is the binary
representation of 2^k.
We shall compute the rank of every x_i, i.e. compare each x_i with all
x_j and count how many are smaller than it. Note that if we have

(figure: the number A with x_i in each field and the number B with 0 x_j 0
in each field, together with an indicator bit per field)

and can be computed in time O(n) given a, x_1, ..., x_n. Finally observe
that Σ_{i=0}^{n−1} b_i·2^{(k+2)·i} is the desired result.

(figure: the fields of the result, of widths (k+2) bits, (k+2) bits, ..., (k+1) bits)

The ranks are now easily retrieved and the sort is completed in time
O(n). □
When set M is stored in an array then lines (2) - (4) of Find are best
implemented by lines (2) - (6) of Quicksort. Then a call of Find has
cost O(|M| + the time spent in the recursive call). The worst case
running time of Find is clearly O(n²). Consider the case i = 1 and
95
i-1 n
T(n,i) ~ cn + 1/n[ L T(n-k, i-k) + L T (k-1, i) 1
k=1 k=i+1
since the partitioning process takes time cn and the recursive call
takes expected time T(n--k,i-k) if k = IH11 + 1 < i and time T(k-1,i)
if k = IM11 + 1 > i. Thus
i-1 n-1
T(n) ~ cn + 1/n max[ L T(n-k) + L T(k) 1
i k=1 k=i
n-1 n-1
T(n) ~ cn + 1/n max[ L4ck + L 4ckl
i k=n-i+1 k=i
~ 4cn
The expected linearity of Find stems from the fact that the expected
size of the subproblem to be solved is only a fraction of the size of
the original problem. However, the worst case running time of Find is
quadratic because the size of the subproblem might be only one smaller
than the size of the original problem. If one wants a linear worst case
algorithm one has to choose the partitioning element more carefully. A
first approach is to take a reasonable size sample of M, say of size
IMI/s, and to take the median of the sample as partitioning element.
However, this idea is not good enough yet because the sample might
consist of small elements only. A better way of choosing the sample is
to divide M into small groups of say S elements each and to make the
sample the set of medians of the groups. Then it is guaranteed that a
fair fraction of elements are smaller (larger) than the partitioning
element. This leads to the following algorithm.
r.- -.'
I
• I
I
...: J mn/S
Let T(n) be the maximal running time of algorithm Select on any set M
of n elements and any i.
by definition of c. o
II. 5. Exercises
1) Let S[nl and P[1 .• nl be two arrays. Let p[1l, ••. ,P[nl be a permuta-
tion of the integers 1, .•. ,n. Describe an algorithm which rearranges S
as given by P, i.e. Safter[il = Sbefore[P[ill.
7) Translate Quicksort into RAM-code and analyse it. (Use exercise 6).
a) 3/2
n log n for :s; n < 10
T{n) { In T{/n) + n 3/2
log n for n 2: 10
11) Modify Quicksort such that sequences of length at most M are sorted
by repeatedly searching for the maximum. What is the best value for M.
(Compare the proof of theorem V.4.2.). Running time?
12) Would you recommend to use the linear median algorithm of section
11.4. in the partitioning phase of Quicksort?
14) Let T be any binary tree with n leaves. Interpret the tree as a
merging pattern and assume that all initial sequences have length 1. In
section 111.5.3.3., theorem 14 we give an algorithm which merges two
sequences of length x and y in time O{log{x+ y )). Show that the total
y
cost of merge pattern T is O{n log n).
n
16) For (x 1 ' .•• ,x n ) E R let f{x 1 , .•• ,x n ) = min{i; xi 2: Xj for all j}
be the maximum function. Use theorem 120f section II.1.6. to prove a
log n lower bound on the depth of rational decision trees for the
maximum function. Can you achieve depth O{log n)?
if x < 0
17) For n E~, x E IR let f{x) if 0 :s; x :s; n
if x > n
100
i
18) Let x , 1 $ i $ n, be words of length k over alphabet
L = {1, ••. ,m}. Discuss the following variant of bucketsort. In phase 1
the input set is divided into m groups according to the first letter.
Then the algorithm is applied recursively to each group. Show that the
worst case running time of this method is Q(n m k). Why does the German
postal service use this method to distribute letters?
20) A program solves the nearest neighbour problem of size n if for all
inputs (x 1 , •.. ,x n 'Y1' ... 'Yn) the first n outputs produced by the pro-
gram are (x i1 ' ... 'x in ) where for all ~: i~ E {i; IXi-y~1 $ IXj-y~1 for
all j}. Show an n log n lower bound for the nearest neighbour problem
in the PRru~-model of section II.1.6. (Hint: for ~ a permutation of
{1, ••• ,n} say that (x 1 , .•• ,x n 'Y1' .•. 'Yn) is of order type ~ if for all
i,j: Ix (O)-Yo I
~ ~ ~
< Ixo-yo
J ~
I. Modify lemma 1 in the proof of theorem 1).
21) A program solves the searching problem of size n if for all inputs
(x 1 ,.·.,xn 'y) with x 1 < x 2 < ••• < xn it computes the rank of y, i.e.
it outputs I {i, xi < y} I. Prove an Q(log n) lower bound in the PRAM-
model of section II.1.6 ••
We treat the case of drastic size difference between the universe and
the set stored first (sections 1 to 7) and discuss the case of equal
size later (section 8). The major difference between the solutions is
based on the fact that accessing an information via a key is a diffi-
cult problem in the first case, but can be made trivial in the second
case by the use of an array.
Operations Insert und Delete change set S, i;e. the old version of set
S is destroyed by the operations. Operation names Access and Member are
used interchangeable; we will always use Member when the particular
application only requires yes/no answers.
We know already one data structure which supports all operations men-
tioned above, the linear list of section 1.3.2 . . A set S={x 1 , ... ,x n }
1~
One of the most simple ways to structure a file is to use the digital
representation of its elements; e.g. we may represent S = {121, 102,
211, 120, 210, 212} by the following TRIE
ments of S are stored (in our example this would be set {121, 201, 112,
021,012, 212}), then the following program will realize operation
Access (x)
v +- root;
y +- Xi
do t times (i,y) +- (y mod k, y div k)i
v +- i-th son of v
od;
if x CONTENT [v] then "yes" else "no".
A disturbing fact about TRIES is that the worst case running time de-
pends on the size of the universe rather than the size of the set
stored. We will next show that the average case behaviour of TRIES is
O(logkn)~ i.e. average access time is logarithmic in the size of the
Proof: Let qd be the probability that the TRIE has depth d or more.
Then E(d ) = L qd. Let S = {x 1 , .• ,x } C U = {O, ••• ,k-1} be a subset
~
n d>1 n -
of U of size n7 The compressed TRIE for S has depth less than d if the
function trunc d _ 1 which maps an element of U into its first d-1 digits
is injective on S. Furthermore, trunc d _ 1 is an injective mapping on
107
n-1
1 - n (1_i/k d - 1 )
i=O
n-1
Hn(1-i/k d - 1 )
n-1
n (1_i/k d - 1 ) i=O
e
i=O
J ~n (1-x/k d-1 ) dx
n
<': eO
The inequality follows from the fact that f(x) = ~n(1-x/kd-1) is a de-
i+1
creasing function in x and hence f(i) <': f f(x)dx. The last equality
i
can be seen by evaluating the integral using the substitution
x = k d - 1 (1-t). Hence
c
E (dn ) L qd + L qd
q=l d2':c+l
$ c + L n 2 /k d
d2':c
$ 2 r log k n' + (n 2 /k c ). L k- d
d2':O
0 2 3 4 5 6 7 8 9
I 1 1
I
1
I
1
I 11
I 11 I 12
I 12
I I
11
I
v v v v v v v v v
210 211 212 120 121 a b 102 c
to store all four arrays of length 3. The root node starts at loca-
tion 4, node a starts at 7, node b starts at 0 and c starts at loca-
tion 3. This information is stored in an additional table
a b c root
7 o 3 4
Suppose now that we search for 121. We follow the 1-st pointer out of
the root node, i.e. the pointer in location 4 + 1 of the large array.
This brings us to node a. Ne follow the 2-nd pointer out of a, i.e.
the pointer in location 7 + 2 of the big array which brings us to node
c, •..• It is also instructive to perform an unsuccessful search, say
to search for 012. Ne follow the O-th pointer out of the root, i.e.
the pointer in location 4 + O. This brings us to a leaf with content
121 *012. So 012 is not member of the set.
o 2 3 4 5 6 7 8 9
b: 1 1 1
c: 1 1 0
root: 0 1 1
a: 1 0 1
Proof: Consider a row with exactly ~ ones. When that row is placed in
step 2 only rows with ~ ~ ones have been placed and hence the array C
can contain at most m(~-1) ones. Each such one can block at most ~
It remains to prove the time bound. In time O(r·s) we compute the number
of ones in each row. Bucket sort will then sort the rows according to the
number of ones in time O(r+m). Finally, in order to compute rd(i) one
has to try up to m candidates. Testing one candidate takes time O(s).
We obtain a total of O(r·s + r+m + r·m·s) = O(r·m·s) time units. D
o 2
o 1 o 2
1 1 1 cd 0
2 0 1 0
3 1 1 1
4 0 1
o 2 3 4 5 6 7 8
o 1 0 0
1 1 1
2 0 1 0
3 1 1 1
4 0 0 1
The important fact to remember is that all ones of matrix A are mapped
on distinct ones of matrix C, namely A[i,j] = 1 implies A[i,j] =
= B[i + cd[j],j] = C[rd[i + cd[j]] + j]. The question remains how to
choose the column displacements? We will use the first fit method.
112
In order to ensure the harmonic decay property after all colunm dis-
placements have been chosen we enforce a more stringent restriction
during the selection process, i.e. we enforce
We use the first fit method to choose the column displacements: Choose
cd(O) O. For j > 0 choose cd(j) 2: 0 minimal such that
mj(~) ~ m/f(~,mj) for all ~ 2: O.
We need an upper bound on the values cd(j) obtained in that way. We ask
the following question: Consider a fixed ~ 2: 1. How many choices of
cd(j) may be blocked because of violation of requirement ~ : mj (~) ~
where mj(~) is computed using displacement k for the j-th column. Since
requirement ~ was satisfied after choosing cd(j-1) we also have
mj _ 1 (~) ~ m/ f ( ~ , mj -1 )
and hence
mJ. ( ~ ) - m. 1 (~)
J-
> m/ f ( ~ , mJ.) - m/ f ( ~ , mJ-
. 1)'
113
Let q = m/f(~,m.)
]
- m/f(~,m.
]-
1). We can interpret q as follows: In ma-
trix B j the number of ones in rows with ~ ~ + 1 ones is at least q
more than in matrix B.] - 1. We can count these q ones in a different way.
There are at least q/(~+1)pairs (a row of B.] - 1 with ~ ~ ones, a one in
column j) which are aligned when column displacement cd(j) = k is cho-
sen. Note that any such pair contributes either 1 (if the row of B j _ 1
has ~ ~ + 1 ones) or ~ + 1 (if the row of B.] - 1 has exactly ~ ones) to
q. Since the number of rows of B.] - 1 with ~ ~ ones is bounded by
m.] - 1 (~-1)/~ ~ m/~.f(l-1,m.
]-
1) and since the j-th column of A contains
exactly m.] - m.] - 1 ones the number of such pairs is bounded by
Hence there are at most p/[q/(~+1)] possible choices k for cd(j) which
are blocked because of violation of requirement ~, ~ ~ 1. Hence the
total number of blocked values is bounded by
~o (H1)·(m. - m. 1)·m
] J-
BV
where ~O = min{~; m/f(~,mj_1) < ~}. Note that there are no rows of
B j _ 1 with ~ ~O ones and hence p (as defined above) will be 0 for ~ > ~O.
Hence we can we can always choose cd(j) such that 0 ~ cd(j) ~ BV.
(m. - m. 1) f (~,m.] - 1)
BV 2+1 ] ]-
~ (f(~,m. 1)/f(~,m.)-1) f(~-1,mj_1)
]- ]
~o
2+1
(m j - mj _ 1 ) g (~,Iaj_1) -g (~-1 ,m j _ 1 )
BV L 2
~=1
~ g(~,m'_1)-g(~,miJ
(2] -'-1)
~O (m j - m,] - 1)
~ L
~+1
-~-
(g (~,m, 1) g(~,mj)) ·~n 2
. 2]-
g(l,m. 1)-9(~-1,m. 1)
1-
~=1 ]-
114
since 2 x -l 2 x·tn 2. Next we note that the two differences involve on-
ly one argument of g. This suggests to set g(t,m j ) = h(t)·k(m j ) for
functions hand k to be chosen later. The upper bound for BV simpli-
fies to
(m j - mj _ l ) ·2(h(O-h(t-l».k(m j _ l )
.Hl
t h(t) (k(m j _ l ) k(mj»tn 2
£0 (m j - m. 1)
Hl J- 22
BV ~ L
t t· (m/m m. 1 /m) . tn 2
t=l J-
to to to
~ L (4/ln 2)m. (Hl)/t 2 (4 ·m/ £n 2) ( L l/t + L 1/ £ 2)
t=l t=l £=1
to
since L l/t ~ 1 + tn to (cf. appendix) and L 1/t 2 n 2 /6
9,=1 9,21
~ 4·m·log £0 + 15.3 m
Finally notice that f(9"m.) = 29,· (2-m j /m) 2 29, and therefore 9,0 ~ log m.
9, J 0
Also f(9"m s _ l ) = 2 2 9,+ 1 for all £ and f(O,m j ) = 2 = 1 ~ m/m j for
all j and so f satisfies the boundary conditions. Thus we can always
find cd(j) such that 0 ~ cd(j) ~ 4·m·loglog m + 15.3·m.
Theorem 4: Given a matrix A[O .. r-l, O .. s-l] with m nonzero entries one
can find colurnn displacements cd(j), 0 ~ j ~ s-l, such that
o ~ cd(j) ~ 4·m.loglog m + 15.3m and such that matrix B obtained from A
using those displacements has the harmonic decay property. The column
displacements can be found in time Oeser + m.loglog rn)2).
Proof: The bound on the column displacements follows from the discus-
sion above. The time bound can be seen as follows. For every row of B.
J
we keep a count on the n~mber of ones in the row and we keep the num-
115
Let us put theorems 3 and 4 together. Given a matrix A[0 .• r-1, 0 .. s-1]
with m nonzero entries, theorem 4 gives us column displacements cd(j),
o ~ j < s, and a matrix B such that
or in other words
Since B has only m nonzero entries there are at most m different h's
such that rd(h) * o. Furthermore, array C has length at most m + s
since rd(h) ~ m for all h.
The entries of arrays rd, cd and C are numbers in the range 0 to O(n),
i.e. bitstrings of length ~ log n + 0(1). As we observed above, array
rd has ~ c n loglog n entries for some constant c all but 2n of which
are zero. We will next describe a compression of vector rd based on
the following assumption: A storage location can hold several numbers
if their total length is less than log n.
such that i~ lies in that block, if any such ~ exists. This defines a
vector base of length 2.c·n. For any other element of a block we store
the offset with respect to ~ in a two-dimensional array offset, i.e.
I
min {~; i~ div d v} v
base[v]
- 1 otherwise
for 0
offset[v,j]
~ v < 2·c·n and
- 1 otherwise
for 0 ~ j < d.
d-1
by off[v] L (offset[v,j] + 1)(d + 1)j
j=O
117
Then 0 ~ off[v] ~ (d + 1)d ~ n and thus off[v] fits into a single stor-
age location. Also
and so offset[v,j] can be computed in time 0(1) from off[v] and a table
of the ~rs of d + 1. Altogether we decreased the space requirement to
O(n), namely O(n) each for arrays crd, base and off, and increased
access time only by a constant factor.
rd: [0 1 0 11 0 1 I0 0 0 10 0 1
Then
offset:
-1 0 -1
0 -1 1 and off: 33
-1 -1 ~1 o
-1 -1 0 16
Proof: The time bound follows from theorems 1 and 2. The space bound
follows from the preceding discussion. o
118
III. 2. Hashing
The ingredients of hashing are very simple: an array T[0 •• m-1], the
hash table, and a function h U ~ [0 .• m-1], the hash function. U is
the universe; we will assume U [0 .• N-1] throughout this section. The
basic idea is to store a set S as follows: xES is stored in T[h(x)].
Then an access is an extremely simple operation: compute h(x) and look
up T[h(x)].
Suppose m = 5, S = {03, 15, 22, 24} c [0 .• 99] and h(x) x mod 5. Then
the hash table looks as follows:
o 15
2 22
3 3
4 24
0 3
II 21 It-II
II 4 II 7 II 10 IHI
2 17
H
Operation Access(x,S) is realized by the followinq oroqram:
1) compute h (x)
2) search for element x in list T[h(x)]
, and
Our two assumptions imply that value h(~) o~ the hash function on
the argument of the k-th operation is uniformely distributed in
[0 ••• m-1], i.e. prob(h(xk ) = i) = 11m for all k E [1 ••• n] and
i E [0 ••• m-1].
Proof: We will first compute the expected cost of the (k+1)-st oper-
ation. Assume that h(xk + 1 ) = i, i.e. the (k+1)-st operation accesses
the i-th list. Let prob(tk(i) = j) be the probability that the i-th
list has length j after the k-th operation. Then
L prob(tk(i) j) (1+j)
j2:0
k . k-·
~ 1 + L (j) (11m) ] (1-1/m) J. j
j2:0
1 + (kim) L
j 2:1
1 + kim
121
n n
L EC k O( L (1+(k-1)/m))
k=1 k=1
0(1 + Bn/2) o
(b - a)+(j - i) *0 mod m
Since 2561(b - a) we should choose m such that m does not divide numbers
of the form 256· C + d where I dis; 9. In this way one can obtain even
better practical performance than predicted by theorem 2.
load factor
0.5 0.6 0.7 0.8 0.9
access
time
In the second case we can use a sequence TO' T 1 , T 2 , .•• of hash tables
of size m, 2m, 4m, •.• respectively. Also, for each i we have a hash
function hi. We start with hash table TO. Suppose now, that we use
hash table Ti and the load factor reaches 1 or 1/4. In the first case,
we move to hash table T i + 1 by storing all 2im elements which are pres-
ent in Ti in table Ti + 1 • On the average, this will cost 0(2 i m) time
units. Also, the load factor of table T i + 1 is 1/2 and therefore we can
i+1 .
execute at least (1/4)·2 ·M = (1/2) .2 l .m operations until the next
restructuring is required. In the second case, we move to hash table
Ti _ 1 by storing all (1/4) .2 i ·m elements which are pr~sent in Ti in
table T.l - 1. On the average, this will cost 0«1/4) ·2 l ·m) time units.
Also, the load factor of table Ti - 1 is 1/2 and therefore we can execute
123
We close this section with a second look at the worst case behaviour
of hashing and discuss the expected worst case behaviour of hashing
with chaining. Suppose that a set of n elements is stored in the hash
table. Then max 0h(x,S) is the worst case cost of an operation when set
xEU
S is stored. We want to compute the expected value of the worst case
cost under the assumption that S is a rand8m subset of the universe.
More precisely, we assume that S = {x 1 ' •.. ,x n } and that
prob(h(x k ) = i) = 1/m for all k E [1 .•• n] and i E [0 •.• m-1]. The very
same assumption underlies theorem 2.
Theorem 3: Under the assumption above, the expected worst case cost of
hashing with chaining is O«log n)/log log n) provided that S = n/m $ 1.
Proof: Let t(i) be the number of elements of S which are stored in the
i-th list. Then prob(t(i) :::: j) $ (t;l)(1/m)j. Also
J
m-1
prob «max t(i)) :::: j) $ ~ prob(t(i) :::: j)
i i=o
124
'-1
5 n(n/m)] (1/j!)
Thus the expected worst case cost EloVe, i.e. the expected lenth of the
longest chain, is
, -1
5 ~ min(1, n(n/m)] (1/j!»
j 2':1
Let jo = min (j; n(n/m)j-1 (1/j)! 5 1} 5 min (j; n 5 j!} since n/m 5 1.
From j! 2': (j/2)j/2 we conclude jo = O«log n)/log log n). Thus
jo j-j
Ewe 5 ~ 1 + r 1/j 0
j=1 j>jo 0
A very popular method for defining function h(x,i) is to use the linear
combination of two hash functions h1 and h 2 :
Hashing with open addressing does not require any additional space.
However, its performance becomes poorly when the load factor is nearly
one and it does not support deletion.
Theorem 4: The expected cost of an Insert into a table with load fac-
tor B = n/m is (m+1)/(m-n+1) ~ 1/(1-B) provided that n/m < 1.
Proof: Let C(n,m) be the expected cost of an Insert into a table with
m positions, n of which are occupied. Let qj (n,m) be the probability
that table positions h(x,a) , •.. , h(x,j-1) are occupied. Then
qo(n,m)= 1, qj(n,m) = [n/m]·[(n-1)/(m-1)]. .. [(n-(j-1»/(m-(j-1)] and
n
C(n,m) L (q, (n,m) - gJ'+1 (n,m» (j+1)
j=oJ
n
L qJ' (n,m) - (n+1)qn+1 (n,m)
j=a
n
L gJ' (n,m) •
j=a
The expression
- for q,
-J (n,m) can be J'ustified as follows. Position h(x,a)
is occupied in n out of m cases. If h(x,a) is occupied then position
h(x,1) is occupied in n-1 out of m-1 cases (note that h(x,1) * h(x,O», ...
We prove C(n,m) = (m+1)/(m-n+1) by induction on m and for fixed m by
induction on n. Note first that C(a,m) = qa(a,m) = 1. Next note that
qj (n,m) = (n/m) ·qj-1 (n-1 ,m-1) for j ~ 1. Hence
126
n
C(n,m) L qJ' (n,m)
j=O
n-1
1 + (n/m)· L q, (n-1 ,m-1)
j=O J
1 + (n/m)·C(n-1,m-1)
1 + (n/m) ·m/(m-n+1)
(m+1)/(m-n+1) o
n-1 n-1
L C(i,m) L m+1
n i=O n i=O (m-i+1)
m+1 m-n+1
m+1
n
[ L 1/j - L 1/j]
j=1 j=1
~ (1/B).tn[(m+1)/(m-n+1)]
~ (1/ B) • tn 1/ ( 1- B)
Theorems 4 and 5 state that Insert and Access time will go up steeply
as B approaches 1, and
127
that open addressing works fine as long as 8 is bounded away from one,
say 8 ::; 0.9.
a) How large are the programs for perfect hash functions, i.e. is there
always a short program computing a perfect hash function?
the length of the program written as a string over some finite alpha-
bet. We prove upper bounds on program size by exp1icite1y exhibiting
programs of a certain length. We prove lower bounds on program length
by deriving a lower bound on the number of different programs required
and then using the fact that the number of words (and hence programs)
of length ~ L over an alphabet of size c is ~ c L + 1 /(c_1).
a) IHI
c) There is at least one S ~ [O ••• N-1], lSI = n, such that the length
of the shortest program computing a perfect hash function for S is
n(n-1) n(n-1)
max (2m fu 2 2N )/,n 2 (1 - ~,
n-1) 1 og 1 og N - 1 og 1 og m)/1 og c - 1 ,
Ih- 1 (i )1
n
This expression is maximal if Ih- 1 (i) I N/m for all i E [O ••• m-1] and
its value is equal to (N/m)n.(m) in this case.
n
for i < t
129
-1 -1
where j is such that lUi n h i + 1 (j) I ~ lUi n h i + 1 (t) I for every
c) Let LBa and LBb be the lower bounds for IHI proven in parts a) and
b). Since there are at most c L+ 1 different programs of length ~ Land
different functions require different programs we conclude that there
is at least one S c [O ••• N-1], lSI = n, such that the shortest proqram
computing a perfect hash function for S has length
max(log LBa, log LBb)/log c - 1. But
and
N .•• (N-n+1) .mn
log ~~~~~~~~--
log LBa
Nn m· " . (m-n+1)
n-1 N ,n-1 ,
log [n ~/ n m-1]
i=O N i=O m
n-1 n-1
= [ L tn(1-i/N) - L tn(1-i/m)]/tn 2
i=O i=O
n-1 n-1
~ [(1 - n-l). L (i/N) + L (i/m)]/tn 2
N i=O i=O
How good is that lower bound? We give our answer in two steps. We will
first show by a non-constructive argument that the lower bound is al-
most achievable (theorems 7 and 8). Unfortunately, the programs con-
structed in theorem 8 are extremely inefficient, i.e. theorem 8 settles
question a) concerning the program size of perfect hash functions but
does not give convincing answers to questions b) and c). In a second
step we will then derive answers for questions b) and c) •
Note that the rows corresponding to elements not in S may be filled ar-
bitrarily. Thus there is a perfect class H, IHI = t, if
131
or
or
n-1
8ince ~n(N) ~ n'~n N, - ~n(1 - m(m-1) ••• (m-n+1)/mn ) ~ n ( 1-i/m)
n
i=O
n-1
L ~n(1-i/m)
n-1 n
i=O
e and L ~n(1-i/m) ~ f ~n(1-x/m)dx =
i=O 0
m[ (1-n/m) (1-~n(1-n/m) )-1] ~ m[ (1-n/m) (1+n/m)-1] = n2/m there will be a
perfect class H with IHI = t provided that
2
t ~ n' (~n N) • en /m [J
Theorem 8: Let N,m,n E IN. For every 8 c [0 ..• N-1], 181 n, there is
a program of length
which computes a perfect hash function h [0 •.• N-1] .... [0 ..• m-1] for 8.
(4) search through all 2k by t matrices with entries in [0 •.• m-1] until
a (N,m,n) - perfect matrix is found; use the i-th column of that
matrix as the table of th~ hash function for 8j
132
The length of this program is log log N + 0(1) for line (1), log n +
log log N + n2/m + 0(1) for lines (2) and (3) and 0(1) for line (4).
Note that the length of the text for line (4) is independent of N, t
and m. This proves the claim. o
for 0 ~ i < m.
m-1 k
a) For every m there is a k such that L (b.)2 ~ n + 2n(n-1)/m. More-
l
i=o
over, such a k can be found in time O(nN).
m-1
b) For every m, a k satisfying L (b~)2
1
~ n + 4n(n-1)/m can be found
i=o
in random time O(n).
133
Proof: a) For k, 1 ~ k < N, let hk(x) = «kx) mod N) mod m be the hash
function defined by multiplier k. Then
N-1 m-1
r r I ((x,y); x,y E S, x * y, hk(x) i} I
k=1 i=o
r I {k; hk(x)
x,yES
x*y
Next note that hk(x) = hk(y) iff [(kx) mod N - (ky) mod N] mod m = O.
We are asking for the number of solutions k with 1 ~ k < N.
Since k ~ (kx)mod N is a permutation of {1, ••• ,N - 1} we may assume
w.l.o.g. that x = 1 and hence y * 1. Thus
N-1
L lSI r I{k; h k (1)
k=1 yES,y*1
-N + 2 ~ gy(k) ~ N - 2.
We can therefore rewrite the sum above as
L (N-2) /mJ
lSI L I{k; gy(k) :I: im} I
i=1
We show I{k; gy(k) = :I: im}1 ~ 2. Let k 1 ,k 2 E {1, ••• ,N-1} be such that
gy(k 1 ) a g y (k 2 ) where a E {-1, +1}. Then
m-1
L (b~)2::::; n + 2n(n-1)/m.
i=O ~
N-1 m-1
b) In part a) we have shown that L (L (b~) 2 - n) ::::; 2n (n-1) (N-2) 1m.
k=1 i=o ~
m-1
Since L (b~)2 - n ~ 0 for all k we conclude that we must have
~
i=o
m-1
L (b~) 2 - n ::::; 4 (n-1) nlm for at least fifty percent of the k' s between
~
i=o
1 and N-1. Thus the expected number of tries needed to find a k with
m-1
L (b~)
~
2 - n ::::; 4 n(n-1) 1m is at most two. Testing a particular k for
i=o
that property takes time O(n). [J
Corollary 10: Let N be a prime and let S c [O ••• N-1], lSI n. Let b~~
be defined as in lemma 9 •
135
m-1
a) If n = m then a k satisfying L (b~)
1.
2 < 3 n can be found in time
i=o
m-1
O(nN) deterministically, and a k satisfying L (b~)
1.
2 < 4 n can be found
i=o
in random time O(n).
can be found in time O(nN). Let b i = lSi' and let c i = 1 +bi (b i -1)
For every i, 0 ~ i < n, there is ki' 1 ~ ki < N, such that
x - «kix)mod N)mod c i operates injectively on Si by corollary 10 b).
Moreover, k i can be determined in time O(biN) .
The size of this program is O(n log N)i namely, O(log N) bits each for
i-1
integers k, k i and L C9,l 0 ~ i < n - 1. The running time is clearly
9,=0
0(1) in the unit cost measure. Finally, the time to find the program is
bounded by O(nN + L b. N) = O(nN).
i 1
b) The proof of part b) is very similar to the proof of part a). Note
n-1
first that a k with L IS. ,2 < 4n can be found in random time O(n).
i=o 1
Also, ki's such that x ~ «kix)mod N)mod (2b i (b i -1) + 1) operates in-
jectively on Si can be found in random time L O(b.) = o (n). Putting
i 1
these functions together as described above yields an injective func-
n-1
tion from S into [0 ... m-1] where m = L (2b. (b.-1) + 1) =L b~ < 4n o
i=o 1 1 i 1
Proof: Let m have q different prime divisors and let Pl' P2' P3' ...
be the list of primes in increasing order. Then
q
L 2n i
i=l
e
ax + b q + rm mod N
ay + b q + sm mod N
I {(a,b); h a, b(x)
:$ {1 + cn/m if x ~ S
E (1 + 0h(x,S»/IHI
hEH 1 + c(n-1)/m if xES
142
Proof: a)
(1 + c·n/m) if x ~ S
:s; (HI
IHI (1 + c· (n-1)/m) i f x E S
c) immediate from part b) and the observation that the expected cost of
the i-th operation in the sequence is 0(1 + c·i/m). c
Theorem15, c reads the same as theorem 2 (except for the factor c).
However, there is a major difference between the two theorems. They are
derived under completely different assumptions. In theorem 2 each sub-
set S =Ur lSI = n, was assumed to be equally likely, in theorem 15, c
each element h E H is assumed to be equally likely. So universal hash-
ing gives exactly the same performance as standard hashing, but the
dices are now controlled by the algorithm not by the user.
b) Let N,m E ~ and let t be minimal such that t·~n Pt ~ m'~n N. Here
Pt denotes the t-th prime. Then t = O(m ~n N). Let
where
x mod p~
and
2
Then IH21 t'P2t and hence log IH21 = O(log t) = O(log m + log log N)
since log P2t = O(log t) by the prime number theorem. It remains to
show that H2 is 8 -universal.
Thus there have to exist q E [0 ... m-1] and r,s E [0 . . . r p2t / m' -1] such
that
q + r·m
We have to count the number of triples (c,d,~) which solve this pair
of equations. We count the solutions in two groups. The first group
contains all solutions (c,d,~) with x mod p~ *y mod p~ and the second
145
Group 2: Let L ={£; t < £ :-:; 2t and x mod P£ = Y mod p£} and let
ILl
p = n {p£; £ E L} . Then P ;:: Pt . Also P divides x - y and hence P :-:; N.
Thus ILl :-:; (£n N)/£n Pt :-:; tim by definition of t.
Extendible hashing uses a table T[0 ••• 2 d (S) - 1], called the directory,
and some number of buckets to store set S. More precisely, the entries
of table T are pointers to buckets. If x is an element of S then the
bucket containing x can be found as follows.
(2) Use this integer to index directory T; pointer T[i] points to the
bucket containing x.
Example: Let h(S) {OOOO, 0000, 0100, 1100} and let b 2. Then
deS) = 2.
00
F 0000, 0001
P
01
0100
10
11
E 1100 CJ
As one can see from the example, we allow entries of T to point to the
same bucket. However, we require that sharing of buckets must conform
to the buddy principle. More precisely, if r < deS), a E {O,1}r,
a 1 ,a 2 E {0,1}
d(S)-r
, a1 *
a 2 and T[aa 1 ] = T[aa 2 ] then T[aa 1 ] = T[aa 3 ]
d(S)-r . deS)
for all a 3 E {0,1} • In other words, 1f we represent {0,1} as
a TRIE then the table entries sharing a bucket form a subtree of the
TRIE. Finally, if B is a bucket and 2 d (s)-r table entries point to B then
147
010
011
100
PI 0100
101
110
-
T[aza d ( S )and
-r-
1 ], a 1 E {0,1} z 1 - z, point to B'. Note that B' does
not necessarily exist. If B' does exist and Band B' contain together
at most b keys then we merge Band B' to a single bucket of local depth
r - 1. In addition, if r = d(S) and Band B' were the only buckets of
depth d(S) then we half the size of the directory. This completes the
description of the deletion algorithm.
What can we say about the behaviour of extendible hashing? What are the
relevant quantities? We have seen already that an Access operation
requires only two accesses to secondary memory. This is also true for
an Insert, except in the case that a bucket overflows. In this case we
have to obtain a new bucket, a third acces&,and to distribute the ele-
ments of the bucket to be split over the new buckets. In rare cases we
will also have to double the directory. These remarks show that the
time complexity of extendible hashing is very good. The space complexity
requires further investigation. The two main questions are: What is the
size of directory T? How many buckets are used? First of all, as for all
hashing schemes worst case behaviour can be very, very bad. So let us
discuss expected behaviour. The analysis is based on the following
assumption. We have k = =, i.e. hash function h maps U into bit strings
of infinite length. Furthermore, h(x) is uniformly distributed in in-
terval [0,1]. Note that h(x) can be interpreted as the binary represen-
tation of a real number. A discussion of expected behaviour is particu-
larly relevant in view of our treatment of universal hashing. Note that
the behaviour of extendible hashing heavily depends on the hash function
in use and that universal hashing teaches us how to choose good hash
functions. Unfortunately, a complete analysis of the expected behaviour
of extendible hashing is quite involved and far beyond the scope of the
book.
and
10 5 0
b n 10 6 10 8 10
We can see from. this table that the non-linear growth is clearly per-
ceptible for small b, say b 10, and that directory size exceeds the
~
Why does directory size grow non-linearly in the size of the file? In the
case b = 1 this is not too hard to see and follows from the birthday
paradox. If b is 1 then directory size doubles whenever two elements of
S hash to the same entry of the directory (have the same birthday). If
directory size is m (the year has m days) and the size of 8 is n then
the probability p that no two elements of 8 hash to the same entry
n,m
of the directory is
p
n,m
= m(m-1) •.• Un -n+1) /mn
n-1
IT (1 - i/m)
i=o
n-1
L 9,n( 1-i/m)
i=o
e
n-1
- L i/m
i=o
~ e ~ e
Definition: A binary search tree for set S = {x 1 < x 2 < ... < xnl is a
binary tree with n nodes {v1, ••. ,vnl. The nodes are labelled with the
elements of S, i.e. there is an injective mapping CONTENT:
{v1, •.• ,vnl S. The labelling preserves the order, i.e. if vi(v j ) is
a node in the left (right) subtree of the tree with root v k then
CONTENT[Vi ] < CONTENT[vk ] < CONTENT[V j ].
v ..... root of Ti
while v is a node and a * CONTENT[v]
do if a < CONTENT[v]
154
then v LSON[v]
else v RSON[v]
fi
od
Case 1: At least one son of v is a leaf, say the left. Then we replace
v by its right son and delete v and its left son from the tree.
The following figure illustrates both cases. The node with content 4 is
-A
deleted, leaves are not drawn.
155
The considerations above show that the hejght of the search tree
plays a crucial role for the efficiency of the basic set operations.
We will see in III.5. that h(T) = O(log lSI) can be achieved by the use
of various balanced tree schemes. In the exercises it is shown that the
average height of a randomly grown tree is O(log n).
( +r a - S [low-.1 ] ,
next - low-1) S[high+1] _ S[low-1] • (high-low+1)
It is assumed that positions S[O] and S[n+1] are added and filled with
artificial elements. The worst case complexity of interpolation search
is clearly O(n); consider the case that S[O] = 0, S[n+1] = 1,
a = 1/(n+1) and S ~ (O,a). Then next = low always and interpolation
search deteriorates to linear search. Average case behaviour is
much better. Average access time is O(log log n) under the assumption
that keys x 1 , ••• ,xn are drawn independently from a uniform distribution
over the open interval (xO ,xn + 1 ). An exact analysis of interpolation
search is beyond the scope of this book. However, we discuss a variant
of interpolation search, which also has O(log log n) expected behaviour:
quadratic binary search.
is a fast (at least on the average) way to find the node on the path of
search which is halfway down the tree, i.e. the node on the path of
search which has depth 1/2 log n. There are 21/2 log n = In of these
nodes and they are In apart in the array representation of the tree.
Let us make an initial guess by interpolating and then search through
these nodes by linear search. Note that each step of the linear search
jumps over In elements of S and hence as we will see shortly only 0(1)
steps are required on the average. Thus an expected cost of 0(1) has
reduced the size of the set from n to In (or in other words determined
the first half of the path of search) and hence total expected search
time is O(log log n).
C L i(p~-P~+1)
i~1 ~ ~
where actual rank of a denotes the number of xj's smaller than a. Hence
n
n J' n-J'
~ r j ( , ) p (l-p) pn
j=O J
with variance
2 n 2 n' n-J'
cr r (j-~) (,)pJ (l-p) p. (l-p) . n
j=O J
Thus
S p. (1-p)n/«i-2)/n)2
s p. (1-p)/(i-2)2
S 1/ (4· (i-2)2)
C s 2 + r 1/(4(i-2)2)
i2:3
The worst case access time of quadratic binary search is O(/n). Note
that the maximal number of comparisons used is n 1 / 2 + n 1 / 4 + n 1 / 8 + •••
= O(/n ). This worst case behaviour can be reduced to O(log n) without
sacrificing the O(log log n) expected behaviour as follows. Instead of
using linear search to determine the i with S[next + (i-2) ./n] < a
~ S[next + (i-1)./n] with i comparisons we use exponential + binary
search (cf. III. -4.2., theorem 10) for a cost of only O(log i) compari-
sons. Since log i ~ i the expected behaviour stays the same. However,
the maximal number of comparisons is now log n 1 / 2 + log n 1 / 4 + log n 1 / 8
+ ... = (1/2 + 1/4 + 1/8 + ... ) log n = log n. Thus worst case behav-
iour is O(log n) if exponential + binary search is used.
Definition: Let S = {x 1 < x 2 < ... < x } and let B. (a.) be the proba-
n ~ J
bility of operation Access (a,S) where a = x. (x. < a < x. 1) for
~ J J+
1 ~ i ~ n(O ~ j ~ 0, a. ~ 0 and ~B. + La. = 1. The
n). Then B. ~
J ~ ~ J
(2n + 1)-tuple (aO,B1,a1, ... ,Bn,an) is called access (probability)
distribution.
159
Let T be a search tree for set 5, let b~ be the depth of node xi and
].
TnT n T
p = l: Si (1 + b.) + l: 0. a
i=l ]. j=O j j
We associated with every search tree for set 5 a single real number,
its weighted path length. We can therefore ask for the tree with the
minimal weighted path length. This tree will then also optimize average
access time.
Definition: Let 5
160
and
We use T ~J
.. to denote an optimum
_ _ binary_ search
_ tree for set {x.,
~
••• ,x.}
J
with access distribution (a. 1,B., ••• ,B.,d.). Let P .. be the weighted
~- ~ J J ~J
path length of Tij and let r ij be the index of the root of T ij , i.e.
xm with m = r ij is the root of T ij •
Lemma 1: a) Pi ,i-1 0
Proof: a) T i ,i-1 consists of a single leaf which has depth zero. Thus
P i ,i-1 = O.
b) Let T .. be an optimum tree for set {x., ..• ,x.}. Then Tij has weighted
~J ~ J
path length P .. and root x with m = r ... Let To (T ) be the left (right)
~J m ~J N r
subtree of T .. with weighted path length Po (P ). Since
~J N r
161
T .. Tt
b k 1J 1 + bk for i ~ k ~ m - 1
T ..
b 1J 0
m
T .. T
b k 1J 1 + b r for m + 1 ~ k ~ j
k
and analogously for the leaf weights we have Wij P ij = wij + wi ,m-1 P t
+ wm+ 1 ,j P r T t is a search tree for set x i , .•• ,xm_ 1 and therefore
P t ~ P i ,m-1. We must have Po = P. 1 because otherwise we could replace
7v 1,m-
T t by T i ,m-1 and obtain a tree with smaller weighted path length. This
proves b). [J
(5) do j i + k;
(8)
uses three arrays P .. , w .. and r .. and hence has space complexity 0(n 2 ).
~]~] ~]
We turn next to the time complexity. Note first that one execution of
lines (5), (6), (8), (9) takes time 0(1) and one execution of line (7)
takes time 0(j-i+1) = 0(k+1) for fixed i and k. Thus the total running
time is
n-1 n-k
:L :L 0(k+1)
k=O i=1
This falls short of the 0(n 2 ) bound promised in the theorem. Before we
sketch an improvement we illustrate the algorithm on the example from
the beginning of III.4 .• Arrays P ij , wij ' 1 ~ i ~ 5, 0 ~ j ~ 4 and r ij ,
1 ~ i ~ 4, 1 ~ j ~ 4 are given below:
5 11 24 5 8 14
24P
o 3 12 48]
31
24w
o 3 9 24]
19
o 6 22 o 6 16
o 13 3 13
o 10
2
2 3
r =
3
let m, r ~,]-
.. 1 ~ m ~ r'+
~ 1 , ]. be such •..
n-1 n-k
O(:L :L (1+r i + 1 ,i+k - r i ,i+k-1»
k=O i=1
163
n-l
O( L (n-k + r -k - r 1 k»
k=O n,n,
n-l
o (L n)
k=O
c (i, i) 0
Optimum binary search trees are a special case of these recursion equa-
tions: take w(i,j) = wi + 1 ,j = a i + Si+l + ... + Sj + aji then
c(i,j) = Pi+l,j.
Lemma 3 is the key for improving the running time of dynamic program-
ming to O(n 2 ) as we have seen above. It remains to prove lemmas 2 and 3.
:5 c(i,j')
Case 1.2: k ~ j. This is symmetric to case 1.1 and left to the reader.
Case 2: i < i' < j' < j. Let y = K(i' ,j) and z = K(i,j'). Again we
have to distinguish two symmetric cases z:5 y or z ~ y. We only treat
the case z :5 y. Note first that z :5 y :5 j by definition of y and i < z
by definition of z. We have
,by QI for w
,by definition of y and z. This completes the induction step and thus
proves lemma 2. o
for all i < k :5 k' :5 j, i.e. if K(i,j) prefers k' over k then so does
K(i,j+1). In fact, we will show the stronger inequality
or equivalently
However, this is simply the QI for c at k ::; k' ::; j ::; j+l. DD
Of course, the exponential running time of OST stems from the fact that
subproblems OST(i,j) are solved repeatedly. It is therefore a good idea
to introduce a global table P[i,j] and to record the values computed by
OST in this table. This leads to
167
Note that lines (4') - (11') are executed at most once for every pair
(i,j), i + 1 $ j. Hence the total time spent in lines (4') - (11') of
OST is O(n 3 ). Also note that the total number of calls of OST is O(n 3 )
since lines (4') - (11') are executed at most once for every pair (i,j).
Thus the total time spent in lines (1') - (3') and (12') - (13') of OST
is also o(n 3 ).
We can see from this example that tabulating function values in ex-
haustive search algorithms can have a dramatic effect on running time.
In fact, the revised program above is essentially our dynamic pro-
gramming algorithm. The only additional idea required for the dynamic
programming algorithm is the observation that recursion can be replaced
by iteratioJ;L.
$ a1 a2 a. an ¢
1
U
[COntrOl J --+
168
(1) move the head to the right endmarker ¢ and then move left and store
b n ,b n _ 1 , ••. ,b 1 in the pushdown store; b 1 is at the top of the push-
down store. Position the input head on a 1 •
(2) while the symbol under the input head and the top pushdown symbol
agree
do move the input head to the right and delete one symbol from
the pushdown store
od
(4) if the top symbol on the pushdown store is C (the special symbol)
then empty the pushdown store and reject fi
100
(5) Move the input head to a 1 • While moving left push all symbols
scanned onto the pushdown store
(6) Remove one symbol from the pushdown store and go to (2).
b j b j +1 b j +i-1 bj+i
II
a1 a2 ai *
a i +1
a 1 •••• • ah ah+1
172
i - O~ j - 1~ f(1) - O~
while j :5 n
do -a a and (i <
a 1 ••· a i - j-i+1··· j
R, < ]' implies a 1 ···a.~ *
a.]-R, +1 ..• a ].)
+1 = a j +1
if a i
then f(j+1) - i + 1~ i j + 1
else if i = 0
then f (j+1) O~ j j + 1
~ i - f(i)
fi
od
[]
Proof: Similar to the proof of lemma 4.
We sUIlll1iarize in
Theorem 4: All occurrences of pattern a 1 .•. a n in string b 1 ••• b n can be
found in linear time O(n + m).
The dynamic programming algorithm for optimum binary search trees has
quadratic space and time requirement. It can therefore only be used for
small or medium size n. In the next section we will discuss algorithms
which construct nearly optimal binary search trees in linear time. Note
that giving up optimality is really not that bad because the access
probabilities are usually only known approximatively anyhow. There is
one further advantage of nearly optimal trees. We will be able to bound
the cost of every single access operation and not only expected cost.
Recall that average and worst case behaviour can differ by large
amounts, e.g. in Quicksort~ a similar situation could arise here.
We will not be able to relate weighted path length of optimum trees and
nearly optimal trees directly. Rather we compare both of them to
an independent yardstick, the entropy of the access distribution.
174
Some basic properties of the entropy function can be found in the ap-
pendix. In the sequel we consider a fixed search tree T for set
S = {x 1 ' ••• ,x n } and access distribution (cx O,S1, ••• ,Sn'cx n ). We use H to
denote the entropy of the distribution, i.e. H = H(CX O,S1, ••• ,Sn'cx n )
and we use bi(a j ) for the depth of node xi (leaf (x j ,X j + 1 » for
1 :5 i :5 n (0:5 j :5 n). P is the weighted path length of T. We will prove
lower bounds on the search times in tree T. These lower bounds are in-
dependent of the structure of T; in particular, they are valid for
the optimum tree. The proofs follow classical proofs of the noiseless
coding theorem.
Bi «1_C)/2)b i c , 1 :5 i :5 n
a.
(Xj «1-c)/2) J ,0 :5 j :5 n
for :5 i :5 k-1
b.J. for i k
for k+1 :5 i :5 n
and
175
={
a '. + 1 for 0 S; j S; k-1
J
aj
a'! + 1 for k S; j S; n
J
LfL +
i l
La.
j J
= «1-c)/2)Sn + c + «1-c)/2)Sr = 1
Yv
IJ
(H - B 10g(c/c))/10g(1/c)
H S; P.log(2+2- d ) + dB
P 10g(2+2B/P) + B 10g(P/2B)
~ L~i[(bi+1») + Laj[a j )
We show that the inequality above is almost true componentwise for the
expressions in square bracketsi more precisely, for h E JR, h> 0
define
and
Then
L ~o L ao ~ 2
-h ,
iENh J. + jELh J
i.e. for a set of leaves and nodes, whose total weight exceeds 1 - 2
-h ,
theorem Sa "almost" holds componentwise. "Almost" has to be interpret-
ed as: up to the additive factor -h/log(2 + 2- d ). The proof of this
claim goes as follows. Let d = log(c/c) with c = (1 - c)/2 and
- -d-
o ~ c < 1. Then d = log c - log c and log(2 + 2 ) = log(1/c). A simple
computation shows that the definitions of Nh and Lh are equivalent to
Hi 2- h Si} and
Nh ~i ~
{ji ao 2- h U }
Lh J
~ j
LSi + La 0
J
~ L Si + Lao ~ 2h[ L i3 + Lao)
i
iENh J°EL h J iENh J°EL h J
We summarize in
Nh {ii ~i ~ 2- h Si}
and
177
o 1/4 1/2
Point 1/2 lies wibhin a Si or an o. j • In the first case we should choose
xi as the root of the search tree, in the second case we should either
choose Xj if 1/2 lies in the left half of o. j or x j + 1 if 1/2 lies in the
right half of o. j • In our example we choose x3 as the root. This also
fixes the right subtree. For the left subtree we still have the choice
of choosing either x 1 or x 2 as the root.
procedure construct-tree(i,j,cut,£);
if i + 1 = j (Case A)
else
determine k such that
if k = i + 1 (Case B)
then return
construct-tree(i+1,j,cut+2- t ,t+1)
if k j (Case C)
then return
construct-tree(i,j-1,cut,t+1)
-t
construct-tree(i,k-1,cut,t+1) construct-tree(k,j,cut+2 ,t+1)
fi
end
180
Proof: Assume that the parameters satisfy conditions (1) to (4) and
-£+1
that i + 1 f j. In particular, cut $ s. $ cut + 2 • Suppose, that
J _~ _~
there is no k,i < k $ j, with sk-1 $ cut + 2 and sk 2 cut + 2 '. Then
either for all k,i < k $ j,sk < cut + 2-~ or for all k,i < k $ j,
-~
sk_1 > cut + 2 In the first case k j satisfies (6) and (7), in the
second case k i + 1 satisfies (6) and (7). This shows that k alvlays
exists. It remains to show that the parameters of the recursive calls
satisfy again (1) and (4). This follows immediately from the fact that
-~
k satisfies (5) to (7) and that i + 1 f j and hence sk 2 cut + 2 in
-~
Case Band sk-1 $ cut + 2 in Case C. D
Proof: The proof is by induction, Fact 1 and the observation that the
actual parameters of the top-level call construct-tree (O,n,O,1) satis-
fy (1) to (4). D
Proof: The actual parameters of the call satisfy condition (4) by fact
2. Thus
- S.
1
~ Sh(resp. a h /2). D
-b -a +2
We infer from facts 3 and 4, Sh ~ 2 hand a h ~ 2 h Taking
logarithms and observing that b h and a h are integers proves the
theorem. D
for most nodes and leaves of tree TBB . Substituting the bounds on b i
and a j given in theorem 7 into the definition of weighted path length
we obtain
and further
Theorem 9 can be interpreted in two ways. On the one hand it shows that
the weighted path length P BB of tree TBB is always very close to the
optimum and hence TBB is a good search tree. Essentially, part b) shows
for the N most common words, N = 10, 100, 1 000, 10 000, 100 000. Let
P N be the weighted path length of the tree constructed for the N most
common words. Then PN ~ 11 for N ~ 00 as the figure (due to Gotlieb/
Walker) below suggests. This is in good agreement with theorem 9.
PN logn
11
T(1) = c1
T(O) 0
184
-£
Binary Search: We first try r = L(i + 1 + j)/2 J • If sr ~ cut + 2
then k 5 r, otherwise k ~ r. We iterate as described in 111.3.1. and
find k in log n steps. Thus Ts(n,m) 5 d·log n for some constant d, and
we obtain
T(O) 0
n
T(n) ~ T(n-1) + d·log n + c ~ d· L log i r/(n log n)
i=1
(2) Find the smallest t, t 0,1,2, ..• , such that s ~ cut + 2-£.
i+2 t
185
T(O) 0
name of node
or leaf
depth in
Topt
depth in probability -
L log PJ Bi , 01..
J
for
'13B c = 1/2
x1 2 1/24 4 1/32
x2 2 1/8 3 1/8
x3 0 0 1/8 3 1/2
x4 0 00 1/8
,x 2 ) 2 3 1/6 2 1/64
(x 1 ,x 2 ) 3 3 0 00 1/64
187
(x 2 'x 3 ) 3 2 o 00 1/16
(x 3 ,x 4 ) 2 2 1/8 3 1/16
We can see from this table that the upper bounds given by theorem 7
exceed the actual values by 1 or 2. If we apply theorem 6 with c = 1/2
and h = 2 then N2 {3,4} and L2 {1,2} and
We consider only unweighted sets in this section, i.e. we are only in-
terested in the size of the sets involved. In other words, if T is a
search tree for set S = {x 1 < x 2 < .•• < xn} then we assume uniform access
probabilities, i.e. Si = 1/n and a j = O. As above, we use b i to denote
the depth of node xi in T.
p = r• (1/n) (b.1. + 1)
1.
a) P ~ Llog(n+1)J - 1
k
P ~ 1/n( L (H1)2 i + (k+2) (n_2 k + 1 + 1))
i=O
~ Llog(n+1) J - 1 []
~
insertions (exercise 9). Deterioration
can be avoided completely if the
.... ..... tree is rebalanced after every in-
..... ..... sertion or deletion. We will see
..... later that rebalancing can be re-
.....
~ stricted to local changes of the
tree structure along the path from
the root to the inserted or deleted node. In this way, rebalancing time
is at most proportional to search time and hence total cost is still
logarithmic.
189
All known classes of balanced trees can be divided into two groups:
weight-balanced and height-balanced trees. In weight-balanced trees one
balances the number of nodes in the subtrees, in height-balanced trees
one balances the height of the subtrees. We will discuss one representa-
tive of each group in the text and mention some more in the exercises.
For this section a is a fixed real, 1/4 < a ~ 1 - 12/2. The bounds on a
will become clear lateron.
is called the root balance of T. Here ITI denotes the number of leaves
of tree T.
a ~ p(T') ~ 1 - a
the subtrees with root u (v,w,x) have root balance 1/2 (2/3,2/5,5/14).
The tree is in BB[a] for a ~ 1/3.
190
a) P ~ (1 + 1/n)log(n + 1)/H(a, 1 - a) - 1
1
~ H(a,1-a) [(Hl) log (H1) + (r+l) log (r+1)] + 1
H (H1 r+1)
(n+1) log (n+1) + 1 _ (n+1). n+1' n+1
H(a,1-a) H(a,1-a)
rotation
192
•
double rotation
The root balances of the transformed trees can be computed from the old
balances P1' P2 (and P3) as given in the figure. We verify this claim
for the rotation and leave the double rotation to the reader. Let a,b,c
be the number of leaves in the subtrees shown. Then
Since
a + b P1 (a+b+c) + P2 (b+c)
= ( P1 + P2 (1-P 1 )) (a+b+c)
o
eration Insert(a) described in 111.3.1 creates the following subtree
cJi-itJ
wi th root-balance 1/2. I'le will now walk back the path to-
wards the root and rebalance all nodes on this path. So let
193
us assume that we reached node vi and that the root balances of all
proper descendants of vi are in the range [a,1-a]. Then 0 ~ i ~ k-1.
If the root-balance of node vi is still in the range [a,1-a] then we
can move on to node v i - 1 . If it is outside the range [a,1-a] we have to
rebalance as described in the following lemma.
Y1 ~ a/{2-a) + (1 - a/{2-a».a
= 81/208 ~ 1.05 a o
We still have to clarify two small points: how to find the path from
the inserted element back to the root and how to determine whether a
node is out of balance. The path back to the root is easy to find. Note
that we traversed that very path when we searched for the leaf where
the new element had to be inserted. We only have to store the nodes of
this path in a stack; unstacking will lead us back to the root. This
solves the first problem. In order to solve the second problem we store
in each node v of the tree not only its content, the pointers to the
left and right son, but also its size, i.e. the number of leaves in the
subtree with root v. So the format of a node is
The root balance of a node is then easily computed. Also the SIZE field
is easily updated when we walk back the path of search to the root.
Node v did not take part in rebalancing operations between the k-th and
the j-th transaction. But its balance changed from t'/w' to t/w and
hence many transactions went through v. Suppose that a insertions and b
deletions went through v. Then w = w' + a - band t ~ t' - b.
197
Claim: a + b ~ oaw
Proof: Let vO, ••. ,'vk be the rebalancing path for the j-th transaction
and let w~ be the weight of node v~, i.e. the number of leaves in the
tree with root v~. Then w~+1 ~ (1-a)w~ fO:_~ ~ 0. Thus ther~+~re at
most three nodes on the path with 1/(1-a)~ ~ w~ < 1/(1-a) . Hence
at most three is added to ~ TA. (v) for every transaction. c
v ~
It is now easy to complete the proof. ~ ~ BO. (v) is the total number of
~ v ~
single and double rotations required to process the sequence of m trans-
actions. We estimate this sum in two parts: i < k and i ~ k where k is
some integer.
~ ~ ((1-a)i/ oa ) 3m , by lemma 3
i~k
~ (1_a)k 3m/oa 2
and
since there is at most one node v for each transaction such that 00. (v) is
1.
increased by that transaction for any fixed i. Hence the total number
of single and double rotations is bounded by [(k-1)+3(1-a)k/ oa 2]m for
any integer k. o
We can use this fact as follows. In chapters VII and VIII we will
frequently augment BB[a]-trees by additional information. In these
augmented trees the cost of a rotation or double rotation will not be
0(1); rather the cost of a rebalancing operation caused by node v will
depend on the current "thickness" of node v, i.e. if node v causes a
rebalancing operation and the subtree with root v has w leaves then
the cost of the rebalancing operation is f(w) time units for some non-
decreasing function f. In many applications we will have f(x) = x or
f(x) = xlog x.
-i -i-1
Proof: If a node v with (1-a) < th(v) ~ (1-a) causes a rebalancing
operation then the cost of this operation is at most f«1_a)-i-1) time
units since f is non-decreasing. Every such rebalancing operation is
recorded in account BOi(V). Hence the total cost of all rebalancing
operations is
199
Height-balanced trees are the second basic type of balanced tree. They
come in many different kinds: AVL-trees,(2,3)-trees,B-trees, HB-trees,
••• and (a,b)-trees which we describe here.
v _ root of T;
while v is not a leaf
do find i, S i S p(v)-l,
such that k i - 1 (v) < x S ki(x);
v i-th son of v;
od;
if x = CONTENT[v]
then success else failure fi;
a) 2·a h - 1 S n S b h
b) log n/log b S h S 1 + log(n/2)/log a
Proof: Since each node has at most b sons there are at most b h leaves.
Since the root has at least two sons and every other node has at least
a sons there are at least 2·a h - 1 leaves. This proves ~. Part b) follows
from part a) by taking logarithms. o
Proof: The search for x takes time O(log 151). Also we store the path
of search in a pushdown store during the search. After adding a new
leaf to hold x we walk back to the root using the pushdown store and
apply some number of splitting operations. If a node v is split it has
b+1 sons. It is split into nodes v and v' with L(b+1)/2 J and r(b+1)/2'
sons respectively. Since b ~ 2a-1 we have L(b+1)/2 J ~ a and since b ~ 3 we
have r(b+1)/2' ~ b and thus v and v' satisfy the arity constraint after
the split. A split takes time Orb) = 0(1) and splits are restricted to
(a final segment) of the path of search. This proves lemma 5. D
Deletions are processed very similarly. Again we search for x, the ele-
ment to be deleted. The search ends in leaf w with father v.
v .. Zi
let y be a brother of v
od;
comment we have either p(v) ~ a and rebalancing is completed or
p(v) = a-1 and p(y) > a and rebalancing is completed by sharing;
i f p (v) = a-1
then
comment we assume that y is the right brother of v;
take the lef~~ost son away from y and make it an additional (right-
most) son of v; also move one key (the key between the pointers to v
and y) from z down to v and replace it by the leftmost key of y;
fi.
-,
completed:
Example continued: The tree of the previous example can be either re-
balanced by sharing or fusing depending on the choice of y. If Y is the
left brother of v then fusing yields
Proof: The search for x takes time O{log lSI). An (a,b)-tree is rebal-
anced after the removal of a leaf by a sequence of fusings followed by
at most one sharing. Each fusing or sharing takes time O{b) = O(1) and
fusings and sharings are restricted to the path of search. Finally note
that a fusing combines a node with a-1 sons with a node with a sons and
yields a node with 2a-1 ~ b sons. c
Proof: For operations Access, Insert and Delete this is immediate from
lemma 4, 5 and 6. For Min and De1etemin one argues as in theorem 3. c
If the tree is kept in~ain memory then typical values for the con-
stants are c 1 ~ c 2 ~ C1and C2 = 0 and we get a = 2 or a = 3. If the
tree is kept in secondary storage, say on a disk, then typical values
of the constants are c 1 ~ c 2 ~ C2 and C1 ~ 1000 c 1 • Note that C1 is
the latency time and C2 is the time to move one storage location. In
this case we obtain a ~ 100. From this coarse discussion one sees that
in practice one will either use trees with small arity or trees with
fairly large arity.
and
(1) p +- root;
(2) while pt.son[O] nil * -- a test, whether p is a leaf
(3) do if x ~ pt.content
(4) then push (p,O);
(5) p +- pt.son[O]
(6) else push (p,1);
(7) p +- pt.son[1]
(8) fi
(9) od;
(10) if pt.content = x
(11) then "successful"
(12) else "unsuccessful"
(13) fi
(1) to (15) below. We also redirect the pointer from p'S father to the
new node r. If p'S father is black this completes the insertion.
(1) new(q);
(2) qt.color black; qt.son[O]- qt.son[1] - nil;
(3) qt.content +- x;
(4) new (r);
(5) rt.content +- min(x,pt.content);
(6) r t. color +- red;
(7) if x < pt.content
(8) then rt.son[O] +- q; rt.son[1] p
°
+-
---
(20) pt.son[O].color - pt.son[l].color - black;
( 21) pt.color - red;
(22) if p = root
(23 ) then pt.color - black;
(24) terminate the insertion algorithm;
210
(25)
(26) r +- p;
(27 ) (q,dir2) - pop stack;
(28) if qt.color = black
(29) then terminate the insertion algorithm
(30) fi;
(31) (p,dir1) - pop stack
(32) else if dir1 = dir2
(33) then we rebalance the tree by a rotation
S2 {x E S1 ; x $ y} and
S3 {x E 51; x > y}
The splitting process yields two forests F1 and F 2 , i.e. two sets of
trees. F1 is the set of trees to the left of the path and F2 is the set
of trees to the right of the path. The union of the leaves of F1 gives
8 4 and the union of the leaves of F2 gives 8 5 • We show how to build 8 4
in time O(log 18 1 I). The root of each tree in F1 is a piece of a node
on the path of search. Hence F1 'contains at most h1 = height(T 1 ) trees.
Let D1 , ••• ,D m, m ~ h1' be the trees in F1 from left to right. Then each
Di is an (a,b)-tree except for the fact that the arity of the root
might be only one. Also max Di < min Di + 1 and height(D i ) > height(D i + 1 )
for 1 ~ i < m. Hence we can form a tree for set 8 4 by executing the
following sequence of Concatenates.
Concatenate(Dm_1,Dm,D~_1)
Concatenate(Dm_2,D~_1,D~_2)
,property of Concatenate
216
, since h i + 1 < hi o
m-2
Thus Ih 1 - h I + L Ih. - h~+l I + m
m- m i=l ~ ~
m-2
$ h 1 - h + L (h. - h~+l) + m
m- m i=l ~ ~
m-2
$ h' 1 - h + L (h~ - h~+l) + m
m- m i=l ~ ~
3 5 7
4 7
a) SH ~ d ~ n
b) c
(2c-1)·SP + c·F ~ n + c + a+c-1 (i-d-2)
For a node v (unequal the root) of an (a,b)-tree let the balance b(v)
of v be
Proof: obvious. 0
a) If ply) = a then let T' be the tree obtained by fusing v and y and
shrinking x. Furthermore, if x is the root and has degree 1 after the
shrinking, then x is deleted. Then bIT') ~ bIT) + c.
b) If ply) > a then let T' be the tree obtained by sharing, i.e. by
taking 1 son away from y and making it a son of v. Then bIT') ~ b(T).
Case 2: x is the root of T. If pIx) ~ 3 before the fusing then the ana-
lysis of case 1 applies. Otherwise we have pIx) = 2 and hence b*(x) = 0
before the fusing, and x is deleted after the fusing and w is the new
root. Hence
b) Taking one son away from y decreases the balance of y by at most one.
Giving v (of arity a-1) an additional son increases b(v) by one. Hence
bIT') ~b(T). 0
221
c
L (a+j-1)m. ~ m-2
j=O J
Since
we conclude
c
b (T) ~ c + L j·m.
j=O J
c j
~ c + L
a+j-1
(a+j-1) ·m.
j=O J
[Figure: deletion of the rightmost leaf]

the way to the root. However, theorem 8 is true for b = 2a-1 if one
considers insertions only (exercise n).
and F = Σ_{i=1}^{n} f_i. Then 0 ≤ F ≤ n(n-1)/2; F is called the number of in-
versions of the sequence x_1, ..., x_n. We take F as a measure of sortedness;
F = 0 for a sorted sequence, F ≈ n² for a completely unsorted sequence
and F « n² for a nearly sorted sequence. These remarks are not meant
to be definitions. We can sort sequence x_1, ..., x_n by insertion sort,
i.e. we start with the sorted sequence x_n represented by an (a,b)-tree
and then insert x_{n-1}, x_{n-2}, ..., x_1 in turn. By theorem 5 this will take
total time O(n log n). Note however, when we have sorted x_{i+1}, ..., x_n
already and next insert x_i, then x_i will be inserted at the f_i-th posi-
tion from the left in that sequence. If F « n², then f_i « n for most
i and hence most x_i's will be inserted near the beginning of the sequence.
What does this mean for the insertion algorithm? It will move down the
left spine (the path to the leftmost leaf) for quite a while and only
depart from it near the leaf level. It is then cheaper to search for
the point of departure from the left spine by moving up the spine
from the leftmost leaf. Of course, this requires the existence of son-
to-father pointers on the left spine.
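As a small illustration (my own, not the text's program), the following Python sketch sorts by repeated insertion into an already sorted part and charges each insertion a cost of 1 + f_i, i.e. the cost of a search that starts at the left end; a plain list stands in for the (a,b)-tree:

    import bisect

    def insertion_sort_cost(xs):
        """Insert x_n, x_{n-1}, ..., x_1 into an initially empty sorted sequence.
        Element x_i lands at position f_i from the left (f_i = number of later,
        smaller elements), so the total charged cost is n + F, which is small
        for nearly sorted inputs."""
        sorted_part, cost = [], 0
        for x in reversed(xs):
            f = bisect.bisect_left(sorted_part, x)   # f_i: position from the left
            sorted_part.insert(f, x)
            cost += 1 + f                            # cost of searching from the left end
        return sorted_part, cost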
   Σ_{i=1}^{n} [O(log f_i) + O(s_i)] = O(n + Σ_{i=1}^{n} log f_i + Σ_{i=1}^{n} s_i)
                                     = O(n + n·log(F/n))

since Σ log f_i = log Π f_i and (Π f_i)^{1/n} ≤ (Σ f_i)/n (the geometric mean is
never larger than the arithmetic mean) and since Σ s_i = O(n) by theorem
8. We thus have
Proof: The first part of the proof parallels the proof of theorem 8.
a) If ρ(y) = a then let T' be the tree obtained by fusing v and y and
shrinking x. Furthermore, if x is the root of T and has degree 1 after
the shrinking, then x is deleted. Then b_h(T') ≥ b_h(T) + c + 1,
b_{h+1}(T') ≥ b_{h+1}(T) - 1 and b_ℓ(T') = b_ℓ(T) for ℓ ≠ h, h+1.
b) If ρ(y) > a then let T' be the tree obtained by sharing. Then
b_ℓ(T') ≥ b_ℓ(T) for all ℓ.
The proofs of facts 1 to 3 are very similar to the proofs of the corre-
sponding facts in theorem 8 and are therefore left to the reader.
Proof: Facts 1 to 3 imply that splits (fusings) at height h increase the
balance at height h by 2c (by c+1, respectively) and decrease the balance
at height h+1 by at most 1. □
Since b_h(T_0) = 0 for all h (recall that we start with an empty tree) and
2c ≥ c+1 (recall that c ≥ 1) we conclude from fact 4
and hence

   n/(c+1)^h + Σ_{ℓ=1}^{h} b_ℓ(T_n)·(c+1)^ℓ ≤ (c+1)·(i-d) ≤ (c+1)·n
Proof: Let m_j(h) be the number of nodes of height h and degree a+j,
0 ≤ j < c, and let m_c(h) be the number of nodes of height h and degree
at least a+c. Then

   (*)   Σ_{j=0}^{c} (a+j)·m_j(h) ≤ Σ_{j=0}^{c} m_j(h-1)

for h ≥ 2, since the number of edges ending at height h-1 is equal to the
number of edges emanating at height h. Setting Σ_{j=0}^{c} m_j(0) := i-d (the num-
ber of leaves of T_n) the inequality is true for all h ≥ 1.
Using b_h(T) = Σ_{j=0}^{c} j·m_j(h) and relation (*) we obtain

   Σ_{ℓ=1}^{h} [(c+1)^ℓ · Σ_{j=0}^{c} j·m_j(ℓ)]

 ≤ Σ_{ℓ=1}^{h} [(c+1)^ℓ · {Σ_{j=0}^{c} m_j(ℓ-1) - a·Σ_{j=0}^{c} m_j(ℓ)}]

 = (c+1)·Σ_{j=0}^{c} m_j(0)
   + Σ_{ℓ=1}^{h-1} (c+1)^{ℓ+1}·[Σ_{j=0}^{c} m_j(ℓ) - (a/(c+1))·Σ_{j=0}^{c} m_j(ℓ)]
   - (c+1)^h·a·Σ_{j=0}^{c} m_j(h)
With respect to sharing, note that a fusing at height h-1 either com-
pletes rebalancing or is followed by a sharing or fusing at height h.
Hence

for h ≥ 2 and
In level-linked (a,b)-trees all tree edges are made traversible in both
directions (i.e. there are also pointers from sons to fathers); in addi-
tion each node has pointers to the two neighboring nodes on the same
level. The figure on the next page shows a level-linked (2,4)-tree for
list 2, 4, 7, 10, 11, 15, 17, 21, 22, 24.
Proof: Obvious.
We can now appeal to theorem 8 and show that although the search time
in level-linked (a,b)-trees can be greatly reduced by maintaining fin-
gers, it still dominates the total execution time, provided that b ≥ 2a.
Proof: Let n be the length of the sequence. Since every operation has
to be preceded immediately by a search, the total cost for the searches
is Ω(n) by lemma 7. On the other hand the total cost for the finger
creations and removals is O(n) by lemma 10 and the total cost of in-
sertions and deletions is O(n) by lemmas 8 and 9 and theorem 8. □
[Figure: deletion of the 4th leaf and sharing with a phantom; deletion of the third leaf and fusing; insertion of a new 4th leaf and splitting (phantoms marked)]
Adding a leaf: When a new leaf is added we mark the new leaf and its
father.
Fusing: Suppose that node v (v has a marked son at this point) is fused
with its brother y. Then we mark v and its father.
Sharing: Suppose that node v (v has a marked son at this point) receives
a son from its brother y. Then we mark v and we unmark y (if it was
marked).
Since the only leaves which are marked are the new leaves and the phan-
toms we conclude from Fact 1 that all marked nodes lie on paths from
the new leaves and the phantoms to the root. In fact 5 we derive an up-
per bound on the number of such nodes.
The marked balance of a tree is the sum of the balances of its marked
nodes, i.e. mb(T) = Σ_{v marked} b(v), where for a node w other than the root

   b(w) = 0   if ρ(w) = b
   b(w) = 2   if ρ(w) = 2a-1
   b(w) = 1   if a ≤ ρ(w) ≤ b and ρ(w) ≠ b, 2a-1

and for the root r

   b*(r) = 0  if ρ(r) = b
   b*(r) = 2  if ρ(r) = 2a-1
   b*(r) = 1  if 2 ≤ ρ(r) ≤ b and ρ(r) ≠ b, 2a-1
mb(T') mb(T) + b(v after adding leaf) - [b(v before adding leaf)]
b) Similar to part a) .
e) Suppose that we fuse v and its brother y and shrink their common
father x. If x is the root and has arity 1 after the fusing then x is
deleted. Also v and x are marked after the fusing and at least v is
marked before. y and x may be marked before. Hence
f) Suppose that v takes away a son from y. Then v is marked before and
after the sharing, y is not marked after the sharing and may be marked
or not before the sharing. Also the balance of y before the sharing is
at most 2.
Hence
Suppose now that we start with initial tree T (no node marked) and per-
form s insertions and t deletions and obtain Tn (n = s+t).
c) All marked nodes lie on a path from a new leaf or phantom to the root.
   m ≤ 3r + 2·(⌊log_a N⌋ + Σ_{i=2}^{r} ⌊log_a(p_i - p_{i-1} + 1)⌋)
Proof: For every node v label the outgoing edges with 0, ..., ρ(v)-1 from
left to right. Then a path from the root to a node corresponds to a word
over alphabet {0,1,2,...} in a natural way.
Let A_i be the number of edges labelled 0 on the path from the root to
leaf p_i, 1 ≤ i ≤ r. Since an (a,b)-tree of height h has at least 2·a^{h-1}
Consider any i ≥ 2. Let v be the lowest common node on the paths from
leaves p_{i-1} and p_i to the root. Then edge k_1 is taken out of v on the
path to p_{i-1} and edge k_2 > k_1 is taken on the path to p_i. Note that the
path from v to leaf p_{i-1} as well as to leaf p_i consists of t_i + 1 edges.
Proof: The paths from p_{i-1} and p_i to the root differ only below node v.
Let s be minimal such that
a) the path from v to p_{i-1} has the form k_1 α β with |β| = s and α contains
no 0.
b) the path from v to p_i has the form k_2 0^{|α|} γ for some γ with |γ| = s.
Note that either β starts with a 0 or γ starts with a non-zero and that
|α| + |β| = t_i.
   A_i = A_{i-1} + |α| + (number of zeroes in γ) - (number of zeroes in β)
              - (1 if k_1 = 0, 0 otherwise)
       ≥ A_{i-1} - 1 + (t_i - s) + 0 - s
We noted above that either β starts with a zero or γ starts with a non-
zero. In the first case consider the node w which is reached from v via
k_1 α 1, in the second case the node w which is reached from v via k_2 0^{|α|} 0.
All leaf descendants of w lie properly between p_{i-1} and p_i. Further-
more, w has height s-1 and hence at least a^{s-1} leaf descendants. This
proves

   a^{s-1} ≤ p_i - p_{i-1} - 1

and hence

   s ≤ 1 + ⌊log_a(p_i - p_{i-1} - 1)⌋ ≤ 1 + ⌊log_a(p_i - p_{i-1} + 1)⌋ .   □
   Σ_{i=2}^{r} t_i - 2·Σ_{i=2}^{r} ⌊log_a(p_i - p_{i-1} + 1)⌋ - 3(r-1) ≤ 1 + ⌊log_a N/2⌋ ,

hence

   Σ_{i=2}^{r} t_i ≤ 3r - 3 + 1 + ⌊log_a N/2⌋ + 2·Σ_{i=2}^{r} ⌊log_a(p_i - p_{i-1} + 1)⌋

and hence

   m ≤ 3r - 1 + 2·⌊log_a N/2⌋ + 2·Σ_{i=2}^{r} ⌊log_a(p_i - p_{i-1} + 1)⌋ .
At this point, we can put everything together and obtain a bound on the
number of rebalancing operations.
   SH + SP + F ≤ SH + SPNC + SPC + F
              ≤ s + t + SPC + F          (by fact 2)
              ≤ 3(s+t) + 2m              (by fact 4d)

where m is the total number of nodes on the paths from the new leaves
and phantoms to the root in the final tree. Since the final tree is an
(a,b)-tree and has n+s leaves and phantom leaves, we conclude from fact 5,
and hence

   SH + SP + F ≤ 9(s+t) + 4·(⌊log_a(n+s)⌋ + Σ_{i=2}^{s+t} ⌊log_a(p_i - p_{i-1} + 1)⌋)
since each finger except the first must be established by a search. Con-
sider the graph G with vertex set {q_i ; 1 ≤ i ≤ ℓ} and edge set
{(q_i, p_i) ; 1 ≤ i ≤ ℓ and p_i is not the originally fingered item}.
Some constant times the sum of edge costs in G is a lower bound on the
total search cost, since |ℓ(p_i) - ℓ(q_i)| + 1 can only underestimate the ac-
tual distance between q_i and p_i when p_i is accessed. We shall describe
a way to modify G, while never increasing its cost, until it becomes a
path through r_1 < r_2 < ... < r_k, where r_1 < r_2 < ... < r_k are the k = s+t
inserted or deleted leaves. Since the cost of this graph is

   Σ_{1<i≤k} max(⌈log(ℓ(r_i) - ℓ(r_{i-1}) + 1)⌉, 1) ≤ k + Σ_{1<i≤k} log(r_i - r_{i-1} + 1) ,

the theorem then follows from theorem 12, as desired. □
Finger trees can be used effectively for many basic set operations such
as union, intersection, difference, symmetric difference, ....
Proof: a) It is easy to see that theorems 5 and 6 are true for level-
linked (a,b)-trees.
Let p_1, ..., p_m, m = |B|, be the positions of the elements of B in the
set A ∪ B, and let p_0 = 1. Then the above program takes time

   O(m + log(n+m) + Σ_{i=0}^{m-1} log(p_{i+1} - p_i + 1)) .

Finally we have to consider A \ B. If |A| ≥ |B|, then we use the pro-
gram above. If |A| < |B| then we scan through A linearly, search for
the elements of A in B as described above (roles of A and B reversed)
and delete the appropriate elements from A. Apparently, the same time
bound holds. □
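To make the shape of these bounds concrete, here is a small Python sketch (my own simplification, not the text's program) of the union of two sorted lists in which every element of B is located by an exponential ("galloping") search starting from the previous position; the moving start index plays the role of a finger:

    import bisect

    def finger_union(a, b):
        """Union of two sorted lists a and b (b sorted as well).  Each element
        of b is located in a by an exponential search from the previous
        position, so the total search cost behaves like |B| plus the sum of
        the logarithms of the gaps, not |B| log |A|."""
        out, prev = [], 0
        for x in b:
            step, hi = 1, prev
            while hi < len(a) and a[hi] < x:     # exponential search phase
                hi = prev + step
                step *= 2
            hi = min(hi, len(a))
            pos = bisect.bisect_left(a, x, prev, hi)   # binary search in the bracket
            out.extend(a[prev:pos])              # copy the skipped part of a
            if pos == len(a) or a[pos] != x:
                out.append(x)                    # x not present in a: add it
            prev = pos
        out.extend(a[prev:])
        return out

For example, finger_union([1, 2, 3, 10, 20], [2, 4, 25]) returns [1, 2, 3, 4, 10, 20, 25].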
For T an (a,b)-tree let n(T) be the number of nodes of T and let P_N(T)
be the probability that T is obtained by N random insertions from the
empty tree. Of course, P_N(T) is zero if the number of leaves of T is not
equal to N. Then
Let n_i(T) be the number of leaves of T which are sons of a node with
exactly i sons, a ≤ i ≤ b. Then Σ_i n_i(T) = |T|, the number of leaves of
T. Let n_i(N) be the expected value of n_i(T) after N random insertions,
i.e. n_i(N) = Σ_T P_N(T)·n_i(T),

since every node of height 2 or more has at most b and at least a sons
(except for the root which has at least two sons) and since there are
exactly Σ_i n_i(T)/i nodes of height 1 in T. The 1 on the right hand side
accounts for the fact that the degree of the root might be as low as 2.
and

   n_i(N+1) = n_i(N) + i·(n_{i-1}(N) - n_i(N))/N      for a+2 ≤ i ≤ b
Proof: We prove the second equation and leave the two others to the
reader. Let T be an (a,b)-tree with N leaves. Consider a random inser-
tion into T resulting in tree T'. Note that n_{a+1}(T') - n_{a+1}(T) grows
by a+1 if the insertion is into a node with either exactly a (proba-
bility n_a(T)/N) or b (probability n_b(T)/N) leaf sons and that
n_{a+1}(T') - n_{a+1}(T) goes down by a+1 if the insertion is into a node of
height 1 with exactly a+1 sons (probability n_{a+1}(T)/N). Thus

   n_{a+1}(N+1) = n_{a+1}(N) + (a+1)·(n_a(N) + n_b(N) - n_{a+1}(N))/N .
Let q(N) be the column vector (n_a(N), ..., n_b(N))^T. Then lemma 10 can
be written in matrix form as

   q(N+1) = (I + (1/(N+1))·(B - I))·q(N) ,

where B - I is the matrix of coefficients given by lemma 10.
Note that the matrix B - I is singular since each column sums to zero. We
show below that q(N) converges to q, a right eigenvector of B - I with
respect to eigenvalue 0, as N goes to infinity. Also |q(N) - q| = O(N^{-c})
for some c > 0. This is shown in theorem 16.
We will next determine q = (q_a, ..., q_b)^T. From (B - I)·q = 0 and
Σ_i q_i = lim_{N→∞} Σ_i n_i(N)/N = 1 one concludes

for a+1 ≤ i ≤ b, where H_a = Σ_{i=1}^{a} 1/i is the a-th harmonic number.
   1/((a+1)·(b+1)) + ···

since b ≥ 2a
   N/((b-1)·(1 + 1/(a-1))·r(N)) - ε ≤ s(N) ≤ N/((b-1)·(1 + 1/(b-1))·r(N)) + ε

Thus |s(N) - ln 2| ≤ C/a + ε for all sufficiently large N and some con-
stant C. □
Storage utilization of a random (a,2a)-tree is thus about ln 2 ≈ 69%.
Fringe analysis can also be used to sharpen theorem 8 on the total re-
balancing cost. The bound in theorem 8 was derived under the pessimistic
assumption that every insertion/deletion decreases the value of the tree
by 1. Fringe analysis provides us with a smaller bound at least if we
restrict attention to random sequences of insertions. It is not hard to
define what we mean by a random deletion; any leaf of the tree is re-
moved with equal probability. However, it seems to be very hard to ex-
tend fringe analysis to random insertions and deletions.
2) there is an i such that for all j there are j_0, ..., j_k with i = j_0,
j = j_k and h_{j_ℓ j_{ℓ+1}} > 0 for 0 ≤ ℓ < k. Then

b) Let q(0) be any non-zero m-vector and define q(N+1) = (I + (1/(N+1))·H)·q(N)
for N ≥ 0. Then q(N) converges to q, a right eigenvector of H with re-
spect to eigenvalue 0, and |q - q(N)| = O(N^{Re λ_2}).
where ε_1, ..., ε_{m-1} are the eigenvalues of H_ii, we infer that either
det(H_ii) = 0 or sign(det H_ii) = (-1)^{m-1}. Thus det H_ii = 0 for some i iff
det H_ii = 0 for all i.

   ≤ u_j · ( Σ_ℓ h_{ℓj} + (|u_k|/u_j - 1)·h_{kj} ) ,

since |u_ℓ| ≤ u_j and h_{ℓj} ≥ 0 for ℓ ≠ j,

   < 0 ,

since Σ_ℓ h_{ℓj} ≤ 0, |u_k|/u_j < 1 and h_{kj} > 0. Hence u is not a left
eigenvector of H_ii with respect to eigenvalue 0.
b) Let f_n(x) = Π_{j=1}^{n} (1 + x/j) and let f(x) = lim_{n→∞} f_n(x). Then
f(0) = f_n(0) = 1. Also

   |1 + x/j|² = |1 + Re(x/j)|² + |Im(x/j)|²

   |f_n(x)| ≤ exp( Σ_{j≤j_0} ln|1 + x/j| + Σ_{j=j_0}^{n} ( ln(1 + Re(x/j)) + ln((1 + C·|Im(x/j)|²)^{1/2}) ) )

for some constant C depending on x, since ln(1 + Re(x/j)) ≤ Re(x)/j and
ln((1 + C·|Im(x/j)|²)^{1/2}) ≤ (C/2)·|Im(x)|²/j². □
Let q(0) be some m-vector and let q(N+1) = (I + (1/(N+1))·H)·q(N) for N ≥ 0.
Then q(N) = f_N(H)·q(0) where f_N(H) = Π_{1≤j≤N} (I + (1/j)·H). We consider matrix
f_N(H) in more detail.
Let J = T·H·T^{-1} = diag(J_1, ..., J_k) be the Jordan matrix corresponding to H;
J_1, ..., J_k are the blocks of the Jordan matrix. We have J_1 = (0), i.e.
J_1 is a one by one matrix whose only entry is zero. Also

   f_n(J) = diag(f_n(J_1), f_n(J_2), ..., f_n(J_k)) ,

and for a block J_t with eigenvalue λ_t and size ν_t the matrix f_n(J_t) is
upper triangular: its diagonal entries are f_n(λ_t) and its entry in position
(i, i+ℓ) involves the ℓ-th derivative f_n^{(ℓ)}(λ_t), 0 ≤ ℓ ≤ ν_t - 1; we denote
its entries by ε_{r,s}(n), 1 ≤ r, s ≤ ν_t.
that even in the unweighted case the cost of a join is related to the
height difference of the operand trees, which is in general not related
to the log of the weight ratio (cf. exercises 28 and 38), and therefore
will only show a bound of O(log((W_1 + W_2)/w)) where w is the weight of the
rightmost element of the first tree.
The main result of this section is that these time bounds are in fact
achievable. How can we proceed? How can we generalize weight-balanced
or height-balanced trees in order to cope with elements of different
weights? A first idea is to balance weights and cardinalities of sets
in weight-balanced trees (cf. the generalization of binary search from
unweighted to weighted sets in 4.1.) or to let the height of a leaf
depend on the weight of the leaf in height-balanced trees, e.g. to
set the height of a leaf of weight w to ⌊log w⌋. The following example
shows that this idea does not work in its pure form. Consider a set of
three elements of weight 1, n, and 1 respectively.
[Figure: trees for the weight sequences w_1,...,w_{i-1}; w_i,...,w_{n-1}; and w_2,...,w_n, with leaves representing x_2 ∈ S]
has an unbalanced root and all non-trivial subtrees have weight at most
W/3. Thus logarithmic search time is guaranteed and unbalanced nodes
only occur near leaves. A similar statement can be made if we look at
height-balanced trees. In their paper on biased trees Bent/Sleator/
Tarjan show that this property can be maintained dynamically and that
the time bounds stated above can be achieved by biased trees.
Suppose also that we have a function bal: {D_0, D_1, ...} → ℝ from the set
of possible data structures into the set of reals. If D is a data
structure then we call bal(D) the balance of the account of the
structure. A unit of balance represents the ability to pay for O(1)
units of computing time. We define the amortized cost of operation op_i
by

   a_i = t_i + bal(D_i) - bal(D_{i-1}) .

Then

   Σ_{i=1}^{m} t_i = Σ_{i=1}^{m} a_i + bal(D_0) - bal(D_m) ,

i.e. the total actual cost of the sequence of operations is equal to
amortization as follows. Suppose that we have defined the intended cost
iC i of operation OPi by some means; e.g. in dynamic weighted trees the
intended cost of an access operation to an element of weight w in a set
of total weight W is log W/w. We will then try to find an account for
the data structure, i.e. a function bal, such that the amortized cost
of every operation is bounded by the intended cost, i.e.
Then

   Σ_{i=1}^{m} t_i ≤ Σ_{i=1}^{m} ic_i + bal(D_0) - bal(D_m) ,
i.e. total actual cost is bounded by total intended cost plus the net
decrease in balance. Amortization is a means of cost sharing between
operations. Operations which do not require their intended cost, i.e.
ti ::::; ic i , pay for the cost of operations which exceed their intended
cost, i.e. t_i > ic_i. The sharing between the costs is made by the
account associated with the data structure, i.e. an operation deposits
(or withdraws) ic_i - t_i units into (from) the account in order to cover
the cost difference between intended and actual time bound.
   Σ_{i=1}^{m} t_i ≤ Σ_{i=1}^{m} ic_i + bal(D_0) ,

i.e. actual cost is bounded by intended cost plus initial balance. The
interpretation of a positive account is particularly simple. We start
with an initial investment of bal(D o ) units. Using this initial
investment and the units provided by the intended costs of the
operations we can also cover the actual cost. In other words, we use the
account to have operations pay for the cost of later operations which
might overrun their intended cost.
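As a toy illustration of a positive account (my own example, not one from the text), consider a binary counter: incrementing it may flip many bits, but with bal = number of one-bits every increment has amortized cost 2, so the total actual cost is bounded by twice the number of increments plus the initial balance:

    # t_i = number of bit flips of the i-th increment, bal(D) = number of one-bits.
    # Each increment flips k trailing ones to 0 and one 0 to 1, so t = k+1 and the
    # balance changes by 1-k; the amortized cost a = t + (1-k) is always 2.
    def increment(bits):
        """bits: list of 0/1, least significant bit first; returns the actual cost."""
        t, i = 0, 0
        while i < len(bits) and bits[i] == 1:
            bits[i] = 0; t += 1; i += 1      # a stored one-bit pays for its own reset
        if i == len(bits):
            bits.append(1)
        else:
            bits[i] = 1
        return t + 1                         # the final 0 -> 1 flip

    bits, m = [0], 1000
    bal0 = sum(bits)
    total = sum(increment(bits) for _ in range(m))
    assert total <= 2 * m + bal0             # actual cost <= intended cost + initial balance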
Let 1/4 < α < 1 - √2/2 and let δ > 0 be defined as in lemma 1 of section
5.1. For a binary tree T let

where ρ(v) is the root balance of node v and the summation is over all
nodes of tree T. We claim that the amortized cost of adding or deleting
a leaf is O(1) (this follows from the fact that adding a leaf below a
node of thickness th(v) can change its root balance by at most 1/th(v))
and that the amortized cost of a rotation or double rotation is zero
(this follows from the fact that a node v which causes a rebalancing
operation, i.e. ρ(v) < α or ρ(v) > 1-α, contributes at least one to
bal(T) and does not contribute anything after the rotation). Hence the
total number of rotations and double rotations is bounded by the order
of the number of insertions and deletions; cf. theorem 4 of section 5.1.
o
In the case of a negative account we have

   Σ_{i=1}^{m} t_i ≤ Σ_{i=1}^{m} ic_i - bal(D_m) ,

i.e. actual cost is bounded by intended cost plus final debt. In the
case of a negative account we use bal(D_m) to keep track of the cost
overrun, i.e. |bal(D_m)| measures the overrun of actual cost over intended
cost. In other words, we use the account to have later operations pay
for the cost of preceding operations, and we use bal(D_m) to keep track
of the final debt.
There are many strategies for self-organizing linear search. Two very
popular strategies are the Move-to-Front and the Transposition rule.
Move-to-Front Rule (MFR): Operations Access (x) and Insert(x) make x the
first element of the list and leave the order of the remaining elements
unchanged; operation Delete (x) removes x from the list.
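A minimal Python sketch of the move-to-front rule (my own illustration; a Python list stands in for the linear list, and the cost of an access is the position of the accessed element):

    def access_mfr(lst, x):
        """Access(x): return the cost and move x to the front."""
        i = lst.index(x)            # x is at position i+1, so the access costs i+1
        lst.insert(0, lst.pop(i))   # move-to-front; order of the other elements unchanged
        return i + 1

    def insert_mfr(lst, x):
        lst.insert(0, x)            # Insert(x) also makes x the first element

    def delete_mfr(lst, x):
        lst.remove(x)               # Delete(x) simply removes x

    lst = [1, 2, 3, 4, 5]
    total_cost = sum(access_mfr(lst, x) for x in [5, 5, 5, 1, 5])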
Proof: Let t_j^i, 1 ≤ j ≤ k_i, be the cost of the j-th access to element x_i
under the move-to-front rule. We are going to derive a bound on

   Σ_{i=1}^{n} i·k_i + Σ_{i=1}^{n} ( Σ_{j=1}^{k_i} (t_j^i - i) ) .
Consider an arbitrary pair i,j. If t_j^i > i then there must have been at
least t_j^i - i accesses to elements x_h, h > i, between the (j-1)-th and
the j-th access to element x_i (the 0-th access is the initial con-
figuration). Hence

   Σ_{j=1}^{k_i} (t_j^i - i) ≤ k_{i+1} + ... + k_n
for all i, 1 ≤ i ≤ n, and therefore

   C_MFR(s) - C_FDR(s) ≤ Σ_{i=1}^{n} (k_{i+1} + ... + k_n)
                       = Σ_{i=1}^{n} (i-1)·k_i = C_FDR(s) - m .   □
to charge the excess cost to those x h ' h > i, which are in front of xi
when item xi is accessed for the j-th time. In other words, whenever
we access x h then we charge an extra h-1 time units to this access and
use these time units to pay for the excess cost of accesses to elements
xi' i < h. In this way, the amortized cost of accessing x h is 2h - 1.
Proof: We use the bank account paradigm. Let s' be a prefix of s and
let L_MFR and L_A be the sequences obtained by processing s' under
algorithm MFR and A respectively. Assume w.l.o.g. that L_A = (x_1 x_2 ... x_n).
We define our account by

   bal(L_MFR) = Σ_{i=1}^{n} |{x_h ; h > i and x_h is in front of x_i in L_MFR}|
              = Σ_{h=1}^{n} |{x_i ; h > i and x_i is behind x_h in L_MFR}|
Note that the balance of L_MFR is defined with respect to the numbering
of elements defined by list L_A. If we want to emphasize this dependency
we write bal(L_MFR, L_A) instead of bal(L_MFR).
Proof: a) Let bal' = bal(L'_MFR, L'_A) and bal = bal(L_MFR, L_A) where the
prime denotes the lists after performing operation Op. Since the actual
cost of Op is t its amortized cost is t + bal' - bal. It therefore
suffices to show that t + bal' - bal ≤ 2i - 1.
Let k be the number of items x_h with h > i such that x_h is in front of x_i
in L_MFR. Then k + (i-1) ≥ t - 1 and hence k ≥ t - i.
b) and c). Note first that the actual cost of a free or paid exchange
of algorithm A is zero since no action of algorithm MFR is involved.
Also, a free exchange decreases the balance of list L_MFR by one and a
paid exchange can increase the balance by at most one. □
We can now finish the proof by observing that the balance of the initial
list is zero, that we deal with a positive account, and by appealing to
the bank account paradigm. Hence

   C_MFR(s) ≤ 2·C_A(s) - m + n²

by the proof of theorem 1 and since bal(L_MFR, L_A) ≤ n².
This inequality can be interpreted as follows. The MF-rule needs O(n²)
time units in order to make up for the unfavorable initial configuration;
after that its cost is at most twice the cost of algorithm A.
For the analysis of the MF and the TR rules we use Markov chains. The
following diagram illustrates the transposition rule for a set of three
elements. The diagram has 6 = 3! states.
The move-to-front rule also induces a Markov chain with n! states. The
states correspond to the n! linear arrangements of set S, i.e.
permutation π represents a linear list where x_i is at position π(i).
Furthermore, there is a transition from state π to state ρ if there is
an i such that ρ(i) = 1, ρ(j) = π(j) + 1 for all j with π(j) < π(i),
and ρ(j) = π(j) otherwise. This transition has probability β_i. It is
easy to see that this Markov chain is irreducible and aperiodic and
therefore stationary probabilities γ_π exist. γ_π is the asymptotic
(limiting) probability that the chain is in state π, i.e. that x_i is
at position π(i) for 1 ≤ i ≤ n. Then
a) P_MFR = Σ_i β_i·(1 + Σ_{j≠i} δ(j,i)), where δ(j,i) is the asymptotic
   probability that x_j is in front of x_i; the asymptotic expected position
   of x_i is p_i = Σ_π γ_π·π(i) = 1 + Σ_{j≠i} δ(j,i).

b) δ(j,i) = β_j/(β_i + β_j) .   □
a) P_MFR = 1 + 2·Σ_{j<i} β_i·β_j/(β_i + β_j)

b) P_MFR ≤ 2·P_FDR - 1

Proof: a) We have

   P_MFR = Σ_i β_i·(1 + Σ_{j≠i} β_j/(β_i + β_j))       by lemma 2b
         = 1 + Σ_{j≠i} β_i·β_j/(β_i + β_j)
         = 1 + 2·Σ_{j<i} β_i·β_j/(β_i + β_j)
         ≤ 1 + 2·Σ_i β_i·(i-1) = 2·P_FDR - 1 .   □
i
Insert(x,T): insert x into tree T and return the resulting tree (i.e. a
pointer to its root)
We will now show how to reduce all other operations to operation Splay.
In order to do Access(x,T) we do Splay(x,T) and then inspect the root.
Note that x is stored in tree T iff x is stored in the root of the tree
returned by Splay(x,T). To do Insert(x,T) we first do Splay(x,T), then
split the resulting tree into one containing all items less than x
and one containing all items greater than x (this is tantamount to
breaking one of the links leaving the root) and then build a new tree
with the root storing x and the two trees being the left and right
subtree. To do Delete(x,T) we do Splay(x,T), discard the root and
join the two subtrees T_1, T_2 by Join2(T_1,T_2). To do Join2(T_1,T_2) we do
Splay(+∞,T_1), where +∞ is assumed to be larger than all elements of U,
and then make T_2 the right son of the root of the resulting tree. Note
that Splay(+∞,T_1) makes the largest element of T_1 the root and hence
creates a tree with an empty right subtree. To do Join3(T_1,x,T_2) we
make T_1 and T_2 the subtrees of a tree with root x. To do Split(x,T)
we do Splay(x,T) and then break the two links leaving the root. Finally,
to do Change Weight(x,T,δ) we do Splay(x,T).
The following diagram illustrates how all other operations are reduced
to Splay.

[Figure: Access(x,T), Insert(x,T), Join2(T_1,T_2) and Split(x,T), each reduced to a call of Splay(x,T) (respectively Splay(+∞,T_1)) followed by inspecting the root or relinking subtrees]
   v ← root of T;
   while v ≠ nil and v↑.content ≠ x
   do u ← v;
      if x < v↑.content
      then v ← v↑.lson else v ← v↑.rson fi
   od;
   if v ≠ nil then u ← v fi
Case 2: Node u has a father v and a grandfather w. Let v be the a-son
of w and let u be the b-son of v, a,b ∈ {left, right}.
Case 2.1: a ≠ b. We perform a rotation at v followed by a rotation at
w. This is tantamount to a double rotation at w.
Case 2.2: a = b. We perform a rotation at w followed by a rotation at
v.
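A compact, self-contained Python sketch of the splay operation (my own, not the book's program; it uses the common recursive formulation, where the zig, zig-zig and zig-zag cases correspond to case 1, case 2.2 and case 2.1 above):

    class SplayNode:
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def rotate_right(x):
        y = x.left
        x.left, y.right = y.right, x
        return y

    def rotate_left(x):
        y = x.right
        x.right, y.left = y.left, x
        return y

    def splay(t, key):
        """Bring the node with `key` (or the last node on the search path) to the root."""
        if t is None or t.key == key:
            return t
        if key < t.key:
            if t.left is None:
                return t
            if key < t.left.key:                      # zig-zig (case 2.2)
                t.left.left = splay(t.left.left, key)
                t = rotate_right(t)
            elif key > t.left.key:                    # zig-zag (case 2.1)
                t.left.right = splay(t.left.right, key)
                if t.left.right is not None:
                    t.left = rotate_left(t.left)
            return t if t.left is None else rotate_right(t)   # final zig
        else:
            if t.right is None:
                return t
            if key > t.right.key:                     # zig-zig (case 2.2)
                t.right.right = splay(t.right.right, key)
                t = rotate_left(t)
            elif key < t.right.key:                   # zig-zag (case 2.1)
                t.right.left = splay(t.right.left, key)
                if t.right.left is not None:
                    t.right = rotate_right(t.right)
            return t if t.right is None else rotate_left(t)   # final zig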
[Figure: example splays on a ten-node tree, showing splay(1,T), splay(2,T), splay(7,T) and splay(6,T)]
Before we can analyse splay trees we need some more notation. For x ∈ U
we write w(x) to denote the weight of element x. For v a node of a
splay tree we write tw(v) to denote the sum of the weights of all
elements which are stored in descendants of node v. Note that the total
weight of a node changes over time. If T is a splay tree we write tw(T)
instead of tw(root(T)).
   bal(T) = Σ_{v node of T} r(v) .
For the following lemma we assume that each splay step takes time 1.
We have:
Proof: In this proof we use r' (u), r' (v), r' (w) to denote the ranks of
the various nodes after the splay step. Note that r'(u) = r(v) in case
1 and r' (u) = r(w) in cases 2.1. and 2.2.
We are now ready to discuss the various cases of the splay step. Note
that the actual cost is one in all three cases.
Assume next that r(w) = r(u). Then r(w) = r(v) = r(u) = r'(u).
Also r'(w) ≤ r'(u) = r(u) and r'(v) ≤ r'(u) = r(u).
If r'(w) = r'(v) then r'(u) ≥ r'(w) + 1 and hence r'(w) + r'(v)
≤ 2r(u) - 1. This shows that the amortized cost is bounded above by
zero (= 3(r(w) - r(u))) in this case.
Assume first that r(w) > r(u). Then we conclude further, using
1 ≤ r(w) - r(u), r(u) ≤ r(v) ≤ r(w) and r'(w) ≤ r'(v) ≤ r'(u) = r(w),
that the amortized cost is bounded by

   r(w) - r(u) + r(w) + r(w) - r(u) - r(u) = 3(r(w) - r(u)) .
Assume next that r(w) = r(u). Then r(w) = r(v) = r(u) = r'(u) and
r'(u) ≥ r'(v) ≥ r'(w). If r'(u) > r'(w) then the amortized cost is
bounded by zero (= 3(r(w) - r(u))) and we are done. So assume r'(u) =
r'(w). Consider the middle tree in the figure illustrating case 2.2.
We use r̄(u), r̄(v), r̄(w) to denote the ranks in this tree. We have
r̄(u) = r(u), r̄(w) = r'(w) and r̄(v) = r(w). If r'(w) = r'(u) then
r̄(w) = r'(u) = r(w) = r(u) = r̄(u) and hence r̄(v) > r̄(w) and
therefore r'(u) > r'(w), a contradiction. Hence r'(u) > r'(w) always.
This finishes the proof in case 2.2. □
Theorem 3:

The amortized cost of Splay(x,T) is O(log(tw(T)/tw(x))).

The amortized cost of Delete(x,T) is O(log(tw(T)/min(tw(x), tw(x^-)))),
where x^- is the predecessor of x in tree T.

The amortized cost of Join2(T_1,T_2) is O(log((tw(T_1) + tw(T_2))/tw(x))),
where x is the largest element in tree T_1.

The amortized cost of Join3(T_1,x,T_2) is O(log((tw(T_1) + tw(T_2))/w(x))).

The amortized cost of Insert(x,T) is O(log(tw'(T)/min(tw(x^-), tw(x^+), w(x)))).
We close this section with a comparison of splay trees and static trees.
Proof: Let U be the set of elements stored in the static tree. Let d be
the depth of the static tree and let d(x) be the depth of element x ∈ U
in the static tree. We use the weight function w(x) = 3^{d-d(x)}.
Consider any splay tree T for set U. The balance bal(T) is at most
Σ_{x∈U} r(x) ≤ Σ_{x∈U} (d+1) = O(n²). Also the total weight of T is at most

   Σ_{x∈U} w(x) ≤ Σ_{i=0}^{d} 2^i·3^{d-i} ≤ 3^{d+1}

and therefore the amortized cost of operation Access(x,T) is
O(log(tw(T)/tw(x))) = O(d(x) + 1).
Since d(x) + 1 comparisons are needed to search for x in the static tree
this implies that the amortized cost of the sequence of operations is
O(t). Also the balance of the initial tree is O(n²). This proves the
bound. □
d) Consider the j-joint v. w'_j j-leaves are to the left of v and w''_j j-
leaves are to the right of v. If w'_j ≥ w''_j then the j-node of minimal
depth to the left of v is active, otherwise the j-node of minimal
depth to the right of v is active.
The following lemma shows that D-trees are good search trees.
Proof: Let v be the father of the active j-node. Then all j-leaves
which are on the same side of the j-joint as the active j-node are
descendants of v. Hence th(v) ≥ w_j/2. The argument of theorem 2 of
section 5.1. finishes the proof. □
[Figure: double rotation to the left about A; rotation to the left about A; rotation to the right about B; splitting rotation at a 4-joint]
Example: Rotation to the left about the father of the active 2-node
If s < w''_j then B will be the j-joint after the rotation. The distribu-
tion of j-leaves with respect to B is w'_j + s, w''_j - s.
Case 2.1: w'_j + s ≤ w''_j - s. Then w'_j ≤ w''_j and the active j-node was to
the right of A, in fact it was node T_2. Also the active j-node will be
to the right of B after the rotation and it still is to the right of A.
Hence we only have to copy A's query into B.
Case 2.2: w'_j + s > w''_j - s. Then the active j-node will be to the left
of B after the rotation, and hence it will be node T_2.
Case 2.2.1: w'_j ≥ w''_j. Then T_2 also was the active j-node before the
rotation. No additional action is required in this case.
[Figure: position of the active j-node and the 2-joint before and after the rotation; unaffected subtrees marked "same as before"]
The basic idea for compact D-trees is to only store special nodes
(= active nodes and joints) and nodes having special nodes in both
subtrees. Clearly there are only O(n) such nodes. All other nodes are
deleted and only some information about their size is kept. The diagram
below shows a D-tree for distribution (3,20,4,10) and also its compacted
version. In the compacted tree
Let us turn to amortized update cost next. Rotations and double rota-
tions in D-trees are only caused by the underlying BB[α]-tree. Hence
the results of section 5.1. carry over without any change. In particular,
theorem 5 of section 5.1. carries over.
Dynamic weighted trees have numerous applications. The most obvious one
is the organization of sets under changing or unknown access
probabilities. In this section we describe an application to TRIES and
exact match queries in multi-dimensional spaces. Let U be an ordered
set, d be an integer (= dimension), and let S ⊆ U^d. We want to maintain
set S under searches, insertions and deletions taking comparisons
between elements of U as our basic operation. One solution immediately
comes to mind: Order U^d by lexicographic order, i.e. (x_1,...,x_d)
≤ (y_1,...,y_d) if there is an i such that x_j = y_j for j < i and x_i < y_i,
and use ordinary balanced trees. This will allow us to perform all
three operations in time O(d·log |S|) since a comparison between two
elements of S takes time O(d). We will show next that this can be
improved to time O(d + log |S|).
   Σ_{i=1}^{d} (cost of the i-th search)
   ≤ Σ_{i=1}^{d} O(1 + log(p_{x_1···x_{i-1}}/p_{x_1···x_i}))        (by theorem 3 or 5)
   = O(d + Σ_{i=1}^{d} (log p_{x_1···x_{i-1}} - log p_{x_1···x_i}))
   = O(d + log n - log 1) = O(d + log n)

   Σ_{j=1}^{i} O(1 + log((p_{x_1···x_{j-1}} + 1)/p_{x_1···x_j})) + O(log p_{x_1···x_i}) + O(d - i)

   = O(d + Σ_{j=1}^{i} (log(2·p_{x_1···x_{j-1}}) - log p_{x_1···x_j}) + log p_{x_1···x_i})

   = O(d + log n)
   Hashing                       O(β) expected access
   Perfect hashing               O(1) access
   Interpolation search          O(log log n) expected access
   Static weighted search tree   O(log 1/p) access
   BB[α]-trees                   O(log n) access

(The remaining columns of the original table indicate which further operations and features each structure supports; they are discussed in the text below.)
In the table N denotes the size of the universe, n denotes the size of the
set stored in the data structure, p is the probability of the
element to be accessed, k is the branching factor of the TRIE and β is
the load factor of the hash table.
There are three data structures which support only operation Access:
Perfect hashing, interpolation search and static weighted search trees.
Perfect hashing guarantees an access time of 0(1) and uses about 3n
words of storage for the hash table and essentially n additional words
for the hash program. It can be used for secondary storage and then
requires two accesses to secondary storage, namely one to pick up the
relevant part of the hash program and one to access the actual data.
Perfect hashing (as all hashing methods) does not support sequential
access to the data in its natural order. Note however, that sequential
access in "hashed order" is possible. The other two methods support
sequential access. Interpolation search provides us with O(loglogn)
expected access time and requires no additional storage. Static weighted
trees support weighted data and allow us to access an item with access
probability p in time O(log 1/p). We have seen that expected access time
to a collection of items is 0(1) for a variety of frequency distribu-
tions. Multi-way weighted trees which were not discussed in the text,
can also be used in secondary storage. TRIES, hashing and balanced trees
(weight-balanced or height-balanced) can be used to handle dynamic
unweighted data. Hashing (with chaining) has an O(β) expected access
time, although worst case access time is O(n) and expected worst case
access time is O(log n), and supports Access, Insert and Delete.
The expected behavior of hashing with chaining can be turned into
probabilistic behavior by the use of universal hashing. For secondary
storage extendible hashing is particularly appropriate. It requires only
two accesses to secondary storage. Extendible hashing is closely related
to TRIE searching on the hashed keys. TRIES for the unhashed keys
support also sequential access. The branching factor k can be used to
trade between access time and storage requirement.
Balanced trees have logarithmic worst case behavior and also support
additional operations such as split, concatenate and finger searches.
Among the balanced trees, (a,b)-trees are the most flexible and elegant
version. The realization by colored trees is particularly useful.
(a,b)-trees with b ≥ 2a exhibit small amortized rebalancing cost which
   TOP ← TOP + 1;
   A[i] ← TOP;
   K[TOP] ← i;
   BV[i] ← b;
It is easy to see that this piece of code maintains the invariant. Thus
initialization and operations Access, Insert and Delete take time O(1)
each.
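The code above appears to be the standard "virtual initialization" trick; a Python sketch of one possible implementation (names are mine, and unlike a low-level implementation Python actually initializes the helper arrays, but the point is that correctness never depends on their initial contents):

    class VirtualArray:
        """Array of size n whose cells behave as if initialized to a default value,
        without ever touching all n cells.  Cell i is valid iff A[i] < top and
        K[A[i]] == i; only valid cells have been written."""
        def __init__(self, n, default=0):
            self.default = default
            self.val = [None] * n     # BV in the text: the stored values
            self.A = [0] * n          # A[i]: position of i's certificate in K
            self.K = [0] * n          # K[t]: which index was written t-th
            self.top = 0              # TOP: number of cells written so far

        def _valid(self, i):
            return self.A[i] < self.top and self.K[self.A[i]] == i

        def write(self, i, b):
            if not self._valid(i):    # first write to cell i: create its certificate
                self.A[i] = self.top
                self.K[self.top] = i
                self.top += 1
            self.val[i] = b

        def read(self, i):
            return self.val[i] if self._valid(i) else self.default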
Definition: A k-structure T for set S ⊆ [0..2^k - 1] consists of

1) an integer T.SIZE = |{x_1, ..., x_m}|.

4) A k'-structure T.TOP and an array T.BOTTOM[0..2^{k'} - 1] of k''-
structures. If m ≤ 1 then structures T.TOP and T.BOTTOM[i],
0 ≤ i ≤ 2^{k'} - 1, are all empty, i.e. their size fields are 0, the list is
empty and the bitvector is all zero. If m > 1 then T.TOP is a k'-struc-
ture for set {x'_1, ..., x'_m} and for each x' ∈ [0..2^{k'} - 1] T.BOTTOM[x'] is a
k''-structure for set {x''_i ; 1 ≤ i ≤ m and x' = x'_i}.   □
[Figure: the components T.B, T.P and T.LIST of a k-structure]
say x', at depth ⌈(log N)/2⌉, i.e. to its ancestor half-way up the tree.
In our implementation the nodes at depth ⌈(log N)/2⌉ are represented by
array T.BOTTOM, structure T.TOP represents the top half of the tree and
structures T.BOTTOM[i], 0 ≤ i ≤ 2^{⌈(log N)/2⌉} - 1, represent the bottom
half of the tree. The pointers from leaves x to ancestors x' are not
provided explicitly; we simulate them by a simple division by 2^{k''}.
Suppose next that we could decide in time O(1) whether x and succ(x)
have the same ancestor at depth ⌈(log N)/2⌉. In our implementation we
can do so by comparing x'' with the last element on list T.BOTTOM[x'].
LIST. If x and succ(x) have the same ancestor then we can restrict our
search to the O(√N) descendants of node x', i.e. we only need to find
the successor of x'' in T.BOTTOM[x']. If x and succ(x) do not have the
same ancestor then succ(x) is the first element of the list
T.BOTTOM[z'].LIST where z' is the successor of x' in the top half
T.TOP of the tree. Thus in either case we can reduce the size of the
problem from N to O(√N) in time O(1) and hence a time bound of
O(log log N) should result. The details are given below.
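A compact Python sketch of this recursive successor idea (my own, with invented names; the text's program works directly on the k-structure fields, and it keeps minimum/maximum with each structure so that the max()/min() calls below would be O(1); the insert shown here is not the optimized one):

    class KStruct:
        def __init__(self, k):
            self.k = k
            self.keys = set()                 # stands in for SIZE/LIST
            if k > 1:
                self.k2 = k // 2              # number of low-order bits (k'')
                self.top = KStruct(k - self.k2)
                self.bottom = {}              # x' -> KStruct(k2)

        def insert(self, x):
            self.keys.add(x)
            if self.k > 1:
                hi, lo = x >> self.k2, x & ((1 << self.k2) - 1)
                self.top.insert(hi)
                self.bottom.setdefault(hi, KStruct(self.k2)).insert(lo)

        def succ(self, x):
            """Smallest stored element > x, or None."""
            if not self.keys or x >= max(self.keys):
                return None
            if self.k == 1:
                return 1                                    # keys are 0/1 and x < max
            hi, lo = x >> self.k2, x & ((1 << self.k2) - 1)
            bot = self.bottom.get(hi)
            if bot is not None and lo < max(bot.keys):      # same ancestor: recurse bottom
                return (hi << self.k2) | bot.succ(lo)
            z = self.top.succ(hi)                           # otherwise: next non-empty high part
            return (z << self.k2) | min(self.bottom[z].keys)

    s = KStruct(8)
    for v in (3, 9, 200):
        s.insert(v)
    assert s.succ(9) == 200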
Lemma 1: The program above correctly computes SUCC in time O(⌈log k⌉).
Proof: Note first that the successor of x in T does not exist iff
either T is empty or x is at least as large as all elements of T. So
suppose that T is non-empty and x has a successor and let x', x'' be the
two parts of x. We have to distinguish two cases; either there is a
point in T with first part equal to x' or there is not. If there is a
point y in T with y' = x' then we find the successor of x'' among the
points with first part equal to x'. If the successor exists then we are
done (line 8). If it does not exist or if there is no point in T with
first part equal to x' then we have to find the successor of x' in the
top structure (line 9) and combine it with the first element in the
appropriate bottom structure (line 10). This shows correctness.
Next we analyze the running time of function SUCC. Let R(k) be the
maximal running time of SUCC on any k-structure. Then R(1) = c for some
constant c. Next note that the running time of SUCC(x,T) is O(1) if x
does not have a successor in T. This is true for any k. Thus the time
spent in line (7) is O(1) and we obtain the recurrence R(k) ≤ c + R(⌈k/2⌉)
for R where c is some constant. Note that we generate one re-
cursive call on either a ⌊k/2⌋- or a ⌈k/2⌉-structure and that only a
constant amount of time is spent outside the recursive calls. It is now
easy to verify by induction on k that R(k) ≤ c·(1 + ⌈log k⌉). This is ob-
vious for k = 1. For k > 1 we have

   R(k) ≤ c + R(⌈k/2⌉) ≤ c + c·(1 + ⌈log⌈k/2⌉⌉) ≤ c·(1 + ⌈log k⌉)     (recurrence for R)

For the last step, let ℓ be such that 2^ℓ < k ≤ 2^{ℓ+1}. Then ⌈log k⌉ =
ℓ + 1 and 2^{ℓ-1} < k/2 ≤ 2^ℓ and hence ⌈log⌈k/2⌉⌉ = ℓ. □
The algorithms for Insert and Delete are now easily derived.
(1) procedure INSERT(x: [0..2^k - 1], T: k-structure);
(2) if T.B[x] = false

inserted into an empty structure. Thus only one "real" recursive call
is initiated. The same statement is true if T has more than one element
(lines 13 and 14). Either x' is already an element of T.TOP and then
line 13 is trivial or x' is not yet an element of T.TOP and then line
14 is trivial. In either case the cost arising in lines 6 - 15 is
bounded by c + R(⌈k/2⌉) for some c. Line 4 takes time O(⌈log k⌉). Thus
the following recurrence arises for R: R(k) ≤ c·(1 + ⌈log k⌉) + R(⌈k/2⌉)
which solves for R(k) = O((log k)²). This is not quite what we wanted.
   0:  do nothing;
   1:  DELETE(x', T.TOP);
       DELETE(x'', T.BOTTOM[x']);
  ≥2:  DELETE(x'', T.BOTTOM[x']);
       if T.BOTTOM[x'].SIZE = 0
       then DELETE(x', T.TOP)
       fi
   esac
fi
end
   Sp(1) = c
   Sp(k) = c·2^k + Sp(⌈k/2⌉) + 2^{⌈k/2⌉}·Sp(⌊k/2⌋)     for k ≥ 2

We summarize in
The space bound and the time for initialization can be improved to O(N)
with a little care without sacrificing the O(log log N) bound for the
other operations (cf. exercise 47).
The most obvious solution for the union-find problem is to use an array
IS_IN[O •• N-1]. IS_IN[i] contains the name of the set containing i.
Then execution of Find(x) reduces to table-lookup and therefore
takes constant time. Union(A,B,C) is simple but slow. We scan through
array IS_IN and change all A's and B's to C's in time O(N). A small
extension of the data structure allows us to reduce the time for the
Union operation. We associate with every set name a linear list, the
list of its members. Then we only have to step through the elements of
A U B in order to process Union(A,B,C). One further extension requires
us to only step through either the elements of A or the elements of B;
e.g. we might step through the elements of A and change all A's to B's.
In addition, we record the fact that the B's in array IS_IN are really
C's. This is best done by distinguishing between the internal name
(the name appearing in array IS_IN) and the external name (the name
appearing in the Union operations) of a set and to provide for two
arrays MAPOUT and MAPIN; MAPOUT[i] is the external name of the set with
internal name i and MAPIN[A] is the internal name of the set with ex-
ternal name A.
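A short Python sketch of this Find-favouring solution (the array and map names follow the text; the class layout is my own). Union relabels the members of the smaller set, so each element is relabelled at most log n times and the amortized cost per Union is O(log n):

    class UnionFindByRelabeling:
        def __init__(self, n):
            # initially element i forms the singleton set with external name i
            self.is_in = list(range(n))                # internal name of i's set
            self.members = {i: [i] for i in range(n)}  # internal name -> member list
            self.mapout = {i: i for i in range(n)}     # internal -> external name
            self.mapin = {i: i for i in range(n)}      # external -> internal name

        def find(self, x):                             # O(1)
            return self.mapout[self.is_in[x]]

        def union(self, a, b, c):
            """Unite external sets a and b into a set with external name c."""
            ia, ib = self.mapin.pop(a), self.mapin.pop(b)
            if len(self.members[ia]) > len(self.members[ib]):
                ia, ib = ib, ia                        # ia is the smaller set
            for x in self.members[ia]:                 # relabel the smaller set only
                self.is_in[x] = ib
            self.members[ib] += self.members.pop(ia)
            del self.mapout[ia]
            self.mapout[ib] = c
            self.mapin[c] = ib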
Our first approach to the union-find problem favours Finds. A Find costs
0(1) time units, whilst the (amortized) cost of a Union is O(log n)
time units. We will next describe a solution which favours Unions. Each
set is represented by a tree. The nodes and leaves of the tree are the
elements of the set and the root is in addition labelled by the (inter-
nal) name of the set. Then Unions are particularly easy. In order to
unite A and B we only have to make the root of the tree labelled A a
son of the root of the tree labelled B, or vice versa. Thus a Union
operation takes constant time. The cost of Find(x) is proportional to
the depth of node x in the tree containing x; access node x (via an
array), traverse the path to the root, and look up the label of the
root. Since the depth of a node is at most n (again assuming that n
Unions are performed) the cost of any Find is at most O(n).
Again the running time can be reduced drastically by applying the idea
of weighting. When executing Union(A,B,C) we make the root of the
smaller set a son of the root of the larger set. Ties are broken arbi-
trarily. We refer to this rule as the weighted union rule. Before we
can analyse this algorithm we need some notation.
Definition: Suppose that we start with singleton sets and perform n-1
Unions. Let T be the resulting tree. For any x ∈ U let rank(x) be the
height of x in T and let weight(x) be the number of descendants of x
in T. □
Lemma 6: If the weighted union rule is used then weight(x) ≥ 2^{rank(x)}
for all x ∈ U. In particular, rank(x) ≤ log n.

   weight(x) ≥ 2·weight(y)
            ≥ 2·2^{rank(y)}      by induction hypothesis
            = 2^{rank(x)}

Finally, n-1 unions can build sets of size at most n and hence
weight(x) ≤ n for all x. □
Proof: A Union takes time O(1). Also the maximal height of any tree
built by the unions is log n by lemma 6. Thus a Find takes time at most
O(log n). □
The details of the Find algorithm are given below. In this program we
assume that an array FATHER[0..N-1] is used to store son-to-father
pointers. More precisely, if FATHER[x] ≠ x then FATHER[x] is the
father of x; if FATHER[x] = x then x is a root.
procedure Find(x);
   r ← x;
   while r ≠ FATHER[r] do r ← FATHER[r] od;
   while x ≠ r do y ← x; x ← FATHER[x]; FATHER[y] ← r od;
   output the label of r
end
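For reference, a runnable Python version of the tree representation with weighted union and path compression (a sketch of my own; the set labels kept at the roots are omitted for brevity):

    class UnionFind:
        def __init__(self, n):
            self.father = list(range(n))   # father[x] == x  <=>  x is a root
            self.weight = [1] * n          # number of descendants of each root

        def find(self, x):
            r = x
            while self.father[r] != r:     # first pass: climb to the root
                r = self.father[r]
            while x != r:                  # second pass: path compression
                nxt = self.father[x]
                self.father[x] = r
                x = nxt
            return r

        def union(self, x, y):
            rx, ry = self.find(x), self.find(y)
            if rx == ry:
                return
            if self.weight[rx] < self.weight[ry]:   # weighted union rule
                rx, ry = ry, rx
            self.father[ry] = rx
            self.weight[rx] += self.weight[ry]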
   A(0,x) = 2x                         for all x ≥ 0
   A(i,0) = 1                          for all i ≥ 1
   A(i+1, x+1) = A(i, A(i+1, x))       for all i, x ≥ 0

In particular, if log n < A(3,4) = 2^{2^{···^{2}}} (a tower of 65536 twos) then α(m,n) ≤ 3.
Thus for all practical purposes, α(m,n) ≤ 3; the union-find algorithm
Let T be the tree formed by the n-1 unions, let rank (x) be the height
of node x in tree T. If (x,y) is an edge of T (in the sequel we will
always assume that edges point towards the root) then rank(x) < rank(y).
The total cost of all partial finds is proportional to the total num-
ber of edges traversed by all partial finds. Let F be that set. We
show IFI = O(ma(m,n» by partitioning the edges of F into groups and
then bounding the number of edges in each group. Edges are put into
for 0 ≤ k ≤ z and

b) |N_0 - L_0| ≤ n

c) and d): Consider any node x and let x ∈ G_{kj}, i.e. A(k,j) ≤ rank(x)
< A(k,j+1). Suppose that there are q edges (x,y_1), ..., (x,y_q) ∈ N_k - L_k
where rank(y_1) ≤ rank(y_2) ≤ ... ≤ rank(y_q). For all i, 1 ≤ i < q, there
must be an edge (s_i, t_i) ∈ N_k following (x,y_i) on the same partial find
path. Thus rank(y_i) ≤ rank(s_i) < rank(t_i). Also, after path compression,
x's father is an ancestor of t_i (maybe t_i itself). In particular,
rank(t_i) ≤ rank(y_{i+1}).
Also since (x,y_i) ∉ N_{k-1}, (s_i,t_i) ∉ N_{k-1} there must be j', j'' such that
and hence rank(y_i) < A(k-1,j'') ≤ rank(y_{i+1}). Thus there exists j''' such
that
Proof: Consider any ℓ with A(k,j) ≤ ℓ < A(k,j+1). We count the number
of nodes x ∈ G_{kj} with rank(x) = ℓ. Any two such nodes have disjoint
sets of descendants in T. Also any such node has at least 2^ℓ descendants
by lemma 6 and hence there are at most n/2^ℓ nodes of rank ℓ. Thus

   Σ_{A(k,j) ≤ ℓ < A(k,j+1)} n/2^ℓ ≤ 2n·2^j/2^{2^j} ,     since A(k,j) ≥ 2^j,

and

   Σ_{j≥2} 2n·2^j/2^{2^j} ≤ 5n/8 .
   |F| ≤ Σ_{k=0}^{z+1} |L_k| + Σ_{k=0}^{z+1} |N_k - L_k|
       = O(m·α(m,n))

since m ≥ n. □
The running time of the union-find algorithm with path compression and
weighted union is almost linear. However, it is not linear. Tarjan has
shown that almost any conceivable union-find algorithm requires time
Ω(m·α(m,n)) to process a sequence of n-1 Unions and m ≥ n Finds. Appli-
cations of the Union-Find problem are discussed in exercises 48 to 52.
III.9. Exercises
3) Show that the expected worst case cost of hashing with chaining is
Θ((log n)/log log n) provided that 0.5 ≤ β ≤ 1. Note that theorem 3 of
section 2.1. only states an upper bound.
8) Let U be the set of real numbers between 0 and 1. Discuss the use
of the hash function h: x → ⌊x·m⌋ where m is the size of the hash table.
9) Let π be a permutation of 1,2,...,n. Let T[π] be the tree obtained
by inserting π(1), ..., π(n) in that order into an initially empty tree.
Insertion is done as described in section 3.1.
b) Let d(T[π]) be the depth of T[π]. Show that Σ_π d(T[π])/n! = O(log n),
i.e. the expected depth of a randomly grown tree is O(log n). Note
that part b) does not follow from part a).
11) Let L_1, ..., L_n be sets of words. We want to compute the product
(catenation) L_1 L_2 ··· L_n. Assume that it costs p·q time units to multiply
a set of size p with a set of size q and that the resulting set has
size p·q. Find the optimal evaluation order.
17) A string x = a_1···a_n is a subsequence of string y = b_1···b_m if
there are i_1 < i_2 < ... < i_n such that a_j = b_{i_j} for 1 ≤ j ≤ n. Design
19) Show that the upper bounds given in theorems 7 and 8 are existen-
tially sharp.
22) Design a balanced tree scheme where the worst case rebalancing cost
after an insertion or deletion is O(log log n). Hint: Use weight-
balanced trees. If n elements are stored then rebalance up to height
a) Show how to implement all three operations with worst case time
O(log n) .
a) Associate with every node x of the tree labels x_1 and x_2. For tree T
with root x and subtrees T_1, ..., T_k let Pre(T) = x_1 Pre(T_1)···Pre(T_k) x_2
and let Post(T) = x_1 Post(T_k)···Post(T_1) x_2. Show: x is an ancestor of
y in T iff x_1 precedes y_1 in Pre(T) and Post(T).
25) Let T be a binary tree. The height h(v) of node v is the length of the
longest path from v to a leaf. The height balance hb(v) of v is the height dif-
ference of the two sons of v, i.e. hb(v) = h(x) - h(y) where x (y) is
the left (right) son of v. Tree T is an AVL-tree if hb(v) ∈ {-1,0,+1}
for all nodes v of T.
26) Let T be a binary tree and v a node of T. Let s(v) be the length of
a shortest path from v to a leaf and let h(v) be the length of a longest
path to a leaf; h(v) is the height of v. T is half-balanced if h(v) ≤ 2·s(v)
for all nodes v. Show that a half-balanced tree of height h has

   n ≥ 2^{h/2+1} - 2          nodes if h is even,
   n ≥ 3·2^{(h-1)/2} - 2      nodes if h is odd

(proof by induction on h).
27) This exercise treats red-black trees in more detail (cf. the end
of section III.4).
a) Design an algorithm for deleting a leaf from a red-black tree.
b) In the text, we used red-black trees with leaf-oriented
storage organization, i.e. set S was stored in the leaves and internal
nodes are only used as a directory. Develop the algorithms for node-
oriented storage organization, i.e. set S is stored in the internal
nodes and leaves have empty content. In order to improve storage
utilization use a single record of type node to represent all leaves.
The diagram below shows a node-oriented red-black tree for set
S = {2,4,7,S}.
Use the "single leaf" as a sentinel in searches, i.e. if you search
for x then store x in the sentinel first so as to make every search
successful.
c) Discuss operations Split and Concatenate for red-black trees.
Use both storage organizations.
28) Show that the following statement is false: There is c > 1 such
that for every (2,4)-tree with n leaves and node v of depth d we have:
the number of leaves below v is bounded by n/c^d. Note that the statement
is true for BB[α]-trees.
34) Show that theorem 14 (of 5.3.3.) stays true if ordinary (2,4)-trees
are used instead of level-linked (2,4)-trees.
36) Use fringe analysis to improve upon theorem 8 of 5.3.2. for se-
quences of random insertions. (Hint: Use it to improve upon fact 4 in
the proof of theorem 8).
42) Write a RAM program for the move to front rule realizing the linear
list a) as an array and b) as a linked list. If the linear list is re-
alized as an array then moving an element from the i-th position to the
front also requires physically moving the elements in positions 1,...,i-1
one position to the rear. This can be done simultaneously with the
search. Compute expected running time and compare with the running time
for the optimal arrangement.
π(i) < π(j). Can you generalize self-organizing linear search to this
situation?
44) Set S = {x_1, ..., x_n} and let β_i be the probability of accessing x_i.
Assume β_1 ≥ β_2 ≥ ... ≥ β_n. Show that the stationary probabilities under
the transposition rule are

   γ_π = c·Π_{i=1}^{n} β_i^{-π(i)} ,

where c is such that Σ_π γ_π = 1.
45) Use the explicit expression given for P_MFR in theorem 1 to compute
P_MFR and P_MFR/P_opt for the following distributions
of k requests.
47) Let S ⊆ [0..N-1], N = 2^k·log k for some k. Let S_i =
{x ∈ S; i·log k ≤ x < (i+1)·log k} for 0 ≤ i < 2^k. Represent S as
follows. Store {i; S_i ≠ ∅} in a priority queue as described in section
8.2. and store each non-empty S_i in a linear list. Show that the data
structure requires linear space O(N), can be created in time O(N) and
supports Insert, Delete, Min in time O(log log N).
49) Discuss the following weighted union rule for the Union-Find prob-
lem: If trees T1 and T2 are to be united then make the tree of smaller
height a subtree of the tree of larger height. Ties are broken arbi-
trarily. Show that the cost of a sequence of n unions and m finds is
O(n + m·log n).
50) Show that theorem 2 of section 8.3. is best possible, i.e. describe
a sequence of n unions and m finds which forces the algorithm to run for
Ω(m·log n + n) steps.
51) Consider a forest of disjoint trees over node set 0, ..., n-1 .
Link(r,v) makes r, the root of one of the trees, a son of v, a node in
another tree. Depth (v) outputs the current depth of node v. Show how to
use the union-find data structure to solve this problem. (Hint: If T is
one of the trees in the forest then the nodes of T form one of the sets
in the union find problem. Recall that these sets are stored as trees.
Maintain a function help on the nodes of the union-find trees such that
Depth(v) = Σ help(w), where the summation is over all ancestors of v in
the union-find tree containing v.)
52) Design a Union-Find algorithm with worst case complexity O(log n).
Can you do better?
The O(n²) dynamic programming algorithm for optimum binary search trees
is due to Knuth (71); theorem 2 on quadrangle inequalities is due to
F. Yao (80). If β_i = 0 for all i then an optimum tree can be constructed
in time O(n log n), cf. Hu/Tucker (71) or Garsia/Wachs (77). The linear
time simulation of 2DPDA's on RAMs is by Cook (71); the proof given in the
text follows Jones (77). The linear time pattern matching algorithm
The move-to-front rule for self-organizing linear search was first
analysed by McCabe (65). Exercises 44, 45 and 46 are taken from Rivest
(76), Hendricks (76), and Gonnet/Munro/Suwanda (79). The amortized
analysis of the move-to-front rule was started by Bentley/McGeoch (82)
and then refined by Sleator/Tarjan (83).
There are basically two ways for structuring a book on data structures
and algorithms: problem or paradigm oriented. We have mostly followed
the first alternative because it allows for a more concise treatment.
However, on certain occasions (e.g. section VIII.4 on the sweep para-
digm in computational geometry) we have also followed the second ap-
proach. In this last chapter of the book we attempt to review the en-
tire book from the paradigm oriented point of view.
In more structured state spaces one can use dynamic programming and
tabulation (111.4.1, IV.7.3, and VI.6.1). Dynamic programming is par-
ticularly useful when the problem space is structured by size in a
natural way and when solutions to larger problems are easily obtained
from solutions to (all, sufficiently many) smaller problems. In this
situation it is natural to solve all conceivable subproblems in order
of increasing size in a systematic way. The efficiency of dynamic pro-
gramming is directly related to the size of the state space. We en-
countered a large state space in the application to the travelling
salesman problem (VI.6.1) and a fairly small state space in the appli-
cation to optimum search trees (111.4.1) and least cost paths (IV.7.3).
On some occasions, e.g. III.4.1, the search could be restricted to a
suitably chosen subset of the state space.
split a planar graph of n nodes into two subgraphs of about half the
size by the removal of only O(√n) nodes. Moreover, the separating set
can be determined in linear time. We used the planar separator theorem
in several efficient algorithms on planar graphs, e. g. least cost path,
chromatic number, •••• The second strategy is given by the fact that a
planar graph always contains a large set of independent nodes of small
degree. We used this fact in searching planar subdivisions (VIII.3.2.1)
and in the hierarchical representation of convex polyhedra (VIII,
exercise 2).
Trees are a prime example for the divide - and - conquer paradigm. In
trees one either organizes the universe (section 111.1 on TRIES) or one
organizes the set to be stored in the tree. The latter approach was
used in sections 111.3 to 111.7 and leads to balanced trees. In these
trees one chooses an element of the set for the root which balances the
subproblems. In balanced trees for unweighted data balancing is done
either according to the cardinality of the subproblems (weight - bal-
anced trees) or according to the height of the subtrees (height - bal-
anced trees). In trees for weighted data balancing is done according to
the probability of the subproblems. We have also seen on two occasions
(111.6.1 on weighted dynamic trees for multidimensional searching and
VIII.5.1.3 on segment trees) that search structures for unweighted com-
plex data can sometimes be constructed from trees for simpler but weigh-
ted data. The former approach, i. e. organizing the universe, was used
in section 111.1 on TRIES and in the static version of interval, priori-
ty search, and segment trees (VIII.5.1). The organization of the uni-
verse gives rise to particularly simple tree structures.
Closely related to trees which organize the universe are key transfor-
mation (= hashing) and direct access (II.2, III.2 and III.8). In these
methods one uses the key or a transformed key in order to directly
access data. This immediately implies small running times. Another
application of the very same idea is presorting, i. e. transforming a
problem on an arbitrary set into a problem on a sorted set by sorting.
It is then often possible to identify the objects with an initial
segment of the integers which opens up all methods of direct access. We
used presorting in sections VII.2.2 on multi-dimensional divide-and-
conquer and in section VIII.5 on orthogonal objects in computational
geometry.
The method of reduction also played a major role in other parts of the
book. The entire chapter on NP - completeness is centered around the
notion of reduction or transformation. We used reductions to structure
the world of problems, to define and explore classes of equivalent
problems (VI.1 to VI.5), to transfer algorithms (from network flow to
matching in IV.9, from matrix product over the integers to matrix pro-
duct over the set of booleans in V.5, from iso - oriented objects to
general objects in VIII.5.2, and from straight-line to circular objects
in VIII.6), and to transfer lower bounds (from sorting to element uni-
queness in II.6, from decision trees to RAMs in II.3 and from boolean
matrix product to matrix product over semi-rings of characteristic zero
in V.7).
Worst case analysis (and amortized analysis, which is the worst case
analysis of sequences of operations) is the dominant method of analysis
used throughout this book. Expected case analysis was done in only a few
places, e.g. quicksort (II.3), selection (II.4), TRIES (III.1.1),
hashing (III.2), interpolation search (III.3.2), weighted trees (III.4),
self-organizing linear search (III.6.1.1), and transitive closure (IV.3).
Expected case analysis rests upon an a-priori probability distribution
on problem instances and therefore its predictions should be interpreted
with care. In particular, it should always be checked whether reality
conforms with the probability assumptions. Note, however, that the ex-
pected running time of many algorithms is fairly robust with respect to
changes in the distribution. For example, a near-optimal search tree
for one access distribution is also a near-optimal search tree for a
second distribution provided that the two distributions do not differ
too much. Furthermore, a careful analysis of algorithms with small
expected running time sometimes leads to fast algorithms with small
worst case running time (e.g. selection) or to fast probabilistic
algorithms (e.g. quicksort and hashing).
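For instance, a probabilistic variant of quicksort can be sketched as follows (illustrative Python; the book develops quicksort and its analysis in chapter II):

    import random

    def quicksort(a):
        # probabilistic variant: a random pivot makes the O(n log n)
        # expected running time hold for every input, without any
        # assumption on the distribution of problem instances
        if len(a) <= 1:
            return a
        pivot = random.choice(a)
        return (quicksort([x for x in a if x < pivot])
                + [x for x in a if x == pivot]
                + quicksort([x for x in a if x > pivot]))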
The last two principles which we are going to discuss are approximation
algorithms and probabilistic algorithms. These paradigms suggest either
changing the problem to be solved (solving a simpler problem) or
changing our notion of computation (using a more powerful computing
machine). We observed at several places that a "slight" change in the
formulation of a problem can have a drastic effect on its complexity:
the satisfiability problem with three literals per clause is NP-complete,
but with two literals per clause it becomes fairly simple; the precedence
constrained scheduling problem is NP-complete, but if the precedence re-
lation is a tree or there are only two machines then the problem is in
P. Similarly, the computation of the convex hull of a point set takes
time Θ(n log n), but if the points are sorted by x-coordinate then time
O(n) suffices. For optimization problems there is a standard method
for simplifying the problem; instead of asking for an optimal solution
we are content with a nearly optimal solution. This approach is parti-
cularly important when the optimization problem is NP-complete, and
therefore we devoted the entire section VI.7 to approximation algorithms
for NP-complete problems. We saw that some NP-complete problems resist
even approximate solution but many others have good or even very good
approximation algorithms. Even inside P approximation algorithms are
important. A good example is the weighted trees of section III.4. The
best algorithm for constructing optimum weighted trees has running time
Θ(n²) and there is an O(n) algorithm which constructs nearly optimal
trees. Already for moderate size n, say n = 10⁴, the difference between
n² and n is substantial: 10⁸ versus 10⁴ operations.
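To make the convex hull remark concrete, here is one possible linear-time construction for presorted points (a sketch of the monotone-chain method in Python, offered only as an illustration; it need not coincide with the algorithm of chapter VIII):

    def hull_of_sorted_points(pts):
        # pts: distinct points already sorted by x-coordinate (ties by y);
        # two linear sweeps build the lower and upper chains, so the
        # whole construction takes time O(n) once presorting is done
        if len(pts) < 3:
            return list(pts)

        def cross(o, a, b):
            return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

        def chain(points):
            c = []
            for p in points:
                while len(c) >= 2 and cross(c[-2], c[-1], p) <= 0:
                    c.pop()
                c.append(p)
            return c

        lower = chain(pts)
        upper = chain(list(reversed(pts)))
        return lower[:-1] + upper[:-1]   # hull vertices, counter-clockwise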
Let $\gamma_1, \dots, \gamma_n$ be a probability distribution, i.e. $\gamma_i \in \mathbb{R}$, $\gamma_i \ge 0$, and $\sum_i \gamma_i = 1$. The entropy of this distribution is defined as

$$H(\gamma_1, \dots, \gamma_n) = -\sum_{i=1}^{n} \gamma_i \log \gamma_i$$

where $0 \cdot \log 0 = 0$.  □

E1) Let $\delta_1, \dots, \delta_n$ be another distribution. Then

$$H(\gamma_1, \dots, \gamma_n) \le -\sum_{i=1}^{n} \gamma_i \log \delta_i$$

with equality iff $\gamma_i = \delta_i$ for all $i$.  □
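As a quick illustration (an immediate consequence of E1, not one of the facts listed here): choosing $\delta_i = 1/n$ for all $i$ gives

$$H(\gamma_1, \dots, \gamma_n) \le -\sum_{i=1}^{n} \gamma_i \log \frac{1}{n} = \log n ,$$

with equality iff $\gamma_i = 1/n$ for all $i$; thus the uniform distribution maximizes the entropy.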
In the second group we list some frequently used sums and integrals.

S1) $\sum_{i=1}^{k} i \cdot 2^{i} = (k-1) \cdot 2^{k+1} + 2$ for $k \ge 1$.

S2) $H_k = \sum_{i=1}^{k} 1/i$ is the k-th harmonic number. We have

$$\ln k \le H_k \le 1 + \ln k .$$

Proof: From $\int_{k}^{k+1} (1/x)\,dx \le 1/k \le \int_{k-1}^{k} (1/x)\,dx$ we infer

$$H_k \ge \int_{1}^{k+1} (1/x)\,dx = \ln(k+1)$$

and

$$H_k = 1 + \sum_{i=2}^{k} 1/i \le 1 + \int_{1}^{k} (1/x)\,dx = 1 + \ln k . \qquad \square$$

S3) $\sum_{k=1}^{n} H_k = (n+1) H_n - n$ for $n \ge 1$.

Proof: $\sum_{k=1}^{n} H_k = \sum_{k=1}^{n} \sum_{i=1}^{k} 1/i = \sum_{i=1}^{n} (n-i+1)/i = (n+1) H_n - n . \qquad \square$

I4) $e^{x} \ge 1 + x$ for all real $x$.
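A quick numerical check of S1) and S3), merely to make the formulas concrete (illustrative Python):

    k, n = 5, 5
    lhs_S1 = sum(i * 2**i for i in range(1, k + 1))   # 2+8+24+64+160 = 258
    rhs_S1 = (k - 1) * 2**(k + 1) + 2                 # 4*64 + 2      = 258

    H = [0.0]                                         # H[k] = k-th harmonic number
    for i in range(1, n + 1):
        H.append(H[-1] + 1.0 / i)
    lhs_S3 = sum(H[1:n + 1])
    rhs_S3 = (n + 1) * H[n] - n

    print(lhs_S1 == rhs_S1, abs(lhs_S3 - rhs_S3) < 1e-12)   # True True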
Subject Index

codes 62
convex hull 79
cost measure
  - logarithmic 6
  - unit 5
decision tree 69
D-tree 275
dynamic programming 163
element uniqueness 75
entropy 174, 325
exponential and binary search 184
fringe analysis 241
half-balanced tree 309
hashing
  - chaining 118
  - expected worst case 120
median 93
merging
  - many sequences 62
  - two sequences
      simple 60
      optimal 240
multi-dimensional searching 282
path compression 292, 300
path length
  - expected 306
  - lower bounds 175
  - optimum 159
pattern matching 171
priority queues
  - heaps 45
  - mergable queues 213
  - small universe 289
pseudo-random 18
pushdown automata 167