Lecture 02
We can define the lexicographical order using the concept of the longest
common prefix.
Definition 1.8: The length of the longest common prefix of two strings
A[0..m) and B[0..n), denoted by lcp(A, B), is the largest integer
ℓ ≤ min{m, n} such that A[0..ℓ) = B[0..ℓ).
Definition 1.9: Let A and B be two strings over an alphabet with a total
order ≤, and let ℓ = lcp(A, B). Then A is lexicographically smaller than or
equal to B, denoted by A ≤ B, if and only if
1. either |A| = ℓ
2. or |A| > ℓ, |B| > ℓ and A[ℓ] < B[ℓ].
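Definitions 1.8 and 1.9 translate directly into code. A minimal Python sketch (the function names lcp and leq are ours, not from the lecture):

    def lcp(A, B):
        # Length of the longest common prefix of A and B (Definition 1.8).
        ell = 0
        while ell < min(len(A), len(B)) and A[ell] == B[ell]:
            ell += 1
        return ell

    def leq(A, B):
        # Tests A <= B in lexicographical order (Definition 1.9).
        ell = lcp(A, B)
        return ell == len(A) or (ell < len(A) and ell < len(B) and A[ell] < B[ell])

    assert leq("ali", "alice") and not leq("anna", "alice")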
An important concept for sets of strings is the LCP (longest common
prefix) array and its sum.
Definition 1.10: Let R = {S1, S2, . . . , Sn} be a set of strings and assume
S1 < S2 < · · · < Sn. Then the LCP array LCP_R[1..n] is defined so that
LCP_R[1] = 0 and, for i ∈ [2..n],
LCP_R[i] = lcp(Si, Si−1) .
Furthermore, the LCP array sum is
ΣLCP(R) = ∑_{i∈[1..n]} LCP_R[i] .
Example 1.11: For R = {ali$, alice$, anna$, elias$, eliza$}, ΣLCP (R) = 7
and the LCP array is:
LCP_R[i]   Si
0          ali$
3          alice$
1          anna$
0          elias$
3          eliza$
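The LCP array of Example 1.11 can be verified with a few lines of Python (os.path.commonprefix computes the longest common prefix of a list of strings):

    from os.path import commonprefix

    R = sorted(["ali$", "alice$", "anna$", "elias$", "eliza$"])
    # LCP_R[1] = 0 and LCP_R[i] = lcp(S_i, S_{i-1}); Python indexing is 0-based.
    LCP = [0] + [len(commonprefix([R[i - 1], R[i]])) for i in range(1, len(R))]
    assert LCP == [0, 3, 1, 0, 3] and sum(LCP) == 7  # Example 1.11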
A variant of the LCP array sum is sometimes useful:
Definition 1.12: For a string S and a string set R, define
lcp(S, R) = max{lcp(S, T) | T ∈ R} ,
Σlcp(R) = ∑_{S∈R} lcp(S, R \ {S}) .
The relationship of the two measures is shown by the following two results:
Lemma 1.13: For i ∈ [2..n], LCP_R[i] = lcp(Si, {S1, . . . , Si−1}).
Lemma 1.14: ΣLCP (R) ≤ Σlcp(R) ≤ 2 · ΣLCP (R).
The proofs are left as an exercise.
The concept of distinguishing prefix is closely related and often used in place
of the longest common prefix for sets. The distinguishing prefix of a string
is the shortest prefix that separates it from the other strings in the set. It is
easy to see that dp(S, R \ {S}) = lcp(S, R \ {S}) + 1 (at least for a prefix-free R),
and correspondingly Σdp(R) = ∑_{S∈R} dp(S, R \ {S}) = Σlcp(R) + |R|.
Example 1.15: For R = {ali$, alice$, anna$, elias$, eliza$}, Σlcp(R) = 13
and Σdp(R) = 18.
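These measures are small enough to check by brute force. A minimal Python sketch for Example 1.15 and Lemma 1.14:

    from os.path import commonprefix

    def lcp_set(S, R):
        # lcp(S, R) = max{lcp(S, T) | T in R} (Definition 1.12).
        return max(len(commonprefix([S, T])) for T in R)

    R = {"ali$", "alice$", "anna$", "elias$", "eliza$"}
    sigma_lcp = sum(lcp_set(S, R - {S}) for S in R)
    sigma_dp = sigma_lcp + len(R)             # dp = lcp + 1 for each string
    assert sigma_lcp == 13 and sigma_dp == 18  # Example 1.15
    assert 7 <= sigma_lcp <= 2 * 7             # Lemma 1.14, with Sigma-LCP(R) = 7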
Theorem 1.16: The number of nodes in trie(R) is exactly
||R|| − ΣLCP (R) + 1, where ||R|| is the total length of the strings in R.
The proof reveals a close connection between LCP_R and the structure of
the trie. We will later see that LCP_R is useful as an actual data structure in
its own right.
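Theorem 1.16 can be checked without constructing a trie, since the nodes of trie(R) correspond exactly to the distinct prefixes of the strings in R (the root corresponds to the empty prefix). A small Python sketch for the example set:

    R = ["ali$", "alice$", "anna$", "elias$", "eliza$"]
    # Collect every prefix of every string, including the empty prefix.
    prefixes = {S[:i] for S in R for i in range(len(S) + 1)}
    total_length = sum(len(S) for S in R)         # ||R|| = 27
    assert len(prefixes) == total_length - 7 + 1  # Theorem 1.16, Sigma-LCP(R) = 7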
String Sorting
Ω(n log n) is a well-known lower bound for the number of comparisons
needed for sorting a set of n objects by any comparison-based algorithm.
This lower bound holds both in the worst case and in the average case.
There are many algorithms that match the lower bound, i.e., sort using
O(n log n) comparisons (worst or average case). Examples include quicksort,
heapsort and mergesort.
On the other hand, the average number of symbol comparisons needed to
compare two random strings is O(1). Does this mean that we can sort a set
of random strings in O(n log n) time using a standard sorting algorithm?
The following theorem shows that we cannot achieve O(n log n) symbol
comparisons for any set of strings (when σ = n^{o(1)}).
Theorem 1.17: Let A be an algorithm that sorts a set of objects using
only comparisons between the objects. Let R = {S1 , S2 , . . . , Sn } be a set of n
strings over an ordered alphabet Σ of size σ. Sorting R using A requires
Ω(n log n log_σ n) symbol comparisons on average, where the average is taken
over the initial orders of R.
• Note that the theorem holds for any comparison-based sorting algorithm
A and any string set R. In other words, we can choose A and R to
minimize the number of comparisons and still not get below the bound.
• The average is taken over the initial orders; the bound cannot hold for
every initial order. Otherwise, we could pick the correct order and use an
algorithm that first checks if the order is correct, needing only
O(n + ΣLCP(R)) symbol comparisons.
Proof of Theorem 1.17 (sketch). Let k = ⌊(log_σ n)/2⌋, so that σ^k ≤ √n.
For each α ∈ Σ^k, let R_α be the set of strings in R with the prefix α, and
let n_α = |R_α|.
• Since any two strings in R_α agree on their first k symbols, a single
string comparison between them requires at least k symbol comparisons.
• Thus A needs to do Ω(n_α log n_α) string comparisons and Ω(k n_α log n_α)
symbol comparisons to determine the relative order of the strings in R_α.
Hence the total number of symbol comparisons is Ω(∑_{α∈Σ^k} k n_α log n_α), and
∑_{α∈Σ^k} k n_α log n_α ≥ k(n − √n) log((n − √n)/σ^k)
≥ k(n − √n) log(√n − 1) = Ω(kn log n) = Ω(n log n log_σ n) .
Here we have used the facts that σ^k ≤ √n, that ∑_{α∈Σ^k} n_α > n − σ^k ≥ n − √n
(only the fewer than σ^k strings shorter than k lie outside the sets R_α),
and that, for a fixed value of ∑_{α∈Σ^k} n_α, the sum ∑_{α∈Σ^k} n_α log n_α
is minimized when the strings are distributed evenly among the σ^k sets R_α.
Theorem 1.18: Sorting a set R of n strings using only symbol comparisons
requires Ω(ΣLCP(R) + n log n) symbol comparisons.
Proof. If we are given the strings in the correct order and the job is to
verify that this is indeed so, we need at least ΣLCP(R) symbol
comparisons. No sorting algorithm could possibly do its job with fewer
symbol comparisons. This gives the lower bound Ω(ΣLCP(R)).
On the other hand, the general sorting lower bound Ω(n log n) must hold
here too.
• Note that the expected value of ΣLCP(R) for a random set of n
strings is O(n log_σ n). The lower bound then becomes Ω(n log n).
We will next see that there are algorithms that match this lower bound.
Such algorithms can sort a random set of strings in O(n log n) time.
String Quicksort (Multikey Quicksort)
Here is a variant of quicksort that partitions the input into three parts
instead of the usual two parts.
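The following is a minimal Python sketch of such a ternary quicksort (a recursive, out-of-place rendering; the structure and names are ours):

    import random

    def ternary_quicksort(R):
        # Partition into elements smaller than, equal to, and greater than the pivot.
        if len(R) <= 1:
            return R
        pivot = random.choice(R)
        R_less = [x for x in R if x < pivot]
        R_equal = [x for x in R if x == pivot]
        R_greater = [x for x in R if x > pivot]
        return ternary_quicksort(R_less) + R_equal + ternary_quicksort(R_greater)

The advantage over a binary partition is that the elements equal to the pivot are finished immediately and never enter a recursive call.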
In the normal, binary quicksort, we would have two subsets R≤ and R≥ , both
of which may contain elements that are equal to the pivot.
The time complexity of both the binary and the ternary quicksort depends
on the selection of the pivot (exercise).
String quicksort is similar to ternary quicksort, but it partitions using a single
character position. String quicksort is also known as multikey quicksort.
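A minimal Python sketch in the same style as the ternary quicksort above (ours; ell is the length of the common prefix of the current subset):

    import random

    def string_quicksort(R, ell=0):
        # Multikey quicksort: partition by the single symbol at position ell.
        if len(R) <= 1:
            return R
        # Strings of length ell are identical to the common prefix and hence
        # smaller than, and already separated from, the rest.
        R_short = [S for S in R if len(S) == ell]
        R = [S for S in R if len(S) > ell]
        if not R:
            return R_short
        pivot = random.choice(R)[ell]
        R_less = [S for S in R if S[ell] < pivot]
        R_equal = [S for S in R if S[ell] == pivot]
        R_greater = [S for S in R if S[ell] > pivot]
        return (R_short + string_quicksort(R_less, ell)
                + string_quicksort(R_equal, ell + 1)
                + string_quicksort(R_greater, ell))

    assert string_quicksort(["banana", "ban", "apple"]) == ["apple", "ban", "banana"]

Note that only the middle part advances to position ell + 1: its symbols at position ell are known to be equal and need not be compared again.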
Example 1.21: A possible partitioning, when ℓ = 2 (the symbol at position 2
is set off; the pivot symbol is l):

al p habet          al i gnment
al i gnment         al g orithm
al l ocate          al i as
al g orithm   =⇒    al l ocate
al t ernative       al l
al i as             al p habet
al t ernate         al t ernative
al l                al t ernate
Theorem 1.22: String quicksort sorts a set R of n strings in
O(ΣLCP(R) + n log n) time (with a suitable pivot selection, e.g., the median).
Proof of Theorem 1.22. The time complexity is dominated by the symbol
comparisons against the pivot symbol during partitioning. We charge the cost
of each comparison either on a single symbol or on a string depending on the
result of the comparison:
• If the result is an inequality, we charge it on the string. The string then
moves to a part of at most half the size (with median pivots), so each
string is charged O(log n) times, for a total of O(n log n).
• If the result is an equality, we charge it on the symbol. The string then
moves to the middle part, where the comparison position advances to ℓ + 1,
so each symbol is charged at most once. Only the symbols in the
distinguishing prefixes are ever compared, for a total of
O(Σdp(R)) = O(ΣLCP(R) + n).
Radix Sort
The Ω(n log n) sorting lower bound does not apply to algorithms that use
stronger operations than comparisons. A basic example is counting sort for
sorting integers.
Algorithm 1.23: CountingSort(R)
Input: (Multi)set R = {k1, k2, . . . , kn} of integers from the range [0..σ).
Output: R in nondecreasing order in array J[0..n).
(1) for i ← 0 to σ − 1 do C[i] ← 0
(2) for i ← 1 to n do C[ki ] ← C[ki ] + 1
(3) sum ← 0
(4) for i ← 0 to σ − 1 do // cumulative sums
(5) tmp ← C[i]; C[i] ← sum; sum ← sum + tmp
(6) for i ← 1 to n do // distribute
(7) J[C[ki ]] ← ki ; C[ki ] ← C[ki ] + 1
(8) return J
Similarly, the Ω(ΣLCP (R) + n log n) lower bound does not apply to string
sorting algorithms that use stronger operations than symbol comparisons.
Radix sort is such an algorithm for integer alphabets.
Radix sort was developed for sorting large integers, but it treats an integer
as a string of digits, so it is really a string sorting algorithm.
• MSD radix sort starts sorting from the beginning of the strings (most
significant digit first).
• LSD radix sort starts sorting from the end of the strings (least
significant digit first).
The LSD radix sort algorithm is very simple.
Algorithm 1.24: LSDRadixSort(R)
Input: (Multi)set R = {S1 , S2 , . . . , Sn } of strings of length m over alphabet [0..σ).
Output: R in ascending lexicographical order.
(1) for ℓ ← m − 1 downto 0 do CountingSort(R, ℓ)
(2) return R
It is easy to show that after i rounds, the strings are sorted by their
suffixes of length i: counting sort is stable, so ties at the current position
are resolved according to the order produced by the previous rounds. Thus
the strings are fully sorted at the end.
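A minimal Python sketch for equal-length strings (the helper counting_sort_by is our rendering of Algorithm 1.23, sorting by the symbol at position ell; a byte-sized alphabet is assumed):

    def counting_sort_by(R, ell, sigma=256):
        # Stable counting sort of the strings in R by the symbol at position ell.
        C = [0] * sigma
        for S in R:
            C[ord(S[ell])] += 1
        total = 0
        for c in range(sigma):               # cumulative sums
            C[c], total = total, total + C[c]
        J = [None] * len(R)
        for S in R:                          # distribute, preserving input order
            J[C[ord(S[ell])]] = S
            C[ord(S[ell])] += 1
        return J

    def lsd_radix_sort(R, m):
        # Sort strings of common length m, last position first.
        for ell in range(m - 1, -1, -1):
            R = counting_sort_by(R, ell)
        return R

    assert lsd_radix_sort(["bca", "abc", "bac", "aab"], 3) == ["aab", "abc", "bac", "bca"]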
The algorithm assumes that all strings have the same length m, but it can
be modified to handle strings of different lengths (exercise).
Theorem 1.26: LSD radix sort sorts a set R of strings over the alphabet
[0..σ) in O(||R|| + mσ) time, where ||R|| is the total length of the strings in
R and m is the length of the longest string in R.
Proof. Assume all strings have length m. The LSD radix sort performs m
rounds with each round taking O(n + σ) time. The total time is
O(mn + mσ) = O(||R|| + mσ).
• The weakness of LSD radix sort is that it uses Ω(||R||) time even when
ΣLCP (R) is much smaller than ||R||.
MSD radix sort resembles string quicksort but partitions the strings into σ
parts instead of three parts.
Algorithm 1.28: MSDRadixSort(R, ℓ)
Input: (Multi)set R = {S1, S2, . . . , Sn} of strings over the alphabet [0..σ)
and the length ℓ of their common prefix.
Output: R in ascending lexicographical order.
(1) if |R| < σ then return StringQuicksort(R, ℓ)
(2) R⊥ ← {S ∈ R | |S| = ℓ}; R ← R \ R⊥
(3) (R0, R1, . . . , Rσ−1) ← CountingSort(R, ℓ)
(4) for i ← 0 to σ − 1 do Ri ← MSDRadixSort(Ri, ℓ + 1)
(5) return R⊥ · R0 · R1 · · · Rσ−1
• Here CountingSort(R, ℓ) not only sorts but also returns the partitioning
based on the symbols at position ℓ. The time complexity is still O(|R| + σ).
• The recursive calls eventually lead to a large number of very small sets,
but counting sort needs Ω(σ) time no matter how small the set is. To
avoid the potentially high cost, the algorithm switches to string
quicksort for small sets.
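A minimal Python sketch of the same scheme (ours; it reuses string_quicksort from the earlier sketch and replaces the in-place counting sort of Algorithm 1.28 with bucket lists):

    def msd_radix_sort(R, ell=0, sigma=256):
        # Sort strings sharing a common prefix of length ell.
        if len(R) < sigma:
            return string_quicksort(R, ell)       # small set: switch algorithms
        R_done = [S for S in R if len(S) == ell]  # strings ending at position ell
        R = [S for S in R if len(S) > ell]
        buckets = [[] for _ in range(sigma)]      # partition by symbol at position ell
        for S in R:
            buckets[ord(S[ell])].append(S)
        result = R_done
        for bucket in buckets:
            if bucket:
                result += msd_radix_sort(bucket, ell + 1, sigma)
        return result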
Theorem 1.29: MSD radix sort sorts a set R of n strings over the
alphabet [0..σ) in O(ΣLCP (R) + n log σ) time.
Proof. Consider a call processing a subset of size k ≥ σ:
• The time excluding the recursive calls but including the call to counting
sort is O(k + σ) = O(k). The k symbols accessed here will not be
accessed again.
• At most dp(S, R \ {S}) ≤ lcp(S, R \ {S}) + 1 symbols of S will be
accessed by the algorithm. Thus the total time spent in calls of this
kind is O(Σdp(R)) = O(Σlcp(R) + n) = O(ΣLCP(R) + n).
The calls for subsets of size k < σ are handled by string quicksort. Each
string is involved in at most one such call. Therefore, the total time over all
calls to string quicksort is O(ΣLCP(R) + n log σ).
• There exists a more complicated variant of MSD radix sort with time
complexity O(ΣLCP (R) + n + σ).
• Ω(ΣLCP (R) + n) is a lower bound for any algorithm that must access
symbols one at a time (simple string model).
• In practice, MSD radix sort is very fast, but it is sensitive to
implementation details.