Unit 1

CSE408

Fundamentals of
Algorithms

Lecture #1

Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “Introduction to the Design & Analysis of Algorithms,” 2nd ed., Ch. 1
What is an algorithm?

An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.

    problem
       ↓
    algorithm
       ↓
input → "computer" → output

Algorithm

An algorithm is a sequence of unambiguous instructions for solving a problem, i.e., for obtaining a required output for any legitimate input in a finite amount of time.

• Can be represented in various forms
• Unambiguity/clearness
• Effectiveness
• Finiteness/termination
• Correctness
Historical Perspective

Euclid's algorithm for finding the greatest common divisor

Muhammad ibn Musa al-Khwarizmi – 9th-century mathematician
www.lib.virginia.edu/science/parshall/khwariz.html

Notion of algorithm and problem

    problem
       ↓
    algorithm
       ↓
input (or instance) → "computer" → output

algorithmic solution
(different from a conventional solution)
Example of computational problem: sorting

Statement of problem:
• Input: A sequence of n numbers <a1, a2, …, an>

• Output: A reordering of the input sequence <a´1, a´2, …, a´n> so that a´i
≤ a´j whenever i < j

Instance: The sequence <5, 3, 2, 8, 3>

Algorithms:
• Selection sort
• Insertion sort
• Merge sort
• (many others)

Selection Sort

Input: array a[1],…,a[n]

Output: array a sorted in non-decreasing order

Algorithm:

for i=1 to n
swap a[i] with smallest of a[i],…,a[n]

• Is this unambiguous? Effective?


• See also pseudocode, section 3.1
Some Well-known Computational Problems

Sorting
Searching
Shortest paths in a graph
Minimum spanning tree
Primality testing
Traveling salesman problem
Knapsack problem
Chess
Towers of Hanoi
Program termination

Some of these problems don't have efficient algorithms, or algorithms at all!
Basic Issues Related to Algorithms

How to design algorithms

How to express algorithms

Proving correctness

Efficiency (or complexity) analysis


• Theoretical analysis

• Empirical analysis

Optimality

Algorithm design strategies

Brute force

Divide and conquer

Decrease and conquer

Transform and conquer

Space and time tradeoffs

Greedy approach

Dynamic programming

Backtracking and branch-and-bound

Analysis of Algorithms

How good is the algorithm?


• Correctness
• Time efficiency
• Space efficiency

Does there exist a better algorithm?


• Lower bounds
• Optimality

What is an algorithm?

Recipe, process, method, technique, procedure, routine,…


with the following requirements:
1. Finiteness
terminates after a finite number of steps
2. Definiteness
rigorously and unambiguously specified
3. Clearly specified input
valid inputs are clearly specified
4. Clearly specified/expected output
can be proved to produce the correct output given a valid input
5. Effectiveness
steps are sufficiently simple and basic

Why study algorithms?

Theoretical importance

• the core of computer science

Practical importance

• A practitioner’s toolkit of known algorithms

• Framework for designing and analyzing algorithms for new problems

Example: Google’s PageRank Technology

Euclid’s Algorithm

Problem: Find gcd(m,n), the greatest common divisor of two nonnegative integers m and n, not both zero.

Examples: gcd(60,24) = 12, gcd(60,0) = 60, gcd(0,0) = ?

Euclid's algorithm is based on repeated application of the equality
    gcd(m,n) = gcd(n, m mod n)
until the second number becomes 0, which makes the problem trivial.

Example: gcd(60,24) = gcd(24,12) = gcd(12,0) = 12
Two descriptions of Euclid’s algorithm

Step 1 If n = 0, return m and stop; otherwise go to Step 2
Step 2 Divide m by n and assign the value of the remainder to r
Step 3 Assign the value of n to m and the value of r to n. Go to Step 1.

while n ≠ 0 do
    r ← m mod n
    m ← n
    n ← r
return m
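The structured version translates directly into runnable Python; a minimal sketch:

def gcd(m, n):
    """Euclid's algorithm: gcd(m, n) = gcd(n, m mod n) until n becomes 0."""
    while n != 0:
        m, n = n, m % n
    return m

print(gcd(60, 24))  # 12
print(gcd(60, 0))   # 60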

Other methods for computing gcd(m,n)

Consecutive integer checking algorithm

Step 1 Assign the value of min{m,n} to t
Step 2 Divide m by t. If the remainder is 0, go to Step 3; otherwise, go to Step 4
Step 3 Divide n by t. If the remainder is 0, return t and stop; otherwise, go to Step 4
Step 4 Decrease t by 1 and go to Step 2

Is this slower than Euclid's algorithm? How much slower?

O(min{m, n}) iterations in the worst case, vs. O(log n) for Euclid's algorithm
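For comparison, the same steps in Python, as a sketch (note it assumes m and n are positive integers, since t starts at min{m,n} and only counts down):

def gcd_consecutive(m, n):
    """Consecutive integer checking: try t = min(m, n), min(m, n) - 1, ..."""
    t = min(m, n)
    while t > 0:
        if m % t == 0 and n % t == 0:   # Steps 2 and 3 combined
            return t
        t -= 1                           # Step 4

print(gcd_consecutive(60, 24))  # 12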


Other methods for gcd(m,n) [cont.]

Middle-school procedure
Step 1 Find the prime factorization of m
Step 2 Find the prime factorization of n
Step 3 Find all the common prime factors
Step 4 Compute the product of all the common prime factors
and return it as gcd(m,n)

Is this an algorithm? (Not as stated: the prime-factorization steps are not unambiguously specified.)

How efficient is it? Trial-division factorization alone takes O(sqrt(n)) divisions per number.
Sieve of Eratosthenes
Input: Integer n ≥ 2
Output: List of primes less than or equal to n
for p ← 2 to n do A[p] ← p
for p ← 2 to n do
    if A[p] ≠ 0      // p hasn't been previously eliminated from the list
        j ← p * p
        while j ≤ n do
            A[j] ← 0     // mark element as eliminated
            j ← j + p

Example: 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 (surviving primes: 2 3 5 7 11 13 17 19)
Time complexity: O(n log log n)
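A direct Python transcription of the sieve, as a sketch:

def sieve(n):
    """Sieve of Eratosthenes: return the list of primes <= n."""
    A = list(range(n + 1))              # A[p] = p for p = 0..n; 0 and 1 unused
    for p in range(2, int(n ** 0.5) + 1):
        if A[p] != 0:                   # p hasn't been eliminated
            for j in range(p * p, n + 1, p):
                A[j] = 0                # mark multiples of p as eliminated
    return [p for p in range(2, n + 1) if A[p] != 0]

print(sieve(20))  # [2, 3, 5, 7, 11, 13, 17, 19]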
Fundamentals of Algorithmic Problem Solving

Two main issues related to algorithms

How to design algorithms

How to analyze algorithm efficiency

Algorithm design techniques/strategies

Brute force

Divide and conquer

Decrease and conquer

Transform and conquer

Space and time tradeoffs

Greedy approach

Dynamic programming

Iterative improvement

Backtracking

Branch and bound

Analysis of algorithms

How good is the algorithm?


• time efficiency
• space efficiency
• correctness (ignored in this course)

Does there exist a better algorithm?


• lower bounds
• optimality

Important problem types

sorting

searching

string processing

graph problems

combinatorial problems

geometric problems

numerical problems

Sorting (I)

Rearrange the items of a given list in ascending order.


• Input: A sequence of n numbers <a1, a2, …, an>
• Output: A reordering <a´1, a´2, …, a´n> of the input sequence such that a´1≤
a´2 ≤ … ≤ a´n.
Why sorting?
• Help searching
• Algorithms often use sorting as a key subroutine.
Sorting key
• A specially chosen piece of information used to guide sorting. E.g., sort
student records by names.

Sorting (II)

Examples of sorting algorithms


• Selection sort
• Bubble sort
• Insertion sort
• Merge sort
• Heap sort …
Evaluate sorting algorithm complexity: the number of key comparisons.
Two properties
• Stability: A sorting algorithm is called stable if it preserves the relative order of any two equal elements in its input.
• In place: A sorting algorithm is in place if it does not require extra memory, except possibly for a few memory units.

Selection Sort

Algorithm SelectionSort(A[0..n-1])
//The algorithm sorts a given array by selection sort
//Input: An array A[0..n-1] of orderable elements
//Output: Array A[0..n-1] sorted in ascending order
for i ← 0 to n − 2 do
    min ← i
    for j ← i + 1 to n − 1 do
        if A[j] < A[min]
            min ← j
    swap A[i] and A[min]
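The same algorithm in runnable Python; a sketch:

def selection_sort(A):
    """Sort list A in ascending order by repeatedly selecting the minimum."""
    n = len(A)
    for i in range(n - 1):
        min_idx = i
        for j in range(i + 1, n):
            if A[j] < A[min_idx]:
                min_idx = j
        A[i], A[min_idx] = A[min_idx], A[i]   # swap A[i] and A[min]
    return A

print(selection_sort([5, 3, 2, 8, 3]))  # [2, 3, 3, 5, 8]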

Searching

Find a given value, called a search key, in a given set.


Examples of searching algorithms
• Sequential search
• Binary search …
Input: sorted array a_i < … < a_j and key x
m ← ⌊(i+j)/2⌋
while i < j and x ≠ a_m do
    if x < a_m then j ← m − 1
    else i ← m + 1
    m ← ⌊(i+j)/2⌋
if x = a_m then output a_m

Time: O(log n)
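A runnable Python version, as a sketch that returns the key's index or -1:

def binary_search(a, x):
    """Iterative binary search in a sorted list a."""
    i, j = 0, len(a) - 1
    while i <= j:
        m = (i + j) // 2
        if x == a[m]:
            return m
        elif x < a[m]:
            j = m - 1
        else:
            i = m + 1
    return -1   # key not found

print(binary_search([2, 3, 5, 8, 13], 8))  # 3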
String Processing

A string is a sequence of characters from an alphabet.


Text strings: letters, numbers, and special characters.
String matching: searching for a given word/pattern in a text.

Examples:
(i) searching for a word or phrase on WWW or in a
Word document
(ii) searching for a short read in the reference genomic
sequence

CSE408
Fundamentals of Data
Structure

Lecture #2

Fundamental data structures

list (array, linked list, string)
stack
queue
priority queue/heap
graph
tree and binary tree
set and dictionary

Linear Data Structures

Arrays
• A sequence of n items of the same data type that are stored contiguously in computer memory and made accessible by specifying a value of the array's index.
• Fixed length (needs preliminary reservation of memory); contiguous memory locations; direct access; insertion/deletion requires shifting elements.

Linked lists
• A sequence of zero or more nodes, each containing two kinds of information: some data and one or more links called pointers to other nodes of the linked list.
• Singly linked list (next pointer); doubly linked list (next + previous pointers).
• Dynamic length; arbitrary memory locations; access by following links; insertion/deletion by relinking pointers.

a1 → a2 → … → an
Stacks and Queues

Stacks
• A stack of plates
– insertion/deletion can be done only at the top.
– LIFO
• Two operations (push and pop)
Queues
• A queue of customers waiting for services
– Insertion/enqueue from the rear and deletion/dequeue from the
front.
– FIFO
• Two operations (enqueue and dequeue)

Priority Queue and Heap

Priority queues (implemented using heaps)

• A data structure for maintaining a set of elements, each associated with a key/priority, with the following operations:
    – finding the element with the highest priority
    – deleting the element with the highest priority
    – inserting a new element
• Application: scheduling jobs on a shared computer
• Example: the heap with root 9, children 6 and 8, and leaves 5, 2, 3, stored in array form as 9 6 8 5 2 3
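Python's standard heapq module implements a binary min-heap; a max-priority queue can be sketched by negating keys (heapq itself offers no max-heap variant):

import heapq

heap = []
for key in [5, 2, 9, 6, 8, 3]:
    heapq.heappush(heap, -key)       # negate keys: heapq is a min-heap

print(-heap[0])                      # find the highest priority: 9
print(-heapq.heappop(heap))          # delete and return the highest: 9
print(-heapq.heappop(heap))          # next highest: 8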
Graphs

Formal definition
• A graph G = <V, E> is defined by a pair of two sets: a finite set V of items called vertices and a set E of vertex pairs called edges.
Undirected and directed graphs (digraphs).
What's the maximum number of edges in an undirected graph with |V| vertices? (|V|(|V|−1)/2)
Complete, dense, and sparse graphs
• A graph with every pair of its vertices connected by an edge is called complete, denoted K|V|
• A dense graph has a number of edges close to the maximal number; a sparse graph has a number of edges close to the minimal number.
Graph Representation

Adjacency matrix
• n x n boolean matrix if |V| is n.
• The element in the ith row and jth column is 1 if there's an edge from the ith vertex to the jth vertex; otherwise 0.
• The adjacency matrix of an undirected graph is symmetric.
Adjacency linked lists
• A collection of linked lists, one for each vertex, that contain all the vertices adjacent to the list's vertex.
Which data structure would you use if the graph is a 100-node star shape?

Example (a digraph on vertices 1..4):

Adjacency matrix    Adjacency lists
0 1 1 1             1 → 2 → 3 → 4
0 0 0 1             2 → 4
0 0 0 1             3 → 4
0 0 0 0             4 → (none)
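Both representations for the digraph above in Python; a minimal sketch:

# Adjacency matrix: matrix[i][j] == 1 iff there is an edge i -> j (0-indexed)
matrix = [
    [0, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

# Adjacency lists: one list of neighbors per vertex (1-indexed keys)
adj = {1: [2, 3, 4], 2: [4], 3: [4], 4: []}

# The two representations agree: edge (i, j) is in the matrix iff j is in adj[i]
for i in range(4):
    for j in range(4):
        assert (matrix[i][j] == 1) == ((j + 1) in adj[i + 1])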
Weighted Graphs

Weighted graphs
• Graphs or digraphs with numbers assigned to the edges.

(Figure: a weighted graph on vertices 1, 2, 3, 4 with edge weights 5, 6, 7, 8, 9.)

Graph Properties -- Paths and Connectivity

Paths
• A path from vertex u to v of a graph G is defined as a sequence of
adjacent (connected by an edge) vertices that starts with u and ends with
v.
• Simple paths: all vertices of the path (and hence all its edges) are distinct.
• Path lengths: the number of edges, or the number of vertices – 1.
Connected graphs
• A graph is said to be connected if for every pair of its vertices u and v
there is a path from u to v.
Connected component
• A maximal connected subgraph of a given graph.

Graph Properties -- Acyclicity

Cycle
• A simple path of a positive length that starts and ends at the same vertex.
Acyclic graph
• A graph without cycles
• DAG (Directed Acyclic Graph)

Trees

Trees
• A tree (or free tree) is a connected acyclic graph.
• Forest: a graph that has no cycles but is not necessarily connected.
Properties of trees

• For every two vertices in a tree there always exists exactly one simple path from one of these vertices to the other. Why?
    – Rooted trees: the above property makes it possible to select an arbitrary vertex in a free tree and consider it as the root of the so-called rooted tree.
    – Levels in a rooted tree.
• |E| = |V| − 1
Rooted Trees (I)

Ancestors
• For any vertex v in a tree T, all the vertices on the simple path
from the root to that vertex are called ancestors.
Descendants
• All the vertices for which a vertex v is an ancestor are said to be
descendants of v.
Parent, child and siblings
• If (u, v) is the last edge of the simple path from the root to
vertex v, u is said to be the parent of v and v is called a child of
u.
• Vertices that have the same parent are called siblings.
Leaves
• A vertex without children is called a leaf.
Subtree
• A vertex v with all its descendants is called the subtree of T
rooted at v.
Rooted Trees (II)

Depth of a vertex
• The length of the simple path from the root to the vertex.
Height of a tree
• The length of the longest simple path from the root to a leaf.

(Figure: a rooted tree with root 3 and height h = 2.)
Ordered Trees
Ordered trees
• An ordered tree is a rooted tree in which all the children of each vertex
are ordered.
Binary trees
• A binary tree is an ordered tree in which every vertex has no more than two children and each child is designated as either a left child or a right child of its parent.
Binary search trees
• Each vertex is assigned a number.
• A number assigned to each parental vertex is larger than all the numbers in its left subtree and smaller than all the numbers in its right subtree.
⌊log2 n⌋ ≤ h ≤ n − 1, where h is the height of a binary tree and n its size.

(Figures: a binary tree that is not a search tree, with root 9, and a binary search tree with root 6, children 3 and 9, and leaves 2, 5, 8.)
CSE408
String Matching
Algorithm
Lecture # 5&6
String Matching Problem
Motivations: text-editing, pattern matching in DNA sequences

Text: array T[1..n]
Pattern: array P[1..m], where m ≤ n
Array elements: characters from a finite alphabet Σ
Pattern P occurs with shift s in T if P[1..m] = T[s+1..s+m], where 0 ≤ s ≤ n − m
String Matching Algorithms

• Naive Algorithm
– Worst-case running time in O((n-m+1) m)
• Rabin-Karp
– Worst-case running time in O((n-m+1) m)
– Better than this on average and in practice

• Knuth-Morris-Pratt
– Worst-case running time in O(n + m)
Notation & Terminology

• Σ* = set of all finite-length strings formed using characters from alphabet Σ
• Empty string: ε
• |x| = length of string x
• w is a prefix of x (w ⊏ x): e.g., ab is a prefix of abcca
• w is a suffix of x (w ⊐ x): e.g., cca is a suffix of abcca
• The prefix and suffix relations are transitive


Naive String Matching

The naive algorithm checks all n − m + 1 possible shifts, comparing up to m characters at each, so its worst-case running time is in Θ((n−m+1)m).
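A minimal Python sketch of the naive matcher (0-indexed strings; the slides use 1-indexed arrays):

def naive_match(T, P):
    """Return all shifts s where pattern P occurs in text T."""
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):       # try every possible shift
        if T[s:s + m] == P:          # compare up to m characters
            shifts.append(s)
    return shifts

print(naive_match("bacbabababacaca", "ababaca"))  # [6]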
Rabin-Karp Algorithm

• Assume each character is a digit in radix-d notation (e.g., d = 10)
• p = decimal value of the pattern
• ts = decimal value of the substring T[s+1..s+m], for s = 0, 1, ..., n−m
• Strategy:
    – compute p in O(m) time (which is in O(n))
    – compute all ts values in a total of O(n) time
    – find all valid shifts s in O(n) time by comparing p with each ts
• Compute p in O(m) time using Horner's rule:
    p = P[m] + d(P[m−1] + d(P[m−2] + ... + d(P[2] + dP[1])))
• Compute t0 similarly from T[1..m] in O(m) time
• Compute the remaining ts's in O(n−m) time using the recurrence
    ts+1 = d(ts − d^(m−1) T[s+1]) + T[s+m+1]
Rabin-Karp Algorithm (continued)

p and the ts values may be very large, so all arithmetic is done modulo a prime q: compare p mod q with ts mod q, and verify any hash equality character by character, since equal hashes may be a spurious hit. The matching loop invariant is that when the comparison is executed, ts = T[s+1..s+m] mod q.

Example: with p = 31415 and q = 13, p mod q = 7; any window of T whose value is congruent to 7 modulo 13 produces a hit, which is spurious unless the window actually equals 31415, and must be ruled out by direct comparison.

Analysis (d is the radix, q is the modulus): preprocessing, i.e., computing p mod q and t0 mod q over the m-digit window (using d^(m−1) mod q for the high-order digit position), takes Θ(m). Trying all possible shifts gives a worst-case running time in Θ((n−m+1)m), since in the worst case every shift is a hit requiring a Θ(m) verification.

Assume reducing mod q behaves like a random mapping from Σ* to Z_q. Then the chance that ts ≡ p (mod q) at a non-matching shift is about 1/q, so the expected number of spurious hits is in O(n/q). The expected matching time is O(n) + O(m(v + n/q)), where v is the number of valid shifts. If v is in O(1) and q ≥ m, the average-case running time is in O(n + m).
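A Python sketch of Rabin-Karp under these assumptions, with radix d = 256 (byte values) and a small illustrative prime q:

def rabin_karp(T, P, d=256, q=101):
    """Return all shifts s where P occurs in T, using a rolling hash mod q."""
    n, m = len(T), len(P)
    h = pow(d, m - 1, q)                  # d^(m-1) mod q, removes the leading digit
    p = t = 0
    for i in range(m):                    # preprocessing: Horner's rule mod q
        p = (d * p + ord(P[i])) % q
        t = (d * t + ord(T[i])) % q
    shifts = []
    for s in range(n - m + 1):
        if p == t and T[s:s + m] == P:    # rule out spurious hits explicitly
            shifts.append(s)
        if s < n - m:                     # rolling hash: t_{s+1} from t_s
            t = (d * (t - ord(T[s]) * h) + ord(T[s + m])) % q
    return shifts

print(rabin_karp("bacbabababacaca", "ababaca"))  # [6]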


The Knuth-Morris-Pratt Algorithm

Knuth, Morris and Pratt proposed a linear-time algorithm for the string matching problem. A matching time of O(n) is achieved by avoiding comparisons with elements of 'S' that have previously been involved in a comparison with some element of the pattern 'p' to be matched; i.e., backtracking on the string 'S' never occurs.
Components of KMP algorithm

• The prefix function, Π


The prefix function Π for a pattern encapsulates knowledge about how the pattern matches against shifts of itself. This information can be used to avoid useless shifts of the pattern 'p'. In other words, this enables avoiding backtracking on the string 'S'.
• The KMP Matcher
With string ‘S’, pattern ‘p’ and prefix function ‘Π’ as
inputs, finds the occurrence of ‘p’ in ‘S’ and returns
the number of shifts of ‘p’ after which occurrence is
found.
The prefix function, Π

The following pseudocode computes the prefix function Π:

Compute-Prefix-Function (p)
1  m ← length[p]            // 'p' is the pattern to be matched
2  Π[1] ← 0
3  k ← 0
4  for q ← 2 to m
5      do while k > 0 and p[k+1] != p[q]
6             do k ← Π[k]
7         if p[k+1] = p[q]
8             then k ← k + 1
9         Π[q] ← k
10 return Π
Example: compute Π for the pattern 'p' = a b a b a c a

Initially: m = length[p] = 7, Π[1] = 0, k = 0

Step 1: q = 2, k = 0; p[1] != p[2], so Π[2] = 0
Step 2: q = 3, k = 0; p[1] = p[3] = 'a', so k = 1 and Π[3] = 1
Step 3: q = 4, k = 1; p[2] = p[4] = 'b', so k = 2 and Π[4] = 2
Step 4: q = 5, k = 2; p[3] = p[5] = 'a', so k = 3 and Π[5] = 3
Step 5: q = 6, k = 3; p[4] != p[6] = 'c', so k falls back (k = Π[3] = 1, then k = Π[1] = 0); p[1] != p[6], so Π[6] = 0
Step 6: q = 7, k = 0; p[1] = p[7] = 'a', so k = 1 and Π[7] = 1

After iterating 6 times, the prefix function computation is complete:

q    1 2 3 4 5 6 7
p    a b a b a c a
Π    0 0 1 2 3 0 1
The KMP Matcher

The KMP Matcher, with pattern ‘p’, string ‘S’ and prefix function ‘Π’ as input, finds a match of p in S.
Following pseudocode computes the matching component of KMP algorithm:
KMP-Matcher(S, p)
1  n ← length[S]
2  m ← length[p]
3  Π ← Compute-Prefix-Function(p)
4  q ← 0                        // number of characters matched
5  for i ← 1 to n               // scan S from left to right
6      do while q > 0 and p[q+1] != S[i]
7             do q ← Π[q]       // next character does not match
8         if p[q+1] = S[i]
9             then q ← q + 1    // next character matches
10        if q = m              // is all of p matched?
11            then print "Pattern occurs with shift" i − m
12                 q ← Π[q]     // look for the next match

Note: KMP finds every occurrence of 'p' in 'S'. That is why KMP does not terminate in step 12; rather, it searches the remainder of 'S' for any more occurrences of 'p'.
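Both components in runnable Python, as a sketch (0-indexed, so pi[q] stores the slides' Π[q+1]):

def compute_prefix_function(p):
    """pi[q] = length of the longest proper prefix of p[:q+1] that is also its suffix."""
    m = len(p)
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]               # fall back to a shorter border
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi

def kmp_matcher(S, p):
    """Return all shifts where p occurs in S, never backtracking on S."""
    pi = compute_prefix_function(p)
    shifts, q = [], 0                   # q = number of characters matched
    for i, c in enumerate(S):
        while q > 0 and p[q] != c:
            q = pi[q - 1]               # next character does not match
        if p[q] == c:
            q += 1                      # next character matches
        if q == len(p):                 # all of p matched
            shifts.append(i - len(p) + 1)
            q = pi[q - 1]               # look for the next match
    return shifts

print(compute_prefix_function("ababaca"))         # [0, 0, 1, 2, 3, 0, 1]
print(kmp_matcher("bacbabababacaca", "ababaca"))  # [6]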
Illustration: given string 'S' and pattern 'p' as follows:

S:  b a c b a b a b a b a c a c a
p:  a b a b a c a

Let us execute the KMP algorithm to find whether 'p' occurs in 'S'. For 'p' the prefix function Π was computed previously:

q    1 2 3 4 5 6 7
p    a b a b a c a
Π    0 0 1 2 3 0 1

Initially: n = size of S = 15; m = size of p = 7

Step 1:  i = 1, q = 0. p[1] does not match S[1]; 'p' is shifted one position to the right.
Step 2:  i = 2, q = 0. p[1] matches S[2]; since there is a match, q = 1.
Step 3:  i = 3, q = 1. p[2] does not match S[3]; backtracking on p gives q = Π[1] = 0, and p[1] does not match S[3] either.
Step 4:  i = 4, q = 0. p[1] does not match S[4].
Step 5:  i = 5, q = 0. p[1] matches S[5]; q = 1.
Step 6:  i = 6, q = 1. p[2] matches S[6]; q = 2.
Step 7:  i = 7, q = 2. p[3] matches S[7]; q = 3.
Step 8:  i = 8, q = 3. p[4] matches S[8]; q = 4.
Step 9:  i = 9, q = 4. p[5] matches S[9]; q = 5.
Step 10: i = 10, q = 5. p[6] does not match S[10]; after the mismatch q = Π[5] = 3, and comparing p[4] with S[10] gives a match, so q = 4.
Step 11: i = 11, q = 4. p[5] matches S[11]; q = 5.
Step 12: i = 12, q = 5. p[6] matches S[12]; q = 6.
Step 13: i = 13, q = 6. p[7] matches S[13]; q = 7 = m.

Pattern 'p' has been found to occur completely in string 'S', with shift i − m = 13 − 7 = 6.
Running-time analysis

• Compute-Prefix-Function: the for loop from step 4 to step 10 runs m − 1 times, and steps 1 to 3 take constant time. Hence the running time of the prefix-function computation is Θ(m).

• KMP-Matcher: the for loop beginning in step 5 runs n times, i.e., as long as the length of the string 'S'. Steps 1 to 4 take Θ(m) time (dominated by the call to Compute-Prefix-Function), so the matching itself runs in Θ(n) and the total running time is Θ(m + n).
Knuth-Morris-Pratt Algorithm: amortized analysis (potential method)

Let Φ(k) = k, where k is the current number of matched characters (the current state of the algorithm). The potential is never negative, since Φ(k) ≥ 0 for all k, and its initial value is 0. Each execution of the for-loop body increases the potential by at most 1 (when the next character matches), while each iteration of the inner while-loop body (when the next character does not match) decreases it. The total number of while-loop iterations is therefore bounded by the number of for-loop iterations, and each loop body is in O(1). By this amortized argument, scanning the text left to right takes Θ(n), computing the prefix function takes Θ(m), and the whole algorithm runs in Θ(m + n).
CSE408
Brute Force (String Matching, Closest Pair, Convex Hull, Exhaustive Search, Voronoi Diagrams)

Lecture # 7&8
Brute Force

A straightforward approach, usually based directly on the problem's statement and definitions of the concepts involved.

Examples:
1. Computing a^n (a > 0, n a nonnegative integer)
2. Computing n!
3. Multiplying two matrices
4. Searching for a key of a given value in a list

Brute-Force Sorting Algorithm

Selection Sort: scan the array to find its smallest element and swap it with the first element. Then, starting with the second element, scan the elements to the right of it to find the smallest among them and swap it with the second element. Generally, on pass i (0 ≤ i ≤ n−2), find the smallest element in A[i..n−1] and swap it with A[i]:

A[0] ≤ ... ≤ A[i−1] | A[i], ..., A[min], ..., A[n−1]
(elements before the bar are already in their final positions)

Example: 7 3 2 5 → 2 3 7 5 → 2 3 7 5 → 2 3 5 7
Analysis of Selection Sort

Time efficiency: Θ(n^2)

Space efficiency: Θ(1), so in place

Stability: yes
Brute-Force String Matching
• pattern: a string of m characters to search for
• text: a (longer) string of n characters to search in
• problem: find a substring in the text that matches the pattern

Brute-force algorithm
Step 1 Align pattern at beginning of text
Step 2 Moving from left to right, compare each character of
pattern to the corresponding character in text until
• all characters are found to match (successful search); or
• a mismatch is detected
Step 3 While pattern is not found and the text is not yet
exhausted, realign pattern one position to the right and
repeat Step 2
Examples of Brute-Force String Matching

1. Pattern: 001011
Text: 10010101101001100101111010

2. Pattern: happy
Text: It is never too late to have a
happy childhood.
Pseudocode and Efficiency

Time efficiency: Θ(mn) comparisons in the worst case. Why? In the worst case the algorithm makes m comparisons at each of the n − m + 1 alignments of the pattern.
Brute-Force Polynomial Evaluation
Problem: Find the value of the polynomial
    p(x) = a_n x^n + a_{n−1} x^{n−1} + … + a_1 x + a_0
at a point x = x0

Brute-force algorithm:
p ← 0.0
for i ← n downto 0 do
    power ← 1
    for j ← 1 to i do        // compute x^i
        power ← power * x
    p ← p + a[i] * power
return p

Efficiency: the sum of i over 0 ≤ i ≤ n gives Θ(n^2) multiplications

Polynomial Evaluation: Improvement

We can do better by evaluating from right to left:

Better brute-force algorithm:

p ← a[0]
power ← 1
for i ← 1 to n do
    power ← power * x
    p ← p + a[i] * power
return p

Efficiency: Θ(n) multiplications

Horner's rule is another linear-time method (see the sketch below).
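For reference, a Horner's-rule sketch in Python: it evaluates p(x) = (…(a_n·x + a_{n−1})·x + …)·x + a_0 with one multiplication and one addition per coefficient:

def horner(a, x):
    """Evaluate a[n]*x^n + ... + a[1]*x + a[0] at x; list a is indexed by power."""
    p = 0.0
    for coeff in reversed(a):    # process a[n], a[n-1], ..., a[0]
        p = p * x + coeff
    return p

print(horner([1, 2, 3], 2))  # 3x^2 + 2x + 1 at x = 2 gives 17.0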


Closest-Pair Problem

Find the two closest points in a set of n points (in the two-dimensional Cartesian plane).

Brute-force algorithm: compute the distance between every pair of distinct points and return the indexes of the points for which the distance is the smallest.
Closest-Pair Brute-Force Algorithm (cont.)

Efficiency: Θ(n^2) distance computations (the square roots can be avoided by comparing squared distances)

How to make it faster? Using divide-and-conquer!
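A brute-force sketch in Python, comparing squared distances to avoid square roots:

from itertools import combinations

def closest_pair(points):
    """Return the pair of points at minimum Euclidean distance, brute force."""
    best, best_d2 = None, float("inf")
    for (x1, y1), (x2, y2) in combinations(points, 2):   # every distinct pair
        d2 = (x1 - x2) ** 2 + (y1 - y2) ** 2             # squared distance
        if d2 < best_d2:
            best, best_d2 = ((x1, y1), (x2, y2)), d2
    return best

print(closest_pair([(0, 0), (3, 4), (1, 1), (5, 5)]))  # ((0, 0), (1, 1))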


Convex Hull Problem
• The convex-hull problem is the problem of
constructing the convex hull for a given set S of n
points
• To solve it, we need to find the points that will
serve as the vertices of the polygon in question.
• Mathematicians call the vertices of such a
polygon “extreme points.”
• By definition, an extreme point of a convex set is
a point of this set that is not a middle point of
any line segment with endpoints in the set.
• how can we solve the convex-hull problem in a
brute-force manner?
• Nevertheless, there is a simple but inefficient algorithm that is based on the following observation about the line segments making up the boundary of a convex hull: a line segment connecting two points pi and pj of a set of n points is a part of the convex hull's boundary if and only if all the other points of the set lie on the same side of the straight line through these two points (see the sketch below).
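The observation translates directly into code: for each pair of points, the sign of a cross product tells which side of the line a third point lies on. A brute-force sketch (it treats collinear points loosely, which a production version would need to handle):

from itertools import combinations

def convex_hull_edges(points):
    """Return segments (p, q) on the hull boundary: all other points on one side."""
    edges = []
    for p, q in combinations(points, 2):
        signs = set()
        for r in points:
            if r == p or r == q:
                continue
            # cross product of (q - p) and (r - p): its sign gives the side of line pq
            cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
            if cross != 0:
                signs.add(cross > 0)
        if len(signs) <= 1:          # all other points on the same side
            edges.append((p, q))
    return edges

print(convex_hull_edges([(0, 0), (3, 0), (1, 1), (0, 3)]))
# [((0, 0), (3, 0)), ((0, 0), (0, 3)), ((3, 0), (0, 3))]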
Brute-Force Strengths and Weaknesses
• Strengths
– wide applicability
– simplicity
– yields reasonable algorithms for some important problems
(e.g., matrix multiplication, sorting, searching, string
matching)

• Weaknesses
– rarely yields efficient algorithms
– some brute-force algorithms are unacceptably slow
– not as constructive as some other design techniques
Exhaustive Search
A brute force solution to a problem involving
search for an element with a special property,
usually among combinatorial objects such as
permutations, combinations, or subsets of a set.

Method:
– generate a list of all potential solutions to the problem in a
systematic manner (see algorithms in Sec. 5.4)

– evaluate potential solutions one by one, disqualifying


infeasible ones and, for an optimization problem, keeping
track of the best one found so far

– when search ends, announce the solution(s) found


Example 1: Traveling Salesman Problem

• Given n cities with known distances between each pair, find the shortest tour that passes through all the cities exactly once before returning to the starting city
• Alternatively: find the shortest Hamiltonian circuit in a weighted connected graph
• Example: four cities a, b, c, d with distances ab = 2, bc = 3, cd = 7, ad = 5, bd = 4, ac = 8

How do we represent a solution (Hamiltonian circuit)?

TSP by Exhaustive Search
Tour            Cost
a→b→c→d→a       2+3+7+5 = 17
a→b→d→c→a       2+4+7+8 = 21
a→c→b→d→a       8+3+4+5 = 20
a→c→d→b→a       8+7+4+2 = 21
a→d→b→c→a       5+4+3+8 = 20
a→d→c→b→a       5+7+3+2 = 17

Efficiency: Θ((n−1)!)
Chapter 5 discusses how to generate permutations fast.
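In Python, itertools.permutations does exactly that; an exhaustive-search sketch for the instance above (the starting city is fixed, since tours are cyclic):

from itertools import permutations

dist = {("a", "b"): 2, ("b", "c"): 3, ("c", "d"): 7,
        ("a", "d"): 5, ("b", "d"): 4, ("a", "c"): 8}

def d(u, v):
    return dist.get((u, v)) or dist[(v, u)]   # distances are symmetric

def tsp(cities, start="a"):
    """Try every tour starting and ending at `start`; Θ((n-1)!) tours."""
    rest = [c for c in cities if c != start]
    best_tour, best_cost = None, float("inf")
    for perm in permutations(rest):
        tour = (start,) + perm + (start,)
        cost = sum(d(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost

print(tsp(["a", "b", "c", "d"]))  # (('a', 'b', 'c', 'd', 'a'), 17)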
Example 2: Knapsack Problem
Given n items:
– weights: w1, w2, …, wn
– values: v1, v2, …, vn
– a knapsack of capacity W
Find the most valuable subset of the items that fit into the knapsack

Example: Knapsack capacity W=16


item weight value
1 2 $20
2 5 $30
3 10 $50
4 5 $10
Knapsack Problem by Exhaustive Search

Subset        Total weight    Total value
{1}           2               $20
{2}           5               $30
{3}           10              $50
{4}           5               $10
{1,2}         7               $50
{1,3}         12              $70
{1,4}         7               $30
{2,3}         15              $80
{2,4}         10              $40
{3,4}         15              $60
{1,2,3}       17              not feasible
{1,2,4}       12              $60
{1,3,4}       17              not feasible
{2,3,4}       20              not feasible
{1,2,3,4}     22              not feasible

Each subset can be represented by a binary string (bit vector, Ch 5).
Efficiency: Θ(2^n)
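An exhaustive-search sketch using bit vectors to enumerate all 2^n subsets:

def knapsack_exhaustive(weights, values, W):
    """Try every subset (as a bit vector); keep the most valuable feasible one."""
    n = len(weights)
    best_subset, best_value = [], 0
    for bits in range(2 ** n):                        # each bits value encodes a subset
        subset = [i for i in range(n) if bits >> i & 1]
        weight = sum(weights[i] for i in subset)
        value = sum(values[i] for i in subset)
        if weight <= W and value > best_value:        # disqualify infeasible subsets
            best_subset, best_value = subset, value
    return best_subset, best_value

# The instance above, with W = 16 (indices are 0-based: [1, 2] means items 2 and 3)
print(knapsack_exhaustive([2, 5, 10, 5], [20, 30, 50, 10], 16))  # ([1, 2], 80)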
Example 3: The Assignment Problem

There are n people who need to be assigned to n jobs, one person per job. The cost of assigning person i to job j is C[i,j]. Find an assignment that minimizes the total cost.

          Job 0   Job 1   Job 2   Job 3
Person 0    9       2       7       8
Person 1    6       4       3       7
Person 2    5       8       1       8
Person 3    7       6       9       4

Algorithmic plan: pose the problem as one about a cost matrix, generate all legitimate assignments, compute their costs, and select the cheapest one. How many assignments are there? (n!)
Assignment Problem by Exhaustive Search

    9 2 7 8
C = 6 4 3 7
    5 8 1 8
    7 6 9 4

Assignment (col.#s)   Total Cost
1, 2, 3, 4            9+4+1+4 = 18
1, 2, 4, 3            9+4+8+9 = 30
1, 3, 2, 4            9+3+8+4 = 24
1, 3, 4, 2            9+3+8+6 = 26
1, 4, 2, 3            9+7+8+9 = 33
1, 4, 3, 2            9+7+1+6 = 23
etc.

(For this particular instance, the optimal assignment can be found by exploiting the specific features of the numbers given. It is 2, 1, 3, 4, with total cost 2+6+1+4 = 13.)
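An exhaustive-search sketch over all n! permutations of job columns:

from itertools import permutations

C = [[9, 2, 7, 8],
     [6, 4, 3, 7],
     [5, 8, 1, 8],
     [7, 6, 9, 4]]

def assignment_exhaustive(C):
    """Try every one-to-one assignment of people to jobs; n! candidates."""
    n = len(C)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):       # perm[i] = job assigned to person i
        cost = sum(C[i][perm[i]] for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

print(assignment_exhaustive(C))  # ((1, 0, 2, 3), 13), i.e., jobs 2, 1, 3, 4 (1-indexed)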
Final Comments on Exhaustive Search

• Exhaustive-search algorithms run in a realistic amount of time only on very small instances

• In some cases, there are much better alternatives!
    – Euler circuits
    – shortest paths
    – minimum spanning tree
    – assignment problem (the Hungarian method runs in O(n^3) time)
• In many cases, exhaustive search or its variation is the
only known way to get exact solution
Voronoi Diagrams

The partitioning of a plane with points into convex polygons such that each polygon contains exactly one generating point and every point in a given polygon is closer to its generating point than to any other.

A Voronoi diagram is sometimes also known as a Dirichlet tessellation. The cells are called Dirichlet regions, Thiessen polytopes, or Voronoi polygons.

• Voronoi diagrams were considered as early as 1644 by René Descartes and were used by Dirichlet (1850) in the investigation of positive quadratic forms.

• They were also studied by Voronoi (1907), who extended the investigation of Voronoi diagrams to higher dimensions.

• They find widespread applications in areas such as computer graphics, epidemiology, geophysics, and meteorology.
CSE408
Measuring input size & running time

Lecture #3
Analysis of algorithms

Issues:
• correctness
• time efficiency
• space efficiency
• optimality

Approaches:
• theoretical analysis
• empirical analysis
Theoretical analysis of time efficiency

Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size.

Basic operation: the operation that contributes most towards the running time of the algorithm.

T(n) ≈ c_op · C(n)

where T(n) is the running time, n is the input size, c_op is the execution time for the basic operation, and C(n) is the number of times the basic operation is executed.
Empirical analysis of time efficiency

Select a specific (typical) sample of inputs

Use a physical unit of time (e.g., milliseconds) or count the actual number of the basic operation's executions

Analyze the empirical data
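A minimal empirical-analysis sketch using Python's time module (the choice of sample sizes is an illustrative assumption):

import time, random

def selection_sort(A):
    for i in range(len(A) - 1):
        m = min(range(i, len(A)), key=A.__getitem__)   # index of smallest in A[i..]
        A[i], A[m] = A[m], A[i]

for n in [500, 1000, 2000, 4000]:        # a typical sample of input sizes
    A = [random.random() for _ in range(n)]
    start = time.perf_counter()
    selection_sort(A)
    elapsed = time.perf_counter() - start
    print(n, round(elapsed, 4))          # expect roughly 4x the time when n doubles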


Best-case, average-case, worst-case

For some algorithms efficiency depends on form of input:

Worst case: Cworst(n) – maximum over inputs of size n

Best case: Cbest(n) – minimum over inputs of size n

Average case: Cavg(n) – “average” over inputs of size n


• Number of times the basic operation will be executed on typical input
• NOT the average of worst and best case
• Expected number of basic operations considered as a random variable under
some assumption about the probability distribution of all possible inputs
Example: Sequential search

Worst case: Cworst(n) = n key comparisons
Best case: Cbest(n) = 1 comparison
Average case: depends on the probability p of a successful search; derived on the next slide
Example

Let's consider again sequential search. The standard assumptions are that (a) the probability of a successful search is equal to p (0 ≤ p ≤ 1) and (b) the probability of the first match occurring in the ith position of the list is the same for every i. We can find the average number of key comparisons Cavg(n) as follows. In the case of a successful search, the probability of the first match occurring in the ith position of the list is p/n for every i, and the number of comparisons made by the algorithm in such a situation is obviously i. In the case of an unsuccessful search, the number of comparisons will be n, with the probability of such a search being (1 − p). Therefore,

Cavg(n) = [1 + 2 + … + n] · (p/n) + n(1 − p) = p(n + 1)/2 + n(1 − p)
Types of formulas for basic operation’s count

Exact formula
    e.g., C(n) = n(n−1)/2

Formula indicating order of growth with specific multiplicative constant
    e.g., C(n) ≈ 0.5 n^2

Formula indicating order of growth with unknown multiplicative constant
    e.g., C(n) ≈ c·n^2
Order of growth

Most important: order of growth within a constant multiple as n → ∞

Examples:
• How much faster will the algorithm run on a computer that is twice as fast?
• How much longer does it take to solve a problem of double the input size?
Values of some important functions as n → ∞
Conclusion

• The efficiency analysis framework concentrates on the order of growth of an algorithm's basic operation count as the principal indicator of the algorithm's efficiency

• To compare and rank such orders of growth, computer scientists use three notations: O (big oh), Ω (big omega), and Θ (big theta)
CSE408
Asymptotic notations
Lecture #4
Asymptotic Notations

• The efficiency analysis framework concentrates on the order of growth of an algorithm's basic operation count as the principal indicator of the algorithm's efficiency

• To compare and rank such orders of growth, computer scientists use three notations: O (big oh), Ω (big omega), and Θ (big theta)
O Notation
f(n) ∈ O(g(n)) if there exist a positive constant c and a nonnegative integer n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
Example: 100n + 5 ∈ O(n^2)

Big Omega Notation
f(n) ∈ Ω(g(n)) if there exist a positive constant c and a nonnegative integer n0 such that f(n) ≥ c·g(n) for all n ≥ n0.
Example: n^3 ∈ Ω(n^2)

Theta Notation
f(n) ∈ Θ(g(n)) if there exist positive constants c1, c2 and a nonnegative integer n0 such that c2·g(n) ≤ f(n) ≤ c1·g(n) for all n ≥ n0.
Example: n(n−1)/2 ∈ Θ(n^2)
Asymptotic order of growth

A way of comparing functions that ignores constant factors and small


input sizes

• O(g(n)): class of functions f(n) that grow no faster than g(n)

• Θ(g(n)): class of functions f(n) that grow at same rate as g(n)

• Ω(g(n)): class of functions f(n) that grow at least as fast as g(n)


Some properties of asymptotic order of growth

• f(n)  O(f(n))

• f(n)  O(g(n)) iff g(n) (f(n))

• If f (n)  O(g (n)) and g(n)  O(h(n)) , then f(n)  O(h(n))

Note similarity with a ≤ b

• If f1(n)  O(g1(n)) and f2(n)  O(g2(n)) , then


f1(n) + f2(n)  O(max{g1(n), g2(n)})
Establishing order of growth using limits

lim n→∞ T(n)/g(n) =
• 0: order of growth of T(n) < order of growth of g(n)
• c > 0: order of growth of T(n) = order of growth of g(n)
• ∞: order of growth of T(n) > order of growth of g(n)

Examples:
• 10n vs. n^2: the limit of 10n/n^2 is 0, so 10n grows slower than n^2
• n(n+1)/2 vs. n^2: the limit of [n(n+1)/2]/n^2 is 1/2 > 0, so they have the same order of growth
L’Hôpital’s rule and Stirling’s formula

L’Hôpital’s rule: If limn→ f(n) = limn→ g(n) =  and


the derivatives f´, g´ exist, then

lim f(n) lim f ´(n)


=
n→ g(n) n→ g ´(n)
Example: log n vs. n

Stirling’s formula: n!  (2n)1/2 (n/e)n


Example: 2n vs. n!
Small Oh Notation
f(n) ∈ o(g(n)) if for every positive constant c there exists n0 such that f(n) ≤ c·g(n) for all n ≥ n0; equivalently, lim n→∞ f(n)/g(n) = 0, i.e., f grows strictly slower than g.
Example: 10n ∈ o(n^2), but n^2/2 ∉ o(n^2)
Orders of growth of some important functions

• All logarithmic functions log_a n belong to the same class Θ(log n) no matter what the logarithm's base a > 1 is

• All polynomials of the same degree k belong to the same class: a_k n^k + a_{k−1} n^{k−1} + … + a_0 ∈ Θ(n^k)

• Exponential functions a^n have different orders of growth for different a's

• order log n < order n^α (α > 0) < order a^n (a > 1) < order n! < order n^n
Basic asymptotic efficiency classes

1          constant

log n      logarithmic

n          linear

n log n    n-log-n

n^2        quadratic

n^3        cubic

2^n        exponential

n!         factorial