Merged Lecture Notes

Contents

1 Sorting and Tries 2
2 Graphs and Graph Algorithms 29
3 Strings and text algorithms 101
4 NP Completeness 185
5 Computability 242
Algorithmics I 2022

Algorithmics I

Section 1 – Sorting and Tries

Dr. Gethin Norman

School of Computing Science


University of Glasgow

[email protected]



Sorting - Recap
Naïve sorting algorithms: O(n^2) in the worst/average case
− Selectionsort, Insertionsort, Bubblesort

Clever sorting algorithms: O(n log n) in the worst/average case


− Mergesort, Heapsort (which we have just seen)

The fastest sorting algorithm in practice is Quicksort


− O(n log n) on average
− but no better than O(n^2) in the worst case (unless a clever variant is used)

Question: can we come up with a sorting algorithm that is better


than O(n log n) in the worst case?
− for example an O(n) algorithm

Sorting - Comparison based sorting
Claim: no sorting algorithm that is based on pairwise comparison
of values can be better than O(n log n)

Justification: describe the algorithm by a decision tree (binary tree)
− each node represents a comparison between two elements
− path branches left or right depending on the outcome of the comparison
− an execution of the algorithm is a path from the root to a leaf node
− the number of leaf nodes in the tree must be at least the number of
‘outcomes’ of the algorithm
− the number of outcomes is the number of possible orderings of n items
− that is, there are at least n! leaf nodes (remember permutations from AF2)

[decision tree figure: the root comparison a1>b1 branches ‘no’/‘yes’ to
the comparisons a2>b2 and a3>b3, each of which branches ‘no’/‘yes’ again]

Sorting - Comparison based sorting
We have shown the decision tree has at least n! leaf nodes

The worst-case complexity of the algorithm is no better than O(h)


− where h is the height of the tree
− an execution is a path from the root node to a leaf node
− we perform a comparison at each branch node, so h operations in the
worst case

A decision tree is a binary tree (two branches ‘yes’ and ‘no’)


and hence the number of leaf nodes is less than or equal to 2^(h+1)-1
− a binary tree of height h has at most 2^(h+1)-1 nodes

Combining these properties it follows that n! ≤ 2^(h+1)-1 ≤ 2^(h+1)

Sorting - Comparison based sorting
We have shown: complexity is no better than O(h) and 2^(h+1) ≥ n!
− h is the height of the decision tree
− n is the number of items to be sorted

Taking log_2 of both sides of 2^(h+1) ≥ n! yields:

h+1 ≥ log_2(n!)
    > log_2((n/2)^(n/2))                (since n! > (n/2)^(n/2))
    = (n/2)·log_2(n/2)                  (since log a^b = b·log a)
    = (n/2)·log_2 n − (n/2)·log_2 2     (since log(a/b) = log a − log b)
    = (n/2)·log_2 n − n/2               (since log_a a = 1)

Giving a complexity of at least O(n log n) as required
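
For example, for n = 8 there are 8! = 40320 possible orderings, and
log_2(40320) ≈ 15.3, so h+1 ≥ 16; that is, the decision tree must have
height at least 15 and some input of 8 items needs at least 15 comparisons.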

Sorting – Radix sorting
We have shown that no sorting algorithm based on pairwise
comparisons can be better than O(n log n) in the worst case
− therefore to improve on this worst case bound, we have to devise a
method based on something other than comparisons

Radix sort uses a different approach to achieve an O(n) complexity


− but the algorithm has to exploit the structure of the items being sorted,
so may be less versatile
− in practice, it is faster than O(n log n) algorithms only for very large n

Assume items to sort can be treated as bit-sequences of length m


− let b be a chosen factor of m
− so b and m are constants for any particular instance

Sorting – Radix sorting - Algorithm
Each item has bit positions labelled 0,1,…,m-1
− bit 0 being the least significant (i.e. the right-most)
The algorithm uses m/b iterations
− in each iteration the items are distributed into 2^b buckets
− a bucket is just a list
− the buckets are labelled 0,1,…,2^b-1 (or, equivalently, 00...0 to 11...1)
− during the ith iteration an item is placed in the bucket corresponding to
the integer represented by the bits in positions b×i−1,…,b×(i−1)
• e.g. for b=4 and i=2, consider the bits in positions 7,…,4
item = 0010100100110001
• these bits, 0011, represent the integer 3
• so the item is placed in the bucket labelled 3 (or, equivalently, 0011)
− at the end of an iteration the buckets are concatenated to give a new
sequence which will be used as the starting point of the next iteration

Sorting – Radix sorting - Example
Suppose we want to sort the following sequence with Radix sort

15 43 5 27 60 18 26 2

Binary encodings are given by

15 = 001111 43 = 101011 5 = 000101 27 = 011011


60 = 111100 18 = 010010 26 = 011010 2 = 000010

− items have bit positions 0,…,5, hence m=6


− b must be a factor of m, so let's choose b=2

This means in Radix sort we have:


− 2^b = 2^2 = 4 buckets labelled 0,1,2,3 (or equivalently 00,01,10,11)
and m/b = 3 iterations are required

Sorting – Radix sorting - Example
Sequence: 15 43 5 27 60 18 26 2

Binary encodings: 15 = 001111 43 = 101011 5 = 000101 27 = 011011


60 = 111100 18 = 010010 26 = 011010 2 = 000010

First iteration of radix


− items are distributed into 4 buckets (a bucket is just a list)
− during the 1st iteration, an item is placed in a bucket corresponding to
the integer represented by the bits in positions 1,…,0
− buckets concatenated at the end of an iteration to give input sequence
for the next iteration
1st iteration:
bucket 00: 60
bucket 01: 5
bucket 10: 18 26 2
bucket 11: 15 43 27
new sequence: 60 5 18 26 2 15 43 27
Sorting – Radix sorting - Example
New sequence: 60 5 18 26 2 15 43 27

Binary encodings: 60 = 111100 5 = 000101 18 = 010010 26 = 011010


2 = 000010 15 = 001111 43 = 101011 27 = 011011

Second iteration of radix


− items are distributed into 4 buckets (a bucket is just a list)
− during the 2nd iteration, an item is placed in a bucket corresponding to
the integer represented by the bits in positions 3,…,2
− buckets concatenated at the end of an iteration to give input sequence
for the next iteration
2nd iteration:
bucket 00: 18 2
bucket 01: 5
bucket 10: 26 43 27
bucket 11: 60 15
new sequence: 18 2 5 26 43 27 60 15

Sorting – Radix sorting - Example
New sequence: 18 2 5 26 43 27 60 15

Binary encodings: 18 = 010010 2 = 000010 5 = 000101 26 = 011010


43 = 101011 27 = 011011 60 = 111100 15 = 001111

Third (and final) iteration of radix


− items are distributed into 4 buckets (a bucket is just a list)
− during the 3rd iteration, an item is placed in a bucket corresponding to
the integer represented by the bits in positions 5,…,4
− buckets concatenated at the end of an iteration to give input sequence
for the next iteration
3rd iteration:
bucket 00: 2 5 15
bucket 01: 18 26 27
bucket 10: 43
bucket 11: 60
sorted sequence: 2 5 15 18 26 27 43 60

Sorting – Radix sorting - Pseudocode

// assume we have the following method which returns the value


// represented by the b bits of x when starting at position pos
private int bits(Item x, int b, int pos)

// suppose that:
// a is the sequence to be sorted
// m is the number of bits in each item of the sequence a
// b is the ‘block length’ of radix sort

int numIterations = m/b; // number of iterations required for sorting


int numBuckets = (int) Math.pow(2, b); // number of buckets

// represent sequence a to be sorted as an ArrayList of Items


ArrayList<Item> a = new ArrayList<Item>();

// represent the buckets as an array of ArrayLists


ArrayList<Item>[] buckets = new ArrayList[numBuckets];
for (int i=0; i<numBuckets; i++) buckets[i] = new ArrayList<Item>();

Sorting – Radix sorting - Pseudocode

for (int i=1; i<=numIterations; i++){

// clear the buckets


for (int j=0; j<numBuckets; j++) buckets[j].clear();

// distribute the items (in order from the sequence a)


for (Item x : a){
// find the value of the b bits starting from position (i-1)*b in x
int k = bits(x, b, (i-1)*b); // find the correct bucket for item x
buckets[k].add(x); // add item to this bucket
}

a.clear(); // clear the sequence

// concatenate the buckets (in sequence) to form the new sequence


for (int j=0; j<numBuckets; j++) a.addAll(buckets[j]);
}
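
For concreteness, a minimal runnable version of the above (an assumption of
this sketch: items are non-negative int values, so bits can be implemented
by shifting and masking; names follow the pseudocode):

import java.util.ArrayList;

public class RadixSort {

    // value represented by the b bits of x starting at position pos
    private static int bits(int x, int b, int pos) {
        return (x >> pos) & ((1 << b) - 1);
    }

    /** sort a sequence of m-bit non-negative integers using radix b (b divides m) */
    public static void radixSort(ArrayList<Integer> a, int m, int b) {
        int numIterations = m / b;
        int numBuckets = 1 << b;                       // 2^b buckets
        ArrayList<ArrayList<Integer>> buckets = new ArrayList<>();
        for (int j = 0; j < numBuckets; j++) buckets.add(new ArrayList<>());
        for (int i = 1; i <= numIterations; i++) {
            for (ArrayList<Integer> bucket : buckets) bucket.clear();
            for (int x : a) buckets.get(bits(x, b, (i-1)*b)).add(x); // distribute
            a.clear();
            for (ArrayList<Integer> bucket : buckets) a.addAll(bucket); // concatenate
        }
    }
}

On the earlier example (15 43 5 27 60 18 26 2 with m=6 and b=2) this
produces the sorted sequence 2 5 15 18 26 27 43 60.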

Sorting – Radix sorting - Correctness
Let x and y be two items with x<y
− need to show that x precedes y in the final sequence

Suppose j is the last iteration for which relevant bits of x and y differ
− since x<y and j is the last iteration that x and y differ
the relevant bits of x must be smaller than those of y
− therefore x goes into an ‘earlier’ bucket than y
and hence x precedes y in the sequence after this iteration

− since j is the last iteration where bits differ:


in all later iterations x and y go in the same bucket
so their relative order is unchanged

Sorting – Radix sorting - Complexity
Number of iterations is m/b and number of buckets is 2^b

During each of the m/b iterations


− the sequence is scanned and items are allocated buckets: O(n) time
− buckets are concatenated: O(2^b) time

Therefore the overall complexity is O((m/b)·(n+2^b))


− this is O(n), since m and b are constants

Time-space trade-off
− the larger the value of b, the smaller the multiplicative constant (m/b) in
the complexity function and so the faster the algorithm will become
− however an array of size 2^b is required for the buckets
therefore increasing b will increase the space requirements

Tries (retrieval)
Binary search trees are comparison-based data structures

Tries are to binary search trees as Radix sort is to comparison-based sorting


− stored items have a key value that is interpreted as a sequence of bits,
or characters, …
− there is a multiway branch at each node where each branch has an
associated symbol and no two siblings have the same symbol
− the branch taken at level i during a search, is determined by the ith
element of the key value (ith bit, ith character, …)
− tracing a path from the root to a node spells out the key value of the item

Example: use a trie to store items with a key value that is a string
− say the words in a dictionary

Tries - Examples
An example trie containing words from the 4 letter alphabet {a, e, r, t}

[trie figure: a multiway tree in which each branch is labelled by a letter;
tracing a path from the root spells out a string, e.g. a➝r spells the
string ‘ar’ (an intermediate node, not a word) and a➝r➝t spells the
word ‘art’]

• Two kinds of nodes
− nodes representing words
− internal/intermediate nodes (not representing a word)

Tries – Search algorithm (pseudo code)
// searching for a word w in a trie t
Node n = root of t; // current node (start at root)
int i = 0; // current position in word w (start at beginning)

while (true) {
if (n has a child c labelled w.charAt(i)) {
// can match the character of word in the current position
if (i == w.length()-1) { // end of word
if (c is an 'intermediate' node) return "absent";
else return "present";
}
else { // not at end of word
n = c; // move to child node
i++; // move to next character of word
}
}
else return "absent"; // cannot match current character
}
Tries – Insertion algorithm (pseudo code)

// inserting a word w in a trie t


Node n = root of t; // current node (start at root)

for (int i=0; i < w.length(); i++){ // go through chars of word


if (n has no child c labelled w.charAt(i)){
// need to add new node
create such a child c;
mark c as intermediate;
} // otherwise let c be the existing child labelled w.charAt(i)
n = c; // move to child node
}
mark n as representing a word;

Tries - Algorithms
Deletion of a string from a trie
− exercise

Complexity of trie operations


− (almost) independent of the number of items
− essentially linear in the string length

Tries - Implementation
Various possible implementations
− using an array (of pointers to represent the children of each node)
− using linked lists (to represent the children of each node)
− time/space trade-off

List implementation: each node stores its letter, a word flag (T/F), a link
to its first child and a link to its next sibling
− e.g. the trie with root children a, e, r, t, where ‘a’ and its child ‘e’
mark the words "a" and "ae", becomes the list structure:
root (′′, F) with child list (′a′, T) ➝ (′e′, F) ➝ (′r′, F) ➝ (′t′, F),
where (′a′, T) has child list (′e′, T)

Tries – Class to represent dictionary tries
public class Node { // node of a trie
private char letter; // label on incoming branch
private boolean isWord; // true when node represents a word
private Node sibling; // next sibling (when it exists)
private Node child; // first child (when it exists)

/** create a new node with letter c */


public Node(char c){
letter = c;
isWord = false;
sibling = null;
child = null;
}
// include accessors and mutators for the various components of class
}
public class Trie {
private Node root;
public Trie() {
root = new Node(Character.MIN_VALUE); // null character in root
}

Tries – Method to search
private enum Outcomes {PRESENT, ABSENT, UNKNOWN}
/** search trie for word w */
public boolean search(String w) {
Outcomes outcome = Outcomes.UNKNOWN;
int i = 0; // position in word so far searched (start at beginning)
Node current = root.getChild(); // start with first child of root
while (outcome == Outcomes.UNKNOWN) {
if (current == null) outcome = Outcomes.ABSENT; // dead-end
else if (current.getLetter() == w.charAt(i)) { // positions match
if (i == w.length()-1) outcome = Outcomes.PRESENT; // matched word
else { // descend one level…
current = current.getChild(); // in trie
i++; // in word being searched
}
}
else current = current.getSibling(); // try next sibling
}
if (outcome != Outcomes.PRESENT) return false;
else return current.getIsWord(); // true if current node represents a word
}

Tries – Method to insert
public void insert(String w){ /* insert word w into trie */
int i = 0; // position in word (start at beginning)
Node current = root; // current node of trie (start at root)
Node next = current.getChild(); // child of current node we are testing
while (i < w.length()) { // not reached the end of the word
if (next == null) { // no more children to try: need new node
Node x = new Node(w.charAt(i)); // label with ith element of word
x.setSibling(current.getChild()); // sibling: first child of current
current.setChild(x); // make it first child of current node
current = x; // move to the new node
next = current.getChild(); // update child node (null for a new node)
i++; // next position in word
} else if (next.getLetter() == w.charAt(i)) { // chars match: descend a level
current = next; // update current to the child node
next = current.getChild(); // update child node
i++; // next position in word
} else next = next.getSibling(); // try next sibling
}
current.setIsWord(true); // current represents word w
}
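
For concreteness, a short usage sketch of the Trie class and the two methods
above (assuming the accessors and mutators mentioned in the Node class, e.g.
getChild and setSibling, are implemented as standard getters and setters):

public class TrieDemo {
    public static void main(String[] args) {
        Trie t = new Trie();
        t.insert("art");
        t.insert("ar");  // marks an existing intermediate node as a word
        System.out.println(t.search("ar"));  // true
        System.out.println(t.search("art")); // true
        System.out.println(t.search("a"));   // false: intermediate node only
        System.out.println(t.search("rat")); // false: no such path
    }
}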
Algorithmics I 2022

Algorithmics I

Section 2 – Graphs & graph algorithms

Dr. Gethin Norman


School of Computing Science
University of Glasgow
[email protected]



Graph basics
(undirected) graph G = (V,E)
− V is finite set of vertices (the vertex set)
− E is set of edges, each edge is a subset of V of size 2 (the edge set)
Pictorially:
− a vertex is represented by a point
− an edge by a line joining the relevant pair of points
− a graph can be drawn in different ways
− e.g. two representations of the same graph

V = {a,b,c,x,y,z}
E = { {a,x},{a,y},{a,z},
      {b,x},{b,y},{b,z},
      {c,x},{c,y},{c,z} }

[figure: two drawings of this graph, one with a,b,c above x,y,z and one
with the six vertices arranged in a circle]

Graph basics
[figure: the same graph drawn in two ways, as on the previous slide]

In this graph:
− vertices a & z are adjacent that is {a,z} is an element of the edge set E
− vertices a & b are non-adjacent that is {a,b} is not an element of E
− vertex a is incident to edge {a,x}
− a➝x➝b➝y➝c is a path of length 4 (number of edges)
− a➝x➝b➝y➝a is a cycle of length 4
− all vertices have degree 3
• i.e. all vertices are incident to three edges

Graph basics - Definitions
A graph is connected if every pair of vertices is joined by a path

[figure: an example graph on vertices u,v,w,x,y,z]
A non-connected graph has two or more connected components

• A graph is a tree if it is connected and acyclic (no cycles)


a tree with n vertices has n-1 edges
- at least n-1 edges to be connected
- at most n-1 edges to be acyclic

• A graph is a forest if it is acyclic, i.e. each of its connected components is a tree


Graph basics - Definitions
A graph is complete (a clique) if every pair of vertices is joined by an edge

[figure: K6, the clique on 6 vertices]

A graph is bipartite if the vertices are in two disjoint sets U & W
and every edge joins a vertex in U to a vertex in W

[figure: the complete bipartite graph K3,3, with U = {a,b,c} and W = {x,y,z};
it is complete since all edges between vertices in U and W are present]

− bipartite graphs do not need to be complete

Graph basics – Directed graphs
A directed graph (digraph) D = (V,E)
− V is the finite set of vertices and E is the finite set of edges
− here each edge is an ordered pair (x,y) of vertices

Pictorially: edges are drawn as directed lines/arrows

[figure: a digraph on vertices u,v,w,x,y,z; for example (u,v),(w,y),(y,w) ∈ E]
− u is adjacent to v and v is adjacent from u
− y has in-degree 2 and out-degree 1

In a digraph, paths and cycles must follow edge directions


• e.g. u ➝ w ➝ x is a path and w ➝ y ➝ w is a cycle

Graph representations – Undirected graphs
Undirected graph: Adjacency matrix
− one row and column for each vertex
− row i, column j contains a 1 if ith and jth vertices adjacent, 0 otherwise

Undirected graph: Adjacency lists


− one list for each vertex
− list i contains an entry for j if the vertices i and j are adjacent

Graph representations – Undirected graphs
Undirected graph G
[figure: the graph G on vertices u,v,w,x,y,z]

Adjacency matrix for G Adjacency lists for G

u v w x y z
u: 0 1 0 1 0 0 u: v➝x
v: 1 0 1 1 1 0 v: u➝w➝x➝y
w: 0 1 0 1 1 0 w: v➝x➝y
x: 1 1 1 0 1 0 x: u➝v➝w➝y
y: 0 1 1 1 0 1 y: v➝w➝x➝z
z: 0 0 0 0 1 0 z: y
|V|×|V| array; the adjacency lists contain 2×|E| entries in all

Graph representations – Directed graphs
Directed graph: Adjacency matrix
− one row and column for each vertex
− row i, column j contains a 1 if there is an edge from i to j
and 0 otherwise

Directed graph: Adjacency lists


− one list for each vertex
− the list for vertex i contains vertex j if there is an edge from i to j

Graph representations – Directed graphs
Directed graph D

[figure: the digraph D from the earlier slide]

Adjacency matrix for D Adjacency lists for D


u v w x y z u: v➝w
u: 0 1 1 0 0 0 v:
v: 0 0 0 0 0 0 w: x➝y
w: 0 0 0 1 1 0 x:
x: 0 0 0 0 0 0 y: w
y: 0 0 1 0 0 0 z: y
z: 0 0 0 0 1 0
|V|×|V| array; the adjacency lists contain |E| entries in all

Implementation – Adjacency lists
Recall adjacency list for an undirected graph
− one list for each vertex
− list i contains an element for j if the vertices i and j are adjacent

graph G and its adjacency lists

[figure: a graph on vertices v,w,x,y,z]

v: w➝x➝y
w: v➝x➝y
x: v➝w➝y
y: v➝w➝x➝z
z: y

Implementation: define classes for


− the entries of adjacency lists
− the vertices (includes a linked list representing its adjacency list)
− graphs (includes the size of the graph and an array of vertices)
• array allows for efficient access using “index” of a vertex
Implementation – Adjacency lists
/** class to represent an entry in the adjacency list of a vertex
in a graph */
public class AdjListNode {

private int vertexIndex; // the vertex index of the entry

// possibly other fields, for example representing properties


// of the edge such as weight, capacity, …

/** creates a new entry for vertex indexed i */


public AdjListNode(int i){
vertexIndex = i;
}
public int getVertexIndex(){ // gets the vertex index of the entry
return vertexIndex;
}
public void setVertexIndex(int i){ // sets vertex index to i
vertexIndex = i;
}
}

Implementation – Adjacency lists
import java.util.LinkedList; // we require the linked list class

/** class to represent a vertex in a graph */


public class Vertex {

private int index; // the index of this vertex


private LinkedList<AdjListNode> adjList; // the adjacency list of vertex

// possibly other fields, e.g. representing data stored at the node

/** create a new instance of vertex with index i */


public Vertex(int i) {
index = i; // set index
adjList = new LinkedList<AdjListNode>();// create empty adjacency list
}

/** return the index of the vertex */


public int getIndex(){
return index;
}
Implementation – Adjacency lists
// class Vertex continued

/** set the index of the vertex */


public void setIndex(int i){
index = i;
}
/** return the adjacency list of the vertex */
public LinkedList<AdjListNode> getAdjList(){
return adjList;
}
/** add vertex with index j to the adjacency list */
public void addToAdjList(int j){
adjList.addLast(new AdjListNode(j));
}
/** return the degree of the vertex */
public int vertexDegree(){
return adjList.size();
}
}

Implementation – Adjacency lists
import java.util.LinkedList; // again require the linked list class
/** class to represent a graph */
public class Graph {

private Vertex[] vertices; // the vertices


private int numVertices = 0; // number of vertices

// possibly other fields representing properties of the graph

/** Create a Graph with n vertices indexed 0,...,n-1 */


public Graph(int n) {
numVertices = n;
vertices = new Vertex[n];
for (int i = 0; i < n; i++) vertices[i] = new Vertex(i);
}
/** returns number of vertices in the graph */
public int size(){
return numVertices;
}
}
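
For concreteness, a sketch showing how the 6-vertex graph G from the
adjacency-lists slide could be built with these classes; the addEdge helper
is not part of the class shown above but is a hypothetical addition to Graph:

// hypothetical helper, added to class Graph
/** add the undirected edge {i,j} to both adjacency lists */
public void addEdge(int i, int j) {
    vertices[i].addToAdjList(j);
    vertices[j].addToAdjList(i);
}

// usage, e.g. in a main method, with u,v,w,x,y,z indexed 0,...,5
Graph g = new Graph(6);
g.addEdge(0, 1); g.addEdge(0, 3);                   // {u,v}, {u,x}
g.addEdge(1, 2); g.addEdge(1, 3); g.addEdge(1, 4);  // {v,w}, {v,x}, {v,y}
g.addEdge(2, 3); g.addEdge(2, 4);                   // {w,x}, {w,y}
g.addEdge(3, 4); g.addEdge(4, 5);                   // {x,y}, {y,z}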

Graph search and traversal algorithms
Graph search and traversal algorithms
− a systematic way to explore a graph (when starting from some vertex)

[figure: the example digraph D on vertices u,v,w,x,y,z]
Example: web crawler collects data from hypertext documents
by traversing a directed graph D where
− vertices are hypertext documents
− (u,v) is an edge if document u contains a hyperlink to document v

A search/traversal visits all vertices by travelling along edges


− traversal is efficient if it explores graph in O(|V|+|E|) time

Depth first search/traversal (DFS)
From starting vertex
− follow a path of unvisited vertices until path can be extended no further
− then backtrack along the path until an unvisited vertex can be reached
− continue until we cannot find any unvisited vertices
Repeat for other components (if any)

The edges traversed form a spanning tree (or forest)


− a depth-first spanning tree (forest)
− spanning tree of a graph is tree composed of all the vertices and some
(or perhaps all) of the edges of the graph

Depth first traversal - Example
[figure: an undirected graph G and a depth-first spanning tree of G;
the vertices are numbered 1-8 in the order they are visited]

Implementation – DFS – Add to vertex class

private boolean visited; // has vertex been visited in a traversal?

private int pred; // index of the predecessor vertex in a traversal

public boolean getVisited(){


return visited;
}
public void setVisited(boolean b){
visited = b;
}
public int getPred(){
return pred;
}
public void setPred(int i){
pred = i;
}

Implementation – DFS – Add to graph class
/** visit vertex v, with predecessor index p, during a dfs */
private void visit(Vertex v, int p){
v.setVisited(true); // update as now visited
v.setPred(p); // set predecessor (indicates edge used to find vertex)
LinkedList<AdjListNode> L = v.getAdjList(); // get adjacency list

for (AdjListNode node : L){ // go through all adjacent vertices


int i = node.getVertexIndex(); // index of the adjacent vertex
if (!vertices[i].getVisited()) // if vertex has not been visited
visit(vertices[i], v.getIndex()); // continue dfs search from it
// setting the predecessor vertex index to the index of v
}
}
/** carry out a depth first search/traversal of the graph */
public void dfs(){
for (Vertex v : vertices) v.setVisited(false); // initialise
for (Vertex v : vertices) if (!v.getVisited()) visit(v,-1);
// if vertex is not yet visited, then start dfs on vertex
// -1 is used to indicate v was not found through an edge of the graph
}

Analysis – Depth first search
Each vertex is visited, and each element in the adjacency lists is
processed, so overall O(n+m)
− where n is the number of vertices and m the number of edges

Can be adapted to the adjacency matrix representation


− but now O(n^2) since we look at every entry of the adjacency matrix

Some applications
− to determine if a given graph is connected
− to identify the connected components of a graph
− to determine if a given graph contains a cycle (see tutorial questions)
− to determine if a given graph is bipartite (see tutorial questions)

Breadth first search/traversal (BFS)
Search fans out as widely as possible at each vertex
− from the current vertex, visit all the adjacent vertices
this is referred to as processing the current vertex
− vertices are processed in the order in which they are visited
− continue until all vertices in current component have been processed
− then repeat for other components
(if there are any)

Again the edges traversed form a spanning tree (or forest)


− a breadth-first spanning tree (forest)
− spanning tree of a graph is tree composed of all the vertices and some
(or perhaps all) of the edges of the graph
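
For concreteness, a sketch of a bfs method for the Graph class of the
previous slides, reusing the visited and pred fields introduced for DFS;
a queue holds vertices that have been visited but not yet processed:

import java.util.LinkedList;
import java.util.Queue;

/** carry out a breadth first search/traversal of the graph */
public void bfs() {
    for (Vertex v : vertices) v.setVisited(false);  // initialise
    Queue<Vertex> queue = new LinkedList<>();
    for (Vertex v : vertices) {
        if (!v.getVisited()) {                      // start of a new component
            v.setVisited(true);
            v.setPred(-1);                          // root of this BF spanning tree
            queue.add(v);
            while (!queue.isEmpty()) {
                Vertex u = queue.remove();          // process u
                for (AdjListNode node : u.getAdjList()) {
                    int i = node.getVertexIndex();
                    if (!vertices[i].getVisited()) { // visit adjacent vertex
                        vertices[i].setVisited(true);
                        vertices[i].setPred(u.getIndex());
                        queue.add(vertices[i]);
                    }
                }
            }
        }
    }
}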

Breadth first traversal - Example
[figure: an undirected graph G and a breadth-first spanning tree of G on
vertices a-h]

Analysis – Breadth first search
Complexity
− each vertex is visited and queued exactly once
− each adjacency list is traversed once
− so overall O(n+m) (n is the number of vertices and m number of edges)
− can adapt to adjacency matrix representation but O(n^2) (as for DFS)

Example application
− finding the distance between two vertices, say v and w, in a graph
− the distance is the number of edges in the shortest path from v to w
− assign distance to v to be 0
− carry out a breadth-first search from v
− when visiting a new vertex for first time, assign its distance to be
1 + the distance to its predecessor in the BF spanning tree
− stop when w is reached

Distance between two vertices - Example
Distance between v and w
− assign distance to v to be 0
− carry out a breadth-first search from v
− when visiting a new vertex for first time
assign its distance to be 1 + the distance
to its predecessor in the BF spanning tree

[figure: an example graph; the number beside each vertex indicates its
distance from v (0, 1, 2, 3), and the shortest path from v to w has length 3]

Weighted graphs
Each edge e has an integer weight given by wt(e)>0
− graph may be undirected or directed
− weight may represent length, cost, capacity, etc
− if an edge is not part of the graph its weight is infinity

[figure: a weighted graph on vertices u,v,w,x,y,z with the edge weights
shown in the weight matrix on the next slide]

Example: cost of sending a message down a particular edge


− could be a monetary cost or some combination of time and distance
− can be used to formulate the shortest path problem for routing packets
Weighted graphs - Representation
Adjacency matrix becomes weight matrix
Adjacency lists include the weight in each node

[figure: the weighted graph from the previous slide]

adjacency matrix
adjacency list
u v w x y z
u 0 4 5 7 0 0 u:v(4)➝w(5)➝x(7)
v 4 0 5 6 0 0 v:u(4)➝w(5)➝x(6)
w 5 5 0 0 4 5 w:u(5)➝v(5)➝y(4)➝z(5)
x 7 6 0 0 8 6 x:u(7)➝v(6)➝y(8)➝z(6)
y 0 0 4 8 0 7 y:w(4)➝x(8)➝z(7)
z 0 0 5 6 7 0 z:w(5)➝x(6)➝y(7)

Weighted graphs - Shortest Paths
Given a weighted (un)directed graph and two vertices u and v
find a shortest path between u and v (for directed from u to v)
− where the length of a path is the sum of the weights of its edges

Example: weights are distances between airports


− shortest path between San Francisco and Miami

Applications include:
− flight reservations
− internet packet routing
− driving directions

Edsger Dijkstra, in an interview in 2010...
"… the algorithm for the shortest path, which I designed in
about 20 minutes. One morning I was shopping in Amsterdam
with my young fiancé, and tired, we sat down on the cafe
terrace to drink a cup of coffee, and I was just thinking about
whether I could do this, and I then designed the algorithm
for the shortest path."
Dijkstra, E.W. A note on two problems in Connexion with graphs.
Numerische Mathematik 1, 269–271 (1959)
Dijkstra describes the algorithm in English in 1956 (he was 26 years old)
− most people were programming in assembly language
− only one high-level language: Fortran by John Backus at IBM and not quite finished

No big O notation in 1959, in the paper, Dijkstra says: “my solution is preferred
to another one … the amount of work to be done seems considerably less.”

Dijkstra’s algorithm (as seen in NOSE2)
Algorithm finds shortest path between one vertex u and all others
− based on maintaining a set S containing all vertices for which the shortest
path from u is currently known
− S initially contains only u (obviously the shortest path from u to u has length 0)
− eventually S contains all the vertices (so all shortest paths are known)

Each vertex v has a label d(v) indicating the length of a shortest
path between u and v passing only through vertices in S
− if no such path exists then we set d(v) to infinity
− if v is in S, then d(v) is the length of the shortest path between u and v
− at each step we add to S the vertex v not in S such that d(v) is minimum
− after having added a vertex v to S, carry out edge relaxation operations
i.e. we update the length d(w) for all vertices w still not in S
• d(w) is the length of a shortest path between u and w passing only
through vertices in S
• and S has changed since we have added vertex v to S

Invariant of the algorithm: if v is in S and w is not, then the length of
the shortest path between u and w is at least that between u and v
− in particular, d(w) ≥ d(v) for the vertex v most recently added to S
Dijkstra’s algorithm – Edge relaxation
Each vertex v has a label d(v) indicating the length of a shortest
path between u and v passing only through vertices in S
− suppose v and w are not in S then we know
• the shortest path between u and v passing only through S equals d(v)
• the shortest path between u and w passing only through S equals d(w)
− now suppose v is added to S and the edge e = {v,w} has weight wt(e)
− calculate the shortest path between u and w passing only through S∪{v}

the shortest path is either:
− the original path through S, of length d(w)
− the path combining edge e and the shortest path between u and v,
which has length d(v) + wt(e)

therefore the length is updated to: d(w) = min{ d(w), d(v) + wt(e) }

Dijkstra’s algorithm – Pseudo code

// S is set of vertices for which shortest path with u is known


// d(w) represents length of a shortest path between u and w
// passing only through vertices of S

S = {u}; // initialise S
for (each vertex w) d(w) = wt(u,w); // initialise lengths

while (S != V){ // still vertices to add to S


find v not in S with d(v) minimum;
add v to S;
for (each w not in S and adjacent to v) // perform relaxation
d(w) = min{ d(w) , d(v)+wt(v,w) };
}
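
For concreteness, a runnable sketch of this pseudocode; an assumption of
this example is that the graph is given as an n×n weight matrix wt, with a
large INF value standing for ‘infinity’ when an edge is absent (this is the
unordered-array version analysed on the next slide):

public class Dijkstra {
    static final int INF = Integer.MAX_VALUE / 2;  // 'infinity', safe to add to

    /** lengths of shortest paths from u to every vertex */
    public static int[] shortestPaths(int[][] wt, int u) {
        int n = wt.length;
        boolean[] inS = new boolean[n];            // the set S
        int[] d = new int[n];
        for (int w = 0; w < n; w++) d[w] = wt[u][w]; // initialise lengths
        d[u] = 0;
        inS[u] = true;                             // S = {u}
        for (int step = 1; step < n; step++) {     // while S != V
            int v = -1;                            // find v not in S with d(v) minimum
            for (int w = 0; w < n; w++)
                if (!inS[w] && (v == -1 || d[w] < d[v])) v = w;
            inS[v] = true;                         // add v to S
            for (int w = 0; w < n; w++)            // perform relaxation
                if (!inS[w] && wt[v][w] < INF)
                    d[w] = Math.min(d[w], d[v] + wt[v][w]);
        }
        return d;
    }
}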

Dijkstra’s algorithm – Complexity
Analysis (n vertices and m edges) using unordered array for lengths
− O(n) to initialise lengths
− finding minimum is O(n^2) overall
• each time it takes O(n) and there are n-1 to find
− relaxation is O(m) overall
• each edge is considered once and updating a length takes O(1)
• note: we are not considering each iteration of the while loop but overall ops
hence O(n^2) overall (the number of edges is at most n(n-1))
Dijkstra’s algorithm – Complexity
Analysis (n vertices and m edges) using a heap for lengths
− O(n) to initialise lengths and create heap
− finding minimum is O(n log n) overall
• each time it takes O(log n) and there are n-1 to find
− relaxation is O(m log n) overall
• each edge is considered once and updating a length takes O(log n)
• note: this involves updating a specific value in the heap, not the root,
so care must be taken (need to keep track of positions of vertices in the heap)
hence O(m log n) overall (assuming at least as many edges as vertices)
− note: a graph with n vertices has O(n^2) edges
Spanning trees
Spanning tree:
− subgraph (subset of edges) which is both a tree and ‘spans’ every vertex
− a spanning tree is obtained from a connected graph by deleting edges
− the weight of a spanning tree is the sum of the weights of its edges

Problem: for a weighted connected undirected graph, find a


minimum weight spanning tree
− this represents the ‘cheapest’ way of interconnecting the vertices

Applications include:
− design of networks for computer, telecommunications, transportation,
gas, electricity, ...
− clustering, approximating the travelling salesman problem

Weighted graphs – Example – Spanning tree
Weighted graph G and two of its spanning trees
− a spanning tree is a subgraph which is both a tree and ‘spans’ every vertex
− it is obtained by deleting edges while the remaining edges still span the
vertices; when no more edges can be deleted we have a tree

[figure: the weighted graph G from earlier together with two of its
spanning trees, one of weight 28 and one of weight 24]

Minimum weight spanning tree problem
An example of a problem in combinatorial optimisation
− find ‘best’ way of doing something among a (large) number of candidates
− can always be solved, at least in theory, by exhaustive search
− however this may be infeasible in practice
− typically an exponential-time algorithm
− e.g. Kn (the clique of size n) has n^(n-2) spanning trees (Cayley’s formula)
• recall: a graph is a clique if every pair of vertices is joined by an edge

− a much more efficient algorithm may be possible


and is true in the case of minimum weight spanning trees

Minimum weight spanning tree problem
The Prim-Jarnik minimum spanning tree algorithm


− an example of a greedy algorithm
− it makes a sequence of decisions based on local optimality
− and ends up with the globally optimal solution

For many problems, greedy algorithms do not yield optimal solution


− see examples later in the course

The Prim-Jarnik algorithm
Min spanning tree is constructed by choosing a sequence of edges:
set an arbitrary vertex r to be a tree-vertex (tv);
set all other vertices to be non-tree-vertices (ntv);
while (number of ntv > 0){
find edge e = {p,q} of graph such that
p is a tv;
q is an ntv;
wt(e) is minimised over such edges;
adjoin edge e to the (spanning) tree;
make q a tv;
}
Analysis (n is the number of vertices)
− initialisation O(n) (n operations to set vertices to be tv or ntv)
− the outer loop is executed n-1 times
− the inner loop checks all edges from a tree-vertex to a non-tree-vertex
− there can be O(n^2) of these each time so overall the algorithm is O(n^3)

The Prim-Jarnik algorithm – Example

[figure: the weighted graph G from earlier and a minimum spanning tree for
G of weight 24, grown edge by edge from the starting vertex u]

Dijkstra’s refinement
Introduce an attribute bestTV for each non-tree vertex (ntv) q
− bestTV is set to the tree vertex (tv) p for which wt({p,q}) is minimised

set an arbitrary vertex r to be a tree-vertex (tv);


set all other vertices to be non-tree-vertices (ntv);
for (each ntv s) set s.bestTV = r; // r is the only tv

while (size of ntv > 0){


find ntv q for which wt({q, q.bestTV}) is minimal;
adjoin {q, q.bestTV} to the tree;
make q a tv;

for (each ntv s) update s.bestTV;
// update bestTV as tree vertices have changed
}
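
For concreteness, a runnable sketch of the refined algorithm; assumptions of
this example: the graph is a connected n×n symmetric weight matrix wt with
INF for missing edges, and only the total weight of the tree is returned
(recording the edges needs just a little more bookkeeping):

public class PrimJarnik {
    static final int INF = Integer.MAX_VALUE / 2;  // 'infinity'

    /** weight of a minimum spanning tree, O(n^2) bestTV version */
    public static int mstWeight(int[][] wt, int r) {
        int n = wt.length;
        boolean[] tv = new boolean[n];       // tree vertices
        int[] bestTV = new int[n];           // best tree vertex for each ntv
        tv[r] = true;                        // r is the only tv initially
        for (int s = 0; s < n; s++) bestTV[s] = r;
        int total = 0;
        for (int step = 1; step < n; step++) {
            int q = -1;                      // ntv q with wt({q, q.bestTV}) minimal
            for (int s = 0; s < n; s++)
                if (!tv[s] && (q == -1 || wt[s][bestTV[s]] < wt[q][bestTV[q]])) q = s;
            total += wt[q][bestTV[q]];       // adjoin edge {q, q.bestTV}
            tv[q] = true;                    // make q a tv
            for (int s = 0; s < n; s++)      // update bestTV: compare with new tv q
                if (!tv[s] && wt[s][q] < wt[s][bestTV[s]]) bestTV[s] = q;
        }
        return total;
    }
}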

Dijkstra’s refinement – Analysis
− initialisation is O(n)
− the while loop is executed n-1 times
− the first part takes O(n)
• O(n) to find the minimal ntv and O(1) to adjoin the edge and make q a tv
− the second part (inner loop) also takes O(n)
• for each ntv s we only need to compare the weights for s.bestTV and the
new tree vertex (i.e. q) to update the value of s.bestTV
− overall the algorithm is O(n^2)
Dijkstra’s refinement - Example

[figure: the weighted graph G and its minimum spanning tree of weight 24,
together with a table tracing, for each ntv q, the values of q.bestTV and
wt({q.bestTV, q}) as the algorithm runs]

The Prim-Jarnik algorithm – Correctness
Is the algorithm correct ?
− i.e. does it return a minimum weight spanning tree for any graph G

Proof will not be part of the exam

Proof:
− suppose for graph G the algorithm returns the tree T
− compare T with a minimum spanning tree X of G
− if they are the same we are happy (it is a minimum weight spanning tree)
− therefore remains to consider the case when they are different…

The Prim-Jarnik algorithm – Correctness
Suppose that T and X are different
− T is the tree returned by the algorithm and X a minimum spanning tree of G
− let e be the first edge chosen to be in T that is not in X
− adding e to X we get a cycle C (since X is a spanning tree)
− let S be the set of tree vertices (tvs) at the point when the algorithm
selected e; by definition of the algorithm one end of the edge e is in S
and the other is not in S
− it follows that C must have another edge f that connects a vertex in S
with one that is not, i.e. a tv with a ntv
− we also have wt(f) ≥ wt(e), since the algorithm picks e and not f
− we can replace f by e in X to get another spanning tree Y
− since wt(f) ≥ wt(e), the weight of Y cannot be greater than that of X,
and since X is minimal, Y is minimal
− continuing the process we can convert X to T maintaining minimality,
which proves that T is indeed a minimal spanning tree
− hence the algorithm is correct

[figures: T and X drawn side by side; adding e to X creates the cycle C,
which crosses the boundary between S and the remaining vertices and so
must contain another crossing edge f]

Directed Acyclic Graphs - Topological ordering
A Directed Acyclic Graph (DAG) is a directed graph with no cycles

A topological order on a DAG is a labelling of the vertices 1,…,n


such that (u,v)∈E implies label(u)<label(v)
− many applications, e.g. scheduling, PERT networks, deadlock detection

A directed graph D has a topological order if and only if it is a DAG


− obviously impossible if D has a cycle (try to label the vertices in a cycle)

A source is a vertex of in-degree 0 and a sink has out-degree 0

Basic fact: a DAG has at least one source and at least one sink
− forms the basis of a topological ordering algorithm

Directed Acyclic Graphs - Example
Directed acyclic graph D
− with more than one source and more than one sink

[figure: a DAG D with its source vertices (in-degree equals 0) and sink
vertices (out-degree equals 0) highlighted, and a topological ordering of D
labelling the vertices 1,…,9]

A topological order on a DAG is a labelling of the vertices 1,…,n
such that (u,v) ∈ E implies label(u) < label(v)

Topological ordering algorithm
// assume each vertex has 2 integer attributes: label and count
// count is the number of incoming edges from unlabelled vertices
// label will give the topological ordering

for (each vertex v) v.setCount(v.getInDegree()); // initial count values

Set up an empty sourceQueue

for (each vertex v) // add vertices with no incoming edges to the queue
if (v.getCount() == 0) add v to sourceQueue; // i.e. source vertices

int nextLabel = 1; // initialise labelling (gives topological ordering)


while (sourceQueue is non-empty){
dequeue v from sourceQueue;
v.setLabel(nextLabel++); // label vertex (and increment nextLabel)
for (each w with (v,w) ∈ E){ // consider each vertex w adjacent from v
w.setCount(w.getCount() - 1); // update attribute count
// add w to source queue if no incoming edges from unlabelled vertices remain
if (w.getCount() == 0) add w to sourceQueue;
}
}
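
For concreteness, a runnable sketch of this algorithm; an assumption of this
example is that the digraph is given as adjacency lists in an int[][] array,
where adj[v] lists the vertices adjacent from v. It returns the labels, or
null when some vertex cannot be labelled, i.e. the digraph contains a cycle
(see deadlock detection below):

import java.util.ArrayDeque;
import java.util.Queue;

public class TopologicalOrder {

    /** labels 1,...,n giving a topological order, or null if there is a cycle */
    public static int[] order(int[][] adj) {
        int n = adj.length;
        int[] count = new int[n];              // incoming edges from unlabelled vertices
        for (int[] list : adj)
            for (int w : list) count[w]++;     // initial counts are the in-degrees
        Queue<Integer> sourceQueue = new ArrayDeque<>();
        for (int v = 0; v < n; v++)
            if (count[v] == 0) sourceQueue.add(v);  // source vertices
        int[] label = new int[n];
        int nextLabel = 1;
        while (!sourceQueue.isEmpty()) {
            int v = sourceQueue.remove();
            label[v] = nextLabel++;            // label vertex
            for (int w : adj[v])               // each w adjacent from v
                if (--count[w] == 0) sourceQueue.add(w);
        }
        return (nextLabel == n + 1) ? label : null; // null: cycle detected
    }
}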
Topological ordering - Example

[figure: a DAG D on vertices r,s,t,u,v,w,x,y,z; the trace shows the source
queue as the algorithm runs, and a resulting topological ordering of D
labels the vertices 1,…,9]

Topological ordering algorithm - Correctness
A vertex is given a label only when the number of incoming edges
from unlabelled vertices is zero
− all predecessor vertices must already be labelled with smaller numbers
− dependent on using a queue (first in first out for labelling)

Analysis (n vertices, m edges)


• for adjacency matrix representation
− finding the in-degree of each vertex is O(n^2) (scan each column)
− main loop is executed n times and within it one row is scanned: O(n)
− so overall the algorithm is O(n^2)

Topological ordering algorithm - Analysis
Analysis (n vertices, m edges)
• for adjacency lists representation
− finding in-degree of each vertex is O(n+m) (scan adjacency lists)
− main loop is executed n times and within it one list is scanned
(and the same list is never scanned twice)
− so overall the algorithm is O(n+m)

Deadlock detection
Determining whether a digraph contains a cycle

Method 1 (an adaptation of the topological ordering algorithm)


− if the source queue becomes empty before all vertices are labelled,
then there must be a cycle
− if all vertices can be labelled, then the digraph is acyclic

Method 2 (an adaptation of depth-first-search)


− when a vertex u is ‘visited’ check whether there is an edge from u to a
vertex v which is on the current path from the current starting vertex
− the existence of such a vertex indicates a cycle
(adaptation of depth first search since need to ‘remember’ current path)
− see tutorials for more detail

Algorithmics I 2022

Algorithmics I

Section 3 – Strings and text algorithms

Dr. Gethin Norman


School of Computing Science
University of Glasgow
[email protected]



Text compression
A special case of data compression
− saves disk space and transmission time

Text compression must be lossless


− i.e. the original must be recoverable without error

original file ➝ compression algorithm ➝ compressed file
compressed file ➝ decompression algorithm ➝ original file (unchanged)

Some other forms of compression can afford to be lossy


− e.g. for pictures, sound, etc. (not considered here)

Text compression
Examples of text compression
− compress, gzip in Unix, ZIP utilities for Windows, …
− two main approaches: statistical and dictionary

Compression ratio: x/y


− x is the size of compressed file and y is the size of original file
− e.g. measured in B, KB, MB, …
− compressing a 10MB file to 2MB would yield a compression ratio of 2/10=0.2

Percentage space saved: (1 – “compression ratio”)×100%


− space saved expressed as a percentage of the original file size
− compressing a 10MB file to 2MB yields a percentage space savings of 80%

Space savings in the range 40% - 60% are typical


− obviously the higher the saving the better the compression
Text compression – Huffman encoding
The classical statistical method
− now mostly superseded in practice by more effective dictionary methods
− fixed (ASCII) code replaced by variable length code for each character
− every character is represented by a unique codeword (bit string)
− frequently occurring characters are represented by shorter codewords

The code has the prefix property


− no codeword is a prefix of another (gives unambiguous decompression)

Based on a Huffman tree (a proper binary tree)


− each character is represented by a leaf node
− codeword for a character is given by the path from the root to the
appropriate leaf (left=0 and right=1)
− the prefix property follows from this

Huffman tree construction - Example
Space E A T I S R O N U H C D
Character frequencies: 15 11 9 8 7 7 7 6 4 3 2 1 1

Next, while there is more than one parentless node


− add new parent to nodes of smallest weight
− weight of new node equals the sum of the weights of the child nodes

[Huffman tree diagram: root 81 = 49 + 32; 49 = 28 + 21; 32 = Space(15) + 17;
28 = 14 + 14; 21 = E(11) + 10; 17 = T(8) + A(9); 14 = I(7) + S(7);
14 = R(7) + 7; 10 = O(6) + N(4); 7 = 4 + U(3); 4 = 2 + H(2); 2 = C(1) + D(1)]

Algorithmics I, 2022 5
3 Strings and text algorithms 105
Huffman tree construction - Pseudocode
// set up the leaf nodes
for (each distinct character c occurring in the text){
make a new parentless node n;
int f = frequency count for c;
n.setWeight(f); // weight equals the frequency
n.setCharacter(c); // set character value
// leaf so no children
n.setLeftChild(null);
n.setRightChild(null);
}
// construct the branch nodes and links
while (no. of parentless nodes > 1){
make a new parentless node z; // new node
x, y = 2 parentless nodes of minimum weight; // its children
z.setLeftChild(x); // set x to be the left child of new node
z.setRightChild(y); // set y to be the right child of new node
int w = x.getWeight()+y.getWeight(); // calculate weight of node
z.setWeight(w); // set the weight of the new node
}
// the final node z is root of Huffman tree

Algorithmics I, 2022 6
3 Strings and text algorithms 106
Huffman code - Example
Space E A T I S R O N U H C D
Character frequencies: 15 11 9 8 7 7 7 6 4 3 2 1 1

Huffman tree: [as constructed on the previous slide]

Huffman code:
Space 10    E 010     A 111      T 110
I 0000      S 0001    R 0011     O 0110     N 0111
U 00101     H 001001  C 0010000  D 0010001
Algorithmics I, 2022 7
3 Strings and text algorithms 107
Huffman code - Example
Space E A T I S R O N U H C D
Character frequencies: 15 11 9 8 7 7 7 6 4 3 2 1 1

Huffman tree: [as above]

prefix property: no codeword is a prefix of another
equivalently: no path to one character is a prefix of
another (since characters are only found at leaves)
Algorithmics I, 2022 8
3 Strings and text algorithms 108
Huffman encoding - Optimality
Weighted path length (WPL) of a tree T
− ∑ (weight)×(distance from root) where sum is over all leaf nodes
− for the example tree: WPL equals: 7×4 + 7×4 + 1×7 + 1×7 + 2×6 +
3×5 + 7×4 + 11×3 + 6×4 + 4×4 + 15×2 + 8×3 + 9×3 = 279

[Huffman tree as above; leaf depths: Space 2; E, T, A 3; I, S, R, O, N 4; U 5; H 6; C, D 7]

Algorithmics I, 2022 9
3 Strings and text algorithms 109
Huffman encoding - Optimality
Weighted path length (WPL) of a tree T
− ∑ (weight)×(distance from root) where sum is over all leaf nodes
− for the example tree: WPL equals: 7×4 + 7×4 + 1×7 + 1×7 + 2×6 +
3×5 + 7×4 + 11×3 + 6×4 + 4×4 + 15×2 + 8×3 + 9×3 = 279

Huffman tree has minimum WPL over all binary trees with the
given leaf weights
− Huffman tree need not be unique (e.g. when more than two nodes have minimum weight)
− however all Huffman trees for a given set of frequencies have same WPL
− so what?
− weighted path length (WPL) is the number of bits in compressed file
• bits = sum over chars (frequency of char × code length of char)
− so a Huffman tree minimises this number
− hence Huffman coding is optimal among all codes built in this way (prefix codes)
Algorithmics I, 2022 10
3 Strings and text algorithms 110
Huffman encoding – Algorithmic requirements
Building the Huffman tree
− if the text length equals n and there are m distinct chars in text
− O(n) time to find the frequencies
− O(m log m) time to construct the code, for example using a (min-)heap
to store the parentless nodes and their weights
• initially build a heap where nodes correspond to the m characters labelled
by their frequencies, therefore takes O(m) time to build the heap
• one iteration takes O(log m) time:
• find and remove (O(log m)) two minimum weights
• then insert (O(log m)) new weight (sum of minimum weights found)
• and there are m-1 iterations before the heap is empty
• each iteration decreases the size of the heap by 1
− so O(n + m log m) overall
− in fact, m is essentially a constant, so it is really O(n)
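A possible Java sketch of the heap-based construction using java.util.PriorityQueue; the Node class is an assumption matching the earlier pseudocode, not the notes' own code:

import java.util.*;

class Node {                                     // assumed node type
    int weight; char character; Node left, right;
    Node(int w, char c) { weight = w; character = c; }   // leaf node
    Node(Node x, Node y) {                                // branch node
        weight = x.weight + y.weight; left = x; right = y;
    }
}

/** builds the Huffman tree from the m character frequencies in O(m log m) */
static Node buildHuffmanTree(Map<Character, Integer> freq) {
    PriorityQueue<Node> heap =                   // min-heap ordered by weight
        new PriorityQueue<>((a, b) -> a.weight - b.weight);
    for (Map.Entry<Character, Integer> e : freq.entrySet())
        heap.add(new Node(e.getValue(), e.getKey()));     // one leaf per character
    while (heap.size() > 1) {                    // m-1 iterations
        Node x = heap.remove();                  // two parentless nodes of minimum
        Node y = heap.remove();                  // weight: O(log m) each
        heap.add(new Node(x, y));                // new parent, weight = sum: O(log m)
    }
    return heap.remove();                        // root of the Huffman tree
}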

Algorithmics I, 2022 11
3 Strings and text algorithms 111
Huffman encoding – Algorithmic requirements
Compression & decompression are both O(n) time
− assuming m is constant

Compression uses a code table (an array of codes, indexed by char)


− O(m log m) to build the table:
• m characters so m paths of length O(log m)
− O(n) to compress: n characters in the text so n lookups in the array, each O(1)
− so O(m log m) + O(n) overall

Decompression uses the tree directly (repeatedly trace paths in tree)


− O(n log m) as n characters so n paths of length O(log m)
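A possible sketch of decompression reusing the Node type above; representing the compressed file as an array of bits is a simplifying assumption:

/** decodes a bit sequence by repeatedly tracing paths in the Huffman tree */
static String decompress(Node root, boolean[] bits) {
    StringBuilder out = new StringBuilder();
    Node node = root;
    for (boolean bit : bits) {
        node = bit ? node.right : node.left;     // left = 0, right = 1
        if (node.left == null && node.right == null) {
            out.append(node.character);          // reached a leaf: one character decoded
            node = root;                         // restart at the root
        }
    }
    return out.toString();
}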

Algorithmics I, 2022 12
3 Strings and text algorithms 112
Huffman encoding – Algorithmic requirements
Problem: some representation of the Huffman tree must be stored
with the compressed file
− otherwise decompression would be impossible

Alternatives
− use a fixed set of frequencies based on typical values for text
• but this will usually reduce the compression ratio
− use adaptive Huffman coding: the (same) tree is built and adapted by the
compressor and by the decompressor as characters are encoded/decoded
• this slows down compression and decompression (but not by much if
done in a clever way)

Algorithmics I, 2022 13
3 Strings and text algorithms 113
LZW compression
A popular dictionary-based method
− the basis of compress and gzip in Unix also used in gif and tiff formats
− due to Lempel, Ziv and Welch
− algorithm was patented by Unisys (but the patent has now expired)

The dictionary is a collection of strings


− each with a codeword that represents it
− the codeword is a bit pattern
− but it can be interpreted as a non-negative integer

Whenever a codeword is output during compression, what is


written to the compressed file is the bit pattern
− using a number of bits determined by the current codeword length
− so at any point all bit patterns are the same length

Algorithmics I, 2022 14
3 Strings and text algorithms 114
LZW compression
The dictionary is built dynamically during compression
− and also during decompression

Initially dictionary contains all possible strings of length 1

Throughout the dictionary is closed under prefixes


− i.e. if the string s is represented in the dictionary, so is every prefix of s

It follows that a trie is an ideal representation of the dictionary


− every node in the trie represents a 'word' in the dictionary
− a trie is effective and efficient for other reasons too

Algorithmics I, 2022 15
3 Strings and text algorithms 115
LZW compression
Key question: how many bits are in a codeword?
− in the most used version of the algorithm, this value changes as the
compression (or decompression) algorithm proceeds

At any given time during compression (or decompression)


− there is a current codeword length k
− so there are exactly 2k distinct codewords available
• i.e. all possible bit-strings of length k
− this limits the size of the dictionary
− however the codeword length can be incremented when necessary
− thereby doubling the number of available codewords
− initial value of k should be large enough to encode all strings of length 1

Algorithmics I, 2022 16
3 Strings and text algorithms 116
LZW compression – Pseudo code
set current text position i to 0;
initialise codeword length k (say to 8);
initialise the dictionary d;

while (the text t is not exhausted) {

identify the longest string s, starting at position i of text t


that is represented in the dictionary d;
// there is such a string, as all strings of length 1 are in d

output codeword for the string s; // using k bits

// move to the next position in the text


i += s.length(); // move forward by the length of string just encoded
c = character at position i in t; // character in next position

add string s+c to dictionary d, paired with next available codeword;


// may have to increment the codeword length k to make this possible
}
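A possible Java sketch of compression; as simplifying assumptions (not the notes' implementation) it uses a Map from strings to integer codewords instead of a trie, assumes an ASCII alphabet, and ignores the bit-level packing of codewords:

import java.util.*;

/** returns the sequence of codewords for text t (codewords as plain integers) */
static List<Integer> lzwCompress(String t) {
    Map<String, Integer> d = new HashMap<>();
    int nextCode = 0;
    for (char c = 0; c < 128; c++)               // all strings of length 1
        d.put(String.valueOf(c), nextCode++);
    List<Integer> out = new ArrayList<>();
    int i = 0;
    while (i < t.length()) {
        int j = i + 1;                           // grow the match while it stays in d
        while (j <= t.length() && d.containsKey(t.substring(i, j))) j++;
        String s = t.substring(i, j - 1);        // longest string in the dictionary
        out.add(d.get(s));                       // output its codeword
        i += s.length();                         // move past s
        if (i < t.length())                      // if the text is not exhausted,
            d.put(s + t.charAt(i), nextCode++);  // add s + next character
    }
    return out;
}

The repeated substring lookups make this sketch quadratic in the worst case; a trie-based dictionary avoids this by extending the current match one character at a time.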

Algorithmics I, 2022 17
3 Strings and text algorithms 117
LZW compression - Variants
Constant codeword length: fix the codeword length for all time
− the dictionary has fixed capacity: when full, just stop adding to it

Dynamic codeword length (the version described here)


− start with shortest reasonable codeword length, say, 8 for normal text
− whenever dictionary becomes full
• add 1 to current codeword length (doubles the number of codewords)
• does not affect the sequence of codewords already output
− may specify a maximum codeword length, as increasing the size
indefinitely may become counter-productive

LRU version: when dictionary full and codeword length maximal


− current string replaces Least Recently Used string in dictionary

Algorithmics I, 2022 18
3 Strings and text algorithms 118
LZW compression - Example
Text = G A C G A T A C G A T A C G
File size = 14 bytes, or 28 bits if 2 bits/char

Compressed file: 10 000 001 100 011 0101 0111 1001


file size = 26 bits

step   position   longest string   output   add to       code
       in text    in dictionary    bits     dictionary
1 1 G 10 GA 4
2 2 A 000 AC 5
3 3 C 001 CG 6
4 4 GA 100 GAT 7
5 6 T 011 TA 8
6 7 AC 0101 ACG 9
7 9 GAT 0111 GATA 10
8 12 ACG 1001 - -

Algorithmics I, 2022 19
3 Strings and text algorithms 119
LZW decompression
Decompression algorithm builds same dictionary as compression
algorithm
− but one step out of phase

Algorithmics I, 2022 20
3 Strings and text algorithms 120
LZW decompression – Pseudo code
initialise codeword length k;
initialise the dictionary;

read the first codeword x from the compressed file f; // i.e. read k bits
String s = d.lookUp(x); // look up codeword in dictionary
output s; // output decompressed string

while (f is not exhausted){

String oldS = s; // remember last string decompressed (Strings are immutable)

if (d is full) k++; // dictionary full so increase the code word length

get next codeword x from f; // i.e. read k bits


s = d.lookUp(x); // look up codeword in dictionary
output s; // output decompressed string

String newS = oldS + s.charAt(0); // string to add to dictionary


add string newS to dictionary d paired with next available codeword;
}

Algorithmics I, 2022 21
3 Strings and text algorithms 121
LZW decompression - Example
Compressed file: 10000001100011010101111001
file size = 26 bits

Uncompressed Text = G A C G A T A C G A T A C G

step   position   old      code from   string from   add to       code
       in file    string   file        dictionary    dictionary
0 1 - 10 G - -
1 3 G 000 A GA 4
2 6 A 001 C AC 5
3 9 C 100 GA CG 6
4 12 GA 011 T GAT 7
5 15 T 0101 AC TA 8
6 19 AC 0111 GAT ACG 9
7 23 GAT 1001 ACG GATA 10

Algorithmics I, 2022 22
3 Strings and text algorithms 122
LZW decompression – Special case
It is possible to encounter a codeword that is not (yet) in the
dictionary
− because decompression is ‘out of phase’ with compression
− but in that case it is possible to deduce what string it must represent
− consider: A A B A B A B A A
and work through compression and decompression for this text

The solution: if (lookUp fails) s = oldS + oldS.charAt(0);

Example of this special case is available on moodle
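A possible Java sketch of decompression including the special case, under the same simplifying assumptions as the compression sketch above (codewords as plain ints, bit-level details ignored); tracing it on the codewords produced for A A B A B A B A A reproduces that text:

import java.util.*;

/** rebuilds the text from the codeword sequence, one step out of phase */
static String lzwDecompress(List<Integer> codes) {
    List<String> d = new ArrayList<>();          // codeword = index into this list
    for (char c = 0; c < 128; c++) d.add(String.valueOf(c));
    StringBuilder out = new StringBuilder();
    String s = d.get(codes.get(0));              // first codeword: a length-1 string
    out.append(s);
    for (int k = 1; k < codes.size(); k++) {
        String oldS = s;                         // last string decompressed
        int x = codes.get(k);
        if (x < d.size()) s = d.get(x);          // codeword already in the dictionary
        else s = oldS + oldS.charAt(0);          // special case: deduce the string
        out.append(s);
        d.add(oldS + s.charAt(0));               // same entry the compressor added
    }
    return out.toString();
}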

Algorithmics I, 2022 23
3 Strings and text algorithms 123
LZW decompression
Appropriate data structure for decompression is a simple table

Complexity of compression and decompression both O(n)


− for a text of length n (if suitably implemented)
− algorithms essentially involves just one pass through the text

Algorithmics I, 2022 24
3 Strings and text algorithms 124
Strings - Notation
For a string s=s0s1…sm-1
− m is the length of the string
− s[i] is the (i+1)th element of the string, i.e. si
− s[i..j] is the substring from the ith to jth position, i.e. sisi+1…sj

Prefixes and suffixes


− jth prefix is the first j characters of s denoted s[0..j-1]
• i.e. s[0..j-1] = s0s1…sj-1
• s[0..0-1]=s[0..-1] (the 0th prefix) is the empty string
− jth suffix is the last j characters of s denoted s[m-j..m-1]
• i.e. s[m-j..m-1] = sm-jsm-j+1…sm-1
• s[m..m-1] (the 0th suffix) is the empty string

Algorithmics I, 2022 25
3 Strings and text algorithms 125
String comparison
Fundamental question: how similar, or how different, are 2 strings?
− applications include:
• biology (DNA and protein sequences)
• file comparison (diff in Unix, and other similar file utilities)
• spelling correction, speech recognition,…

A more precise formulation:


given strings s=s0s1…sm-1 and t=t0t1…tn-1 of lengths m and n,
what is the smallest number of basic operations needed to transform s to t?

‘Basic’ operations for transforming strings:


− insert a single character
− delete of a single character
− substitute one character by another

Algorithmics I, 2022 26
3 Strings and text algorithms 126
String comparison – String distance
The distance between s and t is defined to be the smallest
number of basic operations needed to transform s to t
− for example consider the strings s and t

s: a b a d c d b
t: a c b a c a c b

− we can show an alignment between s and t that illustrates how 4 steps


would suffice to transform s into t
− hence the distance between s and t is less than or equal to 4

insert ‘c’ delete ‘d’ substitute ‘a’ for ‘d’ insert ‘c’

s: a - b a d c d - b
t: a c b a - c a c b

Algorithmics I, 2022 27
3 Strings and text algorithms 127
String comparison – String distance
The distance between s and t is defined to be the smallest
number of basic operations needed to transform s into t
− for example for the strings

s: a b a d c d b
t: a c b a c a c b

the distance between s and t is less than or equal to 4

s: a - b a d c d - b
t: a c b a - c a c b

But could it be done in 3 steps?


− the answer is no, proof later based on our algorithm to find the
distance for any two strings, so above alignment is an optimal alignment

Algorithmics I, 2022 28
3 Strings and text algorithms 128
String comparison – String distance
More complex models are possible
− e.g. we can allocate a cost to each basic operation
− our methods adapt easily but we will stick to the unit-cost model

String comparison algorithms use dynamic programming


− the problem is solved by building up solutions to sub-problems of ever
increasing size
− often called the tabular method (it builds up a table of relevant values)
− eventually, one of the values in the table gives the required answer

The dynamic programming technique has applications to many


different problems

Algorithmics I, 2022 29
3 Strings and text algorithms 129
String distance – Dynamic programming
Recall the ith prefix of string s is the first i characters of s
− let d(i,j) be the distance between ith prefix of s and the jth prefix of t
− distance between s and t is then d(m,n)
(since s and t of lengths m and n)

The basis of dynamic programming method is a recurrence relation


− more precisely we define the distance d(i,j) between ith prefix of s and
the jth prefix of t in terms of the distance between shorter prefixes
• i.e. in terms of the distances d(i-1,j-1), d(i,j-1) and d(i-1,j)

− in the base cases we set d(i,0)=i and d(0,j)=j for all i≤m and j≤n
− since the distance from/to an empty string to/from a string of length k
is equal to k (we require k insertions/deletions)

Algorithmics I, 2022 30
3 Strings and text algorithms 130
String distance – Dynamic programming
In an optimal alignment of the ith prefix of s with the jth prefix of t
the last position of the alignment must either be of the form:

s:  *                          -   *   *
t:  *   if s[i-1] = t[j-1],    *   -   $   otherwise

where - is a gap, while * and $ are arbitrary but different characters

In this case, no operations are required and the distance is given by


that between the i-1th and j-1th prefixes of s and t

d(i-1,j-1) if s[i-1]=t[j-1]
d(i,j) =
otherwise

Algorithmics I, 2022 31
3 Strings and text algorithms 131
String distance – Dynamic programming
In an optimal alignment of the ith prefix of s with the jth prefix of t
the last position of the alignment must either be of the form:

s:  *                          -   *   *
t:  *   if s[i-1] = t[j-1],    *   -   $   otherwise

where - is a gap, while * and $ are arbitrary but different characters

In this case, insert an element into s and the distance is given by 1 (for the
insertion) plus the distance between the ith prefix of s and the j-1th prefix of t

d(i-1,j-1) if s[i-1]=t[j-1]
d(i,j) =
1 + min{ d(i,j−1) } otherwise

Algorithmics I, 2022 32
3 Strings and text algorithms 132
String distance – Dynamic programming
In an optimal alignment of the ith prefix of s with the jth prefix of t
the last position of the alignment must either be of the form:

s:  *                          -   *   *
t:  *   if s[i-1] = t[j-1],    *   -   $   otherwise

where - is a gap, while * and $ are arbitrary but different characters

In this case, delete an element from s and distance given by 1 plus


distance between the i-1th prefix of s and the jth prefix of t

d(i-1,j-1) if s[i-1]=t[j-1]
d(i,j) =
1 + min{ d(i,j−1), d(i−1,j), } otherwise

Algorithmics I, 2022 33
3 Strings and text algorithms 133
String distance – Dynamic programming
In an optimal alignment of the ith prefix of s with the jth prefix of t
the last position of the alignment must either be of the form:

s:  *                          -   *   *
t:  *   if s[i-1] = t[j-1],    *   -   $   otherwise

where - is a gap, while * and $ are arbitrary but different characters

In this case, substitute an element in s and distance given by 1


plus the distance between the i-1th prefix of s and the j-1th prefix of t

d(i-1,j-1) if s[i-1]=t[j-1]
d(i,j) =
1 + min{ d(i,j−1), d(i−1,j), d(i−1,j−1) } otherwise

Algorithmics I, 2022 34
3 Strings and text algorithms 134
String distance – Dynamic programming
In an optimal alignment of the ith prefix of s with the jth prefix of t
the last position of the alignment must either be of the form:

s:  *                          -   *   *
t:  *   if s[i-1] = t[j-1],    *   -   $   otherwise

where - is a gap, while * and $ are arbitrary but different characters

We take the minimum when s[i-1]≠t[j-1] as we want the optimal


(minimal) distance

d(i-1,j-1) if s[i-1]=t[j-1]
d(i,j) =
1 + min{ d(i,j−1), d(i−1,j), d(i−1,j−1) } otherwise

Algorithmics I, 2022 35
3 Strings and text algorithms 135
String distance – Dynamic programming
The complete recurrence relation is given by:

d(i-1,j-1) if s[i-1]=t[j-1]
d(i,j) =
1+min{ d(i,j−1),d(i−1,j),d(i−1,j−1)} otherwise

subject to d(i,0)=i and d(0,j)=j for all i≤m and j≤n
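A possible Java sketch of the tabular method that follows directly from the recurrence (illustrative, not from the notes):

/** returns the distance between strings s and t using the recurrence above */
static int distance(String s, String t) {
    int m = s.length(), n = t.length();
    int[][] d = new int[m + 1][n + 1];
    for (int i = 0; i <= m; i++) d[i][0] = i;    // base cases: distance to/from
    for (int j = 0; j <= n; j++) d[0][j] = j;    // the empty string
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            if (s.charAt(i - 1) == t.charAt(j - 1))
                d[i][j] = d[i - 1][j - 1];       // match: no operation needed
            else                                 // insertion, deletion or substitution
                d[i][j] = 1 + Math.min(d[i][j - 1],
                          Math.min(d[i - 1][j], d[i - 1][j - 1]));
    return d[m][n];
}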

Algorithmics I, 2022 36
3 Strings and text algorithms 136
String distance – Dynamic programming

The dynamic programming algorithm for string distance comes


immediately from the formula
− fill in the entries of an (m+1)×(n+1) table row by row, and column by column

Time and space complexity both O(mn)


− a consequence of the size of the table
− can easily reduce the space complexity to O(m+n)
− just keep the most recent entry in each column of the table

But what about obtaining an optimal alignment?


− can use a ‘traceback’ in the table (see example below)
− less obvious how this can be done using only O(m+n) space
− but in fact it turns out that it's still possible (Hirschberg's algorithm)

Algorithmics I, 2022 37
3 Strings and text algorithms 137
String distance - Example
s\t 0 1 2 3 4 5 6 7 8
a c b a c a c b
0 0 1 2 3 4 5 6 7 8
1 a 1 0 1 2 3 4 5 6 7
2 b 2 1 1 1 2 3 4 5 6
3 a 3 2 2 2 1 2 3 4 5
4 d 4 3 3 3 2 2 3 4 5
5 c 5 4 3 4 3 2 3 3 4
6 d 6 5 4 4 4 3 3 4 4
7 b 7 6 5 4 5 4 4 4 4

The entries are calculated one by one by application of the formula


− the final table: d(7,8)=4 so the string distance is 4

Algorithmics I, 2022 38
3 Strings and text algorithms 138
String distance – Dynamic programming
The traceback phase used to construct an optimal alignment
− trace a path in the table from bottom right to top left
− draw an arrow from an entry to the entry that led to its value

Interpretation
− vertical steps as deletions
− horizontal steps as insertions
− diagonal steps as matches or substitutions
• a match if the distance does not change and a substitution otherwise

The traceback is not necessarily unique


− since there can be more than one optimal alignment

Algorithmics I, 2022 39
3 Strings and text algorithms 139
String distance – Example (traceback)
s\t 0 1 2 3 4 5 6 7 8
a c b a c a c b
0 0 1 2 3 4 5 6 7 8
1 a 1 0 1 2 3 4 5 6 7
2 b 2 1 1 1 2 3 4 5 6
3 a 3 2 2 2 1 2 3 4 5
4 d 4 3 3 3 2 2 3 4 5
5 c 5 4 3 4 3 2 3 3 4
6 d 6 5 4 4 4 3 3 4 4
7 b 7 6 5 4 5 4 4 4 4

s: a - b a d - c d b
t: a c b a c a c - b
Corresponding alignment: step: d h d d d h d v d
(d=diagonal, v = vertical, h = horizontal)
Algorithmics I, 2022 40
3 Strings and text algorithms 140
String/pattern search
Searching a (long) text for a (short) string/pattern
− many applications including
• information retrieval
• text editing
• computational biology

Many variants, such as exact or approximate matches


− first occurrence or all occurrences
− one text and many strings/patterns
− many texts and one string/pattern

We describe three different solutions to the basic problem:


− given a text t (of length n) and a string/pattern s (of length m)
− find the position of the first occurrence (if it exists) of s in t
− usually n is large and m is small
Algorithmics I, 2022 41
3 Strings and text algorithms 141
String search – Brute force algorithm
Given a text t (of length n) and a string/pattern s (of length m) find
the position of the first occurrence (if any) of s in t

The naive brute force algorithm


− also known as exhaustive search (as we simply test all possible positions)
− set the current starting position in the text to be zero
− compare text and string characters left-to-right until the entire string is
matched or a character mismatches
− in the case of a mismatch
advance the starting position in the text by 1 and repeat
− continue until a match is found or the text is exhausted

Algorithms expressed with char arrays rather than strings in Java

Algorithmics I, 2022 42
3 Strings and text algorithms 142
String search – Brute force algorithm
/** return smallest k such that s occurs in t starting at position k */
public int bruteForce (char[] s, char[] t){
int m = s.length; // length of string/pattern
int n = t.length; // length of text
int sp = 0; // starting position in text t
int i = 0; // curr position in text
int j = 0; // curr position in string/pattern s
while (sp <= n-m && j < m) { // not reached end of text/string
if (t[i] == s[j]){ // chars match
i++; // move on in text
j++; // move on in string/pattern
} else { // a mismatch
j = 0; // start again in string
sp++; // advance starting position
i = sp; // back up in text to new starting position
}
}
if (j == m) return sp; // occurrence found (reached end of string)
else return -1; // no occurrence (reached end of text)
}

Algorithmics I, 2022 43
3 Strings and text algorithms 143
String search – Brute force algorithm
Worst case is no better than O(mn)
− e.g. search for s = aa … ab in t = aa ... aaaa … ab
length m length n

− m character comparisons needed at each of the n-m+1 starting positions in the text


before the text/pattern is found

Typically, the number of comparisons from each point will be small


− often just 1 comparison needed to show a mismatch
− so we can expect O(n) on average

Challenges: can we find a solution that is…


1. linear, i.e. O(m+n) in the worst case?
2. (much) faster than brute force on average?

Algorithmics I, 2022 44
3 Strings and text algorithms 144
String search – KMP algorithm
The Knuth-Morris-Pratt (KMP) algorithm
− addresses first challenge: linear (O(m+n)) in the worst case

It is an on-line algorithm
− i.e., it removes the need to back-up in the text
− involves pre-processing the string to build a border table
− border table: an array b with entry b[j] for each position j of the string

If we get a mismatch at position j in the string/pattern


− we remain on the current text character (do not back-up)
− the border table tells us which string character should next be compared
with the current text character

Algorithmics I, 2022 45
3 Strings and text algorithms 145
String search – KMP algorithm
A substring of string s is a sequence of consecutive characters of s
− if s has length n, then s[i..j] is a substring for i and j with 0≤i≤j≤n-1

A prefix of s is a substring that begins at position 0


− i.e. s[0..j] for any j with 0≤j≤n-1

A suffix of s is a substring that ends at position n-1


− i.e. s[i..n-1] for any i with 0≤i≤n-1

A border of a string s is a substring that is both a prefix and a suffix


and cannot be the string itself
− e.g. s = a c a c g a t a c a c
− a c and a c a c are borders and a c a c is the longest border
Many strings have no border
− we then say that the empty string ε (of length 0) is the longest border

Algorithmics I, 2022 46
3 Strings and text algorithms 146
String search – Border table
KMP algorithm requires the border table of the string pattern
− a border of a string s is a substring that is both a prefix and a suffix and
cannot be the string itself
Border table b: array which has the same size as the string
− b[j] = the length of the longest border of s[0..j-1]
= max { k | s[0..k-1] = s[j-k..j-1] ∧ k<j }

Example
string/pattern s a b a b a c a

j 0 1 2 3 4 5 6

b[j] 0 0 0 1 2 3 0

− no common prefix/suffix of ababac so set to 0
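A possible Java sketch of computing the border table in O(m) time; this is the standard method whose details the notes omit later, so treat it as an illustration:

/** b[j] = length of the longest border of s[0..j-1] */
static int[] borderTable(char[] s) {
    int m = s.length;
    int[] b = new int[m];                        // b[0] = b[1] = 0 by default
    for (int j = 2; j < m; j++) {
        int k = b[j - 1];                        // longest border of s[0..j-2]
        while (k > 0 && s[k] != s[j - 1]) k = b[k]; // fall back to shorter borders
        b[j] = (s[k] == s[j - 1]) ? k + 1 : 0;   // extend the border if possible
    }
    return b;
}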

Algorithmics I, 2022 47
3 Strings and text algorithms 147
String search – Brute force versus KMP
Example - Mismatch between s and t at position 9 in s
jnew j

0 1 2 3 4 5 6 7 8 9 10 11 12 13
string/pattern s a g a g c a g a g a g c a g
text t a g a g c a g a g t * * * * …

inew i
Applying the brute force algorithm, after the mis-match:
− s has to be ‘moved along’ one position relative to t
− then we start again at position 0 in s and jump back j-1 positions in t

Algorithmics I, 2022 48
3 Strings and text algorithms 148
String search – Brute force versus KMP
Example - Mismatch between s and t at position 9 in s
j

0 1 2 3 4 5 6 7 8 9 10 11 12 13
string/pattern s a g a g c a g a g a g c a g
text t a g a g c a g a g t * * * * …

i
Applying the KMP algorithm, after the mis-match:
− s has to be ‘moved along’ until the characters to the left of i again match

Algorithmics I, 2022 49
3 Strings and text algorithms 149
String search – Brute force versus KMP
mis-match
j
string/pattern s s[0..j-1] $ …

text t … s[0..j-1] * …
i
Need to move s along until the characters to the left of i match
therefore need start of s[0..j-1] to match end of s[0..j-1]
− therefore use longest border of s[0..j-1]
− i.e. longest substring that is both a prefix and a suffix of s[0..j-1]

string/pattern s $ …

text t … * …

Algorithmics I, 2022 50
3 Strings and text algorithms 150
String search – Brute force versus KMP
Example - Mismatch between s and t at position 9 in s
jnew j

0 1 2 3 4 5 6 7 8 9 10 11 12 13
string/pattern s a g a g c a g a g a g c a g
text t a g a g c a g a g * * * * * …

i
Applying the KMP algorithm, after the mis-match:
− s has to be ‘moved along’ until the characters to the left of i again match
− this determines the new value of j, the value of i is unchanged
− length of the longest border of s[0..j-1] is 4 in this case
• i.e. longest substring that is both a prefix and a suffix of s[0..j-1]
− so the new value of j is 4

Algorithmics I, 2022 51
3 Strings and text algorithms 151
String search – Brute force versus KMP
Example - Mismatch between s and t at position 9 in s
jnew j

0 1 2 3 4 5 6 7 8 9 10 11
string/pattern s t g a g c a g a g a g c
text t t g a g c a g a g t * * * * …

i
Applying the KMP algorithm, after the mis-match:
− s has to be ‘moved along’ until the characters to the left of i again match

If we cannot move s along to get a match, then we need to


− reset j (i.e. return to the start of the string) and i remains unchanged

Algorithmics I, 2022 52
3 Strings and text algorithms 152
String search – Brute force versus KMP
Example - Mismatch between s and t at position 0 in s
j

0 1 2 3 4 5 6 7 8 9 10 11 12 13
string/pattern s t g a g c a g a g a g c a g
text t a g a g c a g a g t * * * * …

i inew
Applying the KMP algorithm, after the mis-match:
− s has to be ‘moved along’ until the characters to the left of i again match

If we cannot move s along to get a match, then we need to


− reset j (i.e. return to the start of the string) and i remains unchanged
− unless j is already 0 and in this case increment i

Algorithmics I, 2022 53
3 Strings and text algorithms 153
KMP search - Implementation
/** return smallest k such that s occurs from position k in t or -1 if no k exists */
public int kmp(char[] t, char[] s) {
int m = s.length; // length of string/pattern
int n = t.length; // length of text
int i = 0; // current position in text
int j = 0; // current position in string s
int [] b = new int[m]; // create border table
setUp(s, b); // set up the border table from the string/pattern
while (i < n) { // not reached end of text
if (t[i] == s[j]){ // if positions match
i++; // move on in text
j++; // move on in string
if (j == m) return i - j; // reached end of string so a match
} else { // mismatch adjust current position in string using the border table
if (b[j] > 0) // there is a common prefix/suffix
j = b[j]; // change position in string (position in text unchanged)
else { // no common prefix/suffix
if (j == 0) i++; // move forward one position in text if not advanced
else j = 0; // else start from beginning of the string
}
}
}
return -1; // no occurrence
}
54
Algorithmics I, 2022
3 Strings and text algorithms 154
KMP - Example

string/pattern s a b a b a c a

text t b a c b a b a b a b a c a a b

String/pattern has been found position in string j=6

string/pattern s a b a b a c a

j 0 1 2 3 4 5 6

b[j] 0 0 0 1 2 3 0

Algorithmics I, 2022 55
3 Strings and text algorithms 155
KMP search - Analysis
while (i<n)
if (t[i] == s[j]){
i++; j++;
}
else {
if (b[j]>0) j = b[j];
else {
if (j == 0) i++;
else j = 0;
}
}

For the complexity we need to know the number of loop iterations


Consider values of i and k (where k=i-j) during the iterations
− clearly i≤n and since j is never negative we also have k≤n
− in each iteration either i or k is incremented and neither is decremented

Algorithmics I, 2022 56
3 Strings and text algorithms 156
KMP search - Analysis
while (i<n)
if (t[i] == s[j]){
i++; j++;
}
else {
if (b[j]>0) j = b[j];
else {
if (j == 0) i++;
else j = 0;
}
}

For the complexity we need to know the number of loop iterations


Consider values of i and k (where k=i-j) during the iterations
− clearly i≤n and since j is never negative we also have k≤n
− in each iteration either i or k is incremented and neither is decremented
• i++ > i and (i++)-(j++) = i-j

Algorithmics I, 2022 57
3 Strings and text algorithms 157
KMP search - Analysis
while (i<n)
if (t[i] == s[j]){
i++; j++;
}
else {
if (b[j]>0) j = b[j];
else {
if (j == 0) i++;
else j = 0;
}
}

For the complexity we need to know the number of loop iterations


Consider values of i and k (where k=i-j) during the iterations
− clearly i≤n and since j is never negative we also have k≤n
− in each iteration either i or k is incremented and neither is decremented
• i = i and i-b[j] > i-j
• since b[j]<j as b[j] longest border in a string of length j

Algorithmics I, 2022 58
3 Strings and text algorithms 158
KMP search - Analysis
while (i<n)
if (t[i] == s[j]){
i++; j++;
}
else {
if (b[j]>0) j = b[j];
else {
if (j == 0) i++;
else j = 0;
}
}

For the complexity we need to know the number of loop iterations


Consider values of i and k (where k=i-j) during the iterations
− clearly i≤n and since j is never negative we also have k≤n
− in each iteration either i or k is incremented and neither is decremented
• i++ > i and (i++)-j > i-j

Algorithmics I, 2022 59
3 Strings and text algorithms 159
KMP search - Analysis
while (i<n)
if (t[i] == s[j]){
i++; j++;
}
else {
if (b[j]>0) j = b[j];
else {
if (j == 0) i++;
else j = 0;
}
}

For the complexity we need to know the number of loop iterations


Consider values of i and k (where k=i-j) during the iterations
− clearly i≤n and since j is never negative we also have k≤n
− in each iteration either i or k is incremented and neither is decremented
• i = i and i-0 > i-j
• since j>0 must hold for the else case to be taken

Algorithmics I, 2022 60
3 Strings and text algorithms 160
KMP search - Analysis
while (i<n)
if (t[i] == s[j]){
i++; j++;
}
else {
if (b[j]>0) j = b[j];
else {
if (j=0) i++;
else j = 0;
}
}

For the complexity we need to know the number of loop iterations


Consider values of i and k (where k=i-j) during the iterations
− clearly i≤n and since j is never negative we also have k≤n
− in each iteration either i or k is incremented and neither is decremented
− so the number of iterations of the loop is at most 2n
Hence KMP is O(n) in the worst case
Algorithmics I, 2022 61
3 Strings and text algorithms 161
KMP search - Analysis
KMP search is O(n) in the worst case

Creating the border table


− naïve method requires O(j^2) steps to evaluate b[j] giving O(m^3) overall
− a more efficient method is possible that requires just O(m) steps in total
involves a subtle application of the KMP algorithm (details are omitted)

Overall complexity of KMP search


− KMP can be implemented to run in O(m+n) time
− O(m) for setting up the border table
− O(n) for conducting the search

Have addressed challenge 1


− KMP algorithm is linear (i.e. O(m+n))

Algorithmics I, 2022 62
3 Strings and text algorithms 162
Boyer-Moore Algorithm
Challenge 1: can we find a solution that is linear in the worst case?

Yes: KMP

Challenge 2: can we find a solution that is (much) faster than brute


force on average?

Boyer-Moore: almost always faster than brute force or KMP


− variants are used in many applications
− typically, many text characters are skipped without even being checked
− the string/pattern is scanned right-to-left
− text character involved in a mismatch is used to decide next comparison

Algorithmics I, 2022 63
3 Strings and text algorithms 163
Boyer-Moore Algorithm – Example
Search for ‘pill’ in ‘the caterpillar’

the caterpillar
pill
   ^

Search for string from right to left


− start by comparing mth element of text with last character of string
m is the length of the string, i.e. equals 4

Algorithmics I, 2022 64
3 Strings and text algorithms 164
Boyer-Moore Algorithm – Example
Search for ‘pill’ in ‘the caterpillar’

the caterpillar
         pill
            ^

Search for string from right to left


− continue search from the last position in the string
− ‘p’ matches and we have found the string in the text

Algorithmics I, 2022 65
3 Strings and text algorithms 165
Boyer-Moore Algorithm – Simplified version
The string is scanned right-to-left
− text character involved in a mismatch is used to decide next comparison
− involves pre-processing the string to record the position of the last
occurrence of each character c in the alphabet
− therefore the alphabet must be fixed in advance of the search

Last occurrence position of character c in the string s


− equals max{k | s[k]=c } if such a k exists and -1 otherwise

Want to store last occurrence position of c in an array element p[c]


− in Java a char used as an array index is implicitly widened to an int,
so p[c] works directly; alternatively the static method
Character.getNumericValue(c) can be used to compute an appropriate array index

Simplified version (often called the Boyer–Moore–Horspool algorithm)

Algorithmics I, 2022 66
3 Strings and text algorithms 166
Boyer-Moore Algorithm – Simplified version
In our pseudocode we assume an array p[c] indexed by characters
− the characters range over the underlying alphabet of the text
− p[c] records the position in the string of the last occurrence of char c
− if the character c is absent from the string s, then let p[c]=-1

Assume ASCII character set (128 characters)


− for Unicode (more than 107,000 characters), p would be a large array
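A possible Java sketch of the set-up, assuming an ASCII alphabet and using the char directly as an array index (illustrative, not from the notes):

/** p[c] = position of the last occurrence of c in s, or -1 if c is absent */
static int[] lastOccurrence(char[] s) {
    int[] p = new int[128];                      // one entry per ASCII character
    for (int c = 0; c < 128; c++) p[c] = -1;     // -1: character absent from s
    for (int k = 0; k < s.length; k++)
        p[s[k]] = k;                             // later occurrences overwrite earlier
    return p;
}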

On finding a mismatch there is a jump step in the algorithm


− if the mismatch is between s[j] and t[i]
− ‘slide s along’ so that position p[t[i]] of s aligns with t[i]
• i.e. align last position in s of character t[i] with position i of t
− if this moves s in the ‘wrong direction’, instead move s one position right
− if t[i] does not appear in string, ‘slide string’ past t[i]
• i.e. align position -1 of s with position i of t
Algorithmics I, 2022 67
3 Strings and text algorithms 167
Boyer-Moore Algorithm – Jump step – Case 1
Assume a mismatch between position s[j] and position t[i]
Case 1: the last position of character t[i] in s is before position j
sp i
text t * * * * * * a * * * * * * …

string/pattern s . . a . . . b . .

p[t[i]] j
reminder: p[t[i]] records the position
in s the last occurrence of character t[i]

− i records the current position in the text we are checking


− j records the current position in the string we are checking
− sp records the current starting position of string in the text

Algorithmics I, 2022 68
3 Strings and text algorithms 168
Boyer-Moore Algorithm – Jump step – Case 1
Assume a mismatch between position s[j] and position t[i]
Case 1: the last position of character t[i] in s is before position j
sp i inew
text t * * * * * * a * * * * * * …

string/pattern s . . a . . . b . .
j

(m-1)-p[t[i]]
p[t[i]]
m-1
− i records the current position in the text we are checking
− new value of i equals i+(m-1)-p[t[i]]

Algorithmics I, 2022 69
3 Strings and text algorithms 169
Boyer-Moore Algorithm – Jump step – Case 1
Assume a mismatch between position s[j] and position t[i]
Case 1: the last position of character t[i] in s is before position j
sp i
text t * * * * * * a * * * * * * …

string/pattern s . . a . . . b . .
jnew
p[t[i]] j

− j records the current position in the string we are checking


− new value of j equals m-1 (start again from the end of the string/pattern)

Algorithmics I, 2022 70
3 Strings and text algorithms 170
Boyer-Moore Algorithm – Jump step – Case 1
Assume a mismatch between position s[j] and position t[i]
Case 1: the last position of character t[i] in s is before position j
sp spnew
i
text t * * * * * * a * * * * * * …

string/pattern s . . a . . . b . .

p[t[i]] j

j-p[t[i]]

− sp records the current starting position of string in the text


− new value of sp equals sp+j-p[t[i]] as this is the amount the pattern/
string has been moved forward

Algorithmics I, 2022 71
3 Strings and text algorithms 171
Boyer-Moore Algorithm – Jump step – Case 2
Assume a mismatch between position s[j] and position t[i]
Case 2: last position of character t[i] in s is at least at position j
sp i
text t * * * * a * * * * * * * …

string/pattern s . . b . . . a
. . .
j p[t[i]]

move string along by one place and start again from the end of the string

− i records the current position in the text we are checking


− j records the current position in the string we are checking
− sp records the current starting position of string in the text

Algorithmics I, 2022 72
3 Strings and text algorithms 172
Boyer-Moore Algorithm – Jump step – Case 2
Assume a mismatch between position s[j] and position t[i]
Case 2: last position of character t[i] in s is at least at position j
sp i inew
text t * * * * a * * * * * * * …

string/pattern s . . b . . . a . .
j position m-1
position j-1
(m-1)–(j-1)

− i records the current position in the text we are checking


− new value of i equals i+(m-1)–(j-1) = i+(m-j)

Algorithmics I, 2022 73
3 Strings and text algorithms 173
Boyer-Moore Algorithm – Jump step – Case 2
Assume a mismatch between position s[j] and position t[i]
Case 2: last position of character t[i] in s is at least at position j
sp i
text t * * * * a * * * * * * * …

string/pattern s . . b . . . a . .
jnew
j

− j records the current position in the string we are checking


− new value of j equals m-1

Algorithmics I, 2022 74
3 Strings and text algorithms 174
Boyer-Moore Algorithm – Jump step – Case 2
Assume a mismatch between position s[j] and position t[i]
Case 2: last position of character t[i] in s is at least at position j
sp spnew
i
text t * * * * a * * * * * * * …

string/pattern s . . b . . . a . .
j

− sp records the current starting position of string in the text


− new value of sp equals sp+1

Algorithmics I, 2022 75
3 Strings and text algorithms 175
Boyer-Moore Algorithm – Jump step – Case 3
Assume a mismatch between position s[j] and position t[i]
Case 3: character t[i] does not appear in s (i.e. we have p[j]=-1)
sp i
text t * * * * * * a * * * * * * * * .
* …

string/pattern s . . . . . . b . .

− i records the current position in the text we are checking


− j records the current position in the string we are checking
− sp records the current starting position of string in the text

Algorithmics I, 2022 76
3 Strings and text algorithms 176
Boyer-Moore Algorithm – Jump step – Case 3
Assume a mismatch between position s[j] and position t[i]
Case 3: character t[i] does not appear in s (i.e. we have p[j]=-1)
sp i inew
text t * * * * * * a * * * * * * * * .
* …

string/pattern s . . . . . . b . .

m-1

− i records the current position in the text we are checking


− new value of i equals i+m

Algorithmics I, 2022 77
3 Strings and text algorithms 177
Boyer-Moore Algorithm – Jump step – Case 3
Assume a mismatch between position s[j] and position t[i]
Case 3: character t[i] does not appear in s (i.e. we have p[j]=-1)
sp i inew
text t * * * * * * a * * * * * * * * .
* …

string/pattern s . . . . . . b . .
jnew
j

− j records the current position in the string we are checking


− new value of j equals m-1 (start again from the end of the string/pattern)

Algorithmics I, 2022 79
3 Strings and text algorithms 179
Boyer-Moore Algorithm – Jump step – Case 3
Assume a mismatch between position s[j] and position t[i]
Case 3: character t[i] does not appear in s (i.e. we have p[j]=-1)
sp spnew
i inew
text t * * * * * * a * * * * * * * * .
* …

string/pattern s . . . . . . b . .

j+1
− sp records the current starting position of string in the text
− new value of sp equals sp+(j+1) as this is the amount the pattern/
string has been moved forward

Algorithmics I, 2022 80
3 Strings and text algorithms 180
Boyer-Moore Algorithm – All cases
Case 1: p[t[i]]<j and p[t[i]]≥0
− new value of i equals i+m-1-p[t[i]]
− new value of j equals m-1
− new value of sp equals sp+j-p[t[i]]
Case 2: p[t[i]]>j
− new value of i equals i+m-j
− new value of j equals m-1
− new value of sp equals sp+1
Case 3: p[t[i]]=-1
− new value of i equals i+m
− new value of j equals m-1
− new value of sp equals sp+j+1

Note: p[t[i]] cannot equal j, as p[t[i]] is the last position of character t[i]
in s and there is a mismatch between t[i] and s[j]

Algorithmics I, 2022 81
3 Strings and text algorithms 181
Boyer-Moore Algorithm – All cases
We find that we can express these updates as follows:
− new value of i equals i + m – min(1+p[t[i]],j)
− new value of j equals m-1
− new value of sp equals sp + max(j-p[t[i]],1)

You do not need to learn these updates, just how the algorithm works
− this is sufficient for running it on an example (as you saw)
− and for working out what the updates are if needed (again as you saw)

Algorithmics I, 2022 82
3 Strings and text algorithms 182
Boyer-Moore Algorithm - Implementation
/** return smallest k such that s occurs at k in t or -1 if no k exists */
public int bm(char[] t, char[] s) {
int m = s.length; // length of string/pattern
int n = t.length; // length of text
int sp = 0; // current starting position of string in text
int i = m-1; // current position in text
int j = m-1; // current position in string/pattern
int[] p = new int[128]; // a suitable last-occurrence array (assuming ASCII)
setUp(s, p); // set up the last occurrence array
while (sp <= n-m && j >= 0) {
if (t[i] == s[j]){ // current characters match
i--; // move back in text
j--; // move back in string
} else { // current characters do not match
sp += Math.max(1, j - p[t[i]]); // slide string along (see jump-step cases)
i += m - Math.min(j, 1 + p[t[i]]);
j = m-1; // return to end of string
}
}
if (j < 0) return sp; else return -1; // occurrence found yes/no
}
Algorithmics I, 2022 83
3 Strings and text algorithms 183
Boyer-Moore Algorithm - Complexity
Worst case is no better than O(mn)
− e.g. search for s = ba … aa in t = aa … aaaa … aa
length m length n

− m character comparisons needed at each of the n-m+1 starting positions in the text


before the text/pattern is found

There is an extended version which is linear, i.e. O(m+n)


− this uses the good suffix rule (sometimes called ‘magic’)

Algorithmics I, 2022 84
3 Strings and text algorithms 184
Algorithmics I 2022

Algorithmics I

Section 4 – NP completeness

Dr. Gethin Norman


School of Computing Science
University of Glasgow
[email protected]

4 NP Completeness 185
Some efficient algorithms we have seen

We have seen algorithms for a wide range of problems so far, giving


us a spectrum of worst-case complexity functions:

• searching a sorted list O(log n) (for an array/list of length n)

• finding the max value O(n) (for an array/list of length n)

• sorting O(n log n) (for an array/list of length n)

• distance between two strings O(n^2) (for two strings of length n)

• finding a shortest path O(n^2) (for weighted graph with n vertices)

These are all examples of problems that admit polynomial-time


algorithms: their worst-case complexity is O(n^c) for some constant c

Algorithmics I, 2022 2
4 NP Completeness 186
Recall the Eulerian cycle problem (AF2)

G undirected graph: decide whether G admits an Euler cycle


− an Eulerian cycle is a cycle that traverses each edge exactly once

[figure: a connected graph whose edges are numbered 1 to 8 in the order they
are traversed by an Euler cycle]

Theorem (Euler, 1736). A connected undirected graph has an


Euler cycle if and only if each vertex has even degree
therefore we can test whether G has an Euler cycle (and find one) in:
− O(n^2) time if G is represented by an adjacency matrix
− O(m+n) time if G is represented by adjacency lists
− where m=|E| and n=|V|

Algorithmics I, 2022 3
4 NP Completeness 187
Recall the Hamiltonian cycle problem (AF2)

G undirected graph, decide whether G admits an Hamiltonian cycle


− a Hamiltonian cycle is a cycle that visits each vertex exactly once

This problem is superficially similar to the Euler cycle problem


− however in an algorithmic sense it is very different
− nobody has found a polynomial-time algorithm for Hamiltonian cycle

Algorithmics I, 2022 4
4 NP Completeness 188
Recall the Hamiltonian cycle problem (AF2)
Brute force algorithm:
− generate all permutations of vertices
− check each one to see if it is a cycle, i.e. corresponding edges are present

Complexity of the algorithm (n is the number of vertices)


− n! permutations will be generated in the worst case
− for each permutation π, O(n^2) operations to check whether π is a
Hamiltonian cycle (assuming G is represented by adjacency lists)
• worst case: to check an edge is present have to traverse adjacency list of
length n-1 and have n edges to check

Therefore worst-case number of operations is O(n^2 × n!)


− this is an example of an exponential algorithm
− an algorithm whose time complexity is no better than O(b^n) for some
constant b (and so cannot be expressed as O(n^c) for any constant c)
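A possible Java sketch of the brute force algorithm; as a simplifying assumption it uses an adjacency matrix, so each edge test is O(1) rather than the O(n) list scan assumed in the analysis above:

/** returns true if the graph (adjacency matrix adj) has a Hamiltonian cycle */
static boolean hamiltonianCycle(boolean[][] adj) {
    int n = adj.length;
    int[] perm = new int[n];
    for (int i = 0; i < n; i++) perm[i] = i;
    return tryPerms(adj, perm, 1);               // fix vertex 0 as the starting point
}

static boolean tryPerms(boolean[][] adj, int[] perm, int k) {
    int n = perm.length;
    if (k == n) {                                // a complete permutation:
        for (int i = 0; i < n; i++)              // check all cycle edges are present
            if (!adj[perm[i]][perm[(i + 1) % n]]) return false;
        return true;
    }
    for (int i = k; i < n; i++) {                // generate permutations by swapping
        int tmp = perm[k]; perm[k] = perm[i]; perm[i] = tmp;
        if (tryPerms(adj, perm, k + 1)) return true;
        tmp = perm[k]; perm[k] = perm[i]; perm[i] = tmp; // swap back
    }
    return false;
}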

Algorithmics I, 2022 5
4 NP Completeness 189
Polynomial versus exponential time
Table shows running time of algorithms with various complexities
(assuming 10^9 operations per second)

        n = 20      n = 40       n = 50          n = 60            n = 70
n       .00001 sec  .00003 sec   .00004 sec      .00005 sec        .00006 sec
n^2     .0001 sec   .0009 sec    .0016 sec       .0025 sec         .0036 sec
n^3     .001 sec    .027 sec     .064 sec        .125 sec          .216 sec
n^5     .1 sec      24.3 secs    1.7 mins        5.2 mins          13.0 mins
2^n     .001 sec    17.9 mins    12.7 days       35.7 years        366 centuries
3^n     .059 sec    6.5 years    3855 centuries  2×10^8 centuries  1.3×10^13 centuries
n!      3.6 secs    8.4×10^16 centuries  2.6×10^32 centuries  9.6×10^48 centuries  2.6×10^66 centuries

As n grows, distinction between polynomial and exponential time


algorithms becomes dramatic

Algorithmics I, 2022 6
4 NP Completeness 190
Polynomial versus exponential time
This behaviour still applies even with increases in computing power
− sizes of largest instance solvable in 1 hour on a current computer
− what happens when computers become faster?
        current computer   100 times faster   1000 times faster
n       N1                 100 N1             1000 N1
n^2     N2                 10 N2              31.6 N2
n^3     N3                 4.64 N3            10 N3
n^5     N4                 2.5 N4             3.98 N4
2^n     N5                 N5 + 6.64          N5 + 9.97
3^n     N6                 N6 + 4.19          N6 + 6.29
n!      N7                 ≤ N7 + 1           ≤ N7 + 1

A thousand-fold increase in computing power only adds 6 to the size of the


largest problem instance solvable in 1 hour, for an algorithm with complexity 3^n

Algorithmics I, 2022 7
4 NP Completeness 191
Polynomial versus exponential time

The message:
• Exponential-time algorithms are in general “bad”
− increases in processor speeds do not lead to significant changes in this
slow behaviour when the input size is large
• Polynomial-time algorithms are in general “good”

When we refer to “efficient algorithms” we mean polynomial-time


− often polynomial-time algorithms require some extra insight
− often exponential-time algorithms are variations on exhaustive search

A problem is polynomial-time solvable if it admits a polynomial-time


algorithm

Algorithmics I, 2022 8
4 NP Completeness 192
A brief interlude
You are asked to find a polynomial-time algorithm for the
Hamiltonian cycle problem
− this could be a difficult task, you do not want to have to report:

perhaps instead you could try to prove that the


problem is intractable

“I cannot find an efficient algorithm, I guess I’m too dumb”

Algorithmics I, 2022 9
4 NP Completeness 193
A brief interlude
Definition: a problem Π is intractable if there does not exist a
polynomial-time algorithm that solves Π
− you could try to prove that the Hamiltonian Cycle problem is intractable

it can be very difficult to prove that a problem is


intractable, and such proofs are rare

“I cannot find an efficient algorithm, because no such algorithm is possible!”

Algorithmics I, 2022 10
4 NP Completeness 194
A brief interlude
You could try to prove that the Hamiltonian cycle problem is “just as
hard” as a whole family of other difficult problems

these difficult problems are known as the


NP-complete problems

“I cannot find an efficient algorithm, but neither can all these famous people!”
Algorithmics I, 2022 11
4 NP Completeness 195
A brief interlude
State of the Art for Hamiltonian cycle
− no polynomial-time algorithm has been found
− similarly, no proof of intractability has been found
− the problem is known to be an NP-complete problem

So what can we do in these circumstances?


− search for a polynomial-time algorithm should be given a lower priority
− could try to solve only “special cases” of the problem
− could look for an exponential-time algorithm that does reasonably well in
practice
− could search for a polynomial-time algorithm that meets only some of
the problem specifications

Algorithmics I, 2022 12
4 NP Completeness 196
NP-complete problems
No polynomial-time algorithm is known for a NP-complete problem
− however, if one of them is solvable in polynomial time, then they all are

No proof of intractability is known for a NP-complete problem


− however, if one of them is intractable, then they all are

There is a strong belief in the


community that NP-complete
problems are intractable
− we can think of all of them
as being of equivalent
difficulty

Algorithmics I, 2022 13
4 NP Completeness 197
Intractable problems
Two different causes of intractability (no polynomial algorithm):
1. polynomial time is not sufficient in order to discover a solution
2. solution itself is so large that exponential time is needed to output it

We will be concerned with case 1


− there are intractability proofs for case 1
− some problems have been shown to be undecidable
i.e. no algorithm of any sort could solve them (examples later)
− some decidable problems have been shown to be intractable

Example of case 2:
− consider problem of generating all cycles for a given graph

Algorithmics I, 2022 14
4 NP Completeness 198
Intractable problems - Roadblock
A decidable problem that is intractable: Roadblock
− there are two players: A and B
− there is a network of roads, comprising intersections connected by roads
− each road is coloured either black, blue or green
− some intersections are marked either “A wins” or “B wins”
− a player has a fleet of cars located at intersections
• at most one per intersection

Player A begins, and subsequently players make moves in turn


− by moving one of their cars on one or more roads of the same colour
− a car may not stop at or pass over an intersection which already has a car

The problem is to decide, for a given starting configuration, whether


A can win, regardless of what moves B takes
Algorithmics I, 2022 15
4 NP Completeness 199
Intractable problems – Roadblock - Example

A moves first and A can win, no matter what B does. How?


− A moves (along the green road)
− B moves (along the black road) to try and stop A from winning on its
next turn
(if B does not do this, A could move to the same place and win)

[figure: the road network, with the players’ cars and intersections marked
‘A wins’ and ‘B wins’]
Algorithmics I, 2022 16
4 NP Completeness 200
Intractable problems – Roadblock - Example

A moves first and A can win, no matter what B does. How?


− A moves (along the green road)
− B moves (along the black road) to try and stop A from winning
− but A can still win (by moving along the black road)
[figure: the same network after the moves described above]
Algorithmics I, 2022 17
4 NP Completeness 201
Summary

[diagram: polynomial-time solvable problems  ?  NP-complete problems  ?  intractable problems]

One of the question marks must be an ’equals’ sign, while the other
must be a ’not-equals’ sign

Algorithmics I, 2022 18
4 NP Completeness 202
Problem and problem instances
A problem is usually characterised by (unspecified) parameters
− typically there are infinitely many instances for a given problem
A problem instance is created by giving these parameters values

An NP-complete problem:
− Name: Hamiltonian Cycle (HC)
− Instance: a graph G
− Question: does G contain a cycle that visits
each vertex exactly once?

This is an example of a decision problem


− the answer is ’yes’ or ’no’
− every instance is either a ’yes’-instance or a ’no’-instance

Algorithmics I, 2022 19
4 NP Completeness 203
Other NP-complete problems
Name: Travelling Salesman Decision Problem (TSDP)
Instance: a set of n cities and integer distance d(i,j) between each
pair of cities i, j, and a target integer K
Question: is there a permutation p1p2…pn-1pn of 1,2,…,n such that
d(p1,p2) + d(p2,p3) + … + d(pn-1,pn) + d(pn,p1) ≤ K ?
− i.e. is there a ‘travelling salesman tour’ of length ≤ K

Example: [figure: four cities c1, c2, c3, c4 with distances d(1,2)=10,
d(1,3)=5, d(1,4)=9, d(2,3)=6, d(2,4)=9, d(3,4)=9]
− there is a travelling salesman tour of length 29
• d(1,3)+d(3,2)+d(2,4)+d(4,1)=5+6+9+9=29
− there is no tour of length < 29
The travelling salesman decision problem is NP-complete

Algorithmics I, 2022 20
4 NP Completeness 204
Other NP-complete problems
Name: Clique Problem (CP)
Instance: a graph G and a target integer K
Question: does G contain a clique of size K?
− i.e. a set of K vertices for which there is an edge between all pairs

Example:
− there is a clique of size 4
− there is no clique of size 5

The clique decision problem is NP-complete

Algorithmics I, 2022 21
4 NP Completeness 205
Other NP-complete problems
Name: Graph Colouring Problem (GCP)
Instance: a graph G and a target integer K
Question: can one of K colours be attached to each vertex of G so
that adjacent vertices always have different colours?

Example:
− there is a colouring using 3 colours
− there is no colouring using 2 colours

The graph colouring decision problem is NP-complete


Algorithmics I, 2022 22
4 NP Completeness 206
Other NP-complete problems
Name: Satisfiability (SAT)
Instance: Boolean expression B in conjunctive normal form (CNF)
− CNF: C1 ∧ C2 ∧ … ∧ Cn where each Ci is a clause
− Clause C: (l1 ∨ l2 ∨ … ∨ lm) where each lj is a literal
− Literal l: a variable x or its negation ¬x
Question: is B satisfiable?
− i.e. can values be assigned to the variables that make B true?

Example:
− B = (x1∨x2∨¬x3)∧(¬x1∨x3∨¬x4)∧(¬x2∨x4)∧(x2∨¬x3∨x4)
− B is satisfiable: x1=true, x2=false, x3=true, x4=true
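The check can be written directly as a Java expression (an illustrative snippet):

boolean x1 = true, x2 = false, x3 = true, x4 = true;
boolean B = (x1 || x2 || !x3) && (!x1 || x3 || !x4)
         && (!x2 || x4) && (x2 || !x3 || x4);    // evaluates to true: B is satisfied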

The satisfiability problem is NP-complete

Algorithmics I, 2022 23
4 NP Completeness 207
Optimisation and search problems
An optimisation problem: find the maximum or minimum value
− e.g. the travelling salesman optimisation problem (TSOP) is to find the
minimum length of a tour

A search problem: find some appropriate optimal structure


− e.g. the travelling salesman search problem (TSSP) is to find a minimum
length tour

NP-completeness deals primarily with decision problems


− corresponding to each instance of an optimisation or search problem
− is a family of instances of a decision problem by setting ’target’ values
− almost invariably, an optimisation or search problem can be solved in
polynomial time if and only if the corresponding decision problem can
(we will consider some examples of this in the tutorials)

Algorithmics I, 2022 24
4 NP Completeness 208
The class P
P is the class of all decision problems that can be solved in
polynomial time

Fortunately, many problems are in P


− is there a path of length ≤K from vertex u to vertex v in a graph G?
− is there a spanning tree of weight ≤K in a graph G?
− is a graph G bipartite?
− is a graph G connected?
− deadlock detection: does a directed graph D contain a cycle?
− text searching: does a text t contain an occurrence of a string s?
− string distance: is d(s,t)≤K for strings s and t?
− …

P often extended to include search and optimisation problems

Algorithmics I, 2022 25
4 NP Completeness 209
The class NP

The decision problems solvable in non-deterministic polynomial time


− a non-deterministic algorithm can make non-deterministic choices
• the algorithm is allowed to guess (so when run can give different answers)
− hence is apparently more powerful than a normal deterministic algorithm

P is certainly contained within NP


− a deterministic algorithm is just a special case of a non-deterministic one

But is that containment strict?


− there is no problem known to be in NP and known not to be in P

The relationship between P and NP is the most notorious unsolved


question in computing science
− there is a million dollar prize if you can solve this question

Algorithmics I, 2022 26
4 NP Completeness 210
Non-deterministic algorithms (NDAs)
Such an algorithm has an extra operation: non-deterministic choice

int nonDeterministicChoice(int n)
// returns a positive integer chosen from the range 1,…,n

− an NDA has many possible executions depending on values returned

An NDA “solves” a decision problem Π if


− for a ‘yes’-instance I of Π there is some execution that returns ’yes’
− for a ‘no’-instance I of Π there is no execution that returns ’yes’

and “solves” a decision problem Π in polynomial time if


− for every ‘yes’-instance I of Π there is some execution that returns
‘yes’ which uses a number of steps bounded by a polynomial in the input
− for a ‘no’-instance I of Π there is no execution that returns ’yes’
Algorithmics I, 2022 27
4 NP Completeness 211
Non-deterministic algorithms (NDAs)
An NDA “solves” a decision problem Π if
− for a ‘yes’-instance I of Π there is some execution that returns ’yes’
− for a ‘no’-instance I of Π there is no execution that returns ’yes’

Clearly such algorithms are not useful in practice


− who would use an algorithm that sometimes gives the right answer

However they are a useful mathematical concept for defining the


classes of NP and NP-complete problems

Algorithmics I, 2022 28
4 NP Completeness 212
Non-deterministic algorithms - Example
Graph colouring

// return true if graph g is k-colourable and false otherwise


boolean nDGC(Graph g, int k){

for (each vertex v in g) v.setColour(nonDeterministicChoice(k));

for (each edge {u,v} in g)


if (u.getColour() == v.getColour()) return false;
return true;
}

“guess” a colour
“verify” the for each vertex
colouring

Algorithmics I, 2022 29
4 NP Completeness 213
Non-deterministic algorithms
An non-deterministic algorithm can be viewed as
− a guessing stage (non-deterministic)
− a checking stage (deterministic and polynomial time)

guess a verify the stop


start
’certificate’ certificate

non-deterministic polynomial time


algorithm algorithm

Algorithmics I, 2022 30
4 NP Completeness 214
Polynomial time reductions
A polynomial-time reduction (PTR) is a mapping f from a decision
problem Π1 to a decision problem Π2 such that:

for every instance I1 of Π1 we have


− the instance f(I1) of Π2 can be constructed in polynomial time
− f(I1) is a ’yes’-instance of Π2 if and only if I1 is a ’yes’-instance of Π1

We write Π1 ∝ Π2 as an abbreviation for:


there is a polynomial-time reduction from Π1 to Π2

Algorithmics I, 2022 31
4 NP Completeness 215
Polynomial time reductions - Properties
Transitivity: Π1 ∝ Π2 and Π2 ∝ Π3 implies that Π1 ∝ Π3

Since Π1 ∝ Π2 and Π2 ∝ Π3 we have


− a PTR f from Π1 to Π2
− a PTR g from Π2 to Π3

Now for any instance I1 of Π1 since f is PTR we have


− I2=f(I1) is an instance of Π2 that can be constructed in polynomial time
− I2 has the same answer as I1
and since g is a PTR we have
− I3=g(I2) is an instance of Π3 that can be constructed in polynomial time
− I3 has the same answer as I2

Algorithmics I, 2022 32
4 NP Completeness 216
Polynomial time reductions - Properties
Transitivity: Π1 ∝ Π2 and Π2 ∝ Π3 implies that Π1 ∝ Π3

Since Π1 ∝ Π2 and Π2 ∝ Π3 we have


− a PTR f from Π1 to Π2
− a PTR g from Π2 to Π3

Putting the results together: for any instance I1 of Π1


− I3=g(f(I1)) is an instance of Π3 constructed in polynomial time
− I3 has the same answer as I1
− i.e. the composition of f and g is a PTR from from Π1 to Π3

Algorithmics I, 2022 33
4 NP Completeness 217
Polynomial time reductions - Properties
Relevance to P: Π1 ∝ Π2 and Π2∈P implies that Π1∈P
− to solve an instance of Π1, reduce it to an instance of Π2
− roughly speaking, Π1 ∝ Π2 means that Π1 is ‘no harder’ than Π2
i.e. if we can solve Π2, then we can solve Π1 without much more effort
• just need to additional perform a polynomial time reduction

Algorithmics I, 2022 34
4 NP Completeness 218
Polynomial time reductions - Example
Reducing Hamiltonian cycle problem to travelling salesman problem

Hamiltonian Cycle Problem (HC)


− instance: a graph G
− question: does G contain a cycle that visits
each vertex exactly once?

Travelling Salesman Decision Problem (TSDP)


− instance: a set of n cities and integer distance c1

d(i,j) between each pair of cities i,j, and a target integer K


9
− question: is there a permutation p of {1,2,…,n} such that 10 5

d(p1,p2)+d(p2,p3)+…+d(pn-1,pn)+d(pn,p1)≤K ? 6
c3 9

• i.e. is there a ’travelling salesman tour’ of length ≤K c2 9


c4

Algorithmics I, 2022 35
4 NP Completeness 219
Polynomial time reductions - Example
Reducing Hamiltonian cycle problem to travelling salesman problem
− G = (V,E) is an instance of HC
− construct TSDP instance f(G) where
• cities = V
• d(u,v)=1 if {u,v}∈E and 2 otherwise (is not an edge of G)
• K = |V|

a b a b
1
1 1
1
c 2
e e c
2 1
1
2 1
G
f(G)
d d

Algorithmics I, 2022 36
4 NP Completeness 220
Polynomial time reductions - Example
Reducing Hamiltonian cycle problem to travelling salesman problem
− G = (V,E) is an instance of HC
− construct TSDP instance f(G)
a b a b
1
1 1
1
c 2
e e c
2 1
1
2 1
G
f(G)
d d
− f(G) can be constructed in polynomial time
− f(G) has a tour of length ≤|V| if and only if G has a Hamiltonian cycle
(tour includes |V| edges so cannot take any of the edges with weight 2)
− therefore TSDP∈P implies that HC∈P
− equivalently HC∉P implies that TSDP∉P (contrapositive)

Algorithmics I, 2022 37
4 NP Completeness 221
NP-completeness
A decision problem Π is NP-complete if
1. Π∈NP
2. for every problem Π’ in NP: Π’ is polynomial-time reducable to Π

Consequences of definition
− if Π is NP-complete and Π∈P, then P = NP
− every problem in NP can be solved in polynomial time by reduction to Π
− supposing P ≠ NP, if Π is NP-complete, then Π∉P

The structure of NP if P ≠ NP
NP
P NP-complete

Algorithmics I, 2022 38
4 NP Completeness 222
Proving NP-completeness
A decision problem Π is NP-complete if
1. Π∈NP
2. for every problem Π’ in NP: Π’ is polynomial-time reducable to Π

How can we possibly prove any problem to be NP-complete?


− it is not feasible to describe a reduction from every problem in NP
− however, suppose we knew just one NP-complete problem Π1

To prove Π2 is NP-complete enough to show


− Π2 is in NP
− there exists a polynomial-time reduction from Π1 to Π2

Algorithmics I, 2022 39
4 NP Completeness 223
Proving NP-completeness
A decision problem Π is NP-complete if
1. Π∈NP
2. for every problem Π’ in NP: Π’ is polynomial-time reducable to Π

Suppose we knew just one NP-complete problem Π1, then to prove


Π2 is NP-complete it is enough to show
− Π2 is in NP
− there exists a polynomial-time reduction from Π1 to Π2

Correctness of the approach


− for any Π∈NP, since Π1 is NP-complete we have Π ∝ Π1
− since Π ∝ Π1, Π1 ∝ Π2 and ∝ is transitive, it follows that Π ∝ Π2
− since Π∈NP was arbitrary, Π ∝ Π2 for all Π∈NP
− and hence Π2 is NP-complete

Algorithmics I, 2022 40
4 NP Completeness 224
Proving NP-completeness
The first NP-complete problem?

Name: Satisfiability (SAT)


Instance: Boolean expression B in conjunctive normal form (CNF)
− CNF: C1 ∧ C2 ∧ … ∧ Cn where each Ci is a clause
− Clause C: (l1 ∨ l2 ∨ … ∨ lm) where each lj is a literal
− Literal l: a variable x or its negation ¬x
Question: is B satisfiable?
− i.e. can values be assigned to the variables that make B true?

Example:
− B = (x1∨x2∨¬x3)∧(¬x1∨x3∨¬x4)∧(¬x2∨x4)∧(x2∨¬x3∨x4)
− B is satisfiable: x1=true, x2=false, x3=true, x4=true

Algorithmics I, 2022 41
4 NP Completeness 225
Proving NP-completeness
The first NP-complete problem?

Cook’s Theorem (1971): Satisfiability (SAT) is NP-complete


− the proof consists of a generic polynomial-time reduction to SAT
from an abstract definition of a general problem in the class NP
− the generic reduction could be instantiated to give an actual
reduction for each individual NP problem

Given Cook’s theorem, to prove a decision problem Π is NP-complete


it is sufficient to show that:
− Π is in NP
− there exists a polynomial-time reduction from SAT to Π

Algorithmics I, 2022 42
4 NP Completeness 226
Clique is NP-complete
Name: Clique Problem (CP)
Instance: a graph G and a target integer K
Question: does G contain a clique of size K?
− i.e. a set of K vertices for which there is an edge between all pairs

To prove Clique is NP –complete


− show CP is in NP (straightforward)
− there exists a polynomial-time reduction from SAT to CP

Algorithmics I, 2022 43
4 NP Completeness 227
Clique is NP-complete
To complete the proof we need to show SAT µ CP
− i.e. a polynomial time reduction from SAT to CP

This is not examinable – this is just to show you that it is possible to


build PTRs between very different problems

Algorithmics I, 2022 44
4 NP Completeness 228
Clique is NP-complete
To complete the proof we need to show SAT µ CP
− i.e. a polynomial time reduction from SAT to CP

Given an instance B of SAT we construct (G,K) an instance of CP


− K number of clauses of B
− vertices of G are pairs (l,C) where l is a literal in clause C
− {(l,C),(m,D)} is an edge of G if and only if l≠¬m and C≠D
• recall that ¬(¬x)=x so l≠¬m is equivalent to ¬l≠m
• edge if distinct literals from different clauses can be satisfied simultaneously
− polynomial time construction (O(n2) where n is the number of literals)
• worst case: to construct edges we need to compare every literal with
every other literal
This is a polynomial time reduction since:
− B has a satisfying assignment if and only if G has a clique of size K

Algorithmics I, 2022 45
4 NP Completeness 229
Clique is NP-complete
To prove it is a polynomial time reduction we can show:

If B has a satisfying assignment, then


− if we choose a true literal in each clause the corresponding vertices
form a clique of size K in G

If G has a clique of size K, then


− assigning each literal associated with a vertex in the clique to be true
yields a satisfying assignment for B

Algorithmics I, 2022 46
4 NP Completeness 230
Clique is NP-complete
Why does the construction work?

{(l,C),(m,D)} is an edge if and only if l≠¬m and C≠D


− only edges between literals in distinct clauses
− only edges between literals that can be satisfied simultaneously

Therefore in a clique of size K (recall K is the number of clauses)


− must include one literal from each clause (i.e. from K clauses)
− we can satisfy all the literals in the clique simultaneously
− this means we can satisfy all clauses
• a clause is a disjunction of literals and we can satisfy one of them
− and therefore satisfy B
• B is the conjunction of the clauses

Algorithmics I, 2022 47
4 NP Completeness 231
Clique is NP-complete - Example

B = (x1∨x2∨¬x3)∧(¬x1∨x3∨¬x4)∧(¬x2∨x4)∧(x2∨¬x3∨x4)
− there are K = 4 clauses
¬x3 ¬x1
C1 C2
The graph G x3
x2
− vertices of G are pairs
(l,C) where l is a literal ¬x4
x1
in clause C
− {(l,C),(m,D)} is an edge
if and only if l≠¬m and C≠D
x4
x2

¬x2 C3
C4 ¬x3
x4
Algorithmics I, 2022 48
4 NP Completeness 232
Clique is NP-complete

B = (x1∨x2∨¬x3)∧(¬x1∨x3∨¬x4)∧(¬x2∨x4)∧(x2∨¬x3∨x4)
− there are K = 4 clauses
¬x3 ¬x1
C1 C2
The graph G x3
x2

G has a clique of size 4 ¬x4


x1
if and only if
B has a satisfying assignment

x4
x2
satisfying assignment
clique of size 4 ¬x2 C3
C4 ¬x3
x4
Algorithmics I, 2022 49
4 NP Completeness 233
Problem restrictions
A restriction of a problem consists of a subset of the instances of the
original problem
− if a restriction of a given decision problem Π is NP-complete, then so is Π
− given NP-complete problem Π, a restriction of Π might be NP-complete

For example a clique restricted to cubic graphs is in P


− (a cubic graph is a graph in which every vertex belongs to 3 edges)
− a largest clique has size at most 4 so exhaustive search is O(n4)

While graph colouring restricted to cubic graphs is NP-complete


− not proved here

Algorithmics I, 2022 50
4 NP Completeness 234
Problem restrictions

K-colouring
− restriction of Graph Colouring for for a fixed number K of colours
− 2-colouring is in P (it reduces to checking the graph is bipartite)
− 3-colouring is NP-complete

K-SAT
− restriction of SAT in which every clause contains exactly K literals
− 2-SAT is in P (proof is a tutorial exercise)
− 3-SAT is NP-complete
− showing 3-SAT ∈ NP is easy we will just show SAT ∝ 3-SAT

Algorithmics I, 2022 51
4 NP Completeness 235
SAT ∝ 3-SAT
Given instance B of SAT will construct an instance B’ of 3-SAT
For each clause C of B we construct a number of clauses of B’

− if C=l1, we introduce 2 addition variables x1 and x2 and add the


clauses (l1∨x1∨x2),(l1∨x1∨¬x2),(l1∨¬x1∨x2),(l1∨¬x1∨¬x2) to B’

− B’ holds if and only if all the clauses (l1∨x1∨x2), (l1∨x1∨¬x2),


(l1∨¬x1∨x2), (l1∨¬x1∨¬x2) hold (B’ is a conjunction of clauses)

− for any assignment to x1 and x2 this requires l1 holds


i.e. all clauses hold if and only if the clause C hold

Algorithmics I, 2022 52
4 NP Completeness 236
SAT ∝ 3-SAT
Given instance B of SAT will construct an instance B’ of 3-SAT
For each clause C of B we construct a number of clauses of B’

− if C=l1, we introduce 2 addition variables x1 and x2 and add the


clauses (l1∨x1∨x2),(l1∨x1∨¬x2),(l1∨¬x1∨x2),(l1∨¬x1∨¬x2) to B’

− if C=(l1∨l2), we introduce 1 addition variable y and add the clauses


(l1∨l2∨y) and (l1∨l2∨¬y) to B’

− B’ holds if and only if both the clauses (l1∨l2∨y) and (l1∨l2∨¬y) hold

− for any assignment to y this requires (l1∨l2) holds


i.e. both clauses hold if and only if the clause C holds

Algorithmics I, 2022 53
4 NP Completeness 237
SAT ∝ 3-SAT
Given instance B of SAT will construct an instance B’ of 3-SAT
For each clause C of B we construct a number of clauses of B’

− if C=l1, we introduce 2 addition variables x1 and x2 and add the


clauses (l1∨x1∨x2),(l1∨x1∨¬x2),(l1∨¬x1∨x2),(l1∨¬x1∨¬x2) to B’

− if C=(l1∨l2), we introduce 1 addition variable y and add the clauses


(l1∨l2∨y) and (l1∨l2∨¬y) to B’

− if C=(l1∨l2∨l3), we add the clause C to B’

Algorithmics I, 2022 54
4 NP Completeness 238
SAT ∝ 3-SAT
Given instance B of SAT will construct an instance B’ of 3-SAT
For each clause C of B we construct a number of clauses of B’

− if C=l1, we introduce 2 addition variables x1 and x2 and add the


clauses (l1∨x1∨x2),(l1∨x1∨¬x2),(l1∨¬x1∨x2),(l1∨¬x1∨¬x2) to B’

− if C=(l1∨l2), we introduce 1 addition variable y and add the clauses


(l1∨l2∨y) and (l1∨l2∨¬y) to B’

− if C=(l1∨l2∨l3), we add the clause C to B’

− if C=(l1∨…∨lk) and k>3, we introduce k-3 addition variables z1,…,zk-3


and add the clauses (l1∨l2∨z1), (¬z1∨l3∨z2),(¬z2∨l4∨z3),…,
(¬zk-4∨lk-2∨zk-3), (¬zk-3∨lk-1∨lk) to B’

Algorithmics I, 2022 55
4 NP Completeness 239
SAT ∝ 3-SAT
Given instance B of SAT will construct an instance B’ of 3-SAT
For each clause C of B we construct a number of clauses of B’

− if C=l1, we introduce 2 addition variables x1 and x2 and add the


clauses (l1∨x1∨x2),(l1∨x1∨¬x2),(l1∨¬x1∨x2),(l1∨¬x1∨¬x2) to B’

− if C=(l1∨l2), we introduce 1 addition variable y and add the clauses


(l1∨l2∨y) and (l1∨l2∨¬y) to B’

− if C=(l1∨l2∨l3), we add the clause C to B’

− if C=(l1∨…∨lk) and k>3, we introduce k-3 addition variables z1,…,zk-3


and add the clauses (l1∨l2∨z1), (¬z1∨l3∨z2),(¬z2∨l4∨z3),…,
(¬zk-4∨lk-2∨zk-3), (¬zk-3∨lk-1∨lk) to B’
− again all clauses hold if and only if C holds

Algorithmics I, 2022 56
4 NP Completeness 240
Coping with NP-completeness
What to do if faced with an NP-complete problem?
Maybe only a restricted version is of interest (which maybe in P)
− e.g. 2-SAT, 2-colouring are in P
Seek an exponential-time algorithm improving on exhaustive search
− e.g. backtracking (as in the assessed exercise), branch-and-bound
− should extend the set of solvable instances
For an optimisation problem (e.g. calculating min/max value)
− settle for an approximation algorithm that runs in polynomial time
− especially if it gives a provably good result (within some factor of optimal)
− use a heuristic
• e.g. genetic algorithms, simulated annealing, neural networks
For a decision problem
− settle for a probabilistic algorithm correct answer with high probability

Algorithmics I, 2022 57
4 NP Completeness 241
Algorithmics I 2022

Algorithmics I

Section 5 - Computability

Dr. Gethin Norman

School of Computing Science


University of Glasgow

[email protected]

5 Computability 242
Introduction to Computability
What is a computer?

input x black box output f(x)

What can the black box do?


− it computes a function that maps an input to an output

Computability concerns which functions can be computed


− a formal way of answering ‘what problems can be solved by a computer?’
− or alternatively ‘what problems cannot be solved by a computer?’

To answer such questions we require a formal definition


− i.e. a definition of what a computer is
− or what an algorithm is if we view a computer as a device that can
execute an algorithm

Algorithmics I, 2022 2
5 Computability 243
Unsolvable problems
Some problems cannot be solved by a computer
− even with unbounded time
Example: The Tiling Problem
− a tile is a 1×1 square, divided into 4 triangles by its diagonals with each
triangle is given a colour
− each tile has a fixed orientation (no rotations allowed)
− example tiles:

Instance: a finite set S of tile descriptions


Question: can any finite area, of any size, be completely covered
using only tiles of types in S, so that adjacent tiles colour match?

Algorithmics I, 2022 3
5 Computability 244
Tiling problem - Tiling a 5×5 square
Available tiles:

We can use these tiles to tile a 5×5 square as follows:

Algorithmics I, 2022 4
5 Computability 245
Tiling problem - Extending to a larger region
Overlap the top two rows with
the bottom two rows
− obtain an 8×5 tiled area
Next place two of
these 8×5 rectangles
side by side
− with the right hand
rectangle one row
above the left hand
rectangle
By repeating this pattern it
follows that any finite area
can be tiled
Algorithmics I, 2022 5
5 Computability 246
Tiling problem - Altering the tiles
Original tiles:

New tiles:

Now impossible to tile a 3×3 square

There are 39=19,683 possibilities if you want to try them all out…

Algorithmics I, 2022 6
5 Computability 247
Tiling problem
Tiling problem: given a set of tile descriptions, can any finite area, of
any size, be completely ‘tiled’ using only tiles from this set?

There is no algorithm for the tiling problem


− for any algorithm A that we might try to formulate there is a set of tiles S
for which either A does not terminate or A gives the wrong answer

The problem is that:


− “any size” means we have to check all finite areas and there are infinitely
many of these
− and for certain sets of tile descriptions that can tile any area, there is no
“repeated pattern” we can use
− so to be correct the algorithm would really have to check all finite areas

Algorithmics I, 2022 7
5 Computability 248
Undecidable problems
A problem Π that admits no algorithm is called non-computable or
unsolvable

If Π is a decision problem and Π admits no algorithm it is called


undecidable

The Tiling Problem is undecidable

Algorithmics I, 2022 8
5 Computability 249
Post’s correspondence problem (PCP)
A word is a finite string over some given finite alphabet

Instance: two finite sequences of words X1,…,Xn and Y1,…,Yn


− the words are all over the same alphabet
Question: does there exist a sequence i1,i2,…,ir of integers chosen
from {1,…,n} such that Xi1Xi2…Xir = Yi1Yi2…Yir ?
− i.e. concatenating the Xij's and the Yij's gives the same result

Example: n=5
− X1=abb, X2=a, X3=bab, X4=baba, X5=aba
− Y1=bbab, Y2=aa, Y3=ab, Y4=aa, Y5=a
− correspondence is given by the sequence 2, 1, 1, 4, 1, 5
• word constructed from Xi’s: aabbabbbabaabbaba
• word constructed from Yi’s: aabbabbbabaabbaba

Algorithmics I, 2022 9
5 Computability 250
Post’s correspondence problem (PCP)
A word is a finite string over some given finite alphabet

Instance: two finite sequences of words X1,…,Xn and Y1,…,Yn


− the words are all over the same alphabet
Question: does there exist a sequence i1,i2,…,ir of integers chosen
from {1,…,n} such that Xi1Xi2…Xir = Yi1Yi2…Yir ?
− i.e. concatenating the Xij's and the Yij's gives the same result

Example: n=5 (with first letter from X1 and Y1 removed)


− X1=bb, X2=a, X3=bab, X4=bab, X5=aba
− Y1=bab, Y2=aa, Y3=ab, Y4=aa, Y5=a
− to get a match we must start with either 2 or 5
− follows that we can now never get a correspondence
Post’s Correspondence Problem is undecidable

Algorithmics I, 2022 10
5 Computability 251
The halting problem
An impossible project: write a program Q that takes as input
− a legal program X (say in Java)
− an input string S for program X
and returns as output
− yes if program X halts when run with input S
− no if program X enters an infinite loop when run with input S

We will prove that no such program Q can exists, meaning the


halting problem is undecidable

Algorithmics I, 2022 11
5 Computability 252
The halting problem
Example (small) programs

public void test(int n){


if (n == 1)
while (true)
null;
}

The program ‘test’ will terminates if and only if input n≠1

Algorithmics I, 2022 12
5 Computability 253
The halting problem
Example (small) programs

public int erratic(int n){


while (n != 1)
if (n % 2 == 0) n = n/2;
else n = 3*n + 1;
}

For example if ‘erratic’ is called with n=7 sequence of values:

22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1

Nobody knows whether ‘erratic’ terminates for all values of n

Algorithmics I, 2022 13
5 Computability 254
The halting problem - Undecidability
A formal definition of the halting problem (HP)
Instance: a legal Java program X and an input string S for X
− can substitute any language for Java
Question: does X halt when run on S?

Theorem: HP is undecidable proof (by contradiction):


− suppose we have an algorithm A that decides (solves) HP
− let Q be an implementation of this algorithm as a Java method
with X and S as parameters

yes output:"yes"
program X Q
does X
input string S halt on S?
no output:"no"

Algorithmics I, 2022 14
5 Computability 255
The halting problem - Undecidability
Define a new program P with input a legal program W in Java

program P(W)
yes while (true) null;
input program W Q
does W
program W input string W halt on W?
no exit

− P makes a copy of W and calls Q(W,W)


− Q terminates by assumption, returning either "yes" or "no"
− if Q returns "yes", then P enters an infinite loop
− if Q returns ”no", then P terminates

Algorithmics I, 2022 15
5 Computability 256
The halting problem - Undecidability
Define a new program P with input a legal program W in Java

program P(W)
yes while (true) null;
input program W Q
does W
program W input string W halt on W?
no exit

Now let the input W be the program P itself

program P(P)
yes while (true) null;
program P Q
input
does P
program P input string P halt on P?
no exit

Algorithmics I, 2022 16
5 Computability 257
The halting problem - Undecidability
Now let the input W to P be the program P itself
program P(P)
yes while (true) null;
input program P Q
does P
program P input string P halt on P?
no exit

P calls Q(P,P)
− Q terminates by assumption, returning either "yes" or "no"
− recall we have assumed Q solves the halting problem
− suppose Q returns "yes", then by definition of Q this means P terminates
− but this also means P does not terminate (it enters the infinite loop)
− this is a contradiction therefore Q must return "no"

Algorithmics I, 2022 17
5 Computability 258
The halting problem - Undecidability
Now let the input W to P be the program P itself
program P(P)
yes while (true) null;
input program P Q
does P
program P input string P halt on P?
no exit

P calls Q(P,P)
− Q terminates by assumption, returning either "yes" or "no"
− recall we have assumed Q solves the halting problem
− therefore Q must return "no"
− this means by definition of Q that P does not terminate
− but this also means P does terminate
− so again a contradiction

Algorithmics I, 2022 18
5 Computability 259
The halting problem - Undecidability
Now let the input W to P be the program P itself
program P(P)
yes while (true) null;
input program P Q
does P
program P input string P halt on P?
no exit

P calls Q(P,P)
− Q terminates by assumption, returning either "yes" or "no"
− recall we have assumed Q solves the halting problem
− therefore Q can return neither "yes" nor "no"
− meaning no such program Q can exist
− if no such Q can exist, then no algorithm can solve the halting problem
− hence the problem is undeciable

Algorithmics I, 2022 19
5 Computability 260
The halting problem - Undecidability
To summarise the proof
− we assumed the existence of an algorithm A that solved HP
− implemented this algorithm as the program Q
− then constructed a program P which contains Q as a subroutine
− showing that if Q gives the answer "yes", we reach a contradiction
− so Q must give the answer "no", but this also leads to a contradiction
− the contradiction stems from assuming that Q, and hence A exists
− therefore no algorithm A exists and HP is undecidable

Notice we are not concerned with the complexity of A just the


existence of A

Algorithmics I, 2022 20
5 Computability 261
Proving undecidability by reduction
Suppose we can reduce any instance I of Π1 into an instance J of Π2
such that
− I has a ‘yes’-answer for Π1 if and only if J has a "yes"-answer for Π2
(like PTRs but no need for J to be constructed in polynomial time)

If Π1 is undecidable and we can perform such a reduction,


then Π2 is undecidable
− suppose for a contradiction Π2 is decidable
− then using this reduction we can decide Π1
− however Π1 is undecidable, therefore Π2 cannot be decidable

Algorithmics I, 2022 21
5 Computability 262
Hierarchy of decision problems

Undecidable
e.g. Tiling Problem,
Halting Problem

Intractable
e.g. Roadblock

NP-complete
e.g. SAT, HC, TSDP
Exactly one of
these lines is real
Polynomial-time solvable
e.g. String distance, (depends on whether
Eulerian cycle P equals NP)

Algorithmics I, 2022 22
5 Computability 263
Models of computation

input x black box output f(x)

Attempts to define "the black box”


− we will look at three classical models of computation of increasing power
• Finite-State Automata
− simple machines with a fixed amount of memory
− have very limited (but still useful) problem-solving ability
• Pushdown Automata (PDA)
− simple machines with an unlimited memory that behaves like a stack
• Turing machines (TM)
− simple machines with an unlimited memory that can be used essentially
arbitrarily
− these have essentially the same power as a typical computer

Algorithmics I, 2022 23
5 Computability 264
Deterministic finite-state automata
Simple machines with limited memory which recognise input on
a read-only tape

A DFA consists of
− a finite input alphabet Σ
− a finite set of states Q
− a initial/start state q0 ∈ Q and set of accepting states F ⊆ Q
− control/program or transition relation T ⊆ (Q × Σ) × Q
• ((q,a),q’) ∈ T means if in state q and read a, then move to state q’

− deterministic means that if


((q,a1),q1), ((q,a2),q2) ∈ T either a1≠a2 or q1=q2
− i.e. for any state and action there is at most one move (i.e. no choice)

Algorithmics I, 2022 24
5 Computability 265
Deterministic finite-state automata
Simple machines with limited memory which recognise input on
a read-only tape

A DFA consists of add input tape (finite sequence of


elements/actions from the alphabet)
− a finite input alphabet Σ
− a finite set of states Q
− a initial/start state q0 ∈ Q and set of accepting states F ⊆ Q
− control/program or transition relation T ⊆ (Q × Σ) × Q
b control/program
((q0,a), q1)
a b a ((q0,b), q3)
q0 q1 q2 q3
((q1,a), q1)
a b a,b ((q1,b), q2)
((q2,a), q3)
((q2,b), q2)
((q3,a), q3)
Algorithmics I, 2022 25
5 Computability ((q3,b), q3) 266
Deterministic finite-state automata
A DFA define a language
− determines whether the string on the input tape belongs to that language
− in other words, it solves a decision problem

More precisely a DFA recognises or accepts a language


− the input strings which when ‘run’ end in an accepting state

Question: what language does this DFA recognise?

a a b b b
b

a b a
q0 q1 q2 q3
string is accepted
a b a,b

Algorithmics I, 2022 26
5 Computability 267
Deterministic finite-state automata
A DFA define a language
− determines whether the string on the input tape belongs to that language
− in other words, it solves a decision problem

More precisely a DFA recognises or accepts a language


− the input strings which when ‘run’ end in an accepting state

Question: what language does this DFA recognise?

a a b b b a
b

a b a
q0 q1 q2 q3
string is not accepted
a b a,b

Algorithmics I, 2022 27
5 Computability 268
Deterministic finite-state automata
A DFA define a language
− determines whether the string on the input tape belongs to that language
− in other words, it solves a decision problem

More precisely a DFA recognises or accepts a language


− the input strings which when ‘run’ end in an accepting state

Question: what language does this DFA recognise?

answer: the language


b consisting of the set of
all strings comprising
a b a
q0 q1 q2 q3 one or more a's followed
a b a,b by one or more b's

Algorithmics I, 2022 28
5 Computability 269
Deterministic finite-state automata
Recognises the language of strings containing two
consecutive a’s
b

a a
q0 q1 q2
b a,b

Recognises the complement, i.e., the language of strings that do not


contain two consecutive a’s

a a
q0 q1 q2
b a,b

Algorithmics I, 2022 29
5 Computability 270
Another example
a
b b
q0 q1 q2
b

Recognises strings that start and end with b


However this is not a DFA, but a non-deterministic finite-state
automaton (NFA)
− in state q1 under b can move to q1 or q2

Recognition for NFA is similar to non-deterministic algorithms


“solving” a decision problem
− only require there exists a ‘run’ that ends in an accepting state
− i.e. under one possible resolution of the nondeterministic choices

Algorithmics I, 2022 30
5 Computability 271
Another example
a
b b
q0 q1 q2
b

Recognises strings that start and end with b


However this is not a DFA, but a non-deterministic finite-state
automaton (NFA)
− in state q1 under b can move to q1 or q2

But any NFA can be converted into a DFA

Therefore non-determinism does not expand the class of languages


that can be recognised by finite state automata
− being able to guess does not give us any extra power

Algorithmics I, 2022 31
5 Computability 272
NFA to DFA reduction
Can reduce a NFA to a DFA using the subset construction
− states of the DFA are sets of states of the NFA
− construction can cause a blow-up in the number of states
• in the worst case from N states to 2N states

Example (without blow-up)


− recognises strings that start and end with b

a
b b
− NFA q0 q1 q2
b

a
b b
− DFA {q0} {q1} {q1,q2} b
a

Algorithmics I, 2022 32
5 Computability 273
Regular languages and regular expressions
The languages that can be recognised by finite-state automata
are called the regular languages

A regular language (over an alphabet Σ) can be specified by


a regular expression over Σ
− ε (the empty string) is a regular expression
− σ is a regular expression (for any σ∈Σ)

if R and S are regular expressions, then so are


− RS which denotes concatenation
− R | S which denotes choice between R or S
− R* which denotes 0 or more copies of R (sometimes called closure)
− (R) which is needed to override precedence between operators

Algorithmics I, 2022 33
5 Computability 274
Regular expressions
Order of precedence (highest first)
− closure (*) then concatenation then choice (|)
− as mentioned brackets can be used to override this order

Example: suppose Σ = {a,b,c,d}


− R = (ac|a*b)d means ( ( ac ) | ( (a*) b ) ) d
− corresponding language LR is
{acd, bd, abd, aabd, aaabd, aaaabd, … }

Additional operations
− complement ¬x
• equivalent to the 'or' of all characters in Σ except x
− any single character ?
• equivalent to the 'or' of all characters

Algorithmics I, 2022 34
5 Computability 275
Regular expressions - Examples
The examples from earlier

1) the language comprising one or more a's followed by one or more b’s
− aa*bb*

2) the language of strings containing two consecutive a’s


− (a|b)*aa(a|b)*

3) the language of strings that do not contain two consecutive a’s (harder)
− b*(abb*)*(ε|a)

4) the language of strings that start and end with b


− b(a|b)*b

Algorithmics I, 2022 35
5 Computability 276
Regular expressions - Closure
To clarify what R* means
− corresponds to 0 or more copies of the regular expression R

Let L(R) be the language corresponding to the regular expression R


− then concatenation is given by L(RS)={ rs | r∈L(R) and s∈L(S) }
and L(R*)=L(R0)∪L(R1)∪L(R2)… where L(R0)={ε} and L(Ri+1)=L(RRi)
− note (a*b*)* is in fact equivalent to (a|b)*

L(R*) does not mean { r* | r∈L(R) }


− which for certain regular expressions cannot be recognized by any DFA
− essentially for such a language would need a memory to remember which
string in r∈L(R) is repeated and there might be an unbounded number

Algorithmics I, 2022 36
5 Computability 277
Regular expressions - Example
Consider the language (aa*bb*)*
− i.e. zero or more sequences which consist of a non-zero number of a’s
followed by a non-zero number of b’s
Corresponding DFA:
a b
a b
q0 q1 q2

b
a
q3
a,b a b a a b

Algorithmics I, 2022 37
5 Computability 278
Regular expressions - Example
A DFA cannot recognise { r* | r∈L(aa*bb*) }
− i.e. { (ambn)* | m>0 and n>0 }
− the problem is the DFA would need to remember the m and n to check
that a string is in the langauge
− but there are infinitely many values for m and n
− hence the DFA would need infinitely many states
− and we only have a finite number (DFA = deterministic finite automaton)

Similarly a DFA cannot recognise { anbn | n>0 }


− i.e. a number of a's followed by the same number of b’s

Languages that are recognised by DFAs are called regular languages


so, for example { anbn | n>0 } is not regular

Algorithmics I, 2022 38
5 Computability 279
Regular expressions - Example
How can we recognising strings of the form anbn?
− i.e. a number of a's followed by the same number of b's

It turns out that there is no DFA that can recognise this language
− it cannot be done without some form of memory, e.g. a stack

Idea: as you read a’s, push them onto a stack, then pop the stack as
you read b’s, i.e. the stack works like a counter
So there are some functions (languages) that we would regard as
computable that cannot be computed by a finite-state automaton
− DFAs are not an adequate model of a general-purpose computer
i.e. our 'black box’

Next: pushdown automata extend finite-state automata with a stack

Algorithmics I, 2022 39
5 Computability 280
Pushdown automata
A pushdown automaton (PDA) consists of:
− a finite input alphabet Σ, a finite set of stack symbols G
− a finite set of states Q including start state and set of accepting states
− control or transition relation T ⊆ (Q×ΣÈ{ε}×GÈ{ε})×(Q×GÈ{ε})

ε – empty string
current tape old stack new new stack
state symbol symbol state symbol
or ε or ε or ε

tape control stack


a b a b a
w top
head v

Algorithmics I, 2022 40
5 Computability 281
Pushdown automata
Transition relation T ⊆ (Q × ΣÈ{ε} × GÈ{ε}) × (Q × GÈ{ε})
tape stack
control
a b a b a
w top
head v

Informally, the transition (q1,a,w) ➝ (q2,v) means that


− if we are in state q1
− if a≠ε, then the symbol a is at the head of the tape
− if w≠ε, then the symbol w is is on top of the stack
− then move to state q2 and
− if a≠ε, then move head forward one position
− if w≠ε, then pop w from the stack
− if v≠ε, then push v onto the stack

Algorithmics I, 2022 41
5 Computability 282
Pushdown automata
A PDA accepts an input if and only if after the input has been read,
the stack is empty and control is in an accepting state

Example tuples from a PDA program when in state q1


− (q1,ε,ε)➝(q2,ε) move to q2

− (q1,a,ε)➝(q2,ε) if head of tape is a, move to q2 & move head forward

− (q1,a,ε)➝(q2,v) if head of tape is a, move to q2, move head forward


& push v onto stack

− (q1,a,w)➝(q2,ε) if head of tape is a & w is top stack, move to q2, move


head forward & pop w from stack

− (q1,a,w)➝(q2,v) if head of tape is a & w is top of stack, move to q2,


move head forward, pop w & push v onto stack

Algorithmics I, 2022 42
5 Computability 283
Pushdown automata
There is no explicit test that the stack is empty
− this can be achieved by adding a special symbol ($) to the stack at the
start of the computation
− i.e. we add the symbol to the stack when we know the stack is empty
and we never add $ at any other point during the computation
• unless we pop it from the stack as at this point we again know its empty
− then can check for emptiness by checking $ is on top of the stack

− when we want to finish in an accepting state we just need to make


sure we pop $ from the stack (we will see this in an example later)

Algorithmics I, 2022 43
5 Computability 284
Pushdown automata
Note PDA defined here are non-deterministic (NDPDA)
− deterministic PDAs (DPDAs) are less powerful
− this differs from DFAs where non-determinism does not add power
− i.e. there are languages that can be recognised by a NDPDA but
not by a DPDA, e.g. the language of palindromes
• palindromes: strings that read the same forwards and backwards

Algorithmics I, 2022 44
5 Computability 285
Pushdown automata - Palindromes
Palindrones are sequences of characters that read the same forwards
and backwards (second half is the reverse of the first half)

How to recognize palindrones with a pushdown automaton?


− push the first half of the sequence onto the stack
− then as we read each new character check it is the same as the top
element on the the stack and pop this element
− then enter an accepting state if all checks succeed

Why do we need non-determinism?


− we need to “guess” where the middle of the stack is
• and if there are even or odd number of characters
− cannot work this out first and then check the string as would need an
unbounded number of states as the string could be of any finite length

Algorithmics I, 2022 45
5 Computability 286
Pushdown automata - Example
Consider the following PDA program (alphabet is {a,b})
− q0 is the start state and q0 and q3 are the only accepting states
− (q0,ε,ε)➝(q1,$) move to q1 and push $ onto stack ($ - special symbol)
− (q1,a,ε)➝(q1,1) read a & push 1 onto stack
− (q1,b,1)➝(q2,ε) read b & 1 is top of stack, pop stack & move to q2
− (q2,b,1)➝(q2,ε) read b & 1 is top of stack, pop stack
− (q2,ε,$)➝(q3,ε) if $ is the top of the stack, pop stack & move to q3

tape
a a b b (empty)
stack
ε,ε➝$
head q0 q1 a,ε➝1

b,1➝ε

ε,$➝ε
q3 q2 b,1➝ε
Algorithmics I, 2022 46
5 Computability 287
Pushdown automata - Example
Consider the following PDA program (alphabet is {a,b})
− q0 is the start state and q0 and q3 are the only accepting states
− (q0,ε,ε)➝(q1,$) move to q1 and push $ onto stack ($ - special symbol)
− (q1,a,ε)➝(q1,1) read a & push 1 onto stack
− (q1,b,1)➝(q2,ε) read b & 1 is top of stack, pop stack & move to q2
− (q2,b,1)➝(q2,ε) read b & 1 is top of stack, pop stack
− (q2,ε,$)➝(q3,ε) if $ is the top of the stack, pop stack & move to q3

Example Inputs
− if you try to recognise aabb, all of the input is read, as we have just seen
end up in an accepting state, and the stack is empty
− if you try to recognise aaabb, all the input is read, you end up in state q2
and the stack in not empty
− if you try to recognise aabbb, you are left with b on the tape, which
cannot be read because of an empty stack

Algorithmics I, 2022 47
5 Computability 288
Pushdown automata - Example
Consider the following PDA program (alphabet is {a,b})
− q0 is the start state and q0 and q3 are the only accepting states
− (q0,ε,ε)➝(q1,$) move to q1 and push $ onto stack ($ - special symbol)
− (q1,a,ε)➝(q1,1) read a & push 1 onto stack
− (q1,b,1)➝(q2,ε) read b & 1 is top of stack, pop stack & move to q2
− (q2,b,1)➝(q2,ε) read b & 1 is top of stack, pop stack
− (q2,ε,$)➝(q3,ε) if $ is the top of the stack, pop stack & move to q3

Automaton recognises the language: { an bn | n≥0 }

Algorithmics I, 2022 48
5 Computability 289
Pushdown automata
Pushdown automata are more powerful than finite-state automata
− a PDA can recognise some languages that cannot be recognised by a DFA
− e.g. {anbn | n≥0} is recognised by the PDA example

The languages that can be recognised by a PDA are the context-free


languages

Are all languages regular or context-free?


i.e. is a PDA an adequate model of a general purpose computer (our 'black box')?

No, for example, consider the language {anbncn | n≥0}


− this cannot be recognised by a PDA
− but it is easy to write a program (say in Java) to recognise it

Algorithmics I, 2022 49
5 Computability 290
Turing machines
A Turing Machine T to recognise a particular language consists of

• a finite alphabet Σ, including a blank symbol (denoted by #)


• an unbounded tape of squares
− each can hold a single symbol of Σ
− tape unbounded in both directions
• a tape head that scans a single square
− it can read from it and write to the square
− then moves one square left or right along the tape
• a set S of states
− includes a single start state s0 and two halt (or terminal) states sY and sN
• a transition function
− essentially the inbuilt program

Algorithmics I, 2022 50
5 Computability 291
Turing machines - Computation
The transition function is of the form
f : ((S⁄{sY,sN}) × Σ) ➝ (S × Σ × {Left, Right})

For each non-terminal state and symbol the function f specifies


− a new state (perhaps unchanged)
− a new symbol (perhaps unchanged)
− a direction to move along the tape

f(s,σ)=(s¢,σ¢,d) means reading symbol σ from the tape in state s


− move to state s¢∈S
− overwrite the symbol σ on the tape with the symbol σ¢∈Σ
• if you do not want to overwrite the symbol write the symbol you read
− move the tape head one square in direction d∈{Left, Right}

Algorithmics I, 2022 52
5 Computability 292
Turing machines - Computation
The (finite) input string is placed on the tape
− assume initially all other squares of the tape contain blanks

The tape head is placed on the first symbol of the input

T starts in state s0 (scanning the first symbol)


− if T halts in state sY, the answer is ‘yes’ (accepts the input)
− if T halts in state sN, the answer is ‘no’ (rejects the input)

Algorithmics I, 2022 53
5 Computability 293
The palindrome problem
Instance: a finite string Y
Question: is Y a palindrome, i.e. is Y equal to the reverse of itself
− simple Java method to solve the above:

public boolean isPalindrome(String s){


int n = s.length();
if (n < 2) return true;
else
if (s.charAt(0) != s.charAt(n-1)) return false;
else return isPalindrome(s.substring(1,n-2));
}

We will design a Turing Machine that solves this problem


− in fact, as stated previously, a NDPDA can recognise palindromes

For simplicity, we assume that the string is composed of a's and b's

Algorithmics I, 2022 54
5 Computability 294
The palindrome problem – Turing machine
Formally defining a Turing Machine for even simple problems is hard
− much easier to design a pseudocode version

Recall: for pushdown automata we needed nondeterminism to solve


the palindrome problem
− needed to guess where the middle of the palindrome was

However as we will show using Turing machines we do not need


nondeterminism

Algorithmics I, 2022 55
5 Computability 295
The palindrome problem – Turing machine
Formally defining a Turing Machine for even simple problems is hard
− much easier to design a pseudocode version
TM Algorithm for the Palindrome problem
read the symbol in the current square;
erase this symbol;
enter a state that 'remembers' it;
move tape head to the end of the input;
if (only blank characters remain)
enter the accepting state and halt;
else if (last character matches the one erased)
erase it too;
else
enter rejecting state and halt;
if (no input left)
enter accepting state and halt;
else
move to start of remaining input;
repeat from first step;

Algorithmics I, 2022 56
5 Computability 296
The palindrome problem – Turing machine
We need the following states (assuming alphabet is Σ={#,a,b}):

− s0 reading and erasing the leftmost symbol

− s1, s2 moving right to look for the end, remembering the symbol erased
• i.e. s1 when read (and erased) a and s2 when read (and erased) b

− s3, s4 testing for the appropriate rightmost symbol


• i.e. s3 testing against a and s4 testing against b

− s5 moving back to the leftmost symbol

Algorithmics I, 2022 57
5 Computability 297
The palindrome problem – Turing machine
Transitions:
− from s0, we enter sY if a blank is read, or move to s1 or s2 depending on
whether an a or b is read, erasing it in either case
− we stay in s1/s2 moving right until a blank is read, at which point we
enter s3/s4 and move left
− from s3/s4 we enter sY if a blank is read, sN if the 'wrong' symbol is read,
otherwise erase it, enter s5, and move left
− in s5 we move left until a blank is read, then move right and enter s0
States:
− s0 reading, erasing and remembering the leftmost symbol
− s1, s2 moving right to look for the end, remembering the symbol erased
− s3, s4 testing for the appropriate rightmost symbol
− s5 moving back to the leftmost symbol

Algorithmics I, 2022 58
5 Computability 298
The palindrome problem – Turing machine
A Turing machine can be described by its state transition diagram
which is a directed graph where
− each state is represented by a vertex
− f(s,σ) = (s¢,σ¢,d) is represented by an edge from vertex s to vertex s¢,
labelled (σ➝σ¢,d)
• edge from s to s’ represents moving to state s’
• σ➝σ¢ represents overwriting the symbol σ on the tape with the symbol σ¢
• d represents moving the tape head one square in direction d

TM for the Palindrome problem (see next slide)


− alphabet is Σ = {#,a,b}
− states are S = {s0,s1,s2,s3,s4,s5,sY,sN}

Algorithmics I, 2022 59
5 Computability 299
The palindrome problem – Turing machine

(a®a,R)
(b®b,R)
(#®#,L)
(a®#,R) s1 s3 (a®#,L)
(a®a,L)
(#®#,L) (b®b,L) (b®b,L)

s0 sY sN s5
(#®#,R)
(#®#,L) (a®a,L)
(#®#,L)
(b®#,R) s2 s4 (b®#,L)
(a®a,R)
(b®b,R)

Algorithmics I, 2022 (#®#,R) 60


5 Computability 300
Turing machines - Functions
The Turing machine that accepts language L actually computes the
function f where f(x) equals 1 if x∈L and 0 otherwise

The definition of a TM can be amended as follows:


− to have a set H of halt states
− the function it computes is defined by f(x)=x¢ where
• x is the initial string on the tape
• x¢ is the string on the tape when the machine halts

For example, the palindrome TM could be redefined such that it


deletes the tape contents and
− instead of entering sY it writes 1 on the tape and enters a halt state
− instead of entering sN it writes 0 on the tape and enters a halt state

Algorithmics I, 2022 61
5 Computability 301
Turing machines – Functions - Example
Design a Turing machine to compute the function f(k) = k+1
− where the input is in binary
Example 1
− input: 1 0 0 0 1 0
pattern: replace right-most 0 with 1
− output: 1 0 0 0 1 1
then moving right:
Example 2 if 1 replace with 0 and continue right
− input: 1 0 0 1 1 1 if blank halt
− output: 1 0 1 0 0 0
Example 3 (special case)
− input 1 1 1 1 1 special case: no right-most 0, i.e. only 1’s
in the input pattern:
− output: 1 0 0 0 0 0
replace first blank before input with 1
then moving right:
if 1 replace with 0 and continue right
if blank halt
Algorithmics I, 2022 62
5 Computability 302
Turing machines – Functions - Example
Design a Turing machine to compute the function f(k) = k+1
− where the input is in binary
TM Algorithm for the function f(k) = k+1

move right seeking first blank square;


move left looking for first 0 or blank;
when 0 or blank found
change it to 1;
move right changing each 1 to 0;
halt when blank square reached;

Now to translate this pseudocode into a TM description


− identify the states and specify the transition function

Algorithmics I, 2022 63
5 Computability 303
Turing machines – Functions - Example
We need the following states
− s0: (start state) moving right seeking start of the input (first blank)
− s1: moving left to right-most 0 or blank
− s2: find first 0 or blank, changed it to 1 and moving right changing 1s to 0s
− s3: the halt state

and the following transitions


− from s0 we enter s1 at the first blank
− from s1 we enter s2 if a 0 (found right-most 0) or blank is read
− from s2 we enter s3 (halt) at the first blank

Algorithmics I, 2022 64
5 Computability 304
Transition state diagram

(1➝1,R) (1➝1,L) (1➝0,R)

(0➝1,R) (#➝#,L)
s0 s1 s2 s3
(#➝#,L) (#➝1,R)

(0➝0,R)

Exercise: execute this TM for inputs:


− 1 0 0 1 1 1
− 1 0 0 0 1 0
− 1 1 1 1 1

Algorithmics I, 2022 65
5 Computability 305
Turing recognizable and decidable

A language L is Turing-recognizable if some Turing Machine


recognizes it, that is given an input string x:
− if x∈L, then the TM halts in state sY
− if xÏL, then the TM halts in state sN or fails to halt

A language L is Turing-decidable if some Turing Machine decides it,


that is given an input string x:
− if x∈L, then the TM halts in state sY
− if xÏL, then the TM halts in state sN

Every decidable language is recognizable, but not every recognizable


language is decidable
− e.g., the language corresponding to the Halting Problem
(if a program terminates we will enter sY, but not sN if it does not)
Algorithmics I, 2022 66
5 Computability 306
Turing computable

A function f: Σ*➝Σ* is Turing-computable if there is a Turing


machine M such that
− for any input x, the machine M halts with output f(x)

Algorithmics I, 2022 67
5 Computability 307
Enhanced Turing machines
A Turing machines may be enhanced in various ways:
− two or more tapes, rather than just one, may be available
− a 2-dimensional 'tape' may be available
− the TM may operate non-deterministically
• i.e. the transition 'function’ may be a relation rather than a function
− and many more …

None of these enhancements change the computing power


− every language/function that is recognizable/decidable/computable with
an enhanced TM is recognizable/decidable/computable with a basic TM
• so nondeterminism adds power to pushdown automata but neither to
finite-state automata or Turing machines…
− proved by showing that a basic TM can simulate any of these enhanced
Turing machines

Algorithmics I, 2022 68
5 Computability 308
Turing machines – P and NP
The class P is often introduced as the class of decision problems
solvable by a Turing machine in polynomial time

and the class NP is introduced as the class of decision problems


solvable by a non-deterministic Turing machine in polynomial time
− in a non-deterministic TM the transition function is replaced by a relation
f ⊆ ( (S × Σ) × (S × Σ × {Left, Right}) )
i.e. can make a number of different transitions based on the current state
and the symbol at the tape head
− nondeterminism does to change what can be computed, but can speed up
the computation
Hence to show P ≠ NP sufficient to show a (standard) Turing machine
cannot solve an NP-complete problem in polynomial time

Algorithmics I, 2022 69
5 Computability 309
Counter programs
A completely different model of computation
− all general purpose programming languages have essentially the
same computational power
− a program written in one language could be translated (or compiled) into
a functionally equivalent program in any other

So how simple can a programming language be and still have this


same computational power?

Algorithmics I, 2022 70
5 Computability 310
Counter programs
Counter programs have

• variables of type int

• labelled statements are of the form:


− L : unlabelled_statement

• unlabelled statements are of the form:


− x = 0; (set a variable to zero)
− x = y+1; (set a variable to be the value of another variable plus 1)
− x = y-1; (set a variable to be the value of another variable minus 1)
− if x==0 goto L; (conditional goto where L is a label of a statement)
− halt; (finished)

Algorithmics I, 2022 71
5 Computability 311
Counter programs - Example
A counter program to evaluate the product x·y
(A, B and C are labels)
// initialise some variables
u = 0;
z = 0; // this will be the product of x and y when we finish

A: if x==0 goto C; // end of outer for loop


x = x-1; // perform this loop x times
v = y+1; // each time around the loop we set v to equal y
v = v-1; // in a slightly contrived way

B: if v==0 goto A; // end of inner for loop (return to outer loop)


v = v-1; // perform this loop v times (i.e. y times)
z = z+1; // each time incrementing z
// so really added y to z by the end of the inner loop
if u==0 goto B; // really just goto B (return to start of inner loop)

C: halt;

Algorithmics I, 2022 72
5 Computability 312
The Church-Turing Thesis
So is the Turing machine an appropriate model for the ‘black box’?

The answer is ‘yes’ this is known as the Church-Turing thesis


− it is based on the fact that a whole range of different computational
models turn out to be equivalent in terms of what they can compute
− so it is reasonable to infer that any one of these models encapsulates
what is effectively computable

Put simply it states that everything “effectively computable” is


computable by a Turing machine
− a thesis not a theorem as uses the informal term “effectively computable”
− means there is an effective procedure for computing the value
of the function including all computers/programming languages that we
know about at present and even those that we do not

Algorithmics I, 2022 73
5 Computability 313
The Church-Turing Thesis
So is the Turing machine an appropriate model for the ‘black box’?

The answer is ‘yes’ this is known as the Church-Turing thesis


− it is based on the fact that a whole range of different computational
models turn out to be equivalent in terms of what they can compute
− so it is reasonable to infer that any one of these models encapsulates
what is effectively computable

Equivalent computational models (each can 'simulate' all others)


− Lambda calculus (Church)
− Turing machines (Turing)
− Recursive functions (Kleene)
− Production systems (Post)
− Counter programs and all general purpose programming languages

Algorithmics I, 2022 74
5 Computability 314

You might also like