Algorithms Made Easy_ A beginners Handbook to easily learn all algorithms and all types of data structures
About
Chapter 1: Getting started with algorithms
Section 1.1: A sample algorithmic problem
Section 1.2: Getting Started with Simple Fizz Buzz Algorithm in Swift
Chapter 2: Algorithm Complexity
Section 2.1: Big-Theta notation
Section 2.2: Comparison of the asymptotic notations
Section 2.3: Big-Omega Notation
Chapter 3: Big-O Notation
Section 3.1: A Simple Loop
Section 3.2: A Nested Loop
Section 3.3: O(log n) types of Algorithms
Section 3.4: An O(log n) example
Chapter 4: Trees
Section 4.1: Typical n-ary tree representation
Section 4.2: Introduction
Section 4.3: To check if two Binary trees are same or not
Chapter 5: Binary Search Trees
Section 5.1: Binary Search Tree - Insertion (Python)
Section 5.2: Binary Search Tree - Deletion(C++)
Section 5.3: Lowest common ancestor in a BST
Section 5.4: Binary Search Tree - Python
Chapter 6: Check if a tree is BST or not
Section 6.1: Algorithm to check if a given binary tree is BST
Section 6.2: If a given input tree follows Binary search tree property or not
Chapter 7: Binary Tree traversals
Section 7.1: Level Order traversal - Implementation
Section 7.2: Pre-order, Inorder and Post Order traversal of a Binary Tree
Chapter 8: Lowest common ancestor of a Binary Tree
Section 8.1: Finding lowest common ancestor
Chapter 9: Graph
Section 9.1: Storing Graphs (Adjacency Matrix)
Section 9.2: Introduction To Graph Theory
Section 9.3: Storing Graphs (Adjacency List)
Section 9.4: Topological Sort
Section 9.5: Detecting a cycle in a directed graph using Depth First Traversal
Section 9.6: Thorup's algorithm
Chapter 10: Graph Traversals
Section 10.1: Depth First Search traversal function
Chapter 11: Dijkstra’s Algorithm
Section 11.1: Dijkstra's Shortest Path Algorithm
Chapter 12: A* Pathfinding
Section 12.1: Introduction to A*
Section 12.2: A* Pathfinding through a maze with no obstacles
Section 12.3: Solving 8-puzzle problem using A* algorithm
Chapter 13: A* Pathfinding Algorithm
Section 13.1: Simple Example of A* Pathfinding: A maze with no obstacles
Chapter 14: Dynamic Programming
Section 14.1: Edit Distance
Section 14.2: Weighted Job Scheduling Algorithm
Section 14.3: Longest Common Subsequence
Section 14.4: Fibonacci Number
Section 14.5: Longest Common Substring
Chapter 15: Applications of Dynamic Programming
Section 15.1: Fibonacci Numbers
Chapter 16: Kruskal's Algorithm
Section 16.1: Optimal, disjoint-set based implementation
Section 16.2: Simple, more detailed implementation
Section 16.3: Simple, disjoint-set based implementation
Section 16.4: Simple, high level implementation
Chapter 17: Greedy Algorithms
Section 17.1: Huffman Coding
Section 17.2: Activity Selection Problem
Section 17.3: Change-making problem
Chapter 18: Applications of Greedy technique
Section 18.1: Offline Caching
Section 18.2: Ticket automat
Section 18.3: Interval Scheduling
Section 18.4: Minimizing Lateness
Chapter 19: Prim's Algorithm
Section 19.1: Introduction To Prim's Algorithm
Chapter 20: Bellman–Ford Algorithm
Section 20.1: Single Source Shortest Path Algorithm (Given there is a negative cycle in a graph)
Section 20.2: Detecting Negative Cycle in a Graph
Section 20.3: Why do we need to relax all the edges at most (V-1) times
Chapter 21: Line Algorithm
Section 21.1: Bresenham Line Drawing Algorithm
Chapter 22: Floyd-Warshall Algorithm
Section 22.1: All Pair Shortest Path Algorithm
Chapter 23: Catalan Number Algorithm
Section 23.1: Catalan Number Algorithm Basic Information
Chapter 24: Multithreaded Algorithms
Section 24.1: Square matrix multiplication multithread
Section 24.2: Multiplication matrix vector multithread
Section 24.3: merge-sort multithread
Chapter 25: Knuth Morris Pratt (KMP) Algorithm
Section 25.1: KMP-Example
Chapter 26: Edit Distance Dynamic Algorithm
Section 26.1: Minimum Edits required to convert string 1 to string 2
Chapter 27: Online algorithms
Section 27.1: Paging (Online Caching)
Chapter 28: Sorting
Section 28.1: Stability in Sorting
Chapter 29: Bubble Sort
Section 29.1: Bubble Sort
Section 29.2: Implementation in C & C++
Section 29.3: Implementation in C#
Section 29.4: Python Implementation
Section 29.5: Implementation in Java
Section 29.6: Implementation in Javascript
Chapter 30: Merge Sort
Section 30.1: Merge Sort Basics
Section 30.2: Merge Sort Implementation in Go
Section 30.3: Merge Sort Implementation in C & C#
Section 30.4: Merge Sort Implementation in Java
Section 30.5: Merge Sort Implementation in Python
Section 30.6: Bottoms-up Java Implementation
Chapter 31: Insertion Sort
Section 31.1: Haskell Implementation
Chapter 32: Bucket Sort
Section 32.1: C# Implementation
Chapter 33: Quicksort
Section 33.1: Quicksort Basics
Section 33.2: Quicksort in Python
Section 33.3: Lomuto partition java implementation
Chapter 34: Counting Sort
Section 34.1: Counting Sort Basic Information
Section 34.2: Pseudocode Implementation
Chapter 35: Heap Sort
Section 35.1: C# Implementation
Section 35.2: Heap Sort Basic Information
Chapter 36: Cycle Sort
Section 36.1: Pseudocode Implementation
Chapter 37: Odd-Even Sort
Section 37.1: Odd-Even Sort Basic Information
Chapter 38: Selection Sort
Section 38.1: Elixir Implementation
Section 38.2: Selection Sort Basic Information
Section 38.3: Implementation of Selection sort in C#
Chapter 39: Searching
Section 39.1: Binary Search
Section 39.2: Rabin Karp
Section 39.3: Analysis of Linear search (Worst, Average and Best Cases)
Section 39.4: Binary Search: On Sorted Numbers
Section 39.5: Linear search
Chapter 40: Substring Search
Section 40.1: Introduction To Knuth-Morris-Pratt (KMP) Algorithm
Section 40.2: Introduction to Rabin-Karp Algorithm
Section 40.3: Python Implementation of KMP algorithm
Section 40.4: KMP Algorithm in C
Chapter 41: Breadth-First Search
Section 41.1: Finding the Shortest Path from Source to other Nodes
Section 41.2: Finding Shortest Path from Source in a 2D graph
Section 41.3: Connected Components Of Undirected Graph Using BFS
Chapter 42: Depth First Search
Section 42.1: Introduction To Depth-First Search
Chapter 43: Hash Functions
Section 43.1: Hash codes for common types in C#
Section 43.2: Introduction to hash functions
Chapter 44: Travelling Salesman
Section 44.1: Brute Force Algorithm
Section 44.2: Dynamic Programming Algorithm
Chapter 45: Knapsack Problem
Section 45.1: Knapsack Problem Basics
Section 45.2: Solution Implemented in C#
Chapter 46: Equation Solving
Section 46.1: Linear Equation
Section 46.2: Non-Linear Equation
Chapter 47: Longest Common Subsequence
Section 47.1: Longest Common Subsequence Explanation
Chapter 48: Longest Increasing Subsequence
Section 48.1: Longest Increasing Subsequence Basic Information
Chapter 49: Check two strings are anagrams
Section 49.1: Sample input and output
Section 49.2: Generic Code for Anagrams
Chapter 50: Pascal's Triangle
Section 50.1: Pascal triangle in C
Chapter 51: Algo:- Print a m*n matrix in square wise
Section 51.1: Sample Example
Section 51.2: Write the generic code
Chapter 52: Matrix Exponentiation
Section 52.1: Matrix Exponentiation to Solve Example Problems
Chapter 53: polynomial-time bounded algorithm for Minimum Vertex Cover
Section 53.1: Algorithm Pseudo Code
Chapter 54: Dynamic Time Warping
Section 54.1: Introduction To Dynamic Time Warping
Chapter 55: Fast Fourier Transform
Section 55.1: Radix 2 FFT
Section 55.2: Radix 2 Inverse FFT
Appendix A: Pseudocode
Section A.1: Variable affectations
Section A.2: Functions
Chapter 1: Getting started with
algorithms
Section 1.1: A sample algorithmic
problem
An algorithmic problem is specified by describing the complete set of
instances it must work on and of its output after running on one of these
instances. This distinction, between a problem and an instance of a problem,
is fundamental. The algorithmic problem known as sorting is defined as
follows: [Skiena:2008:ADM:1410219]
Problem: Sorting
Input: A sequence of n keys, a_1, a_2, ..., a_n.
Output: The reordering of the input sequence such that a'_1 <= a'_2 <= ... <= a'_{n-1} <= a'_n

An instance of sorting might be an array of strings, such as {Haskell, Emacs}, or a sequence of numbers such as {154, 245, 1337}.
Section 1.2: Getting Started with
Simple Fizz Buzz Algorithm in Swift
For those of you that are new to programming in Swift and those of you
coming from different programming bases, such as Python or Java, this
article should be quite helpful. In this post, we will discuss a simple solution
for implementing swift algorithms.
Fizz Buzz
You may have seen Fizz Buzz written as Fizz Buzz, FizzBuzz, or Fizz-Buzz;
they're all referring to the same thing. That "thing" is the main topic of
discussion today. First, what is FizzBuzz?
This is a common question that comes up in job interviews.
Imagine a series of numbers from 1 to 10.

Fizz and Buzz refer to any number that's a multiple of 3 and 5 respectively. In other words, if a number is divisible by 3, it is substituted with fizz; if a number is divisible by 5, it is substituted with buzz. If a number is simultaneously a multiple of 3 AND 5, the number is replaced with "fizz buzz". In essence, it emulates the famous children's game "fizz buzz".
To work on this problem, open up Xcode to create a new playground and
initialize an array like below:
To find all the fizz and buzz, we must iterate through the array and check
which numbers are fizz and which are buzz. To do this, create a for loop to
iterate through the array we have initialised:
After this, we can simply use the "if else" condition and the modulo operator in Swift, i.e. %, to locate the fizz and buzz.
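As a rough sketch of this first (fizz) step, written here in Python rather than Swift (the Swift playground version follows the same shape, and the array name numbers is just an illustration):

numbers = list(range(1, 11))        # the series of numbers from 1 to 10

for number in numbers:
    if number % 3 == 0:
        print("fizz")               # multiples of 3 become "fizz"
    else:
        print(number)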
Great! You can go to the debug console in Xcode playground to see the
output. You will find that the "fizzes" have been sorted out in your array.
For the Buzz part, we will use the same technique. Let's give it a try before
scrolling through the article — you can check your results against this
article once you've finished doing this.
As simple as that, you can use any language of your choice and get started.
Enjoy coding!
Chapter 2: Algorithm Complexity
Section 2.1: Big-Theta notation
Unlike Big-O notation, which represents only the upper bound of the running time for some algorithm, Big-Theta is a tight bound; both an upper and a lower bound. A tight bound is more precise, but also more difficult to compute.

If the algorithm for the input n takes n^2 + n + 4 operations to finish, we say that it is O(n^2), but it is also O(n^3) and O(n^4). However, it is Ө(n^2), and it is not Ө(n^3), Ө(n^4) etc. An algorithm that is Ө(f(n)) is also O(f(n)), but not vice versa!
Formal mathematical definition: Ө(g(x)) is a set of functions.

Ө(g(x)) = { f(x) : there exist positive constants c1, c2, N such that 0 <= c1*g(x) <= f(x) <= c2*g(x) for all x > N }

Because Ө(g(x)) is a set, we could write f(x) ∈ Ө(g(x)); by convention, however, we usually write f(x) = Ө(g(x)).

Equivalently: let f and g be two functions defined on some subset of the real numbers. We write f(x) = Ө(g(x)) as x -> infinity if and only if there are positive constants K and L and a real number x0 such that the following holds:

K|g(x)| <= f(x) <= L|g(x)| for all x >= x0.
This definition is equivalent to: f(x) = O(g(x)) and f(x) = Ω(g(x)). The related notations f(n) = o(g(n)) and f(n) = ω(g(n)) denote the corresponding strict bounds.
Links
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein.
Introduction to Algorithms.
Section 2.3: Big-Omega Notation
Ω -notation is used for asymptotic lower bound.
Formal definition
Let f(n) and g(n) be two functions defined on the set of the positive real numbers. We write f(n) = Ω(g(n)) if there are positive constants c and n0 such that:

0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
Notes
For any two functions f(n) and g(n) we have f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
Consider, for example, an algorithm that runs in linear time when n is odd but takes exponential time when n is even. We would like to say the algorithm requires exponential time, but in fact we cannot prove even an Ω(n^2) lower bound using the usual definition of Ω, since the algorithm runs in linear time for odd n. We should instead define f(n) = Ω(g(n)) by saying that for some constant c > 0, f(n) ≥ c·g(n) for infinitely many n. This gives a nice correspondence between upper and lower bounds: f(n) = Ω(g(n)) iff f(n) ≠ o(g(n)).
References
Formal definition and theorem are taken from the book "Thomas H. Cormen,
Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to
Algorithms".
Chapter 3: Big-O Notation
Definition
The Big-O notation is at its heart a mathematical notation, used to compare
the rate of convergence of functions.
Let n -> f(n) and n -> g(n) be functions defined over the natural numbers. Then we say that f = O(g) if and only if f(n)/g(n) is bounded when n approaches infinity. In other words, f = O(g) if and only if there exists a constant A such that for all n, f(n)/g(n) <= A.
Actually the scope of the Big-O notation is a bit wider in mathematics but for
simplicity I have narrowed it to what is used in algorithm complexity
analysis : functions defined on the naturals, that have non-zero values, and
the case of n growing to infinity.
What does it mean ?
Let's take the case of f(n) = 100n^2 + 10n + 1 and g(n) = n^2. It is quite clear that both of these functions tend to infinity as n tends to infinity. But sometimes knowing the limit is not enough, and we also want to know the speed at which the functions approach their limit. Notions like Big-O help compare and classify functions by their speed of convergence.
Let's find out if f = O(g) by applying the definition. We have f(n)/g(n) = 100 + 10/n + 1/n^2. Since 10/n is 10 when n is 1 and is decreasing, and since 1/n^2 is 1 when n is 1 and is also decreasing, we have f(n)/g(n) <= 100 + 10 + 1 = 111. The definition is satisfied because we have found a bound of f(n)/g(n) (111), and so f = O(g).
The input size is the size of the array, which I called len in the code. Let's count the operations. The two assignments before the loops are done only once, so that's 2 operations. The operations that are repeated inside the loops are sketched below.
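The original code is not shown here; a minimal Python sketch of the kind of doubly nested loop the analysis describes (the pair-counting function and its names are purely illustrative) could be:

def count_pairs(array):
    length = len(array)                  # assignment done once
    count = 0                            # assignment done once
    for i in range(length):              # outer loop: runs n times
        for j in range(length):          # inner loop: runs n times per outer iteration
            if array[i] == array[j]:     # constant amount of work per inner iteration
                count += 1
    return count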
The inner loop performs at each iteration a number of operations that is constant with n. The outer loop also does a few constant operations, and runs the inner loop n times. The outer loop itself is run n times. So the operations inside the inner loop are run n^2 times, the operations in the outer loop are run n times, and the assignment to i is done one time. Thus, the complexity will be something like an^2 + bn + c, and since the highest term is n^2, the O notation is O(n^2).
3. From 1 and 2:

   n / 2^k = 1
   or
   n = 2^k
   or
   log_e n = k * log_e 2
   or
   k = log_e n / log_e 2 = log_2 n

So now if someone asks you: if n is 256, how many steps will that loop (or any other algorithm that cuts its problem size in half at each step) run? You can very easily calculate:

   k = log_2 256
   k = log_2 2^8    (since log_a a = 1)
   k = 8

Another very good example of a similar case is the Binary Search algorithm.
Section 3.4: An O(log n) example
Introduction
Consider the following problem:
a and b are the indexes between which 0 is to be found. Each time we enter
the loop, we use an index between a and b and use it to narrow the area to be
searched.
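The original code for this problem did not carry over; a minimal Python sketch of such a halving loop (assuming a sorted list lst that is known to contain the value 0) might be:

def find_zero(lst):
    # a and b are the indexes between which 0 is to be found
    a, b = 0, len(lst) - 1
    while a < b:
        mid = (a + b) // 2           # pick an index between a and b
        if lst[mid] < 0:
            a = mid + 1              # 0 must be to the right of mid
        else:
            b = mid                  # 0 is at mid or to its left
    return a

print(find_zero([-7, -3, -1, 0, 2, 5]))   # prints 3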
In the worst case, we have to wait until a and b are equal. But how many
operations does that take? Not n, because each time we enter the loop, we
divide the distance between a and b by about two. Rather, the complexity is
O(log n).
Explanation
Note: When we write "log", we mean the binary logarithm, or log base 2
(which we will write "log_2"). As O(log_2 n) = O(log n) (you can do the
math) we will use "log" instead of "log_2".
Let's call x the number of operations: we know that 1 = n / (2^x).
So 2^x = n,
then x = log n
Conclusion
When faced with successive divisions (be it by two or by any number),
remember that the complexity is logarithmic.
Chapter 4: Trees
Section 4.1: Typical n-ary tree representation

Typically we represent an n-ary tree (one with potentially unlimited children per node) as a binary tree (one with exactly two children per node). The "next" child is regarded as a sibling. Note that if a tree is binary, this representation creates extra nodes.
We then iterate over the siblings and recurse down the children. As most
trees are relatively shallow - lots of children but only a few levels of
hierarchy, this gives rise to efficient code. Note human genealogies are an
exception (lots of levels of ancestors, only a few children per level).
If necessary back pointers can be kept to allow the tree to be ascended. These
are more difficult to maintain.
Note that it is typical to have one function to call on the root and a recursive
function with extra parameters, in this case tree depth.
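A small Python sketch of this idea, assuming a hypothetical Node with a first child and a next sibling, plus a root-level wrapper and a recursive worker that takes the tree depth as its extra parameter:

class Node:
    def __init__(self, value):
        self.value = value
        self.child = None    # first child
        self.sibling = None  # next sibling

def print_tree(root):
    _print_tree(root, 0)                     # the function called on the root

def _print_tree(node, depth):                # recursive function with an extra depth parameter
    while node is not None:
        print("  " * depth + str(node.value))
        _print_tree(node.child, depth + 1)   # recurse down the children
        node = node.sibling                  # iterate over the siblings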
Section 4.2: Introduction
Trees are a sub-type of the more general node-edge graph data structure.
Following the code snippet, each image shows a visualization of the execution, which makes it easier to see how this code works.
Section 5.2: Binary Search Tree -
Deletion(C++)
Before starting with deletion, let's shed some light on what a binary search tree (BST) is. Each node in a BST can have at most two children (a left and a right child). The left sub-tree of a node contains keys less than or equal to its parent node's key. The right sub-tree of a node contains keys greater than its parent node's key.
Deleting a node in a tree while maintaining its Binary search tree property.
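The C++ implementation is not reproduced here; a Python sketch of the usual three deletion cases (leaf, one child, and two children handled via the in-order successor) might look like this:

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def min_node(node):
    while node.left is not None:
        node = node.left
    return node

def delete(root, key):
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        # node with at most one child: splice it out
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # two children: copy the in-order successor's key, then delete that successor
        successor = min_node(root.right)
        root.key = successor.key
        root.right = delete(root.right, successor.key)
    return root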
So the in-order traversal will be: 4 2 5 1 6 3 7

Post-order traversal (left, right, root) visits the left sub-tree of the node, then the right sub-tree, and then the node itself.
So the post-order traversal of the above tree will be: 4 5 2 6 7 3 1
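As a small Python sketch of the tree these traversal orders imply (1 at the root, 2 and 3 below it, then 4, 5, 6, 7) and of the two traversals:

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(node):
    if node:
        inorder(node.left); print(node.key, end=' '); inorder(node.right)

def postorder(node):
    if node:
        postorder(node.left); postorder(node.right); print(node.key, end=' ')

root = Node(1, Node(2, Node(4), Node(5)), Node(3, Node(6), Node(7)))
inorder(root)     # 4 2 5 1 6 3 7
print()
postorder(root)   # 4 5 2 6 7 3 1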
Chapter 8: Lowest common
ancestor of a Binary Tree
Lowest common ancestor between two nodes n1 and n2 is defined as the
lowest node in the tree that has both n1 and n2 as descendants.
Section 8.1: Finding lowest common
ancestor
Consider the tree:
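The example tree itself is not shown here; a Python sketch of the usual recursive search for the lowest common ancestor in a plain binary tree (assuming both keys are present) might be:

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def lowest_common_ancestor(root, n1, n2):
    if root is None or root.key == n1 or root.key == n2:
        return root
    left = lowest_common_ancestor(root.left, n1, n2)
    right = lowest_common_ancestor(root.right, n1, n2)
    if left and right:          # n1 and n2 were found in different subtrees
        return root
    return left if left else right

# illustrative tree: keys 4 and 5 sit under node 2, so node 2 is their LCA
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(lowest_common_ancestor(root, 4, 5).key)   # 2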
Here you can see a table beside the graph, this is our adjacency matrix. Here
Matrix[i][j] = 1 represents there is an edge between i and j. If there's no
edge, we simply put Matrix[i][j] = 0.
These edges can be weighted, like it can represent the distance between two
cities. Then we'll put the value in Matrix[i][j] instead of putting 1.
The graph described above is Bidirectional or Undirected, that means, if we
can go to node 1 from node 2, we can also go to node 2 from node 1. If the
graph was Directed, then there would've been arrow sign on one side of the
graph. Even then, we could represent it using adjacency matrix.
We represent the nodes that don't share edge by infinity. One thing to be
noticed is that, if the graph is undirected, the matrix becomes symmetric.
The pseudo-code to create the matrix:
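For instance, a Python sketch of that construction (the node count and edge list here are purely illustrative):

n = 5                                      # number of nodes, numbered 0..n-1
edges = [(0, 1), (0, 4), (1, 2), (3, 4)]   # hypothetical undirected edges

matrix = [[0] * n for _ in range(n)]       # initially no edges
for u, v in edges:
    matrix[u][v] = 1
    matrix[v][u] = 1                       # undirected: the matrix is symmetric

print(matrix[0][1], matrix[1][3])          # 1 0

For a weighted graph, store the weight instead of 1 (and infinity where no edge exists), exactly as described above.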
In the example above, there are two paths from A to D. A->B, B->C, C->D
is one path. The cost of this path is 3 + 4 + 2 = 9. Again, there's another path
A->D. The cost of this path is 10. The path that costs the lowest is called
shortest path.
Degree:
The degree of a vertex is the number of edges that are connected to it. Any edge that connects to the vertex at both ends (a loop) is counted twice.
In directed graphs, the nodes have two types of degrees:
In-degree: The number of edges that point to the node.
Out-degree: The number of edges that point from the node to other
nodes.
For undirected graphs, they are simply called degree.
From this one, we can easily find out the total number of nodes connected to
any node, and what these nodes are.
It takes less time than Adjacency Matrix. But if we needed to find out if
there's an edge between u and v, it'd have been easier if we kept an
adjacency matrix.
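A corresponding Python sketch of an adjacency list for the same kind of undirected graph might be:

from collections import defaultdict

edges = [(0, 1), (0, 4), (1, 2), (3, 4)]   # the same hypothetical edges as before

adj = defaultdict(list)
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)          # undirected: store the edge in both directions

print(adj[0])                 # [1, 4] -- all nodes connected to node 0
print(len(adj[0]))            # 2      -- the degree of node 0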
Section 9.4: Topological Sort
A topological ordering, or a topological sort, orders the vertices in a directed
acyclic graph on a line, i.e. in a list, such that all directed edges go from left
to right. Such an ordering cannot exist if the graph contains a directed cycle
because there is no way that you can keep going right on a line and still
return back to where you started from.
Formally, in a graph G = (V, E), a linear ordering of all its vertices is a topological ordering if, whenever G contains an edge (u, v) ∈ E from vertex u to vertex v, u precedes v in the ordering.
It is important to note that each DAG has at least one topological sort.
There are known algorithms for constructing a topological ordering of any
DAG in linear time, one example is:
1. Call depth_first_search(G) to compute the finishing time v.f for each vertex v
2. As each vertex is finished, insert it into the front of a linked list
3. Return the linked list of vertices, as it is now sorted
A topological sort can be performed in O(V+E) time, since the depth-first search algorithm takes O(V+E) time and it takes Ω(1) (constant time) to insert each of |V| vertices into the front of a linked list.
Many applications use directed acyclic graphs to indicate precedences among
events. We use topological sorting so that we get an ordering to process each
vertex before any of its successors.
Vertices in a graph may represent tasks to be performed and the edges may
represent constraints that one task must be performed before another; a
topological ordering is a valid sequence to perform the set of tasks described in V.
Problem instance and its solution
Let a vertex v describe a Task(hours_to_complete: int), i.e. Task(4) describes a Task that takes 4 hours to complete, and let an edge e describe a Cooldown(hours: int), such that Cooldown(3) describes a duration of time to cool down after a completed task.
Let our graph be called dag (since it is a directed acyclic graph), and let it
contain 5 vertices:
where we connect the vertices with directed edges such that the graph is
acyclic,
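The construction of that example graph is not shown here; a Python sketch of a DFS-based topological sort on a small hypothetical DAG, following the three steps listed above, might be:

def topological_sort(graph):
    visited, order = set(), []

    def dfs(v):
        visited.add(v)
        for w in graph.get(v, []):
            if w not in visited:
                dfs(w)
        order.insert(0, v)       # as each vertex finishes, put it at the front

    for v in graph:
        if v not in visited:
            dfs(v)
    return order

# hypothetical DAG with 5 vertices; edges point from a task to the task that must follow it
dag = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': ['e'], 'e': []}
print(topological_sort(dag))     # ['a', 'c', 'b', 'd', 'e']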
We need to understand Edge Relaxation. Let's say, from your house, that is
source, it takes 10 minutes to go to place A. And it takes 25 minutes to go to
place B. We have,
Now let's say it takes 7 minutes to go from place A to place B, that means:
Then we can go to place B from source by going to place A from source and
then from place A, going to place B, which will take 10 + 7 = 17 minutes,
instead of 25 minutes. So,
Then we update,
This can be used to find the shortest path of all node from the source. The
complexity of this code is not so good.
Here's why,
In BFS, when we go from node 1 to all other nodes, we follow first come,
first serve method. For example, we went to node 3 from source before
processing node 2. If we go to node 3 from source, we update node 4 as 5 +
3 = 8. When we again update node 3 from node 2, we need to update node 4
as 3 + 3 = 6 again! So node 4 is updated twice.
Dijkstra proposed, instead of going for First come, first serve method, if we
update the nearest nodes first, then it'll take less updates. If we processed
node 2 before, then node 3 would have been updated before, and after
updating node 4 accordingly, we'd easily get the shortest distance! The idea
is to choose from the queue, the node, that is closest to the source. So we
will use Priority Queue here so that when we pop the queue, it will bring us
the closest node u from source. How will it do that? It'll check the value of
d[u] with it.
Let's see the pseudo-code:
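In place of the original pseudo-code, here is a rough Python sketch of the same idea, using heapq as the priority queue (graph representation and names are illustrative):

import heapq

def dijkstra(graph, source):
    # graph: {node: [(neighbour, edge_cost), ...]}
    d = {node: float('inf') for node in graph}
    d[source] = 0
    queue = [(0, source)]                 # priority queue keyed by d[u]
    while queue:
        dist_u, u = heapq.heappop(queue)  # closest remaining node to the source
        if dist_u > d[u]:
            continue                      # stale entry, skip it
        for v, cost in graph[u]:
            if d[u] + cost < d[v]:        # edge relaxation
                d[v] = d[u] + cost
                heapq.heappush(queue, (d[v], v))
    return d

graph = {1: [(2, 3), (3, 5)], 2: [(3, 1), (4, 3)], 3: [(4, 3)], 4: []}
print(dijkstra(graph, 1))   # {1: 0, 2: 3, 3: 4, 4: 6}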
It returns the distance of all other nodes from the source. If we want to know the distance of a single node v, we can simply return the value when v is popped from the queue.
Now, does Dijkstra's Algorithm work when there's a negative edge? If there's a negative cycle, then an infinite loop will occur, as it will keep reducing the cost every time. Even with just a negative edge, Dijkstra won't work, unless we return right after the target is popped. But then, it won't be a Dijkstra algorithm. We'll need the Bellman-Ford algorithm for processing negative edges/cycles.
Complexity:
The complexity of BFS is O(log(V+E)) where V is the number of nodes and
E is the number of edges. For Dijkstra, the complexity is similar, but sorting
of Priority Queue takes O(logV). So the total complexity is: O(Vlog(V)+E)
Below is a Java example to solve Dijkstra's Shortest Path Algorithm using
Adjacency Matrix
    int min = Integer.MAX_VALUE, min_index = -1;

    // pick the not-yet-finalized vertex with the smallest tentative distance
    for (int v = 0; v < V; v++)
        if (sptSet[v] == false && dist[v] <= min) {
            min = dist[v];
            min_index = v;
        }

    return min_index;
}

void printSolution(int dist[], int n)
{
    System.out.println("Vertex   Distance from Source");
    for (int i = 0; i < V; i++)
        System.out.println(i + " \t\t " + dist[i]);
}
In order to calculate these heuristics, this is the formula we will use: abs(from.x - to.x) + abs(from.y - to.y)
Great! We've got the value: 1. Now, let's try calculating the "h"
We've calculated the g, h, and f values for all of the blue nodes. Now, which
do we pick?
Whichever one has the lowest f value.
However, in this case, we have 2 nodes with the same f value, 5. How do we
pick between them?
Simply, either choose one at random, or have a priority set. I usually prefer to
have a priority like so: "Right > Up > Down > Left"
One of the nodes with the f value of 5 takes us in the "Down" direction, and
the other takes us "Left". Since Down is at a higher priority than Left, we
choose the square which takes us "Down".
I now mark the nodes which we calculated the heuristics for, but did not
move to, as orange, and the node which we chose as cyan:
Alright, now let's calculate the same heuristics for the nodes around the cyan
node:
Again, we choose the node going down from the cyan node, as all the options
have the same f value:
Let's calculate the heuristics for the only neighbour that the cyan node has:
Alright, since we will follow the same pattern we have been following:
Once more, let's calculate the heuristics for the node's neighbour:
Let's move there:
Finally, we can see that we have a winning square beside us, so we move
there, and we are done.
Section 12.3: Solving 8-puzzle problem
using A* algorithm
Problem definition:
An 8-puzzle is a simple game consisting of a 3 x 3 grid (containing 9 squares). One of the squares is empty. The object is to move the squares around into different positions so that the numbers end up displayed in the "goal state".
Given an initial state of the 8-puzzle game and a final state to be reached, find the most cost-effective path to reach the final state from the initial state.
Let us consider the Manhattan distance between the current and final state as
the heuristic for this problem statement.
First we find the heuristic value required to reach the final state from the initial state. The cost function is g(n) = 0, as we are in the initial state.
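As a small Python sketch of this heuristic (states written as 3x3 lists with 0 standing for the empty square; the concrete boards are illustrative, not the ones from the original figures):

def manhattan(state, goal):
    # sum of horizontal plus vertical distances of every tile to its goal cell
    position = {goal[r][c]: (r, c) for r in range(3) for c in range(3)}
    total = 0
    for r in range(3):
        for c in range(3):
            tile = state[r][c]
            if tile != 0:                      # the empty square is not counted
                gr, gc = position[tile]
                total += abs(r - gr) + abs(c - gc)
    return total

start = [[1, 2, 3],
         [4, 0, 6],
         [7, 5, 8]]
goal  = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 0]]
print(manhattan(start, goal))   # 2: tiles 5 and 8 are each one move away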
Again the total cost function is computed for these states using the method
described above and it turns out to be 6 and 7 respectively. We chose the
state with minimum cost which is state (1). The next possible moves can be
Left, Right or Down. We won't move Left as we were previously in that
state. So, we can move Right or Down.
Again we find the states obtained from (1).
(3) leads to cost function equal to 6 and (4) leads to 4. Also, we will consider
(2) obtained before which has cost function equal to 7. Choosing minimum
from them leads to (4). Next possible moves can be Left or Right or Down.
We get states:
We get costs equal to 5, 2 and 4 for (5), (6) and (7) respectively. Also, we
have previous states (3) and (2) with 6 and 7 respectively. We chose
minimum cost state which is (6). Next possible moves are Up, and Down and
clearly Down will lead us to final state leading to heuristic function value
equal to 0.
Chapter 13: A* Pathfinding
Algorithm
This topic is going to focus on the A* Pathfinding algorithm, how it's used,
and why it works.
Note to future contributors: I have added an example for A* Pathfinding
without any obstacles, on a 4x4 grid. An example with obstacles is still
needed.
Section 13.1: Simple Example of A*
Pathfinding: A maze with no obstacles
Let's say we have the following 4 by 4 grid:
In order to calculate these heuristics, this is the formula we will use: abs(from.x - to.x) + abs(from.y - to.y)
Great! We've got the value: 1. Now, let's try calculating the "h"
We've calculated the g, h, and f values for all of the blue nodes. Now, which
do we pick?
Whichever one has the lowest f value.
However, in this case, we have 2 nodes with the same f value, 5. How do we
pick between them?
Simply, either choose one at random, or have a priority set. I usually prefer to
have a priority like so: "Right > Up > Down > Left"
One of the nodes with the f value of 5 takes us in the "Down" direction, and
the other takes us "Left". Since Down is at a higher priority than Left, we
choose the square which takes us "Down".
I now mark the nodes which we calculated the heuristics for, but did not
move to, as orange, and the node which we chose as cyan:
Alright, now let's calculate the same heuristics for the nodes around the cyan
node:
Again, we choose the node going down from the cyan node, as all the options
have the same f value:
Let's calculate the heuristics for the only neighbour that the cyan node has:
Alright, since we will follow the same pattern we have been following:
Once more, let's calculate the heuristics for the node's neighbour:
Let's move there:
Finally, we can see that we have a winning square beside us, so we move
there, and we are done.
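No code accompanies this walkthrough; a compact Python sketch of A* on such a 4x4 grid with no obstacles, using the Manhattan-distance h described above (start and goal coordinates chosen purely for illustration), might be:

import heapq

def a_star(start, goal, width=4, height=4):
    def h(cell):                      # Manhattan distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start, [start])]   # entries are (f, g, cell, path)
    visited = set()
    while open_heap:
        f, g, cell, path = heapq.heappop(open_heap)   # lowest f value first
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in visited:
                new_g = g + 1                     # every step costs 1
                new_f = new_g + h((nx, ny))       # f = g + h
                heapq.heappush(open_heap, (new_f, new_g, (nx, ny), path + [(nx, ny)]))
    return None

print(a_star((0, 0), (3, 3)))   # one shortest path of 7 cells from corner to corner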
Chapter 14: Dynamic Programming
Dynamic programming is a widely used concept, and it is often used for optimization. It refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner, usually with a bottom-up approach. There are two key attributes that a problem must have in order for dynamic programming to be applicable: "optimal substructure" and "overlapping sub-problems". To achieve its optimization, dynamic programming uses a concept called memoization.
Section 14.1: Edit Distance
The problem statement is: if we are given two strings str1 and str2, what is the minimum number of operations that can be performed on str1 so that it gets converted to str2?

Implementation in Java

Output

3
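The Java code itself is not shown above; a Python sketch of the standard DP recurrence (insert, delete and replace each costing 1, with input strings chosen only for illustration) might be:

def edit_distance(str1, str2):
    m, n = len(str1), len(str2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                  # delete all i characters
    for j in range(n + 1):
        dp[0][j] = j                  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j],      # delete
                                   dp[i][j - 1],      # insert
                                   dp[i - 1][j - 1])  # replace
    return dp[m][n]

print(edit_distance("saturday", "sunday"))   # 3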
Section 14.2: Weighted Job
Scheduling Algorithm
Weighted Job Scheduling Algorithm can also be denoted as Weighted
Activity Selection Algorithm.
The problem is, given certain jobs with their start time and end time, and a
profit you make when you finish the job, what is the maximum profit you can
make given no two jobs can be executed in parallel?
This one looks like Activity Selection using Greedy Algorithm, but there's an
added twist. That is, instead of maximizing the number of jobs finished, we
focus on making the maximum profit. The number of jobs performed doesn't
matter here.
Let's look at an example:
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | A | B | C | D | E | F |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (2,5) | (6,7) | (7,9) | (1,3) | (5,8) | (4,6) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 6 | 4 | 2 | 5 | 11 | 5 |
+-------------------------+---------+---------+---------+---------+---------+---------+
The jobs are denoted with a name, their start and finishing time and profit.
After a few iterations, we can find out if we perform Job-A and Job-E, we
can get the maximum profit of 17. Now how to find this out using an
algorithm?
The first thing we do is sort the jobs by their finishing time in non-decreasing
order. Why do we do this? It's because if we select a job that takes less time
to finish, then we leave the most amount of time for choosing other jobs. We
have:
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
We also keep an extra row, Acc_Prof, which stores the maximum accumulated profit achievable if the corresponding job is the last one taken; initially it is simply each job's own profit:
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Now let's denote position 2 with i, and position 1 will be denoted with j. Our
strategy will be to iterate j from 1 to i-1 and after each iteration, we will
increment i by 1, until i becomes n+1.
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
We check if Job[i] and Job[j] overlap, that is, if the finish time of Job[j] is
greater than Job[i]'s start time, then these two jobs can't be done together.
However, if they don't overlap, we'll check if Acc_Prof[j] + Profit[i] >
Acc_Prof[i]. If this is the case, we will update Acc_Prof[i] = Acc_Prof[j] +
Profit[i]. That is:
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Now Job[j] and Job[i] don't overlap. The total amount of profit we can make
by picking these two jobs is: Acc_Prof[j] + Profit[i] = 5 + 5 = 10 which is
greater than Acc_Prof[i]. So we update Acc_Prof[i] = 10. We also increment
j by 1.
We get,
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Here, Job[j] overlaps with Job[i] and j is also equal to i-1. So we increment i
by 1, and make j = 1. We get,
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Now, Job[j] and Job[i] don't overlap, we get the accumulated profit 5 + 4 =
9, which is greater than Acc_Prof[i]. We update Acc_Prof[i] = 9 and
increment j by 1.
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 9 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
Again Job[j] and Job[i] don't overlap. The accumulated profit is: 6 + 4 = 10,
which is greater than Acc_Prof[i]. We again update Acc_Prof[i] = 10. We
increment j by 1. We get:
j i
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 10 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
If we continue this process, after iterating through the whole table using i, our
table will finally look like:
+-------------------------+---------+---------+---------+---------+---------+---------+
| Name | D | A | F | B | E | C |
+-------------------------+---------+---------+---------+---------+---------+---------+
|(Start Time, Finish Time)| (1,3) | (2,5) | (4,6) | (6,7) | (5,8) | (7,9) |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Profit | 5 | 6 | 5 | 4 | 11 | 2 |
+-------------------------+---------+---------+---------+---------+---------+---------+
| Acc_Prof | 5 | 6 | 10 | 14 | 17 | 8 |
+-------------------------+---------+---------+---------+---------+---------+---------+
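A Python sketch of the O(n^2) procedure walked through above (sort by finish time, then fill the Acc_Prof values) might be:

def max_profit(jobs):
    # jobs: list of (start, finish, profit) tuples
    jobs = sorted(jobs, key=lambda job: job[1])      # sort by finishing time
    acc_prof = [job[2] for job in jobs]              # Acc_Prof starts as each job's own profit
    for i in range(1, len(jobs)):
        for j in range(i):
            if jobs[j][1] <= jobs[i][0]:             # Job[j] and Job[i] don't overlap
                acc_prof[i] = max(acc_prof[i], acc_prof[j] + jobs[i][2])
    return max(acc_prof)

# jobs A..F from the tables above, as (start, finish, profit)
jobs = [(2, 5, 6), (6, 7, 4), (7, 9, 2), (1, 3, 5), (5, 8, 11), (4, 6, 5)]
print(max_profit(jobs))   # 17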
Overlapping Sub-problems
Here fib(0), fib(1) and fib(3) are the overlapping sub-problems. fib(0) is repeated 3 times, fib(1) is repeated 5 times and fib(3) is repeated 2 times.
Implementation
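The implementation is not reproduced here; a Python sketch of a memoized (top-down) Fibonacci that avoids recomputing the overlapping sub-problems, using the convention fib(0) = 0 and fib(1) = 1, could be:

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n                         # fib(0) = 0, fib(1) = 1
    return fib(n - 1) + fib(n - 2)       # each sub-problem is computed only once

print(fib(10))   # 55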
Time Complexity
O(n)
Section 14.5: Longest
Common Substring
Given two strings str1 and str2, we have to find the length of the longest common substring between them.
Examples
Input : X = "abcdxyz", y = "xyzabcd" Output : 4
The longest common substring is "abcd" and is of length 4.
Input : X = "zxabcdezy", y = "yzabcdezx" Output : 6
The longest common substring is "abcdez" and is of length 6.
Implementation in Java
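The Java code is not included here; a Python sketch of the O(m*n) DP table might be:

def longest_common_substring(str1, str2):
    m, n = len(str1), len(str2)
    # dp[i][j] = length of the common substring ending at str1[i-1] and str2[j-1]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best

print(longest_common_substring("abcdxyz", "xyzabcd"))      # 4
print(longest_common_substring("zxabcdezy", "yzabcdezx"))  # 6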
Time Complexity
O(m*n)
Chapter 15: Applications of
Dynamic Programming
The basic idea behind dynamic programming is breaking a complex problem
down to several small and simple problems that are repeated. If you can
identify a simple subproblem that is repeatedly calculated, odds are there is a
dynamic programming approach to the problem.
As this topic is titled Applications of Dynamic Programming, it will focus
more on applications rather than the process of creating dynamic
programming algorithms.
Section 15.1: Fibonacci Numbers
Fibonacci Numbers are a prime subject for dynamic programming as the
traditional recursive approach makes a lot of repeated calculations. In these
examples I will be using the base case of
f(0) = f(1) = 1.
To store the last two results I use an array of size 2 and simply flip which index I am assigning to by using i % 2, which will alternate like so: 0, 1, 0, 1, 0, 1, ....

I add both indexes of the array together because we know that addition is commutative (5 + 6 = 11 and 6 + 5 = 11). The result is then assigned to the older of the two spots (denoted by i % 2). The final result is then stored at the position n % 2.
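A short Python sketch of this constant-space version, with the base case f(0) = f(1) = 1 used above:

def fib(n):
    results = [1, 1]                  # f(0) = f(1) = 1
    for i in range(2, n + 1):
        # overwrite the older of the two stored values with their sum
        results[i % 2] = results[0] + results[1]
    return results[n % 2]             # the final result sits at position n % 2

print([fib(i) for i in range(7)])     # [1, 1, 2, 3, 5, 8, 13]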
Notes
It is important to note that sometimes it may be best to come up with an iterative, memoized solution for functions that perform large calculations repeatedly, as you will build up a cache of the answers to the function calls, and subsequent calls may be O(1) if the answer has already been computed.
Chapter 16: Kruskal's Algorithm
Section 16.1: Optimal, disjoint-set
based implementation
We can do two things to improve the simple and sub-optimal disjoint-set
subalgorithms:
1. Path compression heuristic: findSet does not need to ever handle a tree
with height bigger than 2. If it ends up iterating such a tree, it can link
the lower nodes directly to the root, optimizing future traversals;
2. Height-based merging heuristic: for each node, store the height of its
subtree. When merging, make the taller tree the parent of the smaller
one, thus not increasing anyone's height.
This leads to O(alpha(n)) time for each operation, where alpha is the inverse of the fast-growing Ackermann function; it is very slow growing and can be considered O(1) for practical purposes.

This makes the entire Kruskal's algorithm O(m log m + m) = O(m log m), because of the initial sorting.
Note
Path compression may reduce the height of the tree, hence comparing heights of the trees during a union operation might not be a trivial task. Hence, to avoid the complexity of storing and calculating the heights of the trees, the resulting parent can be picked randomly; in practice this is still fast and very simple to implement.
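A Python sketch of such a disjoint-set structure with path compression and a rank (height) based merge could be:

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))   # every node starts as its own root
        self.rank = [0] * n            # upper bound on the height of each tree

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False               # already in the same set
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx            # make the taller tree the parent
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

ds = DisjointSet(5)
ds.union(0, 1); ds.union(3, 4)
print(ds.find(1) == ds.find(0), ds.find(0) == ds.find(3))   # True False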
Section 16.2: Simple, more detailed
implementation
In order to efficiently handle cycle detection, we consider each node as part
of a tree. When adding an edge, we check if its two component nodes are part
of distinct trees. Initially, each node makes up a one-node tree.
algorithm kruskalMST(G: a graph)
    sort G's edges by their value
    MST = a forest of trees, initially each node is its own tree
    for each edge e in G, in ascending order of value
        if the endpoints of e belong to different trees in MST
            join those two trees into one by adding e to MST
    return MST

This naive implementation leads to O(n log n) time for managing the disjoint-set data structure, leading to O(m*n log n) time for the entire Kruskal's algorithm.
Section 16.4: Simple, high
level implementation
Sort the edges by value and add each one to the MST in sorted order, if it
doesn't create a cycle.
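A self-contained Python sketch of this high-level version (edges given as (weight, u, v) triples chosen for illustration; cycle detection done with a tiny find/union on a parent array):

def kruskal(n, edges):
    parent = list(range(n))

    def find(x):                      # walk up to the root of x's tree
        while parent[x] != x:
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):     # edges in ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding the edge does not create a cycle
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))   # [(0, 1, 1), (1, 3, 2), (1, 2, 3)]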
Chapter 17: Greedy Algorithms
Section 17.1: Huffman Coding
Huffman code is a particular type of optimal prefix code that is commonly
used for lossless data compression. It compresses data very effectively saving
from 20% to 90% memory, depending on the characteristics of the data being
compressed. We consider the data to be a sequence of characters. Huffman's
greedy algorithm uses a table giving how often each character occurs (i.e., its
frequency) to build up an optimal way of representing each character as a
binary string. Huffman code was proposed by David A. Huffman in 1951.
Suppose we have a 100,000-character data file that we wish to store
compactly. We assume that there are only 6 different characters in that file.
The frequency of the characters are given by:
+------------------------+-----+-----+-----+-----+-----+-----+
| Character | a | b | c | d | e | f |
+------------------------+-----+-----+-----+-----+-----+-----+
|Frequency (in thousands)| 45 | 13 | 12 | 16 | 9 | 5 |
+------------------------+-----+-----+-----+-----+-----+-----+
We have many options for how to represent such a file of information. Here,
we consider the problem of designing a Binary Character Code in which
each character is represented by a unique binary string, which we call a
codeword.
The constructed tree will provide us with:
+------------------------+-----+-----+-----+-----+-----+-----+
| Character | a | b | c | d | e | f |
+------------------------+-----+-----+-----+-----+-----+-----+
| Fixed-length Codeword | 000 | 001 | 010 | 011 | 100 | 101 |
+------------------------+-----+-----+-----+-----+-----+-----+
|Variable-length Codeword| 0 | 101 | 100 | 111 | 1101| 1100|
+------------------------+-----+-----+-----+-----+-----+-----+
Greedy Explanation:
Huffman coding looks at the occurrence of each character and stores it as a binary string in an optimal way. The idea is to assign variable-length codes to input characters; the lengths of the assigned codes are based on the frequencies of the corresponding characters. We create a binary tree and operate on it in a bottom-up manner so that the two least frequent characters are as far as possible from the root. In this way, the most frequent character gets the smallest code and the least frequent character gets the largest code.
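A Python sketch of this greedy tree construction, applied to the frequency table above (using heapq; with this particular tie-breaking it happens to reproduce the variable-length codewords shown in the table):

import heapq

def huffman_codes(frequencies):
    # start with one leaf per character: (frequency, tie_breaker, subtree)
    heap = [(freq, i, char) for i, (char, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1

    codes = {}
    def assign(tree, prefix):
        if isinstance(tree, str):
            codes[tree] = prefix or "0"
        else:
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
    assign(heap[0][2], "")
    return codes

freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}
print(huffman_codes(freq))
# {'a': '0', 'c': '100', 'b': '101', 'f': '1100', 'e': '1101', 'd': '111'}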
References:
Introduction to Algorithms - Charles E. Leiserson, Clifford Stein,
Ronald Rivest, and Thomas H. Cormen Huffman Coding - Wikipedia
Discrete Mathematics and Its
Applications - Kenneth H. Rosen
Section 17.2: Activity
Selection Problem
The Problem
You have a set of things to do (activities). Each activity has a start time and an end time. You aren't allowed to perform more than one activity at a time. Your task is to find a way to perform the maximum number of activities.
For example, suppose you have a selection of classes to choose from.
Activity No.   start time   end time
1              10.20 A.M    11.00 A.M
2              10.30 A.M    11.30 A.M
3              11.00 A.M    12.00 A.M
4              10.00 A.M    11.30 A.M
5               9.00 A.M    11.00 A.M
Remember, you can't take two classes at the same time. That means you can't
take class 1 and 2 because they share a common time 10.30 A.M to 11.00
A.M. However, you can take class 1 and 3 because they don't share a
common time. So your task is to take maximum number of classes as
possible without any overlap. How can you do that?
Analysis
Let's think about a solution using the greedy approach. First of all, we randomly choose some approach and check whether it will work or not.

Sort the activities by start time, that means whichever activity starts first, we take it first. Then go from first to last in the sorted list and check whether the current activity intersects with the previously taken activity or not. If the current activity does not intersect with the previously taken activity, we perform it, otherwise we do not. This approach will work for some cases like:
Activity No.   start time   end time
1              11.00 A.M     1.30 P.M
2              11.30 A.M    12.00 P.M
3               1.30 P.M     2.00 P.M
4              10.00 A.M    11.00 A.M
The sorting order will be 4 --> 1 --> 2 --> 3. Activities 4 --> 1 --> 3 will be performed and activity 2 will be skipped, so a maximum of 3 activities will be performed. It works for this type of case, but it will fail for some cases. Let's apply this approach to the case:
Activity No.   start time   end time
1              11.00 A.M     1.30 P.M
2              11.30 A.M    12.00 P.M
3               1.30 P.M     2.00 P.M
4              10.00 A.M     3.00 P.M
The sort order will be 4 --> 1 --> 2 --> 3 and only activity 4 will be performed, but the answer could have been activity 1 --> 3 or 2 --> 3. So our approach will not work for the above case.
Let's try another approach
Sort the activities by time duration, that means perform the shortest activity first. This can solve the previous problem. Although the problem is not completely solved: there are still some cases where this fails. Apply this approach to the case below.
Activity No.   start time   end time
1               6.00 A.M    11.40 A.M
2              11.30 A.M    12.00 P.M
3              11.40 A.M     2.00 P.M
If we sort the activities by time duration, the sort order will be 2 --> 3 --> 1, and if we perform activity No. 2 first, then no other activity can be performed. But the answer is to perform activity 1 and then activity 3, so we can perform a maximum of 2 activities. So this cannot be a solution to this problem. We should try a different approach.
The solution

Sort the activities by ending time, that means the activity that finishes first comes first. The algorithm is given below:

1. Sort: activities by their end times
2. Perform the first activity from the sorted list of activities
3. Set: Current_activity := first activity
4. Set: end_time := end_time of Current_activity
5. Go to the next activity if it exists; if it does not exist, terminate
6. If the start_time of the current activity >= end_time: perform the activity, set Current_activity := this activity, and go to 4
7. Else: go to 5
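A Python sketch of this earliest-finish-time strategy, using the five classes from the first table with times written as minutes after midnight:

def select_activities(activities):
    # activities: list of (start_time, end_time) pairs
    activities = sorted(activities, key=lambda a: a[1])   # sort by ending time
    selected = []
    end_time = float('-inf')
    for start, end in activities:
        if start >= end_time:          # compatible with the last chosen activity
            selected.append((start, end))
            end_time = end
    return selected

classes = [(620, 660), (630, 690), (660, 720), (600, 690), (540, 660)]
print(select_activities(classes))   # [(620, 660), (660, 720)] -- classes 1 and 3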
These systems are made so that change-making is easy. The problem gets harder when it comes to an arbitrary money system.

General case. How to give 99 € with coins of 10 €, 7 € and 5 €? Here, giving coins of 10 € until we are left with 9 € obviously leads to no solution. Worse than that, a solution may not exist. This problem is in fact NP-hard, but acceptable solutions mixing greediness and memoization exist. The idea is to explore all the possibilities and pick the one with the minimal number of coins.
To give an amount X > 0, we choose a piece P in the money system, and then
solve the sub-problem corresponding to X-P. We try this for all the pieces of
the system. The solution, if it exists, is then the smallest path that led to 0.
Here is an OCaml recursive function corresponding to this method. It returns None if no solution exists.
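In place of the OCaml version referred to above, here is a Python sketch of the same recursive exploration; as the note below points out, adding memoization makes it much faster:

def change(amount, coins):
    # returns a shortest tuple of coins summing to amount, or None if impossible
    if amount == 0:
        return ()
    best = None
    for p in coins:
        if p <= amount:
            sub = change(amount - p, coins)        # solve the sub-problem for amount - p
            if sub is not None and (best is None or len(sub) + 1 < len(best)):
                best = (p,) + sub
    return best

print(change(27, [10, 7, 5]))   # (10, 10, 7)
print(change(4, [10, 7, 5]))    # None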
Note: We can remark that this procedure may compute several times the
change set for the same value. In practice, using memoization to avoid these
repetitions leads to faster (way faster) results.
Chapter 18: Applications of Greedy
technique
Section 18.1: Offline Caching
The caching problem arises from the limitation of finite space. Let's assume our cache C has k pages. Now we want to process a sequence of m item requests, which must have been placed in the cache before they are processed. Of course, if m <= k then we just put all the elements in the cache and it will work, but usually m >> k.

We say a request is a cache hit when the item is already in the cache, otherwise it is called a cache miss. In that case, we must bring the requested item into the cache and evict another, assuming the cache is full. The goal is an eviction schedule that minimizes the number of evictions.
There are numerous greedy strategies for this problem, lets look at some:
1. First in, first out (FIFO): The oldest page gets evicted
2. Last in, first out (LIFO): The newest page gets evicted
3. Least recently used (LRU): Evict the page whose most recent access was earliest
4. Least frequently requested (LFU): Evict the page that was least frequently requested
5. Longest forward distance (LFD): Evict the page in the cache that is not requested until farthest in the future
Attention: For the following examples we evict the page with the smallest
index, if more than one page could be evicted.
Example (FIFO)
Let the cache size be k=3 the initial cache a,b,c and the request
a,a,d,e,b,b,a,c,f,d,e,a,f,b,e,c:
Request      a  a  d  e  b  b  a  c  f  d  e  a  f  b  e  c
cache 1      a  a  d  d  d  d  a  a  a  d  d  d  f  f  f  c
cache 2      b  b  b  e  e  e  e  c  c  c  e  e  e  b  b  b
cache 3      c  c  c  c  b  b  b  b  f  f  f  a  a  a  e  e
cache miss         x  x  x     x  x  x  x  x  x  x  x  x  x
Thirteen cache misses by sixteen requests does not sound very optimal, lets
try the same example with another strategy:
Example (LFD)
Let the cache size be k=3 the initial cache a,b,c and the request
a,a,d,e,b,b,a,c,f,d,e,a,f,b,e,c:
Request      a  a  d  e  b  b  a  c  f  d  e  a  f  b  e  c
cache 1      a  a  d  e  e  e  e  e  e  e  e  e  e  e  e  c
cache 2      b  b  b  b  b  b  a  a  a  a  a  a  f  f  f  f
cache 3      c  c  c  c  c  c  c  c  f  d  d  d  d  b  b  b
cache miss         x  x        x     x  x        x  x     x
Eight cache misses is a lot better.
Self-test: Do the example for LIFO, LFU and LRU and see what happens.
The following example program (written in C++) consists of two parts:

The skeleton is an application which solves the problem depending on the chosen greedy strategy:

The basic idea is simple: for every request I have two calls to my strategy:
1. apply: The strategy has to tell the caller which page to use
2. update: After the caller uses the place, it tells the strategy whether it was
a miss or not. Then the strategy may update its internal data. The
strategy LFU for example has to update the hit frequency for the cache
pages, while the LFD strategy has to recalculate the distances for the
cache pages.
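The C++ program itself is not reproduced here; a compact Python sketch of the same apply/update idea, with a FIFO strategy plugged in (the interface and names are my own, not the original program's), could be:

def simulate(cache, requests, strategy):
    # cache: list of pages currently cached; requests: the sequence of requested pages
    misses = 0
    for request in requests:
        if request in cache:
            strategy.update(cache, request, miss=False)
        else:
            slot = strategy.apply(cache)          # strategy picks the slot to evict
            cache[slot] = request
            strategy.update(cache, request, miss=True)
            misses += 1
    return misses

class Fifo:
    def __init__(self, size):
        self.age = list(range(size - 1, -1, -1))  # initial pages: slot 0 is the oldest

    def apply(self, cache):
        return self.age.index(max(self.age))      # evict the oldest page

    def update(self, cache, request, miss):
        if miss:
            slot = cache.index(request)           # the slot that was just refilled
            for i in range(len(self.age)):
                self.age[i] += 1                  # all other pages get older
            self.age[slot] = 0                    # the new page is the youngest

cache = ['a', 'b', 'c']
requests = list('aadebbacfdeafbec')
print(simulate(cache, requests, Fifo(len(cache))))   # 13 cache misses, as in the table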
Now let's look at example implementations for our five strategies:
FIFO
FIFO just needs the information of how long a page has been in the cache (and of course only relative to the other pages). So the only thing to do is wait for a miss and then make the pages which were not evicted older. For our example above the program solution is:

That's exactly the solution from above.
LIFO
The implementation of LIFO is more or less the same as FIFO, but we evict the youngest, not the oldest, page. The program results are:
LRU
In the case of LRU the strategy is independent of what is in the cache page; its only interest is the last usage. The program results are:
LFU
LFU evicts the page used least often. So the update strategy is just to count every access. Of course, after a miss the count resets. The program results are:
LFD
The LFD strategy is different from all the previous ones. It's the only strategy that uses the future requests for its decision about which page to evict. The implementation uses the function calcNextUse to get the page whose next use is farthest away in the future. The program solution is equal to the solution by hand from above:
The greedy strategy LFD is indeed the only optimal strategy of the five
presented. The proof is rather long and can be found here or in the book by
Jon Kleinberg and Eva Tardos (see sources in remarks down below).
Algorithm vs Reality
The LFD strategy is optimal, but there is a big problem: it's an optimal offline solution. In practice, caching is usually an online problem, which means the strategy is useless because we cannot know the next time we will need a particular item. The other four strategies are also online strategies. For online problems we need a fundamentally different approach.
Section 18.2: Ticket automat
First simple Example:
You have a ticket automat which gives exchange in coins with values 1, 2, 5,
10 and 20. The dispension of the exchange can be seen as a series of coin
drops until the right value is dispensed. We say a dispension is optimal when
its coin count is minimal for its value.
Let M in [1,50] be the price of the ticket T and P in [1,50] the money somebody paid for T, with P >= M.

Let D = P - M.
We define the benefit of a step as the difference between D and D-c, with c being the coin the automat dispenses in this step.
The Greedy Technique for the exchange is the following pseudo-algorithmic approach:

Step 1: while D >= 20 dispense a 20 coin and set D = D - 20
Step 2: while D >= 10 dispense a 10 coin and set D = D - 10
Step 3: while D >= 5  dispense a 5 coin  and set D = D - 5
Step 4: while D >= 2  dispense a 2 coin  and set D = D - 2
Step 5: while D >= 1  dispense a 1 coin  and set D = D - 1

Afterwards the sum of all dispensed coins clearly equals D. It's a greedy algorithm because after each step and after each repetition of a step the benefit is maximized; we cannot dispense another coin with a higher benefit.
Now the ticket automat as program (in C++):
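In place of the C++ program, here is a rough Python sketch of the same greedy dispenser:

def dispense(d, coins=(20, 10, 5, 2, 1)):
    dispensed = []
    for coin in coins:                 # largest coin first = largest benefit
        while d >= coin:
            dispensed.append(coin)
            d -= coin
    return dispensed

print(dispense(37))   # [20, 10, 5, 2]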
Be aware there is no input checking, to keep the example simple. One example output:
As long as 1 is among the coin values, we know that the algorithm will terminate, because:

D strictly decreases with every step
D is never > 0 and smaller than the smallest coin 1 at the same time
But the algorithm has two pitfalls:
1. Let C be the biggest coin value. The runtime is only polynomial as long
as D/C is polynomial, because the representation of D uses only log D bits
and the runtime is at least linear in D/C.
2. In every step our algorithm chooses the local optimum. But this is not
sufficient to say that the algorithm finds the global optimal solution (see
more information here or in the Book of Korte and Vygen).
A simple counterexample: the coins are 1, 3, 4 and D = 6. The optimal solution is clearly two coins of value 3, but greedy chooses 4 in the first step, so it has to choose 1 in steps two and three. So it gives no optimal solution. A possible optimal algorithm for this example is based on dynamic programming.
Section 18.3: Interval Scheduling
We have a set of jobs J = {a, b, c, d, e, f, g}. Let j in J be a job; then it starts at sj and ends at fj. Two jobs are compatible if they don't overlap. A picture as an example:
Which leaves us with earliest finish time. The pseudo code is quite simple:

1. Sort jobs by finish time so that f1 <= f2 <= ... <= fn
2. Let A be an empty set
3. For j = 1 to n: if j is compatible with all jobs in A, set A = A + {j}
4. A is a maximum subset of mutually compatible jobs
Or as C++ program:
The implementation of the algorithm is clearly in Θ (n^2). There is a Θ (n
log n) implementation and the interested reader may continue reading below
(Java Example).
Now we have a greedy algorithm for the interval scheduling problem, but is it
optimal?
Proposition: The greedy algorithm earliest finish time is optimal.
Proof:(by contradiction)
Assume greedy is not optimal, and let i1, i2, ..., ik denote the set of jobs selected by greedy. Let j1, j2, ..., jm denote the set of jobs in an optimal solution with i1 = j1, i2 = j2, ..., ir = jr for the largest possible value of r.

The job i(r+1) exists and finishes before j(r+1) (earliest finish). But then j1, j2, ..., jr, i(r+1), j(r+2), ..., jm is also an optimal solution, and for all k in [1, r+1] we have jk = ik. That is a contradiction to the maximality of r. This concludes the proof.
This second example demonstrates that there are usually many possible
greedy strategies but only some or even none might find the optimal solution
in every instance.
Below is a Java program that runs in Θ (n log n)
import java.util.Arrays;
import java.util.Comparator;

class Job
{
    int start, finish, profit;

    Job(int start, int finish, int profit)
    {
        this.start = start;
        this.finish = finish;
        this.profit = profit;
    }
}

// comparator used by Arrays.sort below; orders jobs by ascending finish time
class JobComparator implements Comparator<Job>
{
    public int compare(Job a, Job b)
    {
        return a.finish - b.finish;
    }
}

public class WeightedIntervalScheduling
{
    // finds the rightmost job (by index) that finishes before jobs[index] starts
    static public int binarySearch(Job jobs[], int index)
    {
        int lo = 0, hi = index - 1;
        while (lo <= hi)
        {
            int mid = (lo + hi) / 2;
            if (jobs[mid].finish <= jobs[index].start)
            {
                if (jobs[mid + 1].finish <= jobs[index].start)
                    lo = mid + 1;
                else
                    return mid;
            }
            else
                hi = mid - 1;
        }
        return -1;
    }

    static public int schedule(Job jobs[])
    {
        Arrays.sort(jobs, new JobComparator());

        int n = jobs.length;
        int table[] = new int[n];
        table[0] = jobs[0].profit;

        for (int i = 1; i < n; i++)
        {
            int inclProf = jobs[i].profit;
            int l = binarySearch(jobs, i);
            if (l != -1)
                inclProf += table[l];
            table[i] = Math.max(inclProf, table[i - 1]);
        }
        return table[n - 1];
    }
}
Section 18.4: Minimizing Lateness

The greedy strategy earliest deadline first sorts the jobs by their deadlines and schedules them back to back:
1. Sort jobs by deadline so that d1 <= d2 <= ... <= dn
2. set t = 0
3. for j = 1 to n: assign job j to the interval [t, t + tj], set sj = t and fj = t + tj, and set t = t + tj
4. return intervals [s1,f1], [s2,f2], ..., [sn,fn]
And as implementation in C++:
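The original C++ listing is not reproduced here; a small sketch of the earliest deadline first schedule, assuming every job has a processing time t and a deadline d (the job data below is made up):

#include <algorithm>
#include <iostream>
#include <vector>

struct Job { int t, d; };   // processing time and deadline

int main()
{
    std::vector<Job> jobs = { {3,6}, {2,8}, {1,9}, {4,9}, {3,14}, {2,15} }; // example data

    // earliest deadline first
    std::sort(jobs.begin(), jobs.end(),
              [](const Job& a, const Job& b) { return a.d < b.d; });

    int t = 0, maxLateness = 0;
    for (const Job& j : jobs)
    {
        int s = t, f = t + j.t;                       // interval [s, f) assigned to the job
        maxLateness = std::max(maxLateness, f - j.d); // lateness of this job (0 if on time)
        t = f;
        std::cout << "[" << s << "," << f << ") deadline " << j.d << "\n";
    }
    std::cout << "maximum lateness = " << maxLateness << "\n";
    return 0;
}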
And the output for this program is:
That means the lateness after the swap is less than or equal to the lateness
before. This concludes the proof.
Proposition: The earliest deadline first schedule S is optimal.
Proof:(by contradiction)
Let's assume S* is an optimal schedule with the fewest possible number of
inversions. We can assume that S* has no idle time. If S* has no inversions,
then S = S* and we are done. If S* has an inversion, then it has an adjacent
inversion. The last proposition states that we can swap the adjacent inversion
without increasing lateness but with decreasing the number of inversions.
This contradicts the definition of S*.
The minimizing lateness problem and its closely related minimum makespan
problem, where the question for a minimal schedule is asked, have lots of
applications in the real world. But usually you don't have only one machine
but many, and they handle the same task at different rates. These problems
become NP-complete really fast.
Another interesting question arises if we don't look at the offline problem,
where we have all tasks and data at hand but at the online variant, where
tasks appear during execution.
Chapter 19: Prim's Algorithm
Section 19.1: Introduction To Prim's
Algorithm
Let's say we have 8 houses. We want to set up telephone lines between these
houses. The edges between the houses represent the cost of setting up the line
between two houses.
Our task is to set up lines in such a way that all the houses are connected and
the cost of setting up the whole connection is minimum. Now how do we
find that out? We can use Prim's Algorithm.
Prim's Algorithm is a greedy algorithm that finds a minimum spanning tree
for a weighted undirected graph. This means it finds a subset of the edges
that forms a tree that includes every node, where the total weight of all the
edges in the tree is minimized. The algorithm was developed in 1930 by
Czech mathematician Vojtěch Jarník and later rediscovered and republished
by computer scientist Robert Clay Prim in 1957 and Edsger Wybe Dijkstra in
1959. It is also known as the DJP algorithm, Jarník's algorithm, the Prim-Jarník
algorithm or the Prim-Dijkstra algorithm.
Now let's look at the technical terms first. If we create a graph, S using some
nodes and edges of an undirected graph G, then S is called a subgraph of the
graph G. Now S will be called a Spanning Tree if and only if:
It contains all the nodes of G.
It is a tree, that means there is no cycle and all the nodes are connected.
There are (n-1) edges in the tree, where n is the number of nodes in G.
There can be many Spanning Trees of a graph. The Minimum Spanning
Tree of a weighted undirected graph is a tree such that the sum of the weight of
the edges is minimum. Now we'll use Prim's algorithm to find out the
minimum spanning tree, that is, how to set up the telephone lines in our
example graph in such a way that the cost of the setup is minimum.
At first we'll select a source node. Let's say, node-1 is our source. Now we'll
add the edge from node-1 that has the minimum cost to our subgraph. Here
we mark the edges that are in the subgraph using the color blue. Here 1-5 is
our desired edge.
Now we consider all the edges from node-1 and node-5 and take the
minimum. Since 1-5 is already marked, we
take 1-2.
This time, we consider node-1, node-2 and node-5 and take the minimum
edge which is 5-4.
The next step is important. From node-1, node-2, node-5 and node-4, the
minimum edge is 2-4. But if we select that one, it'll create a cycle in our
subgraph. This is because node-2 and node-4 are already in our subgraph. So
taking edge 2-4 doesn't benefit us. We'll select the edges in such way that it
adds a new node in our subgraph. So we select edge 4-8.
If we continue this way, we'll select edge 8-6, 6-7 and 4-3. Our subgraph will
look like:
This is our desired subgraph, that'll give us the minimum spanning tree. If we
remove the edges that we didn't
select, we'll get:
This is our minimum spanning tree (MST). So the cost of setting up the
telephone connections is: 4 + 2 + 5 + 11 + 9 + 2 + 1 = 34. And the set of
houses and their connections are shown in the graph. There can be multiple
MSTs of a graph; which one we get here depends on the source node we choose.
The pseudo-code of the algorithm is given below:
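The book's pseudo-code is not reproduced here; a C++ sketch of the naive approach, assuming a connected graph given as an adjacency matrix cost[][] with INF marking a missing edge and node 0 as the source:

#include <vector>

// naive Prim: cost is a V x V adjacency matrix, INF marks a missing edge
int primNaive(const std::vector<std::vector<int>>& cost)
{
    int V = cost.size();
    const int INF = 1e9;
    std::vector<bool> inTree(V, false);
    inTree[0] = true;                      // start from node 0
    int total = 0;

    for (int added = 1; added < V; ++added)
    {
        int bestU = -1, bestV = -1, best = INF;
        // cheapest edge from the tree built so far to a node outside it
        for (int u = 0; u < V; ++u)
            if (inTree[u])
                for (int v = 0; v < V; ++v)
                    if (!inTree[v] && cost[u][v] < best)
                    { best = cost[u][v]; bestU = u; bestV = v; }
        inTree[bestV] = true;              // assumes the graph is connected
        total += best;
    }
    return total;                          // weight of the minimum spanning tree
}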
Complexity:
Time complexity of the above naive approach is O(V ² ). It uses adjacency
matrix. We can reduce the complexity using priority queue. When we add a
new node to Vnew, we can add its adjacent edges in the priority queue. Then
pop the minimum weighted edge from it. Then the complexity will be:
O(ElogE), where E is the number of edges.
Again a Binary Heap can be constructed to reduce the complexity to
O(ElogV).
The pseudo-code using Priority Queue is given below:
Here key[] stores the minimum cost of traversing node-v. parent[] is used to
store the parent node. It is useful for traversing and printing the tree.
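The priority-queue pseudo-code itself is not shown here; a C++ sketch using std::priority_queue with the key[] and parent[] arrays described above, assuming an adjacency list adj[u] of (neighbour, weight) pairs and source 0:

#include <functional>
#include <queue>
#include <utility>
#include <vector>

// returns the MST weight of a connected undirected graph
int primPQ(const std::vector<std::vector<std::pair<int,int>>>& adj)
{
    int V = adj.size();
    const int INF = 1e9;
    std::vector<int> key(V, INF), parent(V, -1);
    std::vector<bool> inTree(V, false);
    // min-heap of (key, vertex)
    std::priority_queue<std::pair<int,int>,
                        std::vector<std::pair<int,int>>,
                        std::greater<std::pair<int,int>>> pq;
    key[0] = 0;
    pq.push({0, 0});
    int total = 0;

    while (!pq.empty())
    {
        int u = pq.top().second; pq.pop();
        if (inTree[u]) continue;           // skip outdated heap entries
        inTree[u] = true;
        total += key[u];
        for (auto [v, w] : adj[u])
            if (!inTree[v] && w < key[v])  // found a cheaper edge into v
            {
                key[v] = w;
                parent[v] = u;
                pq.push({key[v], v});
            }
    }
    return total;
}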
Below is a simple program in Java:
public class Graph {
    static final int infinite = 9999999;
    int[][] LinkCost;
    int NNodes;

    Graph(int[][] mat) {
        int i, j;
        NNodes = mat.length;
        LinkCost = new int[NNodes][NNodes];
        for (i = 0; i < NNodes; i++)
            for (j = 0; j < NNodes; j++) {
                LinkCost[i][j] = mat[i][j];
                if (LinkCost[i][j] == 0)
                    LinkCost[i][j] = infinite;          // 0 means "no edge"
            }
        for (i = 0; i < NNodes; i++) {                  // print the cost matrix
            for (j = 0; j < NNodes; j++)
                if (LinkCost[i][j] < infinite)
                    System.out.print(" " + LinkCost[i][j] + " ");
                else
                    System.out.print(" * ");
            System.out.println();
        }
    }

    // index of the first node not yet reached, or -1 if all are reached
    public int unReached(boolean[] r) {
        for (int i = 0; i < r.length; i++)
            if (!r[i]) return i;
        return -1;
    }

    public void Prim() {
        int i, j, k, x, y;
        boolean[] Reached = new boolean[NNodes];
        int[] predNode = new int[NNodes];
        Reached[0] = true;                              // start from node 0
        for (k = 1; k < NNodes; k++) Reached[k] = false;
        predNode[0] = 0;
        printReachSet(Reached);
        for (k = 1; k < NNodes; k++) {
            x = y = 0;
            // cheapest edge leaving the reached set
            for (i = 0; i < NNodes; i++)
                for (j = 0; j < NNodes; j++)
                    if (Reached[i] && !Reached[j] && LinkCost[i][j] < LinkCost[x][y]) {
                        x = i; y = j;
                    }
            System.out.println("Min cost edge: (" + x + "," + y + ")"
                    + " cost = " + LinkCost[x][y]);
            predNode[y] = x;
            Reached[y] = true;
            printReachSet(Reached);
            System.out.println();
        }
        for (i = 0; i < NNodes; i++)
            System.out.println(predNode[i] + " --> " + i);
    }

    void printReachSet(boolean[] Reached) {
        System.out.print("ReachSet = ");
        for (int i = 0; i < Reached.length; i++)
            if (Reached[i]) System.out.print(i + " ");
        System.out.println();
    }

    public static void main(String[] args) {
        // small example matrix (not the 8-house graph from the text)
        int[][] mat = { { 0, 4, 0, 6 },
                        { 4, 0, 3, 0 },
                        { 0, 3, 0, 2 },
                        { 6, 0, 2, 0 } };
        Graph g = new Graph(mat);
        g.Prim();
    }
}
Compile: javac Graph.java
Output:
Chapter 20: Bellman–Ford
Algorithm
Section 20.1: Single Source Shortest
Path Algorithm (Given there is a
negative cycle in a graph)
Before reading this example, it is required to have a brief idea on edge-
relaxation. You can learn it from here
Bellman-Ford Algorithm computes the shortest paths from a single source
vertex to all of the other vertices in a weighted digraph. Even though it is
slower than Dijkstra's Algorithm, it works in cases where the weight of
an edge is negative and it also finds negative weight cycles in the graph. The
problem with Dijkstra's Algorithm is that, if there's a negative cycle, you keep
going through the cycle again and again and keep reducing the distance
between two vertices.
The idea of this algorithm is to go through all the edges of this graph one-by-
one in some random order. It can be any random order. But you must ensure,
if u-v (where u and v are two vertices in a graph) is one of your orders, then
there must be an edge from u to v. Usually it is taken directly from the order
of the input given. Again, any random order will work.
After selecting the order, we will relax the edges according to the relaxation
formula. For a given edge u-v going from u to v the relaxation formula is:
if d[u] + cost[u][v] < d[v], then d[v] := d[u] + cost[u][v].
That is, if the distance from source to any vertex u plus the weight of the edge
u-v is less than the distance from source to another vertex v, we update the
distance from source to v. We need to relax the edges at most (V-1) times,
where V is the number of vertices in the graph. Why (V-1) you ask? We'll
explain it in another example. Also we are going to keep track of the parent
vertex of any vertex, that is, when we relax an edge, we will set parent[v] := u.
It means we've found another shorter path to reach v via u. We will need this
later to print the shortest path from source to the destined vertex.
Let's look at an example. We have a graph:
We have selected 1 as the source vertex. We want to find out the shortest
path from the source to all other vertices.
At first, d[1] = 0 because it is the source. And rest are infinity, because we
don't know their distance yet.
We will relax the edges in this sequence:
+--------+--------+--------+--------+--------+--------+--------+
| Serial | 1 | 2 | 3 | 4 | 5 | 6 |
+--------+--------+--------+--------+--------+--------+--------+
| Edge | 4->5 | 3->4 | 1->3 | 1->4 | 4->6 | 2->3 |
+--------+--------+--------+--------+--------+--------+--------+
You can take any sequence you want. If we relax the edges once, what do we
get? We get the distance from source to all other vertices of the path that
uses at most 1 edge. Now let's relax the edges and update the values of d[].
We get:
1. d[4] + cost[4][5] = infinity + 7 = infinity. We can't update this one.
2. d[3] + cost[3][4] = infinity. We can't update this one.
3. d[1] + cost[1][3] = 0 + 2 = 2 < d[3]. So d[3] = 2. Also parent[3] = 1.
4. d[1] + cost[1][4] = 4 < d[4]. So d[4] = 4. parent[4] = 1.
5. d[4] + cost[4][6] = 9 < d[6]. So d[6] = 9. parent[6] = 4.
6. d[2] + cost[2][3] = infinity. We can't update this one.
Our second iteration will provide us with the paths using at most 2 edges. We get:
1. d[4] + cost[4][5] = 11 < d[5]. d[5] = 11. parent[5] = 4.
2. d[3] + cost[3][4] = 1 < d[4]. d[4] = 1. parent[4] = 3.
3. d[3] remains unchanged.
4. d[4] remains unchanged.
5. d[4] + cost[4][6] = 6 < d[6]. d[6] = 6. parent[6] = 4.
6. d[3] remains unchanged.
Our graph will look like:
Our 3rd iteration will only update vertex 5, where d[5] will be 8. Our graph
will look like:
After this no matter how many iterations we do, we'll have the same
distances. So we will keep a flag that checks if any update takes place or not.
If it doesn't, we'll simply break the loop. Our pseudo-code will be:
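The pseudo-code itself is not reproduced here; a C++ sketch that follows the description (edge list, d[] distances, parent[] and the early-exit flag) might look like this:

#include <vector>

struct Edge { int u, v, w; };

// single-source shortest paths from s; d[] and parent[] are filled in
void bellmanFord(int V, const std::vector<Edge>& edges, int s,
                 std::vector<long long>& d, std::vector<int>& parent)
{
    const long long INF = 1e18;
    d.assign(V + 1, INF);
    parent.assign(V + 1, -1);              // -1 plays the role of NULL here
    d[s] = 0;

    for (int i = 1; i <= V - 1; ++i)       // relax all edges at most (V-1) times
    {
        bool flag = false;                 // did any relaxation happen in this pass?
        for (const Edge& e : edges)
            if (d[e.u] != INF && d[e.u] + e.w < d[e.v])
            {
                d[e.v] = d[e.u] + e.w;
                parent[e.v] = e.u;
                flag = true;
            }
        if (!flag) break;                  // no update, distances are final
    }
}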
To keep track of negative cycle, we can modify our code using the procedure
described here. Our completed pseudo-code will be:
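Sticking with the sketch above, the modification amounts to one extra pass over all edges after the (V-1) iterations; this is only one possible way to lay it out:

// after the (V-1) relaxation passes of bellmanFord():
bool hasNegativeCycle = false;
for (const Edge& e : edges)
    if (d[e.u] != INF && d[e.u] + e.w < d[e.v])
        hasNegativeCycle = true;           // an edge can still be relaxed: negative cycle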
Printing Path:
To print the shortest path to a vertex, we'll iterate back to its parent until we
find NULL and then print the vertices.
The pseudo-code will be:
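A matching C++ sketch for the path printing, walking back through the parent[] array filled by the sketch above until the -1 / NULL marker is reached:

#include <iostream>
#include <vector>

void printPath(const std::vector<int>& parent, int v)
{
    if (v == -1) return;                   // walked past the source
    printPath(parent, parent[v]);          // print the ancestors first
    std::cout << v << " ";
}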
Complexity:
Since we need to relax the edges maximum (V-1) times, the time complexity
of this algorithm will be equal to O(V * E) where E denotes the number of
edges, if we use adjacency list to represent the graph. However, if adjacency matrix
is used to represent the graph, time complexity will be O(V^3). Reason is
we can iterate through all edges in O(E) time when adjacency list is used, but it
takes O(V^2) time when adjacency matrix is used.
Section 20.2: Detecting Negative Cycle
in a Graph
To understand this example, it is recommended to have a brief idea about
Bellman-Ford algorithm which can be found here
Using Bellman-Ford algorithm, we can detect if there is a negative cycle in
our graph. We know that, to find out the shortest path, we need to relax all
the edges of the graph (V-1) times, where V is the number of vertices in a
graph. We have already seen that in this example, after (V-1) iterations, we
can't update d[], no matter how many iterations we do. Or can we?
If there is a negative cycle in a graph, even after (V-1) iterations, we can
update d[]. This happens because for every iteration, traversing through the
negative cycle always decreases the cost of the shortest path. This is why the
Bellman-Ford algorithm limits the number of iterations to (V-1). If we used
Dijkstra's Algorithm here, we'd be stuck in an endless loop. However, let's
concentrate on finding negative cycle.
Let's assume, we have a graph:
Here, the source vertex is 1. We will find out the shortest distance between
the source and all the other vertices. We can clearly see that, to reach vertex
4, in the worst case, it'll take (V-1) edges. Now depending on the order in
which the edges are discovered, it might take (V-1) times to discover vertex
4. Didn't get it? Let's use Bellman-Ford algorithm to find out the shortest
path here:
We're going to use this sequence:
Second iteration:
1. d[3] + cost[3][4] = infinity. It won't change anything.
2. d[2] + cost[2][3] = 5 < d[3]. d[3] = 5. parent[3] = 2.
3. It won't be changed.
This time the relaxation process changed d[3]. Our graph will look like:
Third iteration:
1. d[3] + cost[3][4] = 7 < d[4]. d[4] = 7. parent[4] = 3.
2. It won't be changed.
3. It won't be changed.
Our third iteration finally found out the shortest path to 4 from 1. Our graph
will look like:
So, it took 3 iterations to find out the shortest path. After this one, no matter
how many times we relax the edges, the values in d[] will remain the same.
Now, if we considered another sequence:
We'd get:
1. d[1] + cost[1][2] = 2 < d[2]. d[2] = 2.
2. d[2] + cost[2][3] = 5 < d[3]. d[3] = 5.
3. d[3] + cost[3][4] = 7 < d[4]. d[4] = 7.
Our very first iteration has found the shortest path from source to all the
other nodes. Another sequence 1->2, 3->4, 2->3 is possible, which will give
us shortest path after 2 iterations. We can come to the decision that, no matter
how we arrange the sequence, it won't take more than 3 iterations to find out
shortest path from the source in this example.
We can conclude that, for the best case, it'll take 1 iteration to find out the
shortest path from source. For the worst case, it'll take (V-1) iterations,
which is why we repeat the process of relaxation (V-1) times.
Chapter 21: Line Algorithm
Line drawing is accomplished by calculating intermediate positions
along the line path between two specified endpoint positions. An output
device is then directed to fill in these positions between the endpoints.
Section 21.1: Bresenham Line
Drawing Algorithm
Background Theory: Bresenham's Line Drawing Algorithm is an efficient
and accurate raster line generating algorithm developed by Bresenham. It
involves only integer calculations, so it is accurate and fast. It can also be
extended to display circles and other curves.
In Bresenham line drawing algorithm:
For Slope |m|<1:
Either the value of x is increased
OR both x and y are increased using the decision parameter.
For Slope |m|>1:
Either the value of y is increased
OR both x and y are increased using the decision parameter.
Algorithm for slope |m|<1:
1. Input two end points (x1,y1) and (x2,y2) of the line.
2. Plot the first point (x1,y1).
3. Calculate
Delx =| x2 – x1 |
Dely = | y2 – y1 |
4. Obtain the initial decision parameter as P = 2 * dely – delx
5. For I = 0 to delx in step of 1
If p < 0 then
X1 = x1 + 1
Plot(x1,y1)
P = p+ 2dely
Else
X1 = x1 + 1
Y1 = y1 + 1
Plot(x1,y1)
P = p + 2dely – 2 * delx
End if
End for
6. END
Source Code:
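The book's listing is not included here; a small C++ sketch of the |m| < 1 case following the steps above, where setPixel() stands in for whatever plotting routine is available and the line is assumed to run left to right and upward:

#include <cstdlib>
#include <iostream>

void setPixel(int x, int y) { std::cout << "(" << x << "," << y << ")\n"; } // placeholder plot

void bresenhamLowSlope(int x1, int y1, int x2, int y2)   // assumes x1 < x2 and 0 <= m < 1
{
    int delx = std::abs(x2 - x1);
    int dely = std::abs(y2 - y1);
    int p = 2 * dely - delx;              // initial decision parameter
    setPixel(x1, y1);
    for (int i = 0; i < delx; ++i)
    {
        x1 = x1 + 1;
        if (p < 0)
            p = p + 2 * dely;
        else
        {
            y1 = y1 + 1;                  // both x and y advance in this branch
            p = p + 2 * dely - 2 * delx;
        }
        setPixel(x1, y1);
    }
}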
Algorithm for slope |m|>1:
1. Input two end points (x1,y1) and (x2,y2) of the line.
2. Plot the first point (x1,y1).
3. Calculate
Delx =| x2 – x1 |
Dely = | y2 – y1 |
4. Obtain the initial decision parameter as P = 2 * delx – dely
5. For I = 0 to dely in step of 1
If p < 0 then
Y1 = y1 + 1
Plot(x1,y1)
P = p + 2delx
Else
X1 = x1 + 1
Y1 = y1 + 1
Plot(x1,y1)
P = p + 2delx – 2 * dely
End if
End for
6. END
Source Code:
Chapter 22: Floyd-Warshall
Algorithm
Section 22.1: All Pair Shortest Path
Algorithm
Floyd-Warshall's algorithm is for finding shortest paths in a weighted graph
with positive or negative edge weights. A single execution of the algorithm
will find the lengths (summed weights) of the shortest paths between all pairs
of vertices. With a little variation, it can print the shortest path and can detect
negative cycles in a graph. Floyd-Warshall is a Dynamic-Programming
algorithm.
Let's look at an example. We're going to apply Floyd-Warshall's algorithm on
this graph:
First thing we do is, we take two 2D matrices. These are adjacency matrices.
The size of the matrices is going to be the total number of vertices. For our
graph, we will take 4 * 4 matrices. The Distance Matrix is going to store the
minimum distance found so far between two vertices. At first, for the edges,
if there is an edge between u-v and the distance/weight is w, we'll store:
distance[u][v] = w. For all the edges that don't exist, we're going to put
infinity. The Path Matrix is for regenerating the minimum distance path
between two vertices. So initially, if there is a path between u and v, we're
going to put path[u][v] = u. This means the best way to come to vertex-v
from vertex-u is to use the edge that connects v with u. If there is no path
between two vertices, we're going to put N there indicating there is no path
available now. The two tables for our graph will look like:
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| | 1 | 2 | 3 | 4 | | | 1 | 2 | 3 | 4 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 1 | 0 | 3 | 6 | 15 | | 1 | N | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 2 | inf | 0 | -2 | inf | | 2 | N | N | 2 | N |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 3 | inf | inf | 0 | 2 | | 3 | N | N | N | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 4 | 1 | inf | inf | 0 | | 4 | 4 | N | N | N |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
            distance                        path
Since there is no loop, the diagonals are set to N. And the distance from a
vertex to itself is 0.
To apply Floyd-Warshall algorithm, we're going to select a middle vertex k.
Then for each vertex i, we're going to check if we can go from i to k and then
k to j, where j is another vertex and minimize the cost of going from i to j. If
the current distance[i][j] is greater than distance[i][k] + distance[k][j],
we're going to put distance[i][j] equals to the summation of those two
distances. And the path[i][j] will be set to path[k][j], as it is better to go
from i to k, and then k to j. All the vertices will be selected as k. We'll have
3 nested loops: for k going from 1 to 4, i going from 1 to 4 and j going from
1 to 4. We're going to check:
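In code form, the check inside the three nested loops would be something like the following, with distance and path as described above:

if (distance[i][k] + distance[k][j] < distance[i][j])
{
    distance[i][j] = distance[i][k] + distance[k][j];
    path[i][j] = path[k][j];      // it is better to go from i to k, then k to j
}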
So what we're basically checking is, for every pair of vertices, do we get a
shorter distance by going through another vertex? The total number of
operations for our graph will be 4 * 4 * 4 = 64. That means we're going to do
this check 64 times. Let's look at a few of them:
When k = 1, i = 2 and j = 3, distance[i][j] is -2, which is not greater than
distance[i][k] + distance[k][j] = infinity + 6 = infinity. So it will remain unchanged.
Again, when k = 1, i = 4 and j = 2, distance[i][j] = infinity, which is greater
than distance[i][k] + distance[k][j] = 1 + 3 = 4. So we put distance[i][j] =
4, and we put path[i][j] = path[k][j] = 1. What this means is, to go from
vertex-4 to vertex-2, the path 4->1->2 is shorter than the existing path. This
is how we populate both matrices. The calculation for each step is shown
here. After making necessary changes, our matrices will look like:
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| | 1 | 2 | 3 | 4 | | | 1 | 2 | 3 | 4 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 1 | 0 | 3 | 1 | 3 | | 1 | N | 1 | 2 | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 2 | 1 | 0 | -2 | 0 | | 2 | 4 | N | 2 | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 3 | 3 | 6 | 0 | 2 | | 3 | 4 | 1 | N | 3 |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
| 4 | 1 | 4 | 2 | 0 | | 4 | 4 | 1 | 2 | N |
+-----+-----+-----+-----+-----+ +-----+-----+-----+-----+-----+
            distance                        path
This is our shortest distance matrix. For example, the shortest distance from 1
to 4 is 3 and the shortest distance between 4 to 3 is 2. Our pseudo-code will
be:
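The pseudo-code itself is not reproduced here; a compact C++ sketch over 1-indexed n x n matrices could be the following (the value used for infinity should be large but small enough that adding two of them does not overflow):

#include <vector>

// d and path are (n+1) x (n+1) matrices prepared as described above
void floydWarshall(int n, std::vector<std::vector<long long>>& d,
                   std::vector<std::vector<int>>& path)
{
    for (int k = 1; k <= n; ++k)
        for (int i = 1; i <= n; ++i)
            for (int j = 1; j <= n; ++j)
                if (d[i][k] + d[k][j] < d[i][j])
                {
                    d[i][j] = d[i][k] + d[k][j];
                    path[i][j] = path[k][j];   // go via k
                }
}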
Printing the path:
To print the path, we'll check the Path matrix. To print the path from u to v,
we'll start from path[u][v]. We'll keep changing v = path[u][v] until we
find path[u][v] = u and push every value of path[u][v] onto a stack. After
finding u, we'll print u and start popping items from the stack and print them.
This works because the path matrix stores the value of the vertex which
shares the shortest path to v from any other node. The pseudo-code will be:
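A C++ sketch of that procedure, using a stack as in the description and assuming a path from u to v exists (so no N entries are met along the way):

#include <iostream>
#include <stack>
#include <vector>

void printPath(const std::vector<std::vector<int>>& path, int u, int v)
{
    std::stack<int> st;
    st.push(v);
    while (path[u][v] != u)           // walk backwards from v towards u
    {
        v = path[u][v];
        st.push(v);
    }
    std::cout << u;
    while (!st.empty())               // print the stored vertices in forward order
    {
        std::cout << " -> " << st.top();
        st.pop();
    }
    std::cout << "\n";
}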
For dp[1][1] we have to check what we can do to convert a into a. It will be
0. For dp[1][2] we have to check what we can do to convert a into ab. It will
be 1 because we have to insert b. So after the 1st iteration our array will look like:
For iteration 2:
For dp[2][1] we have to check that to convert az to a we need to remove z,
hence dp[2][1] will be 1. Similarly for dp[2][2] we need to replace z with b,
hence dp[2][2] will be 1. So after the 2nd iteration our dp[] array will look like:
Time Complexity
Chapter 27: Online algorithms
Theory
Definition 1: An optimization problem Π consists of a set of instances
ΣΠ. For every instance σ∈ΣΠ there is a set Ζσ of solutions and an
objective function fσ : Ζσ → ℜ≥0 which assigns a positive real value to
every solution. We say OPT(σ) is the value of an optimal solution, A(σ)
is the solution of an algorithm A for the problem Π and
wA(σ) = fσ(A(σ)) its value.
Definition 2: An online algorithm A for a minimization problem Π has a
competitive ratio of r ≥ 1 if there is a constant τ∈ℜ with
wA(σ) ≤ r ⋅ OPT(σ) + τ
for all instances σ∈ΣΠ. If τ = 0, then A is called a strictly r-competitive online
algorithm.
Proposition 1.3: LRU and FWF are marking algorithms.
Proof: At the beginning of each phase (except for the first one) FWF has a
cache miss and clears the cache. That means we have k empty pages. In
every phase maximal k different pages are requested, so there will be no
eviction during the phase. So FWF is a marking algorithm.
Let's assume LRU is not a marking algorithm. Then there is an instance σ
where LRU evicted a marked page x in phase i. Let σt be the request in phase i
where x is evicted. Since x is marked there has to be an earlier request σt* for
x in the same phase, so t* < t. After t*, x is the cache's newest page, so to get
evicted at t the sequence σt*+1, ..., σt has to request at least k pages different
from x. That implies that phase i has requested at least k+1 different
pages, which is a contradiction to the phase definition. So LRU has to be a
marking algorithm.
Proposition 1.4: Every marking algorithm is strictly k-competitive.
Proof: Let σ be an instance for the paging problem and l the number of
phases for σ. If l = 1 then every marking algorithm is optimal and the
optimal offline algorithm cannot be better.
We assume l ≥ 2. The cost of every marking algorithm for instance σ is
bounded from above by l ⋅ k, because in every phase a marking algorithm
cannot evict more than k pages without evicting one marked page.
Now we try to show that the optimal offline algorithm evicts at least k+l-2
pages for σ: k in the first phase and at least one for every following phase
except for the last one. For the proof let's define l-2 disjoint subsequences of σ.
Subsequence i ∈ {1,...,l-2} starts at the second position of phase i+1 and
ends with the first position of phase i+2. Let x be the first page of phase i+1.
At the beginning of subsequence i there is page x and at most k-1 different
pages in the optimal offline algorithm's cache. In subsequence i there are k page
requests different from x, so the optimal offline algorithm has to evict at least
one page for every subsequence. Since at the beginning of phase 1 the cache is
still empty, the optimal offline algorithm causes k evictions during the first
phase. That shows that
which is equal to
To satisfy this inequality you just have to choose l sufficiently big. So LFU and
LIFO are not competitive.
Proposition 1.7: There is no r-competitive deterministic online algorithm
for paging with r < k.
Sources
Basic Material
1. Script Online Algorithms (german), Heiko Roeglin, University Bonn
2. Page replacement algorithm
Further Reading
1. Online Computation and Competitive Analysis by Allan Borodin and
Ran El-Yaniv
Source Code
1. Source code for offline caching
2. Source code for adversary game
Section 27.1: Paging (Online Caching)
Preface
Instead of starting with a formal definition, the goal is to approach this topic
via a series of examples, introducing definitions along the way. The remark
section Theory consists of all definitions, theorems and propositions to
give you all the information to look up specific aspects faster.
The remark section Sources consists of the basic material used for this topic
and additional information for further reading. In addition you will find the
full source code for the examples there. Please pay attention that, to make the
source code for the examples more readable and shorter, it refrains from
things like error handling etc. It also passes on some specific language
features which would obscure the clarity of the example, like extensive use of
advanced libraries etc.
Paging
The paging problem arises from the limitation of finite space. Let's assume
our cache C has k pages. Now we want to process a sequence of m page
requests which must have been placed in the cache before they are processed.
Of course if m <= k then we just put all elements in the cache and it will
work, but usually m > k.
We say a request is a cache hit when the page is already in the cache, otherwise
it's called a cache miss. In that case, we must bring the requested page into
the cache and evict another, assuming the cache is full. The goal is an
eviction schedule that minimizes the number of evictions.
There are numerous strategies for this problem, let's look at some:
1. First in, first out (FIFO): The oldest page gets evicted
2. Last in, first out (LIFO): The newest page gets evicted
3. Least recently used (LRU): Evict page whose most recent access was
earliest
4. Least frequently used (LFU): Evict page that was least frequently
requested
5. Longest forward distance (LFD): Evict the page in the cache that is not
requested until farthest in the future.
6. Flush when full (FWF): clear the cache completely as soon as a cache miss happens
There are two ways to approach this problem:
1. offline: the sequence of page requests is known ahead of time
2. online: the sequence of page requests is not known ahead of time
Offline Approach
For the first approach look at the topic Applications of Greedy technique. Its
third example, Offline Caching, considers the first five strategies from above
and gives you a good entry point for the following.
The example program was extended with the FWF strategy:
The full sourcecode is available here. If we reuse the example from the topic,
we get the following output:
Even though LFD is optimal, FWF has fewer cache misses. But the main
goal was to minimize the number of evictions and for FWF five misses mean
15 evictions, which makes it the poorest choice for this example.
Online Approach
Now we want to approach the online problem of paging. But first we need an
understanding of how to do it. Obviously an online algorithm cannot be better
than the optimal offline algorithm. But how much worse is it? We need
formal definitions to answer that question:
Definition 1.1: An optimization problem Π consists of a set of instances
ΣΠ. For every instance σ∈ΣΠ there is a set Ζσ of solutions and an
objective function fσ : Ζσ → ℜ≥0 which assigns a positive real value to
every solution. We say OPT(σ) is the value of an optimal solution, A(σ)
is the solution of an algorithm A for the problem Π and
wA(σ) = fσ(A(σ)) its value.
Definition 1.2: An online algorithm A for a minimization problem Π has a
competitive ratio of r ≥ 1 if there is a constant τ∈ℜ with
wA(σ) ≤ r ⋅ OPT(σ) + τ
for all instances σ∈ΣΠ. If τ = 0, then A is called a strictly r-competitive online
algorithm.
So the question is how competitive is our online algorithm compared to an
optimal offline algorithm. In their famous book Allan Borodin and Ran El-
Yaniv used another scenario to describe the online paging situation:
There is an evil adversary who knows your algorithm and the optimal offline
algorithm. In every step, he tries to request a page which is worst for you and
simultaneously best for the offline algorithm. The competitive factor of your
algorithm is the factor by which your algorithm is worse than the
adversary's optimal offline algorithm. If you want to try to be the adversary,
you can try the Adversary Game (try to beat the paging strategies).
Marking Algorithms
Instead of analysing every algorithm separately, let's look at a special online
algorithm family for the paging problem called marking algorithms.
Let σ = (σ1, ..., σp) be an instance for our problem and k our cache size. Then
σ can be divided into phases:
Phase 1 is the maximal subsequence of σ from the start till
maximal k different pages are requested.
Phase i ≥ 2 is the maximal subsequence of σ from the end of phase i-1 till maximal k
different pages are requested.
The first page 1 is requested l times, then page 2 and so on. At the end, there
are (l-1) alternating requests for page k and k+1.
LFU and LIFO fill their cache with pages 1-k. When page k+1 is requested,
page k is evicted and vice versa. That means every request of subsequence
(k,k+1)^(l-1) evicts one page. In addition, there are k-1 cache misses for the first
time use of pages 1-(k-1). So LFU and LIFO evict exactly k-1+2(l-1) pages.
Now we must show that for every constant τ∈ℜ and every constant r ≥ 1
there exists an l so that
which is equal to
To satisfy this inequality you just have to choose l sufficiently big. So LFU and
LIFO are not competitive.
Proposition 1.7: There is no r-competitive deterministic online algorithm
for paging with r < k.
The proof for this last proposition is rather long and based on the statement
that LFD is an optimal offline algorithm. The interested reader can look it up
in the book of Borodin and El-Yaniv (see sources below).
The Question is whether we could do better. For that, we have to leave the
deterministic approach behind us and start to randomize our algorithm.
Clearly, it's much harder for the adversary to punish your algorithm if it's
randomized.
Randomized paging will be discussed in one of next examples...
Chapter 28: Sorting
Parameter Description
Stability: A sorting algorithm is stable if it preserves the relative order of equal elements after sorting.
In place: A sorting algorithm is in-place if it sorts using only O(1) auxiliary memory (not counting the array that needs to be sorted).
Best case complexity: A sorting algorithm has a best case time complexity of O(T(n)) if its running time is at least T(n) for all possible inputs.
Average case complexity: A sorting algorithm has an average case time complexity of O(T(n)) if its running time, averaged over all possible inputs, is T(n).
Worst case complexity: A sorting algorithm has a worst case time complexity of O(T(n)) if its running time is at most T(n).
Section 28.1: Stability in Sorting
Stability in sorting means whether a sort algorithm maintains the relative
order of equal keys from the original input in the resulting output.
So a sorting algorithm is said to be stable if two objects with equal keys
appear in the same order in the sorted output as they appear in the unsorted
input array.
Consider a list of pairs:
Now we will sort the list using the first element of each pair.
A stable sorting of this list will output the below list:
Unstable sort may generate the same output as the stable sort but not always.
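As a small illustration (the pairs here are made up, not the book's original list), std::stable_sort in C++ keeps equal keys in their input order when sorting by the first element only:

#include <algorithm>
#include <iostream>
#include <vector>

int main()
{
    std::vector<std::pair<int,char>> v = { {2,'a'}, {1,'b'}, {2,'c'}, {1,'d'} };
    // sort by the first element only; equal keys keep their original order
    std::stable_sort(v.begin(), v.end(),
                     [](const auto& x, const auto& y) { return x.first < y.first; });
    for (const auto& p : v)
        std::cout << "(" << p.first << "," << p.second << ") ";   // (1,b) (1,d) (2,a) (2,c)
    std::cout << "\n";
    return 0;
}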
Well-known stable sorts:
Merge sort
Insertion sort
Radix sort
Tim sort
Bubble Sort
Well-known unstable sorts:
Heap sort
Quick sort
Chapter 29: Bubble Sort
Parameter Description
Stable: Yes
In place: Yes
Best case complexity: O(n)
Average case complexity: O(n^2)
Worst case complexity: O(n^2)
Space complexity: O(1)
Section 29.1: Bubble Sort
The BubbleSort compares each successive pair of elements in an unordered list
and inverts the elements if they are not in order.
The following example illustrates the bubble sort on the list {6,5,3,1,8,7,2,4}
(pairs that were compared in each step are encapsulated in '**'):
After one iteration through the list, we have {5,3,1,6,7,2,4,8}. Note that the
greatest unsorted value in the array (8 in this case) will always reach its final
position. Thus, to be sure the list is sorted we must iterate n-1 times for lists
of length n.
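A minimal C++ sketch of the procedure just described, using the list from the example above:

#include <iostream>
#include <vector>

void bubbleSort(std::vector<int>& a)
{
    int n = a.size();
    for (int pass = 0; pass < n - 1; ++pass)          // n-1 passes are always enough
        for (int i = 0; i + 1 < n - pass; ++i)        // the tail is already sorted
            if (a[i] > a[i + 1])
                std::swap(a[i], a[i + 1]);            // invert an out-of-order pair
}

int main()
{
    std::vector<int> a = { 6, 5, 3, 1, 8, 7, 2, 4 };
    bubbleSort(a);
    for (int x : a) std::cout << x << " ";
    std::cout << "\n";
    return 0;
}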
Graphic:
Section 29.2: Implementation in C &
C++
C Implementation
C# Merge Sort
Section 30.4: Merge Sort
Implementation in Java
Below is the implementation in Java using a generics approach. It is the
same algorithm as presented above.
Section 30.5: Merge Sort
Implementation in Python
Section 30.6: Bottoms-up Java
Implementation
Chapter 31: Insertion Sort
Section 31.1: Haskell Implementation
Chapter 32: Bucket Sort
Section 32.1: C# Implementation
Chapter 33: Quicksort
Section 33.1: Quicksort Basics
Quicksort is a sorting algorithm that picks an element ("the pivot") and
reorders the array forming two partitions such that all elements less than the
pivot come before it and all elements greater come after. The algorithm is
then applied recursively to the partitions until the list is sorted.
1. Lomuto partition scheme mechanism :
This scheme chooses a pivot which is typically the last element in the array.
The algorithm maintains the index to put the pivot in variable i and each time
it finds an element less than or equal to pivot, this index is incremented and
that element would be placed before the pivot.
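A short C++ sketch of quicksort with the Lomuto partition scheme as just described (pivot = last element); the driver data is only an illustration:

#include <algorithm>
#include <iostream>
#include <vector>

int lomutoPartition(std::vector<int>& a, int lo, int hi)
{
    int pivot = a[hi];                 // the last element is the pivot
    int i = lo;                        // position where the pivot will eventually go
    for (int j = lo; j < hi; ++j)
        if (a[j] <= pivot)
            std::swap(a[i++], a[j]);   // move smaller-or-equal elements to the front
    std::swap(a[i], a[hi]);            // put the pivot between the two partitions
    return i;
}

void quicksort(std::vector<int>& a, int lo, int hi)
{
    if (lo >= hi) return;
    int p = lomutoPartition(a, lo, hi);
    quicksort(a, lo, p - 1);           // recurse on both partitions
    quicksort(a, p + 1, hi);
}

int main()
{
    std::vector<int> a = { 5, 2, 9, 1, 7, 3 };
    quicksort(a, 0, (int)a.size() - 1);
    for (int x : a) std::cout << x << " ";
    std::cout << "\n";
    return 0;
}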
Example Question
You are an economist, a pretty bad one though. You are given the task of
finding the equilibrium price (that is, the price where supply = demand) for
rice.
Remember the higher a price is set, the larger the supply and the lesser the
demand
As your company is very efficient at calculating market forces, you can
instantly get the supply and demand in units of rice when the price of rice is
set at a certain price p.
Your boss wants the equilibrium price ASAP, but tells you that the
equilibrium price can be a positive integer that is at most 10^17 and there is
guaranteed to be exactly 1 positive integer solution in the range. So get going
with your job before you lose it!
You are allowed to call functions getSupply(k) and getDemand(k), which will do
exactly what is stated in the problem.
Example Explanation
Here our search space is from 1 to 10^17. Thus a linear search is infeasible.
However, notice that as k goes up, getSupply(k) increases and getDemand(k)
decreases. Thus, for any x > y,
getSupply(x) - getDemand(x) > getSupply(y) - getDemand(y). Therefore, this
search space is monotonic and we can use Binary Search.
The following pseudocode demonstrates the usage of Binary Search:
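A C++ sketch of that idea; getSupply() and getDemand() are the problem's black boxes, so the bodies given here are placeholders purely so the example compiles:

#include <iostream>

// placeholder implementations; the real ones come from the problem statement
long long getSupply(long long price) { return price; }
long long getDemand(long long price) { return 1000000 - price; }

long long equilibriumPrice()
{
    long long lo = 1, hi = 100000000000000000LL;      // search space is [1, 10^17]
    while (lo < hi)
    {
        long long mid = lo + (hi - lo) / 2;
        if (getSupply(mid) >= getDemand(mid))
            hi = mid;                                 // equilibrium is at mid or below
        else
            lo = mid + 1;                             // supply too small, price must rise
    }
    return lo;                                        // supply == demand here by the guarantee
}

int main()
{
    std::cout << equilibriumPrice() << "\n";
    return 0;
}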
This algorithm runs in ~O(log 10^17) time. This can be generalized to ~O(log S)
time, where S is the size of the search space, since at every iteration of the
while loop we halve the search space (from [low:high] to either [low:mid] or
[mid:high]).
C Implementation of Binary Search with Recursion
Section 39.2: Rabin Karp
The Rabin–Karp algorithm or Karp–Rabin algorithm is a string searching
algorithm that uses hashing to find any one of a set of pattern strings in a
text. Its average and best case running time is O(n+m) in space O(p), but its
worst-case time is O(nm), where n is the length of the text and m is the length
of the pattern.
Algorithm implementation in java for string matching
This pattern does exist in the text. So our substring search should return 3,
the index of the position from which this pattern starts. So how does our
brute force substring search procedure work?
What we usually do is: we start from the 0th index of the text and the 0th
index of our pattern and we compare Text[0] with Pattern[0]. Since they
are not a match, we go to the next index of our text and we compare Text[1]
with Pattern[0]. Since this is a match, we increment the index of our pattern
and the index of the Text also. We compare Text[2] with Pattern[1]. They
are also a match. Following the same procedure stated before, we now
compare Text[3] with Pattern[2]. As they do not match, we start from the
next position where we started finding the match. That is index 2 of the Text.
We compare Text[2] with Pattern[0]. They don't match. Then incrementing
index of the Text, we compare Text[3] with Pattern[0]. They match. Again
Text[4] and Pattern[1] match, Text[5] and Pattern[2] match and Text[6]
and Pattern[3] match. Since we've reached the end of our Pattern, we now
return the index from which our match started, that is 3. If our pattern was
bcgll, that is if the pattern didn't exist in our text, our search should return an
exception or -1 or any other predefined value. We can clearly see that, in the
worst case, this algorithm would take O(mn) time where m is the length of the
Text and n is the length of the Pattern. How do we reduce this time
complexity? This is where KMP Substring Search Algorithm comes into the
picture.
The Knuth-Morris-Pratt String Searching Algorithm or KMP Algorithm
searches for occurrences of a "Pattern" within a main "Text" by employing
the observation that when a mismatch occurs, the word itself embodies
sufficient information to determine where the next match could begin, thus
bypassing re-examination of previously matched characters. The algorithm
was conceived in 1970 by Donald Knuth and Vaughan Pratt and
independently by James H. Morris. The trio published it jointly in 1977.
Let's extend our example Text and Pattern for better understanding:
+-------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| Index |0 |1 |2 |3 |4 |5 |6 |7 |8 |9 |10|11|12|13|14|15|16|17|18|19|20|21|22|
+-------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| Text |a |b |c |x |a |b |c |d |a |b |x |a |b |c |d |a |b |c |d |a |b |c |y |
+-------+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
+---------+---+---+---+---+---+---+---+---+
| Index   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+---------+---+---+---+---+---+---+---+---+
| Pattern | a | b | c | d | a | b | c | y |
+---------+---+---+---+---+---+---+---+---+
At first, our Text and Pattern match till index 2. Text[3] and Pattern[3]
don't match. So our aim is to not go backwards in this Text, that is, in case
of a mismatch, we don't want our matching to begin again from the position
that we started matching with. To achieve that, we'll look for a suffix in our
Pattern right before where our mismatch occurred (substring abc), which is also a
prefix of that substring of our Pattern. For our example, since all the
characters are unique, there is no suffix that is also a prefix of our matched
substring. So what that means is, our next comparison will start from index 0.
Hold on for a bit, you'll understand why we did this. Next, we compare
Text[3] with Pattern[0] and it doesn't match. After that, for Text from index
4 to index 9 and for Pattern from index 0 to index 5, we find a match. We
find a mismatch in Text[10] and Pattern[6]. So we take the substring from
Pattern right before the point where mismatch occurs (substring abcdabc),
we check for a suffix, that is also a prefix of this substring. We can see here
ab is both the suffix and prefix of this substring. What that means is, since
we've matched until Text[10], the characters right before the mismatch is ab.
What we can infer from it is that since ab is also a prefix of the substring we
took, we don't have to check ab again and the next check can start from
Text[10] and Pattern[2]. We didn't have to look back to the whole Text, we
can start directly from where our mismatch occurred. Now we check
Text[10] and Pattern[2], since it's a mismatch, and the substring before
mismatch (abc) doesn't contain a suffix which is also a prefix, we check
Text[10] and Pattern[0], they don't match. After that for Text from index 11
to index 17 and for Pattern from index 0 to index 6. We find a mismatch in
Text[18] and Pattern[7]. So again we check the substring before mismatch
(substring abcdabc) and find abc is both the suffix and the prefix. So since
we matched till Pattern[7], abc must be before Text[18]. That means, we
don't need to compare until Text[17] and our comparison will start from
Text[18] and Pattern[3]. Thus we will find a match and we'll return 15
which is our starting index of the match. This is how our KMP Substring
Search works using suffix and prefix information.
Now, how do we efficiently compute if suffix is same as prefix and at what
point to start the check if there is a mismatch of character between Text and
Pattern. Let's take a look at an example:
We'll generate an array containing the required information. Let's call the
array S. The size of the array will be same as the length of the pattern. Since
the first letter of the Pattern can't be the suffix of any prefix, we'll put S[0] =
0. We take i = 1 and j = 0 at first. At each step we compare Pattern[i] and
Pattern[j] and increment i. If there is a match, we put S[i] = j + 1 and
increment j; if there is a mismatch, we check the previous value position of j
(if available) and set j = S[j-1] (if j is not equal to 0). We keep doing this until
Pattern[j] matches Pattern[i] or j becomes 0. For the latter case, we put
S[i] = 0. For our example:
Pattern[j] and Pattern[i] don't match, so we increment i and since j is 0, we
don't check the previous value and put
S[i] = 0. If we keep incrementing i, for i = 4, we'll get a match, so we
put S[i] = S[4] = j + 1 = 0 + 1 = 1 and increment j and i. Our array will look
like:
This is our required array. Here a nonzero-value of S[i] means there is a S[i]
length suffix same as the prefix in that substring (substring from 0 to i) and
the next comparison will start from S[i] + 1 position of the Pattern. Our
algorithm to generate the array would look like:
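A C++ sketch of that generation procedure, with S as the array described above:

#include <string>
#include <vector>

std::vector<int> generatePrefixArray(const std::string& pattern)
{
    int n = pattern.size();
    std::vector<int> S(n, 0);          // S[0] is always 0
    int j = 0;
    for (int i = 1; i < n; ++i)
    {
        while (j > 0 && pattern[i] != pattern[j])
            j = S[j - 1];              // fall back to the previous matching prefix
        if (pattern[i] == pattern[j])
            ++j;                       // extend the matched prefix
        S[i] = j;
    }
    return S;
}

For the pattern aabaabaa mentioned below, this sketch produces 0 1 0 1 2 3 4 5.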
The time complexity to build this array is O(n) and the space complexity is
also O(n). To make sure if you have completely understood the algorithm, try
to generate an array for pattern aabaabaa and check if the result matches with
this one.
Now let's do a substring search using the following example:
+---------+---+---+---+---+---+---+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |
+---------+---+---+---+---+---+---+---+---+---+---+---+---+
| Text | a | b | x | a | b | c | a | b | c | a | b | y |
+---------+---+---+---+---+---+---+---+---+---+---+---+---+
+---------+---+---+---+---+---+---+
| Index | 0 | 1 | 2 | 3 | 4 | 5 |
The time complexity of this algorithm apart from the Suffix Array
Calculation is O(m). Since GenerateSuffixArray takes O(n), the total time
complexity of KMP Algorithm is: O(m+n).
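A matching C++ sketch of the search itself, reusing the S array and the generatePrefixArray() helper from the sketch above:

#include <string>
#include <vector>

// returns the index of the first occurrence of pattern in text, or -1
int kmpSearch(const std::string& text, const std::string& pattern)
{
    std::vector<int> S = generatePrefixArray(pattern);   // from the sketch above
    int j = 0;
    for (int i = 0; i < (int)text.size(); ++i)
    {
        while (j > 0 && text[i] != pattern[j])
            j = S[j - 1];                 // reuse the already matched prefix
        if (text[i] == pattern[j])
            ++j;
        if (j == (int)pattern.size())
            return i - j + 1;             // full match found
    }
    return -1;
}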
PS: If you want to find multiple occurrences of Pattern in the Text,
instead of returning the value, print it/store it and set j := S[j-1]. Also keep a flag to
track whether you have found any occurrence or not and handle it
accordingly.
Section 40.2: Introduction to Rabin-
Karp Algorithm
Rabin-Karp Algorithm is a string searching algorithm created by Richard M.
Karp and Michael O. Rabin that uses hashing to find any one of a set of
pattern strings in a text.
A substring of a string is another string that occurs in it. For example, ver is a
substring of stackoverflow. Not to be confused with subsequence, because
cover is a subsequence of the same string. In other words, any subset of
consecutive letters in a string is a substring of the given string.
In Rabin-Karp algorithm, we'll generate a hash of our pattern that we are
looking for & check if the rolling hash of our text matches the pattern or not.
If it doesn't match, we can guarantee that the pattern doesn't exist in the text.
However, if it does match, the pattern can be present in the text. Let's look at
an example:
Let's say we have a text: yeminsajid and we want to find out if the pattern
nsa exists in the text. To calculate the hash and rolling hash, we'll need to use
a prime number. This can be any prime number. Let's take prime = 11 for
this example. We'll determine hash value using this formula:
(1st letter) X (prime)⁰ + (2nd letter) X (prime)¹ + (3rd letter) X (prime)² + ......
We'll denote:
Now we find the rolling-hash of our text. If the rolling hash matches with the
hash value of our pattern, we'll check if the strings match or not. Since our
pattern has 3 letters, we'll take 1st 3 letters yem from our text and calculate
hash value. We get:
This value doesn't match with our pattern's hash value. So the string doesn't
exist here. Now we need to consider the next step, calculating the hash
value of our next string emi. We could calculate it from scratch using our
formula, but that would be naive and cost us more. Instead, we use another
technique.
We subtract the value of the first letter of the previous string from our
current hash value. In this case, y. We get 1653 - 25 = 1628.
We divide the difference by our prime, which is 11 for this example.
We get 1628 / 11 = 148.
We add (new letter) X (prime)^(m-1), where m is the length of the
pattern, with the quotient. The new letter is i = 9. We get 148 + 9 X 11² = 1237.
The new hash value is not equal to our pattern's hash value. Moving on, for n
we get:
Hash Recalculation:
String Match:
Rabin-Karp:
output:
Input:
output:
C Language Implementation:
Reference:
http://www.geeksforgeeks.org/searching-for-patterns-set-2-kmp-algorithm/
Chapter 41: Breadth-First Search
Section 41.1: Finding the Shortest Path
from Source to other Nodes
Breadth-first search (BFS) is an algorithm for traversing or searching tree or
graph data structures. It starts at the tree root (or some arbitrary node of a
graph, sometimes referred to as a 'search key') and explores the neighbor
nodes first, before moving to the next level neighbors. BFS was invented in
the late 1950s by Edward Forrest Moore, who used it to find the shortest path
out of a maze, and was discovered independently by C. Y. Lee as a wire routing
algorithm in 1961.
The process of the BFS algorithm works under these assumptions:
1. We won't traverse any node more than once.
2. Source node or the node that we're starting from is situated in level 0.
3. The nodes we can directly reach from source node are level 1 nodes, the
nodes we can directly reach from level 1 nodes are level 2 nodes and so
on.
4. The level denotes the distance of the shortest path from the source.
Let's see an example:
Let's assume this graph represents connection between multiple cities, where
each node denotes a city and an edge between two nodes denote there is a
road linking them. We want to go from node 1 to node 10. So node 1 is our
source, which is level 0. We mark node 1 as visited. We can go to node 2,
node 3 and node 4 from here. So they'll be level (0+1) = level 1 nodes. Now
we'll mark them as visited and work with them.
The colored nodes are visited. The nodes that we're currently working with
will be marked with pink. We won't visit the same node twice. From node 2,
node 3 and node 4, we can go to node 6, node 7 and node 8. Let's mark
them as visited. The level of these nodes will be level (1+1) = level 2.
If you haven't noticed, the level of nodes simply denote the shortest path
distance from the source. For example: we've found node 8 on level 2. So
the distance from source to node 8 is 2.
We didn't yet reach our target node, that is node 10. So let's visit the next
nodes, the ones we can directly go to from node 6, node 7 and node 8.
We can see that, we found node 10 at level 3. So the shortest path from
source to node 10 is 3. We searched the graph level by level and found the
shortest path. Now let's erase the edges that we didn't use:
After removing the edges that we didn't use, we get a tree called BFS tree.
This tree shows the shortest path from source to all other nodes.
So our task will be, to go from source to level 1 nodes. Then from level 1 to
level 2 nodes and so on until we reach our destination. We can use queue to
store the nodes that we are going to process. That is, for each node we're
going to work with, we'll push all other nodes that can be directly traversed
and not yet traversed in the queue.
The simulation of our example:
First we push the source in the queue. Our queue will look like:
The level of node 1 will be 0. level[1] = 0. Now we start our BFS. At first,
we pop a node from our queue. We get node 1. We can go to node 4, node 3
and node 2 from this one. We've reached these nodes from node 1. So
level[4] = level[3] = level[2] = level[1] + 1 = 1. Now we mark them as
visited and push them in the queue.
Now we pop node 4 and work with it. We can go to node 7 from node 4.
level[7] = level[4] + 1 = 2. We mark node 7 as visited and push it in the
queue.
From node 3, we can go to node 7 and node 8. Since we've already marked
node 7 as visited, we mark node 8 as visited, we change level[8] = level[3] +
1 = 2. We push node 8 in the queue.
This process will continue till we reach our destination or the queue becomes
empty. The level array will provide us with the distance of the shortest path
from source. We can initialize level array with infinity value, which will
mark that the nodes are not yet visited. Our pseudo-code will be:
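The pseudo-code itself is not reproduced here; a C++ sketch that matches the description, with an adjacency list adj and a level[] array initialised to infinity, could be:

#include <queue>
#include <vector>

// fills level[] with shortest distances (in edges) from source
void bfs(const std::vector<std::vector<int>>& adj, int source, std::vector<int>& level)
{
    const int INF = 1e9;
    level.assign(adj.size(), INF);     // infinity marks "not visited yet"
    std::queue<int> q;
    level[source] = 0;
    q.push(source);
    while (!q.empty())
    {
        int u = q.front(); q.pop();
        for (int v : adj[u])
            if (level[v] == INF)       // not visited yet
            {
                level[v] = level[u] + 1;
                q.push(v);
            }
    }
}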
By iterating through the level array, we can find out the distance of each node
from source. For example: the distance of node 10 from source will be
stored in level[10].
Sometimes we might need to print not only the shortest distance,
but also the path via which we can go to our destined node from the source.
For this we need to keep a parent array. parent[source] will be NULL. For
each update in the level array, we'll simply add parent[v] := u in our pseudo code
inside the for loop. After finishing BFS, to find the path, we'll traverse back the
parent array until we reach the source, which will be denoted by the NULL value. The
pseudo-code will be:
scanf("%d%d",&u,&v);
graph[u-1][v-
1] = 1;
graph[v-1][u-1] =
1;
} if(isConnected(graph,n))
printf("The graph is connected"); else
printf("The graph is NOT connected\n");
}
void
enqueue(int
vertex) {
if(Qfront ==
NULL)
{
Qfront = malloc(sizeof(Node));
Qfront->v = vertex;
Qfront->next = NULL;
Qrear = Qfront;
else
{
Nodeptr newNode =
malloc(sizeof(Node)); newNode->v =
vertex; newNode->next = NULL;
Qrear->next = newNode;
Qrear = newNode;
}
}
int deque()
{
For finding all the connected components of an undirected graph, we only
need to add 2 lines of code to the BFS function. The idea is to call the BFS
function until all vertices are visited.
The lines to be added are:
AND
Complexity:
Each nodes and edges are visited once. So the complexity of DFS is O(V+E),
where V denotes the number of nodes and E denotes the number of edges.
Applications of Depth First Search:
Finding all pair shortest path in an undirected graph.
Detecting cycle in a graph.
Path finding.
Topological Sort.
Testing if a graph is bipartite.
Finding Strongly Connected Components.
Solving puzzles with one solution.
Chapter 43: Hash Functions
Section 43.1: Hash codes for common
types in C#
The hash codes produced by the GetHashCode() method for built-in and common C#
types from the System namespace are shown below.
Boolean: 1 if the value is true, 0 otherwise.
Byte, UInt16, Int32, UInt32, Single: the value (if necessary cast to Int32).
SByte
Char
Int16
Int64, Double: Xor between the lower and upper 32 bits of the 64 bit number.
Decimal
Object
ValueType: The first non-static field is looked for and its hashcode is used. If the
type has no non-static fields, the hashcode of the type is returned. The hashcode of a
static member can't be taken because if that member is of the same type as the
original type, the calculation ends up in an infinite loop.
Nullable<T>
Array
References
GitHub .Net Core CLR
Section 43.2:
Introduction to hash
functions
A hash function h() is an arbitrary function which maps data x ∈ X of arbitrary
size to a value y ∈ Y of fixed size: y = h(x). Good hash functions have the following
restrictions:
In the general case the size of the hash is less than the size of the input data: |y| < |x|.
Hash functions are not reversible, or in other words there may be collisions:
∃ x1, x2 ∈ X, x1 ≠ x2 : h(x1) = h(x2). X may be
a finite or infinite set and Y is a finite set.
Hash functions are used in a lot of parts of computer science, for example in
software engineering, cryptography, databases, networks, machine learning
and so on. There are many different types of hash functions, with differing
domain specific properties.
Often the hash is an integer value. There are special methods
in programming languages for calculating the hash. For example, in C# the GetHashCode()
method for all types returns an Int32 value (32 bit integer number). In Java every
class provides a hashCode() method which returns an int. Each data type has its own or user
defined implementations.
Hash methods
There are several approaches for determining a hash function. Without loss of
generality, let x ∈ X = {z ∈ ℤ : z ≥ 0} be positive integer numbers. Often m is prime
(not too close to an exact power of 2).

Method: Hash function
Division: h(x) = x mod m

The next methods are used to compute the probe sequences required for open
addressing:

Method: Formula
Linear probing: h(x, i) = (h'(x) + i) mod m
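A tiny C++ illustration of the division method and linear probing; the table size m and the function names are assumptions made for the example:

#include <cstdint>

const int m = 701;                       // table size, a prime not too close to a power of 2

int hashDivision(uint64_t x)             // division method: h(x) = x mod m
{
    return (int)(x % m);
}

int probe(uint64_t x, int i)             // linear probing: h(x, i) = (h'(x) + i) mod m
{
    return (hashDivision(x) + i) % m;
}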
Time Complexity
There are N! permutations to go through and the cost of each path is
calculated in O(N), thus this algorithm takes O(N * N!) time to output the exact
answer.
Section 44.2: Dynamic Programming
Algorithm
Notice that if we consider the path (in order):
The cost of going from vertex 1 to vertex 2 to vertex 3 remains the same, so
why must it be recalculated? This result can be saved for later use.
Let dp[bitmask][vertex] represent the minimum cost of travelling through all the
vertices whose corresponding bit in bitmask is set to 1, ending
at vertex. For example:
Since 12 represents 1100 in binary, dp[12][2] represents going through vertices 2
and 3 in the graph with the path ending at vertex 2.
Thus we can have the following algorithm (C++ implementation):
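The book's C++ listing is not reproduced here; the following is a sketch of the memoised recursion it describes, assuming cost[][] is the distance matrix, N the number of vertices, the tour starts at vertex 0, and edge costs are small enough that INF plus a cost does not overflow:

#include <algorithm>
#include <vector>

int N;                                       // number of vertices
std::vector<std::vector<int>> cost;          // cost[u][v] = distance from u to v
std::vector<std::vector<int>> dp;            // dp[bitmask][pos], -1 = not computed yet
const int INF = 1e8;

int TSP(int bitmask, int pos)
{
    if (bitmask == (1 << N) - 1)             // every vertex visited: close the tour
        return cost[pos][0];
    if (dp[bitmask][pos] != -1)
        return dp[bitmask][pos];

    int best = INF;
    for (int i = 0; i < N; ++i)
        if (!(bitmask & (1 << i)))           // vertex i not visited yet
            best = std::min(best, cost[pos][i] + TSP(bitmask | (1 << i), i));
    return dp[bitmask][pos] = best;
}

// usage: dp.assign(1 << N, std::vector<int>(N, -1)); answer = TSP(1, 0);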
Here, bitmask | (1 << i) sets the ith bit of bitmask to 1, which represents that the ith vertex has
been visited. The i after the comma represents the new pos in that function
call, which represents the new "last" vertex.
cost[pos][i] is to add the cost of travelling from vertex pos to vertex i.
Thus, this line is to update the value of cost to the minimum possible value of
travelling to every other vertex that has not been visited yet.
Time Complexity
The function TSP(bitmask, pos) has 2^N values for bitmask and N values for pos.
Each function call takes O(N) time to run (the for loop). Thus this implementation
takes O(N^2 * 2^N) time to output the exact answer.
Chapter 45: Knapsack Problem
Section 45.1: Knapsack Problem Basics
The Problem: Given a set of items where each item contains a weight and
value, determine the number of each to include in a collection so that the
total weight is less than or equal to a given limit and the total value is as large
as possible.
Pseudo code for Knapsack Problem
Given:
1. Values(array v)
2. Weights(array w)
3. Number of distinct items(n)
4. Capacity(W)
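The pseudo code itself is not reproduced here; a C++ sketch of the standard O(nW) dynamic program over items and capacities, using a 1D rolling array instead of a full table, with v, w, n and W as listed above:

#include <algorithm>
#include <vector>

// returns the maximum total value that fits into capacity W
int knapsack(const std::vector<int>& v, const std::vector<int>& w, int n, int W)
{
    // dp[j] = best value achievable with capacity j using the items seen so far
    std::vector<int> dp(W + 1, 0);
    for (int i = 0; i < n; ++i)
        for (int j = W; j >= w[i]; --j)              // go down so each item is used at most once
            dp[j] = std::max(dp[j], dp[j - w[i]] + v[i]);
    return dp[W];
}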
Time Complexity of the above code: O(nW) where n is the number of items
and W is the capacity of the knapsack.
Section 45.2: Solution Implemented in
C#
Chapter 46: Equation Solving
Section 46.1: Linear Equation
There are two classes of methods for solving Linear Equations:
1. Direct Methods: Common characteristics of direct methods are that they
transform the original equation into equivalent equations that can be
solved more easily, meaning we get the solution directly from an equation.
2. Iterative Methods: Iterative or Indirect Methods start with a guess of the
solution and then repeatedly refine the solution until a certain
convergence criterion is reached. Iterative methods are generally less
efficient than direct methods because a large number of operations is
required. Examples: Jacobi's Iteration Method, Gauss-Seidel Iteration
Method.
Implementation in C:
#include <stdio.h>

/* Jacobi's iteration method for a system of n linear equations a.x = b */
void JacobisMethod(int n, double x[n], double b[n], double a[n][n]){
    double Nx[n];        /* new values of x computed in the current iteration */
    int rootFound = 0;   /* flag */
    int i, j;

    while(!rootFound){
        for(i=0; i<n; i++){             /* calculation */
            Nx[i] = b[i];
            for(j=0; j<n; j++){
                if(i!=j) Nx[i] = Nx[i] - a[i][j]*Nx[j];
            }
            Nx[i] = Nx[i] / a[i][i];
        }

        rootFound = 1;                  /* verification */
        for(i=0; i<n; i++){
            if(!( (Nx[i]-x[i])/x[i] > -0.000001 && (Nx[i]-x[i])/x[i] < 0.000001 )){
                rootFound = 0;
                break;
            }
        }

        for(i=0; i<n; i++){             /* evaluation */
            x[i] = Nx[i];
        }
    }
    return;
}

/* Print array with comma separation */
void print(int n, double x[n]){
    int i;
    for(i=0; i<n; i++){
        printf("%lf, ", x[i]);
    }
    printf("\n\n");
    return;
}

int main(){
    /* equation initialization */
    int n = 3;           /* number of variables */
    double x[n];         /* variables */
    double b[n],         /* constants */
           a[n][n];      /* coefficients */

    /* assign values */
    a[0][0]=8; a[0][1]=2; a[0][2]=-2; b[0]=8;   /* 8x1+2x2-2x3+8=0 */
    a[1][0]=1; a[1][1]=-8; a[1][2]=3; b[1]=-4;  /* x1-8x2+3x3-4=0 */
    a[2][0]=2; a[2][1]=1; a[2][2]=9; b[2]=12;   /* 2x1+x2+9x3+12=0 */

    int i;
    for(i=0; i<n; i++){  /* initialization */
        x[i] = 0;
    }

    JacobisMethod(n, x, b, a);
    print(n, x);
    return 0;
}
Implementation in C:
        }
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/** * Takes two initial values and shortens the distance by single side.
**/ double FalsePosition(){ double root=0;
double a=1,
b=2; double
c=0;
int loopCounter=0; if(f(a)*f(b) < 0){
while(1){ loopCounter++; c=
(a*f(b) - b*f(a)) / (f(b) - f(a));
/*/printf("%lf\t %lf \n", c, f(c));/**////test
if(f(c)<0.00001 && f(c)>-0.00001){ root=c;
break; }
if((f(a))*(f(c)) < 0){
b=c;
}else{ a=c;
}
} } printf("It took %d loops.\n",
loopCounter);
return
root;
}
/**
 * Uses one initial value and gradually moves that value towards the real root.
 **/
double NewtonRaphson(){
    double root=0;

    double x1=1;
    double x2=0;

    int loopCounter=0;
    while(1){
        loopCounter++;

        x2 = x1 - (f(x1)/f2(x1));           //next approximation from the tangent line

        //printf("%lf \t %lf\n", x2, f(x2));    //test

        if(f(x2)<0.00001 && f(x2)>-0.00001){
            root=x2;
            break;
        }

        x1=x2;                              //continue from the new approximation
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/**
 * Uses one initial value and gradually moves that value towards the real root.
 **/
double FixedPoint(){
    double root=0;
    double x=1;

    int loopCounter=0;
    while(1){
        loopCounter++;

        if( (x-g(x)) <0.00001 && (x-g(x)) >-0.00001){
            root = x;
            break;
        }

        //printf("%lf \t %lf\n", g(x), x-g(x));    //test

        x=g(x);
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
/**
 * Uses two initial values; both values approach the root.
 **/
double Secant(){
    double root=0;

    double x0=1;
    double x1=2;
    double x2=0;

    int loopCounter=0;
    while(1){
        loopCounter++;

        //printf("%lf \t %lf \t %lf\n", x0, x1, f(x1));    //test

        if(f(x1)<0.00001 && f(x1)>-0.00001){
            root=x1;
            break;
        }

        x2 = ((x0*f(x1)) - (x1*f(x0))) / (f(x1) - f(x0));

        x0=x1;
        x1=x2;
    }
    printf("It took %d loops.\n", loopCounter);
    return root;
}
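The helper functions f, f2 and g and a driver are not shown above. As a minimal usage sketch, assuming the example equation f(x) = x^3 - x - 2 = 0 (the chosen equation, the helpers and the driver are illustrative assumptions, not from the original text); in a single file these definitions would go above the method functions:

#include <stdio.h>
#include <math.h>

double f(double x)  { return x*x*x - x - 2; }   /* equation to solve: f(x) = 0                */
double f2(double x) { return 3*x*x - 1; }       /* derivative of f, used by NewtonRaphson     */
double g(double x)  { return cbrt(x + 2); }     /* rearrangement x = g(x), used by FixedPoint */

double FalsePosition(); double NewtonRaphson(); double FixedPoint(); double Secant();

int main(){
    printf("False position : %lf\n", FalsePosition());
    printf("Newton-Raphson : %lf\n", NewtonRaphson());
    printf("Fixed point    : %lf\n", FixedPoint());
    printf("Secant         : %lf\n", Secant());
    return 0;
}

All four methods converge to the same root (approximately 1.5214) for this example, since f(1) and f(2) have opposite signs and the interval [1, 2] brackets the root.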
Chapter 47: Longest Common Subsequence
Section 47.1: Longest Common Subsequence Explanation
One of the most important applications of Dynamic Programming is
finding the Longest Common Subsequence. Let's define some of the
basic terminology first.
Subsequence:
A subsequence is a sequence that can be derived from another sequence by
deleting some elements without changing the order of the remaining
elements. Let's say we have a string ABC. If we erase zero, one or more
than one character from this string we get a subsequence of this string. So
the subsequences of string ABC will be {"A", "B", "C", "AB", "AC",
"BC", "ABC", " "}. Even if we remove all the characters, the empty string
will also be a subsequence. To find the subsequences, for each character
in a string, we have two options - either we take the character, or we don't. So
if the length of the string is n, there are 2^n subsequences of that string.
Longest Common Subsequence:
As the name suggests, of all the common subsequences between two strings,
the longest common subsequence (LCS) is the one with the maximum length.
For example: the common subsequences between "HELLOM" and
"HMLD" are "H", "HL", "HM" etc. Here "HL" is a longest common
subsequence, which has length 2.
Brute-Force Method:
We can generate all the subsequences of two strings using backtracking.
Then we can compare them to find out the common subsequences. After
that, we'll need to find the one with the maximum length. We have already
seen that there are 2^n subsequences of a string of length n. It would take
years to solve the problem if our n crosses 20-25.
Dynamic Programming Method:
Let's approach our method with an example. Assume that we have two
strings, abcdaf and acbcf. Let's denote these with s1 and s2. So the longest
common subsequence of these two strings will be "abcf", which has length
4. Again, remember that subsequences need not be contiguous in the string. To
construct "abcf", we ignored "da" in s1 and "c" in s2. How do we find this
out using Dynamic Programming?
We'll start with a table (a 2D array) having all the characters of s1 in a row
and all the characters of s2 in a column. Here the table is 0-indexed and we put
the characters from index 1 onwards. We'll traverse the table from left to right
for each row. Our table will look like:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch ʳ | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Here each cell represents the length of the longest common subsequence
between two prefixes of the strings: the characters of that cell's row and
column, added to the prefixes before them. For example: Table[2][3] represents
the length of the longest common subsequence between "ac" and "abc".
The 0-th column represents the empty subsequence of s1. Similarly, the 0-th
row represents the empty subsequence of s2. If we take an empty
subsequence of a string and try to match it with another string, no matter how
long the second string is, the common subsequence will have 0 length. So we
can fill up the 0-th row and the 0-th column with 0's. We get:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch ʳ | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Let's begin. When we're filling Table[1][1], we're asking ourselves: if we had
a string a and another string a and nothing else, what would be the longest
common subsequence here? The length of the LCS here will be 1. Now let's
look at Table[1][2]. We have string ab and string a. The length of the LCS
will be 1. As you can see, the rest of the values will also be 1 for the first row,
as it considers only string a against abc, abcd, abcda and abcdaf. So our table
will look like:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch ʳ | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Now for row 2, which includes c. For Table[2][1] we have ac on one side
and a on the other side. So the length of the LCS is 1. Where did we get this
1 from? From the top, which denotes the LCS "a" between the two prefixes. So
what we are saying is: if the current characters (here s2[2] = c and s1[1] = a)
are not the same, then the length of the LCS will be the maximum of the length
of the LCS at the top, or at the left. Taking the length of the LCS at the top denotes
that we don't take the current character from s2. Similarly, taking the length
of the LCS at the left denotes that we don't take the current character from s1
to create the LCS. We get:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch ʳ | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | 1 | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
Moving on, for Table[2][2] we have strings ab and ac. Since c and b are not
the same, we put the maximum of the top or left value here. In this case, it's again 1.
After that, for Table[2][3] we have strings abc and ac. This time the current
characters of both the row and the column are the same. Now the length of the LCS
will be equal to the maximum length of the LCS so far + 1. How do we get the
maximum length of the LCS so far? We check the diagonal value, which
represents the best match between ab and a. From this state, for the current
values, we added one more character to s1 and s2, which happened to be the
same. So the length of the LCS will of course increase. We'll put 1 + 1 = 2 in
Table[2][3]. We get:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch ʳ | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | 1 | 1 | 2 | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | | | | | | |
+-----+-----+-----+-----+-----+-----+-----+-----+
We have now covered both cases: if the current characters match, Table[i][j] =
Table[i-1][j-1] + 1; otherwise Table[i][j] = max(Table[i-1][j], Table[i][j-1]).
Using these two formulas, we can populate the whole table. After filling up the
table, it will look like this:
0 1 2 3 4 5 6
+-----+-----+-----+-----+-----+-----+-----+-----+
| ch ʳ | | a | b | c | d | a | f |
+-----+-----+-----+-----+-----+-----+-----+-----+
0 | | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+-----+-----+-----+-----+-----+-----+-----+-----+
1 | a | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
+-----+-----+-----+-----+-----+-----+-----+-----+
2 | c | 0 | 1 | 1 | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+-----+-----+-----+-----+
3 | b | 0 | 1 | 2 | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+-----+-----+-----+-----+
4 | c | 0 | 1 | 2 | 3 | 3 | 3 | 3 |
+-----+-----+-----+-----+-----+-----+-----+-----+
5 | f | 0 | 1 | 2 | 3 | 3 | 3 | 4 |
+-----+-----+-----+-----+-----+-----+-----+-----+
The time complexity of this algorithm is O(mn), where m and n denote the
lengths of the two strings.
How do we find the longest common subsequence itself? We'll start from the
bottom-right corner. We will check where the value came from. If the
value came from the diagonal, that is if Table[i-1][j-1] is equal to
Table[i][j] - 1, we push either s2[i] or s1[j] (both are the same) onto a stack and
move diagonally. If the value came from the top, that is if Table[i-1][j] is
equal to Table[i][j], we move to the top. If the value came from the left, that
is if Table[i][j-1] is equal to Table[i][j], we move to the left. When we
reach the leftmost column or the topmost row, our search ends. Then we pop the
values from the stack and print them. The pseudo-code:
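The original pseudo-code is not reproduced here. As a rough C sketch of the two steps described above (filling the table, then backtracking from the bottom-right corner with a stack); the function name and the MAX limit are assumptions:

#include <stdio.h>
#include <string.h>

#define MAX 100

int Table[MAX][MAX];

/* Fills Table[][] as described above and prints one LCS of s1 and s2.
 * Row i corresponds to s2[i-1], column j corresponds to s1[j-1]. */
void printLCS(const char *s1, const char *s2){
    int m = strlen(s1), n = strlen(s2);

    for(int i = 0; i <= n; i++){
        for(int j = 0; j <= m; j++){
            if(i == 0 || j == 0)
                Table[i][j] = 0;                          /* empty prefix      */
            else if(s2[i - 1] == s1[j - 1])
                Table[i][j] = Table[i - 1][j - 1] + 1;    /* diagonal + 1      */
            else
                Table[i][j] = Table[i - 1][j] > Table[i][j - 1]
                            ? Table[i - 1][j] : Table[i][j - 1];
        }
    }
    printf("LCS length: %d\n", Table[n][m]);

    /* Backtrack from the bottom-right corner, pushing matched characters. */
    char stack[MAX];
    int top = 0, i = n, j = m;
    while(i > 0 && j > 0){
        if(s2[i - 1] == s1[j - 1]){        /* value came from the diagonal   */
            stack[top++] = s1[j - 1];
            i--; j--;
        }else if(Table[i - 1][j] == Table[i][j]){
            i--;                           /* value came from the top        */
        }else{
            j--;                           /* value came from the left       */
        }
    }
    while(top > 0) putchar(stack[--top]);  /* pop and print                  */
    putchar('\n');
}

int main(void){
    printLCS("abcdaf", "acbcf");           /* prints 4 and "abcf"            */
    return 0;
}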
Part 2:
Recursive Solution:
Approach 1:
Approach 2:
Approach 3:
You can see that the hash key 'o' holds the value 2, because 'o' occurs twice in
the string.
Now loop over str2 and check whether each character is present in hashMap; if
yes, decrease the value of that hashMap key, else return false (which indicates
it's not an anagram).
Now, loop over the hashMap object and check that all of its values are zero.
In our case all values are zero, so it is an anagram.
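As a rough C sketch of the counting idea just described, with a fixed 256-entry array standing in for the hash map (the example strings are assumptions):

#include <stdio.h>

/* Returns 1 if str1 and str2 are anagrams of each other. */
int isAnagram(const char *str1, const char *str2){
    int count[256] = {0};

    for(int i = 0; str1[i]; i++)
        count[(unsigned char)str1[i]]++;       /* e.g. count['o'] becomes 2         */

    for(int i = 0; str2[i]; i++){
        if(count[(unsigned char)str2[i]] == 0)
            return 0;                          /* character missing: not an anagram */
        count[(unsigned char)str2[i]]--;
    }

    for(int i = 0; i < 256; i++)
        if(count[i] != 0)
            return 0;                          /* leftover characters from str1     */

    return 1;
}

int main(void){
    printf("%s\n", isAnagram("google", "oolgeg") ? "anagram" : "not an anagram");
    return 0;
}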
Section 49.2: Generic Code for Anagrams
Time complexity: 3n, i.e. O(n).
Chapter 50: Pascal's Triangle
Section 50.1: Pascal triangle in C
Output
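The original C listing and its printed output are not reproduced here. A minimal sketch of one common approach, computing each row from the previous binomial coefficient with C(i, j+1) = C(i, j) * (i - j) / (j + 1), might look like this (the function name and the row count are assumptions):

#include <stdio.h>

/* Prints the first n rows of Pascal's triangle without using factorials. */
void pascal(int n){
    for(int i = 0; i < n; i++){
        long long c = 1;                       /* C(i, 0) = 1                  */
        for(int j = 0; j <= i; j++){
            printf("%lld ", c);
            c = c * (i - j) / (j + 1);         /* next coefficient in the row  */
        }
        printf("\n");
    }
}

int main(void){
    pascal(5);
    return 0;
}

For n = 5 this prints:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1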
Chapter 51: Algo:- Print a m*n matrix in square wise
Check the sample input and output below.
Section 51.1: Sample Example
Section 51.2: Write the generic code
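The sample example and the generic code are not reproduced here. As a sketch of the usual boundary-shrinking approach in C (the function name and the 3 x 4 example matrix are assumptions):

#include <stdio.h>

/* Prints an m x n matrix "square wise" (spiral order): top row, right column,
 * bottom row, left column, then the inner sub-matrix. */
void printSpiral(int m, int n, int a[m][n]){
    int top = 0, bottom = m - 1, left = 0, right = n - 1;

    while(top <= bottom && left <= right){
        for(int j = left; j <= right; j++)  printf("%d ", a[top][j]);       /* top row    */
        top++;
        for(int i = top; i <= bottom; i++)  printf("%d ", a[i][right]);     /* right col  */
        right--;
        if(top <= bottom){
            for(int j = right; j >= left; j--) printf("%d ", a[bottom][j]); /* bottom row */
            bottom--;
        }
        if(left <= right){
            for(int i = bottom; i >= top; i--) printf("%d ", a[i][left]);   /* left col   */
            left++;
        }
    }
    printf("\n");
}

int main(void){
    int a[3][4] = {{ 1,  2,  3,  4},
                   { 5,  6,  7,  8},
                   { 9, 10, 11, 12}};
    printSpiral(3, 4, a);      /* prints 1 2 3 4 8 12 11 10 9 5 6 7 */
    return 0;
}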
Chapter 52: Matrix Exponentiation
Section 52.1: Matrix Exponentiation to Solve Example Problems
Find f(n): the nth Fibonacci number. The problem is quite easy when n is
relatively small. We can use simple recursion, f(n) = f(n-1) + f(n-2), or we can use
a dynamic programming approach to avoid calculating the same function
over and over again. But what will you do if the problem says: given 0 < n <
10⁹, find f(n) mod 999983? Dynamic programming will fail, so how do we
tackle this problem?
First, let's see how matrix exponentiation can help to represent a recurrence
relation.
Prerequisites:
Given two matrices, know how to find their product. Further, given
the product matrix of two matrices, and one of them, know how to
find the other matrix.
Given a matrix of size d X d, know how to find its nth power in
O(d^3 log(n)).
Patterns:
At first we need a recurrence relation, and we want to find a matrix M which
can lead us to the desired state from a set of already known states. Let's
assume that we know the k states of a given recurrence relation and we want
to find the (k+1)th state. Let M be a k X k matrix, and we build a matrix A:
[k X 1] from the known states of the recurrence relation; now we want to get
a matrix B: [k X 1] which will represent the set of next states, i.e. M X A =
B as shown below:
So, if we can design M accordingly, our job will be done! The matrix will
then be used to represent the recurrence relation.
Type 1:
Let's start with the simplest one: f(n) = f(n-1) + f(n-2). We build matrix A
from the known states, A = [f(n), f(n-1)].
[Note: Matrix A will always be designed in such a way that every state on
which f(n+1) depends will be present.] Now, we need to design a 2 X 2 matrix
M such that it satisfies M X A = B as stated above.
The first element of B is f(n+1), which is actually f(n) + f(n-1). To get this from
matrix A, we need 1 X f(n) and 1 X f(n-1). So the first row of M will be [1 1].
[Note: ----- means we are not concerned about this value.]
Similarly, the 2nd item of B is f(n), which can be obtained by simply taking 1 X f(n)
from A, so the 2nd row of M is [1 0].
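To make Type 1 concrete, here is a small C sketch that raises the matrix M = [[1 1], [1 0]] designed above to the (n-1)th power with fast exponentiation, giving f(n) mod 999983 in O(log n) matrix multiplications; the struct and helper names are assumptions:

#include <stdio.h>

#define MOD 999983LL

/* A 2 x 2 matrix [[a, b], [c, d]] with entries taken modulo MOD. */
typedef struct { long long a, b, c, d; } Mat2;

Mat2 multiply(Mat2 x, Mat2 y){
    Mat2 r;
    r.a = (x.a*y.a + x.b*y.c) % MOD;
    r.b = (x.a*y.b + x.b*y.d) % MOD;
    r.c = (x.c*y.a + x.d*y.c) % MOD;
    r.d = (x.c*y.b + x.d*y.d) % MOD;
    return r;
}

/* Raise m to the power p using binary (fast) exponentiation. */
Mat2 matPow(Mat2 m, long long p){
    Mat2 result = {1, 0, 0, 1};              /* identity matrix               */
    while(p > 0){
        if(p & 1) result = multiply(result, m);
        m = multiply(m, m);
        p >>= 1;
    }
    return result;
}

/* f(0) = 0, f(1) = 1, f(n) = f(n-1) + f(n-2); returns f(n) mod 999983. */
long long fib(long long n){
    if(n == 0) return 0;
    Mat2 M = {1, 1, 1, 0};                   /* the matrix designed above     */
    Mat2 Mn = matPow(M, n - 1);
    /* [f(n), f(n-1)] = M^(n-1) X [f(1), f(0)], so f(n) = Mn.a * 1 + Mn.b * 0 */
    return Mn.a;
}

int main(void){
    printf("%lld\n", fib(1000000000LL));     /* f(10^9) mod 999983            */
    return 0;
}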
Type 2:
Now suppose the relation is f(n) = a X f(n-1) + b X f(n-2). Since the first item of
B, i.e. f(n+1) = a X f(n) + b X f(n-1), we need [a, b] in the first row of the
objective matrix M. And for the 2nd item in B, i.e. f(n), we already have that in
matrix A, so we just take that, which leads the 2nd row of the matrix M to [1 0].
This time we get:
Type 3:
If you've survived through to this stage, you've grown much older, so now let's
face a slightly more complex relation: find f(n) = a X f(n-1) + c X f(n-3).
Oops! A few minutes ago, all we saw were contiguous states, but here the
state f(n-2) is missing. Now?
We can simply add it with a zero coefficient: f(n) = a X f(n-1) + 0 X f(n-2) +
c X f(n-3), deducing f(n+1) = a X f(n) + 0 X f(n-1) + c X f(n-2). Now, we see
that this is actually a form described in Type 2. So here the objective matrix M
will be 3 X 3, and the elements are:

| a 0 c |   | f(n)   |   | f(n+1) |
| 1 0 0 | X | f(n-1) | = | f(n)   |
| 0 1 0 |   | f(n-2) |   | f(n-1) |

These are calculated in the same way as Type 2; if you find it difficult, try it
on pen and paper.
Type 4:
Life is getting complex as hell, and Mr. Problem now asks you to find f(n) =
f(n-1) + f(n-2) + c, where c is any constant.
Now this is a new one: in everything we have seen so far, after the multiplication,
each state in A transforms to its next state in B.
So, normally we can't get it through the previous fashion, but how about we add
c itself as a state:
Now it's not that hard to design M. Here's how it's done, but don't forget to
verify:
Type 5:
Let's put it all together: find f(n) = a X f(n-1) + c X f(n-3) + d X f(n-4) + e.
Let's leave it as an exercise for you: first try to find out the states and the
matrix M, and check if it matches your solution. Also find the matrices A and B.
Type 6:
Sometimes the recurrence is given like this:
In short:
Here, we can split the function on the basis of odd and even n, keep two
different matrices for them, and calculate them separately.
Type 7:
Feeling a little too confident? Good for you. Sometimes we may need to
maintain more than one recurrence, where they are interrelated. For example,
let a recurrence relation be:
Here, the recurrence g(n) depends upon f(n), and this can be calculated in the
same matrix but with increased dimensions. From these, let's first design the
matrices A and B.
So, these are the basic categories of recurrence relations which can be solved
by this simple technique.
Chapter 53: polynomial-time bounded algorithm for Minimum Vertex Cover

Variable   Meaning
G          Input connected un-directed graph
X          Set of vertices
C          Final set of vertices

This is a polynomial-time algorithm for getting the minimum vertex cover of a
connected undirected graph. The time complexity of this algorithm is O(n^2).
Section 53.1: Algorithm Pseudo Code
Algorithm PMinVertexCover (graph G)
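The body of the pseudo code is not reproduced here. As an illustration only, one common greedy reading of the idea (repeatedly move the vertex of highest remaining degree into C and delete its incident edges) can be sketched in C as below; the names and the adjacency-matrix representation are assumptions, not necessarily the intended PMinVertexCover procedure:

#include <string.h>

#define MAXV 100

int adj[MAXV][MAXV];     /* adjacency matrix of the input graph G          */
int deg[MAXV];           /* current degree of every vertex (the set X)     */
int inCover[MAXV];       /* inCover[v] = 1 if v belongs to the final set C */

void PMinVertexCover(int n){
    memset(deg, 0, sizeof deg);
    memset(inCover, 0, sizeof inCover);

    for(int v = 0; v < n; v++)
        for(int u = 0; u < n; u++)
            deg[v] += adj[v][u];

    while(1){
        int best = -1, bestDeg = 0;
        for(int v = 0; v < n; v++)            /* O(n): pick vertex of max degree  */
            if(deg[v] > bestDeg){ bestDeg = deg[v]; best = v; }

        if(best == -1) break;                 /* no edges left: C covers them all */

        inCover[best] = 1;                    /* add it to the cover C            */
        for(int u = 0; u < n; u++){           /* O(n): delete its incident edges  */
            if(adj[best][u]){
                adj[best][u] = adj[u][best] = 0;
                deg[u]--;
            }
        }
        deg[best] = 0;
    }
}

Each pass does O(n) work and at most n vertices can enter C, which matches the O(n^2) bound stated above.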
Chapter 54: Dynamic Time Warping
Section 54.1: Introduction To Dynamic Time Warping
Dynamic Time Warping (DTW) is an algorithm for measuring similarity
between two temporal sequences which may vary in speed. For instance,
similarities in walking could be detected using DTW, even if one person was
walking faster than the other, or if there were accelerations and decelerations
during the course of an observation. It can be used to match a sample voice
command with other commands, even if the person talks faster or slower than
the prerecorded sample voice. DTW can be applied to temporal sequences of
video, audio and graphics data; indeed, any data which can be turned into a
linear sequence can be analyzed with DTW.
In general, DTW is a method that calculates an optimal match between two
given sequences with certain restrictions. But let's stick to the simpler points
here. Let's say we have two voice sequences, Sample and Test, and we want
to check if these two sequences match or not. Here a voice sequence refers to
the converted digital signal of your voice. It might be the amplitude or
frequency of your voice that denotes the words you say. Let's assume:

Sample = {1, 2, 3, 5, 5, 5, 6}
Test   = {1, 1, 2, 2, 3, 5}

We want to find out the optimal match between these two sequences.
At first, we define the distance between two points, d(x, y), where x and y
represent the two points. Let,

d(x, y) = |x - y|

Let's create a 2D matrix Table using these two sequences. We'll calculate the
distances between each point of Sample and every point of Test and find
the optimal match between them.
+------+------+------+------+------+------+------+------+
| | 0 | 1 | 1 | 2 | 2 | 3 | 5 |
+------+------+------+------+------+------+------+------+
| 0 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 1 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 2 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 3 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 5 | | | | | | | |
+------+------+------+------+------+------+------+------+
| 6 | | | | | | | |
+------+------+------+------+------+------+------+------+
Now for each step, we'll consider the distance between the two points in concern
and add it to the minimum distance we have found so far. This will give us the
optimal distance of the two sequences up to that position. Our formula will be:

Table[i][j] := d(Sample[i], Test[j]) + min(Table[i-1][j-1], Table[i-1][j], Table[i][j-1])

For the first one, d(1, 1) = 0 and Table[0][0] represents the minimum. So the
value of Table[1][1] will be 0 + 0 = 0. For the second one, d(1, 2) = 0 and
Table[1][1] represents the minimum. The value will be: Table[1][2] = 0 + 0
= 0. If we continue this way, after finishing, the table will look like:
+------+------+------+------+------+------+------+------+
| | 0 | 1 | 1 | 2 | 2 | 3 | 5 |
+------+------+------+------+------+------+------+------+
| 0 | 0 | inf | inf | inf | inf | inf | inf |
+------+------+------+------+------+------+------+------+
| 1 | inf | 0 | 0 | 1 | 2 | 4 | 8 |
+------+------+------+------+------+------+------+------+
| 2 | inf | 1 | 1 | 0 | 0 | 1 | 4 |
+------+------+------+------+------+------+------+------+
| 3 | inf | 3 | 3 | 1 | 1 | 0 | 2 |
+------+------+------+------+------+------+------+------+
| 5 | inf | 7 | 7 | 4 | 4 | 2 | 0 |
+------+------+------+------+------+------+------+------+
| 5 | inf | 11 | 11 | 7 | 7 | 4 | 0 |
+------+------+------+------+------+------+------+------+
| 5 | inf | 15 | 15 | 10 | 10 | 6 | 0 |
+------+------+------+------+------+------+------+------+
| 6 | inf | 20 | 20 | 14 | 14 | 9 | 1 |
+------+------+------+------+------+------+------+------+
Now if we backtrack from the last point, all the way back towards the
starting (0, 0) point, we get a long line that moves horizontally, vertically
and diagonally. Our backtracking procedure will be: if Table[i-1][j-1] <=
Table[i-1][j] and Table[i-1][j-1] <= Table[i][j-1], we move diagonally to
(i-1, j-1); otherwise we move to whichever of (i-1, j) and (i, j-1) holds the
smaller value.
We'll continue this till we reach (0, 0). Each move has its own meaning: a
diagonal move means Sample[i] is matched with Test[j], while a vertical or
horizontal move means one point of one sequence is matched with several points
of the other. We can also add a locality constraint: that is, we require that if
Sample[i] is matched with Test[j], then |i - j| is no larger than w, a window
parameter.
Complexity:
The complexity of computing DTW is O(m * n), where m and n represent the
lengths of the two sequences. Faster techniques for computing DTW include
PrunedDTW, SparseDTW and FastDTW.
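A minimal C sketch of the table-filling step described above, using d(x, y) = |x - y| and the Sample and Test sequences from this section (the function name is an assumption):

#include <stdio.h>
#include <math.h>

#define INF 1e9

/* DTW distance between Sample[0..m-1] and Test[0..n-1]. */
double dtw(int m, const double Sample[], int n, const double Test[]){
    double Table[m + 1][n + 1];

    for(int i = 0; i <= m; i++)
        for(int j = 0; j <= n; j++)
            Table[i][j] = INF;                             /* border = infinity */
    Table[0][0] = 0;

    for(int i = 1; i <= m; i++){
        for(int j = 1; j <= n; j++){
            double d = fabs(Sample[i - 1] - Test[j - 1]);  /* d(x, y) = |x - y| */
            double min = Table[i - 1][j - 1];
            if(Table[i - 1][j] < min) min = Table[i - 1][j];
            if(Table[i][j - 1] < min) min = Table[i][j - 1];
            Table[i][j] = d + min;
        }
    }
    return Table[m][n];
}

int main(void){
    double Sample[] = {1, 2, 3, 5, 5, 5, 6};
    double Test[]   = {1, 1, 2, 2, 3, 5};
    printf("%.0f\n", dtw(7, Sample, 6, Test));   /* prints 1, the bottom-right value of the table */
    return 0;
}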
Applications:
Spoken word recognition
Correlation Power Analysis
Chapter 55: Fast Fourier Transform
The Real and Complex form of DFT (Discrete Fourier Transforms) can be
used to perform frequency analysis or synthesis for any discrete and
periodic signals. The FFT (Fast Fourier Transform) is an implementation
of the DFT which may be performed quickly on modern CPUs.
Section 55.1: Radix 2 FFT
The simplest and perhaps best-known method for computing the FFT is the
Radix-2 Decimation in Time algorithm. The Radix-2 FFT works by
decomposing an N point time domain signal into N time domain signals, each
composed of a single point.
Signal decomposition, or ‘ decimation in time ’ is achieved by bit reversing
the indices for the array of time domain data. Thus, for a sixteen-point signal,
sample 1 (Binary 0001) is swapped with sample 8 (1000), sample 2 (0010) is
swapped with 4 (0100) and so on. Sample swapping using the bit reverse
technique can be achieved simply in software, but limits the use of the Radix
2 FFT to signals of length N = 2^M.
The value of a 1-point signal in the time domain is equal to its value in the
frequency domain, thus this array of decomposed single time-domain points
requires no transformation to become an array of frequency domain points.
The N single points; however, need to be reconstructed into one N-point
frequency spectra. Optimal reconstruction of the complete frequency
spectrum is performed using butterfly calculations. Each reconstruction stage
in the Radix-2 FFT performs a number of two point butterflies, using a
similar set of exponential weighting functions, Wn^R.
The FFT removes redundant calculations in the Discrete Fourier Transform
by exploiting the periodicity of Wn^R. Spectral reconstruction is completed
in log2(N) stages of butterfly calculations giving X[K]; the real and
imaginary frequency domain data in rectangular form. To convert to
magnitude and phase (polar coordinates) requires finding the absolute value,
√(Re^2 + Im^2), and argument, tan^-1(Im/Re).
The complete butterfly flow diagram for an eight point Radix 2 FFT is shown
below. Note the input signals have previously been reordered according to
the decimation in time procedure outlined previously.
The FFT typically operates on complex inputs and produces a complex
output. For real signals, the imaginary part may be set to zero and real part
set to the input signal, x[n], however many optimisations are possible
involving the transformation of real-only data. Values of Wn^R used
throughout the reconstruction can be determined using the exponential
weighting equation.
The value of R (the exponential weighting power) is determined by the current
stage in the spectral reconstruction and the current calculation within a
particular butterfly.
Code Example (C/C++)
A C/C++ code sample for computing the Radix 2 FFT can be found below.
This is a simple implementation which works for any size N where N is a
power of 2. It is approximately 3x slower than the fastest FFTW implementation, but
still a very good basis for future optimisation or for learning about how this
algorithm works.
#include <math.h>

#define TWOPI 6.2831853071795865

// Supporting declarations used by the FFT routine below.
struct complex
{
public:
    double Re, Im;        // Not necessary to use a struct, just rather handy.
};

// Returns true if N is a power of two and stores M = log2(N) in *M.
bool isPwrTwo(int N, int *M)
{
    *M = (int)ceil(log2((double)N));
    int NN = (int)pow(2.0, *M);

    if ((NN != N) || (NN == 0)) // Check N is a power of 2.
        return false;

    return true;
}
void rad2FFT(int N, complex *x, complex *DFT)
{
    int M = 0;

    // Check if power of two. If not, exit
    if (!isPwrTwo(N, &M))
        throw "Rad2FFT(): N must be a power of 2 for Radix FFT";

    // Integer Variables
    int BSep;                   // BSep is memory spacing between butterflies
    int BWidth;                 // BWidth is memory spacing of opposite ends of the butterfly
    int P;                      // P is number of similar Wn's to be used in that stage
    int j;                      // j is used in a loop to perform all calculations in each stage
    int stage = 1;              // stage is the stage number of the FFT. There are M stages in total (1 to M).
    int HiIndex;                // HiIndex is the index of the DFT array for the top value of each butterfly calc
    int iaddr;                  // bitmask for bit reversal
    int ii;                     // Integer bitfield for bit reversal (Decimation in Time)
    int MM1 = M - 1;

    unsigned int i;
    int l;
    unsigned int nMax = (unsigned int)N;

    // Double Precision Variables
    double TwoPi_N = TWOPI / (double)N;   // constant to save computational time. = 2*PI / N
    double TwoPi_NP;

    // complex Variables (See 'struct complex')
    complex WN;                 // Wn is the exponential weighting function in the form a + jb
    complex TEMP;               // TEMP is used to save computation in the butterfly calc
    complex *pDFT = DFT;        // Pointer to first elements in DFT array
    complex *pLo;               // Pointer for lo / hi value of butterfly calcs
    complex *pHi;
    complex *pX;                // Pointer to x[n]

    // Decimation In Time - x[n] sample sorting
    for (i = 0; i < nMax; i++, DFT++)
    {
        pX = x + i;             // Calculate current x[n] from base address *x and index i.

        ii = 0;                 // Reset new address for DFT[n]
        iaddr = i;              // Copy i for manipulations
        for (l = 0; l < M; l++) // Bit reverse i and store in ii...
        {
            if (iaddr & 0x01)   // Determine least significant bit
                ii += (1 << (MM1 - l));   // Increment ii by 2^(M-1-l) if lsb was 1
            iaddr >>= 1;        // Right shift iaddr to test next bit. Use logical operations for speed increase
            if (!iaddr)
                break;
        }
        DFT = pDFT + ii;        // Calculate current DFT[n] from base address *pDFT and bit reversed index ii
        DFT->Re = pX->Re;       // Update the complex array with address sorted time domain signal x[n]
        DFT->Im = pX->Im;       // NB: Imaginary is always zero
    }

    // FFT Computation by butterfly calculation
    for (stage = 1; stage <= M; stage++)  // Loop for M stages, where 2^M = N
    {
        BSep = (int)(pow(2, stage));      // Separation between butterflies = 2^stage
        P = N / BSep;                     // Similar Wn's in this stage = N/BSep
        BWidth = BSep / 2;                // Butterfly width (spacing between opposite points) = Separation / 2.

        TwoPi_NP = TwoPi_N * P;

        for (j = 0; j < BWidth; j++)      // Loop for j calculations per butterfly
        {
            if (j != 0)                   // Save on calculation if R = 0, as WN^0 = (1 + j0)
            {
                //WN.Re = cos(TwoPi_NP*j)
                WN.Re = cos(TwoPi_N * P * j);   // Calculate Wn (Real and Imaginary)
                WN.Im = -sin(TwoPi_N * P * j);
            }

            for (HiIndex = j; HiIndex < N; HiIndex += BSep)   // Loop for HiIndex Step BSep butterflies per stage
            {
                pHi = pDFT + HiIndex;     // Point to higher value
                pLo = pHi + BWidth;       // Point to lower value (Note VC++ adjusts for spacing between elements)

                if (j != 0)               // If exponential power is not zero...
                {
                    //CMult(pLo, &WN, &TEMP);        // Perform complex multiplication of Lo value with Wn
                    TEMP.Re = (pLo->Re * WN.Re) - (pLo->Im * WN.Im);
                    TEMP.Im = (pLo->Re * WN.Im) + (pLo->Im * WN.Re);

                    //CSub (pHi, &TEMP, pLo);
                    pLo->Re = pHi->Re - TEMP.Re;     // Find new Lo value (complex subtraction)
                    pLo->Im = pHi->Im - TEMP.Im;

                    //CAdd (pHi, &TEMP, pHi);        // Find new Hi value (complex addition)
                    pHi->Re = (pHi->Re + TEMP.Re);
                    pHi->Im = (pHi->Im + TEMP.Im);
                }
                else
                {
                    TEMP.Re = pLo->Re;
                    TEMP.Im = pLo->Im;

                    //CSub (pHi, &TEMP, pLo);
                    pLo->Re = pHi->Re - TEMP.Re;     // Find new Lo value (complex subtraction)
                    pLo->Im = pHi->Im - TEMP.Im;

                    //CAdd (pHi, &TEMP, pHi);        // Find new Hi value (complex addition)
                    pHi->Re = (pHi->Re + TEMP.Re);
                    pHi->Im = (pHi->Im + TEMP.Im);
                }
            }
        }
    }
}
Section 55.2: Radix 2 Inverse FFT
Due to the strong duality of the Fourier Transform, adjusting the output of a
forward transform can produce the inverse FFT. Data in the frequency
domain can be converted to the time domain by the following method:
1. Find the complex conjugate of the frequency domain data by inverting
the imaginary component for all instances of K.
2. Perform the forward FFT on the conjugated frequency domain data.
3. Divide each output of the result of this FFT by N to give the true time
domain value.
4. Find the complex conjugate of the output by inverting the imaginary
component of the time domain data for all instances of n.
Note: both frequency and time domain data are complex variables. Typically
the imaginary component of the time domain signal following an inverse FFT
is either zero, or ignored as rounding error. Increasing the precision of
variables from 32-bit float to 64-bit double, or 128-bit long double
significantly reduces rounding errors produced by several consecutive FFT
operations.
Code Example (C/C++)
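The original listing is not reproduced here; a short sketch that follows the four steps above, reusing struct complex and rad2FFT() from the previous section (the function name is an assumption), might look like this:

/* Inverse FFT via the forward transform: X[k] is the frequency domain input,
 * x[n] receives the reconstructed time domain signal. */
void rad2InverseFFT(int N, complex *X, complex *x)
{
    int n, k;

    for (k = 0; k < N; k++)
        X[k].Im = -X[k].Im;            // 1. conjugate the frequency domain data

    rad2FFT(N, X, x);                  // 2. forward FFT of the conjugated data

    for (n = 0; n < N; n++)
    {
        x[n].Re =  x[n].Re / N;        // 3. divide each output by N
        x[n].Im = -x[n].Im / N;        // 4. conjugate the time domain result
    }

    for (k = 0; k < N; k++)
        X[k].Im = -X[k].Im;            // restore the caller's frequency data
}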
Appendix A: Pseudocode
Section A.1: Variable affectations
You could describe variable affectation in different ways.
Typed
Section A.2: Functions
As long as the function name, return statement and parameters are clear,
you're fine.
or
or
are all quite clear, so you may use them. Try not to be ambiguous with a
variable affectation
THE END