
UNIT-2

S.No  TOPICS
1  Matroids
2  Introduction to Greedy Paradigm
3  Algorithm to Compute a Maximum Weight Maximal Independent Set
4  Application to MST
5  Graph Matching
6  Algorithm to Compute Maximum Matching
7  Characterization of Maximum Matching by Augmenting Paths
8  Edmond's Blossom Algorithm to Compute Augmenting Path

MATROIDS:
A matroid is a structure that abstracts and generalizes the notion of linear independence in vector spaces. A matroid is a pair ⟨X, I⟩ where X is called the ground set and I is the set of all independent subsets of X. In other words, the matroid ⟨X, I⟩ classifies each subset of X as either independent or dependent (included in I or not included in I). Of course, we are not speaking about arbitrary classifications: these three properties must hold for any matroid:
1. The empty set is independent.
2. Any subset of an independent set is independent.
3. If an independent set A is smaller than an independent set B, there exists at least one element in B that can be added to A without losing independence.
These are the axiomatic properties of a matroid. To prove that a structure is a matroid, we generally have to prove these three properties. For example, consider the explicit presentation of a matroid on the ground set {x, y, z} which considers {y, z} and {x, y, z} to be dependent (marked red); the other sets are independent and marked green in the diagram below.
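For small ground sets, the three axioms can be checked directly by brute force. Below is a minimal sketch (Python, with illustrative helper names; exponential time, so suitable only for tiny examples such as the one above) that encodes the family I as a set of frozensets:

from itertools import combinations

def is_matroid(ground_set, independent):
    # Brute-force check of the three matroid axioms.
    I = {frozenset(s) for s in independent}
    # Axiom 1: the empty set is independent.
    if frozenset() not in I:
        return False
    # Axiom 2: every subset of an independent set is independent.
    for A in I:
        for r in range(len(A)):
            if any(frozenset(s) not in I for s in combinations(A, r)):
                return False
    # Axiom 3 (exchange): if |A| < |B|, some x in B \ A keeps A independent.
    for A in I:
        for B in I:
            if len(A) < len(B):
                if not any(A | {x} in I for x in B - A):
                    return False
    return True

# The example from the text: on {x, y, z}, {y, z} and {x, y, z} are dependent.
I = [set(), {"x"}, {"y"}, {"z"}, {"x", "y"}, {"x", "z"}]
print(is_matroid({"x", "y", "z"}, I))  # True

Running it on the example family above prints True, confirming the classification is indeed a matroid.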

Examples
Matroids come in different types. Here are some examples:
 Uniform matroid. A matroid that considers a subset S independent if the size of S is not greater than some constant k (|S| ≤ k). This is the simplest one: it does not distinguish elements of the ground set in any way, it only cares about the number of taken elements. All subsets of size k are bases for this matroid, and all subsets of size (k + 1) are circuits. We can also define some specific cases of the uniform matroid.

 Trivial matroid. k = 0. Only the empty set is considered independent; any element of the ground set is considered dependent (and, as a consequence, so is any combination of ground set elements).

 Complete matroid. k = |X|. All subsets are considered independent, including the complete ground set itself.

 Linear (algebra) matroid. The ground set consists of vectors of some vector space. A set of vectors is considered independent if it is linearly independent (no vector can be expressed as a linear combination of the other vectors from that set). This is the matroid from which the whole of matroid theory originates. Linear bases of the vector set are bases of the matroid. Any circuit of this matroid is a set of vectors in which each vector can be expressed as a combination of all the other vectors, and this combination involves all the other vectors in the circuit.

 Colorful matroid. The ground set consists of colored elements, each element having exactly one color. A set of elements is independent if no pair of included elements shares a color. The rank of a set is the number of different colors included in it. Bases of this matroid are sets that have exactly one element of each color. Circuits of this matroid are all possible pairs of elements of the same color.

 Graphic matroid. This matroid is defined on the edges of some undirected graph. A set of edges is independent if it does not contain a cycle. This type of matroid is the best one for visual examples, because it can include dependent subsets of large size and can still be drawn as a picture. If the graph is connected, then any basis is just a spanning tree of the graph. If the graph is not connected, then a basis is a forest of spanning trees, one for each connected component. Circuits are the simple cycles of the graph. Here is an example of the circuit combination property in a graphic matroid (see figure). An independence oracle for this matroid type can be implemented with DFS or BFS (start from each vertex and check that no edge connects a vertex with an already visited one) or with DSU (keep connected components: start with disjoint vertices, join by all edges, and ensure that each edge connected two different components upon addition), as sketched below.
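A minimal sketch of the DSU-based independence oracle just described (assuming vertices are numbered 0..n−1; helper names are illustrative):

class DSU:
    # Disjoint set union with path halving, for the independence oracle.
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # a and b were already connected
        self.parent[ra] = rb
        return True

def is_independent(n, edges):
    # An edge set is independent in the graphic matroid iff it is acyclic.
    dsu = DSU(n)
    # Every edge must join two different components; otherwise it closes a cycle.
    return all(dsu.union(u, v) for u, v in edges)

print(is_independent(4, [(0, 1), (1, 2), (2, 3)]))  # True (a path)
print(is_independent(4, [(0, 1), (1, 2), (2, 0)]))  # False (a triangle)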

 Truncated matroid. We can limit the rank of any matroid to some number k without breaking the matroid properties. For example, a basis of the truncated colorful matroid is a set of elements that includes no more than k different colors, all of them distinct. A basis of the truncated graphic matroid is an acyclic set of edges that leaves at least (n − k) connected components in the graph (where n is the number of vertices in the graph). This is possible because the third matroid property does not refer only to bases of the matroid but to any independent set: when all independent sets of size greater than k are removed simultaneously, the independent sets of size k become the new bases, while any smaller independent set can still find elements of some basis that can be added to it.

 Matroid on a subset of the ground set. We can limit the ground set of a matroid to a subset of it without breaking the matroid properties. This is possible because the rules of dependence do not rely on specific elements being in the ground set. If we remove an edge from a graph, we still have a valid graph. If we remove an element from a set (of vectors or colored elements), we still get a valid set of elements of the same type, and the rules are preserved. Now we can also define the rank of a subset in a matroid as the rank of the matroid on the ground set limited to this subset.

 Expanded matroid. Direct matroid sum. We can consider two matroids as one big matroid without any difficulties if the elements of the ground set of the first matroid do not affect independence in, nor intersect with, the ground set of the second matroid, and vice versa. If we consider two graphic matroids on two connected graphs, we can unite the graphs into one graph with two connected components, and it is clear that including some edges in one component has no effect on the other component. This is called the direct matroid sum. Formally,
M1 = ⟨X1, I1⟩,
M2 = ⟨X2, I2⟩,
M1 + M2 = ⟨X1 ∪ X2, I1 × I2⟩,
where × means the Cartesian product of two sets. We can unite as many matroids of as many different types as we like, without restrictions.

INTRODUCTION TO GREEDY PARADIGM:

Greedy Algorithm:
In a greedy algorithm, a set of resources is divided recursively based on the maximum, immediate availability of that resource at any given stage of execution. To solve a problem based on the greedy approach, there are two stages:
1. Scanning the list of items
2. Optimization
These stages are covered in parallel over the course of the division of the array.
To understand the greedy approach, we need a working knowledge of recursion and context switching. This helps us trace the code.
Two conditions define the greedy paradigm:
 Each stepwise solution must structure the problem towards its best-accepted solution.
 It is sufficient that the structuring of the problem halts in a finite number of greedy steps.

History of Greedy Algorithms:


Here are some important landmarks of greedy algorithms:
 Greedy algorithms were conceptualized for many graph-walk algorithms in the 1950s.
 Edsger Dijkstra conceptualized the algorithm to generate minimal spanning trees. He aimed to shorten the span of routes within the Dutch capital, Amsterdam.
 In the same decade, Prim and Kruskal achieved optimization strategies that were based on minimizing path costs along weighted routes.
 Cormen, Leiserson, Rivest, and Stein presented a recursive substructuring of greedy solutions in their classic book Introduction to Algorithms.
 The greedy paradigm was registered as a distinct type of optimization strategy in the NIST records in 2005.
 To date, protocols that run the web, such as Open Shortest Path First (OSPF) and many other network packet switching protocols, use the greedy strategy to minimize time spent on a network.

Components of Greedy Algorithm:


Greedy algorithms have the following five components (wired together in the sketch after this list):
 A candidate set − a solution is created from this set.
 A selection function − used to choose the best candidate to be added to the solution.
 A feasibility function − used to determine whether a candidate can contribute to the solution.
 An objective function − used to assign a value to a solution or a partial solution.
 A solution function − used to indicate whether a complete solution has been reached.
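A generic skeleton that wires these components together might look as follows (a sketch with illustrative parameter names, not a definitive implementation; the objective function is implicit in how select ranks candidates):

def greedy(candidates, select, feasible, is_solution):
    # Generic greedy loop over the five components listed above.
    solution = []
    candidates = set(candidates)
    while candidates and not is_solution(solution):
        best = select(candidates)        # selection function
        candidates.remove(best)
        if feasible(solution, best):     # feasibility function
            solution.append(best)        # extend the partial solution
    return solution

# Hypothetical demo: pick numbers that exactly fill a budget of 10,
# always taking the largest feasible candidate first.
print(greedy({1, 2, 3, 4, 5, 7}, max,
             lambda s, c: sum(s) + c <= 10,
             lambda s: sum(s) == 10))    # [7, 3]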

Characteristics of the Greedy Approach:


The important characteristics of a greedy method are:
 There is an ordered list of resources, with costs or value attributions. These quantify the constraints on a system.
 We take the maximum quantity of resources in the time for which a constraint applies.
 For example, in an activity scheduling problem, the resource costs are in hours, and the activities need to be performed in serial order.

Here are the reasons for using the greedy approach:


 The greedy approach has a few tradeoffs, which may make it suitable for optimization.
 One prominent reason is to achieve the most feasible solution immediately. In the activity selection problem (explained below), if more activities can be done before finishing the current activity, these activities can be performed within the same time.
 Another reason is to divide a problem recursively based on a condition, with no need to combine all the solutions.
 In the activity selection problem, the "recursive division" step is achieved by scanning the list of items only once and considering certain activities.

How to solve the activity selection problem?


In the activity scheduling example, there is a "start" and "finish" time for every activity. Each activity is indexed by a number for reference. There are two activity categories.
1. Considered activity: the activity that is the reference from which the ability to do more than one remaining activity is analyzed.
2. Remaining activities: activities at one or more indexes ahead of the considered activity.
The total duration gives the cost of performing the activity; that is, (finish − start) gives us the duration as the cost of an activity. The greedy extent is the number of remaining activities we can perform in the time of a considered activity.
Architecture of the Greedy approach:
STEP 1: Scan the list of activity costs, starting with index 0 as the considered index.
STEP 2: When more activities can be finished by the time the considered activity finishes, start searching for one or more remaining activities.
STEP 3: If there are more remaining activities, the current remaining activity becomes the next considered activity. Repeat step 1 and step 2 with the new considered activity. If there are no remaining activities left, go to step 4.
STEP 4: Return the union of considered indices. These are the activity indices that will be used to maximize throughput.
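A common concrete version of these steps is the classic greedy for activity selection: sort by finish time, then repeatedly take the first activity that starts after the last chosen one finishes. A minimal sketch, assuming activities are given as (start, finish) pairs:

def select_activities(activities):
    # Sort indices by finish time, then greedily keep non-overlapping ones.
    order = sorted(range(len(activities)), key=lambda i: activities[i][1])
    chosen, last_finish = [], float("-inf")
    for i in order:
        start, finish = activities[i]
        if start >= last_finish:   # feasible: does not overlap the chosen set
            chosen.append(i)
            last_finish = finish
    return chosen

# Example: (start, finish) hours.
print(select_activities([(1, 2), (3, 4), (0, 6), (5, 7), (8, 9), (5, 9)]))
# -> [0, 1, 3, 4]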

Disadvantages of Greedy Algorithms:


 It is not suitable for problems where a solution is required for every sub-problem, such as sorting.
 In such problems, the greedy strategy can be wrong; in the worst case, it can even lead to a non-optimal solution.
 The disadvantage of greedy algorithms is therefore that they do not know what lies ahead of the current greedy state. Below is a depiction of this disadvantage.

In the greedy scan shown here as a tree (higher value, higher greed), an algorithm state at value 40 is likely to take 29 as the next value. Further, its quest ends at 12. This amounts to a value of 41. However, if the algorithm had taken the apparently suboptimal path in which 25 is followed by 40, the overall value would be 65, which is 24 points higher than the greedy decision.

Examples of Greedy Algorithms:


Most networking algorithms use the greedy approach. Here is a list of a few of them:
 Prim's Minimal Spanning Tree Algorithm
 Travelling Salesman Problem
 Graph - Map Coloring
 Kruskal's Minimal Spanning Tree Algorithm
 Dijkstra's Shortest Path Algorithm
 Graph - Vertex Cover
 Knapsack Problem
 Job Scheduling Problem

ALGORITHM TO COMPUTE A MAXIMUM WEIGHT MAXIMAL
INDEPENDENT SET:
The greedy approach and divide and conquer algorithms are used to compute a maximum weight maximal independent set.

Greedy Algorithm: The maximum (weighted) independent set problem (MIS/MWIS) is one of the most important optimization problems. Among heuristic methods for optimization problems, the greedy strategy is the most natural and simplest one. For MIS, two simple greedy algorithms have been investigated. One is called GMIN: it selects a vertex of minimum degree, removes it and its neighbors from the graph, and iterates this process on the remaining graph until no vertex remains (the set of selected vertices is an independent set). The other is called GMAX: it deletes a vertex of maximum degree until no edge remains (the set of remaining vertices is an independent set).
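A minimal sketch of the GMIN heuristic just described (unweighted version; the adjacency is assumed to be a dict from each vertex to its neighbor set):

def gmin_independent_set(adj):
    # GMIN: repeatedly pick a minimum-degree vertex, add it to the
    # independent set, and delete it and its neighbors from the graph.
    adj = {v: set(ns) for v, ns in adj.items()}   # local mutable copy
    independent = []
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))   # minimum-degree vertex
        independent.append(v)
        removed = adj[v] | {v}
        for u in removed:
            adj.pop(u, None)
        for u in adj:
            adj[u] -= removed
    return independent

# A 5-cycle: GMIN finds an independent set of size 2, the maximum possible.
cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
print(gmin_independent_set(cycle5))  # e.g. [0, 2]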

Divide and Conquer:


Divide and Conquer is an algorithmic paradigm. A typical divide and conquer algorithm solves a problem using the following three steps:
1. Divide: break the given problem into sub-problems of the same type.
2. Conquer: recursively solve these sub-problems.
3. Combine: appropriately combine the answers.

Divide and Conquer algorithm:


DAC(a, i, j)
{
    if (small(a, i, j))
        return Solution(a, i, j)
    else
        mid = divide(a, i, j)      // f1(n): find the split point
        b = DAC(a, i, mid)         // T(n/2)
        c = DAC(a, mid+1, j)       // T(n/2)
        d = combine(b, c)          // f2(n)
        return d
}

Divide/Break: This step involves breaking the problem into smaller sub-problems. Sub-problems should represent a part of the original problem. This step generally takes a recursive approach, dividing the problem until no sub-problem is further divisible. At this stage, sub-problems become atomic in nature but still represent some part of the actual problem.

Conquer/Solve: This step receives a lot of smaller sub-problems to be solved. Generally, at this level, the problems are considered 'solved' on their own.

Merge/Combine: When the smaller sub-problems are solved, this stage combines them recursively until they form the solution of the original problem. This algorithmic approach works recursively, and the conquer and merge steps work so closely together that they appear as one. The diagram below indicates this:

The following computer algorithms are based on divide-and-conquer programming approach


1. Binary Search is a searching algorithm. In each step, the algorithm compares the input element x with the value of the middle element of the array. If the values match, it returns the index of the middle element. Otherwise, if x is less than the middle element, the algorithm recurses on the left side of the middle element, else it recurses on the right side.
2. Quick sort is a sorting algorithm. The algorithm picks a pivot element, rearranges the array
elements in such a way that all elements smaller than the picked pivot element move to left side of
pivot and all greater elements move to right side. Finally, the algorithm recursively sorts the sub-
arrays on left and right of pivot element.
3. Merge Sort is also a sorting algorithm. The algorithm divides the array in two halves, recursively
sorts them and finally merges the two sorted halves.
4. Closest Pair of Points The problem is to find the closest pair of points in a set of points in x-y
plane. The problem can be solved in O(n^2) time by calculating distances of every pair of points
and comparing the distances to find the minimum. The Divide and Conquer algorithm solves the
problem in O(nLogn) time.
5. Strassen's Algorithm is an efficient algorithm to multiply two matrices. A simple method to multiply two matrices needs 3 nested loops and is O(n^3). Strassen's algorithm multiplies two matrices in O(n^2.8074) time.
6. Cooley–Tukey Fast Fourier Transform (FFT) algorithm is the most common algorithm for
FFT. It is a divide and conquer algorithm which works in O(nlogn) time.
7. Karatsuba algorithm for fast multiplication: it multiplies two n-digit numbers in at most n^(log2 3) ≈ n^1.585 single-digit multiplications in general (exactly n^(log2 3) when n is a power of 2). It is therefore faster than the classical algorithm, which requires n^2 single-digit products. If n = 2^10 = 1024, the exact counts are 3^10 = 59,049 and (2^10)^2 = 1,048,576, respectively.
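As an illustration of the divide/conquer/combine pattern, here is a minimal merge sort sketch (item 3 above):

def merge_sort(a):
    # Divide the array in halves, conquer each recursively, combine by merging.
    if len(a) <= 1:                      # atomic sub-problem: already solved
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])           # divide + conquer
    right = merge_sort(a[mid:])
    merged, i, j = [], 0, 0              # combine the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]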

APPLICATION TO MST:
Tree: A tree is a graph with the following properties:
1. The graph is connected (you can go from anywhere to anywhere).
2. There are no cycles (it is acyclic).

Spanning Tree:
Given a connected undirected graph, a spanning tree of that graph is a sub-graph that is a tree and joins all the vertices. A single graph can have many spanning trees.

For example, for a given connected undirected graph, there can be multiple spanning trees (see figure).

Properties of Spanning Tree:


1. There may be several minimum spanning trees of the same weight having the minimum number of edges.
2. If all the edge weights of a given graph are the same, then every spanning tree of that graph is minimum.
3. If each edge has a distinct weight, then there is exactly one, unique minimum spanning tree.
4. A connected graph G can have more than one spanning tree.
5. A disconnected graph does not have a spanning tree, since a spanning tree cannot span all the vertices.
6. A spanning tree does not contain cycles.
7. A spanning tree has (n − 1) edges, where n is the number of vertices.
Adding even a single edge makes the spanning tree lose its acyclicity, and removing a single edge makes it lose its connectivity.

Minimum Spanning Tree:


A minimum spanning tree is a spanning tree that has the minimum total cost. If we have a connected undirected graph with a weight (or cost) associated with each edge, then the cost of a spanning tree is the sum of the costs of its edges.

Application of Minimum Spanning Tree:


1. Consider n stations that are to be linked using a communication network, where laying a communication link between any two stations involves a cost. The ideal solution would be to extract a sub-graph termed the minimum cost spanning tree.
2. Suppose we want to construct highways or railroads spanning several cities; then we can use the concept of minimum spanning trees.
3. Designing local area networks.
4. Laying pipelines connecting offshore drilling sites, refineries and consumer markets.
5. Suppose we want to supply a set of houses with
o Electric power
o Water
o Telephone lines
o Sewage lines
To reduce cost, we can connect the houses with a minimum cost spanning tree.

Example: Problem laying Telephone Wire

Methods of Minimum Spanning Tree
There are two methods to find a minimum spanning tree:
1. Kruskal's Algorithm
2. Prim's Algorithm

Kruskal's Algorithm:
This is an algorithm to construct a minimum spanning tree for a connected weighted graph. It is a greedy algorithm. If the graph is not connected, it finds a minimum spanning forest (a minimum spanning tree for each connected component).

Steps for finding MST using Kruskal's Algorithm:


1. Arrange the edges of G in order of increasing weight.
2. Starting with only the vertices of G and proceeding sequentially, add each edge which does not result in a cycle, until (n − 1) edges are used.
3. EXIT.
MST-KRUSKAL (G, w)
1. A ← ∅
2. for each vertex v ∈ V[G]
3.     do MAKE-SET(v)
4. sort the edges of E into non-decreasing order by weight w
5. for each edge (u, v) ∈ E, taken in non-decreasing order by weight
6.     do if FIND-SET(u) ≠ FIND-SET(v)
7.         then A ← A ∪ {(u, v)}
8.             UNION(u, v)
9. return A
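A compact sketch of this pseudocode, with a simple disjoint-set union standing in for MAKE-SET/FIND-SET/UNION (edges given as (weight, u, v) triples over vertices 0..n−1):

def kruskal(n, edges):
    parent = list(range(n))              # MAKE-SET for each vertex

    def find(v):                         # FIND-SET with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    A = []
    for w, u, v in sorted(edges):        # non-decreasing order by weight
        ru, rv = find(u), find(v)
        if ru != rv:                     # endpoints in different trees
            A.append((u, v, w))
            parent[ru] = rv              # UNION
    return A

# Tiny example: a 4-cycle with one extra diagonal edge.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(kruskal(4, edges))  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]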

Analysis:
Where E is the number of edges in the graph and V is the number of vertices, Kruskal's algorithm can be shown to run in O(E log E) time, or equivalently O(E log V) time, all with simple data structures. These running times are equivalent because:
o E is at most V^2, and log V^2 = 2 log V, which is O(log V).
o If we ignore isolated vertices, V ≤ 2E, so log V is O(log E). Thus the total time is O(E log E) = O(E log V).

For Example: Find the Minimum Spanning Tree of the following graph using Kruskal's algorithm.

Solution: First we initialize the set A to the empty set and create |V| trees, one containing each vertex, with the MAKE-SET procedure. Then we sort the edges of E into non-decreasing order by weight.
There are 9 vertices and 12 edges, so the MST formed has (9 − 1) = 8 edges.

Weight  Source  Destination
1       h       g
2       g       f
4       a       b
6       i       g
7       h       i
7       c       d
8       b       c
8       a       h
9       d       e
10      e       f
11      b       h
14      d       f

Now, check for each edge (u, v) whether the endpoints u and v belong to the same tree. If they do, then the edge (u, v) cannot be added (it would create a cycle). Otherwise, the two vertices belong to different trees, the edge (u, v) is added to A, and the vertices of the two trees are merged by the UNION procedure.

Steps to find Minimum Spanning Tree using Kruskal's algorithm
Step 1: First take the (h, g) edge.
Step 2: Then the (g, f) edge.
Step 3: Then the (a, b) and (i, g) edges are considered, and the forest grows.
Step 4: Now, edge (h, i). Both vertices h and i are in the same set, thus it creates a cycle, so this edge is discarded. Then edges (c, d), (b, c), (a, h) and (d, e) are considered, and the forest grows further.
Step 5: For edge (e, f), both endpoints e and f are in the same tree, so this edge is discarded. Then edge (b, h) is considered; it also creates a cycle.
Step 6: After that, edge (d, f) is considered and discarded for the same reason, and the final spanning tree is shown in dark lines.
Step 7: This is the required minimum spanning tree, because it contains all 9 vertices and (9 − 1) = 8 edges. The discarded edges were e → f, b → h, d → f [a cycle would have been formed].

Minimum Cost MST

Prim's Algorithm
It is a greedy algorithm. It starts with an empty spanning tree. The idea is to maintain two sets of vertices:
o vertices already included in the MST;
o vertices not yet included.
At every step, it considers all the edges that connect the two sets and picks the minimum weight edge. After picking the edge, it moves the other endpoint of the edge to the set containing the MST.

Steps for finding MST using Prim's Algorithm:

1. Create an MST set that keeps track of vertices already included in the MST.
2. Assign key values to all vertices in the input graph. Initialize all key values as INFINITE (∞). Assign a key value of 0 to the first vertex so that it is picked first.
3. While the MST set doesn't include all vertices:
a. Pick the vertex u which is not in the MST set and has the minimum key value. Include u in the MST set.
b. Update the key value of all adjacent vertices of u. To update, iterate through all adjacent vertices: for every adjacent vertex v, if the weight of edge (u, v) is less than the previous key value of v, update the key value to the weight of (u, v).
MST-PRIM (G, w, r)
1. for each u ∈ V[G]
2.     do key[u] ← ∞
3.         π[u] ← NIL
4. key[r] ← 0
5. Q ← V[G]
6. while Q ≠ ∅
7.     do u ← EXTRACT-MIN(Q)
8.         for each v ∈ Adj[u]
9.             do if v ∈ Q and w(u, v) < key[v]
10.                 then π[v] ← u
11.                     key[v] ← w(u, v)
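A heap-based sketch of this pseudocode, where lazy insertion into a binary heap stands in for EXTRACT-MIN and the decrease-key step. The adjacency list in the usage example is reconstructed from the walkthrough that follows (vertices 0..6), so the output matches its final π table and total cost:

import heapq

def prim(adj, root=0):
    # adj[u] is a list of (v, weight) pairs; returns (parent, total cost).
    n = len(adj)
    key = [float("inf")] * n
    parent = [None] * n                 # the π field
    key[root] = 0
    in_mst = [False] * n
    pq = [(0, root)]
    total = 0
    while pq:
        k, u = heapq.heappop(pq)        # EXTRACT-MIN
        if in_mst[u]:
            continue                    # stale heap entry, skip
        in_mst[u] = True
        total += k
        for v, w in adj[u]:             # decrease-key via lazy insertion
            if not in_mst[v] and w < key[v]:
                key[v], parent[v] = w, u
                heapq.heappush(pq, (w, v))
    return parent, total

# The graph from the worked example below; expected total cost is 99.
adj = [
    [(5, 10), (1, 28)],
    [(0, 28), (2, 16), (6, 14)],
    [(3, 12), (1, 16)],
    [(4, 22), (6, 18), (2, 12)],
    [(5, 25), (3, 22), (6, 24)],
    [(0, 10), (4, 25)],
    [(4, 24), (3, 18), (1, 14)],
]
print(prim(adj))  # ([None, 2, 3, 4, 5, 0, 1], 99)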

Example: Generate minimum cost spanning tree for the following graph using Prim's algorithm.

Solution: In Prim's algorithm, first we initialize the priority queue Q to contain all the vertices, and the key of each vertex to ∞, except for the root, whose key is set to 0. Suppose vertex 0 is the root r. By the EXTRACT-MIN(Q) procedure, now u = r and Adj[u] = {5, 1}. We remove u from the set Q and add it to the set V − Q of vertices in the tree. Now we update the key and π fields of every vertex v adjacent to u but not in the tree.

Vertex     0    1    2    3    4    5    6
Key Value  0    ∞    ∞    ∞    ∞    ∞    ∞
Parent     NIL  NIL  NIL  NIL  NIL  NIL  NIL

Taking 0 as the starting vertex:

Root = 0
Adj[0] = {5, 1}
Parent: π[5] = 0 and π[1] = 0
Key[5] = ∞ and key[1] = ∞
w(0, 5) = 10 and w(0, 1) = 28
w(0, 5) < key[5] and w(0, 1) < key[1],
so key[5] = 10 and key[1] = 28.
The updated key values of 5 and 1 are:
Vertex     0    1    2    3    4    5    6
Key Value  0    28   ∞    ∞    ∞    10   ∞
Parent     NIL  0    NIL  NIL  NIL  0    NIL

Now EXTRACT-MIN(Q) removes 5, because key[5] = 10 is minimum, so u = 5.

Adj[5] = {0, 4}, and 0 is already in the tree.
Taking 4: key[4] = ∞, π[4] = 5
w(5, 4) = 25
w(5, 4) < key[4], so key[4] = 25.
Update the key value and parent of 4:
Vertex     0    1    2    3    4    5    6
Key Value  0    28   ∞    ∞    25   10   ∞
Parent     NIL  0    NIL  NIL  5    0    NIL

Now remove 4, because key[4] = 25 is minimum, so u = 4.

Adj[4] = {6, 3}
Key[3] = ∞, key[6] = ∞
w(4, 3) = 22, w(4, 6) = 24
w(4, 3) < key[3] and w(4, 6) < key[6].

Update the key value of key[3] as 22 and key[6] as 24, and the parent of 3 and 6 as 4:

π[3] = 4, π[6] = 4
Vertex     0    1    2    3    4    5    6
Key Value  0    28   ∞    22   25   10   ∞
Parent     NIL  0    NIL  4    5    0    NIL

u = EXTRACT-MIN(Q): key[3] < key[6], i.e. 22 < 24, so u = 3.

Now remove 3, because key[3] = 22 is minimum.

Adj[3] = {4, 6, 2}; 4 is already in the tree.
Key[2] = ∞, key[6] = 24
w(3, 2) = 12, w(3, 6) = 18
w(3, 2) < key[2] and w(3, 6) < key[6], so key[6] = 24 now becomes key[6] = 18.

Now in Q: key[2] = 12, key[6] = 18, key[1] = 28, and the parent of 2 and 6 is 3:

π[2] = 3, π[6] = 3

Now EXTRACT-MIN(Q) removes 2, because key[2] = 12 is minimum [key[2] < key[6], i.e. 12 < 18].

Vertex     0    1    2    3    4    5    6
Key Value  0    28   12   22   25   10   18
Parent     NIL  0    3    4    5    0    3

Now u = 2.
Adj[2] = {3, 1}; 3 is already in the tree.
Taking 1: key[1] = 28
w(2, 1) = 16
w(2, 1) < key[1]

So update the key value of key[1] to 16 and its parent to 2:

π[1] = 2

Vertex     0    1    2    3    4    5    6
Key Value  0    16   12   22   25   10   18
Parent     NIL  2    3    4    5    0    3

Now EXTRACT-MIN(Q) removes 1, because key[1] = 16 is minimum.

Adj[1] = {0, 6, 2}; 0 and 2 are already in the tree.
Taking 6: key[6] = 18
w(1, 6) = 14
w(1, 6) < key[6]
Update the key value of 6 to 14 and its parent to 1:
π[6] = 1
Vertex     0    1    2    3    4    5    6
Key Value  0    16   12   22   25   10   14
Parent     NIL  2    3    4    5    0    1

Now all the vertices have been spanned. Using the above table we get the minimum spanning tree:

0 → 5 → 4 → 3 → 2 → 1 → 6
[because π[5] = 0, π[4] = 5, π[3] = 4, π[2] = 3, π[1] = 2, π[6] = 1]
Thus the final spanning tree has
Total Cost = 10 + 25 + 22 + 12 + 16 + 14 = 99

GRAPH MATCHING:
Graph matching is the problem of finding a similarity between graphs. Graphs are commonly used to
encode structural information in many fields, including computer vision and pattern recognition, and
graph matching is an important tool in these areas. In these areas it is commonly assumed that the
comparison is between the data graph and the model graph.
The case of exact graph matching is known as the graph isomorphism problem. The problem of exactly matching a graph to a part of another graph is called the subgraph isomorphism problem.
Inexact graph matching refers to matching problems where exact matching is impossible, e.g., when the numbers of vertices in the two graphs are different. In this case it is required to find the best possible match. For example, in image recognition applications, image segmentation typically produces data graphs with many more vertices than the model graphs they are expected to match against. In the case of attributed graphs, even if the numbers of vertices and edges are the same, the matching may still be only inexact.
Two categories of search methods are those based on identifying possible and impossible pairings of vertices between the two graphs, and those which formulate graph matching as an optimization problem. Graph edit distance is one of the similarity measures suggested for graph matching. This class of algorithms is called error-tolerant graph matching.
Definition
A matching graph is a sub-graph of a graph in which no two edges are adjacent to each other; simply, there should not be any common vertex between any two edges.

Matching
Let G = (V, E) be a graph. A subgraph is called a matching M(G) if each vertex of G is incident with at most one edge in M, i.e., deg(V) ≤ 1 ∀ V ∈ G. This means that in the matching graph M(G) the vertices should have a degree of 1 or 0, where the edges should be incident from the graph G.

Notation − M(G). Example:

In a matching,
if deg(V) = 1, then V is said to be matched;
if deg(V) = 0, then V is not matched.
In a matching, no two edges are adjacent, because if any two edges were adjacent, the vertex joining those two edges would have degree 2, which violates the matching rule.

Maximal Matching
A matching M of graph ‘G’ is said to be maximal if no other edges of ‘G’ can be added to M.
Example

M1, M2, M3 from the above graph are the maximal matchings of G.
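A maximal (not necessarily maximum) matching can be obtained by a single greedy scan of the edges; the sketch below also shows how the scanning order decides which maximal matching we get:

def maximal_matching(edges):
    # Keep any edge whose endpoints are both still unmatched. The result
    # cannot be extended, but it is not necessarily a maximum matching.
    matched, M = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            M.append((u, v))
            matched.update((u, v))
    return M

# A path a-b-c-d: the scanning order decides whether we get 1 or 2 edges.
print(maximal_matching([("b", "c"), ("a", "b"), ("c", "d")]))
# -> [('b', 'c')]  (maximal but not maximum)
print(maximal_matching([("a", "b"), ("b", "c"), ("c", "d")]))
# -> [('a', 'b'), ('c', 'd')]  (a maximum matching)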

Maximum Matching
It is also known as the largest maximal matching. A maximum matching is defined as a maximal matching with the maximum number of edges.
The number of edges in the maximum matching of G is called its matching number.

Maximum Matching – Example

For the graph given in the above example, M1 and M2 are maximum matchings of G and its matching number is 2. Hence, using the graph G, we can form only sub-graphs with at most 2 matching edges; the matching number is therefore two.

Perfect Matching
A matching M of a graph G is said to be a perfect matching if every vertex of the graph G is incident to exactly one edge of the matching M, i.e., deg(V) = 1 ∀ V.
Each and every vertex in the subgraph should have a degree of 1.

Perfect Matching - Example

In the following graphs, M1 and M2 are examples of perfect matchings of G.

Note − Every perfect matching of a graph is also a maximum matching of the graph, because there is no chance of adding one more edge to a perfect matching.
A maximum matching of a graph need not be perfect. If a graph G has a perfect matching, then the number of vertices |V(G)| is even. If it is odd, then after pairing up vertices there remains a single vertex which cannot be paired with any other vertex, and its degree is zero. This clearly violates the perfect matching principle.
Example

Note − The converse of the above statement need not be true: if G has an even number of vertices, a matching need not be perfect.
Example

The following is a matching, but not a perfect matching, even though the graph has an even number of vertices.

ALGORITHM TO COMPUTE MAXIMUM MATCHING:


Hopcroft–Karp Algorithm for Maximum Matching

A matching in a bipartite graph is a set of edges chosen in such a way that no two edges share an endpoint. A maximum matching is a matching of maximum size (maximum number of edges). In a maximum matching, if any edge is added to it, it is no longer a matching. There can be more than one maximum matching for a given bipartite graph.
The Hopcroft-Karp algorithm is an improvement that runs in O(√V × E) time. Let us define a few terms before we discuss the algorithm.
Free node or vertex: Given a matching M, a node that is not part of the matching is called a free node. Initially all vertices are free (see the first graph of the diagram below). In the second graph, u2 and v2 are free. In the third graph, no vertex is free.

Matching and not-matching edges: Given a matching M, edges that are part of the matching are called matching edges, and edges that are not part of M (or connect free nodes) are called not-matching edges. In the first graph, all edges are not-matching. In the second graph, (u0, v1), (u1, v0) and (u3, v3) are matching and the others not-matching.

Alternating paths: Given a matching M, an alternating path is a path in which the edges belong alternately to the matching and the not-matching set. All single-edge paths are alternating paths. Examples of alternating paths in the middle graph are u0-v1-u2 and u2-v1-u0-v2.

Augmenting path: Given a matching M, an augmenting path is an alternating path that starts from and ends on free vertices. All single-edge paths that start and end with free vertices are augmenting paths. In the diagram below, augmenting paths are highlighted in blue. Note that an augmenting path always has one more not-matching edge than matching edges, so flipping it grows the matching by one edge.
The Hopcroft-Karp algorithm is based on the following concept:
a matching M is not maximum if there exists an augmenting path. The converse is also true: a matching is maximum if no augmenting path exists. So the idea is to look for augmenting paths one by one and add the found paths to the current matching.

Hopcroft Karp Algorithm


1. Initialize the matching M as empty.
2. While there exists an augmenting path p, remove the matching edges of p from M and add the not-matching edges of p to M. (This increases the size of M by 1, since p starts and ends with a free vertex.)
3. Return M.

The diagram below shows the working of the algorithm.

In the initial graph, all single edges are augmenting paths and we can pick them in any order. In the middle stage, there is only one augmenting path. We remove the matching edges of this path from M and add the not-matching edges. In the final matching, there are no augmenting paths, so the matching is maximum.
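A sketch of the Hopcroft-Karp phases described above: a BFS layers the graph from the free left vertices, then a DFS extracts a set of shortest augmenting paths. This is a common template, not the only formulation; the small example graph at the end is hypothetical.

from collections import deque

INF = float("inf")

def hopcroft_karp(adj, n_left, n_right):
    # adj[u] lists the right-side neighbors of left vertex u.
    match_l = [-1] * n_left      # match_l[u] = right partner of u, or -1 (free)
    match_r = [-1] * n_right
    dist = [0] * n_left

    def bfs():
        # Layer the left vertices by alternating-path distance from free ones.
        q = deque()
        for u in range(n_left):
            if match_l[u] == -1:
                dist[u] = 0
                q.append(u)
            else:
                dist[u] = INF
        found = False
        while q:
            u = q.popleft()
            for v in adj[u]:
                w = match_r[v]
                if w == -1:
                    found = True          # a free right vertex ends a path
                elif dist[w] == INF:
                    dist[w] = dist[u] + 1
                    q.append(w)
        return found

    def dfs(u):
        # Follow layered edges only, flipping matched/unmatched on success.
        for v in adj[u]:
            w = match_r[v]
            if w == -1 or (dist[w] == dist[u] + 1 and dfs(w)):
                match_l[u], match_r[v] = v, u
                return True
        dist[u] = INF                     # dead end, prune this vertex
        return False

    matching = 0
    while bfs():                          # O(sqrt(V)) phases overall
        for u in range(n_left):
            if match_l[u] == -1 and dfs(u):
                matching += 1
    return matching

# u0..u3 on the left, v0..v3 on the right (hypothetical example).
print(hopcroft_karp([[0, 1], [0], [1, 2], [2, 3]], 4, 4))  # 4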

Ford-Fulkerson algorithm
The Ford-Fulkerson method or Ford-Fulkerson algorithm (FFA) is a greedy algorithm that computes the maximum flow in a flow network. It is sometimes called a "method" instead of an "algorithm" because the approach to finding augmenting paths in a residual graph is not fully specified: it is specified in several implementations with different running times.

Max Flow Problem Introduction


The max flow problem is an optimization problem for determining the maximum amount of stuff that can flow at a given point in time through a single source/sink flow network. A flow network is essentially just a directed graph where the edge weights represent the flow capacity of each edge. The stuff that flows through these networks could be literally anything: maybe it's traffic driving through a city, water flowing through pipes, or bits traveling across the internet.

Algorithm of Ford-Fulkerson algorithm:


1. Start with the initial flow as 0.
2. While there is an augmenting path from source to sink, add this path's flow to the flow.
3. Return flow.

Time Complexity: The time complexity of the above algorithm is O(max_flow × E). We run a loop while there is an augmenting path; in the worst case, we may add 1 unit of flow in every iteration. Therefore the time complexity becomes O(max_flow × E).

How to implement the above simple algorithm?


Let us first define the concept of Residual Graph which is needed for understanding the
implementation.
The residual graph of a flow network is a graph which indicates additional possible flow. If there is a path from source to sink in the residual graph, then it is possible to add flow. Every edge of a residual graph has a value called residual capacity, which is equal to the original capacity of the edge minus the current flow.
 Residual capacity is basically the current capacity of the edge. Residual capacity is 0 if there is no edge between two vertices of the residual graph.
 We can initialize the residual graph to the original graph, as there is no initial flow and initially the residual capacity is equal to the original capacity. To find an augmenting path, we can do either a BFS or a DFS of the residual graph. We use BFS in the implementation sketched below.
 Using BFS, we can find out whether there is a path from source to sink. BFS also builds the parent[] array. Using the parent[] array, we traverse the found path and find the possible flow through it as the minimum residual capacity along the path. We later add the found path flow to the overall flow.
 The important thing is that we need to update the residual capacities in the residual graph. We subtract the path flow from all edges along the path and add the path flow along the reverse edges. We need to add path flow along reverse edges because we may later need to send flow in the reverse direction.
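Putting these bullets together, here is a BFS-based Ford-Fulkerson sketch (i.e., the Edmonds-Karp variant) over an adjacency-matrix flow network. The capacity matrix in the demo is an assumption: it is the classic six-vertex textbook instance whose maximum flow is 23, matching the example value quoted below.

from collections import deque

def max_flow(capacity, s, t):
    # capacity is an n x n matrix; a residual graph is kept in `residual`.
    n = len(capacity)
    residual = [row[:] for row in capacity]   # initially residual = original
    flow = 0
    while True:
        parent = [-1] * n                     # BFS builds the parent[] array
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow                       # no augmenting path left
        # Bottleneck: minimum residual capacity along the found path.
        path_flow, v = float("inf"), t
        while v != s:
            u = parent[v]
            path_flow = min(path_flow, residual[u][v])
            v = u
        # Update residual capacities, adding reverse edges for undo.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= path_flow
            residual[v][u] += path_flow
            v = u
        flow += path_flow

# Assumed example: the classic 6-vertex instance (max flow 23).
C = [[0, 16, 13, 0, 0, 0],
     [0, 0, 0, 12, 0, 0],
     [0, 4, 0, 0, 14, 0],
     [0, 0, 9, 0, 0, 20],
     [0, 0, 0, 7, 0, 4],
     [0, 0, 0, 0, 0, 0]]
print(max_flow(C, 0, 5))  # 23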

Ford-Fulkerson Algorithm for Maximum Flow Problem

Given a graph which represents a flow network where every edge has a capacity, and two vertices source 's' and sink 't' in the graph, find the maximum possible flow from s to t with the following constraints:
a) Flow on an edge doesn't exceed the given capacity of the edge.
b) Incoming flow is equal to outgoing flow for every vertex except s and t.
For example, consider the following graph.

The maximum possible flow in the above graph is 23.

Maximum Bipartite Matching

A matching in a bipartite graph is a set of edges chosen in such a way that no two edges share an endpoint. A maximum matching is a matching of maximum size (maximum number of edges). In a maximum matching, if any edge is added to it, it is no longer a matching. There can be more than one maximum matching for a given bipartite graph.

Why do we care?
There are many real-world problems that can be formed as bipartite matching. For example, consider the following problem:
there are M job applicants and N jobs. Each applicant has a subset of jobs that he/she is interested in. Each job opening can only accept one applicant, and a job applicant can be appointed to only one job. Find an assignment of jobs to applicants such that as many applicants as possible get jobs.

Example: Ford-Fulkerson Algorithm for Maximum Flow Problem

Maximum Bipartite Matching and Max Flow Problem


The Maximum Bipartite Matching (MBP) problem can be solved by converting it into a flow network. The following are the steps:
1) Build a flow network.
There must be a source and a sink in a flow network, so we add a source and add edges from the source to all applicants. Similarly, we add edges from all jobs to the sink. The capacity of every edge is marked as 1 unit.

2) Find the maximum flow.

We use the Ford-Fulkerson algorithm to find the maximum flow in the flow network built in step 1. The maximum flow is actually the MBP we are looking for.

How to implement the above approach?
Let us first define the input and output forms. The input is in the form of an Edmonds matrix, a 2D array bpGraph[M][N] with M rows (for M job applicants) and N columns (for N jobs). The value bpGraph[i][j] is 1 if the i'th applicant is interested in the j'th job, otherwise 0.
The output is the maximum number of applicants that can get jobs.
 A simple way to implement this is to create a matrix that represents the adjacency matrix of a directed graph with M+N+2 vertices and call fordFulkerson() on the matrix. This implementation requires O((M+N)×(M+N)) extra space.
 Extra space can be reduced and the code simplified using the fact that the graph is bipartite and the capacity of every edge is either 0 or 1. The idea is to use a DFS traversal to find a job for an applicant (similar to finding an augmenting path in Ford-Fulkerson). We call bpm() for every applicant; bpm() is the DFS-based function that tries all possibilities to assign a job to the applicant.
 In bpm(), we try, one by one, all the jobs that an applicant 'u' is interested in, until we find a job or all jobs have been tried without luck. For every job we try, we do the following.
If a job is not assigned to anybody,
 we simply assign it to the applicant and return true. If a job is assigned to somebody else, say x, then we recursively check whether x can be assigned some other job. To make sure that x doesn't get the same job again, we mark the job 'v' as seen before we make the recursive call for x. If x can get another job, we change the applicant for job 'v' and return true. We use an array maxR[0..N-1] that stores the applicants assigned to the different jobs.
 If bpm() returns true, then it means that there is an augmenting path in the flow network, and 1 unit of flow is added to the result in maxBPM(). A sketch of this approach follows.
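A sketch of the bpm()/maxBPM() approach just described (the text's maxR[] array appears here as match_r; the sample matrix is hypothetical):

def bpm(bp_graph, u, seen, match_r):
    # DFS that tries to place applicant u, evicting a previous holder
    # if that holder can be moved elsewhere (the augmenting-path step).
    for v in range(len(bp_graph[0])):
        if bp_graph[u][v] and not seen[v]:
            seen[v] = True                 # mark job v so x cannot retake it
            # Job v is free, or its current holder can be shifted elsewhere.
            if match_r[v] == -1 or bpm(bp_graph, match_r[v], seen, match_r):
                match_r[v] = u
                return True
    return False

def max_bpm(bp_graph):
    # Maximum bipartite matching on an M x N 0/1 Edmonds matrix.
    n_jobs = len(bp_graph[0])
    match_r = [-1] * n_jobs                # match_r[v] = applicant holding job v
    result = 0
    for u in range(len(bp_graph)):
        seen = [False] * n_jobs
        if bpm(bp_graph, u, seen, match_r):
            result += 1
    return result

bp = [[0, 1, 1, 0],    # applicant 0 is interested in jobs 1 and 2, etc.
      [1, 0, 0, 1],
      [0, 0, 1, 0]]
print(max_bpm(bp))     # 3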

Example- Ford-Fulkerson algorithm:

Each Directed Edge is labeled with capacity. Use the Ford-Fulkerson algorithm to find the maximum
flow.

The left side of each part shows the residual network Gf with a shaded augmenting path p, and the right side of each part shows the net flow f.

CHARACTERIZATION OF MAXIMUM MATCHING BY AUGMENTING PATHS:
Augmenting Paths for Matchings:
Definitions:
 Given a matching M in a graph G, a vertex that is not incident to any edge of M is called a free vertex w.r.t. M.
 For a matching M, a path P in G is called an alternating path if edges in M alternate with edges not in M.
 An alternating path is called an augmenting path for matching M if it starts and ends at distinct free vertices.

Theorem: A matching M is a maximum matching if and only if there is no augmenting path w.r.t. M.

Augmenting Paths in Action


Proof.
 If M is maximum, there is no augmenting path P, because otherwise we could switch the matching and non-matching edges along P. This would give a matching M′ = M ⊕ P with larger cardinality.

 Conversely, suppose there is a matching M′ with larger cardinality. Consider the graph H with edge set M ⊕ M′ (i.e., only the edges that are in either M or M′ but not in both).

 In H, each vertex can be incident to at most two edges (one from M and one from M′). Hence, the connected components are alternating cycles or alternating paths.

 As |M′| > |M|, there is one connected component that is a path P whose both endpoints are incident to edges from M′. P is then an augmenting path w.r.t. M.

Algorithmic idea:
As long as we find an augmenting path, we augment our matching using this path. When we arrive at a matching for which no augmenting path exists, we have a maximum matching.
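The augmentation step itself is just a symmetric difference of edge sets; a small sketch:

def augment(M, path):
    # Flip a matching along an augmenting path: edges of `path` in M leave
    # the matching, the others enter it. |M| grows by exactly one, because
    # the path starts and ends with non-matching edges.
    path_edges = {frozenset(e) for e in zip(path, path[1:])}
    M = {frozenset(e) for e in M}
    return [tuple(e) for e in M ^ path_edges]   # symmetric difference

# Path a-b-c-d with only (b, c) matched: augmenting, since a and d are free.
print(augment([("b", "c")], ["a", "b", "c", "d"]))
# -> [('a', 'b'), ('c', 'd')]  (in some order)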

EDMOND'S BLOSSOM ALGORITHM TO COMPUTE AUGMENTING PATH:

Introduction
The blossom algorithm, created by Jack Edmonds in 1961, was the first polynomial time algorithm that could produce a maximum matching on any graph. Before the discovery of this algorithm, there were no fast algorithms able to find a maximum matching on a graph with odd-length cycles. The Hungarian algorithm came the closest, but it produces a maximum matching on a graph only under the condition that there are no odd-length cycles. A matching is a subgraph in which every vertex is connected to at most one other vertex through a single edge, so the set of matching edges contains no common vertices. A maximum matching occurs when the number of edges in the matching is maximized.

Background Understanding - Augmenting and Alternating Path


This algorithm uses the ideas of finding augmenting and alternating paths in a graph. An augmenting path is a path within a graph where the edges alternate between unmatched and matched edges and which starts and ends with an unmatched edge. This is in contrast to an alternating path, where the path in the original graph alternates between unmatched and matched edges but does not necessarily start and end with an unmatched edge. Furthermore, this algorithm exploits the theorem that a matching is maximum if and only if there are no augmenting paths (Berge's lemma). The brief intuition behind this is that if an augmenting path exists, all of the edges' labels can be inverted to create a matching whose edge set is larger by one. Conversely, to prove that if a matching is maximum then there must be no augmenting paths, assume that for a matching M there is a larger matching M′, and take the graph G that consists of the edges M ⊕ M′ (the symmetric difference). This graph only has vertices with degree at most 2, and therefore it must contain only simple paths or cycles. All cycles must have an equal number of edges from M and M′, but among the paths there will be at least one with more edges from M′ than from M, so this path must start and end with an edge from M′, and thus this path is an augmenting path.

An example of an alternating path, where matched vertices are orange and unmatched vertices are black; the edges that make up the matching are blue.

An example of an augmenting path, where matched vertices are orange and unmatched vertices are black; the edges that make up the matching are blue.

Finding an Augmenting Path

Finding an augmenting path happens as follows. Starting with a free (not part of the matching) vertex V, explore all of its neighbors. Either V will discover another free vertex (and thus find an augmenting path) or it will find a vertex W that is part of the matching. If the latter occurs, mark the edge from V to W and then extend the path to W's mate X. This will always produce an alternating path from the original vertex to some other matched vertex in the graph. Finally, recurse with vertex X. This process continues as long as there is a vertex X that can be recursed on.

1. To start, all vertices are unmatched and colored black.
2. An unmatched vertex V is chosen.
3. A neighbor (yellow highlight) of V is searched.
4. If the neighbor is unmatched, then the augmenting path back to V is inverted (and colored blue). All vertices on the path are marked as matched (colored orange). The graph is then recursed on from (2).
5. Else, the neighbor is matched, and the path from V to X is stored (highlighted in green). X is then added to a queue Q.
a. If V has unexplored neighbors, recurse from (2) with V.
b. Else, all the neighbors of V are matched. Then X′ is dequeued from Q, labeled V, and we recurse from (2) with the new V.
Blossoms
A blossom occurs when an odd-length cycle is found while searching for augmenting paths. When a blossom is found, all of the nodes that are part of the cycle are contracted into one supernode. This supernode retains all the information of the contracted nodes, and it gets all of the edges that connect the cycle to the rest of the graph.

The search for an augmenting path proceeds as described in steps 1-5 above. The starting point for this potential augmenting path is vertex A.
1. The yellow edge connects two nodes that could both be part of an alternating path back towards A, so a blossom is present.
2. All nodes that are part of the cycle are then contracted into the supernode (red).
3. The supernode is then enqueued into Q.
After the nodes are contracted, the search for an augmenting path continues as if the supernode were a single node. Once the search for an augmenting path is completed (regardless of whether one was found), the supernode is lifted, meaning that it once again becomes all of the original nodes. If an augmenting path was found, then there is an augmenting path from the original free node to another free node passing through some vertices of the odd-length cycle. This augmenting path is then inverted (just as before) in order to increase the size of the matching by one.

The search for an augmenting path is continued from supernode V as previously described, and it is found that the augmenting path passes through the supernode.

1. The supernode is lifted from its current state so that it is expanded.
2. The path through the supernode that forms an augmenting path is found (green highlight).
3. The augmenting path is then inverted.

The Algorithm
Intuitively, the algorithm works by repeatedly trying to find augmenting paths (specifically, paths starting and ending with free vertices) in the graph until no more augmenting paths can be found. This occurs either when all vertices are matched with a mate, or when every remaining free vertex is connected only to matched vertices which recursively only match to other matched vertices.

[Figures: all vertices matched, giving a maximum matching; not all vertices matched, still giving a maximum matching.]

Furthermore, the process of finding augmenting paths can be slightly modified so that it simultaneously checks whether there is a blossom in the graph. This can be accomplished by labeling each vertex in an alternating manner O (for odd) and E (for even), depending on how far away a vertex is from the root. To start, label the original free vertex E, and then alternate labels from there. The algorithm that finds augmenting paths always searches for neighboring vertices from a vertex labeled E (the matched vertex's mate is always two edges further away from the root), and thus if an adjacent vertex also has the label E, there is a blossom that contains both of those vertices. This blossom is then contracted into one supernode with the label E, and the search for augmenting paths continues.

Altogether, this algorithm starts by finding augmenting paths within the graph while labeling vertices E or O, and then inverting these paths to increase the size of the matching by one. If a blossom is found, it is contracted into a single node and the augmenting path algorithm continues as normal. This repeats until all vertices of the graph have either been searched or are already part of the matching set.

Blossoms can be nested within each other, and therefore, supernodes may end up containing
other supernodes.

Time Complexity
Overall, this algorithm takes O(E · V^2) time. Every attempt to find an augmenting path takes O(E) time in order to iterate over all of the edges. Assuming that an attempt to find an augmenting path succeeds, it takes O(V) time to flip all of the incident edges. Finally, there are O(V) blossoms in any graph, and for each blossom the smaller graph with the supernode may need to be recursed on. This means that the total time is O(E · V^2), or simply O(V^4) for a dense graph.
