TopCoder Hungarian Alg
1 of 7
http://www.topcoder.com/tc?module=Static&d1=tutorials&d2=hungarian...
By x-ray
TopCoder Member
Introduction
Are you familiar with the following situation? You open the Div I Medium and don't know how to approach it, while a lot of people in your room submitted it in less than 10
minutes. Then, after the contest, you find out in the editorial that this problem can be simply reduced to a classical one. If yes, then this tutorial will surely be useful for you.
Problem statement
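In this article we'll deal with one optimization problem, which can be informally defined as:
Assume that we have N workers and N jobs that should be done. For each pair (worker, job) we know the salary that should be paid to the worker for performing the job. Our goal is to complete all jobs while minimizing the total cost, assigning each worker to exactly one job and vice versa.
Converting this problem to a formal mathematical definition, we can form the following equations:
$C = (c_{ij})_{N \times N}$ - cost matrix, where $c_{ij}$ is the cost of worker i performing job j.
$X = (x_{ij})_{N \times N}$ - resulting binary matrix, where $x_{ij} = 1$ if and only if the i-th worker is assigned to the j-th job.
$\sum_{j=1}^{N} x_{ij} = 1$ for every i - each worker is assigned to exactly one job.
$\sum_{i=1}^{N} x_{ij} = 1$ for every j - each job is assigned to exactly one worker.
$\sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij} x_{ij} \to \min$ - total cost function.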
We can also rephrase this problem in terms of graph theory. Let's look at the jobs and workers as the two parts of a bipartite graph, where the edge between the i-th worker and the j-th job has weight c_ij. Then our task is to find a minimum-weight perfect matching in the graph (the matching will consist of N edges, because our bipartite graph is complete).
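For small N the statement above can be checked directly by trying every possible assignment. The following is a minimal brute-force sketch, not part of the original article; the function name `brute_force_min_cost` is hypothetical, and it is meant only as a reference to test faster solutions against.

```cpp
#include <algorithm>
#include <climits>
#include <numeric>
#include <vector>

// Tries all N! assignments; perm[i] is the job given to worker i.
// Assumes a square cost matrix; only practical for small N (roughly N <= 9).
long long brute_force_min_cost(const std::vector<std::vector<int>>& c) {
    int n = static_cast<int>(c.size());
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);   // start from the identity assignment
    long long best = LLONG_MAX;
    do {
        long long total = 0;
        for (int i = 0; i < n; i++) total += c[i][perm[i]];
        best = std::min(best, total);
    } while (std::next_permutation(perm.begin(), perm.end()));
    return best;
}
```

Because it enumerates permutations, each worker gets exactly one job and each job exactly one worker by construction, which mirrors the two constraints of the formal definition.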
Actually, this step is not necessary, but it decreases the number of main cycle iterations.
3/31/2011 10:46 PM
Step 1)
A. Find the maximum matching using only 0-weight edges (for this purpose you can use a max-flow algorithm, an augmenting path algorithm, etc.).
B. If it is perfect, then the problem is solved. Otherwise find the minimum vertex cover V (for the subgraph with 0-weight edges only); the best way to do this is to use König's graph theorem.
Step 2) Let
But there is a nuance here: finding the maximum matching from scratch in step 1 on each iteration will cause the algorithm to become O(n^5). In order to avoid this, on each step we can just modify the matching from the previous step, which only takes O(n^2) operations.
It's easy to see that no more than n^2 iterations will occur, because every time at least one edge becomes 0-weight. Therefore, the overall complexity is O(n^4).
Let
. Then
Let
. Then
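Vertex labeling
This is simply a function $l: V \to \mathbb{R}$ (for each vertex we assign some number called a label). Let's call this labeling feasible if it satisfies the following condition: $l(x) + l(y) \ge w(x,y)$ for all $x \in X$, $y \in Y$, where X and Y are the two parts of the bipartite graph and w(x,y) is the weight of edge (x,y). In other words, the sum of the labels of the two endpoints of any edge is greater than or equal to the weight of that edge.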
Equality subgraph
Let G_l = (V, E_l) be a spanning subgraph of G (in other words, it includes all vertices from G). If G_l includes only those edges (x,y) which satisfy the condition $l(x) + l(y) = w(x,y)$, then it is an equality subgraph. In other words, it includes only those edges of the bipartite graph on which the feasibility condition holds with equality.
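The two definitions can be made concrete with a small sketch. The helper names `is_feasible` and `equality_edges` are hypothetical (they do not appear in the article), and the labels are passed as two arrays, one per part of the graph.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// True if l(x) + l(y) >= w(x,y) holds for every pair (x, y).
bool is_feasible(const Matrix& w, const std::vector<int>& lx, const std::vector<int>& ly) {
    for (std::size_t x = 0; x < w.size(); x++)
        for (std::size_t y = 0; y < w[x].size(); y++)
            if (lx[x] + ly[y] < w[x][y]) return false;
    return true;
}

// Edges (x, y) of the equality subgraph: those with l(x) + l(y) == w(x,y).
std::vector<std::pair<int, int>> equality_edges(const Matrix& w, const std::vector<int>& lx,
                                                const std::vector<int>& ly) {
    std::vector<std::pair<int, int>> edges;
    for (std::size_t x = 0; x < w.size(); x++)
        for (std::size_t y = 0; y < w[x].size(); y++)
            if (lx[x] + ly[y] == w[x][y])
                edges.emplace_back(static_cast<int>(x), static_cast<int>(y));
    return edges;
}
```

For the weights {{1, 2}, {3, 5}} with labels lx = {2, 5}, ly = {0, 0}, the labeling is feasible and the equality subgraph contains the edges (0, 1) and (1, 1).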
Now we're ready for the theorem which provides the connection between equality subgraphs and maximum-weighted matching:
If M* is a perfect matching in the equality subgraph Gl, then M* is a maximum-weighted matching in G.
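One possible argument, given here as a sketch under the assumption that l is a feasible labeling (it is not the article's own proof): any perfect matching M' in G covers every vertex exactly once, so feasibility bounds its weight by the sum of all labels, and M* reaches that bound because all of its edges lie in the equality subgraph.

```latex
w(M') = \sum_{(x,y) \in M'} w(x,y)
      \le \sum_{(x,y) \in M'} \bigl( l(x) + l(y) \bigr)
      = \sum_{v \in V} l(v),
\qquad
w(M^*) = \sum_{(x,y) \in M^*} \bigl( l(x) + l(y) \bigr)
       = \sum_{v \in V} l(v)
       \ge w(M').
```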
The proof is rather straightforward, but if you want you can do it for practice. Let's continue with a few final definitions:
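Alternating path and alternating tree
Consider we have a matching M ($M \subseteq E$). A vertex $v \in V$ is called matched if it is an endpoint of some edge in M; otherwise it is called exposed (free, unmatched).
(In the diagram below, W1, W2, W3, J1, J3, J4 are matched, W4, J2 are exposed)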
Path P is called alternating if its edges alternate between M and E\M. (For example, (W4, J4, W3, J3, W2, J2) and (W4, J1, W1) are alternating paths)
If the first and last vertices in alternating path are exposed, it is called augmenting (because we can increment the size of the matching by inverting edges along this path,
therefore matching unmatched edges and vice versa). ((W4, J4, W3, J3, W2, J2) - augmenting alternating path)
A tree which has a root in some exposed vertex, and a property that every path starting in the root is alternating, is called an alternating tree. (Example on the picture above,
with root in W4)
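Inverting the edges along an augmenting path can be sketched as follows. The function name `flip_augmenting_path` and the path representation are assumptions made for illustration; vertices are 0-based, so W1..W4 and J1..J4 become 0..3. The starting matching W1-J1, W2-J3, W3-J4 is inferred from the alternating paths listed above, since the diagram itself is not available here.

```cpp
#include <cstddef>
#include <vector>

// path lists vertex indices alternately from X and Y: x0, y0, x1, y1, ...
// It must start at an exposed X vertex and end at an exposed Y vertex.
// xy[x] / yx[y] hold the current partner of a vertex, or -1 if it is exposed.
void flip_augmenting_path(const std::vector<int>& path,
                          std::vector<int>& xy, std::vector<int>& yx) {
    for (std::size_t i = 0; i + 1 < path.size(); i += 2) {
        int x = path[i], y = path[i + 1];
        xy[x] = y;   // edge (x, y) was outside M and now enters it
        yx[y] = x;   // the edge from y to its old partner leaves M implicitly
    }
}
```

Applied to the path (W4, J4, W3, J3, W2, J2), i.e. {3, 3, 2, 2, 1, 1}, with xy = {0, 2, 3, -1} and yx = {0, -1, 1, 2}, the matching grows from three edges to four and every vertex becomes matched.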
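That's all for the theory; now let's look at the algorithm.
First let's have a look at the scheme of the Hungarian algorithm. Below, N_l(S) denotes the set of vertices adjacent to vertices of S in the equality subgraph G_l.
Step 0. Find some initial feasible vertex labeling and some initial matching.
Step 1. If M is perfect, then it's optimal, so the problem is solved. Otherwise, some exposed $x \in X$ exists; set $S = \{x\}$, $T = \emptyset$ (x is the root of the alternating tree we're going to build). Go to step 2.
Step 2. If $N_l(S) \ne T$ go to step 3, else $N_l(S) = T$. Find
$\Delta = \min_{x \in S,\ y \in Y \setminus T} \{ l(x) + l(y) - w(x,y) \}$ (1)
and define a new labeling
$l'(v) = l(v) - \Delta$ if $v \in S$; $l'(v) = l(v) + \Delta$ if $v \in T$; $l'(v) = l(v)$ otherwise. (2)
Now replace l with l'.
Step 3. Find some vertex $y \in N_l(S) \setminus T$. If y is exposed then an alternating path from x (root of the tree) to y exists; augment the matching along this path and go to step 1. If y is matched in M with some vertex z, add (z,y) to the alternating tree and set $S = S \cup \{z\}$, $T = T \cup \{y\}$, then go to step 2.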
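The label improvement of step 2 can be sketched in its straightforward form: compute delta as the smallest value of l(x) + l(y) - w(x,y) over x in S and y outside T, then lower the labels in S and raise the labels in T by delta. The function name `improve_labeling` and the boolean membership arrays are assumptions made for illustration; this version costs O(n^2) per call and does not use the slack array that the article's code relies on.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// One label improvement: returns delta and updates lx / ly in place.
// inS[x] / inT[y] mark membership in S and T.
// Assumes S is non-empty and T does not cover all of Y.
int improve_labeling(const std::vector<std::vector<int>>& w,
                     std::vector<int>& lx, std::vector<int>& ly,
                     const std::vector<bool>& inS, const std::vector<bool>& inT) {
    int n = static_cast<int>(w.size());
    int delta = INT_MAX;
    for (int x = 0; x < n; x++)
        if (inS[x])
            for (int y = 0; y < n; y++)
                if (!inT[y]) delta = std::min(delta, lx[x] + ly[y] - w[x][y]);
    for (int x = 0; x < n; x++)
        if (inS[x]) lx[x] -= delta;   // labels in S go down
    for (int y = 0; y < n; y++)
        if (inT[y]) ly[y] += delta;   // labels in T go up
    return delta;
}
```

Edges between S and T keep their equality (one endpoint loses delta, the other gains it), while at least one edge from S to a vertex outside T becomes tight, so the alternating tree can grow.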
And now let's illustrate these steps by considering an example and writing some code.
As an example we'll use the previous one, but first let's transform it to the maximum-weighted matching problem, using the second method from the two described above.
(See Picture 1)
Picture 1
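The two transformations the article refers to are not shown in this text. One standard way to turn a minimum-cost problem into a maximum-weight one, given here as an assumption rather than as the article's method, is to set w(i,j) = M - c(i,j) where M is the largest cost; the minimum total cost then equals N*M minus the maximum total weight. The function name `to_max_weights` is hypothetical.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Returns w with w[i][j] = M - c[i][j], where M is the largest entry of c.
// All resulting weights are non-negative.
std::vector<std::vector<int>> to_max_weights(const std::vector<std::vector<int>>& c) {
    int m = INT_MIN;
    for (const auto& row : c)
        for (int v : row) m = std::max(m, v);
    std::vector<std::vector<int>> w = c;
    for (auto& row : w)
        for (int& v : row) v = m - v;
    return w;
}
```

For the costs {{1, 4}, {3, 2}} this gives M = 4 and weights {{3, 0}, {1, 2}}: the best weight is 3 + 2 = 5, and 2 * 4 - 5 = 3 is indeed the cheapest assignment (1 + 2).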
Here are the global variables that will be used in the code:
#define N 55
#define INF 100000000
int cost[N][N];          //cost matrix
int n, max_match;        //n workers and n jobs
int lx[N], ly[N];        //labels of X and Y parts
int xy[N];               //xy[x] - vertex that is matched with x,
int yx[N];               //yx[y] - vertex that is matched with y
bool S[N], T[N];         //sets S and T in algorithm
int slack[N];            //as in the algorithm description
int slackx[N];           //slackx[y] such a vertex, that
                         // l(slackx[y]) + l(y) - w(slackx[y],y) = slack[y]
int prev[N];             //array for memorizing alternating paths
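Step 0:
It's easy to see that the following initial labeling is feasible: l(y) = 0 for every y in Y, and l(x) = max over y of w(x,y) for every x in X (this is exactly what init_labels() computes).
As the initial matching we'll use an empty one, so we'll get the equality subgraph shown in Picture 2. The code for initializing is quite easy, but I'll paste it for completeness: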
void init_labels()
{
    memset(lx, 0, sizeof(lx));
    memset(ly, 0, sizeof(ly));
    for (int x = 0; x < n; x++)
        for (int y = 0; y < n; y++)
            lx[x] = max(lx[x], cost[x][y]);
}
The next three steps will be implemented in one function, which will correspond to a single iteration of the algorithm. When the algorithm halts, we will have a perfect
matching, that's why we'll have n iterations of the algorithm and therefore (n+1) calls of the function.
Step 1
According to this step we need to check whether the matching is already perfect. If it is, we just stop the algorithm; otherwise we need to clear S, T and the alternating tree and then find some exposed vertex from the X part. Also, in this step we initialize the slack array; I'll describe it in the next step.
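void augment()                         //main function of the algorithm
{
    if (max_match == n) return;        //check whether the matching is already perfect
    int x, y, root;                    //just counters and the root vertex
    int q[N], wr = 0, rd = 0;          //q - queue for bfs; wr, rd - write and read
                                       //positions in the queue
    memset(S, false, sizeof(S));       //clear set S
    memset(T, false, sizeof(T));       //clear set T
    memset(prev, -1, sizeof(prev));    //clear the alternating tree
    for (x = 0; x < n; x++)            //find an exposed vertex of X, it becomes the root
        if (xy[x] == -1)
        {
            q[wr++] = root = x;
            prev[x] = -2;
            S[x] = true;
            break;
        }
    for (y = 0; y < n; y++)            //initialize the slack array for S = {root}
    {
        slack[y] = lx[root] + ly[y] - cost[root][y];
        slackx[y] = root;
    }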
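Updating slack:
Here slack[y] is the minimum of l(x) + l(y) - w(x,y) over all x in S, so the delta from formula (1) is just the minimum of slack[y] over y in Y\T.
1) On step 3, when vertex x moves from X\S to S, this takes O(n).
2) On step 2, when updating the labeling, it also takes O(n), because every label in S decreases by the same delta while the labels of vertices in Y\T do not change, so each slack[y] with y in Y\T simply decreases by delta.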
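    //second part of augment(); update_labels() and add_to_tree() are helper
    //functions: the first applies formulas (1) and (2) using slack, the second
    //adds a vertex to S, records its parent in prev and refreshes slack
    while (true)                                   //main cycle
    {
        while (rd < wr)                            //building tree with bfs cycle
        {
            x = q[rd++];                           //current vertex from X part
            for (y = 0; y < n; y++)                //iterate through all edges in equality graph
                if (cost[x][y] == lx[x] + ly[y] && !T[y])
                {
                    if (yx[y] == -1) break;        //exposed vertex in Y found, so
                                                   //augmenting path exists!
                    T[y] = true;                   //else just add y to T,
                    q[wr++] = yx[y];               //add vertex yx[y], which is matched
                                                   //with y, to the queue
                    add_to_tree(yx[y], x);         //add edges (x,y) and (y,yx[y]) to the tree
                }
            if (y < n) break;                      //augmenting path found!
        }
        if (y < n) break;                          //augmenting path found!
        update_labels();                           //augmenting path not found, so improve labeling
        wr = rd = 0;
        for (y = 0; y < n; y++)
        //in this cycle we add edges that were added to the equality graph as a
        //result of improving the labeling, we add edge (slackx[y], y) to the tree if
        //and only if !T[y] && slack[y] == 0, also with this edge we add another one
        //(y, yx[y]) or augment the matching, if y was exposed
            if (!T[y] && slack[y] == 0)
            {
                if (yx[y] == -1)                   //exposed vertex in Y found - augmenting path exists!
                {
                    x = slackx[y];
                    break;
                }
                else
                {
                    T[y] = true;                   //else just add y to T,
                    if (!S[yx[y]])
                    {
                        q[wr++] = yx[y];           //add vertex yx[y], which is matched with
                                                   //y, to the queue
                        add_to_tree(yx[y], slackx[y]); //and add edges (x,y) and (y,
                                                   //yx[y]) to the tree
                    }
                }
            }
        if (y < n) break;                          //augmenting path found!
    }
    if (y < n)                                     //we found an augmenting path
    {
        max_match++;                               //the matching grows by one edge
        //in this cycle we inverse edges along augmenting path
        for (int cx = x, cy = y, ty; cx != -2; cx = prev[cx], cy = ty)
        {
            ty = xy[cx];
            yx[cy] = cx;
            xy[cx] = cy;
        }
        augment();                                 //go back to step 1
    }
}//end of augment() function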
The only thing in the code that hasn't been explained yet is the procedure that runs after the labels are updated. Say we've updated the labels and now need to complete our alternating tree. To do this and keep the algorithm in O(n^3) time (which is only possible if we use each edge no more than once per iteration), we need to know which edges should be added without iterating through all of them. The answer is to use BFS to add edges only from those vertices in Y that are not in T and for which slack[y] = 0 (it's easy to prove that this way we'll add all the needed edges and keep the algorithm O(n^3)). See the picture below for an explanation:
//step 0
//steps 1-3
//forming answer there
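The three comments above appear to be what remains of the driver function, and the bodies of update_labels() and add_to_tree() are also not present in this text. The following self-contained sketch fills those gaps so that the whole method can be compiled and run. Everything marked "assumed" is a reconstruction from the formulas and array descriptions, not the author's code, and the names `hungarian_sketch` and `solve` are hypothetical.

```cpp
#include <algorithm>
#include <cstring>
#include <vector>

namespace hungarian_sketch {
const int N = 55, INF = 100000000;
int cost[N][N], n, max_match;
int lx[N], ly[N], xy[N], yx[N], slack[N], slackx[N], prev[N];
bool S[N], T[N];

void init_labels() {                   // step 0: l(x) = max_y w(x,y), l(y) = 0
    std::memset(lx, 0, sizeof(lx));
    std::memset(ly, 0, sizeof(ly));
    for (int x = 0; x < n; x++)
        for (int y = 0; y < n; y++) lx[x] = std::max(lx[x], cost[x][y]);
}
void update_labels() {                 // assumed: formulas (1) and (2) via slack
    int delta = INF;
    for (int y = 0; y < n; y++) if (!T[y]) delta = std::min(delta, slack[y]);
    for (int x = 0; x < n; x++) if (S[x]) lx[x] -= delta;
    for (int y = 0; y < n; y++) {
        if (T[y]) ly[y] += delta;
        else slack[y] -= delta;
    }
}
void add_to_tree(int x, int prevx) {   // assumed: x joins S, slack refreshed in O(n)
    S[x] = true;
    prev[x] = prevx;
    for (int y = 0; y < n; y++)
        if (lx[x] + ly[y] - cost[x][y] < slack[y]) {
            slack[y] = lx[x] + ly[y] - cost[x][y];
            slackx[y] = x;
        }
}
void augment() {                       // steps 1-3; each call adds one matched edge
    if (max_match == n) return;
    int x = 0, y = 0, root = 0, q[N], wr = 0, rd = 0;
    std::memset(S, false, sizeof(S));
    std::memset(T, false, sizeof(T));
    std::memset(prev, -1, sizeof(prev));
    for (x = 0; x < n; x++)            // first exposed X vertex becomes the root
        if (xy[x] == -1) { q[wr++] = root = x; prev[x] = -2; S[x] = true; break; }
    for (y = 0; y < n; y++) {          // assumed: slack for S = {root}
        slack[y] = lx[root] + ly[y] - cost[root][y];
        slackx[y] = root;
    }
    while (true) {
        while (rd < wr) {              // grow the tree with BFS over equality edges
            x = q[rd++];
            for (y = 0; y < n; y++)
                if (cost[x][y] == lx[x] + ly[y] && !T[y]) {
                    if (yx[y] == -1) break;        // exposed y: augmenting path found
                    T[y] = true;
                    q[wr++] = yx[y];
                    add_to_tree(yx[y], x);
                }
            if (y < n) break;
        }
        if (y < n) break;
        update_labels();               // no path yet: improve the labeling
        wr = rd = 0;
        for (y = 0; y < n; y++)
            if (!T[y] && slack[y] == 0) {
                if (yx[y] == -1) { x = slackx[y]; break; }
                T[y] = true;
                if (!S[yx[y]]) {
                    q[wr++] = yx[y];
                    add_to_tree(yx[y], slackx[y]);
                }
            }
        if (y < n) break;
    }
    if (y < n) {
        max_match++;
        for (int cx = x, cy = y, ty; cx != -2; cx = prev[cx], cy = ty) {
            ty = xy[cx];               // flip edges along the augmenting path
            yx[cy] = cx;
            xy[cx] = cy;
        }
        augment();
    }
}
// Assumed driver: weight of a maximum-weight perfect matching (n x n, n <= N).
int solve(const std::vector<std::vector<int>>& w) {
    n = static_cast<int>(w.size());
    for (int x = 0; x < n; x++)
        for (int y = 0; y < n; y++) cost[x][y] = w[x][y];
    max_match = 0;
    std::memset(xy, -1, sizeof(xy));
    std::memset(yx, -1, sizeof(yx));
    init_labels();                     // step 0
    augment();                         // steps 1-3
    int ret = 0;
    for (int x = 0; x < n; x++) ret += cost[x][xy[x]];   // forming answer
    return ret;
}
}  // namespace hungarian_sketch
```

The sketch keeps the article's global arrays but wraps them in a namespace: with `using namespace std;` in C++11 and later, a global array called `prev` can clash with `std::prev`. It also assumes non-negative weights, since init_labels() starts the maximum from 0.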
To see all this in practice let's complete the example started on step 0.
[Pictures: the example worked step by step. Panel captions in order: Build alternating tree; Augmenting path found; Build alternating tree; Update labels (Δ=1); Build alternating tree; Update labels (Δ=1); Build alternating tree; Augmenting path found; Build alternating tree; Update labels (Δ=2); Build alternating tree; Update labels (Δ=1); Build alternating tree; Augmenting path found; Optimal matching found.]
Finally, let's talk about the complexity of this algorithm. On each iteration we increment the matching, so we have n iterations. On each iteration each edge of the graph is used no more than once when finding an augmenting path, so we've got O(n^2) operations. Concerning the labeling, we update the slack array each time we insert a vertex from X into S; this happens no more than n times per iteration and updating slack takes O(n) operations, so again we've got O(n^2). Updating labels happens no more than n times per iteration (because each update adds at least one vertex from Y to T) and takes O(n) operations - again O(n^2). So the total complexity of this implementation is O(n^3).
Some practice
For practice let's consider the medium problem from SRM 371 (div. 1). It's obvious we need to find the maximum-weighted matching in graph, where the X part is our players,
the Y part is the opposing club players, and the weight of each edge is:
Though this problem has a much simpler solution, this one is obvious, and fast coding can bring more points.
Try this one for more practice. I hope this article has increased the wealth of your knowledge in classical algorithms. Good luck and have fun!
References
1. Mike Dawes "The Optimal Assignment Problem"
2. Mordecai J. Golin "Bipartite Matching and the Hungarian Method"
3. Samir Khuller "Design and Analysis of Algorithms: Course Notes"
4. Lawler E.L. "Combinatorial Optimization: Networks and Matroids"