Graph Coloring Using Multi-Threading
1. Introduction
1. Chromatic Number
A coloring using at most k colors is called a proper
k-coloring. The minimum cardinality of the set of
colors needed to color a graph G is called its
chromatic number, and is denoted by χ(G). The
chromatic polynomial counts the number of ways a
graph can be colored using no more than a given
number of colors.
2. Problem Definition
A graph G is a pair (V, E) of a set of vertices V and a
set of edges E. The edges are unordered pairs of the
form {i, j} where i, j ∈ V. Two vertices i and j are said
to be adjacent if and only if {i, j} ∈ E and non-
adjacent otherwise.
3. Algorithms
1. Polynomial Time
For k = 2 the problem is equivalent to
determining whether the graph is bipartite,
which can be decided in O(V + E) time using
breadth-first search or depth-first search. If the
graph is planar and has low branch-width, the
problem can be solved in polynomial time using
dynamic programming.
3. Greedy Coloring
Greedy coloring considers vertices in a
sequence ⟨v1, v2, . . . , vn⟩ and assigns to vi the
smallest available color not used by vi's
neighbors among v1, . . . , vi−1, using a new color
when needed.
1: Input: p ← number of threads
2: Uniform random partitioning of V into V1, V2, . . . , Vp
3: m ← maximum degree of graph; vertices are inherently ordered by their vertex ids
4: procedure ParallelGraphColoring(G = (V, E))
5:   for all threads Ti | i ∈ {1, . . . , p} do
6:     Identify boundary vertices in Vi
7:     Initialise TotalColors[m + 1] = {0, 1, . . . , m}
8:     for each v ∈ Vi | v is an internal vertex in Vi do
9:       color(v) ← min{TotalColors \ colors(adj(v))}
10:     end for
11:     for each v ∈ Vi | v is a boundary vertex in Vi do
12:       List Ai ← adj(v)
13:       Ai ← Ai ∪ {v}
14:       Lock all vertices in Ai in increasing order of vertex ids
15:       color(v) ← min{TotalColors \ colors(adj(v))}
16:       Unlock all vertices in Ai
17:     end for
18:   end for
19: end procedure
Deadlock:
As each thread must acquire locks in order to color vertices, a
deadlock may arise. Each vertex can be viewed as a resource, and
every resource here is of single-instance type. Because only
single-instance resources are present, deadlock can be detected
with a simple resource-allocation-graph algorithm that searches
for a cycle among the resources, which takes O(V²) time. However,
we prevent deadlock rather than detect and recover from it.
Prevention works by eliminating one of the four necessary
conditions for deadlock: a global ordering of the vertices is
maintained and every thread acquires locks in that order, which
ensures that the circular-wait condition can never hold.
Multithreading
Multithreading lets a program run several streams of execution,
known as threads, that can communicate with one another and
interfere. Multithreaded execution interleaves the threads in an
unpredictable order, so external synchronization may be required
to keep them from conflicting with the aim of the program.
The speedup obtained depends on how much of the code can be
parallelized.
Pthreads, the POSIX threads API, is one of the most common ways
to implement such parallelizable code.
Synchronization
In the graph-coloring problem each vertex is an object
shared by all the threads, so the threads must be coordinated to
avoid violating the requirements of the critical-section problem.
Synchronization can be achieved in many ways, such as with
semaphores, locks, hardware methods, or monitors.
The above code uses locks to achieve synchronization;
mutexes and spinlocks are the kinds most commonly used.
On single-core systems we prefer mutexes, which sleep and wake
threads rather than busy-wait, since busy waiting on the only
core wastes CPU cycles without making progress. On multi-core
systems, spinlocks may prove more optimal: busy waiting on one
core still allows execution by the other cores, and in some cases
the cost of spinning is lower than the overhead of sleeping and
waking threads.
Overheads
The theoretical speedup for completely parallelizable
code is greatly optimistic and does not match the practical
speedup. The main reason is the overhead required to achieve
parallel and concurrent execution. Tasks synchronize at barriers,
where they all finish a timestep and the slowest task determines
the overall speed. A more efficient approach is to use a global
array variable, as done above. Minimizing synchronization
overheads, such as time spent busy waiting or sleeping and waking
threads, makes the program more scalable. Since these overheads
tend to grow rapidly with the number of threads, scalability
improves only when the overheads grow at a rate slower than the
number of threads.
8. Applications
The graph-coloring problem has multiple applications, which is
why we try to speed up the generalized distance-1 coloring. Some
of the prominent applications include:
1. Map Coloring: coloring geographical maps, with an emphasis on
the Four Color Theorem, which guarantees that four colors suffice
for any planar map.
2. Bipartiteness testing (the k = 2 case).
3. Register Allocation: in compiler optimization (reordering of
code, prevention of hazards, static scheduling), the process of
assigning a large number of target program variables to a small
number of CPU registers is also a graph-coloring problem.
4. Modeling scheduling problems.
9. Conclusions and Future Scope
Massively multithreaded algorithms offer a speedup over
sequential algorithms and are therefore useful for the NP-hard
graph-coloring problem with its numerous applications. However,
as the degree of multithreading of the above algorithm increases,
synchronization overheads (interprocess communication, context
switching, and locking, since the problem resembles a
critical-section problem) cause the gain from parallelism to be
subdued by the cost of ensuring hazard-free execution.
Some general conclusions include:
1. Simultaneous multithreading provides an effective way to
tolerate latency.
2. Synchronization overheads become heavy as the number of
threads exceeds roughly 150, nullifying the effect of parallelism
and driving the speedup factor below unity.
3. On multi-core systems such as the one above, many locks can be
held concurrently by different cores, so the time overhead of
busy waiting on spinlocks is lower than that of sleeping and
waking threads; spinlocks are therefore preferred there, as
opposed to the use of mutexes on single-core systems.
This research and field have a large future scope. Some of the
most prominent directions in which the above can be extended are:
1. Comparing performance metrics against an interprocess-
communication model using interconnection networks such as
meshes, tori, and the Illiac mesh, to study how the speedup
varies with network parameters relative to the shared-memory
model.
2. Extending the above algorithms and speedup calculations from
the most common distance-1 problem to distance-n problems.
3. Studying and adding more approximation techniques, further
parallelization, and reduction of sequential bottlenecks.
4. Including more methods for scheduling the threads and for
defining a vertex-partition ordering that improve the heuristics.
5. Using finer-grained locking to study the long waiting chains
that degrade performance when the number of threads exceeds 150.
C++ Code for Sequential Greedy Graph Coloring
#include <bits/stdc++.h>
using namespace std;

class Graph {
    int V;                        // number of vertices
    list<int>* adj;               // adjacency lists
public:
    Graph(int V) { this->V = V; adj = new list<int>[V]; }
    ~Graph() { delete[] adj; }
    void addEdge(int v, int w) {
        adj[v].push_back(w);
        adj[w].push_back(v);      // undirected graph
    }
    void greedyColoring();
};

void Graph::greedyColoring() {
    vector<int> result(V, -1);    // -1 = uncolored
    result[0] = 0;                // first vertex gets color 0
    vector<bool> available(V, false); // colors used by neighbours
    for (int u = 1; u < V; u++) {
        for (int i : adj[u])      // mark neighbours' colors unavailable
            if (result[i] != -1) available[result[i]] = true;
        int cr;
        for (cr = 0; cr < V; cr++)
            if (!available[cr]) break; // smallest available color
        result[u] = cr;
        for (int i : adj[u])      // reset for the next vertex
            if (result[i] != -1) available[result[i]] = false;
    }
    for (int u = 0; u < V; u++)
        cout << "Vertex " << u << " ---> Color " << result[u] << endl;
}

int main() {
    Graph g1(5);
    g1.addEdge(0, 1); g1.addEdge(0, 2); g1.addEdge(1, 2);
    g1.addEdge(1, 3); g1.addEdge(2, 3); g1.addEdge(3, 4);
    g1.greedyColoring();
    return 0;
}