Graph Algorithm
Graphs
Graphs are abstract mathematical objects used to model complex problems in a simple representation. Graph
theory has become a major branch of mathematics that has been studied intensively for hundreds of years,
since Euler introduced the idea to solve the “Seven Bridges of Königsberg” problem in 1735. There are four important
types of graphs:
1. Undirected graphs, which have simple connections.
2. Digraphs, in which the direction of each connection is significant.
3. Edge-weighted graphs, in which each connection has an associated weight.
4. Edge-weighted digraphs, in which each connection has both a direction and a weight.
A graph may have a cycle, which is a path whose first and last vertices are the same. A cycle is called a
simple cycle if it has no repeated edges or vertices other than the first and last. The length of a path or
cycle is defined by its number of edges.
We call a graph connected if there is a path from every vertex to every other vertex. A graph that is not
connected consists of a set of connected components. A graph is called acyclic if it has no cycles. An
acyclic connected graph is called a tree. A disjoint set of trees is called a forest. A spanning tree of a
connected graph is a subgraph that is a single tree and contains all vertices of the graph. A spanning
forest of a graph is the union of the spanning trees of its connected components.
A graph is called bipartite if we can divide its vertices into two sets such that every edge connects a
vertex in one set to a vertex in the other.
Depth-First Search
DFS is one of the oldest graph-processing algorithms. Furthermore, we can say that DFS is a direct
expression of the nature of recursion. The image below shows that the trace of the recursive process can
be drawn as a tree.
Each call processes the current vertex and then moves on to its first unvisited neighbor, until no
unvisited neighbor can be found; the search then goes back up to explore the next unvisited neighbor of
the previous vertex. That is, a stack stores the steps of each call.
DFS can be used to solve the single-source connectivity problem and the single-source paths problem.
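As a minimal sketch of the recursive process described above, the Python code below marks every vertex connected to a source and remembers the edge used to reach it. The function names `dfs_paths` and `path_to`, the dictionary `edge_to`, and the small example graph are assumptions made for illustration; they are not taken from the original notes.

```python
def dfs_paths(graph, source):
    """Return a dict mapping each vertex reachable from source to its DFS parent."""
    edge_to = {source: None}

    def visit(v):
        # The call stack of this recursion plays the role of the explicit stack.
        for w in graph.get(v, []):
            if w not in edge_to:
                edge_to[w] = v
                visit(w)

    visit(source)
    return edge_to


def path_to(edge_to, target):
    """Rebuild the path from the source to target, or None if target is unreachable."""
    if target not in edge_to:
        return None
    path = []
    v = target
    while v is not None:
        path.append(v)
        v = edge_to[v]
    return path[::-1]


# Hypothetical undirected graph stored as adjacency lists.
graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0],
    3: [1],
    4: [5],
    5: [4],
}
edge_to = dfs_paths(graph, 0)
print(sorted(edge_to))      # vertices connected to 0 (single-source connectivity)
print(path_to(edge_to, 3))  # one path from 0 to 3 (single-source paths)
print(path_to(edge_to, 5))  # None, because 5 is not connected to 0
```

The path found by DFS is a valid path, but it is not guaranteed to be the shortest one.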
Breadth-First Search
BFS uses a queue to store the vertices waiting to be processed, so we visit all neighbors of the current
vertex first and only then move on to the neighbors of the next vertex. Intuitively, the method used by
BFS resembles the way we would try to learn the best path in an unfamiliar place.
BFS can be used to solve the single-source shortest paths problem, where the length of a path is its
number of edges.
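The queue-based process can be sketched in a few lines of Python. This is only an illustration under assumed names (`bfs_distances`, `dist_to`) and an assumed example graph; it computes the number of edges on a shortest path from the source to every reachable vertex.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return a dict mapping each reachable vertex to its edge count from source."""
    dist_to = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        # Visit all neighbors of v before moving on to the next vertex in the queue.
        for w in graph.get(v, []):
            if w not in dist_to:
                dist_to[w] = dist_to[v] + 1
                queue.append(w)
    return dist_to


# Hypothetical undirected graph stored as adjacency lists.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
dist_to = bfs_distances(graph, 0)
print(dist_to)  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
```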
4.2 Directed Graph
A directed graph (also called a digraph) is a set of vertices in which pairs of vertices may be connected
by directed edges, so each connection is ordered as indicated by the arrow of the edge.
Let v → w be an edge in a digraph; we say that v is the tail or parent and w is the head or child.
Remember that a vertex v may have many children. The outdegree is the number of children that vertex v
has (edges pointing from v), while the indegree is the number of parents that vertex v has (edges
pointing to v).
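To make the two degree definitions concrete, here is a small Python sketch. The adjacency-list layout, the function name `degrees`, and the three-vertex digraph are assumptions chosen for illustration.

```python
def degrees(digraph):
    """Return (indegree, outdegree) dicts for a digraph given as {v: [children]}."""
    outdegree = {v: len(children) for v, children in digraph.items()}
    indegree = {v: 0 for v in digraph}
    for children in digraph.values():
        for w in children:
            # Each edge v -> w adds one parent to w.
            indegree[w] = indegree.get(w, 0) + 1
    return indegree, outdegree


# Hypothetical digraph: a -> b, a -> c, b -> c.
digraph = {"a": ["b", "c"], "b": ["c"], "c": []}
indegree, outdegree = degrees(digraph)
print(indegree)   # {'a': 0, 'b': 1, 'c': 2}
print(outdegree)  # {'a': 2, 'b': 1, 'c': 0}
```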
Digraphs have many applications. Some of them are topological sort for scheduling, using the depth-first
order algorithm, and finding strongly connected components, for example to identify groups in a social
network or an electronic circuit, using Kosaraju's algorithm. Another interesting application is WordNet,
which is a rooted DAG (Directed Acyclic Graph) of words and their hypernyms. We can determine the
semantic relationship between two or more words by using breadth-first search to find the shortest
ancestral path and the common ancestor.
Scheduling Problem
Scheduling is one of the most practical applications of directed graphs. For example, consider the
college schedule of a computer science student shown below:
That is a simple schedule that is easy to remember and manage, but it becomes a problem when the
schedule contains a huge number of entities. To manage it better, we need to think like a mathematician
and reduce everything to numbers, so that the schedule looks like the picture at the right.
Say we need to pay attention to the study path that fits our current study year. To do this, we just need
to compute a topological order of the schedule graph. Our study path will then look like:
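As a rough illustration of topological order, the Python sketch below computes the reverse of the DFS postorder of a small digraph. The numbered courses and their prerequisites are hypothetical and do not reproduce the schedule from the original figure; the function name `topological_order` is also an assumption, and the code assumes the input has no cycles.

```python
def topological_order(digraph):
    """Return the vertices of a DAG in topological order (reverse DFS postorder)."""
    visited = set()
    postorder = []

    def visit(v):
        visited.add(v)
        for w in digraph.get(v, []):
            if w not in visited:
                visit(w)
        # v is finished only after everything that depends on it is finished.
        postorder.append(v)

    for v in digraph:
        if v not in visited:
            visit(v)
    return postorder[::-1]


# Hypothetical courses: an edge v -> w means course v must be taken before course w.
courses = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
order = topological_order(courses)
print(order)  # every course appears after all of its prerequisites
```

A digraph with a cycle has no topological order, so a real scheduler would have to check for cycles first.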
Prim's algorithm works like ‘growing a tree’ to construct an MST (minimum spanning tree). The algorithm
uses a priority queue to choose the next edge. There are two approaches to maintaining the priority queue:
the lazy approach, which keeps ineligible edges in the priority queue (space = E, time = E log E), and
the eager approach, which keeps only eligible edges in the priority queue (space = V, time = E log V).
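The lazy approach can be sketched with Python's `heapq` module as the priority queue. The function name `lazy_prim`, the adjacency-list layout with `(weight, neighbor)` pairs, and the example graph are assumptions for illustration, and the sketch assumes a connected undirected graph.

```python
import heapq


def lazy_prim(graph, start):
    """Return (mst_edges, total_weight) for a connected graph {v: [(weight, w), ...]}."""
    in_tree = {start}
    pq = [(weight, start, w) for weight, w in graph[start]]
    heapq.heapify(pq)
    mst = []
    total = 0
    while pq and len(in_tree) < len(graph):
        weight, v, w = heapq.heappop(pq)
        if w in in_tree:
            continue  # lazy: this edge became ineligible while it sat in the queue
        in_tree.add(w)
        mst.append((v, w, weight))
        total += weight
        for next_weight, x in graph[w]:
            if x not in in_tree:
                heapq.heappush(pq, (next_weight, w, x))
    return mst, total


# Hypothetical graph with edges 0-1 (1), 0-2 (4), 1-2 (2), 1-3 (5), 2-3 (3).
graph = {
    0: [(1, 1), (4, 2)],
    1: [(1, 0), (2, 2), (5, 3)],
    2: [(4, 0), (2, 1), (3, 3)],
    3: [(5, 1), (3, 2)],
}
mst, total = lazy_prim(graph, 0)
print(mst)    # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
print(total)  # 6
```

The eager approach would instead keep at most one candidate edge per vertex outside the tree, which needs a priority queue that supports decreasing a key; `heapq` does not offer that operation directly.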
We can improve the running time by using a more sophisticated data structure, but keep in mind which
case we are optimizing for. For example, for dense graphs we may use a plain array rather than a binary
heap, which already gives a running time proportional to V^2, so trying to improve the running time
further with a fancier heap would be useless there. The running time of Prim's algorithm with different
data structures is described in the table below:
Kruskal's algorithm
Kruskal's algorithm works in an ‘ordered manner’ to construct an MST. It needs a preprocessing step that
sorts the list of edges by weight, so that during processing we can easily take the lightest edge first,
skipping any edge that would create a cycle, until the tree is complete (space = E, time = E log E).
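A minimal Python sketch of this ordered process is shown below. It sorts the edges up front and uses a simple union-find structure to detect cycles; the function name `kruskal`, the `(weight, v, w)` edge layout, and the example edges are assumptions for illustration. The union step is simplified and does not use union by rank.

```python
def kruskal(num_vertices, edges):
    """Return the MST edges of a connected graph given as (weight, v, w) tuples."""
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent links to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst = []
    for weight, v, w in sorted(edges):
        root_v, root_w = find(v), find(w)
        if root_v != root_w:  # the edge joins two different components, so no cycle
            parent[root_v] = root_w
            mst.append((v, w, weight))
            if len(mst) == num_vertices - 1:
                break
    return mst


# Hypothetical graph with edges 0-1 (1), 0-2 (4), 1-2 (2), 1-3 (5), 2-3 (3).
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
mst = kruskal(4, edges)
print(mst)  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
```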
Using a priority queue and union-find, we can infer that the running time is proportional to E log E:
The main purpose of edge relaxation is to find a shorter distance to a goal than the one we have already
found. Say there are many alternative ways to go from s to w, through a set of vertices V’ = {v1, v2, v3,
…, vn} that connect s to w. In the end, the result of edge relaxation is to find the vi
that gives the shortest distance to w.
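The idea can be sketched as a single Python function. The names `relax`, `dist_to`, and `edge_to`, and the distances in the example, are assumptions for illustration: `dist_to[x]` holds the best distance from s to x found so far, and `edge_to[x]` holds the previous vertex on that path.

```python
import math


def relax(dist_to, edge_to, v, w, weight):
    """Relax edge v -> w; return True if it gives a shorter distance to w."""
    if dist_to[v] + weight < dist_to[w]:
        dist_to[w] = dist_to[v] + weight
        edge_to[w] = v
        return True
    return False


# Hypothetical state: s reaches v1 at distance 2 and v2 at distance 5; w is not reached yet.
dist_to = {"s": 0, "v1": 2, "v2": 5, "w": math.inf}
edge_to = {"v1": "s", "v2": "s"}

relax(dist_to, edge_to, "v2", "w", 1)  # first route found: distance 6 through v2
relax(dist_to, edge_to, "v1", "w", 3)  # better route: distance 5 through v1
print(dist_to["w"], edge_to["w"])      # 5 v1
```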
• Vertex relaxation
Finding longest paths in a DAG is straightforward. All we need to do is negate every weight in the DAG
and then find shortest paths in the negated DAG; those shortest paths are the longest paths in the
original DAG.
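The negation trick can be sketched in Python as follows. The function name `dag_longest_paths`, the `{v: [(w, weight), ...]}` layout, and the example DAG are assumptions for illustration; the sketch assumes every vertex appears as a key and that the input has no cycles. Shortest paths in the negated DAG are found by relaxing edges in topological order.

```python
import math


def dag_longest_paths(dag, source):
    """Return longest distances from source in a DAG given as {v: [(w, weight), ...]}."""
    # Step 1: negate every weight.
    negated = {v: [(w, -weight) for w, weight in edges] for v, edges in dag.items()}

    # Step 2: topological order (reverse DFS postorder).
    visited, postorder = set(), []

    def visit(v):
        visited.add(v)
        for w, _ in negated[v]:
            if w not in visited:
                visit(w)
        postorder.append(v)

    for v in negated:
        if v not in visited:
            visit(v)

    # Step 3: shortest paths in the negated DAG by relaxing in topological order.
    dist_to = {v: math.inf for v in negated}
    dist_to[source] = 0
    for v in reversed(postorder):
        if dist_to[v] == math.inf:
            continue  # v is not reachable from the source
        for w, weight in negated[v]:
            if dist_to[v] + weight < dist_to[w]:
                dist_to[w] = dist_to[v] + weight

    # Step 4: negate the answers back to get longest distances.
    return {v: -d for v, d in dist_to.items() if d != math.inf}


# Hypothetical DAG.
dag = {"s": [("a", 3), ("b", 1)], "a": [("t", 2)], "b": [("a", 4), ("t", 1)], "t": []}
print(dag_longest_paths(dag, "s"))  # {'s': 0, 'a': 5, 'b': 1, 't': 7}
```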
• Parallel job scheduling
Recall the scheduling problem in section 4.2, which was a scheduling problem on a single processor
unit. In this problem, we have a machine with multiple processor units that is capable of
processing multiple jobs at the same time. To do better with respect to processing time, we have to know
how much time is needed for all jobs to be done, that is, the bound on the processing time.
To find this bound, we only need the DAG longest-paths algorithm, which finds a critical path: the
chain of dependent jobs that finishes last in the parallel processing. Let us take an example:
Below is a table of jobs with an intuitive solution:
It looks easy, but for me it is not. A critical path is the longest path from start to end. To find it
quickly, we need to apply some rules in our longest-paths algorithm:
1. Add two virtual vertices at the start and the end of the job schedule.
2. Each job is represented by two vertices: an initial vertex (vi) and a terminal vertex (vj). A
weighted edge vi → vj connects them, and its weight is the duration of the job.
3. A constraint that job ji must finish before job jt starts is represented by an edge from the
terminal vertex of ji to the initial vertex of jt.
So, following the rules above, the DAG representation of a parallel job scheduling problem with N jobs has
2 * N + 2 vertices.
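The three rules can be sketched in Python as follows. The function name `critical_path_schedule`, the vertex numbering (job i starts at vertex i and ends at vertex i + N), the zero weights on the virtual and constraint edges, and the four example jobs are all assumptions for illustration. For brevity the sketch computes longest paths directly by taking maximums in topological order, which is equivalent to negating the weights and finding shortest paths.

```python
def critical_path_schedule(durations, must_precede):
    """Return (start_times, finish_time) for jobs with precedence constraints.

    durations[i] is the duration of job i; must_precede[i] lists the jobs that
    may only start after job i has finished.
    """
    n = len(durations)
    source, sink = 2 * n, 2 * n + 1
    edges = {v: [] for v in range(2 * n + 2)}  # 2 * N + 2 vertices in total
    for i in range(n):
        edges[source].append((i, 0))             # rule 1: virtual start vertex
        edges[i].append((i + n, durations[i]))   # rule 2: job edge weighted by duration
        edges[i + n].append((sink, 0))           # rule 1: virtual end vertex
        for j in must_precede.get(i, []):
            edges[i + n].append((j, 0))          # rule 3: end of i -> start of j

    # Topological order by reverse DFS postorder from the virtual start vertex.
    visited, postorder = set(), []

    def visit(v):
        visited.add(v)
        for w, _ in edges[v]:
            if w not in visited:
                visit(w)
        postorder.append(v)

    visit(source)

    # Longest distance from the source to every vertex.
    dist_to = [0] * (2 * n + 2)
    for v in reversed(postorder):
        for w, weight in edges[v]:
            dist_to[w] = max(dist_to[w], dist_to[v] + weight)
    return dist_to[:n], dist_to[sink]


# Hypothetical jobs: job 0 precedes jobs 1 and 2, which both precede job 3.
durations = [4, 2, 3, 1]
must_precede = {0: [1, 2], 1: [3], 2: [3]}
start_times, finish_time = critical_path_schedule(durations, must_precede)
print(start_times)  # [0, 4, 4, 7]
print(finish_time)  # 8
```

In this example the critical path is job 0, then job 2, then job 3, with a total length of 8.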
Our longest-paths algorithm will find a critical path that looks like the image below:
In parallel job scheduling with relative deadlines, our focus is to determine whether the deadlines are
feasible or not. To do so, we only need to find shortest paths. To represent the relative-deadline
constraints, we need some rules:
1. A deadline is an edge with negative weight and the opposite direction.
2. A deadline constraint creates a cycle made of two paths: P+, which is a path from v to w,
and P-, which is a path from w to v.
3. A deadline on P- is feasible only if the magnitude of the cost of P- is not less than the cost of P+.
Unfortunately, no linear-time algorithm is known for this problem. One algorithm that is guaranteed
to provide an optimal solution is the Bellman-Ford algorithm, a general edge-weighted digraph
algorithm whose running time is proportional to E*V.
• Bellman-Ford algorithm
This algorithm is not fully general for edge-weighted digraphs, since it requires that no
negative cycle exists. In practical applications, one of which is
parallel job scheduling with relative deadlines, a negative cycle indicates an
error in the schedule that has to be corrected by hand.
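The Bellman-Ford algorithm is quite simple; its running time is proportional to E*V and its space is
proportional to V. The algorithm says, “For every vertex in the digraph, relax all the edges”, that is,
make V passes over the edges. We can improve the running time a little in practice by keeping in a
queue only the vertices that are still eligible, meaning those whose distance changed in the previous pass.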
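A minimal Python sketch of the pass-based version is shown below. The function name `bellman_ford`, the `(v, w, weight)` edge layout, and the example digraphs are assumptions for illustration. It makes at most V - 1 relaxation passes plus one extra pass that reports a negative cycle, and it stops early when a pass changes nothing; this early exit is a simpler stand-in for the queue-based improvement, not an implementation of it.

```python
import math


def bellman_ford(num_vertices, edges, source):
    """Return shortest distances from source, or None if a negative cycle is reachable."""
    dist_to = [math.inf] * num_vertices
    dist_to[source] = 0
    for _ in range(num_vertices - 1):
        changed = False
        for v, w, weight in edges:
            if dist_to[v] + weight < dist_to[w]:  # edge relaxation
                dist_to[w] = dist_to[v] + weight
                changed = True
        if not changed:
            break  # no distance improved in this pass, so the result is final
    # One more pass: any further improvement means a negative cycle exists.
    for v, w, weight in edges:
        if dist_to[v] + weight < dist_to[w]:
            return None
    return dist_to


# Hypothetical digraph with one negative edge but no negative cycle.
edges = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 2)]
print(bellman_ford(4, edges, 0))  # [0, 4, 1, 3]

# Hypothetical digraph whose single cycle 0 -> 1 -> 2 -> 0 has total weight -2.
cyclic_edges = [(0, 1, 1), (1, 2, -2), (2, 0, -1)]
print(bellman_ford(3, cyclic_edges, 0))  # None
```

In the scheduling setting, a `None` result corresponds to a set of deadlines that cannot all be met.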