Graph Algorithms
PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.
PDF generated at: Wed, 29 Aug 2012 18:41:05 UTC
Contents

Articles
Introduction
Graph theory
Undirected graphs
Directed graphs
Adjacency list
Adjacency matrix
Implicit graph
Depth-first search
Breadth-first search
Topological sorting
Connected components
Edge connectivity
Vertex connectivity
Ear decomposition
Application: 2-satisfiability
Shortest paths
Dijkstra's algorithm for single-source shortest paths with positive edge lengths
Bellman–Ford algorithm for single-source shortest paths allowing negative edge lengths
Bidirectional search
A* search algorithm
Borůvka's algorithm
Kruskal's algorithm
Prim's algorithm
Clique problem
Graph coloring
Bipartite graph
Greedy coloring
Vertex cover
Dominating set
Tours
Eulerian path
Hamiltonian path
Matching
Matching
Assignment problem
Permanent
Network flow
Closure problem
Planar graph
Dual graph
Fáry's theorem
Steinitz's theorem
Planarity testing
Graph drawing
Graph embedding
Application: Sociograms
Interval graph
Chordal graph
Perfect graph
Intersection graph
Line graph
Claw-free graph
Median graph
Graph isomorphism
Graph isomorphism
Graph canonization
Color-coding
Graph partition
Kernighan–Lin algorithm
Tree decomposition
Branch-decomposition
Path decomposition
Graph minors
Courcelle's theorem
Robertson–Seymour theorem
Bidimensionality
References
Article Sources and Contributors
Article Licenses
License
Introduction
Graph theory
Further information: Graph (mathematics) and Glossary of graph theory
In mathematics and computer science, graph theory is
the study of graphs, which are mathematical structures
used to model pairwise relations between objects from
a certain collection. A "graph" in this context is a
collection of "vertices" or "nodes" and a collection of
edges that connect pairs of vertices. A graph may be
undirected, meaning that there is no distinction
between the two vertices associated with each edge, or
its edges may be directed from one vertex to another;
see graph (mathematics) for more detailed definitions
and for other variations in the types of graph that are
commonly considered. Graphs are one of the prime
objects of study in discrete mathematics.
A drawing of a graph
The graphs studied in graph theory should not be confused with the graphs of functions or other kinds of graphs.
Refer to the glossary of graph theory for basic definitions in graph theory.
Applications
Graphs are among the most ubiquitous models of both natural and human-made structures. They can be used to
model many types of relations and process dynamics in physical, biological[1] and social systems. Many problems of
practical interest can be represented by graphs.
In computer science, graphs are used to represent networks of communication, data organization, computational
devices, the flow of computation, etc. One practical example: The link structure of a website could be represented by
a directed graph. The vertices are the web pages available at the website and a directed edge from page A to page B
exists if and only if A contains a link to B. A similar approach can be taken to problems in travel, biology, computer
chip design, and many other fields. The development of algorithms to handle graphs is therefore of major interest in
computer science. There, the transformation of graphs is often formalized and represented by graph rewrite systems.
They are either directly used or properties of the rewrite systems (e.g. confluence) are studied. Complementary to
graph transformation systems focussing on rule-based in-memory manipulation of graphs are graph databases geared
towards transaction-safe, persistent storing and querying of graph-structured data.
Graph-theoretic methods, in various forms, have proven particularly useful in linguistics, since natural language
often lends itself well to discrete structure. Traditionally, syntax and compositional semantics follow tree-based
structures, whose expressive power lies in the Principle of Compositionality, modeled in a hierarchical graph. More
contemporary approaches such as Head-driven phrase structure grammar (HPSG) model syntactic constructions via
the unification of typed feature structures, which are directed acyclic graphs. Within lexical semantics, especially as
applied to computers, modeling word meaning is easier when a given word is understood in terms of related words;
semantic networks are therefore important in computational linguistics. Still other methods in phonology (e.g.
Optimality Theory, which uses lattice graphs) and morphology (e.g. finite-state morphology, using finite-state
transducers) are common in the analysis of language as a graph. Indeed, the usefulness of this area of mathematics to
linguistics has borne organizations such as TextGraphs [2], as well as various 'Net' projects, such as WordNet,
VerbNet, and others.
Graph theory is also used to study molecules in chemistry and physics. In condensed matter physics, the three
dimensional structure of complicated simulated atomic structures can be studied quantitatively by gathering statistics
on graph-theoretic properties related to the topology of the atoms; an example is Franzblau's shortest-path (SP) rings.
In chemistry a graph makes a natural model for a molecule, where vertices represent atoms and edges bonds. This
approach is especially used in computer processing of molecular structures, ranging from chemical editors to
database searching. In statistical physics, graphs can represent local connections between interacting parts of a
system, as well as the dynamics of a physical process on such systems.
Graph theory is also widely used in sociology as a way, for example, to measure actors' prestige or to explore
diffusion mechanisms, notably through the use of social network analysis software.
Likewise, graph theory is useful in biology and conservation efforts where a vertex can represent regions where
certain species exist (or habitats) and the edges represent migration paths, or movement between the regions. This
information is important when looking at breeding patterns or tracking the spread of disease, parasites or how
changes to the movement can affect other species.
In mathematics, graphs are useful in geometry and certain parts of topology, e.g. Knot Theory. Algebraic graph
theory has close links with group theory.
A graph structure can be extended by assigning a weight to each edge of the graph. Graphs with weights, or
weighted graphs, are used to represent structures in which pairwise connections have some numerical values. For
example if a graph represents a road network, the weights could represent the length of each road.
A digraph with weighted edges in the context of graph theory is called a network. Network analysis has many
practical applications, for example, to model and analyze traffic networks. Applications of network analysis split
broadly into three categories:
1. First, analysis to determine structural properties of a network, such as the distribution of vertex degrees and the
diameter of the graph. A vast number of graph measures exist, and the production of useful ones for various
domains remains an active area of research.
2. Second, analysis to find a measurable quantity within the network, for example, for a transportation network, the
level of vehicular flow within any portion of it.
3. Third, analysis of dynamical properties of networks.
History
The paper written by Leonhard Euler on the Seven Bridges of
Königsberg and published in 1736 is regarded as the first paper in
the history of graph theory.[3] This paper, as well as the one
written by Vandermonde on the knight problem, carried on with
the analysis situs initiated by Leibniz. Euler's formula relating the
number of edges, vertices, and faces of a convex polyhedron was
studied and generalized by Cauchy[4] and L'Huillier,[5] and is at
the origin of topology.
More than one century after Euler's paper on the bridges of
Königsberg and while Listing introduced topology, Cayley was led
by the study of particular analytical forms arising from differential
calculus to study a particular class of graphs, the trees. This study had many implications in theoretical chemistry.
The involved techniques mainly concerned the enumeration of graphs having particular properties. Enumerative
graph theory then rose from the results of Cayley and the fundamental results published by Pólya between 1935 and
1937 and the generalization of these by De Bruijn in 1959. Cayley linked his results on trees with the contemporary
studies of chemical composition.[6] The fusion of the ideas coming from mathematics with those coming from
chemistry is at the origin of a part of the standard terminology of graph theory.
In particular, the term "graph" was introduced by Sylvester in a paper published in 1878 in Nature, where he draws
an analogy between "quantic invariants" and "co-variants" of algebra and molecular diagrams:[7]
"[...] Every invariant and co-variant thus becomes expressible by a graph precisely identical with a Kekuléan
diagram or chemicograph. [...] I give a rule for the geometrical multiplication of graphs, i.e. for constructing a
graph to the product of in- or co-variants whose separate graphs are given. [...]" (italics as in the original).
The first textbook on graph theory was written by Dénes Kőnig, and published in 1936.[8] A later textbook by Frank
Harary, published in 1969, was enormously popular, and enabled mathematicians, chemists, electrical engineers and
social scientists to talk to each other. Harary donated all of the royalties to fund the Pólya Prize.[9]
One of the most famous and productive problems of graph theory is the four color problem: "Is it true that any map
drawn in the plane may have its regions colored with four colors, in such a way that any two regions having a
common border have different colors?" This problem was first posed by Francis Guthrie in 1852 and its first written
record is in a letter of De Morgan addressed to Hamilton the same year. Many incorrect proofs have been proposed,
including those by Cayley, Kempe, and others. The study and the generalization of this problem by Tait, Heawood,
Ramsey and Hadwiger led to the study of the colorings of the graphs embedded on surfaces with arbitrary genus.
Tait's reformulation generated a new class of problems, the factorization problems, particularly studied by Petersen
and Kőnig. The works of Ramsey on colorations, and especially the results obtained by Turán in 1941, were at the
origin of another branch of graph theory, extremal graph theory.
The four color problem remained unsolved for more than a century. In 1969 Heinrich Heesch published a method for
solving the problem using computers.[10] A computer-aided proof produced in 1976 by Kenneth Appel and
Wolfgang Haken makes fundamental use of the notion of "discharging" developed by Heesch.[11][12] The proof
involved checking the properties of 1,936 configurations by computer, and was not fully accepted at the time due to
its complexity. A simpler proof considering only 633 configurations was given twenty years later by Robertson,
Seymour, Sanders and Thomas.[13]
The autonomous development of topology from 1860 to 1930 fertilized graph theory back through the works of
Jordan, Kuratowski and Whitney. Another important factor of common development of graph theory and topology
came from the use of the techniques of modern algebra. The first example of such a use comes from the work of the
physicist Gustav Kirchhoff, who published in 1845 his Kirchhoff's circuit laws for calculating the voltage and
current in electric circuits.
The introduction of probabilistic methods in graph theory, especially in the study by Erdős and Rényi of the
asymptotic probability of graph connectivity, gave rise to yet another branch, known as random graph theory, which
has been a fruitful source of graph-theoretic results.
Drawing graphs
Graphs are represented graphically by drawing a dot or circle for every vertex, and drawing an arc between two
vertices if they are connected by an edge. If the graph is directed, the direction is indicated by drawing an arrow.
A graph drawing should not be confused with the graph itself (the abstract, non-visual structure) as there are several
ways to structure the graph drawing. All that matters is which vertices are connected to which others by how many
edges and not the exact layout. In practice it is often difficult to decide if two drawings represent the same graph.
Depending on the problem domain some layouts may be better suited and easier to understand than others.
The pioneering work of W. T. Tutte was very influential in the subject of graph drawing. Among other
achievements, he introduced the use of linear algebraic methods to obtain graph drawings.
Graph drawing also can be said to encompass problems that deal with the crossing number and its various
generalizations. The crossing number of a graph is the minimum number of intersections between edges that a
drawing of the graph in the plane must contain. For a planar graph, the crossing number is zero by definition.
Drawings on surfaces other than the plane are also studied.
List structures
Incidence list
The edges are represented by an array containing pairs (tuples if directed) of vertices (that the edge connects)
and possibly weight and other data. Vertices connected by an edge are said to be adjacent.
Adjacency list
Much like the incidence list, each vertex has a list of which vertices it is adjacent to. This causes redundancy
in an undirected graph: for example, if vertices A and B are adjacent, A's adjacency list contains B, while B's
list contains A. Adjacency queries are faster, at the cost of extra storage space.
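As a concrete sketch, an undirected adjacency list can be built from an edge list in a few lines of Python (the vertex names here are hypothetical examples, not taken from the article):

```python
def adjacency_list(edges):
    """Build an undirected adjacency list from a list of vertex pairs."""
    adj = {}
    for u, v in edges:
        # Store the edge in both directions: this is the redundancy noted
        # above, traded for fast adjacency queries.
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj
```

For example, `adjacency_list([("A", "B")])` returns `{'A': ['B'], 'B': ['A']}`: vertex A's list contains B and B's list contains A.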
Matrix structures
Incidence matrix
The graph is represented by a matrix of size |V | (number of vertices) by |E| (number of edges) where the entry
[vertex, edge] contains the edge's endpoint data (simplest case: 1 - incident, 0 - not incident).
Adjacency matrix
This is an n by n matrix A, where n is the number of vertices in the graph. If there is an edge from a vertex x to
a vertex y, then the element
is 1 (or in general the number of xy edges), otherwise it is 0. In computing,
this matrix makes it easy to find subgraphs, and to reverse a directed graph.
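A minimal sketch of both points in Python, assuming vertices are numbered 0 to n−1 (reversing a directed graph amounts to transposing its matrix):

```python
def adjacency_matrix(n, edges, directed=False):
    """n-by-n matrix A with A[x][y] = number of xy edges."""
    A = [[0] * n for _ in range(n)]
    for x, y in edges:
        A[x][y] += 1
        if not directed and x != y:
            A[y][x] += 1
    return A

def reverse_digraph(A):
    """Reverse every edge of a directed graph by transposing its matrix."""
    n = len(A)
    return [[A[y][x] for y in range(n)] for x in range(n)]
```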
Laplacian matrix or "Kirchhoff matrix" or "Admittance matrix"
This is defined as D − A, where D is the diagonal degree matrix. It explicitly contains both adjacency
information and degree information. (However, there are other, similar matrices that are also called "Laplacian
matrices" of a graph.)
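The definition L = D − A translates directly into code; a sketch for a simple undirected graph given by its adjacency matrix:

```python
def laplacian(A):
    """Laplacian L = D - A, where D is the diagonal degree matrix."""
    n = len(A)
    deg = [sum(row) for row in A]  # vertex degrees: the diagonal of D
    return [[(deg[i] - A[i][j]) if i == j else -A[i][j] for j in range(n)]
            for i in range(n)]
```

Each row of the Laplacian sums to zero, since the degree on the diagonal equals the number of adjacency entries subtracted off the rest of the row.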
Distance matrix
A symmetric n by n matrix D, where n is the number of vertices in the graph. The element D[x, y] is the length
of a shortest path between x and y; if there is no such path, D[x, y] = ∞.
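For an unweighted graph the distance matrix can be filled in by running a breadth-first search from every vertex; a sketch, taking the graph as an adjacency dict:

```python
from collections import deque

def distance_matrix(adj):
    """Distance matrix of an unweighted graph given as an adjacency dict.

    Unreachable pairs keep the value float('inf'), matching the
    convention in the definition above.
    """
    vs = sorted(adj)
    idx = {v: i for i, v in enumerate(vs)}
    inf = float("inf")
    D = [[inf] * len(vs) for _ in vs]
    for s in vs:
        D[idx[s]][idx[s]] = 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for w in adj[u]:
                if D[idx[s]][idx[w]] == inf:
                    D[idx[s]][idx[w]] = D[idx[s]][idx[u]] + 1
                    queue.append(w)
    return D
```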
Graph coloring
Many problems have to do with various ways of coloring graphs, for example:
Route problems
Network flow
There are numerous problems arising especially from applications that have to do with various notions of flows in
networks, for example:
Max flow min cut theorem
Covering problems
Covering problems are specific instances of subgraph-finding problems, and they tend to be closely related to the
clique problem or the independent set problem.
Set cover problem
Vertex cover problem
Graph classes
Many problems involve characterizing the members of various classes of graphs. Overlapping significantly with
other types in this list, this type of problem includes, for instance:
Notes
[1] Mashaghi, A.; et al. (2004). "Investigation of a protein complex network". European Physical Journal B 41 (1): 113–121.
doi:10.1140/epjb/e2004-00301-0.
[2] http://www.textgraphs.org
[3] Biggs, N.; Lloyd, E. and Wilson, R. (1986), Graph Theory, 1736-1936, Oxford University Press
[4] Cauchy, A.L. (1813), "Recherche sur les polyèdres – premier mémoire", Journal de l'École Polytechnique 9 (Cahier 16): 66–86.
[5] L'Huillier, S.-A.-J. (1861), "Mémoire sur la polyèdrométrie", Annales de Mathématiques 3: 169–189.
[6] Cayley, A. (1875), "Ueber die Analytischen Figuren, welche in der Mathematik Bäume genannt werden und ihre Anwendung auf die Theorie
chemischer Verbindungen", Berichte der deutschen Chemischen Gesellschaft 8 (2): 1056–1059, doi:10.1002/cber.18750080252.
[7] John Joseph Sylvester (1878), Chemistry and Algebra. Nature, volume 17, page 284. doi:10.1038/017284a0. Online version
(http://www.archive.org/stream/nature15unkngoog#page/n312/mode/1up). Retrieved 2009-12-30.
[8] Tutte, W.T. (2001), Graph Theory (http://books.google.com/books?id=uTGhooU37h4C&pg=PA30), Cambridge University Press, p. 30,
ISBN 978-0-521-79489-3.
[9] Society for Industrial and Applied Mathematics (2002), "The George Polya Prize" (http://www.siam.org/about/more/siam50.pdf),
Looking Back, Looking Ahead: A SIAM History, p. 26.
[10] Heinrich Heesch: Untersuchungen zum Vierfarbenproblem. Mannheim: Bibliographisches Institut 1969.
[11] Appel, K. and Haken, W. (1977), "Every planar map is four colorable. Part I. Discharging", Illinois J. Math. 21: 429–490.
[12] Appel, K. and Haken, W. (1977), "Every planar map is four colorable. Part II. Reducibility", Illinois J. Math. 21: 491–567.
[13] Robertson, N.; Sanders, D.; Seymour, P. and Thomas, R. (1997), "The four color theorem", Journal of Combinatorial Theory Series B 70:
2–44, doi:10.1006/jctb.1997.1750.
References
Berge, Claude (1958), Théorie des graphes et ses applications, Collection Universitaire de Mathématiques, II,
Paris: Dunod. English edition, Wiley 1961; Methuen & Co, New York 1962; Russian, Moscow 1961; Spanish,
Mexico 1962; Romanian, Bucharest 1969; Chinese, Shanghai 1963; Second printing of the 1962 first English
edition, Dover, New York 2001.
Biggs, N.; Lloyd, E.; Wilson, R. (1986), Graph Theory, 1736–1936, Oxford University Press.
Bondy, J.A.; Murty, U.S.R. (2008), Graph Theory, Springer, ISBN 978-1-84628-969-9.
Bondy, Riordan, O.M (2003), Mathematical results on scale-free random graphs in "Handbook of Graphs and
Networks" (S. Bornholdt and H.G. Schuster (eds)), Wiley VCH, Weinheim, 1st ed..
Chartrand, Gary (1985), Introductory Graph Theory, Dover, ISBN 0-486-24775-9.
Gibbons, Alan (1985), Algorithmic Graph Theory, Cambridge University Press.
Reuven Cohen, Shlomo Havlin (2010), Complex Networks: Structure, Robustness and Function, Cambridge
University Press
Golumbic, Martin (1980), Algorithmic Graph Theory and Perfect Graphs, Academic Press.
Harary, Frank (1969), Graph Theory, Reading, MA: Addison-Wesley.
Harary, Frank; Palmer, Edgar M. (1973), Graphical Enumeration, New York, NY: Academic Press.
Mahadev, N.V.R.; Peled, Uri N. (1995), Threshold Graphs and Related Topics, North-Holland.
Mark Newman (2010), Networks: An Introduction, Oxford University Press.
External links
Online textbooks
Graph Theory with Applications (http://www.math.jussieu.fr/~jabondy/books/gtwa/gtwa.html) (1976) by
Bondy and Murty
Phase Transitions in Combinatorial Optimization Problems, Section 3: Introduction to Graphs (http://arxiv.org/
pdf/cond-mat/0602129) (2006) by Hartmann and Weigt
Digraphs: Theory, Algorithms and Applications (http://www.cs.rhul.ac.uk/books/dbook/) (2007) by Jørgen
Bang-Jensen and Gregory Gutin
Graph Theory, by Reinhard Diestel (http://diestel-graph-theory.com/index.html)
Other resources
Graph theory tutorial (http://www.utm.edu/departments/math/graph/)
A searchable database of small connected graphs (http://www.gfredericks.com/main/sandbox/graphs)
Image gallery: graphs (http://web.archive.org/web/20060206155001/http://www.nd.edu/~networks/
gallery.htm)
Concise, annotated list of graph theory resources for researchers (http://www.babelgraph.org/links.html)
rocs (http://www.kde.org/applications/education/rocs/) - a graph theory IDE
Basics
A graph G consists of two types of elements, namely vertices and edges. Every edge has two endpoints in the set of
vertices, and is said to connect or join the two endpoints. An edge can thus be defined as a set of two vertices (or an
ordered pair, in the case of a directed graph - see Section Direction). The two endpoints of an edge are also said to
be adjacent to each other.
Alternative models of graphs exist; e.g., a graph may be thought of as a Boolean binary function over the set of
vertices or as a square (0,1)-matrix.
A vertex is simply drawn as a node or a dot. The vertex set of G is usually denoted by V(G), or V when there is no
danger of confusion. The order of a graph is the number of its vertices, i.e. |V(G)|.
An edge (a set of two elements) is drawn as a line connecting two vertices, called endpoints or (less often)
endvertices. An edge with endvertices x and y is denoted by xy (without any symbol in between). The edge set of G
is usually denoted by E(G), or E when there is no danger of confusion.
The size of a graph is the number of its edges, i.e. |E(G)|.[1]
A loop is an edge whose endpoints are the same vertex. A link has two distinct endvertices. An edge is multiple if
there is another edge with the same endvertices; otherwise it is simple. The multiplicity of an edge is the number of
multiple edges sharing the same end vertices; the multiplicity of a graph, the maximum multiplicity of its edges. A
graph is a simple graph if it has no multiple edges or loops, a multigraph if it has multiple edges, but no loops, and
a multigraph or pseudograph if it contains both multiple edges and loops (the literature is highly inconsistent).
When stated without any qualification, a graph is usually assumed to be simple, except in the literature of category
theory, where it refers to a quiver.
Graphs whose edges or vertices have names or labels are known as labeled, those without as unlabeled. Graphs with
labeled vertices only are vertex-labeled, those with labeled edges only are edge-labeled. The difference between a
labeled and an unlabeled graph is that the latter has no specific set of vertices or edges; it is regarded as another way
to look upon an isomorphism type of graphs. (Thus, this usage distinguishes between graphs with identifiable vertex
or edge sets on the one hand, and isomorphism types or classes of graphs on the other.)
(Graph labeling usually refers to the assignment of labels (usually natural numbers, usually distinct) to the edges
and vertices of a graph, subject to certain rules depending on the situation. This should not be confused with a
graph's merely having distinct labels or names on the vertices.)
A hyperedge is an edge that is allowed to take on any number of
vertices, possibly more than 2. A graph that allows any hyperedge is
called a hypergraph. A simple graph can be considered a special case
of the hypergraph, namely the 2-uniform hypergraph. However, when
stated without any qualification, an edge is always assumed to consist
of at most 2 vertices, and a graph is never confused with a hypergraph.
A non-edge (or anti-edge) is an edge that is not present in the graph. More formally, for two vertices u and v,
{u, v} is a non-edge in a graph G whenever {u, v} is not an edge in G: there is either no edge between the two
vertices or (for directed graphs) at most one of (u, v) and (v, u) is an arc in G.
Occasionally the term cotriangle or anti-triangle is used for a set of three vertices none of which are connected.
The complement G′ of a graph G is a graph with the same vertex set as G but with an edge set such that xy is an
edge in G′ if and only if xy is not an edge in G.
An edgeless graph or empty graph or null graph is a graph with zero or more vertices, but no edges. The empty
graph or null graph may also be the graph with no vertices and no edges. If it is a graph with no edges and any
number of vertices, it may be called the null graph on n vertices. (There is no consistency at all in the
literature.)
A graph is infinite if it has infinitely many vertices or edges or both; otherwise the graph is finite. An infinite graph
where every vertex has finite degree is called locally finite. When stated without any qualification, a graph is usually
assumed to be finite. See also continuous graph.
Two graphs G and H are said to be isomorphic, denoted by G ≅ H, if there is a one-to-one correspondence, called an
isomorphism, between the vertices of the two graphs such that two vertices are adjacent in G if and only if their
corresponding vertices are adjacent in H. Likewise, a graph G is said to be homomorphic to a graph H if there is a
mapping, called a homomorphism, from V(G) to V(H) such that if two vertices are adjacent in G then their
corresponding vertices are adjacent in H.
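No efficient algorithm is known for deciding isomorphism in general, but the definition can be checked directly on very small graphs by trying every bijection between the two vertex sets; a brute-force sketch for simple undirected graphs (exponential in the number of vertices, so for illustration only):

```python
from itertools import permutations

def isomorphic(vg, eg, vh, eh):
    """Brute-force isomorphism test for small simple undirected graphs.

    vg, vh are vertex lists; eg, eh are lists of vertex pairs.
    """
    if len(vg) != len(vh) or len(eg) != len(eh):
        return False
    h_edges = {frozenset(e) for e in eh}
    for perm in permutations(vh):
        f = dict(zip(vg, perm))  # candidate bijection V(G) -> V(H)
        # f is an isomorphism when it maps the edge set of G onto that of H
        if {frozenset((f[a], f[b])) for a, b in eg} == h_edges:
            return True
    return False
```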
Subgraphs
A subgraph of a graph G is a graph whose vertex set is a subset of that of G, and whose adjacency relation is a
subset of that of G restricted to this subset. In the other direction, a supergraph of a graph G is a graph of which G is
a subgraph. We say a graph G contains another graph H if some subgraph of G is H or is isomorphic to H.
A subgraph H is a spanning subgraph, or factor, of a graph G if it has the same vertex set as G. We say H spans G.
A subgraph H of a graph G is said to be induced (or full) if, for any pair of vertices x and y of H, xy is an edge of H
if and only if xy is an edge of G. In other words, H is an induced subgraph of G if it has exactly the edges that appear
in G over the same vertex set. If the vertex set of H is the subset S of V(G), then H can be written as G[S] and is said
to be induced by S.
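In terms of edge lists, computing the induced subgraph G[S] is a one-line filter; a sketch:

```python
def induced_subgraph(edges, S):
    """Edge set of G[S]: keep exactly the edges with both endpoints in S."""
    S = set(S)
    return [(u, v) for u, v in edges if u in S and v in S]
```

For example, inducing on {1, 2, 3} in a path 1–2–3–4 keeps the edges (1, 2) and (2, 3) but drops (3, 4), whose endpoint 4 lies outside S.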
A graph that does not contain H as an induced subgraph is said to be H-free.
A universal graph in a class K of graphs is a simple graph in which every element in K can be embedded as a
subgraph.
A graph G is minimal with some property P provided that G has property P and no proper subgraph of G has
property P. In this definition, the term subgraph is usually understood to mean "induced subgraph." The notion of
maximality is defined dually: G is maximal with P provided that P(G) and G has no proper supergraph H such that
P(H).
Many important classes of graphs can be defined by sets of forbidden subgraphs, the minimal graphs that are not in
the class.
Walks
A walk is an alternating sequence of vertices and edges, beginning and ending with a vertex, where each vertex is
incident to both the edge that precedes it and the edge that follows it in the sequence, and where the vertices that
precede and follow an edge are the end vertices of that edge. A walk is closed if its first and last vertices are the
same, and open if they are different.
The length l of a walk is the number of edges that it uses. For an open walk, l = n − 1, where n is the number of
vertices visited (a vertex is counted each time it is visited). For a closed walk, l = n (the start/end vertex is listed
twice, but is not counted twice). In the example graph, (1, 2, 5, 1, 2, 3) is an open walk with length 5, and (4, 5, 2, 1,
5, 4) is a closed walk of length 5.
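These definitions are easy to check mechanically. The sketch below verifies walks against an adjacency dict; the graph used in the test is a hypothetical stand-in consistent with the two walks quoted above, since the example graph's full edge list is not reproduced here:

```python
def is_walk(adj, seq):
    """A vertex sequence is a walk if consecutive vertices are adjacent."""
    return all(v in adj[u] for u, v in zip(seq, seq[1:]))

def walk_length(seq):
    """The length of a walk is its number of edges, one less than the
    number of vertex visits."""
    return len(seq) - 1

def is_closed(seq):
    """A walk is closed if its first and last vertices coincide."""
    return seq[0] == seq[-1]
```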
A trail is a walk in which all the edges are distinct. A closed trail has been called a tour or circuit, but these are not
universal, and the latter is often reserved for a regular subgraph of degree two.
Traditionally, a path referred to what is now usually known as an open
walk. Nowadays, when stated without any qualification, a path is
usually understood to be simple, meaning that no vertices (and thus no
edges) are repeated. (The term chain has also been used to refer to a
walk in which all vertices and edges are distinct.) In the example
graph, (5, 2, 1) is a path of length 2. The closed equivalent to this type
of walk, a walk that starts and ends at the same vertex but otherwise
has no repeated vertices or edges, is called a cycle. Like path, this term
traditionally referred to any closed walk, but now is usually understood
to be simple by definition. In the example graph, (1, 5, 2, 1) is a cycle
of length 3. (A cycle, unlike a path, is not allowed to have length 0.)
Paths and cycles of n vertices are often denoted by Pn and Cn,
respectively. (Some authors use the length instead of the number of
vertices, however.)
C1 is a loop, C2 is a digon (a pair of parallel undirected edges in a multigraph, or a pair of antiparallel edges in a
directed graph), and C3 is called a triangle.
A cycle that has odd length is an odd cycle; otherwise it is an even cycle. One theorem is that a graph is bipartite if
and only if it contains no odd cycles. (See complete bipartite graph.)
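The theorem suggests an algorithm: attempt to 2-color the graph by breadth-first search, and report failure exactly when a coloring conflict exposes an odd cycle. A sketch:

```python
from collections import deque

def is_bipartite(adj):
    """2-color an undirected graph (adjacency dict) by breadth-first
    search; a coloring conflict means an odd cycle was found."""
    color = {}
    for start in adj:
        if start in color:
            continue  # new connected component
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return False  # u and w close an odd cycle
    return True
```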
A graph is acyclic if it contains no cycles; unicyclic if it contains exactly one cycle; and pancyclic if it contains
cycles of every possible length (from 3 to the order of the graph).
The girth of a graph is the length of a shortest (simple) cycle in the graph; and the circumference, the length of a
longest (simple) cycle. The girth and circumference of an acyclic graph are defined to be infinity (∞).
A path or cycle is Hamiltonian (or spanning) if it uses all vertices exactly once. A graph that contains a Hamiltonian
path is traceable; and one that contains a Hamiltonian path for any given pair of (distinct) end vertices is a
Hamiltonian connected graph. A graph that contains a Hamiltonian cycle is a Hamiltonian graph.
A trail or circuit (or cycle) is Eulerian if it uses all edges precisely once. A graph that contains an Eulerian trail is
traversable. A graph that contains an Eulerian circuit is an Eulerian graph.
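By Euler's classical theorem, a connected undirected graph has an Eulerian circuit exactly when every vertex has even degree, and an Eulerian trail exactly when zero or two vertices have odd degree, so both properties reduce to a degree count; a sketch:

```python
def eulerian_status(adj):
    """Classify a connected undirected graph (adjacency dict) by counting
    its odd-degree vertices: 0 odd -> Eulerian circuit, 2 odd -> trail."""
    odd = sum(1 for v in adj if len(adj[v]) % 2 == 1)
    if odd == 0:
        return "circuit"   # Eulerian graph
    if odd == 2:
        return "trail"     # traversable graph
    return "neither"
```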
Two paths are internally disjoint (some people call it independent) if they do not have any vertex in common,
except the first and last ones.
A theta graph is the union of three internally disjoint (simple) paths that have the same two distinct end vertices. A
theta0 graph has seven vertices which can be arranged as the vertices of a regular hexagon plus an additional vertex
in the center. The eight edges are the perimeter of the hexagon plus one diameter.
Trees
A tree is a connected acyclic simple graph. For directed graphs, each
vertex has at most one incoming edge. A vertex of degree 1 is called a
leaf, or pendant vertex. An edge incident to a leaf is a leaf edge, or
pendant edge. (Some people define a leaf edge as a leaf and then
define a leaf vertex on top of it. These two sets of definitions are often
used interchangeably.) A non-leaf vertex is an internal vertex.
Sometimes, one vertex of the tree is distinguished, and called the root;
in this case, the tree is called rooted. Rooted trees are often treated as
directed acyclic graphs with the edges pointing away from the root.
A subtree of the tree T is a connected subgraph of T.
A forest is an acyclic simple graph. For directed graphs, each vertex
has at most one incoming edge. (That is, a tree with the connectivity
requirement removed; a graph containing multiple disconnected trees.)
Cliques
The complete graph Kn of order n is a simple graph with n vertices in
which every vertex is adjacent to every other. The example graph to
the right is complete. The complete graph on n vertices is often
denoted by Kn. It has n(n-1)/2 edges (corresponding to all possible
choices of pairs of vertices).
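The edge count n(n−1)/2 is just the number of 2-element subsets of the vertex set, which a sketch makes concrete:

```python
from itertools import combinations

def complete_graph(n):
    """Edge list of Kn: one edge for every pair of the n vertices 0..n-1."""
    return list(combinations(range(n), 2))
```

For example, `complete_graph(2)` returns `[(0, 1)]`, and K5 has 5·4/2 = 10 edges.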
A clique in a graph is a set of pairwise adjacent vertices. Since any
subgraph induced by a clique is a complete subgraph, the two terms
and their notations are usually used interchangeably. A k-clique is a
clique of order k. In the example graph above, vertices 1, 2 and 5 form
a 3-clique, or a triangle. A maximal clique is a clique that is not a
subset of any other clique (some authors reserve the term clique for
maximal cliques).
The clique number ω(G) of a graph G is the order of a largest clique in G.
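The clique number can be computed directly from the definition by testing vertex subsets in decreasing order of size. A minimal sketch, assuming graphs are stored as a dictionary mapping each vertex to its set of neighbours (the function names `is_clique` and `clique_number` are illustrative, not standard); the running time is exponential, as expected for this problem:

```python
from itertools import combinations

def is_clique(adj, vertices):
    """Check that every pair in `vertices` is adjacent."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))

def clique_number(adj):
    """omega(G): the order of a largest clique, by brute force."""
    for k in range(len(adj), 0, -1):
        if any(is_clique(adj, cand) for cand in combinations(adj, k)):
            return k
    return 0

# A graph consistent with the example described above (vertices 1..6).
edges = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]
adj = {v: set() for v in range(1, 7)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

print(clique_number(adj))  # 3, witnessed by the triangle {1, 2, 5}
```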
Knots
A knot in a directed graph is a collection of vertices and edges with the property that every vertex in the knot has
outgoing edges, and all outgoing edges from vertices in the knot terminate at other vertices in the knot. Thus it is
impossible to leave the knot while following the directions of the edges.
Minors
A minor
of
is an injection from
to
to
operations which collapse a path and all vertices on it into a single edge (see Minor (graph theory)).
13
Embedding
An embedding G₂ of a graph G₁ is an injection from V(G₂) to V(G₁) such that every edge in G₂ corresponds to a path in G₁.
A strongly regular graph is a regular graph such that any adjacent vertices have the same number of common
neighbors as other adjacent pairs and that any nonadjacent vertices have the same number of common neighbors as
other nonadjacent pairs.
Independence
In graph theory, the word independent usually carries the connotation of pairwise disjoint or mutually nonadjacent.
In this sense, independence is a form of immediate nonadjacency. An isolated vertex is a vertex not incident to any
edges. An independent set, or coclique, or stable set or staset, is a set of vertices of which no pair is adjacent. Since
the graph induced by any independent set is an empty graph, the two terms are usually used interchangeably. In the
example above, vertices 1, 3, and 6 form an independent set; and 3, 5, and 6 form another one.
Two subgraphs are edge disjoint if they have no edges in common. Similarly, two subgraphs are vertex disjoint if
they have no vertices (and thus, also no edges) in common. Unless specified otherwise, a set of disjoint subgraphs
are assumed to be pairwise vertex disjoint.
The independence number α(G) of a graph G is the size of the largest independent set of G.
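Like the clique number, the independence number can be computed by brute force straight from the definition. A minimal sketch under the same assumptions as before (adjacency stored as a dictionary of neighbour sets; `independence_number` is an illustrative name):

```python
from itertools import combinations

def independence_number(adj):
    """alpha(G): size of a largest independent set, by brute force."""
    vertices = list(adj)
    for k in range(len(vertices), 0, -1):
        for cand in combinations(vertices, k):
            # Independent: no pair of chosen vertices is adjacent.
            if all(v not in adj[u] for u, v in combinations(cand, 2)):
                return k
    return 0

# A graph consistent with the example described above (vertices 1..6).
edges = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]
adj = {v: set() for v in range(1, 7)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

print(independence_number(adj))  # 3, e.g. the set {1, 3, 6}
```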
A graph can be decomposed into independent sets in the sense that the entire vertex set of the graph can be
partitioned into pairwise disjoint independent subsets. Such independent subsets are called partite sets, or simply
parts.
A graph that can be decomposed into two partite sets is bipartite; into three sets, tripartite; into k sets, k-partite; and into an unknown number of sets, multipartite. A 1-partite graph is the same as an independent set, or an empty graph. A
2-partite graph is the same as a bipartite graph. A graph that can be decomposed into k partite sets is also said to be
k-colourable.
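Whether a graph is bipartite (2-colourable) can be decided in linear time by attempting a 2-colouring with breadth-first search; the attempt fails exactly when the graph contains an odd cycle. A sketch, again assuming an adjacency-dictionary representation (`two_colour` is an illustrative name):

```python
from collections import deque

def two_colour(adj):
    """Return a proper 2-colouring as a dict, or None if not bipartite."""
    colour = {}
    for start in adj:              # handle every connected component
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return None    # odd cycle found: not bipartite
    return colour

# A 4-cycle is bipartite; a triangle is not.
c4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}
k3 = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(two_colour(c4) is not None)  # True
print(two_colour(k3) is None)      # True
```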
A complete multipartite graph is a graph in which vertices are adjacent if and only if they belong to different partite
sets. A complete bipartite graph is also referred to as a biclique; if its partite sets contain n and m vertices,
respectively, then the graph is denoted Kn,m.
A k-partite graph is semiregular if each of its partite sets has a uniform degree; equipartite if each partite set has the
same size; and balanced k-partite if each partite set differs in size by at most 1 with any other.
The matching number α′(G) of a graph G is the size of a largest matching, or pairwise vertex disjoint edges, of G.
A spanning matching, also called a perfect matching, is a matching that covers all vertices of a graph.
Complexity
Complexity of a graph denotes the quantity of information that a graph contains, and can be measured in several ways: for example, by counting the number of its spanning trees, or by the value of a certain formula involving the number of vertices, edges, and proper paths in a graph.[2]
Connectivity
Connectivity extends the concept of adjacency and is essentially a form (and measure) of concatenated adjacency.
If it is possible to establish a path from any vertex to any other vertex of a graph, the graph is said to be connected;
otherwise, the graph is disconnected. A graph is totally disconnected if there is no path connecting any pair of vertices; this is just another name for an empty graph or independent set.
A cut vertex, or articulation point, is a vertex whose removal disconnects the remaining subgraph. A cut set, or
vertex cut or separating set, is a set of vertices whose removal disconnects the remaining subgraph. A bridge is an
analogous edge (see below).
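Cut vertices can be found directly from the definition: remove each vertex in turn and test whether the remaining subgraph stays connected. Linear-time algorithms exist (via depth-first search), but the following sketch stays close to the definition; it assumes an adjacency-dictionary representation, and `cut_vertices` is an illustrative name:

```python
def is_connected(adj, removed=frozenset()):
    """Traverse the graph with the `removed` vertices deleted."""
    remaining = [v for v in adj if v not in removed]
    if not remaining:
        return True
    seen, stack = {remaining[0]}, [remaining[0]]
    while stack:
        for v in adj[stack.pop()]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(remaining)

def cut_vertices(adj):
    """Vertices whose removal disconnects the rest; O(V * (V + E))."""
    return {v for v in adj if not is_connected(adj, frozenset([v]))}

# In the path 1-2-3, removing the middle vertex disconnects the ends.
path = {1: {2}, 2: {1, 3}, 3: {2}}
print(cut_vertices(path))  # {2}
```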
Distance
The distance dG(u, v) between two (not necessarily distinct) vertices u and v in a graph G is the length of a shortest path between them. The subscript G is usually dropped when there is no danger of confusion. When u and v are identical, their distance is 0. When u and v are unreachable from each other, their distance is defined to be infinity.
The eccentricity εG(v) of a vertex v in a graph G is the maximum distance from v to any other vertex. The diameter diam(G) of a graph G is the maximum eccentricity over all vertices in a graph; and the radius rad(G), the minimum. When G is disconnected, diam(G) and rad(G) are defined to be infinity. Trivially, diam(G) ≤ 2 rad(G). Vertices with maximum eccentricity are called peripheral vertices. Vertices of minimum eccentricity form the center. A tree has at most two center vertices.
The Wiener index of a vertex v in a graph G, denoted by WG(v), is the sum of distances between v and all others. The Wiener index of a graph G, denoted by W(G), is the sum of distances over all pairs of vertices. An undirected graph's Wiener polynomial is defined to be the sum of q^d(u,v) over all unordered pairs of vertices u and v. Wiener index and Wiener polynomial are of particular interest to mathematical chemists.
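In an unweighted graph, all of these distance-based quantities follow from a breadth-first search started at each vertex. A sketch for a graph consistent with the example described earlier, assuming an adjacency-dictionary representation (the helper names are illustrative):

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def eccentricity(adj, v):
    """Maximum distance from v to any other vertex."""
    return max(bfs_distances(adj, v).values())

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4},
       4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
ecc = {v: eccentricity(adj, v) for v in adj}
diameter = max(ecc.values())
radius = min(ecc.values())
# Wiener index: each unordered pair is counted twice in the double sum.
wiener = sum(d for v in adj for d in bfs_distances(adj, v).values()) // 2
print(diameter, radius, wiener)  # 3 2 25
```

Note that diam(G) = 3 and rad(G) = 2 here, consistent with diam(G) ≤ 2 rad(G).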
The k-th power Gk of a graph G is a supergraph formed by adding an edge between all pairs of vertices of G with
distance at most k. A second power of a graph is also called a square.
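The k-th power follows the same pattern: run a breadth-first search from each vertex and join it to everything within distance k. A sketch under the same adjacency-dictionary assumption (`graph_power` is an illustrative name):

```python
from collections import deque

def graph_power(adj, k):
    """G^k: join every pair of vertices at distance at most k in G."""
    power = {v: set() for v in adj}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        # Every vertex within distance k becomes a neighbour in G^k.
        for v, d in dist.items():
            if 0 < d <= k:
                power[s].add(v)
    return power

# The square of the path 1-2-3-4 gains the edges {1,3} and {2,4}.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
square = graph_power(path, 2)
print(square[1])  # {2, 3}
```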
A k-spanner is a spanning subgraph S in which every two vertices are at most k times as far apart in S as in G. The number k is the dilation. k-spanners are used for studying geometric network optimization.
Genus
A crossing is a pair of intersecting edges. A graph is embeddable on a surface if its vertices and edges can be
arranged on it without any crossing. The genus of a graph is the lowest genus of any surface on which the graph can
embed.
A planar graph is one which can be drawn on the (Euclidean) plane without any crossing; and a plane graph, one
which is drawn in such fashion. In other words, a planar graph is a graph of genus 0. The example graph is planar;
the complete graph on n vertices, for n > 4, is not planar. Also, a tree is necessarily a planar graph.
When a graph is drawn without any crossing, any cycle that surrounds a region without any edges reaching from the
cycle into the region forms a face. Two faces on a plane graph are adjacent if they share a common edge. A dual, or
planar dual when the context needs to be clarified, G* of a plane graph G is a graph whose vertices represent the
faces, including any outerface, of G and are adjacent in G* if and only if their corresponding faces are adjacent in G.
The dual of a planar graph is always a planar pseudograph (e.g. consider the dual of a triangle). In the familiar case
of a 3-connected simple planar graph G (isomorphic to a convex polyhedron P), the dual G* is also a 3-connected
simple planar graph (and isomorphic to the dual polyhedron P*).
Furthermore, since we can establish a sense of "inside" and "outside" on a plane, we can identify an "outermost"
region that contains the entire graph if the graph does not cover the entire plane. Such outermost region is called an
outer face. An outerplanar graph is one which can be drawn in the planar fashion such that its vertices are all
adjacent to the outer face; and an outerplane graph, one which is drawn in such fashion.
The minimum number of crossings that must appear when a graph is drawn on a plane is called the crossing
number.
The minimum number of planar graphs needed to cover a graph is the thickness of the graph.
Direction
A directed arc, or directed edge, is an ordered pair of endvertices that can be represented graphically as an arrow
drawn between the endvertices. In such an ordered pair the first vertex is called the initial vertex or tail; the second
one is called the terminal vertex or head (because it appears at the arrow head). An undirected edge disregards any
sense of direction and treats both endvertices interchangeably. A loop in a digraph, however, keeps a sense of
direction and treats both head and tail identically. A set of arcs are multiple, or parallel, if they share the same head
and the same tail. A pair of arcs are anti-parallel if one's head/tail is the other's tail/head. A digraph, or directed
graph, or oriented graph, is analogous to an undirected graph except that it contains only arcs. A mixed graph may
contain both directed and undirected edges; it generalizes both directed and undirected graphs. When stated without
any qualification, a graph is almost always assumed to be undirected.
A digraph is called simple if it has no loops and at most one arc between any pair of vertices. When stated without
any qualification, a digraph is usually assumed to be simple. A quiver is a directed graph which is specifically
allowed, but not required, to have loops and more than one arc between any pair of vertices.
In a digraph , we distinguish the out degree d+(v), the number of edges leaving a vertex v, and the in degree
d-(v), the number of edges entering a vertex v. If the graph is oriented, the degree d(v) of a vertex v is equal to the
sum of its out- and in-degrees. When the context is clear, the subscript can be dropped. Maximum and minimum out-degrees are denoted by Δ+(G) and δ+(G); and maximum and minimum in-degrees, Δ−(G) and δ−(G).
An out-neighborhood, or successor set, N+(v) of a vertex v is the set of heads of arcs going from v. Likewise, an
in-neighborhood, or predecessor set, N-(v) of a vertex v is the set of tails of arcs going into v.
A source is a vertex with 0 in-degree; and a sink, 0 out-degree.
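Representing a digraph as a list of arcs (ordered pairs), the out- and in-neighbourhoods, sources, and sinks fall straight out of the definitions. A minimal sketch; the function names are illustrative:

```python
def out_neighbours(arcs, v):
    """N+(v): the heads of arcs leaving v."""
    return {y for x, y in arcs if x == v}

def in_neighbours(arcs, v):
    """N-(v): the tails of arcs entering v."""
    return {x for x, y in arcs if y == v}

arcs = [(1, 2), (1, 3), (2, 3), (3, 4)]
vertices = {1, 2, 3, 4}
sources = {v for v in vertices if not in_neighbours(arcs, v)}   # in-degree 0
sinks = {v for v in vertices if not out_neighbours(arcs, v)}    # out-degree 0
print(sources, sinks)  # {1} {4}
```

The out-degree d+(v) and in-degree d−(v) are just the sizes of these two sets.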
A vertex v dominates another vertex u if there is an arc from v to u. A vertex subset S is out-dominating if every
vertex not in S is dominated by some vertex in S; and in-dominating if every vertex in S is dominated by some
vertex not in S.
A kernel in a directed graph G is an independent set S such that G \ S is an out-dominating set. A digraph is kernel perfect if every induced sub-digraph has a kernel.[3]
An Eulerian digraph is a digraph with equal in- and out-degrees at every vertex.
The zweieck of an undirected edge e = {u, v} is the pair of arcs (u, v) and (v, u), which form a simple dicircuit.
An orientation is an assignment of directions to the edges of an undirected or partially directed graph. When stated
without any qualification, it is usually assumed that all undirected edges are replaced by a directed one in an
orientation. Also, the underlying graph is usually assumed to be undirected and simple.
A tournament is a digraph in which each pair of vertices is connected by exactly one arc. In other words, it is an
oriented complete graph.
A directed path, or just a path when the context is clear, is an oriented simple path such that all arcs go the same
direction, meaning all internal vertices have in- and out-degrees 1. A vertex v is reachable from another vertex u if
there is a directed path that starts from u and ends at v. Note that in general the condition that u is reachable from v
does not imply that v is also reachable from u.
If v is reachable from u, then u is a predecessor of v and v is a successor of u. If there is an arc from u to v, then u is
a direct predecessor of v, and v is a direct successor of u.
A digraph is strongly connected if every vertex is reachable from every other following the directions of the arcs.
On the contrary, a digraph is weakly connected if its underlying undirected graph is connected. A weakly connected
graph can be thought of as a digraph in which every vertex is "reachable" from every other but not necessarily
following the directions of the arcs. A strong orientation is an orientation that produces a strongly connected digraph.
A directed cycle, or just a cycle when the context is clear, is an oriented simple cycle such that all arcs go the same
direction, meaning all vertices have in- and out-degrees 1. A digraph is acyclic if it does not contain any directed
cycle. A finite, acyclic digraph with no isolated vertices necessarily contains at least one source and at least one sink.
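Acyclicity can be tested by repeatedly deleting sources, the idea behind Kahn's topological-sorting algorithm: a finite digraph is acyclic exactly when every vertex can be removed this way. A sketch (`is_acyclic` is an illustrative name):

```python
from collections import deque

def is_acyclic(vertices, arcs):
    """Kahn's algorithm: repeatedly remove sources; the digraph is
    acyclic iff every vertex is eventually removed."""
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in vertices}
    for x, y in arcs:
        succ[x].append(y)
        indeg[y] += 1
    queue = deque(v for v in vertices if indeg[v] == 0)  # the sources
    removed = 0
    while queue:
        u = queue.popleft()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:      # v has become a source
                queue.append(v)
    return removed == len(vertices)

print(is_acyclic({1, 2, 3}, [(1, 2), (2, 3)]))          # True
print(is_acyclic({1, 2, 3}, [(1, 2), (2, 3), (3, 1)]))  # False
```

If the initial queue of sources is empty in a nonempty digraph, a directed cycle must exist, matching the claim that a finite acyclic digraph always has a source.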
An arborescence, or out-tree or branching, is an oriented tree in which all vertices are reachable from a single
vertex. Likewise, an in-tree is an oriented tree in which a single vertex is reachable from every other one.
Colouring
Vertices in graphs can be given colours to identify or label them. Although they may actually be rendered in
diagrams in different colours, working mathematicians generally pencil in numbers or letters (usually numbers) to
represent the colours.
Given a graph G = (V, E), a k-colouring of G is a map φ : V → {1, ..., k} with the property that (u, v) ∈ E ⇒ φ(u) ≠ φ(v); in other words, every vertex is assigned a colour with the condition that adjacent vertices cannot be assigned the same colour.
The chromatic number χ(G) is the smallest k for which G has a k-colouring.
Given a graph and a colouring, the colour classes of the graph are the sets of vertices given the same colour.
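A simple way to produce some proper colouring (not necessarily with χ(G) colours) is the greedy method: visit the vertices in order and give each the smallest colour its neighbours have not used. A sketch, assuming an adjacency-dictionary representation (`greedy_colouring` is an illustrative name):

```python
def greedy_colouring(adj, order=None):
    """Assign each vertex the smallest colour unused by its neighbours.
    Produces a proper colouring, but not necessarily chi(G) colours."""
    colour = {}
    for v in (order or list(adj)):
        taken = {colour[u] for u in adj[v] if u in colour}
        c = 1
        while c in taken:
            c += 1
        colour[v] = c
    return colour

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4},
       4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
colour = greedy_colouring(adj)
# Colour classes: vertices sharing a colour form an independent set.
classes = {}
for v, c in colour.items():
    classes.setdefault(c, set()).add(v)
print(classes)
```

The number of colours used depends on the visiting order; some order always achieves χ(G), but finding it is as hard as colouring optimally.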
Various
A graph invariant is a property of a graph G, usually a number or a polynomial, that depends only on the
isomorphism class of G. Examples are the order, genus, chromatic number, and chromatic polynomial of a graph.
References
[1] Harris, John M. (2000). Combinatorics and Graph Theory (http://www.springer.com/new+&+forthcoming+titles+(default)/book/978-0-387-79710-6). New York: Springer-Verlag. p. 5. ISBN 0-387-98736-3.
[2] Neel, David L. (2006), "The linear complexity of a graph" (http://www.emis.ams.org/journals/EJC/Volume_13/PDF/v13i1r9.pdf), The Electronic Journal of Combinatorics.
[3] Béla Bollobás, Modern Graph Theory, p. 298
Undirected graphs
Further information: Graph theory
In mathematics, a graph is an abstract representation of
a set of objects where some pairs of the objects are
connected by links. The interconnected objects are
represented by mathematical abstractions called
vertices, and the links that connect some pairs of
vertices are called edges.[1] Typically, a graph is
depicted in diagrammatic form as a set of dots for the
vertices, joined by lines or curves for the edges. Graphs
are one of the objects of study in discrete mathematics.
The edges may be directed (asymmetric) or undirected
(symmetric). For example, if the vertices represent
A drawing of a labeled graph on 6 vertices and 7 edges.
people at a party, and there is an edge between two
people if they shake hands, then this is an undirected graph, because if person A shook hands with person B, then
person B also shook hands with person A. On the other hand, if the vertices represent people at a party, and there is
an edge from person A to person B when person A knows of person B, then this graph is directed, because
knowledge of someone is not necessarily a symmetric relation (that is, one person knowing another person does not
necessarily imply the reverse; for example, many fans may know of a celebrity, but the celebrity is unlikely to know
of all their fans). This latter type of graph is called a directed graph and the edges are called directed edges or arcs.
Vertices are also called nodes or points, and edges are also called lines or arcs. Graphs are the basic subject studied
by graph theory. The word "graph" was first used in this sense by J.J. Sylvester in 1878.[2]
Definitions
Definitions in graph theory vary. The following are some of the more basic ways of defining graphs and related
mathematical structures.
Graph
In the most common sense of the term,[3] a graph is an ordered pair G=(V,E)
comprising a set V of vertices or nodes together with a set E of edges or lines,
which are 2-element subsets of V (i.e., an edge is related with two vertices, and the
relation is represented as unordered pair of the vertices with respect to the
particular edge). To avoid ambiguity, this type of graph may be described precisely
as undirected and simple.
Other senses of graph stem from different conceptions of the edge set. In one more
generalized notion,[4] E is a set together with a relation of incidence that associates
with each edge two vertices. In another generalized notion, E is a multiset of
unordered pairs of (not necessarily distinct) vertices. Many authors call this type of
object a multigraph or pseudograph.
All of these variants and others are described more fully below.
The vertices belonging to an edge are called the ends, endpoints, or end vertices of the edge. A vertex may exist in
a graph and not belong to an edge.
V and E are usually taken to be finite, and many of the well-known results are not true (or are rather different) for
infinite graphs because many of the arguments fail in the infinite case. The order of a graph is |V| (the number of vertices). A graph's size is |E|, the number of edges. The degree of a vertex is the number of edges that connect to
it, where an edge that connects to the vertex at both ends (a loop) is counted twice.
For an edge {u,v}, graph theorists usually use the somewhat shorter notation uv.
Adjacency relation
The edges E of an undirected graph G induce a symmetric binary relation ~ on V that is called the adjacency relation
of G. Specifically, for each edge {u,v} the vertices u and v are said to be adjacent to one another, which is denoted
u~v.
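Because each edge is an unordered pair, the induced adjacency relation is automatically symmetric. A small sketch of the formal definition G = (V, E), with edges stored as 2-element frozensets (the helper name `adjacent` is illustrative):

```python
# Vertices and 2-element edge sets, per the formal definition G = (V, E).
V = {1, 2, 3, 4, 5, 6}
E = [frozenset(e) for e in
     [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]]

def adjacent(u, v):
    """The adjacency relation u ~ v induced by the edge set E."""
    return frozenset((u, v)) in E

print(adjacent(1, 2), adjacent(2, 1))  # True True (symmetric by construction)
print(adjacent(1, 3))                  # False
```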
Types of graphs
Distinction in terms of the main definition
As stated above, in different contexts it may be useful to define the term graph with different degrees of generality.
Whenever it is necessary to draw a strict distinction, the following terms are used. Most commonly, in modern texts
in graph theory, unless stated otherwise, graph means "undirected simple finite graph" (see the definitions below).
Undirected graph
An undirected graph is one in which edges have no orientation. The edge (a, b) is identical to the edge (b, a); i.e., they are not ordered pairs, but sets {a, b} (or 2-multisets) of vertices.
Directed graph
A directed graph or digraph is an ordered pair D=(V,A) with
V a set whose elements are called vertices or nodes, and
A a set of ordered pairs of vertices, called arcs, directed edges, or arrows.
A directed graph
A directed graph D is called symmetric if, for every arc in D, the corresponding inverted arc also belongs to D. A
symmetric loopless directed graph D=(V,A) is equivalent to a simple undirected graph G=(V,E), where the pairs
of inverse arcs in A correspond 1-to-1 with the edges in E; thus the edges in G number |E| = |A|/2, or half the number
of arcs in D.
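The correspondence can be sketched by expanding each undirected edge into its pair of inverse arcs, which makes the count |A| = 2|E| visible directly (`to_symmetric_digraph` is an illustrative name):

```python
def to_symmetric_digraph(edges):
    """Replace each undirected edge {u, v} by the two inverse arcs
    (u, v) and (v, u), yielding a symmetric digraph."""
    return [(u, v) for e in edges
            for (u, v) in (tuple(e), tuple(reversed(e)))]

edges = [(1, 2), (2, 3), (1, 3)]   # a triangle
arcs = to_symmetric_digraph(edges)
print(len(arcs) == 2 * len(edges))  # True: |A| = 2|E|
```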
A variation on this definition is the oriented graph, in which not more than one of (x,y) and (y,x) may be arcs.
Mixed graph
A mixed graph G is a graph in which some edges may be directed and some may be undirected. It is written as an
ordered triple G=(V,E,A) with V, E, and A defined as above. Directed and undirected graphs are special cases.
Multigraph
A loop is an edge (directed or undirected) which starts and ends on the same vertex; these may be permitted or not
permitted according to the application. In this context, an edge with two different ends is called a link.
The term "multigraph" is generally understood to mean that multiple edges (and sometimes loops) are allowed.
Where graphs are defined so as to allow loops and multiple edges, a multigraph is often defined to mean a graph
without loops,[5] however, where graphs are defined so as to disallow loops and multiple edges, the term is often
defined to mean a "graph" which can have both multiple edges and loops,[6] although many use the term
"pseudograph" for this meaning.[7]
Quiver
A quiver or "multidigraph" is a directed graph which may have more than one arrow from a given source to a given
target. A quiver may also have directed loops.
Simple graph
As opposed to a multigraph, a simple graph is an undirected graph that has no loops and no more than one edge
between any two different vertices. In a simple graph the edges of the graph form a set (rather than a multiset) and
each edge is a distinct pair of vertices. In a simple graph with n vertices, every vertex has a degree that is less than n (the converse, however, is not true: there exist non-simple graphs with n vertices in which every vertex has a degree smaller than n).
Weighted graph
A graph is a weighted graph if a number (weight) is assigned to each edge.[8] Such weights might represent, for
example, costs, lengths or capacities, etc. depending on the problem at hand. Some authors call such a graph a
network.[9]
Half-edges, loose edges
In exceptional situations it is even necessary to have edges with only one end, called half-edges, or no ends (loose
edges); see for example signed graphs and biased graphs.
Properties of graphs
Two edges of a graph are called adjacent (sometimes coincident) if they share a common vertex. Two arrows of a
directed graph are called consecutive if the head of the first one is at the nock (notch end) of the second one.
Similarly, two vertices are called adjacent if they share a common edge (consecutive if they are at the notch and at
the head of an arrow), in which case the common edge is said to join the two vertices. An edge and a vertex on that
edge are called incident.
The graph with only one vertex and no edges is called the trivial graph. A graph with only vertices and no edges is
known as an edgeless graph. The graph with no vertices and no edges is sometimes called the null graph or empty
graph, but the terminology is not consistent and not all mathematicians allow this object.
In a weighted graph or digraph, each edge is associated with some value, variously called its cost, weight, length or
other term depending on the application; such graphs arise in many contexts, for example in optimal routing
problems such as the traveling salesman problem.
Normally, the vertices of a graph, by their nature as elements of a set, are distinguishable. This kind of graph may be
called vertex-labeled. However, for many questions it is better to treat vertices as indistinguishable; then the graph
may be called unlabeled. (Of course, the vertices may be still distinguishable by the properties of the graph itself,
e.g., by the numbers of incident edges). The same remarks apply to edges, so graphs with labeled edges are called
edge-labeled graphs. Graphs with labels attached to edges or vertices are more generally designated as labeled.
Consequently, graphs in which vertices are indistinguishable and edges are indistinguishable are called unlabeled.
(Note that in the literature the term labeled may apply to other kinds of labeling, besides that which serves only to distinguish different vertices or edges.)
Examples
The diagram at right is a graphic representation of the following
graph:
V = {1, 2, 3, 4, 5, 6}
E = {{1, 2}, {1, 5}, {2, 3}, {2, 5}, {3, 4}, {4, 5}, {4, 6}}.
In category theory a small category can be represented by a directed multigraph in which the objects of the category are represented as vertices and the morphisms as directed edges. Then, the functors between categories induce some, but not necessarily all, of the digraph morphisms of the graph.
In computer science, directed graphs are used to represent knowledge (e.g., Conceptual graph), finite state
machines, and many other discrete structures.
A binary relation R on a set X defines a directed graph. An element x of X is a direct predecessor of an element y
of X iff xRy.
Important graphs
Basic examples are:
In a complete graph, each pair of vertices is joined by an edge; that is, the graph contains all possible edges.
In a bipartite graph, the vertex set can be partitioned into two sets, W and X, so that no two vertices in W are
adjacent and no two vertices in X are adjacent. Alternatively, it is a graph with a chromatic number of 2.
In a complete bipartite graph, the vertex set is the union of two disjoint sets, W and X, so that every vertex in W is
adjacent to every vertex in X but there are no edges within W or X.
In a linear graph or path graph of length n, the vertices can be listed in order, v0, v1, ..., vn, so that the edges are vi−1vi for each i = 1, 2, ..., n. If a linear graph occurs as a subgraph of another graph, it is a path in that graph.
In a cycle graph of length n ≥ 3, vertices can be named v1, ..., vn so that the edges are vi−1vi for each i = 2, ..., n, in addition to vnv1. Cycle graphs can be characterized as connected 2-regular graphs. If a cycle graph occurs as a subgraph of another graph, it is a cycle or circuit in that graph.
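The characterization of cycle graphs as connected 2-regular graphs translates directly into a test: check that every vertex has degree 2, then check connectivity. A sketch, assuming an adjacency-dictionary representation (`is_cycle_graph` is an illustrative name):

```python
def is_cycle_graph(adj):
    """Connected and 2-regular characterises cycle graphs (n >= 3)."""
    if len(adj) < 3 or any(len(nbrs) != 2 for nbrs in adj.values()):
        return False
    # Connectivity check by depth-first traversal.
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)

c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}  # one 5-cycle
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
                 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}       # 2-regular, disconnected
print(is_cycle_graph(c5), is_cycle_graph(two_triangles))  # True False
```

The second example shows why connectivity must be checked: a disjoint union of cycles is 2-regular but is not a single cycle graph.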
A planar graph is a graph whose vertices and edges can be drawn in a plane such that no two of the edges
intersect (i.e., embedded in a plane).
A tree is a connected graph with no cycles.
A forest is a graph with no cycles (i.e. the disjoint union of one or more trees).
More advanced kinds of graphs are:
Operations on graphs
There are several operations that produce new graphs from old ones, which might be classified into the following
categories:
Elementary operations, sometimes called "editing operations" on graphs, which create a new graph from the
original one by a simple, local change, such as addition or deletion of a vertex or an edge, merging and splitting of
vertices, etc.
Graph rewrite operations replacing the occurrence of some pattern graph within the host graph by an instance of
the corresponding replacement graph.
Unary operations, which create a significantly new graph from the old one. Examples:
Line graph
Dual graph
Complement graph
Binary operations, which create new graph from two initial graphs. Examples:
Disjoint union of graphs
Cartesian product of graphs
Tensor product of graphs
Strong product of graphs
Lexicographic product of graphs
Generalizations
In a hypergraph, an edge can join more than two vertices.
An undirected graph can be seen as a simplicial complex consisting of 1-simplices (the edges) and 0-simplices (the
vertices). As such, complexes are generalizations of graphs since they allow for higher-dimensional simplices.
Every graph gives rise to a matroid.
In model theory, a graph is just a structure. But in that case, there is no limitation on the number of edges: it can be
any cardinal number, see continuous graph.
In computational biology, power graph analysis introduces power graphs as an alternative representation of
undirected graphs.
In geographic information systems, geometric networks are closely modeled after graphs, and borrow many concepts
from graph theory to perform spatial analysis on road networks or utility grids.
Notes
[1] Trudeau, Richard J. (1993). Introduction to Graph Theory (http://store.doverpublications.com/0486678709.html) (Corrected, enlarged republication ed.). New York: Dover Publications. p. 19. ISBN 978-0-486-67870-2. Retrieved 8 August 2012. "A graph is an object consisting of two sets called its vertex set and its edge set."
[2] Gross, Jonathan L.; Yellen, Jay (2004). Handbook of graph theory (http://books.google.com/?id=mKkIGIea_BkC). CRC Press. p. 35. ISBN 978-1-58488-090-5.
[3] See, for instance, Iyanaga and Kawada, 69 J, p. 234 or Biggs, p. 4.
[4] See, for instance, Graham et al., p. 5.
[5] For example, see Balakrishnan, p. 1, Gross (2003), p. 4, and Zwillinger, p. 220.
[6] For example, see Bollobás, p. 7 and Diestel, p. 25.
[7] Gross (1998), p. 3, Gross (2003), p. 205, Harary, p. 10, and Zwillinger, p. 220.
[8] Fletcher, Peter; Hoyle, Hughes; Patty, C. Wayne (1991). Foundations of Discrete Mathematics (International student ed.). Boston: PWS-KENT Pub. Co. p. 463. ISBN 0-53492-373-9. "A weighted graph is a graph in which a number w(e), called its weight, is assigned to each edge e."
[9] Strang, Gilbert (2005), Linear Algebra and Its Applications (http://books.google.com/books?vid=ISBN0030105676) (4th ed.), Brooks Cole, ISBN 0-03-010567-6
24
Undirected graphs
References
Balakrishnan, V. K. (1997). Graph Theory (1st ed.). McGraw-Hill. ISBN 0-07-005489-4.
Berge, Claude (1958) (in French). Théorie des graphes et ses applications. Dunod, Paris: Collection Universitaire de Mathématiques, II. pp. viii+277. Translation: Dover, New York: Wiley. 2001 [1962].
Biggs, Norman (1993). Algebraic Graph Theory (2nd ed.). Cambridge University Press. ISBN 0-521-45897-8.
Bollobás, Béla (2002). Modern Graph Theory (1st ed.). Springer. ISBN 0-387-98488-7.
Bang-Jensen, J.; Gutin, G. (2000). Digraphs: Theory, Algorithms and Applications (http://www.cs.rhul.ac.uk/books/dbook/). Springer.
Diestel, Reinhard (2005). Graph Theory (http://diestel-graph-theory.com/GrTh.html) (3rd ed.). Berlin, New York: Springer-Verlag. ISBN 978-3-540-26183-4.
Graham, R. L.; Grötschel, M.; Lovász, L., ed. (1995). Handbook of Combinatorics. MIT Press. ISBN 0-262-07169-X.
Gross, Jonathan L.; Yellen, Jay (1998). Graph Theory and Its Applications. CRC Press. ISBN 0-8493-3982-0.
Gross, Jonathan L.; Yellen, Jay, ed. (2003). Handbook of Graph Theory. CRC. ISBN 1-58488-090-2.
Harary, Frank (1995). Graph Theory. Addison Wesley Publishing Company. ISBN 0-201-41033-8.
Iyanaga, Shôkichi; Kawada, Yukiyosi (1977). Encyclopedic Dictionary of Mathematics. MIT Press. ISBN 0-262-09016-3.
Zwillinger, Daniel (2002). CRC Standard Mathematical Tables and Formulae (31st ed.). Chapman & Hall/CRC. ISBN 1-58488-291-3.
Further reading
Trudeau, Richard J. (1993). Introduction to Graph Theory (http://store.doverpublications.com/0486678709.html) (Corrected, enlarged republication ed.). New York: Dover Publications. ISBN 978-0-486-67870-2. Retrieved 8 August 2012.
External links
A searchable database of small connected graphs (http://www.gfredericks.com/main/sandbox/graphs)
VisualComplexity.com (http://www.visualcomplexity.com): a visual exploration of mapping complex networks
PulseView Data-To-Graph Program (http://www.quarktet.com/PulseView.html): an efficient data plotting and analysis program
Weisstein, Eric W., "Graph (http://mathworld.wolfram.com/Graph.html)" from MathWorld
Intelligent Graph Visualizer (https://sourceforge.net/projects/igv-intelligent/): IGV creates and edits graphs, places them automatically, and finds shortest paths (with vertex colouring), center, degree, eccentricity, etc.
Visual Graph Editor 2 (http://code.google.com/p/vge2/): VGE2 is designed for quick and easy creation, editing and saving of graphs, and for analysis of problems connected with graphs
GraphsJ (http://gianlucacosta.info/software/graphsj/): an open source didactic Java application with an easy-to-use GUI that interactively solves many graph problems step by step; extensible via its Java SDK
GraphClasses (http://graphclasses.org): Information System on Graph Classes and their Inclusions
Directed graphs
In mathematics, a directed graph or digraph is a graph, or set of nodes connected
by edges, where the edges have a direction associated with them. In formal terms a
digraph is a pair G = (V, A) (sometimes G = (V, E)) of:[1]
a set V, whose elements are called vertices or nodes,
a set A of ordered pairs of vertices, called arcs, directed edges, or arrows (and
sometimes simply edges with the corresponding set named E instead of A).
It differs from an ordinary or undirected graph, in that the latter is defined in terms
of unordered pairs of vertices, which are usually called edges.
A directed graph.
Sometimes a digraph is called a simple digraph to distinguish it from a directed multigraph, in which the arcs
constitute a multiset, rather than a set, of ordered pairs of vertices. Also, in a simple digraph loops are disallowed. (A
loop is an arc that pairs a vertex to itself.) On the other hand, some texts allow loops, multiple arcs, or both in a
digraph.
Basic terminology
An arc a = (x, y) is considered to be directed from x to y; y is called the head and x is called the tail of the
arc; y is said to be a direct successor of x, and x is said to be a direct predecessor of y. If a path made up of
one or more successive arcs leads from x to y, then y is said to be a successor of x and reachable from x, and
x is said to be a predecessor of y. The arc (y, x) is called the arc (x, y) inverted.
The number of arcs directed into a vertex is called its indegree, and the number of arcs directed out of it is its outdegree.
A digraph with vertices labeled (indegree,
outdegree)
Digraph connectivity
A digraph G is called weakly connected (or just connected[4]) if the undirected underlying graph obtained by
replacing all directed edges of G with undirected edges is a connected graph. A digraph is strongly connected or
strong if it contains a directed path from u to v and a directed path from v to u for every pair of vertices u,v. The
strong components are the maximal strongly connected subgraphs.
Classes of digraphs
A directed acyclic graph or acyclic digraph is a directed graph with no directed cycles.
Special cases of directed acyclic graphs include the multitrees (graphs in which no two
directed paths from a single starting node meet back at the same ending node), oriented
trees or polytrees (the digraphs formed by orienting the edges of undirected acyclic
graphs), and the rooted trees (oriented trees in which all edges of the underlying
undirected tree are directed away from the root).
A tournament on 4 vertices
Notes
[1] Bang-Jensen & Gutin (2000). Diestel (2005), Section 1.10. Bondy & Murty (1976), Section 10.
[2] Diestel (2005), Section 1.10.
[3] Satyanarayana, Bhavanari; Prasad, Kuncham Syam, Discrete Mathematics and Graph Theory, PHI Learning Pvt. Ltd., p. 460,
ISBN 978-81-203-3842-5; Brualdi, Richard A. (2006), Combinatorial matrix classes, Encyclopedia of mathematics and its applications, 108,
Cambridge University Press, p. 51, ISBN 978-0-521-86565-4.
[4] Bang-Jensen & Gutin (2000) p. 19 in the 2007 edition; p. 20 in the 2nd edition (2009).
References
Bang-Jensen, Jørgen; Gutin, Gregory (2000), Digraphs: Theory, Algorithms and Applications (http://www.cs.rhul.ac.uk/books/dbook/), Springer, ISBN 1-85233-268-9
(the corrected 1st edition of 2007 is now freely available on the authors' site; the 2nd edition appeared in 2009,
ISBN 1-84800-997-6).
Bondy, John Adrian; Murty, U. S. R. (1976), Graph Theory with Applications (http://www.ecp6.jussieu.fr/pageperso/bondy/books/gtwa/gtwa.html), North-Holland, ISBN 0-444-19451-7.
Diestel, Reinhard (2005), Graph Theory (http://www.math.uni-hamburg.de/home/diestel/books/graph.theory/) (3rd ed.), Springer, ISBN 3-540-26182-6 (the electronic 3rd edition is freely available on the author's site).
Harary, Frank; Norman, Robert Z.; Cartwright, Dorwin (1965), Structural Models: An Introduction to the Theory
of Directed Graphs, New York: Wiley.
Number of directed graphs (or digraphs) with n nodes. (http://oeis.org/A000273)
sequence with one fewer element; the tree formed in this way for a set of strings is called a trie. A directed acyclic
word graph saves space over a trie by allowing paths to diverge and rejoin, so that a set of words with the same
possible suffixes can be represented by a single tree node.
The same idea of using a DAG to represent a family of paths occurs in the binary decision diagram,[4][5] a
DAG-based data structure for representing binary functions. In a binary decision diagram, each non-sink vertex is
labeled by the name of a binary variable, and each sink and each edge is labeled by a 0 or 1. The function value for
any truth assignment to the variables is the value at the sink found by following a path, starting from the single
source vertex, that at each non-sink vertex follows the outgoing edge labeled with the value of that vertex's variable.
Just as directed acyclic word graphs can be viewed as a compressed form of tries, binary decision diagrams can be
viewed as compressed forms of decision trees that save space by allowing paths to rejoin when they agree on the
results of all remaining decisions.
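The path-following evaluation described above can be sketched in a few lines of Python. The encoding here is an illustrative assumption, not a standard file format: a dict maps each non-sink node id to a (variable, low-child, high-child) triple, and the integers 0 and 1 serve as sinks.

```python
def evaluate_bdd(nodes, root, assignment):
    """Evaluate a binary decision diagram by following a path from the root.

    nodes maps a node id to (variable name, low child, high child);
    the sinks are the plain integers 0 and 1.
    """
    v = root
    while v not in (0, 1):            # keep walking until a sink is reached
        var, low, high = nodes[v]
        v = high if assignment[var] else low   # follow the edge labeled with the variable's value
    return v

# A tiny BDD for the function x AND y: the root tests x;
# if x = 0 we fall straight to sink 0, otherwise we test y.
bdd = {'n_x': ('x', 0, 'n_y'), 'n_y': ('y', 0, 1)}
print(evaluate_bdd(bdd, 'n_x', {'x': 1, 'y': 1}))  # 1
print(evaluate_bdd(bdd, 'n_x', {'x': 0, 'y': 1}))  # 0
```

Note how the node for "x = 1, y tested" is shared by every path with x = 1, which is exactly the path-rejoining that makes the DAG smaller than the corresponding decision tree.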
In many randomized algorithms in computational geometry, the algorithm maintains a history DAG representing
features of some geometric construction that have been replaced by later finer-scale features; point location queries
may be answered, as for the above two data structures, by following paths in this DAG.
Enumeration
The graph enumeration problem of counting directed acyclic graphs was studied by Robinson (1973).[8] The number
of DAGs on n labeled nodes, for n = 1, 2, 3, ..., is
1, 3, 25, 543, 29281, 3781503, ... (sequence A003024 in OEIS).
These numbers may be computed by the recurrence relation
a_n = Σ_{k=1}^{n} (−1)^{k+1} C(n, k) 2^{k(n−k)} a_{n−k},  with a_0 = 1.[8]
Eric W. Weisstein conjectured,[9] and McKay et al. (2004) proved,[10] that the same numbers count the (0,1) matrices
in which all eigenvalues are positive real numbers. The proof is bijective: a matrix A is an adjacency matrix of a
DAG if and only if the eigenvalues of the (0,1) matrix A+I are positive, where I denotes the identity matrix.
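Robinson's recurrence for the number of labeled DAGs is easy to check numerically; the following short sketch computes the sequence directly from it.

```python
from math import comb

def count_dags(n_max):
    """Number of DAGs on n labeled nodes for n = 1..n_max, via Robinson's recurrence.

    a_n = sum_{k=1}^{n} (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a_(n-k), with a_0 = 1.
    (Here k counts the sources removed at each step; inclusion-exclusion
    corrects for overcounting.)
    """
    a = [1]  # a[0] = 1: the empty graph
    for n in range(1, n_max + 1):
        a.append(sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * a[n - k]
                     for k in range(1, n + 1)))
    return a[1:]

print(count_dags(6))  # [1, 3, 25, 543, 29281, 3781503]
```

The output reproduces the first terms of sequence A003024 quoted above.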
References
[1] Christofides, Nicos (1975), Graph theory: an algorithmic approach, Academic Press, pp. 170–174.
[2] Thulasiraman, K.; Swamy, M. N. S. (1992), "5.7 Acyclic Directed Graphs", Graphs: Theory and Algorithms, John Wiley and Son, p. 118,
ISBN 978-0-471-51356-8.
[3] Bang-Jensen, Jørgen (2008), "2.1 Acyclic Digraphs", Digraphs: Theory, Algorithms and Applications, Springer Monographs in Mathematics
(2nd ed.), Springer-Verlag, pp. 32–34, ISBN 978-1-84800-997-4.
[4] Lee, C. Y. (1959), "Representation of switching circuits by binary-decision programs", Bell Systems Technical Journal 38: 985–999.
[5] Akers, Sheldon B. (1978), "Binary decision diagrams", IEEE Transactions on Computers C-27 (6): 509–516, doi:10.1109/TC.1978.1675141.
[6] Stanley, Richard P. (1973), "Acyclic orientations of graphs", Discrete Mathematics 5 (2): 171–178, doi:10.1016/0012-365X(73)90108-8.
[7] Harary, Frank; Norman, Robert Z.; Cartwright, Dorwin (1965), Structural Models: An Introduction to the Theory of Directed Graphs, John
Wiley & Sons, p. 63.
[8] Robinson, R. W. (1973), "Counting labeled acyclic digraphs", in Harary, F., New Directions in the Theory of Graphs, Academic Press,
pp. 239–273. See also Harary, Frank; Palmer, Edgar M. (1973), Graphical Enumeration, Academic Press, p. 19, ISBN 0-12-324245-2.
[9] Weisstein, Eric W., "Weisstein's Conjecture (http://mathworld.wolfram.com/WeissteinsConjecture.html)" from MathWorld.
[10] McKay, B. D.; Royle, G. F.; Wanless, I. M.; Oggier, F. E.; Sloane, N. J. A.; Wilf, H. (2004), "Acyclic digraphs and eigenvalues of
(0,1)-matrices" (http://www.cs.uwaterloo.ca/journals/JIS/VOL7/Sloane/sloane15.html), Journal of Integer Sequences 7, Article 04.3.3.
External links
Weisstein, Eric W., " Acyclic Digraph (http://mathworld.wolfram.com/AcyclicDigraph.html)" from
MathWorld.
Algorithms
Graph algorithms are a significant field of interest within computer science. Typical higher-level operations
associated with graphs include finding a path between two nodes (as in depth-first search and breadth-first search)
and finding the shortest path from one node to another (as in Dijkstra's algorithm). The Floyd–Warshall algorithm
solves the related problem of finding shortest paths from every node to every other node.
A directed graph can be seen as a flow network, where each edge has a capacity and each edge receives a flow. The
Ford–Fulkerson algorithm can be used to find the maximum flow from a source to a sink in such a network.
Operations
The basic operations provided by a graph data structure G usually include:
adjacent(G, x, y): tests whether there is an edge from vertex x to vertex y;
neighbors(G, x): lists all vertices y such that there is an edge from x to y;
add_vertex(G, x) and remove_vertex(G, x): add or remove the vertex x;
add_edge(G, x, y) and remove_edge(G, x, y): add or remove the edge from x to y;
operations that get and set any values associated with a vertex or an edge.
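Such an interface can be sketched as a small adjacency-list-backed class; the method names and the choice of Python here are illustrative, not prescribed by the text.

```python
class Graph:
    """A minimal undirected graph supporting the usual basic operations,
    backed by a dict mapping each vertex to a set of neighbors."""

    def __init__(self):
        self._adj = {}

    def add_vertex(self, v):
        self._adj.setdefault(v, set())

    def remove_vertex(self, v):
        # Remove v and every edge incident to it.
        for u in self._adj.pop(v, set()):
            self._adj[u].discard(v)

    def add_edge(self, u, v):
        self.add_vertex(u); self.add_vertex(v)
        self._adj[u].add(v); self._adj[v].add(u)

    def remove_edge(self, u, v):
        self._adj[u].discard(v); self._adj[v].discard(u)

    def adjacent(self, u, v):
        return v in self._adj.get(u, set())

    def neighbors(self, v):
        return iter(self._adj.get(v, ()))

g = Graph()
g.add_edge('a', 'b'); g.add_edge('a', 'c')
print(g.adjacent('a', 'b'))  # True
```

A directed variant would simply drop the symmetric updates in add_edge and remove_edge.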
Representations
Different data structures for the representation of graphs are used in practice:
Adjacency list
Vertices are stored as records or objects, and every vertex stores a list of adjacent vertices. This data structure
allows the storage of additional data on the vertices.
Incidence list
Vertices and edges are stored as records or objects. Each vertex stores its incident edges, and each edge stores
its incident vertices. This data structure allows the storage of additional data on vertices and edges.
Adjacency matrix
A two-dimensional matrix, in which the rows represent source vertices and columns represent destination
vertices. Data on edges and vertices must be stored externally. Only the cost for one edge can be stored
between each pair of vertices.
Incidence matrix
A two-dimensional Boolean matrix, in which the rows represent the vertices and columns represent the edges.
The entries indicate whether the vertex at a row is incident to the edge at a column.
The following table gives the time complexity of performing various operations on graphs, for each of these
representations. In the matrix representations, the entries encode the cost of following an edge; the cost of an edge
that is not present is assumed to be ∞.
                 Adjacency list   Incidence list   Adjacency matrix   Incidence matrix
Storage          O(|V| + |E|)     O(|V| + |E|)     O(|V|^2)           O(|V| · |E|)
Add vertex       O(1)             O(1)             O(|V|^2)           O(|V| · |E|)
Add edge         O(1)             O(1)             O(1)               O(|V| · |E|)
Remove vertex    O(|E|)           O(|E|)           O(|V|^2)           O(|V| · |E|)
Remove edge      O(|V|)           O(|E|)           O(1)               O(|V| · |E|)
Adjacency query  O(|V|)           O(|E|)           O(1)               O(|E|)
(Query: are vertices u, v adjacent? Assuming that the storage positions for u, v are known.)
Remarks
Adjacency lists are generally preferred because they represent sparse graphs efficiently. An adjacency matrix is
preferred if the graph is dense, that is, if the number of edges |E| is close to the number of vertices squared, |V|^2, or
if one must be able to look up quickly whether two vertices are connected by an edge.[1]
Types
Skip graphs
References
[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). Introduction to Algorithms (2nd ed.). MIT Press and
McGraw-Hill. ISBN 0-262-53196-8.
External links
Boost Graph Library: a powerful C++ graph library (http://www.boost.org/libs/graph)
PulseView Graphing: an efficient Data-To-Graph program (http://www.quarktet.com/PulseView.html)
Adjacency list
In graph theory, an adjacency list is the representation of all edges or arcs in a
graph as a list.
If the graph is undirected, every entry is a set (or multiset) of two nodes containing
the two ends of the corresponding edge; if it is directed, every entry is a tuple of two
nodes, one denoting the source node and the other denoting the destination node of
the corresponding arc.
Typically, adjacency lists are unordered.
This undirected cyclic graph can
be described by the list {a,b},
{a,c}, {b,c}.
a adjacent to b, c
b adjacent to a, c
c adjacent to a, b
In computer science, an adjacency list is a data structure for representing graphs. In an adjacency list representation,
we keep, for each vertex in the graph, a list of all other vertices which it has an edge to (that vertex's "adjacency
list"). For instance, the representation suggested by van Rossum, in which a hash table is used to associate each
vertex with an array of adjacent vertices, can be seen as an example of this type of representation. Another example
is the representation in Cormen et al. in which an array indexed by vertex numbers points to a singly linked list of the
neighbors of each vertex.
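The hash-table variant can be sketched concretely (in the spirit of van Rossum's pattern, though the details here are illustrative): a Python dict maps each vertex to the list of its adjacent vertices.

```python
# Adjacency lists for the triangle graph {a,b}, {a,c}, {b,c} pictured above.
graph = {
    'a': ['b', 'c'],
    'b': ['a', 'c'],
    'c': ['a', 'b'],
}

# Enumerating a vertex's neighbors is a single hash-table lookup:
print(graph['a'])         # ['b', 'c']

# A neighbor test scans one adjacency list:
print('c' in graph['b'])  # True
```

In the Cormen et al. style, the dict would instead be an array indexed by vertex number, with each cell pointing to a linked list of neighbors.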
One difficulty with the adjacency list structure is that it has no obvious place to store data associated with the edges
of a graph, such as the lengths or costs of the edges. To remedy this, some texts, such as that of Goodrich and
Tamassia, advocate a more object oriented variant of the adjacency list structure, sometimes called an incidence list,
which stores for each vertex a list of objects representing the edges incident to that vertex. To complete the structure,
each edge must point back to the two vertices forming its endpoints. The extra edge objects in this version of the
adjacency list cause it to use more memory than the version in which adjacent vertices are listed directly, but these
extra edges are also a convenient location to store additional information about each edge (e.g. their length).
Trade-offs
The main alternative to the adjacency list is the adjacency matrix. For a graph with a sparse adjacency matrix an
adjacency list representation of the graph occupies less space, because it does not use any space to represent edges
that are not present. Using a naive array implementation of adjacency lists on a 32-bit computer, an adjacency list for
an undirected graph requires about 8e bytes of storage, where e is the number of edges: each edge gives rise to
entries in the two adjacency lists and uses four bytes in each.
On the other hand, because each entry in an adjacency matrix requires only one bit, the matrix can be represented in a
very compact way, occupying only n^2/8 bytes of contiguous space, where n is the number of vertices. Besides just
avoiding wasted space, this compactness encourages locality of reference.
Noting that a graph can have at most n^2 edges (allowing loops), we can let d = e/n^2 denote the density of the graph.
Then, if 8e > n^2/8, the adjacency list representation occupies more space, which is true when d > 1/64. Thus a graph
must be sparse for an adjacency list representation to be more memory efficient than an adjacency matrix. However,
this analysis is valid only when the representation is intended to store the connectivity structure of the graph without
any numerical information about its edges.
Besides the space trade-off, the different data structures also facilitate different operations. It is easy to find all
vertices adjacent to a given vertex in an adjacency list representation; you simply read its adjacency list. With an
adjacency matrix you must instead scan over an entire row, taking O(n) time. If you, instead, want to perform a
neighbor test on two vertices (i.e., determine if they have an edge between them), an adjacency matrix provides this
at once. However, this neighbor test in an adjacency list requires time proportional to the number of edges associated
with the two vertices.
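The two neighbor tests contrasted above can be put side by side in code (an illustrative sketch):

```python
def adjacent_in_list(adj_list, u, v):
    # Time proportional to the length of u's adjacency list (its degree).
    return v in adj_list[u]

def adjacent_in_matrix(adj_matrix, u, v):
    # A single array access, independent of the graph's size.
    return adj_matrix[u][v] == 1

# The same 3-vertex graph (edges 0-1 and 0-2) in both representations:
adj_list = {0: [1, 2], 1: [0], 2: [0]}
adj_matrix = [[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]]
print(adjacent_in_list(adj_list, 0, 2))    # True
print(adjacent_in_matrix(adj_matrix, 1, 2))  # False
```

The asymptotic difference only matters for high-degree vertices; for sparse graphs the list scan is short in practice.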
References
Joe Celko (2004). Trees and Hierarchies in SQL for Smarties. Morgan Kaufmann. Excerpt from Chapter 2:
"Adjacency List Model" [1]. ISBN 1-55860-920-2.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein (2001). Introduction to
Algorithms, Second Edition. MIT Press and McGraw-Hill. pp. 527–529 of Section 22.1: Representations of
graphs. ISBN 0-262-03293-7.
David Eppstein (1996). "ICS 161 Lecture Notes: Graph Algorithms" [2].
Michael T. Goodrich and Roberto Tamassia (2002). Algorithm Design: Foundations, Analysis, and Internet
Examples. John Wiley & Sons. ISBN 0-471-38365-1.
Guido van Rossum (1998). "Python Patterns — Implementing Graphs" [3].
External links
The Boost Graph Library implements an efficient adjacency list [4]
Open Data Structures - Section 12.2 - AdjacencyList: A Graph as a Collection of Lists [5]
Adjacency matrix
In mathematics and computer science, an adjacency matrix is a means of representing which vertices (or nodes) of
a graph are adjacent to which other vertices. Another matrix representation for a graph is the incidence matrix.
Specifically, the adjacency matrix of a finite graph G on n vertices is the n × n matrix where the non-diagonal entry
a_ij is the number of edges from vertex i to vertex j, and the diagonal entry a_ii, depending on the convention, is either
once or twice the number of edges (loops) from vertex i to itself. Undirected graphs often use the latter convention of
counting loops twice, whereas directed graphs typically use the former convention. There exists a unique adjacency
matrix for each isomorphism class of graphs (up to permuting rows and columns), and it is not the adjacency matrix
of any other isomorphism class of graphs. In the special case of a finite simple graph, the adjacency matrix is a
(0,1)-matrix with zeros on its diagonal. If the graph is undirected, the adjacency matrix is symmetric.
The relationship between a graph and the eigenvalues and eigenvectors of its adjacency matrix is studied in spectral
graph theory.
Examples
The convention followed here is that an adjacent edge counts 1 in the matrix for an undirected graph.
The adjacency matrix of a complete graph is all 1's except for 0's on the diagonal.
The adjacency matrix of an empty graph is a zero matrix.
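As a small sketch, an adjacency matrix can be built directly from an edge list; the numbering of vertices as 0..n−1 is an assumption of this example.

```python
def adjacency_matrix(n, edges):
    """Adjacency matrix of a simple undirected graph on vertices 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1   # symmetric, since the graph is undirected
    return a

# The complete graph K3: all 1's except 0's on the diagonal.
for row in adjacency_matrix(3, [(0, 1), (0, 2), (1, 2)]):
    print(row)
# [0, 1, 1]
# [1, 0, 1]
# [1, 1, 0]
```

Passing an empty edge list produces the zero matrix of the empty graph, matching the second example above.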
The adjacency matrix A of a bipartite graph whose parts have r and s vertices can be written in the block form
A = ( 0  B ; B^T  0 ),
where B is an r × s matrix and each 0 is an all-zero matrix. Clearly, the matrix B uniquely represents the bipartite
graph. It is sometimes called the biadjacency matrix. Formally, let G = (U, V, E) be a bipartite graph with parts
U = {u_1, ..., u_r} and V = {v_1, ..., v_s}. The biadjacency matrix is the r × s 0–1 matrix B in which b_{i,j} = 1
iff (u_i, v_j) ∈ E.
If G is a bipartite multigraph or weighted graph then the elements b_{i,j} are taken to be the number of edges between
the vertices or the weight of the edge (u_i, v_j), respectively.
Properties
The adjacency matrix of an undirected simple graph is symmetric, and therefore has a complete set of real
eigenvalues and an orthogonal eigenvector basis. The set of eigenvalues of a graph is the spectrum of the graph.
Suppose two directed or undirected graphs G_1 and G_2 with adjacency matrices A_1 and A_2 are given. G_1 and G_2
are isomorphic if and only if there exists a permutation matrix P such that
P A_1 P^{-1} = A_2.
In particular, A_1 and A_2 are then similar and therefore have the same minimal polynomial, characteristic polynomial,
eigenvalues, determinant and trace. These can therefore serve as isomorphism invariants of graphs. However, two
graphs may possess the same set of eigenvalues but not be isomorphic.[1]
If A is the adjacency matrix of the directed or undirected graph G, then the matrix A^n (i.e., the matrix product of n
copies of A) has an interesting interpretation: the entry in row i and column j gives the number of (directed or
undirected) walks of length n from vertex i to vertex j. This implies, for example, that the number of triangles in an
undirected graph G is exactly the trace of A^3 divided by 6.
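The trace(A^3)/6 triangle count can be checked with a short pure-Python sketch (a naive O(n^3) matrix product, chosen for self-containedness rather than speed):

```python
def count_triangles(adj):
    """Count triangles in an undirected graph as trace(A^3) / 6."""
    n = len(adj)

    def matmul(x, y):
        return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    a3 = matmul(matmul(adj, adj), adj)
    # Each triangle contributes 6 closed walks of length 3 (3 starting
    # vertices times 2 directions), hence the division by 6.
    return sum(a3[i][i] for i in range(n)) // 6

# The triangle graph K3 contains exactly one triangle:
k3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(count_triangles(k3))  # 1
```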
The main diagonal of every adjacency matrix corresponding to a graph without loops has all zero entries. Note that
here 'loops' means, for example A->A, not 'cycles' such as A->B->A.
For a d-regular graph, d is an eigenvalue of A for the eigenvector v = (1, ..., 1), and G
is connected if and only if the eigenvalue d has multiplicity one.
Variations
An (a, b, c)-adjacency matrix A of a simple graph has A_ij = a if ij is an edge, b if it is not, and c on the diagonal. The
Seidel adjacency matrix is a (−1, 1, 0)-adjacency matrix. This matrix is used in studying strongly regular graphs and
two-graphs.[2]
The distance matrix has in position (i, j) the distance between vertices v_i and v_j. The distance is the length of a
shortest path connecting the vertices. Unless lengths of edges are explicitly provided, the length of a path is the
number of edges in it. The distance matrix resembles a high power of the adjacency matrix, but instead of telling
only whether or not two vertices are connected (i.e., the connection matrix, which contains boolean values), it gives
the exact distance between them.
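When every edge counts as length one, the distance matrix can be computed by a breadth-first search from each vertex; the sketch below is illustrative rather than optimized.

```python
from collections import deque

def distance_matrix(adj):
    """All-pairs distances (in edge counts) via BFS from each vertex.
    Unreachable pairs are left as None."""
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if adj[u][v] and dist[s][v] is None:
                    dist[s][v] = dist[s][u] + 1
                    q.append(v)
    return dist

# The path graph a-b-c: the two endpoints are at distance 2.
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(distance_matrix(path3))  # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```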
Data structures
For use as a data structure, the main alternative to the adjacency matrix is the adjacency list. Because each entry in
the adjacency matrix requires only one bit, it can be represented in a very compact way, occupying only n^2/8 bytes
of contiguous space, where n is the number of vertices.
References
[1] Godsil, Chris; Royle, Gordon, Algebraic Graph Theory, Springer (2001), ISBN 0-387-95241-1, p. 164.
[2] Seidel, J. J. (1968). "Strongly Regular Graphs with (−1, 1, 0) Adjacency Matrix Having Eigenvalue 3". Lin. Alg. Appl. 1 (2): 281–298.
doi:10.1016/0024-3795(68)90008-6.
Further reading
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Section 22.1:
Representations of graphs". Introduction to Algorithms (Second ed.). MIT Press and McGraw-Hill. pp. 527–531.
ISBN 0-262-03293-7.
Godsil, Chris; Royle, Gordon (2001). Algebraic Graph Theory. New York: Springer. ISBN 0-387-95241-1.
External links
Fluffschack (http://www.x2d.org/java/projects/fluffschack.jnlp) an educational Java web start game
demonstrating the relationship between adjacency matrices and graphs.
Open Data Structures - Section 12.1 - AdjacencyMatrix: Representing a Graph by a Matrix (http://
opendatastructures.org/versions/edition-0.1e/ods-java/12_1_AdjacencyMatrix_Repres.html)
McKay, Brendan. "Description of graph6 and sparse6 encodings" (http://cs.anu.edu.au/~bdm/data/formats.
txt).
Café math: Adjacency Matrices of Graphs (http://cafemath.kegtux.org/mathblog/article.
php?page=GoodWillHunting.php): application of adjacency matrices to the computation of generating series of
walks.
Implicit graph
In the study of graph algorithms, an implicit graph representation (or more simply implicit graph) is a graph
whose vertices or edges are not represented as explicit objects in a computer's memory, but rather are determined
algorithmically from some more concise input.
Neighborhood representations
The notion of an implicit graph is common in various search algorithms which are described in terms of graphs. In
this context, an implicit graph may be defined as a set of rules to define all neighbors for any specified vertex.[1] This
type of implicit graph representation is analogous to an adjacency list, in that it provides easy access to the neighbors
of each vertex. For instance, in searching for a solution to a puzzle such as Rubik's Cube, one may define an implicit
graph in which each vertex represents one of the possible states of the cube, and each edge represents a move from
one state to another. It is straightforward to generate the neighbors of any vertex by trying all possible moves in the
puzzle and determining the states reached by each of these moves; however, an implicit representation is necessary,
as the state space of Rubik's Cube is too large to allow an algorithm to list all of its states.
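A generic breadth-first search over such an implicit graph needs nothing but the start state, the goal state, and the neighbor-generating rule. The sketch below uses a toy state space (states are integers, moves are n+1 and 2n) as a stand-in for a real puzzle; the state space and function names are illustrative.

```python
from collections import deque

def bfs_implicit(start, goal, neighbors):
    """Shortest path in an implicit graph given only by a neighbor rule.
    Terminates once the goal is dequeued; the state space may be unbounded."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == goal:
            path = []                    # reconstruct the path by walking
            while v is not None:         # the parent pointers backwards
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in neighbors(v):
            if w not in parent:          # parent also serves as the visited set
                parent[w] = v
                queue.append(w)
    return None

# Toy "puzzle": from state n, the legal moves produce n + 1 and 2 * n.
moves = lambda n: [n + 1, 2 * n]
print(bfs_implicit(1, 10, moves))  # [1, 2, 4, 5, 10]
```

The vertices here are generated on demand, exactly as in the Rubik's Cube example: the full state graph is never materialized.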
In computational complexity theory, several complexity classes have been defined in connection with implicit
graphs, defined as above by a rule or algorithm for listing the neighbors of a vertex. For instance, PPA is the class of
problems in which one is given as input an undirected implicit graph (in which vertices are n-bit binary strings, with
a polynomial time algorithm for listing the neighbors of any vertex) and a vertex of odd degree in the graph, and
must find a second vertex of odd degree. By the handshaking lemma, such a vertex exists; finding one is a problem
in NP, but the problems that can be defined in this way may not necessarily be NP-complete, as it is unknown
whether PPA=NP. PPAD is an analogous class defined on implicit directed graphs that has attracted attention in
algorithmic game theory because it contains the problem of computing a Nash equilibrium.[2] The problem of testing
reachability of one vertex to another in an implicit graph may also be used to characterize space-bounded
nondeterministic complexity classes including NL (the class of problems that may be characterized by reachability in
implicit directed graphs whose vertices are O(log n)-bit bitstrings), SL (the analogous class for undirected graphs),
and PSPACE (the class of problems that may be characterized by reachability in implicit graphs with
polynomial-length bitstrings). In this complexity-theoretic context, the vertices of an implicit graph may represent
the states of a nondeterministic Turing machine, and the edges may represent possible state transitions, but implicit
graphs may also be used to represent many other types of combinatorial structure.[3] PLS, another complexity class,
captures the complexity of finding local optima in an implicit graph.[4]
Implicit graph models have also been used as a form of relativization in order to prove separations between
complexity classes that are stronger than the known separations for non-relativized models. For instance, Childs et
al. used neighborhood representations of implicit graphs to define a graph traversal problem that can be solved in
polynomial time on a quantum computer but that requires exponential time to solve on any classical computer.[5]
Sparse graphs. If every vertex in G has at most d neighbors, one may number the vertices of G from 1 to n
and let the identifier for a vertex be the (d + 1)-tuple of its own number and the numbers of its neighbors. Two
vertices are adjacent when the first numbers in their identifiers appear later in the other vertex's identifier. More
generally, the same approach can be used to provide an implicit representation for graphs with bounded arboricity
or bounded degeneracy, including the planar graphs and the graphs in any minor-closed graph family.[7][8]
Intersection graphs. An interval graph is the intersection graph of a set of line segments in the real line. It may
be given an adjacency labeling scheme in which the points that are endpoints of line segments are numbered from
1 to 2n and each vertex of the graph is represented by the numbers of the two endpoints of its corresponding
interval. With this representation, one may check whether two vertices are adjacent by comparing the numbers
that represent them and verifying that these numbers define overlapping intervals. The same approach works for
other geometric intersection graphs including the graphs of bounded boxicity and the circle graphs, and
subfamilies of these families such as the distance-hereditary graphs and cographs.[7][9] However, a geometric
intersection graph representation does not always imply the existence of an adjacency labeling scheme, because it
may require more than a logarithmic number of bits to specify each geometric object; for instance, representing a
graph as a unit disk graph may require exponentially many bits for the coordinates of the disk centers.[10]
Low-dimensional comparability graphs. The comparability graph for a partially ordered set has a vertex for
each set element and an edge between two set elements that are related by the partial order. The order dimension
of a partial order is the minimum number of linear orders whose intersection is the given partial order. If a partial
order has bounded order dimension, then an adjacency labeling scheme for the vertices in its comparability graph
may be defined by labeling each vertex with its position in each of the defining linear orders, and determining that
two vertices are adjacent if each corresponding pair of numbers in their labels has the same order relation as each
other pair. In particular, this allows for an adjacency labeling scheme for the chordal comparability graphs, which
come from partial orders of dimension at most four.[11][12]
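For the bounded-degree case in the first item above, the labeling scheme is simple enough to sketch directly; the dict-based encoding below is illustrative.

```python
def make_labels(adj):
    """Adjacency labels for a bounded-degree graph: each vertex's label is
    the tuple (own number, numbers of its neighbors)."""
    return {v: (v,) + tuple(sorted(ns)) for v, ns in adj.items()}

def adjacent(label_u, label_v):
    """Adjacency is decidable from the two labels alone, with no access
    to the rest of the graph."""
    return label_u[0] in label_v[1:] or label_v[0] in label_u[1:]

# The path graph 1-2-3:
labels = make_labels({1: [2], 2: [1, 3], 3: [2]})
print(adjacent(labels[1], labels[2]))  # True
print(adjacent(labels[1], labels[3]))  # False
```

In the bounded-arboricity generalization, each vertex lists only its out-neighbors in a low-out-degree orientation, which is why the test checks membership in either direction.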
Not all graph families have local structures. For some families, a simple counting argument proves that adjacency
labeling schemes do not exist: only O(n log n) bits may be used to represent an entire graph, so a representation of
this type can only exist when the number of n-vertex graphs in the given family F is at most 2^{O(n log n)}. Graph
families that have larger numbers of graphs than this, such as the bipartite graphs or the triangle-free graphs, do not
have adjacency labeling schemes.[7][9] However, even families of graphs in which the number of graphs in the family
is small might not have an adjacency labeling scheme; for instance, the family of graphs with fewer edges than
vertices has 2^{O(n log n)} n-vertex graphs but does not have an adjacency labeling scheme, because one could transform
any given graph into a larger graph in this family by adding a new isolated vertex for each edge, without changing its
labelability.[6][9] Kannan et al. asked whether having a forbidden subgraph characterization and having at most
2^{O(n log n)} n-vertex graphs are together enough to guarantee the existence of an adjacency labeling scheme; this
question, which Spinrad restated as a conjecture, remains open.[7][9]
If a graph family F has an adjacency labeling scheme, then the n-vertex graphs in F may be represented as induced
subgraphs of a common universal graph of polynomial size, the graph consisting of all possible vertex identifiers.
Conversely, if a universal graph of this type can be constructed, then the identities of its vertices may be used as
labels in an adjacency labeling scheme.[7] For this application of implicit graph representations, it is important that
the labels use as few bits as possible, because the number of bits in the labels translates directly into the number of
vertices in the universal graph. Alstrup and Rauhe showed that any tree has an adjacency labeling scheme with log_2 n
+ O(log* n) bits per label, from which it follows that any graph with arboricity k has a scheme with k log_2 n + O(log*
n) bits per label and a universal graph with n^k 2^{O(log* n)} vertices. In particular, planar graphs have arboricity at most
three, so they have universal graphs with a nearly-cubic number of vertices.[13]
Evasiveness
The Aanderaa–Karp–Rosenberg conjecture concerns implicit graphs given as a set of labeled vertices with a
black-box rule for determining whether any two vertices are adjacent; this differs from an adjacency labeling scheme
in that the rule may be specific to a particular graph rather than being a generic rule that applies to all graphs in a
family. This difference allows every graph to have an implicit representation: for instance, the rule could be to look
up the pair of vertices in a separate adjacency matrix. However, an algorithm that is given as input an implicit graph
of this type must operate on it only through the implicit adjacency test, without reference to the implementation of
that test.
A graph property is the question of whether a graph belongs to a given family of graphs; the answer must remain
invariant under any relabeling of the vertices. In this context, the question to be determined is how many pairs of
vertices must be tested for adjacency, in the worst case, before the property of interest can be determined to be true
or false for a given implicit graph. Rivest and Vuillemin proved that any deterministic algorithm for any nontrivial
graph property must test a quadratic number of pairs of vertices;[14] the full Aanderaa–Karp–Rosenberg conjecture is
that any deterministic algorithm for a monotonic graph property (one that remains true if more edges are added to a
graph with the property) must in some cases test every possible pair of vertices. Several cases of the conjecture have
been proven to be true (for instance, it is known to be true for graphs with a prime number of vertices[15]), but the
full conjecture remains open. Variants of the problem for randomized algorithms and quantum algorithms have also
been studied.
Bender and Ron have shown that, in the same model used for the evasiveness conjecture, it is possible to distinguish
directed acyclic graphs from graphs that are very far from being acyclic in only constant time. In contrast, such a fast
running time is not possible in neighborhood-based implicit graph models.[16]
References
[1] Korf, Richard E. (2008), "Linear-time disk-based implicit graph search", Journal of the ACM 55 (6): Article 26, 40pp,
doi:10.1145/1455248.1455250, MR2477486.
[2] Papadimitriou, Christos (1994), "On the complexity of the parity argument and other inefficient proofs of existence" (http://www.cs.berkeley.edu/~christos/papers/On the Complexity.pdf), Journal of Computer and System Sciences 48 (3): 498–532,
doi:10.1016/S0022-0000(05)80063-7.
[3] Immerman, Neil (1999), "Exercise 3.7 (Everything is a Graph)" (http://books.google.com/books?id=kWSZ0OWnupkC&pg=PA48),
Descriptive Complexity, Graduate Texts in Computer Science, Springer-Verlag, p. 48, ISBN 978-0-387-98600-5.
[4] Yannakakis, Mihalis (2009), "Equilibria, fixed points, and complexity classes", Computer Science Review 3 (2): 71–85.
[5] Childs, Andrew M.; Cleve, Richard; Deotto, Enrico; Farhi, Edward; Gutmann, Sam; Spielman, Daniel A. (2003), "Exponential algorithmic
speedup by a quantum walk", Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing, New York: ACM, pp. 59–68,
doi:10.1145/780542.780552, MR2121062.
[6] Muller, John Harold (1988), Local structure in graph classes, Ph.D. thesis, Georgia Institute of Technology.
[7] Kannan, Sampath; Naor, Moni; Rudich, Steven (1992), "Implicit representation of graphs", SIAM Journal on Discrete Mathematics 5 (4):
596–603, doi:10.1137/0405049, MR1186827.
[8] Chrobak, Marek; Eppstein, David (1991), "Planar orientations with low out-degree and compaction of adjacency matrices" (http://www.ics.uci.edu/~eppstein/pubs/ChrEpp-TCS-91.pdf), Theoretical Computer Science 86 (2): 243–266, doi:10.1016/0304-3975(91)90020-3.
[9] Spinrad, Jeremy P. (2003), "2. Implicit graph representation" (http://books.google.com/books?id=RrtXSKMAmWgC&pg=PA17),
Efficient Graph Representations, pp. 17–30, ISBN 0-8218-2815-0.
[10] Kang, Ross J.; Müller, Tobias (2011), Sphere and dot product representations of graphs (http://homepages.cwi.nl/~mueller/Papers/SphericityDotproduct.pdf).
[11] Ma, Tze Heng; Spinrad, Jeremy P. (1991), "Cycle-free partial orders and chordal comparability graphs", Order 8 (1): 49–61,
doi:10.1007/BF00385814, MR1129614.
[12] Curtis, Andrew R.; Izurieta, Clemente; Joeris, Benson; Lundberg, Scott; McConnell, Ross M. (2010), "An implicit representation of chordal
comparability graphs in linear time", Discrete Applied Mathematics 158 (8): 869–875, doi:10.1016/j.dam.2010.01.005, MR2602811.
[13] Alstrup, Stephen; Rauhe, Theis (2002), "Small induced-universal graphs and compact implicit graph representations" (http://www.it-c.dk/research/algorithms/Kurser/AD/2002E/Uge7/parent.pdf), Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer
Science: 53–62, doi:10.1109/SFCS.2002.1181882.
[14] Rivest, Ronald L.; Vuillemin, Jean (1975), "A generalization and proof of the Aanderaa–Rosenberg conjecture", Proc. 7th ACM Symposium
on Theory of Computing, Albuquerque, New Mexico, United States, pp. 6–11, doi:10.1145/800116.803747.
42
Implicit graph
[15] Kahn, Jeff; Saks, Michael; Sturtevant, Dean (1983), "A topological approach to evasiveness", Symposium on Foundations of Computer
Science, Los Alamitos, CA, USA: IEEE Computer Society, pp.3133, doi:10.1109/SFCS.1983.4.
[16] Bender, Michael A.; Ron, Dana (2000), "Testing acyclicity of directed graphs in sublinear time", Automata, languages and programming
(Geneva, 2000), Lecture Notes in Comput. Sci., 1853, Berlin: Springer, pp.809820, doi:10.1007/3-540-45022-X_68, MR1795937.
Depth-first search

Class: Search algorithm
Data structure: Graph
Worst-case performance: O(|V| + |E|) if the entire graph is traversed without repetition; O(longest path length searched) for implicit graphs without elimination of duplicate nodes
Depth-first search (DFS) is an algorithm for traversing or searching a tree or graph data structure. One starts at the root (selecting some node as the root in the graph case) and explores as far as possible along each branch before backtracking.
A version of depth-first search was investigated in the 19th century by French mathematician Charles Pierre Trémaux[1] as a strategy for solving mazes.[2][3]
Formal definition
Formally, DFS is an uninformed search that progresses by expanding the first child node of the search tree that
appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then
the search backtracks, returning to the most recent node it hasn't finished exploring. In a non-recursive
implementation, all freshly expanded nodes are added to a stack for exploration.
Properties
The time and space analysis of DFS differs according to its application area. In theoretical computer science, DFS is
typically used to traverse an entire graph, and takes time O(|V|+|E|), linear in the size of the graph. In these
applications it also uses space O(|V|) in the worst case to store the stack of vertices on the current search path as well
as the set of already-visited vertices. Thus, in this setting, the time and space bounds are the same as for breadth-first
search and the choice of which of these two algorithms to use depends less on their complexity and more on the
different properties of the vertex orderings the two algorithms produce.
For applications of DFS to search problems in artificial intelligence, however, the graph to be searched is often either
too large to visit in its entirety or even infinite, and DFS may suffer from non-termination when the length of a path
in the search tree is infinite. Therefore, the search is only performed to a limited depth, and due to limited memory
availability one typically does not use data structures that keep track of the set of all previously visited vertices. In
this case, the time is still linear in the number of expanded vertices and edges (although this number is not the same
as the size of the entire graph because some vertices may be searched more than once and others not at all) but the
space complexity of this variant of DFS is only proportional to the depth limit, much smaller than the space needed
for searching to the same depth using breadth-first search. For such applications, DFS also lends itself much better to
heuristic methods of choosing a likely-looking branch. When an appropriate depth limit is not known a priori,
iterative deepening depth-first search applies DFS repeatedly with a sequence of increasing limits; in the artificial
intelligence mode of analysis, with a branching factor greater than one, iterative deepening increases the running
time by only a constant factor over the case in which the correct depth limit is known due to the geometric growth of
the number of nodes per level.
DFS may also be used to collect a sample of graph nodes. However, incomplete DFS, similarly to incomplete BFS, is biased towards nodes of high degree.
Example
For the following graph:
a depth-first search starting at A, assuming that the left edges in the shown graph are chosen before right edges, and
assuming the search remembers previously visited nodes and will not repeat them (since this is a small graph), will
visit the nodes in the following order: A, B, D, F, E, C, G. The edges traversed in this search form a Trémaux tree, a structure with important applications in graph theory.
Performing the same search without remembering previously visited nodes results in visiting nodes in the order A, B,
D, F, E, A, B, D, F, E, etc. forever, caught in the A, B, D, F, E cycle and never reaching C or G.
Iterative deepening is one technique to avoid this infinite loop and would reach all nodes.
Vertex orderings
It is also possible to use the depth-first search to linearly order the vertices of the original graph (or tree). There are
three common ways of doing this:
A preordering is a list of the vertices in the order that they were first visited by the depth-first search algorithm.
This is a compact and natural way of describing the progress of the search, as was done earlier in this article. A
preordering of an expression tree is the expression in Polish notation.
A postordering is a list of the vertices in the order that they were last visited by the algorithm. A postordering of
an expression tree is the expression in reverse Polish notation.
A reverse postordering is the reverse of a postordering, i.e. a list of the vertices in the opposite order of their last
visit. Reverse postordering is not the same as preordering. For example, when searching the directed graph
beginning at node A, one visits the nodes in sequence, to produce lists either A B D B A C A, or A C D C A B
A (depending upon whether the algorithm chooses to visit B or C first). Note that repeat visits in the form of
backtracking to a node, to check if it has still unvisited neighbours, are included here (even if it is found to
have none). Thus the possible preorderings are A B D C and A C D B (order by node's leftmost occurrence in
above list), while the possible reverse postorderings are A C B D and A B C D (order by node's rightmost
occurrence in above list). Reverse postordering produces a topological sorting of any directed acyclic graph.
This ordering is also useful in control flow analysis as it often represents a natural linearization of the control
flow. The graph above might represent the flow of control in a code fragment like
if (A) then {
B
} else {
C
}
D
and it is natural to consider this code in the order A B C D or A C B D, but not natural to use the order A B D
C or A C D B.
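These three orderings can be computed with a single traversal that records each vertex at its first and last visit. A minimal Python sketch, assuming the four-node example graph discussed above is encoded with edges A→B, A→C, B→D, C→D and that B is visited before C:

```python
# Computing the DFS vertex orderings on the example graph
# (edges A->B, A->C, B->D, C->D, neighbors listed left to right).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

def dfs_orderings(graph, root):
    """Return (preorder, postorder) lists of a DFS from root."""
    preorder, postorder, visited = [], [], set()

    def visit(v):
        visited.add(v)
        preorder.append(v)      # first visit: contributes to the preordering
        for w in graph[v]:
            if w not in visited:
                visit(w)
        postorder.append(v)     # last visit: contributes to the postordering

    visit(root)
    return preorder, postorder

pre, post = dfs_orderings(graph, "A")
print(pre)          # ['A', 'B', 'D', 'C']
print(post[::-1])   # reverse postordering: ['A', 'C', 'B', 'D']
```

The printed orderings match the possible preordering A B D C and reverse postordering A C B D described above.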
Pseudocode
Input: A graph G and a vertex v of G
Output: A labeling of the edges in the connected component of v as discovery edges and back edges
procedure DFS(G, v):
    label v as explored
    for all edges e in G.incidentEdges(v) do
        if edge e is unexplored then
            w ← G.opposite(v, e)
            if vertex w is unexplored then
                label e as a discovery edge
                recursively call DFS(G, w)
            else
                label e as a back edge
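The pseudocode above translates almost directly into Python. This is a sketch under the assumption that the undirected graph is stored as an adjacency list and each edge is represented by the frozenset of its two endpoints:

```python
# Python transcription of the DFS pseudocode: label each edge in the
# connected component of v as a "discovery" edge or a "back" edge.
def dfs_label(graph, v, explored=None, labels=None):
    if explored is None:
        explored, labels = set(), {}
    explored.add(v)                          # label v as explored
    for w in graph[v]:
        e = frozenset((v, w))                # the edge between v and w
        if e not in labels:                  # edge e is unexplored
            if w not in explored:            # vertex w is unexplored
                labels[e] = "discovery"
                dfs_label(graph, w, explored, labels)
            else:
                labels[e] = "back"
    return labels

# A triangle: exactly one edge closes a cycle and becomes a back edge.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
labels = dfs_label(triangle, 0)
print(sorted(labels.values()))   # ['back', 'discovery', 'discovery']
```

The discovery edges form the Trémaux tree of the search; the back edges are exactly the edges that create cycles.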
Applications
Algorithms that use depth-first search as a building block include:
Finding connected components.
Topological sorting.
Finding 2-(edge or vertex)-connected components.
Finding 3-(edge or vertex)-connected components.
Finding the bridges of a graph.
Finding strongly connected components.
Planarity testing.[4][5]
Solving puzzles with only one solution, such as mazes. (DFS can be adapted to find all solutions to a maze by only including nodes on the current path in the visited set.)
Notes
[1] Charles Pierre Trémaux (1859–1882), École Polytechnique of Paris (X:1876), French engineer of the telegraph; in a public conference, December 2, 2010, by professor Jean Pelletier-Thibert in Académie de Mâcon (Burgundy, France) (abstract published in the Annals academic, March 2011, ISSN 0980-6032).
[2] Even, Shimon (2011), Graph Algorithms (http://books.google.com/books?id=m3QTSMYm5rkC&pg=PA46) (2nd ed.), Cambridge University Press, pp. 46–48, ISBN 978-0-521-73653-4.
[3] Sedgewick, Robert (2002), Algorithms in C++: Graph Algorithms (3rd ed.), Pearson Education, ISBN 978-0-201-36118-6.
[4] Hopcroft, John; Tarjan, Robert E. (1974), "Efficient planarity testing", Journal of the Association for Computing Machinery 21 (4): 549–568, doi:10.1145/321850.321852.
[5] de Fraysseix, H.; Ossona de Mendez, P.; Rosenstiehl, P. (2006), "Trémaux Trees and Planarity", International Journal of Foundations of Computer Science 17 (5): 1017–1030, doi:10.1142/S0129054106004248.
References
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 22.3: Depth-first search, pp. 540–549.
Knuth, Donald E. (1997), The Art of Computer Programming, Vol. 1 (3rd ed.) (http://www-cs-faculty.stanford.edu/~knuth/taocp.html), Boston: Addison-Wesley, ISBN 0-201-89683-4, OCLC 155842391.
Goodrich, Michael T. (2001), Algorithm Design: Foundations, Analysis, and Internet Examples, Wiley, ISBN 0-471-38365-1.
External links
Depth-First Explanation and Example (http://www.cse.ohio-state.edu/~gurari/course/cis680/cis680Ch14.html)
C++ Boost Graph Library: Depth-First Search (http://www.boost.org/libs/graph/doc/depth_first_search.html)
Depth-First Search Animation (for a directed graph) (http://www.cs.duke.edu/csed/jawaa/DFSanim.html)
Depth First and Breadth First Search: Explanation and Code (http://www.kirupa.com/developer/actionscript/depth_breadth_search.htm)
QuickGraph (http://quickgraph.codeplex.com/Wiki/View.aspx?title=Depth First Search Example), depth-first search example for .Net
Depth-first search algorithm illustrated explanation (Java and C++ implementations) (http://www.algolist.net/Algorithms/Graph_algorithms/Undirected/Depth-first_search)
YAGSBPL: A template-based C++ library for graph search and planning (http://code.google.com/p/yagsbpl/)
Breadth-first search

Class: Search algorithm
Data structure: Graph
In graph theory, breadth-first search (BFS) is a strategy for searching in a graph when search is limited to essentially two operations: (a) visit and inspect a node of a graph; (b) gain access to visit the nodes that neighbor the currently visited node. BFS begins at a root node and inspects all of its neighboring nodes. Then, for each of those neighbors in turn, it inspects their unvisited neighbors, and so on. Compare it with depth-first search.
How it works
BFS is an uninformed search method that aims to expand and
examine all nodes of a graph or combination of sequences by
systematically searching through every solution. In other words, it
exhaustively searches the entire graph or sequence without
considering the goal until it finds it.
From the standpoint of the algorithm, all child nodes obtained by
expanding a node are added to a FIFO (i.e., First In, First Out) queue.
In typical implementations, nodes that have not yet been examined
for their neighbors are placed in some container (such as a queue or
linked list) called "open" and then once examined are placed in the
container "closed".
Animated example of a breadth-first search
Algorithm
The algorithm uses a queue data structure to store
intermediate results as it traverses the graph, as
follows:
1. Enqueue the root node
2. Dequeue a node and examine it
If the element sought is found in this node, quit
the search and return a result.
Otherwise enqueue any successors (the direct
child nodes) that have not yet been discovered.
3. If the queue is empty, every node on the graph has been examined: quit the search and return "not found".
4. If the queue is not empty, repeat from Step 2.
Note: Using a stack instead of a queue would turn this
algorithm into a depth-first search.
An example map of Germany with some connections between cities
Pseudocode
Input: A graph G and a root v of G

procedure BFS(G, v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.incidentEdges(t) do
            o ← G.opposite(t, e)
            if o is not marked:
                mark o
                enqueue o onto Q

The breadth-first tree obtained when running BFS on the given map and starting in Frankfurt
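A Python sketch of the same procedure, assuming the graph is given as an adjacency list mapping each node to its neighbors:

```python
from collections import deque

# Python version of the BFS pseudocode: search `graph` from root v for
# a target value, marking nodes as they are enqueued so that no node
# is processed twice.
def bfs(graph, v, target):
    marked = {v}
    queue = deque([v])
    while queue:
        t = queue.popleft()
        if t == target:              # t is what we are looking for
            return t
        for o in graph[t]:
            if o not in marked:
                marked.add(o)
                queue.append(o)
    return None                      # target not reachable from v

graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(bfs(graph, 1, 4))   # 4
print(bfs(graph, 2, 3))   # None (3 is not reachable from 2)
```

Replacing the deque with a stack (appending and popping from the same end) would turn this into a depth-first search, as noted above.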
Features
Space complexity
When the number of vertices in the graph is known ahead of time, and additional data structures are used to determine which vertices have already been added to the queue, the space complexity can be expressed as O(|V|), where |V| is the number of vertices.
Time complexity
The time complexity can be expressed as O(|V| + |E|), since every vertex and every edge will be explored in the worst case. Note that O(|E|) may vary between O(1) and O(|V|^2), depending on how sparse the input graph is.
Applications
Breadth-first search can be used to solve many problems in graph theory, for example:
Finding all nodes within one connected component
Copying garbage collection, Cheney's algorithm
Finding the shortest path between two nodes u and v (with path length measured by number of edges)
Testing a graph for bipartiteness
(Reverse) CuthillMcKee mesh numbering
FordFulkerson method for computing the maximum flow in a flow network
Serialization/deserialization of a binary tree (as opposed to serializing it in sorted order), which allows the tree to be reconstructed efficiently.
Testing bipartiteness
BFS can be used to test bipartiteness, by starting the search at any vertex and giving alternating labels to the vertices
visited during the search. That is, give label 0 to the starting vertex, 1 to all its neighbours, 0 to those neighbours'
neighbours, and so on. If at any step a vertex has (visited) neighbours with the same label as itself, then the graph is
not bipartite. If the search ends without such a situation occurring, then the graph is bipartite.
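The alternating-label test above can be sketched as follows; the outer loop over all vertices is a small addition so that disconnected graphs are handled too:

```python
from collections import deque

# BFS two-coloring: label the start vertex 0, its neighbours 1, their
# neighbours 0, and so on. The graph is bipartite iff no edge ever
# joins two vertices that received the same label.
def is_bipartite(graph):
    label = {}
    for start in graph:              # covers disconnected graphs
        if start in label:
            continue
        label[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in label:
                    label[w] = 1 - label[v]
                    queue.append(w)
                elif label[w] == label[v]:
                    return False     # same label across an edge
    return True

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}   # 4-cycle
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}            # odd cycle
print(is_bipartite(square), is_bipartite(triangle))     # True False
```

The even cycle can be 2-colored, while the odd cycle forces two adjacent vertices to share a label.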
References
Knuth, Donald E. (1997), The Art of Computer Programming, Vol. 1 (3rd ed.) (http://www-cs-faculty.stanford.edu/~knuth/taocp.html), Boston: Addison-Wesley, ISBN 0-201-89683-4.
External links
Breadth-First Explanation and Example (http://www.cse.ohio-state.edu/~gurari/course/cis680/cis680Ch14.html#QQ1-46-92)
Lexicographic breadth-first search

The algorithm
The lexicographic breadth-first search algorithm replaces the queue of vertices of a standard breadth-first search with
an ordered sequence of sets of vertices. The sets in the sequence form a partition of the remaining vertices. At each
step, a vertex v from the first set in the sequence is removed from that set, and if that removal causes the set to
become empty then the set is removed from the sequence. Then, each set in the sequence is replaced by two subsets:
the neighbors of v and the non-neighbors of v. The subset of neighbors is placed earlier in the sequence than the
subset of non-neighbors. In pseudocode, the algorithm can be expressed as follows:
Initialize a sequence Σ of sets, to contain a single set containing all vertices.
Initialize the output sequence of vertices to be empty.
While Σ is non-empty:
    Find and remove a vertex v from the first set in Σ.
    If the first set in Σ is now empty, remove it from Σ.
    Add v to the end of the output sequence.
    For each edge v-w such that w still belongs to a set S in Σ:
        If the set S containing w has not yet been replaced while processing v, create a new empty replacement set T and place it prior to S in the sequence; otherwise, let T be the set prior to S.
        Move w from S to T, and if this makes S empty, remove S from the sequence.
Each vertex is processed once, each edge is examined only when its two endpoints are processed, and (with an appropriate representation for the sets in Σ that allows items to be moved from one set to another in constant time) each iteration of the inner loop takes only constant time. Therefore, like simpler graph search algorithms such as breadth-first search and depth-first search, this algorithm takes linear time.
The algorithm is called lexicographic breadth-first search because the lexicographic order it produces is an ordering
that could also have been produced by a breadth-first search, and because if the ordering is used to index the rows
and columns of an adjacency matrix of a graph, then the algorithm sorts the rows and columns into lexicographic order.
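The sequence-of-sets process can be sketched in Python. This simple version is quadratic rather than linear, because it rebuilds the sets naively instead of using the constant-time partition refinement structure described above:

```python
# A simple (quadratic-time) rendering of lexicographic breadth-first
# search: the queue of a plain BFS is replaced by an ordered sequence
# of sets, and each set is split around the processed vertex's
# neighbourhood, neighbours placed first.
def lex_bfs(graph):
    sequence = [set(graph)]          # one set containing all vertices
    output = []
    while sequence:
        first = sequence[0]
        v = min(first)               # any vertex of the first set
        first.discard(v)
        if not first:
            sequence.pop(0)
        output.append(v)
        # Replace each set S by (S ∩ N(v), S \ N(v)), dropping empties.
        refined = []
        for s in sequence:
            nbrs = s & set(graph[v])
            rest = s - nbrs
            refined += [x for x in (nbrs, rest) if x]
        sequence = refined
    return output

# On a path a-b-c, starting at 'a', its neighbour b must precede c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(lex_bfs(path))   # ['a', 'b', 'c']
```

The `min` call is just a deterministic tie-break; any vertex of the first set may be chosen, which is why a graph can have several valid lexicographic orderings.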
Chordal graphs
A graph G is defined to be chordal if its vertices have a perfect elimination ordering, an ordering such that for any
vertex v the neighbors that occur later in the ordering form a clique. In a chordal graph, the reverse of a lexicographic
ordering is always a perfect elimination ordering. Therefore, as Rose, Tarjan, and Lueker show, one can test whether
a graph is chordal in linear time by the following algorithm:
Use lexicographic breadth-first search to find a lexicographic ordering of G
Reverse this ordering
For each vertex v:
Let w be the neighbor of v occurring prior to v in the reversed sequence, as close to v in the sequence as
possible
(Continue to the next vertex v if there is no such w)
If the set of earlier neighbors of v (excluding w itself) is not a subset of the set of earlier neighbors of w, the
graph is not chordal
If the loop terminates without showing that the graph is not chordal, then it is chordal.
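The chordality test can be sketched as follows. The quadratic-time LexBFS from the previous section is repeated here so the example is self-contained, and the subset check is phrased directly on the LexBFS order, where "earlier" neighbors correspond to later neighbors in the reversed elimination ordering:

```python
def lex_bfs(graph):
    """Quadratic-time lexicographic BFS (sequence-of-sets form)."""
    sequence, output = [set(graph)], []
    while sequence:
        v = min(sequence[0])
        sequence[0].discard(v)
        if not sequence[0]:
            sequence.pop(0)
        output.append(v)
        refined = []
        for s in sequence:
            nbrs, rest = s & set(graph[v]), s - set(graph[v])
            refined += [x for x in (nbrs, rest) if x]
        sequence = refined
    return output

def is_chordal(graph):
    """Rose-Tarjan-Lueker style chordality test on a LexBFS order."""
    order = lex_bfs(graph)
    position = {v: i for i, v in enumerate(order)}
    for v in order:
        earlier = {u for u in graph[v] if position[u] < position[v]}
        if not earlier:
            continue                        # no such w: next vertex
        w = max(earlier, key=position.get)  # closest earlier neighbor
        earlier_w = {u for u in graph[w] if position[u] < position[w]}
        if not (earlier - {w}) <= earlier_w:
            return False                    # not a perfect elimination
    return True

cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}  # chordless C4
print(is_chordal(cycle4))                               # False
cycle4[0].append(2); cycle4[2].append(0)                # add a chord
print(is_chordal(cycle4))                               # True
```

The 4-cycle fails because its two "earlier neighbors" are non-adjacent; adding the chord turns the graph into two triangles, which pass the subset check.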
Graph coloring
A graph G is said to be perfectly orderable if there is a sequence of its vertices with the property that, for any
induced subgraph of G, a greedy coloring algorithm that colors the vertices in the induced sequence ordering is
guaranteed to produce an optimal coloring.
For a chordal graph, a perfect elimination ordering is a perfect ordering: the number of the color used for any vertex
is the size of the clique formed by it and its earlier neighbors, so the maximum number of colors used is equal to the
size of the largest clique in the graph, and no coloring can use fewer colors. An induced subgraph of a chordal graph
is chordal and the induced subsequence of its perfect elimination ordering is a perfect elimination ordering on the
subgraph, so chordal graphs are perfectly orderable, and lexicographic breadth-first search can be used to optimally
color them.
The same property is true for a larger class of graphs, the distance-hereditary graphs: distance-hereditary graphs are
perfectly orderable, with a perfect ordering given by the reverse of a lexicographic ordering, so lexicographic
breadth-first search can be used in conjunction with greedy coloring algorithms to color them optimally in linear
time.[1]
Other applications
Bretscher et al. (2008) describe an extension of lexicographic breadth-first search that breaks any additional ties
using the complement graph of the input graph. As they show, this can be used to recognize cographs in linear time.
Habib et al. (2000) describe additional applications of lexicographic breadth-first search including the recognition of
comparability graphs and interval graphs.
Notes
[1] Brandstädt, Le & Spinrad (1999), Theorem 5.2.4, p. 71.
References
Brandstädt, Andreas; Le, Van Bang; Spinrad, Jeremy (1999), Graph Classes: A Survey, SIAM Monographs on Discrete Mathematics and Applications, ISBN 0-89871-432-X.
Bretscher, Anna; Corneil, Derek; Habib, Michel; Paul, Christophe (2008), "A simple linear time LexBFS cograph recognition algorithm" (http://www.liafa.jussieu.fr/~habib/Documents/cograph.ps), SIAM Journal on
Iterative deepening depth-first search

Properties
IDDFS combines depth-first search's space-efficiency and breadth-first search's completeness (when the branching
factor is finite). It is optimal when the path cost is a non-decreasing function of the depth of the node.
The space complexity of IDDFS is O(bd), where b is the branching factor and d is the depth of the shallowest goal.
Since iterative deepening visits states multiple times, it may seem wasteful, but it turns out to be not so costly, since
in a tree most of the nodes are in the bottom level, so it does not matter much if the upper levels are visited multiple
times.[1]
The main advantage of IDDFS in game tree searching is that the earlier searches tend to improve the commonly used
heuristics, such as the killer heuristic and alpha-beta pruning, so that a more accurate estimate of the score of various
nodes at the final depth search can occur, and the search completes more quickly since it is done in a better order.
For example, alpha-beta pruning is most efficient if it searches the best moves first.[1]
A second advantage is the responsiveness of the algorithm. Because early iterations use small values for the depth limit d, they execute extremely quickly. This allows the algorithm to supply early indications of the result almost immediately, followed by refinements as d increases. When used in an interactive setting, such as in a chess-playing program, this facility allows the program to play at any time with the current best move found in the search it has completed so far. This is not possible with a traditional depth-first search.
The time complexity of IDDFS in well-balanced trees works out to be the same as depth-first search: O(b^d).
In an iterative deepening search, the nodes on the bottom level are expanded once, those on the next-to-bottom level are expanded twice, and so on, up to the root of the search tree, which is expanded d+1 times.[1] So the total number of expansions in an iterative deepening search is

    (d+1) + db + (d-1)b^2 + ... + 3b^(d-2) + 2b^(d-1) + b^d

For b = 10 and d = 5 the number is

    6 + 50 + 400 + 3,000 + 20,000 + 100,000 = 123,456

which is only about 11% more than the 111,111 nodes expanded by a single breadth-first search to the same depth. The higher the branching factor, the lower the overhead of repeatedly expanded states, but even when the branching factor is 2, iterative deepening search only takes about twice as long as a complete breadth-first search. This means that the time complexity of iterative deepening is still O(b^d), and the space complexity is O(bd), like a regular depth-first search. In general, iterative deepening is the preferred search method when there is a large search space and the depth of the solution is not known.[1]
Example
For the following graph:
a depth-first search starting at A, assuming that the left edges in the shown graph are chosen before right edges, and
assuming the search remembers previously-visited nodes and will not repeat them (since this is a small graph), will
visit the nodes in the following order: A, B, D, F, E, C, G. The edges traversed in this search form a Trémaux tree, a structure with important applications in graph theory.
Performing the same search without remembering previously visited nodes results in visiting nodes in the order A, B,
D, F, E, A, B, D, F, E, etc. forever, caught in the A, B, D, F, E cycle and never reaching C or G.
Iterative deepening prevents this loop and will reach the following nodes on the following depths, assuming it
proceeds left-to-right as above:
0: A
1: A (repeated), B, C, E
(Note that iterative deepening has now seen C, when a conventional depth-first search did not.)
2: A, B, D, F, C, G, E, F
(Note that it still sees C, but that it came later. Also note that it sees E via a different path, and loops back to F
twice.)
3: A, B, D, F, E, C, G, E, F, B
For this graph, as more depth is added, the two cycles "ABFE" and "AEFB" will simply get longer before the
algorithm gives up and tries another branch.
Algorithm
The following pseudocode shows IDDFS implemented in terms of a recursive depth-limited DFS (called DLS).
IDDFS(root, goal)
{
    depth = 0
    repeat
    {
        result = DLS(root, goal, depth)
        if (result is a solution)
            return result
        depth = depth + 1
    }
}
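This scheme can be sketched in runnable Python; the `max_depth` cap is an assumption added so that the loop terminates even when the goal is absent from a finite graph:

```python
# IDDFS via a recursive depth-limited search (DLS), following the
# pseudocode above. `graph` is an adjacency list; the search returns
# the goal node if it is reachable within max_depth edges.
def iddfs(graph, root, goal, max_depth=50):
    def dls(node, depth):
        if node == goal:
            return node
        if depth == 0:
            return None                  # depth limit reached
        for child in graph[node]:
            result = dls(child, depth - 1)
            if result is not None:
                return result
        return None

    for depth in range(max_depth + 1):   # increasing depth limits
        result = dls(root, depth)
        if result is not None:
            return result
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(iddfs(graph, "A", "D"))   # D
```

Note that, as in the text, no visited set is kept: cycles only cause bounded re-exploration because each iteration is cut off at its depth limit.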
Related algorithms
Similar to iterative deepening is a search strategy called iterative lengthening search that works with increasing
path-cost limits instead of depth-limits. It expands nodes in the order of increasing path cost; therefore the first goal
it encounters is the one with the cheapest path cost. But iterative lengthening incurs substantial overhead that makes it less useful than iterative deepening.
Notes
[1] Russell, Stuart J.; Norvig, Peter (2003), Artificial Intelligence: A Modern Approach (http://aima.cs.berkeley.edu/) (2nd ed.), Upper Saddle River, New Jersey: Prentice Hall, ISBN 0-13-790395-2.
Topological sorting
In computer science, a topological sort (sometimes abbreviated topsort or toposort) or topological ordering of a
directed graph is a linear ordering of its vertices such that, for every edge uv, u comes before v in the ordering. For
instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that
one task must be performed before another; in this application, a topological ordering is just a valid sequence for the
tasks. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed
acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a
topological ordering of any DAG in linear time.
Examples
The canonical application of topological sorting (topological order) is in scheduling a sequence of jobs or tasks based
on their dependencies; topological sorting algorithms were first studied in the early 1960s in the context of the PERT
technique for scheduling in project management (Jarnagin 1960). The jobs are represented by vertices, and there is
an edge from x to y if job x must be completed before job y can be started (for example, when washing clothes, the
washing machine must finish before we put the clothes to dry). Then, a topological sort gives an order in which to
perform the jobs.
In computer science, applications of this type arise in instruction scheduling, ordering of formula cell evaluation
when recomputing formula values in spreadsheets, logic synthesis, determining the order of compilation tasks to
perform in makefiles, serialization, and resolving symbol dependencies in linkers.
The graph shown to the left has many valid topological sorts, including:
3, 7, 8, 5, 11, 10, 2, 9
5, 7, 3, 8, 11, 10, 9, 2 (fewest edges first)
Algorithms
The usual algorithms for topological sorting have running time linear in the number of nodes plus the number of edges, that is, O(|V| + |E|).
One of these algorithms, first described by Kahn (1962), works by choosing vertices in the same order as the
eventual topological sort. First, find a list of "start nodes" which have no incoming edges and insert them into a set S;
at least one such node must exist in an acyclic graph. Then:
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    insert n into L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error (graph has at least one cycle)
else
    return L (a topologically sorted order)
If the graph is a DAG, a solution will be contained in the list L (the solution is not necessarily unique). Otherwise,
the graph must have at least one cycle and therefore a topological sorting is impossible.
Note that, reflecting the non-uniqueness of the resulting sort, the structure S can be simply a set or a queue or a stack.
Depending on the order that nodes n are removed from set S, a different solution is created. A variation of Kahn's
algorithm that breaks ties lexicographically forms a key component of the CoffmanGraham algorithm for parallel
scheduling and layered graph drawing.
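Kahn's algorithm can be sketched in Python as follows; rather than mutating the graph, this version decrements an in-degree counter for each node, which has the same effect as removing edges:

```python
from collections import deque

# Kahn's algorithm. `graph` maps each node to its outgoing neighbours;
# S is kept as a FIFO queue, one of the several valid choices for S
# noted in the text.
def topological_sort(graph):
    indegree = {n: 0 for n in graph}
    for n in graph:
        for m in graph[n]:
            indegree[m] += 1
    S = deque(n for n in graph if indegree[n] == 0)
    L = []
    while S:
        n = S.popleft()
        L.append(n)
        for m in graph[n]:
            indegree[m] -= 1             # "remove" edge n -> m
            if indegree[m] == 0:
                S.append(m)
    if len(L) != len(graph):
        raise ValueError("graph has at least one cycle")
    return L

dag = {5: [11], 7: [11, 8], 3: [8, 10], 11: [2, 9, 10], 8: [9],
       2: [], 9: [], 10: []}
order = topological_sort(dag)
print(order)   # [5, 7, 3, 11, 8, 2, 10, 9], one of several valid orders
```

Using a stack or choosing nodes in a different order from S would produce a different, equally valid, topological order.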
An alternative algorithm for topological sorting is based on depth-first search. For this algorithm, the edges are
examined in the opposite direction as the previous one (it looks for nodes with edges pointing to a given node instead
of from it, which might set different requirements for the data structure used to represent the graph; and it starts with
the set of nodes with no outgoing edges). The algorithm loops through each node of the graph, in an arbitrary order,
initiating a depth-first search that terminates when it hits any node that has already been visited since the beginning
of the topological sort:
L ← Empty list that will contain the sorted nodes
S ← Set of all nodes with no outgoing edges
for each node n in S do
    visit(n)

function visit(node n)
    if n has not been visited yet then
        mark n as visited
        for each node m with an edge from m to n do
            visit(m)
        add n to L
Note that each node n gets added to the output list L only after considering all other nodes on which n depends (all
ancestor nodes of n in the graph). Specifically, when the algorithm adds node n, we are guaranteed that all nodes on
which n depends are already in the output list L: they were added to L either by the preceding recursive call to visit(),
or by an earlier call to visit(). Since each edge and node is visited once, the algorithm runs in linear time. Note that
the simple pseudocode above cannot detect the error case where the input graph contains cycles. The algorithm can
be refined to detect cycles by watching for nodes which are visited more than once during any nested sequence of
recursive calls to visit() (e.g., by passing a list down as an extra argument to visit(), indicating which nodes have
already been visited in the current call stack). This depth-first-search-based algorithm is the one described by
Cormen et al. (2001); it seems to have been first described in print by Tarjan (1976).
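A Python sketch of the depth-first approach. Note that it uses the more common outgoing-edge adjacency convention (each node maps to the nodes it points to) rather than the incoming-edge convention of the pseudocode above; each node is appended only after everything reachable from it, so the reversed postorder is a topological order:

```python
# DFS-based topological sort. `graph` maps each node to the nodes it
# points to; an edge n -> m means n must come before m in the output.
def dfs_topsort(graph):
    L, visited = [], set()

    def visit(n):
        if n in visited:
            return
        visited.add(n)
        for m in graph[n]:
            visit(m)
        L.append(n)          # all nodes n points to are already in L

    for n in graph:          # arbitrary order over all nodes
        visit(n)
    return L[::-1]           # reverse postordering

dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order = dfs_topsort(dag)
print(order)   # ['a', 'c', 'b', 'd']
```

As described in the text, this simple sketch does not detect cycles; on cyclic input it silently returns some ordering instead of reporting an error.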
Uniqueness
If a topological sort has the property that all pairs of consecutive vertices in the sorted order are connected by edges,
then these edges form a directed Hamiltonian path in the DAG. If a Hamiltonian path exists, the topological sort
order is unique; no other order respects the edges of the path. Conversely, if a topological sort does not form a
Hamiltonian path, the DAG will have two or more valid topological orderings, for in this case it is always possible to
form a second valid ordering by swapping two consecutive vertices that are not connected by an edge to each other.
Therefore, it is possible to test in polynomial time whether a unique ordering exists, and whether a Hamiltonian path
exists, despite the NP-hardness of the Hamiltonian path problem for more general directed graphs (Vernet &
Markenzon 1997).
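The consecutive-vertices criterion is easy to check directly. A small sketch, assuming `order` is already a valid topological order of `graph`:

```python
# A topological order of a DAG is unique exactly when every pair of
# consecutive vertices in it is joined by an edge, i.e. the order is
# a directed Hamiltonian path.
def topological_order_is_unique(graph, order):
    return all(b in graph[a] for a, b in zip(order, order[1:]))

chain = {"x": ["y"], "y": ["z"], "z": []}                      # path
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_order_is_unique(chain, ["x", "y", "z"]))     # True
print(topological_order_is_unique(diamond, ["a", "b", "c", "d"]))  # False
```

In the diamond, b and c are unordered relative to each other, so swapping them yields a second valid topological order.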
References
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), "Section 22.4: Topological sort", Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 549–552, ISBN 0-262-03293-7.
Jarnagin, M. P. (1960), Automatic machine methods of testing PERT networks for consistency, Technical Memorandum No. K-24/60, Dahlgren, Virginia: U. S. Naval Weapons Laboratory.
Kahn, Arthur B. (1962), "Topological sorting of large networks", Communications of the ACM 5 (11): 558–562, doi:10.1145/368996.369025.
Tarjan, Robert E. (1976), "Edge-disjoint spanning trees and depth-first search", Acta Informatica 6 (2): 171–185, doi:10.1007/BF00268499.
Vernet, Oswaldo; Markenzon, Lilian (1997), "Hamiltonian problems for reducible flowgraphs", Proc. 17th International Conference of the Chilean Computer Science Society (SCCC '97), pp. 264–267, doi:10.1109/SCCC.1997.637099.
External links
NIST Dictionary of Algorithms and Data Structures: topological sort [1]
Weisstein, Eric W., "TopologicalSort [2]" from MathWorld.
References
[1] http://www.nist.gov/dads/HTML/topologicalSort.html
[2] http://mathworld.wolfram.com/TopologicalSort.html
Dependency graph

Definition
Given a set of objects S and a transitive relation R ⊆ S × S with (a, b) ∈ R modeling a dependency "a depends on b" ("a needs b evaluated first"), the dependency graph is a graph G = (S, T) with T ⊆ R and R being the transitive closure of T.
For example, assume a simple calculator that supports assignment of constant values to variables and assignment of the sum of exactly two variables to a third variable. Given the equation system "A = B+C; B = 5+D; C = 4; D = 2;", one can derive the dependency relation directly: A depends on B and C, because you can add two variables if and only if you know the values of both variables. Thus, B and C must be calculated before A can be calculated. However, D's value is known immediately, because it is a number literal.
A correct evaluation order is a numbering n : S → ℕ of the objects that form the nodes of the dependency graph,
such that n(a) < n(b) implies (a, b) ∉ R. This means, if the
numbering orders two elements a and b so that a will be evaluated before b, then a must not depend on b.
Furthermore, there can be more than a single correct evaluation order. In fact, a correct numbering is a topological
order, and any topological order is a correct numbering. Thus, any algorithm that derives a correct topological order
derives a correct evaluation order.
Assume the simple calculator from above once more. Given the equation system "A = B+C; B = 5+D; C=4; D=2;", a
correct evaluation order would be (D, C, B, A). However, (C, D, B, A) is a correct evaluation order as well.
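The derivation of an evaluation order can be sketched in Python. The toy parser for "sum of two terms" equations and all identifier names are illustrative assumptions, not part of any real calculator:

```python
def evaluate(equations):
    """Evaluate 'X = Y+Z' / 'X = <number>' equations in dependency order.

    Each variable is evaluated only after the variables it depends on,
    i.e. in a topological order of the dependency graph.  Assumes the
    dependency graph is acyclic (otherwise the loop would not finish).
    """
    deps, exprs = {}, {}
    for eq in equations:
        var, expr = (s.strip() for s in eq.split("="))
        terms = [t.strip() for t in expr.split("+")]
        exprs[var] = terms
        deps[var] = {t for t in terms if not t.isdigit()}

    values, order = {}, []
    while len(values) < len(exprs):
        # pick every variable whose dependencies are all evaluated
        ready = [v for v in exprs if v not in values and deps[v] <= set(values)]
        for v in ready:
            values[v] = sum(values[t] if t in values else int(t) for t in exprs[v])
            order.append(v)
    return values, order
```

For the equation system from the text this produces the evaluation order (C, D, B, A), one of the two correct orders mentioned above.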
Examples
Dependency graphs are used in:
Automated software installers. They walk the graph looking for software packages that are required but not yet
installed. The dependency is given by the coupling of the packages.
Software build scripts such as the Unix Make system or Apache Ant. They need to know what files have changed
so only the correct files need to be recompiled.
In Compiler technology and formal language implementation:
Instruction Scheduling. Dependency graphs are computed for the operands of assembly or intermediate
instructions and used to determine an optimal order for the instructions.
Dead code elimination. If no side-effecting operation depends on a variable, the variable is considered dead and
can be removed.
Spreadsheet calculators. They need to derive a correct calculation order similar to the one in the example used in
this article.
Web form standards such as XForms, to know what visual elements to update if data in the model changes.
Dependency graphs are one aspect of:
Manufacturing Plant Types. Raw materials are processed into products via several dependent stages.
Job Shop Scheduling. A collection of related theoretical problems in computer science.
References
Balmas, Françoise (2001), "Displaying dependence graphs: a hierarchical approach" [1], [2], Proc. Eighth
Working Conference on Reverse Engineering (WCRE'01), p. 261.
References
[1] http://www.ai.univ-paris8.fr/~fb/version-ps/pdep.ps
[2] http://doi.ieeecomputersociety.org/10.1109/WCRE.2001.957830
An equivalence relation
An alternative way to define connected components involves the equivalence classes of an equivalence relation that
is defined on the vertices of the graph. (Figure: a graph with three connected components.) In an undirected graph, a
vertex v is reachable from a vertex u
if there is a path from u to v. In this definition, a single vertex is counted as a path of length zero, and the same vertex
may occur more than once within a path. Reachability is an equivalence relation, since:
It is reflexive: There is a trivial path of length zero from any vertex to itself.
It is symmetric: If there is a path from u to v, the same edges form a path from v to u.
It is transitive: If there is a path from u to v and a path from v to w, the two paths may be concatenated together to
form a path from u to w.
The connected components are then the induced subgraphs formed by the equivalence classes of this relation.
Algorithms
It is straightforward to compute the connected components of a graph in linear time (in terms of the numbers of the
vertices and edges of the graph) using either breadth-first search or depth-first search. In either case, a search that
begins at some particular vertex v will find the entire connected component containing v (and no more) before
returning. To find all the connected components of a graph, loop through its vertices, starting a new breadth first or
depth first search whenever the loop reaches a vertex that has not already been included in a previously found
connected component. Hopcroft and Tarjan (1973)[1] describe essentially this algorithm, and state that at that point it
was "well known".
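The looped breadth-first search just described might be sketched as follows; the names are ours, and adj is assumed to map each vertex to an iterable of its neighbours:

```python
from collections import deque

def connected_components(adj):
    """Return the connected components of an undirected graph.

    A breadth-first search started at an unvisited vertex finds exactly
    the component containing it; looping over all vertices and starting
    a new search at each unseen one yields every component in linear
    time overall.
    """
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        component, queue = [], deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components
```

Depth-first search works identically here; only the order in which vertices inside a component are reported changes.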
There are also efficient algorithms to dynamically track the connected components of a graph as vertices and edges
are added, as a straightforward application of disjoint-set data structures. These algorithms require amortized O(α(n))
time per operation, where adding vertices and edges and determining the connected component in which a vertex
falls are both operations, and α(n) is a very slow-growing inverse of the very quickly growing Ackermann function.
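A minimal sketch of this incremental variant, assuming the standard union-by-rank and path-halving heuristics that give the α(n) amortized bound; the class and method names are ours:

```python
class DisjointSet:
    """Disjoint-set (union-find) structure for incremental connectivity."""

    def __init__(self):
        self.parent, self.rank = {}, {}

    def add_vertex(self, v):
        if v not in self.parent:
            self.parent[v], self.rank[v] = v, 0

    def find(self, v):
        # representative of v's component, with path halving
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def add_edge(self, u, v):
        # union by rank of the two components
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
```

Two vertices are in the same connected component exactly when find returns the same representative for both.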
A related problem is tracking connected components as all edges are deleted from a graph, one by one; an algorithm
exists to solve this with constant time per query, and O(|V||E|) time to maintain the data structure; this is an
amortized cost of O(|V|) per edge deletion. For forests, the cost can be reduced to O(q + |V| log |V|), or O(log |V|)
amortized cost per edge deletion.[2]
Researchers have also studied algorithms for finding connected components in more limited models of computation,
such as programs in which the working memory is limited to a logarithmic number of bits (defined by the
complexity class L). Lewis & Papadimitriou (1982) asked whether it is possible to test in logspace whether two
vertices belong to the same connected component of an undirected graph, and defined a complexity class SL of
problems logspace-equivalent to connectivity. Finally, Reingold (2008) succeeded in finding an algorithm that solves
this connectivity problem in logarithmic space, showing that L = SL.
References
[1] Hopcroft, J.; Tarjan, R. (1973). "Efficient algorithms for graph manipulation". Communications of the ACM 16 (6): 372–378.
doi:10.1145/362248.362272.
[2] Shiloach, Y.; Even, S. (1981). "An on-line edge-deletion problem". Journal of the ACM 28 (1): 1–4.
Lewis, Harry R.; Papadimitriou, Christos H. (1982), "Symmetric space-bounded computation", Theoretical
Computer Science 19 (2): 161–187, doi:10.1016/0304-3975(82)90058-5.
Reingold, Omer (2008), "Undirected connectivity in log-space", Journal of the ACM 55 (4): Article 17, 24 pages,
doi:10.1145/1391289.1391291.
External links
Connected components (http://www.cs.sunysb.edu/~algorith/files/dfs-bfs.shtml), Steven Skiena, The Stony
Brook Algorithm Repository
Edge connectivity
In graph theory, a graph is k-edge-connected if it remains connected whenever fewer than k edges are removed.
Formal definition
Let G = (V, E) be an arbitrary graph. If the subgraph G′ = (V, E \ X) is connected for all X ⊆ E where |X| < k, then G is
k-edge-connected. Trivially, a graph that is k-edge-connected is also (k − 1)-edge-connected.
Computational aspects
There is a polynomial-time algorithm to determine the largest k for which a graph G is k-edge-connected. A simple
algorithm would, for every pair (u,v), determine the maximum flow from u to v with the capacity of all edges in G
set to 1 for both directions. A graph is k-edge-connected if and only if the maximum flow from u to v is at least k for
any pair (u,v), so k is the least u-v-flow among all (u,v).
If |V| is the number of vertices in the graph, this simple algorithm would perform O(|V|²) iterations of the maximum
flow problem, each of which can be solved in O(|V|³) time, so the complexity is O(|V|⁵) in total.
An improved algorithm will solve the maximum flow problem for every pair (u,v) where u is arbitrarily fixed while v
varies over all vertices. This reduces the complexity to O(|V|⁴) and is sound since, if a cut of capacity less than k
exists, it is bound to separate u from some other vertex.
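The improved scheme can be sketched with a plain breadth-first augmenting-path max flow (unit capacities in both directions), fixing an arbitrary u and minimizing over all other v. This is illustrative, not the fastest known method, and all names are ours:

```python
from collections import deque

def edge_connectivity(adj):
    """Edge connectivity of an undirected graph given as an adjacency dict.

    For a fixed vertex u, the u-v max flow with unit capacities equals the
    number of edge-disjoint u-v paths; the minimum over all v is the edge
    connectivity.  Each flow is at most |V|, so simple augmentation suffices.
    """
    vertices = list(adj)
    u = vertices[0]

    def max_flow(s, t):
        # residual capacities: 1 per direction of each undirected edge
        cap = {v: {w: 1 for w in adj[v]} for v in adj}
        flow = 0
        while True:
            parent = {s: None}
            queue = deque([s])
            while queue and t not in parent:
                v = queue.popleft()
                for w, c in cap[v].items():
                    if c > 0 and w not in parent:
                        parent[w] = v
                        queue.append(w)
            if t not in parent:
                return flow
            v = t
            while parent[v] is not None:  # augment along the path found
                cap[parent[v]][v] -= 1
                cap[v][parent[v]] += 1
                v = parent[v]
            flow += 1

    return min(max_flow(u, v) for v in vertices if v != u)
```

On a 4-cycle every vertex pair is joined by exactly two edge-disjoint paths, so the edge connectivity is 2.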
A related problem, finding the minimum k-edge-connected spanning subgraph of G (that is, selecting as few edges as
possible in G such that the selection is k-edge-connected), is NP-hard for k ≥ 2.[1]
References
[1] M.R. Garey and D.S. Johnson. Computers and Intractability: a Guide to the Theory of NP-Completeness. Freeman, San Francisco, CA, 1979.
Vertex connectivity
In graph theory, a graph G with vertex set V(G) is said to be k-vertex-connected (or k-connected) if the graph
remains connected when you delete fewer than k vertices from the graph. Alternatively, a graph is k-connected if k
is the size of the smallest subset of vertices such that the graph becomes disconnected if you delete them.[1]
An equivalent definition for graphs that are not complete is that a graph is k-connected if any two of its vertices can
be joined by k independent paths; see Menger's theorem (Diestel 2005, p. 55). However, for complete graphs the two
definitions differ: the n-vertex complete graph has unbounded connectivity according to the definition based on
deleting vertices, but connectivity n − 1 according to the definition based on independent paths, and some authors
use alternative definitions according to which its connectivity is n.[1]
A 1-vertex-connected graph is called connected, while a 2-vertex-connected graph is said to be biconnected.
The vertex-connectivity, or just connectivity, of a graph is the largest k for which the graph is k-vertex-connected.
The 1-skeleton of any k-dimensional convex polytope forms a k-vertex-connected graph (Balinski's theorem,
Balinski 1961). As a partial converse, Steinitz's theorem states that any 3-vertex-connected planar graph forms the
skeleton of a convex polyhedron.
Notes
[1] Schrijver, Combinatorial Optimization, Springer
References
Balinski, M. L. (1961), "On the graph structure of convex polyhedra in n-space" (http://www.projecteuclid.org/
Dienst/UI/1.0/Summarize/euclid.pjm/1103037323), Pacific Journal of Mathematics 11 (2): 431–434.
Diestel, Reinhard (2005), Graph Theory (http://www.math.uni-hamburg.de/home/diestel/books/graph.
theory/) (3rd ed.), Berlin, New York: Springer-Verlag, ISBN 978-3-540-26183-4.
References
Menger, Karl (1927), "Zur allgemeinen Kurventheorie", Fundamenta Mathematicae 10: 96–115.
Aharoni, Ron; Berger, Eli (2009), "Menger's theorem for infinite graphs" [1], Inventiones Mathematicae 176:
1–62, doi:10.1007/s00222-008-0157-3.
Halin, R. (1974), "A note on Menger's theorem for infinite locally finite graphs", Abhandlungen aus dem
Mathematischen Seminar der Universität Hamburg 40: 111–114, MR0335355.
External links
References
[1] http://www.springerlink.com/content/267k231365284lr6/?p=ddccdd0319b24e53958e286488757ca7&pi=0
[2] http://www.math.unm.edu/~loring/links/graph_s05/Menger.pdf
[3] http://www.math.fau.edu/locke/Menger.htm
[4] http://gepard.bioinformatik.uni-saarland.de/teaching/ws-2008-09/bioinformatik-3/lectures/V12-NetworkFlow.pdf
[5] http://gepard.bioinformatik.uni-saarland.de/teaching/ws-2008-09/bioinformatik-3/lectures/V13-MaxFlowMinCut.pdf
Ear decomposition
In graph theory, an ear of an undirected graph G is a path P where the
two endpoints of the path may coincide, but where otherwise no
repetition of edges or vertices is allowed, so every internal vertex of P
has degree two in P. An ear decomposition of an undirected graph G
is a partition of its set of edges into a sequence of ears, such that the
one or two endpoints of each ear belong to earlier ears in the sequence
and such that the internal vertices of each ear do not belong to any
earlier ear. Additionally, in most cases the first ear in the sequence
must be a cycle. An open ear decomposition or a proper ear
decomposition is an ear decomposition in which the two endpoints of
each ear after the first are distinct from each other.
Ear decompositions may be used to characterize several important graph classes, and as part of efficient graph
algorithms. They may also be generalized from graphs to matroids.
Graph connectivity
A graph is k-vertex-connected if the removal of any (k − 1) vertices leaves a connected subgraph, and
k-edge-connected if the removal of any (k − 1) edges leaves a connected subgraph.
The following result is due to Hassler Whitney (1932):
A graph G with at least two vertices is 2-vertex-connected if and only if G has an open ear decomposition.
Factor-critical graphs
An ear decomposition is odd if each of its ears uses an odd number of edges. A factor-critical graph is a graph with
an odd number of vertices, such that for each vertex v, if v is removed from the graph then the remaining vertices
have a perfect matching. László Lovász (1972) found that:
A graph G is factor-critical if and only if G has an odd ear decomposition.
More generally, a result of Frank (1993) makes it possible to find in any graph G the ear decomposition with the
fewest even ears.
Series-parallel graphs
A tree ear decomposition is a proper ear decomposition in which the first ear is a single edge and, for each
subsequent ear P_i, there is a single earlier ear P_j (with j < i) such that both endpoints of P_i lie on P_j
(Khuller 1989). A nested ear decomposition is a tree ear decomposition such that, within each ear P_j, the sets of
endpoints of the other ears P_i that lie within it form a set of nested intervals. A series-parallel graph is a graph with two designated
terminals s and t that can be formed recursively by combining smaller series-parallel graphs in one of two ways:
series composition (identifying one terminal from one smaller graph with one terminal from the other smaller graph,
and keeping the other two terminals as the terminals of the combined graph) and parallel composition (identifying
both pairs of terminals from the two smaller graphs).
The following result is due to David Eppstein (1992):
A 2-vertex-connected graph is series-parallel if and only if it has a nested ear decomposition.
Moreover, any open ear decomposition of a 2-vertex-connected series-parallel graph must be nested. The result may
be extended to series-parallel graphs that are not 2-vertex-connected by using open ear decompositions that start with
a path between the two terminals.
Matroids
The concept of an ear decomposition can be extended from graphs to matroids. An ear decomposition of a matroid is
defined to be a sequence of circuits of the matroid, with two properties:
each circuit in the sequence has a nonempty intersection with the previous circuits, and
each circuit in the sequence remains a circuit even if all previous circuits in the sequence are contracted.
When applied to the graphic matroid of a graph G, this definition of an ear decomposition coincides with the
definition of a proper ear decomposition of G: improper decompositions are excluded by the requirement that each
circuit include at least one edge that also belongs to previous circuits. Using this definition, a matroid may be defined
as factor-critical when it has an ear decomposition in which each circuit in the sequence has an odd number of new
elements (Szegedy & Szegedy 2006).
Algorithms
On classical computers, ear decompositions of 2-edge-connected graphs and open ear decompositions of
2-vertex-connected graphs may be found by a greedy algorithm that finds each ear one at a time.
Lovász (1985), Maon, Schieber & Vishkin (1986), and Miller & Ramachandran (1986) provided efficient parallel
algorithms for constructing ear decompositions of various types. For instance, to find an ear decomposition of a
2-edge-connected graph, the algorithm of Maon, Schieber & Vishkin (1986) proceeds according to the following
steps:
1. Find a spanning tree of the given graph and choose a root for the tree.
2. Determine, for each edge uv that is not part of the tree, the distance between the root and the lowest common
ancestor of u and v.
3. For each edge uv that is part of the tree, find the corresponding "master edge", a non-tree edge wx such that the
cycle formed by adding wx to the tree passes through uv and such that, among such edges, w and x have a lowest
common ancestor that is as close to the root as possible (with ties broken by edge identifiers).
4. Form an ear for each non-tree edge, consisting of it and the tree edges for which it is the master, and order the
ears by their master edges' distance from the root (with the same tie-breaking rule).
These algorithms may be used as subroutines for other problems including testing connectivity, recognizing
series-parallel graphs, and constructing st-numberings of graphs (an important subroutine in planarity testing).
References
Eppstein, D. (1992), "Parallel recognition of series-parallel graphs", Information & Computation 98 (1): 41–55,
doi:10.1016/0890-5401(92)90041-D, MR1161075.
Frank, András (1993), "Conservative weightings and ear-decompositions of graphs", Combinatorica 13 (1):
65–81, doi:10.1007/BF01202790, MR1221177.
Gross, Jonathan L.; Yellen, Jay (2006), "Characterization of strongly orientable graphs" [1], Graph theory and its
applications, Discrete Mathematics and its Applications (Boca Raton) (2nd ed.), Chapman & Hall/CRC, Boca
Raton, FL, pp. 498–499, ISBN 978-1-58488-505-4, MR2181153.
Khuller, Samir (1989), "Ear decompositions" [2], SIGACT News 20 (1): 128.
Lovász, László (1972), "A note on factor-critical graphs", Studia Sci. Math. Hung. 7: 279–280, MR0335371.
Lovász, László (1985), "Computing ears and branchings in parallel", 26th Annual Symposium on Foundations of
Computer Science, pp. 464–467, doi:10.1109/SFCS.1985.16.
Maon, Y.; Schieber, B.; Vishkin, U. (1986), "Parallel ear decomposition search (EDS) and ST-numbering in
graphs", Theoretical Computer Science 47 (3), doi:10.1016/0304-3975(86)90153-2, MR0882357.
Miller, G.; Ramachandran, V. (1986), Efficient parallel ear decomposition with applications, Unpublished
manuscript.
Robbins, H. E. (1939), "A theorem on graphs, with an application to a problem of traffic control", The American
Mathematical Monthly 46: 281–283.
Schrijver, Alexander (2003), Combinatorial Optimization. Polyhedra and Efficiency. Vol. A, Springer-Verlag,
ISBN 978-3-540-44389-6.
Szegedy, Balázs; Szegedy, Christian (2006), "Symplectic spaces and ear-decomposition of matroids",
Combinatorica 26 (3): 353–377, doi:10.1007/s00493-006-0020-3, MR2246153.
Whitney, H. (1932), "Non-separable and planar graphs", Transactions of the American Mathematical Society 34:
339–362.
References
[1] http://books.google.com/books?id=unEloQ_sYmkC&pg=PA498
[2] http://portalparts.acm.org/70000/65780/bm/backmatter.pdf
A graph with n nodes can contain at most n − 1 bridges, since adding any additional edge must create a cycle.
Bridgeless graphs
A bridgeless graph is a graph that does not have any bridges. Equivalent conditions are that each connected
component of the graph has an open ear decomposition,[3] that each connected component is 2-edge-connected, or
(by Robbins' theorem) that every connected component has a strong orientation.[3]
An important open problem involving bridges is the cycle double cover conjecture, due to Seymour and Szekeres
(1978 and 1979, independently), which states that every bridgeless graph admits a set of simple cycles which
contains each edge exactly twice.[4]
Bridge-finding algorithm
A linear time algorithm for finding the bridges in a graph was described by Robert Tarjan in 1974.[5] It performs the
following steps:
1. Find a spanning forest of G.
2. Create a rooted forest F from the spanning forest.
3. Traverse the forest F in preorder and number the nodes. Parent nodes in the forest now have lower numbers than
child nodes.
4. For each node v in preorder (denoting each node by its preorder number), do:
1. Compute the number of forest descendants ND(v) for this node, by adding one to the sum of its children's
descendants.
2. Compute L(v), the lowest preorder label reachable from the subtree rooted at v by a path for which all but the
last edge stays within the subtree. This is the minimum of the preorder label of v, of the values of L(w) at child
nodes w of v, and of the preorder labels of nodes reachable from v by edges that do not belong to F.
3. Similarly, compute H(v), the highest preorder label reachable by a path for which all but the last edge stays
within the subtree rooted at v.
4. For each node w with parent node v, if L(w) = w and H(w) < w + ND(w), then the edge from v to w
is a bridge.
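A common linear-time variant of this idea uses lowpoints computed by a single depth-first search rather than Tarjan's ND/L/H labels; a recursive Python sketch for a simple graph (all names are ours):

```python
def find_bridges(adj):
    """Bridges of an undirected simple graph via one depth-first search.

    A tree edge (v, w) is a bridge iff low[w] > num[v], where num is the
    preorder number and low[w] is the lowest preorder number reachable
    from w's subtree using at most one back edge.  The `w != parent`
    test assumes there are no parallel edges.
    """
    num, low, bridges = {}, {}, []
    counter = 0

    def dfs(v, parent):
        nonlocal counter
        num[v] = low[v] = counter
        counter += 1
        for w in adj[v]:
            if w not in num:              # tree edge v-w
                dfs(w, v)
                low[v] = min(low[v], low[w])
                if low[w] > num[v]:
                    bridges.append((v, w))
            elif w != parent:             # back edge
                low[v] = min(low[v], num[w])

    for v in adj:
        if v not in num:
            dfs(v, None)
    return bridges
```

For two triangles joined by a single edge, only the joining edge is reported.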
Notes
[1] Bollobás, Béla (1998), Modern Graph Theory (http://books.google.com/books?id=SbZKSZ-1qrwC&pg=PA6), Graduate Texts in
Mathematics, 184, New York: Springer-Verlag, p. 6, doi:10.1007/978-1-4612-0619-4, ISBN 0-387-98488-7, MR1633290.
[2] Westbrook, Jeffery; Tarjan, Robert E. (1992), "Maintaining bridge-connected and biconnected components on-line", Algorithmica 7 (5-6):
433–464, doi:10.1007/BF01758773, MR1154584.
[3] Robbins, H. E. (1939), "A theorem on graphs, with an application to a problem of traffic control", The American Mathematical Monthly 46:
281–283.
[4] Jaeger, F. (1985), "A survey of the cycle double cover conjecture", Annals of Discrete Mathematics 27: Cycles in Graphs, North-Holland
Mathematics Studies, 27, pp. 1–12, doi:10.1016/S0304-0208(08)72993-1.
[5] Tarjan, R. Endre (1974), "A note on finding the bridges of a graph", Information Processing Letters 2 (6): 160–161,
doi:10.1016/0020-0190(74)90003-9, MR0349483.
Algorithm
The depth is standard to maintain during a depth-first search. The lowpoint of v can be computed after visiting all
descendants of v (i.e., just before v gets popped off the depth-first-search stack) as the minimum of the depth of v,
the depth of all neighbors of v (other than the parent of v in the depth-first-search tree) and the lowpoint of all
children of v in the depth-first-search tree.
The key fact is that a nonroot vertex v is a cut vertex (or articulation point) separating two biconnected components
if and only if there is a child y of v such that lowpoint(y) ≥ depth(v). This property can be tested once the depth-first
search has returned from every child of v (i.e., just before v gets popped off the depth-first-search stack), and if true, v
separates the graph into different biconnected components. This can be represented by computing one biconnected
component out of every such y (a component which contains y will contain the subtree of y, plus v), and then erasing
the subtree of y from the tree.
The root vertex must be handled separately: it is a cut vertex if and only if it has at least two children. Thus, it
suffices to simply build one component out of each child subtree of the root (including the root).
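The lowpoint rule above, including the special treatment of the root, might be sketched as follows (recursive DFS, assuming a connected simple graph given as an adjacency dict; names are ours):

```python
def articulation_points(adj):
    """Cut vertices of a connected undirected graph.

    A nonroot vertex v is a cut vertex iff some DFS child y satisfies
    lowpoint(y) >= depth(v); the root is a cut vertex iff it has at
    least two children in the DFS tree.
    """
    depth, low, cuts = {}, {}, set()

    def dfs(v, parent, d):
        depth[v] = low[v] = d
        children = 0
        for w in adj[v]:
            if w not in depth:            # tree edge
                children += 1
                dfs(w, v, d + 1)
                low[v] = min(low[v], low[w])
                if parent is not None and low[w] >= depth[v]:
                    cuts.add(v)
            elif w != parent:             # back edge
                low[v] = min(low[v], depth[w])
        if parent is None and children >= 2:
            cuts.add(v)

    root = next(iter(adj))
    dfs(root, None, 0)
    return cuts
```

On a path of three vertices the middle vertex is the only cut vertex; two triangles sharing a vertex have only the shared vertex.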
Other algorithms
In the online version of the problem, vertices and edges are added (but not removed) dynamically, and a data
structure must maintain the biconnected components. Jeffery Westbrook and Robert Tarjan (1992) [2] developed an
efficient data structure for this problem based on disjoint-set data structures. Specifically, it processes n vertex
additions and m edge additions in O(m(m,n)) total time, where is the inverse Ackermann function. This time
bound is proved to be optimal.
Uzi Vishkin and Robert Tarjan (1985) [3] designed a parallel algorithm on CRCW PRAM that runs in O(logn) time
with n+m processors. Guojing Cong and David A. Bader (2005) [4] developed an algorithm that achieves a speedup
of 5 with 12 processors on SMPs. Speedups exceeding 30 based on the original Tarjan-Vishkin algorithm were
reported by James A. Edwards and Uzi Vishkin (2012).[5]
Notes
[1] Hopcroft, J.; Tarjan, R. (1973). "Efficient algorithms for graph manipulation". Communications of the ACM 16 (6): 372–378.
doi:10.1145/362248.362272.
[2] Westbrook, J.; Tarjan, R. E. (1992). "Maintaining bridge-connected and biconnected components on-line". Algorithmica 7: 433–464.
doi:10.1007/BF01758773.
[3] Tarjan, R.; Vishkin, U. (1985). "An efficient parallel biconnectivity algorithm". SIAM Journal on Computing 14 (4): 862–874.
doi:10.1137/0214061.
[4] Cong, Guojing; Bader, David A. (2005). "An experimental study of parallel biconnected components algorithms on symmetric
multiprocessors (SMPs)". Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium.
p. 45b. doi:10.1109/IPDPS.2005.100.
[5] Edwards, J. A.; Vishkin, U. (2012). "Better speedups using simpler parallel programming for graph connectivity and biconnectivity".
Proceedings of the 2012 International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM '12).
p. 103. doi:10.1145/2141702.2141714. ISBN 9781450312110.
References
Eugene C. Freuder (1985). "A Sufficient Condition for Backtrack-Bounded Search". Journal of the Association
for Computing Machinery 32 (4): 755–761. doi:10.1145/4221.4225.
External links
The tree of the biconnected components Java implementation (http://code.google.com/p/jbpt/) in the jBPT
library (see BCTree class).
Structure
An SPQR tree takes the form of an unrooted tree in which for each node x there is associated an undirected graph or
multigraph Gx. The node, and the graph associated with it, may have one of four types, given the initials SPQR:
In an S node, the associated graph is a cycle graph with three or more vertices and edges. This case is analogous
to series composition in series-parallel graphs; the S stands for "series".[3]
In a P node, the associated graph is a dipole graph, a multigraph with two vertices and three or more edges, the
planar dual to a cycle graph. This case is analogous to parallel composition in series-parallel graphs; the P stands
for "parallel".[3]
In a Q node, the associated graph has a single edge. This trivial case is necessary to handle the graph that has only
one edge, but does not appear in the SPQR trees of more complex graphs.
In an R node, the associated graph is a 3-connected graph that is not a cycle or dipole. The R stands for "rigid": in
the application of SPQR trees in planar graph embedding, the associated graph of an R node has a unique planar
embedding.[3]
Each edge xy between two nodes of the SPQR tree is associated with two directed virtual edges, one of which is an
edge in Gx and the other of which is an edge in Gy. Each edge in a graph Gx may be a virtual edge for at most one
SPQR tree edge.
An SPQR tree T represents a 2-connected graph GT, formed as follows. Whenever SPQR tree edge xy associates the
virtual edge ab of Gx with the virtual edge cd of Gy, form a single larger graph by merging a and c into a single
supervertex, merging b and d into another single supervertex, and deleting the two virtual edges. That is, the larger
graph is the 2-clique-sum of Gx and Gy. Performing this gluing step on each edge of the SPQR tree produces the
graph GT; the order of performing the gluing steps does not affect the result. Each vertex in one of the graphs Gx may
be associated in this way with a unique vertex in GT, the supervertex into which it was merged.
Typically, it is not allowed within an SPQR tree for two S nodes to be adjacent, nor for two P nodes to be adjacent,
because if such an adjacency occurred the two nodes could be merged into a single larger node. With this
assumption, the SPQR tree is uniquely determined from its graph. When a graph G is represented by an SPQR tree
with no adjacent P nodes and no adjacent S nodes, then the graphs Gx associated with the nodes of the SPQR tree are
known as the triconnected components of G.
Notes
[1] Hopcroft & Tarjan (1973); Gutwenger & Mutzel (2001).
[2] E.g., Hopcroft & Tarjan (1973) and Bienstock & Monma (1988), both of which are cited as precedents by Di Battista and Tamassia.
[3] Di Battista & Tamassia (1989).
References
Bienstock, Daniel; Monma, Clyde L. (1988), "On the complexity of covering vertices by faces in a planar graph",
SIAM Journal on Computing 17 (1): 53–76, doi:10.1137/0217004.
Di Battista, Giuseppe; Tamassia, Roberto (1989), "Incremental planarity testing", Proc. 30th Annual Symposium
on Foundations of Computer Science, pp. 436–441, doi:10.1109/SFCS.1989.63515.
Di Battista, Giuseppe; Tamassia, Roberto (1990), "On-line graph algorithms with SPQR-trees", Proc. 17th
International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, 443,
Springer-Verlag, pp. 598–611, doi:10.1007/BFb0032061.
Di Battista, Giuseppe; Tamassia, Roberto (1996), "On-line planarity testing" (http://cs.brown.edu/research/
pubs/pdfs/1996/DiBattista-1996-OPT.pdf), SIAM Journal on Computing 25 (5): 956–997,
doi:10.1137/S0097539794280736.
Gutwenger, Carsten; Mutzel, Petra (2001), "A linear time implementation of SPQR-trees", Proc. 8th International
Symposium on Graph Drawing (GD 2000), Lecture Notes in Computer Science, 1984, Springer-Verlag,
pp. 77–90, doi:10.1007/3-540-44541-2_8.
Hopcroft, John; Tarjan, Robert (1973), "Dividing a graph into triconnected components", SIAM Journal on
Computing 2 (3): 135–158, doi:10.1137/0202012.
Mac Lane, Saunders (1937), "A structural characterization of planar combinatorial graphs", Duke Mathematical
Journal 3 (3): 460–472, doi:10.1215/S0012-7094-37-00336-3.
External links
SPQR tree implementation (http://www.ogdf.net/doc-ogdf/classogdf_1_1_s_p_q_r_tree.html) in the Open
Graph Drawing Framework.
The tree of the triconnected components Java implementation (http://code.google.com/p/jbpt/) in the jBPT
library (see TCTree class).
In a graph, a cut partitions the vertices into a pair of disjoint subsets S and T; a minimum cut is a cut crossed by as
few edges as possible.
Algorithm description
Before describing the algorithm, we have to define the contraction of two nodes: combining two different nodes u
and v of a graph into a new node u′ that is incident to every edge that was incident to either u or v, except for the
edge(s) connecting u and v. Parallel edges produced by a contraction are kept, so the result is in general a multigraph.
An example of the contraction of two nodes is displayed in the accompanying figure.
Example
This is an example of executing the inner while loop of Karger's Basic algorithm once. There are 5 nodes and 7
edges in the graph G. The min-cut of G is 2, while after one execution of the inner while loop, the cut is 4.
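The inner loop of Karger's basic algorithm can be sketched as follows. As a simplification, the contraction is simulated with a union-find map rather than by rebuilding the multigraph, the graph is assumed connected, and all names are ours:

```python
import random

def karger_min_cut(edges, n_trials=None):
    """Karger's basic contraction algorithm (a sketch).

    edges is a list of vertex pairs (a multigraph).  One trial repeatedly
    contracts a uniformly random edge until two supervertices remain and
    then counts the edges crossing between them; repeating the trial
    O(n^2 log n) times makes returning a minimum cut likely.
    """
    vertices = {v for e in edges for v in e}
    n = len(vertices)
    if n_trials is None:
        n_trials = n * n

    def one_trial():
        group = {v: v for v in vertices}   # supervertex of each vertex

        def find(v):
            while group[v] != v:
                group[v] = group[group[v]]
                v = group[v]
            return v

        remaining = n
        live = list(edges)
        while remaining > 2:
            u, v = random.choice(live)
            ru, rv = find(u), find(v)
            if ru == rv:
                live.remove((u, v))        # self-loop, discard it
                continue
            group[rv] = ru                 # contract the chosen edge
            remaining -= 1
        return sum(1 for u, v in live if find(u) != find(v))

    return min(one_trial() for _ in range(n_trials))
```

On a triangle every trial contracts one edge and leaves the other two crossing, so the returned cut is always 2.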
Analysis
Proof of correctness
Lemma 1: Let k denote the size of the minimum cut of G. Every node in the graph then has degree at least k.
Proof: If there existed a node N in G whose degree is smaller than k, then putting N on one side and all other nodes
on the other side would yield a cut smaller than k, a contradiction.
Theorem: With high probability we can find the min-cut of given graph G by executing Karger's algorithm.
Proof: Let C denote the edge set of a minimum cut of size k. At each stage i we have n − i nodes and, by Lemma 1,
at least (n − i)k/2 edges. As such, the probability of selecting an edge of C for the contraction is at most
k / ((n − i)k/2) = 2/(n − i). We run n − 2 contractions to reduce a graph of n nodes to a graph of only two nodes,
with i ranging from 0 to n − 3. Thus, the probability of C surviving all n − 2 contractions is at least
∏_{i=0}^{n−3} (1 − 2/(n − i)) = 2/(n(n − 1)).
Therefore, a single execution of the inner while loop of Karger's basic algorithm finds the min-cut C with
probability at least 2/(n(n − 1)) ≥ 1/n². Let Pr[Failure] denote the probability of failing to find the min-cut in one
execution of the inner while loop. If we execute the inner while loop T = cn(n − 1)/2 = O(n²) times, the probability
of successfully returning C is
Pr[Success] = 1 − Pr[Failure]^T ≥ 1 − (1 − 2/(n(n − 1)))^T ≥ 1 − e^(−c).
Running time
Karger's algorithm, with the Karger–Stein refinement described below, is the fastest known minimum cut algorithm:
it is randomized and computes a minimum cut with high probability in time O(|V|² log³ |V|). To prove this, the
authors first note that contracting a graph from n vertices down to t vertices can be implemented in O(n²) time, so
one complete run of the contraction procedure also takes O(n²) time.
Algorithm description
Inspired by the formula above, observe that the fewer edges are left in the graph, the smaller the chance that C
survives a contraction: mistakes are most likely near the end of the contraction process. The Karger–Stein
improvement therefore contracts the graph only down to about n/√2 + 1 vertices, then recurses twice independently
on the resulting smaller graph, and returns the smaller of the two cuts found.
Running time
The probability P(n) that the min cut survives (n is the number of vertices) then satisfies the recurrence relation
P(n) = 1 − (1 − P(n/√2)/2)²,
which solves to P(n) = Ω(1/log n). Repeating the whole procedure O(log² n) times therefore succeeds with high
probability, and since one recursive run takes T(n) = 2T(n/√2) + O(n²) = O(n² log n) time, the total running time is
O(n² log³ n).
Improvement bound
To determine a min-cut, one has to touch every edge in the graph at least once, which takes Ω(n²) time in a dense
graph; the Karger–Stein min-cut algorithm, at O(n² log³ n), is therefore within polylogarithmic factors of optimal.
References
[1] Karger, David (1993). "Global Min-cuts in RNC and Other Ramifications of a Simple Mincut Algorithm" (http://people.csail.mit.edu/
karger/Papers/mincut.ps). Proc. 4th Annual ACM-SIAM Symposium on Discrete Algorithms.
[2] Karger, David; Stein, Clifford (1996). "A New Approach to the Minimum Cut Problem" (http://people.csail.mit.edu/karger/Papers/
contract.ps). Journal of the ACM 43 (4): 601–640.
References
Aspvall, Bengt; Plass, Michael F.; Tarjan, Robert E. (1979), "A linear-time algorithm for testing the truth of
certain quantified boolean formulas", Information Processing Letters 8 (3): 121–123,
doi:10.1016/0020-0190(79)90002-4.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms,
Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 22.5, pp. 552–557.
External links
Java implementation for computation of strongly connected components [1] in the jBPT library (see
StronglyConnectedComponents class).
References
[1] http://code.google.com/p/jbpt/
Overview
The algorithm takes a directed graph as input, and produces a partition of the graph's vertices into the graph's
strongly connected components. Every vertex of the graph appears in a single strongly connected component, even if
it means a vertex appears in a strongly connected component by itself (as is the case with tree-like parts of the graph,
as well as any vertex with no successor or no predecessor).
The basic idea of the algorithm is this: a depth-first search begins from an arbitrary start node (and subsequent
depth-first searches are conducted on any nodes that have not yet been found). The search does not explore any node
that has already been explored. The strongly connected components form the subtrees of the search tree, the roots of
which are the roots of the strongly connected components.
The nodes are placed on a stack in the order in which they are visited. When the search returns from a subtree, the
nodes are taken from the stack and it is determined whether each node is the root of a strongly connected component.
If a node is the root of a strongly connected component, then it and all of the nodes taken off before it form that
strongly connected component.
Remarks
1. Complexity: The Tarjan procedure is called once for each node; the forall statement considers each edge at most
twice. The algorithm's running time is therefore linear in the number of nodes and edges in G, i.e. O(|V| + |E|).
2. The test for whether v' is on the stack should be done in constant time, for example, by testing a flag stored on
each node that indicates whether it is on the stack.
3. While there is nothing special about the order of the nodes within each strongly connected component, one useful
property of the algorithm is that no strongly connected component will be identified before any of its successors.
Therefore, the order in which the strongly connected components are identified constitutes a reverse topological
sort of the DAG formed by the strongly connected components.[2]
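The procedure described above can be sketched in Python (an illustrative sketch; the recursive structure and the on-stack flag of remark 2 follow the description, but the names and dict-based graph representation are ours):

```python
def tarjan_scc(graph):
    """Tarjan's strongly connected components algorithm.

    graph: dict mapping each vertex to a list of successor vertices.
    Returns a list of components (each a list of vertices), emitted in
    reverse topological order of the condensation (remark 3).
    """
    index = {}        # discovery index of each visited vertex
    lowlink = {}      # smallest index reachable from the vertex's subtree
    on_stack = set()  # constant-time "is it on the stack?" test (remark 2)
    stack = []
    components = []
    counter = [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:  # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return components
```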
References
[1] Tarjan, R. E. (1972), "Depth-first search and linear graph algorithms", SIAM Journal on Computing 1 (2): 146–160, doi:10.1137/0201010
[2] Harrison, Paul. "Robust topological sorting and Tarjan's algorithm in Python" (http://www.logarithmic.net/pfh/blog/01208083168). Retrieved 9 February 2011.
External links
Description of Tarjan's Algorithm (http://www.ics.uci.edu/~eppstein/161/960220.html#sca)
Implementation of Tarjan's Algorithm in .NET (http://stackoverflow.com/questions/6643076/
tarjan-cycle-detection-help-c#sca)
Implementation of Tarjan's Algorithm in Java (http://algowiki.net/wiki/index.php?title=Tarjan's_algorithm)
Implementation of Tarjan's Algorithm in Python (http://www.logarithmic.net/pfh/blog/01208083168)
Another implementation of Tarjan's Algorithm in Python (https://github.com/bwesterb/py-tarjan/)
Implementation of Tarjan's Algorithm in Javascript (https://gist.github.com/1440602)
Description
The algorithm performs a depth-first search of the given graph G, maintaining as it does two stacks S and P. Stack S
contains all the vertices that have not yet been assigned to a strongly connected component, in the order in which the
depth-first search reaches the vertices. Stack P contains vertices that have not yet been determined to belong to
different strongly connected components from each other. It also uses a counter C of the number of vertices reached
so far, which it uses to compute the preorder numbers of the vertices.
When the depth-first search reaches a vertex v, the algorithm performs the following steps:
1. Set the preorder number of v to C, and increment C.
2. Push v onto S and also onto P.
3. For each edge from v to a neighboring vertex w:
If the preorder number of w has not yet been assigned, recursively search w;
Otherwise, if w has not yet been assigned to a strongly connected component:
Repeatedly pop vertices from P until the top element of P has a preorder number less than or equal to the
preorder number of w.
4. If v is the top element of P:
Pop vertices from S until v has been popped, and assign the popped vertices to a new component.
Pop v from P.
The overall algorithm consists of a loop through the vertices of the graph, calling this recursive search on each vertex
that does not yet have a preorder number assigned to it.
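The numbered steps above can be sketched in Python (an illustrative sketch; the names and the dict-based graph representation are ours):

```python
def path_based_scc(graph):
    """Path-based strong component algorithm, following the steps above.

    graph: dict mapping each vertex to a list of successor vertices.
    Returns a list of strongly connected components.
    """
    preorder = {}     # preorder number of each reached vertex (counter C)
    assigned = set()  # vertices already assigned to a component
    S, P = [], []     # the two stacks described in the text
    components = []
    counter = [0]

    def search(v):
        preorder[v] = counter[0]           # step 1
        counter[0] += 1
        S.append(v)                        # step 2
        P.append(v)
        for w in graph.get(v, []):         # step 3
            if w not in preorder:
                search(w)
            elif w not in assigned:
                while preorder[P[-1]] > preorder[w]:
                    P.pop()
        if P and P[-1] == v:               # step 4
            P.pop()
            component = []
            while True:
                w = S.pop()
                assigned.add(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in preorder:
            search(v)
    return components
```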
Related algorithms
Like this algorithm, Tarjan's strongly connected components algorithm also uses depth-first search together with a
stack to keep track of vertices that have not yet been assigned to a component, and moves these vertices into a new
component when it finishes expanding the final vertex of its component. However, in place of the second stack,
Tarjan's algorithm uses a vertex-indexed array of preorder numbers, assigned in the order that vertices are first
visited in the depth-first search. The preorder array is used to keep track of when to form a new component.
Notes
[1] Sedgewick (2004).
[2] History of Path-based DFS for Strong Components (http://www.cs.colorado.edu/~hal/Papers/DFS/pbDFShistory.html), Hal Gabow, accessed 2012-04-24.
References
Cheriyan, J.; Mehlhorn, K. (1996), "Algorithms for dense graphs and networks on the random access computer",
Algorithmica 15: 521–549, doi:10.1007/BF01940880.
Dijkstra, Edsger (1976), A Discipline of Programming, NJ: Prentice Hall, Ch. 25.
Gabow, Harold N. (2000), "Path-based depth-first search for strong and biconnected components", Information
Processing Letters 74 (3–4): 107–114, doi:10.1016/S0020-0190(00)00051-X, MR1761551.
Munro, Ian (1971), "Efficient determination of the transitive closure of a directed graph", Information Processing
Letters 1: 56–58.
Purdom, P., Jr. (1970), "A transitive closure algorithm", BIT 10: 76–94.
Sedgewick, R. (2004), "19.8 Strong Components in Digraphs", Algorithms in Java, Part 5 – Graph Algorithms
(3rd ed.), Cambridge MA: Addison-Wesley, pp. 205–216.
Complexity
Provided the graph is described using an adjacency list, Kosaraju's algorithm performs two complete traversals of the
graph and so runs in Θ(V+E) (linear) time, which is asymptotically optimal because there is a matching lower bound
(any algorithm must examine all vertices and edges). It is the conceptually simplest efficient algorithm, but is not as
efficient in practice as Tarjan's strongly connected components algorithm and the path-based strong component
algorithm, which perform only one traversal of the graph.
If the graph is represented as an adjacency matrix, the algorithm requires Θ(V²) time.
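Kosaraju's two traversals can be sketched in Python (an illustrative sketch; the article does not fix an implementation, and the dict-based representation is ours):

```python
def kosaraju_scc(graph):
    """Kosaraju's algorithm: one DFS to compute finish order on G, then a
    second traversal of the transpose graph in reverse finish order.

    graph: dict mapping each vertex to a list of successor vertices.
    """
    # First traversal: record vertices in order of completion.
    visited, order = set(), []

    def dfs_finish(v):
        visited.add(v)
        for w in graph.get(v, []):
            if w not in visited:
                dfs_finish(w)
        order.append(v)

    for v in graph:
        if v not in visited:
            dfs_finish(v)

    # Build the transpose (all edges reversed).
    transpose = {v: [] for v in graph}
    for v, succs in graph.items():
        for w in succs:
            transpose.setdefault(w, []).append(v)

    # Second traversal: explore the transpose in reverse finish order;
    # each new traversal tree is one strongly connected component.
    assigned, components = set(), []
    for v in reversed(order):
        if v in assigned:
            continue
        component, stack = [], [v]
        assigned.add(v)
        while stack:
            u = stack.pop()
            component.append(u)
            for w in transpose.get(u, []):
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        components.append(component)
    return components
```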
References
Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman. Data Structures and Algorithms. Addison-Wesley, 1983.
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms, 3rd
edition. The MIT Press, 2009. ISBN 0-262-03384-8 .
Micha Sharir. A strong connectivity algorithm and its applications to data flow analysis. Computers and
Mathematics with Applications 7(1): 67–72, 1981.
External links
A description and proof of Kosaraju's Algorithm [1]
Good Math, Bad Math: Computing Strongly Connected Components [2]
Java implementation at AlgoWiki.net [3]
References
[1] http://lcm.csa.iisc.ernet.in/dsa/node171.html
[2] http://scienceblogs.com/goodmath/2007/10/computing_strongly_connected_c.php
[3] http://algowiki.net/wiki/index.php?title=Kosaraju%27s_algorithm
Application: 2-satisfiability
In computer science, 2-satisfiability (abbreviated as 2-SAT or just 2SAT) is the problem of determining whether a
collection of two-valued (Boolean or binary) variables with constraints on pairs of variables can be assigned values
satisfying all the constraints. It is a special case of the general Boolean satisfiability problem, which can involve
constraints on more than two variables, and of constraint satisfaction problems, which can allow more than two
choices for the value of each variable. But in contrast to those problems, which are NP-complete, it has a known
polynomial time solution. Instances of the 2-satisfiability problem are typically expressed as 2-CNF or Krom
formulas.
Problem representations
A 2-SAT problem may be described using a Boolean expression with a special restricted form: a conjunction of disjunctions (and of ors), where each disjunction (or operation) has two arguments that may either be variables or the negations of variables. The variables or their negations appearing in this formula are known as terms and the disjunctions of pairs of terms are known as clauses. For example, the following formula is in conjunctive normal form, with seven variables and eleven clauses:

(x₀ ∨ x₂) ∧ (x₀ ∨ ¬x₃) ∧ (x₁ ∨ ¬x₃) ∧ (x₁ ∨ ¬x₄) ∧ (x₂ ∨ ¬x₄) ∧ (x₀ ∨ ¬x₅) ∧ (x₁ ∨ ¬x₅) ∧ (x₂ ∨ ¬x₅) ∧ (x₃ ∨ x₆) ∧ (x₄ ∨ x₆) ∧ (x₅ ∨ x₆)

[Figure: The implication graph for the example 2-SAT instance shown in this section.]
The 2-satisfiability problem is to find a truth assignment to these variables that makes a formula of this type true: we
must choose whether to make each of the variables true or false, so that every clause has at least one term that
becomes true. For the expression shown above, one possible satisfying assignment is the one that sets all seven of the
variables to true. There are also 15 other ways of setting all the variables so that the formula becomes true.
Therefore, the 2-SAT instance represented by this expression is satisfiable.
Formulas with the form described above are known as 2-CNF formulas; the "2" in this name stands for the number
of terms per clause, and "CNF" stands for conjunctive normal form, a type of Boolean expression in the form of a
conjunction of disjunctions. They are also called Krom formulas, after the work of UC Davis mathematician Melven
R. Krom, whose 1967 paper was one of the earliest works on the 2-satisfiability problem.[1]
Each clause in a 2-CNF formula is logically equivalent to an implication from one variable or negated variable to the
other. For example, the clause (x₀ ∨ x₂) is equivalent to each of the two implications ¬x₀ ⇒ x₂ and ¬x₂ ⇒ x₀.
Because of this equivalence between these different types of operation, a 2-satisfiability instance may also be written
in implicative normal form, in which we replace each or operation in the conjunctive normal form by both of the two
implications to which it is equivalent.
A third, more graphical way of describing a 2-satisfiability instance is as an implication graph. An implication graph
is a directed graph in which there is one vertex per variable or negated variable, and an edge connecting one vertex to
another whenever the corresponding variables are related by an implication in the implicative normal form of the
instance. An implication graph must be a skew-symmetric graph, meaning that the undirected graph formed by
forgetting the orientations of its edges has a symmetry that takes each variable to its negation and reverses the
orientations of all of the edges.[2]
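The construction just described can be sketched in Python (the integer literal encoding is ours): each variable contributes two vertices, one per literal, and each clause contributes the two implications it is equivalent to.

```python
def implication_graph(num_vars, clauses):
    """Directed implication graph of a 2-CNF instance.

    Literal encoding (ours): vertex 2*i is the literal x_i and
    vertex 2*i + 1 is its negation.  Returns the edge set.
    """
    edges = set()
    for a, b in clauses:
        edges.add((a ^ 1, b))  # clause (a or b) gives not-a => b
        edges.add((b ^ 1, a))  # ... and not-b => a
    return edges
```

Skew-symmetry can be checked directly: reversing any edge and negating both of its endpoints yields another edge of the graph.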
Algorithms
Several algorithms are known for solving the 2-satisfiability problem; the most efficient of them take linear
time.[1][2][3]
Krom's decision procedure is based on the following inference rule: two clauses (a ∨ x) and (¬x ∨ b) that use the same variable x with opposite signs may be combined to produce a third clause (a ∨ b). Krom writes that a formula is consistent if repeated application of this inference rule cannot generate both the clauses (x ∨ x) and (¬x ∨ ¬x), for any variable x. As he proves, a 2-CNF formula is satisfiable if and only if it is consistent. For, if a formula is not consistent, it is not possible to satisfy both of the two clauses (x ∨ x) and (¬x ∨ ¬x) simultaneously. And, if it is consistent, then the formula can be extended by repeatedly adding one clause of the form (x ∨ x) or (¬x ∨ ¬x) at a time, preserving consistency at each step, until it includes such a clause for every variable. At each of these extension steps, one of these two clauses may always be added while preserving consistency, for if not then the other clause could be generated using the inference rule. Once all variables have a clause of this form in the formula, a satisfying assignment of all of the variables may be generated by setting a variable x to true if the formula contains the clause (x ∨ x) and setting it to false if the formula contains the clause (¬x ∨ ¬x).[1] If there were a clause not satisfied by this assignment, i.e., one in which both terms appeared with sign opposite to their appearances in the added clauses, it would be possible to resolve it with the added clauses to generate a clause of the form (x ∨ x) or (¬x ∨ ¬x) whose sign is opposite to that of the corresponding added clause. Since the formula is known to have remained consistent, this is impossible, so the assignment must satisfy the original formula as well.
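Krom's inference rule and consistency test can be sketched in Python (an illustrative sketch; the signed-integer literal encoding is ours, and the naive closure loop reflects the polynomial bound discussed below rather than an optimized implementation):

```python
def krom_consistent(num_vars, clauses):
    """Close a 2-CNF clause set under Krom's inference rule, then test
    consistency.

    Literals are nonzero integers: v stands for the variable x_v and -v
    for its negation; a clause is a pair (a, b), and (v, v) represents
    the one-term clause (x_v or x_v).
    """
    closure = {frozenset(c) for c in clauses}
    changed = True
    while changed:
        changed = False
        for c1 in list(closure):
            for c2 in list(closure):
                for lit in c1:
                    if -lit in c2:
                        # resolve on lit: (a or lit), (-lit or b) => (a or b)
                        resolvent = frozenset((c1 - {lit}) | (c2 - {-lit}))
                        if resolvent and resolvent not in closure:
                            closure.add(resolvent)
                            changed = True
    # inconsistent iff both (x or x) and (not-x or not-x) are derivable
    return not any(
        frozenset({v}) in closure and frozenset({-v}) in closure
        for v in range(1, num_vars + 1)
    )
```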
Krom was concerned primarily with completeness of systems of inference rules, rather than with the efficiency of
algorithms. However, his method leads to a polynomial time bound for solving 2-satisfiability problems. By
grouping together all of the clauses that use the same variable, and applying the inference rule to each pair of
clauses, it is possible to find all inferences that are possible from a given 2-CNF instance, and to test whether it is
consistent, in total time O(n3), where n is the number of variables in the instance: for each variable, there may be
O(n2) pairs of clauses involving that variable, to which the inference rule may be applied. Thus, it is possible to
determine whether a given 2-CNF instance is satisfiable in time O(n3). Because finding a satisfying assignment using
Krom's method involves a sequence of O(n) consistency checks, it would take time O(n4). Even, Itai & Shamir
(1976) quote a faster time bound of O(n2) for this algorithm, based on more careful ordering of its operations.
Nevertheless, even this smaller time bound was greatly improved by the later linear time algorithms of Even, Itai &
Shamir (1976) and Aspvall, Plass & Tarjan (1979).
In terms of the implication graph of the 2-satisfiability instance, Krom's inference rule can be interpreted as
constructing the transitive closure of the graph. As Cook (1971) observes, it can also be seen as an instance of the
Davis–Putnam algorithm for solving satisfiability problems using the principle of resolution. Its correctness follows
from the more general correctness of the Davis–Putnam algorithm, and its polynomial time bound is clear since each
resolution step increases the number of clauses in the instance, which is upper bounded by a quadratic function of the
number of variables.[4]
Limited backtracking
Even, Itai & Shamir (1976) describe a technique involving limited backtracking for solving constraint satisfaction
problems with binary variables and pairwise constraints; they apply this technique to a problem of classroom
scheduling, but they also observe that it applies to other problems including 2-SAT.[3]
The basic idea of their approach is to build a partial truth assignment, one variable at a time. Certain steps of the
algorithms are "choice points", points at which a variable can be given either of two different truth values, and later
steps in the algorithm may cause it to backtrack to one of these choice points. However, only the most recent choice
can be backtracked over; all choices made earlier than the most recent one are permanent.[3]
Initially, there is no choice point, and all variables are unassigned. At each step, the algorithm chooses the variable
whose value to set, as follows:
If there is a clause in which both of the variables are set, in a way that falsifies the clause, then the algorithm
backtracks to its most recent choice point, undoing the assignments it made since that choice, and reverses the
decision made at that choice. If no choice point has been reached, or if the algorithm has already backtracked over
the most recent choice point, then it aborts the search and reports that the input 2-CNF formula is unsatisfiable.
If there is a clause in which one of the variables has been set, and the clause may still become either true or false,
then the other variable is set in a way that forces the clause to become true.
If all clauses are either guaranteed to be true for the current assignment or have two unset variables, then the
algorithm creates a choice point and sets one of the unassigned variables to an arbitrarily chosen value.
Intuitively, the algorithm follows all chains of inference after making each of its choices; this either leads to a
contradiction and a backtracking step, or, if no contradiction is derived, it follows that the choice was a correct one
that leads to a satisfying assignment. Therefore, the algorithm either correctly finds a satisfying assignment or it
correctly determines that the input is unsatisfiable.[3]
Even et al. did not describe in detail how to implement this algorithm efficiently; they state only that by "using
appropriate data structures in order to find the implications of any decision", each step of the algorithm (other than
the backtracking) can be performed quickly. However, some inputs may cause the algorithm to backtrack many
times, each time performing many steps before backtracking, so its overall complexity may be nonlinear. To avoid
this problem, they modify the algorithm so that, after reaching each choice point, it tests in parallel both alternative
assignments for the variable set at the choice point, interleaving both parallel tests to produce a sequential algorithm.
As soon as one of these two parallel tests reaches another choice point, the other parallel branch is aborted. In this
way, the total time spent performing both parallel tests is proportional to the size of the portion of the input formula
whose values are permanently assigned. As a result, the algorithm takes linear time in total.[3]
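A simplified version of this procedure can be sketched in Python (an illustrative sketch: it implements the choice-point and propagation steps above but omits the parallel-interleaving refinement, so it does not achieve the linear time bound; the signed-integer literal encoding is ours):

```python
def limited_backtrack_2sat(num_vars, clauses):
    """Limited-backtracking 2-SAT, as described above (simplified).

    Literals are nonzero integers: v means x_v, -v means not-x_v.
    Returns a list of booleans for x_1..x_n, or None if unsatisfiable.
    """
    assignment = {}  # variable -> bool

    def value(lit):
        v = assignment.get(abs(lit))
        if v is None:
            return None
        return v if lit > 0 else not v

    def propagate(trail):
        # Force values from clauses with one falsified literal; report
        # failure if a clause becomes entirely false.
        changed = True
        while changed:
            changed = False
            for a, b in clauses:
                va, vb = value(a), value(b)
                if va is False and vb is False:
                    return False
                if va is False and vb is None:
                    assignment[abs(b)] = b > 0
                    trail.append(abs(b))
                    changed = True
                elif vb is False and va is None:
                    assignment[abs(a)] = a > 0
                    trail.append(abs(a))
                    changed = True
        return True

    for v in range(1, num_vars + 1):
        if v in assignment:
            continue  # already forced by an earlier (permanent) choice
        for choice in (True, False):  # second value = the one backtrack
            trail = [v]
            assignment[v] = choice
            if propagate(trail):
                break  # choice stands; everything so far is permanent
            for w in trail:  # undo all assignments since this choice
                del assignment[w]
        else:
            return None  # both values failed: unsatisfiable
    return [assignment[v] for v in range(1, num_vars + 1)]
```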
In terms of the implication graph, two terms belong to the same strongly connected component whenever there exist
chains of implications from one term to the other and vice versa. Therefore, the two terms must have the same value
in any satisfying assignment to the given 2-satisfiability instance. In particular, if a variable and its negation both
belong to the same strongly connected component, the instance cannot be satisfied, because it is impossible to assign
both of these terms the same value. As Aspvall et al. showed, this is a necessary and sufficient condition: a 2-CNF
formula is satisfiable if and only if there is no variable that belongs to the same strongly connected component as its
negation.[2]
This immediately leads to a linear time algorithm for testing satisfiability of 2-CNF formulae: simply perform a
strong connectivity analysis on the implication graph and check that each variable and its negation belong to
different components. However, as Aspvall et al. also showed, it also leads to a linear time algorithm for finding a
satisfying assignment, when one exists. Their algorithm performs the following steps:
Construct the implication graph of the instance, and find its strongly connected components using any of the
known linear-time algorithms for strong connectivity analysis.
Check whether any strongly connected component contains both a variable and its negation. If so, report that the
instance is not satisfiable and halt.
Construct the condensation of the implication graph, a smaller graph that has one vertex for each strongly
connected component, and an edge from component i to component j whenever the implication graph contains an
edge u→v such that u belongs to component i and v belongs to component j. The condensation is automatically a
directed acyclic graph and, like the implication graph from which it was formed, it is skew-symmetric.
Topologically order the vertices of the condensation; the order in which the components are generated by
Kosaraju's algorithm is automatically a topological ordering.
For each component in this order, if its variables do not already have truth assignments, set all the terms in the
component to be false. This also causes all of the terms in the complementary component to be set to true.
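These steps can be sketched in Python (an illustrative sketch; the integer literal encoding is ours, and Kosaraju's algorithm is used for the strong components so that, as noted above, the component labels come out in topological order):

```python
def solve_2sat(num_vars, clauses):
    """2-SAT via strongly connected components, following the steps above.

    Literal encoding (ours): literal 2*i is the variable x_i, literal
    2*i + 1 is its negation; clauses is a list of literal pairs.
    Returns a satisfying assignment as a list of booleans, or None.
    """
    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    rgraph = [[] for _ in range(n)]
    for a, b in clauses:
        # clause (a or b) contributes implications not-a => b, not-b => a
        for u, v in ((a ^ 1, b), (b ^ 1, a)):
            graph[u].append(v)
            rgraph[v].append(u)

    # Strong components via Kosaraju's two passes.
    visited, order = [False] * n, []

    def dfs1(v):
        visited[v] = True
        for w in graph[v]:
            if not visited[w]:
                dfs1(w)
        order.append(v)

    for v in range(n):
        if not visited[v]:
            dfs1(v)

    comp = [-1] * n

    def dfs2(v, label):
        comp[v] = label
        for w in rgraph[v]:
            if comp[w] == -1:
                dfs2(w, label)

    labels = 0
    for v in reversed(order):
        if comp[v] == -1:
            dfs2(v, labels)
            labels += 1

    # A variable in the same component as its negation is a contradiction;
    # otherwise the literal whose component comes later in the topological
    # order (larger label here) is the true one.
    assignment = []
    for i in range(num_vars):
        if comp[2 * i] == comp[2 * i + 1]:
            return None
        assignment.append(comp[2 * i] > comp[2 * i + 1])
    return assignment
```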
Due to the topological ordering, when a term x is set to false, all terms that lead to it via a chain of implications will
themselves already have been set to false. Symmetrically, when a term is set to true, all terms that can be reached
from it via a chain of implications will already have been set to true. Therefore, the truth assignment constructed by
this procedure satisfies the given formula, which also completes the proof of correctness of the necessary and
sufficient condition identified by Aspvall et al.[2]
As Aspvall et al. show, a similar procedure involving topologically ordering the strongly connected components of
the implication graph may also be used to evaluate fully quantified Boolean formulae in which the formula being
quantified is a 2-CNF formula.[2]
Applications
Conflict-free placement of geometric objects
A number of exact and approximate algorithms for the automatic label placement problem are based on
2-satisfiability. This problem concerns placing textual labels on the features of a diagram or map. Typically, the set
of possible locations for each label is highly constrained, not only by the map itself (each label must be near the
feature it labels, and must not obscure other features), but by each other: two labels will be illegible if they overlap
each other. In general, label placement is an NP-hard problem. However, if each feature has only two possible
locations for its label (say, extending to the left and to the right of the feature) then it may be solved in polynomial
time. For, in this case, one may create a 2-satisfiability instance that has a variable for each label and constraints
preventing each pair of labels from being assigned overlapping positions. If the labels are all congruent rectangles,
the corresponding 2-SAT instance can be shown to have only linearly many constraints, leading to near-linear time
algorithms for finding a labeling.[7] Poon, Zhu & Chin (1998) describe a map labeling problem in which each label is
a rectangle that may be placed in one of three positions with respect to a line segment that it labels: it may have the
segment as one of its sides, or it may be centered on the segment. They represent these three positions using two
binary variables in such a way that, again, testing the existence of a valid labeling becomes a 2-SAT problem.[8]
Formann & Wagner (1991) use this observation as part of an approximation algorithm for the problem of finding
square labels of the largest possible size for a given set of points, with the constraint that each label has one of its
corners on the point that it labels. To find a labeling with a given size, they eliminate squares that, if doubled, would
overlap another point, and they eliminate points that can be labeled in a way that cannot possibly overlap with
another point's label, and they show that the remaining points have only two possible label placements, allowing the
2-SAT approach to be used. By searching for the largest size that leads to a solvable 2-SAT instance, they find a
solution with approximation ratio at most two.[7][9] Similarly, if each label is rectangular and must be placed in such
a way that the point it labels is somewhere along its bottom edge, then using 2-SAT to find the optimal solution in
which the label has the point on a bottom corner leads to an approximation ratio of at most two.[10]
Similar reductions to 2-satisfiability have been applied to other geometric placement problems. In graph drawing, if
the vertex locations are fixed and each edge must be drawn as a circular arc with one of two possible locations, then
the problem of choosing which arc to use for each edge in order to avoid crossings is a 2SAT problem with a
variable for each edge and a constraint for each pair of placements that would lead to a crossing. However, in this
case it is possible to speed up the solution, compared to an algorithm that builds and then searches an explicit
representation of the implication graph, by searching the graph implicitly.[11] In VLSI integrated circuit design, if a
collection of modules must be connected by wires that can each bend at most once, then again there are two possible
routes for the wires, and the problem of choosing which of these two routes to use, in such a way that all wires can
be routed in a single layer of the circuit, can be solved as a 2SAT instance.[12]
Boros et al. (1999) consider another VLSI design problem: the question of whether or not to mirror-reverse each
module in a circuit design. This mirror reversal leaves the module's operations unchanged, but it changes the order of
the points at which the input and output signals of the module connect to it, possibly changing how well the module
fits into the rest of the design. Boros et al. consider a simplified version of the problem in which the modules have
already been placed along a single linear channel, in which the wires between modules must be routed, and there is a
fixed bound on the density of the channel (the maximum number of signals that must pass through any cross-section
of the channel). They observe that this version of the problem may be solved as a 2-SAT instance, in which the
constraints relate the orientations of pairs of modules that are directly across the channel from each other; as a
consequence, the optimal density may also be calculated efficiently, by performing a binary search in which each
step involves the solution of a 2-SAT instance.[13]
Data clustering
One way of clustering a set of data points in a metric space into two clusters is to choose the clusters in such a way
as to minimize the sum of the diameters of the clusters, where the diameter of any single cluster is the largest
distance between any two of its points; this is preferable to minimizing the maximum cluster size, which may lead to
very similar points being assigned to different clusters. If the target diameters of the two clusters are known, a
clustering that achieves those targets may be found by solving a 2-satisfiability instance. The instance has one
variable per point, indicating whether that point belongs to the first cluster or the second cluster. Whenever any two
points are too far apart from each other for both to belong to the same cluster, a clause is added to the instance that
prevents this assignment.
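The instance construction described in this paragraph can be sketched in Python (an illustrative sketch; the clause encoding and the brute-force satisfiability check, which stands in for a linear-time 2-SAT solver, are ours):

```python
from itertools import combinations, product

def clustering_clauses(points, dist, d1, d2):
    """2-SAT clauses for a two-cluster split with diameters at most d1, d2.

    Variable i is True when point i is placed in the first cluster.
    A clause is a pair of literals; a literal (i, True) asserts variable
    i, and (i, False) asserts its negation.
    """
    clauses = []
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])
        if d > d1:
            # i and j cannot both be in the first cluster: (not-i or not-j)
            clauses.append(((i, False), (j, False)))
        if d > d2:
            # i and j cannot both be in the second cluster: (i or j)
            clauses.append(((i, True), (j, True)))
    return clauses

def satisfiable(n, clauses):
    # Brute force over assignments, for illustration only; any
    # linear-time 2-SAT algorithm can be substituted here.
    return any(
        all(any(bits[v] == want for v, want in clause) for clause in clauses)
        for bits in product([False, True], repeat=n)
    )
```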
The same method also can be used as a subroutine when the individual cluster diameters are unknown. To test
whether a given sum of diameters can be achieved without knowing the individual cluster diameters, one may try all
maximal pairs of target diameters that add up to at most the given sum, representing each pair of diameters as a
2-satisfiability instance and using a 2-satisfiability algorithm to determine whether that pair can be realized by a
clustering. To find the optimal sum of diameters one may perform a binary search in which each step is a feasibility
test of this type. The same approach also works to find clusterings that optimize other combinations than sums of the
cluster diameters, and that use arbitrary dissimilarity numbers (rather than distances in a metric space) to measure the
size of a cluster.[14] The time bound for this algorithm is dominated by the time to solve a sequence of 2-SAT
instances that are closely related to each other, and Ramnath (2004) shows how to solve these related instances more
quickly than if they were solved independently from each other, leading to a total time bound of O(n3) for the
sum-of-diameters clustering problem.[15]
Scheduling
Even, Itai & Shamir (1976) consider a model of classroom scheduling in which a set of n teachers must be scheduled
to teach each of m cohorts of students; the number of hours per week that teacher i spends with cohort j is described
by entry Rij of a matrix R given as input to the problem, and each teacher also has a set of hours during which he or
she is available to be scheduled. As they show, the problem is NP-complete, even when each teacher has at most
three available hours, but it can be solved as an instance of 2-satisfiability when each teacher only has two available
hours. (Teachers with only a single available hour may easily be eliminated from the problem.) In this problem, each
variable vij corresponds to an hour that teacher i must spend with cohort j, the assignment to the variable specifies
whether that hour is the first or the second of the teacher's available hours, and there is a 2-SAT clause preventing
any conflict of either of two types: two cohorts assigned to a teacher at the same time as each other, or one cohort
assigned to two teachers at the same time.[3]
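The clause construction for this classroom problem can be sketched in Python (a sketch; the encoding of lessons and literals is our own illustration, not the paper's notation):

```python
def classroom_clauses(avail, lessons):
    """Build 2-SAT clauses for the classroom problem described above.

    avail: avail[i] = (h1, h2), the two available hours of teacher i.
    lessons: list of (teacher, cohort) pairs, one per hour to schedule;
             variable k is True when lesson k uses its teacher's second
             available hour.  A literal is a pair (variable, polarity).
    """
    clauses = []

    def hour(k, second):
        teacher, _ = lessons[k]
        return avail[teacher][1 if second else 0]

    for k1 in range(len(lessons)):
        for k2 in range(k1 + 1, len(lessons)):
            t1, c1 = lessons[k1]
            t2, c2 = lessons[k2]
            if t1 != t2 and c1 != c2:
                continue  # different teacher and cohort: no conflict
            # forbid any pair of choices that puts the two conflicting
            # lessons into the same hour
            for s1 in (False, True):
                for s2 in (False, True):
                    if hour(k1, s1) == hour(k2, s2):
                        clauses.append(((k1, not s1), (k2, not s2)))
    return clauses
```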
Miyashiro & Matsui (2005) apply 2-satisfiability to a problem of sports scheduling, in which the pairings of a
round-robin tournament have already been chosen and the games must be assigned to the teams' stadiums. In this
problem, it is desirable to alternate home and away games to the extent possible, avoiding "breaks" in which a team
plays two home games in a row or two away games in a row. At most two teams can avoid breaks entirely,
alternating between home and away games; no other team can have the same home-away schedule as these two,
because then it would be unable to play the team with which it had the same schedule. Therefore, an optimal
schedule has two breakless teams and a single break for every other team. Once one of the breakless teams is chosen,
one can set up a 2-satisfiability problem in which each variable represents the home-away assignment for a single
team in a single game, and the constraints enforce the properties that any two teams have a consistent assignment for
their games, that each team have at most one break before and at most one break after the game with the breakless
team, and that no team has two breaks. Therefore, testing whether a schedule admits a solution with the optimal
number of breaks can be done by solving a linear number of 2-satisfiability problems, one for each choice of the
breakless team. A similar technique also allows finding schedules in which every team has a single break, and
maximizing rather than minimizing the number of breaks (to reduce the total mileage traveled by the teams).[16]
Digital tomography
Tomography is the process of recovering shapes from
their cross-sections. In digital tomography, a simplified
version of the problem that has been frequently studied,
the shape to be recovered is a polyomino (a subset of
the squares in the two-dimensional square lattice), and
the cross-sections provide aggregate information about
the sets of squares in individual rows and columns of
the lattice. For instance, in the popular nonogram
puzzles, also known as paint by numbers or griddlers,
the set of squares to be determined represents the dark
pixels in a bitmap image, and the input given to the
puzzle solver tells him or her how many consecutive
blocks of dark pixels to include in each row or column
of the image, and how long each of those blocks should
be. In other forms of digital tomography, even less
information about each row or column is given: only
Example of a nonogram puzzle being solved.
the total number of squares, rather than the number and
length of the blocks of squares. An equivalent version
of the problem is that we must recover a given 0-1 matrix given only the sums of the values in each row and in each
column of the matrix.
Although there exist polynomial time algorithms to find a matrix having given row and column sums,[17] the solution
may be far from unique: any submatrix in the form of a 2 × 2 identity matrix can be complemented without affecting
the correctness of the solution. Therefore, researchers have searched for constraints on the shape to be reconstructed
that can be used to restrict the space of solutions. For instance, one might assume that the shape is connected;
however, testing whether there exists a connected solution is NP-complete.[18] An even more constrained version
that is easier to solve is that the shape is orthogonally convex: having a single contiguous block of squares in each
row and column. Improving several previous solutions, Chrobak & Dürr (1999) showed how to reconstruct
connected orthogonally convex shapes efficiently, using 2-SAT.[19] The idea of their solution is to guess the indexes
of rows containing the leftmost and rightmost cells of the shape to be reconstructed, and then to set up a 2-SAT
problem that tests whether there exists a shape consistent with these guesses and with the given row and column
sums. They use four 2-SAT variables for each square that might be part of the given shape, one to indicate whether it
belongs to each of four possible "corner regions" of the shape, and they use constraints that force these regions to be
disjoint, to have the desired shapes, to form an overall shape with contiguous rows and columns, and to have the
desired row and column sums. Their algorithm takes time O(m3n) where m is the smaller of the two dimensions of
the input shape and n is the larger of the two dimensions. The same method was later extended to orthogonally
convex shapes that might be connected only diagonally instead of requiring orthogonal connectivity.[20]
More recently, as part of a solver for full nonogram puzzles, Batenburg and Kosters (2008, 2009) used 2-SAT to
combine information obtained from several other heuristics. Given a partial solution to the puzzle, they use dynamic
programming within each row or column to determine whether the constraints of that row or column force any of its
squares to be white or black, and whether any two squares in the same row or column can be connected by an
implication relation. They also transform the nonogram into a digital tomography problem by replacing the sequence
of block lengths in each row and column by its sum, and use a maximum flow formulation to determine whether this
digital tomography problem combining all of the rows and columns has any squares whose state can be determined
or pairs of squares that can be connected by an implication relation. If either of these two heuristics determines the
value of one of the squares, it is included in the partial solution and the same calculations are repeated. However, if
both heuristics fail to set any squares, the implications found by both of them are combined into a 2-satisfiability
problem and a 2-satisfiability solver is used to find squares whose value is fixed by the problem, after which the
procedure is again repeated. This procedure may or may not succeed in finding a solution, but it is guaranteed to run
in polynomial time. Batenburg and Kosters report that, although most newspaper puzzles do not need its full power,
both this procedure and a more powerful but slower procedure which combines this 2-SAT approach with the limited
backtracking of Even, Itai & Shamir (1976)[3] are significantly more effective than the dynamic programming and
flow heuristics without 2-SAT when applied to more difficult randomly generated nonograms.[21]
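The per-row reasoning used by such solvers can be illustrated with a brute-force sketch: enumerate every placement of the blocks that is consistent with a partially filled row, and report the cells that take the same value in all of them. This is a hypothetical simplification of the dynamic programming described above (exhaustive enumeration, so only suitable for short rows); the function names are illustrative.

```python
def placements(clue, n):
    """Enumerate every 0/1 tuple of length n whose maximal runs of 1s
    have exactly the block lengths given in clue."""
    if not clue:
        yield (0,) * n
        return
    block, rest = clue[0], list(clue[1:])
    need = sum(rest) + len(rest)     # one gap plus cells for each later block
    for start in range(n - block - need + 1):
        prefix = (0,) * start + (1,) * block + ((0,) if rest else ())
        for tail in placements(rest, n - len(prefix)):
            yield prefix + tail

def forced_cells(clue, partial):
    """Cells (0, 1, or None for unknown) forced to a single value by
    every placement consistent with the partial row; None if none fit."""
    consistent = [p for p in placements(clue, len(partial))
                  if all(q is None or q == c for q, c in zip(partial, p))]
    if not consistent:
        return None
    return [vals[0] if len(set(vals)) == 1 else None
            for vals in zip(*consistent)]
```

For instance, a block of 3 in a row of length 4 forces the two middle cells to be black, whichever of the two placements is correct.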
Other applications
2-satisfiability has also been applied to problems of recognizing undirected graphs that can be partitioned into an
independent set and a small number of complete bipartite subgraphs,[22] inferring business relationships among
autonomous subsystems of the internet,[23] and reconstruction of evolutionary trees.[24]
[Figure: The median graph representing all solutions to the example 2-SAT instance whose implication graph is shown above.]
The median of any three solutions is formed by setting each variable to the value it holds in
the majority of the three solutions; this median always forms another solution to the instance.[27]
Feder (1994) describes an algorithm for efficiently listing all solutions to a given 2-satisfiability instance, and for
solving several related problems.[28] There also exist algorithms for finding two satisfying assignments that have the
maximal Hamming distance from each other.[29]
Maximum-2-satisfiability
In the maximum-2-satisfiability problem (MAX-2-SAT), the input is a formula in conjunctive normal form with two
literals per clause, and the task is to determine the maximum number of clauses that can be simultaneously satisfied
by an assignment. MAX-2-SAT is NP-hard and it is a particular case of a maximum satisfiability problem.
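For tiny instances, MAX-2-SAT can be solved exactly by exhaustive search; a minimal sketch (exponential in the number of variables, and only meant to make the problem statement concrete):

```python
from itertools import product

def max_2sat(num_vars, clauses):
    """Maximum number of 2-literal clauses satisfiable at once.
    Literals are +i / -i for variable i (1-based)."""
    def lit_true(lit, assignment):
        value = assignment[abs(lit) - 1]
        return value if lit > 0 else not value

    best = 0
    for assignment in product([False, True], repeat=num_vars):
        satisfied = sum(1 for a, b in clauses
                        if lit_true(a, assignment) or lit_true(b, assignment))
        best = max(best, satisfied)
    return best
```

The four clauses (x1 or x2), (not x1 or x2), (x1 or not x2), (not x1 or not x2) cannot all hold, but any assignment satisfies three of them.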
By formulating MAX-2-SAT as a problem of finding a cut (that is, a partition of the vertices into two subsets)
maximizing the number of edges that have one endpoint in the first subset and one endpoint in the second, in a graph
related to the implication graph, and applying semidefinite programming methods to this cut problem, it is possible
to find in polynomial time an approximate solution that satisfies at least 0.940... times the optimal number of
clauses.[35] A balanced MAX 2-SAT instance is an instance of MAX 2-SAT where every variable appears positively
and negatively with equal weight. For this problem, one can improve the approximation ratio to
.
If the unique games conjecture is true, then it is impossible to approximate MAX 2-SAT, balanced or not, with an
approximation constant better than 0.943... in polynomial time.[36] Under the weaker assumption that P ≠ NP, the
problem is only known to be inapproximable within a constant better than 21/22 = 0.95454...[37]
Various authors have also explored exponential worst-case time bounds for exact solution of MAX-2-SAT
instances.[38]
Weighted-2-satisfiability
In the weighted 2-satisfiability problem (W2SAT), the input is an n-variable 2SAT instance and an integer k, and
the problem is to decide whether there exists a satisfying assignment in which at most k of the variables are true. One
may easily encode the vertex cover problem as a W2SAT problem: given a graph G and a bound k on the size of a
vertex cover, create a variable for each vertex of the graph, and for each edge uv of the graph create a 2SAT clause
(u ∨ v). Then the satisfying instances of the resulting 2SAT formula encode solutions to the vertex cover problem, and
there is a satisfying assignment with at most k true variables if and only if there is a vertex cover with at most k vertices. Therefore,
W2SAT is NP-complete.
Moreover, in parameterized complexity W2SAT provides a natural W[1]-complete problem,[39] which implies that
W2SAT is not fixed-parameter tractable unless this holds for all problems in W[1]. That is, it is unlikely that there
exists an algorithm for W2SAT whose running time takes the form f(k)·n^O(1). Even more strongly, W2SAT cannot
be solved in time n^o(k) unless the exponential time hypothesis fails.[40]
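The encoding just described can be sketched directly: one clause per edge, then a search for a satisfying assignment with at most k true variables. The brute force below is for illustration only (the function name is made up); the point is the equivalence between small satisfying assignments and small vertex covers.

```python
from itertools import combinations

def vertex_cover_via_w2sat(edges, vertices, k):
    """Treat each edge uv as the 2SAT clause (u OR v) and look for a
    satisfying assignment with at most k true variables, i.e. a vertex
    cover of size at most k. Returns a cover, or None."""
    for size in range(k + 1):
        for chosen in combinations(vertices, size):
            chosen_set = set(chosen)
            # every clause (u OR v) needs at least one true literal
            if all(u in chosen_set or v in chosen_set for u, v in edges):
                return sorted(chosen_set)
    return None
```

A triangle has no vertex cover of size 1, so the corresponding W2SAT instance with k = 1 is unsatisfiable, while k = 2 succeeds.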
Many-valued logics
The 2-SAT problem can also be asked for propositional many-valued logics. The algorithms are not usually linear,
and for some logics the problem is even NP-complete; see Hähnle (2001, 2003) for surveys.[41]
References
[1] Krom, Melven R. (1967), "The Decision Problem for a Class of First-Order Formulas in Which all Disjunctions are Binary", Zeitschrift für
Mathematische Logik und Grundlagen der Mathematik 13: 15–20, doi:10.1002/malq.19670130104.
[2] Aspvall, Bengt; Plass, Michael F.; Tarjan, Robert E. (1979), "A linear-time algorithm for testing the truth of certain quantified boolean
formulas" (http://www.math.ucsd.edu/~sbuss/CourseWeb/Math268_2007WS/2SAT.pdf), Information Processing Letters 8 (3):
121–123, doi:10.1016/0020-0190(79)90002-4.
[3] Even, S.; Itai, A.; Shamir, A. (1976), "On the complexity of time table and multi-commodity flow problems", SIAM Journal on Computing 5
(4): 691–703, doi:10.1137/0205048.
[4] Cook, Stephen A. (1971), "The complexity of theorem-proving procedures", Proc. 3rd ACM Symp. Theory of Computing (STOC),
pp. 151–158, doi:10.1145/800157.805047.
[5] Tarjan, Robert E. (1972), "Depth-first search and linear graph algorithms", SIAM Journal on Computing 1 (2): 146–160,
doi:10.1137/0201010.
[6] First published by Cheriyan, J.; Mehlhorn, K. (1996), "Algorithms for dense graphs and networks on the random access computer",
Algorithmica 15 (6): 521–549, doi:10.1007/BF01940880. Rediscovered in 1999 by Harold N. Gabow, and published in Gabow, Harold N.
(2003), "Searching (Ch 10.1)", in Gross, J. L.; Yellen, J., Discrete Math. and its Applications: Handbook of Graph Theory, 25, CRC Press,
pp. 953–984.
[7] Formann, M.; Wagner, F. (1991), "A packing problem with applications to lettering of maps", Proc. 7th ACM Symposium on Computational
Geometry, pp. 281–288, doi:10.1145/109648.109680.
[8] Poon, Chung Keung; Zhu, Binhai; Chin, Francis (1998), "A polynomial time solution for labeling a rectilinear map", Information Processing
Letters 65 (4): 201–207, doi:10.1016/S0020-0190(98)00002-7.
[9] Wagner, Frank; Wolff, Alexander (1997), "A practical map labeling algorithm", Computational Geometry: Theory and Applications 7 (5–6):
387–404, doi:10.1016/S0925-7721(96)00007-7.
[10] Doddi, Srinivas; Marathe, Madhav V.; Mirzaian, Andy; Moret, Bernard M. E.; Zhu, Binhai (1997), "Map labeling and its generalizations"
(http://portal.acm.org/citation.cfm?id=314250), Proc. 8th ACM-SIAM Symp. Discrete Algorithms (SODA), pp. 148–157.
[11] Efrat, Alon; Erten, Cesim; Kobourov, Stephen G. (2007), "Fixed-location circular arc drawing of planar graphs"
(http://jgaa.info/accepted/2007/EfratErtenKobourov2007.11.1.pdf), Journal of Graph Algorithms and Applications 11 (1): 145–164.
[12] Raghavan, Raghunath; Cohoon, James; Sahni, Sartaj (1986), "Single bend wiring", Journal of Algorithms 7 (2): 232–237,
doi:10.1016/0196-6774(86)90006-4.
[13] Boros, Endre; Hammer, Peter L.; Minoux, Michel; Rader, David J., Jr. (1999), "Optimal cell flipping to minimize channel density in VLSI
design and pseudo-Boolean optimization", Discrete Applied Mathematics 90 (1–3): 69–88, doi:10.1016/S0166-218X(98)00114-0.
[14] Hansen, P.; Jaumard, B. (1987), "Minimum sum of diameters clustering", Journal of Classification 4 (2): 215–226,
doi:10.1007/BF01896987.
[15] Ramnath, Sarnath (2004), "Dynamic digraph connectivity hastens minimum sum-of-diameters clustering", SIAM Journal on Discrete
Mathematics 18 (2): 272–286, doi:10.1137/S0895480102396099.
[16] Miyashiro, Ryuhei; Matsui, Tomomi (2005), "A polynomial-time algorithm to find an equitable home-away assignment", Operations
Research Letters 33 (3): 235–241, doi:10.1016/j.orl.2004.06.004.
[17] Brualdi, R. A. (1980), "Matrices of zeros and ones with fixed row and column sum vectors", Linear Algebra Appl. 33: 159–231,
doi:10.1016/0024-3795(80)90105-6.
[18] Woeginger, G. J. (1996), The reconstruction of polyominoes from their orthogonal projections, Technical Report SFB-65, Graz, Austria: TU
Graz.
[19] Chrobak, Marek; Dürr, Christoph (1999), "Reconstructing hv-convex polyominoes from orthogonal projections", Information Processing
Letters 69 (6): 283–289, doi:10.1016/S0020-0190(99)00025-3.
[20] Kuba, Attila; Balogh, Emese (2002), "Reconstruction of convex 2D discrete sets in polynomial time", Theoretical Computer Science 283
(1): 223–242, doi:10.1016/S0304-3975(01)00080-9; Brunetti, Sara; Daurat, Alain (2003), "An algorithm reconstructing convex lattice sets",
Theoretical Computer Science 304 (1–3): 35–57, doi:10.1016/S0304-3975(03)00050-1.
[21] Batenburg, K. Joost; Kosters, Walter A. (2008), "A reasoning framework for solving Nonograms", Combinatorial Image Analysis, 12th
International Workshop, IWCIA 2008, Buffalo, NY, USA, April 7–9, 2008, Proceedings, Lecture Notes in Computer Science, 4958,
Springer-Verlag, pp. 372–383, doi:10.1007/978-3-540-78275-9_33; Batenburg, K. Joost; Kosters, Walter A. (2009), "Solving Nonograms by
combining relaxations", Pattern Recognition 42 (8): 1672–1683, doi:10.1016/j.patcog.2008.12.003.
[22] Brandstädt, Andreas; Hammer, Peter L.; Le, Van Bang; Lozin, Vadim V. (2005), "Bisplit graphs", Discrete Mathematics 299 (1–3): 11–32,
doi:10.1016/j.disc.2004.08.046.
[23] Wang, Hao; Xie, Haiyong; Yang, Yang Richard; Silberschatz, Avi; Li, Li Erran; Liu, Yanbin (2005), "Stable egress route selection for
interdomain traffic engineering: model and analysis", 13th IEEE Conf. Network Protocols (ICNP), pp. 16–29, doi:10.1109/ICNP.2005.39.
[24] Eskin, Eleazar; Halperin, Eran; Karp, Richard M. (2003), "Efficient reconstruction of haplotype structure via perfect phylogeny"
(http://www.worldscinet.com/cgi-bin/details.cgi?id=pii:S0219720003000174&type=html), Journal of Bioinformatics and Computational Biology
1 (1): 1–20, doi:10.1142/S0219720003000174, PMID 15290779.
[25] Papadimitriou, Christos H. (1994), Computational Complexity, Addison-Wesley, chapter 4.2, ISBN 0-201-53082-1, Thm. 16.3.
[26] Cook, Stephen; Kolokolova, Antonina (2004), "A Second-Order Theory for NL", 19th Annual IEEE Symposium on Logic in Computer
Science (LICS'04), pp. 398–407, doi:10.1109/LICS.2004.1319634.
[27] Bandelt, Hans-Jürgen; Chepoi, Victor (2008), "Metric graph theory and geometry: a survey"
(http://www.lif-sud.univ-mrs.fr/~chepoi/survey_cm_bis.pdf), Contemporary Mathematics, to appear. Chung, F. R. K.; Graham, R. L.;
Saks, M. E. (1989), "A dynamic location problem for graphs" (http://www.math.ucsd.edu/~fan/mypaps/fanpap/101location.pdf),
Combinatorica 9 (2): 111–132, doi:10.1007/BF02124674. Feder, T. (1995), Stable Networks and Product Graphs, Memoirs of the American
Mathematical Society, 555.
[28] Feder, Tomás (1994), "Network flow and 2-satisfiability", Algorithmica 11 (3): 291–319, doi:10.1007/BF01240738.
[29] Angelsmark, Ola; Thapper, Johan (2005), "Algorithms for the maximum Hamming distance problem", Recent Advances in Constraints,
Lecture Notes in Computer Science, 3419, Springer-Verlag, pp. 128–141, doi:10.1007/11402763_10.
[30] Valiant, Leslie G. (1979), "The complexity of enumeration and reliability problems", SIAM Journal on Computing 8 (3): 410–421,
doi:10.1137/0208032.
[31] Welsh, Dominic; Gale, Amy (2001), "The complexity of counting problems", Aspects of complexity: minicourses in algorithmics,
complexity and computational algebra: mathematics workshop, Kaikoura, January 7–15, 2000: 115ff, Theorem 57.
[32] Dahllöf, Vilhelm; Jonsson, Peter; Wahlström, Magnus (2005), "Counting models for 2SAT and 3SAT formulae", Theoretical Computer
Science 332 (1–3): 265–291, doi:10.1016/j.tcs.2004.10.037.
[33] Fürer, Martin; Kasiviswanathan, Shiva Prasad (2007), "Algorithms for counting 2-SAT solutions and colorings with applications",
Algorithmic Aspects in Information and Management, Lecture Notes in Computer Science, 4508, Springer-Verlag, pp. 47–57,
doi:10.1007/978-3-540-72870-2_5.
[34] Bollobás, Béla; Borgs, Christian; Chayes, Jennifer T.; Kim, Jeong Han; Wilson, David B. (2001), "The scaling window of the 2-SAT
transition", Random Structures and Algorithms 18 (3): 201–256, doi:10.1002/rsa.1006; Chvátal, V.; Reed, B. (1992), "Mick gets some (the
odds are on his side)", Proc. 33rd IEEE Symp. Foundations of Computer Science (FOCS), pp. 620–627, doi:10.1109/SFCS.1992.267789;
Goerdt, A. (1996), "A threshold for unsatisfiability", Journal of Computer and System Sciences 53 (3): 469–486, doi:10.1006/jcss.1996.0081.
[35] Lewin, Michael; Livnat, Dror; Zwick, Uri (2002), "Improved Rounding Techniques for the MAX 2-SAT and MAX DI-CUT Problems",
Proceedings of the 9th International IPCO Conference on Integer Programming and Combinatorial Optimization (Springer-Verlag): 67–82,
ISBN 3-540-43676-6.
[36] Khot, Subhash; Kindler, Guy; Mossel, Elchanan; O'Donnell, Ryan (2004), "Optimal Inapproximability Results for MAX-CUT and Other
2-Variable CSPs?", FOCS '04: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science (IEEE): 146–154,
doi:10.1109/FOCS.2004.49, ISBN 0-7695-2228-9.
[37] Håstad, Johan (2001), "Some optimal inapproximability results", Journal of the Association for Computing Machinery 48 (4): 798–859,
doi:10.1145/502090.502098.
[38] Bansal, N.; Raman, V. (1999), "Upper bounds for MaxSat: further improved", in Aggarwal, A.; Pandu Rangan, C., Proc. 10th Conf.
Algorithms and Computation, ISAAC'99, Lecture Notes in Computer Science, 1741, Springer-Verlag, pp. 247–258; Gramm, Jens; Hirsch,
Edward A.; Niedermeier, Rolf; Rossmanith, Peter (2003), "Worst-case upper bounds for MAX-2-SAT with an application to MAX-CUT",
Discrete Applied Mathematics 130 (2): 139–155, doi:10.1016/S0166-218X(02)00402-X; Kojevnikov, Arist; Kulikov, Alexander S. (2006), "A
new approach to proving upper bounds for MAX-2-SAT", Proc. 17th ACM-SIAM Symp. Discrete Algorithms, pp. 11–17,
doi:10.1145/1109557.1109559.
[39] Flum, Jörg; Grohe, Martin (2006), Parameterized Complexity Theory
(http://www.springer.com/east/home/generic/search/results?SGWID=5-40109-22-141358322-0), Springer, ISBN 978-3-540-29952-3.
[40] Chen, Jianer; Huang, Xiuzhen; Kanj, Iyad A.; Xia, Ge (2006), "Strong computational lower bounds via parameterized complexity", J.
Comput. Syst. Sci. 72 (8): 1346–1367, doi:10.1016/j.jcss.2006.04.007.
[41] Hähnle, Reiner (2001), "Advanced many-valued logics" (http://books.google.com/books?id=_ol81ow-1s4C&pg=PA373), in Gabbay,
Dov M.; Günthner, Franz, Handbook of philosophical logic, 2 (2nd ed.), Springer, p. 373, ISBN 978-0-7923-7126-7; Hähnle, Reiner (2003),
"Complexity of Many-valued Logics", in Fitting, Melvin; Orlowska, Ewa, Beyond two: theory and applications of multiple-valued logic,
Springer, ISBN 978-3-7908-1541-2.
Shortest paths
Shortest path problem
In graph theory, the shortest path problem is the problem of finding a
path between two vertices (or nodes) in a graph such that the sum of
the weights of its constituent edges is minimized.
This is analogous to the problem of finding the shortest path between
two intersections on a road map: the graph's vertices correspond to
intersections and the edges correspond to road segments, each
weighted by the length of its road segment.
[Figure: (6, 4, 5, 1) and (6, 4, 3, 2, 1) are both paths between vertices 6 and 1.]
Definition
The shortest path problem can be defined for graphs whether undirected, directed, or mixed. It is defined here for
undirected graphs; for directed graphs the definition of path requires that consecutive vertices be connected by an
appropriate directed edge.
Two vertices are adjacent when they are both incident to a common edge. A path in an undirected graph is a
sequence of vertices P = (v_1, v_2, ..., v_n) such that v_i is adjacent to v_{i+1} for 1 ≤ i < n. Such a path P is called a
path of length n − 1 from v_1 to v_n. (The v_i are variables; their numbering here relates to their position in the
sequence and need not relate to any canonical labeling of the vertices.)
Let e_{i,j} be the edge incident to both v_i and v_j. Given a real-valued weight function f : E → R, and an undirected
(simple) graph G, the shortest path from v to v' is the path P = (v_1, v_2, ..., v_n) (where v_1 = v and v_n = v') that,
over all possible n, minimizes the sum Σ_{i=1..n−1} f(e_{i,i+1}).
Algorithms
The most important algorithms for solving this problem are:
• Dijkstra's algorithm solves the single-source shortest path problem.
• Bellman–Ford algorithm solves the single-source problem if edge weights may be negative.
• A* search algorithm solves for single-pair shortest paths using heuristics to try to speed up the search.
• Floyd–Warshall algorithm solves all pairs shortest paths.
• Johnson's algorithm solves all pairs shortest paths, and may be faster than Floyd–Warshall on sparse graphs.
Road networks
A road network can be considered as a graph with positive weights. The nodes represent road junctions and each edge
of the graph is associated with a road segment between two junctions. The weight of an edge may correspond to the
length of the associated road segment, the time needed to traverse the segment or the cost of traversing the segment.
Using directed edges it is also possible to model one-way streets. Such graphs are special in the sense that some
edges are more important than others for long distance travel (i.e. highways). This property has been formalized
using the notion of highway dimension. (research.microsoft.com/pubs/115272/soda10.pdf [2]) There are a great
number of algorithms that exploit this property and are therefore able to compute the shortest path much more quickly than
would be possible on general graphs.
All of these algorithms work in two phases. In the first phase, the graph is preprocessed without knowing the source
or target node. This phase may take several days for realistic data and some techniques. The second phase is the
query phase. In this phase, source and target node are known. The running time of the second phase is generally less
than a second. The idea is that the road network is static, so the preprocessing phase can be done once and used for a
large number of queries on the same road network.
The algorithm with the fastest known query time is called hub labeling and is able to compute shortest paths on the
road networks of Europe or the USA in a fraction of a microsecond
(research.microsoft.com/pubs/142356/HL-TR.pdf). Other techniques that have been used are:
• ALT
• Arc Flags
• Contraction Hierarchies (http://www.springerlink.com/index/j062316602803057.pdf)
• Transit Node Routing
• Reach-based Pruning
• Labeling
Algorithm              | Time complexity  | Author
                       | O(V^4)           | Shimbel 1955
                       | O(V^2 E L)       | Ford 1956
Bellman–Ford algorithm | O(VE)            |
                       | O(V^2 log V)     | Dantzig 1958, Dantzig 1960, Minty (cf. Pollack & Wiebenson 1960), Whiting & Hillier 1960
Dijkstra's algorithm   | O(V^2)           |
...                    | ...              | ...
                       | O(E + V log V)   |
                       | O(E log log L)   |
Gabow's algorithm      | O(E log_{E/V} L) |
                       | O(E + V log L)   |
Applications
Shortest path algorithms are applied to automatically find directions between physical locations, such as driving
directions on web mapping websites like Mapquest or Google Maps. For this application fast specialized algorithms
are available.[3]
If one represents a nondeterministic abstract machine as a graph where vertices describe states and edges describe
possible transitions, shortest path algorithms can be used to find an optimal sequence of choices to reach a certain
goal state, or to establish lower bounds on the time needed to reach a given state. For example, if vertices represent
the states of a puzzle like a Rubik's Cube and each directed edge corresponds to a single move or turn, shortest path
algorithms can be used to find a solution that uses the minimum possible number of moves.
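That puzzle-solving use amounts to breadth-first search over an implicit state graph, where the neighbor relation generates states on demand instead of storing the whole graph. A minimal sketch (the interface is illustrative):

```python
from collections import deque

def min_moves(start, goal, neighbors):
    """Minimum number of moves from start to goal in an implicit state
    graph; neighbors(state) yields the states reachable in one move.
    Returns -1 if the goal is unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, d = queue.popleft()
        if state == goal:
            return d
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return -1
```

Because every move has the same cost, breadth-first search suffices here; weighted moves would call for Dijkstra's algorithm instead.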
In a networking or telecommunications mindset, this shortest path problem is sometimes called the min-delay path
problem and usually tied with a widest path problem. For example, the algorithm may seek the shortest (min-delay)
widest path, or widest shortest (min-delay) path.
A more lighthearted application is the "six degrees of separation" game, which tries to find the shortest path in graphs
of social connections, such as movie stars who have appeared in the same film.
Other applications include "operations research, plant and facility layout, robotics, transportation, and VLSI
design".[4]
Related problems
For shortest path problems in computational geometry, see Euclidean shortest path.
The travelling salesman problem is the problem of finding the shortest path that goes through every vertex exactly
once, and returns to the start. Unlike the shortest path problem, which can be solved in polynomial time in graphs
without negative cycles, the travelling salesman problem is NP-complete and, as such, is believed not to be
efficiently solvable (see P = NP problem). The problem of finding the longest path in a graph is also NP-complete.
The Canadian traveller problem and the stochastic shortest path problem are generalizations where either the graph
isn't completely known to the mover, changes over time, or where actions (traversals) are probabilistic.
The shortest multiple disconnected path [5] is a representation of the primitive path network within the framework of
Reptation theory.
The problem of recalculating shortest paths arises when certain graph transformations (e.g., shrinkage of nodes) are
applied to a graph.[6]
The widest path problem seeks a path so that the minimum label of any edge is as large as possible.
Linear programming formulation
There is a natural linear programming formulation for the shortest path problem. Given a directed graph (V, A) with
source node s, target node t, and cost w_ij for each arc (i, j) in A, consider the program with variables x_ij:
minimize Σ_{(i,j) ∈ A} w_ij x_ij
subject to x ≥ 0 and, for all i, Σ_j x_ji − Σ_j x_ij = (1 if i = t; −1 if i = s; 0 otherwise)
This LP, which is common fodder for operations research courses, has the special property that it is integral; more
specifically, every basic optimal solution (when one exists) has all variables equal to 0 or 1, and the set of edges
whose variables equal 1 form an s-t dipath. See Ahuja et al.[7] for one proof, although the origin of this approach
dates back to mid-20th century.
The dual for this linear program is
maximize y_t − y_s subject to: for all ij, y_j − y_i ≤ w_ij
and feasible duals correspond to the concept of a consistent heuristic for the A* algorithm for shortest paths. For any
feasible dual y the reduced costs w'_ij = w_ij − y_j + y_i are nonnegative and A* essentially runs Dijkstra's
algorithm on these reduced costs.
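The reduced costs w'_ij = w_ij − y_j + y_i from a feasible dual can be checked in a few lines; a toy sketch (the edge weights and potentials below are made up for illustration):

```python
def reduced_costs(edges, y):
    """Given edge weights edges[(i, j)] and node potentials y with
    y[j] - y[i] <= edges[(i, j)] on every edge (a feasible dual),
    return the reduced costs w'[(i, j)] = w - y[j] + y[i]."""
    return {(i, j): w - y[j] + y[i] for (i, j), w in edges.items()}
```

Feasibility of y is exactly the statement that every reduced cost is nonnegative, which is why Dijkstra's algorithm remains correct on the reweighted graph.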
References
[1] Cherkassky, Boris V.; Goldberg, Andrew V.; Radzik, Tomasz (1996). "Shortest paths algorithms: theory and experimental evaluation"
(http://ftp.cs.stanford.edu/cs/theory/pub/goldberg/sp-alg.ps.Z). Mathematical Programming, Ser. A 73 (2): 129–174.
doi:10.1016/0025-5610(95)00021-6. MR 1392160.
[2] http://research.microsoft.com/pubs/115272/soda10.pdf
[3] Sanders, Peter (March 23, 2009). Fast route planning (http://www.youtube.com/watch?v=-0ErpE8tQbw). Google Tech Talk.
[4] Chen, Danny Z. (December 1996). "Developing algorithms and software for geometric path planning problems". ACM Computing Surveys 28
(4es): 18. doi:10.1145/242224.242246.
[5] Kröger, Martin (2005). "Shortest multiple disconnected path for the analysis of entanglements in two- and three-dimensional polymeric
systems". Computer Physics Communications 168 (168): 209–232. doi:10.1016/j.cpc.2005.01.020.
[6] Ladyzhensky Y., Popoff Y. Algorithm to define the shortest paths between all nodes in a graph after compressing of two nodes. Proceedings
of Donetsk national technical university, Computing and automation. Vol. 107. Donetsk, 2006, pp. 68–75.
[7] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin (1993). Network Flows: Theory, Algorithms and Applications. Prentice Hall.
ISBN 0-13-617549-X.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. "Single-Source
Shortest Paths and All-Pairs Shortest Paths". Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill.
pp. 580–642. ISBN 0-262-03293-7.
D. Frigioni; A. Marchetti-Spaccamela and U. Nanni (1998). "Fully dynamic output bounded single source
shortest path problem" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.9856). Proc. 7th
Annu. ACM-SIAM Symp. Discrete Algorithms. Atlanta, GA. pp. 212–221.
Further reading
Dijkstra, E. W. (1959). "A note on two problems in connexion with graphs" (http://www-m3.ma.tum.de/twiki/
pub/MN0506/WebHome/dijkstra.pdf). Numerische Mathematik 1: 269–271. doi:10.1007/BF01386390.
Fredman, Michael Lawrence; Tarjan, Robert E. (1984). "Fibonacci heaps and their uses in improved network
optimization algorithms" (http://www.computer.org/portal/web/csdl/doi/10.1109/SFCS.1984.715934).
25th Annual Symposium on Foundations of Computer Science (IEEE): 338–346. doi:10.1109/SFCS.1984.715934.
ISBN 0-8186-0591-X.
Fredman, Michael Lawrence; Tarjan, Robert E. (1987). "Fibonacci heaps and their uses in improved network
optimization algorithms" (http://portal.acm.org/citation.cfm?id=28874). Journal of the Association for
Computing Machinery 34 (3): 596–615. doi:10.1145/28869.28874.
Leyzorek, M.; Gray, R. S.; Johnson, A. A.; Ladew, W. C.; Meaker, Jr., S. R.; Petry, R. M.; Seitz, R. N. (1957).
Investigation of Model Techniques, First Annual Report, 6 June 1956 – 1 July 1957: A Study of Model
Techniques for Communication Systems. Cleveland, Ohio: Case Institute of Technology.
Moore, E. F. (1959). "The shortest path through a maze". Proceedings of an International Symposium on the
Theory of Switching (Cambridge, Massachusetts, 2–5 April 1957). Cambridge: Harvard University Press.
pp. 285–292.
Dijkstra's algorithm for single-source shortest paths with positive edge lengths
Class: Search algorithm
Data structure: Graph
Dijkstra's algorithm, conceived by Dutch computer scientist Edsger Dijkstra in 1956 and published in 1959,[1][2] is
a graph search algorithm that solves the single-source shortest path problem for a graph with nonnegative edge path
costs, producing a shortest path tree. This algorithm is often used in routing and as a subroutine in other graph
algorithms.
For a given source vertex (node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path)
between that vertex and every other vertex. It can also be used for finding costs of shortest paths from a single vertex
to a single destination vertex by stopping the algorithm once the shortest path to the destination vertex has been
determined. For example, if the vertices of the graph represent cities and edge path costs represent driving distances
between pairs of cities connected by a direct road, Dijkstra's algorithm can be used to find the shortest route between
one city and all other cities. As a result, the shortest path first is widely used in network routing protocols, most
notably IS-IS and OSPF (Open Shortest Path First).
Dijkstra's original algorithm does not use a min-priority queue and runs in O(|V|^2). The idea of this algorithm is also
given in (Leyzorek et al. 1957). The implementation based on a min-priority queue implemented by a Fibonacci heap
and running in O(|E| + |V| log |V|) is due to (Fredman & Tarjan 1984). This is asymptotically the fastest known
single-source shortest-path algorithm for arbitrary directed graphs with unbounded nonnegative weights.
Algorithm
Let the node at which we are starting be called the initial node. Let the
distance of node Y be the distance from the initial node to Y.
Dijkstra's algorithm will assign some initial distance values and will try
to improve them step by step.
1. Assign to every node a tentative distance value: set it to zero for our
initial node and to infinity for all other nodes.
2. Mark all nodes unvisited. Set the initial node as current. Create a set
of the unvisited nodes called the unvisited set consisting of all the
nodes except the initial node.
3. For the current node, consider all of its unvisited neighbors and
calculate their tentative distances. For example, if the current node
A is marked with a tentative distance of 6, and the edge connecting
it with a neighbor B has length 2, then the distance to B (through A)
will be 6+2=8. If this distance is less than the previously recorded
tentative distance of B, then overwrite that distance. Even though a
neighbor has been examined, it is not marked as visited at this time,
and it remains in the unvisited set.
4. When we are done considering all of the neighbors of the current node, mark the current node as visited and
remove it from the unvisited set. A visited node will never be checked again.
5. If the destination node has been marked visited (when planning a route between two specific nodes) or if the
smallest tentative distance among the nodes in the unvisited set is infinity (when planning a complete traversal),
then stop. The algorithm has finished.
6. Set the unvisited node marked with the smallest tentative distance as the next "current node" and go back to step
3.
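The steps above can be sketched in a standard heap-based implementation. This is a lazy-deletion variant (stale heap entries are skipped on pop) rather than the explicit unvisited-set formulation in the steps; the graph encoding is illustrative.

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's algorithm with a binary heap. graph maps each node to
    a list of (neighbor, weight) pairs with nonnegative weights.
    Returns a dict of shortest distances from source."""
    dist = {source: 0}
    visited = set()
    heap = [(0, source)]                 # (tentative distance, node)
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue                     # stale entry; u already finalized
        visited.add(u)                   # step 4: mark current node visited
        for v, w in graph.get(u, []):    # step 3: relax unvisited neighbors
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Each node is finalized the first time it is popped, exactly when it is the unvisited node of smallest tentative distance (step 6).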
Description
Note: For ease of understanding, this discussion uses the terms intersection, road and map; however,
formally these terms are vertex, edge and graph, respectively.
Suppose you want to find the shortest path between two intersections on a city map, a starting point and a
destination. The order is conceptually simple: to start, mark the distance to every intersection on the map with
infinity. This is done not to imply there is an infinite distance, but to note that that intersection has not yet been
visited; some variants of this method simply leave the intersection unlabeled. Now, at each iteration, select a current
intersection. For the first iteration the current intersection will be the starting point and the distance to it (the
intersection's label) will be zero. For subsequent iterations (after the first) the current intersection will be the closest
unvisited intersection to the starting point; this will be easy to find.
From the current intersection, update the distance to every unvisited intersection that is directly connected to it. This
is done by determining the sum of the distance between an unvisited intersection and the value of the current
intersection, and relabeling the unvisited intersection with this value if it is less than its current value. In effect, the
intersection is relabeled if the path to it through the current intersection is shorter than the previously known paths.
To facilitate shortest path identification, in pencil, mark the road with an arrow pointing to the relabeled intersection
if you label/relabel it, and erase all others pointing to it. After you have updated the distances to each neighboring
intersection, mark the current intersection as visited and select the unvisited intersection with the lowest distance
(from the starting point), or the lowest label, as the current intersection. Nodes marked as visited are labeled with the
shortest path from the starting point to it and will not be revisited or returned to.
Continue this process of updating the neighboring intersections with the shortest distances, then marking the current
intersection as visited and moving on to the closest unvisited intersection until you have marked the destination as
visited. Once you have marked the destination as visited (as is the case with any visited intersection) you have
determined the shortest path to it, from the starting point, and can trace your way back, following the arrows in
reverse.
Of note is the fact that this algorithm makes no attempt to direct "exploration" towards the destination as one might
expect. Rather, the sole consideration in determining the next "current" intersection is its distance from the starting
point. In some sense, this algorithm "expands outward" from the starting point, iteratively considering every node
that is closer in terms of shortest path distance until it reaches the destination. When understood in this way, it is
clear how the algorithm necessarily finds the shortest path; however, it may also reveal one of the algorithm's
weaknesses: its relative slowness in some topologies.
Pseudocode
In the following algorithm, the code u := vertex in Q with smallest dist[] searches for the vertex
u in the vertex set Q that has the least dist[u] value. That vertex is removed from the set Q and returned to the
user. dist_between(u, v) calculates the length between the two neighbor nodes u and v. The variable alt
on lines 20 and 22 is the length of the path from the root node to the neighbor node v if it were to go through u. If this
path is shorter than the current shortest path recorded for v, that current path is replaced with this alt path. The
previous array is populated with a pointer to the "next-hop" node on the source graph to get the shortest route to
the source.
 1  function Dijkstra(Graph, source):
 2      for each vertex v in Graph:                          // Initializations
 3          dist[v] := infinity ;                            // Unknown distance function from
 4                                                           // source to v
 5          previous[v] := undefined ;                       // Previous node in optimal path
 6      end for                                              // from source
 7
 8      dist[source] := 0 ;                                  // Distance from source to source
 9      Q := the set of all nodes in Graph ;                 // All nodes in the graph are
10                                                           // unoptimized, thus are in Q
11      while Q is not empty:                                // The main loop
12          u := vertex in Q with smallest distance in dist[] ;
13          remove u from Q ;
14          if dist[u] = infinity:
15              break ;                                      // all remaining vertices are
16                                                           // inaccessible from source
17
18          for each neighbor v of u:                        // where v has not yet been
19                                                           // removed from Q.
20              alt := dist[u] + dist_between(u, v) ;
21              if alt < dist[v]:                            // Relax (u,v,a)
22                  dist[v] := alt ;
23                  previous[v] := u ;
24                  decrease-key v in Q;                     // Reorder v in the Queue
25      return dist;
If we are only interested in a shortest path between vertices source and target, we can terminate the search at
line 13 if u = target. Now we can read the shortest path from source to target by iteration:
S := empty sequence
u := target
while previous[u] is defined:
    insert u at the beginning of S
    u := previous[u]
end while ;
Now sequence S is the list of vertices constituting one of the shortest paths from source to target, or the
empty sequence if no path exists.
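The pseudocode above, together with this backward walk, can be sketched as runnable Python. This is an illustrative sketch, not the canonical implementation: it uses a binary heap via heapq with lazy deletion of stale queue entries in place of an explicit decrease-key, and all names are illustrative.

```python
import heapq

def dijkstra(graph, source):
    """graph: dict mapping u -> list of (v, weight) pairs, weights non-negative.
    Returns (dist, previous) dictionaries."""
    dist = {source: 0}
    previous = {}
    visited = set()
    queue = [(0, source)]              # (tentative distance, vertex)
    while queue:
        d, u = heapq.heappop(queue)
        if u in visited:
            continue                   # stale entry; a shorter path was found earlier
        visited.add(u)
        for v, w in graph.get(u, []):
            alt = d + w                # relax edge (u, v)
            if alt < dist.get(v, float('inf')):
                dist[v] = alt
                previous[v] = u
                heapq.heappush(queue, (alt, v))
    return dist, previous

def shortest_path(previous, source, target):
    """Walk the previous[] chain backwards, as in the iteration above."""
    if target != source and target not in previous:
        return []                      # no path exists
    path, u = [], target
    while u != source:
        path.append(u)
        u = previous[u]
    path.append(source)
    return path[::-1]
```

The lazy-deletion trick trades a little extra heap space for not needing a decrease-key operation, which Python's heapq does not provide.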
A more general problem would be to find all the shortest paths between source and target (there might be
several different ones of the same length). Then instead of storing only a single node in each entry of previous[]
we would store all nodes satisfying the relaxation condition. For example, if both r and source connect to
target and both of them lie on different shortest paths through target (because the edge cost is the same in
both cases), then we would add both r and source to previous[target]. When the algorithm completes,
the previous[] data structure will actually describe a graph that is a subset of the original graph with some edges
removed. Its key property will be that if the algorithm was run with some starting node, then every path from that
node to any other node in the new graph will be a shortest path between those nodes in the original graph, and all
paths of that length from the original graph will be present in the new graph. Then to actually find all these shortest
paths between two given nodes we would use a path-finding algorithm on the new graph, such as depth-first search.
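The multiple-predecessor variant just described can be sketched as follows (names illustrative): previous[v] keeps every predecessor satisfying the relaxation condition, and a depth-first search then enumerates the paths in the resulting shortest-path DAG.

```python
import heapq

def dijkstra_all(graph, source):
    """Like ordinary Dijkstra, but previous[v] keeps every predecessor u
    with dist[u] + w(u, v) == dist[v], describing the shortest-path DAG."""
    dist = {source: 0}
    previous = {source: []}
    queue, visited = [(0, source)], set()
    while queue:
        d, u = heapq.heappop(queue)
        if u in visited:
            continue
        visited.add(u)
        for v, w in graph.get(u, []):
            alt = d + w
            if alt < dist.get(v, float('inf')):
                dist[v] = alt
                previous[v] = [u]          # strictly better: replace the list
                heapq.heappush(queue, (alt, v))
            elif alt == dist[v] and u not in previous[v]:
                previous[v].append(u)      # tie: remember this predecessor too
    return dist, previous

def all_shortest_paths(previous, source, target):
    """Depth-first search backwards through the shortest-path DAG."""
    if target == source:
        return [[source]]
    paths = []
    for u in previous.get(target, []):
        for p in all_shortest_paths(previous, source, u):
            paths.append(p + [target])
    return paths
```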
Running time
An upper bound of the running time of Dijkstra's algorithm on a graph with edges E and vertices V can be
expressed as a function of |E| and |V|: O(|E|·T_dk + |V|·T_em), where T_dk and T_em are the times needed to
perform the decrease-key and extract-minimum operations in set Q, respectively.
The simplest implementation of Dijkstra's algorithm stores the vertices of set Q in an ordinary linked list or array,
and extract-minimum is simply a linear search through all vertices in Q. In this case, the running time is
O(|V|² + |E|) = O(|V|²).
For sparse graphs, that is, graphs with far fewer than |V|² edges, Dijkstra's algorithm can be implemented more
efficiently by storing the graph in the form of adjacency lists and using a binary heap, pairing heap, or Fibonacci
heap as a priority queue to implement extracting minimum efficiently. With a binary heap, the algorithm requires
O((|E| + |V|) log |V|) time (which is dominated by O(|E| log |V|), assuming the graph is connected). To
avoid an O(|V|) look-up in the decrease-key step on a vanilla binary heap, it is necessary to maintain a supplementary
index mapping each vertex to its position in the heap (and keep it up to date as the priority queue
changes), making decrease-key take only O(log |V|) time instead. The Fibonacci heap improves this to
O(|E| + |V| log |V|).
Note that for directed acyclic graphs, it is possible to find shortest paths from a given starting vertex in linear time,
by processing the vertices in a topological order, and calculating the path length for each vertex to be the minimum
length obtained via any of its incoming edges.[3]
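That linear-time method for directed acyclic graphs can be sketched as follows (an illustrative sketch that assumes a topological order is already available; negative edge lengths are acceptable here):

```python
def dag_shortest_paths(graph, order, source):
    """graph: dict u -> list of (v, weight); order: vertices in topological order.
    Each vertex's distance is the minimum obtained over its incoming edges."""
    dist = {v: float('inf') for v in order}
    dist[source] = 0
    for u in order:                      # process vertices in topological order
        if dist[u] == float('inf'):
            continue                     # u is unreachable from the source
        for v, w in graph.get(u, []):
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w    # relax each outgoing edge exactly once
    return dist
```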
Related problems and algorithms
From a dynamic programming point of view, Dijkstra's algorithm is a successive approximation scheme that solves
the dynamic programming functional equation for the shortest path problem.[4][5][6] In fact, Dijkstra's own
explanation of the logic behind the algorithm[7]
is a paraphrasing of Bellman's famous Principle of Optimality in the context of the shortest path problem.
Notes
[1] Dijkstra, Edsger; Thomas J. Misa, Editor (August 2010). "An Interview with Edsger W. Dijkstra". Communications of the ACM 53 (8): 41–47.
doi:10.1145/1787234.1787249. "What is the shortest way to travel from Rotterdam to Groningen? It is the algorithm for the shortest path
which I designed in about 20 minutes. One morning I was shopping with my young fiancée, and tired, we sat down on the café terrace to drink
a cup of coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the shortest path."
[2] Dijkstra 1959
[3] http://www.boost.org/doc/libs/1_44_0/libs/graph/doc/dag_shortest_paths.html
[4] Sniedovich, M. (2006). "Dijkstra's algorithm revisited: the dynamic programming connexion" (http://matwbn.icm.edu.pl/ksiazki/cc/cc35/cc3536.pdf) (PDF). Journal of Control and Cybernetics 35 (3): 599–620. Online version of the paper with interactive computational
modules. (http://www.ifors.ms.unimelb.edu.au/tutorial/dijkstra_new/index.html)
[5] Denardo, E.V. (2003). Dynamic Programming: Models and Applications. Mineola, NY: Dover Publications. ISBN 978-0-486-42810-9.
[6] Sniedovich, M. (2010). Dynamic Programming: Foundations and Principles. Francis & Taylor. ISBN 978-0-8247-4099-3.
[7] Dijkstra 1959, p. 270
References
Dijkstra, E. W. (1959). "A note on two problems in connexion with graphs" (http://www-m3.ma.tum.de/twiki/pub/MN0506/WebHome/dijkstra.pdf). Numerische Mathematik 1: 269–271. doi:10.1007/BF01386390.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Section 24.3: Dijkstra's
algorithm". Introduction to Algorithms (Second ed.). MIT Press and McGraw-Hill. pp. 595–601.
ISBN 0-262-03293-7.
Fredman, Michael Lawrence; Tarjan, Robert E. (1984). "Fibonacci heaps and their uses in improved network
optimization algorithms" (http://www.computer.org/portal/web/csdl/doi/10.1109/SFCS.1984.715934).
25th Annual Symposium on Foundations of Computer Science (IEEE): 338–346. doi:10.1109/SFCS.1984.715934.
Fredman, Michael Lawrence; Tarjan, Robert E. (1987). "Fibonacci heaps and their uses in improved network
optimization algorithms" (http://portal.acm.org/citation.cfm?id=28874). Journal of the Association for
Computing Machinery 34 (3): 596–615. doi:10.1145/28869.28874.
Zhan, F. Benjamin; Noon, Charles E. (February 1998). "Shortest Path Algorithms: An Evaluation Using Real
Road Networks". Transportation Science 32 (1): 65–73. doi:10.1287/trsc.32.1.65.
Leyzorek, M.; Gray, R. S.; Johnson, A. A.; Ladew, W. C.; Meaker, Jr., S. R.; Petry, R. M.; Seitz, R. N. (1957).
Investigation of Model Techniques, First Annual Report, 6 June 1956 – 1 July 1957: A Study of Model
Techniques for Communication Systems. Cleveland, Ohio: Case Institute of Technology.
External links
Oral history interview with Edsger W. Dijkstra (http://purl.umn.edu/107247), Charles Babbage Institute
University of Minnesota, Minneapolis.
Dijkstra's Algorithm in C# (http://www.codeproject.com/KB/recipes/ShortestPathCalculation.aspx)
Fast Priority Queue Implementation of Dijkstra's Algorithm in C# (http://www.codeproject.com/KB/recipes/
FastHeapDijkstra.aspx)
Applet by Carla Laffra of Pace University (http://www.dgp.toronto.edu/people/JamesStewart/270/9798s/
Laffra/DijkstraApplet.html)
Animation of Dijkstra's algorithm (http://www.cs.sunysb.edu/~skiena/combinatorica/animations/dijkstra.
html)
Visualization of Dijkstra's Algorithm (http://students.ceid.upatras.gr/~papagel/english/java_docs/minDijk.
htm)
Shortest Path Problem: Dijkstra's Algorithm (http://www-b2.is.tokushima-u.ac.jp/~ikeda/suuri/dijkstra/
Dijkstra.shtml)
Dijkstra's Algorithm Applet (http://www.unf.edu/~wkloster/foundations/DijkstraApplet/DijkstraApplet.htm)
Open Source Java Graph package with implementation of Dijkstra's Algorithm (http://code.google.com/p/
annas/)
Haskell implementation of Dijkstra's Algorithm (http://bonsaicode.wordpress.com/2011/01/04/
programming-praxis-dijkstras-algorithm/) on Bonsai code
Java Implementation of Dijkstra's Algorithm (http://algowiki.net/wiki/index.php?title=Dijkstra's_algorithm)
on AlgoWiki
QuickGraph, Graph Data Structures and Algorithms for .NET (http://quickgraph.codeplex.com/)
Implementation in Boost C++ library (http://www.boost.org/doc/libs/1_43_0/libs/graph/doc/
dijkstra_shortest_paths.html)
Implementation in T-SQL (http://hansolav.net/sql/graphs.html)
A Java library for path finding with Dijkstra's Algorithm and example applet (http://www.stackframe.com/
software/PathFinder)
A MATLAB program for Dijkstra's algorithm (http://www.mathworks.com/matlabcentral/fileexchange/
20025-advanced-dijkstras-minimum-path-algorithm)
A basic C++ implementation of Dijkstra's algorithm (http://www.technical-recipes.com/2011/
implementing-dijkstras-algorithm/)
Bellman–Ford algorithm for single-source shortest paths allowing negative edge lengths
The Bellman–Ford algorithm computes single-source shortest paths in a weighted digraph.[1] For graphs with only
non-negative edge weights, the faster Dijkstra's algorithm also solves the problem. Thus, Bellman–Ford is used
primarily for graphs with negative edge weights. The algorithm is named after its developers, Richard Bellman and
Lester Ford, Jr.
Negative edge weights are found in various applications of graphs, hence the usefulness of this algorithm.[2]
However, if a graph contains a "negative cycle", i.e., a cycle whose edges sum to a negative value, then walks of
arbitrarily low weight can be constructed by repeatedly following the cycle, so there may not be a shortest path. In
such a case, the Bellman–Ford algorithm can detect negative cycles and report their existence, but it cannot produce a
correct "shortest path" answer if a negative cycle is reachable from the source.[3][1]
Algorithm
Bellman–Ford is based on the dynamic programming approach. In its basic structure it is similar to Dijkstra's algorithm,
but instead of greedily selecting the minimum-weight node not yet processed to relax, it simply relaxes all the edges,
and does this |V| − 1 times, where |V| is the number of vertices in the graph. The repetitions allow minimum
distances to propagate accurately throughout the graph, since, in the absence of negative cycles, the shortest path can
visit each node at most once. Unlike the greedy approach, which depends on certain structural assumptions
derived from positive weights, this straightforward approach extends to the general case.
Bellman–Ford runs in O(|V|·|E|) time, where |V| and |E| are the number of vertices and edges respectively.
procedure BellmanFord(list vertices, list edges, vertex source)
   // This implementation takes in a graph, represented as lists of vertices
   // and edges, and modifies the vertices so that their distance and
   // predecessor attributes store the shortest paths.
   // Step 1: initialize graph
   for each vertex v in vertices:
       if v is source then v.distance := 0
       else v.distance := infinity
       v.predecessor := null
   // Step 2: relax edges repeatedly, |V| - 1 times
   for i from 1 to size(vertices) - 1:
       for each edge (u, v) with weight w in edges:
           if u.distance + w < v.distance:
               v.distance := u.distance + w
               v.predecessor := u
   // Step 3: check for negative-weight cycles
   for each edge (u, v) with weight w in edges:
       if u.distance + w < v.distance:
           error "Graph contains a negative-weight cycle"
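The procedure can be sketched as runnable Python (an illustrative sketch; the graph is taken as an edge list of (u, v, w) triples):

```python
def bellman_ford(vertices, edges, source):
    """vertices: iterable of vertex names; edges: list of (u, v, w) triples.
    Returns (distance, predecessor) or raises ValueError on a negative cycle."""
    distance = {v: float('inf') for v in vertices}
    predecessor = {v: None for v in vertices}
    distance[source] = 0
    # Step 2: relax every edge |V| - 1 times
    for _ in range(len(distance) - 1):
        for u, v, w in edges:
            if distance[u] + w < distance[v]:
                distance[v] = distance[u] + w
                predecessor[v] = u
    # Step 3: one more pass; any improvement means a reachable negative cycle
    for u, v, w in edges:
        if distance[u] + w < distance[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return distance, predecessor
```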
Proof of correctness
The correctness of the algorithm can be shown by induction. The precise statement shown by induction is:
Lemma. After i repetitions of the main for loop:
If Distance(u) is not infinity, it is equal to the length of some path from s to u;
If there is a path from s to u with at most i edges, then Distance(u) is at most the length of the shortest path from s
to u with at most i edges.
Proof. For the base case of induction, consider i = 0 and the moment before the for loop is executed for the first time.
Then, for the source vertex, source.distance = 0, which is correct. For other vertices u, u.distance =
infinity, which is also correct because there is no path from source to u with 0 edges.
For the inductive case, we first prove the first part. Consider a moment when a vertex's distance is updated by
v.distance := u.distance + uv.weight. By inductive assumption, u.distance is the length of
some path from source to u. Then u.distance + uv.weight is the length of the path from source to v that
follows the path from source to u and then goes to v.
For the second part, consider the shortest path from source to u with at most i edges. Let v be the last vertex before u
on this path. Then, the part of the path from source to v is the shortest path from source to v with at most i − 1 edges.
By inductive assumption, v.distance after i − 1 iterations is at most the length of this path. Therefore, uv.weight
+ v.distance is at most the length of the path from s to u. In the i-th iteration, u.distance gets compared with
uv.weight + v.distance, and is set equal to it if uv.weight + v.distance is smaller. Therefore,
after i iterations, u.distance is at most the length of the shortest path from source to u that uses at most i edges.
If there are no negative-weight cycles, then every shortest path visits each vertex at most once, so at step 3 no further
improvements can be made. Conversely, suppose no improvement can be made. Then for any cycle with vertices
v[0], ..., v[k−1],
v[i].distance <= v[(i-1) mod k].distance + v[(i-1) mod k]v[i].weight
Summing around the cycle, the v[i].distance and v[(i-1) mod k].distance terms cancel, leaving
0 <= sum from 1 to k of v[(i-1) mod k]v[i].weight
That is, every cycle has nonnegative weight.
Applications in routing
A distributed variant of the Bellman–Ford algorithm is used in distance-vector routing protocols, for example the
Routing Information Protocol (RIP). The algorithm is distributed because it involves a number of nodes (routers)
within an Autonomous system, a collection of IP networks typically owned by an ISP. It consists of the following
steps:
1. Each node calculates the distances between itself and all other nodes within the AS and stores this information as
a table.
2. Each node sends its table to all neighboring nodes.
3. When a node receives distance tables from its neighbors, it calculates the shortest routes to all other nodes and
updates its own table to reflect any changes.
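The three steps can be sketched as a toy synchronous simulation (illustrative only; real distance-vector protocols are asynchronous and message-driven, and all names here are made up):

```python
def distance_vector(links):
    """links: dict node -> dict of neighbor -> link cost.
    Returns table[node][dest] = estimated shortest distance, computed by
    repeating neighbor-table exchanges until no table changes (step 3)."""
    nodes = list(links)
    table = {n: {d: (0 if d == n else float('inf')) for d in nodes} for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:                  # each node reads its neighbors' tables
            for nbr, cost in links[n].items():
                for dest in nodes:       # and recomputes its own estimates
                    alt = cost + table[nbr][dest]
                    if alt < table[n][dest]:
                        table[n][dest] = alt
                        changed = True
    return table
```

With positive link costs the loop converges; the count-to-infinity problem mentioned below arises when links fail and estimates must grow instead of shrink.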
The main disadvantages of the Bellman–Ford algorithm in this setting are as follows:
It does not scale well.
Changes in network topology are not reflected quickly since updates are spread node-by-node.
Count to infinity (if link or node failures render a node unreachable from some set of other nodes, those nodes
may spend forever gradually increasing their estimates of the distance to it, and in the meantime there may be
routing loops).
Yen's improvement
Yen (1970) described an improvement to the Bellman–Ford algorithm for a graph without negative-weight cycles.
Yen's improvement first assigns some arbitrary linear order on all vertices and then partitions the set of all edges into
two subsets. The first subset, Ef, contains all edges (vi, vj) such that i < j; the second, Eb, contains edges (vi, vj)
such that i > j. Each vertex is visited in the order v1, v2, ..., v|V|, relaxing each outgoing edge from that vertex in Ef.
Each vertex is then visited in the order v|V|, v|V|−1, ..., v1, relaxing each outgoing edge from that vertex in Eb. Yen's
improvement effectively halves the number of "passes" required for the single-source shortest-path solution.
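Yen's two alternating sweeps can be sketched as follows (an illustrative sketch assuming no negative-weight cycles; the vertex list itself fixes the arbitrary linear order):

```python
def bellman_ford_yen(vertices, edges, source):
    """vertices: list fixing the linear order v1..v|V|; edges: list of (u, v, w)."""
    index = {v: i for i, v in enumerate(vertices)}
    out_f = {v: [] for v in vertices}    # Ef: edges going forward in the order
    out_b = {v: [] for v in vertices}    # Eb: edges going backward in the order
    for u, v, w in edges:
        (out_f if index[u] < index[v] else out_b)[u].append((v, w))
    dist = {v: float('inf') for v in vertices}
    dist[source] = 0
    for _ in range((len(vertices) + 1) // 2):   # about half the usual passes
        changed = False
        for u in vertices:                      # forward sweep relaxes Ef
            for v, w in out_f[u]:
                if dist[u] + w < dist[v]:
                    dist[v], changed = dist[u] + w, True
        for u in reversed(vertices):            # backward sweep relaxes Eb
            for v, w in out_b[u]:
                if dist[u] + w < dist[v]:
                    dist[v], changed = dist[u] + w, True
        if not changed:
            break
    return dist
```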
Notes
[1] Bang-Jensen, Jørgen; Gutin, Gregory. "Section 2.3.4: The Bellman-Ford-Moore algorithm" (http://www.cs.rhul.ac.uk/books/dbook/).
Digraphs: Theory, Algorithms and Applications (First ed.). ISBN 978-1-84800-997-4.
[2] Sedgewick, Robert. "Section 21.7: Negative Edge Weights" (http://safari.oreilly.com/0201361213/ch21lev1sec7). Algorithms in Java
(Third ed.). ISBN 0-201-36121-3.
[3] Kleinberg, Jon; Tardos, Éva (2006), Algorithm Design, New York: Pearson Education, Inc.
References
Bellman, Richard (1958). "On a routing problem". Quarterly of Applied Mathematics 16: 87–90. MR0102435.
Ford, L. R., Jr.; Fulkerson, D. R. (1962). Flows in Networks. Princeton University Press.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. Introduction to Algorithms (Second ed.). MIT Press and
McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 24.1: The Bellman–Ford algorithm, pp. 588–592.
Problem 24-1, pp. 614–615.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. Introduction to Algorithms (Third ed.). MIT Press,
2009. ISBN 978-0-262-53305-8. Section 24.1: The Bellman–Ford algorithm, pp. 651–655.
Heineman, George T.; Pollice, Gary; Selkow, Stanley (2008). "Chapter 6: Graph Algorithms". Algorithms in a
Nutshell. O'Reilly Media. pp. 160–164. ISBN 978-0-596-51624-6.
Yen, Jin Y. (1970). "An algorithm for finding shortest routes from all source nodes to a given destination in
general networks". Quarterly of Applied Mathematics 27: 526–530. MR0253822.
External links
C code example (http://snippets.dzone.com/posts/show/4828)
Open Source Java Graph package with Bellman-Ford Algorithms (http://code.google.com/p/annas/)
C++ implementation of Yen's Algorithm (Microsoft Visual Studio project) (http://www.technical-recipes.com/
2012/the-k-shortest-paths-algorithm-in-c/)
Johnson's algorithm
Algorithm description
Johnson's algorithm consists of the following steps:
1. First, a new node q is added to the graph, connected by zero-weight edges to each of the other nodes.
2. Second, the Bellman–Ford algorithm is used, starting from the new vertex q, to find for each vertex v the
minimum weight h(v) of a path from q to v. If this step detects a negative cycle, the algorithm is terminated.
3. Next the edges of the original graph are reweighted using the values computed by the Bellman–Ford algorithm:
an edge from u to v, having length w(u,v), is given the new length w(u,v) + h(u) − h(v).
4. Finally, q is removed, and Dijkstra's algorithm is used to find the shortest paths from each node s to every other
vertex in the reweighted graph.
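The four steps can be combined into one sketch (illustrative; the Bellman–Ford and Dijkstra subroutines are inlined in minimal form, and all names are made up):

```python
import heapq

def johnson(vertices, edges):
    """vertices: list; edges: list of (u, v, w) triples.
    Returns dist[u][v] for all reachable pairs, or None on a negative cycle."""
    # Step 1: add a new node q with a zero-weight edge to every vertex
    q = object()
    h = {v: float('inf') for v in vertices}
    h[q] = 0
    aug = edges + [(q, v, 0) for v in vertices]
    # Step 2: Bellman-Ford from q computes the potentials h(v)
    for _ in range(len(vertices)):
        for u, v, w in aug:
            if h[u] + w < h[v]:
                h[v] = h[u] + w
    for u, v, w in aug:
        if h[u] + w < h[v]:
            return None                      # negative cycle detected
    # Step 3: reweight: w'(u,v) = w(u,v) + h(u) - h(v) is non-negative
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w + h[u] - h[v]))
    # Step 4: Dijkstra from every vertex, then undo the reweighting
    dist = {}
    for s in vertices:
        d, heap, done = {s: 0}, [(0, s)], set()
        while heap:
            du, u = heapq.heappop(heap)
            if u in done:
                continue
            done.add(u)
            for v, w in adj[u]:
                if du + w < d.get(v, float('inf')):
                    d[v] = du + w
                    heapq.heappush(heap, (du + w, v))
        dist[s] = {v: d[v] - h[s] + h[v] for v in d}
    return dist
```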
Example
The first three stages of Johnson's algorithm are depicted in the illustration below.
The graph on the left of the illustration has two negative edges, but no negative cycles. At the center is shown the
new vertex q, a shortest path tree as computed by the Bellman–Ford algorithm with q as starting vertex, and the
values h(v) computed at each other node as the length of the shortest path from q to that node. Note that these values
are all non-positive, because q has a length-zero edge to each vertex and the shortest path can be no longer than that
edge. On the right is shown the reweighted graph, formed by replacing each edge weight w(u,v) by w(u,v) + h(u) −
h(v). In this reweighted graph, all edge weights are non-negative, but the shortest path between any two nodes uses
the same sequence of edges as the shortest path between the same two nodes in the original graph. The algorithm
concludes by applying Dijkstra's algorithm to each of the four starting nodes in the reweighted graph.
Correctness
In the reweighted graph, all paths between a pair s and t of nodes have the same quantity h(s) − h(t) added to them.
This can be proven as follows. Let p = (s = v0, v1, ..., vk = t) be an s-t path. Its weight W in the reweighted graph is
W = (w(v0,v1) + h(v0) − h(v1)) + (w(v1,v2) + h(v1) − h(v2)) + ... + (w(vk−1,vk) + h(vk−1) − h(vk)).
Every term −h(vi) is cancelled by the +h(vi) of the next summand, so the sum telescopes to
W = w(p) + h(s) − h(t),
where w(p) is the weight of p in the original graph. Since h(s) − h(t) is the same for every s-t path, a path is
shortest in the reweighted graph if and only if it is shortest in the original graph.
Analysis
The time complexity of this algorithm, using Fibonacci heaps in the implementation of Dijkstra's algorithm, is
O(V²log V + VE): the algorithm uses O(VE) time for the Bellman–Ford stage, and O(V log V + E)
for each of the V instantiations of Dijkstra's algorithm. Thus, when the graph is sparse, the total time can be faster than
that of the Floyd–Warshall algorithm, which solves the same problem in time O(V³).
References
Black, Paul E. (2004), "Johnson's Algorithm" [1], Dictionary of Algorithms and Data Structures, National
Institute of Standards and Technology.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), Introduction to Algorithms,
MIT Press and McGraw-Hill, ISBN 978-0-262-03293-3. Section 25.3, "Johnson's algorithm for sparse graphs",
pp. 636–640.
Johnson, Donald B. (1977), "Efficient algorithms for shortest paths in sparse networks", Journal of the ACM 24
(1): 1–13, doi:10.1145/321992.321993.
Suurballe, J. W. (1974), "Disjoint paths in a network", Networks 4 (2): 125–145, doi:10.1002/net.3230040204.
External links
Boost: All Pairs Shortest Paths [2]
References
[1] http://www.nist.gov/dads/HTML/johnsonsAlgorithm.html
[2] http://www.boost.org/doc/libs/1_40_0/libs/graph/doc/johnson_all_pairs_shortest.html
Data structure: graph
Worst-case performance: Θ(|V|³)
Worst-case space complexity: Θ(|V|²)
In computer science, the Floyd–Warshall algorithm (also known as Floyd's algorithm, the Roy–Warshall algorithm,
the Roy–Floyd algorithm, or the WFI algorithm) is a graph analysis algorithm for finding shortest paths in a weighted
graph (with positive or negative edge weights) and also for finding the transitive closure of a relation R. A single
execution of the algorithm will find the lengths (summed weights) of the shortest paths between all pairs of vertices,
though it does not return details of the paths themselves. The algorithm is an example of dynamic programming. It
was published in its currently recognized form by Robert Floyd in 1962. However, it is essentially the same as
algorithms previously published by Bernard Roy in 1959 and also by Stephen Warshall in 1962 for finding the
transitive closure of a graph.[1] The modern formulation of Warshall's algorithm as three nested for-loops was first
described by Peter Ingerman, also in 1962.
Algorithm
The Floyd–Warshall algorithm compares all possible paths through the graph between each pair of vertices. It is able
to do this with only Θ(|V|³) comparisons in a graph. This is remarkable considering that there may be up to Θ(|V|²)
edges in the graph, and every combination of edges is tested. It does so by incrementally improving an estimate on
the shortest path between two vertices, until the estimate is optimal.
Consider a graph G with vertices V, each numbered 1 through N. Further consider a function shortestPath(i,j,k) that
returns the shortest possible path from i to j using vertices only from the set {1,2,...,k} as intermediate points along
the way. Now, given this function, our goal is to find the shortest path from each i to each j using only vertices 1
to k+1.
For each of these pairs of vertices, the true shortest path could be either (1) a path that only uses vertices in the set
{1,...,k} or (2) a path that goes from i to k+1 and then from k+1 to j. We know that the best path from i to j that
only uses vertices 1 through k is defined by shortestPath(i,j,k), and it is clear that if there were a better path from i
to k+1 to j, then the length of this path would be the concatenation of the shortest path from i to k+1 (using
vertices in {1,...,k}) and the shortest path from k+1 to j (also using vertices in {1,...,k}).
If w(i,j) is the weight of the edge between vertices i and j, we can define shortestPath(i,j,k) in terms of the
following recursive formula. The base case is
shortestPath(i,j,0) = w(i,j)
and the recursive case is
shortestPath(i,j,k) = min( shortestPath(i,j,k−1), shortestPath(i,k,k−1) + shortestPath(k,j,k−1) ).
This formula is the heart of the Floyd–Warshall algorithm. The algorithm works by first computing
shortestPath(i,j,k) for all (i,j) pairs for k = 1, then k = 2, etc. This process continues until k = n, and we have found
the shortest path for all (i,j) pairs using any intermediate vertices.
Pseudocode
Conveniently, when calculating the k-th case, one can overwrite the information saved from the computation of k−1.
This means the algorithm uses quadratic memory. Be careful to note the initialization conditions:
/* Assume a function edgeCost(i,j) which returns the cost of the edge from i to j
   (infinity if there is none).
   Also assume that n is the number of vertices and edgeCost(i,i) = 0.
*/

int path[][];
/* A 2-dimensional matrix. At each step in the algorithm, path[i][j] is the shortest path
   from i to j using intermediate vertices (1..k−1). Each path[i][j] is initialized to
   edgeCost(i,j).
*/

procedure FloydWarshall ()
   for k := 1 to n
      for i := 1 to n
         for j := 1 to n
            path[i][j] = min ( path[i][j], path[i][k]+path[k][j] );
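The triple loop translates directly into, for example, Python (an illustrative sketch using a dense matrix; edgeCost is represented here by a dictionary of edge weights):

```python
def floyd_warshall(n, edge_cost):
    """n: number of vertices (numbered 0..n-1);
    edge_cost: dict mapping (i, j) -> weight. Returns the n x n distance matrix."""
    INF = float('inf')
    # initialization: 0 on the diagonal, edgeCost(i,j) elsewhere
    path = [[0 if i == j else edge_cost.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    for k in range(n):                  # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if path[i][k] + path[k][j] < path[i][j]:
                    path[i][j] = path[i][k] + path[k][j]
    return path
```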
Path reconstruction
The FloydWarshall algorithm typically only provides the lengths of the paths between all pairs of vertices. With
simple modifications, it is possible to create a method to reconstruct the actual path between any two endpoint
vertices. While one may be inclined to store the actual path from each vertex to each other vertex, this is not
necessary, and in fact, is very costly in terms of memory. For each vertex, one need only store the information about
the highest-index intermediate vertex one must pass through if one wishes to arrive at any given vertex. Therefore,
information to reconstruct all paths can be stored in a single N×N matrix next, where next[i][j] represents the
intermediate vertex one must travel through if one intends to take the shortest path from i to j. Implementing such a
scheme is straightforward:
procedure FloydWarshallWithPathReconstruction ()
   for k := 1 to n
      for i := 1 to n
         for j := 1 to n
            if path[i][k] + path[k][j] < path[i][j] then
               path[i][j] := path[i][k]+path[k][j];
               next[i][j] := k;

function GetPath (i,j)
   if path[i][j] equals infinity then
      return "no path";
   intermediate := next[i][j];
   if intermediate equals null then
      return " ";   /* there is an edge from i to j, with no vertices between */
   else
      return GetPath(i,intermediate) + intermediate + GetPath(intermediate,j);
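The same scheme can be sketched in Python (illustrative; next_[i][j] records an intermediate vertex, or None when the direct edge is already optimal):

```python
def floyd_warshall_paths(n, edge_cost):
    """Returns (path, next_) where next_[i][j] is an intermediate vertex on the
    shortest i-j path, or None if the direct edge is optimal."""
    INF = float('inf')
    path = [[0 if i == j else edge_cost.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    next_ = [[None] * n for _ in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if path[i][k] + path[k][j] < path[i][j]:
                    path[i][j] = path[i][k] + path[k][j]
                    next_[i][j] = k       # remember the intermediate vertex
    return path, next_

def get_path(next_, path, i, j):
    """Recursively splice the path at the recorded intermediate vertex."""
    if path[i][j] == float('inf'):
        return None                       # no path exists
    k = next_[i][j]
    if k is None:
        return [i, j]                     # direct edge, no vertices between
    return get_path(next_, path, i, k)[:-1] + get_path(next_, path, k, j)
```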
Analysis
To find all n² of the shortestPath(i,j,k) values (for all i and j) from those of shortestPath(i,j,k−1) requires 2n² operations. Since
we begin with shortestPath(i,j,0) = edgeCost(i,j) and compute the sequence of n matrices shortestPath(i,j,1),
shortestPath(i,j,2), ..., shortestPath(i,j,n), the total number of operations used is n · 2n² = 2n³. Therefore, the
complexity of the algorithm is Θ(n³).
Implementations
Implementations are available for many programming languages, for example in C++ (Boost),[4] C# (QuickGraph),[5] Common Lisp,[6] Java,[7][8] MATLAB,[9] Perl,[10] PHP,[11] PL/pgSQL,[12] R,[13] and Ruby.[14]
References
[1] Weisstein, Eric. "Floyd-Warshall Algorithm" (http://mathworld.wolfram.com/Floyd-WarshallAlgorithm.html). Wolfram MathWorld.
Retrieved 13 November 2009.
[2] "Lecture 12: Shortest paths (continued)" (http://www.ieor.berkeley.edu/~ieor266/Lecture12.pdf) (PDF). Network Flows and Graphs.
Department of Industrial Engineering and Operations Research, University of California, Berkeley. 7 October 2008.
[3] Penaloza, Rafael. "Algebraic Structures for Transitive Closure" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.7650).
[4] http://www.boost.org/libs/graph/doc/
[5] http://www.codeplex.com/quickgraph
[6] http://paste.lisp.org/display/95370
[7] http://svn.apache.org/repos/asf/commons/dormant/graph2/branches/jakarta/src/java/org/apache/commons/graph/impl/AllPaths.java
[8] http://algowiki.net/wiki/index.php?title=Floyd-Warshall%27s_algorithm
[9] http://www.mathworks.com/matlabcentral/fileexchange/10922
[10] http://search.cpan.org/search?query=Graph&mode=all
[11] http://www.microshell.com/programming/computing-degrees-of-separation-in-social-networking/2/
[12] http://www.microshell.com/programming/floyd-warshal-algorithm-in-postgresql-plpgsql/3/
[13] http://cran.r-project.org/web/packages/e1071/index.html
[14] https://github.com/chollier/ruby-floyd
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT
Press and McGraw-Hill. ISBN 0-262-03141-8.
External links
Interactive animation of the FloydWarshall algorithm (http://www.pms.informatik.uni-muenchen.de/lehre/
compgeometry/Gosper/shortest_path/shortest_path.html#visualization)
The FloydWarshall algorithm in C#, as part of QuickGraph (http://quickgraph.codeplex.com/)
Visualization of Floyd's algorithm (http://students.ceid.upatras.gr/~papagel/english/java_docs/allmin.htm)
Suurballe's algorithm
Definitions
Let G be a weighted directed graph containing a set V of n vertices and a set E of m directed edges; let s be a
designated source vertex in G, and let t be a designated destination vertex. Let each edge (u,v) in E, from vertex u to
vertex v, have a non-negative cost w(u,v).
Define d(s,u) to be the cost of the shortest path to node u from node s in the shortest path tree rooted at s.
Algorithm
Suurballe's algorithm performs the following steps:
1. Find the shortest path tree T rooted at node s by running Dijkstra's algorithm. This tree contains, for every vertex
u, a shortest path from s to u. Let P1 be the shortest-cost path from s to t. The edges in T are called tree edges and
the remaining edges are called non-tree edges.
2. Modify the cost of each edge in the graph by replacing the cost w(u,v) of every edge (u,v) by w'(u,v) = w(u,v) −
d(s,v) + d(s,u). According to the resulting modified cost function, all tree edges have a cost of 0, and non-tree
edges have a non-negative cost.
3. Create a residual graph Gt formed from G by removing the edges of G that are directed into s and by reversing the
direction of the zero-length edges along path P1.
4. Find the shortest path P2 in the residual graph Gt by running Dijkstra's algorithm.
5. Discard the reversed edges of P2 from both paths. The remaining edges of P1 and P2 form a subgraph with two
outgoing edges at s, two incoming edges at t, and one incoming and one outgoing edge at each remaining vertex.
Therefore, this subgraph consists of two edge-disjoint paths from s to t and possibly some additional (zero-length)
cycles. Return the two disjoint paths from the subgraph.
Example
The following example shows how Suurballe's algorithm finds the shortest pair of disjoint paths from A to F.
Correctness
The weight of any path from s to t in the modified system of weights equals the weight in the original graph, minus
d(s,t). Therefore, the shortest two disjoint paths under the modified weights are the same paths as the shortest two
paths in the original graph, although they have different weights.
Suurballe's algorithm may be seen as a special case of the successive shortest paths method for finding a minimum
cost flow with total flow amount two from s to t. The modification to the weights does not affect the sequence of
paths found by this method, only their weights. Therefore, the correctness of the algorithm follows from the
correctness of the successive shortest paths method.
Variations
The version of Suurballe's algorithm as described above finds paths that have disjoint edges, but that may share
vertices. It is possible to use the same algorithm to find vertex-disjoint paths, by replacing each vertex by a pair of
adjacent vertices, one with all of the incoming adjacencies of the original vertex, and one with all of the outgoing
adjacencies. Two edge-disjoint paths in this modified graph necessarily correspond to two vertex-disjoint paths in
the original graph, and vice versa, so applying Suurballe's algorithm to the modified graph results in the construction
of two vertex-disjoint paths in the original graph. Suurballe's original 1974 algorithm was for the vertex-disjoint
version of the problem, and was extended in 1984 by Suurballe and Tarjan to the edge-disjoint version.
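The vertex-splitting transformation described above can be sketched as follows; the tuple-based naming of the split copies and the dict-of-dicts representation are illustrative assumptions.

```python
def split_vertices(graph, s, t):
    """Reduce vertex-disjoint to edge-disjoint paths: each vertex v
    becomes (v,'in') -> (v,'out') with cost 0, and an original edge
    u -> v of weight w becomes (u,'out') -> (v,'in') of weight w."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    new = {}
    for v in nodes:
        new[(v, 'in')] = {(v, 'out'): 0}   # internal edge of the split vertex
        new[(v, 'out')] = {}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            new[(u, 'out')][(v, 'in')] = w
    # Search from s's out-copy to t's in-copy, so s and t themselves
    # are allowed to lie on both paths.
    return new, (s, 'out'), (t, 'in')
```

Two edge-disjoint paths in the transformed graph cannot share any internal edge (v,'in') -> (v,'out'), and therefore cannot share the vertex v in the original graph.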
By using a modified version of Dijkstra's algorithm that simultaneously computes the distances to each vertex t in
the graphs Gt, it is also possible to find the total lengths of the shortest pairs of paths from a given source vertex s to
every other vertex in the graph, in an amount of time that is proportional to a single instance of Dijkstra's algorithm.
References
[1] Suurballe, J. W. (1974), "Disjoint paths in a network", Networks 4: 125–145, doi:10.1002/net.3230040204.
[2] Suurballe, J. W.; Tarjan, R. E. (1984), "A quick method for finding shortest pairs of disjoint paths" (http://www.cse.yorku.ca/course_archive/2007-08/F/6590/Notes/surballe_alg.pdf), Networks 14: 325–336, doi:10.1002/net.3230140209.
[3] Bhandari, Ramesh (1999), "Suurballe's disjoint pair algorithms", Survivable Networks: Algorithms for Diverse Routing, Springer-Verlag, pp. 86–91, ISBN 978-0-7923-8381-9.
124
Bidirectional search
Bidirectional search is a graph search algorithm that finds a shortest path from an initial vertex to a goal vertex in a
directed graph. It runs two simultaneous searches: one forward from the initial state, and one backward from the
goal, stopping when the two meet in the middle. The reason for this approach is that in many cases it is faster: for instance, in a simplified model of search problem complexity in which both searches expand a tree with branching factor b, and the distance from start to goal is d, each of the two searches has complexity O(b^(d/2)) (in Big O notation), and the sum of these two search times is much less than the O(b^d) complexity that would result from a single search from the beginning to the goal.
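For unweighted graphs, the meet-in-the-middle idea can be illustrated with a bidirectional breadth-first search. The adjacency-list representation and the smaller-frontier-first alternation policy are illustrative assumptions; the sketch returns only the distance.

```python
from collections import deque

def bidirectional_bfs(graph, start, goal):
    """Shortest path length in an undirected, unweighted graph,
    expanding two BFS frontiers layer by layer and stopping when
    they meet.  Returns None if no path exists."""
    if start == goal:
        return 0
    dist_f, dist_b = {start: 0}, {goal: 0}
    q_f, q_b = deque([start]), deque([goal])
    while q_f and q_b:
        # Expand the smaller frontier first (a common heuristic).
        if len(q_f) <= len(q_b):
            q, dist, other = q_f, dist_f, dist_b
        else:
            q, dist, other = q_b, dist_b, dist_f
        for _ in range(len(q)):            # one full layer
            u = q.popleft()
            for v in graph.get(u, ()):
                if v in other:             # frontiers collide
                    return dist[u] + 1 + other[v]
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
    return None
```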
As in A* search, bi-directional search can be guided by a heuristic estimate of the remaining distance to the goal (in
the forward tree) or from the start (in the backward tree).
Ira Pohl was the first to design and implement a bi-directional heuristic search algorithm. Andrew Goldberg and others explain the correct termination condition for the bidirectional version of Dijkstra's algorithm.[1]
Description
A bidirectional heuristic search is a state-space search from some state s to another state t, searching from s to t and from t to s simultaneously (or quasi-simultaneously, if done on a sequential machine). It returns a valid list of operators that, if applied to s, will give us t.
While it may seem as though the operators have to be invertible for the reverse search, it is only necessary to be able to find, given any node n, the set of parent nodes of n such that there exists some valid operator from each of the parent nodes to n. This has often been likened to a one-way street in the route-finding domain: it is not necessary to be able to travel down both directions, but it is necessary, when standing at the end of the street, to determine the beginning of the street as a possible route.
Similarly, for those edges that have inverse arcs (i.e. arcs going in both directions) it is not necessary that each direction be of equal cost. The reverse search will always use the inverse cost (i.e. the cost of the arc in the forward direction): if n is a node with parent p, then the cost the reverse search assigns to moving from n back to p is the cost of the forward arc from p to n.
Each of the two searches maintains a tree of nodes: the forward tree is rooted at the start s, and the backward tree is rooted at the goal t. The leaves of each tree (sometimes referred to as the OPEN set) are the candidates for expansion. In bidirectional search, these are sometimes called the search 'frontiers' or 'wavefronts', referring to how they appear when a search is represented graphically. In this metaphor, a 'collision' occurs when, during the expansion phase, a node from one wavefront is found to have successors in the opposing wavefront. The non-leaf nodes of each tree, those already expanded, form the CLOSED set.
Front-to-Back
Front-to-Back algorithms calculate the h value of a node n by using a heuristic estimate of the distance between n and the root of the opposite search tree (s or t).
Front-to-Back is the most actively researched of the three categories. The current best algorithm (at least in the Fifteen puzzle domain) is the BiMAX-BS*F algorithm, created by Auer and Kaindl (Auer, Kaindl 2004).
Front-to-Front
Front-to-Front algorithms calculate the h value of a node n by using a heuristic estimate between n and some subset of the nodes on the opposing front: h(n) is the minimum, over nodes o on the opposing front, of H(n, o) + g(o), where H returns an admissible (i.e. not overestimating) heuristic estimate of the distance between nodes n and o.
Front-to-Front suffers from being excessively computationally demanding. Every time a node n is put into the open list, its h value must be calculated. This involves calculating a heuristic estimate from n to every node in the opposing OPEN set.
References
[1] Efficient Point-to-Point Shortest Path Algorithms (http://www.cs.princeton.edu/courses/archive/spr06/cos423/Handouts/EPP shortest path algorithms.pdf)
de Champeaux, Dennis; Sint, Lenie (1977), "An improved bidirectional heuristic search algorithm", Journal of the ACM 24 (2): 177–191, doi:10.1145/322003.322004.
de Champeaux, Dennis (1983), "Bidirectional heuristic search again", Journal of the ACM 30 (1): 22–32, doi:10.1145/322358.322360.
Pohl, Ira (1971), "Bi-directional Search", in Meltzer, Bernard; Michie, Donald, Machine Intelligence, 6, Edinburgh University Press, pp. 127–140.
Russell, Stuart J.; Norvig, Peter (2002), "3.4 Uninformed search strategies", Artificial Intelligence: A Modern Approach (2nd ed.), Prentice Hall.
A* search algorithm
In computer science, A* (pronounced "A star") is a computer algorithm that is widely used in pathfinding and graph traversal, the process of plotting an efficiently traversable path between points, called nodes. Noted for its performance and accuracy, it enjoys widespread use.
Peter Hart, Nils Nilsson and Bertram Raphael of Stanford Research Institute (now SRI International) first described
the algorithm in 1968.[1] It is an extension of Edsger Dijkstra's 1959 algorithm. A* achieves better performance (with
respect to time) by using heuristics.
Description
A* uses a best-first search and finds a least-cost path from a given initial node to one goal node (out of one or more
possible goals). As A* traverses the graph, it follows a path of the lowest known heuristic cost, keeping a sorted
priority queue of alternate path segments along the way.
It uses a distance-plus-cost heuristic function of node x (usually denoted f(x)) to determine the order in which the search visits nodes in the tree. The distance-plus-cost heuristic is a sum of two functions:
the path-cost function, which is the cost from the starting node to the current node (usually denoted g(x)), and
an admissible "heuristic estimate" of the distance from x to the goal (usually denoted h(x)).
The h(x) part of the f(x) function must be an admissible heuristic; that is, it must not overestimate the distance to the goal. Thus, for an application such as routing, h(x) might represent the straight-line distance to the goal, since that is physically the smallest possible distance between any two points or nodes.
If the heuristic h satisfies the additional condition h(x) ≤ d(x, y) + h(y) for every edge (x, y) of the graph (where d denotes the length of that edge), then h is called monotone, or consistent. In such a case, A* can be implemented more efficiently (roughly speaking, no node needs to be processed more than once; see closed set below), and A* is equivalent to running Dijkstra's algorithm with the reduced cost d'(x, y) = d(x, y) + h(y) − h(x).
History
In 1968 Nils Nilsson suggested a heuristic approach for Shakey the Robot to navigate through a room containing
obstacles. This path-finding algorithm, called A1, was a faster version of the then best known formal approach,
Dijkstra's algorithm, for finding shortest paths in graphs. Bertram Raphael suggested some significant improvements
upon this algorithm, calling the revised version A2. Then Peter E. Hart introduced an argument that established A2,
with only minor changes, to be the best possible algorithm for finding shortest paths. Hart, Nilsson and Raphael then
jointly developed a proof that the revised A2 algorithm was optimal for finding shortest paths under certain
well-defined conditions. They thus named the new algorithm in Kleene star syntax to be the algorithm that starts
with A and includes all possible version numbers or A*.
Process
Like all informed search algorithms, it first searches the routes that appear to be most likely to lead towards the goal. What sets A* apart from a greedy best-first search is that it also takes the distance already traveled into account; the g(x) part of the heuristic is the cost from the starting point, not simply the local cost from the previously expanded node.
Starting with the initial node, it maintains a priority queue of nodes to be traversed, known as the open set. The lower f(x) for a given node x, the higher its priority. At each step of the algorithm, the node with the lowest f(x) value is removed from the queue, the f and g values of its neighbors are updated accordingly, and these neighbors are added to the queue. The algorithm continues until a goal node has a lower f value than any node in the queue (or until the queue is empty). (Goal nodes may be passed over multiple times if there remain other nodes with lower f values, since those may lead to a shorter path to a goal.) Since h is zero at a goal, the f value of the goal is then the length of the shortest path. If the actual shortest path is desired, the algorithm may also update each neighbor with its immediate predecessor in the best path found so far; this information can then be used to reconstruct the path by working backwards from the goal node. Additionally, if the heuristic is monotonic (or consistent, see below), a closed set of nodes already traversed may be used to make the search more efficient.
Pseudocode
The following pseudocode describes the algorithm:

function A*(start, goal)
    closedset := the empty set    // The set of nodes already evaluated.
    openset := {start}            // The set of tentative nodes to be evaluated.
    came_from := the empty map    // The map of navigated nodes.
    g_score[start] := 0           // Cost from start along best known path.
    // Estimated total cost from start to goal.
    f_score[start] := g_score[start] + heuristic_cost_estimate(start, goal)

    while openset is not empty
        current := the node in openset having the lowest f_score[] value
        if current = goal
            return reconstruct_path(came_from, goal)

        remove current from openset
        add current to closedset
        for each neighbor in neighbor_nodes(current)
            if neighbor in closedset
                continue
            tentative_g_score := g_score[current] + dist_between(current, neighbor)

            if neighbor not in openset or tentative_g_score < g_score[neighbor]
                if neighbor not in openset
                    add neighbor to openset
                came_from[neighbor] := current
                g_score[neighbor] := tentative_g_score
                f_score[neighbor] := g_score[neighbor] + heuristic_cost_estimate(neighbor, goal)

    return failure

function reconstruct_path(came_from, current_node)
    if came_from[current_node] is set
        p := reconstruct_path(came_from, came_from[current_node])
        return (p + current_node)
    else
        return current_node
Remark: the above pseudocode assumes that the heuristic function is monotonic (or consistent, see below), which is
a frequent case in many practical problems, such as the Shortest Distance Path in road networks. However, if the
assumption is not true, nodes in the closed set may be rediscovered and their cost improved. In other words, the
closed set can be omitted (yielding a tree search algorithm) if a solution is guaranteed to exist, or if the algorithm is
adapted so that new nodes are added to the open set only if they have a lower value than at any previous iteration.
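The pseudocode can be condensed into a short executable sketch. The function names are assumptions made for illustration, the heuristic is assumed consistent (so each node is expanded at most once), and the open set is handled with lazy deletion on a binary heap rather than an explicit decrease-key operation.

```python
import heapq

def a_star(neighbors, start, goal, h):
    """A* search.  `neighbors(u)` yields (v, cost) pairs and `h(u)` is a
    consistent heuristic estimate of the distance from u to `goal`.
    Returns (cost, path) or (inf, None) when no path exists."""
    g = {start: 0}                    # best known path cost to each node
    came_from = {}
    open_heap = [(h(start), start)]   # entries are (f, node)
    closed = set()
    while open_heap:
        f, u = heapq.heappop(open_heap)
        if u == goal:
            path = [u]                # walk predecessors back to start
            while u in came_from:
                u = came_from[u]
                path.append(u)
            return g[goal], path[::-1]
        if u in closed:
            continue                  # stale (lazily deleted) heap entry
        closed.add(u)
        for v, w in neighbors(u):
            tentative = g[u] + w
            if v not in closed and tentative < g.get(v, float('inf')):
                g[v] = tentative
                came_from[v] = u
                heapq.heappush(open_heap, (tentative + h(v), v))
    return float('inf'), None
```

With h identically zero this reduces to Dijkstra's algorithm, matching the special case described later in the article.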
Example
An example (shown as a sequence of figures in the original article) of the A* algorithm in action, where nodes are cities connected with roads and h(x) is the straight-line distance to the target point.
Properties
Like breadth-first search, A* is complete and will always find a solution if one exists.
If the heuristic function h is admissible, meaning that it never overestimates the actual minimal cost of reaching the goal, then A* is itself admissible (or optimal) if we do not use a closed set. If a closed set is used, then h must also be monotonic (or consistent) for A* to be optimal. This means that for any pair of adjacent nodes x and y, where d(x, y) denotes the length of the edge between them, we must have h(x) ≤ d(x, y) + h(y). Equivalently, the value of f never decreases along a path: it is impossible to decrease (total distance so far + estimated remaining distance) by extending a path to include a neighboring node. (This is analogous to the restriction to nonnegative edge weights in Dijkstra's algorithm.) Monotonicity implies admissibility when the heuristic estimate at any goal node itself is zero, since repeatedly applying the consistency condition along the edges of a shortest path P from any node x to the nearest goal G gives h(x) ≤ (length of P) + h(G) = d(x, G).
A* is also optimally efficient for any heuristic h, meaning that no optimal algorithm employing the same heuristic will expand fewer nodes than A*, except when there are multiple partial solutions where h exactly predicts the cost of the optimal path. Even in this case, for each graph there exists some order of breaking ties in the priority queue such that A* examines the fewest possible nodes.
Special cases
Dijkstra's algorithm, as another example of a uniform-cost search algorithm, can be viewed as a special case of A* where h(x) = 0 for all x. General depth-first search can be implemented using A* by considering that there is a global counter C initialized with a very large value. Every time we process a node we assign C to all of its newly discovered neighbors. After each single assignment, we decrease the counter C by one. Thus the earlier a node is discovered, the higher its h(x) value. It should be noted, however, that both Dijkstra's algorithm and depth-first search can be implemented more efficiently without including an h(x) value at each node.
Implementation details
There are a number of simple optimizations or implementation details that can significantly affect the performance
of an A* implementation. The first detail to note is that the way the priority queue handles ties can have a significant
effect on performance in some situations. If ties are broken so the queue behaves in a LIFO manner, A* will behave
like depth-first search among equal cost paths.
When a path is required at the end of the search, it is common to keep with each node a reference to that node's
parent. At the end of the search these references can be used to recover the optimal path. If these references are being
kept then it can be important that the same node doesn't appear in the priority queue more than once (each entry
corresponding to a different path to the node, and each with a different cost). A standard approach here is to check if
a node about to be added already appears in the priority queue. If it does, then the priority and parent pointers are
changed to correspond to the lower cost path. When finding a node in a queue to perform this check, many standard
implementations of a min-heap require O(n) time. Augmenting the heap with a hash table can reduce this to constant time.
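One practical way to pair the priority queue with a hash table is lazy deletion: a dictionary records each node's current best priority, giving a constant-time membership test and an implicit decrease-key, while outdated heap entries are simply skipped on removal. This sketch is an illustration of that idea, not the specific structure the text refers to.

```python
import heapq

class IndexedPQ:
    """Min-priority queue with O(1) membership test via a hash map.
    Outdated heap entries are invalidated lazily and skipped on pop."""
    def __init__(self):
        self._heap = []
        self._best = {}                   # node -> current best priority

    def push_or_decrease(self, node, priority):
        """Insert node, or lower its priority if already present."""
        if node not in self._best or priority < self._best[node]:
            self._best[node] = priority
            heapq.heappush(self._heap, (priority, node))

    def __contains__(self, node):
        return node in self._best         # the O(1) lookup

    def pop(self):
        """Remove and return (node, priority) with the smallest priority."""
        while self._heap:
            priority, node = heapq.heappop(self._heap)
            if self._best.get(node) == priority:   # entry still current?
                del self._best[node]
                return node, priority
        raise KeyError('pop from empty queue')
```

The trade-off is extra heap entries (one per priority update) in exchange for never searching the heap.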
Bounded relaxation
While the admissibility criterion guarantees an optimal solution path, it also means that A* must examine all equally meritorious paths to find the optimal path. It is possible to speed up the search at the expense of optimality by relaxing the admissibility criterion. Oftentimes we want to bound this relaxation, so that we can guarantee that the solution path is no worse than (1 + ε) times the optimal solution path. This new guarantee is referred to as ε-admissible.
There are a number of ε-admissible algorithms:
Weighted A*. If h(n) is an admissible heuristic function, the weighted version of the A* search uses ε·h(n), with ε > 1, as the heuristic, and performs the A* search as usual (which eventually happens faster than using h, since fewer nodes are expanded). The path hence found by the search algorithm can have a cost of at most ε times that of the least-cost path in the graph.
Static Weighting[5] uses the cost function f(n) = g(n) + (1 + ε)·h(n).
Dynamic Weighting[6] uses the cost function f(n) = g(n) + (1 + ε·w(n))·h(n), where w(n) = 1 − d(n)/N if d(n) ≤ N and w(n) = 0 otherwise, and where d(n) is the depth of the search and N is the anticipated length of the solution path.
Sampled Dynamic Weighting[7] uses sampling of nodes to better estimate and debias the heuristic error.
A*ε[8] uses two heuristic functions. The first is used to form the FOCAL list of candidate nodes, and the second, h_F, is used to select the most promising node from the FOCAL list.
Aε[9] selects nodes with the function A·f(n) + B·h_F(n), where A and B are constants.
Complexity
The time complexity of A* depends on the heuristic. In the worst case, the number of nodes expanded is exponential in the length of the solution (the shortest path), but it is polynomial when the search space is a tree, there is a single goal state, and the heuristic function h meets the following condition:
|h(x) − h*(x)| = O(log h*(x))
where h* is the optimal heuristic, that is, the function that returns the true distance from x to the goal (see Pearl[4]).
Variants of A*
D*
Field D*
IDA*
Fringe
Fringe Saving A* (FSA*)
Generalized Adaptive A* (GAA*)
Lifelong Planning A* (LPA*)
Simplified Memory bounded A* (SMA*)
Theta*
A* can be adapted to a bidirectional search algorithm.
References
[1] Hart, P. E.; Nilsson, N. J.; Raphael, B. (1968). "A Formal Basis for the Heuristic Determination of Minimum Cost Paths". IEEE Transactions on Systems Science and Cybernetics SSC-4 (2): 100–107. doi:10.1109/TSSC.1968.300136.
[2] Dechter, Rina; Judea Pearl (1985). "Generalized best-first search strategies and the optimality of A*" (http://portal.acm.org/citation.cfm?id=3830&coll=portal&dl=ACM). Journal of the ACM 32 (3): 505–536. doi:10.1145/3828.3830.
[3] Koenig, Sven; Maxim Likhachev, Yaxin Liu, David Furcy (2004). "Incremental heuristic search in AI" (http://portal.acm.org/citation.cfm?id=1017140). AI Magazine 25 (2): 99–112.
[4] Pearl, Judea (1984). Heuristics: intelligent search strategies for computer problem solving (http://portal.acm.org/citation.cfm?id=525). Addison-Wesley Longman Publishing Co., Inc. ISBN 0-201-05594-5.
[5] Pohl, Ira (1970). "First results on the effect of error in heuristic search". Machine Intelligence 5: 219–236.
[6] Pohl, Ira (August 1973). "The avoidance of (relative) catastrophe, heuristic competence, genuine dynamic weighting and computational issues in heuristic problem solving". Proceedings of the Third International Joint Conference on Artificial Intelligence (IJCAI-73). 3. California, USA. pp. 11–17.
[7] Köll, Andreas; Hermann Kaindl (August 1992). "A new approach to dynamic weighting". Proceedings of the Tenth European Conference on Artificial Intelligence (ECAI-92). Vienna, Austria. pp. 16–17.
[8] Pearl, Judea; Jin H. Kim (1982). "Studies in semi-admissible heuristics". IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 4 (4): 392–399.
[9] "Aε – an efficient near admissible heuristic search algorithm". Proceedings of the Eighth International Joint Conference on Artificial Intelligence.
Further reading
Hart, P. E.; Nilsson, N. J.; Raphael, B. (1972). "Correction to 'A Formal Basis for the Heuristic Determination of Minimum Cost Paths'". SIGART Newsletter 37: 28–29.
Nilsson, N. J. (1980). Principles of Artificial Intelligence. Palo Alto, California: Tioga Publishing Company. ISBN 0-935382-01-1.
Pearl, Judea (1984). Heuristics: Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley. ISBN 0-201-05594-5.
External links
A* Pathfinding for Beginners (http://www.policyalmanac.org/games/aStarTutorial.htm)
A* with Jump point search (http://harablog.wordpress.com/2011/09/07/jump-point-search/)
Clear visual A* explanation, with advice and thoughts on path-finding (http://theory.stanford.edu/~amitp/GameProgramming/)
Variation on A* called Hierarchical Path-Finding A* (HPA*) (http://www.cs.ualberta.ca/~mmueller/ps/hpastar.pdf)
A* Algorithm tutorial (http://www.heyes-jones.com/astar.html)
A* Pathfinding in Objective-C (Xcode) (http://www.humblebeesoft.com/blog/?p=18)
NP-completeness
The NP-completeness of the decision problem can be shown using a reduction from the Hamiltonian path problem.
Clearly, if a certain general graph has a Hamiltonian path, this Hamiltonian path is the longest path possible, as it
traverses all possible vertices. To solve the Hamiltonian path problem using an algorithm for the longest path
problem, we use the algorithm for the longest path problem on the same input graph and set k=|V|-1, where |V| is the
number of vertices in the graph.
If there is a Hamiltonian path in the graph, then the algorithm will return yes, since the Hamiltonian path has length
equal to |V|-1. Conversely, if the algorithm outputs yes, there is a simple path of length |V|-1 in the graph. Since it is a
simple path of length |V|-1, it is a path that visits all the vertices of the graph without repetition, and this is a
Hamiltonian path by definition.
Since the Hamiltonian path problem is NP-complete, this reduction shows that this problem is NP-hard. To show that
it is NP-complete, we also have to show that it is in NP. This is easy to see, however, since the certificate for the
yes-instance is a description of the path of length k.
algorithm dag-longest-path is
    input:  Directed acyclic graph G
    output: Length of the longest path

    length_to = array with |V(G)| elements of type int with default value 0

    for each vertex v in topOrder(G) do
        for each edge (v, w) in E(G) do
            if length_to[w] <= length_to[v] + weight(G, (v, w)) then
                length_to[w] = length_to[v] + weight(G, (v, w))

    return max(length_to[v] for v in V(G))
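The pseudocode above translates directly into Python; the topological order is computed here with Kahn's algorithm, and the dict-of-dicts graph representation is an assumption made for illustration.

```python
from collections import deque

def dag_longest_path_length(graph):
    """Length of the longest path in a weighted DAG.
    `graph` maps each vertex to a dict {successor: edge weight}."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    indeg = {v: 0 for v in nodes}
    for u, nbrs in graph.items():
        for v in nbrs:
            indeg[v] += 1
    # Kahn's algorithm: a vertex is processed only after all of its
    # predecessors, so length_to[u] is final when u is dequeued.
    queue = deque(v for v in nodes if indeg[v] == 0)
    length_to = {v: 0 for v in nodes}
    while queue:
        u = queue.popleft()
        for v, w in graph.get(u, {}).items():
            length_to[v] = max(length_to[v], length_to[u] + w)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return max(length_to.values())
```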
Correctness can be checked as follows: the topological order traverses the given graph in layers of ascending distance from the source vertices of the graph (first all vertices with distance 0 from the source vertices, then all vertices with distance 1, and so on). For each vertex, a maximum-length path ending at that vertex consists of a maximum-length path through the layers closer to the source, followed by the final edge into the vertex itself.
Related problems
Travelling salesman problem
GallaiHasseRoyVitaver theorem
External links
NP-Completeness [1]
Source of the algorithm [2]
"Find the Longest Path [3]", song by Dan Barrett
References
[1] http://www.ics.uci.edu/~eppstein/161/960312.html
[2] http://www.cs.princeton.edu/courses/archive/spr04/cos226/lectures/digraph.4up.pdf
[3] http://valis.cs.uiuc.edu/~sariel/misc/funny/longestpath.mp3
A closely related problem, the minimax path problem, asks for the path that minimizes the maximum weight of any
of its edges. It has applications that include transportation planning.[7] Any algorithm for the widest path problem
can be transformed into an algorithm for the minimax path problem, or vice versa, by reversing the sense of all the
weight comparisons performed by the algorithm, or equivalently by replacing every edge weight by its negation.
Undirected graphs
In an undirected graph, a widest path may be found as the path between the two vertices in the maximum spanning
tree of the graph, and a minimax path may be found as the path between the two vertices in the minimum spanning
tree.[8][9][10]
In any graph, directed or undirected, there is a straightforward algorithm for finding a widest path once the weight of
its minimum-weight edge is known: simply delete all smaller edges and search for any path among the remaining
edges using breadth-first search or depth-first search. Based on this test, there also exists a linear-time algorithm for
finding a widest s-t path in an undirected graph, that does not use the maximum spanning tree. The main idea of the
algorithm is to apply the linear-time path-finding algorithm to the median edge weight in the graph, and then either
to delete all smaller edges or contract all larger edges according to whether a path does or does not exist, and recurse
in the resulting smaller graph.[9][11][12]
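The threshold test underlying this approach (delete all edges below a candidate bottleneck weight and check whether any s-t path survives) can be sketched as follows. The median-weight recursion is omitted here; instead, for clarity, each distinct weight is tried from largest to smallest, which is correct but not linear time. The symmetric adjacency-dict representation of the undirected graph is an assumption.

```python
from collections import deque

def st_path_exists(graph, s, t, threshold):
    """BFS restricted to edges of weight >= threshold."""
    seen, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v, w in graph.get(u, {}).items():
            if w >= threshold and v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def widest_path_width(graph, s, t):
    """Largest w such that some s-t path uses only edges of weight >= w:
    try each distinct edge weight as the threshold, largest first."""
    weights = sorted({w for nbrs in graph.values() for w in nbrs.values()},
                     reverse=True)
    for w in weights:
        if st_path_exists(graph, s, t, w):
            return w
    return None
```

Replacing the linear scan over weights with the median-pivot recursion described above is what brings the running time down to linear.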
Fernandez, Garfinkel & Arbiol (1998) use undirected bottleneck shortest paths in order to form composite aerial
photographs that combine multiple images of overlapping areas. In the subproblem to which the widest path problem
applies, two images have already been transformed into a common coordinate system; the remaining task is to select
Directed graphs
In directed graphs, the maximum spanning tree solution cannot be used. Instead, several different algorithms are
known; the choice of which algorithm to use depends on whether a start or destination vertex for the path is fixed, or
whether paths for many start or destination vertices must be found simultaneously.
All pairs
The all-pairs widest path problem has applications in the Schulze method for choosing a winner in multiway
elections in which voters rank the candidates in preference order. The Schulze method constructs a complete directed
graph in which the vertices represent the candidates and every two vertices are connected by an edge. Each edge is
directed from the winner to the loser of a pairwise contest between the two candidates it connects, and is labeled with
the margin of victory of that contest. Then the method computes widest paths between all pairs of vertices, and the
winner is the candidate whose vertex has wider paths to each opponent than vice versa.[2] The results of an election
using this method are consistent with the Condorcet method (a candidate who wins all pairwise contests automatically wins the whole election), but it generally allows a winner to be selected even in situations where the Condorcet method itself fails.[15] The Schulze method has been used by several organizations including the Wikimedia Foundation.[16]
To compute the widest path widths for all pairs of nodes in a dense directed graph, such as the ones that arise in the voting application, the asymptotically fastest known approach takes time O(n^((3+ω)/2)), where ω is the exponent for fast matrix multiplication. Using the best known algorithms for matrix multiplication, this time bound becomes O(n^2.688).[17] Instead, the reference implementation for the Schulze method uses a modified version of the simpler Floyd–Warshall algorithm, which takes O(n^3) time.[2] For sparse graphs, it may be more efficient to repeatedly apply a single-source widest path algorithm.
Single source
If the edges are sorted by their weights, then a modified version of Dijkstra's algorithm can compute the bottlenecks
between a designated start vertex and every other vertex in the graph, in linear time. The key idea behind the
speedup over a conventional version of Dijkstra's algorithm is that the sequence of bottleneck distances to each
vertex, in the order that the vertices are considered by this algorithm, is a monotonic subsequence of the sorted
sequence of edge weights; therefore, the priority queue of Dijkstra's algorithm can be replaced by an array indexed
by the numbers from 1 to m (the number of edges in the graph), where array cell i contains the vertices whose
bottleneck distance is the weight of the edge with position i in the sorted order. This method allows the widest path
problem to be solved as quickly as sorting; for instance, if the edge weights are represented as integers, then the time
bounds for integer sorting a list of m integers would apply also to this problem.[12]
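Without the sorted-edge bucket-queue speedup described above, the same Dijkstra-style computation can be sketched with an ordinary binary heap. The dict-of-dicts representation and the assumption of positive edge weights are illustrative.

```python
import heapq

def widest_paths_from(graph, s):
    """Bottleneck (widest-path) 'distances' from s to every reachable
    vertex, via a max-heap variant of Dijkstra's algorithm: the label of
    a path is the minimum edge weight along it, and we maximize it."""
    width = {s: float('inf')}
    heap = [(-float('inf'), s)]           # negate priorities for a max-heap
    done = set()
    while heap:
        negw, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale heap entry
        done.add(u)
        for v, w in graph.get(u, {}).items():
            cand = min(-negw, w)          # bottleneck of s..u extended by (u,v)
            if cand > width.get(v, 0):    # assumes positive edge weights
                width[v] = cand
                heapq.heappush(heap, (-cand, v))
    return width
```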
References
[1] Shacham, N. (1992), "Multicast routing of hierarchical data", IEEE International Conference on Communications (ICC '92), 3, pp. 1217–1221, doi:10.1109/ICC.1992.268047; Wang, Zheng; Crowcroft, J. (1995), "Bandwidth-delay based routing algorithms", IEEE Global Telecommunications Conference (GLOBECOM '95), 3, pp. 2129–2133, doi:10.1109/GLOCOM.1995.502780
[2] Schulze, Markus (2011), "A new monotonic, clone-independent, reversal symmetric, and Condorcet-consistent single-winner election method", Social Choice and Welfare 36 (2): 267–303, doi:10.1007/s00355-010-0475-4
[3] Fernandez, Elena; Garfinkel, Robert; Arbiol, Roman (1998), "Mosaicking of aerial photographic maps via seams defined by bottleneck shortest paths", Operations Research 46 (3): 293–304, JSTOR 222823
[4] Ullah, E.; Lee, Kyongbum; Hassoun, S. (2009), "An algorithm for identifying dominant-edge metabolic pathways" (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5361299), IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2009), pp. 144–150
[5] Ahuja, Ravindra K.; Magnanti, Thomas L.; Orlin, James B. (1993), "7.3 Capacity Scaling Algorithm", Network Flows: Theory, Algorithms and Applications, Prentice Hall, pp. 210–212, ISBN 0-13-617549-X
[6] Pollack, Maurice (1960), "The maximum capacity through a network", Operations Research 8 (5): 733–736, JSTOR 167387
[7] Berman, Oded; Handler, Gabriel Y. (1987), "Optimal Minimax Path of a Single Service Unit on a Network to Nonservice Destinations", Transportation Science 21 (2): 115–122, doi:10.1287/trsc.21.2.115
[8] Hu, T. C. (1961), "The maximum capacity route problem", Operations Research 9 (6): 898–900, doi:10.1287/opre.9.6.898, JSTOR 167055
[9] Punnen, Abraham P. (1991), "A linear time algorithm for the maximum capacity path problem", European Journal of Operational Research 53 (3): 402–404, doi:10.1016/0377-2217(91)90073-5
[10] Malpani, Navneet; Chen, Jianer (2002), "A note on practical construction of maximum bandwidth paths", Information Processing Letters 83 (3): 175–180, doi:10.1016/S0020-0190(01)00323-4, MR 1904226
[11] Camerini, P. M. (1978), "The min-max spanning tree problem and some extensions", Information Processing Letters 7 (1): 10–14, doi:10.1016/0020-0190(78)90030-3
[12] Kaibel, Volker; Peinhardt, Matthias A. F. (2006), On the bottleneck shortest path problem (http://www.zib.de/Publications/Reports/ZR-06-22.pdf), ZIB-Report 06-22, Konrad-Zuse-Zentrum für Informationstechnik Berlin
[13] Leclerc, Bruno (1981), "Description combinatoire des ultramétriques" (in French), Centre de Mathématique Sociale. École Pratique des Hautes Études. Mathématiques et Sciences Humaines (73): 5–37, 127, MR 623034
[14] Demaine, Erik D.; Landau, Gad M.; Weimann, Oren (2009), "On Cartesian trees and range minimum queries", Automata, Languages and Programming, 36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5–12, 2009, Lecture Notes in Computer Science, 5555, pp. 341–353, doi:10.1007/978-3-642-02927-1_29
[15] More specifically, the only kind of tie that the Schulze method fails to break is between two candidates who have equally wide paths to each other.
[16] See Jesse Plamondon-Willard, Board election to use preference voting, May 2008; Mark Ryan, 2008 Wikimedia Board Election results, June 2008; 2008 Board Elections, June 2008; and 2009 Board Elections, August 2009.
[17] Duan, Ran; Pettie, Seth (2009), "Fast algorithms for (max, min)-matrix multiplication and bottleneck shortest paths" (http://portal.acm.org/citation.cfm?id=1496813), Proceedings of the 20th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '09), pp. 384–391. For an earlier algorithm that also used fast matrix multiplication to speed up all pairs widest paths, see Vassilevska, Virginia; Williams, Ryan; Yuster, Raphael (2007), "All-pairs bottleneck paths for general graphs in truly sub-cubic time", Proceedings of the 39th Annual ACM Symposium on Theory of Computing (STOC '07), New York: ACM, pp. 585–589, doi:10.1145/1250790.1250876, MR 2402484, and Chapter 5 of Vassilevska, Virginia (2008), Efficient Algorithms for Path Problems in Weighted Graphs (http://www.cs.cmu.edu/afs/cs/Web/People/virgi/thesis.pdf), Ph.D. thesis, Report CMU-CS-08-147, Carnegie Mellon University School of Computer Science
[18] Gabow, Harold N.; Tarjan, Robert E. (1988), "Algorithms for two bottleneck optimization problems", Journal of Algorithms 9 (3): 411–417, doi:10.1016/0196-6774(88)90031-4, MR 955149
[19] Han, Yijie; Thorup, M. (2002), "Integer sorting in O(n√(log log n)) expected time and linear space", Proc. 43rd Annual Symposium on Foundations of Computer Science (FOCS 2002), pp. 135–144, doi:10.1109/SFCS.2002.1181890.
Problem description
For a given instance, there are a number of possibilities, or realizations, of how the hidden graph may look. Given an
instance, a description of how to follow the instance in the best way is called a policy. The CTP task is to compute
the expected cost of the optimal policies. To compute an actual description of an optimal policy may be a harder
problem.
Given an instance and policy for the instance, every realization produces its own (deterministic) walk in the graph.
Note that the walk is not necessarily a path since the best strategy may be to, e.g., visit every vertex of a cycle and
return to the start. This differs from the shortest path problem (with strictly positive weights), where repetitions in a
walk imply that a better solution exists.
Variants
There are primarily five parameters distinguishing the variants of the Canadian Traveller Problem. The
first parameter is how to value the walk produced by a policy for a given instance and realization. In the Stochastic
Shortest Path Problem with Recourse, the goal is simply to minimize the cost of the walk (defined as the sum over
all edges of the cost of the edge times the number of times that edge was taken). For the Canadian Traveller Problem,
the task is to minimize the competitive ratio of the walk, i.e. to minimize the ratio between the cost of the produced
walk and the cost of the shortest path in the realization.
The second parameter is how to evaluate a policy with respect to different realizations consistent with the instance
under consideration. In the Canadian Traveller Problem, one wishes to study the worst case and in SSPPR, the
average case. For average case analysis, one must furthermore specify an a priori distribution over the realizations.
The third parameter is restricted to the stochastic versions and is about what assumptions we can make about the
distribution of the realizations and how the distribution is represented in the input. In the Stochastic Canadian
Traveller Problem and in the Edge-independent Stochastic Shortest Path Problem (i-SSPPR), each uncertain edge (or
cost) has an associated probability of being in the realization and the event that an edge is in the graph is independent
of which other edges are in the realization. Even though this is a considerable simplification, the problem is still
#P-hard. Another variant is to make no assumption on the distribution but require that each realization with non-zero
probability be explicitly stated (such as Probability 0.1 of edge set { {3,4},{1,2} }, probability 0.2 of ...). This is
called the Distribution Stochastic Shortest Path Problem (d-SSPPR or R-SSPPR) and is NP-complete. The first
variant is harder than the second because the former can represent in logarithmic space some distributions that the
latter represents in linear space.
The fourth parameter is how the graph changes over time. In CTP and SSPPR, the realization is fixed but
not known. In the Stochastic Shortest Path Problem with Recourse and Resets or the Expected Shortest Path
problem, a new realization is chosen from the distribution after each step taken by the policy. This problem can be
solved in polynomial time by reducing it to a Markov decision process with polynomial horizon. The Markov
generalization, where the realization of the graph may influence the next realization, is known to be much harder.
An additional parameter is how new knowledge is being discovered on the realization. In traditional variants of CTP,
the agent uncovers the exact weight (or status) of an edge upon reaching an adjacent vertex. A new variant was
recently suggested where an agent also has the ability to perform remote sensing from any location on the
realization. In this variant, the task is to minimize the travel cost plus the cost of sensing operations.
Formal definition
We define the variant studied in the 1989 paper by Papadimitriou and Yannakakis, in which the goal is to minimize the competitive ratio in the worst case. We begin by introducing some terminology.
Consider a given graph and the family of undirected graphs that can be constructed by adding one or more edges from a given set. Formally, let 𝒢(V, E, F) = { (V, E ∪ F') : F' ⊆ F }, where we think of E as the edges that must be in the graph and of F as the edges that may be in the graph. Each member of 𝒢(V, E, F) is called a realization of the graph family. Furthermore, let W be an associated cost matrix, where w(u, v) is the cost of going from vertex u to vertex v along the edge {u, v}.
For a realization G in 𝒢(V, E, F) and vertices s and t, let c(G, s, t) be the cost of the shortest path in G from s to t. This is called the off-line problem, because an algorithm for such a problem would have complete information of the graph.
A strategy (or policy) for navigating such a graph is a mapping that, given the current vertex, the set of edges known to be in the graph, and the set of edges whose presence is still unknown, selects the next edge to traverse. In other words, we evaluate the policy based on the edges we currently know are in the graph and the edges we know might be in the graph. When we take a step in the graph, the edges adjacent to our new location become known to us: those edges that are in the realization are added to the set of known edges, and, present or not, they are removed from the set of unknown edges. If the goal t is never reached, the strategy is assigned infinite cost. If the goal is reached, we define the cost of the walk as the sum of the costs of all of the edges traversed, counted with multiplicity.
Finally, the Canadian traveller problem is the following decision problem: given a CTP instance (V, E, F, W, s, t) and a number r, is there a policy whose walk, on every realization, costs at most r times the off-line shortest-path cost c(G, s, t) of that realization? A policy attaining the smallest such worst-case ratio is called optimal.
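To make the notions of realization and off-line cost concrete, here is a small sketch (the graph, weights, and names are invented for illustration, not taken from the text): every subset of the optional edge set F yields one realization, and the off-line problem is an ordinary shortest-path computation on that realization.

```python
from itertools import combinations
import heapq

def shortest_path_cost(edges, s, t):
    """Dijkstra over an undirected weighted edge list: the off-line cost c(G, s, t)."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist, heap = {s: 0}, [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")

# Must-edges E and maybe-edges F on vertices s, a, t (illustrative weights).
E = [("s", "a", 1), ("a", "t", 5)]
F = [("s", "t", 1)]

# Every subset of F yields one realization; the off-line optimum varies with it.
for k in range(len(F) + 1):
    for extra in combinations(F, k):
        print(sorted(extra), shortest_path_cost(E + list(extra), "s", "t"))
```

A policy for this instance must commit to a next edge before knowing whether the optional edge {s, t} is present, which is exactly what makes the on-line problem harder than the off-line one.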
Complexity
The original paper analysed the complexity of the problem and reported it to be PSPACE-complete. It was also
shown that finding an optimal path in the case where each edge has an associated probability of being in the graph
(i-SSPPR) is a PSPACE-easy but #P-hard problem.[1] It is an open problem to bridge this gap.
The directed version of the stochastic problem is known in operations research as the Stochastic Shortest Path
Problem with Recourse.
Applications
The problem is said to have applications in operations research, transportation planning, artificial intelligence,
machine learning, communication networks, and routing. A variant of the problem has been studied for robot
navigation with probabilistic landmark recognition.[2]
Open problems
Despite the age of the problem and its many potential applications, many natural questions still remain open. Is there
a constant-factor approximation or is the problem APX-hard? Is i-SSPPR #P-complete? An even more fundamental
question has been left unanswered: is there a polynomial-size description of an optimal policy, setting aside for a
moment the time necessary to compute the description?[3]
Notes
[1] Papadimitriou and Yannakakis, 1982, p. 148
[2] Amy J. Briggs; Carrick Detweiler; Daniel Scharstein (2004). "Expected shortest paths for landmark-based robot navigation". International
Journal of Robotics Research 23 (7–8): 717–718. doi:10.1177/0278364904045467.
[3] Karger and Nikolova, 2008, p. 1
References
C.H. Papadimitriou; M. Yannakakis (1989). "Shortest paths without a map". Proc. 16th ICALP, Lecture Notes in
Computer Science 372. Springer-Verlag. pp. 610–620.
David Karger; Evdokia Nikolova (2008). Exact Algorithms for the Canadian Traveller Problem on Paths and
Trees.
Zahy Bnaya; Ariel Felner; Solomon Eyal Shimony (2009). Canadian Traveller Problem with remote sensing.
Degree centrality
Historically first and conceptually simplest is degree centrality, which is defined as the number of links incident
upon a node (i.e., the number of ties that a node has). The degree can be interpreted in terms of the immediate risk of
a node for catching whatever is flowing through the network (such as a virus, or some information). In the case of a
directed network (where ties have direction), we usually define two separate measures of degree centrality, namely
indegree and outdegree. Accordingly, indegree is a count of the number of ties directed to the node and outdegree is
the number of ties that the node directs to others. When ties are associated with positive aspects such as
friendship or collaboration, indegree is often interpreted as a form of popularity, and outdegree as gregariousness.
The degree centrality of a vertex v, for a given graph G = (V, E) with |V| vertices and |E| edges, is defined as

C_D(v) = deg(v)

Calculating degree centrality for all the nodes in a graph takes Θ(V^2) in a dense adjacency-matrix representation of the graph, and Θ(E) in a sparse-matrix representation.
The definition of centrality can be extended from the node level to the whole graph: the degree centralization of a graph is the sum of the differences between the highest degree centrality and the degree centrality of every other node, divided by the theoretically largest such sum, H. The value of H is maximized, at H = (n − 1)(n − 2), by the star graph, which contains one central node to which all other nodes are connected.
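As an illustrative sketch (the graph and function names here are our own, not from the text), degree centrality is a direct neighbour count, optionally normalized by the maximum possible degree n − 1:

```python
def degree_centrality(adj, normalized=True):
    """Return the (optionally normalized) degree centrality of every node
    of a graph given as an adjacency dict: node -> set of neighbours."""
    n = len(adj)
    scale = (n - 1) if (normalized and n > 1) else 1
    return {v: len(neighbors) / scale for v, neighbors in adj.items()}

# Star graph on 5 nodes: the hub touches every leaf.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(degree_centrality(star))  # hub scores 1.0, each leaf 0.25
```

The star graph also attains the maximal centralization H = (n − 1)(n − 2) discussed above.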
Closeness centrality
In graphs there is a natural distance metric between all pairs of nodes, defined by the length of their shortest paths.
The farness of a node s is defined as the sum of its distances to all other nodes, and its closeness is defined as the
inverse of the farness.[3] Thus, the more central a node is, the lower its total distance to all other nodes. Closeness
can be regarded as a measure of how long it will take to spread information from s to all other nodes sequentially.[4]
In the classic definition of the closeness centrality, the spread of information is modeled by the use of shortest paths.
This model might not be the most realistic for all types of communication scenarios. Thus, related definitions have
been discussed to measure closeness, like the random walk closeness centrality introduced by Noh and Rieger
(2004). It measures the speed with which randomly walking messages reach a vertex from elsewhere in the
network, a sort of random-walk version of closeness centrality.[5]
The information centrality of Stephenson and Zelen (1989) is another closeness measure, which bears some
similarity to that of Noh and Rieger. In essence it measures the harmonic mean length of paths ending at a vertex i,
which is smaller if i has many short paths connecting it to other vertices.[6]
Note that, by the definition of graph-theoretic distances, the classic closeness centrality of all nodes in an unconnected
graph would be 0. In a work relating to network vulnerability, Dangalchev (2006) modified the definition of closeness
so that it can be calculated more easily and can also be applied to graphs which lack connectivity:[7]

C(v) = Σ_{t ≠ v} 2^(−d(v, t))
Another extension to networks with disconnected components has been proposed by Opsahl (2010).[8]
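For unweighted graphs, the classic closeness of a single node can be sketched with one breadth-first search (the graph and names below are illustrative, not from the text):

```python
from collections import deque

def closeness(adj, s):
    """Classic closeness of s: inverse of the sum of shortest-path
    distances from s to all other nodes (unweighted graph, via BFS)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    farness = sum(dist.values())
    return 1.0 / farness if farness else 0.0

path = {0: [1], 1: [0, 2], 2: [1]}   # path graph 0-1-2
print(closeness(path, 1))  # middle vertex: 1/(1+1) = 0.5
print(closeness(path, 0))  # endpoint: 1/(1+2) ≈ 0.333
```

Running one BFS per node gives all closeness values in O(VE) total time on unweighted graphs.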
Betweenness centrality
Betweenness is a centrality measure of a vertex within a
graph (there is also edge betweenness, which is not
discussed here). It was introduced as a measure for
quantifying the control of a human on the communication
between other humans in a social network by Linton
Freeman.[9] In his conception, vertices that have a high
probability to occur on a randomly chosen shortest path
between two randomly chosen nodes have a high
betweenness.
The betweenness of a vertex v in a graph G = (V, E) with |V| vertices is

C_B(v) = Σ_{s ≠ v ≠ t ∈ V} σ_st(v) / σ_st

where σ_st is the number of shortest paths from node s to node t, and σ_st(v) is the number of those shortest paths that
pass through v. The betweenness may be normalised by dividing by the number of pairs of vertices not including v,
which is (n − 1)(n − 2) for directed graphs and (n − 1)(n − 2)/2 for undirected graphs. For example, in an undirected
star graph, the center vertex (which is contained in every possible shortest path) would have a betweenness of
(n − 1)(n − 2)/2 (1, if normalised),
while the leaves (which are contained in no shortest paths) would have a betweenness of 0.
From a calculation aspect, both betweenness and closeness centralities of all vertices in a graph involve calculating
the shortest paths between all pairs of vertices on a graph, which requires O(V^3) time with the Floyd–Warshall
algorithm. However, on sparse graphs, Johnson's algorithm may be more efficient, taking O(V^2 log V + VE)
time. In the case of unweighted graphs the calculations can be done with Brandes' algorithm,[10] which takes
O(VE) time. Normally, these algorithms assume that graphs are undirected and connected with the allowance of
loops and multiple edges. When specifically dealing with network graphs, oftentimes graphs are without loops or
multiple edges to maintain simple relationships (where edges represent connections between two people or vertices).
In this case, using Brandes' algorithm will divide final centrality scores by 2 to account for each shortest path being
counted twice.[10]
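Brandes' algorithm for the unweighted case can be sketched as follows (a simplified version for undirected simple graphs; the graph and variable names are illustrative):

```python
from collections import deque

def brandes_betweenness(adj):
    """Brandes' algorithm for betweenness centrality of an
    unweighted, undirected graph; runs in O(VE) time."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        sigma = {v: 0 for v in adj}; sigma[s] = 1
        dist = {v: -1 for v in adj}; dist[s] = 0
        order, queue, preds = [], deque([s]), {v: [] for v in adj}
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += (sigma[u] / sigma[w]) * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected shortest path was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}

path = {0: [1], 1: [0, 2], 2: [1]}   # path graph 0-1-2
print(brandes_betweenness(path))  # middle vertex 1.0, endpoints 0.0
```

The final division by 2 is exactly the adjustment for undirected graphs described above.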
Eigenvector centrality
Eigenvector centrality is a measure of the influence of a node in a network. It assigns relative scores to all nodes in
the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in
question than equal connections to low-scoring nodes. Google's PageRank is a variant of the Eigenvector centrality
measure.[11] Another closely related centrality measure is Katz centrality.
For a given graph G = (V, E) with |V| vertices, let A = (a_{v,t}) be the adjacency matrix, i.e. a_{v,t} = 1 if vertex v
is linked to vertex t, and a_{v,t} = 0 otherwise. The centrality score x_v of vertex v can be defined as:

x_v = (1/λ) Σ_{t ∈ M(v)} x_t = (1/λ) Σ_t a_{v,t} x_t

where M(v) is the set of neighbours of v and λ is a constant. With a small rearrangement, this can be rewritten in
vector notation as the eigenvector equation Ax = λx. In general, there will be many different eigenvalues λ for
which an eigenvector solution exists. However, the
additional requirement that all the entries in the eigenvector be positive implies (by the Perron–Frobenius theorem)
that only the greatest eigenvalue results in the desired centrality measure.[12] The v-th component of the related
eigenvector then gives the centrality score of the vertex v in the network. Power iteration is one of many eigenvalue
algorithms that may be used to find this dominant eigenvector.[11] Furthermore, this can be generalized so that the
entries in A can be real numbers representing connection strengths, as in a stochastic matrix.
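The power-iteration approach mentioned above can be sketched with plain dictionaries instead of a matrix library (the graph and names are illustrative; a fixed iteration count stands in for a proper convergence test):

```python
def eigenvector_centrality(g, iters=100):
    """Power iteration for the dominant eigenvector of the adjacency
    matrix (a sketch: fixed iteration count, no convergence test)."""
    x = {v: 1.0 for v in g}
    for _ in range(iters):
        y = {v: sum(x[u] for u in g[v]) for v in g}   # y = A x
        norm = max(y.values()) or 1.0
        x = {v: s / norm for v, s in y.items()}       # rescale each round
    return x

# Triangle 0-1-2 with a pendant vertex 3 attached to 0 (non-bipartite,
# so the iteration converges rather than oscillating).
g = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(eigenvector_centrality(g))  # vertex 0 scores highest (1.0 after scaling)
```

Note that on a bipartite graph the raw power iteration on A can oscillate with period 2, which is one reason practical implementations add a shift or damping term.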
Katz centrality is a generalization of degree centrality that counts walks of all lengths, attenuated by a factor α. It
has been shown[14] that the principal eigenvector (associated with the largest eigenvalue λ of the adjacency matrix A)
is the limit of Katz centrality as α approaches 1/λ from below.
PageRank further divides each neighbour's contribution by L(j), the number of neighbours of node j
(or the number of outbound links in a directed graph). Compared to eigenvector centrality and Katz centrality, one
major difference is that the PageRank vector is a left-hand eigenvector (note that the factor a_{ji} has its indices
reversed).
Centralization
The centralization of any network is a measure of how central its most central node is in relation to how central all
the other nodes are.[19] The general definition of centralization for non-weighted networks was proposed by Linton
Freeman (1979). Centralization measures then (a) calculate the sum of differences in centrality between the most
central node in a network and all other nodes; and (b) divide this quantity by the theoretically largest such sum of
differences in any network of the same degree.[19] Thus, every centrality measure can have its own centralization
measure. Defined formally, if C_x(p_i) is any centrality measure of point p_i, if C_x(p*) is the largest such measure
in the network, and if max Σ_{i=1}^{N} [C_x(p*) − C_x(p_i)] is the largest possible sum of differences in point
centrality for any graph with the same number of nodes, then the centralization of the network is:[19]

C_x = Σ_{i=1}^{N} [C_x(p*) − C_x(p_i)] / max Σ_{i=1}^{N} [C_x(p*) − C_x(p_i)]
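The two steps of Freeman's recipe translate directly into a short function (names and the example are our own; the caller supplies the theoretical maximum sum of differences for the chosen centrality measure):

```python
def centralization(scores, h_max):
    """Freeman centralization: sum of differences from the most central
    node, divided by the maximum possible sum h_max for a network of
    the same size and centrality measure."""
    c_star = max(scores.values())
    return sum(c_star - c for c in scores.values()) / h_max

# Degree centralization of a 5-node star: for degree, the theoretical
# maximum sum of differences is (n-1)(n-2) = 12, and the star attains it.
degrees = {0: 4, 1: 1, 2: 1, 3: 1, 4: 1}
print(centralization(degrees, (5 - 1) * (5 - 2)))  # 1.0
```

Any of the centrality measures above (degree, closeness, betweenness, eigenvector) can be plugged into the same formula with its own h_max.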
Further reading
Freeman, L. C. (1979). Centrality in social networks: Conceptual clarification. Social Networks, 1(3), 215-239.
Sabidussi, G. (1966). The centrality index of a graph. Psychometrika, 31 (4), 581-603.
Freeman, L. C. (1977). A set of measures of centrality based on betweenness. Sociometry 40, 35-41.
Koschützki, D.; Lehmann, K. A.; Peeters, L.; Richter, S.; Tenfelde-Podehl, D. and Zlotowski, O. (2005)
Centrality Indices. In Brandes, U. and Erlebach, T. (Eds.) Network Analysis: Methodological Foundations,
pp. 16–61, LNCS 3418, Springer-Verlag.
Bonacich, P. (1987). Power and Centrality: A Family of Measures, The American Journal of Sociology, 92 (5), pp.
1170–1182.
External links
https://networkx.lanl.gov/trac/attachment/ticket/119/page_rank.py
http://www.faculty.ucr.edu/~hanneman/nettext/C10_Centrality.html
Schulze Method
Let d[V,W] be the number of voters who prefer candidate V to candidate W.
A path from candidate X to candidate Y of strength p is a sequence of candidates C(1),...,C(n) with the following
properties:
1. C(1) = X and C(n) = Y.
2. For all i = 1,...,(n-1): d[C(i),C(i+1)] > d[C(i+1),C(i)].
3. For all i = 1,...,(n-1): d[C(i),C(i+1)] ≥ p.
p[A,B], the strength of the strongest path from candidate A to candidate B, is the maximum value such that there is a
path from candidate A to candidate B of that strength. If there is no path from candidate A to candidate B at all, then
p[A,B] = 0.
Candidate D is better than candidate E if and only if p[D,E] > p[E,D].
Candidate D is a potential winner if and only if p[D,E] ≥ p[E,D] for every other candidate E.
It can be proven that p[X,Y] > p[Y,X] and p[Y,Z] > p[Z,Y] together imply p[X,Z] > p[Z,X].[1]:4.1 Therefore, it is
guaranteed (1) that the above definition of "better" really defines a transitive relation and (2) that there is always at
least one candidate D with p[D,E] ≥ p[E,D] for every other candidate E.
Example
Consider the following example, in which 45 voters rank 5 candidates.
5 ACBED (meaning, 5 voters have order of preference: A > C > B > E > D)
5 ADECB
8 BEDAC
3 CABED
7 CAEBD
2 CBADE
7 DCEBA
8 EBADC
First, we compute the pairwise preferences. For example, in comparing A and B pairwise, there are 5+5+3+7=20
voters who prefer A to B, and 8+2+7+8=25 voters who prefer B to A. So d[A, B] = 20 and d[B, A] = 25. The full set
of pairwise preferences is:
          d[*,A]  d[*,B]  d[*,C]  d[*,D]  d[*,E]
d[A,*]      -       20      26      30      22
d[B,*]     25       -       16      33      18
d[C,*]     19      29       -       17      24
d[D,*]     15      12      28       -       14
d[E,*]     23      27      21      31       -
An entry d[X, Y] represents a pairwise victory for candidate X over candidate Y exactly when d[X, Y] > d[Y, X].
Note that no candidate wins all of its pairwise contests, so the pairwise preferences alone do not determine an undisputed winner.
Now the strongest paths have to be identified. To help visualize them, the set of pairwise preferences can be
depicted as a directed graph: an arrow from the node representing a candidate X to the one representing a candidate
Y is labelled with d[X, Y]. To avoid cluttering the diagram, an arrow from X to Y is only drawn when
d[X, Y] > d[Y, X], omitting the arrow in the opposite direction.
One example of computing the strongest path strength is p[B, D] = 33: the strongest path from B to D is the direct
path (B, D) which has strength 33. For contrast, let us also compute p[A, C]. The strongest path from A to C is not
the direct path (A, C) of strength 26, rather the strongest path is the indirect path (A, D, C) which has strength
min(30, 28) = 28. Recall that the strength of a path is the strength of its weakest link.
For each pair of candidates X and Y, the following table shows the strongest path from candidate X to candidate Y;
the strength of each path is the weight of its weakest link.

Strongest paths:

           ... to A                        ... to B                  ... to C           ... to D            ... to E
from A:    -                               A-(30)-D-(28)-C-(29)-B    A-(30)-D-(28)-C    A-(30)-D            A-(30)-D-(28)-C-(24)-E
from B:    B-(25)-A                        -                         B-(33)-D-(28)-C    B-(33)-D            B-(33)-D-(28)-C-(24)-E
from C:    C-(29)-B-(25)-A                 C-(29)-B                  -                  C-(29)-B-(33)-D     C-(24)-E
from D:    D-(28)-C-(29)-B-(25)-A          D-(28)-C-(29)-B           D-(28)-C           -                   D-(28)-C-(24)-E
from E:    E-(31)-D-(28)-C-(29)-B-(25)-A   E-(31)-D-(28)-C-(29)-B    E-(31)-D-(28)-C    E-(31)-D            -

The strengths of the strongest paths are:

          p[*,A]  p[*,B]  p[*,C]  p[*,D]  p[*,E]
p[A,*]      -       28      28      30      24
p[B,*]     25       -       28      33      24
p[C,*]     25      29       -       29      24
p[D,*]     25      28      28       -       24
p[E,*]     25      28      28      31       -
Now we can determine the output of the Schulze method. Comparing A and B, for example, since 28 = p[A,B] >
p[B,A] = 25, candidate A is better than candidate B for the Schulze method. Another example: since 31 = p[E,D] >
p[D,E] = 24, candidate E is better than candidate D. Continuing in this way, we find that the Schulze ranking is E > A >
C > B > D, and E wins. In other words, E wins since p[E,X] ≥ p[X,E] for every other candidate X.
Implementation
The only difficult step in implementing the Schulze method is computing the strongest path strengths. This,
however, is a well-known problem in graph theory, sometimes called the widest path problem. One simple way to
compute the strengths is therefore a variant of the Floyd–Warshall algorithm. The following pseudocode illustrates it.
# Input: d[i,j], the number of voters who prefer candidate i to candidate j.
# Output: p[i,j], the strength of the strongest path from candidate i to candidate j.

for i from 1 to C
   for j from 1 to C
      if (i ≠ j) then
         if (d[i,j] > d[j,i]) then
            p[i,j] := d[i,j]
         else
            p[i,j] := 0

for i from 1 to C
   for j from 1 to C
      if (i ≠ j) then
         for k from 1 to C
            if (i ≠ k) and (j ≠ k) then
               p[j,k] := max ( p[j,k], min ( p[j,i], p[i,k] ) )
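As a cross-check of the worked example above, the method translates directly into Python (the ballot profile is the one from the example; the variable names are ours):

```python
# Ballot profile from the example: (count, ranking from best to worst).
ballots = [(5, "ACBED"), (5, "ADECB"), (8, "BEDAC"), (3, "CABED"),
           (7, "CAEBD"), (2, "CBADE"), (7, "DCEBA"), (8, "EBADC")]
candidates = "ABCDE"

# d[x][y]: number of voters who prefer candidate x to candidate y.
d = {x: {y: 0 for y in candidates} for x in candidates}
for count, order in ballots:
    for i, x in enumerate(order):
        for y in order[i + 1:]:
            d[x][y] += count

# p[x][y]: strength of the strongest path from x to y, computed with
# the Floyd-Warshall variant for widest paths.
p = {x: {y: d[x][y] if d[x][y] > d[y][x] else 0 for y in candidates}
     for x in candidates}
for i in candidates:
    for j in candidates:
        if i != j:
            for k in candidates:
                if k != i and k != j:
                    p[j][k] = max(p[j][k], min(p[j][i], p[i][k]))

# Potential winners satisfy p[x][y] >= p[y][x] for every other candidate y.
winners = [x for x in candidates
           if all(p[x][y] >= p[y][x] for y in candidates if y != x)]
print(winners)  # ['E']
```

Running this reproduces the pairwise matrix (e.g. d[A,B] = 20, d[B,A] = 25), the path strengths (e.g. p[A,C] = 28), and the winner E from the example.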
Satisfied criteria
The Schulze method satisfies the following criteria:
Unrestricted domain
Non-imposition (a.k.a. citizen sovereignty)
Non-dictatorship
Pareto criterion[1]:4.3
Monotonicity criterion[1]:4.5
Majority criterion
Majority loser criterion
Condorcet criterion
Condorcet loser criterion
Schwartz criterion
Smith criterion[1]:4.7
Independence of Smith-dominated alternatives[1]:4.7
Mutual majority criterion
Independence of clones[1]:4.6
Reversal symmetry[1]:4.4
Mono-append[3]
Mono-add-plump[3]
Resolvability criterion[1]:4.2
Polynomial runtime[1]:2.3
Prudence[1]:4.9
MinMax sets[1]:4.8
Woodall's plurality criterion if winning votes are used for d[X,Y]
Symmetric-completion[3] if margins are used for d[X,Y]
Failed criteria
Since the Schulze method satisfies the Condorcet criterion, it automatically fails the following criteria:
Participation[1]:3.4
Consistency
Invulnerability to compromising
Invulnerability to burying
Later-no-harm
Likewise, since the Schulze method is not a dictatorship and agrees with unanimous votes, Arrow's Theorem implies
it fails the criterion
Independence of irrelevant alternatives
The Schulze method also fails
Peyton Young's criterion Local Independence of Irrelevant Alternatives.
Comparison table
The following table compares the Schulze method with other preferential single-winner election methods:
Criteria: (1) Monotonic; (2) Condorcet; (3) Majority; (4) Condorcet loser; (5) Majority loser; (6) Mutual majority;
(7) Smith; (8) ISDA; (9) LIIA; (10) Clone independence; (11) Reversal symmetry; (12) Polynomial time;
(13) Participation, Consistency; (14) Resolvability.

Method                         (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9)  (10) (11) (12) (13) (14)
Schulze                        Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  No   Yes  Yes  Yes  No   Yes
Ranked Pairs                   Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  No   Yes
Copeland                       Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  No   No   Yes  Yes  No   No
Kemeny-Young                   Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  Yes  No   Yes  No   No   Yes
Nanson                         No   Yes  Yes  Yes  Yes  Yes  Yes  No   No   No   Yes  Yes  No   Yes
Baldwin                        No   Yes  Yes  Yes  Yes  Yes  Yes  No   No   No   No   Yes  No   Yes
Instant-runoff voting          No   No   Yes  Yes  Yes  Yes  No   No   No   Yes  No   Yes  No   Yes
Borda                          Yes  No   No   Yes  Yes  No   No   No   No   No   Yes  Yes  Yes  Yes
Bucklin                        Yes  No   Yes  No   Yes  Yes  No   No   No   No   No   Yes  No   Yes
Coombs                         No   No   Yes  Yes  Yes  Yes  No   No   No   No   No   Yes  No   Yes
MiniMax                        Yes  Yes  Yes  No   No   No   No   No   No   No   No   Yes  No   Yes
Plurality                      Yes  No   Yes  No   No   No   No   No   No   No   No   Yes  Yes  Yes
Anti-plurality                 Yes  No   No   No   Yes  No   No   No   No   No   No   Yes  Yes  Yes
Contingent voting              No   No   Yes  Yes  Yes  No   No   No   No   No   No   Yes  No   Yes
Sri Lankan contingent voting   No   No   Yes  No   No   No   No   No   No   No   No   Yes  No   Yes
Supplementary voting           No   No   Yes  No   No   No   No   No   No   No   No   Yes  No   Yes
Dodgson                        No   Yes  Yes  No   No   No   No   No   No   No   No   No   No   Yes
The main difference between the Schulze method and the ranked pairs method can be seen in this example:
Suppose the MinMax score of a set X of candidates is the strength of the strongest pairwise win of a candidate A ∈ X
against a candidate B ∉ X. Then the Schulze method, but not Ranked Pairs, guarantees that the winner is always a
candidate of the set with minimum MinMax score.[1]:4.8 So, in some sense, the Schulze method minimizes the
largest majority that has to be reversed when determining the winner.
On the other hand, Ranked Pairs minimizes the largest majority that has to be reversed to determine the order of
finish, in the minlexmax sense.[4] In other words, when Ranked Pairs and the Schulze method produce different
orders of finish, for the majorities on which the two orders of finish disagree, the Schulze order reverses a larger
majority than the Ranked Pairs order.
Metalab [60]
Music Television (MTV) [61]
Neo [62]
Noisebridge [63]
North Shore Cyclists (NSC) [64] [65]
OpenEmbedded [66]
OpenStack [67]
Park Alumni Society (PAS) [68] [69]
Pirate Party of Australia [70]
Pirate Party of Austria [71]
Pirate Party of Belgium [72]
Pirate Party of Brazil
Pirate Party of France [73]
Pirate Party of Germany [15]
Pirate Party of Italy [74]
Pirate Party of New Zealand [75]
Pirate Party of Sweden [14]
Pirate Party of Switzerland [76]
Pirate Party of the United States [77]
Pitcher Plant of the Month [78]
Pittsburgh Ultimate [79] [80]
RLLMUK [81]
RPMrepo [82] [83]
Sender Policy Framework (SPF) [84]
Software in the Public Interest (SPI) [7]
Squeak [85]
Students for Free Culture [86]
Sugar Labs [87]
Sverok [88]
TestPAC [89]
TopCoder [10]
Ubuntu [90]
University of British Columbia Math Club [91] [92]
Wikimedia Foundation [11]
Wikipedia in French,[16] Hebrew,[93] Hungarian,[94] and Russian.[95]
Notes
[1] Markus Schulze, A new monotonic, clone-independent, reversal symmetric, and condorcet-consistent single-winner election method (http://www.springerlink.com/content/y5451n4908227162/?p=78c7a3edd3f64751ac9c2afc8aa6fad2&pi=0), Social Choice and Welfare, volume
36, number 2, pages 267–303, 2011. Preliminary version in Voting Matters, 17:9-19, 2003.
[2] Under reasonable probabilistic assumptions when the number of voters is much larger than the number of candidates
[3] Douglas R. Woodall, Properties of Preferential Election Rules (http:/ / www. votingmatters. org. uk/ ISSUE3/ P5. HTM), Voting Matters,
issue 3, pages 8-15, December 1994
[4] Tideman, T. Nicolaus, "Independence of clones as a criterion for voting rules," Social Choice and Welfare vol 4 #3 (1987), pp 185-206.
[5] See:
Markus Schulze, Condorect sub-cycle rule (http:/ / lists. electorama. com/ pipermail/ election-methods-electorama. com/ 1997-October/
001570. html), October 1997 (In this message, the Schulze method is mistakenly believed to be identical to the ranked pairs method.)
Mike Ossipoff, Party List P.S. (http:/ / groups. yahoo. com/ group/ election-methods-list/ message/ 467), July 1998
Markus Schulze, Tiebreakers, Subcycle Rules (http:/ / groups. yahoo. com/ group/ election-methods-list/ message/ 673), August 1998
Markus Schulze, Maybe Schulze is decisive (http:/ / groups. yahoo. com/ group/ election-methods-list/ message/ 845), August 1998
Norman Petry, Schulze Method - Simpler Definition (http:/ / groups. yahoo. com/ group/ election-methods-list/ message/ 867), September
1998
Markus Schulze, Schulze Method (http:/ / groups. yahoo. com/ group/ election-methods-list/ message/ 2291), November 1998
[6] See:
Anthony Towns, Disambiguation of 4.1.5 (http:/ / lists. debian. org/ debian-vote/ 2000/ 11/ msg00121. html), November 2000
Norman Petry, Constitutional voting, definition of cumulative preference (http:/ / lists. debian. org/ debian-vote/ 2000/ 12/ msg00045.
html), December 2000
[7] Process for adding new board members (http:/ / www. spi-inc. org/ corporate/ resolutions/ 2003/ 2003-01-06. wta. 1/ ), January 2003
[8] See:
Constitutional Amendment: Condorcet/Clone Proof SSD Voting Method (http:/ / www. debian. org/ vote/ 2003/ vote_0002), June 2003
Constitution for the Debian Project (http:/ / www. debian. org/ devel/ constitution), appendix A6
Debian Voting Information (http:/ / www. debian. org/ vote/ )
[9] See:
[10] See:
2006 TopCoder Open Logo Design Contest (http:/ / www. topcoder. com/ tc?module=Static& d1=tournaments& d2=tco06&
d3=logo_rules), November 2005
[11]
2006 TopCoder Collegiate Challenge Logo Design Contest (http:/ / www. topcoder. com/ tc?module=Static& d1=tournaments&
d2=tccc06& d3=logo_rules), June 2006
2007 TopCoder High School Tournament Logo (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2030), September 2006
2007 TopCoder Arena Skin Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2046), November 2006
2007 TopCoder Open Logo Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2047), January 2007
2007 TopCoder Open Web Design Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2050), January 2007
2007 TopCoder Collegiate Challenge T-Shirt Design Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2122),
September 2007
2008 TopCoder Open Logo Design Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2127), September 2007
2008 TopCoder Open Web Site Design Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2133), October 2007
2008 TopCoder Open T-Shirt Design Contest (http:/ / studio. topcoder. com/ ?module=ViewContestDetails& ct=2183), March 2008
See:
article 6 section 3 of the constitution (http:/ / www. fsfeurope. org/ about/ legal/ Constitution. en. pdf)
Fellowship vote for General Assembly seats (http:/ / www. fsfeurope. org/ news/ 2009/ news-20090301-01. en. html), March 2009
And the winner of the election for FSFE's Fellowship GA seat is ... (http:/ / fsfe. org/ news/ 2009/ news-20090601-01. en. html), June 2009
[14] See:
Inför primärvalen (http:/ / forum. piratpartiet. se/ FindPost174988. aspx), October 2009
Dags att kandidera till riksdagen (http:/ / forum. piratpartiet. se/ FindPost176567. aspx), October 2009
Råresultat primärvalet (http:/ / forum. piratpartiet. se/ FindPost193877. aspx), January 2010
[15] 11 of the 16 regional sections and the federal section of the Pirate Party of Germany are using LiquidFeedback (http:/ / liquidfeedback. org/ )
for unbinding internal opinion polls. In 2010/2011, the Pirate Parties of Neukölln ( link (http:/ / wiki. piratenpartei. de/ BE:Neuklln/
Gebietsversammlungen/ 2010. 3/ Protokoll)), Mitte ( link (http:/ / berlin. piratenpartei. de/ 2011/ 01/ 18/
kandidaten-der-piraten-in-mitte-aufgestellt/ )), Steglitz-Zehlendorf ( link (http:/ / wiki. piratenpartei. de/ wiki/ images/ d/ da/
BE_Gebietsversammlung_Steglitz_Zehlendorf_2011_01_20_Protokoll. pdf)), Lichtenberg ( link (http:/ / piraten-lichtenberg. de/ ?p=205)),
and Tempelhof-Schöneberg ( link (http:/ / wiki. piratenpartei. de/ BE:Gebietsversammlungen/ Tempelhof-Schoeneberg/ Protokoll_2011. 1))
adopted the Schulze method for their primaries. Furthermore, the Pirate Party of Berlin (in 2011) ( link (http:/ / wiki. piratenpartei. de/
BE:Parteitag/ 2011. 1/ Protokoll)) and the Pirate Party of Regensburg (in 2012) ( link (http:/ / wiki. piratenpartei. de/ BY:Regensburg/
Grndung/ Geschftsordnung#Anlage_A)) adopted this method for their primaries.
[16] Choix dans les votes
[17] fr:Spécial:Pages liées/Méthode Schulze
[18] Election of the Annodex Association committee for 2007 (http:/ / www. cs. cornell. edu/ w8/ ~andru/ cgi-perl/ civs/ results.
pl?id=E_50cfc592ae8f13d9), February 2007
[19] http:/ / blitzed. org/
[20] Condorcet method for admin voting (http:/ / wiki. blitzed. org/ Condorcet_method_for_admin_voting), January 2005
[21] See:
Important notice for Golden Geek voters (http:/ / www. boardgamegeek. com/ article/ 1751580), September 2007
Golden Geek Awards 2008 - Nominations Open (http:/ / www. boardgamegeek. com/ article/ 2582330), August 2008
Golden Geek Awards 2009 - Nominations Open (http:/ / www. boardgamegeek. com/ article/ 3840078), August 2009
Golden Geek Awards 2010 - Nominations Open (http:/ / www. boardgamegeek. com/ article/ 5492260), September 2010
Golden Geek Awards 2011 - Nominations Open (http:/ / boardgamegeek. com/ thread/ 694044), September 2011
[22] Project Logo (http:/ / article. gmane. org/ gmane. comp. db. cassandra. devel/ 424/ match=condorcet+ schwartz+ sequential+ dropping+
beatpath), October 2009
[23] http:/ / 0xAA. org
[24] "Codex Alpe Adria Competitions" (http:/ / 0xAA. org/ competitions/ ). 0xaa.org. 2010-01-24. . Retrieved 2010-05-08.
[25] http:/ / collectiveagency. co/
[26] Civics Meeting Minutes (http:/ / collectiveagency. co/ 2012/ 03/ 21/ civics-meeting-minutes-32012/ ), March 2012
[27] "Fellowship Guidelines" (http:/ / www. marine. usf. edu/ fellowships/ Guidelines-and-Application-2011-2012. pdf) (PDF). . Retrieved
2011-06-01.
[28] Report on HackSoc Elections (http:/ / www. hacksoc. org/ HackSocElections. pdf), December 2008
[29] http:/ / www. cohp. org/
[30] Adam Helman, Family Affair Voting Scheme - Schulze Method (http:/ / www. cohp. org/ records/ votes/ family_affair_voting_scheme.
html)
Candidate cities for EBTM05 (http:/ / forum. eurobilltracker. eu/ viewtopic. php?t=4920& highlight=condorcet+ beatpath+ ssd),
December 2004
Meeting location preferences (http:/ / forum. eurobilltracker. eu/ viewtopic. php?t=4921& highlight=condorcet), December 2004
Date for EBTM07 Berlin (http:/ / forum. eurobilltracker. eu/ viewtopic. php?t=9353& highlight=condorcet+ beatpath), January 2007
Vote the date of the Summer EBTM08 in Ljubljana (http:/ / forum. eurobilltracker. eu/ viewtopic. php?t=10564& highlight=condorcet+
beatpath), January 2008
New Logo for EBT (http:/ / forum. eurobilltracker. com/ viewtopic. php?f=26& t=17919& start=15#p714947), August 2009
[34] "Guidance Document" (http:/ / www. eudec. org/ Guidance Document). Eudec.org. 2009-11-15. . Retrieved 2010-05-08.
[35] http:/ / fairtradenorthwest. org/
[36] article XI section 2 of the bylaws (http:/ / fairtradenorthwest. org/ FTNW Bylaws. pdf)
[37] Democratic election of the server admins (http:/ / article. gmane. org/ gmane. comp. video. ffmpeg. devel/ 113026/ match="schulze
method"+ "Cloneproof schwartz sequential droping"+ Condorcet), July 2010
[38] http:/ / www. vtk. be/
[39] article 51 of the statutory rules (http:/ / www. vtk. be/ vtk/ statuten/ huishoudelijk_reglement. pdf)
[40] Voters Guide (http:/ / wiki. freegeek. org/ images/ 7/ 7a/ Voters_guide. pdf), September 2011
[41] http:/ / fhf. it/
[42] See:
Eletto il nuovo Consiglio nella Free Hardware Foundation (http:/ / fhf. it/ notizie/ nuovo-consiglio-nella-fhf), June 2008
Poll Results (http:/ / www. cs. cornell. edu/ w8/ ~andru/ cgi-perl/ civs/ results. pl?id=E_5b6e434828ec547b), June 2008
[43] GnuPG Logo Vote (http:/ / logo-contest. gnupg. org/ results. html), November 2006
[44] http:/ / gbg. hackerspace. se/ site/
[45] 14 of the bylaws (http:/ / gbg. hackerspace. se/ site/ om-ghs/ stadgar/ )
[46] http:/ / gso. cs. binghamton. edu/ index. php/ GSOCS_Home
[47] "User Voting Instructions" (http:/ / gso. cs. binghamton. edu/ index. php/ Voting). Gso.cs.binghamton.edu. . Retrieved 2010-05-08.
[48] Haskell Logo Competition (http:/ / www. cs. cornell. edu/ w8/ ~andru/ cgi-perl/ civs/ results. pl?num_winners=1&
id=E_d21b0256a4fd5ed7& algorithm=beatpath), March 2009
[49] http:/ / www. wvscrabble. com/
[50] A club by any other name ... (http:/ / wvscrabble. blogspot. com/ 2009/ 04/ club-by-any-other-name. html), April 2009
[51] See:
Ka-Ping Yee, Condorcet elections (http:/ / www. livejournal. com/ users/ zestyping/ 102718. html), March 2005
Ka-Ping Yee, Kingman adopts Condorcet voting (http:/ / www. livejournal. com/ users/ zestyping/ 111588. html), April 2005
[52] Knight Foundation awards $5000 to best created-on-the-spot projects (http:/ / civic. mit. edu/ blog/ andrew/
knight-foundation-awards-5000-to-best-created-on-the-spot-projects), June 2009
[53] See:
Mascot 2007 contest (http:/ / www. kumoricon. org/ forums/ index. php?topic=2599. 45), July 2006
Mascot 2008 and cover 2007 contests (http:/ / www. kumoricon. org/ forums/ index. php?topic=4497. 0), May 2007
Mascot 2009 and program cover 2008 contests (http:/ / www. kumoricon. org/ forums/ index. php?topic=6653. 0), April 2008
Mascot 2010 and program cover 2009 contests (http:/ / www. kumoricon. org/ forums/ index. php?topic=10048. 0), May 2009
Mascot 2011 and book cover 2010 contests (http:/ / www. kumoricon. org/ forums/ index. php?topic=12955. 0), May 2010
Mascot 2012 and book cover 2011 contests (http:/ / www. kumoricon. org/ forums/ index. php?topic=15340. 0), May 2011
[54] article 8.3 of the bylaws (http:/ / governance. lopsa. org/ LOPSA_Bylaws)
[55] http:/ / www. libre-entreprise. org/
[56] See:
Choix de date pour la runion Libre-entreprise durant le Salon Solution Linux 2006 (http:/ / www. libre-entreprise. org/ index. php/
Election:DateReunionSolutionLinux2006), January 2006
Entre de Libricks dans le rseau Libre-entreprise (http:/ / www. libre-entreprise. org/ index. php/ Election:EntreeLibricks), February 2008
[57] Lumiera Logo Contest (http:/ / www. cs. cornell. edu/ w8/ ~andru/ cgi-perl/ civs/ results. pl?id=E_7df51370797b45d6), January 2009
[58] http:/ / www. mkm-ig. org/
[59] The MKM-IG uses Condorcet with dual dropping (http:/ / condorcet-dd. sourceforge. net/ ). That means: The Schulze ranking and the
ranked pairs ranking are calculated and the winner is the top-ranked candidate of that of these two rankings that has the better Kemeny score.
See:
Michael Kohlhase, MKM-IG Trustees Election Details & Ballot (http:/ / lists. jacobs-university. de/ pipermail/ projects-mkm-ig/
2004-November/ 000041. html), November 2004
160
Andrew A. Adams, MKM-IG Trustees Election 2005 (http:/ / lists. jacobs-university. de/ pipermail/ projects-mkm-ig/ 2005-December/
000072. html), December 2005
Lionel Elie Mamane, Elections 2007: Ballot (http:/ / lists. jacobs-university. de/ pipermail/ projects-mkm-ig/ 2007-August/ 000406. html),
August 2007
[60] "Wahlmodus" (http:/ / metalab. at/ wiki/ Generalversammlung_2007/ Wahlmodus) (in (German)). Metalab.at. . Retrieved 2010-05-08.
[61] Benjamin Mako Hill, Voting Machinery for the Masses (http:/ / www. oscon. com/ oscon2008/ public/ schedule/ detail/ 3230), July 2008
[62] See:
Wahlen zum Neo-2-Freeze: Formalitten (http:/ / wiki. neo-layout. org/ wiki/ Neo-2-Freeze/ Wahl?version=10#a7. Wahlverfahren),
February 2010
Hinweise zur Stimmabgabe (http:/ / wiki. neo-layout. org/ wiki/ Neo-2-Freeze/ Wahl/ Stimmabgabe?version=11), March 2010
Ergebnisse (http:/ / wiki. neo-layout. org/ wiki/ Neo-2-Freeze/ Wahl/ Ergebnisse?version=9), March 2010
[63] 2009 Director Elections (https:/ / www. noisebridge. net/ index. php?title=2009_Director_Elections& oldid=8951)
[64] http:/ / www. nscyc. org/
[65] NSC Jersey election (http:/ / www. nscyc. org/ JerseyWinner), NSC Jersey vote (http:/ / www. cs. cornell. edu/ w8/ ~andru/ cgi-perl/ civs/
results. pl?id=E_6c53f2bddb068673), September 2007
[66] Online Voting Policy (http:/ / www. openembedded. org/ index. php/ Online_Voting_Policy)
[67] See:
2010 OpenStack Community Election (http:/ / www. cs. cornell. edu/ w8/ ~andru/ cgi-perl/ civs/ results. pl?id=E_f35052f9f6d58f36&
rkey=4603fbf32e182e6c), November 2010
OpenStack Governance Elections Spring 2012 (http:/ / www. openstack. org/ blog/ 2012/ 02/
openstack-governance-elections-spring-2012/ ), February 2012
161
External links
General
Schulze method website (http://m-schulze.webhop.net/) by Markus Schulze
Tutorials
Schulze-Methode (http://blog.cgiesel.de/a/schulze-methode) (German) by Christoph Giesel
Condorcet Computations (http://www.dsi.unifi.it/~PP2009/talks/Talks_giovedi/Talks_giovedi/grabmeier.
pdf) by Johannes Grabmeier
Spieltheorie (http://www.informatik.uni-freiburg.de/~ki/teaching/ss09/gametheory/spieltheorie.pdf)
(German) by Bernhard Nebel
Schulze-Methode (http://m-schulze.webhop.net/serie3_9-10.pdf) (German) by the University of Stuttgart
Advocacy
Descriptions of ranked-ballot voting methods (http://www.cs.wustl.edu/~legrand/rbvote/desc.html) by Rob
LeGrand
Accurate Democracy (http://accuratedemocracy.com/voting_rules.htm) by Rob Loring
Election Methods and Criteria (http://nodesiege.tripod.com/elections/) by Kevin Venzke
The Debian Voting System (http://www.seehuhn.de/pages/vote) by Jochen Voss
election-methods: a mailing list containing technical discussions about election methods (http://lists.electorama.
com/pipermail/election-methods-electorama.com/)
Books
Christoph Brgers (2009), Mathematics of Social Choice: Voting, Compensation, and Division (http://books.
google.com/books?id=dccBaphP1G4C&pg=PA37), SIAM, ISBN 0-89871-695-0
Saul Stahl and Paul E. Johnson (2006), Understanding Modern Mathematics (http://books.google.com/
books?id=CMLL9sVGLb8C&pg=PA119), Sudbury: Jones and Bartlett Publishers, ISBN 0-7637-3401-2
Nicolaus Tideman (2006), Collective Decisions and Voting: The Potential for Public Choice (http://books.
google.com/books?id=RN5q_LuByUoC&pg=PA228), Burlington: Ashgate, ISBN 0-7546-4717-X
Software
Modern Ballots (https://modernballots.com/) and Python Vote Core (https://github.com/bradbeattie/
python-vote-core) by Brad Beattie
Voting Software Project (http://vote.sourceforge.net/) by Blake Cretney
Condorcet with Dual Dropping Perl Scripts (http://condorcet-dd.sourceforge.net/) by Mathew Goldstein
Condorcet Voting Calculator (http://condorcet.ericgorr.net/) by Eric Gorr
Selectricity (http://selectricity.org/) and RubyVote (http://rubyvote.rubyforge.org/) by Benjamin Mako Hill
(http://web.mit.edu/newsoffice/2008/voting-tt0312.html) (http://labcast.media.mit.edu/?p=56)
Schulze voting for DokuWiki (http://www.cosmocode.de/en/blog/lang/2010-07/
05-schulze-voting-for-dokuwiki) by Adrian Lang
Electowidget (http://wiki.electorama.com/wiki/Electowidget) by Rob Lanphier
162
Legislative projects
Arizonans for Condorcet Ranked Voting (http://www.azsos.gov/election/2008/general/ballotmeasuretext/
I-21-2008.pdf) (http://ballotpedia.org/wiki/index.
php?title=Arizona_Competitive_Elections_Reform_Act_(2008)) (http://www.azcentral.com/members/Blog/
PoliticalInsider/22368) (http://www.ballot-access.org/2008/04/29/
arizona-high-school-student-files-paperwork-for-initiatives-for-irv-and-easier-ballot-access/)
163
Minimum spanning tree
The minimum spanning tree of a planar graph. Each edge is labeled with its
weight, which here is roughly proportional to its length.
One example would be a cable TV company laying cable to a new neighborhood. If it is constrained to bury the
cable only along certain paths, then there would be a graph representing which points are connected by those paths.
Some of those paths might be more expensive, because they are longer, or require the cable to be buried deeper;
these paths would be represented by edges with larger weights. A spanning tree for that graph would be a subset of
those paths that has no cycles but still connects to every house. There might be several spanning trees possible. A
minimum spanning tree would be one with the lowest total cost.
Properties
Possible multiplicity
There may be several minimum spanning trees of the same weight
having a minimum number of edges; in particular, if all the edge
weights of a given graph are the same, then every spanning tree of that
graph is minimum. If there are n vertices in the graph, then each tree
has n-1 edges.
Uniqueness
If each edge has a distinct weight then there will be only one, unique
minimum spanning tree. This can be proved by induction or
contradiction. This is true in many realistic situations, such as the cable
TV company example above, where it's unlikely any two paths have
exactly the same cost. This generalizes to spanning forests as well. If
the edge weights are not unique, only the (multi-)set of weights in
minimum spanning trees is unique; that is, it is the same for all minimum
spanning trees.[1]
A proof of uniqueness by contradiction is as follows.[2]
1. Say we have an algorithm that finds an MST (which we will call A)
based on the structure of the graph and the order of the edges when
ordered by weight. (Such algorithms do exist, see below.)
2. Assume MST A is not unique.
3. There is another spanning tree with equal weight, say MST B.
4. Let e1 be an edge that is in A but not in B.
5. As B is an MST, {e1} ∪ B must contain a cycle C.
6. Then B should include at least one edge e2 that is not in A and lies
on C.
7. Assume without loss of generality that the weight of e1 is less than that of e2.
8. Replacing e2 with e1 in B yields the spanning tree {e1} ∪ B − {e2},
which has a smaller weight than B. This contradicts the assumption that
B is an MST.
Minimum-cost subgraph
If the weights are positive, then a minimum spanning tree is in fact the minimum-cost subgraph connecting all
vertices, since subgraphs containing cycles necessarily have more total weight.
Cycle property
For any cycle C in the graph, if the weight of an edge e of C is larger than the weights of other edges of C, then this
edge cannot belong to an MST. Assuming the contrary, i.e. that e belongs to an MST T1, then deleting e will break
T1 into two subtrees with the two ends of e in different subtrees. The remainder of C reconnects the subtrees, hence
there is an edge f of C with ends in different subtrees, i.e., it reconnects the subtrees into a tree T2 with weight less
than that of T1, because the weight of f is less than the weight of e.
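The cycle property can be verified by brute force on a small graph. The following Python sketch (the function and variable names are my own, purely illustrative) enumerates all spanning trees of a tiny graph, so that one can confirm the heaviest edge on a cycle never appears in a minimum spanning tree:

```python
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every spanning tree of the graph as a tuple of edges.

    Brute force, small graphs only: every (n-1)-subset of edges that is
    acyclic (with exactly n-1 acyclic edges on n vertices, the subset is
    necessarily a spanning tree). edges is a list of (weight, u, v).
    """
    for subset in combinations(edges, n - 1):
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x
        acyclic = True
        for _, u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False  # adding this edge would close a cycle
                break
            parent[ru] = rv
        if acyclic:
            yield subset

# Triangle 0-1-2: the cycle's heaviest edge (weight 5) cannot be in an MST.
edges = [(1, 0, 1), (2, 1, 2), (5, 0, 2)]
best = min(spanning_trees(3, edges), key=lambda t: sum(w for w, _, _ in t))
```

On this triangle, `best` is the pair of edges with weights 1 and 2; the weight-5 edge, being the heaviest edge of the only cycle, is excluded, as the cycle property predicts.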
Cut property
For any cut C in the graph, if the weight of an edge e of C is smaller
than the weights of all other edges of C, then this edge belongs to all MSTs
of the graph. To prove this, assume the contrary: in the figure at left, make
edge BC (weight 6) part of the MST T instead of edge e (weight 4). Adding e
to T will produce a cycle, while replacing BC with e would produce an
MST of smaller weight. Thus, a tree containing BC is not an MST, which
contradicts the assumption.
This figure shows the cut property of MSTs. T is the MST of the given graph. If S =
{A,B,D,E}, so that V−S = {C,F}, then there are 3 possible edges across the
cut (S, V−S): the edges BC, EC, EF of the original graph. Here, e is one of the
minimum-weight edges for the cut, therefore S ∪ {e} is part of the MST T.
Minimum-cost edge
If the minimum-cost edge e of a graph is unique, then this edge is included in any MST.
Algorithms
The first algorithm for finding a minimum spanning tree was developed by Czech scientist Otakar Borůvka in 1926
(see Borůvka's algorithm). Its purpose was an efficient electrical coverage of Moravia. There are now two algorithms
commonly used, Prim's algorithm and Kruskal's algorithm. All three are greedy algorithms that run in polynomial
time, so the problem of finding such trees is in FP, and related decision problems such as determining whether a
particular edge is in the MST or determining if the minimum total weight exceeds a certain value are in P. Another
greedy algorithm not as commonly used is the reverse-delete algorithm, which is the reverse of Kruskal's algorithm.
If the edge weights are integers, then deterministic algorithms are known that solve the problem in O(m + n) integer
operations.[3] In a comparison model, in which the only allowed operations on edge weights are pairwise
comparisons, Karger, Klein & Tarjan (1995) found a linear-time randomized algorithm based on a combination of
Borůvka's algorithm and the reverse-delete algorithm.[4][5] Whether the problem can be solved deterministically in
linear time by a comparison-based algorithm remains an open question, however. The fastest non-randomized
comparison-based algorithm with known complexity, by Bernard Chazelle, is based on the soft heap, an approximate
priority queue.[6][7] Its running time is O(m α(m,n)), where m is the number of edges, n is the number of vertices and
α is the classical functional inverse of the Ackermann function. The function α grows extremely slowly, so that for
all practical purposes it may be considered a constant no greater than 4; thus Chazelle's algorithm takes very close to
linear time. Seth Pettie and Vijaya Ramachandran have found a provably optimal deterministic comparison-based
minimum spanning tree algorithm, the computational complexity of which is unknown.[8]
Research has also considered parallel algorithms for the minimum spanning tree problem. With a linear number of
processors it is possible to solve the problem in O(log n) time.[9][10] Bader & Cong (2003) demonstrate an
algorithm that can compute MSTs 5 times faster on 8 processors than an optimized sequential algorithm.[11] Later,
Nobari et al.[12] propose a novel, scalable, parallel Minimum Spanning Forest (MSF) algorithm for undirected
weighted graphs. This algorithm leverages Prim's algorithm in a parallel fashion, concurrently expanding several
subsets of the computed MSF. PMA minimizes the communication among different processors by not constraining
the local growth of a processor's computed subtree. In effect, PMA achieves a scalability that previous approaches
lacked. PMA, in practice, outperforms the previous state-of-the-art GPU-based MSF algorithm, while being several
orders of magnitude faster than sequential CPU-based algorithms.
Other specialized algorithms have been designed for computing minimum spanning trees of a graph so large that
most of it must be stored on disk at all times. These external storage algorithms, for example as described in
"Engineering an External Memory Minimum Spanning Tree Algorithm" by Roman Dementiev et al.,[13] can operate,
by the authors' claims, as little as 2 to 5 times slower than a traditional in-memory algorithm. They rely on efficient
external storage sorting algorithms and on graph contraction techniques for reducing the graph's size efficiently.
The problem can also be approached in a distributed manner. If each node is considered a computer and no node
knows anything except its own connected links, one can still calculate the distributed minimum spanning tree.
Expected size
Alan M. Frieze showed that, given a complete graph on n vertices with edge weights that are independent
identically distributed random variables with distribution function F satisfying F′(0) > 0, as n grows the
expected weight of the MST approaches ζ(3)/F′(0), where ζ is the Riemann zeta function. Under the additional
assumption of finite variance, Frieze also proved convergence in probability. Subsequently, J. Michael Steele
showed that the variance assumption could be dropped.
In later work, Svante Janson proved a central limit theorem for the weight of the MST.
For uniform random weights in [0,1], the exact expected size of the minimum spanning tree has been computed
for small complete graphs:

Vertices   Expected size           Approximate expected size
2          1/2                     0.5
3          3/4                     0.75
4          31/35                   0.8857143
5          893/924                 0.9664502
6          278/273                 1.0183151
7          30739/29172             1.053716
8          199462271/184848378     1.0790588
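The n = 4 row (31/35 ≈ 0.886) is easy to check by Monte Carlo simulation. The sketch below is illustrative only (all names are my own); it uses a brute-force Prim-style routine over the complete graph with uniform random edge weights:

```python
import random
from itertools import combinations

def mst_weight(n, weight):
    """MST weight of the complete graph K_n, Prim-style.

    weight: dict mapping each vertex pair (u, v) with u < v to its edge weight.
    """
    in_tree = {0}
    total = 0.0
    while len(in_tree) < n:
        # Cheapest edge from the current tree to a vertex outside it.
        w, v = min(
            (weight[min(u, x), max(u, x)], x)
            for u in in_tree for x in range(n) if x not in in_tree
        )
        in_tree.add(v)
        total += w
    return total

def expected_mst_size(n, trials, seed=1):
    """Average MST weight of K_n with i.i.d. uniform [0,1] edge weights."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        weight = {e: rng.random() for e in combinations(range(n), 2)}
        acc += mst_weight(n, weight)
    return acc / trials
```

For n = 4 and a few thousand trials, the estimate lands close to 31/35 ≈ 0.886, in line with the table.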
Related problems
A related problem is the k-minimum spanning tree (k-MST), which is the tree that spans some subset of k vertices in
the graph with minimum weight.
A set of k-smallest spanning trees is a subset of k spanning trees (out of all possible spanning trees) such that no
spanning tree outside the subset has smaller weight.[15][16][17] (Note that this problem is unrelated to the k-minimum
spanning tree.)
The Euclidean minimum spanning tree is a spanning tree of a graph with edge weights corresponding to the
Euclidean distance between vertices which are points in the plane (or space).
The rectilinear minimum spanning tree is a spanning tree of a graph with edge weights corresponding to the
rectilinear distance between vertices which are points in the plane (or space).
References
[1] Do the minimum spanning trees of a weighted graph have the same number of edges with a given weight?
(http://cs.stackexchange.com/questions/2204/do-the-minimum-spanning-trees-of-a-weighted-graph-have-the-same-number-of-edges)
[2] Gallager, R. G.; Humblet, P. A.; Spira, P. M. (January 1983), "A distributed algorithm for minimum-weight spanning trees", ACM
Transactions on Programming Languages and Systems 5 (1): 66–77, doi:10.1145/357195.357200.
[3] Fredman, M. L.; Willard, D. E. (1994), "Trans-dichotomous algorithms for minimum spanning trees and shortest paths", Journal of Computer
and System Sciences 48 (3): 533–551, doi:10.1016/S0022-0000(05)80064-9, MR1279413.
[4] Karger, David R.; Klein, Philip N.; Tarjan, Robert E. (1995), "A randomized linear-time algorithm to find minimum spanning trees", Journal
of the Association for Computing Machinery 42 (2): 321–328, doi:10.1145/201019.201022, MR1409738.
[5] Pettie, Seth; Ramachandran, Vijaya (2002), "Minimizing randomness in minimum spanning tree, parallel connectivity, and set maxima
algorithms" (http://portal.acm.org/citation.cfm?id=545477), Proc. 13th ACM-SIAM Symposium on Discrete Algorithms (SODA '02), San
Francisco, California, pp. 713–722.
[6] Chazelle, Bernard (2000), "A minimum spanning tree algorithm with inverse-Ackermann type complexity", Journal of the Association for
Computing Machinery 47 (6): 1028–1047, doi:10.1145/355541.355562, MR1866456.
[7] Chazelle, Bernard (2000), "The soft heap: an approximate priority queue with optimal error rate", Journal of the Association for Computing
Machinery 47 (6): 1012–1027, doi:10.1145/355541.355554, MR1866455.
[8] Pettie, Seth; Ramachandran, Vijaya (2002), "An optimal minimum spanning tree algorithm", Journal of the Association for Computing
Machinery 49 (1): 16–34, doi:10.1145/505241.505243, MR2148431.
[9] Chong, Ka Wong; Han, Yijie; Lam, Tak Wah (2001), "Concurrent threads and optimal parallel minimum spanning trees algorithm", Journal
of the Association for Computing Machinery 48 (2): 297–323, doi:10.1145/375827.375847, MR1868718.
[10] Pettie, Seth; Ramachandran, Vijaya (2002), "A randomized time-work optimal parallel algorithm for finding a minimum spanning forest",
SIAM Journal on Computing 31 (6): 1879–1895, doi:10.1137/S0097539700371065, MR1954882.
Additional reading
Graham, R. L.; Hell, Pavol (1985), "On the history of the minimum spanning tree problem", Annals of the History
of Computing 7 (1): 43–57, doi:10.1109/MAHC.1985.10011, MR783327.
Otakar Borůvka on Minimum Spanning Tree Problem (translation of both 1926 papers, comments, history)
(2000) (http://citeseer.ist.psu.edu/nesetril00otakar.html) Jaroslav Nešetřil, Eva Milková, Helena Nešetřilová.
(Section 7 gives his algorithm, which looks like a cross between Prim's and Kruskal's.)
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms,
Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Chapter 23: Minimum Spanning
Trees, pp. 561–579.
Eisner, Jason (1997). State-of-the-art algorithms for minimum spanning trees: A tutorial discussion (http://www.
cs.jhu.edu/~jason/papers/index.html#ms97). Manuscript, University of Pennsylvania, April. 78 pp.
Kromkowski, John David. "Still Unmelted after All These Years", in Annual Editions, Race and Ethnic Relations,
17/e (2009, McGraw-Hill). (Uses the minimum spanning tree as a method of demographic analysis of ethnic diversity
across the United States.)
External links
Jeff Erickson's MST lecture notes (http://compgeom.cs.uiuc.edu/~jeffe/teaching/algorithms/notes/12-mst.
pdf)
Implemented in BGL, the Boost Graph Library (http://www.boost.org/libs/graph/doc/table_of_contents.
html)
The Stony Brook Algorithm Repository - Minimum Spanning Tree codes (http://www.cs.sunysb.edu/
~algorith/files/minimum-spanning-tree.shtml)
Implemented in QuickGraph for .Net (http://www.codeplex.com/quickgraph)
Borůvka's algorithm
Borůvka's algorithm is an algorithm for finding a minimum spanning tree in a graph for which all edge weights are
distinct.
It was first published in 1926 by Otakar Borůvka as a method of constructing an efficient electricity network for
Moravia.[1][2][3] The algorithm was rediscovered by Choquet in 1938;[4] again by Florek, Łukasiewicz, Perkal,
Steinhaus, and Zubrzycki[5] in 1951; and again by Sollin[6] in 1965. Because Sollin was the only computer scientist
in this list living in an English-speaking country, this algorithm is frequently called Sollin's algorithm, especially in
the parallel computing literature.
The algorithm begins by first examining each vertex and adding the cheapest edge from that vertex to another in the
graph, without regard to already added edges, and continues joining these groupings in a like manner until a tree
spanning all vertices is completed.
Pseudocode
Designating each vertex or set of connected vertices a "component", pseudocode for Borůvka's algorithm is:

Input: A connected graph G whose edges have distinct weights
1 Begin with an empty set of edges T
2 While the vertices of G connected by T are disjoint:
3   Begin with an empty set of edges E
4   For each component:
5     Begin with an empty set of edges S
6     For each vertex in the component:
7       Add the cheapest edge from the vertex in the component to another vertex in a disjoint component to S
8     Add the cheapest edge in S to E
9   Add the resulting set of edges E to T
Output: The set of edges T is a minimum spanning tree of G
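The pseudocode above can be sketched concretely. The following Python function is a minimal, illustrative implementation (the names `boruvka_mst`, `cheapest`, etc. are my own, not from the original), using a small union-find structure to track components:

```python
def boruvka_mst(n, edges):
    """Boruvka's algorithm for a connected graph with distinct edge weights.

    n: number of vertices, labeled 0..n-1
    edges: list of (weight, u, v) tuples, all weights distinct
    Returns the MST as a set of (weight, u, v) tuples.
    """
    # Union-find tracks the components connected so far by T.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = set()
    num_components = n
    while num_components > 1:
        # For each component, the cheapest edge leaving it.
        cheapest = {}
        for e in edges:
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge lies inside one component
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][0]:
                    cheapest[r] = e
        # Add all selected edges, merging the components they join.
        for w, u, v in set(cheapest.values()):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.add((w, u, v))
                num_components -= 1
    return mst
```

For example, on a 4-cycle with a chord, `boruvka_mst(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)])` returns the three edges of weights 1, 2 and 3.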
Borůvka's algorithm can be shown to take O(log V) iterations of the outer loop until it terminates, and therefore to
run in time O(E log V), where E is the number of edges, and V is the number of vertices in G. In planar graphs, and
more generally in families of graphs closed under graph minor operations, it can be made to run in linear time, by
removing all but the cheapest edge between each pair of components after each stage of the algorithm.[7]
Other algorithms for this problem include Prim's algorithm (actually discovered by Vojtěch Jarník) and Kruskal's
algorithm. Faster algorithms can be obtained by combining Prim's algorithm with Borůvka's. A faster randomized
minimum spanning tree algorithm based in part on Borůvka's algorithm, due to Karger, Klein, and Tarjan, runs in
expected linear time. The best known (deterministic) minimum spanning tree algorithm by Bernard Chazelle is
also based in part on Borůvka's and runs in O(E α(E,V)) time, where α is the inverse of the Ackermann function.
These randomized and deterministic algorithms combine steps of Borůvka's algorithm, reducing the number of
components that remain to be connected, with steps of a different type that reduce the number of edges between pairs
of components.
Notes
[1] Borůvka, Otakar (1926). "O jistém problému minimálním (About a certain minimal problem)" (in Czech, German summary). Práce mor.
přírodověd. spol. v Brně III 3: 37–58.
[2] Borůvka, Otakar (1926). "Příspěvek k řešení otázky ekonomické stavby elektrovodných sítí (Contribution to the solution of a problem of
economical construction of electrical networks)" (in Czech). Elektronický Obzor 15: 153–154.
[3] Nešetřil, Jaroslav; Milková, Eva; Nešetřilová, Helena (2001). "Otakar Borůvka on minimum spanning tree problem: translation of both the
1926 papers, comments, history". Discrete Mathematics 233 (1–3): 3–36. doi:10.1016/S0012-365X(00)00224-7. MR1825599.
[4] Choquet, Gustave (1938). "Étude de certains réseaux de routes" (in French). Comptes-rendus de l'Académie des Sciences 206: 310–313.
[5] Florek, Kazimierz (1951). "Sur la liaison et la division des points d'un ensemble fini" (in French). Colloquium Mathematicum 2 (1951):
282–285.
[6] Sollin, M. (1965). "Le tracé de canalisation" (in French). Programming, Games, and Transportation Networks.
[7] Eppstein, David (1999). "Spanning trees and spanners". In Sack, J.-R.; Urrutia, J. Handbook of Computational Geometry. Elsevier.
pp. 425–461; Mareš, Martin (2004). "Two linear time algorithms for MST on minor closed graph classes" (http://www.emis.de/journals/
AM/04-3/am1139.pdf). Archivum mathematicum 40 (3): 315–320.
Kruskal's algorithm
Kruskal's algorithm is a greedy algorithm in graph theory that finds a minimum spanning tree for a connected
weighted graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total
weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning
forest (a minimum spanning tree for each connected component).
This algorithm first appeared in Proceedings of the American Mathematical Society, pp. 48–50, in 1956, and was
written by Joseph Kruskal.
Other algorithms for this problem include Prim's algorithm, the reverse-delete algorithm, and Borůvka's algorithm.
Description
create a forest F (a set of trees), where each vertex in the graph is a separate tree
create a set S containing all the edges in the graph
while S is nonempty and F is not yet spanning
remove an edge with minimum weight from S
if that edge connects two different trees, then add it to the forest, combining two trees into a single tree
otherwise discard that edge.
At the termination of the algorithm, the forest has only one component and forms a minimum spanning tree of the
graph.
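The steps above translate almost directly into code. Here is a hedged Python sketch (names are illustrative), using a disjoint-set forest with union by rank to decide whether an edge connects two different trees:

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm.

    n: number of vertices, labeled 0..n-1
    edges: iterable of (weight, u, v) tuples
    Returns (mst_edges, total_weight). On a disconnected graph this
    yields a minimum spanning forest.
    """
    # Disjoint-set forest with union by rank and path halving.
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False  # same tree: the edge would form a cycle
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1
        return True

    mst, total = [], 0
    for w, u, v in sorted(edges):  # examine edges in order of increasing weight
        if union(u, v):            # keep the edge only if it joins two trees
            mst.append((w, u, v))
            total += w
    return mst, total
```

On a 4-cycle with a chord, `kruskal_mst(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)])` keeps the edges of weights 1, 2 and 3 and discards the rest as cycle-forming.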
Performance
Where E is the number of edges in the graph and V is the number of vertices, Kruskal's algorithm can be shown to
run in O(E log E) time, or equivalently, O(E log V) time, all with simple data structures. These running times are
equivalent because:
E is at most V² and log V² = 2 log V is O(log V).
If we ignore isolated vertices, which will each be their own component of the minimum spanning forest, V ≤ E + 1,
so log V is O(log E).
We can achieve this bound as follows: first sort the edges by weight using a comparison sort in O(E log E) time; this
allows the step "remove an edge with minimum weight from S" to operate in constant time. Next, we use a
disjoint-set data structure (union-find) to keep track of which vertices are in which components. We need to
perform O(E) operations, two 'find' operations and possibly one union for each edge. Even a simple disjoint-set data
structure such as disjoint-set forests with union by rank can perform O(E) operations in O(E log V) time. Thus the
total time is O(E log E) = O(E log V).
Provided that the edges are either already sorted or can be sorted in linear time (for example with counting sort or
radix sort), the algorithm can use a more sophisticated disjoint-set data structure to run in O(E α(V)) time, where α is
the extremely slowly growing inverse of the single-valued Ackermann function.
Example
Download the example data. [1]
The following descriptions accompany the step-by-step images (not reproduced here):
AD and CE are the shortest edges, with length 5, and AD has been arbitrarily chosen, so it is highlighted.
CE is now the shortest edge that does not form a cycle, with length 5, so it is highlighted as the second
edge.
The next edge, DF with length 6, is highlighted using much the same method.
The next-shortest edges are AB and BE, both with length 7. AB is chosen arbitrarily, and is highlighted.
The edge BD has been highlighted in red, because there already exists a path (in green) between B and D,
so it would form a cycle (ABD) if it were chosen.
The process continues to highlight the next-smallest edge, BE with length 7. Many more edges are
highlighted in red at this stage: BC because it would form the loop BCE, DE because it would form the
loop DEBA, and FE because it would form FEBAD.
Finally, the process finishes with the edge EG of length 9, and the minimum spanning tree is found.
Proof of correctness
The proof consists of two parts. First, it is proved that the algorithm produces a spanning tree. Second, it is proved
that the constructed spanning tree is of minimal weight.
Spanning Tree
Let Y be the subgraph of G produced by the algorithm. Y cannot have
a cycle, since the last edge added to that cycle would have been within one subtree and not between two different
trees. Y cannot be disconnected, since the first encountered edge that joins two components of Y would have been
added by the algorithm. Thus, Y is a spanning tree of G.
Minimality
We show that the following proposition P is true by induction: If F is the set of edges chosen at any stage of the
algorithm, then there is some minimum spanning tree that contains F.
Clearly P is true at the beginning, when F is empty: any minimum spanning tree will do, and there exists one
because a weighted connected graph always has a minimum spanning tree.
Now assume P is true for some non-final edge set F and let T be a minimum spanning tree that contains F. If the
next chosen edge e is also in T, then P is true for F + e. Otherwise, T + e has a cycle C and there is another edge f
that is in C but not F. (If there were no such edge f, then e could not have been added to F, since doing so would
have created the cycle C.) Then T − f + e is a tree, and it has the same weight as T, since T has minimum weight
and the weight of f cannot be less than the weight of e, otherwise the algorithm would have chosen f instead of e.
So T − f + e is a minimum spanning tree containing F + e and again P holds.
Therefore, by the principle of induction, P holds when F has become a spanning tree, which is only possible if F
is a minimum spanning tree itself.
References
Joseph B. Kruskal: On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem [2]. In:
Proceedings of the American Mathematical Society, Vol. 7, No. 1 (Feb. 1956), pp. 48–50
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms,
Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 23.2: The algorithms of
Kruskal and Prim, pp. 567–574.
Michael T. Goodrich and Roberto Tamassia. Data Structures and Algorithms in Java, Fourth Edition. John Wiley
& Sons, Inc., 2006. ISBN 0-471-73884-0. Section 13.7.1: Kruskal's Algorithm, p. 632.
References
[1] http://www.carlschroedl.com/blog/comp/kruskals-minimum-spanning-tree-algorithm/2012/05/14/
[2] http://links.jstor.org/sici?sici=0002-9939(195602)7%3A1%3C48%3AOTSSSO%3E2.0.CO%3B2-M
External links
http://students.ceid.upatras.gr/~papagel/project/kruskal.htm
http://www.codeproject.com/KB/recipes/Kruskal_Algorithm.aspx
http://www.technical-recipes.com/2011/finding-minimal-spanning-trees-using-kruskals-algorithm/
Prim's algorithm
In computer science, Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a connected
weighted undirected graph. This means it finds a subset of the edges that forms a tree that includes every vertex,
where the total weight of all the edges in the tree is minimized. The algorithm was developed in 1930 by Czech
mathematician Vojtěch Jarník, later independently by computer scientist Robert C. Prim in 1957, and
rediscovered by Edsger Dijkstra in 1959. Therefore it is also sometimes called the DJP algorithm, the Jarník
algorithm, or the Prim–Jarník algorithm.
Other algorithms for this problem include Kruskal's algorithm and Borůvka's algorithm. However, these other
algorithms can also find minimum spanning forests of disconnected graphs, while Prim's algorithm requires the
graph to be connected.
Description
Informal
create a tree containing a single vertex, chosen arbitrarily from the graph
create a set containing all the edges in the graph
loop until every edge in the set connects two vertices in the tree
remove from the set an edge with minimum weight that connects a vertex in the tree with a vertex not in the tree
add that edge to the tree
Technical
An empty graph cannot have a spanning tree, so we begin by assuming
that the graph is non-empty.
The algorithm continuously increases the size of a tree, one edge at a
time, starting with a tree consisting of a single vertex, until it spans all
vertices.
Input: A non-empty connected weighted graph with vertices V and
edges E (the weights can be negative).
Initialize: Vnew = {x}, where x is an arbitrary node (starting point)
from V, Enew = {}
Repeat until Vnew = V:
Choose an edge (u, v) with minimal weight such that u is in Vnew
and v is not (if there are multiple edges with the same weight,
any of them may be picked)
Add v to Vnew, and (u, v) to Enew
Output: Vnew and Enew describe a minimal spanning tree
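The initialize/repeat loop above can be transcribed almost literally into Python. In this sketch, the edge representation (a dict from two-element frozensets to weights) is our own choice and not part of the description:

```python
def prim(vertices, weights):
    """Prim's algorithm, transcribed from the description above.

    vertices: iterable of vertex names.
    weights: dict mapping frozenset({u, v}) to the weight of edge (u, v).
    Returns (v_new, e_new) describing a minimum spanning tree.
    Assumes a non-empty connected graph."""
    vertices = set(vertices)
    v_new = {next(iter(vertices))}   # Vnew = {x}, x arbitrary
    e_new = set()
    while v_new != vertices:         # repeat until Vnew = V
        # Choose a minimum-weight edge with exactly one endpoint in Vnew.
        edge = min((e for e in weights if len(e & v_new) == 1),
                   key=weights.get)
        v_new |= edge                # add v to Vnew
        e_new.add(edge)              # add (u, v) to Enew
    return v_new, e_new
```

Scanning every edge on each of the V − 1 iterations gives O(VE) behaviour; the heap-based refinements discussed below do better.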
Time complexity

Minimum edge weight data structure | Time complexity (total)
adjacency matrix, searching | O(V²)
binary heap and adjacency list | O(E log V)
Fibonacci heap and adjacency list | O(E + V log V)
A simple implementation using an adjacency matrix graph representation and searching an array of weights to find
the minimum weight edge to add requires O(V2) running time. Using a simple binary heap data structure and an
adjacency list representation, Prim's algorithm can be shown to run in time O(E log V) where E is the number of
edges and V is the number of vertices. Using a more sophisticated Fibonacci heap, this can be brought down to O(E
+ V log V), which is asymptotically faster when the graph is dense enough that E is ω(V).
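As an illustration of the binary-heap variant, the following sketch uses lazy deletion of stale heap entries in place of a decrease-key operation, which preserves the O(E log V) bound; the adjacency-list format is an assumption of ours:

```python
import heapq

def prim_heap(adj, start):
    """Prim's algorithm with a binary heap, O(E log V).

    adj: dict mapping each vertex to a list of (weight, neighbour) pairs.
    Returns the total weight of a minimum spanning tree; assumes the
    graph is connected."""
    total = 0
    visited = {start}
    heap = list(adj[start])          # candidate edges leaving the tree
    heapq.heapify(heap)
    while heap and len(visited) < len(adj):
        w, v = heapq.heappop(heap)   # lightest candidate edge
        if v in visited:
            continue                 # stale entry: v joined the tree earlier
        visited.add(v)
        total += w
        for edge in adj[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return total
```

Because every edge is pushed at most twice and each heap operation costs O(log E) = O(log V), the total running time is O(E log V).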
Example run

U | Edge (u,v) candidates | V \ U | Description
{} | | {A,B,C,D,E,F,G} | This is our original weighted graph. The numbers near the edges indicate their weight.
{D} | (D,A) = 5 (chosen), (D,B) = 9, (D,E) = 15, (D,F) = 6 | {A,B,C,E,F,G} | Vertex D has been arbitrarily chosen as a starting point. Vertices A, B, E and F are connected to D through a single edge. A is the vertex nearest to D and is chosen along with the edge DA.
{A,D} | (D,B) = 9, (D,E) = 15, (D,F) = 6 (chosen), (A,B) = 7 | {B,C,E,F,G} | The next vertex chosen is the vertex nearest to either D or A. F is 6 away from D, so F is chosen along with the edge DF.
{A,D,F} | (D,B) = 9, (D,E) = 15, (A,B) = 7 (chosen), (F,E) = 8, (F,G) = 11 | {B,C,E,G} | B is 7 away from A, nearer than any other vertex outside the tree, so B is chosen along with the edge AB.
{A,B,D,F} | (B,C) = 8, (B,E) = 7 (chosen), (D,B) = 9 (cycle), (D,E) = 15, (F,E) = 8, (F,G) = 11 | {C,E,G} | E is 7 away from B, so E is chosen along with the edge BE. The edge DB is marked "cycle" because both of its endpoints are already in the tree.
{A,B,D,E,F} | (B,C) = 8, (D,B) = 9 (cycle), (D,E) = 15 (cycle), (E,C) = 5 (chosen), (E,G) = 9, (F,E) = 8 (cycle), (F,G) = 11 | {C,G} | Here, the only vertices available are C and G. C is 5 away from E, and G is 9 away from E. C is chosen, so it is highlighted along with the edge EC.
{A,B,C,D,E,F} | (B,C) = 8 (cycle), (D,B) = 9 (cycle), (D,E) = 15 (cycle), (E,G) = 9 (chosen), (F,E) = 8 (cycle), (F,G) = 11 | {G} | G is the only remaining vertex. It is 9 away from E and 11 away from F, so the edge EG is chosen along with vertex G.
{A,B,C,D,E,F,G} | (B,C) = 8 (cycle), (D,B) = 9 (cycle), (D,E) = 15 (cycle), (F,E) = 8 (cycle), (F,G) = 11 (cycle) | {} | Now all the vertices have been selected and the minimum spanning tree is shown in green. In this case, it has weight 39.
Proof of correctness
Let P be a connected, weighted graph. At every iteration of Prim's algorithm, an edge must be found that connects
a vertex in a subgraph to a vertex outside the subgraph. Since P is connected, there will always be a path to every
vertex. The output Y of Prim's algorithm is a tree, because each edge and vertex added to Y are connected. Let Y1
be a minimum spanning tree of graph P. If Y1 = Y then Y is a minimum spanning tree. Otherwise, let e be the first
edge added during the construction of tree Y that is not in tree Y1, and let V be the set of vertices connected by the
edges added before edge e. Then one endpoint of edge e is in set V and the other is not. Since tree Y1 is a spanning
tree of graph P, there is a path in tree Y1 joining the two endpoints of e. As one travels along this path, one must
encounter an edge f joining a vertex in set V to one that is not in set V. Now, at the iteration when edge e was added
to tree Y, edge f could also have been added, and it would have been added instead of edge e if its weight were less
than that of e. Since edge f was not added, we conclude that w(f) ≥ w(e).
Let tree Y2 be the graph obtained by removing edge f from, and adding edge e to, tree Y1. It is easy to show that
tree Y2 is connected, has the same number of edges as tree Y1, and that the total weight of its edges is not larger
than that of tree Y1; therefore it is also a minimum spanning tree of graph P and it contains edge e and all the edges
added before it during the construction of set V. Repeating the steps above, we will eventually obtain a minimum
spanning tree of graph P that is identical to tree Y. This shows Y is a minimum spanning tree.
References
V. Jarník: O jistém problému minimálním [About a certain minimal problem], Práce Moravské Přírodovědecké
Společnosti, 6, 1930, pp. 57–63. (in Czech)
R. C. Prim: Shortest connection networks and some generalizations. In: Bell System Technical Journal, 36 (1957),
pp. 1389–1401
D. Cheriton and R. E. Tarjan: Finding minimum spanning trees. In: SIAM Journal on Computing, 5 (Dec. 1976),
pp. 724–741
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Third
Edition. MIT Press, 2009. ISBN 0-262-03384-4. Section 23.2: The algorithms of Kruskal and Prim, pp. 631–638.
External links
Animated example of Prim's algorithm (http://students.ceid.upatras.gr/~papagel/project/prim.htm)
Prim's Algorithm (Java Applet) (http://www.mincel.com/java/prim.html)
Minimum spanning tree demonstration Python program by Ronald L. Rivest (http://people.csail.mit.edu/rivest/programs.html)
Open Source Java Graph package with implementation of Prim's Algorithm (http://code.google.com/p/annas/)
Open Source C# class library with implementation of Prim's Algorithm (http://code.google.com/p/ngenerics/)
Edmonds's algorithm
Order
The order of this algorithm is O(EV). There is a faster implementation by Tarjan, whose order is O(E log V) for
sparse graphs and O(V²) for dense graphs; this is as fast as Prim's algorithm for an undirected minimum spanning
tree. In 1986, Gabow, Galil, Spencer, and Tarjan made a faster implementation, and its order is O(E + V log V).
Algorithm
Description
The algorithm has a conceptual recursive description. We will denote by f(D, r, w) the function which, given a
weighted directed graph D with root r and edge weights w, returns a spanning arborescence rooted at r
of minimal cost.
The input of the algorithm is a weighted directed graph D = (V, E) with root r. We begin by replacing any set of
parallel edges (edges between the same pair of vertices in the same direction) by a single edge with weight equal to
the minimum of the weights of these parallel edges.
Now, for each node v other than the root, mark an (arbitrarily chosen) incoming edge of lowest cost. Denote the
other endpoint of this edge by π(v). The edge is now denoted as (π(v), v), with associated cost c(π(v), v). If
the marked edges form an SRT (Shortest Route Tree), it is the value of f(D, r, w) and we are done. Otherwise, the marked
edges form at least one cycle. Call (an arbitrarily chosen) one of these cycles C. We now define a new weighted
directed graph D′ = (V′, E′), having a root r, in which the cycle C is contracted to a single new vertex,
denoted v_C.
If (u, v) is an edge in E with u outside C and v inside C, we include in E′ an edge (u, v_C) with cost
c(u, v) − c(π(v), v); if (u, v) is an edge with u inside C and v outside C, we include in E′ an edge (v_C, v) with
cost c(u, v); and if neither endpoint lies in C, we include the edge unchanged.
We include no other edges in E′.
The root r of D′ is the same as the root of D.
Using a call to f(D′, r, w′), find an SRT A′ in D′. Since A′ is a spanning arborescence, each of its vertices has
exactly one incoming edge; let (u, v_C) be the unique edge of A′ entering v_C, and let (u, v) be the corresponding
edge of E with v in C. Unmark (π(v), v) and mark (u, v). Now
the set of marked edges do form an SRT, which we define to be the value of f(D, r, w).
Observe that f(D, r, w) is defined in terms of f(D′, r, w′)
for weighted directed rooted graphs D′ having strictly fewer
vertices than D; for a graph with a single vertex, finding f is trivial, so the recursion terminates.
Implementation
In Tarjan's implementation, let BV be a vertex bucket and BE be an edge bucket; let v be a vertex and let e be an
edge of maximum positive weight that is incident to v. Ci is a circuit, G0 = (V0, E0) is the original digraph, and
ui is a replacement vertex for Ci. Starting from i = 0, the algorithm repeatedly adds a vertex v to BV together with
such an edge e; whenever the edges in BE come to contain a circuit Ci, the counter i is incremented and Gi =
(Vi, Ei) is constructed from Gi−1 by shrinking Ci to the replacement vertex ui. Once BV contains all vertices, the
shrunken circuits are expanded in reverse order to obtain the optimum branching in G0.
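The recursion can also be written iteratively: each round marks the cheapest incoming edges, contracts the resulting cycles, and continues on the smaller graph with reduced weights. The following Python sketch is our own formulation of that scheme; it returns only the total weight of a minimum spanning arborescence, not the edge set:

```python
import math

def min_arborescence(n, edges, root):
    """Chu-Liu/Edmonds: total weight of a minimum spanning arborescence.

    n: number of vertices, labelled 0..n-1; edges: list of directed edges
    (u, v, w); root: the root vertex. Returns None if some vertex is
    unreachable from the root."""
    total = 0
    while True:
        # Mark the cheapest incoming edge of every non-root vertex.
        in_w = [math.inf] * n
        pre = list(range(n))
        for u, v, w in edges:
            if u != v and v != root and w < in_w[v]:
                in_w[v], pre[v] = w, u
        for v in range(n):
            if v != root:
                if in_w[v] == math.inf:
                    return None          # v cannot be reached from the root
                total += in_w[v]
        in_w[root] = 0
        # Detect and contract the cycles formed by the marked edges.
        comp = [-1] * n                  # component id after contraction
        stamp = [-1] * n                 # which walk visited each vertex
        comps = 0
        for start in range(n):
            v = start
            while v != root and stamp[v] == -1 and comp[v] == -1:
                stamp[v] = start
                v = pre[v]
            if v != root and comp[v] == -1 and stamp[v] == start:
                u = v                    # v lies on a cycle of marked edges
                while True:
                    comp[u] = comps
                    u = pre[u]
                    if u == v:
                        break
                comps += 1
        if comps == 0:
            return total                 # the marked edges form an arborescence
        for v in range(n):
            if comp[v] == -1:            # vertices outside any cycle
                comp[v] = comps
                comps += 1
        # Rebuild the contracted graph with reduced weights w - in_w[v].
        edges = [(comp[u], comp[v], w - in_w[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        n, root = comps, comp[root]
```

Each round removes at least one vertex, so there are at most V rounds of O(E) work each, giving the O(EV) bound quoted above.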
References
Y. J. Chu and T. H. Liu, "On the Shortest Arborescence of a Directed Graph", Science Sinica, vol. 14, 1965,
pp. 1396–1400.
J. Edmonds, "Optimum Branchings", J. Res. Nat. Bur. Standards, vol. 71B, 1967, pp. 233–240.
R. E. Tarjan, "Finding Optimum Branchings", Networks, vol. 7, 1977, pp. 25–35.
P. M. Camerini, L. Fratta, and F. Maffioli, "A note on finding optimum branchings", Networks, vol. 9, 1979,
pp. 309–312.
Alan Gibbons, Algorithmic Graph Theory, Cambridge University Press, 1985. ISBN 0-521-28881-9
H. N. Gabow, Z. Galil, T. Spencer, and R. E. Tarjan, "Efficient algorithms for finding minimum spanning trees in
undirected and directed graphs", Combinatorica 6 (1986), 109–122.
External links
The Directed Minimum Spanning Tree Problem [1] Description of the algorithm summarized by Shanchieh Jay
Yang, May 2000.
Edmonds's algorithm (edmonds-alg) [2] An open source implementation of Edmonds's algorithm written in
C++ and licensed under the MIT License. This source uses Tarjan's implementation for the dense graph.
AlgoWiki Edmonds's algorithm [3] A public-domain implementation of Edmonds's algorithm written in Java.
References
[1] http://www.ce.rit.edu/~sjyeec/dmst.html
[2] http://edmonds-alg.sourceforge.net/
[3] http://algowiki.net/wiki/index.php?title=Edmonds%27s_algorithm
Degree-constrained spanning tree
Formal definition
Input: n-node undirected graph G(V,E); positive integer k ≤ n.
Question: Does G have a spanning tree in which no node has degree greater than k?
NP-completeness
This problem is NP-complete (Garey & Johnson 1979). This can be shown by a reduction from the Hamiltonian path
problem. It remains NP-complete even if k is fixed to a value ≥ 2. If the problem is defined as requiring the degree
to be ≤ k, the k = 2 case of degree-constrained spanning tree is the Hamiltonian path problem.
Approximation Algorithm
Fürer & Raghavachari (1994) gave an approximation algorithm for the problem which, on any given instance, either
shows that the instance has no spanning tree of maximum degree k or finds and returns a spanning tree of maximum degree k + 1.
References
[1] Bui, T. N.; Zrncic, C. M. (2006). "An ant-based algorithm for finding degree-constrained minimum spanning tree" (http://www.cs.york.ac.uk/rts/docs/GECCO_2006/docs/p11.pdf). In GECCO '06: Proceedings of the 8th annual conference on Genetic and evolutionary
computation, pages 11–18, New York, NY, USA. ACM.
Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of
NP-Completeness, W. H. Freeman, ISBN 0-7167-1045-5. A2.1: ND1, p. 206.
Fürer, Martin; Raghavachari, Balaji (1994), "Approximating the minimum-degree Steiner tree to within one of
optimal", Journal of Algorithms 17 (3): 409–423, doi:10.1006/jagm.1994.1042.
Connected dominating set
Definitions
A connected dominating set of a graph G is a set D of vertices with two properties:
1. Any node in D can reach any other node in D by a path that stays entirely within D. That is, D induces a
connected subgraph of G.
2. Every vertex in G either belongs to D or is adjacent to a vertex in D. That is, D is a dominating set of G.
A minimum connected dominating set of a graph G is a connected dominating set with the smallest possible
cardinality among all connected dominating sets of G. The connected domination number of G is the number of
vertices in the minimum connected dominating set.[1]
Any spanning tree T of a graph G has at least two leaves, vertices that have only one edge of T incident to them. A
maximum leaf spanning tree is a spanning tree that has the largest possible number of leaves among all spanning
trees of G. The max leaf number of G is the number of leaves in the maximum leaf spanning tree.[2]
Complementarity
If d is the connected domination number of an n-vertex graph G, and l is its max leaf number, then the three
quantities d, l, and n obey the simple equation d + l = n.[3]
If D is a connected dominating set, then there exists a spanning tree in G whose leaves include all vertices that are
not in D: form a spanning tree of the subgraph induced by D, together with edges connecting each remaining vertex v
that is not in D to a neighbor of v in D. This shows that l ≥ n − d.
In the other direction, if T is any spanning tree in G, then the vertices of T that are not leaves form a connected
dominating set of G. This shows that n − l ≥ d. Putting these two inequalities together proves the equality n = d + l.
Therefore, in any graph, the sum of the connected domination number and the max leaf number equals the total
number of vertices. Computationally, this implies that finding a minimum connected dominating set is exactly as
hard as finding a maximum leaf spanning tree.
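The two-way argument can be checked by brute force on a small graph; the five-vertex example below is our own illustration, not from the original text:

```python
from itertools import combinations

# A five-vertex example: a 4-cycle 1-2-3-4 with a pendant vertex 0 attached to 1.
n = 5
graph_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 4)]
adj = {v: set() for v in range(n)}
for a, b in graph_edges:
    adj[a].add(b)
    adj[b].add(a)

def connected(vertex_set, edge_list):
    """Is vertex_set connected using only the edges in edge_list?"""
    vertex_set = set(vertex_set)
    seen, stack = set(), [next(iter(vertex_set))]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        stack.extend(u for a, b in edge_list for u in (a, b)
                     if v in (a, b) and u != v and u in vertex_set)
    return seen == vertex_set

def is_cds(s):
    """Is s a connected dominating set of the example graph?"""
    s = set(s)
    induced = [(a, b) for a, b in graph_edges if a in s and b in s]
    return connected(s, induced) and all(v in s or adj[v] & s for v in range(n))

# d: connected domination number, by exhaustive search.
d = min(k for k in range(1, n + 1)
        if any(is_cds(c) for c in combinations(range(n), k)))

# l: max leaf number, by trying every (n-1)-edge subset as a spanning tree.
def leaves(tree):
    return sum(1 for v in range(n) if sum(v in e for e in tree) == 1)

l = max(leaves(t) for t in combinations(graph_edges, n - 1)
        if connected(range(n), t))

assert d + l == n   # the complementarity identity
```

For this graph the exhaustive search finds d = 2 (the set {1, 2}) and l = 3, and indeed 2 + 3 = 5.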
Algorithms
It is NP-complete to test whether there exists a connected dominating set with size less than a given threshold, or
equivalently to test whether there exists a spanning tree with at least a given number of leaves. Therefore, it is
believed that the minimum connected dominating set problem and the maximum leaf spanning tree problem cannot
be solved in polynomial time.
When viewed in terms of approximation algorithms, connected domination and maximum leaf spanning trees are not
the same: approximating one to within a given approximation ratio is not the same as approximating the other to the
same ratio. There exists an approximation for the minimum connected dominating set that achieves a factor of 2 ln Δ
+ O(1), where Δ is the maximum degree of a vertex in G.[4] The maximum leaf spanning tree problem is MAX-SNP
hard, implying that no polynomial time approximation scheme is likely.[5] However, it can be approximated to
within a factor of 2 in polynomial time.[6]
Applications
Connected dominating sets are useful in the computation of routing for mobile ad hoc networks. In this application, a
small connected dominating set is used as a backbone for communications, and nodes that are not in this set
communicate by passing messages through neighbors that are in the set.[7]
The max leaf number has been employed in the development of fixed-parameter tractable algorithms: several
NP-hard optimization problems may be solved in polynomial time for graphs of bounded max leaf number.[2]
References
[1] Sampathkumar, E.; Walikar, H. B. (1979), "The connected domination number of a graph", J. Math. Phys. Sci. 13 (6): 607–613.
[2] Fellows, Michael; Lokshtanov, Daniel; Misra, Neeldhara; Mnich, Matthias; Rosamond, Frances; Saurabh, Saket (2009), "The complexity
ecology of parameters: an illustration using bounded max leaf number", Theory of Computing Systems 45 (4): 822–848,
doi:10.1007/s00224-009-9167-9.
[3] Douglas, Robert J. (1992), "NP-completeness and degree restricted spanning trees", Discrete Mathematics 105 (1–3): 41–47,
doi:10.1016/0012-365X(92)90130-8.
[4] Guha, S.; Khuller, S. (1998), "Approximation algorithms for connected dominating sets", Algorithmica 20 (4): 374–387,
doi:10.1007/PL00009201.
[5] Galbiati, G.; Maffioli, F.; Morzenti, A. (1994), "A short note on the approximability of the maximum leaves spanning tree problem",
Information Processing Letters 52 (1): 45–49, doi:10.1016/0020-0190(94)90139-2.
[6] Solis-Oba, Roberto (1998), "2-approximation algorithm for finding a spanning tree with maximum number of leaves", Proc. 6th European
Symposium on Algorithms (ESA'98), Lecture Notes in Computer Science, 1461, Springer-Verlag, pp. 441–452,
doi:10.1007/3-540-68530-8_37.
[7] Wu, J.; Li, H. (1999), "On calculating connected dominating set for efficient routing in ad hoc wireless networks", Proceedings of the 3rd
International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications, ACM, pp. 7–14,
doi:10.1145/313239.313261.
k-minimum spanning tree
The k-minimum spanning tree (k-MST) of a graph G is a tree of minimum cost that spans exactly k vertices of G.
This problem is NP-hard. It can, however, be approximated to within a constant factor: Garg gave a 2-approximation
for general graphs, and when the k-MST problem is restricted to the Euclidean plane, there exists a PTAS due to Arora.
References
Arora, S. (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric
problems", J. ACM.
Garg, N. (2005), "Saving an epsilon: a 2-approximation for the k-MST problem in graphs", STOC.
Ravi, R.; Sundaram, R.; Marathe, M.; Rosenkrantz, D.; Ravi, S. (1996), "Spanning trees short or small", SIAM
Journal on Discrete Mathematics.
External links
Minimum k-spanning tree in "A compendium of NP optimization
problems" [2]
References
[1] http://iridia.ulb.ac.be/~cblum/kctlib/
[2] http://www.nada.kth.se/~viggo/wwwcompendium/node71.html
Capacitated minimum spanning tree
Algorithms
Suppose we have a graph G = (V, E) with a designated root node r and a cost c(u, v) for each edge. A capacitated
minimum spanning tree (CMST) is a minimum-cost spanning tree in which each of the subtrees hanging off the
root r contains no more than a prescribed number of nodes, the capacity C.
Esau-Williams heuristic[1]
The Esau-Williams heuristic finds suboptimal CMSTs that are very close to the exact solutions; on average, EW
produces better results than many other heuristics.
Initially, all nodes are connected to the root r by direct links, called gates. At each iteration we compute, for every
node i, the tradeoff between the cost of the gate of the subtree containing i and the cost of connecting i to the
closest node in a different subtree; if the largest tradeoff is positive and joining the two subtrees does not violate the
capacity constraints, we remove the gate connecting the i-th subtree to r and connect the two subtrees instead.
We repeat the iterations until we can not make any further improvements to the tree.
Esau-Williams heuristic for computing a suboptimal CMST:

function CMST(c, C, r):
    T = { (v1, r), (v2, r), ..., (vn, r) }        // start from the star: every node gated to the root
    while have changes:
        for each node v_i:
            g_i = closest node in a different subtree
            t_i = gate(v_i) − c(v_i, g_i)         // saving from dropping v_i's gate
        t_max = max(t_i)
        k = i such that t_i = t_max
        if t_max > 0 and the merged subtree respects the capacity C:
            remove the gate of v_k's subtree and join the two subtrees by the edge (v_k, g_k)
    return T
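A Python sketch of the heuristic follows; the data layout (component ids, per-subtree gate bookkeeping) is our own, and ties between equal tradeoffs are broken by first occurrence:

```python
def esau_williams(c, cap):
    """Esau-Williams heuristic for the capacitated minimum spanning tree.

    c: symmetric cost matrix with node 0 as the root; cap: maximum number
    of nodes allowed in any subtree hanging off the root.
    Returns the chosen tree edges."""
    n = len(c)
    comp = list(range(n))                     # subtree id of each non-root node
    size = {i: 1 for i in range(1, n)}        # nodes per subtree
    gate = {i: c[0][i] for i in range(1, n)}  # cost of each subtree's root link
    gate_edge = {i: (0, i) for i in range(1, n)}
    links = []                                # non-gate edges added so far
    improved = True
    while improved:
        improved = False
        best_saving, best_pair = 0, None
        for i in range(1, n):
            for j in range(1, n):
                if comp[i] == comp[j] or size[comp[i]] + size[comp[j]] > cap:
                    continue
                saving = gate[comp[i]] - c[i][j]   # the tradeoff t_i
                if saving > best_saving:
                    best_saving, best_pair = saving, (i, j)
        if best_pair is not None:
            i, j = best_pair
            links.append((i, j))              # join i's subtree to j's subtree
            ci, cj = comp[i], comp[j]
            size[cj] += size.pop(ci)          # merge the bookkeeping
            gate.pop(ci)                      # i's subtree loses its root link
            gate_edge.pop(ci)
            for v in range(1, n):
                if comp[v] == ci:
                    comp[v] = cj
            improved = True
    return links + list(gate_edge.values())
```

Each pass costs O(n²), and there are at most n − 1 merges, so the sketch runs in O(n³) overall.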
Sharma's heuristic
Sharma's heuristic is described by Sharma and El-Bardai.[2]
Applications
The CMST problem is important in network design: when many terminal computers have to be connected to the
central hub, the star configuration is usually not the minimum-cost design. Finding a CMST that organizes the
terminals into subnetworks can lower the cost of implementing a network.
Limitations
A CMST computed by simple constructions may still fail to provide the minimum cost for nodes situated far from
the root; the Esau-Williams heuristic was developed to overcome this drawback.
References
[1] Esau, L. R.; Williams, K. C. (1966). "On teleprocessing network design: Part II. A method for approximating the optimal network". IBM
Systems Journal 5 (3): 142–147. doi:10.1147/sj.53.0142.
[2] Sharma, R. L.; El-Bardai, M. T. (1977). "Suboptimal communications network synthesis". In Proc. of International Conference on
Communications: 19.11–19.16.
Single-linkage clustering
In single-linkage clustering, the distance between two clusters is determined by a single pair of elements, namely
the two elements (one in each cluster) that are closest to each other:
D(X, Y) = min_{x ∈ X, y ∈ Y} d(x, y),
where X and Y are any two sets of elements considered as clusters, and d(x,y) denotes the distance between the two
elements x and y.
A drawback of this method is the so-called chaining phenomenon: clusters may be forced together due to single
elements being close to each other, even though many of the elements in each cluster may be very distant from each
other.
Naive Algorithm
The following algorithm is an agglomerative scheme that erases rows and columns in a proximity matrix as old
clusters are merged into new ones. The proximity matrix D contains all distances d(i,j). The clusterings are
assigned sequence numbers 0, 1, ..., (n − 1), and L(k) is the level of the kth clustering. A cluster with sequence
number m is denoted (m), and the proximity between clusters (r) and (s) is denoted d[(r),(s)].
The algorithm is composed of the following steps:
1. Begin with the disjoint clustering having level L(0) = 0 and sequence number m = 0.
2. Find the most similar pair of clusters in the current clustering, say pair (r), (s), according to d[(r),(s)] = min
d[(i),(j)] where the minimum is over all pairs of clusters in the current clustering.
3. Increment the sequence number: m = m+1. Merge clusters (r) and (s) into a single cluster to form the next
clustering m. Set the level of this clustering to L(m) = d[(r),(s)]
4. Update the proximity matrix, D, by deleting the rows and columns corresponding to clusters (r) and (s) and
adding a row and column corresponding to the newly formed cluster. The proximity between the new cluster,
denoted (r,s), and an old cluster (k) is defined as d[(k), (r,s)] = min { d[(k),(r)], d[(k),(s)] }.
5. If all objects are in one cluster, stop. Else, go to step 2.
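The five steps can be transcribed directly into Python; in this sketch, clusters are represented as frozensets of the original element indices rather than by sequence numbers:

```python
def single_linkage(d):
    """Naive agglomerative single-linkage clustering.

    d: symmetric matrix of pairwise distances.
    Returns the merges as a list of (level, cluster_a, cluster_b) tuples."""
    clusters = [frozenset([i]) for i in range(len(d))]
    dist = {(clusters[i], clusters[j]): d[i][j]
            for i in range(len(d)) for j in range(i + 1, len(d))}
    merges = []
    while len(clusters) > 1:
        # Step 2: find the most similar pair of clusters (r), (s).
        (r, s), level = min(dist.items(), key=lambda kv: kv[1])
        # Step 3: merge them at level L(m) = d[(r),(s)].
        merged = r | s
        merges.append((level, r, s))
        clusters = [k for k in clusters if k not in (r, s)]
        # Step 4: rebuild the proximities; single linkage keeps the minimum.
        new_dist = {}
        for (a, b), v in dist.items():
            if r not in (a, b) and s not in (a, b):
                new_dist[(a, b)] = v
        for k in clusters:
            dkr = dist.get((k, r), dist.get((r, k)))
            dks = dist.get((k, s), dist.get((s, k)))
            new_dist[(k, merged)] = min(dkr, dks)
        dist = new_dist
        clusters.append(merged)
    return merges
```

Changing the min in step 4 to a max or a mean gives complete or average linkage, as discussed below.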
An optimally efficient algorithm for single-linkage clustering, of complexity O(n²), is known as SLINK.[1]
This is essentially the same as Kruskal's algorithm for minimum spanning trees. However, in single linkage
clustering, the order in which clusters are formed is important, while for minimum spanning trees what matters is the
set of pairs of points that form the distances chosen by the algorithm.
Other linkages
Alternative linkage schemes include complete linkage and average linkage clustering; implementing a different
linkage in the naive algorithm is simply a matter of using a different formula to calculate inter-cluster distances in
the initial computation of the proximity matrix and in step 4 of the above algorithm: the d[(k), (r,s)] formula in
step 4 is the part that changes. An optimally efficient algorithm is, however, not available for arbitrary linkages.
External links
Single linkage clustering algorithm implementation in Ruby (AI4R) [2]
Linkages used in Matlab [3]
References
[1] R. Sibson (1973). "SLINK: an optimally efficient algorithm for the single-link cluster method" (http://www.cs.gsu.edu/~wkim/index_files/papers/sibson.pdf). The Computer Journal (British Computer Society) 16 (1): 30–34.
[2] http://ai4r.rubyforge.org
[3] http://www.mathworks.com/help/toolbox/stats/linkage.html
Maze generation algorithm
Depth-first search
This algorithm is a randomized version of the depth-first search algorithm. Frequently implemented with a stack,
this approach is one of the simplest ways to generate a maze using a computer. Consider the space for a maze
being a large grid of cells (like a large chess board), each cell starting with four walls. Starting from a random cell,
the computer then selects a random neighbouring cell that has not yet been visited. The computer removes the
wall between the two cells and adds the new cell to a stack (this is analogous to drawing the line on the floor). The
computer continues this process, with a cell that has no unvisited neighbours being considered a dead-end. When
at a dead-end it backtracks through the path until it reaches a cell with an unvisited neighbour, continuing the path
generation by visiting this new, unvisited cell (creating a new junction). This process continues until every cell has
been visited, causing the computer to backtrack all the way back to the beginning cell. This approach guarantees
that the maze space is completely visited.
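The stack-based process just described might be sketched as follows; the cell and passage representation is our own choice:

```python
import random

def generate_maze(width, height, seed=None):
    """Randomized depth-first search (recursive backtracker) maze generation.

    Returns a dict mapping each cell (x, y) to the set of neighbouring
    cells it has an open passage to."""
    rng = random.Random(seed)
    passages = {(x, y): set() for x in range(width) for y in range(height)}
    start = (rng.randrange(width), rng.randrange(height))
    stack, visited = [start], {start}
    while stack:
        x, y = stack[-1]
        unvisited = [(x + dx, y + dy)
                     for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if (x + dx, y + dy) in passages
                     and (x + dx, y + dy) not in visited]
        if unvisited:
            cell = rng.choice(unvisited)   # random unvisited neighbour
            passages[(x, y)].add(cell)     # remove the wall between the cells
            passages[cell].add((x, y))
            visited.add(cell)
            stack.append(cell)             # push: extend the current path
        else:
            stack.pop()                    # dead end: backtrack
    return passages
```

Because every cell is visited exactly once and connected by exactly one new passage, the result is a perfect maze: its passages form a spanning tree of the grid.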
As stated, the algorithm is very simple and does not produce overly-complex mazes. More specific refinements to
the algorithm can help to generate mazes that are harder to solve.
1. Start at a particular cell and call it the "exit."
2. Mark the current cell as visited, and get a list of its neighbors. For each neighbor, starting with a randomly
selected neighbor:
Mazes can be created with recursive division, an algorithm which works as follows: Begin with the maze's space
with no walls. Call this a chamber. Divide the chamber with a randomly positioned wall (or multiple walls) where
each wall contains a randomly positioned passage opening within it. Then recursively repeat the process on the
subchambers until all chambers are minimum sized. This method results in mazes with long straight walls crossing
their space, making it easier to see which areas to avoid.
For example, in a rectangular maze, build at random points two walls that are perpendicular to each other. These two
walls divide the large chamber into four smaller chambers separated by four walls. Choose three of the four walls at
random, and open a one cell-wide hole at a random point in each of the three. Continue in this manner recursively,
until every chamber has a width of one cell in either of the two directions.
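A sketch of recursive division in Python follows; the wall bookkeeping is our own, each dividing wall receives exactly one opening, and recursion stops at chambers one cell wide:

```python
import random

def recursive_division(width, height, seed=None):
    """Recursive division maze sketch.

    Returns (h_walls, v_walls): h_walls holds cells (x, y) with a wall
    below them, v_walls holds cells (x, y) with a wall to their right."""
    rng = random.Random(seed)
    h_walls, v_walls = set(), set()

    def divide(x, y, w, h):
        if w < 2 or h < 2:
            return                              # chamber is one cell wide
        horizontal = h > w or (h == w and rng.random() < 0.5)
        if horizontal:
            wy = rng.randrange(y, y + h - 1)    # wall below row wy
            gap = rng.randrange(x, x + w)       # one passage opening
            h_walls.update((wx, wy) for wx in range(x, x + w) if wx != gap)
            divide(x, y, w, wy - y + 1)           # upper subchamber
            divide(x, wy + 1, w, y + h - wy - 1)  # lower subchamber
        else:
            wx = rng.randrange(x, x + w - 1)    # wall right of column wx
            gap = rng.randrange(y, y + h)
            v_walls.update((wx, wy) for wy in range(y, y + h) if wy != gap)
            divide(x, y, wx - x + 1, h)
            divide(wx + 1, y, x + w - wx - 1, h)

    divide(0, 0, width, height)
    return h_walls, v_walls
```

Since each dividing wall has exactly one gap, the open passages always form a spanning tree of the grid, regardless of the random choices.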
Simple algorithms
Other algorithms exist that require only
enough memory to store one line of a 2D
maze or one plane of a 3D maze. They
prevent loops by storing which cells in the
current line are connected through cells in
the previous lines, and never remove walls
between any two cells already connected.
Most maze generation algorithms require
maintaining relationships between cells
within it, to ensure the end result will be
solvable. Valid simply connected mazes can
however be generated by focusing on each
cell independently. A binary tree maze is a
standard orthogonal maze where each cell
always has a passage leading up or leading
left, but never both. To create a binary tree
maze, for each cell flip a coin to decide
whether to add a passage leading up or left.
Always pick the same direction for cells on
the boundary, and the end result will be a
valid simply connected maze that looks like
a binary tree, with the upper left corner its
root.
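The coin flip per cell takes only a few lines; storing passages as directed (cell, neighbour) pairs is our own choice:

```python
import random

def binary_tree_maze(width, height, seed=None):
    """Binary tree maze: every cell except the upper-left corner gets
    exactly one passage, leading up or left.

    Returns the set of (cell, neighbour) openings."""
    rng = random.Random(seed)
    passages = set()
    for y in range(height):
        for x in range(width):
            can_up, can_left = y > 0, x > 0
            if can_up and can_left:           # interior cell: flip a coin
                if rng.random() < 0.5:
                    can_left = False
                else:
                    can_up = False
            if can_up:
                passages.add(((x, y), (x, y - 1)))
            elif can_left:
                passages.add(((x, y), (x - 1, y)))
    return passages
```

Every cell other than the upper-left corner contributes exactly one passage, so the openings form a tree rooted at that corner, matching the description above.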
A related form of flipping a coin for each cell is to create an image using a random mix of forward slash and
backslash characters. This doesn't generate a valid simply connected maze, but rather a selection of closed loops and
unicursal passages. (The manual for the Commodore 64 presents a BASIC program using this algorithm, but using
PETSCII diagonal line graphic characters instead for a smoother graphic appearance.)
External links
Think Labyrinth: Maze algorithms (http://www.astrolog.org/labyrnth/algrithm.htm#perfect) (details on these
and other maze generation algorithms)
Explanation of an Obfuscated C maze algorithm (http://homepages.cwi.nl/~tromp/maze.html) (a program to
generate mazes line-by-line, obfuscated in a single physical line of code)
Maze generation and solving Java applet (http://www.mazeworks.com/mazegen/)
Maze generating Java applets with source code. (http://www.ccs.neu.edu/home/snuffy/maze/)
Maze Generation (http://www.martinfoltin.sk/mazes) - Master's Thesis (Java Applet enabling users to have a
maze created using various algorithms and human solving of mazes)
Create Your Own Mazes (http://www.pedagonet.com/Labyrynthe/mazes.htm)
Collection of maze generation code (http://rosettacode.org/wiki/Maze) in different languages in Rosetta Code
Maze Generation Using JavaScript (http://www.hightechdreams.com/weaver.php?topic=mazegeneration)
Maze generation script for the Unity game engine (http://unifycommunity.com/wiki/index.
php?title=MazeGenerator)
Clique problem
History
Although complete subgraphs have been studied for longer in mathematics,[3] the term "clique" and the problem of
algorithmically listing cliques both come from the social sciences, where complete subgraphs are used to model
social cliques, groups of people who all know each other. The "clique" terminology comes from Luce & Perry
(1949), and the first algorithm for solving the clique problem is that of Harary & Ross (1957),[1] who were motivated
by the sociological application.
Since the work of Harary and Ross, many others have devised algorithms for various versions of the clique
problem.[1] In the 1970s, researchers began studying these algorithms from the point of view of worst-case analysis;
see, for instance, Tarjan & Trojanowski (1977), an early work on the worst-case complexity of the maximum clique
problem. Also in the 1970s, beginning with the work of Cook (1971) and Karp (1972), researchers began finding
mathematical justification for the perceived difficulty of the clique problem in the theory of NP-completeness and
related intractability results. In the 1990s, a breakthrough series of papers beginning with Feige et al. (1991) and
reported at the time in major newspapers,[4] showed that it is not even possible to approximate the problem
accurately and efficiently.
Definitions
An undirected graph is formed by a finite set of vertices and a set of
unordered pairs of vertices, which are called edges. By convention, in
algorithm analysis, the number of vertices in the graph is denoted by n
and the number of edges is denoted by m. A clique in a graph G is a
complete subgraph of G; that is, it is a subset S of the vertices such that
every two vertices in S form an edge in G. A maximal clique is a clique
to which no more vertices can be added; a maximum clique is a clique
that includes the largest possible number of vertices, and the clique
number ω(G) is the number of vertices in a maximum clique of G.[1]
Several closely related clique-finding problems have been studied.
Algorithms
Maximal versus maximum
A maximal clique, sometimes called inclusion-maximal, is a clique that is not included in a larger clique. Note,
therefore, that every clique is contained in a maximal clique.
Maximal cliques can be very small. A graph may contain a non-maximal clique with many vertices and a separate
clique of size 2 which is maximal. While a maximum (i.e., largest) clique is necessarily maximal, the converse does
not hold. There are some types of graphs in which every maximal clique is maximum (the complements of
well-covered graphs, notably including complete graphs, triangle-free graphs without isolated vertices, complete
multipartite graphs, and k-trees) but other graphs have maximal cliques that are not maximum.
Finding a maximal clique is straightforward: Starting with an arbitrary clique (for instance, a single vertex), grow the
current clique one vertex at a time by iterating over the graphs remaining vertices, adding a vertex if it is connected
to each vertex in the current clique, and discarding it otherwise. This algorithm runs in linear time. Because of the
ease of finding maximal cliques, and their potential small size, more attention has been given to the much harder
algorithmic problem of finding a maximum or otherwise large clique than has been given to the problem of finding a
single maximal clique.
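The greedy growth procedure can be sketched as follows; the adjacency-set representation is an assumption of ours:

```python
def maximal_clique(adj, start):
    """Grow a maximal clique from `start` in a single pass over the vertices.

    adj: dict mapping each vertex to the set of its neighbours."""
    clique = {start}
    for v in adj:
        # Add v if it is connected to every vertex already in the clique.
        if v not in clique and clique <= adj[v]:
            clique.add(v)
    return clique
```

A vertex rejected during the pass is non-adjacent to some clique member, and the clique only grows, so no rejected vertex can be added later: the result is always maximal, though not necessarily maximum.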
The fastest algorithm known today is due to Robson (2001), which runs in time O(2^(0.249n)) = O(1.1888^n).
There has also been extensive research on heuristic algorithms for solving maximum clique problems without
worst-case runtime guarantees, based on methods including branch and bound,[13] local search,[14] greedy
algorithms,[15] and constraint programming.[16] Non-standard computing methodologies for finding cliques include
DNA computing[17] and adiabatic quantum computation.[18] The maximum clique problem was the subject of an
implementation challenge sponsored by DIMACS in 1992–1993,[19] and a collection of graphs used as benchmarks
for the challenge is publicly available.[20]
Approximation algorithms
Several authors have considered approximation algorithms that attempt to find a clique or independent set that,
although not maximum, has size as close to the maximum as can be found in polynomial time. Although much of
this work has focused on independent sets in sparse graphs, a case that does not make sense for the complementary
clique problem, there has also been work on approximation algorithms that do not use such sparsity assumptions.[29]
Feige (2004) describes a polynomial time algorithm that finds a clique of size Ω((log n/log log n)²) in any graph that
has clique number Ω(n/log^k n) for any constant k. By combining this algorithm to find cliques in graphs with clique
numbers between n/log n and n/log³ n with a different algorithm of Boppana & Halldórsson (1992) to find cliques in
graphs with higher clique numbers, and choosing a two-vertex clique if both algorithms fail to find anything, Feige
provides an approximation algorithm that finds a clique with a number of vertices within a factor of
O(n (log log n)²/log³ n) of the maximum. Although the approximation ratio of this algorithm is weak, it is the best
known to date, and the results on hardness of approximation described below suggest that there can be no
approximation algorithm with an approximation ratio significantly less than linear.
Lower bounds
NP-completeness
The clique decision problem is NP-complete. It was one of Richard
Karp's original 21 problems shown NP-complete in his 1972 paper
"Reducibility Among Combinatorial Problems". This problem was also
mentioned in Stephen Cook's paper introducing the theory of
NP-complete problems. Thus, the problem of finding a maximum
clique is NP-hard: if one could solve it, one could also solve the
decision problem, by comparing the size of the maximum clique to the
size parameter given as input in the decision problem.
Karp's NP-completeness proof is a many-one reduction from the
Boolean satisfiability problem for formulas in conjunctive normal
form, which was proved NP-complete in the Cook–Levin theorem.[31]
From a given CNF formula, Karp forms a graph that has a vertex for
every pair (v,c), where v is a variable or its negation and c is a clause in
the formula that contains v. Vertices are connected by an edge if they
represent compatible variable assignments for different clauses: that is, there is an edge from (v,c) to (u,d) whenever
c ≠ d and u and v are not each other's negations. If k denotes the number of clauses in the CNF formula, then the
k-vertex cliques in this graph represent ways of assigning truth values to some of its variables in order to satisfy the
formula; therefore, the formula is satisfiable if and only if a k-vertex clique exists.

[Figure: The 3-CNF satisfiability instance (x ∨ x ∨ y) ∧ (~x ∨ ~y ∨ ~y) ∧ (~x ∨ y ∨ y) reduced to Clique. The green vertices form a 3-clique and correspond to a satisfying assignment.[30]]
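This reduction can be carried out mechanically. The sketch below (the function names and literal encoding are our own, not Karp's notation) builds the graph for a small formula and checks for a k-clique by brute force:

```python
from itertools import combinations

def cnf_to_clique_graph(clauses):
    """Karp's reduction: one vertex per (literal, clause) pair; edges join
    compatible literals from different clauses. A literal is a (name, sign)
    pair, e.g. ('x', True) for x and ('x', False) for its negation."""
    vertices = sorted({(lit, i) for i, cl in enumerate(clauses) for lit in cl})
    edges = {(a, b) for a, b in combinations(vertices, 2)
             if a[1] != b[1]                                       # different clauses
             and not (a[0][0] == b[0][0] and a[0][1] != b[0][1])}  # not negations
    return vertices, edges

def has_k_clique(vertices, edges, k):
    adjacent = lambda a, b: (a, b) in edges or (b, a) in edges
    return any(all(adjacent(a, b) for a, b in combinations(sub, 2))
               for sub in combinations(vertices, k))

# (x v x v y) ^ (~x v ~y v ~y) ^ (~x v y v y) is satisfiable (x=False, y=True),
# so with k = 3 clauses the graph must contain a 3-clique.
formula = [[('x', True), ('x', True), ('y', True)],
           [('x', False), ('y', False), ('y', False)],
           [('x', False), ('y', True), ('y', True)]]
V, E = cnf_to_clique_graph(formula)
print(has_k_clique(V, E, 3))  # -> True
```

For an unsatisfiable formula such as (x) ∧ (~x), the two vertices are a negation pair, no edge is created, and no 2-clique exists.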
Some NP-complete problems (such as the travelling salesman problem in planar graphs) may be solved in time that
is exponential in a sublinear function of the input size parameter n.[32] However, as Impagliazzo, Paturi & Zane
(2001) describe, it is unlikely that such bounds exist for the clique problem in arbitrary graphs, as they would imply
similarly subexponential bounds for many other standard NP-complete problems.
Circuit complexity
The computational difficulty of the clique problem has led it to be used
to prove several lower bounds in circuit complexity. Because the
existence of a clique of a given size is a monotone graph property (if a
clique exists in a given graph, it will exist in any supergraph) there
must exist a monotone circuit, using only AND gates and OR gates, to
solve the clique decision problem for a given fixed clique size.
However, the size of these circuits can be proven to be a
super-polynomial function of the number of vertices and the clique
size, exponential in the cube root of the number of vertices.[33] Even if
a small number of NOT gates are allowed, the complexity remains
superpolynomial.[34] Additionally, the depth of a monotone circuit for
the clique problem using gates of bounded fan-in must be at least a
polynomial in the clique size.[35]
Fixed-parameter intractability
Parameterized complexity[39] is the complexity-theoretic study of problems that are naturally equipped with a small
integer parameter k, and for which the problem becomes more difficult as k increases, such as finding k-cliques in
graphs. A problem is said to be fixed-parameter tractable if there is an algorithm for solving it on inputs of size n in
time f(k)·n^O(1); that is, if it can be solved in polynomial time for any fixed value of k and moreover if the exponent of
the polynomial does not depend on k.
For the clique problem, the brute force search algorithm has running time O(n^k k²), and although it can be improved
by fast matrix multiplication the running time still has an exponent that is linear in k. Thus, although the running
time of known algorithms for the clique problem is polynomial for any fixed k, these algorithms do not suffice for
fixed-parameter tractability. Downey & Fellows (1995) defined a hierarchy of parametrized problems, the W
hierarchy, that they conjectured did not have fixed-parameter tractable algorithms; they proved that independent set
(or, equivalently, clique) is hard for the first level of this hierarchy, W[1]. Thus, according to their conjecture, clique
is not fixed-parameter tractable. Moreover, this result provides the basis for proofs of W[1]-hardness of many other
problems, and thus serves as an analogue of the Cook–Levin theorem for parameterized complexity.
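The O(n^k k²) brute-force bound discussed above comes from examining all C(n, k) vertex subsets and testing all C(k, 2) pairs within each. A minimal sketch (the example graph is invented for illustration):

```python
from itertools import combinations

def k_clique_brute_force(adj, k):
    """Return some k-clique, or None. Examines O(n^k) subsets and does
    O(k^2) adjacency tests per subset -- the bound quoted in the text."""
    n = len(adj)
    for sub in combinations(range(n), k):
        if all(v in adj[u] for u, v in combinations(sub, 2)):
            return sub
    return None

# A 5-cycle 0-1-2-3-4 with the chord 1-4, as adjacency sets.
adj = {0: {1, 4}, 1: {0, 2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {0, 1, 3}}
print(k_clique_brute_force(adj, 3))  # -> (0, 1, 4), the unique triangle
```

The exponent of the running time grows with k, which is exactly why this algorithm does not establish fixed-parameter tractability.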
Chen et al. (2006) showed that the clique problem cannot be solved in time n^o(k) unless the exponential time
hypothesis fails.
Although the problems of listing maximal cliques or finding maximum cliques are unlikely to be fixed-parameter
tractable with the parameter k, they may be fixed-parameter tractable for other parameters of instance complexity.
For instance, both problems are known to be fixed-parameter tractable when parametrized by the degeneracy of the
input graph.[12]
Hardness of approximation
The computational complexity of approximating the clique problem
has been studied for a long time; for instance, Garey & Johnson (1978)
observed that, because the clique number takes on small
integer values and is NP-hard to compute, it cannot have a fully
polynomial-time approximation scheme. However, little more was
known until the early 1990s, when several authors began to make
connections between the approximation of maximum cliques and
probabilistically checkable proofs, and used these connections to prove
hardness of approximation results for the maximum clique
problem.[4][40] After many improvements to these results it is now
known that, unless P = NP, there can be no polynomial time algorithm
that approximates the maximum clique to within a factor better than
O(n^(1−ε)), for any ε > 0.[41]
In this reduction, each vertex of the graph describes a run in which a probabilistically checkable
proof checker could read a sequence of proof string bits and end up accepting the proof. Two vertices are connected
by an edge whenever the two proof checker runs that they describe agree on the values of the proof string bits that
they both examine. The maximal cliques in this graph consist of the accepting proof checker runs for a single proof
string, and one of these cliques is large if and only if there exists a proof string that many proof checkers accept. If
the original Satisfiability instance is satisfiable, there will be a large clique defined by a valid proof string for that
instance, but if the original instance is not satisfiable, then all proof strings are invalid, any proof string has only a
small number of checkers that mistakenly accept it, and all cliques are small. Therefore, if one could distinguish in
polynomial time between graphs that have large cliques and graphs in which all cliques are small, one could use this
ability to distinguish the graphs generated from satisfiable and unsatisfiable instances of the Satisfiability problem,
which is not possible unless P = NP. An accurate polynomial-time approximation to the clique problem would allow these
two sets of graphs to be distinguished from each other, and is therefore also impossible.
Notes
[1] For surveys of these algorithms, and basic definitions used in this article, see Bomze et al. (1999) and Gutin (2004).
[2] For more details and references, see clique (graph theory).
[3] Complete subgraphs make an early appearance in the mathematical literature in the graph-theoretic reformulation of Ramsey theory by Erdős
& Szekeres (1935).
[4] Kolata, Gina (June 26, 1990), "In a Frenzy, Math Enters Age of Electronic Mail" (http://www.nytimes.com/1990/06/26/science/in-a-frenzy-math-enters-age-of-electronic-mail.html), New York Times.
[5] Chiba & Nishizeki (1985).
[6] E.g., see Downey & Fellows (1995).
[7] Itai & Rodeh (1978) provide an algorithm with O(m3/2) running time that finds a triangle if one exists but does not list all triangles; Chiba &
Nishizeki (1985) list all triangles in time O(m3/2).
[8] Eisenbrand & Grandoni (2004); Kloks, Kratsch & Müller (2000); Nešetřil & Poljak (1985); Vassilevska & Williams (2009); Yuster (2006).
[9] Tomita, Tanaka & Takahashi (2006).
[10] Cazals & Karande (2008); Eppstein & Strash (2011).
[11] Rosgen & Stewart (2007).
[12] Eppstein, Lffler & Strash (2010).
[13] Balas & Yu (1986); Carraghan & Pardalos (1990); Pardalos & Rogers (1992); Östergård (2002); Fahle (2002); Tomita & Seki (2003);
Tomita & Kameda (2007); Konc & Janežič (2007).
[14] Battiti & Protasi (2001); Katayama, Hamamoto & Narihisa (2005).
[15] Abello, Pardalos & Resende (1999); Grosso, Locatelli & Della Croce (2004).
[16] Régin (2003).
[17] Ouyang et al. (1997). Although the title refers to maximal cliques, the problem this paper solves is actually the maximum clique problem.
[18] Childs et al. (2002).
[19] Johnson & Trick (1996).
[20] DIMACS challenge graphs for the clique problem (ftp://dimacs.rutgers.edu/pub/challenge/graph/benchmarks/clique/), accessed 2009-12-17.
[21] Grötschel, Lovász & Schrijver (1988).
[22] Golumbic (1980).
[23] Golumbic (1980), p. 159. Even, Pnueli & Lempel (1972) provide an alternative quadratic-time algorithm for maximum cliques in
comparability graphs, a broader class of perfect graphs that includes the permutation graphs as a special case.
[24] Gavril (1973); Golumbic (1980), p. 247.
[25] Clark, Colbourn & Johnson (1990).
[26] Jerrum (1992).
[27] Alon, Krivelevich & Sudakov (1998).
[28] Feige & Krauthgamer (2000).
[29] Boppana & Halldórsson (1992); Feige (2004); Halldórsson (2000).
[30] Adapted from Sipser (1996)
[31] Cook (1971) gives essentially the same reduction, from 3-SAT instead of Satisfiability, to show that subgraph isomorphism is NP-complete.
[32] Lipton & Tarjan (1980).
[33] Alon & Boppana (1987). For earlier and weaker bounds on monotone circuits for the clique problem, see Valiant (1983) and Razborov
(1985).
[34] Amano & Maruoka (1998).
[35] Goldmann & Håstad (1992) used communication complexity to prove this result.
[36] Wegener (1988).
[37] For instance, this follows from Gröger (1992).
[38] Childs & Eisenberg (2005); Magniez, Santha & Szegedy (2007).
[39] Downey & Fellows (1999).
[40] Feige et al. (1991); Arora & Safra (1998); Arora et al. (1998).
[41] Håstad (1999) showed inapproximability for this ratio using a stronger complexity theoretic assumption, the inequality of NP and ZPP; Khot
(2001) described the inapproximability ratio more precisely, and Zuckerman (2006) derandomized the construction, weakening its assumption to
P ≠ NP.
[42] This reduction is originally due to Feige et al. (1991) and used in all subsequent inapproximability proofs; the proofs differ in the strengths
and details of the probabilistically checkable proof systems that they rely on.
References
Abello, J.; Pardalos, P. M.; Resende, M. G. C. (1999), "On maximum clique problems in very large graphs"
(http://www2.research.att.com/~mgcr/abstracts/vlclq.html), in Abello, J.; Vitter, J., External Memory
Algorithms, DIMACS Series on Discrete Mathematics and Theoretical Computer Science, 50, American
Mathematical Society, pp. 119–130, ISBN 0-8218-1184-3.
Alon, N.; Boppana, R. (1987), "The monotone circuit complexity of boolean functions", Combinatorica 7 (1):
1–22, doi:10.1007/BF02579196.
Alon, N.; Krivelevich, M.; Sudakov, B. (1998), "Finding a large hidden clique in a random graph", Random
Structures & Algorithms 13 (3–4): 457–466,
doi:10.1002/(SICI)1098-2418(199810/12)13:3/4<457::AID-RSA14>3.0.CO;2-W.
Alon, N.; Yuster, R.; Zwick, U. (1994), "Finding and counting given length cycles", Proceedings of the 2nd
European Symposium on Algorithms, Utrecht, The Netherlands, pp. 354–364.
Amano, K.; Maruoka, A. (1998), "A superpolynomial lower bound for a circuit computing the clique function
with at most (1/6) log log n negation gates" (http://www.springerlink.com/content/m64ju7clmqhqmv9g/),
Proc. Symp. Mathematical Foundations of Computer Science, Lecture Notes in Computer Science, 1450,
Springer-Verlag, pp. 399–408.
Arora, Sanjeev; Lund, Carsten; Motwani, Rajeev; Sudan, Madhu; Szegedy, Mario (1998), "Proof verification and
the hardness of approximation problems", Journal of the ACM 45 (3): 501555, doi:10.1145/278298.278306,
ECCCTR98-008. Originally presented at the 1992 Symposium on Foundations of Computer Science,
doi:10.1109/SFCS.1992.267823.
Arora, S.; Safra, S. (1998), "Probabilistic checking of proofs: A new characterization of NP", Journal of the ACM
45 (1): 70122, doi:10.1145/273865.273901. Originally presented at the 1992 Symposium on Foundations of
Computer Science, doi:10.1109/SFCS.1992.267824.
Balas, E.; Yu, C. S. (1986), "Finding a maximum clique in an arbitrary graph", SIAM Journal on Computing 15
(4): 10541068, doi:10.1137/0215075.
Battiti, R.; Protasi, M. (2001), "Reactive local search for the maximum clique problem", Algorithmica 29 (4):
610637, doi:10.1007/s004530010074.
Bollobás, Béla (1976), "Complete subgraphs are elusive", Journal of Combinatorial Theory, Series B 21 (1): 1–7,
doi:10.1016/0095-8956(76)90021-6, ISSN 0095-8956.
Bomze, I. M.; Budinich, M.; Pardalos, P. M.; Pelillo, M. (1999), "The maximum clique problem", Handbook of
Combinatorial Optimization, 4, Kluwer Academic Publishers, pp. 1–74, CiteSeerX: 10.1.1.48.4074 (http://
citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.4074).
Boppana, R.; Halldórsson, M. M. (1992), "Approximating maximum independent sets by excluding subgraphs",
BIT 32 (2): 180–196, doi:10.1007/BF01994876.
Bron, C.; Kerbosch, J. (1973), "Algorithm 457: finding all cliques of an undirected graph", Communications of
the ACM 16 (9): 575577, doi:10.1145/362342.362367.
Carraghan, R.; Pardalos, P. M. (1990), "An exact algorithm for the maximum clique problem" (http://www.inf.
ufpr.br/renato/download/An_Exact_Algorithm_for_the_Maximum_Clique_Problem.pdf), Operations
Downey, R. G.; Fellows, M. R. (1995), "Fixed-parameter tractability and completeness. II. On completeness for
W[1]", Theoretical Computer Science 141 (12): 109131, doi:10.1016/0304-3975(94)00097-3.
Downey, R. G.; Fellows, M. R. (1999), Parameterized complexity, Springer-Verlag, ISBN0-387-94883-X.
Eisenbrand, F.; Grandoni, F. (2004), "On the complexity of fixed parameter clique and dominating set",
Theoretical Computer Science 326 (13): 5767, doi:10.1016/j.tcs.2004.05.009.
Eppstein, David; Löffler, Maarten; Strash, Darren (2010), "Listing All Maximal Cliques in Sparse Graphs in
Near-Optimal Time", in Cheong, Otfried; Chwa, Kyung-Yong; Park, Kunsoo, 21st International Symposium on
Algorithms and Computation (ISAAC 2010), Jeju, Korea, Lecture Notes in Computer Science, 6506,
Springer-Verlag, pp.403414, arXiv:1006.5440, doi:10.1007/978-3-642-17517-6_36, ISBN978-3-642-17516-9.
Eppstein, David; Strash, Darren (2011), "Listing all maximal cliques in large sparse real-world graphs", 10th
International Symposium on Experimental Algorithms, arXiv:1103.0318.
Erdős, Paul; Szekeres, George (1935), "A combinatorial problem in geometry" (http://www.renyi.hu/~p_erdos/1935-01.pdf), Compositio Mathematica 2: 463–470.
Even, S.; Pnueli, A.; Lempel, A. (1972), "Permutation graphs and transitive graphs", Journal of the ACM 19 (3):
400410, doi:10.1145/321707.321710.
Fahle, T. (2002), "Simple and Fast: Improving a Branch-And-Bound Algorithm for Maximum Clique", Proc. 10th
European Symposium on Algorithms, Lecture Notes in Computer Science, 2461, Springer-Verlag, pp.4786,
doi:10.1007/3-540-45749-6_44, ISBN978-3-540-44180-9.
Feige, U. (2004), "Approximating maximum clique by removing subgraphs", SIAM Journal on Discrete
Mathematics 18 (2): 219225, doi:10.1137/S089548010240415X.
Feige, U.; Goldwasser, S.; Lovász, L.; Safra, S.; Szegedy, M. (1991), "Approximating clique is almost
NP-complete", Proc. 32nd IEEE Symp. on Foundations of Computer Science, pp.212,
doi:10.1109/SFCS.1991.185341, ISBN0-8186-2445-0.
Feige, U.; Krauthgamer, R. (2000), "Finding and certifying a large hidden clique in a semirandom graph",
Random Structures and Algorithms 16 (2): 195208,
doi:10.1002/(SICI)1098-2418(200003)16:2<195::AID-RSA5>3.0.CO;2-A.
Garey, M. R.; Johnson, D. S. (1978), ""Strong" NP-completeness results: motivation, examples and implications",
Journal of the ACM 25 (3): 499508, doi:10.1145/322077.322090.
Gavril, F. (1973), "Algorithms for a maximum clique and a maximum independent set of a circle graph",
Networks 3 (3): 261273, doi:10.1002/net.3230030305.
Goldmann, M.; Hstad, J. (1992), "A simple lower bound for monotone clique using a communication game",
Information Processing Letters 41 (4): 221226, doi:10.1016/0020-0190(92)90184-W.
Golumbic, M. C. (1980), Algorithmic Graph Theory and Perfect Graphs, Computer Science and Applied
Mathematics, Academic Press, ISBN0-444-51530-5.
Gröger, Hans Dietmar (1992), "On the randomized complexity of monotone graph properties" (http://www.inf.u-szeged.hu/actacybernetica/edb/vol10n3/pdf/Groger_1992_ActaCybernetica.pdf), Acta Cybernetica 10 (3):
119–127, retrieved 2009-10-02.
Grosso, A.; Locatelli, M.; Della Croce, F. (2004), "Combining swaps and node weights in an adaptive greedy
approach for the maximum clique problem", Journal of Heuristics 10 (2): 135152,
doi:10.1023/B:HEUR.0000026264.51747.7f.
Grötschel, M.; Lovász, L.; Schrijver, A. (1988), "9.4 Coloring Perfect Graphs", Geometric Algorithms and
Combinatorial Optimization, Algorithms and Combinatorics, 2, Springer-Verlag, pp. 296–298,
ISBN 0-387-13624-X.
Gutin, G. (2004), "5.3 Independent sets and cliques", in Gross, J. L.; Yellen, J., Handbook of graph theory,
Discrete Mathematics & Its Applications, CRC Press, pp.389402, ISBN978-1-58488-090-5.
Halldórsson, M. M. (2000), "Approximations of Weighted Independent Set and Hereditary Subset Problems"
(http://jgaa.info/accepted/00/Halldorsson00.4.1.pdf), Journal of Graph Algorithms and Applications 4 (1):
1–16.
Harary, F.; Ross, I. C. (1957), "A procedure for clique detection using the group matrix", Sociometry (American
Sociological Association) 20 (3): 205215, doi:10.2307/2785673, JSTOR2785673, MR0110590.
Håstad, J. (1999), "Clique is hard to approximate within n^(1−ε)", Acta Mathematica 182 (1): 105–142,
doi:10.1007/BF02392825.
Impagliazzo, R.; Paturi, R.; Zane, F. (2001), "Which problems have strongly exponential complexity?", Journal
of Computer and System Sciences 63 (4): 512530, doi:10.1006/jcss.2001.1774.
Itai, A.; Rodeh, M. (1978), "Finding a minimum circuit in a graph", SIAM Journal on Computing 7 (4): 413423,
doi:10.1137/0207033.
Jerrum, M. (1992), "Large cliques elude the Metropolis process", Random Structures and Algorithms 3 (4):
347359, doi:10.1002/rsa.3240030402.
Jian, T. (1986), "An O(2^(0.304n)) Algorithm for Solving Maximum Independent Set Problem", IEEE Transactions
on Computers (IEEE Computer Society) 35 (9): 847–851, doi:10.1109/TC.1986.1676847, ISSN 0018-9340.
Johnson, D. S.; Trick, M. A., eds. (1996), Cliques, Coloring, and Satisfiability: Second DIMACS Implementation
Challenge, October 1113, 1993 (http://dimacs.rutgers.edu/Volumes/Vol26.html), DIMACS Series in
Discrete Mathematics and Theoretical Computer Science, 26, American Mathematical Society,
ISBN0-8218-6609-5.
Johnson, D. S.; Yannakakis, M. (1988), "On generating all maximal independent sets", Information Processing
Letters 27 (3): 119123, doi:10.1016/0020-0190(88)90065-8.
Karp, Richard M. (1972), "Reducibility among combinatorial problems" (http://www.cs.berkeley.edu/~luca/
cs172/karp.pdf), in Miller, R. E.; Thatcher, J. W., Complexity of Computer Computations, New York: Plenum,
pp.85103.
Karp, Richard M. (1976), "Probabilistic analysis of some combinatorial search problems", in Traub, J. F.,
Algorithms and Complexity: New Directions and Recent Results, New York: Academic Press, pp. 1–19.
Katayama, K.; Hamamoto, A.; Narihisa, H. (2005), "An effective local search for the maximum clique problem",
Information Processing Letters 95 (5): 503511, doi:10.1016/j.ipl.2005.05.010.
Khot, S. (2001), "Improved inapproximability results for MaxClique, chromatic number and approximate graph
coloring", Proc. 42nd IEEE Symp. Foundations of Computer Science, pp.600609,
doi:10.1109/SFCS.2001.959936, ISBN0-7695-1116-3.
Kloks, T.; Kratsch, D.; Mller, H. (2000), "Finding and counting small induced subgraphs efficiently",
Information Processing Letters 74 (34): 115121, doi:10.1016/S0020-0190(00)00047-8.
Konc, J.; Janežič, D. (2007), "An improved branch and bound algorithm for the maximum clique problem" (http://www.sicmm.org/~konc/articles/match2007.pdf), MATCH Communications in Mathematical and in
Computer Chemistry 58 (3): 569–590. Source code (http://www.sicmm.org/~konc/maxclique).
Lipton, R. J.; Tarjan, R. E. (1980), "Applications of a planar separator theorem", SIAM Journal on Computing 9
(3): 615627, doi:10.1137/0209046.
Luce, R. Duncan; Perry, Albert D. (1949), "A method of matrix analysis of group structure", Psychometrika 14
(2): 95116, doi:10.1007/BF02289146, PMID18152948.
Magniez, Frédéric; Santha, Miklós; Szegedy, Mario (2007), "Quantum algorithms for the triangle problem",
SIAM Journal on Computing 37 (2): 413424, arXiv:quant-ph/0310134, doi:10.1137/050643684.
Makino, K.; Uno, T. (2004), "New algorithms for enumerating all maximal cliques" (http://www.springerlink.
com/content/p9qbl6y1v5t3xc1w/), Algorithm Theory: SWAT 2004, Lecture Notes in Computer Science, 3111,
Springer-Verlag, pp.260272.
Moon, J. W.; Moser, L. (1965), "On cliques in graphs", Israel Journal of Mathematics 3: 23–28,
doi:10.1007/BF02760024, MR0182577.
Nešetřil, J.; Poljak, S. (1985), "On the complexity of the subgraph problem", Commentationes Mathematicae
Universitatis Carolinae 26 (2): 415–419.
Östergård, P. R. J. (2002), "A fast algorithm for the maximum clique problem", Discrete Applied Mathematics
120 (1–3): 197–207, doi:10.1016/S0166-218X(01)00290-6.
Ouyang, Q.; Kaplan, P. D.; Liu, S.; Libchaber, A. (1997), "DNA solution of the maximal clique problem",
Science 278 (5337): 446449, doi:10.1126/science.278.5337.446, PMID9334300.
Pardalos, P. M.; Rogers, G. P. (1992), "A branch and bound algorithm for the maximum clique problem",
Computers & Operations Research 19 (5): 363375, doi:10.1016/0305-0548(92)90067-F.
Razborov, A. A. (1985), "Lower bounds for the monotone complexity of some Boolean functions" (in Russian),
Proceedings of the USSR Academy of Sciences 281: 798801. English translation in Sov. Math. Dokl. 31 (1985):
354357.
Régin, J.-C. (2003), "Using constraint programming to solve the maximum clique problem" (http://www.springerlink.com/content/8p1980dfmrt3agyp/), Proc. 9th Int. Conf. Principles and Practice of Constraint
Programming – CP 2003, Lecture Notes in Computer Science, 2833, Springer-Verlag, pp. 634–648.
Robson, J. M. (1986), "Algorithms for maximum independent sets", Journal of Algorithms 7 (3): 425440,
doi:10.1016/0196-6774(86)90032-5.
Robson, J. M. (2001), Finding a maximum independent set in time O(2^(n/4)) (http://www.labri.fr/perso/robson/mis/techrep.html).
Rosgen, B; Stewart, L (2007), "Complexity results on graphs with few cliques" (http://www.dmtcs.org/
dmtcs-ojs/index.php/dmtcs/article/view/707/1817), Discrete Mathematics and Theoretical Computer Science
9 (1): 127136.
Sipser, M. (1996), Introduction to the Theory of Computation, International Thompson Publishing,
ISBN0-534-94728-X.
Tarjan, R. E.; Trojanowski, A. E. (1977), "Finding a maximum independent set" (ftp://db.stanford.edu/pub/
cstr.old/reports/cs/tr/76/550/CS-TR-76-550.pdf), SIAM Journal on Computing 6 (3): 537546,
doi:10.1137/0206038.
Tomita, E.; Kameda, T. (2007), "An efficient branch-and-bound algorithm for finding a maximum clique with
computational experiments", Journal of Global Optimization 37 (1): 95111, doi:10.1007/s10898-006-9039-7.
Tomita, E.; Seki, T. (2003), "An Efficient Branch-and-Bound Algorithm for Finding a Maximum Clique",
Discrete Mathematics and Theoretical Computer Science, Lecture Notes in Computer Science, 2731,
Springer-Verlag, pp.278289, doi:10.1007/3-540-45066-1_22, ISBN978-3-540-40505-4.
Tomita, E.; Tanaka, A.; Takahashi, H. (2006), "The worst-case time complexity for generating all maximal
cliques and computational experiments", Theoretical Computer Science 363 (1): 2842,
doi:10.1016/j.tcs.2006.06.015.
Tsukiyama, S.; Ide, M.; Ariyoshi, I.; Shirakawa, I. (1977), "A new algorithm for generating all the maximal
independent sets", SIAM Journal on Computing 6 (3): 505517, doi:10.1137/0206036.
Valiant, L. G. (1983), "Exponential lower bounds for restricted monotone circuits", Proc. 15th ACM Symposium
on Theory of Computing, pp.110117, doi:10.1145/800061.808739, ISBN0-89791-099-0.
Vassilevska, V.; Williams, R. (2009), "Finding, minimizing, and counting weighted subgraphs", Proc. 41st ACM
Symposium on Theory of Computing, pp.455464, doi:10.1145/1536414.1536477, ISBN978-1-60558-506-2.
Wegener, I. (1988), "On the complexity of branching programs and decision trees for clique functions", Journal
of the ACM 35 (2): 461472, doi:10.1145/42282.46161.
Yuster, R. (2006), "Finding and counting cliques and independent sets in r-uniform hypergraphs", Information
Processing Letters 99 (4): 130134, doi:10.1016/j.ipl.2006.04.005.
Zuckerman, D. (2006), "Linear degree extractors and the inapproximability of max clique and chromatic
number", Proc. 38th ACM Symp. Theory of Computing, pp.681690, doi:10.1145/1132516.1132612,
ISBN1-59593-134-1, ECCCTR05-100.
Without pivoting
The basic form of the Bron–Kerbosch algorithm is a recursive backtracking algorithm that searches for all maximal
cliques in a given graph G. More generally, given three sets R, P, and X, it finds the maximal cliques that include all
of the vertices in R, some of the vertices in P, and none of the vertices in X. Within the recursive calls to the
algorithm, P and X are restricted to vertices that form cliques when added to R, as these are the only vertices that can
be used as part of the output or to prevent some clique from being reported as output.
The recursion is initiated by setting R and X to be the empty set and P to be the vertex set of the graph. Within each
recursive call, the algorithm considers the vertices in P in turn; if there are no such vertices, it either reports R as a
maximal clique (if X is empty), or backtracks. For each vertex v chosen from P, it makes a recursive call in which v
is added to R and in which P and X are restricted to neighbors of v, N(v), which finds and reports all clique
extensions of R that contain v. Then, it moves v from P to X and continues with the next vertex in P.
That is, in pseudocode, the algorithm performs the following steps:

BronKerbosch1(R, P, X):
    if P and X are both empty:
        report R as a maximal clique
    for each vertex v in P:
        BronKerbosch1(R ∪ {v}, P ∩ N(v), X ∩ N(v))
        P := P \ {v}
        X := X ∪ {v}
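A runnable rendering of this basic form, as a sketch in Python (the dict-of-neighbor-sets representation and function names are our own choices):

```python
def bron_kerbosch(R, P, X, adj, report):
    """Basic Bron-Kerbosch without pivoting.

    adj maps each vertex to its set of neighbors; report is called once
    per maximal clique found."""
    if not P and not X:
        report(set(R))
        return
    for v in list(P):                 # snapshot: vertices originally in P
        bron_kerbosch(R | {v}, P & adj[v], X & adj[v], adj, report)
        P.remove(v)                   # move v from P ...
        X.add(v)                      # ... to X before the next iteration

# The example graph used later in this article: five maximal cliques.
adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
cliques = []
bron_kerbosch(set(), set(adj), set(), adj, cliques.append)
print(sorted(map(sorted, cliques)))
# -> [[1, 2, 5], [2, 3], [3, 4], [4, 5], [4, 6]]
```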
With pivoting
The basic form of the algorithm, described above, is inefficient in the case of graphs with many non-maximal
cliques: it makes a recursive call for every clique, maximal or not. To save time and allow the algorithm to backtrack
more quickly in branches of the search that contain no maximal cliques, Bron and Kerbosch introduced a variant of
the algorithm involving a "pivot vertex" u, chosen from P (or more generally, as later investigators realized,[4] from
P ∪ X). Any maximal clique must include either u or one of its non-neighbors, for otherwise the clique could be
augmented by adding u to it. Therefore, only u and its non-neighbors need to be tested as the choices for the vertex v
that is added to R in each recursive call to the algorithm. In pseudocode:
BronKerbosch2(R, P, X):
    if P and X are both empty:
        report R as a maximal clique
    choose a pivot vertex u in P ∪ X
    for each vertex v in P \ N(u):
        BronKerbosch2(R ∪ {v}, P ∩ N(v), X ∩ N(v))
        P := P \ {v}
        X := X ∪ {v}
If the pivot is chosen to minimize the number of recursive calls made by the algorithm, the savings in running time
compared to the non-pivoting version of the algorithm can be significant.[5]
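The pivoting variant is a small change to the basic recursion. A Python sketch (the pivot rule shown, maximizing |P ∩ N(u)|, is the common choice that minimizes the candidates |P \ N(u)|; the representation is our own):

```python
def bron_kerbosch_pivot(R, P, X, adj, report):
    """Bron-Kerbosch with pivoting: only u's non-neighbors in P are tried."""
    if not P and not X:
        report(set(R))
        return
    # Pivot u from P | X with the most neighbors inside P, so that the
    # loop below runs over as few vertices as possible.
    u = max(P | X, key=lambda w: len(P & adj[w]))
    for v in list(P - adj[u]):        # snapshot of P \ N(u)
        bron_kerbosch_pivot(R | {v}, P & adj[v], X & adj[v], adj, report)
        P.remove(v)
        X.add(v)

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
cliques = []
bron_kerbosch_pivot(set(), set(adj), set(), adj, cliques.append)
print(sorted(map(sorted, cliques)))  # the same five maximal cliques
```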
Example
In the example graph shown, the algorithm is initially called with R = ∅, P = {1,2,3,4,5,6}, and X = ∅. The
pivot u should be chosen as one of the degree-three vertices, to minimize the number of recursive calls; for
instance, suppose that u is chosen to be vertex 2. Then there are three remaining vertices in P \ N(u): vertices 2,
4, and 6.

[Figure: A graph with five maximal cliques: four edges and a triangle.]

The iteration of the inner loop of the algorithm for v = 2 makes a recursive call to the algorithm with R = {2},
P = {1,3,5}, and X = ∅. Within this recursive call, one of 1 or 5 will be chosen as a pivot, and there will be two
second-level recursive calls, one for vertex 3 and the other for whichever vertex was not chosen as pivot. These two calls will eventually report the two cliques {1,2,5}
and {2,3}. After returning from these recursive calls, vertex 2 is added to X and removed from P.
The iteration of the inner loop of the algorithm for v = 4 makes a recursive call to the algorithm with R = {4},
P = {3,5,6}, and X = ∅ (although vertex 2 belongs to the set X in the outer call to the algorithm, it is not a neighbor
of v and is excluded from the subset of X passed to the recursive call). This recursive call will end up making three
second-level recursive calls to the algorithm that report the three cliques {3,4}, {4,5}, and {4,6}. Then, vertex 4 is
added to X and removed from P.
In the third and final iteration of the inner loop of the algorithm, for v = 6, there is a recursive call to the algorithm
with R = {6}, P = ∅, and X = {4}. Because this recursive call has P empty and X non-empty, it immediately
backtracks without reporting any more cliques, as there can be no maximal clique that includes vertex 6 and excludes
vertex 4.
The call tree for the algorithm, therefore, looks like:
BronKerbosch2(∅, {1,2,3,4,5,6}, ∅)
    BronKerbosch2({2}, {1,3,5}, ∅)
        BronKerbosch2({2,3}, ∅, ∅): output {2,3}
        BronKerbosch2({2,5}, {1}, ∅)
            BronKerbosch2({1,2,5}, ∅, ∅): output {1,2,5}
    BronKerbosch2({4}, {3,5,6}, ∅)
        BronKerbosch2({3,4}, ∅, ∅): output {3,4}
        BronKerbosch2({4,5}, ∅, ∅): output {4,5}
        BronKerbosch2({4,6}, ∅, ∅): output {4,6}
    BronKerbosch2({6}, ∅, {4}): no output
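The call tree above can be reproduced by instrumenting the pivoting algorithm to print each call (a sketch; pivot tie-breaking is arbitrary here, so the printed tree may differ in detail from the one shown):

```python
def bk_trace(R, P, X, adj, out, depth=0):
    """Bron-Kerbosch with pivoting, printing each recursive call indented
    by depth, in the style of the call tree above."""
    print("    " * depth + "BronKerbosch2(%s, %s, %s)"
          % (sorted(R), sorted(P), sorted(X)))
    if not P and not X:
        print("    " * depth + "output %s" % sorted(R))
        out.append(frozenset(R))
        return
    u = max(P | X, key=lambda w: len(P & adj[w]))   # pivot
    for v in sorted(P - adj[u]):                    # sorted for a stable trace
        bk_trace(R | {v}, P & adj[v], X & adj[v], adj, out, depth + 1)
        P.remove(v)
        X.add(v)

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
found = []
bk_trace(set(), set(adj), set(), adj, found)  # prints the call tree
```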
The graph in the example has degeneracy two; one possible degeneracy ordering is 6,4,3,1,2,5. If the vertex-ordering
version of the Bron–Kerbosch algorithm is applied to the vertices, in this order, the call tree looks like:

BronKerboschPivot({6}, {4}, ∅)
    BronKerboschPivot({4,6}, ∅, ∅): output {4,6}
BronKerboschPivot({4}, {3,5}, {6})
    BronKerboschPivot({3,4}, ∅, ∅): output {3,4}
    BronKerboschPivot({4,5}, ∅, ∅): output {4,5}
BronKerboschPivot({3}, {2}, {4})
    BronKerboschPivot({2,3}, ∅, ∅): output {2,3}
BronKerboschPivot({1}, {2,5}, ∅)
    BronKerboschPivot({1,2}, {5}, ∅)
        BronKerboschPivot({1,2,5}, ∅, ∅): output {1,2,5}
BronKerboschPivot({2}, {5}, {1,3}): no output
BronKerboschPivot({5}, ∅, {1,2,4}): no output
Worst-case analysis
The Bron–Kerbosch algorithm is not an output-sensitive algorithm: unlike some other algorithms for the clique
problem, it does not run in polynomial time per maximal clique generated. However, it is efficient in a worst-case
sense: by a result of Moon & Moser (1965), any n-vertex graph has at most 3^(n/3) maximal cliques, and the worst-case
running time of the Bron–Kerbosch algorithm (with a pivot strategy that minimizes the number of recursive calls
made at each step) is O(3^(n/3)), matching this bound.[8]
For sparse graphs, tighter bounds are possible. In particular the vertex-ordering version of the Bron–Kerbosch
algorithm can be made to run in time O(d·n·3^(d/3)), where d is the degeneracy of the graph, a measure of its sparseness.
There exist d-degenerate graphs for which the total number of maximal cliques is (n − d)·3^(d/3), so this bound is close
to tight.[6]
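A degeneracy ordering itself is easy to compute: repeatedly remove a minimum-degree vertex. A sketch under our own naming (the ordering produced may differ from the one quoted above, since many valid orderings exist):

```python
def degeneracy_ordering(adj):
    """Return (order, d): a degeneracy ordering and the degeneracy d.

    Repeatedly removes a minimum-degree vertex; d is the largest degree
    seen at removal time, so every vertex has at most d later neighbors."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # local working copy
    order, d = [], 0
    while adj:
        v = min(adj, key=lambda w: len(adj[w]))       # minimum-degree vertex
        d = max(d, len(adj[v]))
        order.append(v)
        for w in adj.pop(v):                          # delete v from the graph
            adj[w].discard(v)
    return order, d

adj = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4}, 4: {3, 5, 6}, 5: {1, 2, 4}, 6: {4}}
order, d = degeneracy_ordering(adj)
print(d)  # -> 2, matching the degeneracy of the example graph
```

A simple linear-scan implementation like this runs in O(n²); with bucket queues the ordering can be found in linear time.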
Notes
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
References
Akkoyunlu, E. A. (1973), "The enumeration of maximal cliques of large graphs", SIAM Journal on Computing 2
(1): 1–6, doi:10.1137/0202001.
Chen, Lingran (2004), "Substructure and maximal common substructure searching", in Bultinck, Patrick,
Computational Medicinal Chemistry for Drug Discovery, CRC Press, pp.483514, ISBN978-0-8247-4774-9.
Bron, Coen; Kerbosch, Joep (1973), "Algorithm 457: finding all cliques of an undirected graph", Commun. ACM
(ACM) 16 (9): 575577, doi:10.1145/362342.362367.
Cazals, F.; Karande, C. (2008), "A note on the problem of reporting maximal cliques" (ftp://ftp-sop.inria.fr/
geometrica/fcazals/papers/ncliques.pdf), Theoretical Computer Science 407 (1): 564568,
doi:10.1016/j.tcs.2008.05.010.
Eppstein, David; Löffler, Maarten; Strash, Darren (2010), "Listing all maximal cliques in sparse graphs in
near-optimal time", in Cheong, Otfried; Chwa, Kyung-Yong; Park, Kunsoo, 21st International Symposium on
Algorithms and Computation (ISAAC 2010), Jeju, Korea, Lecture Notes in Computer Science, 6506,
Springer-Verlag, pp.403414, arXiv:1006.5440, doi:10.1007/978-3-642-17517-6_36.
Eppstein, David; Strash, Darren (2011), "Listing all maximal cliques in large sparse real-world graphs", 10th
International Symposium on Experimental Algorithms, arXiv:1103.0318.
Johnston, H. C. (1976), "Cliques of a graphvariations on the BronKerbosch algorithm", International Journal
of Parallel Programming 5 (3): 209238, doi:10.1007/BF00991836.
Koch, Ina (2001), "Enumerating all connected maximal common subgraphs in two graphs", Theoretical Computer
Science 250 (12): 130, doi:10.1016/S0304-3975(00)00286-3.
Moon, J. W.; Moser, L. (1965), "On cliques in graphs", Israel J. Math. 3: 2328, doi:10.1007/BF02760024,
MR0182577.
Tomita, Etsuji; Tanaka, Akira; Takahashi, Haruhisa (2006), "The worst-case time complexity for generating all
maximal cliques and computational experiments", Theoretical Computer Science 363 (1): 2842,
doi:10.1016/j.tcs.2006.06.015.
External links
Bron–Kerbosch algorithm implementation in Python (http://www.kuchaev.com/files/graph.py)
Finding all cliques of an undirected graph (http://www.dfki.de/~neumann/ie-seminar/presentations/
finding_cliques.pdf). Seminar notes by Michaela Regneri, January 11, 2007.
Properties
Relationship to other graph parameters
A set is independent if and only if it is a clique in the graph's complement, so the two concepts are complementary. In fact, sufficiently large graphs with no large cliques have large independent sets, a theme that is explored in Ramsey theory.
A set is independent if and only if its complement is a vertex cover. The sum of the independence number α(G) and the size of a minimum vertex cover τ(G) is the number of vertices in the graph.
In a bipartite graph, the number of vertices in a maximum independent set equals the number of edges in a minimum edge covering; this is Kőnig's theorem.
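These complement relationships can be verified exhaustively on a small graph. The sketch below uses a 4-cycle as an illustrative choice (not from the source) and checks that a set is independent exactly when it is a clique in the complement and exactly when its complement is a vertex cover, and that α(G) + τ(G) equals the number of vertices.

```python
from itertools import combinations

def is_independent(edges, S):
    """S is independent iff no edge has both endpoints in S."""
    return all(not (u in S and v in S) for u, v in edges)

def is_vertex_cover(edges, C):
    """C is a vertex cover iff every edge has at least one endpoint in C."""
    return all(u in C or v in C for u, v in edges)

def is_clique(edges, S):
    es = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in es for u, v in combinations(S, 2))

# 4-cycle 0-1-2-3-0 and its complement
V = {0, 1, 2, 3}
E = [(0, 1), (1, 2), (2, 3), (3, 0)]
comp_E = [(u, v) for u, v in combinations(V, 2)
          if frozenset((u, v)) not in {frozenset(e) for e in E}]

for r in range(len(V) + 1):
    for S in map(set, combinations(V, r)):
        # independent in G  <=>  clique in complement  <=>  V - S covers G
        assert (is_independent(E, S)
                == is_clique(comp_E, S)
                == is_vertex_cover(E, V - S))

# alpha(G) + tau(G) = |V|
alpha = max(r for r in range(len(V) + 1)
            if any(is_independent(E, set(S)) for S in combinations(V, r)))
tau = min(r for r in range(len(V) + 1)
          if any(is_vertex_cover(E, set(S)) for S in combinations(V, r)))
```

For the 4-cycle, {0, 2} is a maximum independent set and {1, 3} a minimum vertex cover, so α = τ = 2 and their sum is 4.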
Name          License  API language  Brief info
igraph [10]   GPL
NetworkX [11] BSD      Python
OpenOpt [12]  BSD      Python        class STAB
NetworkX [13] BSD      Python        see the routine maximal_independent_set
Notes
[1] Godsil & Royle (2001), p. 3.
[2] Moon & Moser (1965).
[3] Füredi (1987).
[4] Chiba & Nishizeki (1985).
[5] Berman & Fujito (1995).
[6] Halldórsson & Radhakrishnan (1997).
[7] For claw-free graphs, see Sbihi (1980). For perfect graphs, see Grötschel, Lovász & Schrijver (1988).
[8] Baker (1994); Grohe (2003).
[9] Luby (1985).
[10] http://igraph.sourceforge.net
[11] http://networkx.lanl.gov/reference/generated/networkx.algorithms.approximation.independent_set.maximum_independent_set.html
[12] http://openopt.org/STAB
[13] http://networkx.lanl.gov/reference/generated/networkx.algorithms.mis.maximal_independent_set.html#networkx.algorithms.mis.maximal_independent_set
References
Baker, Brenda S. (1994), "Approximation algorithms for NP-complete problems on planar graphs", Journal of the ACM 41 (1): 153–180, doi:10.1145/174644.174650.
Berman, Piotr; Fujito, Toshihiro (1995), "On approximation properties of the Independent set problem for degree 3 graphs", Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science, 955, Springer-Verlag, pp. 449–460, doi:10.1007/3-540-60220-8_84.
Chiba, N.; Nishizeki, T. (1985), "Arboricity and subgraph listing algorithms", SIAM Journal on Computing 14 (1): 210–223, doi:10.1137/0214017.
Fomin, Fedor V.; Grandoni, Fabrizio; Kratsch, Dieter (2009), "A measure & conquer approach for the analysis of exact algorithms", Journal of the ACM 56 (5): 1–32, doi:10.1145/1552285.1552286, article no. 25.
Füredi, Z. (1987), "The number of maximal independent sets in connected graphs", Journal of Graph Theory 11 (4): 463–470, doi:10.1002/jgt.3190110403.
Godsil, Chris; Royle, Gordon (2001), Algebraic Graph Theory, New York: Springer, ISBN 0-387-95220-9.
Grohe, Martin (2003), "Local tree-width, excluded minors, and approximation algorithms", Combinatorica 23 (4): 613–632, doi:10.1007/s00493-003-0037-9.
Grötschel, M.; Lovász, L.; Schrijver, A. (1988), "9.4 Coloring Perfect Graphs", Geometric Algorithms and Combinatorial Optimization, Algorithms and Combinatorics, 2, Springer-Verlag, pp. 296–298, ISBN 0-387-13624-X.
Halldórsson, M. M.; Radhakrishnan, J. (1997), "Greed is good: Approximating independent sets in sparse and bounded-degree graphs", Algorithmica 18 (1): 145–163, doi:10.1007/BF02523693.
Luby, M. (1985), "A simple parallel algorithm for the maximal independent set problem" (http://www.cs.rpi.edu/~buschc/courses/distributed/spring2007/papers/mis.pdf), Proc. 17th Symposium on Theory of Computing, Association for Computing Machinery, pp. 1–10, doi:10.1145/22145.22146.
Moon, J. W.; Moser, Leo (1965), "On cliques in graphs", Israel Journal of Mathematics 3 (1): 23–28, doi:10.1007/BF02760024, MR0182577.
Robson, J. M. (1986), "Algorithms for maximum independent sets", Journal of Algorithms 7 (3): 425–440, doi:10.1016/0196-6774(86)90032-5.
Sbihi, Najiba (1980), "Algorithme de recherche d'un stable de cardinalité maximum dans un graphe sans étoile" (in French), Discrete Mathematics 29 (1): 53–76, doi:10.1016/0012-365X(90)90287-R, MR553650.
External links
Weisstein, Eric W., " Maximal Independent Vertex Set (http://mathworld.wolfram.com/
MaximalIndependentVertexSet.html)" from MathWorld.
Challenging Benchmarks for Maximum Clique, Maximum Independent Set, Minimum Vertex Cover and Vertex
Coloring (http://www.nlsde.buaa.edu.cn/~kexu/benchmarks/graph-benchmarks.htm)
[Figure: The graph of the cube has six different maximal independent sets, shown as the red vertices.]
For example, in the graph P3, a path with three vertices a, b, and c, and two edges ab and bc, the sets {b} and {a,c}
are both maximally independent. The set {a} is independent, but is not maximal independent, because it is a subset
of the larger independent set {a,c}. In this same graph, the maximal cliques are the sets {a,b} and {b,c}.
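The P3 example can be checked by brute force; the following sketch (an illustrative enumeration, not an algorithm from the sources) lists exactly the independent sets that are contained in no larger one.

```python
from itertools import combinations

def maximal_independent_sets(vertices, edges):
    """Brute force: independent sets not properly contained in another
    independent set. Exponential in |vertices|; fine for tiny examples."""
    def independent(S):
        return all(not (u in S and v in S) for u, v in edges)
    indep = [set(S) for r in range(len(vertices) + 1)
             for S in combinations(vertices, r) if independent(set(S))]
    return [S for S in indep if not any(S < T for T in indep)]

# Path a-b-c: {b} and {a,c} are the maximal independent sets;
# {a} is independent but not maximal, since {a} is a subset of {a,c}.
mis = maximal_independent_sets(['a', 'b', 'c'], [('a', 'b'), ('b', 'c')])
```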
The phrase "maximal independent set" is also used to describe maximal subsets of independent elements in
mathematical structures other than graphs, and in particular in vector spaces and matroids.
Notes
[1] Erdős (1966) shows that the number of different sizes of maximal independent sets in an n-vertex graph may be as large as n − log n − O(log log n) and is never larger than n − log n.
[2] Weigt & Hartmann (2001).
[3] Information System on Graph Class Inclusions: maximal clique irreducible graphs (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_749.html) and hereditary maximal clique irreducible graphs (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_750.html).
[4] Byskov (2003). For related earlier results see Croitoru (1979) and Eppstein (2003).
[5] Chiba & Nishizeki (1985). The sparseness condition is equivalent to assuming that the graph family has bounded arboricity.
[6] Bisdorff & Marichal (2007); Euler (2005); Füredi (1987).
[7] Eppstein (2003); Byskov (2003).
[8] Eppstein (2003). For a matching bound for the widely used Bron–Kerbosch algorithm, see Tomita, Tanaka & Takahashi (2006).
[9] Bomze et al. (1999); Eppstein (2005); Jennings & Motycková (1992); Johnson, Yannakakis & Papadimitriou (1988); Lawler, Lenstra & Rinnooy Kan (1980); Liang, Dhall & Lakshmivarahan (1991); Makino & Uno (2004); Mishra & Pitt (1997); Stix (2004); Tsukiyama et al. (1977); Yu & Chen (1993).
[10] Makino & Uno (2004); Eppstein (2005).
References
Bisdorff, R.; Marichal, J.-L. (2007), Counting non-isomorphic maximal independent sets of the n-cycle graph, arXiv:math.CO/0701647.
Bomze, I. M.; Budinich, M.; Pardalos, P. M.; Pelillo, M. (1999), "The maximum clique problem", Handbook of Combinatorial Optimization, 4, Kluwer Academic Publishers, pp. 1–74, CiteSeerX: 10.1.1.48.4074 (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.48.4074).
Byskov, J. M. (2003), "Algorithms for k-colouring and finding maximal independent sets" (http://portal.acm.org/citation.cfm?id=644182), Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 456–457.
Chiba, N.; Nishizeki, T. (1985), "Arboricity and subgraph listing algorithms", SIAM J. on Computing 14 (1): 210–223, doi:10.1137/0214017.
Croitoru, C. (1979), "On stables in graphs", Proc. Third Coll. Operations Research, Babeş-Bolyai University, Cluj-Napoca, Romania, pp. 55–60.
Eppstein, D. (2003), "Small maximal independent sets and faster exact graph coloring" (http://www.cs.brown.edu/publications/jgaa/accepted/2003/Eppstein2003.7.2.pdf), Journal of Graph Algorithms and Applications 7 (2): 131–140, arXiv:cs.DS/0011009.
Eppstein, D. (2005), "All maximal independent sets and dynamic dominance for sparse graphs", Proc. Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 451–459, arXiv:cs.DS/0407036.
Erdős, P. (1966), "On cliques in graphs", Israel J. Math. 4 (4): 233–234, doi:10.1007/BF02771637, MR0205874.
Graph coloring
In graph theory, graph coloring is a special case of graph labeling; it
is an assignment of labels traditionally called "colors" to elements of a
graph subject to certain constraints. In its simplest form, it is a way of
coloring the vertices of a graph such that no two adjacent vertices share
the same color; this is called a vertex coloring. Similarly, an edge
coloring assigns a color to each edge so that no two adjacent edges
share the same color, and a face coloring of a planar graph assigns a
color to each face or region so that no two faces that share a boundary
have the same color.
Vertex coloring is the starting point of the subject, and other coloring problems can be transformed into a vertex version. For example, an edge coloring of a graph is just a vertex coloring of its line graph, and a face coloring of a planar graph is just a vertex coloring of its planar dual. However, non-vertex coloring problems are often stated and studied as is. That is partly for perspective, and partly because some problems are best studied in non-vertex form, as for instance is edge coloring.
[Figure: A proper vertex coloring of the Petersen graph with 3 colors, the minimum number possible.]
The convention of using colors originates from coloring the countries of a map, where each face is literally colored.
This was generalized to coloring the faces of a graph embedded in the plane. By planar duality it became coloring
the vertices, and in this form it generalizes to all graphs. In mathematical and computer representations it is typical to
use the first few positive or nonnegative integers as the "colors". In general one can use any finite set as the "color
set". The nature of the coloring problem depends on the number of colors but not on what they are.
Graph coloring enjoys many practical applications as well as theoretical challenges. Beside the classical types of
problems, different limitations can also be set on the graph, or on the way a color is assigned, or even on the color
itself. It has even reached popularity with the general public in the form of the popular number puzzle Sudoku.
Graph coloring is still a very active field of research.
Note: Many terms used in this article are defined in Glossary of graph theory.
History
The first results about graph coloring deal almost exclusively with planar graphs in the form of the coloring of maps.
While trying to color a map of the counties of England, Francis Guthrie postulated the four color conjecture, noting
that four colors were sufficient to color the map so that no regions sharing a common border received the same color.
Guthrie's brother passed on the question to his mathematics teacher Augustus De Morgan at University College, who
mentioned it in a letter to William Hamilton in 1852. Arthur Cayley raised the problem at a meeting of the London
Mathematical Society in 1879. The same year, Alfred Kempe published a paper that claimed to establish the result,
and for a decade the four color problem was considered solved. For his accomplishment Kempe was elected a Fellow
of the Royal Society and later President of the London Mathematical Society.[1]
In 1890, Heawood pointed out that Kempe's argument was wrong. However, in that paper he proved the five color
theorem, saying that every planar map can be colored with no more than five colors, using ideas of Kempe. In the
following century, a vast amount of work and theories were developed to reduce the number of colors to four, until
the four color theorem was finally proved in 1976 by Kenneth Appel and Wolfgang Haken. Perhaps surprisingly, the
proof went back to the ideas of Heawood and Kempe and largely disregarded the intervening developments.[2] The
proof of the four color theorem is also noteworthy for being the first major computer-aided proof.
In 1912, George David Birkhoff introduced the chromatic polynomial to study the coloring problems, which was
generalised to the Tutte polynomial by Tutte, important structures in algebraic graph theory. Kempe had already
drawn attention to the general, non-planar case in 1879,[3] and many results on generalisations of planar graph
coloring to surfaces of higher order followed in the early 20th century.
In 1960, Claude Berge formulated another conjecture about graph coloring, the strong perfect graph conjecture,
originally motivated by an information-theoretic concept called the zero-error capacity of a graph introduced by
Shannon. The conjecture remained unresolved for 40 years, until it was established as the celebrated strong perfect graph theorem by Chudnovsky, Robertson, Seymour, and Thomas in 2002.
Graph coloring has been studied as an algorithmic problem since the early 1970s: the chromatic number problem is
one of Karp's 21 NP-complete problems from 1972, and at approximately the same time various exponential-time
algorithms were developed based on backtracking and on the deletion-contraction recurrence of Zykov (1949). One
of the major applications of graph coloring, register allocation in compilers, was introduced in 1981.
Vertex coloring
A coloring using at most k colors is called a (proper) k-coloring. The smallest number of colors needed to color a graph G is called its chromatic number, and is often denoted χ(G). Sometimes γ(G) is used, since χ(G) is also used to denote the Euler characteristic of a graph. A graph that can be assigned a (proper) k-coloring is k-colorable, and it is k-chromatic if its chromatic number is exactly k. A subset of vertices assigned to the same color is called a color class; every such class forms an independent set. Thus, a k-coloring is the same as a partition of the vertex set into k independent sets, and the terms k-partite and k-colorable have the same meaning.
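The equivalence between a proper k-coloring and a partition into k independent sets can be checked directly; the sketch below uses a 5-cycle with a hand-picked 3-coloring as an illustrative example (not from the source).

```python
def is_proper(adj, coloring):
    """A coloring is proper iff no edge joins two vertices of one color."""
    return all(coloring[u] != coloring[v] for u in adj for v in adj[u])

def color_classes(coloring):
    """Group vertices by color: the color classes of the coloring."""
    classes = {}
    for v, c in coloring.items():
        classes.setdefault(c, set()).add(v)
    return list(classes.values())

# The 5-cycle is 3-chromatic; a proper 3-coloring partitions its
# vertex set into 3 independent sets.
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
assert is_proper(adj, coloring)
for cls in color_classes(coloring):
    # each color class is an independent set: no edge inside it
    assert all(u not in adj[v] for u in cls for v in cls)
```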
Chromatic polynomial
The chromatic polynomial counts the number of ways a graph can be colored
using no more than a given number of colors. For example, using three colors,
the graph in the image to the right can be colored in 12 ways. With only two
colors, it cannot be colored at all. With four colors, it can be colored in
24 + 4·12 = 72 ways: using all four colors, there are 4! = 24 valid colorings (every assignment of four distinct colors to any 4-vertex graph is a proper coloring); and for every choice of three of the four colors, there are 12 valid 3-colorings. So, for the graph in the example, a table of the number of valid colorings would start like this:

Available colors:      1   2   3    4
Number of colorings:   0   0   12   72
The chromatic polynomial is a function P(G, t) that counts the number of t-colorings of G. As the name indicates, for a given G the function is indeed a polynomial in t. For the example graph, P(G, t) = t(t − 1)²(t − 2), and indeed P(G, 4) = 72.
The chromatic polynomial includes at least as much information about the colorability of G as does the chromatic number. Indeed, χ(G) is the smallest positive integer that is not a root of the chromatic polynomial: χ(G) = min{k : P(G, k) > 0}.
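The counts in the table can be reproduced by brute force. The example graph itself is not reproduced in this extraction; a triangle with a pendant vertex is one 4-vertex graph with the stated polynomial t(t − 1)²(t − 2), and is used below as a stand-in.

```python
from itertools import product

def count_colorings(n, edges, t):
    """Evaluate P(G, t) by brute force: count the assignments of t
    colors to n vertices in which no edge is monochromatic."""
    return sum(all(c[u] != c[v] for u, v in edges)
               for c in product(range(t), repeat=n))

# Triangle 0-1-2 with pendant vertex 3 attached to vertex 2:
# P(G, t) = t(t-1)^2(t-2), so the table reads 0, 0, 12, 72.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
table = [count_colorings(4, edges, t) for t in (1, 2, 3, 4)]
```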
Edge coloring
An edge coloring of a graph is a proper coloring of the edges, meaning an assignment of colors to edges so that no
vertex is incident to two edges of the same color. An edge coloring with k colors is called a k-edge-coloring and is
equivalent to the problem of partitioning the edge set into k matchings. The smallest number of colors needed for an
edge coloring of a graph G is the chromatic index, or edge chromatic number, χ′(G). A Tait coloring is a 3-edge
coloring of a cubic graph. The four color theorem is equivalent to the assertion that every planar cubic bridgeless
graph admits a Tait coloring.
Total coloring
Total coloring is a type of coloring on the vertices and edges of a graph. When used without any qualification, a
total coloring is always assumed to be proper in the sense that no adjacent vertices, no adjacent edges, and no edge
and its endvertices are assigned the same color. The total chromatic number χ″(G) of a graph G is the least number
of colors needed in any total coloring of G.
Properties
Bounds on the chromatic number
Assigning distinct colors to distinct vertices always yields a proper coloring, so
1 ≤ χ(G) ≤ n.
The only graphs that can be 1-colored are edgeless graphs. A complete graph Kn of n vertices requires χ(Kn) = n colors. In an optimal coloring there must be at least one of the graph's m edges between every pair of color classes, so
χ(G)(χ(G) − 1) ≤ 2m.
If G contains a clique of size k, then at least k colors are needed to color that clique; in other words, the chromatic number is at least the clique number:
χ(G) ≥ ω(G).
A greedy coloring shows that every graph can be colored with one more color than the maximum vertex degree: χ(G) ≤ Δ(G) + 1. Complete graphs have χ(G) = n and Δ(G) = n − 1, and odd cycles have χ(G) = 3 and Δ(G) = 2, so for these graphs this bound is best possible. In all other cases, the bound can be slightly improved; Brooks' theorem[4] states that
Brooks' theorem: χ(G) ≤ Δ(G) for a connected, simple graph G, unless G is a complete graph or an odd cycle.
There is a strong relationship between edge colorability and the graph's maximum degree Δ(G): since all edges incident to the same vertex need their own color, χ′(G) ≥ Δ(G); moreover, χ′(G) = Δ(G) if G is bipartite. In general, the relationship is even stronger than what Brooks's theorem gives for vertex coloring:
Vizing's Theorem: A graph of maximal degree Δ has edge chromatic number Δ or Δ + 1.
Other properties
A graph has a k-coloring if and only if it has an acyclic orientation for which the longest path has length at most k; this is the Gallai–Hasse–Roy–Vitaver theorem (Nešetřil & Ossona de Mendez 2012).
For planar graphs, vertex colorings are essentially dual to nowhere-zero flows.
About infinite graphs, much less is known. The following is one of the few results about infinite graph coloring:
If all finite subgraphs of an infinite graph G are k-colorable, then so is G, under the assumption of the axiom of choice (de Bruijn & Erdős 1951).
Also, if a graph admits a full n-coloring for every n ≥ n0, it admits an infinite full coloring (Fawcett 1978).
Open problems
The chromatic number of the plane, where two points are adjacent if they have unit distance, is unknown, although it is one of 4, 5, 6, or 7. Other open problems concerning the chromatic number of graphs include the Hadwiger conjecture stating that every graph with chromatic number k has a complete graph on k vertices as a minor, the Erdős–Faber–Lovász conjecture bounding the chromatic number of unions of complete graphs that have at most one vertex in common to each pair, and the Albertson conjecture that among k-chromatic graphs the complete graphs are the ones with smallest crossing number.
When Birkhoff and Lewis introduced the chromatic polynomial in their attack on the four-color theorem, they conjectured that for planar graphs G, the polynomial P(G, t) has no zeros in the region [4, ∞). Although it is known that such a chromatic polynomial has no zeros in the region [5, ∞) and that P(G, 4) ≠ 0, their conjecture is still unresolved. It also remains an unsolved problem to characterize graphs which have the same chromatic polynomial and to determine which polynomials are chromatic.
Algorithms
Decision
  Name: Graph coloring
  Input: Graph G, number of colors k
  Output: Does G admit a proper vertex coloring with k colors?
  Running time: O(2^n n)[5]
  Complexity: NP-complete
  Reduction from: 3-Satisfiability
  Garey–Johnson: GT4
Optimisation
  Name: Chromatic number
  Input: Graph G
  Output: χ(G)
  Complexity: NP-hard
Counting problem
  Name: Chromatic polynomial
  Input: Graph G, number of colors k
  Output: The number of proper k-colorings of G
  Running time: O(2^n n)
  Complexity: #P-complete
Polynomial time
Determining if a graph can be colored with 2 colors is equivalent to determining whether or not the graph is bipartite,
and thus computable in linear time using breadth-first search. More generally, the chromatic number and a
corresponding coloring of perfect graphs can be computed in polynomial time using semidefinite programming.
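The bipartite case can be sketched directly: a breadth-first search that alternates two colors along every edge either produces a proper 2-coloring or runs into an odd cycle. The graph representation below is an illustrative choice, not from the source.

```python
from collections import deque

def two_color(adj):
    """BFS 2-coloring: return a proper 2-coloring dict if the graph
    (given as {vertex: set(neighbors)}) is bipartite, else None."""
    color = {}
    for s in adj:
        if s in color:
            continue
        color[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # alternate colors along edges
                    q.append(v)
                elif color[v] == color[u]:
                    return None  # odd cycle found: not bipartite
    return color

even_cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}  # bipartite
odd_cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}   # not
```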
Closed formulas for the chromatic polynomial are known for many classes of graphs, such as forests, chordal graphs,
cycles, wheels, and ladders, so these can be evaluated in polynomial time.
If the graph is planar and has low branchwidth (or is nonplanar but with a known branch decomposition), then it can
be solved in polynomial time using dynamic programming. In general, the time required is polynomial in the graph
size, but exponential in the branchwidth.
Exact algorithms
Brute-force search for a k-coloring considers each of the k^n assignments of k colors to n vertices and checks for each if it is legal. To compute the chromatic number and the chromatic polynomial, this procedure is used for every k = 1, …, n − 1, impractical for all but the smallest input graphs.
Using dynamic programming and a bound on the number of maximal independent sets, k-colorability can be decided in time and space O(2.4423^n).[6] Using the principle of inclusion–exclusion and Yates's algorithm for the fast zeta transform, k-colorability can be decided in time O(2^n n)[5] for any k. Faster algorithms are known for 3- and 4-colorability, which can be decided in time O(1.3289^n)[7] and O(1.7272^n),[8] respectively.
Contraction
The contraction G/uv of a graph G is the graph obtained by identifying the vertices u and v, removing any edges between them, and replacing them with a single vertex w where any edges that were incident on u or v are redirected to w. This operation plays a major role in the analysis of graph coloring.
The chromatic number satisfies the recurrence relation
χ(G) = min{χ(G + uv), χ(G/uv)}
for non-adjacent vertices u and v, where G + uv is the graph with the edge uv added. Several algorithms are based on evaluating this recurrence; the resulting computation tree is sometimes called a Zykov tree. The running time is based on the heuristic for choosing the vertices u and v.
The chromatic polynomial satisfies the following recurrence relation
P(G − uv, k) = P(G/uv, k) + P(G, k)
for adjacent vertices u and v, where G − uv is the graph with the edge uv removed. P(G − uv, k) represents the number of possible proper colorings of the graph when the vertices may have the same or different colors. The number of proper colorings therefore comes from the sum over two graphs: if the vertices u and v have different colors, then we may as well consider a graph where u and v are adjacent; if u and v have the same colors, we may as well consider a graph where u and v are contracted. Tutte's curiosity about which other graph properties satisfied this recurrence led him to discover a bivariate generalization of the chromatic polynomial, the Tutte polynomial.
The expressions give rise to a recursive procedure, called the deletion–contraction algorithm, which forms the basis of many algorithms for graph coloring. The running time satisfies the same recurrence relation as the Fibonacci numbers, so in the worst case, the algorithm runs in time within a polynomial factor of ((1 + √5)/2)^(n+m) = O(1.6180^(n+m)) for a graph with n vertices and m edges.[9] The analysis can be improved to within a polynomial factor of the number t(G) of spanning trees of the input graph.[10] In practice, branch and bound strategies and graph isomorphism rejection are employed to avoid some recursive calls; the running time depends on the heuristic used to pick the vertex pair.
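A minimal sketch of deletion–contraction (without the branch-and-bound refinements mentioned above) follows; it evaluates the recurrence in the rearranged form P(G, t) = P(G − uv, t) − P(G/uv, t) for an edge uv. The triangle-with-pendant test graph is an illustrative choice, not from the source.

```python
def chromatic_polynomial(edges, n, t):
    """Evaluate P(G, t) for a simple graph by deletion-contraction:
    P(G, t) = P(G - uv, t) - P(G / uv, t) for any edge uv,
    with base case P(edgeless graph on n vertices, t) = t**n."""
    edges = {frozenset(e) for e in edges}  # dedupe parallel edges
    if not edges:
        return t ** n
    e = min(edges, key=sorted)
    u, v = sorted(e)
    rest = edges - {e}
    # deletion: same vertex count, edge uv removed
    deleted = chromatic_polynomial(rest, n, t)
    # contraction: relabel v as u; drop any loops, merge parallel edges
    merged = {frozenset(u if w == v else w for w in f) for f in rest}
    merged = {f for f in merged if len(f) == 2}
    contracted = chromatic_polynomial(merged, n - 1, t)
    return deleted - contracted

# Triangle 0-1-2 with pendant vertex 3: P(G, t) = t(t-1)^2(t-2)
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
vals = [chromatic_polynomial(edges, 4, t) for t in (2, 3, 4)]
```

The recursion branches on every edge, which is where the Fibonacci-like worst-case growth in n + m comes from.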
Greedy coloring
The greedy algorithm considers the vertices in a specific order v1, …, vn and assigns to vi the smallest available color not used by vi's neighbours among v1, …, vi−1, adding a fresh color if needed. The quality of the resulting coloring depends on the chosen ordering. There exists an ordering that leads to a greedy coloring with the optimal number of χ(G) colors. On the other hand, greedy colorings can be arbitrarily bad; for example, the crown graph on n vertices can be 2-colored, but has an ordering that leads to a greedy coloring with n/2 colors.
If the vertices are ordered according to their degrees, the resulting greedy coloring uses at most max_i min{d(x_i) + 1, i} colors, at most one more than the graph's maximum degree. This heuristic is sometimes called the Welsh–Powell algorithm.[11] Another heuristic due to Brélaz establishes the ordering dynamically while the algorithm proceeds, choosing next the vertex adjacent to the largest number of different colors.[12] Many other graph coloring heuristics are similarly based on greedy coloring for a specific static or dynamic strategy of ordering the vertices; these algorithms are sometimes called sequential coloring algorithms.
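The order dependence is easy to demonstrate in code. The sketch below runs the greedy algorithm on the crown graph with 6 vertices (K3,3 minus a perfect matching, here with vertex i matched to i + 3); one ordering yields the optimal 2 colors, another forces n/2 = 3.

```python
def greedy_coloring(adj, order):
    """Assign each vertex, in the given order, the smallest color not
    used by its already-colored neighbours."""
    color = {}
    for v in order:
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Crown graph on 6 vertices: parts {0,1,2} and {3,4,5}, with i and
# i+3 non-adjacent and every other cross pair adjacent.
adj = {v: set() for v in range(6)}
for i in range(3):
    for j in range(3, 6):
        if j - 3 != i:
            adj[i].add(j)
            adj[j].add(i)

good = greedy_coloring(adj, [0, 1, 2, 3, 4, 5])  # one part first: 2 colors
bad = greedy_coloring(adj, [0, 3, 1, 4, 2, 5])   # alternating: 3 colors
```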
Decentralized algorithms
Decentralized algorithms are ones where no message passing is allowed (in contrast to distributed algorithms where local message passing takes place). Somewhat surprisingly, efficient decentralized algorithms exist that will color a graph if a proper coloring exists. These assume that a vertex is able to sense whether any of its neighbors are using the same color as the vertex, i.e., whether a local conflict exists. This is a mild assumption in many applications; e.g.,
in wireless channel allocation it is usually reasonable to assume that a station will be able to detect whether other
interfering transmitters are using the same channel (e.g. by measuring the SINR). This sensing information is
sufficient to allow algorithms based on learning automata to find a proper graph coloring with probability one, e.g.
see Leith (2006) and Duffy (2008).
Computational complexity
Graph coloring is computationally hard. It is NP-complete to decide if a given graph admits a k-coloring for a given k except for the cases k = 1 and k = 2. In particular, it is NP-hard to compute the chromatic number. The 3-coloring problem remains NP-complete even on planar graphs of degree 4.[19]
The best known approximation algorithm computes a coloring of size at most within a factor O(n(log n)⁻³(log log n)²) of the chromatic number.[20] For all ε > 0, approximating the chromatic number within n^(1−ε) is NP-hard.[21]
It is also NP-hard to color a 3-colorable graph with 4 colors[22] and a k-colorable graph with k^((log k)/25) colors for sufficiently large constant k.[23]
Computing the coefficients of the chromatic polynomial is #P-hard. In fact, even computing the value of P(G, k) is #P-hard at any rational point k except for k = 1 and k = 2.[24] There is no FPRAS for evaluating the chromatic polynomial at any rational point k ≥ 1.5 except for k = 2 unless NP = RP.[25]
For edge coloring, the proof of Vizing's result gives an algorithm that uses at most Δ + 1 colors. However, deciding between the two candidate values for the edge chromatic number is NP-complete.[26] In terms of approximation algorithms, Vizing's algorithm shows that the edge chromatic number can be approximated within 4/3, and the hardness result shows that no (4/3 − ε)-algorithm exists for any ε > 0 unless P = NP. These are among the oldest results in the literature of approximation algorithms, even though neither paper makes explicit use of that notion.[27]
Applications
Scheduling
Vertex coloring models a number of scheduling problems.[28] In the cleanest form, a given set of jobs need to be assigned to time slots, each job requiring one such slot. Jobs can be scheduled in any order, but pairs of jobs may be
in conflict in the sense that they may not be assigned to the same time slot, for example because they both rely on a
shared resource. The corresponding graph contains a vertex for every job and an edge for every conflicting pair of
jobs. The chromatic number of the graph is exactly the minimum makespan, the optimal time to finish all jobs
without conflicts.
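The reduction described above can be sketched directly: build the conflict graph and color it, with colors read off as time slots. The job names and conflict pairs below are hypothetical, and the greedy coloring gives an upper bound on the minimum makespan rather than the makespan itself.

```python
def schedule(jobs, conflicts):
    """Assign each job a time slot (a color of the conflict graph) so
    that conflicting jobs get different slots. Greedy coloring, so the
    slot count only upper-bounds the chromatic number / makespan."""
    adj = {j: set() for j in jobs}
    for a, b in conflicts:
        adj[a].add(b)
        adj[b].add(a)
    slot = {}
    for j in jobs:
        used = {slot[k] for k in adj[j] if k in slot}
        s = 0
        while s in used:
            s += 1
        slot[j] = s
    return slot

jobs = ['backup', 'reindex', 'report', 'cleanup']
# hypothetical conflict pairs: jobs sharing a resource
conflicts = [('backup', 'reindex'), ('backup', 'cleanup'),
             ('reindex', 'report')]
slots = schedule(jobs, conflicts)  # here the conflict graph is a path,
                                   # so 2 slots suffice
```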
Details of the scheduling problem define the structure of the graph. For example, when assigning aircraft to flights,
the resulting conflict graph is an interval graph, so the coloring problem can be solved efficiently. In bandwidth
allocation to radio stations, the resulting conflict graph is a unit disk graph, so the coloring problem is
3-approximable.
Register allocation
A compiler is a computer program that translates one computer language into another. To improve the execution
time of the resulting code, one of the techniques of compiler optimization is register allocation, where the most
frequently used values of the compiled program are kept in the fast processor registers. Ideally, values are assigned
to registers so that they can all reside in the registers when they are used.
The textbook approach to this problem is to model it as a graph coloring problem.[29] The compiler constructs an
interference graph, where vertices are symbolic registers and an edge connects two nodes if they are needed at the
same time. If the graph can be colored with k colors then the variables can be stored in k registers.
Other applications
The problem of coloring a graph has found a number of applications, including pattern matching.
The recreational puzzle Sudoku can be seen as completing a 9-coloring on a given specific graph with 81 vertices.
Other colorings
Ramsey theory
An important class of improper coloring problems is studied in Ramsey theory, where the graph's edges are assigned to colors, and there is no restriction on the colors of incident edges. A simple example is the friendship theorem, which says that in any coloring of the edges of K6, the complete graph of six vertices, there will be a monochromatic triangle; this is often illustrated by saying that any group of six people either has three mutual strangers or three mutual acquaintances. Ramsey theory is concerned with generalisations of this idea to seek regularity amid disorder, finding general conditions for the existence of monochromatic subgraphs with given structure.
List coloring: Each vertex chooses from a list of colors.
List edge-coloring: Each edge chooses from a list of colors.
Total coloring: Vertices and edges are colored.
Harmonious coloring: Every pair of colors appears on at most one edge.
Complete coloring: Every pair of colors appears on at least one edge.
Exact coloring: Every pair of colors appears on exactly one edge.
Acyclic coloring: Every 2-chromatic subgraph is acyclic.
Star coloring: Every 2-chromatic subgraph is a disjoint collection of stars.
Strong coloring: Every color appears in every partition of equal size exactly once.
Rank coloring: If two vertices have the same color i, then every path between them contains a vertex with color greater than i.
Interval edge-coloring: A color of edges meeting in a common vertex must be contiguous.
Circular coloring: Motivated by task systems in which production proceeds in a cyclic way.
Path coloring: Models a routing problem in graphs.
Fractional coloring: Vertices may have multiple colors, and on each edge the sum of the color parts of each vertex is not greater than one.
Oriented coloring: Takes into account orientation of edges of the graph.
Cocoloring: An improper vertex coloring where every color class induces an independent set or a clique.
Subcoloring: An improper vertex coloring where every color class induces a union of cliques.
Strong edge coloring: Edges are colored such that each color class induces a matching (equivalent to coloring the square of the line graph).
Equitable coloring: The sizes of color classes differ by at most one.
T-coloring: Distance between two colors of adjacent vertices must not belong to fixed set T.
Defective coloring: An improper vertex coloring where every color class induces a bounded degree subgraph.
Weak coloring: An improper vertex coloring where every non-isolated node has at least one neighbor with a different color.
Sum-coloring: The criterion of minimization is the sum of colors.
Centered coloring: Every connected induced subgraph has a color that is used exactly once.
Coloring can also be considered for signed graphs and gain graphs.
Notes
[1] M. Kubale, History of graph coloring, in Kubale (2004)
[2] van Lint & Wilson (2001, Chap. 33)
[3] Jensen & Toft (1995), p. 2
[4] Brooks (1941)
[5] Björklund, Husfeldt & Koivisto (2009)
[6] Lawler (1976)
[7] Beigel & Eppstein (2005)
[8] Byskov (2004)
[9] Wilf (1986)
[10] Sekine, Imai & Tani (1995)
[11] Welsh & Powell (1967)
[12] Brélaz (1979)
[13] Schneider (2010)
[14] Cole & Vishkin (1986), see also Cormen, Leiserson & Rivest (1990, Section 30.5)
[15] Goldberg, Plotkin & Shannon (1988)
[16] Schneider (2008)
[17] Barenboim & Elkin (2009); Kuhn (2009)
[18] Panconesi (1995)
[19] Dailey (1980)
[20] Halldórsson (1993)
[21] Zuckerman (2007)
[22] Guruswami & Khanna (2000)
[23] Khot (2001)
[24] Jaeger, Vertigan & Welsh (1990)
[25] Goldberg & Jerrum (2008)
[26] Holyer (1981)
[27] Crescenzi & Kann (1998)
[28] Marx (2004)
[29] Chaitin (1982)
References
Barenboim, L.; Elkin, M. (2009), "Distributed (Δ+1)-coloring in linear (in Δ) time", Proceedings of the 41st Symposium on Theory of Computing, pp. 111–120, doi:10.1145/1536414.1536432, ISBN 978-1-60558-506-2
Panconesi, A.; Srinivasan, A. (1996), "On the complexity of distributed network decomposition", Journal of Algorithms, 20
Schneider, J. (2010), "A new technique for distributed symmetry breaking" (http://www.dcg.ethz.ch/publications/podcfp107_schneider_188.pdf), Proceedings of the Symposium on Principles of Distributed Computing
Schneider, J. (2008), "A log-star distributed maximal independent set algorithm for growth-bounded graphs" (http://www.dcg.ethz.ch/publications/podc08SW.pdf), Proceedings of the Symposium on Principles of Distributed Computing
Beigel, R.; Eppstein, D. (2005), "3-coloring in time O(1.3289^n)", Journal of Algorithms 54 (2): 168–204, doi:10.1016/j.jalgor.2004.06.008
Björklund, A.; Husfeldt, T.; Koivisto, M. (2009), "Set partitioning via inclusion–exclusion", SIAM Journal on Computing 39 (2): 546–563, doi:10.1137/070683933
Brélaz, D. (1979), "New methods to color the vertices of a graph", Communications of the ACM 22 (4): 251–256, doi:10.1145/359094.359101
Brooks, R. L.; Tutte, W. T. (1941), "On colouring the nodes of a network", Proceedings of the Cambridge Philosophical Society 37 (2): 194–197, doi:10.1017/S030500410002168X
de Bruijn, N. G.; Erdős, P. (1951), "A colour problem for infinite graphs and a problem in the theory of relations" (http://www.math-inst.hu/~p_erdos/1951-01.pdf), Nederl. Akad. Wetensch. Proc. Ser. A 54: 371–373 (= Indag. Math. 13)
Byskov, J. M. (2004), "Enumerating maximal independent sets with applications to graph colouring", Operations Research Letters 32 (6): 547–556, doi:10.1016/j.orl.2004.03.002
Chaitin, G. J. (1982), "Register allocation & spilling via graph colouring", Proc. 1982 SIGPLAN Symposium on Compiler Construction, pp. 98–105, doi:10.1145/800230.806984, ISBN 0-89791-074-5
Cole, R.; Vishkin, U. (1986), "Deterministic coin tossing with applications to optimal parallel list ranking", Information and Control 70 (1): 32–53, doi:10.1016/S0019-9958(86)80023-7
Cormen, T. H.; Leiserson, C. E.; Rivest, R. L. (1990), Introduction to Algorithms (1st ed.), The MIT Press
Dailey, D. P. (1980), "Uniqueness of colorability and colorability of planar 4-regular graphs are NP-complete", Discrete Mathematics 30 (3): 289–293, doi:10.1016/0012-365X(80)90236-8
Duffy, K.; O'Connell, N.; Sapozhnikov, A. (2008), "Complexity analysis of a decentralised graph colouring algorithm" (http://www.hamilton.ie/ken_duffy/Downloads/cfl.pdf), Information Processing Letters 107 (2): 60–63, doi:10.1016/j.ipl.2008.01.002
Fawcett, B. W. (1978), "On infinite full colourings of graphs", Can. J. Math. XXX: 455–457
Garey, M. R.; Johnson, D. S. (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, ISBN 0-7167-1045-5
Garey, M. R.; Johnson, D. S.; Stockmeyer, L. (1974), "Some simplified NP-complete problems" (http://portal.acm.org/citation.cfm?id=803884), Proceedings of the Sixth Annual ACM Symposium on Theory of Computing, pp. 47–63, doi:10.1145/800119.803884
Goldberg, L. A.; Jerrum, M. (July 2008), "Inapproximability of the Tutte polynomial", Information and Computation 206 (7): 908–929, doi:10.1016/j.ic.2008.04.003
Goldberg, A. V.; Plotkin, S. A.; Shannon, G. E. (1988), "Parallel symmetry-breaking in sparse graphs", SIAM Journal on Discrete Mathematics 1 (4): 434–446, doi:10.1137/0401044
Guruswami, V.; Khanna, S. (2000), "On the hardness of 4-coloring a 3-colorable graph", Proceedings of the 15th Annual IEEE Conference on Computational Complexity, pp. 188–197, doi:10.1109/CCC.2000.856749, ISBN 0-7695-0674-7
Halldórsson, M. M. (1993), "A still better performance guarantee for approximate graph coloring", Information Processing Letters 45: 19–23, doi:10.1016/0020-0190(93)90246-6
Holyer, I. (1981), "The NP-completeness of edge-coloring", SIAM Journal on Computing 10 (4): 718–720, doi:10.1137/0210055
Crescenzi, P.; Kann, V. (December 1998), "How to find the best approximation results – a follow-up to Garey and Johnson", ACM SIGACT News 29 (4): 90, doi:10.1145/306198.306210
Jaeger, F.; Vertigan, D. L.; Welsh, D. J. A. (1990), "On the computational complexity of the Jones and Tutte polynomials", Mathematical Proceedings of the Cambridge Philosophical Society 108: 35–53, doi:10.1017/S0305004100068936
Jensen, T. R.; Toft, B. (1995), Graph Coloring Problems, Wiley-Interscience, New York, ISBN 0-471-02865-7
Khot, S. (2001), "Improved inapproximability results for MaxClique, chromatic number and approximate graph coloring", Proc. 42nd Annual Symposium on Foundations of Computer Science, pp. 600–609, doi:10.1109/SFCS.2001.959936, ISBN 0-7695-1116-3
Kubale, M. (2004), Graph Colorings, American Mathematical Society, ISBN 0-8218-3458-4
Kuhn, F. (2009), "Weak graph colorings: distributed algorithms and applications", Proceedings of the 21st Symposium on Parallelism in Algorithms and Architectures, pp. 138–144, doi:10.1145/1583991.1584032, ISBN 978-1-60558-606-9
Lawler, E. L. (1976), "A note on the complexity of the chromatic number problem", Information Processing Letters 5 (3): 66–67, doi:10.1016/0020-0190(76)90065-X
Leith, D. J.; Clifford, P. (2006), "A Self-Managed Distributed Channel Selection Algorithm for WLAN" (http://www.hamilton.ie/peterc/downloads/rawnet06.pdf), Proc. RAWNET 2006, Boston, MA
Linial, N. (1992), "Locality in distributed graph algorithms", SIAM Journal on Computing 21 (1): 193–201, doi:10.1137/0221015
van Lint, J. H.; Wilson, R. M. (2001), A Course in Combinatorics (2nd ed.), Cambridge University Press, ISBN 0-521-80340-3
Marx, Dániel (2004), "Graph colouring problems and their applications in scheduling", Periodica Polytechnica, Electrical Engineering, 48, pp. 11–16, CiteSeerX: 10.1.1.95.4268 (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.95.4268)
Mycielski, J. (1955), "Sur le coloriage des graphes" (http://matwbn.icm.edu.pl/ksiazki/cm/cm3/cm3119.pdf), Colloq. Math. 3: 161–162
Nešetřil, Jaroslav; Ossona de Mendez, Patrice (2012), "Theorem 3.13", Sparsity: Graphs, Structures, and Algorithms, Algorithms and Combinatorics, 28, Heidelberg: Springer, p. 42, doi:10.1007/978-3-642-27875-4, ISBN 978-3-642-27874-7, MR 2920058
Panconesi, Alessandro; Rizzi, Romeo (2001), "Some simple distributed algorithms for sparse networks", Distributed Computing (Berlin, New York: Springer-Verlag) 14 (2): 97–100, doi:10.1007/PL00008932, ISSN 0178-2770
Sekine, K.; Imai, H.; Tani, S. (1995), "Computing the Tutte polynomial of a graph of moderate size", Proc. 6th International Symposium on Algorithms and Computation (ISAAC 1995), Lecture Notes in Computer Science, 1004, Springer, pp. 224–233, doi:10.1007/BFb0015427, ISBN 3-540-60573-8
Welsh, D. J. A.; Powell, M. B. (1967), "An upper bound for the chromatic number of a graph and its application to timetabling problems", The Computer Journal 10 (1): 85–86, doi:10.1093/comjnl/10.1.85
West, D. B. (1996), Introduction to Graph Theory, Prentice-Hall, ISBN 0-13-227828-6
Wilf, H. S. (1986), Algorithms and Complexity, Prentice-Hall
Zuckerman, D. (2007), "Linear degree extractors and the inapproximability of Max Clique and Chromatic Number", Theory of Computing 3: 103–128, doi:10.4086/toc.2007.v003a006
Zykov, A. A. (1949), "On some properties of linear complexes" (http://mi.mathnet.ru/eng/msb5974) (in Russian), Math. Sbornik. 24(66) (2): 163–188
External links
Graph Coloring Page (http://www.cs.ualberta.ca/~joe/Coloring/index.html) by Joseph Culberson (graph
coloring programs)
CoLoRaTiOn (http://vispo.com/software) by Jim Andrews and Mike Fellows is a graph coloring puzzle
Links to Graph Coloring source codes (http://www.adaptivebox.net/research/bookmark/gcpcodes_link.html)
Code for efficiently computing Tutte, Chromatic and Flow Polynomials (http://www.mcs.vuw.ac.nz/~djp/
tutte/) by Gary Haggard, David J. Pearce and Gordon Royle
Graph Coloring Web Application (http://graph-coloring.appspot.com/)
Bipartite graph
In the mathematical field of graph theory, a bipartite graph (or
bigraph) is a graph whose vertices can be divided into two disjoint
sets U and V such that every edge connects a vertex in U to one in V;
that is, U and V are each independent sets. One often writes
G = (U, V, E) to denote a bipartite graph with parts U and V and edge
set E. If a bipartite graph is not connected, it may have more than
one bipartition, and in that case the (U, V, E) notation is helpful in
specifying one particular bipartition, that is, one particular choice
of the two parts.[3]
Examples
When modelling relations between two different classes of objects, bipartite graphs very often arise naturally. For
instance, a graph of football players and clubs, with an edge between a player and a club if the player has played for
that club, is a natural example of an affiliation network, a type of bipartite graph used in social network analysis.[6]
Another example where bipartite graphs appear naturally is in the (NP-complete) railway optimization problem, in
which the input is a schedule of trains and their stops, and the goal is to find as small a set of train stations as
possible such that every train visits at least one of the chosen stations. This problem can be modeled as a dominating
set problem in a bipartite graph that has a vertex for each train and each station and an edge for each pair of a station
and a train that stops at that station.[7]
More abstract examples include the following:
Every tree is bipartite.[4]
Cycle graphs with an even number of vertices are bipartite.[4]
Every planar graph whose faces all have even length is bipartite.[8] Special cases of this are grid graphs and
squaregraphs, in which every inner face consists of 4 edges and every inner vertex has four or more neighbors.[9]
The complete bipartite graph on n and m vertices, denoted by Kn,m, is the bipartite graph G = (V, U, E), where V
and U are disjoint sets of size n and m, respectively, and E connects every vertex in V with all vertices in U. It
follows that Kn,m has nm edges.[10] Closely related to the complete bipartite graphs are the crown graphs, formed
from complete bipartite graphs by removing the edges of a perfect matching.[11]
Hypercube graphs, partial cubes, and median graphs are bipartite. In these graphs, the vertices may be labeled by
bitvectors, in such a way that two vertices are adjacent if and only if the corresponding bitvectors differ in a single
position. A bipartition may be formed by separating the vertices whose bitvectors have an even number of ones
from the vertices with an odd number of ones. Trees and squaregraphs form examples of median graphs, and
every median graph is a partial cube.[12]
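The parity-of-ones bipartition described above is easy to make concrete. The following sketch (function name and encoding are my own, not from the text) splits the vertices of the d-dimensional hypercube graph, represented as the integers 0 to 2^d − 1, by the parity of their bitvectors:

```python
def hypercube_bipartition(d):
    """Bipartition of the d-dimensional hypercube graph by parity of ones.

    Vertices are the 2**d bitvectors; two vertices are adjacent iff they
    differ in exactly one bit, so flipping a bit changes the parity and
    every edge joins the two parts.
    """
    even = [v for v in range(2 ** d) if bin(v).count("1") % 2 == 0]
    odd = [v for v in range(2 ** d) if bin(v).count("1") % 2 == 1]
    return even, odd

even, odd = hypercube_bipartition(3)
# Every edge (v, v with one bit flipped) crosses the bipartition.
```

The same labeling argument applies verbatim to partial cubes and median graphs, since they inherit their bitvector labels from a hypercube.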
Properties
Characterization
Bipartite graphs may be characterized in several different ways:
A graph is bipartite if and only if it does not contain an odd cycle.[13]
A graph is bipartite if and only if it is 2-colorable (i.e., its chromatic number is less than or equal to 2).[3]
The spectrum of a graph is symmetric if and only if it is a bipartite graph.[14]
The biadjacency matrix of a bipartite graph (U, V, E) is a (0,1)-matrix of size |U| × |V| that has a one for each
pair of adjacent vertices and a zero for nonadjacent vertices.[20] Biadjacency matrices give a correspondence
between bipartite graphs and hypergraphs: one side of the bipartition represents the vertices of the hypergraph,
the other side its hyperedges, and a vertex v is connected to a hypergraph edge e exactly when v is one of
the endpoints of e. Under this correspondence, the biadjacency matrices of bipartite graphs are exactly the
incidence matrices of the corresponding hypergraphs. As a special case of this correspondence between bipartite
graphs and hypergraphs, any multigraph (a graph in which there may be two or more edges between the same two
vertices) may be interpreted as a hypergraph in which some hyperedges have equal sets of endpoints, and represented
by a bipartite graph that does not have multiple adjacencies and in which the vertices on one side of the bipartition
all have degree two.[21]
A similar reinterpretation of adjacency matrices may be used to show a one-to-one correspondence between directed
graphs (on a given number of labeled vertices, allowing self-loops) and balanced bipartite graphs, with the same
number of vertices on both sides of the bipartition. For, the adjacency matrix of a directed graph with n vertices can
be any (0,1)-matrix of size n × n, which can then be reinterpreted as the biadjacency matrix of a bipartite graph
with n vertices on each side of its bipartition.
Algorithms
Testing bipartiteness
It is possible to test whether a graph is bipartite, and to return either a two-coloring (if it is bipartite) or an odd cycle
(if it is not) in linear time, using depth-first search. The main idea is to assign to each vertex the color that differs
from the color of its parent in the depth-first search tree, assigning colors in a preorder traversal of the
depth-first-search tree. This will necessarily provide a two-coloring of the spanning tree consisting of the edges
connecting vertices to their parents, but it may not properly color some of the non-tree edges. In a depth-first search
tree, one of the two endpoints of every non-tree edge is an ancestor of the other endpoint, and when the depth first
search discovers an edge of this type it should check that these two vertices have different colors. If they do not, then
the path in the tree from ancestor to descendant, together with the miscolored edge, form an odd cycle, which is
returned from the algorithm together with the result that the graph is not bipartite. However, if the algorithm
terminates without detecting an odd cycle of this type, then every edge must be properly colored, and the algorithm
returns the coloring together with the result that the graph is bipartite.[23]
Alternatively, a similar procedure may be used with breadth-first search in place of depth-first search. Again, each
node is given the opposite color to its parent in the search tree, in breadth-first order. If, when a vertex is colored,
there exists an edge connecting it to a previously-colored vertex with the same color, then this edge together with the
paths in the breadth-first search tree connecting its two endpoints to their lowest common ancestor forms an odd
cycle. If the algorithm terminates without finding an odd cycle in this way, then it must have found a proper
coloring, and can safely conclude that the graph is bipartite.[24]
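The breadth-first procedure just described can be sketched in a few lines. The following is an illustrative implementation (function and variable names are my own, not from the text); on a conflict it walks the two endpoints up the search tree to a common ancestor, so the two tree paths plus the conflicting edge form the returned odd cycle:

```python
from collections import deque

def two_color(adj):
    """Try to 2-color an undirected graph given as {vertex: neighbor list}.

    Returns ("bipartite", color) with color[v] in {0, 1}, or
    ("odd_cycle", cycle) listing the vertices of an odd cycle.
    """
    color, parent = {}, {}
    for start in adj:                      # handle disconnected graphs
        if start in color:
            continue
        color[start] = 0
        parent[start] = None
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    parent[v] = u
                    queue.append(v)
                elif color[v] == color[u]:
                    # Conflict: combine the tree paths from u and v to
                    # their common ancestor with the edge (u, v).
                    path_u, x = [], u
                    while x is not None:
                        path_u.append(x)
                        x = parent[x]
                    seen = set(path_u)
                    path_v, x = [], v
                    while x not in seen:
                        path_v.append(x)
                        x = parent[x]
                    cycle = path_u[:path_u.index(x) + 1] + path_v[::-1]
                    return ("odd_cycle", cycle)
    return ("bipartite", color)
```

Both endpoints of the conflicting edge carry the same color, so the two tree paths have lengths of equal parity and the closed walk has odd length, as the text argues.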
For the intersection graphs of n line segments or other simple shapes in the Euclidean plane, it is possible to test
whether the graph is bipartite and return either a two-coloring or an odd cycle in time O(n log n), even though the
graph itself may have many more edges.[25]
Odd cycle transversal
Odd cycle transversal is an NP-complete algorithmic problem that asks, given a graph G = (V, E) and a number k,
whether there exists a set of k vertices whose removal from G would cause the resulting graph to be bipartite.[26]
The name odd cycle transversal comes from the fact that a graph is
bipartite if and only if it has no odd cycles. Hence, to delete vertices
from a graph in order to obtain a bipartite graph, one needs to "hit all
odd cycles", or find a so-called odd cycle transversal set. In the
illustration, one can observe that every odd cycle in the graph contains
the blue (the bottommost) vertices, hence removing those vertices kills all odd cycles and leaves a bipartite graph.
Matching
A matching in a graph is a subset of its edges, no two of which share an endpoint. Polynomial time algorithms are
known for many algorithmic problems on matchings, including maximum matching (finding a matching that uses as
many edges as possible), maximum weight matching, and stable marriage.[28] In many cases, matching problems are
simpler to solve on bipartite graphs than on non-bipartite graphs,[29] and many matching algorithms such as the
HopcroftKarp algorithm for maximum cardinality matching[30] work correctly only on bipartite inputs.
As a simple example, suppose that a set P of people are all seeking jobs from among a set J of jobs, with not all
people suitable for all jobs. This situation can be modeled as a bipartite graph (P, J, E) in which an edge connects
each job-seeker with each suitable job; a matching in this graph pairs people with jobs they can perform, without
assigning any person or any job more than once.[31]
Additional applications
Bipartite graphs are extensively used in modern coding theory, especially to decode codewords received from the
channel. Factor graphs and Tanner graphs are examples of this. A Tanner graph is a bipartite graph in which the
vertices on one side of the bipartition represent digits of a codeword, and the vertices on the other side represent
combinations of digits that are expected to sum to zero in a codeword without errors.[34] A factor graph is a closely
related belief network used for probabilistic decoding of LDPC and turbo codes.[35]
In computer science, a Petri net is a mathematical modeling tool used in analysis and simulations of concurrent
systems. A system is modeled as a bipartite directed graph with two sets of nodes: A set of "place" nodes that contain
resources, and a set of "event" nodes which generate and/or consume resources. There are additional constraints on
the nodes and edges that constrain the behavior of the system. Petri nets utilize the properties of bipartite directed
graphs and other properties to allow mathematical proofs of the behavior of systems while also allowing easy
implementation of simulations of the system.[36]
In projective geometry, Levi graphs are a form of bipartite graph used to model the incidences between points and
lines in a configuration. Corresponding to the geometric property that every two lines meet in at most one point
and every two points are connected by at most one line, Levi graphs cannot contain any cycles of length four, so
their girth must be six or more.[37]
References
[1] Diestel, Reinhard (2005). Graph Theory, Grad. Texts in Math (http://diestel-graph-theory.com/). Springer. ISBN 978-3-642-14278-9.
[2] Asratian, Armen S.; Denley, Tristan M. J.; Häggkvist, Roland (1998), Bipartite Graphs and their Applications, Cambridge Tracts in Mathematics, 131, Cambridge University Press, ISBN 9780521593458.
[3] Asratian, Denley & Häggkvist (1998), p. 7.
[4] Scheinerman, Edward A. (2012), Mathematics: A Discrete Introduction (http://books.google.com/books?id=DZBHGD2sEYwC&pg=PA363) (3rd ed.), Cengage Learning, p. 363, ISBN 9780840049421.
[5] Chartrand, Gary; Zhang, Ping (2008), Chromatic Graph Theory (http://books.google.com/books?id=_l4CJq46MXwC&pg=PA223), Discrete Mathematics And Its Applications, 53, CRC Press, p. 223, ISBN 9781584888000.
[6] Wasserman, Stanley; Faust, Katherine (1994), Social Network Analysis: Methods and Applications (http://books.google.com/books?id=CAm2DpIqRUIC&pg=PA299), Structural Analysis in the Social Sciences, 8, Cambridge University Press, pp. 299–302, ISBN 9780521387071.
[7] Niedermeier, Rolf (2006). Invitation to Fixed Parameter Algorithms (Oxford Lecture Series in Mathematics and Its Applications). Oxford. pp. 20–21. ISBN 978-0-19-856607-6.
[8] Soifer, Alexander (2008), The Mathematical Coloring Book, Springer-Verlag, pp. 136–137, ISBN 978-0-387-74640-1. This result has sometimes been called the "two color theorem"; Soifer credits it to a famous 1879 paper of Alfred Kempe containing a false proof of the four color theorem.
[9] Bandelt, H.-J.; Chepoi, V.; Eppstein, D. (2010), "Combinatorics and geometry of finite and infinite squaregraphs", SIAM Journal on Discrete Mathematics 24 (4): 1399–1440, arXiv:0905.4537, doi:10.1137/090760301.
[10] Asratian, Denley & Häggkvist (1998), p. 11.
[11] Archdeacon, D.; Debowsky, M.; Dinitz, J.; Gavlas, H. (2004), "Cycle systems in the complete bipartite graph minus a one-factor", Discrete Mathematics 284 (1–3): 37–43, doi:10.1016/j.disc.2003.11.021.
[12] Ovchinnikov, Sergei (2011), Graphs and Cubes, Universitext, Springer. See especially Chapter 5, "Partial Cubes", pp. 127–181.
[13] Asratian, Denley & Häggkvist (1998), Theorem 2.1.3, p. 8. Asratian et al. attribute this characterization to a 1916 paper by Dénes Kőnig.
[14] Biggs, Norman (1994), Algebraic Graph Theory (http://books.google.com/books?id=6TasRmIFOxQC&pg=PA53), Cambridge Mathematical Library (2nd ed.), Cambridge University Press, p. 53, ISBN 9780521458979.
[15] Kőnig, Dénes (1931). "Gráfok és mátrixok". Matematikai és Fizikai Lapok 38: 116–119.
[16] Gross, Jonathan L.; Yellen, Jay (2005), Graph Theory and Its Applications (http://books.google.com/books?id=-7Q_POGh-2cC&pg=PA568), Discrete Mathematics And Its Applications (2nd ed.), CRC Press, p. 568, ISBN 9781584885054.
[17] Chartrand, Gary; Zhang, Ping (2012), A First Course in Graph Theory (http://books.google.com/books?id=ocIr0RHyI8oC&pg=PA189), Courier Dover Publications, pp. 189–190, ISBN 9780486483689.
[18] Bollobás, Béla (1998), Modern Graph Theory (http://books.google.com/books?id=SbZKSZ-1qrwC&pg=PA165), Graduate Texts in Mathematics, 184, Springer, p. 165, ISBN 9780387984889.
[19] Chudnovsky, Maria; Robertson, Neil; Seymour, Paul; Thomas, Robin (2006), "The strong perfect graph theorem" (http://annals.princeton.edu/annals/2006/164-1/p02.xhtml), Annals of Mathematics 164 (1): 51–229, doi:10.4007/annals.2006.164.51.
[20] Asratian, Denley & Häggkvist (1998), p. 17.
[21] A. A. Sapozhenko (2001), "Hypergraph" (http://www.encyclopediaofmath.org/index.php?title=Main_Page), in Hazewinkel, Michiel, Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4.
[22] Brualdi, Richard A.; Harary, Frank; Miller, Zevi (1980), "Bigraphs versus digraphs via matrices", Journal of Graph Theory 4 (1): 51–73, doi:10.1002/jgt.3190040107, MR 558453. Brualdi et al. credit the idea for this equivalence to Dulmage, A. L.; Mendelsohn, N. S. (1958), "Coverings of bipartite graphs", Canadian Journal of Mathematics 10: 517–534, doi:10.4153/CJM-1958-052-0, MR 0097069.
[23] Sedgewick, Robert (2004), Algorithms in Java, Part 5: Graph Algorithms (3rd ed.), Addison Wesley, pp. 109–111.
[24] Kleinberg, Jon; Tardos, Éva (2006), Algorithm Design, Addison Wesley, pp. 94–97.
[25] Eppstein, David (2009), "Testing bipartiteness of geometric intersection graphs", ACM Transactions on Algorithms 5 (2): Art. 15, arXiv:cs.CG/0307023, doi:10.1145/1497290.1497291, MR 2561751.
[26] Yannakakis, Mihalis (1978), "Node- and edge-deletion NP-complete problems", Proceedings of the 10th ACM Symposium on Theory of Computing (STOC '78), pp. 253–264, doi:10.1145/800133.804355.
[27] Reed, Bruce; Smith, Kaleigh; Vetta, Adrian (2004), "Finding odd cycle transversals", Operations Research Letters 32 (4): 299–301, doi:10.1016/j.orl.2003.10.009, MR 2057781.
[28] Ahuja, Ravindra K.; Magnanti, Thomas L.; Orlin, James B. (1993), "12. Assignments and Matchings", Network Flows: Theory, Algorithms, and Applications, Prentice Hall, pp. 461–509.
[29] Ahuja, Magnanti & Orlin (1993), p. 463: "Nonbipartite matching problems are more difficult to solve because they do not reduce to standard network flow problems."
[30] Hopcroft, John E.; Karp, Richard M. (1973), "An n^{5/2} algorithm for maximum matchings in bipartite graphs", SIAM Journal on Computing 2 (4): 225–231, doi:10.1137/0202019.
[31] Ahuja, Magnanti & Orlin (1993), Application 12.1 Bipartite Personnel Assignment, pp. 463–464.
[32] Robinson, Sara (April 2003), "Are Medical Students Meeting Their (Best Possible) Match?" (http://www.siam.org/pdf/news/305.pdf), SIAM News (3): 36.
[33] Dulmage & Mendelsohn (1958).
[34] Moon, Todd K. (2005), Error Correction Coding: Mathematical Methods and Algorithms (http://books.google.com/books?id=adxb8CRx5vQC&pg=PA638), John Wiley & Sons, p. 638, ISBN 9780471648000.
[35] Moon (2005), p. 686.
[36] Cassandras, Christos G.; Lafortune, Stephane (2007), Introduction to Discrete Event Systems (http://books.google.com/books?id=AxguNHDtO7MC&pg=PA224) (2nd ed.), Springer, p. 224, ISBN 9780387333328.
[37] Grünbaum, Branko (2009), Configurations of Points and Lines (http://books.google.com/books?id=mRw571GNa5UC&pg=PA28), Graduate Studies in Mathematics, 103, American Mathematical Society, p. 28, ISBN 9780821843086.
External links
Information System on Graph Class Inclusions (http://wwwteo.informatik.uni-rostock.de/isgci/index.html):
bipartite graph (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_69.html)
Weisstein, Eric W., " Bipartite Graph (http://mathworld.wolfram.com/BipartiteGraph.html)" from
MathWorld.
Greedy coloring
In the study of graph coloring problems in mathematics and computer
science, a greedy coloring is a coloring of the vertices of a graph
formed by a greedy algorithm that considers the vertices of the graph
in sequence and assigns each vertex its first available color. Greedy
colorings do not in general use the minimum number of colors
possible; however they have been used in mathematics as a technique
for proving other results about colorings and in computer science as a
heuristic to find colorings with few colors.
A crown graph (a complete bipartite graph Kn,n, with the edges of a
perfect matching removed) is a particularly bad case for greedy
coloring: if the vertex ordering places two vertices consecutively whenever they belong to one of the pairs of the
removed matching, then a greedy coloring will use n colors, while the optimal number of colors for this graph is two.
There also exist graphs such that with high probability a randomly chosen vertex ordering leads to a number of
colors much larger than the minimum.[1] Therefore, it is of some importance in greedy coloring to choose the vertex
ordering carefully.
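The greedy algorithm and the crown-graph bad case described above can be sketched as follows (function names and the graph encoding are my own, not from the text):

```python
def greedy_coloring(adj, order):
    """Scan the vertices in `order` and give each the smallest color
    not already used by one of its colored neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

def crown_graph(n):
    """Complete bipartite graph K_{n,n} minus a perfect matching:
    ('a', i) and ('b', j) are adjacent exactly when i != j."""
    adj = {}
    for i in range(n):
        adj[('a', i)] = [('b', j) for j in range(n) if j != i]
        adj[('b', i)] = [('a', j) for j in range(n) if j != i]
    return adj

adj = crown_graph(4)
# Bad order: visit the two vertices of each removed matching pair
# consecutively; greedy is forced to open a new color for each pair.
bad = [v for i in range(4) for v in (('a', i), ('b', i))]
# Good order: one side first, then the other; two colors suffice.
good = [('a', i) for i in range(4)] + [('b', i) for i in range(4)]
```

On the bad ordering the sketch uses n colors on the crown graph with 2n vertices, while the good ordering recovers the optimal two-coloring, matching the behavior described in the text.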
It is NP-complete to determine, for a given graph G and number k, whether some ordering of the vertices of G forces
the greedy algorithm to use k or more colors. In particular, this means that it is difficult to find the worst ordering for
G.[2]
Optimal ordering
The vertices of any graph may always be ordered in such a way that the greedy algorithm produces an optimal
coloring. For, given any optimal coloring in which the smallest color set is maximal, the second color set is maximal
with respect to the first color set, etc., one may order the vertices by their colors. Then when one uses a greedy
algorithm with this order, the resulting coloring is automatically optimal. More strongly, perfectly orderable graphs
(which include chordal graphs, comparability graphs, and distance-hereditary graphs) have an ordering that is
optimal both for the graph itself and for all of its induced subgraphs.[3] However, finding an optimal ordering for an
arbitrary graph is NP-hard (because it could be used to solve the NP-complete graph coloring problem), and
recognizing perfectly orderable graphs is also NP-complete.[4] For this reason, heuristics have been used that attempt
to reduce the number of colors while not guaranteeing an optimal number of colors.
Notes
[1] Kučera (1991).
[2] Zaker (2006).
[3] Chvátal (1984).
[4] Middendorf & Pfeiffer (1990).
[5] Welsh & Powell (1967); Johnson (1979); Sysło (1989); Maffray (2003).
[6] Lovász (1975).
[7] Lovász, Saks & Trotter (1989); Vishwanathan (1990).
[8] Kierstead & Trotter (1981).
[9] Irani (1994).
References
Chvátal, Václav (1984), "Perfectly orderable graphs", in Berge, Claude; Chvátal, Václav, Topics in Perfect Graphs, Annals of Discrete Mathematics, 21, Amsterdam: North-Holland, pp. 63–68. As cited by Maffray (2003).
Irani, Sandy (1994), "Coloring inductive graphs on-line", Algorithmica 11 (1): 53–72, doi:10.1007/BF01294263.
Kierstead, H. A.; Trotter, W. A. (1981), "An extremal problem in recursive combinatorics", Congress. Numer. 33: 143–153. As cited by Irani (1994).
Kučera, Luděk (1991), "The greedy coloring is a bad probabilistic algorithm", Journal of Algorithms 12 (4): 674–684, doi:10.1016/0196-6774(91)90040-6.
Johnson, D. S. (1979), "Worst case behavior of graph coloring algorithms", Proc. 5th Southeastern Conf. Combinatorics, Graph Theory and Computation, Winnipeg: Utilitas Mathematica, pp. 513–527. As cited by Maffray (2003).
Lovász, L. (1975), "Three short proofs in graph theory", Journal of Combinatorial Theory, Series B 19 (3): 269–271, doi:10.1016/0095-8956(75)90089-1.
Lovász, L.; Saks, M. E.; Trotter, W. A. (1989), "An on-line graph coloring algorithm with sublinear performance ratio", Discrete Mathematics 75 (1–3): 319–325, doi:10.1016/0012-365X(89)90096-4.
Maffray, Frédéric (2003), "On the coloration of perfect graphs", in Reed, Bruce A.; Sales, Cláudia L., Recent Advances in Algorithms and Combinatorics, CMS Books in Mathematics, 11, Springer-Verlag, pp. 65–84, doi:10.1007/0-387-22444-0_3, ISBN 0-387-95434-1.
Middendorf, Matthias; Pfeiffer, Frank (1990), "On the complexity of recognizing perfectly orderable graphs", Discrete Mathematics 80 (3): 327–333, doi:10.1016/0012-365X(90)90251-C.
Sysło, Maciej M. (1989), "Sequential coloring versus Welsh-Powell bound", Discrete Mathematics 74 (1–2): 241–243, doi:10.1016/0012-365X(89)90212-4.
Vishwanathan, S. (1990), "Randomized online graph coloring", Proc. 31st IEEE Symp. Foundations of Computer Science (FOCS '90), 2, pp. 464–469, doi:10.1109/FSCS.1990.89567, ISBN 0-8186-2082-X.
Welsh, D. J. A.; Powell, M. B. (1967), "An upper bound for the chromatic number of a graph and its application to timetabling problems", The Computer Journal 10 (1): 85–86, doi:10.1093/comjnl/10.1.85.
Zaker, Manouchehr (2006), "Results on the Grundy chromatic number of graphs", Discrete Mathematics 306 (23): 3166–3173, doi:10.1016/j.disc.2005.06.044.
Introduction
In many programming languages, the programmer has the illusion of allocating arbitrarily many variables. However,
during compilation, the compiler must decide how to allocate these variables to a small, finite set of registers. Not all
variables are in use (or "live") at the same time, so some registers may be assigned to more than one variable.
However, two variables in use at the same time cannot be assigned to the same register without corrupting its value.
Variables which cannot be assigned to some register must be kept in RAM and loaded in/out for every read/write, a
process called spilling. Accessing RAM is significantly slower than accessing registers and slows down the
execution speed of the compiled program, so an optimizing compiler aims to assign as many variables to registers as
possible. Register pressure is the term used when there are fewer hardware registers available than would have been
optimal; higher pressure usually means that more spills and reloads are needed.
In addition, programs can be further optimized by assigning the same register to a source and destination of a move
instruction whenever possible. This is especially important if the compiler is using other optimizations such as SSA
analysis, which artificially generates additional move instructions in the intermediate code.
239
The coalescing done in IRC is conservative, because aggressive coalescing may introduce spills into the graph.
However, additional coalescing heuristics such as George coalescing may coalesce more vertices while still ensuring
that no additional spills are added. Work-lists are used in the algorithm to ensure that each iteration of IRC requires
sub-quadratic time.
Recent developments
Graph coloring allocators produce efficient code, but their allocation time is high. In cases of static compilation,
allocation time is not a significant concern. In cases of dynamic compilation, such as just-in-time (JIT) compilers,
fast register allocation is important. An efficient technique proposed by Poletto and Sarkar is linear scan
allocation.[1] This technique requires only a single pass over the list of variable live ranges. Ranges with short lifetimes are
assigned to registers, whereas those with long lifetimes tend to be spilled, or reside in memory. The results are on
average only 12% less efficient than graph coloring allocators.
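A minimal sketch of the linear-scan idea follows. This is an illustration of the single-pass approach, not Poletto and Sarkar's exact formulation; the interval bookkeeping and the furthest-end spill heuristic are simplified assumptions of mine:

```python
def linear_scan(intervals, num_registers):
    """Linear-scan register allocation sketch.

    `intervals` maps a variable name to its live range (start, end).
    Returns (assignment, spilled), where assignment maps variables
    to register indices and spilled lists variables kept in memory.
    """
    # One pass over live ranges in order of increasing start point.
    order = sorted(intervals, key=lambda v: intervals[v][0])
    free = list(range(num_registers))
    active = []                      # (end, var), kept sorted by end
    assignment, spilled = {}, []
    for v in order:
        start, end = intervals[v]
        # Expire intervals that ended before this one starts,
        # returning their registers to the free pool.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(assignment[old])
        if free:
            assignment[v] = free.pop()
            active.append((end, v))
            active.sort()
        else:
            # Heuristic: spill whichever live range ends furthest away.
            far_end, far_var = active[-1]
            if far_end > end:
                assignment[v] = assignment.pop(far_var)
                spilled.append(far_var)
                active[-1] = (end, v)
                active.sort()
            else:
                spilled.append(v)
    return assignment, spilled
```

Because the pass never revisits an interval, the running time is dominated by the initial sort, which is what makes the technique attractive for JIT compilers.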
References
[1] http://www.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf
[2] Cooper, Dasgupta, "Tailoring Graph-coloring Register Allocation For Runtime Compilation", http://llvm.org/pubs/2006-04-04-CGO-GraphColoring.html
[3] Kong, Wilken, "Precise Register Allocation for Irregular Architectures", http://www.ece.ucdavis.edu/cerl/cerl_arch/irreg.pdf
[4] Brisk, Hack, Palsberg, Pereira, Rastello, "SSA-Based Register Allocation", ESWEEK Tutorial http://thedude.cc.gt.atl.ga.us/tutorials/1/
Vertex cover
Definition
Formally, a vertex cover of a graph G is a set C of vertices such that each edge of G is incident to at least one vertex
in C. The set C is said to cover the edges of G. The following figure shows examples of vertex covers in two graphs (the set C is marked in red).
A minimum vertex cover is a vertex cover of smallest possible size. The vertex cover number is the size of a
minimum vertex cover. The following figure shows examples of minimum vertex covers in two graphs.
Examples
The set of all vertices is a vertex cover.
The endpoints of any maximal matching form a vertex cover.
The complete bipartite graph K_{m,n} has a minimum vertex cover of size min(m, n).
Properties
A set of vertices is a vertex cover if and only if its complement is an independent set. An immediate consequence is:
The number of vertices of a graph is equal to its vertex cover number plus the size of a maximum independent set
(Gallai 1959).
Computational problem
The minimum vertex cover problem is the optimization problem of finding a smallest vertex cover in a given
graph.
INSTANCE: Graph G.
OUTPUT: Smallest number k such that G has a vertex cover of size k.
If the problem is stated as a decision problem, it is called the vertex cover problem:
INSTANCE: Graph G and positive integer k.
QUESTION: Does G have a vertex cover of size at most k?
The vertex cover problem is an NP-complete problem: it was one of Karp's 21 NP-complete problems. It is often
used in computational complexity theory as a starting point for NP-hardness proofs.
ILP formulation
Assume that every vertex v has an associated cost c(v) ≥ 0. The minimum (weighted) vertex cover problem can be formulated as the following integer linear program (ILP):
minimize Σ_{v ∈ V} c(v)·x_v
subject to x_u + x_v ≥ 1 for all {u, v} ∈ E
x_v ∈ {0, 1} for all v ∈ V
This ILP belongs to the more general class of ILPs for covering problems. The integrality gap of this ILP is 2, so its relaxation gives a factor-2 approximation algorithm for the minimum vertex cover problem. Furthermore, the linear programming relaxation of that ILP is half-integral, that is, there exists an optimal solution for which each entry x_v is either 0, 1/2, or 1.
Exact evaluation
The decision variant of the vertex cover problem is NP-complete, which means it is unlikely that there is an efficient
algorithm to solve it exactly. NP-completeness can be proven by reduction from 3-satisfiability or, as Karp did, by
reduction from the clique problem. Vertex cover remains NP-complete even in cubic graphs[2] and even in planar
graphs of degree at most 3.[3]
For bipartite graphs, the equivalence between vertex cover and maximum matching described by Kőnig's theorem allows the bipartite vertex cover problem to be solved in polynomial time.
Fixed-parameter tractability
An exhaustive search algorithm can solve the problem in time 2^k·n^O(1). Vertex cover is therefore fixed-parameter tractable, and if we are only interested in small k, we can solve the problem in polynomial time. One algorithmic technique that works here is called the bounded search tree algorithm, and its idea is to repeatedly choose some vertex and recursively branch, with two cases at each step: place either the current vertex or all its neighbours into the vertex cover. The algorithm for solving vertex cover that achieves the best asymptotic dependence on the parameter runs in time O(1.2738^k + k·n).[4] Under reasonable complexity-theoretic assumptions, namely the exponential time hypothesis, this running time cannot be improved to 2^o(k)·n^O(1).
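A minimal Python sketch of bounded search, using the even simpler edge-branching rule (every vertex cover must contain one endpoint of each edge, so branching on an uncovered edge gives a search tree of depth k and running time O(2^k·m)):

```python
def vertex_cover_branch(edges, k):
    """Decide whether the graph given by `edges` has a vertex cover of size <= k.

    Bounded search tree: pick any remaining edge (u, v); any vertex cover
    must contain u or v, so branch on the two choices with budget k - 1.
    """
    if not edges:
        return True           # no edges left: the choices made cover everything
    if k == 0:
        return False          # edges remain but the budget is exhausted
    u, v = next(iter(edges))
    for w in (u, v):
        remaining = [e for e in edges if w not in e]  # edges covered by w vanish
        if vertex_cover_branch(remaining, k - 1):
            return True
    return False

# A triangle needs 2 vertices in any cover.
triangle = [(1, 2), (2, 3), (1, 3)]
print(vertex_cover_branch(triangle, 1))  # False
print(vertex_cover_branch(triangle, 2))  # True
```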
However, for planar graphs, and more generally, for graphs excluding some fixed graph as a minor, a vertex cover of size k can be found in time 2^O(√k)·n^O(1), i.e., the problem is subexponential fixed-parameter tractable.[5] This algorithm is again optimal, in the sense that, under the exponential time hypothesis, no algorithm can solve vertex cover on planar graphs in time 2^o(√k)·n^O(1).[6]
Approximate evaluation
One can find a factor-2 approximation by repeatedly taking both endpoints of an edge into the vertex cover, then
removing them from the graph. Put otherwise, we find a maximal matching M with a greedy algorithm and construct
a vertex cover C that consists of all endpoints of the edges in M. In the following figure, a maximal matching M is
marked with red, and the vertex cover C is marked with blue.
The set C constructed this way is a vertex cover: suppose that an edge e is not covered by C; then M ∪ {e} is a matching and e ∉ M, which is a contradiction with the assumption that M is maximal. Furthermore, if e = {u, v} ∈ M, then any vertex cover, including an optimal vertex cover, must contain u or v (or both); otherwise the edge e is not covered. That is, an optimal cover contains at least one endpoint of each edge in M; in total, the set C is at most 2 times as large as the optimal vertex cover.
This simple algorithm was discovered independently by Fanica Gavril and Mihalis Yannakakis.[7]
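The maximal-matching argument above translates directly into code; a minimal Python sketch (names are ours):

```python
def vertex_cover_2approx(edges):
    """Greedy maximal matching; take both endpoints of every matched edge.

    The returned cover is at most twice the size of a minimum vertex cover.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:  # edge still uncovered: match it
            cover.update((u, v))               # add both endpoints to the cover
    return cover

# Path 1-2-3-4: the greedy matching {(1,2), (3,4)} yields cover {1,2,3,4};
# an optimal cover is {2,3}, so the factor-2 bound is tight here.
print(sorted(vertex_cover_2approx([(1, 2), (2, 3), (3, 4)])))  # [1, 2, 3, 4]
```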
More involved techniques show that there are approximation algorithms with a slightly better approximation factor.
For example, an approximation algorithm with an approximation factor of 2 − Θ(1/√(log n)) is known.[8]
Inapproximability
No better constant-factor approximation algorithm than the above one is known. The minimum vertex cover problem
is APX-complete, that is, it cannot be approximated arbitrarily well unless P=NP. Using techniques from the PCP
theorem, Dinur and Safra proved in 2005 that minimum vertex cover cannot be approximated within a factor of
1.3606 for any sufficiently large vertex degree unless P=NP.[9] Moreover, if the unique games conjecture is true
then minimum vertex cover cannot be approximated within any constant factor better than 2.[10]
Although finding the minimum-size vertex cover is equivalent to finding the maximum-size independent set, as
described above, the two problems are not equivalent in an approximation-preserving way: The Independent Set
problem has no constant-factor approximation unless P=NP.
Fixed-parameter tractability
For the hitting set problem, different parameterizations make sense.[11] The hitting set problem is W[2]-complete for the parameter OPT, that is, it is unlikely that there is an algorithm that runs in time f(OPT)·n^O(1) where OPT is the cardinality of the smallest hitting set. The hitting set problem is fixed-parameter tractable for the parameter OPT + d, where d is the size of the largest edge of the hypergraph. More specifically, there is an algorithm for hitting set that runs in time d^OPT·n^O(1).
Applications
An example of a practical application involving the hitting set problem arises in efficient dynamic detection of race
conditions.[12] In this case, each time global memory is written, the current thread and set of locks held by that thread
are stored. Under lockset-based detection, if later another thread writes to that location and there is not a race, it must
be because it holds at least one lock in common with each of the previous writes. Thus the size of the hitting set
represents the minimum lock set size to be race-free. This is useful in eliminating redundant write events, since large
lock sets are considered unlikely in practice.
Notes
[1] Vazirani 2001, pp. 122–123
[2] Garey, Johnson & Stockmeyer 1974
[3] Garey & Johnson 1977, pp. 190 and 195
[4] Chen, Kanj & Xia 2006
[5] Demaine et al. 2005
[6] Flum & Grohe (2006, p. 437)
[7] Papadimitriou & Steiglitz 1998, p. 432, mentions both Gavril and Yannakakis. Garey & Johnson 1979, p. 134, cites Gavril.
[8] Karakostas 2004
[9] Dinur & Safra 2005
[10] Khot & Regev 2008
[11] Flum & Grohe (2006, p. 10ff)
[12] O'Callahan & Choi 2003
References
Chen, Jianer; Kanj, Iyad A.; Xia, Ge (2006). "Improved Parameterized Upper Bounds for Vertex Cover". MFCS 2006. Lecture Notes in Computer Science 4162: 238–249. doi:10.1007/11821069_21. ISBN 978-3-540-37791-7.
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). Introduction to Algorithms. Cambridge, Mass.: MIT Press and McGraw-Hill. pp. 1024–1027. ISBN 0-262-03293-7.
Demaine, Erik; Fomin, Fedor V.; Hajiaghayi, Mohammad Taghi; Thilikos, Dimitrios M. (2005). "Subexponential parameterized algorithms on bounded-genus graphs and H-minor-free graphs" (http://erikdemaine.org/papers/HMinorFree_JACM/). Journal of the ACM 52 (6): 866–893. doi:10.1145/1101821.1101823. Retrieved 2010-03-05.
Dinur, Irit; Safra, Samuel (2005). "On the hardness of approximating minimum vertex cover" (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.125.334&rep=rep1&type=pdf). Annals of Mathematics 162 (1): 439–485. doi:10.4007/annals.2005.162.439.
O'Callahan, Robert; Choi, Jong-Deok (2003). Hybrid dynamic data race detection (http://portal.acm.org/citation.cfm?id=781528). "Proceedings of the ACM SIGPLAN symposium on principles and practice of parallel programming (PPoPP 2003) and workshop on partial evaluation and semantics-based program manipulation (PEPM 2003)". ACM SIGPLAN Notices 38 (10): 167–178. doi:10.1145/966049.781528.
Papadimitriou, Christos H.; Steiglitz, Kenneth (1998). Combinatorial Optimization: Algorithms and Complexity. Dover.
Vazirani, Vijay V. (2001). Approximation Algorithms. Springer-Verlag. ISBN 3-540-65367-8.
External links
Weisstein, Eric W., " Vertex Cover (http://mathworld.wolfram.com/VertexCover.html)" from MathWorld.
Dominating set
In graph theory, a dominating set for a graph G = (V, E) is a subset D of V such that every vertex not in D is joined to at least one member of D by some edge. The domination number γ(G) is the number of vertices in a smallest dominating set for G.
The dominating set problem concerns testing whether γ(G) ≤ K for a given graph G and input K; it is a classical NP-complete decision problem in computational complexity theory (Garey & Johnson 1979).
Therefore it is believed that there is no efficient algorithm that finds a
smallest dominating set for a given graph.
Figures (a)(c) on the right show three examples of dominating sets for
a graph. In each example, each white vertex is adjacent to at least one
red vertex, and it is said that the white vertex is dominated by the red
vertex. The domination number of this graph is 2: the examples (b) and
(c) show that there is a dominating set with 2 vertices, and it can be
checked that there is no dominating set with only 1 vertex for this
graph.
History
As Hedetniemi & Laskar (1990) note, the domination problem was studied from the 1950s onwards, but the rate of
research on domination significantly increased in the mid-1970s. Their bibliography lists over 300 papers related to
domination in graphs.
Bounds
Let G be a graph with n ≥ 1 vertices and let Δ be the maximum degree of the graph. The following bounds on γ(G) are known (Haynes, Hedetniemi & Slater 1998a, Chapter 2):
One vertex can dominate at most Δ other vertices; therefore γ(G) ≥ n/(1 + Δ).
The set of all vertices is a dominating set in any graph; therefore γ(G) ≤ n.
If there are no isolated vertices in G, then there are two disjoint dominating sets in G; see domatic partition for details. Therefore in any graph without isolated vertices it holds that γ(G) ≤ n/2.
Independent domination
Dominating sets are closely related to independent sets: an independent set is also a dominating set if and only if it is
a maximal independent set, so any maximal independent set in a graph is necessarily also a minimal dominating set.
Thus, the smallest maximal independent set is also the smallest independent dominating set. The independent
domination number i(G) of a graph G is the size of the smallest independent dominating set (or, equivalently, the
size of the smallest maximal independent set).
The minimum dominating set in a graph will not necessarily be independent, but the size of a minimum dominating set is always less than or equal to the size of a minimum maximal independent set, that is, γ(G) ≤ i(G).
There are graph families in which a minimum maximal independent set is a minimum dominating set. For example, Allan & Laskar (1978) show that γ(G) = i(G) if G is a claw-free graph.
A graph G is called a domination-perfect graph if γ(H) = i(H) in every induced subgraph H of G. Since an induced subgraph of a claw-free graph is claw-free, it follows that every claw-free graph is also domination-perfect (Faudree, Flandrin & Ryjáček 1997).
Examples
Figures (a) and (b) are independent dominating sets, while figure (c) illustrates a dominating set that is not an
independent set.
For any graph G, its line graph L(G) is claw-free, and hence a minimum maximal independent set in L(G) is also a
minimum dominating set in L(G). An independent set in L(G) corresponds to a matching in G, and a dominating set
in L(G) corresponds to an edge dominating set in G. Therefore a minimum maximal matching has the same size as a
minimum edge dominating set.
L-reductions
The following pair of reductions (Kann 1992, pp. 108–109) shows that the minimum dominating set problem and the
set cover problem are equivalent under L-reductions: given an instance of one problem, we can construct an
equivalent instance of the other problem.
From dominating set to set covering. Given a graph G = (V, E) with V = {1, 2, ..., n}, construct a set cover instance (S, U) as follows: the universe U is V, and the family of subsets is S = {S_1, S_2, ..., S_n} such that S_v consists of the vertex v and all vertices adjacent to v in G.
Now if D is a dominating set for G, then C = {S_v : v ∈ D} is a feasible solution of the set cover problem, with |C| = |D|. Conversely, if C = {S_v : v ∈ D} is a feasible solution of the set cover problem, then D is a dominating set for G, with |D| = |C|.
Hence the size of a minimum dominating set for G equals the size of a minimum set cover for (S, U). Furthermore, there is a simple algorithm that maps a dominating set to a set cover of the same size and vice versa. In particular, an efficient α-approximation algorithm for set covering provides an efficient α-approximation algorithm for minimum dominating sets.
For example, given the graph G shown on the right, we construct a set cover instance with the universe U = {1, 2, ..., 6} and the subsets S_1 = {1, 2, 5}, S_2 = {1, 2, 3, 5}, S_3 = {2, 3, 4, 6}, S_4 = {3, 4}, S_5 = {1, 2, 5, 6}, and S_6 = {3, 5, 6}. In this example, D = {3, 5} is a dominating set for G; this corresponds to the set cover C = {S_3, S_5}. For example, the vertex 4 ∈ V is dominated by the vertex 3 ∈ D, and the element 4 ∈ U is contained in the set S_3 ∈ C.
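The forward reduction is a one-liner in Python. The adjacency list below is our reconstruction of the example graph (the figure itself is not reproduced here); it is chosen so that the closed neighbourhoods are exactly the subsets S_1, ..., S_6 listed above.

```python
def dominating_set_to_set_cover(adj):
    """Reduce dominating set to set cover: universe U = V, and one subset
    S_v = {v} ∪ neighbours(v) for each vertex v.

    `adj` maps each vertex to an iterable of its neighbours.
    """
    universe = set(adj)
    subsets = {v: {v, *adj[v]} for v in adj}
    return subsets, universe

# Adjacency inferred from the subsets in the text, e.g. S_1 = {1, 2, 5}.
adj = {1: [2, 5], 2: [1, 3, 5], 3: [2, 4, 6], 4: [3], 5: [1, 2, 6], 6: [3, 5]}
S, U = dominating_set_to_set_cover(adj)
print(S[3] | S[5] == U)  # True: the dominating set {3, 5} covers everything
```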
From set covering to dominating set. Let (S, U) be an instance of the set cover problem with the universe U and the family of subsets S = {S_i : i ∈ I}; we assume that U and the index set I are disjoint. Construct a graph G = (V, E) as follows: the set of vertices is V = I ∪ U, there is an edge {i, j} ∈ E between each pair i, j ∈ I, and there is also an edge {i, u} for each i ∈ I and u ∈ S_i. That is, G is a split graph: I is a clique and U is an independent set.
Now if C = {S_i : i ∈ D} is a feasible solution of the set cover problem for some subset D ⊆ I, then D is a dominating set for G, with |D| = |C|: First, for each u ∈ U there is an i ∈ D such that u ∈ S_i, and by construction, u and i are adjacent in G; hence u is dominated by i. Second, since D must be nonempty, each i ∈ I is adjacent to a vertex in D.
Conversely, let D be a dominating set for G. Then it is possible to construct another dominating set X such that |X| ≤ |D| and X ⊆ I: simply replace each u ∈ D ∩ U by a neighbour i ∈ I of u. Then C = {S_i : i ∈ X} is a feasible solution of the set cover problem, with |C| = |X| ≤ |D|.
The illustration on the right shows the construction for U = {a, b, c, d, e}, I = {1, 2, 3, 4}, S_1 = {a, b, c}, S_2 = {a, b}, S_3 = {b, c, d}, and S_4 = {c, d, e}.
In this example, C = {S_1, S_4} is a set cover; this corresponds to the dominating set D = {1, 4}.
D = {a, 3, 4} is another dominating set for the graph G. Given D, we can construct a dominating set X = {1, 3, 4} which is not larger than D and which is a subset of I. The dominating set X corresponds to the set cover C = {S_1, S_3, S_4}.
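The split-graph construction in the reverse direction is also short. A sketch in Python, using the example instance above (the tagged vertex names ('i', ...) and ('u', ...) are our device for keeping I and U disjoint):

```python
from collections import defaultdict

def set_cover_to_dominating_set(subsets):
    """Build the split graph G: index vertices form a clique, universe
    vertices are independent, and index i is joined to each u in S_i.

    `subsets` maps an index to a set of universe elements. Returns an
    adjacency dict over vertices ('i', index) and ('u', element).
    """
    adj = defaultdict(set)
    idx = list(subsets)
    for a in idx:                         # clique on the index set I
        for b in idx:
            if a != b:
                adj[('i', a)].add(('i', b))
    for i, s in subsets.items():          # edges from i to every u in S_i
        for u in s:
            adj[('i', i)].add(('u', u))
            adj[('u', u)].add(('i', i))
    return dict(adj)

S = {1: {'a', 'b', 'c'}, 2: {'a', 'b'}, 3: {'b', 'c', 'd'}, 4: {'c', 'd', 'e'}}
G = set_cover_to_dominating_set(S)
dom = {('i', 1), ('i', 4)}               # from the set cover {S_1, S_4}
print(all(v in dom or G[v] & dom for v in G))  # True: dom dominates G
```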
Special cases
If the graph has maximum degree Δ, then the greedy approximation algorithm finds an O(log Δ)-approximation of a minimum dominating set. The problem admits a PTAS for special cases such as unit disk graphs and planar graphs (Crescenzi et al. 2000). A minimum dominating set can be found in linear time in series-parallel graphs (Takamizawa, Nishizeki & Saito 1982).
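The greedy rule (repeatedly pick the vertex whose closed neighbourhood covers the most not-yet-dominated vertices) can be sketched in Python; this is a plain illustration of the set-cover-style greedy, with no attempt at efficiency:

```python
def greedy_dominating_set(adj):
    """Greedy approximation for minimum dominating set.

    `adj` maps each vertex to an iterable of its neighbours. Repeatedly adds
    the vertex dominating the most still-undominated vertices.
    """
    undominated = set(adj)
    dom = set()
    while undominated:
        # gain of v = undominated vertices in its closed neighbourhood N[v]
        v = max(adj, key=lambda v: len(undominated & {v, *adj[v]}))
        dom.add(v)
        undominated -= {v, *adj[v]}
    return dom

# Star K_{1,4}: the centre alone dominates every vertex.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
print(greedy_dominating_set(star))  # {0}
```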
Exact algorithms
A minimum dominating set of an n-vertex graph can be found in time O(2^n·n) by inspecting all vertex subsets. Fomin, Grandoni & Kratsch (2009) show how to find a minimum dominating set in time O(1.5137^n) and exponential space, and in time O(1.5264^n) and polynomial space. A faster algorithm, using O(1.5048^n) time, was found by van Rooij, Nederlof & van Dijk (2009), who also show that the number of minimum dominating sets can be computed in this time. The number of minimal dominating sets is at most 1.7159^n and all such sets can be listed in time O(1.7159^n) (Fomin et al. 2008).
Parameterized complexity
Finding a dominating set of size k plays a central role in the theory of parameterized complexity. It is the most
well-known problem complete for the class W[2] and used in many reductions to show intractability of other
problems. In particular, the problem is not fixed-parameter tractable in the sense that no algorithm with running time f(k)·n^O(1) for any function f exists unless the W-hierarchy collapses to FPT = W[2]. On the other hand, if the input
graph is planar, the problem remains NP-hard, but a fixed-parameter algorithm is known. In fact, the problem has a
kernel of size linear in k (Alber, Fellows & Niedermeier 2004), and running times that are exponential in k and
cubic in n may be obtained by applying dynamic programming to a branch-decomposition of the kernel (Fomin &
Thilikos 2006).
Variants
Vizing's conjecture relates the domination number of a cartesian product of graphs to the domination number of its
factors.
There has been much work on connected dominating sets. If S is a connected dominating set, one can form a
spanning tree of G in which S forms the set of non-leaf vertices of the tree; conversely, if T is any spanning tree in a
graph with more than two vertices, the non-leaf vertices of T form a connected dominating set. Therefore, finding
minimum connected dominating sets is equivalent to finding spanning trees with the maximum possible number of
leaves.
A total dominating set is a set of vertices such that all vertices in the graph (including the vertices in the dominating
set themselves) have a neighbor in the dominating set. Figure (c) above shows a dominating set that is a connected
dominating set and a total dominating set; the examples in figures (a) and (b) are neither.
A domatic partition is a partition of the vertices into disjoint dominating sets. The domatic number is the maximum
size of a domatic partition.
References
Alber, Jochen; Fellows, Michael R.; Niedermeier, Rolf (2004), "Polynomial-time data reduction for dominating set", Journal of the ACM 51 (3): 363–384, doi:10.1145/990308.990309.
Allan, Robert B.; Laskar, Renu (1978), "On domination and independent domination numbers of a graph", Discrete Mathematics 23 (2): 73–76, doi:10.1016/0012-365X(78)90105-X.
Crescenzi, Pierluigi; Kann, Viggo; Halldórsson, Magnús; Karpinski, Marek; Woeginger, Gerhard (2000), "Minimum dominating set" [1], A Compendium of NP Optimization Problems.
Faudree, Ralph; Flandrin, Evelyne; Ryjáček, Zdeněk (1997), "Claw-free graphs: A survey", Discrete Mathematics 164 (1–3): 87–147, doi:10.1016/S0012-365X(96)00045-3, MR 1432221.
Fomin, Fedor V.; Grandoni, Fabrizio; Kratsch, Dieter (2009), "A measure & conquer approach for the analysis of exact algorithms", Journal of the ACM 56 (5): 25:1–32, doi:10.1145/1552285.1552286.
Fomin, Fedor V.; Grandoni, Fabrizio; Pyatkin, Artem; Stepanov, Alexey (2008), "Combinatorial bounds via measure and conquer: Bounding minimal dominating sets and applications", ACM Transactions on Algorithms 5 (1): 9:1–17, doi:10.1145/1435375.1435384.
Fomin, Fedor V.; Thilikos, Dimitrios M. (2006), "Dominating sets in planar graphs: branch-width and exponential speed-up", SIAM Journal on Computing 36 (2): 281, doi:10.1137/S0097539702419649.
Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, ISBN 0-7167-1045-5, p. 190, problem GT2.
Grandoni, F. (2006), "A note on the complexity of minimum dominating set", Journal of Discrete Algorithms 4 (2): 209–214, doi:10.1016/j.jda.2005.03.002.
Guha, S.; Khuller, S. (1998), "Approximation algorithms for connected dominating sets", Algorithmica 20 (4): 374–387, doi:10.1007/PL00009201.
Haynes, Teresa W.; Hedetniemi, Stephen; Slater, Peter (1998a), Fundamentals of Domination in Graphs, Marcel Dekker, ISBN 0-8247-0033-3, OCLC 37903553.
Haynes, Teresa W.; Hedetniemi, Stephen; Slater, Peter (1998b), Domination in Graphs: Advanced Topics, Marcel Dekker, ISBN 0-8247-0034-1, OCLC 38201061.
Hedetniemi, S. T.; Laskar, R. C. (1990), "Bibliography on domination in graphs and some basic definitions of domination parameters", Discrete Mathematics 86 (1–3): 257–277, doi:10.1016/0012-365X(90)90365-O.
Kann, Viggo (1992), On the Approximability of NP-complete Optimization Problems [2]. PhD thesis, Department of Numerical Analysis and Computing Science, Royal Institute of Technology, Stockholm.
Raz, R.; Safra, S. (1997), "A sub-constant error-probability low-degree test, and sub-constant error-probability PCP characterization of NP", Proc. 29th Annual ACM Symposium on Theory of Computing, ACM, pp. 475–484, doi:10.1145/258533.258641, ISBN 0-89791-888-6.
Takamizawa, K.; Nishizeki, T.; Saito, N. (1982), "Linear-time computability of combinatorial problems on series-parallel graphs", Journal of the ACM 29 (3): 623–641, doi:10.1145/322326.322328.
van Rooij, J. M. M.; Nederlof, J.; van Dijk, T. C. (2009), "Inclusion/Exclusion Meets Measure and Conquer: Exact Algorithms for Counting Dominating Sets", Proc. 17th Annual European Symposium on Algorithms, ESA 2009, Lecture Notes in Computer Science, 5757, Springer, pp. 554–565, doi:10.1007/978-3-642-04128-0_50, ISBN 978-3-642-04127-3.
References
[1] http://www.nada.kth.se/~viggo/wwwcompendium/node11.html
[2] http://www.csc.kth.se/~viggo/papers/phdthesis.pdf
Feedback vertex set
Definition
The decision problem is as follows:
INSTANCE: An (undirected or directed) graph G = (V, E) and a positive integer k.
QUESTION: Is there a subset X ⊆ V with |X| ≤ k such that the graph G with the vertices from X deleted is cycle-free?
The graph that remains after removing X from G is a forest (resp. a directed acyclic graph in the case of directed graphs). Thus, finding a minimum feedback vertex set in a graph is equivalent to finding a maximum induced forest (resp. maximum induced directed acyclic graph in the case of directed graphs).
NP-hardness
Karp (1972) showed that the feedback vertex set problem for directed graphs is NP-complete. The problem remains
NP-complete on directed graphs with maximum in-degree and out-degree two, and on directed planar graphs with
maximum in-degree and out-degree three.[1] Karp's reduction also implies the NP-completeness of the feedback
vertex set problem on undirected graphs, where the problem stays NP-hard on graphs of maximum degree four. The
feedback vertex set problem can be solved in polynomial time on graphs of maximum degree at most three.
Note that the problem of deleting edges to make the graph cycle-free is equivalent to finding a minimum spanning
tree, which can be done in polynomial time. In contrast, the problem of deleting edges from a directed graph to make
it acyclic, the feedback arc set problem, is NP-complete, see Karp (1972).
Exact algorithms
The corresponding NP optimization problem of finding the size of a minimum feedback vertex set can be solved in time O(1.7347^n), where n is the number of vertices in the graph.[2] This algorithm actually computes a maximum induced forest, and when such a forest is obtained, its complement is a minimum feedback vertex set. The number of minimal feedback vertex sets in a graph is bounded by O(1.8638^n).[3] The directed feedback vertex set problem can still be solved in time O*(1.9977^n), where n is the number of vertices in the given directed graph.[4] The parameterized versions of the directed and undirected problems are both fixed-parameter tractable.[5]
Approximation
The problem is APX-complete, which directly follows from the APX-completeness of the vertex cover problem,[6]
and the existence of an approximation preserving L-reduction from the vertex cover problem to it.[7] The best known
approximation on undirected graphs is by a factor of two.[8]
Bounds
According to the Erdős–Pósa theorem, the size of a minimum feedback vertex set is within a logarithmic factor of the maximum number of vertex-disjoint cycles in the given graph.
Applications
In operating systems, feedback vertex sets play a prominent role in the study of deadlock recovery. In the wait-for
graph of an operating system, each directed cycle corresponds to a deadlock situation. In order to resolve all
deadlocks, some blocked processes have to be aborted. A minimum feedback vertex set in this graph corresponds to
a minimum number of processes that one needs to abort (Silberschatz & Galvin 2008).
Furthermore, the feedback vertex set problem has applications in VLSI chip design (cf. Festa, Pardalos & Resende
(2000)) and genome assembly.
Notes
[1] unpublished results due to Garey and Johnson, cf. Garey & Johnson (1979): GT7
[2] Fomin & Villanger (2010)
[3] Fomin et al. (2008)
[4] Razgon (2007)
[5] Chen et al. (2008)
[6] Dinur & Safra 2005
[7] Karp (1972)
[8] Becker & Geiger (1996)
References
Research articles
Becker, Ann; Bar-Yehuda, Reuven; Geiger, Dan (2000), "Randomized Algorithms for the Loop Cutset Problem", Journal of Artificial Intelligence Research (JAIR) 12: 219–234.
Becker, Ann; Geiger, Dan (1996), "Optimization of Pearl's Method of Conditioning and Greedy-Like Approximation Algorithms for the Vertex Feedback Set Problem", Artificial Intelligence 83 (1): 167–188, doi:10.1016/0004-3702(95)00004-6.
Cao, Yixin; Chen, Jianer; Liu, Yang (2010), "On Feedback Vertex Set: New Measure and New Structures" (http://www.springerlink.com/content/f3726432823626n7/), in Kaplan, Haim, SWAT 2010, LNCS 6139: 93–104, doi:10.1007/978-3-642-13731-0.
Chen, Jianer; Fomin, Fedor V.; Liu, Yang; Lu, Songjian; Villanger, Yngve (2008), "Improved algorithms for feedback vertex set problems", Journal of Computer and System Sciences 74 (7): 1188–1198.
Chen, Jianer; Liu, Yang; Lu, Songjian; O'Sullivan, Barry; Razgon, Igor (2008), "A fixed-parameter algorithm for the directed feedback vertex set problem", Journal of the ACM 55 (5).
Dinur, Irit; Safra, Samuel (2005), "On the hardness of approximating minimum vertex cover" (http://www.cs.huji.ac.il/~dinuri/mypapers/vc.pdf), Annals of Mathematics 162 (1): 439–485, doi:10.4007/annals.2005.162.439, retrieved 2010-03-05.
Fomin, Fedor V.; Gaspers, Serge; Pyatkin, Artem; Razgon, Igor (2008), "On the Minimum Feedback Vertex Set Problem: Exact and Enumeration Algorithms", Algorithmica 52 (2): 293–307, doi:10.1007/s00453-007-9152-0.
Fomin, Fedor V.; Villanger, Yngve (2010), "Finding Induced Subgraphs via Minimal Triangulations", Proc. of STACS 2010, pp. 383–394, doi:10.4230/LIPIcs.STACS.2010.2470.
Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, ISBN 0-7167-1045-5, A1.1: GT7, p. 191.
Karp, Richard M. (1972), "Reducibility Among Combinatorial Problems" (http://www.cs.berkeley.edu/~luca/cs172/karp.pdf), Complexity of Computer Computations, Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, N.Y., New York: Plenum, pp. 85–103.
Razgon, I. (2007), "Computing Minimum Directed Feedback Vertex Set in O*(1.9977^n)", in Giuseppe F. Italiano, Eugenio Moggi, Luigi Laura (eds.), Proceedings of the 10th Italian Conference on Theoretical Computer Science, World Scientific, pp. 70–81 (author's version (http://www.cs.ucc.ie/~ir2/papers/ictcsIgorCamera.pdf), preliminary full version (http://www.cs.ucc.ie/~ir2/papers/mas1203.pdf)).
Feedback arc set
Example
As a simple example, consider the following hypothetical situation:
George says he will give you a piano, but only in exchange for a lawnmower.
Harry says he will give you a lawnmower, but only in exchange for a microwave.
Jane says she will give you a microwave, but only in exchange for a piano.
We can express this as a graph problem. Let each vertex represent an item, and add an edge from A to B if you must
have A to obtain B. Your goal is to get the lawnmower. Unfortunately, you don't have any of the three items, and
because this graph is cyclic, you can't get any of them either.
However, suppose you offer George $100 for his piano. If he accepts, this effectively removes the edge from the
lawnmower to the piano, because you no longer need the lawnmower to get the piano. Consequently, the cycle is
broken, and you can trade twice to get the lawnmower. This one edge constitutes a feedback arc set.
Computational Complexity
As in the above example, there is usually some cost associated with removing an edge. For this reason, we'd like to
remove as few edges as possible. Removing one edge suffices in a simple cycle, but in general figuring out the
minimum number of edges to remove is an NP-hard problem called the minimum feedback arc set problem. It is
particularly difficult in k-edge-connected graphs for large k, where each edge falls in many different cycles. The
decision version of the problem, which is NP-complete, asks whether all cycles can be broken by removing at most k
edges; this was one of Karp's 21 NP-complete problems, shown by reducing from the vertex cover problem.
Although NP-complete, the feedback arc set problem is fixed-parameter tractable: there exists an algorithm for
solving it whose running time is a fixed polynomial in the size of the input graph (independent of the number of
edges in the set) but exponential in the number of edges in the feedback arc set.[3]
Notes
[1] Di Battista, Giuseppe; Eades, Peter; Tamassia, Roberto; Tollis, Ioannis G. (1998), "Layered Drawings of Digraphs", Graph Drawing: Algorithms for the Visualization of Graphs, Prentice Hall, pp. 265–302, ISBN 9780133016154.
[2] Bastert, Oliver; Matuszewski, Christian (2001), "Layered drawings of digraphs", in Kaufmann, Michael; Wagner, Dorothea, Drawing Graphs: Methods and Models, Lecture Notes in Computer Science, 2025, Springer-Verlag, pp. 87–120, doi:10.1007/3-540-44969-8_5.
[3] Chen, Jianer; Liu, Yang; Lu, Songjian; O'Sullivan, Barry; Razgon, Igor (2008), "A fixed-parameter algorithm for the directed feedback vertex set problem", Journal of the ACM 55 (5), doi:10.1145/1411509.1411511.
[4] Dinur, Irit; Safra, Samuel (2005), "On the hardness of approximating minimum vertex cover" (http://www.cs.huji.ac.il/~dinuri/mypapers/vc.pdf), Annals of Mathematics 162 (1): 439–485, doi:10.4007/annals.2005.162.439. (Preliminary version in STOC 2002, titled "The importance of being biased", doi:10.1145/509907.509915.)
[5] Even, G.; Naor, J.; Schieber, B.; Sudan, M. (1998), "Approximating minimum feedback sets and multicuts in directed graphs", Algorithmica 20: 151–174.
[6] Berger, B.; Shor, P. (1990), "Approximation algorithms for the maximum acyclic subgraph problem" (http://dl.acm.org/citation.cfm?id=320176.320203), Proceedings of the 1st ACM-SIAM Symposium on Discrete Algorithms (SODA '90), pp. 236–243.
[7] Eades, P.; Lin, X.; Smyth, W. F. (1993), "A fast and effective heuristic for the feedback arc set problem", Information Processing Letters 47: 319–323, doi:10.1016/0020-0190(93)90079-O.
[8] Kenyon-Mathieu, C.; Schudy, W. (2007), "How to rank with few errors", p. 95, doi:10.1145/1250790.1250806; author's extended version (http://www.cs.brown.edu/people/ws/papers/fast_journal.pdf)
References
Richard M. Karp. "Reducibility Among Combinatorial Problems." In Complexity of Computer Computations,
Proc. Sympos. IBM Thomas J. Watson Res. Center, Yorktown Heights, N.Y.. New York: Plenum, p.85-103.
1972.
Pierluigi Crescenzi, Viggo Kann, Magnús Halldórsson, Marek Karpinski and Gerhard Woeginger. "Minimum
Feedback Arc Set (http://www.nada.kth.se/~viggo/wwwcompendium/node20.html)". A compendium of NP
optimization problems (http://www.nada.kth.se/~viggo/wwwcompendium/). Last modified March 20, 2000.
Viggo Kann. On the Approximability of NP-complete Optimization Problems (http://www.nada.kth.se/
~viggo/papers/phdthesis.pdf). PhD thesis. Department of Numerical Analysis and Computing Science, Royal
Institute of Technology, Stockholm. 1992.
Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of
NP-Completeness. W.H. Freeman. ISBN 0-7167-1045-5. A1.1: GT8, p. 192.
Tours
Eulerian path
In graph theory, an Eulerian trail (or Eulerian path) is a trail in a
graph which visits every edge exactly once. Similarly, an Eulerian
circuit or Eulerian cycle is an Eulerian trail which starts and ends on
the same vertex. They were first discussed by Leonhard Euler while
solving the famous Seven Bridges of Königsberg problem in 1736.
Mathematically the problem can be stated like this:
Given the graph on the right, is it possible to construct a path (or
a cycle, i.e. a path starting and ending on the same vertex) which
visits each edge exactly once?
Euler proved that a necessary condition for the existence of Eulerian
circuits is that all vertices in the graph have an even degree, and stated
without proof that connected graphs with all vertices of even degree
have an Eulerian circuit. The first complete proof of this latter claim
was published posthumously in 1873 by Carl Hierholzer.[1]
The term Eulerian graph has two common meanings in graph theory.
One meaning is a graph with an Eulerian circuit, and the other is a
graph with every vertex of even degree. These definitions coincide for
connected graphs.[2]
For the existence of Eulerian trails it is necessary that no more than
two vertices have an odd degree; this means the Königsberg graph is
not Eulerian. If there are no vertices of odd degree, all Eulerian trails
are circuits. If there are exactly two vertices of odd degree, all Eulerian
trails start at one of them and end at the other. A graph that has an
Eulerian trail but not an Eulerian circuit is called semi-Eulerian.
Definition
An Eulerian orientation of an undirected graph G is an assignment of a direction to each edge such that every
vertex has equal in-degree and out-degree. Such an orientation exists for any undirected graph in
which every vertex has even degree, and may be found by constructing an Euler tour in each connected component
of G and then orienting the edges according to the tour.[6] Every Eulerian orientation of a connected graph is a strong
orientation, an orientation that makes the resulting directed graph strongly connected.
Properties
An undirected graph has an Eulerian cycle if and only if every vertex has even degree, and all of its vertices with
nonzero degree belong to a single connected component.
An undirected graph can be decomposed into edge-disjoint cycles if and only if all of its vertices have even
degree. So, a graph has an Eulerian cycle if and only if it can be decomposed into edge-disjoint cycles and its
nonzero-degree vertices belong to a single connected component.
An undirected graph has an Eulerian trail if and only if at most two vertices have odd degree, and if all of its
vertices with nonzero degree belong to a single connected component.
A directed graph has an Eulerian cycle if and only if every vertex has equal in-degree and out-degree, and all of its
vertices with nonzero degree belong to a single strongly connected component. Equivalently, a directed graph has
an Eulerian cycle if and only if it can be decomposed into edge-disjoint directed cycles and all of its vertices with
nonzero degree belong to a single strongly connected component.
A directed graph has an Eulerian trail if and only if at most one vertex has (out-degree) − (in-degree) = 1, at most
one vertex has (in-degree) − (out-degree) = 1, every other vertex has equal in-degree and out-degree, and all of its
vertices with nonzero degree belong to a single connected component of the underlying undirected graph.
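The undirected conditions above translate directly into a mechanical check. A minimal Python sketch (the function name and edge-list representation are illustrative, not from the article):

```python
from collections import defaultdict

def eulerian_status(edges):
    """Classify an undirected (multi)graph given as a list of (u, v) edges.

    Returns 'circuit' if it has an Eulerian cycle, 'trail' if it has an
    Eulerian trail but no cycle, and 'none' otherwise, following the
    degree and connectivity conditions stated above.
    """
    degree = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adj[u].add(v)
        adj[v].add(u)

    # All vertices of nonzero degree must lie in one connected component.
    nonzero = [v for v in degree if degree[v] > 0]
    if nonzero:
        seen, stack = set(), [nonzero[0]]
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(adj[x] - seen)
        if set(nonzero) - seen:
            return 'none'

    odd = sum(1 for v in degree if degree[v] % 2 == 1)
    if odd == 0:
        return 'circuit'   # every vertex has even degree
    if odd == 2:
        return 'trail'     # every trail runs between the two odd vertices
    return 'none'
```

On the Königsberg multigraph (four vertices, all of odd degree) this returns 'none', matching the discussion above.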
Hierholzer's algorithm
Hierholzer's 1873 paper provides a different method for finding Euler cycles that is more efficient than Fleury's
algorithm:
Choose any starting vertex v, and follow a trail of edges from that vertex until returning to v. It is not possible to
get stuck at any vertex other than v, because the even degree of all vertices ensures that, when the trail enters
another vertex w there must be an unused edge leaving w. The tour formed in this way is a closed tour, but may
not cover all the vertices and edges of the initial graph.
As long as there exists a vertex v that belongs to the current tour but that has adjacent edges not part of the tour,
start another trail from v, following unused edges until returning to v, and join the tour formed in this way to the
previous tour.
By using a data structure such as a doubly linked list to maintain the set of unused edges incident to each vertex, to
maintain the list of vertices on the current tour that have unused edges, and to maintain the tour itself, the individual
operations of the algorithm (finding unused edges exiting each vertex, finding a new starting vertex for a tour, and
connecting two tours that share a vertex) may be performed in constant time each, so the overall algorithm takes
linear time.[8]
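The procedure can be sketched compactly in an iterative form, where a vertex stack performs the sub-tour splicing implicitly: whenever the trail gets stuck at a vertex with no unused edges, that vertex is emitted to the output and the search backs up. The names and edge-list representation below are illustrative, assuming an Eulerian circuit exists:

```python
from collections import defaultdict

def hierholzer(edges, start):
    """Eulerian circuit of an undirected (multi)graph by Hierholzer's
    method, assuming one exists. Each edge is consumed in O(1) time,
    giving a linear-time algorithm overall.
    """
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    used = [False] * len(edges)

    circuit, stack = [], [start]
    while stack:
        v = stack[-1]
        # Discard edges already consumed from their other endpoint.
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()
        if adj[v]:
            w, i = adj[v].pop()
            used[i] = True
            stack.append(w)              # extend the current trail
        else:
            circuit.append(stack.pop())  # vertex exhausted: emit it
    return circuit[::-1]
```

For two triangles sharing a vertex, `hierholzer([(0,1),(1,2),(2,0),(0,3),(3,4),(4,0)], 0)` returns a closed tour through all six edges.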
Special cases
An asymptotic formula for the number of Eulerian circuits in complete graphs was determined by McKay and
Robinson (1995).[11]
A similar formula was later obtained by M. I. Isaev (2009) for complete bipartite graphs.[12]
Applications
Eulerian trails are used in bioinformatics to reconstruct the DNA sequence from its fragments.[13]
Notes
[1] N. L. Biggs, E. K. Lloyd and R. J. Wilson, Graph Theory 1736–1936, Clarendon Press, Oxford, 1976, 8–9, ISBN 0-19-853901-0.
[2] C. L. Mallows, N. J. A. Sloane (1975). "Two-graphs, switching classes and Euler graphs are equal in number". SIAM Journal on Applied
Mathematics 28 (4): 876–880. doi:10.1137/0128070. JSTOR 2100368.
[3] Some people reserve the terms path and cycle to mean non-self-intersecting path and cycle. A (potentially) self-intersecting path is known as
a trail or an open walk; and a (potentially) self-intersecting cycle, a circuit or a closed walk. This ambiguity can be avoided by using the
terms Eulerian trail and Eulerian circuit when self-intersection is allowed.
[4] Jun-ichi Yamaguchi, Introduction of Graph Theory (http://jwilson.coe.uga.edu/EMAT6680/Yamaguchi/emat6690/essay1/GT.html).
[5] Schaum's outline of theory and problems of graph theory By V. K. Balakrishnan (http://books.google.co.uk/books?id=1NTPbSehvWsC&lpg=PA60&dq=unicursal&pg=PA60#v=onepage&q=unicursal&f=false).
[6] Schrijver, A. (1983), "Bounds on the number of Eulerian orientations", Combinatorica 3 (3–4): 375–380, doi:10.1007/BF02579193,
MR 729790.
[7] Fleury, M. (1883), "Deux problèmes de Géométrie de situation" (http://books.google.com/books?id=l-03AAAAMAAJ&pg=PA257) (in
French), Journal de mathématiques élémentaires, 2nd ser. 2: 257–261.
[8] Fleischner, Herbert (1991), "X.1 Algorithms for Eulerian Trails", Eulerian Graphs and Related Topics: Part 1, Volume 2, Annals of Discrete
Mathematics, 50, Elsevier, pp. X.1–13, ISBN 978-0-444-89110-5.
[9] Brightwell and Winkler, "Note on Counting Eulerian Circuits (http://www.cdam.lse.ac.uk/Reports/Files/cdam-2004-12.pdf)", 2004.
[10] Tetali, P.; Vempala, S. (2001). "Random Sampling of Euler Tours" (http://www.springerlink.com/content/k5kbhmg4qkj7whcf/).
Algorithmica 30: 376–385.
[11] Brendan McKay and Robert W. Robinson, Asymptotic enumeration of eulerian circuits in the complete graph (http://cs.anu.edu.au/~bdm/papers/euler.pdf), Combinatorica, 10 (1995), no. 4, 367–377.
[12] M. I. Isaev (2009). "Asymptotic number of Eulerian circuits in complete bipartite graphs" (in Russian). Proc. 52-nd MFTI Conference
(Moscow): 111–114.
[13] Pevzner, Pavel A.; Tang, Haixu; Waterman, Michael S. (2001). "An Eulerian trail approach to DNA fragment assembly" (http://www.pnas.org/content/98/17/9748.long). Proceedings of the National Academy of Sciences of the United States of America 98 (17): 9748–9753.
Bibcode 2001PNAS...98.9748P. doi:10.1073/pnas.171285098. PMC 55524. PMID 11504945.
References
Euler, L., "Solutio problematis ad geometriam situs pertinentis (http://www.math.dartmouth.edu/~euler/
pages/E053.html)", Comment. Academiae Sci. I. Petropolitanae 8 (1736), 128–140.
Hierholzer, Carl (1873), "Ueber die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung
zu umfahren", Mathematische Annalen 6 (1): 30–32, doi:10.1007/BF01442866.
Lucas, E., Récréations Mathématiques IV, Paris, 1921.
Fleury, "Deux problèmes de géométrie de situation", Journal de mathématiques élémentaires (1883), 257–261.
T. van Aardenne-Ehrenfest and N. G. de Bruijn, Circuits and trees in oriented linear graphs, Simon Stevin, 28
(1951), 203–217.
Thorup, Mikkel (2000), "Near-optimal fully-dynamic graph connectivity", Proc. 32nd ACM Symposium on
Theory of Computing, pp. 343–350, doi:10.1145/335305.335345.
W. T. Tutte and C. A. B. Smith, On Unicursal Paths in a Network of Degree 4. Amer. Math. Monthly, 48 (1941),
233–237.
External links
Discussion of early mentions of Fleury's algorithm (http://mathforum.org/kb/message.jspa?messageID=3648262&tstart=135)
Hamiltonian path
In the mathematical field of graph theory, a Hamiltonian path (or
traceable path) is a path in an undirected graph that visits each vertex
exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a
Hamiltonian path that is a cycle. Determining whether such paths and
cycles exist in graphs is the Hamiltonian path problem, which is
NP-complete.
Hamiltonian paths and cycles are named after William Rowan
Hamilton who invented the Icosian game, now also known as
Hamilton's puzzle, which involves finding a Hamiltonian cycle in the
edge graph of the dodecahedron. Hamilton solved this problem using
the Icosian Calculus, an algebraic structure based on roots of unity
with many similarities to the quaternions (also invented by Hamilton).
This solution does not generalize to arbitrary graphs. However, despite
being named after Hamilton, Hamiltonian cycles in polyhedra had also
been studied a year earlier by Thomas Kirkman.[1]
Definitions
A Hamiltonian path or traceable path is a path that visits each vertex
exactly once. A graph that contains a Hamiltonian path is called a
traceable graph. A graph is Hamiltonian-connected if for every pair
of vertices there is a Hamiltonian path between the two vertices.
A Hamiltonian cycle, Hamiltonian circuit, vertex tour or graph cycle is
a cycle that visits each vertex exactly once (except the vertex that is
both the start and end, and so is visited twice). A graph that contains a
Hamiltonian cycle is called a Hamiltonian graph.
Similar notions may be defined for directed graphs, where each edge
(arc) of a path or cycle can only be traced in a single direction (i.e., the
vertices are connected with arrows and the edges traced "tail-to-head").
Examples
A complete graph with more than two vertices is Hamiltonian.
Properties
Any Hamiltonian cycle can be converted to a Hamiltonian path by removing one of its edges, but a Hamiltonian path
can be extended to Hamiltonian cycle only if its endpoints are adjacent.
The line graph of a Hamiltonian graph is Hamiltonian. The line graph of an Eulerian graph is Hamiltonian.
A tournament (with more than 2 vertices) is Hamiltonian if and only if it is strongly connected.
The problem of finding a Hamiltonian cycle may be used as the basis of a zero-knowledge proof.
The number of different Hamiltonian cycles in a complete undirected graph on n vertices is (n - 1)! / 2 and in a
complete directed graph on n vertices is (n - 1)!.
Bondy–Chvátal theorem
The best vertex degree characterization of Hamiltonian graphs was provided in 1972 by the Bondy–Chvátal theorem,
which generalizes earlier results by G. A. Dirac (1952) and Øystein Ore. In fact, both Dirac's and Ore's theorems are
less powerful than what can be derived from Pósa's theorem (1962). Dirac and Ore's theorems basically state that a
graph is Hamiltonian if it has enough edges. First we have to define the closure of a graph.
Given a graph G with n vertices, the closure cl(G) is uniquely constructed from G by repeatedly adding a new edge
uv connecting a nonadjacent pair of vertices u and v with degree(v) + degree(u) ≥ n until no more pairs with this
property can be found.
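The closure construction can be carried out mechanically. A short illustrative sketch in Python (the adjacency-set representation and function name are mine, not from the article):

```python
def closure(n, edges):
    """Bondy-Chvatal closure of a simple graph on vertices 0..n-1.

    Repeatedly joins any nonadjacent pair u, v with deg(u) + deg(v) >= n
    until no such pair remains; the result is independent of the order
    in which the pairs are processed.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    changed = True
    while changed:
        changed = False
        for u in range(n):
            for v in range(u + 1, n):
                if v not in adj[u] and len(adj[u]) + len(adj[v]) >= n:
                    adj[u].add(v)  # degree condition met: add edge uv
                    adj[v].add(u)
                    changed = True
    return adj
```

For example, K4 minus one edge closes up to the complete graph K4 (so it is Hamiltonian by the theorem below), while the 5-cycle, whose nonadjacent degree sums are only 4 < 5, is its own closure.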
Bondy–Chvátal theorem
A graph is Hamiltonian if and only if its closure is Hamiltonian.
As complete graphs are Hamiltonian, all graphs whose closure is complete are Hamiltonian, which is the content of
the following earlier theorems by Dirac and Ore.
Dirac (1952)
A simple graph with n vertices (n ≥ 3) is Hamiltonian if each vertex has degree n/2 or greater.[2]
Ore (1960)
A graph with n vertices (n ≥ 3) is Hamiltonian if, for each pair of non-adjacent vertices, the sum of their
degrees is n or greater (see Ore's theorem).
The following theorems can be regarded as directed versions:
Ghouila-Houiri (1960)
A strongly connected simple directed graph with n vertices is Hamiltonian if every vertex has a full degree
greater than or equal to n.
Meyniel (1973)
A strongly connected simple directed graph with n vertices is Hamiltonian if the sum of full degrees of every
pair of distinct non-adjacent vertices is greater than or equal to 2n − 1.
The number of vertices must be doubled because each undirected edge corresponds to two directed arcs and thus the
degree of a vertex in the directed graph is twice the degree in the undirected graph.
Notes
[1] Biggs, N. L. (1981), "T. P. Kirkman, mathematician", The Bulletin of the London Mathematical Society 13 (2): 97–120,
doi:10.1112/blms/13.2.97, MR 608093.
[2] Graham, p. 20 (http://books.google.com/books?id=XicsNQIrC3sC&printsec=frontcover&source=gbs_summary_r&cad=0#PPA20,M1).
References
Berge, Claude; Ghouila-Houiri, A. (1962), Programming, games and transportation networks, New York: John
Wiley & Sons, Inc.
DeLeon, Melissa, " A Study of Sufficient Conditions for Hamiltonian Cycles (http://www.rose-hulman.edu/
mathjournal/archives/2000/vol1-n1/paper4/v1n1-4pd.PDF)". Department of Mathematics and Computer
Science, Seton Hall University
Dirac, G. A. (1952), "Some theorems on abstract graphs", Proceedings of the London Mathematical Society, 3rd
Ser. 2: 69–81, doi:10.1112/plms/s3-2.1.69, MR 0047308
Graham, Ronald L., Handbook of Combinatorics, MIT Press, 1995. ISBN 978-0-262-57170-8.
Hamilton, William Rowan, "Memorandum respecting a new system of roots of unity". Philosophical Magazine, 12
1856
Hamilton, William Rowan, "Account of the Icosian Calculus". Proceedings of the Royal Irish Academy, 6 1858
Meyniel, M. (1973), "Une condition suffisante d'existence d'un circuit hamiltonien dans un graphe orienté",
Journal of Combinatorial Theory, Ser. B 14 (2): 137–147, doi:10.1016/0095-8956(73)90057-9
Ore, O "A Note on Hamiltonian Circuits." American Mathematical Monthly 67, 55, 1960.
Peterson, Ivars, "The Mathematical Tourist". 1988. W. H. Freeman and Company, NY
Pósa, L. A theorem concerning hamilton lines. Magyar Tud. Akad. Mat. Kutato Int. Kozl. 7 (1962), 225–226.
Chuzo Iwamoto and Godfried Toussaint, "Finding Hamiltonian circuits in arrangements of Jordan curves is
NP-complete," Information Processing Letters, Vol. 52, 1994, pp. 183–189.
External links
Weisstein, Eric W., " Hamiltonian Cycle (http://mathworld.wolfram.com/HamiltonianCycle.html)" from
MathWorld.
Euler tour and Hamilton cycles (http://www.graph-theory.net/euler-tour-and-hamilton-cycles/)
Algorithms
There are n! different sequences of vertices that might be Hamiltonian paths in a given n-vertex graph (and are, in a
complete graph), so a brute force search algorithm that tests all possible sequences would be very slow. There are
several faster approaches. A search procedure by Frank Rubin[2] divides the edges of the graph into three classes:
those that must be in the path, those that cannot be in the path, and undecided. As the search proceeds, a set of
decision rules classifies the undecided edges, and determines whether to halt or continue the search. The algorithm
divides the graph into components that can be solved separately. Also, a dynamic programming algorithm of
Bellman, Held, and Karp can be used to solve the problem in time O(n^2 2^n). In this method, one determines, for each
set S of vertices and each vertex v in S, whether there is a path that covers exactly the vertices in S and ends at v. For
each choice of S and v, a path exists for (S, v) if and only if v has a neighbor w such that a path exists for (S − {v}, w),
which can be looked up from already-computed information in the dynamic program.[3][4]
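The Bellman–Held–Karp recurrence above can be sketched directly with bitmask dynamic programming. A minimal Python version for the existence question (the function name and edge-list input are illustrative):

```python
def hamiltonian_path_exists(n, edges):
    """Bellman-Held-Karp dynamic program, O(n^2 * 2^n) time.

    dp[S][v] is True when some path covers exactly the vertex set S
    (encoded as a bitmask over vertices 0..n-1) and ends at vertex v.
    """
    nbr = [0] * n                      # neighbor sets as bitmasks
    for u, v in edges:
        nbr[u] |= 1 << v
        nbr[v] |= 1 << u

    dp = [[False] * n for _ in range(1 << n)]
    for v in range(n):
        dp[1 << v][v] = True           # single-vertex paths

    for S in range(1 << n):
        for v in range(n):
            if not dp[S][v]:
                continue
            # Extend the path ending at v by any unvisited neighbor w.
            for w in range(n):
                if (nbr[v] >> w) & 1 and not (S >> w) & 1:
                    dp[S | (1 << w)][w] = True

    full = (1 << n) - 1
    return any(dp[full][v] for v in range(n))
```

A path graph on four vertices has a Hamiltonian path; a star K1,3 does not, since any path through the center can visit at most three of its four vertices.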
Andreas Björklund provided an alternative approach using the inclusion–exclusion principle to reduce the problem
of counting the number of Hamiltonian cycles to a simpler counting problem, of counting cycle covers, which can be
solved by computing certain matrix determinants. Using this method, he showed how to solve the Hamiltonian cycle
problem in arbitrary n-vertex graphs by a Monte Carlo algorithm in time O(1.657^n); for bipartite graphs this
algorithm can be further improved to time O(1.414^n).[5]
For graphs of maximum degree three, a careful backtracking search can find a Hamiltonian cycle (if one exists) in
time O(1.251^n).[6]
Because of the difficulty of solving the Hamiltonian path and cycle problems on conventional computers, they have
also been studied in unconventional models of computing. For instance, Leonard Adleman showed that the
Hamiltonian path problem may be solved using a DNA computer. Exploiting the parallelism inherent in chemical
reactions, the problem may be solved using a number of chemical reaction steps linear in the number of vertices of
the graph; however, it requires a factorial number of distinct types of DNA molecule to participate in the reaction.[7]
Complexity
The problem of finding a Hamiltonian cycle or path is in FNP; the analogous decision problem is to test whether a
Hamiltonian cycle or path exists. The directed and undirected Hamiltonian cycle problems were two of Karp's 21
NP-complete problems. They remain NP-complete even for undirected planar graphs of maximum degree three,[8]
for directed planar graphs with indegree and outdegree at most two,[9] for bridgeless undirected planar 3-regular
bipartite graphs, and for 3-connected 3-regular bipartite graphs.[10] However, putting all of these conditions together,
it remains open whether 3-connected 3-regular bipartite planar graphs must always contain a Hamiltonian cycle, in
which case the problem restricted to those graphs could not be NP-complete; see Barnette's conjecture.
In graphs in which all vertices have odd degree, an argument related to the handshaking lemma shows that the
number of Hamiltonian cycles through any fixed edge is always even, so if one Hamiltonian cycle is given, then a
second one must also exist.[11] However, finding this second cycle does not seem to be an easy computational task.
Papadimitriou defined the complexity class PPA to encapsulate problems such as this one.[12]
References
[1] Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman.
ISBN 0-7167-1045-5. A1.3: GT37–39, pp. 199–200.
[2] Rubin, Frank (1974), "A Search Procedure for Hamilton Paths and Circuits", Journal of the ACM 21: 576–580.
[3] Bellman, R. (1962), "Dynamic programming treatment of the travelling salesman problem", Journal of the ACM 9: 61–63,
doi:10.1145/321105.321111.
[4] Held, M.; Karp, R. M. (1962), "A dynamic programming approach to sequencing problems", J. SIAM 10 (1): 196–210, doi:10.1137/0110015.
[5] Björklund, Andreas (2010), "Determinant sums for undirected Hamiltonicity", Proc. 51st IEEE Symposium on Foundations of Computer
Science (FOCS '10), pp. 173–182, arXiv:1008.0541, doi:10.1109/FOCS.2010.24.
[6] Iwama, Kazuo; Nakashima, Takuya (2007), "An improved exact algorithm for cubic graph TSP", Proc. 13th Annual International
Conference on Computing and Combinatorics (COCOON 2007), Lecture Notes in Computer Science, 4598, pp. 108–117,
doi:10.1007/978-3-540-73545-8_13.
[7] Adleman, Leonard (1994), "Molecular computation of solutions to combinatorial problems", Science 266 (5187): 1021–1024,
doi:10.1126/science.7973651, JSTOR 2885489, PMID 7973651.
[8] Garey, M. R.; Johnson, D. S.; Stockmeyer, L. (1974), "Some simplified NP-complete problems", Proc. 6th ACM Symposium on Theory of
Computing (STOC '74), pp. 47–63, doi:10.1145/800119.803884.
[9] Plesník, J. (1979), "The NP-completeness of the Hamiltonian cycle problem in planar digraphs with degree bound two" (http://www.aya.or.jp/~babalabo/DownLoad/Plesnik 8.4.192-196.pdf), Information Processing Letters 8 (4): 199–201.
[10] Akiyama, Takanori; Nishizeki, Takao; Saito, Nobuji (1980–1981), "NP-completeness of the Hamiltonian cycle problem for bipartite
graphs" (http://www.nishizeki.ecei.tohoku.ac.jp/nszk/nishi/sub/j/DVD/PDF_J/J029.pdf), Journal of Information Processing 3 (2):
73–76, MR 596313.
[11] Thomason, A. G. (1978), "Hamiltonian cycles and uniquely edge colourable graphs", Advances in Graph Theory (Cambridge Combinatorial
Conf., Trinity College, Cambridge, 1977), Annals of Discrete Mathematics, 3, pp. 259–268, doi:10.1016/S0167-5060(08)70511-9,
MR 499124.
[12] Papadimitriou, Christos H. (1994), "On the complexity of the parity argument and other inefficient proofs of existence", Journal of
Computer and System Sciences 48 (3): 498–532, doi:10.1016/S0022-0000(05)80063-7, MR 1279412.
History
The origins of the travelling salesman problem are unclear. A handbook for travelling salesmen from 1832 mentions
the problem and includes example tours through Germany and Switzerland, but contains no mathematical
treatment.[2]
The travelling salesman problem was defined in the 1800s by the Irish
mathematician W. R. Hamilton and by the British mathematician
Thomas Kirkman. Hamilton's Icosian Game was a recreational puzzle
based on finding a Hamiltonian cycle.[3] The general form of the TSP
appears to have been first studied by mathematicians during the 1930s
in Vienna and at Harvard, notably by Karl Menger, who defines the
problem, considers the obvious brute-force algorithm, and observes the
non-optimality of the nearest neighbour heuristic:
We denote by messenger problem (since in practice this question
should be solved by each postman, anyway also by many
travelers) the task to find, for finitely many points whose
pairwise distances are known, the shortest route connecting the
points. Of course, this problem is solvable by finitely many
trials. Rules which would push the number of trials below the
number of permutations of the given points, are not known. The
rule that one first should go from the starting point to the closest point, then to the point closest to this, etc., in
general does not yield the shortest route.[4]
Hassler Whitney at Princeton University introduced the name travelling salesman problem soon after.[5]
In the 1950s and 1960s, the problem became increasingly popular in scientific circles in Europe and the USA.
Notable contributions were made by George Dantzig, Delbert Ray Fulkerson and Selmer M. Johnson at the RAND
Corporation in Santa Monica, who expressed the problem as an integer linear program and developed the cutting
plane method for its solution. With these new methods they solved an instance with 49 cities to optimality by
constructing a tour and proving that no other tour could be shorter. In the following decades, the problem was
studied by many researchers from mathematics, computer science, chemistry, physics, and other sciences.
Richard M. Karp showed in 1972 that the Hamiltonian cycle problem was NP-complete, which implies the
NP-hardness of TSP. This supplied a mathematical explanation for the apparent computational difficulty of finding
optimal tours.
Great progress was made in the late 1970s and 1980s, when Grötschel, Padberg, Rinaldi and others managed to
exactly solve instances with up to 2392 cities, using cutting planes and branch-and-bound.
In the 1990s, Applegate, Bixby, Chvátal, and Cook developed the program Concorde that has been used in many
recent record solutions. Gerhard Reinelt published the TSPLIB in 1991, a collection of benchmark instances of
varying difficulty, which has been used by many research groups for comparing results. In 2005, Cook and others
computed an optimal tour through a 33,810-city instance given by a microchip layout problem, currently the largest
solved TSPLIB instance. For many other instances with millions of cities, solutions can be found that are guaranteed
to be within 1% of an optimal tour.
Description
As a graph problem
TSP can be modelled as an undirected weighted graph, such that cities
are the graph's vertices, paths are the graph's edges, and a path's
distance is the edge's length. It is a minimization problem starting and
finishing at a specified vertex after having visited each other vertex
exactly once. Often, the model is a complete graph (i.e. each pair of
vertices is connected by an edge). If no path exists between two cities,
adding an arbitrarily long edge will complete the graph without
affecting the optimal tour.
In the symmetric TSP, the distance between two cities is the same in
each opposite direction, forming an undirected graph. This symmetry halves the number of possible solutions. In the
asymmetric TSP, paths may not exist in both directions or the distances might be different, forming a directed graph.
Traffic collisions, one-way streets, and airfares for cities with different departure and arrival fees are examples of
how this symmetry could break down.
Related problems
An equivalent formulation in terms of graph theory is: Given a complete weighted graph (where the vertices
would represent the cities, the edges would represent the roads, and the weights would be the cost or distance of
that road), find a Hamiltonian cycle with the least weight.
The requirement of returning to the starting city does not change the computational complexity of the problem,
see Hamiltonian path problem.
Another related problem is the bottleneck travelling salesman problem (bottleneck TSP): Find a Hamiltonian
cycle in a weighted graph with the minimal weight of the weightiest edge. The problem is of considerable
practical importance, apart from evident transportation and logistics areas. A classic example is in printed circuit
manufacturing: scheduling of a route of the drill machine to drill holes in a PCB. In robotic machining or drilling
applications, the "cities" are the parts to machine or the holes to drill, and the "cost of travel" includes the time
for retooling the robot (single machine job sequencing problem).
Computing a solution
The traditional lines of attack for the NP-hard problems are the following:
Devising algorithms for finding exact solutions (they will work reasonably fast only for small problem sizes).
Devising "suboptimal" or heuristic algorithms, i.e., algorithms that deliver either seemingly or probably good
solutions, but which could not be proved to be optimal.
Finding special cases for the problem ("subproblems") for which either better or exact heuristics are possible.
Computational complexity
The problem has been shown to be NP-hard (more precisely, it is complete for the complexity class FP^NP; see
function problem), and the decision problem version ("given the costs and a number x, decide whether there is a
round-trip route cheaper than x") is NP-complete. The bottleneck travelling salesman problem is also NP-hard. The
problem remains NP-hard even for the case when the cities are in the plane with Euclidean distances, as well as in a
number of other restrictive cases. Removing the condition of visiting each city "only once" does not remove the
NP-hardness, since it is easily seen that in the planar case there is an optimal tour that visits each city only once
(otherwise, by the triangle inequality, a shortcut that skips a repeated visit would not increase the tour length).
Complexity of approximation
In the general case, finding a shortest travelling salesman tour is NPO-complete.[8] If the distance measure is a metric
and symmetric, the problem becomes APX-complete[9] and Christofides's algorithm approximates it within 1.5.[10]
If the distances are restricted to 1 and 2 (but still are a metric) the approximation ratio becomes 7/6. In the
asymmetric, metric case, only logarithmic performance guarantees are known, the best current algorithm achieves
performance ratio 0.814 log n;[11] it is an open question if a constant factor approximation exists.
The corresponding maximization problem of finding the longest travelling salesman tour is approximable within
63/38.[12] If the distance function is symmetric, the longest tour can be approximated within 4/3 by a deterministic
algorithm[13] and within (33 + ε)/25 by a randomised algorithm.[14]
Exact algorithms
The most direct solution would be to try all permutations (ordered combinations) and see which one is cheapest
(using brute force search). The running time for this approach lies within a polynomial factor of O(n!), the
factorial of the number of cities, so this solution becomes impractical even for only 20 cities. One of the earliest
applications of dynamic programming is the Held–Karp algorithm that solves the problem in time O(n^2 2^n).[15]
The dynamic programming solution requires exponential space. Using inclusion–exclusion, the problem can be
solved in time within a polynomial factor of 2^n and polynomial space.[16]
Improving these time bounds seems to be difficult. For example, it has not been determined whether an exact
algorithm for TSP that runs in time O(1.9999^n) exists.[17]
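The Held–Karp dynamic program mentioned above can be sketched in a few lines. An illustrative Python version for the tour cost (the function name and matrix input are mine; the tour is anchored at city 0 without loss of generality):

```python
def held_karp(dist):
    """Exact TSP tour cost by the Held-Karp dynamic program.

    dist is an n x n cost matrix; the tour starts and ends at city 0.
    best[S][v] is the cheapest path from 0 through the subset S of
    cities 1..n-1 (a bitmask) ending at v. O(n^2 2^n) time, O(n 2^n) space.
    """
    n = len(dist)
    INF = float('inf')
    best = [[INF] * n for _ in range(1 << (n - 1))]
    for v in range(1, n):
        best[1 << (v - 1)][v] = dist[0][v]   # direct hop 0 -> v

    for S in range(1 << (n - 1)):
        for v in range(1, n):
            if best[S][v] == INF or not (S >> (v - 1)) & 1:
                continue
            # Extend the path ending at v by any city w not yet in S.
            for w in range(1, n):
                if (S >> (w - 1)) & 1:
                    continue
                T = S | (1 << (w - 1))
                cand = best[S][v] + dist[v][w]
                if cand < best[T][w]:
                    best[T][w] = cand

    full = (1 << (n - 1)) - 1                # all cities visited
    return min(best[full][v] + dist[v][0] for v in range(1, n))
```

For four cities at the corners of a unit square, the optimal tour walks the perimeter, cost 4; any tour using a diagonal costs 2 + 2√2 ≈ 4.83.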
Other approaches include:
Various branch-and-bound algorithms, which can be used to process TSPs containing 40–60 cities.
Progressive improvement algorithms which use techniques reminiscent of linear programming. Works well for up
to 200 cities.
Implementations of branch-and-bound and problem-specific cut generation; this is the method of choice for
solving large instances. This approach holds the current record, solving an instance with 85,900 cities, see
Applegate et al. (2006).
An exact solution for 15,112 German towns from TSPLIB was found in 2001 using the cutting-plane method
proposed by George Dantzig, Ray Fulkerson, and Selmer M. Johnson in 1954, based on linear programming. The
computations were performed on a network of 110 processors located at Rice University and Princeton University
(see the Princeton external link). The total computation time was equivalent to 22.6 years on a single 500 MHz
Alpha processor. In May 2004, the travelling salesman problem of visiting all 24,978 towns in Sweden was solved: a
tour of length approximately 72,500 kilometers was found and it was proven that no shorter tour exists.[18]
In March 2005, the travelling salesman problem of visiting all 33,810 points in a circuit board was solved using
Concorde TSP Solver: a tour of length 66,048,945 units was found and it was proven that no shorter tour exists. The
computation took approximately 15.7 CPU-years (Cook et al. 2006). In April 2006 an instance with 85,900 points
was solved using Concorde TSP Solver, taking over 136 CPU-years, see Applegate et al. (2006).
Special cases
Metric TSP
In the metric TSP, also known as delta-TSP or Δ-TSP, the intercity distances satisfy the triangle inequality.
A very natural restriction of the TSP is to require that the distances between cities form a metric, i.e., they satisfy the
triangle inequality. This can be understood as the absence of "shortcuts", in the sense that the direct connection from
A to B is never longer than the route via intermediate C: d(A, B) ≤ d(A, C) + d(C, B).
The edge lengths then form a metric on the set of vertices. When the cities are viewed as points in the plane, many
natural distance functions are metrics, and so many natural instances of TSP satisfy this constraint.
The following are some examples of metric TSPs for various metrics.
In the Euclidean TSP (see below) the distance between two cities is the Euclidean distance between the
corresponding points.
In the rectilinear TSP the distance between two cities is the sum of the absolute differences of their x- and
y-coordinates. This metric is often called the Manhattan distance or city-block metric.
In the maximum metric, the distance between two points is the maximum of the absolute values of differences of
their x- and y-coordinates.
The last two metrics appear for example in routing a machine that drills a given set of holes in a printed circuit
board. The Manhattan metric corresponds to a machine that adjusts first one co-ordinate, and then the other, so the
time to move to a new point is the sum of both movements. The maximum metric corresponds to a machine that
adjusts both co-ordinates simultaneously, so the time to move to a new point is the slower of the two movements.
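The three metrics above can be written directly as code; a minimal sketch (the point coordinates are illustrative):

```python
import math

def euclidean(p, q):
    # Straight-line distance between two points in the plane.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    # Sum of absolute coordinate differences (city-block metric):
    # the drill moves one axis first, then the other.
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def maximum(p, q):
    # Chebyshev metric: both axes move simultaneously, so the time
    # is the slower (larger) of the two movements.
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

p, q = (0, 0), (3, 4)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
print(maximum(p, q))    # 4
```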
In its definition, the TSP does not allow cities to be visited twice, but many applications do not need this constraint.
In such cases, a symmetric, non-metric instance can be reduced to a metric one. This replaces the original graph with
a complete graph in which the inter-city distance d(u, v) is replaced by the length of the shortest path between u and
v in the original graph.
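This reduction amounts to taking the metric closure of the graph, which can be sketched with the Floyd-Warshall all-pairs shortest path algorithm; a minimal illustration, assuming a symmetric distance matrix with float('inf') for missing connections:

```python
def metric_closure(d):
    # d: square matrix of symmetric, non-negative inter-city distances,
    # with float('inf') for missing edges. Returns the matrix of
    # shortest-path distances, which satisfies the triangle inequality.
    n = len(d)
    dist = [row[:] for row in d]
    for k in range(n):          # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

d = [[0, 1, 10],
     [1, 0, 1],
     [10, 1, 0]]
# The direct 0-2 distance (10) is replaced by the path 0-1-2 (length 2).
print(metric_closure(d)[0][2])  # 2
```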
The length of the minimum spanning tree of the network is a natural lower bound for the length of the optimal
route, because deleting any edge of the optimal route yields a Hamiltonian path, which is a spanning tree in G. In
the TSP with triangle inequality case it is possible to prove upper bounds in terms of the minimum spanning tree and
design an algorithm that has a provable upper bound on the length of the route. The first published (and the simplest)
example follows:
1. Construct a minimum spanning tree T for G.
2. Duplicate all edges of T. That is, wherever there is an edge from u to v, add a second edge from v to u. This
gives us an Eulerian graph H.
3. Find an Eulerian circuit in H. Clearly, its length is twice the length of the tree.
4. Convert the Eulerian circuit of H into a Hamiltonian cycle of G in the following way: walk along the circuit, and
each time you are about to come into an already visited vertex, skip it and try to go to the next one (along the
circuit).
It is easy to prove that the last step works. Moreover, thanks to the triangle inequality, each skipping at Step 4 is in
fact a shortcut; i.e., the length of the cycle does not increase. Hence it gives us a TSP tour no more than twice as long
as the optimal one.
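The four steps above can be sketched as follows. A preorder walk of the minimum spanning tree visits the vertices in the same order as shortcutting an Euler tour of the doubled tree, so steps 2-4 collapse into a single traversal; the instance matrix is illustrative:

```python
def tsp_double_tree(d):
    # d: symmetric distance matrix satisfying the triangle inequality.
    # Returns a tour (vertex order) at most twice the optimal length.
    n = len(d)
    # Step 1: Prim's algorithm for a minimum spanning tree.
    in_tree, parent = {0}, {}
    while len(in_tree) < n:
        u, v = min(((u, v) for u in in_tree for v in range(n)
                    if v not in in_tree), key=lambda e: d[e[0]][e[1]])
        parent[v] = u
        in_tree.add(v)
    children = {i: [] for i in range(n)}
    for v, u in parent.items():
        children[u].append(v)
    # Steps 2-4: a preorder walk of the tree equals shortcutting an
    # Euler tour of the doubled tree (each skip is a triangle-inequality
    # shortcut, so the length does not increase).
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

d = [[0, 1, 2, 2],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [2, 2, 1, 0]]
print(tsp_double_tree(d))  # [0, 1, 2, 3]
```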
The Christofides algorithm follows a similar outline but combines the minimum spanning tree with a solution of
another problem, minimum-weight perfect matching. This gives a TSP tour which is at most 1.5 times the optimal.
The Christofides algorithm was one of the first approximation algorithms, and was in part responsible for drawing
attention to approximation algorithms as a practical approach to intractable problems. As a matter of fact, the term
"algorithm" was not commonly extended to approximation algorithms until later; the Christofides algorithm was
initially referred to as the Christofides heuristic.
In the special case that distances between cities are all either one or two (and thus the triangle inequality is
necessarily satisfied), there is a polynomial-time approximation algorithm that finds a tour of length at most 8/7
times the optimal tour length.[23] However, it is a long-standing (since 1975) open problem to improve the
Christofides approximation factor of 1.5 for general metric TSP to a smaller constant. It is known that, unless
P = NP, there is no polynomial-time algorithm that finds a tour of length at most 220/219 ≈ 1.00456 times the
optimal tour's length.[24] In the case of bounded metrics it is known that there is no polynomial time algorithm that
constructs a tour of length at most 321/320 times the optimal tour's length, unless P=NP.[25]
Euclidean TSP
The Euclidean TSP, or planar TSP, is the TSP with the distance being the ordinary Euclidean distance.
The Euclidean TSP is a particular case of the metric TSP, since distances in a plane obey the triangle inequality.
Like the general TSP, the Euclidean TSP (and therefore the general metric TSP) is NP-complete.[26] However, in
some respects it seems to be easier than the general metric TSP. For example, the minimum spanning tree of the
graph associated with an instance of the Euclidean TSP is a Euclidean minimum spanning tree, and so can be
computed in expected O(n log n) time for n points (considerably less than the number of edges). This enables the
simple 2-approximation algorithm for TSP with triangle inequality above to operate more quickly.
In general, for any c > 0, where d is the number of dimensions in the Euclidean space, there is a polynomial-time
algorithm that finds a tour of length at most (1 + 1/c) times the optimal for geometric instances of TSP in
O(n (log n)^((O(c√d))^(d−1))) time; this is called a polynomial-time approximation scheme (PTAS).[27] Sanjeev Arora
and Joseph S. B. Mitchell were awarded the Gödel Prize in 2010 for their concurrent discovery of a PTAS for the
Euclidean TSP.
In practice, heuristics with weaker guarantees continue to be used.
Asymmetric TSP
In most cases, the distance between two nodes in the TSP network is the same in both directions. The case where the
distance from A to B is not equal to the distance from B to A is called asymmetric TSP. A practical application of an
asymmetric TSP is route optimisation using street-level routing (which is made asymmetric by one-way streets,
slip-roads, motorways, etc.).
Solving by conversion to symmetric TSP
Solving an asymmetric TSP graph can be somewhat complex. The following is a 3×3 matrix containing all possible
path weights between the nodes A, B and C. One option is to turn an asymmetric matrix of size N into a symmetric
matrix of size 2N.

      A   B   C
  A   –   1   2
  B   6   –
  C   5   4   –
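One common form of this transformation pairs each node with a "ghost" node via a large negative weight, so that any optimal tour of the symmetric 2N-node instance alternates between original and ghost nodes. A hedged sketch; the gap value and matrix entries below are illustrative, not taken from the table above:

```python
INF = float('inf')

def to_symmetric(d, gap=-1e6):
    # d: asymmetric N x N weight matrix. Returns a symmetric 2N x 2N
    # matrix in which node i is paired with its ghost N+i by a large
    # negative weight (forcing every optimal tour to use that edge),
    # and the directed weight d[i][j] becomes the symmetric weight
    # between ghost-i and j.
    n = len(d)
    m = [[INF] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        m[i][n + i] = m[n + i][i] = gap  # node-ghost pairing edge
        for j in range(n):
            if i != j:
                m[n + i][j] = m[j][n + i] = d[i][j]
    return m

d = [[0, 2, 7],
     [3, 0, 4],
     [5, 6, 0]]
s = to_symmetric(d)
print(s[3][1], s[1][3])  # 2 2: the directed 0->1 weight, now symmetric
```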
Benchmarks
For benchmarking of TSP algorithms, TSPLIB,[28] a library of sample instances of the TSP and related problems,
is maintained; see the TSPLIB external reference. Many of them are lists of actual cities and layouts of actual
printed circuits.
Lower bound
A simple lower bound is obtained by assuming that i is a point in the tour sequence and that i has its nearest
neighbor as its next point in the path.
A better lower bound is obtained by assuming that i's next point is i's nearest neighbor, and that i's previous point
is i's second-nearest neighbor.
An even better lower bound is obtained by dividing the path sequence into two parts, before_i and after_i,
with each part containing N/2 points, and then deleting the before_i part to form a diluted point set (see discussion).
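The first of these bounds can be computed directly: every city's tour successor is at least as far away as its nearest neighbor, so summing nearest-neighbor distances bounds any tour from below. A minimal sketch with an illustrative distance matrix:

```python
def nn_lower_bound(d):
    # Sum over all cities of the distance to the nearest other city;
    # no tour can be shorter than this.
    n = len(d)
    return sum(min(d[i][j] for j in range(n) if j != i) for i in range(n))

d = [[0, 2, 9, 10],
     [1, 0, 6, 4],
     [15, 7, 0, 8],
     [6, 3, 12, 0]]
print(nn_lower_bound(d))  # 2 + 1 + 7 + 3 = 13
```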
Upper bound
By applying the simulated annealing method on samples of N = 40,000, computer analysis shows an upper bound
in which the constant 0.72 comes from the boundary effect.
Because the actual solution is only the shortest path, for the purposes of programmatic search another upper bound is
the length of any previously discovered approximation.
Popular culture
Travelling Salesman, by director Timothy Lanzone, is the story of four mathematicians hired by the US government
to solve the most elusive problem in computer-science history: P vs. NP.[32]
Notes
[1] Why is vehicle routing hard – a simple explanation (http://www.mjc2.com/logistics-planning-complexity.htm)
[2] "Der Handlungsreisende wie er sein soll und was er zu thun [sic] hat, um Auftrge zu erhalten und eines glcklichen Erfolgs in seinen
Geschften gewi zu sein von einem alten Commis-Voyageur" (The traveling salesman how he must be and what he should do in order
to be sure to perform his tasks and have success in his business by a high commis-voyageur)
[3] A discussion of the early work of Hamilton and Kirkman can be found in Graph Theory 1736–1936
[4] Cited and English translation in Schrijver (2005). Original German: "Wir bezeichnen als Botenproblem (weil diese Frage in der Praxis von
jedem Postboten, übrigens auch von vielen Reisenden zu lösen ist) die Aufgabe, für endlich viele Punkte, deren paarweise Abstände bekannt
sind, den kürzesten die Punkte verbindenden Weg zu finden. Dieses Problem ist natürlich stets durch endlich viele Versuche lösbar. Regeln,
welche die Anzahl der Versuche unter die Anzahl der Permutationen der gegebenen Punkte herunterdrücken würden, sind nicht bekannt. Die
Regel, man solle vom Ausgangspunkt erst zum nächstgelegenen Punkt, dann zu dem diesem nächstgelegenen Punkt gehen usw., liefert im
allgemeinen nicht den kürzesten Weg."
[5] A detailed treatment of the connection between Menger and Whitney as well as the growth in the study of TSP can be found in Alexander
Schrijver's 2005 paper "On the history of combinatorial optimization (till 1960)", Handbook of Discrete Optimization (K. Aardal, G.L.
Nemhauser, R. Weismantel, eds.), Elsevier, Amsterdam, 2005, pp. 1–68. PS (http://homepages.cwi.nl/~lex/files/histco.ps), PDF
(http://homepages.cwi.nl/~lex/files/histco.pdf)
[6] http://www.google.com/patents?vid=7054798
[7] Behzad, Arash; Modarres, Mohammad (2002), "New Efficient Transformation of the Generalized Traveling Salesman Problem into Traveling
Salesman Problem", Proceedings of the 15th International Conference of Systems Engineering (Las Vegas)
[8] Orponen (1987)
[9] Papadimitriou (1983)
[10] Christofides (1976)
[11] Kaplan (2004)
[12] Kosaraju (1994)
[13] Serdyukov (1984)
[14] Hassin (2000)
[15] Bellman (1960), Bellman (1962), Held & Karp (1962)
References
Applegate, D. L.; Bixby, R. E.; Chvátal, V.; Cook, W. J. (2006), The Traveling Salesman Problem,
ISBN 0-691-12993-2.
Bellman, R. (1960), "Combinatorial Processes and Dynamic Programming", in Bellman, R., Hall, M., Jr. (eds.),
Combinatorial Analysis, Proceedings of Symposia in Applied Mathematics 10, American Mathematical Society,
pp. 217–249.
Bellman, R. (1962), "Dynamic Programming Treatment of the Travelling Salesman Problem", J. Assoc. Comput.
Mach. 9: 61–63, doi:10.1145/321105.321111.
Christofides, N. (1976), Worst-case analysis of a new heuristic for the travelling salesman problem, Technical
Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh.
Hassin, R.; Rubinstein, S. (2000), "Better approximations for max TSP", Information Processing Letters 75 (4):
181–186, doi:10.1016/S0020-0190(00)00097-1.
Held, M.; Karp, R. M. (1962), "A Dynamic Programming Approach to Sequencing Problems", Journal of the
Society for Industrial and Applied Mathematics 10 (1): 196–210, doi:10.1137/0110015.
Kaplan, H.; Lewenstein, L.; Shafrir, N.; Sviridenko, M. (2004), "Approximation Algorithms for Asymmetric TSP
by Decomposing Directed Regular Multigraphs", Proc. 44th IEEE Symp. on Foundations of Comput. Sci.,
pp. 56–65.
Karp, R. M. (1982), "Dynamic programming meets the principle of inclusion and exclusion", Oper. Res. Lett. 1
(2): 49–51, doi:10.1016/0167-6377(82)90044-X.
Kohn, S.; Gottlieb, A.; Kohn, M. (1977), "A Generating Function Approach to the Traveling Salesman Problem",
ACM Annual Conference, ACM Press, pp. 294–300.
Kosaraju, S. R.; Park, J. K.; Stein, C. (1994), "Long tours and short superstrings", Proc. 35th Ann. IEEE Symp. on
Foundations of Comput. Sci., IEEE Computer Society, pp. 166–177.
Further reading
Adleman, Leonard (1994), Molecular Computation of Solutions To Combinatorial Problems (http://www.usc.
edu/dept/molecular-science/papers/fp-sci94.pdf)
Applegate, D. L.; Bixby, R. E.; Chvátal, V.; Cook, W. J. (2006), The Traveling Salesman Problem: A
Computational Study, Princeton University Press, ISBN 978-0-691-12993-8.
Arora, S. (1998), "Polynomial time approximation schemes for Euclidean traveling salesman and other geometric
problems" (http://graphics.stanford.edu/courses/cs468-06-winter/Papers/arora-tsp.pdf), Journal of the ACM
45 (5): 753–782, doi:10.1145/290179.290180.
Babin, Gilbert; Deneault, Stéphanie; Laporte, Gilbert (2005), Improvements to the Or-opt Heuristic for the
Symmetric Traveling Salesman Problem (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.89.
9953), Cahiers du GERAD, G-2005-02, Montreal: Group for Research in Decision Analysis.
Cook, William (2011), In Pursuit of the Travelling Salesman: Mathematics at the Limits of Computation,
Princeton University Press, ISBN 978-0-691-15270-7.
Cook, William; Espinoza, Daniel; Goycoolea, Marcos (2007), "Computing with domino-parity inequalities for the
TSP", INFORMS Journal on Computing 19 (3): 356–365, doi:10.1287/ijoc.1060.0204.
Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C. (2001), "35.2: The traveling-salesman problem",
Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 1027–1033, ISBN 0-262-03293-7.
Dantzig, G. B.; Fulkerson, R.; Johnson, S. M. (1954), "Solution of a large-scale traveling salesman problem",
Operations Research 2 (4): 393–410, doi:10.1287/opre.2.4.393, JSTOR 166695.
Garey, M. R.; Johnson, D. S. (1979), "A2.3: ND22–24", Computers and Intractability: A Guide to the Theory of
NP-Completeness, W.H. Freeman, pp. 211–212, ISBN 0-7167-1045-5.
Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization & Machine Learning, New York:
Addison-Wesley, ISBN 0-201-15767-5.
Gutin, G.; Yeo, A.; Zverovich, A. (2002), "Traveling salesman should not be greedy: domination analysis of
greedy-type heuristics for the TSP", Discrete Applied Mathematics 117 (1–3): 81–86,
doi:10.1016/S0166-218X(01)00195-0.
Gutin, G.; Punnen, A. P. (2006), The Traveling Salesman Problem and Its Variations, Springer,
ISBN 0-387-44459-9.
Johnson, D. S.; McGeoch, L. A. (1997), "The Traveling Salesman Problem: A Case Study in Local
Optimization", in Aarts, E. H. L.; Lenstra, J. K., Local Search in Combinatorial Optimisation, John Wiley and
Sons Ltd, pp. 215–310.
Lawler, E. L.; Lenstra, J. K.; Rinnooy Kan, A. H. G.; Shmoys, D. B. (1985), The Traveling Salesman Problem: A
Guided Tour of Combinatorial Optimization, John Wiley & Sons, ISBN 0-471-90413-9.
MacGregor, J. N.; Ormerod, T. (1996), "Human performance on the traveling salesman problem" (http://www.
psych.lancs.ac.uk/people/uploads/TomOrmerod20030716T112601.pdf), Perception & Psychophysics 58 (4):
527–539, doi:10.3758/BF03213088.
Mitchell, J. S. B. (1999), "Guillotine subdivisions approximate polygonal subdivisions: A simple polynomial-time
approximation scheme for geometric TSP, k-MST, and related problems" (http://citeseer.ist.psu.edu/622594.
External links
Traveling Salesman Problem (http://www.tsp.gatech.edu/index.html) at Georgia Tech
TSPLIB (http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/) at the University of
Heidelberg
Traveling Salesman Problem (http://demonstrations.wolfram.com/TravelingSalesmanProblem/) by Jon
McLoone
optimap (http://www.gebweb.net/optimap/) an approximation using ACO on GoogleMaps with JavaScript
tsp (http://travellingsalesmanproblem.appspot.com/) an exact solver using Constraint Programming on
GoogleMaps
Demo applet of a genetic algorithm solving TSPs and VRPTW problems (http://www.dna-evolutions.com/
dnaappletsample.html)
Source code library for the travelling salesman problem (http://www.adaptivebox.net/CILib/code/
tspcodes_link.html)
TSP solvers in R (http://tsp.r-forge.r-project.org/) for symmetric and asymmetric TSPs. Implements various
insertion, nearest neighbor and 2-opt heuristics and an interface to Georgia Tech's Concorde and Chained
Lin-Kernighan heuristics.
Traveling Salesman (on IMDB) (http://www.imdb.com/title/tt1801123/)
Traveling Salesman Movie (http://www.travellingsalesmanmovie.com/) Official webpage of Traveling
Salesman film (2012)
C++ implementation of simulated annealing applied to the travelling salesman problem (http://www.technical-recipes.com/2012/c-implementation-of-hill-climbing-and-simulated-annealing-applied-to-travelling-salesman-problems/)
References
[1] Michael R. Garey and David S. Johnson (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman.
ISBN 0-7167-1045-5. A2.3: ND24, p. 212.
[2] R. Gary Parker and Ronald L. Rardin (1984). Guaranteed performance heuristics for the bottleneck traveling salesman problem. Operations
Research Letters. 2(6): 269–272
Christofides algorithm
Let G = (V, w) be an instance of the travelling salesman problem: a complete graph on the vertex set V, with a
weight function w assigning a nonnegative real weight to every edge of G.
Algorithm
In pseudo-code:
1. Create a minimum spanning tree T of G.
2. Let O be the set of vertices with odd degree in T, and find a minimum-weight perfect matching M in the
complete graph over the vertices from O.
3. Combine the edges of M and T to form a multigraph H.
4. Form an Eulerian circuit in H (H is Eulerian because it is connected and all its vertices have even degree).
5. Make the circuit found in the previous step into a Hamiltonian circuit by skipping visited vertices (shortcutting).
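The five steps can be sketched compactly for small instances. Here the minimum-weight perfect matching on the odd-degree vertices is found by brute-force recursion, which is only feasible for a handful of odd vertices; real implementations use the blossom algorithm. The distance matrix is assumed symmetric and metric:

```python
def christofides(d):
    # d: symmetric distance matrix satisfying the triangle inequality.
    n = len(d)
    # Step 1: minimum spanning tree (Prim's algorithm).
    mst, in_tree = [], {0}
    while len(in_tree) < n:
        u, v = min(((u, v) for u in in_tree for v in range(n)
                    if v not in in_tree), key=lambda e: d[e[0]][e[1]])
        mst.append((u, v))
        in_tree.add(v)
    # Step 2: odd-degree vertices of the tree, plus a minimum-weight
    # perfect matching on them (brute force; exponential in |odd|).
    deg = [0] * n
    for u, v in mst:
        deg[u] += 1
        deg[v] += 1
    odd = [v for v in range(n) if deg[v] % 2]
    def match(vs):
        if not vs:
            return 0, []
        u, rest = vs[0], vs[1:]
        best = (float('inf'), [])
        for i, w in enumerate(rest):
            cost, pairs = match(rest[:i] + rest[i + 1:])
            if d[u][w] + cost < best[0]:
                best = (d[u][w] + cost, pairs + [(u, w)])
        return best
    _, matching = match(odd)
    # Steps 3-4: Eulerian circuit of the multigraph T + M (Hierholzer).
    adj = {v: [] for v in range(n)}
    for u, v in mst + matching:
        adj[u].append(v)
        adj[v].append(u)
    stack, circuit = [0], []
    while stack:
        v = stack[-1]
        if adj[v]:
            w = adj[v].pop()
            adj[w].remove(v)
            stack.append(w)
        else:
            circuit.append(stack.pop())
    # Step 5: shortcut repeated vertices to get a Hamiltonian cycle.
    seen, tour = set(), []
    for v in circuit:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour

d = [[0, 1, 2, 1],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [1, 2, 1, 0]]
print(christofides(d))  # a tour visiting each of 0..3 exactly once
```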
Approximation ratio
The cost of the solution produced by the algorithm is within 3/2 of the optimum.
The proof is as follows:
Let A denote the edge set of the optimal solution of TSP for G. Because (V, A) is connected, it contains some
spanning tree and thus w(A) ≥ w(T). Further, let B denote the edge set of the optimal solution of TSP for the
complete graph over the vertices from O. Because the edge weights are triangular (so visiting more nodes cannot
reduce total cost), we know that w(A) ≥ w(B). We show that there is a perfect matching of the vertices from O with
weight under w(B)/2 ≤ w(A)/2, and therefore we have the same upper bound for M (because M is a perfect
matching of minimum weight). Because O must contain an even number of vertices, a perfect matching exists. Let
e1, e2, ..., e2k be the edges of B, in the order they are traversed. Clearly both {e1, e3, ..., e2k−1} and
{e2, e4, ..., e2k} are perfect matchings, and the weight of at least one of them is less than or equal to w(B)/2. Thus
w(M) + w(T) ≤ w(A) + w(A)/2 = (3/2) w(A), and from the triangle inequality it follows that the algorithm is
3/2-approximative.
References
NIST Christofides Algorithm Definition [1]
Nicos Christofides, Worst-case analysis of a new heuristic for the travelling salesman problem, Report 388,
Graduate School of Industrial Administration, CMU, 1976.
References
[1] http://www.nist.gov/dads/HTML/christofides.html
Chinese postman problem
T-joins
Let T be a subset of the vertex set of a graph. An edge set whose odd-degree vertices are the vertices in T is called a
T-join. (In a connected graph, a T-join exists if and only if |T| is even.) The T-join problem is to find a smallest
T-join. When T is the empty set, a smallest T-join leads to a solution of the postman problem. For any T, a smallest
T-join necessarily consists of |T|/2 paths, no two having an edge in common, that join the vertices of T in pairs. The
paths will be such that the total length of all of them is as small as possible. A minimum T-join can be obtained using
a weighted matching algorithm that uses O(n³) computational steps.[2]
Solution
If a graph has an Eulerian circuit (or an Eulerian path), then an Eulerian circuit (or path) visits every edge, and so the
solution is to choose any Eulerian circuit (or path).
If the graph is not Eulerian, it must contain vertices of odd degree. By the handshaking lemma, there must be an even
number of these vertices. To solve the postman problem we first find a smallest T-join. We make the graph Eulerian
by doubling the edges of the T-join. The solution to the postman problem in the original graph is obtained by finding
an Eulerian circuit for the new graph.
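The length of the resulting tour (total edge weight plus the weight of a minimum T-join) can be sketched for small graphs by combining all-pairs shortest paths with brute-force pairing of the odd-degree vertices; the instance below is illustrative:

```python
def postman_length(n, edges):
    # edges: list of (u, v, w) for an undirected connected graph.
    # Returns the length of a shortest closed walk using every edge.
    INF = float('inf')
    dist = [[INF] * n for _ in range(n)]
    deg = [0] * n
    total = 0
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        deg[u] += 1
        deg[v] += 1
        total += w
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n):          # Floyd-Warshall all-pairs shortest paths
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    odd = [v for v in range(n) if deg[v] % 2]  # even count (handshaking)
    def pair_up(vs):
        # Minimum total shortest-path weight of pairing up the vertices
        # vs: this is the weight of a minimum T-join for T = odd.
        if not vs:
            return 0
        u, rest = vs[0], vs[1:]
        return min(dist[u][w] + pair_up(rest[:i] + rest[i + 1:])
                   for i, w in enumerate(rest))
    return total + pair_up(odd)

# A triangle plus a pendant edge: vertices 2 and 3 have odd degree,
# so the edge 2-3 must be traversed twice.
edges = [(0, 1, 1), (1, 2, 1), (2, 0, 1), (2, 3, 2)]
print(postman_length(4, edges))  # 7
```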
Variants
A few variants of the Chinese Postman Problem have been studied and shown to be NP-complete.[3]
Min Chinese postman problem for mixed graphs: for this problem, some of the edges may be directed and can
therefore only be visited from one direction. When the problem is minimal traversal of a digraph it is known as
the "New York Street Sweeper problem."
Min k-Chinese postman problem: find k cycles all starting at a designated location such that each edge is traversed
by at least one cycle. The goal is to minimize the cost of the most expensive cycle.
Rural postman problem: a subset of the edges is also given; find the cheapest cycle containing each of these edges
(and possibly others). This is a special case of the minimum general routing problem, which specifies precisely
which vertices the cycle must contain.
References
[1] ""Chinese Postman Problem"" (http:/ / www. nist. gov/ dads/ HTML/ chinesePostman. html). .
[2] J. Edmonds and E.L. Johnson, Matching Euler tours and the Chinese postman problem, Math. Program. (1973).
[3] Crescenzi, P.; Kann, V.; Halldrsson, M.; Karpinski, M.; Woeginger, G. "A compendium of NP optimization problems" (http:/ / www. nada.
kth. se/ ~viggo/ problemlist/ compendium. html). KTH NADA, Stockholm. . Retrieved 2008-10-22.
External links
Chinese Postman Problem (http://mathworld.wolfram.com/ChinesePostmanProblem.html)
Matching
Matching
In the mathematical discipline of graph theory, a matching or independent edge set in a graph is a set of edges
without common vertices. It may also be an entire graph consisting of edges without common vertices.
Definition
Given a graph G = (V,E), a matching M in G is a set of pairwise non-adjacent edges; that is, no two edges share a
common vertex.
A vertex is matched (or saturated) if it is an endpoint of one of the edges in the matching. Otherwise the vertex is
unmatched.
A maximal matching is a matching M of a graph G with the property that if any edge not in M is added to M, it is no
longer a matching, that is, M is maximal if it is not a proper subset of any other matching in graph G. In other words,
a matching M of a graph G is maximal if every edge in G has a non-empty intersection with at least one edge in M.
The following figure shows examples of maximal matchings (red) in three graphs.
A maximum matching is a matching that contains the largest possible number of edges. There may be many
maximum matchings. The matching number ν(G) of a graph G is the size of a maximum matching. Note that
every maximum matching is maximal, but not every maximal matching is a maximum matching. The following
figure shows examples of maximum matchings in three graphs.
A perfect matching (a.k.a. 1-factor) is a matching which matches all vertices of the graph. That is, every vertex of
the graph is incident to exactly one edge of the matching. Figure (b) above is an example of a perfect matching.
Every perfect matching is maximum and hence maximal. In some literature, the term complete matching is used. In
the above figure, only part (b) shows a perfect matching. A perfect matching is also a minimum-size edge cover.
Thus, ν(G) ≤ ρ(G), that is, the size of a maximum matching is no larger than the size of a minimum edge cover.
A near-perfect matching is one in which exactly one vertex is unmatched. This can only occur when the graph has
an odd number of vertices, and such a matching must be maximum. In the above figure, part (c) shows a near-perfect
matching. If, for every vertex in a graph, there is a near-perfect matching that omits only that vertex, the graph is also
called factor-critical.
Given a matching M,
an alternating path is a path in which the edges belong alternatively to the matching and not to the matching.
an augmenting path is an alternating path that starts from and ends on free (unmatched) vertices.
One can prove that a matching is maximum if and only if it does not have any augmenting path. (This result is
sometimes called Berge's lemma.)
Matching
282
Properties
In any graph without isolated vertices, the sum of the matching number and the edge covering number equals the
number of vertices.[1] If there is a perfect matching, then both the matching number and the edge cover number are
|V| / 2.
If A and B are two maximal matchings, then |A| ≤ 2|B| and |B| ≤ 2|A|. To see this, observe that each edge in B \ A can
be adjacent to at most two edges in A \ B, because A is a matching; moreover, each edge in A \ B is adjacent to an edge
in B \ A by maximality of B, hence |A \ B| ≤ 2|B \ A|, and adding |A ∩ B| to both sides gives |A| ≤ 2|B|.
In particular, this shows that any maximal matching is a 2-approximation of a maximum matching and also a
2-approximation of a minimum maximal matching. This inequality is tight: for example, if G is a path with 3 edges
and 4 nodes, the size of a minimum maximal matching is 1 and the size of a maximum matching is 2.
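A maximal matching, and the tight path example above, can be sketched with a single greedy pass over the edges:

```python
def greedy_maximal_matching(edges):
    # Scan the edges once, keeping any edge whose endpoints are both
    # still unmatched; the result is maximal, since every skipped edge
    # shares a vertex with some kept edge.
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

# Path with 3 edges and 4 nodes: scanning the middle edge first yields
# a maximal matching of size 1, half the maximum matching {(0,1),(2,3)}.
print(greedy_maximal_matching([(1, 2), (0, 1), (2, 3)]))  # [(1, 2)]
```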
Matching polynomials
A generating function of the number of k-edge matchings in a graph is called a matching polynomial. Let G be a
graph and mk be the number of k-edge matchings. One matching polynomial of G is Σ_k m_k x^k. Another is
Σ_k (−1)^k m_k x^(n−2k), where n is the number of vertices in the graph. Each type has its uses; for more
information see the article on matching polynomials.
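The coefficients m_k can be computed by brute force on small graphs; a minimal sketch:

```python
from itertools import combinations

def matching_counts(edges):
    # Returns {k: m_k}, the number of k-edge matchings, by checking
    # every k-edge subset for pairwise non-adjacency.
    counts = {0: 1}  # the empty matching
    for k in range(1, len(edges) + 1):
        c = 0
        for subset in combinations(edges, k):
            ends = [v for e in subset for v in e]
            if len(ends) == len(set(ends)):  # no shared endpoints
                c += 1
        if c == 0:
            break  # no larger matchings exist either
        counts[k] = c
    return counts

# Path on 4 vertices: m0 = 1, m1 = 3, m2 = 1, so one generating
# polynomial is 1 + 3x + x^2.
print(matching_counts([(0, 1), (1, 2), (2, 3)]))  # {0: 1, 1: 3, 2: 1}
```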
In unweighted bipartite graphs
Matching problems are often concerned with bipartite graphs. Finding a maximum bipartite matching in a bipartite
graph G = (V = (X, Y), E) is perhaps the simplest problem. The augmenting path algorithm finds it by repeatedly
finding an augmenting path from some free vertex in X to some free vertex in Y and augmenting along it; since each
path can be found in O(E) time and the matching grows by one edge per path, the running time is O(VE). This
solution is equivalent to adding a super source s with edges to all vertices in X and a super sink t with edges from all
vertices in Y, and finding a maximal flow from s to t; all edges with flow from X to Y then
constitute a maximum matching. An improvement over this is the Hopcroft-Karp algorithm, which runs in O(√V E)
time. Another approach is based on the fast matrix multiplication algorithm and gives O(V^2.376) complexity,[3]
which is better in theory for sufficiently dense graphs, but in practice the algorithm is slower.
In weighted bipartite graphs
In a weighted bipartite graph, each edge has an associated value. A maximum weighted bipartite matching[2] is
defined as a matching where the sum of the values of the edges in the matching has a maximal value. If the graph is
not complete bipartite, missing edges are inserted with value zero. Finding such a matching is known as the
assignment problem. It can be solved by using a modified shortest path search in the augmenting path algorithm. If
the Bellman-Ford algorithm is used, the running time becomes O(V²E), or the edge cost can be shifted with a
potential to achieve O(V² log V + VE) running time with the Dijkstra algorithm and Fibonacci heap.[4] The
remarkable Hungarian algorithm solves the assignment problem and was one of the beginnings of combinatorial
optimization algorithms.
Maximum matchings
There is a polynomial time algorithm to find a maximum matching or a maximum weight matching in a graph that is
not bipartite; it is due to Jack Edmonds, is called the paths, trees, and flowers method or simply Edmonds's
algorithm, and uses bidirected edges. A generalization of the same technique can also be used to find maximum
independent sets in claw-free graphs. Edmonds' algorithm has subsequently been improved to run in O(√V E)
time, matching the time for bipartite maximum matching.[5] Another algorithm by Mucha and Sankowski,[3] based
on the fast matrix multiplication algorithm, gives O(V^2.376) complexity.
Maximal matchings
A maximal matching can be found with a simple greedy algorithm. A maximum matching is also a maximal
matching, and hence it is possible to find a largest maximal matching in polynomial time. However, no
polynomial-time algorithm is known for finding a minimum maximal matching, that is, a maximal matching that
contains the smallest possible number of edges.
Note that a maximal matching with k edges is an edge dominating set with k edges. Conversely, if we are given a
minimum edge dominating set with k edges, we can construct a maximal matching with k edges in polynomial time.
Therefore the problem of finding a minimum maximal matching is essentially equal to the problem of finding a
minimum edge dominating set.[6] Both of these two optimisation problems are known to be NP-hard; the decision
versions of these problems are classical examples of NP-complete problems.[7] Both problems can be approximated
within factor 2 in polynomial time: simply find an arbitrary maximal matching M.[8]
Counting problems
The problem of determining the number of perfect matchings in a given graph is #P-complete (see Permanent).
However, a remarkable theorem of Kasteleyn states that the number of perfect matchings in a planar graph can be
computed exactly in polynomial time via the FKT algorithm. There exists a fully polynomial time randomized
approximation scheme for counting the number of bipartite matchings.[9]
For the problem of determining the total number of matchings in a given graph, see Hosoya index.
Maximally-matchable edges
One of the basic problems in matching theory is to find, in a given graph, all edges that may be extended to a
maximum matching; such edges are called maximally-matchable edges. There exists a randomized algorithm that
solves this problem in time Õ(V^2.376).[11] For bipartite graphs, it is possible
to find a single maximum matching and then use it in order to find all maximally-matchable edges in linear time;[12]
the resulting overall runtime is O(√V E) for general bipartite graphs and O(V^1.5 √(E / log V)) for dense
bipartite graphs with E = Θ(V²).
Applications
A Kekulé structure of an aromatic compound consists of a perfect matching of its carbon skeleton, showing the
locations of double bonds in the chemical structure. These structures are named after Friedrich August Kekulé von
Stradonitz, who showed that benzene (in graph theoretical terms, a 6-vertex cycle) can be given such a structure.[14]
The Hosoya index is the number of non-empty matchings plus one; it is used in computational chemistry and
mathematical chemistry investigations for organic compounds.
References
[1] Gallai, Tibor (1959), "Über extreme Punkt- und Kantenmengen", Ann. Univ. Sci. Budapest. Eötvös Sect. Math. 2: 133–138.
[2] West, Douglas Brent (1999), Introduction to Graph Theory (2nd ed.), Prentice Hall, Chapter 3, ISBN0-13-014400-2
[3] Mucha, M.; Sankowski, P. (2004), "Maximum Matchings via Gaussian Elimination" (http://www.mimuw.edu.pl/~mucha/pub/
mucha_sankowski_focs04.pdf), Proc. 45th IEEE Symp. Foundations of Computer Science, pp. 248–255.
[4] Fredman, M.; Tarjan, R. (1987), "Fibonacci heaps and their uses in improved network optimization algorithms" (http://dl.acm.org/citation.
cfm?id=28874), Journal of the ACM (JACM), 34, pp. 596–615.
[5] Micali, S.; Vazirani, V. V. (1980), "An O(√|V| |E|) algorithm for finding maximum matching in general graphs", Proc. 21st IEEE Symp.
Foundations of Computer Science, pp. 415–423.
[11] Rabin, Michael O.; Vazirani, Vijay V. (1989), "Maximum matchings in general graphs through randomization", J. of Algorithms 10:
557–567.
[12] Tassa, Tamir (2012), "Finding all maximally-matchable edges in a bipartite graph", Theoretical Computer Science 423: 50–58,
doi:10.1016/j.tcs.2011.12.071.
[13] Gionis, Aris; Mazza, Arnon; Tassa, Tamir (2008), "k-Anonymization revisited", International Conference on Data Engineering (ICDE),
pp. 744–753.
[14] See, e.g., Trinajstić, Nenad; Klein, Douglas J.; Randić, Milan (1986), "On some solved and unsolved problems of chemical graph theory",
International Journal of Quantum Chemistry 30 (S20): 699–742, doi:10.1002/qua.560300762.
Further reading
1. László Lovász; M. D. Plummer (1986), Matching Theory, North-Holland, ISBN 0-444-87916-1
2. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein (2001), Introduction to
Algorithms (second ed.), MIT Press and McGraw-Hill, Chapter 26, pp. 643–700, ISBN 0-262-53196-8
3. András Frank (2004). On Kuhn's Hungarian Method – A tribute from Hungary (http://www.cs.elte.hu/egres/
tr/egres-04-14.pdf) (Technical report).
4. Michael L. Fredman and Robert E. Tarjan (1987), "Fibonacci heaps and their uses in improved network
optimization algorithms", Journal of the ACM (ACM Press) 34 (3): 595–615, doi:10.1145/28869.28874.
5. S. J. Cyvin and Ivan Gutman (1988), Kekulé Structures in Benzenoid Hydrocarbons, Springer-Verlag
External links
A graph library with Hopcroft-Karp and Push-Relabel based maximum cardinality matching implementation
(http://lemon.cs.elte.hu/)
Hopcroft-Karp algorithm
Augmenting paths
A vertex that is not the endpoint of an edge in some partial matching M is called a free vertex. The basic concept that
the algorithm relies on is that of an augmenting path, a path that starts at a free vertex, ends at a free vertex, and
alternates between unmatched and matched edges within the path. If M is a matching of size n, and P is an
augmenting path relative to M, then the symmetric difference of the two sets of edges, M ⊕ P, would form a
matching with size n + 1. Thus, by finding augmenting paths, an algorithm may increase the size of the matching.
Conversely, suppose that a matching M is not optimal, and let P be the symmetric difference M ⊕ M* where M* is
an optimal matching. Then P must form a collection of disjoint augmenting paths and cycles or paths in which
matched and unmatched edges are of equal number; the difference in size between M and M* is the number of
augmenting paths in P. Thus, if no augmenting path can be found, an algorithm may safely terminate, since in this
case M must be optimal.
An augmenting path in a matching problem is closely related to the augmenting paths arising in maximum flow
problems, paths along which one may increase the amount of flow between the terminals of the flow. It is possible to
transform the bipartite matching problem into a maximum flow instance, such that the alternating paths of the
matching problem become the augmenting paths of the flow problem.
In pseudocode, the overall structure of the algorithm is:
M ← ∅
repeat
    P ← {P1, P2, ..., Pk}, a maximal set of vertex-disjoint shortest augmenting paths
    M ← M ⊕ (P1 ∪ P2 ∪ ... ∪ Pk)
until P = ∅
Algorithm
Let U and V be the two sets in the bipartition of G, and let the matching from U to V at any time be represented as
the set M.
The algorithm is run in phases. Each phase consists of the following steps.
A breadth-first search partitions the vertices of the graph into layers. The free vertices in U are used as the starting
vertices of this search, and form the first layer of the partition. At the first level of the search, only unmatched
edges may be traversed (since the free vertices in U are by definition not adjacent to any matched edges); at
subsequent levels of the search, the traversed edges are required to alternate between unmatched and matched.
That is, when searching for successors from a vertex in U, only unmatched edges may be traversed, while from a
vertex in V only matched edges may be traversed. The search terminates at the first layer k where one or more free
vertices in V are reached.
All free vertices in V at layer k are collected into a set F. That is, a vertex v is put into F if and only if it ends a
shortest augmenting path.
The algorithm finds a maximal set of vertex disjoint augmenting paths of length k. This set may be computed by
depth first search from F to the free vertices in U, using the breadth first layering to guide the search: the depth
first search is only allowed to follow edges that lead to an unused vertex in the previous layer, and paths in the
depth first search tree must alternate between unmatched and matched edges. Once an augmenting path is found
that involves one of the vertices in F, the depth first search is continued from the next starting vertex.
Every one of the paths found in this way is used to enlarge M.
The algorithm terminates when no more augmenting paths are found in the breadth first search part of one of the
phases.
Analysis
Each phase consists of a single breadth first search and a single depth first search. Thus, a single phase may be implemented in linear time. Therefore, the first √n phases, in a graph with n vertices and m edges, take time O(m√n).
It can be shown that each phase increases the length of the shortest augmenting path by at least one: the phase finds a maximal set of augmenting paths of the given length, so any remaining augmenting path must be longer. Therefore, once the initial √n phases of the algorithm are complete, the shortest remaining augmenting path has at least √n edges in it. However, the symmetric difference of the eventual optimal matching and of the partial matching M found by the initial phases forms a collection of vertex-disjoint augmenting paths and alternating cycles. If each of the paths in this collection has length at least √n, there can be at most √n paths in the collection, and the size of the optimal matching can differ from the size of M by at most √n edges. Since each phase of the algorithm increases the size of the matching by at least one, there can be at most √n additional phases before the algorithm terminates.
Since the algorithm performs a total of at most 2√n phases, it takes a total time of O(m√n) in the worst case.
Non-bipartite graphs
The same idea of finding a maximal set of shortest augmenting paths works also for finding maximum cardinality
matchings in non-bipartite graphs, and for the same reasons the algorithms based on this idea take O(√n) phases.
However, for non-bipartite graphs, the task of finding the augmenting paths within each phase is more difficult.
Building on the work of several slower predecessors, Micali & Vazirani (1980) showed how to implement a phase in
linear time, resulting in a non-bipartite matching algorithm with the same time bound as the HopcroftKarp
algorithm for bipartite graphs. The MicaliVazirani technique is complex, and its authors did not provide full proofs
of their results; alternative methods for this problem were later described by other authors.[3]
Pseudocode
/*
 G = G1 ∪ G2 ∪ {NIL}
 where G1 and G2 are the two sides of the bipartition and NIL is a special null vertex
*/
function BFS ()
    for each v in G1
        if Pair_G1[v] == NIL
            Dist[v] = 0
            Enqueue(Q, v)
        else
            Dist[v] = ∞
    Dist[NIL] = ∞
    while Empty(Q) == false
        v = Dequeue(Q)
        if Dist[v] < Dist[NIL]
            for each u in Adj[v]
                if Dist[ Pair_G2[u] ] == ∞
                    Dist[ Pair_G2[u] ] = Dist[v] + 1
                    Enqueue(Q, Pair_G2[u])
    return Dist[NIL] != ∞
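For concreteness, the BFS pseudocode above can be completed into a self-contained Python sketch of the full phase structure (BFS layering plus layered DFS). The function name and data layout — adjacency lists from G1 to G2, with Python's None standing in for NIL — are illustrative choices.

```python
from collections import deque

INF = float("inf")

def hopcroft_karp(g1, g2, adj):
    """Maximum matching of a bipartite graph; adj maps each G1 vertex to a list of G2 vertices."""
    pair_g1 = {v: None for v in g1}   # None plays the role of NIL
    pair_g2 = {v: None for v in g2}
    dist = {}

    def bfs():
        # Layer the graph starting from the free vertices of G1.
        q = deque()
        for v in g1:
            if pair_g1[v] is None:
                dist[v] = 0
                q.append(v)
            else:
                dist[v] = INF
        dist[None] = INF
        while q:
            v = q.popleft()
            if dist[v] < dist[None]:
                for u in adj[v]:
                    if dist[pair_g2[u]] == INF:
                        dist[pair_g2[u]] = dist[v] + 1
                        q.append(pair_g2[u])
        return dist[None] != INF      # a free vertex of G2 was reached

    def dfs(v):
        # Follow only edges that advance one BFS layer, so the paths found
        # are vertex-disjoint shortest augmenting paths.
        if v is None:
            return True
        for u in adj[v]:
            if dist[pair_g2[u]] == dist[v] + 1 and dfs(pair_g2[u]):
                pair_g2[u] = v
                pair_g1[v] = u
                return True
        dist[v] = INF                  # dead end: exclude v from this phase
        return False

    matching = 0
    while bfs():                       # one phase per BFS
        for v in g1:
            if pair_g1[v] is None and dfs(v):
                matching += 1
    return matching

print(hopcroft_karp([0, 1, 2], ["a", "b", "c"],
                    {0: ["a", "b"], 1: ["a"], 2: ["b", "c"]}))  # 3
```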
Notes
[1] Ahuja, Magnanti & Orlin (1993), section 12.3, bipartite cardinality matching problem, pp. 469–470.
[2] Chang & McCormick (1990); Darby-Dowman (1980); Setubal (1993); Setubal (1996).
[3] Gabow & Tarjan (1989) and Blum (2001).
References
Ahuja, Ravindra K.; Magnanti, Thomas L.; Orlin, James B. (1993), Network Flows: Theory, Algorithms and
Applications, Prentice-Hall.
Alt, H.; Blum, N.; Mehlhorn, K.; Paul, M. (1991), "Computing a maximum cardinality matching in a bipartite graph in time O(n^1.5 (m / log n)^0.5)", Information Processing Letters 37 (4): 237–240, doi:10.1016/0020-0190(91)90195-N.
Bast, Holger; Mehlhorn, Kurt; Schäfer, Guido; Tamaki, Hisao (2006), "Matching algorithms are fast in sparse random graphs", Theory of Computing Systems 39 (1): 3–14, doi:10.1007/s00224-005-1254-y.
Blum, Norbert (2001), A Simplified Realization of the Hopcroft-Karp Approach to Maximum Matching in
General Graphs (http://theory.cs.uni-bonn.de/ftp/reports/cs-reports/2001/85232-CS.ps.gz), Tech. Rep.
85232-CS, Computer Science Department, Univ. of Bonn.
Chang, S. Frank; McCormick, S. Thomas (1990), A faster implementation of a bipartite cardinality matching
algorithm, Tech. Rep. 90-MSC-005, Faculty of Commerce and Business Administration, Univ. of British
Columbia. As cited by Setubal (1996).
Darby-Dowman, Kenneth (1980), The exploitation of sparsity in large scale linear programming problems
Data structures and restructuring algorithms, Ph.D. thesis, Brunel University. As cited by Setubal (1996).
Edmonds, Jack (1965), "Paths, Trees and Flowers", Canadian J. Math 17: 449–467, doi:10.4153/CJM-1965-045-4, MR0177907.
Augmenting paths
Given G = (V, E) and a matching M of G, a vertex v is exposed if no edge of M is incident with v. A path in G is an alternating path if its edges are alternately not in M and in M (or in M and not in M). An augmenting path P is an alternating path that starts and ends at two distinct exposed vertices. A matching augmentation along an augmenting path P is the operation of replacing M with a new matching M₁ = M ⊕ E(P) = (M ∖ E(P)) ∪ (E(P) ∖ M).
We can prove[5][6] that a matching M is maximum if and only if there is no M-augmenting path in G. Hence, either a
matching is maximum, or it can be augmented. Thus, starting from an initial matching, we can compute a maximum
matching by augmenting the current matching with augmenting paths as long as we can find them, and return
whenever no augmenting paths are left. We can formalize the algorithm as follows:
INPUT:  Graph G, initial matching M on G
OUTPUT: maximum matching M* on G
A1  function find_maximum_matching( G, M ) : M*
A2      P ← find_augmenting_path( G, M )
A3      if P is non-empty then
A4          return find_maximum_matching( G, augment M along P )
A5      else
A6          return M
A7      end if
A8  end function
We still have to describe how augmenting paths can be found efficiently. The subroutine to find them uses blossoms
and contractions.
A blossom is a cycle in G consisting of 2k + 1 edges of which exactly k belong to M, and where one of the vertices v of the cycle (the base) is such that there exists an alternating path of even length from v to an exposed vertex. Given G and a matching M containing a blossom B, the contracted graph G′ = G / B is obtained from G by contracting every edge of B, and the contracted matching M′ is the matching of G′ corresponding to M; the contracted blossom appears in G′ as a single vertex vB.

It can be shown[7] that G′ has an M′-augmenting path if and only if G has an M-augmenting path, and that any M′-augmenting path P′ in G′ can be lifted to an M-augmenting path in G by undoing the contraction by B, so that the segment of P′ (if any) traversing through vB is replaced by an appropriate segment traversing through B. In more detail:
if P′ traverses through a segment u → vB → w in G′, then this segment is replaced with the segment u → ( u′ → … → w′ ) → w in G, where blossom vertices u′ and w′ and the side of B, ( u′ → … → w′ ), going from u′ to w′ are chosen to ensure that the new path is still alternating (u′ is exposed with respect to M ∩ B, and {w′, w} ∈ E ∖ M);
if P′ has an endpoint vB, then the path segment u → vB in G′ is replaced with the segment u → ( u′ → … → v′ ) in G, where blossom vertices u′ and v′ and the side of B, ( u′ → … → v′ ), going from u′ to v′ are chosen to ensure that the path is alternating (v′ is exposed with respect to M ∩ B, and {u′, u} ∈ E ∖ M).
Thus blossoms can be contracted and search performed in the contracted graphs. This reduction is at the heart of
Edmonds's algorithm.
INPUT:  Graph G, matching M on G
OUTPUT: augmenting path P in G, or the empty path if none can be found
B01 function find_augmenting_path( G, M ) : P
B02     F ← empty forest
B03     unmark all vertices and edges in G, mark all edges of M
B05     for each exposed vertex v do
B06         create a singleton tree { v } and add the tree to F
B07     end for
B08     while there is an unmarked vertex v in F with distance( v, root( v ) ) even do
B09         while there exists an unmarked edge e = { v, w } do
B10             if w is not in F then
                    // w is matched, so add e and w's matched edge to F
B11                 x ← vertex matched to w in M
B12                 add edges { v, w } and { w, x } to the tree of v
B13             else
B14                 if distance( w, root( w ) ) is odd then
B15                     do nothing
B16                 else
B17                     if root( v ) ≠ root( w ) then
                            // Report an augmenting path in F ∪ { e }.
B18                         P ← path ( root( v ) → … → v ) → ( w → … → root( w ) )
B19                         return P
B20                     else
                            // Contract a blossom in G and look for the path in the contracted graph.
B21                         B ← blossom formed by e and edges on the path v → w in T
B22                         G′, M′ ← contract G and M by B
B23                         P′ ← find_augmenting_path( G′, M′ )
B24                         P ← lift P′ to G
B25                         return P
B26                     end if
B27                 end if
B28             end if
B29             mark edge e
B30         end while
B31         mark vertex v
B32     end while
B33     return empty path
B34 end function
Examples
The following four figures illustrate the execution of the algorithm. We use dashed lines to indicate edges that are
currently not present in the forest. First, the algorithm processes an out-of-forest edge that causes the expansion of
the current forest (lines B10 – B12).
Next, it detects a blossom and contracts the graph (lines B20 – B22).
Finally, it locates an augmenting path P′ in the contracted graph (line B23) and lifts it to the original graph (line B24). Note that the ability of the algorithm to contract blossoms is crucial here; the algorithm cannot find P in the original graph directly, because only out-of-forest edges between vertices at even distances from the roots are considered on line B17 of the algorithm.
Analysis
The forest F constructed by the find_augmenting_path() function is an alternating forest.[8]
a tree T in G is an alternating tree with respect to M, if
T contains exactly one exposed vertex r called the tree root
every vertex at an odd distance from the root has exactly two incident edges in T, and
all paths from r to leaves in T have even lengths, their odd edges are not in M and their even edges are in M.
a forest F in G is an alternating forest with respect to M, if
its connected components are alternating trees, and
every exposed vertex in G is a root of an alternating tree in F.
Each iteration of the loop starting at line B09 either adds to a tree T in F (line B10) or finds an augmenting path (line B17) or finds a blossom (line B21). It is easy to see that the running time is O(|E| |V|²). Micali and Vazirani[9] show an algorithm that constructs a maximum matching in O(|E| √|V|) time.
Bipartite matching
The algorithm reduces to the standard algorithm for matching in bipartite graphs[6] when G is bipartite. As there are
no odd cycles in G in that case, blossoms will never be found and one can simply remove lines B21 – B29 of the
algorithm.
Weighted matching
The matching problem can be generalized by assigning weights to edges in G and asking for a set M that produces a
matching of maximum (minimum) total weight. The weighted matching problem can be solved by a combinatorial
algorithm that uses the unweighted Edmonds's algorithm as a subroutine.[5] Kolmogorov provides an efficient C++
implementation of this.[10]
References
[1] Edmonds, Jack (1991), "A glimpse of heaven", in J. K. Lenstra, A. H. G. Rinnooy Kan, A. Schrijver, ed., History of Mathematical Programming – A Collection of Personal Reminiscences, CWI, Amsterdam and North-Holland, Amsterdam, pp. 32–54
[2] Edmonds, Jack (1965). "Paths, trees, and flowers". Canad. J. Math. 17: 449–467. doi:10.4153/CJM-1965-045-4.
[3] Edmonds, Jack (1965). "Maximum matching and a polyhedron with 0,1-vertices". Journal of Research of the National Bureau of Standards Section B 69: 125–130.
[4] Schrijver, Alexander. Combinatorial Optimization: Polyhedra and Efficiency. Algorithms and Combinatorics. 24. Springer.
[5] Lovász, László; Plummer, Michael (1986). Matching Theory. Akadémiai Kiadó. ISBN 963-05-4168-8.
[6] Karp, Richard, "Edmonds's Non-Bipartite Matching Algorithm" (http://www.cs.berkeley.edu/~karp/greatalgo/lecture05.pdf), Course Notes, U. C. Berkeley.
[7] Tarjan, Robert, "Sketchy Notes on Edmonds' Incredible Shrinking Blossom Algorithm for General Matching" (http://www.cs.dartmouth.edu/~ac/Teach/CS105-Winter05/Handouts/tarjan-blossom.pdf), Course Notes, Department of Computer Science, Princeton University.
[8] Kenyon, Claire; Lovász, László, "Algorithmic Discrete Mathematics", Technical Report CS-TR-251-90, Department of Computer Science, Princeton University.
[9] Micali, Silvio; Vazirani, Vijay (1980), "An O(V^1/2 E) algorithm for finding maximum matching in general graphs", 21st Annual Symposium on Foundations of Computer Science, IEEE Computer Society Press, New York: 17–27.
[10] Kolmogorov, Vladimir (2009), "Blossom V: A new implementation of a minimum cost perfect matching algorithm" (http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/papers/BLOSSOM5.html), Mathematical Programming Computation 1 (1): 43–67.
Assignment problem
The assignment problem is one of the fundamental combinatorial optimization problems in the branch of
optimization or operations research in mathematics. It consists of finding a maximum weight matching in a weighted
bipartite graph.
In its most general form, the problem is as follows:
There are a number of agents and a number of tasks. Any agent can be assigned to perform any task, incurring
some cost that may vary depending on the agent-task assignment. It is required to perform all tasks by
assigning exactly one agent to each task in such a way that the total cost of the assignment is minimized.
If the numbers of agents and tasks are equal and the total cost of the assignment for all tasks is equal to the sum of
the costs for each agent (or the sum of the costs for each task, which is the same thing in this case), then the problem
is called the linear assignment problem. Commonly, when speaking of the assignment problem without any additional qualification, the linear assignment problem is meant.
Example
Suppose that a taxi firm has three taxis (the agents) available, and three customers (the tasks) wishing to be picked
up as soon as possible. The firm prides itself on speedy pickups, so for each taxi the "cost" of picking up a particular
customer will depend on the time taken for the taxi to reach the pickup point. The solution to the assignment problem
will be whichever combination of taxis and customers results in the least total cost.
However, the assignment problem can be made rather more flexible than it first appears. In the above example,
suppose that there are four taxis available, but still only three customers. Then a fourth dummy task can be invented,
perhaps called "sitting still doing nothing", with a cost of 0 for the taxi assigned to it. The assignment problem can
then be solved in the usual way and still give the best solution to the problem.
Similar tricks can be played in order to allow more tasks than agents, tasks to which multiple agents must be
assigned (for instance, a group of more customers than will fit in one taxi), or maximizing profit rather than
minimizing cost.
In the formal definition, we are given two sets, A and T, of equal size, together with a weight function C : A × T → ℝ. The goal is to find a bijection f : A → T such that the cost function

  Σ_{a ∈ A} C(a, f(a))

is minimized.
Usually the weight function is viewed as a square real-valued matrix C, so that the cost function is written down as

  Σ_{a ∈ A} C_{a, f(a)}.

The problem is "linear" because the cost function to be optimized as well as all the constraints contain only linear terms.
The problem can be expressed as a standard linear program with the objective function

  minimize Σ_{i ∈ A, j ∈ T} C(i, j) x_{ij}

subject to the constraints

  Σ_{j ∈ T} x_{ij} = 1 for each i ∈ A,
  Σ_{i ∈ A} x_{ij} = 1 for each j ∈ T,
  x_{ij} ≥ 0 for all i, j.

The variable x_{ij} represents the assignment of agent i to task j, taking value 1 if the assignment is done and 0 otherwise. This formulation also allows fractional variable values, but there is always an optimal solution where the variables take integer values. This is because the constraint matrix is totally unimodular. The first constraint requires that every agent is assigned to exactly one task, and the second constraint requires that every task is assigned exactly one agent.
Further reading
Burkard, Rainer; M. Dell'Amico, S. Martello (2012). Assignment Problems (Revised reprint). SIAM.
ISBN978-1-61197-222-1.
Hungarian algorithm
The Hungarian method is a combinatorial optimization algorithm that solves the assignment problem in polynomial time and which anticipated later primal-dual methods. It was developed and published in 1955 by Harold Kuhn, who gave it the name "Hungarian method" because the algorithm was largely based on the earlier works of two Hungarian mathematicians, Dénes Kőnig and Jenő Egerváry. James Munkres reviewed the algorithm in 1957 and observed that it is (strongly) polynomial; since then it has also been known as the Kuhn–Munkres algorithm. The original algorithm ran in O(n⁴) time, which was later improved to O(n³), and Ford and Fulkerson extended the method to general transportation problems. In 2006, it was discovered that Carl Gustav Jacobi had solved the assignment problem in the 19th century, and the solution had been published posthumously in 1890 in Latin.[1]
Layman's explanation
Say you have three workers: Jim, Steve and Alan. You need to have one of them clean the bathroom, another sweep the floors, and the third wash the windows. What's the best (minimum-cost) way to assign the jobs? First we need a matrix of the costs of the workers doing the jobs.
        Clean bathroom   Sweep floors   Wash windows
Jim     $1               $2             $3
Steve   $3               $3             $3
Alan    $3               $2             $3
Then the Hungarian algorithm, when applied to the above table, would give the minimum-cost assignment: Jim cleans the bathroom ($1), Alan sweeps the floors ($2), and Steve washes the windows ($3), for a total cost of $6.
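Because the instance is tiny, the optimal assignment can be checked by exhaustive search. The following stdlib-only sketch (function and variable names are illustrative) tries all 3! possible assignments of workers to jobs.

```python
from itertools import permutations

# Costs from the table above: rows = Jim, Steve, Alan;
# columns = clean bathroom, sweep floors, wash windows.
cost = [[1, 2, 3],
        [3, 3, 3],
        [3, 2, 3]]

def brute_force_assignment(cost):
    """Try every one-to-one assignment of workers to jobs and keep the cheapest."""
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):  # perm[i] = job given to worker i
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return best_perm, best_total

perm, total = brute_force_assignment(cost)
print(perm, total)  # (0, 2, 1) 6 : Jim→bathroom, Steve→windows, Alan→floors
```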
Setting
We are given a nonnegative nn matrix, where the element in the i-th row and j-th column represents the cost of
assigning the j-th job to the i-th worker. We have to find an assignment of the jobs to the workers that has minimum
cost. If the goal is to find the assignment that yields the maximum cost, the problem can be altered to fit the setting by replacing each cost with the maximum cost minus that cost.[2]
The algorithm is easier to describe if we formulate the problem using a bipartite graph. We have a complete bipartite
graph G=(S, T; E) with n worker vertices (S) and n job vertices (T), and each edge has a nonnegative cost c(i,j). We
want to find a perfect matching with minimum cost.
Let us call a function y : (S ∪ T) → ℝ a potential if y(i) + y(j) ≤ c(i, j) for each i ∈ S, j ∈ T. The value of a potential y is Σ_{v ∈ S ∪ T} y(v). It can be seen that the cost of each perfect matching is at least the value of each potential. The Hungarian method finds a perfect matching and a potential with equal cost/value, which proves the optimality of both. In fact it finds a perfect matching of tight edges: an edge ij is called tight for a potential y if y(i) + y(j) = c(i, j). Let us denote the subgraph of tight edges by G_y. The cost of a perfect matching in G_y (if there is one) equals the value of y.
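The lower-bound property of potentials can be illustrated numerically; the cost matrix and the particular feasible potential below are illustrative choices, not from the article.

```python
from itertools import permutations

# Verify, for an example cost matrix, that the value of a feasible
# potential never exceeds the cost of any perfect matching.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]
n = len(c)

# A simple feasible potential: y(i) = row minimum for i in S, y(j) = 0 for j in T.
y_s = [min(row) for row in c]
y_t = [0] * n
assert all(y_s[i] + y_t[j] <= c[i][j] for i in range(n) for j in range(n))

value = sum(y_s) + sum(y_t)
matching_costs = [sum(c[i][p[i]] for i in range(n)) for p in permutations(range(n))]
print(value, min(matching_costs))  # 3 5 — the potential lower-bounds every matching
assert all(value <= m for m in matching_costs)
```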
The algorithm in terms of bipartite graphs

During the algorithm we maintain a potential y and an orientation of G_y (the subgraph of tight edges) which has the property that the edges oriented from T to S form a matching M. Initially, y is 0 everywhere, and all edges are oriented from S to T (so M is empty). In each step, either we modify y so that its value increases, or modify the orientation to obtain a matching with more edges. We maintain the invariant that all the edges of M are tight. We are done if M is a perfect matching.

In a general step, let R_S ⊆ S and R_T ⊆ T be the vertices not covered by M (so R_S consists of the vertices in S with no incoming edge and R_T consists of the vertices in T with no outgoing edge). Let Z be the set of vertices reachable in the oriented G_y from R_S by a directed path only following edges that are tight. This can be computed by breadth-first search.

If R_T ∩ Z is nonempty, then reverse the orientation of a directed path in G_y from R_S to R_T ∩ Z. Thus the size of the corresponding matching increases by 1.

If R_T ∩ Z is empty, then let

  Δ := min { c(i, j) − y(i) − y(j) : i ∈ Z ∩ S, j ∈ T ∖ Z }.

Δ is positive because there are no tight edges between Z ∩ S and T ∖ Z. Increase y by Δ on the vertices of Z ∩ S and decrease y by Δ on the vertices of Z ∩ T; the resulting y is still a potential, and all edges of M remain tight. We orient the new tight edges from S to T. By the definition of Δ, the set Z of vertices reachable from R_S increases (note that the number of tight edges does not necessarily increase).

We repeat these steps until M is a perfect matching, in which case it gives a minimum cost assignment. The running time of this version of the method is O(n⁴): M is augmented n times, and in a phase where M is unchanged, there are at most n potential changes (since Z increases every time). The time needed for a potential change is O(n²).
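As an illustrative sketch, the potential-maintaining idea also admits a compact O(n³) implementation. The version below follows a well-known array-based formulation rather than the exact bookkeeping described above; the arrays u and v play the role of the potential y, and all names are illustrative.

```python
def hungarian(cost):
    """Minimum-cost perfect assignment on an n x n cost matrix.

    A compact O(n^3) primal-dual ("potential") formulation: u, v are
    potentials on rows and columns, and augmenting paths use only tight
    edges implicitly, via the reduced costs cost[i][j] - u[i] - v[j].
    Returns (assignment, total) where assignment[i] is the column for row i.
    """
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)          # potential on rows (1-indexed)
    v = [0] * (n + 1)          # potential on columns
    p = [0] * (n + 1)          # p[j] = row currently matched to column j
    way = [0] * (n + 1)        # back-pointers for the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, -1
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta   # the potential change of the text
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:             # reached a free column: augment
                break
        while j0:                      # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total

print(hungarian([[1, 2, 3],
                 [3, 3, 3],
                 [3, 2, 3]]))  # the taxi/worker example: total cost 6
```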
Matrix interpretation
Given workers and tasks, and an nn matrix containing the cost of assigning each worker to a task, find the cost
minimizing assignment.
First the problem is written in the form of a matrix as given below

       Task 1  Task 2  Task 3  Task 4
  a    a1      a2      a3      a4
  b    b1      b2      b3      b4
  c    c1      c2      c3      c4
  d    d1      d2      d3      d4

where a, b, c and d are the workers who have to perform tasks 1, 2, 3 and 4. a1, a2, a3, a4 denote the penalties incurred when worker "a" does task 1, 2, 3, 4 respectively. The same holds true for the other symbols as well. The matrix is square, so each worker can perform only one task.
Step 1
Then we perform row operations on the matrix. To do this, the lowest of all ai (i belonging to 1-4) is taken and is
subtracted from each element in that row. This will lead to at least one zero in that row (We get multiple zeros when
there are two equal elements which also happen to be the lowest in that row). This procedure is repeated for all rows.
We now have a matrix with at least one zero per row. Now we try to assign tasks to agents such that each agent is
doing only one task and the penalty incurred in each case is zero. This is illustrated below.
  0    a2'  0'   a4'
  0'   b2'  b3'  b4'
  c1'  0'   c3'  c4'
  d1'  d2'  d3'  0'

The zeros that are indicated as 0' are the assigned tasks.
Step 2
Sometimes it may turn out that the matrix at this stage cannot be used for assigning, as is the case for the matrix below.

  0    a2'  a3'  a4'
  b1'  0    b3'  b4'
  0    c2'  c3'  c4'
  d1'  0    d3'  d4'
In the above case, no assignment can be made. Note that task 1 is done efficiently by both agent a and c. Both can't
be assigned the same task. Also note that no one does task 3 efficiently. To overcome this, we repeat the above
procedure for all columns (i.e. the minimum element in each column is subtracted from all the elements in that
column) and then check if an assignment is possible.
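Steps 1 and 2 (row reduction, then column reduction) can be sketched directly; the example matrix is an illustrative choice, not from the article.

```python
def reduce_rows_then_columns(matrix):
    """Row-reduce then column-reduce, so every row and column gains a zero."""
    # Step 1: subtract each row's minimum from that row.
    reduced = [[x - min(row) for x in row] for row in matrix]
    # Step 2: subtract each column's minimum from that column.
    col_mins = [min(col) for col in zip(*reduced)]
    return [[x - m for x, m in zip(row, col_mins)] for row in reduced]

m = reduce_rows_then_columns([[4, 1, 3],
                              [2, 0, 5],
                              [3, 2, 2]])
print(m)  # [[2, 0, 2], [1, 0, 5], [0, 0, 0]]
# Every row and every column now contains at least one zero.
assert all(0 in row for row in m) and all(0 in col for col in zip(*m))
```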
Step 3
In most situations this will give the result, but if it is still not possible to assign then all zeros in the matrix must be
covered by marking as few rows and/or columns as possible. The following procedure is one way to accomplish this:
Initially assign as many tasks as possible, then do the following (assign tasks in rows 2, 3 and 4):

  0    a2'  a3'  a4'
  b1'  b2'  0'   b4'
  0'   c2'  c3'  c4'
  d1'  0'   d3'  d4'
Mark all rows having no assignments (row 1). Then mark all columns having zeros in marked row(s) (column 1).
Then mark all rows having assignments in marked columns (row 3). Repeat this till a closed loop is obtained.
  ×
  0    a2'  a3'  a4'   ×
  b1'  b2'  0'   b4'
  0'   c2'  c3'  c4'   ×
  d1'  0'   d3'  d4'

(Column 1 and rows 1 and 3 are marked.)
Now draw lines through all marked columns and unmarked rows.
  0    a2'  a3'  a4'
  b1'  b2'  0'   b4'
  0'   c2'  c3'  c4'
  d1'  0'   d3'  d4'

(Lines are drawn through the marked column 1 and through the unmarked rows 2 and 4.)
The aforementioned detailed description is just one way to draw the minimum number of lines to cover all the 0's.
Other methods work as well.
Step 4
From the elements that are left, find the lowest value. Subtract this from every unmarked element and add it to every
element covered by two lines.
Repeat the procedure (steps 14) until an assignment is possible; this is when the minimum number of lines used to
cover all the 0's is equal to the max(number of people, number of assignments), assuming dummy variables (usually
the max cost) are used to fill in when the number of people is greater than the number of assignments.
In essence, this step finds the second-minimum cost among the relevant rows. The procedure is repeated until you are able to distinguish among the workers in terms of least cost.
Bibliography
R.E. Burkard, M. Dell'Amico, S. Martello: Assignment Problems (Revised reprint). SIAM, Philadelphia (PA.)
2012. ISBN 978-1-61197-222-1
Harold W. Kuhn, "The Hungarian Method for the assignment problem", Naval Research Logistics Quarterly,
2:8397, 1955. Kuhn's original publication.
Harold W. Kuhn, "Variants of the Hungarian method for assignment problems", Naval Research Logistics
Quarterly, 3: 253258, 1956.
J. Munkres, "Algorithms for the Assignment and Transportation Problems", Journal of the Society for Industrial
and Applied Mathematics, 5(1):3238, 1957 March.
M. Fischetti, "Lezioni di Ricerca Operativa", Edizioni Libreria Progetto Padova, Italia, 1995.
R. Ahuja, T. Magnanti, J. Orlin, "Network Flows", Prentice Hall, 1993.
References
[1] http:/ / www. lix. polytechnique. fr/ ~ollivier/ JACOBI/ jacobiEngl. htm
[2] Beryl Castello, The Hungarian Algorithm (http:/ / www. ams. jhu. edu/ ~castello/ 362/ Handouts/ hungarian. pdf)
External links
Mordecai J. Golin, Bipartite Matching and the Hungarian Method (http://www.cse.ust.hk/~golin/COMP572/
Notes/Matching.pdf), Course Notes, Hong Kong University of Science and Technology.
R. A. Pilgrim, Munkres' Assignment Algorithm. Modified for Rectangular Matrices (http://csclab.murraystate.
edu/bob.pilgrim/445/munkres.html), Course notes, Murray State University.
Mike Dawes, The Optimal Assignment Problem (http://www.math.uwo.ca/~mdawes/courses/344/
kuhn-munkres.pdf), Course notes, University of Western Ontario.
On Kuhn's Hungarian Method A tribute from Hungary (http://www.cs.elte.hu/egres/tr/egres-04-14.pdf),
Andrs Frank, Egervary Research Group, Pazmany P. setany 1/C, H1117, Budapest, Hungary.
Lecture: Fundamentals of Operations Research - Assignment Problem - Hungarian Algorithm (https://www.
youtube.com/watch?v=BUGIhEecipE), Prof. G. Srinivasan, Department of Management Studies, IIT Madras.
Implementations
(Note that not all of these satisfy the O(n³) time constraint.)
FKT algorithm
The FKT algorithm, named after Fisher, Kasteleyn, and Temperley, counts the number of perfect matchings in a planar graph in polynomial time.

History
The problem of counting planar perfect matchings has its roots in statistical mechanics and chemistry, where the
original question was: If diatomic molecules are adsorbed on a surface, forming a single layer, how many ways can
they be arranged?[1] The partition function is an important quantity that encodes the statistical properties of a system
at equilibrium and can be used to answer the previous question. However, trying to compute the partition function
from its definition is not practical. Thus to exactly solve a physical system is to find an alternate form of the partition
function for that particular physical system that is sufficiently simple to calculate exactly.[2] In the early 1960s, the
definition of exactly solvable was not rigorous.[3] Computer science provided a rigorous definition with the introduction of polynomial time, which dates to 1965. Similarly, the notion of not exactly solvable should correspond to #P-hardness, which was defined in 1979.
Another type of physical system to consider is composed of dimers, which is a polymer with two atoms. The dimer
model counts the number of dimer coverings of a graph.[4] Another physical system to consider is the bonding of
H2O molecules in the form of ice. This can be modelled as a directed, 3-regular graph where the orientation of the
edges at each vertex cannot all be the same. How many edge orientations does this model have?
Motivated by physical systems involving dimers, in 1961, Kasteleyn[5] and Temperley-Fisher[6] independently found
the number of domino tilings for the m-by-n rectangle. This is equivalent to counting the number of perfect
matchings for the m-by-n lattice graph. By 1967, Kasteleyn had generalized this result to all planar graphs.[7][8]
Algorithm
Explanation
The main insight is that every non-zero term in the Pfaffian of the adjacency matrix of a graph G corresponds to a perfect matching. Thus, if one can find an orientation of G to align all signs of the terms in the Pfaffian (no matter + or −), then the absolute value of the Pfaffian is just the number of perfect matchings in G. The FKT algorithm does such a task for a planar graph G.
Let G = (V, E) be an undirected graph with adjacency matrix A. Define PM(n) to be the set of partitions of n elements into pairs; then the number of perfect matchings in G is

  PerfMatch(G) = Σ_{M ∈ PM(V)} Π_{{i,j} ∈ M} A_{i,j}.

Closely related is the Pfaffian of a 2k × 2k skew-symmetric matrix B,

  pf(B) = Σ_{M ∈ PM(2k)} sgn(M) Π_{{i,j} ∈ M, i < j} B_{i,j},

where sgn(M) is the sign of the permutation M. A Pfaffian orientation of G is a directed graph H with (1, −1, 0)-adjacency matrix B such that pf(B) = PerfMatch(G).[9] In 1967, Kasteleyn proved that planar graphs have an
efficiently computable Pfaffian orientation. Specifically, for a planar graph G, let H be a directed version of G where
an odd number of edges are oriented clockwise for every face in a planar embedding of G. Then H is a Pfaffian
orientation of G.
Finally, for any skew-symmetric matrix A,

  pf(A)² = det(A),

where det(A) is the determinant of A. Since determinants are efficiently computable, so is PerfMatch(G).
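The identity pf(A)² = det(A) can be exercised on the smallest interesting planar example, a 4-cycle. The orientation below is an illustrative choice satisfying Kasteleyn's odd-clockwise condition for the cycle's single bounded face, and the determinant routine is a plain stdlib sketch.

```python
from fractions import Fraction

# A 4-cycle on vertices 0,1,2,3 has exactly 2 perfect matchings.
# Orient 0->1, 1->2, 2->3, 0->3: three of the four face edges run
# clockwise, an odd number, so this is a Pfaffian orientation.
n = 4
oriented = [(0, 1), (1, 2), (2, 3), (0, 3)]
B = [[0] * n for _ in range(n)]   # skew-symmetric (1, -1, 0) matrix
for i, j in oriented:
    B[i][j] = 1
    B[j][i] = -1

def det(m):
    """Determinant via Gaussian elimination over exact Fractions."""
    m = [[Fraction(x) for x in row] for row in m]
    sign, result = 1, Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:              # row swap flips the sign
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            for c in range(col, len(m)):
                m[r][c] -= factor * m[col][c]
    return sign * result

d = det(B)
print(d, int(d) ** 0.5)  # 4 2.0 — sqrt(det) = |pf| = 2 perfect matchings
```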
High-level description
1. Compute a planar embedding of G
2. Compute a spanning tree T1 of the input graph G
3. Give an arbitrary orientation to each edge in G that is also in T1
4. Use the planar embedding to create an (undirected) graph T2 with the same vertex
set as the dual graph of G
5. Create an edge in T2 between two vertices if their corresponding faces in G share an
edge in G that is not in T1
6. For each leaf v in T2 (that is not also the root)
1. Let e be the lone edge of G in the face corresponding to v that does not yet have
an orientation
2. Give e an orientation such that the number of edges oriented clock-wise is odd
3. Remove v from T2
7. Return the absolute value of the Pfaffian of the (1, −1, 0)-adjacency matrix of G, which is the absolute value of the square root of the determinant
Generalizations
The sum of weighted perfect matchings can also be computed by using the Tutte matrix for the adjacency matrix in
the last step.
Kuratowski's theorem states that
a finite graph is planar if and only if it contains no subgraph homeomorphic to K5 (complete graph on five
vertices) or K3,3 (complete bipartite graph on two partitions of size three).
Vijay Vazirani generalized the FKT algorithm to graphs which do not contain a subgraph homeomorphic to K3,3.[10]
Since counting the number of perfect matchings in a general graph is #P-complete, some restriction on the input
graph is required unless FP, the function version of P, is equal to #P. Counting the number of matchings, which is
known as the Hosoya index, is also #P-complete even for planar graphs.
Applications
The FKT algorithm has seen extensive use in holographic algorithms on planar graphs via matchgates.[3] For
example, consider the planar version of the ice model mentioned above, which has the technical name
#PL-3-NAE-SAT (where NAE stands for "not all equal"). Valiant found a polynomial time algorithm for this
problem which uses matchgates.[11]
References
[1] Hayes, Brian (January–February 2008), "Accidental Algorithms" (http://www.americanscientist.org/issues/pub/accidental-algorithms), American Scientist.
[2] Baxter, R. J. (2008) [1982]. Exactly Solved Models in Statistical Mechanics (http://tpsrv.anu.edu.au/Members/baxter/book) (Third ed.). Dover Publications. p. 11. ISBN 978-0-486-46271-4.
[3] Cai, Jin-Yi; Lu, Pinyan; Xia, Mingji (2010). "Holographic Algorithms with Matchgates Capture Precisely Tractable Planar #CSP". Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on (http://www.egr.unlv.edu/~larmore/FOCS/focs2010/). Las Vegas, NV, USA: IEEE. arXiv:1008.0683.
[4] Kenyon, Richard; Okounkov, Andrei (2005). "What is a Dimer?" (http://www.ams.org/notices/200503/what-is.pdf). AMS 52 (3): 342–343.
[5] Kasteleyn, P. W. (1961), "The statistics of dimers on a lattice. I. The number of dimer arrangements on a quadratic lattice", Physica 27 (12): 1209–1225, doi:10.1016/0031-8914(61)90063-5
[6] Temperley, H. N. V.; Fisher, Michael E. (1961). "Dimer problem in statistical mechanics – an exact result". Philosophical Magazine 6 (68): 1061–1063. doi:10.1080/14786436108243366.
[7] Kasteleyn, P. W. (1963). "Dimer Statistics and Phase Transitions". Journal of Mathematical Physics 4 (2): 287–293. doi:10.1063/1.1703953.
[8] Kasteleyn, P. W. (1967), "Graph theory and crystal physics", in Harary, F., Graph Theory and Theoretical Physics, New York: Academic Press, pp. 43–110
[9] Thomas, Robin (2006). "A survey of Pfaffian orientations of graphs" (http://people.math.gatech.edu/~thomas/PAP/pfafsurv.pdf). International Congress of Mathematicians. III. Zürich: European Mathematical Society. pp. 963–984.
[10] Vazirani, Vijay V. (1988), "NC algorithms for computing the number of perfect matchings in K3,3-free graphs and related problems", Proc. 1st Scandinavian Workshop on Algorithm Theory (SWAT '88), Lecture Notes in Computer Science, 318, Springer-Verlag, pp. 233–242, doi:10.1007/3-540-19487-8_27.
[11] Valiant, Leslie G. (2004). "Holographic Algorithms (Extended Abstract)". Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, FOCS '04 (http://www.cs.brown.edu/~aris/focs04/). Rome, Italy: IEEE Computer Society. pp. 306–315. doi:10.1109/FOCS.2004.34. ISBN 0-7695-2228-9.
External links
Presentation by Ashley Montanaro about the FKT algorithm (http://www.damtp.cam.ac.uk/user/am994/
presentations/matchings.pdf)
More history, information, and examples can be found in chapter 2 and section 5.3.2 of Dmitry Kamenetsky's
PhD thesis (https://digitalcollections.anu.edu.au/bitstream/1885/49338/2/02whole.pdf)
Stable marriage problem
The stable marriage problem (SMP) asks for a matching between n men and n women, each of whom ranks all members of the opposite sex, such that no man and woman would both rather have each other than their current partners.

Solution
In 1962, David Gale and Lloyd Shapley proved that, for any equal
number of men and women, it is always possible to solve the SMP and
make all marriages stable. They presented an algorithm to do so.[2][3]
The Gale-Shapley algorithm involves a number of "rounds" (or "iterations"). In the first round, first a) each unengaged man proposes to the woman he prefers most, and then b) each woman replies "maybe" to the suitor she most prefers and "no" to all other suitors; she is then provisionally "engaged" to the suitor she most prefers so far, and that suitor is likewise provisionally engaged to her. In each subsequent round, first a) each unengaged man proposes to the most-preferred woman to whom he has not yet proposed (regardless of whether the woman is already engaged), and then b) each woman replies "maybe" to the suitor she most prefers (whether her existing provisional partner or someone else) and rejects the rest (again, perhaps including her current provisional partner). The provisional nature of engagements preserves the right of an already-engaged woman to "trade up" (and, in the process, to "jilt" her until-then partner).

[Figure: Animation showing an example of the Gale-Shapley algorithm]
This algorithm guarantees that:
Everyone gets married
Once a woman becomes engaged, she is always engaged to someone. So, at the end, there cannot be a man and a woman both unengaged: he must have proposed to her at some point (since a man will eventually propose to everyone, if necessary) and, being unengaged at the time, she would have had to say yes.
The marriages are stable
Let Alice be a woman and Bob be a man who are both engaged, but not to each other. Upon completion of the
algorithm, it is not possible for both Alice and Bob to prefer each other over their current partners. If Bob
prefers Alice to his current partner, he must have proposed to Alice before he proposed to his current partner.
If Alice accepted his proposal, yet is not married to him at the end, she must have dumped him for someone
she likes more, and therefore doesn't like Bob more than her current partner. If Alice rejected his proposal, she
was already with someone she liked more than Bob.
Algorithm
function stableMatching {
    Initialize all m ∈ M and w ∈ W to free
    while ∃ free man m who still has a woman w to propose to {
        w = m's highest ranked woman to whom he has not yet proposed
        if w is free
            (m, w) become engaged
        else some pair (m', w) already exists
            if w prefers m to m'
                (m, w) become engaged
                m' becomes free
            else
                (m', w) remain engaged
    }
}
B: ZYX
C: XZY
X: BAC
Y: CBA
Z: ACB
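The pseudocode can be sketched in Python. The data below reuses the surviving preference lists for B, C, X, Y and Z; A's list did not survive the extraction, so the one used here is an arbitrary assumption for illustration.

```python
def stable_matching(men_prefs, women_prefs):
    """Gale-Shapley: men propose in rounds; returns {woman: man} engagements."""
    # rank[w][m]: position of man m on woman w's list (lower is better)
    rank = {w: {m: i for i, m in enumerate(lst)}
            for w, lst in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}   # next woman on m's list
    engaged = {}                              # woman -> man
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]      # best woman not yet proposed to
        next_choice[m] += 1
        if w not in engaged:
            engaged[w] = m                    # w was free: provisional engagement
        elif rank[w][m] < rank[w][engaged[w]]:
            free_men.append(engaged[w])       # w trades up; old partner is jilted
            engaged[w] = m
        else:
            free_men.append(m)                # w says "no": m proposes again later
    return engaged

men = {'A': ['Y', 'X', 'Z'], 'B': ['Z', 'Y', 'X'], 'C': ['X', 'Z', 'Y']}
women = {'X': ['B', 'A', 'C'], 'Y': ['C', 'B', 'A'], 'Z': ['A', 'C', 'B']}
print(stable_matching(men, women))
```

Since men propose, the result is the man-optimal stable matching and is independent of the order in which free men are processed.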
Similar problems
The weighted matching problem seeks to find a matching in a weighted bipartite graph that has maximum weight.
Maximum weighted matchings do not have to be stable, but in some applications a maximum weighted matching is
better than a stable one.
The stable roommates problem is similar to the stable marriage problem, but differs in that all participants belong
to a single pool (instead of being divided into equal numbers of "men" and "women").
The hospitals/residents problem, also known as the college admissions problem, differs from the stable marriage problem in that the "women" can accept "proposals" from more than one "man" (e.g., a hospital can take multiple residents, or a college can take an incoming class of more than one student). Algorithms to solve the hospitals/residents problem can be hospital-oriented (female-optimal) or resident-oriented (male-optimal).
The hospitals/residents problem with couples allows the set of residents to include couples who must be assigned
together, either to the same hospital or to a specific pair of hospitals chosen by the couple (e.g., a married couple
want to ensure that they will stay together and not be stuck in programs that are far away from each other). The
addition of couples to the hospitals/residents problem renders the problem NP-complete.[4]
References
[1] Stable Matching Algorithms (http://www.dcs.gla.ac.uk/research/algorithms/stable/)
[2] D. Gale and L. S. Shapley: "College Admissions and the Stability of Marriage", American Mathematical Monthly 69, 9–14, 1962.
[3] Harry Mairson: "The Stable Marriage Problem", The Brandeis Review 12, 1992 (online: http://www1.cs.columbia.edu/~evs/intro/stable/writeup.html).
[4] D. Gusfield and R. W. Irving, The Stable Marriage Problem: Structure and Algorithms, p. 54; MIT Press, 1989.
Stable roommates problem

Solution
Unlike the stable marriage problem, the stable roommates problem may not, in general, have a solution. For a minimal counterexample, consider four people A, B, C and D whose rankings are:
A:(B,C,D), B:(C,A,D), C:(A,B,D), D:(A,B,C)
In this ranking, each of A, B and C is the most preferred choice of someone. In any solution, one of A, B and C must be paired with D and the other two with each other (for example AD and BC), yet D's partner and the person who ranks D's partner first would each prefer to be with each other: in the AD/BC example, A and C both prefer the pairing AC.
Algorithm
An efficient algorithm was given in (Irving 1985). The algorithm will determine, for any instance of the problem,
whether a stable matching exists, and if so, will find such a matching.
Irving's algorithm has O(n²) complexity, provided suitable data structures are used to facilitate manipulation of the preference lists and identification of rotations (see below).
The algorithm consists of two phases. In the first phase, participants propose to each other, in a manner similar to that of the Gale-Shapley algorithm for the stable marriage problem. Participants propose to each person on their preference list, in order, continuing to the next person if and when their current proposal is rejected. A participant rejects a proposal if he already holds, or subsequently receives, a proposal from someone he prefers. In this first phase, one participant might be rejected by all of the others, an indicator that no stable matching is possible. Otherwise, Phase 1 ends with each person holding a proposal from one of the others; this situation can be represented as a set S of ordered pairs of the form (p,q), where q holds a proposal from p, and we say that q is p's current favorite. In the case that this set represents a matching, i.e., (q,p) is in S whenever (p,q) is, the algorithm terminates with this matching, which is bound to be stable.
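Phase 1 can be sketched in Python; the dict-of-preference-lists representation is an assumption for illustration. Running it on the six-participant instance in the example section reproduces the set S given there.

```python
def phase1(prefs):
    """Phase 1 of Irving's algorithm.  prefs maps each participant to an
    ordered preference list.  Returns {q: p, ...} meaning q holds a proposal
    from p, or None when some participant is rejected by everyone
    (in which case no stable matching exists)."""
    rank = {p: {q: i for i, q in enumerate(lst)} for p, lst in prefs.items()}
    nxt = {p: 0 for p in prefs}          # next position on p's list to try
    holds = {}                           # holds[q] = p: q holds p's proposal
    free = list(prefs)
    while free:
        p = free.pop(0)
        if nxt[p] == len(prefs[p]):
            return None                  # p was rejected by all others
        q = prefs[p][nxt[p]]
        nxt[p] += 1
        if q not in holds:
            holds[q] = p                 # q holds its first proposal
        elif rank[q][p] < rank[q][holds[q]]:
            free.append(holds[q])        # q prefers p: previous proposer rejected
            holds[q] = p
        else:
            free.append(p)               # q rejects p outright
    return holds
```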
Example
The following are the preference lists for a Stable Roommates instance involving 6 participants p1, p2, p3, p4, p5, p6.
p1 : p3 p4 p2 p6 p5
p2 : p6 p5 p4 p1 p3
p3 : p2 p4 p5 p1 p6
p4 : p5 p2 p3 p6 p1
p5 : p3 p1 p2 p4 p6
p6 : p5 p1 p3 p4 p2
A possible execution of Phase 1 consists of the following sequence of proposals and rejections, where p → q means "p proposes to q":

p1 → p3
p2 → p6
p3 → p2
p4 → p5
p5 → p3; p3 rejects p1
p1 → p4
p6 → p5; p5 rejects p6
p6 → p1
So Phase 1 ends with the set S = {(p1,p4), (p2,p6), (p3,p2), (p4,p5), (p5,p3), (p6,p1)}.
In Phase 2, the rotation r1 = (p1,p4), (p3,p2) is first identified. This is because p2 is p1's second favorite, and p4 is the
second favorite of p3. Applying r1 gives the new set S = {(p1,p2), (p2,p6), (p3,p4), (p4,p5), (p5,p3), (p6,p1)}. Next, the
rotation r2 = (p1,p2), (p2,p6), (p4,p5) is identified, and application of r2 gives S = {(p1,p6), (p2,p5), (p3,p4), (p4,p2),
(p5,p3), (p6,p1)}. Finally, the rotation r3 = (p2,p5), (p3,p4) is identified, application of which gives S = {(p1,p6),
(p2,p4), (p3,p5), (p4,p2), (p5,p3), (p6,p1)}. This is a matching, and is guaranteed to be stable.
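That the final matching has no blocking pair can be checked mechanically. A small checker, with the matching given as a partner dict (an assumed representation for illustration):

```python
def is_stable(prefs, match):
    """Return True if no two people each prefer the other to their
    assigned partner (i.e., there is no blocking pair)."""
    rank = {p: {q: i for i, q in enumerate(lst)} for p, lst in prefs.items()}
    people = list(prefs)
    for a in people:
        for b in people:
            if a == b or b == match[a]:
                continue
            # a and b block the matching if each ranks the other above their partner
            if rank[a][b] < rank[a][match[a]] and rank[b][a] < rank[b][match[b]]:
                return False
    return True

prefs = {'p1': ['p3', 'p4', 'p2', 'p6', 'p5'],
         'p2': ['p6', 'p5', 'p4', 'p1', 'p3'],
         'p3': ['p2', 'p4', 'p5', 'p1', 'p6'],
         'p4': ['p5', 'p2', 'p3', 'p6', 'p1'],
         'p5': ['p3', 'p1', 'p2', 'p4', 'p6'],
         'p6': ['p5', 'p1', 'p3', 'p4', 'p2']}
# the matching produced at the end of Phase 2 in the example above
match = {'p1': 'p6', 'p6': 'p1', 'p2': 'p4', 'p4': 'p2', 'p3': 'p5', 'p5': 'p3'}
print(is_stable(prefs, match))   # prints True
```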
References
Irving, Robert W. (1985), "An efficient algorithm for the "stable roommates" problem", Journal of Algorithms 6 (4): 577–595, doi:10.1016/0196-6774(85)90033-1.
Irving, Robert W.; Manlove, David F. (2002), "The Stable Roommates Problem with Ties" (http://eprints.gla.ac.uk/11/01/SRT.pdf), Journal of Algorithms 43 (1): 85–105, doi:10.1006/jagm.2002.1219.
Permanent
In linear algebra, the permanent of a square matrix is a function of the matrix similar to the determinant. The permanent, as well as the determinant, is a polynomial in the entries of the matrix. Both permanent and determinant are special cases of a more general function of a matrix called the immanant.
Definition
The permanent of an n-by-n matrix A = (a_{i,j}) is defined as

    perm(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}

The sum here extends over all elements of the symmetric group S_n, i.e. over all permutations σ of the numbers 1, 2, ..., n.

For example (2×2 matrix),

    perm \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad + bc
The definition of the permanent of A differs from that of the determinant of A in that the signatures of the
permutations are not taken into account. If one views the permanent as a map that takes n vectors as arguments, then
it is a multilinear map and it is symmetric (meaning that any order of the vectors results in the same permanent). A
formula similar to Laplace's for the development of a determinant along a row or column is also valid for the
permanent; all signs have to be ignored for the permanent.
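The definition translates directly into code; a brute-force sketch that expands the sum over all n! permutations:

```python
from itertools import permutations
from math import prod

def perm(A):
    """Permanent by direct expansion: sum over all n! permutations sigma
    of the product A[0][sigma[0]] * ... * A[n-1][sigma[n-1]]."""
    n = len(A)
    return sum(prod(A[i][sigma[i]] for i in range(n))
               for sigma in permutations(range(n)))

print(perm([[1, 2], [3, 4]]))   # ad + bc = 1*4 + 2*3 = 10
```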
The permanent of a matrix A is denoted per A, perm A, or Per A, sometimes with parentheses around the argument. In his monograph, Minc (1984) uses Per(A) for the permanent of rectangular matrices, and uses per(A) when A is a square matrix. Muir (1882) uses a notation of his own. The word "permanent" originated with Cauchy (1812) as "fonctions symétriques permanentes" for a related type of function, and was used by Muir (1882) in the modern, more specific, sense.
Cycle covers
Any square matrix A = (a_{i,j}) can be viewed as the adjacency matrix of a weighted directed graph, with a_{i,j} representing the weight of the arc from vertex i to vertex j. A cycle cover of a weighted directed graph is a collection of vertex-disjoint directed cycles in the digraph that covers all vertices in the graph. Thus, each vertex i in the digraph has a unique "successor" σ(i) in the cycle cover, and σ is a permutation on {1, 2, ..., n}, where n is the number of vertices in the digraph. Conversely, any permutation σ on {1, 2, ..., n} corresponds to a cycle cover in which there is an arc from vertex i to vertex σ(i) for each i. If the weight of a cycle cover is defined to be the product of the weights of the arcs in each cycle, then the permanent

    perm(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}

is equal to the sum of the weights of all cycle covers of the digraph.
Perfect matchings
A square matrix A = (a_{i,j}) can also be viewed as the biadjacency matrix of a bipartite graph which has vertices x_1, x_2, ..., x_n on one side and y_1, y_2, ..., y_n on the other side, with a_{i,j} representing the weight of the edge from vertex x_i to vertex y_j. If the weight of a perfect matching σ that matches x_i to y_{σ(i)} is defined to be the product of the weights of the edges in the matching, then the permanent of A is equal to the sum of the weights of all perfect matchings of the graph.

For an unweighted directed graph, the adjacency matrix has 0-1 entries and each nonzero cycle cover has weight 1. Thus the permanent of a 0-1 matrix is equal to the number of vertex-disjoint cycle covers of the corresponding unweighted directed graph. For an unweighted bipartite graph G, if we set a_{i,j} = 1 if there is an edge between the vertices x_i and y_j and a_{i,j} = 0 otherwise, then each perfect matching has weight 1, and the number of perfect matchings in G is equal to the permanent of the matrix A.[1]
Minimal permanent
Of all doubly stochastic matrices, the uniform matrix, with every entry a_{i,j} = 1/n, has strictly the smallest permanent. This was conjectured by van der Waerden, and proved in the late 1970s independently by Falikman and Egorychev.[2] Egorychev's proof is an application of the Alexandrov-Fenchel inequality.
Computation
The permanent is believed to be more difficult to compute than the determinant. While the determinant can be computed in polynomial time by Gaussian elimination, Gaussian elimination cannot be used to compute the permanent. Moreover, computing the permanent of a 0-1 matrix (a matrix whose entries are 0 or 1) is #P-complete. Thus, if the permanent can be computed in polynomial time by any method, then FP = #P, which is an even stronger statement than P = NP. When the entries of A are nonnegative, however, the permanent can be computed approximately in probabilistic polynomial time, up to an error of εM, where M is the value of the permanent and ε > 0 is arbitrary.[3]
References
[1] Dexter Kozen. The Design and Analysis of Algorithms (http://books.google.com/books?id=L_AMnf9UF9QC). Springer-Verlag, New York, 1991. ISBN 978-0-387-97687-7; pp. 141–142.
[2] Van der Waerden's permanent conjecture (http://planetmath.org/?op=getobj&from=objects&id=6935), PlanetMath.org.
[3] Jerrum, M.; Sinclair, A.; Vigoda, E. (2004), "A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries", Journal of the ACM 51: 671–697, doi:10.1145/1008731.1008738.
Further reading
Cauchy, A. L. (1815), "Mémoire sur les fonctions qui ne peuvent obtenir que deux valeurs égales et de signes contraires par suite des transpositions opérées entre les variables qu'elles renferment" (http://gallica.bnf.fr/ark:/12148/bpt6k90193x/f97), Journal de l'École Polytechnique 10: 91–169.
Minc, H. (1978). "Permanents". Encyclopedia of Mathematics and its Applications (Addison-Wesley) 6. ISSN 0953-4806. OCLC 3980645.
Muir, Thomas; Metzler, William H. (1960) [1882]. A Treatise on the Theory of Determinants. New York: Dover. OCLC 535903.
External links
Permanent at PlanetMath (http://planetmath.org/encyclopedia/Permanent.html)
Computing the permanent

The permanent of an n-by-n matrix A = (a_{i,j}) is defined as

    perm(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} a_{i,\sigma(i)}

The sum here extends over all elements of the symmetric group S_n, i.e. over all permutations of the numbers 1, 2, ..., n. This formula differs from the corresponding formula for the determinant only in that, in the determinant, each product is multiplied by the sign of the permutation, while in this formula each product is unsigned. The formula may be directly translated into an algorithm that naively expands the formula, summing over all permutations and, within the sum, multiplying out each matrix entry. This requires Θ(n!·n) arithmetic operations.
Ryser formula
The fastest known[1] general exact algorithm is due to Herbert John Ryser (Ryser (1963)). Ryser's method is based on an inclusion-exclusion formula that can be given[2] as follows: Let A_k be obtained from A by deleting k columns, let P(A_k) be the product of the row-sums of A_k, and let Σ_k be the sum of the values of P(A_k) over all possible A_k. Then

    perm(A) = \sum_{k=0}^{n-1} (-1)^k \Sigma_k

It may be rewritten in terms of the matrix entries as

    perm(A) = (-1)^n \sum_{S \subseteq \{1, \dots, n\}} (-1)^{|S|} \prod_{i=1}^{n} \sum_{j \in S} a_{i,j}

The Ryser formula can be evaluated using O(2^n n^2) arithmetic operations, or O(2^n n) by processing the sets S in Gray code order.
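Ryser's inclusion-exclusion formula, perm(A) = (−1)^n Σ_{S⊆{1,...,n}} (−1)^{|S|} Π_i Σ_{j∈S} a_{i,j}, translates into a short sketch iterating over column subsets:

```python
from itertools import combinations
from math import prod

def ryser_permanent(A):
    """Ryser's formula, O(2^n n^2) arithmetic operations (processing subsets
    in Gray code order would save another factor of n)."""
    n = len(A)
    total = 0
    for k in range(1, n + 1):            # the empty subset contributes 0
        for S in combinations(range(n), k):
            # product over rows of the partial row-sums restricted to S
            rowsum_product = prod(sum(A[i][j] for j in S) for i in range(n))
            total += (-1) ** k * rowsum_product
    return (-1) ** n * total

print(ryser_permanent([[1, 2], [3, 4]]))   # 10, matching the naive expansion
```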
Glynn formula
Another formula that appears to be as fast as Ryser's is closely related to the polarization identity for a symmetric tensor (Glynn 2010). When the characteristic of the field is not two, it is

    perm(A) = 2^{1-n} \sum_{\delta} \left( \prod_{k=1}^{n} \delta_k \right) \prod_{i=1}^{n} \sum_{j=1}^{n} \delta_j a_{i,j}

where the outer sum is over all 2^{n-1} vectors δ = (δ_1, δ_2, ..., δ_n) ∈ {±1}^n with δ_1 = 1.
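The Glynn formula above can likewise be sketched directly, summing signed products over the 2^(n-1) sign vectors:

```python
from itertools import product
from math import prod

def glynn_permanent(A):
    """Glynn's formula: signed products over the 2^(n-1) vectors delta in
    {+1, -1}^n with delta_1 fixed to +1 (field characteristic != 2)."""
    n = len(A)
    total = 0
    for rest in product((1, -1), repeat=n - 1):
        delta = (1,) + rest
        sign = prod(delta)
        total += sign * prod(sum(delta[j] * A[i][j] for j in range(n))
                             for i in range(n))
    return total / 2 ** (n - 1)

print(glynn_permanent([[1, 2], [3, 4]]))   # 10.0
```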
Special cases
Planar and K3,3-free
The number of perfect matchings in a bipartite graph is counted by the permanent of the graph's biadjacency matrix,
and the permanent of any 0-1 matrix can be interpreted in this way as the number of perfect matchings in a graph.
For planar graphs (regardless of bipartiteness), the FKT algorithm computes the number of perfect matchings in
polynomial time by changing the signs of a carefully chosen subset of the entries in the Tutte matrix of the graph, so
that the Pfaffian of the resulting skew-symmetric matrix (the square root of its determinant) is the number of perfect
matchings. This technique can be generalized to graphs that contain no subgraph homeomorphic to the complete
bipartite graph K3,3.[4]
George Pólya had asked the question[5] of when it is possible to change the signs of some of the entries of a 0-1 matrix A so that the determinant of the new matrix is the permanent of A. Not all 0-1 matrices are "convertible" in this manner; in fact it is known (Marcus & Minc (1961)) that there is no linear map L such that per L(A) = det A for all n-by-n matrices A with n ≥ 3. The convertible matrices were characterized by Little (1975), who showed that such matrices are precisely those that are the biadjacency matrix of bipartite graphs that have a Pfaffian orientation: an orientation of the edges such that for every even cycle C for which the graph minus C has a perfect matching, there are an odd number of edges directed along C (and thus an odd number with the opposite orientation). It was also shown that these graphs are exactly those that do not contain a subgraph homeomorphic to K3,3, as above.
There is also a formula expressing the permanent in terms of the determinants of the principal submatrices of the matrix.
Approximate computation
When the entries of A are nonnegative, the permanent can be computed approximately in probabilistic polynomial
time, up to an error of M, where M is the value of the permanent and > 0 is arbitrary. In other words, there exists a
fully polynomial-time randomized approximation scheme (FPRAS) (Jerrum, Vigoda & Sinclair (2001)).
The most difficult step in the computation is the construction of an algorithm to sample almost uniformly from the
set of all perfect matchings in a given bipartite graph: in other words, a fully polynomial almost uniform sampler
(FPAUS). This can be done using a Markov chain Monte Carlo algorithm that uses a Metropolis rule to define and
run a Markov chain whose distribution is close to uniform, and whose mixing time is polynomial.
It is possible to approximately count the number of perfect matchings in a graph via the self-reducibility of the permanent, by using the FPAUS in combination with a well-known reduction from sampling to counting due to Jerrum, Valiant & Vazirani (1986). Let M(G) denote the number of perfect matchings in G. Roughly, for any particular edge e in G, by sampling many matchings in G and counting how many of them are also matchings in G − e, one can obtain an estimate of the ratio ρ = M(G)/M(G − e). The number M(G) is then ρ·M(G − e), where M(G − e) can be approximated by applying the same method recursively.
Notes
[1] As of 2008; see Rempala & Wesolowski (2008).
[2] van Lint & Wilson (2001), p. 99 (http://books.google.com/books?id=5l5ps2JkyT0C).
[3] CRC Concise Encyclopedia of Mathematics.
[4] Little (1974), Vazirani (1988).
[5] Pólya (1913), Reich (1971).
References
Allender, Eric; Gore, Vivek (1994), "A uniform circuit lower bound for the permanent", SIAM J. Comput. 23 (5): 1026–1049.
Glynn, David G. (2010), "The permanent of a square matrix", European Journal of Combinatorics 31 (7): 1887–1891, doi:10.1016/j.ejc.2010.01.010.
Jerrum, M.; Sinclair, A.; Vigoda, E. (2001), "A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries", Proc. 33rd Symposium on Theory of Computing, pp. 712–721, doi:10.1145/380752.380877, ECCC TR00-079.
Jerrum, Mark; Valiant, Leslie; Vazirani, Vijay (1986), "Random generation of combinatorial structures from a uniform distribution", Theoretical Computer Science 43: 169–188, doi:10.1016/0304-3975(86)90174-X.
van Lint, Jacobus Hendricus; Wilson, Richard Michael (2001), A Course in Combinatorics, ISBN 0-521-00601-5.
Little, C. H. C. (1974), "An extension of Kasteleyn's method of enumerating the 1-factors of planar graphs", in Holton, D., Proc. 2nd Australian Conf. Combinatorial Mathematics, Lecture Notes in Mathematics, 403, Springer-Verlag, pp. 63–72.
Little, C. H. C. (1975), "A characterization of convertible (0, 1)-matrices" (http://www.sciencedirect.com/science/article/B6WHT-4D7K7HW-H5), J. Combin. Theory Ser. B 18 (3): 187–208, doi:10.1016/0095-8956(75)90048-9.
Marcus, M.; Minc, H. (1961), "On the relation between the determinant and the permanent", Illinois J. Math. 5: 376–381.
Pólya, G. (1913), "Aufgabe 424", Arch. Math. Phys. 20 (3): 27.
Reich, Simeon (1971), "Another solution of an old problem of Pólya", American Mathematical Monthly 78 (6): 649–650, doi:10.2307/2316574, JSTOR 2316574.
Ryser, Herbert John (1963), Combinatorial Mathematics, The Carus Mathematical Monographs 14, Mathematical Association of America.
Network flow
Maximum flow problem
In optimization theory, the maximum flow
problem is to find a feasible flow through a
single-source, single-sink flow network that is
maximum.
The maximum flow problem can be seen as a
special case of more complex network flow
problems, such as the circulation problem. The
maximum value of an s-t flow is equal to the
minimum capacity of an s-t cut in the network, as
stated in the max-flow min-cut theorem.
History
The maximum flow problem was first formulated in 1954 by T. E. Harris as a simplified model of Soviet railway
traffic flow.[1] In 1955, Lester R. Ford and Delbert R. Fulkerson created the first known algorithm, the Ford-Fulkerson algorithm.[2][3]
Over the years, various improved solutions to the maximum flow problem were discovered, notably the shortest
augmenting path algorithm of Edmonds and Karp and independently Dinitz; the blocking flow algorithm of Dinitz;
the push-relabel algorithm of Goldberg and Tarjan; and the binary blocking flow algorithm of Goldberg and Rao.
The electrical flow algorithm of Christiano, Kelner, Madry, and Spielman finds an approximately optimal maximum
flow but only works in undirected graphs.
Definition
Let N = (V, E) be a network with s, t ∈ V being the source and the sink of N respectively.

The capacity of an edge is a mapping c: E → R+, denoted by c_uv or c(u,v). It represents the maximum amount of flow that can pass through an edge.

A flow is a mapping f: E → R+, denoted by f_uv or f(u,v), subject to the following two constraints:
1. f_uv ≤ c_uv for each (u,v) ∈ E (capacity constraint)
2. Σ_{u:(u,v)∈E} f_uv = Σ_{w:(v,w)∈E} f_vw for each v ∈ V \ {s,t} (conservation of flows)

The value of flow is defined by |f| = Σ_{v:(s,v)∈E} f_sv, where s is the source of N. It represents the amount of flow passing from the source to the sink. The maximum flow problem is to maximize |f|, that is, to route as much flow as possible from s to t.
Solutions
We can define the residual graph, which provides a systematic way to search for forward-backward operations in order to find the maximum flow.

Given a flow network G = (V, E) and a flow f on G, we define the residual graph G_f of G with respect to f as follows:
1. The node set of G_f is the same as that of G.
2. Each edge e = (u,v) of G_f is given a capacity of c_e − f_e.
3. Each reverse edge e' = (v,u) of G_f is given a capacity of f_e.
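The construction rules above can be sketched as a small function over capacity and flow dictionaries (the (u, v) → value representation is an assumption for illustration):

```python
def residual_graph(capacity, flow):
    """Build residual capacities c_f for a flow f on a network.
    capacity, flow: dicts mapping directed edge (u, v) -> number,
    with 0 <= flow[(u, v)] <= capacity[(u, v)]."""
    cf = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        if c - f > 0:
            cf[(u, v)] = c - f                     # remaining forward capacity
        if f > 0:
            cf[(v, u)] = cf.get((v, u), 0) + f     # flow that can be pushed back
    return cf
```

For example, an edge of capacity 5 carrying 3 units of flow yields a forward residual edge of capacity 2 and a backward residual edge of capacity 3.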
The following table lists algorithms for solving the maximum flow problem.
Linear programming: Constraints given by the definition of a legal flow. See the linear program.

Ford-Fulkerson algorithm, O(E max|f|): As long as there is an open path through the residual graph, send the minimum of the residual capacities on the path. The algorithm works only if all weights are integers; otherwise it is possible that the Ford-Fulkerson algorithm will not converge to the maximum value.

Edmonds-Karp algorithm, O(VE²): A specialization of Ford-Fulkerson that always augments along a shortest path, found with breadth-first search on the residual graph.

Dinic's blocking flow algorithm, O(V²E): In each phase the algorithm builds a layered graph with breadth-first search on the residual graph. The maximum flow in a layered graph can be calculated in O(VE) time, and the maximum number of phases is V − 1. In networks with unit capacities, Dinic's algorithm terminates in O(E√V) time.

General push-relabel maximum flow algorithm, O(V²E): The push-relabel algorithm maintains a preflow, i.e. a flow function with the possibility of excess in the vertices. The algorithm runs while there is a vertex with positive excess, i.e. an active vertex in the graph. The push operation increases the flow on a residual edge, and a height function on the vertices controls which residual edges can be pushed. The height function is changed with a relabel operation. The proper definitions of these operations guarantee that the resulting flow function is a maximum flow.

Push-relabel algorithm with FIFO vertex selection rule, O(V³): Push-relabel variant which always selects the most recently active vertex, and performs push operations while the excess is positive and there are admissible residual edges from this vertex.

Dinic's algorithm with dynamic trees, O(VE log V): The dynamic trees data structure speeds up the maximum flow computation in the layered graph to O(E log V).

Push-relabel algorithm with dynamic trees, O(VE log(V²/E)): The algorithm builds limited-size trees on the residual graph with regard to the height function. These trees provide multilevel push operations.
Application
Multi-source multi-sink maximum flow problem
Given a network N = (V, E) with a set of sources S = {s1, ..., sn} and a set of sinks T = {t1, ..., tm} instead of only one source and one sink, we are to find the maximum flow across N. We can transform the multi-source multi-sink problem into a maximum flow problem by adding a consolidated source connected to each vertex in S and a consolidated sink connected to each vertex in T (also known as a supersource and a supersink), with infinite capacity on each new edge (see Fig. 4.1.1).
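The transformation can be sketched as a function on a capacity dictionary; the (u, v) → capacity representation and the node names "supersource"/"supersink" are assumptions for illustration:

```python
def add_super_terminals(capacity, sources, sinks, inf=float('inf')):
    """Reduce multi-source multi-sink max flow to the single-source case:
    connect a supersource to every source and every sink to a supersink,
    with infinite capacity on each new edge.
    capacity: dict mapping directed edge (u, v) -> capacity."""
    cap = dict(capacity)
    for s in sources:
        cap[('supersource', s)] = inf
    for t in sinks:
        cap[(t, 'supersink')] = inf
    return cap, 'supersource', 'supersink'
```

Any single-source maximum flow algorithm can then be run between the two new terminals; a maximum flow in the transformed network corresponds to one in the original.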
References
[1] Schrijver, Alexander, "On the history of the transportation and maximum flow problems" (http://homepages.cwi.nl/~lex/files/histtrpclean.pdf), Mathematical Programming 91 (2002): 437–445.
[2] Ford, L. R., Jr.; Fulkerson, D. R., "Maximal Flow through a Network", Canadian Journal of Mathematics (1956), pp. 399–404.
[3] Ford, L. R., Jr.; Fulkerson, D. R., Flows in Networks, Princeton University Press (1962).
Notes
1. Andrew V. Goldberg and S. Rao (1998). "Beyond the flow decomposition barrier". J. Assoc. Comput. Mach. 45 (5): 753–782. doi:10.1145/290179.290181.
2. Andrew V. Goldberg and Robert E. Tarjan (1988). "A new approach to the maximum-flow problem". Journal of the ACM 35 (4): 921–940. doi:10.1145/48014.61051. ISSN 0004-5411.
3. Joseph Cheriyan and Kurt Mehlhorn (1999). "An analysis of the highest-level selection rule in the preflow-push max-flow algorithm". Information Processing Letters 69 (5): 239–242. doi:10.1016/S0020-0190(99)00019-8.
4. Daniel D. Sleator and Robert E. Tarjan (1983). "A data structure for dynamic trees" (http://www.cs.cmu.edu/~sleator/papers/dynamic-trees.pdf). Journal of Computer and System Sciences 26 (3): 362–391. doi:10.1016/0022-0000(83)90006-5. ISSN 0022-0000.
5. Daniel D. Sleator and Robert E. Tarjan (1985). "Self-adjusting binary search trees" (http://www.cs.cmu.edu/~sleator/papers/self-adjusting.pdf). Journal of the ACM 32 (3): 652–686. doi:10.1145/3828.3835. ISSN 0004-5411.
6. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein (2001). "26. Maximum Flow". Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill. pp. 643–668. ISBN 0-262-03293-7.
7. Eugene Lawler (2001). "4. Network Flows". Combinatorial Optimization: Networks and Matroids. Dover. pp. 109–177. ISBN 0-486-41453-1.
Max-flow min-cut theorem

Definition
Let N = (V, E) be a network (directed graph) with s and t being the source and the sink of N respectively.

The capacity of an edge is a mapping c: E → R+, denoted by c_uv or c(u,v). It represents the maximum amount of flow that can pass through an edge.

A flow is a mapping f: E → R+, denoted by f_uv or f(u,v), subject to the following two constraints:
1. f_uv ≤ c_uv for each (u,v) ∈ E (capacity constraint)
2. Σ_{u:(u,v)∈E} f_uv = Σ_{w:(v,w)∈E} f_vw for each v ∈ V \ {s,t} (conservation of flows)

The value of flow is defined by |f| = Σ_{v:(s,v)∈E} f_sv, where s is the source of N. It represents the amount of flow passing from the source to the sink.

An s-t cut C = (S, T) is a partition of V such that s ∈ S and t ∈ T. The capacity of an s-t cut is defined by c(S,T) = Σ_{(u,v)∈E, u∈S, v∈T} c_uv.

The maximum flow problem is to maximize |f|, that is, to route as much flow as possible from s to t. The minimum cut problem is to minimize c(S,T), that is, to determine S and T such that the capacity of the S-T cut is minimal.
Statement
The max-flow min-cut theorem states
The maximum value of an s-t flow is equal to the minimum capacity over all s-t cuts.
The max-flow and min-cut problems can be formulated as a primal-dual pair of linear programs. The min-cut (dual) program is:

    minimize    Σ_{(u,v)∈E} c_uv d_uv
    subject to  d_uv ≥ z_u − z_v   for each (u,v) ∈ E
                z_s − z_t ≥ 1
                d_uv ≥ 0           for each (u,v) ∈ E

Here d_uv indicates whether the edge (u,v) is cut and z_u indicates whether vertex u lies on the source side of the cut.
The equality in the max-flow min-cut theorem follows from the strong duality theorem in linear programming,
which states that if the primal program has an optimal solution, x*, then the dual program also has an optimal
solution, y*, such that the optimal values formed by the two solutions are equal.
Example
The figure on the right is a network having a value of flow of 7. The vertex in white and the vertices in grey form the subsets S and T of an s-t cut, whose cut-set contains the dashed edges. Since the capacity of the s-t cut is 7, which equals the value of the flow, the max-flow min-cut theorem tells us that the value of the flow and the capacity of the s-t cut are both optimal in this network.
[Figure: A network with the value of flow equal to the capacity of an s-t cut]
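The theorem can be checked mechanically on a small network. The sketch below (a hypothetical instance, not the one in the figure) computes a maximum flow with breadth-first augmenting paths, then reads off the cut induced by the nodes still reachable from s in the residual graph; the two values coincide.

```python
from collections import defaultdict, deque

def max_flow_min_cut(cap, s, t):
    """Max flow via BFS augmenting paths plus the induced min cut capacity.
    cap: dict mapping directed edge (u, v) -> integer capacity."""
    flow = defaultdict(int)
    adj = defaultdict(set)
    for u, v in cap:
        adj[u].add(v)
        adj[v].add(u)                        # allow residual (backward) traversal
    def residual(u, v):
        return cap.get((u, v), 0) - flow[(u, v)]
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:         # BFS for an augmenting path
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(residual(u, v) for u, v in path)   # bottleneck capacity
        for u, v in path:                    # push flow, maintain skew symmetry
            flow[(u, v)] += b
            flow[(v, u)] -= b
    value = sum(flow[(s, v)] for v in adj[s])
    S, stack = {s}, [s]                      # nodes still reachable from s
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in S and residual(u, v) > 0:
                S.add(v)
                stack.append(v)
    cut = sum(c for (u, v), c in cap.items() if u in S and v not in S)
    return value, cut
```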
Application
Generalized max-flow min-cut theorem
In addition to edge capacity, consider there is capacity at each vertex, that is, a mapping c: V → R+, denoted by c(v), such that the flow f has to satisfy not only the capacity constraint and the conservation of flows, but also the vertex capacity constraint

    Σ_{u:(u,v)∈E} f_uv ≤ c(v)   for each v ∈ V \ {s, t}

In other words, the amount of flow passing through a vertex cannot exceed its capacity. Define an s-t cut to be the set of vertices and edges such that any path from s to t contains a member of the cut. In this case, the capacity of the cut is the sum of the capacities of the edges and vertices in it.
In this new definition, the generalized max-flow min-cut theorem states that the maximum value of an s-t flow is
equal to the minimum capacity of an s-t cut in the new sense.
Menger's theorem
In the undirected edge-disjoint paths problem, we are given an undirected graph G = (V, E) and two vertices s and t,
and we have to find the maximum number of edge-disjoint s-t paths in G.
Menger's theorem states that the maximum number of edge-disjoint s-t paths in an undirected graph is equal to the minimum number of edges in an s-t cut-set.
Project selection problem
In the project selection problem, there are several projects and several pieces of equipment. Each project yields a revenue, and each piece of equipment costs a certain amount to purchase. A project may require one or more pieces of equipment, and a piece of equipment may be shared by several projects. The problem is to determine which projects and pieces of equipment to select so that the profit (total revenue of the selected projects minus total cost of the purchased equipment) is maximized.

This reduces to a maximum flow problem: add a source s with an edge to each project whose capacity is the project's revenue, add a sink t with an edge from each piece of equipment whose capacity is its cost, and add an edge of infinite capacity from each project to every piece of equipment it requires. By the max-flow min-cut theorem, the maximum profit equals the total revenue minus the capacity of a minimum s-t cut.

In the example, the revenues of the three projects are 100, 200 and 150, and the costs of the three pieces of equipment are 200, 100 and 50. The minimum capacity of an s-t cut is 250 and the sum of the revenues of all projects is 450; therefore the maximum profit g is 450 − 250 = 200, obtained by selecting two of the three projects.
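Since the figure and the project-equipment requirement pairs are lost in this extraction, the instance below is a guess chosen to reproduce the surviving totals (revenue 450, minimum cut 250, profit 200). It brute-forces project subsets rather than running max flow, purely to illustrate the arithmetic; the reduction above solves the same problem in polynomial time.

```python
from itertools import combinations

# Hypothetical instance (requirement pairs are assumed for illustration)
revenue = {'p1': 100, 'p2': 200, 'p3': 150}
cost = {'q1': 200, 'q2': 100, 'q3': 50}
requires = {'p1': {'q1', 'q2'}, 'p2': {'q2'}, 'p3': {'q3'}}

def best_selection():
    """Try every project subset; profit = revenues - costs of needed equipment."""
    best = (0, frozenset())
    projects = list(revenue)
    for k in range(len(projects) + 1):
        for chosen in combinations(projects, k):
            equip = set().union(*(requires[p] for p in chosen)) if chosen else set()
            profit = sum(revenue[p] for p in chosen) - sum(cost[q] for q in equip)
            if profit > best[0]:
                best = (profit, frozenset(chosen))
    return best

print(best_selection())   # profit 200, from the second and third projects
```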
History
The max-flow min-cut theorem was proved by P. Elias, A. Feinstein, and C.E. Shannon in 1956, and independently
also by L.R. Ford, Jr. and D.R. Fulkerson in the same year.
References
Eugene Lawler (2001). "4.5. Combinatorial Implications of Max-Flow Min-Cut Theorem, 4.6. Linear Programming Interpretation of Max-Flow Min-Cut Theorem". Combinatorial Optimization: Networks and Matroids. Dover. pp. 117–120. ISBN 0-486-41453-1.
Christos H. Papadimitriou, Kenneth Steiglitz (1998). "6.1 The Max-Flow, Min-Cut Theorem". Combinatorial Optimization: Algorithms and Complexity. Dover. pp. 120–128. ISBN 0-486-40258-4.
Vijay V. Vazirani (2004). "12. Introduction to LP-Duality". Approximation Algorithms. Springer. pp. 93–100. ISBN 3-540-65367-8.
Ford-Fulkerson algorithm

Algorithm
Let G = (V, E) be a graph, and for each edge from u to v, let c(u,v) be the capacity and f(u,v) the flow. We want to find the maximum flow from the source s to the sink t. After every step in the algorithm the following is maintained:

Capacity constraints: f(u,v) ≤ c(u,v). The flow along an edge cannot exceed its capacity.
Skew symmetry: f(u,v) = −f(v,u). The net flow from u to v must be the opposite of the net flow from v to u.
Flow conservation: Σ_{w∈V} f(u,w) = 0 for each u, unless u is s or t. The net flow to a node is zero, except for the source, which "produces" flow, and the sink, which "consumes" flow.

This means that the flow through the network is a legal flow after each round in the algorithm. We define the residual network G_f(V, E_f) to be the network with capacity c_f(u,v) = c(u,v) − f(u,v) and no flow. Notice that it can happen that a flow from v to u is allowed in the residual network, though disallowed in the original network: if f(u,v) > 0 and c(v,u) = 0, then c_f(v,u) = c(v,u) − f(v,u) = f(u,v) > 0.

Algorithm Ford-Fulkerson
    Inputs: Graph G with flow capacity c, a source node s, and a sink node t
    Output: A flow f from s to t which is a maximum
    1. f(u,v) ← 0 for all edges (u,v)
    2. While there is a path p from s to t in G_f, such that c_f(u,v) > 0 for all edges (u,v) ∈ p:
        1. Find c_f(p) = min{c_f(u,v) : (u,v) ∈ p}
        2. For each edge (u,v) ∈ p:
            1. f(u,v) ← f(u,v) + c_f(p)   (send flow along the path)
            2. f(v,u) ← f(v,u) − c_f(p)   (the flow might be "returned" later)
The path in step 2 can be found with, for example, a breadth-first search or a depth-first search in G_f(V, E_f). If you use the former, the algorithm is called Edmonds-Karp.

When no more paths in step 2 can be found, s will not be able to reach t in the residual network. If S is the set of nodes reachable by s in the residual network, then the total capacity in the original network of edges from S to the remainder of V is on the one hand equal to the total flow we found from s to t, and on the other hand serves as an upper bound for all such flows. This proves that the flow we found is maximal. See also the max-flow min-cut theorem.
Complexity
By adding the flow augmenting path to the flow already established in the graph, the maximum flow will be reached
when no more flow augmenting paths can be found in the graph. However, there is no certainty that this situation
will ever be reached, so the best that can be guaranteed is that the answer will be correct if the algorithm terminates.
In the case that the algorithm runs forever, the flow might not even converge towards the maximum flow. However, this situation only occurs with irrational flow values. When the capacities are integers, the runtime of Ford-Fulkerson is bounded by O(E·f) (see big O notation), where E is the number of edges in the graph and f is the maximum flow in the graph. This is because each augmenting path can be found in O(E) time and increases the flow by an integer amount of at least 1.
Integral example
The following example shows the first steps of Ford-Fulkerson in a flow network with 4 nodes, source A and sink D. This example shows the worst-case behaviour of the algorithm: in each step, only a flow of 1 is sent across the network. If breadth-first search were used instead, only two steps would be needed.

[Figure: the initial flow network and the augmenting path chosen at each step]
Non-terminating example
Consider the flow network shown on the right, with source s, sink t, capacities of edges e1, e2 and e3 respectively 1, r = (√5 − 1)/2 and 1, and the capacity of all other edges some integer M ≥ 2. The constant r was chosen so that r² = 1 − r.

[Table: the residual capacities of e1, e2 and e3 after each of steps 0–5]

Note that after step 1 as well as after step 5, the residual capacities of edges e1, e2 and e3 are of the form r^n, r^(n+1) and 0, respectively, for some n. This means that the same sequence of augmenting paths can be used infinitely many times, and the residual capacities of these edges will always be of the same form. The total flow in the network after step 5 is 1 + 2(r¹ + r²). If we continue to use augmenting paths as above, the total flow converges to 1 + 2·Σ_{i=1}^∞ r^i = 3 + 2r, while the maximum flow is 2M + 1. In this case, the algorithm never terminates and the flow does not even converge to the maximum flow.[1]
Python implementation
class Edge(object):
    def __init__(self, u, v, w):
        self.source = u
        self.sink = v
        self.capacity = w
    def __repr__(self):
        return "%s->%s:%s" % (self.source, self.sink, self.capacity)

class FlowNetwork(object):
    def __init__(self):
        self.adj = {}
        self.flow = {}
    def add_vertex(self, vertex):
        self.adj[vertex] = []
    def add_edge(self, u, v, w=0):
        # each edge is paired with a zero-capacity residual (reverse) edge
        edge, redge = Edge(u, v, w), Edge(v, u, 0)
        edge.redge, redge.redge = redge, edge
        self.adj[u].append(edge)
        self.adj[v].append(redge)
        self.flow[edge] = self.flow[redge] = 0
    def find_path(self, source, sink, path):
        # depth-first search for a path with positive residual capacity
        if source == sink:
            return path
        for edge in self.adj[source]:
            if edge.capacity - self.flow[edge] > 0 and edge not in path:
                result = self.find_path(edge.sink, sink, path + [edge])
                if result is not None:
                    return result
    def max_flow(self, source, sink):
        path = self.find_path(source, sink, [])
        while path is not None:
            bottleneck = min(e.capacity - self.flow[e] for e in path)
            for e in path:
                self.flow[e] += bottleneck
                self.flow[e.redge] -= bottleneck
            path = self.find_path(source, sink, [])
        return sum(self.flow[e] for e in self.adj[source])
Usage example
For the example flow network in maximum flow problem we do the following:

g = FlowNetwork()
for v in ['s', 'o', 'p', 'q', 'r', 't']:
    g.add_vertex(v)
g.add_edge('s', 'o', 3)
g.add_edge('s', 'p', 3)
g.add_edge('o', 'p', 2)
g.add_edge('o', 'q', 3)
g.add_edge('p', 'r', 2)
g.add_edge('r', 't', 3)
g.add_edge('q', 'r', 4)
g.add_edge('q', 't', 2)
print(g.max_flow('s', 't'))
Notes
[1] Zwick, Uri (21 August 1995). "The smallest networks on which the Ford-Fulkerson maximum flow procedure may fail to terminate". Theoretical Computer Science 148 (1): 165–170. doi:10.1016/0304-3975(95)00022-O.
References
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Section 26.2: The Ford-Fulkerson method". Introduction to Algorithms (Second ed.). MIT Press and McGraw-Hill. pp. 651–664. ISBN 0-262-03293-7.
George T. Heineman, Gary Pollice, and Stanley Selkow (2008). "Chapter 8: Network Flow Algorithms". Algorithms in a Nutshell. O'Reilly Media. pp. 226–250. ISBN 978-0-596-51624-6.
Ford, L. R.; Fulkerson, D. R. (1956). "Maximal flow through a network". Canadian Journal of Mathematics 8: 399–404.
External links
Another Java animation (http://www.cs.pitt.edu/~kirk/cs1501/animations/Network.html)
Java Web Start application (http://rrusin.blogspot.com/2011/03/implementing-graph-editor-in-javafx.html)
Algorithm
The algorithm is identical to the Ford–Fulkerson algorithm, except that the search order when finding the augmenting path is defined. The path found must be a shortest path that has available capacity. This can be found by a breadth-first search, as we let edges have unit length. The running time of O(VE²) is found by showing that each augmenting path can be found in O(E) time, that every time at least one of the E edges becomes saturated, that the distance from the saturated edge to the source along the augmenting path must be longer than last time it was saturated, and that the length is at most V. Another property of this algorithm is that the length of the shortest augmenting path increases monotonically. There is an accessible proof in [3].
Pseudocode
For a more high-level description, see Ford–Fulkerson algorithm.

algorithm EdmondsKarp
    input:
        C[1..n, 1..n] (Capacity matrix)
        E[1..n, 1..?] (Neighbour lists)
        s             (Source)
        t             (Sink)
    output:
        f             (Value of maximum flow)
        F             (A matrix giving a legal flow with the maximum value)
    f := 0 (Initial flow is zero)
    F := array(1..n, 1..n) (Residual capacity from u to v is C[u,v] - F[u,v])
    forever
        m, P := BreadthFirstSearch(C, E, s, t, F)
        if m = 0
            break
        f := f + m
        (Backtrack search, and write flow)
        v := t
        while v ≠ s
            u := P[v]
            F[u,v] := F[u,v] + m
            F[v,u] := F[v,u] - m
            v := u
    return (f, F)
algorithm BreadthFirstSearch
    input:
        C, E, s, t, F
    output:
        M[t]          (Capacity of path found)
        P             (Parent table)
    P := array(1..n)
    for u in 1..n
        P[u] := -1
    P[s] := -2 (make sure source is not rediscovered)
    M := array(1..n) (Capacity of found path to node)
    M[s] := ∞
    Q := queue()
    Q.push(s)
    while Q.size() > 0
        u := Q.pop()
        for v in E[u]
            (If there is available capacity, and v is not seen before in search)
            if C[u,v] - F[u,v] > 0 and P[v] = -1
                P[v] := u
                M[v] := min(M[u], C[u,v] - F[u,v])
                if v ≠ t
                    Q.push(v)
                else
                    return M[t], P
    return 0, P
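The pseudocode above translates almost line for line into Python. The following sketch is an illustration, not code from the original article; it uses an n × n capacity matrix and returns the flow value together with the flow matrix:

```python
from collections import deque

def edmonds_karp(C, s, t):
    """Maximum flow on capacity matrix C via shortest augmenting paths."""
    n = len(C)
    F = [[0] * n for _ in range(n)]    # F[u][v] is the flow sent from u to v
    flow = 0
    while True:
        # breadth-first search for a shortest augmenting path
        parent = [-1] * n
        parent[s] = s                  # make sure the source is not rediscovered
        M = [0] * n                    # bottleneck capacity of the path to each node
        M[s] = float('inf')
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if C[u][v] - F[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    M[v] = min(M[u], C[u][v] - F[u][v])
                    q.append(v)
        if parent[t] == -1:            # no augmenting path left: done
            return flow, F
        flow += M[t]
        v = t                          # backtrack and update residual capacities
        while v != s:
            u = parent[v]
            F[u][v] += M[t]
            F[v][u] -= M[t]
            v = u

# a small example network whose maximum flow value is 5
C = [[0, 3, 3, 0],
     [0, 0, 2, 2],
     [0, 0, 0, 3],
     [0, 0, 0, 0]]
print(edmonds_karp(C, 0, 3)[0])  # -> 5
```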
Example
Given a network of seven nodes, source A, sink G, and capacities as shown below. In the pairs f/c written on the edges, f is the current flow and c is the capacity. The residual capacity from u to v is c(u, v) − f(u, v), the total capacity minus the flow that is already used. If the net flow from u to v is negative, it contributes to the residual capacity.

[Figures omitted: the capacities of the network, and the path chosen and resulting network at each step.]
Notice how the length of the augmenting path found by the algorithm (in red) never decreases. The paths found are the shortest possible. The flow found is equal to the capacity across the minimum cut in the graph separating the source and the sink. There is only one minimal cut in this graph: it partitions the nodes into the set reachable from the source in the final residual network and the remaining nodes, which contain the sink.
Notes
[1] Dinic, E. A. (1970). "Algorithm for solution of a problem of maximum flow in a network with power estimation". Soviet Math. Doklady 11: 1277–1280.
[2] Edmonds, Jack; Karp, Richard M. (1972). "Theoretical improvements in algorithmic efficiency for network flow problems". Journal of the ACM (Association for Computing Machinery) 19 (2): 248–264. doi:10.1145/321694.321699.
[3] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein (2001). "26.2". Introduction to Algorithms (second ed.). MIT Press and McGraw-Hill. pp. 660–663. ISBN 0-262-53196-8.
References
1. Algorithms and Complexity (see pages 63–69). http://www.cis.upenn.edu/~wilf/AlgComp3.html
augmenting paths. The introduction of the concepts of the level graph and blocking flow enables Dinic's algorithm to achieve its performance.
Definition

Let G = ((V, E), c, f, s, t) be a network with c(u, v) and f(u, v) the capacity and the flow of the edge (u, v), respectively.

The residual capacity is a mapping c_f : V × V → R+ defined as:
1. c_f(u, v) = c(u, v) − f(u, v) if (u, v) ∈ E,
2. c_f(u, v) = 0 otherwise.

The residual graph is the graph G_f = ((V, E_f), c_f, s, t), where E_f = {(u, v) ∈ V × V : c_f(u, v) > 0}.

An augmenting path is an s–t path in the residual graph G_f.

Define dist(v) to be the length of the shortest path from s to v in G_f. Then the level graph of G_f is the graph G_L = ((V, E_L), c_f, s, t), where E_L = {(u, v) ∈ E_f : dist(v) = dist(u) + 1}.

A blocking flow is an s–t flow f′ such that the graph G′ = ((V, E′_L), s, t) with E′_L = {(u, v) : f′(u, v) < c_f(u, v)} contains no s–t path.
Algorithm

Dinic's Algorithm

Input: A network G = ((V, E), c, s, t).
Output: An s–t flow f of maximum value.

1. Set f(u, v) = 0 for each (u, v) ∈ E.
2. Construct the level graph G_L from the residual graph G_f of G. If dist(t) = ∞, stop and output f.
3. Find a blocking flow f′ in G_L.
4. Augment the flow f by f′ and go back to step 2.
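The four steps can be sketched in Python (this implementation is illustrative and not from the original text). `bfs` builds the level graph through the `dist` labels, and repeated calls to `dfs` within one phase produce a blocking flow:

```python
from collections import deque

class Dinic:
    """Dinic's algorithm on an adjacency-list graph with paired residual edges."""
    def __init__(self, n):
        self.n = n
        self.adj = [[] for _ in range(n)]   # adj[u]: list of edge indices
        self.to, self.cap = [], []          # edge heads and residual capacities

    def add_edge(self, u, v, c):
        # forward edge and its residual twin, stored at consecutive indices
        self.adj[u].append(len(self.to)); self.to.append(v); self.cap.append(c)
        self.adj[v].append(len(self.to)); self.to.append(u); self.cap.append(0)

    def bfs(self, s, t):
        # level graph: dist[v] is the shortest residual distance from s
        self.dist = [-1] * self.n
        self.dist[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for e in self.adj[u]:
                if self.cap[e] > 0 and self.dist[self.to[e]] == -1:
                    self.dist[self.to[e]] = self.dist[u] + 1
                    q.append(self.to[e])
        return self.dist[t] != -1

    def dfs(self, u, t, f):
        # advance only along level-graph edges (dist increases by exactly one)
        if u == t:
            return f
        while self.it[u] < len(self.adj[u]):
            e = self.adj[u][self.it[u]]
            v = self.to[e]
            if self.cap[e] > 0 and self.dist[v] == self.dist[u] + 1:
                d = self.dfs(v, t, min(f, self.cap[e]))
                if d > 0:
                    self.cap[e] -= d
                    self.cap[e ^ 1] += d    # twin edge gains residual capacity
                    return d
            self.it[u] += 1
        return 0

    def max_flow(self, s, t):
        flow = 0
        while self.bfs(s, t):               # one phase per level graph
            self.it = [0] * self.n          # per-node edge pointer for the blocking flow
            while True:
                f = self.dfs(s, t, float('inf'))
                if f == 0:
                    break
                flow += f
        return flow
```

The per-node pointer `it` is what makes one phase take O(VE) time: an edge is abandoned permanently within a phase once it can no longer carry flow in the level graph.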
Analysis

It can be shown that the number of edges in each blocking flow increases by at least 1 each time, and thus there are at most V − 1 blocking flows in the algorithm, where V is the number of vertices in the network. The level graph G_L can be constructed by breadth-first search in O(E) time, and a blocking flow in each level graph can be found in O(VE) time. Hence, the running time of Dinic's algorithm is O(V²E).

Using a data structure called dynamic trees, the running time of finding a blocking flow in each phase can be reduced to O(E log V), and therefore the running time of Dinic's algorithm can be improved to O(VE log V).
Special cases

In networks with unit capacities, a much stronger time bound holds. Each blocking flow can be found in O(E) time, and it can be shown that the number of phases does not exceed O(√E) and O(V^(2/3)). Thus the algorithm runs in O(E^(3/2)) and O(V^(2/3) E) time, respectively.

In networks arising during the solution of the bipartite matching problem, the number of phases is bounded by O(√V), therefore leading to the O(E√V) algorithm. More generally, this bound holds for any unit network: a network in which each vertex, except for the source and sink, either has a single entering edge of capacity one or a single outgoing edge of capacity one, and all other capacities are arbitrary integers.[2]
Example

The following is a simulation of Dinic's algorithm (the accompanying figures are omitted). In each level graph G_L, the vertices are labelled with their values of dist, in red.

1. The blocking flow found in the first phase has value 14. Note that each augmenting path in this blocking flow has 3 edges.
2. In the second phase, a blocking flow of value 5 is found; each augmenting path in it has 4 edges.
3. Since the sink t cannot be reached in G_L, the algorithm terminates and returns a flow with maximum value of 19. Note that from one blocking flow to the next, the number of edges in the augmenting paths increases by at least 1.
History

Dinic's algorithm was published in 1970 by the Soviet-born computer scientist Yefim (Chaim) A. Dinitz, who is today a member of the Computer Science department at Ben-Gurion University of the Negev (Israel). It thus predates the Edmonds–Karp algorithm, which was published in 1972 but was discovered earlier. They independently showed that in the Ford–Fulkerson algorithm, if each augmenting path is the shortest one, the length of the augmenting paths is non-decreasing.
Notes
[1] Yefim Dinitz (1970). "Algorithm for solution of a problem of maximum flow in a network with power estimation" (http://www.cs.bgu.ac.il/~dinitz/D70.pdf). Doklady Akademii nauk SSSR 11: 1277–1280.
[2] Tarjan 1983, p. 102.
References
Yefim Dinitz (2006). "Dinitz' Algorithm: The Original Version and Even's Version" (http://www.cs.bgu.ac.il/~dinitz/Papers/Dinitz_alg.pdf). In Oded Goldreich, Arnold L. Rosenberg, and Alan L. Selman. Theoretical Computer Science: Essays in Memory of Shimon Even. Springer. pp. 218–240. ISBN 978-3-540-32880-3.
Tarjan, R. E. (1983). Data Structures and Network Algorithms.
B. H. Korte, Jens Vygen (2008). "8.4 Blocking Flows and Fujishige's Algorithm". Combinatorial Optimization: Theory and Algorithms (Algorithms and Combinatorics, 21). Springer Berlin Heidelberg. pp. 174–176. ISBN 978-3-540-71844-4.
time. Asymptotically, it is more efficient than the Edmonds–Karp algorithm, which runs in O(VE²) time.

Algorithm
Given a flow network G = (V, E) with source s, sink t, and capacity c(u, v) for each edge (u, v), we want to find the maximum amount of flow you can send from s to t through the network. Two types of operations are performed on nodes, push and relabel. Throughout we maintain:
1. f(u, v) ≤ c(u, v). The flow does not exceed the capacity.
2. f(u, v) = −f(v, u). Skew symmetry.
3. For all u other than s and t, the excess e(u) = Σ_v f(v, u) is a non-negative integer.
Notice that the last condition for a preflow is relaxed from the corresponding condition for a legal flow in a regular flow network.
We view the nodes as having heights, with h(s) = |V| and h(t) = 0. As we adjust the height of the nodes, the flow goes through the network as water through a landscape. Differing from algorithms such as Ford–Fulkerson, the flow through the network is not necessarily a legal flow throughout the execution of the algorithm.

In short words, the heights of nodes (except s and t) are adjusted, and flow is sent between nodes, until all possible flow has reached t. Then we continue increasing the height of internal nodes until all the flow that went into the network, but did not reach t, has flowed back into s. A node can reach the height 2|V| − 1 before this is complete, as the longest possible path back to s excluding t is |V| − 1 long, and h(s) = |V|; the height of t is kept at 0.
Push
A push from u to v means sending a part of the excess flow into u on to v. Three conditions must be met for a push to take place:
1. e(u) > 0. There is excess flow at u.
2. c(u, v) − f(u, v) > 0. There is available capacity from u to v.
3. h(u) = h(v) + 1. The flow may only be pushed "downhill".
Relabel
Doing a relabel on a node u is increasing its height until it is higher than at least one of the nodes it has available capacity to. Conditions for a relabel:
1. e(u) > 0. There is excess flow at u.
2. h(u) ≤ h(v) for all v such that c(u, v) − f(u, v) > 0. The only nodes we have available capacity to are at the same height as u or higher.
When relabelling, h(u) is set to 1 + min{h(v) : c(u, v) − f(u, v) > 0}.
Push-relabel algorithm
Push-relabel algorithms in general have the following layout:
1. As long as there is a legal push or relabel operation:
1. Perform a legal push, or
2. a legal relabel.
The running time for these algorithms is in general O(V²E) (argument omitted).
Discharge
In relabel-to-front, a discharge on a node u is the following:
1. As long as e(u) > 0:
1. If not all neighbours of u have been tried since the last relabel: try to push the excess flow to an untried neighbour.
2. Else: relabel u and start over at the first neighbour.
The relabel-to-front algorithm, which discharges nodes from a list and moves a node to the front of the list whenever it is relabelled, runs in O(V³) time (proof omitted).
Sample implementation

C implementation:

#include <stdlib.h>
#include <stdio.h>

#define NODES 6
#define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
#define INFINITE 10000000

void push(int **C, int **F, int *excess, int u, int v) {
    int send = MIN(excess[u], C[u][v] - F[u][v]);
    F[u][v] += send;
    F[v][u] -= send;
    excess[u] -= send;
    excess[v] += send;
}

void relabel(int **C, int **F, int *height, int u) {
    int v;
    int min_height = INFINITE;
    for (v = 0; v < NODES; v++) {
        if (C[u][v] - F[u][v] > 0) {
            min_height = MIN(min_height, height[v]);
            height[u] = min_height + 1;
        }
    }
}
void discharge(int **C, int **F, int *excess, int *height, int *seen, int u) {
    while (excess[u] > 0) {
        if (seen[u] < NODES) {
            int v = seen[u];
            if ((C[u][v] - F[u][v] > 0) && (height[u] > height[v])) {
                push(C, F, excess, u, v);
            } else {
                seen[u] += 1;
            }
        } else {
            relabel(C, F, height, u);
            seen[u] = 0;
        }
    }
}

void moveToFront(int i, int *A) {
    int temp = A[i];
    int n;
    for (n = i; n > 0; n--) {
        A[n] = A[n-1];
    }
    A[0] = temp;
}

int pushRelabel(int **C, int **F, int source, int sink) {
    int *excess, *height, *list, *seen, i, p;

    excess = (int *) calloc(NODES, sizeof(int));
    height = (int *) calloc(NODES, sizeof(int));
    seen   = (int *) calloc(NODES, sizeof(int));
    list   = (int *) calloc(NODES - 2, sizeof(int));

    for (i = 0, p = 0; i < NODES; i++) {
        if ((i != source) && (i != sink)) {
            list[p] = i;
            p++;
        }
    }
    height[source] = NODES;
    excess[source] = INFINITE;
    for (i = 0; i < NODES; i++)
        push(C, F, excess, source, i);

    p = 0;
    while (p < NODES - 2) {
        int u = list[p];
        int old_height = height[u];
        discharge(C, F, excess, height, seen, u);
        if (height[u] > old_height) {
            moveToFront(p, list);
            p = 0;
        } else {
            p += 1;
        }
    }

    int maxflow = 0;
    for (i = 0; i < NODES; i++)
        maxflow += F[source][i];

    return maxflow;
}
void printMatrix(int **M) {
    int i, j;
    for (i = 0; i < NODES; i++) {
        for (j = 0; j < NODES; j++)
            printf("%d\t", M[i][j]);
        printf("\n");
    }
}

int main(void) {
    int **flow, **capacities, i;
    flow = (int **) calloc(NODES, sizeof(int*));
    capacities = (int **) calloc(NODES, sizeof(int*));
    for (i = 0; i < NODES; i++) {
        flow[i] = (int *) calloc(NODES, sizeof(int));
        capacities[i] = (int *) calloc(NODES, sizeof(int));
    }
    // Sample graph (the indices of the last four assignments are restored from context)
    capacities[0][1] = 2;
    capacities[0][2] = 9;
    capacities[1][2] = 1;
    capacities[1][3] = 0;
    capacities[1][4] = 0;
    capacities[2][4] = 7;
    capacities[3][5] = 7;
    capacities[4][5] = 4;
    printf("Capacity:\n");
    printMatrix(capacities);

    printf("Max Flow:\n%d\n", pushRelabel(capacities, flow, 0, 5));

    printf("Flows:\n");
    printMatrix(flow);

    return 0;
}
Python implementation:

def relabel_to_front(C, source, sink):
    n = len(C)  # C is the capacity matrix
    F = [[0] * n for _ in xrange(n)]
    # residual capacity from u to v is C[u][v] - F[u][v]

    height = [0] * n  # height of node
    excess = [0] * n  # flow into node minus flow from node
    seen   = [0] * n  # neighbours seen since last relabel
    # node "queue": all nodes except source and sink
    nodelist = [i for i in xrange(n) if i != source and i != sink]

    def push(u, v):
        send = min(excess[u], C[u][v] - F[u][v])
        F[u][v] += send
        F[v][u] -= send
        excess[u] -= send
        excess[v] += send

    def relabel(u):
        # find smallest new height making a push possible, if such a push is possible at all
        min_height = float('inf')
        for v in xrange(n):
            if C[u][v] - F[u][v] > 0:
                min_height = min(min_height, height[v])
                height[u] = min_height + 1

    def discharge(u):
        while excess[u] > 0:
            if seen[u] < n:  # check next neighbour
                v = seen[u]
                if C[u][v] - F[u][v] > 0 and height[u] > height[v]:
                    push(u, v)
                else:
                    seen[u] += 1
            else:  # we have checked all neighbours; must relabel
                relabel(u)
                seen[u] = 0

    height[source] = n  # longest path from source to sink is less than n long
    excess[source] = float('inf')  # send as much flow as possible to neighbours of source
    for v in xrange(n):
        push(source, v)

    p = 0
    while p < len(nodelist):
        u = nodelist[p]
        old_height = height[u]
        discharge(u)
        if height[u] > old_height:
            nodelist.insert(0, nodelist.pop(p))  # move to front of list
            p = 0  # start from front of list
        else:
            p += 1

    return sum(F[source])
When the algorithm terminates, there is a height k with 0 < k < |V| for which no node u has h(u) = k. This represents a minimal cut in the graph, and no more flow will go from the nodes with height above k to the nodes with height below k: any residual edge (u, v) satisfies h(u) ≤ h(v) + 1. If an edge across the cut had available capacity, then h(u) ≥ k + 1 and h(v) ≤ k − 1, so h(u) > h(v) + 1, contradicting that (u, v) has residual capacity.
References
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 26.4: Push-relabel algorithms, and section 26.5: The relabel-to-front algorithm.
Andrew V. Goldberg, Robert E. Tarjan. A new approach to the maximum flow problem [1]. Annual ACM Symposium on Theory of Computing, Proceedings of the eighteenth annual ACM symposium on Theory of computing, 136–146. ISBN 0-89791-193-8, 1986.
References
[1] http://doi.acm.org/10.1145/12130.12144
Closure problem
The closure problem is a problem in graph theory of finding a set of vertices in a directed graph such that there are no edges from the set to the rest of the graph. More specifically, the minimum closure problem asks for a set of this type with the minimum possible weight in a vertex-weighted graph.
Definition
In a directed graph G=(V,A), a set S of vertices is said to be closed if every successor of every vertex in S is also in
S. Equivalently, S is closed if it has no outgoing edge.
It may be assumed without loss of generality that G is a directed acyclic graph. For, if it is not acyclic, each of its
strongly connected components must either be entirely contained in or entirely disjoint from any closed set.
Therefore, the closures of G are in one-to-one correspondence with the closures of the condensation of G, a directed
acyclic graph that has one vertex for each strongly connected component of G. In weighted closure problems, one
may set the weight of any vertex of the condensation to the sum of the weights of the vertices in the corresponding
strongly connected component of G.
Open-pit mining
Picard studied the closure problem on the open-pit mining problem, which he modeled as a maximum-closure
problem.
In the linear-programming formulation of the problem (omitted here), xi is 1 if the vertex is in the closure and 0 otherwise, and the first constraint ensures that if a vertex is in the closure its successors are also in it. Since each row has at most one +1 and one −1, the constraint matrix is totally unimodular and an integer solution is obtained by solving the LP relaxation of the problem.

In order to solve the maximum closure problem we can use the max-flow min-cut theorem. Let us construct the s-t graph for the maximum closure problem. The graph has a vertex j for each variable xj. We also add a source s and a sink vertex t. If the weight wj of the variable is positive we include an arc from the source to the vertex j with capacity wj. If the weight is negative we add an arc from j to the sink vertex t with capacity −wj. Each inequality xi ≤ xj is associated with an arc (i, j) of infinite capacity. Let V+ be the set of vertices with positive weights and V− the set of vertices with negative weights. (A figure showing the graph constructed for the closure problem is omitted here.) We can find the minimum cut in this graph by solving a max-flow problem from the source to the sink. The source set of a minimum cut separating s from t is a maximum closure in the graph. This statement holds because the minimum cut is finite and so cannot include any arc from A, each of which has infinite capacity. Minimizing the value of the finite cut is equivalent to maximizing the sum of weights of the vertices in the source set of the finite cut.
Denote by (A, B) the collection of arcs with tails in A and heads in B. The source set (without s) of a minimum cut is a maximum closure, and vice versa; the sink set of a minimum cut (without t), which has to be finite, likewise minimizes the weight of the closure.
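The reduction above can be sketched in Python (the function and the small example are illustrative, not from the original text): build the s-t graph, run any max-flow routine (a simple breadth-first augmenting search here), and read the maximum closure off the source side of the minimum cut:

```python
from collections import deque

def max_closure(weights, edges):
    """Maximum-weight closure of a directed graph via the min-cut reduction.

    weights: list of vertex weights; edges: list of arcs (u, v) meaning
    "if u is in the closure, its successor v must be too".  Returns a set."""
    n = len(weights)
    s, t = n, n + 1
    INF = float('inf')
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for v, w in enumerate(weights):
        if w > 0:
            cap[s][v] = w          # arc s -> v with capacity w_v
        elif w < 0:
            cap[v][t] = -w         # arc v -> t with capacity |w_v|
    for u, v in edges:
        cap[u][v] = INF            # closure arcs must never be cut

    def bfs():
        # parents of nodes reachable from s in the residual network
        parent = [-1] * (n + 2)
        parent[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n + 2):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        return parent

    while True:                    # plain augmenting-path max flow
        parent = bfs()
        if parent[t] == -1:
            break
        b, v = INF, t              # bottleneck along the path found
        while v != s:
            b = min(b, cap[parent[v]][v]); v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= b; cap[v][parent[v]] += b; v = parent[v]

    # the source side of the minimum cut is a maximum-weight closure
    reach = bfs()
    return {v for v in range(n) if reach[v] != -1}
```

For instance, with weights [3, -1, -1] and arcs (0, 1) and (0, 2), taking all three vertices has weight 1 > 0, so the maximum closure is {0, 1, 2}.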
History
During the 1970s J.C. Picard was working on the open-pit mining problem and during that time worked on
generalizing the selection problem to the closure problem. During that time the mining industry was developing
optimization methods on their own. Picard's contribution introduced the closure problem to the mining industry.
References
Hochbaum, Dorit S. (2005), "Complexity and algorithms for convex network optimization and other nonlinear problems", 4OR 3 (3): 171–216, doi:10.1007/s10288-005-0078-6, MR2177960.
Hochbaum, Dorit S. (2001), "An efficient algorithm for image segmentation, Markov random fields and related problems", Journal of the ACM 48 (4): 686–701 (electronic), doi:10.1145/502090.502093, MR2144926.
Hochbaum, Dorit S. (2004), "Selection, provisioning, shared fixed costs, maximum closure, and implications on algorithmic methods today" [1], Management Science 50 (6): 709–723.
Hochbaum, Dorit S. (2008), "The pseudoflow algorithm: a new algorithm for the maximum-flow problem", Operations Research 56 (4): 992–1009, doi:10.1287/opre.1080.0524, MR2455709.
Hochbaum, Dorit S.; Queyranne, Maurice (2003), "Minimizing a convex cost closure set", SIAM Journal on Discrete Mathematics 16 (2): 192–207 (electronic), doi:10.1137/S0895480100369584, MR1982135.
Hochbaum, Dorit S.; Shanthikumar, J. George (1990), "Convex separable optimization is not much harder than linear optimization", Journal of the ACM 37 (4): 843–862, doi:10.1145/96559.96597, MR1083654.
References
[1] http://hkilter.com/files/articles/hochbaum_6_04.pdf
Definition

Given a flow network, that is, a directed graph G = (V, E) with source s and sink t, where each edge (u, v) has capacity c(u, v) > 0, flow f(u, v) ≥ 0 and cost a(u, v) ≥ 0. The cost of sending this flow is f(u, v) · a(u, v). You are required to send an amount of flow d from s to t.

The definition of the problem is to minimize the total cost of the flow, Σ_(u,v)∈E a(u, v) · f(u, v), subject to:
Capacity constraints: f(u, v) ≤ c(u, v).
Skew symmetry: f(u, v) = −f(v, u).
Flow conservation: Σ_w f(u, w) = 0 for all u ≠ s, t.
Required flow: Σ_w f(s, w) = d and Σ_w f(w, t) = d.
Solutions
The minimum cost flow problem can be solved by linear programming, since we optimize a linear function, and all
constraints are linear.
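Besides the LP formulation, the simplest combinatorial method is successive shortest paths: repeatedly send flow along a cheapest augmenting path in the residual network. The sketch below is illustrative (the function name and the example are assumptions, not from the original text); Bellman–Ford is used because residual edges can carry negative costs:

```python
def min_cost_flow(n, edges, s, t, required):
    """Minimum-cost flow by successive shortest paths.

    edges: list of (u, v, capacity, cost).  Sends `required` units from s
    to t and returns the total cost, or None if that much flow is impossible."""
    INF = float('inf')
    # residual representation: parallel arrays, twin edge of e is at index e ^ 1
    to, cap, cost, adj = [], [], [], [[] for _ in range(n)]
    for u, v, c, w in edges:
        adj[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        adj[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    total = 0
    while required > 0:
        # Bellman-Ford: cheapest path in the residual network
        dist = [INF] * n
        dist[s] = 0
        prev_edge = [-1] * n
        for _ in range(n - 1):
            updated = False
            for u in range(n):
                if dist[u] == INF:
                    continue
                for e in adj[u]:
                    v = to[e]
                    if cap[e] > 0 and dist[u] + cost[e] < dist[v]:
                        dist[v] = dist[u] + cost[e]
                        prev_edge[v] = e
                        updated = True
            if not updated:
                break
        if dist[t] == INF:
            return None                      # cannot send the required flow
        # push as much as possible (up to `required`) along the cheapest path
        f, v = required, t
        while v != s:
            e = prev_edge[v]; f = min(f, cap[e]); v = to[e ^ 1]
        v = t
        while v != s:
            e = prev_edge[v]; cap[e] -= f; cap[e ^ 1] += f; v = to[e ^ 1]
        total += f * dist[t]
        required -= f
    return total

# example: send 2 units from node 0 to node 3; the cheapest routing costs 6
edges = [(0, 1, 2, 1), (0, 2, 1, 2), (1, 2, 1, 1), (1, 3, 1, 3), (2, 3, 2, 1)]
print(min_cost_flow(4, edges, 0, 3, 2))  # -> 6
```

Since the original costs are non-negative and each augmentation is along a cheapest path, the intermediate flows stay cost-optimal for their value, so no negative cycles arise in the residual network.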
Apart from that, many combinatorial algorithms exist; for a comprehensive survey, see [AMO93]. Some of them are generalizations of maximum flow algorithms, others use entirely different approaches.
References
1. Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin (1993). Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, Inc. ISBN 0-13-617549-X.
2. Morton Klein (1967). "A primal method for minimal cost flows with applications to the assignment and transportation problems". Management Science 14: 205–220.
3. Andrew V. Goldberg and Robert E. Tarjan (1989). "Finding minimum-cost circulations by canceling negative cycles". Journal of the ACM 36 (4): 873–886.
4. Jack Edmonds and Richard M. Karp (1972). "Theoretical improvements in algorithmic efficiency for network flow problems". Journal of the ACM 19 (2): 248–264.
5. Andrew V. Goldberg and Robert E. Tarjan (1990). "Finding minimum-cost circulations by successive approximation". Math. Oper. Res. 15 (3): 430–466.
External links
LEMON C++ library with several maximum flow and minimum cost circulation algorithms [1]
References
[1] http://lemon.cs.elte.hu/
[Figures omitted. Captions: the butterfly graph and the complete graph K4 are planar; K5 and K3,3 are nonplanar.]
In graph theory, a planar graph is a graph that can be embedded in the plane, i.e., it can be drawn on the plane in
such a way that its edges intersect only at their endpoints. In other words, it can be drawn in such a way that no
edges cross each other.[1] Such a drawing is called a plane graph or planar embedding of the graph. A plane graph
can be defined as a planar graph with a mapping from every node to a point on a plane, and from every edge to a
plane curve on that plane, such that the extreme points of each curve are the points mapped from its end nodes, and
all curves are disjoint except on their extreme points.
Every graph that can be drawn on a plane can be drawn on the sphere as well, and vice versa.
Plane graphs can be encoded by combinatorial maps.
The equivalence class of topologically equivalent drawings on the sphere is called a planar map. Although a plane
graph has an external or unbounded face, none of the faces of a planar map have a particular status.
A generalization of planar graphs are graphs which can be drawn on a surface of a given genus. In this terminology,
planar graphs have graph genus 0, since the plane (and the sphere) are surfaces of genus 0. See "graph embedding"
for other related topics.
Planar graph
In the Soviet Union, Kuratowski's theorem was known as the Pontryagin–Kuratowski theorem, as its proof was
allegedly first given in Pontryagin's unpublished notes. By a long-standing academic tradition, such references are
not taken into account in determining priority, so the Russian name of the theorem is not acknowledged
internationally.
Instead of considering subdivisions, Wagner's theorem deals with
minors:
A finite graph is planar if and only if it does not have K5 or K3,3
as a minor.
Euler's formula
Euler's formula states that if a finite, connected, planar graph is drawn in the plane without any edge intersections,
and v is the number of vertices, e is the number of edges and f is the number of faces (regions bounded by edges,
including the outer, infinitely large region), then
v e + f = 2.
As an illustration, in the butterfly graph given above, v = 5, e = 6 and f = 3. If the second graph is redrawn without edge intersections, it has v = 4, e = 6 and f = 4. In general, if the property holds for all planar graphs of f faces, any change to the graph that creates an additional face while keeping the graph planar would keep v − e + f an invariant. Since the property holds for all graphs with f = 2, by mathematical induction it holds for all cases. Euler's formula can also be proved as follows: if the graph isn't a tree, then remove an edge which completes a cycle. This lowers both e and f by one, leaving v − e + f constant. Repeat until the remaining graph is a tree; trees have v = e + 1 and f = 1, yielding v − e + f = 2, i.e. the Euler characteristic is 2.

In a finite, connected, simple, planar graph, any face (except possibly the outer one) is bounded by at least three edges and every edge touches at most two faces; using Euler's formula, one can then show that these graphs are sparse in the sense that e ≤ 3v − 6 if v ≥ 3.
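This sparsity bound gives a quick necessary (but not sufficient) planarity test; the helper name below is illustrative:

```python
def may_be_planar(v, e):
    """Necessary condition from Euler's formula: a simple planar graph
    with v >= 3 vertices has at most 3*v - 6 edges."""
    return v < 3 or e <= 3 * v - 6

# K5 has 5 vertices and 10 edges: 10 > 3*5 - 6 = 9, so K5 is not planar.
print(may_be_planar(5, 10))   # False

# K3,3 has 6 vertices and 9 edges: 9 <= 12, so this test alone cannot
# rule it out; the stronger bipartite bound e <= 2*v - 4 (no triangles) does.
print(9 <= 2 * 6 - 4)         # False: K3,3 is not planar either
```

The test can only refute planarity: K4, with v = 4 and e = 6 ≤ 6, passes it and is indeed planar.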
Euler's formula is also valid for convex polyhedra. This is no
coincidence: every convex polyhedron can be turned into a connected,
simple, planar graph by using the Schlegel diagram of the polyhedron,
a perspective projection of the polyhedron onto a plane with the center
of perspective chosen near the center of one of the polyhedron's faces.
Not every planar graph corresponds to a convex polyhedron in this
way: the trees do not, for example. Steinitz's theorem says that the
polyhedral graphs formed from convex polyhedra are precisely the
finite 3-connected simple planar graphs. More generally, Euler's
formula applies to any polyhedron whose faces are simple polygons
that form a surface topologically equivalent to a sphere, regardless of
its convexity.
Average degree
From v − e + f = 2 and 3f ≤ 2e (each face is bounded by at least three edges, and each edge touches at most two faces) it follows via algebraic transformations that the average degree is strictly less than 6. Otherwise the given graph can't be planar.
Coin graphs
We say that two circles drawn in a plane kiss (or osculate) whenever
they intersect in exactly one point. A "coin graph" is a graph formed by
a set of circles, no two of which have overlapping interiors, by making
a vertex for each circle and an edge for each pair of circles that kiss.
The circle packing theorem, first proved by Paul Koebe in 1936, states
that a graph is planar if and only if it is a coin graph.
This result provides an easy proof of Fáry's theorem, that every planar
graph can be embedded in the plane in such a way that its edges are
straight line segments that do not cross each other. If one places each
vertex of the graph at the center of the corresponding circle in a coin
graph representation, then the line segments between centers of kissing
circles do not cross any of the other edges.
Apollonian networks are the maximal planar graphs formed by repeatedly splitting triangular faces into triples of
smaller triangles. Equivalently, they are the planar 3-trees.
Outerplanar graphs
Outerplanar graphs are graphs with an embedding in the plane such that all vertices belong to the unbounded face of
the embedding. Every outerplanar graph is planar, but the converse is not true: K4 is planar but not outerplanar. A
theorem similar to Kuratowski's states that a finite graph is outerplanar if and only if it does not contain a subdivision
of K4 or of K2,3.
A 1-outerplanar embedding of a graph is the same as an outerplanar embedding. For k > 1 a planar embedding is k-outerplanar if removing the vertices on the outer face results in a (k − 1)-outerplanar embedding. A graph is k-outerplanar if it has a k-outerplanar embedding.
Duals are useful because many properties of the dual graph are related in simple ways to properties of the original
graph, enabling results to be proven about graphs by examining their dual graphs.
While the dual constructed for a particular embedding is unique (up to isomorphism), graphs may have different (i.e.
non-isomorphic) duals, obtained from different (i.e. non-homeomorphic) embeddings.
A Euclidean graph is a graph in which the vertices represent points in the plane, and the edges are assigned lengths
equal to the Euclidean distance between those points; see Geometric graph theory.
A plane graph is said to be convex if all of its faces (including the outer face) are convex polygons. A planar graph
may be drawn convexly if and only if it is a subdivision of a 3-vertex-connected planar graph.
Scheinerman's conjecture (now a theorem) states that every planar graph can be represented as an intersection graph
of line segments in the plane.
The planar separator theorem states that every n-vertex planar graph can be partitioned into two subgraphs of size at most 2n/3 by the removal of O(√n) vertices. As a consequence, planar graphs also have treewidth and branch-width O(√n).
For two planar graphs with v vertices, it is possible to determine in time O(v) whether they are isomorphic or not (see
also graph isomorphism problem).[4]
Notes
[1] Trudeau, Richard J. (1993). Introduction to Graph Theory (http://store.doverpublications.com/0486678709.html) (Corrected, enlarged republication ed.). New York: Dover. p. 64. ISBN 978-0-486-67870-2. Retrieved 8 August 2012. "Thus a planar graph, when drawn on a flat surface, either has no edge-crossings or can be redrawn without them."
[2] Schnyder, W. (1989), "Planar graphs and poset dimension", Order 5: 323–343, doi:10.1007/BF00353652, MR1010382.
[3] Bhasker, Jayaram; Sahni, Sartaj (1988), "A linear algorithm to find a rectangular dual of a planar triangulated graph", Algorithmica 3 (1–4): 247–278, doi:10.1007/BF01762117.
[4] I. S. Filotti, Jack N. Mayer. A polynomial-time algorithm for determining the isomorphism of graphs of fixed genus. Proceedings of the 12th Annual ACM Symposium on Theory of Computing, pp. 236–243. 1980.
References
Kuratowski, Kazimierz (1930), "Sur le problème des courbes gauches en topologie" (http://matwbn.icm.edu.pl/ksiazki/fm/fm15/fm15126.pdf) (in French), Fund. Math. 15: 271–283.
Wagner, K. (1937), "Über eine Eigenschaft der ebenen Komplexe", Math. Ann. 114: 570–590, doi:10.1007/BF01594196.
Boyer, John M.; Myrvold, Wendy J. (2005), "On the cutting edge: Simplified O(n) planarity by edge addition" (http://jgaa.info/accepted/2004/BoyerMyrvold2004.8.3.pdf), Journal of Graph Algorithms and Applications 8 (3): 241–273.
McKay, Brendan; Brinkmann, Gunnar, A useful planar graph generator (http://cs.anu.edu.au/~bdm/plantri/).
de Fraysseix, H.; Ossona de Mendez, P.; Rosenstiehl, P. (2006), "Trémaux trees and planarity", International Journal of Foundations of Computer Science 17 (5): 1017–1029, doi:10.1142/S0129054106004248. Special Issue on Graph Drawing.
D. A. Bader and S. Sreshta, A New Parallel Algorithm for Planarity Testing (http://www.cc.gatech.edu/~bader/papers/planarity2003.html), UNM-ECE Technical Report 03-002, October 1, 2003.
Fisk, Steve (1978), "A short proof of Chvátal's watchman theorem", J. Comb. Theory, Ser. B 24 (3): 374, doi:10.1016/0095-8956(78)90059-X.
External links
Edge Addition Planarity Algorithm Source Code, version 1.0 (http://jgaa.info/accepted/2004/
BoyerMyrvold2004.8.3/planarity.zip) Free C source code for a reference implementation of the Boyer–Myrvold
planarity algorithm, which provides both a combinatorial planar embedder and Kuratowski subgraph isolator. An
open source project with free licensing provides the Edge Addition Planarity Algorithms, current version (http://
code.google.com/p/planarity/).
Public Implementation of a Graph Algorithm Library and Editor (http://pigale.sourceforge.net) GPL graph
algorithm library including planarity testing, planarity embedder and Kuratowski subgraph exhibition in linear
time.
Boost Graph Library tools for planar graphs (http://www.boost.org/doc/libs/1_40_0/libs/graph/doc/
planar_graphs.html), including linear time planarity testing, embedding, Kuratowski subgraph isolation, and
straight-line drawing.
3 Utilities Puzzle and Planar Graphs (http://www.cut-the-knot.org/do_you_know/3Utilities.shtml)
NetLogo Planarity model (http://ccl.northwestern.edu/netlogo/models/Planarity) NetLogo version of John
Tantalo's game
Dual graph
In mathematics, the dual graph of a given
planar graph G is a graph which has a vertex
corresponding to each plane region of G,
and an edge joining two neighboring regions
for each edge in G, for a certain embedding
of G. The term "dual" is used because this
property is symmetric, meaning that if H is a
dual of G, then G is a dual of H (if G is
connected). The same notion of duality may
also be used for more general embeddings
of graphs on manifolds.
Properties
Algebraic dual
Let G be a connected graph. An algebraic dual of G is a graph G′ such that G and G′ have the same set of edges, any cycle of G is a cut of G′, and any cut of G is a cycle of G′. Every planar graph has an algebraic dual, which is in general not unique (any dual defined by a plane embedding will do). The converse is actually true, as settled by Whitney:[2] a connected graph is planar if and only if it has an algebraic dual.
Two red graphs are duals for the blue one, but they are not isomorphic.
Weak dual
The weak dual of an embedded planar graph is the subgraph of the dual graph whose vertices correspond to the
bounded faces of the primal graph. A planar graph is outerplanar if and only if its weak dual is a forest, and a planar
graph is a Halin graph if and only if its weak dual is biconnected and outerplanar. For any embedded planar graph G,
let G+ be the multigraph formed by adding a single new vertex v in the unbounded face of G, and connecting v to
each vertex of the outer face (multiple times, if a vertex appears multiple times on the boundary of the outer face);
then, G is the weak dual of the planar dual of G+.[3][4]
Complex networks
In the context of complex network theory, the edge dual of a random network preserves many of its properties, such as the small-world property and the shape of its degree distribution function.[5]
Notes
[1] Here we consider that graphs may have loops and multiple edges to avoid uncommon considerations.
[2] Whitney, Hassler (1932), "Non-separable and planar graphs", Transactions of the American Mathematical Society 34 (2): 339–362, doi:10.1090/S0002-9947-1932-1501641-2.
[3] Fleischner, Herbert J.; Geller, D. P.; Harary, Frank (1974), "Outerplanar graphs and weak duals", Journal of the Indian Mathematical Society 38: 215–219, MR0389672.
[4] Sysło, Maciej M.; Proskurowski, Andrzej (1983), "On Halin graphs", Graph Theory: Proceedings of a Conference held in Lagów, Poland, February 10–13, 1981, Lecture Notes in Mathematics, 1018, Springer-Verlag, pp. 248–256, doi:10.1007/BFb0071635.
[5] Ramezanpour, A.; Karimipour, V.; Mashaghi, A., "Generating correlated networks from uncorrelated ones", Phys. Rev. E 67, 046107 (2003). http://pre.aps.org/abstract/PRE/v67/i4/e046107
External links
Weisstein, Eric W., " Dual graph (http://mathworld.wolfram.com/DualGraph.html)" from MathWorld.
Weisstein, Eric W., " Self-dual graph (http://mathworld.wolfram.com/Self-DualGraph.html)" from
MathWorld.
Fáry's theorem
In mathematics, Fáry's theorem states that any simple planar graph can be drawn without crossings so that its edges are straight line segments. That is, the ability to draw graph edges as curves instead of as straight line segments does not allow a larger class of graphs to be drawn. The theorem is named after István Fáry, although it was proved independently by Klaus Wagner (1936), Fáry (1948), and S. K. Stein (1951).
Proof
Let G be a simple planar graph with n vertices; we may add edges if
necessary so that G is maximal planar. All faces of G will be triangles,
as we could add an edge into any face with more sides while
preserving planarity, contradicting the assumption of maximal
planarity. Choose some three vertices a,b,c forming a triangular face of
G. We prove by induction on n that there exists a straight-line
embedding of G in which triangle abc is the outer face of the
embedding. As a base case, the result is trivial when n = 3 and a, b and c are
the only vertices in G. Otherwise, all vertices in G have at least three
neighbors.
[Figure: induction step for the proof of Fáry's theorem.]
By Euler's formula for planar graphs, G has 3n − 6 edges; equivalently, if one defines the deficiency of a vertex v in G to be 6 − deg(v), the sum of the deficiencies is 12. Each vertex in G can have deficiency at most three, so there are at least four vertices with positive deficiency. In
particular we can choose a vertex v with at most five neighbors that is different from a, b and c. Let G' be formed by
removing v from G and retriangulating the face formed by removing v. By induction, G' has a straight line
embedding in which abc is the outer face. Remove the added edges in G', forming a polygon P with at most five
sides into which v should be placed to complete the drawing. By the Art gallery theorem, there exists a point interior
to P at which v can be placed so that the edges from v to the vertices of P do not cross any other edges, completing
the proof.
The induction step of this proof is illustrated at right.
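The deficiency count used in the proof can be checked directly from Euler's formula; the following lines restate it in the proof's own notation:

```latex
% A maximal planar graph G with n >= 3 vertices has m = 3n - 6 edges,
% and the sum of all vertex degrees equals 2m, so:
\sum_{v \in V} \bigl(6 - \deg(v)\bigr) \;=\; 6n - 2m \;=\; 6n - 2(3n - 6) \;=\; 12.
% Every vertex of a maximal planar graph with n > 3 has degree at least 3,
% so each deficiency 6 - deg(v) is at most 3; a total of 12 with each term
% at most 3 forces at least four vertices of positive deficiency,
% i.e., at least four vertices of degree at most 5.
```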
Related results
De Fraysseix, Pach and Pollack showed how to find in linear time a straight-line drawing in a grid with dimensions
linear in the size of the graph. A similar method has been followed by Schnyder to prove enhanced bounds and a
characterization of planarity based on the incidence partial order. His work stressed the existence of a particular
partition of the edges of a maximal planar graph into three trees known as a Schnyder wood.
Tutte's spring theorem states that every 3-connected planar graph can be drawn on a plane without crossings so that
its edges are straight line segments and an outside face is a convex polygon (Tutte 1963). It is so called because such
an embedding can be found as the equilibrium position for a system of springs representing the edges of the graph.
Steinitz's theorem states that every 3-connected planar graph can be represented as the edges of a convex polyhedron
in three-dimensional space. A straight-line embedding of G, of the type described by Tutte's theorem, may be
formed by projecting such a polyhedral representation onto the plane.
The Circle packing theorem states that every planar graph may be represented as the intersection graph of a
collection of non-crossing circles in the plane. Placing each vertex of the graph at the center of the corresponding
circle leads to a straight line representation.
Heiko Harborth raised the question of whether every planar graph has a straight line representation in which all edge
lengths are integers.[1] The answer remains unknown as of 2009. However, integer-distance straight line embeddings
are known to exist for cubic graphs.[2]
Sachs (1983) raised the question of whether every graph with a linkless embedding in three-dimensional Euclidean
space has a linkless embedding in which all edges are represented by straight line segments, analogously to Fáry's
theorem for two-dimensional embeddings.
Notes
[1] Harborth et al. (1987); Kemnitz & Harborth (2001); Mohar (2001, 2003).
[2] Geelen, Guo & McKinnon (2008).
References
Fáry, István (1948), "On straight-line representation of planar graphs", Acta Sci. Math. (Szeged) 11: 229–233,
MR0026311.
de Fraysseix, Hubert; Pach, János; Pollack, Richard (1988), "Small sets supporting Fáry embeddings of planar
graphs", Twentieth Annual ACM Symposium on Theory of Computing, pp. 426–433, doi:10.1145/62212.62254,
ISBN 0-89791-264-0.
de Fraysseix, Hubert; Pach, János; Pollack, Richard (1990), "How to draw a planar graph on a grid",
Combinatorica 10: 41–51, doi:10.1007/BF02122694, MR1075065.
Geelen, Jim; Guo, Anjie; McKinnon, David (2008), "Straight line embeddings of cubic planar graphs with integer
edge lengths" (http://www.math.uwaterloo.ca/~dmckinnon/Papers/Planar.pdf), J. Graph Theory 58 (3):
270–274, doi:10.1002/jgt.20304.
Harborth, H.; Kemnitz, A.; Möller, M.; Süssenbach, A. (1987), "Ganzzahlige planare Darstellungen der
platonischen Körper", Elem. Math. 42: 118–122.
Kemnitz, A.; Harborth, H. (2001), "Plane integral drawings of planar graphs", Discrete Math. 236: 191–195,
doi:10.1016/S0012-365X(00)00442-8.
Mohar, Bojan (2003), Problems from the book Graphs on Surfaces (http://www.fmf.uni-lj.si/~mohar/Book/
BookProblems.html).
Mohar, Bojan; Thomassen, Carsten (2001), Graphs on Surfaces, Johns Hopkins University Press, Problem
2.8.15, ISBN 0-8018-6689-8.
Sachs, Horst (1983), "On a spatial analogue of Kuratowski's theorem on planar graphs – An open problem", in
Borowiecki, M.; Kennedy, J. W.; Sysło, M. M., Graph Theory: Proceedings of a Conference held in Łagów,
Poland, February 10–13, 1981, Lecture Notes in Mathematics, 1018, Springer-Verlag, pp. 230–241,
doi:10.1007/BFb0071633, ISBN 978-3-540-12687-4.
Schnyder, Walter (1990), "Embedding planar graphs on the grid" (http://portal.acm.org/citation.
cfm?id=320176.320191), Proc. 1st ACM/SIAM Symposium on Discrete Algorithms (SODA), pp. 138–148.
Stein, S. K. (1951), "Convex maps", Proceedings of the American Mathematical Society 2 (3): 464–466,
doi:10.2307/2031777, JSTOR 2031777, MR0041425.
Tutte, W. T. (1963), "How to draw a graph", Proceedings of the London Mathematical Society 13: 743–767,
doi:10.1112/plms/s3-13.1.743, MR0158387.
Wagner, Klaus (1936), "Bemerkungen zum Vierfarbenproblem" (http://www.digizeitschriften.de/index.
php?id=resolveppn&PPN=GDZPPN002131633), Jahresbericht der Deutschen Mathematiker-Vereinigung 46: 26–32.
Steinitz's theorem
In polyhedral combinatorics, a branch of mathematics, Steinitz's theorem is a characterization of the undirected
graphs formed by the edges and vertices of three-dimensional convex polyhedra: they are exactly the
3-vertex-connected planar graphs.[1][2] That is, every convex polyhedron forms a 3-connected planar graph, and
every 3-connected planar graph can be represented as the graph of a convex polyhedron. For this reason, the
3-connected planar graphs are also known as polyhedral graphs.[3] Steinitz's theorem is named after Ernst Steinitz,
who proved it in 1922.[4] Branko Grünbaum has called this theorem the most important and deepest known result on
3-polytopes.[2]
The name "Steinitz's theorem" has also been applied to other results of Steinitz:
the Steinitz exchange lemma implying that each basis of a vector space has the same number of vectors,[5]
the theorem that if the convex hull of a point set contains a unit sphere, then the convex hull of a finite subset of
the points contains a smaller concentric sphere,[6] and
Steinitz's vectorial generalization of the Riemann series theorem on the rearrangements of conditionally
convergent series.[7][8][9][10]
tangent to K.[19]
In dimensions higher than three, the algorithmic Steinitz problem (given a lattice, determine whether it is the face
lattice of a convex polytope) is complete for the existential theory of the reals by Richter-Gebert's universality
theorem.[20]
References
[1] Ziegler, Günter M. (1995), Lectures on Polytopes, ISBN 0-387-94365-X, Chapter 4 "Steinitz' Theorem for 3-Polytopes", p. 103.
[2] Grünbaum, Branko (2003), Convex Polytopes, 2nd edition, prepared by Volker Kaibel, Victor Klee, and Günter M. Ziegler, ISBN
0-387-40409-0, ISBN 978-0-387-40409-7, 466pp.
[3] Weisstein, Eric W., "Polyhedral graph (http://mathworld.wolfram.com/PolyhedralGraph.html)" from MathWorld.
[4] Steinitz, E. (1922), "Polyeder und Raumeinteilungen", Encyklopädie der mathematischen Wissenschaften, Band 3 (Geometrie), pp. 1–139.
[5] Zynel, Mariusz (1996), "The Steinitz theorem and the dimension of a vector space" (http://citeseerx.ist.psu.edu/viewdoc/
download?doi=10.1.1.79.1707&rep=rep1&type=pdf), Formalized Mathematics 5 (8): 423–428.
[6] Kirkpatrick, David; Mishra, Bhubaneswar; Yap, Chee-Keng (1992), "Quantitative Steinitz's theorems with applications to multifingered
grasping", Discrete & Computational Geometry 7 (1): 295–318, doi:10.1007/BF02187843.
[7] Rosenthal, Peter (1987), "The remarkable theorem of Lévy and Steinitz", American Mathematical Monthly 94 (4): 342–351, JSTOR 2323094.
[8] Banaszczyk, Wojciech (1991), "Chapter 3.10 The Lévy–Steinitz theorem", Additive Subgroups of Topological Vector Spaces, Lecture Notes
in Mathematics, 1466, Berlin: Springer-Verlag, pp. viii+178, ISBN 3-540-53917-4, MR1119302.
[9] Kadets, V. M.; Kadets, M. I. (1991), "Chapter 6 The Steinitz theorem and B-convexity", Rearrangements of Series in Banach Spaces,
Translations of Mathematical Monographs, 86 (translated by Harold H. McFaden from the Russian-language (Tartu) 1988 ed.), Providence,
RI: American Mathematical Society, pp. iv+123, ISBN 0-8218-4546-2, MR1108619.
[10] Kadets, Mikhail I.; Kadets, Vladimir M. (1997), "Chapter 2.1 Steinitz's theorem on the sum range of a series, Chapter 7 The Steinitz theorem
and B-convexity", Series in Banach Spaces: Conditional and Unconditional Convergence, Operator Theory: Advances and Applications, 94
(translated by Andrei Iacob from the Russian-language ed.), Basel: Birkhäuser Verlag, pp. viii+156, ISBN 3-7643-5401-1, MR1442255.
[11] Balinski, M. L. (1961), "On the graph structure of convex polyhedra in n-space" (http://projecteuclid.org/euclid.pjm/1103037323),
Pacific Journal of Mathematics 11 (2): 431–434, MR0126765.
[12] Venkatasubramanian, Suresh (2006), Planar graphs and Steinitz's theorem (http://geomblog.blogspot.com/2006/08/
planar-graphs-and-steinitzs-theorem.html).
[13] Ribó Mor, Ares; Rote, Günter; Schulz, André, "Small Grid Embeddings of 3-Polytopes", Discrete & Computational Geometry 45 (1):
65–87, doi:10.1007/s00454-010-9301-0.
[14] Buchin, Kevin; Schulz, André (2010), "On the number of spanning trees a planar graph can have", Algorithms – 18th Annual European
Symposium (ESA 2010), Lecture Notes in Computer Science, 6346, Springer-Verlag, pp. 110–121, doi:10.1007/978-3-642-15775-2.
[15] Schulz, André (2011), "Drawing 3-polytopes with good vertex resolution" (http://jgaa.info/accepted/2011/Schulz2011.15.1.pdf),
Journal of Graph Algorithms and Applications 15 (1): 33–52.
[16] Barnette, David; Grünbaum, Branko (1970), "Preassigning the shape of a face" (http://projecteuclid.org/euclid.pjm/1102977361), Pacific
Journal of Mathematics 32 (2): 299–306, MR0259744.
[17] Barnette, David W. (1970), "Projections of 3-polytopes", Israel Journal of Mathematics 8 (3): 304–308, doi:10.1007/BF02771563.
[18] Ziegler, Günter M. (2007), "Convex polytopes: extremal constructions and f-vector shapes. Section 1.3: Steinitz's theorem via circle
packings", in Miller, Ezra; Reiner, Victor; Sturmfels, Bernd, Geometric Combinatorics, IAS/Park City Mathematics Series, 13, American
Mathematical Society, pp. 628–642, ISBN 978-0-8218-3736-8.
[19] Schramm, Oded (1992), "How to cage an egg", Inventiones Mathematicae 107 (3): 543–560, doi:10.1007/BF01231901, MR1150601.
[20] Richter-Gebert, Jürgen (1996), Realization Spaces of Polytopes, Lecture Notes in Mathematics, 1643, Springer-Verlag,
ISBN 978-3-540-62084-6.
Planarity testing
In graph theory, the planarity testing problem asks whether, given a graph, that graph is a planar graph (can be
drawn in the plane without edge intersections). This is a well-studied problem in computer science for which many
practical algorithms have emerged, many taking advantage of novel data structures. Most of these methods operate
in O(n) time (linear time), where n is the number of edges (or vertices) in the graph, which is asymptotically optimal.
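Before running any of the full linear-time tests cited below, the edge bound from Euler's formula gives a cheap necessary (but not sufficient) condition that rejects dense graphs outright. A minimal sketch, assuming a simple undirected graph given as a vertex count and an edge list (the function name is illustrative):

```python
def maybe_planar(num_vertices, edges):
    """Necessary (but not sufficient) planarity condition from Euler's
    formula: a simple planar graph with n >= 3 vertices has at most
    3n - 6 edges.  Returning False proves nonplanarity; returning True
    proves nothing, and a real planarity test must still be run."""
    n, m = num_vertices, len(edges)
    if n < 3:
        return True  # graphs on fewer than 3 vertices are always planar
    return m <= 3 * n - 6

# K5 has 10 > 3*5 - 6 = 9 edges, so it is rejected immediately.
k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
# K3,3 has 9 <= 3*6 - 6 = 12 edges: nonplanar, yet it passes this check,
# which is why the condition is only necessary, not sufficient.
k33 = [(i, j) for i in range(3) for j in range(3, 6)]
```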
References
[1] Hopcroft, John; Tarjan, Robert E. (1974), "Efficient planarity testing", Journal of the Association for Computing Machinery 21 (4): 549–568,
doi:10.1145/321850.321852.
[2] Lempel, A.; Even, S.; Cederbaum, I. (1967), "An algorithm for planarity testing of graphs", in Rosenstiehl, P., Theory of Graphs, New York:
Gordon and Breach, pp. 215–232.
[3] Even, Shimon; Tarjan, Robert E. (1976), "Computing an st-numbering", Theoretical Computer Science 2 (3): 339–344,
doi:10.1016/0304-3975(76)90086-4.
[4] Boyer & Myrvold (2004), p. 243: "Its implementation in LEDA is slower than LEDA implementations of many other O(n)-time planarity
algorithms."
[5] Chiba, N.; Nishizeki, T.; Abe, A.; Ozawa, T. (1985), "A linear algorithm for embedding planar graphs using PQ-trees", Journal of Computer
and System Sciences 30 (1): 54–76, doi:10.1016/0022-0000(85)90004-2.
[6] Shih, W. K.; Hsu, W. L. (1999), "A new planarity test", Theoretical Computer Science 223 (1–2): 179–191,
doi:10.1016/S0304-3975(98)00120-0.
[7] Boyer, John M.; Myrvold, Wendy J. (2004), "On the cutting edge: simplified O(n) planarity by edge addition" (http://jgaa.info/accepted/
2004/BoyerMyrvold2004.8.3.pdf), Journal of Graph Algorithms and Applications 8 (3): 241–273.
[8] de Fraysseix, H.; Ossona de Mendez, P.; Rosenstiehl, P. (2006), "Trémaux Trees and Planarity", International Journal of Foundations of
Computer Science 17 (5): 1017–1030, doi:10.1142/S0129054106004248.
[9] Brandes, Ulrik (2009), The left-right planarity test (http://www.inf.uni-konstanz.de/algo/publications/b-lrpt-sub.pdf).
[10] Boyer, John M.; Cortese, P. F.; Patrignani, M.; Battista, G. D. (2003), "Stop minding your P's and Q's: implementing a fast and simple
DFS-based planarity testing and embedding algorithm", Proc. 11th Int. Symp. Graph Drawing (GD '03), Lecture Notes in Computer Science,
2912, Springer-Verlag, pp. 25–36.
[11] Chimani, M.; Mutzel, P.; Schmidt, J. M. (2008), "Efficient extraction of multiple Kuratowski subdivisions", Proc. 15th Int. Symp. Graph
Drawing (GD '07), Lecture Notes in Computer Science, 4875, Sydney, Australia: Springer-Verlag, pp. 159–170.
[12] http://www.ogdf.net
[13] http://www.boost.org/doc/libs/1_40_0/libs/graph/doc/boyer_myrvold.html
[14] Williamson, S. G. (1984), "Depth First Search and Kuratowski Subgraphs", Journal of the Association for Computing Machinery 31: 681–693.
Let G be a graph and let T be a DFS-tree of G. The graph G is planar if and only if there exists a partition of
the cotree edges of G into two classes so that any two edges belong to a same class if they are T-alike and any
two edges belong to different classes if they are T-opposite.
References
H. de Fraysseix and P. Rosenstiehl, A depth-first search characterization of planarity, Annals of Discrete
Mathematics 13 (1982), 75–80.
Graph drawing
Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory
and information visualization to derive two-dimensional depictions of graphs arising from applications such as
social network analysis, cartography, and bioinformatics.[1]
A drawing of a graph or network diagram is a pictorial representation of the vertices and edges of a graph. This
drawing should not be confused with the graph itself: very different layouts can correspond to the same graph.[2] In
the abstract, all that matters is which pairs of vertices are connected by edges. In the concrete, however, the
arrangement of these vertices and edges within a drawing affects its understandability, usability, fabrication cost,
and aesthetics.[3] The problem gets worse if the graph changes over time by adding and deleting edges (dynamic
graph drawing) and the goal is to preserve the user's mental map.[4]
Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks.
Graphical conventions
Graphs are frequently drawn as node-link diagrams in which the vertices are
represented as disks or boxes and the edges are represented as line segments,
polylines, or curves in the Euclidean plane.[3]
In the case of directed graphs, arrowheads form a commonly used graphical convention to show their
orientation;[2] however, user studies have shown that other conventions such as tapering provide this information
more effectively.[5]
A directed graph with arrowheads showing the edge directions.
Alternative conventions to node-link diagrams include adjacency representations such as circle packings, in which
vertices are represented by disjoint regions in the plane and edges are represented by adjacencies between regions;
intersection representations in which vertices are represented by non-disjoint geometric objects and edges are
represented by their intersections; visibility representations in which vertices are represented by regions in the plane
and edges are represented by regions that have an unobstructed line of sight to each other; confluent drawings, in
which edges are represented as smooth curves within mathematical train tracks; and visualizations of the adjacency
matrix of the graph.
Quality measures
Many different quality measures have been defined for graph drawings, in an attempt to find objective means of
evaluating their aesthetics and usability.[6] In addition to guiding the choice between different layout methods for the
same graph, some layout methods attempt to directly optimize these measures.
The crossing number of a drawing is the number of pairs of edges that cross each other. If the graph is planar,
then it is often convenient to draw it without any edge intersections; that is, in this case, a graph drawing
represents a graph embedding. However, nonplanar graphs frequently arise in applications, so graph drawing
algorithms must generally allow for edge crossings.[7]
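To make the crossing number of a drawing concrete, the following sketch counts crossing edge pairs in a straight-line drawing using the standard orientation (cross-product sign) test; function names are illustrative, and edges sharing an endpoint are not counted as crossings:

```python
def orient(p, q, r):
    # Sign of the cross product (q - p) x (r - p): >0 left turn, <0 right turn.
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(a, b, c, d):
    # True if segments ab and cd cross at a point interior to both.
    d1, d2 = orient(a, b, c), orient(a, b, d)
    d3, d4 = orient(c, d, a), orient(c, d, b)
    return d1 * d2 < 0 and d3 * d4 < 0

def crossing_count(pos, edges):
    """Number of crossing edge pairs in the straight-line drawing that
    places vertex v at point pos[v]."""
    count = 0
    for i in range(len(edges)):
        for j in range(i + 1, len(edges)):
            (u, v), (x, y) = edges[i], edges[j]
            if {u, v} & {x, y}:
                continue  # a shared endpoint is not a crossing
            if segments_cross(pos[u], pos[v], pos[x], pos[y]):
                count += 1
    return count
```

For example, K4 drawn on the corners of a square has exactly one crossing (its two diagonals), whereas redrawing it with one vertex inside the triangle of the other three removes the crossing, illustrating that the crossing number is a property of the drawing, not the graph.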
The area of a drawing is the size of its smallest bounding box, relative to the closest distance between any two
vertices. Drawings with smaller area are generally preferable to those with larger area, because they allow the
features of the drawing to be shown at greater size and therefore more legibly. The aspect ratio of the bounding
box may also be important.
Symmetry display is the problem of finding symmetry groups within a given graph, and finding a drawing that
displays as much of the symmetry as possible. Some layout methods automatically lead to symmetric drawings;
alternatively, some drawing methods start by finding symmetries in the input graph and using them to construct a
drawing.[8]
It is important that edges have shapes that are as simple as possible, to make it easier for the eye to follow them.
In polyline drawings, the complexity of an edge may be measured by its number of bends, and many methods aim
to provide drawings with few total bends or few bends per edge. Similarly for spline curves the complexity of an
edge may be measured by the number of control points on the edge.
Several commonly used quality measures concern lengths of edges: it is generally desirable to minimize the total
length of the edges as well as the maximum length of any edge. Additionally, it may be preferable for the lengths
of edges to be uniform rather than highly varied.
Angular resolution is a measure of the sharpest angles in a graph drawing. If a graph has vertices with high degree
then it necessarily will have small angular resolution, but the angular resolution can be bounded below by a
function of the degree.[9]
The slope number of a graph is the minimum number of distinct edge slopes needed in a drawing with straight
line segment edges (allowing crossings). Cubic graphs have slope number at most four, but graphs of degree five
may have unbounded slope number; it remains open whether the slope number of degree-4 graphs is bounded.[9]
Layout methods
There are many different graph layout strategies:
In force-based layout systems, the graph drawing software modifies an initial vertex placement by continuously
moving the vertices according to a system of forces based on physical metaphors related to systems of springs or
molecular mechanics. Typically, these systems combine attractive forces between adjacent vertices with repulsive
forces between all pairs of vertices, in order to seek a layout in which edge lengths are small while vertices are
well-separated. These systems may perform gradient descent based minimization of an energy function, or they
may translate the forces directly into velocities or accelerations for the moving vertices.[10]
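A minimal sketch of one such force-directed iteration, loosely in the style of combining pairwise repulsion with spring attraction along edges under a cooling schedule (function and parameter names are illustrative, not any particular library's API):

```python
import math
import random

def force_layout(nodes, edges, iterations=200, width=1.0, seed=0):
    """Toy force-directed layout: spring-like attraction along edges plus
    repulsion between all vertex pairs, with a cooling step limit."""
    rng = random.Random(seed)
    pos = {v: [rng.random() * width, rng.random() * width] for v in nodes}
    k = width / math.sqrt(len(nodes))          # ideal edge length
    for step in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):          # repulsion: every vertex pair
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                f = k * k / dist               # repulsive magnitude
                for w, sign in ((u, 1.0), (v, -1.0)):
                    disp[w][0] += sign * f * dx / dist
                    disp[w][1] += sign * f * dy / dist
        for u, v in edges:                     # attraction: along edges only
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            f = dist * dist / k                # attractive magnitude
            for w, sign in ((u, -1.0), (v, 1.0)):
                disp[w][0] += sign * f * dx / dist
                disp[w][1] += sign * f * dy / dist
        t = 0.1 * width * (1.0 - step / iterations)   # cooling: cap movement
        for v in nodes:
            d = math.hypot(*disp[v]) or 1e-9
            pos[v][0] += disp[v][0] / d * min(d, t)
            pos[v][1] += disp[v][1] / d * min(d, t)
    return pos
```

This is a sketch of the physical-metaphor idea only; production systems add quadtree approximations of the repulsive forces and better cooling schedules.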
Spectral layout methods use as coordinates the eigenvectors of a matrix such as the Laplacian derived from the
adjacency matrix of the graph.[11]
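A sketch of the spectral idea, using as coordinates the eigenvectors of the graph Laplacian belonging to its two smallest nonzero eigenvalues (this assumes a connected graph with vertices labeled 0..n−1; with repeated eigenvalues the coordinates are not unique):

```python
import numpy as np

def spectral_layout(n, edges):
    """Spectral layout: (x, y) coordinates per vertex come from the
    eigenvectors of the Laplacian L = D - A for the two smallest
    nonzero eigenvalues (the smallest, 0, has a constant eigenvector)."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    L = np.diag(A.sum(axis=1)) - A
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, 1:3]                # skip the trivial constant eigenvector
```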
Orthogonal layout methods, which allow the edges of the graph to run horizontally or vertically, parallel to the
coordinate axes of the layout. These methods were originally designed for VLSI and PCB layout problems but
they have also been adapted for graph drawing. They typically involve a multiphase approach in which an input
graph is planarized by replacing crossing points by vertices, a topological embedding of the planarized graph is
found, edge orientations are chosen to minimize bends, vertices are placed consistently with these orientations,
and finally a layout compaction stage reduces the area of the drawing.[12]
Tree layout algorithms show a rooted tree-like formation, suitable for trees. Often, in a technique called
"balloon layout", the children of each node in the tree are drawn on a circle surrounding the node, with the radii of
these circles diminishing at lower levels in the tree so that these circles do not overlap.[13]
Layered graph drawing methods (often called Sugiyama-style drawing) are best suited for directed acyclic graphs
or graphs that are nearly acyclic, such as the graphs of dependencies between modules or functions in a software
system. In these methods, the nodes of the graph are arranged into horizontal layers using methods such as the
Coffman–Graham algorithm, in such a way that most edges go downwards from one layer to the next; after this
step, the nodes within each layer are arranged in order to minimize crossings.[14]
Circular layout methods place the vertices of the graph on a circle, choosing carefully the ordering of the vertices
around the circle to reduce crossings and place adjacent vertices close to each other. Edges may be drawn either
as chords of the circle or as arcs inside or outside of the circle. In some cases, multiple circles may be used.[15]
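The placement step of a circular layout, before any crossing-reducing reordering, is simply even spacing on a circle; a sketch (the hard algorithmic part in practice is choosing the vertex order, which this function takes as given):

```python
import math

def circular_layout(nodes, radius=1.0):
    """Place the given vertices evenly on a circle; the order of `nodes`
    is the order around the circle (a crossing-reduction step would
    permute it before calling this)."""
    n = len(nodes)
    return {v: (radius * math.cos(2.0 * math.pi * i / n),
                radius * math.sin(2.0 * math.pi * i / n))
            for i, v in enumerate(nodes)}
```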
In addition, the placement and routing steps of electronic design automation are similar in many ways to graph
drawing, and the graph drawing literature includes several results borrowed from the EDA literature. However, these
problems also differ in several important ways: for instance, in EDA, area minimization and signal length are more
important than aesthetics, and the routing problem in EDA may have more than two terminals per net while the
analogous problem in graph drawing generally only involves pairs of vertices for each edge.
Software
Software, systems, and providers of systems for drawing graphs include:
Graphviz, an open-source graph drawing system from AT&T[20]
yEd, a widely used graph editor with graph layout functionality[21]
Microsoft Automatic Graph Layout, a .NET library (formerly called GLEE) for laying out graphs[22]
Tom Sawyer Software[23] Tom Sawyer Perspectives is a graphics-based software for building enterprise-class
data visualization and social network analysis applications. It is a Software Development Kit (SDK) with a
graphics-based design and preview environment.
Tulip (software)[24]
Gephi, an open-source network analysis and visualization software
Cytoscape
Notes
[1] Di Battista et al. (1994), pp. vii–viii; Herman, Melançon & Marshall (2000), Section 1.1, "Typical Application Areas".
[2] Di Battista et al. (1994), p. 6.
[3] Di Battista et al. (1994), p. viii.
[4] Misue et al. (1995).
[5] Holten & van Wijk (2009); Holten et al. (2011).
[6] Di Battista et al. (1994), Section 2.1.2, "Aesthetics", pp. 14–16; Purchase, Cohen & James (1997).
[7] Di Battista et al. (1994), p. 14.
[8] Di Battista et al. (1994), p. 16.
[9] Pach & Sharir (2009).
[10] Di Battista et al. (1994), Section 2.7, "The Force-Directed Approach", pp. 29–30, and Chapter 10, "Force-Directed Methods", pp. 303–326.
[11] Beckman (1994); Koren (2005).
[12] Di Battista et al. (1994), Chapter 5, "Flow and Orthogonal Drawings", pp. 137–170; Eiglsperger, Fekete & Klau (2001).
[13] Herman, Melançon & Marshall (2000), Section 2.2, "Traditional Layout – An Overview".
[14] Sugiyama, Tagawa & Toda (1981); Bastert & Matuszewski (2001); Di Battista et al. (1994), Chapter 9, "Layered Drawings of Digraphs",
pp. 265–302.
[15] Doğrusöz, Madden & Madden (1997).
[16] Scott (2000).
[17] Di Battista et al. (1994), pp. 15–16, and Chapter 6, "Flow and Upward Planarity", pp. 171–214; Freese (2004).
[18] Zapponi (2003).
[19] Anderson & Head (2006).
[20] "Graphviz and Dynagraph – Static and Dynamic Graph Drawing Tools", by John Ellson, Emden R. Gansner, Eleftherios Koutsofios,
Stephen C. North, and Gordon Woodhull, in Jünger & Mutzel (2004).
[21] "yFiles – Visualization and Automatic Layout of Graphs", by Roland Wiese, Markus Eiglsperger, and Michael Kaufmann, in Jünger &
Mutzel (2004).
[22] Nachmanson, Robertson & Lee (2008).
[23] Madden et al. (1996).
[24] "Tulip – A Huge Graph Visualization Framework", by David Auber, in Jünger & Mutzel (2004).
References
Anderson, James Andrew; Head, Thomas J. (2006), Automata Theory with Modern Applications (http://books.
google.com/books?id=ikS8BLdLDxIC&pg=PA38), Cambridge University Press, pp. 38–41,
ISBN 978-0-521-84887-9.
Bastert, Oliver; Matuszewski, Christian (2001), "Layered drawings of digraphs", in Kaufmann, Michael; Wagner,
Dorothea, Drawing Graphs: Methods and Models, Lecture Notes in Computer Science, 2025, Springer-Verlag,
pp. 87–120, doi:10.1007/3-540-44969-8_5.
Beckman, Brian (1994), Theory of Spectral Graph Layout (http://research.microsoft.com/apps/pubs/default.
aspx?id=69611), Tech. Report MSR-TR-94-04, Microsoft Research.
Di Battista, Giuseppe; Eades, Peter; Tamassia, Roberto; Tollis, Ioannis G. (1994), "Algorithms for Drawing
Graphs: an Annotated Bibliography" (http://www.cs.brown.edu/people/rt/gd.html), Computational
Geometry: Theory and Applications 4: 235–282.
Di Battista, Giuseppe; Eades, Peter; Tamassia, Roberto; Tollis, Ioannis G. (1998), Graph Drawing: Algorithms
for the Visualization of Graphs, Prentice Hall, ISBN 978-0-13-301615-4.
Doğrusöz, Uğur; Madden, Brendan; Madden, Patrick (1997), "Circular layout in the Graph Layout toolkit", in
North, Stephen, Symposium on Graph Drawing, GD '96, Berkeley, California, USA, September 18–20, 1996,
Proceedings, Lecture Notes in Computer Science, 1190, Springer-Verlag, pp. 92–100,
doi:10.1007/3-540-62495-3_40.
Eiglsperger, Markus; Fekete, Sándor; Klau, Gunnar (2001), "Orthogonal graph drawing", in Kaufmann, Michael;
Wagner, Dorothea, Drawing Graphs, Lecture Notes in Computer Science, 2025, Springer Berlin / Heidelberg,
pp. 121–171, doi:10.1007/3-540-44969-8_6.
Freese, Ralph (2004), "Automated lattice drawing" (http://www.math.hawaii.edu/~ralph/Preprints/
latdrawing.pdf), in Eklund, Peter, Concept Lattices: Second International Conference on Formal Concept
Analysis, ICFCA 2004, Sydney, Australia, February 23–26, 2004, Proceedings, Lecture Notes in Computer
Science, 2961, Springer-Verlag, pp. 589–590, doi:10.1007/978-3-540-24651-0_12.
Herman, Ivan; Melançon, Guy; Marshall, M. Scott (2000), "Graph Visualization and Navigation in Information
Visualization: A Survey" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.28.8892), IEEE
Transactions on Visualization and Computer Graphics 6 (1): 24–43, doi:10.1109/2945.841119.
Holten, Danny; Isenberg, Petra; van Wijk, Jarke J.; Fekete, Jean-Daniel (2011), "An extended evaluation of the
readability of tapered, animated, and textured directed-edge representations in node-link graphs" (http://www.
lri.fr/~isenberg/publications/papers/Holten_2011_AEP.pdf), IEEE Pacific Visualization Symposium
(PacificVis 2011), pp. 195–202, doi:10.1109/PACIFICVIS.2011.5742390.
Holten, Danny; van Wijk, Jarke J. (2009), "A user study on visualizing directed edges in graphs" (http://www.
win.tue.nl/~dholten/papers/directed_edges_chi.pdf), Proceedings of the 27th International Conference on
Human Factors in Computing Systems (CHI '09), pp. 2299–2308, doi:10.1145/1518701.1519054.
Jünger, Michael; Mutzel, Petra (2004), Graph Drawing Software, Springer-Verlag, ISBN 978-3-540-00881-1.
Koren, Yehuda (2005), "Drawing graphs by eigenvectors: theory and practice" (https://akpublic.research.att.
com/areas/visualization/papers_videos/pdf/DBLP-journals-camwa-Koren05.pdf), Computers & Mathematics
with Applications 49 (11–12): 1867–1888, doi:10.1016/j.camwa.2004.08.015, MR2154691.
Madden, Brendan; Madden, Patrick; Powers, Steve; Himsolt, Michael (1996), "Portable graph layout and
editing", in Brandenburg, Franz J., Graph Drawing: Symposium on Graph Drawing, GD '95, Passau, Germany,
September 20–22, 1995, Proceedings, Lecture Notes in Computer Science, 1027, Springer-Verlag, pp. 385–395,
doi:10.1007/BFb0021822.
Misue, K.; Eades, P.; Lai, W.; Sugiyama, K. (1995), "Layout Adjustment and the Mental Map", Journal of Visual
Languages and Computing 6 (2): 183–210.
Nachmanson, Lev; Robertson, George; Lee, Bongshin (2008), "Drawing Graphs with GLEE" (ftp://ftp.research.
microsoft.com/pub/TR/TR-2007-72.pdf), in Hong, Seok-Hee; Nishizeki, Takao; Quan, Wu, Graph Drawing,
15th International Symposium, GD 2007, Sydney, Australia, September 24–26, 2007, Revised Papers, Lecture
Notes in Computer Science, 4875, Springer-Verlag, pp. 389–394, doi:10.1007/978-3-540-77537-9_38.
Pach, János; Sharir, Micha (2009), "5.5 Angular resolution and slopes", Combinatorial Geometry and Its
Algorithmic Applications: The Alcalá Lectures, Mathematical Surveys and Monographs, 152, American
Mathematical Society, pp. 126–127.
Purchase, H. C.; Cohen, R. F.; James, M. I. (1997), "An experimental study of the basis for graph drawing
algorithms" (https://secure.cs.uvic.ca/twiki/pub/Research/Chisel/ComputationalAestheticsProject/
Vol2Nbr4.pdf), Journal of Experimental Algorithmics 2: Article 4, doi:10.1145/264216.264222.
Scott, John (2000), "Sociograms and Graph Theory" (http://books.google.com/books?id=Ww3_bKcz6kgC&
pg=PA), Social network analysis: a handbook (2nd ed.), Sage, pp. 64–69, ISBN 978-0-7619-6339-4.
Sugiyama, Kozo; Tagawa, Shôjirô; Toda, Mitsuhiko (1981), "Methods for visual understanding of hierarchical
system structures", IEEE Transactions on Systems, Man, and Cybernetics SMC-11 (2): 109–125,
doi:10.1109/TSMC.1981.4308636, MR0611436.
Zapponi, Leonardo (August 2003), "What is a Dessin d'Enfant" (http://www.ams.org/notices/200307/what-is.
pdf), Notices of the American Mathematical Society 50: 788–789, ISSN 0002-9920.
External links
Graph drawing e-print archive (http://gdea.informatik.uni-koeln.de/): including information on papers from all
Graph Drawing symposia.
Graph drawing (http://www.dmoz.org/Science/Math/Combinatorics/Software/Graph_Drawing/) at the
Open Directory Project for many additional links related to graph drawing.
of each spring is proportional to the graph-theoretic distance between nodes i and j. In this model, there is no need for a
separate repulsive force. Note that minimizing the difference (usually the squared difference) between Euclidean and
ideal distances between nodes is then equivalent to a metric multidimensional scaling problem. Stress majorization
gives a very well-behaved (i.e., monotonically convergent) and mathematically elegant way to minimise these
differences and, hence, find a good layout for the graph.
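The quantity being minimized can be written as a stress function in the Kamada–Kawai style, comparing Euclidean distances in the layout against ideal graph-theoretic distances. A sketch, where `graph_dist` is assumed to be a precomputed all-pairs shortest-path table:

```python
import math
from itertools import combinations

def stress(pos, graph_dist):
    """Kamada-Kawai-style stress of a 2D layout: weighted sum of squared
    differences between Euclidean distances in the drawing (pos[v] is a
    2D point) and ideal graph-theoretic distances graph_dist[u][v]."""
    total = 0.0
    for u, v in combinations(pos, 2):
        d = graph_dist[u][v]
        e = math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])
        total += (e - d) ** 2 / (d * d)    # weight 1/d^2 per vertex pair
    return total
```

A layout that realizes every ideal distance exactly, such as a path drawn on a line with unit spacing, has stress zero; stress majorization iteratively decreases this value.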
A force-directed graph can involve forces other than mechanical springs and electrical repulsion; examples include
logarithmic springs (as opposed to linear springs), gravitational forces (which aggregate connected components in
graphs that are not singly connected), magnetic fields (so as to obtain good layouts for directed graphs), and
electrically charged springs (in order to avoid overlap or near-overlap in the final drawing). In the case of
spring-and-charged-particle graphs, the edges tend to have uniform length (because of the spring forces), and nodes
that are not connected by an edge tend to be drawn further apart (because of the electrical repulsion).
While graph drawing is a difficult problem, force-directed algorithms, being physical simulations, usually require no
special knowledge about graph theory such as planarity.
It is also possible to employ mechanisms that search more directly for energy minima, either instead of or in
conjunction with physical simulation. Such mechanisms, which are examples of general global optimization
methods, include simulated annealing and genetic algorithms.
Advantages
The following are among the most important advantages of force-directed algorithms:
Good-quality results
At least for graphs of medium size (up to 50–100 vertices), the results obtained are usually very good based on the following criteria: uniform edge length, uniform vertex distribution, and showing symmetry. This last criterion is among the most important ones and is hard to achieve with any other type of algorithm.
Flexibility
Force-directed algorithms can be easily adapted and extended to fulfill additional aesthetic criteria. This makes
them the most versatile class of graph drawing algorithms. Examples of existing extensions include the ones
for directed graphs, 3D graph drawing,[1] cluster graph drawing, constrained graph drawing, and dynamic
graph drawing.
Intuitive
Since they are based on physical analogies of common objects, like springs, the behavior of the algorithms is
relatively easy to predict and understand. This is not the case with other types of graph-drawing algorithms.
Simplicity
Typical force-directed algorithms are simple and can be implemented in a few lines of code. Other classes of
graph-drawing algorithms, like the ones for orthogonal layouts, are usually much more involved.
Interactivity
Another advantage of this class of algorithm is the interactive aspect. By drawing the intermediate stages of
the graph, the user can follow how the graph evolves, seeing it unfold from a tangled mess into a good-looking
configuration. In some interactive graph drawing tools, the user can pull one or more nodes out of their
equilibrium state and watch them migrate back into position. This makes them a preferred choice for dynamic
and online graph-drawing systems.
Strong theoretical foundations
While simple ad-hoc force-directed algorithms (such as the one given in pseudo-code in this article) often
appear in the literature and in practice (because they are relatively easy to understand), more reasoned
approaches are starting to gain traction. Statisticians have been solving similar problems in multidimensional
scaling (MDS) since the 1930s, and physicists also have a long history of working with related n-body problems, so extremely mature approaches exist. As an example, the stress majorization approach to metric
MDS can be applied to graph drawing as described above. This has been proven to converge monotonically.[2]
Monotonic convergence, the property that the algorithm will at each iteration decrease the stress or cost of the
layout, is important because it guarantees that the layout will eventually reach a local minimum and stop.
Damping schedules, such as the one used in the pseudo-code below, cause the algorithm to stop, but cannot guarantee that a true local minimum is reached.
Disadvantages
The main disadvantages of force-directed algorithms include the following:
High running time
The typical force-directed algorithms are in general considered to have a running time equivalent to O(n³), where n is the number of nodes of the input graph. This is because the number of iterations is estimated to be O(n), and in every iteration, all pairs of nodes need to be visited and their mutual repulsive forces computed.
This is related to the N-body problem in physics. However, since repulsive forces are local in nature the graph
can be partitioned such that only neighboring vertices are considered. Common techniques used by algorithms
for determining the layout of large graphs include high-dimensional embedding,[3] multi-layer drawing and other methods related to N-body simulation. For example, the Barnes–Hut simulation-based method FADE[4] can improve the running time to O(n log n) per iteration. As a rough guide, in a few seconds one can expect to draw at most 1,000 nodes with a standard O(n²) per iteration technique, and 100,000 with an O(n log n) per iteration technique.[4] Force-directed algorithms, when combined with a multilevel approach, can draw graphs of millions of nodes.[5]
Poor local minima
It is easy to see that force-directed algorithms produce a layout with locally minimal energy; the local minimum found can be, in many cases, considerably worse than a global minimum, which translates into a low-quality drawing. For many algorithms, especially those that allow only down-hill moves of the vertices, the final result can be strongly influenced by the initial layout, which in most cases is randomly generated. The problem of poor local minima becomes more important as the
number of vertices of the graph increases. A combined application of different algorithms is helpful to solve this problem. For example, the Kamada–Kawai algorithm[6] can be used to quickly generate a reasonable initial layout, and the Fruchterman–Reingold algorithm[7] can then be used to improve the placement of neighbouring nodes. Another technique for approaching a global minimum is to use a multilevel approach.
Pseudocode
Each node has an (x, y) position, a (dx, dy) velocity, and a mass m. There is usually a spring constant, s, and a damping constant: 0 < damping < 1. The force toward and away from nodes is calculated according to Hooke's law and Coulomb's law, or similar, as discussed above. The example can be trivially expanded to include a z position for a 3D representation.
set up initial node velocities to (0,0)
set up initial node positions randomly // make sure no 2 nodes are in exactly the same position
loop
total_kinetic_energy := 0 // running sum of total kinetic energy over all particles
for each node
net-force := (0, 0) // running sum of total force on this particular node
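The loop sketched above can be completed into a runnable form. The following is an illustrative implementation of the model described (Hooke's-law attraction along edges, Coulomb-style repulsion between all node pairs, damped velocity updates, termination when the total kinetic energy falls below a threshold); the constants and the function name are arbitrary choices, not values from the source, and every node is given unit mass:

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=500,
                          repulsion=1000.0, spring=0.06,
                          rest_length=30.0, damping=0.85,
                          timestep=0.1, tolerance=1e-3):
    """Return {node: (x, y)}. nodes: iterable of hashables,
    edges: iterable of (u, v) pairs."""
    pos = {v: (random.uniform(0, 100), random.uniform(0, 100)) for v in nodes}
    vel = {v: (0.0, 0.0) for v in nodes}
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for _ in range(iterations):
        total_kinetic = 0.0                     # running sum over all nodes
        for v in pos:
            fx = fy = 0.0                       # running sum of force on v
            x1, y1 = pos[v]
            for u in pos:                       # Coulomb repulsion from every other node
                if u == v:
                    continue
                x2, y2 = pos[u]
                dx, dy = x1 - x2, y1 - y2
                d2 = dx * dx + dy * dy or 1e-9  # avoid division by zero
                d = math.sqrt(d2)
                f = repulsion / d2
                fx += f * dx / d
                fy += f * dy / d
            for u in adj[v]:                    # Hooke attraction along incident edges
                x2, y2 = pos[u]
                dx, dy = x2 - x1, y2 - y1
                d = math.hypot(dx, dy) or 1e-9
                f = spring * (d - rest_length)
                fx += f * dx / d
                fy += f * dy / d
            vx, vy = vel[v]
            vx = (vx + timestep * fx) * damping
            vy = (vy + timestep * fy) * damping
            vel[v] = (vx, vy)
            pos[v] = (x1 + timestep * vx, y1 + timestep * vy)
            total_kinetic += vx * vx + vy * vy  # unit mass assumed
        if total_kinetic < tolerance:           # the system has settled
            break
    return pos
```

Drawing the intermediate positions of each iteration gives the interactive "unfolding" behavior described under Advantages.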
References
[1] Vose, Aaron. "3D Phylogenetic Tree Viewer" (http://www.aaronvose.com/phytree3d/). Retrieved 3 June 2012.
[2] de Leeuw, J. (1988)
[3] Harel, D.; Koren, Y. (2002)
[4] Quigley, A.; Eades, P. (2001)
[5] "A Gallery of Large Graphs" (http://www2.research.att.com/~yifanhu/GALLERY/GRAPHS/). Retrieved 1 July 2012.
[6] Kamada, T.; Kawai, S. (1989)
[7] Fruchterman, T. M. J.; Reingold, E. M. (1991)
Further reading
di Battista, Giuseppe; Eades, Peter; Tamassia, Roberto; Tollis, Ioannis G. (1999). Graph Drawing: Algorithms for the Visualization of Graphs. Prentice Hall. ISBN 978-0-13-301615-4.
Eades, Peter (1984). "A Heuristic for Graph Drawing". Congressus Numerantium 42 (11): 149–160.
Fruchterman, Thomas M. J.; Reingold, Edward M. (1991). "Graph Drawing by Force-Directed Placement" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.8444). Software – Practice & Experience (Wiley) 21 (11): 1129–1164. doi:10.1002/spe.4380211102.
Harel, David; Koren, Yehuda (2002). "Graph Drawing by High-Dimensional Embedding" (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.5390). Proceedings of the 9th International Symposium on Graph Drawing. pp. 207–219. ISBN 3-540-00158-1.
Kamada, Tomihisa; Kawai, Satoru (1989). "An algorithm for drawing general undirected graphs". Information Processing Letters (Elsevier) 31 (1): 7–15. doi:10.1016/0020-0190(89)90102-6.
Kaufmann, Michael; Wagner, Dorothea, eds. (2001). Drawing graphs: methods and models. Lecture Notes in Computer Science 2025. Springer. doi:10.1007/3-540-44969-8. ISBN 978-3-540-42062-0.
de Leeuw, Jan (1988). "Convergence of the majorization method for multidimensional scaling". Journal of Classification (Springer) 5 (2): 163–180. doi:10.1007/BF01897162.
Quigley, Aaron; Eades, Peter (2001). "FADE: Graph Drawing, Clustering, and Visual Abstraction" (http://www.cs.ucd.ie/staff/aquigley/home/downloads/aq-gd2000.pdf) (PDF). Proceedings of the 8th International Symposium on Graph Drawing. pp. 197–210. ISBN 3-540-41554-8.
External links
Video of Spring Algorithm (http://www.cs.usyd.edu.au/~aquigley/avi/spring.avi)
Live visualisation in flash + source code and description (http://blog.ivank.net/
force-based-graph-drawing-in-as3.html)
Short explanation of the Kamada-Kawai spring-based graph layout algorithm featuring a picture (https://nwb.
slis.indiana.edu/community/?n=VisualizeData.Kamada-Kawaii)
Short explanation of Fruchterman-Reingold algorithm. The algorithm implements a variable step width
(temperature) to guarantee that the system reaches equilibrium state (https://nwb.slis.indiana.edu/
community/?n=VisualizeData.Fruchterman-Rheingold)
Daniel Tunkelang's dissertation (http://reports-archive.adm.cs.cmu.edu/anon/1998/abstracts/98-189.html) (
with source code (http://www.cs.cmu.edu/~quixote/JiggleSource.zip) and demonstration applet (http://
www.cs.cmu.edu/~quixote/gd.html)) on force-directed graph layout
Hyperassociative Map Algorithm (http://wiki.syncleus.com/index.php/DANN:Hyperassociative_Map)
Implementation of a Force Directed Graph with C# including video demonstration (http://chris.widdowson.id.
au/?p=406)
Interactive and real-time force directed graphing algorithms used in an online database modeling tool (http://
www.anchormodeling.com/modeler)
Graph embedding
In topological graph theory, an embedding (also spelled imbedding) of a graph G on a surface Σ is a representation of G on Σ in which points of Σ are associated to vertices and simple arcs (homeomorphic images of [0, 1]) are associated to edges, in such a way that the endpoints of the arc associated to an edge are the points associated to the end vertices of that edge, no arc includes points associated with other vertices, and two arcs never intersect at a point interior to either arc.
Terminology
If a graph G is embedded on a closed surface Σ, the complement of the union of the points and arcs associated to the vertices and edges of G is a family of regions (or faces). A 2-cell embedding is an embedding in which every face is homeomorphic to an open disk.[3] A closed 2-cell embedding is an embedding in which the closure of every face is homeomorphic to a closed disk.
The genus of a graph is the minimal integer n such that the graph can be embedded in a surface of genus n. In
particular, a planar graph has genus 0, because it can be drawn on a sphere without self-crossing. The
non-orientable genus of a graph is the minimal integer n such that the graph can be embedded in a non-orientable
surface of (non-orientable) genus n.[2]
Combinatorial embedding
An embedded graph uniquely defines cyclic orders of edges incident to the same vertex. The set of all these cyclic
orders is called a rotation system. Embeddings with the same rotation system are considered to be equivalent and the
corresponding equivalence class of embeddings is called combinatorial embedding (as opposed to the term
topological embedding, which refers to the previous definition in terms of points and curves). Sometimes, the
rotation system itself is called a "combinatorial embedding".[4][5][6]
An embedded graph also defines natural cyclic orders of edges which constitute the boundaries of the faces of the embedding. However, handling these face-based orders is less straightforward, since in some cases some edges may be traversed twice along a face boundary. For example, this is always the case for embeddings of trees, which have a
single face. To overcome this combinatorial nuisance, one may consider that every edge is "split" lengthwise in two
"half-edges", or "sides". Under this convention in all face boundary traversals each half-edge is traversed only once
and the two half-edges of the same edge are always traversed in opposite directions.
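This half-edge convention yields a simple face-tracing procedure: from a directed half-edge (u, v), the next half-edge on the same face boundary is (v, w), where w follows u in the cyclic order around v. Counting the traced faces also gives the genus of the embedding via Euler's formula V − E + F = 2 − 2g. A sketch under that convention (the function names are illustrative):

```python
def trace_faces(rotation):
    """rotation: dict v -> list of neighbors in cyclic order around v.
    Returns the face boundaries, each as a list of darts (u, v)."""
    darts = {(u, v) for u in rotation for v in rotation[u]}
    faces = []
    while darts:
        start = next(iter(darts))
        face, dart = [], start
        while True:
            face.append(dart)
            darts.discard(dart)
            u, v = dart
            nbrs = rotation[v]
            # successor of u in the cyclic order around v
            w = nbrs[(nbrs.index(u) + 1) % len(nbrs)]
            dart = (v, w)
            if dart == start:
                break
        faces.append(face)
    return faces

def genus(rotation):
    """Genus of the 2-cell embedding described by the rotation system,
    from Euler's formula V - E + F = 2 - 2g."""
    V = len(rotation)
    E = sum(len(nbrs) for nbrs in rotation.values()) // 2
    F = len(trace_faces(rotation))
    return (2 - V + E - F) // 2
```

For example, a planar rotation system for K4 yields four triangular faces and genus 0; a different rotation system on the same graph can yield fewer faces and genus 1.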
Computational complexity
The problem of finding the graph genus is NP-hard (the problem of determining whether an n-vertex graph has
genus g is NP-complete).[7]
At the same time, the graph genus problem is fixed-parameter tractable, i.e., polynomial time algorithms are known
to check whether a graph can be embedded into a surface of a given fixed genus as well as to find the embedding.
The first breakthrough in this respect happened in 1979, when algorithms of time complexity O(n^O(g)) were independently submitted to the Annual ACM Symposium on Theory of Computing: one by I. Filotti and G.L. Miller
and another one by John Reif. Their approaches were quite different, but upon the suggestion of the program
committee they presented a joint paper.[8]
In 1999 it was reported that the fixed-genus case can be solved in time linear in the graph size and doubly
exponential in the genus.[9]
References
[1] Katoh, Naoki; Tanigawa, Shin-ichi (2007), "Enumerating Constrained Non-crossing Geometric Spanning Trees", Computing and Combinatorics, 13th Annual International Conference, COCOON 2007, Banff, Canada, July 16–19, 2007, Proceedings, Lecture Notes in Computer Science, 4598, Springer-Verlag, pp. 243–253, doi:10.1007/978-3-540-73545-8_25, ISBN 978-3-540-73544-1.
[2] Gross, Jonathan; Tucker, Tom (2001), Topological Graph Theory, Dover Publications, ISBN 0-486-41741-7.
[3] Lando, Sergei K.; Zvonkin, Alexander K. (2004), Graphs on Surfaces and their Applications, Springer-Verlag, ISBN 3-540-00203-0.
[4] Mutzel, Petra; Weiskircher, René (2000), "Computing Optimal Embeddings for Planar Graphs", Computing and Combinatorics, 6th Annual International Conference, COCOON 2000, Sydney, Australia, July 26–28, 2000, Proceedings, Lecture Notes in Computer Science, 1858, Springer-Verlag, pp. 95–104, doi:10.1007/3-540-44968-X_10, ISBN 978-3-540-67787-1.
[5] Djidjev, Hristo N. (1995), "On drawing a graph convexly in the plane", Graph Drawing, DIMACS International Workshop, GD '94, Princeton, New Jersey, USA, October 10–12, 1994, Proceedings, Lecture Notes in Computer Science, 894, Springer-Verlag, pp. 76–83, doi:10.1007/3-540-58950-3_358.
[6] Duncan, Christian; Goodrich, Michael T.; Kobourov, Stephen (2010), "Planar Drawings of Higher-Genus Graphs", Graph Drawing, 17th International Symposium, GD 2009, Chicago, IL, USA, September 22–25, 2009, Revised Papers, Lecture Notes in Computer Science, 5849, Springer-Verlag, pp. 45–56, doi:10.1007/978-3-642-11805-0_7, ISBN 978-3-642-11804-3.
[7] Thomassen, Carsten (1989), "The graph genus problem is NP-complete", Journal of Algorithms 10 (4): 568–576, doi:10.1016/0196-6774(89)90006-0.
[8] Filotti, I. S.; Miller, Gary L.; Reif, John (1979), "On determining the genus of a graph in O(v^O(g)) steps (Preliminary Report)", Proc. 11th Annu. ACM Symposium on Theory of Computing, pp. 27–37, doi:10.1145/800135.804395.
[9] Mohar, Bojan (1999), "A linear time algorithm for embedding graphs in an arbitrary surface", SIAM Journal on Discrete Mathematics 12 (1): 6–26, doi:10.1137/S089548019529248X.
Application: Sociograms
A sociogram is a graphic representation of social links that a person
has. It is a graph drawing that plots the structure of interpersonal
relations in a group situation.[1]
Overview
Sociograms were developed by Jacob L. Moreno to analyze choices or preferences within a group.[2] They can diagram the structure and patterns of group interactions. A sociogram can be drawn on the basis of many different criteria: social relations, channels of influence, lines of communication, etc.
(Figure: An example of a social network diagram.)
Points on a sociogram that receive many choices are called stars, while those with few or no choices are called isolates. Individuals who choose each other are said to have made a mutual choice; a one-way choice refers to an individual who chooses someone without the choice being reciprocated. Cliques are groups of three or more people within a larger group who all choose each other (mutual choices).
Sociograms are the charts or tools used to find the sociometry of a social space.
Under the Social Discipline Model, sociograms are sometimes used to reduce misbehavior in a classroom environment.[3] A sociogram is constructed after students answer a series of questions probing for affiliations with other classmates. The diagram can then be used to identify pathways for social acceptance for misbehaving students.
In this context, the resulting sociogram is known as a friendship chart. The size of each bubble represents importance: the most important person or thing is drawn in the biggest bubble, and the least important in the smallest.
References
[1] Sociogram (http://www.merriam-webster.com/dictionary/sociogram) at merriam-webster.com.
[2] Brown, Donald R.; Harvey, Don, An Experiential Approach to Organization Development (7th ed.), p. 134.
[3] Wolfgang, Charles H., Solving Discipline and Classroom Management Problems: Methods and Models for Today's Teachers; U.S.A.: John Wiley and Sons, 2001; p. 116.
External links
Free software tool (Windows & Mac) to make sociograms (http://www.phenotyping.com/sociogram)
Concept map
Overview
A concept map is a way of representing relationships between ideas, images, or words in the same way that a
sentence diagram represents the grammar of a sentence, a road map represents the locations of highways and towns,
and a circuit diagram represents the workings of an electrical appliance. In a concept map, each word or phrase is
connected to another and linked back to the original idea, word or phrase. Concept maps are a way to develop logical
thinking and study skills by revealing connections and helping students see how individual ideas form a larger
whole.[2]
Concept maps were developed to enhance meaningful learning in the sciences. A well-made concept map grows
within a context frame defined by an explicit "focus question", while a mind map often has only branches radiating
out from a central picture. There is research evidence that knowledge is stored in the brain in the form of productions
(situation-response conditionals) that act on declarative memory content which is also referred to as chunks or
propositions.[3][4] Because concept maps are constructed to reflect organization of the declarative memory system,
they facilitate sense-making and meaningful learning on the part of individuals who make concept maps and those
who use them.
History
The technique of concept mapping was developed by Joseph D. Novak[6] and his research team at Cornell University
in the 1970s as a means of representing the emerging science knowledge of students. It has subsequently been used
as a tool to increase meaningful learning in the sciences and other subjects as well as to represent the expert
knowledge of individuals and teams in education, government and business. Concept maps have their origin in the
learning movement called constructivism. In particular, constructivists hold that learners actively construct
knowledge.
Novak's work is based on the cognitive theories of David Ausubel (assimilation theory), who stressed the importance
of prior knowledge in being able to learn new concepts: "The most important single factor influencing learning is
what the learner already knows. Ascertain this and teach accordingly."[7] Novak taught students as young as six years
old to make concept maps to represent their response to focus questions such as "What is water?" "What causes the
seasons?" In his book Learning How to Learn, Novak states that "meaningful learning involves the assimilation of new concepts and propositions into existing cognitive structures".[5]
Various attempts have been made to conceptualize the process of creating concept maps. Ray McAleese, in a series of articles, has suggested that mapping is a process of off-loading. In his 1998 paper, McAleese draws on the work of Sowa[8] and a paper by Sweller & Chandler.[9] In essence, McAleese suggests that the process of making
knowledge explicit, using nodes and relationships, allows the individual to become aware of what they know and as
a result to be able to modify what they know.[10] Maria Birbili applies that same idea to helping young children learn
to think about what they know.[11] The concept of the Knowledge Arena is suggestive of a virtual space where
learners may explore what they know and what they do not know.
Use
Concept maps are used to stimulate the generation of ideas, and are believed to aid creativity. For example, concept mapping is sometimes used for brain-storming. Although they are often personalized and idiosyncratic, concept maps can be used to communicate complex ideas.
Formalized concept maps are used in software design, where a common usage is Unified Modeling Language diagramming amongst similar conventions and development methodologies.
(Figure: Example concept map, created using the IHMC CmapTools computer program.)
References
[1] Joseph D. Novak & Alberto J. Cañas (2006). "The Theory Underlying Concept Maps and How To Construct and Use Them" (http://cmap.ihmc.us/Publications/ResearchPapers/TheoryCmaps/TheoryUnderlyingConceptMaps.htm), Institute for Human and Machine Cognition. Accessed 24 Nov 2008.
[2] CONCEPT MAPPING FUELS (http://www.energyeducation.tx.gov/pdf/223_inv.pdf). Accessed 24 Nov 2008.
[3] Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Erlbaum.
[4] Anderson, J. R., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An Integrated Theory of the Mind. Psychological Review, 111(4), 1036–1050.
[5] Novak, J.D. & Gowin, D.B. (1996). Learning How To Learn, Cambridge University Press: New York, p. 7.
[6] "Joseph D. Novak" (http://www.ihmc.us/users/user.php?UserID=jnovak). Institute for Human and Machine Cognition (IHMC). Retrieved 2008-04-06.
[7] Ausubel, D. (1968). Educational Psychology: A Cognitive View. Holt, Rinehart & Winston, New York.
[8] Sowa, J.F. (1983). Conceptual structures: information processing in mind and machine, Addison-Wesley.
[9] Sweller, J. & Chandler, P. (1991). Evidence for Cognitive Load Theory. Cognition and Instruction, 8(4), pp. 351–362.
[10] McAleese, R. (1998). The Knowledge Arena as an Extension to the Concept Map: Reflection in Action, Interactive Learning Environments, 6(3), pp. 251–272.
[11] Birbili, M. (2006). "Mapping Knowledge: Concept Maps in Early Childhood Education" (http://ecrp.uiuc.edu/v8n2/birbili.html), Early Childhood Research & Practice, 8(2), Fall 2006.
[12] Moon, B.M., Hoffman, R.R., Novak, J.D., & Cañas, A.J. (2011). Applied Concept Mapping: Capturing, Analyzing and Organizing Knowledge (http://www.appliedconceptmapping.info). CRC Press: New York.
[13] Villalon, Jorge; Rafael Calvo (2011). "Concept maps as cognitive visualizations of writing assignments" (http://www.ifets.info/download_pdf.php?j_id=52&a_id=1149) (PDF). Journal of Educational Technology and Society 14 (3): 16–27. Retrieved 2011-11-16.
Further reading
Novak, J.D., Learning, Creating, and Using Knowledge: Concept Maps as Facilitative Tools in Schools and
Corporations, Lawrence Erlbaum Associates, (Mahwah), 1998.
Novak, J.D. & Gowin, D.B., Learning How to Learn, Cambridge University Press, (Cambridge), 1984.
External links
Concept Mapping: A Graphical System for Understanding the Relationship between Concepts (http://www.
ericdigests.org/1998-1/concept.htm) - From the ERIC Clearinghouse on Information and Technology.
A large catalog of papers on cognitive maps and learning (http://cmap.ihmc.us/Publications/) by Novak, Cañas, and others.
Example of a concept map from 1957 (http://www.mind-mapping.org/images/walt-disney-business-map.png)
by Walt Disney.
Interval graph
Definition
Let {I1, I2, ..., In} ⊂ P(R) be a set of intervals. The corresponding interval graph is G = (V, E), where
V = {I1, I2, ..., In}, and
{Iα, Iβ} ∈ E if and only if Iα ∩ Iβ ≠ ∅.
(Figure: Seven intervals on the real line and the corresponding seven-vertex interval graph.)
From this construction one can verify a common property held by all interval graphs. That is, graph G is an interval graph if and only if the maximal cliques of G can be ordered M1, M2, ..., Mk such that for any v ∈ Mi ∩ Mk, where i < k, it is also the case that v ∈ Mj for any Mj, i ≤ j ≤ k.[1]
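The definition translates directly into code. A minimal sketch, taking intervals as closed (lo, hi) pairs and representing vertices by their indices (the function name is illustrative):

```python
def interval_graph(intervals):
    """Build the interval graph of a list of closed intervals (lo, hi).
    Vertices are interval indices; an edge {i, j} is present iff the
    intervals have a non-empty intersection."""
    n = len(intervals)
    edges = set()
    for i in range(n):
        lo_i, hi_i = intervals[i]
        for j in range(i + 1, n):
            lo_j, hi_j = intervals[j]
            # Closed intervals intersect iff each starts before the other ends.
            if lo_i <= hi_j and lo_j <= hi_i:
                edges.add((i, j))
    return edges
```

Recognizing whether an abstract graph admits such a representation is harder than this construction, and is the subject of the linear-time recognition algorithms cited in the references (e.g. Booth & Lueker's PQ-tree method).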
The intersection graphs of arcs of a circle form circular-arc graphs, a class of graphs that contains the interval graphs.
The trapezoid graphs, intersections of trapezoids whose parallel sides all lie on the same two parallel lines, are also a
generalization of the interval graphs.
The pathwidth of an interval graph is one less than the size of its maximum clique (or equivalently, one less than its
chromatic number), and the pathwidth of any graph G is the same as the smallest pathwidth of an interval graph that
contains G as a subgraph.[5]
The connected triangle-free interval graphs are exactly the caterpillar trees.[6]
Applications
The mathematical theory of interval graphs was developed with a view towards applications by researchers at the RAND Corporation's mathematics department, which included young researchers such as Peter C. Fishburn, students such as Alan C. Tucker and Joel E. Cohen, and leaders such as Delbert Fulkerson and (recurring visitor) Victor Klee.[7] Cohen applied interval graphs to mathematical models of population biology, specifically food webs.[8]
Other applications include genetics, bioinformatics, and computer science. Finding a set of intervals that represent an
interval graph can also be used as a way of assembling contiguous subsequences in DNA mapping.[9] Interval graphs
are used to represent resource allocation problems in operations research and scheduling theory. Each interval
represents a request for a resource for a specific period of time; the maximum weight independent set problem for
the graph represents the problem of finding the best subset of requests that can be satisfied without conflicts.[10]
Interval graphs also play an important role in temporal reasoning.[11]
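The resource-allocation problem just described, a maximum-weight independent set on an interval graph, is solvable efficiently by the classic dynamic program over requests sorted by finish time. A sketch, treating intervals as half-open [start, finish) so that a request may begin exactly when another ends (the function name is illustrative):

```python
import bisect

def max_weight_schedule(requests):
    """requests: list of (start, finish, weight). Returns the maximum
    total weight of pairwise non-overlapping requests, i.e. a maximum
    weight independent set of the corresponding interval graph."""
    jobs = sorted(requests, key=lambda r: r[1])   # by finish time
    finishes = [f for _, f, _ in jobs]
    best = [0] * (len(jobs) + 1)                  # best[k]: optimum over first k jobs
    for k, (s, f, w) in enumerate(jobs, start=1):
        # index of the last earlier job finishing no later than this start
        p = bisect.bisect_right(finishes, s, 0, k - 1)
        # either skip this job, or take it plus the optimum over compatible jobs
        best[k] = max(best[k - 1], best[p] + w)
    return best[-1]
```

The sort dominates the cost, giving O(n log n) overall; each request is considered once, with a binary search locating its latest compatible predecessor.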
Notes
[1] (Fishburn 1985)
[2] Golumbic (1980).
[3] Gilmore & Hoffman (1964)
[4] Faudree, Flandrin & Ryjáček (1997), p. 89.
[5] Bodlaender (1998).
[6] Eckhoff (1993).
[7] Cohen (1978, pp. ix–10 (http://books.google.se/books/princeton?hl=en&q=interval+graph&vid=ISBN9780691082028&redir_esc=y#v=snippet&q="interval graph"&f=false))
[8] Cohen (1978, pp. 12–33 (http://books.google.se/books/princeton?hl=en&q=interval+graph&vid=ISBN9780691082028&redir_esc=y#v=snippet&q="interval graph"&f=false))
[9] Zhang et al. (1994).
[10] Bar-Noy et al. (2001).
[11] Golumbic & Shamir (1993).
References
Bar-Noy, Amotz; Bar-Yehuda, Reuven; Freund, Ari; Naor, Joseph (Seffi); Schieber, Baruch (2001), "A unified approach to approximating resource allocation and scheduling" (http://portal.acm.org/citation.cfm?id=335410&coll=portal&dl=ACM), Journal of the ACM 48 (5): 1069–1090, doi:10.1145/502102.502107.
Bodlaender, Hans L. (1998), "A partial k-arboretum of graphs with bounded treewidth", Theoretical Computer Science 209 (1–2): 1–45, doi:10.1016/S0304-3975(97)00228-4.
Booth, K. S.; Lueker, G. S. (1976), "Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms", J. Comput. System Sci. 13 (3): 335–379, doi:10.1016/S0022-0000(76)80045-1.
Cohen, Joel E. (1978). Food webs and niche space. Monographs in Population Biology 11. Princeton, NJ: Princeton University Press (http://press.princeton.edu/titles/324.html). pp. xv+1–190. ISBN 978-0-691-08202-8.
Eckhoff, Jürgen (1993), "Extremal interval graphs", Journal of Graph Theory 17 (1): 117–127, doi:10.1002/jgt.3190170112.
Faudree, Ralph; Flandrin, Evelyne; Ryjáček, Zdeněk (1997), "Claw-free graphs – A survey", Discrete Mathematics 164 (1–3): 87–147, doi:10.1016/S0012-365X(96)00045-3, MR1432221.
Fishburn, Peter C. (1985). Interval orders and interval graphs: A study of partially ordered sets. Wiley-Interscience Series in Discrete Mathematics. New York: John Wiley & Sons.
Fulkerson, D. R.; Gross, O. A. (1965), "Incidence matrices and interval graphs", Pacific Journal of Mathematics 15: 835–855.
Gilmore, P. C.; Hoffman, A. J. (1964), "A characterization of comparability graphs and of interval graphs", Can. J. Math. 16: 539–548, doi:10.4153/CJM-1964-055-5.
Golumbic, Martin Charles (1980), Algorithmic Graph Theory and Perfect Graphs, Academic Press, ISBN 0-12-289260-7.
Golumbic, Martin Charles; Shamir, Ron (1993), "Complexity and algorithms for reasoning about time: a graph-theoretic approach", J. Assoc. Comput. Mach. 40: 1108–1133.
Habib, Michel; McConnell, Ross; Paul, Christophe; Viennot, Laurent (2000), "Lex-BFS and partition refinement, with applications to transitive orientation, interval graph recognition, and consecutive ones testing" (http://www.cs.colostate.edu/~rmm/lexbfs.ps), Theor. Comput. Sci. 234: 59–84, doi:10.1016/S0304-3975(97)00241-7.
Zhang, Peisen; Schon, Eric A.; Fischer, Stuart G.; Cayanis, Eftihia; Weiss, Janie; Kistler, Susan; Bourne, Philip E. (1994), "An algorithm based on graph theory for the assembly of contigs in physical mapping of DNA", Bioinformatics 10 (3): 309–317, doi:10.1093/bioinformatics/10.3.309.
External links
"interval graph" (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_234.html). Information System
on Graph Class Inclusions (http://wwwteo.informatik.uni-rostock.de/isgci/index.html).
Weisstein, Eric W., " Interval graph (http://mathworld.wolfram.com/IntervalGraph.html)" from MathWorld.
Chordal graph
In the mathematical area of graph theory, a graph is chordal if
each of its cycles of four or more nodes has a chord, which is
an edge joining two nodes that are not adjacent in the cycle.
An equivalent definition is that any chordless cycle has at most three nodes; some state this as the condition that a chordal graph has no induced cycles of length more than three.
Chordal graphs are a subset of the perfect graphs. They are
sometimes also called rigid circuit graphs[1] or triangulated
graphs (the latter term is sometimes erroneously used for
plane triangulations (maximal planar graphs)).[2]
(Figure: A cycle (black) with two chords (green). As drawn, the graph is chordal; however, removing one green edge would result in a non-chordal graph, since the other green edge together with three black edges would form a cycle of length four with no chords.)
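Chordality can be tested with the standard two-phase approach: a maximum cardinality search (repeatedly selecting the vertex with the most already-selected neighbors) produces a perfect elimination ordering exactly when the graph is chordal, and the candidate ordering is then verified directly. A simple quadratic-time sketch of that approach (the function name is illustrative; tuned implementations run in linear time):

```python
def is_chordal(adj):
    """adj: dict v -> set of neighbors. Returns True iff every cycle of
    four or more vertices has a chord."""
    # Maximum cardinality search: build the ordering back to front.
    weight = {v: 0 for v in adj}
    selection = []
    unnumbered = set(adj)
    while unnumbered:
        v = max(unnumbered, key=lambda u: weight[u])
        selection.append(v)
        unnumbered.discard(v)
        for u in adj[v]:
            if u in unnumbered:
                weight[u] += 1
    order = selection[::-1]              # candidate perfect elimination ordering
    index = {v: i for i, v in enumerate(order)}
    # Verify the ordering: the later neighbors of each vertex must form a
    # clique; it suffices to check the earliest later neighbor against the rest.
    for v in order:
        later = [u for u in adj[v] if index[u] > index[v]]
        if later:
            w = min(later, key=lambda u: index[u])
            for u in later:
                if u != w and u not in adj[w]:
                    return False
    return True
```

On a chordless 4-cycle the verification fails for every ordering, while adding a chord splits the cycle into triangles and the test succeeds.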
Minimal separators
In any graph, a vertex separator is a set of vertices the removal of which leaves the remaining graph disconnected; a
separator is minimal if it has no proper subset that is also a separator. According to a theorem of Dirac (1961), the
chordal graphs are exactly the graphs in which each minimal separator is a clique; Dirac used this characterization to
prove that chordal graphs are perfect.
The family of chordal graphs may be defined inductively, as the graphs whose vertices can be divided into three nonempty subsets A, S, and B, such that A ∪ S and S ∪ B both form chordal induced subgraphs, S is a clique, and there are no edges from A to B. That is, they are the graphs that have a recursive decomposition by clique separators
into smaller subgraphs. For this reason, chordal graphs have also sometimes been called decomposable graphs.[3]
Subclasses
Interval graphs are the intersection graphs of subtrees of path graphs, a special case of trees; therefore, they are a
subfamily of the chordal graphs.
Split graphs are exactly the graphs that are both chordal and the complements of chordal graphs. Bender, Richmond
& Wormald (1985) showed that, in the limit as n goes to infinity, the fraction of n-vertex chordal graphs that are split
approaches one.
Ptolemaic graphs are exactly the graphs that are both chordal and distance-hereditary. Quasi-threshold graphs are a
subclass of ptolemaic graphs that are exactly the graphs that are both chordal and cographs. Block graphs are another
subclass of ptolemaic graphs in which every two maximal cliques have at most one vertex in common. A special
type is the windmill graphs, where the common vertex is the same for every pair of cliques.
Strongly chordal graphs are graphs that are chordal and contain no n-sun (n ≥ 3) as an induced subgraph.
The k-trees are the chordal graphs in which all maximal cliques and all maximal clique separators have the same
size.[4] The Apollonian networks are the chordal maximal planar graphs, or equivalently the planar 3-trees.[4] The
maximal outerplanar graphs are a subclass of 2-trees, and therefore are also chordal.
Superclasses
Chordal graphs are a subclass of the well known perfect graphs. Other superclasses of chordal graphs include weakly
chordal graphs, odd-hole-free graphs, and even-hole-free graphs. In fact, chordal graphs are precisely the graphs that
are both odd-hole-free and even-hole-free (see holes in graph theory).
References
Bender, E. A.; Richmond, L. B.; Wormald, N. C. (1985), "Almost all chordal graphs split", J. Austral. Math. Soc.,
A 38 (2): 214–221, doi:10.1017/S1446788700023077, MR 0770128.
Berry, Anne; Golumbic, Martin Charles; Lipshteyn, Marina (2007), "Recognizing chordal probe graphs and
cycle-bicolorable graphs", SIAM Journal on Discrete Mathematics 21 (3): 573–591, doi:10.1137/050637091.
Bodlaender, H. L.; Fellows, M. R.; Warnow, T. J. (1992), "Two strikes against perfect phylogeny", Proc. of 19th
International Colloquium on Automata, Languages and Programming.
Chandran, L. S.; Ibarra, L.; Ruskey, F.; Sawada, J. (2003), "Enumerating and characterizing the perfect
elimination orderings of a chordal graph" [5], Theoretical Computer Science 307 (2): 303–317,
doi:10.1016/S0304-3975(03)00221-4.
Dirac, G. A. (1961), "On rigid circuit graphs", Abhandlungen aus dem Mathematischen Seminar der Universität
Hamburg 25: 71–76, doi:10.1007/BF02992776, MR 0130190.
Fulkerson, D. R.; Gross, O. A. (1965), "Incidence matrices and interval graphs" [6], Pacific J. Math. 15: 835–855.
Gavril, Fănică (1974), "The intersection graphs of subtrees in trees are exactly the chordal graphs", Journal of
Combinatorial Theory, Series B 16: 47–56, doi:10.1016/0095-8956(74)90094-X.
Golumbic, Martin Charles (1980), Algorithmic Graph Theory and Perfect Graphs, Academic Press.
Habib, Michel; McConnell, Ross; Paul, Christophe; Viennot, Laurent (2000), "Lex-BFS and partition refinement,
with applications to transitive orientation, interval graph recognition, and consecutive ones testing" [7],
Theoretical Computer Science 234: 59–84, doi:10.1016/S0304-3975(97)00241-7.
Maffray, Frédéric (2003), "On the coloration of perfect graphs", in Reed, Bruce A.; Sales, Cláudia L., Recent
Advances in Algorithms and Combinatorics, CMS Books in Mathematics, 11, Springer-Verlag, pp. 65–84,
doi:10.1007/0-387-22444-0_3, ISBN 0-387-95434-1.
Patil, H. P. (1986), "On the structure of k-trees", Journal of Combinatorics, Information and System Sciences 11
(2–4): 57–64, MR 966069.
Rose, D.; Lueker, George; Tarjan, Robert E. (1976), "Algorithmic aspects of vertex elimination on graphs", SIAM
Journal on Computing 5 (2): 266–283, doi:10.1137/0205021.
Notes
[1] Dirac (1961).
[2] Weisstein, Eric W., "Triangulated Graph" (http://mathworld.wolfram.com/TriangulatedGraph.html) from MathWorld.
[3] Peter Bartlett. "Undirected Graphical Models: Chordal Graphs, Decomposable Graphs, Junction Trees, and Factorizations" (http://www.stat.berkeley.edu/~bartlett/courses/241A-spring2007/graphnotes.pdf).
[4] Patil (1986).
[5] http://www.cis.uoguelph.ca/~sawada/papers/chordal.pdf
[6] http://projecteuclid.org/Dienst/UI/1.0/Summarize/euclid.pjm/1102995572
[7] http://www.cs.colostate.edu/~rmm/lexbfs.ps
External links
Information System on Graph Class Inclusions (http://wwwteo.informatik.uni-rostock.de/isgci/index.html):
chordal graph (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_32.html)
Weisstein, Eric W., " Chordal Graph (http://mathworld.wolfram.com/ChordalGraph.html)" from MathWorld.
Perfect graph
In graph theory, a perfect graph is a graph in which the
chromatic number of every induced subgraph equals the
size of the largest clique of that subgraph. Perfect graphs
are the same as the Berge graphs, graphs that have no
odd-length induced cycle or induced complement of an
odd cycle.
The perfect graphs include many important families of
graphs, and serve to unify results relating colorings and
cliques in those families. For instance, in all perfect
graphs, the graph coloring problem, maximum clique
problem, and maximum independent set problem can all
be solved in polynomial time. In addition, several
important min-max theorems in combinatorics, such as
Dilworth's theorem, can be expressed in terms of the
perfection of certain associated graphs.
The Paley graph of order 9, colored with three colors and showing
a clique of three vertices. In this graph and each of its induced
subgraphs the chromatic number equals the clique number, so it is
a perfect graph.
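On small graphs the defining condition can be checked directly by brute force, comparing clique number and chromatic number over all induced subgraphs. An illustrative (exponential-time) sketch in Python; practical recognition instead uses the Berge-graph algorithm of Chudnovsky et al. (2005):

```python
from itertools import combinations, product

def clique_number(vs, adj):
    """Size of a largest clique within the vertex set vs."""
    for r in range(len(vs), 0, -1):
        for s in combinations(vs, r):
            if all(b in adj[a] for a, b in combinations(s, 2)):
                return r
    return 0

def chromatic_number(vs, adj):
    """Fewest colors properly coloring the subgraph induced by vs."""
    vs = list(vs)
    for k in range(1, len(vs) + 1):
        for coloring in product(range(k), repeat=len(vs)):
            colors = dict(zip(vs, coloring))
            if all(colors[a] != colors[b]
                   for a, b in combinations(vs, 2) if b in adj[a]):
                return k
    return 0

def is_perfect_bruteforce(adj):
    """Check chromatic number == clique number on every induced subgraph."""
    vertices = list(adj)
    return all(chromatic_number(s, adj) == clique_number(s, adj)
               for r in range(1, len(vertices) + 1)
               for s in combinations(vertices, r))

# The 5-cycle is the smallest imperfect graph (clique number 2, chromatic
# number 3), while the 4-cycle, being bipartite, is perfect.
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
c4 = {i: {(i - 1) % 4, (i + 1) % 4} for i in range(4)}
```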
perfection of the line graphs of bipartite graphs.
References
Berge, Claude (1961). "Färbung von Graphen, deren sämtliche bzw. deren ungerade Kreise starr sind". Wiss. Z.
Martin-Luther-Univ. Halle-Wittenberg Math.-Natur. Reihe 10: 114.
Berge, Claude (1963). "Perfect graphs". Six Papers on Graph Theory. Calcutta: Indian Statistical Institute.
pp. 1–21.
Chudnovsky, Maria; Cornuéjols, Gérard; Liu, Xinming; Seymour, Paul; Vušković, Kristina (2005). "Recognizing
Berge graphs". Combinatorica 25 (2): 143–186. doi:10.1007/s00493-005-0012-8.
Chudnovsky, Maria; Robertson, Neil; Seymour, Paul; Thomas, Robin (2006). "The strong perfect graph theorem".
Annals of Mathematics 164 (1): 51–229. doi:10.4007/annals.2006.164.51.
Gallai, Tibor (1958). "Maximum-minimum Sätze über Graphen". Acta Math. Acad. Sci. Hungar. 9 (3–4):
395–434. doi:10.1007/BF02020271.
Golumbic, Martin Charles (1980). Algorithmic Graph Theory and Perfect Graphs. Academic Press. ISBN
0-444-51530-5. Second edition, Annals of Discrete Mathematics 57, Elsevier, 2004.
Grötschel, Martin; Lovász, László; Schrijver, Alexander (1988). Geometric Algorithms and Combinatorial
Optimization. Springer-Verlag. See especially chapter 9, "Stable Sets in Graphs", pp. 273–303.
Lovász, László (1972). "Normal hypergraphs and the perfect graph conjecture". Discrete Mathematics 2 (3):
253–267. doi:10.1016/0012-365X(72)90006-4.
Lovász, László (1972). "A characterization of perfect graphs". Journal of Combinatorial Theory, Series B 13 (2):
95–98. doi:10.1016/0095-8956(72)90045-7.
Lovász, László (1983). "Perfect graphs". In Beineke, Lowell W.; Wilson, Robin J. (Eds.). Selected Topics in
Graph Theory, Vol. 2. Academic Press. pp. 55–87. ISBN 0-12-086202-6.
Intersection graph
In the mathematical area of graph theory, an intersection graph is a
graph that represents the pattern of intersections of a family of sets.
Any graph may be represented as an intersection graph, but some
important special classes of graphs may be defined by the types of sets
that are used to form an intersection representation of them.
For an overview of the theory of intersection graphs, and of important
special classes of intersection graphs, see McKee & McMorris (1999).
Formal definition
Formally, an intersection graph is an undirected graph formed from a family of sets
Si, i=0,1,2,...
by creating one vertex vi for each set Si, and connecting two vertices vi and vj by an edge whenever the corresponding
two sets have a nonempty intersection, that is,
E(G) = { {vi, vj} | Si ∩ Sj ≠ ∅ }.
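This definition translates directly into code. A minimal sketch in Python, with the family given as a list of finite sets:

```python
from itertools import combinations

def intersection_graph(sets):
    """Build the intersection graph of a family of sets.
    Vertex i stands for sets[i]; i and j are adjacent whenever
    sets[i] and sets[j] have a nonempty intersection."""
    vertices = range(len(sets))
    edges = {frozenset((i, j))
             for i, j in combinations(vertices, 2)
             if sets[i] & sets[j]}
    return set(vertices), edges

# Three intervals on a line, represented as sets of integer points:
# [0,2] meets [1,3], and [1,3] meets [3,5], but [0,2] and [3,5] are disjoint.
family = [set(range(0, 3)), set(range(1, 4)), set(range(3, 6))]
```

Representing intervals as finite point sets is an illustrative shortcut; any hashable set family works the same way.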
Scheinerman's conjecture (now a theorem) states that every planar graph can also be represented as an intersection
graph of line segments in the plane. However, intersection graphs of line segments may be nonplanar as well, and
recognizing intersection graphs of line segments is complete for the existential theory of the reals (Schaefer
2010).
The line graph of a graph G is defined as the intersection graph of the edges of G, where we represent each edge
as the set of its two endpoints.
A string graph is the intersection graph of curves on a plane.
A graph has boxicity k if it is the intersection graph of multidimensional boxes of dimension k, but not of any
smaller dimension.
Related concepts
An order-theoretic analog to the intersection graphs are the containment orders. In the same way that an intersection
representation of a graph labels every vertex with a set so that vertices are adjacent if and only if their sets have
nonempty intersection, so a containment representation f of a poset labels every element with a set so that for any x
and y in the poset, x ≤ y if and only if f(x) ⊆ f(y).
References
Čulík, K. (1964), "Applications of graph theory to mathematical logic and linguistics", Theory of Graphs and its
Applications (Proc. Sympos. Smolenice, 1963), Prague: Publ. House Czechoslovak Acad. Sci., pp. 13–20,
MR 0176940.
Erdős, Paul; Goodman, A. W.; Pósa, Louis (1966), "The representation of a graph by set intersections",
Canadian Journal of Mathematics 18 (1): 106–112, doi:10.4153/CJM-1966-014-3, MR 0186575.
Golumbic, Martin Charles (1980), Algorithmic Graph Theory and Perfect Graphs, Academic Press,
ISBN 0-12-289260-7.
McKee, Terry A.; McMorris, F. R. (1999), Topics in Intersection Graph Theory, SIAM Monographs on Discrete
Mathematics and Applications, 2, Philadelphia: Society for Industrial and Applied Mathematics,
ISBN 0-89871-430-3, MR 1672910.
Szpilrajn-Marczewski, E. (1945), "Sur deux propriétés des classes d'ensembles", Fund. Math. 33: 303–307,
MR 0015448.
Schaefer, Marcus (2010), "Complexity of Some Geometric and Topological Problems", Graph Drawing, 17th
International Symposium, GD 2009, Chicago, IL, USA, September 2009, Revised Papers, Lecture Notes in
Computer Science, 5849, Springer-Verlag, pp. 334–344, doi:10.1007/978-3-642-11805-0_32,
ISBN 978-3-642-11804-3.
External links
Jan Kratochvíl, A video lecture on intersection graphs (June 2007)
E. Prisner, A Journey through Intersection Graph County
Unit disk graph
Characterizations
There are several possible definitions of the unit disk graph, equivalent
to each other up to a choice of scale factor:
An intersection graph of equal-radius circles, or of equal-radius
disks
A graph formed from a collection of equal-radius circles, in which
two circles are connected by an edge if one circle contains the
center of the other circle
A graph formed from a collection of points in the Euclidean plane, in which two points are connected if their
distance is below a fixed threshold
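The third characterization, points with a fixed distance threshold, is the simplest to code. A sketch in Python (the threshold and coordinates are illustrative choices):

```python
from itertools import combinations
from math import dist

def unit_disk_graph(points, threshold=1.0):
    """Graph on point indices: i and j are adjacent iff the Euclidean
    distance between points[i] and points[j] is below the threshold."""
    return {frozenset((i, j))
            for i, j in combinations(range(len(points)), 2)
            if dist(points[i], points[j]) < threshold}

# Three collinear points spaced 0.6 apart: consecutive pairs are adjacent
# (distance 0.6 < 1), but the two endpoints are not (distance 1.2).
pts = [(0.0, 0.0), (0.6, 0.0), (1.2, 0.0)]
```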
Properties
Every induced subgraph of a unit disk graph is also a unit disk graph. An example of a graph that is not a unit disk
graph is the star K1,7 with one central node connected to seven leaves: if each of seven unit disks touches a common
unit disk, some two of the seven disks must touch each other. Therefore, unit disk graphs cannot contain an induced
K1,7 subgraph.
Applications
Beginning with the work of Huson & Sen (1995), unit disk graphs have been used in computer science to model the
topology of ad-hoc wireless communication networks. In this application, nodes are connected through a direct
wireless connection without a base station. It is assumed that all nodes are homogeneous and equipped with
omnidirectional antennas. Node locations are modeled as Euclidean points, and the area within which a signal from
one node can be received by another node is modeled as a circle. If all nodes have transmitters of equal power, these
circles are all equal. Random geometric graphs, formed as unit disk graphs with randomly generated disk centers,
have also been used as a model of percolation and various other phenomena.[1]
Computational complexity
It is NP-hard (more specifically, complete for the existential theory of the reals) to determine whether a graph, given
without geometry, can be represented as a unit disk graph.[2] Additionally, it is provably impossible in polynomial
time to output explicit coordinates of a unit disk graph representation: there exist unit disk graphs that require
exponentially many bits of precision in any such representation.[3]
However, many important and difficult graph optimization problems such as maximum independent set, graph
coloring, and minimum dominating set can be approximated efficiently by using the geometric structure of these
graphs,[4] and the maximum clique problem can be solved exactly for these graphs in polynomial time, given a disk
representation.[5] More strongly, if a graph is given as input, it is possible in polynomial time to produce either a
maximum clique or a proof that the graph is not a unit disk graph.[6]
References
Breu, Heinz; Kirkpatrick, David G. (1998), "Unit disk graph recognition is NP-hard", Computational Geometry:
Theory and Applications 9 (1–2): 3–24.
Clark, Brent N.; Colbourn, Charles J.; Johnson, David S. (1990), "Unit disk graphs", Discrete Mathematics 86
(1–3): 165–177, doi:10.1016/0012-365X(90)90358-O.
Dall, Jesper; Christensen, Michael (2002), "Random geometric graphs", Phys. Rev. E 66: 016121,
arXiv:cond-mat/0203026, doi:10.1103/PhysRevE.66.016121.
Huson, Mark L.; Sen, Arunabha (1995), "Broadcast scheduling algorithms for radio networks", Military
Communications Conference, IEEE MILCOM '95, 2, pp. 647–651, doi:10.1109/MILCOM.1995.483546,
ISBN 0-7803-2489-7.
Kang, Ross J.; Müller, Tobias (2011), "Sphere and dot product representations of graphs", Proceedings of the
Twenty-Seventh Annual Symposium on Computational Geometry (SCG'11), June 13–15, 2011, Paris, France,
pp. 308–314.
Marathe, Madhav V.; Breu, Heinz; Hunt, III, Harry B.; Ravi, S. S.; Rosenkrantz, Daniel J. (1994), Geometry
based heuristics for unit disk graphs, arXiv:math.CO/9409226.
Matsui, Tomomi (2000), "Approximation Algorithms for Maximum Independent Set Problems and Fractional
Coloring Problems on Unit Disk Graphs", Lecture Notes in Computer Science 1763: 194–200,
doi:10.1007/978-3-540-46515-7_16, ISBN 978-3-540-67181-7.
McDiarmid, Colin; Mueller, Tobias (2011), Integer realizations of disk and segment graphs, arXiv:1111.2931.
Miyamoto, Yuichiro; Matsui, Tomomi (2005), "Perfectness and Imperfectness of the kth Power of Lattice
Graphs", Lecture Notes in Computer Science 3521: 233–242, doi:10.1007/11496199_26,
ISBN 978-3-540-26224-4.
Raghavan, Vijay; Spinrad, Jeremy (2003), "Robust algorithms for restricted domains", Journal of Algorithms 48
(1): 160–172, doi:10.1016/S0196-6774(03)00048-8, MR 2006100.
Line graph
In graph theory, the line graph L(G) of an undirected graph G is another graph that represents the adjacencies
between edges of G. The name line graph comes from a paper by Harary & Norman (1960), although both Whitney
(1932) and Krausz (1943) used the construction before this (Hemminger & Beineke 1978, p. 273). Other terms used
for the line graph include the theta-obrazom, the covering graph, the derivative, the edge-to-vertex dual, the
conjugate, and the representative graph (Hemminger & Beineke 1978, p. 273), as well as the edge graph, the
interchange graph, the adjoint graph, and the derived graph (Balakrishnan 1997, p. 44).
One of the earliest and most important theorems about line graphs is due to Hassler Whitney (1932), who proved that
with one exceptional case the structure of G can be recovered completely from its line graph. In other words, with
that one exception, the entire graph can be deduced from knowing the adjacencies of edges ("lines").
Formal definition
Given a graph G, its line graph L(G) is a graph such that
each vertex of L(G) represents an edge of G; and
two vertices of L(G) are adjacent if and only if their corresponding edges share a common endpoint ("are
adjacent") in G.
That is, it is the intersection graph of the edges of G, representing each edge by the set of its two endpoints.
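Viewed as an intersection graph of two-element sets, the construction takes only a few lines. A sketch in Python, with the input graph given as an edge list:

```python
from itertools import combinations

def line_graph(edges):
    """Line graph of an undirected graph given as a list of edges.
    Each vertex of L(G) is an edge of G (a frozenset of two endpoints);
    two vertices are adjacent when the edges share an endpoint."""
    verts = [frozenset(e) for e in edges]
    adj = {frozenset((e, f))
           for e, f in combinations(verts, 2)
           if e & f}                       # nonempty intersection = adjacency
    return verts, adj

# The path 1-2-3 has two edges sharing vertex 2, so its line graph is
# a single edge joining the vertices {1,2} and {2,3}.
verts, adj = line_graph([(1, 2), (2, 3)])
```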
Examples
Example construction
The following figures show a graph (left, with blue vertices) and its line graph (right, with green vertices). Each
vertex of the line graph is shown labeled with the pair of endpoints of the corresponding edge in the original graph.
For instance, the green vertex on the right labeled 1,3 corresponds to the edge on the left between the blue vertices 1
and 3. Green vertex 1,3 is adjacent to three other green vertices: 1,4 and 1,2 (corresponding to edges sharing the
endpoint 1 in the blue graph) and 4,3 (corresponding to an edge sharing the endpoint 3 in the blue graph).
Graph G
Line graph
Triangular graphs
The line graph of the complete graph Kn was previously known as the triangular graph. An important theorem is
that the triangular graphs are characterized by their spectra, except for n = 8, where there are three other graphs with
the same spectrum as L(K8). The exceptions are explained by graph switching.
Medial graph
The medial graph is a variant of the line graph of a planar graph, in which two vertices of the medial graph are
adjacent if and only if the corresponding two edges are consecutive on some face of the planar graph. For simple
polyhedra, the medial graph and the line graph coincide, but for non-simple graphs the medial graph remains planar.
Thus, the medial graphs of the cube and octahedron are both isomorphic to the graph of the cuboctahedron, and the
medial graphs of the dodecahedron and icosahedron are both isomorphic to the graph of the icosidodecahedron.
Properties
Properties of a graph G that depend only on adjacency between edges may be translated into equivalent properties in
L(G) that depend on adjacency between vertices. For instance, a matching in G is a set of edges no two of which are
adjacent, and corresponds to a set of vertices in L(G) no two of which are adjacent, that is, an independent set.
Thus,
The line graph of a connected graph is connected. If G is connected, it contains a path connecting any two of its
edges, which translates into a path in L(G) containing any two of the vertices of L(G). However, a graph G that
has some isolated vertices, and is therefore disconnected, may nevertheless have a connected line graph.
A maximum independent set in a line graph corresponds to maximum matching in the original graph. Since
maximum matchings may be found in polynomial time, so may the maximum independent sets of line graphs,
despite the hardness of the maximum independent set problem for more general families of graphs.
The edge chromatic number of a graph G is equal to the vertex chromatic number of its line graph L(G).
The line graph of an edge-transitive graph is vertex-transitive.
If a graph G has an Euler cycle, that is, if G is connected and has an even number of edges at each vertex, then the
line graph of G is Hamiltonian. (However, not all Hamiltonian cycles in line graphs come from Euler cycles in
this way.)
Line graphs are claw-free graphs, graphs without an induced subgraph in the form of a three-leaf tree.
The line graphs of trees are exactly the claw-free block graphs.
In this example, the edges going upward, to the left, and to the right from the central degree-four vertex do not have
any cliques in common. Therefore, any partition of the graph's edges into cliques would have to have at least one
clique for each of these three edges, and these three cliques would all intersect in that central vertex, violating the
requirement that each vertex appear in exactly two cliques. Thus, the graph shown is not a line graph.
An alternative characterization of line graphs was proven by Beineke (1970) (and reported earlier without proof by
Beineke (1968)). He showed that there are nine minimal graphs that are not line graphs, such that any graph that is
not a line graph has one of these nine graphs as an induced subgraph. That is, a graph is a line graph if and only if no
subset of its vertices induces one of these nine graphs. In the example above, the four topmost vertices induce a claw
(that is, a complete bipartite graph K1,3), shown on the top left of the illustration of forbidden subgraphs. Therefore,
by Beineke's characterization, this example cannot be a line graph. For graphs with minimum degree at least 5, only
the six subgraphs in the left and right columns of the figure are needed in the characterization (Metelsky &
Tyshkevich 1997). Similar results are known for line graphs of hypergraphs.[2]
They show that, when G is a finite connected graph, only four behaviors are possible for this sequence:
If G is a cycle graph then L(G) and each subsequent graph in this sequence is isomorphic to G itself. These are the
only connected graphs for which L(G) is isomorphic to G.
If G is a claw K1,3, then L(G) and all subsequent graphs in the sequence are triangles.
If G is a path graph then each subsequent graph in the sequence is a shorter path until eventually the sequence
terminates with an empty graph.
In all remaining cases, the sizes of the graphs in this sequence eventually increase without bound.
If G is not connected, this classification applies separately to each component of G.
Generalizations
Multigraphs
The concept of the line graph of G may naturally be extended to the case where G is a multigraph, although in that
case Whitney's uniqueness theorem no longer holds; for instance a complete bipartite graph K1,n has the same line
graph as the dipole graph and Shannon multigraph with the same number of edges.
Line digraphs
It is also possible to generalize line graphs to directed graphs.[3] If G is a directed graph, its directed line graph or
line digraph has one vertex for each edge of G. Two vertices representing directed edges from u to v and from w to
x in G are connected by an edge from uv to wx in the line digraph when v = w. That is, each edge in the line digraph
of G represents a length-two directed path in G. The de Bruijn graphs may be formed by repeating this process of
forming directed line graphs, starting from a complete directed graph (Zhang & Lin 1987).
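A sketch of the directed construction in Python; starting it from the complete directed graph with loops on two symbols produces the de Bruijn graph on 2-bit strings:

```python
def line_digraph(arcs):
    """Directed line graph: one vertex per arc of G, with an arc from
    (u, v) to (w, x) exactly when v == w, i.e. when the two arcs form
    a length-two directed path in G."""
    return {((u, v), (w, x))
            for (u, v) in arcs for (w, x) in arcs if v == w}

# Complete directed graph on {0, 1}, loops included: its line digraph is
# the de Bruijn graph on 2-bit strings (vertex (a, b) ~ string "ab"),
# which has 4 vertices and 8 arcs.
k2 = {(a, b) for a in (0, 1) for b in (0, 1)}
db = line_digraph(k2)
```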
If edges d and e in the graph G are incident at a vertex v with degree k, then in the line graph L(G) the edge connecting
the two vertices d and e can be given weight 1/(k−1). In this way every edge in G (provided neither end is connected
to a vertex of degree 1) will have strength 2 in the line graph L(G), corresponding to the two ends that the edge has
in G.
Notes
[1] See also Krausz (1943).
[2] Weisstein, Eric W., "Line Graph" (http://mathworld.wolfram.com/LineGraph.html) from MathWorld.
[3] Harary, Frank, and Norman, Robert Z., "Some properties of line digraphs", Rend. Circ. Mat. Palermo, II. Ser. 9 (1960), 161–168.
References
Balakrishnan, V. K. (1997), Schaum's Outline of Graph Theory (1st ed.), McGraw-Hill, ISBN 0-07-005489-4.
Beineke, L. W. (1968), "Derived graphs of digraphs", in Sachs, H.; Voss, H.-J.; Walter, H.-J., Beiträge zur
Graphentheorie, Leipzig: Teubner, pp. 17–33.
Beineke, L. W. (1970), "Characterizations of derived graphs", Journal of Combinatorial Theory 9 (2): 129–135,
doi:10.1016/S0021-9800(70)80019-9, MR 0262097.
Brandstädt, Andreas; Le, Van Bang; Spinrad, Jeremy P. (1999), Graph Classes: A Survey, SIAM Monographs on
Discrete Mathematics and Applications, ISBN 0-89871-432-X.
Evans, T. S.; Lambiotte, R. (2009), "Line Graphs, Link Partitions and Overlapping Communities", Phys. Rev. E 80:
016105, doi:10.1103/PhysRevE.80.016105.
Harary, F.; Norman, R. Z. (1960), "Some properties of line digraphs", Rendiconti del Circolo Matematico di
Palermo 9 (2): 161–169, doi:10.1007/BF02854581.
Hemminger, R. L.; Beineke, L. W. (1978), "Line graphs and line digraphs", in Beineke, L. W.; Wilson, R. J.,
Selected Topics in Graph Theory, Academic Press Inc., pp. 271–305.
Krausz, J. (1943), "Démonstration nouvelle d'un théorème de Whitney sur les réseaux", Mat. Fiz. Lapok 50:
75–85, MR 0018403.
Metelsky, Yury; Tyshkevich, Regina (1997), "On line graphs of linear 3-uniform hypergraphs", Journal of Graph
Theory 25 (4): 243–251, doi:10.1002/(SICI)1097-0118(199708)25:4<243::AID-JGT1>3.0.CO;2-K.
van Rooij, A. C. M.; Wilf, H. S. (1965), "The interchange graph of a finite graph", Acta Mathematica Hungarica
16 (3–4): 263–269, doi:10.1007/BF01904834.
Roussopoulos, N. D. (1973), "A max {m,n} algorithm for determining the graph H from its line graph G",
Information Processing Letters 2 (4): 108–112, doi:10.1016/0020-0190(73)90029-X, MR 0424435.
Whitney, H. (1932), "Congruent graphs and the connectivity of graphs", American Journal of Mathematics 54 (1):
150–168, doi:10.2307/2371086, JSTOR 2371086.
Zhang, Fu Ji; Lin, Guo Ning (1987), "On the de Bruijn–Good graphs", Acta Math. Sinica 30 (2): 195–205,
MR 0891925.
External links
line graphs (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_249.html), Information System on
Graph Class Inclusions (http://wwwteo.informatik.uni-rostock.de/isgci/index.html)
Claw-free graph
In graph theory, an area of mathematics, a claw-free graph is a graph that
does not have a claw as an induced subgraph.
A claw is another name for the complete bipartite graph K1,3 (that is, a star
graph with three edges, three leaves, and one central vertex). A claw-free
graph is a graph in which no induced subgraph is a claw; that is, no set of
four vertices induces exactly the three edges of this star pattern.
Equivalently, a claw-free graph is a graph in which the neighborhood of any
vertex is the complement of a triangle-free graph.
A claw
Examples
The line graph L(G) of any graph G is claw-free; L(G) has a vertex for
every edge of G, and vertices are adjacent in L(G) whenever the
corresponding edges share an endpoint in G. A line graph L(G) cannot
contain a claw, because if three edges e1, e2, and e3 in G all share
endpoints with another edge e4 then by the pigeonhole principle at least
two of e1, e2, and e3 must share one of those endpoints with each other.
Line graphs may be characterized in terms of nine forbidden subgraphs;[2]
the claw is the simplest of these nine graphs. This characterization
provided the initial motivation for studying claw-free graphs.[1]
The de Bruijn graphs (graphs whose vertices represent n-bit binary strings
for some n, and whose edges represent (n−1)-bit overlaps between two
strings) are claw-free. One way to show this is via the construction of the
de Bruijn graph for n-bit strings as the line graph of the de Bruijn graph for
(n−1)-bit strings.
The complement of any triangle-free graph is claw-free.[3] These graphs include as a special case any complete
graph.
Proper interval graphs, the interval graphs formed as intersection graphs of families of intervals in which no
interval contains another interval, are claw-free, because four properly intersecting intervals cannot intersect in
the pattern of a claw.[3]
The Moser spindle, a seven-vertex graph used to provide a lower bound for the chromatic number of the plane, is
claw-free.
The graphs of several polyhedra and polytopes are claw-free, including the graph of the tetrahedron and more
generally of any simplex (a complete graph), the graph of the octahedron and more generally of any cross
polytope (isomorphic to the cocktail party graph formed by removing a perfect matching from a complete graph),
the graph of the regular icosahedron,[4] and the graph of the 16-cell.
The Schlfli graph, a strongly regular graph with 27 vertices, is claw-free.[4]
Recognition
It is straightforward to verify that a given graph with n vertices and m edges is claw-free in time O(n^4), by testing
each 4-tuple of vertices to determine whether they induce a claw.[5] Somewhat more efficiently, but more
complicatedly, one can test whether a graph is claw-free by checking, for each vertex of the graph, that the
complement graph of its neighbors does not contain a triangle. A graph contains a triangle if and only if the cube of
its adjacency matrix contains a nonzero diagonal element, so finding a triangle may be performed in the same
asymptotic time bound as n × n matrix multiplication.[6] Therefore, using the Coppersmith–Winograd algorithm, the
total time for this claw-free recognition algorithm would be O(n^3.376).
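The neighborhood test can also be written directly, without matrix multiplication: a vertex is the center of a claw exactly when three of its neighbors are pairwise nonadjacent. A straightforward (slower) sketch in Python:

```python
from itertools import combinations

def is_claw_free(adj):
    """adj maps each vertex to its set of neighbors.  The graph is
    claw-free iff no vertex has three pairwise nonadjacent neighbors,
    i.e. the complement of every neighborhood is triangle-free."""
    return not any(
        all(b not in adj[a] for a, b in combinations(trio, 2))
        for v in adj
        for trio in combinations(sorted(adj[v]), 3))

# The claw K1,3 itself: a center joined to three pairwise nonadjacent leaves.
claw = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
# Complete graphs are claw-free: every neighborhood is a clique.
k4 = {v: {0, 1, 2, 3} - {v} for v in range(4)}
```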
Kloks, Kratsch & Müller (2000) observe that in any claw-free graph, each vertex has at most 2√m neighbors; for
otherwise, by Turán's theorem, the neighbors of the vertex would not have enough remaining edges to form the
complement of a triangle-free graph. This observation allows the check of each neighborhood in the fast matrix
multiplication based algorithm outlined above to be performed in the same asymptotic time bound as 2√m × 2√m
matrix multiplication, or faster for vertices with even lower degrees. The worst case for this algorithm occurs when
Θ(√m) vertices have Θ(√m) neighbors each, and the remaining vertices have few neighbors, so its total time is
O(m^(3.376/2)) = O(m^1.688).
Enumeration
Because claw-free graphs include complements of triangle-free graphs, the number of claw-free graphs on n vertices
grows at least as quickly as the number of triangle-free graphs, exponentially in the square of n. The numbers of
connected claw-free graphs on n nodes, for n=1,2,... are
1, 1, 2, 5, 14, 50, 191, 881, 4494, 26389, 184749, ... (sequence A022562 in OEIS).
If the graphs are allowed to be disconnected, the numbers of graphs are even larger: they are
1, 2, 4, 10, 26, 85, 302, 1285, 6170, ... (sequence A086991 in OEIS).
A technique of Palmer, Read & Robinson (2002) allows the number of claw-free cubic graphs to be counted very
efficiently, unusually for graph enumeration problems.
Matchings
Sumner (1974) and, independently, Las Vergnas (1975)
proved that every claw-free connected graph with an even
number of vertices has a perfect matching.[7] That is,
there exists a set of edges in the graph such that each
vertex is an endpoint of exactly one of the matched edges.
The special case of this result for line graphs implies that,
in any graph with an even number of edges, one can
partition the edges into paths of length two. Perfect
matchings may be used to provide another
characterization of the claw-free graphs: they are exactly
the graphs in which every connected induced subgraph of
even order has a perfect matching.[7]
Independent sets
An independent set in a line graph corresponds to a matching
in its underlying graph, a set of edges no two of which share
an endpoint. As Edmonds (1965) showed, a maximum
matching in any graph may be found in polynomial time;
Sbihi (1980) extended this algorithm to one that computes a
maximum independent set in any claw-free graph.[8] Minty
(1980) (corrected by Nakamura & Tamura 2001)
independently provided an alternative extension of Edmonds'
algorithms to claw-free graphs, that transforms the problem
into one of finding a matching in an auxiliary graph derived
from the input claw-free graph. Minty's approach may also be used to solve in polynomial time the more general
problem of finding an independent set of maximum weight, and generalizations of these results to wider classes of
graphs are also known.[8]
A non-maximum independent set (the two violet nodes) and an augmenting path
As Sbihi observed, if I is an independent set in a claw-free graph, then any vertex of the graph may have at most two
neighbors in I: three neighbors would form a claw. Sbihi calls a vertex saturated if it has two neighbors in I and
unsaturated if it is not in I but has fewer than two neighbors in I. It follows from Sbihi's observation that if I and J
are both independent sets, the graph induced by the symmetric difference I Δ J must have degree at most two; that is, it is a union of paths and
cycles. In particular, if I is a non-maximum independent set, it differs from any maximum independent set by cycles
and augmenting paths, induced paths which alternate between vertices in I and vertices not in I, and for which both
endpoints are unsaturated. The symmetric difference of I with an augmenting path is a larger independent set; Sbihi's
algorithm repeatedly increases the size of an independent set by searching for augmenting paths until no more can be
found.
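Sbihi's augmentation step, taking the symmetric difference of the current independent set with an augmenting path, can be sketched as follows (a minimal illustration; the function name is ours, and the hard part of the algorithm, actually finding an augmenting path, is omitted):

```python
def augment(independent_set, path):
    """Symmetric difference of an independent set with an augmenting path.

    The path is a list of vertices alternating between vertices outside
    the set and vertices inside it, with both endpoints outside, so the
    result contains exactly one more vertex than the input set."""
    result = set(independent_set)
    for v in path:
        if v in result:
            result.remove(v)
        else:
            result.add(v)
    return result

# On the path graph 1-2-3-4-5, I = {2, 4} is independent but not maximum;
# the path 1, 2, 3, 4, 5 alternates outside/inside I with unsaturated ends.
print(sorted(augment({2, 4}, [1, 2, 3, 4, 5])))  # [1, 3, 5]
```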
Searching for an augmenting path is complicated by the fact that a path may fail to be augmenting if it contains an
edge between two vertices that are not in I, so that it is not an induced path. Fortunately, this can only happen in two
cases: the two adjacent vertices may be the endpoints of the path, or they may be two steps away from each other;
any other adjacency would lead to a claw. Adjacent endpoints may be avoided by temporarily removing the
neighbors of v when searching for a path starting from a vertex v; if no path is found, v can be removed from the
graph for the remainder of the algorithm. Although Sbihi does not describe it in these terms, the problem remaining
after this reduction may be described in terms of a switch graph, an undirected graph in which the edges incident to
each vertex are partitioned into two subsets and in which paths through the vertex are constrained to use one edge
from each subset. One may form a switch graph that has as its vertices the unsaturated and saturated vertices of the
given claw-free graph, with an edge between two vertices of the switch graph whenever they are nonadjacent in the
claw-free graph and there exists a length-two path between them that passes through a vertex of I. The two subsets of
edges at each vertex are formed by the two vertices of I that these length-two paths pass through. A path in this
switch graph between two unsaturated vertices corresponds to an augmenting path in the original graph. The switch
graph has quadratic size, paths in it may be found in linear time, and O(n) augmenting paths may need to be
found over the course of the overall algorithm. Therefore, Sbihi's algorithm runs in O(n^3) total time.
Structure
Chudnovsky & Seymour (2005) survey a series of papers in which they prove a structure theory for claw-free
graphs, analogous to the graph structure theorem for minor-closed graph families proven by Robertson and Seymour,
and to the structure theory for perfect graphs that Chudnovsky, Seymour and their co-authors used to prove the
strong perfect graph theorem.[9] The theory is too complex to describe in detail here, but to give a flavor of it, it
suffices to outline two of their results. First, for a special subclass of claw-free graphs which they call quasi-line
graphs (equivalently, locally co-bipartite graphs), they state that every such graph has one of two forms:
1. A fuzzy circular interval graph, a class of graphs represented geometrically by points and arcs on a circle,
generalizing proper circular arc graphs.
2. A graph constructed from a multigraph by replacing each edge by a fuzzy linear interval graph. This generalizes
the construction of a line graph, in which every edge of the multigraph is replaced by a vertex. Fuzzy linear
interval graphs are constructed in the same way as fuzzy circular interval graphs, but on a line rather than on a
circle.
Chudnovsky and Seymour classify arbitrary connected claw-free graphs into one of the following:
1. Six specific subclasses of claw-free graphs. Three of these are line graphs, proper circular arc graphs, and the
induced subgraphs of an icosahedron; the other three involve additional definitions.
2. Graphs formed in four simple ways from smaller claw-free graphs.
3. Antiprismatic graphs, a class of dense graphs defined as the claw-free graphs in which every four vertices induce
a subgraph with at least two edges.
Much of the work in their structure theory involves a further analysis of antiprismatic graphs. The Schläfli graph, a
claw-free strongly regular graph with parameters srg(27,16,10,8), plays an important role in this part of the analysis.
This structure theory has led to new advances in polyhedral combinatorics and new bounds on the chromatic number
of claw-free graphs,[4] as well as to new fixed-parameter-tractable algorithms for dominating sets in claw-free
graphs.[13]
References
Beineke, L. W. (1968), "Derived graphs of digraphs", in Sachs, H.; Voss, H.-J.; Walter, H.-J., Beiträge zur Graphentheorie, Leipzig: Teubner, pp. 17–33.
Chrobak, Marek; Naor, Joseph; Novick, Mark B. (1989), "Using bounded degree spanning trees in the design of efficient algorithms on claw-free graphs", in Dehne, F.; Sack, J.-R.; Santoro, N., Algorithms and Data Structures: Workshop WADS '89, Ottawa, Canada, August 1989, Proceedings, Lecture Notes in Comput. Sci., 382, Berlin: Springer, pp. 147–162, doi:10.1007/3-540-51542-9_13.
Chudnovsky, Maria; Robertson, Neil; Seymour, Paul; Thomas, Robin (2006), "The strong perfect graph theorem" (http://people.math.gatech.edu/~thomas/PAP/spgc.pdf), Annals of Mathematics 164 (1): 51–229, doi:10.4007/annals.2006.164.51.
Chudnovsky, Maria; Seymour, Paul (2005), "The structure of claw-free graphs" (http://www.math.princeton.edu/~mchudnov/claws_survey.pdf), Surveys in combinatorics 2005, London Math. Soc. Lecture Note Ser., 327, Cambridge: Cambridge Univ. Press, pp. 153–171, MR 2187738.
Cygan, Marek; Philip, Geevarghese; Pilipczuk, Marcin; Pilipczuk, Michał; Wojtaszczyk, Jakub Onufry (2010), "Dominating set is fixed parameter tractable in claw-free graphs", arXiv:1011.6239.
Edmonds, Jack (1965), "Paths, trees, and flowers", Canadian J. Math. 17: 449–467, doi:10.4153/CJM-1965-045-4, MR 0177907.
Faudree, Ralph; Flandrin, Evelyne; Ryjáček, Zdeněk (1997), "Claw-free graphs – a survey", Discrete Mathematics 164 (1–3): 87–147, doi:10.1016/S0012-365X(96)00045-3, MR 1432221.
Hermelin, Danny; Mnich, Matthias; van Leeuwen, Erik Jan; Woeginger, Gerhard (2010), "Domination when the stars are out", arXiv:1012.0012.
Itai, Alon; Rodeh, Michael (1978), "Finding a minimum circuit in a graph", SIAM Journal on Computing 7 (4): 413–423, doi:10.1137/0207033, MR 0508603.
Kloks, Ton; Kratsch, Dieter; Müller, Haiko (2000), "Finding and counting small induced subgraphs efficiently", Information Processing Letters 74 (3–4): 115–121, doi:10.1016/S0020-0190(00)00047-8, MR 1761552.
Las Vergnas, M. (1975), "A note on matchings in graphs", Cahiers du Centre d'Études de Recherche Opérationnelle 17 (2–3–4): 257–260, MR 0412042.
Minty, George J. (1980), "On maximal independent sets of vertices in claw-free graphs", Journal of Combinatorial Theory, Series B 28 (3): 284–304, doi:10.1016/0095-8956(80)90074-X, MR 579076.
Nakamura, Daishin; Tamura, Akihisa (2001), "A revision of Minty's algorithm for finding a maximum weighted stable set of a claw-free graph" (http://www.kurims.kyoto-u.ac.jp/preprint/file/RIMS1261.ps.gz), Journal of the Operations Research Society of Japan 44 (2): 194–204.
Palmer, Edgar M.; Read, Ronald C.; Robinson, Robert W. (2002), "Counting claw-free cubic graphs" (http://www.cs.uga.edu/~rwr/publications/claw.pdf), SIAM Journal on Discrete Mathematics 16 (1): 65–73, doi:10.1137/S0895480194274777, MR 1972075.
Sbihi, Najiba (1980), "Algorithme de recherche d'un stable de cardinalité maximum dans un graphe sans étoile", Discrete Mathematics 29 (1): 53–76, doi:10.1016/0012-365X(90)90287-R, MR 553650.
Sumner, David P. (1974), "Graphs with 1-factors", Proceedings of the American Mathematical Society 42 (1): 8–12, doi:10.2307/2039666, JSTOR 2039666, MR 0323648.
External links
Claw-free graphs (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_62.html), Information System
on Graph Class Inclusions
Mugan, Jonathan William; Weisstein, Eric W., " Claw-Free Graph (http://mathworld.wolfram.com/
Claw-FreeGraph.html)" from MathWorld.
Median graph
In mathematics, and more specifically graph theory, a median graph
is an undirected graph in which any three vertices a, b, and c have a
unique median: a vertex m(a,b,c) that belongs to shortest paths
between any two of a, b, and c.
The concept of median graphs has long been studied, for instance by
Birkhoff & Kiss (1947) or (more explicitly) by Avann (1961), but the
first paper to call them "median graphs" appears to be Nebeský
(1971). As Chung, Graham, and Saks write, "median graphs arise
naturally in the study of ordered sets and discrete distributive lattices,
and have an extensive literature".[1] In phylogenetics, the Buneman
graph representing all maximum parsimony evolutionary trees is a
median graph.[2] Median graphs also arise in social choice theory: if a set of alternatives has the structure of a
median graph, it is possible to derive in an unambiguous way a majority preference among them.[3]
The median of three vertices in a median graph.
Additional surveys of median graphs are given by Klavžar & Mulder (1999), Bandelt & Chepoi (2008), and Knuth
(2008).
Examples
Any tree is a median graph.[4] To see this, observe that in a tree, the
union of the three shortest paths between any three vertices a, b, and c
is either itself a path, or a subtree formed by three paths meeting at a
single central node with degree three. If the union of the three paths is
itself a path, the median m(a,b,c) is equal to one of a, b, or c,
whichever of these three vertices is between the other two in the path.
If the subtree formed by the union of the three paths is not a path, the
median of the three vertices is the central degree-three node of the
subtree.
Additional examples of median graphs are provided by the grid
graphs. In a grid graph, the coordinates of the median m(a,b,c) can be
found as the median of the coordinates of a, b, and c. Conversely, it
turns out that, in any median graph, one may label the vertices by
points in an integer lattice in such a way that medians can be calculated
coordinatewise in this way.[5]
The median of three vertices in a tree, showing
the subtree formed by the union of shortest paths
between the vertices.
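The coordinatewise rule for grid graphs can be stated in a few lines (a sketch; `grid_median` is an illustrative name, with vertices represented as equal-length tuples of integer coordinates):

```python
def grid_median(a, b, c):
    """Median of three grid-graph vertices: on each coordinate axis,
    take the middle one of the three coordinate values."""
    return tuple(sorted(axis)[1] for axis in zip(a, b, c))

print(grid_median((0, 0), (3, 1), (2, 4)))  # (2, 1)
```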
Equivalent definitions
In any graph, for any two vertices a and b, define the interval of vertices that lie on shortest paths
I(a,b) = {v | d(a,b) = d(a,v) + d(v,b)}.
A median graph is defined by the property that, for any three vertices a, b, and c, these intervals intersect in a single
point:
For all a, b, and c, |I(a,b) ∩ I(a,c) ∩ I(b,c)| = 1.
Equivalently, for every three vertices a, b, and c one can find a vertex m(a,b,c) such that the unweighted distances in
the graph satisfy the equalities
d(a,b) = d(a,m(a,b,c)) + d(m(a,b,c),b)
d(a,c) = d(a,m(a,b,c)) + d(m(a,b,c),c)
d(b,c) = d(b,m(a,b,c)) + d(m(a,b,c),c)
and m(a,b,c) is the only vertex for which this is true.
It is also possible to define median graphs as the solution sets of 2-satisfiability problems, as the retracts of
hypercubes, as the graphs of finite median algebras, as the Buneman graphs of Helly split systems, and as the graphs
of windex 2; see the sections below.
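The interval definition gives a direct brute-force way to compute medians and test the defining condition (a sketch using breadth-first search distances; far slower than the recognition algorithms discussed later, and the function names are ours):

```python
from collections import deque

def distances(adj, source):
    """BFS distances from source in an unweighted graph (dict of sets)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def interval(adj, a, b):
    """I(a,b): vertices v with d(a,b) = d(a,v) + d(v,b)."""
    da, db = distances(adj, a), distances(adj, b)
    return {v for v in adj if da[v] + db[v] == da[b]}

def median(adj, a, b, c):
    """Return m(a,b,c) if the three intervals meet in one vertex, else None."""
    m = interval(adj, a, b) & interval(adj, a, c) & interval(adj, b, c)
    return next(iter(m)) if len(m) == 1 else None

# The 4-cycle (a 2x2 grid) is a median graph.
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(median(square, 0, 1, 2))  # 1
```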
The graph of a finite distributive lattice has an edge between any two
vertices a and b whenever I(a,b) = {a,b}. For any two vertices a and b of this graph, the interval I(a,b) defined in
lattice-theoretic terms above consists of the vertices on shortest paths from a to b, and thus coincides with the
graph-theoretic intervals defined earlier. For any a, b, and c, m(a,b,c) is the unique intersection of the three intervals
I(a,b), I(a,c), and I(b,c).[10] Therefore, the graph of any finite distributive lattice is a median graph. Conversely, if a
median graph G contains two vertices 0 and 1 such that every other vertex lies on a shortest path between the two
(equivalently, m(0,a,1) = a for all a), then we may define a distributive lattice in which a ∧ b = m(a,0,b) and a ∨ b =
m(a,1,b), and G will be the graph of this lattice.[11]
Duffus & Rival (1983) characterize graphs of distributive lattices directly as diameter-preserving retracts of
hypercubes. More generally, any median graph gives rise to a ternary operation m satisfying idempotence,
commutativity, and distributivity, but possibly without the identity elements of a distributive lattice. Any ternary
operation on a finite set that satisfies these three properties (but that does not necessarily have 0 and 1 elements)
gives rise in the same way to a median graph.[12]
2-satisfiability
Median graphs have a close connection to the solution sets of 2-satisfiability problems that can be used both to
characterize these graphs and to relate them to adjacency-preserving maps of hypercubes.[15]
A 2-satisfiability instance consists of a collection of Boolean variables and a collection of clauses, constraints on
certain pairs of variables requiring those two variables to avoid certain combinations of values. Usually such
problems are expressed in conjunctive normal form, in which each clause is expressed as a disjunction and the whole
set of constraints is expressed as a conjunction of clauses.
A solution to such an instance is an assignment of truth values to the variables that satisfies all the clauses, or
equivalently that causes the conjunctive normal form expression for the instance to become true when the variable
values are substituted into it. The family of all solutions has a natural structure as a median algebra, where the
median of three solutions is formed by choosing each truth value to be the majority function of the values in the three
solutions; it is straightforward to verify that this median solution cannot violate any of the clauses. Thus, these
solutions form a median graph, in which the neighbor of any solution is formed by negating a set of variables that are
all constrained to be equal or unequal to each other.
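The majority construction on solutions can be written down directly (a sketch; assignments are represented as dicts from variable names to booleans, and the function name is ours):

```python
def majority_median(s1, s2, s3):
    """Median of three truth assignments: each variable takes the
    majority value among the three solutions."""
    return {x: (s1[x] + s2[x] + s3[x]) >= 2 for x in s1}

# Three satisfying assignments of (x or y) and (not x or z); their
# majority median also satisfies both clauses.
a = {'x': True,  'y': False, 'z': True}
b = {'x': False, 'y': True,  'z': True}
c = {'x': False, 'y': True,  'z': False}
print(majority_median(a, b, c))  # {'x': False, 'y': True, 'z': True}
```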
Conversely, any median graph G may be represented in this way as the solution set to a 2-satisfiability instance. To
find such a representation, create a 2-satisfiability instance in which each variable describes the orientation of one of
the edges in the graph (an assignment of a direction to the edge causing the graph to become directed rather than
undirected) and each constraint allows two edges to share a pair of orientations only when there exists a vertex v
such that both orientations lie along shortest paths from other vertices to v. Each vertex v of G corresponds to a
solution to this 2-satisfiability instance in which all edges are directed towards v. Any solution to the instance must
come from some vertex v in this way, where v is the common intersection of the sets Wuw (where Wuw denotes the
set of vertices closer to u than to w) for edges directed from w to u; this common intersection exists due to the
Helly property of the sets Wuw. Therefore, the solutions to this
2-satisfiability instance correspond one-for-one with the vertices of G.
Retracts of hypercubes
A retraction φ of a graph G is an adjacency-preserving map from G to
one of its subgraphs.[16] More precisely, it is a graph homomorphism
φ from G to itself such that φ(v) = v for every vertex v in the subgraph
φ(G). The image of the retraction is called a retract of G. Retractions
are examples of metric maps: the distance between φ(v) and φ(w), for
any v and w, is at most equal to the distance between v and w, and is
equal whenever v and w both belong to φ(G). Therefore, a retract must
be an isometric subgraph of G: distances in the retract equal those in G.
If G is a median graph, and a, b, and c are any three vertices of a retract
φ(G), then φ(m(a,b,c)) must be a median of a, b, and c, and so must
equal m(a,b,c). Therefore, φ(G) contains medians of any triples of its
vertices, and must also be a median graph. In other words, the family of median graphs is closed under the retraction
operation.[17]
Retraction of a cube onto a six-vertex subgraph.
A hypercube graph, in which the vertices correspond to all possible k-bit bitvectors and in which two vertices are
adjacent when the corresponding bitvectors differ in only a single bit, is a special case of a k-dimensional grid graph
and is therefore a median graph. The median of any three bitvectors a, b, and c may be calculated by computing, in
each bit position, the majority function of the bits of a, b, and c. Since median graphs are closed under retraction,
and include the hypercubes, every retract of a hypercube is a median graph.
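For bitvectors this majority computation reduces to three bitwise operations (a standard identity, shown here as a sketch with vertices encoded as Python integers):

```python
def hypercube_median(a, b, c):
    """Median of three hypercube vertices given as bitmasks: in each bit
    position take the majority bit, i.e. (a & b) | (a & c) | (b & c)."""
    return (a & b) | (a & c) | (b & c)

print(bin(hypercube_median(0b0011, 0b0101, 0b0110)))  # 0b111
```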
Conversely, every median graph must be the retract of a hypercube.[18] This may be seen from the connection,
described above, between median graphs and 2-satisfiability: let G be the graph of solutions to a 2-satisfiability
instance; without loss of generality this instance can be formulated in such a way that no two variables are always
equal or always unequal in every solution. Then the space of all truth assignments to the variables of this instance
forms a hypercube. For each clause, formed as the disjunction of two variables or their complements, in the
2-satisfiability instance, one can form a retraction of the hypercube in which truth assignments violating this clause
are mapped to truth assignments in which both variables satisfy the clause, without changing the other variables in
the truth assignment. The composition of the retractions formed in this way for each of the clauses gives a retraction
of the hypercube onto the solution space of the instance, and therefore gives a representation of G as the retract of a
hypercube. In particular, median graphs are isometric subgraphs of hypercubes, and are therefore partial cubes.
However, not all partial cubes are median graphs; for instance, a six-vertex cycle graph is a partial cube but is not a
median graph.
As Imrich & Klavžar (2000) describe, an isometric embedding of a median graph into a hypercube may be
constructed in time O(m log n), where n and m are the numbers of vertices and edges of the graph respectively.[19]
Median graph recognition is closely related to triangle detection. In one direction, suppose one is given as input a graph G, and must test whether G is triangle-free. From G,
construct a new graph H having as vertices each set of zero, one, or two adjacent vertices of G. Two such sets are
adjacent in H when they differ by exactly one vertex. An equivalent description of H is that it is formed by splitting
each edge of G into a path of two edges, and adding a new vertex connected to all the original vertices of G. This
graph H is by construction a partial cube, but it is a median graph only when G is triangle-free: if a, b, and c form a
triangle in G, then {a,b}, {a,c}, and {b,c} have no median in H, for such a median would have to correspond to the
set {a,b,c}, but sets of three or more vertices of G do not form vertices in H. Therefore, G is triangle-free if and only
if H is a median graph. In the case that G is triangle-free, H is its simplex graph. An algorithm to test efficiently
whether H is a median graph could by this construction also be used to test whether G is triangle-free. This
transformation preserves the computational complexity of the problem, for the size of H is proportional to that of G.
The reduction in the other direction, from triangle detection to median graph testing, is more involved and depends
on the previous median graph recognition algorithm of Hagauer, Imrich & Klavžar (1999), which tests several
necessary conditions for median graphs in near-linear time. The key new step involves using a breadth first search to
partition the graph into levels according to their distances from some arbitrarily chosen root vertex, forming a graph
in each level in which two vertices are adjacent if they share a common neighbor in the previous level, and searching
for triangles in these graphs. The median of any such triangle must be a common neighbor of the three triangle
vertices; if this common neighbor does not exist, the graph is not a median graph. If all triangles found in this way
have medians, and the previous algorithm finds that the graph satisfies all the other conditions for being a median
graph, then it must actually be a median graph. Note that this algorithm requires, not just the ability to test whether a
triangle exists, but a list of all triangles in the level graph. In arbitrary graphs, listing all triangles sometimes requires
Ω(m^(3/2)) time, as some graphs have that many triangles; however, Hagauer et al. show that the number of triangles
arising in the level graphs of their reduction is near-linear, allowing the fast matrix multiplication based
technique of Alon et al. for finding triangles to be used.
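The construction of H from G described above can be transcribed directly (an unoptimized sketch; `simplex_reduction` is an illustrative name, with H's vertices represented as frozensets of G's vertices):

```python
def simplex_reduction(vertices, edges):
    """Build H: its vertices are the subsets of V(G) of size zero, one,
    or two adjacent vertices (the empty set, each vertex, each edge);
    two subsets are adjacent when they differ by exactly one vertex."""
    nodes = [frozenset()] + [frozenset({v}) for v in vertices] \
            + [frozenset(e) for e in edges]
    adj = {u: set() for u in nodes}
    for u in nodes:
        for w in nodes:
            if len(u ^ w) == 1:  # symmetric difference of one vertex
                adj[u].add(w)
                adj[w].add(u)
    return adj

# For a single edge, H is the 4-cycle on {}, {0}, {1}, {0,1}.
H = simplex_reduction([0, 1], [(0, 1)])
print(len(H))  # 4
```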
Additional properties
The Cartesian product of any two median
graphs is another median graph. Medians
in the product graph may be computed by
independently finding the medians in the
two factors, just as medians in grid
graphs may be computed by
independently finding the median in each
linear dimension.
The Cartesian product of graphs forms a median graph from two smaller median graphs.
The windex of a graph measures the amount of lookahead needed to optimally solve a problem in which one is
given a sequence of graph vertices si, and must find as output another sequence of vertices ti minimizing the sum
of the distances d(si,ti) and d(ti−1,ti). Median graphs are exactly the graphs that have windex 2. In a median
graph, the optimal choice is to set ti = m(ti−1,si,si+1).[1]
The property of having a unique median is also called the unique Steiner point property.[1] An optimal Steiner
tree for any three vertices a, b, and c in a median graph may be found as the union of three shortest paths, from a,
b, and c to m(a,b,c). Bandelt & Barthélemy (1984) study more generally the problem of finding the vertex
minimizing the sum of distances to each of a given set of vertices, and show that it has a unique solution for any
odd number of vertices in a median graph. They also show that this median of a set S of vertices in a median
graph satisfies the Condorcet criterion for the winner of an election: compared to any other vertex, it is closer to a
majority of the vertices in S.
As with partial cubes more generally, any median graph with n vertices has at most (n/2) log₂ n edges. However,
the number of edges cannot be too small: Klavžar, Mulder & Škrekovski (1998) prove that in any median graph
the inequality 2n − m − k ≤ 2 holds, where m is the number of edges and k is the dimension of the hypercube that
the graph is a retract of. This inequality is an equality if and only if the median graph contains no cubes. This is a
consequence of another identity for median graphs: the Euler-type sum Σ (−1)^dim(Q) is always equal to one,
where the sum is taken over all hypercube subgraphs Q of the given median graph.[24]
The only regular median graphs are the hypercubes.[25]
Notes
[15] Bandelt & Chepoi (2008), Proposition 2.5, p. 8; Chung, Graham & Saks (1989); Feder (1995); Knuth (2008), Theorem S, p. 72.
[16] Hell (1976).
[17] Imrich & Klavžar (2000), Proposition 1.33, p. 27.
[18] Bandelt (1984); Imrich & Klavžar (2000), Theorem 2.39, p. 76; Knuth (2008), p. 74.
[19] The technique, which culminates in Lemma 7.10 on p. 218 of Imrich and Klavžar, consists of applying an algorithm of Chiba & Nishizeki (1985) to list all 4-cycles in the graph G, forming an undirected graph having as its vertices the edges of G and having as its edges the opposite sides of a 4-cycle, and using the connected components of this derived graph to form hypercube coordinates. An equivalent algorithm is Knuth (2008), Algorithm H, p. 69.
[20] For previous median graph recognition algorithms, see Jha & Slutzki (1992), Imrich & Klavžar (1998), and Hagauer, Imrich & Klavžar (1999). For triangle detection algorithms, see Itai & Rodeh (1978), Chiba & Nishizeki (1985), and Alon, Yuster & Zwick (1995).
[21] Alon, Yuster & Zwick (1995), based on fast matrix multiplication. Here m is the number of edges in the graph, and the big O notation hides a large constant factor; the best practical algorithms for triangle detection take time O(m^(3/2)). For median graph recognition, the time bound can be expressed either in terms of m or n (the number of vertices), as m = O(n log n).
[22] Mulder & Schrijver (1979) described a version of this method for systems of characteristics not requiring any latent vertices, and Barthélemy (1989) gives the full construction. The Buneman graph name is given in Dress et al. (1997) and Dress, Huber & Moulton (1997).
[23] Mulder & Schrijver (1979).
[24] Škrekovski (2001).
[25] Mulder (1980).
References
Alon, Noga; Yuster, Raphael; Zwick, Uri (1995), "Color-coding", Journal of the Association for Computing
Machinery 42 (4): 844856, doi:10.1145/210332.210337, MR1411787.
Avann, S. P. (1961), "Metric ternary distributive semi-lattices", Proceedings of the American Mathematical
Society (American Mathematical Society) 12 (3): 407414, doi:10.2307/2034206, JSTOR2034206, MR0125807.
Bandelt, Hans-Jrgen (1984), "Retracts of hypercubes", Journal of Graph Theory 8 (4): 501510,
doi:10.1002/jgt.3190080407, MR0766499.
Bandelt, Hans-Jrgen; Barthlmy, Jean-Pierre (1984), "Medians in median graphs", Discrete Applied
Mathematics 8 (2): 131142, doi:10.1016/0166-218X(84)90096-9, MR0743019.
Bandelt, Hans-Jrgen; Chepoi, V. (2008), "Metric graph theory and geometry: a survey" (http://www.lif-sud.
univ-mrs.fr/~chepoi/survey_cm_bis.pdf), Contemporary Mathematics, to appear.
Bandelt, Hans-Jrgen; Forster, P.; Sykes, B. C.; Richards, Martin B. (October 1, 1995), "Mitochondrial portraits
of human populations using median networks" (http://www.genetics.org/cgi/content/abstract/141/2/743),
Genetics 141 (2): 743753, PMC1206770, PMID8647407.
Bandelt, Hans-Jrgen; Forster, P.; Rohl, Arne (January 1, 1999), "Median-joining networks for inferring
intraspecific phylogenies" (http://mbe.oxfordjournals.org/cgi/content/abstract/16/1/37), Molecular Biology
and Evolution 16 (1): 3748, PMID10331250.
Bandelt, Hans-Jrgen; Macaulay, Vincent; Richards, Martin B. (2000), "Median networks: speedy construction
and greedy reduction, one simulation, and two case studies from human mtDNA", Molecular Phylogenetics and
Evolution 16 (1): 828, doi:10.1006/mpev.2000.0792, PMID10877936.
Barthlmy, Jean-Pierre (1989), "From copair hypergraphs to median graphs with latent vertices", Discrete
Mathematics 76 (1): 928, doi:10.1016/0012-365X(89)90283-5, MR1002234.
Birkhoff, Garrett; Kiss, S. A. (1947), "A ternary operation in distributive lattices" (http://projecteuclid.org/
euclid.bams/1183510977), Bulletin of the American Mathematical Society 53 (1): 749752,
doi:10.1090/S0002-9904-1947-08864-9, MR0021540.
Buneman, P. (1971), "The recovery of trees from measures of dissimilarity", in Hodson, F. R.; Kendall, D. G.;
Tautu, P. T., Mathematics in the Archaeological and Historical Sciences, Edinburgh University Press,
pp.387395.
Chepoi, V.; Dragan, F.; Vaxs, Y. (2002), "Center and diameter problems in planar quadrangulations and
triangulations" (http://portal.acm.org/citation.cfm?id=545381.545427), Proc. 13th ACM-SIAM Symposium on
Discrete Algorithms, pp.346355.
Chepoi, V.; Fanciullini, C.; Vaxès, Y. (2004), "Median problem in some plane triangulations and quadrangulations", Computational Geometry: Theory & Applications 27: 193–210.
Chiba, N.; Nishizeki, T. (1985), "Arboricity and subgraph listing algorithms", SIAM Journal on Computing 14: 210–223, doi:10.1137/0214017, MR 0774940.
Chung, F. R. K.; Graham, R. L.; Saks, M. E. (1987), "Dynamic search in graphs" (http://www.math.ucsd.edu/~fan/mypaps/fanpap/98dynamicsearch.pdf), in Wilf, H., Discrete Algorithms and Complexity (Kyoto, 1986), Perspectives in Computing, 15, New York: Academic Press, pp. 351–387, MR 0910939.
Chung, F. R. K.; Graham, R. L.; Saks, M. E. (1989), "A dynamic location problem for graphs" (http://www.math.ucsd.edu/~fan/mypaps/fanpap/101location.pdf), Combinatorica 9 (2): 111–132, doi:10.1007/BF02124674.
Day, William H. E.; McMorris, F. R. (2003), Axiomatic Concensus [sic] Theory in Group Choice and Bioinformatics, Society for Industrial and Applied Mathematics, pp. 91–94, ISBN 0-89871-551-2.
Dress, A.; Hendy, M.; Huber, K.; Moulton, V. (1997), "On the number of vertices and edges of the Buneman graph", Annals of Combinatorics 1 (1): 329–337, doi:10.1007/BF02558484, MR 1630739.
Dress, A.; Huber, K.; Moulton, V. (1997), "Some variations on a theme by Buneman", Annals of Combinatorics 1 (1): 339–352, doi:10.1007/BF02558485, MR 1630743.
Duffus, Dwight; Rival, Ivan (1983), "Graphs orientable as distributive lattices", Proceedings of the American Mathematical Society 88 (2): 197–200, doi:10.2307/2044697, JSTOR 2044697.
Feder, T. (1995), Stable Networks and Product Graphs, Memoirs of the American Mathematical Society, 555.
Hagauer, Johann; Imrich, Wilfried; Klavžar, Sandi (1999), "Recognizing median graphs in subquadratic time", Theoretical Computer Science 215 (1–2): 123–136, doi:10.1016/S0304-3975(97)00136-9, MR 1678773.
Hell, Pavol (1976), "Graph retractions", Colloquio Internazionale sulle Teorie Combinatorie (Roma, 1973), Tomo II, Atti dei Convegni Lincei, 17, Rome: Accad. Naz. Lincei, pp. 263–268, MR 0543779.
Imrich, Wilfried; Klavžar, Sandi (1998), "A convexity lemma and expansion procedures for bipartite graphs", European Journal of Combinatorics 19 (6): 677–686, doi:10.1006/eujc.1998.0229, MR 1642702.
Imrich, Wilfried; Klavžar, Sandi (2000), Product Graphs: Structure and Recognition, Wiley, ISBN 0-471-37039-8, MR 788124.
Imrich, Wilfried; Klavžar, Sandi; Mulder, Henry Martyn (1999), "Median graphs and triangle-free graphs", SIAM Journal on Discrete Mathematics 12 (1): 111–118, doi:10.1137/S0895480197323494, MR 1666073.
Itai, A.; Rodeh, M. (1978), "Finding a minimum circuit in a graph", SIAM Journal on Computing 7 (4): 413–423, doi:10.1137/0207033, MR 0508603.
Jha, Pranava K.; Slutzki, Giora (1992), "Convex-expansion algorithms for recognizing and isometric embedding of median graphs", Ars Combinatoria 34: 75–92, MR 1206551.
Klavžar, Sandi; Mulder, Henry Martyn (1999), "Median graphs: characterizations, location theory and related structures", Journal of Combinatorial Mathematics and Combinatorial Computing 30: 103–127, MR 1705337.
Klavžar, Sandi; Mulder, Henry Martyn; Škrekovski, Riste (1998), "An Euler-type formula for median graphs", Discrete Mathematics 187 (1): 255–258, doi:10.1016/S0012-365X(98)00019-3, MR 1630736.
Knuth, Donald E. (2008), "Median algebras and median graphs", The Art of Computer Programming, IV, Fascicle 0: Introduction to Combinatorial Algorithms and Boolean Functions, Addison-Wesley, pp. 64–74, ISBN 978-0-321-53496-5.
Mulder, Henry Martyn (1980), "n-cubes and median graphs", Journal of Graph Theory 4 (1): 107–110, doi:10.1002/jgt.3190040112, MR 0558458.
Mulder, Henry Martyn; Schrijver, Alexander (1979), "Median graphs and Helly hypergraphs", Discrete Mathematics 25 (1): 41–50, doi:10.1016/0012-365X(79)90151-1, MR 0522746.
Nebeský, Ladislav (1971), "Median graphs", Commentationes Mathematicae Universitatis Carolinae 12: 317–325, MR 0286705.
Škrekovski, Riste (2001), "Two relations for median graphs", Discrete Mathematics 226 (1): 351–353, doi:10.1016/S0012-365X(00)00120-5, MR 1802603.
Soltan, P.; Zambitskii, D.; Priscaru, C. (1973), Extremal problems on graphs and algorithms of their solution (in Russian), Chişinău: Ştiinţa.
External links
Median graphs (http://wwwteo.informatik.uni-rostock.de/isgci/classes/gc_211.html), Information System
for Graph Class Inclusions.
Network (http://www.fluxus-engineering.com/sharenet.htm), Free Phylogenetic Network Software. Network
generates evolutionary trees and networks from genetic, linguistic, and other data.
PhyloMurka (http://sourceforge.net/projects/phylomurka), open-source software for median network
computations from biological data.
Graph isomorphism
In graph theory, an isomorphism of graphs G and H is a bijection f between the vertex sets of G and H
such that any two vertices u and v of G are adjacent in G if and only if f(u) and f(v) are adjacent in H. This kind of
bijection is commonly called an "edge-preserving bijection", in accordance with the general notion of isomorphism
being a structure-preserving bijection.
In the above definition, graphs are understood to be undirected, unlabeled, unweighted graphs. However, the
notion of isomorphism may be applied to all other variants of the notion of graph, by adding the requirement to
preserve the corresponding additional elements of structure: arc directions, edge weights, etc., with the following
exception. For graphs labeled with unique labels, commonly taken from the integer range 1, ..., n,
where n is the number of vertices of the graph, two labeled graphs are said to be isomorphic if the corresponding
underlying unlabeled graphs are isomorphic.
If an isomorphism exists between two graphs, then the graphs are called isomorphic and we write G ≃ H. In the
case when the bijection is a mapping of a graph onto itself, i.e., when G and H are one and the same graph, the
bijection is called an automorphism of G.
Graph isomorphism is an equivalence relation on graphs and as such it partitions the class of all graphs into
equivalence classes. A set of graphs isomorphic to each other is called an isomorphism class of graphs.
Example
The two graphs shown below are isomorphic, despite their different-looking drawings.
Graph G
Graph H
An isomorphism f
between G and H
f(a) = 1
f(b) = 6
f(c) = 8
f(d) = 3
f(g) = 5
f(h) = 2
f(i) = 4
f(j) = 7
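A mapping such as the one above can be verified mechanically. The following sketch checks that a given bijection is edge-preserving (the function name and the small test graphs used with it are ours, not the G and H of the figures):

```python
def is_isomorphism(f, G, H):
    """Check that the bijection f (a dict) maps graph G onto graph H
    preserving adjacency.  Graphs are adjacency-set dicts."""
    # f must be a bijection from V(G) onto V(H)
    if sorted(f) != sorted(G) or sorted(f.values()) != sorted(H):
        return False
    # u ~ v in G exactly when f(u) ~ f(v) in H, for every vertex pair
    return all((f[v] in H[f[u]]) == (v in G[u]) for u in G for v in G)
```

For example, for two three-vertex paths, the mapping sending endpoints to endpoints is an isomorphism, while a mapping sending an endpoint to the middle vertex is not.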
Motivation
The formal notion of "isomorphism", e.g., of "graph isomorphism", captures the informal notion that some objects
have "the same structure" if one ignores individual distinctions of "atomic" components of the objects in question;
see the example above. Whenever individuality of "atomic" components (vertices and edges, for graphs) is important for
correct representation of whatever is modeled by graphs, the model is refined by imposing additional restrictions on
the structure, and other mathematical objects are used: digraphs, labeled graphs, colored graphs, rooted trees and so
on. The isomorphism relation may also be defined for all these generalizations of graphs: the isomorphism bijection
must preserve the elements of structure which define the object type in question: arcs, labels, vertex/edge colors, the
root of the rooted tree, etc.
The notion of "graph isomorphism" allows us to distinguish graph properties inherent to the structures of graphs
themselves from properties associated with graph representations: graph drawings, data structures for graphs, graph
labelings, etc. For example, if a graph has exactly one cycle, then all graphs in its isomorphism class also have
exactly one cycle. On the other hand, in the common case when the vertices of a graph are (represented by) the
integers 1, 2, ..., N, an expression that refers to this particular labeling describes the representation rather
than the isomorphism class, and need not agree for two isomorphic graphs.
Algorithmic approach
While graph isomorphism may be studied in a classical mathematical way, as exemplified by the Whitney theorem,
it is recognized that it is a problem to be tackled with an algorithmic approach. The computational problem of
determining whether two finite graphs are isomorphic is called the graph isomorphism problem.
Its practical applications include primarily cheminformatics, mathematical chemistry (identification of chemical
compounds), and electronic design automation (verification of equivalence of various representations of the design
of an electronic circuit).
The graph isomorphism problem is one of few standard problems in computational complexity theory belonging to
NP, but not known to belong to either of its well-known (and, if P ≠ NP, disjoint) subsets: P and NP-complete. It is
one of only two, out of 12 total, problems listed in Garey & Johnson (1979) whose complexity remains unresolved,
the other being integer factorization. It is however known that if the problem is NP-complete then the polynomial
hierarchy collapses to a finite level.[3]
Its generalization, the subgraph isomorphism problem, is known to be NP-complete.
The main areas of research for the problem are design of fast algorithms and theoretical investigations of its
computational complexity, both for the general problem and for special classes of graphs.
References
[1] Whitney, Hassler (January 1932). "Congruent Graphs and the Connectivity of Graphs" (http://www.jstor.org/stable/2371086). American
Journal of Mathematics (The Johns Hopkins University Press) 54 (1): 150–168. Retrieved 17 August 2012.
[2] Dirk L. Vertigan, Geoffrey P. Whittle: A 2-Isomorphism Theorem for Hypergraphs. J. Comb. Theory, Ser. B 71 (2): 215–230. 1997.
[3] Schöning, Uwe (1988). "Graph isomorphism is in the low hierarchy". Journal of Computer and System Sciences 37: 312–323.
Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of
NP-Completeness, W. H. Freeman, ISBN 0-7167-1045-5
Graph isomorphism problem
The graph isomorphism problem can be solved in polynomial time for several special classes of graphs, including:
Trees[10]
Planar graphs[11] (in fact, planar graph isomorphism is in log space,[12] a class contained in P)
Interval graphs[13]
Permutation graphs[14]
Partial k-trees[15]
Bounded-parameter graphs
Complexity class GI
Since the graph isomorphism problem is neither known to be NP-complete nor to be tractable, researchers have
sought to gain insight into the problem by defining a new class GI, the set of problems with a polynomial-time
Turing reduction to the graph isomorphism problem.[21] If in fact the graph isomorphism problem is solvable in
polynomial time, GI would equal P.
As is common for complexity classes within the polynomial time hierarchy, a problem is called GI-hard if there is a
polynomial-time Turing reduction from any problem in GI to that problem, i.e., a polynomial-time solution to a
GI-hard problem would yield a polynomial-time solution to the graph isomorphism problem (and so all problems in
GI). A problem is called complete for GI, or GI-complete, if it is both GI-hard and a polynomial-time solution
to the graph isomorphism problem would yield a polynomial-time solution to it.
The graph isomorphism problem is contained in both NP and co-AM. GI is contained in and low for Parity P, as well
as contained in the potentially much smaller class SPP.[22][23] That it lies in Parity P means that the graph
isomorphism problem is no harder than determining whether a polynomial-time nondeterministic Turing machine
has an even or odd number of accepting paths. GI is also contained in and low for ZPP^NP.[24] This essentially means
that an efficient Las Vegas algorithm with access to an NP oracle can solve graph isomorphism so easily that it gains
no power from being given the ability to do so in constant time.
The isomorphism problem is GI-complete for a number of classes of graphs, including:
Regular graphs[25]
Bipartite graphs without non-trivial strongly regular subgraphs[25]
Bipartite Eulerian graphs[25]
Bipartite regular graphs[25]
Line graphs[25]
Chordal graphs[25]
Regular self-complementary graphs[25]
Polytopal graphs of general, simple, and simplicial convex polytopes in arbitrary dimensions[28]
This list is incomplete.
Program checking
Blum and Kannan[33] have shown a program checker for graph isomorphism. Suppose P is a claimed
polynomial-time procedure that checks if two graphs are isomorphic, but it is not trusted. To check if G and H are
isomorphic:
Ask P whether G and H are isomorphic.
If the answer is "yes":
Attempt to construct an isomorphism using P as subroutine. Mark a vertex u in G and v in H, and modify
the graphs to make them distinctive (with a small local change). Ask P if the modified graphs are
isomorphic. If no, change v to a different vertex. Continue searching.
Either the isomorphism will be found (and can be verified), or P will contradict itself.
If the answer is "no":
Perform the following 100 times. Choose randomly G or H, and randomly permute its vertices. Ask P if the
permuted graph is isomorphic to G and H. (As in the AM protocol for graph nonisomorphism.)
If any of the tests are failed, judge P as invalid program. Otherwise, answer "no".
This procedure is polynomial-time and gives the correct answer if P is a correct program for graph isomorphism. If P
is not a correct program, but answers correctly on G and H, the checker will either give the correct answer, or detect
invalid behaviour of P. If P is not a correct program, and answers incorrectly on G and H, the checker will detect
invalid behaviour of P with high probability, or answer wrong with probability 2^(−100).
Notably, P is used only as a blackbox.
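The two branches of the checker can be sketched as follows. This is a simplified illustration, not the full Blum–Kannan construction: the "yes" branch here verifies an explicitly found isomorphism (a placeholder for the vertex-marking procedure described above), and `are_isomorphic` is a deliberately naive brute-force decider used both as the trusted reference and as an example of a claimed procedure P; all names are ours.

```python
import random
from itertools import permutations

def permute(graph, perm):
    # Relabel an adjacency-set graph {v: set(neighbours)} by perm.
    return {perm[v]: {perm[u] for u in nbrs} for v, nbrs in graph.items()}

def are_isomorphic(G, H):
    # Brute-force decider: try every bijection (tiny graphs only).
    if len(G) != len(H):
        return False
    gv, hv = sorted(G), sorted(H)
    for p in permutations(hv):
        if permute(G, dict(zip(gv, p))) == H:
            return True
    return False

def check(P, G, H, rounds=100):
    """Checker in the spirit of Blum–Kannan: never trusts P's answer
    without consistency tests.  Returns "yes", "no", or "faulty"."""
    if P(G, H):
        # "yes" branch (placeholder for the marking subroutine): verify
        # an explicitly constructed isomorphism before believing P.
        return "yes" if are_isomorphic(G, H) else "faulty"
    # "no" branch: P must recognise randomly permuted copies, which are
    # isomorphic to their originals by construction.
    for _ in range(rounds):
        base = random.choice([G, H])
        vs = list(base)
        perm = dict(zip(vs, random.sample(vs, len(vs))))
        if not P(permute(base, perm), base):
            return "faulty"
    return "no"
```

A procedure that always answers "no" is exposed immediately, because it rejects a graph paired with a shuffled copy of itself.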
Applications
In cheminformatics and in mathematical chemistry, graph isomorphism testing is used to identify a chemical
compound within a chemical database.[34] Also, in organic mathematical chemistry graph isomorphism testing is
useful for generation of molecular graphs and for computer synthesis.
Chemical database search is an example of graphical data mining, where the graph canonization approach is often
used.[35] In particular, a number of identifiers for chemical substances, such as SMILES and InChI, designed to
provide a standard and human-readable way to encode molecular information and to facilitate the search for such
information in databases and on the web, use a canonization step in their computation, which is essentially the
canonization of the graph representing the molecule.
In electronic design automation graph isomorphism is the basis of the Layout Versus Schematic (LVS) circuit design
step, which verifies whether the electric circuits represented by a circuit schematic and an integrated circuit
layout are the same.[36]
Notes
[1] The latest one resolved was minimum-weight triangulation, proved to be NP-complete in 2006. Mulzer, Wolfgang; Rote, Günter (2008),
"Minimum-weight triangulation is NP-hard", Journal of the ACM 55 (2): 1, arXiv:cs.CG/0601002, doi:10.1145/1346330.1346336.
[2] Johnson 2005
[3] Babai & Codenotti (2008).
[4] Uwe Schöning, "Graph isomorphism is in the low hierarchy", Proceedings of the 4th Annual Symposium on Theoretical Aspects of Computer
Science, 1987, 114–124; also: Journal of Computer and System Sciences, vol. 37 (1988), 312–323
[5] McKay 1981
[6] Ullman (1976).
[7] Cristopher Moore; Alexander Russell; Schulman, Leonard J. (2005). "The Symmetric Group Defies Strong Fourier Sampling: Part I".
arXiv:quant-ph/0501056v3 [quant-ph].
[8] László Babai, William Kantor, Eugene Luks, Computational complexity and the classification of finite simple groups (http://portal.acm.org/citation.cfm?id=1382437.1382817), Proc. 24th FOCS (1983), pp. 162–171.
References
Aho, Alfred V.; Hopcroft, John; Ullman, Jeffrey D. (1974), The Design and Analysis of Computer Algorithms,
Reading, MA: Addison-Wesley.
Arvind, Vikraman; Köbler, Johannes (2000), "Graph isomorphism is low for ZPP(NP) and other lowness
results", Proceedings of the 17th Annual Symposium on Theoretical Aspects of Computer Science, Lecture Notes
in Computer Science, 1770, Springer-Verlag, pp. 431–442, ISBN 3-540-67141-2, OCLC 43526888.
Arvind, Vikraman; Kurur, Piyush P. (2006), "Graph isomorphism is in SPP", Information and Computation 204
(5): 835–852, doi:10.1016/j.ic.2006.02.002.
Babai, László; Codenotti, Paolo (2008), "Isomorphism of Hypergraphs of Low Rank in Moderately Exponential
Time", FOCS '08: Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science,
IEEE Computer Society, pp. 667–676, ISBN 978-0-7695-3436-7.
Babai, László; Grigoryev, D. Yu.; Mount, David M. (1982), "Isomorphism of graphs with bounded eigenvalue
multiplicity", Proceedings of the 14th Annual ACM Symposium on Theory of Computing, pp. 310–324,
doi:10.1145/800070.802206, ISBN 0-89791-070-2.
Bodlaender, Hans (1990), "Polynomial algorithms for graph isomorphism and chromatic index on partial k-trees",
Journal of Algorithms 11 (4): 631–643, doi:10.1016/0196-6774(90)90013-5.
Booth, Kellogg S.; Colbourn, C. J. (1977), Problems polynomially equivalent to graph isomorphism, Technical
Report CS-77-04, Computer Science Department, University of Waterloo.
Booth, Kellogg S.; Lueker, George S. (1979), "A linear time algorithm for deciding interval graph isomorphism",
Journal of the ACM 26 (2): 183–195, doi:10.1145/322123.322125.
Boucher, C.; Loker, D. (2006), Graph isomorphism completeness for perfect graphs and subclasses of perfect
graphs (http://www.cs.uwaterloo.ca/research/tr/2006/CS-2006-32.pdf), University of Waterloo, Technical
Report CS-2006-32.
Colbourn, C. J. (1981), "On testing isomorphism of permutation graphs", Networks 11: 13–21,
doi:10.1002/net.3230110103.
Filotti, I. S.; Mayer, Jack N. (1980), "A polynomial-time algorithm for determining the isomorphism of graphs of
fixed genus", Proceedings of the 12th Annual ACM Symposium on Theory of Computing, pp. 236–243,
doi:10.1145/800141.804671, ISBN 0-89791-017-6.
Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of
NP-Completeness, W. H. Freeman, ISBN 978-0-7167-1045-5, OCLC 11745039.
Hopcroft, John; Wong, J. (1974), "Linear time algorithm for isomorphism of planar graphs", Proceedings of the
Sixth Annual ACM Symposium on Theory of Computing, pp. 172–184, doi:10.1145/800119.803896.
Köbler, Johannes; Schöning, Uwe; Torán, Jacobo (1992), "Graph isomorphism is low for PP", Computational
Complexity 2 (4): 301–330, doi:10.1007/BF01200427.
Köbler, Johannes; Schöning, Uwe; Torán, Jacobo (1993), The Graph Isomorphism Problem: Its Structural
Complexity, Birkhäuser, ISBN 978-0-8176-3680-7, OCLC 246882287.
Kozen, Dexter (1978), "A clique problem equivalent to graph isomorphism", ACM SIGACT News 10 (2): 50–52,
doi:10.1145/990524.990529.
Luks, Eugene M. (1982), "Isomorphism of graphs of bounded valence can be tested in polynomial time", Journal
of Computer and System Sciences 25: 42–65, doi:10.1016/0022-0000(82)90009-5.
McKay, Brendan D. (1981), "Practical graph isomorphism" (http://cs.anu.edu.au/~bdm/nauty/PGI/),
Congressus Numerantium 30: 45–87, 10th Manitoba Conference on Numerical Mathematics and Computing
(Winnipeg, 1980).
Miller, Gary (1980), "Isomorphism testing for graphs of bounded genus", Proceedings of the 12th Annual ACM
Symposium on Theory of Computing, pp. 225–235, doi:10.1145/800141.804670, ISBN 0-89791-017-6.
Schmidt, Douglas C.; Druffel, Larry E. (1976), "A Fast Backtracking Algorithm to Test Directed Graphs for
Isomorphism Using Distance Matrices", J. ACM 23 (3): 433–445, doi:10.1145/321958.321963,
ISSN 0004-5411.
Spielman, Daniel A. (1996), "Faster isomorphism testing of strongly regular graphs", STOC '96: Proceedings of
the twenty-eighth annual ACM symposium on Theory of computing, ACM, pp. 576–584,
ISBN 978-0-89791-785-8.
Ullman, Julian R. (1976), "An algorithm for subgraph isomorphism", Journal of the ACM 23: 31–42,
doi:10.1145/321921.321925.
Software
Graph Isomorphism (http://www.cs.sunysb.edu/~algorith/files/graph-isomorphism.shtml), review of
implementations, The Stony Brook Algorithm Repository (http://www.cs.sunysb.edu/~algorith).
Graph canonization
Graph canonization
In graph theory, a branch of mathematics, graph canonization is finding a canonical form of a graph G, which is a
graph Canon(G) isomorphic to G such that Canon(H)=Canon(G) if and only if H is isomorphic to G. The canonical
form of a graph is an example of a complete graph invariant.[1][2] Since the vertex sets of (finite) graphs are
commonly identified with the interval of integers 1, ..., n, where n is the number of vertices of the graph, a
canonical form of a graph is commonly called a canonical labeling of the graph.[3] Graph canonization is also
sometimes known as graph canonicalization.
A commonly known canonical form is the lexicographically smallest graph within the isomorphism class, which is
the graph of the class with lexicographically smallest adjacency matrix considered as a linear string.
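For very small graphs this canonical form can be computed directly by brute force over all vertex orderings. A sketch (exponential time, purely illustrative; the function name is ours):

```python
from itertools import permutations

def canonical_form(adj):
    """Return the lexicographically smallest adjacency matrix over all
    vertex relabellings, read row by row as a 0/1 tuple.  Exponential
    in the number of vertices: usable only for very small graphs."""
    vs = list(adj)
    n = len(vs)
    best = None
    for order in permutations(vs):
        # adjacency matrix of the graph under this vertex ordering
        bits = tuple(
            1 if order[j] in adj[order[i]] else 0
            for i in range(n) for j in range(n)
        )
        if best is None or bits < best:
            best = bits
    return best
```

Two graphs receive the same output exactly when they are isomorphic, which is the defining property of a canonical form.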
Computational complexity
Clearly, the graph canonization problem is at least as computationally hard as the graph isomorphism problem. In
fact, Graph Isomorphism is even AC0-reducible to Graph Canonization. However it is still an open question whether
the two problems are polynomial-time equivalent.[2]
While the existence of (deterministic) polynomial algorithms for Graph Isomorphism is still an open problem in
computational complexity theory, in 1977 László Babai reported that a simple vertex classification algorithm, after
only two refinement steps, produces a canonical labeling of an n-vertex random graph with probability
1 − exp(−O(n)). Small modifications and an added depth-first search step produce a canonical labeling of all graphs in
linear average time. This result sheds some light on why many reported graph isomorphism algorithms
behave well in practice.[4] This was an important breakthrough in probabilistic complexity theory which became
widely known in its manuscript form and which was still cited as an "unpublished manuscript" long after it was
reported at a symposium.
The computation of the lexicographically smallest graph is NP-hard.[1]
Applications
Graph canonization is the essence of many graph isomorphism algorithms.
A common application of graph canonization is in graphical data mining, in particular in chemical database
applications.[5]
A number of identifiers for chemical substances, such as SMILES and InChI, designed to provide a standard and
human-readable way to encode molecular information and to facilitate the search for such information in databases
and on the web, use a canonization step in their computation, which is essentially the canonization of the graph
representing the molecule.
References
[1] "A Logspace Algorithm for Partial 2-Tree Canonization" (http://books.google.com/books?id=tU8KG7D03DAC&pg=PA40&dq="graph+canonization"#PPA40,M1)
[2] "The Space Complexity of k-Tree Isomorphism" (http://books.google.com/books?id=vCoVUkRhfI4C&pg=PA823&dq="graph+canonization"#PPA823,M1)
[3] László Babai, Eugene Luks, "Canonical labeling of graphs", Proc. 15th ACM Symposium on Theory of Computing, 1983, pp. 171–183.
[4] L. Babai, "On the Isomorphism Problem", unpublished manuscript, 1977;
L. Babai and L. Kučera, "Canonical labeling of graphs in average linear time", Proc. 20th Annual IEEE Symposium on Foundations of
Computer Science (1979), 39–46.
[5] "Mining Graph Data", by Diane J. Cook, Lawrence B. Holder (2007) ISBN 0-470-07303-9, pp. 120–122, section 6.2.1 "Canonical Labeling"
(http://books.google.com/books?id=bHGy0_H0g8QC&pg=PA119&dq="canonical+labeling"+graphs#PPA120,M1)
Yuri Gurevich, "From Invariants to Canonization", The Bull. of Euro. Assoc. for Theor. Computer Sci., no. 63,
1997. (http://research.microsoft.com/en-us/um/people/gurevich/opera/131.pdf)
Subgraph isomorphism problem
Algorithms
Ullmann (1976) describes a recursive backtracking procedure for solving the subgraph isomorphism problem.
Although its running time is, in general, exponential, it takes polynomial time for any fixed choice of H (with a
polynomial that depends on the choice of H). When G is a planar graph and H is fixed, the running time of subgraph
isomorphism can be reduced to linear time.[2]
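The flavour of such a backtracking search can be sketched as follows. This is a simplified sketch in the spirit of Ullmann's procedure, without his refinement-based pruning; the function names are ours.

```python
def subgraph_isomorphisms(H, G):
    """Assign the pattern vertices of H to distinct vertices of G one at
    a time, pruning any partial assignment that violates adjacency.
    Yields each edge-preserving injection as a dict H-vertex -> G-vertex.
    H and G are adjacency-set dicts; exponential in the worst case."""
    hv = list(H)

    def extend(mapping):
        if len(mapping) == len(hv):
            yield dict(mapping)
            return
        u = hv[len(mapping)]            # next pattern vertex to place
        used = set(mapping.values())
        for v in G:
            if v in used:
                continue
            # every already-mapped neighbour of u must land on a neighbour of v
            if all(mapping[w] in G[v] for w in H[u] if w in mapping):
                mapping[u] = v
                yield from extend(mapping)
                del mapping[u]          # backtrack

    yield from extend({})
```

For instance, a triangle pattern embeds into the complete graph K4 in 24 ways (4 vertex choices times 3! orderings), and not at all into a path.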
Applications
Subgraph isomorphism has been applied in the area of cheminformatics to find similarities between chemical
compounds from their structural formula; often in this area the term substructure search is used.[4] Typically a
query structure is defined as SMARTS, a SMILES extension.
The closely related problem of counting the number of isomorphic copies of a graph H in a larger graph G has been
applied to pattern discovery in databases,[5] the bioinformatics of protein-protein interaction networks,[6] and in
exponential random graph methods for mathematically modeling social networks.[7]
Ohlrich et al. (1993) describe an application of subgraph isomorphism in the computer-aided design of electronic
circuits. Subgraph matching is also a substep in graph rewriting (the most runtime-intensive one), and is thus offered by
graph rewrite tools.
Notes
[1] The original Cook (1971) paper that proves the Cook–Levin theorem already showed subgraph isomorphism to be NP-complete, using a
reduction from 3-SAT involving cliques.
[2] Eppstein (1999)
[3] Here Ω invokes big Omega notation.
[4] Ullmann (1976)
[5] Kuramochi & Karypis (2001).
[6] Prulj, Corneil & Jurisica (2006).
[7] Snijders et al. (2006).
References
Cook, S. A. (1971), "The complexity of theorem-proving procedures" (http://4mhz.de/cook.html), Proc. 3rd
ACM Symposium on Theory of Computing, pp. 151–158, doi:10.1145/800157.805047.
Eppstein, David (1999), "Subgraph isomorphism in planar graphs and related problems" (http://www.cs.brown.edu/publications/jgaa/accepted/99/Eppstein99.3.3.pdf), Journal of Graph Algorithms and Applications 3 (3):
1–27, arXiv:cs.DS/9911003.
Garey, Michael R.; Johnson, David S. (1979), Computers and Intractability: A Guide to the Theory of
NP-Completeness, W. H. Freeman, ISBN 0-7167-1045-5. A1.4: GT48, pg. 202.
Gröger, Hans Dietmar (1992), "On the randomized complexity of monotone graph properties" (http://www.inf.u-szeged.hu/actacybernetica/edb/vol10n3/pdf/Groger_1992_ActaCybernetica.pdf), Acta Cybernetica 10 (3):
119–127.
Kuramochi, Michihiro; Karypis, George (2001), "Frequent subgraph discovery", 1st IEEE International
Conference on Data Mining, p. 313, doi:10.1109/ICDM.2001.989534, ISBN 0-7695-1119-8.
Ohlrich, Miles; Ebeling, Carl; Ginting, Eka; Sather, Lisa (1993), "SubGemini: identifying subcircuits using a fast
subgraph isomorphism algorithm", Proceedings of the 30th international Design Automation Conference,
pp. 31–37, doi:10.1145/157485.164556, ISBN 0-89791-577-1.
Pržulj, N.; Corneil, D. G.; Jurisica, I. (2006), "Efficient estimation of graphlet frequency distributions in
protein–protein interaction networks", Bioinformatics 22 (8): 974–980, doi:10.1093/bioinformatics/btl030,
PMID 16452112.
Snijders, T. A. B.; Pattison, P. E.; Robins, G.; Handcock, M. S. (2006), "New specifications for exponential
random graph models", Sociological Methodology 36 (1): 99–153, doi:10.1111/j.1467-9531.2006.00176.x.
Ullmann, Julian R. (1976), "An algorithm for subgraph isomorphism", Journal of the ACM 23 (1): 31–42,
doi:10.1145/321921.321925.
Jamil, Hasan (2011), "Computing Subgraph Isomorphic Queries using Structural Unification and Minimum
Graph Structures", 26th ACM Symposium on Applied Computing, pp. 1058–1063.
Color-coding
In computer science and graph theory, the method of color-coding[1][2] efficiently finds k-vertex simple paths,
k-vertex cycles, and other small subgraphs within a given graph using probabilistic algorithms, which can then be
derandomized and turned into deterministic algorithms. This method shows that many subcases of the subgraph
isomorphism problem (an NP-complete problem) can in fact be solved in polynomial time.
The theory and analysis of the color-coding method was proposed in 1994 by Noga Alon, Raphael Yuster, and Uri
Zwick.
Results
The following results can be obtained through the method of color-coding:
For every fixed constant k, if a graph G = (V, E) contains a simple cycle of size k, then such a cycle can be
found in:
O(V^ω) expected time, or
O(V^ω log V) worst-case time,
where ω is the exponent of matrix multiplication.[3]
If a graph G = (V, E) contains a subgraph isomorphic to a bounded treewidth graph which has O(log V) vertices,
then such a subgraph can be found in polynomial time.
The method
To solve the problem of finding a subgraph H = (V_H, E_H) in a given graph G = (V, E), where H can be a
path, a cycle, or any bounded treewidth graph with |V_H| = O(log |V|), the method of color-coding begins by
randomly coloring each vertex of G with k = |V_H| colors, and then tries to find a colorful copy of H in the colored
G. Here, a graph is colorful if every vertex in it is colored with a distinct color. This method works by repeating (1)
random coloring of the graph and (2) finding a colorful copy of the target subgraph, and eventually the target subgraph
can be found if the process is repeated a sufficient number of times.
Suppose a copy of H in G becomes colorful with some non-zero probability p. It immediately follows that if the random
coloring is repeated 1/p times, then this copy is expected to become colorful once. Note that although p can be small,
it is shown that if k = O(log |V|), p is only polynomially small. Suppose again there exists an algorithm such that, given
a graph G and a coloring that maps each vertex of G to one of the k colors, it finds a copy of a colorful H,
if one exists, within some runtime O(r). Then the expected time to find a copy of H in G, if one exists, is
O(r/p).
Sometimes it is also desirable to use a more restricted version of colorfulness. For example, in the context of finding
cycles in planar graphs, it is possible to develop an algorithm that finds well-colored cycles. Here, a cycle is
well-colored if its vertices are colored by consecutive colors.
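The coloring-and-search loop can be sketched for the special case of finding a k-vertex simple path. This simplified illustration replaces the matrix-multiplication machinery described below with a dynamic program over color subsets; all names are ours.

```python
import random

def colorful_path_exists(adj, coloring, k):
    """reach[v] holds the color sets of colorful walks ending at v.
    A walk whose k colors are all distinct visits k distinct vertices,
    so it is automatically a simple path -- the key trick of color-coding."""
    reach = {v: {frozenset([coloring[v]])} for v in adj}
    for _ in range(k - 1):
        new = {v: set() for v in adj}
        for u in adj:
            for colors in reach[u]:
                for v in adj[u]:
                    if coloring[v] not in colors:   # extend only with a fresh color
                        new[v].add(colors | {coloring[v]})
        reach = new
    return any(reach[v] for v in adj)

def find_simple_k_path(adj, k, trials=200):
    """Color-coding loop: repeat (random k-coloring, colorful-path search).
    With on the order of e^k repetitions, an existing k-vertex simple
    path becomes colorful, and is found, with high probability."""
    for _ in range(trials):
        coloring = {v: random.randrange(k) for v in adj}
        if colorful_path_exists(adj, coloring, k):
            return True
    return False
```

A fixed path has probability k!/k^k of becoming colorful in one trial, so 200 trials make the failure probability negligible for small k.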
Example
An example would be finding a simple cycle of length k in a graph G = (V, E). By applying the random coloring
method, each such cycle becomes colorful with probability k!/k^k > e^(−k), and a colorful cycle of length k in a
colored graph can be found in O(V^ω) time, where ω is the exponent of matrix multiplication. Therefore, it takes
e^k · O(V^ω) overall expected time to find a simple cycle of length k in G.
The colorful cycle-finding algorithm works by first finding all pairs of vertices in V that are connected by a colorful
simple path of length k − 1, and then checking whether the two vertices in each pair are connected. Given a coloring
function c : V → {1, ..., k} of graph G, enumerate all partitions of the color set {1, ..., k} into two
subsets C_1 and C_2 of size k/2 each. Each partition splits the vertices of G into the set V_1 of vertices colored by
C_1 and the set V_2 of vertices colored by C_2; pairs connected by colorful paths of length k/2 − 1 are found
recursively in each of V_1 and V_2, and the results are combined to obtain all pairs connected by colorful paths of
length k − 1 in G. Although this algorithm finds only the end points of the colorful
path, another algorithm by Alon and Naor[4] that finds colorful paths themselves can be incorporated into it.
Derandomization
The derandomization of color-coding involves enumerating possible colorings of a graph G, so that randomness of
coloring is no longer needed. For the target subgraph H in G to be discoverable, the enumeration has to include at
least one coloring under which H is colorful. To achieve this, enumerating a k-perfect family F of hash functions
from {1, ..., |V|} to {1, ..., k} is sufficient. By definition, F is k-perfect if for every subset S of
{1, ..., |V|} with |S| = k there exists a hash function h in F that is one-to-one on S; in other words, some
function in F colors any given k vertices with k distinct colors. Several constructions of such families are known:
1. The best explicit construction is by Moni Naor, Leonard J. Schulman, and Aravind Srinivasan[5], where a family
of size e^k · k^O(log k) · log |V| can be obtained. This construction does not require the target subgraph to exist in
the original subgraph finding problem.
2. Another explicit construction by Jeanette P. Schmidt and Alan Siegel[6] yields a family of size 2^O(k) · log |V|.
3. Another construction that appears in the original paper of Noga Alon et al.[2] can be obtained by first building a
k-perfect family that maps {1, ..., |V|} to {1, ..., k^2}, followed by building another k-perfect
family that maps {1, ..., k^2} to {1, ..., k}. Consequently, by composing the two families, a k-perfect
family that maps from {1, ..., |V|} to {1, ..., k} can be obtained.
In the case of derandomizing well-coloring, where each vertex on the subgraph is colored consecutively, a
k!-perfect family of hash functions from {1, ..., |V|} to {1, ..., k} is needed. A sufficient k!-perfect
family can be constructed using random bits that are almost independent, and the size of the resulting family is
exponential in k.
The derandomization of color-coding method can be easily parallelized, yielding efficient NC algorithms.
Applications
Recently, color coding has attracted much attention in the field of bioinformatics. One example is the detection of
signaling pathways in protein-protein interaction (PPI) networks. Another example is to discover and to count the
number of motifs in PPI networks. Studying both signaling pathways and motifs allows a deeper understanding of
the similarities and differences of many biological functions, processes, and structures among organisms.
Due to the huge amount of gene data that can be collected, searching for pathways or motifs can be highly time
consuming. However, by exploiting the color-coding method, motifs or signaling pathways with k = O(log n)
vertices in a network with n vertices can be found very efficiently in polynomial time. Thus, this enables us to
explore more complex or larger structures in PPI networks. More details can be found in [9][10].
References
[1] Alon, N., Yuster, R., and Zwick, U. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within
large graphs. In Proceedings of the Twenty-Sixth Annual ACM Symposium on Theory of Computing (Montreal, Quebec, Canada, May 23–25,
1994). STOC '94. ACM, New York, NY, 326–335. DOI= http://doi.acm.org/10.1145/195058.195179
[2] Alon, N., Yuster, R., and Zwick, U. 1995. Color-coding. J. ACM 42, 4 (Jul. 1995), 844–856. DOI= http://doi.acm.org/10.1145/210332.210337
[3] Coppersmith–Winograd Algorithm
[4] Alon, N. and Naor, M. 1994. Derandomization, Witnesses for Boolean Matrix Multiplication and Construction of Perfect Hash Functions.
Technical Report. UMI Order Number: CS94-11, Weizmann Science Press of Israel.
[5] Naor, M., Schulman, L. J., and Srinivasan, A. 1995. Splitters and near-optimal derandomization. In Proceedings of the 36th Annual
Symposium on Foundations of Computer Science (October 23–25, 1995). FOCS. IEEE Computer Society, Washington, DC, 182.
[6] Schmidt, J. P. and Siegel, A. 1990. The spatial complexity of oblivious k-probe hash functions. SIAM J. Comput. 19, 5 (Sep. 1990), 775–786.
DOI= http://dx.doi.org/10.1137/0219054
[7] Naor, J. and Naor, M. 1990. Small-bias probability spaces: efficient constructions and applications. In Proceedings of the Twenty-Second
Annual ACM Symposium on Theory of Computing (Baltimore, Maryland, United States, May 13–17, 1990). H. Ortiz, Ed. STOC '90. ACM,
New York, NY, 213–223. DOI= http://doi.acm.org/10.1145/100216.100244
[8] Alon, N., Goldreich, O., Hastad, J., and Peralta, R. 1990. Simple construction of almost k-wise independent random variables. In Proceedings
of the 31st Annual Symposium on Foundations of Computer Science (October 22–24, 1990). SFCS. IEEE Computer Society, Washington,
DC, 544–553 vol. 2. DOI= http://dx.doi.org/10.1109/FSCS.1990.89575
[9] Alon, N., Dao, P., Hajirasouliha, I., Hormozdiari, F., and Sahinalp, S. C. 2008. Biomolecular network motif counting and discovery by color
coding. Bioinformatics 24, 13 (Jul. 2008), i241–i249. DOI= http://dx.doi.org/10.1093/bioinformatics/btn163
[10] Hüffner, F., Wernicke, S., and Zichner, T. 2008. Algorithm Engineering for Color-Coding with Applications to Signaling Pathway
Detection. Algorithmica 52, 2 (Aug. 2008), 114–132. DOI= http://dx.doi.org/10.1007/s00453-007-9008-7
Graph partition
Problem complexity
Typically, graph partition problems fall under the category of NP-hard problems. Solutions to these problems are
generally derived using heuristics and approximation algorithms [3][2]. However, uniform graph partitioning or a
balanced graph partition problem can be shown to be NP-complete to approximate within any finite factor [1]. Even
for special graph classes such as trees and grids, no reasonable approximation algorithms exist[4], unless P=NP.
Grids are a particularly interesting case since they model the graphs resulting from Finite Element Model (FEM)
simulations. When not only the number of edges between the components is approximated, but also the sizes of the
components, it can be shown that no reasonable fully polynomial algorithms exist for these graphs [4].
Problem
Consider a graph G = (V, E), where V denotes the set of n vertices and E the set of edges. For a (k, v) balanced
partition problem, the objective is to partition G into k components each of size at most v·(n/k), while minimizing the
capacity of the edges between separate components.[1] Also, given G and an integer k > 1, partition V into k parts
(subsets) V_1, V_2, ..., V_k such that the parts are disjoint and have equal size, and the number of edges with endpoints in
different parts is minimized. Such partition problems have been discussed in the literature as bicriteria-approximation or
resource augmentation approaches. A common extension is to hypergraphs, where an edge can connect more than
two vertices. A hyperedge is not cut if all its vertices are in one partition, and cut exactly once otherwise, no matter how
many vertices are on each side. This usage is common in electronic design automation.
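The balanced objective can be made concrete with a brute-force solver for the smallest case, k = 2, i.e., minimum bisection. This is exponential-time and purely illustrative; the function names are ours.

```python
from itertools import combinations

def min_bisection(adj):
    """Split the vertex set into two equal halves minimising the number
    of edges crossing the cut.  Exhaustive: only for tiny graphs."""
    vs = sorted(adj)
    n = len(vs)
    assert n % 2 == 0, "minimum bisection needs an even number of vertices"

    def cut_size(half):
        half = set(half)
        # each crossing edge counted once: u inside the half, v outside
        return sum(1 for u in half for v in adj[u] if v not in half)

    best = min(combinations(vs, n // 2), key=cut_size)
    return set(best), set(vs) - set(best), cut_size(best)
```

On two triangles joined by a single bridge edge, the optimum cuts exactly that bridge, separating the two triangles.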
Analysis
For a specific (k, 1+ε) balanced partition problem, we seek to find a minimum cost partition of G into k components
with each component containing a maximum of (1+ε)(n/k) nodes. We compare the cost of this approximation
algorithm to the cost of a (k, 1) cut, wherein each of the k components must have exactly the same size of (n/k) nodes
each, thus being a more restricted problem. Since the (k, 1) problem is more restricted, its optimum cost is an upper
bound on the optimum cost of the (k, 1+ε) problem.
We already know that the (2, 1) cut is the minimum bisection problem and that it is NP-complete.[5] Next we assess a
3-partition problem wherein n = 3k, which is also NP-complete.[1] Now, if we assume that we have a
finite approximation algorithm for (k, 1)-balanced partition, then either the 3-partition instance can be solved using
the balanced (k, 1) partition in G or it cannot be solved. If the 3-partition instance can be solved, then the (k, 1)-balanced
partitioning problem in G can be solved without cutting any edge. Otherwise, if the 3-partition instance cannot be
solved, the optimum (k, 1)-balanced partitioning in G will cut at least one edge. An approximation algorithm with
finite approximation factor has to differentiate between these two cases. Hence, it can solve the 3-partition problem,
which is a contradiction unless P = NP. Thus, it is evident that the (k, 1)-balanced partitioning
problem has no polynomial-time approximation algorithm with finite approximation factor unless P = NP.[1]
Multi-level methods
A multi-level graph partitioning algorithm works by applying one or more stages. Each stage reduces the size of the
graph by collapsing vertices and edges, partitions the smaller graph, then maps back and refines this partition of the
original graph.[6] A wide variety of partitioning and refinement methods can be applied within the overall multi-level
scheme. In many cases, this approach can give both fast execution times and very high quality results. One widely
used example of such an approach is METIS,[7] a graph partitioner, and hMETIS, the corresponding partitioner for
hypergraphs.[8]
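The coarsen, partition, and project-back pattern can be written recursively. The toy sketch below is illustrative only (it is not METIS; the function names and the random-matching coarsening are our own choices, and the refinement step is omitted):

```python
import random

def contract_matching(adj):
    """Coarsen: contract a random maximal matching; return coarse graph and vertex map."""
    mapping, used, coarse_id = {}, set(), 0
    verts = list(adj)
    random.shuffle(verts)
    for u in verts:
        if u in used:
            continue
        partner = next((v for v in adj[u] if v not in used and v != u), None)
        mapping[u] = coarse_id
        used.add(u)
        if partner is not None:
            mapping[partner] = coarse_id
            used.add(partner)
        coarse_id += 1
    coarse = {i: set() for i in range(coarse_id)}
    for u in adj:
        for v in adj[u]:
            if mapping[u] != mapping[v]:
                coarse[mapping[u]].add(mapping[v])
    return coarse, mapping

def multilevel_bisect(adj):
    """Bisect by recursive coarsening; adj maps each vertex to its neighbor set."""
    if len(adj) <= 4:  # coarsest level: trivial split
        verts = sorted(adj)
        return {u: (0 if i < len(verts) / 2 else 1) for i, u in enumerate(verts)}
    coarse, mapping = contract_matching(adj)
    if len(coarse) == len(adj):  # no coarsening possible; split directly
        verts = sorted(adj)
        return {u: (0 if i < len(verts) / 2 else 1) for i, u in enumerate(verts)}
    coarse_part = multilevel_bisect(coarse)  # recurse on the smaller graph
    # project back; a real partitioner would now refine (e.g. a Kernighan-Lin pass)
    return {u: coarse_part[mapping[u]] for u in adj}
```

A production partitioner improves the projected partition with local refinement at every level; that step is what gives multi-level methods their quality.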
References
[1] Andreev, Konstantin; Räcke, Harald (2004). "Balanced Graph Partitioning". Proceedings of the sixteenth annual ACM symposium on
Parallelism in algorithms and architectures (Barcelona, Spain): 120–124. doi:10.1145/1007912.1007931. ISBN 1-58113-840-7.
[2] Sachin B. Patkar and H. Narayanan (2003). "An Efficient Practical Heuristic For Good Ratio-Cut Partitioning". VLSI Design, International
Conference on: 64. doi:10.1109/ICVD.2003.1183116.
[3] Feldmann, Andreas Emil; Foschini, Luca (2012). "Balanced Partitions of Trees and Applications". Proceedings of the 29th International
Symposium on Theoretical Aspects of Computer Science (Paris, France): 100–111.
[4] Feldmann, Andreas Emil (2012). "Fast Balanced Partitioning is Hard, Even on Grids and Trees". Proceedings of the 37th International
Symposium on Mathematical Foundations of Computer Science (Bratislava, Slovakia).
[5] Garey, Michael R.; Johnson, David S. (1979). Computers and intractability: A guide to the theory of NP-completeness. W. H. Freeman &
Co. ISBN 0-7167-1044-7.
[6] Hendrickson, B.; Leland, R. (1995). "A multilevel algorithm for partitioning graphs". Proceedings of the 1995 ACM/IEEE Conference on
Supercomputing. ACM. p. 28.
[7] Karypis, G.; Kumar, V. (1999). "A fast and high quality multilevel scheme for partitioning irregular graphs". SIAM Journal on Scientific
Computing 20 (1): 359. doi:10.1137/S1064827595287997.
[8] Karypis, G.; Aggarwal, R.; Kumar, V.; Shekhar, S. (1997). "Multilevel hypergraph partitioning: application in VLSI domain".
Proceedings of the 34th annual Design Automation Conference. pp. 526–529.
[9] Newman, M. E. J. (2006). "Modularity and community structure in networks". PNAS 103 (23): 8577–8582. doi:10.1073/pnas.0601602103.
PMC 1482622. PMID 16723398.
[10] Rodrigo Aldecoa and Ignacio Marín (2011). "Deciphering network community structure by Surprise"
(http://www.plosone.org/article/info:doi/10.1371/journal.pone.0024195). PLoS ONE 6 (9): e24195. doi:10.1371/journal.pone.0024195. PMID 21909420.
[11] Reichardt, Jörg; Bornholdt, Stefan (Jul 2006). "Statistical mechanics of community detection". Phys. Rev. E 74 (1): 016110.
doi:10.1103/PhysRevE.74.016110.
[12] Carlos Alzate and Johan A. K. Suykens (2010). "Multiway Spectral Clustering with Out-of-Sample Extensions through Weighted Kernel
PCA". IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE Computer Society) 32 (2): 335–347.
doi:10.1109/TPAMI.2008.292. ISSN 0162-8828. PMID 20075462.
External links
Chamberlain, Bradford L. (1998). "Graph Partitioning Algorithms for Distributing Workloads of Parallel
Computations" (http://masters.donntu.edu.ua/2006/fvti/krasnokutskaya/library/generals.pdf)
Bibliography
Feldmann, Andreas Emil (2012). Balanced Partitioning of Grids and Related Graphs: A Theoretical Study of
Data Distribution in Parallel Finite Element Model Simulations (http://www.pw.ethz.ch/people/
research_group/andemil/personal/thesis.pdf). Goettingen, Germany: Cuvillier Verlag. pp.218.
ISBN978-3954041251. An exhaustive analysis of the problem from a theoretical point of view.
B. W. Kernighan, S. Lin (1970). "An efficient heuristic procedure for partitioning graphs" (http://www.ece.wisc.edu/~adavoodi/teaching/756-old/papers/kl.pdf). Bell System Technical Journal. One of the early fundamental
works in the field. However, performance is O(n²), so it is no longer commonly used.
CM Fiduccia, RM Mattheyses (1982). "A Linear-Time Heuristic for Improving Network Partitions" (http://
ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1585498). Design Automation Conference. A later variant that
is linear time, very commonly used, both by itself and as part of multilevel partitioning, see below.
G Karypis, V Kumar (1999). "A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs"
(http://glaros.dtc.umn.edu/gkhome/node/107). Siam Journal on Scientific Computing. Multi-level
partitioning is the current state of the art. This paper also has good explanations of many other methods, and
comparisons of the various methods on a wide variety of problems.
Karypis, G.; Aggarwal, R.; Kumar, V.; Shekhar, S. (March 1999). "Multilevel hypergraph partitioning:
applications in VLSI domain" (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=748202). IEEE
Transactions on Very Large Scale Integration (VLSI) Systems 7 (1): 69–79. doi:10.1109/92.748202. Graph
partitioning (and in particular, hypergraph partitioning) has many applications to IC design.
S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi (13 May 1983). "Optimization by Simulated Annealing"
(http://www.sciencemag.org/cgi/content/abstract/220/4598/671). Science 220 (4598): 671–680.
doi:10.1126/science.220.4598.671. PMID 17813860. Simulated annealing can be used as well.
Hagen, L.; Kahng, A. B. (September 1992). "New spectral methods for ratio cut partitioning and clustering"
(http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=159993). IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems 11 (9): 1074–1085. doi:10.1109/43.159993. There is a whole class of
spectral partitioning methods, which use the eigenvectors of the Laplacian of the connectivity graph. You can see
a demo of this (http://www.stanford.edu/~dgleich/demos/matlab/spectral/spectral.html), using Matlab.
Kernighan–Lin algorithm
This article is about the heuristic algorithm for the graph partitioning problem. For a heuristic for the
traveling salesperson problem, see Lin–Kernighan heuristic.
Kernighan–Lin is an O(n² log n) heuristic algorithm for solving the graph partitioning problem. The algorithm has
important applications in the layout of digital circuits and components in VLSI.[1][2]
Description
Let G = (V, E) be a graph, and let each edge (a, b) have a cost c(a, b). The input to the algorithm is an initial
partition of the vertex set V into two disjoint subsets A and B of equal size; the goal is to reduce the total cost of the
edges between A and B. For a vertex a of A, let I_a be the internal cost of a, that is, the sum of the costs of edges
between a and other nodes in A, and let E_a be the external cost of a, that is, the sum of the costs of edges between a
and nodes in B; internal and external costs are defined symmetrically for vertices of B. Define

    D_a = E_a − I_a,

the difference between the external and internal costs of a. If a and b are interchanged, the reduction in cost is

    g = D_a + D_b − 2 c(a, b),

where c(a, b) is the cost of the possible edge between a and b.
The algorithm attempts to find an optimal series of interchange operations between elements of A and B which
maximizes the total gain,[1] and then executes the operations, producing a partition of the graph into A and B.
Pseudocode
See [2]
function Kernighan-Lin(G(V, E)):
    determine a balanced initial partition of the nodes into sets A and B
    do
        A1 := A; B1 := B
        compute D values for all a in A1 and b in B1
        for (i := 1 to |V|/2)
            find a[i] from A1 and b[i] from B1, such that
                g[i] = D[a[i]] + D[b[i]] - 2*c[a[i]][b[i]] is maximal
            remove a[i] and b[i] from further consideration in this pass
            update D values for the elements of A1 := A1 \ {a[i]} and B1 := B1 \ {b[i]}
        end for
        find k which maximizes g_max, the sum of g[1], ..., g[k]
        if (g_max > 0) then
            exchange a[1], a[2], ..., a[k] with b[1], b[2], ..., b[k]
    until (g_max <= 0)
    return G(V, E)
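A single pass of this scheme can be rendered compactly in Python. This is an illustrative sketch, not the original implementation: for simplicity the D values are recomputed against the remaining unswapped vertices rather than updated with the incremental rule, and `cost` is assumed to be a dict-of-dicts of edge weights (0 if absent):

```python
def kernighan_lin_pass(A, B, cost):
    """One pass: greedily pair swap candidates, then apply the best prefix of swaps."""
    A1, B1 = set(A), set(B)

    def D(u, own, other):  # external cost minus internal cost of u
        return (sum(cost[u].get(v, 0) for v in other)
                - sum(cost[u].get(v, 0) for v in own))

    gains, swaps = [], []
    while A1 and B1:
        # pick the pair (a, b) maximizing g = D[a] + D[b] - 2*c(a, b)
        g, a, b = max((D(a, A1, B1) + D(b, B1, A1) - 2 * cost[a].get(b, 0), a, b)
                      for a in A1 for b in B1)
        gains.append(g)
        swaps.append((a, b))
        A1.remove(a)
        B1.remove(b)
    # choose k maximizing the partial sum g[1] + ... + g[k]; apply only if positive
    best_k, best_gain, running = 0, 0, 0
    for k, g in enumerate(gains, 1):
        running += g
        if running > best_gain:
            best_k, best_gain = k, running
    for a, b in swaps[:best_k]:
        A.remove(a); B.remove(b)
        A.add(b); B.add(a)
    return best_gain

# Two triangles {0,1,2} and {3,4,5}; one pass repairs a bad initial split.
cost = {u: {} for u in range(6)}
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    cost[u][v] = cost[v][u] = 1
A, B = {0, 1, 3}, {2, 4, 5}
print(kernighan_lin_pass(A, B, cost))  # -> 4 (afterwards A == {0,1,2}, B == {3,4,5})
```

The full algorithm repeats such passes until a pass produces no positive gain.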
References
[1] Kernighan, B. W.; Lin, Shen (1970). "An efficient heuristic procedure for partitioning graphs". Bell Systems Technical Journal 49: 291–307.
[2] Ravikumār, Si. Pi.; Ravikumar, C. P. (1995). Parallel methods for VLSI layout design (http://books.google.com/?id=VPXAxkTKxXIC).
Greenwood Publishing Group. p. 73. ISBN 978-0-89391-828-6.
Tree decomposition
In graph theory, a tree decomposition is a mapping of a
graph into a tree that can be used to speed up solving
certain problems on the original graph. The treewidth
measures the number of graph vertices mapped onto any
tree node in an optimal tree decomposition. While it is
NP-hard to determine the treewidth of a graph, many
NP-hard combinatorial problems on graphs are solvable
in polynomial time when restricted to graphs of bounded
treewidth.
In machine learning, tree decompositions are also called
junction trees, clique trees, or join trees; they play an
important role in problems like probabilistic inference,
constraint satisfaction, query optimization, and matrix
decomposition.
The concept of tree decompositions and treewidth was
originally introduced by Halin (1976). Later it was
rediscovered by Robertson & Seymour (1984) and has
since been studied by many other authors[1].
Definition
Formally, a tree decomposition of a graph G = (V, E) is a tree T whose nodes are subsets X1, ..., Xn of V (often
called bags), satisfying the following three properties:
1. The union of all sets Xi equals V. That is, each vertex of the graph appears in at least one node of the tree.
2. For every edge (v, w) of the graph, there is a subset Xi that contains both v and w.
3. If Xi and Xj both contain a vertex v, then all nodes Xk of the tree in the (unique) path between Xi and Xj contain v
as well. That is, the nodes associated with vertex v form a connected subset of T. This is also known as coherence,
or the running intersection property. It can be stated equivalently: if Xi, Xj and Xk are nodes, and Xk is
on the path from Xi to Xj, then Xi ∩ Xj ⊆ Xk.
The tree decomposition of a graph is far from unique; for example, a trivial tree decomposition contains all vertices
of the graph in its single root node.
Treewidth
The width of a tree decomposition is the size of its largest set Xi minus
one. The treewidth tw(G) of a graph G is the minimum width among
all possible tree decompositions of G. In this definition, the size of the
largest set is diminished by one in order to make the treewidth of a tree
equal to one. Equivalently, the treewidth of G is one less than the size
of the largest clique in the chordal graph containing G with the
smallest clique number. The graphs with treewidth at most k are also
called partial k-trees.
It is NP-complete to determine whether a given graph G has treewidth
at most a given value k.[3] However, when k is any fixed constant,
the graphs with treewidth k can be recognized, and a width k tree
decomposition constructed for them, in linear time.[4] The time
dependence of this algorithm on k is exponential.
Graph minors
For any fixed constant k, the partial k-trees are closed under the
operation of graph minors, and therefore, by the Robertson–Seymour
theorem, this family can be characterized in terms of a finite set of
forbidden minors. For partial 1-trees (that is, forests), the single
forbidden minor is a triangle, and for the partial 2-trees the single
forbidden minor is the complete graph on four vertices. However, the
number of forbidden minors increases for larger values of k: for partial
3-trees there are four forbidden minors, the complete graph on five
vertices, the octahedral graph with six vertices, the eight-vertex
Wagner graph, and the pentagonal prism with ten vertices.[6]
Forbidden minors for partial 3-trees
The planar graphs do not have bounded treewidth, because the n × n
grid graph is a planar graph with treewidth n. Therefore, if F is a
minor-closed graph family with bounded treewidth, it cannot include all planar graphs. Conversely, if some planar
graph cannot occur as a minor for graphs in family F, then there is a constant k such that all graphs in F have
treewidth at most k.[7]
Dynamic programming
At the beginning of the 1970s, it was observed that a large class of combinatorial optimization problems defined on
graphs could be efficiently solved by non serial dynamic programming as long as the graph had a bounded
dimension,[9] a parameter related to treewidth. Later, several authors independently observed at the end of the
1980s[10] that many algorithmic problems that are NP-complete for arbitrary graphs may be solved efficiently by
dynamic programming for graphs of bounded treewidth, using the tree-decompositions of these graphs.
As an example, consider the problem of finding the maximum independent set in a graph of treewidth k. To solve
this problem, first choose one of the nodes of the tree decomposition to be the root, arbitrarily. For a node Xi of the
tree decomposition, let Di be the union of the sets Xj descending from Xi. For an independent set S ⊆ Xi, let A(S, i)
denote the size of the largest independent subset I of Di such that I ∩ Xi = S. Similarly, for an adjacent pair of nodes
Xi and Xj, with Xi farther from the root of the tree than Xj, and an independent set S ⊆ Xi ∩ Xj, let B(S, i, j) denote the
size of the largest independent subset I of Di such that I ∩ Xi ∩ Xj = S. We may calculate these A and B values by a
bottom-up traversal of the tree.
At each node or edge, there are at most 2^k sets S for which we need to calculate these values, so if k is a constant then
the whole calculation takes constant time per edge or node. The size of the maximum independent set is the largest
value stored at the root node, and the maximum independent set itself can be found (as is standard in dynamic
programming algorithms) by backtracking through these stored values starting from this largest value. Thus, in
graphs of bounded treewidth, the maximum independent set problem may be solved in linear time. Similar
algorithms apply to many other graph problems.
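The bottom-up tables described above can be sketched in code. For brevity, the sketch below (whose names and structure are illustrative, not from the article) specializes to a path-shaped decomposition, so a single A-style table indexed by independent subsets of the current bag suffices:

```python
from itertools import combinations

def independent_subsets(bag, edges):
    """Yield all independent subsets of a bag, as frozensets."""
    for r in range(len(bag) + 1):
        for S in combinations(sorted(bag), r):
            if all((u, v) not in edges and (v, u) not in edges
                   for u, v in combinations(S, 2)):
                yield frozenset(S)

def max_independent_set_size(bags, edges):
    """Dynamic program over a path decomposition, processed left to right."""
    edges = set(edges)
    # A[S] = size of the largest independent set among the vertices seen so
    #        far whose restriction to the current bag equals S
    A = {S: len(S) for S in independent_subsets(bags[0], edges)}
    for prev, bag in zip(bags, bags[1:]):
        common = set(prev) & set(bag)
        newA = {}
        for S in independent_subsets(bag, edges):
            # states are compatible when they agree on the shared vertices;
            # vertices entering the bag for the first time are counted on entry
            best = max((A[T] for T in A if T & common == S & common),
                       default=None)
            if best is not None:
                newA[S] = best + len(S - set(prev))
        A = newA
    return max(A.values())

# Path graph 1-2-3-4 with bags {1,2}, {2,3}, {3,4}: the answer is 2, e.g. {1,3}.
print(max_independent_set_size([(1, 2), (2, 3), (3, 4)],
                               [(1, 2), (2, 3), (3, 4)]))  # -> 2
```

Each bag contributes at most 2^(k+1) table entries, matching the constant-time-per-node bound claimed above for fixed k.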
This dynamic programming approach is used in machine learning via the junction tree algorithm for belief
propagation in graphs of bounded treewidth. It also plays a key role in algorithms for computing the treewidth and
constructing tree decompositions: typically, such algorithms have a first step that approximates the treewidth,
constructing a tree decomposition with this approximate width, and then a second step that performs dynamic
programming in the approximate tree decomposition to compute the exact value of the treewidth.[4]
Treewidth of cliques
In any tree decomposition of a graph that contains a clique, some node of the decomposition contains all the
nodes of the clique. This is shown by induction on the size of the clique. The base cases are cliques of size 1 and 2,
which follow from the conditions 1 and 2 on a tree decomposition. The inductive case is a graph containing a clique
of size k + 1, where k is 2 or greater. Let C be the set of nodes in the clique. Since |C| ≥ 3, there are at least three
nodes in the clique; call them x, y and z. We know from the induction hypothesis, applied to the cliques C \ {x},
C \ {y} and C \ {z}, that there are nodes a, b and c in the tree decomposition containing C \ {x}, C \ {y} and C \ {z},
respectively.
In a tree there is exactly one path between any two nodes. A second property of trees is that the three paths between
a, b and c have a non-empty intersection. Let v be a node in this intersection. Since v lies on the path from a to b,
condition 3 on a tree decomposition implies that v contains (C \ {x}) ∩ (C \ {y}) = C \ {x, y}; in the same way, v
contains C \ {x, z} and C \ {y, z}. The union of these three sets is all of C, so v contains the entire clique.
Treewidth of trees
A connected graph with at least two vertices has treewidth 1 if and only if it is a tree. This can be shown in two steps,
first that a tree has treewidth 1, second, that a connected graph with treewidth 1 is a tree.
To show the former, use induction on the number of vertices in the tree to show that it has a tree decomposition with
no bag of size larger than two. The base case is a tree with two vertices, in which case the tree decomposition with
one single node containing both vertices is sufficient. The inductive case is a tree T with n vertices, where n is any
integer greater than two. If we remove a leaf v from T, we obtain a tree T' with n − 1 vertices which, by the induction
hypothesis, has a tree decomposition in which every bag has size at most two. Let u be the neighbor of v in T, and let
X be a node of this decomposition whose bag contains u. Adding to the decomposition of T' a new node with bag
{u, v}, attached to X, yields a tree decomposition of T of the same width.
To show the latter, observe that a connected graph of treewidth 1 cannot contain a cycle: any cycle can be contracted
to a triangle, treewidth does not increase under taking minors, and by the clique property above a triangle forces a
bag of size three. A connected graph without cycles is a tree.
Notes
[1] Diestel (2005) pp.354355
[2] Diestel (2005) section 12.3
[3] Arnborg, Corneil & Proskurowski (1987).
[4] Bodlaender (1996).
[5] Seymour & Thomas (1993).
[6] Bodlaender (1998).
[7] Robertson & Seymour (1986).
[8] Thorup (1998).
[9] Bertelè & Brioschi (1972)
[10] Arnborg & Proskurowski (1989); Bern, Lawler & Wong (1987); Bodlaender (1988).
References
Arnborg, S.; Corneil, D.; Proskurowski, A. (1987), "Complexity of finding embeddings in a k-tree", SIAM
Journal on Matrix Analysis and Applications 8 (2): 277–284, doi:10.1137/0608024.
Arnborg, S.; Proskurowski, A. (1989), "Linear time algorithms for NP-hard problems restricted to partial k-trees",
Discrete Applied Mathematics 23 (1): 11–24, doi:10.1016/0166-218X(89)90031-0.
Bern, M. W.; Lawler, E. L.; Wong, A. L. (1987), "Linear-time computation of optimal subgraphs of
decomposable graphs", Journal of Algorithms 8 (2): 216–235, doi:10.1016/0196-6774(87)90039-3.
Bertelè, Umberto; Brioschi, Francesco (1972), Nonserial Dynamic Programming, Academic Press,
ISBN 0-12-093450-7.
Bodlaender, Hans L. (1988), "Dynamic programming on graphs with bounded treewidth", Proc. 15th
International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, 317,
Springer-Verlag, pp. 105–118, doi:10.1007/3-540-19488-6_110.
Bodlaender, Hans L. (1996), "A linear time algorithm for finding tree-decompositions of small treewidth", SIAM
Journal on Computing 25 (6): 1305–1317, doi:10.1137/S0097539793251219.
Bodlaender, Hans L. (1998), "A partial k-arboretum of graphs with bounded treewidth", Theoretical Computer
Science 209 (1–2): 1–45, doi:10.1016/S0304-3975(97)00228-4.
Diestel, Reinhard (2005), Graph Theory (http://www.math.uni-hamburg.de/home/diestel/books/graph.theory/)
(3rd ed.), Springer, ISBN 3-540-26182-6.
Halin, Rudolf (1976), "S-functions for graphs", Journal of Geometry 8: 171–186.
Robertson, Neil; Seymour, Paul D. (1984), "Graph minors III: Planar tree-width", Journal of Combinatorial
Theory, Series B 36 (1): 49–64, doi:10.1016/0095-8956(84)90013-3.
Robertson, Neil; Seymour, Paul D. (1986), "Graph minors V: Excluding a planar graph", Journal of
Combinatorial Theory, Series B 41 (1): 92–114, doi:10.1016/0095-8956(86)90030-4.
Seymour, Paul D.; Thomas, Robin (1993), "Graph Searching and a Min-Max Theorem for Tree-Width", Journal
of Combinatorial Theory, Series B 58 (1): 22–33, doi:10.1006/jctb.1993.1027.
Thorup, Mikkel (1998), "All structured programs have small tree width and good register allocation", Information
and Computation 142 (2): 159–181, doi:10.1006/inco.1997.2697.
Branch-decomposition
In graph theory, a branch-decomposition of an undirected graph G is a hierarchical clustering of the edges of G,
represented by an unrooted binary tree T with the edges of G as its leaves. Removing any edge from T partitions
the edges of G into two subgraphs, and the width of the decomposition is the maximum number of shared vertices of
any pair of subgraphs formed in this way. The branchwidth of G is the minimum width of any
branch-decomposition of G. Branchwidth is closely related to
tree-width and many graph optimization problems may be solved efficiently for graphs of small branchwidth.
Branch-decompositions and branchwidth may also be generalized from graphs to matroids.
Branch decomposition of a grid graph, showing an e-separation. The separation, the
decomposition, and the graph all have width three.
Definitions
An unrooted binary tree is a connected undirected graph with no cycles in which each non-leaf node has exactly
three neighbors. A branch-decomposition may be represented by an unrooted binary tree T, together with a bijection
between the leaves of T and the edges of the given graph G=(V,E). If e is any edge of the tree T, then removing e
from T partitions it into two subtrees T1 and T2. This partition of T into subtrees induces a partition of the edges
associated with the leaves of T into two subgraphs G1 and G2 of G. This partition of G into two subgraphs is called
an e-separation.
The width of an e-separation is the number of vertices of G that are incident both to an edge of G1 and to an edge of
G2; that is, it is the number of vertices that are shared by the two subgraphs G1 and G2. The width of the
branch-decomposition is the maximum width of any of its e-separations. The branchwidth of G is the minimum
width of a branch-decomposition of G.
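Given the two edge sets that an e-separation induces, its width can be computed directly. A minimal sketch (the function name and the pair representation of edges are our own choices):

```python
def e_separation_width(edges1, edges2):
    """Number of vertices incident to an edge on both sides of the separation."""
    v1 = {v for e in edges1 for v in e}
    v2 = {v for e in edges2 for v in e}
    return len(v1 & v2)

# Removing a tree edge of some branch-decomposition of the 4-cycle a-b-c-d
# might split its edges as {ab, bc} versus {cd, da}; the shared vertices
# are a and c, so this e-separation has width 2.
print(e_separation_width([("a", "b"), ("b", "c")],
                         [("c", "d"), ("d", "a")]))  # -> 2
```

The width of the whole branch-decomposition is the maximum of this quantity over all tree edges.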
Relation to treewidth
Branch-decompositions of graphs are closely related to tree decompositions, and branch-width is closely related to
tree-width: the two quantities are always within a constant factor of each other. In particular, in the paper in which
they introduced branch-width, Neil Robertson and Paul Seymour[1] showed that for a graph G with tree-width k and
branchwidth b > 1, the two parameters satisfy b ≤ k + 1 ≤ ⌊3b/2⌋.
Carving width
Carving width is a concept defined similarly to branch width, except with edges replaced by vertices and vice versa.
A carving decomposition is an unrooted binary tree with each leaf representing a vertex in the original graph, and the
width of a cut is the number (or total weight in a weighted graph) of edges that are incident to a vertex in both
subtrees.
Branch width algorithms typically work by reducing to an equivalent carving width problem. In particular, the
carving width of the medial graph of a planar graph is exactly twice the branch width of the original graph.[2]
Generalization to matroids
It is also possible to define a notion of branch-decomposition for matroids that generalizes branch-decompositions of
graphs.[6] A branch-decomposition of a matroid is a hierarchical clustering of the matroid elements, represented as
an unrooted binary tree with the elements of the matroid at its leaves. An e-separation may be defined in the same
way as for graphs, and results in a partition of the set M of matroid elements into two subsets A and B. If ρ denotes
the rank function of the matroid, then the width of an e-separation is defined as ρ(A) + ρ(B) − ρ(M) + 1, and the
width of the decomposition and the branchwidth of the matroid are defined analogously. The branchwidth of a graph
and the branchwidth of the corresponding graphic matroid may differ: for instance, the three-edge path graph and the
three-edge star have different branchwidths, 2 and 1 respectively, but they both induce the same graphic matroid
with branchwidth 1.[7] However, for graphs that are not trees, the branchwidth of the graph is equal to the
branchwidth of its associated graphic matroid.[8] The branchwidth of a matroid is equal to the branchwidth of its dual
matroid, and in particular this implies that the branchwidth of any planar graph that is not a tree is equal to that of its
dual.[7]
Branchwidth is an important component of attempts to extend the theory of graph minors to matroid minors:
although treewidth can also be generalized to matroids,[9] and plays a bigger role than branchwidth in the theory of
graph minors, branchwidth has more convenient properties in the matroid setting.[10] Robertson and Seymour
conjectured that the matroids representable over any particular finite field are well-quasi-ordered, analogously to the
Robertson–Seymour theorem for graphs, but so far this has been proven only for the matroids of bounded
branchwidth.[11] Additionally, if a minor-closed family of matroids representable over a finite field does not include
the graphic matroids of all planar graphs, then there is a constant bound on the branchwidth of the matroids in the
family, generalizing similar results for minor-closed graph families.[12]
Forbidden minors
By the Robertson–Seymour theorem, the graphs of branchwidth k can
be characterized by a finite set of forbidden minors. The graphs of
branchwidth 0 are the matchings; the minimal forbidden minors are a
two-edge path graph and a triangle graph (or the two-edge cycle, if
multigraphs rather than simple graphs are considered).[13] The graphs
of branchwidth 1 are the graphs in which each connected component is
a star; the minimal forbidden minors for branchwidth 1 are the triangle
graph (or the two-edge cycle, if multigraphs rather than simple graphs
are considered) and the three-edge path graph.[13] The graphs of
branchwidth 2 are the graphs in which each biconnected component is
a series-parallel graph; the only minimal forbidden minor is the
complete graph K4 on four vertices.[13] A graph has branchwidth three
if and only if it has treewidth three and does not have the cube graph as
a minor; therefore, the four minimal forbidden minors are three of the four forbidden minors for treewidth three (the
graph of the octahedron, the complete graph K5, and the Wagner graph) together with the cube graph.[14]
The four forbidden minors for graphs of branchwidth three.
Forbidden minors have also been studied for matroid branchwidth, despite the lack of a full analogue to the
RobertsonSeymour theorem in this case. A matroid has branchwidth one if and only if every element is either a
loop or a coloop, so the unique minimal forbidden minor is the uniform matroid U(2,3), the graphic matroid of the
triangle graph. A matroid has branchwidth two if and only if it is the graphic matroid of a graph of branchwidth two,
so its minimal forbidden minors are the graphic matroid of K4 and the non-graphic matroid U(2,4). The matroids of
branchwidth three are not well-quasi-ordered without the additional assumption of representability over a finite field,
but nevertheless the matroids with any finite bound on their branchwidth have finitely many minimal forbidden
minors, all of which have a number of elements that is at most exponential in the branchwidth.[15]
References
Bodlaender, Hans L.; Thilikos, Dimitrios M. (1997), "Constructive linear time algorithms for branchwidth", Proc.
24th International Colloquium on Automata, Languages and Programming (ICALP '97), Lecture Notes in
Computer Science, 1256, Springer-Verlag, pp. 627–637, doi:10.1007/3-540-63165-8_217.
Bodlaender, Hans L.; Thilikos, Dimitrios M. (1999), "Graphs with branchwidth at most three", Journal of
Algorithms 32 (2): 167–194, doi:10.1006/jagm.1999.1011.
Cook, William; Seymour, Paul D. (2003), "Tour merging via branch-decomposition"
(http://www.cs.utk.edu/~langston/projects/papers/tmerge.pdf), INFORMS Journal on Computing 15 (3): 233–248,
doi:10.1287/ijoc.15.3.233.16078.
Fomin, Fedor V.; Thilikos, Dimitrios M. (2006), "Dominating sets in planar graphs: branch-width and exponential
speed-up", SIAM Journal on Computing 36 (2): 281, doi:10.1137/S0097539702419649.
Fomin, Fedor V.; Mazoit, Frédéric; Todinca, Ioan (2009), "Computing branchwidth via efficient triangulations
and blocks" (http://hal.archives-ouvertes.fr/hal-00390623/), Discrete Applied Mathematics 157 (12):
2726–2736, doi:10.1016/j.dam.2008.08.009.
Geelen, Jim; Gerards, Bert; Robertson, Neil; Whittle, Geoff (2003), "On the excluded minors for the matroids of
branch-width k", Journal of Combinatorial Theory, Series B 88 (2): 261–265,
doi:10.1016/S0095-8956(02)00046-1.
Geelen, Jim; Gerards, Bert; Whittle, Geoff (2002), "Branch-width and well-quasi-ordering in matroids and
graphs", Journal of Combinatorial Theory, Series B 84 (2): 270–290, doi:10.1006/jctb.2001.2082.
Geelen, Jim; Gerards, Bert; Whittle, Geoff (2006), "Towards a structure theory for matrices and matroids"
(http://www.icm2006.org/proceedings/Vol_III/contents/ICM_Vol_3_41.pdf), Proc. International Congress of
Mathematicians, III, pp. 827–842.
Geelen, Jim; Gerards, Bert; Whittle, Geoff (2007), "Excluding a planar graph from GF(q)-representable matroids"
(http://www.math.uwaterloo.ca/~jfgeelen/publications/grid.pdf), Journal of Combinatorial Theory, Series B
97 (6): 971–998, doi:10.1016/j.jctb.2007.02.005.
Hall, Rhiannon; Oxley, James; Semple, Charles; Whittle, Geoff (2002), "On matroids of branch-width three",
Journal of Combinatorial Theory, Series B 86 (1): 148–171, doi:10.1006/jctb.2002.2120.
Hicks, Illya V. (2000), Branch Decompositions and their Applications
(http://www.caam.rice.edu/caam/trs/2000/TR00-17.ps), Ph.D. thesis, Rice University.
Hicks, Illya V.; McMurray, Nolan B., Jr. (2007), "The branchwidth of graphs and their cycle matroids", Journal
of Combinatorial Theory, Series B 97 (5): 681–692, doi:10.1016/j.jctb.2006.12.007.
Hliněný, Petr (2003), "On matroid properties definable in the MSO logic", Proc. 28th International Symposium
on Mathematical Foundations of Computer Science (MFCS '03), Lecture Notes in Computer Science, 2747,
Springer-Verlag, pp. 470–479, doi:10.1007/978-3-540-45138-9_41.
Hliněný, Petr; Whittle, Geoff (2006), "Matroid tree-width"
(http://www.fi.muni.cz/~hlineny/Research/papers/matr-tw-final.pdf), European Journal of Combinatorics 27 (7): 1117–1128,
doi:10.1016/j.ejc.2006.06.005.
Addendum and corrigendum: Hliněný, Petr; Whittle, Geoff (2009), "Addendum to matroid tree-width",
European Journal of Combinatorics 30 (4): 1036–1044, doi:10.1016/j.ejc.2008.09.028.
Mazoit, Frédéric; Thomassé, Stéphan (2007), "Branchwidth of graphic matroids"
(http://hal.archives-ouvertes.fr/docs/00/04/09/28/PDF/Branchwidth.pdf), in Hilton, Anthony; Talbot, John, Surveys in Combinatorics
2007, London Mathematical Society Lecture Note Series, 346, Cambridge University Press, p. 275.
Robertson, Neil; Seymour, Paul D. (1991), "Graph minors. X. Obstructions to tree-decomposition", Journal of
Combinatorial Theory, Series B 52 (2): 153–190, doi:10.1016/0095-8956(91)90061-N.
Seymour, Paul D.; Thomas, Robin (1994), "Call routing and the ratcatcher", Combinatorica 14 (2): 217–241,
doi:10.1007/BF01215352.
Path decomposition
In graph theory, a path decomposition of a graph G is, informally, a representation of G as a "thickened" path
graph,[1] and the pathwidth of G is a number that measures how much the path was thickened to form G. More
formally, a path-decomposition is a sequence of subsets of vertices of G such that the endpoints of each edge appear
in one of the subsets and such that each vertex appears in a contiguous subsequence of the subsets,[2] and the
pathwidth is one less than the size of the largest set in such a decomposition. Pathwidth is also known as interval
thickness (one less than the maximum clique size in an interval supergraph of G), vertex separation number, or
node searching number.[3]
Pathwidth and path-decompositions are closely analogous to treewidth and tree decompositions. They play a key role
in the theory of graph minors: the families of graphs that are closed under graph minors and do not include all forests
may be characterized as having bounded pathwidth,[2] and the "vortices" appearing in the general structure theory for
minor-closed graph families have bounded pathwidth.[4] Pathwidth, and graphs of bounded pathwidth, also have
applications in VLSI design, graph drawing, and computational linguistics.
It is NP-hard to find the pathwidth of arbitrary graphs, or even to approximate it accurately.[5][6] However, the
problem is fixed-parameter tractable: testing whether a graph has pathwidth k can be solved in an amount of time
that depends linearly on the size of the graph but superexponentially on k.[7] Additionally, for several special classes
of graphs, such as trees, the pathwidth may be computed in polynomial time without dependence on k.[8][9] Many
problems in graph algorithms may be solved efficiently on graphs of bounded pathwidth, by using dynamic
programming on a path-decomposition of the graph.[10] Path decomposition may also be used to measure the space
complexity of dynamic programming algorithms on graphs of bounded treewidth.[11]
Definition
In the first of their famous series of papers on graph minors, Neil Robertson and Paul Seymour(1983) define a
path-decomposition of a graph G to be a sequence of subsets Xi of vertices of G, with two properties:
1. For each edge of G, there exists an i such that both endpoints of the edge belong to subset Xi, and
2. For every three indices i ≤ j ≤ k, Xi ∩ Xk ⊆ Xj.
The second of these two properties is equivalent to requiring that the subsets containing any particular vertex form a
contiguous subsequence of the whole sequence. In the language of the later papers in Robertson and Seymour's
graph minor series, a path-decomposition is a tree decomposition (X,T) in which the underlying tree T of the
decomposition is a path graph.
The width of a path-decomposition is defined in the same way as for tree-decompositions, as max_i |Xi| - 1, and the
pathwidth of G is the minimum width of any path-decomposition of G. The subtraction of one from the size of Xi in
this definition makes little difference in most applications of pathwidth, but is used to make the pathwidth of a path
graph be equal to one.
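The two defining conditions and the width formula can be checked mechanically. The following minimal sketch (function names are illustrative, not from any standard library) tests a candidate sequence of bags against both conditions and computes its width:

```python
def is_path_decomposition(bags, vertices, edges):
    """Check Robertson and Seymour's two conditions for a sequence of bags."""
    # 1. Every edge must have both endpoints together in some bag.
    if not all(any(u in b and v in b for b in bags) for u, v in edges):
        return False
    # 2. The bags containing each vertex must form a contiguous subsequence
    #    (equivalently, Xi ∩ Xk ⊆ Xj whenever i <= j <= k).
    for v in vertices:
        idx = [i for i, b in enumerate(bags) if v in b]
        if idx and idx != list(range(idx[0], idx[-1] + 1)):
            return False
    # Every vertex must appear in at least one bag.
    return all(any(v in b for b in bags) for v in vertices)

def width(bags):
    """Width = size of the largest bag, minus one."""
    return max(len(b) for b in bags) - 1

# A path 1-2-3-4: the bags {1,2}, {2,3}, {3,4} form a decomposition of width 1.
bags = [{1, 2}, {2, 3}, {3, 4}]
print(is_path_decomposition(bags, {1, 2, 3, 4}, [(1, 2), (2, 3), (3, 4)]))  # True
print(width(bags))  # 1
```

Reordering the bags as {1,2}, {3,4}, {2,3} breaks contiguity for vertices 2 and 3, so the same check rejects it.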
Alternative characterizations
As Bodlaender (1998) describes, pathwidth can be characterized in many equivalent ways.
Gluing sequences
A path decomposition can be described as a sequence of graphs Gi that are glued together by identifying pairs of
vertices from consecutive graphs in the sequence, such that the result of performing all of these gluings is G. The
graphs Gi may be taken as the induced subgraphs of the sets Xi in the first definition of path decompositions, with
two vertices in successive induced subgraphs being glued together when they are induced by the same vertex in G,
and in the other direction one may recover the sets Xi as the vertex sets of the graphs Gi. The width of the path
decomposition is then one less than the maximum number of vertices in one of the graphs Gi.[2]
Interval thickness
The pathwidth of any graph G is equal to
one less than the smallest clique number of
an interval graph that contains G as a
subgraph.[12] That is, for every path
decomposition of G one can find an interval
supergraph of G, and for every interval
supergraph of G one can find a path
decomposition of G, such that the width of
the decomposition is one less than the clique
number of the interval graph.
[Figure: An interval graph with pathwidth two, one less than the cardinality of its four maximum cliques ABC, ACD, CDE, and CDF.]
In one direction, suppose a path decomposition of G is given. Then one may
represent the nodes of the decomposition as points on a line (in path order) and represent
each vertex v as a closed interval having these points as endpoints. In this way, the path decomposition nodes
containing v correspond to the representative points in the interval for v. The intersection graph of the intervals
formed from the vertices of G is an interval graph that contains G as a subgraph. Its maximal cliques are given by the
sets of intervals containing the representative points, and its maximum clique size is one plus the pathwidth of G.
In the other direction, if G is a subgraph of an interval graph with clique number p+1, then G has a path
decomposition of width p whose nodes are given by the maximal cliques of the interval graph. For instance, the
interval graph shown with its interval representation in the figure has a path decomposition with five nodes,
corresponding to its five maximal cliques ABC, ACD, CDE, CDF, and FG; the maximum clique size is three and the
width of this path decomposition is two.
This equivalence between pathwidth and interval thickness is closely analogous to the equivalence between
treewidth and the minimum clique number (minus one) of a chordal graph of which the given graph is a subgraph.
Interval graphs are a special case of chordal graphs, and chordal graphs can be represented as intersection graphs of
subtrees of a common tree generalizing the way that interval graphs are intersection graphs of subpaths of a path.
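The interval-thickness direction of this equivalence can be made concrete in code: assign each vertex the interval of bag positions containing it, then count the largest number of intervals sharing a point. A minimal sketch (function names are illustrative), using the decomposition from the figure:

```python
def intervals_from_decomposition(bags):
    """Map each vertex to the closed interval of bag indices containing it.
    Assumes the bags satisfy the contiguity condition of a path decomposition."""
    iv = {}
    for i, b in enumerate(bags):
        for v in b:
            lo, _ = iv.get(v, (i, i))
            iv[v] = (lo, i)
    return iv

def interval_clique_number(iv):
    """Largest set of pairwise-intersecting intervals. For intervals, this is
    the maximum number covering a single point, and since all endpoints here
    are integer bag indices it suffices to check integer points."""
    lo = min(l for l, r in iv.values())
    hi = max(r for l, r in iv.values())
    return max(sum(l <= p <= r for l, r in iv.values()) for p in range(lo, hi + 1))

# The five maximal cliques ABC, ACD, CDE, CDF, FG from the example above.
bags = [{'A', 'B', 'C'}, {'A', 'C', 'D'}, {'C', 'D', 'E'}, {'C', 'D', 'F'}, {'F', 'G'}]
iv = intervals_from_decomposition(bags)
print(interval_clique_number(iv))  # 3, one more than the width (2) of the decomposition
```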
Bounds
[Figure: A caterpillar tree, a maximal graph with pathwidth one.]
Every n-vertex graph with pathwidth k has at most k(n - k + (k - 1)/2)
edges, and the maximal pathwidth-k graphs (graphs to which no more
edges can be added without increasing the pathwidth) have exactly this
many edges. A maximal pathwidth-k graph must be either a k-path or a
k-caterpillar, two special kinds of k-tree. A k-tree is a chordal graph
with exactly n - k maximal cliques, each containing k + 1 vertices; in a
k-tree that is not itself a (k + 1)-clique, each maximal clique either
separates the graph into two or more components, or it contains a single leaf vertex, a vertex that belongs to only a
single maximal clique. A k-path is a k-tree with at most two leaves, and a k-caterpillar is a k-tree in which the
non-leaf vertices induce a k-path. In particular the maximal graphs of pathwidth one are exactly the caterpillar
trees.[14]
Since path-decompositions are a special case of tree-decompositions, the pathwidth of any graph is greater than or
equal to its treewidth. The pathwidth is also less than or equal to the cutwidth, the minimum number of edges that
cross any cut between lower-numbered and higher-numbered vertices in an optimal linear arrangement of the
vertices of a graph; this follows because the vertex separation number, the number of lower-numbered vertices with
higher-numbered neighbors, can at most equal the number of cut edges.[15] For similar reasons, the cutwidth is at
most the pathwidth times the maximum degree of the vertices in a given graph.[16]
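The inequality between vertex separation number and cutwidth can be seen concretely: both are maxima over prefixes of a linear arrangement, one counting boundary vertices and the other counting crossing edges, and each boundary vertex contributes at least one crossing edge. A small sketch, with illustrative function names:

```python
def vertex_separation(order, adj):
    """Max over prefixes of the number of prefix vertices that still have
    a neighbour in the suffix (equals pathwidth for an optimal order)."""
    best = 0
    for i in range(1, len(order)):
        prefix, suffix = set(order[:i]), set(order[i:])
        best = max(best, sum(1 for v in prefix if adj[v] & suffix))
    return best

def cutwidth(order, adj):
    """Max over prefixes of the number of edges crossing the cut."""
    best = 0
    for i in range(1, len(order)):
        prefix, suffix = set(order[:i]), set(order[i:])
        best = max(best, sum(len(adj[v] & suffix) for v in prefix))
    return best

# Star K_{1,3} centred at vertex 0, with the centre placed first:
# one boundary vertex, but up to three crossing edges.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
order = [0, 1, 2, 3]
print(vertex_separation(order, adj), cutwidth(order, adj))  # 1 3
```

The gap between the two values (1 versus 3) is bounded by the maximum degree, matching the statement above.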
Any n-vertex forest has pathwidth O(log n).[17] For, in a forest, one can always find a constant number of vertices the
removal of which leaves a forest that can be partitioned into two smaller subforests with at most 2n/3 vertices each.
A linear arrangement formed by recursively partitioning each of these two subforests, placing the separating vertices
between them, has logarithmic vertex searching number. The same technique, applied to a tree-decomposition of a
graph, shows that, if the treewidth of an n-vertex graph G is t, then the pathwidth of G is O(t log n).[18] Since
outerplanar graphs, series-parallel graphs, and Halin graphs all have bounded treewidth, they all also have at most
logarithmic pathwidth.
As well as its relations to treewidth, pathwidth is also related to clique-width and cutwidth, via line graphs; the line
graph L(G) of a graph G has a vertex for each edge of G and two vertices in L(G) are adjacent when the
corresponding two edges of G share an endpoint. Any family of graphs has bounded pathwidth if and only if its line
graphs have bounded linear clique-width, where linear clique-width replaces the disjoint union operation from
clique-width with the operation of adjoining a single new vertex.[19] If a connected graph with three or more vertices
has maximum degree three, then its cutwidth equals the vertex separation number of its line graph.[20]
In any planar graph, the pathwidth is at most proportional to the square root of the number of vertices.[21] One way to
find a path-decomposition with this width is (similarly to the logarithmic-width path-decomposition of forests
described above) to use the planar separator theorem to find a set of O(√n) vertices the removal of which separates
the graph into two subgraphs of at most 2n/3 vertices each, and concatenate recursively-constructed path
decompositions for each of these two subgraphs. The same technique applies to any class of graphs for which a
similar separator theorem holds.[22] Since, like planar graphs, the graphs in any fixed minor-closed graph family
have separators of size O(√n),[23] it follows that the pathwidth of the graphs in any fixed minor-closed family is
again O(√n). For some classes of planar graphs, the pathwidth of the graph and the pathwidth of its dual graph must
be within a constant factor of each other: bounds of this form are known for biconnected outerplanar graphs[24] and
for polyhedral graphs.[25] For 2-connected planar graphs, the pathwidth of the dual graph is less than the pathwidth of
the line graph.[26] It remains open whether the pathwidth of a planar graph and its dual are always within a constant
factor of each other in the remaining cases.
In some classes of graphs, it has been proven that the pathwidth and treewidth are always equal to each other: this is
true for cographs,[27] permutation graphs,[28] the complements of comparability graphs,[29] and the comparability
graphs of interval orders.[30]
In any cubic graph, or more generally any graph with maximum vertex degree three, the pathwidth is at most
n/6 + o(n), where n is the number of vertices in the graph. There exist cubic graphs with pathwidth 0.082n, but it is
not known how to close the gap between this lower bound and the n/6 upper bound.[31]
Computing path-decompositions
It is NP-complete to determine whether the pathwidth of a given graph is at most k, when k is a variable given as part
of the input.[5] The best known worst-case time bounds for computing the pathwidth of arbitrary n-vertex graphs are
of the form O(2^n n^c) for some constant c.[32] Nevertheless, several algorithms are known to compute
path-decompositions more efficiently when the pathwidth is small, when the class of input graphs is limited, or
approximately.
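An exact algorithm of this O(2^n n^c) type can be sketched as a dynamic program over vertex subsets, using the fact (noted above) that pathwidth equals the vertex separation number. Here f[S] is the smallest achievable maximum boundary over orderings that place the vertices of S first; the function name and bitmask representation are illustrative, not the specific algorithm of the cited work:

```python
def pathwidth(n, edges):
    """Exact O(2^n * poly(n)) dynamic program for pathwidth, via the
    vertex separation number. The boundary of a subset S is the set of
    vertices in S with a neighbour outside S."""
    nbr = [0] * n                      # adjacency as bitmasks
    for u, v in edges:
        nbr[u] |= 1 << v
        nbr[v] |= 1 << u
    full = (1 << n) - 1

    def boundary(S):
        outside = full & ~S
        return sum(1 for v in range(n) if (S >> v) & 1 and nbr[v] & outside)

    f = [0] * (1 << n)                 # f[0] = 0; fill in increasing S
    for S in range(1, 1 << n):
        b = boundary(S)
        f[S] = min(max(f[S & ~(1 << v)], b) for v in range(n) if (S >> v) & 1)
    return f[full]

# A path on 4 vertices has pathwidth 1; a 4-cycle has pathwidth 2.
print(pathwidth(4, [(0, 1), (1, 2), (2, 3)]))          # 1
print(pathwidth(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # 2
```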
Fixed-parameter tractability
Pathwidth is fixed-parameter tractable: for any constant k, it is possible to test whether the pathwidth is at most k,
and if so to find a path-decomposition of width k, in linear time.[7] In general, these algorithms operate in two phases.
In the first phase, the assumption that the graph has pathwidth k is used to find a path-decomposition or
tree-decomposition that is not optimal, but whose width can be bounded as a function of k. In the second phase, a
dynamic programming algorithm is applied to this decomposition in order to find the optimal decomposition.
However, the time bounds for known algorithms of this type are exponential in k^2, impractical except for the
smallest values of k.[33] For the case k=2 an explicit linear-time algorithm based on a structural decomposition of
pathwidth-2 graphs is given by de Fluiter (1997).
Approximation algorithms
It is NP-hard to approximate the pathwidth of a graph to within an additive constant.[6] The best known
approximation ratio of a polynomial time approximation algorithm for pathwidth is O((log n)^(3/2)).[41] For earlier
approximation algorithms for pathwidth, see Bodlaender et al. (1992) and Guha (2000). For approximations on
restricted classes of graphs, see Kloks & Bodlaender (1992).
Graph minors
A minor of a graph G is another graph formed from G by contracting edges, removing edges, and removing vertices.
Graph minors have a deep theory in which several important results involve pathwidth.
Excluding a forest
If a family F of graphs is closed under taking minors (every minor of a member of F is also in F), then by the
Robertson–Seymour theorem F can be characterized as the graphs that do not have any minor in X, where X is a
finite set of forbidden minors.[42] For instance, Wagner's theorem states that the planar graphs are the graphs that
have neither the complete graph K5 nor the complete bipartite graph K3,3 as minors. In many cases, the properties of
F and the properties of X are closely related, and the first result of this type was by Robertson & Seymour
(1983),[2] relating bounded pathwidth to the existence of a forest in the family of forbidden minors.
Specifically, define a family F of graphs to have bounded pathwidth if there exists a constant p such that every graph
in F has pathwidth at most p. Then, a minor-closed family F has bounded pathwidth if and only if the set X of
forbidden minors for F includes at least one forest.
In one direction, this result is straightforward to prove: if X does not include at least one forest, then the X-minor-free
graphs do not have bounded pathwidth. For, in this case, the X-minor-free graphs include all forests, and in particular
they include the perfect binary trees. But a perfect binary tree with 2k + 1 levels has pathwidth k, so in this case the
X-minor-free graphs have unbounded pathwidth. In the other direction, if X contains an n-vertex forest, then the
X-minor-free graphs have pathwidth at most n - 2.[43]
Although Xp necessarily includes at least one forest, it is not true that all graphs in Xp are forests: for instance, X1
consists of two graphs, a seven-vertex tree and the triangle K3. However, the set of trees in Xp may be precisely
characterized: these trees are exactly the trees that can be formed from three trees in Xp-1 by connecting a new root
vertex by an edge to an arbitrarily chosen vertex in each of the three smaller trees. For instance, the seven-vertex tree
in X1 is formed in this way from the two-vertex tree (a single edge) in X0. Based on this construction, the number of
forbidden minors in Xp can be shown to be at least (p!)^2.[44] The complete set X2 of forbidden minors for pathwidth-2
graphs has been computed; it contains 110 different graphs.[45]
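The recursive construction of the trees in Xp can be sketched directly. The code below builds just one representative member for each p (the attachment vertices may be chosen arbitrarily, so this is only one of the trees in Xp); the function name and representation are illustrative:

```python
def one_tree_obstruction(p):
    """Build one tree in X_p as an edge list, following the recursion above:
    a new root joined by an edge to a vertex of each of three trees from
    X_{p-1}. Base case: X_0 contains the two-vertex tree (a single edge).
    Returns (edges, attachment_vertex); vertices are labelled 0..n-1."""
    if p == 0:
        return [(0, 1)], 0
    edges, attach, n = [], [], 0
    for _ in range(3):
        sub, a = one_tree_obstruction(p - 1)
        # Relabel this copy's vertices by offsetting them past those used so far.
        edges += [(u + n, v + n) for u, v in sub]
        attach.append(a + n)
        n += max(max(u, v) for u, v in sub) + 1
    root = n
    edges += [(root, a) for a in attach]
    return edges, root

edges, _ = one_tree_obstruction(1)
print(len(edges) + 1)  # 7: the seven-vertex tree in X_1
```

The vertex counts follow the recurrence v(p) = 3·v(p-1) + 1 with v(0) = 2, giving 2, 7, 22, 67, …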
Structure theory
The graph structure theorem for minor-closed graph families states that, for any such family F, the graphs in F can
be decomposed into clique-sums of graphs that can be embedded onto surfaces of bounded genus, together with a
bounded number of apexes and vortices for each component of the clique-sum. An apex is a vertex that may be
adjacent to any other vertex in its component, while a vortex is a graph of bounded pathwidth that is glued into one
of the faces of the bounded-genus embedding of a component. The cyclic ordering of the vertices around the face
into which a vortex is embedded must be compatible with the path decomposition of the vortex, in the sense that
breaking the cycle to form a linear ordering must lead to an ordering with bounded vertex separation number.[4] This
theory, in which pathwidth is intimately connected to arbitrary minor-closed graph families, has important
algorithmic applications.[46]
Applications
VLSI
In VLSI design, the vertex separation problem was originally studied as a way to partition circuits into smaller
subsystems, with a small number of components on the boundary between the subsystems.[34]
Ohtsuki et al. (1979) use interval thickness to model the number of tracks needed in a one-dimensional layout of a
VLSI circuit, formed by a set of modules that need to be interconnected by a system of nets. In their model, one
forms a graph in which the vertices represent nets, and in which two vertices are connected by an edge if their nets
both connect to the same module; that is, if the modules and nets are interpreted as forming the nodes and
hyperedges of a hypergraph then the graph formed from them is its line graph. An interval representation of a
supergraph of this line graph, together with a coloring of the supergraph, describes an arrangement of the nets along
a system of horizontal tracks (one track per color) in such a way that the modules can be placed along the tracks in a
linear order and connect to the appropriate nets. The fact that interval graphs are perfect graphs[47] implies that the
number of colors needed, in an optimal arrangement of this type, is the same as the clique number of the interval
completion of the net graph.
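The construction of this net graph from a module/net hypergraph is simple to sketch. In the code below, the module and net names are hypothetical and the function name is illustrative:

```python
from itertools import combinations

def net_graph(nets):
    """One vertex per net, with an edge between two nets whenever they
    attach to a common module -- the line graph of the module/net
    hypergraph. `nets` maps each net name to its set of module names."""
    edges = set()
    for a, b in combinations(sorted(nets), 2):
        if nets[a] & nets[b]:          # the two nets share a module
            edges.add((a, b))
    return edges

# Hypothetical circuit: three modules m1..m3 and four nets.
nets = {'n1': {'m1', 'm2'}, 'n2': {'m2', 'm3'},
        'n3': {'m1', 'm3'}, 'n4': {'m3'}}
print(sorted(net_graph(nets)))
# [('n1', 'n2'), ('n1', 'n3'), ('n2', 'n3'), ('n2', 'n4'), ('n3', 'n4')]
```

The pathwidth of this graph (plus one) then gives the number of horizontal tracks in the layout, as described above.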
Gate matrix layout[48] is a specific style of CMOS VLSI layout for Boolean logic circuits. In gate matrix layouts,
signals are propagated along "lines" (vertical line segments) while each gate of the circuit is formed by a sequence of
device features that lie along a horizontal line segment. Thus, the horizontal line segment for each gate must cross
the vertical segments for each of the lines that form inputs or outputs of the gate. As in the layouts of Ohtsuki et al.
(1979), a layout of this type that minimizes the number of vertical tracks on which the lines are to be arranged can be
found by computing the pathwidth of a graph that has the lines as its vertices and pairs of lines sharing a gate as its
edges.[49] The same algorithmic approach can also be used to model folding problems in programmable logic
arrays.[50]
Graph drawing
Pathwidth has several applications to graph drawing:
• The minimal graphs that have a given crossing number have pathwidth that is bounded by a function of their
crossing number.[51]
• The number of parallel lines on which the vertices of a tree can be drawn with no edge crossings (under various
natural restrictions on the ways that adjacent vertices can be placed with respect to the sequence of lines) is
proportional to the pathwidth of the tree.[52]
• A k-crossing h-layer drawing of a graph G is a placement of the vertices of G onto h distinct horizontal lines, with
edges routed as monotonic polygonal paths between these lines, in such a way that there are at most k crossings.
The graphs with such drawings have pathwidth that is bounded by a function of h and k. Therefore, when h and k
are both constant, it is possible in linear time to determine whether a graph has a k-crossing h-layer drawing.[53]
• A graph with n vertices and pathwidth p can be embedded into a three-dimensional grid of size p × p × n in such a
way that no two edges (represented as straight line segments between grid points) intersect each other. Thus,
graphs of bounded pathwidth have embeddings of this type with linear volume.[54]
Compiler design
In the compilation of high-level programming languages, pathwidth arises in the problem of reordering sequences of
straight-line code (that is, code with no control flow branches or loops) in such a way that all the values computed in
the code can be placed in machine registers instead of having to be spilled into main memory. In this application, one
represents the code to be compiled as a directed acyclic graph in which the nodes represent the input values to the
code and the values computed by the operations within the code. An edge from node x to node y in this DAG
represents the fact that value x is one of the inputs to operation y. A topological ordering of the vertices of this DAG
represents a valid reordering of the code, and the number of registers needed to evaluate the code in a given ordering
is given by the vertex separation number of the ordering.[55]
For any fixed number w of machine registers, it is possible to determine in linear time whether a piece of
straight-line code can be reordered in such a way that it can be evaluated with at most w registers. For, if the vertex
separation number of a topological ordering is at most w, the minimum vertex separation among all orderings can be
no larger, so the undirected graph formed by ignoring the orientations of the DAG described above must have
pathwidth at most w. It is possible to test whether this is the case, using the known fixed-parameter-tractable
algorithms for pathwidth, and if so to find a path-decomposition for the undirected graph, in linear time given the
assumption that w is a constant. Once a path decomposition has been found, a topological ordering of width w (if one
exists) can be found using dynamic programming, again in linear time.[55]
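For a tiny DAG, the correspondence between topological orderings and register counts can be checked by brute force rather than by the linear-time method just described. The example expression and function names below are illustrative:

```python
from itertools import permutations

def registers_needed(order, succ):
    """Registers used when straight-line code is evaluated in this order:
    the vertex separation number of the order, i.e. the maximum number of
    already-computed values that still await a later consumer."""
    need = 0
    for i in range(1, len(order)):
        prefix = set(order[:i])
        need = max(need, sum(1 for x in prefix if succ.get(x, set()) - prefix))
    return need

def min_registers(succ, nodes):
    """Brute force over all topological orders (fine only for tiny DAGs)."""
    pred = {v: {u for u in nodes if v in succ.get(u, set())} for v in nodes}

    def is_topological(order):
        seen = set()
        for v in order:
            if not pred[v] <= seen:    # some input of v not yet computed
                return False
            seen.add(v)
        return True

    return min(registers_needed(order, succ)
               for order in permutations(nodes) if is_topological(order))

# d = (a+b)*(b+c): inputs a, b, c feed sums s1, s2, which feed the product d.
succ = {'a': {'s1'}, 'b': {'s1', 's2'}, 'c': {'s2'}, 's1': {'d'}, 's2': {'d'}}
print(min_registers(succ, ['a', 'b', 'c', 's1', 's2', 'd']))  # 3
```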
Linguistics
Kornai & Tuza (1992) describe an application of path-width in natural language processing. In this application,
sentences are modeled as graphs, in which the vertices represent words and the edges represent relationships between
words; for instance if an adjective modifies a noun in the sentence then the graph would have an edge between those
two words. Due to the limited capacity of human short-term memory,[56] Kornai and Tuza argue that this graph must
have bounded pathwidth (more specifically, they argue, pathwidth at most six), for otherwise humans would not be
able to parse speech correctly.
Exponential algorithms
Many problems in graph algorithms may be solved efficiently on graphs of low pathwidth, by using dynamic
programming on a path-decomposition of the graph.[10] For instance, if a linear ordering of the vertices of an
n-vertex graph G is given, with vertex separation number w, then it is possible to find the maximum independent set
of G in time O(2^w n).[31] On graphs of bounded pathwidth, this approach leads to fixed-parameter tractable
algorithms, parametrized by the pathwidth.[49] Such results are not frequently found in the literature because they are
subsumed by similar algorithms parametrized by the treewidth; however, pathwidth arises even in treewidth-based
dynamic programming algorithms in measuring the space complexity of these algorithms.[11]
The same dynamic programming method also can be applied to graphs with unbounded pathwidth, leading to
algorithms that solve unparametrized graph problems in exponential time. For instance, combining this dynamic
programming approach with the fact that cubic graphs have pathwidth n/6 + o(n) shows that, in a cubic graph, the
maximum independent set can be constructed in time O(2^(n/6+o(n))), faster than previous known methods.[31] A
similar approach leads to improved exponential-time algorithms for the maximum cut and minimum dominating set
problems in cubic graphs,[31] and for several other NP-hard optimization problems.[57]
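Such a dynamic program can be sketched for maximum independent set over a linear ordering: the states are the independent subsets of the current boundary (prefix vertices with a neighbour in the suffix), so at most 2^w states survive at any step. Function names are illustrative:

```python
def max_independent_set(order, adj):
    """Maximum independent set by dynamic programming along a linear order.
    Each state records which boundary vertices were chosen; a vertex that
    leaves the boundary has no neighbours later in the order, so it can be
    forgotten safely."""
    states = {frozenset(): 0}          # chosen boundary subset -> best size
    for i, v in enumerate(order):
        suffix = set(order[i + 1:])
        new = {}
        for chosen, size in states.items():
            for pick in (False, True):
                if pick and adj[v] & chosen:
                    continue           # v conflicts with a chosen neighbour
                c = chosen | {v} if pick else chosen
                # Forget vertices with no remaining suffix neighbours.
                c = frozenset(u for u in c if adj[u] & suffix)
                new[c] = max(new.get(c, -1), size + pick)
        states = new
    return max(states.values())

# 5-cycle: the maximum independent set has size 2.
adj = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(max_independent_set(list(range(5)), adj))  # 2
```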
Notes
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[18] Korach & Solel (1993), Theorem 6, p. 100; Bodlaender (1998), Corollary 24, p. 10.
[19] Gurski & Wanke (2007).
[20] Golovach (1993).
[21] Bodlaender (1998), Corollary 23, p. 10.
[22] Bodlaender (1998), Theorem 20, p. 9.
[23] Alon, Seymour & Thomas (1990).
[24] Bodlaender & Fomin (2002); Coudert, Huc & Sereni (2007).
[25] Fomin & Thilikos (2007); Amini, Huc & Pérennès (2009).
[26] Fomin (2003).
[27] Bodlaender & Möhring (1990).
[28] Bodlaender, Kloks & Kratsch (1993).
[29] Habib & Möhring (1994).
[30] Garbe (1995).
[31] Fomin & Høie (2006).
[32] Fomin et al. (2008).
[33] Downey & Fellows (1999), p. 12.
[34] Monien & Sudborough (1988).
[35] Gustedt (1993).
[36] Kloks, Kratsch & Müller (1995). A chordal domino is a chordal graph in which every vertex belongs to at most two maximal cliques.
[37] Kloks et al. (1993).
[38] Kloks & Bodlaender (1992); Gustedt (1993).
[39] Garbe (1995) credits this result to the 1993 Ph.D. thesis of Ton Kloks; Garbe's polynomial time algorithm for comparability graphs of
interval orders generalizes this result, since any chordal graph must be a comparability graph of this type.
[40] Suchan & Todinca (2007).
[41] Feige, Hajiaghayi & Lee (2005).
[42] Robertson & Seymour (2004).
[43] Bienstock et al. (1991); Diestel (1995); Cattell, Dinneen & Fellows (1996).
[44] Kinnersley (1992); Takahashi, Ueno & Kajitani (1994); Bodlaender (1998), p. 8.
[45] Kinnersley & Langston (1994).
[46] Demaine, Hajiaghayi & Kawarabayashi (2005).
[47] Berge (1967).
[48] Lopez & Law (1980).
[49] Fellows & Langston (1989).
[50] Möhring (1990); Ferreira & Song (1992).
[51] Hliněný (2003).
[52] Suderman (2004).
[53] Dujmović et al. (2008).
[54] Dujmović, Morin & Wood (2003).
[55] Bodlaender, Gustedt & Telle (1998).
[56] Miller (1956).
[57] Kneis et al. (2005); Björklund & Husfeldt (2008).
References
Alon, Noga; Seymour, Paul; Thomas, Robin (1990), "A separator theorem for graphs with an excluded minor and its applications", Proc. 22nd ACM Symp. on Theory of Computing (STOC 1990), pp. 293–299, doi:10.1145/100216.100254.
Amini, Omid; Huc, Florian; Pérennès, Stéphane (2009), "On the path-width of planar graphs", SIAM Journal on Discrete Mathematics 23 (3): 1311–1316, doi:10.1137/060670146.
Arnborg, Stefan (1985), "Efficient algorithms for combinatorial problems on graphs with bounded decomposability – a survey", BIT 25 (1): 2–23, doi:10.1007/BF01934985.
Arnborg, Stefan; Corneil, Derek G.; Proskurowski, Andrzej (1987), "Complexity of finding embeddings in a k-tree", SIAM Journal on Algebraic and Discrete Methods 8 (2): 277–284, doi:10.1137/0608024.
Aspvall, Bengt; Proskurowski, Andrzej; Telle, Jan Arne (2000), "Memory requirements for table computations in partial k-tree algorithms", Algorithmica 27 (3): 382–394, doi:10.1007/s004530010025.
Berge, Claude (1967), "Some classes of perfect graphs", Graph Theory and Theoretical Physics, New York: Academic Press, pp. 155–165.
Bienstock, Dan; Robertson, Neil; Seymour, Paul; Thomas, Robin (1991), "Quickly excluding a forest", Journal of Combinatorial Theory, Series B 52 (2): 274–283, doi:10.1016/0095-8956(91)90068-U.
Björklund, Andreas; Husfeldt, Thore (2008), "Exact algorithms for exact satisfiability and number of perfect matchings", Algorithmica 52 (2): 226–249, doi:10.1007/s00453-007-9149-8.
Bodlaender, Hans L. (1994), "A tourist guide through treewidth", in Dassow, Jürgen; Kelemenová, Alisa, Developments in Theoretical Computer Science (Proc. 7th International Meeting of Young Computer Scientists, Smolenice, 16–20 November 1992), Topics in Computer Mathematics, 6, Gordon and Breach, pp. 1–20.
Bodlaender, Hans L. (1996), "A linear-time algorithm for finding tree-decompositions of small treewidth", SIAM Journal on Computing 25 (6): 1305–1317, doi:10.1137/S0097539793251219.
Bodlaender, Hans L. (1998), "A partial k-arboretum of graphs with bounded treewidth", Theoretical Computer Science 209 (1–2): 1–45, doi:10.1016/S0304-3975(97)00228-4.
Bodlaender, Hans L.; Fomin, Fedor V. (2002), "Approximation of pathwidth of outerplanar graphs", Journal of Algorithms 43 (2): 190–200, doi:10.1016/S0196-6774(02)00001-9.
Bodlaender, Hans L.; Gilbert, John R.; Hafsteinsson, Hjálmtýr; Kloks, Ton (1992), "Approximating treewidth, pathwidth, and minimum elimination tree height", Graph-Theoretic Concepts in Computer Science, Lecture Notes in Computer Science, 570, pp. 1–12, doi:10.1007/3-540-55121-2_1.
Bodlaender, Hans L.; Gustedt, Jens; Telle, Jan Arne (1998), "Linear-time register allocation for a fixed number of registers" (http://www.ii.uib.no/~telle/bib/BGT.pdf), Proc. 9th ACM–SIAM Symposium on Discrete Algorithms (SODA '98), pp. 574–583.
Bodlaender, Hans L.; Kloks, Ton (1996), "Efficient and constructive algorithms for the pathwidth and treewidth of graphs", Journal of Algorithms 21 (2): 358–402, doi:10.1006/jagm.1996.0049.
Bodlaender, Hans L.; Kloks, Ton; Kratsch, Dieter (1993), "Treewidth and pathwidth of permutation graphs", Proc. 20th International Colloquium on Automata, Languages and Programming (ICALP 1993), Lecture Notes in Computer Science, 700, Springer-Verlag, pp. 114–125, doi:10.1007/3-540-56939-1_66.
Bodlaender, Hans L.; Möhring, Rolf H. (1990), "The pathwidth and treewidth of cographs", Proc. 2nd Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 447, Springer-Verlag, pp. 301–309, doi:10.1007/3-540-52846-6_99.
Cattell, Kevin; Dinneen, Michael J.; Fellows, Michael R. (1996), "A simple linear-time algorithm for finding path-decompositions of small width", Information Processing Letters 57 (4): 197–203, doi:10.1016/0020-0190(95)00190-5.
Coudert, David; Huc, Florian; Mazauric, Dorian (2008), "A distributed algorithm for computing and updating the process number of a forest", Proc. 22nd Int. Symp. Distributed Computing, Lecture Notes in Computer Science, 5218, Springer-Verlag, pp. 500–501, arXiv:0806.2710, doi:10.1007/978-3-540-87779-0_36.
Coudert, David; Huc, Florian; Sereni, Jean-Sébastien (2007), "Pathwidth of outerplanar graphs", Journal of Graph Theory 55 (1): 27–41, doi:10.1002/jgt.20218.
Diestel, Reinhard (1995), "Graph Minors I: a short proof of the path-width theorem", Combinatorics, Probability and Computing 4 (1): 27–30, doi:10.1017/S0963548300001450.
Diestel, Reinhard; Kühn, Daniela (2005), "Graph minor hierarchies", Discrete Applied Mathematics 145 (2): 167–182, doi:10.1016/j.dam.2004.01.010.
Demaine, Erik D.; Hajiaghayi, MohammadTaghi; Kawarabayashi, Ken-ichi (2005), "Algorithmic graph minor theory: decomposition, approximation, and coloring", Proc. 46th IEEE Symposium on Foundations of Computer Science (FOCS 2005), pp. 637–646, doi:10.1109/SFCS.2005.14.
Downey, Rod G.; Fellows, Michael R. (1999), Parameterized Complexity, Springer-Verlag, ISBN 0-387-94883-X.
Dujmović, V.; Fellows, M. R.; Kitching, M.; Liotta, G.; McCartin, C.; Nishimura, N.; Ragde, P.; Rosamond, F. et al. (2008), "On the parameterized complexity of layered graph drawing", Algorithmica 52 (2): 267–292, doi:10.1007/s00453-007-9151-1.
Dujmović, Vida; Morin, Pat; Wood, David R. (2003), "Path-width and three-dimensional straight-line grid drawings of graphs" (http://cg.scs.carleton.ca/~vida/pubs/papers/DMW-GD02.pdf), Proc. 10th International Symposium on Graph Drawing (GD 2002), Lecture Notes in Computer Science, 2528, Springer-Verlag, pp. 42–53.
Ellis, J. A.; Sudborough, I. H.; Turner, J. S. (1983), "Graph separation and search number", Proc. 1983 Allerton Conf. on Communication, Control, and Computing. As cited by Monien & Sudborough (1988).
Ellis, J. A.; Sudborough, I. H.; Turner, J. S. (1994), "The vertex separation and search number of a tree", Information and Computation 113 (1): 50–79, doi:10.1006/inco.1994.1064.
Feige, Uriel; Hajiaghayi, Mohammadtaghi; Lee, James R. (2005), "Improved approximation algorithms for minimum-weight vertex separators", Proc. 37th ACM Symposium on Theory of Computing (STOC 2005), pp. 563–572, doi:10.1145/1060590.1060674.
Fellows, Michael R.; Langston, Michael A. (1989), "On search decision and the efficiency of polynomial-time algorithms", Proc. 21st ACM Symposium on Theory of Computing, pp. 501–512, doi:10.1145/73007.73055.
Ferreira, Afonso G.; Song, Siang W. (1992), "Achieving optimality for gate matrix layout and PLA folding: a graph theoretic approach", Proc. 1st Latin American Symposium on Theoretical Informatics (LATIN '92), Lecture Notes in Computer Science, 583, Springer-Verlag, pp. 139–153, doi:10.1007/BFb0023825.
de Fluiter, Babette (1997), Algorithms for Graphs of Small Treewidth (http://igitur-archive.library.uu.nl/dissertations/01847381/full.pdf), Ph.D. thesis, Utrecht University, ISBN 90-393-1528-0.
Fomin, Fedor V. (2003), "Pathwidth of planar and line graphs", Graphs and Combinatorics 19 (1): 91–99, doi:10.1007/s00373-002-0490-z.
Fomin, Fedor V.; Høie, Kjartan (2006), "Pathwidth of cubic graphs and exact algorithms", Information Processing Letters 97 (5): 191–196, doi:10.1016/j.ipl.2005.10.012.
Fomin, Fedor V.; Kratsch, Dieter; Todinca, Ioan; Villanger, Yngve (2008), "Exact algorithms for treewidth and minimum fill-in", SIAM Journal on Computing 38 (3): 1058–1079, doi:10.1137/050643350.
Fomin, Fedor V.; Thilikos, Dimitrios M. (2007), "On self duality of pathwidth in polyhedral graph embeddings", Journal of Graph Theory 55 (1): 42–54, doi:10.1002/jgt.20219.
Garbe, Renate (1995), "Tree-width and path-width of comparability graphs of interval orders", Proc. 20th International Workshop Graph-Theoretic Concepts in Computer Science (WG'94), Lecture Notes in Computer Science, 903, Springer-Verlag, pp. 26–37, doi:10.1007/3-540-59071-4_35.
Golovach, P. A. (1993), "The cutwidth of a graph and the vertex separation number of the line graph", Discrete Mathematics and Applications 3 (5): 517–522, doi:10.1515/dma.1993.3.5.517.
Guha, Sudipto (2000), "Nested graph dissection and approximation algorithms", Proc. 41st IEEE Symposium on Foundations of Computer Science (FOCS 2000), p. 126, doi:10.1109/SFCS.2000.892072.
Gurski, Frank; Wanke, Egon (2007), "Line graphs of bounded clique-width", Discrete Mathematics 307 (22): 2734–2754, doi:10.1016/j.disc.2007.01.020.
Gustedt, Jens (1993), "On the pathwidth of chordal graphs", Discrete Applied Mathematics 45 (3): 233–248, doi:10.1016/0166-218X(93)90012-D.
Habib, Michel; Möhring, Rolf H. (1994), "Treewidth of cocomparability graphs and a new order-theoretic parameter", Order 11 (1): 47–60, doi:10.1007/BF01462229.
Hliněný, Petr (2003), "Crossing-number critical graphs have bounded path-width", Journal of Combinatorial Theory, Series B 88 (2): 347–367, doi:10.1016/S0095-8956(03)00037-6.
Kashiwabara, T.; Fujisawa, T. (1979), "NP-completeness of the problem of finding a minimum-clique-number interval graph containing a given graph as a subgraph", Proc. International Symposium on Circuits and Systems, pp. 657–660.
Kinnersley, Nancy G. (1992), "The vertex separation number of a graph equals its path-width", Information Processing Letters 42 (6): 345–350, doi:10.1016/0020-0190(92)90234-M.
Kinnersley, Nancy G.; Langston, Michael A. (1994), "Obstruction set isolation for the gate matrix layout problem", Discrete Applied Mathematics 54 (2–3): 169–213, doi:10.1016/0166-218X(94)90021-3.
Kirousis, Lefteris M.; Papadimitriou, Christos H. (1985), "Interval graphs and searching" (http://lca.ceid.upatras.gr/~kirousis/publications/j31.pdf), Discrete Mathematics 55 (2): 181–184, doi:10.1016/0012-365X(85)90046-9.
Kloks, Ton; Bodlaender, Hans L. (1992), "Approximating treewidth and pathwidth of some classes of perfect graphs", Proc. 3rd International Symposium on Algorithms and Computation (ISAAC'92), Lecture Notes in Computer Science, 650, Springer-Verlag, pp. 116–125, doi:10.1007/3-540-56279-6_64.
Kloks, T.; Bodlaender, H.; Müller, H.; Kratsch, D. (1993), "Computing treewidth and minimum fill-in: all you need are the minimal separators", Proc. 1st European Symposium on Algorithms (ESA'93), Lecture Notes in Computer Science, 726, Springer-Verlag, pp. 260–271, doi:10.1007/3-540-57273-2_61.
Kloks, Ton; Kratsch, Dieter; Müller, H. (1995), "Dominoes", Proc. 20th International Workshop Graph-Theoretic Concepts in Computer Science (WG'94), Lecture Notes in Computer Science, 903, Springer-Verlag, pp. 106–120, doi:10.1007/3-540-59071-4_41.
Kneis, Joachim; Mölle, Daniel; Richter, Stefan; Rossmanith, Peter (2005), "Algorithms based on the treewidth of sparse graphs", Proc. 31st International Workshop on Graph-Theoretic Concepts in Computer Science (WG 2005), Lecture Notes in Computer Science, 3787, Springer-Verlag, pp. 385–396, doi:10.1007/11604686_34.
Korach, Ephraim; Solel, Nir (1993), "Tree-width, path-width, and cutwidth", Discrete Applied Mathematics 43 (1): 97–101, doi:10.1016/0166-218X(93)90171-J.
Kornai, András; Tuza, Zsolt (1992), "Narrowness, path-width, and their application in natural language
processing", Discrete Applied Mathematics 36 (1): 8792, doi:10.1016/0166-218X(92)90208-R.
Lengauer, Thomas (1981), "Black-white pebbles and graph separation", Acta Informatica 16 (4): 465475,
doi:10.1007/BF00264496.
Lopez, Alexander D.; Law, Hung-Fai S. (1980), "A dense gate matrix layout method for MOS VLSI", IEEE
Transactions on Electron Devices ED-27 (8): 16711675, doi:10.1109/T-ED.1980.20086, Also in the joint issue,
IEEE Journal of Solid-State Circuits 15 (4): 736740, 1980, doi:10.1109/JSSC.1980.1051462.
Miller, George A. (1956), "[[The Magical Number Seven, Plus or Minus Two (http://www.musanim.com/
miller1956/)]"], Psychological Review 63 (2): 8197, doi:10.1037/h0043158, PMID13310704.
Mhring, Rolf H. (1990), "Graph problems related to gate matrix layout and PLA folding", in Tinhofer, G.; Mayr,
E.; Noltemeier, H. et al., Computational Graph Theory, Computing Supplementum, 7, Springer-Verlag,
455
Path decomposition
pp.1751, ISBN3-211-82177-5.
Monien, B.; Sudborough, I. H. (1988), "Min cut is NP-complete for edge weighted trees", Theoretical Computer
Science 58 (13): 209229, doi:10.1016/0304-3975(88)90028-X.
Ohtsuki, Tatsuo; Mori, Hajimu; Kuh, Ernest S.; Kashiwabara, Toshinobu; Fujisawa, Toshio (1979),
"One-dimensional logic gate assignment and interval graphs", IEEE Transactions on Circuits and Systems 26 (9):
675684, doi:10.1109/TCS.1979.1084695.
Peng, Sheng-Lung; Ho, Chin-Wen; Hsu, Tsan-sheng; Ko, Ming-Tat; Tang, Chuan Yi (1998), "A linear-time
algorithm for constructing an optimal node-search strategy of a tree" (http://www.springerlink.com/content/
lamc6dynulxv7a8n/), Proc. 4th Int. Conf. Computing and Combinatorics (COCOON'98), Lecture Notes in
Computer Science, 1449, Springer-Verlag, pp.197205.
Proskurowski, Andrzej; Telle, Jan Arne (1999), "Classes of graphs with restricted interval models" (http://www.
emis.ams.org/journals/DMTCS/volumes/abstracts/pdfpapers/dm030404.pdf), Discrete Mathematics and
Theoretical Computer Science 3: 167176.
Robertson, Neil; Seymour, Paul (1983), "Graph minors. I. Excluding a forest", Journal of Combinatorial Theory,
Series B 35 (1): 3961, doi:10.1016/0095-8956(83)90079-5.
Robertson, Neil; Seymour, Paul (2003), "Graph minors. XVI. Excluding a non-planar graph", Journal of
Combinatorial Theory, Series B 89 (1): 4376, doi:10.1016/S0095-8956(03)00042-X.
Robertson, Neil; Seymour, Paul D. (2004), "Graph Minors. XX. Wagner's conjecture", Journal of Combinatorial
Theory, Series B 92 (2): 325357, doi:10.1016/j.jctb.2004.08.001.
Scheffler, Petra (1990), "A linear algorithm for the pathwidth of trees", in Bodendiek, R.; Henn, R., Topics in
Combinatorics and Graph Theory, Physica-Verlag, pp.613620.
Scheffler, Petra (1992), "Optimal embedding of a tree into an interval graph in linear time", in Neetil, Jaroslav;
Fiedler, Miroslav, Fourth Czechoslovakian Symposium on Combinatorics, Graphs and Complexity, Elsevier.
Skodinis, Konstantin (2000), "Computing optimal linear layouts of trees in linear time", Proc. 8th European
Symposium on Algorithms (ESA 2000), Lecture Notes in Computer Science, 1879, Springer-Verlag, pp.403414,
doi:10.1007/3-540-45253-2_37.
Skodinis, Konstantin (2003), "Construction of linear tree-layouts which are optimal with respect to vertex
separation in linear time", Journal of Algorithms 47 (1): 4059, doi:10.1016/S0196-6774(02)00225-0.
Suchan, Karol; Todinca, Ioan (2007), "Pathwidth of circular-arc graphs", Proc. 33rd International Workshop on
Graph-Theoretic Concepts in Computer Science (WG 2007), Lecture Notes in Computer Science, 4769,
Springer-Verlag, pp.258269, doi:10.1007/978-3-540-74839-7\_25.
Suderman, Matthew (2004), "Pathwidth and layered drawings of trees" (http://cgm.cs.mcgill.ca/~msuder/
schools/mcgill/research/trees/SOCS-02-8.pdf), International Journal of Computational Geometry and
Applications 14 (3): 203225, doi:10.1142/S0218195904001433.
Takahashi, Atsushi; Ueno, Shuichi; Kajitani, Yoji (1994), "Minimal acyclic forbidden minors for the family of
graphs with bounded path-width", Discrete Mathematics 127 (13): 293304,
doi:10.1016/0012-365X(94)90092-2.
Example
Consider a grid graph with r rows and c columns; the number n of vertices equals rc. For instance, in the illustration, r = 5, c = 8, and n = 40. If r is odd, there is a single central row, and otherwise there are two rows equally close to the center; similarly, if c is odd, there is a single central column, and otherwise there are two columns equally close to the center. Choosing S to be any of these central rows or columns, and removing S from the graph, partitions the graph into two smaller connected subgraphs A and B, each of which has at most n/2 vertices. If r ≤ c (as in the illustration), then choosing a central column will give a separator S with r ≤ √n vertices, and similarly if c ≤ r then choosing a central row will give a separator with at most √n vertices. Thus, every grid graph has a separator S of size at most √n, the removal of which partitions it into two connected components, each of size at most n/2.[3]
[Illustration: a planar separator for a grid graph.]
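The grid construction is simple enough to state as code. This sketch (names of my own choosing) returns a central row or column of an r × c grid, together with the two sides it separates:

```python
import math

def grid_separator(r, c):
    """Return (S, A, B) for the r-by-c grid graph whose vertices are the
    pairs (i, j) with 0 <= i < r and 0 <= j < c.  S is a central row or
    column (whichever is shorter), and A, B are the two sides."""
    if r <= c:
        mid = c // 2  # a central column; S has r <= sqrt(r*c) vertices
        S = {(i, mid) for i in range(r)}
        A = {(i, j) for i in range(r) for j in range(mid)}
        B = {(i, j) for i in range(r) for j in range(mid + 1, c)}
    else:
        mid = r // 2  # a central row; S has c <= sqrt(r*c) vertices
        S = {(mid, j) for j in range(c)}
        A = {(i, j) for i in range(mid) for j in range(c)}
        B = {(i, j) for i in range(mid + 1, r) for j in range(c)}
    return S, A, B
```

For the r = 5, c = 8 example above, the separator is a central column of 5 < √40 vertices, and the two sides have 20 and 15 vertices.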
The planar separator theorem states that a similar partition can be constructed in any planar graph. The case of arbitrary planar graphs differs from the case of grid graphs in that the separator has size O(√n) but may be larger than √n, the bound on the size of the two subsets A and B (in the most common versions of the theorem) is 2n/3 rather than n/2, and the two subsets A and B need not themselves form connected subgraphs.
Constructions
Breadth-first layering
Lipton & Tarjan (1979) augment the given planar graph by additional edges, if necessary, so that it becomes maximal planar (every face in a planar embedding is a triangle). They then perform a breadth-first search, rooted at an arbitrary vertex v, and partition the vertices into levels by their distance from v. If l1 is the median level (the level such that the numbers of vertices at higher and lower levels are both at most n/2) then there must be levels l0 and l2 that are O(√n) steps above and below l1 respectively and that contain O(√n) vertices each, for otherwise there would be more than n vertices in the levels near l1. They show that there must be a separator S formed by the union of l0 and l2, the endpoints of an edge e of G that does not belong to the breadth-first search tree and that lies between the two levels, and the vertices on the two breadth-first search tree paths from those endpoints back up to level l0. The size of the separator S constructed in this way is at most √(8n), or approximately 2.83√n. The vertices of the separator and the two disjoint subgraphs can be found in linear time.
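The layering step can be sketched as follows. This is only the first part of the construction (it locates the median level and nearby small levels, not the final separator), the ⌈√(n/2)⌉ size bound is one convenient choice, and the function names are my own:

```python
from collections import deque
import math

def bfs_levels(adj, root):
    """Group the vertices of a connected graph into levels by their
    breadth-first distance from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    depth = max(dist.values())
    levels = [[] for _ in range(depth + 1)]
    for v, d in dist.items():
        levels[d].append(v)
    return levels

def small_levels_near_median(levels, n):
    """Find the median level l1 and the nearest levels l0 <= l1 < l2 with
    at most ceil(sqrt(n/2)) vertices each.  Level 0 holds only the root,
    so l0 always exists; l2 exists within O(sqrt(n)) steps, possibly as
    the empty level just past the deepest one (index len(levels))."""
    bound = math.ceil(math.sqrt(n / 2))
    count = 0
    for l1, level in enumerate(levels):
        count += len(level)
        if 2 * count >= n:
            break
    l0 = next(l for l in range(l1, -1, -1) if len(levels[l]) <= bound)
    l2 = next((l for l in range(l1 + 1, len(levels)) if len(levels[l]) <= bound),
              len(levels))
    return l0, l1, l2
```

On a 6 × 6 grid searched from a corner, the levels have sizes 1, 2, …, 6, …, 2, 1, the median level is the large diagonal, and the adjacent smaller diagonals serve as l0 and l2.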
This proof of the separator theorem applies as well to weighted planar graphs, in which each vertex has a non-negative cost. The graph may be partitioned into three sets A, S, and B such that A and B each have at most 2/3 of the total cost and S has O(√n) vertices, with no edges from A to B.[4] By analysing a similar separator construction more carefully, Djidjev (1982) shows that the bound on the size of S can be reduced to √(6n), or approximately 2.45√n.
Holzer et al. (2009) suggest a simplified version of this approach: they augment the graph to be maximal planar and construct a breadth-first search tree as before. Then, for each edge e that is not part of the tree, they form a cycle by combining e with the tree path that connects its endpoints. They then use as a separator the vertices of one of these cycles. Although this approach cannot be guaranteed to find a small separator for planar graphs of high diameter, their experiments indicate that it outperforms the Lipton–Tarjan and Djidjev breadth-first layering methods on many types of planar graph.
Circle separators
According to the Koebe–Andreev–Thurston circle-packing theorem, any planar graph may be represented by a packing of circular disks in the plane with disjoint interiors, such that two vertices in the graph are adjacent if and only if the corresponding pair of disks are mutually tangent. As Miller et al. (1997) show, for such a packing, there exists a circle that has at most 3n/4 disks touching or inside it, at most 3n/4 disks touching or outside it, and that crosses O(√n) disks.
To prove this, Miller et al. use stereographic projection to map the packing onto the surface of a unit sphere in three
dimensions. By choosing the projection carefully, the center of the sphere can be made into a centerpoint of the disk
centers on its surface, so that any plane through the center of the sphere partitions it into two halfspaces that each
contain or cross at most 3n/4 of the disks. If a plane through the center is chosen uniformly at random, a disk will be
crossed with probability proportional to its radius. Therefore, the expected number of disks that are crossed is
proportional to the sum of the radii of the disks. However, the sum of the squares of the radii is proportional to the
total area of the disks, which is at most the total surface area of the unit sphere, a constant. An argument involving
Jensen's inequality shows that, when the sum of squares of n non-negative real numbers is bounded by a constant, the sum of the numbers themselves is O(√n). Therefore, the expected number of disks crossed by a random plane is O(√n) and there exists a plane that crosses at most that many disks. This plane intersects the sphere in a great circle, which projects back down to a circle in the plane with the desired properties. The O(√n) disks crossed by this circle correspond to the vertices of a planar graph separator that separates the vertices whose disks are inside the circle from the vertices whose disks are outside the circle, with at most 3n/4 vertices in each of these two subsets.[6][7]
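The inequality at the heart of this argument — n non-negative numbers whose squares sum to at most a constant C have sum at most √(nC), by Cauchy–Schwarz (equivalently, Jensen's inequality applied to the square function) — is easy to check numerically. This helper is purely illustrative:

```python
import math

def cauchy_schwarz_bound(xs):
    """Upper bound sqrt(n * sum of squares) on sum(xs), for xs >= 0.
    When the sum of squares is a constant, this bound is O(sqrt(n))."""
    return math.sqrt(len(xs) * sum(x * x for x in xs))
```

The bound is tight exactly when all the numbers are equal: n copies of 1/√n have sum of squares 1 and sum √n.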
This method leads to a randomized algorithm that finds such a separator in linear time,[6] and a less-practical deterministic algorithm with the same linear time bound.[8] By analyzing this algorithm carefully using known bounds on the packing density of circle packings, it can be shown to find separators whose size is at most a small constant times √n.[9]
Spectral partitioning
Spectral clustering methods, in which the vertices of a graph are grouped by the coordinates of the eigenvectors of
matrices derived from the graph, have long been used as a heuristic for graph partitioning problems for nonplanar
graphs.[12] As Spielman & Teng (2007) show, spectral clustering can also be used to derive an alternative proof for a
weakened form of the planar separator theorem that applies to planar graphs with bounded degree. In their method,
the vertices of a given planar graph are sorted by their coordinates in the second eigenvector of the Laplacian matrix of the graph, and this sorted order is partitioned at the point that minimizes the ratio of the number of edges cut by the partition to the number of vertices on the smaller side of the partition. As they show, every planar graph of bounded degree has a partition of this type in which the ratio is O(1/√n). Although this partition may not be balanced, repeating the partition within the larger of the two sides and taking the union of the cuts formed at each repetition will eventually lead to a balanced partition with O(√n) edges. The endpoints of these edges form a separator of size O(√n).
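A pure-Python sketch of this heuristic, with all names my own: power iteration (with deflation of the all-ones eigenvector) approximates the second Laplacian eigenvector, and a sweep over the resulting vertex order picks the cut minimizing the ratio of cut edges to smaller-side size:

```python
import math

def fiedler_sweep(adj):
    """Spectral sweep cut (sketch): approximate the second eigenvector of
    the graph Laplacian L by power iteration on c*I - L, sort vertices by
    their coordinates, and return the prefix cut that minimizes
    (#edges cut) / (size of the smaller side)."""
    n = len(adj)
    deg = [len(adj[v]) for v in range(n)]
    c = 2 * max(deg)  # eigenvalues of L lie in [0, 2*maxdeg], so c*I - L is PSD
    x = [math.sin(v + 1.0) for v in range(n)]  # arbitrary starting vector
    for _ in range(2000):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # deflate the all-ones eigenvector of L
        y = [(c - deg[v]) * x[v] + sum(x[w] for w in adj[v]) for v in range(n)]
        norm = math.sqrt(sum(yi * yi for yi in y)) or 1.0
        x = [yi / norm for yi in y]
    order = sorted(range(n), key=lambda v: x[v])
    best_ratio, best_k, side = float("inf"), 1, set()
    for k in range(n - 1):
        side.add(order[k])
        cut = sum(1 for u in side for w in adj[u] if w not in side)
        ratio = cut / min(len(side), n - len(side))
        if ratio < best_ratio:
            best_ratio, best_k = ratio, k + 1
    return order, best_ratio, set(order[:best_k])
```

On the six-vertex path graph the sweep cuts the middle edge, giving ratio 1/3.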
Edge separators
A variation of the planar separator theorem involves edge separators, small sets of edges forming a cut between two
subsets A and B of the vertices of the graph. The two sets A and B must each have size at most a constant fraction of
the number n of vertices of the graph (conventionally, both sets have size at most 2n/3), and each vertex of the graph
belongs to exactly one of A and B. The separator consists of the edges that have one endpoint in A and one endpoint
in B. Bounds on the size of an edge separator involve the degree of the vertices as well as the number of vertices in
the graph: the planar graphs in which one vertex has degree n − 1, including the wheel graphs and star graphs, have no edge separator with a sublinear number of edges, because any edge separator would have to include all the edges connecting the high-degree vertex to the vertices on the other side of the cut. However, every planar graph with maximum degree Δ has an edge separator of size O(√(Δn)).[13]
A simple cycle separator in the dual graph of a planar graph forms an edge separator in the original graph.[14]
Applying the simple cycle separator theorem of Gazit & Miller (1990) to the dual graph of a given planar graph
strengthens the O(√(Δn)) bound for the size of an edge separator by showing that every planar graph has an edge separator whose size is proportional to the Euclidean norm of its vector of vertex degrees.
Papadimitriou & Sideri (1996) describe a polynomial time algorithm for finding the smallest edge separator that
partitions a graph G into two subgraphs of equal size, when G is an induced subgraph of a grid graph with no holes
or with a constant number of holes. However, they conjecture that the problem is NP-complete for arbitrary planar
graphs, and they show that the complexity of the problem is the same for grid graphs with arbitrarily many holes as it
is for arbitrary planar graphs.
Lower bounds
In a √n × √n grid graph, a set S of s < √n points can enclose a subset of at most s(s − 1)/2 grid points, where the maximum is achieved by arranging S in a diagonal line near a corner of the grid. Therefore, in order to form a separator that separates at least n/3 of the points from the remaining grid, s needs to be at least √(2n/3), approximately 0.82√n.
There exist n-vertex planar graphs (for arbitrarily large values of n) such that, for every separator S that partitions the remaining graph into subgraphs of at most 2n/3 vertices, S has at least c√n vertices for a constant c ≈ 1.56.[2] The construction involves approximating a sphere by a convex polyhedron, replacing each of the faces of the polyhedron by a triangular mesh, and applying isoperimetric theorems for the surface of the sphere.
Separator hierarchies
Separators may be combined into a separator hierarchy of a planar graph, a recursive decomposition into smaller
graphs. A separator hierarchy may be represented by a binary tree in which the root node represents the given graph
itself, and the two children of the root are the roots of recursively constructed separator hierarchies for the induced
subgraphs formed from the two subsets A and B of a separator.
A separator hierarchy of this type forms the basis for a tree decomposition of the given graph, in which the set of
vertices associated with each tree node is the union of the separators on the path from that node to the root of the
tree. Since the sizes of the graphs go down by a constant factor at each level of the tree, the upper bounds on the
sizes of the separators also go down by a constant factor at each level, so the sizes of the separators on these paths
add in a geometric series to O(√n). That is, a tree decomposition formed in this way has width O(√n), and can be used to show that every planar graph has treewidth O(√n).
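For grid graphs the geometric series can be seen directly. This sketch (my own illustration, reusing the central row/column separators from the grid example) sums the separator sizes along one root-to-leaf path of the hierarchy:

```python
def path_separator_total(r, c):
    """Total size of the separators along one root-to-leaf path of a
    separator hierarchy of the r-by-c grid graph, splitting by a central
    row or column at each step.  The sizes decay geometrically down the
    path, so the total is O(sqrt(rc))."""
    if r <= 1 and c <= 1:
        return 0
    if c >= r:
        return r + path_separator_total(r, c // 2)  # cut a central column
    return c + path_separator_total(r // 2, c)      # cut a central row
```

For a 64 × 64 grid (n = 4096, √n = 64) the sum is 189, just under 3√n, matching the geometric-series bound.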
Constructing a separator hierarchy directly, by traversing the binary tree top down and applying a linear-time planar
separator algorithm to each of the induced subgraphs associated with each node of the binary tree, would take a total
of O(n log n) time. However, it is possible to construct an entire separator hierarchy in linear time, by using the Lipton–Tarjan breadth-first layering approach and by using appropriate data structures to perform each partition step
in sublinear time.[15]
If one forms a related type of hierarchy based on separations instead of separators, in which the two children of the
root node are roots of recursively constructed hierarchies for the two subgraphs G1 and G2 of a separation of the
given graph, then the overall structure forms a branch-decomposition instead of a tree decomposition. The width of
any separation in this decomposition is, again, bounded by the sum of the sizes of the separators on a path from any
node to the root of the hierarchy, so any branch-decomposition formed in this way has width O(√n) and any planar graph has branchwidth O(√n). Although many other related graph partitioning problems are NP-complete, even for planar graphs, it is possible to find a minimum-width branch-decomposition of a planar graph in polynomial time.[16]
By applying methods of Alon, Seymour & Thomas (1994) more directly in the construction of branch-decompositions, Fomin & Thilikos (2006a) show that every planar graph has branchwidth at most 2.12√n, with the same constant as the one in the simple cycle separator theorem of Alon et al. Since the treewidth of any graph is at most 3/2 its branchwidth, this also shows that planar graphs have treewidth at most 3.18√n.
Applications
A typical separator-based divide and conquer algorithm finds the shortest cycle of a planar graph as follows:
1. Partition the given graph G into three subsets S, A, B according to the planar separator theorem.
2. Recursively search for the shortest cycles in A and B.
3. Use Dijkstra's algorithm to find, for each s in S, the shortest cycle through s in G.
4. Return the shortest of the cycles found by the above steps.
The time for the two recursive calls on A and B in this algorithm is dominated by the time to perform the O(√n) calls to Dijkstra's algorithm, so this algorithm finds the shortest cycle in O(n^{3/2} log n) time.
Approximation algorithms
Lipton & Tarjan (1980) observed that the separator theorem may be used to obtain polynomial time approximation
schemes for NP-hard optimization problems on planar graphs such as finding the maximum independent set.
Specifically, by truncating a separator hierarchy at an appropriate level, one may find a separator of size O(n/logn)
the removal of which partitions the graph into subgraphs of size clogn, for any constant c. By the four-color
theorem, there exists an independent set of size at least n/4, so the removed nodes form a negligible fraction of the
maximum independent set, and the maximum independent sets in the remaining subgraphs can be found
independently in time exponential in their size. By combining this approach with later linear-time methods for
separator hierarchy construction[15] and with table lookup to share the computation of independent sets between
isomorphic subgraphs, it can be made to construct independent sets of size within a factor of 1O(1/logn) of
optimal, in linear time. However, for approximation ratios even closer to 1 than this factor, a later approach of Baker
(1994) (based on tree-decomposition but not on planar separators) provides better tradeoffs of time versus
approximation quality.
Graph compression
Separators have been used as part of data compression algorithms for representing planar graphs and other separable
graphs using a small number of bits. The basic principle of these algorithms is to choose a number k and repeatedly
subdivide the given planar graph using separators into O(n/k) subgraphs of size at most k, with O(n/√k) vertices in the separators. With an appropriate choice of k (at most proportional to the logarithm of n) the number of
non-isomorphic k-vertex planar subgraphs is significantly less than the number of subgraphs in the decomposition,
so the graph can be compressed by constructing a table of all the possible non-isomorphic subgraphs and
representing each subgraph in the separator decomposition by its index into the table. The remainder of the graph,
formed by the separator vertices, may be represented explicitly or by using a recursive version of the same data
structure. Using this method, planar graphs and many more restricted families of planar graphs may be encoded
using a number of bits that is information-theoretically optimal: if there are Pn n-vertex graphs in the family of graphs to be represented, then an individual graph in the family can be represented using only (1 + o(1))log2 Pn bits.[32] It is also possible to construct representations of this type in which one may test adjacency between vertices,
determine the degree of a vertex, and list neighbors of vertices in constant time per query, by augmenting the table of
subgraphs with additional tabular information representing the answers to the queries.[33][34]
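The table-lookup idea can be illustrated on toy pieces. In this sketch (function names my own), each small piece is reduced to a canonical form by brute force over relabelings, which is feasible only because k is tiny — mirroring the choice of k proportional to log n:

```python
from itertools import permutations

def canonical(edges, k):
    """Canonical form of a k-vertex piece: the lexicographically smallest
    sorted edge list over all vertex relabelings (brute force, so k must
    be very small)."""
    best = None
    for p in permutations(range(k)):
        relabeled = tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best

def compress(pieces, k):
    """Encode each k-vertex piece (an edge list) as an index into a table
    that holds one representative per isomorphism class."""
    table, codes, index = [], [], {}
    for edges in pieces:
        form = canonical(edges, k)
        if form not in index:
            index[form] = len(table)
            table.append(form)
        codes.append(index[form])
    return table, codes
```

Two isomorphic path pieces receive the same code, so the table stores only two distinct subgraphs for three pieces.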
Universal graphs
A universal graph for a family F of graphs is a graph that contains every member of F as a subgraph. Separators can be used to show that the n-vertex planar graphs have universal graphs with n vertices and O(n^{3/2}) edges.[35]
The construction involves a strengthened form of the separator theorem in which the size of the three subsets of vertices in the separator does not depend on the graph structure: there exists a number c, at most a constant times √n, such that the vertices of every n-vertex planar graph can be separated into subsets A, S, and B, with no edges from A to B, with |S| = c, and with |A| = |B| = (n − c)/2. This may be shown by using the usual form of the separator theorem repeatedly to partition the graph until all the components of the partition can be arranged into two subsets of fewer than n/2 vertices, and then moving vertices from these subsets into the separator as necessary until it has the given size.
Once a separator theorem of this type is shown, it can be used to produce a separator hierarchy for n-vertex planar graphs that again does not depend on the graph structure: the tree-decomposition formed from this hierarchy has width O(√n) and can be used for any planar graph. The set of all pairs of vertices in this tree-decomposition that both belong to a common node of the tree-decomposition forms a trivially perfect graph with O(n^{3/2}) edges that contains every n-vertex planar graph as a subgraph. A similar construction shows that bounded-degree planar graphs have universal graphs with O(n log n) edges, where the constant hidden in the O notation depends on the degree bound. Any universal graph for planar graphs (or even for trees of unbounded degree) must have Ω(n log n) edges, but it remains unknown whether this lower bound or the O(n^{3/2}) upper bound is tight for universal graphs for arbitrary planar graphs.[35]
Notes
[1] Alon, Seymour & Thomas (1990).
[2] Djidjev (1982).
[3] George (1973). Instead of using a row or column of a grid graph, George partitions the graph into four pieces by using the union of a row and
a column as a separator.
[4] Lipton & Tarjan (1979).
[5] Gazit & Miller (1990).
[6] Miller et al. (1997).
[7] Pach & Agarwal (1995).
[8] Eppstein, Miller & Teng (1995).
[9] Spielman & Teng (1996).
[10] Gremban, Miller & Teng (1997).
[11] Har-Peled (2011).
[12] Donath & Hoffman (1972); Fiedler (1973).
[13] Miller (1986) proved this result for 2-connected planar graphs, and Diks et al. (1993) extended it to all planar graphs.
[14] Miller (1986); Gazit & Miller (1990).
[15] Goodrich (1995).
[16] Seymour & Thomas (1994).
[17] Lipton & Tarjan (1979); Erdős, Graham & Szemerédi (1976).
[18] Sýkora & Vrťo (1993).
[19] Kawarabayashi & Reed (2010). For earlier work on separators in minor-closed families see Alon, Seymour & Thomas (1990), Plotkin, Rao
& Smith (1994), and Reed & Wood (2009).
[20] Miller et al. (1998).
[21] Chalermsook, Fakcharoenphol & Nanongkai (2004).
[22] Chang & Lu (2011).
[23] Lipton, Rose & Tarjan (1979); Gilbert & Tarjan (1986).
[24] Eppstein et al. (1996); Eppstein et al. (1998).
[25] Lipton & Tarjan (1980).
[26] Klein et al. (1994); Tazari & Müller-Hannemann (2009).
[27] Frieze, Miller & Teng (1992).
[28] Bern (1990); Deĭneko, Klinz & Woeginger (2006); Dorn et al. (2005); Lipton & Tarjan (1980).
[29] Smith & Wormald (1998).
[30] Alber, Fernau & Niedermeier (2003); Fomin & Thilikos (2006b).
[31] Bar-Yehuda & Even (1982); Chiba (1981).
[32] He, Kao & Lu (2000).
[33] Blandford, Blelloch & Kash (2003).
[34] Blelloch & Farzan (2010).
[35] Babai et al. (1982); Bhatt et al. (1989); Chung (1990).
References
Alber, Jochen; Fernau, Henning; Niedermeier, Rolf (2003), "Graph separators: A parameterized view", Journal of Computer and System Sciences 67 (4): 808–832, doi:10.1016/S0022-0000(03)00072-2.
Alon, Noga; Seymour, Paul; Thomas, Robin (1990), "A separator theorem for nonplanar graphs", J. Amer. Math. Soc. 3 (4): 801–808, doi:10.1090/S0894-0347-1990-1065053-0.
Alon, Noga; Seymour, Paul; Thomas, Robin (1994), "Planar separators", SIAM Journal on Discrete Mathematics 7 (2): 184–193, doi:10.1137/S0895480191198768.
Arora, Sanjeev; Grigni, Michelangelo; Karger, David; Klein, Philip; Woloszyn, Andrzej (1998), "A polynomial-time approximation scheme for weighted planar graph TSP" (http://portal.acm.org/citation.cfm?id=314613.314632), Proc. 9th ACM-SIAM Symposium on Discrete Algorithms (SODA '98), pp. 33–41.
Babai, L.; Chung, F. R. K.; Erdős, P.; Graham, R. L.; Spencer, J. H. (1982), "On graphs which contain all sparse graphs" (http://renyi.hu/~p_erdos/1982-12.pdf), in Rosa, Alexander; Sabidussi, Gert; Turgeon, Jean, Theory and practice of combinatorics: a collection of articles honoring Anton Kotzig on the occasion of his sixtieth birthday, Annals of Discrete Mathematics, 12, pp. 21–26.
Graph minors
Graph minors
In graph theory, an undirected graph H is called a minor of the graph G if H is isomorphic to a graph that can be
obtained by zero or more edge contractions on a subgraph of G.
The theory of graph minors began with Wagner's theorem that a graph is planar if and only if it does not contain the
complete graph K5 nor the complete bipartite graph K3,3 as a minor.[1] The RobertsonSeymour theorem states that
the relation "being a minor of" is a well-quasi-ordering on the isomorphism classes of graphs, and implies that many
other families of graphs have forbidden minor characterizations similar to that for the planar graphs.[2]
Definitions
An edge contraction is an operation which removes an edge from a graph while simultaneously merging the two
vertices it used to connect. An undirected graph H is a minor of another undirected graph G if a graph isomorphic to
H can be obtained from G by contracting some edges, deleting some edges, and deleting some isolated vertices. The
order in which a sequence of such contractions and deletions is performed on G does not affect the resulting graph H.
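As a concrete illustration, here is a sketch of edge contraction on an adjacency-set representation, using the simple-graph convention (discussed below) of discarding the resulting self-loop and any parallel edges:

```python
def contract_edge(adj, u, v):
    """Contract the edge (u, v) in place: merge v into u, discarding the
    resulting self-loop and parallel edges (simple-graph convention).
    adj maps each vertex to the set of its neighbors."""
    assert v in adj[u]
    for w in adj.pop(v):       # redirect each of v's edges to u
        if w == v:
            continue           # a self-loop at v simply disappears
        adj[w].discard(v)
        if w != u:
            adj[w].add(u)
            adj[u].add(w)
    adj[u].discard(v)          # drop the contracted edge itself
    adj[u].discard(u)          # and the self-loop it would create
    return adj
```

For example, contracting one edge of a triangle with a pendant vertex attached yields a three-vertex path.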
Graph minors are often studied in the more general context of matroid minors. In this context, it is common to assume that all graphs are connected, with self-loops and multiple edges allowed (that is, they are multigraphs rather than simple graphs); the contraction of a loop and the deletion of a cut-edge are forbidden operations. This point of view has the advantage that edge deletions leave the rank of a graph unchanged, and edge contractions always reduce the rank by one.
In other contexts (such as with the study of pseudoforests) it makes more sense to allow the deletion of a cut-edge,
and to allow disconnected graphs, but to forbid multigraphs. In this variation of graph minor theory, a graph is
always simplified after any edge contraction to eliminate its self-loops and multiple edges.[3]
Example
In the following example, graph H is a minor of graph G.
[Illustrations of H, G, and the intermediate steps omitted.]
The construction: first form a subgraph of G by deleting the dashed edges (and the resulting isolated vertex), and then contract the gray edge (merging the two vertices it connects).
many edges.[8] Additionally, the H-minor-free graphs have a separator theorem similar to the planar separator
theorem for planar graphs: for any fixed H, and any n-vertex H-minor-free graph G, it is possible to find a subset of
O(√n) vertices the removal of which splits G into two (possibly disconnected) subgraphs with at most 2n/3 vertices
per subgraph.[9]
The Hadwiger conjecture in graph theory proposes that if a graph G does not contain a minor isomorphic to the complete graph on k vertices, then G has a proper coloring with k − 1 colors.[10] The case k = 5 is a restatement of the four color theorem. The Hadwiger conjecture has been proven only for k ≤ 6,[11] but remains unproven in the general case. Bollobás, Catlin & Erdős (1980) call it one of the deepest unsolved problems in graph theory.
Another result relating the four-color theorem to graph minors is the snark theorem announced by Robertson,
Sanders, Seymour, and Thomas, a strengthening of the four-color theorem conjectured by W. T. Tutte and stating
that any bridgeless 3-regular graph that requires four colors in an edge coloring must have the Petersen graph as a
minor.[12][13]
union of path graphs, F has bounded treewidth if and only if its forbidden minors include a planar graph,[15] and F
has bounded local treewidth (a functional relationship between diameter and treewidth) if and only if its forbidden
minors include an apex graph (a graph that can be made planar by the removal of a single vertex).[16] If H can be
drawn in the plane with only a single crossing (that is, it has crossing number one) then the H-minor-free graphs
have a simplified structure theorem in which they are formed as clique-sums of planar graphs and graphs of bounded
treewidth.[17] For instance, both K5 and K3,3 have crossing number one, and as Wagner showed the K5-free graphs
are exactly the 3-clique-sums of planar graphs and the eight-vertex Wagner graph, while the K3,3-free graphs are
exactly the 2-clique-sums of planar graphs and K5.[18]
Topological minors
A graph H is called a topological minor of a graph G if a subdivision of H is isomorphic to a subgraph of G.[19] It is
easy to see that every topological minor is also a minor. The converse, however, is not true in general, but it holds for graphs with maximum degree not greater than three.[20]
The topological minor relation is not a well-quasi-ordering on the set of finite graphs and hence the result of
Robertson and Seymour does not apply to topological minors. However it is straightforward to construct finite
forbidden topological minor characterizations from finite forbidden minor characterizations by replacing every
branch set with k outgoing edges by every tree on k leaves that has down degree at least two.
Immersion minor
A graph operation called lifting is central in a concept called immersions. The lifting is an operation on adjacent
edges. Given three vertices v, u, and w, where (v,u) and (u,w) are edges in the graph, the lifting of vuw, or equivalently
of (v,u), (u,w), is the operation that deletes the two edges (v,u) and (u,w) and adds the edge (v,w). In the case where
(v,w) already was present, v and w will now be connected by more than one edge, and hence this operation is
intrinsically a multi-graph operation.
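The lifting operation can be sketched directly on a multigraph represented as a multiset of edges. This is a minimal illustration under our own conventions (the function name `lift` and the `Counter`-of-edge-sets representation are ours, not from any particular library):

```python
from collections import Counter

def lift(edges, v, u, w):
    """Lifting of vuw: delete edges (v,u) and (u,w), add edge (v,w).

    `edges` is a Counter over frozenset pairs, so parallel edges
    (the multigraph case) are represented by counts greater than one.
    """
    vu, uw, vw = frozenset((v, u)), frozenset((u, w)), frozenset((v, w))
    if edges[vu] < 1 or edges[uw] < 1:
        raise ValueError("(v,u) and (u,w) must both be present")
    edges[vu] -= 1
    edges[uw] -= 1
    edges[vw] += 1          # may create a parallel edge if (v,w) existed
    return +edges           # unary + drops zero-count entries

# Lifting the path 1-2-3 at vertex 2 replaces it by the single edge 1-3.
g = Counter({frozenset((1, 2)): 1, frozenset((2, 3)): 1})
print(lift(g, 1, 2, 3))
```

If the edge (v,w) was already present, its count simply rises above one, which is why the representation must allow multi-edges.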
In the case where a graph H can be obtained from a graph G by a sequence of lifting operations (on G) and then
finding an isomorphic subgraph, we say that H is an immersion minor of G.
The immersion minor relation is a well-quasi-ordering on the set of finite graphs and hence the result of Robertson
and Seymour applies to immersion minors. This furthermore means that every immersion minor-closed family is
characterized by a finite family of forbidden immersion minors.
Algorithms
The problem of deciding whether a graph G contains H as a minor is NP-complete in general; for instance, if H is a
cycle graph with the same number of vertices as G, then H is a minor of G if and only if G contains a Hamiltonian
cycle. However, when G is part of the input but H is fixed, it can be solved in polynomial time. More specifically,
the running time for testing whether H is a minor of G in this case is O(n3), where n is the number of vertices in G
and the big O notation hides a constant that depends superexponentially on H.[21] Thus, by applying the polynomial
time algorithm for testing whether a given graph contains any of the forbidden minors, it is possible to recognize the
members of any minor-closed family in polynomial time. However, in order to apply this result constructively, it is
necessary to know what the forbidden minors of the graph family are.[22]
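As a contrast to the fixed-H polynomial-time algorithm cited above, the branch-set definition of a minor can be checked directly by brute force on very small graphs: H is a minor of G exactly when G's vertices can be partly assigned to H's vertices so that each "branch set" is nonempty and connected in G, and every edge of H joins two adjacent branch sets. The sketch below (our own helper names, exponential running time, tiny inputs only) makes that definition concrete:

```python
from itertools import product

def is_minor(h_vertices, h_edges, g_vertices, g_edges):
    """Brute-force minor test: try every assignment of G's vertices to
    branch sets (or to no set), check connectivity and adjacency.
    Exponential in |V(G)| -- purely illustrative."""
    g_adj = {v: set() for v in g_vertices}
    for a, b in g_edges:
        g_adj[a].add(b)
        g_adj[b].add(a)

    def connected(vs):
        vs = set(vs)
        stack, seen = [next(iter(vs))], set()
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(g_adj[x] & vs)
        return seen == vs

    for assign in product(list(h_vertices) + [None], repeat=len(g_vertices)):
        branch = {h: [g for g, img in zip(g_vertices, assign) if img == h]
                  for h in h_vertices}
        # each branch set must be nonempty and connected in G
        if any(not vs or not connected(vs) for vs in branch.values()):
            continue
        # each edge of H must join two adjacent branch sets
        if all(any(b in g_adj[a] for a in branch[x] for b in branch[y])
               for x, y in h_edges):
            return True
    return False

k3 = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
c4 = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)])
p4 = ([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])
# K3 is a minor of the 4-cycle (contract one edge) but not of a path.
print(is_minor(*k3, *c4), is_minor(*k3, *p4))  # True False
```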
Notes
[1] Lovász (2006), p. 77; Wagner (1937a).
[2] Lovász (2006), theorem 4, p. 78; Robertson & Seymour (2004).
[3] Lovász (2006) is inconsistent about whether to allow self-loops and multiple adjacencies: he writes on p. 76 that "parallel edges and loops are
allowed" but on p. 77 he states that "a graph is a forest if and only if it does not contain the triangle K3 as a minor", true only for simple
graphs.
[4] Diestel (2005), Chapter 12: Minors, Trees, and WQO; Robertson & Seymour (2004).
[5] Lovász (2006), p. 76.
[6] Lovász (2006), pp. 80–82; Robertson & Seymour (2003).
[7] Mader (1967).
[8] Kostochka (1982); Kostochka (1984); Thomason (1984); Thomason (2001).
[9] Alon, Seymour & Thomas (1990); Plotkin, Rao & Smith (1994); Reed & Wood (2009).
[10] Hadwiger (1943).
[11] Robertson, Seymour & Thomas (1993).
[12] Pegg, Ed, Jr. (2002), "Book Review: The Colossal Book of Mathematics" (http://www.ams.org/notices/200209/rev-pegg.pdf), Notices
of the American Mathematical Society 49 (9): 1084–1086.
[13] Thomas, Robin, Recent Excluded Minor Theorems for Graphs (http://people.math.gatech.edu/~thomas/PAP/bcc.pdf), p. 14.
[14] Robertson & Seymour (1983).
[15] Lovász (2006), Theorem 9, p. 81; Robertson & Seymour (1986).
[16] Eppstein (2000); Demaine & Hajiaghayi (2004).
[17] Robertson & Seymour (1993); Demaine, Hajiaghayi & Thilikos (2002).
[18]
[19]
[20]
[21]
[22]
References
Alon, Noga; Seymour, Paul; Thomas, Robin (1990), "A separator theorem for nonplanar graphs" (http://www.ams.org/journals/jams/1990-03-04/S0894-0347-1990-1065053-0/home.html), Journal of the American
Mathematical Society 3 (4): 801–808, doi:10.2307/1990903, JSTOR 1990903, MR 1065053.
Bollobás, B.; Catlin, P. A.; Erdős, Paul (1980), "Hadwiger's conjecture is true for almost every graph" (http://www2.renyi.hu/~p_erdos/1980-10.pdf), European Journal of Combinatorics 1: 195–199.
Demaine, Erik D.; Hajiaghayi, MohammadTaghi (2004), "Diameter and treewidth in minor-closed graph families,
revisited" (http://erikdemaine.org/papers/DiameterTreewidth_Algorithmica/), Algorithmica 40 (3): 211–215,
doi:10.1007/s00453-004-1106-1.
Demaine, Erik D.; Hajiaghayi, MohammadTaghi; Thilikos, Dimitrios M. (2002), "1.5-Approximation for
treewidth of graphs excluding a graph with one crossing as a minor", Proc. 5th International Workshop on
Approximation Algorithms for Combinatorial Optimization (APPROX 2002), Lecture Notes in Computer Science,
2462, Springer-Verlag, pp. 67–80, doi:10.1007/3-540-45753-4_8.
Diestel, Reinhard (2005), Graph Theory (http://www.math.uni-hamburg.de/home/diestel/books/graph.theory/) (3rd ed.), Berlin, New York: Springer-Verlag, ISBN 978-3-540-26183-4.
Eppstein, David (2000), "Diameter and treewidth in minor-closed graph families", Algorithmica 27: 275–291,
arXiv:math.CO/9907126, doi:10.1007/s004530010020, MR 2001c:05132.
Fellows, Michael R.; Langston, Michael A. (1988), "Nonconstructive tools for proving polynomial-time
decidability", Journal of the ACM 35 (3): 727–739, doi:10.1145/44483.44491.
Hadwiger, Hugo (1943), "Über eine Klassifikation der Streckenkomplexe", Vierteljschr. Naturforsch. Ges. Zürich
88: 133–143.
Hall, Dick Wick (1943), "A note on primitive skew curves", Bulletin of the American Mathematical Society 49
(12): 935–936, doi:10.1090/S0002-9904-1943-08065-2.
Kostochka, Alexandr V. (1982), "The minimum Hadwiger number for graphs with a given mean degree of
vertices" (in Russian), Metody Diskret. Analiz. 38: 37–58.
Kostochka, Alexandr V. (1984), "Lower bound of the Hadwiger number of graphs by their average degree",
Combinatorica 4: 307–316, doi:10.1007/BF02579141.
Lovász, László (2006), "Graph minor theory", Bulletin of the American Mathematical Society 43 (1): 75–86,
doi:10.1090/S0273-0979-05-01088-8.
Mader, Wolfgang (1967), "Homomorphieeigenschaften und mittlere Kantendichte von Graphen", Mathematische
Annalen 174 (4): 265–268, doi:10.1007/BF01364272.
Plotkin, Serge; Rao, Satish; Smith, Warren D. (1994), "Shallow excluded minors and improved graph
decompositions" (http://www.stanford.edu/~plotkin/lminors.ps), Proc. 5th ACM–SIAM Symp. on Discrete
Algorithms (SODA 1994), pp. 462–470.
Reed, Bruce; Wood, David R. (2009), "A linear-time algorithm to find a separator in a graph excluding a minor",
ACM Transactions on Algorithms 5 (4): Article 39, doi:10.1145/1597036.1597043.
Robertson, Neil; Seymour, Paul (1983), "Graph minors. I. Excluding a forest", Journal of Combinatorial Theory,
Series B 35 (1): 39–61, doi:10.1016/0095-8956(83)90079-5.
Robertson, Neil; Seymour, Paul D. (1986), "Graph minors. V. Excluding a planar graph", Journal of
Combinatorial Theory, Series B 41 (1): 92–114, doi:10.1016/0095-8956(86)90030-4.
Robertson, Neil; Seymour, Paul D. (1993), "Excluding a graph with one crossing", in Robertson, Neil; Seymour,
Paul, Graph Structure Theory: Proc. AMS–IMS–SIAM Joint Summer Research Conference on Graph Minors,
Contemporary Mathematics, 147, American Mathematical Society, pp. 669–675.
Robertson, Neil; Seymour, Paul D. (1995), "Graph Minors. XIII. The disjoint paths problem", Journal of
Combinatorial Theory, Series B 63 (1): 65–110, doi:10.1006/jctb.1995.1006.
Robertson, Neil; Seymour, Paul D. (2003), "Graph Minors. XVI. Excluding a non-planar graph", Journal of
Combinatorial Theory, Series B 89 (1): 43–76, doi:10.1016/S0095-8956(03)00042-X.
Robertson, Neil; Seymour, Paul D. (2004), "Graph Minors. XX. Wagner's conjecture", Journal of Combinatorial
Theory, Series B 92 (2): 325–357, doi:10.1016/j.jctb.2004.08.001.
Robertson, Neil; Seymour, Paul; Thomas, Robin (1993), "Hadwiger's conjecture for K6-free graphs" (http://www.math.gatech.edu/~thomas/PAP/hadwiger.pdf), Combinatorica 13: 279–361, doi:10.1007/BF01202354.
Thomason, Andrew (1984), "An extremal function for contractions of graphs", Mathematical Proceedings of the
Cambridge Philosophical Society 95 (2): 261–265, doi:10.1017/S0305004100061521.
Thomason, Andrew (2001), "The extremal function for complete minors", Journal of Combinatorial Theory,
Series B 81 (2): 318–338, doi:10.1006/jctb.2000.2013.
Wagner, Klaus (1937a), "Über eine Eigenschaft der ebenen Komplexe", Math. Ann. 114: 570–590,
doi:10.1007/BF01594196.
Wagner, Klaus (1937b), "Über eine Erweiterung des Satzes von Kuratowski", Deutsche Mathematik 2: 280–285.
External links
Weisstein, Eric W., "Graph Minor" (http://mathworld.wolfram.com/GraphMinor.html) from MathWorld.
Courcelle's theorem
In the study of graph algorithms, Courcelle's theorem is the statement that every graph property definable in
monadic second-order logic can be decided in linear time on graphs of bounded treewidth. The result was first
proved by Bruno Courcelle in 1990 and is considered the archetype of algorithmic meta-theorems.[1][2][3]
In this context, the graph is described by a set of vertices V and a binary adjacency relation, and the restriction to
monadic logic means that the graph property in question may be defined in terms of sets of vertices of the given
graph, but not in terms of sets of edges. As an example, the property of a graph being colorable with three colors
(represented by three sets of vertices R, G, and B) may be defined by the monadic second-order formula
∃R ∃G ∃B ( ∀v (v ∈ R ∨ v ∈ G ∨ v ∈ B) ∧ ∀u ∀v (adj(u,v) → ¬(u ∈ R ∧ v ∈ R) ∧ ¬(u ∈ G ∧ v ∈ G) ∧ ¬(u ∈ B ∧ v ∈ B)) ).
The first part of this formula ensures that the three color classes cover all the vertices of the graph, and the second
ensures that they each form an independent set. Thus, by Courcelle's theorem, 3-colorability of graphs of bounded
treewidth may be tested in linear time.
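The meaning of the formula can be checked mechanically on small graphs: a graph is 3-colorable exactly when some assignment of the classes R, G, B to the vertices leaves no edge inside a single class. The brute-force sketch below mirrors the two parts of the formula (it runs in exponential time, unlike the linear-time bound the theorem guarantees on bounded-treewidth graphs; the function name is ours):

```python
from itertools import product

def three_colorable(vertices, edges):
    """Evaluate the MSO formula directly: do sets R, G, B exist that
    cover all vertices (first part) and each induce an independent
    set (second part)?  Equivalent to trying every 3-coloring."""
    for colors in product("RGB", repeat=len(vertices)):
        cls = dict(zip(vertices, colors))
        # second part of the formula: no edge lies inside a color class
        if all(cls[u] != cls[v] for u, v in edges):
            return True
    return False

triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
k4 = ([0, 1, 2, 3], [(a, b) for a in range(4) for b in range(a + 1, 4)])
print(three_colorable(*triangle), three_colorable(*k4))  # True False
```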
The typical approach to proving Courcelle's theorem involves the construction of a finite bottom-up tree automaton
that acts on a tree decomposition of the graph to recognize the property.[2]
References
[1] Courcelle, Bruno (1990), "The monadic second-order logic of graphs. I. Recognizable sets of finite graphs", Information and Computation 85
(1): 1275, doi:10.1016/0890-5401(90)90043-H, MR1042649
[2] Kneis, Joachim; Langer, Alexander (2009), "A practical approach to Courcelle's theorem", Electronic Notes in Theoretical Computer Science
251: 6581, doi:10.1016/j.entcs.2009.08.028.
[3] Lampis, Michael (2010), "Algorithmic meta-theorems for restrictions of treewidth", in de Berg, Mark; Meyer, Ulrich, Proc. 18th Annual
European Symposium on Algorithms, Lecture Notes in Computer Science, 6346, Springer, pp.549560, doi:10.1007/978-3-642-15775-2_47.
Robertson–Seymour theorem
In graph theory, the Robertson–Seymour theorem (also called the graph minor theorem[1]) states that the
undirected graphs, partially ordered by the graph minor relationship, form a well-quasi-ordering.[2] Equivalently,
every family of graphs that is closed under minors can be defined by a finite set of forbidden minors, in the same
way that Wagner's theorem characterizes the planar graphs as being the graphs that do not have the complete graph
K5 and the complete bipartite graph K3,3 as minors.
The Robertson–Seymour theorem is named after mathematicians Neil Robertson and Paul D. Seymour, who proved
it in a series of twenty papers spanning over 500 pages from 1983 to 2004.[3] Before its proof, the statement of the
theorem was known as Wagner's conjecture after the German mathematician Klaus Wagner, although Wagner said
he never conjectured it.[4]
A weaker result for trees is implied by Kruskal's tree theorem, which was conjectured in 1937 by Andrew Vázsonyi
and proved in 1960 independently by Joseph Kruskal and S. Tarkowski.[5]
Statement
A minor of an undirected graph G is any graph that may be obtained from G by a sequence of zero or more
contractions of edges of G and deletions of edges and vertices of G. The minor relationship forms a partial order on
the set of all distinct finite undirected graphs, as it obeys the three axioms of partial orders: it is reflexive (every
graph is a minor of itself), transitive (a minor of a minor of G is itself a minor of G), and antisymmetric (if two
graphs G and H are minors of each other, then they must be isomorphic). However, if graphs that are isomorphic
may nonetheless be considered as distinct objects, then the minor ordering on graphs forms a preorder, a relation that
is reflexive and transitive but not necessarily antisymmetric.[6]
A preorder is said to form a well-quasi-ordering if it contains neither an infinite descending chain nor an infinite
antichain.[7] For instance, the usual ordering on the non-negative integers is a well-quasi-ordering, but the same
ordering on the set of all integers is not, because it contains the infinite descending chain 0, −1, −2, −3, …
The Robertson–Seymour theorem states that finite undirected graphs and graph minors form a well-quasi-ordering. It
is obvious that the graph minor relationship does not contain any infinite descending chain, because each contraction
or deletion reduces the number of edges and vertices of the graph (a non-negative integer).[8] The nontrivial part of
the theorem is that there are no infinite antichains, infinite sets of graphs that are all unrelated to each other by the
minor ordering. If S is a set of graphs, and M is a subset of S containing one representative graph for each
equivalence class of minimal elements (graphs that belong to S but for which no proper minor belongs to S), then M
forms an antichain; therefore, an equivalent way of stating the theorem is that, in any infinite set S of graphs, there
must be only a finite number of non-isomorphic minimal elements.
Another equivalent form of the theorem is that, in any infinite set S of graphs, there must be a pair of graphs one of
which is a minor of the other.[8] The statement that every infinite set has finitely many minimal elements implies this
form of the theorem, for if there are only finitely many minimal elements, then each of the remaining graphs must
belong to a pair of this type with one of the minimal elements. And in the other direction, this form of the theorem
implies the statement that there can be no infinite antichains, because an infinite antichain is a set that does not
contain any pair related by the minor relation.
Obstruction sets
Some examples of finite obstruction sets were already known for
specific classes of graphs before the Robertson–Seymour theorem was
proved. For example, the obstruction for the set of all forests is the
loop graph (or, if one restricts to simple graphs, the cycle with three
vertices). This means that a graph is a forest if and only if none of its
minors is the loop (or, the cycle with three vertices, respectively). The
sole obstruction for the set of paths is the tree with four vertices, one of
which has degree 3. In these cases, the obstruction set contains a single
element, but in general this is not the case. Wagner's theorem states
that a graph is planar if and only if it has neither K5 nor K3,3 as a minor.
In other words, the set {K5,K3,3} is an obstruction set for the set of all
planar graphs, and in fact the unique minimal obstruction set. A similar
theorem states that K4 and K2,3 are the forbidden minors for the set of
outerplanar graphs.
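For the forest example above, testing the obstruction directly amounts to cycle detection: a graph contains the loop (or, in the simple-graph setting, the triangle) as a minor exactly when it contains a cycle. A small union-find sketch (the function names are ours):

```python
def is_forest(n, edges):
    """A graph on vertices 0..n-1 is a forest iff no edge closes a
    cycle.  Union-find over the vertex set detects the first edge
    whose endpoints are already connected."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:        # u and v already connected: cycle found
            return False
        parent[ru] = rv
    return True

print(is_forest(4, [(0, 1), (1, 2), (2, 3)]))  # True  (a path)
print(is_forest(3, [(0, 1), (1, 2), (0, 2)]))  # False (the triangle)
```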
Fixed-parameter tractability
For graph invariants with the property that, for each k, the graphs with invariant at most k are minor-closed, the same
method applies. For instance, by this result, treewidth, branchwidth, pathwidth, vertex cover, and the minimum
genus of an embedding are all amenable to this approach, and for any fixed k there is a polynomial time algorithm
for testing whether each of these invariants is at most k, in which the exponent in the running time of the algorithm
does not depend on k. A problem with this property, that it can be solved in polynomial time for any fixed k with an
exponent that does not depend on k, is known as fixed-parameter tractable.
However, this method does not directly provide a single fixed-parameter-tractable algorithm for computing the
parameter value for a given graph with unknown k, because of the difficulty of determining the set of forbidden
minors. Additionally, the large constant factors involved in these results make them highly impractical. Therefore,
the development of explicit fixed-parameter algorithms for these problems, with improved dependence on k, has
continued to be an important line of research.
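One classic explicit fixed-parameter algorithm of this kind is the bounded search tree for Vertex Cover: any edge forces at least one of its endpoints into the cover, so branching on the two endpoints gives a search tree of depth at most k and total time O(2^k · m), with the exponent depending only on the parameter. The sketch below is illustrative, not taken from the cited papers:

```python
def has_vertex_cover(edges, k):
    """Decide whether the graph given by `edges` has a vertex cover
    of size at most k, by branching on an arbitrary uncovered edge."""
    if not edges:
        return True          # nothing left to cover
    if k == 0:
        return False         # budget exhausted but edges remain
    u, v = edges[0]
    # branch 1: put u in the cover; branch 2: put v in the cover
    rest_u = [e for e in edges if u not in e]
    rest_v = [e for e in edges if v not in e]
    return has_vertex_cover(rest_u, k - 1) or has_vertex_cover(rest_v, k - 1)

triangle = [(0, 1), (1, 2), (0, 2)]
print(has_vertex_cover(triangle, 2), has_vertex_cover(triangle, 1))  # True False
```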
Notes
[1] Bienstock & Langston (1995).
[2] Robertson & Seymour (2004).
[3] Robertson and Seymour (1983, 2004); Diestel (2005, p. 333).
[4] Diestel (2005, p. 355).
[5] Diestel (2005, pp. 335–336); Lovász (2005), Section 3.3, pp. 78–79.
[6] E.g., see Bienstock & Langston (1995), Section 2, "well-quasi-orders".
[7] Diestel (2005, p. 334).
[8] Lovász (2005, p. 78).
[9] Bienstock & Langston (1995), Corollary 2.1.1; Lovász (2005), Theorem 4, p. 78.
[10] Lovász (2005, pp. 76–77).
[11] Chambers (2002).
[12] Robertson & Seymour (1995); Bienstock & Langston (1995), Theorem 2.1.4 and Corollary 2.1.5; Lovász (2005), Theorem 11, p. 83.
[13] Fellows & Langston (1988); Bienstock & Langston (1995), Section 6.
References
Bienstock, Daniel; Langston, Michael A. (1995), "Algorithmic implications of the graph minor theorem" (http://www.cs.utk.edu/~langston/courses/cs594-fall2003/BL.pdf), Network Models, Handbooks in Operations
Research and Management Science, 7, pp. 481–502, doi:10.1016/S0927-0507(05)80125-2.
Chambers, J. (2002), Hunting for torus obstructions, M.Sc. thesis, Department of Computer Science, University
of Victoria.
Diestel, Reinhard (2005), "Minors, Trees, and WQO" (http://www.math.uni-hamburg.de/home/diestel/books/graph.theory/preview/Ch12.pdf), Graph Theory (Electronic Edition 2005 ed.), Springer, pp. 326–367.
Fellows, Michael R.; Langston, Michael A. (1988), "Nonconstructive tools for proving polynomial-time
decidability", Journal of the ACM 35 (3): 727–739, doi:10.1145/44483.44491.
Friedman, Harvey; Robertson, Neil; Seymour, Paul (1987), "The metamathematics of the graph minor theorem",
in Simpson, S., Logic and Combinatorics, Contemporary Mathematics, 65, American Mathematical Society,
pp. 229–261.
Lovász, László (2005), "Graph Minor Theory", Bulletin of the American Mathematical Society (New Series) 43
(1): 75–86, doi:10.1090/S0273-0979-05-01088-8.
Robertson, Neil; Seymour, Paul (1983), "Graph Minors. I. Excluding a forest", Journal of Combinatorial Theory,
Series B 35 (1): 39–61, doi:10.1016/0095-8956(83)90079-5.
Robertson, Neil; Seymour, Paul (1995), "Graph Minors. XIII. The disjoint paths problem", Journal of
Combinatorial Theory, Series B 63 (1): 65–110, doi:10.1006/jctb.1995.1006.
Robertson, Neil; Seymour, Paul (2004), "Graph Minors. XX. Wagner's conjecture", Journal of Combinatorial
Theory, Series B 92 (2): 325–357, doi:10.1016/j.jctb.2004.08.001.
External links
Weisstein, Eric W., "Robertson–Seymour Theorem" (http://mathworld.wolfram.com/Robertson-SeymourTheorem.html) from MathWorld.
Bidimensionality
Bidimensionality theory characterizes a broad range of graph problems (bidimensional problems) that admit
efficient approximate, fixed-parameter, or kernel solutions on a broad range of graph classes. These classes include planar
graphs, map graphs, bounded-genus graphs, and graphs excluding any fixed minor. In particular, bidimensionality
theory builds on the Graph Minor Theory of Robertson and Seymour by extending the mathematical results and
building new algorithmic tools. The theory was introduced in the work of Demaine, Fomin, Hajiaghayi, and
Thilikos.[1]
Definition
A parameterized problem is a subset Π of Σ* × ℕ; an instance of a parameterized problem is a pair (G, k),
where G is a graph and k is an integer parameter.
A parameterized graph problem Π is minor-bidimensional if
1. For any pair of graphs H and G such that H is a minor of G, and any integer k, (G, k) ∈ Π yields that
(H, k) ∈ Π. In other words, contracting or deleting an edge of a graph cannot increase the parameter; and
2. there is δ > 0 such that for every r, the (r × r)-grid R satisfies (R, k) ∉ Π for every k ≤ δr².
A parameterized graph problem Π is contraction-bidimensional if
1. For any pair of graphs H and G such that H is a contraction of G, and any integer k, (G, k) ∈ Π yields
that (H, k) ∈ Π. In other words, contracting an edge of a graph cannot increase the parameter; and
2. there is δ > 0 such that (Γr, k) ∉ Π for every k ≤ δr², where Γr is the graph obtained from the
(r × r)-grid by triangulating internal faces such that all internal vertices become of degree 6, and
then joining one corner of degree two by edges to all vertices of the external face.
Algorithmic applications
Let Π be a minor-bidimensional problem such that, for any graph G excluding some fixed graph as a minor and of
treewidth at most t, membership of (G, k) in Π can be decided in time 2^O(t) · n^O(1). Then Π can be solved on such
graphs in subexponential parameterized time 2^O(√k) · n^O(1). Similarly, for contraction-bidimensional problems, the
same bound holds for graphs G excluding some fixed apex graph as a minor.
Thus many bidimensional problems like Vertex Cover, Dominating Set, and k-Path are solvable in time 2^O(√k) · n^O(1)
on graphs excluding some fixed graph as a minor.
Kernelization
A parameterized problem with a parameter k is said to admit a linear vertex kernel if there is a polynomial time
reduction, called a kernelization algorithm, that maps the input instance to an equivalent instance with at most O(k)
vertices.
Every minor-bidimensional problem with two additional properties, namely the separation property and
finite integer index, has a linear vertex kernel on graphs excluding some fixed graph as a minor. Similarly, every
contraction-bidimensional problem with the separation property and finite integer index has a linear vertex
kernel on graphs excluding some fixed apex graph as a minor.[6]
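A standard concrete example of a kernelization algorithm, albeit one yielding a quadratic edge kernel for Vertex Cover rather than the linear vertex kernel discussed above, is Buss's reduction: a vertex of degree greater than k must belong to every cover of size at most k, and once no such vertex remains, an instance with more than k² edges is a "no" instance. A sketch (function name ours; isolated vertices are implicit since the graph is given as an edge list):

```python
def buss_kernel(edges, k):
    """Buss's kernelization for Vertex Cover.  Returns an equivalent
    reduced instance (edges', k'), or None if the input is already
    known to have no vertex cover of size at most k."""
    changed = True
    while changed:
        changed = False
        deg = {}
        for u, v in edges:
            deg[u] = deg.get(u, 0) + 1
            deg[v] = deg.get(v, 0) + 1
        high = [v for v, d in deg.items() if d > k]
        if high:
            v = high[0]
            # v has degree > k, so it must be in any size-k cover:
            # take it, delete its edges, and reduce the budget.
            edges = [e for e in edges if v not in e]
            k -= 1
            changed = True
        if k < 0:
            return None
    if len(edges) > k * k:
        return None          # k vertices of degree <= k cover <= k^2 edges
    return edges, k

# A star with 5 leaves and k = 2: the center is forced into the cover,
# leaving the empty instance with budget 1.
star = [(0, i) for i in range(1, 6)]
print(buss_kernel(star, 2))  # ([], 1)
```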
Notes
[1]
[2]
[3]
[4]
[5]
[6]
References
Demaine, Erik D.; Fomin, Fedor V.; Hajiaghayi, MohammadTaghi; Thilikos, Dimitrios M. (2005),
"Subexponential parameterized algorithms on bounded-genus graphs and H-minor-free graphs", J. ACM 52 (6):
866–893, doi:10.1145/1101821.1101823.
Demaine, Erik D.; Fomin, Fedor V.; Hajiaghayi, MohammadTaghi; Thilikos, Dimitrios M. (2004),
"Bidimensional parameters and local treewidth", SIAM Journal on Discrete Mathematics 18 (3): 501–511,
doi:10.1137/S0895480103433410.
Demaine, Erik D.; Hajiaghayi, MohammadTaghi (2005), "Bidimensionality: new connections between FPT
algorithms and PTASs", 16th ACM–SIAM Symposium on Discrete Algorithms (SODA 2005), pp. 590–601.
Demaine, Erik D.; Hajiaghayi, MohammadTaghi (2008), "Linearity of grid minors in treewidth with applications
through bidimensionality", Combinatorica 28 (1): 19–36, doi:10.1007/s00493-008-2140-4.
Demaine, Erik D.; Hajiaghayi, MohammadTaghi (2008), "The bidimensionality theory and its algorithmic
applications", The Computer Journal 51 (3): 332–337, doi:10.1093/comjnl/bxm033.
Fomin, Fedor V.; Golovach, Petr A.; Thilikos, Dimitrios M. (2009), "Contraction Bidimensionality: The Accurate
Picture", 17th Annual European Symposium on Algorithms (ESA 2009), Lecture Notes in Computer Science,
5757, pp. 706–717, doi:10.1007/978-3-642-04128-0_63.
Fomin, Fedor V.; Lokshtanov, Daniel; Raman, Venkatesh; Saurabh, Saket (2010), "Bidimensionality and
EPTAS", Proc. 22nd ACM–SIAM Symposium on Discrete Algorithms (SODA 2011), pp. 748–759, arXiv:1005.5449.
Fomin, Fedor V.; Lokshtanov, Daniel; Saurabh, Saket; Thilikos, Dimitrios M. (2010), "Bidimensionality and
Kernels", 21st ACM–SIAM Symposium on Discrete Algorithms (SODA 2010), pp. 503–510.
License
Creative Commons Attribution-Share Alike 3.0 Unported
//creativecommons.org/licenses/by-sa/3.0/