Graph Isomorphism Notes

Graph isomorphism

In graph theory, an isomorphism of graphs G and H is a bijection f between the vertex sets of G and H such that any two vertices u and v of G are adjacent in G if and only if f(u) and f(v) are adjacent in H. This kind of bijection is commonly described as an "edge-preserving bijection", in accordance with the general notion of isomorphism being a structure-preserving bijection. If an isomorphism exists between two graphs, then the graphs are called isomorphic, denoted G ≃ H. In the case when the bijection is a mapping of a graph onto itself, i.e., when G and H are one and the same graph, the bijection is called an automorphism of G. If the two graphs are finite and have the same number of vertices, a map between their vertex sets is a bijection as soon as it is either one-one or onto; there is no need to verify both. Graph isomorphism is an equivalence relation on graphs and as such it partitions the class of all graphs into equivalence classes. A set of graphs isomorphic to each other is called an isomorphism class of graphs. The question of whether graph isomorphism can be determined in polynomial time is a major unsolved problem in computer science, known as the Graph Isomorphism problem.[1][2]
The two graphs shown below are isomorphic, despite their different-looking drawings.

An isomorphism between graph G and graph H:
f(a) = 1, f(b) = 6, f(c) = 8, f(d) = 3,
f(g) = 5, f(h) = 2, f(i) = 4, f(j) = 7
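Since the drawings of G and H are not reproduced here, the following is a minimal Python sketch on two small made-up graphs: it tries every vertex bijection and keeps one that maps edges onto edges. This is feasible only for very small graphs, since there are n! bijections to try.

from itertools import permutations

def are_isomorphic(g_vertices, g_edges, h_vertices, h_edges):
    """Brute-force isomorphism test: try every vertex bijection."""
    if len(g_vertices) != len(h_vertices) or len(g_edges) != len(h_edges):
        return None
    h_edge_set = {frozenset(e) for e in h_edges}
    for perm in permutations(h_vertices):
        f = dict(zip(g_vertices, perm))
        # With equal edge counts, mapping every edge of G to an edge of H
        # forces the edge map to be onto as well, so f is an isomorphism.
        if all(frozenset((f[u], f[v])) in h_edge_set for u, v in g_edges):
            return f
    return None

# Hypothetical example: a 4-cycle drawn two different ways.
G = (['a', 'b', 'c', 'd'], [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')])
H = ([1, 2, 3, 4], [(1, 3), (3, 2), (2, 4), (4, 1)])
print(are_isomorphic(*G, *H))   # e.g. {'a': 1, 'b': 3, 'c': 2, 'd': 4}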

Variations
In the above definition, graphs are understood to be undirected non-labeled non-
weighted graphs. However, the notion of isomorphic may be applied to all other variants of
the notion of graph, by adding the requirements to preserve the corresponding additional
elements of structure: arc directions, edge weights, etc., with the following exception.
Isomorphism of labeled graphs
For labeled graphs, two definitions of isomorphism are in use.
Under one definition, an isomorphism is a vertex bijection which is both edge-preserving and
label-preserving.[3][4]
Under another definition, an isomorphism is an edge-preserving vertex bijection which
preserves equivalence classes of labels, i.e., vertices with equivalent (e.g., the same) labels
are mapped onto the vertices with equivalent labels and vice versa; same with edge labels.[5]
For example, the complete graph on two vertices, with the vertices labelled 1 and 2, has a single automorphism under the first definition, but under the second definition there are two automorphisms.
The second definition is assumed in certain situations when graphs are endowed with unique
labels commonly taken from the integer range 1,...,n, where n is the number of the vertices of
the graph, used only to uniquely identify the vertices. In such cases two labeled graphs are
sometimes said to be isomorphic if the corresponding underlying unlabeled graphs are
isomorphic (otherwise the definition of isomorphism would be trivial).

Motivation
The formal notion of "isomorphism", e.g., of "graph isomorphism", captures the informal
notion that some objects have "the same structure" if one ignores individual distinctions of
"atomic" components of objects in question. Whenever individuality of "atomic" components
(vertices and edges, for graphs) is important for correct representation of whatever is
modeled by graphs, the model is refined by imposing additional restrictions on the structure,
and other mathematical objects are used: digraphs, labeled graphs, colored graphs, rooted
trees and so on. The isomorphism relation may also be defined for all these generalizations
of graphs: the isomorphism bijection must preserve the elements of structure which define
the object type in question: arcs, labels, vertex/edge colors, the root of the rooted tree, etc.
The notion of "graph isomorphism" allows us to distinguish graph properties inherent to the
structures of graphs themselves from properties associated with graph representations: graph
drawings, data structures for graphs, graph labelings, etc. For example, if a graph has exactly
one cycle, then all graphs in its isomorphism class also have exactly one cycle. On the other
hand, in the common case when the vertices of a graph are (represented by) the integers 1, 2, …, N, then the expression ∑_{v ∈ V(G)} v · deg(v) may be different for two isomorphic graphs.

Whitney theorem
Main article: Whitney graph isomorphism theorem

The exception to Whitney's theorem: these two graphs are not isomorphic but have
isomorphic line graphs.
The Whitney graph isomorphism theorem,[6] shown by Hassler Whitney, states that
two connected graphs are isomorphic if and only if their line graphs are isomorphic, with
a single exception: K3, the complete graph on three vertices, and the complete bipartite
graph K1,3, which are not isomorphic but both have K3 as their line graph. The Whitney
graph theorem can be extended to hypergraphs.[7]

Recognition of graph isomorphism


Main article: Graph isomorphism problem
While graph isomorphism may be studied in a classical mathematical way, as
exemplified by the Whitney theorem, it is recognized that it is a problem to be tackled
with an algorithmic approach. The computational problem of determining whether two
finite graphs are isomorphic is called the graph isomorphism problem.
Its practical applications include primarily cheminformatics, mathematical
chemistry (identification of chemical compounds), and electronic design
automation (verification of equivalence of various representations of the design of
an electronic circuit).
The graph isomorphism problem is one of few standard problems in computational
complexity theory belonging to NP, but not known to belong to either of its well-known
(and, if P ≠ NP, disjoint) subsets: P and NP-complete. It is one of only two, out of 12
total, problems listed in Garey & Johnson (1979) whose complexity remains unresolved,
the other being integer factorization. It is however known that if the problem is NP-
complete then the polynomial hierarchy collapses to a finite level.[8]
In November 2015, László Babai, a mathematician and computer scientist at the
University of Chicago, claimed to have proven that the graph isomorphism problem is
solvable in quasi-polynomial time.[9][10] He published preliminary versions of these
results in the proceedings of the 2016 Symposium on Theory of Computing,[11] and of the
2018 International Congress of Mathematicians.[12] In January 2017, Babai briefly
retracted the quasi-polynomiality claim and stated a sub-exponential time complexity
bound instead. He restored the original claim five days later.[13] As of 2020, the full
journal version of Babai's paper has not yet been published.
Its generalization, the subgraph isomorphism problem, is known to be NP-complete.
The main areas of research for the problem are design of fast algorithms and theoretical
investigations of its computational complexity, both for the general problem and for
special classes of graphs.

Graph homomorphism


Not to be confused with graph homeomorphism.
A homomorphism from the flower snark J5 into the cycle graph C5.
It is also a retraction onto the subgraph on the central five vertices. Thus J5 is in fact homomorphically
equivalent to the core C5.

In the mathematical field of graph theory, a graph homomorphism is a mapping between two graphs that respects their structure. More concretely, it is a function between the vertex sets of two graphs that maps adjacent vertices to adjacent vertices.
Homomorphisms generalize various notions of graph colorings and allow the expression
of an important class of constraint satisfaction problems, such as
certain scheduling or frequency assignment problems.[1] The fact that homomorphisms
can be composed leads to rich algebraic structures: a preorder on graphs, a distributive
lattice, and a category (one for undirected graphs and one for directed graphs).[2]
The computational complexity of finding a homomorphism between given graphs is
prohibitive in general, but a lot is known about special cases that are solvable
in polynomial time. Boundaries between tractable and intractable cases have been an
active area of research.[3]

Definitions
In this article, unless stated otherwise, graphs are finite, undirected graphs with loops allowed, but multiple edges (parallel edges) disallowed. A graph homomorphism[4] f from a graph G = (V(G), E(G)) to a graph H = (V(H), E(H)), written
f : G → H,
is a function from V(G) to V(H) that maps endpoints of each edge in G to endpoints of an edge in H. Formally, {u, v} ∈ E(G) implies {f(u), f(v)} ∈ E(H), for all pairs of vertices u, v in V(G). If there exists any homomorphism from G to H, then G is said to be homomorphic to H or H-colorable. This is often denoted as just:
G → H .
The above definition is extended to directed graphs. Then, for a homomorphism f : G → H, (f(u), f(v)) is an arc (directed edge) of H whenever (u, v) is an arc of G.
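As a concrete illustration of the definition, here is a minimal Python sketch (the graphs in it are made up) that checks the defining condition directly: every edge of G must map to an edge of H.

def is_homomorphism(f, g_edges, h_edges):
    """Check that f maps every edge {u, v} of G to an edge {f(u), f(v)} of H."""
    h_edge_set = {frozenset(e) for e in h_edges}
    return all(frozenset((f[u], f[v])) in h_edge_set for u, v in g_edges)

# Hypothetical example: the even cycle C6 is 2-colorable, i.e. it has a
# homomorphism onto the single edge K2.
c6_edges = [(i, (i + 1) % 6) for i in range(6)]
k2_edges = [(0, 1)]
f = {i: i % 2 for i in range(6)}               # alternate the two endpoints
print(is_homomorphism(f, c6_edges, k2_edges))  # True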
There is an injective homomorphism from G to H (i.e., one that never maps
distinct vertices to one vertex) if and only if G is a subgraph of H. If a
homomorphism f : G → H is a bijection (a one-to-one correspondence between
vertices of G and H) whose inverse function is also a graph homomorphism,
then f is a graph isomorphism.[5]
Covering maps are a special kind of homomorphism that mirrors the definition
and many properties of covering maps in topology.[6] They are defined
as surjective homomorphisms (i.e., something maps to each vertex) that are also
locally bijective, that is, a bijection on the neighbourhood of each vertex. An
example is the bipartite double cover, formed from a graph by splitting each
vertex v into v0 and v1 and replacing each edge u,v with edges u0,v1 and v0,u1. The
function mapping v0 and v1 in the cover to v in the original graph is a
homomorphism and a covering map.
Graph homeomorphism is a different notion, not related directly to
homomorphisms. Roughly speaking, it requires injectivity, but allows mapping
edges to paths (not just to edges). Graph minors are a still more relaxed notion.
Cores and retracts

K7, the complete graph with 7 vertices, is a core.

Main article: Core (graph theory)


Two graphs G and H are homomorphically equivalent if G → H and H → G.[4] The maps are not necessarily surjective nor injective. For instance,
the complete bipartite graphs K2,2 and K3,3 are homomorphically equivalent: each
map can be defined as taking the left (resp. right) half of the domain graph and
mapping to just one vertex in the left (resp. right) half of the image graph.
A retraction is a homomorphism r from a graph G to a subgraph H of G such
that r(v) = v for each vertex v of H. In this case the subgraph H is called
a retract of G.[7]
A core is a graph with no homomorphism to any proper subgraph. Equivalently, a core can be defined as a graph that does not retract to any proper subgraph.[8] Every graph G is homomorphically equivalent to a unique core (up to
isomorphism), called the core of G.[9] Notably, this is not true in general for infinite
graphs.[10] However, the same definitions apply to directed graphs and a directed
graph is also equivalent to a unique core. Every graph and every directed graph
contains its core as a retract and as an induced subgraph.[7]
For example, all complete graphs Kn and all odd cycles (cycle graphs of odd
length) are cores. Every 3-colorable graph G that contains a triangle (that is, has
the complete graph K3 as a subgraph) is homomorphically equivalent to K3. This
is because, on one hand, a 3-coloring of G is the same as a
homomorphism G → K3, as explained below. On the other hand, every subgraph
of G trivially admits a homomorphism into G, implying K3 → G. This also means
that K3 is the core of any such graph G. Similarly, every bipartite graph that has
at least one edge is equivalent to K2.[11]

Connection to colorings
A k-coloring, for some integer k, is an assignment of one of k colors to each
vertex of a graph G such that the endpoints of each edge get different colors.
The k-colorings of G correspond exactly to homomorphisms from G to
the complete graph Kk.[12] Indeed, the vertices of Kk correspond to the k colors,
and two colors are adjacent as vertices of Kk if and only if they are different.
Hence a function defines a homomorphism to Kk if and only if it maps adjacent
vertices of G to different colors (i.e., it is a k-coloring). In particular, G is k-
colorable if and only if it is Kk-colorable.[12]
If there are two homomorphisms G → H and H → Kk, then their
composition G → Kk is also a homomorphism.[13] In other words, if a graph H can
be colored with k colors, and there is a homomorphism from G to H, then G can
also be k-colored. Therefore, G → H implies χ(G) ≤ χ(H), where χ denotes
the chromatic number of a graph (the least k for which it is k-colorable).[14]
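A small sketch of this correspondence, under the same plain edge-list conventions as the earlier sketches: a map into {0, …, k−1} satisfies the homomorphism condition into K_k exactly when no edge of G is monochromatic.

from itertools import combinations

def coloring_as_homomorphism(coloring, g_edges, k):
    """A k-coloring of G is exactly a homomorphism G -> K_k."""
    kk_edges = {frozenset(p) for p in combinations(range(k), 2)}
    return all(frozenset((coloring[u], coloring[v])) in kk_edges
               for u, v in g_edges)

# Hypothetical example: the 5-cycle needs 3 colors.
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
print(coloring_as_homomorphism({0: 0, 1: 1, 2: 0, 3: 1, 4: 2}, c5_edges, 3))  # True
print(coloring_as_homomorphism({0: 0, 1: 1, 2: 0, 3: 1, 4: 0}, c5_edges, 2))  # False: edge (4, 0) is monochromatic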
Variants
General homomorphisms can also be thought of as a kind of coloring: if the
vertices of a fixed graph H are the available colors and edges of H describe
which colors are compatible, then an H-coloring of G is an assignment of colors
to vertices of G such that adjacent vertices get compatible colors. Many notions
of graph coloring fit into this pattern and can be expressed as graph
homomorphisms into different families of graphs. Circular colorings can be
defined using homomorphisms into circular complete graphs, refining the usual
notion of colorings.[15] Fractional and b-fold coloring can be defined using
homomorphisms into Kneser graphs.[16] T-colorings correspond to
homomorphisms into certain infinite graphs. [17] An oriented coloring of a directed
graph is a homomorphism into any oriented graph.[18] An L(2,1)-coloring is a
homomorphism into the complement of the path graph that is locally injective,
meaning it is required to be injective on the neighbourhood of every vertex. [19]
Orientations without long paths
Main article: Gallai–Hasse–Roy–Vitaver theorem
Another interesting connection concerns orientations of graphs. An orientation of
an undirected graph G is any directed graph obtained by choosing one of the
two possible orientations for each edge. An example of an orientation of the
complete graph Kk is the transitive tournament T→k with vertices 1,2,…,k and arcs
from i to j whenever i < j. A homomorphism between orientations of
graphs G and H yields a homomorphism between the undirected
graphs G and H, simply by disregarding the orientations. On the other hand,
given a homomorphism G → H between undirected graphs, any
orientation H→ of H can be pulled back to an orientation G→ of G so that G→ has
a homomorphism to H→. Therefore, a graph G is k-colorable (has a
homomorphism to Kk) if and only if some orientation of G has a homomorphism
to T→k.[20]
A folklore theorem states that for all k, a directed graph G has a homomorphism to T→k if and only if it admits no homomorphism from the directed path P→k+1.[21] Here P→n is the directed graph with vertices 1, 2, …, n and edges from i to i +
1, for i = 1, 2, …, n − 1. Therefore, a graph is k-colorable if and only if it has an
orientation that admits no homomorphism from P→k+1. This statement can be
strengthened slightly to say that a graph is k-colorable if and only if some
orientation contains no directed path of length k (no P→k+1 as a subgraph). This is
the Gallai–Hasse–Roy–Vitaver theorem.

Connection to constraint satisfaction problems


Examples
Graph H of non-consecutive weekdays, isomorphic to the complement graph of C7 and to
the circular clique K7/2

Some scheduling problems can be modeled as a question about finding graph homomorphisms.[22][23] As an example, one might want to assign workshop
courses to time slots in a calendar so that two courses attended by the same
student are not too close to each other in time. The courses form a graph G, with
an edge between any two courses that are attended by some common student.
The time slots form a graph H, with an edge between any two slots that are
distant enough in time. For instance, if one wants a cyclical, weekly schedule,
such that each student gets their workshop courses on non-consecutive days,
then H would be the complement graph of C7. A graph homomorphism
from G to H is then a schedule assigning courses to time slots, as specified. [22] To
add a requirement saying that, e.g., no single student has courses on both
Friday and Monday, it suffices to remove the corresponding edge from H.
A simple frequency allocation problem can be specified as follows: a number of
transmitters in a wireless network must choose a frequency channel on which
they will transmit data. To avoid interference, transmitters that are geographically
close should use channels with frequencies that are far apart. If this condition is
approximated with a single threshold to define 'geographically close' and 'far
apart', then a valid channel choice again corresponds to a graph
homomorphism. It should go from the graph of transmitters G, with edges
between pairs that are geographically close, to the graph of channels H, with
edges between channels that are far apart. While this model is rather simplified,
it does admit some flexibility: transmitter pairs that are not close but could interfere because of geographical features can be added to the edges of G. Those
that do not communicate at the same time can be removed from it. Similarly,
channel pairs that are far apart but exhibit harmonic interference can be
removed from the edge set of H.[24]
In each case, these simplified models display many of the issues that have to be
handled in practice.[25] Constraint satisfaction problems, which generalize graph
homomorphism problems, can express various additional types of conditions
(such as individual preferences, or bounds on the number of coinciding
assignments). This allows the models to be made more realistic and practical.
Formal view
Graphs and directed graphs can be viewed as a special case of the far more
general notion called relational structures (defined as a set with a tuple of
relations on it). Directed graphs are structures with a single binary relation
(adjacency) on the domain (the vertex set). [26][3] Under this
view, homomorphisms of such structures are exactly graph homomorphisms. In
general, the question of finding a homomorphism from one relational structure to
another is a constraint satisfaction problem (CSP). The case of graphs gives a
concrete first step that helps to understand more complicated CSPs. Many
algorithmic methods for finding graph homomorphisms,
like backtracking, constraint propagation and local search, apply to all CSPs.[3]
For graphs G and H, the question of whether G has a homomorphism
to H corresponds to a CSP instance with only one kind of constraint, [3] as follows.
The variables are the vertices of G and the domain for each variable is the
vertex set of H. An evaluation is a function that assigns to each variable an
element of the domain, so a function f from V(G) to V(H). Each edge or arc (u,v)
of G then corresponds to the constraint ((u,v), E(H)). This is a constraint
expressing that the evaluation should map the arc (u,v) to a pair (f(u),f(v)) that is
in the relation E(H), that is, to an arc of H. A solution to the CSP is an evaluation
that respects all constraints, so it is exactly a homomorphism from G to H.
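The following is a minimal backtracking sketch of this CSP view: the variables are the vertices of G, the domain is V(H), and there is one constraint per edge of G. It is an illustration of the reduction, not an optimized solver, and it assumes G has no loops.

def find_homomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """Backtracking search for a homomorphism G -> H, viewed as a CSP."""
    h_edge_set = {frozenset(e) for e in h_edges}
    assignment = {}

    def consistent(v, c):
        # Check every edge of G between v and an already-assigned vertex.
        for a, b in g_edges:
            other = b if a == v else a if b == v else None
            if other in assignment and frozenset((c, assignment[other])) not in h_edge_set:
                return False
        return True

    def backtrack(i):
        if i == len(g_vertices):
            return dict(assignment)
        v = g_vertices[i]
        for c in h_vertices:            # try each vertex of H as the image of v
            if consistent(v, c):
                assignment[v] = c
                found = backtrack(i + 1)
                if found is not None:
                    return found
                del assignment[v]
        return None

    return backtrack(0)

# Hypothetical example: C5 -> C5 has a homomorphism (the identity),
# but C5 -> K2 does not (an odd cycle is not 2-colorable).
c5_vertices = list(range(5))
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
print(find_homomorphism(c5_vertices, c5_edges, c5_vertices, c5_edges) is not None)  # True
print(find_homomorphism(c5_vertices, c5_edges, [0, 1], [(0, 1)]))                   # None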

Structure of homomorphisms
Compositions of homomorphisms are homomorphisms. [13] In particular, the
relation → on graphs is transitive (and reflexive, trivially), so it is a preorder on
graphs.[27] Let the equivalence class of a graph G under homomorphic
equivalence be [G]. The equivalence class can also be represented by the
unique core in [G]. The relation → is a partial order on those equivalence
classes; it defines a poset.[28]
Let G < H denote that there is a homomorphism from G to H, but no
homomorphism from H to G. The relation → is a dense order, meaning that for
all (undirected) graphs G, H such that G < H, there is a graph K such
that G < K < H (this holds except for the trivial cases G = K0 or K1).[29][30] For
example, between any two complete graphs (except K0, K1, K2) there are infinitely
many circular complete graphs, corresponding to rational numbers between
natural numbers.[31]
The poset of equivalence classes of graphs under homomorphisms is
a distributive lattice, with the join of [G] and [H] defined as (the equivalence class
of) the disjoint union [G ∪ H], and the meet of [G] and [H] defined as the tensor
product [G × H] (the choice of graphs G and H representing the equivalence
classes [G] and [H] does not matter).[32] The join-irreducible elements of this
lattice are exactly connected graphs. This can be shown using the fact that a
homomorphism maps a connected graph into one connected component of the
target graph.[33][34] The meet-irreducible elements of this lattice are exactly
the multiplicative graphs. These are the graphs K such that a product G × H has
a homomorphism to K only when one of G or H also does. Identifying
multiplicative graphs lies at the heart of Hedetniemi's conjecture.[35][36]
Graph homomorphisms also form a category, with graphs as objects and
homomorphisms as arrows.[37] The initial object is the empty graph, while
the terminal object is the graph with one vertex and one loop at that vertex.
The tensor product of graphs is the category-theoretic product and
the exponential graph is the exponential object for this category.[36][38] Since these
two operations are always defined, the category of graphs is a cartesian closed
category. For the same reason, the lattice of equivalence classes of graphs
under homomorphisms is in fact a Heyting algebra.[36][38]
For directed graphs the same definitions apply. In particular → is a partial
order on equivalence classes of directed graphs. It is distinct from the order →
on equivalence classes of undirected graphs, but contains it as a suborder. This
is because every undirected graph can be thought of as a directed graph where
every arc (u,v) appears together with its inverse arc (v,u), and this does not
change the definition of homomorphism. The order → for directed graphs is
again a distributive lattice and a Heyting algebra, with join and meet operations
defined as before. However, it is not dense. There is also a category with
directed graphs as objects and homomorphisms as arrows, which is again
a cartesian closed category.[39][38]
Incomparable graphs

The Grötzsch graph, incomparable to K3

There are many incomparable graphs with respect to the homomorphism preorder, that is, pairs of graphs such that neither admits a homomorphism into
the other.[40] One way to construct them is to consider the odd girth of a graph G,
the length of its shortest odd-length cycle. The odd girth is, equivalently, the
smallest odd number g for which there exists a homomorphism from the cycle
graph on g vertices to G. For this reason, if G → H, then the odd girth of G is
greater than or equal to the odd girth of H.[41]
On the other hand, if G → H, then the chromatic number of G is less than or
equal to the chromatic number of H. Therefore, if G has strictly larger odd girth
than H and strictly larger chromatic number than H, then G and H are
incomparable.[40] For example, the Grötzsch graph is 4-chromatic and triangle-
free (it has girth 4 and odd girth 5), [42] so it is incomparable to the triangle
graph K3.
Examples of graphs with arbitrarily large values of odd girth and chromatic
number are Kneser graphs[43] and generalized Mycielskians.[44] A sequence of
such graphs, with simultaneously increasing values of both parameters, gives
infinitely many incomparable graphs (an antichain in the homomorphism
preorder).[45] Other properties, such as density of the homomorphism preorder,
can be proved using such families.[46] Constructions of graphs with large values of
chromatic number and girth, not just odd girth, are also possible, but more
complicated (see Girth and graph coloring).
Among directed graphs, it is much easier to find incomparable pairs. For
example, consider the directed cycle graphs C→n, with vertices 1, 2, …, n and
edges from i to i + 1 (for i = 1, 2, …, n − 1) and from n to 1. There is a
homomorphism from C→n to C→k (n, k ≥ 3) if and only if n is a multiple of k. In
particular, directed cycle graphs C→n with n prime are all incomparable.[47]
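A quick sketch of this divisibility criterion: when k divides n, the map i ↦ i mod k sends every arc of C→n to an arc of C→k, and by the statement above no homomorphism exists otherwise.

def directed_cycle_hom(n, k):
    """Homomorphism C_n -> C_k (directed cycles) exists iff k divides n;
    when it exists, i |-> i mod k is one (vertices numbered 0..n-1 here)."""
    if n % k != 0:
        return None
    f = {i: i % k for i in range(n)}
    # sanity check: every arc (i, i+1 mod n) maps to an arc (j, j+1 mod k)
    arcs_k = {(j, (j + 1) % k) for j in range(k)}
    assert all((f[i], f[(i + 1) % n]) in arcs_k for i in range(n))
    return f

print(directed_cycle_hom(6, 3) is not None)  # True: 3 divides 6
print(directed_cycle_hom(5, 3))              # None: C5 and C3 are incomparable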

Computational complexity
In the graph homomorphism problem, an instance is a pair of graphs (G,H) and
a solution is a homomorphism from G to H. The general decision problem,
asking whether there is any solution, is NP-complete.[48] However, limiting allowed
instances gives rise to a variety of different problems, some of which are much
easier to solve. Methods that apply when restricting the left side G are very different than for the right side H, but in each case a dichotomy (a sharp boundary between easy and hard cases) is known or conjectured.
Homomorphisms to a fixed graph
The homomorphism problem with a fixed graph H on the right side of each
instance is also called the H-coloring problem. When H is the complete graph Kk,
this is the graph k-coloring problem, which is solvable in polynomial time for k =
0, 1, 2, and NP-complete otherwise.[49] In particular, K2-colorability of a graph G is
equivalent to G being bipartite, which can be tested in linear time. More
generally, whenever H is a bipartite graph, H-colorability is equivalent to K2-
colorability (or K0 / K1-colorability when H is empty/edgeless), hence equally easy
to decide.[50] Pavol Hell and Jaroslav Nešetřil proved that, for undirected graphs,
no other case is tractable:
Hell–Nešetřil theorem (1990): The H-coloring problem is in P when H is bipartite
and NP-complete otherwise.[51][52]
This is also known as the dichotomy theorem for (undirected) graph
homomorphisms, since it divides H-coloring problems into NP-complete or P
problems, with no intermediate cases. For directed graphs, the situation is
more complicated and in fact equivalent to the much more general question
of characterizing the complexity of constraint satisfaction problems.[53] It turns
out that H-coloring problems for directed graphs are just as general and as
diverse as CSPs with any other kinds of constraints. [54][55] Formally, a
(finite) constraint language (or template) Γ is a finite domain and a finite set
of relations over this domain. CSP(Γ) is the constraint satisfaction problem
where instances are only allowed to use constraints in Γ.
Theorem (Feder, Vardi 1998): For every constraint language Γ, the problem
CSP(Γ) is equivalent under polynomial-time reductions to some H-coloring
problem, for some directed graph H.[55]
Intuitively, this means that every algorithmic technique or complexity
result that applies to H-coloring problems for directed graphs H applies
just as well to general CSPs. In particular, one can ask whether the Hell–
Nešetřil theorem can be extended to directed graphs. By the above
theorem, this is equivalent to the Feder–Vardi conjecture (aka CSP
conjecture, dichotomy conjecture) on CSP dichotomy, which states that
for every constraint language Γ, CSP(Γ) is NP-complete or in P.[48] This
conjecture was proved in 2017 independently by Dmitry Zhuk and Andrei
Bulatov, leading to the following corollary:
Corollary (Bulatov 2017; Zhuk 2017): The H-coloring problem on directed
graphs, for a fixed H, is either in P or NP-complete.
Homomorphisms from a fixed family of graphs
The homomorphism problem with a single fixed graph G on the left side of input instances can be solved by brute force in time |V(H)|^O(|V(G)|), so polynomial in the size of the input graph H.[56] In other words, the
problem is trivially in P for graphs G of bounded size. The interesting
question is then what other properties of G, beside size, make
polynomial algorithms possible.
The crucial property turns out to be treewidth, a measure of how tree-
like the graph is. For a graph G of treewidth at most k and a graph H, the homomorphism problem can be solved in time |V(H)|^O(k) with a standard dynamic programming approach. In fact, it is enough to assume that the core of G has treewidth at most k. This holds even if the core is not known.[57][58]
The exponent in the |V(H)|^O(k)-time algorithm cannot be lowered significantly: no algorithm with running time |V(H)|^o(tw(G)/log tw(G)) exists,
assuming the exponential time hypothesis (ETH), even if the inputs
are restricted to any class of graphs of unbounded treewidth. [59] The
ETH is an unproven assumption similar to P ≠ NP, but stronger.
Under the same assumption, there are also essentially no other
properties that can be used to get polynomial time algorithms. This is
formalized as follows:
Theorem (Grohe): For a computable class of graphs 𝒢, the homomorphism problem for instances (G, H) with G ∈ 𝒢 is in P if and only if graphs in 𝒢 have cores of bounded treewidth (assuming ETH).[58]
One can ask whether the problem is at least solvable in a time
arbitrarily highly dependent on G, but with a fixed polynomial
dependency on the size of H. The answer is again positive if we
limit G to a class of graphs with cores of bounded treewidth, and
negative for every other class.[58] In the language of parameterized complexity, this formally states that the homomorphism problem with inputs G ∈ 𝒢, parameterized by the size (number of edges) of G, exhibits a dichotomy: it is fixed-parameter tractable if graphs in 𝒢 have cores of bounded treewidth, and W[1]-complete otherwise.
The same statements hold more generally for constraint
satisfaction problems (or for relational structures, in other words).
The only assumption needed is that constraints can involve only a
bounded number of variables (all relations are of some bounded
arity, 2 in the case of graphs). The relevant parameter is then the
treewidth of the primal constraint graph.[59]

Introduction To Grammar in Theory of Computation

Prerequisite – Theory of Computation
Grammar:
A grammar is a finite set of formal rules for generating syntactically correct, meaningful sentences.
Constituents of Grammar:
A grammar is basically composed of two basic elements:
1. Terminal Symbols –
Terminal symbols are the components of the sentences generated using a grammar, and are represented using lower-case letters such as a, b, c.
2. Non-Terminal Symbols –
Non-terminal symbols take part in the generation of a sentence but are not themselves components of the sentence. Non-terminal symbols are also called auxiliary symbols or variables, and are represented using capital letters such as A, B, C.
Formal Definition of Grammar:
Any grammar can be represented as a 4-tuple <N, T, P, S>:
 N – Finite non-empty set of non-terminal symbols.
 T – Finite set of terminal symbols.
 P – Finite non-empty set of production rules.
 S – Start symbol (the symbol from which we start producing sentences or strings).
Production Rules:
A production (or production rule) in computer science is a rewrite rule specifying a symbol substitution that can be recursively performed to generate new symbol sequences. It is of the form α -> β, where α is a non-terminal symbol and β is a string of terminal and/or non-terminal symbols that replaces α.
Example-1:
Consider Grammar G1 = <N, T, P, S>
N = {A}    #Set of non-terminal symbols
T = {a,b}    #Set of terminal symbols
P = {A->Aa, A->Ab, A->a, A->b, A->ε}    #Set of all production rules
S = A    #Start symbol
Starting from A, we can produce Aa, Ab, a, b, and ε, and in the resulting strings A can again be replaced using the production rules. Hence this grammar can be used to produce strings of the form (a+b)*.
Derivation Of Strings :
A->a    #using production rule 3
OR
A->Aa    #using production rule 1
Aa->ba    #using production rule 4
OR
A->Aa    #using production rule 1
Aa->AAa    #using production rule 1
AAa->bAa    #using production rule 4
bAa->ba    #using production rule 5
Example-2:
Consider Grammar G2 = <N, T, P, S>
N = {A}   #Set of non-terminal symbols
T = {a}    #Set of terminal symbols
P = {A->Aa, A->AAa, A->a, A->ε}    #Set of all production rules
S = A   #Start symbol
Starting from A, we can produce Aa, AAa, a, and ε, and in the resulting strings A can again be replaced using the production rules. Hence this grammar can be used to produce strings of the form (a)*.
Derivation Of Strings :
A->a    #using production rule 3
OR
A->Aa    #using production rule 1
Aa->aa    #using production rule 3
OR
A->Aa    #using production rule 1
Aa->AAa    #using production rule 1
AAa->Aa    #using production rule 4
Aa->aa    #using production rule 3
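To make such derivations concrete, here is a small Python sketch that enumerates the terminal strings derivable from a grammar by breadth-first leftmost rewriting, using Example-2's productions. The representation (single-character non-terminals, '' for ε, a length cap) is an implementation choice for this sketch, not part of the formal definition.

from collections import deque

def derive_strings(productions, start, max_len=4):
    """Enumerate terminal strings of length <= max_len derivable from `start`.
    `productions` maps a non-terminal (one upper-case character) to its
    right-hand sides; '' stands for the empty string ε."""
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # find the leftmost non-terminal in the sentential form
        i = next((j for j, s in enumerate(form) if s in productions), None)
        if i is None:                 # no non-terminals left: a terminal string
            results.add(form)
            continue
        for rhs in productions[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            # crude pruning: cap the length of intermediate forms
            if len(new) <= max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(s for s in results if len(s) <= max_len)

# Example-2's grammar: A -> Aa | AAa | a | ε
g2 = {'A': ['Aa', 'AAa', 'a', '']}
print(derive_strings(g2, 'A'))   # ['', 'a', 'aa', 'aaa', 'aaaa'], i.e. strings of a*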
Equivalent Grammars:
Grammars are said to be equivalent if they produce the same language.
Different Types of Grammars:
Grammars can be classified on the basis of:
 Type of Production Rules
 Number of Derivation Trees
 Number of Strings
Decidable and Undecidable Problems in Theory of Computation

Prerequisite – Turing Machine


A problem is said to be Decidable if we can always construct a
corresponding algorithm that can answer the problem correctly. We can intuitively
understand Decidable problems by considering a simple example. Suppose we are asked
to compute all the prime numbers in the range of 1000 to 2000. To find the solution of
this problem, we can easily devise an algorithm that can enumerate all the prime numbers
in this range.
Now, talking about decidability in terms of a Turing machine, a problem is said to be decidable if there exists a corresponding Turing machine which halts on every input with an answer: yes or no. Such problems are also termed Turing decidable, since a Turing machine always halts on every input, accepting or rejecting it.
Semi-Decidable Problems –
Semi-decidable problems are those for which a Turing machine halts on every input it accepts, but may either halt or loop forever on an input it rejects. Such problems are termed Turing recognisable.

Examples – We will now consider a few important decidable problems:
 Are two regular languages L and M equivalent?
We can easily check this by using the set difference operation: L and M are equivalent exactly when L−M = ∅ and M−L = ∅, i.e., when (L−M) ∪ (M−L) = ∅, and emptiness of a regular language is decidable.
 Membership of a CFL?
We can always determine whether a string belongs to a given CFL by using an algorithm based on dynamic programming (see the CYK sketch after this list).
 Emptiness of a CFL?
By checking the production rules of the CFG we can easily state whether the language generates any strings or not.
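A minimal sketch of that dynamic-programming membership test, the CYK algorithm, assuming the grammar is given in Chomsky normal form. The example grammar for {a^n b^n | n >= 1} is one standard CNF choice, not the only one.

def cyk(word, binary_rules, terminal_rules, start='S'):
    """CYK membership test for a CFG in Chomsky normal form.
    binary_rules:   list of (A, B, C) for productions A -> B C
    terminal_rules: list of (A, a)    for productions A -> a"""
    n = len(word)
    if n == 0:
        return False   # ε needs special handling (S -> ε), omitted here
    # table[i][j]: set of non-terminals deriving word[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {A for A, a in terminal_rules if a == ch}
    for length in range(2, n + 1):            # span length
        for i in range(n - length + 1):       # span start
            for split in range(1, length):    # split point
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                table[i][length - 1] |= {A for A, B, C in binary_rules
                                         if B in left and C in right}
    return start in table[0][n - 1]

# Hypothetical CNF grammar for {a^n b^n | n >= 1}:
# S -> A T | A B,  T -> S B,  A -> a,  B -> b
binary = [('S', 'A', 'T'), ('S', 'A', 'B'), ('T', 'S', 'B')]
terminal = [('A', 'a'), ('B', 'b')]
print(cyk('aabb', binary, terminal))  # True
print(cyk('aab',  binary, terminal))  # False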
Undecidable Problems –
The problems for which we can’t construct an algorithm that can answer the
problem correctly in finite time are termed as Undecidable Problems. These
problems may be partially decidable but they will never be decidable. That is
there will always be a condition that will lead the Turing Machine into an infinite
loop without providing an answer at all.
We can understand this intuitively by considering the search for a counterexample to Fermat's Last Theorem, which states that no three positive integers a, b and c satisfy a^n + b^n = c^n for any n > 2. If we feed this search to a Turing machine that enumerates candidate values of n, a, b and c, the machine runs forever when no counterexample exists. Before the theorem was proved, there was no way to tell whether the search would ever halt, which illustrates why such search problems are only semi-decidable.
Examples – These are a few important undecidable problems:
 Whether a CFG generates all strings or not?
As a CFG may generate infinitely many strings, we can never reach the last string, and hence this is undecidable.
 Whether two CFGs L and M are equal?
Since we cannot determine all the strings of an arbitrary CFG, we cannot decide whether two CFGs are equal.
 Ambiguity of a CFG?
There exists no algorithm which can check whether a given CFG is ambiguous. We can only check whether some particular string of the CFG yields two different parse trees, in which case the CFG is ambiguous.
 Is it possible to convert a given ambiguous CFG into a corresponding non-ambiguous CFG?
This is also undecidable, as there doesn't exist any algorithm for the conversion of an ambiguous CFG to a non-ambiguous CFG.
 Is a language L, which is a CFL, regular?
This is undecidable, as we cannot determine from the production rules of the CFG whether the language is regular or not.
Some more undecidable problems related to Turing machines:
 Membership problem of a Turing machine?
 Finiteness of a Turing machine?
 Emptiness of a Turing machine?
 Whether the language accepted by a Turing machine is regular or a CFL?

Pumping Lemma in Theory of Computation
There are two pumping lemmas, defined for (1) regular languages and (2) context-free languages.

Pumping Lemma for Regular Languages
For any regular language L, there exists an integer n such that for all x ∈ L with |x| ≥ n, there exist u, v, w ∈ Σ* such that x = uvw, and
(1) |uv| ≤ n
(2) |v| ≥ 1
(3) for all i ≥ 0: u v^i w ∈ L
In simple terms, this means that if the substring v is 'pumped', i.e., repeated any number of times, the resultant string still remains in L.

The pumping lemma is used to prove the irregularity of a language. If a language is regular, it always satisfies the pumping lemma; so if there exists at least one pumped string which is not in L, then L is surely not regular. The converse may not hold: if the pumping lemma is satisfied, it does not follow that the language is regular.

For example, let us prove that L01 = {0^n 1^n | n ≥ 0} is irregular. Assume that L01 is regular; then the pumping lemma applies. Let x ∈ L01 with |x| ≥ n. By the pumping lemma, there exist u, v, w such that (1)–(3) hold. We show that (1)–(3) cannot all hold. If (1) and (2) hold, then x = 0^n 1^n = uvw with |uv| ≤ n and |v| ≥ 1, so u = 0^a, v = 0^b, w = 0^c 1^n where a + b ≤ n, b ≥ 1, c ≥ 0, and a + b + c = n. But then (3) fails for i = 0: u v^0 w = uw = 0^a 0^c 1^n = 0^(a+c) 1^n ∉ L01, since a + c ≠ n.
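The i = 0 step of this argument can be checked mechanically. The Python sketch below (a demonstration for one fixed n, not a proof) tries every decomposition x = uvw allowed by conditions (1) and (2) and confirms that each one fails for some i.

def in_L(s):
    """Membership in L = {0^n 1^n | n >= 0}."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == '0' * half + '1' * half

def pumping_breaks(x, n):
    """For every split x = uvw with |uv| <= n and |v| >= 1,
    check whether some pumped string u v^i w leaves the language."""
    for j in range(0, n):               # j = |u|
        for k in range(j + 1, n + 1):   # k = |uv|
            u, v, w = x[:j], x[j:k], x[k:]
            if all(in_L(u + v * i + w) for i in (0, 1, 2)):
                return False            # this split survives pumping
    return True                         # every split fails: x is a witness

n = 5
print(pumping_breaks('0' * n + '1' * n, n))  # True: no valid split can be pumped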

Pumping Lemma for Context-Free Languages (CFL)
The pumping lemma for CFLs states that for any context-free language L, it is possible to find two substrings that can be 'pumped' any number of times with the result still in the language: we break a sufficiently long string into five parts and pump the second and fourth. Here too, the pumping lemma is used as a tool to prove that a language is not a CFL: if any one string fails its conditions, then the language is not context-free. Formally, if L is a CFL, there exists an integer n such that for all z ∈ L with |z| ≥ n, there exist u, v, w, x, y ∈ Σ* such that z = uvwxy, and
(1) |vwx| ≤ n
(2) |vx| ≥ 1
(3) for all i ≥ 0: u v^i w x^i y ∈ L

For the earlier example, {0^n 1^n | n ≥ 0} is a CFL, as any long enough string can be pumped at two places, one among the 0's and the other among the 1's. Let us prove that L012 = {0^n 1^n 2^n | n ≥ 0} is not context-free. Assume that L012 is context-free; then the pumping lemma applies. Let z ∈ L012 with |z| ≥ n. By the pumping lemma, there exist u, v, w, x, y such that (1)–(3) hold. We show that (1)–(3) cannot all hold. If (1) and (2) hold, then z = 0^n 1^n 2^n = uvwxy with |vwx| ≤ n and |vx| ≥ 1. Condition (1) tells us that vwx cannot contain both 0's and 2's, so either vwx has no 0's or vwx has no 2's, giving two cases to consider. Suppose vwx has no 0's. By (2), vx contains a 1 or a 2, so uwy has n 0's but fewer than n 1's or fewer than n 2's. But (3) with i = 0 gives uwy = u v^0 w x^0 y ∈ L012, which would require uwy to have equal numbers of 0's, 1's and 2's, a contradiction. The case where vwx has no 2's is similar and also gives a contradiction. Thus L012 is not context-free.

Source: John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman (2003). Introduction to Automata Theory, Languages, and Computation.

Relationship between Grammar and Language in Theory of Computation
A grammar is a set of production rules which are used to generate strings of a language.
In this article, we have discussed how to find the language generated by a grammar and
vice versa as well.

Language generated by a grammar –

Given a grammar G, its corresponding language L(G) represents the set of all strings
generated from G. Consider the following grammar,
G: S -> aSb | ε
In this grammar, using S -> ε, we can generate ε. Therefore, ε is part of L(G). Similarly, using S => aSb => ab, ab is generated. Similarly, aabb can also be generated.
Therefore,
L(G) = {a^n b^n | n >= 0}
In the language L(G) discussed above, the condition n = 0 is taken to accept ε.
Key Points –

 For a given grammar G, its corresponding language L(G) is unique.
 The language L(G) corresponding to grammar G must contain all strings which can be generated from G.
 The language L(G) corresponding to grammar G must not contain any string which cannot be generated from G.
Let us discuss questions based on this:
Que-1. Consider the grammar: (GATE-CS-2009)
S -> aSa|bSb|a|b
The language generated by the above grammar over the alphabet {a,b} is the
set of:
(A) All palindromes
(B) All odd length palindromes.
(C) Strings that begin and end with the same symbol
(D) All even length palindromes
Solution: Using S->a and S->b, a and b can be generated. Similarly, using S=>aSa=>aba, aba can be generated. Other strings which can be generated from the grammar are: a, b, aba, bab, aaa, bbb, ababa, …, i.e., the odd-length palindromes.
Therefore, option (B) is correct.
Que-2. Consider the following context-free grammars: (GATE-CS-2016)
(The grammars G1 and G2 were given as figures in the original question and are not reproduced here; the derivations below indicate their productions.)
Which one of the following pairs of languages is generated by G1 and G2, respectively?
Solution: Consider the grammar G1:
Using S=>B=>b, b can be generated.
Using S=>B=>bB=>bb, bb can be generated.
Using S=>aS=>aB=>ab, ab can be generated.
Using S=>aS=>aB=>abB=>abb, abb can be generated.
As we can see, the number of a's can be zero or more, but the number of b's is always greater than zero.
Therefore,
L(G1) = {a^m b^n | m >= 0 and n > 0}
Consider the grammar G2:
Using S=>aA=>a, a can be generated.
Using S=>bB=>b, b can be generated.
Using S=>aA=>aaA=>aa, aa can be generated.
Using S=>bB=>bbB=>bb, bb can be generated.
Using S=>aA=>aB=>abB=>abb, abb can be generated.
As we can see, either the number of a's or the number of b's must be greater than 0.
Therefore,
L(G2) = {a^m b^n | m > 0 or n > 0}

Grammar generating a given language –

Given a language L(G), its corresponding grammar G represents the production rules which produce L(G). Consider the language L(G):
L(G) = {a^n b^n | n >= 0}
The language L(G) is the set of strings ε, ab, aabb, aaabbb, …
For the string ε in L(G), the production rule can be S -> ε.
For the other strings in L(G), the production rule can be S -> aSb | ε.
Therefore, a grammar G corresponding to L(G) is:
S -> aSb | ε
Key Points –
 For a given language L(G), there can be more than one grammar which can
produce L(G).
 The grammar G corresponding to language L(G) must generate all possible
strings of L(G).
 The grammar G corresponding to language L(G) must not generate any
string which is not part of L(G).
Let us discuss questions based on this:
Que-3. Which one of the following grammars generates the language L = {a^i b^j | i ≠ j}? (GATE-CS-2006)
(The four candidate grammars (A)-(D) were given as figures in the original question and are not reproduced here.)
Solution: The given language L contains the strings:
{a, b, aa, bb, aaa, bbb, aab, abb, …}
That is, a string consists of zero or more a's followed by zero or more b's, with unequal numbers of a's and b's.
If we consider the grammar in option (A), it can generate ab as:
S=>AC=>aAC=>aC=>ab
However, ab is not in L (since i = j = 1). Therefore, the grammar in option (A) is not correct.
Similarly, the grammar in option (B) can generate ab as:
S=>aS=>ab
However, ab is not in L. Therefore, the grammar in option (B) is not correct.
Similarly, the grammar in option (C) can generate ab as:
S=>AC=>C=>aCb=>ab
However, ab is not in L. Therefore, the grammar in option (C) is not correct.
Therefore, by the method of elimination, option (D) is correct.

Introduction of Finite Automata
Finite Automata (FA) is the simplest machine to recognize patterns. A finite automaton, or finite state machine, is an abstract machine defined by five components (a 5-tuple). It has a set of states and rules for moving from one state to another, depending on the applied input symbol. Basically, it is an abstract model of a digital computer. The following figure shows some essential features of a general automaton.

Figure: Features of Finite Automata

The above figure shows the following features of automata:
1. Input
2. Output
3. States of automata
4. State relation
5. Output relation
A finite automaton consists of the following:
Q : Finite set of states.
Σ : set of Input Symbols.
q : Initial state.
F : set of Final States.
δ : Transition Function.
Formal specification of machine is 

{ Q, Σ, q, F, δ }
FA is characterized into two types: 
1) Deterministic Finite Automata (DFA):
DFA consists of 5 tuples {Q, Σ, q, F, δ}.
Q : set of all states.
Σ : set of input symbols. ( Symbols which machine takes as input )
q : Initial state. ( Starting state of a machine )
F : set of final state.
δ : Transition Function, defined as δ : Q X Σ --> Q.
In a DFA, for a particular input character, the machine goes to one state only. A
transition function is defined on every state for every input symbol. Also in DFA
null (or ε) move is not allowed, i.e., DFA cannot change state without any input
character. 
For example, the DFA below with Σ = {0, 1} accepts all strings ending with 0.
 

Figure: DFA with Σ = {0, 1}


One important thing to note is that there can be many possible DFAs for a pattern. A DFA with a minimum number of states is generally preferred.
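A minimal Python simulation of such a DFA; the state names q1/q2 are an assumption, since the figure's labels are not reproduced here.

def run_dfa(s, delta, start, accepting):
    """Simulate a DFA on input string s."""
    state = start
    for ch in s:
        state = delta[(state, ch)]
    return state in accepting

# A plausible DFA for "all strings over {0, 1} ending with 0":
# q1 = last symbol was not 0, q2 = last symbol was 0 (accepting).
delta = {('q1', '0'): 'q2', ('q1', '1'): 'q1',
         ('q2', '0'): 'q2', ('q2', '1'): 'q1'}
print(run_dfa('0110', delta, 'q1', {'q2'}))  # True:  ends with 0
print(run_dfa('01',   delta, 'q1', {'q2'}))  # False: ends with 1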

2) Nondeterministic Finite Automata (NFA): NFA is similar to DFA except for the following additional features:
1. A null (or ε) move is allowed, i.e., it can move forward without reading symbols.
2. It has the ability to transition to any number of states for a particular input.
However, these features don't add any power to the NFA: if we compare the two in terms of power, they are equivalent.
Due to the above additional features, an NFA has a different transition function; the rest is the same as a DFA.
δ : Transition Function
δ : Q X (Σ U {ε}) --> 2^Q
As you can see, the transition function allows the NFA, for any input including null (or ε), to go to any number of states. For example, below is an NFA for the above problem.

NFA

One important thing to note is that, in an NFA, if any path for an input string leads to a final state, then the input string is accepted. For example, in the above NFA, there are multiple paths for the input string "00". Since one of the paths leads to a final state, "00" is accepted by the above NFA.
Some Important Points:
 Justification:
All the tuples in a DFA and an NFA are the same except for one, the transition function (δ):
In case of DFA
δ : Q X Σ --> Q
In case of NFA
δ : Q X Σ --> 2^Q
Now, if you observe, every function of the form Q X Σ --> Q can be viewed as a function of the form Q X Σ --> 2^Q (map each state to the singleton set containing it), but the reverse isn't true. So, mathematically, we can conclude that every DFA is an NFA, but not vice-versa. Yet there is a way to convert an NFA to a DFA, so there exists an equivalent DFA for every NFA.
 
1. Both NFA and DFA have the same power and each NFA can be translated
into a DFA. 
2. There can be multiple final states in both DFA and NFA. 
3. NFA is more of a theoretical concept. 
4. DFA is used in Lexical Analysis in Compiler. 
5. If the number of states in the NFA is N then, its DFA can have at most 2^N states.
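A minimal sketch of that conversion, the subset construction, with ε-moves omitted for brevity: each DFA state is a set of NFA states, which is why at most 2^N of them can arise.

from collections import deque

def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Subset construction (no ε-moves): DFA states are frozensets of NFA states."""
    start_set = frozenset([start])
    dfa_delta, dfa_accepting = {}, set()
    queue, seen = deque([start_set]), {start_set}
    while queue:
        S = queue.popleft()
        if S & accepting:                 # a DFA state accepts if it contains
            dfa_accepting.add(S)          # an accepting NFA state
        for a in alphabet:
            T = frozenset(t for s in S for t in nfa_delta.get((s, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    return dfa_delta, start_set, dfa_accepting

# Hypothetical NFA for "strings ending with 0": stay in q0 on 0/1,
# nondeterministically guess that the current 0 is the last symbol.
nfa = {('q0', '0'): {'q0', 'qf'}, ('q0', '1'): {'q0'}}
delta, start, acc = nfa_to_dfa(nfa, 'q0', {'qf'}, '01')
print(len({s for s, _ in delta}))  # number of reachable DFA states (here 2)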
