Chomsky Hierarchy of Languages
Type 0 Grammar:
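Type 0 grammar is known as unrestricted grammar. There is no restriction on the form of its production
rules, except that the left-hand side must not be empty. The languages generated by Type 0 grammars are
exactly those recognized by Turing machines (the recursively enumerable languages).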
example:
bAa → aa
S→s
Type 1 Grammar:
Type 1 grammar is known as context-sensitive grammar. It generates the context-sensitive languages.
A context-sensitive grammar obeys the following rules:
• The left-hand side of a production rule may contain more than one symbol.
• The number of symbols on the left-hand side must not exceed the number of symbols on the
right-hand side.
• A rule of the form A → ε is not allowed unless A is the start symbol and A does not occur on the
right-hand side of any rule.
• Every Type 1 grammar is also a Type 0 grammar. In Type 1, each production has the form V → T,
where V and T are strings of symbols and the number of symbols in V is less than or equal to the
number of symbols in T.
example:
S → AT
T → xy
A→a
Type 2 Grammar:
Type 2 grammar is known as context-free grammar (CFG). Context-free languages are the languages that can be
generated by a context-free grammar. A Type 2 grammar also meets the Type 1 restrictions, except that it may
contain ε-productions. Each production rule has the form
A → α
where A is a single non-terminal and α is any string of terminals and non-terminals.
example:
A → aBb
A→b
B→a
Type 3 Grammar:
Type 3 grammar is known as regular grammar. Regular languages are the languages that can be
described by regular expressions. They are recognized by finite automata (NFA or DFA).
Type 3 is the most restricted form of grammar. Every Type 3 grammar is also a Type 2 grammar and,
ε-productions aside, a Type 1 grammar. Each production has the form
V → T*V | T*
where V is a single non-terminal and T* is a string of zero or more terminals.
example:
A → xy
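The four sets of restrictions above can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the function name chomsky_type is hypothetical, each symbol is assumed to be a single character, and upper-case letters are assumed to be non-terminals. It returns the most restrictive type whose rule form every production satisfies. The production AB → AbBc used in the usage note is an illustrative assumption and does not come from the slides.

```python
def is_nonterminal(sym):
    # Assumed convention: upper-case letters are non-terminals.
    return sym.isupper()


def is_type2(lhs, rhs):
    # A -> alpha: the left-hand side is one non-terminal.
    return len(lhs) == 1 and is_nonterminal(lhs)


def is_type3(lhs, rhs):
    # V -> T*V | T*: terminals, optionally followed by one non-terminal.
    body = rhs[:-1] if rhs and is_nonterminal(rhs[-1]) else rhs
    return is_type2(lhs, rhs) and not any(is_nonterminal(s) for s in body)


def is_type1(lhs, rhs):
    # The left-hand side must not be longer than the right-hand side.
    return len(lhs) <= len(rhs)


def chomsky_type(productions):
    """Return 3, 2, 1 or 0 for a list of (lhs, rhs) string pairs."""
    for level, check in ((3, is_type3), (2, is_type2), (1, is_type1)):
        if all(check(lhs, rhs) for lhs, rhs in productions):
            return level
    return 0


print(chomsky_type([("bAa", "aa"), ("S", "s")]))             # Type 0 example
print(chomsky_type([("A", "aBb"), ("A", "b"), ("B", "a")]))  # Type 2 example
print(chomsky_type([("A", "xy")]))                           # Type 3 example
```

Under these assumptions the Type 1 example from the slides (S → AT, T → xy, A → a) is reported as Type 2, because every left-hand side is a single non-terminal. A rule such as AB → AbBc is needed before the result drops to Type 1.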
Chomsky Hierarchy of Languages & Recognizers
Context-Free Grammar (CFG)
CFG stands for context-free grammar. It is a formal grammar used to generate all the strings of a given
formal language. A context-free grammar G is defined by a 4-tuple:
G = (V, T, P, S)
Where,
G is the grammar, which generates the strings of the language.
T is the finite set of terminal symbols. They are denoted by lower-case letters.
V is the finite set of non-terminal symbols. They are denoted by capital letters.
P is the set of production rules. Each rule replaces a non-terminal (the left side of the production) in
a string with a string of terminal and non-terminal symbols (the right side of the production).
S is the start symbol, from which every string is derived. We derive a string by repeatedly replacing a
non-terminal with the right-hand side of one of its productions until all non-terminals have been replaced by terminal symbols.
Construct the CFG for the language having any number of a's over the set ∑ = {a}.
r.e. = a*
The production rules for the regular expression are as follows:
S → aS rule 1
S → ε rule 2
The r.e. = a* generates the set of strings {ε, a, aa, aaa, .....}. The null string is included
because S is the start symbol and rule 2 gives S → ε.
Now if we want to derive the string "aaaaaa", we start with the start symbol:
S
aS rule 1
aaS rule 1
aaaS rule 1
aaaaS rule 1
aaaaaS rule 1
aaaaaaS rule 1
aaaaaaε rule 2
aaaaaa
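The derivation above applies rule 1 once per a and then finishes with rule 2. A minimal sketch of that process follows. The function name derive_a_star is hypothetical, and sentential forms are assumed to be plain Python strings in which S is the only non-terminal.

```python
def derive_a_star(n):
    """Return the sentential forms in the derivation of a^n from S."""
    forms = ["S"]
    for _ in range(n):
        # Rule 1: S -> aS
        forms.append(forms[-1].replace("S", "aS"))
    # Rule 2: S -> ε (the empty string)
    forms.append(forms[-1].replace("S", ""))
    return forms


# Show ε explicitly when the derived string is empty.
print(" => ".join(form or "ε" for form in derive_a_star(6)))
# prints: S => aS => aaS => aaaS => aaaaS => aaaaaS => aaaaaaS => aaaaaa
```

Calling derive_a_star(0) gives the two-step derivation S => ε, which is why the null string belongs to the language.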
For the regular expression (0+1)*, the rules combine 0's and 1's with the start symbol, for example
S → 0S | 1S. Since (0+1)* denotes {ε, 0, 1, 01, 10, 00, 11, ....}, ε is a string of the language, so
we also add the rule S → ε.
Derivation
A derivation is a sequence of applications of production rules that produces the input string from the start symbol.
During parsing, we have to take two decisions. These are as follows:
• We have to decide which non-terminal is to be replaced.
• We have to decide the production rule by which the non-terminal will be replaced.
There are two standard ways to choose the non-terminal to be replaced: leftmost derivation and rightmost derivation.
Leftmost Derivation:
In a leftmost derivation, the leftmost non-terminal of the current sentential form is replaced at every step,
so the string is built up from left to right.
E → E+E
E → E-E
E → a | b
Input
a-b+a
The leftmost derivation is:
E ⇒ E+E
  ⇒ E-E+E
  ⇒ a-E+E
  ⇒ a-b+E
  ⇒ a-b+a
Rightmost Derivation:
In a rightmost derivation, the rightmost non-terminal of the current sentential form is replaced at every step,
so the string is built up from right to left.
E → E+E
E → E-E
E → a | b
Input
a-b+a
The rightmost derivation is:
E ⇒ E-E
  ⇒ E-E+E
  ⇒ E-E+a
  ⇒ E-b+a
  ⇒ a-b+a
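The two derivations of a-b+a can be replayed step by step. This is a minimal sketch under stated assumptions: the helper names expand and derive are hypothetical, E is assumed to be the only non-terminal, and the list of right-hand sides to apply is supplied by hand rather than found by a parser.

```python
def expand(form, rhs, leftmost=True):
    """Replace the leftmost (or rightmost) E in form with rhs."""
    i = form.find("E") if leftmost else form.rfind("E")
    if i < 0:
        raise ValueError("no non-terminal left to expand")
    return form[:i] + rhs + form[i + 1:]


def derive(choices, leftmost=True):
    """Apply the chosen right-hand sides in order, starting from E."""
    forms = ["E"]
    for rhs in choices:
        forms.append(expand(forms[-1], rhs, leftmost))
    return forms


left = derive(["E+E", "E-E", "a", "b", "a"], leftmost=True)
right = derive(["E-E", "E+E", "a", "b", "a"], leftmost=False)
print(" => ".join(left))
print(" => ".join(right))
```

Both runs end in a-b+a. The only difference is which occurrence of E is rewritten at each step.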
Derive the string "abbcbba" using the CFG given by,
S → aSa rule 1
S → bSb rule 2
S→c rule 3
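For this grammar each outer pair of matching symbols fixes the rule to apply (rule 1 for a...a, rule 2 for b...b), and the middle c ends the derivation with rule 3. A minimal sketch of that idea follows. The function name derive_palindrome is hypothetical, and it returns None for strings the grammar cannot produce.

```python
def derive_palindrome(w):
    """Return the sentential forms deriving w from S, or None."""
    prefix, forms = "", ["S"]
    while w != "c":
        # Rules 1 and 2 need a matching a...a or b...b around a shorter string.
        if len(w) < 3 or w[0] != w[-1] or w[0] not in "ab":
            return None
        prefix += w[0]
        forms.append(prefix + "S" + prefix[::-1])
        w = w[1:-1]
    # Rule 3: S -> c
    forms.append(prefix + "c" + prefix[::-1])
    return forms


print(" => ".join(derive_palindrome("abbcbba")))
# prints: S => aSa => abSba => abbSbba => abbcbba
```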
Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG given by,
S → AB | ε
A → aB
B → Sb
Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given by,
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Derivation Tree
A derivation tree is a graphical representation of the derivation of a string from the production rules of a given
CFG. It is a simple way to show how a string can be obtained from a given set of production
rules. The derivation tree is also called a parse tree.
A parse tree reflects the precedence of operators. The deepest sub-tree is evaluated first, so the operator in a
parent node has lower precedence than the operator in its sub-tree.
Input
a*b+c
S → bSb | a | b
Input
bab
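The tree for bab can be written down as nested data. In the sketch below a node is assumed to be a pair (label, children) and a leaf is a plain string. The function names parse and tree_yield are hypothetical. Reading the leaves from left to right gives back the input string.

```python
def parse(w):
    """Build a derivation tree for w under S -> bSb | a | b, or return None."""
    if w in ("a", "b"):
        return ("S", [w])
    if len(w) >= 3 and w[0] == "b" and w[-1] == "b":
        inner = parse(w[1:-1])
        if inner is not None:
            return ("S", ["b", inner, "b"])
    return None


def tree_yield(node):
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(tree_yield(child) for child in children)


tree = parse("bab")
print(tree)              # ('S', ['b', ('S', ['a']), 'b'])
print(tree_yield(tree))  # bab
```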
Ambiguity in Grammar
A grammar is said to be ambiguous if some string has more than one leftmost derivation, more than one
rightmost derivation, or more than one parse tree. If the grammar is not ambiguous,
then it is called unambiguous.
If a grammar is ambiguous, it is not well suited to compiler construction. No general algorithm can
detect and remove ambiguity automatically (the problem is undecidable for context-free grammars), but
ambiguity can often be removed by rewriting the grammar.
E→I
E→E+E
E→E*E
E → (E)
I → ε | 0 | 1 | 2 | ... | 9
Input
For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost derivation:
Check whether the given grammar G is ambiguous or not.
E→E+E
E→E-E
E → id
Solution:
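From the above grammar, the string "id + id - id" can be derived in 2 ways, both of them leftmost derivations:
E ⇒ E+E ⇒ id+E ⇒ id+E-E ⇒ id+id-E ⇒ id+id-id
E ⇒ E-E ⇒ E+E-E ⇒ id+E-E ⇒ id+id-E ⇒ id+id-id
Since the string has two leftmost derivations, the grammar G is ambiguous.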
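The same check can be sketched in code by counting parse trees: a string with more than one parse tree shows that the grammar is ambiguous. This is a minimal sketch for this one grammar only, not a general ambiguity test. The function name count_parse_trees is hypothetical, and the input is assumed to be already split into the tokens id, + and -.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of tokens under E -> E+E | E-E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id - id".split()))  # 2
```

A count of 2 matches the two derivations of "id + id - id", one with + at the root of the tree and one with - at the root.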