Unit-3 Context Free Grammar

The document covers the theory of computation with a focus on Context Free Grammar (CFG), including the Chomsky hierarchy, types of grammars, derivation methods, ambiguity, and simplified forms. It provides definitions, examples, and exercises related to CFG, recursive definitions, and the conversion of finite automata to regular grammar. Additionally, it discusses the characteristics of ambiguous and unambiguous grammars, along with techniques for eliminating nullable variables and productions.

Uploaded by

Karan Agravat

2160704

Theory of Computation

Unit-3
Context Free Grammar
Topics to be covered
• Chomsky hierarchy
• Context free grammar
• Recursive definition
• FA to regular grammar
• Derivation
• Ambiguity & unambiguous grammar
• Simplified forms & normal forms
• CFG to CNF
• Union, Concatenation & Kleene’s of CFG
Chomsky Hierarchy
Chomsky hierarchy (Classification of grammar)
Grammar
• Unrestricted grammar (type 0)
• Restricted grammar
   - Context sensitive grammar (type 1)
   - Context free grammar (type 2)
   - Regular grammar (type 3)
Type of grammar, name of grammar, language generated, accepting machine, form of the productions X → Y, and examples:

Type – 0: Unrestricted Grammar
• Language generated: Recursive Enumerable Language
• Accepted by: Turing Machine
• Form: X ∈ (V+T)* V (V+T)*, Y ∈ (V+T)*
• Examples: aSb → ab (✓), a → Sb (✗)

Type – 1: Context Sensitive Grammar
• Language generated: Context Sensitive Language
• Accepted by: Linear Bounded Automata
• Form: Type 0 conditions plus |X| <= |Y|, Y ∈ (V+T)*
• Examples: Sab → abab (✓), aSb → b (✗), S → ^ (✓, allowed as an exception)

Type – 2: Context Free Grammar
• Language generated: Context Free Language
• Accepted by: Push Down Automata
• Form: Type 0 and Type 1 conditions plus X is a single variable, Y ∈ (V+T)*
• Examples: S → aAb (✓), S → ^ (✓), Sa → ab (✗)

Type – 3: Regular Grammar
• Language generated: Regular Language
• Accepted by: Finite Automata
• Form: Type 2 conditions plus Y ∈ VT* + T* (Left Linear Grammar) OR Y ∈ T*V + T* (Right Linear Grammar). The variable can only be either at the leftmost or at the rightmost place in Y.
• Examples: S → Saab (✓); S → Ab, A → Abb (✓, left linear grammar); S → Ab, A → aaB, B → b (✗, mixes left linear and right linear productions)
Type 0 grammar (Phrase Structure Grammar)
• Their productions are of the form X → Y, where
• X ∈ (V+T)* V (V+T)*
• Y ∈ (V+T)*
• Both X and Y can be strings of terminal and nonterminal symbols, and X must contain at least one nonterminal.
• Example: S → ACaB
Bc → acB
CB → DB
aD → Db
Type 1 grammar (Context Sensitive Grammar)
• Their productions are of the form αAβ → αγβ
• where A is a non terminal and α, β, γ are strings of terminals and non terminals.
• The strings α and β may be empty, but γ must be non-empty.
• Here, A can be replaced by γ only when it is enclosed by the strings α and β in a sentential form.
• Example: AB → AbBc
A → bcA
B→b
Type 2 grammar (Context Free Grammar)
• Their productions are of the form A → α
• where A is a non terminal and α is a string of terminals and non terminals.
• Example: S → Xa
X→a
X → aX
X → abc
Type 3 grammar (Linear or Regular grammar)
• Their productions are of the form A → a or A → aB
• where A and B are non terminals and a is a terminal.
• Example: X → a | aY
Y→b
Hierarchy of grammar
[Figure: nested regions showing the hierarchy of grammars, with Type 3 (Regular) as the innermost region.]
Context free grammar
Context Free Grammar
• A context free grammar (CFG) is a 4-tuple G = (V, Ʃ, S, P) where,
V is a finite set of non terminals,
Ʃ is a finite set of terminals, disjoint from V,
S is an element of V and it is the start symbol,
P is a finite set of productions of the form A → α where A ∈ V and α ∈ (V U Ʃ)*.
• Application of CFG:
1. CFG are extensively used to specify the syntax of
programming language.
2. CFG is used to develop a parser.
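To make the 4-tuple concrete, here is a minimal Python sketch that stores the productions P as a dictionary and lists the short strings a grammar generates. The encoding is an assumption made for illustration only: uppercase letters are non terminals, every other character is a terminal, and the empty Python string "" stands for ^. The function name `generate` and the `slack` parameter are hypothetical. The cap on the length of sentential forms is a heuristic that keeps the search finite, so it can miss strings for grammars in which many symbols must vanish.

```python
from collections import deque

def generate(productions, start="S", max_len=4, slack=2):
    """Return the terminal strings of length <= max_len found by leftmost expansion.

    Assumption: uppercase letters are non terminals, other characters are
    terminals, and "" is the empty string (written ^ in these notes).
    """
    seen = {start}
    queue = deque([start])
    result = set()
    while queue:
        form = queue.popleft()
        # Locate the leftmost non terminal of the sentential form.
        pos = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if pos is None:
            result.add(form)          # only terminals left: a generated string
            continue
        for rhs in productions[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            terminals = sum(1 for ch in new if not ch.isupper())
            # Discard forms with too many terminals, or (heuristically) too long.
            if terminals <= max_len and len(new) <= max_len + slack and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(result, key=lambda s: (len(s), s))

# Grammar for a+ : S -> aS | a
print(generate({"S": ["aS", "a"]}, max_len=3))   # ['a', 'aa', 'aaa']
```

Listing the short strings this way is a quick sanity check when writing a grammar by hand: a string that should be in the language but is missing from the list points to a missing production.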
CFG Examples
• Write CFG for either a or b
S → a | b
• Write CFG for a+
S → aS | a
• Write CFG for a*
S → aS | ^
• Write CFG for (ab)*
S → abS | ^
• Write CFG for any string of a and b
S → aS | bS | a | b
• Write CFG for ab*
S → aX
X → ^ | bX
• Write CFG for a*b*
S → XY
X → aX | ^
Y → bY | ^
• Write CFG for (a+b)*
S → aS | bS | ^
• Write CFG for a(a+b)*
S → aX
X → aX | bX | ^
• Write CFG for a* | b*
S → A | B
A → ^ | aA
B → ^ | bB
• Write CFG for (011+1)*(01)*
S → AB
A → 011A | 1A | ^
B → 01B | ^
• Write CFG for balanced parenthesis
S → []S | {}S | [S]S | {S}S | ^
• Write CFG which contains at least three times 1.
S → A1A1A1A
A → 0A | 1A | ^
• Write CFG that must start and end with same symbol.
S → 0A0 | 1A1
A → 0A | 1A | ^
• The language of even & odd length palindrome string over {a,b}
S → aSa | bSb | a | b | ^
• No. of a and no. of b are same
S → aSb | bSa | SS | ^
(The production S → SS is needed for strings such as abba, which neither start with a and end with b nor start with b and end with a.)
• The language of {a, b} ends in a
S → aS | bS | a
• Write CFG for regular expression (a+b)*a(a+b)*a(a+b)*
S → XaXaX
X → aX | bX | ^
• Write CFG for number of 0’s and 1’s are same (n0(x)=n1(x))
S → 0S1 | 1S0 | SS | ^
(The production S → SS is needed for strings such as 0110.)
• Write CFG for L={aⁱbʲcᵏ | i=j or j=k}
For i=j:
S → AB
A → aAb | ab
B → cB | c
For j=k:
S → CD
C → aC | a
D → bDc | bc
• Write CFG for L={ aⁱbʲcᵏ | j>i+k}
S → ABC
A → aAb | ^
B → bB | b
C → bCc | ^
• Write CFG for L={ 0ⁱ1ʲ0ᵏ | j>i+k}
S → ABC
A → 0A1 | ^
B → 1B | 1
C → 1C0 | ^
• Write CFG for the language of Algebraic expressions
S → S+S | S*S | S-S | S/S | (S) | a
• CFG for syntax of programming language
<statement> → … | <if-statement> | <for-statement> | …
<if-statement> → if ( <expression> ) <statement>
<for-statement> → for ( <expression>; <expression>; <expression> ) <statement>
Recursive Definitions
Recursive Definitions
1. Recursive Definition of {a,b}*    CFG: S → aS | bS | ^
^ ∈ L.
For any S ∈ L, aS ∈ L.
For any S ∈ L, bS ∈ L.
No other strings are in L.
2. Recursive Definition of Palindrome    CFG: S → aSa | bSb | a | b | ^
^, a, b ∈ L
For any S ∈ L, aSa ∈ L and bSb ∈ L
No other strings are in L
3. Recursive Definition of the language {aⁿbⁿ | n≥0}    CFG: S → aSb | ^
^ ∈ L
For every S ∈ L, aSb ∈ L
No other strings are in L
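Each recursive definition above translates directly into a recursive membership test. The following Python sketch mirrors definitions 2 and 3; the function names are hypothetical and the empty Python string "" plays the role of ^.

```python
def in_anbn(s):
    """Membership test for {a^n b^n | n >= 0}, mirroring S -> aSb | ^."""
    if s == "":
        return True                      # base case: ^ is in L
    if len(s) >= 2 and s[0] == "a" and s[-1] == "b":
        return in_anbn(s[1:-1])          # s = a x b is in L when x is in L
    return False                         # no other strings are in L

def is_palindrome(s):
    """Membership test for palindromes over {a, b}, mirroring S -> aSa | bSb | a | b | ^."""
    if s in ("", "a", "b"):
        return True                      # base cases: ^, a, b
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "ab":
        return is_palindrome(s[1:-1])    # s = a x a or b x b with x in L
    return False

print(in_anbn("aabb"), in_anbn("abab"))          # True False
print(is_palindrome("abba"), is_palindrome("ab"))  # True False
```

The base cases correspond to the productions without a variable on the right side, and the recursive call corresponds to the variable S inside aSb, aSa or bSb.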
FA to Regular Grammar
FA to Regular Grammar
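[Figure: finite automaton over {0, 1} with states A, B and C.]
Productions obtained from the transitions:
A → 0A
A → 1B
B → 0C
B → 1B
C → 0A
C → 1B
At last, all the incoming transitions to the accepting states are designated by the production
Source State → input symbol
which gives
B → 0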
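The conversion rule can be sketched in Python as follows. The transition table is read off the productions listed in these notes, and C is taken to be the accepting state, which is an inference from the extra production B → 0. The function name and the dictionary encoding are assumptions for illustration. The sketch does not handle an accepting start state, which would additionally require a ^ production.

```python
def fa_to_regular_grammar(transitions, accepting):
    """Convert a deterministic FA into right linear productions.

    transitions: dict mapping (state, symbol) -> next state.
    accepting:   set of accepting states.
    Returns a list of (variable, right side) pairs.
    """
    productions = []
    # Every transition  state --symbol--> target  gives  state -> symbol target.
    for (state, symbol), target in sorted(transitions.items()):
        productions.append((state, symbol + target))
    # Every transition entering an accepting state also gives  state -> symbol.
    for (state, symbol), target in sorted(transitions.items()):
        if target in accepting:
            productions.append((state, symbol))
    return productions

fa = {("A", "0"): "A", ("A", "1"): "B",
      ("B", "0"): "C", ("B", "1"): "B",
      ("C", "0"): "A", ("C", "1"): "B"}
for variable, rhs in fa_to_regular_grammar(fa, {"C"}):
    print(variable, "->", rhs)
```

For this automaton the result is the seven productions listed above, ending with B → 0.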
Exercise: FA to Regular Grammar

[Figure: finite automaton with states A, B, C, D and E and transitions labelled a and b.]
Derivation
Derivation
• Derivation is used to find whether the string belongs to a given
grammar or not.
• There are two types of derivation:
1. Leftmost derivation
2. Rightmost derivation
Leftmost derivation
• A derivation of a string in a grammar is a leftmost derivation if at every step the leftmost non terminal is replaced.
• Grammar: S → S+S | S-S | S*S | S/S | a    Output string: a*a-a

Leftmost derivation:
S ⇒ S-S ⇒ S*S-S ⇒ a*S-S ⇒ a*a-S ⇒ a*a-a

Parse tree (the parse tree represents the structure of the derivation):
          S
        / | \
       S  -  S
     / | \   |
    S  *  S  a
    |     |
    a     a
Rightmost derivation
• A derivation of a string in a grammar is a rightmost derivation if at every step the rightmost non terminal is replaced.
• It is also called canonical derivation.
• Grammar: S → S+S | S-S | S*S | S/S | a    Output string: a*a-a

Rightmost derivation:
S ⇒ S*S ⇒ S*S-S ⇒ S*S-a ⇒ S*a-a ⇒ a*a-a

Parse tree:
        S
      / | \
     S  *  S
     |   / | \
     a  S  -  S
        |     |
        a     a
Example: Derivation
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Perform leftmost & rightmost derivation. (String: 00101)

Leftmost derivation:
S ⇒ A1B ⇒ 0A1B ⇒ 00A1B ⇒ 001B ⇒ 0010B ⇒ 00101B ⇒ 00101

Rightmost derivation:
S ⇒ A1B ⇒ A10B ⇒ A101B ⇒ A101 ⇒ 0A101 ⇒ 00A101 ⇒ 00101
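Both derivations of 00101 can be replayed mechanically. The Python sketch below applies a list of productions one at a time, always to the leftmost or always to the rightmost non terminal. The function name `derive` and the encoding (uppercase letters are non terminals, "" is the empty string) are assumptions for illustration.

```python
def derive(start, rules, leftmost=True):
    """Replay a derivation and return the list of sentential forms.

    rules: list of (variable, right side) pairs, applied in order to the
    leftmost (or rightmost) non terminal of the current sentential form.
    """
    form = start
    steps = [form]
    for variable, rhs in rules:
        positions = [i for i, ch in enumerate(form) if ch.isupper()]
        if not positions:
            raise ValueError("no non terminal left to replace")
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != variable:
            raise ValueError(f"expected a production for {form[pos]}, got {variable}")
        form = form[:pos] + rhs + form[pos + 1:]
        steps.append(form)
    return steps

left = [("S", "A1B"), ("A", "0A"), ("A", "0A"), ("A", ""),
        ("B", "0B"), ("B", "1B"), ("B", "")]
right = [("S", "A1B"), ("B", "0B"), ("B", "1B"), ("B", ""),
         ("A", "0A"), ("A", "0A"), ("A", "")]
print(" => ".join(derive("S", left)))
print(" => ".join(derive("S", right, leftmost=False)))
```

The two runs use the same seven productions in a different order and end in the same string, which is why both derivations correspond to the same parse tree.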
Exercise: Derivation
1. Perform leftmost derivation and draw parse tree.
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Output string: 1001.
2. Perform rightmost derivation and draw parse tree.
E → E+E | E*E | id | (E) | -E
Output string: id + id * id.
Ambiguous grammar
Ambiguous grammar
• Ambiguous grammar is one that produces more than one leftmost or more than one rightmost derivation for the same sentence.
• Grammar: S → S+S | S*S | (S) | a    Output string: a+a*a

First leftmost derivation:
S ⇒ S*S ⇒ S+S*S ⇒ a+S*S ⇒ a+a*S ⇒ a+a*a
          S
        / | \
       S  *  S
     / | \   |
    S  +  S  a
    |     |
    a     a

Second leftmost derivation:
S ⇒ S+S ⇒ a+S ⇒ a+S*S ⇒ a+a*S ⇒ a+a*a
        S
      / | \
     S  +  S
     |   / | \
     a  S  *  S
        |     |
        a     a

Here, two leftmost derivations for string a+a*a are possible; hence, the above grammar is ambiguous.
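Ambiguity of a particular string can be checked by counting its parse trees, since each parse tree corresponds to exactly one leftmost derivation. The Python sketch below does this by trying every way of splitting the string among the symbols of a right side. It assumes a grammar without ^ productions and without cycles of unit productions (otherwise the recursion would not terminate), with uppercase letters as non terminals. The function names are hypothetical.

```python
from functools import lru_cache

def count_parse_trees(productions, start, text):
    """Count the parse trees of text (assumes no ^ productions, no unit cycles)."""

    @lru_cache(maxsize=None)
    def symbol(sym, i, j):
        # Number of ways the single symbol sym derives text[i:j].
        if not sym.isupper():
            return 1 if text[i:j] == sym else 0
        return sum(sequence(rhs, i, j) for rhs in productions[sym])

    @lru_cache(maxsize=None)
    def sequence(rhs, i, j):
        # Number of ways the symbol string rhs derives text[i:j].
        if len(rhs) == 1:
            return symbol(rhs, i, j)
        total = 0
        # The first symbol takes text[i:k]; every remaining symbol needs >= 1 character.
        for k in range(i + 1, j - len(rhs) + 2):
            first = symbol(rhs[0], i, k)
            if first:
                total += first * sequence(rhs[1:], k, j)
        return total

    return symbol(start, 0, len(text))

ambiguous = {"S": ["S+S", "S*S", "(S)", "a"]}
print(count_parse_trees(ambiguous, "S", "a+a*a"))   # 2
```

A count greater than 1 for some string shows that the grammar is ambiguous. A count of 1 for one string does not by itself prove that a grammar is unambiguous.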
Exercise: Ambiguous grammar
Check whether the following grammars are ambiguous or not:
1. S → aS | Sa | ε (string: aaaa)
2. S → aSbS | bSaS | ε (string: abab)
3. S → SS+ | SS* | a (string: aa+a*)
Unambiguous grammar
Grammar: S → S+S | S*S | (S) | a    Output string: a+a*a
Equivalent unambiguous grammar is
S → S + T | T
T → T * F | F
F → (S) | a

Leftmost derivation:
S ⇒ S+T ⇒ T+T ⇒ F+T ⇒ a+T ⇒ a+T*F ⇒ a+F*F ⇒ a+a*F ⇒ a+a*a

Try for a second leftmost derivation: not possible.
Here, two leftmost derivations are not possible for string a+a*a; hence, the grammar is unambiguous.
Ambiguous Grammar vs. Unambiguous Grammar
• Derivations: in an ambiguous grammar, at least one string has more than one leftmost (or more than one rightmost) derivation. In an unambiguous grammar, every string has exactly one leftmost and exactly one rightmost derivation.
• Non-terminals: the number of non-terminals in an ambiguous grammar is usually less than in the equivalent unambiguous grammar.
• Parse tree: the length of the parse tree in an ambiguous grammar is comparatively short. In an unambiguous grammar it is comparatively large.
• Speed: derivation of a tree in an ambiguous grammar is usually faster than in the equivalent unambiguous grammar.
• Number of parse trees: an ambiguous grammar generates more than one parse tree for some string. An unambiguous grammar generates only one parse tree for each string.
• Ambiguity: an ambiguous grammar contains ambiguity. An unambiguous grammar does not contain any ambiguity.
Simplified forms & Normal forms
Nullable Variable
• A nullable variable in a CFG G = (V, Ʃ, S, P) is defined as follows:
1. Any variable A for which P contains A → ^ is nullable.
2. If P contains the production A → B1B2…Bn and B1, B2, …, Bn are nullable variables, then A is nullable.
3. No other variables in V are nullable.
Eliminate ˄ production

Given grammar (nullable variable = {X}):
S → aX | Yb
X → ^ | S
Y → bY | b

Replacing X by ^ in all productions containing X on RHS and rewriting the productions again:
S → aX | Yb | a
X → ^ | S
Y → bY | b

Removing ^ productions:
S → aX | Yb | a
X → S
Y → bY | b
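The definition of nullable variables and the elimination step can be sketched in Python as below. The encoding is an assumption for illustration: one character per symbol, uppercase letters are variables, and "" is the ^ production. The function names are hypothetical. Like the worked example, the sketch drops every ^ production, so the resulting grammar no longer generates ^ even if the original one did.

```python
from itertools import product

def nullable_variables(productions):
    """Variables that can derive ^, computed by repeating rules 1 and 2 until stable."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in productions.items():
            if var in nullable:
                continue
            # A body made only of nullable variables (or the empty body) makes var nullable.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(var)
                changed = True
    return nullable

def eliminate_null(productions):
    """Return an equivalent grammar (except for ^) without ^ productions."""
    nullable = nullable_variables(productions)
    result = {}
    for var, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            # Each nullable symbol may be kept or dropped; other symbols are kept.
            options = [(sym, "") if sym in nullable else (sym,) for sym in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[var] = new_bodies
    return result

grammar = {"S": ["aX", "Yb"], "X": ["", "S"], "Y": ["bY", "b"]}
print(nullable_variables(grammar))   # {'X'}
print(eliminate_null(grammar))
```

For the grammar above this yields S → aX | a | Yb, X → S and Y → bY | b, the same productions as in the worked example up to the order of the alternatives.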
Exercise: Eliminate ^ production
Grammar 1:
S → AC
A → aAb | ^
C → aC | a
After elimination of ^ production:
S → AC | C
A → aAb | ab
C → aC | a

Grammar 2:
S → XaX | bX | Y
X → XaX | XbX | ^
Y → ab
After elimination of ^ production:
S → XaX | bX | Y | aX | Xa | a | b
X → XaX | XbX | aX | Xa | a | Xb | bX | b
Y → ab
A-derivable
• A variable B is called A-derivable,
1. If A → B is a production, B is A-derivable.
2. If C is A-derivable, C → B is a production, and B ≠ A, then B is A-derivable.
3. No other variables are A-derivable.

Examples:
S → A, S → B: S-derivable = {A, B}
S → A, A → B: S-derivable = {A, B}
Unit Production & Elimination of Unit productions
• A production of the form A → B is termed a unit production, where A & B are nonterminals.
Algorithm
• Given a CFG with no ^ productions, construct a CFG having no unit production as follows.
1. Initialize P1 to be P.
2. For each A ∈ V, find the set of A-derivable variables.
3. For every pair (A, B) such that B is A-derivable and every non unit production B → α, add the production A → α to P1 if it is not already present in P1.
4. Delete all unit productions from P1.
Elimination of unit production
S → ABA | BA | AA | AB | A | B
A → aA | a
B → bB | b
Unit productions are S → A and S → B.

Removing unit productions:
S → ABA | BA | AA | AB | aA | a | bB | b
A → aA | a
B → bB | b
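The algorithm for eliminating unit productions can be sketched in Python as follows. The encoding is an assumption for illustration (one character per symbol, uppercase letters are variables) and the function names are hypothetical. As in the algorithm, the grammar is assumed to have no ^ productions.

```python
def is_unit(body):
    """A right side is a unit production when it is a single variable."""
    return len(body) == 1 and body.isupper()

def derivable(productions, a):
    """Set of a-derivable variables: reachable from a through unit productions only."""
    found = set()
    stack = [a]
    while stack:
        current = stack.pop()
        for body in productions[current]:
            if is_unit(body) and body != a and body not in found:
                found.add(body)
                stack.append(body)
    return found

def eliminate_unit(productions):
    """Steps 1 to 4 of the algorithm: copy non unit productions, drop unit ones."""
    result = {}
    for var, bodies in productions.items():
        new_bodies = [b for b in bodies if not is_unit(b)]
        for other in sorted(derivable(productions, var)):
            for body in productions[other]:
                if not is_unit(body) and body not in new_bodies:
                    new_bodies.append(body)
        result[var] = new_bodies
    return result

grammar = {"S": ["ABA", "BA", "AA", "AB", "A", "B"],
           "A": ["aA", "a"],
           "B": ["bB", "b"]}
print(eliminate_unit(grammar)["S"])
# ['ABA', 'BA', 'AA', 'AB', 'aA', 'a', 'bB', 'b']
```

The set computed by `derivable` for S is {A, B}, so the non unit productions of A and B are copied to S before S → A and S → B are deleted.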
CFG to CNF
Chomsky Normal Form (CNF)
• A context free grammar is in Chomsky normal form (CNF) if every production is one of these two forms:
A → BC
A → a
where A, B and C are nonterminals and a is a terminal.
Converting CFG to CNF
• Steps to convert CFG to CNF
1. Eliminate ˄-Productions.
2. Eliminate Unit Productions.
3. Restricting the right side of productions to single terminal or
string of two or more non-terminals.
4. Final step of CNF. (shorten the string of NT to length 2)
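Steps 3 and 4 can be sketched in Python as follows, assuming steps 1 and 2 have already been applied. Here a right side is a tuple of symbols so that variable names may be longer than one character. The fresh variable names T_a, T_b, … and X1, X2, … are hypothetical choices made by this sketch (the worked examples in these notes use P, Q, T1, T2 and so on), and the sketch assumes these names do not already occur in the grammar.

```python
def to_cnf_steps_3_4(productions):
    """Apply steps 3 and 4 to a grammar without ^ productions and unit productions.

    productions: dict mapping a variable to a list of tuples of symbols.
    """
    variables = set(productions)
    result = {var: [] for var in productions}
    terminal_var = {}        # terminal -> fresh variable deriving only that terminal
    counter = 0

    def var_for(terminal):
        if terminal not in terminal_var:
            name = "T_" + terminal
            terminal_var[terminal] = name
            result[name] = [(terminal,)]
        return terminal_var[terminal]

    for var, bodies in productions.items():
        for body in bodies:
            if len(body) == 1:
                result[var].append(body)          # already of the form A -> a
                continue
            # Step 3: replace terminals inside longer right sides by variables.
            symbols = [s if s in variables else var_for(s) for s in body]
            # Step 4: shorten A -> B1 B2 ... Bn to a chain of rules of length 2.
            head = var
            while len(symbols) > 2:
                counter += 1
                fresh = f"X{counter}"
                result.setdefault(head, []).append((symbols[0], fresh))
                head, symbols = fresh, symbols[1:]
            result.setdefault(head, []).append(tuple(symbols))
    return result

grammar = {"S": [("a", "A", "b", "B")],
           "A": [("A", "b"), ("b",)],
           "B": [("B", "a"), ("a",)]}
for var, bodies in to_cnf_steps_3_4(grammar).items():
    print(var, "->", " | ".join(" ".join(b) for b in bodies))
```

For S → aAbB the sketch produces S → T_a X1, X1 → A X2 and X2 → T_b B, which has the same shape as a hand conversion with different variable names.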
Example: CFG to CNF
S → AAC
A → aAb | ^
C → aC | a

Step 1: Elimination of ^ production (eliminate A → ^)
S → AAC | AC | C
A → aAb | ab
C → aC | a

Step-2: Eliminate Unit Production (the unit production is S → C)
S → AAC | AC | aC | a
A → aAb | ab
C → aC | a

Step 3: Replace all mixed strings with solid NT
S → AAC | AC | PC | a
A → PAQ | PQ
C → PC | a
P → a
Q → b

Step-4: Shorten the string of NT to length 2
S → AX1
X1 → AC
S → AC | PC | a
A → PY1
Y1 → AQ
A → PQ
C → PC | a
P → a
Q → b
The grammar is now in Chomsky Normal Form.
Example: CFG to CNF
S → aAbB
A → Ab | b
B → Ba | a
Steps 1 and 2 are not required as there are no ^ and unit productions.

Step-3: Replace all mixed strings with solid NT
S → PAQB
A → AQ | b
B → BP | a
P → a
Q → b

Step-4: Final step of CNF
S → PT1
T1 → AT2
T2 → QB
A → AQ | b
B → BP | a
P → a
Q → b
Example: CFG to CNF
S → AA
A → B | BB
B → abB | b | bb
Step 1 is not required as there are no ^ productions.

Step-2: Eliminate Unit Production
S → AA
A → abB | b | bb | BB
B → abB | b | bb

Step-3: Replace all mixed strings with solid NT
S → AA
A → PQB | b | QQ | BB
B → PQB | b | QQ
P → a
Q → b

Step-4: Shorten the string of NT to length 2
S → AA
A → PT1 | b | QQ | BB
T1 → QB
B → PV1 | b | QQ
V1 → QB
P → a
Q → b
Example: CFG to CNF
S → ASB | ^
A → aAS | a
B → SbS | A | bb

Step-1: Eliminate ^-Production
S → ASB | AB
A → aAS | a | aA
B → SbS | A | bb | bS | Sb | b

Step-2: Eliminate Unit Production
S → ASB | AB
A → aAS | a | aA
B → SbS | aAS | a | aA | bb | bS | Sb | b

Step-3: Replace all mixed strings with solid NT
S → ASB | AB
A → PAS | a | PA
B → SQS | PAS | a | PA | QQ | QS | SQ | b
P → a
Q → b

Step-4: Shorten the string of NT to length 2
S → AB | AT1
T1 → SB
A → a | PA | PU1
U1 → AS
B → SV1 | PV2 | a | PA | QQ | QS | SQ | b
V1 → QS
V2 → AS
P → a
Q → b
Backus-Naur Form (BNF)
• BNF is one of the notation techniques for context free grammar.
• It is often used to describe the syntax of the languages used in computing.
• Variables written between <..> are non terminals.
• A vertical bar ‘|’ indicates an alternate choice.
• […] is used to enclose an optional specification.
• Example:
<exp>=<exp> + <term> | <term>
<term>=<term> * <factor> | <factor>
<factor>=<factor> ^ <primary> | <primary>
<primary>=<id> | <const>
<id>=<letter>
<const>=[+/-]<digit>
<letter>=a | b | c |……| z
<digit>=0 | 1 |………….| 9
Union, Concatenation &
Kleene’s of CFG
CFG:
S1 → aS1b | ^
S2 → bS2c | ^

Union:
S → S1 | S2
S1 → aS1b | ^
S2 → bS2c | ^

Concatenation:
S → S1S2
S1 → aS1b | ^
S2 → bS2c | ^

Kleene's:
S → S1S | ^
S1 → aS1b | ^
S2 → bS2c | ^
Union, Concatenation & Kleene’s of CFG
Theorem:- If L1 and L2 are context - free languages, then the
languages L1 U L2, L1L2 , and L1* are also CFLs.
• The proof is constructive: Starting with CFGs
G1 = (V1, Ʃ, S1,P1) and G2 = (V2, Ʃ, S2,P2) ,
• Generating L1 and L2, respectively, we show how to construct a
new CFG for each of the three cases.
1. Gu = (Vu, Ʃ, Su, Pu) generating L1 U L2
2. Gc= (Vc, Ʃ, Sc, Pc) generating L1L2
3. G* = (V, Ʃ, S, P) generating L1 *
Union Gu = (Vu, Ʃ, Su, Pu)
• A grammar Gu = (Vu, Ʃ, Su, Pu) generating L1 U L2.
• First we rename the elements of V2 if necessary so that V1 ∩ V2 = Ø, and define
Vu = V1 U V2 U {Su}
where Su is a new symbol not in V1 or V2. Then we let
Pu = P1 U P2 U { Su → S1 | S2 }
• If x ∈ L1, then Su ⇒ S1 ⇒* x in Gu, and similarly Su ⇒ S2 ⇒* x if x ∈ L2. Therefore,
L1 U L2 ⊆ L(Gu)
• On the other hand, if x is derivable from Su in Gu, the first step in any derivation must be
Su ⇒ S1 or Su ⇒ S2
• In the first case, all subsequent productions used must be productions in G1, because no variables in V2 are involved, and thus x ∈ L1; in the second case, x ∈ L2. Therefore,
L(Gu) ⊆ L1 U L2
Concatenation Gc= (Vc, Ʃ, Sc, Pc)
• A grammar Gc = (Vc, Ʃ, Sc, Pc) generating L1L2. Again we relabel variables if necessary so that V1 ∩ V2 = Ø and define
Vc = V1 U V2 U {Sc}
• This time we let
Pc = P1 U P2 U { Sc → S1S2 }
• If x ∈ L1L2 then x = x1x2, where xi ∈ Li for each i. We may then derive x in Gc as follows:
Sc ⇒ S1S2 ⇒* x1S2 ⇒* x1x2 = x
• Conversely, if x is derivable from Sc, the first step in the derivation must be Sc ⇒ S1S2. The remaining steps derive a string x1 from S1 in G1 and a string x2 from S2 in G2 with x = x1x2. So x ∈ L1L2.
Kleene (*)
• A grammar G* = (V, Ʃ, S, P) generating L1*. Let V = V1 U {S}, where S ∉ V1.
• The language L1* contains strings of the form x = x1x2…xk, where each xi ∈ L1.
• Since each xi can be derived from S1, to derive x from S it is enough to be able to derive a string of k S1's. We can accomplish this by including the productions
S → S1S | ^
in P. Therefore, let
P = P1 U { S → S1S | ^ }
• The proof that L1* ⊆ L(G*) is straightforward. If x ∈ L(G*), on the other hand, then either x = ^ or x can be derived from some string of the form S1S1…S1 (k copies of S1) in G*. In the second case, since the only productions in G* beginning with S1 are those in G1, we may conclude that
x ∈ L(G1)ᵏ ⊆ L(G1)*.
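The three constructions are short enough to write out directly. In the Python sketch below a grammar is a dictionary from variables to lists of right sides, each right side being a tuple of symbols, with the empty tuple standing for ^. The function names and the default names of the new start symbols are assumptions for illustration. The sketch checks that the variable sets are disjoint instead of renaming them.

```python
def _check_disjoint(g1, g2):
    # The constructions assume V1 and V2 share no variable.
    if set(g1) & set(g2):
        raise ValueError("rename variables so that V1 and V2 are disjoint")

def union(g1, s1, g2, s2, new_start="Su"):
    """Pu = P1 U P2 U { Su -> S1 | S2 }"""
    _check_disjoint(g1, g2)
    grammar = {**g1, **g2}
    grammar[new_start] = [(s1,), (s2,)]
    return grammar, new_start

def concatenation(g1, s1, g2, s2, new_start="Sc"):
    """Pc = P1 U P2 U { Sc -> S1 S2 }"""
    _check_disjoint(g1, g2)
    grammar = {**g1, **g2}
    grammar[new_start] = [(s1, s2)]
    return grammar, new_start

def kleene_star(g1, s1, new_start="S"):
    """P = P1 U { S -> S1 S | ^ }"""
    if new_start in g1:
        raise ValueError("the new start symbol must not be in V1")
    grammar = dict(g1)
    grammar[new_start] = [(s1, new_start), ()]
    return grammar, new_start

g1 = {"S1": [("a", "S1", "b"), ()]}     # S1 -> a S1 b | ^
g2 = {"S2": [("b", "S2", "c"), ()]}     # S2 -> b S2 c | ^
print(union(g1, "S1", g2, "S2"))
print(concatenation(g1, "S1", g2, "S2"))
print(kleene_star(g1, "S1"))
```

Each function only adds one new start symbol with one or two productions, which is why the proofs above reduce to looking at the first step of a derivation.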
End of Unit - 3
