Unit 2


21CSC301T - FORMAL LANGUAGE AND AUTOMATA

UNIT II
Context free grammar and Language

Dr.AR.Arunarani
Assistant Professor / CINTEL
9/10/2024 1
UNIT-II

Regular Sets and Context Free Grammars

* Properties of regular sets


* Context-Free Grammars, and Languages
* Derivation trees
* Simplification of CFG: Elimination of Useless Symbols
* Simplification of CFG: Unit productions, Null productions
* Chomsky Normal Forms and Greibach Normal Forms
* Ambiguous and unambiguous grammars
* Minimization of finite automata
Unit-II

[Overview diagram]
Context Free Grammar: Types, Derivations, Ambiguity
Simplification of CFG: Elimination of useless symbols, Elimination of unit productions, Elimination of Null productions
Normal Form: CNF, GNF

Closure properties of Regular Languages
Closure properties for Regular Languages (RL)

• Closure property:
– If a set of regular languages are combined using an operator, then the resulting language is also regular
– (This is different from Kleene closure.)
• Regular languages are closed under:
– Union, intersection, complement, difference
– Reversal
– Kleene closure
– Concatenation
– Homomorphism
– Inverse homomorphism
Now, let's prove all of these!
RLs are closed under union
• IF L and M are two RLs THEN:

➢they have corresponding regular expressions, R and S respectively

➢(L ∪ M) can be represented using the regular expression R+S

➢Therefore, (L ∪ M) is also regular

How can this be proved using FAs?
RLs are closed under complementation

• If L is an RL over ∑, then its complement is L̄ = ∑* − L

➢To show L̄ is also regular, make the following construction:
Convert every final state into a non-final state, and every non-final state into a final state

[Figure: DFA for L (start state q0, intermediate state qi, final states qF1 … qFk) beside the DFA for L̄, which has the same states and transitions with final and non-final states swapped.]

Assumes q0 is a non-final state. If not, do the opposite.
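DFA for (0+1)*01 and its complement

[Figure: two copies of a three-state DFA with start state q0. Transitions in both: q0 on 1 stays in q0, q0 on 0 goes to q1, q1 on 0 stays in q1, q1 on 1 goes to q2, q2 on 0 goes to q1, q2 on 1 goes to q0. In the DFA for (0+1)*01 only q2 is final; in the complement DFA q0 and q1 are final.]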
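The swap of final and non-final states can be sketched in Python. This is only an illustration: it assumes a DFA is stored as a tuple (states, alphabet, delta, start, finals) with a total transition function, and the names `complement`, `accepts` and `dfa_01` are hypothetical, not part of the slides. The example DFA is the standard three-state DFA for (0+1)*01.

```python
def complement(dfa):
    """Return a DFA for the complement language: same transitions, final states swapped."""
    states, alphabet, delta, start, finals = dfa
    return (states, alphabet, delta, start, states - finals)


def accepts(dfa, word):
    """Run the DFA on word and report whether it ends in a final state."""
    states, alphabet, delta, start, finals = dfa
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


# Assumed encoding of the DFA for (0+1)*01 (strings ending in 01).
dfa_01 = (
    {"q0", "q1", "q2"},
    {"0", "1"},
    {
        ("q0", "0"): "q1", ("q0", "1"): "q0",
        ("q1", "0"): "q1", ("q1", "1"): "q2",
        ("q2", "0"): "q1", ("q2", "1"): "q0",
    },
    "q0",
    {"q2"},
)

print(accepts(dfa_01, "1101"))              # string ends in 01
print(accepts(complement(dfa_01), "1101"))  # the complement rejects it
```

The construction only works on a DFA whose transition function is defined for every state and symbol; applying the same swap to an NFA does not give the complement in general.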
RLs are closed under intersection

• A quick, indirect way to prove:
– By DeMorgan's law:
– L ∩ M = complement of (L̄ ∪ M̄), where L̄ denotes the complement of L
– Since we know RLs are closed under union and complementation, they are also closed under intersection
• A more direct way would be to construct a finite automaton for L ∩ M
DFA construction for L ∩ M

• AL = DFA for L = {QL, ∑, qL, FL, δL}
• AM = DFA for M = {QM, ∑, qM, FM, δM}
• Build AL∩M = {QL × QM, ∑, (qL, qM), FL × FM, δ} such that:
– δ((p,q),a) = (δL(p,a), δM(q,a)), where p in QL, and q in QM
• This construction ensures that a string w will be accepted if and only if w reaches an accepting state in both input DFAs.
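A minimal Python sketch of this product construction follows. It assumes both DFAs share the alphabet and are stored as tuples (states, alphabet, delta, start, finals); the function names and the two example DFAs (strings containing at least one 0, strings containing at least one 1) are illustrative assumptions, not taken from the slides.

```python
from itertools import product


def intersect(dfa_l, dfa_m):
    """Product DFA: states are pairs, and a pair is final iff both components are final."""
    q_l, sigma, delta_l, start_l, f_l = dfa_l
    q_m, _, delta_m, start_m, f_m = dfa_m
    states = set(product(q_l, q_m))
    delta = {
        ((p, q), a): (delta_l[(p, a)], delta_m[(q, a)])
        for (p, q) in states
        for a in sigma
    }
    return states, sigma, delta, (start_l, start_m), set(product(f_l, f_m))


def accepts(dfa, word):
    states, sigma, delta, start, finals = dfa
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in finals


# Hypothetical DFA A: strings over {0,1} containing at least one 0.
dfa_a = ({"p", "q"}, {"0", "1"},
         {("p", "0"): "q", ("p", "1"): "p", ("q", "0"): "q", ("q", "1"): "q"},
         "p", {"q"})
# Hypothetical DFA B: strings over {0,1} containing at least one 1.
dfa_b = ({"r", "s"}, {"0", "1"},
         {("r", "0"): "r", ("r", "1"): "s", ("s", "0"): "s", ("s", "1"): "s"},
         "r", {"s"})

both = intersect(dfa_a, dfa_b)
print(accepts(both, "01"), accepts(both, "00"))
```

The sketch builds all |QL| × |QM| pairs, including pairs that are unreachable from the start pair; a practical version would usually generate only the reachable ones.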
Intersection property : example

[Figure: DFA A with states p and q, DFA B with states r and s, and the product DFA A∩B with states pr, ps, qr and qs, all over the alphabet {0, 1}.]
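DFA construction for L ∩ M

[Figure: DFA for L (states q0, qi, qj, final states qF1, qF2, with a transition on a from qi to qj) and DFA for M (states p0, pi, pj, final states pF1, pF2, with a transition on a from pi to pj). The DFA for L ∩ M runs both in parallel on paired states: (q0,p0), then (qi,pi) on a goes to (qj,pj), with accepting pairs such as (qF1,pF1).]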
RLs are closed under set difference

• We observe:
– L − M = L ∩ M̄, where M̄ is the complement of M
– RLs are closed under complementation, so M̄ is regular
– RLs are closed under intersection, so L ∩ M̄ is regular
• Therefore, L − M is also regular
RLs are closed under reversal

Reversal of a string w is denoted by w^R
– E.g., w=00111, w^R=11100
• Reversal of a1 a2 … an is an an-1 … a1

Reversal of a language:
• L^R = The language generated by reversing all strings in L

Theorem: If L is regular then L^R is also regular
RLs are closed under reversal

• Reverse all arcs in the transition diagram
• Start state of the given FA becomes the only final state
• Create new start state p0 and add ε-transitions from p0 to all accepting states of the given FA
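The three steps above can be sketched in Python as follows. The result is an NFA with ε-moves, so the sketch also carries a small simulator. The tuple layout, the use of `None` as the ε label, the name `p0` for the new start state (assumed not to clash with an existing state) and the example DFA for (0+1)*01 are all assumptions made for illustration.

```python
def reverse(dfa, new_start="p0"):
    """Build an epsilon-NFA for the reversed language (None is used as the epsilon label)."""
    states, sigma, delta, start, finals = dfa
    rev = {}
    for (p, a), q in delta.items():
        rev.setdefault((q, a), set()).add(p)   # reverse every arc
    rev[(new_start, None)] = set(finals)       # epsilon-moves from p0 to the old accepting states
    return states | {new_start}, sigma, rev, new_start, {start}


def eps_closure(delta, current):
    """All states reachable from current using epsilon-moves only."""
    seen, stack = set(current), list(current)
    while stack:
        state = stack.pop()
        for nxt in delta.get((state, None), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def nfa_accepts(nfa, word):
    states, sigma, delta, start, finals = nfa
    current = eps_closure(delta, {start})
    for symbol in word:
        moved = set()
        for state in current:
            moved |= delta.get((state, symbol), set())
        current = eps_closure(delta, moved)
    return bool(current & finals)


# Assumed DFA for (0+1)*01; its reversal accepts the strings that start with 10.
dfa_01 = ({"q0", "q1", "q2"}, {"0", "1"},
          {("q0", "0"): "q1", ("q0", "1"): "q0",
           ("q1", "0"): "q1", ("q1", "1"): "q2",
           ("q2", "0"): "q1", ("q2", "1"): "q0"},
          "q0", {"q2"})
reversed_nfa = reverse(dfa_01)
print(nfa_accepts(reversed_nfa, "1011"), nfa_accepts(reversed_nfa, "01"))
```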
Emptiness of regular language

Language is empty if there is no path from start state to final state
• R = R1 + R2 : L(R) is empty if both L(R1) and L(R2) are empty
• R = R1.R2 : L(R) is empty if L(R1) or L(R2) is empty
• R = R1* : L(R) is never empty, since it always contains ε
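The three rules translate directly into a recursive test. The sketch below assumes a regular expression is given as a nested tuple such as ("union", r1, r2); this encoding and the name `is_empty` are hypothetical choices for illustration only.

```python
def is_empty(regex):
    """Decide whether L(regex) is empty, following the union / concatenation / star rules.

    Assumed encoding: ("empty",), ("eps",), ("sym", a),
    ("union", r1, r2), ("concat", r1, r2), ("star", r1).
    """
    kind = regex[0]
    if kind == "empty":
        return True
    if kind in ("eps", "sym"):
        return False
    if kind == "union":
        return is_empty(regex[1]) and is_empty(regex[2])
    if kind == "concat":
        return is_empty(regex[1]) or is_empty(regex[2])
    if kind == "star":
        return False  # the star of any language contains the empty string
    raise ValueError(f"unknown regex node: {kind}")


print(is_empty(("concat", ("sym", "a"), ("empty",))))
print(is_empty(("star", ("empty",))))
```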
Testing infiniteness

If any string of length more than N (# of states in the FA) exists in the language then the language is infinite.

Accepting such a string forces at least one state to be visited more than once, and that loop can then be repeated any number of times.
Equivalence & Minimization of DFAs

Applications of interest

• Comparing two DFAs:


– L(DFA1) == L(DFA2)?

• How to minimize a DFA?


1. Remove unreachable states
2. Identify & condense equivalent states into one

When to call two states in a DFA "equivalent"?

Two states p and q are said to be equivalent iff:
i) Any string w accepted by starting at p is also accepted by starting at q;
AND
ii) Any string w rejected by starting at p is also rejected by starting at q.
➔ p≡q

Past doesn't matter - only future does!
Computing equivalent states in a DFA

Table Filling Algorithm

[Figure: DFA with states A, B, C, D, E, F, G, H over the alphabet {0, 1}.]

Pass #0
1. Mark accepting states ≠ non-accepting states
Pass #1
1. Compare every pair of states
2. Distinguish by one symbol transition
3. Mark = or ≠ or blank (tbd)
Pass #2
1. Compare every pair of states
2. Distinguish by up to two symbol transitions (until different or same or tbd)
….
(keep repeating until table complete)

Completed table (x = distinguishable, = = equivalent):

    A   B   C   D   E   F   G   H
A   =
B   =   =
C   x   x   =
D   x   x   x   =
E   x   x   x   x   =
F   x   x   x   x   x   =
G   x   x   x   =   x   x   =
H   x   x   =   x   x   x   x   =
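A minimal Python sketch of the table-filling passes follows. Since the transitions of the slide's eight-state DFA are not given in the text, the example uses a small hypothetical four-state DFA (strings ending in 1, with redundant states); the function names and data layout are likewise assumptions.

```python
from itertools import combinations


def table_filling(states, alphabet, delta, finals):
    """Return the set of distinguishable (marked) state pairs, each as a frozenset."""
    pairs = [frozenset(pair) for pair in combinations(sorted(states), 2)]
    # Pass 0: an accepting state is distinguishable from a non-accepting state.
    marked = {pair for pair in pairs if len(pair & finals) == 1}
    changed = True
    while changed:  # keep making passes until no new pair gets marked
        changed = False
        for pair in pairs:
            if pair in marked:
                continue
            p, q = tuple(pair)
            for symbol in alphabet:
                successors = frozenset((delta[(p, symbol)], delta[(q, symbol)]))
                # One hop away: if the successors are distinguishable, so are p and q.
                if successors in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked


def equivalent_pairs(states, alphabet, delta, finals):
    """Pairs left unmarked when the table is complete."""
    marked = table_filling(states, alphabet, delta, finals)
    return [(p, q) for p, q in combinations(sorted(states), 2)
            if frozenset((p, q)) not in marked]


# Hypothetical DFA: accepts strings ending in 1, using two redundant states.
delta = {("A", "0"): "B", ("A", "1"): "C",
         ("B", "0"): "A", ("B", "1"): "D",
         ("C", "0"): "B", ("C", "1"): "C",
         ("D", "0"): "A", ("D", "1"): "D"}
print(equivalent_pairs({"A", "B", "C", "D"}, {"0", "1"}, delta, {"C", "D"}))
```

In this sketch a pair whose successors coincide is never marked on that symbol, because a one-element set is never in the marked table.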
Table Filling Algorithm - step by step

[Figure: the same DFA with states A to H; each step fills in more of the lower-triangular table of state pairs.]

0. Start with only the diagonal filled in: every state is equivalent to itself (=).
1. Mark X between accepting vs. non-accepting state: E is the only state whose accepting status differs from the rest, so (E,A), (E,B), (E,C), (E,D), (F,E), (G,E), (H,E) are marked X.
2. Pass 1: Look 1-hop away for distinguishing states or strings, one column at a time:
   • Column A: (C,A), (D,A), (G,A), (H,A) are marked X
   • Column B: (C,B), (D,B), (G,B), (H,B) are marked X
   • Column C: (D,C), (F,C), (G,C) are marked X; (H,C) is marked =
   • Column D: (F,D), (H,D) are marked X; (G,D) is marked =
   • Column F: (G,F), (H,F) are marked X
   • Column G: (H,G) is marked X
3. Pass 2: Look 1-hop away again for distinguishing states or strings: (F,A), (F,B) are marked X; (B,A) is marked =
continue….

Final table:

    A   B   C   D   E   F   G   H
A   =
B   =   =
C   X   X   =
D   X   X   X   =
E   X   X   X   X   =
F   X   X   X   X   X   =
G   X   X   X   =   X   X   =
H   X   X   =   X   X   X   X   =

Equivalences:
• A=B
• C=H
• D=G
Table Filling Algorithm - step by step

[Figure: the original DFA with states A to H (left) and the minimized DFA with states A, C, D, E, F (right).]

Retain only one copy for each equivalence set of states

Equivalences:
• A=B
• C=H
• D=G
Table Filling Algorithm – special case

[Figure: the same DFA and table, with only the diagonal filled in and a "?" in row E.]

Q) What happens if the input DFA has more than one final state?
Can all final states initially be treated as equivalent to one another?
Putting it all together …

How to minimize a DFA?
• Goal: Minimize the number of states in a DFA
• Algorithm:
1. Eliminate states unreachable from the start state (depth-first traversal from the start state)
2. Identify and remove equivalent states (table filling algorithm)
3. Output the resultant DFA
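Step 1 can be sketched as a depth-first traversal in Python. The DFA layout, the function names and the five-state example (with an extra state U that nothing leads to) are assumptions for illustration.

```python
def reachable_states(alphabet, delta, start):
    """Depth-first traversal from the start state over all transitions."""
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        for symbol in alphabet:
            nxt = delta[(state, symbol)]
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def drop_unreachable(states, alphabet, delta, start, finals):
    """Keep only the states (and their transitions) reachable from the start state."""
    keep = reachable_states(alphabet, delta, start)
    new_delta = {(s, a): t for (s, a), t in delta.items() if s in keep}
    return keep, alphabet, new_delta, start, finals & keep


# Hypothetical DFA: U has outgoing arcs but no state leads to it.
delta = {("A", "0"): "B", ("A", "1"): "C",
         ("B", "0"): "A", ("B", "1"): "D",
         ("C", "0"): "B", ("C", "1"): "C",
         ("D", "0"): "A", ("D", "1"): "D",
         ("U", "0"): "A", ("U", "1"): "U"}
trimmed = drop_unreachable({"A", "B", "C", "D", "U"}, {"0", "1"}, delta, "A", {"C", "D", "U"})
print(sorted(trimmed[0]))
```

The same traversal also answers the emptiness question for a DFA: the language is empty exactly when no final state is in the reachable set.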
Are Two DFAs Equivalent?

[Figure: a unified DFA containing DFA1 (start state q0) and DFA2 (start state q0'). Is q0 ≡ q0'? If yes, then DFA1 ≡ DFA2; else, not equivalent.]

1. Make a new dummy DFA by just putting together both DFAs
2. Run table-filling algorithm on the unified DFA
3. IF the start states of both DFAs are found to be equivalent,
THEN: DFA1≡ DFA2
ELSE: different
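A sketch of this check in Python, under the assumption that each DFA is given as (delta, start, finals) over a shared alphabet. States are tagged with 1 or 2 so the unified DFA has disjoint state names. All names and the example DFAs are hypothetical.

```python
from itertools import combinations


def dfas_equivalent(dfa1, dfa2, alphabet):
    """Table-filling on the union of both DFAs; compare the two start states."""
    (delta1, start1, finals1), (delta2, start2, finals2) = dfa1, dfa2
    # Step 1: put both DFAs together, tagging states to keep them apart.
    delta = {((1, s), a): (1, t) for (s, a), t in delta1.items()}
    delta.update({((2, s), a): (2, t) for (s, a), t in delta2.items()})
    finals = {(1, s) for s in finals1} | {(2, s) for s in finals2}
    states = {s for s, _ in delta} | set(delta.values())
    # Step 2: table-filling on the unified DFA.
    pairs = [frozenset(pair) for pair in combinations(states, 2)]
    marked = {pair for pair in pairs if len(pair & finals) == 1}
    changed = True
    while changed:
        changed = False
        for pair in pairs:
            if pair in marked:
                continue
            p, q = tuple(pair)
            for a in alphabet:
                if frozenset((delta[(p, a)], delta[(q, a)])) in marked:
                    marked.add(pair)
                    changed = True
                    break
    # Step 3: equivalent iff the pair of start states was never marked.
    return frozenset(((1, start1), (2, start2))) not in marked


# Hypothetical DFAs: a 4-state and a 2-state DFA for "ends in 1", and one that differs.
big = ({("A", "0"): "B", ("A", "1"): "C", ("B", "0"): "A", ("B", "1"): "D",
        ("C", "0"): "B", ("C", "1"): "C", ("D", "0"): "A", ("D", "1"): "D"},
       "A", {"C", "D"})
small_delta = {("X", "0"): "X", ("X", "1"): "Y", ("Y", "0"): "X", ("Y", "1"): "Y"}
small = (small_delta, "X", {"Y"})
other = (small_delta, "X", {"X"})
print(dfas_equivalent(big, small, {"0", "1"}), dfas_equivalent(big, other, {"0", "1"}))
```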
Summary
• How to prove languages are not regular?
– Pumping lemma & its applications

• Closure properties of regular languages

• Simplification of DFAs
– How to remove unreachable states?
– How to identify and collapse equivalent states?
– How to minimize a DFA?
– How to tell whether two DFAs are equivalent?

Introduction to Grammar
Grammars: Introduction

• Finite Automata accept all regular languages and only regular languages
• Many simple languages are non regular:
- {a^n b^n : n = 0, 1, 2, …}
- {w : w is a palindrome}
and there is no finite automaton that accepts them.
• Context-free languages are a larger class of languages that encompasses all regular languages and many others, including the two above.
Grammars: Introduction
• Every regular language is a context-free language.
• The reverse of this is not true, i.e., not every context-free language is regular.
• Many issues and questions we asked for regular languages will be the same for context-free languages:
• Machine model – PDA (Push-Down Automata)
• Descriptor – CFG (Context-Free Grammar)
• Pumping lemma for context-free languages (and find CFL's limit)
• Closure of context-free languages with respect to various operations
• Algorithms and conditions for finiteness or emptiness
• Some analogies don't hold, e.g., non-determinism in a PDA makes a difference and, in particular, deterministic PDAs define a proper subset of the context-free languages.
Grammars: Introduction
• Grammars denote syntactical rules for conversation in natural languages.
• Noam Chomsky gave a mathematical model of grammar in 1956.
• A grammar is a set of production rules which are used to generate strings of a language.
• A grammar can be represented as a 4-tuple (N, T, P, S)
• Where,
• N:- Set of Non terminals or variable list
• T:- Set of Terminals (the alphabet ∑)
• S:- Special Non terminal called Starting symbol of grammar (S ∈ N)
• P:- Production rules (of the form α → β, where α and β are strings on N ∪ ∑ and α contains at least one non-terminal)
Two basic elements of a Grammar
1. Terminal symbols
2. Non-terminal symbols

Terminal Symbols-
• Terminal symbols are denoted by lowercase letters such as a, b, c etc.
• Terminal symbols are the constituents of the sentence generated using a grammar.

Non-Terminal Symbols-
• Non-Terminal symbols are denoted by capital letters such as A, B, C etc.
• Non-Terminal symbols take part in the generation of the sentence but are not part of it.
• Non-Terminal symbols are also called variables.
Example
• Example: Grammar G1
P1: S → AB
P2: A → a
P3: B → b
• G1 = (N,T,P,S) = ({S, A, B}, {a, b}, {P1, P2, P3}, S)
Where,
• S, A, and B are Non-terminal symbols
• a and b are Terminal symbols
• S is the Start symbol, S ∈ N
• P1, P2, P3 are Production rules
Types of Grammar

Chomsky Hierarchy
• According to Noam Chomsky, there are four types of grammars: Type 0, Type 1, Type 2, and Type 3.
• Type 0 is known as unrestricted grammar.
• Type 1 is known as context sensitive grammar.
• Type 2 is known as context free grammar.
• Type 3 is known as regular grammar.
Type 0: Unrestricted Grammar:
• Type-0 grammars include all formal grammars.
• Type 0 grammar languages are recognized by a Turing Machine.
• These languages are also known as the Recursively Enumerable languages.
• Grammar Production in the form of α → β
• where
α is ( V + T)* V ( V + T)*
V : Variables/NT
T : Terminals.
β is ( V + T )*.
• In type 0 there must be at least one variable on Left side of production.

Example1 :
Sab → ba
A → S
Here, Variables are S, A and Terminals a, b.

Example2 :
S → ACaB
Bc → acB
CB → DB
aD → Db
Type 1: Context Sensitive Grammar
• Type-1 grammars generate the context-sensitive languages.
• The languages generated by these grammars are recognized by a Linear Bounded Automaton (LBA)
Rules:
1. First of all Type 1 grammar should be Type 0.
2. Grammar Production in the form of α → β
Where,
α , β is ( V + T )+.
| α | <= | β |
i.e. the count of symbols in α is less than or equal to that in β

Example: 1
S → AB
AB → abc
B → b

Example: 2
AB → AbBc
A → bcA
B → b
Type 2: Context Free Grammar:
• Type-2 grammars generate the context-free languages.
• The language generated by the grammar is recognized by a Pushdown automaton (PDA)
Rules:
1. First of all it should be Type 1.
2. Left hand side of production can have only one variable.
3. Grammar Production in the form of α → β
Where,
α is Single NT
β is ( V + T )*.
| α | <= | β |
i.e. the count of symbols in α is less than or equal to that in β, except for ε-productions such as A → ε in the example, where β is empty
Example
S → AB
A → a/ε
B → b
Type 3: Regular Grammar
• Type-3 grammars generate regular languages.
• These languages can be accepted by a finite state automaton (FA)
• Type 3 is the most restricted form of grammar.
• The productions must be in the form
X → Aa/a (left linear)
X → aA/a (right linear)
where,
X,A is Non Terminal
a∈∑*
• A single grammar uses only one of the two forms; mixing left-linear and right-linear productions can generate non-regular languages.
Example
S → aS/b
S → aS/c
S → Sa/b
A → ba/ε
CFG and its Languages

Context Free Grammars and Languages
• Context free grammar (CFG) is a formal grammar which is used to generate all possible strings in a given formal language.
• Context free grammar G can be defined by a 4-tuple as:
(N, T, P, S)
• Where,
• N:- Set of Non terminals or variable list
• T:- Set of Terminals (the alphabet ∑)
• S:- Special Non terminal called Starting symbol of grammar (S ∈ N)
• P:- Production rules (of the form A → β, where A ∈ N and β is a string on N ∪ ∑)
• In CFG, the start symbol is used to derive the string.
• We can derive the string by repeatedly replacing a non-terminal by the right hand side of the production, until all non-terminals have been replaced by terminal symbols.
• It is used to generate all possible patterns of strings in a given formal language.
Examples
Example 1:
Construct the CFG for the language having any number of a's over the set ∑= {a}. R.E= a*
Grammar :Production rule (P):
S → aS rule 1
S → ε rule 2

Derive a string "aaa"
S
⇒ aS rule 1
⇒ aaS rule 1
⇒ aaaS rule 1
⇒ aaaε rule 2
⇒ aaa (Required string)
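The derivation of "aaa" can be replayed mechanically. The sketch below assumes each rule is stored as a (head, body) pair keyed by its rule number and that each step rewrites the first occurrence of the head; `RULES` and `derive` are hypothetical names.

```python
# Assumed encoding of the grammar S -> aS (rule 1), S -> epsilon (rule 2).
RULES = {1: ("S", "aS"), 2: ("S", "")}


def derive(start, rule_numbers, rules):
    """Apply the numbered rules in order and return every sentential form produced."""
    form = start
    forms = [form]
    for number in rule_numbers:
        head, body = rules[number]
        if head not in form:
            raise ValueError(f"rule {number} does not apply to {form!r}")
        form = form.replace(head, body, 1)  # rewrite the first occurrence of the head
        forms.append(form)
    return forms


print(derive("S", [1, 1, 1, 2], RULES))
```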
Example 2:
Construct a CFG for the regular expression (0 +1)*
Grammar :Production rule (P):
S → 0S | 1S rule 1
S → ε rule 2
Derive a string "1001"
S
⇒ 1S rule 1
⇒ 10S rule 1
⇒ 100S rule 1
⇒ 1001S rule 1
⇒ 1001ε rule 2
⇒ 1001 (Required string)
Example 3:
Construct a CFG for the language L = {w c w^R : w ∈ {a,b}*}, i.e. palindromes over ∑={a,b} with a centre marker c

Grammar :Production rule (P):
S → aSa rule 1
S → bSb rule 2
S → c rule 3
Derive a string "abbcbba"
S ⇒ aSa from rule 1
⇒ abSba from rule 2
⇒ abbSbba from rule 2
⇒ abbcbba from rule 3 (Required string)
Example 4:
Construct a CFG for defining palindrome over ∑={a,b}
Grammar :Production rule (P):
S → aSa rule 1
S → bSb rule 2
S → a/b/ε rule 3
Derive a string "abbabba"
S ⇒ aSa from rule 1
⇒ abSba from rule 2
⇒ abbSbba from rule 2
⇒ abbabba from rule 3 (Required string)
Example 5:
Construct a CFG for set of strings with equal no. of a's and b's over ∑={a,b}
Grammar :Production rule (P):
S → SaSbS rule 1
S → SbSaS rule 2
S → ε rule 3
Derive a string "babaab"
S ⇒ SaSbS from rule 1
⇒ SbSaSaSbS from rule 2 (applied to the first S)
⇒ SbSaSbSaSaSbS from rule 2 (applied to the first S)
⇒ babaab from rule 3 applied to every S (Required string)
Example 6:
Construct a CFG for the language L = {a^n b^2n : n >= 1} over ∑={a,b}
Grammar :Production rule (P):
S → aSbb rule 1
S → abb rule 2
Derive a string "aabbbb"
S ⇒ aSbb from rule 1
⇒ aabbbb from rule 2 (Required string)
Example 7:
Construct a CFG for the RE=(011+1)* (01)*
Grammar :Production rule (P):
S → AB rule 1
A → ε / CA rule 2
C → 011 / 1 rule 3
B → ε / DB rule 4
D → 01 rule 5
Derivation & Parse Tree

Derivations
• Starting with the start symbol, non-terminals are rewritten using productions
until only terminals remain.
• Any terminal sequence that can be generated in this manner is syntactically
valid.
• If a terminal sequence can’t be generated using the productions of the
grammar it is invalid (has syntax errors).
• The set of strings derivable from the start symbol is the language of the
grammar (sometimes denoted L(G)).
• Derivation is a sequence of production rules.
• It is used to get the input string through these production rules.

• During parsing, we need to take the following two decisions.
1. Need to decide the non-terminal which is to be replaced.
2. Need to decide the production rule by which the non-terminal will be replaced.
• For the first decision, there are two options for choosing which non-terminal to replace:
1. Left most Derivation
2. Right most Derivation
• To illustrate a derivation, we can draw a derivation tree (also called a parse tree)
Leftmost, Rightmost Derivations

Definition. A left-most derivation of a sentential form is one in which rules transforming the left-most nonterminal are always applied

Definition. A right-most derivation of a sentential form is one in which rules transforming the right-most nonterminal are always applied
Left most Derivation
• In the leftmost derivation, the sentential form is scanned from left to right and the production rule is applied to the first non-terminal found.
• Leftmost non-terminal is always expanded.
Example:
E → E+E Rule1
E → E-E Rule2
E → a|b Rule3
The leftmost derivation of W = a - b + a is:
E ⇒ E+E
⇒ E-E+E
⇒ a-E+E
⇒ a-b+E
⇒ a-b+a
Rightmost Derivation
• In rightmost derivation, the sentential form is scanned from right to left and the production rule is applied to the first non-terminal found.
• Rightmost non-terminal is always expanded.
Example:
E → E+E Rule1
E → E-E Rule2
E → a|b Rule3
The rightmost derivation of W = a - b + a is:
E ⇒ E-E
⇒ E-E+E
⇒ E-E+a
⇒ E-b+a
⇒ a-b+a
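Both derivations of a - b + a can be reproduced with a small Python sketch. It assumes single-character symbols with uppercase letters as non-terminals; `RULES`, `expand` and `derivation` are hypothetical names, and the caller supplies the production body to use at each step.

```python
# Assumed encoding of E -> E+E | E-E | a | b.
RULES = {"E": ["E+E", "E-E", "a", "b"]}


def expand(form, body, leftmost=True):
    """Replace the leftmost (or rightmost) non-terminal of form by body."""
    positions = [i for i, ch in enumerate(form) if ch.isupper()]
    if not positions:
        raise ValueError("no non-terminal left to expand")
    i = positions[0] if leftmost else positions[-1]
    if body not in RULES[form[i]]:
        raise ValueError(f"{form[i]} -> {body} is not a production")
    return form[:i] + body + form[i + 1:]


def derivation(start, bodies, leftmost=True):
    """Return the list of sentential forms obtained by applying bodies in order."""
    forms = [start]
    for body in bodies:
        forms.append(expand(forms[-1], body, leftmost))
    return forms


print(derivation("E", ["E+E", "E-E", "a", "b", "a"], leftmost=True))
print(derivation("E", ["E-E", "E+E", "a", "b", "a"], leftmost=False))
```

Both calls end in the same string a-b+a; only the order in which non-terminals are expanded differs.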
Leftmost & Rightmost Derivations

Sample derivations:
S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb
S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb

These two derivations are special.
1st derivation is leftmost. Always picks leftmost variable.
2nd derivation is rightmost. Always picks rightmost variable.

[Figure: derivation tree shared by both derivations: S has children A and B; A has children A and A, each deriving a; B has children b and B, and that B derives b.]
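• Example:
S –> AB
A –> aAA
A –> aA
A –> a
B –> bB
B –> b

[Figure: two derivation trees. Tree 1 has root S with children A and B; A has children a, A, A, where the second of these A's derives a; B derives b; yield = aAab. Tree 2 has root A with children a and A; that A has children a, A, A; yield = aaAA.]

• Notes:
– Root can be any non-terminal
– Leaf nodes can be terminals or non-terminals
– A derivation tree with root S shows the productions used to obtain a sentential form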
• Observation: Every derivation corresponds to one derivation tree.

Rules:
S –> AB
A –> aAA
A –> aA
A –> a
B –> bB
B –> b

Derivation:
S => AB => aAAB => aaAB => aaaB => aaab

[Figure: derivation tree: S has children A and B; A has children a, A, A, each of these A's deriving a; B derives b.]

• Observation: Every derivation tree corresponds to one or more derivations.

leftmost: S => AB => aAAB => aaAB => aaaB => aaab
rightmost: S => AB => Ab => aAAb => aAab => aaab
mixed: S => AB => Ab => aAAb => aaAb => aaab

• Definition: A derivation is leftmost (rightmost) if at each step in the derivation a production is applied to the leftmost (rightmost) non-terminal in the sentential form.
– The first derivation above is leftmost, second is rightmost, the third is neither.
• Observation: Every derivation tree corresponds to exactly one leftmost (and rightmost) derivation.

S => AB => aAAB => aaAB => aaaB => aaab

[Figure: derivation tree for aaab: S has children A and B; A has children a, A, A, each of these A's deriving a; B derives b.]

• Observation: Let G be a CFG. Then there may exist a string x in L(G) that has more than 1 leftmost (or rightmost) derivation. Such a string will also have more than 1 derivation tree.
Parse tree
• Parse tree is the graphical representation of a derivation. Each node is labelled with a symbol, which can be terminal or non-terminal.
• In parsing, the string is derived using the start symbol.
• The root of the parse tree is that start symbol.
• All leaf nodes have to be terminals.
• All interior nodes have to be non-terminals.
• Reading the leaves from left to right gives the original input string.
Parse tree

Sample derivations:
S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb
S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb

These two derivations use same productions, but in different orders.
This ordering difference is often uninteresting.
Derivation trees give way to abstract away ordering differences.

[Figure: derivation tree: S has children A and B; A has children A and A, each deriving a; B has children b and B, and that B derives b.]

Root label = start node.
Each interior label = variable.
Each parent/child relation = derivation step.
Each leaf label = terminal or ε.
All leaf labels together = derived string = yield.
Example:
Grammar G :
S → S+S | S*S
S → a|b|c
Input String : W=a * b + c
[Figure: Parse Tree for Left most Derivation]

Input String : W=a * b + c
[Figure: Parse Tree for Right most Derivation]
What is the language defined by 'G'

• G : S → aS/bS/a/b
L(G) = (a+b)^+
• G : S → XaaX
X → aX/bX/ε
L(G) = (a+b)* aa (a+b)*
• G : S → SS
L(G) = ∅ (no terminal string can be derived)
• G : S → aCa
C → aCa/b
S ⇒ aCa
⇒ aaCaa
⇒ aaaCaaa
⇒ aaabaaa
L(G) = {a^n b a^n | n >= 1}
• G : S → 0S1/ε
S ⇒ 0S1
⇒ 00S11
⇒ 000S111
⇒ 000111
L(G) = {0^n 1^n | n >= 0}
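One way to check a guessed L(G) is to enumerate the short strings a grammar derives. The Python sketch below does a breadth-first search over leftmost expansions. It assumes single-character symbols, with the dictionary keys as the non-terminals; the name `language` and the cutoff on the number of non-terminals are assumptions added only to guarantee termination, so the enumeration may miss strings in grammars with many nullable variables.

```python
from collections import deque


def language(rules, start, max_len):
    """Terminal strings of length <= max_len derivable from start (bounded search)."""
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in rules), None)
        if idx is None:          # no non-terminal left: form is a terminal string
            found.add(form)
            continue
        for body in rules[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            terminals = sum(1 for ch in new if ch not in rules)
            nonterminals = len(new) - terminals
            # Prune forms that are already too long; the non-terminal bound is a heuristic cutoff.
            if terminals <= max_len and nonterminals <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(found, key=lambda s: (len(s), s))


print(language({"S": ["0S1", ""]}, "S", 6))
print(language({"S": ["aCa"], "C": ["aCa", "b"]}, "S", 5))
print(language({"S": ["SS"]}, "S", 3))
```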
Ambiguous grammar

Ambiguous Grammar

• Definition: Let G be a CFG. Then G is said to be ambiguous if there exists an x in L(G) with >1 leftmost derivations. Equivalently, G is said to be ambiguous if there exists an x in L(G) with >1 parse trees, or >1 rightmost derivations.

• Note: Given a CFL L, there may be more than one CFG G with L = L(G). Some ambiguous and some not.

• Definition: Let L be a CFL. If every CFG G with L = L(G) is ambiguous, then L is inherently ambiguous.
• An ambiguous Grammar:
E -> I
E -> E + E
E -> E * E
E -> (E)
I -> ε | 0 | 1 | … | 9
∑ ={0,…,9, +, *, (, )}

• A string: 3*2+5
• Two parse trees: * on top, & + on top
& two left-most derivations:

A leftmost derivation:
E =>E*E =>I*E =>3*E =>3*E+E =>3*I+E =>3*2+E =>3*2+I =>3*2+5

Another leftmost derivation:
E =>E+E =>E*E+E =>I*E+E =>3*E+E =>3*I+E =>3*2+E =>3*2+I =>3*2+5
[Figure: the two parse trees for 3*2+5. Tree with * on top: E has children E, *, E; the left E derives I and then 3; the right E has children E, +, E, deriving 2 and 5 through I. Tree with + on top: E has children E, +, E; the left E has children E, *, E, deriving 3 and 2 through I; the right E derives I and then 5.]
Ambiguity
• A grammar is said to be ambiguous if there exists more than one leftmost derivation or more than one rightmost derivation or more than one parse tree for the given input string.

Example1: Input String : W=a * b + c
[Figure: Parse Tree for Left most Derivation and Parse Tree for Right most Derivation]
• Example 2 :
S → aSb | SS
S → ε
[Figure: Parse Tree I and Parse Tree II]
• If the grammar has ambiguity then it is not good for compiler construction.
• No general method can automatically detect and remove ambiguity (deciding whether a CFG is ambiguous is undecidable), but ambiguity can often be removed by re-writing the grammar.
Ambiguous grammar to unambiguous grammar
Example1:
• Show that the given Expression grammar is ambiguous. Also, find an equivalent unambiguous grammar.
Input Grammar:
E → E*E
E → E+E
E → id
Solution:
• Let us derive the string "id + id * id"
[Figure: two parse trees for "id + id * id"]
As there are two different parse trees for deriving the same string "id + id * id", the given grammar is ambiguous.
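The number of parse trees for such a string can be counted directly. The sketch below is specific to the grammar E → E*E | E+E | id: every tree either is a single id or splits the token list at some operator. The function name `count_parse_trees` and the token-list input format are assumptions.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of a token list under E -> E+E | E*E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways tokens[i:j] can be derived from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):          # operator at the root of the tree
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than 1 for any single string is enough to show that the grammar is ambiguous.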
Removing ambiguity
Rewriting the grammar
For the Expression Grammar, use the following steps to get unambiguous grammar
1. Take care of precedence (use a different non terminal for each precedence level, starting with the lowest precedence, PLUS)
2. Ensure associativity (define the rule as left recursive if the operator is left associative and as right recursive if the operator is right associative)
The equivalent unambiguous grammar
E → E+T
E → T
T → T*F
T → F
F → id
• It reflects the fact that ∗ has higher precedence than +.
• Also, the operators + and ∗ are left-associative as these 2 are left recursive rules.
Example2:
• Check whether the given grammar is ambiguous or not. Also, find an equivalent unambiguous grammar.
S → S+S
S → S*S
S → S^S
S → a
Solution:
Let us derive the string "a + a * a"
The equivalent unambiguous grammar
S → S+A | A
A → A*B | B
B → C^B | C
C → a

• It reflects the fact that ^ has higher precedence than * and +.
• The operators + and ∗ are left-associative as these 2 are left recursive rules.
• The operator ^ is right associative as it is a right recursive rule.
Elimination of Useless Symbols

Elimination of Useless Symbols

❖Useful Symbols
❑A symbol X in a CFG G = (V, T, P, S) is called useful
✔ if there exists a derivation of a terminal string from S where X appears somewhere,
✔ else it is called useless.
Elimination of Useless Symbols

• A CFG has no useless variables if and only if all its variables are reachable and
generating.
• Therefore it is possible to eliminate useless variables from a grammar as
follows:

❑Step 1: Find the non-generating variables and delete them, along with all productions involving
non-generating variables.

❑Step 2: Find the non-reachable variables in the resulting grammar and delete them, along with all
productions involving non-reachable variables.

Elimination of Useless Symbols
• Generating variables
• A variable X is called generating
- if it derives a string of terminals.
- Note that the language generated by a context-free grammar is non-empty if and only if the start symbol is generating.
• Algorithm to find the non-generating variables in a CFG
▪ Mark a variable X as "generating"
- if it has a production X -> w, where w is a string of only terminals and/or variables previously marked "generating".
▪ Repeat the above step until no further variables get marked "generating".
▪ All variables not marked "generating" are non-generating
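The marking algorithm can be sketched in Python as a fixed-point loop. The encoding is an assumption: a grammar is a dict from a variable to its list of bodies, symbols are single characters, and uppercase letters are variables while everything else is a terminal. The example grammar is S -> abS | abA | abB, A -> cd, B -> aB, C -> dc.

```python
def generating_variables(productions):
    """Repeatedly mark variables that have a body made only of terminals and marked variables."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            for body in bodies:
                if all(not sym.isupper() or sym in generating for sym in body):
                    generating.add(head)
                    changed = True
                    break
    return generating


grammar = {"S": ["abS", "abA", "abB"], "A": ["cd"], "B": ["aB"], "C": ["dc"]}
print(sorted(generating_variables(grammar)))  # B is never marked
```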
Elimination of Useless Symbols
• Reachable variables
• A variable X is called reachable
- if the start symbol derives a string containing the variable X.
• Algorithm to find the non-reachable variables in a CFG
• Mark the start variable as "reachable".
• Mark a variable Y as "reachable" if there is a production X -> w, where X is a variable previously marked as "reachable" and w is a string containing Y.
• Repeat the above step until no further variables get marked "reachable".
• All variables not marked "reachable" are non-reachable
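A matching Python sketch of the reachability marking, under the same assumed encoding (dict of bodies, single-character symbols, uppercase letters as variables). The example grammar is the one left after removing the non-generating variable B: S -> abS | abA, A -> cd, C -> dc.

```python
def reachable_variables(productions, start):
    """Mark every variable that occurs in a body of an already reachable variable."""
    reachable, stack = {start}, [start]
    while stack:
        head = stack.pop()
        for body in productions.get(head, []):
            for sym in body:
                if sym.isupper() and sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return reachable


grammar = {"S": ["abS", "abA"], "A": ["cd"], "C": ["dc"]}
print(sorted(reachable_variables(grammar, "S")))  # C is not reachable from S
```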
Elimination of Useless Symbols-Example
1. Remove the useless symbol from the given context free grammar
S -> abS | abA | abB
A -> cd
B -> aB
C -> dc
Solution:
❖Step 1: Eliminate non-generating symbols i.e non-terminals which do not produce any terminal string
❖ In the given productions, B does not produce any terminal string
❖ Eliminate all the productions in which B occurs, i.e. S -> abB and B -> aB
❖Resulting productions are:
S -> abS | abA
A -> cd
C -> dc
Elimination of Useless Symbols-Example
❖Step 2: Eliminate non-reachable symbols i.e non-terminals that can never be reached from the starting symbol
• In the set of productions available after Step 1, 'C' is not reachable from starting symbol 'S'
• Eliminate productions involving non-terminal 'C', i.e. C -> dc
• Final productions after eliminating useless symbols are:
S -> abS | abA
A -> cd
Elimination of Useless Symbols-Example
2. Remove the useless symbol from the given context free grammar
S -> aB / bX
A -> Bad / bSX / a
B -> aSB / bBX
X -> SBD / aBx / ad
❖Step 1: Eliminate non-generating symbols i.e non-terminals which do not produce any terminal string
• A and X directly derive the terminal strings a and ad, hence they are generating. Since X is generating, S is also generating as S -> bX.
• But B does not derive any terminal string, so clearly B is a non-generating symbol.
• So eliminate the productions with B, i.e. S -> aB, A -> Bad, B -> aSB / bBX and X -> SBD / aBx
Elimination of Useless Symbols-Example

• The resulting productions are
S -> bX
A -> bSX / a
X -> ad
❖Step 2: Eliminate non-reachable symbols i.e non-terminals that can never be reached from the starting symbol
• In the reduced grammar A is a non-reachable symbol
• So remove the productions involving A
• Final grammar after elimination of the useless symbols is
S -> bX
X -> ad
Elimination of Useless Symbols
• Elimination of useless symbols - Order of elimination
• Always eliminate non-generating symbols first and then eliminate non-reachable symbols
• Reversing the order of elimination would not work
S -> AB | a
A -> aA
B -> b
• Here A is non-generating, and after deleting A (along with the production S -> AB) the variable B becomes unreachable. Hence, it is considered a useless variable
• However, if we first tested for reachability, all variables would be reachable, and subsequently eliminating non-generating variables would leave B in the grammar even though it is useless.
Elimination of Useless Symbols

• If a symbol is useful then it is both generating and reachable
• Converse of above statement is not true.
• For e.g. in CFG
S → ABC
B → b
B is both reachable and generating but still not useful, because A and C are non-generating, so S → ABC never yields a terminal string
Elimination of Null Productions

Elimination of Null Productions
• Null Productions
A production of type A → ϵ is called a Null production
• In a given CFG, a non-terminal N is called nullable
- if there is a production N -> ϵ or
- if there is a derivation that starts at N and leads to ϵ
• If A -> ϵ is a production to be eliminated
- look for all productions whose right side contains A, and
- replace the occurrences of A in each of these productions by ϵ, in every possible combination, to obtain the non ϵ-productions.
- the resultant non ϵ-productions must be added to the grammar to keep the language the same.
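A Python sketch of this procedure follows. It first computes the nullable variables and then, for every body, keeps or drops each nullable occurrence in all combinations. The grammar encoding (dict of bodies with "" standing for ϵ, single-character symbols) and the function names are assumptions; the example grammar is S -> ABAC, A -> aA | ϵ, B -> bB | ϵ, C -> c.

```python
from itertools import product


def nullable_variables(productions):
    """Variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_null(productions):
    """Drop epsilon-productions and add every variant with nullable symbols removed."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or replaced by the empty string.
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:                 # never add an epsilon-production back
                    new_bodies.add(candidate)
        result[head] = sorted(new_bodies)
    return result


grammar = {"S": ["ABAC"], "A": ["aA", ""], "B": ["bB", ""], "C": ["c"]}
print(eliminate_null(grammar))
```

If the start symbol itself is nullable, this sketch removes ϵ from the language; that case needs separate handling and is not covered here.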
Elimination of Null Productions – Example
1. Remove the null productions from the following grammar
S -> aX / bX
X-> a / b / є
Solution:
- There is one null production in the grammar X -> ϵ.
- To eliminate X -> ϵ, change the productions containing X in the right side.
- The productions with X in the right side are S -> aX and S -> bX
- So replacing each occurrence of X by ϵ, we get two new productions
S-> a and S -> b
- Adding these productions to the grammar and eliminating X -> ϵ, we get
S -> aX / bX / a / b
X-> a / b

Elimination of Null Productions – Example
• 2. Remove the null productions from the following grammar
S -> ABAC
A -> aA / ϵ
B -> bB / ϵ
C -> c
Solution:
• We have two null productions in the grammar, A -> ϵ and B -> ϵ
• To eliminate A -> ϵ we have to change the productions containing A in the right side.
• The productions with A in the right side are S -> ABAC and A -> aA.
• So replacing each occurrence of A by ϵ, we get four new productions
S -> ABC / BAC / BC
A -> a
• Add these productions to the grammar and eliminate A -> ϵ.
S -> ABAC / ABC / BAC / BC
A -> aA / a
B -> bB / ϵ
C -> c

Elimination of Null Productions – Example

• To eliminate B -> ϵ we have to change the productions containing B on the right side.
• The productions with B in the right side are S -> ABAC / ABC / BAC / BC and B -> bB
• Doing that we generate these new productions:
S -> AAC / AC / C
B -> b
Add these productions to the grammar and remove the production B -> ϵ from the
grammar. The new grammar after removal of ϵ – productions is:
S -> ABAC / ABC / BAC / BC / AAC / AC / C
A -> aA / a
B -> bB / b
C -> c
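
The two elimination rounds above can be reproduced in one pass by dropping every combination of nullable symbols from each right-hand side. This is a hedged sketch: `eliminate_null` and the string-based grammar representation are assumptions for illustration, and the sketch does not handle the case where the start symbol itself is nullable (ϵ would then be lost from the language).

```python
from itertools import product


def eliminate_null(grammar):
    """grammar: dict non-terminal -> list of right-hand sides (strings).
    The empty string "" stands for the null production."""
    # Compute the nullable non-terminals by fixed-point iteration.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                all(s in nullable for s in b) for b in bodies
            ):
                nullable.add(head)
                changed = True
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            # Each nullable symbol may either be kept or replaced by "".
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                # Skip the empty body and duplicates.
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


grammar = {"S": ["ABAC"], "A": ["aA", ""], "B": ["bB", ""], "C": ["c"]}
new = eliminate_null(grammar)
print(new["S"])  # ['ABAC', 'ABC', 'AAC', 'AC', 'BAC', 'BC', 'C']
```

The set of S productions matches the result derived by hand for example 2; only the order of the alternatives differs.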

Elimination of Unit Productions
• Unit Production
▪ A unit production is a production A -> B where both A and B are non-terminals.
▪ Unit productions add derivation steps without producing any terminal, so they are redundant and should be removed.

• Follow these steps to remove unit productions

1. Select a unit production A -> B such that there exists a non-unit production B -> α, where α is any right side other than a single non-terminal
2. For every non-unit production B -> α, repeat the following step
▪ Add the production A -> α to the grammar
3. Eliminate A -> B from the grammar
4. Repeat the above steps if there are more unit productions

Elimination of Unit Productions – Example
1. Eliminate Unit productions from the given grammar
S-> aX / bY / Y
X-> S
Y -> bY / b
Solution:
• There are two unit productions in the given grammar, S -> Y and X -> S
• Substituting the productions of Y for the unit production S -> Y we get,
S-> aX / bY / bY / b, which simplifies to S-> aX / bY / b
• Substituting the updated productions of S for the unit production X -> S we get,
X-> aX / bY / b
• Final set of productions would be,
S-> aX / bY / b
X-> aX / bY / b
Y -> bY / b
Elimination of Unit Productions – Example
2. Eliminate Unit productions from the given grammar
S -> AB
A -> a , B -> C , C -> D and D -> b
Solution:
• There are two unit productions in the given grammar, B -> C and C -> D
• Substituting C -> D into the unit production B -> C we get,
B-> D
• Substituting D -> b into the unit production B-> D we get,
B-> b
• Substituting D -> b into the unit production C-> D we get,
C-> b
• C and D are now non-reachable symbols. Hence remove them
• Final set of productions after removing the non-reachable symbols would be,
S -> AB
A -> a
B-> b
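
Unit productions can also be removed mechanically. The sketch below uses a closure-based variant rather than the one-at-a-time substitution shown in the slides: for each non-terminal it collects everything reachable through unit productions alone and copies over the non-unit right sides. The name `eliminate_unit` and the string-based representation are assumptions for illustration.

```python
def eliminate_unit(grammar):
    """grammar: dict non-terminal -> list of right-hand sides (strings).
    A body is a unit production when it is a single uppercase letter."""
    def is_unit(body):
        return len(body) == 1 and body.isupper()

    result = {}
    for head in grammar:
        # Non-terminals reachable from head using unit productions only.
        closure, stack = {head}, [head]
        while stack:
            for body in grammar.get(stack.pop(), []):
                if is_unit(body) and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # head inherits every non-unit body of the symbols in its closure.
        bodies = []
        for target in grammar:  # grammar order keeps the output stable
            if target in closure:
                for body in grammar[target]:
                    if not is_unit(body) and body not in bodies:
                        bodies.append(body)
        result[head] = bodies
    return result


grammar = {"S": ["AB"], "A": ["a"], "B": ["C"], "C": ["D"], "D": ["b"]}
print(eliminate_unit(grammar))
# {'S': ['AB'], 'A': ['a'], 'B': ['b'], 'C': ['b'], 'D': ['b']}
```

The sketch only removes unit productions. The symbols that are no longer reachable from S (here C and D) still have to be removed in a separate useless-symbol pass, as in the worked example.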

Exercise Problems
1. Remove the useless symbols from the given grammar
A -> xyz / Xyzz
X -> Xz / xYz
Y -> yYy / Xz
Z -> Zy / z
2. Remove the useless symbols from the given grammar
T → aaB | abA | aaT
A → aA
B → ab | b
C → ad

Exercise Problems
3. Remove the ε-productions from the following CFG while preserving its language.
S → XYX
X → 0X | ε
Y → 1Y | ε

4. Remove the ε-productions from the following CFG while preserving its language.
S → ASA | aB | b
A → B
B → b | ε

Exercise Problems
5. Identify and remove the unit productions from the following CFG
S -> S + T/ T
T -> T * F/ F
F -> (S)/a

6. Remove the unit productions from the following grammar


S -> AB
A -> a
B -> C / b
C -> D
D -> E
E -> a

Normal Form
• Normalization of a grammar is the process of rewriting its productions into a restricted, standard form without changing the language it generates.
• A grammar is said to be in normal form when every production of the
grammar has some specific form
• In this course we are going to study 2 types of normal form
- Chomsky normal form (CNF)
- Greibach normal form (GNF)

Chomsky Normal Form (CNF)

• A context free grammar (CFG) is in Chomsky Normal Form (CNF) if all production rules satisfy one of the following conditions:

Let us consider,
NT = Non terminal (Eg. A,S,E..)
T = Terminal (Eg. a,b,0,1--)

1. S → ε
2. NT → T (Eg. A → a)
3. NT → NT NT (Eg. A → SE)
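
The three CNF conditions are easy to check mechanically. The following is a minimal sketch under stated assumptions: `is_cnf` is a hypothetical helper, right sides are strings, uppercase letters are non-terminals, and the empty string stands for ε.

```python
def is_cnf(grammar, start):
    """Return True if every production has one of the three CNF shapes."""
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                # Condition 1: only the start symbol may derive ε.
                if head != start:
                    return False
            elif len(body) == 1:
                # Condition 2: a single symbol must be a terminal.
                if body.isupper():
                    return False
            elif len(body) == 2:
                # Condition 3: exactly two non-terminals.
                if not (body[0].isupper() and body[1].isupper()):
                    return False
            else:
                return False
    return True


print(is_cnf({"S": ["AB", ""], "A": ["a"], "B": ["b"]}, "S"))  # True
print(is_cnf({"S": ["aB"], "B": ["b"]}, "S"))                  # False
```

The second grammar fails because S → aB mixes a terminal with a non-terminal on the right side.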

Steps to convert a CFG to CNF
1. Eliminate null, unit and useless productions (kindly refer to the previous slides).
2. Eliminate terminals from the RHS if they exist with other terminals or non-terminals.

Example:
Consider A → aX
Then we can convert it to CNF by introducing a new non-terminal Z:
Z → a
A → ZX

CNF forms: NT → T and NT → NT NT


3. Eliminate RHS with more than two non-terminals.

Example:
Consider A → BDX
Then we can convert it to CNF by introducing a new non-terminal Z:
Z → BD
A → ZX
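
Steps 2 and 3 of the conversion can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the function name, the generated names `T1, T2, ...` and `P1, P2, ...` (assumed not to clash with existing non-terminals), and the representation (right sides as tuples of symbols, a symbol being a non-terminal exactly when it is a key of the grammar) are all assumptions. Step 1 is assumed to have been done already.

```python
def to_cnf_steps_2_and_3(grammar):
    """grammar: dict non-terminal -> list of bodies (tuples of symbols).
    Assumes null, unit and useless productions are already removed."""
    result = {head: [] for head in grammar}
    term_var = {}  # terminal -> new non-terminal producing it
    pair_var = {}  # (X, Y)   -> new non-terminal producing X Y

    def fresh(prefix, table, key, body):
        # Reuse the helper variable for key, or create it on first use.
        if key not in table:
            name = f"{prefix}{len(table) + 1}"
            table[key] = name
            result[name] = [body]
        return table[key]

    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) >= 2:
                # Step 2: replace each terminal a by a variable T with T -> a.
                body = tuple(
                    s if s in grammar else fresh("T", term_var, s, (s,))
                    for s in body
                )
                # Step 3: fold the leftmost pair until two symbols remain.
                while len(body) > 2:
                    pair = body[:2]
                    body = (fresh("P", pair_var, pair, pair),) + body[2:]
            result[head].append(body)
    return result


# Hypothetical grammar combining the two slide examples A -> aX and A -> BDX.
grammar = {
    "A": [("a", "X"), ("B", "D", "X")],
    "X": [("x",)], "B": [("b",)], "D": [("d",)],
}
cnf = to_cnf_steps_2_and_3(grammar)
print(cnf["A"])   # [('T1', 'X'), ('P1', 'X')]
print(cnf["T1"])  # [('a',)]
print(cnf["P1"])  # [('B', 'D')]
```

`T1` and `P1` play the role of the new non-terminal Z in the two slide examples.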

CNF Problem
• Define the two normal forms to which a context free grammar (CFG) can be converted. Convert the following CFG to Chomsky normal form:
S→A/B/C
A→aAa/B
B→bB/bb
C→baD/abD/aa
D→ aCaa/D
• Construct the following grammar in CNF:
S→ ABC/BaB
A →aA/BaC/aaa
B →bBb/a/D
C →CA/AC
D→ ε
CNF Problem
• Convert the following grammar into CNF
S → cBA
S→A
A → cB | AbbS
B → aaa

• Construct an equivalent grammar G in CNF for the grammar G1 where
G1=({S,A,B}, {a,b}, {S → ASB / ε, A → aAS / a, B → SbS / A / bb}, S)

Greibach Normal Form (GNF)
• GNF stands for Greibach normal form. A CFG (context free grammar) is in GNF if all the production rules satisfy one of the following conditions:

Let us consider,
NT = Non terminal (Eg. A,S,E..)
T = Terminal (Eg. a,b,0,1--)

1. S → ε
2. NT → T (Eg. A → a)
3. NT → T (NT)* (Eg. A → aSBBA)
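
As with CNF, the GNF conditions can be checked mechanically. The sketch below is an illustration under assumptions: `is_gnf` is a hypothetical helper, right sides are tuples of symbols, a symbol is a non-terminal exactly when it is a key of the grammar, and the empty tuple stands for ε.

```python
def is_gnf(grammar, start):
    """Return True if every production is S -> ε or NT -> T (NT)*."""
    for head, bodies in grammar.items():
        for body in bodies:
            if not body:
                # Condition 1: only the start symbol may derive ε.
                if head != start:
                    return False
                continue
            first, rest = body[0], body[1:]
            # The first symbol must be a terminal and all the
            # remaining symbols must be non-terminals.
            if first in grammar or any(s not in grammar for s in rest):
                return False
    return True


# Hypothetical grammars used only to exercise the check.
good = {"S": [("a", "S", "B"), ("b",)], "B": [("b",)]}
bad = {"S": [("S", "a"), ("b",)]}
print(is_gnf(good, "S"))  # True
print(is_gnf(bad, "S"))   # False
```

The second grammar fails because S → Sa starts with a non-terminal.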

Steps to convert a CFG to GNF

1. Eliminate null, unit and useless productions (kindly refer to the previous slides).
2. Convert the given grammar into CNF (kindly refer to the previous slides).
3. Rename the non-terminals as A1, A2, A3, ...
4. Check that every production whose right side starts with a non-terminal has the form Ai → Aj γ where i < j.
5. If a production is not as per step 4, replace it using Lemma I (when j < i) or Lemma II (when j = i).
6. Finally, substitute for the leading non-terminal of every remaining production so that each right side starts with a terminal.

Lemma I
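If G = (V,T,P,S) is a CFG in which P contains the production

A → Bα ------ (1)
and the set of ‘B’ productions belonging to P is
B → β1 | β2 | β3 | β4 ----- | βn ------ (2)
then let G’ = (V,T,P’,S)
where P’ is obtained from P by replacing (1) with
A → β1 α | β2 α | β3 α | β4 α ----- | βn α

that is, by substituting (2) in (1). G and G’ generate the same language.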
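
Lemma I amounts to replacing the leading non-terminal of a production by each of its alternatives. A minimal sketch follows, assuming right sides are stored as tuples of symbols; the name `substitute_first` and the sample grammar are hypothetical.

```python
def substitute_first(grammar, head, target):
    """Replace every production head -> target γ by head -> β γ,
    once for each production target -> β. Other productions are kept."""
    new_bodies = []
    for body in grammar[head]:
        if body and body[0] == target:
            # Prepend each alternative of target to the rest of the body.
            new_bodies.extend(beta + body[1:] for beta in grammar[target])
        else:
            new_bodies.append(body)
    # Return a new grammar; the input grammar is left unchanged.
    return {**grammar, head: new_bodies}


# Hypothetical grammar: A -> Bc, B -> a | bA
grammar = {"A": [("B", "c")], "B": [("a",), ("b", "A")]}
print(substitute_first(grammar, "A", "B")["A"])
# [('a', 'c'), ('b', 'A', 'c')]
```

After the substitution, both A productions start with a terminal, which is the effect the GNF conversion relies on.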

Lemma II
If G = (V,T,P,S) is a CFG and the set of ‘A’ productions belonging to P is
A → Aα1 | Aα2 | Aα3 -----| Aαm | β1 | β2 | β3 ------ | βn
where no βi starts with A, then introduce a new non-terminal X.
So, let G’ = (V’,T,P’,S), where V’ = V ∪ {X}
and P’ is formed by replacing the ‘A’ productions with

A → βi (1 ≤ i ≤ n)
A → βi X (1 ≤ i ≤ n)

X → αj (1 ≤ j ≤ m)
X → αj X (1 ≤ j ≤ m)
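
Lemma II is the standard removal of immediate left recursion. The sketch below is one possible way to code it, assuming right sides are tuples of symbols; the function name `remove_left_recursion`, its `new_var` parameter, and the sample grammar are hypothetical.

```python
def remove_left_recursion(grammar, head, new_var):
    """Apply Lemma II to the productions of head, using new_var as X."""
    # α parts: bodies of the form head α (left recursive), without the head.
    alphas = [b[1:] for b in grammar[head] if b and b[0] == head]
    # β parts: bodies that do not start with head.
    betas = [b for b in grammar[head] if not b or b[0] != head]
    if not alphas:
        return dict(grammar)  # nothing to do
    result = dict(grammar)
    # A -> βi | βi X
    result[head] = betas + [beta + (new_var,) for beta in betas]
    # X -> αj | αj X
    result[new_var] = alphas + [alpha + (new_var,) for alpha in alphas]
    return result


# Hypothetical grammar: A -> Aa | b
grammar = {"A": [("A", "a"), ("b",)]}
print(remove_left_recursion(grammar, "A", "X"))
# {'A': [('b',), ('b', 'X')], 'X': [('a',), ('a', 'X')]}
```

Both versions generate b followed by any number of a's, but the rewritten grammar no longer has an A production that starts with A.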

Greibach Normal Form

Example:

CNF            New Labels   Updated CNF
S → XA | BB    S = A1       A1 → A2A3 | A4A4
B → b | SB     X = A2       A4 → b | A1A4
X → b          A = A3       A2 → b
A → a          B = A4       A3 → a

GNF

• If the RHS of a production starts with a lower numbered variable, substitute for the first variable in the RHS.
• If the RHS of a production starts with a higher numbered variable, leave the production as such.
• If the RHS of a production starts with the same variable as the LHS, eliminate the left recursion by introducing a new variable (Lemma II).

Greibach Normal Form

Example:

First Step: bring every production into the form Ai → AjXk with j > i, where Xk is a string of zero or more variables.

A1 → A2A3 | A4A4
A4 → b | A1A4
A2 → b
A3 → a

The production A4 → A1A4 violates this condition.

Greibach Normal Form

Example:

First Step (Ai → AjXk, j > i): substitute for the leading lower numbered variable.

A4 → A1A4 | b
A4 → A2A3A4 | A4A4A4 | b (substituting A1 → A2A3 | A4A4)
A4 → bA3A4 | A4A4A4 | b (substituting A2 → b)

Greibach Normal Form

Example:

Second Step: Eliminate Left Recursions

A1 → A2A3 | A4A4
A4 → bA3A4 | A4A4A4 | b
A2 → b
A3 → a

The production A4 → A4A4A4 is left recursive.

Greibach Normal Form

Example:
Second Step: Eliminate Left Recursions by introducing a new variable Z

A4 → bA3A4 | A4A4A4 | b

is replaced by

A4 → bA3A4 | b | bA3A4Z | bZ
Z → A4A4 | A4A4Z

Greibach Normal Form

Example:

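A1 → A2A3 | A4A4
A4 → bA3A4 | b | bA3A4Z | bZ
Z → A4A4 | A4A4Z
A2 → b
A3 → a

The productions for A4, A2 and A3 are now in GNF, that is, of the form A → aX where a is a terminal and X is a string of zero or more variables.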

Greibach Normal Form

Example:
A1 → A2A3 | A4A4
A4 → bA3A4 | b | bA3A4Z | bZ
Z → A4A4 | A4A4Z
A2 → b
A3 → a

Substituting for the leading A2 and A4 in the A1 and Z productions gives

A1 → bA3 | bA3A4A4 | bA4 | bA3A4ZA4 | bZA4

Z → bA3A4A4 | bA4 | bA3A4ZA4 | bZA4 | bA3A4A4Z | bA4Z | bA3A4ZA4Z | bZA4Z

Greibach Normal Form

Example:

A1 → bA3 | bA3A4A4 | bA4 | bA3A4ZA4 | bZA4


A4 → bA3A4 | b | bA3A4Z | bZ
Z → bA3A4A4 | bA4 | bA3A4ZA4 | bZA4 | bA3A4A4Z | bA4Z | bA3A4ZA4Z | bZA4Z
A2 → b
A3 → a

Grammar in Greibach Normal Form

Home Work Questions

1. Convert the following grammar to the Chomsky Normal Form.
S → P
P → aPb | ε
2. Convert the following grammar to the Greibach Normal Form.
S -> a | CD | CS
A -> a | b | SS
C -> a
D -> AS

Solved problem (1)

Convert the following to GNF
S → AB
A → BS | b
B → SA | a

Solution:

Step 1 & 2: The given grammar is already in CNF.
Step 3: Renaming the non-terminals, let S = A1, A = A2, B = A3
A1 → A2 A3 ---- (1)
A2 → A3 A1 | b ---- (2)
A3 → A1 A2 | a ---- (3)

Step 4: Check the condition Ai → Aj γ where i < j.
Equation (3) is not in this format, so as per Lemma I let us substitute the value of A1 from (1) in (3):
A3 → A2 A3 A2 | a ---- (4)

Again as per Lemma I, substituting the value of A2 from (2) in (4), we get
A3 → A3 A1 A3 A2 | b A3 A2 | a ---- (5)

Now let us apply Lemma II to (5), with A = A3, α = A1 A3 A2, β1 = b A3 A2 and β2 = a:
A3 → b A3 A2 | a ---- (6) (GNF)
A3 → b A3 A2 X | aX ---- (7) (GNF)
X → A1 A3 A2 ---- (8)
X → A1 A3 A2 X ---- (9)

Now substitute (6) & (7) in (2):
A2 → b A3 A2 A1 | a A1 | b A3 A2 X A1 | aX A1 | b ---- (10) (GNF)

Now substitute (10) in (1):
A1 → b A3 A2 A1 A3 | a A1 A3 | b A3 A2 X A1 A3 | aX A1 A3 | b A3 ---- (11) (GNF)

Now substitute (11) in (8) & (9):
X → b A3 A2 A1 A3 A3 A2 | a A1 A3 A3 A2 | b A3 A2 X A1 A3 A3 A2 | aX A1 A3 A3 A2 | b A3 A3 A2 ---- (12) (GNF)
X → b A3 A2 A1 A3 A3 A2 X | a A1 A3 A3 A2 X | b A3 A2 X A1 A3 A3 A2 X | aX A1 A3 A3 A2 X | b A3 A3 A2 X ---- (13) (GNF)

Answer:
A1 → b A3 A2 A1 A3 | a A1 A3 | b A3 A2 X A1 A3 | aX A1 A3 | b A3
A2 → b A3 A2 A1 | a A1 | b A3 A2 X A1 | aX A1 | b
A3 → b A3 A2 | a
A3 → b A3 A2 X | aX
X → b A3 A2 A1 A3 A3 A2 | a A1 A3 A3 A2 | b A3 A2 X A1 A3 A3 A2 | aX A1 A3 A3 A2 | b A3 A3 A2
X → b A3 A2 A1 A3 A3 A2 X | a A1 A3 A3 A2 X | b A3 A2 X A1 A3 A3 A2 X | aX A1 A3 A3 A2 X | b A3 A3 A2 X