
1

Chapter 3
CFGs AND PUSH-DOWN AUTOMATA
CFGs 2

Context-Free Grammars:
  are notations used for describing context-free languages and recursion.
  The languages of CFGs can also be defined using an extension of the ε-NFA with a stack, called the push-down automaton.
  CFGs are used in compiler design to specify the syntax of a programming language.
A CFG is described by a quadruple G = (V, T, P, S):
  V is a finite set of non-terminals (like states); T is a finite set of terminals (like inputs);
  P is a finite set of production rules; and S is the start symbol, an element of V that represents the language being defined.
A production rule is of the form A → α, where α ∈ (V ∪ T)* and A ∈ V.
… 3

The language 0ⁿ1ⁿ, n ≥ 0, is not a regular language, as proved earlier, and it is hard to design a finite automaton that recognizes it.
However, it is very simple to write a CFG for it:
  S → 0P1
  P → 0P1
  P → ε
is a simple CFG that generates the language 0ⁿ1ⁿ for n ≥ 1 (adding S → ε also covers n = 0).
S and P are the non-terminals.
0 and 1 are the terminals.
S → 0P1, P → 0P1 and P → ε are the productions.
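For example, the string 000111 is derived with this grammar as follows (each step rewrites the single non-terminal using one of its productions):
  S ⇒ 0P1 ⇒ 00P11 ⇒ 000P111 ⇒ 000111   (the last step uses P → ε)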
Derivations 4

There are two conventional methods for checking whether a string belongs to the language of a CFG.
  Recursive inference: the productions are applied from body to head.
    Take a string and try to build it up by combining terminal strings derived from the production rules of the grammar.
    Given a CFG and a string, we try to infer whether the string is recognizable.
… 5

Derivations: the basic idea of a derivation is to treat productions as rewrite rules.
  A derivation starts by expanding the start symbol of the grammar.
  Then a non-terminal on the right-hand side is replaced by the right-hand side of any production for it.
  We can do this anywhere in the sequence of symbols (terminals and non-terminals) and repeat until only terminals are left.
Example: Derive the string (id-id)*id+id from the grammar below.
  E → E*E | E-E | E+E | E/E
  E → (E)
  E → id
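One possible (leftmost) derivation of (id-id)*id+id, added here as an illustration, is:
  E ⇒ E+E ⇒ E*E+E ⇒ (E)*E+E ⇒ (E-E)*E+E ⇒ (id-E)*E+E ⇒ (id-id)*E+E ⇒ (id-id)*id+E ⇒ (id-id)*id+id
Since the grammar is ambiguous, other derivations of the same string also exist.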
Leftmost and rightmost derivations 6

It is necessary to restrict the number of choices while deriving a string.
Derivations can take two forms:
1. Leftmost derivation (LMD): at each step, the leftmost non-terminal is expanded first.
2. Rightmost derivation (RMD): at each step, the rightmost non-terminal is expanded first.
Example: Derive the string id*id+id from the grammar below using LMD and RMD.
  E → E*E | E-E | E+E | E/E
  E → (E)
  E → id
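One possible pair of answers (added for illustration; because the grammar is ambiguous, other answers exist):
  LMD: E ⇒ E+E ⇒ E*E+E ⇒ id*E+E ⇒ id*id+E ⇒ id*id+id
  RMD: E ⇒ E+E ⇒ E+id ⇒ E*E+id ⇒ E*id+id ⇒ id*id+id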
Exercise 7

Design a CFG for each of the following language specifications over Σ = {0, 1}.
1. A CFG that recognizes strings that end with 1.
2. A CFG that recognizes the language 0*1(0+1)*.
3. A CFG that recognizes strings that have twice as many 0's as 1's.
4. A CFG that recognizes palindromes.
5. Using the CFG you got in Q1, give the LMD and RMD for the following strings.
  a) 00101
  b) 1001
  c) 00011
Parse trees 8

A parse tree is a useful representation of a derivation.
  In practice, a parse tree is a data structure that is convenient for a compiler to use while checking the grammatical correctness of source code.
  Parse trees are also very useful for managing ambiguities in CFGs.
Constructing parse trees
  Given a CFG G = (V, T, P, S), the parse trees of G are trees that obey the following criteria:
  1. The root of a parse tree must be the start symbol of the CFG.
  2. Each interior node must be labeled by a non-terminal in V.
  3. Each leaf node can be labeled by a non-terminal, a terminal, or ε.
  4. If an interior node is labeled A and its children are labeled X₁, X₂, …, Xₖ (from the left), then A → X₁X₂…Xₖ is a production in G.
  5. Note: the only time one of the Xᵢ can be ε is when it is the only child and A → ε is a production in G.
… 9

Given a grammar G:
  T → R
  T → aTc
  R → ε
  R → RbR
generate a parse tree for the strings aabbbcc and abbc.
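As an added illustration, one possible parse tree for abbc (the grammar allows more than one, since R → RbR can be nested on either side) is:
  T
  ├── a
  ├── T
  │    └── R
  │         ├── R
  │         │    ├── R
  │         │    │    └── ε
  │         │    ├── b
  │         │    └── R
  │         │         └── ε
  │         ├── b
  │         └── R
  │              └── ε
  └── c
Reading the leaves from left to right gives abbc.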
Ambiguity in CFGs 10

A CFG is ambiguous:
  if there exists at least one string w in T* for which we can find more than one parse tree that yields w.
A CFG is unambiguous:
  if every string has at most one parse tree in the grammar that yields it.
Example: show that the grammar G:
  E → E*E | E-E | E+E | E/E
  E → (E)
  E → id
is an ambiguous grammar, using the string id + id * id.
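For illustration, id + id * id has two distinct leftmost derivations in this grammar, and hence two parse trees:
  E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id   (+ at the root of the tree, * below it)
  E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id   (* at the root of the tree, + below it)
Because the same string has two different parse trees, the grammar is ambiguous.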
… 11

Ambiguous grammars:
  are defective when used to impose a structural definition on a program.
  Therefore, it is important to know the causes of ambiguity in a grammar and
  to convert ambiguous grammars into unambiguous ones.
First, how do we know whether a grammar is ambiguous?
  Determining ambiguity is undecidable in general.
  This means there is no algorithm that, given an arbitrary grammar, decides whether it is ambiguous.
  However, for a particular grammar it is often possible to find a string and show that it has two different parse trees.
Operator precedence and associativity 12

Sources of ambiguity: the two common sources of ambiguity in programming languages are:
1. Operator precedence and associativity:
  Precedence: when it is not clear (or there is no restriction) which operator production should be applied first.
  Associativity: when it is not clear which side (left or right) non-terminal should be expanded first.
Example: resolve the ambiguity in the grammar below.
  E → E*E | E-E | E+E | E/E
  E → (E)
  E → id
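One standard way to resolve the ambiguity (a sketch; other formulations are possible) is to introduce one non-terminal per precedence level and make the operators left-associative:
  E → E+T | E-T | T
  T → T*F | T/F | F
  F → (E) | id
Here + and - have the lowest precedence, * and / bind tighter, and (E) and id bind tightest; putting E (or T) on the left of each operator makes the operators left-associative.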
Dangling else problem 13

2. Dangling else problem in grammar design:
  The dangling else problem is a syntactic ambiguity. It occurs when we use nested if statements.
  When there are multiple "if" statements, it is not clear which "if" the "else" part should be paired with.
  Example: in a statement such as
    if E1 then if E2 then S1 else S2
  the else could be attached to either the outer or the inner if.
  The general rule is, "Match each else with the closest unmatched then."
… 14

The idea is that a statement appearing between a then and an else must be "matched"; that is, the interior statement must not end with an unmatched, or open, then.
A matched statement is either an if-then-else statement containing no open statements, or it is any other kind of unconditional statement.
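A sketch of a grammar that enforces this rule (one common textbook formulation, added here for illustration):
  stmt    → matched | open
  matched → if expr then matched else matched
          | other
  open    → if expr then stmt
          | if expr then matched else open
With this grammar, in "if E1 then if E2 then S1 else S2" the else can only attach to the inner if, so the ambiguity disappears.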
Regular Grammar 15

A regular grammar is a restricted class of CFG used to generate the regular languages.
  Each production has a single non-terminal on the left-hand side and a right-hand side consisting of a single terminal, or a single terminal together with a single non-terminal.
  i.e. regular grammars contain productions of the form:
    A → a and A → aB (right-linear), or
    A → a and A → Ba (left-linear),
  where A and B are non-terminals and a is a terminal (A → ε may also be allowed).
… 16

Regular grammars are of two types:
  1. Left-linear grammars
  2. Right-linear grammars
Every regular grammar can be converted into a finite automaton and vice versa.
FSA to regular grammar 17

1. The algorithm for converting a finite automaton (FA) to a right-linear grammar is as follows:
  1. Begin the process from the start state, using one non-terminal per state.
  2. Repeat the process for each state.
  3. For every transition, write a production whose body is the input symbol followed by the non-terminal of the state the transition goes to.
  4. Finally, add an ε-production for every accepting state so that derivations can terminate.
Example: convert the following FSA to a right-linear grammar.
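Since the FSA appears only as a figure, assume for illustration a DFA over {0, 1} that accepts strings ending in 1, with states A (start) and B (accepting) and transitions δ(A,0)=A, δ(A,1)=B, δ(B,0)=A, δ(B,1)=B. The construction then gives the right-linear grammar
  A → 0A | 1B
  B → 0A | 1B | ε
with start symbol A; every accepting run of the DFA corresponds to a derivation that ends by applying B → ε.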
FSA to regular grammar 18

2. The algorithm for converting a finite automaton (FA) to a left-linear grammar is as follows:
  1. Take the reverse of the FSA and remove unreachable states.
     • the final state becomes the initial state and vice versa.
  2. Write a right-linear grammar for the reversed FSA.
  3. Then take the reverse of that right-linear grammar (reverse the body of every production).
  4. The result is a left-linear grammar.
Example: convert the following FSA to a left-linear grammar.
Regular grammar to FSA 19

The algorithm for converting a regular grammar to an FSA:
1. The number of states in the automaton equals the number of non-terminals plus one.
  1. Each state of the automaton represents one non-terminal of the regular grammar.
  2. The additional state is the final state of the automaton.
  3. The state corresponding to the start symbol of the grammar is the initial state of the automaton.
  4. If L(G) contains ε, i.e. the start symbol of the grammar derives ε, then make the start state a final state as well.
2. The transitions of the automaton are obtained as follows:
  1. For every production A → aB, set δ(A, a) = B; that is, draw an arc labeled 'a' from A to B.
  2. For every production A → a, set δ(A, a) = the final state.
  3. For every production A → ε, make A a final state.
… 20

Example: Convert the following regular grammars into FSAs (a worked conversion of grammar 1 appears below).
1. Regular grammar G:
  S → 0S | 1A | 1
  A → 0A | 1A | 0 | 1
2. Regular grammar G:
  S → a | aA | bB | ε
  A → aA | aS
  B → cS | ε
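For grammar 1, following the algorithm above: the states are S (initial), A, and a new final state F, and the transitions are
  δ(S, 0) = {S}       (from S → 0S)
  δ(S, 1) = {A, F}    (from S → 1A and S → 1)
  δ(A, 0) = {A, F}    (from A → 0A and A → 0)
  δ(A, 1) = {A, F}    (from A → 1A and A → 1)
The result is an NFA accepting L(G), the strings over {0, 1} that contain at least one 1.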
Normal forms of CFGs 21

Normal forms of CFGs, in which all productions have the form A → BC or A → a, are useful because every CFL (ignoring ε) can be generated by a grammar in such a restricted form.
There are two commonly used normal forms:
  Chomsky normal form and
  Greibach normal form.
To obtain a normal form of a CFG, some preliminary simplifications are required:
  eliminating useless symbols,
  eliminating ε-productions, and
  eliminating unit productions.
Eliminating useless symbols 22

Useless symbols can be removed in two phases:
1. Phase one: derive G′ from G such that every variable derives some terminal string.
  A. Put into W₁ every non-terminal that has a production whose body consists only of terminals, and initialize i = 1.
  B. Put into Wᵢ₊₁ everything in Wᵢ, plus every non-terminal that has a production whose body consists only of terminals and variables already in Wᵢ.
  C. Increment i and repeat step B until Wᵢ₊₁ = Wᵢ.
  D. Keep only the productions all of whose symbols are terminals or variables in the final Wᵢ.
2. Phase two: derive G″ from G′ such that every symbol appears in some sentential form (is reachable from S).
  A. Put the start symbol into Y₁ and initialize i = 1.
  B. Put into Yᵢ₊₁ everything in Yᵢ, plus every symbol that appears in the body of a production whose head is in Yᵢ.
  C. Increment i and repeat step B until Yᵢ₊₁ = Yᵢ.
… 23

Example: Given the CFG G below, eliminate useless symbols.
G: S → AC | B
   A → a
   C → c | BC
   E → aA | e
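A worked solution (added here; it assumes the lowercase e in E → aA | e is a terminal):
  Phase one: A, C and E derive terminal strings (A → a, C → c, E → e), and then S does too via S → AC. B derives no terminal string, so B and the productions S → B and C → BC are removed.
  Phase two: starting from S, the reachable symbols are S, A, C, a and c; E is unreachable and is removed.
  Resulting grammar: S → AC, A → a, C → c.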
Eliminating ε-productions 24

ε-productions are productions of the form A → ε; more generally, if there is a derivation that starts at A and leads to ε, the variable A is called nullable.
Procedure to eliminate ε-productions:
  A. Find the nullable variables.
  B. For every production whose body contains nullable variables, add new versions of the production with every possible combination of those nullable variables removed (except the version whose body would become empty).
  C. Add the resulting productions to the CFG and remove all productions of the form A → ε.
Example: eliminate the ε-productions in the CFGs below (a worked solution for the second grammar follows).
G: S → ABAC          G: S → AB
   A → aA | ε           A → aAA | ε
   B → bB | ε           B → bBB | ε
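A worked solution for the second grammar (added for illustration):
  The nullable variables are A, B and S (S is nullable because S → AB and both A and B are nullable).
  Removing nullable symbols in all combinations and then dropping the ε-productions gives:
    S → AB | A | B
    A → aAA | aA | a
    B → bBB | bB | b
  (If ε itself must stay in the language, a new start symbol with an ε-production can be added; the procedure above simply drops ε.)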
Eliminating unit-productions 25

A unit production is a production of the form A → B, where both A and B are variables.
Unit productions introduce extra steps into derivations, so they can be removed.
Procedure to eliminate unit productions:
  A. To remove A → B, add A → x for every non-unit production B → x that occurs in the grammar.
  B. Remove A → B from the grammar.
  C. Repeat step A until all unit productions are removed.
  D. Add the resulting productions to the CFG.
Example: eliminate the unit productions in the CFG below.
G: S → ABC
   A → aB | C
   B → bB | ε
   C → c | ε
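A worked solution (added for illustration): the only unit production is A → C. Since C → c | ε, we add A → c and A → ε and remove A → C, giving:
  S → ABC
  A → aB | c | ε
  B → bB | ε
  C → c | ε
(In a full normal-form conversion, the ε-productions would then be removed with the previous procedure.)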
Chomsky normal forms 26

A CFG G in which every production has the form A → BC or A → a, where A, B, C are non-terminals and a is a terminal, is in Chomsky normal form (CNF).
To get CNF from a CFG:
1. All useless symbols, ε-productions and unit productions must be eliminated.
2. If the start symbol S appears on the right side of a production, add a new start symbol S′ with the production S′ → S.
3. Replace each A → B₁B₂…Bₙ with n > 2 by A → B₁C, where C → B₂…Bₙ, and apply the same step to C.
4. If A → aB (a terminal appears in a body of length two or more), replace the terminal by a new variable: A → CB with C → a.
5. Repeat until the CFG is in CNF.
… 27

Example: convert the CFG below into CNF.
  E → E*E | E+E
  E → (E)
  E → id
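One possible CNF for this grammar (a sketch added here; step 2's new start symbol is omitted, and other equivalent answers exist):
  E  → E C₁ | E C₂ | L C₃ | id
  C₁ → M E
  C₂ → P E
  C₃ → E R
  M → *    P → +    L → (    R → )
Each operator and bracket terminal gets its own variable, and the three-symbol bodies E*E, E+E and (E) are split into chains of two-variable bodies.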
Greibach normal forms 28

A CFG G in which every production has the form
  A → b or
  A → bC₁C₂…Cₙ
is in GNF, where b is a terminal and A, C₁, C₂, …, Cₙ are non-terminals.
To get GNF from a CFG:
1. Convert the CFG to CNF first.
2. Rename the non-terminals as A₁, A₂, …, Aₙ in ascending order of i.
3. Alter the rules so that the non-terminals appear in ascending order: in every production Aᵢ → AⱼX we must have i < j, never i ≥ j.
4. Remove left recursion.
Greibach normal forms 29

Example: convert the CFG below to GNF.
  S → CA | BB
  B → b | SB
  C → b
  A → a
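A partial worked sketch (added here; the full conversion is longer):
  Substituting C → b into S gives S → bA | BB.
  Substituting S → bA | BB into B → SB gives B → b | bAB | BBB.
  B → BBB is left-recursive; removing the left recursion with a new variable Z gives
    B → b | bAB | bZ | bABZ,   Z → BB | BBZ.
  Finally, B's productions (which now all start with a terminal) are substituted back into S → BB and into Z's productions, so that every body begins with a terminal as GNF requires.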
Pumping lemma of CFGs 30

The pumping lemma for CFLs can only be used to prove that a language is not context-free.
If L is a context-free language,
  there is a pumping length p such that any string w ∈ L of length ≥ p can be written as w = uvxyz, where
  1. vy ≠ ε,
  2. |vxy| ≤ p, and
  3. for all i ≥ 0, uvⁱxyⁱz ∈ L.
Example: prove that L = {0ⁿ1ⁿ2ⁿ | n ≥ 1} is not a context-free language.
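Proof sketch (added for illustration): suppose L were context-free with pumping length p, and take w = 0ᵖ1ᵖ2ᵖ ∈ L. In any split w = uvxyz with |vxy| ≤ p, the substring vxy can contain at most two of the three symbols 0, 1, 2. Pumping up (i = 2) therefore either produces symbols out of order or, since vy ≠ ε, changes the counts of at most two symbol types while leaving the third unchanged. In both cases uv²xy²z ∉ L, contradicting the pumping lemma, so L is not context-free.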
Pushdown automata 31

A pushdown automaton (PDA) is a type of automaton that defines the context-free languages.
  A PDA is an extension of the ε-NFA with a stack.
  The stack gives the PDA the power to remember an unbounded amount of information.
There are two types of PDA, by acceptance mode:
1. A PDA that accepts by entering a final state.
2. A PDA that accepts by emptying its stack.
There is also a subclass of PDAs, the deterministic PDAs, which accept the deterministic context-free languages (a class that includes all regular languages but not all CFLs) and which resemble the mechanics of a parser.
… 32

Formal definition: a PDA consists of seven components, P = (Q, Σ, Γ, δ, q₀, Z₀, F), where:
  Q is a finite set of states; Σ is a finite set of input symbols;
  Γ is a finite set of stack symbols (the stack alphabet);
  δ is the transition function;
    δ(q, a, X) contains (p, γ),
    where q is a state, a is an input symbol (or ε) and X is a stack symbol.
    In the output (p, γ), p is another state and γ is the string of stack symbols that replaces X on top of the stack.
  q₀ is the initial state, Z₀ is the initial stack symbol, and F is the finite set of accepting states.
… 33

Example:
1. Design a PDA for the language 0ⁿ1ⁿ, n ≥ 1, and for the language 0ⁿ1²ⁿ, n ≥ 1 (a sketch for the first language follows).
2. Design a PDA for the language wwᴿ, where w ∈ {0, 1}*.
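A possible PDA for 0ⁿ1ⁿ, n ≥ 1, accepting by final state (a sketch; other designs work equally well): push a 0 onto the stack for every 0 read, pop one 0 for every 1 read, and accept once the stack is back to Z₀ after at least one 1:
  δ(q₀, 0, Z₀) = {(q₀, 0Z₀)}
  δ(q₀, 0, 0)  = {(q₀, 00)}
  δ(q₀, 1, 0)  = {(q₁, ε)}
  δ(q₁, 1, 0)  = {(q₁, ε)}
  δ(q₁, ε, Z₀) = {(q₂, Z₀)}
with q₂ the accepting state. For 0ⁿ1²ⁿ the same idea works by pushing two 0's for every 0 read (or popping only on every second 1).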
CFG to PDA 34

Let G = (V, T, P, S) be a CFG; then a PDA M that recognizes L(G) (accepting by empty stack) is constructed as follows, using the standard single-state construction:
  M = ({q}, T, V ∪ T, δ, q, S)
where the transition function δ is defined by:
1. For each variable A,
   δ(q, ε, A) = {(q, β) | A → β is a production in P}.
2. For each terminal a,
   δ(q, a, a) = {(q, ε)}.
… 35

Example:
1. Convert each of the CFGs given below into an equivalent PDA.
  1. G:
  2. G:
  3. G:
Applications of CFGs 36

Parser construction in compiler design.
DTDs (Document Type Definitions) in XML, which control the allowable tags and the way in which these tags are nested, are designed using CFGs.
