Context Free Grammars
(CFLs & CFGs)
Not all languages are regular
So what happens to the languages
which are not regular?
Context-Free Languages
A language class larger than the class of regular
languages
Applications:
Parse trees, compilers
XML
[Figure: the regular languages (FA/RE) form a subset of the context-free languages (PDA/CFG)]
An Example
A palindrome is a word that reads identically from both ends
E.g., madam, redivider, malayalam, 010010010
Let L = { w | w is a binary palindrome}
Is L regular?
No.
Proof:
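Let w = 0^N 1 0^N (taking N to be the pumping-lemma constant)
By the pumping lemma, w = xyz with |xy| ≤ N and |y| ≥ 1, so y lies entirely within the leading 0s
Then xz = 0^(N-|y|) 1 0^N is not a palindrome, a contradiction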
Example #2
Language of balanced parentheses
e.g., ()(((())))((()))….
CFG?
G:
S => (S) | SS | ε
How would you “interpret” the string “(((()))()())” using this grammar?
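One way to make the "interpretation" concrete is a small recursive recognizer. The sketch below is an illustration, not part of the slides: it assumes the equivalent form S => (S)S | ε (chosen so that a simple recursive-descent routine works), and the name parse_balanced is hypothetical. It returns nested lists that mirror how the parentheses nest, or None when the string is unbalanced.

```python
def parse_balanced(s):
    """Return nested lists mirroring the nesting of s, or None if s is unbalanced."""

    def parse_seq(i):
        # S => ( S ) S | ε : consume a run of parenthesised groups starting at i.
        items = []
        while i < len(s) and s[i] == "(":
            inner, i = parse_seq(i + 1)          # the S inside the parentheses
            if i >= len(s) or s[i] != ")":
                raise ValueError("missing ')'")
            items.append(inner)
            i += 1                               # skip the matching ')'
        return items, i

    try:
        tree, end = parse_seq(0)
    except ValueError:
        return None
    # Leftover input (for example a stray ')') means the string is not in L.
    return tree if end == len(s) else None


print(parse_balanced("(((()))()())"))
```

For the string on the slide, the result shows one outermost pair containing three groups: ((())), (), and ().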
Example #3
A grammar for L = {0^m 1^n | m ≥ n}
CFG? G:
S => 0S1 | A
A => 0A | ε
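A membership test can follow the two productions directly. This is a minimal sketch under that reading of the grammar; the function name in_language is hypothetical.

```python
def in_language(w):
    """Check whether w is in {0^m 1^n | m >= n} by undoing the productions."""
    # S => 0S1 : strip one matching 0 ... 1 pair from the two ends at a time.
    while len(w) >= 2 and w[0] == "0" and w[-1] == "1":
        w = w[1:-1]
    # S => A, A => 0A | ε : whatever remains must consist only of 0s.
    return set(w) <= {"0"}


for candidate in ["0011", "001", "011", "10", ""]:
    print(repr(candidate), in_language(candidate))
```

Stripping pairs greedily is safe here because any leftover 1 already rules the string out.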
Example #4
A program containing if-then(-else) statements
if Condition then Statement else Statement
(Or)
if Condition then Statement
CFG?
More examples
L1 = {0^n | n ≥ 0}
L2 = {0^n | n ≥ 1}
L3 = {0^i 1^j 2^k | i = j or j = k, where i, j, k ≥ 0}
L4 = {0^i 1^j 2^k | i = j or i = k, where i, j, k ≥ 1}
Applications of CFLs & CFGs
Compilers use parsers for syntactic checking
Parsers can be expressed as CFGs
1. Balancing parentheses:
B ==> BB | (B) | Statement
Statement ==> …
2. If-then-else:
S ==> SS | if Condition then Statement else Statement | if Condition then Statement | Statement
Condition ==> …
Statement ==> …
3. C parenthesis matching { … }
4. Pascal begin-end matching
5. YACC (Yet Another Compiler-Compiler)
More applications
Markup languages
Nested Tag Matching
HTML
<html> …<p> … <a href=…> … </a> </p> …
</html>
XML
<PC> … <MODEL> … </MODEL> .. <RAM> …
</RAM> … </PC>
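Nested tag matching can be checked with a stack, which mirrors the PDA view of a CFL. The sketch below is a simplification and not a real HTML or XML parser: it assumes every tag has the form <NAME ...> or </NAME>, and the function name tags_nest_properly is hypothetical.

```python
import re


def tags_nest_properly(text):
    """Return True if every opening tag is closed in properly nested order."""
    stack = []
    # Each match is a pair (slash, name); slash is "/" for a closing tag.
    for slash, name in re.findall(r"<(/?)(\w+)[^>]*>", text):
        if not slash:
            stack.append(name)                  # opening tag: remember it
        elif not stack or stack.pop() != name:
            return False                        # closing tag with no matching opener
    return not stack                            # anything left open is an error


print(tags_nest_properly("<PC><MODEL>x</MODEL><RAM>y</RAM></PC>"))
print(tags_nest_properly("<PC><MODEL>x</PC></MODEL>"))
```

Tags that never close (as some HTML tags do in practice) would be rejected by this sketch.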
Tag-Markup Languages
Roll ==> <ROLL> Class Students </ROLL>
Class ==> <CLASS> Text </CLASS>
Text ==> Char Text | Char
Char ==> a | b | … | z | A | B | .. | Z
Students ==> Student Students | ε
Student ==> <STUD> Text </STUD>
Here, the left-hand side of each production denotes a non-terminal
(e.g., “Roll”, “Class”, etc.)
Symbols on the right-hand side for which no productions (i.e.,
substitutions) are defined are terminals (e.g., ‘a’, ‘b’, ‘<’, ‘>’, ‘/’, “ROLL”,
etc.)
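The rule for telling non-terminals from terminals can be applied mechanically. The sketch below encodes the productions as a Python dictionary; the encoding is an assumption made for illustration (each body is a list of symbols, the empty list stands for ε, and the elided letters of Char are taken to be the ASCII letters).

```python
import string

# Heads map to lists of bodies; each body is a list of grammar symbols.
productions = {
    "Roll": [["<", "ROLL", ">", "Class", "Students", "<", "/", "ROLL", ">"]],
    "Class": [["<", "CLASS", ">", "Text", "<", "/", "CLASS", ">"]],
    "Text": [["Char", "Text"], ["Char"]],
    "Char": [[c] for c in string.ascii_letters],   # assumed reading of a | b | ... | Z
    "Students": [["Student", "Students"], []],     # [] stands for ε
    "Student": [["<", "STUD", ">", "Text", "<", "/", "STUD", ">"]],
}

# Non-terminals are exactly the symbols that appear as a head.
nonterminals = set(productions)

# Terminals are the body symbols for which no production is defined.
terminals = {
    symbol
    for bodies in productions.values()
    for body in bodies
    for symbol in body
} - nonterminals

print(sorted(nonterminals))
print(sorted(terminals))
```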
Structure of a production
A ==> α1 | α2 | … | αk
(A is the head; each αi is a body; the line abbreviates the k productions A ==> α1, …, A ==> αk)
String membership
How can we tell whether a string belongs to the language
defined by a CFG?
1. Derivation
Head to body
2. Recursive inference
Body to head
Both are equivalent forms
Example:
G: A => 0A0 | 1A1 | 0 | 1 | ε
w = 01110
Is w a palindrome?
A => 0A0
=> 01A10
=> 01110
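The derivation for w can be produced step by step in code. This is a minimal sketch for the palindrome grammar only; the function name derive_palindrome is hypothetical.

```python
def derive_palindrome(w):
    """Return the sentential forms deriving w from A, or None if w is not in L(G)."""
    steps = ["A"]
    left, right, middle = "", "", w
    # A => 0A0 | 1A1 : peel one matching symbol off each end per step.
    while len(middle) >= 2:
        if middle[0] != middle[-1] or middle[0] not in "01":
            return None
        left += middle[0]
        right = middle[-1] + right
        middle = middle[1:-1]
        steps.append(left + "A" + right)
    # A => 0 | 1 | ε : finish with the remaining middle symbol, if any.
    if middle not in ("", "0", "1"):
        return None
    steps.append(left + middle + right)
    return steps


print(" => ".join(derive_palindrome("01110")))
```

Reading the returned list from last to first gives the order in which a recursive inference would establish the same fact.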
Simple Expressions…
We can write a CFG that generates simple
expressions
G = (V,T,P,S)
V = {E,F}
T = {0,1,a,b,+,*,(,)}
S = E
P:
E ==> E+E | E*E | (E) | F
F ==> aF | bF | 0F | 1F | a | b | 0 | 1
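The four components of G can be written down as plain data and sanity-checked. The encoding below is an assumption made for illustration (single-character symbols, bodies as strings), and is_well_formed is a hypothetical helper name.

```python
V = {"E", "F"}                      # variables
T = set("01ab+*()")                 # terminals
S = "E"                             # start variable
P = {                               # productions: head -> list of bodies
    "E": ["E+E", "E*E", "(E)", "F"],
    "F": ["aF", "bF", "0F", "1F", "a", "b", "0", "1"],
}


def is_well_formed(V, T, P, S):
    """Check the basic side conditions on a grammar G = (V, T, P, S)."""
    return (
        S in V                                   # the start symbol is a variable
        and V.isdisjoint(T)                      # variables and terminals do not overlap
        and set(P) <= V                          # every head is a variable
        and all(
            symbol in V | T                      # every body symbol is known
            for bodies in P.values()
            for body in bodies
            for symbol in body
        )
    )


print(is_well_formed(V, T, P, S))
```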
Generalization of derivation
Derivation is head ==> body
Transitivity:
IF A ==>*G B, and B ==>*G C, THEN A ==>*G C
Context-Free Language
The language of a CFG, G=(V,T,P,S),
denoted by L(G), is the set of terminal
strings that have a derivation from the
start variable S.
L(G) = { w in T* | S ==>*G w }
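The definition of L(G) can be explored by enumerating derivations up to a length bound. The sketch below is an illustration under stated assumptions: symbols are single characters, the helper name language_up_to is hypothetical, and the search is only guaranteed to stop for grammars (such as the palindrome grammar used here) that have finitely many sentential forms within the terminal bound.

```python
from collections import deque


def language_up_to(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    found = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Index of the leftmost variable, or None if the form is all terminals.
        idx = next((i for i, ch in enumerate(form) if ch in productions), None)
        if idx is None:
            found.add(form)                      # start ==>* form, so form is in L(G)
            continue
        for body in productions[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            n_terminals = sum(ch not in productions for ch in new)
            if n_terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


palindromes = {"A": ["0A0", "1A1", "0", "1", ""]}   # "" stands for ε
print(sorted(language_up_to(palindromes, "A", 3), key=lambda w: (len(w), w)))
```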
Left-most & Right-most Derivation Styles
G:
E => E+E | E*E | (E) | F
F => aF | bF | 0F | 1F | ε
Derive the string a*(ab+10) from G: E ==>*G a*(ab+10)
Left-most derivation (always substitute the leftmost variable):
E
==> E * E
==> F * E
==> aF * E
==> a * E
==> a * (E)
==> a * (E + E)
==> a * (F + E)
==> a * (aF + E)
==> a * (abF + E)
==> a * (ab + E)
==> a * (ab + F)
==> a * (ab + 1F)
==> a * (ab + 10F)
==> a * (ab + 10)
Right-most derivation (always substitute the rightmost variable):
E
==> E * E
==> E * (E)
==> E * (E + E)
==> E * (E + F)
==> E * (E + 1F)
==> E * (E + 10F)
==> E * (E + 10)
==> E * (F + 10)
==> E * (aF + 10)
==> E * (abF + 10)
==> E * (ab + 10)
==> F * (ab + 10)
==> aF * (ab + 10)
==> a * (ab + 10)
Leftmost vs. Rightmost derivations
Q1) For every leftmost derivation, there is a rightmost
derivation, and vice versa. True or False?
True - will use parse trees to prove this
(using induction)
Parse trees
Parse Trees
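Each derivation in a CFG can be represented using a parse tree:
Each internal node is labeled by a variable in V
For a production A ==> X1 … Xi … Xk, an internal node labeled A has children X1 … Xi … Xk, from left to right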
Examples
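[Figure: parse tree for a + 1: root E with children E, +, E; the left E derives F, then a; the right E derives F, then 1]
[Figure: parse tree for 01110: root A with children 0, A, 0; the inner A has children 1, A, 1; the innermost A derives 1]
Reading a tree top-down gives a derivation; reading it bottom-up gives a recursive inference
[Figure: a production A ==> X1 … Xi … Xk drawn as a tree, with arrows relating derivation, right-most derivation, and recursive inference]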
Interchangeability of different CFG representations
Parse tree ==> left-most derivation
DFS left to right
Parse tree ==> right-most derivation
DFS right to left
==> every left-most derivation corresponds to a right-most
derivation (both come from the same parse tree)
Derivation ==> Recursive inference
Reverse the order of productions
Recursive inference ==> Parse trees
bottom-up traversal of parse tree
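The first two conversions can be sketched in a few lines: expanding the leftmost pending variable of a parse tree yields the left-most derivation, and expanding the rightmost one yields the right-most derivation. The tree encoding (pairs of label and children, with None marking a terminal leaf) and the helper names are assumptions made for illustration; the tree is the one for a+1.

```python
def leaf(symbol):
    return (symbol, None)               # terminal: no children to expand


def node(variable, *children):
    return (variable, list(children))   # variable with its production's body


TREE = node("E",
            node("E", node("F", leaf("a"))),
            leaf("+"),
            node("E", node("F", leaf("1"))))


def derivation(tree, leftmost=True):
    """Read a derivation off a parse tree by expanding one variable per step."""
    form = [tree]                        # current sentential form, as subtrees
    steps = ["".join(label for label, _ in form)]
    while True:
        pending = [i for i, (_, kids) in enumerate(form) if kids is not None]
        if not pending:
            return steps
        i = pending[0] if leftmost else pending[-1]
        form[i:i + 1] = form[i][1]       # replace the variable by its children
        steps.append("".join(label for label, _ in form))


print(" ==> ".join(derivation(TREE, leftmost=True)))
print(" ==> ".join(derivation(TREE, leftmost=False)))
```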
Connection between CFLs and RLs
What kind of grammars result for regular languages?
Ambiguity in CFGs and CFLs
Ambiguity in CFGs
A CFG is said to be ambiguous if there
exists a string which has more than one
left-most derivation
Example:
S ==> AS | ε
A ==> A1 | 0A1 | 01
Input string: 00111
Can be derived in two ways
LM derivation #1:
S => AS
=> 0A1S
=> 0A11S
=> 00111S
=> 00111
LM derivation #2:
S => AS
=> A1S
=> 0A11S
=> 00111S
=> 00111
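The definition can be tested by counting left-most derivations of a given string. The sketch below is a brute-force search written for this small grammar, not a general ambiguity test (ambiguity of CFGs is undecidable in general): it assumes single-character symbols, writes ε as the empty string, and relies on pruning by terminal count and prefix, which is enough for this grammar to make the search stop. The function name is hypothetical.

```python
from collections import deque

GRAMMAR = {
    "S": ["AS", ""],                 # "" stands for ε
    "A": ["A1", "0A1", "01"],
}


def count_leftmost_derivations(grammar, start, w):
    """Count the distinct left-most derivations of w from start."""
    count = 0
    queue = deque([start])           # no de-duplication: each path is one derivation
    while queue:
        form = queue.popleft()
        idx = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if idx is None:
            if form == w:
                count += 1
            continue
        for body in grammar[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            first_var = next((i for i, ch in enumerate(new) if ch in grammar), len(new))
            n_terminals = sum(ch not in grammar for ch in new)
            # Keep only forms that can still produce w.
            if n_terminals <= len(w) and w.startswith(new[:first_var]):
                queue.append(new)
    return count


print(count_leftmost_derivations(GRAMMAR, "S", "00111"))
```

A count greater than one for any string shows that the grammar is ambiguous.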
Why does ambiguity matter?
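E ==> E + E | E * E | (E) | a | b | c | 0 | 1
string = a * b + c
• LM derivation #1:
• E => E + E => E * E + E ==>* a * b + c
[Parse tree: root E with children E, +, E; the left E has children E, *, E deriving a and b; the right E derives c] Grouping: (a*b)+c
• LM derivation #2:
• E => E * E => a * E => a * E + E ==>* a * b + c
[Parse tree: root E with children E, *, E; the left E derives a; the right E has children E, +, E deriving b and c] Grouping: a*(b+c)
Values are different !!!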
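The difference in values is easy to see by evaluating the two parse trees. This is a minimal sketch: the tuple encoding of the trees and the sample values for a, b, and c are assumptions chosen for illustration.

```python
def evaluate(tree, env):
    """Evaluate a parse tree given as a variable name or (left, op, right)."""
    if isinstance(tree, str):
        return env[tree]
    left, op, right = tree
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    return lhs + rhs if op == "+" else lhs * rhs


tree1 = (("a", "*", "b"), "+", "c")    # from LM derivation #1: (a*b)+c
tree2 = ("a", "*", ("b", "+", "c"))    # from LM derivation #2: a*(b+c)
env = {"a": 2, "b": 3, "c": 4}         # hypothetical values

print(evaluate(tree1, env), evaluate(tree2, env))
```

With these values the same input string evaluates to 10 under one tree and 14 under the other, which is why a compiler needs an unambiguous grammar.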
Summary
Context-free grammars
Context-free languages
Productions, derivations, recursive inference,
parse trees
Left-most & right-most derivations
Ambiguous grammars
Removing ambiguity
CFL/CFG applications
parsers, markup languages