08 CFG
[Figure: two parse trees for 2 + 3 * 5. Grouping as (2 + 3) * 5 gives 25; grouping as 2 + (3 * 5) gives 17.]
Grammars describe meaning
EXPR → EXPR + TERM
EXPR → TERM
TERM → TERM * NUM
TERM → NUM
NUM → 0-9
[Figure: parse tree built from EXPR, TERM, and NUM nodes]
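A minimal sketch of how this grammar fixes the meaning of an expression: because TERM sits below EXPR, multiplication binds tighter than addition. The function names are illustrative and not from the slides, and the left-recursive rules are implemented as loops, which is a common substitute rather than a literal transcription of the productions.

```python
def evaluate(text):
    """Evaluate single-digit expressions built from + and *."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def num():
        # NUM -> 0-9
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in "0123456789":
            raise ValueError("digit expected at position %d" % pos)
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # TERM -> TERM * NUM | NUM, written as a loop
        nonlocal pos
        value = num()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            value *= num()
        return value

    def expr():
        # EXPR -> EXPR + TERM | TERM, written as a loop
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            value += term()
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError("unexpected symbol at position %d" % pos)
    return result


print(evaluate("2 + 3 * 5"))
```

With this grammar the only reading of 2 + 3 * 5 is 2 + (3 * 5).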
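The grammar of (parts of) English
SENTENCE → NOUN-PHRASE VERB-PHRASE
NOUN-PHRASE → A-NOUN | A-NOUN PREP-PHRASE
PREP-PHRASE → PREP NOUN-PHRASE   (for example, "with a flower")
[Figure: parse tree of an English sentence, with nodes including CMPLX-VERB, PREP-PHRASE, and NOUN-PHRASE]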
E → E + E     A → 0A
E → (E)       A → 1A
E → A         A → 0
              A → 1
Non-terminals: E, A
Terminals: +, *, (, ), 0, 1
Start variable: E
shorthand:
E → E + E | (E) | A
A → 0A | 1A | 0 | 1
conventions:
Variables in UPPERCASE
Start variable comes first
Derivation
• A derivation is a sequential application of productions:
Grammar: E → E + E | (E) | A
         A → 0A | 1A | 0 | 1
E ⇒ E + E
  ⇒ (E) + E
  ⇒ (E) + A
  ⇒ (E) + 1
  ⇒ (E + E) + 1
  ⇒ (E + A) + 1
  ⇒ (A + A) + 1
  ⇒ (A + 1A) + 1
  ⇒ (A + 10) + 1
  ⇒ (1 + 10) + 1
α ⇒ β: application of one production
E ⇒* (1 + 10) + 1
α ⇒* β: a derivation (zero or more productions applied in sequence)
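A small sketch of a derivation as repeated rewriting, for the grammar E → E + E | (E) | A, A → 0A | 1A | 0 | 1. Sentential forms are plain strings without spaces, and each step names the position of the variable to rewrite and the production body to use. The helper names are hypothetical.

```python
GRAMMAR = {
    "E": ["E+E", "(E)", "A"],
    "A": ["0A", "1A", "0", "1"],
}


def apply_production(form, index, body):
    """Rewrite the variable at form[index] using one production."""
    variable = form[index]
    if body not in GRAMMAR.get(variable, []):
        raise ValueError("%s -> %s is not a production" % (variable, body))
    return form[:index] + body + form[index + 1:]


def derive(start, steps):
    """Return every sentential form produced by the given steps."""
    forms = [start]
    for index, body in steps:
        forms.append(apply_production(forms[-1], index, body))
    return forms


# One derivation of (1+10)+1, one production per step.
STEPS = [
    (0, "E+E"), (0, "(E)"), (4, "A"), (4, "1"), (1, "E+E"),
    (3, "A"), (1, "A"), (3, "1A"), (4, "0"), (1, "1"),
]

for form in derive("E", STEPS):
    print(form)
```

Each printed line differs from the previous one by exactly one production, which is what α ⇒ β means; the whole list witnesses E ⇒* (1+10)+1.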
Context-free languages
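• The language of a CFG is the set of all strings generated by the grammar:
L(G) = {w : w ∈ Σ* and S ⇒* w}
Example:
A → 0A1 | B
B → #
L(G) = {0^n#1^n : n ≥ 0}
The shortest string is #, derived as A ⇒ B ⇒ #.
Example:
S → SS | (S) | ε
generates strings such as () and (()()).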
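As a quick illustration of the first example, membership in {0^n#1^n : n ≥ 0} can be tested by undoing the rule A → 0A1 from the outside in until only B → # remains. This is a sketch with a made-up function name, not a general CFG parser.

```python
def in_language(w):
    """Return True if w has the form 0^n # 1^n for some n >= 0."""
    # Undo A -> 0A1: peel one leading 0 and one trailing 1 at a time.
    while len(w) >= 2 and w[0] == "0" and w[-1] == "1":
        w = w[1:-1]
    # What is left must come from A -> B, B -> #.
    return w == "#"


for s in ["#", "0#1", "00#11", "0#11", "01"]:
    print(repr(s), in_language(s))
```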
Parse trees
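S → SS | (S) | ε
Two derivations of (()()):
S ⇒ (S) ⇒ (SS) ⇒ ((S)S) ⇒ (()S) ⇒ (()(S)) ⇒ (()())
S ⇒ (S) ⇒ (SS) ⇒ (S(S)) ⇒ (S()) ⇒ ((S)()) ⇒ (()())
[Figure: parse tree for (()()); both derivations correspond to this same tree]
L = {0^n1^n | n ≥ 0}
S → 0S1 | ε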
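One way to see that a parse tree summarizes many derivations is to store the tree for (()()) and read derivations off it. In this sketch a variable node is a pair (label, children), a terminal is a one-character string, and an ε-production is an empty child list; the representation and the function names are assumptions made for illustration.

```python
def render(form):
    """Show a sentential form: variable nodes by label, terminals as-is."""
    return "".join(x[0] if isinstance(x, tuple) else x for x in form)


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by expanding the leftmost
    (or rightmost) variable of the tree at every step."""
    form = [tree]
    steps = [render(form)]
    while True:
        variables = [i for i, x in enumerate(form) if isinstance(x, tuple)]
        if not variables:
            return steps
        i = variables[0] if leftmost else variables[-1]
        form[i:i + 1] = form[i][1]  # replace the node by its children
        steps.append(render(form))


# Parse tree for (()()) under S -> SS | (S) | epsilon.
PAIR = ("S", ["(", ("S", []), ")"])            # derives ()
TREE = ("S", ["(", ("S", [PAIR, PAIR]), ")"])  # derives (()())

print(" => ".join(derivation(TREE, leftmost=True)))
print(" => ".join(derivation(TREE, leftmost=False)))
```

The two printed derivations differ only in the order in which variables are expanded; the tree is the same.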
Design example
Numbers without leading zeros, e.g. 1052870032
S → 0 | LB
B → DB | ε                               (any number)
D → 0 | L                                (digit)
L → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9    (leading digit)
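Since this grammar never nests anything, the same language can also be written as a regular expression. The pattern below is a hand translation of the rules (S → 0 | LB, with B any digit string), offered as a sanity check and not as part of the original slides.

```python
import re

# S -> 0 | L B, where L is a nonzero digit and B is any digit string.
NUMBER = re.compile(r"0|[1-9][0-9]*")


def is_number(s):
    """Return True if s is 0 or a digit string with no leading zero."""
    return NUMBER.fullmatch(s) is not None


for s in ["1052870032", "0", "007", ""]:
    print(repr(s), is_number(s))
```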
Design examples
L = {0^n1^n0^m1^m | n ≥ 0, m ≥ 0}
Examples: 010011, 00110011, 000111
These strings have two parts:
L = L₁L₂
L₁ = {0^n1^n | n ≥ 0}
L₂ = {0^m1^m | m ≥ 0}
S → S₁S₁
rules for L₁: S₁ → 0S₁1 | ε
L₂ is the same as L₁, so S₁ serves for both parts
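The concatenation idea can be mirrored directly in a checker: a string is in L exactly when it splits into two pieces that are each of the form 0^k1^k. A brute-force sketch, with illustrative function names:

```python
def in_l1(w):
    """True if w = 0^k 1^k for some k >= 0 (the language of S1)."""
    k = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * k + "1" * k


def in_l(w):
    """True if w = uv with u and v both in L1 (the rule S -> S1 S1)."""
    return any(in_l1(w[:i]) and in_l1(w[i:]) for i in range(len(w) + 1))


for s in ["010011", "00110011", "000111", "0110"]:
    print(s, in_l(s))
```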
Design examples
L = {0^n1^m0^m1^n | n ≥ 0, m ≥ 0}
Examples: 011001, 0011, 1100, 00110011
These strings have nested structure:
outer part: 0^n1^n
inner part: 1^m0^m
S → 0S1 | A
A → 1A0 | ε
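For the nested design, a recursive checker can follow the two rules literally: one function per variable, each trying its productions. This is a sketch for this specific grammar with made-up function names.

```python
def derives_a(w):
    """A -> 1A0 | epsilon"""
    if w == "":
        return True
    return len(w) >= 2 and w[0] == "1" and w[-1] == "0" and derives_a(w[1:-1])


def derives_s(w):
    """S -> 0S1 | A"""
    if len(w) >= 2 and w[0] == "0" and w[-1] == "1" and derives_s(w[1:-1]):
        return True
    return derives_a(w)


for s in ["011001", "0011", "1100", "00110011", "0110"]:
    print(s, derives_s(s))
```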
Context-Free Grammar
• A context-free grammar G = (N, Σ, P, S),
where
– N: set of variables or non-terminals
– Σ: set of terminals
– P: set of productions, each of which is of the form
A → α1 | α2 | …
• where each αi is an arbitrary string of variables and
terminals
– S: start variable
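The four-tuple translates naturally into a small record type. The sketch below is one possible encoding, assuming every symbol is a single character and each production body is a string; the class and field names are not from the slides.

```python
from dataclasses import dataclass


@dataclass
class CFG:
    variables: set      # N
    terminals: set      # Sigma
    productions: dict   # P: variable -> list of bodies (strings of symbols)
    start: str          # S

    def __post_init__(self):
        if self.variables & self.terminals:
            raise ValueError("variables and terminals must be disjoint")
        if self.start not in self.variables:
            raise ValueError("start symbol must be a variable")
        symbols = self.variables | self.terminals
        for head, bodies in self.productions.items():
            if head not in self.variables:
                raise ValueError("production head %r is not a variable" % head)
            for body in bodies:
                for symbol in body:
                    if symbol not in symbols:
                        raise ValueError("unknown symbol %r" % symbol)


# The expression grammar E -> E+E | (E) | A, A -> 0A | 1A | 0 | 1.
G = CFG(
    variables={"E", "A"},
    terminals=set("+*()01"),
    productions={"E": ["E+E", "(E)", "A"], "A": ["0A", "1A", "0", "1"]},
    start="E",
)
print(sorted(G.variables), G.start)
```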
What is L(G)?
G: S → 0S0 | 1S1 | ε
Examples
What is L(G)?
– G: S → 0S0 | 1S1 | 0 | 1
• CFG?
S => 0S1 | A
A => 0A | ε
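When guessing L(G) for grammars like these, it helps to list every short string the grammar generates. The sketch below computes, for each variable, the set of strings of bounded length it derives, by repeating the productions until nothing new appears. It assumes one character per symbol and the empty string for ε; the function name is hypothetical.

```python
from itertools import product


def language_up_to(grammar, start, max_len):
    """All strings of length <= max_len derivable from start."""
    lang = {variable: set() for variable in grammar}
    changed = True
    while changed:
        changed = False
        for variable, bodies in grammar.items():
            for body in bodies:
                # A variable contributes the strings found so far,
                # a terminal contributes itself.
                choices = [tuple(lang[s]) if s in grammar else (s,) for s in body]
                for parts in product(*choices):
                    w = "".join(parts)
                    if len(w) <= max_len and w not in lang[variable]:
                        lang[variable].add(w)
                        changed = True
    return lang[start]


G1 = {"S": ["0S0", "1S1", ""]}
G2 = {"S": ["0S0", "1S1", "0", "1"]}
print(sorted(language_up_to(G1, "S", 4), key=lambda w: (len(w), w)))
print(sorted(language_up_to(G2, "S", 3), key=lambda w: (len(w), w)))
```

The loop terminates because only finitely many strings of length at most max_len exist over a finite alphabet.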
Design examples
Example string: 10010011010010110, split as A B C (initial part, middle part, final part)
A: ε, or ends in 1
C: ε, or begins with 1
U: any string
B has recursive structure: same number of 0s on both sides, at least one 0
(in the example, B = 00110100 = 00 D 00 with D = 1101)
S → ABC
A → ε | U1
U → 0U | 1U | ε
C → ε | 1U
B → 0D0 | 0B0
D → 1U1 | 1
[Figure: regular expression, NFA, and DFA]
From regular to context-free
regular expression → CFG
S → 0S1 | ε     L = {0^n1^n : n ≥ 0}
[Figure: diagram with regions labeled regular and context-free]
CFLs & Regular Languages
What kind of grammars result for regular languages?
[Figure: two finite automata with states A, B, C and transitions labeled 0 and 1, each captioned "Right linear CFG?"]

Grammar captioned "Finite Automaton?":
A => 01B | C
B => 11B | 0C | 1A
C => 1A | 0 | 1

A => 0A
B => 1B | 1A | 0
C => 0B | 0C | 1C | 0 | 1
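One standard way to answer "Right linear CFG?" for an automaton is to turn each transition into a production: a move from A to B on symbol a becomes A => aB, plus A => a when B is accepting. The sketch below applies this to a small DFA of my own (strings ending in 1), since the automata in the figure are not recoverable. State names are assumed to be single characters.

```python
def dfa_to_right_linear(transitions, accepting):
    """transitions maps (state, symbol) -> state; returns variable -> bodies.

    The grammar generates the DFA's language except the empty string;
    if the start state is accepting, add start => epsilon by hand.
    """
    grammar = {}
    for (state, symbol), target in sorted(transitions.items()):
        bodies = grammar.setdefault(state, [])
        bodies.append(symbol + target)   # A => aB
        if target in accepting:
            bodies.append(symbol)        # A => a: the derivation may stop here
    return grammar


# Hypothetical DFA: start state A, accepting state B, accepts strings ending in 1.
TRANSITIONS = {("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "A", ("B", "1"): "B"}

for variable, bodies in dfa_to_right_linear(TRANSITIONS, {"B"}).items():
    print(variable, "=>", " | ".join(bodies))
```

Every production has the form A => wB or A => w, which is exactly the right-linear shape.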
Exercises
L = {0^n1^(2n) | n ≥ 0}
L = {0^n10^n | n ≥ 0}
L = {0^n1^n1^m | n ≥ 0, m ≥ 0}
L = {0^n1^m0^n | n ≥ 0, m ≥ 0}
L = {0^m1^n | m ≠ n, m, n ≥ 0}
L = {0^i1^j2^k | i = j or j = k, where i, j, k ≥ 0}
L = {0^i1^j2^k | i = j or i = k, where i, j, k ≥ 1}
Binary strings of even length
Binary strings with equal numbers of 0’s and 1’s
Left-most & Right-most Derivation Styles
Derive the string a*(ab+10) from G:
G:
E => E+E | E*E | (E) | F
F => aF | bF | 0F | 1F | ε
E =*=>G a*(ab+10)
Left-most derivation: E ==> E * E ==> …
Right-most derivation: E ==> E * E ==> …
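The two styles can be replayed mechanically: at each step rewrite either the leftmost or the rightmost variable with a chosen production body. The body sequences below are one worked solution for a*(ab+10), not taken from the slides, and the function name is illustrative. The empty string stands for ε.

```python
GRAMMAR = {
    "E": ["E+E", "E*E", "(E)", "F"],
    "F": ["aF", "bF", "0F", "1F", ""],
}


def replay(start, bodies, leftmost=True):
    """Apply the bodies in order, always to the leftmost (or rightmost) variable."""
    form = start
    forms = [form]
    for body in bodies:
        positions = [i for i, c in enumerate(form) if c in GRAMMAR]
        i = positions[0] if leftmost else positions[-1]
        if body not in GRAMMAR[form[i]]:
            raise ValueError("%s => %s is not a production" % (form[i], body))
        form = form[:i] + body + form[i + 1:]
        forms.append(form)
    return forms


LEFT = ["E*E", "F", "aF", "", "(E)", "E+E", "F", "aF", "bF", "", "F", "1F", "0F", ""]
RIGHT = ["E*E", "(E)", "E+E", "F", "1F", "0F", "", "F", "aF", "bF", "", "F", "aF", ""]

print(" ==> ".join(replay("E", LEFT, leftmost=True)))
print(" ==> ".join(replay("E", RIGHT, leftmost=False)))
```

Both runs use the same productions the same number of times; only the order of expansion differs.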
Q3) Could there be words which have more than one leftmost
(or rightmost) derivation?
Yes – depending on the grammar
How to prove that your CFGs are
correct?
(using induction)
CFG & CFL
• Theorem: A string w in (0+1)* is in L(G), iff, w is a
palindrome.
G:
A → 0A0 | 1A1 | 0 | 1 | ε
• Proof:
– Use induction
• on string length for the IF part
• on length of derivation for the ONLY IF part
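Before attempting the induction, the claim can be checked exhaustively on short strings. The sketch below compares a recognizer that follows the productions against the definition of a palindrome, for every binary string up to a chosen length. It is evidence, not a proof, and the function names are made up.

```python
from itertools import product


def derives(w):
    """True if A =>* w under A -> 0A0 | 1A1 | 0 | 1 | epsilon (w binary)."""
    if w in ("", "0", "1"):
        return True
    # A -> 0A0 or A -> 1A1: first and last symbols must agree.
    return w[0] == w[-1] and derives(w[1:-1])


def theorem_holds_up_to(max_len):
    """Check 'w in L(G) iff w is a palindrome' for all short binary strings."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            w = "".join(symbols)
            if derives(w) != (w == w[::-1]):
                return False
    return True


print(theorem_holds_up_to(10))
```

The recursion has the same shape as the induction on string length: strip the matching outer symbols and recurse on a string two symbols shorter.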
Ambiguity in CFGs
• A CFG is said to be ambiguous if there exists a
string which has more than one left-most
derivation
Example:
S ==> AS | ε
A ==> A1 | 0A1 | 01
Input string: 00111
Can be derived in two ways:
LM derivation #1:
S => AS
=> 0A1S
=> 0A11S
=> 00111S
=> 00111
LM derivation #2:
S => AS
=> A1S
=> 0A11S
=> 00111S
=> 00111
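The two derivations above can be found by exhaustive search: expand the leftmost variable in every possible way and count the branches that end in the input string. The pruning used here (terminal count and matching prefix) is enough for this grammar; for grammars with cycles such as A ==> A the recursion would not terminate, so treat this as a sketch for this example only.

```python
GRAMMAR = {"S": ["AS", ""], "A": ["A1", "0A1", "01"]}


def count_leftmost(form, target):
    """Number of leftmost derivations form =>* target."""
    # Terminals never disappear, so too many terminals means a dead end.
    if sum(1 for s in form if s not in GRAMMAR) > len(target):
        return 0
    for i, symbol in enumerate(form):
        if symbol in GRAMMAR:
            break
    else:
        # No variables left: the form is a terminal string.
        return 1 if form == target else 0
    # Everything before the leftmost variable is final and must match.
    if form[:i] != target[:i]:
        return 0
    return sum(
        count_leftmost(form[:i] + body + form[i + 1:], target)
        for body in GRAMMAR[symbol]
    )


print(count_leftmost("S", "00111"))
```

A count above 1 for any string shows the grammar is ambiguous.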
Why does ambiguity matter?
E ==> E + E | E * E | (E) | a | b | c | 0 | 1
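• LM derivation #1:
E => E + E => E * E + E ==>* a * b + c
[Parse tree: the root E expands to E + E; the left E expands to E * E (a and b), the right E to c. The tree reads as (a*b)+c.]
• LM derivation #2:
E => E * E => a * E => a * E + E ==>* a * b + c
[Parse tree: the root E expands to E * E; the left E is a, the right E expands to E + E (b and c). The tree reads as a*(b+c).]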
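The practical cost of the two trees shows up as soon as the expression is evaluated. In the sketch below each tree is a nested tuple (operator, left, right), and the values given to a, b, and c are arbitrary choices for illustration.

```python
def evaluate(tree, env):
    """Evaluate a parse tree given as (operator, left, right) or a leaf name."""
    if isinstance(tree, str):
        return env[tree]
    operator, left, right = tree
    x, y = evaluate(left, env), evaluate(right, env)
    return x + y if operator == "+" else x * y


TREE_1 = ("+", ("*", "a", "b"), "c")   # (a*b)+c, from LM derivation #1
TREE_2 = ("*", "a", ("+", "b", "c"))   # a*(b+c), from LM derivation #2
ENV = {"a": 2, "b": 3, "c": 5}         # hypothetical values

print(evaluate(TREE_1, ENV), evaluate(TREE_2, ENV))
```

The same string a * b + c yields two different values, so a compiler built on the ambiguous grammar would have no principled way to choose.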