
Context Free Language

Theory of Automata and Formal Languages

CS-352 TAFL

Context Free Languages

Instructor: Yusra Arshad

Book: Prof. Sipser-MIT

An informal example
Language of palindromes: L_pal
A palindrome is a string that reads the same forward and
backward. Ex: otto, madamimadam, 0110, 11011, ε

L_pal is not a regular language (this can be proved using the pumping lemma).

We consider Σ = {0, 1}. There is a natural, recursive definition of when a
string of 0s and 1s is in L_pal.
Basis: ε, 0 and 1 are palindromes.
Induction: If w is a palindrome, so are 0w0 and 1w1. No string of 0s and 1s is a
palindrome unless it follows from this basis and induction rule.
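The recursive definition can be turned almost word for word into a membership test. The following is a minimal sketch in Python; the function name `is_palindrome` is chosen for illustration and is not part of the slides.

```python
def is_palindrome(w: str) -> bool:
    """Membership test for L_pal over {0, 1}, following the recursive definition."""
    if any(ch not in "01" for ch in w):
        return False
    # Basis: the empty string, "0" and "1" are palindromes.
    if w in ("", "0", "1"):
        return True
    # Induction: w must have the form 0x0 or 1x1 where x is a palindrome.
    return w[0] == w[-1] and is_palindrome(w[1:-1])


print(is_palindrome("0110"), is_palindrome("11011"), is_palindrome("011"))
# prints: True True False
```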

A CFG is a formal notation for expressing such recursive definitions of languages.
• Informally, a Context-Free Language (CFL) is a language generated by a
Context-Free Grammar (CFG).

• What is a CFG?

• Informally, a CFG is a set of rules for deriving (or generating) strings (or
sentences) in a language.

• Note: A grammar generates a string, whereas a machine accepts a string

Context-Free Languages

• The class of context-free languages generalizes the class of regular languages, i.e.,
every regular language is a context-free language.

• The converse is not true, i.e., not every context-free language is
regular. For example, as we will see, {0^k 1^k | k ≥ 0} is context-free but not regular.

• Many of the issues and questions we studied for regular languages carry over to
context-free languages:

– Machine model: PDA (Push-Down Automaton)
– Descriptor: CFG (Context-Free Grammar)
– Pumping lemma for context-free languages (used to find the limits of CFLs)
– Closure of context-free languages with respect to various operations
– Algorithms and conditions for finiteness or emptiness

Different Kinds of Automata
Automata are distinguished by their temporary memory:

• Finite Automata: no temporary memory

• Pushdown Automata: stack

• Turing Machines: random access memory

Memory affects computational power: more flexible memory makes it possible to
solve more computational problems.
Context Free Grammars
• We have just studied Context-Free Grammars (CFGs).
• A grammar is context-free because a non-terminal can be replaced by the right-hand
side of one of its productions regardless of the symbols around it.
• On the left-hand side of each production rule we have a single non-terminal, and on the
right-hand side we have a sequence of terminals and non-terminals.
• The language generated by G is the set of all sentences that can be
generated from the start symbol S.
• Context-free grammars are important because they are powerful enough to
describe the syntax of programming languages.
• Almost all programming languages are defined via context-free grammars.
• The language generated by a CFG is called a Context-Free Language (CFL).

Context Free Grammars
A CFG is a collection of the following:
1. An alphabet ∑ of letters called terminals, from
which the strings that are the words of the
language are formed.
2. A set of symbols called non-terminals, one of
which is S, which stands for “start here”.
3. A finite set of productions of the form
non-terminal → finite string of terminals and/or
non-terminals.
• The terminals are designated by small letters,
while the non-terminals are designated by capital
letters.
• There is at least one production that has the non-terminal
S as its left side.
• Example CFG:

<sentence> –> <noun-phrase> <verb-phrase>       (1)
<noun-phrase> –> <proper-noun>                  (2)
<noun-phrase> –> <determiner> <common-noun>     (3)
<proper-noun> –> John                           (4)
<proper-noun> –> Jill                           (5)
<common-noun> –> car                            (6)
<common-noun> –> hamburger                      (7)
<determiner> –> a                               (8)
<determiner> –> the                             (9)
<verb-phrase> –> <verb> <adverb>                (10)
<verb-phrase> –> <verb>                         (11)
<verb> –> drives                                (12)
<verb> –> eats                                  (13)
<adverb> –> slowly                              (14)
<adverb> –> frequently                          (15)

• Example Derivation:

<sentence> => <noun-phrase> <verb-phrase>       by (1)
           => <proper-noun> <verb-phrase>       by (2)
           => Jill <verb-phrase>                by (5)
           => Jill <verb> <adverb>              by (10)
           => Jill drives <adverb>              by (12)
           => Jill drives frequently            by (15)
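The derivation above can be replayed mechanically. The sketch below stores productions (1) to (15) in a dictionary keyed by rule number and, at each step, rewrites the first occurrence of the rule's left-hand side. The names `RULES` and `apply` are illustrative choices, not notation from the slides.

```python
# Productions (1)-(15) from the example, keyed by rule number.
RULES = {
    1: ("<sentence>", ["<noun-phrase>", "<verb-phrase>"]),
    2: ("<noun-phrase>", ["<proper-noun>"]),
    3: ("<noun-phrase>", ["<determiner>", "<common-noun>"]),
    4: ("<proper-noun>", ["John"]),
    5: ("<proper-noun>", ["Jill"]),
    6: ("<common-noun>", ["car"]),
    7: ("<common-noun>", ["hamburger"]),
    8: ("<determiner>", ["a"]),
    9: ("<determiner>", ["the"]),
    10: ("<verb-phrase>", ["<verb>", "<adverb>"]),
    11: ("<verb-phrase>", ["<verb>"]),
    12: ("<verb>", ["drives"]),
    13: ("<verb>", ["eats"]),
    14: ("<adverb>", ["slowly"]),
    15: ("<adverb>", ["frequently"]),
}


def apply(form, number):
    """Rewrite the first occurrence of the rule's left-hand side in `form`."""
    lhs, rhs = RULES[number]
    i = form.index(lhs)  # raises ValueError if the rule does not apply
    return form[:i] + rhs + form[i + 1:]


form = ["<sentence>"]
for number in (1, 2, 5, 10, 12, 15):
    form = apply(form, number)
    print("=>", " ".join(form), f"  by ({number})")
# the last line printed is: => Jill drives frequently   by (15)
```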
• Informally, a CFG consists of:

– A set of replacement rules, each having a Left-Hand Side (LHS) and a Right-Hand
Side (RHS).
– Two types of symbols; variables and terminals.
– LHS of each rule is a single variable (no terminals).
– RHS of each rule is a string of zero or more variables and terminals.
– A string consists of only terminals.

Formal Definition of CFG
• Formally, a Context-Free Grammar (CFG) is a 4-tuple:

G = (V, T, P, S)

V - A finite set of variables or non-terminals

T - A finite set of terminals (V and T do not intersect: they share no symbols).
This is our ∑

P - A finite set of productions, each of the form A –> α, where A is in V and
α is in (V ∪ T)*

Note that α may be ε

S - A starting non-terminal (S is in V)
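One possible way to hold the 4-tuple in a program is sketched below. The class name `CFG`, its field names, and the choice to represent each production as a (head, body) pair are assumptions made for illustration; the checks mirror the conditions in the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P, as (head, body) pairs; body is a tuple of symbols
    start: str             # S

    def __post_init__(self):
        # V and T must not share symbols.
        if self.variables & self.terminals:
            raise ValueError("V and T must be disjoint")
        # S must be a variable.
        if self.start not in self.variables:
            raise ValueError("S must be in V")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            # Each production is A -> alpha with A in V and alpha in (V ∪ T)*.
            if head not in self.variables:
                raise ValueError(f"head {head!r} is not a variable")
            if any(s not in symbols for s in body):
                raise ValueError(f"body {body!r} uses an unknown symbol")


# G = ({S}, {0, 1}, P, S) with S -> 0S1 | ε (the empty tuple stands for ε).
g = CFG(frozenset({"S"}), frozenset({"0", "1"}),
        (("S", ("0", "S", "1")), ("S", ())), "S")
print(g.start, len(g.productions))
```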
• Example CFG for {0^k 1^k | k ≥ 0}:
G = ({S}, {0, 1}, P, S) // Remember: G = (V, T, P, S)

P:
(1) S –> 0S1
(2) S –> ε
or simply S –> 0S1 | ε

• Example Derivations:

S => ε (2)

S => 0S1 (1)
  => 01 (2)

S => 0S1 (1)
  => 00S11 (1)
  => 000S111 (1)
  => 000111 (2)

• Note that G “generates” the language {0^k 1^k | k ≥ 0}
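As a small illustration, the two rules can be run in both directions: forward to derive a string, and backward to test membership. This is a sketch only; the helper names `derive` and `generated` are hypothetical.

```python
def derive(k: int) -> str:
    """Apply rule (1) S -> 0S1 k times, then rule (2) S -> ε."""
    form = "S"
    for _ in range(k):
        form = form.replace("S", "0S1")  # rule (1)
    return form.replace("S", "")         # rule (2)


def generated(w: str) -> bool:
    """Membership test that undoes the rules from the outside in."""
    if w == "":
        return True  # produced by S -> ε
    # Otherwise w must be 0x1 with x generated by S (rule S -> 0S1).
    return w[0] == "0" and w[-1] == "1" and generated(w[1:-1])


print(derive(3))                                   # prints: 000111
print(generated("000111"), generated("0101"))      # prints: True False
```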


• Example CFG for ?: (Recursive Inference)
G = ({A, B, C, S}, {a, b, c}, P, S)

P:
(1) S –> ABC
(2) A –> aA
(3) A –> ε          or simply A –> aA | ε
(4) B –> bB
(5) B –> ε          or simply B –> bB | ε
(6) C –> cC
(7) C –> ε          or simply C –> cC | ε

• Example Derivations:

S => ABC (1)
  => BC (3)
  => C (5)
  => ε (7)

S => ABC (1)
  => aABC (2)
  => aaABC (2)
  => aaBC (3)
  => aabBC (4)
  => aabC (5)
  => aabcC (6)
  => aabc (7)

• Note that G generates the language a*b*c*
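The claim that G generates a*b*c* can be spot-checked for short strings. The sketch below enumerates every terminal string of length at most n derivable from S and compares the result with the regular expression a*b*c*. It is a finite check, not a proof, and the function name `language_up_to` is an illustrative choice.

```python
import re
from itertools import product

# Variables are upper-case letters, terminals lower-case; "" stands for ε.
RULES = {"S": ["ABC"], "A": ["aA", ""], "B": ["bB", ""], "C": ["cC", ""]}


def language_up_to(n: int) -> set:
    """Collect every terminal string of length <= n derivable from S."""
    found, seen, todo = set(), {"S"}, ["S"]
    while todo:
        form = todo.pop()
        # Locate the leftmost variable, if any.
        i = next((j for j, s in enumerate(form) if s.isupper()), None)
        if i is None:
            found.add(form)
            continue
        for body in RULES[form[i]]:
            new = form[:i] + body + form[i + 1:]
            # Prune forms that already contain more than n terminals.
            if sum(s.islower() for s in new) <= n and new not in seen:
                seen.add(new)
                todo.append(new)
    return found


expected = {"".join(p) for k in range(4) for p in product("abc", repeat=k)
            if re.fullmatch(r"a*b*c*", "".join(p))}
print(language_up_to(3) == expected)  # prints: True
```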
Examples (cont.): The grammar G1
Grammar G1 = ({E, I}, T, P, E) where T = {+, ∗, (, ), a, b, 0, 1} and P is the set of
productions:

1 E → I
2 E → E + E
3 E → E ∗ E
4 E → (E)
5 I → a
6 I → b
7 I → Ia
8 I → Ib
9 I → I0
10 I → I1

Examples of derivation
• Derivation of a ∗ (a + b00) by G1:

E ⇒ E ∗ E ⇒ I ∗ E ⇒ a ∗ E ⇒ a ∗ (E) ⇒
a ∗ (E + E) ⇒ a ∗ (I + E) ⇒ a ∗ (a + E) ⇒ a ∗ (a + I) ⇒
a ∗ (a + I0) ⇒ a ∗ (a + I00) ⇒ a ∗ (a + b00)

• Note 1: At each step we might have several rules to choose from, e.g.
I ∗ E ⇒ a ∗ E ⇒ a ∗ (E), versus
I ∗ E ⇒ I ∗ (E) ⇒ a ∗ (E).

• Note 2: Not all choices lead to successful derivations of a particular string. For instance,
E ⇒ E + E (at the first step)
won't lead to a derivation of a ∗ (a + b00).


Leftmost and Rightmost derivation
In order to restrict the number of choices we have in deriving a string, it is often
useful to require that at each step we replace the leftmost (or rightmost) variable
by one of its production rules.

Leftmost derivation ⇒lm: Always replace the leftmost variable by one of its
rule-bodies.
Rightmost derivation ⇒rm: Always replace the rightmost variable by one of
its rule-bodies.

EXAMPLES
1− Leftmost derivation: previous example
2− Rightmost derivation:
E ⇒rm E ∗ E ⇒rm E ∗ (E) ⇒rm
E ∗ (E + E) ⇒rm E ∗ (E + I) ⇒rm E ∗ (E + I0) ⇒rm
E ∗ (E + I00) ⇒rm E ∗ (E + b00) ⇒rm E ∗ (I + b00) ⇒rm
E ∗ (a + b00) ⇒rm I ∗ (a + b00) ⇒rm a ∗ (a + b00)

We can conclude that E ⇒*rm a ∗ (a + b00)
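A leftmost derivation is fully determined by the sequence of production numbers used, because the position to rewrite is always the leftmost variable. The sketch below replays the leftmost derivation of a ∗ (a + b00) from the rule sequence 3, 1, 5, 4, 2, 1, 5, 1, 9, 9, 6. It writes ∗ as the ASCII character * and omits spaces; the names `G1` and `leftmost_step` are illustrative.

```python
# Productions 1-10 of G1, as (head, body) pairs keyed by rule number.
G1 = {1: ("E", "I"), 2: ("E", "E+E"), 3: ("E", "E*E"), 4: ("E", "(E)"),
      5: ("I", "a"), 6: ("I", "b"), 7: ("I", "Ia"), 8: ("I", "Ib"),
      9: ("I", "I0"), 10: ("I", "I1")}


def leftmost_step(form: str, number: int) -> str:
    """Apply production `number` to the leftmost variable of `form`."""
    head, body = G1[number]
    i = next(j for j, s in enumerate(form) if s in "EI")  # leftmost variable
    if form[i] != head:
        raise ValueError(f"rule {number} does not apply to {form!r}")
    return form[:i] + body + form[i + 1:]


form = "E"
for number in (3, 1, 5, 4, 2, 1, 5, 1, 9, 9, 6):
    form = leftmost_step(form, number)
    print("=>", form)
# the last line printed is: => a*(a+b00)
```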
• Example:

S –> AB
A –> aAA
A –> aA
A –> a
B –> bB
B –> b

Derivation tree with root S:

      S
     / \
    A   B
   /|\  |
  a A A b
      |
      a

yield = aAab

Derivation tree with root A:

    A
   /|\
  a A A
     /|\
    a A A

yield = aAaAA

• Notes:
– Root can be any non-terminal
– Leaf nodes can be terminals or non-terminals
– A derivation tree with root S shows the productions used to obtain a sentential form

• Observation: Every derivation corresponds to one derivation tree.

Rules:
S –> AB
A –> aAA
A –> aA
A –> a
B –> bB
B –> b

S => AB
  => aAAB
  => aaAB
  => aaaB
  => aaab

      S
     / \
    A   B
   /|\  |
  a A A b
    | |
    a a

• Observation: Every derivation tree corresponds to one or more derivations.

leftmost:       rightmost:      mixed:
S => AB         S => AB         S => AB
  => aAAB         => Ab           => Ab
  => aaAB         => aAAb         => aAAb
  => aaaB         => aAab         => aaAb
  => aaab         => aaab         => aaab

• Definition: A derivation is leftmost (rightmost) if at each step in the derivation a
production is applied to the leftmost (rightmost) non-terminal in the sentential form.
– The first derivation above is leftmost, the second is rightmost, and the third is neither.
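The definition can be checked step by step: a step is leftmost if rewriting the leftmost non-terminal with some production gives the next sentential form, and similarly for rightmost. The sketch below classifies the three derivations of aaab in this way; `step_ok` and `classify` are hypothetical helper names.

```python
# Upper-case letters are non-terminals, lower-case letters are terminals.
RULES = {"S": ["AB"], "A": ["aAA", "aA", "a"], "B": ["bB", "b"]}


def step_ok(form: str, nxt: str, pick) -> bool:
    """True if `nxt` results from rewriting the non-terminal chosen by `pick`."""
    positions = [i for i, s in enumerate(form) if s.isupper()]
    if not positions:
        return False
    i = pick(positions)  # min -> leftmost, max -> rightmost
    return any(form[:i] + body + form[i + 1:] == nxt for body in RULES[form[i]])


def classify(derivation) -> str:
    steps = list(zip(derivation, derivation[1:]))
    left = all(step_ok(a, b, min) for a, b in steps)
    right = all(step_ok(a, b, max) for a, b in steps)
    if left and right:
        return "leftmost and rightmost"
    return "leftmost" if left else "rightmost" if right else "neither"


print(classify(["S", "AB", "aAAB", "aaAB", "aaaB", "aaab"]))  # prints: leftmost
print(classify(["S", "AB", "Ab", "aAAb", "aAab", "aaab"]))    # prints: rightmost
print(classify(["S", "AB", "Ab", "aAAb", "aaAb", "aaab"]))    # prints: neither
```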
• Observation: Every derivation tree corresponds to exactly one leftmost (and
exactly one rightmost) derivation.

S => AB
  => aAAB
  => aaAB
  => aaaB
  => aaab

      S
     / \
    A   B
   /|\  |
  a A A b
    | |
    a a

• Observation: Let G be a CFG. Then there may exist a string x in L(G) that has
more than 1 leftmost (or rightmost) derivation. Such a string will also have
more than 1 derivation tree.

• Example: Consider the string aaab and the preceding grammar.

S –> AB
A –> aAA
A –> aA
A –> a
B –> bB
B –> b

First leftmost derivation and its tree:

S => AB
  => aAAB
  => aaAB
  => aaaB
  => aaab

      S
     / \
    A   B
   /|\  |
  a A A b
    | |
    a a

Second leftmost derivation and its tree:

S => AB
  => aAB
  => aaAB
  => aaaB
  => aaab

      S
     / \
    A   B
   / \  |
  a   A b
     / \
    a   A
        |
        a

• The string has two leftmost derivations, and therefore has two distinct parse trees.
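The number of parse trees of a string can be counted by trying every way of splitting it among the symbols of each production body. The sketch below does this for the grammar above and relies on the fact that this grammar has no ε-productions, so every symbol derives at least one character. The function names `trees` and `ways` are illustrative.

```python
from functools import lru_cache

# Upper-case letters are non-terminals, lower-case letters are terminals.
RULES = {"S": ["AB"], "A": ["aAA", "aA", "a"], "B": ["bB", "b"]}


@lru_cache(maxsize=None)
def trees(symbol: str, w: str) -> int:
    """Number of parse trees with root `symbol` and yield `w`."""
    if symbol.islower():  # a terminal matches only itself
        return 1 if w == symbol else 0
    return sum(ways(body, w) for body in RULES[symbol])


@lru_cache(maxsize=None)
def ways(body: str, w: str) -> int:
    """Number of ways the symbol sequence `body` can derive `w`."""
    if not body:
        return 1 if not w else 0
    # No ε-productions: the first symbol takes at least one character and
    # leaves at least one character for each remaining symbol.
    return sum(trees(body[0], w[:i]) * ways(body[1:], w[i:])
               for i in range(1, len(w) - len(body) + 2))


print(trees("S", "aaab"))  # prints: 2
```

A count greater than 1 for some string is exactly what makes a grammar ambiguous.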

• Definition: Let G be a CFG. Then G is said to be ambiguous if there exists an
x in L(G) with more than one leftmost derivation. Equivalently, G is
ambiguous if there exists an x in L(G) with more than one parse tree, or more than
one rightmost derivation.

• Note: Given a CFL L, there may be more than one CFG G with L = L(G),
some ambiguous and some not.

• Definition: Let L be a CFL. If every CFG G with L = L(G) is ambiguous, then
L is inherently ambiguous.

• An ambiguous grammar, with ∑ = {0, …, 9, +, *, (, )}:

E -> I
E -> E + E
E -> E * E
E -> (E)
I -> ε | 0 | 1 | … | 9

• A string: 3*2+5

• Two parse trees (* on top, and + on top) and two leftmost derivations:

A leftmost derivation:      Another leftmost derivation:
E => E*E                    E => E+E
  => I*E                      => E*E+E
  => 3*E                      => I*E+E
  => 3*E+E                    => 3*E+E
  => 3*I+E                    => 3*I+E
  => 3*2+E                    => 3*2+E
  => 3*2+I                    => 3*2+I
  => 3*2+5                    => 3*2+5
Parse tree for the first leftmost derivation (* on top):

        E
      / | \
     E  *  E
     |   / | \
     I  E  +  E
     |  |     |
     3  I     I
        |     |
        2     5

Parse tree for the second leftmost derivation (+ on top):

          E
        / | \
       E  +  E
     / | \   |
    E  *  E  I
    |     |  |
    I     I  5
    |     |
    3     2
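Ambiguity matters in practice because the two parse trees of 3*2+5 group the operands differently. The sketch below writes each tree as a nested tuple (left, operator, right), a simplified encoding assumed here that drops the E and I nodes, and evaluates both.

```python
def evaluate(tree):
    """Evaluate a tree given as a digit string or a (left, op, right) tuple."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    a, b = evaluate(left), evaluate(right)
    return a * b if op == "*" else a + b


def yield_of(tree) -> str:
    """Read the leaves from left to right."""
    if isinstance(tree, str):
        return tree
    left, op, right = tree
    return yield_of(left) + op + yield_of(right)


star_on_top = ("3", "*", ("2", "+", "5"))   # tree with * at the root
plus_on_top = (("3", "*", "2"), "+", "5")   # tree with + at the root

print(yield_of(star_on_top), evaluate(star_on_top))  # prints: 3*2+5 21
print(yield_of(plus_on_top), evaluate(plus_on_top))  # prints: 3*2+5 11
```

Both trees have the same yield, yet they evaluate to different values.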
• A language may be inherently ambiguous:
L = {a^n b^n c^m d^m | n ≥ 1, m ≥ 1} ∪ {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}

• An ambiguous grammar:
S -> AB | C
A -> aAb | ab
B -> cBd | cd
C -> aCd | aDd
D -> bDc | bc

• Try the string aabbccdd: it has two different derivation trees.

• No grammar for this language can be made unambiguous (proof not shown).
The string aabbccdd belongs to both parts of the language
L = {a^n b^n c^m d^m | n ≥ 1, m ≥ 1} ∪ {a^n b^m c^m d^n | n ≥ 1, m ≥ 1}

Rules:
S -> AB | C
A -> aAb | ab
B -> cBd | cd
C -> aCd | aDd
D -> bDc | bc

Derivation 1 of aabbccdd:    Derivation 2 of aabbccdd:

S => AB                      S => C
  => aAbB                      => aCd
  => aabbB                     => aaDdd
  => aabbcBd                   => aabDcdd
  => aabbccdd                  => aabbccdd
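That aabbccdd lies in both parts of the union can be checked directly by counting the blocks of letters. This is a minimal sketch; the function names `in_part1` and `in_part2` are chosen for illustration.

```python
import re

BLOCKS = re.compile(r"(a+)(b+)(c+)(d+)")  # at least one of each letter, in order


def in_part1(w: str) -> bool:
    """w in {a^n b^n c^m d^m | n >= 1, m >= 1}."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(1)) == len(m.group(2)) \
        and len(m.group(3)) == len(m.group(4))


def in_part2(w: str) -> bool:
    """w in {a^n b^m c^m d^n | n >= 1, m >= 1}."""
    m = BLOCKS.fullmatch(w)
    return bool(m) and len(m.group(1)) == len(m.group(4)) \
        and len(m.group(2)) == len(m.group(3))


print(in_part1("aabbccdd"), in_part2("aabbccdd"))  # prints: True True
print(in_part1("aabbcd"), in_part2("aabbcd"))      # prints: True False
```

The strings that lie in both parts are exactly those of the form a^n b^n c^n d^n.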
