Lecture 8 - CFG


Theory of Computation

Lecture 8
Grammars & CFG
Dr. Samar Hesham
Department of Computer Science
Faculty of Computers and AI
Cairo University
Egypt

Syllabus and Terminologies

• Regular Languages (Regular Sets)
  - REs (Regular Expressions)
  - FSMs (or FSA/FA): Finite State Machines/Automata
  - DFSM/DFA vs. NDFSM/NFA: Deterministic vs. Non-deterministic FSM
  - Comparison and conversion
  - Examples & Closure Operations
• Context-Free Languages
  - CFGs: Context-Free Grammars
  - PDA: Pushdown Automata
  - Some applications
• Turing Machine

5/31/2024 FCI-CU-EG 2
Languages

• Finite automata accept all regular languages and only regular languages.
• Many simple languages are not regular:
  - {a^n b^n : n = 0, 1, 2, …}
  - {w : w is a palindrome}
  and no finite automaton accepts them.
• Context-free languages (CFLs) are a larger class of languages that encompasses all regular languages and many others, including the two above.
[Figure: the languages {a^n b^n} and {ww^R} shown alongside the set of regular languages.]
Context-Free Languages

[Figure: the context-free languages, containing {a^n b^n}, {ww^R}, and the regular languages.]
Regular Languages

• Three equivalent formal ways to approach regular languages:
  - Regular expressions (specification)
  - Finite state automata (implementation)
  - Regular grammars (representation)
Context-Free Languages

[Figure: context-free languages are described by context-free grammars and accepted by pushdown automata (an automaton equipped with a stack).]
What is a Grammar
• A grammar is a precise description of a formal language.
• It describes which sequences of symbols (strings) constitute valid words or sentences in that language.
• Natural languages:
  - Arabic, English, French, Spanish, etc.
• Programming and markup languages:
  - C, C++, Java, C#, HTML, XML, …

What is a Grammar
• A grammar G = <N, Σ, P, S> consists of the following components:
1. A finite set N of non-terminal symbols (variables).
2. A finite set Σ of terminal symbols, disjoint from N.
3. A finite set P of production rules of the form
(Σ ∪ N)* N (Σ ∪ N)* → (Σ ∪ N)*
where * is the Kleene star operator and ∪ denotes set union. Each production rule maps one string of symbols to another, and the left-hand side contains at least one non-terminal symbol.
4. A distinguished start symbol S ∈ N.
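
As a rough illustration of the four components, here is a minimal Python sketch. The class name `Grammar`, its field names, and the method `is_well_formed` are assumptions made for this example, not notation from the lecture; productions are stored as pairs of symbol tuples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    productions: tuple       # P: pairs (lhs, rhs), each a tuple of symbols
    start: str               # S

    def is_well_formed(self) -> bool:
        """Check the conditions listed in the definition above."""
        symbols = self.nonterminals | self.terminals
        if self.nonterminals & self.terminals:
            return False  # N and Σ must be disjoint
        if self.start not in self.nonterminals:
            return False  # S must belong to N
        for lhs, rhs in self.productions:
            if any(sym not in symbols for sym in lhs + rhs):
                return False  # only symbols from Σ ∪ N are allowed
            if not any(sym in self.nonterminals for sym in lhs):
                return False  # the left-hand side needs a non-terminal
        return True


# Hypothetical example: S -> ε | aS | bS | cS over Σ = {a, b, c}.
example = Grammar(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"a", "b", "c"}),
    productions=(
        (("S",), ()),          # the empty tuple stands for ε
        (("S",), ("a", "S")),
        (("S",), ("b", "S")),
        (("S",), ("c", "S")),
    ),
    start="S",
)
print(example.is_well_formed())  # prints True
```

Finiteness of N, Σ and P comes for free here because Python sets and tuples are finite.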

Regular languages
• A language is said to be a regular language if it is generated by a regular grammar.
• A grammar is said to be regular if it is either right-linear or left-linear.
• Specifically, a grammar G = <N, Σ, P, S> is said to be:
  - right-linear if each of its production rules has the form A → xB or A → x,
  - left-linear if each of its production rules has the form A → Bx or A → x,
• where:
  - A and B are non-terminal symbols in N, and
  - x is a string of terminal symbols in Σ*.
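
The two linear forms can be checked mechanically. The sketch below assumes a simplified encoding that is not part of the lecture: every symbol is one character, uppercase letters are non-terminals, lowercase letters are terminals, and a production is a pair `(lhs, rhs)` of strings, with `""` standing for the empty string. It uses the standard forms A → xB or A → x (right-linear) and A → Bx or A → x (left-linear).

```python
def is_right_linear(productions) -> bool:
    """Every rule is A -> xB or A -> x, with x a string of terminals."""
    for lhs, rhs in productions:
        if len(lhs) != 1 or not lhs.isupper():
            return False
        # Drop one trailing non-terminal, if present; the rest must be terminals.
        body = rhs[:-1] if rhs and rhs[-1].isupper() else rhs
        if any(ch.isupper() for ch in body):
            return False
    return True


def is_left_linear(productions) -> bool:
    """Every rule is A -> Bx or A -> x, with x a string of terminals."""
    for lhs, rhs in productions:
        if len(lhs) != 1 or not lhs.isupper():
            return False
        # Drop one leading non-terminal, if present; the rest must be terminals.
        body = rhs[1:] if rhs and rhs[0].isupper() else rhs
        if any(ch.isupper() for ch in body):
            return False
    return True


def is_regular_grammar(productions) -> bool:
    return is_right_linear(productions) or is_left_linear(productions)


star = [("S", ""), ("S", "aS"), ("S", "bS"), ("S", "cS")]
split = [("S", "AB"), ("A", ""), ("A", "aA"), ("B", ""), ("B", "bB")]
print(is_regular_grammar(star))   # prints True
print(is_regular_grammar(split))  # prints False
```

Note that the check is about the grammar, not the language: the second grammar is not regular because of S → AB, yet the language it generates (a*b*) is regular and has other grammars that are right-linear.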

Example
• Let A = {a, b, c}. A grammar for the language A* is given by the following production rules, where ε denotes the empty string:
S → ε
S → aS
S → bS
S → cS
• How do we know that this grammar describes the language A*? Every string of the language must be derivable using the grammar rules, and the rules must derive no string outside A*.
• Exercise: show that the string aacb is in A*.
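
One way to approach the exercise is to write out the derivation one rule at a time. The sketch below is only an illustration: the function name `derive` is an assumption, and it hard-codes the rule shapes S → xS (for each symbol x of the alphabet) and S → ε.

```python
def derive(word: str, alphabet: str = "abc") -> list[str]:
    """Return the sentential forms of a derivation of `word` from S.

    Applies S -> xS once per symbol x of the word, then S -> ε.
    """
    if any(ch not in alphabet for ch in word):
        raise ValueError("word contains a symbol outside the alphabet")
    steps = ["S"]
    for i in range(1, len(word) + 1):
        steps.append(word[:i] + "S")  # S -> xS with x = word[i - 1]
    steps.append(word)                # S -> ε removes the trailing S
    return steps


print(" ⇒ ".join(derive("aacb")))
# prints: S ⇒ aS ⇒ aaS ⇒ aacS ⇒ aacbS ⇒ aacb
```

The same construction works for any string over {a, b, c}, which is the "every string of A* is derivable" half of the argument; the other half holds because each rule only ever adds symbols from A.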

Example
• Consider the grammar G = <N, Σ, P, S> = <{S, A, B}, {a, b}, P, S>, where P consists of the rules S → AB, A → ε | aA, B → ε | bB (ε denotes the empty string).
• Let us derive the string aab:
S ⇒ AB ⇒ aAB ⇒ aaAB ⇒ aaB ⇒ aabB ⇒ aab.

• Note that a language can have more than one grammar, so we should not be surprised when two people come up with two different grammars for the same language.
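
To get a feel for which strings such a grammar produces, one can enumerate short derivations by brute force. The following is a sketch under stated assumptions: the rule table `RULES` encodes S → AB, A → ε | aA, B → ε | bB with one character per symbol, the function name `generate` is invented here, and the pruning bound `max_len + 2` is only adequate because this grammar never has more than two non-terminals in a sentential form.

```python
from collections import deque

# S -> AB, A -> ε | aA, B -> ε | bB ("" stands for ε)
RULES = {"S": ["AB"], "A": ["", "aA"], "B": ["", "bB"]}


def generate(rules, start="S", max_len=3):
    """Collect all terminal strings of length <= max_len derivable from start.

    Breadth-first search over leftmost derivations. Forms with too many
    terminals, or longer than max_len + 2 symbols, are discarded.
    """
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if any.
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            found.add(form)  # only terminals left: a string of the language
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            if terminals <= max_len and len(new) <= max_len + 2 and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


print(sorted(generate(RULES), key=lambda w: (len(w), w)))
```

Every string it finds consists of some a's followed by some b's, which suggests (but does not prove) that the grammar generates a*b*.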

Combining grammars
Suppose M and N are languages whose grammars have disjoint sets of non-terminals. Suppose also that the start symbols for the grammars of M and N are A and B respectively. We can then obtain grammars for the following new languages, where S is a new start symbol and ε denotes the empty string:

Union Rule: the grammar for the language M ∪ N starts with the production rule
S → A | B.

Product Rule: the grammar for the language M ∙ N starts with the production
S → AB.

Closure Rule: the grammar for the language M* starts with the production
S → AS | ε.
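
The three rules translate almost directly into code. In this sketch a grammar is represented as a pair `(start, rules)`, where `rules` maps each non-terminal to a list of right-hand sides and each right-hand side is a list of symbols (the empty list stands for ε). This representation and the function names are assumptions for illustration only.

```python
def _require_fresh(new_start, *rule_sets):
    """Non-terminal names must be disjoint and the new start symbol unused."""
    names = [new_start] + [nt for rules in rule_sets for nt in rules]
    if len(names) != len(set(names)):
        raise ValueError("non-terminals must be disjoint and the start symbol new")


def union(g1, g2, new_start="S"):
    (a, r1), (b, r2) = g1, g2
    _require_fresh(new_start, r1, r2)
    return new_start, {new_start: [[a], [b]], **r1, **r2}      # S -> A | B


def product(g1, g2, new_start="S"):
    (a, r1), (b, r2) = g1, g2
    _require_fresh(new_start, r1, r2)
    return new_start, {new_start: [[a, b]], **r1, **r2}        # S -> AB


def closure(g, new_start="S"):
    a, r = g
    _require_fresh(new_start, r)
    return new_start, {new_start: [[a, new_start], []], **r}   # S -> AS | ε


# Hypothetical inputs: M = {a^n : n >= 1}, N = {b^n : n >= 1}.
M = ("A", {"A": [["a"], ["a", "A"]]})
N = ("B", {"B": [["b"], ["b", "B"]]})
print(union(M, N))
print(product(M, N))
print(closure(M))
```

The disjointness check matters: if both grammars used the same non-terminal name, merging their rule tables would silently mix the two sets of productions.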

Context-free languages
• A language is said to be context-free if it is generated by a context-free grammar (CFG).
• A grammar G = <N, Σ, P, S> is context-free if every production rule has the form A → α, where A ∈ N and α ∈ (N ∪ Σ)*.
• Unlike in regular grammars, the right-hand side of a production rule in a CFG is unrestricted and can be any combination of terminals and non-terminals.
• The regular languages (RLs) form a proper subset of the context-free languages (CFLs).
• Things that cannot be expressed by a regular grammar but are needed when parsing CFLs:
  - Palindromes.
  - Balanced brackets.
  - Counting (e.g., matching n a's with n b's).
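
Each of the three items has a small context-free grammar, and a recursive function can mirror the grammar rule by rule. The grammars chosen below (S → aSb | ε for counting, S → aSa | bSb | a | b | ε for palindromes over {a, b}, S → (S)S | ε for brackets) are common textbook choices rather than ones given in the lecture, and the function names are hypothetical.

```python
def derives_anbn(w: str) -> bool:
    """Counting: S -> aSb | ε."""
    if w == "":
        return True  # S -> ε
    return len(w) >= 2 and w[0] == "a" and w[-1] == "b" and derives_anbn(w[1:-1])


def derives_palindrome(w: str) -> bool:
    """Palindromes over {a, b}: S -> aSa | bSb | a | b | ε."""
    if len(w) <= 1:
        return w in ("", "a", "b")
    return w[0] in "ab" and w[0] == w[-1] and derives_palindrome(w[1:-1])


def _parse_brackets(w: str, i: int) -> int:
    """S -> (S)S | ε: return the index just past the prefix of w[i:] matched by S."""
    if i < len(w) and w[i] == "(":
        j = _parse_brackets(w, i + 1)        # inner S
        if j >= len(w) or w[j] != ")":
            raise ValueError("unmatched '('")
        return _parse_brackets(w, j + 1)     # trailing S
    return i                                 # S -> ε


def is_balanced(w: str) -> bool:
    """Balanced brackets: the whole string must be matched by S."""
    try:
        return _parse_brackets(w, 0) == len(w)
    except ValueError:
        return False


print(derives_anbn("aaabbb"), derives_palindrome("abba"), is_balanced("(()())"))
# prints: True True True
```

In all three cases the recursion has to remember an unbounded amount of information (how many a's, or which symbols were seen), which is the intuition for why a finite automaton is not enough.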
CFG
• A context-free grammar is a notation for defining context-free languages.
• It is more powerful than finite automata or REs, but still cannot define all possible languages.
• Useful for nested structures, e.g., parentheses in programming languages.
• The basic idea is to use "variables" (non-terminals) to stand for sets of strings.
• These variables are defined recursively, in terms of one another.
CFG
• A CFG is used to generate the strings belonging to a CFL.
• Each production has the form A → w, where A is a non-terminal and w is a string of terminals and non-terminals.
• Any non-terminal can be replaced by the right-hand side of any of its productions at any point in a derivation.
• Language of a CFG: the set of strings of terminals that can be derived from its start symbol.
• Pushdown automata (PDAs) are the automata capable of accepting the languages defined by CFGs.
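
To hint at how a stack helps, here is a small stack-based recognizer for {a^n b^n : n ≥ 0}. It is an informal sketch in the spirit of a pushdown automaton, not a formal PDA definition; the function name, the state names, and the stack symbol "A" are assumptions.

```python
def accepts_anbn(w: str) -> bool:
    """Accept exactly the strings a^n b^n, n >= 0, using one stack."""
    stack = []
    state = "push"                 # "push": reading a's, "pop": reading b's
    for ch in w:
        if state == "push" and ch == "a":
            stack.append("A")      # remember one a
        elif ch == "b" and stack:
            state = "pop"          # after the first b, no more a's are allowed
            stack.pop()            # match this b against a remembered a
        else:
            return False           # a after b, b with nothing to match, or other symbol
    return not stack               # accept only if every a was matched


for word in ["", "ab", "aabb", "aab", "abab", "ba"]:
    print(repr(word), accepts_anbn(word))
```

The finite control only needs two states; the unbounded part of the memory lives on the stack, which is what a finite automaton lacks.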
CFGs: Alternate Definition
Many textbooks use different symbols and terms to describe CFGs:
G = (V, Σ, P, S)
V = variables, a finite set
Σ = alphabet or terminals, a finite set
P = productions, a finite set
S = start variable, S ∈ V

Productions have the form A → α, where A ∈ V and α ∈ (V ∪ Σ)*.
Definition: Context-Free Grammars
Grammar G = (V, T, S, P)
V = variables
T = terminal symbols
S = start variable
P = productions of the form A → x, where A is a variable and x is a string of variables and terminals
CSG
• A context-sensitive grammar (CSG) is a notation for defining context-sensitive languages.
• Each production has the form wAx → wyx,
  - where A is a non-terminal, w and x are strings of terminals and non-terminals, and y is a non-empty string of terminals and non-terminals.
• The productions give rules saying "if you see A in a given context, you may replace A by the string y".
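
The shape wAx → wyx can be tested by trying each non-terminal of the left-hand side as the candidate A. This sketch assumes the usual convention that y is a non-empty string of terminals and non-terminals, encodes each symbol as one character, and uses a hypothetical function name.

```python
def is_context_sensitive_rule(lhs: str, rhs: str, nonterminals: set) -> bool:
    """Return True if lhs -> rhs can be written as wAx -> wyx with y non-empty."""
    for i, sym in enumerate(lhs):
        if sym not in nonterminals:
            continue
        w, x = lhs[:i], lhs[i + 1:]  # context to the left and right of A
        # rhs must keep w and x in place and put at least one symbol between them.
        if len(rhs) > len(w) + len(x) and rhs.startswith(w) and rhs.endswith(x):
            return True
    return False


N = {"A", "B", "C"}
print(is_context_sensitive_rule("aAb", "aBCb", N))  # prints True  (w = a, x = b, y = BC)
print(is_context_sensitive_rule("A", "ab", N))      # prints True  (empty context)
print(is_context_sensitive_rule("AB", "BA", N))     # prints False (context not preserved)
```

The second call shows that a context-free rule with a non-empty right-hand side is the special case where both w and x are empty.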
