Motivation For Formal Grammars

The document discusses context-free grammars and their use in describing formal languages. A context-free grammar consists of variables, terminals, productions, and a start variable. Grammars define languages by specifying valid string derivations. Parse trees can represent derivations and the syntactic structure of strings in a language. Parsing involves determining if a string is in the language generated by a grammar by attempting to derive it and build a parse tree.

Uploaded by

Jashwanth Kumar
Copyright
© Attribution Non-Commercial (BY-NC)

Motivation for Formal Grammars

Natural languages are usually described by rules like the ones shown below.
Example:
(1) sentence → noun-phrase verb-phrase .
(2) noun-phrase → article noun
(3) article → a | the
(4) noun → girl | dog
(5) verb-phrase → verb noun-phrase
(6) verb → sees | pets
Grammars Produce Languages

• Language: the set of strings (of terminals) that can be generated from the start symbol by derivation:
sentence
⇒ noun-phrase verb-phrase . (rule 1)
⇒ article noun verb-phrase . (rule 2)
⇒ the noun verb-phrase . (rule 3)
⇒ the girl verb-phrase . (rule 4)
⇒ the girl verb noun-phrase . (rule 5)
⇒ the girl sees noun-phrase . (rule 6)
⇒ the girl sees article noun . (rule 2)
⇒ the girl sees a noun . (rule 3)
⇒ the girl sees a dog . (rule 4)
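
Because this toy grammar has no recursion, its whole language can be enumerated. The following is a minimal Python sketch, not part of the original slides; the dictionary encoding and the helper name `expand` are assumptions made for illustration, and the trailing "." in rule (1) is taken from the derivation above.

```python
from itertools import product

# Rules (1)-(6); each alternative is a list of symbols.
# Assumption: rule (1) ends with the terminal ".", as the derivation shows.
GRAMMAR = {
    "sentence": [["noun-phrase", "verb-phrase", "."]],
    "noun-phrase": [["article", "noun"]],
    "article": [["a"], ["the"]],
    "noun": [["girl"], ["dog"]],
    "verb-phrase": [["verb", "noun-phrase"]],
    "verb": [["sees"], ["pets"]],
}


def expand(symbol):
    """Return every terminal string (a tuple of tokens) derivable from symbol."""
    if symbol not in GRAMMAR:  # a terminal derives only itself
        return [(symbol,)]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Combine every choice for each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in alternative)):
            results.append(tuple(tok for part in parts for tok in part))
    return results


language = {" ".join(tokens) for tokens in expand("sentence")}
print(len(language))                      # 32
print("the girl sees a dog ." in language)  # True
```

This only terminates because no nonterminal can reach itself; a recursive grammar generates an infinite language and needs a bounded search instead.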
• Language: the programs (character streams) allowed
• Grammar rules (productions): "produce" the language; each has a left-hand side and a right-hand side
• Nonterminals (structured names): noun-phrase, verb-phrase
• Terminals (tokens): ".", dog
• Metasymbols: → ("consists of"), | (choice)
• Start symbol: the nonterminal that stands for the entire structure (sentence, program).
  – sentence
• E.g., if-statement → if (expression) statement else statement
Context-Free Grammar

• Context-Free Grammars (CFG)
  – Noam Chomsky, 1950s.
  – Define context-free languages.
  – Four components: terminals, nonterminals, one start symbol, productions (left-hand side: one single nonterminal)
CFGs

A context-free grammar (CFG, or just grammar) is a 4-tuple denoted G = (V, T, P, S), where

• V is a finite set of variables (also called nonterminals or syntactic categories)
• T is a finite set of terminals (tokens)
• P is a finite set of production rules of the form A → α, where A is a variable and α is a string of symbols from (V ∪ T)*
• S is a special variable called the start symbol
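
The 4-tuple maps directly onto a small data structure. The sketch below is one possible Python encoding, not taken from the slides; the class name `CFG`, its field names, and the sample grammar are hypothetical, and the requirement that V and T be disjoint is the usual textbook convention rather than something the definition above states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P: pairs (A, alpha), alpha a tuple of symbols
    start: str             # S

    def __post_init__(self):
        # Assumed convention: V and T share no symbols.
        if self.variables & self.terminals:
            raise ValueError("V and T must be disjoint")
        if self.start not in self.variables:
            raise ValueError("S must be a variable")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            # Context-free: the left-hand side is exactly one variable.
            if head not in self.variables:
                raise ValueError(f"left-hand side {head!r} is not a variable")
            # The right-hand side is a string over (V ∪ T)*.
            if any(sym not in symbols for sym in body):
                raise ValueError(f"body of {head!r} uses an unknown symbol")


# A small hypothetical expression grammar.
g = CFG(
    variables=frozenset({"E"}),
    terminals=frozenset({"a", "+", "(", ")"}),
    productions=(
        ("E", ("a",)),
        ("E", ("E", "+", "E")),
        ("E", ("(", "E", ")")),
    ),
    start="E",
)
```

An empty body `()` is allowed by this check, which matches α being any string in (V ∪ T)*, including the empty string.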


What does "Context-Free" mean?
• The left-hand side of a production is always one single nonterminal:
  – The nonterminal is replaced by the corresponding right-hand side, no matter where the nonterminal appears (i.e., there is no context in such a replacement/derivation).
• Context-sensitive grammar (context-sensitive languages)
• Why context-free?
CFG Example 1
E → ID
  | NUM
  | E*E
  | E/E
  | E+E
  | E-E
  | (E)
ID → a | b | … | z
NUM → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Grammars Produce Languages
Does the above grammar produce the sentence (a*b)+c? Start with the start symbol of the CFG and build a leftmost derivation:

E ⇒ E+E
⇒ (E)+E
⇒ (E*E)+E
⇒ (ID*E)+E
⇒ (a*E)+E
⇒ (a*ID)+E
⇒ (a*b)+E
⇒ (a*b)+ID
⇒ (a*b)+c

Notations
$E \Rightarrow_{lm} E+E$ in $G$ (single-step leftmost derivation)
$E \Rightarrow^{*}_{lm} E+E$ (zero or more step derivation)
Example: $E \Rightarrow^{*}_{G} (a*b)+c$
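
A leftmost derivation like this one can be replayed mechanically: at each step, find the leftmost variable and replace it with the body of one of its productions. The Python sketch below illustrates this under two assumptions: the helper name `leftmost_step` is made up here, and the "…" in the ID rule is read as the lowercase letters a through z.

```python
import string

# CFG Example 1, with each body written as a list of symbols.
PRODUCTIONS = {
    "E": [["ID"], ["NUM"], ["E", "*", "E"], ["E", "/", "E"],
          ["E", "+", "E"], ["E", "-", "E"], ["(", "E", ")"]],
    "ID": [[c] for c in string.ascii_lowercase],   # assumed meaning of a | b | … | z
    "NUM": [[d] for d in string.digits],
}


def leftmost_step(form, body):
    """Replace the leftmost variable of form with body, checking the production."""
    for i, sym in enumerate(form):
        if sym in PRODUCTIONS:
            if body not in PRODUCTIONS[sym]:
                raise ValueError(f"{sym} -> {' '.join(body)} is not a production")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variable left to expand")


# The production body chosen at each step of the derivation of (a*b)+c.
choices = [
    ["E", "+", "E"], ["(", "E", ")"], ["E", "*", "E"],
    ["ID"], ["a"], ["ID"], ["b"], ["ID"], ["c"],
]

form = ["E"]
trace = ["".join(form)]
for body in choices:
    form = leftmost_step(form, body)
    trace.append("".join(form))

print(" => ".join(trace))
```

The trace contains ten sentential forms, from E to (a*b)+c, one for each single-step derivation plus the start symbol.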
Context-Free Languages (CFLs)
The languages described by context-free grammars are known as CFLs.

Formal notation:
The language generated by G, denoted L(G), is
$L(G) = \{\, w \mid w \in T^{*} \text{ and } S \Rightarrow^{*}_{G} w \,\}$
That is, a string is in L(G) if:
1) the string consists solely of terminals
2) the string can be derived from S
A string $\alpha$ of terminals and variables is called a sentential form if $S \Rightarrow^{*}_{G} \alpha$.
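
For short strings, membership in L(G) can be tested by searching over sentential forms. The sketch below is a brute-force illustration for the grammar of CFG Example 1, not an efficient parser. It relies on two properties of that particular grammar: no production has an empty body, so forms never shrink, and the terminals to the left of the leftmost variable never change in a leftmost derivation. The function name `in_language` and the reading of "…" as the letters a through z are assumptions.

```python
from collections import deque
import string

PRODUCTIONS = {
    "E": [["ID"], ["NUM"], ["E", "*", "E"], ["E", "/", "E"],
          ["E", "+", "E"], ["E", "-", "E"], ["(", "E", ")"]],
    "ID": [[c] for c in string.ascii_lowercase],   # assumed meaning of a | b | … | z
    "NUM": [[d] for d in string.digits],
}


def in_language(target, start="E"):
    """Breadth-first search over leftmost derivations from the start symbol."""
    goal = tuple(target)            # one terminal per character
    queue = deque([(start,)])
    seen = {(start,)}
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in PRODUCTIONS), None)
        if idx is None:             # only terminals left: condition 1 holds
            if form == goal:
                return True
            continue
        if form[:idx] != goal[:idx]:
            continue                # terminal prefix already disagrees
        for body in PRODUCTIONS[form[idx]]:
            new = form[:idx] + tuple(body) + form[idx + 1:]
            # Forms never shrink, so anything longer than the goal is hopeless.
            if len(new) <= len(goal) and new not in seen:
                seen.add(new)
                queue.append(new)
    return False


print(in_language("(a*b)+c"))
print(in_language("a+"))
```

The length bound is what makes the search finite; for a grammar with empty productions this pruning would be unsound and a different method would be needed.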
CFG Example 2
S → if E then S else S
  | begin S L
  | print E
L → end
  | ; S L
E → NUM = NUM
NUM → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Parse Tree
• Represents the derivation steps from the start symbol to the string
• Given the derivations used in the parsing of an input sequence, a parse tree has
  – the start symbol as the root
  – the terminals of the input sequence as leaves
  – for each production A → X1 X2 ... Xn used in a derivation, a node A with children X1 X2 ... Xn
Parse Tree Example 1
CFG:
expr → expr + expr | expr * expr | (expr) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Input sequence: 3+4*5

Parse tree:
expr
├─ expr
│  └─ number
│     └─ digit
│        └─ 3
├─ +
└─ expr
   ├─ expr
   │  └─ number
   │     └─ digit
   │        └─ 4
   ├─ *
   └─ expr
      └─ number
         └─ digit
            └─ 5
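
The same tree can be written as a data structure, which makes the "terminals as leaves" property easy to check. This is a minimal Python sketch; the class name `Node` and the helper `operand` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)

    def leaves(self):
        """Return the leaf symbols from left to right."""
        if not self.children:
            return [self.symbol]
        return [leaf for child in self.children for leaf in child.leaves()]


def operand(d):
    """Subtree for expr → number → digit → d."""
    return Node("expr", [Node("number", [Node("digit", [Node(d)])])])


# Root uses expr → expr + expr; its right child uses expr → expr * expr.
tree = Node("expr", [
    operand("3"),
    Node("+"),
    Node("expr", [operand("4"), Node("*"), operand("5")]),
])

print("".join(tree.leaves()))  # 3+4*5
```

Reading the leaves from left to right gives back the input sequence, and the root is the start symbol expr.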
What is Parsing?

• Given a grammar and a token string:
  – determine whether the grammar can generate the token string,
  – i.e., whether the string is a legal program in the language.
• In other words, construct a parse tree for the token string.
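
As an illustration of the yes/no half of this question, here is a sketch of a recursive-descent recognizer for the statement grammar of CFG Example 2 (S, L, E, NUM). It is one common technique, not necessarily the one the slides have in mind, and it does not build the parse tree. The function name `recognize` and the convention that tokens are separated by spaces are assumptions.

```python
class ParseError(Exception):
    pass


DIGITS = set("0123456789")


def recognize(text):
    """Return True if the space-separated tokens form a statement S."""
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise ParseError(f"expected {expected!r}, found {peek()!r}")
        pos += 1

    def num():                      # NUM → 0 | 1 | ... | 9
        nonlocal pos
        if peek() not in DIGITS:
            raise ParseError(f"expected a digit, found {peek()!r}")
        pos += 1

    def expr():                     # E → NUM = NUM
        num(); eat("="); num()

    def stmt():                     # S → if ... | begin ... | print ...
        tok = peek()
        if tok == "if":
            eat("if"); expr(); eat("then"); stmt(); eat("else"); stmt()
        elif tok == "begin":
            eat("begin"); stmt(); stmt_list()
        elif tok == "print":
            eat("print"); expr()
        else:
            raise ParseError(f"unexpected token {tok!r}")

    def stmt_list():                # L → end | ; S L
        if peek() == "end":
            eat("end")
        else:
            eat(";"); stmt(); stmt_list()

    try:
        stmt()
    except ParseError:
        return False
    return pos == len(tokens)       # reject trailing tokens


print(recognize("begin print 1 = 1 ; print 2 = 3 end"))
print(recognize("print 1 ="))
```

Each nonterminal becomes one function, and the first token of every alternative decides which production to use, which is why this grammar suits the technique.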
What's significant about a parse tree?

• A parse tree gives a unique syntactic structure
• Leftmost, rightmost derivation
  – There is only one leftmost derivation for a parse tree, and symmetrically only one rightmost derivation for a parse tree.
Example
expr → expr + expr | expr * expr | ( expr ) | number
number → number digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Parse tree for 3+4*5, corresponding to a single leftmost derivation:
expr
├─ expr
│  └─ number
│     └─ digit
│        └─ 3
├─ +
└─ expr
   ├─ expr
   │  └─ number
   │     └─ digit
   │        └─ 4
   ├─ *
   └─ expr
      └─ number
         └─ digit
            └─ 5
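
The claim that a parse tree has exactly one leftmost derivation can be made concrete: repeatedly expand the leftmost interior node of the tree. The Python sketch below does this for the tree of 3+4*5; the nested-tuple encoding and the names `leftmost_derivation`, `render`, and `operand` are assumptions made for illustration.

```python
def render(form):
    """Show a sentential form; interior nodes print as their symbol."""
    return " ".join(n if isinstance(n, str) else n[0] for n in form)


def leftmost_derivation(tree):
    """tree: (symbol, [children]) for interior nodes, a plain str for leaves."""
    form = [tree]
    steps = [render(form)]
    while True:
        # The leftmost unexpanded interior node is the leftmost variable.
        idx = next((i for i, n in enumerate(form) if not isinstance(n, str)), None)
        if idx is None:
            return steps
        _symbol, children = form[idx]
        form[idx:idx + 1] = children   # apply the production used at this node
        steps.append(render(form))


def operand(d):
    """Subtree for expr → number → digit → d."""
    return ("expr", [("number", [("digit", [d])])])


tree = ("expr", [operand("3"), "+", ("expr", [operand("4"), "*", operand("5")])])
steps = leftmost_derivation(tree)
print(" => ".join(steps))
```

The result has twelve sentential forms, starting with expr, passing through 3 + expr * expr, and ending with 3 + 4 * 5. Choosing the rightmost interior node instead would give the unique rightmost derivation for the same tree.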
