Syntax Analysis

The document discusses syntax analysis and parsing. It defines a syntax analyzer as creating the syntactic structure of a program by checking it against a context-free grammar and building a parse tree if it satisfies the grammar rules. A parser is also known as a syntax analyzer. Parsing involves determining if a string of tokens can be generated by a grammar using top-down or bottom-up parsing methods. Top-down parsing starts at the root and proceeds to the leaves, while bottom-up starts at the leaves and proceeds to the root. The document also discusses error handling strategies for parsers.

Uploaded by

bavana

SYNTAX ANALYSIS
Syntax Analyzer
• The syntax analyzer creates the syntactic structure of the given source program.
• This syntactic structure is a parse tree.
• The syntax analyzer is also known as the parser.
• The syntax analyzer (parser) checks whether a given source program satisfies the rules implied by a context-free grammar.
  - If it does, the parser creates the parse tree of that program.
  - Otherwise, the parser reports error messages.

INTRODUCTION
• Every programming language has precise rules that prescribe the syntactic structure of well-formed programs.
• A program is made up of functions, a function of declarations and statements, and a statement of expressions.
• The syntax of programming language constructs can be specified by context-free grammars.
• A context-free grammar
  - gives a precise syntactic specification of a programming language;
  - is designed in an initial phase of the design of a compiler;
  - can be converted directly into a parser by some tools.

Parser
• The parser works on a stream of tokens.
• The smallest item is a token.

                                     token
source program -> Lexical Analyzer -------> Parser -> parse tree
                                   <-------
                                get next token
Parsing
• Parsing is the process of determining whether a string of tokens can be generated by a grammar.
• Parsing methods:
  - Top-down
  - Bottom-up
• In top-down parsing, construction starts at the root and proceeds to the leaves.
• In bottom-up parsing, construction starts at the leaves and proceeds towards the root.
• Top-down parsers are easy to build by hand.
• Bottom-up parsers
  - can handle a larger class of grammars;
  - are not as easy to build, but tools for generating them directly from a grammar are available.
• Both top-down and bottom-up parsers scan the input from left to right, one symbol at a time.
Top-Down Parsing
• The parse tree is built by starting with the root, labeled with the starting nonterminal stmt, and repeatedly performing the following two steps:
  - At node N, labeled with nonterminal A, select one of the productions for A and construct children at N for the symbols in the production body.
  - Find the next node at which a subtree is to be constructed, typically the leftmost unexpanded nonterminal of the tree.
• The current terminal being scanned in the input is frequently referred to as the lookahead symbol.
Top-Down Parsing
• Top-down parsing is an attempt to find a left-most derivation for an input string.
• Example: find a derivation for w = cad with the grammar
    S -> cAd
    A -> ab | a

        S                  S                          S
      / | \              / | \                      / | \
     c  A  d            c  A  d     (backtrack)    c  A  d
                          / \                         |
                         a   b                        a
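The backtracking in this example can be sketched in a few lines of Python. This is only an illustration, assuming the grammar S -> cAd, A -> ab | a is stored in a dictionary; the names GRAMMAR, derive, and accepts are made up for the sketch. Each alternative is tried in order, and falling through to the next alternative is the backtrack step.

```python
# Grammar from the example: S -> cAd, A -> ab | a.
# Nonterminals are the dictionary keys; every other symbol is a terminal.
GRAMMAR = {
    "S": [["c", "A", "d"]],
    "A": [["a", "b"], ["a"]],
}


def derive(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Try each alternative in turn; moving on to the next one is the backtrack.
        for body in GRAMMAR[head]:
            for mid in derive(body, text, pos):
                yield from derive(rest, text, mid)
    elif pos < len(text) and text[pos] == head:
        yield from derive(rest, text, pos + 1)


def accepts(text):
    """True if the start symbol S derives exactly `text`."""
    return any(end == len(text) for end in derive(["S"], text, 0))


print(accepts("cad"))   # True: A -> ab fails on 'd', then A -> a succeeds
print(accepts("cabd"))  # True
print(accepts("cd"))    # False
```

A sketch like this would recurse forever on a left-recursive grammar, which is one reason left recursion has to be removed before top-down parsing.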
Predictive Parsing
• Recursive-descent parsing is a top-down method of syntax analysis in which a set of recursive procedures is used to process the input.
• Predictive parsing is a simple form of recursive descent that needs no backtracking.

Syntax Error Handling
• Goals in error handling:
  - Report the presence of errors clearly and accurately.
  - Recover from each error quickly enough to detect subsequent errors.
  - Add minimal overhead to the processing of correct programs.

Error-Recovery Strategies
• The simplest approach is for the parser to quit with an informative error message when it detects the first error.
• Recovery strategies:
  - Panic-mode recovery
  - Phrase-level recovery
  - Error productions
  - Global correction
Panic-Mode Recovery
• The parser discards input symbols one at a time until one of a designated set of synchronizing tokens is found.
• The synchronizing tokens are usually delimiters, such as ; or }.
• It often skips a considerable amount of input without checking it for additional errors.
• It has the advantage of simplicity and is guaranteed not to go into an infinite loop.
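As a rough illustration of the skipping step, the sketch below assumes the token stream is a Python list and that the synchronizing set is {";", "}"}. The names SYNC_TOKENS and panic_mode_skip are hypothetical, and consuming the synchronizing token itself is a design choice made for the sketch, not something the slides specify.

```python
SYNC_TOKENS = {";", "}"}  # assumed synchronizing set (the delimiters named above)


def panic_mode_skip(tokens, pos):
    """Discard tokens from `pos` until a synchronizing token is found.

    Returns the index just past the synchronizing token, or len(tokens)
    if the input runs out first.
    """
    while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
        pos += 1  # discard one input symbol
    return min(pos + 1, len(tokens))


tokens = ["x", "=", "+", "*", "3", ";", "y", "=", "1", ";"]
resume = panic_mode_skip(tokens, 2)  # suppose the error is detected at the stray "+"
print(tokens[resume:])               # ['y', '=', '1', ';']
```

Because `pos` only ever increases, the loop always terminates, which matches the guarantee that panic mode cannot loop forever.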
Phrase-Level Recovery
• The parser performs a local correction on the remaining input: it may replace a prefix of the remaining input by some string that allows the parser to continue.
• Typical local corrections:
  - Replace a comma by a semicolon.
  - Delete an extraneous semicolon.
  - Insert a missing semicolon.
• Its disadvantage is the difficulty of coping with situations in which the actual error occurred before the point of detection.
Error Productions
• Expand the grammar for the language at hand with productions that generate the erroneous constructs.
• The parser can then generate appropriate error diagnostics about the erroneous construct that has been recognized in the input.

Global Correction
• Ideally, the compiler makes as few changes as possible in processing an incorrect input string.
• Given an incorrect input string x and grammar G, these algorithms find a parse tree for a related string y, such that the number of insertions, deletions, and changes of tokens required to transform x into y is as small as possible.
• These methods are generally too costly in time and space, so they are not implemented in practice.
Syntax Definition
• A grammar describes the hierarchical structure of programming language constructs.
• E.g.: if ( expression ) statement else statement
• An if-else statement is the concatenation of the keyword if, an opening parenthesis, an expression, a closing parenthesis, a statement, the keyword else, and another statement.
• stmt -> if ( expr ) stmt else stmt
• Such a rule is called a production.
• In a production, lexical elements such as the keyword if and the parentheses are called terminals.
• Variables like expr and stmt are called nonterminals.

A Context Free Grammar
• A context-free grammar has four components:
  - A set of terminal symbols, sometimes referred to as "tokens."
  - A set of nonterminals, sometimes called "syntactic variables."
  - A set of productions, where each production consists of a nonterminal, called the head or left side of the production, an arrow, and a sequence of terminals and/or nonterminals, called the body or right side of the production.
  - A designation of one of the nonterminals as the start symbol.
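One possible way to hold the four components in code is sketched below. The class name Grammar, its field names, and the method is_well_formed are assumptions made for illustration. The production stmt -> if ( expr ) stmt else stmt is the one from the slides; the two short productions are hypothetical additions so that every nonterminal has a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # tokens
    nonterminals: frozenset   # syntactic variables
    productions: tuple        # (head, body) pairs; body is a tuple of symbols
    start: str                # designated start symbol

    def is_well_formed(self):
        """Check that the four components are consistent with each other."""
        symbols = self.terminals | self.nonterminals
        return (
            self.start in self.nonterminals
            and not (self.terminals & self.nonterminals)
            and all(
                head in self.nonterminals and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


if_grammar = Grammar(
    terminals=frozenset({"if", "(", ")", "else", "id", "other"}),
    nonterminals=frozenset({"stmt", "expr"}),
    productions=(
        ("stmt", ("if", "(", "expr", ")", "stmt", "else", "stmt")),
        ("stmt", ("other",)),   # hypothetical, for illustration only
        ("expr", ("id",)),      # hypothetical, for illustration only
    ),
    start="stmt",
)

print(if_grammar.is_well_formed())  # True
```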


Notational Conventions
These symbols are terminals:
• Lowercase letters early in the alphabet, such as a, b, c.
• Operator symbols such as +, *, and so on.
• Punctuation symbols such as parentheses, comma, and so on.
• The digits 0, 1, ..., 9.
• Boldface strings such as id or if, each of which represents a single terminal symbol.

Notational Conventions
These symbols are nonterminals:
• Uppercase letters early in the alphabet, such as A, B, C.
• The letter S, which, when it appears, is usually the start symbol.
• Lowercase, italic names such as expr or stmt.
• Uppercase letters may be used to represent nonterminals for the constructs. For example, nonterminals for expressions, terms, and factors are often represented by E, T, and F, respectively.
Notational Conventions
• Uppercase letters late in the alphabet, such as X, Y, Z, represent grammar symbols; that is, either nonterminals or terminals.
• Lowercase letters late in the alphabet, chiefly u, v, ..., z, represent (possibly empty) strings of terminals.
• Lowercase Greek letters, α, β, γ for example, represent (possibly empty) strings of grammar symbols.
• A set of productions A -> α1, A -> α2, ..., A -> αk with a common head A (call them A-productions) may be written A -> α1 | α2 | ... | αk. Call α1, α2, ..., αk the alternatives for A.
• Unless stated otherwise, the head of the first production is the start symbol.
Derivations
• E => E+E : E+E derives from E.
• E => E+E => id+E => id+id
• A sequence of replacements of non-terminal symbols is called a derivation of id+id from E.
• αAβ => αγβ if there is a production rule A -> γ in our grammar and α and β are arbitrary strings of terminal and non-terminal symbols.
• α1 => α2 => ... => αn (αn derives from α1, or α1 derives αn)
• =>  : derives in one step
• =>* : derives in zero or more steps
• =>+ : derives in one or more steps
CFG - Terminology
• L(G) is the language of G (the language generated by G), which is a set of sentences.
• A sentence of L(G) is a string of terminal symbols of G.
• If S is the start symbol of G, then ω is a sentence of L(G) iff S =>* ω, where ω is a string of terminals of G.
• If G is a context-free grammar, L(G) is a context-free language.
• Two grammars are equivalent if they produce the same language.
• If S =>* α, then:
  - if α contains non-terminals, it is called a sentential form of G;
  - if α does not contain non-terminals, it is called a sentence of G.
Derivation Example
• E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
• E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)
• At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement.
• If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.
• If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
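The two derivations of -(id+id) can be replayed mechanically. The sketch below assumes the productions E -> E+E | -E | (E) | id, inferred from the derivation steps shown; the names PRODUCTIONS, step, derive, and BODIES are made up for the illustration. The same list of production bodies is applied twice, once always replacing the left-most E and once always replacing the right-most E.

```python
# Productions assumed from the derivation example: E -> E+E | -E | (E) | id
PRODUCTIONS = {"E": [["E", "+", "E"], ["-", "E"], ["(", "E", ")"], ["id"]]}


def step(form, body, leftmost=True):
    """Replace the left-most (or right-most) nonterminal in `form` by `body`."""
    positions = [i for i, sym in enumerate(form) if sym in PRODUCTIONS]
    if not positions:
        raise ValueError("no nonterminal left to replace")
    i = positions[0] if leftmost else positions[-1]
    if body not in PRODUCTIONS[form[i]]:
        raise ValueError(f"{form[i]} -> {' '.join(body)} is not a production")
    return form[:i] + body + form[i + 1:]


def derive(bodies, leftmost=True):
    """Apply the production bodies in order and return every sentential form."""
    form = ["E"]
    trace = ["E"]
    for body in bodies:
        form = step(form, body, leftmost)
        trace.append("".join(form))
    return trace


BODIES = [["-", "E"], ["(", "E", ")"], ["E", "+", "E"], ["id"], ["id"]]
print(" => ".join(derive(BODIES, leftmost=True)))
print(" => ".join(derive(BODIES, leftmost=False)))
```

The two traces differ only in the fourth sentential form: -(id+E) for the left-most derivation and -(E+id) for the right-most one.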
Left-Most and Right-Most Derivations
Left-Most Derivation

    E =>lm -E =>lm -(E) =>lm -(E+E) =>lm -(id+E) =>lm -(id+id)

Right-Most Derivation

    E =>rm -E =>rm -(E) =>rm -(E+E) =>rm -(E+id) =>rm -(id+id)

• We will see that top-down parsers try to find the left-most derivation of the given source program.
• We will see that bottom-up parsers try to find the right-most derivation of the given source program in reverse order.
Parse Trees and Derivations
• A parse tree is a graphical representation of a derivation that filters out the order in which productions are applied to replace nonterminals.
• Each interior node represents the application of a production and is labeled with the nonterminal A in the head of that production.
• The children of the node are labeled, from left to right, by the symbols in the body of the production.
• The leaves of a parse tree are labeled by nonterminals or terminals and, read from left to right, constitute a sentential form, called the yield or frontier of the tree.
• There is a many-to-one relationship between derivations and parse trees.
Ambiguity
• A grammar that produces more than one parse tree for some sentence is said to be ambiguous.
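To make the definition concrete, the sketch below counts parse trees for a token string. It assumes the classic ambiguous expression grammar E -> E+E | E*E | id, which is not taken from the surviving slide text, and the function name count_parse_trees is hypothetical. A count above 1 for any sentence shows that the grammar is ambiguous.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for `tokens` under the assumed grammar E -> E+E | E*E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct parse trees deriving tokens[i:j] from E.
        total = 1 if j - i == 1 and tokens[i] == "id" else 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the operator at the root: E -> E op E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id"]))             # 1
print(count_parse_trees(["id", "+", "id", "*", "id"]))  # 2
```

The two trees for id + id * id correspond to grouping it as (id + id) * id or as id + (id * id).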
Writing a Grammar
• Grammars are capable of describing most, but not all, of the syntax of programming languages.
• A grammar should be unambiguous.
• Left-recursion elimination and left factoring are useful for rewriting grammars.
• From the resulting grammar we can create top-down parsers without backtracking.
• Such parsers are called predictive parsers; they are a special case of recursive-descent parsers.
Eliminating Ambiguity
• An ambiguous grammar can sometimes be rewritten to eliminate the ambiguity.
• The grammar

    stmt -> if expr then stmt
          | if expr then stmt else stmt
          | other

  is ambiguous, since the string

    if E1 then if E2 then S1 else S2

  has two parse trees.

[Figure: two parse trees for an ambiguous sentence]

Eliminating Ambiguity
• The general rule is, "Match each else with the closest unmatched then."
Left Recursion
• A grammar is left recursive if it has a non-terminal A such that there is a derivation

    A =>+ Aα    for some string α

• Top-down parsing techniques cannot handle left-recursive grammars.
• The left recursion may appear in a single step of the derivation (immediate left recursion), or may appear in more than one step of the derivation.
Immediate Left-Recursion

    A -> Aα | β        where β does not start with A

  eliminate immediate left recursion:

    A  -> β A’
    A’ -> α A’ | ε     an equivalent grammar

In general,

    A -> Aα1 | ... | Aαm | β1 | ... | βn      where β1 ... βn do not start with A

  eliminate immediate left recursion:

    A  -> β1 A’ | ... | βn A’
    A’ -> α1 A’ | ... | αm A’ | ε             an equivalent grammar
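The general rule translates almost directly into code. In the sketch below a production body is a list of symbols and the empty list stands for ε; the function name eliminate_immediate and the convention of naming the new nonterminal by appending an apostrophe are assumptions for illustration. It also assumes no alternative is just A itself.

```python
def eliminate_immediate(head, bodies):
    """Remove immediate left recursion from the productions of `head`.

    `bodies` is a list of bodies, each a list of symbols; [] stands for ε.
    Returns a dict mapping each nonterminal to its new list of bodies.
    """
    alphas = [b[1:] for b in bodies if b and b[0] == head]   # A -> A α_i
    betas = [b for b in bodies if not b or b[0] != head]     # A -> β_j
    if not alphas:
        return {head: bodies}          # nothing to do
    new = head + "'"
    return {
        head: [beta + [new] for beta in betas],              # A  -> β_j A'
        new: [alpha + [new] for alpha in alphas] + [[]],     # A' -> α_i A' | ε
    }


# E -> E + T | T
print(eliminate_immediate("E", [["E", "+", "T"], ["T"]]))
# {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], []]}
```

The result reads E -> T E', E' -> + T E' | ε, which is the usual rewritten form for the expression grammar.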
Left-Recursion -- Problem
• A grammar may not be immediately left-recursive and still be left-recursive.

    S -> Aa | b
    A -> Sc | d

  This grammar is not immediately left-recursive, but it is still left-recursive:

    S => Aa => Sca    or
    A => Sc => Aac    causes a left recursion
Eliminate Left-Recursion -- Algorithm
- Arrange non-terminals in some order: A1 ... An
- for i from 1 to n do {
    - for j from 1 to i-1 do {
        replace each production
            Ai -> Aj γ
        by
            Ai -> α1 γ | ... | αk γ
        where Aj -> α1 | ... | αk
      }
    - eliminate immediate left-recursions among Ai productions
  }
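A minimal Python sketch of this algorithm follows, applied to the grammar S -> Aa | b, A -> Ac | Sd | f with the order S, A. The data layout (a dict of lists of symbol lists, [] for ε) and the function names are assumptions made for the illustration, and the helper repeats the immediate-left-recursion rule so the block stands on its own.

```python
def eliminate_immediate(head, bodies):
    """A -> Aα | β  becomes  A -> βA', A' -> αA' | ε   ([] stands for ε)."""
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: bodies}
    new = head + "'"
    return {head: [b + [new] for b in betas],
            new: [a + [new] for a in alphas] + [[]]}


def eliminate_left_recursion(grammar, order):
    """Run the algorithm over the nonterminals in the given order."""
    g = {head: [list(b) for b in bodies] for head, bodies in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for body in g[ai]:
                if body and body[0] == aj:
                    # Replace Ai -> Aj γ by Ai -> α1 γ | ... | αk γ
                    expanded += [alpha + body[1:] for alpha in g[aj]]
                else:
                    expanded.append(body)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g


grammar = {
    "S": [["A", "a"], ["b"]],
    "A": [["A", "c"], ["S", "d"], ["f"]],
}
for head, bodies in eliminate_left_recursion(grammar, ["S", "A"]).items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

With the order S, A the result is S -> A a | b, A -> b d A' | f A', A' -> c A' | a d A' | ε. A different ordering of the nonterminals can give a different but equivalent grammar.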
Eliminate Left-Recursion

    S -> Aa | b
    A -> Ac | Sd | f

    E -> E+T | T
    T -> T*F | F
    F -> ( E ) | id

• The expression grammar for E, T, and F can be rewritten as the following non-left-recursive grammar:

    E  -> T E’
    E’ -> + T E’ | ε
    T  -> F T’
    T’ -> * F T’ | ε
    F  -> ( E ) | id
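Left Factoring
• Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive, or top-down, parsing.
• In

    stmt -> if expr then stmt else stmt
          | if expr then stmt

  both alternatives begin with if expr then stmt, so on seeing the token if we cannot tell which production to choose.
• In general, if A -> αβ1 | αβ2 are two A-productions and the input begins with a string derived from α, we do not know whether to expand A to αβ1 or to αβ2.
• So it should be left factored as

    A  -> αA’
    A’ -> β1 | β2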
Left-Factoring -- Algorithm
• For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix α:

    A -> αβ1 | ... | αβn | γ1 | ... | γm

  convert it into

    A  -> αA’ | γ1 | ... | γm
    A’ -> β1 | ... | βn
Left-Factoring – Example1

    A -> abB | aB | cdg | cdeB | cdfB

  (factor the common prefix a)

    A  -> aA’ | cdg | cdeB | cdfB
    A’ -> bB | B

  (factor the common prefix cd)

    A   -> aA’ | cdA’’
    A’  -> bB | B
    A’’ -> g | eB | fB

Left-Factoring – Example2

    A -> ad | a | ab | abc | b

  (factor the common prefix a)

    A  -> aA’ | b
    A’ -> d | ε | b | bc

  (factor the common prefix b in A’)

    A   -> aA’ | b
    A’  -> d | ε | bA’’
    A’’ -> ε | c
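The repeated factoring in these examples can be automated. The sketch below is one possible approach, not the only one: it repeatedly factors the longest prefix shared by at least two alternatives, so the fresh nonterminals may be introduced in a different order than in the hand-worked examples. The names common_prefix and left_factor are made up, and each character of a body string is treated as one grammar symbol.

```python
def common_prefix(a, b):
    """Longest common prefix of two tuples of symbols."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(grammar):
    """Left-factor every nonterminal until no two alternatives share a prefix."""
    g = {head: [tuple(b) for b in bodies] for head, bodies in grammar.items()}
    work = list(g)
    while work:
        head = work.pop(0)
        bodies = g[head]
        best = ()
        for i in range(len(bodies)):
            for j in range(i + 1, len(bodies)):
                prefix = common_prefix(bodies[i], bodies[j])
                if len(prefix) > len(best):
                    best = prefix
        if not best:
            continue                      # nothing to factor for this head
        new = head + "'"
        while new in g:                   # pick an unused name: A', A'', ...
            new += "'"
        n = len(best)
        g[new] = [b[n:] for b in bodies if b[:n] == best]
        g[head] = [best + (new,)] + [b for b in bodies if b[:n] != best]
        work += [head, new]               # both may need further factoring
    return g


# Example 1; each character is one grammar symbol.
example1 = {"A": ["abB", "aB", "cdg", "cdeB", "cdfB"]}
for head, bodies in left_factor(example1).items():
    print(head, "->", " | ".join("".join(b) or "ε" for b in bodies))
```

For Example 1 this yields A -> aA'' | cdA', A' -> g | eB | fB, A'' -> bB | B, which is the hand-worked result with the names A' and A'' swapped because the longer prefix cd is factored first.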
Top-Down Parsing
• The parse tree is created top to bottom.
• Top-down parsers:
  - Recursive-descent parsing
    - Backtracking may be needed.
    - It is a general parsing technique, but not widely used.
    - Not efficient.
  - Predictive parsing
    - No backtracking.
    - Efficient.
    - Needs a special form of grammars (LL(1) grammars).
    - Recursive predictive parsing is a special form of recursive-descent parsing without backtracking.
    - A non-recursive (table-driven) predictive parser is also known as an LL(1) parser.
Recursive Predictive Parsing
Each non-terminal corresponds to a procedure.
Ex: A -> aBb
proc A {
    - match the current token with a, and move to the next token;
    - call B;
    - match the current token with b, and move to the next token;
}
Recursive Predictive Parsing (cont.)
A -> aBb | bAB
proc A {
    case of the current token {
        'a': - match the current token with a, and move to the next token;
             - call B;
             - match the current token with b, and move to the next token;
        'b': - match the current token with b, and move to the next token;
             - call A;
             - call B;
    }
}
Top-down parse for id + id * id
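A recursive predictive parser for this input can be sketched with one method per non-terminal. The sketch assumes the non-left-recursive expression grammar E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id and a token list ending in an end marker $; the class and method names are hypothetical.

```python
class Parser:
    """Recursive predictive parser: one procedure per non-terminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Match the current token with `expected` and move to the next token."""
        if self.lookahead != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.lookahead!r}")
        self.pos += 1

    def E(self):                 # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):           # E' -> + T E' | ε
        if self.lookahead == "+":
            self.match("+")
            self.T()
            self.E_prime()

    def T(self):                 # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):           # T' -> * F T' | ε
        if self.lookahead == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):                 # F -> ( E ) | id
        if self.lookahead == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")


def parse(tokens):
    """Return True if `tokens` is a sentence of the grammar, else raise SyntaxError."""
    parser = Parser(tokens)
    parser.E()
    parser.match("$")
    return True


print(parse(["id", "+", "id", "*", "id"]))  # True
```

Each procedure decides which production to use by looking only at the lookahead token, so no backtracking is needed.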
FIRST and FOLLOW
• FIRST and FOLLOW allow us to choose which production to apply, based on the next input symbol.
• FIRST(α), where α is any string of grammar symbols, is defined to be the set of terminals that begin strings derived from α.
  - If α =>* ε, then ε is also in FIRST(α).
  - For example, if A =>* cγ, then c is in FIRST(A).
• FOLLOW(A) is the set of terminals that can occur immediately after (follow) the non-terminal A in the strings derived from the start symbol.
  - A terminal a is in FOLLOW(A) if S =>* αAaβ.
  - $ is in FOLLOW(A) if S =>* αA.
FIRST
1. If X is a terminal, then FIRST(X) = {X}.
2. If X is a nonterminal and X -> Y1 Y2 ... Yk is a production for some k >= 1, then place a in FIRST(X) if for some i, a is in FIRST(Yi), and ε is in all of FIRST(Y1), ..., FIRST(Yi-1); that is, Y1 Y2 ... Yi-1 =>* ε. If ε is in FIRST(Yj) for all j = 1, 2, ..., k, then add ε to FIRST(X). For example, everything in FIRST(Y1) is surely in FIRST(X). If Y1 does not derive ε, then we add nothing more to FIRST(X), but if Y1 =>* ε, then we add FIRST(Y2), and so on.
3. If X -> ε is a production, then add ε to FIRST(X).
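The three rules can be applied repeatedly until no FIRST set changes. The sketch below is a minimal fixed-point computation, assuming the non-left-recursive expression grammar with [] standing for an ε-body; the names EPS, GRAMMAR, first_of_string, and compute_first are made up for the illustration.

```python
EPS = "ε"

# E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id
GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string Y1 Y2 ... Yk, given the FIRST sets known so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}   # rule 1 for terminals
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result            # Yi cannot vanish, so stop here
    result.add(EPS)                  # every Yi can derive ε (or the body is empty)
    return result


def compute_first(grammar):
    """Apply rules 2 and 3 until no FIRST set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


for nt, symbols in compute_first(GRAMMAR).items():
    print(f"FIRST({nt}) =", sorted(symbols))
```

For this grammar FIRST(E), FIRST(T), and FIRST(F) all come out as {(, id}, while FIRST(E') = {+, ε} and FIRST(T') = {*, ε}.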

FOLLOW
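The rules for this slide did not survive in the text, so the sketch below falls back on the standard textbook construction, stated here as an assumption: put $ in FOLLOW(S); for a production A -> αBβ, add FIRST(β) except ε to FOLLOW(B); and if β is empty or derives ε, add FOLLOW(A) to FOLLOW(B). The grammar is the non-left-recursive expression grammar, and all function names are hypothetical.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],          # [] stands for ε
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}


def first_of(symbols, first):
    """FIRST of a string of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}


def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue                 # only nonterminals have FOLLOW
                    tail = first_of(body[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:              # the rest of the body can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


for nt, symbols in follow_sets(GRAMMAR, "E").items():
    print(f"FOLLOW({nt}) =", sorted(symbols))
```

For this grammar the sketch gives FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $}, and FOLLOW(F) = {+, *, ), $}.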
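LL(1) Grammars
• L: scanning the input from left to right
• L: producing a leftmost derivation
• 1: one input symbol of lookahead at each step
• A grammar G is LL(1) if and only if whenever A -> α | β are two distinct productions of G, the following conditions hold:
  1. For no terminal a do both α and β derive strings beginning with a.
  2. At most one of α and β can derive the empty string.
  3. If β =>* ε, then α does not derive any string beginning with a terminal in FOLLOW(A). Likewise, if α =>* ε, then β does not derive any string beginning with a terminal in FOLLOW(A).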
Construction of a predictive parsing table
