Entrepreneurship Process

This chapter introduces the basics of building a simple one-pass compiler, including defining a grammar using BNF notation, performing syntax-directed translation during parsing, and using different parsing techniques like predictive parsing. It describes representing the syntax with a context-free grammar, associating attributes with grammar rules to define translations, and using parse trees to represent derivations. It also discusses handling operator precedence and associativity during translation.

Uploaded by

Mohsin Mine
Copyright
© All Rights Reserved

A Simple One-Pass Compiler

Chapter 2
Overview
• This chapter contains introductory material
• Building a simple compiler
– Syntax definition
– Syntax directed translation
– Predictive parsing
– Etc.
Overview
A programming language can be defined by describing:
1. The syntax of the language
   1. What its programs look like
   2. We use a CFG or BNF (Backus-Naur Form)
2. The semantics of the language
   1. What its programs mean
   2. Difficult to describe
   3. Use informal descriptions and suggestive examples
The Entire Compilation Process
• Grammars for Syntax Definition
• Syntax-Directed Translation
• Parsing - Top Down & Predictive
• Pulling Together the Pieces
• The Lexical Analysis Process
• Symbol Table Considerations
• A Brief Look at Code Generation
• Concluding Remarks/Looking Ahead
The Structure of our Compiler

[Figure: Character stream → Lexical analyzer → Token stream → Syntax-directed translator → Bytecode / object code.
The parser and code generator for the translator are developed from the syntax definition specification (grammar).]
Grammars for Syntax Definition
• A Context-free Grammar (CFG) Is Utilized to
Describe the Syntactic Structure of a Language
• A CFG Is Characterized By:
1. A Set of Tokens or Terminal Symbols
2. A Set of Non-terminals
3. A Set of Production Rules
Each Rule Has the Form
NT → {T, NT}*
4. A Non-terminal Designated As the Start Symbol
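
The four components can be written down directly as data. The following is a minimal Python sketch, not part of the original slides: the names `Grammar`, `is_well_formed`, and `LIST_GRAMMAR` are illustrative assumptions, and the grammar encoded is the list/digit example used in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A CFG as a 4-tuple (hypothetical representation for illustration)."""
    terminals: frozenset
    nonterminals: frozenset
    productions: tuple   # pairs (head, body); body is a tuple of symbols
    start: str

    def is_well_formed(self) -> bool:
        """Check that every rule has the shape NT -> {T, NT}*."""
        symbols = self.terminals | self.nonterminals
        return (
            self.start in self.nonterminals
            and not (self.terminals & self.nonterminals)
            and all(
                head in self.nonterminals and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


DIGITS = "0123456789"
LIST_GRAMMAR = Grammar(
    terminals=frozenset("+-") | frozenset(DIGITS),
    nonterminals=frozenset({"list", "digit"}),
    productions=(
        ("list", ("list", "+", "digit")),
        ("list", ("list", "-", "digit")),
        ("list", ("digit",)),
    ) + tuple(("digit", (d,)) for d in DIGITS),
    start="list",
)
```

An empty body `()` would stand for an ε-production in this representation.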
Parse Trees
• The root of the tree is labeled by the start symbol
• Each leaf of the tree is labeled by a terminal
(=token) or ε
• Each interior node is labeled by a non-terminal
• If A → X1 X2 … Xn is a production, then node A has
children X1, X2, …, Xn where Xi is a (non)terminal or
ε (ε denotes the empty string)
Grammars for Syntax Definition
Example CFG

list → list + digit
list → list - digit
list → digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
(the “|” means OR)
(So we could have written
list → list + digit | list - digit | digit )
Information
• A string of tokens is a sequence of zero or more tokens.
• The string containing zero tokens, written as ε, is called the
empty string.
• A grammar derives strings by beginning with the start symbol
and repeatedly replacing a non-terminal by the right side of a
production for that non-terminal.
• The token strings that can be derived from the start symbol form
the language defined by the grammar.
Grammars are Used to Derive Strings:
Using the CFG defined on the earlier slide, we can
derive the string: 9 - 5 + 2 as follows:
list ⇒ list + digit             P1 : list → list + digit
     ⇒ list - digit + digit     P2 : list → list - digit
     ⇒ digit - digit + digit    P3 : list → digit
     ⇒ 9 - digit + digit        P4 : digit → 9
     ⇒ 9 - 5 + digit            P4 : digit → 5
     ⇒ 9 - 5 + 2                P4 : digit → 2
Grammars are Used to Derive Strings:
This derivation could also be represented via a Parse Tree
(parents on left, children on right)

list ⇒ list + digit
     ⇒ list - digit + digit
     ⇒ digit - digit + digit
     ⇒ 9 - digit + digit
     ⇒ 9 - 5 + digit
     ⇒ 9 - 5 + 2

list
├── list
│   ├── list
│   │   └── digit
│   │       └── 9
│   ├── -
│   └── digit
│       └── 5
├── +
└── digit
    └── 2
Ambiguity
Consider the following context-free grammar:

G = <{string}, {+,-,0,1,2,3,4,5,6,7,8,9}, P, string>

with productions P =

string → string + string | string - string | 0 | 1 | … | 9


This grammar is ambiguous, because more than one parse tree
generates the string 9-5+2
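
The ambiguity can be checked by brute force. The following sketch is an assumption-laden illustration, not the slides' method: `parse_trees` and `evaluate` are hypothetical helpers that enumerate every parse tree by trying each operator as the root, then evaluate each tree to show that the two groupings give different values.

```python
def parse_trees(tokens):
    """Return every parse tree for tokens under
    string -> string + string | string - string | 0 | 1 | ... | 9.

    A tree is either a digit (str) or a tuple (left, operator, right).
    """
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0].isdigit() else []
    trees = []
    for i in range(1, len(tokens) - 1):
        if tokens[i] in ("+", "-"):          # try this operator as the root
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, tokens[i], right))
    return trees


def evaluate(tree):
    """Compute the arithmetic value that a parse tree denotes."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) - evaluate(right)


TREES = parse_trees(list("9-5+2"))
for tree in TREES:
    print(tree, "=", evaluate(tree))
```

Two trees are found: one groups the input as (9-5)+2 with value 6, the other as 9-(5+2) with value 2.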
Ambiguity (cont’d)

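Parse tree 1, grouping (9 - 5) + 2:

string
├── string
│   ├── string
│   │   └── 9
│   ├── -
│   └── string
│       └── 5
├── +
└── string
    └── 2

Parse tree 2, grouping 9 - (5 + 2):

string
├── string
│   └── 9
├── -
└── string
    ├── string
    │   └── 5
    ├── +
    └── string
        └── 2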
Associativity of Operators
Left-associative operators have left-recursive productions

left → left + term | term


String a+b+c has the same meaning as (a+b)+c

Right-associative operators have right-recursive productions

right → term = right | term

String a=b=c has the same meaning as a=(b=c)
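
The two recursion shapes correspond to folding from opposite ends. A small sketch, with the hypothetical helpers `group_left` and `group_right` inserting the parentheses that each kind of production implies:

```python
from functools import reduce


def group_left(operands, op):
    """left -> left op term | term : the tree grows down the left side."""
    return reduce(lambda acc, x: f"({acc}{op}{x})", operands)


def group_right(operands, op):
    """right -> term op right | term : the tree grows down the right side."""
    return reduce(lambda acc, x: f"({x}{op}{acc})", reversed(operands))


print(group_left(["a", "b", "c"], "+"))    # ((a+b)+c)
print(group_right(["a", "b", "c"], "="))   # (a=(b=c))
```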


Precedence of Operators
Operators with higher precedence “bind more tightly”
expr → expr + term | term
term → term * factor | factor
factor → number | ( expr )
String 2+3*5 has the same meaning as 2+(3*5)

expr
├── expr
│   └── term
│       └── factor
│           └── number
│               └── 2
├── +
└── term
    ├── term
    │   └── factor
    │       └── number
    │           └── 3
    ├── *
    └── factor
        └── number
            └── 5
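
One function per non-terminal is enough to see the precedence levels at work. The sketch below is a hand-written recursive-descent parser and makes two assumptions beyond the slides: the left-recursive productions are rewritten as loops (a recursive-descent parser cannot call itself on left recursion directly), and the names `tokenize`, `parse_expr`, `parse_term`, `parse_factor`, and `parse` are illustrative.

```python
import re


def tokenize(text):
    """Split the input into numbers, operators, and parentheses."""
    return re.findall(r"\d+|[+*()]", text)


def parse_expr(tokens, pos=0):
    """expr -> expr + term | term, with left recursion turned into a loop."""
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        node = ("+", node, right)           # builds a left-leaning tree
    return node, pos


def parse_term(tokens, pos):
    """term -> term * factor | factor, with left recursion turned into a loop."""
    node, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_factor(tokens, pos + 1)
        node = ("*", node, right)
    return node, pos


def parse_factor(tokens, pos):
    """factor -> number | ( expr )"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "(":
        node, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if tokens[pos].isdigit():
        return int(tokens[pos]), pos + 1
    raise SyntaxError(f"unexpected token {tokens[pos]!r}")


def parse(text):
    """Parse a whole string and return its syntax tree as nested tuples."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


print(parse("2+3*5"))     # ('+', 2, ('*', 3, 5))
print(parse("(2+3)*5"))   # ('*', ('+', 2, 3), 5)
```

Because `*` is handled one level deeper than `+`, the multiplication ends up lower in the tree and therefore binds more tightly.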
Syntax of Statements
stmt → id := expr
| if expr then stmt
| if expr then stmt else stmt
| while expr do stmt
| begin opt_stmts end
opt_stmts → stmt ; opt_stmts
| ε

Note: Opt stands for optional


More Complex Grammars
block → begin opt_stmts end
opt_stmts → stmt_list | ε
stmt_list → stmt_list ; stmt | stmt

What is this grammar for ?


What does “ε” represent ?
What kind of production rule is this ?
Defining a Parse Tree
• A parse tree pictorially shows how the start symbol of a
grammar derives a string in the language.
• More Formally, a Parse Tree for a CFG Has the Following
Properties:
– Root Is Labeled With the Start Symbol
– Each Leaf Node Is a Token or ε
– Each Interior Node Is a Non-Terminal
– If A → x1 x2 … xn Is a Production, Then A Is an Interior Node; x1, x2, …, xn Are
Children of A and May Be Non-Terminals or Tokens
Syntax-Directed Translation
• Uses a CF grammar to specify the syntactic
structure of the language
• AND associates a set of attributes with
(non)terminals
• AND associates with each production a set of
semantic rules for computing values for the
attributes
• The attributes contain the translated form of the
input after the computations are completed
Syntax for Statements
stmt → id := expr
| if expr then stmt
| if expr then stmt else stmt
| while expr do stmt
| begin opt_stmts end

Ambiguous Grammar?
Syntax-Directed Translation
• Associate Attributes With Grammar Rules and Translate as Parsing occurs

• The translation will follow the parse tree structure (and as a result the
structure and form of the parse tree will affect the translation).

• First example: Inductive Translation.


• Infix to Postfix Notation Translation for Expressions
– Translation defined inductively as: Postfix(E) where E is an Expression.

Rules
1. If E is a variable or constant then Postfix(E) = E
2. If E is E1 op E2 then Postfix(E)
= Postfix(E1 op E2) = Postfix(E1) Postfix(E2) op
3. If E is (E1) then Postfix(E) = Postfix(E1)
Examples
Postfix( ( 9 – 5 ) + 2 )
= Postfix( ( 9 – 5 ) ) Postfix( 2 ) +
= Postfix( 9 – 5 ) Postfix( 2 ) +
= Postfix( 9 ) Postfix( 5 ) – Postfix( 2 ) +
= 9 5 – 2 +

Postfix( 9 – ( 5 + 2 ) )
= Postfix( 9 ) Postfix( ( 5 + 2 ) ) –
= Postfix( 9 ) Postfix( 5 + 2 ) –
= Postfix( 9 ) Postfix( 5 ) Postfix( 2 ) + –
= 9 5 2 + –
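
The three rules translate almost line for line into a recursive function. In this sketch the expression encoding is an assumption made for illustration: a string is a variable or constant, a 3-tuple `(E1, op, E2)` is a binary expression, and a 1-tuple `(E1,)` stands for a parenthesized expression.

```python
def postfix(e):
    """Translate an expression to postfix notation, following rules 1 to 3."""
    if isinstance(e, str):        # rule 1: variable or constant
        return e
    if len(e) == 1:               # rule 3: (E1)
        return postfix(e[0])
    e1, op, e2 = e                # rule 2: E1 op E2
    return f"{postfix(e1)} {postfix(e2)} {op}"


# ( 9 - 5 ) + 2
FIRST = ((("9", "-", "5"),), "+", "2")
# 9 - ( 5 + 2 )
SECOND = ("9", "-", (("5", "+", "2"),))

print(postfix(FIRST))    # 9 5 - 2 +
print(postfix(SECOND))   # 9 5 2 + -
```

The parentheses disappear in the output because postfix notation encodes grouping through operand order alone.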
