Week 3-4

Slides of Compiler Construction chapter 3-4

Uploaded by

Malik Zohaib

MODULE # 2: COMPILER CONSTRUCTION

INSTRUCTOR: DR. SAKEENA JAVAID

OUTLINE OF THE TOPICS TO BE COVERED TODAY

• Language definition
• Syntax and semantic specification
• Using BNF notation to define the syntax of a language

SYNTAX AND SEMANTIC SPECIFICATION
• For specifying syntax, we present a widely used notation called context-free grammars (CFGs), also known as BNF (Backus-Naur Form).
• The terms "context-free grammar" and "grammar" are used interchangeably here.
• For example: if ( expression ) statement else statement
• That is, an if-else statement is the concatenation of the keyword if, an opening parenthesis, an expression, a closing parenthesis, a statement, the keyword else, and another statement.

SYNTAX AND SEMANTIC SPECIFICATION

• Using the variable expr to denote an expression and the variable stmt to denote a statement, this structuring rule can be expressed as
    stmt → if ( expr ) stmt else stmt
• The arrow may be read as "can have the form." Such a rule is called a production.
• In a production, lexical elements like the keyword if and the parentheses are called terminals.
• Variables like expr and stmt represent sequences of terminals and are called nonterminals.

SYNTAX AND SEMANTIC SPECIFICATION

• Definition of grammars
• A context-free grammar has four components:

1. A set of terminal symbols, sometimes referred to as "tokens." The terminals are the elementary symbols of the language defined by the grammar.
2. A set of nonterminals, sometimes called "syntactic variables." Each nonterminal represents a set of strings of terminals, in a manner we shall describe.
3. A set of productions, where each production consists of a nonterminal, called the head or left side of the production, an arrow, and a sequence of terminals and/or nonterminals, called the body or right side of the production. The intuitive intent of a production is to specify one of the written forms of a construct; if the head nonterminal represents a construct, then the body represents a written form of the construct.
4. A designation of one of the nonterminals as the start symbol.

SYNTAX AND SEMANTIC SPECIFICATION
• A few conventions for specifying productions:
  • We specify grammars by listing their productions, with the productions for the start symbol listed first.
  • We assume that digits, signs such as < and <=, and boldface strings such as while are terminals.
  • An italicized name is a nonterminal, and any non-italicized name or symbol may be assumed to be a terminal.
  • For notational convenience, productions with the same nonterminal as the head can have their bodies grouped, with the alternative bodies separated by the symbol |, which we read as "or."

SYNTAX AND SEMANTIC SPECIFICATION

• Example:
  • Several examples here use expressions consisting of digits and plus and minus signs, e.g., 9-5+2, 3-1, or 7.
  • Since a plus or minus sign must appear between two digits, we refer to such expressions as "lists of digits separated by plus or minus signs."
  • The following grammar describes the syntax of these expressions. The productions are:

    list → list + digit
    list → list - digit
    list → digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

SYNTAX AND SEMANTIC SPECIFICATION

• The bodies of the three productions with nonterminal list as head can equivalently be grouped:
    list → list + digit | list - digit | digit
• According to our conventions, the terminals of the grammar are the symbols
    + - 0 1 2 3 4 5 6 7 8 9
• The nonterminals are the italicized names list and digit, with list being the start symbol because its productions are given first.
• We say a production is for a nonterminal if the nonterminal is the head of the production.
• A string of terminals is a sequence of zero or more terminals. The string of zero terminals, written as ε, is called the empty string.

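The conventions above can be made concrete by writing the grammar down as data. The following is a minimal sketch, not part of the original slides: the dictionary layout and the names PRODUCTIONS, START, nonterminals, and terminals are assumptions chosen for illustration.

```python
# Hypothetical representation: each head maps to a list of alternative
# bodies, and each body is a tuple of grammar symbols.
PRODUCTIONS = {
    "list": [("list", "+", "digit"), ("list", "-", "digit"), ("digit",)],
    "digit": [(d,) for d in "0123456789"],
}
START = "list"  # by convention, its productions are listed first


def nonterminals(productions):
    """Every symbol that appears as the head of some production."""
    return set(productions)


def terminals(productions):
    """Every body symbol that is not the head of any production."""
    heads = nonterminals(productions)
    return {
        symbol
        for bodies in productions.values()
        for body in bodies
        for symbol in body
        if symbol not in heads
    }


print(sorted(nonterminals(PRODUCTIONS)))
print(sorted(terminals(PRODUCTIONS)))
```

Under this representation the terminals are not declared separately; they fall out as the symbols that never appear as a head, which matches the convention that anything not italicized is a terminal.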
SYNTAX AND SEMANTIC SPECIFICATION

• Derivations:
  • A grammar derives strings by beginning with the start symbol and repeatedly replacing a nonterminal by the body of a production for that nonterminal.
  • The terminal strings that can be derived from the start symbol form the language defined by the grammar.
• Example 1: The language defined by the grammar of the previous example consists of lists of digits separated by plus and minus signs.
  • The ten productions for the nonterminal digit allow it to stand for any of the terminals 0, 1, …, 9.
  • From production (3), a single digit by itself is a list. Productions (1) and (2) express the rule that any list followed by a plus or minus sign and then another digit makes up a new list.

    (1) list → list + digit
    (2) list → list - digit
    (3) list → digit
    (4) digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

SYNTAX AND SEMANTIC SPECIFICATION

• Productions (1) to (4) are all we need to define the desired language.
• For example, we can deduce that 9-5+2 is a list as follows:
  a. 9 is a list by production (3), since 9 is a digit
  b. 9-5 is a list by production (2), since 9 is a list and 5 is a digit
  c. 9-5+2 is a list by production (1), since 9-5 is a list and 2 is a digit

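The deduction in steps a to c can be mechanized. The sketch below is one possible way to do it and assumes single-digit operands with no spaces; the function name deduce_list and the wording of the step messages are hypothetical.

```python
DIGITS = "0123456789"


def deduce_list(text):
    """Return the deduction steps showing that text is a list, or None.

    Mirrors productions (1)-(4): a digit is a list, and a list followed
    by + or - and another digit is again a list.
    """
    if not text or text[0] not in DIGITS:
        return None
    steps = [f"{text[0]} is a list by production (3)"]
    pos = 1
    while pos < len(text):
        # Each extension needs exactly one operator followed by one digit.
        if pos + 1 >= len(text):
            return None
        op, digit = text[pos], text[pos + 1]
        if op not in "+-" or digit not in DIGITS:
            return None
        rule = 1 if op == "+" else 2
        steps.append(f"{text[:pos + 2]} is a list by production ({rule})")
        pos += 2
    return steps


for step in deduce_list("9-5+2"):
    print(step)
```

A string such as 9- or 95 yields None, because no sequence of productions (1) to (4) derives it.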
SYNTAX AND SEMANTIC SPECIFICATION

• Example 2:
  • A somewhat different sort of list is the list of parameters in a function call.
  • In Java, the parameters are enclosed within parentheses, as in the call max(x, y) of function max with parameters x and y.
  • One nuance of such lists is that an empty list of parameters may be found between the terminals ( and ).
  • We may start to develop a grammar for such sequences with the productions:

    call → id ( optparams )
    optparams → params | ε
    params → params , param | param

  • Note that the second possible body for optparams ("optional parameter list") is ε, which stands for the empty string of symbols.
  • That is, optparams can be replaced by the empty string, so a call can consist of a function name followed by the two-terminal string ().
  • Notice that the productions for params are analogous to those for list in Example 1, with comma in place of the arithmetic operator + or -, and param in place of digit.

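A small recognizer shows how the ε alternative of optparams is handled in practice. This is only a sketch: the slides leave param undefined, so the code assumes that a param is a plain identifier, and the names tokenize, is_id, and parse_call are hypothetical.

```python
import re

# An identifier, or one of the terminals ( ) and ,
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[(),])")


def tokenize(text):
    """Split text into identifiers and the terminals ( ) and ,."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_id(token):
    return token not in {"(", ")", ","}


def parse_call(text):
    """Return (function name, parameters) for a call such as max(x, y)."""
    tokens = tokenize(text)
    # call -> id ( optparams )
    if (len(tokens) < 3 or not is_id(tokens[0])
            or tokens[1] != "(" or tokens[-1] != ")"):
        raise SyntaxError("expected id ( optparams )")
    inner = tokens[2:-1]
    # optparams -> ε : nothing between the parentheses
    if not inner:
        return tokens[0], []
    # params -> params , param | param : identifiers separated by commas
    if len(inner) % 2 == 0:
        raise SyntaxError("misplaced comma in parameter list")
    params = []
    for index, token in enumerate(inner):
        if index % 2 == 0 and is_id(token):
            params.append(token)
        elif index % 2 == 1 and token == ",":
            continue
        else:
            raise SyntaxError(f"unexpected token {token!r}")
    return tokens[0], params


print(parse_call("max(x, y)"))
print(parse_call("f()"))
```

The empty parameter list is accepted before the params rule is tried, which corresponds to choosing the body ε for optparams.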
SYNTAX AND SEMANTIC SPECIFICATION
• Parsing:
  • Parsing is the problem of taking a string of terminals and figuring out how to derive it from the start symbol of the grammar.
  • If the string cannot be derived from the start symbol of the grammar, the parser reports syntax errors within the string.
  • A parse tree pictorially shows how the start symbol of a grammar derives a string in the language. If nonterminal A has a production A → XYZ, then a parse tree may have an interior node labeled A with three children labeled X, Y, and Z, from left to right.
  • Formally, given a context-free grammar, a parse tree according to the grammar is a tree with the following properties:
    • The root is labeled by the start symbol.
    • Each leaf is labeled by a terminal or by ε.
    • Each interior node is labeled by a nonterminal.

SYNTAX AND SEMANTIC SPECIFICATION

• If A is the nonterminal labeling some interior node and X1, X2, …, Xn are the labels of the children of that node from left to right, then there must be a production A → X1 X2 … Xn.
• Here, X1, X2, …, Xn each stand for a symbol that is either a terminal or a nonterminal.
• As a special case, if A → ε is a production, then a node labeled A may have a single child labeled ε.

SYNTAX AND SEMANTIC SPECIFICATION

• Tree terminology:
  • A tree consists of one or more nodes.
  • Nodes may have labels, which here will typically be grammar symbols.
  • When we draw a tree, we often represent the nodes by these labels only.
  • Exactly one node is the root.
  • All nodes except the root have a unique parent; the root has no parent.
  • When we draw trees, we place the parent of a node above that node and draw an edge between them. The root is then the highest (top) node.
  • If node N is the parent of node M, then M is a child of N. The children of one node are called siblings. Siblings have an order, from the left, and when we draw trees, we order the children of a given node in this manner.
  • A node with no children is called a leaf. Other nodes, those with one or more children, are interior nodes.
  • A descendant of a node N is either N itself, a child of N, a child of a child of N, and so on, for any number of levels.
  • We say node N is an ancestor of node M if M is a descendant of N.

SYNTAX AND SEMANTIC SPECIFICATION

• Example 3: The derivation of 9-5+2 in the previous example is illustrated by a parse tree.

[Figure: parse tree for 9-5+2 according to the list grammar]

  • Each node in the tree is labeled by a grammar symbol.
  • An interior node and its children correspond to a production: the interior node corresponds to the head of the production, the children to the body.
  • The root is labeled list, the start symbol of the grammar.
  • The children of the root are labeled, from left to right, list, +, and digit.
  • Note that list → list + digit is a production of the grammar. The left child of the root is similar to the root, with a child labeled - instead of +.

SYNTAX AND SEMANTIC SPECIFICATION

• From left to right, the leaves of a parse tree form the yield of the tree, which is the string derived from the nonterminal at the root.
• In the figure, the yield is 9-5+2; for convenience, all the leaves are shown at the bottom level.
• Another definition of the language generated by a grammar is as the set of strings that can be generated by some parse tree.
• The process of finding a parse tree for a given string of terminals is called parsing.

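Since the figure is not available in text form, the parse tree for 9-5+2 can be written out explicitly and its yield computed by reading the leaves from left to right. This is an illustrative sketch; the Node class and the helper names digit, tree_yield, and show are assumptions, not something defined in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # a grammar symbol
    children: list = field(default_factory=list)


def digit(d):
    """Interior node for the production digit -> d."""
    return Node("digit", [Node(d)])


# Parse tree for 9-5+2: the root uses list -> list + digit,
# and its left child uses list -> list - digit.
tree = Node("list", [
    Node("list", [
        Node("list", [digit("9")]),
        Node("-"),
        digit("5"),
    ]),
    Node("+"),
    digit("2"),
])


def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    if not node.children:
        return node.label
    return "".join(tree_yield(child) for child in node.children)


def show(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(show(child, depth + 1))
    return lines


print("\n".join(show(tree)))
print("yield:", tree_yield(tree))
```

The indentation printed by show makes the shape visible: the tree grows down towards the left, and the leaves read in order spell out the original string.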
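SYNTAX AND SEMANTIC SPECIFICATION

• Ambiguity
  • A grammar can have more than one parse tree generating a given string of terminals; such a grammar is said to be ambiguous.
  • To show that a grammar is ambiguous, we only need to find a terminal string that is the yield of more than one parse tree.
  • Since a string with more than one parse tree usually has more than one interpretation, we need to design unambiguous grammars, or use ambiguous grammars with additional rules that resolve the ambiguities.
  • Example: Suppose we used a single nonterminal string and did not distinguish between lists and digits:

    string → string + string | string - string | 0 | 1 | … | 9
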
SYNTAX AND SEMANTIC SPECIFICATION

• For example:
  • With the single nonterminal string, the expression 9-5+2 has more than one parse tree.
  • The two parse trees correspond to the two ways of parenthesizing the expression: (9-5)+2 and 9-(5+2). Evaluating both gives (9-5)+2 = 6, whereas 9-(5+2) = 2.
  • The previous grammar, with the separate nonterminals list and digit, does not allow the second interpretation.

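The ambiguity can be checked by enumerating every parse of 9-5+2 under a grammar with a single nonterminal (string → string + string | string - string | digit). The sketch below assumes the input is a well-formed alternation of single digits and operators; the function names parses, value, and show are hypothetical.

```python
def parses(tokens):
    """All parse structures of tokens under the single-nonterminal grammar.

    A parse is either a digit (str) or a tuple (left, operator, right).
    """
    if len(tokens) == 1:
        return [tokens[0]]
    results = []
    # Try every operator position as the root of the parse tree.
    for i in range(1, len(tokens), 2):
        for left in parses(tokens[:i]):
            for right in parses(tokens[i + 1:]):
                results.append((left, tokens[i], right))
    return results


def value(tree):
    """Evaluate a parse according to its own grouping."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    if op == "+":
        return value(left) + value(right)
    return value(left) - value(right)


def show(tree):
    """Fully parenthesized form of a parse."""
    if isinstance(tree, str):
        return tree
    left, op, right = tree
    return f"({show(left)}{op}{show(right)})"


for tree in parses(list("9-5+2")):
    print(show(tree), "=", value(tree))
```

Two distinct parses with different values for the same string are exactly what makes the grammar ambiguous.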
SYNTAX AND SEMANTIC SPECIFICATION

[Figure: Parse trees of the newly constructed grammars. First tree: order of evaluation left to right. Second tree: order of evaluation right to left.]

SYNTAX AND SEMANTIC SPECIFICATION

• Associativity of operators
  • By convention, the expression
    • 9+5+2 is equivalent to (9+5)+2, and
    • 9-5-2 is equivalent to (9-5)-2.
  • How? When an operand such as 5 has operators on both of its sides, conventions are needed to decide which operator applies to that operand.
  • We say that the operator + associates to the left, because an operand with plus signs on both sides belongs to the operator on its left.
  • In most programming languages, the operators +, -, * and / are left-associative.

SYNTAX AND SEMANTIC SPECIFICATION

• Associativity of operators
  • Some common operators, such as exponentiation, are right-associative.
  • The assignment operator = in C and its descendants is right-associative; for example, the expression a=b=c is treated as a=(b=c).
  • Strings like a=b=c with a right-associative operator are generated by the following grammar:

    right → letter = right | letter
    letter → a | b | … | z

SYNTAX AND SEMANTIC SPECIFICATION

• The contrast between left-associative and right-associative operators is shown below.
• The parse tree for a left-associative operator grows down towards the left.
• The parse tree for a right-associative operator grows down towards the right.

Figure: Parse trees for the left and right associative operators

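The two tree shapes can be reproduced with a left fold and a right fold over the operands. This is a sketch under the assumption that a tree is represented as a nested tuple (left, operator, right); group_left, group_right, and evaluate are hypothetical helper names.

```python
from functools import reduce


def group_left(operands, op):
    """Left-associative grouping: ((a op b) op c)."""
    return reduce(lambda left, right: (left, op, right), operands)


def group_right(operands, op):
    """Right-associative grouping: (a op (b op c))."""
    return reduce(lambda right, left: (left, op, right), reversed(operands))


def evaluate(tree):
    """Evaluate a nested tuple built from digits with + and -."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a - b


left_tree = group_left(["9", "5", "2"], "-")
right_tree = group_right(["9", "5", "2"], "-")
print(left_tree, "=", evaluate(left_tree))    # grouping (9-5)-2
print(right_tree, "=", evaluate(right_tree))  # grouping 9-(5-2)

# Assignment is right-associative: a=b=c groups as a=(b=c).
print(group_right(["a", "b", "c"], "="))
```

For subtraction the grouping changes the result, which is why the convention matters; for assignment the right-leaning tree reflects that b=c must be carried out before its value is assigned to a.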
SYNTAX AND SEMANTIC SPECIFICATION

• Precedence of operators
  • Consider the expression 9+5*2.
  • It has two possible interpretations: (9+5)*2 and 9+(5*2). Is there any way to choose between them?
  • The associativity rules for + and * apply only to occurrences of the same operator, so they do not settle this case. Which new rules are required?

SYNTAX AND SEMANTIC SPECIFICATION

• In ordinary arithmetic, multiplication and division have higher precedence than addition and subtraction.
• So the expressions 9+5*2 and 9*5+2 are read as

    9+(5*2) and (9*5)+2, respectively.

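One common way to build precedence into a grammar is to use one nonterminal per precedence level. The sketch below assumes the grammar expr → expr + term | expr - term | term, term → term * factor | term / factor | factor, factor → digit | ( expr ), which does not appear in the slides, and evaluates single-digit expressions with a hand-written recursive-descent evaluator. The function name evaluate and its helpers are hypothetical.

```python
def evaluate(text):
    """Evaluate single-digit arithmetic with the usual precedence."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        # factor -> digit | ( expr )
        token = advance()
        if token is not None and token in "0123456789":
            return int(token)
        if token == "(":
            result = expr()
            if advance() != ")":
                raise SyntaxError("expected )")
            return result
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # term -> term * factor | term / factor | factor
        # The loop applies operators left to right (left-associative).
        result = factor()
        while peek() in ("*", "/"):
            op = advance()
            right = factor()
            result = result * right if op == "*" else result / right
        return result

    def expr():
        # expr -> expr + term | expr - term | term
        result = term()
        while peek() in ("+", "-"):
            op = advance()
            right = term()
            result = result + right if op == "+" else result - right
        return result

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


print(evaluate("9+5*2"))
print(evaluate("9*5+2"))
print(evaluate("(9+5)*2"))
```

Because expr only ever combines complete terms, a * or / always binds its operands before a neighbouring + or - sees them; parentheses re-enter at the lowest level through factor.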
PRACTICE TASK # 2

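SYNTAX AND SEMANTIC SPECIFICATION

• Syntax-directed translation
  • It is done by attaching rules or program fragments to productions in the grammar.
  • For example, consider the production
      expr → expr1 + term
  • Here expr is the sum of the two sub-expressions expr1 and term.
  • Here expr1 is an instance of expr.
  • Why is the subscript 1 used here? (It only distinguishes the instance of expr in the body from the head of the production.)
  • How will the translation of expr be performed?
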
SYNTAX AND SEMANTIC SPECIFICATION

• By building syntax trees (covered later)
• By converting infix notation into postfix notation
• Two concepts are used for syntax-directed translation:
  • Attributes: an attribute is any quality associated with a programming construct, for example the data type of an expression, the number of instructions in the generated code, or the location of the first instruction in the generated code for the construct.
  • Syntax-directed translation schemes: a translation scheme is a notation for attaching program fragments to the productions of a grammar. The program fragments are executed when the production is used during syntax analysis.

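SYNTAX AND SEMANTIC SPECIFICATION

• Syntax-directed translation is used to translate infix expressions into postfix notation, to evaluate expressions, and to build syntax trees for programming constructs.
• Postfix notation
  • The postfix notation for an expression E is defined by three rules:
    1. If E is a variable or constant, then the postfix notation for E is E itself.
    2. If E is an expression of the form E1 op E2, where op is any binary operator, then the postfix notation for E is E1' E2' op, where E1' and E2' are the postfix notations for E1 and E2, respectively.
    3. If E is a parenthesized expression of the form (E1), then the postfix notation for E is the same as the postfix notation for E1.
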
SYNTAX AND SEMANTIC SPECIFICATION

• For example:
  • The postfix notation for (9-5)+2 is 95-2+.
  • The constants 9, 5, and 2 are translated to themselves by rule 1.
  • Then 9-5 is translated to 95- by rule 2.
  • The parenthesized (9-5) has the same translation, 95-, by rule 3; applying rule 2 once more to (9-5)+2 gives 95-2+.

SYNTAX AND SEMANTIC SPECIFICATION

• Examples:
  • How do we convert 9-(5+2) into postfix?
  • Evaluate the result.
  • No parentheses are required in postfix notation; the expression is scanned from left to right.
  • How do we convert 952+-3* into infix form?

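The exercises above can be checked with a short program. The sketch below is one possible approach, not the method prescribed by the slides: to_postfix is a recursive-descent translator that assumes single-digit operands and the usual precedence of * and / over + and -, and eval_postfix is a stack-based evaluator that scans the postfix string from left to right. Both function names are hypothetical.

```python
import operator

DIGITS = "0123456789"
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def to_postfix(text):
    """Translate an infix expression over digits, + - * / and ( )."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def factor():
        nonlocal pos
        token = peek()
        if token is not None and token in DIGITS:
            pos += 1
            return token                 # rule 1: a constant is itself
        if token == "(":
            pos += 1
            inner = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            pos += 1
            return inner                 # rule 3: parentheses disappear
        raise SyntaxError(f"unexpected token {token!r}")

    def binary(operand, operators):
        nonlocal pos
        result = operand()
        while peek() in operators:
            op = tokens[pos]
            pos += 1
            result = result + operand() + op   # rule 2: E1' E2' op
        return result

    def term():
        return binary(factor, {"*", "/"})

    def expr():
        return binary(term, {"+", "-"})

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result


def eval_postfix(postfix):
    """Evaluate a postfix string of single digits and binary operators."""
    stack = []
    for token in postfix:
        if token in DIGITS:
            stack.append(int(token))
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
    (value,) = stack  # exactly one value must remain
    return value


print(to_postfix("(9-5)+2"), eval_postfix(to_postfix("(9-5)+2")))
print(to_postfix("9-(5+2)"), eval_postfix(to_postfix("9-(5+2)")))
print(eval_postfix("952+-3*"))
```

The evaluator never needs parentheses: each operator applies to the two most recent values on the stack, so the position of the operator alone fixes the grouping.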
SYNTAX AND SEMANTIC SPECIFICATION

• Synthesized attributes:
  • The idea is to associate qualities with programming constructs, for example values and types with expressions.
  • Attributes are associated with the terminals and nonterminals of the grammar.
  • Rules are attached to the productions; they describe how the attributes are computed at the parse-tree nodes where the production relates a node to its child nodes.

SYNTAX AND SEMANTIC SPECIFICATION

• Formal definition

SYNTAX AND SEMANTIC SPECIFICATION

• Attributes can be evaluated as follows:
  • Construct a parse tree for the given string x.
  • Apply the semantic rules to evaluate the attributes at each node in the parse tree.

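The two evaluation steps can be sketched on the parse tree of 9-5+2. The code below assumes the list/digit grammar from the earlier examples and a single synthesized attribute t holding the postfix translation; the semantic rules in the comments (for instance list.t = list1.t followed by digit.t and the operator) are illustrative assumptions modelled on the postfix rules, and the names TREE, annotate, and show are hypothetical.

```python
def digit(d):
    """Parse-tree fragment for the production digit -> d."""
    return ("digit", [(d, [])])


# Step 1: the parse tree for x = 9-5+2, as (label, children) pairs.
TREE = ("list", [
    ("list", [
        ("list", [digit("9")]),
        ("-", []),
        digit("5"),
    ]),
    ("+", []),
    digit("2"),
])


def annotate(node):
    """Step 2: compute t at every node, children first.

    Returns (label, t, annotated children). For convenience a terminal
    leaf is given t equal to its own symbol.
    """
    label, children = node
    done = [annotate(child) for child in children]
    if not children:
        t = label                          # terminal leaf
    elif len(children) == 1:
        t = done[0][1]                     # list.t = digit.t, digit.t = d
    else:
        left, op, right = done
        t = left[1] + right[1] + op[1]     # list.t = list1.t digit.t op
    return (label, t, done)


def show(annotated, depth=0):
    """Return the annotated tree as indented lines."""
    label, t, children = annotated
    lines = ["  " * depth + f"{label}.t = {t}"]
    for child in children:
        lines.extend(show(child, depth + 1))
    return lines


print("\n".join(show(annotate(TREE))))
```

Every value of t depends only on the values at the node's children, which is what makes the attribute synthesized and lets a single bottom-up pass over the tree evaluate it.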
Thanks
