
Compiler Construction CS-4207: Instructor Name: Atif Ishaq

This lecture covered syntax definition using context-free grammars, derivation of strings from a grammar, parsing, ambiguous grammars, and operator associativity and precedence:
- Context-free grammars define a language using terminals, nonterminals, productions, and a start symbol. They can describe the syntax of programming languages.
- Derivation shows how a string can be generated from the start symbol by replacing nonterminals. Parsing is the inverse process of recognizing strings.
- Ambiguous grammars allow multiple parse trees for the same string. Only unambiguous grammars should be used.
- Operator precedence and associativity determine how expressions with multiple operators are parsed, such as whether "2 + 4 * 3"

Uploaded by

Faisal Shehzad
Copyright
© © All Rights Reserved

Compiler Construction

CS-4207

Instructor Name: Atif Ishaq


Lecture 3
Today’s Lecture

• Syntax Definition
• Context-Free Grammar
• Derivation of Strings
• Parsing
• Ambiguous Grammar
• Associativity and Precedence of Operators
Syntax Definition

Many definitions of the syntax of a sentence in logic or in a programming language have the following form:

• A number or an identifier is an expression.
• If E1 and E2 are expressions, then so are E1 + E2, E1 - E2, E1 * E2, E1 / E2, and (E1).

What is characteristic of such inductive definitions is that they start from some atomic constructs (e.g. number, identifier), introduce one or more categorical constructs (e.g. expression), and then provide rules by which constructs can be combined to yield further instances of constructs.

This mode of definition is generalized to the notion of a context-free grammar (or simply grammar).
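The inductive definition above can be sketched as a recursive check. This is only an illustration: the tuple encoding of composite expressions and the name is_expression are assumptions made for this example, not part of the lecture.

```python
OPERATORS = {"+", "-", "*", "/"}

def is_expression(x):
    """Return True if x is an expression under the inductive definition.

    Assumed encoding (hypothetical, for illustration only):
      - a number is an int or float, an identifier is a str such as "x"
      - E1 op E2 is the tuple (E1, op, E2)
      - (E1) is the tuple ("(", E1, ")")
    """
    # Atomic constructs: numbers and identifiers.
    if isinstance(x, (int, float)):
        return True
    if isinstance(x, str):
        return x.isidentifier()
    # Rules that combine smaller expressions into larger ones.
    if isinstance(x, tuple) and len(x) == 3:
        if x[0] == "(" and x[2] == ")":
            return is_expression(x[1])
        return x[1] in OPERATORS and is_expression(x[0]) and is_expression(x[2])
    return False

# 1 + (x * 2) is an expression; a lone operator with one operand is not.
print(is_expression((1, "+", ("(", ("x", "*", 2), ")"))))
print(is_expression(("+", 1)))
```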
Syntax Definition

A grammar naturally describes the hierarchical structure of most programming language constructs (e.g. the if-else statement):

if (expression) statement else statement

An if-else statement is the concatenation of the keyword if, an opening parenthesis, an expression, a closing parenthesis, a statement, the keyword else, and another statement.

stmt → if (expr) stmt else stmt

• The arrow is read as "can have the form"; such a rule is called a production.
• Lexical elements like the keyword if and the parentheses are called terminals.
• Variables like expr and stmt represent sequences of terminals and are called nonterminals.
Context-Free Grammars

A context-free grammar has four components:

1. A set of tokens, known as terminal symbols.
2. A set of nonterminal symbols (corresponding to categorical concepts).
3. A set of productions of the form A → B1 … Bk, where A is a nonterminal and each Bi is any symbol (terminal or nonterminal).
4. A designation of one of the nonterminals as the start symbol.

We often combine the two rules A → B1 … Bk and A → C1 … Cm into

A → B1 … Bk | C1 … Cm

Example: list of digits separated by + or -

list → list + digit | list - digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
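As a rough sketch, the four components can be written down as plain data. The Grammar class, its field names, and the is_well_formed check are assumptions made for this illustration; only the list grammar itself comes from the slide.

```python
from dataclasses import dataclass

@dataclass
class Grammar:
    terminals: frozenset      # component 1: the tokens
    nonterminals: frozenset   # component 2: the categorical concepts
    productions: dict         # component 3: head -> list of bodies (tuples of symbols)
    start: str                # component 4: the start symbol

    def is_well_formed(self):
        """Check that the four components fit together."""
        symbols = self.terminals | self.nonterminals
        if self.terminals & self.nonterminals:
            return False
        if self.start not in self.nonterminals:
            return False
        for head, bodies in self.productions.items():
            if head not in self.nonterminals:
                return False
            if any(sym not in symbols for body in bodies for sym in body):
                return False
        return True

# The list grammar from the example above.
LIST_GRAMMAR = Grammar(
    terminals=frozenset("+-0123456789"),
    nonterminals=frozenset({"list", "digit"}),
    productions={
        "list": [("list", "+", "digit"), ("list", "-", "digit"), ("digit",)],
        "digit": [(d,) for d in "0123456789"],
    },
    start="list",
)

print(LIST_GRAMMAR.is_well_formed())
```

Writing A → B1 … Bk | C1 … Cm corresponds here to one dictionary entry whose list holds both bodies.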
Derivation

A grammar derives strings by beginning with the start symbol and repeatedly replacing a nonterminal by the right side of a production for that nonterminal. The set of token (terminal) strings that can be derived from the start symbol is the language defined by the grammar.

Example: Given the grammar

L → L + D | L - D | D
D → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

we can use it to derive the string 9 - 5 + 2, for example as follows:

L ⇒ L + D ⇒ L - D + D ⇒ D - D + D ⇒ 9 - D + D ⇒ 9 - 5 + D ⇒ 9 - 5 + 2

The process inverse to derivation is recognition, or parsing.
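The replacement process can be sketched in a few lines. The function name derive and the (nonterminal, production index) encoding of each step are assumptions made for this illustration; the grammar is the one given above.

```python
# Productions of the grammar above; the index into each list selects a body.
GRAMMAR = {
    "L": [["L", "+", "D"], ["L", "-", "D"], ["D"]],
    "D": [[str(d)] for d in range(10)],
}

def derive(start, choices):
    """Apply each (nonterminal, body_index) choice to the leftmost
    occurrence of that nonterminal and record every sentential form."""
    form = [start]
    steps = [list(form)]
    for nonterminal, body_index in choices:
        i = form.index(nonterminal)
        form[i:i + 1] = GRAMMAR[nonterminal][body_index]
        steps.append(list(form))
    return steps

# One possible derivation of 9 - 5 + 2.
steps = derive("L", [("L", 0), ("L", 1), ("L", 2), ("D", 9), ("D", 5), ("D", 2)])
for form in steps:
    print(" ".join(form))
```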
Parsing

Parsing is the problem of taking a string of terminals and figuring out how to derive it from the start symbol of the grammar and, if it cannot be derived from the start symbol, reporting syntax errors within the string.

In the example discussed, 9 - 5 + 2, each character is a terminal.

A source program generally has multi-character lexemes that are grouped by the lexical analyzer into tokens, whose first components are the terminals processed by the parser.
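A minimal recognizer for the list grammar, as a sketch of the "accept or report a syntax error" behaviour described above. It relies on the observation that the strings of this grammar are exactly digits alternating with + or -; the function name recognize and the error messages are assumptions made for this illustration.

```python
DIGITS = set("0123456789")
OPERATORS = {"+", "-"}

def recognize(tokens):
    """Return None if the tokens can be derived from list,
    otherwise a message describing the first syntax error."""
    expect_digit = True
    for pos, tok in enumerate(tokens):
        if expect_digit and tok not in DIGITS:
            return f"syntax error at position {pos}: expected a digit, found {tok!r}"
        if not expect_digit and tok not in OPERATORS:
            return f"syntax error at position {pos}: expected + or -, found {tok!r}"
        expect_digit = not expect_digit
    if expect_digit:
        return "syntax error: input ends where a digit is expected"
    return None

print(recognize(list("9-5+2")))   # accepted, so no error message
print(recognize(list("9-+2")))    # rejected with an error message
```

Each character is treated as one token here, as in the example; a real front end would receive tokens from the lexical analyzer instead.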
Parse Tree

A parse tree pictorially shows how the start symbol of a grammar derives a string in the language.

If nonterminal A has a production A → XYZ, then a parse tree may have an interior node labeled A with three children labeled X, Y, and Z from left to right:

      A
    / | \
   X  Y  Z
Parse Tree

Given a context-free grammar, a parse tree according to the grammar has the following properties:

1. The root is labeled by the start symbol.
2. Each leaf is labeled by a terminal or by ε (the empty string).
3. Each interior node is labeled by a nonterminal.
4. If A is the nonterminal labeling some interior node and X1, X2, …, Xn are the labels of the children of that node from left to right, then there must be a production A → X1 X2 … Xn.
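The four properties can be checked mechanically. The sketch below does so for the list grammar; the Node class and the helper names are assumptions made for this illustration, and ε leaves are not handled because this grammar has no ε-productions.

```python
from dataclasses import dataclass, field

PRODUCTIONS = {
    "list": [("list", "+", "digit"), ("list", "-", "digit"), ("digit",)],
    "digit": [(d,) for d in "0123456789"],
}
START = "list"

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def is_parse_tree(root):
    """Check properties 1 to 4 for a tree over the list grammar."""
    def check(node):
        if not node.children:
            # Property 2: a leaf must be a terminal, not a nonterminal.
            return node.label not in PRODUCTIONS
        # Property 3: an interior node must be a nonterminal.
        if node.label not in PRODUCTIONS:
            return False
        # Property 4: the children, left to right, must match a production body.
        body = tuple(child.label for child in node.children)
        return body in PRODUCTIONS[node.label] and all(check(c) for c in node.children)
    # Property 1: the root is the start symbol.
    return root.label == START and check(root)

def leaves(node):
    """Read the leaves from left to right (the string the tree derives)."""
    if not node.children:
        return [node.label]
    return [label for child in node.children for label in leaves(child)]

def digit(d):
    return Node("digit", [Node(d)])

# Parse tree of 9 - 5 + 2 under the left-recursive list grammar.
tree = Node("list", [
    Node("list", [Node("list", [digit("9")]), Node("-"), digit("5")]),
    Node("+"),
    digit("2"),
])

print(is_parse_tree(tree), "".join(leaves(tree)))
```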
Parse Tree

The parse tree for the derivation of the string 9 - 5 + 2 by the grammar for list is:

                list
              /  |  \
          list   +   digit
         /  |  \       |
     list   -  digit   2
      |          |
    digit        5
      |
      9

The importance of grammars lies not only in their ability to distinguish between acceptable and unacceptable strings. No less important is the hierarchical grouping they induce on the string through the parse tree.
Ambiguous Grammar
A grammar is ambiguous if it can produce two different parse trees for the same string.

The grammar for list was unambiguous. On the other hand, the following grammar for the same language is ambiguous

The grammar on the right side provides the correct arithmetical grouping.

We will restrict ourselves to unambiguous grammars from now on.
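The ambiguous grammar from the slide is not reproduced in this text. As a sketch only, assume the textbook-style grammar string → string + string | string - string | digit (an assumption, as is the function name groupings). Enumerating every way it can group 9 - 5 + 2 shows two parse trees with different values:

```python
def groupings(tokens):
    """Return (text, value) for every parse of tokens under the assumed
    ambiguous grammar: string -> string + string | string - string | digit."""
    if len(tokens) == 1:
        return [(tokens[0], int(tokens[0]))]
    results = []
    # Any operator may sit at the root of the parse tree.
    for i in range(1, len(tokens), 2):
        op = tokens[i]
        for ltext, lvalue in groupings(tokens[:i]):
            for rtext, rvalue in groupings(tokens[i + 1:]):
                value = lvalue + rvalue if op == "+" else lvalue - rvalue
                results.append((f"({ltext} {op} {rtext})", value))
    return results

for text, value in groupings(["9", "-", "5", "+", "2"]):
    print(text, "=", value)
```

Only the grouping (9 - 5) + 2, with value 6, matches the usual arithmetic reading; the left-recursive list grammar admits just that one.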
Associativity of Operators

In programming languages, the associativity of an operator is a property that determines how operators of the same precedence are grouped in the absence of parentheses.

• Operators may be associative (operations may be grouped arbitrarily).
• Operators may be left associative (operations are grouped from the left).
• Operators may be right associative (operations are grouped from the right).
• Operators may be non-associative (operations cannot be chained, e.g. because of incompatible types).

The associativity and precedence of an operator are part of the definition of the programming language.
Associativity of Operators

By convention, 9 + 5 + 2 is equivalent to (9 + 5) + 2, and 9 - 5 - 2 is equivalent to (9 - 5) - 2.

When an operand has operators to its left and right, conventions are needed for deciding which operator applies to that operand. The operators + and - associate to the left, because an operand with plus signs on both sides of it belongs to the operator to its left.

Some common operators, such as exponentiation and assignment, are right associative; e.g. the expression a = b = c is treated in the same way as the expression a = (b = c).

Grammar for strings with a right-associative operator:

right → letter = right | letter
letter → a | b | c | … | z
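A small recursive parser for this grammar, as a sketch. Because the production right → letter = right recurses on its right end, the nesting it builds leans to the right. The function names and the nested-tuple result are assumptions made for this illustration.

```python
import string

LETTERS = set(string.ascii_lowercase)

def parse_right(tokens, pos=0):
    """Parse right -> letter = right | letter starting at tokens[pos].
    Returns (tree, next_position); the tree is a letter or (letter, "=", tree)."""
    if pos >= len(tokens) or tokens[pos] not in LETTERS:
        raise SyntaxError(f"letter expected at position {pos}")
    letter = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] == "=":
        rest, end = parse_right(tokens, pos + 2)   # the recursion is on the right
        return (letter, "=", rest), end
    return letter, pos + 1

def parse(text):
    tokens = list(text.replace(" ", ""))
    tree, end = parse_right(tokens)
    if end != len(tokens):
        raise SyntaxError(f"unexpected token at position {end}")
    return tree

print(parse("a = b = c"))
```

The result groups b = c first, which corresponds to a = (b = c).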
Associativity of Operators

Parse trees for left associative and right associative grammar

Precedence of Operators

Consider the expression 9 + 5 * 2.

There are two possible interpretations of this expression:

(9 + 5) * 2 or 9 + (5 * 2)

The associativity rules for + and * apply to occurrences of the same operator, so they do not resolve this ambiguity.

• * has higher precedence than + if * takes its operands before + does.
• Multiplication and division have higher precedence than addition and subtraction. Therefore 5 is taken by * in both 9 + 5 * 2 and 9 * 5 + 2; i.e., the expressions are equivalent to 9 + (5 * 2) and (9 * 5) + 2, respectively.
Precedence of Operators
Consider the language of parenthesis-free arithmetical expressions over digits and the two operations + and *. A possible grammar for this language is

Unfortunately, parsing the string 2 + 4 * 3 according to this grammar yields the grouping {2 + 4} * 3. This grouping implies evaluation of the string to the value 18 instead of the correct value 14.

A grammar that correctly captures the operator precedence is given by:

Parsing the string 2 + 4 * 3 according to this two-tiered grammar yields the grouping 2 + {4 * 3}, leading to the value 14.
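The two grammars are not reproduced in this text, so the sketch below assumes a single-tier grammar E → E + D | E * D | D and a two-tiered grammar E → E + T | T, T → T * D | D. Both are assumptions chosen to match the groupings described above, as are the function names. The input is assumed to be a well-formed token list.

```python
def fold_left(operands, ops):
    """Group (text, value) operands from left to right,
    the way a left-recursive production does."""
    text, value = operands[0]
    for op, (rtext, rvalue) in zip(ops, operands[1:]):
        value = value + rvalue if op == "+" else value * rvalue
        text = "{" + text + " " + op + " " + rtext + "}"
    return text, value

def parse_flat(tokens):
    """Assumed grammar: E -> E + D | E * D | D (no precedence levels)."""
    digits = [(t, int(t)) for t in tokens[0::2]]
    return fold_left(digits, tokens[1::2])

def parse_tiered(tokens):
    """Assumed grammar: E -> E + T | T and T -> T * D | D."""
    terms, current = [], [tokens[0]]
    for op, d in zip(tokens[1::2], tokens[2::2]):
        if op == "*":
            current.append(d)        # stay inside the current term
        else:
            terms.append(current)    # + closes the current term
            current = [d]
    terms.append(current)
    grouped = [fold_left([(d, int(d)) for d in term], ["*"] * (len(term) - 1))
               for term in terms]
    return fold_left(grouped, ["+"] * (len(grouped) - 1))

tokens = ["2", "+", "4", "*", "3"]
print(parse_flat(tokens))
print(parse_tiered(tokens))
```

The extra nonterminal T is what forces every * to be grouped before any + can apply.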
Easy Reading of Grammar
Consider the extended grammar

We can interpret it as capturing the following definition:

• An expression (E) is a sequence of terms (T) separated by the operators + or -.
E = T { (+ | -) T }*
• A term (T) is a sequence of digits (D) separated by the operators * or /.
T = D { (* | /) D }*
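This reading translates almost directly into loops: one loop per { … }* repetition. The sketch below evaluates single-digit expressions that way; the function names, the use of true division for /, and the error messages are assumptions made for this illustration.

```python
import operator

ADD_OPS = {"+": operator.add, "-": operator.sub}
MUL_OPS = {"*": operator.mul, "/": operator.truediv}

def evaluate(text):
    """Evaluate an expression of single digits using the iterative reading
    E = T { (+ | -) T }*  and  T = D { (* | /) D }*."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def digit():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in "0123456789":
            raise SyntaxError(f"digit expected at position {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        nonlocal pos
        value = digit()
        while pos < len(tokens) and tokens[pos] in MUL_OPS:   # { (* | /) D }*
            op = MUL_OPS[tokens[pos]]
            pos += 1
            value = op(value, digit())
        return value

    def expr():
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ADD_OPS:   # { (+ | -) T }*
            op = ADD_OPS[tokens[pos]]
            pos += 1
            value = op(value, term())
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return result

print(evaluate("2 + 4 * 3"))
print(evaluate("9 - 5 + 2"))
```

Accumulating the value inside each loop groups operators from the left, so the loops preserve left associativity as well as precedence.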
Lecture Outcome

• Significance of context-free grammars in compiler construction
• How to resolve associativity and precedence issues in arithmetic expressions
• Focusing on unambiguous grammars for parsing
