
3 parser-Intro-L5

This document discusses syntax analysis and parsing. It begins by listing the functions of a parser as testing for membership in a language and generating parse trees. It then discusses context-free grammars and how they are used to precisely define language syntax. The document covers topics such as derivation, parse trees, ambiguity, and the two main types of parsing: top-down and bottom-up. It provides examples to illustrate concepts like derivation and parsing strategies. The key takeaway is that parsing checks if tokens conform to language syntax by analyzing them based on the grammar rules.

Uploaded by

PRANJAL SHARMA

BITS Pilani, Hyderabad Campus
Prof. Aruna Malapati
Department of CSIS

Syntax Analysis / Parser


Today’s Learning Objectives

• List the functions of a Parser

• Perform grammar transformations

BITS Pilani, Hyderabad Campus


Parser



Where Are We?

Source code: if (b == 0) a = "Hi";

Lexical Analysis

Token stream: if ( b == 0 ) a = "Hi" ;

Syntactic Analysis

Abstract Syntax Tree (AST): [Figure: tree with root if; its children are ==, = and ;, with leaves b and 0 under == and leaves a and "Hi" under =]

Semantic Analysis

Do tokens conform to the language syntax?
Functions of Syntax Analysis

• Test for membership: does the string w belong to L(G)?

• Additional functionalities

– Generate parse trees

– Handle errors if the string does not belong to the language.

• For the membership question itself, the form of the grammar is not important: many grammars generate the same language.


What can syntax analysis not do?

• Check whether variables are of types on which operations are allowed.

• Check whether a variable has been declared before use.

• Check whether a variable has been initialized.

• These issues are handled in semantic analysis.


Limitations of regular languages

• How can language syntax be described precisely and conveniently? Can regular expressions be used?

• Many languages are not regular, for example strings of balanced parentheses.

– ((((...))))

– { (^i )^i | i ≥ 0 }

• There is no regular expression for this language.

• Syntax definition: context-free grammars
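To illustrate why this language is out of reach for a finite automaton, here is a minimal Python sketch of a recognizer for { (^i )^i | i ≥ 0 }. The function name is_nested is an assumption made for this example. The point is that the recognizer needs a counter that can grow without bound, which is the memory a finite-state machine does not have.

```python
def is_nested(s: str) -> bool:
    """Return True if s is i opening parentheses followed by i closing ones."""
    depth = 0           # unbounded counter: number of '(' still unmatched
    seen_close = False  # once a ')' is seen, no further '(' is allowed
    for ch in s:
        if ch == "(":
            if seen_close:
                return False
            depth += 1
        elif ch == ")":
            seen_close = True
            depth -= 1
            if depth < 0:   # more ')' than '(' so far
                return False
        else:
            return False    # any other character is outside the alphabet
    return depth == 0
```

For example, is_nested("(())") holds, while is_nested("(()") and is_nested("()()") do not.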


Syntax definition

• A context-free grammar is a four-tuple <V, T, P, S>:

– a finite set of nonterminal symbols / variables (V)
– a finite set of terminals (T), with V ∩ T = ∅
– a finite set of productions (P) of the form A -> w, where A ∈ V and w ∈ (V ∪ T)*
– a start symbol / variable (S), with S ∈ V

• A grammar derives strings by beginning with the start symbol and repeatedly replacing a nonterminal by the right-hand side of a production for that nonterminal.

• The strings that can be derived from the start symbol of a grammar G form the language L(G) defined by the grammar.
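The four-tuple can be written down directly as a data structure. The Python sketch below is one possible encoding, not a standard one: the class name Grammar, its field names and the method is_well_formed are assumptions made for illustration. It encodes the list/digit grammar used on the Derivation slide and checks the three side conditions from the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    variables: frozenset   # V: nonterminals
    terminals: frozenset   # T: terminals
    productions: tuple     # P: pairs (A, w), with w a tuple of symbols
    start: str             # S: start symbol

    def is_well_formed(self) -> bool:
        symbols = self.variables | self.terminals
        return (
            not (self.variables & self.terminals)      # V ∩ T = ∅
            and self.start in self.variables           # S ∈ V
            and all(                                   # A ∈ V, w ∈ (V ∪ T)*
                head in self.variables and all(sym in symbols for sym in body)
                for head, body in self.productions
            )
        )


# list -> list + digit | list - digit | digit ;  digit -> 0 | 1 | ... | 9
LIST_GRAMMAR = Grammar(
    variables=frozenset({"list", "digit"}),
    terminals=frozenset("+-0123456789"),
    productions=(
        ("list", ("list", "+", "digit")),
        ("list", ("list", "-", "digit")),
        ("list", ("digit",)),
    ) + tuple(("digit", (d,)) for d in "0123456789"),
    start="list",
)
```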


Derivation

list -> list + digit | list - digit | digit
digit -> 0 | 1 | ... | 9

The language consists of lists of digits separated by + or -.
The string 9-5+2 belongs to the language specified by the grammar, as the following derivation shows:

list => list + digit
     => list - digit + digit
     => digit - digit + digit
     => 9 - digit + digit
     => 9 - 5 + digit
     => 9 - 5 + 2

The name context-free comes from the fact that the use of a production for a nonterminal X does not depend on the context of X.


Derivation

• If only the leftmost nonterminal of each sentential form is replaced, the derivation is a leftmost derivation.

• Every leftmost step can be written as wAγ =>lm wδγ, where w is a string of terminals and A -> δ is a production.

• A rightmost derivation is defined analogously: the rightmost nonterminal is replaced at each step.
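The two derivation orders can be replayed mechanically. The following Python sketch uses the list/digit grammar from the previous slide; the names PRODUCTIONS, leftmost_step, rightmost_step and derive are assumptions made for this example. Each step replaces the leftmost (or rightmost) nonterminal of the current sentential form by a chosen production body.

```python
PRODUCTIONS = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}


def _step(form, body, indices):
    """Replace the first nonterminal found along indices by body."""
    for i in indices:
        sym = form[i]
        if sym in PRODUCTIONS:
            if body not in PRODUCTIONS[sym]:
                raise ValueError(f"{sym} -> {' '.join(body)} is not a production")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no nonterminal left to replace")


def leftmost_step(form, body):
    return _step(form, body, range(len(form)))


def rightmost_step(form, body):
    return _step(form, body, reversed(range(len(form))))


def derive(step, bodies, start="list"):
    """Apply the production bodies in order; return every sentential form."""
    form = [start]
    trace = [" ".join(form)]
    for body in bodies:
        form = step(form, body)
        trace.append(" ".join(form))
    return trace


LEFT = [["list", "+", "digit"], ["list", "-", "digit"], ["digit"],
        ["9"], ["5"], ["2"]]
RIGHT = [["list", "+", "digit"], ["2"], ["list", "-", "digit"],
         ["5"], ["digit"], ["9"]]
```

Calling derive(leftmost_step, LEFT) reproduces the derivation of 9 - 5 + 2 shown on the previous slide, and derive(rightmost_step, RIGHT) reaches the same string through different intermediate forms.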


Parse tree

• It shows how the start symbol of a grammar derives a string in the language.
• The root is labeled by the start symbol.
• Leaf nodes are labeled by tokens.
• Each internal node is labeled by a nonterminal.
• If A is a nonterminal labeling an internal node and x1, x2, ..., xn are the labels of the children of that node, then A -> x1 x2 ... xn is a production.
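These properties can be checked on a concrete tree. The Python sketch below encodes a parse tree for 9-5+2 under the list/digit grammar as nested (label, children) pairs; this encoding and the names TREE, is_parse_tree and tree_yield are assumptions made for illustration. It verifies that every internal node matches a production and that the leaves spell out the string.

```python
PRODUCTIONS = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}


def digit(d):
    """Internal node for the production digit -> d."""
    return ("digit", [d])


# Internal nodes are (nonterminal, children); leaves are plain token strings.
TREE = ("list", [
    ("list", [
        ("list", [digit("9")]),
        "-",
        digit("5"),
    ]),
    "+",
    digit("2"),
])


def label(node):
    return node if isinstance(node, str) else node[0]


def is_parse_tree(node) -> bool:
    """Every internal node A with children x1 ... xn needs A -> x1 ... xn."""
    if isinstance(node, str):
        return node not in PRODUCTIONS          # leaves must be tokens
    head, children = node
    body = [label(child) for child in children]
    return body in PRODUCTIONS.get(head, []) and all(
        is_parse_tree(child) for child in children
    )


def tree_yield(node) -> str:
    """Concatenate the leaf labels from left to right."""
    if isinstance(node, str):
        return node
    return "".join(tree_yield(child) for child in node[1])
```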


Ambiguity

• A grammar is ambiguous if some string has more than one parse tree.

• Consider the grammar
string -> string + string | string - string | 0 | 1 | ... | 9

• The string 9-5+2 has two parse trees: one groups it as (9-5)+2 and the other as 9-(5+2).
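A small enumeration makes the ambiguity concrete. The Python sketch below lists every parse tree of a token sequence under the grammar above by trying each operator as the top-level split; the names parse_trees and evaluate, and the tuple encoding of trees, are assumptions made for this example. Evaluating the trees shows that the two readings of 9-5+2 give different values.

```python
def parse_trees(tokens):
    """All parse trees under string -> string + string | string - string | digit."""
    if len(tokens) == 1:
        return [tokens[0]] if tokens[0].isdigit() else []
    trees = []
    for i, tok in enumerate(tokens):
        if tok in ("+", "-"):
            # Treat this operator as the root and parse both sides recursively.
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, tok, right))
    return trees


def evaluate(tree):
    """Compute the value a tree assigns to the expression."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs - rhs


trees = parse_trees(list("9-5+2"))
values = sorted(evaluate(t) for t in trees)   # one value per parse tree
```

Here trees holds two trees: (9-5)+2 evaluates to 6 and 9-(5+2) evaluates to 2, so the grammar alone does not fix the meaning of the string.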


Ambiguity

• Ambiguity is problematic because a program may be given a meaning other than the intended one.

• Ambiguity can be handled in several ways.

– Enforce associativity and precedence.

– Rewrite the grammar (cleanest way).

• There is no general technique for handling ambiguity.

• It is impossible to automatically convert every ambiguous grammar to an unambiguous one.


Example

• String of balanced parentheses

S -> ( S ) S | ε

• For example, consider the string (( )). It can be derived as:

S => ( S ) S
  => ( ( S ) S ) S      replacing the inner S with ( S ) S
  => ( ( ) )            replacing every remaining S with ε
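The grammar S -> ( S ) S | ε translates almost line by line into a recursive recognizer. The Python sketch below is one way to do it; the function names parse_S and is_balanced are assumptions made for this example. Each call to parse_S either matches "(", an inner S, ")" and a trailing S, or matches ε by consuming nothing.

```python
def parse_S(s, pos=0):
    """Match S -> ( S ) S | ε from pos; return the position after the match."""
    if pos < len(s) and s[pos] == "(":
        pos = parse_S(s, pos + 1)              # inner S
        if pos >= len(s) or s[pos] != ")":
            raise SyntaxError(f"expected ')' at position {pos}")
        return parse_S(s, pos + 1)             # trailing S
    return pos                                 # S -> ε consumes nothing


def is_balanced(s):
    """True if the whole of s can be derived from S."""
    try:
        return parse_S(s) == len(s)
    except SyntaxError:
        return False
```

With this sketch, is_balanced("(())") and is_balanced("()()") hold, while is_balanced("(()") and is_balanced(")(") do not.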


Parsing

• The process of determining whether a string can be generated by a grammar.

• Parsing falls into two categories:

– Top-down parsing:

• Construction of the parse tree starts at the root (the start symbol) and proceeds towards the leaves (tokens or terminals).

– Bottom-up parsing:

• Construction of the parse tree starts at the leaf nodes (tokens or terminals of the grammar) and proceeds towards the root (the start symbol).


The Parsing Problem

• Two categories of parsers

– Top-down: construction of the parse tree starts at the root (the start symbol) and proceeds towards the leaves (tokens or terminals). The order is that of a leftmost derivation.

• Does a preorder traversal of the tree

– Node, then branches

– Branches followed left to right


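Top-down parsing

S -> AB
A -> aA | ε
B -> b | bB

Here is a top-down parse of aaab.

Sentential form     Production applied
S
AB                  S -> AB
aAB                 A -> aA
aaAB                A -> aA
aaaAB               A -> aA
aaaεB               A -> ε
aaab                B -> b

[Figure: parse tree for aaab with root S and children A and B; A expands through three nested a A nodes ending in ε, and B expands to b]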
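The parse above can be reproduced with a small recursive-descent sketch in Python, with one function per nonterminal. The function names and the way productions are recorded as strings are assumptions made for this example. Because both alternatives of B begin with b, this sketch peeks one symbol past the current b to choose between them, which hints at why common prefixes are awkward for top-down parsers.

```python
def parse(text):
    """Return the productions used, in leftmost-derivation order.

    Raises SyntaxError if text is not generated by the grammar.
    """
    pos = 0
    used = ["S -> AB"]

    def expect(ch):
        nonlocal pos
        if text[pos:pos + 1] != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def parse_A():
        if text[pos:pos + 1] == "a":        # lookahead a selects A -> aA
            used.append("A -> aA")
            expect("a")
            parse_A()
        else:                               # anything else selects A -> ε
            used.append("A -> ε")

    def parse_B():
        # Both alternatives start with b: look at the symbol after it.
        more = text[pos + 1:pos + 2] == "b"
        used.append("B -> bB" if more else "B -> b")
        expect("b")
        if more:
            parse_B()

    parse_A()
    parse_B()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return used
```

For the input aaab, parse returns the same sequence of productions as the table: S -> AB, three uses of A -> aA, then A -> ε and B -> b.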


The Parsing Problem

• Bottom-up: construction of the parse tree starts at the leaf nodes (tokens or terminals of the grammar) and proceeds towards the root (the start symbol).

• The order is that of the reverse of a rightmost derivation.

• Parsers look only one token ahead in the input.


Bottom-up parsing

E -> T + E | T
T -> int * T | int | ( E )

Consider the string: int * int + int

Bottom-up parsing reduces a string to the start symbol by inverting productions.
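As a rough sketch of what inverting productions looks like for int * int + int, the Python code below runs a hand-written shift-reduce loop. The decision rules are tailored to this one grammar and are an assumption of this example, not a general bottom-up parsing algorithm; the function name shift_reduce is likewise made up. Each reduction replaces a production body on top of the stack by its head, using one token of lookahead to decide between shifting and reducing.

```python
def shift_reduce(tokens):
    """Return (sentential forms after each reduction, accepted?)."""
    stack, rest, forms = [], list(tokens), []

    def reduce(n, head):
        del stack[-n:]                 # pop the production body
        stack.append(head)             # push the production head
        forms.append(" ".join(stack + rest))

    while True:
        lookahead = rest[0] if rest else None
        if stack[-3:] == ["int", "*", "T"]:
            reduce(3, "T")             # T -> int * T
        elif stack[-3:] == ["(", "E", ")"]:
            reduce(3, "T")             # T -> ( E )
        elif stack[-3:] == ["T", "+", "E"]:
            reduce(3, "E")             # E -> T + E
        elif stack[-1:] == ["int"] and lookahead != "*":
            reduce(1, "T")             # T -> int
        elif stack[-1:] == ["T"] and lookahead != "+":
            reduce(1, "E")             # E -> T
        elif rest:
            stack.append(rest.pop(0))  # shift the next token
        else:
            break
    return forms, stack == ["E"]


forms, accepted = shift_reduce(["int", "*", "int", "+", "int"])
```

The recorded forms are int * T + int, T + int, T + T, T + E and E. Read from last to first, they are the steps of a rightmost derivation of the input.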


Take home message

• Parsing checks whether tokens conform to the language syntax.

• It often generates a parse tree, or reports an error if the input string does not conform.

• Grammar transformations help in improving the performance of the parser.

• Top-down parsers cannot handle left-recursive grammars, and predictive top-down parsers also struggle with grammars that have common prefixes.
