Chapter 6 Part IV


Bottom-Up Parsing

Naïve Bottom Up Parsing Algorithm


SSM – Source String Marker
Handle
Stack Implementation of Shift
Reduce Parser
Shift – Reduce Parser

Input String – id + id * id
Problem Solving
1. Grammar is
S -> aS
S -> aT
T -> bW
W -> c
Enter Input String: aaabc

2. Grammar is
S -> aS
S -> aT
T -> bW
W -> c
Enter Input String: abcd
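
Both problems can be checked mechanically. The following is a minimal sketch of a naive shift-reduce recognizer for this grammar, assuming a greedy strategy (reduce whenever a production body is a suffix of the stack, otherwise shift). That strategy happens to work for this small grammar but is not a general parsing method; the names PRODUCTIONS and shift_reduce are illustrative, not taken from the slides.

```python
# Productions of the example grammar as (head, body) pairs.
PRODUCTIONS = [("S", "aS"), ("S", "aT"), ("T", "bW"), ("W", "c")]


def shift_reduce(text, start="S"):
    """Greedy shift-reduce recognizer; returns (accepted, trace)."""
    stack, pos, trace = "", 0, []
    while True:
        # Reduce if some production body is a suffix of the stack.
        for head, body in PRODUCTIONS:
            if stack.endswith(body):
                stack = stack[: -len(body)] + head
                trace.append((f"reduce {head} -> {body}", stack))
                break
        else:
            # No reduction applies: stop at end of input, otherwise shift.
            if pos == len(text):
                break
            stack += text[pos]
            pos += 1
            trace.append(("shift", stack))
    # Accept only if the whole input was reduced to the start symbol.
    return stack == start, trace


if __name__ == "__main__":
    for word in ("aaabc", "abcd"):
        accepted, steps = shift_reduce(word)
        print(word, "accepted" if accepted else "rejected")
        for action, stack in steps:
            print("   ", action.ljust(18), stack)
```

Under these assumptions the first input reduces to S, while the second leaves the unreduced symbol d on the stack and is rejected.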
Operator Grammar
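• No ε-productions.
• No two adjacent non-terminals in any production body.
E.g.
E -> E op E | id
op -> + | *
The above grammar is not an operator grammar, because the body E op E contains adjacent non-terminals, but the following equivalent grammar is:
E -> E + E | E * E | id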
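
The two conditions can be tested directly on a grammar. Below is a small sketch, assuming a grammar is represented as a dictionary from each non-terminal to a list of bodies, where a body is a list of symbols and an empty list stands for ε. This representation and the function name are assumptions made for illustration.

```python
def is_operator_grammar(productions):
    """Check the two operator-grammar conditions on a grammar dictionary."""
    nonterminals = set(productions)
    for bodies in productions.values():
        for body in bodies:
            # Condition 1: no epsilon production (empty body).
            if not body:
                return False
            # Condition 2: no two adjacent non-terminals in a body.
            for left, right in zip(body, body[1:]):
                if left in nonterminals and right in nonterminals:
                    return False
    return True


with_op = {
    "E": [["E", "op", "E"], ["id"]],
    "op": [["+"], ["*"]],
}
without_op = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["id"]],
}

if __name__ == "__main__":
    print(is_operator_grammar(with_op))     # E and op are adjacent
    print(is_operator_grammar(without_op))
```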
Operator Precedence

• If a has higher precedence than b: a .> b
• If a has lower precedence than b: a <. b
• If a and b have equal precedence: a =. b
Note:
– id has higher precedence than any other symbol.
– $ has the lowest precedence.
– If two operators have equal precedence, the associativity of the operator decides the relation: .> for a left-associative operator, <. for a right-associative one.
Precedence Table

(rows: terminal on top of the stack; columns: next input symbol)

        id     +      *      $
id             .>     .>     .>
+       <.     .>     <.     .>
*       <.     .>     .>     .>
$       <.     <.     <.     accept

Example: w = $ id + id * id $

Inserting the precedence relations between adjacent terminals gives:
$<.id.>+<.id.>*<.id.>$
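
The annotated string can be produced by looking up each adjacent pair of terminals in the precedence table. The following sketch assumes the table is stored as a nested dictionary indexed as PREC[left][right]; the names PREC and annotate are illustrative.

```python
# Precedence relations, indexed as PREC[left][right].
PREC = {
    "id": {"+": ".>", "*": ".>", "$": ".>"},
    "+": {"id": "<.", "+": ".>", "*": "<.", "$": ".>"},
    "*": {"id": "<.", "+": ".>", "*": ".>", "$": ".>"},
    "$": {"id": "<.", "+": "<.", "*": "<."},
}


def annotate(tokens):
    """Insert the precedence relation between each adjacent terminal pair."""
    out = [tokens[0]]
    for left, right in zip(tokens, tokens[1:]):
        out.append(PREC[left][right])  # KeyError means no relation is defined
        out.append(right)
    return "".join(out)


if __name__ == "__main__":
    print(annotate(["$", "id", "+", "id", "*", "id", "$"]))
    # prints $<.id.>+<.id.>*<.id.>$
```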
Basic Principle
• Scan the input string from left to right until the first .> is found, and put a pointer at its location.
• Then scan backwards (to the left) until the first <. is reached.
• The string between <. and .> is the handle.
• Replace the handle by the head of the corresponding production.
• Repeat until the start symbol is reached.
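Algorithm
w ← input string followed by $
a ← current input symbol
b ← topmost terminal on the stack (initially $)
Repeat
{
    if(a is $ and b is $)
        return                      // accept
    if(b <. a or b =. a)            // shift
        push a onto the stack
        move input pointer
    else if(b .> a)                 // reduce
        repeat
            c ← pop stack
        until(stack top <. c)
    else
        error()
}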
Example
STACK        INPUT              ACTION/REMARK
$            id + id * id $     $ <. id, shift
$ id         + id * id $        id .> +, reduce
$            + id * id $        $ <. +, shift
$ +          id * id $          + <. id, shift
$ + id       * id $             id .> *, reduce
$ +          * id $             + <. *, shift
$ + *        id $               * <. id, shift
$ + * id     $                  id .> $, reduce
$ + *        $                  * .> $, reduce
$ +          $                  + .> $, reduce
$            $                  accept
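
The trace above can be reproduced with a short program. This is a sketch only, under two assumptions: the stack holds terminals only (as in the example table), and a reduction pops terminals until the new stack top is related by <. to the terminal just popped. It does not check that a popped handle matches a production body, so it is a skeleton of the technique rather than a complete parser. The names PREC and parse are illustrative.

```python
# Precedence relations, indexed as PREC[stack_top][input_symbol].
PREC = {
    "id": {"+": ".>", "*": ".>", "$": ".>"},
    "+": {"id": "<.", "+": ".>", "*": "<.", "$": ".>"},
    "*": {"id": "<.", "+": ".>", "*": ".>", "$": ".>"},
    "$": {"id": "<.", "+": "<.", "*": "<."},
}


def parse(tokens):
    """Return the list of (stack, input, action) rows of the parse."""
    stack, rest, trace = ["$"], list(tokens) + ["$"], []
    while True:
        top, look = stack[-1], rest[0]
        row = (" ".join(stack), " ".join(rest))
        if top == "$" and look == "$":
            trace.append((*row, "accept"))
            return trace
        relation = PREC.get(top, {}).get(look)
        if relation in ("<.", "=."):
            # Shift: push the input symbol and advance the input pointer.
            trace.append((*row, f"{top} {relation} {look}, shift"))
            stack.append(rest.pop(0))
        elif relation == ".>":
            # Reduce: pop until the stack top <. the terminal just popped.
            trace.append((*row, f"{top} .> {look}, reduce"))
            while True:
                popped = stack.pop()
                if PREC[stack[-1]].get(popped) == "<.":
                    break
        else:
            raise SyntaxError(f"no precedence relation between {top} and {look}")


if __name__ == "__main__":
    for stack, rest, action in parse(["id", "+", "id", "*", "id"]):
        print(stack.ljust(12), rest.ljust(18), action)
```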
Lex Tool
Phases of Compiler
Lex
• Lex is a program that generates lexical analyzers.
• A lexical analyzer is a program that transforms an input stream into a sequence of tokens.
• Lex reads a specification of the lexical analyzer and produces C source code that implements it.
• First, the specification of the lexical analyzer is written in the Lex language as a program lex.l. The Lex compiler then translates lex.l into a C program lex.yy.c.
• Finally, the C compiler compiles lex.yy.c and produces an object program a.out.
• a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.
Structure of Lex Programs
declarations
%%
translation rules
%%
auxiliary functions
• Declarations: This section includes declarations of variables and constants.
• Translation rules: This section contains regular expressions and code segments.
• Form: Pattern {Action}
• Pattern is a regular expression or regular definition.
• Action refers to a segment of code.
• Auxiliary functions: This section holds additional functions which are used in actions.
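
The Pattern {Action} idea can be illustrated outside Lex. The sketch below is a Python analogue built on the standard re module, not Lex itself. It assumes the usual Lex conflict resolution (prefer the longest match; on a tie, prefer the rule listed first). The rule set, the token names and the function tokenize are made up for illustration.

```python
import re

# (pattern, action) pairs in priority order; an action of None discards the match.
RULES = [
    (re.compile(r"[ \t\n]+"), None),
    (re.compile(r"if"), lambda text: ("IF", text)),
    (re.compile(r"[A-Za-z_][A-Za-z0-9_]*"), lambda text: ("ID", text)),
    (re.compile(r"[0-9]+"), lambda text: ("NUM", int(text))),
    (re.compile(r"[+*]"), lambda text: ("OP", text)),
]


def tokenize(source):
    """Apply the rules repeatedly: longest match wins, earliest rule breaks ties."""
    pos, tokens = 0, []
    while pos < len(source):
        best_len, best_action = 0, None
        for pattern, action in RULES:
            match = pattern.match(source, pos)
            if match and match.end() - pos > best_len:
                best_len, best_action = match.end() - pos, action
        if best_len == 0:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_action is not None:
            tokens.append(best_action(source[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    print(tokenize("if x1 + 42 * iffy"))
```

Here "if" is matched by the keyword rule because it is listed before the identifier rule, while "iffy" becomes an identifier because the identifier pattern matches a longer string.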
• 1. Definition Section: The definition section contains the declarations of variables, regular definitions and constants.
• In the definition section, text is enclosed in “%{ %}” brackets.
• Anything written in these brackets is copied directly to the file lex.yy.c.

• Syntax:
%{
// Definitions
%}
• 2. Rules Section: The rules section contains a series of rules of the form: pattern action. The pattern must be unindented, and the action must begin on the same line, in {} brackets. The rules section is enclosed in “%% %%”.

%%
pattern action
%%
• 3. User Code Section: This section contains C statements and additional functions. These functions can also be compiled separately and loaded with the lexical analyzer.
Basic Program Structure:

%{
// Definitions
%}

%%
Rules
%%
User code section
• Design of Lexical Analyzer
• The generated lexical analyzer can be implemented as either an NFA or a DFA.
• A DFA is preferable in the implementation of lex.
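
To make the DFA idea concrete, here is a minimal table-driven sketch that recognizes identifiers (a letter or underscore followed by letters, digits or underscores). The state names, the character classes and the transition table are assumptions chosen for illustration; they are not the tables that Lex generates.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Transition table of the DFA; a missing entry means "no move" (reject).
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
}
ACCEPTING = {"ident"}


def dfa_accepts(text):
    """Run the DFA over text; exactly one state is active at any time."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for word in ("x1", "_tmp", "1x", ""):
        print(repr(word), dfa_accepts(word))
```

Because a DFA follows a single state per input character, each character costs one table lookup, which is one reason a DFA is attractive for a scanner.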
• How to run the program:
The program should first be saved with the extension .l or .lex.
Run the commands below in a terminal, in order.

Step 1: lex filename.l or lex filename.lex, depending on the extension the file is saved with
Step 2: gcc lex.yy.c
Step 3: ./a.out (or a.exe on Windows)
Step 4: Provide the input to the program if it is required
• yylex(): the main entry point for lex. It reads the input stream, generates tokens, and returns zero at the end of the input stream.
• It is called to invoke the lexer (or scanner). Each time yylex() is called, the scanner continues processing the input from where it last left off.
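
The calling convention described above can be imitated in a few lines. The sketch below is a Python stand-in, not the C function that Lex generates: each call returns the code of the next token, resumes where the previous call stopped, and returns 0 at the end of the input. The class Scanner, the token codes NUM and PLUS, and the tiny token set are hypothetical.

```python
import re

NUM, PLUS = 258, 259  # hypothetical token codes; 0 is reserved for end of input
_TOKEN = re.compile(r"\s*(?:(\d+)|(\+))")


class Scanner:
    def __init__(self, text):
        self.text = text
        self.pos = 0        # where the previous call left off
        self.yytext = ""    # text of the most recent token

    def yylex(self):
        """Return the next token code, or 0 when the input is exhausted."""
        match = _TOKEN.match(self.text, self.pos)
        if match is None:
            if self.text[self.pos:].strip():
                raise ValueError(f"unexpected input at offset {self.pos}")
            return 0
        self.pos = match.end()
        self.yytext = match.group(1) or match.group(2)
        return NUM if match.group(1) is not None else PLUS


if __name__ == "__main__":
    scanner = Scanner("1 + 22")
    code = scanner.yylex()
    while code != 0:
        print(code, scanner.yytext)
        code = scanner.yylex()
```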
YACC
• YACC stands for Yet Another Compiler Compiler.
• YACC provides a tool to produce a parser for a given grammar.
• YACC is designed to process LALR(1) grammars.
• It produces the source code of the syntactic analyzer (parser) for the language generated by an LALR(1) grammar.
• The input to YACC is a grammar specification (the rules) and the output is a C program.
• Full specification looks like:
declarations
%%
rules
%%
programs

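• The rules section is made up of one or more grammar rules. A grammar rule has the form:

A : BODY ;

If several grammar rules have the same left-hand side, the vertical bar “|” can be used to avoid rewriting the left-hand side. Thus the grammar rules

A : B C D ;
A : E F ;
A : G ;

can be given to Yacc as

A : B C D
  | E F
  | G
  ;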
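
The equivalence of the two notations can be shown by expanding the "|" shorthand back into separate rules. The helper below is an illustrative sketch only: it assumes symbols are separated by whitespace and that the rule contains no actions and no quoted ':' or '|' literals, which real Yacc input may contain. The function name expand_rule is hypothetical.

```python
def expand_rule(rule_text):
    """Split 'A : B C D | E F | G ;' into one (head, body) pair per alternative."""
    head, body = rule_text.strip().rstrip(";").split(":", 1)
    return [(head.strip(), alternative.split()) for alternative in body.split("|")]


if __name__ == "__main__":
    for head, symbols in expand_rule("A : B C D | E F | G ;"):
        print(head, ":", " ".join(symbols), ";")
```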
