CH03


3. Syntax Analysis
❒ Syntax: the way in which tokens are put together to form expressions, statements, or blocks of statements.
❍ The rules governing the formation of statements in a programming language.
❒ Syntax analysis: the task of fitting a sequence of tokens into a specified syntax.
❒ Parsing: breaking a sentence down into its component parts, with an explanation of the form, function, and syntactical relationship of each part.
❒ The syntax of a programming language is usually given by the grammar rules of a context-free grammar (CFG).

Role of a Parser
❒ The syntax analyzer (parser) checks whether a given source program satisfies the rules implied by a CFG.
❍ If it does, the parser creates the parse tree of that program.
❍ Otherwise, the parser reports error messages.

❒ A CFG:
❍ gives a precise syntactic specification of a programming language.
❍ can be converted directly into a parser by tools such as yacc.

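Role of a Parser…
[Figure: the source program feeds the lexical analyzer one character at a time ("get next char"); the lexical analyzer hands the syntax analyzer one token at a time ("get next token"); the syntax analyzer produces the parse tree. Both analyzers consult the symbol table, which contains a record for each identifier, and report lexical errors and syntax errors respectively.]
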
Parser…
❒ Parsers can be categorized into two groups:
❒ Top-down parser
❍ The parse tree is created top to bottom, starting from the root and working toward the leaves.
❒ Bottom-up parser
❍ The parse tree is created bottom to top, starting from the leaves and working toward the root.
❒ Both top-down and bottom-up parsers scan the input from left to right (one symbol at a time).
❒ Efficient top-down and bottom-up parsers can be implemented only for subclasses of context-free grammars:
❍ LL for top-down parsing
❍ LR for bottom-up parsing
Context-free grammar (CFG)
❒ A context-free grammar is a specification for the syntactic structure of a programming language.
A context-free grammar is a 4-tuple:
G = (T, N, P, S) where
❍ T is a finite set of terminals (a set of tokens)
❍ N is a finite set of non-terminals (syntactic variables)
❍ P is a finite set of productions of the form A → α, where A is a non-terminal and α is a string of terminals and non-terminals (including the empty string)
❍ S ∈ N is a designated start symbol (one of the non-terminal symbols)

Example: grammar for simple arithmetic expressions

expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id

Terminal symbols: id + - * / ( )
Non-terminals: expression, term, factor
Start symbol: expression

Notational Conventions Used
❒ Terminals:
❍ Lowercase letters early in the alphabet, such as a, b, c.
❍ Operator symbols such as +, *, and so on.
❍ Punctuation symbols such as parentheses, comma, and so on.
❍ The digits 0, 1, . . . , 9.
❍ Boldface strings such as id or if, each of which represents a single terminal symbol.
❒ Non-terminals:
❍ Uppercase letters early in the alphabet, such as A, B, C.
❍ The letter S is usually the start symbol.
❍ Lowercase, italic names such as expr or stmt.
❍ Uppercase letters may be used to represent the non-terminals for language constructs.
• expr, term, and factor are represented by E, T, F

Notational Conventions Used…
❒ Grammar symbols
❍ Uppercase letters late in the alphabet, such as X, Y, Z, represent grammar symbols, that is, either non-terminals or terminals.
❒ Strings of terminals
❍ Lowercase letters late in the alphabet, mainly u, v, x, y ∈ T*
❒ Strings of grammar symbols
❍ Lowercase Greek letters, α, β, γ ∈ (N∪T)*
❒ A set of productions A → α1, A → α2, . . . , A → αk with a common head A (call them A-productions) may be written
A → α1 | α2 | … | αk
α1, α2, . . . , αk are the alternatives for A.
❒ Unless stated otherwise, the head of the first production is the start symbol.

E → E + T | E - T | T
T → T * F | T / F | F
F → ( E ) | id
Derivation
❒ A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
Example: E → E + E | E - E | E * E | E / E | - E
         E → ( E )
         E → id

E ⇒ E + E means that E + E is derived from E:
- we can replace E by E + E
- to do so, the grammar must contain the production rule E → E + E.
E ⇒ E + E ⇒ id + E ⇒ id + id
Such a sequence of replacements of non-terminal symbols is called a derivation of id + id from E.
❒ In general, the one-step derivation is defined by
α A β ⇒ α γ β if there is a production rule A → γ in our grammar,
where α and β are arbitrary strings of terminal and non-terminal symbols.
α1 ⇒ α2 ⇒ … ⇒ αn (αn is derived from α1, or α1 derives αn)
Derivation…
❒ If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.
Example: E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)
❒ If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
Example: E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)

❒ A top-down parser tries to find the left-most derivation of the given source program.
❒ A bottom-up parser tries to find the right-most derivation of the given source program in reverse order.
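
A left-most derivation step can be sketched in C as a string rewrite. This is only an illustration under stated assumptions: every grammar symbol is a single character, any uppercase letter is a non-terminal, the lowercase letter i stands for the token id, and the hypothetical helper leftmost_step does not check that the replacement is really a production of the grammar.

```c
#include <string.h>

/* Replaces the leftmost non-terminal (any uppercase letter) in `form`
   with `rhs`. Returns 1 on success, 0 if `form` has no non-terminal
   or the result would not fit in `cap` bytes. */
int leftmost_step(char *form, size_t cap, const char *rhs) {
    size_t len = strlen(form);
    size_t rlen = strlen(rhs);
    size_t i = 0;

    /* find the leftmost non-terminal */
    while (i < len && !(form[i] >= 'A' && form[i] <= 'Z')) i++;
    if (i == len) return 0;          /* already a sentence */
    if (len + rlen > cap) return 0;  /* new length is len - 1 + rlen, plus NUL */

    /* move the tail (including the terminating NUL), then copy rhs in */
    memmove(form + i + rlen, form + i + 1, len - i);
    memcpy(form + i, rhs, rlen);
    return 1;
}
```

Starting from "E" and applying the right-hand sides "-E", "(E)", "E+E", "i", "i" in that order reproduces the left-most derivation shown above, ending in "-(i+i)".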

Parse tree
❒ A parse tree is a graphical representation of a derivation.
❒ It filters out the order in which productions are applied to replace non-terminals.

❒ A parse tree corresponding to a derivation is a labeled tree in which:
• the interior nodes are labeled by non-terminals,
• the leaf nodes are labeled by terminals, and
• the children of each internal node represent the replacement of the associated non-terminal in one step of the derivation.
Parse tree and Derivation
Grammar: E → E + E | E ∗ E | ( E ) | - E | id
Let's examine this derivation:
E ⇒ -E ⇒ -(E) ⇒ -(E + E) ⇒ -(id + E) ⇒ -(id + id)

[Figure: the parse tree after each step, growing from the single node E to the final tree below.]

        E
       / \
      -   E
        / | \
       (  E  )
        / | \
       E  +  E
       |     |
       id    id

This is a top-down derivation because we start building the parse tree at the top.
Exercise
a) Using the grammar below, draw a parse tree for the following string:
( ( id . id ) id ( id ) ( ( ) ) )
S → E
E → id
  | ( E . E )
  | ( L )
  | ( )
L → L E
  | E
b) Give a rightmost derivation for the string given in (a).

Ambiguity
❒ A grammar that produces more than one parse tree for some sentence is called an ambiguous grammar. Equivalently, it
• produces more than one leftmost derivation, or
• more than one rightmost derivation for the same sentence.

❒ We should eliminate the ambiguity in the grammar during the design phase of the compiler.
❒ An unambiguous grammar should be written to eliminate the ambiguity.
❒ Ambiguous grammars (because of ambiguous operators) can be disambiguated according to the precedence and associativity rules.

Ambiguity: Example
❒ Example: The arithmetic expression grammar
E → E + E | E * E | ( E ) | id
❒ permits two distinct leftmost derivations for the sentence id + id * id:
(a)                     (b)
E ⇒ E + E               E ⇒ E * E
  ⇒ id + E                ⇒ E + E * E
  ⇒ id + E * E            ⇒ id + E * E
  ⇒ id + id * E           ⇒ id + id * E
  ⇒ id + id * id          ⇒ id + id * id
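
The ambiguity can be made concrete by counting parse trees. The following is a minimal sketch, assuming a parenthesis-free token string in which the character i stands for id; the function names count_range and count_parse_trees are hypothetical. It tries every operator as the root of the tree, exactly as the productions E → E + E and E → E * E allow.

```c
#include <string.h>

/* Counts the parse trees of s[lo..hi) under E -> E + E | E * E | id,
   where the character 'i' stands for the token id. */
static long count_range(const char *s, size_t lo, size_t hi) {
    if (hi - lo == 1) return s[lo] == 'i' ? 1 : 0;
    long total = 0;
    for (size_t k = lo + 1; k + 1 < hi; k++) {
        if (s[k] == '+' || s[k] == '*') {
            /* s[k] is the operator at the root; both sides must be an E */
            total += count_range(s, lo, k) * count_range(s, k + 1, hi);
        }
    }
    return total;
}

/* Returns the number of distinct parse trees for the whole string. */
long count_parse_trees(const char *s) {
    size_t n = strlen(s);
    return n == 0 ? 0 : count_range(s, 0, n);
}
```

For "i+i*i" the count is 2, matching derivations (a) and (b). The running time grows exponentially with the number of operators, so this is meant for small examples only.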

Ambiguity: example

According to the grammar, both are correct.

A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar.
Elimination of ambiguity
Precedence/Associativity
❒ These two derivations point out a problem with the grammar:
❒ The grammar has no notion of precedence, or implied order of evaluation.

To add precedence
❒ Create a non-terminal for each level of precedence
❒ Isolate the corresponding part of the grammar
❒ Force the parser to recognize high-precedence subexpressions first
For algebraic expressions
❒ Multiplication and division first (level one)
❒ Subtraction and addition next (level two)

To add associativity
❒ Left-associative: the next-level (higher-precedence) non-terminal is placed at the end of the production
Elimination of ambiguity
❒ To disambiguate the grammar:

E → E + E | E ∗ E | ( E ) | id

❒ we can use the precedence of operators as follows:
* Higher precedence (left associative)
+ Lower precedence (left associative)

❒ We get the following unambiguous grammar:
E → E + T | T
T → T ∗ F | F
F → ( E ) | id
Under this grammar, id + id * id has a single parse tree, in which id * id is grouped first.
Left Recursion
Consider the grammar:
E → E + T | T
T → T ∗ F | F
F → ( E ) | id

A top-down parser might loop forever when parsing an expression using this grammar.

[Figure: E is expanded to E + T, the leftmost E of that tree is expanded to E + T again, and so on, without any input ever being consumed.]
Elimination of Left recursion
❒ A grammar is left recursive if it has a non-terminal A such that there is a derivation
A ⇒+ Aα for some string α.
❒ Top-down parsing methods cannot handle left-recursive grammars, so a transformation that eliminates left recursion is needed.
To eliminate immediate left recursion, the productions
A → Aα | β
can be replaced by the non-left-recursive productions
A → β A’
A’ → α A’ | ε
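
The rewrite rule is mechanical, which a few lines of C can show. This is a sketch under stated assumptions: it handles one left-recursive alternative and one non-recursive alternative, the function name eliminate_immediate_left_recursion is hypothetical, and the output spells ε as the word epsilon and the new non-terminal with an ASCII apostrophe.

```c
#include <stdio.h>

/* Given A -> A alpha | beta, writes the equivalent rules
   A -> beta A'   and   A' -> alpha A' | epsilon   into `out`.
   Returns 1 on success, 0 if `out` (of `cap` bytes) is too small. */
int eliminate_immediate_left_recursion(char head, const char *alpha,
                                       const char *beta,
                                       char *out, size_t cap) {
    int n = snprintf(out, cap,
                     "%c -> %s %c'\n%c' -> %s %c' | epsilon\n",
                     head, beta, head, head, alpha, head);
    return n >= 0 && (size_t)n < cap;
}
```

Calling it with head 'E', alpha "+ T" and beta "T" yields the two rules E -> T E' and E' -> + T E' | epsilon.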
Elimination of Left recursion…
This left-recursive grammar:
E → E + T | T
T → T ∗ F | F
F → ( E ) | id

can be rewritten to eliminate the immediate left recursion:

E → TE’
E’ → +TE’ | ε
T → FT’
T’ → ∗FT’ | ε
F → ( E ) | id
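
Because no production of the rewritten grammar starts with its own head, each non-terminal can become one C function that never calls itself before consuming input. The recognizer below is a minimal sketch, assuming the input is a string without spaces in which the character i stands for id; the function names are hypothetical.

```c
#include <stddef.h>

static const char *pos;        /* next unread input character */

static int parse_E(void);

/* F -> ( E ) | id */
static int parse_F(void) {
    if (*pos == 'i') { pos++; return 1; }
    if (*pos == '(') {
        pos++;
        if (!parse_E()) return 0;
        if (*pos != ')') return 0;
        pos++;
        return 1;
    }
    return 0;
}

/* T' -> * F T' | epsilon */
static int parse_Tprime(void) {
    if (*pos == '*') { pos++; return parse_F() && parse_Tprime(); }
    return 1;                  /* epsilon: consume nothing */
}

/* T -> F T' */
static int parse_T(void) { return parse_F() && parse_Tprime(); }

/* E' -> + T E' | epsilon */
static int parse_Eprime(void) {
    if (*pos == '+') { pos++; return parse_T() && parse_Eprime(); }
    return 1;                  /* epsilon: consume nothing */
}

/* E -> T E' */
static int parse_E(void) { return parse_T() && parse_Eprime(); }

/* Returns 1 if the whole input is derived from E, 0 otherwise. */
int accepts_expression(const char *input) {
    pos = input;
    return parse_E() && *pos == '\0';
}
```

The same functions written for the original grammar would start parse_E by calling parse_E, which is the infinite loop described on the Left Recursion slide.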

Exercise: Parse id + id * id with the non-left-recursive grammar above, using a left-most derivation.
Top-Down and Bottom-Up Parsers
Top-down parsers:
• Start constructing the parse tree at the top (root) of the tree and move down towards the leaves.
• Easy to implement by hand, but work with restricted grammars.
example: Recursive Descent Parser

Bottom-up parsers:
• Build the nodes on the bottom of the parse tree first.
• Suitable for automatic parser generation; handle a larger class of grammars.
examples: shift-reduce parser (or LR(k) parsers)

Top-down (LL) parsing
Recursive Descent Parsing (RDP)
❒ This method of top-down parsing can be considered an attempt to find the leftmost derivation for an input string. It may involve backtracking.
❒ To construct the parse tree using RDP:
❍ we create a one-node tree consisting of S.
❍ two pointers, one for the tree and one for the input, are used to indicate where the parsing process is.
❍ initially, they are on S and the first input symbol, respectively.
❍ then we use the first S-production to expand the tree. The tree pointer is positioned on the leftmost symbol of the newly created sub-tree.

Recursive Descent Parsing (RDP)…

❒ whenever the symbol pointed to by the tree pointer matches the symbol pointed to by the input pointer, both pointers are moved to the right.
❒ whenever the tree pointer points to a non-terminal, we expand it using the first production of that non-terminal.
❒ whenever the pointers point to different terminals, the production that was used is not correct, so another production should be used. We have to go back to the step just before we replaced the non-terminal and use another production.
❒ if we reach the end of the input and the tree pointer passes the last symbol of the tree, we have finished parsing.
RDP…
❒ Example: G: S → cAd
             A → ab | a
❒ Draw the parse tree for the input string cad using the above method.
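
The backtracking in this example can be sketched in C. The code is an illustration written for this one grammar only, with hypothetical function names: it tries the A-productions in the order they are listed and, when the rest of S → cAd fails to match, goes back and tries the next one.

```c
#include <stddef.h>

/* Tries one A-production at s[pos]: alt 0 is A -> ab, alt 1 is A -> a.
   On success stores the position after the match in *end and returns 1. */
static int match_A(const char *s, size_t pos, int alt, size_t *end) {
    if (alt == 0) {
        if (s[pos] == 'a' && s[pos + 1] == 'b') { *end = pos + 2; return 1; }
        return 0;
    }
    if (s[pos] == 'a') { *end = pos + 1; return 1; }
    return 0;
}

/* Returns 1 if s is derived from S -> cAd, 0 otherwise. */
int rdp_accepts(const char *s) {
    if (s[0] != 'c') return 0;
    for (int alt = 0; alt < 2; alt++) {
        size_t end;
        /* expand A with the current production, then match d and end of input */
        if (match_A(s, 1, alt, &end) && s[end] == 'd' && s[end + 1] == '\0')
            return 1;
        /* mismatch: backtrack to position 1 and try the next A-production */
    }
    return 0;
}
```

For the input cad, the first production A → ab fails on the b, the parser backtracks, and A → a succeeds. For cabd the first production already succeeds.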

❒ Exercise: Consider the following grammar:
S → A
A → A + A | B++
B → y
Draw the parse tree for the input "y+++y++".

Homework:
Convert the grammar into a non-left-recursive grammar and draw the parse tree using RDP.
Exercise
❒ Using the grammar below, draw a parse tree for the following string using the RDP algorithm:
( ( id . id ) id ( id ) ( ( ) ) )
S → E
E → id
  | ( E . E )
  | ( L )
  | ( )
L → L E
  | E

Bottom-Up (LR) Parser
A bottom-up parser, or shift-reduce parser, begins at the leaves and works up to the top of the tree.

The reduction steps trace a rightmost derivation in reverse.

Consider the grammar:
S → aABe
A → Abc | b
B → d

We want to parse the input string abbcde.

Bottom-Up Parser: Simulation
Productions: S → aABe, A → Abc, A → b, B → d

Step  Input            Action
1     a b b c d e $    reduce the first b by A → b
2     a A b c d e $    do not reduce the second b by A → b here; a parser that did would get stuck and then have to backtrack
3     a A b c d e $    reduce A b c by A → Abc
4     a A d e $        reduce d by B → d
5     a A B e $        reduce a A B e by S → aABe
6     S $              accept

[Figure: alongside each step the output parse tree grows bottom-up: A over b, then A over A b c, then B over d, and finally S over a A B e.]

This parser is known as an LR parser because it scans the input from Left to right and constructs a Rightmost derivation in reverse order.
Bottom-up parser (LR parsing)
S → aABe
A → Abc | b
B → d

abbcde ⇐ aAbcde ⇐ aAde ⇐ aABe ⇐ S
(read from right to left, this is the rightmost derivation S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde)

❒ At each step, we have to find α such that α is a substring of the sentence, and replace α by A, where A → α.

Stack implementation of shift/reduce parsing
❒ In LR parsing the two major problems are:
❍ locating the substring that is to be reduced
❍ locating the production to use

❒ A shift/reduce parser operates:
❍ by shifting zero or more input symbols onto the stack until the right end of the handle is on top of the stack.
❍ The parser then replaces the handle by the non-terminal of the production.
❍ This is repeated until the start symbol is on the stack and the input is empty, or until an error is detected.

Stack implementation of shift/reduce parsing…

❒ Four actions are possible:
❍ shift: the next input symbol is shifted onto the top of the stack
❍ reduce: the parser knows the right end of the handle is at the top of the stack. It should then decide what non-terminal should replace that substring
❍ accept: the parser announces successful completion of parsing
❍ error: the parser discovers a syntax error
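
The four actions can be sketched in C for the grammar S → aABe, A → Abc | b, B → d used in the simulation. This is a hand-written illustration, not a generated LR parser: the decision about when a b on top of the stack is a handle is hard-coded (only directly after a), where a real LR parser would consult its parsing table. The function name shift_reduce_accepts is hypothetical.

```c
#include <string.h>

/* Returns 1 if `input` is accepted, 0 on a syntax error. */
int shift_reduce_accepts(const char *input) {
    char stack[256];
    size_t top = 0;                       /* number of symbols on the stack */
    size_t n = strlen(input);
    size_t i = 0;                         /* next input symbol */

    if (n >= sizeof stack) return 0;
    if (strspn(input, "abcde") != n) return 0;   /* terminals only */

    for (;;) {
        /* reduce: a handle is on top of the stack */
        if (top >= 4 && memcmp(stack + top - 4, "aABe", 4) == 0) {
            top -= 4; stack[top++] = 'S';         /* S -> aABe */
        } else if (top >= 3 && memcmp(stack + top - 3, "Abc", 3) == 0) {
            top -= 3; stack[top++] = 'A';         /* A -> Abc */
        } else if (top >= 2 && stack[top - 2] == 'a' && stack[top - 1] == 'b') {
            stack[top - 1] = 'A';                 /* A -> b, only right after a */
        } else if (top >= 1 && stack[top - 1] == 'd') {
            stack[top - 1] = 'B';                 /* B -> d */
        } else if (i < n) {
            stack[top++] = input[i++];            /* shift */
        } else {
            break;                                /* nothing left to do */
        }
    }
    /* accept if only the start symbol remains, otherwise error */
    return top == 1 && stack[0] == 'S';
}
```

On abbcde the stack goes through a, ab, aA, aAb, aAbc, aA, aAd, aAB, aABe, S, which is the reduction sequence of the simulation.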

Syntax error handling
❒ If a compiler had to process only correct programs, its design and implementation would be simplified greatly.
❒ However, a compiler is expected to assist the programmer in locating and tracking down errors that inevitably creep into programs, despite the programmer's best efforts.

❒ What if spoken languages had the same requirements for syntactic accuracy as computer languages?

Syntax error handling

❒ Common programming errors can occur at different levels:
❍ Lexical errors include missing quotes around text and misspellings of keywords or operators, e.g., ebigin instead of begin.
❍ Syntactic errors include misplaced semicolons, extra or missing braces { }, and a case without an enclosing switch.
❍ Semantic errors include type mismatches between operators and operands, i.e., an operator applied to an incompatible operand.
❍ Logical errors can be anything arising from incorrect reasoning, e.g., using the assignment operator = instead of the comparison operator ==.

Syntax error handling…
❒ The error handler should be written with the following goals in mind:
• Errors should be reported clearly and accurately.
  • It should report the place of the error.
  • It should also report the type of the error.
• The compiler should recover from common errors quickly enough to detect subsequent errors.
  • E.g., add missing semicolons.
• It should not slow down the whole process significantly.
  • It should add minimal overhead to the processing of correct programs.

Syntax error handling…

❒ There are four main strategies in error handling:
❍ Panic mode error recovery: discards all tokens until a synchronization token (like ; and { or }) is found.
❍ Phrase level recovery: the parser makes a local correction so that it can continue to parse the rest of the input.
• Replace a comma by a semicolon, delete or insert a semicolon…
❍ Error productions: augment the grammar to capture the most common errors that programmers make.
❍ Global correction: makes as few changes as possible in the program so that a globally least-cost corrected program is obtained.
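
Panic mode is the simplest of the four strategies to sketch. The C fragment below is an illustration under stated assumptions: tokens are single characters in a string, the synchronization tokens are ; { and }, and the hypothetical function panic_mode_skip is given a position that lies inside the string.

```c
#include <string.h>

/* Starting at the position of the error, discards characters until a
   synchronization token is found. Returns the index of that token, or
   the length of the string if there is none. */
size_t panic_mode_skip(const char *tokens, size_t error_pos) {
    /* strcspn counts the characters before the first of ';', '{', '}' */
    return error_pos + strcspn(tokens + error_pos, ";{}");
}
```

For the text "x = = 3 ; y = 4 ;" with an error detected at the second = (index 4), parsing would resume at the first semicolon (index 8), so the statement y = 4 can still be checked.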

The Parser Generator: Yacc
❒ Yacc stands for "yet another compiler-compiler".
❒ Yacc is a tool for automatically generating a parser from a grammar written in a yacc specification (.y file).
❒ The yacc-generated parser calls the lexical analyzer to collect tokens from the input stream.
❒ Tokens are organized using grammar rules.
❒ When a rule is recognized, its action is executed.
Note
❒ lex tokenizes the input and yacc parses the tokens, taking the right actions, in context.

Scanner, Parser, Lex and Yacc

Yacc…
❒ There are four steps involved in creating a compiler in Yacc:
1. Specify the grammar:
– Write the grammar in a .y file (also specify here, in C, the actions that are to be taken).
– Write a lexical analyzer to process input and pass tokens to the parser. This can be done using Lex.
– Write a function that starts parsing by calling yyparse().
– Write error handling routines (like yyerror()).
2. Generate a parser by running Yacc over the grammar file.
3. Compile the code produced by Yacc as well as any other relevant source files.
4. Link the object files to the appropriate libraries to produce the executable parser.
Writing a Grammar in Yacc
❒ Productions in Yacc are of the form:

Nonterminal : tokens/nonterminals { action }
            | tokens/nonterminals { action }
            ;
❒ Tokens that are single characters can be used directly within productions, e.g. '+'.
❒ Named tokens must be declared first in the declaration part using
%token TokenName

Synthesized Attributes
❒ Semantic actions may refer to the values of the synthesized attributes of terminals and non-terminals in a production:
X : Y1 Y2 Y3 … Yn { action }
❍ $$ refers to the value of the attribute of X
❍ $i refers to the value of the attribute of Yi

❒ For example
factor : '(' expr ')' { $$ = $2; }
[Figure: parse tree for factor → ( expr ); the action $$ = $2 copies expr.val = x up to factor.val = x.]
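
What a synthesized attribute does can be imitated by hand in C: each function below plays the role of one non-terminal and returns its value to its caller, the way an action assigns $$ from $1, $2, $3. This is a sketch, not the code yacc generates. The non-terminal names expr, mulexpr and term and all function names are assumptions made for the illustration, loops stand in for the left-recursive rules, and integer overflow is not checked.

```c
#include <ctype.h>

static const char *cur;    /* next unread character */
static int ok;             /* cleared on a syntax error or division by zero */

static int eval_expr(void);

static void skip_ws(void) { while (*cur == ' ' || *cur == '\t') cur++; }

/* term : '(' expr ')' | INTEGER */
static int eval_term(void) {
    skip_ws();
    if (*cur == '(') {
        cur++;
        int v = eval_expr();                 /* like $$ = $2 */
        skip_ws();
        if (*cur == ')') cur++; else ok = 0;
        return v;
    }
    if (isdigit((unsigned char)*cur)) {
        int v = 0;
        while (isdigit((unsigned char)*cur)) v = v * 10 + (*cur++ - '0');
        return v;                            /* like $$ = $1 */
    }
    ok = 0;
    return 0;
}

/* mulexpr : mulexpr '*' term | mulexpr '/' term | term */
static int eval_mulexpr(void) {
    int v = eval_term();
    for (;;) {
        skip_ws();
        if (*cur == '*') { cur++; v *= eval_term(); }       /* $$ = $1 * $3 */
        else if (*cur == '/') {
            cur++;
            int d = eval_term();
            if (d == 0) { ok = 0; return 0; }
            v /= d;                                         /* $$ = $1 / $3 */
        } else return v;
    }
}

/* expr : expr '+' mulexpr | expr '-' mulexpr | mulexpr */
static int eval_expr(void) {
    int v = eval_mulexpr();
    for (;;) {
        skip_ws();
        if (*cur == '+') { cur++; v += eval_mulexpr(); }    /* $$ = $1 + $3 */
        else if (*cur == '-') { cur++; v -= eval_mulexpr(); }
        else return v;
    }
}

/* Evaluates `text`; stores the value in *result and returns 1 on success. */
int evaluate(const char *text, int *result) {
    cur = text;
    ok = 1;
    int v = eval_expr();
    skip_ws();
    if (!ok || *cur != '\0') return 0;
    *result = v;
    return 1;
}
```

Evaluating "3 * (4 + 5)" stores 27: the value 9 computed inside the parentheses is passed upward unchanged, exactly as $$ = $2 does in the factor rule.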
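Lex Yacc interaction…
[Figure: yacc translates calc.y into y.tab.c (which contains yyparse()) and y.tab.h; lex translates calc.l into lex.yy.c (which contains yylex()); gcc compiles y.tab.c and lex.yy.c into a.out, which reads the input and produces the compiled output.]
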
Lex Yacc interaction…
❒ If lex is to return tokens that yacc will process, they have to agree on what tokens there are. This is done as follows:
❍ The yacc file will have token definitions
%token INTEGER
in the definitions section.
❍ When the yacc file is translated with yacc -d, a header file y.tab.h is created that has definitions like
#define INTEGER 258
❍ This file can then be included in both the lex and yacc program.
❍ An action in the lex file can then execute return INTEGER; and the yacc program can match on this token.
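
The agreement on token codes can be seen in a hand-written stand-in for the scanner that lex would generate. This is a sketch with hypothetical names: TOK_INTEGER mirrors the definition #define INTEGER 258 from y.tab.h, the value parameter plays the role of yylval, and single-character tokens are returned as their own character code, which is why named tokens are numbered above the character range.

```c
#include <ctype.h>

enum { TOK_INTEGER = 258 };    /* mirrors #define INTEGER 258 in y.tab.h */

/* Reads one token from *src and advances *src past it.
   Returns TOK_INTEGER (with the number stored in *value), the character
   code of a single-character token, or 0 at the end of the input. */
int next_token(const char **src, int *value) {
    const char *s = *src;
    int tok;

    while (*s == ' ' || *s == '\t') s++;          /* skip white space */

    if (isdigit((unsigned char)*s)) {
        int v = 0;
        while (isdigit((unsigned char)*s)) v = v * 10 + (*s++ - '0');
        *value = v;                               /* like yylval = atoi(yytext) */
        tok = TOK_INTEGER;
    } else if (*s == '\0') {
        tok = 0;                                  /* end of input */
    } else {
        tok = (unsigned char)*s++;                /* like return *yytext */
    }
    *src = s;
    return tok;
}
```

Scanning "23 +8" returns 258 with value 23, then the character code of +, then 258 with value 8, then 0.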

Example: Simple calculator: yacc file
%{
#include <stdio.h>
int yylex(void);
void yyerror(char *);
#define YYSTYPE int          /* int type for attributes and yylval */
%}
%token INTEGER
%%
/* grammar rules, each followed by its action */
program:
      program expr '\n'      { printf("%d\n", $2); }
    |
    ;
expr:
      INTEGER                { $$ = $1; }       /* $$: the value of the LHS (expr) */
    | expr '+' expr          { $$ = $1 + $3; }  /* $1, $3: the values of symbols on the RHS; token values are stored in yylval */
    | expr '-' expr          { $$ = $1 - $3; }
    ;
%%
void yyerror(char *s) {
    fprintf(stderr, "%s\n", s);
}
int main(void) {
    yyparse();               /* the parser invokes the lexical analyzer */
    return 0;
}
Example: Simple calculator: lex file
The lex program matches numbers and operators and returns them.
%{
#include <stdio.h>
#include <stdlib.h>          /* atoi */
#include "y.tab.h"           /* generated by yacc, contains #define INTEGER 258 */
extern int yylval;           /* defined in y.tab.c */
void yyerror(char *);
%}
%%
[0-9]+      { yylval = atoi(yytext);   /* place the integer value on the stack */
              return INTEGER;
            }
[-+*/\n]    return *yytext;            /* operators will be returned */
[ \t]       ;                          /* skip white space */
.           yyerror("invalid character");
%%
int yywrap(void) {
    return 1;
}
Lex and Yacc: compile and run
[compiler@localhost yacc]$ vi calc.l
[compiler@localhost yacc]$ vi calc.y
[compiler@localhost yacc]$ yacc -d calc.y
yacc: 4 shift/reduce conflicts.
[compiler@localhost yacc]$ lex calc.l
[compiler@localhost yacc]$ ls
a.out calc.l calc.y lex.yy.c typescript y.tab.c y.tab.h
[compiler@localhost yacc]$ gcc y.tab.c lex.yy.c
[compiler@localhost yacc]$ ls
a.out calc.l calc.y lex.yy.c typescript y.tab.c y.tab.h
[compiler@localhost yacc]$ ./a.out
2+3
5
23+8+
Invalid character
syntax error
Example: Simple calculator: yacc file, option 2
%{
#include <stdlib.h>
#include <stdio.h>
int yylex(void);
void yyerror(char *);
%}
%token INTEGER
%%
program :
      program expr '\n'      { printf("%d\n", $2); }
    |
    ;
expr    : expr '+' mulexpr   { $$ = $1 + $3; }
    | expr '-' mulexpr       { $$ = $1 - $3; }
    | mulexpr                { $$ = $1; }
    ;
mulexpr : mulexpr '*' term   { $$ = $1 * $3; }
    | mulexpr '/' term       { $$ = $1 / $3; }
    | term                   { $$ = $1; }
    ;
term    :
      '(' expr ')'           { $$ = $2; }
    | INTEGER                { $$ = $1; }
    ;
%%
void yyerror(char *s)
{
    fprintf(stderr, "%s\n", s);
}
int main(void)
{
    yyparse();
    return 0;
}

Calculator 2: Example (yacc file)
Sample session:
user: 3 * (4 + 5)
calc: 27
user: x = 3 * (4 + 5)
user: y = 5
user: x
calc: 27
user: y
calc: 5
user: x + 2*y
calc: 37

%{
#include <stdio.h>
int yylex(void);
void yyerror(char *);
int sym[26];                 /* sym holds the value of the associated variable */
%}
%token INTEGER VARIABLE
%left '+' '-'                /* associativity and precedence rules: */
%left '*' '/'                /* operators declared later bind tighter */
%%
program :
      program statement '\n'
    |
    ;
statement :
      expression                   { printf("%d\n", $1); }
    | VARIABLE '=' expression      { sym[$1] = $3; }
    ;
expression :
      INTEGER                      { $$ = $1; }
    | VARIABLE                     { $$ = sym[$1]; }
    | expression '+' expression    { $$ = $1 + $3; }
    | expression '-' expression    { $$ = $1 - $3; }
    | expression '*' expression    { $$ = $1 * $3; }
    | expression '/' expression    { $$ = $1 / $3; }
    | '(' expression ')'           { $$ = $2; }
    ;
%%
void yyerror(char *s)
{
    fprintf(stderr, "%s\n", s);
}
int main(void)
{
    yyparse();
    return 0;
}

Calculator 2: Example (lex file)
The lexical analyzer returns variables and integers.
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
void yyerror(char *);
extern int yylval;
%}
%%
[a-z]       { yylval = *yytext - 'a';  /* for variables, yylval is an index into the symbol table sym */
              return VARIABLE;
            }
[0-9]+      { yylval = atoi(yytext);
              return INTEGER;
            }
[-+*/()=\n] return *yytext;
[ \t]       ;                          /* skip white space */
.           yyerror("Invalid character");
%%
int yywrap(void)
{
    return 1;
}
Conclusions
❒ Yacc and Lex are very helpful for building the compiler front-end.
❒ A lot of time is saved compared to hand-implementation of the parser and scanner.
❒ They both work as a mixture of "rules" and "C code".
❒ C code is generated and is merged with the rest of the compiler code.
Calculator program
❒ Expand the calculator program so that the new calculator program is capable of processing:

user: 3 * (4 + 5)
user: x = 3 * (4 + 5)
user: y = 5
user: x + 2*y
2^3/6
sin(1) + cos(PI)
tan
log
factorial
