
Lexical Analysis - Part 3


Outline of the Lecture

What is lexical analysis? (covered in part 1)
Why should LA be separated from syntax analysis? (covered in part 1)
Tokens, patterns, and lexemes (covered in part 1)
Difficulties in lexical analysis (covered in part 1)
Recognition of tokens - finite automata and transition diagrams (covered in part 2)
Specification of tokens - regular expressions and regular definitions (covered in part 2)
LEX - A Lexical Analyzer Generator



Transition Diagrams

Transition diagrams are generalized DFAs, with the following differences:
Edges may be labelled by a symbol, a set of symbols, or a regular definition
Some accepting states may be indicated as retracting states, indicating that the lexeme does not include the symbol that brought us to the accepting state
Each accepting state has an action attached to it, which is executed when that state is reached. Typically, such an action returns a token and its attribute value
Transition diagrams are not meant for automatic translation into a lexical analyzer by a tool; they guide the manual implementation of one



Lexical Analyzer Implementation from Transition Diagrams

TOKEN gettoken() {
   TOKEN mytoken; char c;
   while (1) {
      switch (state) {
      /* recognize reserved words and identifiers */
      case 0: c = nextchar();
              if (letter(c)) state = 1; else state = failure();
              break;
      case 1: c = nextchar();
              if (letter(c) || digit(c)) state = 1; else state = 2;
              break;
      case 2: retract(1);
              mytoken.token = search_token();
              if (mytoken.token == IDENTIFIER)
                 mytoken.value = get_id_string();
              return(mytoken);

      /* recognize hexadecimal and octal constants */
      case 3: c = nextchar();
              if (c == '0') state = 4; else state = failure();
              break;
      case 4: c = nextchar();
              if ((c == 'x') || (c == 'X')) state = 5;
              else if (digitoct(c)) state = 9;
              else state = failure();
              break;
      case 5: c = nextchar();
              if (digithex(c)) state = 6; else state = failure();
              break;

      case 6: c = nextchar();
              if (digithex(c)) state = 6;
              else if ((c == 'u') || (c == 'U') || (c == 'l') || (c == 'L')) state = 8;
              else state = 7;
              break;
      case 7: retract(1);
              /* fall through to case 8, to save coding */
      case 8: mytoken.token = INT_CONST;
              mytoken.value = eval_hex_num();
              return(mytoken);
      case 9: c = nextchar();
              if (digitoct(c)) state = 9;
              else if ((c == 'u') || (c == 'U') || (c == 'l') || (c == 'L')) state = 11;
              else state = 10;
              break;


      case 10: retract(1);
               /* fall through to case 11, to save coding */
      case 11: mytoken.token = INT_CONST;
               mytoken.value = eval_oct_num();
               return(mytoken);

      /* recognize integer constants */
      case 12: c = nextchar();
               if (digit(c)) state = 13; else state = failure();
               break;
      case 13: c = nextchar();
               if (digit(c)) state = 13;
               else if ((c == 'u') || (c == 'U') || (c == 'l') || (c == 'L')) state = 15;
               else state = 14;
               break;
      case 14: retract(1);
               /* fall through to case 15, to save coding */
      case 15: mytoken.token = INT_CONST;
               mytoken.value = eval_int_num();
               return(mytoken);
      default: recover();
      }
   }
}
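
The routine above relies on a few helpers (nextchar(), retract(), failure(), search_token(), and recover()) whose code is not shown on these slides. A minimal sketch of the two buffer helpers, assuming a simple single-buffer scheme, could look like the following; the names and details here are only illustrative and are not taken from the lecture.

#include <stdio.h>

/* Hypothetical single-buffer input helpers assumed by gettoken() above */
static char buffer[4096];
static int  length  = 0;   /* number of characters currently in the buffer */
static int  forward = 0;   /* index of the next character to hand out      */

char nextchar(void) {
    if (forward == length) {                        /* refill on demand */
        int got = (int) fread(buffer + length, 1, 1, stdin);
        if (got == 0) return '\0';                  /* treat end of input as '\0' */
        length += got;
    }
    return buffer[forward++];
}

void retract(int n) {
    forward -= n;           /* push the last n characters back into the buffer */
}
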
Combining Transition Diagrams to form an LA

Different transition diagrams must be combined appropriately to yield an LA
Combining TDs is not trivial
It is possible to try different transition diagrams one after another
For example, TDs for reserved words, constants, identifiers, and operators could be tried in that order
However, this does not use the "longest match" characteristic (thenext should be recognized as a single identifier, and not as the reserved word then followed by the identifier ext)
To find the longest match, all TDs must be tried and the longest match must be used (see the sketch below)
Using LEX to generate a lexical analyzer makes it easy for the compiler writer
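
A rough illustration of this longest-match strategy, where each transition diagram is simulated from the same starting position and the longest accepted lexeme wins; the function names and the callback shape are assumptions made for the sketch, not part of the lecture code:

#include <stddef.h>

typedef int (*TDFunc)(const char *start);   /* simulates one transition diagram;
                                               returns matched length, 0 on failure */

/* Try every diagram from the same position and report the longest match */
int longest_match(const char *start, TDFunc tds[], size_t ntd, size_t *which) {
    int best = 0;
    for (size_t k = 0; k < ntd; k++) {
        int len = tds[k](start);            /* run diagram k on the same input      */
        if (len > best) {                   /* keep the longest lexeme seen so far  */
            best = len;
            *which = k;
        }
    }
    return best;                            /* 0 means no diagram matched */
}
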
LEX - A Lexical Analyzer Generator

LEX has a language for describing regular expressions
It generates a pattern matcher for the regular expression specifications provided to it as input
General structure of a LEX program:
   {definitions} – Optional
   %%
   {rules} – Essential
   %%
   {user subroutines} – Essential
Commands to create an LA:
   lex ex.l – creates a C-program lex.yy.c
   gcc -o ex.o lex.yy.c – produces ex.o
   ex.o is a lexical analyzer that carves tokens from its input
LEX Example

/* LEX specification for the Example */
%%
[A-Z]+ {ECHO; printf("\n");}
.|\n ;
%%
yywrap(){}
main(){yylex();}

/* Input */
wewevWEUFWIGhHkkH
sdcwehSDWEhTkFLksewT

/* Output */
WEUFWIG
H
H
SDWE
T
FL
T

Definitions Section

The Definitions Section contains definitions and included code
Definitions are like macros and have the following form: name translation
   digit [0-9]
   number {digit}{digit}*
Included code is all code included between %{ and %}
   %{
   float number; int count=0;
   %}
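
Such a definition is then referenced in the Rules Section by enclosing its name in braces. Combined with the definitions above, a small illustrative rule (the action shown here is only a sketch) might be:

%%
{number}   {printf("number: %s\n", yytext);}
.|\n       ;
%%
yywrap(){}
main(){yylex();}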



Rules Section

Contains patterns and C-code
A line starting with white space, or material enclosed in %{ and %}, is C-code (see the example after this list)
A line starting with anything else is a pattern line
Pattern lines contain a pattern followed by some white space and C-code:
   {pattern} {action (C-code)}
C-code lines are copied verbatim to the generated C-file
Patterns are translated into an NFA, which is then converted into a DFA, optimized, and stored in the form of a table along with a driver routine
The action associated with a pattern is executed when the DFA recognizes a string corresponding to that pattern and reaches a final state
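
For example, in the following fragment the indented line is copied into the generated scanner as C code, while the other two lines of the rules section are pattern lines; the word-counting logic is only an illustration:

%%
   int words = 0;        /* this indented line is copied into the scanner as C code */
[a-zA-Z]+   {words++;    /* a pattern line: pattern, white space, then the action   */}
\n          {printf("%d words\n", words); words = 0;}
%%
yywrap(){}
main(){yylex();}
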
Strings and Operators

Examples of strings: integer a57d hello
Operators:
   " \ [] ^ - ? . * + | () $ {} % <>
\ can be used as an escape character as in C
Character classes: enclosed in [ and ]
Only \, -, and ^ are special inside [ ]. All other operators are irrelevant inside [ ]
Examples:
   [-+][0-9]+      ---> (-|+)(0|1|2|3|4|5|6|7|8|9)+
   [^abc]          ---> all char except a, b, or c, including special and control char
   [a-d][0-4][A-C] ---> (a|b|c|d)(0|1|2|3|4)(A|B|C)
   [+\-][0-5]+     ---> (+|-)(0|1|2|3|4|5)+
   [^a-zA-Z]       ---> all char which are not letters


Operators - Details

. operator: matches any character except newline
? operator: used to implement the ε option
   ab?c stands for a(b|ε)c
Repetition, alternation, and grouping:
   (ab|cd+)?(ef)* ---> (ab | c(d)+ | ε)(ef)*
Context sensitivity: /, ^, and $ are context-sensitive operators
^: If the first char of an expression is ^, then that expression is matched only at the beginning of a line. Holds only outside the [ ] operator
$: If the last char of an expression is $, then that expression is matched only at the end of a line
/: Lookahead operator, indicates trailing context
   ^ab ---> line beginning with ab
   ab$ ---> line ending with ab (same as ab/\n)
   DO/({letter}|{digit})*=({letter}|{digit})*, ---> DO only when followed by text of the form id=id,
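
A small specification exercising these three context operators; the patterns and messages are only an illustrative sketch:

%%
^#.*    {printf("line starting with #\n");     /* ^ : match only at the beginning of a line   */}
end$    {printf("end at the end of a line\n"); /* $ : match only at the end of a line         */}
if/\(   {printf("if followed by (\n");         /* / : trailing context, the ( is not consumed */}
.|\n    ;
%%
yywrap(){}
main(){yylex();}
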
LEX Actions

The default action is to copy the unmatched input characters to the output
We need to provide patterns to catch characters
yytext: contains the text matched against a pattern; copying yytext can be done with the action ECHO
yyleng: provides the number of characters matched (see the example below)
LEX always tries the rules in the order written down, and the longest match is preferred:
   integer   action1;
   [a-z]+    action2;
The input integers will match the second pattern
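
A small rule using yytext and yyleng together (the message text is only illustrative):

%%
[a-zA-Z]+   {printf("%s has %d characters\n", yytext, yyleng);}
.|\n        ;
%%
yywrap(){}
main(){yylex();}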



LEX Example 1: EX-1.lex

%%
[A-Z]+ {ECHO; printf("\n");}
.|\n ;
%%
yywrap(){}
main(){yylex();}

/* Input */
wewevWEUFWIGhHkkH
sdcwehSDWEhTkFLksewT

/* Output */
WEUFWIG
H
H
SDWE
T
FL
T
LEX Example 2: EX-2.lex

%%
^[ ]*\n
\n {ECHO; yylineno++;}
.* {printf("%d\t%s",yylineno,yytext);}
%%

yywrap(){}
main(){ yylineno = 1; yylex(); }



LEX Example 2 (contd.)

/* Input and Output */
========================
kurtrtotr
dvure

1234
5678
9

euhoyo854
shacg345845nkfg
========================
1	kurtrtotr
2	dvure
3
LEX Example 3: EX-3.lex

%{
FILE *declfile;
%}

blanks [ \t]*
letter [a-z]
digit [0-9]
id ({letter}|_)({letter}|{digit}|_)*
number {digit}+
arraydeclpart {id}"["{number}"]"
declpart ({arraydeclpart}|{id})
decllist ({declpart}{blanks}","{blanks})*
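
The rule on the next slide refers to a declaration definition that is not visible on this slide. A plausible form, consistent with the sample input and output shown later (this exact pattern is an assumption, not taken from the original), would be:

declaration (int|float){blanks}{decllist}{declpart}{blanks};
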
LEX Example 3 (contd.)

%%
{declaration} fprintf(declfile,"%s\n",yytext);
%%

yywrap(){
   fclose(declfile);
}
main(){
   declfile = fopen("declfile","w");
   yylex();
}



LEX Example 3: Input, Output, Rejection

/* Input */
wjwkfblwebg2; int ab, float cd, ef;
ewl2efo24hg2jhrto;ty;
int ght,asjhew[37],fuir,gj[45]; sdkvbwrkb;
float ire,dehj[80];
sdvjkjkw
==========================================
/* Output: declarations written to declfile */
float cd, ef;
int ght,asjhew[37],fuir,gj[45];
float ire,dehj[80];
==========================================
/* Rejected: unmatched input echoed to standard output */
wjwkfblwebg2; int ab,
ewl2efo24hg2jhrto;ty;
sdkvbwrkb;
sdvjkjkw
LEX Example 4: Identifiers, Reserved Words, and Constants (id-hex-oct-int-1.lex)

%{
int hex = 0; int oct = 0; int regular = 0;
%}
letter        [a-zA-Z_]
digit         [0-9]
digits        {digit}+
digit_oct     [0-7]
digit_hex     [0-9A-F]
int_qualifier [uUlL]
blanks        [ \t]+
identifier    {letter}({letter}|{digit})*
integer
hex_const
oct_const
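
The translations of integer, hex_const, and oct_const are cut off in this copy of the slide. A plausible set, consistent with the rules and the sample output on the following slides (these exact patterns are an assumption, not taken from the original), would be:

integer   {digits}{int_qualifier}?
hex_const 0[xX]{digit_hex}+{int_qualifier}?
oct_const 0{digit_oct}*{int_qualifier}?
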
LEX Example 4 (contd.)
%%
if {printf("reserved word:%s\n",yytext);}
else {printf("reserved word:%s\n",yytext);}
while {printf("reserved word:%s\n",yytext);}
switch {printf("reserved word:%s\n",yytext);}
{identifier} {printf("identifier :%s\n",yytext);}
{hex_const} {sscanf(yytext,"%i",&hex);
printf("hex constant: %s = %i\n",yytext,hex);}
{oct_const} {sscanf(yytext,"%i",&oct);
printf("oct constant: %s = %i\n",yytext,oct);}
{integer} {sscanf(yytext,"%i",&regular);
printf("integer : %s = %i\n",yytext, regular);}
.|\n ;
%%
yywrap(){}
int main(){yylex();}
LEX Example 4: Input and Output

uorme while
0345LA 456UB 0x786lHABC
b0x34
========================
identifier :uorme
reserved word:while
oct constant: 0345L = 229
identifier :A
integer : 456U = 456
identifier :B
hex constant: 0x786l = 1926
identifier :HABC
identifier :b0x34



LEX Example 5: Floats in C (C-floats.lex)

digits [0-9]+
exp ([Ee](\+|\-)?{digits})
blanks [ \t\n]+
float_qual [fFlL]
%%
{digits}{exp}{float_qual}?/{blanks}          {printf("float no fraction:%s\n",yytext);}
[0-9]*\.{digits}{exp}?{float_qual}?/{blanks} {printf("float with optional integer part :%s\n",yytext);}
{digits}\.[0-9]*{exp}?{float_qual}?/{blanks} {printf("float with optional fraction:%s\n",yytext);}
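
The slide ends here. Judging from the sample output below, where non-float text does not appear, the specification presumably continues with a rule that discards everything else and the same driver code as in the earlier examples, along these lines (an assumption, not shown in the original):

.|\n ;
%%
yywrap(){}
main(){yylex();}
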
LEX Example 5: Input and Output

123 345.. 4565.3 675e-5 523.4e+2 98.1e5 234.3.4
345. .234E+09L 987E-6F 5432.E7l
=================================================
float with optional integer part : 4565.3
float no fraction: 675e-5
float with optional integer part : 523.4e+2
float with optional integer part : 98.1e5
float with optional integer part : 3.4
float with optional fraction: 345.
float with optional integer part : .234E+09L
float no fraction: 987E-6F
float with optional fraction: 5432.E7l



LEX Example 6: LA for Desk Calculator

number [0-9]+\.?|[0-9]*\.[0-9]+
name [A-Za-z][A-Za-z0-9]*
%%
[ ]      {/* skip blanks */}
{number} {sscanf(yytext,"%lf",&yylval.dval); return NUMBER;}
{name}   {struct symtab *sp = symlook(yytext); yylval.symp = sp; return NAME;}
"++"     {return POSTPLUS;}
"--"     {return POSTMINUS;}
\n|.     {return yytext[0];}
"$"      {return 0;}

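This scanner is meant to be used with a yacc-generated parser, so the token codes (NUMBER, NAME, POSTPLUS, POSTMINUS), the yylval union, and the symbol-table routine symlook() do not appear on the slide. A minimal sketch of what that supporting code might look like follows; the token values, table size, and lookup logic are assumptions made to keep the sketch self-contained, and in practice they would come from y.tab.h and the grammar's %union.

#include <string.h>
#include <stdlib.h>

#define NUMBER    258       /* hypothetical token codes; normally from y.tab.h */
#define NAME      259
#define POSTPLUS  260
#define POSTMINUS 261

struct symtab { char *name; double value; } symtab[100];

union { double dval; struct symtab *symp; } yylval;   /* normally declared via %union */

/* Return the symbol-table entry for s, creating it on first use */
struct symtab *symlook(char *s) {
    struct symtab *sp;
    for (sp = symtab; sp < &symtab[100]; sp++) {
        if (sp->name && strcmp(sp->name, s) == 0)
            return sp;                 /* name already in the table       */
        if (!sp->name) {               /* free slot: install the new name */
            sp->name = strdup(s);
            return sp;
        }
    }
    abort();                           /* table full */
    return NULL;
}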