Lab No. 01 - Building A Lexical Analyzer

This document provides instructions for a lab assignment on building a lexical analyzer. It includes code listings and explanations of a simple lexical analyzer program. It breaks down the components of the program and explains what each part is doing. It also provides tasks for students to create regular expressions to recognize different types of tokens and to modify the program to count different tokens. An appendix defines common regular expression patterns.

Uploaded by

Mahfuzur Rahman

Lab Manual CSI 412 – Compiler Sessional

Lab No. 01 – Building a Lexical Analyzer

A lexical analyzer checks the spelling of the input and breaks each statement into a stream of tokens.
The checking is done against patterns specified as regular expressions (REGEX).
Take a look at the code below:

Code Listing 1 - A simple lexical analyzer (tokens.l)

1 %{
2 int tokencount = 0;
3 %}
4
5 %%
6 [ \t\n]+ {printf("White spaces ignored\n");}
7 \\\\.*.[\n]? {printf("%s is a comment\n",yytext);}
8 [a-zA-Z]+ {printf("%s is a token\n",yytext);tokencount++;}
9 [<>\-\+\?\*\/!\^,;] {printf("%s is a token\n",yytext);tokencount++;}
10 . {printf("Unexpected\n");}
11 %%
12
13 int main(){
14 yylex();
15 printf("Number of tokens is : %d\n",tokencount);
16 }
Now, let’s explain the program.
1. The coding environment: The FLEX package must be installed to build a lexical analyzer.
2. Naming convention: The file is conventionally named any_name.l; the .l extension marks it as a lex
source file.
3. Running the code: To run the code, execute the following commands in order:
a. Go to the terminal and run the following:
i. lex file_name.l [file_name.l is the file that contains the code.]
ii. cc lex.yy.c -o output_name -ll
[“cc” selects the C compiler, “lex.yy.c” is the output file of the previous command,
“-o” specifies the output file name, “output_name” is chosen by the user, and
“-ll” links the lex library. On systems that ship only flex, the library may be named “-lfl” instead.]
iii. ./output_name [Runs the compiled program.]
b. Type anything in the terminal; for each piece of input the program reports whether it is a token or not.
c. To leave the program press “CTRL+D” (end of input, after which the token count is printed) or
“CTRL+C” (terminates the program immediately, without printing the count).
Breakdown of the Code:

• Any lex file contains three main regions of code:

%{
/* Region 1: include files and declare global variables here. */
%}
%%
 /* Region 2: all regular expressions (patterns) and their actions are defined here. */
%%
int main(){
/* Region 3: all required execution code is specified in the main function. */
}

• Line no. 02: int tokencount = 0 declares a variable that stores the number of valid tokens found
during execution of the code.
• Line no. 06: [ \t\n]+ is a regular expression that accepts one or more spaces, tabs and newline
characters, and prints a message whenever they are encountered. Note that there is a SPACE before the
“\t” in the regular expression.
• Line no. 07: \\\\.*.[\n]? is a regular expression that accepts a line starting with two backslashes,
which this lab treats as a comment line (“\\This is a comment line.”). In flex “\” is the escape
character, so to accept one literal “\” we must write “\\”; the four backslashes in the pattern therefore
match two. The “.*.” part matches the rest of the line (at least one character) and “[\n]?” matches an
optional trailing newline. Note that real C line comments start with “//”, not “\\”; matching those would
need a pattern such as \/\/.* instead.
• Line no. 08: [a-zA-Z]+ accepts one or more letters, lowercase (a-z) or uppercase (A-Z). In other
words, it accepts every string that is made up of letters only. We consider these valid tokens, so
tokencount is increased whenever a string with this pattern is encountered.
• Line no. 09: [<>\-\+\?\*\/!\^,;] accepts any one of the listed special characters. The
characters that have a special meaning in flex are escaped with a “\”. We also consider these operators
valid tokens, so tokencount is increased.
• Line no. 10: (.) accepts any single character except a newline. Because it is the last rule, it only
fires for characters that no earlier rule matched, and reports them as unexpected.
• Line no. 14: We call the yylex() function, which runs the patterns we just specified against all the
input we provide.
• Line no. 15: When the input ends, the total number of valid tokens is displayed.
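The counting behaviour described above can be mimicked in plain C, which may help when checking an expected token count by hand. This is only a sketch under stated assumptions: count_tokens is a hypothetical helper, not something flex generates, the comment rule of line 07 is left out, and flex itself works from a generated state machine rather than a loop like this.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: counts tokens the way Listing 1's rules would.
   A maximal run of letters is one token (line 08), each listed operator
   character is one token (line 09), and everything else (whitespace,
   digits, '=', '.') is skipped without being counted (lines 06 and 10).
   Assumption: the comment rule of line 07 is not modelled. */
int count_tokens(const char *s) {
    const char *ops = "<>-+?*/!^,;";
    int count = 0;
    while (*s) {
        if (isalpha((unsigned char)*s)) {
            /* consume the whole run of letters, like [a-zA-Z]+ */
            while (isalpha((unsigned char)*s)) s++;
            count++;
        } else if (strchr(ops, *s)) {
            s++;
            count++;
        } else {
            s++;  /* whitespace or an unexpected character */
        }
    }
    return count;
}
```

Calling count_tokens("a+b") from a main function returns 3: the letters a and b each form a token and the + is an operator token.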

Code Listing 2 - A simple lexical analyzer that takes input from a file called "code.c"

%{
#include <stdio.h>
#include <stdlib.h>   /* exit() */
int tokencount = 0;
%}

%%
[ \t\n]+ {printf("White spaces ignored\n");}
\\\\.*.[\n]? {printf("%s is a comment\n",yytext);}
[a-zA-Z]+ {printf("%s is a token\n",yytext);tokencount++;}
[<>\-\+\?\*\/!\^,;] {printf("%s is a token\n",yytext);tokencount++;}
\/\*(.|\n)+\*\/ {printf("Comment block found\n");}
. {printf("Unexpected\n");}
%%

int main(){
    FILE *file = fopen("code.c", "r");
    if (!file) {
        printf("Could not open file\n");
        exit(1);
    }
    yyin = file;   /* make yylex() read from code.c instead of the terminal */
    yylex();
    printf("Number of tokens is : %d\n",tokencount);
    fclose(file);
    return 0;
}

N.B.: In order to run this code you must create a file called “code.c” in the directory of the lex file and put
some sample code in it.
Sample Input file: Suppose we have the sample “code.c” as follows:

This is a sample text.

a=10;
b=a+5;
Code Listing 3 – Sample “code.c” file
Sample Output: Here the output will be

This is a token.
White spaces ignored.
is is a token.
White spaces ignored.
a is a token.
White spaces ignored.
sample is a token.
White spaces ignored.
text is a token.
. Unexpected
White spaces ignored.
a is a token.
= Unexpected
10 Unexpected
; is a token.
White spaces ignored.
b is a token.
= Unexpected
a is a token.
+ is a token.
5 Unexpected
; is a token.

Number of tokens is 11.

Tasks for LAB 01:

1. Create regular expressions for accepting the following strings:
a. Numbers or digits i.e. 123, 999 etc.
b. Any string that starts with a capital letter, e.g., And, B04, Call etc.
c. Any string that has an operator somewhere in it, e.g., a+b, a++, 2--, *b/ etc.
d. Any double/float numbers, e.g., 23.343, 99.999 etc.
e. Any string that starts and ends with a vowel, e.g., apostle, oscillate etc.
2. Create a lex program that will count the number of variables, variable types, operators and digits
separately.
N.B.: Rules of regular expression can be found in the appendix.

Appendix
Regular Expressions:
General rules of regular expressions are as follows:

Table 1 – Rules of Regular Expressions

Expression        Pattern accepted

[abc]             Accepts any single character that is a, b or c.
(abc)             Accepts the string “abc”.
[abc]+            Accepts the characters a/b/c one or more times, e.g., aa, bbbc, abc, a etc.
[abc]*            Accepts the characters a/b/c zero or more times.
(abc)?            Accepts the string “abc” 0 or 1 time.
.                 Accepts any single character except a newline character.
(a|b)             Matches the character “a” or “b”.
^(a)              Matches “a” only at the beginning of a line; ^ must be the first character of the regular expression.
[^a]              Matches any single character except “a”.
a$                Matches “a” only at the end of a line; $ must be the last character of the regular expression.
(abc){min,max}    Matches the string “abc” from min to max times.
(abc){0,max}      Matches the string “abc” from 0 times to max times.
(abc){min,}       Matches the string “abc” min or more times.
"abc"             Matches the text between the quotation marks literally.

Check the book for extended usage and examples.
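Several of the rules in Table 1 can be tried out from C before writing them into a lex file. The sketch below assumes a POSIX system, where <regex.h> provides regcomp() and regexec(); matches is a hypothetical helper name. POSIX extended syntax is close to flex's but not identical (for example, the quoted "abc" form is flex-specific), so treat it as an approximation only.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if pattern matches somewhere in text,
   0 if it does not, and -1 if the pattern fails to compile.
   Anchor the pattern with ^ and $ to require a whole-string match. */
int matches(const char *pattern, const char *text) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);  /* release the compiled pattern */
    return rc == 0 ? 1 : 0;
}
```

For instance, matches("^[abc]+$", "bbbc") returns 1, while matches("^[^a]$", "a") returns 0 because [^a] rejects the character “a”.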
