Header Section Definitions Section Rules Section

Uploaded by

gebremolla641

In Lex (a lexical analyzer generator), the structure of a Lex program is divided into three main sections:
1. Header Section (%{ %})
2. Definitions Section
3. Rules Section
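Each section serves a specific purpose in defining and implementing the lexer. Formally, Lex splits a file at %% lines into definitions, rules, and user code; the %{ %} header block sits at the top of the definitions part and is described separately here because it holds plain C rather than patterns.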

1. Header Section (%{ %})

 Purpose:
o The header section is used to include code that will be copied verbatim into the
generated C file. This is where you typically add #include statements and declare
any global variables or functions that will be needed by the lexer.
o It's wrapped in %{ and %} to indicate that this code should be included as-is in the
C file generated by Lex.
 Contents:
o Includes: Include C standard libraries (like stdio.h, stdlib.h).
o Global Variables: Declare any global variables (e.g., int line, int column to track
the current line and column in the input).
o Function Declarations: Declare or define functions that are needed for error
handling (e.g., void yyerror(char *s) to handle error reporting).
 Example:
%{
#include <stdio.h>
#include <stdlib.h>

// Declare global variables
int line = 1, column = 0;

// Error handling function
void yyerror(char *s) {
    printf("Error: %s\n", s);
}
%}
 Explanation: The above code will be included in the generated C code before the Lex rules
are applied, making functions like yyerror() and variables like line and column available
throughout the program.

2. Definitions Section
 Purpose:
o This section defines patterns (using regular expressions) that can be reused in the
rules section. It often includes macros for frequently used patterns like digits,
letters, and comments.
o It's written after the %} of the header section and before the rules section, that is, before the first %% line.
 Contents:
o Macros: Simple names used to define patterns, like DIGIT, LETTER, or
COMMENT.
o These macros can be used later in the rules section to make patterns more readable
and maintainable.
o Optionally, this section can also declare Lex options like case-insensitivity or other
configurations.
 Example:
DIGIT [0-9]
LETTER [a-zA-Z]
UNDERSCORE _
COMMENT ##.*\n
 Explanation:
o DIGIT is defined as a regular expression that matches any single digit [0-9].
o LETTER matches any alphabetical character.
o UNDERSCORE is an underscore (_).
o COMMENT matches any line starting with ## and ending with a newline (\n).

3. Rules Section
 Purpose:
o This is the core section of the Lex program. It defines the patterns (regular
expressions) the lexer will match in the input and the actions to take when a match
is found.
o The patterns are written on the left, and the corresponding actions (typically C code)
are written on the right, separated by whitespace.
 Contents:
o Pattern-Action Pairs: Regular expressions that describe the tokens you want to
recognize, followed by C code that defines what to do when that pattern is matched.
o The action can be anything from printing the token type and updating position
counters to more complex actions like generating code or building data structures.
o Special rules for whitespace or unrecognized tokens are also included here.
 Example:
{DIGIT}+ { printf("NUMBER %.*s\n", yyleng, yytext); column += yyleng; }
int { printf("KEYWORD_INTEGER\n"); column += yyleng; }
\n { line++; column = 0; }
[ \t]+ { column += yyleng; }
. { printf("Error: unrecognized symbol \"%.*s\"\n", yyleng, yytext); column += yyleng; }
 Explanation:
o When the input matches the keyword int, the lexer prints KEYWORD_INTEGER
and advances the column by yyleng.
o The pattern {DIGIT}+ matches any sequence of digits, prints it as a number, and
increments the column by the number of characters matched (yyleng). The braces
matter: [DIGIT]+ would be a character class matching the letters D, I, G, and T.
o The pattern \n increments the line counter and resets the column (for newlines).
o [ \t]+ matches spaces or tabs and increases the column accordingly without any
other output.
o The rule . catches any unrecognized symbol, prints an error message, and advances
the column past it.
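Workflow Example:
 Given input "int 123;", the Lex program would:
o Recognize int as a keyword and print KEYWORD_INTEGER.
o Skip the space, advancing only the column.
o Recognize 123 as a number and print NUMBER 123.
o Find no specific rule for ; so the catch-all . rule reports it as an unrecognized
symbol. Printing SEMICOLON would need an extra rule such as ";" { printf("SEMICOLON\n"); }.
o Track the line and column throughout the process.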
Summary:
 Header Section (%{ %}): Contains C code and global variables/functions, included as-is
in the generated C file.
 Definitions Section: Contains macros or reusable regular expressions for easy reference in
the rules.
 Rules Section: Specifies patterns to match and the actions to execute when a match is
found. This is the core of the lexer and defines how to process input and recognize tokens.
Macros
Lex allows the use of macros in the Definitions Section to simplify the specification of complex patterns.
Once defined, macros can be reused in the Rules Section to make the code more readable and
maintainable.

Why Use Macros?

 Reusability: Define a pattern once and use it multiple times in different rules.
 Readability: Instead of repeating complex regular expressions, you give them a
meaningful name.
 Maintainability: If a pattern changes, you only need to update it in one place.
How Macros Work:
 You define macros in the Definitions Section of a Lex file, right after the header section.
 Macros are typically written using regular expressions.
 You can refer to a macro by its name in the Rules Section instead of rewriting the regular
expression multiple times.
Example:
Consider a scenario where you frequently match digits, letters, or comments. Instead of repeating
these regular expressions in multiple places, you define macros:
DIGIT [0-9]
LETTER [a-zA-Z]
UNDERSCORE _
COMMENT ##.*\n
In the Rules Section, you can then use these macros like so:
{LETTER}({LETTER}|{DIGIT}|{UNDERSCORE})* { printf("Identifier found\n"); }
{DIGIT}+ { printf("Number found\n"); }
{COMMENT} { printf("Comment found\n"); }
Here, the macros {LETTER}, {DIGIT}, and {COMMENT} replace the corresponding regular
expressions. When Lex processes this input, it substitutes the macro with the defined regular
expression.
Benefits of Macros in Lex:
1. Simplification: For example, instead of repeating [0-9] everywhere in your rules where
you want to match digits, you can define DIGIT as a macro and use {DIGIT}.
2. Efficiency: Makes your Lex file easier to modify. If you decide to redefine what a "digit"
is (e.g., to include more characters), you only need to change the definition of the macro.
3. Organization: Macros make your Lex specification cleaner and easier to understand by
abstracting out complex patterns.
Example without Macros:
[0-9]+ { printf("Number found\n"); }
[a-zA-Z]+ { printf("Identifier found\n"); }
##.*\n { printf("Comment found\n"); }
Now compare this to the example with macros:
DIGIT [0-9]
LETTER [a-zA-Z]
COMMENT ##.*\n
%%
{DIGIT}+ { printf("Number found\n"); }
{LETTER}+ { printf("Identifier found\n"); }
{COMMENT} { printf("Comment found\n"); }
Both examples do the same thing, but the second one using macros is much clearer, easier to
maintain, and avoids repetition.
Summary:
 A macro in Lex is a symbolic name for a pattern, typically defined using regular
expressions.
 Macros are defined in the Definitions Section and used in the Rules Section.
 They improve code readability, maintainability, and reduce repetition.
Explanation of printf, %d, and "%.*s"
1. printf:
printf is a standard C function used to print formatted output to the console. The general format
is:
printf(format_string, arguments);
 format_string: A string that specifies how to format the output. It can contain text and
format specifiers (e.g., %d, %s) to insert values.
 arguments: The values that will be substituted for the format specifiers in the format
string.

2. %d:
%d is a format specifier used in printf to print an integer value. When printf encounters %d in the
format string, it looks for an integer argument to print at that position.
Example:
int line = 10;
printf("Line number: %d\n", line);
Output:
Line number: 10

3. "%.*s":
This format specifier prints at most a specified number of characters from a string. It's particularly useful when
you want to print only part of a string.
 %.*s: This tells printf to print a string, but the .* means that the maximum number of characters to
print will be provided as an additional int argument before the string itself.
o .: The dot represents precision.
o *: The * allows you to specify the precision dynamically via an argument.
Example:
char *text = "Hello, World!";
int length = 5;
printf("Text: %.*s\n", length, text);
Output:

Text: Hello
In this example, only the first 5 characters of "Hello, World!" are printed because length = 5.
In the context of Flex, yytext is the string containing the matched text, and yyleng is the length of
the matched text. So:
printf("NUMBER %.*s\n", yyleng, yytext);
This prints the word "NUMBER" followed by the matched text from yytext but only prints
yyleng characters (the length of the match).
For example, if yytext contains "12345" and yyleng is 5, it will print:
NUMBER 12345
Sample Flex Code (example.l)
%{
#include <stdio.h>
%}
%%
[0-9]+\.[0-9]+ { printf("FLOAT: %s\n", yytext); }
[0-9]+ { printf("INTEGER: %s\n", yytext); }
int { printf("KEYWORD_INTEGER\n"); } /* before the identifier rule so it wins the tie */
[a-zA-Z_][a-zA-Z0-9_]* { printf("IDENTIFIER: %s\n", yytext); }
"+"|"-"|"*"|"/"|"=" { printf("OPERATOR: %s\n", yytext); }
[ \t\n]+ ; /* Ignore whitespace */
. { printf("UNKNOWN: %s\n", yytext); }
%%
int main(void) {
    printf("Enter a String:\n");
    yylex();
    return 0;
}

int yywrap(void) {
    return 1;
}
Explanation of Key Parts
 %{...%}: This section at the top is for C code, typically for #include statements or
variable declarations.
 %% delimiters: Two %% lines divide the file into three parts: definitions, rules, and user code.
 Regular Expressions:
o [0-9]+\.[0-9]+: Matches floating-point numbers and uses yytext to output the
matched text.
o [0-9]+: Matches integers.
o [a-zA-Z_][a-zA-Z0-9_]*: Matches identifiers, which are letters or underscores
followed by alphanumeric characters.
o "+"|"-"|"*"|"/"|"=": Matches the arithmetic operators and the assignment sign.
o [ \t\n]+: Matches whitespace characters (ignored here).
o .: Catches any character not matched by other rules, marking it as "UNKNOWN."
Compilation and Execution
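To compile and run this flex program:
1. Generate the C code from the flex file (this writes lex.yy.c):
flex example.l
2. Compile the generated C file:
gcc lex.yy.c -o lexer
3. Run the lexer interactively, or redirect an input file into it (input.txt is a placeholder name):
./lexer (Linux/macOS) or lexer (Windows CMD)
./lexer < input.txt
The input you enter will be tokenized, with output describing each token type based on
its pattern match. End interactive input with Ctrl+D (Linux/macOS) or Ctrl+Z followed by Enter (Windows).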
Sample Flex Code with Definitions (example.l)
%{
#include <stdio.h>
%}

/* Definitions for common patterns */
DIGIT [0-9]
LETTER [a-zA-Z]
ID {LETTER}({LETTER}|{DIGIT})*
INT {DIGIT}+
FLOAT {DIGIT}+\.{DIGIT}+

%%
{FLOAT} { printf("FLOAT: %s\n", yytext); }
{INT} { printf("INTEGER: %s\n", yytext); }
{ID} { printf("IDENTIFIER: %s\n", yytext); }
"+"|"-"|"*"|"/" { printf("OPERATOR: %s\n", yytext); }
[ \t\n]+ ; /* Ignore whitespace */
. { printf("UNKNOWN: %s\n", yytext); }
%%

int main(void) {
    yylex(); // Start scanning
    return 0;
}

int yywrap(void) {
    return 1;
}
Explanation of Key Parts
 Definitions Section: Each macro is defined at the top, before the %% delimiter.
o DIGIT: Matches any single digit.
o LETTER: Matches any uppercase or lowercase letter.
o ID: Matches identifiers that start with a letter and may contain letters and digits.
o INT: Matches an integer (a sequence of digits).
o FLOAT: Matches a floating-point number (digits followed by a period and more
digits).
 Rules Section: Each rule can use the defined macros by placing them in curly braces
{...}, like {ID}, {FLOAT}, etc. Flex substitutes these with the actual regular expressions
during processing.
Advantages of Using Definitions
 Readability: You can clearly see what each token type is without reading the regex
details.
 Modularity: If you need to change a pattern (e.g., redefine ID to allow underscores), you
only have to update it in the definitions section.
In a flex (lexical analyzer) file, main() and yywrap() are standard boilerplate that lets the generated scanner
link and run correctly:
1. int main(void):
o This is the entry point of a C program. Here, yylex() is called, which is the
function generated by flex to perform the lexical analysis.
o yylex() reads input from stdin, matches the patterns defined in the flex rules, and
executes the corresponding actions.
o After yylex() finishes (usually when it reaches the end of the input), main returns
0, signaling that the program finished without errors.
2. int yywrap():
o yywrap() is a function flex calls when it reaches the end of the input file.
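o yywrap() returns 1 to indicate that there is no more input to process, or 0 after it
has set up another input (for example, by pointing yyin at the next file).
o Flex stops scanning when yywrap() returns 1. Unless the file declares
%option noyywrap or the program is linked against the flex library (-lfl), you must
define this function yourself, even if you're only scanning a single file.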
Why are these needed?
 main runs the yylex() scanner, initiating the scanning process.
 yywrap provides a way to handle multiple files or stop scanning cleanly after a single file
if there's no additional input.
Change Code Page to UTF-8 (helps display Amharic text correctly)
By default, the Windows Command Prompt (CMD) uses a legacy code page (like CP437 or
CP850), which cannot represent UTF-8 text. You can change it to UTF-8 by running the following
command:
chcp 65001
 This sets the code page to UTF-8 (65001).
 You can verify the code page change by typing:
chcp
TO DOWNLOAD

download flex 2.5.4a
