Introduction To Lex
Phases of Compiler
Lex
• Lex is a program that generates lexical analyzers.
• A lexical analyzer is a program that transforms an input stream into a
sequence of tokens.
• Lex reads a specification written in the Lex language and produces C source
code that implements the lexical analyzer.
• First, the programmer writes a specification, lex.l, in the Lex language. The Lex
compiler then processes lex.l and produces a C program, lex.yy.c.
• Finally, the C compiler compiles lex.yy.c and produces an object program,
a.out.
• a.out is the lexical analyzer that transforms an input stream into a sequence of
tokens.
Structure of Lex Programs
declarations
%%
translation rules
%%
auxiliary functions
• Declarations: This section includes declarations of variables and constants.
• Translation rules: This section contains regular expressions and code segments.
• Form: Pattern {Action}
• Pattern is a regular expression or regular definition.
• Action is a segment of C code that runs when the pattern is matched.
• Auxiliary functions: This section holds additional functions that are
used in the actions.
• 1. Definition Section: The definition section contains declarations of
variables, regular definitions, and constants.
• C code in the definition section is enclosed in “%{ %}” brackets and is copied
verbatim into the generated C file.
• Syntax:
%{
// Definitions
%}
• 2. Rules Section: The rules section contains a series of rules of the
form: pattern action. The pattern must be unindented, and the action must
begin on the same line, typically enclosed in {} braces. The rules section is
delimited by a “%%” line before it and a “%%” line after it.
%%
pattern action
%%
• 3. User Code Section: This section contains C statements and additional
functions. These functions can also be compiled separately and linked with
the lexical analyzer.
Basic Program Structure:
%{
// Definitions
%}
%%
Rules
%%
User code section
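The Pattern {Action} pairing used in the rules section can be imitated in plain C to show the idea. The sketch below is only an analogy, not how Lex works internally: it uses the POSIX regex.h functions, tries the rules in order on one complete lexeme, and runs the action of the first rule that matches. The names counts, rule, on_number, on_word and apply_rules are hypothetical and were chosen for this illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Counters updated by the actions (hypothetical, for illustration only). */
struct counts { int numbers; int words; int other; };

/* An action is ordinary C code; here each action just bumps a counter. */
typedef void (*action_fn)(struct counts *);

static void on_number(struct counts *c) { c->numbers++; }
static void on_word(struct counts *c)   { c->words++; }

/* One rule pairs a pattern with an action, like "pattern {action}" in Lex. */
struct rule { const char *pattern; action_fn action; };

static const struct rule rules[] = {
    { "^[0-9]+$",    on_number },
    { "^[A-Za-z]+$", on_word   },
};

/* Runs the action of the first rule whose pattern matches the whole lexeme.
   Returns the index of that rule, or -1 if no rule matched. */
int apply_rules(const char *lexeme, struct counts *c)
{
    assert(lexeme != NULL && c != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; ++i) {
        regex_t re;
        if (regcomp(&re, rules[i].pattern, REG_EXTENDED | REG_NOSUB) != 0)
            continue;                      /* skip a pattern that fails to compile */
        int matched = (regexec(&re, lexeme, 0, NULL, 0) == 0);
        regfree(&re);
        if (matched) {
            rules[i].action(c);
            return (int)i;
        }
    }
    c->other++;                            /* default when nothing matches */
    return -1;
}
```

Calling apply_rules("123", &c) on a zero-initialized counts structure runs on_number and returns 0. A generated scanner differs in two important ways: it compiles all patterns into one automaton instead of testing them one at a time, and it finds the lexemes in the input stream itself.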
• Design of Lexical Analyzer
• A lexical analyzer can be generated from either an NFA or a DFA.
• A DFA is preferable for the implementation of Lex.
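To make the DFA idea concrete, here is a minimal hand-written sketch in C. It assumes two token classes only (identifiers and unsigned integers) and classifies one complete string; the state names, the token_class values and the functions step and classify are hypothetical and are not produced by Lex.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes for this sketch. */
enum token_class { TOK_INVALID, TOK_IDENTIFIER, TOK_INTEGER };

/* DFA states: START is the initial state, IN_ID and IN_INT are accepting,
   and DEAD is a trap state that can never be left. */
enum state { START, IN_ID, IN_INT, DEAD };

/* Transition function: exactly one next state per (state, character) pair,
   which is what makes the automaton deterministic. */
static enum state step(enum state s, char ch)
{
    unsigned char c = (unsigned char)ch;
    switch (s) {
    case START:
        if (isalpha(c) || c == '_') return IN_ID;
        if (isdigit(c))             return IN_INT;
        return DEAD;
    case IN_ID:
        return (isalnum(c) || c == '_') ? IN_ID : DEAD;
    case IN_INT:
        return isdigit(c) ? IN_INT : DEAD;
    default:
        return DEAD;
    }
}

/* Feeds every character of text through the DFA and reports the class
   of the accepting state it ends in, if any. */
enum token_class classify(const char *text)
{
    assert(text != NULL);
    enum state s = START;
    for (const char *p = text; *p != '\0'; ++p)
        s = step(s, *p);
    if (s == IN_ID)  return TOK_IDENTIFIER;
    if (s == IN_INT) return TOK_INTEGER;
    return TOK_INVALID;
}
```

With these definitions, classify("x1") returns TOK_IDENTIFIER, classify("42") returns TOK_INTEGER, and classify("4x") returns TOK_INVALID. Because each input character causes exactly one transition and no backtracking over alternative states, the running time is linear in the input length, which is one reason a DFA suits a scanner.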
• How to run the program:
To run the program, first save it with the extension .l or .lex.
Then, on the terminal, run Lex on the file to generate lex.yy.c, compile lex.yy.c
with the C compiler to produce a.out, and run a.out.
• yylex(): This function is called to invoke the lexer (or scanner). Each time
yylex() is called, the scanner continues processing the input from where it
last left off.
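The way a scanner resumes where it stopped can be sketched in C with a small hand-written stand-in for yylex(). Everything here is an assumption made for illustration: the scanner structure, the token codes and the functions scanner_init and next_token are hypothetical, and a generated scanner keeps its position in internal buffers rather than in a structure passed by the caller. The one convention borrowed from Lex is that 0 signals the end of the input.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; 0 marks end of input, as yylex() returns 0 there. */
enum { TOK_EOF = 0, TOK_WORD = 1, TOK_NUMBER = 2, TOK_OTHER = 3 };

/* State that persists between calls: the input text and the current position. */
struct scanner {
    const char *input;
    size_t pos;
};

void scanner_init(struct scanner *s, const char *input)
{
    assert(s != NULL && input != NULL);
    s->input = input;
    s->pos = 0;
}

/* Returns the code of the next token and advances s->pos past it, so the
   following call continues from where this one stopped. */
int next_token(struct scanner *s)
{
    const char *in = s->input;

    while (isspace((unsigned char)in[s->pos]))   /* skip whitespace */
        s->pos++;

    unsigned char c = (unsigned char)in[s->pos];
    if (c == '\0')
        return TOK_EOF;                          /* nothing left to scan */
    if (isalpha(c)) {
        while (isalpha((unsigned char)in[s->pos]))
            s->pos++;
        return TOK_WORD;
    }
    if (isdigit(c)) {
        while (isdigit((unsigned char)in[s->pos]))
            s->pos++;
        return TOK_NUMBER;
    }
    s->pos++;                                    /* any other single character */
    return TOK_OTHER;
}
```

For the input "sum 42 +", successive calls to next_token return TOK_WORD, TOK_NUMBER, TOK_OTHER and then TOK_EOF. A parser would typically call the real yylex() in the same way, in a loop, until it returns 0.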
YACC
• YACC stands for Yet Another Compiler Compiler.
• YACC provides a tool to produce a parser for a given grammar.
• YACC is a program designed to compile an LALR(1) grammar.
• It is used to produce the source code of the syntactic analyzer (parser) for
the language generated by an LALR(1) grammar.
• The input of YACC is a set of grammar rules and the output is a C
program.
• A full YACC specification looks like:
declarations
%%
rules
%%
programs
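• The rules section is made up of one or more grammar rules. A grammar rule has the form:
A : BODY ;
• Several rules with the same left-hand side can be written separately:
A : B C D ;
A : E F ;
A : G ;
• or combined into a single rule using the vertical bar “|”:
A : B C D
  | E F
  | G ;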