CD - Unit I
UNIT I
Introduction: Overview of compilation, Language Processors, The structure of a Compiler, Pass
and Phases of translation, Interpretation and bootstrapping.
Lexical Analysis: The Role of the Lexical Analyzer, Input Buffering, Recognition of Tokens,
Design of a Lexical-Analyzer Generator, Optimization of DFA-Based Pattern Matchers, The
Lexical-Analyzer Generator (LEX) tool.
1.1.1.1 PREPROCESSOR
A pre-processor produces input to compilers. It may perform the following functions.
Macro processing: A pre-processor may allow a user to define macros that are
shorthands for longer constructs.
File inclusion: A pre-processor may include header files into the program text.
Rational pre-processor: these pre-processors augment older languages with
more modern flow-of-control and data structuring facilities.
Language extensions: These pre-processors attempt to add capabilities to the
language by what amounts to built-in macros.
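As an illustration (a minimal C sketch, not from the original notes), the C pre-processor performs both macro processing and file inclusion before compilation proper begins:

    #include <stdio.h>              /* file inclusion: the header text is spliced in */
    #define SQUARE(x) ((x) * (x))   /* macro: a shorthand for a longer construct */

    int main(void)
    {
        printf("%d\n", SQUARE(5));  /* expands to ((5) * (5)) before compilation */
        return 0;
    }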
1.1.1.2 COMPILER
A compiler is a translator program that takes a program written in a high-level
language (HLL), the source program, and translates it into an equivalent program in a
machine-level language (MLL), the target program. An important part of a compiler is
reporting errors in the source program to the programmer.
1.1.1.3 ASSEMBLER
Programmers found it difficult to write or read programs in machine language.
They began to use mnemonics (symbols) for each machine instruction, which they
would subsequently translate into machine language. Such a mnemonic machine
language is now called an assembly language. Programs known as assemblers were
written to automate the translation of assembly language into machine language.
The input to an assembler program is called the source program; the output is a
machine language translation (object program).
1.1.1.4 INTERPRETER:
An interpreter is a program that appears to execute a source program as if it
were machine language.
Languages such as BASIC, SNOBOL, and LISP can be translated using interpreters. Java
also uses an interpreter. The process of interpretation can be carried out in the following
phases.
Lexical analysis
Syntax analysis
Semantic analysis
Direct Execution
Advantages
Modifications to the user program can easily be made and applied as execution
proceeds.
The types of objects may change dynamically during execution.
Debugging a program and finding errors is a simpler task for an interpreted
program.
The interpreter for the language makes it machine independent.
Disadvantages
The execution of the program is slower.
Memory consumption is more.
1.1.1.5 Loader and Link-editor:
Once the assembler produces an object program, that program must be placed
into memory and executed. The assembler could place the object program directly in
memory and transfer control to it, thereby causing the machine language program to be
executed. However, this would waste memory (core) by leaving the assembler in memory
while the user's program was being executed. Also, the programmer would have to
retranslate his program with each execution, thus wasting translation time. To overcome
these problems of wasted translation time and memory, system programmers developed
another component called the loader.
“A loader is a program that places programs into memory and prepares them for
execution.” It would be more efficient if subroutines could be translated into an object
form that the loader could “relocate” directly behind the user’s program. The task of
adjusting programs so they may be placed in arbitrary memory locations is called
relocation. Relocating loaders perform four functions: allocation, linking, relocation,
and loading.
1.1.2 STRUCTURE OF THE COMPILER DESIGN
The lexical analyzer (LA), or scanner, reads the source program one character at a time,
carving the source program into a sequence of atomic units called tokens.
Syntax Analysis:-
The second stage of translation is called syntax analysis or parsing. In this phase
expressions, statements, declarations, etc. are identified by using the results of lexical
analysis. Syntax analysis is aided by using techniques based on the formal grammar of
the programming language.
Code Optimization:-
This is an optional phase designed to improve the intermediate code so that the output
runs faster and takes less space.
Code Generation:-
The last phase of translation is code generation. A number of optimizations to
reduce the length of the machine language program are carried out during this phase. The
output of the code generator is the machine language program for the specified computer.
Error Handlers:-
It is invoked when a flaw or error in the source program is detected. The output of the LA
is a stream of tokens, which is passed to the next phase, the syntax analyzer or parser. The
SA groups the tokens together into syntactic structures called expressions. Expressions
may further be combined to form statements. The syntactic structure can be regarded as
a tree, called a parse tree, whose leaves are tokens.
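As a hypothetical illustration (not from the original notes) of the LA-to-parser interface, a token can be represented as an integer code plus the matched lexeme; a minimal C sketch:

    #include <stdio.h>

    /* A hypothetical token representation, for illustration only. */
    enum token_type { ID, ASSIGN, PLUS, STAR, NUM };

    struct token {
        enum token_type type;    /* integer code for the token class */
        const char     *lexeme;  /* the characters that were matched */
    };

    int main(void)
    {
        /* The stream an LA might emit for the input "a = b + c * 60". */
        struct token stream[] = {
            {ID, "a"}, {ASSIGN, "="}, {ID, "b"}, {PLUS, "+"},
            {ID, "c"}, {STAR, "*"}, {NUM, "60"},
        };
        for (size_t i = 0; i < sizeof stream / sizeof stream[0]; i++)
            printf("(%d, \"%s\")\n", stream[i].type, stream[i].lexeme);
        return 0;
    }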
The front end consists of those phases, or parts of phases, that are source-language
dependent and target-machine independent. These generally consist of lexical analysis,
semantic analysis, syntactic analysis, symbol table creation, and intermediate code
generation. A small part of code optimization can also be included in the front end.
The front end also includes the error handling that goes along with each of these
phases.
The portions of a compiler that depend on the target machine and do not depend on
the source language are included in the back end. The back end includes code generation
and the machine-dependent parts of code optimization, along with the associated error
handling and symbol table operations.
A pass is a component in which parts of one or more phases of the compiler are
combined when a compiler is implemented. A pass reads or scans the instructions of the
source program, or the output produced by the previous pass, and makes the necessary
transformations specified by its phases.
1. One-pass
2. Two-pass
Grouping
Several phases are grouped together into a pass so that the pass can read an input file and
write an output file.
1. One-Pass – In a one-pass compiler all the phases are grouped into a single pass;
all six phases are included in one pass.
2. Two-Pass – In a two-pass compiler the phases are divided into two parts, i.e., the
analysis or front-end part of the compiler and the synthesis or back-end part of the compiler.
A two-pass compiler uses its first pass to enter into its symbol table a list of
identifiers together with the memory locations to which they correspond. Then a
second pass replaces mnemonic operation codes by their machine language
equivalents and replaces uses of identifiers by their machine addresses. In the second pass, the
compiler can read the result file produced by the first pass, build the syntax
tree and carry out the syntactic analysis. The output of this stage is a file that contains
the syntax tree.
1.1.5 Interpretation
Interpreters were first used in 1952, making programming simpler given the limitations of
early computers. They are commonly used in micro-computers and help programmers
debug errors before moving to the next statement.
The advantage of an interpreter is that it executes the program line by line, which helps
users find errors easily.
The disadvantage of an interpreter is that it takes more time to execute a program
than a compiler does.
Applications of Interpreters
1.1.6 Bootstrapping
1. Start with a Basic Compiler: A simple compiler is created using a basic language
(e.g., assembly language). It handles essential features of a programming language.
2. Create an Advanced Version: The basic compiler is used to compile a more advanced
version, which can handle additional features like better error checking and
optimizations.
3. Gradually Improve: Each version of the compiler builds on the previous one, adding
more features and improving efficiency. This process continues until the desired
result is achieved.
In the T-diagram:
1. Step 1: The source language is a subset of C (C0), the target language is Assembly,
and the implementation language is also Assembly.
2. Step 2: Using the C0 compiler, a compiler for the full C language is created, with C as
the source language and Assembly as the target language.
Cross compilation
Cross-compilation is a process where a compiler runs on one platform (host) but
generates machine code for a different platform (target). This is useful when the target
platform is not powerful enough to run the full compiler or when the target architecture is
different from the host system. Using bootstrapping in cross-compilation can help create a
compiler that runs on one system (the host) but produces code for another system (the
target).
Advantages of Bootstrapping:
Challenges of Bootstrapping:
1. Initial Effort: Requires significant time and effort to build the first simple compiler.
2. Complexity of Self-Compilation: Ensuring the compiler can compile itself while
supporting advanced features is challenging.
3. Time Consumption: Iterative improvements in early stages are slow and resource-
intensive.
Lexical Analysis
1.2.1 OVERVIEW OF LEXICAL ANALYSIS
1. To identify the tokens we need some method of describing the possible tokens
that can appear in the input stream. For this purpose we introduce regular
expressions, a notation that can be used to describe essentially all the tokens of a
programming language.
2. Secondly, having decided what the tokens are, we need some mechanism to
recognize these in the input stream. This is done by token recognizers, which
are designed using transition diagrams and finite automata.
The LA is the first phase of a compiler. Its main task is to read the input
characters and produce as output a sequence of tokens that the parser uses for syntax
analysis.
Upon receiving a ‘get next token’ command from the parser, the lexical
analyzer reads input characters until it can identify the next token. The LA returns to
the parser a representation for the token it has found. The representation will be an
integer code if the token is a simple construct such as a parenthesis, comma or colon.
The LA may also perform certain secondary tasks at the user interface. One such
task is stripping out from the source program comments and white space in the
form of blank, tab and newline characters. Another is correlating error messages from
the compiler with the source program.
Pattern: A set of strings in the input for which the same token is produced as
output. This set of strings is described by a rule called a pattern associated with the
token.
LEXICAL ERRORS:
Lexical errors are the errors thrown by the lexer when it is unable to continue,
meaning that there is no way to recognise a lexeme as a valid token. Syntax
errors, on the other hand, are thrown by the parser when a given set of already
recognised valid tokens does not match any of the right sides of the grammar rules. A simple
panic-mode error handling system requires that we return to a high-level parsing
function when a parsing or lexical error is detected.
Input buffering is a critical concept in compiler design that improves the efficiency of
reading and processing source code. Typically, a compiler scans the input one character at a
time, which can be slow and inefficient. Input buffering addresses this issue by allowing the
compiler to read chunks of input data into a buffer before processing them. This reduces the
number of system calls, each of which carries overhead, thereby improving performance.
One major advantage of input buffering is its ability to reduce the frequency of
system calls needed to read the source code, leading to faster compilation times.
Additionally, it simplifies the compiler's design by minimizing the amount of code required
for input management.
However, input buffering is not without its challenges. If the buffer size is excessively
large, it can consume too much memory, potentially leading to slower performance or even
crashes, especially on systems with limited resources. Furthermore, improper management
of the buffer can result in errors during compilation, such as incorrect processing of the
input data.
Initially, both pointers point to the first character of the input string.
In the process of lexical analysis, the forward pointer (fp) scans the input to identify the end
of a lexeme. When a blank space is encountered, it signifies the end of the current lexeme
(e.g., recognizing the lexeme "int"). The fp then moves ahead, skipping the white space,
while both the begin pointer (bp) and fp are reset to the starting position of the next token.
1. One Buffer Scheme: This approach uses a single buffer to hold the input data. It is
simpler but may require extra effort to manage overlapping lexemes.
2. Two Buffer Scheme: This method employs two buffers alternately. While one buffer
is being processed, the other is being filled with the next block of input, enabling
seamless processing and reducing delays caused by input operations.
One Buffer Scheme: In this scheme, only one buffer is used to store the input string. The
problem with this scheme is that if a lexeme is very long it crosses the buffer
boundary, and to scan the rest of the lexeme the buffer has to be refilled, which overwrites
the first part of the lexeme.
Two Buffer Scheme: The Two Buffer Scheme improves input buffering by using two
alternating buffers to store input. When one buffer is processed, the other is filled with the
next block of data, ensuring uninterrupted processing. Initially, both the begin pointer (bp)
and forward pointer (fp) point to the first character of the first buffer. The fp moves right to
find the end of a lexeme, which is marked by a blank space. The lexeme is identified as the
string between bp and fp.
To mark buffer boundaries, a Sentinel (end-of-buffer character) is placed at the end of each
buffer. When fp encounters the first sentinel, the second buffer is filled. Similarly,
encountering the second sentinel prompts refilling of the first buffer. This process continues
until all input is processed. A limitation of this method is that lexemes longer than the buffer
size cannot be fully scanned. Despite this, the scheme efficiently reduces secondary storage
access delays.
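A minimal C sketch of this scheme is given below; it models only the forward pointer. The buffer size N and the use of '\0' as the sentinel are assumptions for illustration (real lexers typically use a special eof character):

    #include <stdio.h>

    #define N 4096                  /* size of each buffer half */
    #define SENTINEL '\0'           /* assumed sentinel character */

    static char buf[2 * N + 2];     /* two halves, one sentinel byte after each */
    static char *forward;           /* the forward (fp) pointer */
    static FILE *src;

    /* Fill one half with up to N characters and mark the end of valid data. */
    static void reload(char *half)
    {
        size_t n = fread(half, 1, N, src);
        half[n] = SENTINEL;
    }

    /* Advance fp by one character, switching halves at a sentinel.
     * Returns the character, or -1 at the true end of input. */
    static int advance(void)
    {
        char c = *forward++;
        if (c != SENTINEL)
            return (unsigned char)c;
        if (forward == buf + N + 1) {       /* sentinel after the first half */
            reload(buf + N + 1);            /* fill the second half */
            forward = buf + N + 1;
            return advance();
        }
        if (forward == buf + 2 * N + 2) {   /* sentinel after the second half */
            reload(buf);                    /* refill the first half */
            forward = buf;
            return advance();
        }
        return -1;                          /* sentinel inside a half: real EOF */
    }

    int main(int argc, char **argv)
    {
        src = (argc > 1) ? fopen(argv[1], "r") : stdin;
        if (!src) return 1;
        reload(buf);                        /* prime the first half */
        forward = buf;
        long count = 0;
        while (advance() != -1) count++;
        printf("%ld characters scanned\n", count);
        return 0;
    }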
Token recognizers can be represented in two ways:
1. Transition Table
2. Transition Diagram
EXAMPLE
Assume the following grammar fragment, which generates a specific language,
where the terminals if, then, else, relop, id and num generate sets of strings given by
the following regular definitions, in which letter and digit are defined as
letter → [A-Za-z] and digit → [0-9].
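The grammar fragment and the full regular definitions are not shown above; the classic textbook (dragon-book) versions that this example follows are:

    stmt  → if expr then stmt
          | if expr then stmt else stmt
          | ε
    expr  → term relop term
          | term
    term  → id
          | num

    if    → if
    then  → then
    else  → else
    relop → < | <= | = | <> | > | >=
    id    → letter ( letter | digit )*
    num   → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?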
For this language, the lexical analyzer will recognize the keywords if, then, and else,
as well as lexemes that match the patterns for relop, id, and number.
To simplify matters, we make the common assumption that keywords are also
reserved words: that is, they cannot be used as identifiers.
The pattern num matches the unsigned integer and real numbers of Pascal.
In addition, we assume lexemes are separated by white space, consisting of nonnull
sequences of blanks, tabs, and newlines.
Our lexical analyzer will strip out white space. It will do so by comparing a string
against the regular definition ws, below.
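The definition of ws itself does not appear above; the standard dragon-book form is:

    delim → blank | tab | newline
    ws    → delim+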
If a match for ws is found, the lexical analyzer does not return a token to the parser.
It is the following token that gets returned to the parser.
1.2.4.2 Transition Diagram
It is a directed labelled graph consisting of nodes and edges. Nodes represent states, while edges
represent state transitions.
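As a hypothetical illustration, a transition diagram such as the one for relop is commonly implemented as code in which each state is a branch and each edge is a character test; a minimal C sketch (token codes and the function name are assumptions):

    #include <stdio.h>

    /* Sketch: the relop transition diagram as code. Each state is a branch,
     * each edge a character test. Token codes and names are hypothetical. */
    enum relop_token { LT, LE, NE, EQ, GT, GE, NONE = -1 };

    int relop(const char *s, int *len)
    {
        switch (s[0]) {
        case '<':
            if (s[1] == '=') { *len = 2; return LE; }   /* "<=" */
            if (s[1] == '>') { *len = 2; return NE; }   /* "<>" */
            *len = 1; return LT;                        /* "<" : retract one char */
        case '=':
            *len = 1; return EQ;
        case '>':
            if (s[1] == '=') { *len = 2; return GE; }   /* ">=" */
            *len = 1; return GT;                        /* ">" : retract one char */
        }
        return NONE;                                    /* no relop recognized */
    }

    int main(void)
    {
        int len;
        printf("%d %d\n", relop("<=", &len), len);      /* prints "1 2" (LE) */
        return 0;
    }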
The NFA simulation reads input beginning at the point in the input which we have referred
to as lexemeBegin. As it moves the pointer called forward ahead in the input, it calculates
the set of states it is in at each point.
Eventually, the NFA simulation reaches a point on the input where there are no next states. At that
point, there is no hope that any longer prefix of the input would ever get the NFA to an
accepting state; rather, the set of states will always be empty. Thus, we are ready to decide on the
longest prefix that is a lexeme matching some pattern.
An alternative architecture, resembling the output of Lex, is to convert the NFA for all the
patterns into an equivalent DFA, using the subset construction method.
The accepting states are labelled by the pattern that is identified by that state.
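A table-driven matcher over such a DFA can be sketched as follows; the three-state DFA for the id pattern and the longest-prefix loop are illustrative assumptions, not the full multi-pattern automaton:

    #include <stdio.h>

    /* Hypothetical DFA recognizing the pattern letter(letter|digit)*. */
    enum { START, IN_ID, DEAD };

    static int is_letter(int c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
    static int is_digit(int c)  { return c >= '0' && c <= '9'; }

    static int step(int state, int c)
    {
        switch (state) {
        case START: return is_letter(c) ? IN_ID : DEAD;
        case IN_ID: return (is_letter(c) || is_digit(c)) ? IN_ID : DEAD;
        default:    return DEAD;
        }
    }

    /* Run the DFA and remember the last accepting position: this yields
     * the longest prefix that is a lexeme matching the id pattern. */
    static int match_id(const char *s)
    {
        int state = START, last_accept = -1;
        for (int i = 0; s[i] != '\0' && state != DEAD; i++) {
            state = step(state, s[i]);
            if (state == IN_ID) last_accept = i;
        }
        return last_accept + 1;   /* length of the match, 0 if none */
    }

    int main(void)
    {
        printf("%d\n", match_id("rate60 = 5"));   /* prints 6 ("rate60") */
        return 0;
    }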
LOOKAHEAD OPERATOR
The Lex lookahead operator / in a Lex pattern r1/r2 is sometimes necessary, because
the pattern r1 for a particular token may need to describe some trailing context r2 in
order to correctly identify the actual lexeme.
When converting the pattern r1/r2 to an NFA, we treat the / as if it were ε, so we do
not actually look for a / in the input. However, if the NFA recognizes a prefix xy of
the input buffer as matching this regular expression, the end of the lexeme is not where the
NFA entered its accepting state.
AN NFA FOR THE PATTERN FOR THE FORTRAN IF WITH LOOKAHEAD
Notice that the ε-transition from state 2 to state 3 represents the lookahead operator. State
6 indicates the presence of the keyword IF. However, we find the lexeme IF by scanning
backwards to the last occurrence of state 2 whenever state 6 is entered.
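In Lex notation, this pattern is written with the lookahead operator; the classic dragon-book form for the Fortran IF is:

    IF / \( .* \) {letter}

Here r1 is the keyword IF, and the trailing context r2 is a parenthesized condition followed by a letter, which distinguishes the IF statement from an assignment to an array named IF.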
1.2.6 Optimization of DFA-Based Pattern Matchers
To optimize (minimize) the DFA you have to follow various steps. These are as follows:
Step 1: Remove all the states that are unreachable from the initial state via any set of
transitions of the DFA.
Step 2: Draw the transition table for all the remaining states.
Step 3: Now split the transition table into two tables T1 and T2. T1 contains all the final states
and T2 contains the non-final states.
Step 4: Find the similar rows in T1 such that:
1. δ (q, a) = p
2. δ (r, a) = p
That means, find the two states which have the same transitions on each input symbol
and merge them, removing one of the two.
Step 5: Repeat step 4 until no similar rows remain in the transition table T1.
Step 6: Repeat steps 4 and 5 for table T2 as well.
Step 7: Now combine the reduced T1 and T2 tables. The combined transition table is the
transition table of the minimized DFA.
Solution:
Step 1: In the given DFA, q2 and q4 are unreachable states, so remove them.
Step 3: Split the transition table into two sets:
1. One set contains the rows which start from non-final states;
2. The other set contains the rows which start from final states.
Step 5: In set 2, row 1 and row 2 are similar, since q3 and q5 transit to the same states on 0
and 1. So skip q5 and replace q5 by q3 in the rest.
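A compact C sketch of this table-splitting idea, written as partition refinement over a small hypothetical DFA (not the one from the example above; here q1 and q3 have identical rows and should be merged):

    #include <stdio.h>
    #include <string.h>

    #define N 4   /* states q0..q3 */
    #define A 2   /* input symbols 0 and 1 */

    /* Hypothetical DFA: q1 and q3 have identical rows, so they merge. */
    static int delta[N][A] = {
        {1, 3},   /* q0 */
        {2, 2},   /* q1 */
        {2, 2},   /* q2 (final) */
        {2, 2},   /* q3 */
    };
    static int final_st[N] = {0, 0, 1, 0};

    int main(void)
    {
        int cls[N], newcls[N];

        /* Step 3: initial split into final (T1) and non-final (T2) classes. */
        for (int q = 0; q < N; q++) cls[q] = final_st[q];

        for (;;) {
            /* Steps 4-6: states stay together only if their rows are
             * similar, i.e. they agree on the class of every successor. */
            for (int q = 0; q < N; q++) {
                newcls[q] = q;
                for (int r = 0; r < q; r++) {
                    if (cls[r] != cls[q]) continue;
                    int same = 1;
                    for (int a = 0; a < A; a++)
                        if (cls[delta[r][a]] != cls[delta[q][a]]) { same = 0; break; }
                    if (same) { newcls[q] = r; break; }
                }
            }
            if (memcmp(cls, newcls, sizeof cls) == 0) break;  /* no change: done */
            memcpy(cls, newcls, sizeof cls);
        }

        /* Step 7: each state maps to the representative of its class. */
        for (int q = 0; q < N; q++)
            printf("q%d -> q%d\n", q, cls[q]);   /* prints q3 -> q1 */
        return 0;
    }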
LEX in compiler design is a tool that generates lexical analyzers, which are programs that
convert streams of characters into meaningful units called tokens. This process is known as
tokenization and is a key part of lexical analysis, the first phase of a compiler's workflow.
What is Lex? Lex is a specialized tool (or program) that automates the generation of lexical
analyzers. It takes input in the form of Lex source programs (File.l) and produces C programs
(lex.yy.c) as output. The generated C program can be compiled using a standard C compiler,
resulting in a lexical analyzer (a.out), which converts character streams into tokens.
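A minimal Lex source program illustrates the File.l → lex.yy.c → a.out workflow described above (the counting example and file names are illustrative, not from these notes):

    %{
    /* count.l - counts identifier-like words in the input.
     * Assumed classic build: lex count.l && cc lex.yy.c -ll && ./a.out < input */
    #include <stdio.h>
    int ids = 0;
    %}

    letter [A-Za-z]
    digit  [0-9]

    %%
    {letter}({letter}|{digit})*  { ids++; }   /* an id lexeme */
    .|\n                         { ; }        /* skip everything else */
    %%

    int main(void)
    {
        yylex();                 /* run the generated lexical analyzer */
        printf("identifiers: %d\n", ids);
        return 0;
    }

    int yywrap(void) { return 1; }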
Functions of Lex: