
LEXICAL ANALYSIS IN COMPILER DESIGN

SUBMITTED TO: Tanvi Mehta
SUBMITTED BY: DHRITI (2021BCA055)
INTRODUCTION TO LEXICAL ANALYSIS

 Lexical analysis is the first phase of the compiler; the component that performs it is also known as a scanner.
 It converts the high-level input program into a sequence of tokens.
INTRODUCTION TO COMPILER DESIGN
 Compiler design is a crucial aspect of computer science and
software engineering that focuses on the development of software
tools called compilers.
 A compiler is a program that translates source code written in a
high-level programming language into equivalent machine code or
another form of code that can be executed by a computer's
hardware.
 Compiler design requires a deep understanding of programming
languages, computer architecture, algorithms, and data structures.
 It also involves a balance between theoretical concepts and
practical considerations to produce efficient and reliable compilers.
OVERVIEW OF COMPILATION PROCESS
 The following are the phases through which our program passes
before being transformed into an executable form:
Preprocessor
Compiler
Assembler
Linker
ORDER OF COMPILATION PHASES
[Diagram: source code → preprocessor → compiler → assembler → linker → executable]
DETAILED EXPLANATION OF EACH PHASE
 Preprocessor: The source code is written in a text editor, and the
source file is given the extension ".c". This source code is first
passed to the preprocessor, which expands it (for example, by
processing #include and #define directives).
 Compiler: The code expanded by the preprocessor is passed to the
compiler, which converts it into assembly code; in other words, the C
compiler translates the preprocessed code into assembly.
 Assembler: The assembly code is converted into object code by the
assembler. The name of the object file generated by the assembler is
the same as that of the source file; its extension is '.obj' in DOS
and '.o' in UNIX.
 Linker: Most programs written in C use library functions. These
library functions are pre-compiled, and the linker combines their
object code with the object code of our program to produce the final
executable (the commands below illustrate the four stages).
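As a concrete illustration, the four stages can be invoked one at a time with a GCC-style toolchain (a hedged sketch: the file name hello.c is hypothetical, and exact flags differ between compilers):

gcc -E hello.c -o hello.i    # preprocessor: expands #include and #define
gcc -S hello.i -o hello.s    # compiler proper: translates C to assembly
gcc -c hello.s -o hello.o    # assembler: produces object code
gcc hello.o -o hello         # linker: combines our object code with library code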
PURPOSE OF LEXICAL ANALYSIS
 If the lexical analyzer were implemented as a separate pass of the
compiler, it would need an intermediate file in which to place its
output, from which the parser would then take its input. Implementing
the lexical analyzer as a subroutine called by the parser eliminates
the need for this intermediate file.
 The lexical analyzer also interacts with the symbol table while
passing tokens to the parser. Whenever a token is discovered, the
lexical analyzer returns a representation of that token to the parser
(a minimal sketch of this interaction follows).
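A minimal sketch of this interaction in C (hedged: Token, add_symbol, and make_id_token are names invented for this example, not part of any standard API):

#include <stdio.h>
#include <string.h>

typedef enum { TOK_ID, TOK_NUM, TOK_EOF } TokenType;

typedef struct {
    TokenType type;
    char lexeme[32];
    int symtab_index;   /* filled in for identifiers */
} Token;

static char symtab[100][32];   /* toy symbol table */
static int  nsyms = 0;

/* Enter a name into the symbol table (if new) and return its index. */
static int add_symbol(const char *name) {
    for (int i = 0; i < nsyms; i++)
        if (strcmp(symtab[i], name) == 0) return i;
    strcpy(symtab[nsyms], name);
    return nsyms++;
}

/* When the lexer discovers an identifier, it records the name in the
   symbol table and returns a representation of the token to the parser. */
static Token make_id_token(const char *name) {
    Token t;
    t.type = TOK_ID;
    strcpy(t.lexeme, name);
    t.symtab_index = add_symbol(name);
    return t;
}

int main(void) {   /* main stands in for the parser here */
    Token t = make_id_token("count");
    printf("token '%s' -> symbol table slot %d\n", t.lexeme, t.symtab_index);
    return 0;
}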
TOKENIZATION

 In compiler design, tokenization simply means splitting the input
character stream into tokens; the term also has a parallel meaning in
data security. There, tokenization is the process of substituting a
sensitive data element with a non-sensitive equivalent, referred to as
a token, that has no intrinsic or exploitable meaning or value.
 The token is a reference (i.e. an identifier) that maps back to the
sensitive data through a tokenization system. The mapping from
original data to a token uses methods that render tokens infeasible to
reverse in the absence of the tokenization system.
 A one-way cryptographic function can be used to convert the original
data into tokens, making it difficult to recreate the original data
without access to the tokenization system.
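A toy C sketch of the mapping idea (hedged: the vault array and the functions tokenize/detokenize are invented for this illustration; real systems use the cryptographic methods described above):

#include <stdio.h>
#include <string.h>

/* Toy tokenization system: the vault maps tokens back to the
   sensitive data; the token itself carries no exploitable value. */
static char vault[100][32];
static int  nvault = 0;

/* Replace a sensitive value with an opaque token such as "TKN-0". */
static void tokenize(const char *sensitive, char *token_out) {
    strcpy(vault[nvault], sensitive);
    sprintf(token_out, "TKN-%d", nvault++);
}

/* Only the tokenization system can map the token back. */
static const char *detokenize(const char *token) {
    int idx;
    if (sscanf(token, "TKN-%d", &idx) == 1 && idx >= 0 && idx < nvault)
        return vault[idx];
    return NULL;
}

int main(void) {
    char token[16];
    tokenize("4111-1111-1111-1111", token);   /* e.g. a card number */
    printf("stored token: %s\n", token);
    printf("vault lookup: %s\n", detokenize(token));
    return 0;
}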
TOKEN CATEGORIES

This phase recognizes three types of tokens: Terminal Symbols (TRM),
i.e. keywords and operators; Literals (LIT); and Identifiers (IDN).
 Example 1:
int a = 10; // input source code
Tokens: int (keyword), a (identifier), = (operator), 10 (constant),
and ; (punctuation: semicolon)
Answer: total number of tokens = 5
 Example 2:
int main() {
    // printf() sends the string inside the quotation
    // marks to the standard output (the display)
    printf("Welcome to GeeksforGeeks!");
    return 0;
}
Tokens: 'int', 'main', '(', ')', '{', 'printf', '(',
'"Welcome to GeeksforGeeks!"', ')', ';', 'return', '0', ';', '}'
Answer: total number of tokens = 14
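To make Example 1 concrete, here is a minimal C sketch (invented for this slide, not from the source) that splits "int a = 10;" into the five tokens listed above:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Classify a scanned token (toy version: only what Example 1 needs). */
static const char *classify(const char *tok) {
    if (strcmp(tok, "int") == 0) return "keyword";
    if (isdigit((unsigned char)tok[0])) return "constant";
    if (isalpha((unsigned char)tok[0]) || tok[0] == '_') return "identifier";
    if (strcmp(tok, "=") == 0) return "operator";
    return "punctuation";
}

int main(void) {
    const char *src = "int a = 10;";
    char tok[32];
    int count = 0;

    for (const char *p = src; *p; ) {
        if (isspace((unsigned char)*p)) { p++; continue; }
        int n = 0;
        if (isalnum((unsigned char)*p) || *p == '_') {
            /* an identifier, keyword, or numeric constant */
            while (isalnum((unsigned char)*p) || *p == '_')
                tok[n++] = *p++;
        } else {
            tok[n++] = *p++;   /* single-character operator or punctuation */
        }
        tok[n] = '\0';
        printf("%s (%s)\n", tok, classify(tok));
        count++;
    }
    printf("Total number of tokens = %d\n", count);
    return 0;
}

Running it prints the five tokens with their categories and "Total number of tokens = 5".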
LEXICAL ERRORS

 1. Exceeding the length of identifiers or numeric constants:
Example:
#include <iostream>
using namespace std;
int main() {
    int a = 2147483647 + 1;
    return 0;
}
This is a lexical error since a signed integer lies between
−2,147,483,648 and 2,147,483,647.
 2. Appearance of illegal characters:
#include <iostream>
using namespace std;
int main() {
    cout << "Geeksforgeeks"; $
    return 0;
}
This is a lexical error since the illegal character $ appears at the
end of the statement.
 3. Unterminated comment:
#include <iostream>
using namespace std;
int main() {
    /* comment
    cout << "GFG!";
    return 0;
}
This is a lexical error since the closing "*/" of the comment is
missing while its opening "/*" is present.
ADVANTAGES OF LEXICAL ANALYSIS
1. Tokenization: Lexical analysis breaks down the input source code
into tokens, which are the smallest meaningful units of the
programming language.
2. Error Detection: Lexical analysis includes mechanisms for detecting
and reporting lexical errors, such as illegal characters or tokens
that do not conform to the lexical rules of the language.
3. Language Independence: Lexical analysis can be designed to support
multiple programming languages.
DISADVANTAGES OF LEXICAL ANALYSIS
1. Complexity: Implementing a robust lexical analyzer can be complex,
especially for languages with intricate lexical rules or irregular
syntax.
2. Performance Overhead: Lexical analysis adds an additional
processing step to the compilation process, which can introduce some
performance overhead.
3. Memory Consumption: A lexical analyzer typically needs to maintain
data structures such as symbol tables and token buffers, which can
consume memory.
4. Error Recovery: While lexical analysis can detect lexical errors
such as invalid characters or tokens, error recovery mechanisms may be
limited.
APPLICATIONS OF LEXICAL ANALYSIS
1. Compiler Design: In compiler design, lexical analysis is the first
phase of the compilation process.
2. Interpreter Design: Similar to compilers, interpreters also use
lexical analysis to break down the source code into tokens before
executing it.
3. Text Processing: Lexical analysis is used in various text
processing applications, such as text editors, search engines, and
lexical analyzers for natural language.
4. Compiler Front-End Tools: Lexical analysis tools such as Lex and
Flex are widely used to generate lexical analyzers automatically from
lexical specifications; a small example follows this list.
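For instance, a tiny Flex specification (a sketch invented here, not from the source) from which flex generates a scanner in C:

%{
#include <stdio.h>
%}
%%
[0-9]+                  { printf("NUMBER: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*  { printf("IDENTIFIER: %s\n", yytext); }
"="                     { printf("OPERATOR: %s\n", yytext); }
[ \t\n]+                ;  /* skip whitespace */
.                       { printf("OTHER: %s\n", yytext); }
%%
int main(void) { yylex(); return 0; }
int yywrap(void) { return 1; }

Running flex on this file produces lex.yy.c, which is then compiled with an ordinary C compiler.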
REFERENCES
• https://www.geeksforgeeks.org/lexicalerror
• https://www.javatpoint.com/lexical-error
• https://www.geeksforgeeks.org/error-detection-recovery-compiler/
• https://www.geeksforgeeks.org/token-patterns-and-lexems/
• https://www.tutorialspoint.com/compiler_design/compiler_design_lexical_analysis.htm
• https://www.guru99.com/c-tokens-keywords-identifier.html