Lexical and Syntax Analysis in Compiler Design by Vishal Trivedi
Abstract: This research paper gives brief information on how the source program is evaluated in the lexical analysis and syntax analysis phases of a compiler. In addition, it explains the concept of a compiler and the phases of a compiler. The paper concentrates mainly on lexical analysis and syntax analysis.

Keywords: Token, Lexeme, Identifier, Operator, Operand, Sentinel, Prefix, Derivation, Kleene closure, Positive closure, Terminal, Production rule, Non-terminal, Sentential.

I. INTRODUCTION
Whenever we write source code and start evaluating it, the computer shows only the output and any errors that occurred. We do not see the actual process behind it. This research paper explains the exact, step-by-step evaluation of source code in lexical and syntax analysis. The additional topics covered are Index Terms, Compilers, Phases of Compiler, Operations on Grammar, Lexical Analysis, Role of Scanner, Finite Automata, Syntax Analysis, Types of Derivation, Ambiguous Grammar, Left Recursion, Left Factoring, Types of Parsing, Top Down Parsing, Bottom Up Parsing and Error Handling.

III. COMPILERS
A compiler reads the whole program at once and reports any errors it finds. It generates intermediate code in order to generate target code. Errors are displayed once the whole program has been checked. Examples of compilers are the Borland Compiler and the Turbo C Compiler. After compilation, the generated target code is easy for the machine to understand. The compilation process must be efficient. It has two main parts.
[1] Analysis Phase: This phase of the compilation process is machine independent. Its main objective is to divide the source code into parts and rearrange these parts into a meaningful structure. The meaning of the source code is determined, and intermediate code is then created from the source program. The analysis phase contains three main sub-phases: lexical analysis, syntax analysis and semantic analysis.
[2] Synthesis Phase: This phase of the compilation process is machine dependent. The intermediate code is taken and converted into equivalent target code. The synthesis phase contains three main sub-phases: intermediate code, code optimization and code generation.
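The step from analyzed source code to intermediate code can be illustrated with a minimal sketch. It assumes Python's standard `ast` module as a stand-in for the analysis phase and emits three-address code as one possible intermediate form. The statement `a = b + c * 2`, the function name `to_three_address` and the temporaries `t1`, `t2` are illustrative assumptions, not part of the paper.

```python
import ast

# Operators supported by this sketch (an assumption; a real compiler handles many more).
OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def to_three_address(source):
    """Turn one simple assignment into a list of three-address instructions."""
    tree = ast.parse(source)      # analysis: Python's own lexer and parser
    assign = tree.body[0]         # assumes a single statement `name = expression`
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            left = emit(node.left)
            right = emit(node.right)
            counter += 1
            temp = f"t{counter}"  # fresh temporary for this sub-expression
            code.append(f"{temp} = {left} {OPERATORS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported construct in this sketch")

    result = emit(assign.value)
    code.append(f"{assign.targets[0].id} = {result}")
    return code


for line in to_three_address("a = b + c * 2"):
    print(line)
```

The multiplication is emitted before the addition because the parser already placed it deeper in the tree; a synthesis phase would then translate each instruction into target code.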
[Figure: syntax tree with nodes =, a, +, b, *, c, 2]
V. OPERATIONS
Є refers to the empty string.
Λ or ∅ refers to the empty set of strings.

The lexical analyzer is the first phase of the compiler. Its main task is to read the input characters and produce a sequence of tokens as output, which the parser uses for syntax analysis.
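The scanner's task of reading characters and producing tokens can be sketched as follows. Each token class is paired with a pattern, written here as a regular expression, and every matched piece of source text is a lexeme. The token classes, the patterns and the input `int i = 10;` are assumptions chosen for illustration.

```python
import re

# Hypothetical token classes and their patterns (illustrative only).
TOKEN_PATTERNS = [
    ("KEYWORD", r"\b(?:int|float|if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"\d+"),
    ("OPERATOR", r"[+\-*/=]"),
    ("PUNCTUATION", r"[;,(){}]"),
    ("WHITESPACE", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS)
)


def tokenize(source):
    """Return a list of (token class, lexeme) pairs for the source string."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise SyntaxError(
                f"illegal character {source[position]!r} at position {position}"
            )
        if match.lastgroup != "WHITESPACE":  # whitespace separates tokens, nothing more
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


print(tokenize("int i = 10;"))
```

Listing KEYWORD before IDENTIFIER matters: both patterns match `int`, and the first alternative that matches wins.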
Lexical analysis is the first phase of the compiler. First of all, the lexical analyzer scans the whole program and divides it into tokens. A token is a string with meaning; it describes the class or category of input ... A pattern is the set of rules that describes a token. A lexeme is a sequence of characters in the source code that matches the pattern of a token. Example: int, i, ...

... states, δ is a transition function. There are two types of finite ... For each state, a DFA has exactly one outgoing edge for each symbol. In the theory of computation, a branch of theoretical computer science, a deterministic finite state machine (DFSM) is a finite-state machine that accepts and rejects strings of symbols and only produces a unique computation of the automaton ...
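A DFA of the kind described above can be simulated with a table lookup for the transition function δ. The sketch below assumes a hypothetical three-state automaton that accepts identifiers (a letter followed by letters or digits); the state names, the character classes used as the alphabet and the `dead` state are illustrative assumptions.

```python
def classify(ch):
    """Map a character to one of three input symbols (the alphabet of this sketch)."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Transition function: exactly one entry per (state, symbol) pair.
DELTA = {
    ("start", "letter"): "in_id",
    ("start", "digit"): "dead",
    ("start", "other"): "dead",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
    ("in_id", "other"): "dead",
    ("dead", "letter"): "dead",
    ("dead", "digit"): "dead",
    ("dead", "other"): "dead",
}
START = "start"
ACCEPTING = {"in_id"}


def accepts(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = START
    for ch in text:
        state = DELTA[(state, classify(ch))]
    return state in ACCEPTING


print(accepts("count1"), accepts("1count"))
```

Because every state has one transition per symbol, each input string produces exactly one run, so no backtracking is needed.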
Syntax analysis is also known as syntactical analysis, parsing or hierarchical analysis.

... Leftmost derivation and Rightmost derivation. Let's consider the grammar with the production S -> S+S | S-S | S*S | S/S | (S) | a
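The difference between the two derivation orders can be traced mechanically. The sketch below uses the grammar above, reading the parenthesised alternative as (S), and derives the string a+a*a, which is an example chosen here for illustration. The function name `derive` and the way production choices are passed in are assumptions.

```python
NONTERMINAL = "S"
PRODUCTIONS = ["S+S", "S-S", "S*S", "S/S", "(S)", "a"]


def derive(choices, leftmost=True):
    """Apply the chosen right-hand sides in order and return every sentential form."""
    form = NONTERMINAL
    steps = [form]
    for rhs in choices:
        if rhs not in PRODUCTIONS:
            raise ValueError(f"{rhs!r} is not a production of S")
        # Pick the leftmost or the rightmost occurrence of the nonterminal.
        index = form.find(NONTERMINAL) if leftmost else form.rfind(NONTERMINAL)
        if index == -1:
            raise ValueError("no nonterminal left to expand")
        form = form[:index] + rhs + form[index + 1:]
        steps.append(form)
    return steps


print(" => ".join(derive(["S+S", "a", "S*S", "a", "a"], leftmost=True)))
print(" => ".join(derive(["S+S", "S*S", "a", "a", "a"], leftmost=False)))
```

The leftmost run gives S => S+S => a+S => a+S*S => a+a*S => a+a*a, and the rightmost run gives S => S+S => S+S*S => S+S*a => S+a*a => a+a*a. Both end in the same sentence; only the order in which nonterminals are expanded differs.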
... -> aB1 | aB2 | aB3. Alternatives that share a common prefix like this should not remain in the grammar; left factoring removes them ...
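Left factoring of the alternatives aB1 | aB2 | aB3 can be sketched as follows. The left-hand side is not recoverable from the text, so the names `A` and `A'` are assumptions, as is the simplification that the prefix must be common to all alternatives.

```python
EPSILON = "Є"  # the paper's symbol for the empty string


def common_prefix(alternatives):
    """Longest sequence of symbols shared by the start of every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if all(symbol == symbols[0] for symbol in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor(nonterminal, alternatives):
    """Factor out a prefix common to all alternatives of one nonterminal."""
    prefix = common_prefix(alternatives)
    if not prefix or len(alternatives) < 2:
        return {nonterminal: alternatives}  # nothing to factor
    new_nonterminal = nonterminal + "'"
    remainders = [alt[len(prefix):] or [EPSILON] for alt in alternatives]
    return {
        nonterminal: [prefix + [new_nonterminal]],
        new_nonterminal: remainders,
    }


grammar = left_factor("A", [["a", "B1"], ["a", "B2"], ["a", "B3"]])
for head, bodies in grammar.items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
```

After factoring, a parser that sees `a` no longer has to guess among three alternatives; the choice is postponed to `A'`, where the next symbol decides.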
Every phase of the compiler detects errors, which must be reported to the error handler, whose task is to handle them so that compilation can proceed. Lexical errors include ... constants, the appearance of illegal characters, etc. Syntax errors include errors in structure, missing operators, missing parentheses, etc. Semantic errors include incompatible operand types, undeclared variables, a mismatch between actual and formal arguments, etc. There are various strategies to recover from errors, which can be implemented by ...
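Reporting a lexical error and letting compilation proceed can be sketched as a scanner that records each illegal character and then skips it. This is one simple recovery strategy, assumed here for illustration; the token classes, the function name `scan` and the error message format are likewise assumptions.

```python
import re

# Hypothetical token patterns for a tiny expression language.
TOKEN = re.compile(
    r"(?P<NUMBER>\d+)|(?P<IDENTIFIER>[A-Za-z_]\w*)|(?P<OPERATOR>[+\-*/=();])"
)


def scan(source):
    """Return (tokens, errors); illegal characters are reported and skipped."""
    tokens, errors = [], []
    position = 0
    while position < len(source):
        ch = source[position]
        if ch.isspace():
            position += 1
            continue
        match = TOKEN.match(source, position)
        if match:
            tokens.append((match.lastgroup, match.group()))
            position = match.end()
        else:
            # Report to the error list, then skip the character so scanning continues.
            errors.append(
                f"lexical error: illegal character {ch!r} at position {position}"
            )
            position += 1
    return tokens, errors


tokens, errors = scan("x = 3 $ 4 @ y")
print(tokens)
print(errors)
```

Collecting errors in a list instead of stopping at the first one lets a single run report every illegal character in the input.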
ACKNOWLEDGMENT
I am using this opportunity to express my gratitude to
everyone who supported me in this research. I am thankful for
their aspiring guidance, invaluably constructive criticism and
friendly advice during the research. I am sincerely grateful to
them for sharing their truthful and illuminating views on a
number of issues related to the research work.