
Lexical and Syntax Analysis

 There are three different ways of implementing programming languages: compilation, pure interpretation, and hybrid implementation.
 The compilation approach uses a program called a compiler, which translates programs written in a high-level
programming language into machine code.
 Compilation is typically used to implement programming languages that are used for large applications, often written in
languages such as C++ and COBOL.
 Pure interpretation systems perform no translation; rather, programs are interpreted in their original form by a software interpreter.
 HTML and JavaScript are examples of pure interpretation, used where high execution efficiency is not required.
 Hybrid implementation systems translate programs written in high-level languages into intermediate forms, which are
interpreted.
 In recent years the use of Just-in-Time (JIT) compilers has become widespread, particularly for Java programs and
programs written for the Microsoft .NET system.
 A JIT compiler, which translates intermediate code to machine code, is used on methods at the time they are first called.
 A JIT compiler effectively transforms a hybrid system into a delayed compilation system.
 Syntax analyzers, or parsers, are always based on a formal description of the syntax of programs.
 There are three compelling advantages to using a BNF description:
o First, BNF descriptions of the syntax of programs are clear and concise.
o Second, the BNF description can be used as the direct basis for the syntax analyzer.
o Third, implementations based on BNF are relatively easy to maintain because of their modularity.
 Compilers separate the task of analyzing syntax into two distinct parts, named
o lexical analysis and
o syntax analysis
 The lexical analyzer deals with small-scale language constructs, such as names and numeric literals.
 The syntax analyzer deals with the large-scale constructs, such as expressions, statements, and program units.
 There are three reasons why lexical analysis is separated from syntax analysis:
o 1. Simplicity—Techniques for lexical analysis are less complex than those required for syntax analysis, so the
lexical-analysis process can be simpler if it is separate. Also, removing the low-level details of lexical analysis
from the syntax analyzer makes the syntax analyzer both smaller and less complex.
o 2. Efficiency—Lexical analysis requires a significant portion of total compilation time, so it pays to optimize the lexical analyzer; it is not fruitful to optimize the syntax analyzer. Separating the two facilitates this selective optimization.
o 3. Portability—Because the lexical analyzer reads input program files and often includes buffering of that
input, it is somewhat platform dependent. However, the syntax analyzer can be platform independent. It is
always good to isolate machine-dependent parts of any software system.
o A lexical analyzer serves as the front end of a syntax analyzer. As the first phase of compilation, it converts the input sequence of characters into a sequence of tokens.
o Technically, lexical analysis is a part of syntax analysis.
o A lexical analyzer performs syntax analysis at the lowest level of program structure.
o An input program appears to a compiler as a single string of characters.
o The lexical analyzer collects characters into logical groupings and assigns internal codes to the groupings
according to their structure.
o These logical groupings are named lexemes, and the internal codes for the categories of these groupings are named tokens.
o Lexical analyzers extract lexemes from a given input string and produce the corresponding tokens.
o The lexical-analysis process includes skipping comments and white space outside lexemes, as they are not relevant to the meaning of the program.
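o For example, the statement sum = oldsum - value / 100; would be scanned into the lexemes sum, =, oldsum, -, value, /, 100, and ;. The corresponding tokens might be IDENT, ASSIGN_OP, IDENT, SUB_OP, IDENT, DIV_OP, INT_LIT, and SEMICOLON (these token names are illustrative; actual names vary by implementation).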
 There are three approaches to building a lexical analyzer:
o Write a formal description of the token patterns of the language using a descriptive language related to regular expressions. These descriptions are used as input to a software tool that automatically generates a lexical analyzer.
o Design a state transition diagram that describes the token patterns of the language and write a program that implements the diagram.
o Design a state transition diagram that describes the token patterns of the language and hand-construct a table-driven implementation of the state diagram.
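As a small illustration of the first approach (the pattern names here are assumptions, not the syntax of any particular tool), token patterns might be described as:

    identifier  →  letter (letter | digit)*
    int_literal →  digit digit*
    add_op      →  + | -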

Steps to draw a state diagram:

 Identify the initial state and the final (terminating) states.
 Identify the possible states in which the object can exist (boundary values corresponding to different attributes guide us in identifying different states).
 Label the events which trigger the transitions between states.
 Convenient utility subprograms:
o getChar – gets the next character of input, puts it in nextChar, determines its character class, and puts the class in charClass
o addChar – puts the character from nextChar into the place where the lexeme is being accumulated, lexeme
o lookup – determines whether the string in lexeme is a reserved word (returns a code)
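A minimal C sketch of a state-diagram-driven lexical analyzer built from these subprograms. The character classes, token codes, and buffer size are assumptions for illustration; the caller is expected to open in_fp and call getChar() once before the first call to lex():

    #include <stdio.h>
    #include <ctype.h>

    #define LETTER   0          /* assumed character classes */
    #define DIGIT    1
    #define UNKNOWN 99
    #define IDENT   10          /* assumed token codes */
    #define INT_LIT 11

    static char lexeme[100];
    static char nextChar;
    static int  charClass;
    static int  lexLen;
    static FILE *in_fp;         /* input file, opened by the caller */

    /* getChar - read the next character and determine its class */
    static void getChar(void) {
        int c = getc(in_fp);
        if (c == EOF) { charClass = EOF; return; }
        nextChar = (char)c;
        if (isalpha(c))      charClass = LETTER;
        else if (isdigit(c)) charClass = DIGIT;
        else                 charClass = UNKNOWN;
    }

    /* addChar - append nextChar to the lexeme being accumulated */
    static void addChar(void) {
        if (lexLen < 99) { lexeme[lexLen++] = nextChar; lexeme[lexLen] = '\0'; }
    }

    /* lookup - stub: a real version would search a reserved-word table */
    static int lookup(const char *s) { return IDENT; }

    /* lex - each branch corresponds to a path in the state diagram */
    int lex(void) {
        lexLen = 0;
        while (charClass != EOF && isspace((unsigned char)nextChar))
            getChar();                     /* skip white space outside lexemes */
        if (charClass == EOF) return EOF;  /* end of input */
        if (charClass == LETTER) {         /* identifier: letter (letter|digit)* */
            addChar(); getChar();
            while (charClass == LETTER || charClass == DIGIT) { addChar(); getChar(); }
            return lookup(lexeme);
        }
        if (charClass == DIGIT) {          /* integer literal: digit+ */
            addChar(); getChar();
            while (charClass == DIGIT) { addChar(); getChar(); }
            return INT_LIT;
        }
        addChar(); getChar();              /* single-character token */
        return UNKNOWN;
    }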

 The Parsing Problem
o Goals of the parser, given an input program:
o Find all syntax errors; for each, produce an appropriate diagnostic message and recover quickly.
o Produce the parse tree, or at least a trace of the parse tree, for the program.
 Introduction to Parsing
o Parsers for programming languages construct parse trees for given programs.
o Both parse trees and derivations include all of the syntactic information needed by a language processor.
 There are two distinct goals of syntax analysis:
o First, the syntax analyzer must check the input program to determine whether it is syntactically correct.
o The second goal of syntax analysis is to produce a complete parse tree, or at least trace the structure of the
complete parse tree, for syntactically correct input.
o Parsers are categorized according to the direction in which they build parse trees.
o The two broad classes of parsers are top-down, in which the tree is built from the root downward to the
leaves, and bottom-up, in which the parse tree is built from the leaves upward to the root.
o We use a small set of notational conventions for grammar symbols and strings to make the discussion less
cluttered.
o 1. Terminal symbols—lowercase letters at the beginning of the alphabet (a, b, . . .)
o 2. Nonterminal symbols—uppercase letters at the beginning of the alphabet (A, B, . . .)
o 3. Terminals or non-terminals—uppercase letters at the end of the alphabet (W, X, Y, Z)
o 4. Strings of terminals—lowercase letters at the end of the alphabet (w, x, y, z)
 Lexemes are terminals.
 Nonterminals are written in angle brackets, for example <while_statement>, <expr>, and <function_def>.
 The sentences of a language (programs, in the case of a programming language) are strings of terminals.
 Mixed strings describe right-hand sides (RHSs) of grammar rules and are used in parsing algorithms.
 Top-Down Parsers
o Top-down parsing is a strategy in which one first looks at the highest level of the parse tree and works down the tree using the rewriting rules of a formal grammar.
o Each node is visited before its branches are followed, and branches from a particular node are followed in left-to-right order. This corresponds to a leftmost derivation.
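A small worked example (the grammar is an assumption for illustration): given E → T + E | T and T → id, a top-down parse of id + id expands the leftmost nonterminal at each step, building the tree from the root toward the leaves:

    E => T + E => id + E => id + T => id + id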
 Bottom-Up Parsers
o A bottom-up parser constructs a parse tree by beginning at the leaves and progressing toward the root.
o This parse order corresponds to the reverse of a rightmost derivation.
o For example, the first step for a bottom-up parser is to determine which substring of the given sentence is the RHS that must be reduced to its corresponding LHS to obtain the previous (second-to-last) sentential form in the derivation.
o The process of finding the correct RHS to reduce is complicated by the fact that a given right sentential form
may include more than one RHS from the grammar of the language being parsed.
o The correct RHS is called the handle.
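A small worked example (this expression grammar is an assumption for illustration): given E → E + T | T, T → T * F | F, and F → ( E ) | id, a bottom-up parse of id + id * id reduces the handle at each step, tracing the rightmost derivation in reverse:

    id + id * id
    F + id * id      (handle: the leftmost id)
    T + id * id
    E + id * id
    E + F * id       (handle: the second id)
    E + T * id
    E + T * F        (handle: the third id)
    E + T            (handle: T * F)
    E                (handle: E + T)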
 The Complexity of Parsing
o Parsers that work for any unambiguous grammar are complex and inefficient (O(n³), where n is the length of the input).
o Compilers use parsers that work for only a subset of all unambiguous grammars, but do so in linear time (O(n), where n is the length of the input).
 Recursive-Descent Parsing
o A recursive-descent parser has a subprogram for each nonterminal in the grammar, which can parse sentences that can be generated by that nonterminal.
o A grammar for simple expressions:
o <expr> → <term> {(+ | -) <term>}
o <term> → <factor> {(* | /) <factor>}
o <factor> → id | int_constant | ( <expr> )

o Assume we have a lexical analyzer named lex, which puts the next token code in nextToken.
o The coding process when there is only one RHS (see the C sketch below):
o For each terminal symbol in the RHS, compare it with the next input token; if they match, continue; otherwise there is an error.
o For each nonterminal symbol in the RHS, call its associated parsing subprogram.
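A minimal C sketch of the subprograms for <expr> and <term> from the grammar above. The token codes and the lex and error routines are assumptions for illustration:

    #define ADD_OP  21          /* assumed token codes */
    #define SUB_OP  22
    #define MULT_OP 23
    #define DIV_OP  24

    extern int nextToken;       /* set by the lexical analyzer */
    void lex(void);             /* assumed: puts the next token code in nextToken */
    void error(const char *msg);
    void term(void);
    void factor(void);          /* sketched further below */

    /* expr - parses <expr> -> <term> {(+ | -) <term>} */
    void expr(void) {
        term();                                  /* parse the first term */
        while (nextToken == ADD_OP || nextToken == SUB_OP) {
            lex();                               /* consume the operator */
            term();                              /* parse the next term */
        }
    }

    /* term - parses <term> -> <factor> {(* | /) <factor>} */
    void term(void) {
        factor();
        while (nextToken == MULT_OP || nextToken == DIV_OP) {
            lex();
            factor();
        }
    }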

o A nonterminal that has more than one RHS requires an initial process to determine which RHS to parse.
o The correct RHS is chosen on the basis of the next token of input (the lookahead).
o The next token is compared with the first token that can be generated by each RHS until a match is found.
o If no match is found, it is a syntax error, as the <factor> sketch below illustrates.
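A C sketch of the subprogram for <factor>, which has three RHSs and must therefore examine the lookahead token before choosing one. The token codes are the same illustrative assumptions as above:

    #define IDENT       10      /* assumed token codes */
    #define INT_LIT     11
    #define LEFT_PAREN  25
    #define RIGHT_PAREN 26

    /* factor - parses <factor> -> id | int_constant | ( <expr> ) */
    void factor(void) {
        if (nextToken == IDENT || nextToken == INT_LIT) {
            lex();                               /* consume the id or literal */
        } else if (nextToken == LEFT_PAREN) {
            lex();                               /* consume '(' */
            expr();                              /* parse the inner expression */
            if (nextToken == RIGHT_PAREN)
                lex();                           /* consume ')' */
            else
                error("right parenthesis expected");
        } else {
            error("id, int_constant, or '(' expected");
        }
    }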

o Furthermore, parsers must recover from syntax errors so that the parsing process can continue.
