Compiler 3
Lex
Lex is a program that generates lexical analyzers. It is commonly used together with the YACC parser generator.
A lexical analyzer is a program that transforms an input stream into a sequence of tokens. Lex builds such an analyzer by generating a C program that implements it.
First, the lexical analyzer is specified as a program lex.l written in the Lex language. The Lex compiler then translates lex.l into a C program lex.yy.c.
Finally, the C compiler compiles lex.yy.c and produces an executable a.out.
a.out is the lexical analyzer: it transforms an input stream into a sequence of tokens.
A Lex program is separated into three sections by %% delimiters. The format of a Lex source file is as follows:
{ definitions }
%%
{ rules }
%%
{ user subroutines }
Each rule has the form pi { actioni }, where pi is a regular expression (a pattern) and actioni describes the action the lexical analyzer should take when pattern pi matches a lexeme.
User subroutines are auxiliary procedures needed by the actions. These subroutines can be compiled separately and loaded with the lexical analyzer.
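As a concrete illustration, the following is a minimal sketch of a Lex specification, assuming a hypothetical scanner that prints the numbers found in its input (the DIGIT definition, the patterns, the messages, and the file name scanner.l are illustrative, not prescribed by the format above). The definitions section declares a named pattern, each rule pairs a pattern pi with an action, and the user-subroutines section supplies main and yywrap:

%{
#include <stdio.h>   /* definitions section: C declarations and Lex definitions */
%}
DIGIT   [0-9]

%%
{DIGIT}+    { printf("NUMBER: %s\n", yytext);  /* pattern p1 with its action1 */ }
[ \t\n]+    { /* skip whitespace */ }
.           { printf("OTHER: %s\n", yytext); }
%%

/* user subroutines: auxiliary C code used with the generated analyzer */
int main(void)
{
    yylex();                         /* run the generated lexical analyzer on stdin */
    return 0;
}

int yywrap(void) { return 1; }       /* report end of the input stream */

Following the toolchain described earlier, lex scanner.l produces lex.yy.c, and cc lex.yy.c -o a.out produces the lexical analyzer a.out.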
YACC
Syntax analysis or parsing is the second phase of a compiler. In this chapter, we shall learn the
basic concepts used in the construction of a parser.
We have seen that a lexical analyzer can identify tokens with the help of regular expressions and
pattern rules. But a lexical analyzer cannot check the syntax of a given sentence due to the
limitations of regular expressions. Regular expressions cannot check balanced tokens, such
as parentheses. Therefore, this phase uses a context-free grammar (CFG), which is recognized by
a push-down automaton.
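To make the parenthesis example concrete, here is a minimal sketch of a YACC specification for the language of balanced parentheses, which no regular expression can describe (the rule names, the trivial character-by-character lexer, and the single-line input convention are illustrative assumptions):

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}

%%
input    : balanced '\n'            { printf("balanced\n"); }
         ;
balanced : /* empty */
         | '(' balanced ')' balanced
         ;
%%

/* a trivial lexer: return each input character as its own token */
int yylex(void)
{
    int c = getchar();
    return (c == EOF) ? 0 : c;
}

int main(void) { return yyparse(); }

The rule balanced : '(' balanced ')' balanced can nest to any depth, which is exactly what a finite-state regular expression cannot keep track of; the push-down automaton generated by YACC can.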
Context-Free Grammar
In this section, we will first see the definition of context-free grammar and introduce the terminology used in parsing technology. A context-free grammar G = ( V, Σ, P, S ) has four components:
A set of non-terminals (V). Non-terminals are syntactic variables that denote sets of
strings. The non-terminals define sets of strings that help define the language generated
by the grammar.
A set of tokens, known as terminal symbols (Σ). Terminals are the basic symbols from
which strings are formed.
A set of productions (P). The productions of a grammar specify the manner in which the
terminals and non-terminals can be combined to form strings. Each production consists of
a non-terminal called the left side of the production, an arrow, and a sequence of tokens
and/or non-terminals, called the right side of the production.
One of the non-terminals is designated as the start symbol (S), from which derivation begins.
The strings are derived from the start symbol by repeatedly replacing a non-terminal (initially the
start symbol) by the right side of a production for that non-terminal.
Example
We take the problem of the palindrome language, which cannot be described by means of a regular
expression. That is, L = { w | w = wR } (where wR denotes the reverse of w) is not a regular
language, but it can be described by means of a CFG, as illustrated below:
G = ( V, Σ, P, S )
Where:
V = { Q, Z, N }
Σ = { 0, 1 }
P = { Q → Z, Q → N, Q → 0, Q → 1, Q → ε, Z → 0Q0, N → 1Q1 }
S = Q
This grammar describes the palindrome language over { 0, 1 }, generating strings such as 1001, 11100111, 00100, 1010101, 11111, and so on. (The productions Q → 0 and Q → 1 are needed for odd-length palindromes such as 00100.)
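For example, the string 1001 is derived from the start symbol as
Q → N → 1Q1 → 1Z1 → 10Q01 → 1001
using Q → ε in the last step, while the odd-length palindrome 00100 is derived as
Q → Z → 0Q0 → 0Z0 → 00Q00 → 00100
using Q → 1 in the last step.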
Syntax Analyzers
A syntax analyzer or parser takes the input from a lexical analyzer in the form of token streams.
The parser analyzes the source code (token stream) against the production rules to detect any
errors in the code. The output of this phase is a parse tree.
This way, the parser accomplishes two tasks: parsing the code while checking for errors, and generating a parse tree as the output of the phase.
Parsers are expected to parse the whole code even if some errors exist in the program. Parsers use error-recovery strategies, which we will learn later in this chapter.
Derivation
A derivation is basically a sequence of applications of production rules that produces the input string. During parsing, we take two decisions for each sentential form of the input: which non-terminal is to be replaced, and which production rule is used to replace it.
To decide which non-terminal to replace, we have two options.
Left-most Derivation
If the sentential form of an input is scanned and replaced from left to right, it is called left-most
derivation. The sentential form derived by the left-most derivation is called the left-sentential
form.
Right-most Derivation
If we scan and replace the input with production rules from right to left, it is known as right-most derivation. The sentential form derived from the right-most derivation is called the right-sentential form.
Example
Production rules:
E → E + E
E → E * E
E → id
Input string: id + id * id
The left-most derivation is:
E → E * E
E → E + E * E
E → id + E * E
E → id + id * E
E → id + id * id
Notice that the left-most non-terminal is always expanded first.
The right-most derivation is:
E → E + E
E → E + E * E
E → E + E * id
E → E + id * id
E → id + id * id
Parse Tree
A parse tree is a graphical depiction of a derivation. It is convenient to see how strings are
derived from the start symbol. The start symbol of the derivation becomes the root of the parse
tree. Let us see this with the left-most derivation from the last topic:
E → E * E
E → E + E * E
E → id + E * E
E → id + id * E
E → id + id * id
The tree is built step by step: starting from the root E, each derivation step expands one leaf labelled with a non-terminal into children given by the right side of the production used, until every leaf is a terminal (id, + or *).
In a parse tree, all leaf nodes are terminals, all interior nodes are non-terminals, and an in-order traversal of the leaves gives the original input string.
A parse tree also depicts the associativity and precedence of operators. The deepest sub-tree is traversed first, so the operator in that sub-tree takes precedence over the operator in its parent node. In the tree built above, for example, the + sub-tree lies below the * node, so this particular tree corresponds to the grouping (id + id) * id.
Ambiguity
A grammar G is said to be ambiguous if it has more than one parse tree (equivalently, more than one left-most or right-most derivation) for at least one string.
Example
E → E + E
E → E – E
E → id
For the string id + id – id, the above grammar generates two parse trees: one corresponding to the grouping (id + id) – id and the other to id + (id – id).
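The ambiguity also shows up in practice: if this grammar is given to YACC as it stands, the parser cannot decide, after reading id + id with – as the next token, whether to reduce E + E or to shift the –, and YACC reports shift/reduce conflicts. A grammar-only sketch, just enough for yacc to analyze (the token name ID is illustrative, and '-' stands for the minus operator):

%token ID
%%
E : E '+' E
  | E '-' E
  | ID
  ;
%%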
Associativity
If an operand has operators on both sides, the side on which the operator takes this operand is decided by the associativity of those operators. If the operation is left-associative, the operand is taken by the operator on its left; if the operation is right-associative, the operator on its right takes the operand.
Example
Operations such as Addition, Multiplication, Subtraction, and Division are left-associative. If the
expression contains
id op id op id
it will be evaluated as
(id op id) op id
For example, id + id + id is evaluated as (id + id) + id. Operations such as Exponentiation are right-associative, so the order of evaluation in the same expression would be
id op (id op id)
Precedence
If two different operators share a common operand, the precedence of operators decides which
will take the operand. That is, 2+3*4 can have two different parse trees, one corresponding to
(2+3)*4 and another corresponding to 2+(3*4). By setting precedence among operators, this
problem can be easily removed. As in the previous example, mathematically * (multiplication)
has precedence over + (addition), so the expression 2+3*4 will always be interpreted as:
2 + (3 * 4)
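In YACC, associativity and precedence are declared rather than encoded in extra grammar rules: %left and %right fix the associativity of a token, and tokens declared on a later line bind more tightly than those declared earlier. The following is a minimal calculator-style sketch (the token NUM, the single-digit lexer, and the file layout are illustrative assumptions); without the two %left lines, yacc would report shift/reduce conflicts for this ambiguous grammar:

%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}

%token NUM
%left '+' '-'          /* lowest precedence, left-associative  */
%left '*' '/'          /* higher precedence, left-associative  */

%%
line : expr '\n'        { printf("= %d\n", $1); }
     ;
expr : expr '+' expr    { $$ = $1 + $3; }
     | expr '-' expr    { $$ = $1 - $3; }
     | expr '*' expr    { $$ = $1 * $3; }
     | expr '/' expr    { $$ = $1 / $3; }
     | NUM              { $$ = $1; }
     ;
%%

/* a trivial lexer: single-digit numbers, every other character as a literal token
   (no whitespace handling, to keep the sketch short) */
int yylex(void)
{
    int c = getchar();
    if (isdigit(c)) { yylval = c - '0'; return NUM; }
    return (c == EOF) ? 0 : c;
}

int main(void) { return yyparse(); }

Given the input 2+3*4, this parser prints = 14, that is, it interprets the expression as 2 + (3 * 4), because '*' is declared after '+' and therefore has higher precedence, while 2+3-4 is grouped as (2+3)-4 because '+' and '-' are declared %left.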