CMP 335 Regular Expression Exercises Note

Uploaded by

akinolaolaoluwa21

Introduction

Regular expressions (Regex) are a powerful tool used in various fields of computer science, including
compiler design. They provide a concise and flexible means of describing patterns in text, which is essential
for tasks like lexical analysis in compiler design. This blog will delve into the significance of regular
expressions in compiler design, explaining their role, usage, and impact on the overall process of compiling
a program.

What are Regular Expressions?
Regular expressions are sequences of characters that define search patterns, primarily used for string
matching within texts. These patterns can be simple, such as matching a single character, or complex,
involving combinations of various characters, special symbols, and operators.
Basic Syntax of Regular Expressions
• Literals: The simplest form of a regular expression is a literal, which matches the exact character. For
example, the regex a will match the character 'a' in a text.
• Concatenation: Two regular expressions can be concatenated, meaning they must appear in sequence.
For example, ab will match the sequence 'ab'.
• Alternation: The alternation operator | allows for matching one of several patterns. For example,
a|b matches either 'a' or 'b'.
• Repetition Operators:
  ◦ * (Kleene Star) matches zero or more occurrences of the preceding element.
  ◦ + matches one or more occurrences of the preceding element.
  ◦ ? matches zero or one occurrence of the preceding element.
Examples
• The regex a*b matches any number of 'a's (including none) followed by a 'b' (e.g., b, ab, aaab).
• The regex (a|b)c matches either 'ac' or 'bc'.
Role of Regular Expressions in Compiler Design
In compiler design, regular expressions play a crucial role in the lexical analysis phase, which is the first
phase of a compiler. The primary task of lexical analysis is to read the source code and convert it into
tokens, which are the smallest units of meaning (like keywords, operators, and identifiers).
Lexical Analysis
The lexical analyzer, also known as the scanner, uses regular expressions to identify patterns in the source
code and classify them into tokens. These tokens are then used by the parser in the subsequent phase of the
compiler.
For example, consider the following piece of code:
int main() {
int a = 5;
}

The lexical analyzer would break this code into the tokens int, main, (, ), {, int, a, =, 5, ;, }. To identify each of
these tokens, the lexer relies on regular expressions:
• Keywords like int can be matched directly using literals or predefined patterns.
• Identifiers (like the names main and a) can be matched using a regex that allows a letter followed by
any sequence of letters and digits.
• Operators like = can be matched using specific literals.
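The tokenization described above can be sketched with Python's re module. This is only an illustration: the token class names, the keyword set and the tokenize function are assumptions made for the example, not part of the notes.

```python
import re

# Token classes and their patterns, tried in this order.
# The names and the one-word keyword set are illustrative assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"="),
    ("PUNCT", r"[(){};]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
KEYWORDS = {"int"}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Split source text into (kind, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token
        if kind == "MISMATCH":
            raise ValueError(f"unexpected character {text!r}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # keywords look like identifiers, so reclassify
        tokens.append((kind, text))
    return tokens
```

Calling tokenize("int main() {\nint a = 5;\n}") yields eleven pairs, starting with ("KEYWORD", "int") and ("IDENT", "main"). Classifying keywords after matching the identifier pattern is one common design choice; a lexer could equally list each keyword as its own pattern.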
Conversion to Finite Automata
Regular expressions are not only used for pattern matching but can also be converted into finite automata,
which are used to recognize tokens. This conversion is an essential step in the lexical analysis process.
 Deterministic Finite Automata (DFA): A DFA is used to recognize tokens in a single pass over the
input string. It’s efficient and suitable for real-time processing.
 Nondeterministic Finite Automata (NFA): An NFA is more flexible in terms of pattern matching but
requires more complex processing, as it can be in multiple states simultaneously.
The process usually involves converting the regular expression into an NFA, which is then optimized and
transformed into a DFA. This DFA can then be used by the lexer to scan the source code efficiently.
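As a small illustration of the end result, here is a hand-built DFA for the regex a*b from the earlier examples, run as a table-driven scanner. The state names q0 and q1 and the function dfa_accepts are assumptions for this sketch; a real lexer generator would derive the table from the regex automatically.

```python
# Transition table for a DFA that accepts a*b.
# q0: only 'a's seen so far; q1: the final 'b' has been read.
DFA = {
    ("q0", "a"): "q0",
    ("q0", "b"): "q1",
}
START = "q0"
ACCEPTING = {"q1"}


def dfa_accepts(text):
    """Run the DFA over text in one left-to-right pass."""
    state = START
    for ch in text:
        state = DFA.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING
```

Each input character costs one table lookup, which is why the DFA form suits scanning.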
Advantages of Using Regular Expressions
 Simplicity: Regular expressions offer a simple and concise way to represent patterns, making the
lexer easier to implement.
 Efficiency: Once compiled into finite automata, regular expressions can be used to quickly and
efficiently recognize tokens in the source code.
 Flexibility: Regular expressions are highly flexible, allowing the lexer to handle a wide variety of
token patterns, from simple keywords to complex operators.
Limitations
While regular expressions are powerful, they have limitations:
• Context-Free Grammars: Regular expressions cannot describe every context-free language (for example,
arbitrarily nested balanced parentheses), and such languages are needed for the syntactic structure of
programming languages. Therefore, regular expressions are limited to the lexical analysis phase.
• Complexity: For very complex patterns, regular expressions can become hard to read and maintain,
especially as the language’s syntax grows in complexity.

Conclusion
Regular expressions are a fundamental tool in compiler design, particularly in the lexical analysis phase.
They enable the efficient identification and classification of tokens in source code. By converting regular
expressions into finite automata, compilers can recognize token patterns quickly, so that the parser receives
a clean stream of tokens to work on. Despite their limitations, regular expressions remain an essential
component in the toolkit of compiler designers.
QUESTIONS AND SOLUTIONS

1. What is Operator Precedence Parsing Algorithm in compiler design?

Any string of a grammar can be parsed with a stack, as in shift-reduce parsing. In operator precedence
parsing, shifting and reducing are decided by the precedence relation between the topmost terminal symbol
on the stack and the current input symbol of the string to be parsed.

The operator precedence parsing algorithm is as follows −

Input − The precedence relations from some operator precedence grammar and an input string of terminals
from that grammar.

Output − There is no output but it can construct a skeletal parse tree as we parse, with one non-terminal
labeling all interior nodes and the use of single productions not shown. Alternatively, the sequence of shift-
reduce steps can be considered the output.

Method − Let the input string be a1 a2 … an $. Initially, the stack contains $.

Repeat forever

    If only $ is on the stack and only $ is on the input then accept and break

    else begin

        let a be the topmost terminal symbol on the stack and let b be the current input symbol.

        If a <. b or a =. b then shift b onto the stack /* shift */

        else if a .> b then /* reduce */

            repeat pop the stack

            until the top stack terminal is related by <. to the terminal most recently popped

        else call the error-correcting routine

    end
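A minimal Python sketch of this loop follows. It assumes the relations of the grammar E → E + T | T, T → T ∗ F | F, F → (E) | id, stores only terminals on the stack, and raises SyntaxError in place of an error-correcting routine. The names REL and parse are assumptions for the example.

```python
# REL[a][b] is the relation between stack terminal a and input terminal b.
# A missing entry means the two symbols are unrelated, i.e. a syntax error.
REL = {
    "+":  {"+": ">", "*": "<", "(": "<", ")": ">", "id": "<", "$": ">"},
    "*":  {"+": ">", "*": ">", "(": "<", ")": ">", "id": "<", "$": ">"},
    "(":  {"+": "<", "*": "<", "(": "<", ")": "=", "id": "<"},
    ")":  {"+": ">", "*": ">", ")": ">", "$": ">"},
    "id": {"+": ">", "*": ">", ")": ">", "$": ">"},
    "$":  {"+": "<", "*": "<", "(": "<", "id": "<"},
}


def parse(tokens):
    """Return the shift/reduce steps for tokens, or raise SyntaxError."""
    stack, steps = ["$"], []
    tokens = list(tokens) + ["$"]
    pos = 0
    while True:
        a, b = stack[-1], tokens[pos]
        if a == "$" and b == "$":
            steps.append("accept")
            return steps
        rel = REL[a].get(b)
        if rel in ("<", "="):            # shift
            stack.append(b)
            pos += 1
            steps.append(f"shift {b}")
        elif rel == ">":                 # reduce: pop one handle
            handle = []
            while True:
                handle.append(stack.pop())
                # stop once the new top is related by <. to the popped terminal
                if REL[stack[-1]].get(handle[-1]) == "<":
                    break
            steps.append("reduce " + " ".join(reversed(handle)))
        else:
            raise SyntaxError(f"no relation between {a!r} and {b!r}")
```

For the input id + id * id the steps end with "reduce *", "reduce +", "accept", so the multiplication is reduced before the addition. Because non-terminals are not tracked, this sketch checks precedence relations only and does not verify every detail of the grammar.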

Example1 − Construct the Precedence Relation table for the Grammar.

E → E + E | E − E | E ∗ E | E / E | E ↑ E | (E) | −E | id

Using Assumptions

Operators    Precedence      Association
↑            Highest         Right Associative
* and /      Next Highest    Left Associative
+ and −      Lowest          Left Associative

Solution

Operator Precedence Relations (row: terminal on the stack; column: current input symbol)

      +    −    *    /    ↑    id   (    )    $
+     .>   .>   <.   <.   <.   <.   <.   .>   .>
−     .>   .>   <.   <.   <.   <.   <.   .>   .>
*     .>   .>   .>   .>   <.   <.   <.   .>   .>
/     .>   .>   .>   .>   <.   <.   <.   .>   .>
↑     .>   .>   .>   .>   <.   <.   <.   .>   .>
id    .>   .>   .>   .>   .>             .>   .>
(     <.   <.   <.   <.   <.   <.   <.   =.
)     .>   .>   .>   .>   .>             .>   .>
$     <.   <.   <.   <.   <.   <.   <.

Example2 − Find out all precedence relations between various operators & symbols in the following
Grammar & show them using the precedence table.

E → E + T | T

T → T ∗ F | F

F → (E) | id

Solution

      +    *    (    )    id   $
+     .>   <.   <.   .>   <.   .>
*     .>   .>   <.   .>   <.   .>
(     <.   <.   <.   =.   <.
)     .>   .>        .>        .>
id    .>   .>        .>        .>
$     <.   <.   <.        <.

2. What are Precedence Functions in compiler design?

Precedence relations between any two operators or symbols in the precedence table can be converted to
two precedence functions f & g that map terminal symbols to integers.

• If a <. b, then f(a) < g(b)
• If a =. b, then f(a) = g(b)
• If a .> b, then f(a) > g(b)
Here a and b represent terminal symbols, and f(a) and g(b) are the integer values of the precedence
functions.

Computations of Precedence Functions

• For each terminal a, create the symbols fa & ga.

• Make a node for each symbol, grouping symbols as follows:
If a =. b, then fa & gb are in the same group or node.

If a =. b & c =. b, then fa & fc must be in the same group or node.

• (a) If a <. b, mark an edge from the group of gb to the group of fa.

(b) If a .> b, mark an edge from the group of fa to the group of gb.

• If the graph constructed has a cycle, then no precedence functions exist.

• If there are no cycles:
(a) f(a) = length of the longest path beginning at the group of fa.

(b) g(a) = length of the longest path beginning at the group of ga.
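The construction can be sketched in Python as below. To stay short, the sketch assumes a table with no =. relations, so every symbol is its own group. The function name precedence_functions and the sample relations over id, +, * and $ are assumptions for the example.

```python
def precedence_functions(terminals, rel):
    """rel maps (a, b) to '<' or '>'. Returns (f, g) as dicts.

    Raises ValueError if the graph has a cycle (no functions exist).
    """
    edges = {(kind, t): [] for kind in "fg" for t in terminals}
    for (a, b), relation in rel.items():
        if relation == "<":              # a <. b: edge from g_b to f_a
            edges[("g", b)].append(("f", a))
        elif relation == ">":            # a .> b: edge from f_a to g_b
            edges[("f", a)].append(("g", b))
        else:
            raise ValueError("this sketch does not handle =. relations")

    longest, in_progress = {}, set()

    def longest_from(node):
        """Length of the longest path starting at node (depth-first search)."""
        if node in longest:
            return longest[node]
        if node in in_progress:
            raise ValueError("cycle found: no precedence functions exist")
        in_progress.add(node)
        best = max((1 + longest_from(n) for n in edges[node]), default=0)
        in_progress.discard(node)
        longest[node] = best
        return best

    f = {t: longest_from(("f", t)) for t in terminals}
    g = {t: longest_from(("g", t)) for t in terminals}
    return f, g


# Sample relations for an expression grammar over id, + and * (assumed).
REL = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}
F, G = precedence_functions(["id", "+", "*", "$"], REL)
```

With these relations, F maps id, +, *, $ to 4, 2, 4, 0 and G maps them to 5, 1, 3, 0.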

Example1 − Construct precedence graph & precedence function for the following table.

Solution

Step1 − Create Symbols: fid, f+, f*, f$, gid, g+, g*, g$

Step2 − No symbol has equal precedence, as can be seen in the given table; therefore, each symbol will
remain in a different node.

Step3 − If a <. b, create an edge from gb → fa

If a .> b, create an edge from fa → gb

Since $ <. +, *, id, make an edge from g+, g*, gid to f$.

Similarly, + <. *, id; therefore, make an edge from g*, gid to f+.

Similarly, * <. id; therefore, mark an edge from gid to f*.

Since +, *, id .> $, mark an edge from f+, f*, fid to g$.

Similarly, +, *, id .> +; mark an edge from f+, f*, fid to g+.

Similarly, *, id .> *; mark an edge from f*, fid to g*.

Combining all the edges we get

Step4 − Computing the maximum length of the path from each node, we get the following precedence
functions

     id   +    *    $
f    4    2    4    0
g    5    1    3    0
Example2 − Construct precedence graph & precedence function for the following table.

Solution

As we have ( =. ), the symbols f( and g) will be in the same group.

Computation of precedence graph

3. What is Design of Lexical Analysis in Compiler Design?
Lexical Analysis can be designed using Transition Diagrams.

Finite Automata (Transition Diagram) − A directed graph or flowchart used to recognize tokens.
The transition diagram has two parts −

• States − These are represented by circles.

• Edges − States are connected by edges, drawn as arrows.

Example − Draw Transition Diagram for "if" keyword.

To recognize the token "if", the lexical analyzer also has to read the next character after "f". Depending upon
the next character, it decides whether it has read the keyword "if" or something else.

So, a blank space after "if" determines that "if" is a keyword.

"*" on final state 3 means retract, i.e., the input pointer moves back one character. Therefore the blank space
is not a part of the token "if".

Transition Diagram for an Identifier − An identifier starts with a letter followed by letters or digits.
Transition Diagram will be:

For example, in the statement int a2; the transition diagram for identifier a2 will be:
As (;) is not part of the identifier "a2", "*" is used for retract, i.e., the input pointer moves back one
character so that only "a2" is recognized as the identifier.

The Transition Diagram for identifier can be converted to Program Code as −

Coding
State 0: C = Getchar()
         if Letter(C) then goto State 1
         else Fail()

State 1: C = Getchar()
         if Letter(C) or Digit(C) then goto State 1
         else if Delimiter(C) then goto State 2
         else Fail()

State 2: Retract()
         return (6, Install())
In State 2, Retract() moves the input pointer back by one character, so the delimiter that ended the scan is
not consumed, and declares that whatever was read up to State 1 is a token.
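The three states can be written as a small Python function. This is a sketch under stated assumptions: the name scan_identifier is hypothetical, the symbol table is a plain list, and any character that is not a letter or digit is treated as a delimiter instead of calling Fail.

```python
IDENTIFIER_CODE = 6  # integer code for identifiers, as used in the notes


def scan_identifier(text, pos, symbol_table):
    """Try to recognize an identifier starting at text[pos].

    Returns ((code, table_index), next_pos), or None if no identifier
    starts at pos.
    """
    # State 0: the first character must be a letter.
    if pos >= len(text) or not text[pos].isalpha():
        return None
    end = pos + 1
    # State 1: keep consuming letters and digits.
    while end < len(text) and text[end].isalnum():
        end += 1
    # State 2: text[end] (the delimiter, or end of input) was only inspected,
    # never consumed. Leaving end where it is plays the role of Retract().
    lexeme = text[pos:end]
    if lexeme not in symbol_table:  # Install(): enter the lexeme once
        symbol_table.append(lexeme)
    return (IDENTIFIER_CODE, symbol_table.index(lexeme)), end
```

For the statement int a2; a call scan_identifier("int a2;", 4, table) returns the pair (6, index of "a2") and the position of the semicolon, which is left for the next token.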

The lexical analyzer returns the token to the parser not in the form of an English word but in the form of
a pair, i.e., (integer code, value).

In the case of an identifier, the integer code returned to the parser is 6 as shown in the table.

Install () − It returns a pointer to the symbol table, i.e., the address of the token's entry.
The following table shows the integer code and value of various tokens returned by lexical analysis to the
parser.

Suppose the identifier is stored at location 236 in the symbol table; then

Integer code = 6

Install () = 236 i.e., Pair will be (6, 236)

Similarly, if a constant is stored at location 238 then

Integer code = 7

Install () = 238 i.e., Pair will be (7, 238)

Transition Diagram (Finite Automata) for Tokens −

4. What are the Rules of Regular Expressions in Compiler Design?
The language accepted by a finite automaton can be described by simple expressions known as regular
expressions. A regular expression is a pattern that denotes a set of strings, and string-searching
algorithms use such patterns to find matching text.

There are various rules for regular expressions which are as follows −

• ε is a regular expression.
• Each symbol a of the alphabet Σ is a regular expression.
• Union of two regular expressions R1 and R2,
i.e., R1 + R2 or R1|R2, is also a regular expression.
• Concatenation of two regular expressions R1 and R2,
i.e., R1 R2, is also a regular expression.
• Closure of a regular expression R, i.e., R*, is also a regular expression.
• If R is a regular expression, then (R) is also a regular expression.
Algebraic Laws
R1|R2 = R2|R1 or R1 + R2 = R2 + R1 (Commutative)
R1|(R2|R3) = (R1|R2)|R3 (Associative)
Or
R1 + (R2 + R3) = (R1 + R2) + R3
R1(R2R3) = (R1R2)R3 (Associative)
R1(R2|R3) = R1R2|R1R3 (Distributive)
Or
R1(R2 + R3) = R1R2 + R1R3
εR = Rε = R (ε is the identity for concatenation)
Example1 − Write Regular Expressions for the following language over Σ = {a, b}
• Strings of length zero or one.
Answer: ε | a | b or (ε + a + b)

• Strings of length two.

Answer: aa | ab | ba | bb or (aa + ab + ba + bb)

• Strings of even length.

Answer: (aa | ab | ba | bb)* or (aa + ab + ba + bb)*

• Set of all strings of a’s and b’s having at least two (non-overlapping) occurrences of aa.
Answer − (a+b)*aa(a+b)*aa(a+b)*

Example2 − Find Regular Expressions for the following languages.

• L = {ε, 1, 11, 111, …}
{∴ 1^0 = ε, 1^1 = 1, 1^2 = 11, 1^3 = 111, …}
Answer: 1*

• L = {ε, 11, 1111, 111111, …}

Answer: (11)*

• L = Set of all strings of 0’s and 1’s = {ε, 0, 1, 01, 11, 00, 000, 101, …}

Answer: (0+1)* or (0|1)*

• L = Set of all strings of 0’s and 1’s ending with 11.

Answer: (0+1)*11

• L = Set of all strings of 0’s and 1’s beginning with 0 and ending with 1.

Answer: 0(0+1)*1

Example3 − Write a Regular Expression for the language over Σ = {0, 1} in which the second letter from the
right end of the string is 1.

Answer: (0+1)*1(0+1)

Example4 − Write Regular Expressions for the following language over Σ = {a, b}
 L=Set of strings having at least one occurrence of the double letter
Answer: (a+b)*(aa+bb)(a+b)*

 L = Set of strings having double letter at Beginning and Ending of string.

Answer: (aa+bb)(a+b)*(aa+bb)

 L = Set of strings having double letter at Beginning or on Ending of string.

Answer: (aa+bb)(a+b)*+ (a+b)*(aa+bb)+(aa+bb)(a+b)*(aa+bb)
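Answers of this kind can be sanity-checked by brute force: enumerate every string up to some length and compare the regex with a direct description of the language. The sketch below does this with Python's re module, writing union as | instead of +. The helper names all_strings and agrees, and the bound of 8 characters, are assumptions for the example. Agreement up to a bound is evidence, not a proof.

```python
import re
from itertools import product


def all_strings(alphabet, max_len):
    """Yield every string over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def agrees(pattern, predicate, alphabet, max_len=8):
    """True if regex and predicate accept the same strings up to max_len."""
    regex = re.compile(pattern)
    return all(
        bool(regex.fullmatch(s)) == predicate(s)
        for s in all_strings(alphabet, max_len)
    )


# (regex, direct description of the language, alphabet)
CHECKS = [
    ("(aa|ab|ba|bb)*", lambda s: len(s) % 2 == 0, "ab"),
    ("(a|b)*aa(a|b)*aa(a|b)*", lambda s: s.count("aa") >= 2, "ab"),
    ("(11)*", lambda s: set(s) <= {"1"} and len(s) % 2 == 0, "01"),
    ("(0|1)*11", lambda s: s.endswith("11"), "01"),
    ("0(0|1)*1", lambda s: s.startswith("0") and s.endswith("1"), "01"),
    ("(0|1)*1(0|1)", lambda s: len(s) >= 2 and s[-2] == "1", "01"),
    ("(a|b)*(aa|bb)(a|b)*", lambda s: "aa" in s or "bb" in s, "ab"),
]
RESULTS = [agrees(p, pred, alpha) for p, pred, alpha in CHECKS]
```

Every entry of RESULTS is True. The same helper also catches mistakes: agrees("aa|ab|bb", lambda s: len(s) == 2, "ab") is False, because the string ba is missing from that pattern.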

5. What are assignment statements with integer types in compiler design?

An assignment statement assigns the value of an expression to a variable. Here the expressions involve only
integer variables.

Abstract Translation Scheme

Consider the grammar, which consists of an assignment statement.

S → id = E

E→E+E

E→E∗E

E → −E

E → (E)

E → id

Here Translation of E can have two attributes −

E.PLACE − It tells about the name that will hold the value of the expression.
E.CODE − It represents a sequence of three-address statements evaluating the expression E.

S in the grammar represents an assignment statement, and S.CODE represents the three-address code of the
statement. CODE for the non-terminal on the left is the concatenation of CODE for each non-terminal
on the right of the production.

Abstract Translation Scheme

Production           Semantic Action

S → id = E           {S.CODE = E.CODE || id.PLACE || '=' || E.PLACE}

E → E(1) + E(2)      {T = newtemp();
                      E.PLACE = T;
                      E.CODE = E(1).CODE || E(2).CODE ||
                               E.PLACE || '=' || E(1).PLACE || '+' || E(2).PLACE}

E → E(1) ∗ E(2)      {T = newtemp();
                      E.PLACE = T;
                      E.CODE = E(1).CODE || E(2).CODE ||
                               E.PLACE || '=' || E(1).PLACE || '*' || E(2).PLACE}

E → −E(1)            {T = newtemp();
                      E.PLACE = T;
                      E.CODE = E(1).CODE || E.PLACE || '=−' || E(1).PLACE}

E → (E(1))           {E.PLACE = E(1).PLACE;
                      E.CODE = E(1).CODE}

E → id               {E.PLACE = id.PLACE;
                      E.CODE = null}
In the first production S → id = E,

id.PLACE || '=' || E.PLACE is the string appended after E.CODE to form S.CODE.

In the second production E → E(1) + E(2),

E.PLACE || '=' || E(1).PLACE || '+' || E(2).PLACE is the string appended after E(1).CODE || E(2).CODE to
form E.CODE.

The fifth production, E → (E(1)), appends no string after E.CODE = E(1).CODE.
This is because it has no operator on the R.H.S. of the production.

Similarly, the sixth production, E → id, sets E.CODE = null and appends no string. Its R.H.S. consists only
of id, which is a terminal symbol and not an expression. CODE represents a sequence of three-address
statements evaluating an expression, and here there is nothing to evaluate.

We can also use a procedure GEN(statement) in place of S.CODE & E.CODE, as the GEN procedure will
automatically generate the three-address statements.

So, the GEN statement will replace the CODE statements.

GEN Statements replacing CODE definitions

Production           Semantic Action

S → id = E           GEN(id.PLACE = E.PLACE)

E → E(1) + E(2)      GEN(E.PLACE = E(1).PLACE + E(2).PLACE)

E → E(1) ∗ E(2)      GEN(E.PLACE = E(1).PLACE ∗ E(2).PLACE)

E → −E(1)            GEN(E.PLACE = −E(1).PLACE)

E → (E(1))           None

E → id               None
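The scheme can be tried out with a short Python sketch that walks an expression tree and emits one statement per operator, in the spirit of GEN. The tuple encoding of expressions, the function name translate and the temporary names T1, T2, … are assumptions for the example. Parentheses need no case of their own because the nesting of the tuples already records them.

```python
from itertools import count


def translate(target, expr):
    """Return three-address statements for the assignment target = expr.

    expr is an identifier string, ("neg", e), or (op, e1, e2) with op
    being "+" or "*".
    """
    code = []
    temps = count(1)

    def place_of(node):
        """Emit code for node and return its PLACE."""
        if isinstance(node, str):          # E -> id: PLACE is the name, no code
            return node
        if node[0] == "neg":               # E -> -E(1)
            operand = place_of(node[1])
            temp = f"T{next(temps)}"       # newtemp()
            code.append(f"{temp} = -{operand}")
            return temp
        op, left, right = node             # E -> E(1) op E(2)
        left_place = place_of(left)
        right_place = place_of(right)
        temp = f"T{next(temps)}"           # newtemp()
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    place = place_of(expr)
    code.append(f"{target} = {place}")     # S -> id = E
    return code
```

For x = a + b * (−c), written as translate("x", ("+", "a", ("*", "b", ("neg", "c")))), the result is the four statements T1 = -c, T2 = b * T1, T3 = a + T2, x = T3.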

6. What is Design of Lexical Analysis in Compiler Design?

This question repeats question 3; see the answer given there.

7. Explain about regular expressions in TOC

A regular expression is basically a shorthand way of showing how
a regular language is built from the base set of regular languages.

The symbols used in a regular expression are nearly identical to those used to construct the languages, and
every expression has a language closely associated with it.

For each regular expression E, there is a regular language L(E).

Example 1
If the regular expression is as follows −

a + b · a*

It can be written in fully parenthesized form as follows −

(a + (b · (a*)))

Regular expressions vs. Languages

The symbols of the regular expressions are distinct from those of the languages. These symbols are given
below −

Operators in Regular expression −

There are two binary operations on regular expressions (+ and ·) and one unary operator (*).

These are closely associated with the union, product and closure operations on the corresponding
languages.

For example, the regular expression a + bc* is basically shorthand for the regular language {a} ∪ ({b} · ({c}*)).

Example 2
Find the language of the regular expression a + bc*. It is explained below −

L(a + bc*) = L(a) ∪ L(bc*)
= L(a) ∪ (L(b) · L(c*))
= L(a) ∪ (L(b) · L(c)*)
= {a} ∪ ({b} · {c}*)
= {a} ∪ ({b} · {∧, c, c^2, . . . , c^n, . . . })
= {a} ∪ {b, bc, bc^2, . . . , bc^n, . . . }
= {a, b, bc, bc^2, . . . , bc^n, . . . }.
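The three language operations used in this derivation can be computed directly on finite sets of strings, as long as the closure is cut off at some maximum length. In the sketch below the function names union, concat and closure and the cut-off of 4 characters are assumptions for the example.

```python
def union(lang1, lang2):
    """L1 ∪ L2."""
    return lang1 | lang2


def concat(lang1, lang2, max_len):
    """L1 · L2, keeping only strings of at most max_len characters."""
    return {x + y for x in lang1 for y in lang2 if len(x + y) <= max_len}


def closure(lang, max_len):
    """L*, keeping only strings of at most max_len characters."""
    result = {""}        # the empty string is always in L*
    frontier = {""}
    while frontier:      # extend by one more factor from L until nothing new
        frontier = concat(frontier, lang, max_len) - result
        result |= frontier
    return result


MAX_LEN = 4
# L(a + bc*) = {a} ∪ ({b} · {c}*)
LANGUAGE = union({"a"}, concat({"b"}, closure({"c"}, MAX_LEN), MAX_LEN))
```

With the cut-off at 4, LANGUAGE is {a, b, bc, bcc, bccc}, the first few members of the set derived above.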

8. What are the properties of Regular expressions in TOC?

A regular expression is basically a shorthand way of showing how a regular language is built from the base
set of regular languages.

The symbols used in a regular expression are nearly identical to those used to construct the languages, and
every expression has a language closely associated with it.

For each regular expression E, there is a regular language L(E).

There are some general equalities for the regular expressions.

Properties
All the properties hold for any regular expressions R, E and F, and can be verified by using properties of
languages and sets.

Additive (+) properties

The additive properties of regular expressions are as follows −

R+∅=∅+R=R
R+E=E+R

R+R=R
(R + E) + F = R + (E + F)
Product (·) properties
The product properties of regular expressions are as follows −

R∅ = ∅R = ∅
R∧ = ∧R = R
(RE)F = R(EF)
Distributive properties
The distributive properties of regular expressions are as follows −

R(E + F) = RE + RF
(R + E)F = RF + EF
Closure properties
The closure properties of regular expressions are as follows −

∅* = ∧ * = ∧

R* = ∧ + RR* = (∧ + R)R*
R* = R*R* = (R*)* = R + R*

RR* = R*R
R(ER)* = (RE)*R
(R + E)* = (R*E*)* = (R* + E*)* = R*(ER*)*
All the properties can be verified by using the properties of languages and sets.

Example 1
Show that

(∅ + a + b)* = a*(ba*)*

Using the properties above:

(∅ + a + b)* = (a + b)* (+ property)
= a*(ba*)* (closure property).
Example 2
Show that

∧ + ab + abab(ab)* = (ab)*

Using the properties above:

∧ + ab + abab(ab)* = ∧ + ab(∧ + ab(ab)*) (distributive property)

= ∧ + ab((ab)*) (using R* = ∧ + RR*)

= ∧ + ab(ab)* = (ab)* (using R* = ∧ + RR* again)
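Equalities like these can be spot-checked mechanically by comparing the two sides on every string up to a fixed length. The Python sketch below does so with the re module, writing + as | and ∧ as an empty alternative. The function name equivalent and the length bound are assumptions for the example, and passing the check is supporting evidence only, not a proof.

```python
import re
from itertools import product


def equivalent(pattern1, pattern2, alphabet="ab", max_len=8):
    """True if both regexes accept the same strings of length <= max_len."""
    r1, r2 = re.compile(pattern1), re.compile(pattern2)
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(r1.fullmatch(s)) != bool(r2.fullmatch(s)):
                return False
    return True


# Each pair is (left side, right side) of one equality.
IDENTITIES = [
    ("(a|b)*", "a*(ba*)*"),            # (R + E)* = R*(ER*)*
    ("(|ab|abab(ab)*)", "(ab)*"),      # ∧ + ab + abab(ab)* = (ab)*
    ("a(ba)*", "(ab)*a"),              # R(ER)* = (RE)*R
    ("(a|b)*", "(a*b*)*"),             # (R + E)* = (R*E*)*
]
OUTCOMES = [equivalent(left, right) for left, right in IDENTITIES]
```

All four outcomes are True, while a false claim such as (a|b)* = a*b* is rejected because the string ba separates the two sides.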

9.Explain about right linear regular grammars in TOC

Regular grammar describes a regular language. It consists of four components, which are as follows −

G = (N, Σ, P, S)
Where,

• N − a finite set of non-terminal symbols,

• Σ − a finite set of terminal symbols,
• P − a set of production rules, each of which has one of the forms
  A → aB
  A → a
  A → ∈
  where A and B are non-terminals in N, a is a terminal in Σ, and ∈ is the empty string,

• S ∈ N is the start symbol.

The above grammar can be of two forms −

• Right Linear Regular Grammar

• Left Linear Regular Grammar
Linear Grammar
A grammar is linear when the right side of every production contains at most one non-terminal; otherwise it
is non-linear.

Let’s discuss right linear grammar −

Right linear grammar

Right linear grammar means that the non-terminal symbol, if any, is at the right end of the right-hand side
of the production.

It is a formal grammar (N, Σ, P, S) such that all the production rules in P are of one of the following forms
−

L → a, {L is a non-terminal in N and a is a terminal in Σ}

L → aM, {L and M are non-terminals in N and a is in Σ}

L → ∈, {∈ is the empty string}

Example
Consider the language L = {b^n a b^m a | n >= 2, m >= 2}

The production rules or grammar for the given language is −

S → bbB ⇒ for the first 2 b’s

B → bB | aC ⇒ any number of b’s followed by a

C → bbD ⇒ 2 b’s

D → bD | a ⇒ any number of b’s followed by a

A rule such as S → bbB abbreviates the two single-terminal rules S → bX and X → bB for a fresh non-terminal
X, so the grammar still fits the forms above.
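Because each rule of a right linear grammar consumes terminals from the left and leaves at most one non-terminal, membership can be tested by following the rules across the input. The Python sketch below encodes the grammar above as a dictionary. The encoding and the function name derives are assumptions for the example.

```python
# Each right-hand side is (terminal string, next non-terminal or None).
GRAMMAR = {
    "S": [("bb", "B")],
    "B": [("b", "B"), ("a", "C")],
    "C": [("bb", "D")],
    "D": [("b", "D"), ("a", None)],
}


def derives(symbol, text):
    """True if the non-terminal symbol can derive exactly text."""
    for terminals, next_symbol in GRAMMAR[symbol]:
        if not text.startswith(terminals):
            continue
        rest = text[len(terminals):]
        if next_symbol is None:
            if rest == "":          # rule ends the derivation: input must end too
                return True
        elif derives(next_symbol, rest):
            return True
    return False
```

derives("S", "bbabba") is True (the shortest string, with n = m = 2), while derives("S", "babba") is False because only one b precedes the first a. The recursion always terminates because every rule consumes at least one character.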

10.Explain about left linear regular grammar in TOC

Regular grammar describes a regular language. It consists of four components, which are as follows −

G = (N, Σ, P, S)
Where,

• N − a finite set of non-terminal symbols,

• Σ − a finite set of terminal symbols,
• P − a set of production rules, each of which has one of the forms
  A → aB
  A → a
  A → ∈
  where A and B are non-terminals in N, a is a terminal in Σ, and ∈ is the empty string,

• S ∈ N is the start symbol.

The above grammar can be of two forms −

• Right Linear Regular Grammar

• Left Linear Regular Grammar
Linear Grammar
A grammar is linear when the right side of every production contains at most one non-terminal; otherwise it
is non-linear.

Left linear grammar

In a left-regular grammar (also called left-linear grammar), the rules are of the form as given below −

• L → a, {L is a non-terminal in N and a is a terminal in Σ}

• L → Ma, {L and M are in N and a is in Σ}

• L → ∈, {∈ is the empty string}

Left linear grammar means that the non-terminal symbol, if any, is at the left end of the right-hand side of
the production.

Example
Consider the language {b^n a b^m a | n >= 2, m >= 2}
The left linear grammar that is generated based on the given language is −

S → Bbba ⇒ the last 3 symbols bba

B → Bb | Dbba ⇒ Bb adds the remaining b’s of b^m; Dbba gives b^n followed by a

D → Db | ∈ ⇒ for b^(n-2)
