Compiler Construction 2

The document provides an overview of compiler construction concepts, including code optimization techniques, definitions of key terms such as sentinel, handle, and bootstrapping, and the phases of a compiler. It also discusses attributes in syntax-directed definitions, types of parsers, and various functions and tasks of lexical analyzers. Additionally, it covers topics like DAG construction, leading and trailing symbols, and differences between top-down and bottom-up parsing.

Uploaded by

Ajinkya Jagtap

COMPILER CONSTRUCTION Theory

a) List of Code Optimization Techniques

1. Constant Folding
2. Common Subexpression Elimination (CSE)
3. Dead Code Elimination
4. Loop Optimization
5. Strength Reduction
6. Peephole Optimization

b) What is Sentinel?

A sentinel is a special value placed at the end of a data structure to simplify processing and avoid
extra boundary checks.
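
As a minimal sketch of the idea, the C function below performs a linear search with a sentinel. The name sentinel_search and the requirement that the caller leaves one spare slot at a[n] are assumptions of this example, not part of the original notes.

```c
#include <stddef.h>

/* Linear search with a sentinel.
 * Assumption of this sketch: the array has room for one extra element at a[n]. */
ptrdiff_t sentinel_search(int *a, size_t n, int key)
{
    size_t i = 0;

    a[n] = key;            /* plant the sentinel just past the real data */
    while (a[i] != key)    /* no "i < n" boundary test inside the loop */
        i++;

    /* Stopping at index n means only the sentinel matched: not found. */
    return (i < n) ? (ptrdiff_t)i : -1;
}
```

Because the key is guaranteed to be found at a[n] at the latest, the loop needs one comparison per element instead of two. Lexical analyzers use the same trick when they place an end-of-buffer marker at the end of each input buffer.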

c) Define Handle.

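A handle is a substring of a right-sentential form that matches the right-hand side of a
production and whose reduction to the left-hand side is one step in the reverse of a rightmost
derivation. A shift-reduce parser repeatedly finds and reduces handles.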

d) Define Bootstrapping.

Bootstrapping is the process of writing a compiler in the language it compiles and building it, usually
in several stages, until the compiler can compile its own source code.

e) LEX is a scanner provided by the Linux operating system. State True or False. Justify.

False. LEX is not itself a scanner; it is a lexical analyzer generator. It is a tool available on
Linux/Unix systems that reads a specification of token patterns and generates a scanner for tokenizing input.

f) LALR is the best bottom-up parsing method. Justify.

LALR (Look-Ahead LR) parsing is preferred because it reduces memory usage compared to CLR
parsing while still handling most programming languages efficiently.

g) Define Basic Block.

A basic block is a sequence of consecutive instructions with one entry point (the first instruction)
and one exit point (the last instruction), so execution flows straight through with no jumps into or
out of the middle.

h) Differentiate between Synthesized and Inherited Attributes.


Aspect | Synthesized Attribute | Inherited Attribute
Definition | An attribute that is computed from child nodes and passed upward to the parent in the parse tree. | An attribute that is assigned from parent or sibling nodes and moves downward or sideways in the parse tree.
Example | In an expression grammar: E → E1 + T, E.val = E1.val + T.val (computed from children). | In a grammar rule: E → T, T.type = E.type (T inherits the type from its parent E).

i) What is a Parser?

A parser is a component of a compiler that analyzes the syntax of source code and generates a parse
tree.

j) List All Phases of the Compiler.

1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
4. Intermediate Code Generation
5. Code Optimization
6. Code Generation
7. Symbol Table Management
8. Error Handling

a) What is the use of lookahead pointer?

• A lookahead pointer is used in lexical analysis to check upcoming characters in the input
stream without consuming them. It helps in deciding the correct token, especially in cases
where multiple patterns match the input.

b) State true or false: "Target code is generated in the analysis phase of the compiler".

• False. Target code is generated in the synthesis phase, not in the analysis phase.

c) What is the output of LEX program?

• The output of a LEX program is a C program (lex.yy.c) that contains a lexical analyzer, which
can recognize patterns in input text.

d) Terminals can have synthesized attributes, but not inherited attributes. State true or false.

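• True. The attribute values of a terminal (such as its lexical value) are supplied by the lexical
analyzer and are treated as synthesized. An SDD has no semantic rules that compute attributes
for terminals, so terminals cannot have inherited attributes.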

e) Define operand descriptors.

• Operand descriptors store information about an operand, such as its type and where it is currently
located (register, memory, or stack), for use during code generation.

f) State True or False: The yywrap() lex library function by default always returns 1.

• True. By default, yywrap() returns 1, indicating the end of input.

g) List the two aspects of compilation.

• Analysis (Front-end) – Breaks down the source code (Lexical, Syntax, and Semantic
Analysis).
• Synthesis (Back-end) – Generates target code (Intermediate Code Generation, Optimization,
and Code Generation).
h) List the different types of conflicts that occur in LR parser.

• Shift-Reduce Conflict
• Reduce-Reduce Conflict

i) What is handle pruning?

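• Handle pruning is a technique in bottom-up parsing where the handle of the current right-sentential
form is identified and replaced by the non-terminal on the left-hand side of its production. Repeating
this step reduces the string to the start symbol and traces a rightmost derivation in reverse.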

j) List the techniques used in code optimization.

• Constant Folding
• Common Subexpression Elimination
• Dead Code Elimination
• Loop Optimization (Loop Unrolling, Loop Invariant Code Motion)
• Peephole Optimization


a) Define cross-compiler.

• A cross-compiler is a compiler that runs on one machine but generates code for a different
machine or platform.

b) State the advantages of Boot-strapping.

• The compiler can be improved using itself: each new version is built with the previous one.
• It makes the compiler easier to port to different machines.
• It lets the compiler be updated in its own high-level language as the programming language evolves.

c) What are sentinels?

• Sentinels are special values placed at the end of an array or list to avoid boundary checking
and improve efficiency in searching or scanning operations.

d) State the use of function retract( ).

• The retract() function moves the lookahead pointer backward in lexical analysis to
reconsider a character that was read ahead.
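
The C sketch below illustrates how a lookahead (forward) pointer and retract() might cooperate when recognizing the operators <, <= and <>. The Scanner type, the token names, and the helper functions are hypothetical names chosen for this example.

```c
#include <stddef.h>

/* Hypothetical scanner state for this sketch. */
typedef struct {
    const char *buf;   /* NUL-terminated input */
    size_t forward;    /* lookahead pointer: index of the next unread character */
} Scanner;

typedef enum { TOK_LT, TOK_LE, TOK_NE, TOK_OTHER } Token;

/* Read one character and advance the lookahead pointer. */
static char next_char(Scanner *s) { return s->buf[s->forward++]; }

/* Move the lookahead pointer back by one character. */
static void retract(Scanner *s) { s->forward--; }

/* Recognize "<", "<=" or "<>" at the current position. */
Token scan_relop(Scanner *s)
{
    if (next_char(s) != '<') {
        retract(s);              /* not a relational operator: un-read it */
        return TOK_OTHER;
    }
    switch (next_char(s)) {      /* look one character ahead */
    case '=': return TOK_LE;
    case '>': return TOK_NE;
    default:
        retract(s);              /* the extra character belongs to the next token */
        return TOK_LT;
    }
}
```

For the input "<a", the scanner has to read 'a' before it knows the token is just "<", and retract() hands that character back so the next token starts at it.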

e) Name the types of LR parsers.

• SLR (Simple LR)
• CLR (Canonical LR)
• LALR (Look-Ahead LR)

f) What does the second ‘L’ stand for in LL(1) parser?

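• The second ‘L’ stands for Leftmost derivation. In LL(1), the first ‘L’ means the input is scanned
Left to right, the second ‘L’ means the parser constructs a Leftmost derivation, and 1 is the number
of lookahead symbols.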


g) What is the purpose of augmenting the grammar?

• Augmenting the grammar adds a new start symbol S' with the single production S' → S. The parser
accepts the input exactly when it is about to reduce by this production, which gives the LR parser
one unambiguous accept condition.

h) Define synthesized attribute.

• A synthesized attribute is an attribute whose value is computed from its child nodes in a
parse tree.

i) What is basic block?

• A basic block is a sequence of consecutive statements in a program where flow of control
enters at the beginning and exits at the end without any branching (except at the end).

j) Define DAG.

• DAG (Directed Acyclic Graph) is a data structure used in code optimization to represent
expressions efficiently by eliminating common subexpressions and redundant
computations.


a) YACC is a compiler or Parser. Write the correct statement.

• YACC is a parser generator, not a compiler. It is used to generate parsers for processing
structured input.

b) Write a regular expression in LEX for a hexadecimal number in C language.

• 0[xX][0-9a-fA-F]+

c) Define cross-compiler.

• A cross-compiler is a compiler that runs on one platform but generates executable code for a
different platform.

d) List any two transformations performed on a basic block.

• Constant Folding (Replacing constant expressions with their computed values)
• Dead Code Elimination (Removing code that does not affect program output)
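
To make constant folding concrete, here is a small C sketch that folds constant subexpressions in an expression tree. The Expr type, its one-character operator tags, and the function name fold are assumptions made for this illustration; a real compiler would work on its own intermediate representation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression node for this sketch. */
typedef struct Expr {
    char op;               /* '+', '*', 'n' (constant) or 'v' (variable) */
    int value;             /* meaningful only when op == 'n' */
    struct Expr *l, *r;    /* operands of '+' and '*' */
} Expr;

/* Fold constant subexpressions in place.
 * Returns true if the node is (now) a constant. */
bool fold(Expr *e)
{
    if (e->op == 'n') return true;
    if (e->op == 'v') return false;

    /* Fold both operands first, even if one of them is not constant. */
    bool left_const = fold(e->l);
    bool right_const = fold(e->r);
    if (!(left_const && right_const))
        return false;

    /* Both operands are constants: compute the result at "compile time". */
    e->value = (e->op == '+') ? e->l->value + e->r->value
                              : e->l->value * e->r->value;
    e->op = 'n';
    e->l = e->r = NULL;
    return true;
}
```

For the tree of x + 2 * 3, the call folds the inner 2 * 3 into the constant 6 and leaves the outer addition in place, because x is not known at compile time.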

e) What are sentinels?

• Sentinels are special values placed at the end of an array to eliminate the need for boundary
checking, improving efficiency.

f) Define Annotated Parse Tree.

• An annotated parse tree is a parse tree where each node contains attribute values that help
in semantic analysis.
g) Name the types of LR parser.

• SLR (Simple LR)
• CLR (Canonical LR)
• LALR (Look-Ahead LR)

h) What is a basic block?

• A basic block is a sequence of statements that always execute sequentially without branching
(except at the end).

i) State the use of function retract( ).

• The retract() function moves the lookahead pointer backward in lexical analysis when an
extra character is read ahead.

j) Construct LR(1) items for the following production: S → ∈

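LR(1) items (for the augmented grammar S' → S, S → ∈):

1. S' → · S, $
2. S → ·, $ (since the right-hand side is ∈, the item consists of the dot alone and is already complete)
3. S' → S ·, $

Items 1 and 2 form the initial state I0; item 3 belongs to the state goto(I0, S). Because S → ·, $ is a
complete item, the parser reduces by S → ∈ when the lookahead is $ (end of input).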


b) List the two classes of SDD.

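• S-attributed SDD (Uses only synthesized attributes)
• L-attributed SDD (Uses both inherited and synthesized attributes; each inherited attribute may
depend only on the parent's inherited attributes and on symbols to its left, so attributes can be
evaluated left to right)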

c) Define the term dead code.

• Dead code refers to code that never gets executed or does not affect the program output,
making it unnecessary and removable during optimization.

d) List the different types of conflicts that occur in LR parser.

• Shift-Reduce Conflict
• Reduce-Reduce Conflict

e) State one difference between an Annotated Parse Tree and a Dependency Graph.

• Annotated Parse Tree shows attribute values at each node in the parse tree.
• Dependency Graph shows dependencies between attributes to determine evaluation order.
f) List the techniques used in code optimization.

• Constant Folding
• Dead Code Elimination
• Common Subexpression Elimination
• Loop Optimization (Loop Unrolling, Loop Invariant Code Motion)
• Peephole Optimization

g) What is the purpose of augmenting the grammar?

• Augmenting the grammar introduces a new start symbol to help define parsing rules clearly
and ensure proper handling of input.

h) Define the term Attribute Grammar.

• An Attribute Grammar is a formal way to define semantics by associating attributes with grammar
symbols and semantic rules with productions; the rules compute the attribute values at the nodes
of a parse tree.

i) What is the output of Lexical Analysis?

• The output of lexical analysis is a stream of tokens, which are used as input for the syntax
analyzer (parser).

j) State True or False: Shift-Shift conflict does not occur in LR Parser.

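• True. A shift is determined by the current state and the next input symbol, and the LR automaton
is deterministic, so a state has at most one shift target per symbol. Only shift-reduce and
reduce-reduce conflicts can occur.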

Q2
a) Write a short note on s-attributed grammar.

An S-attributed grammar is a syntax-directed definition (SDD) that uses only synthesized
attributes. These attributes are computed from child nodes and passed upward in the parse tree.

Key Features:

1. Uses only synthesized attributes (no inherited attributes).


2. Values are computed bottom-up in the parse tree.
3. Suitable for LR parsers (bottom-up parsing methods).
4. Commonly used for evaluating expressions and semantic analysis.

Example: For an arithmetic expression:


E → E1 + T { E.val = E1.val + T.val }
T → num { T.val = num.value }
c) Write LEX definition for identifier.

An identifier in LEX is a letter (A-Z, a-z) or underscore (_) followed by zero or more letters,
digits (0-9), or underscores. It must not start with a digit.

LEX Pattern for Identifier:


[a-zA-Z_][a-zA-Z0-9_]*
Explanation:

• [a-zA-Z_] → The first character must be a letter or underscore.


• [a-zA-Z0-9_]* → The remaining characters can be letters, digits, or underscores.
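
For comparison, the same pattern can be checked by hand in C. The sketch below is an illustration rather than code generated by LEX; the function name is_identifier is an assumption, and the character tests match [a-zA-Z] and [a-zA-Z0-9] only in the default "C" locale.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hand-written equivalent of the pattern [a-zA-Z_][a-zA-Z0-9_]* */
bool is_identifier(const char *s)
{
    /* First character: a letter or underscore (this also rejects the empty string). */
    if (!(isalpha((unsigned char)*s) || *s == '_'))
        return false;

    /* Remaining characters: letters, digits, or underscores. */
    for (s++; *s != '\0'; s++)
        if (!(isalnum((unsigned char)*s) || *s == '_'))
            return false;

    return true;
}
```

With this function, "_tmp1" and "count" are accepted, while "9lives" (starts with a digit) and "a-b" (contains a character outside the class) are rejected.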

a) Define Annotated Parse Tree. Give an example.

• An Annotated Parse Tree is a parse tree where each node is associated with attribute values
that help in semantic analysis. These attributes can be synthesized or inherited and are used to
store information such as data types, values, or symbol table entries.

Example:
Consider the expression E → E1 + T, where E1 and T have synthesized attributes for value
computation.

Annotated Parse Tree for 3 + 2:

           E (val = 5)
         /      |      \
E1 (val = 3)    +    T (val = 2)

Here, E1.val = 3, T.val = 2, and E.val = E1.val + T.val = 5.

b) List and explain in short any two LEX library functions (2 marks each).

1. yylex()
o This function is the main lexical analyzer function generated by LEX.
o It reads the input stream, matches patterns, and returns the corresponding tokens to the
parser.
2. yytext
o yytext is a global character array that stores the current matched token from the
input.
o Example: If 123 is matched as a number, yytext will store "123".
c) Give 2 differences between synthesized and inherited attributes.

Synthesized Attributes | Inherited Attributes
Computed from child nodes and passed up in the parse tree. | Derived from parent or sibling nodes and passed down or sideways in the parse tree.
Used in S-attributed grammars, making evaluation simpler. | Used in L-attributed grammars, requiring careful order of evaluation.

c) Calculate FIRST and FOLLOW for the given grammar:

Given grammar:

1. S → a | ∈ | (R)
2. T → S, T | S
3. R → T

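FIRST sets (the comma in T → S , T is a terminal):

• FIRST(S) = { a, (, ∈ }
• FIRST(T) = { a, (, ',', ∈ } (the comma is included because S can derive ∈ in T → S , T; ∈ is included because of T → S)
• FIRST(R) = FIRST(T) = { a, (, ',', ∈ }

FOLLOW sets (S is the start symbol; "|" is grammar notation, not a terminal, so it never appears in a FOLLOW set):

• FOLLOW(R) = { ) } (from S → (R))
• FOLLOW(T) = FOLLOW(R) = { ) } (from R → T)
• FOLLOW(S) = { $, ',', ) } ($ because S is the start symbol, the comma from T → S , T, and ) from FOLLOW(T) via T → S)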
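
The FIRST computation can be checked mechanically with a fixed-point iteration. The C sketch below encodes the grammar above with one character per symbol (uppercase letters are non-terminals, an empty right-hand side stands for ∈). The encoding and the names Prod, G, nullable, first, and compute_first are assumptions of this example. Note that because S can derive ∈, the comma that follows S in T → S , T ends up in FIRST(T).

```c
#include <stdbool.h>
#include <string.h>

typedef struct { char lhs; const char *rhs; } Prod;   /* "" encodes ∈ */

/* S -> a | ∈ | (R)     T -> S,T | S     R -> T */
static const Prod G[] = {
    {'S', "a"}, {'S', ""}, {'S', "(R)"},
    {'T', "S,T"}, {'T', "S"},
    {'R', "T"},
};
enum { NPROD = sizeof G / sizeof G[0] };

static bool is_nonterminal(char c) { return c >= 'A' && c <= 'Z'; }

bool nullable[128];      /* nullable[X]: X can derive ∈ */
bool first[128][128];    /* first[X][t]: terminal t is in FIRST(X) */

void compute_first(void)
{
    bool changed = true;

    memset(nullable, 0, sizeof nullable);
    memset(first, 0, sizeof first);

    while (changed) {                       /* repeat until nothing new is added */
        changed = false;
        for (int p = 0; p < NPROD; p++) {
            int A = G[p].lhs;
            const char *x = G[p].rhs;

            for (; *x != '\0'; x++) {
                int X = *x;
                if (!is_nonterminal(*x)) {  /* a terminal starts the rest of the RHS */
                    if (!first[A][X]) { first[A][X] = true; changed = true; }
                    break;
                }
                for (int t = 0; t < 128; t++)      /* copy FIRST(X) into FIRST(A) */
                    if (first[X][t] && !first[A][t]) {
                        first[A][t] = true;
                        changed = true;
                    }
                if (!nullable[X])           /* X cannot vanish: stop scanning the RHS */
                    break;
            }
            /* Reached the end: every RHS symbol is nullable (or the RHS is empty). */
            if (*x == '\0' && !nullable[A]) { nullable[A] = true; changed = true; }
        }
    }
}
```

In this representation ∈ is tracked separately in nullable[] rather than stored as a member of first[][], so "∈ is in FIRST(T)" corresponds to nullable['T'] being true.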

e) Compute LEADING and TRAILING symbols for the given grammar:

Given grammar:

1. E → E + T | T
2. T → T * F | F
3. F → (E) | id

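LEADING symbols: (The first terminal that can appear in a string derived from the non-terminal, possibly preceded by a single non-terminal)

• LEADING(E) = { +, *, (, id }
• LEADING(T) = { *, (, id }
• LEADING(F) = { (, id }

TRAILING symbols: (The last terminal that can appear in a string derived from the non-terminal, possibly followed by a single non-terminal)

• TRAILING(E) = { +, *, ), id }
• TRAILING(T) = { *, ), id }
• TRAILING(F) = { ), id }
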
a) Construct the DAG for the expression:

Expression: b * (a + c) + (a + c) * d

Directed Acyclic Graph (DAG) Construction Steps:

1. Identify common subexpressions: (a + c) appears twice.


2. Create a single node for a + c to eliminate redundancy.
3. Use this node for both multiplications b * (a + c) and (a + c) * d.
4. Sum the results of both multiplications.

DAG Representation:

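            (+)
           /   \
        (*)     (*)
       /   \   /   \
      b     (+)     d
           /   \
          a     c

The single (+) node for a + c is shared: it is the right operand of the first (*) and the left operand of the second (*).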
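
One common way to obtain this sharing is to look for an existing node with the same operator and operands before creating a new one. The C sketch below shows that idea for the expression above; the Node layout, the fixed-size table, and the function names are assumptions of this example, and the linear lookup stands in for the hash table a real compiler would likely use.

```c
/* Hypothetical DAG node table for this sketch.
 * Leaves store their name in op and have left = right = -1. */
typedef struct { char op; int left, right; } Node;

enum { MAX_NODES = 64 };
static Node nodes[MAX_NODES];
static int node_count = 0;

/* Return the index of an identical existing node, or create a new one.
 * Returns -1 if the table is full. */
int dag_node(char op, int left, int right)
{
    for (int i = 0; i < node_count; i++)
        if (nodes[i].op == op && nodes[i].left == left && nodes[i].right == right)
            return i;                       /* common subexpression: reuse it */

    if (node_count == MAX_NODES)
        return -1;
    nodes[node_count] = (Node){ op, left, right };
    return node_count++;
}

int dag_leaf(char name) { return dag_node(name, -1, -1); }

/* Build the DAG for b * (a + c) + (a + c) * d and return the root index. */
int build_example(void)
{
    int b = dag_leaf('b'), a = dag_leaf('a'), c = dag_leaf('c'), d = dag_leaf('d');
    int s1 = dag_node('+', a, c);    /* first (a + c) */
    int m1 = dag_node('*', b, s1);
    int s2 = dag_node('+', a, c);    /* second (a + c): same index as s1 */
    int m2 = dag_node('*', s2, d);
    return dag_node('+', m1, m2);
}
```

The expression has nine operand and operator occurrences as a tree, but the table ends up with eight nodes because the second a + c reuses the first.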

b) Basic and Auxiliary Tasks of a Lexical Analyzer

Basic Tasks:

1. Tokenization: Converts source code into tokens.


2. Pattern Matching: Uses regular expressions to identify tokens.
3. Skipping White Spaces & Comments: Ignores unnecessary spaces and comments.
4. Symbol Table Management: Stores variable names and other identifiers.

Auxiliary Tasks:

1. Error Handling: Detects and reports lexical errors.


2. Preprocessing Tasks: Expands macros and handles file inclusion in languages like C.
3. Input Buffering: Efficiently reads input to optimize scanning.

c) Two Limitations of Top-Down Parsing

1. Cannot Handle Left Recursion:


o If a grammar has left recursion (e.g., A → Aα | β), top-down parsing fails or loops
indefinitely.
2. Limited Lookahead:
o It may not handle certain grammars where more than one lookahead token is needed for
decision-making.
d) Definitions of S-Attributed and L-Attributed Grammar

• S-Attributed Grammar:
o Uses only synthesized attributes, which are evaluated from child nodes to parent in
the parse tree.
o Example: Used in postfix expression evaluation.
• L-Attributed Grammar:
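o Uses both synthesized and inherited attributes; each inherited attribute may depend only on
the parent's inherited attributes and on symbols to its left, so evaluation follows a
left-to-right traversal.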
o Example: Used in type checking in a compiler.

e) Difference Between Top-Down Parsing and Bottom-Up Parsing

Top-Down Parsing | Bottom-Up Parsing
Starts from the start symbol and expands using production rules. | Starts from input tokens and reduces them to the start symbol.
Uses prediction (LL parsers). | Uses reduction (LR parsers).
Cannot handle left recursion. | Can handle left recursion.
Example: Recursive Descent, LL(1) Parsing. | Example: LR, SLR, LALR, CLR Parsing.

a) Phases of a Compiler in Sequence:

1. Lexical Analysis – Converts source code into tokens.


2. Syntax Analysis (Parsing) – Checks the syntax using grammar rules and constructs a parse
tree.
3. Semantic Analysis – Ensures meaning correctness (e.g., type checking).
4. Intermediate Code Generation – Converts source code into an intermediate representation
(IR).
5. Code Optimization – Improves the intermediate code for efficiency.
6. Code Generation – Converts optimized IR into target machine code.
7. Symbol Table Management & Error Handling (Runs throughout all phases).

b) Definitions of Synthesized and Inherited Attributes:

• Synthesized Attribute:
o An attribute computed from child nodes and passed up in the parse tree.
o Example: Expression evaluation (E.val = E1.val + T.val).
• Inherited Attribute:
o An attribute derived from parent or sibling nodes and passed down or sideways in
the parse tree.
o Example: Type checking in variable declarations.

c) Directed Acyclic Graph (DAG) Construction for the Given Block

Step 1: Identify Common Subexpressions

• a[i] is accessed twice in b = a[i] and e = a[i], so we create a single node for a[i] to avoid
redundancy.
• a[j] = d is an independent assignment.

DAG Representation
----> (b)
|
(a[i]) (a[j]) ---> (a)
| |
| (d)
|
(e)
Explanation:

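1. a[i] is computed once and used for both b and e, provided no assignment to the array a occurs
between the two reads.
2. a[j] is assigned d, modifying a at index j. If this assignment lies between b = a[i] and e = a[i],
it kills the a[i] node (j may equal i), and e = a[i] must then get a new node instead of sharing the
old one.
3. The DAG removes duplicate computations, optimizing the code execution.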

d) Difference Between Top-Down Parsing and Bottom-Up Parsing

Top-Down Parsing | Bottom-Up Parsing
Starts from the start symbol and applies production rules to derive the input string. | Starts from the input string and reduces it to the start symbol using production rules.
Uses prediction (LL parsers). | Uses reduction (LR parsers).
Cannot handle left recursion. | Can handle left recursion.
Works well for small and simple grammars. | Handles a larger class of grammars; ambiguous grammars still cause conflicts unless extra rules (such as operator precedence) resolve them.
Example: Recursive Descent, LL(1) Parsing. | Example: LR, SLR, LALR, CLR Parsing.

e) Differentiate between SLR and Canonical LR parser.

Aspect | SLR (Simple LR) Parser | Canonical LR Parser
Definition | SLR (Simple LR) parsing uses follow sets to determine reductions in parsing conflicts. | Canonical LR uses complete LR(1) items, where each state has lookaheads to resolve conflicts.
Parsing Table Construction | Reductions are performed using Follow(A) for a production A → α. | Reductions depend on lookahead symbols, which help in better conflict resolution.
Handling of Conflicts | More chances of shift-reduce and reduce-reduce conflicts, as it only uses follow sets. | Fewer conflicts due to the use of lookaheads, making it more precise.
Number of States | Has fewer states, leading to a smaller parsing table. | Has more states, leading to a larger parsing table.
Grammar Handling | Can handle a limited set of grammars due to its simple conflict resolution approach. | Can handle a wider range of grammars, including those that SLR cannot parse.
Efficiency | Faster and requires less memory, making it useful for simple grammars. | More powerful but requires more memory and computational effort.

e) Definition of Left Recursion and Its Elimination

Definition:

• A grammar is left-recursive if a non-terminal can derive a string that begins with itself, that is,
it appears in the leftmost position of its own production, either directly or through other non-terminals.
• Example:
o Direct Left Recursion: A → Aα | β (A calls itself directly).
o Indirect Left Recursion: A → Bα, B → Aγ (A calls B, and B calls A).

Elimination of Left Recursion:

• Convert A → Aα | β into:
  A → βA'
  A' → αA' | ∈
• Example:
o Given A → A + T | T,
o Convert to A → T A' and A' → + T A' | ∈.

This transformation ensures that recursion happens at the rightmost position instead of the left,
making the grammar suitable for top-down parsing.
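
The transformed grammar maps directly onto a recursive-descent recognizer, one function per non-terminal. The C sketch below is a minimal illustration; since the notes do not define T, it assumes T → digit (a single decimal digit), and all function names are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>

static const char *p;   /* current position in the input */

/* T -> digit   (assumption: the notes leave T undefined) */
static bool parse_T(void)
{
    if (isdigit((unsigned char)*p)) { p++; return true; }
    return false;
}

/* A' -> + T A' | ∈ */
static bool parse_A_prime(void)
{
    if (*p == '+') {
        p++;                               /* consume '+' before recursing */
        return parse_T() && parse_A_prime();
    }
    return true;                           /* ∈-production: consume nothing */
}

/* A -> T A' */
static bool parse_A(void) { return parse_T() && parse_A_prime(); }

/* Returns true if the whole input is derivable from A. */
bool accepts(const char *input)
{
    p = input;
    return parse_A() && *p == '\0';
}
```

Every recursive call in parse_A_prime happens only after a '+' has been consumed, so the recursion always terminates. A function written for the original rule A → A + T would call itself before consuming any input and never return.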

a) Define SDD and SDT. State the task performed by SDT.

1. Definition of SDD (Syntax Directed Definition)

Syntax Directed Definition (SDD) is a formalism used in compilers where semantic rules are
associated with grammar productions. These rules define how attributes are computed based on
syntax structure.

Types of Attributes in SDD:

• Synthesized Attribute → Computed from child nodes (bottom-up).


• Inherited Attribute → Passed from parent or sibling nodes (top-down).

2. Definition of SDT (Syntax Directed Translation)

Syntax Directed Translation (SDT) is a method where semantic actions (code snippets) are
embedded within the grammar productions to perform translations during parsing.

Example:

E → E1 + T { print('+'); }

Here, { print('+'); } is an SDT action that prints + when the rule is applied.
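
As a sketch of how such an action behaves at run time, the C function below translates sums of single digits from infix to postfix. It assumes T → digit { print(digit) }, which is not in the notes, and it handles the left-recursive rule E → E1 + T with a loop so that a top-down routine can be used. The names Ctx, emit, and to_postfix are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Translation context: input cursor plus a bounded output buffer. */
typedef struct { const char *in; char *out; size_t cap, len; } Ctx;

/* Append one character to the output; returns -1 if the buffer is full. */
static int emit(Ctx *c, char ch)
{
    if (c->len + 1 >= c->cap) return -1;
    c->out[c->len++] = ch;
    c->out[c->len] = '\0';
    return 0;
}

/* T -> digit { print(digit) }   (assumed rule for this sketch) */
static int parse_T(Ctx *c)
{
    if (!isdigit((unsigned char)*c->in)) return -1;
    return emit(c, *c->in++);
}

/* E -> E1 + T { print('+') }, parsed as T followed by repeated "+ T".
 * Returns 0 on success, -1 on a syntax error or if out is too small. */
int to_postfix(const char *in, char *out, size_t cap)
{
    Ctx c = { in, out, cap, 0 };

    if (cap == 0) return -1;
    out[0] = '\0';
    if (parse_T(&c) != 0) return -1;
    while (*c.in == '+') {
        c.in++;
        if (parse_T(&c) != 0) return -1;
        if (emit(&c, '+') != 0) return -1;   /* the action runs after T is parsed */
    }
    return (*c.in == '\0') ? 0 : -1;
}
```

Because the '+' action fires only after both operands have been processed, the input 1+2+3 produces the postfix string 12+3+.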

3. Tasks Performed by SDT:

1. Token Value Handling → Uses attribute values (lexemes, numeric values) supplied by the lexical analyzer.
2. Syntax Tree Construction → Helps in building parse trees.
3. Type Checking → Ensures type compatibility.
4. Intermediate Code Generation → Produces intermediate representations.
5. Code Optimization → Semantic actions can apply simple improvements, such as folding constant expressions.
6. Code Generation → For simple translators, actions can emit target code directly.

b) Difference Between LL Parser and LR Parser


LL Parser | LR Parser
Left-to-right scanning with Leftmost derivation. | Left-to-right scanning with Rightmost derivation in reverse.
Uses predictive parsing (Top-down approach). | Uses shift-reduce parsing (Bottom-up approach).
Cannot handle left recursion. | Can handle left recursion.
Example: Recursive Descent, LL(1) Parsing. | Example: SLR, LALR, CLR Parsing.

d) Execution Steps of a YACC Program

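1. Define Tokens – Use %token in the declarations section to define terminal symbols.
2. Define Grammar Rules – Write productions in the rules section, between the %% markers.
3. Write C Code for Actions – Implement semantic actions for rules.
4. Compile with Lex & YACC – Run yacc -d on the grammar file (producing y.tab.c and y.tab.h) and lex
on the scanner file (producing lex.yy.c).
5. Link and Run – Compile y.tab.c and lex.yy.c together with a C compiler, then execute the parser.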

e) Two Differences Between Synthesized and Inherited Attributes


Synthesized Attributes | Inherited Attributes
Computed from child nodes and passed up. | Derived from parent or sibling nodes and passed down/sideways.
Used in S-attributed grammars. | Used in L-attributed grammars.

b) Write lex program specification. Explain the Lex library functions associated with lex in brief

%{
#include <stdio.h>
%}

%%
[0-9]+      { printf("Number detected: %s\n", yytext); }
[a-zA-Z]+   { printf("Word detected: %s\n", yytext); }
\n          { /* Ignore newlines */ }
.           { printf("Symbol detected: %s\n", yytext); }
%%

/* Entry point: yylex() scans standard input until end of file. */
int main(void) {
    printf("Enter text: ");
    yylex();
    return 0;
}

/* Returning 1 tells the scanner that there is no further input. */
int yywrap(void) { return 1; }

a) Write a LEX program to find factorial of a given number

%{
#include <stdio.h>
#include <stdlib.h>

/* Recursive factorial; the result overflows for large n (beyond 20 with a 64-bit long). */
long factorial(int n) {
    if (n == 0 || n == 1) return 1;
    return n * factorial(n - 1);
}
%}

%%
[0-9]+  {
            int num = atoi(yytext);   /* convert the matched digits to an integer */
            printf("Factorial of %d is %ld\n", num, factorial(num));
        }
.|\n    { /* Ignore other characters */ }
%%

int main(void) {
    printf("Enter a number: ");
    yylex();
    return 0;
}

int yywrap(void) { return 1; }

b) What is multi-pass compiler? Explain diagrammatically with its advantages and disadvantages.

[Figure: multi-pass compiler. Source code → Pass I → IC1 → Pass II → IC2 → Pass III → IC3 → ... → Pass n → object code. Other labels in the figure: grammatical rules with data structure; error routine table; variable, constant, symbol and literal tables.]

Multi-Pass Compiler

A multi-pass compiler processes the source code in multiple passes, where each pass analyzes and
transforms the program and hands its output (an intermediate code, IC) to the next pass.

Advantages of Multi-pass Compiler:

1. A multi-pass compiler requires less memory than a single-pass compiler.
2. The wider scope available to these compilers allows better code generation.

Disadvantages of Multi-pass Compiler:

1. In a multi-pass compiler each pass reads and writes an intermediate file, which makes the
compilation process time-consuming.
2. Multi-pass compilers are slower than single-pass compilers: the time required for compilation
increases with the number of passes in a compiler.

a) List the code optimization techniques. Explain any one technique with an example.

Code optimization improves the performance of the compiled code by making it faster and more
efficient. Some common techniques are:

1. Constant Folding – Replacing constant expressions at compile time.


2. Common Subexpression Elimination (CSE) – Avoiding redundant calculations.
3. Dead Code Elimination – Removing unused code.
4. Loop Optimization – Improving loop efficiency (e.g., loop unrolling, invariant code motion).
5. Strength Reduction – Replacing expensive operations with cheaper ones.
6. Peephole Optimization – Improving a small set of instructions (removing unnecessary
instructions).
7. Copy Propagation – After a copy x = y, replacing later uses of x with y.
8. Register Allocation – Using CPU registers efficiently instead of memory.

Explanation of Common Subexpression Elimination (CSE)

• If an expression appears multiple times, it is calculated once and reused instead of
recalculating it every time.

Example Before Optimization

a = b * c + d;
x = b * c + e;

Here, b * c is computed twice.

After Optimization

temp = b * c;
a = temp + d;
x = temp + e;

• Advantage: Reduces redundant calculations, improving efficiency.


• Use Case: Used in loops and repeated expressions to enhance performance.

b) Write a lex program to find the sum of n numbers

%{
#include <stdio.h>
#include <stdlib.h>   /* atoi() */
int sum = 0;
%}

%%
[0-9]+  { sum += atoi(yytext); }
\n      { printf("Sum of numbers: %d\n", sum); sum = 0; }
.       { /* Ignore separators such as spaces */ }
%%

int main(void) {
    printf("Enter numbers separated by spaces, then press Enter:\n");
    yylex();
    return 0;
}

int yywrap(void) { return 1; }

b) List and explain in short any two LEX library functions.

1. yytext

• Definition:
yytext is a character array that stores the matched token from the input.
• Usage:
o Helps in identifying token values like keywords, numbers, or identifiers.
o Used for further processing like symbol table insertion.
• Example:
[0-9]+ { printf("Number: %s\n", yytext); }
o If input is "123", the output will be:
Number: 123

2. yywrap()

• Definition:
This function is called when the input file ends.
• Default Behavior:
o Returns 1 to signal the end of lexical analysis.
• Usage:
o Can be overridden to process multiple input files.
• Example:
int yywrap() { return 1; } // Ends processing
o If overridden:
int yywrap() { return 0; } // Continue scanning; yyin must first be pointed at the next file
These functions help in token recognition and input handling in Lex programs.

a) Write the steps of creation of lexical analyzer on lex. Explain the lex library functions associated with lex.

A lexical analyzer (lexer) reads the input source code and converts it into tokens. The steps to create
a lexical analyzer using Lex are:

1. Write the Lex Specification
o Create a .l file (Lex source file).
o Define patterns (regular expressions) for tokens.
2. Compile the Lex File
o Use the command: lex filename.l
o This generates a C file named lex.yy.c.
3. Compile the Generated C File
o Use: gcc lex.yy.c -o lexer -ll
o This generates the executable lexer.
4. Run the Lexer
o Execute the lexer using: ./lexer < input.txt
o It reads the input and produces tokens.

Lex Library Functions and Their Uses


Function | Description
yylex() | The main function that scans and processes input based on defined patterns.
yytext | Stores the matched token as a string.
yyleng | Holds the length of the matched token.
yywrap() | Called at the end of input. Returns 1 by default.
input() | Reads the next character from input.
unput(c) | Pushes a character c back to the input stream.
ECHO | Prints the matched token to standard output.

These functions help in scanning, processing, and handling tokens in a Lex-based lexical analyzer.

a) Define Annotated Parse tree. Give an example.

An Annotated Parse Tree is a parse tree where each node is associated with attribute values that
provide semantic information. These attributes help in type checking, intermediate code
generation, and semantic analysis during compilation.

Attributes are of two types:

1. Synthesized Attributes – Derived from child nodes and passed upwards.


2. Inherited Attributes – Passed from parent or sibling nodes.

Example

Consider the arithmetic expression: x + y

Using the grammar (E1 denotes the occurrence of E on the right-hand side):

E → E1 + T
E → T
T → id

The Annotated Parse Tree would be:

            E (val = x + y)
          /       |        \
  E1 (val = x)    +     T (val = y)
       |                    |
  T (val = x)             id (y)
       |
    id (x)

Each node carries computed values or attributes, which are later used for semantic analysis or
code generation.

d) Give 2 differences between synthesized and inherited attributes.

Differences Between Synthesized and Inherited Attributes (4 Marks)

Aspect | Synthesized Attribute | Inherited Attribute
Definition | Derived from child nodes and passed upwards in the parse tree. | Passed from parent or sibling nodes and moves downwards or sideways.
Computation | Computed using the values of child nodes. | Computed using the values from the parent or sibling nodes.
Direction in Parse Tree | Moves upward from children to parent. | Moves downward from parent to child or across siblings.
Usage | Common in S-attributed grammars and used in semantic analysis, type checking, and evaluation. | Used in L-attributed grammars for context-sensitive information like scope rules.

Example

Synthesized Attribute (Moving Upwards)

E → T + E1
E.val = T.val + E1.val

• E.val is computed using T.val and E1.val, moving upwards.

Inherited Attribute (Moving Downwards)

E → T
T.type = E.type // Inheriting type from parent node E

• T.type is inherited from E.type, moving downwards.

Synthesized attributes are more common in syntax-directed translation, while inherited attributes
are used to pass context-dependent information, such as a declared type, down the parse tree.
