Sem 5 Major

The document discusses various concepts in compiler design, including the role of regular expressions in lexical analysis, the challenges of left recursion in parsers, and the use of Directed Acyclic Graphs (DAGs) for code optimization. It also covers Syntax Directed Translation (SDT) schemes, the implementation of syntax-directed translators, and the structure of parsing tables. Examples are provided for one-pass and two-pass compilers, as well as semantic errors and activation records.

(a) Lex is an example of a lexical analyzer.
(b) Translation into a sequence of machine instructions is done in the code generation phase.
(c) Preliminary scanning removes comments and whitespace.
(d) inttofloat converts integers to floating-point numbers.
(e) Several phases of the compiler may be grouped together to form a pass.
(f) Book-keeping is also known as symbol table management.

Q. What is the need for regular expressions in compiler design?

In Compiler Design, regular expressions are concise notations used to define and
recognize patterns in source code. They play a crucial role in lexical analysis,
where they are used to identify and extract tokens (e.g., keywords, identifiers)
from code. Regular expressions help in defining the syntax of programming
languages, facilitating the transformation of human-readable code into machine-
readable forms during compilation.
(a) Left recursion is a challenge in top-down parsers.
(b) The leftmost derivation is a derivation where the leftmost non-terminal is
expanded first.
(c) Grammar which produces more than one parse tree is ambiguous grammar.
(d) The replacement of the right side of a production by the left side is called
reduction.

(e) The four parts of predictive parsers are input buffer, stack, parsing table,
and output.
Two examples of one-pass compilers are:

1. Pascal-P Compiler: A simple one-pass compiler for the Pascal language.
2. C Compilers for Embedded Systems: Many C compilers for embedded systems are designed as one-pass compilers to optimize for speed and limited memory.

Two examples of two-pass compilers are:

1. FORTRAN Compiler: Early versions of FORTRAN compilers used a two-pass design to handle code generation and optimization.
2. COBOL Compiler: Commonly implemented as a two-pass compiler for ease of error checking and code generation.
Q13: How are self-organizing lists useful for symbol table management?

• Definition: A self-organizing list is a data structure where frequently accessed elements are moved closer to the front of the list, minimizing the time needed for future access.
• Use in Symbol Table Management:
o Improved Access Time: Frequently used symbols (like keywords or
frequently accessed identifiers) can be quickly accessed.
o Dynamic Optimization: The structure adapts to runtime usage
patterns without the need for predefined ordering.
o Simple Implementation: Operations like move-to-front or
transpose are easy to implement and manage.
o Efficient for Small Data Sets: For symbol tables with a limited
number of entries, self-organizing lists provide good performance
without the overhead of more complex structures like hash tables.
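The move-to-front strategy described above can be sketched in a few lines of Python. This is an illustrative sketch, not a real compiler's symbol table; the class and method names (SelfOrganizingTable, insert, lookup) are assumptions for this example.

```python
class SelfOrganizingTable:
    """Symbol table backed by a self-organizing list (move-to-front)."""

    def __init__(self):
        self.entries = []  # list of (name, info) pairs

    def insert(self, name, info):
        self.entries.insert(0, (name, info))  # new symbols go to the front

    def lookup(self, name):
        # Linear scan; on a hit, move the entry to the front so that
        # frequently accessed symbols are found faster next time.
        for i, (n, info) in enumerate(self.entries):
            if n == name:
                self.entries.insert(0, self.entries.pop(i))
                return info
        return None

table = SelfOrganizingTable()
table.insert("count", "int")
table.insert("total", "float")
table.lookup("count")        # "count" moves to the front
print(table.entries[0][0])   # -> count
```

The transpose variant would swap the found entry with its predecessor instead of moving it all the way to the front.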

Q14: What is a DAG? Explain with an example.

• Definition: A Directed Acyclic Graph (DAG) is a graph that has directed edges and no cycles.
• Use in Compiler Design:
o Intermediate Code Representation: DAGs are used for
representing expressions to optimize code by eliminating common
sub-expressions.
• Example: Expressions: a = b + c, d = a + e, f = b + c
o DAG Representation:
▪ Create nodes for variables and operations.
▪ Avoid duplicating the node for b + c.

Step-by-Step DAG Construction:

1. Create nodes for variables b and c.
2. Combine b and c with a + operator to create a node for the expression b + c (common subexpression).
3. Use the result of b + c to assign to a.
4. For d, create a + node combining a and e.
5. For f, reuse the node for b + c.
A DAG (Directed Acyclic Graph) is a representation used in compilers for
intermediate code optimization. It helps to remove duplicate computations,
optimize the use of temporary variables, and identify common subexpressions.
Purpose of DAG:

1. Avoid recomputation of common subexpressions.
2. Optimize the sequence of instructions for efficient execution.
3. Reduce the number of registers required.
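The construction steps above can be sketched in code. This is a minimal Python sketch (the names node, nodes, and code are illustrative): each expression is keyed by its (operator, left, right) triple, and a repeated triple reuses the existing node instead of emitting a new instruction.

```python
nodes = {}    # (op, left, right) -> temporary holding the result
code = []     # emitted three-address instructions
counter = 0

def node(op, left, right):
    """Return the DAG node for `left op right`, reusing it if it exists."""
    global counter
    key = (op, left, right)
    if key not in nodes:           # only emit code for a new node
        counter += 1
        t = f"t{counter}"
        nodes[key] = t
        code.append(f"{t} = {left} {op} {right}")
    return nodes[key]

# a = b + c; d = a + e; f = b + c
a = node('+', 'b', 'c')
d = node('+', a, 'e')
f = node('+', 'b', 'c')   # reuses the node for b + c, no new instruction

print(code)     # -> ['t1 = b + c', 't2 = t1 + e']
print(a == f)   # -> True (both name the same node)
```

Only two instructions are emitted for three assignments, which is exactly the common-subexpression elimination the DAG provides.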

Q15: What is left recursion? Eliminate left recursion in the given examples.

• Definition: Left recursion occurs in a grammar when a non-terminal A is defined in such a way that it can recursively call itself as the first symbol of its production. This can lead to infinite recursion in parsers.
• Eliminating Left Recursion:
o Identify direct left recursion in the grammar.
o Rewrite the grammar using a new non-terminal to eliminate it.
(a) Grammar: A → ABα | Aa | a
(b) Grammar: E → E + T | T, T → T * F | F, F → id
What is LR parser?

An LR parser is a type of bottom-up parser used in compiler design to parse context-free grammars.
It reads the input from left to right (L) and produces a rightmost derivation in reverse (R). LR parsers
are highly efficient and can handle a wide range of grammars, including those not suitable for top-
down parsing.

Key Features:

1. Deterministic Parsing: Uses a deterministic finite automaton (DFA) to efficiently parse inputs.
2. Handles Conflicts: Can manage shift-reduce and reduce-reduce conflicts better than simpler
parsers.
3. Parsing Table: Relies on a parsing table with actions like shift, reduce, accept, and
error.

Types of LR Parsers:

1. SLR (Simple LR): Simplest but least powerful.
2. LALR (Look-Ahead LR): Combines simplicity with better handling of conflicts; most commonly used.
3. CLR (Canonical LR): Most powerful but generates larger parsing tables.

Workflow:

1. The parser maintains a stack of states and symbols.
2. It uses a parsing table to decide whether to shift (push a symbol onto the stack), reduce (apply a production rule), or accept.
3. The process ends when the input is successfully parsed or an error is detected.

Use Case: LR parsers are widely used in compilers for programming languages like C, C++, and Java
due to their robustness and ability to handle complex grammars.
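The shift/reduce workflow can be demonstrated on a toy grammar. The sketch below uses hand-built tables for S → ( S ) | a; the state numbers and table entries are assumptions constructed for this example, not output of any real parser generator.

```python
# ACTION maps (state, lookahead) to shift/reduce/accept;
# GOTO maps (state, non-terminal) to the next state after a reduction.
ACTION = {
    (0, '('): ('shift', 2), (0, 'a'): ('shift', 3),
    (1, '$'): ('accept',),
    (2, '('): ('shift', 2), (2, 'a'): ('shift', 3),
    (3, ')'): ('reduce', 'S', 1), (3, '$'): ('reduce', 'S', 1),  # S -> a
    (4, ')'): ('shift', 5),
    (5, ')'): ('reduce', 'S', 3), (5, '$'): ('reduce', 'S', 3),  # S -> ( S )
}
GOTO = {(0, 'S'): 1, (2, 'S'): 4}

def parse(tokens):
    tokens = tokens + ['$']
    stack = [0]                      # stack of parser states
    i = 0
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            return False             # blank table entry = syntax error
        if act[0] == 'accept':
            return True
        if act[0] == 'shift':
            stack.append(act[1]); i += 1
        else:                        # reduce A -> beta: pop |beta| states
            _, lhs, length = act
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], lhs)])

print(parse(list('((a))')))  # -> True
print(parse(list('(a')))     # -> False
```

Real LR tables for C-like grammars are generated mechanically (e.g., by Yacc/Bison), but the driver loop is exactly this small.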

Context-Free Grammar (CFG)

A Context-Free Grammar (CFG) is a formal grammar used to define the syntax of programming languages and other formal languages. It consists of a set of production rules that describe how strings in the language can be generated.

Components of a Context-Free Grammar:

1. Non-Terminals (N): Symbols that can be expanded into other non-terminals or terminals (e.g., S, E).
2. Terminals (T): The actual symbols of the language (e.g., a, +, 1).
3. Start Symbol (S): A special non-terminal from which derivation begins.
4. Production Rules (P): Rules that define how non-terminals can be replaced by a sequence of terminals and/or non-terminals.

Example: E → E + T | T

Formal Definition:

A CFG is a 4-tuple (N, T, P, S) where:

• N: A finite set of non-terminal symbols.
• T: A finite set of terminal symbols, disjoint from N.
• P: A finite set of production rules of the form A → α, where A ∈ N and α ∈ (N ∪ T)*.
• S ∈ N: The start symbol.
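The 4-tuple can be written down directly in code. This sketch encodes the grammar E → E + T | T, T → id and uses a small helper (expand_leftmost, an illustrative name) to apply leftmost derivations by rule index, showing how the production rules generate a string.

```python
N = {'E', 'T'}                        # non-terminals
T_SYMS = {'id', '+'}                  # terminals
P = {'E': [['E', '+', 'T'], ['T']],   # production rules, indexed per non-terminal
     'T': [['id']]}
S = 'E'                               # start symbol

def expand_leftmost(form, choices):
    """Expand the leftmost non-terminal once per rule index in `choices`."""
    for choice in choices:
        for i, sym in enumerate(form):
            if sym in N:
                form = form[:i] + P[sym][choice] + form[i + 1:]
                break
    return form

# E => E + T => T + T => id + T => id + id
print(' '.join(expand_leftmost([S], [0, 1, 0, 0])))  # -> id + id
```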

Syntax-Directed Translation (SDT) is a mechanism in compiler design that combines syntax analysis (parsing) with semantic actions to produce intermediate representations or perform computations. SDT uses attribute grammars to associate additional information (attributes) with grammar symbols and defines rules to compute these attributes.

Key Components of SDT:

1. Syntax Tree/Parse Tree:
a. Represents the syntactic structure of the input based on the grammar rules.
2. Attributes:
a. Values associated with grammar symbols to carry semantic information.
b. Types:
i. Synthesized Attributes: Computed from the attributes of child nodes
(bottom-up).
ii. Inherited Attributes: Computed from the attributes of parent or sibling
nodes (top-down).
3. Semantic Actions:
a. Pieces of code or rules embedded within the grammar, executed when a production
rule is applied.
b. Example: E → E1 + T { E.value = E1.value + T.value }

Process of SDT:

1. Parse the input to build a parse tree or directly evaluate attributes.
2. Evaluate the attributes based on semantic rules defined in the grammar.
3. Use the computed attributes for tasks like type checking, intermediate code generation, or
optimization.
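The process above can be sketched as a recursive-descent evaluator with a synthesized attribute. This is a minimal sketch for E → E1 + T { E.value = E1.value + T.value } with T → num; the left recursion is handled iteratively, and the semantic action fires each time a + production is recognized.

```python
def evaluate(tokens):
    pos = 0

    def T():                 # T -> num  { T.val = num.lexval }
        nonlocal pos
        val = int(tokens[pos])
        pos += 1
        return val

    def E():                 # E -> E1 + T  { E.val = E1.val + T.val }
        nonlocal pos
        val = T()
        while pos < len(tokens) and tokens[pos] == '+':
            pos += 1
            val = val + T()  # semantic action runs as the rule is applied
        return val

    return E()

print(evaluate(['3', '+', '4', '+', '5']))  # -> 12
```

Here val is a synthesized attribute: each non-terminal's value is computed bottom-up from its children's values, exactly as the attribute rules specify.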


Q6: List two examples of semantic errors in a compiler.

1. Using an undeclared variable in an expression.
2. Type mismatches (e.g., adding an integer to a string).

Q7: What does E.code contain?

• Explanation: E.code refers to the sequence of intermediate-code instructions generated for an expression E.
• Answer: E.code holds the instructions that compute the value of E. For the production E → E1 + E2, E.code is E1.code followed by E2.code, followed by an instruction that adds their results:

E.code = E1.code || E2.code || gen(t = E1.place + E2.place)

Q8: Write the SDT (Syntax-Directed Translation) for any Boolean expression.

For the production:

E → E1 AND E2 { E.code = E1.code || E2.code || gen(E.place = E1.place AND E2.place) }

Example:
Input: a AND b OR c
Output (three-address code):

t1 = a AND b
t2 = t1 OR c

Q9: Give an example of an activation record.

An activation record for a function contains:

1. Return Address: To return to the caller.
2. Local Variables: Space for local variables.
3. Parameters: Values passed to the function.
4. Control Link: Pointer to the caller's activation record.
5. Temporary Storage: For intermediate results.

Example: For a function int sum(int a, int b), the activation record includes:

• Parameters a and b.
• Local variable for the result.
• Return address for the caller.
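The layout above can be simulated in code. This is an illustrative sketch for int sum(int a, int b); the field names follow the list above, the return address "main+12" is a hypothetical value, and real activation-record layouts are target- and calling-convention-dependent.

```python
from dataclasses import dataclass, field

@dataclass
class ActivationRecord:
    function: str
    parameters: dict
    locals: dict = field(default_factory=dict)
    return_address: str = ""
    control_link: "ActivationRecord" = None   # pointer to caller's record

stack = []   # the runtime call stack of activation records

def call_sum(a, b, caller):
    ar = ActivationRecord("sum", {"a": a, "b": b},
                          return_address="main+12",   # hypothetical address
                          control_link=caller)
    stack.append(ar)                          # record pushed on call
    ar.locals["result"] = ar.parameters["a"] + ar.parameters["b"]
    result = ar.locals["result"]
    stack.pop()                               # record popped on return
    return result

main_ar = ActivationRecord("main", {})
print(call_sum(2, 3, main_ar))   # -> 5
```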

Q10: What is a parsing table?

A parsing table is a data structure used in syntax analysis to guide the parsing process.
1. In LL parsers: It maps a combination of a non-terminal and input token to a production rule.
2. In LR parsers: It contains actions (shift, reduce, accept) and goto states.

Q11: Regular expression for the language of strings w ∈ {0, 1}* such that w has no consecutive 1's.

1. Regular Expression:

(0 + 10)* (1 + ε)

2. NFA:
States: q0 (start), q1; both are accepting.
Transitions:

q_0 -- 0 --> q_0
q_0 -- 1 --> q_1
q_1 -- 0 --> q_0

(There is no transition from q1 on input 1, so any string with consecutive 1's is rejected.)

3. DFA: Apply the subset construction to the NFA, adding a dead state for the missing transition, to obtain a deterministic automaton; minimize if required.

Q12: Transition diagram and code for integers.

1. Transition Diagram: States q0 (start), q1 (reading digits), and q2 (accept) represent the recognition of sequences of the digits 0-9.
2. Code: Scan the input character by character; stay in the digit-reading state on digits 0-9 and reject invalid characters.
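A minimal sketch of the recognizer code (the function name recognize_integer is illustrative): state 0 is the start state, state 1 is reached after the first digit and loops on further digits, and any non-digit rejects.

```python
def recognize_integer(s):
    state = 0
    for ch in s:
        if ch.isdigit():
            state = 1            # loop in the digit-reading state
        else:
            return False         # reject invalid characters
    return state == 1            # accept only if at least one digit was read

print(recognize_integer('42'))   # -> True
print(recognize_integer('4a2'))  # -> False
print(recognize_integer(''))     # -> False
```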

Q13: How are self-organizing lists useful for symbol table management?

• Frequently accessed symbols are moved to the front of the list, improving search efficiency.
• Common strategies: Move-to-front, Transpose, and Count.

Q14: What is DAG? Explain with an example.

A Directed Acyclic Graph (DAG) is used for intermediate code optimization by removing duplicate
computations and common sub-expressions.
Example:
For a = b + c; d = a + e; f = b + c:

        d
        |
       (+)
       /  \
  a, f     e
    |
   (+)
   /  \
  b    c

Q15: Eliminate left recursion.

(a) A → ABα | Aa | a

Solution:

A → aA'
A' → BαA' | aA' | ε

(b) E → E + T | T, T → T * F | F, F → id

Solution:

1. Eliminate left recursion in E → E + T | T:

E → TE'
E' → +TE' | ε

2. Eliminate left recursion in T → T * F | F:

T → FT'
T' → *FT' | ε

3. F remains unchanged:

F → id


What is DAG? Explain with an Example.

A DAG (Directed Acyclic Graph) is a representation used in compilers for intermediate code
optimization. It helps to remove duplicate computations, optimize the use of temporary variables,
and identify common subexpressions.

Purpose of DAG:

1. Avoid recomputation of common subexpressions.
2. Optimize the sequence of instructions for efficient execution.
3. Reduce the number of registers required.

Example:

Consider the following intermediate code:

a = b + c
d = a + e
f = b + c

Step-by-Step DAG Construction:

1. Create nodes for variables b and c.
2. Combine b and c with a + operator to create a node for the expression b + c (common subexpression).
3. Use the result of b + c to assign to a.
4. For d, create a + node combining a and e.
5. For f, reuse the node for b + c.

DAG Representation:

        d
        |
       (+)
       /  \
  a, f     e
    |
   (+)
   /  \
  b    c

Explanation:

• The b + c computation is performed once and reused for both a and f.
• This reduces redundancy and improves code efficiency.

Optimized Code from DAG:

t1 = b + c
a = t1
f = t1
t2 = a + e
d = t2

Notes on Syntax Directed Translation (Compiler Design): SDT schemes, implementation of syntax-directed translators, intermediate code, postfix notation, parse trees, and syntax trees.
1. Syntax Directed Translation Scheme

A Syntax Directed Translation (SDT) scheme is a method that uses a context-free grammar to specify
the syntax of a language along with semantic rules to define its meaning. The semantic rules are
associated with grammar productions and guide the translation process.

Key Features:

1. Attributes: Each grammar symbol is associated with attributes that store information.
a. Synthesized Attributes: Computed from children nodes.
b. Inherited Attributes: Computed from parent or sibling nodes.
2. Semantic Rules: Define how attributes are evaluated and are written as code snippets in the
production rules.

Components of SDT:

• Grammar: Specifies the syntactic structure of a language.
• Attributes: Provide additional information about grammar symbols.

Example:

For the production:

E → E1 + T
Semantic Rule: E.val = E1.val + T.val

2. Implementation of Syntax Directed Translators

A Syntax Directed Translator processes the input in phases:

1. Parsing Phase:
a. Constructs a Parse Tree based on the grammar.
b. Guides attribute evaluation based on syntax rules.
2. Translation Phase:
a. Uses semantic rules to generate intermediate code or target code.

Steps for Implementation:

1. Define a Grammar:
a. Write a grammar for the language.
2. Attach Semantic Rules:
a. Add semantic actions for each production.
3. Design an Attribute Evaluation Scheme:
a. Use an evaluation order (e.g., Postorder for Synthesized Attributes).
4. Code Generation:
a. Generate intermediate or target code during attribute evaluation.

3. Intermediate Code

Intermediate code is a representation of a program that serves as a bridge between high-level and
machine-level code. It is independent of the target architecture.

Features of Intermediate Code:

• Portable and easy to optimize.
• Represents operations and operands abstractly.

Common Intermediate Representations:

1. Three-Address Code (TAC):
a. Uses statements of the form x = y op z.
2. Abstract Syntax Tree (AST):
a. Represents the program structure hierarchically.
3. Postfix Notation:
a. Represents expressions without parentheses.

Example of TAC:

For the expression a = b + c * d:

t1 = c * d
t2 = b + t1
a = t2

4. Postfix Notation

Postfix notation (Reverse Polish Notation) is a representation of expressions where the operator
follows its operands. It eliminates the need for parentheses.
Rules:

1. Operators are written after their operands.
2. The order of operations is implicitly defined by the sequence.

Example:

Infix Expression: (a + b) * c
Postfix Notation: a b + c *
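A standard way to compute postfix from infix is an operator-stack (shunting-yard) conversion; the following is a minimal sketch for single-letter operands and left-associative + - * /, with the function name to_postfix chosen for this example.

```python
PREC = {'+': 1, '-': 1, '*': 2, '/': 2}

def to_postfix(tokens):
    out, ops = [], []
    for tok in tokens:
        if tok in PREC:
            # pop operators of higher or equal precedence (left-associative)
            while ops and ops[-1] in PREC and PREC[ops[-1]] >= PREC[tok]:
                out.append(ops.pop())
            ops.append(tok)
        elif tok == '(':
            ops.append(tok)
        elif tok == ')':
            while ops[-1] != '(':
                out.append(ops.pop())
            ops.pop()                  # discard the '('
        else:
            out.append(tok)            # operand goes straight to output
    while ops:
        out.append(ops.pop())
    return ' '.join(out)

print(to_postfix(list('(a+b)*c')))  # -> a b + c *
```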

5. Parse Tree

A Parse Tree is a graphical representation of the syntactic structure of a string according to a grammar. It represents the derivation of the input string.

Features:

1. The root represents the start symbol of the grammar.
2. Internal nodes correspond to grammar non-terminals.
3. Leaf nodes represent grammar terminals.

Example:

For the grammar:

E → E + T | T
T → id

The parse tree for id + id:

      E
    / | \
   E  +  T
   |     |
   T     id
   |
   id
6. Syntax Tree

A Syntax Tree is a simplified version of a parse tree where nodes represent meaningful constructs
rather than grammar symbols. It abstracts unnecessary details and focuses on the structure.

Difference from Parse Tree:

• Parse trees show all grammar symbols, including terminals and non-terminals.
• Syntax trees show only the essential operations and operands.

Example:

For the expression a + b * c, the syntax tree is:

    +
   / \
  a   *
     / \
    b   c

Summary Table

Topic                               | Key Points
Syntax Directed Translation Scheme  | Combines grammar and semantic rules to specify syntax and meaning.
Implementation of Translators       | Consists of parsing, attribute evaluation, and code generation phases.
Intermediate Code                   | Abstract representation like TAC or postfix notation; portable and architecture-neutral.
Postfix Notation                    | Operator follows operands, eliminates parentheses, e.g., a b + c *.
Parse Tree                          | Full representation of grammar derivation, includes all symbols.
Syntax Tree                         | Simplified parse tree focusing on operations and operands.

These notes provide a structured overview of Syntax Directed Translation and its related concepts in Compiler Design.
Algorithm for Minimization of DFA (Simple Steps)

Step 1: Identify and Remove Unreachable States

1. Start from the initial state and find all states that can be reached by any sequence of inputs.
2. Remove all states that are not reachable.

Step 2: Partition States into Groups

1. Group 1: Final states (accepting states).
2. Group 2: Non-final states (non-accepting states).

Step 3: Refine Groups

1. For each input symbol, check where states in the same group transition.
2. If two states in the same group transition to different groups, split them into separate
groups.
3. Repeat this process until no group can be split further.

Step 4: Form the Minimized DFA

1. Treat each group as a single new state.
2. Define transitions for the new states based on the original DFA.

Step 5: Create the Final Minimized DFA

1. Use the refined groups as the states.
2. The initial state of the minimized DFA is the group containing the original initial state.
3. The final states of the minimized DFA are groups containing the original final states.
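Steps 2 and 3 above (partition into final/non-final, then refine) can be sketched in Python. The example DFA is an assumption chosen for illustration: it accepts binary strings ending in 1, and states B and C turn out to be equivalent.

```python
STATES = {'A', 'B', 'C'}
FINAL = {'B', 'C'}
DELTA = {('A', '0'): 'A', ('A', '1'): 'B',
         ('B', '0'): 'A', ('B', '1'): 'C',
         ('C', '0'): 'A', ('C', '1'): 'C'}
ALPHABET = ['0', '1']

def minimize(states, final, delta, alphabet):
    # Step 2: initial partition {final states, non-final states}
    partition = [final, states - final]
    changed = True
    while changed:                       # Step 3: refine until stable
        changed = False
        new_partition = []
        for group in partition:
            # two states stay together only if every input symbol
            # sends them to the same group of the current partition
            def signature(s):
                return tuple(next(i for i, g in enumerate(partition)
                                  if delta[(s, a)] in g)
                             for a in alphabet)
            buckets = {}
            for s in group:
                buckets.setdefault(signature(s), set()).add(s)
            new_partition.extend(buckets.values())
            if len(buckets) > 1:
                changed = True           # a group was split; iterate again
        partition = new_partition
    return partition                     # each group is one minimized state

groups = minimize(STATES, FINAL, DELTA, ALPHABET)
print(sorted(sorted(g) for g in groups))  # -> [['A'], ['B', 'C']]
```

B and C merge because on 0 both go to A and on 1 both stay in the accepting group, so the minimized DFA has two states.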
Algorithm for Operator Precedence Parsing

Operator precedence parsing is used to parse expressions based on operator precedence and
associativity rules. Below are the steps in simple terms:

Step 1: Define Precedence Rules

• Create a precedence table for all operators.
• Example:
o * and / have higher precedence than + and -.
o Operators of the same precedence follow associativity rules (left-to-right or right-to-left).

Step 2: Initialize the Stack

• Start with a stack containing only the start symbol ($).
• Place the input expression (with $ at the end) on the input tape.

Step 3: Compare Precedence

• Compare the top of the stack with the current input symbol:
o If the stack symbol has lower precedence, push the input symbol onto the stack.
o If the stack symbol has higher or equal precedence, reduce by applying a grammar
rule.

Step 4: Perform Reduction

• Reduction means replacing the handle (a part of the string that matches the right-hand side
of a production rule) with the left-hand side of the rule.
• Repeat reductions until the stack symbol has lower precedence than the input symbol.

Step 5: Continue Parsing

• Repeat steps 3 and 4 until:
o The stack contains the start symbol $.
o The input tape contains only $.

Step 6: Accept or Reject

• If the stack and input tape both contain $, the input is accepted.
• Otherwise, the input is rejected.
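The compare-then-shift-or-reduce loop above can be sketched as a small expression evaluator. This is an illustrative sketch for single-digit numbers with + and * only; $ marks both ends of the input as in the algorithm, and the reductions directly compute values instead of building a tree.

```python
PREC = {'+': 1, '*': 2, '$': 0}

def evaluate(expr):
    tokens = list(expr) + ['$']
    values, ops = [], ['$']          # Step 2: stack starts with $
    i = 0
    while tokens[i] != '$' or ops != ['$']:
        tok = tokens[i]
        if tok.isdigit():
            values.append(int(tok)); i += 1
        elif PREC[ops[-1]] < PREC[tok]:
            ops.append(tok); i += 1  # Step 3: shift on lower precedence
        else:
            op = ops.pop()           # Step 4: reduce the handle
            b, a = values.pop(), values.pop()
            values.append(a + b if op == '+' else a * b)
    return values[0]

print(evaluate('2+3*4'))  # -> 14
```

Because * has higher precedence, 3*4 is reduced before the +, giving 14 rather than 20.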
Algorithm for Removing Left Recursion

Left recursion occurs in a grammar when a non-terminal refers to itself as the first symbol on the
right-hand side of a production. It can cause problems in top-down parsers like recursive descent.
Here's how to eliminate it:

Types of Left Recursion

1. Immediate Left Recursion:
a. A → Aα | β
b. Example: E → E + T | T

2. Indirect Left Recursion:
a. A → Bα, B → Aγ

Steps to Remove Immediate Left Recursion

Given a production: A → Aα | β

1. Separate the productions:
a. Left-recursive part: Aα
b. Non-left-recursive part: β
2. Rewrite the grammar:
a. Create a new non-terminal A'.
b. Update the grammar:
i. A → βA'
ii. A' → αA' | ε
(where ε means "empty string")

Example:

Input Grammar:

A → Aα | β

Transformed Grammar:

• A → βA'
• A' → αA' | ε
Steps to Remove Indirect Left Recursion

1. Reorder the Non-Terminals:

Arrange non-terminals so that indirect recursion is removed in a systematic way.

2. Replace Indirect Left Recursion:

For a non-terminal A that depends on another non-terminal B:

a. Substitute all productions of B into A.


3. Eliminate Immediate Left Recursion:

Apply the steps for immediate left recursion.

Example:

Input Grammar:

A → Bα | β

B → Aγ | δ

Step 1: Substitute B in A:

A → (Aγ | δ)α | β

Step 2: Simplify:

A → Aγα | δα | β

Step 3: Apply Immediate Left Recursion Elimination:

• A → δαA' | βA'
• A' → γαA' | ε

Example: Immediate Left Recursion Removal

Input Grammar:

E → E + T | T

Steps:

1. Separate:
a. Left-recursive part: E + T
b. Non-left-recursive part: T
2. Rewrite:
a. E → T E'
b. E' → + T E' | ε

Output Grammar:

• E → T E'
• E' → + T E' | ε

This algorithm ensures that the resulting grammar is free of left recursion and can be used for top-
down parsing techniques like recursive descent.
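The immediate-left-recursion transformation above is mechanical enough to express in code. This is a minimal sketch (the function name remove_left_recursion is illustrative): productions for A are split into left-recursive ones (Aα) and the rest (β), then rewritten as A → βA' and A' → αA' | ε.

```python
def remove_left_recursion(nt, productions):
    """Eliminate immediate left recursion from one non-terminal's rules.
    Each production is a list of symbols."""
    alphas = [p[1:] for p in productions if p and p[0] == nt]   # A -> A alpha
    betas = [p for p in productions if not p or p[0] != nt]     # A -> beta
    if not alphas:
        return {nt: productions}        # no left recursion, nothing to do
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in betas],                # A  -> beta A'
        new_nt: [alpha + [new_nt] for alpha in alphas] + [['ε']],  # A' -> alpha A' | eps
    }

# E -> E + T | T  becomes  E -> T E',  E' -> + T E' | ε
grammar = remove_left_recursion('E', [['E', '+', 'T'], ['T']])
print(grammar['E'])    # -> [['T', "E'"]]
print(grammar["E'"])   # -> [['+', 'T', "E'"], ['ε']]
```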
