Compiler Design Notes of Unit 4

Uploaded by Nayan Raut

SDT in Compiler Design

Syntax Directed Translation (SDT) is a powerful technique used in compiler construction to combine syntax analysis (checking the structure of the source code) with semantic analysis (verifying the meaning of the code). It achieves this by associating semantic actions with the grammar rules that define the programming language's syntax.

Key Concepts:

• Augmented Context-Free Grammar (ACFG): A standard context-free grammar (CFG) is extended with attributes attached to non-terminals and productions. These attributes hold semantic information that helps define the meaning of the code constructs.
• Attribute Evaluation: Semantic actions are attached to productions and are executed
during parsing. These actions compute the values of attributes based on:
o Lexical values of terminal symbols in the parse tree
o Constants defined in the compiler
o Attributes of other nodes in the parse tree (bottom-up or top-down evaluation)
• Parse Tree Construction: As the parser builds the parse tree, it also evaluates the
semantic actions associated with each production rule, populating the attributes with
meaningful values.
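The concepts above can be sketched concretely. The following is a minimal Python illustration (class and attribute names are assumptions) of a parse tree for the usual expression grammar E → E + T | T, T → T * F | F, F → digit, where each node carries a synthesized attribute `val` computed by evaluating children first:

```python
# Sketch: attribute evaluation on a parse tree (all names assumed).
class Node:
    def __init__(self, label, children=None, lexval=None):
        self.label = label            # grammar symbol or operator
        self.children = children or []
        self.lexval = lexval          # lexical value for terminals
        self.val = None               # synthesized attribute

def evaluate(node):
    """Evaluate children first, then apply this production's semantic action."""
    for c in node.children:
        evaluate(c)
    if node.label == "digit":
        node.val = node.lexval                            # F -> digit
    elif node.label == "F":
        node.val = node.children[0].val
    elif node.label in ("T", "E"):
        if len(node.children) == 3:                       # T -> T * F, E -> E + T
            left, op, right = node.children
            node.val = left.val * right.val if op.label == "*" else left.val + right.val
        else:                                             # unit production
            node.val = node.children[0].val
    return node.val

# Parse tree for 2 + 3 * 4
tree = Node("E", [
    Node("E", [Node("T", [Node("F", [Node("digit", lexval=2)])])]),
    Node("+"),
    Node("T", [
        Node("T", [Node("F", [Node("digit", lexval=3)])]),
        Node("*"),
        Node("F", [Node("digit", lexval=4)]),
    ]),
])
print(evaluate(tree))  # 14
```

The semantic actions here are the branches of `evaluate`; in a real compiler they would be attached to the productions and fired as the parser builds or reduces each subtree.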

Benefits of SDT:

• Clear and Structured Translation Rules: SDT provides a well-defined framework for specifying how source code constructs are translated, using grammar rules and semantic actions.
• Separation of Concerns: By separating syntax and semantics, SDT promotes
modularity and maintainability in compiler design. Changes to translation logic can be
made within the semantic actions without affecting the parsing process.
• Efficient Code Generation: SDT allows for optimizations during the translation
process, leading to the generation of efficient target code. This can involve techniques
like intermediate code generation and optimization passes.

Common Applications of SDT:

• Type Checking: Semantic actions can verify that expressions and assignments have
compatible types, ensuring type safety.
• Intermediate Code Generation: SDT can be used to construct an intermediate
representation of the source program, facilitating further analysis and optimization.
• Symbol Table Management: Attributes can store information about variables and
functions in the symbol table, enabling efficient access and type resolution.
• Error Handling: Semantic actions can detect errors in the source code at compile
time, providing better error messages to the programmer.
S-Attributed SDD

• Definition: An S-attributed SDD uses only synthesized attributes. These attributes are associated with the non-terminal on the left-hand side (LHS) of a production rule. They represent information that is computed bottom-up in the parse tree, meaning their values depend on the attributes of their children (the right-hand-side symbols).
• Evaluation: Semantic actions for S-attributed SDDs are placed at the end of the
production rule on the right-hand side. These actions compute the value of the
synthesized attribute for the LHS non-terminal based on the attributes of its children.
• Parsing: S-attributed SDDs are typically evaluated using bottom-up parsing. As the
parser encounters a production and reduces it, the semantic action is executed, and the
synthesized attribute of the parent (LHS) is computed based on its children's
attributes.
L-Attributed SDD

• Definition: An L-attributed SDD allows both synthesized attributes and inherited attributes. Inherited attributes are associated with non-terminals on the right-hand side (RHS) of a production rule. They represent information that is passed down from a parent non-terminal or from left siblings in the parse tree. However, L-attributed SDDs have a restriction: inherited attributes cannot take values from right siblings.
• Evaluation: Semantic actions in L-attributed SDDs can be placed anywhere in the
production rule. They can compute synthesized attributes based on children's
attributes or inherited attributes. Inherited attributes typically receive their values
from the parent node's semantic action.
• Parsing: L-attributed SDDs are usually evaluated using a depth-first, left-to-right
parsing strategy. The parser traverses the parse tree in this order, executing semantic
actions as it encounters them. This ensures that inherited attributes have already been
computed before they are used in other semantic actions.
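The left-to-right flow of an inherited attribute can be sketched with the classic declaration grammar D → T L (as in `int a, b, c`), where L inherits the type computed for T. A minimal Python sketch, with all names assumed:

```python
# Sketch of an L-attributed SDD (grammar and names assumed):
#   D -> T L      { L.inh = T.type }
#   L -> id , L1  { addtype(id, L.inh); L1.inh = L.inh }
#   L -> id       { addtype(id, L.inh) }
symtab = {}

def decl(type_name, ids):
    inh = type_name            # inherited attribute, set by the parent (D)
    for name in ids:           # flows left-to-right through the id list
        symtab[name] = inh     # the 'addtype' semantic action

decl("int", ["a", "b", "c"])
print(symtab)  # {'a': 'int', 'b': 'int', 'c': 'int'}
```

Note that each identifier's action uses only information from its parent and left context, which is exactly the L-attributed restriction.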
Bottom-Up Evaluation of S-Attributed Definitions

• Bottom-Up Parsing: This parsing approach starts with the input tokens and builds
the parse tree bottom-up by recognizing productions (rules) that match the tokens.
• S-Attributed Definitions: These definitions only involve synthesized attributes.
Synthesized attributes are associated with the non-terminal symbol on the left-hand
side (LHS) of a production rule. Their values depend on the values of the attributes
attached to the symbols on the right-hand side (RHS) of the rule.

Here's why S-attributed definitions are suitable for bottom-up parsing:


1. Information Flow: The information required to compute the value of a synthesized
attribute for the parent node (LHS) comes from its children nodes (RHS) in the
production rule. This aligns perfectly with the bottom-up parsing process, where child
nodes are processed first.
2. Parsing Mechanism: During bottom-up parsing, the parser maintains a stack. This
stack can be extended to hold not just grammar symbols but also their associated
attribute values. When a reduction (applying a production rule) occurs, the values of
the synthesized attributes for the child symbols on the stack are used to compute the
value of the synthesized attribute for the parent symbol (LHS) using the semantic
actions associated with the production rule.

Here are some additional details about the bottom-up evaluation process:

• The stack is typically implemented as a pair of arrays: one for states and one for
attribute values.
• Each state in the state array corresponds to a parsing table entry and guides the
parser's next action.
• The attribute value array holds the current values for the synthesized attributes
associated with the symbols on the stack.
• Semantic actions attached to production rules define how to compute the synthesized
attribute value for the parent based on the values of its children's attributes.
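The value-stack mechanism described above can be sketched as a hand-driven simulation (not a full LR parser; the function and variable names are assumptions). Each shift pushes a lexical value; each reduction pops the children's attribute values and pushes the parent's synthesized value:

```python
# Sketch: parallel attribute-value stack during bottom-up parsing.
def reduce_(val_stack, arity, action):
    """Pop the top `arity` values and push the result of the semantic action."""
    args = val_stack[-arity:]
    del val_stack[-arity:]
    val_stack.append(action(args))

vals = []
# Simulated parse of 3 + 4 * 5 (shifts push lexical values):
vals.append(3)
vals.append("+")
vals.append(4)
vals.append("*")
vals.append(5)
reduce_(vals, 3, lambda a: a[0] * a[2])   # T -> T * F : pushes 20
reduce_(vals, 3, lambda a: a[0] + a[2])   # E -> E + T : pushes 23
print(vals)  # [23]
```

In a real LR parser the states array and the value array would grow and shrink in lockstep, with the parsing table deciding when each reduction fires.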
Bottom-Up Evaluation of Inherited Attributes
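A common technique for handling inherited attributes in a bottom-up parser, when the SDD is L-attributed, exploits the fact that an inherited attribute is often just a copy of a synthesized attribute that already sits at a known position on the value stack; the semantic action reads that position instead of receiving the value top-down. A minimal sketch (the names and the fixed stack position are assumptions):

```python
# Sketch: reading an "inherited" value from a known value-stack position,
# for D -> T L, L -> id | id , L (names assumed).
symtab = {}
val = []                 # parallel attribute-value stack

val.append("int")        # reducing T -> int pushed the synthesized T.type

def reduce_id(name):
    # L's inherited attribute (= T.type) is found at its known stack
    # position rather than being passed down by the parent.
    symtab[name] = val[0]

for n in ["x", "y"]:
    reduce_id(n)
print(symtab)  # {'x': 'int', 'y': 'int'}
```

When the inherited value is not a simple copy, grammars are typically rewritten with marker non-terminals so that the needed value lands at a predictable stack offset before it is used.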
Type Checking and Type Conversion
Type Checking

• Definition: The process of verifying and enforcing the constraints of data types in a
program's source code. It ensures that operations are performed on compatible data
types, preventing errors and promoting program correctness.
• Benefits:
o Catches type mismatches early (at compile time) before the program runs,
leading to more reliable code.
o Improves code maintainability and readability by enforcing type consistency.
o Can help optimize memory usage by allocating the appropriate amount of
space for different data types.
• Types of Type Checking:
o Static Type Checking: Performed during compilation. The compiler analyzes
the program's source code and verifies that types are used consistently.
Examples: C, C++, Java.
o Dynamic Type Checking: Performed at runtime. The program itself checks
data types as it executes. More flexible but can lead to runtime errors if types
are incompatible. Examples: Python, JavaScript (partially).

• Implementation:
o Type information is usually associated with variables, expressions, and
function arguments during the semantic analysis phase of compilation.
o The type checker uses a set of rules defined by the programming language to
verify that operations make sense with the involved data types. It might
involve checking for:
▪ Assignment compatibility (e.g., assigning an integer to an integer
variable)
▪ Operator compatibility (e.g., adding two integers, not an integer and a
string)
▪ Function argument/return type matching
▪ Inheritance and polymorphism constraints in object-oriented languages
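The checking rules above can be sketched as a small recursive type checker. The type rules and the tuple encoding here are illustrative assumptions, not any particular language's specification:

```python
# Sketch: static type checking of binary expressions (rules assumed).
def check(expr):
    """expr is ('int', v), ('str', v), or (op, left, right); returns the type."""
    if expr[0] in ("int", "str"):
        return expr[0]                      # a typed literal
    op, left, right = expr
    lt, rt = check(left), check(right)      # check operands first
    if op == "+" and lt == rt == "int":
        return "int"                        # operator compatibility rule
    raise TypeError(f"cannot apply {op} to {lt} and {rt}")

print(check(("+", ("int", 1), ("int", 2))))   # int
# check(("+", ("int", 1), ("str", "a")))      # would raise TypeError
```

A production checker would attach such rules to the grammar's productions as semantic actions, consulting the symbol table for the types of names.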

Type Conversion

• Definition: The process of transforming data from one type to another. This allows
for operations between different data types but might involve loss of precision or data
integrity issues.
• Types of Type Conversion:
o Implicit Conversion (Automatic Casting): The compiler performs the
conversion automatically based on predefined rules. Examples:
▪ Widening conversion (converting a smaller type to a larger one, e.g.,
integer to float)
▪ Narrowing conversion (converting a larger type to a smaller one, e.g.,
float to integer, can lead to truncation)
o Explicit Conversion (Casting): The programmer explicitly instructs the
compiler to convert a value from one type to another. Example: int x =
(int) 3.14; (truncates the decimal part)
• Importance:
o Enables operations between different data types, providing flexibility in code.
o Can be used to optimize code for specific data types.

Relationship Between Type Checking and Type Conversion

• Type checking ensures that type conversions, whether implicit or explicit, are
performed safely and within the language's rules.
• Type conversions allow for data manipulation across different types, but type
checking guards against invalid conversions.
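The two kinds of conversion can be sketched in Python, whose semantics here happen to mirror the typical C-like rules described above (variable names are assumptions):

```python
# Sketch: implicit widening vs. explicit narrowing (C-like rules assumed).
i = 3
f = i + 0.5      # implicit widening: the int operand is converted to float
x = int(3.14)    # explicit narrowing: the fractional part is truncated
print(f, x)      # 3.5 3
```

The type checker's job is to permit the widening silently while requiring the narrowing to be written explicitly, since it may lose information.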

In summary, type checking is a crucial mechanism for ensuring program correctness and
reliability, while type conversion provides flexibility in data manipulation. A well-designed
compiler effectively balances these two aspects to produce robust and efficient code.
6.5 INTERMEDIATE CODE GENERATION
While translating a source program into a functionally equivalent object code representation, a parser may first
generate an intermediate representation. This makes retargeting of the code possible and allows some optimizations
to be carried out that would otherwise not be possible. The following are commonly used intermediate representations:
1. Postfix notation

2. Syntax tree

3. Three-address code

Postfix Notation

In postfix notation, the operator follows the operands. For example, for the expression (a − b) * (c + d) + (a − b), the postfix representation is:

a b − c d + * a b − +

Syntax Tree

The syntax tree is a condensed form of the parse tree. The operator and keyword nodes of the parse tree (Figure 6.5) are moved to their parent, and a chain of single productions is replaced by a single link (Figure 6.6).

Figure 6.5: Parse tree for the string id+id*id.


Figure 6.6: Syntax tree for id+id*id.

Three-Address Code

Three-address code is a sequence of statements of the form x = y op z. Since a statement involves no more than three references, it is called a "three-address statement," and a sequence of such statements is referred to as three-address code. For example, the three-address code for the expression a + b * c + d is:

t1 = b * c
t2 = a + t1
t3 = t2 + d

Sometimes a statement contains fewer than three references, but it is still called a three-address statement. The following three-address statements are used to represent various programming language constructs:

Used for representing arithmetic expressions:

x = y op z
x = op y
x = y

Used for representing Boolean expressions:

if x relop y goto L
goto L

Used for representing array references and dereferencing operations:

x = y[i]
x[i] = y
x = &y
x = *y
*x = y

Used for representing a procedure call:

param x

call p, n
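The arithmetic case can be sketched as a tiny three-address-code generator over expression trees (helper names such as `new_temp` and the tuple encoding are assumptions):

```python
# Sketch: generating three-address code for binary expressions.
temp_count = 0
code = []

def new_temp():
    global temp_count
    temp_count += 1
    return f"t{temp_count}"

def gen(expr):
    """expr is a variable name or (op, left, right); returns its address."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    l, r = gen(left), gen(right)          # emit code for operands first
    t = new_temp()
    code.append(f"{t} = {l} {op} {r}")    # one three-address statement
    return t

# a + b * c + d, mirroring the example above
gen(("+", ("+", "a", ("*", "b", "c")), "d"))
print("\n".join(code))
# t1 = b * c
# t2 = a + t1
# t3 = t2 + d
```

Generators like this are typically driven by semantic actions on the expression productions, with `gen` invoked once per reduction.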
6.6 REPRESENTING THREE-ADDRESS STATEMENTS
Three-address statements can be represented by records with fields for the operator and operands. One option is a record structure with four fields: the first holds the operator, the next two hold operand1 and operand2, respectively, and the last holds the result. This representation of a three-address statement is called a "quadruple

Quadruple Representation

Using quadruple representation, the three-address statement x = y op z is represented by placing op in the operator
field, y in the operand1 field, z in the operand2 field, and x in the result field. The statement x = op y, where op is a
unary operator, is represented by placing op in the operator field, y in the operand1 field, and x in the result field; the
operand2 field is not used. A statement like param t1 is represented by placing param in the operator field and t1 in the
operand1 field; neither the operand2 field nor the result field is used. Unconditional and conditional jump statements are represented by placing the target labels in the result field. For example, the quadruple representation of the three-address code for the statement x = (a + b) * − c/d is shown in Table 6.1. The numbers in parentheses are statement numbers.

Table 6.1: Quadruple Representation of x = (a + b) * − c/d

      Operator   Operand1   Operand2   Result
(1)      +          a          b         t1
(2)      −          c                    t2
(3)      *          t1         t2        t3
(4)      /          t3         d         t4
(5)      =          t4                   x
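Table 6.1 maps naturally onto four-field records. A sketch using Python tuples (the `uminus` spelling and `None` for an unused field are assumptions):

```python
# Sketch: quadruples for x = (a + b) * -c / d, following Table 6.1.
# Each record is (operator, operand1, operand2, result).
quads = [
    ("+",      "a",  "b",  "t1"),
    ("uminus", "c",  None, "t2"),   # unary minus: operand2 unused
    ("*",      "t1", "t2", "t3"),
    ("/",      "t3", "d",  "t4"),
    ("=",      "t4", None, "x"),    # copy: operand2 unused
]
for q in quads:
    print(q)
```

Because the result name is stored explicitly, a statement can be moved without touching the statements that use its result.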

Triple Representation

In the quadruple representation, the contents of the operand1, operand2, and result fields are normally pointers to the symbol-table records for the names they represent. Hence, temporary names must be entered into the symbol table as they are created. This can be avoided by using the position of a statement to refer to a temporary value. If this is done, then a record structure with three fields is enough to represent a three-address statement: the first holds the operator, and the next two hold operand1 and operand2, respectively. Such a representation is called a "triple representation". The contents of the operand1 and operand2 fields are either pointers to symbol-table records, or they are pointers to records (for temporary names) within the triple representation itself.
For example, a triple representation of the three-address code for the statement x = (a+b)*−c/d is shown in Table 6.2.
Table 6.2: Triple Representation of x = (a + b) * − c/d

      Operator   Operand1   Operand2
(1)      +          a          b
(2)      −          c
(3)      *         (1)        (2)
(4)      /         (3)         d
(5)      =          x         (4)

Indirect Triple Representation

Another representation uses an additional array to list the pointers to the triples in the desired order. This is called an
indirect triple representation. For example, the indirect triple representation of the three-address code for the statement x = (a + b) * − c/d is shown in Table 6.3.

Table 6.3: Indirect Triple Representation of x = (a + b) * − c/d

Statement   Triple   Operator   Operand1   Operand2
   (1)       (1)        +          a          b
   (2)       (2)        −          c
   (3)       (3)        *         (1)        (2)
   (4)       (4)        /         (3)         d
   (5)       (5)        =          x         (4)

Comparison

By using quadruples, we can move a statement that computes A without requiring any changes in the statements
using A, because the result field is explicit. However, in a triple representation, if we want to move a statement that
defines a temporary value, then we must change all of the pointers in the operand1 and operand2 fields of the records
in which this temporary value is used. Thus, quadruple representation is easier to work with when using an optimizing
compiler, which entails a lot of code movement. Indirect triple representation presents no such problems, because a
separate list of pointers to the triple structure is maintained. When statements are moved, this list is reordered, and no
change in the triple structure is necessary; hence, the utility of indirect triples is almost the same as that of quadruples.
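The point about indirect triples can be sketched directly: moving a statement reorders only the pointer list, while operand references into the triple array stay valid (the tuple encoding is an assumption):

```python
# Sketch: code motion with indirect triples. Triples never move; only the
# statement list (indices into `triples`) is reordered.
triples = [
    ("+",      "a", "b"),
    ("uminus", "c", None),
    ("*",      0,   1),     # operands refer to triples 0 and 1 by position
]
stmt_list = [0, 1, 2]       # execution order

# Swap the first two statements (e.g., during optimization):
stmt_list[0], stmt_list[1] = stmt_list[1], stmt_list[0]

# The positional operand references inside the "*" triple are untouched:
print(stmt_list, triples[2])  # [1, 0, 2] ('*', 0, 1)
```

With plain triples the same move would force rewriting every operand field that referenced the relocated positions, which is why optimizing compilers favor quadruples or indirect triples.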
