Chapter 4 Semantic Analysis

Compiler Chapter 4

Department of Computer Science

Compiler Design

Chapter 4

Semantic Analysis

What is Semantic analysis?

 Semantic analysis is a compiler pass that adds semantic information to the parse tree and performs checks based on this information.
 The semantic information that is added and checked includes typing information (type checking) and the binding of variable and function names to their definitions (object binding).
 Semantic analysis also reports any semantic errors that it finds.
The following tasks are performed in semantic analysis:

 Disambiguating overloaded operators: If an operator is overloaded, the analyzer must determine which meaning of that operator applies in the given context.

 Type checking: The process of verifying and enforcing the constraints of types is called type checking.

 This may occur either at compile time (a static check) or at run time (a dynamic check).

 Static type checking is a primary task of the semantic analysis carried out by a compiler.

 Uniqueness checking: Check whether a variable name is unique within its scope.

 Name checks: Check whether any variable has a name that is not allowed.
 Beyond syntax analysis
☞ A parser cannot catch all program errors.
☞ There is a level of correctness that is deeper than syntax.
☞ Some language features cannot be modeled using the context-free grammar formalism.

A parser is limited in the semantic errors it can catch, because these errors lie deeper than syntax.
Typical semantic constraints, such as requiring a variable to be declared before it is used, cannot be modeled using the context-free grammar formalism.
If one tries to incorporate such features into the definition of a language, the language is no longer context free.
Example 1
 string x; int y; y = x + 3; The use of x is a type error (a string cannot be added to an integer).
 int a, b; a = b + c; Here, c is not declared.
 int x; char x; The identifier x is declared with two different data types in the same scope, which is a declaration conflict.
 A variable declared within one function cannot be used within the scope of another function unless it is declared there separately.
 These examples show the kind of work a compiler has to do beyond syntax analysis.
 They are only a sample; there are many more errors that syntax analysis cannot detect.
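The three errors in Example 1 can be reproduced with a very small symbol table. The following Python sketch is illustrative only: the names `Scope`, `check_add`, and `first_error` are hypothetical, and a single flat scope is assumed.

```python
class SemanticError(Exception):
    """Raised for errors that a context-free parser cannot detect."""


class Scope:
    """Symbol table for one scope: maps a declared name to its type."""

    def __init__(self):
        self.types = {}

    def declare(self, name, type_name):
        if name in self.types:
            raise SemanticError(f"'{name}' is already declared in this scope")
        self.types[name] = type_name

    def lookup(self, name):
        if name not in self.types:
            raise SemanticError(f"'{name}' is not declared")
        return self.types[name]


def check_add(scope, left, right):
    """Return the type of left + right; operands are variable names or int literals."""
    def type_of(operand):
        return "int" if isinstance(operand, int) else scope.lookup(operand)

    left_type, right_type = type_of(left), type_of(right)
    if left_type != right_type:
        raise SemanticError(f"cannot add {left_type} and {right_type}")
    return left_type


def first_error(action):
    """Run action and return the message of the semantic error it raises, or None."""
    try:
        action()
    except SemanticError as err:
        return str(err)
    return None


scope = Scope()
scope.declare("x", "string")
scope.declare("y", "int")
scope.declare("b", "int")

print(first_error(lambda: check_add(scope, "x", 3)))      # y = x + 3
print(first_error(lambda: check_add(scope, "b", "c")))    # a = b + c
print(first_error(lambda: scope.declare("x", "char")))    # int x; char x;
```

Each call reports one of the errors from the slide: a type mismatch, an undeclared name, and a conflicting redeclaration.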
What does the compiler need to know?
 Has a variable been declared, and what is its type?
 Is a variable a scalar, an array, or a function?
 Which declaration of the variable does each reference use?
 Is an expression type consistent?
 How many arguments does a function take?
 Are all invocations of a function consistent with its declaration?
 If an operator or function is overloaded, which version is being invoked?
 What are the inheritance relationships?

 If the compiler has the answers to all these and other questions, then it will be able to
successfully do a semantic analysis by using the generated parse tree.
How to answer these questions?
 To answer the previous questions, the compiler has to keep information about the types of variables, the number of parameters of each function, the inheritance relationships used, and so on.
 It has to do some computation in order to gain this information.
 Most compilers keep a structure called a symbol table to store this information.
 But how?
 In syntax analysis we used a context-free grammar.
 For semantic analysis we attach attributes to the grammar; the resulting attribute grammar can express context-sensitive information.
 An attribute grammar is a CFG in which attributes are attached to the terminal and non-terminal symbols.
Example of an attribute grammar:
E → E + T { E.value = E.value + T.value }

The production E → E + T is the CFG part; the part in braces is the semantic rule.

Semantic rules specify how the grammar should be interpreted.

Here, the values of the non-terminals E and T on the right-hand side are added together and the result is assigned to the non-terminal E on the left-hand side.
Attributes are assigned values from their domain at parse time, and these values are evaluated when they are used in assignments or conditions.
Based on the way the attributes get their values, they can be broadly divided into two
categories: synthesized attributes and inherited attributes.
 Synthesized attributes:

 These attributes get their values from the attribute values of their child nodes.
 To illustrate, assume the following production: S → ABC
 If an attribute of S takes its value from the child nodes (A, B, C), it is said to be a synthesized attribute, as the values of A, B, and C are synthesized into S.
 As in our previous example (E → E + T), the parent node E gets its value from its child nodes.
 Synthesized attributes never take values from their parent nodes or any sibling nodes.
 Inherited attributes:
 In contrast to synthesized attributes, inherited attributes can take values from the parent and/or siblings.
 As in the following production, S → ABC
 A can get values from S, B, and C. B can take values from S, A, and C. Likewise, C can take values from S, A, and B.

♠ Expansion: When a non-terminal is expanded to terminals as per a grammar rule.
♠ Reduction: When a terminal is reduced to its corresponding non-terminal according to a grammar rule.
 Semantic analysis uses Syntax Directed Translation (SDT) to perform the above tasks.
Syntax Directed Translation
 An SDT augments the grammar with rules that facilitate semantic analysis.

 SDT involves passing information bottom-up and/or top-down through the parse tree in the form of attributes attached to the nodes.

 There are two types of SDT:

 S-attributed SDT

 L-attributed SDT
A. S-attributed SDT
 If an SDT uses only synthesized attributes, it is called an S-attributed SDT.
 The semantic actions of an S-attributed SDT are written at the end of the production (after the right-hand side).
 S-attributed SDTs are evaluated during bottom-up parsing, as the values of the parent nodes depend upon the values of the child nodes.

For example, with the production rule S → ABC:

 S takes values from its child nodes (A, B, C); S is the root and A, B, and C are its children.
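Bottom-up evaluation of synthesized attributes amounts to a post-order walk of the parse tree. The sketch below is a minimal Python illustration, assuming the usual expression grammar E → E + T | T, T → T * F | F, F → digit; the `Node` class and the attribute names `val` and `lexval` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str                                   # grammar symbol or operator token
    children: list = field(default_factory=list)
    lexval: int = 0                               # supplied by the lexer for digit leaves
    val: int = 0                                  # synthesized attribute


def evaluate(node):
    """Post-order walk: compute the children's val first, then the parent's."""
    for child in node.children:
        evaluate(child)
    kids = node.children
    if node.symbol == "digit":
        node.val = node.lexval                    # F -> digit
    elif len(kids) == 3 and kids[1].symbol == "+":
        node.val = kids[0].val + kids[2].val      # E -> E + T
    elif len(kids) == 3 and kids[1].symbol == "*":
        node.val = kids[0].val * kids[2].val      # T -> T * F
    elif len(kids) == 1:
        node.val = kids[0].val                    # E -> T, T -> F
    return node.val


def digit(n):
    """Build the subtree F -> digit for the number n."""
    return Node("F", [Node("digit", lexval=n)])


# Parse tree for 3*4+5
t_3 = Node("T", [digit(3)])
t_3x4 = Node("T", [t_3, Node("*"), digit(4)])
e_left = Node("E", [t_3x4])
root = Node("E", [e_left, Node("+"), Node("T", [digit(5)])])

print(evaluate(root))  # 17
```

No node ever reads an attribute of its parent or a sibling, which is what makes the scheme S-attributed.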
B. L-attributed SDT
 This form of SDT uses both synthesized and inherited attributes, with the restriction that inherited attributes may not take values from right siblings.

 Semantic actions can be placed anywhere on the right-hand side.

 In L-attributed SDTs, a non-terminal can get values from its parent, child, and left sibling nodes. As in the following production S → ABC
 S can take values from A, B, and C (synthesized). A can take values from S only. B can take values from S and A. C can get values from S, A, and B.
 No non-terminal can get values from a sibling to its right.

 Attributes in L-attributed SDTs are evaluated in a depth-first, left-to-right traversal of the parse tree.
Every S-attributed SDT is also an L-attributed SDT.
 Terminals are assumed to have only synthesized attributes, the values of which are supplied by the lexical analyzer.
 The start symbol has no parent, hence no inherited attributes.
Example: parse tree for 3*4+5 n, where n is the newline character.
 Inherited attributes help to find the context (type, scope, etc.) of a token.
 Here the function addtype(id.entry, L.in) adds the type L.in to the symbol table entry of the identifier.
Example: parse tree for real x, y, z

Dependence of attributes in an inherited attribute system. The value of in (an inherited attribute) at the three L nodes gives the type of the three identifiers x, y, and z. These are determined by computing the value of the attribute T.type at the left child of the root and then evaluating L.in top-down at the three L nodes in the right subtree of the root.
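The top-down flow of L.in can be sketched in a few lines of Python. The figure's grammar is not in the text, so the sketch assumes the usual declaration grammar D → T L, L → L1 , id | id; the tuple encoding of the tree and the names `visit_D` and `visit_L` are hypothetical.

```python
symbol_table = {}


def addtype(entry, type_name):
    """Record the type of an identifier in the symbol table."""
    symbol_table[entry] = type_name


def visit_L(node, inherited_type):
    """node is ('L', left_L, id) for L -> L1 , id  or ('L', id) for L -> id."""
    if len(node) == 3:
        _, left, ident = node
        visit_L(left, inherited_type)      # L1.in = L.in (passed down from the parent)
    else:
        _, ident = node
    addtype(ident, inherited_type)         # addtype(id.entry, L.in)


def visit_D(type_name, l_node):
    """D -> T L : T.type comes from the keyword, then L.in = T.type."""
    visit_L(l_node, type_name)


# real x, y, z is assumed to parse as L(L(L(x), y), z)
visit_D("real", ("L", ("L", ("L", "x"), "y"), "z"))
print(symbol_table)
```

The type is computed once at T and handed down to every L node, so all three identifiers receive the type real.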
 Dependency Graph
 It is a directed graph indicating the interdependencies among the synthesized and inherited attributes of the nodes in a parse tree.
 If an attribute b depends on an attribute c, then the semantic rule for b must be evaluated after the semantic rule for c.
 These dependencies among the nodes are depicted by a directed graph called a dependency graph.

 An algorithm to construct the dependency graph: make one node for every attribute of every node of the parse tree; then, for each attribute, add an edge to it from each attribute on which it depends.
 The semantic rule A.a = f(X.x, Y.y) for the production A → XY defines the synthesized attribute a of A to be dependent on the attribute x of X and the attribute y of Y, so there are edges from X.x to A.a and from Y.y to A.a.
 Similarly, for the semantic rule X.x = g(A.a, Y.y) for the same production, which defines an inherited attribute, there will be an edge from A.a to X.x and an edge from Y.y to X.x.
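A valid evaluation order is any topological order of the dependency graph. The following Python sketch uses the standard-library `graphlib` module (Python 3.9 or later); the dictionary encoding and the function name `evaluation_order` are assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def evaluation_order(depends_on):
    """depends_on maps each attribute to the set of attributes it is computed from."""
    return list(TopologicalSorter(depends_on).static_order())


# A.a = f(X.x, Y.y): edges X.x -> A.a and Y.y -> A.a
synthesized = {"A.a": {"X.x", "Y.y"}, "X.x": set(), "Y.y": set()}
order = evaluation_order(synthesized)
print(order)  # X.x and Y.y in some order, then A.a

# If X.x = g(A.a, Y.y) held at the same node as well, A.a and X.x would
# depend on each other and no evaluation order would exist.
circular = {"A.a": {"X.x", "Y.y"}, "X.x": {"A.a", "Y.y"}, "Y.y": set()}
try:
    evaluation_order(circular)
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

The second graph shows why the dependency graph must be acyclic: with a cycle, no attribute in the cycle can be evaluated first.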
Part 2
What is Next?

Construction of Syntax Trees

Directed Acyclic Graph (DAG)

Type System

Type Checking

Type Conversions
 Construction of Abstract Syntax Trees

What is an abstract syntax tree (syntax tree)?

 An abstract syntax tree (syntax tree) is a tree in which each leaf node represents an operand, while each interior node represents an operator.
 The syntax is "abstract": it does not represent every detail of the real syntax, but rather the structural or content-related details.
 It is a condensed form of the parse tree.
Example 1: syntax tree for id * id + id
Example 2: (a) parse tree and (b) abstract syntax tree for a+b*c-d
Rules for constructing a syntax tree
 Each node in a syntax tree can be implemented as a record with multiple fields.
 In the node for an operator, one field identifies the operator and the remaining fields contain pointers to the nodes for the operands.
 The following functions are used to create the nodes of the syntax tree for expressions with binary operators.
 mknode (op, left, right) − It generates an operator node with label op and two fields containing pointers to left and right.
 mkunode (op, entry) − It generates a unary operator node with label op and a field containing a pointer to its operand.
 mkleaf (id, entry) − It generates an identifier node with label id and a field containing entry, a pointer to the symbol table entry for the identifier.
 mkleaf (num, val) − It generates a number node with label num and a field containing val, the value of the number.
For example, consider the following SDT:
E → E + T { E.ptr = mknode('+', E.ptr, T.ptr); }
E → T { E.ptr = T.ptr; }
T → T * F { T.ptr = mknode('*', T.ptr, F.ptr); }
T → F { T.ptr = F.ptr; }
F → id { F.ptr = mkleaf(id, id.entry); }

Using the above SDT (both production rules and semantic actions), let us build the syntax tree for a+b*c.
To compute the semantic action for the first production rule
 E → E + T { E.ptr = mknode('+', E.ptr, T.ptr); } we first have to compute T, and to compute T we first have to compute F. F can be computed directly because it derives an id.
 This shows that the tree is built bottom-up. p1, p2, and p3 are pointers to the leaves for the identifiers a, b, and c respectively; p4 and p5 are pointers to the operator nodes.

p1: mkleaf(id, id.entry): id, a
p2: mkleaf(id, id.entry): id, b
p3: mkleaf(id, id.entry): id, c
p4: mknode('*', T.ptr, F.ptr): *, p2, p3
p5: mknode('+', E.ptr, T.ptr): +, p1, p4
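The sequence p1 to p5 can be run directly. In this minimal Python sketch, nodes are plain tuples and the identifier's name stands in for the symbol table entry; both choices, and the helper `to_infix`, are assumptions made for illustration.

```python
def mkleaf(label, value):
    """Leaf node: label is 'id' or 'num'; value is the symbol table entry or the number."""
    return (label, value)


def mknode(op, left, right):
    """Interior node for a binary operator with pointers to both operands."""
    return (op, left, right)


# Same order as the bottom-up construction for a+b*c
p1 = mkleaf("id", "a")
p2 = mkleaf("id", "b")
p3 = mkleaf("id", "c")
p4 = mknode("*", p2, p3)
p5 = mknode("+", p1, p4)


def to_infix(node):
    """Write the tree back as a fully parenthesized expression."""
    if len(node) == 2:
        return str(node[1])
    op, left, right = node
    return f"({to_infix(left)} {op} {to_infix(right)})"


print(to_infix(p5))  # (a + (b * c))
```

The parenthesized output confirms that the multiplication node sits below the addition node, as operator precedence requires.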
Example 2: syntax tree for the expression a = b ∗ −c + d

Here the operand c has a unary operator (−), so we should use mkunode (op, entry).

Example 3: syntax tree for the statement if p = q then q = 2 * r
Directed Acyclic Graph (DAG)

A directed acyclic graph (DAG) for an expression is an abstract syntax tree in which common subexpressions are shared.

The node-construction functions are called in the same order as for a syntax tree.

Whenever the required node is already present, a pointer to the existing node is returned.
A new node is made only if it did not exist before.
Example: let us construct the DAG for the expression (a+b*c) − (d/(b*c))
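The rule "return the old node if it already exists" can be sketched with a dictionary keyed by the node's contents. This Python sketch is one possible realization, not the slide's own algorithm; the class `DagBuilder` and its fields are hypothetical.

```python
class DagBuilder:
    def __init__(self):
        self.nodes = {}    # node contents -> node index
        self.table = []    # node index -> node contents

    def _get(self, key):
        if key not in self.nodes:          # make a new node only if none exists
            self.nodes[key] = len(self.table)
            self.table.append(key)
        return self.nodes[key]             # otherwise return the existing node

    def mkleaf(self, name):
        return self._get(("id", name))

    def mknode(self, op, left, right):
        return self._get((op, left, right))


# (a+b*c) - (d/(b*c))
dag = DagBuilder()
a, b, c, d = (dag.mkleaf(name) for name in "abcd")
bc_first = dag.mknode("*", b, c)
left = dag.mknode("+", a, bc_first)
bc_second = dag.mknode("*", b, c)          # b*c already exists, so it is reused
right = dag.mknode("/", d, bc_second)
root = dag.mknode("-", left, right)

print(bc_first == bc_second, len(dag.table))
```

The syntax tree for this expression has 11 nodes, while the DAG has 8, because b*c and its two leaves are shared.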
Type system
 Type checking is an important aspect of semantic analysis.
 A type is a set of values together with the operations that are valid on them.
 For example, consider the type integer in C++: the modulus operation (%) can be applied only to values of integer types, and to no other types.
 A language's type system specifies which operations are valid for values of each type.
 A type checker verifies that the type of a construct matches the type expected by its context, and ensures that correct operations are applied to the values of each type.
 Languages can be classified into three main categories depending upon the type system they employ: untyped, statically typed, and dynamically typed.
 Untyped :
 There are no explicit types, and any operation may be performed on any data.
 Tcl, BCPL, and assembly languages belong to this category.
 Statically typed:
 In these languages, type checking is done at compile time.
 Before the source code is compiled, the type associated with every variable must be known. Such languages are often also described as strongly typed, although strictly speaking that is a separate property.
 Examples of languages in this category are Algol, C++, Java, Kotlin, and Scala.
 Dynamically typed :

 In dynamically typed languages, the type checking is done at the runtime.

 This means that variables are checked against types only when the program is executing.

 Programming languages like Lisp, PHP, JavaScript etc. have dynamic type checking.
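A static type checker of the kind described above walks the syntax tree before the program runs and computes a type for every expression. The Python sketch below is a simplified illustration: the tuple encoding, the function `check`, and the two-type universe are assumptions, and the int/float rule is loosely modeled on C++ arithmetic.

```python
def check(expr, env):
    """Return the type of expr, or raise TypeError if it is ill typed.

    expr is an int or float literal, a variable name, or a tuple (op, left, right).
    env maps variable names to 'int' or 'float'.
    """
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, float):
        return "float"
    if isinstance(expr, str):
        if expr not in env:
            raise TypeError(f"undeclared variable {expr}")
        return env[expr]
    op, left, right = expr
    left_type, right_type = check(left, env), check(right, env)
    if op == "%":
        # The modulus operator is valid only for integer operands
        if left_type != "int" or right_type != "int":
            raise TypeError("% requires int operands")
        return "int"
    # + - * /: the result is float as soon as one operand is float
    return "int" if left_type == right_type == "int" else "float"


env = {"i": "int", "f": "float"}
print(check(("%", "i", 2), env))     # int
print(check(("+", "i", "f"), env))   # float
```

Calling `check(("%", "f", 2), env)` raises a TypeError without evaluating anything, which is the essence of a static check.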
 Type Expressions
☞ Read from your handout
 Type Conversion
