Unit 4 Symbol Table

Uploaded by

rizvimoid321
Copyright: © All Rights Reserved

Unit 4:

Symbol Table,
Run-time Storage Administration,
Error Detection and Recovery
Symbol Table
• The symbol table is an important data structure that a compiler uses to
keep track of the semantics of variables.
• It stores scope and binding information about names, and information
about instances of various entities.
• It is built during the lexical and syntax analysis phases.
• The information is collected by the analysis phases of the compiler
and is used by the synthesis phases to generate code.
• It is used by the compiler to achieve compile-time efficiency.
Use of the symbol table in the various phases of the
compiler
1. Lexical Analysis: Creates new entries in the table, for example entries for
tokens.
2. Syntax Analysis: Adds information such as attribute type, scope, dimension, line of
reference, and use to the table.
3. Semantic Analysis: Uses the information in the table to check semantics, i.e. to
verify that expressions and assignments are semantically correct (type checking), and updates
the table accordingly.
4. Intermediate Code Generation: Refers to the symbol table to find out how much run-time
storage is allocated and of what type; the table also holds temporary variable information.
5. Code Optimization: Uses the information in the symbol table for machine-dependent
optimization.
6. Target Code Generation: Generates code using the address information of the identifiers
in the table.
Symbol Table Entries
Items stored in the symbol table:
• Variable names
• Constants
• Procedure names
• Function names
• Literal constants and strings
• Compiler-generated temporaries
• Labels in source languages
Information used by the compiler from the symbol table:
• Data type and name
• Declaring procedures
• Offset in storage
• If structure or record, a pointer to the structure table
• For parameters, whether passing is by value or by reference
• Number and type of arguments passed to a function
• Base address
Basic operations of Symbol table
Feature of Symbol Table
• Insert
• Delete
• Lookup
• Modify
Name Representation in Symbol Table

1. Fixed length Name Representation

2. Variable length Name Representation


Fixed length Name Representation
• A fixed amount of space is allocated for every name.
• Space is wasted when a name is much shorter than the fixed field.
Example:
Name Attribute
S U M M A T I O N
S U M
N
N 1
Variable length Name Representation
• Each name takes only as much space as its string requires.
• All names are stored in one character array, and each table entry records the
starting index and length of its name.
Example (each name ends with a $ terminator, which is counted in its length):
Name                        Attribute
Starting Index   Length
0                10
10               4
14               2
16               3

Character array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
S U M M A T I O N $ S  U  M  $  N  $  N  1  $
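The variable-length scheme can be sketched in C as follows. This is a minimal illustration, not a reference implementation: the names pool_add and pool_equals, the struct name_ref, and both capacity constants are assumptions made for the example. Only the start/length pairs and the $ terminator come from the slide.

```c
#include <assert.h>
#include <string.h>

#define POOL_SIZE 256   /* capacity of the shared character array (assumed) */
#define MAX_NAMES 32    /* capacity of the descriptor table (assumed) */

/* One descriptor per name: where it starts in the pool and how long it is.
   As on the slide, the length counts the '$' terminator. */
struct name_ref {
    int start;
    int length;
};

static char pool[POOL_SIZE];
static int pool_used = 0;
static struct name_ref names[MAX_NAMES];
static int name_count = 0;

/* Append a name plus '$' to the pool; return its descriptor index, or -1 if full. */
int pool_add(const char *name)
{
    assert(name != NULL);
    int len = (int)strlen(name) + 1;          /* +1 for the '$' terminator */
    if (name_count == MAX_NAMES || pool_used + len > POOL_SIZE)
        return -1;
    memcpy(pool + pool_used, name, (size_t)(len - 1));
    pool[pool_used + len - 1] = '$';
    names[name_count].start = pool_used;
    names[name_count].length = len;
    pool_used += len;
    return name_count++;
}

/* Compare the stored name at descriptor i with a C string. */
int pool_equals(int i, const char *name)
{
    size_t len = strlen(name);
    return names[i].length == (int)len + 1 &&
           memcmp(pool + names[i].start, name, len) == 0;
}
```

Adding SUMMATION, SUM, N and N1 in that order yields the (start, length) pairs (0, 10), (10, 4), (14, 2) and (16, 3) from the example.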
Data Structure used for Symbol Table
• Requirements for symbol table:
• For quick insertion of identifier and related information
• For quick searching of identifier

• Data structure of Symbol table


1. List data structure
2. Self Organizing List
3. Hash table
1. List data structure
• A linear list is the simplest mechanism for implementing a symbol table.
• An array is used to store names and their associated information.
• New names are added to the table in the order in which they arrive.
• A pointer “Available” is maintained at the end of the stored records.
• Advantage: takes the minimum amount of space.
Name 1 Info 1
Name 2 Info 2
Name 3 Info 3
. .
. .
. .
Name n Info n

Available
(start of empty slot)
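A minimal C sketch of the linear list, covering the insert and lookup operations. The function names list_insert and list_lookup, the single int info field, and the capacity constants are assumptions for illustration. The index available plays the role of the “Available” pointer.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* table capacity (assumed) */
#define MAX_NAME    32   /* fixed name field width (assumed) */

struct symbol {
    char name[MAX_NAME];
    int  info;            /* stands in for type, offset, and other attributes */
};

static struct symbol table[MAX_SYMBOLS];
static int available = 0; /* index of the first empty slot ("Available") */

/* Scan from the newest entry back to the oldest; return the index or -1. */
int list_lookup(const char *name)
{
    assert(name != NULL);
    for (int i = available - 1; i >= 0; i--)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Append a new record at "Available"; return its index, or -1 on a
   duplicate name, an over-long name, or a full table. */
int list_insert(const char *name, int info)
{
    if (available == MAX_SYMBOLS || strlen(name) >= MAX_NAME ||
        list_lookup(name) != -1)
        return -1;
    strcpy(table[available].name, name);
    table[available].info = info;
    return available++;
}
```

Every lookup scans the stored records one by one, so the space saving comes at the price of search time that grows with the number of names.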
2. Self Organizing List
• A linked list is used.
• A link field is added to each record.
• Records are searched in the order given by the link fields.
• A pointer “first” is maintained to point to the first record of the symbol
table.
Name 1 Info 1
Name 2 Info 2

start Name 3 Info 3


Name 4 Info 4
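A possible C sketch of such a list. The slide does not say how the list reorganizes itself; the sketch assumes the common move-to-front heuristic, in which a record that is found is relinked to the head. The names sol_insert and sol_lookup are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    const char  *name;   /* not copied: caller keeps the string alive */
    int          info;
    struct node *link;   /* the link field added to each record */
};

static struct node *first = NULL;   /* the "first" pointer */

/* Add a record at the head of the list; return 0 on success, -1 if out of memory. */
int sol_insert(const char *name, int info)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->name = name;
    n->info = info;
    n->link = first;
    first = n;
    return 0;
}

/* Follow the link fields; on a hit, move the record to the front so that
   frequently used names are found sooner next time (assumed heuristic). */
struct node *sol_lookup(const char *name)
{
    assert(name != NULL);
    struct node *prev = NULL;
    for (struct node *cur = first; cur != NULL; prev = cur, cur = cur->link) {
        if (strcmp(cur->name, name) == 0) {
            if (prev != NULL) {          /* not already first: relink */
                prev->link = cur->link;
                cur->link = first;
                first = cur;
            }
            return cur;
        }
    }
    return NULL;
}
```

After inserting a, b and c the order is c, b, a; a lookup of a changes it to a, c, b.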
3. Hash Tables
• Hashing is an important method for searching the records in a symbol table.
• This method is superior to list organization.
• Two tables are maintained:
• 1. Hash table
• 2. Symbol table
• The hash table consists of k entries, numbered 0 to k-1. Each entry is a
pointer into the symbol table, pointing to a name stored there.
• To determine whether a ‘Name’ is in the symbol table, we use a hash
function ‘h’:
position = h(Name)
• Advantage: quick search
• Disadvantages:
• complicated to implement
• extra space is required
• obtaining the scope of variables is very difficult
• collisions must be resolved (by open addressing, chaining, or rehashing)
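The two-table arrangement might look like this in C. It is a sketch under assumptions: collisions are resolved by chaining (one of the three options listed), the hash function is an arbitrary multiplicative string hash, and ht_init, ht_lookup, ht_insert and the constants K and MAX_SYMBOLS are names invented for the example.

```c
#include <assert.h>
#include <string.h>

#define K           11   /* number of hash table entries, indices 0..K-1 (assumed) */
#define MAX_SYMBOLS 64   /* symbol table capacity (assumed) */

struct symbol {
    const char *name;    /* not copied: caller keeps the string alive */
    int         info;
    int         next;    /* next symbol in the same bucket, or -1 */
};

static struct symbol symtab[MAX_SYMBOLS];
static int symcount = 0;
static int hashtab[K];   /* each entry: index of a symbol, or -1 when empty */

void ht_init(void)
{
    symcount = 0;
    for (int i = 0; i < K; i++)
        hashtab[i] = -1;
}

/* position = h(Name): a simple multiplicative string hash (an arbitrary choice). */
unsigned h(const char *name)
{
    unsigned v = 0;
    while (*name != '\0')
        v = v * 31u + (unsigned char)*name++;
    return v % K;
}

/* Return the symbol table index of name, or -1 if it is absent. */
int ht_lookup(const char *name)
{
    assert(name != NULL);
    for (int i = hashtab[h(name)]; i != -1; i = symtab[i].next)
        if (strcmp(symtab[i].name, name) == 0)
            return i;
    return -1;
}

/* Insert name if new; colliding names are chained within the bucket.
   Returns the symbol's index, or -1 when the symbol table is full. */
int ht_insert(const char *name, int info)
{
    int found = ht_lookup(name);
    if (found != -1 || symcount == MAX_SYMBOLS)
        return found;
    unsigned pos = h(name);
    symtab[symcount].name = name;
    symtab[symcount].info = info;
    symtab[symcount].next = hashtab[pos];
    hashtab[pos] = symcount;
    return symcount++;
}
```

A lookup inspects only the names that share one bucket, which is where the speed advantage over the list organizations comes from.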
2. Run-time Storage
Organization
Introduction
• The compiler requests a block of memory from the operating system.
• The compiler uses this block of memory for running (executing) the compiled program.
• This block of memory is called run-time storage.


Division of Run-Time Storage to hold code
and data

1. The generated target code (fixed size, static)


2. Data objects (static)
3. Information which keeps track of procedure activation (dynamic)
Storage Allocation Strategies

1. Static allocation
2. Stack Allocation
3. Heap Allocation
1. Static Allocation
• The size of each data object is known at compile time.
• The binding of an object to its allocated storage does not change at run time (hence
“static” allocation).
• The compiler can determine the amount of storage required by each data object, so it is
easy to find the addresses of data objects in the activation record.
Limitations of Static Allocation
• Static allocation can be done only if the size of data object is known
at compile time.
• Data structures cannot be created dynamically (static allocation
cannot manage the allocation of memory at run time).
• Recursive procedures are not supported by static allocation.
2. Stack Allocation
• Storage is organized as a stack (LIFO).
• This stack is called the control stack.
• When an activation begins, its activation record is pushed onto the stack; on
completion of the activation, the record is popped.
• The locals are stored in each activation record. Hence locals are bound to
the corresponding activation record on each fresh activation.
• Data structures can be created dynamically with stack allocation.
Limitations:
• Memory is addressed through pointer and index registers. Hence
stack allocation is slower than static allocation.
3. Heap Allocation
• If the values of non-local variables must be retained even after the activation
ends, stack allocation cannot retain them. To overcome this
limitation, we use heap allocation.
• Based on a linked list
• Dynamic allocation
• Heap allocation allocates a contiguous block of memory when required
for storage of activation records or other data objects (malloc).
• The free function is used for deallocation.
• Efficient allocation strategy:
• Create a linked list of free blocks; when any memory is deallocated, that block
of memory is appended to the linked list.
• Allocate the most suitable block of memory from the linked list (use the best-fit technique
for the allocation of a block).
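The free-list strategy can be sketched in C as below. This is a simplified illustration: struct free_block, best_fit_take and free_list_append are hypothetical names, block splitting and coalescing are left out, and a real allocator would keep the block header inside the free memory itself.

```c
#include <assert.h>
#include <stddef.h>

/* A free block on the linked list (header only, for illustration). */
struct free_block {
    size_t size;
    struct free_block *next;
};

/* Best fit: unlink and return the smallest block that holds `want` bytes,
   or NULL if no block is large enough. */
struct free_block *best_fit_take(struct free_block **head, size_t want)
{
    assert(head != NULL);
    struct free_block **best = NULL;   /* link that points at the best block */
    for (struct free_block **p = head; *p != NULL; p = &(*p)->next)
        if ((*p)->size >= want && (best == NULL || (*p)->size < (*best)->size))
            best = p;
    if (best == NULL)
        return NULL;
    struct free_block *chosen = *best;
    *best = chosen->next;              /* unlink from the free list */
    chosen->next = NULL;
    return chosen;
}

/* Deallocation: append the block to the end of the free list. */
void free_list_append(struct free_block **head, struct free_block *blk)
{
    struct free_block **p = head;
    while (*p != NULL)
        p = &(*p)->next;
    blk->next = NULL;
    *p = blk;
}
```

With free blocks of 64, 16 and 32 bytes, a request for 20 bytes takes the 32-byte block, because it is the smallest one that still fits.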
Activation Record
• Manages the information needed by a single execution of a procedure.
• An activation record is pushed onto the stack when a procedure is
called, and it is popped when control returns to the caller.
Various fields of Activation Record
• Return Value: It is used by the called procedure to return a value to the calling
procedure.
• Actual Parameter: It is used by calling procedures to supply parameters
to the called procedures.
• Control Link (Dynamic Link): It points to activation record of the
caller. (optional)
• Access Link (Static Link): It is used to refer to non-local data held in
other activation records. (optional)
• Saved Machine Status: It holds the information about status of machine
before the procedure is called. (Machine register + Program counter)
• Local Data: It holds the data that is local to the execution of the
procedure.
• Temporaries: It stores the value that arises in the evaluation of an
expression.
Example: Formation of activation record
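int factorial(int n);   /* declared before use in main */

int main(void)
{
    int f;
    f = factorial(3);
    return 0;
}

/* n <= 1 ends the recursion, so factorial(0) also terminates */
int factorial(int n)
{
    if (n <= 1)
        return 1;
    else
        return n * factorial(n - 1);
}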
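Control stack during the calls (figures not reproduced):
• First call of factorial function: the record for factorial(3) is pushed above main.
• Second call of factorial function: the record for factorial(2) is pushed above factorial(3).
• Third call of factorial function: the record for factorial(1) is pushed above factorial(2) and returns 1.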
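The activation record fields and the push/pop behaviour can be made explicit in a C sketch. Everything here is illustrative: the struct layout, the field sizes, the fixed-size control_stack, and the names ar_push, ar_pop and factorial_traced are assumptions, and a real compiler emits this bookkeeping as machine code rather than calling helper functions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16   /* control stack capacity (assumed) */

/* Field order follows the list of fields; sizes are illustrative. */
struct activation_record {
    int   return_value;                      /* filled in by the callee */
    int   actual_param;                      /* one int parameter, e.g. n */
    struct activation_record *control_link;  /* caller's record */
    struct activation_record *access_link;   /* enclosing scope's record; unused here */
    int   saved_pc;                          /* stands in for saved machine status */
    int   local_data;
    int   temporary;
};

static struct activation_record control_stack[MAX_DEPTH];
static int top = 0;   /* number of live activation records */

/* Push a record when a procedure is called. */
struct activation_record *ar_push(int param, int return_pc)
{
    assert(top < MAX_DEPTH);
    struct activation_record *ar = &control_stack[top];
    ar->actual_param = param;
    ar->control_link = (top > 0) ? &control_stack[top - 1] : NULL;
    ar->access_link = NULL;
    ar->saved_pc = return_pc;
    top++;
    return ar;
}

/* Pop the record when control returns to the caller. */
void ar_pop(void)
{
    assert(top > 0);
    top--;
}

/* factorial with its activations made explicit on the control stack;
   *max_depth records the largest number of live records seen. */
int factorial_traced(int n, int *max_depth)
{
    struct activation_record *ar = ar_push(n, 0);
    if (top > *max_depth)
        *max_depth = top;
    if (n <= 1) {
        ar->return_value = 1;
    } else {
        ar->temporary = factorial_traced(n - 1, max_depth);  /* value of the sub-call */
        ar->return_value = n * ar->temporary;
    }
    int result = ar->return_value;
    ar_pop();
    return result;
}
```

Calling factorial_traced(3, &depth) with depth starting at 0 leaves three factorial records live at the deepest point, one per call, and the stack is empty again once the outermost call returns.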
Block Structure and Non-block Structure
Storage Allocation

1. Local Data

2. Non-Local Data
Local Data
• Local data is accessed through the activation record.
Address of any variable x in a procedure
= base pointer (pointing to the start of the procedure's activation record)
+ offset of x from the base pointer
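The base-plus-offset rule can be sketched in C in two halves: the compiler assigns each local an offset, and at run time the address is formed by adding that offset to the base pointer. The names frame_layout, layout_add_local and local_address are hypothetical, and alignment padding is ignored to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_SIZE 64   /* size of one activation record's local area (assumed) */

/* Compile time: hand out offsets to locals in declaration order. */
struct frame_layout {
    size_t next_offset;
};

size_t layout_add_local(struct frame_layout *fl, size_t size)
{
    assert(fl->next_offset + size <= FRAME_SIZE);
    size_t offset = fl->next_offset;
    fl->next_offset += size;
    return offset;
}

/* Run time: address of x = base pointer + offset of x. */
void *local_address(unsigned char *base, size_t offset)
{
    return base + offset;
}
```

For two int locals declared in order, the first gets offset 0 and the second gets offset sizeof(int).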
Example:
Procedure A
    int a;
    Procedure B
        int b;
        body of B
    body of A
Access to Non-Local Names
• For nonlocal names, there are two types of scope rule:
• 1. Static: (PASCAL, C, and ADA languages)
• 2. Dynamic: (LISP and SNOBOL languages)
Static Scope or Lexical Scope
• Block structured language
• Block: sequence of statements containing the local data representation and enclosed
within the delimiters
Example:
{
Declaration statements
……..
}
The declarations visible at a program point are:
1. Declarations made locally in the procedure
2. The names of all enclosing procedures
3. The declarations of names made immediately within such procedures
Example: Static scope of declarations for
local variables
function()
{   // B1
    int p, q;
    {   // B2
        int p;
        {   // B3
            int r;
        }
    }
    {   // B4
        int q, s;
    }
}
Lexical Scope for Nested Procedure
• Nested Procedures
• Nested Depth
Implementation of Lexical Scope
1. Access link:
• Lexical scope can be implemented by keeping a pointer
in each activation record. These pointers are called access links.
2. Display:
• It is expensive to traverse the chain of access links every time a
nonlocal variable is accessed.
• Access to non-locals can be sped up by maintaining an
array of pointers called a display.
Example:
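void B(int i); void C(void); void A(void);   /* forward declarations */

int main(void)          /* called test() on the slide */
{ int a; B(1); return 0; }

void B(int i)
{ int b;
  if (i != 0)
      B(i - 1);
  else
      C(); }

void C(void)
{ int k; A(); }

void A(void)
{ int d; d = 1; }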
Display
p0()
{
    p1()
    {
        p2()
        {….}
    }
    p3()
    {
        p4()
        {…..}
    }
}
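One common way to maintain a display is sketched below in C: on entry to a procedure at nesting depth d, the old display[d] is saved in the new frame and replaced, and on exit it is restored. The struct frame layout and the names display_enter, display_exit and nonlocal are assumptions for the example. Nesting depths follow the p0..p4 outline, with p0 at depth 0, p1 and p3 at depth 1, and p2 and p4 at depth 2.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEST 8   /* deepest nesting level supported (assumed) */

struct frame {
    int depth;              /* nesting depth of the procedure: p0 = 0, p1 = 1, ... */
    struct frame *saved;    /* display entry this frame replaced */
    int locals[4];          /* illustrative local storage */
};

static struct frame *display[MAX_NEST];

/* On entry to a procedure at the given nesting depth:
   save the old display entry, then point it at the new frame. */
void display_enter(struct frame *f, int depth)
{
    assert(depth >= 0 && depth < MAX_NEST);
    f->depth = depth;
    f->saved = display[depth];
    display[depth] = f;
}

/* On exit: restore the entry that was saved on entry. */
void display_exit(struct frame *f)
{
    display[f->depth] = f->saved;
}

/* A non-local at a given depth is reached with one array index,
   with no chain of access links to follow. */
int *nonlocal(int depth, int slot)
{
    return &display[depth]->locals[slot];
}
```

While p2 runs, a variable of p0 is reached as display[0] plus its slot, whatever the length of the call chain in between.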
Error detection and Recovery in Compiler
• In this phase of compilation, all possible errors made by the user are
detected and reported to the user in the form of error messages. This
process of locating errors and reporting them to the user is called the error
handling process.
• Functions of Error handler
• Detection
• Reporting
• Recovery
Classification of Errors
Lexical phase errors
• Errors detected during the lexical analysis phase.
• Types of lexical errors:
1. Exceeding the permitted length of an identifier or numeric constant
2. Appearance of illegal characters, e.g. the trailing $ in printf(“Compiler Design”);$
3. Unmatched string or comment delimiters (e.g. a missing beginning or end of a comment)
Error recovery for Lexical Error
• Panic Mode Recovery
• In this method, successive characters from the input are removed one at a time
until one of a designated set of synchronizing tokens is found.
• Synchronizing tokens are delimiters such as ; or }
• The advantage is that it is easy to implement and is guaranteed not to go into an infinite
loop.
• The disadvantage is that a considerable amount of input is skipped without
checking it for additional errors.
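Panic mode at the character level fits in a few lines of C. This is a sketch only: panic_skip is a hypothetical helper, and the synchronizing set is passed in as a string of single-character delimiters such as ";}".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skip input from `pos` until a synchronizing character is found.
   Returns the position just past that character, or the length of the
   input if none remains. The scan only moves forward, so it always ends. */
size_t panic_skip(const char *input, size_t pos, const char *sync_set)
{
    assert(input != NULL && sync_set != NULL);
    size_t len = strlen(input);
    while (pos < len && strchr(sync_set, input[pos]) == NULL)
        pos++;                       /* discard one character at a time */
    return (pos < len) ? pos + 1 : len;
}
```

For the input "x = 1 @@ 2; y = 3;" with an error detected at the first @, scanning resumes right after the first semicolon. The skipped text "@@ 2" is never examined, which is the disadvantage noted above.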
Syntactic Phase Errors
• errors detected during syntax analysis phase
• Types of syntax errors are
1. Errors in structure
2. Missing operator
3. Misspelled keywords
4. Unbalanced parentheses
Example :
swicth(ch)
{
.......
.......
}
• The keyword switch is incorrectly written as swicth. Hence,
“Unidentified keyword/identifier” error occurs.
Error Recovery for Syntactic Phase Errors
1. Panic Mode Recovery
2. Statement Mode Recovery
3. Error Production
4. Global Correction
1. Panic Mode Recovery
• In this method, successive tokens from the input are discarded one at a
time until one of a designated set of synchronizing tokens is found.
Synchronizing tokens are delimiters such as ; or }
• Advantage:
• easy to implement
• guarantees not to go to infinite loop
• Disadvantage:
• considerable amount of input is skipped without checking it for additional
errors
2. Statement Mode recovery
• In this method, when the parser encounters an error, it performs the
necessary correction on the remaining input so that the rest of the input
statement allows the parser to parse ahead.
• The correction can be deleting extra semicolons, replacing a comma
with a semicolon, or inserting a missing semicolon.
• While performing the correction, great care must be taken not to go
into an infinite loop.
• Disadvantage:
• it is difficult to handle situations where the actual error occurred before the point
of detection.
3. Error production
• If the common errors that can be encountered are known in advance,
they can be incorporated by augmenting the grammar with
error productions that generate the erroneous constructs.
• When such a production is used during parsing, an appropriate error message can be
generated and parsing can continue.
• Disadvantage:
• difficult to maintain.
4. Global Correction
• The parser examines the whole program and tries to find the
closest error-free match for it.
• The closest match is the program that needs the fewest insertions, deletions,
and changes of tokens to recover from the erroneous input.
• Due to its high time and space complexity, this method is not
implemented in practice.
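The notion of “closest match” can be made concrete with an edit distance over tokens. The C sketch below computes only the distance between two given token sequences, using the standard Levenshtein recurrence with two rows. It is not global correction itself, which would have to search over all error-free programs. The name token_edit_distance is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Minimum number of token insertions, deletions, and replacements that turn
   token sequence a[0..n) into b[0..m). Returns (size_t)-1 if out of memory. */
size_t token_edit_distance(const char *const a[], size_t n,
                           const char *const b[], size_t m)
{
    assert(a != NULL || n == 0);
    size_t *prev = malloc((m + 1) * sizeof *prev);
    size_t *cur  = malloc((m + 1) * sizeof *cur);
    if (prev == NULL || cur == NULL) {
        free(prev);
        free(cur);
        return (size_t)-1;
    }
    for (size_t j = 0; j <= m; j++)
        prev[j] = j;                         /* build b's prefix from nothing */
    for (size_t i = 1; i <= n; i++) {
        cur[0] = i;                          /* delete all of a's prefix */
        for (size_t j = 1; j <= m; j++) {
            size_t change = strcmp(a[i - 1], b[j - 1]) == 0 ? 0 : 1;
            cur[j] = min3(prev[j] + 1,           /* delete a[i-1] */
                          cur[j - 1] + 1,        /* insert b[j-1] */
                          prev[j - 1] + change); /* keep or replace */
        }
        size_t *tmp = prev; prev = cur; cur = tmp;
    }
    size_t result = prev[m];
    free(prev);
    free(cur);
    return result;
}
```

The token sequence swicth ( ch ) { } is at distance 1 from switch ( ch ) { }, since one token change repairs it.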
Semantic errors
• Errors detected during semantic analysis phase
• Types of semantic errors:
1. Incompatible type of operands
2. Undeclared variables
3. Mismatch between actual arguments and formal parameters
Example
int a[10], b;
.......
.......
a = b;

• It generates a semantic error because the types of a and b are incompatible:
a is an array of int, while b is an int.
Error recovery for Semantic Error
• If the error “Undeclared Identifier” is encountered, then to recover from
it a symbol table entry for the corresponding identifier is made.
• If the data types of two operands are incompatible, then automatic type
conversion is done by the compiler.
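Recovery from an undeclared identifier might be sketched in C like this. The placeholder type TYPE_ERROR, the linear table, and the names declare and resolve are assumptions for the example. The point illustrated is that once the placeholder entry exists, later uses of the same name no longer raise the error.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 64            /* table capacity (assumed) */

enum sym_type { TYPE_INT, TYPE_ERROR };   /* TYPE_ERROR marks a recovery entry */

struct symbol {
    const char   *name;           /* not copied: caller keeps the string alive */
    enum sym_type type;
};

static struct symbol symbols[MAX_SYMBOLS];
static int symbol_count = 0;
static int error_count = 0;

/* Record a declaration and return its table index. */
int declare(const char *name, enum sym_type type)
{
    assert(symbol_count < MAX_SYMBOLS);
    symbols[symbol_count].name = name;
    symbols[symbol_count].type = type;
    return symbol_count++;
}

/* Resolve a use of name. If it was never declared, report the error once
   and enter a placeholder so that later uses resolve without a new message. */
int resolve(const char *name)
{
    for (int i = 0; i < symbol_count; i++)
        if (strcmp(symbols[i].name, name) == 0)
            return i;
    fprintf(stderr, "error: undeclared identifier '%s'\n", name);
    error_count++;
    return declare(name, TYPE_ERROR);
}
```

Resolving an undeclared x twice reports the error only for the first use.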
