Compiler Construction: The Symbol Table
The symbol table covers every kind of lexeme that appears in a source program. This includes:
- Keywords
- Operators
- Variables
- Constants
- Data Types
- Functions
- Procedures
- Literals
Contents of the Symbol Table
The symbol table will contain the following types
of information for the input strings in a source
program:
• The lexeme (input string) itself
• Corresponding token
• Its semantic component (e.g., variable, operator,
constant, function, procedure, etc.)
• Data type
• Pointers to other entries (when necessary)
[Figure: trees built from the identifiers Sum, I, and Average, showing one balanced and two unbalanced arrangements]
[Figure: example table layout. The string table holds the lexemes call, declare, averages, and x; the name table and the other tables refer to one another by index.]
Hash Functions
• The hash function takes the lexeme and produces a
nonunique value from it. That value is the starting
point of a list of entries, one of which is the one
that we seek.
• We want our hash value to produce as few hash
collisions as possible.
• Our first temptation would be to use a hash
function such as:
    Sum(ASCII(Lexeme)) MOD SomeValue
This is not a good hash function because anagrams
such as cat and act would have the same hash value.
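The collision described above can be checked with a minimal sketch. The function name additiveHash and the table size 101 are assumptions made only for this illustration; they do not come from the course code.

```cpp
#include <string>

// Hypothetical table size, chosen only for this illustration.
const unsigned kTableSize = 101;

// Naive hash: sum of the character codes modulo the table size.
// Addition ignores character order, so anagrams always collide.
unsigned additiveHash(const std::string& lexeme)
{
    unsigned sum = 0;
    for (unsigned char ch : lexeme)
        sum += ch;
    return sum % kTableSize;
}
```

With ASCII codes, additiveHash("cat") and additiveHash("act") both sum 99 + 97 + 116 = 312 and so land in the same bucket (312 MOD 101 = 9).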
Hash Function
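int symboltable::hashcode(char string[])
{
    int      i, numshifts, startchar, length;
    unsigned code;

    length = strlen(string);
    // Use at most 8*sizeof(int)-8 characters so that the
    // shifted code stays within an unsigned int
    numshifts = min(length, (int)(8*sizeof(int)-8));
    // For longer lexemes, start at character 0 or 1,
    // depending on whether the excess length is even or odd
    startchar = ((length-numshifts) % 2);
    code = 0;

    // Shift before adding so that character order matters
    for (i = startchar; i < startchar + numshifts; i++)
        code = (code << 1) + string[i];

    return(code % hashtablesize);
}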
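To see why the shift matters, here is a free-function sketch of the same shift-and-add idea. The name shiftHash and the table size 101 are assumptions for the example; the member function on the slide uses the class's own hashtablesize.

```cpp
#include <algorithm>
#include <cstring>

// Hypothetical table size, chosen only for this illustration.
const unsigned kHashTableSize = 101;

// Shift-and-add hash modeled on the hashcode() shown above.
unsigned shiftHash(const char* lexeme)
{
    int length = static_cast<int>(std::strlen(lexeme));
    // Limit the number of characters used, as the slide code does.
    int numshifts = std::min(length, static_cast<int>(8 * sizeof(int) - 8));
    int startchar = (length - numshifts) % 2;

    unsigned code = 0;
    for (int i = startchar; i < startchar + numshifts; i++)
        code = (code << 1) + static_cast<unsigned char>(lexeme[i]);

    return code % kHashTableSize;
}
```

Because each character is added after the running code has been doubled, cat gives 706 and act gives 702 before the modulus, so the two anagrams no longer share a hash value.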
Auxiliary Table
• If an entry has several values to store, or more
than a few pointers to other entries in the symbol
table, it may be necessary to add an auxiliary table
to hold these values.
• The attribute table is linked to the auxiliary table
by holding the starting position and number of
entries in the auxiliary table.
• This would also allow us to implement structures
such as arrays and records in a fairly
straightforward fashion.
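The link between the two tables can be sketched as follows. All names here (AttribEntry, AuxTables, auxstart, auxcount, attachValues, auxValue) are hypothetical; the slides do not show how the course code lays out its auxiliary table.

```cpp
#include <vector>

// Hypothetical attribute entry: only the two linking fields are shown.
struct AttribEntry {
    int auxstart = -1;   // first slot in the auxiliary table, -1 if none
    int auxcount = 0;    // number of consecutive slots used
};

struct AuxTables {
    std::vector<AttribEntry> attrib;  // attribute table
    std::vector<int>         aux;     // auxiliary table of extra values

    // Store the values contiguously and record where they start
    // and how many there are in the attribute entry.
    void attachValues(int tabindex, const std::vector<int>& values)
    {
        attrib[tabindex].auxstart = static_cast<int>(aux.size());
        attrib[tabindex].auxcount = static_cast<int>(values.size());
        aux.insert(aux.end(), values.begin(), values.end());
    }

    // Fetch the k-th auxiliary value belonging to an entry.
    int auxValue(int tabindex, int k) const
    {
        return aux[attrib[tabindex].auxstart + k];
    }
};
```

An array entry could, for example, keep its dimension bounds in the auxiliary slots, and a record could keep the attribute-table indices of its fields there.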
// The tokens
enum tokentype {tokbegin, tokcall, tokdeclare, tokdo,
                tokelse, tokend, tokendif, tokenduntil,
                tokendwhile, tokif, tokinteger,
                tokparameters, tokprocedure, tokprogram,
                tokread, tokreal, tokset, tokthen,
                tokuntil, tokwhile, tokwrite, tokstar,
                tokplus, tokminus, tokslash, tokequals,
                toksemicolon, tokcomma, tokperiod,
                tokgreater, tokless, toknotequal,
                tokopenparen, tokcloseparen, tokfloat,
                tokidentifier, tokconstant, tokerror,
                tokeof, tokunknown
};
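// One entry in the attribute table
typedef struct {
    semantictype smtype;           // Semantic component
    tokentype    tok_class;        // Corresponding token
    datatype     dataclass;        // Data type
    int          owningprocedure;
    int          thisname;         // Index into the name table
    int          outerscope,       // Same name in an enclosing scope
                 scopenext;        // Next identifier in the same scope
    valrec       value;
    char         label[labelsize]; // Label for the object code
} attribtabtype;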
Declaration For the Class Symbol
private:
    // Initializes the procedure stack's entry
    procstackitem initprocentry(int tabindex);
    int  hashcode(char string[]);
    void LexemeInCaps(int tabindex);
    // Creates a label for the object code
    void makelabel(int tabindex, char *label);
    inline int min(int a, int b)
        {return ((a < b)? a : b);}

    char          stringtable[stringtablesize];
    nametabtype   nametable[nametablesize];
    attribtabtype attribtable[attribtablesize];
    int           hashtable[hashtablesize];
    int           strtablen, namtablen, attribtablen;
    procstackitem thisproc;
    stack <procstackitem> ps;
};
Initializing the Symbol Table
thisproc = initprocentry(-1);
installname(keystring[i], nameindex);
setattrib(nameindex, stfunction, (tokentype)i);
installdatatype(nameindex, stfunction, dtreal);
    tabindex = nametable[nameindex].symtabptr
             = attribtablen++;
    attribtable[tabindex].thisname = nameindex;
    attribtable[tabindex].smtype = stunknown;
    attribtable[tabindex].dataclass = dtunknown;
    return(tabindex);
}
    if (attribtable[tabindex].smtype == stkeyword
            || attribtable[tabindex].smtype == stoperator)
        attribtable[tabindex].dataclass = dtnone;
    else
        attribtable[tabindex].dataclass = dtunknown;

    if (gettok_class(tabindex) == tokidentifier
            && thisproc.proc != -1)
        if (thisproc.sstart == -1) {
            thisproc.sstart = tabindex;
            thisproc.snext = tabindex;
        }
        else {
            attribtable[thisproc.snext].scopenext
                = tabindex;
            thisproc.snext = tabindex;
        }
}
We create a linked list of the identifiers belonging
to the scope. This will simplify the process of
closing the scope when we are finished.
Keeping track of scope
We also keep a linked list of identifiers sharing
the same name, pointing from the most closely
nested scope outward.
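The two lists can be sketched together as below. This is a simplified illustration, not the course implementation: the names Entry, ScopeSketch, current, declare, openScope, and closeScope are hypothetical, and only one open scope list is tracked at a time, whereas the course code keeps a stack of procstackitem records.

```cpp
#include <vector>

// Hypothetical attribute entry with just the linking fields.
struct Entry {
    int thisname   = -1;  // index of the identifier's name
    int outerscope = -1;  // same name in an enclosing scope, -1 if none
    int scopenext  = -1;  // next identifier declared in this scope
};

struct ScopeSketch {
    std::vector<Entry> attrib;   // attribute table
    std::vector<int>   current;  // per name: innermost visible entry
    int sstart = -1;             // first identifier of the open scope
    int snext  = -1;             // last identifier of the open scope

    explicit ScopeSketch(int numNames) : current(numNames, -1) {}

    // Simplification: start a fresh list and forget the outer one.
    void openScope() { sstart = snext = -1; }

    // Declare a name in the open scope and link it into both lists.
    int declare(int name)
    {
        int idx = static_cast<int>(attrib.size());
        Entry e;
        e.thisname   = name;
        e.outerscope = current[name];   // points outward
        attrib.push_back(e);
        current[name] = idx;
        if (sstart == -1)
            sstart = idx;
        else
            attrib[snext].scopenext = idx;
        snext = idx;
        return idx;
    }

    // Walk the scope's list and make each outer declaration visible again.
    void closeScope()
    {
        for (int i = sstart; i != -1; i = attrib[i].scopenext)
            current[attrib[i].thisname] = attrib[i].outerscope;
        sstart = snext = -1;
    }
};
```

Closing a scope only has to follow scopenext once, and each step restores the outer declaration through outerscope, so no search of the whole table is needed.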
    i = attribtable[tabindex].thisname;
    j = nametable[i].strstart;
    s = stringtable + j;
    cout << s;
}
// LexemeInCaps() - Print the lexeme in capital letters
// This makes it more distinctive
void symboltable::LexemeInCaps(int tabindex)
{
int i, j;