CSC305 ASSIGNMENT Final

The document discusses symbol tables and their implementation in programming languages. It explains what a symbol table is, its structure and common operations. It then discusses how symbol tables are used in different compiler phases and provides an example of symbol table implementation in C++.

FEDERAL UNIVERSITY LOKOJA

DEPARTMENT OF COMPUTER SCIENCE

CSC305 ASSIGNMENT

GROUP 5

MATRIC NUMBER 087-106

LECTURER: MR ABUBAKAR ALIYU

GROUP MEMBERS
S/N NAME MATRIC NUMBER
1 OVABOR OSEDEBAMEH OBED SCI19CSC087
2 PETER ENOJO MOSES SCI19CSC088
3 SALAMI BABATUNDE SAMUEL SCI19CSC089
4 SALAMI VICTOR HARUNA SCI19CSC090
5 SALAWUDEEN ABDULMUHAYMIN OGIRIMA SCI19CSC091
6 SAMUEL TITUS OLUWATIMILEYIN SCI19CSC092
7 SULEIMAN SADIK ABUBAKAR SCI19CSC093
8 SULEIMAN NANAHAWAW IZE SCI19CSC094
9 SUNDAY JEREMIAH SCI19CSC095
10 THOMAS OJONIMI SHEDRACK SCI19CSC096
11 TUWASE TOSIN SCI19CSC097
12 UKWUEZE CHUKWUEMEKA GIFT SCI19CSC098
13 UMOREN BARNABAS JOHN SCI19CSC099
14 USMAN ABDULMALEEK ANDA SCI19CSC100
15 USMAN OMOWUNMI GRACE SCI19CSC101
16 USMANKING ABDULLAHI SCI19CSC102
17 YAKUBU FRANCIS ELEOJO SCI19CSC103
18 YAU-YABA UMAR SCI19CSC104
19 YUSUF KHADIJAT OZOHU SCI19CSC105
20 ABDULRAHMAN AHMAD BABA SCI19CSC106

QUESTION 1
Explain in detail the symbol table and its implementation in C/C++/Java or Python.

ANSWER
Behind every compiler lies an important data structure known as the symbol
table. It bridges the gap between human-readable source code and
machine-friendly output. With the symbol table, the compiler handles scoping,
resolves names and checks semantic correctness.

WHAT IS A SYMBOL TABLE?


In computer science, a symbol table is a data structure created and maintained
by a compiler to store information about the entities that occur in a program,
such as variable and function names, objects, classes and interfaces.

STRUCTURE OF SYMBOL TABLE ENTRY


Each entry in the symbol table carries attributes that support the compiler
in its different phases. These are:
1. Symbol Name: Holds the name of the identifier (for example, a variable name).
2. Type: Stores the data type of the identifier, or the return type of a function.
3. Size: Specifies the size of the identifier (fixed or variable).
4. Dimension: Holds the dimensions of the data type, which may be
one-dimensional or multidimensional.
5. Line of Declaration: Stores the line number of the source code where the
identifier is declared.
6. Line of Usage: Stores the line numbers of the source code where the
identifier is used.
7. Address: Stores the address information of the identifier (fixed, compile
time or run time).

SYMBOL TABLE OPERATIONS
Common operations performed on the symbol table during the compilation process
are:
1. Insert: Inserts a name into the symbol table and returns a pointer to its
entry.
2. Lookup: Searches for a name and returns a pointer to its entry.
3. Create: Allocates a new, empty symbol table.
4. Delete: Removes a symbol from the symbol table.

SYMBOL TABLE USAGE IN COMPILER PHASES


● Lexical Analysis: Creates entries for identifiers. The lexical analyser, also
known as the scanner, scans the code line by line. When it encounters an
identifier during scanning, it creates an entry for it in the symbol table.
● Syntax Analysis: The syntax analyser adds information about attributes such
as type, scope, dimension, line of reference and line of usage.
● Semantic Analysis: Using the information stored in the symbol table, the
semantic analyser checks the semantics of the identifiers created by the
lexical analysis phase and updates the symbol table accordingly.
● Intermediate Code Generation: The intermediate code generator uses the
information in the symbol table and records information about the temporary
variables it introduces.
● Code Optimization: The information stored in the symbol table is used
specifically in machine-dependent optimization.
● Target Code Generation: The target code generator generates the target
code using the address information of identifiers stored in the symbol table.

IMPLEMENTATION OF SYMBOL TABLE IN C++


Let's consider a real-life scenario in which a table in the style of a symbol
table manages information about students in a university. Each student has a
unique student ID, name, age and major. The student ID plays the role of the
symbol name: the table stores each record under that key and supports insertion,
lookup and deletion by student ID.

#include <iostream>
#include <stdexcept>
#include <string>
#include <unordered_map>

class Student {
public:
    std::string name;
    int age;
    std::string major;

    Student(const std::string& name, int age, const std::string& major)
        : name(name), age(age), major(major) {}
};

class StudentDatabase {
private:
    std::unordered_map<int, Student> database;

public:
    // Insert a record, overwriting any existing record with the same ID.
    // operator[] cannot be used here because Student has no default
    // constructor; insert_or_assign requires C++17.
    void insert(int studentID, const std::string& name, int age,
                const std::string& major) {
        database.insert_or_assign(studentID, Student(name, age, major));
    }

    // Return a pointer to the record, or nullptr if the ID is unknown.
    Student* lookup(int studentID) {
        auto it = database.find(studentID);
        if (it != database.end()) {
            return &(it->second);
        }
        return nullptr;
    }

    // Remove a record; throws if the ID is not present.
    void remove(int studentID) {
        auto it = database.find(studentID);
        if (it != database.end()) {
            database.erase(it);
        } else {
            throw std::runtime_error("Student not found in the database.");
        }
    }
};

// Example Usage:
int main() {
    StudentDatabase studentDB;

    // Insert student records into the database
    studentDB.insert(1001, "Alice", 21, "Computer Science");
    studentDB.insert(1002, "Bob", 20, "Electrical Engineering");
    studentDB.insert(1003, "Charlie", 22, "Mechanical Engineering");

    // Lookup and print student information
    Student* alice = studentDB.lookup(1001);
    if (alice) {
        std::cout << "Student ID: " << 1001 << ", Name: " << alice->name
                  << ", Age: " << alice->age << ", Major: " << alice->major
                  << std::endl;
    } else {
        std::cout << "Student not found." << std::endl;
    }

    // Remove a student from the database
    studentDB.remove(1002);

    // Lookup (after deletion)
    Student* bob = studentDB.lookup(1002);
    if (bob) {
        std::cout << "Student ID: " << 1002 << ", Name: " << bob->name
                  << ", Age: " << bob->age << ", Major: " << bob->major
                  << std::endl;
    } else {
        std::cout << "Student not found." << std::endl;
    }

    return 0;
}

IMPORTANCE OF SYMBOL TABLE


1. Helps to manage the run-time allocation of identifiers.
2. The symbol table stores all valid data and essential information related to
symbols, which is useful at compile time.
3. The scope of identifiers is easily handled by the symbol table.

QUESTION 2
Discuss in details with their implementations in any programming language of your
choice, the following
a) Java CC (Java Compiler Compiler), and
b) YACC (Yet another Compiler Compiler)

ANSWER
a. Java CC (Java Compiler Compiler)

JavaCC, short for Java Compiler Compiler, is a parser generator and lexical
analyzer generator for the Java programming language. It is used to generate Java
code for building parsers, interpreters, and compilers for various programming
languages or domain-specific languages (DSLs).
It uses a BNF-like (Backus-Naur Form) syntax to define the grammar rules of
the language being parsed. Based on this input grammar specification, JavaCC
generates the necessary Java code that can be integrated into your Java application
for lexing and parsing the input source code.

Fig 1.0 Java Compiler Compiler

Parser Generation

Parsers and lexical analyzers are the two software components that deal with
input in the form of character sequences, and compilers and interpreters
integrate both. The parser deciphers the files that contain programs. A parser
generator such as JavaCC reads a grammar specification and converts it into a
Java program that recognizes input matching the grammar.

Lexical analyzers break the sequence of characters into subsequences called
tokens and classify those tokens. Lexical analyzers and parsers are therefore
useful in a wide variety of programs, not only in compilers. JavaCC also
provides standard parser-related functionality such as tree building, debugging
and actions. Tree building is performed by JJTree, a tool that is distributed
with JavaCC.

Fig 1.2 Parser Generator

FEATURES OF JAVACC
1. Parser Generation: JavaCC generates LL(k) parsers, which are predictive
parsers based on a fixed number (k) of look-ahead tokens. This allows efficient
parsing of languages with deterministic grammars.
2. Lexical Analysis: JavaCC provides support for generating lexical analyzers
(scanners) that can tokenize the input source code into a sequence of tokens (tokens
are the smallest units of the language being parsed).
3. Error Reporting: JavaCC generates parsers with built-in error handling and
reporting capabilities, making it easier to identify syntax errors in the input source
code.
4. Grammar Extensions: JavaCC allows you to extend the BNF-like grammar
notation with additional features, such as lookahead, syntactic predicates, semantic
actions, and more.
5. Lexer States: JavaCC supports lexical states, which allow the token manager
to switch between different sets of lexical rules while scanning the input.

6. Tree Building: Together with JJTree, JavaCC can generate code that builds
Abstract Syntax Trees (ASTs) or parse trees, making it easier to perform
subsequent semantic analysis or code generation.

CODE IMPLEMENTATION OF JAVACC


The following JavaCC grammar file defines a basic parser for a simple arithmetic
expression language. JavaCC generates Java code from it.

// ArithmeticParser.jj
options {
  STATIC = false;
}

PARSER_BEGIN(ArithmeticParser)
public class ArithmeticParser {
  public static void main(String[] args) throws ParseException {
    ArithmeticParser parser = new ArithmeticParser(System.in);
    parser.parse();
  }
}
PARSER_END(ArithmeticParser)

SKIP : { " " | "\t" | "\n" | "\r" }

TOKEN : {
    < INTEGER: (["0"-"9"])+ >
  | < ADD: "+" >
  | < SUBTRACT: "-" >
  | < MULTIPLY: "*" >
  | < DIVIDE: "/" >
  | < LPAREN: "(" >
  | < RPAREN: ")" >
}

// Entry point: a complete expression followed by end of input.
int parse() :
{
  int value;
}
{
  value = additiveExpression() <EOF>
  {
    System.out.println("Result: " + value);
    return value;
  }
}

// Nonterminals are matched in the expansion itself, not inside Java actions.
int additiveExpression() :
{
  int value;
  int rhs;
}
{
  value = multiplicativeExpression()
  (
      <ADD> rhs = multiplicativeExpression() { value += rhs; }
    | <SUBTRACT> rhs = multiplicativeExpression() { value -= rhs; }
  )*
  { return value; }
}

int multiplicativeExpression() :
{
  int value;
  int rhs;
}
{
  value = primaryExpression()
  (
      <MULTIPLY> rhs = primaryExpression() { value *= rhs; }
    | <DIVIDE> rhs = primaryExpression() { value /= rhs; }
  )*
  { return value; }
}

// <INTEGER> yields a Token, so its text is converted to an int explicitly.
int primaryExpression() :
{
  int value;
  Token t;
}
{
    t = <INTEGER> { return Integer.parseInt(t.image); }
  | <LPAREN> value = additiveExpression() <RPAREN> { return value; }
}

b. YACC (Yet Another Compiler Compiler)
Yacc (Yet Another Compiler Compiler) is a syntax analyser generator, or parser
generator: it takes a context-free grammar as input and generates a parser in
the C programming language as output. It is often used in combination with Lex
(Lexical analyzer generator), which generates the lexical analyzer. Flex, an
open-source and faster alternative to Lex, is more commonly used today.
The parser generated by Yacc is an LALR (Look-Ahead LR) parser, which is an
efficient and widely used type of bottom-up parser. The LALR parsing technique
allows the parser to recognize the structure of the input program; semantic
actions attached to the grammar rules can then build an Abstract Syntax Tree
(AST) for further processing.

Fig1.3 Yacc Compiler


Fig 1.3 shows Yacc taking as input a file in which the desired grammar is
specified in Yacc format (.y extension) and producing as output a C program
(y.tab.c) that implements the parser. The C compiler then takes y.tab.c as
input and produces an executable (a.out), which is the syntax analyser.

FEATURES OF YACC
1. Grammar Specification: Yacc uses a context-free grammar (CFG) specification
to define the syntax of the language being parsed. The grammar consists of rules that
describe the valid sequences of tokens in the language.
2. Parser Generation: Based on the input grammar, Yacc generates the necessary
C code to implement the LALR parser. The generated parser reads the sequence of
tokens produced by the lexical analyzer (Lex) and matches them against the
grammar rules to recognize the syntax of the input program.

3. Symbol Table and Semantic Actions: Yacc allows the inclusion of semantic
actions in the grammar rules. These semantic actions are snippets of code that get
executed when specific grammar rules are matched, allowing the parser to perform
tasks like building the AST, handling symbol table entries, and performing semantic
analysis.
4. Error Recovery: Yacc provides mechanisms for error recovery during parsing.
It can be configured to produce informative error messages and continue parsing
after encountering syntax errors.
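5. Conflict Resolution: Yacc reports the shift-reduce and reduce-reduce
conflicts that arise in the grammar and resolves them with default rules: a
shift-reduce conflict is resolved in favour of the shift, and a reduce-reduce
conflict in favour of the rule that appears first in the grammar. Precedence
and associativity declarations such as %left and %right let the grammar author
resolve conflicts in ambiguous expression grammars explicitly, so that the
generated parser behaves predictably.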

STRUCTURE OF YACC PROGRAM


/* declarations */
%%
/* rules */
%%
/* supporting C routines */

CODE IMPLEMENTATION OF YACC USING C++


Note: Yacc programs are generally written in two files: one for Lex with a .l
extension (it tokenizes the input and sends the tokens to Yacc) and another for
Yacc with a .y extension (for grammar rules and result evaluation).
Example: Creating a YACC program to evaluate a given arithmetic expression
consisting of '+', '-', '/', '*' and brackets.
Input: 7*(5-3)/2
Output: 7
Input: 6/((3-2)*(-5+2))
Output: -2

Lexical Analyser source code:

%option noyywrap

%{
#include <stdlib.h>          /* atoi */
#include "parser.tab.hpp"    /* token codes and yylval from the parser */
%}

DIGIT [0-9]
WS    [ \t\r\n]

%%

{WS}+       { /* ignore whitespace */ }
{DIGIT}+    { yylval = atoi(yytext); return NUMBER; }
.           { return *yytext; }

%%

Parser Source code:

%{
#include <stdio.h>

/* Declared before the rules so the generated parser can call them. */
int yylex(void);
void yyerror(const char *msg);
%}

%token NUMBER
%left '+' '-'
%left '*' '/'
%right UMINUS   /* unary minus binds tightest */

%%
input: expression { printf("Result: %d\n", $1); }
     ;

expression: expression '+' expression { $$ = $1 + $3; }
          | expression '-' expression { $$ = $1 - $3; }
          | expression '*' expression { $$ = $1 * $3; }
          | expression '/' expression { $$ = $1 / $3; }
          | '-' expression %prec UMINUS { $$ = -$2; }
          | '(' expression ')' { $$ = $2; }
          | NUMBER { $$ = $1; }
          ;

%%

/* main() is defined in the main program below. */
void yyerror(const char *msg) {
    fprintf(stderr, "Error: %s\n", msg);
}

MAIN PROGRAM

#include <iostream>

extern int yyparse();
extern void yyerror(const char *msg);

int main() {
    yyparse();
    return 0;
}

