Compiler Design Lab
EXPERIMENT NO : 1
Aim : To study compiler construction toolkits such as parser generators and scanner generators.
Theory :
1. Parser Generator:
A parser generator is a tool used in compiler design to automate the generation of parsers,
which are essential components in the process of converting source code into a form that a
computer can execute. Parsers analyse the syntactic structure of programming languages,
breaking down code into a hierarchical structure called a parse tree. Parser generators
streamline the creation of parsers by taking a formal specification of the language's grammar
as input and automatically generating code for the parser in a target programming language.
This allows compiler developers to focus on defining language syntax rather than writing
intricate parsing code manually. Common parser generators include tools like Yacc (Yet
Another Compiler Compiler) and Bison, which are widely used for generating parsers for
languages like C and C++. These tools play a crucial role in the efficient and systematic
development of compilers by automating the tedious and error-prone process of manual
parser implementation.
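As a rough illustration, a tiny Yacc/Bison grammar fragment for recognising a simple
variable declaration might look like the sketch below (the token names TYPE and IDENT
are hypothetical and would be supplied by a scanner):
%{
#include <stdio.h>
%}
%token TYPE IDENT
%%
decl : TYPE IDENT ';'   { printf("declaration recognised\n"); }
     ;
%%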
2. Scanner generator :
A scanner generator is a tool used in the development of compilers to automate the creation
of lexical analyzers or scanners. Lexical analyzers are responsible for breaking down the
input source code into a stream of tokens, which are the smallest meaningful units in a
programming language. These tokens serve as input for the subsequent phases of the
compiler. A popular tool for generating lexical analyzers is Lex, often used in conjunction
with Yacc or Bison for a complete compiler solution. Lexical specifications, defining the rules
for recognizing tokens, are provided as input to the scanner generator, which then produces
source code for the lexical analyzer in a target programming language. This automation
simplifies the implementation of scanners, allowing developers to focus on the language's
lexical structure rather than manually coding intricate token recognition logic. Scanner
generators contribute to the efficiency and consistency of compiler development processes.
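For instance, a minimal Lex specification might look like the following sketch (the token
categories and messages are illustrative assumptions, not fixed by the tool):
%{
#include <stdio.h>
%}
%%
[0-9]+                  { printf("NUMBER: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*  { printf("IDENTIFIER: %s\n", yytext); }
[ \t\n]+                ; /* skip whitespace */
.                       { printf("UNKNOWN: %s\n", yytext); }
%%
int main(void) { return yylex(); }
int yywrap(void) { return 1; }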
Conclusion :
In conclusion, we have studied compiler construction toolkits, which have demonstrated their
invaluable role in simplifying and automating the intricate process of building compilers.
These toolkits, exemplified by tools like Lex, Yacc, Bison, and LLVM, empower developers to
efficiently construct compilers, allowing them to focus on language design and optimization
strategies rather than grappling with low-level implementation details. The modular and
comprehensive nature of these toolkits significantly contributes to the ease and effectiveness
of compiler development.
EXPERIMENT NO : 2
Aim : Write a program in C++ to identify lexical, syntax, and semantic errors.
Theory :
1. Lexical Errors :
When the token pattern does not match the prefix of the remaining input, the lexical
analyzer gets stuck and has to recover from this state to analyse the remaining input.
In simple words, a lexical error occurs when a sequence of characters does not
match the pattern of any token. It is detected during the lexical analysis
(scanning) phase of compilation.
1. Exceeding length of identifier or numeric constants.
Eg: int a = 2147483647 + 1; /* exceeds the limit of int */
2. Appearance of illegal characters
Eg: printf("Geeksforgeeks");$ /*presence of illegal character*/
3. Unmatched string
Eg: /* comment cout<<"GFG!"; /*ending of comment not done*/
4. Spelling Error
Eg: int 3num = 1234; /* misspelled identifier: a name cannot begin with a digit */
5. Replacing a character with an incorrect character.
Eg: int x = 12$34; /*wrong character*/
6. Removal of the character that should be present.
Eg: #include <iostrem> /* missing 'a' character, which can cause a lexical error */
2. Syntax Error:
Syntax errors are mistakes in the source code, such as spelling and punctuation
errors, incorrect labels, and so on, which cause an error message to be generated by
the compiler. These appear in a separate error window, with the error type and line
number indicated so that they can be corrected in the edit window. Syntax errors are
detected during compilation, when the parser checks the program against the language's grammar.
Some common syntax errors are:
● Error in structure
● Missing operators
● Unbalanced parentheses
Some examples are as follows:
1. Using "=" when "==" is needed.
Eg: if (number = 200)
cout << "number is equal to 200";
else
cout << "number is not equal to 200";
2. Missing semicolon:
Eg: int a = 5 // semicolon is missing
3. Errors in expressions:
Eg: x = (3 + 5; // missing closing parenthesis ')'
y = 3 + * 5; // missing argument between '+' and '*'
3. Semantic Error :
During the semantic analysis phase, this type of error appears. These types of errors
are detected at compile time. Most of the compile time errors are scope and
declaration errors, for example undeclared or multiply declared identifiers. Type
mismatch is another compile-time error. A semantic error can also arise from using the
wrong variable, using the wrong operator, or performing operations in the wrong order.
Some examples are as follows:
1. Use of an undeclared/uninitialized variable:
Eg: int i;
void f(int m)
{
m = t; /* t has not been declared or initialized */
}
2.Type incompatibility:
Eg: int a = "hello"; // a string literal (const char*) cannot be converted to int
3. Errors in expressions:
Eg: std::string s = "...";
int a = 5 - s; // the - operator does not accept these operand types
Lexical Error:
#include <iostream>
int main() {
// 'cout' is misspelled as 'cput', producing an unrecognized token
std::cput << "hello this is first program!";
return 0;
}
Syntax Error:
#include <iostream>
int main() {
// the statement below is missing its terminating semicolon
std::cout << "Hello world!"
return 0;
}
Semantic Error :
#include <iostream>
int main() {
int a = 10;
int b = 100;
// 'c' is never declared, so this use is a semantic error
std::cout << a + c;
return 0;
}
Conclusion :
In conclusion, the experiment on lexical, syntax, and semantic errors underscores their
distinct roles in the debugging process. While syntax errors relate to
code structure, lexical errors involve tokenization issues, and semantic errors pertain to
incorrect program logic. Addressing these errors systematically is crucial for producing
robust and error-free software, highlighting the importance of thorough testing and
debugging practices in software development.
EXPERIMENT NO : 3
Aim : To study the Lex, Yacc, Flex, and Bison tools.
Theory :
Lex, Yacc, Flex, and Bison are indispensable tools in the realm of compiler construction and
language processing. Lex and Flex automate the generation of lexical analyzers, breaking
down input source code into tokens based on defined regular expressions. Yacc and Bison,
on the other hand, focus on generating parsers from context-free grammars, enabling the
analysis of the syntactic structure of programming languages. Together, Lex/Flex and
Yacc/Bison form a powerful combination: Lex/Flex tokenize the input stream, while
Yacc/Bison parse the tokens according to grammar rules, constructing parse trees or
abstract syntax trees (ASTs) that represent the program's structure. This automation
simplifies the development of compilers, interpreters, and other language processing
systems, allowing developers to concentrate on language semantics rather than low-level
parsing routines.
The foundation of Lex and Flex lies in finite automata theory and regular expressions.
Lex/Flex specifications consist of regular expressions paired with corresponding actions,
which are executed upon matching a pattern in the input stream. These tools employ
deterministic finite automata (DFA) or non-deterministic finite automata (NFA) to efficiently
recognize patterns in the input, optimising lexical analysis.
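To make this concrete, the hand-written C function below is a minimal sketch of the kind
of DFA that Lex/Flex generate automatically for a pattern such as [0-9]+ (generated
scanners actually use table-driven automata; this is only illustrative):
#include <stdio.h>
/* DFA for [0-9]+ : state 0 = start, state 1 = accepting */
int matches_integer(const char *s)
{
    int state = 0;
    for (; *s; s++) {
        if (*s >= '0' && *s <= '9')
            state = 1;      /* enter or stay in the accepting state */
        else
            return 0;       /* no transition defined: reject */
    }
    return state == 1;      /* accept only if at least one digit was seen */
}
int main(void)
{
    printf("%d\n", matches_integer("12345")); /* prints 1 */
    printf("%d\n", matches_integer("12a45")); /* prints 0 */
    return 0;
}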
Yacc and Bison, on the other hand, leverage concepts from formal language theory and
parsing algorithms. Yacc/Bison specifications define the syntax rules of a language using
context-free grammars (CFG). These grammars consist of production rules that describe the
syntactic structure of the language. Yacc/Bison parsers employ bottom-up parsing
algorithms, such as LR(1) or LALR(1), to analyse the input and construct parse trees. This
process involves shifting tokens onto a parsing stack and reducing them according to the
grammar rules until a parse tree is built.
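As a small worked example, the assumed Bison fragment below defines addition
expressions, followed by a trace of how an LALR(1) parser might process the input
3 + 5 (NUM is a hypothetical token delivered by the scanner):
%token NUM
%left '+'
%%
expr : expr '+' expr
     | NUM
     ;
%%
/* A possible shift-reduce sequence for "3 + 5":
   shift NUM(3)                  stack: NUM
   reduce expr : NUM             stack: expr
   shift '+'                     stack: expr '+'
   shift NUM(5)                  stack: expr '+' NUM
   reduce expr : NUM             stack: expr '+' expr
   reduce expr : expr '+' expr   stack: expr  (accept)
*/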
In summary, Lex, Yacc, Flex, and Bison provide developers with powerful tools rooted in
formal language theory and automata theory. By automating lexical and syntactic analysis,
these tools simplify the development of language processing systems, enabling the creation
of efficient and reliable compilers, interpreters, and related software.
Conclusion:
We have successfully studied and learnt about the Lex, Yacc, Flex, and Bison tools.
Code:
STEP 1: WRITE THE FOLLOWING PROGRAM IN A TEXT EDITOR AND SAVE IT AS file.l
//Implementation of Lexical Analyzer using Lex tool
%{
#include <stdio.h>
#include <stdlib.h>
int COMMENT = 0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* {printf("\n%s is a preprocessor directive",yytext);}
int |
float |
char |
double |
while |
for |
struct |
typedef |
do |
if |
break |
continue |
void |
switch |
return |
else |
goto {printf("\n\t%s is a keyword",yytext);}
"/*" {COMMENT=1;}{printf("\n\t %s is a COMMENT",yytext);}
{identifier}\( {if(!COMMENT)printf("\nFUNCTION \n\t%s",yytext);}
\{ {if(!COMMENT)printf("\n BLOCK BEGINS");}
\} {if(!COMMENT)printf("BLOCK ENDS ");}
{identifier}(\[[0-9]*\])? {if(!COMMENT) printf("\n %s IDENTIFIER",yytext);}
\".*\" {if(!COMMENT)printf("\n\t %s is a STRING",yytext);}
[0-9]+ {if(!COMMENT) printf("\n %s is a NUMBER ",yytext);}
\)(\:)? {if(!COMMENT)printf("\n\t");ECHO;printf("\n");}
\( ECHO;
= {if(!COMMENT)printf("\n\t %s is an ASSIGNMENT OPERATOR",yytext);}
\<= |
\>= |
\< |
== |
\> {if(!COMMENT) printf("\n\t%s is a RELATIONAL OPERATOR",yytext);}
%%
int main(int argc, char **argv)
{
FILE *file;
file=fopen("var.c","r");
if(!file)
{
printf("could not open the file");
exit(0);
}
yyin=file;
yylex();
printf("\n");
return(0);
}
int yywrap()
{
return(1);
}
STEP 2: CHANGE TO THE DIRECTORY IN WHICH YOU SAVED THE PROGRAM. TO VIEW THE
FILE, RUN:
cat file.l
STEP 3: WRITE THE BELOW CODE IN A TEXT EDITOR AND SAVE IT AS var.c
#include<stdio.h>
int main()
{
int a,b;
}
STEP 4: EXECUTE THE FOLLOWING COMMANDS
umit@umit-OptiPlex-990:~$ lex file.l
umit@umit-OptiPlex-990:~$ cc lex.yy.c
umit@umit-OptiPlex-990:~$ ./a.out (use a.exe on Windows)
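If Flex is installed instead of the original Lex, as is common on modern Linux systems,
the equivalent commands would be:
umit@umit-OptiPlex-990:~$ flex file.l
umit@umit-OptiPlex-990:~$ cc lex.yy.c
umit@umit-OptiPlex-990:~$ ./a.out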
Output: