CC Project Group 2
[C++ compiler]
Project Team
Student Name     Student ID     Program   Contact Number   Email Address
Muhammad Taha    Bcsm-f17-110   BSCS                       Bcsm-f17-[email protected]
Ahmar Awan       Bcsm-f17-098   BSCS                       Bcsm-f17-098@superior.edu.pk
Ahsen Ejaz       Bcsm-f17-314   BSCS                       Bcsm-f17-314@superior.edu.pk
Asad Ali Rana    Bcsm-s17-036   BSCS                       Bcsm-s17-036@superior.edu.pk
Acknowledgements
We are thankful to our lecturer, Miss Maryam, for her invaluable guidance, continuous encouragement and
constant support in making this project possible. We appreciate her guidance from the initial to the final level
that enabled us to develop an understanding of this project thoroughly. Without her advice and assistance, it
would have been a lot tougher to complete this project.
Executive Summary
A compiler translates a program written in a high-level language into an equivalent program in a
low-level target language that the machine can execute. Compilers are utility programs that take your
code and transform it into executable machine code files. When you run a C++ compiler on your code,
the preprocessor first reads the source code (the C++ file you just wrote) and searches for
preprocessor directives (lines of code starting with a #). Preprocessor directives cause the
preprocessor to change your code in some way, usually by including a library header or another C++
file. The compiler then works through the preprocessed code and translates it into the appropriate
machine language instructions. This step also uncovers any syntax errors present in your source code
and reports them on the command line.
Table of Contents
Acknowledgements
Executive Summary
Table of Contents
1. Introduction
2. Scope of Project
3. Problem Statement
4. Proposed Solution
5. Existing System
6. Lexical Analysis Phase
   Lexical Analysis (Tokenization) Code
   Regular Expression
   Deterministic Finite Automata (DFA)
   Lexical Code (for implementing DFA)
Project Title: C++ Compiler
1. Introduction
The C and C++ programming languages are closely related. C++ grew out of C and was designed to be
source-and-link compatible with it, so it retains a great deal of C's functionality. The C++ language
provides mechanisms for mixing code that is compiled by compatible C and C++ compilers in the same
program. As a matter of fact, a C++ compiler accepts most C code, while a C compiler rejects most C++
code.
The purpose of compatibility with C is to give C++ programs convenient access to the billions of lines
of existing C code in the world.
Although C and C++ are almost compatible, there are still many incompatibilities or conflicts between
them. The conflicts can be of two types:
Incompatible C feature - valid as C code but not as C++ code.
Incompatible C++ feature - valid as C++ code but not as C code.
In this project we focus on a different domain: compatible C/C++ features, i.e. features of C code that
are also valid in C++. We aim to detect such snippets in the input program and report an error if C code
is detected. If no C code is detected, we check the program for minor errors, i.e. we act as a mini
compiler strictly for C++.
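The detection step described above can be pictured as a scan of the source text for constructs that come from C but that a C++ compiler still accepts. The following is only a minimal sketch: the function name `findCStyleConstructs` and the list of flagged constructs are assumptions made for illustration, since the report does not list the project's actual rules.

```cpp
#include <string>
#include <vector>

// Hypothetical sample of C-style constructs that still compile as C++.
// The project's real rule set is not given in the report.
static const std::vector<std::string> kCStyleConstructs = {
    "#include <stdio.h>", "#include<stdio.h>",
    "printf(", "scanf(", "malloc(", "free("
};

// Returns every flagged construct that occurs somewhere in `source`,
// in the order in which the constructs are listed above.
std::vector<std::string> findCStyleConstructs(const std::string& source) {
    std::vector<std::string> found;
    for (const std::string& construct : kCStyleConstructs) {
        if (source.find(construct) != std::string::npos) {
            found.push_back(construct);
        }
    }
    return found;
}
```

A plain substring search like this also fires inside comments, string literals and longer names such as `sprintf(`, so a real detector would more likely work on the token stream produced by the lexical analysis phase.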
2. Scope of the Project
The purpose of this project is to design a convenient and easy-to-use compiler for the C++ language. In
addition to detecting C code inside C++ code, our mini C++ compiler will also report the following
errors to the user:
Invalid variable name.
Invalid basic arithmetic expression.
Syntax errors in While loop.
Syntax errors in For loop.
Syntax errors in If-Then-Else.
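As an illustration of the first error class, a variable name can be checked against the C++ identifier rules: it must start with a letter or underscore, continue with letters, digits or underscores, and must not be a keyword. The sketch below assumes a function name, `isValidVariableName`, and uses a deliberately short keyword list; neither comes from the project itself.

```cpp
#include <cctype>
#include <set>
#include <string>

// Returns true when `name` is usable as a C++ variable name.
bool isValidVariableName(const std::string& name) {
    // Deliberately incomplete keyword sample (an assumption, kept short).
    static const std::set<std::string> keywords = {
        "int", "float", "double", "char", "void", "if", "else",
        "while", "for", "return", "class"
    };
    if (name.empty()) return false;

    // The first character must be a letter or an underscore.
    unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;

    // Every character must be a letter, a digit or an underscore.
    for (unsigned char c : name) {
        if (!(std::isalnum(c) || c == '_')) return false;
    }
    return keywords.count(name) == 0;
}
```

For example, `count1` and `_tmp` pass the check, while `1abc`, `my-var` and `while` are rejected.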
3. Problem Statement
Existing compilers:
It is difficult to find the exact error in a large body of code.
They often do not tell us the exact error.
Our task:
Construct a mini C++ type compiler.
It should strictly identify only C++ code.
It should report an error for any C code that is acceptable in C++.
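4. Proposed Solution
%{
/* Lex specification: classifies numbers, operators and parentheses. */
#include<stdio.h>
%}
%%
[0-9]+|[0-9]*\.[0-9]+ {printf("Number");}
[\+\-\*\/\^] {printf("Operator");}
[()] {printf("Punctuation");}
%%
/* Returning 1 tells the scanner there is no further input file. */
int yywrap()
{
    return 1;
}
int main()
{
    yylex();
    return 0;
}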
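For readers who do not have Lex at hand, the three rules above can be mirrored by a small hand-written scanner. This is a sketch under assumptions: the function name `classifyTokens` is made up for illustration, and characters that match no rule are simply skipped, whereas the generated scanner would echo them to the output.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Classifies the input using the same three patterns as the Lex rules:
// numbers ([0-9]+ or [0-9]*\.[0-9]+), operators (+ - * / ^) and parentheses.
std::vector<std::string> classifyTokens(const std::string& input) {
    std::vector<std::string> tokens;
    const std::size_t n = input.size();
    std::size_t i = 0;
    while (i < n) {
        // Longest match for a number: optional digits, then '.' plus digits.
        std::size_t end = i;
        while (end < n && std::isdigit(static_cast<unsigned char>(input[end]))) ++end;
        if (end < n && input[end] == '.') {
            std::size_t frac = end + 1;
            while (frac < n && std::isdigit(static_cast<unsigned char>(input[frac]))) ++frac;
            if (frac > end + 1) end = frac;  // the fraction needs at least one digit
        }
        if (end > i) {
            tokens.push_back("Number");
            i = end;
        } else if (std::string("+-*/^").find(input[i]) != std::string::npos) {
            tokens.push_back("Operator");
            ++i;
        } else if (input[i] == '(' || input[i] == ')') {
            tokens.push_back("Punctuation");
            ++i;
        } else {
            ++i;  // unmatched character: skipped in this sketch
        }
    }
    return tokens;
}
```

Calling `classifyTokens("(3+4.5)*2")` yields Punctuation, Number, Operator, Number, Punctuation, Operator, Number, in that order.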
Final Project
Regular Expression
%{
%}
%s A B C D E F G H I J K L M N X
%%
<INITIAL>([-+]?[0-9]*[\.]?[0-9]+)+ BEGIN A;
<INITIAL>l BEGIN B;
<INITIAL>s BEGIN E;
<INITIAL>c BEGIN I;
<INITIAL>t BEGIN K;
<INITIAL>[^lsct0123456789\.\n] BEGIN X;
<INITIAL>\n BEGIN INITIAL; {printf("Accepted");}
<B>o BEGIN C;
<B>[^o\n] BEGIN X;
<B>\n BEGIN INITIAL; {printf("Not Accepted");}
<C>g BEGIN D;
<C>[^g\n] BEGIN X;
<E>q BEGIN F;
<E>i BEGIN H;
<E>[^qi\n] BEGIN X;
<E>\n BEGIN INITIAL; {printf("Not Accepted");}
<F>r BEGIN G;
<F>[^r\n] BEGIN X;
<F>\n BEGIN INITIAL; {printf("Not Accepted");}
<G>t BEGIN D;
<G>[^t\n] BEGIN X;
<G>\n BEGIN INITIAL; {printf("Not Accepted");}
<H>n BEGIN D;
<H>[^n\n] BEGIN X;
<H>\n BEGIN INITIAL; {printf("Not Accepted");}
<I>o BEGIN J;
<I>[^o\n] BEGIN X;
<I>\n BEGIN INITIAL; {printf("Not Accepted");}
<J>s BEGIN D;
<J>[^s\n] BEGIN X;
<J>\n BEGIN INITIAL; {printf("Not Accepted");}
<K>a BEGIN L;
<K>[^a\n] BEGIN X;
<K>\n BEGIN INITIAL; {printf("Not Accepted");}
<L>n BEGIN D;
<L>[^n\n] BEGIN X;
<L>\n BEGIN INITIAL; {printf("Not Accepted");}
<D>\( BEGIN M;
<D>[^\(\n] BEGIN X;
<D>\n BEGIN INITIAL; {printf("Not Accepted");}
<M>([-+]?[0-9]*[\.]?[0-9]+)+ BEGIN N;
<M>[^0123456789\.\n] BEGIN X;
<M>\n BEGIN INITIAL; {printf("Not Accepted");}
<N>\) BEGIN A;
<N>[^\)\n] BEGIN X;
<N>\n BEGIN INITIAL; {printf("Not Accepted");}
<X>[^\n] BEGIN X;
<X>\n BEGIN INITIAL; {printf("Invalid");}
%%
int yywrap()
{
return 1;
}
int main()
{
printf("Enter an expression: ");
yylex();
return 0;
}
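The start conditions above walk through the letters of log, sqrt, sin, cos and tan, then expect an opening parenthesis, a number and a closing parenthesis; a bare number is handled as well. The same language can be recognised by a short hand-written function, sketched below. The names `matchNumber` and `acceptsLine` are assumptions, the repeated number pattern is simplified to a single signed decimal number, and any trailing characters after a complete match are treated as a rejection because the listing shows no rules for state A.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Tries to match one number of the form [-+]?[0-9]*\.?[0-9]+ starting at `pos`.
// Returns the index just past the number, or std::string::npos if none matches.
std::size_t matchNumber(const std::string& s, std::size_t pos) {
    std::size_t i = pos;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) ++i;
    const std::size_t digitsStart = i;
    while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) ++i;
    const bool hasIntegerPart = i > digitsStart;
    if (i < s.size() && s[i] == '.') {
        std::size_t j = i + 1;
        while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) ++j;
        if (j > i + 1) return j;  // '.' followed by at least one digit
    }
    return hasIntegerPart ? i : std::string::npos;
}

// Accepts either a bare number or fn(number), where fn is one of the
// five function names spelled out by the start conditions.
bool acceptsLine(const std::string& line) {
    static const char* const functions[] = {"log", "sqrt", "sin", "cos", "tan"};
    std::size_t pos = 0;
    bool isCall = false;
    for (const char* fn : functions) {
        const std::string prefix = std::string(fn) + "(";
        if (line.compare(0, prefix.size(), prefix) == 0) {
            pos = prefix.size();
            isCall = true;
            break;
        }
    }
    pos = matchNumber(line, pos);
    if (pos == std::string::npos) return false;
    if (isCall) {
        if (pos >= line.size() || line[pos] != ')') return false;
        ++pos;
    }
    return pos == line.size();  // assumption: nothing may follow a match
}
```

With this sketch, `42`, `-3.5`, `sqrt(16)` and `tan(.5)` are accepted, while `log(`, `sin(2` and `cos(x)` are rejected.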
. ;
%%
int main()
{
yylex();
printf("\n total no. of token = %d\n", n);
}
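%{
/* Declarations assumed: the original definitions section is missing from the report. */
#include <iostream>
#include <cstdlib>
using namespace std;
int x, product, sq;
int mode = 1; /* assumed default; the original value is not shown */
%}
%%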
[-+]?[0-9]+ {
cout << "*** " << yytext << " is an integer.\n";
x = atoi(yytext);
product = x * mode;
sq = x * x;
if (mode == 1)
cout << "Its square is " << sq << ".\n";
else
cout << "Its product with your mode is "
<< product << ".\n"; }
exit|quit {
cout << "\nBye\n";
return 0; }
[\t ]+ cout << " ";
\n cout << endl;
. cout << yytext;
%%
%{
#include<stdio.h>
#include "y.tab.h"
%}
%%
[0-9]+ {
yylval=atoi(yytext);
return NUMBER;
}
[\t] ;
[\n] return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}
%{
#include<stdio.h>
int flag=0;
%}
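%token NUMBER
/* Precedence and associativity, lowest first; needed because the grammar below is ambiguous */
%left '+' '-'
%left '*' '/' '%'
/* Rule Section */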
%%
ArithmeticExpression: E {
printf("\nResult=%d\n", $1);
return 0;
};
E:E'+'E {$$=$1+$3;}
|E'-'E {$$=$1-$3;}
|E'*'E {$$=$1*$3;}
|E'/'E {$$=$1/$3;}
|E'%'E {$$=$1%$3;}
|'('E')' {$$=$2;}
| NUMBER {$$=$1;}
;
%%
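//driver code
int main()
{
yyparse();
if(flag==0)
printf("\nValid expression\n"); /* message assumed; the original line is missing */
return 0;
}
void yyerror()
{
printf("\nInvalid expression\n"); /* message assumed; the original line is missing */
flag=1;
}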
F -> F^G
F -> G
G -> H
H -> number
After eliminating left recursion (rewriting the left-recursive grammar as a right-recursive one),
it becomes:
E -> TE'
E' -> ε | +TE' | -TE'
T -> FT'
T' -> ε | *FT' | /FT'
F -> GF'
F' -> ε | ^GF'
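A right-recursive grammar of this shape maps directly onto a recursive-descent parser, with one function per nonterminal. The sketch below is an illustration under assumptions, not the project's implementation: the class name `ExprParser` is made up, each primed nonterminal is written as a loop (which keeps the left-to-right evaluation order of the original left-recursive rules), numbers are restricted to unsigned integers, and whitespace is not handled.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>

class ExprParser {
public:
    explicit ExprParser(const std::string& text) : text_(text), pos_(0) {}

    // Parses the whole input and returns its value; throws on a syntax error.
    double parse() {
        double value = parseE();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected input");
        return value;
    }

private:
    char peek() const { return pos_ < text_.size() ? text_[pos_] : '\0'; }

    // E -> T E'   with   E' -> ε | + T E' | - T E'
    double parseE() {
        double value = parseT();
        while (peek() == '+' || peek() == '-') {
            char op = text_[pos_++];
            double rhs = parseT();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // T -> F T'   with   T' -> ε | * F T' | / F T'
    double parseT() {
        double value = parseF();
        while (peek() == '*' || peek() == '/') {
            char op = text_[pos_++];
            double rhs = parseF();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // F -> G F'   with   F' -> ε | ^ G F'
    double parseF() {
        double value = parseNumber();
        while (peek() == '^') {
            ++pos_;
            value = std::pow(value, parseNumber());
        }
        return value;
    }

    // G -> H, H -> number (unsigned integer literal in this sketch)
    double parseNumber() {
        if (!std::isdigit(static_cast<unsigned char>(peek())))
            throw std::runtime_error("number expected");
        double value = 0;
        while (std::isdigit(static_cast<unsigned char>(peek()))) {
            value = value * 10 + (text_[pos_] - '0');
            ++pos_;
        }
        return value;
    }

    std::string text_;
    std::size_t pos_;
};
```

For example, `ExprParser("2+3*4").parse()` evaluates the multiplication first and returns 14, while an incomplete input such as `2+` raises `std::runtime_error`.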
Operator precedence table (row = operator on top of the stack, column = incoming operator):
      ^     /     *     -     +     $
^     ·>    ·>    ·>    ·>    ·>    ·>
/     <·    ·>    ·>    ·>    ·>    ·>
*     <·    ·>    ·>    ·>    ·>    ·>
-     <·    <·    <·    ·>    ·>    ·>
+     <·    <·    <·    ·>    ·>    ·>
$     <·    <·    <·    <·    <·    <·
Stack Implementation
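The precedence table can drive a stack-based evaluator: while the operator on top of the stack takes precedence over the incoming one, reduce; otherwise shift the incoming operator. The following sketch assumes the helper names `level`, `takesPrecedence`, `applyOp` and `evaluate`, encodes the table as numeric precedence levels rather than as a literal matrix, handles unsigned integer operands only, and stops when the end marker $ meets the $ at the bottom of the stack.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <stack>
#include <stdexcept>
#include <string>

// Precedence level implied by the table: '^' binds tightest, '$' loosest.
int level(char op) {
    switch (op) {
        case '^': return 3;
        case '*': case '/': return 2;
        case '+': case '-': return 1;
        case '$': return 0;
        default: throw std::runtime_error("unknown operator");
    }
}

// True when the table entry [top][incoming] says "reduce first".
bool takesPrecedence(char top, char incoming) {
    int incomingLevel = level(incoming);  // also rejects unknown characters
    return top != '$' && level(top) >= incomingLevel;
}

double applyOp(char op, double lhs, double rhs) {
    switch (op) {
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
        case '*': return lhs * rhs;
        case '/': return lhs / rhs;
        default:  return std::pow(lhs, rhs);  // '^'
    }
}

double evaluate(const std::string& expr) {
    std::stack<double> operands;
    std::stack<char> operators;
    operators.push('$');                    // bottom-of-stack marker
    const std::string input = expr + "$";   // end-of-input marker
    std::size_t i = 0;
    while (i < input.size()) {
        char c = input[i];
        if (std::isdigit(static_cast<unsigned char>(c))) {
            double value = 0;
            while (i < input.size() && std::isdigit(static_cast<unsigned char>(input[i]))) {
                value = value * 10 + (input[i] - '0');
                ++i;
            }
            operands.push(value);
            continue;
        }
        // Reduce while the stacked operator takes precedence over c.
        while (takesPrecedence(operators.top(), c)) {
            if (operands.size() < 2) throw std::runtime_error("missing operand");
            double rhs = operands.top(); operands.pop();
            double lhs = operands.top(); operands.pop();
            operands.push(applyOp(operators.top(), lhs, rhs));
            operators.pop();
        }
        if (c == '$') break;   // everything has been reduced
        operators.push(c);     // otherwise shift the incoming operator
        ++i;
    }
    if (operands.size() != 1) throw std::runtime_error("malformed expression");
    return operands.top();
}
```

For the input `2+3*4^2-1` the evaluator reduces `4^2`, then `3*16`, then `2+48`, and finally subtracts 1, giving 49.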
After eliminating left recursion (rewriting the left-recursive grammar as a right-recursive one),
it becomes:
E -> LE'
E' -> ε | +LE' | -LE'
L -> FL'
L' -> ε | *FL' | /FL'
F -> GF'
F' -> ε | ^GF'
Operator precedence table (row = operator on top of the stack, column = incoming operator):
      ^     /     *     -     +     $
^     ·>    ·>    ·>    ·>    ·>    ·>
/     <·    ·>    ·>    ·>    ·>    ·>
*     <·    ·>    ·>    ·>    ·>    ·>
-     <·    <·    <·    ·>    ·>    ·>
+     <·    <·    <·    ·>    ·>    ·>
$     <·    <·    <·    <·    <·    <·
Stack Implementation
CFG:
E -> E+L | E-L | L
L -> L*F | L/F | F
F -> F^G | G
G -> H
H -> number
Conclusion
A compiler operates in several phases, each of which transforms the source program from one
representation to another. Every phase takes its input from the previous phase and feeds its
output to the next phase of the compiler. Together, these phases convert the source code by
dividing it into tokens, building parse trees, and optimizing the resulting code. The front
end comprises the analysis phases and needs memory to store the tokens and trees it
produces.