Mini Compiler: Submitted By: Tejash Niroula 16bce2292

The document describes a mini compiler project that translates a simple language to intermediate code. It outlines the language specifications including identifier rules, data types, expressions, and statements. It provides two sample programs and shows the generated intermediate code. It also describes the different phases of a compiler including lexical analysis, syntax analysis, and code generation.


MINI COMPILER

Submitted by:
Tejash Niroula
16bce2292

CONTENTS
Chapter 1: Introduction
Chapter 2: Language description
2.1 Language
2.2 Sample Program I
2.3 Code generated
2.4 Sample Program II
2.5 Code Generated
Chapter 3: Phases of compiler
3.1 Overview
3.2 Lexical analyser
3.3 Syntax analyser
3.4 Semantic analyser
3.5 Intermediate code generator
Chapter 4: Screenshots
Chapter 5: Feasibility and future scope
Chapter 6: Conclusion
Chapter 7: References
Chapter 1
Introduction
A compiler translates code written in one language into another language without changing the
meaning of the program. A compiler is also expected to produce target code that is efficient in
terms of time and space.
The compilation process is a sequence of phases. Each phase takes its input from the previous
phase, has its own representation of the source program, and feeds its output to the next phase of
the compiler.
The analysis phase of the compiler reads the source program, divides it into its constituent parts
and then checks for lexical, syntax and semantic errors. The analysis phase generates an
intermediate code, which in this project takes the form of assembly language code.

Figure 1: Compiler Architecture


Our project takes a program written in a simple language and converts it to intermediate code. This
part of a compiler is called the front end. The intermediate code generated is independent of the
machine. It is later optimised and converted to machine code.

Chapter 2
Language Description

2.1 Language
Our language will have the following specifications.
Identifier Rules
• An identifier can have a maximum length of 6 characters.
• Identifiers are not case sensitive.
• An identifier can only contain alphanumeric characters ( a-z , A-Z , 0-9 ) and
underscore (_).
• The first character of an identifier must be a letter ( a-z , A-Z ).
• Keywords are not allowed to be used as identifiers.
• No special characters, such as semicolon, period, whitespace, slash or comma,
are permitted to be used in or as an identifier.
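
The identifier rules above can be checked mechanically. The following Python sketch shows one possible way to do so; it is not the project's own code. The keyword set is an assumption collected from the statements and sample programs in this report, and the names `is_valid_identifier` and `normalize_identifier` are hypothetical.

```python
import re

# Assumed keyword set, collected from the statements and sample programs in
# this report; the project's actual keyword list may differ.
KEYWORDS = {
    "int", "char", "string", "declaration", "start", "end",
    "if", "then", "endif", "switch", "cases", "break", "endcase",
    "repeat", "until", "while", "endwhile", "for", "endfor",
    "input", "output",
}

MAX_IDENTIFIER_LENGTH = 6

# First character must be a letter; the rest may be letters, digits or "_".
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if `name` satisfies the identifier rules listed above."""
    if len(name) > MAX_IDENTIFIER_LENGTH:
        return False
    if IDENTIFIER_PATTERN.fullmatch(name) is None:
        return False
    # Identifiers are not case sensitive, so keywords are compared in lower case.
    return name.lower() not in KEYWORDS


def normalize_identifier(name: str) -> str:
    """Map identifiers that differ only in case to the same symbol-table key."""
    return name.lower()
```

With these rules, `flg` and `mes1` are accepted, while `counter` (seven characters), `1ab` (starts with a digit) and `While` (a keyword, ignoring case) are rejected.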
Data Types:
Our language supports only three data types:
• Integer
• String
• Character
Expressions
• Arithmetic operators (+, -, *, /, %)
• Unary operator
• Parentheses
• Only integers are supported
• Relational expressions are supported (>, <, >=, <=, ==, !=)
• Character, string and integer constants
e.g.    int const 4
char const '4'
string const "4"
Statements
• Declaration statement : int a;
• Declaration and initialisation : int a=5;
• Assignment statement : a=6;
• Conditional statement

Simple if (nesting not allowed)
if then
Endif
Switch statement (nesting not allowed)
Switch()
Cases
Value 1:
Break;

Value n:
Break;
Endcase
Repetition statement (nesting not allowed)
• Repeat
Until ()
• While (relational expression)
Endwhile
• For = start value, end value, inc/dec
………
Endfor
I/O Statement
• Input ;
• Output ;
Program Structure
Declaration:
Start
End
2.2 Sample Program I

#mode 10
declaration
int r
int c
int in
int flg

start

r=0
flg = 1
while( flg == 1 )
if( c == 0) then
flg = 0
endif
c = c-1
endwhile
end

2.3 Code generated
START:
MOV AX, @DATA
MOV DS, AX

MOV AX,
MOV r, AX
MOV AX,
MOV flg, AX
LB01:
MOV AX,
CMP AX,
JNE LB01
MOV AX,
CMP AX,
JNE LB01
MOV AX,
MOV flg, AX
LB01:
MOV AX,
SUB AX,
MOV c, AX
JMP LB01
LB01:
MOV AX, 4C00H
INT 21H

END START

2.4 Sample Program II

#mode 10
declaration
int a ; b
int i
int k
string mes1
start
k=k*1
if(i<9 )then
i=i+9
k=k*1
endif
i=i-45
repeat
i=i+9*k+b
k=k*1
output "Hello World"
input k
until(i<2 )
while(k>3 )
i=i+9
k=k*1
endwhile
end

2.5 Code Generated

START:
MOV AX, @DATA
MOV DS, AX

MOV AX, k
MUL 1
MOV k, AX
MOV AX, i
CMP AX, 9
JGE LB01
MOV AX, i
ADD AX, 9
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
LB01:
MOV AX, i
SUB AX, 45
MOV i, AX
LB01:
MOV AX, i
ADD AX, 9
MUL k
ADD AX, b
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
LEA DX, "Hello World"
CALL MESSAGE
CALL INDEC
MOV k, AX
MOV AX, i
CMP AX, 2
JGE LB01
LB01:

MOV AX,
CMP AX, 3
JLE LB01
MOV AX, i
ADD AX, 9
MOV i, AX
MOV AX, k
MUL 1
MOV k, AX
JMP LB01
LB01:
MOV AX, 4C00H
INT 21H

END START
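
In the listing above, each relational test appears to be lowered to a CMP followed by a jump on the opposite condition: `if(i<9)` becomes `CMP AX, 9` / `JGE`, and `while(k>3)` becomes `CMP AX, 3` / `JLE`. The Python sketch below illustrates that pattern under stated assumptions. It is not the project's generator: the names `INVERTED_JUMP`, `LabelAllocator` and `emit_if` are hypothetical, signed 8086 jumps are assumed, and the sketch numbers its labels uniquely.

```python
# Jump taken when the relational test is FALSE, i.e. the branch that skips
# the guarded block. Signed 8086 conditional jumps are assumed.
INVERTED_JUMP = {
    "<": "JGE",
    ">": "JLE",
    "<=": "JG",
    ">=": "JL",
    "==": "JNE",
    "!=": "JE",
}


class LabelAllocator:
    """Hands out fresh labels LB01, LB02, ... (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._count = 0

    def new_label(self) -> str:
        self._count += 1
        return f"LB{self._count:02d}"


def emit_if(left, op, right, body, labels):
    """Lower `if (left op right) then body endif` to assembly-like lines."""
    skip = labels.new_label()
    lines = [
        f"MOV AX, {left}",
        f"CMP AX, {right}",
        f"{INVERTED_JUMP[op]} {skip}",  # skip the body when the test fails
    ]
    lines.extend(body)
    lines.append(f"{skip}:")
    return lines
```

Calling `emit_if("i", "<", "9", ["MOV AX, i", "ADD AX, 9", "MOV i, AX"], LabelAllocator())` yields the compare, a `JGE LB01`, the body lines, and a closing `LB01:` label.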
Chapter 3
Phases of compiler

3.1 Overview
The analysis part of the compiler breaks the source program into its constituent pieces and imposes
a grammatical structure on them. It then uses this structure to create an intermediate
representation of the source program. The analysis part is also termed the front end of the compiler.

3.2 Lexical analyser
Lexical analysis is the process of converting a sequence of characters from the source program into
a sequence of tokens. A program which performs lexical analysis is termed a lexical analyzer
(lexer), tokenizer or scanner.
Lexical analysis consists of two stages of processing, which are as follows:
• Scanning
• Tokenization
A token is a valid sequence of characters; the matched character sequence itself is called a
lexeme. In a programming language,
• keywords,
• constants,
• identifiers,
• numbers,
• operators and
• punctuation symbols
are possible tokens to be identified.
For example : c=a+b;
In this statement c, a and b are identifiers, '=' and '+' are operators, and ';' is a punctuation
symbol.

Lexical Errors
• A character sequence that cannot be scanned into any valid token is a lexical error.
• Lexical errors are uncommon, but they must still be handled by the scanner.
• Misspellings of identifiers, keywords or operators are considered lexical errors.
Usually, a lexical error is caused by the appearance of some illegal character, mostly at the
beginning of a token.
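
The scanning and tokenization steps, together with the handling of illegal characters, can be sketched in Python as follows. This is a minimal illustration rather than the project's lexer: the token kinds, the shortened keyword set, and the names `Token`, `LexicalError` and `tokenize` are all assumptions made for the example.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    lexeme: str


class LexicalError(Exception):
    """Raised when a character cannot start any valid token."""


# Shortened, assumed keyword set for the example.
KEYWORDS = {"int", "char", "string", "if", "then", "endif", "while", "endwhile"}

# Order matters: two-character operators are listed before one-character ones,
# and ERROR comes last so it only matches what nothing else accepts.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("STRING", r'"[^"\n]*"'),
    ("CHAR", r"'[^'\n]'"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9_]*"),
    ("OPERATOR", r"==|!=|<=|>=|[-+*/%<>=]"),
    ("PUNCT", r"[();:,]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
MASTER_PATTERN = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens from `source`, raising LexicalError on an illegal character."""
    for match in MASTER_PATTERN.finditer(source):
        kind = match.lastgroup
        lexeme = match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ERROR":
            raise LexicalError(
                f"illegal character {lexeme!r} at offset {match.start()}"
            )
        if kind == "IDENT" and lexeme.lower() in KEYWORDS:
            kind = "KEYWORD"  # keywords are matched without regard to case
        yield Token(kind, lexeme)
```

For the statement `c=a+b;` this produces the kinds IDENT, OPERATOR, IDENT, OPERATOR, IDENT, PUNCT, while an input such as `a = 5 @ b` raises `LexicalError` at the `@`.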
3.3 Syntax analyser
Syntax analysis is the second phase of the compiler. Syntax analysis is also known as parsing.
Parsing is the process of determining whether a string of tokens can be generated by a grammar.
It is performed by the syntax analyzer, which is also termed the parser.
In addition to constructing the parse tree, syntax analysis also checks for and reports syntax
errors. The parser is a program that obtains tokens from the lexical analyzer and constructs the
parse tree, which is passed to the next phase of the compiler for further processing.
The parser uses a context-free grammar to perform its error checks.
Types of Parser
• Top-down parsers: construct the parse tree from root to leaves.
• Bottom-up parsers: construct the parse tree from leaves to root.
Role of Parser
• Once a token is generated by the lexical analyzer, it is passed to the parser.
• On receiving a token, the parser verifies that the string of token names can be generated by the
grammar of the source language.
• It calls the function getNextToken() to notify the lexical analyzer to yield another token.
• It scans the tokens one at a time from left to right to construct the parse tree.
• It also checks the syntactic constructs of the grammar.
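
The role of the parser described above can be illustrated with a small top-down (recursive descent) parser for arithmetic expressions. This Python sketch rests on several assumptions: the grammar and the conventional operator precedence are chosen for the example (the report does not state precedence rules), the lexer is replaced by a simple regular expression, and `Parser`, `ParseError` and `get_next_token` are hypothetical names, the last one standing in for getNextToken().

```python
import re


class ParseError(Exception):
    pass


class Parser:
    """Top-down (recursive descent) parser for arithmetic expressions.

    Grammar assumed for this sketch:
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/' | '%') factor)*
        factor -> NUMBER | IDENT | '(' expr ')' | '-' factor
    """

    def __init__(self, source: str) -> None:
        # Stand-in lexer: numbers, identifiers, or any single non-space symbol.
        self._tokens = re.findall(r"\d+|[A-Za-z][A-Za-z0-9_]*|\S", source)
        self._position = 0
        self.current = None
        self.get_next_token()

    def get_next_token(self) -> None:
        """Ask the lexer for the next token; None marks the end of input."""
        if self._position < len(self._tokens):
            self.current = self._tokens[self._position]
            self._position += 1
        else:
            self.current = None

    def parse(self):
        tree = self.expr()
        if self.current is not None:
            raise ParseError(f"unexpected token {self.current!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.current in ("+", "-"):
            operator = self.current
            self.get_next_token()
            node = (operator, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.current in ("*", "/", "%"):
            operator = self.current
            self.get_next_token()
            node = (operator, node, self.factor())
        return node

    def factor(self):
        token = self.current
        if token is None:
            raise ParseError("unexpected end of input")
        if token == "(":
            self.get_next_token()
            node = self.expr()
            if self.current != ")":
                raise ParseError("expected ')'")
            self.get_next_token()
            return node
        if token == "-":
            self.get_next_token()
            return ("neg", self.factor())
        if re.fullmatch(r"\d+|[A-Za-z][A-Za-z0-9_]*", token):
            self.get_next_token()
            return token  # a leaf: number or identifier
        raise ParseError(f"unexpected token {token!r}")
```

`Parser("i+9*k+b").parse()` returns the nested tuple `("+", ("+", "i", ("*", "9", "k")), "b")`, which plays the role of the parse tree; an input such as `(a+b` raises `ParseError`, which corresponds to reporting a syntax error.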
3.4 Semantic analyser
• Semantic analysis is the third phase of the compiler.
• It checks for semantic consistency.
• Type information is gathered and stored in the symbol table or in the syntax tree.
• It performs type checking.
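
The checks listed above can be sketched with a small symbol table and a recursive type checker. The Python code below is an illustration under assumptions, not the project's implementation: the tuple-based expression trees and the names `SymbolTable`, `type_of` and `check_assignment` are hypothetical, and arithmetic is restricted to integers as stated in the language description.

```python
class SemanticError(Exception):
    pass


class SymbolTable:
    """Maps identifier names to declared types ('int', 'char' or 'string')."""

    def __init__(self) -> None:
        self._types = {}

    def declare(self, name: str, type_name: str) -> None:
        key = name.lower()  # identifiers are not case sensitive
        if key in self._types:
            raise SemanticError(f"'{name}' is already declared")
        self._types[key] = type_name

    def lookup(self, name: str) -> str:
        key = name.lower()
        if key not in self._types:
            raise SemanticError(f"'{name}' is used but not declared")
        return self._types[key]


def type_of(node, symbols: SymbolTable) -> str:
    """Return the type of an expression tree, or raise SemanticError.

    Leaves are integer literals (Python int) or identifier names (str);
    inner nodes are (operator, left, right) tuples.
    """
    if isinstance(node, int):
        return "int"
    if isinstance(node, str):
        return symbols.lookup(node)
    operator, left, right = node
    for operand in (left, right):
        if type_of(operand, symbols) != "int":
            raise SemanticError(f"operator '{operator}' needs integer operands")
    return "int"


def check_assignment(target: str, expression, symbols: SymbolTable) -> None:
    """Verify that `expression` has the declared type of `target`."""
    expected = symbols.lookup(target)
    actual = type_of(expression, symbols)
    if expected != actual:
        raise SemanticError(f"cannot assign {actual} to {expected} '{target}'")
```

After declaring `i` and `k` as `int` and `mes1` as `string`, the assignment `i = i + 9 * k` passes the check, whereas `i = i + mes1` raises `SemanticError`.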


3.5 Intermediate code generator

Intermediate code generation is the process by which a compiler converts the analysed source
program into an intermediate representation that is independent of the target machine and can
readily be translated into machine code.
Intermediate code generation produces intermediate representations of the source program, which
take the following forms:
     o Postfix notation
     o Three address code
     o Syntax tree
The most commonly used form is the three address code.
        t1 = inttofloat (5)
        t2 = id3 * t1
        t3 = id2 + t2
        id1 = t3
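
Three address code of this kind can be produced by walking an expression tree and introducing one temporary per operator. The following Python sketch assumes a tuple-based tree and a hypothetical function name, `generate_three_address_code`; it does not model the inttofloat conversion, since the language in this report supports only integer arithmetic.

```python
from itertools import count


def generate_three_address_code(tree, target):
    """Flatten an expression tree into three-address instructions.

    `tree` uses (operator, left, right) tuples for inner nodes and strings
    for operands, for example ("+", "id2", ("*", "id3", "5")).
    """
    instructions = []
    temporaries = count(1)  # numbers the temporaries t1, t2, ...

    def visit(node):
        if isinstance(node, str):
            return node  # operands are used directly, no temporary needed
        operator, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        result = f"t{next(temporaries)}"
        instructions.append(f"{result} = {left_name} {operator} {right_name}")
        return result

    final_name = visit(tree)
    instructions.append(f"{target} = {final_name}")
    return instructions
```

For the tree `("+", "id2", ("*", "id3", "5"))` and target `id1`, the function returns the three lines `t1 = id3 * 5`, `t2 = id2 + t1` and `id1 = t2`. Each instruction has at most one operator on its right-hand side, which is the defining property of three address code.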

Properties of intermediate code

• It should be easy to produce.
• It should be easy to translate into the target program.
After intermediate code generation, the front end of the compiler finishes. The intermediate code
generated is fed as input to the back end of the compiler, which converts this intermediate code to
machine code.

Chapter 4
Screenshots

[Screenshot: Output 1]

[Screenshot: Output 2]
Chapter 5
Feasibility and future scope

New languages that are closer to natural language are being invented. With the growth of
technology, ease of working is given priority. We have moved from C and C++ to Python, Ruby, etc.,
which require fewer lines of code. There are other platforms, such as Android Studio and Qt, which
provide easy GUI creation and use the popular languages Java and C++ respectively.

Our project can be extended to form a new language which is easy to learn, is faster, has more
built-in features and has many more qualities of a good programming language.
A compiler is a program used for automated translation of computer programs from one language to
another. It translates input source code to output machine code which can be executed. Often, the
input language is one that a given computer cannot directly execute, for example because the
language is designed to be human-readable. Often, the output language is one that a given computer
can directly execute.
A compiler is not needed at the time a program executes. It is only an intermediary that converts a
high-level language program into machine-executable code.

Chapter 6
Conclusion
In a compiler, the process of intermediate code generation is independent of the machine, and the
process of converting intermediate code to target code is independent of the source language.
Thus we have implemented the front end of the compilation process. It includes three phases of
compilation, namely lexical analysis, syntax analysis and semantic analysis, followed by
intermediate code generation.
In computer programming, compilation is the translation of source code into object code by a
compiler. This report outlines the analysis phase of compiler construction. In its implementation,
the source language is converted to assembly-level language.
Chapter 7
References
• Salomaa, Arto [1973]. Formal Languages. Academic Press, New York.

• Schulz, Waldean A. [1976]. Semantic Analysis and Target Language Synthesis in a Translator.
Ph.D. thesis, University of Colorado, Boulder, CO.

• https://www.cs.vt.edu/undergraduate/courses/CS4304