International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 08 Issue: 03 | Mar 2021 www.irjet.net p-ISSN: 2395-0072

An Overview of Compiler Construction

Bashir S. Abubakar, Abdulkadir Ahmad, Muktar M. Aliyu, Muhammad M. Ahmad, Hafizu U. Uba

Department of Computer Science, Kano University of Science and Technology, Wudil, Kano, Nigeria
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – Research in compiler construction has been one of the core research areas in computing. Researchers in this domain try to understand how computer systems and programming languages relate to each other. A compiler translates code written in human-readable form (source code) into target code (machine code) that is efficient and optimized in terms of time and space, without altering the meaning of the source program. This paper aims to explain what a compiler is and give an overview of the stages involved in translating computer programming languages.

Key Words: compiler, phases of a compiler, analysis, synthesis, features of a compiler

1. INTRODUCTION

Computer programs are written in assembly or high-level languages. However, a computer system understands none of these languages directly. Therefore, a compiler is needed to translate a high-level language. A high-level language is a language written in a human-readable form with an easy-to-read syntax [6]. Examples of such languages are Java, C#, Delphi, Ruby and many others. Any computer program written in a high-level language is known as source code. A compiler takes source code as input, processes it and produces object code without changing the meaning of the source code [6]. The object code is sometimes called machine code or target code [7].

A compiler is system software that transforms source code into an intermediate code, which is afterwards transformed into target code without altering the meaning of the source code [5, 3]. The result of this transformation (machine code) must be efficient and optimized in terms of time and space (memory size). The interface between a computer programmer and a computer system is the compiler and the operating system [3]. A compiler detects errors in the source code during compilation and reports them. There are three types of error in computer programming: syntax, runtime and logic errors [7, 6]. Of these three, only syntax errors are detected during compilation; the other two types occur during program execution [4].

A compiler has two parts: the front-end and the back-end [7]. The front-end analyses the source code, and the back-end then synthesises the target code [6]. In an ideal compiler design, the back-end has no knowledge of the source language, and the front-end has no knowledge of the target language. A compiler operates in stages, and each stage performs a specific task. These stages are the scanner, parser, semantic analysis, intermediate code generation, optimization and code generator [1, 6, 7].

Fig 1: Abstract view of a compiler

1.1 Features of a compiler

a. Correctness
b. Speed of compilation
c. Preserves the correct meaning of the code
d. Compile time proportional to program size
e. Good diagnostics for syntax errors
f. Good error reporting and handling
g. Works well with debuggers

1.2 Types of compiler

Compilers are divided into three types, namely:
a. Single-pass compiler
b. Two-pass compiler
c. Multi-pass compiler
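
The three error classes named in the introduction can be told apart by the point at which they surface. The following is a minimal sketch, not part of the original paper: it assumes Python's built-in compile() as a stand-in for a compiler and exec() as a stand-in for program execution, and the function name classify and the three snippets are hypothetical.

```python
# Illustrative only: compile() stands in for "the compiler" and exec()
# for "program execution". The snippets below are hypothetical examples.

def classify(source: str) -> str:
    """Report at which point a snippet fails, if at all."""
    try:
        code = compile(source, "<example>", "exec")  # compilation step
    except SyntaxError:
        return "syntax error (caught during compilation)"
    try:
        exec(code, {})  # execution step
    except Exception as exc:
        return f"runtime error ({type(exc).__name__}, raised during execution)"
    return "no error detected"

syntax_bad = "count = = frequency + 1"
runtime_bad = "frequency = 0\ncount = 1 / frequency"
# Logic error: the programmer meant to add 1 but subtracts instead.
# Neither compilation nor execution can flag it.
logic_bad = "frequency = 4\ncount = frequency - 1"

for snippet in (syntax_bad, runtime_bad, logic_bad):
    print(classify(snippet))
```

The logic error passes both steps silently, which is why it can only be found by checking the program's results against what was intended.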

© 2021, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 587
2. THE COMPONENTS OF A COMPILER

Before a compiler translates source code to object code, the source code undergoes a series of steps, and these steps are called the phases of a compiler [6]. Each phase performs a single and unique duty. A data structure called a symbol table stores the information gathered about the program's identifiers and is shared by the phases, and an error handler needs to be present to keep track of the errors encountered [7]. A compiler consists of six (6) phases. These phases can be grouped into two (2) categories as follows [6].

2.1 Analysis

The source code is divided into meaningful units, and an intermediate representation is created. This part is further subdivided into three (3) as follows:
a. Lexical analysis
b. Syntax analysis
c. Semantic analysis

2.2 Synthesis

The output of the analysis is used here to produce the desired machine-oriented code. This part is subdivided into three (3):
a. Intermediate code generation
b. Code optimization
c. Code generation

Fig 2: Block Diagram of Compiler

LEXICAL ANALYSIS

Lexical analysis is the first stage of compilation. This stage is also called scanning [6]. In this stage, the source code is scanned and any whitespace or comments are removed. Then, the source code is divided into meaningful sequences of characters called lexemes, and each lexeme is categorised as a token.

A token may be composed of a single character or a sequence of characters. A token is classified as one of: identifiers, keywords, operators, separators, literals, and comments. For each lexeme the scanner produces a token as output in the form [7]: <token-name, attribute-value>

A lexical analyser may be implemented using regular expressions and deterministic finite automata from automata theory [6]. A regular expression is used to specify a token, while a deterministic finite automaton is used to recognise it. Now let us analyse the following:

Count = frequency + 1

Lexeme (collection of characters)    Token (category of lexeme)
Count                                Identifier (id)
=                                    Assignment operator
frequency                            Identifier (id)
+                                    Addition operator
1                                    Integer constant

Fig 3: Lexical Analyzer Interface

SYNTAX ANALYSIS

The next stage immediately after the scanner is syntax analysis. This stage is also known as parsing [6]. The parsing stage takes the tokens produced in the first phase and constructs a syntax tree (parse tree). The goal of parsing is to determine the syntactic validity of a source string.
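
The scanning and parsing stages can be sketched together for the statement Count = frequency + 1. This is a minimal illustration under stated assumptions, not the paper's implementation: the token names, the TOKEN_SPEC table, the three-rule grammar and the choice of a recursive-descent parser are all hypothetical, and Python's re module is used in place of a hand-built deterministic finite automaton.

```python
import re
from typing import List, NamedTuple

class Token(NamedTuple):
    name: str   # token-name, e.g. "id"
    value: str  # attribute-value; here simply the lexeme itself

# Hypothetical token specification: one regular expression per token class.
TOKEN_SPEC = [
    ("int", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("assign", r"="),
    ("plus", r"\+"),
    ("skip", r"\s+"),  # whitespace is recognised, then discarded
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def scan(source: str) -> List[Token]:
    """Lexical analysis: turn a character string into a list of tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        if m.lastgroup != "skip":
            tokens.append(Token(m.lastgroup, m.group()))
        pos = m.end()
    return tokens

# Assumed grammar (illustrative):
#   stmt -> id '=' expr
#   expr -> term ('+' term)*
#   term -> id | int
def parse(tokens: List[Token]):
    """Syntax analysis: build a syntax tree as nested tuples, or raise."""
    pos = 0

    def expect(name: str) -> Token:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos].name != name:
            found = tokens[pos].name if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {name}, found {found}")
        pos += 1
        return tokens[pos - 1]

    def term():
        if pos < len(tokens) and tokens[pos].name == "int":
            return ("int", expect("int").value)
        return ("id", expect("id").value)

    def expr():
        node = term()
        while pos < len(tokens) and tokens[pos].name == "plus":
            expect("plus")
            node = ("+", node, term())
        return node

    target = expect("id")
    expect("assign")
    tree = ("=", ("id", target.value), expr())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos].value!r}")
    return tree

tokens = scan("Count = frequency + 1")
tree = parse(tokens)
print(tokens)
print(tree)
```

A syntactically invalid string such as Count = + 1 still scans into tokens, but parse raises SyntaxError on it, which is how the parser reports that the source string is not valid under the grammar.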


Parsing is implemented using a context-free grammar (CFG) [6, 7]. CFG notation is used for the syntactic specification of a programming language. Now let us analyse the following:

Count = frequency + 1

Parse tree as an output of Parser

Fig 4: parse information flow

SEMANTIC ANALYSIS

This is the third stage of compilation. Semantic analysis checks for semantic errors in the parse tree produced by the syntax analyzer [6]. Examples of semantic errors are data type incompatibility, use of an undeclared variable, and many more.

INTERMEDIATE CODE GENERATOR

In this phase, a machine-oriented intermediate code is generated. It represents a program for some abstract machine [6]. The intermediate code lies between the human-oriented source program and the machine-oriented target program.

CODE OPTIMIZER

The intermediate code generated in the previous stage is optimized in this stage. The structure of the tree that is generated by the parser can be rearranged to suit the needs of the machine architecture to produce an object code that runs faster [2]. The optimization is achieved, for example, by removing unnecessary code.

CODE GENERATOR

The code generator is the last phase of compilation. It uses the optimized intermediate representation to generate native machine code. This stage depends on the machine architecture.

3. CONCLUSIONS

This paper explains what a compiler is and gives an overview of the steps involved in translating a programming language into object code. A compiler translates source code into object code without tampering with the meaning of the source code. Translation involves six phases, namely: lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization and code generation. Each of these phases performs a single task.

REFERENCES

[1]. De Oliveira Guimarães, J. (2007). Learning compiler construction by examples. ACM SIGCSE Bulletin, 39(4), 70. doi:10.1145/1345375.1345418

[2]. Guilan, D., Suqing, Z., Jinlan, T., & Weidu, J. (2002). A study of compiler techniques for multiple targets in compiler infrastructures. ACM SIGPLAN Notices, 37(6), 45. doi:10.1145/571727.571735

[3]. Chhabra, J., Chopra, H., & Vats, A. (2014). Research paper on Compiler Design. International Journal of Innovative Research in Technology (IJIRT), 1(5).

[4]. Zelkowitz, M. V. (1975). Third generation compiler design. Proceedings of the 1975 Annual Conference on - ACM 75. doi:10.1145/800181.810332

[5]. Rudmik, A., & Lee, E. S. (1979). Compiler design for efficient code generation and program optimization. Proceedings of the 1979 SIGPLAN Symposium on Compiler Construction - SIGPLAN ’79. doi:10.1145/800229.806962


[6]. Grune, D., van Reeuwijk, K., Bal, H. E., Jacobs, C. J. H., & Langendoen, K. (2012). Modern Compiler Design. doi:10.1007/978-1-4614-4699-6

[7]. Aho, A. V., Lam, M. S., Sethi, R., & Ullman, J. D. (2006). Compilers: Principles, Techniques, and Tools (2nd ed.). Addison Wesley. ISBN-13: 978-0321486813

[8]. Koskimies, K., Räihä, K.-J., & Sarjakoski, M. (1982). Compiler construction using attribute grammars. Proceedings of the 1982 SIGPLAN Symposium on Compiler Construction - SIGPLAN ’82. doi:10.1145/800230.806991

[9]. Noonan, R. E. (1986). Compiler construction using modern tools. Proceedings of the Seventeenth SIGCSE Technical Symposium on Computer Science Education - SIGCSE ’86. doi:10.1145/5600.5697

[10]. Demaille, A., Levillain, R., & Perrot, B. (2008). A set of tools to teach compiler construction. Proceedings of the 13th Annual Conference on Innovation and Technology in Computer Science Education - ITiCSE ’08. doi:10.1145/1384271.1384291

[11]. Li, H., Hu, C., Zhang, P., & Xie, L. (2016). Modular SDN Compiler Design with Intermediate Representation. Proceedings of the 2016 Conference on ACM SIGCOMM 2016 Conference - SIGCOMM ’16. doi:10.1145/2934872.2959061

[12]. Chen, H., Ching, W.-M., & Hendren, L. (2017). An ELI-to-C compiler: design, implementation, and performance. Proceedings of the 4th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming - ARRAY 2017. doi:10.1145/3091966.3091969
