
SEN 317-Lecture-1


Lecture 1

Introduction to Compiler Construction

Compiler Construction- NUN 2024 Austin Olom Ogar


Language Translator
A language translator is a program that converts instructions written
in one programming language (source code) into machine code or
another programming language, enabling a computer to execute the
code. There are three main types of translators:
1. Compilers: Translate the entire source code of a high-level
language into machine code at once before execution.
2. Interpreters: Translate and execute code statement by statement, without
producing a separate machine-code file.
3. Assemblers: Convert assembly language into machine code, which
is directly executed by the computer.



Key Terms in Compiler Design and Translation Process
Source Language: The language in which the original
code (source code) is written before it gets processed or
compiled.

Target Language: The language in which the compiled or
translated code (object code) is generated, typically
machine code or another low-level language.

Implementation Language: The programming language
used to develop the compiler or interpreter that
processes the source code into the target language.

Object Code: The output produced when a
compiler translates the human-readable source code
into a form that the computer's processor can
understand.
Language Processing System
Preprocessor
• Produces input for compilers.
• Handles macros, file inclusion, and language
extensions.

Assembler
• Converts assembly language into machine code.
• Outputs an object file containing machine
instructions.

Linker
• Merges object files into an executable.
• Resolves memory locations and references.

Loader
• Loads executable files into memory for execution.
• Allocates memory and initializes registers.
Compiler Design Overview
Compiler Design involves creating software tools called
compilers, which translate human-readable code from
high-level programming languages (e.g., C++, Java) into
machine-readable code (e.g., assembly or machine code).
• Goal: Automate code translation efficiently and
accurately.
• Process: Compilers analyze the structure and syntax of
source code, perform optimizations, and generate
executable programs that can run on computers.
Why Use Compilers?
Translation: Convert high-level code (e.g., C, C++) into machine code that computers
can execute.

Efficiency: Streamline the translation process, making it faster and more reliable.

Programmer Convenience: Allow programmers to write in high-level languages, which are
easier to read, write, and maintain.

Optimization: Improve code performance by applying various optimizations, enhancing
execution speed and efficiency.

Portability: Allow the same source code to be compiled for different hardware and
operating systems, usually with little or no modification.



Who Invented the Compiler?
Grace Hopper: The Pioneer of Compilers
• Background: Grace Hopper was a trailblazing American computer scientist whose
compiler work dates to the early 1950s.
• First Compiler: She developed the A-0 System, widely regarded as the first compiler;
it translated symbolic mathematical code into machine code.
• Revolutionary Impact: Hopper's invention transformed programming by enabling:
    • Human-Readable Languages: Programmers could write in languages that are
    easier to understand and use.
    • Accelerated Development: Software development became faster and more
    efficient.
• Legacy: Her groundbreaking work laid the foundation for modern compiler
technology, which remains vital in computer programming today.



How Does a Compiler Work?
Overview of Compiler Functionality
1. Translation Process:
A compiler converts high-level programming languages (written by humans) into
machine-readable instructions that computers can execute.
2. Analysis:
The compiler analyzes the code's structure and syntax to ensure it is correct and
follows the rules of the programming language.
3. Optimization:
Once verified, the compiler optimizes the code for efficiency, enhancing
performance and resource usage.
4. Code Generation:
The final step involves generating machine code, which consists of binary
instructions tailored to the specific architecture of the computer.
Types of Compilers
The four main types of compilers are as follows:
1. Single-Pass Compiler:
    • Processes the source code in a single pass from start to finish.
    • Generates machine code on the go.
    • Efficient, but may not catch all errors or perform extensive optimization.
2. Multi-Pass Compiler:
    • Makes multiple passes over the source code, analyzing it in different stages.
    • Allows for thorough error checking and optimization.
    • Generally slower than single-pass compilers.
3. Just-In-Time (JIT) Compiler:
    • Translates code into machine language during program execution (on the fly).
    • Commonly used in languages like Java and JavaScript.
    • Enhances performance by converting code as needed.
4. Ahead-of-Time (AOT) Compiler:
    • Translates code into machine language before execution, creating an executable file.
    • Frequently used in languages like C and C++.
    • Offers fast execution but requires prior compilation.



Types of Compiler Phases
A compiler can broadly be divided into two phases
based on the work each one performs.

Analysis Phase:
Known as the front end of the compiler, the analysis phase
reads the source program, divides it into its core
parts, and then checks for lexical, syntax, and semantic errors.
The analysis phase generates an intermediate representation
of the source program and a symbol table, which are fed
to the synthesis phase as input.

Synthesis Phase:
Known as the back end of the compiler, the synthesis phase
generates the target program with the help of the intermediate
representation and the symbol table.

A compiler can have many phases and passes.



Phases of Compiler
The compilation process is a sequence of
phases. Each phase takes input from the previous phase,
has its own representation of the source program, and feeds
its output to the next phase of the compiler. Let us
understand the phases of a compiler.

The following Java source code will be used to explain
the phases.

Source Code

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}
Lexical Analysis
Lexical Analysis is the first phase of a compiler where the source code
is broken down into tokens. Each token represents a meaningful unit
of the code such as keywords, identifiers, symbols, etc.
Keywords:
• public, class, static, void
• These are reserved words in Java.
Identifiers:
• HelloWorld, main, args, String, System, out, println
• These are names for classes, methods, or variables. HelloWorld, main, and args are
user-defined; String, System, out, and println come from the Java library.
Symbols/Operators:
• {, }, (, ), [, ], ., ;
• These are syntactic symbols that delimit code blocks and separate or group
program elements.
Literals:
• "Hello, World!"
• This is a string literal.
Whitespace:
• Spaces, tabs, and newlines separate tokens but are otherwise discarded during
tokenization.
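The token categories above can be illustrated with a small regular-expression lexer. This is only a minimal sketch under stated assumptions: the class name LexerSketch, the Token class, and the category labels are hypothetical, only the four keywords from the example are recognized, and numeric literals, comments, and operators are left out. It is not how javac tokenizes Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical teaching sketch; real lexers handle many more token kinds.
final class LexerSketch {
    // A token pairs a category with the exact text (lexeme) that was matched.
    static final class Token {
        final String type;
        final String text;
        Token(String type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    // Assumption: only the keywords used in the HelloWorld example.
    private static final Set<String> KEYWORDS =
        new HashSet<>(Arrays.asList("public", "class", "static", "void"));

    // One alternative per category: whitespace, string literal, word, symbol.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<ws>\\s+)|(?<str>\"[^\"]*\")|(?<word>[A-Za-z_][A-Za-z0-9_]*)|(?<sym>[{}()\\[\\];.])");

    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        int pos = 0;
        while (pos < source.length()) {
            // Match exactly at the current position so no character is skipped silently.
            m.region(pos, source.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            if (m.group("str") != null) {
                tokens.add(new Token("LITERAL", m.group("str")));
            } else if (m.group("word") != null) {
                String word = m.group("word");
                tokens.add(new Token(KEYWORDS.contains(word) ? "KEYWORD" : "IDENTIFIER", word));
            } else if (m.group("sym") != null) {
                tokens.add(new Token("SYMBOL", m.group("sym")));
            }
            // Whitespace only separates tokens, so nothing is emitted for it.
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("System.out.println(\"Hello, World!\");")) {
            System.out.println(t);
        }
    }
}
```

Matching at a fixed position (rather than searching ahead) is a deliberate choice: any character that fits no category raises an error instead of being ignored.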
Syntax Analyzer
A syntax analyzer (or parser) takes the token stream generated by the lexical
analyzer and checks it against the formal grammar rules of the programming
language. For a basic syntax analyzer for a source code, here are the steps:
Input:
• The input to the syntax analyzer is a stream of tokens generated by the
lexical analyzer, for example public, class, static, {, }, and so on.
Grammar Definition:
• Define the grammar rules of the language being analyzed. For example, a
simple Java grammar might include rules such as:
    • <class> ::= 'class' <identifier> '{' <method>* '}'
    • <method> ::= 'public' 'static' 'void' <identifier> '(' <args> ')' '{'
    <statements> '}'
Parsing Algorithm:
• The syntax analyzer can use parsing techniques like LL(1) or LR(1) to
check the token stream.
Error Handling:
• If a token doesn't match the grammar rules, the syntax analyzer reports a
syntax error.
Output:
• The output is a parse tree or an Abstract Syntax Tree (AST) that represents
the hierarchical structure of the code.
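The two grammar rules above can be checked with a hand-written recursive-descent parser, where each rule becomes one method. The sketch below is hypothetical (ParserSketch and its methods are illustrative names) and makes three simplifying assumptions: tokens arrive as plain strings, an optional leading 'public' is accepted before 'class' so that the HelloWorld example parses, and <args> and <statements> are skipped up to the next ')' or '}' rather than parsed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical recursive-descent sketch: one method per grammar rule.
final class ParserSketch {
    private final List<String> tokens;
    private int pos = 0;

    private ParserSketch(List<String> tokens) { this.tokens = tokens; }

    // Returns a short summary of the parsed structure, or throws on a syntax error.
    static String parse(List<String> tokens) {
        return new ParserSketch(tokens).parseClass();
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<end of input>";
    }

    private void expect(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("Syntax error at token " + pos
                + ": expected '" + expected + "' but found '" + peek() + "'");
        }
        pos++;
    }

    private String identifier() {
        String t = peek();
        // Simplification: keywords are not excluded from identifiers here.
        if (!t.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalStateException("Syntax error at token " + pos
                + ": expected an identifier but found '" + t + "'");
        }
        pos++;
        return t;
    }

    // <class> ::= 'class' <identifier> '{' <method>* '}'
    private String parseClass() {
        if (peek().equals("public")) pos++; // assumption: optional modifier
        expect("class");
        String name = identifier();
        expect("{");
        List<String> methods = new ArrayList<>();
        while (peek().equals("public")) methods.add(parseMethod());
        expect("}");
        if (pos != tokens.size()) {
            throw new IllegalStateException("Unexpected tokens after class body");
        }
        return "class " + name + " " + methods;
    }

    // <method> ::= 'public' 'static' 'void' <identifier> '(' <args> ')' '{' <statements> '}'
    private String parseMethod() {
        expect("public"); expect("static"); expect("void");
        String name = identifier();
        expect("(");
        skipUntil(")");  // <args> is skipped, not parsed
        expect(")");
        expect("{");
        skipUntil("}");  // <statements> is skipped; nested braces are not supported
        expect("}");
        return name;
    }

    private void skipUntil(String stop) {
        while (pos < tokens.size() && !peek().equals(stop)) pos++;
    }

    public static void main(String[] args) {
        System.out.println(parse(Arrays.asList(
            "public", "class", "HelloWorld", "{",
            "public", "static", "void", "main", "(", "String", "[", "]", "args", ")", "{",
            "System", ".", "out", ".", "println", "(", "\"Hello, World!\"", ")", ";",
            "}", "}")));
    }
}
```

A full parser would build an AST node in each method instead of returning names; the control flow, however, would follow the grammar in the same way.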
Semantic Analysis
Semantic Analysis is the third phase of the compiler after lexical and syntax
analysis. It ensures that the program is semantically correct, meaning that
the statements in the source code make logical sense and align with the
rules of the programming language:
Type Checking:
• In the statement int x = "Hello";, the semantic analyzer detects a type
mismatch. The variable x is declared as an integer, but it is assigned a
string value, which is not allowed in Java. The semantic analysis ensures
that values assigned to variables match their declared types.
Scope Resolution:
• The semantic analyzer checks that every variable and method that is used
has a declaration visible in the scope where it is used.
Function Parameter Matching:
• The semantic analyzer checks that the arguments passed to functions
match the expected parameter types.
Symbol Table Usage:
• The analyzer ensures that all variables, functions, and classes are entered
and checked in a symbol table to track their types, scopes, and
declarations throughout the program.
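A minimal sketch of the type-checking and scope checks described above follows. It assumes a drastically reduced language in which every statement is a declaration of the form "type name = initializer" and the initializer is an integer literal, a string literal, or a variable name. TypeCheckSketch and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: checks declarations against a single-scope symbol table.
final class TypeCheckSketch {
    private final Map<String, String> symbols = new HashMap<>(); // name -> declared type
    final List<String> errors = new ArrayList<>();

    // Infers the type of a literal or variable; returns null if it cannot be resolved.
    private String typeOf(String expr) {
        if (expr.matches("-?\\d+")) return "int";
        if (expr.length() >= 2 && expr.startsWith("\"") && expr.endsWith("\"")) return "String";
        String type = symbols.get(expr);
        if (type == null) errors.add("undeclared identifier: " + expr);
        return type;
    }

    // Checks one declaration such as: int x = "Hello";
    void declare(String declaredType, String name, String initializer) {
        if (symbols.containsKey(name)) {
            errors.add("duplicate declaration: " + name);
        }
        // The initializer is checked before the name is entered, so "int x = x" is rejected.
        String actual = typeOf(initializer);
        if (actual != null && !actual.equals(declaredType)) {
            errors.add("type mismatch: cannot assign " + actual + " to " + declaredType + " " + name);
        }
        symbols.put(name, declaredType);
    }

    public static void main(String[] args) {
        TypeCheckSketch checker = new TypeCheckSketch();
        checker.declare("int", "a", "5");
        checker.declare("int", "x", "\"Hello\"");
        checker.declare("int", "y", "z");
        for (String error : checker.errors) {
            System.out.println(error);
        }
    }
}
```

Errors are collected in a list rather than thrown so that, as in most compilers, several problems can be reported in one run.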
Intermediate Code Generation
Intermediate Code Generation is a key phase in the compilation process,
where the source code is transformed into an intermediate representation
(IR) that is more abstract than machine code but closer to the actual
hardware than the high-level source code. This intermediate code is easier to
optimize and is often used to facilitate portability across different hardware
platforms:
Let's consider a simple Java source code:

public class HelloWorld {
    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        int c = a + b;
        System.out.println(c);
    }
}

Equivalent intermediate code:

t1 = 5          // Assign 5 to temporary variable t1
t2 = 10         // Assign 10 to temporary variable t2
t3 = t1 + t2    // Add t1 and t2, store the result in t3
print t3        // Print the value of t3

This representation is useful for optimization before converting it to machine code.
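The temporaries in the intermediate code above can be produced by walking an expression tree and emitting one instruction per node. The following is a hedged sketch of that idea: TacSketch, its tiny Expr classes, and the rule that each declared variable is bound to the temporary holding its value are all assumptions made for illustration, and only integer literals, variables, and addition are supported.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: emits intermediate code with fresh temporaries t1, t2, ...
final class TacSketch {
    interface Expr {}
    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
    }
    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }
    static final class Add implements Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    final List<String> code = new ArrayList<>();
    private final Map<String, String> tempOf = new HashMap<>(); // variable -> temporary
    private int counter = 0;

    private String newTemp() { return "t" + (++counter); }

    // Emits code for e and returns the temporary that holds its value.
    private String gen(Expr e) {
        if (e instanceof Num) {
            String t = newTemp();
            code.add(t + " = " + ((Num) e).value);
            return t;
        }
        if (e instanceof Var) {
            String t = tempOf.get(((Var) e).name);
            if (t == null) throw new IllegalStateException("undeclared variable: " + ((Var) e).name);
            return t;
        }
        Add add = (Add) e;
        String left = gen(add.left);
        String right = gen(add.right);
        String t = newTemp();
        code.add(t + " = " + left + " + " + right);
        return t;
    }

    void declare(String name, Expr init) { tempOf.put(name, gen(init)); }

    void print(Expr e) { code.add("print " + gen(e)); }

    public static void main(String[] args) {
        TacSketch g = new TacSketch();
        g.declare("a", new Num(5));                        // int a = 5;
        g.declare("b", new Num(10));                       // int b = 10;
        g.declare("c", new Add(new Var("a"), new Var("b"))); // int c = a + b;
        g.print(new Var("c"));                             // System.out.println(c);
        for (String line : g.code) System.out.println(line);
    }
}
```

Run on the four statements of main, this sketch emits the same four lines as the intermediate code shown above.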
Code Optimization
Code Optimization is a critical phase in compiler design where the intermediate code is optimized
to make it more efficient without changing its functionality. The goal is to improve performance,
reduce resource usage, and enhance the execution speed of the final program.
Let's consider a simple Java source code:

public class HelloWorld {
    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        int c = a + b;
        System.out.println(c);
    }
}

Code Optimization Techniques:
• Constant Folding:
    • Since a and b are only ever assigned the constants 5 and 10, their values can be
    substituted into a + b (constant propagation) and the sum computed at compile
    time instead of during execution.
    • Optimized: int c = 15; (replaces int c = a + b;)
• Dead Code Elimination:
    • Code that does not affect the program's output can be removed. Here, once c has
    been folded to 15, the assignments to a and b are no longer needed and can be
    dropped. In larger programs, this can be crucial for optimization.
• Strength Reduction:
    • Replaces an expensive operation with a cheaper equivalent one, for example
    x * 2 with x + x.
• Loop Optimization:
    • Reduces the work done inside loops, for example by moving a computation whose
    result never changes out of the loop.

Optimized Intermediate Code:

t1 = 15     // Constant folding for int c = a + b;
print t1    // The value t1 is printed directly, skipping unnecessary operations.
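Constant folding can be sketched as a recursive rewrite of an expression tree. In the hypothetical FoldSketch below, the map named known is an assumption standing in for the result of constant propagation: it lists the variables whose values are known at compile time. Only integer literals, variables, and addition are modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of constant propagation plus constant folding on a tiny AST.
final class FoldSketch {
    interface Expr {}
    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
    }
    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }
    static final class Add implements Expr {
        final Expr left, right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Returns an equivalent expression with every constant subexpression precomputed.
    static Expr fold(Expr e, Map<String, Integer> known) {
        if (e instanceof Var) {
            Integer value = known.get(((Var) e).name);
            return value != null ? new Num(value) : e; // constant propagation
        }
        if (e instanceof Add) {
            Expr left = fold(((Add) e).left, known);
            Expr right = fold(((Add) e).right, known);
            if (left instanceof Num && right instanceof Num) {
                // Plain int addition wraps on overflow exactly as it would at run time,
                // so folding does not change the program's behavior.
                return new Num(((Num) left).value + ((Num) right).value);
            }
            return new Add(left, right); // at least one side is not constant
        }
        return e; // a literal is already as simple as it can be
    }

    public static void main(String[] args) {
        Map<String, Integer> known = new HashMap<>();
        known.put("a", 5);
        known.put("b", 10);
        Expr c = fold(new Add(new Var("a"), new Var("b")), known);
        System.out.println(c instanceof Num ? "c = " + ((Num) c).value : "c is not constant");
    }
}
```

When a subexpression contains a variable that is not in known, the sketch keeps the addition but still folds whatever parts it can.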
Code Generation
Code Generation is the final phase of compilation. It takes the optimized
intermediate code and generates assembly code or machine code for the
target machine:
Let's consider a simple Java source code:

public class HelloWorld {
    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        int c = a + b;
        System.out.println(c);
    }
}

Code Generation Example (for a simple target architecture):
Assembly-style code (a simplified, hypothetical three-operand register machine,
not real x86 syntax):

MOV R1, 5         ; Move the constant 5 into register R1
MOV R2, 10        ; Move the constant 10 into register R2
ADD R3, R1, R2    ; Add the contents of R1 and R2, store in R3
CALL PRINT, R3    ; Call the print function with R3 (value of c)
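The mapping from intermediate code to the register-style instructions above can be sketched as a line-by-line translation. The hypothetical CodeGenSketch below assumes the intermediate code uses only the three instruction shapes seen in this lecture ("tN = constant", "tN = tA + tB", "print tN") and that temporary tN is simply placed in register RN, which presumes an unlimited supply of registers. A real code generator needs register allocation and a real instruction set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: translates simple intermediate code to pseudo-assembly.
final class CodeGenSketch {
    static List<String> generate(List<String> intermediateCode) {
        List<String> asm = new ArrayList<>();
        for (String line : intermediateCode) {
            String[] p = line.trim().split("\\s+");
            if (p.length == 2 && p[0].equals("print")) {
                asm.add("CALL PRINT, " + register(p[1]));
            } else if (p.length == 3 && p[1].equals("=")) {
                asm.add("MOV " + register(p[0]) + ", " + operand(p[2]));
            } else if (p.length == 5 && p[1].equals("=") && p[3].equals("+")) {
                asm.add("ADD " + register(p[0]) + ", " + register(p[2]) + ", " + register(p[4]));
            } else {
                throw new IllegalArgumentException("unsupported instruction: " + line);
            }
        }
        return asm;
    }

    // Naive allocation (an assumption of this sketch): temporary tN lives in register RN.
    private static String register(String temp) {
        if (!temp.matches("t\\d+")) {
            throw new IllegalArgumentException("not a temporary: " + temp);
        }
        return "R" + temp.substring(1);
    }

    // A MOV source is either another temporary or an integer constant.
    private static String operand(String text) {
        if (text.matches("t\\d+")) return register(text);
        if (text.matches("-?\\d+")) return text;
        throw new IllegalArgumentException("unsupported operand: " + text);
    }

    public static void main(String[] args) {
        List<String> tac = Arrays.asList("t1 = 5", "t2 = 10", "t3 = t1 + t2", "print t3");
        for (String instruction : generate(tac)) {
            System.out.println(instruction);
        }
    }
}
```

Feeding it the unoptimized intermediate code from the earlier phase yields the four instructions listed above.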
Symbol Table
The Symbol Table is a crucial data structure used in compiler design to store
information about identifiers (variables, functions, objects) used in the
source code. It helps the compiler keep track of declarations and definitions
for generating the correct machine code:
public class HelloWorld {
    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        int c = a + b;
        System.out.println(c);
    }
}

Symbol Table for this Code:

Identifier    Type        Scope           Memory Location
HelloWorld    Class       Global          N/A
main          Method      Global          N/A
args          String[]    Local (main)    Stack location
a             int         Local (main)    Stack location
b             int         Local (main)    Stack location
c             int         Local (main)    Stack location
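One common way to implement such a table is a stack of hash maps, one map per scope. The sketch below is hypothetical (SymbolTableSketch, Symbol, and the scope labels are illustrative names) and stores only the identifier, type, and scope columns; memory locations are omitted because they are assigned later, during code generation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a scoped symbol table: the innermost scope is searched first.
final class SymbolTableSketch {
    static final class Symbol {
        final String name, type, scope;
        Symbol(String name, String type, String scope) {
            this.name = name; this.type = type; this.scope = scope;
        }
    }

    private final Deque<Map<String, Symbol>> scopes = new ArrayDeque<>();
    private final Deque<String> scopeNames = new ArrayDeque<>();

    SymbolTableSketch() { enterScope("Global"); }

    void enterScope(String name) {
        scopes.push(new HashMap<String, Symbol>());
        scopeNames.push(name);
    }

    void exitScope() {
        scopes.pop();
        scopeNames.pop();
    }

    void declare(String name, String type) {
        Map<String, Symbol> current = scopes.peek();
        if (current.containsKey(name)) {
            throw new IllegalStateException("already declared in this scope: " + name);
        }
        current.put(name, new Symbol(name, type, scopeNames.peek()));
    }

    // Returns the nearest visible declaration, or null if the name is undeclared.
    Symbol lookup(String name) {
        for (Map<String, Symbol> scope : scopes) { // iterates from innermost to outermost
            Symbol symbol = scope.get(name);
            if (symbol != null) return symbol;
        }
        return null;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        table.declare("HelloWorld", "Class");
        table.declare("main", "Method");
        table.enterScope("Local (main)");
        table.declare("args", "String[]");
        table.declare("a", "int");
        Symbol a = table.lookup("a");
        System.out.println(a.name + " : " + a.type + " in " + a.scope);
        table.exitScope();
        System.out.println(table.lookup("a") == null ? "a is no longer visible" : "a is visible");
    }
}
```

Leaving a scope discards its map, which is why local names such as a stop being visible after main's scope is exited.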
Compiler Tools and Frameworks
ANTLR (ANother Tool for Language Recognition):
• Overview: ANTLR is a powerful parser generator used to read,
process, execute, or translate structured text or binary files. It is
widely used in building parsers for programming languages and
data processing pipelines.
• Features: ANTLR generates lexers, parsers, and tree parsers,
providing a high-level way to define grammars. It supports
multiple programming languages like Java, Python, C#, and
JavaScript, making it a versatile tool for compiler construction.
• Use in Industry: ANTLR is used in tools for code analysis, language
processing, and compilers, making it a go-to framework for
modern programming languages and large-scale projects.
Compiler Tools and Frameworks cont.
Flex/Bison:
• Overview: Flex is a tool for generating lexical analyzers, while
Bison is used to generate parsers. They work together to build
compilers by translating high-level grammars into executable
code.
• Features: Flex scans the input text and produces tokens, which
Bison then parses based on the grammar rules provided. This
combination is highly efficient for C/C++ programming.
• Use in Industry: These tools are widely used in building traditional
compilers, especially for languages like C/C++, and are considered
industry standards for low-level parsing tasks.
Application of Compiler Construction
• Programming Languages: Compilers enable the translation of high-level programming languages (like C++,
Java) into machine code, allowing software developers to write complex applications efficiently.

• Optimized Code Generation: Compilers are used to optimize code, improving execution speed, memory
usage, and overall performance of programs on different hardware architectures.

• Software Development Tools: Compilers are essential in integrated development environments (IDEs) to
provide error detection, syntax checking, and debugging during the software development process.

• Embedded Systems: In embedded systems development, compilers play a crucial role in generating
machine-specific code that can run efficiently on hardware with limited resources.

• Interpreter Systems: Compilers are used inside language runtimes in the form of Just-In-Time (JIT)
compilers, which compile code at runtime to improve execution performance, for example in the Java
Virtual Machine and in Python implementations such as PyPy.

• Cross-Platform Development: Cross-compilers allow software to be developed on one platform and
executed on another, facilitating cross-platform application development.
