SEN 317-Lecture-1
Introduction to Compiler Construction
Implementation Language: The programming language
used to develop the compiler or interpreter that
processes the source code into the target language.
Assembler
• Converts assembly language into machine code.
• Outputs an object file containing machine
instructions.
Linker
• Merges object files into an executable.
• Resolves memory locations and references.
Loader
• Loads executable files into memory for execution.
• Allocates memory and initializes registers.
Compiler Construction- NUN 2024 Austin Olom Ogar
Compiler Design Overview
Compiler Design involves creating software tools called
compilers, which translate human-readable code from
high-level programming languages (e.g., C++, Java) into
machine-readable code (e.g., assembly or machine code).
• Goal: Automate code translation efficiently and
accurately.
• Process: Compilers analyze the structure and syntax of
source code, perform optimizations, and generate
executable programs that can run on computers.
Why Use Compilers?
Translation: Convert high-level code (e.g., C++, Python) into machine code that computers
can execute.
Efficiency: Streamline the translation process, making it faster and more reliable.
Portability: Allow the same source code to be compiled for different hardware and operating
systems without modification.
Analysis Phase:
Known as the front-end of the compiler, the analysis phase
reads the source program, divides it into its core parts, and
checks for lexical, syntactic, and semantic errors. It generates
an intermediate representation of the source program and a
symbol table, which are fed to the synthesis phase as input.
Synthesis Phase:
Known as the back-end of the compiler, the synthesis phase
generates the target program from the intermediate
representation and the symbol table.
• <class> ::= 'class' <identifier> '{' <method>* '}'
• <method> ::= 'public' 'static' 'void' <identifier> '(' <args> ')' '{'
<statements> '}'
Parsing Algorithm:
• The syntax analyzer can use parsing techniques like LL(1) or LR(1) to
check the token stream.
Error Handling:
• If a token doesn't match the grammar rules, the syntax analyzer reports a
syntax error.
Output:
The output is a parse tree or an Abstract Syntax Tree (AST) that represents
the hierarchical structure of the code.
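The two grammar rules above can be checked with a small hand-written recursive-descent parser, which is one way to realise LL(1)-style predictive parsing. The following is a minimal sketch, not the method of any particular compiler. Its assumptions: tokens arrive already separated by whitespace, <args> and <statements> (which these notes do not define) are simply skipped, nested braces are not supported, and the names MiniParser, parse, and the Class(...)/Method(...) output format are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical recursive-descent parser for the <class> and <method> rules.
class MiniParser {
    private final List<String> tokens;
    private int pos = 0;

    MiniParser(List<String> tokens) { this.tokens = tokens; }

    // Entry point. Assumption: tokens are separated by whitespace.
    static String parse(String source) {
        MiniParser p = new MiniParser(Arrays.asList(source.trim().split("\\s+")));
        String tree = p.parseClass();
        if (p.pos != p.tokens.size()) {
            throw new IllegalStateException("Syntax error: unexpected token '" + p.peek() + "'");
        }
        return tree;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<end of input>";
    }

    // Consumes the expected token or reports a syntax error.
    private void expect(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException(
                "Syntax error: expected '" + expected + "' but found '" + peek() + "'");
        }
        pos++;
    }

    private String identifier() {
        String t = peek();
        if (!t.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalStateException("Syntax error: expected identifier but found '" + t + "'");
        }
        pos++;
        return t;
    }

    // <class> ::= 'class' <identifier> '{' <method>* '}'
    private String parseClass() {
        expect("class");
        String name = identifier();
        expect("{");
        List<String> methods = new ArrayList<>();
        while (peek().equals("public")) {   // one token of lookahead selects <method>
            methods.add(parseMethod());
        }
        expect("}");
        return "Class(" + name + ", " + methods + ")";
    }

    // <method> ::= 'public' 'static' 'void' <identifier> '(' <args> ')' '{' <statements> '}'
    private String parseMethod() {
        expect("public");
        expect("static");
        expect("void");
        String name = identifier();
        expect("(");
        skipUntil(")");   // <args> is not defined in the notes, so it is skipped
        expect(")");
        expect("{");
        skipUntil("}");   // <statements> is skipped too (no nested braces)
        expect("}");
        return "Method(" + name + ")";
    }

    private void skipUntil(String stop) {
        while (pos < tokens.size() && !peek().equals(stop)) {
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("class HelloWorld { public static void main ( ) { } }"));
    }
}
```

Each grammar rule becomes one method, and a token that does not fit the rule raises a syntax error, mirroring the error-handling bullet above. The returned string stands in for a real AST node.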
Semantic Analysis
Semantic Analysis is the third phase of the compiler after lexical and syntax
analysis. It ensures that the program is semantically correct, meaning that
the statements in the source code make logical sense and align with the
rules of the programming language:
Type Checking:
• In the statement int x = "Hello";, the semantic analyzer detects a type
mismatch. The variable x is declared as an integer, but it is assigned a
string value, which is not allowed in Java. The semantic analyzer ensures
that values assigned to variables match their declared types.
Scope Resolution:
• The semantic analyzer checks that all variables and methods are declared
and in scope wherever they are used.
Function Parameter Matching:
• The semantic analyzer checks that the arguments passed to functions
match the expected parameter types.
Symbol Table Usage:
• The analyzer ensures that all variables, functions, and classes are entered
and checked in a symbol table to track their types, scopes, and
declarations throughout the program.
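The type-checking and scope checks described above can be illustrated with a small checker. This is a sketch under simplifying assumptions: a single scope, only int and String types, and declarations of the form <type> <name> = <value>; where the value is a literal or a single identifier. The class MiniSemanticChecker and its methods are hypothetical names, and function parameter matching is not covered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical checker for declarations of the form: <type> <name> = <value>;
class MiniSemanticChecker {
    // Symbol table for a single scope: variable name -> declared type.
    private final Map<String, String> symbols = new HashMap<>();
    private final List<String> errors = new ArrayList<>();

    void declare(String type, String name, String value) {
        String valueType = typeOf(value);
        if (valueType == null) {
            errors.add("undeclared identifier '" + value + "'");
        } else if (!valueType.equals(type)) {
            errors.add("type mismatch: cannot assign " + valueType
                + " to " + type + " '" + name + "'");
        }
        symbols.put(name, type);   // enter the name so later statements can use it
    }

    // Literal types are inferred from their shape; identifiers are looked up.
    private String typeOf(String value) {
        if (value.matches("-?\\d+")) {
            return "int";
        }
        if (value.startsWith("\"") && value.endsWith("\"")) {
            return "String";
        }
        return symbols.get(value);   // null when the identifier is not declared yet
    }

    List<String> errors() { return errors; }

    static List<String> demo() {
        MiniSemanticChecker c = new MiniSemanticChecker();
        c.declare("int", "a", "5");           // fine
        c.declare("int", "x", "\"Hello\"");   // type mismatch
        c.declare("int", "y", "z");           // z is never declared
        c.declare("int", "b", "a");           // fine: a is an int
        return c.errors();
    }

    public static void main(String[] args) {
        for (String e : demo()) {
            System.out.println(e);
        }
    }
}
```

Running demo() reports two problems: the mismatch for x and the undeclared identifier z. A full analyzer would walk the AST and keep one table per scope instead of a single map.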
Intermediate Code Generation
Intermediate Code Generation is a key phase in the compilation process,
where the source code is transformed into an intermediate representation
(IR) that is more abstract than machine code but closer to the actual
hardware than the high-level source code. This intermediate code is easier to
optimize and is often used to facilitate portability across different hardware
platforms:
Consider a simple Java program:
public class HelloWorld {
    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        int c = a + b;
        System.out.println(c);
    }
}
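These notes do not show the intermediate code for this program. One common textbook form of IR is three-address code, where each instruction has at most one operator and results are stored in temporaries. The sketch below generates such code for the body of main. It is an illustration only: the classes TacGenerator and Expr, the temporary name t1, and the print pseudo-instruction are all assumptions, not the output of a real Java compiler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical three-address code generator for simple assignments.
class TacGenerator {
    // Minimal expression tree: a leaf holds a name or constant,
    // an inner node holds an operator and two operands.
    static class Expr {
        final String value;
        final Expr left;
        final Expr right;

        Expr(String value) { this(value, null, null); }

        Expr(String value, Expr left, Expr right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    private final List<String> code = new ArrayList<>();
    private int tempCount = 0;

    // Emits instructions for e and returns the name that holds its result.
    private String gen(Expr e) {
        if (e.left == null) {
            return e.value;                 // leaf: already a name or constant
        }
        String l = gen(e.left);
        String r = gen(e.right);
        String temp = "t" + (++tempCount);  // fresh temporary for this operation
        code.add(temp + " = " + l + " " + e.value + " " + r);
        return temp;
    }

    void assign(String target, Expr e) {
        code.add(target + " = " + gen(e));
    }

    // IR for: int a = 5; int b = 10; int c = a + b; System.out.println(c);
    static List<String> helloWorldBody() {
        TacGenerator g = new TacGenerator();
        g.assign("a", new Expr("5"));
        g.assign("b", new Expr("10"));
        g.assign("c", new Expr("+", new Expr("a"), new Expr("b")));
        g.code.add("print c");              // assumed pseudo-instruction for the println call
        return g.code;
    }

    public static void main(String[] args) {
        for (String line : helloWorldBody()) {
            System.out.println(line);
        }
    }
}
```

The generated sequence is a = 5, b = 10, t1 = a + b, c = t1, print c. An optimizer could later remove the temporary t1 by writing c = a + b directly, which is the kind of rewrite an IR makes easy.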
Code Generation Example (for a simple target architecture):
Assembly Code (illustrative pseudo-assembly for a simple register machine with
three-operand instructions; this is not actual x86 syntax):
    MOV R1, 5         ; Move the constant 5 into register R1
    MOV R2, 10        ; Move the constant 10 into register R2
    ADD R3, R1, R2    ; Add the contents of R1 and R2, store the result in R3
    CALL PRINT, R3    ; Call the print function with R3 (value of c)
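The MOV/ADD/CALL listing above can be produced mechanically from a simple intermediate form. The sketch below is a deliberately naive code generator. It assumes input lines of the shapes x = constant, x = y, x = y + z, and print x, and it gives every variable its own register without ever reusing one. The class MiniCodeGen and the input format are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generator that turns simple IR lines into the pseudo-assembly above.
class MiniCodeGen {
    private final Map<String, String> registerOf = new HashMap<>();
    private final List<String> asm = new ArrayList<>();

    // Naive allocation: each variable gets the next unused register.
    private String reg(String var) {
        String r = registerOf.get(var);
        if (r == null) {
            r = "R" + (registerOf.size() + 1);
            registerOf.put(var, r);
        }
        return r;
    }

    void translate(String line) {
        String[] p = line.trim().split("\\s+");
        if (p.length == 2 && p[0].equals("print")) {             // print x
            asm.add("CALL PRINT, " + reg(p[1]));
        } else if (p.length == 3 && p[1].equals("=")) {          // x = constant, or x = y
            String src = p[2].matches("-?\\d+") ? p[2] : reg(p[2]);
            asm.add("MOV " + reg(p[0]) + ", " + src);
        } else if (p.length == 5 && p[1].equals("=") && p[3].equals("+")) {   // x = y + z
            String left = reg(p[2]);
            String right = reg(p[4]);
            asm.add("ADD " + reg(p[0]) + ", " + left + ", " + right);
        } else {
            throw new IllegalArgumentException("unsupported instruction: " + line);
        }
    }

    static List<String> generate(String... lines) {
        MiniCodeGen g = new MiniCodeGen();
        for (String line : lines) {
            g.translate(line);
        }
        return g.asm;
    }

    public static void main(String[] args) {
        for (String s : generate("a = 5", "b = 10", "c = a + b", "print c")) {
            System.out.println(s);
        }
    }
}
```

For the four input lines in main, the generator emits the same four instructions as the listing. A real back-end has only a limited number of registers, so it would also need register allocation and spilling to memory.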
Symbol Table
The Symbol Table is a crucial data structure used in compiler design to store
information about identifiers (variables, functions, objects) used in the
source code. It helps the compiler keep track of declarations and definitions
for generating the correct machine code:
public class HelloWorld {
    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        int c = a + b;
        System.out.println(c);
    }
}
Symbol Table for this Code:
Identifier   Type       Scope          Memory Location
HelloWorld   Class      Global         N/A
main         Method     Global         N/A
args         String[]   Local (main)   Stack location
a            int        Local (main)   Stack location
b            int        Local (main)   Stack location
c            int        Local (main)   Stack location
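A table like this one is often stored as a map from names to entries. The following is a minimal sketch, assuming entries are keyed by scope and name and that a lookup falls back from the local scope to the global one. The class SymbolTable, its Entry type, and the scope labels "Global" and "main" are hypothetical, and the memory locations are kept as descriptive text rather than real addresses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical symbol table keyed by scope and identifier.
class SymbolTable {
    static class Entry {
        final String name;
        final String type;
        final String scope;
        final String location;

        Entry(String name, String type, String scope, String location) {
            this.name = name;
            this.type = type;
            this.scope = scope;
            this.location = location;
        }
    }

    // LinkedHashMap keeps entries in declaration order for printing.
    private final Map<String, Entry> entries = new LinkedHashMap<>();

    void insert(String name, String type, String scope, String location) {
        String key = scope + "::" + name;
        if (entries.containsKey(key)) {
            throw new IllegalStateException(
                "duplicate declaration of '" + name + "' in scope " + scope);
        }
        entries.put(key, new Entry(name, type, scope, location));
    }

    // Searches the given scope first, then falls back to the global scope.
    Entry lookup(String name, String scope) {
        Entry e = entries.get(scope + "::" + name);
        return e != null ? e : entries.get("Global::" + name);
    }

    // Builds the table shown above for the HelloWorld program.
    static SymbolTable forHelloWorld() {
        SymbolTable t = new SymbolTable();
        t.insert("HelloWorld", "Class", "Global", "N/A");
        t.insert("main", "Method", "Global", "N/A");
        t.insert("args", "String[]", "main", "Stack location");
        t.insert("a", "int", "main", "Stack location");
        t.insert("b", "int", "main", "Stack location");
        t.insert("c", "int", "main", "Stack location");
        return t;
    }

    public static void main(String[] args) {
        for (Entry e : forHelloWorld().entries.values()) {
            System.out.println(String.format("%-12s %-10s %-8s %s",
                e.name, e.type, e.scope, e.location));
        }
    }
}
```

Looking up c from inside main returns its int entry, while looking up a name that was never inserted returns null, which a semantic analyzer would report as an undeclared identifier.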
Compiler Tools and Frameworks
ANTLR (ANother Tool for Language Recognition):
• Overview: ANTLR is a powerful parser generator used to read,
process, execute, or translate structured text or binary files. It is
widely used in building parsers for programming languages and
data processing pipelines.
• Features: ANTLR generates lexers, parsers, and tree parsers,
providing a high-level way to define grammars. It supports
multiple programming languages like Java, Python, C#, and
JavaScript, making it a versatile tool for compiler construction.
• Use in Industry: ANTLR is used in tools for code analysis, language
processing, and compilers, making it a go-to framework for
modern programming languages and large-scale projects.
Compiler Tools and Frameworks cont.
Flex/Bison:
• Overview: Flex is a tool for generating lexical analyzers, while
Bison is used to generate parsers. They work together to build
compilers by translating high-level grammars into executable
code.
• Features: Flex scans the input text and produces tokens, which
Bison then parses based on the grammar rules provided. Both tools
generate C source code, so the combination integrates naturally
with C/C++ projects.
• Use in Industry: These tools are widely used in building traditional
compilers, especially for languages like C/C++, and are considered
industry standards for low-level parsing tasks.
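To make the scanning step concrete, the sketch below tokenizes a statement with ordered regular-expression rules, which is roughly the job a Flex-generated scanner performs. It is written in Java to match the other examples in these notes and is not Flex syntax. The token classes (KEYWORD, ID, NUM, SYM), the small keyword list, and the class MiniLexer are assumptions. Flex chooses the longest match among its rules, whereas this sketch approximates that with ordered alternatives and word boundaries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based scanner illustrating what a lexer generator automates.
class MiniLexer {
    // One named group per token class, tried in order like a list of lexer rules.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<WS>\\s+)"
        + "|(?<KEYWORD>\\b(?:int|class|public|static|void)\\b)"
        + "|(?<ID>[A-Za-z_][A-Za-z0-9_]*)"
        + "|(?<NUM>\\d+)"
        + "|(?<SYM>[=+;(){}])");

    // Whitespace is matched but not reported as a token.
    private static final String[] KINDS = {"KEYWORD", "ID", "NUM", "SYM"};

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // The next token must start exactly at pos; otherwise the character is illegal.
            if (!m.find(pos) || m.start() != pos) {
                throw new IllegalArgumentException("Lexical error at position " + pos);
            }
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    tokens.add(kind + ":" + m.group(kind));
                }
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int c = a + b;"));
    }
}
```

The resulting token list is what a parser, such as one generated by Bison, would consume next. Characters that match no rule, for example @, are reported as lexical errors.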
Application of Compiler Construction
• Programming Languages: Compilers enable the translation of high-level programming languages (like C++,
Java) into machine code, allowing software developers to write complex applications efficiently.
• Optimized Code Generation: Compilers are used to optimize code, improving execution speed, memory
usage, and overall performance of programs on different hardware architectures.
• Software Development Tools: Compilers are essential in integrated development environments (IDEs) to
provide error detection, syntax checking, and debugging during the software development process.
• Embedded Systems: In embedded systems development, compilers play a crucial role in generating
machine-specific code that can run efficiently on hardware with limited resources.
• Interpreter Systems: Just-In-Time (JIT) compilers are used inside language runtimes to compile code at
run time, improving execution performance in languages such as Java (HotSpot JVM) and Python (PyPy).