Compiler Construction Lecture 1 - 2

The document outlines a course on Compiler Construction (CS-636) at GIMS-PMAS Arid Agriculture University, detailing the course structure, including sessional marks distribution, project requirements, and the importance of studying compilers. It emphasizes the balance of theory and practice, programming experience, and the process of compiling code from high-level to low-level languages. Additionally, it covers key concepts such as scanning, parsing, semantic analysis, intermediate code generation, and error handling in the compilation process.

Compiler Construction (CS-636)

Gul Sher Ali


MS(CS & Tech.)-China

______________________________________________________
GIMS-PMAS Arid Agriculture University, Gujrat Campus

Agenda

Course outline
Marks distribution
Sessional marks elaboration
Assignment/project submission procedure
Overview of the course
Why study this course?

Marks Distribution

Sessional
    Assignments = 3
    Quizzes = 2 (unannounced + announced)
    Project = 3
Exams
    Mid term = 12
    Final = 20
    Practical = 20

Project

Implementation
    Flex script for the analyzer module
    Any language for the other modules

Why Take this Course

Reason #1: understand compilers and languages
    understand the code structure
    understand language semantics
    understand the relation between source code and generated machine code
    become a better programmer

Why Take this Course

Reason #2: nice balance of theory and practice
Theory
    mathematical models: regular expressions, automata, grammars, graphs
    algorithms that use these models
Practice
    apply theoretical notions to build a real compiler

Why Take this Course

Reason #3: programming experience
    write a large program that manipulates complex data structures

Examples

Typical compilers:
    VC, VC++, GCC, javac
    FORTRAN, Pascal, VB(?)
Translators:
    Word to PDF
    PDF to PostScript

In This Course

We will study typical compilation: from programs written in high-level languages to low-level object code and machine code.

Typical Compilation

High-level source code -> Compiler -> Low-level machine code

Source Code

int expr( int n )
{
    int d;
    d = 4*n*n*(n+1)*(n+1);
    return d;
}

Source Code

Optimized for human readability
    Matches human notions of grammar
    Uses named constructs such as variables and procedures

Assembly Code

    .globl _expr
_expr:
    pushl %ebp
    movl %esp,%ebp
    subl $24,%esp
    movl 8(%ebp),%eax
    movl %eax,%edx
    leal 0(,%edx,4),%eax
    movl %eax,%edx
    imull 8(%ebp),%edx
    movl 8(%ebp),%eax
    incl %eax
    imull %eax,%edx
    movl 8(%ebp),%eax
    incl %eax
    imull %eax,%edx
    movl %edx,-4(%ebp)
    movl -4(%ebp),%edx
    movl %edx,%eax
    jmp L2
    .align 4
L2:
    leave
    ret

Assembly Code

Optimized for hardware
    Consists of machine instructions
    Uses registers and unnamed memory locations
    Much harder for humans to understand

How to Translate

Correctness: the generated machine code must execute precisely the same computation as the source code.

How to Translate

Is there a unique translation? No!
Is there an algorithm for an “ideal translation”? No!

How to Translate

Translation is a complex process
    The source language and the generated code are very different
    Need to structure the translation

Before we start, let’s judge:
What is the output of the following snippets of code?

[Snippets 1 and 2 are not reproduced in this text.]

What is a Compiler?

A compiler is a computer program that translates a program in a source language into an equivalent program in a target language.
A source program/code is a program/code written in the source language, which is usually a high-level language.
A target program/code is a program/code written in the target language, which is often a machine language or an intermediate code.

Source program -> compiler -> Target program
                     |
                     v
               Error message

Process of Compiling

Stream of characters
    [scanner]
Stream of tokens
    [parser]
Parse/syntax tree
    [semantic analyzer]
Annotated tree
    [intermediate code generator]
Intermediate code
    [code optimization]
Intermediate code
    [code generator]
Target code
    [code optimization]
Target code

Some Data Structures

Symbol table
Literal table
Parse tree

Symbol Table

Identifiers are names of variables, constants, functions, data types, etc.
The symbol table stores the information associated with identifiers.
    The information differs by kind of identifier.
    For variables: name, type, address, size (for arrays), etc.
    For functions: name, type of return value, parameters, address, etc.

Symbol Table (cont’d)

Accessed in every phase of a compiler
    The scanner, parser, and semantic analyzer put the names of identifiers in the symbol table.
    The semantic analyzer stores more information (e.g. data types) in the table.
    The intermediate code generator, code optimizer, and code generator use the information in the symbol table to generate appropriate code.
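As a rough illustration of how the phases share the table, here is a minimal sketch of a symbol table built on a dictionary. The class name SymbolTable and its methods insert, annotate, and lookup are hypothetical names chosen for this example, not part of any particular compiler.

```python
class SymbolTable:
    """Hypothetical symbol table: maps identifier names to attribute records."""

    def __init__(self):
        self.entries = {}

    def insert(self, name, kind):
        # Scanner/parser: record the name the first time it is seen.
        return self.entries.setdefault(name, {"name": name, "kind": kind})

    def annotate(self, name, **attrs):
        # Semantic analyzer: attach more information, e.g. the data type.
        self.entries[name].update(attrs)

    def lookup(self, name):
        # Later phases: read the stored information (None if unknown).
        return self.entries.get(name)


table = SymbolTable()
table.insert("expr", kind="function")
table.insert("n", kind="variable")
table.insert("d", kind="variable")
table.annotate("expr", return_type="int", parameters=["n"])
table.annotate("n", type="int")
table.annotate("d", type="int")
print(table.lookup("d"))
```

The records for a function and a variable carry different attributes, which is why each entry is kept as a free-form record here instead of a fixed set of fields.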

Literal table

Stores the constants and strings used in a program
    Reduces memory size by reusing constants and strings
Can be combined with the symbol table
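The reuse of constants can be sketched as follows. This is only an illustration under simple assumptions: the class LiteralTable and its method intern are hypothetical names, and each literal is identified by its type and value.

```python
class LiteralTable:
    """Hypothetical literal table: each distinct literal is stored once."""

    def __init__(self):
        self.values = []   # the literals, in order of first appearance
        self.index = {}    # (type name, value) -> position in self.values

    def intern(self, value):
        # The type name is part of the key so that 1 and 1.0 stay distinct.
        key = (type(value).__name__, value)
        if key not in self.index:
            self.index[key] = len(self.values)
            self.values.append(value)
        return self.index[key]


literals = LiteralTable()
a = literals.intern("hello")
b = literals.intern(4)
c = literals.intern("hello")   # reuses the slot allocated for the first "hello"
print(a, b, c, literals.values)
```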

Parse tree

Dynamically allocated, pointer-based structure
Information for the different kinds of data in a parse tree needs to be stored somewhere.
    Nodes are variant records, storing information for different types of data.
    Nodes store pointers to information held in other data structures, e.g. the symbol table.
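One possible way to picture variant-record nodes is a small set of node classes, one per kind of node. The class names Num, Var, and BinOp are assumptions made for this sketch; the symbol table is reduced to a plain dictionary.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int      # a constant leaf


@dataclass
class Var:
    entry: dict     # reference to the identifier's symbol-table record


@dataclass
class BinOp:
    op: str
    left: object    # child nodes are references to other nodes
    right: object


symbols = {"n": {"name": "n", "type": "int"}}

# Tree for the expression n + 1
tree = BinOp("+", Var(symbols["n"]), Num(1))

# The node shares the record with the symbol table, so information added
# later (here an address) is visible through the tree as well.
symbols["n"]["address"] = 8
print(tree.left.entry)
```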

Scanning

A scanner reads a stream of characters and groups them into units that are meaningful with respect to the source language, called tokens.
It produces a stream of tokens for the next phase of the compiler.
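A minimal sketch of a scanner, written with regular expressions, is shown below. The token names (NUMBER, IDENT, OP, KEYWORD), the keyword set, and the function name scan are assumptions made for this example; the course project uses a Flex script for this module instead.

```python
import re

# Token classes, tried in order. ERROR catches any character nothing else matches.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=(){};]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"int", "return"}


def scan(source):
    """Turn a string of characters into a list of (kind, text) tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


tokens = scan("d = 4*n*n*(n+1)*(n+1);")
print(tokens[:5])
```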

Parsing

A parser gets a stream of tokens from the scanner and determines whether the syntax (structure) of the program is correct according to the (context-free) grammar of the source language.
It then produces a data structure, called a parse tree or an abstract syntax tree, which describes the syntactic structure of the program.
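The idea can be sketched with a small recursive-descent parser. The grammar below (expr, term, factor) is an assumed toy grammar for + and * expressions, not the grammar of any real language, and the tokenizer is reduced to one regular expression that silently ignores characters outside the token set.

```python
import re

# Assumed toy grammar:
#   expr   -> term ('+' term)*
#   term   -> factor ('*' factor)*
#   factor -> NUMBER | IDENT | '(' expr ')'


def parse(source):
    """Return a syntax tree as nested (op, left, right) tuples, or raise SyntaxError."""
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token in (None, "+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return token  # a number or an identifier

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("4*n*(n+1)"))
```

Each grammar rule becomes one function, so the shape of the code mirrors the shape of the grammar. A syntax error is reported as soon as the token stream stops matching a rule.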

Semantic analysis

It gets the parse tree from the parser together with information about some syntactic elements.
It determines whether the semantics, or meaning, of the program is correct.
This part deals with static semantics:
    semantics of programs that can be checked by reading the program text alone
    syntax of the language that cannot be described by a context-free grammar
Mostly, a semantic analyzer does type checking.
It modifies the parse tree to obtain code that is (statically) semantically correct.
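A minimal sketch of type checking over an expression tree follows. It assumes a toy tree format of nested (op, left, right) tuples with integer constants and identifier names as leaves, a symbol table reduced to a name-to-type dictionary, and a rule that both operands must have the same type. The names SemanticError and check are hypothetical.

```python
class SemanticError(Exception):
    """Raised for static-semantic errors (hypothetical name for this sketch)."""


def check(node, symbols):
    """Return the type of an expression tree, or raise SemanticError."""
    if isinstance(node, tuple):
        op, left, right = node
        left_type = check(left, symbols)
        right_type = check(right, symbols)
        if left_type != right_type:
            raise SemanticError(
                f"operands of {op!r} have types {left_type} and {right_type}"
            )
        return left_type
    if isinstance(node, int):
        return "int"
    if isinstance(node, str):
        # Whether a name was declared cannot be expressed in a context-free grammar.
        if node not in symbols:
            raise SemanticError(f"undeclared identifier {node!r}")
        return symbols[node]
    raise SemanticError(f"unknown node {node!r}")


symbols = {"n": "int", "s": "string"}
print(check(("*", 4, ("+", "n", 1)), symbols))
```

Both checks work from the program text alone, without running it, which is what makes them static.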

Intermediate code generation

An intermediate code generator
    takes a parse tree from the semantic analyzer
    generates a program in the intermediate language
In some compilers, a source program is translated into an intermediate code first, and then the intermediate code is translated into the target language.
In other compilers, a source program is translated directly into the target language.

Intermediate code generation (cont’d)

Using intermediate code is beneficial when compilers that translate a single source language to many target languages are required.
    The front-end of a compiler (scanner to intermediate code generator) can be reused for every such compiler.
    A different back-end (code optimizer and code generator) is required for each target language.
One popular intermediate code is three-address code. A three-address code instruction has the form x = y op z.
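To make the form x = y op z concrete, here is a minimal sketch that flattens an expression tree into three-address instructions. The tree format (nested (op, left, right) tuples), the temporary names t1, t2, ..., and the function name to_three_address are assumptions made for this example. The expression is the one from the earlier expr function.

```python
def to_three_address(tree):
    """Flatten a nested (op, left, right) tree into three-address instructions."""
    code = []
    counter = 0

    def visit(node):
        nonlocal counter
        if not isinstance(node, tuple):
            return str(node)  # leaf: a constant or a variable name
        op, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        counter += 1
        temp = f"t{counter}"  # fresh temporary holding this node's value
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(tree)
    return code, result


# d = 4*n*n*(n+1)*(n+1)
n_plus_1 = ("+", "n", 1)
tree = ("*", ("*", ("*", ("*", 4, "n"), "n"), n_plus_1), n_plus_1)
code, result = to_three_address(tree)
code.append(f"d = {result}")
print("\n".join(code))
```

Every instruction has at most one operator on the right-hand side, so a nested expression becomes a sequence of simple steps connected by temporaries.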

Code optimization

Replacing an inefficient sequence of instructions with a better sequence of instructions.
Sometimes called code improvement.
Code optimization can be done:
    after semantic analysis, performed on a parse tree
    after intermediate code generation, performed on intermediate code
    after code generation, performed on target code
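As one example of an optimization performed on intermediate code, the sketch below removes repeated computations (common subexpression elimination) from a straight-line list of three-address instructions. It is a simplified illustration, not a complete optimizer: it assumes every target is a temporary assigned exactly once and that no operand is reassigned within the sequence. The function name is hypothetical.

```python
def eliminate_common_subexpressions(code):
    """code: list of (target, left, op, right) tuples in straight-line order."""
    seen = {}    # (left, op, right) -> name that already holds that value
    alias = {}   # removed temporary -> surviving name with the same value
    result = []
    for target, left, op, right in code:
        # Rewrite operands that refer to a temporary removed earlier.
        left = alias.get(left, left)
        right = alias.get(right, right)
        key = (left, op, right)
        if key in seen:
            alias[target] = seen[key]  # same computation: reuse, emit nothing
        else:
            seen[key] = target
            result.append((target, left, op, right))
    return result


# Three-address code for 4*n*n*(n+1)*(n+1); n + 1 is computed twice.
code = [
    ("t1", "4", "*", "n"),
    ("t2", "t1", "*", "n"),
    ("t3", "n", "+", "1"),
    ("t4", "t2", "*", "t3"),
    ("t5", "n", "+", "1"),
    ("t6", "t4", "*", "t5"),
]
for instruction in eliminate_common_subexpressions(code):
    print(instruction)
```

The second n + 1 disappears and the last instruction reads t3 instead of t5, so the improved sequence computes the same value with one instruction fewer.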

Code generation

A code generator
    takes either an intermediate code or a parse tree
    produces a target program

Error Handling

Errors can be found in every phase of compilation.
    Errors found during compilation are called static (or compile-time) errors.
    Errors found during execution are called dynamic (or run-time) errors.
Compilers need to detect, report, and recover from errors found in source programs.
Error handlers differ from phase to phase of the compiler.
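The difference between static and dynamic errors can be observed with Python's built-in compile and exec functions, used here only as a stand-in for a compiler and its generated program. Python detects far fewer errors at compile time than a typical compiler for a statically typed language, so this is an illustration of the distinction, not of a full error handler. The function name classify is hypothetical.

```python
def classify(source):
    """Report whether a snippet fails at compile time, at run time, or not at all."""
    try:
        program = compile(source, "<example>", "exec")
    except SyntaxError as error:
        # Detected before any code runs: a static (compile-time) error.
        return f"static error: {error.msg}"
    try:
        exec(program, {})
    except Exception as error:
        # Detected only while the program executes: a dynamic (run-time) error.
        return f"dynamic error: {type(error).__name__}"
    return "no error"


print(classify("x = (1 +"))
print(classify("x = 1 / 0"))
print(classify("x = 1 + 2"))
```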

Cross Compiler

A cross compiler is a compiler that generates target code for a machine different from the one on which the compiler runs.
A host language is the language in which the compiler is written.
T-diagram:
    S       T
        H
    (S = source language, T = target language, H = host language)
Cross compilers are used very often in practice.


Cousins of Compilers

Linkers
Loaders
Interpreters
Assemblers
