0% found this document useful (0 votes)
27 views49 pages

Lecture 1 16092024 120225pm

Uploaded by

rehmanhammad81
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
27 views49 pages

Lecture 1 16092024 120225pm

Uploaded by

rehmanhammad81
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 49

Compiler Construction

Lecture 1: Introduction

Dr. Sabina Akhtar


About me
• MS(Computer Science)
– FAST-NU Islamabad (2002-2005)

• MS(Computer Science)
– Université de Lorraine,
Nancy France (2007-2008)
About me
• Ph.d (Formal Verification)
– Researched at INRIA Nancy
– Université de Lorraine, Nancy, France (2008-2012)
Contact details

• Email: [email protected]
• Office hours:
Some Rules

– Attendance during first 10 minutes for each hour

– After that you will be marked as absent

– Raise your hand before asking any question and


then WAIT for the permission

– Must wear your ID card in the class


Some Rules
– Submission must be done on time

– No compensation for the missed quizzes,


assignments & projects

– Keep mobile phones on silent mode in the class

– Whatever you do, please do not create


disturbance in the class
Dishonesty, Plagiarism

• Students involved in any kind of cheating in any


exam (Quizzes/Assignments/Projects) will get 0
(Zero) in that exam OR get F in the course

• In case of quizzes with cheating case, their


weightage will be highest.
Pre-Requisites
• Must have cleared Automata Theory

• Should be familiar with programming


languages
Tentative Evaluation Breakdown

Total 100
Quizzes 10
Assignments 8
Projects 12
Mid-Term Examination 20
Final Examination 50
Assignments and Projects
• 2 Assignments
• 2 Projects
– Phase 1: Write language specification and
generate scanner and parser for the language.
– Phase 2: Write intermediate code generator and
target code generator in java for the output of
Phase 1.
– Documentation is mandatory along with the
projects.

No late submissions will be accepted!!!!


Groups for Projects
• 4-5 students per group
– All the students must participate in project
development process
– Marking will done on individual understanding
– List to be submitted today
• Assignments – in group
Tools required
• JavaCC
– a parser generator and a lexical analyzer generator
• Javac
– compiler for java programs
• Java Virtual Machine
• Eclipse(optional)
– plugin available for javacc
Course outline
Week # Topic
1 Introduction to compilers
2 Revision RE, DFA,…Project Introduction
3 Lexical Analysis
4 Introduction to Parsing
5 Issues in parsing grammars
6 Top down parsing
7 Syntax directed translation
8 Semantic analysis
9 Midterms
Course outline
Week # Topic
10 Intermediate code generation
11 Optimizations
12 Code generation
13 Runtime Environments
14 Bottom-up analysis
15,16 Demos
17 Revision
18 Final Exams
Books
• Text Book:
– Compilers Principles, Techniques and Tools, 2nd edition, the
“Dragon Book”, by Aho/Lam/Sethi/Ullman. Pearson
Education Inc. 2009.
• Web Reference:
– Lectures from New York University.
– Link: http://cs.nyu.edu/courses/fall14/CSCI-GA.2130-001/
– Lectures from Stanford University.
– Link: http://web.stanford.edu/class/cs143/

15
What is a compiler, what does it do?
What is this course about?
• How are programs written in a high-level
language transformed into machine code?
• How can we ensure a program is somewhat
meaningful?
• How is the program’s memory managed?
• How can we analyze programs to discover
invariant properties and improve their runtime
performance?
Compiler

• Source programs: typically high level,


• Target programs: typically assembler or
object/machine code,
• Compiler: implementations, e.g. C, Python, Java ...
Historical background
Modern languages and compilers
• Object-oriented languages
– Java, C++, C#, Python, Ruby,…
– Compilers are more complex

• What does Java compiler do?


• What about C#?
We already have many languages and their
compilers,...

Do we still need more compilers??


Context denpendency
• Designed for specific context
• Cannot be used in a different scenario
• Example
– SQL
• Where is it used?
– Java
–C
– Python
Example
• Go programming language by Google
– Why did they develop it?
– Large server software in C++
– Compilation was too slow

• Go programs compile faster than other languages


• Why?

Thats why we need to study compiler construction!


Language Processors
• Interpreter:
– a program (written in a meta language) for
executing another program.
• Compiler:
– a program (written in a meta language) that
translates a program into an equivalent program.

• Meta-language:
– a language to talk about another language
Advantages of compilers

• allow programming at an understandable abstraction


level,
• allow programs to be written in machine-
independent languages,
• help in verifying software and error reporting,
• help code optimization.
What is a compiler?
• A programming language translator

Source code Compiler Target code


Structure

Source Front IR Back Target


Code code Code
End End
Over Simplified!
Source
Code

Front IR IR IR Back
End code Optimization code End

Target
Code
Front End
Input string

Lexical Syntactic Semantic


analysis analysis analysis

IR Code
Lexical analysis

Le t t e r s Words

Lexical
Characters analyzer
tokens
Lexical analysis

Stream of characters

What are the tokens that will be produced by


lexer?
Lexemes

From Linguistics we have that a lexeme is the smallest


meaningful entity of a language.

The lexemes here: position, =, initial, +, rate, *, and 60.


Lexical analysis

Stream of characters

scanned into list of tokens, one for each lexeme:

⟨id,1⟩ ⟨=⟩ ⟨id,2⟩ ⟨+⟩ ⟨id,3⟩ ⟨∗⟩ ⟨num,60⟩


Syntactic analysis
⟨id,1⟩ ⟨=⟩ ⟨id,2⟩ ⟨+⟩ ⟨id,3⟩ ⟨∗⟩ ⟨num,60⟩

parsed into syntax tree (precedence):


Semantic analysis

enriched with semantic information (explicit


type conversion):
Intermediate Representation

translated to intermediate code:


Back End
Intermediate code

IR Code
Optimization generation

Target code
Optimization

optimized to
Code Generation

generates
Class Activity

Your task is to show the output of each of the


compiler phases discussed in this lecture for
the source code statement

position = (initial + rate)


Class Activity 2

Your task is to show the output of each of the


compiler phases discussed in this lecture for
the source code statement

if x == y then z = 1;

Note that keywords will be identified separately by the lexer


Compiler-construction tools
• Commonly used tools:
– scanner generators: token description → lexical
analyser,
– parser generators: grammar → syntax analyser,
– syntax-directed translation engines: syntax tree → IR,
– code-generator generators: translation rules → code
generator,
• Compiler-generation: integrated set of the above.
JavaCC
• JavaCC is Java’s Compiler Compiler
– creates scanners (lexical analyzers, lexers) and
parsers written in Java
• Since it creates both the lexer and the parser,
JavaCC is thus equivalent to a combination of
the traditional lex/flex and yacc/bison tools
• Free download from
– https://javacc.dev.java.net
JavaCC
• Unlike yacc/bison, JavaCC support is limited to
LL(k)
• As with lex/flex/yacc/bison, JavaCC’s .jj file
includes the grammars as well as embedded
code that defines what happens as
productions are recognized
• Output: seven Java source files, comprising
the lexer, parser, and supporting classes such
as exceptions, I/O utilities, and constants
Assignment 1
• Your task is to show the output of each of the
compiler phases discussed in this lecture for
the source code statement

celsius = (fahrenheit – 32) * (5/9)


Assignment 1
• What sequence of tokens does the lexical analyzer output?
• What abstract syntax tree does the syntax analyzer output?
• What is the abstract syntax tree after the semantic analyzer
modifies it?
• What sequence of three-address instructions does the IR
generator output?
• What sequence of three-address instructions does the
optimizer output?
• What sequence of machine instructions does the code
generator output?
Assignment 1
• Due date
– next week before the class
• No late submissions allowed after the
deadline
Quiz in the next week
• Revise your Automata Theory
• Topics
– Regular Expressions
– Deterministic Finite Automata
– Context Free Grammar
– Kleene’s closure
– Positive closure,…
References
• Book Section 1.1, 1.2
• Lectures from New York University. Link:
http://cs.nyu.edu/courses/fall14/CSCI-
GA.2130-001/
• Lectures from Stanford University. Link:
http://web.stanford.edu/class/cs143/

You might also like