Language Processing System Notes
Evaluation
Active sheets: 10 %
Exercise reports: 30 %
Midterm Exam: 20 %
Final Exam: 40 %
Contact
Send e-mail to
hamada@u-aizu.ac.jp
Course materials at
www.u-aizu.ac.jp/~hamada/education.html
Check every week for updates
Books
Andrew W. Appel, Modern Compiler Implementation in C
A. Aho, R. Sethi and J. Ullman, Compilers: Principles, Techniques and Tools (The Dragon Book), Addison Wesley
S. Muchnick, Advanced Compiler Design and Implementation, Morgan Kaufmann, 1997
Goals
understand the structure of a compiler
understand how the components operate
understand the tools involved: scanner generator, parser generator, etc.
understanding means
  [theory] be able to read source code
  [practice] be able to adapt/write source code
Related to Compilers
Interpreters (direct execution)
Assemblers
Preprocessors
Text formatters (non-WYSIWYG)
Analysis tools
Today's Outline
Introduction to Language Processing Systems
Why do we need a compiler?
What are compilers?
Anatomy of a compiler
Compiler construction touches many topics in Computer Science:
Theory: Finite State Automata, Grammars and Parsing, data-flow
Algorithms: Graph manipulation, dynamic programming
Data structures: Symbol tables, abstract syntax trees
Systems: Allocation and naming, multi-pass systems, compiler construction
Computer Architecture: Memory hierarchy, instruction selection, interlocks and latencies
Security: Detection of and protection against vulnerabilities
Software Engineering: Software development environments, debugging
Artificial Intelligence: Heuristic-based search
Power of a Language
Can be used to describe any action
Not tied to a context
Many ways to describe the same action
Flexible
Natural Languages:
Powerful, but ambiguous
The same expression describes many possible actions
Programming Languages
Properties
need to be precise
need to be concise
need to be expressive
need to be at a high level (lots of abstractions)
Compiler
Input: High-level programming language
Output: Low-level assembly instructions
Compiler does the translation:
Read and understand the program
Precisely determine what actions it requires
Figure out how to faithfully carry out those actions
Instruct the computer to carry out those actions
Computation
Expressions (arithmetic, logical, etc.)
Assignment statements
Control flow (conditionals, loops)
Procedures
[Figure: excerpt of compiler-generated assembly for the function sumcalc (the closing .size sumcalc, .-sumcalc directive and the frame-description data that follows it)]
Anatomy of a Computer
Compiler
What is a compiler?
A compiler is a program that reads a program written in one language and translates it into another language.
Example
Source: X = a + b * 10
Compiler output:
MOV id3, R2
MUL #10.0, R2
MOV id2, R1
ADD R2, R1
MOV R1, id1
What is a compiler?
[Figure: a compiler consists of a front-end (analysis) that produces a semantic representation (the intermediate representation) and a back-end (synthesis) that consumes it]
Compiler Architecture
[Figure: compiler architecture.
Front End: Source language, tokens, Parse tree, AST, Semantic Analysis, IC generator.
Back End: Intermediate Language, Code Optimizer, OIL, Code Generator, Target language.
The Error Handler and the Symbol Table are drawn beside the phases.]
Front-end
lexical analysis -> tokens -> syntax analysis -> AST -> context handling -> annotated AST
The annotated AST is the semantic representation.
[Figure: program in some source language -> front-end (analysis) -> semantic representation -> back-end (synthesis) -> executable code for target machine]
AST example
expression grammar
expression -> expression + term | expression - term | term
term -> term * factor | term / factor | factor
factor -> identifier | constant | ( expression )
example expression
b*b - 4*a*c
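[Figure: annotated AST for the example expression. The root is a "-" node whose children are "*" nodes; each node carries a type annotation (real) and a location annotation (a register such as reg1 or reg2, or a stack offset such as sp+8 or sp+16)]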
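The step from the expression grammar to an AST can be sketched as a small recursive-descent parser. This is only an illustration under stated assumptions: the function name parse, the tuple-based tree shape (operator, left, right), and the token pattern are hypothetical choices, not part of the course material. Because the grammar is left-recursive, the sketch replaces the recursion in expression and term with loops, which keeps the same left-associative trees.

```python
import re

def parse(text):
    """Parse an arithmetic expression into a nested-tuple AST (a sketch)."""
    # Tokens: numbers, identifiers, and the operator/parenthesis characters.
    tokens = re.findall(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def expression():
        # expression -> expression (+|-) term | term, written as a loop
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        # term -> term (*|/) factor | factor, written as a loop
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():
        # factor -> identifier | constant | ( expression )
        tok = advance()
        if tok == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok[0].isdigit():
            return float(tok) if "." in tok else int(tok)
        if tok[0].isalpha() or tok[0] == "_":
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("b*b - 4*a*c"))
# ('-', ('*', 'b', 'b'), ('*', ('*', 4, 'a'), 'c'))
```

The tree has "-" at the root and the two products as children, and 4*a*c groups as (4*a)*c because "*" is left-associative in the grammar.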
Example
Scanner
id1 := id2 + id3 * 60
Parser
:=(id1, +(id2, *(id3, 60)))
Semantic Analyzer
:=(id1, +(id2, *(id3, int-to-real(60))))
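The semantic analyzer's contribution in this example is the int-to-real node around the constant 60. A minimal sketch of that type-checking pass follows. It assumes a tuple-based tree (operator, left, right), a hypothetical function name annotate, and a symbol table that declares all three identifiers as real; none of these come from the slides.

```python
def annotate(node, types):
    """Return (tree, type), inserting int-to-real where int meets real."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, float):
        return node, "real"
    if isinstance(node, str):
        return node, types[node]          # look the identifier up in the symbol table
    op, left, right = node
    left, left_type = annotate(left, types)
    right, right_type = annotate(right, types)
    if left_type != right_type:
        if op == ":=" and left_type == "int":
            # This sketch refuses to narrow a real value into an int variable.
            raise TypeError("cannot assign real to int")
        if left_type == "int":
            left = ("int-to-real", left)
        else:
            right = ("int-to-real", right)
        left_type = "real"
    return (op, left, right), left_type

# Assumed symbol table: every identifier in the example is a real.
symbol_types = {"id1": "real", "id2": "real", "id3": "real"}
tree = (":=", "id1", ("+", "id2", ("*", "id3", 60)))
typed_tree, _ = annotate(tree, symbol_types)
print(typed_tree)
# (':=', 'id1', ('+', 'id2', ('*', 'id3', ('int-to-real', 60))))
```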
example expression
b*b - (4*a*c)
Answers
4*a*c
Break
1. Retargeting - build a compiler for a new machine by attaching a new code generator to an existing front-end.
2. Optimization - reuse intermediate code optimizers in compilers for different languages and different machines.
Note: the terms intermediate code, intermediate language, and intermediate representation are all used interchangeably.
Compiler structure
[Figure: program in some source language -> front-end (analysis) -> semantic representation -> back-end (synthesis) -> executable code for target machine. With a shared intermediate representation (IR), front-ends (FE) for several source languages (C++, Java, FORTRAN) can be combined with back-ends (BE) for several target machines (for example PowerPC).]
Compiler Example
Source: position = initial + rate * 60
Compiler output:
MOV id3, R2
MUL #60.0, R2
MOV id2, R1
ADD R2, R1
MOV R1, id1
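The first step of this translation, replacing each identifier by a symbol-table entry (id1, id2, id3), can be sketched as a tiny scanner. The function name scan, the token pattern, and the choice to number identifiers in order of first appearance are assumptions made for illustration; the slides show only the resulting token stream.

```python
import re

def scan(source):
    """Return (tokens, symbol_table) for a simple assignment statement."""
    symbols = {}   # identifier name -> id1, id2, ... in order of first appearance
    tokens = []
    pattern = r"[A-Za-z_]\w*|\d+(?:\.\d+)?|:=|[-+*/=()]"
    for lexeme in re.findall(pattern, source):
        if lexeme[0].isalpha() or lexeme[0] == "_":
            if lexeme not in symbols:
                symbols[lexeme] = f"id{len(symbols) + 1}"
            tokens.append(symbols[lexeme])
        elif lexeme == "=":
            tokens.append(":=")        # normalise "=" to the assignment token
        else:
            tokens.append(lexeme)
    return tokens, symbols

tokens, symbols = scan("position=initial+rate*60")
print(" ".join(tokens))   # id1 := id2 + id3 * 60
print(symbols)            # {'position': 'id1', 'initial': 'id2', 'rate': 'id3'}
```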
Example
Scanner
id1 := id2 + id3 * 60
Parser
:=(id1, +(id2, *(id3, 60)))
Semantic Analyzer
:=(id1, +(id2, *(id3, int-to-real(60))))
Intermediate Code Generator
temp1 := int-to-real (60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
Code Optimizer
temp1 := id3 * 60.0
id1 := id2 + temp1
Code Generator
MOV id3, R2
MUL #60.0, R2
MOV id2, R1
ADD R2, R1
MOV R1, id1
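The intermediate code generator and code optimizer steps of this example can be sketched as follows. This is an illustration under assumptions, not the course's implementation: the tuple-based tree, the quadruple layout (dest, op, arg1, arg2), and the function names gen_tac, optimize, and show are all hypothetical. The optimizer handles only the two rewrites visible in the example: folding int-to-real of an integer constant, and removing the final copy from a temporary.

```python
def gen_tac(tree):
    """Emit three-address code for an assignment tree, one temporary per operation."""
    code, counter = [], 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        if not isinstance(node, tuple):
            return str(node)                      # identifier or constant
        if node[0] == "int-to-real":
            arg = walk(node[1])
            temp = new_temp()
            code.append((temp, "int-to-real", arg, None))
            return temp
        op, left, right = node
        left_name, right_name = walk(left), walk(right)
        temp = new_temp()
        code.append((temp, op, left_name, right_name))
        return temp

    _, target, expr = tree
    code.append((target, "copy", walk(expr), None))
    return code

def optimize(code):
    """Fold int-to-real on constants, drop the final copy, renumber temporaries."""
    consts, folded = {}, []
    for dest, op, a, b in code:
        a, b = consts.get(a, a), consts.get(b, b)
        if op == "int-to-real" and a.isdigit():
            consts[dest] = str(float(a))          # "60" becomes "60.0" at compile time
            continue
        folded.append((dest, op, a, b))
    # The last temporary is used only by the final copy, so write into the target directly.
    if len(folded) >= 2 and folded[-1][1] == "copy" and folded[-1][2] == folded[-2][0]:
        target = folded[-1][0]
        _, op, a, b = folded[-2]
        folded[-2:] = [(target, op, a, b)]
    rename, result = {}, []
    for dest, op, a, b in folded:
        a, b = rename.get(a, a), rename.get(b, b)
        if dest.startswith("temp"):
            rename[dest] = f"temp{len(rename) + 1}"
            dest = rename[dest]
        result.append((dest, op, a, b))
    return result

def show(code):
    """Format quadruples in the notation used on the slides."""
    lines = []
    for dest, op, a, b in code:
        if op == "copy":
            lines.append(f"{dest} := {a}")
        elif op == "int-to-real":
            lines.append(f"{dest} := int-to-real({a})")
        else:
            lines.append(f"{dest} := {a} {op} {b}")
    return lines

tree = (":=", "id1", ("+", "id2", ("*", "id3", ("int-to-real", 60))))
print("\n".join(show(gen_tac(tree))))
print("\n".join(show(optimize(gen_tac(tree)))))
```

The first print reproduces the four intermediate-code lines of the example and the second reproduces the two optimized lines. A real optimizer works on general data-flow information rather than on these two fixed patterns.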
Resident Compiler
Compiled Application
compiler
Postfix expression
Infix expression: an expression in which each operator is placed between its operands. Example: a + b * 10
Postfix expression: an expression in which each operator comes after its operands. Example: a b 10 * +
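One way to convert infix to postfix is the shunting-yard algorithm, sketched below. The slides do not say which conversion method the course uses, so this is a swapped-in technique; the function name to_postfix and the precedence table are assumptions, and all four operators are treated as left-associative.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_postfix(infix):
    """Convert an infix expression to a space-separated postfix string."""
    output, stack = [], []
    for tok in re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|[-+*/()]", infix):
        if tok in PRECEDENCE:
            # Pop operators that bind at least as tightly (left-associative).
            while stack and stack[-1] != "(" and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()                    # discard the "("
        else:
            output.append(tok)             # operand goes straight to the output
    while stack:
        output.append(stack.pop())
    return " ".join(output)

print(to_postfix("a+b*10"))     # a b 10 * +
print(to_postfix("(a+b)*10"))   # a b + 10 *
```

Postfix needs no parentheses: the second call shows that the grouping is carried entirely by the order of the operators.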
END
Interpreter vs Compiler
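Interpreter: Source Program + Input -> Interpreter -> Output
Compiler: Source Program -> Compiler -> Target Program, then Input -> Target Program -> Output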
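The difference can be sketched on a single expression tree. The interpreter takes the source and the input together and produces the output directly; the compiler first produces a target program, which is then run on the input. Everything named here is an assumption for illustration: the tuple-based tree, the function names, and the toy stack-machine instructions LOAD, PUSH, and APPLY.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def interpret(tree, env):
    """Interpreter: walk the source tree and compute the result immediately."""
    if isinstance(tree, str):
        return env[tree]
    if isinstance(tree, (int, float)):
        return tree
    op, left, right = tree
    return OPS[op](interpret(left, env), interpret(right, env))

def compile_tree(tree):
    """Compiler: translate the tree once into a program for a toy stack machine."""
    if isinstance(tree, str):
        return [("LOAD", tree)]
    if isinstance(tree, (int, float)):
        return [("PUSH", tree)]
    op, left, right = tree
    return compile_tree(left) + compile_tree(right) + [("APPLY", op)]

def run(program, env):
    """Target machine: execute a compiled program on some input."""
    stack = []
    for instr, arg in program:
        if instr == "LOAD":
            stack.append(env[arg])
        elif instr == "PUSH":
            stack.append(arg)
        else:  # APPLY: pop two operands and push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

source = ("+", "initial", ("*", "rate", 60))
inputs = {"initial": 10.0, "rate": 2.0}

print(interpret(source, inputs))   # 130.0
target = compile_tree(source)      # translation happens once, without any input
print(run(target, inputs))         # 130.0
```

The compiled program can be run again on different inputs without looking at the source tree, which is the practical difference the two diagrams express.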
Typical Compiler
Source Program -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Intermediate Code Generator -> Code Optimizer -> Code Generator -> Target Program