Terminology: Statement (敘述) Grammar (文法) Syntax (語法) vs. Semantics (語意)
Statement ( 敘述 )
» e.g., a declaration, or an assignment containing an expression ( 運算式 )
Grammar ( 文法 )
» a set of rules that specify the form of legal statements
Syntax ( 語法 ) vs. Semantics ( 語意 )
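» Example: assuming J,K:integer and X,Y:float
» I:=J+K vs. I:=X+Y
» Both statements have the same syntax (an assignment whose right-hand side is a sum) but different semantics: the first specifies an integer addition, the second a floating-point addition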
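A minimal sketch of the distinction, assuming a hypothetical type table that mirrors the declarations in the example. The function name and the strings it returns are illustrative only. The point is that the operand types, not the form of the statement, select the operation.

```python
# Hypothetical type table taken from the example's assumption
# (J, K are integers; X, Y are floats).
TYPES = {"J": "integer", "K": "integer", "X": "float", "Y": "float"}

def addition_semantics(left, right, types=TYPES):
    """Return the kind of addition implied by the operand types."""
    if types[left] == "integer" and types[right] == "integer":
        return "integer add"
    return "floating-point add"

# Same syntactic form (operand + operand), different meaning.
print(addition_semantics("J", "K"))
print(addition_semantics("X", "Y"))
```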
Compilation ( 編譯 )
» matching statements written by the programmer to structures defined by the grammar and generating the appropriate object code
System Software
Assembler
Loader and Linker
Macro Processor
Compiler
Operating System
Other System Software
» RDBS
» Text Editors
» Interactive Debugging System
Basic Compiler
Scanner
» Source statements are broken into a sequence of tokens, e.g.
» PROGRAM STATS VAR SUM , SUMSQ , I …
» SUM := 0 ; SUMSQ := …
» READ ( VALUE ) ;
Parser
Terminology
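» ::= is the definition symbol, read "is defined to be"
» Nonterminal symbols are enclosed in angle brackets < >
» | separates alternatives
» Terminal symbols are the tokens themselves, written without brackets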
Simplified Pascal Grammar
Parser
Syntax Tree
Syntax Tree for Program 5.1
Lexical Analysis
Function
» scanning the program to be compiled and recognizing the tokens that make up the source statements
Tokens
» Tokens can be keywords, operators, identifiers, integers, floating-point numbers, character strings, etc.
» Each token is usually represented by some fixed-length code, such as an integer, rather than as a variable-length character string (see Figure 5.5)
» Token type, Token specifier (value) (see Figure 5.6)
Scanner Output
Token specifier
» identifier name, integer value
Token coding scheme
» Figure 5.5
Example - Figure 5.6
Statement                     Token type   Token specifier
PROGRAM STATS                 1
                              22           ^STATS
VAR                           2
SUM,SUMSQ,I, …, : INTEGER     22           ^SUM
                              14
                              22           ^SUMSQ
                              14
                              …
                              14
                              22           ^VARIANCE
                              13
                              6
BEGIN                         3
SUM:=0;                       22           ^SUM
                              15
                              23           #0
                              12
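The token type / token specifier pairs of Figure 5.6 can be reproduced with a small scanner. This is only a sketch: the code table below contains just the codes visible in the Figure 5.6 excerpt (the full scheme is in Figure 5.5, which is not reproduced here), and the function name `scan` and the regular expression are assumptions, not the textbook's implementation.

```python
import re

# Only the codes that appear in the Figure 5.6 excerpt (assumed subset).
KEYWORDS = {"PROGRAM": 1, "VAR": 2, "BEGIN": 3, "INTEGER": 6}
OPERATORS = {";": 12, ":": 13, ",": 14, ":=": 15}
ID, INT = 22, 23

# ":=" is listed before the single characters so the longer operator wins.
TOKEN_RE = re.compile(r"\s*(?:([A-Z][A-Z0-9]*)|([0-9]+)|(:=|[;:,]))")

def scan(source):
    """Return a list of (token_type, token_specifier) pairs."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unrecognized character at position {pos}")
        word, number, op = m.groups()
        if word is not None:
            if word in KEYWORDS:
                tokens.append((KEYWORDS[word], None))
            else:
                # Mirrors the ^NAME notation used in Figure 5.6.
                tokens.append((ID, "^" + word))
        elif number is not None:
            # Mirrors the #value notation used in Figure 5.6.
            tokens.append((INT, "#" + number))
        else:
            tokens.append((OPERATORS[op], None))
        pos = m.end()
    return tokens

print(scan("SUM:=0;"))
# [(22, '^SUM'), (15, None), (23, '#0'), (12, None)]
```

Keywords and operators need no specifier, so the sketch stores None for them. Only identifiers and integers carry extra information.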
Token Recognizer
By grammar
» <ident> ::= <letter> | <ident> <letter>| <ident><digit>
» <letter> ::= A | B | C | D | … | Z
» <digit> ::= 0 | 1 | 2 | 3 | … | 9
By scanner - modeling as finite automata
» Figure 5.8(a)
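The grammar for <ident> above (a letter followed by any number of letters or digits) can be modeled as a two-state finite automaton. The following is a sketch under assumptions: the state numbers, the table layout, and the name `is_identifier` are illustrative choices, not a copy of Figure 5.8(a).

```python
# States: 1 = start, 2 = inside an identifier (the only accepting state).
TRANSITIONS = {
    (1, "letter"): 2,
    (2, "letter"): 2,
    (2, "digit"): 2,
}
ACCEPTING = {2}

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if "A" <= ch <= "Z":
        return "letter"
    if "0" <= ch <= "9":
        return "digit"
    return "other"

def is_identifier(text):
    """Run the automaton over text; accept only if it ends in state 2."""
    state = 1
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:          # no transition defined: reject
            return False
    return state in ACCEPTING

print(is_identifier("SUMSQ"), is_identifier("X1"), is_identifier("1X"))
# True True False
```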
Recognizing Identifier
» State 1 → State 2 on A-Z
» State 2 → State 2 on A-Z, 0-9
» State 2 → State 3 on "-"
» State 3 → State 2 on A-Z, 0-9
Recognizing Integer
» State 1 → State 2 on 0-9
» State 1 → State 2 on 1-9
» State 3 → State 4 on space
Scanner -- Implementation
Syntactic Analysis
Operator-Precedence Parsing
Operator
» any terminal symbol (or any token)
Precedence
» * ⋗ + (* takes precedence over +)
» + ⋖ * (+ yields precedence to *)
Operator-precedence
» precedence relations between operators
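A sketch of how precedence relations drive a bottom-up parse. The relation table below is an assumption written for a tiny expression language (identifiers, +, *), with "<" standing for ⋖ and ">" for ⋗. It is not the precedence matrix for the grammar of Fig. 5.2, and the handle recognition is deliberately simplified.

```python
# Assumed precedence relations; "$" marks both ends of the input.
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def kind(tok):
    """Classify a terminal: operators and "$" are themselves, else "id"."""
    return tok if tok in ("+", "*", "$") else "id"

def reduce_top(stack):
    """Replace the handle on top of the stack by a tree node (a tuple)."""
    if isinstance(stack[-1], str):
        # A terminal on top must be an identifier; an operator here
        # means an operand is missing.
        if kind(stack[-1]) != "id":
            raise SyntaxError("missing operand")
        stack.append(("id", stack.pop()))
        return
    # Otherwise expect the pattern: node operator node.
    if (len(stack) < 4 or stack[-2] not in ("+", "*")
            or isinstance(stack[-3], str)):
        raise SyntaxError("malformed expression")
    right, op, left = stack.pop(), stack.pop(), stack.pop()
    stack.append((op, left, right))

def parse(tokens):
    """Shift or reduce according to the relation between the topmost
    terminal on the stack and the next input token."""
    stack, tokens, pos = ["$"], list(tokens) + ["$"], 0
    while True:
        top = next(item for item in reversed(stack) if isinstance(item, str))
        tok = tokens[pos]
        if top == "$" and tok == "$":
            break
        rel = PREC.get((kind(top), kind(tok)))
        if rel == "<":
            stack.append(tok)      # shift
            pos += 1
        elif rel == ">":
            reduce_top(stack)      # reduce
        else:
            raise SyntaxError(f"no relation between {top!r} and {tok!r}")
    if len(stack) != 2 or isinstance(stack[1], str):
        raise SyntaxError("empty or incomplete expression")
    return stack[1]

print(parse(["A", "+", "B", "*", "C"]))
# ('+', ('id', 'A'), ('*', ('id', 'B'), ('id', 'C')))
```

Because + ⋖ *, the parser shifts * instead of reducing A + B, so the multiplication ends up lower in the tree and is evaluated first.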
Precedence Matrix for the Grammar in Fig. 5.2
Operator-Precedence Parse Example
BEGIN READ ( VALUE ) ;
(i) … id1 := id2 DIV
(ii) … id1 := <N1> DIV int -
Bottom-up parsing
Generating precedence matrix
» Aho et al. (1988)
Shift-reduce Parsing with Stack
Figure 5.14
Recursive-Descent Parsing
Recursive-Descent Parse of READ
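A minimal sketch of a recursive-descent parse of a READ statement such as READ ( VALUE ). The two rules used here, <read> ::= READ ( <id-list> ) and <id-list> ::= id { , id }, are assumed for illustration; the class and method names are hypothetical. Each nonterminal becomes one procedure, and the iterative form of <id-list> avoids the left recursion that a top-down parser cannot handle directly.

```python
class Parser:
    """One method per nonterminal; tokens are plain strings."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it (None at end)."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, expected):
        """Consume the next token, which must equal `expected`."""
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def read_stmt(self):
        """<read> ::= READ ( <id-list> )   (assumed rule)"""
        self.expect("READ")
        self.expect("(")
        names = self.id_list()
        self.expect(")")
        return ("read", names)

    def id_list(self):
        """<id-list> ::= id { , id }   (assumed rule, iteration instead
        of left recursion)"""
        names = [self.identifier()]
        while self.peek() == ",":
            self.pos += 1
            names.append(self.identifier())
        return names

    def identifier(self):
        """Accept a token that starts with a letter and is alphanumeric."""
        tok = self.peek()
        if tok is None or not tok.isalnum() or not tok[0].isalpha():
            raise SyntaxError(f"expected identifier, found {tok!r}")
        self.pos += 1
        return tok

print(Parser(["READ", "(", "VALUE", ")"]).read_stmt())
# ('read', ['VALUE'])
```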
Simplified Pascal Grammar for Recursive-Descent Parser