Terminology: Statement (敘述) Grammar (文法) Syntax (語法) vs. Semantics (語意)

The document defines key terminology related to programming languages and compilers, including statements, grammar, syntax, semantics, and compilation. It also outlines the main components of a basic compiler, including lexical analysis, syntactic analysis, and code generation. Lexical analysis involves scanning source code and recognizing tokens, syntactic analysis parses the statements according to grammar rules, and code generation produces the target code.

Uploaded by

Jaimon Jacob
Copyright
© Attribution Non-Commercial (BY-NC)

Terminology

 Statement ( 敘述 )
» declaration, assignment containing an expression ( 運算式 )
 Grammar ( 文法 )
» a set of rules that specify the form of legal statements
 Syntax ( 語法 ) vs. Semantics ( 語意 )
» Example: assuming J,K:integer and X,Y:float
» I:=J+K vs I:=X+Y
» both statements have the same syntax (id := id + id), but J+K is an integer addition while X+Y is a floating-point addition
 Compilation ( 編譯 )
» matching statements written by the programmer to structures defined by the grammar and generating the appropriate object code
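
The J+K versus X+Y example can be sketched in a few lines of Python. This is only an illustration: the declared types come from the slide, while the names `DECLARED`, `select_add`, `ADD_INT` and `ADD_FLOAT` are made up for this sketch.

```python
# Declared types from the slide's example. ADD_INT and ADD_FLOAT are
# hypothetical instruction names used only for illustration.
DECLARED = {"J": "integer", "K": "integer", "X": "float", "Y": "float"}

def select_add(left, right):
    """Pick the addition a compiler might emit for `left + right`."""
    if DECLARED[left] == "integer" and DECLARED[right] == "integer":
        return "ADD_INT"
    # Mixed operands would also need a conversion step, not shown here.
    return "ADD_FLOAT"
```

Both calls, `select_add("J", "K")` and `select_add("X", "Y")`, describe statements of the same syntactic form, yet they select different operations. That difference is the semantic part.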

System Software

 Assembler
 Loader and Linker
 Macro Processor
 Compiler
 Operating System
 Other System Software
» RDBS
» Text Editors
» Interactive Debugging System

Basic Compiler

 Lexical analysis -- scanner
» scanning the source statement, recognizing and classifying the various tokens
 Syntactic analysis -- parser
» recognizing the statement as some language construct
 Code generation
» generating the object code

Scanner
 Tokens recognized by the scanner (excerpts)
» PROGRAM STATS VAR SUM , SUMSQ , I
» SUM := 0 ; SUMSQ :=
» READ ( VALUE ) ;
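
A minimal sketch of how a scanner might split such statements into tokens, using Python's `re` module. The pattern `TOKEN_RE` and the function `scan` are assumptions for illustration. They cover only the token shapes visible on this slide (names, integers, `:=` and single-character punctuation).

```python
import re

# Hypothetical token pattern: ":=" is listed first so that it is tried
# before a lone ":".
TOKEN_RE = re.compile(r"\s*(:=|[A-Za-z][A-Za-z0-9]*|[0-9]+|[;:,+\-*()])")

def scan(source):
    """Return the tokens of `source` in left-to-right order."""
    tokens = []
    pos = 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens
```

For example, `scan("READ(VALUE);")` yields the five tokens `READ`, `(`, `VALUE`, `)` and `;`.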

Parser

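 Grammar: a set of rules
» Backus-Naur Form (BNF)
» Ex: Figure 5.2
 Terminology
» Define symbol ::= ("is defined to be")
» Nonterminal symbols are enclosed in < >
» Alternatives are separated by |
» Terminal symbols are the tokens themselves (anything not enclosed in < >)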

Simplified Pascal Grammar

Parser

 Example statements
» READ(VALUE)
» SUM := 0
» SUM := SUM + VALUE
» MEAN := SUM DIV 100
 Grammar rules used to recognize them
» <read> ::= READ ( <id-list> )
» <id-list> ::= id | <id-list> , id
» <assign> ::= id := <exp>
» <exp> ::= <term> | <exp> + <term> | <exp> - <term>
» <term> ::= <factor> | <term> * <factor> | <term> DIV <factor>
» <factor> ::= id | int | ( <exp> )

Syntax Tree

Syntax Tree for Program 5.1

Lexical Analysis

 Function
» scanning the program to be compiled and recognizing the tokens that make up the source statements
 Tokens
» Tokens can be keywords, operators, identifiers, integers, floating-point numbers, character strings, etc.
» Each token is usually represented by some fixed-length code, such as an integer, rather than as a variable-length character string (see Figure 5.5)
» For each token the scanner supplies a token type and, where needed, a token specifier (value) (see Figure 5.6)

Scanner Output

 Token specifier
» identifier name, integer value
 Token coding scheme
» Figure 5.5

Example - Figure 5.6
Statement                    Token type   Token specifier
PROGRAM STATS                1
                             22           ^STATS
VAR                          2
SUM,SUMSQ,I, …, : INTEGER    22           ^SUM
                             14
                             22           ^SUMSQ
                             14
                             …..
                             14
                             22           ^VARIANCE
                             13
                             6
BEGIN                        3
SUM:=0;                      22           ^SUM
                             15
                             23           #0
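
The coding in this example can be imitated with a small lookup table. The sketch below uses only the codes that appear in the example (1, 2, 3, 6, 13, 14, 15, 22, 23). The names `CODES`, `LEXEME_RE` and `encode` are hypothetical. The `^` and `#` prefixes are copied from the figure and presumably mark a reference to a name and a literal value respectively.

```python
import re

# Codes read off the example above; tokens whose code is not shown there
# are deliberately left out rather than guessed.
CODES = {"PROGRAM": 1, "VAR": 2, "BEGIN": 3, "INTEGER": 6,
         ":": 13, ",": 14, ":=": 15}
ID_CODE, INT_CODE = 22, 23

# Characters outside this pattern (spaces, ";") are silently skipped.
LEXEME_RE = re.compile(r":=|[A-Za-z][A-Za-z0-9]*|[0-9]+|[:,]")

def encode(statement):
    """Return (token type, token specifier) pairs for one statement."""
    pairs = []
    for lexeme in LEXEME_RE.findall(statement):
        if lexeme in CODES:
            pairs.append((CODES[lexeme], None))      # keyword or operator
        elif lexeme[0].isdigit():
            pairs.append((INT_CODE, "#" + lexeme))   # integer value
        else:
            pairs.append((ID_CODE, "^" + lexeme))    # identifier name
    return pairs
```

`encode("SUM:=0")` gives the pairs (22, ^SUM), (15, no specifier), (23, #0), matching the last rows of the example.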

Token Recognizer

 By grammar
» <ident> ::= <letter> | <ident> <letter> | <ident> <digit>
» <letter> ::= A | B | C | D | … | Z
» <digit> ::= 0 | 1 | 2 | 3 | … | 9
 By scanner - modeling as finite automata
» Figure 5.8(a)

Recognizing Identifier

 Identifiers allowing underscore (_)
» Figure 5.8(b)

State   A-Z   0-9   _
1       2
2       2     2     3
3       2     2

(Figure 5.8(b): state diagram of the automaton tabulated above)
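
A table-driven recognizer for these identifiers might look like the sketch below. The transitions follow the state table on this slide. Treating state 2 as the only accepting state is an assumption: it means an identifier may not end in an underscore or contain two underscores in a row. `TABLE`, `char_class` and `is_identifier` are illustrative names.

```python
# state -> {character class -> next state}, following the slide's table.
TABLE = {
    1: {"letter": 2},
    2: {"letter": 2, "digit": 2, "underscore": 3},
    3: {"letter": 2, "digit": 2},
}
FINAL_STATES = {2}  # assumption: only state 2 is accepting

def char_class(ch):
    """Map a character to its column in the table, or None."""
    if "A" <= ch <= "Z":
        return "letter"
    if "0" <= ch <= "9":
        return "digit"
    if ch == "_":
        return "underscore"
    return None

def is_identifier(text):
    state = 1
    for ch in text:
        state = TABLE[state].get(char_class(ch))
        if state is None:        # no transition: reject
            return False
    return state in FINAL_STATES
```

Under these assumptions `SUM_SQ` is accepted, while `SUM_` and `SUM__SQ` are rejected.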

Recognizing Integer

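 Allowing leading zeroes
» Figure 5.8(c): state 1 goes to state 2 on 0-9; state 2 loops on 0-9
 Disallowing leading zeroes
» Figure 5.8(d): state 1 goes to state 2 on 1-9; state 2 loops on 0-9
» a lone zero takes a separate path: state 1 goes to state 3 on 0, and state 3 goes to state 4 on a space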
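
The automaton that disallows leading zeroes can also be written as direct state logic. This sketch makes two assumptions: a lone `0` is accepted through its own state (state 3), and the end of the input stands in for the space that delimits the token. The function name is hypothetical.

```python
def is_integer_no_leading_zero(text):
    """Accept "0" or a digit string whose first digit is 1-9."""
    state = 1
    for ch in text:
        if state == 1 and "1" <= ch <= "9":
            state = 2          # first digit is nonzero
        elif state == 1 and ch == "0":
            state = 3          # a lone zero; nothing may follow it
        elif state == 2 and "0" <= ch <= "9":
            state = 2          # further digits
        else:
            return False       # no transition for this character
    return state in (2, 3)
```

With this logic `100` and `0` are accepted, while `007` is rejected because state 3 has no outgoing transition on a digit.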

Scanner -- Implementation

 Figure 5.10 (a)
» Algorithmic code for identifier recognition
 Tabular representation of finite automaton for Figure 5.9

State   A-Z   0-9   ;,+-*()   :   =   .
1       2     4     5         6
2       2     2                       3
3
4             4
5
6                                 7
7
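
A table-driven scanner built on such a table could look like the following sketch. It assumes one particular reading of the flattened table: state 1 goes to 2, 4, 5 and 6 on a letter, a digit, one of `;,+-*()` and `:` respectively; state 2 loops on letters and digits; state 4 loops on digits; and state 6 goes to 7 on `=`. The `.` column is left out, and treating every state other than 1 as accepting is also an assumption.

```python
# Assumed reading of the table: state -> {character class -> next state}.
TABLE = {
    1: {"letter": 2, "digit": 4, "punct": 5, ":": 6},
    2: {"letter": 2, "digit": 2},
    4: {"digit": 4},
    6: {"=": 7},
}
FINAL_STATES = {2, 4, 5, 6, 7}  # assumption: every state except 1 accepts

def char_class(ch):
    if "A" <= ch <= "Z":
        return "letter"
    if "0" <= ch <= "9":
        return "digit"
    if ch in ";,+-*()":
        return "punct"
    if ch in ":=":
        return ch
    return None

def next_token(text, pos):
    """Follow transitions from `pos` as far as possible (longest match)."""
    state, end = 1, pos
    while end < len(text):
        nxt = TABLE.get(state, {}).get(char_class(text[end]))
        if nxt is None:
            break
        state, end = nxt, end + 1
    if state not in FINAL_STATES:
        raise ValueError(f"no token starts at position {pos}")
    return text[pos:end], end

def scan(text):
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] == " ":     # blanks separate tokens
            pos += 1
            continue
        token, pos = next_token(text, pos)
        tokens.append(token)
    return tokens
```

Keeping the transitions in a table means a new token class is added by editing data rather than control flow.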

Syntactic Analysis

 Recognize source statements as language constructs, or build the parse tree for the statements
» bottom-up: operator-precedence parsing
» top-down: recursive-descent parsing

Operator-Precedence Parsing

 Operator
» any terminal symbol (or any token)
 Precedence
» * ⋗ + (* takes precedence over +)
» + ⋖ * (+ yields precedence to *)
 Operator-precedence
» precedence relations between operators
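
The idea of reducing the higher-precedence operator first can be sketched with two stacks. This is a simplification, not the full method: numeric levels stand in for the precedence relations, only binary operators and operands are handled, and there is no error handling. The levels in `PRECEDENCE` and the function name are assumptions for this sketch.

```python
# A higher number binds tighter. These levels are an assumption that
# replaces the full table of precedence relations.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "DIV": 2}

def parse_expression(tokens):
    """Shift-reduce parse of operands joined by binary operators.

    Returns the syntax tree as nested tuples (operator, left, right).
    """
    operands, operators = [], []

    def reduce_top():
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for token in tokens:
        if token in PRECEDENCE:
            # Reduce while the stacked operator has higher or equal
            # precedence than the incoming one (left associativity).
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[token]:
                reduce_top()
            operators.append(token)   # shift the operator
        else:
            operands.append(token)    # shift the operand
    while operators:
        reduce_top()
    return operands[0]
```

For the token list `id2 DIV int - id3 * id4` the DIV is reduced first, then the `*`, and the `-` last.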

Precedence Matrix for the Fig. 5.2
Operator-Precedence Parse Example
BEGIN READ ( VALUE ) ;

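The second line of each step gives the precedence relations between successive terminals (nonterminals are skipped): ⋖ yields precedence, ≐ equal precedence, ⋗ takes precedence.

(i)    … id1 := id2 DIV
       … ⋖ id1 ≐ := ⋖ id2 ⋗ DIV
(ii)   … id1 := <N1> DIV int -
       … ⋖ id1 ≐ := ⋖ DIV ⋖ int ⋗ -
(iii)  … id1 := <N1> DIV <N2> -
       … ⋖ id1 ≐ := ⋖ DIV ⋗ -
(iv)   … id1 := <N3> - id3 *
       … ⋖ id1 ≐ := ⋖ - ⋖ id3 ⋗ *
(v)    … id1 := <N3> - <N4> * id4 ;
       … ⋖ id1 ≐ := ⋖ - ⋖ * ⋖ id4 ⋗ ;
(vi)   … id1 := <N3> - <N4> * <N5> ;
       … ⋖ id1 ≐ := ⋖ - ⋖ * ⋗ ;
(vii)  … id1 := <N3> - <N6> ;
       … ⋖ id1 ≐ := ⋖ - ⋗ ;
(viii) … id1 := <N7> ;
       … ⋖ id1 ≐ := ⋗ ;
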
Operator-Precedence Parsing

 Bottom-up parsing
 Generating precedence matrix
» Aho et al. (1988)

Shift-reduce Parsing with Stack

 Figure 5.14

Recursive-Descent Parsing

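 Each nonterminal symbol in the grammar is associated with a procedure
» <read> ::= READ (<id-list>)
» <stmt> ::= <assign> | <read> | <write> | <for>
 Left recursion
» <dec-list> ::= <dec> | <dec-list>;<dec>
» the procedure for <dec-list> would call itself before consuming any token
 Modification
» <dec-list> ::= <dec> {;<dec>}
» { } means zero or more repetitions, so the procedure can use a loop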
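
A minimal recursive-descent sketch for the <read> rule, with one method per nonterminal. It assumes <id-list> is rewritten in the same iterative style as <dec-list>, that is, <id-list> ::= id { , id }. The class and method names are hypothetical, and the input is taken to be a list of token strings already produced by a scanner.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the current token, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise ParseError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def read(self):
        """<read> ::= READ ( <id-list> )"""
        self.expect("READ")
        self.expect("(")
        names = self.id_list()
        self.expect(")")
        return ("read", names)

    def id_list(self):
        """<id-list> ::= id { , id }  (assumed iterative form)"""
        names = [self.identifier()]
        while self.peek() == ",":
            self.pos += 1                    # consume the comma
            names.append(self.identifier())
        return names

    def identifier(self):
        token = self.peek()
        if token is None or not token.isidentifier() or token == "READ":
            raise ParseError(f"expected an identifier, found {token!r}")
        self.pos += 1
        return token
```

`Parser(["READ", "(", "VALUE", ")"]).read()` returns `("read", ["VALUE"])`. The loop in `id_list` is what replaces the left-recursive alternative.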

Recursive-Descent Parse of READ

Simplified Pascal Grammar for Recursive-Descent Parser
