
CS 153: Concepts of Compiler Design

October 17 Class Meeting


Department of Computer Science San Jose State University Fall 2011 Instructor: Ron Mak www.cs.sjsu.edu/~mak

Shomit Ghose
History of Computing Speaker
Venture capitalist; Partner, ONSET Ventures

Micro-History: An Examination of the Brief but Successful Life of a Silicon Valley Start-up

Wednesday, Oct. 19, 6:00-7:00 PM, Auditorium ENGR 189
Reception before the talk in ENGR 294 at 5:00 PM

SJSU Dept. of Computer Science Fall 2011: October 17

CS 153: Concepts of Compiler Design R. Mak

Midterm Solution: Question 1


1. List and describe five software engineering techniques we employed to make the code manageable and understandable.

Initial framework classes
    Validate the architecture early.

Partitioning
    Language-dependent front end.
    Language-independent middle tier and back end.
    The back end can be either an interpreter or a compiler.

Early initial end-to-end thread
    Always build on working code.

Design patterns
    Strategy, factory, etc.
    Code to the interfaces.
    Closed for modification, open for extension.

Team development tools
    Subversion source control.


Midterm Solution: Question 2


2. What is the purpose of the symbol table stack and how does it achieve its purpose?

Purpose: Implement static scoping.
    Push a symbol table onto the stack whenever the parser enters a scope.
    Pop the symbol table off the stack when the parser leaves a scope.
    Search only the local (topmost) symbol table to determine if an identifier has been declared in the local scope.
    Search the entire stack from top to bottom to determine if an identifier has been declared in the local or an outer scope.
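
The push, pop, and two kinds of search described above can be sketched in Java roughly as follows. This is only an illustration: the class name `SymTabStackSketch`, its method names, and the use of a plain `String` as the symbol table entry are assumptions, not the course's actual classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** A minimal symbol table stack; all names here are illustrative assumptions. */
final class SymTabStackSketch {
    // Each symbol table maps an identifier name to an entry (a String, for brevity).
    private final List<TreeMap<String, String>> stack = new ArrayList<>();

    /** Enter a scope: push a new, empty symbol table. */
    void push() {
        stack.add(new TreeMap<>());
    }

    /** Leave a scope: pop the topmost symbol table. */
    void pop() {
        stack.remove(stack.size() - 1);
    }

    /** Declare a name in the local (topmost) scope. */
    void enterLocal(String name, String entry) {
        stack.get(stack.size() - 1).put(name, entry);
    }

    /** Search only the local (topmost) symbol table. */
    String lookupLocal(String name) {
        return stack.get(stack.size() - 1).get(name);
    }

    /** Search the entire stack from top to bottom (local scope first, then outer scopes). */
    String lookup(String name) {
        for (int i = stack.size() - 1; i >= 0; i--) {
            String entry = stack.get(i).get(name);
            if (entry != null) return entry;
        }
        return null;  // not declared in any enclosing scope
    }
}
```

Because `lookup` starts at the top of the stack, a local declaration hides an outer declaration of the same name, which is the behavior static scoping requires.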


Midterm Solution: Question 3


3. What is the purpose of the runtime stack and how does it achieve its purpose?

Purpose: To store runtime values according to the call chain.
    Push an activation record onto the stack whenever the main program or a procedure or function is called.
    Pop the activation record off the stack upon return.
    The topmost activation record at level n contains the current values of the local variables and formal parameters of the currently active procedure or function at level n.
    Use a runtime display to optimize accessing the appropriate activation record on the stack.
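
A rough Java sketch of a runtime stack with a display is shown below. It is one possible arrangement under stated assumptions: the names `RuntimeStackSketch`, `ActivationRecord`, `previousAtLevel`, and `atLevel` are hypothetical, and the display is modeled as a list indexed by nesting level whose previous element is saved on each call and restored on return.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A minimal runtime stack with a display; all names are illustrative assumptions. */
final class RuntimeStackSketch {
    /** One activation record: a nesting level plus a memory map of names to values. */
    static final class ActivationRecord {
        final int level;
        final Map<String, Object> memory = new HashMap<>();
        ActivationRecord previousAtLevel;  // display element saved at call time

        ActivationRecord(int level) {
            this.level = level;
        }
    }

    private final Deque<ActivationRecord> stack = new ArrayDeque<>();
    private final List<ActivationRecord> display = new ArrayList<>();

    /** Call: push the record and make the display point to it for its level. */
    void push(ActivationRecord ar) {
        while (display.size() <= ar.level) display.add(null);
        ar.previousAtLevel = display.get(ar.level);
        display.set(ar.level, ar);
        stack.push(ar);
    }

    /** Return: pop the record and restore the display element it replaced. */
    ActivationRecord pop() {
        ActivationRecord ar = stack.pop();
        display.set(ar.level, ar.previousAtLevel);
        return ar;
    }

    /** Direct access to the current activation record at a nesting level. */
    ActivationRecord atLevel(int level) {
        return (level < display.size()) ? display.get(level) : null;
    }
}
```

With the display, reaching a variable declared at an outer level is a single indexed access rather than a walk down the stack.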


Midterm Solution: Question 4


4. Implement the ternary conditional operator in Pascal using the keywords IF, THEN, and ELSE.

a. Modify the syntax diagrams.

The result at run time of evaluating the conditional operator is a single value, the result of evaluating either <expression-2> or <expression-3>. Therefore, a conditional expression must be a factor.


Midterm Solution: Question 4


b. What type checking operations are necessary while parsing a conditional operator?

    <expression-1> must be boolean.
    <expression-2> and <expression-3> must be type compatible with the surrounding operators (preferably they should be the same type) or be assignment compatible with the target variable.


Midterm Solution: Question 4


c. Draw a parse tree for the statement

    k := i - j*IF m-n = 0 THEN m*n ELSE m+n

Note that the conditional does not change any precedence rules.

    :=
        k
        -
            i
            *
                j
                IF
                    =
                        -
                            m
                            n
                        0
                    *
                        m
                        n
                    +
                        m
                        n


Midterm Solution: Question 5


5. Describe the purpose of each of the following hash tables (or tree maps) and describe its keys (or give an example of a key).

a. symbol table
    Store the symbol table entries for the identifiers declared within a given scope.
    Keys: Names of the identifiers

b. symbol table entry
    Store the attributes of an identifier.
    Keys: Attribute enum constants such as ROUTINE_CODE

c. type specification object
    Store attributes about a data type.
    Keys: Attribute enum constants such as ARRAY_INDEX_TYPE

Midterm Solution: Question 5


d. parse tree node
    Store the attributes of a parse tree node.
    Keys: Attribute enum constants LINE, ID, and VALUE

e. memory map
    Store the runtime values of the local variables and formal parameters of a program, procedure, or function.
    Keys: The names of the variables and parameters
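
To make the two kinds of keys concrete, here is a small Java sketch: a symbol table keyed by identifier name and a symbol table entry keyed by attribute enum constants. The class `SymTabEntrySketch`, the enum `Key`, and the constant `CONSTANT_VALUE` are assumptions for illustration; only `ROUTINE_CODE` is named in the answer above.

```java
import java.util.EnumMap;
import java.util.TreeMap;

/** A symbol table entry whose attributes are keyed by enum constants (illustrative). */
final class SymTabEntrySketch {
    /** Hypothetical attribute keys; only ROUTINE_CODE appears in the slides. */
    enum Key { ROUTINE_CODE, CONSTANT_VALUE }

    final String name;
    private final EnumMap<Key, Object> attributes = new EnumMap<>(Key.class);

    SymTabEntrySketch(String name) {
        this.name = name;
    }

    void setAttribute(Key key, Object value) {
        attributes.put(key, value);
    }

    Object getAttribute(Key key) {
        return attributes.get(key);  // null if the attribute was never set
    }

    /** A symbol table for one scope: identifier name -> entry, kept sorted by name. */
    static TreeMap<String, SymTabEntrySketch> newSymTab() {
        return new TreeMap<>();
    }
}
```

Using enum constants as keys lets new attributes be added without changing the entry class, and a tree map keeps the identifiers sorted, which is convenient for a cross-reference listing.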


Midterm Solution: Question 6


6. How to implement the ENDALL reserved word?

Front end
    Modify the scanner to recognize ENDALL as a reserved word.
    Modify method CompoundStatementParser.parse() to include ENDALL as a statement list terminator.
    Modify method StatementParser.parseList():
        Stop looping if the global flag endAllFlag is true.
        Set endAllFlag to true after consuming the ENDALL keyword.
    Modify method StatementParser.parse():
        Set endAllFlag to false after consuming the BEGIN keyword.

Middle tier
    No changes

Back end
    No changes

Midterm Solution: Question 7


7. Classic Pascal included the WITH statement.

a. What must the Pascal parser do in order to parse a WITH statement?

After parsing the record variable following the WITH keyword, the parser must:
    Determine the record type of the variable.
    Push the record type's symbol table onto the symbol table stack.
    When parsing the nested statements of the WITH statement, look up identifiers first in the record type's symbol table to determine whether or not they are record fields.
    At the end of the WITH statement, pop off the record type's symbol table.


Midterm Solution: Question 7


b. What advantages would a WITH statement have at run time?

    None at all, if the WITH statement is considered to be shorthand for the programmer (syntactic sugar).
    However, if the parse tree contains a WITH node, then the record variable only needs to be evaluated once. This would be a performance optimization, especially if the record variable is complicated, such as having subscripts, fields, and pointer dereferencing.

c. How would you implement a WITH statement in the interpreter's back end?

    In the syntactic sugar case, do nothing.
    In the WITH node case, the interpreter must allocate an extra slot in the activation record to store the value of the record variable.


Minimum Acceptable Compiler Project


At least two data types with type checking.
Basic arithmetic operations with operator precedence.
Assignment statements.
At least one conditional control statement (e.g., IF).
At least one looping control statement.
Procedures or functions with calls and returns.
Parameters passed by value or by reference.
Basic error recovery (skip to semicolon or end of line).
Sample source programs written in the source language.
Generate Jasmin code that can be assembled.
Execute the resulting .class file standalone (preferred) or with a test harness.
No crashes (e.g., null pointer exceptions).

70 points/100

Ideas for Programming Languages


A language that works with a database such as MySQL
    Combines Pascal and SQL for writing database applications.
    Not PL/SQL: use the language to write client programs.
    Compiled code makes JDBC calls hidden from the programmer.

A language that can access web pages
    Statements that scrape pages to extract information.

A language for generating business reports
    A Pascal-like language that combines report writer features.

A string-processing language
    Combines Pascal and Perl for writing applications that involve pattern matching and string transformations.

Can We Build a Better Scanner?


Our scanner in the front end is relatively easy to understand and follow.
    Separate scanner classes for each token type.

However, it's big and slow.
    Separate scanner classes for each token type.
    Create lots of objects and make lots of method calls.

We can write a more compact and faster scanner.
    However, it may be harder to understand and follow.


Deterministic Finite Automata (DFA)


Pascal identifier
    Regular expression: <letter> ( <letter> | <digit> )*
    Implement the regular expression with a finite automaton (AKA finite state machine):

    [Diagram: start state 1 goes to state 2 on a letter; state 2 loops back to itself on a letter or a digit; state 2 goes to accepting state 3 on [other].]

This automaton is a deterministic finite automaton (DFA).
    At each state, the next input character uniquely determines which transition to take to the next state.

State-Transition Matrix
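Represent the behavior of a DFA by a state-transition matrix:

[Diagram: the identifier DFA with states 1, 2, and 3 and transitions on letter, digit, and [other].]

State    letter    digit    [other]
1        2
2        2         2        3
3        (accepting state)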
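
As a rough illustration, the state-transition matrix for the identifier DFA can be encoded as a two-dimensional array and driven by a short loop. The class name `IdentifierDFA`, the `ERR` encoding for a missing transition, and the `scan` method are assumptions made for this sketch.

```java
/** A table-driven recognizer for the three-state identifier DFA (illustrative sketch). */
final class IdentifierDFA {
    // Input character classes (column indexes).
    private static final int LETTER = 0, DIGIT = 1, OTHER = 2;
    private static final int ERR = -1;  // no transition (assumed encoding)

    // Rows are states 1..3 (stored at indexes 0..2); columns are input classes.
    private static final int[][] MATRIX = {
        /*            letter digit other */
        /* state 1 */ {   2,  ERR,  ERR },
        /* state 2 */ {   2,    2,    3 },
        /* state 3 */ { ERR,  ERR,  ERR },  // accepting state, never left
    };

    private static int typeOf(char ch) {
        return Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             :                          OTHER;
    }

    /** Returns the identifier at the start of the input, or null if there is none. */
    static String scan(String input) {
        int state = 1;  // start state
        int pos = 0;
        while (state != 3 && state != ERR) {
            // End of input is treated as an [other] character.
            char ch = (pos < input.length()) ? input.charAt(pos) : '\0';
            state = MATRIX[state - 1][typeOf(ch)];
            if (state == 2) pos++;  // consume the letter or digit
        }
        return (state == 3) ? input.substring(0, pos) : null;
    }
}
```

For example, `IdentifierDFA.scan("x1 := 5")` returns `"x1"`, while `IdentifierDFA.scan("9a")` returns `null` because state 1 has no transition on a digit.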


DFA for a Pascal Number

[Diagram: DFA for a Pascal number, with transitions labeled digit, +, -, ., E, and [other].]


DFA for a Pascal Identifier or Number


Negative numbers in the matrix are the accepting states.

private static final int matrix[][] = {
    /*        letter digit    +     -     .     E  other */
    /*  0 */ {   1,    4,    3,    3,  ERR,    1,  ERR },
    /*  1 */ {   1,    1,   -2,   -2,   -2,    1,   -2 },
    /*  2 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  3 */ { ERR,    4,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  4 */ {  -5,    4,   -5,   -5,    6,    9,   -5 },
    /*  5 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  6 */ { ERR,    7,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  7 */ {  -8,    7,   -8,   -8,   -8,    9,   -8 },
    /*  8 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    /*  9 */ { ERR,   11,   10,   10,  ERR,  ERR,  ERR },
    /* 10 */ { ERR,   11,  ERR,  ERR,  ERR,  ERR,  ERR },
    /* 11 */ { -12,   11,  -12,  -12,  -12,  -12,  -12 },
    /* 12 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
};

Notice how the letter E is handled!

[Diagram: the combined DFA for a Pascal identifier or number, with transitions labeled letter, digit, +, -, ., E, and [other].]

A Simple DFA Scanner


public class SimpleDFAScanner {
    // Input characters.
    private static final int LETTER = 0;
    private static final int DIGIT  = 1;
    private static final int PLUS   = 2;
    private static final int MINUS  = 3;
    private static final int DOT    = 4;
    private static final int E      = 5;
    private static final int OTHER  = 6;

    private static final int ERR = -99999;  // error state

    private static final int matrix[][] = { ... };

    private char ch;    // current input character
    private int state;  // current state
    ...
}

A Simple DFA Scanner, cont'd

int typeOf(char ch) {
    return (ch == 'E')            ? E
         : Character.isLetter(ch) ? LETTER
         : Character.isDigit(ch)  ? DIGIT
         : (ch == '+')            ? PLUS
         : (ch == '-')            ? MINUS
         : (ch == '.')            ? DOT
         :                          OTHER;
}


A Simple DFA Scanner, cont'd

private String nextToken() throws IOException {
    while (Character.isWhitespace(ch)) nextChar();
    if (ch == 0) return null;  // EOF?

    state = 0;  // start state
    StringBuilder buffer = new StringBuilder();

    while (state >= 0) {  // not accepting state
        state = matrix[state][typeOf(ch)];  // transit

        if ((state >= 0) || (state == ERR)) {
            buffer.append(ch);  // build token string
            nextChar();
        }
    }

    return buffer.toString();
}

The transition lookup, state = matrix[state][typeOf(ch)], is the heart of the scanner.
Table-driven scanners can be very fast!

Simple DFA Scanner, cont'd

private void scan() throws IOException {
    nextChar();

    while (ch != 0) {  // EOF?
        String token = nextToken();

        if (token != null) {
            System.out.print("=====> \"" + token + "\" ");
            String tokenType = (state ==  -2) ? "IDENTIFIER"
                             : (state ==  -5) ? "INTEGER"
                             : (state ==  -8) ? "REAL (fraction only)"
                             : (state == -12) ? "REAL"
                             :                  "*** ERROR ***";
            System.out.println(tokenType);
        }
    }
}

How do we know which token we just got? By the accepting state in which the scanner stopped.

Demo
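
The slide fragments leave out the body of the matrix and the nextChar() method. The following sketch assembles them into one class that can be run on a string. It assumes the matrix values shown on the matrix slide, and it invents a few things for illustration: the class name `DFAScannerSketch`, a string-based `nextChar()` that yields character 0 at end of input, and a `scanAll()` method that collects results in a list instead of printing them.

```java
import java.util.ArrayList;
import java.util.List;

/** A runnable assembly of the slide fragments; helper names are assumptions. */
final class DFAScannerSketch {
    private static final int LETTER = 0, DIGIT = 1, PLUS = 2, MINUS = 3,
                             DOT = 4, E = 5, OTHER = 6;
    private static final int ERR = -99999;  // error state

    // Negative entries are accepting states.
    private static final int[][] MATRIX = {
        /*        letter digit    +     -     .     E  other */
        /*  0 */ {   1,    4,    3,    3,  ERR,    1,  ERR },
        /*  1 */ {   1,    1,   -2,   -2,   -2,    1,   -2 },
        /*  2 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  3 */ { ERR,    4,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  4 */ {  -5,    4,   -5,   -5,    6,    9,   -5 },
        /*  5 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  6 */ { ERR,    7,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  7 */ {  -8,    7,   -8,   -8,   -8,    9,   -8 },
        /*  8 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
        /*  9 */ { ERR,   11,   10,   10,  ERR,  ERR,  ERR },
        /* 10 */ { ERR,   11,  ERR,  ERR,  ERR,  ERR,  ERR },
        /* 11 */ { -12,   11,  -12,  -12,  -12,  -12,  -12 },
        /* 12 */ { ERR,  ERR,  ERR,  ERR,  ERR,  ERR,  ERR },
    };

    private final String source;  // assumption: the whole input is held in a string
    private int pos = 0;
    private char ch;              // current input character (0 at end of input)
    private int state;            // current state

    DFAScannerSketch(String source) {
        this.source = source;
        nextChar();
    }

    // Hypothetical helper: advance to the next character, or 0 at end of input.
    private void nextChar() {
        ch = (pos < source.length()) ? source.charAt(pos++) : 0;
    }

    private static int typeOf(char ch) {
        return (ch == 'E')            ? E
             : Character.isLetter(ch) ? LETTER
             : Character.isDigit(ch)  ? DIGIT
             : (ch == '+')            ? PLUS
             : (ch == '-')            ? MINUS
             : (ch == '.')            ? DOT
             :                          OTHER;
    }

    private String nextToken() {
        while (Character.isWhitespace(ch)) nextChar();
        if (ch == 0) return null;  // end of input

        state = 0;  // start state
        StringBuilder buffer = new StringBuilder();
        while (state >= 0) {  // not yet in an accepting or error state
            state = MATRIX[state][typeOf(ch)];
            if ((state >= 0) || (state == ERR)) {
                buffer.append(ch);  // build the token string
                nextChar();
            }
        }
        return buffer.toString();
    }

    /** Scans the whole input and returns one "token TYPE" string per token. */
    List<String> scanAll() {
        List<String> result = new ArrayList<>();
        while (ch != 0) {
            String token = nextToken();
            if (token != null) {
                String type = (state ==  -2) ? "IDENTIFIER"
                            : (state ==  -5) ? "INTEGER"
                            : (state ==  -8) ? "REAL (fraction only)"
                            : (state == -12) ? "REAL"
                            :                  "*** ERROR ***";
                result.add(token + " " + type);
            }
        }
        return result;
    }
}
```

Calling `new DFAScannerSketch("x1 42 3.14 6.02E+23 E5").scanAll()` yields `x1 IDENTIFIER`, `42 INTEGER`, `3.14 REAL (fraction only)`, `6.02E+23 REAL`, and `E5 IDENTIFIER`. The last token shows how E is handled: in the start state it begins an identifier, while after digits it begins an exponent. Note that this matrix recognizes only identifiers and numbers, so a bare operator such as + is reported as an error.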

Backus Naur Form (BNF)


A text-based way to describe source language syntax.
    Named after John Backus and Peter Naur.
    Text-based means it can be read by a program, such as a compiler-compiler that can automatically generate a parser for a source language after reading (and parsing) the language's syntax rules written in BNF.

Uses certain meta-symbols.
    Symbols that are part of BNF itself but are not necessarily part of the syntax of the source language.

    ::=     "is defined as"
    |       "or"
    < >     Surround names of nonterminal (not literal) items


BNF Example: U.S. Postal Address


<postal-address>   ::= <name-part> <street-part> <city-state-part>
<name-part>        ::= <first-part> <last-name>
                     | <first-part> <last-name> <suffix>
<first-part>       ::= <first-name> | <capital-letter> .
<street-part>      ::= <house-number> <street-name>
                     | <house-number> <street-name> <apartment-number>
<city-state-part>  ::= <city-name> , <state-code> <ZIP-code>
<suffix>           ::= Sr. | Jr. | <roman-numeral>
<first-name>       ::= <name>
<last-name>        ::= <name>
<street-name>      ::= <name>
<city-name>        ::= <name>
<house-number>     ::= <number>
<apartment-number> ::= <number>
<state-code>       ::= <capital-letter> <capital-letter>
<capital-letter>   ::= A|B|C|D|E|F|G|H|I|J|K|L|M
                     |N|O|P|Q|R|S|T|U|V|W|X|Y|Z
<name>             ::= ...
<number>           ::= ...
etc.

BNF: Optional and Repeated Items


To show optional items in BNF, use the vertical bar |.
    An expression is a simple expression optionally followed by a relational operator and another simple expression.

    <expression> ::= <simple expression>
                   | <simple expression> <rel op> <simple expression>

BNF uses recursion for repeated items.
    A digit sequence is a digit followed by zero or more digits.

    Right recursive:
    <digit sequence> ::= <digit> | <digit> <digit sequence>

    Left recursive:
    <digit sequence> ::= <digit> | <digit sequence> <digit>
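
The right-recursive rule maps directly onto a recursive method. The sketch below is a minimal illustration; the class name `DigitSequence` and the convention of returning an end index (or -1 when no digit sequence is present) are assumptions.

```java
/** Recognizes <digit sequence> ::= <digit> | <digit> <digit sequence> (illustrative). */
final class DigitSequence {
    /**
     * Returns the index just past the digit sequence that starts at pos,
     * or -1 if no digit sequence starts there.
     */
    static int parse(String s, int pos) {
        // Both alternatives begin with a <digit>.
        if (pos >= s.length() || !Character.isDigit(s.charAt(pos))) return -1;

        // Second alternative: try to match the recursive <digit sequence> tail.
        int rest = parse(s, pos + 1);

        // If there is no tail, the first alternative (a single digit) applies.
        return (rest == -1) ? pos + 1 : rest;
    }
}
```

For instance, `DigitSequence.parse("123x", 0)` returns 3. The left-recursive form of the rule cannot be coded this way, because the method would call itself before consuming any input; a loop is the usual substitute.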


BNF Example: Pascal Number

<digit sequence>   ::= <digit> | <digit> <digit sequence>
<unsigned integer> ::= <digit sequence>
<unsigned real>    ::= <unsigned integer>.<digit sequence>
                     | <unsigned integer>.<digit sequence> <e> <scale factor>
                     | <unsigned integer> <e> <scale factor>
<unsigned number>  ::= <unsigned integer> | <unsigned real>
<scale factor>     ::= <unsigned integer> | <sign> <unsigned integer>
<e>                ::= E | e
<sign>             ::= + | -

Repetition via recursion: see <digit sequence>.
The sign is optional: see <scale factor>.

BNF Example: Pascal IF Statement

<if statement> ::= IF <expression> THEN <statement>
                 | IF <expression> THEN <statement> ELSE <statement>

It should be straightforward to write a parsing method from either the syntax diagram or the BNF.
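
As a rough sketch of such a parsing method, the Java class below follows the two BNF alternatives for the IF statement. It is heavily simplified and is not the course's parser: the class name `IfParserSketch` is hypothetical, tokens are separated by whitespace, and both <expression> and any non-IF <statement> are stood in for by a single token.

```java
import java.util.Arrays;
import java.util.List;

/** A simplified recursive-descent parser for the IF statement (illustrative). */
final class IfParserSketch {
    private final List<String> tokens;
    private int pos = 0;

    IfParserSketch(String source) {
        tokens = Arrays.asList(source.trim().split("\\s+"));
    }

    private String peek() {
        return (pos < tokens.size()) ? tokens.get(pos) : "";
    }

    private void expect(String keyword) {
        if (!peek().equalsIgnoreCase(keyword)) {
            throw new IllegalStateException(
                "Expected " + keyword + " but found '" + peek() + "'");
        }
        pos++;
    }

    // Stand-in for a full expression or simple statement: consume one token.
    private String nextSimple() {
        if (pos >= tokens.size()) throw new IllegalStateException("Unexpected end of input");
        return tokens.get(pos++);
    }

    /** <statement>: an IF statement or, as a stand-in, a single token. */
    String parseStatement() {
        return peek().equalsIgnoreCase("IF") ? parseIf() : nextSimple();
    }

    /** Returns a parenthesized tree: (IF test then-part [else-part]). */
    String parseIf() {
        expect("IF");
        String test = nextSimple();           // <expression>
        expect("THEN");
        String thenPart = parseStatement();   // <statement>

        if (peek().equalsIgnoreCase("ELSE")) {  // second alternative: optional ELSE
            pos++;
            return "(IF " + test + " " + thenPart + " " + parseStatement() + ")";
        }
        return "(IF " + test + " " + thenPart + ")";
    }
}
```

With this sketch, `new IfParserSketch("IF a THEN IF b THEN x ELSE y").parseStatement()` returns `(IF a (IF b x y))`: because the inner call checks for ELSE first, an ELSE pairs with the nearest unmatched IF.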
