
CSEP 501 – Compilers

Languages, Automata, Regular Expressions & Scanners
Hal Perkins
Winter 2008



Agenda
 Basic concepts of formal grammars (review)
 Regular expressions
 Lexical specification of programming languages
 Using finite automata to recognize regular expressions
 Scanners and Tokens
Programming Language Specs
 Since the 1960s, the syntax of every significant programming language has been specified by a formal grammar
 First done in 1959 with BNF (Backus-Naur Form or Backus Normal Form), used to specify the syntax of ALGOL 60
 Borrowed from the linguistics community (Chomsky)
Grammar for a Tiny Language
 program ::= statement | program statement
 statement ::= assignStmt | ifStmt
 assignStmt ::= id = expr ;
 ifStmt ::= if ( expr ) statement
 expr ::= id | int | expr + expr
 id ::= a | b | c | i | j | k | n | x | y | z
 int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Productions
 The rules of a grammar are called productions
 Rules contain
 Nonterminal symbols: grammar variables (program, statement, id, etc.)
 Terminal symbols: concrete syntax that appears in programs (a, b, c, 0, 1, if, (, ), …)
 Meaning of nonterminal ::= <sequence of terminals and nonterminals>
 In a derivation, an instance of the nonterminal can be replaced by the sequence of terminals and nonterminals on the right of the production
 Often there are two or more productions for one nonterminal – any of them can be used in different parts of a derivation
Alternative Notations
 There are several syntax notations for productions in common use; all mean the same thing
ifStmt ::= if ( expr ) statement
ifStmt → if ( expr ) statement
<ifStmt> ::= if ( <expr> ) <statement>



Example Derivation
program ::= statement | program statement
statement ::= assignStmt | ifStmt
assignStmt ::= id = expr ;
ifStmt ::= if ( expr ) statement
expr ::= id | int | expr + expr
id ::= a | b | c | i | j | k | n | x | y | z
int ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

a = 1 ; if ( a + 1 ) b = 2 ;
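One possible leftmost derivation of this program from the grammar above (worked out here for concreteness; not shown on the original slide):

program
⇒ program statement
⇒ statement statement
⇒ assignStmt statement
⇒ id = expr ; statement
⇒ a = expr ; statement
⇒ a = int ; statement
⇒ a = 1 ; statement
⇒ a = 1 ; ifStmt
⇒ a = 1 ; if ( expr ) statement
⇒ a = 1 ; if ( expr + expr ) statement
⇒ a = 1 ; if ( id + expr ) statement
⇒ a = 1 ; if ( a + expr ) statement
⇒ a = 1 ; if ( a + int ) statement
⇒ a = 1 ; if ( a + 1 ) statement
⇒ a = 1 ; if ( a + 1 ) assignStmt
⇒ a = 1 ; if ( a + 1 ) id = expr ;
⇒ a = 1 ; if ( a + 1 ) b = expr ;
⇒ a = 1 ; if ( a + 1 ) b = int ;
⇒ a = 1 ; if ( a + 1 ) b = 2 ;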



Parsing
 Parsing: reconstruct the derivation (syntactic structure) of a program
 In principle, a single recognizer could work directly from a concrete, character-by-character grammar
 In practice this is never done



Parsing & Scanning
 In real compilers the recognizer is split into two phases
 Scanner: translate input characters to tokens
 Also, report lexical errors like illegal characters and illegal symbols
 Parser: read the token stream and reconstruct the derivation

(pipeline: source → Scanner → tokens → Parser)



Characters vs Tokens (review)
 Input text
// this statement does very little
if (x >= y) y = 42;
 Token Stream
IF LPAREN ID(x) GEQ ID(y) RPAREN ID(y) BECOMES INT(42) SCOLON



Why Separate the Scanner and Parser?
 Simplicity & Separation of Concerns
 Scanner hides details from parser (comments, whitespace, input files, etc.)
 Parser is easier to build; has simpler input stream (tokens)
 Efficiency
 Scanner can use simpler, faster design
 (But still often consumes a surprising amount of the compiler’s total execution time)



Tokens
 Idea: we want a distinct token kind (lexical class) for each distinct terminal symbol in the programming language
 Examine the grammar to find these
 Some tokens may have attributes
 Examples: integer constant token will have the actual integer (17, 42, …) as an attribute; identifiers will have a string with the actual id
Typical Tokens in Programming Languages
 Operators & Punctuation
 + - * / ( ) { } [ ] ; : :: < <= == = != ! …
 Each of these is a distinct lexical class
 Keywords
 if while for goto return switch void …
 Each of these is also a distinct lexical class (not a string)
 Identifiers
 A single ID lexical class, but parameterized by actual id
 Integer constants
 A single INT lexical class, but parameterized by int value
 Other constants, etc.



Principle of Longest Match
 In most languages, the scanner should pick the longest possible string to make up the next token if there is a choice
 Example
return maybe != iffy;
should be recognized as 5 tokens
RETURN ID(maybe) NEQ ID(iffy) SCOLON
i.e., != is one token, not two; “iffy” is an ID, not IF followed by ID(fy)



Formal Languages & Automata Theory (a review in one slide)
 Alphabet: a finite set of symbols
 String: a finite, possibly empty sequence of symbols from an alphabet
 Language: a set, often infinite, of strings
 Finite specifications of (possibly infinite) languages
 Automaton – a recognizer; a machine that accepts all strings in a language (and rejects all other strings)
 Grammar – a generator; a system for producing all strings in the language (and no other strings)
 A particular language may be specified by many different grammars and automata
 A grammar or automaton specifies only one language
Regular Expressions and FAs
 The lexical grammar (structure) of most programming languages can be specified with regular expressions
 (Sometimes a little cheating is needed)
 Tokens can be recognized by a deterministic finite automaton
 Can be either table-driven or built by hand based on the lexical grammar



Regular Expressions
 Defined over some alphabet Σ
 For programming languages, the alphabet is usually ASCII or Unicode
 If re is a regular expression, L(re) is the language (set of strings) generated by re



Fundamental REs

re      L(re)     Notes
a       {a}       Singleton set, for each a in Σ
ε       {ε}       Empty string
∅       {}        Empty language



Operations on REs
re      L(re)         Notes
rs      L(r)L(s)      Concatenation
r|s     L(r) ∪ L(s)   Combination (union)
r*      L(r)*         0 or more occurrences (Kleene closure)
 Precedence: * (highest), concatenation, | (lowest)
 Parentheses can be used to group REs as needed
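As an aside (not from the original slides), the precedence rule can be checked with Java’s java.util.regex, whose syntax for these operators is the same: a|b* parses as a | (b*), while (a|b)* applies the star to the whole union. A small sketch:

import java.util.regex.Pattern;

public class RePrecedence {
    public static void main(String[] args) {
        // a|b* means "a, or any number of b's" -- the star binds tighter than |
        System.out.println(Pattern.matches("a|b*", "bbb"));   // true  (matches b*)
        System.out.println(Pattern.matches("a|b*", "ab"));    // false (neither a nor b*)
        // (a|b)* means "any string of a's and b's"
        System.out.println(Pattern.matches("(a|b)*", "ab"));  // true
    }
}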



Abbreviations
 The basic operations generate all possible regular expressions, but there are common abbreviations used for convenience. Typical examples:

Abbr.     Meaning       Notes
r+        (rr*)         1 or more occurrences
r?        (r | ε)       0 or 1 occurrence
[a-z]     (a|b|…|z)     1 character in given range
[abxyz]   (a|b|x|y|z)   1 of the given characters
Examples
re Meaning
+ single + character
! single ! character
= single = character
!= 2 character sequence
<= 2 character sequence
xyzzy 5 character sequence



More Examples
re                      Meaning
[abc]+                  one or more characters, each from {a, b, c}
[abc]*                  zero or more characters, each from {a, b, c}
[0-9]+                  one or more digits (unsigned integer, leading zeros allowed)
[1-9][0-9]*             unsigned integer with no leading zero
[a-zA-Z][a-zA-Z0-9_]*   typical identifier: a letter followed by letters, digits, or underscores
Abbreviations
 Many systems allow abbreviations to make writing and reading definitions or specifications easier
name ::= re
 Restriction: abbreviations may not be circular (recursive), either directly or indirectly (else would be non-regular)



Example
 Possible syntax for numeric constants
digit ::= [0-9]
digits ::= digit+
number ::= digits ( . digits )? ( [eE] (+ | -)? digits )?
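This definition transcribes directly into Java’s java.util.regex notation; a small sketch added here (not from the slides) – note the '.' must be escaped since it is a metacharacter there:

import java.util.regex.Pattern;

public class NumberRe {
    // digits ( . digits )? ( [eE] (+|-)? digits )?
    static final Pattern NUMBER = Pattern.compile("[0-9]+(\\.[0-9]+)?([eE][+-]?[0-9]+)?");

    public static void main(String[] args) {
        System.out.println(NUMBER.matcher("42").matches());       // true
        System.out.println(NUMBER.matcher("3.14e-10").matches()); // true
        System.out.println(NUMBER.matcher("3.").matches());       // false: digits required after '.'
    }
}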
Recognizing REs
 Finite automata can be used to recognize strings generated by regular expressions
 Can build by hand or automatically
 Not totally straightforward, but can be done systematically
 Tools like Lex, Flex, JLex, et seq. do this automatically, given a set of REs
Finite State Automaton
 A finite set of states
 One marked as initial state
 One or more marked as final states
 States sometimes labeled or numbered
 A set of transitions from state to state
 Each labeled with symbol from Σ, or ε
 Operate by reading input symbols (usually characters)
 Transition can be taken if labeled with current symbol
 ε-transition can be taken at any time
 Accept when final state reached & no more input
 Scanner uses a FSA as a subroutine – accept longest match each time called, even if more input; i.e., run the FSA from the current location in the input each time the scanner is called
 Reject if no transition possible, or no more input and not in final state (DFA)
Example: FSA for “cat”
(diagram: a start state with transitions labeled c, a, t leading through two intermediate states to a final state)
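A table-driven sketch of this three-transition FSA in Java (an illustration added here, not code from the slides):

public class CatDfa {
    // States 0..3; state 3 is the final state; -1 means "no transition" (reject).
    static int step(int state, char c) {
        if (state == 0 && c == 'c') return 1;
        if (state == 1 && c == 'a') return 2;
        if (state == 2 && c == 't') return 3;
        return -1;
    }

    static boolean accepts(String input) {
        int state = 0;                        // start state
        for (char c : input.toCharArray()) {
            state = step(state, c);
            if (state == -1) return false;    // no transition possible: reject
        }
        return state == 3;                    // accept only if input ends in the final state
    }

    public static void main(String[] args) {
        System.out.println(accepts("cat"));   // true
        System.out.println(accepts("ca"));    // false: input exhausted before final state
    }
}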



DFA vs NFA
 Deterministic Finite Automata (DFA)
 No choice of which transition to take under any condition
 Non-deterministic Finite Automata (NFA)
 Choice of transition in at least one case
 Accept if some way to reach final state on given input
 Reject if no possible way to final state



FAs in Scanners
 Want DFA for speed (no backtracking)
 Conversion from regular expressions to NFA is easy
 There is a well-defined procedure for converting an NFA to an equivalent DFA



From RE to NFA: base cases
(diagrams: a two-state NFA for each base case – a single symbol a, ε, and ∅)



rs
(diagram: the NFA for r connected to the NFA for s by an ε-transition)



r|s
(diagram: a new start state with ε-transitions into the NFAs for r and s; ε-transitions from their final states reach a new common final state)



r*
(diagram: the NFA for r with an ε-transition from its final state back to its start state, plus an ε-path that bypasses r for zero occurrences)
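Taken together, these diagrams are Thompson’s construction. A compact Java sketch of the idea (added here for illustration; the state and fragment representation is an assumption, not the slides’ code):

import java.util.*;

public class ThompsonNfa {
    static final char EPS = 0;                    // label used for ε-transitions

    int nextState = 0;
    // transitions.get(s) maps an input label to the set of target states
    final Map<Integer, Map<Character, Set<Integer>>> transitions = new HashMap<>();

    int newState() {
        transitions.put(nextState, new HashMap<>());
        return nextState++;
    }

    void addEdge(int from, char label, int to) {
        transitions.get(from).computeIfAbsent(label, k -> new HashSet<>()).add(to);
    }

    // Every fragment has exactly one start state and one final (accepting) state.
    record Frag(int start, int accept) {}

    Frag symbol(char a) {                         // base case: NFA for a single symbol
        int s = newState(), f = newState();
        addEdge(s, a, f);
        return new Frag(s, f);
    }

    Frag concat(Frag r, Frag s) {                 // rs: ε-edge from r's final state to s's start
        addEdge(r.accept(), EPS, s.start());
        return new Frag(r.start(), s.accept());
    }

    Frag union(Frag r, Frag s) {                  // r|s: new start/final states with ε-edges around both
        int start = newState(), accept = newState();
        addEdge(start, EPS, r.start());
        addEdge(start, EPS, s.start());
        addEdge(r.accept(), EPS, accept);
        addEdge(s.accept(), EPS, accept);
        return new Frag(start, accept);
    }

    Frag star(Frag r) {                           // r*: loop back for repetition, bypass for zero occurrences
        int start = newState(), accept = newState();
        addEdge(start, EPS, r.start());
        addEdge(start, EPS, accept);              // zero occurrences
        addEdge(r.accept(), EPS, r.start());      // one more occurrence
        addEdge(r.accept(), EPS, accept);
        return new Frag(start, accept);
    }
}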



From NFA to DFA
 Subset construction
 Construct a DFA from the NFA, where each DFA state represents a set of NFA states
 Key idea
 The state of the DFA after reading some input is the set of all states the NFA could have reached after reading the same input
 Algorithm: example of a fixed-point computation
 If the NFA has n states, the DFA has at most 2^n states
 => DFA is finite, can construct in finite # steps
 Resulting DFA may have more states than needed
 See books for construction and minimization details
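A minimal Java sketch of the subset construction (added here for illustration; the Nfa interface and its epsilonClosure/move helpers are assumptions, not code from the slides):

import java.util.*;

public class SubsetConstruction {
    // Hypothetical NFA interface: states are integers.
    interface Nfa {
        int startState();
        Set<Character> alphabet();
        Set<Integer> epsilonClosure(Set<Integer> states);     // add everything reachable by ε-moves
        Set<Integer> move(Set<Integer> states, char symbol);  // states reachable on 'symbol'
    }

    // Each DFA state is a set of NFA states; the result maps each such set to its transition table.
    static Map<Set<Integer>, Map<Character, Set<Integer>>> buildDfa(Nfa nfa) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new HashMap<>();
        Set<Integer> start = nfa.epsilonClosure(Set.of(nfa.startState()));
        Deque<Set<Integer>> worklist = new ArrayDeque<>();
        dfa.put(start, new HashMap<>());
        worklist.push(start);
        while (!worklist.isEmpty()) {                 // fixed point: stop when no new subsets appear
            Set<Integer> current = worklist.pop();
            for (char c : nfa.alphabet()) {
                Set<Integer> next = nfa.epsilonClosure(nfa.move(current, c));
                if (next.isEmpty()) continue;         // no transition on c from this DFA state
                if (!dfa.containsKey(next)) {         // first time we see this subset: a new DFA state
                    dfa.put(next, new HashMap<>());
                    worklist.push(next);
                }
                dfa.get(current).put(c, next);
            }
        }
        return dfa;                                   // at most 2^n entries for an n-state NFA
    }
}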



Example: DFA for hand-written scanner
 Idea: show a hand-written DFA for some typical programming language constructs
 Then use it to construct a hand-written scanner
 Setting: Scanner is called whenever the parser needs a new token
 Scanner stores current position in input
 Starting there, use a DFA to recognize the longest possible input sequence that makes up a token and return that token



Scanner DFA Example (1)
(diagram: from the start state, skip whitespace or comments, then branch on the next input character)
end of input → state 1: Accept EOF
( → state 2: Accept LPAREN
) → state 3: Accept RPAREN
; → state 4: Accept SCOLON



Scanner DFA Example (2)
(diagram: states 5 and 8 are reached on ! and <, then branch on the following character)
! then = → state 6: Accept NEQ
! then [other] → state 7: Accept NOT
< then = → state 9: Accept LEQ
< then [other] → state 10: Accept LESS



Scanner DFA Example (3)
(diagram: state 11 is reached on [0-9] and loops on further [0-9] characters)
[other] → state 12: Accept INT



Scanner DFA Example (4)
(diagram: state 13 is reached on [a-zA-Z] and loops on [a-zA-Z0-9_])
[other] → state 14: Accept ID or keyword

 Strategies for handling identifiers vs keywords
 Hand-written scanner: look up identifier-like things in a table of keywords to classify them (good application of perfect hashing)
 Machine-generated scanner: generate a DFA with appropriate transitions to recognize keywords
 Lots ’o states, but efficient (no extra lookup step)
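A minimal sketch of the keyword-table approach, in the spirit of the keywordTable.getKind(s) call used in the getToken() code later in these slides (the table contents and the kind codes here are illustrative assumptions):

import java.util.HashMap;
import java.util.Map;

public class KeywordTable {
    private final Map<String, Integer> keywords = new HashMap<>();

    public KeywordTable() {
        // one entry per reserved word; the codes are illustrative
        // (6 matches Token.WHILE in the Token class shown on a later slide)
        keywords.put("if", 10);
        keywords.put("while", 6);
        keywords.put("return", 11);
    }

    // Return the keyword's lexical class, or Token.ID (= 1) for ordinary identifiers.
    public int getKind(String s) {
        return keywords.getOrDefault(s, 1);
    }
}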



Implementing a Scanner by Hand – Token Representation
 A token is a simple, tagged structure
public class Token {
  public int kind;    // token’s lexical class
  public int intVal;  // integer value if class = INT
  public String id;   // actual identifier if class = ID
  // lexical classes
  public static final int EOF = 0;    // “end of file” token
  public static final int ID = 1;     // identifier, not keyword
  public static final int INT = 2;    // integer
  public static final int LPAREN = 4;
  public static final int SCOLON = 5;
  public static final int WHILE = 6;
  // etc. etc. etc. …
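  // Note: the getToken() code on the following slides also assumes constructors
  // like these (a hypothetical sketch; they are not shown on the slide):
  public Token(int kind) { this.kind = kind; }
  public Token(int kind, int intVal) { this.kind = kind; this.intVal = intVal; }
  public Token(int kind, String id) { this.kind = kind; this.id = id; }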



Simple Scanner Example
// global state and methods

static char nextch;  // next unprocessed input character

// advance to next input char
void getch() { … }

// skip whitespace and comments
void skipWhitespace() { … }



Scanner getToken() method
// return next input token
public Token getToken() {
  Token result;

  skipWhitespace();

  if (no more input) {
    result = new Token(Token.EOF); return result;
  }

  switch (nextch) {
  case '(': result = new Token(Token.LPAREN); getch(); return result;
  case ')': result = new Token(Token.RPAREN); getch(); return result;
  case ';': result = new Token(Token.SCOLON); getch(); return result;

  // etc. …



getToken() (2)
case '!':  // ! or !=
  getch();
  if (nextch == '=') {
    result = new Token(Token.NEQ); getch(); return result;
  } else {
    result = new Token(Token.NOT); return result;
  }

case '<':  // < or <=
  getch();
  if (nextch == '=') {
    result = new Token(Token.LEQ); getch(); return result;
  } else {
    result = new Token(Token.LESS); return result;
  }
// etc. …



getToken() (3)
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
  // integer constant
  String num = "" + nextch;
  getch();
  while (nextch is a digit) {
    num = num + nextch; getch();
  }
  result = new Token(Token.INT, Integer.parseInt(num));
  return result;



getToken() (4)
case 'a': … case 'z':
case 'A': … case 'Z':  // id or keyword
  String s = "" + nextch; getch();
  while (nextch is a letter, digit, or underscore) {
    s = s + nextch; getch();
  }
  if (s is a keyword) {
    result = new Token(keywordTable.getKind(s));
  } else {
    result = new Token(Token.ID, s);
  }
  return result;



Project Notes
 For the course project, use a lexical analyzer generator
 Suggestion: JFlex, a Java Lex lookalike
 (Works with CUP – a Java yacc/bison implementation)



Coming Attractions
 Homework this week: paper exercises on regular expressions, etc.
 Next week: first part of the compiler assignment – the scanner
 Based on the project from Ch. 2 of Appel’s book
 Next topic: parsing
 Will do LR parsing first – suggest using this for the project (thus CUP (YACC-like) instead of JavaCC or ANTLR)
 Good time to start reading chs. 3 & 4.
