Compilers: CS414-2017S-01 Compiler Basics & Lexical Analysis

This document provides an overview of compilers and lexical analysis. It discusses decomposing compilers into smaller units like the lexical analyzer. Lexical analysis converts source code into a stream of tokens. Deterministic finite automata and regular expressions can be used to automatically create lexical analyzers. Regular expressions provide a way to describe the tokens in a language. Tools like JavaCC can generate lexical analyzers by parsing regular expressions.

Compilers

CS414-2017S-01
Compiler Basics & Lexical Analysis
David Galles

Department of Computer Science

University of San Francisco
01-0: Syllabus
Office Hours
Course Text
Prerequisites
Test Dates & Testing Policies
Projects
Teams of up to 2
Grading Policies
Questions?
01-1: Notes on the Class
Don’t be afraid to ask me to slow down!
We will cover some pretty complex stuff here,
which can be difficult to get the first (or even the
second) time. ASK QUESTIONS
While specific questions are always preferred, “I
don’t get it” is always an acceptable question. I am
always happy to stop and re-explain a topic in a
different way.
If you are confused, I can guarantee that at
least one other person in the class would
benefit from more explanation
01-3: Notes on the Class
Projects are non-trivial
Using new tools (JavaCC)
Managing a large scale project
Lots of complex classes & advanced
programming techniques.
START EARLY!
Projects will take longer than you think
(especially starting with the semantic analyzer
project)
ASK QUESTIONS!
01-4: What is a compiler?

Simplified View (diagram): Source Program -> Compiler -> Machine code
01-5: What is a compiler?
More Accurate View (diagram):

Source File -> Lexical Analyzer -> Token Stream -> Parser -> Abstract Syntax Tree
-> Semantic Analyzer -> Abstract Assembly Tree Generator -> Abstract Assembly Tree
-> Code Generator -> Assembly -> Assembler -> Relocatable Object Code
-> Linker (with Libraries) -> Machine code

01-6: What is a compiler?
(Diagram: the same pipeline as on the previous slide, from Source File through Lexical Analyzer, Parser, Semantic Analyzer, Abstract Assembly Tree Generator, Code Generator, Assembler and Linker to Machine code, with the stages grouped into a Front end and a Back End.)
01-7: What is a compiler?

(Diagram: the same pipeline as on the previous slides, with a subset of the stages marked "Covered in this course".)
01-9: Why Use Decomposition?
Software Engineering!

Smaller units are easier to write, test and debug

Code Reuse
Writing a suite of compilers (C, Fortran, C++,
etc) for a new architecture
Create a new language – want compilers
available for several platforms
01-10: Lexical Analysis
Converting input file to stream of tokens

void main() {
print(4);
}
01-11: Lexical Analysis
Converting input file to stream of tokens

void main() {
print(4);
}

Token stream:
IDENTIFIER(void)
IDENTIFIER(main)
LEFT-PARENTHESIS
RIGHT-PARENTHESIS
LEFT-BRACE
IDENTIFIER(print)
LEFT-PARENTHESIS
INTEGER-LITERAL(4)
RIGHT-PARENTHESIS
SEMICOLON
RIGHT-BRACE
01-12: Lexical Analysis
Brute-Force Approach
Lots of nested if statements
if ((c = nextchar()) == 'P') {
  if ((c = nextchar()) == 'R') {
    if ((c = nextchar()) == 'O') {
      if ((c = nextchar()) == 'G') {
        /* Code to handle the rest of either
           PROGRAM or any identifier that starts
           with PROG
        */
      } else if (c == 'C') {
        /* Code to handle the rest of either
           PROCEDURE or any identifier that starts
           with PROC
        */

...
01-13: Lexical Analysis
Brute-Force Approach
Break the input file into words, separated by
spaces or tabs
This can be tricky – not all tokens are separated
by whitespace
Use string comparison to determine tokens
01-14: Deterministic Finite Automata
Set of states
Initial State
Final State(s)
Transitions

DFA for else, end, identifiers

Combine DFA
01-15: DFAs and Lexical Analyzers
Given a DFA, it is easy to create C code to
implement it
DFAs are easier to understand than C code
Visual – almost like structure charts
... However, creating a DFA for a complete lexical
analyzer is still complex
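To illustrate how directly a DFA maps to code, here is a minimal Java sketch of a hand-built, table-driven DFA for the keywords else and end plus identifiers. The class name DfaSketch, the state numbering, and the restriction of identifiers to lowercase letters are assumptions made for this example; they are not taken from the DFA drawn on the slide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical table-driven DFA for "else", "end", and lowercase identifiers.
class DfaSketch {
    // States: 0 start, 1 "e", 2 "el", 3 "els", 4 "else", 5 "en", 6 "end",
    // 7 any other identifier, -1 dead state.
    static final int START = 0, ID = 7, DEAD = -1;
    static final Map<Integer, Map<Character, Integer>> KEYWORD_EDGES = new HashMap<>();

    static void edge(int from, char c, int to) {
        KEYWORD_EDGES.computeIfAbsent(from, k -> new HashMap<>()).put(c, to);
    }

    static {
        edge(0, 'e', 1);
        edge(1, 'l', 2);
        edge(2, 's', 3);
        edge(3, 'e', 4);
        edge(1, 'n', 5);
        edge(5, 'd', 6);
    }

    // One transition of the DFA.
    static int step(int state, char c) {
        if (state == DEAD || c < 'a' || c > 'z') return DEAD;
        Map<Character, Integer> edges = KEYWORD_EDGES.get(state);
        if (edges != null && edges.containsKey(c)) return edges.get(c);
        return ID; // any other lowercase letter continues an identifier
    }

    // Runs the DFA over a whole word and names the final state.
    static String classify(String word) {
        int state = START;
        for (char c : word.toCharArray()) state = step(state, c);
        switch (state) {
            case START:
            case DEAD:
                return "ERROR";
            case 4:
                return "ELSE";
            case 6:
                return "END";
            default:
                return "IDENTIFIER"; // includes keyword prefixes such as "el"
        }
    }

    public static void main(String[] args) {
        for (String w : new String[] {"else", "end", "elsed", "en"}) {
            System.out.println(w + " -> " + classify(w));
        }
    }
}
```

Even for three token types the transition table needs care, which is the motivation for generating it automatically.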
01-16: Automatic Creation of DFAs
We’d like a tool:

Describe the tokens in the language

Automatically create DFA for tokens
Then, automatically create C code that implements
the DFA

We need a method for describing tokens

01-17: Formal Languages
Alphabet Σ: Set of all possible symbols
(characters) in the input file
Think of Σ as the set of symbols on the
keyboard
String w : Sequence of symbols from an alphabet
String length |w|: Number of characters in a
string: |car| = 3, |abba| = 4
Empty String ε: String of length 0: |ε| = 0
Formal Language: Set of strings over an
alphabet

Formal Language ≠ Programming language – a Formal
Language is only a set of strings.
01-18: Formal Languages
Example formal languages:

Integers {0, 23, 44, . . .}

Floating Point Numbers {3.4, 5.97, . . .}
Identifiers {foo, bar, . . .}
01-19: Language Concatenation
Language Concatenation: Given two formal
languages L_1 and L_2, the concatenation of L_1 and
L_2 is L_1 L_2 = {xy | x ∈ L_1, y ∈ L_2}

For example:
{fire, truck, car} {car, dog} =
{firecar, firedog, truckcar, truckdog, carcar, cardog}
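For finite languages, the definition can be checked mechanically. The following Java sketch (the class and method names are made up for this example) builds the concatenation of the two sets from the slide.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class ConcatSketch {
    // Returns { xy | x in l1, y in l2 } for two finite languages.
    static Set<String> concat(Set<String> l1, Set<String> l2) {
        Set<String> result = new TreeSet<>();
        for (String x : l1) {
            for (String y : l2) {
                result.add(x + y);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> first = new TreeSet<>(Arrays.asList("fire", "truck", "car"));
        Set<String> second = new TreeSet<>(Arrays.asList("car", "dog"));
        // prints [carcar, cardog, firecar, firedog, truckcar, truckdog]
        System.out.println(concat(first, second));
    }
}
```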
01-20: Kleene Closure
Given a formal language L:
L^0 = {ε}
L^1 = L
L^2 = LL
L^3 = LLL
L^4 = LLLL
L* = L^0 ∪ L^1 ∪ L^2 ∪ ... ∪ L^n ∪ ...
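L* is infinite whenever L contains a non-empty string, so code can only enumerate a finite prefix of the union. As a rough sketch (closureUpTo and the cut-off maxPower are names invented here), the following Java computes L^0 ∪ L^1 ∪ ... ∪ L^maxPower for a finite L.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class KleeneSketch {
    // Concatenation of two finite languages: { xy | x in l1, y in l2 }.
    static Set<String> concat(Set<String> l1, Set<String> l2) {
        Set<String> result = new TreeSet<>();
        for (String x : l1) {
            for (String y : l2) {
                result.add(x + y);
            }
        }
        return result;
    }

    // Union of L^0, L^1, ..., L^maxPower (a finite part of L*).
    static Set<String> closureUpTo(Set<String> l, int maxPower) {
        Set<String> power = new TreeSet<>();
        power.add(""); // L^0 = { empty string }
        Set<String> result = new TreeSet<>(power);
        for (int i = 1; i <= maxPower; i++) {
            power = concat(power, l); // L^i = L^(i-1) L
            result.addAll(power);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> l = new TreeSet<>(Arrays.asList("a", "bb"));
        // Contains the empty string, a, bb, aa, abb, bba, bbbb
        System.out.println(closureUpTo(l, 2));
    }
}
```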
01-21: Regular Expressions
Regular expressions are used to describe formal
languages over an alphabet Σ:

Regular Expression    Language
ε                     L[ε] = {ε}
a ∈ Σ                 L[a] = {a}
(MR)                  L[(MR)] = L[M]L[R]
(M|R)                 L[(M|R)] = L[M] ∪ L[R]
(M*)                  L[(M*)] = L[M]*
01-22: r.e. Precedence
From highest to Lowest:

Kleene Closure *
Concatenation
Alternation |

ab*c|e = (a(b*)c) | e
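The same precedence order holds in java.util.regex, so the equivalence can be spot-checked there. This is only a sketch on a handful of sample strings (the class name and the sample inputs are chosen for illustration), not a proof.

```java
import java.util.regex.Pattern;

class PrecedenceSketch {
    static final Pattern IMPLICIT = Pattern.compile("ab*c|e");
    static final Pattern EXPLICIT = Pattern.compile("(a(b*)c)|e");

    // True when both spellings accept or both reject the whole string.
    static boolean agree(String s) {
        return IMPLICIT.matcher(s).matches() == EXPLICIT.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"ac", "abbc", "e", "abe", "ab", "abce"}) {
            System.out.println(s + ": implicit=" + IMPLICIT.matcher(s).matches()
                    + " agree=" + agree(s));
        }
    }
}
```

If alternation bound tighter than concatenation, ab*c|e would instead mean ab*(c|e) and would accept "abe"; the check shows that it does not.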
01-23: Regular Expression Examples

all strings over {a,b}

binary integers (with leading zeroes)
all strings over {a,b} that
begin and end with a
all strings over {a,b} that
contain aa
all strings over {a,b} that
do not contain aa
01-24: Regular Expression Examples

all strings over {a,b}                               (a|b)*
binary integers (with leading zeroes)                (0|1)(0|1)*
all strings over {a,b} that begin and end with a     a | a(a|b)*a
all strings over {a,b} that contain aa               (a|b)*aa(a|b)*
all strings over {a,b} that do not contain aa        b*(abb*)*(a|ε)
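The last two answers are easy to get subtly wrong, so a brute-force check is worthwhile. The Java sketch below (class and method names are illustrative; (a|ε) is written a? in java.util.regex syntax) enumerates every string over {a,b} up to a given length and counts disagreements between each pattern and a direct test for the substring aa.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class RegexExamplesSketch {
    static final Pattern CONTAINS_AA = Pattern.compile("(a|b)*aa(a|b)*");
    static final Pattern NO_AA = Pattern.compile("b*(abb*)*a?");

    // All strings over {a,b} of length 0..maxLen, shortest first.
    static List<String> allStrings(int maxLen) {
        List<String> out = new ArrayList<>();
        out.add("");
        int start = 0;
        for (int len = 1; len <= maxLen; len++) {
            int end = out.size();
            for (int i = start; i < end; i++) {
                out.add(out.get(i) + "a");
                out.add(out.get(i) + "b");
            }
            start = end; // next round extends only the strings just added
        }
        return out;
    }

    // Number of strings on which a pattern disagrees with String.contains.
    static int countMismatches(int maxLen) {
        int mismatches = 0;
        for (String s : allStrings(maxLen)) {
            boolean hasAa = s.contains("aa");
            if (CONTAINS_AA.matcher(s).matches() != hasAa) mismatches++;
            if (NO_AA.matcher(s).matches() == hasAa) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches up to length 8: " + countMismatches(8));
    }
}
```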
01-25: Reg. Exp. Shorthand

[a,b,c,d] = (a|b|c|d)
[d-g] = [d,e,f,g] = (d|e|f|g)
[d-f,M-O] = [d,e,f,M,N,O]
= (d|e|f|M|N|O)
(α)? = Optionally α (i.e., (α | ε))
(α)+ = α(α)*
01-26: Regular Expressions & Unix
Many unix tools use regular expressions
Example: grep ’<reg exp>’ filename
Prints all lines that contain a match to the
regular expression
Special characters:
^ beginning of line
$ end of line
(grep examples on other screen)
01-27: JavaCC Regular Expressions
All characters & strings must be in quotation marks
"else"
"+"
("a"|"b")
All regular expressions involving * must be
parenthesized
("a")*, not "a"*
01-28: JavaCC Shorthand
["a","b","c","d"] = ("a"|"b"|"c"|"d")
["d"-"g"] = ["d","e","f","g"] = ("d"|"e"|"f"|"g")
["d"-"f","M"-"O"] = ["d","e","f","M","N","O"]
= ("d"|"e"|"f"|"M"|"N"|"O")
(α)? = Optionally α (i.e., (α | ε))
(α)+ = α(α)*
(~["a","b"]) = Any character except “a” or “b”.
Can only be used with [] notation
~(a(a|b)*b) is not legal
01-29: r.e. Shorthand Examples
Regular Expression Language
{if}
Set of legal identifiers
Set of integer literals
(leading zeroes allowed)
Set of real literals
01-30: r.e. Shorthand Examples
Regular Expression                                   Language
"if"                                                 {if}
["a"-"z"](["0"-"9","a"-"z"])*                        Set of legal identifiers
(["0"-"9"])+                                         Set of integer literals (leading zeroes allowed)
((["0"-"9"])+ "." (["0"-"9"])*) |                    Set of real literals
((["0"-"9"])* "." (["0"-"9"])+)
01-31: Lexical Analyzer Generator
JavaCC is a Lexical Analyzer Generator and a Parser
Generator

Input: Set of regular expressions (each of which

describes a type of token in the language)
Output: A lexical analyzer, which reads an input file
and separates it into tokens
01-32: Structure of a JavaCC file
options{
/* Code to set various options flags */
}

PARSER_BEGIN(foo)

public class foo {

/* This segment is often empty */
}

PARSER_END(foo)

TOKEN_MGR_DECLS :
{
/* Declarations used by lexical analyzer */
}

/* Token Rules & Actions */

01-33: Token Rules in JavaCC
Tokens are described by rules with the following
syntax:

TOKEN :
{
<TOKEN_NAME: RegularExpression>
}
TOKEN_NAME is the name of the token being
described
RegularExpression is a regular expression that
describes the token
01-34: Token Rules in JavaCC
Token rule examples:

TOKEN :
{
<ELSE: "else">
}

TOKEN :
{
<INTEGER_LITERAL: (["0"-"9"])+>
}
01-35: Token Rules in JavaCC
Several different tokens can be described in the
same TOKEN block, with token descriptions
separated by |.

TOKEN :
{
<ELSE: "else">
| <INTEGER_LITERAL: (["0"-"9"])+>
| <SEMICOLON: ";">
}
01-36: getNextToken
When we run javacc on the input file foo.jj, it
creates the class fooTokenManager
The class fooTokenManager contains the static
method getNextToken()
Every call to getNextToken() returns the next
token in the input stream.
01-37: getNextToken
When getNextToken is called, a regular
expression is found that matches the next
characters in the input stream.
What if more than one regular expression
matches?

TOKEN :
{
<ELSE: "else">
| <IDENTIFIER: (["a"-"z"])+>
}
01-38: getNextToken
When more than one regular expression matches
the input stream:
Use the longest match
“elsed” should match to IDENTIFIER, not to
ELSE followed by the identifier “d”
If two matches have the same length, use the
rule that appears first in the .jj file
“else” should match to ELSE, not
IDENTIFIER
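The two rules (longest match first, then earliest rule) can be sketched in plain Java. This is not how the JavaCC-generated token manager is implemented (it uses a generated automaton); the class LongestMatchSketch and its use of java.util.regex are assumptions made purely to illustrate the tie-breaking behaviour. Note that java.util.regex does not guarantee the longest match inside a single pattern with alternation, which does not matter for the two simple rules used here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchSketch {
    // Rules in "file order"; LinkedHashMap preserves insertion order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ELSE", Pattern.compile("else"));
        RULES.put("IDENTIFIER", Pattern.compile("[a-z]+"));
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so on a tie the rule listed first is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestName = rule.getKey();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at position " + pos);
            }
            tokens.add(bestName + "(" + input.substring(pos, pos + bestLen) + ")");
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("elsed")); // prints [IDENTIFIER(elsed)]
        System.out.println(tokenize("else"));  // prints [ELSE(else)]
    }
}
```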
01-39: JavaCC Example
PARSER_BEGIN(simple)
public class simple {

}
PARSER_END(simple)

TOKEN :
{
<ELSE: "else">
| <SEMICOLON: ";">
| <FOR: "for">
| <INTEGER_LITERAL: (["0"-"9"])+>
| <IDENTIFIER: ["a"-"z"](["a"-"z","0"-"9"])*>
}

else;ford for
01-40: SKIP Rules
Tell JavaCC what to ignore (typically whitespace)
using SKIP rules
SKIP rule is just like a TOKEN rule, except that no
TOKEN is returned.

SKIP:
{
< regularexpression1 >
| < regularexpression2 >
| ...
| < regularexpressionn >
}
01-41: Example SKIP Rules
PARSER_BEGIN(simple2)
public class simple2 {
}
PARSER_END(simple2)

SKIP :
{
< " " >
| < "\n" >
| < "\t" >
}

TOKEN :
{
<ELSE: "else">
| <SEMICOLON: ";">
| <FOR: "for">
| <INTEGER_LITERAL: (["0"-"9"])+>
| <IDENTIFIER: ["A"-"Z"](["A"-"Z","0"-"9"])*>
}
01-42: JavaCC States
Comments can be dealt with using SKIP rules
How could we skip over 1-line C++ Style
comments?

// This is a comment
01-43: JavaCC States
Comments can be dealt with using SKIP rules
How we could skip over 1-line C++ Style
comments:

// This is a comment
Using a SKIP rule
SKIP :
{
< "//" (~["\n"])* "\n" >
}
01-44: JavaCC States
Writing a regular expression to match multi-line
comments (using /* and */) is much more difficult
Writing a regular expression to match nested
comments is impossible (take Automata Theory for
a proof :) )
What can we do?
Use JavaCC States
01-45: JavaCC States
We can label each TOKEN and SKIP rule with a
“state”
Unlabeled TOKEN and SKIP rules are assumed to
be in the default state (named DEFAULT,
unsurprisingly enough)
Can switch to a new state after matching a TOKEN
or SKIP rule using the : NEWSTATE notation
01-46: JavaCC States
SKIP :
{
< " " >
| < "\n" >
| < "\t" >
}
SKIP :
{
< "/*" > : IN_COMMENT
}
<IN_COMMENT>
SKIP :
{
< "*/" > : DEFAULT
| < ~[] >
}
TOKEN :
{
<ELSE: "else">
| ... (etc)
}
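The effect of the DEFAULT and IN_COMMENT states can be mimicked by a small hand-written state machine. The Java sketch below is only an analogy for what the rules above do (the class and method names are invented for the example): in IN_COMMENT every character is discarded until "*/" is seen, and "*/" is tried before the single-character case, which mirrors the longest-match rule choosing "*/" over ~[].

```java
class CommentStateSketch {
    enum State { DEFAULT, IN_COMMENT }

    // Returns the input with non-nested /* ... */ comments removed.
    static String stripComments(String input) {
        StringBuilder kept = new StringBuilder();
        State state = State.DEFAULT;
        int i = 0;
        while (i < input.length()) {
            if (state == State.DEFAULT && input.startsWith("/*", i)) {
                state = State.IN_COMMENT; // like  < "/*" > : IN_COMMENT
                i += 2;
            } else if (state == State.IN_COMMENT && input.startsWith("*/", i)) {
                state = State.DEFAULT;    // like  < "*/" > : DEFAULT
                i += 2;
            } else {
                // In IN_COMMENT this plays the role of the  < ~[] >  skip rule.
                if (state == State.DEFAULT) kept.append(input.charAt(i));
                i++;
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripComments("else/* a comment */;")); // prints else;
    }
}
```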
01-47: Actions in TOKEN & SKIP
We can add Java code to any SKIP or TOKEN rule
That code will be executed when the SKIP or
TOKEN rule is matched.
Any methods / variables defined in the
TOKEN_MGR_DECLS section can be used by
these actions
01-48: Actions in TOKEN & SKIP
PARSER_BEGIN(remComments)
public class remComments { }
PARSER_END(remComments)

TOKEN_MGR_DECLS :
{
public static int numcomments = 0;
}

SKIP :
{
< "/*" > : IN_COMMENT
}

SKIP :
{
< "//" (~["\n"])* "\n" > { numcomments++; }
}
01-49: Actions in TOKEN & SKIP
<IN_COMMENT>
SKIP :
{
< "*/" > { numcomments++; SwitchTo(DEFAULT);}
}

<IN_COMMENT>
SKIP :
{
< ~[] >
}

TOKEN :
{
<ANY: ~[]>
}
01-50: Tokens
Each call to getNextToken returns a “Token” object
The Token class is automatically created by JavaCC.
Variables of type Token contain the following public
variables:
public int kind; The type of token. When
javacc is run on the file foo.jj, a file
fooConstants.java is created, which contains
the symbolic names for each constant
public interface simplejavaConstants {
int EOF = 0;
int CLASSS = 8;
int DO = 9;
int ELSE = 10;
...
01-51: Tokens
Each call to getNextToken returns a “Token” object
The Token class is automatically created by JavaCC.
Variables of type Token contain the following public
variables:
public int beginLine, beginColumn,
endLine, endColumn; The location of the
token in the input file
01-52: Tokens
Each call to getNextToken returns a “Token” object
The Token class is automatically created by JavaCC.
Variables of type Token contain the following public
variables:
public String image; The text that was
matched to create the token.
01-53: Generated TokenManager
class TokenTest {
  public static void main(String args[]) {
    Token t;
    java.io.InputStream infile;
    sjavaTokenManager tm;

    if (args.length < 1) {
      System.out.print("Enter filename as command line argument");
      return;
    }
    try {
      infile = new java.io.FileInputStream(args[0]);
    } catch (java.io.FileNotFoundException e) {
      System.out.println("File " + args[0] + " not found.");
      return;
    }
    tm = new sjavaTokenManager(new SimpleCharStream(infile));
01-54: Generated TokenManager
    t = tm.getNextToken();
    while (t.kind != sjavaConstants.EOF) {
      System.out.println("Token : " + t + " : ");
      System.out.println(sjavaConstants.tokenImage[t.kind]);
      t = tm.getNextToken();  /* advance; otherwise the loop never terminates */
    }
  }
}
01-55: Lexer Project
Write a .jj file for simpleJava tokens
Need to handle all whitespace (tabs, spaces,
end-of-line)
Need to handle nested comments (to an arbitrary
nesting level)
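Nested comments need a counter in addition to a state, since the lexer must remember how many comments are still open. The Java sketch below shows only the counting idea in ordinary code; it is not a JavaCC solution, and the names NestedCommentSketch and stripNested are invented for the example. In a .jj file, a counter of this kind could plausibly live in TOKEN_MGR_DECLS and be updated from rule actions.

```java
class NestedCommentSketch {
    // Removes /* ... */ comments, allowing comments nested to any depth.
    static String stripNested(String input) {
        StringBuilder kept = new StringBuilder();
        int depth = 0; // number of comments currently open
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("/*", i)) {
                depth++;
                i += 2;
            } else if (depth > 0 && input.startsWith("*/", i)) {
                depth--;
                i += 2;
            } else {
                // Characters are kept only when no comment is open.
                if (depth == 0) kept.append(input.charAt(i));
                i++;
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripNested("a/* x /* y */ z */b")); // prints ab
    }
}
```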
01-56: Project Details
JavaCC is available at https://javacc.dev.java.net/
To compile your project
% javacc simplejava.jj
% javac *.java
To test your project
% java TokenTest <test filename>
To submit your program: Create a branch:

https://www.cs.usfca.edu/svn/<username>/cs414/lexer/
