Lecture I
Part I
Language and Grammars
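[Figure: compiler front end. Source program → Scanner (lexical analysis) → tokens → Parser (syntax analysis) → parse tree → rest of front end → intermediate representation. The parser asks the scanner to "get next token". A symbol table is shown alongside these phases.]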
• Lexical
if x<1 thenn y = 5:
“Typos”
• Syntactic
if ((x<1) & (y>5))) ...
{ ... { ... ... }
• Semantic
if (x+5) then ...
Type Errors
Undefined IDs, etc.
• Logical Errors
if (i<9) then ...
Should be <= not <
Bugs
Compiler cannot detect Logical Errors
Error Detection
• Much responsibility on Parser
– Many errors are syntactic in nature
– Precision and efficiency of modern parsing methods
– Detect the error as soon as possible
• Good news is
– Simple mechanism can catch most common errors
• Error-Correcting Compilers
– Issue an error message
– Fix the problem
– Produce an executable
Example
Error on line 23: “myVarr” undefined.
“myVar” was used.
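
A minimal sketch of how a compiler might pick the replacement name in a message like the one above. It assumes the set of declared identifiers is available as a list; the function name `suggest_identifier`, the example identifiers, and the similarity cutoff of 0.8 are illustrative choices, not part of the lecture.

```python
import difflib

def suggest_identifier(unknown, declared):
    """Return the declared name closest to `unknown`, or None if nothing is close."""
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    matches = difflib.get_close_matches(unknown, declared, n=1, cutoff=0.8)
    return matches[0] if matches else None

declared = ["myVar", "count", "total"]   # hypothetical symbol-table contents
name = "myVarr"
fix = suggest_identifier(name, declared)
if fix is not None:
    print(f'Error on line 23: "{name}" undefined. "{fix}" was used.')
```

A high cutoff keeps the compiler from "repairing" a name into something unrelated; when no declared name is close enough, the sketch returns None and only the error message should be issued.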
Error Recovery Approaches: Panic Mode
• Example:
int myVar flag ;      ← Declaration of flag is discarded
...
x := flag;            ← Variable flag is undefined
...
while (flag==0)       ← Variable flag is undefined
...
Example
• The key...
– Good set of synchronizing tokens
– Knowing what to do then
• Advantage
– Simple to implement
– Does not go into infinite loop
– Commonly used
• Disadvantage
– May skip over large sections of source with some errors
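
The skip-to-synchronizing-token idea (usually called panic mode) can be sketched as follows. This is an illustration under stated assumptions, not a real parser: the toy grammar accepts only statements of the form `ID = NUM ;`, the only synchronizing token is `;`, and the names `SYNC`, `expect`, `parse_assignment`, and `parse_statements` are hypothetical.

```python
SYNC = {";"}  # assumed set of synchronizing tokens

def expect(tokens, pos, pred, what):
    """Consume one token satisfying `pred`, or raise SyntaxError."""
    if pos >= len(tokens) or not pred(tokens[pos]):
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise SyntaxError(f"expected {what} at token {pos}, found {found!r}")
    return pos + 1

def parse_assignment(tokens, pos):
    """Parse  ID = NUM ;  starting at `pos`; return the position after it."""
    pos = expect(tokens, pos, str.isidentifier, "identifier")
    pos = expect(tokens, pos, lambda t: t == "=", "'='")
    pos = expect(tokens, pos, str.isdigit, "number")
    pos = expect(tokens, pos, lambda t: t == ";", "';'")
    return pos

def parse_statements(tokens):
    """Return (number of well-formed statements, list of error messages)."""
    pos, ok, errors = 0, 0, []
    while pos < len(tokens):
        try:
            pos = parse_assignment(tokens, pos)
            ok += 1
        except SyntaxError as e:
            errors.append(str(e))
            # Panic mode: discard tokens up to the next synchronizing token.
            while pos < len(tokens) and tokens[pos] not in SYNC:
                pos += 1
            pos += 1  # consume the sync token, so every error makes progress
    return ok, errors

ok, errors = parse_statements("x = 1 ; y = = 2 ; z = 3 ;".split())
print(ok, errors)
```

Because each recovery consumes at least one token, the loop cannot run forever. The cost is also visible: everything between the error and the next `;` is skipped without being checked.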
Error Recovery Approaches: Phrase-Level Recovery
Example
• The key...
– Don’t get into an infinite loop, constantly inserting tokens and never scanning the actual source
• Generally used for error-repairing compilers
– Difficulty: The point of error detection might be much later than the point of
error occurrence
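
A minimal sketch of phrase-level recovery with a guard against the infinite-insertion problem. The assumptions are illustrative only: the toy grammar is a sequence of `ID = NUM ;` statements, the repair policy is "insert the expected token once, otherwise delete the offending token", and the names `PATTERN` and `parse_with_repair` are hypothetical.

```python
PATTERN = [
    ("identifier", str.isidentifier),
    ("'='", lambda t: t == "="),
    ("number", str.isdigit),
    ("';'", lambda t: t == ";"),
]

def parse_with_repair(tokens):
    """Parse  ID = NUM ;  statements and return the list of local repairs made."""
    pos, step, repairs = 0, 0, []
    inserted_since_progress = False  # True after an insertion that consumed no input
    while pos < len(tokens):
        name, matches = PATTERN[step]
        if matches(tokens[pos]):
            pos += 1                               # normal case: consume the token
            inserted_since_progress = False
            step = (step + 1) % len(PATTERN)
        elif not inserted_since_progress:
            # Local correction 1: pretend the expected token was present.
            repairs.append(f"inserted {name} before token {pos}")
            inserted_since_progress = True
            step = (step + 1) % len(PATTERN)
        else:
            # Local correction 2: a second insertion in a row could loop forever,
            # so delete the offending token instead; this consumes real input.
            repairs.append(f"deleted {tokens[pos]!r} at token {pos}")
            pos += 1
            inserted_since_progress = False
    while step != 0:                               # input ended mid-statement
        repairs.append(f"inserted {PATTERN[step][0]} at end of input")
        step = (step + 1) % len(PATTERN)
    return repairs

print(parse_with_repair("x = 1 y = 2 ;".split()))
```

At most one insertion is allowed between two steps that consume input, so the parser always advances through the actual source. Without the flag, an input such as a lone `+` would trigger insertions indefinitely.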
Error Recovery Approaches: Error Productions
• Used with...
– LR (Bottom-up) parsing
– Parser Generators
Error Recovery Approaches: Global Correction
• Theoretical Approach
• Find the minimum change to the source to yield a
valid program
– Insert tokens, delete tokens, swap adjacent tokens
• Global Correction Algorithm
Input: grammatically incorrect input string x; grammar G
Output: grammatically correct string y
Algorithm: converts x → y using the minimum number of
changes (insertion, deletion, etc.)
• Impractical algorithms - too time consuming
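
To make the idea concrete, here is a brute-force sketch of global correction for one tiny language. It is not a practical algorithm and not the lecture's own: it assumes the grammar G is "balanced parentheses", treats each character as a token, and searches breadth-first over single insertions, deletions, and adjacent swaps. The names `is_balanced`, `neighbours`, and `global_correction` are hypothetical.

```python
from collections import deque

def is_balanced(s):
    """Membership test for the assumed grammar: balanced strings of ( and )."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0

def neighbours(s, alphabet):
    """All strings reachable from s by one insert, delete, or adjacent swap."""
    for i in range(len(s) + 1):
        for ch in alphabet:
            yield s[:i] + ch + s[i:]                    # insert a token
    for i in range(len(s)):
        yield s[:i] + s[i + 1:]                         # delete a token
    for i in range(len(s) - 1):
        yield s[:i] + s[i + 1] + s[i] + s[i + 2:]       # swap adjacent tokens

def global_correction(x, is_valid, alphabet):
    """Return (y, cost): a valid string y reachable from x in the fewest edits."""
    seen, queue = {x}, deque([(x, 0)])
    while queue:
        s, cost = queue.popleft()
        if is_valid(s):
            return s, cost          # BFS order guarantees the cost is minimal
        for t in neighbours(s, alphabet):
            if t not in seen:
                seen.add(t)
                queue.append((t, cost + 1))

print(global_correction(")(", is_balanced, "()"))
```

The search terminates here because deleting every token yields the empty string, which is balanced. The number of candidate strings grows very quickly with the input length and the number of edits, which illustrates why this approach is considered too time consuming for real compilers.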
Context Free Grammars (CFG)