Lexical Analysis

Lexical analysis breaks down source code text into tokens. It identifies basic elements like keywords, identifiers, literals, and punctuation. During lexical analysis, a lexer applies patterns defined by regular expressions to the source code and generates a stream of tokens. Common tokens include variables, functions, operators, and punctuation. The lexer does not interpret the meaning of the tokens, only identifies them based on patterns.

Uploaded by

anonlm

Lexical Analysis
Lexical: pertaining to words

Source code:

    int main()
    {
        printf("Hello");
        return 0;
    }

Tokens produced by lexical analysis:

    int       (datatype)
    main      (identifier)
    (         (open par.)
    )         (close par.)
    {         (open brace)
    printf    (identifier)
    (         (open par.)
    "Hello"   (string lit.)
    )         (close par.)
    ;         (terminator)
    return    (return keyword)
    0         (integer lit.)
    ;         (terminator)
    }         (close brace)

Tokens
A string that follows a certain pattern.

Pattern
A rule that describes a set of strings to be associated with a token.
Expressed using regular expressions.

What Lexical Analysis Is and What It's Not

Missing closing parenthesis (no lexical error):

    int main()
    {
        printf("Hello";
        return 0;
    }

Missing closing quotes (lexical error):

    int main()
    {
        printf("Hello);
        return 0;
    }

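The difference between the two cases can be sketched with a small scanner. This is only an illustration: the token patterns and the names `TOKEN_RE` and `has_lexical_error` are assumptions made for this example, not part of the slides. A missing parenthesis still leaves every piece of the text matching some token pattern, while an unclosed quote leaves text that matches none.

```python
import re

# Hypothetical token patterns for the small C fragment on the slide.
TOKEN_RE = re.compile(r'''
    \s+                      # whitespace (skipped)
  | [A-Za-z_][A-Za-z0-9_]*   # keywords and identifiers
  | [0-9]+                   # integer literals
  | "[^"\n]*"                # string literals, closed on the same line
  | [(){};]                  # punctuation
''', re.VERBOSE)

def has_lexical_error(source: str) -> bool:
    """Return True if some part of the source matches no token pattern."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            return True          # no pattern applies here: lexical error
        pos = match.end()
    return False

missing_paren = 'int main()\n{\nprintf("Hello";\nreturn 0;\n}'
missing_quote = 'int main()\n{\nprintf("Hello);\nreturn 0;\n}'

print(has_lexical_error(missing_paren))  # False: every piece is a valid token
print(has_lexical_error(missing_quote))  # True: the string literal never closes
```

The missing parenthesis would be reported later, by the parser, since it is a problem with how the tokens are arranged rather than with the tokens themselves.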
Quick Recap on AUTOMAT
3 operations for regular expressions

    concatenation   ab    a followed by b
    union           a|b   a or b
    star            a*    any number of a's

The special symbol ԑ is used to denote the empty string.

*precedence: star, concatenation, union


Quick Recap on AUTOMAT
Describe the following regular expressions:

    a*b*c*

    a*(ba*|ԑ)

    0*10*

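One way to explore these expressions is to test sample strings against them. The sketch below assumes Python's `re` syntax, where ԑ has no symbol of its own and is written as an empty alternative; the name `accepts` and the sample strings are chosen only for illustration. The descriptions in the comments are one possible reading of each expression.

```python
import re

def accepts(pattern: str, text: str) -> bool:
    """True if the whole text is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None

# a*b*c*: zero or more a's, then zero or more b's, then zero or more c's
ABC = r"a*b*c*"

# a*(ba*|ԑ): a's containing at most one b (ԑ written as an empty alternative)
AT_MOST_ONE_B = r"a*(ba*|)"

# 0*10*: strings of 0's and 1's containing exactly one 1
EXACTLY_ONE_1 = r"0*10*"

for pattern, samples in [
    (ABC, ["", "aabcc", "ac", "ba"]),
    (AT_MOST_ONE_B, ["", "aaa", "aba", "abab"]),
    (EXACTLY_ONE_1, ["1", "0010", "000", "0110"]),
]:
    for text in samples:
        print(repr(pattern), repr(text), accepts(pattern, text))
```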
Additional Shortcuts (Yay)!

    To express…              We can write…
    a|b|0|1                  [ab01]
    0|1|2|3|4|5|6|7|8|9      [0-9]
    all letters              [A-Za-z]
    ss*                      s+
    s|ԑ                      s?

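Each shortcut describes the same set of strings as its longer form. The sketch below checks this by brute force over all short strings from a small alphabet, which is evidence rather than a proof. The function name `same_language` and its parameters are assumptions made for this example, and ԑ is written as an empty alternative in Python's `re` syntax.

```python
import re
from itertools import product

def same_language(p1: str, p2: str, alphabet: str, max_len: int = 3) -> bool:
    """Check that two patterns agree on every string up to max_len characters."""
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            text = "".join(chars)
            in_first = re.fullmatch(p1, text) is not None
            in_second = re.fullmatch(p2, text) is not None
            if in_first != in_second:
                return False
    return True

print(same_language(r"a|b|0|1", r"[ab01]", "ab01c"))                          # True
print(same_language(r"0|1|2|3|4|5|6|7|8|9", r"[0-9]", "0123456789x", 2))      # True
print(same_language(r"ss*", r"s+", "st"))                                     # True
print(same_language(r"s|", r"s?", "st"))                                      # True
print(same_language(r"s*", r"s+", "st"))  # False: only s* accepts the empty string
```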
Pattern Examples

A series of letters ending with either a period or exclamation mark
    [A-Za-z]+[.!]

A digit followed by an optional uppercase letter
    [0-9][A-Z]?

An even digit
    [02468]

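The three patterns above can be checked directly in Python, whose `re` module accepts them unchanged. The constant names are chosen only for this sketch.

```python
import re

LETTERS_THEN_MARK = re.compile(r"[A-Za-z]+[.!]")   # letters, then . or !
DIGIT_OPT_UPPER = re.compile(r"[0-9][A-Z]?")       # digit, optional uppercase letter
EVEN_DIGIT = re.compile(r"[02468]")                # a single even digit

def whole(pattern: re.Pattern, text: str) -> bool:
    """True if the pattern matches the entire text."""
    return pattern.fullmatch(text) is not None

print(whole(LETTERS_THEN_MARK, "Hello!"))   # True
print(whole(LETTERS_THEN_MARK, "Hello"))    # False: no ending mark
print(whole(DIGIT_OPT_UPPER, "7"))          # True: the letter is optional
print(whole(DIGIT_OPT_UPPER, "7B"))         # True
print(whole(DIGIT_OPT_UPPER, "7b"))         # False: lowercase letter
print(whole(EVEN_DIGIT, "4"))               # True
print(whole(EVEN_DIGIT, "24"))              # False: more than one digit
```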
But what if…
Sometimes, you would need the actual symbols like ? and [] in your regular expression.

To allow this, many lexer generator conventions make use of escape characters.

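As an illustration, Python's `re` module uses the backslash as its escape character. This is an assumption about one particular tool; a specific lexer generator may follow a different convention. The names `INDEX` and `literal` are introduced only for this sketch.

```python
import re

# Unescaped, ? is the "optional" operator: a? matches "" or "a".
assert re.fullmatch(r"a?", "") is not None
assert re.fullmatch(r"a?", "a?") is None

# Escaped, \? stands for an actual question mark.
assert re.fullmatch(r"a\?", "a?") is not None

# \[ and \] stand for actual square brackets, e.g. an array index like arr[10].
INDEX = re.compile(r"[a-z]+\[[0-9]+\]")
assert INDEX.fullmatch("arr[10]") is not None
assert INDEX.fullmatch("arr10") is None

# re.escape adds the backslashes automatically for a literal piece of text.
literal = re.escape("[ab]?")
assert re.fullmatch(literal, "[ab]?") is not None
assert re.fullmatch(literal, "a") is None
```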
The Lexer

Its task is to perform lexical analysis on a given source code and return a sequence of tokens.

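A minimal sketch of such a lexer, written for the "Hello" program from the earlier slide, could look like the following. The token names, the patterns in `TOKEN_SPEC`, and the function `tokenize` are all assumptions made for this example; a real lexer for C would need many more patterns.

```python
import re
from typing import Iterator, NamedTuple

class Token(NamedTuple):
    kind: str   # token name, e.g. "identifier"
    text: str   # the matched characters

# Hypothetical token specification. Order matters: keywords come before
# identifiers so that "int" and "return" are not classified as identifiers.
TOKEN_SPEC = [
    ("datatype",    r"int\b"),
    ("ret_keyword", r"return\b"),
    ("identifier",  r"[A-Za-z_][A-Za-z0-9_]*"),
    ("integer_lit", r"[0-9]+"),
    ("string_lit",  r'"[^"\n]*"'),
    ("open_par",    r"\("),
    ("close_par",   r"\)"),
    ("open_brace",  r"\{"),
    ("close_brace", r"\}"),
    ("terminator",  r";"),
    ("skip",        r"\s+"),
]

# One combined pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens in source order; raise ValueError on a lexical error."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"lexical error at position {pos}: {source[pos]!r}")
        if match.lastgroup != "skip":        # whitespace produces no token
            yield Token(match.lastgroup, match.group())
        pos = match.end()

program = 'int main()\n{\n    printf("Hello");\n    return 0;\n}'
for token in tokenize(program):
    print(f"{token.text:10} ({token.kind})")
```

The lexer only classifies pieces of text. It does not check that the tokens appear in a sensible order, which is left to later stages.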
