Lexical Analysis (Scanner)
[Figure: Source Program (Character Stream) -> Lexical analyzer -> Tokens]
▪ Lexical analysis is also known as scanning; the lexical analyzer is also called a scanner.
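Lexical Analysis (scanner)
[Figure: source program -> lexical analyzer -> (token & attributes) -> syntax analyzer. The syntax analyzer asks the lexical analyzer for each token with "get next token". The figure also includes the Symbol Table.]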
A token is treated as a pair made up of two parts: (token-type, attribute-value).
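The (token-type, attribute-value) pair can be sketched as a small Java class. This is only an illustration: the class name `Token`, the `Type` enum and its members are assumptions chosen for the example, not names taken from the slides.

```java
import java.util.Objects;

/** A token as a (token-type, attribute-value) pair. All names are illustrative. */
final class Token {
    // Hypothetical set of token types; a real language defines its own.
    enum Type { ID, NUM, ASSIGN, PLUS, TIMES }

    final Type type;        // token-type: the category the parser works with
    final String attribute; // attribute-value: the lexeme or value, or null if there is none

    Token(Type type, String attribute) {
        this.type = Objects.requireNonNull(type);
        this.attribute = attribute;
    }

    @Override
    public String toString() {
        // Tokens without an attribute are shown with an empty second component.
        return "<" + type + ", " + (attribute == null ? "" : attribute) + ">";
    }
}
```

For instance, `new Token(Token.Type.NUM, "31")` is displayed as `<NUM, 31>`, while an operator token such as `new Token(Token.Type.PLUS, null)` carries no attribute.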
TOKENS, PATTERNS, AND LEXEMES:
In the English language: "I learn compiler design"
noun, verb, adjective, …
In a programming language:
Identifier, Integer, Keyword, Whitespace,…
Example: The program statement
count = 1
y := 31 + 28 * x
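The lexical analyzer turns y := 31 + 28 * x into the following tokens, which it passes to the parser:
<id, "y"> <assign, > <num, 31> <+, > <num, 28> <*, > <id, "x">
In each pair, the first component is the token and the second is the token value (token attribute).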
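A minimal hand-written scanner for this statement might look like the sketch below. It assumes identifiers are a letter followed by letters or digits, numbers are runs of digits, and the only operators are `:=`, `+` and `*`; the class name `SimpleScanner` and the string formatting of each token are choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written scanner for statements such as "y := 31 + 28 * x" (illustrative only). */
final class SimpleScanner {

    /** Returns the tokens of the input, each formatted as <type, attribute>. */
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates lexemes and produces no token
            } else if (Character.isLetter(c)) {
                int start = i; // identifier: letter (letter | digit)*
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("<id, \"" + input.substring(start, i) + "\">");
            } else if (Character.isDigit(c)) {
                int start = i; // number: digit+
                while (i < input.length() && Character.isDigit(input.charAt(i))) {
                    i++;
                }
                tokens.add("<num, " + input.substring(start, i) + ">");
            } else if (input.startsWith(":=", i)) {
                tokens.add("<assign, >"); // two-character operator
                i += 2;
            } else if (c == '+' || c == '*') {
                tokens.add("<" + c + ", >"); // single-character operator
                i++;
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at position " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : scan("y := 31 + 28 * x")) {
            System.out.println(token);
        }
    }
}
```

The scanner reads the character stream once from left to right and decides the token type from the first character of each lexeme, which is enough for this tiny language but not for one with keywords or multi-character comparison operators.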
[Figure: Example of the symbol table for C++ code]
Example:
3) Operators: +, -, *, /, ==, =, …
Lexical Tokens
(2) identifiers
• words that the programmer constructs to attach a
name to a construct. Identifiers may be used to identify
variables, classes, constants, functions, etc.
(3) operators
• symbols used for arithmetic, character, or
logical operations, such as +, -, =, !=, etc.
(4) numeric constants
• numbers such as 124, 12.35, 0.09E-23, etc.
• Numeric constants may be stored in a table.
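The pattern behind numeric constants such as 124, 12.35 and 0.09E-23 can be written as a regular expression. The sketch below assumes a constant is a run of digits, optionally followed by a fraction and an exponent; real languages differ in details (signs, suffixes, leading dots), so the pattern and the name `NumericConstant` are illustrative.

```java
import java.util.regex.Pattern;

/** Recognizes numeric constants under an assumed, simplified pattern. */
final class NumericConstant {
    // digits, optional ".digits" fraction, optional exponent such as E-23
    private static final Pattern NUMBER =
            Pattern.compile("\\d+(\\.\\d+)?([eE][+-]?\\d+)?");

    /** True if the whole lexeme matches the numeric-constant pattern. */
    static boolean isNumericConstant(String lexeme) {
        return NUMBER.matcher(lexeme).matches();
    }

    public static void main(String[] args) {
        for (String lexeme : new String[] {"124", "12.35", "0.09E-23", "count"}) {
            System.out.println(lexeme + " -> " + isNumericConstant(lexeme));
        }
    }
}
```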
average = (sum/count)
Lexeme     Token
average    identifier
=          assignment operator
(          open parenthesis
sum        identifier
/          division operator
count      identifier
)          close parenthesis
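The lexeme-to-token listing for average = (sum/count) could be produced by a sketch like the following. It assumes a lexeme is either an identifier or a single non-space symbol and covers only the symbols in this statement; `LexemeTable`, `classify` and `table` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Splits a statement into lexemes and names the token each one belongs to. */
final class LexemeTable {
    // One lexeme per match: an identifier, or any single non-whitespace character.
    private static final Pattern LEXEME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|\\S");

    /** Maps a lexeme to a token name; only the symbols of the example are known. */
    static String classify(String lexeme) {
        switch (lexeme) {
            case "=": return "assignment operator";
            case "/": return "division operator";
            case "(": return "open parenthesis";
            case ")": return "close parenthesis";
            default:
                char first = lexeme.charAt(0);
                return (Character.isLetter(first) || first == '_') ? "identifier" : "unknown";
        }
    }

    /** Returns one "lexeme<TAB>token" row per lexeme, in source order. */
    static List<String> table(String statement) {
        List<String> rows = new ArrayList<>();
        Matcher m = LEXEME.matcher(statement);
        while (m.find()) {
            rows.add(m.group() + "\t" + classify(m.group()));
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : table("average = (sum/count)")) {
            System.out.println(row);
        }
    }
}
```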
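An example of source input:
package tokencount;

import java.util.StringTokenizer;

public class tokeniz {
    public static void main(String[] args) {
        // With no delimiter argument, StringTokenizer splits on whitespace.
        StringTokenizer st = new StringTokenizer("Aslamu alikum all");
        // countTokens() reports how many tokens remain: 3 for this string.
        System.out.println("Total tokens : " + st.countTokens());
    }
}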
Extract tokens:
output:
This
is
a
test
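The code for this step is not reproduced in the text. A minimal sketch that would print these four tokens, assuming the input string is "This is a test" and the default whitespace delimiters, uses the `hasMoreTokens` / `nextToken` loop of `java.util.StringTokenizer`; the class name `ExtractTokens` is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/** Extracts whitespace-separated tokens with StringTokenizer. */
final class ExtractTokens {

    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        // Default delimiters are space, tab, newline, carriage return and form feed.
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed input; prints each token on its own line.
        for (String token : tokens("This is a test")) {
            System.out.println(token);
        }
    }
}
```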
The following example illustrates how the String.split
method can be used to break up a string into its basic
tokens:
output:
this
is
a
test
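The `String.split` code itself is not reproduced in the text. A minimal sketch, assuming the input is "this is a test" and that tokens are separated by whitespace, is given below; the class name `SplitTokens` is an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Breaks a string into tokens with String.split. */
final class SplitTokens {

    static List<String> tokens(String text) {
        // "\\s+" is a regular expression for one or more whitespace characters;
        // trim() avoids an empty first element when the text starts with spaces.
        return Arrays.asList(text.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Assumed input; prints each token on its own line.
        for (String token : tokens("this is a test")) {
            System.out.println(token);
        }
    }
}
```

Unlike `StringTokenizer`, `split` takes a regular expression, so the delimiter can be described as a pattern rather than a fixed set of characters.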