Assignment Two

Uploaded by

Ego Computer

Token

A token is a sequence of characters that is treated as a single unit because it cannot be broken down further. In a programming language such as C, keywords (int, char, float, const, goto, continue, etc.), identifiers (user-defined names), operators (+, -, *, /), delimiters/punctuators such as the comma (,), semicolon (;), and braces ({ }), and strings can all be considered tokens. The lexical analysis phase recognizes three types of tokens: Terminal Symbols (TRM), which cover keywords and operators; Literals (LIT); and Identifiers (IDN).
Let's now see how to count the tokens in a piece of C source code.
Example 1:
int a = 10; //Input Source code

Tokens
int (keyword), a (identifier), = (operator), 10 (constant), and ; (punctuation: semicolon), for a total of 5 tokens. The comment is not a token.
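
The counting in Example 1 can be sketched in C. This is only a minimal illustration, not a real C lexer: the names TokenKind, KEYWORDS, is_keyword, and tokenize are made up for this example, the keyword list is a small subset, and only single-character operators and punctuators are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token kinds used in Example 1 (hypothetical names for this sketch). */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_OPERATOR,
    TOK_CONSTANT, TOK_PUNCTUATION, TOK_UNKNOWN
} TokenKind;

/* Assumption: a tiny illustrative keyword list; real C has many more. */
static const char *const KEYWORDS[] = {
    "int", "char", "float", "const", "goto", "continue"
};

static int is_keyword(const char *s, size_t len) {
    for (size_t i = 0; i < sizeof KEYWORDS / sizeof KEYWORDS[0]; i++) {
        if (strlen(KEYWORDS[i]) == len && strncmp(KEYWORDS[i], s, len) == 0)
            return 1;
    }
    return 0;
}

/* Scans src, stores the kind of each token in kinds[] (at most max entries),
   and returns the total number of tokens found. */
size_t tokenize(const char *src, TokenKind *kinds, size_t max) {
    assert(src != NULL);
    size_t count = 0;
    const char *p = src;
    while (*p != '\0') {
        if (isspace((unsigned char)*p)) { p++; continue; }
        if (p[0] == '/' && p[1] == '/') {            /* skip a // comment */
            while (*p != '\0' && *p != '\n') p++;
            continue;
        }
        TokenKind kind;
        if (isalpha((unsigned char)*p) || *p == '_') {
            const char *start = p;                    /* keyword or identifier */
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            kind = is_keyword(start, (size_t)(p - start)) ? TOK_KEYWORD
                                                          : TOK_IDENTIFIER;
        } else if (isdigit((unsigned char)*p)) {
            while (isdigit((unsigned char)*p)) p++;   /* integer constant */
            kind = TOK_CONSTANT;
        } else if (strchr("+-*/=", *p) != NULL) {
            p++;
            kind = TOK_OPERATOR;
        } else if (strchr(",;(){}", *p) != NULL) {
            p++;
            kind = TOK_PUNCTUATION;
        } else {
            p++;
            kind = TOK_UNKNOWN;
        }
        if (count < max) kinds[count] = kind;
        count++;
    }
    return count;
}
```

Calling tokenize("int a = 10; //Input Source code", kinds, 8) returns 5, with the kinds keyword, identifier, operator, constant, and punctuation in that order. The trailing comment is skipped and does not count as a token.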

Lexeme

A lexeme is a sequence of characters in the source code that matches the predefined rules (the pattern) of a token, so that it is recognized as a valid instance of that token.
Example:
main is a lexeme of type identifier (token)
(, ), {, } are lexemes of type punctuation (token)

Pattern

A pattern specifies the set of rules that a scanner follows to create a token.
Example for a programming language (C, C++):
For a keyword to be identified as a valid token, the pattern is the exact sequence of characters that makes up the keyword.
For an identifier to be identified as a valid token, the pattern is the predefined rule that it must start with a letter (or an underscore, in C and C++), followed by any number of letters, digits, or underscores.
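
The two patterns can be written as small C predicates. This is a sketch under stated assumptions: the function names matches_keyword_pattern and matches_identifier_pattern are invented here, the keyword list is an illustrative subset, and the identifier rule also accepts the underscore, as C does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Keyword pattern: the lexeme is exactly the character sequence of a keyword.
   Assumption: an illustrative subset, not the full C keyword set. */
int matches_keyword_pattern(const char *lexeme) {
    static const char *const keywords[] = {
        "int", "char", "float", "const", "goto", "continue"
    };
    assert(lexeme != NULL);
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(lexeme, keywords[i]) == 0) return 1;
    }
    return 0;
}

/* Identifier pattern: a letter or underscore, followed by any number of
   letters, digits, or underscores. */
int matches_identifier_pattern(const char *lexeme) {
    assert(lexeme != NULL);
    if (!(isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_')) return 0;
    for (size_t i = 1; lexeme[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)lexeme[i]) || lexeme[i] == '_')) return 0;
    }
    return 1;
}
```

Here main and a1 match the identifier pattern, while 1a and the empty string do not. A keyword such as goto also fits the identifier pattern, so a scanner would normally test the keyword pattern first.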

| Criteria | Token | Lexeme | Pattern |
|---|---|---|---|
| Definition | A sequence of characters that is treated as a unit because it cannot be broken down further. | A sequence of characters in the source code that matches the predefined rules for a valid token. | The set of rules that a scanner follows to create a token. |
| Interpretation of type Keyword | All the reserved keywords of the language (int, goto, etc.). | int, goto | The sequence of characters that makes up the keyword. |
| Interpretation of type Identifier | The name of a variable, function, etc. | main, a | It must start with a letter (or an underscore, in C), followed by letters, digits, or underscores. |
| Interpretation of type Operator | All the operators are considered tokens. | +, = | +, = |
| Interpretation of type Punctuation | Each kind of punctuation is considered a token, e.g. semicolon, bracket, comma, etc. | (, ), {, } | (, ), {, } |
| Interpretation of type Literal | A literal value, such as a string or boolean literal. | "Welcome to GeeksforGeeks!" | Any string of characters (except ") between " and " |
