Program No. - 3: Write A Program To Find Different Tokens in A Program
Aim:
Write a program to find different tokens in a program.
Theory:
Lexical analysis is the first phase of a compiler; the component that performs it is also
known as the scanner. It converts the input program into a sequence of tokens.
Lexical analysis can be implemented with a deterministic finite automaton (DFA).
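The link between lexical analysis and finite automata can be illustrated with a small DFA. The sketch below assumes C-style identifiers (a letter or underscore followed by letters, digits or underscores); the names State, step and isIdentifier are illustrative and are not part of the program in this report.

```cpp
#include <cctype>
#include <string>

// States of a small DFA that accepts identifiers: [A-Za-z_][A-Za-z0-9_]*
enum class State { Start, InIdent, Reject };

// Transition function: next state for the current state and input character.
State step(State st, char c) {
    unsigned char u = static_cast<unsigned char>(c);
    switch (st) {
        case State::Start:
            return (std::isalpha(u) || c == '_') ? State::InIdent : State::Reject;
        case State::InIdent:
            return (std::isalnum(u) || c == '_') ? State::InIdent : State::Reject;
        default:
            return State::Reject;  // Reject is a trap state: no way back out
    }
}

// Runs the DFA over the whole string; InIdent is the only accepting state.
bool isIdentifier(const std::string& s) {
    State st = State::Start;
    for (char c : s) st = step(st, c);
    return st == State::InIdent;
}
```

With this sketch, isIdentifier("count1") is true and isIdentifier("1abc") is false. Keywords such as while also pass this check, so a scanner built this way would need to test the keyword list first.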
Token-
A lexical token is a sequence of characters that can be treated as a single unit in the
grammar of a programming language.
Examples of tokens:
Type tokens (id, number, real, . . . )
Alphabetic tokens, i.e. keywords (if, void, return, . . . )
Punctuation tokens (';', ',', '(', ')', . . . )
Keywords; examples: for, while, if, etc.
Identifiers; examples: variable names, function names, etc.
Operators; examples: '+', '++', '-', etc.
Separators; examples: ',', ';', etc.
Examples of non-tokens:
Comments, preprocessor directives, macros, blanks, tabs, newlines, etc.
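Because non-tokens are discarded before counting, a scanner usually removes them first. The following is a minimal sketch of that step for comments only; stripComments is a hypothetical helper, and it assumes the input contains no string or character literals that include comment markers.

```cpp
#include <string>

// Replaces each // and /* */ comment with one space so that the tokens
// on either side of the comment stay separated.
// Assumption: no string or character literal contains "//" or "/*".
std::string stripComments(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src.compare(i, 2, "//") == 0) {
            // Line comment: skip up to (but not including) the newline.
            while (i < src.size() && src[i] != '\n') ++i;
            out += ' ';
        } else if (src.compare(i, 2, "/*") == 0) {
            // Block comment: skip to the closing */ or to the end of input.
            std::size_t end = src.find("*/", i + 2);
            i = (end == std::string::npos) ? src.size() : end + 2;
            out += ' ';
        } else {
            out += src[i++];  // ordinary character: copy it through
        }
    }
    return out;
}
```

For instance, stripComments("a/*x*/b") yields "a b". Blanks, tabs and newlines are left in place here, since reading whitespace-separated strings already skips them.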
Algorithm:
1. The file is read one whitespace-separated string at a time until the end of the file.
2. Each string is checked against the rules and string lists for keywords, identifiers,
operators and delimiters.
3. If the string matches one of the token classes, the corresponding function returns 1;
otherwise it returns 0.
4. Each identified token increments the counter variable for its class.
5. The file is closed once all the tokens have been identified and counted.
Code:
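#include <cctype>
#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;
// The helper functions were missing from the listing; the token lists below are an assumed minimal subset.
// Returns 1 if s equals one of the n strings in list, else 0.
int inList(const char *s, const char *const list[], int n) {
    for (int i = 0; i < n; i++)
        if (strcmp(s, list[i]) == 0) return 1;
    return 0;
}
int keywords(const char *s) {
    const char *const kw[] = {"int", "char", "float", "void", "if", "else", "for", "while", "return"};
    return inList(s, kw, 9);
}
int operators(const char *s) {
    const char *const ops[] = {"+", "-", "*", "/", "=", "++", "--", "==", "<", ">"};
    return inList(s, ops, 10);
}
int delimiters(const char *s) {
    const char *const dl[] = {",", ";", "(", ")", "{", "}"};
    return inList(s, dl, 6);
}
// An identifier starts with a letter or '_' and continues with letters, digits or '_'.
int identifiers(const char *s) {
    if (!isalpha((unsigned char)s[0]) && s[0] != '_') return 0;
    for (int i = 1; s[i] != '\0'; i++)
        if (!isalnum((unsigned char)s[i]) && s[i] != '_') return 0;
    return 1;
}
int main() {
    int k = 0, id = 0, op = 0, d = 0;
    char s[100];
    FILE *fp = fopen("text.txt", "r");
    if (fp == NULL) { cout << "Cannot open text.txt" << endl; return 1; }
    // Read one whitespace-separated string at a time (at most 99 characters) until end of file.
    while (fscanf(fp, "%99s", s) == 1) {
        if (keywords(s)) k++;          // keywords are tested first because they also look like identifiers
        else if (operators(s)) op++;
        else if (delimiters(s)) d++;
        else if (identifiers(s)) id++;
    }
    fclose(fp);
    cout << "No. of keywords = " << k << endl;
    cout << "No. of identifiers = " << id << endl;
    cout << "No. of operators = " << op << endl;
    cout << "No. of delimiters = " << d << endl;
    return 0;
}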
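Since the program reads the file with fscanf and %s, tokens are only separated at whitespace, so input such as a=b+c; arrives as one string and is not counted. One possible way around this is to split each line character by character before classifying. The sketch below is illustrative only: splitLexemes is a hypothetical helper, and it assumes that ++, -- and == are the only two-character operators.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Splits a line into lexemes without relying on whitespace between them.
// Assumptions: only the two-character operators ++, -- and == are merged;
// every other non-alphanumeric character becomes a one-character lexeme.
std::vector<std::string> splitLexemes(const std::string& line) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isspace(c)) { ++i; continue; }  // blanks are non-tokens
        std::size_t start = i;
        if (std::isalnum(c) || c == '_') {
            // Word or number: consume letters, digits and underscores.
            while (i < line.size() &&
                   (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_'))
                ++i;
        } else if (i + 1 < line.size() && line[i + 1] == line[i] &&
                   (c == '+' || c == '-' || c == '=')) {
            i += 2;  // ++, -- or ==
        } else {
            ++i;     // single-character operator or delimiter
        }
        out.push_back(line.substr(start, i - start));
    }
    return out;
}
```

Applied to "a=b+c;", this returns the six lexemes a, =, b, +, c and ;, each of which could then be passed to the keyword, operator, delimiter and identifier checks.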
Output:
Learnings:
Tokens are the basic units produced by lexical analysis, and the later phases of the
compiler operate on them. Every source program is converted into a sequence of tokens as
defined by the programming language. Some of the most common token classes are keywords,
identifiers, numbers, strings, operators, constants and delimiters.