System Software and Compiler Lab: Token Separation

The document describes a program to perform token separation on a given subset of a programming language. The program takes source code as input, separates it into tokens using regular expressions, and outputs the tokens grouped into categories like keywords, identifiers, operators, and punctuation. It reads in a file, separates the strings using a string tokenizer, matches the strings to patterns for different token types, and prints the tokens along with their classification.

Uploaded by abhinaya

System Software and Compiler Lab

Token Separation

Aim
To write a program to perform token separation for a given subset of a language.

Description
Scanning is the first phase of a compiler, in which the source program is read character by
character and grouped into tokens. A token is a sequence of characters with a
collective meaning. Tokens include identifiers, keywords, operators, punctuation,
constants, etc. The input is a program written in any high-level language, and the output is a
stream of tokens. Regular expressions can be used to implement this token separation.
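
The character-by-character grouping described above can be sketched as follows. This is only an illustrative sketch, not the program of this exercise: the class name CharScanner and the method scan are hypothetical, and it assumes that every operator and punctuation mark is a single character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the lab program.
public class CharScanner {

    // Reads src one character at a time and groups the characters into lexemes.
    public static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (Character.isLetter(c)) {
                int start = i; // identifier or keyword: letter (letter | digit)*
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) {
                    i++;
                }
                tokens.add(src.substring(start, i));
            } else if (Character.isDigit(c)) {
                int start = i; // constant: digit+
                while (i < src.length() && Character.isDigit(src.charAt(i))) {
                    i++;
                }
                tokens.add(src.substring(start, i));
            } else {
                // Assumption: any other character is a one-character operator or punctuation mark.
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(scan("int sum=a+10;"));
        // prints [int, sum, =, a, +, 10, ;]
    }
}
```

Because the grouping is driven by the characters themselves, this sketch does not need spaces between tokens.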

Algorithm
Step 1: Read the content of the file using FileReader.

Step 2: Separate the string at the space delimiter using StringTokenizer.

Step 3: Match each string against the patterns using regular expressions.

Step 4: Group the tokens as identifiers, keywords, operators, punctuation, etc., and display them
in the given format, e.g. <Keyword,int>.
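
Steps 2 to 4 can be tried on an in-memory string before any file handling is involved. The sketch below is a minimal illustration under stated assumptions: the names TokenClassifier, classify and tokenize are hypothetical, the "Unknown" category is an addition for strings that match no pattern, and the input must have a space between every pair of tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; names are not taken from the lab program.
public class TokenClassifier {

    static final Pattern KEYWORD =
            Pattern.compile("public|static|void|class|package|int|float|char|String");
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");
    static final Pattern OPERATOR = Pattern.compile("[-+*/<>=]");
    static final Pattern CONSTANT = Pattern.compile("[0-9]+");
    static final Pattern PUNCTUATION = Pattern.compile("[;,{}()]");

    // Step 3: match one string against each pattern in turn.
    public static String classify(String lexeme) {
        // Keywords are tested first because every keyword also matches the identifier pattern.
        if (KEYWORD.matcher(lexeme).matches()) return "<Keyword," + lexeme + ">";
        if (IDENTIFIER.matcher(lexeme).matches()) return "<Identifier," + lexeme + ">";
        if (OPERATOR.matcher(lexeme).matches()) return "<Operator," + lexeme + ">";
        if (CONSTANT.matcher(lexeme).matches()) return "<Constant," + lexeme + ">";
        if (PUNCTUATION.matcher(lexeme).matches()) return "<Punctuation," + lexeme + ">";
        return "<Unknown," + lexeme + ">"; // assumption: fallback category for this sketch only
    }

    // Steps 2 and 4: split on spaces, then label every piece.
    public static List<String> tokenize(String code) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(code, " ");
        while (st.hasMoreTokens()) {
            out.add(classify(st.nextToken()));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String token : tokenize("int a = b + 10 ;")) {
            System.out.println(token);
        }
    }
}
```

Matcher.matches() succeeds only when the whole string fits the pattern, so a word such as "integer" is reported as an identifier rather than as the keyword "int".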

Program
package compiler;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Token {

    public static void main(String[] args) throws IOException {
        System.out.println("Token Separation");
        String code = "";
        try {
            FileReader fr = new FileReader("E://K7_Eclipse_ws//lab//src//compiler//samplein.txt");
            BufferedReader br = new BufferedReader(fr);
            // Append a space after each line so tokens on adjacent lines do not run together.
            while (br.ready()) {
                code = code + br.readLine() + " ";
            }
            br.close();
        } catch (FileNotFoundException ex) {
            System.out.println("File Does Not Exist");
        }

        // Split on spaces: every token in the input file must be separated by a space.
        StringTokenizer st = new StringTokenizer(code, " ");
        String str;

        Pattern keypath = Pattern.compile("public|static|void|class|package|int|float|char|String");
        Pattern idpath = Pattern.compile("[a-zA-Z]([a-zA-Z]|[0-9])*");
        // Inside a character class '|' is a literal, not alternation, so it is left out;
        // '-' is placed first so that it is not read as a range.
        Pattern oppath = Pattern.compile("[-+*/<>=]");
        Pattern numpath = Pattern.compile("[0-9]+");
        Pattern punpath = Pattern.compile("[;,{}()]");

        Matcher kmatch;
        Matcher idmatch;
        Matcher opmatch;
        Matcher nummatch;
        Matcher punmatch;

        while (st.hasMoreTokens()) {
            str = st.nextToken();
            kmatch = keypath.matcher(str);
            idmatch = idpath.matcher(str);
            opmatch = oppath.matcher(str);
            nummatch = numpath.matcher(str);
            punmatch = punpath.matcher(str);

            // Keywords are tested before identifiers because they also match idpath.
            if (kmatch.matches()) {
                System.out.println("<Keyword," + str + ">");
            } else if (idmatch.matches()) {
                System.out.println("<Identifier," + str + ">");
            } else if (opmatch.matches()) {
                System.out.println("<Operator," + str + ">");
            } else if (nummatch.matches()) {
                System.out.println("<Constant," + str + ">");
            } else if (punmatch.matches()) {
                System.out.println("<Punctuation," + str + ">");
            }
        }
    }
}


Input

Output


Result
Thus the program for token separation was written and executed successfully.
