Compiler Design Lab File

The program implements a lexical analyzer to recognize the token 'for(' by using a finite state machine approach. It transitions between states while reading the input string character by character. If the input matches the expected sequence, it prints the token.

Uploaded by

jatin yadav

DELHI TECHNOLOGICAL UNIVERSITY

IT-302

COMPILER DESIGN

Department of Information Technology


Submitted by: Jatin Yadav

Roll Number :- 2K21/IT/88

Batch :- IT-B

Submitted to : Prof Seba Susan

INDEX

S. No.  Experiment  Date

1.  Program to implement NFA that accepts my name  8-01-2024

2.  Write a program to implement NFA corresponding to ‘for’ keyword  15-01-2024

3.  Write a Program to implement a lexical analyzer to issue tokens for “for(”  29-01-2024

4.  Write a program to implement all the categories of the lexical analysis  5-02-2024

EXPERIMENT-1

AIM : Program to implement NFA that accepts your name

THEORY :

Non-deterministic Finite Automata (NFA): An NFA is a finite automaton in which a
state may have more than one possible next state for the same input symbol, i.e.
some of the moves cannot be uniquely determined by the present state and the
present input symbol.

An NFA can be represented as M = (Q, ∑, δ, q0, F)


Q → Finite non-empty set of states.
∑ → Finite non-empty set of input symbols.
δ → Transition function, δ : Q × ∑ → 2^Q.
q0 → Initial state, q0 ∈ Q.
F → Set of final (accepting) states, F ⊆ Q.
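
The five-tuple above can be sketched in C++ as a table that maps a (state, symbol) pair to a set of next states. This is only an illustrative sketch, not the code used in the experiment: the names `Nfa`, `accepts` and `makeEndsWithAb` are hypothetical, and the example automaton (strings over {a, b} that end in "ab") is an assumption chosen because it has a genuinely non-deterministic move.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical NFA representation: delta maps (state, symbol) to a SET of states.
struct Nfa {
    std::map<std::pair<int, char>, std::set<int>> delta;
    int start = 0;
    std::set<int> finals;
};

// Simulates the NFA by tracking every state it could be in after each symbol.
bool accepts(const Nfa& m, const std::string& input) {
    std::set<int> current = {m.start};
    for (char c : input) {
        std::set<int> next;
        for (int q : current) {
            auto it = m.delta.find({q, c});
            if (it != m.delta.end()) {
                next.insert(it->second.begin(), it->second.end());
            }
        }
        current = next;
    }
    for (int q : current) {
        if (m.finals.count(q)) return true;  // at least one path ended in a final state
    }
    return false;
}

// Example NFA (assumed for illustration): accepts strings over {a, b} ending in "ab".
Nfa makeEndsWithAb() {
    Nfa m;
    m.start = 0;
    m.finals = {2};
    m.delta[{0, 'a'}] = {0, 1};  // non-deterministic: stay in 0 or guess that "ab" starts here
    m.delta[{0, 'b'}] = {0};
    m.delta[{1, 'b'}] = {2};
    return m;
}
```

With this sketch, `accepts(makeEndsWithAb(), "bbaab")` is true and `accepts(makeEndsWithAb(), "aba")` is false. The program in this experiment never has more than one next state, so it needs only a single `state` variable instead of a set.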

CODE :

//EXP1 Jatin Yadav 2K21/IT/88

#include <iostream>
#include <string>
using namespace std;
int main(){

cout<<"Enter input string"<<endl;


string s;
cin>>s;

int n = s.size();

char input;
int state = 0;
bool flag = false;

for(int i = 1 ; i <= n ; i++){
input = s[i-1];

switch (state)
{
case 0:
if(input == 'J'){
state++;
cout<<"Current state is: "<<state<<endl;
}else{
cout<<"Not Reached"<<endl;
return 0;
}
break;

case 1:
if(input == 'A'){
state++;
cout<<"Current state is: "<<state<<endl;
}else{
cout<<"Not Reached"<<endl;
return 0;
}
break;

case 2:
if(input == 'T'){
state++;
cout<<"Current state is: "<<state<<endl;
}else{
cout<<"Not Reached"<<endl;
return 0;
}
break;

case 3:
if(input == 'I'){
state++;
cout<<"Current state is: "<<state<<endl;
}else{
cout<<"Not Reached"<<endl;
return 0;
}

break;
case 4:
if(input == 'N'){
state++;
flag = true; // state 5 is the final state
}else{
cout<<"Not Reached"<<endl;
return 0;
}
break;

case 5:
cout<<"Not Reached"<<endl;
return 0;

default:
break;
}

}
if (flag){
cout<<"Reached Final state: "<<state<<endl;
} else{
cout<<"Not Reached"<<endl;
}

return 0;
}

OUTPUT :

Input String : “JATIN”

Input String : “JATINN”

Input String : “1JATIN”

Explanation:
The provided C++ code is a simple recognizer for the string "JATIN". It behaves
as a finite automaton with states 0 to 5 in which every state has at most one
outgoing move, so it is a deterministic special case of an NFA. The program
prompts the user for a string and iterates through its characters, moving to the
next state each time the expected character is read.

If the input is exactly "JATIN", the automaton ends in the final state and the
message "Reached Final state" is printed. A mismatch, a shorter input, or extra
characters after the final state all produce "Not Reached."

The code uses a switch statement on the current state to handle the transitions
and to check that the characters arrive in the required order.
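
The same idea can be written without one case per letter. The sketch below is an illustration rather than the experiment's code: `acceptsWord` is a hypothetical helper in which the state is simply the number of characters matched so far, so the final state is reached when the state equals the length of the word.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true only if `input` is exactly `word`.
// The state counts how many characters of `word` have been matched so far.
bool acceptsWord(const std::string& input, const std::string& word) {
    std::size_t state = 0;
    for (char c : input) {
        if (state == word.size() || c != word[state]) {
            return false;  // extra input after the final state, or a mismatch
        }
        ++state;
    }
    return state == word.size();  // accept only if the final state was reached
}
```

For the three inputs listed under OUTPUT, `acceptsWord("JATIN", "JATIN")` is true, while "JATINN" and "1JATIN" are both rejected.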

EXPERIMENT- 2
AIM: Write a program to implement NFA corresponding to ‘for’
keyword

THEORY:
Lexical analysis is the first phase of a compiler; the component that performs it
is also known as the scanner. It converts the input program into a sequence of tokens.
A C program consists of various tokens, and a token is either a keyword, an
identifier, a constant, a string literal, or a symbol.
For example:

1) Keywords:
Examples- for, while, if etc.

2) Identifier
Examples- Variable name, function name etc.

3) Operators:
Examples- '+', '++', '-' etc.

4) Separators:
Examples- ',', ';' etc.

We implement the NFA using 5 states, namely 0, 1, 2, 3 and 4.


Initial State is : State 0
Final State is : State 3 if the input is exactly "for" (size 3), or State 4 if
"for" is followed by a character that is not a letter or digit

CODE :
#include <iostream>
#include <string>
using namespace std;
int main() {

cout << "Enter input string" << endl;


string s;
cin >> s;
int n = s.size();

char input;
int state = 0;
bool flag = true;

for (int i = 1 ; i <= n ; i++) {


input = s[i - 1];

switch (state)
{
case 0:
if (input == 'f') {
state++;
cout << "Current state is: " << state << endl;
}else{
flag = false;
}
break;
case 1:
if (input == 'o') {
state++;
cout << "Current state is: " << state << endl;
}else{
flag = false;
}
break;
case 2:
if (input == 'r') {
state++;
cout << "Current state is: " << state << endl;
}else{
flag = false;
}
break;

case 3:
if ((input >= 'a' && input <= 'z') ||
(input >= 'A' && input <= 'Z') ||
(input >= '0' && input <= '9'))
{
flag = false;
} else {
state++;
cout << "Current state is: " << state << endl;

i--;
}
break;

default:
break;
}
}

// The final check runs once, after the whole input has been read.
if (flag && (state == 4 || state == 3)) {
cout << "Reached Final state: " << state << endl;
cout << "Token issued: <for>" << endl;
} else {
cout << "Not Reached" << endl;
cout << "Final state: " << state << endl;
}

return 0;
}

OUTPUT :

Input String : “for”

Input String : “for(”

Input String : “for > 3”

Input String : “for123”

Explanation:
The provided C++ code is a simple recognizer for the keyword "for" at the start of
an input string, written as a basic finite automaton (a deterministic special case
of an NFA).

The automaton has five states, 0 to 4. States 1, 2 and 3 are reached after reading
'f', 'o' and 'r' respectively, and state 4 is reached when the character after "for"
is not a letter or digit. The program iterates through the input string character by
character, and a token is issued if it finishes in state 3 or state 4 with the flag
still true.

If a character is not the expected one, the flag is set to false, indicating that
the string does not match the keyword. The same happens when a letter or digit
follows "for" (as in "for123"), because the lexeme would then be an identifier
rather than the keyword.

The code outputs the current state at each transition and issues a token if the final state
is reached, denoting the recognition of the "for" keyword. This lexical analyzer serves
as a basic example of string recognition using a finite automaton. However, for a more
comprehensive lexical analysis, tools like Lex or Flex are commonly used in compiler
design.
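
As a rough sketch of the state machine described above, the transitions can also be packed into a function that returns the final state instead of printing it. The name `recognizeFor` and the use of -1 for "not recognized" are assumptions made for this illustration only.

```cpp
#include <cctype>
#include <string>

// Hypothetical pure-function version of the automaton: returns the final state
// (3 or 4) when the "for" keyword is recognized, or -1 otherwise.
int recognizeFor(const std::string& input) {
    const std::string keyword = "for";
    int state = 0;
    for (char c : input) {
        if (state < 3) {
            if (c != keyword[state]) return -1;  // mismatch inside the keyword
            ++state;
        } else if (state == 3) {
            // A letter or digit would make this an identifier such as "for123".
            if (std::isalnum(static_cast<unsigned char>(c))) return -1;
            state = 4;  // delimiter seen; the keyword is complete
        } else {
            break;  // state 4: the rest of the input belongs to later tokens
        }
    }
    return (state == 3 || state == 4) ? state : -1;
}
```

Under these assumptions `recognizeFor("for")` returns 3, `recognizeFor("for(")` returns 4, and `recognizeFor("for123")` returns -1.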

EXPERIMENT-3
AIM: Write a Program to implement a lexical analyzer to issue tokens
for “for(”

THEORY:
The provided code is a C++ implementation of a simple lexical analyzer. It uses a
finite state machine approach to recognize the keyword "for" and the parentheses
'(' and ')' in a given input string. Two indices, lexemeBegin and forward, delimit
the substring currently being examined: forward is advanced one character at a
time until the substring is recognized, and the corresponding token is then
printed.

Lexical Analysis, the initial compiler phase, transforms input code into Tokens. In C
programs, tokens include keywords (e.g., for, while), identifiers (e.g., variable
names), operators (e.g., +, ++), and separators (e.g., ',', ';'). The process categorises
elements to facilitate subsequent compilation phases.
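
The token categories listed above can be illustrated with a small classifier. This is a sketch under stated assumptions, not part of the experiment: `TokenKind` and `classify` are hypothetical names, and the keyword, operator and separator sets contain only the examples mentioned in the text.

```cpp
#include <cctype>
#include <set>
#include <string>

enum class TokenKind { Keyword, Identifier, Operator, Separator, Unknown };

// Hypothetical classifier for a single, already isolated lexeme.
TokenKind classify(const std::string& lexeme) {
    static const std::set<std::string> keywords = {"for", "while", "if"};
    static const std::set<std::string> operators = {"+", "++", "-"};
    static const std::set<std::string> separators = {",", ";"};
    if (lexeme.empty()) return TokenKind::Unknown;
    if (keywords.count(lexeme)) return TokenKind::Keyword;
    if (operators.count(lexeme)) return TokenKind::Operator;
    if (separators.count(lexeme)) return TokenKind::Separator;
    // Identifier: a letter followed by letters or digits.
    if (!std::isalpha(static_cast<unsigned char>(lexeme[0]))) return TokenKind::Unknown;
    for (char c : lexeme) {
        if (!std::isalnum(static_cast<unsigned char>(c))) return TokenKind::Unknown;
    }
    return TokenKind::Identifier;
}
```

Keywords are checked before identifiers because "for" also satisfies the identifier rule; a real scanner has to make the same priority decision.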

CODE :
#include <bits/stdc++.h>
using namespace std;

unordered_map<string, int> mp;


// Returns true when s is "for" followed by a non-alphanumeric character.
// The cases deliberately fall through so that one call checks the whole lexeme.
bool isKeyword(string s)
{
    int state = 0;
    switch (state)
    {
    case 0:
        if (s[state] == 'f')
        {
            state = 1;
        }
        else
        {
            return false;
        }
        // fall through
    case 1:
        if (s[state] == 'o')
        {
            state = 2;
        }
        else
        {
            return false;
        }
        // fall through
    case 2:
        if (s[state] == 'r')
        {
            state = 3;
        }
        else
        {
            return false;
        }
        // fall through
    case 3:
        if (state < (int)s.size() && (s[state] < '0' || s[state] > '9') &&
            (s[state] < 'a' || s[state] > 'z') && (s[state] < 'A' || s[state] > 'Z'))
        {
            cout << "Token Successfully Issued <for>" << endl;
            return true;
        }
        else
        {
            return false;
        }
    default:
        return false;
    }
}

// Returns true when the first character of s is '(' or ')'.
bool isParanthesis(string s)
{
    int state = 0;
    switch (state)
    {
    case 0:
        if (s[state] == '(')
        {
            cout << "Token Successfully Issued <PAR,S-OP>" << endl;
            return true;
        }
        else if (s[state] == ')')
        {
            cout << "Token Successfully Issued <PAR,S-CL>" << endl;
            return true;
        }
        else
        {
            cout << "Input invalid" << endl;
            return false;
        }
    }
    return false;
}

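// Scans s with two indices: lexemeBegin marks the start of the current lexeme
// and forward extends it one character at a time.
void mainProgram(string s)
{
    size_t lexemeBegin = 0, forward = 0;
    // Comparing against s.length() (not s.length() - 1) also covers the last
    // character and avoids unsigned wrap-around on an empty string.
    while (lexemeBegin < s.length())
    {
        size_t start = lexemeBegin; // used to detect a pass that matched nothing

        // Try to match the keyword "for" starting at lexemeBegin.
        while (forward < s.length())
        {
            string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
            if (isKeyword(s1))
            {
                lexemeBegin = forward; // forward is on the delimiter after "for"
                break;
            }
            else
            {
                forward++;
            }
        }
        forward = lexemeBegin;

        // Try to match a parenthesis starting at lexemeBegin.
        while (forward < s.length())
        {
            string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
            if (isParanthesis(s1))
            {
                lexemeBegin = forward + 1;
                break;
            }
            else
            {
                forward++;
            }
        }
        forward = lexemeBegin;

        if (lexemeBegin == start)
        {
            break; // no token matched; stop instead of looping forever
        }
    }
}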

int main()
{

string s;
getline(cin, s);
mainProgram(s);

return 0;
}

OUTPUT :

Input String : “for(”

Input String : “(for(”

Input String : “(for)”

Explanation:

The given C++ code implements a simple lexical analyzer that recognizes keywords
and parentheses in an input string. Let's break down the key components and the flow
of the program:

1. isKeyword() Function:
- This function checks whether a given string `s` represents the keyword "for."
- It uses a state machine with different cases representing the sequence of characters
expected for the keyword "for."
- The function returns `true` if the string matches the keyword, and it prints a
success message. Otherwise, it returns `false`.

2. isParanthesis() Function:
- This function checks whether a given string `s` represents an open parenthesis '(' or
a close parenthesis ')'.
- It uses a simple case to distinguish between '(' and ')'.
- If the input is a valid parenthesis, it prints a corresponding success message and
returns `true`. Otherwise, it returns `false`.

3. mainProgram() Function:
- This function is the main driver for the lexical analysis process.
- It utilizes two pointers, `lexemeBegin` and `forward`, to extract substrings from
the input string.
- The function iterates through the input string and, for each substring, checks
whether it is a keyword using `isKeyword`. If a keyword is found, `lexemeBegin`
is moved to the delimiter that follows the keyword.
- Subsequently, it checks for parentheses using `isParanthesis` in a similar manner;
on a match, `lexemeBegin` is moved past the parenthesis.

4. main() Function:
- In the `main` function, the user inputs a string, and `mainProgram` is called to
perform lexical analysis on the input string.
- The program terminates after processing the input string.

It's important to note that the code currently handles only keywords and parentheses.
A comprehensive lexical analyzer in a compiler would typically handle a broader set
of token types, including identifiers, operators, literals, and separators.
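
The two-pointer flow described above can be condensed into a sketch that returns the tokens instead of printing them. This is an illustration under assumptions, not the experiment's implementation: `scanForAndParens` is a hypothetical name, the token strings are borrowed from the program's messages, and unlike `isKeyword` this sketch also accepts "for" at the very end of the input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scanner for the keyword "for" and the parentheses '(' and ')'.
std::vector<std::string> scanForAndParens(const std::string& s) {
    std::vector<std::string> tokens;
    std::size_t lexemeBegin = 0;
    while (lexemeBegin < s.size()) {
        char c = s[lexemeBegin];
        if (c == '(') {
            tokens.push_back("<PAR,S-OP>");
            ++lexemeBegin;
        } else if (c == ')') {
            tokens.push_back("<PAR,S-CL>");
            ++lexemeBegin;
        } else if (s.compare(lexemeBegin, 3, "for") == 0) {
            std::size_t forward = lexemeBegin + 3;  // first character after "for"
            if (forward < s.size() &&
                std::isalnum(static_cast<unsigned char>(s[forward]))) {
                break;  // e.g. "for1": not the keyword; this sketch stops here
            }
            tokens.push_back("<for>");
            lexemeBegin = forward;
        } else {
            break;  // unrecognized character; stop rather than loop forever
        }
    }
    return tokens;
}
```

For the input "(for)" this sketch yields `<PAR,S-OP>`, `<for>`, `<PAR,S-CL>`, in that order. Looking at the first character to choose the token class avoids re-testing every growing substring against every recognizer.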

EXPERIMENT-4
AIM: Write a program to implement all the categories of
the lexical analysis

THEORY:

Lexical Analysis:
● Lexical analysis, also known as scanning or tokenization, marks the
initial phase of compilation.
● Its primary task is to divide the source code into a sequence of tokens.

Tokens:
● Tokens are the smallest units of meaning in a programming language.
● They serve as fundamental building blocks that represent various
elements in a program.

Types of Tokens:
● Tokens include keywords, identifiers, operators, literals, and punctuation.
● These categories cover the essential components that form the structure of
a program.

Role of Lexical Analyzer:


● A lexical analyzer, often implemented using tools like Lex or Flex,
enforces the lexical rules.
● It utilizes regular expressions and finite automata to define the language's
lexical structure.

Output and Parser Interaction:


● The output of the lexical analysis phase is a stream of tokens.
● This token stream is then passed to the parser for further examination of
the program's syntactic structure.
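
To make the idea of a token stream concrete, the sketch below collects tokens into a vector that a parser could consume. It is a simplified illustration, not the program of this experiment: `Token`, `tokenize` and the kind names ("keyword", "id", "num", "par", "punc", "relop", "arith", "error") are assumptions, and only the keyword "for" is recognized.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token record: a category name plus the matched text.
struct Token {
    std::string kind;
    std::string lexeme;
};

std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {  // whitespace only separates tokens
            ++i;
            continue;
        }
        std::size_t begin = i;
        if (std::isalpha(c)) {  // identifier or keyword
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i]))) ++i;
            std::string lexeme = src.substr(begin, i - begin);
            std::string kind = (lexeme == "for") ? "keyword" : "id";
            out.push_back({kind, lexeme});
        } else if (std::isdigit(c)) {  // unsigned integer literal
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({"num", src.substr(begin, i - begin)});
        } else {  // single-character tokens
            std::string one(1, src[i]);
            ++i;
            if (one == "(" || one == ")") out.push_back({"par", one});
            else if (one == ";" || one == ":") out.push_back({"punc", one});
            else if (one == "=" || one == "<" || one == ">") out.push_back({"relop", one});
            else if (one == "+" || one == "-" || one == "*" || one == "/") out.push_back({"arith", one});
            else out.push_back({"error", one});
        }
    }
    return out;
}
```

For example, `tokenize("for(i=0;i<10;)")` produces eleven tokens, starting with the keyword `for` and ending with the closing parenthesis. Returning the tokens rather than printing them is what lets a later phase, such as a parser, consume the stream.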

OVERVIEW:

CODE:
#include <bits/stdc++.h>
using namespace std;

// ct is used to keep count of tokens issued for variables


int ct = 1;

unordered_map<string, int> mp;

// To check if the given character is a letter or not


bool isLetter(char s){
return ((s >= 'a' && s <= 'z') || (s >= 'A' && s <= 'Z'));
}

// To check if the given character is a digit or not

bool isDigit(char s){
return (s >= '0' && s <= '9');
}
// To check if the given string is a keyword or not
bool isKeyword(string s){
int state = 0;
switch (state){
case 0:
if (s[state] == 'f')
{
state = 1;
}
else
{
return false;
}
case 1:
if (s[state] == 'o')
{
state = 2;
}
else
{
return false;
}
case 2:
if (s[state] == 'r'){
state = 3;
}
else{
return false;
}
case 3:
if (state < s.size() && (!isLetter(s[state]) && !isDigit(s[state]))){
cout << "Token Issued Successfully <for>" << endl;
return true;
}
else{
return false;
}
default:
return false;
}
cout << "<for>" << endl;
return true;
}

// To check if the given string is a bracket or not


bool isParenthesis(string s){
int state = 0;
switch (state){
case 0:
if (s[state] == '('){
cout << "Token Issued Successfully <PAR,S-OP>" << endl;
return true;
}
else if (s[state] == ')'){
cout << "Token Issued Successfully <PAR,S-CL>" << endl;
return true;
}
else{
return false;
}
}
return false;
}

bool isWhiteSpace(string s){


int state = 0;
switch (state){
case 0:
if (s[state] == ' '){
state = 1;
}
else{
return false;
}
case 1:
while (s[state] == ' '){
state++;
}
if (s[state] < '0' || s[state] > '9'){
cout << "<>" << endl;
return true;
}
}
return false;
}
bool isPunctuation(string s){
int state = 0;
switch (state){
case 0:
if (s[state] == ';'){
cout << "Token Issued Successfully <PUNC,Semi>" << endl;
return true;
}
else if (s[state] == ':'){
cout << "Token Issued Successfully <PUNC,Col>" << endl;
return true;
}
else{
return false;
}
}
return false;
}

bool isRelop(string s){


int state = 0;
switch (state){
case 0:
if (s[state] == '='){
cout << "Token Issued Successfully <relop,EQ>" << endl;
return true;
}
else if (s[state] == '<'){
cout << "Token Issued Successfully <relop,LT>" << endl;
return true;
}
else if (s[state] == '>'){
cout << "Token Issued Successfully <relop,GT>" << endl;
return true;
}
else{
return false;
}
}
return false;
}

bool isArith(string s)
{
int state = 0;
switch (state)
{
case 0:
if (s[state] == '+')
{
cout << "Token Issued Successfully <arith,PL>" << endl;
return true;
}
else if (s[state] == '-')
{
cout << "Token Issued Successfully <arith,MN>" << endl;
return true;
}
else if (s[state] == '*')
{
cout << "Token Issued Successfully <arith,ML>" << endl;
return true;
}
else if (s[state] == '/')
{
cout << "Token Issued Successfully <arith,DV>" << endl;
return true;
}
else
{
return false;
}
}
return false;
}

bool isNumber(string s)
{
int state = 0;
int val = 0;
switch (state)
{
case 0:
if (s[state] >= '0' && s[state] <= '9')
{
val = val * 10 + (s[state] - '0');
state = 1;
}
else
{
return false;
}

case 1:
while (s[state] >= '0' && s[state] <= '9')
{

val = val * 10 + (s[state] - '0');


state++;
}
if (s[state] < '0' || s[state] > '9')
{
cout << "Token Issued Successfully <num," << val << ">" << endl;
return true;
}
}
cout << "Token Issued Successfully <num," << val << ">" << endl;
return true;
}

bool isVariable(string s)
{
int state = 0;
switch (state)
{
case 0:
if ((s[state] >= 'a' && s[state] <= 'z') || (s[state] >= 'A' && s[state] <=
'Z'))
{
state = 1;
}
else
{
return false;
}
case 1:
while ((s[state] >= 'a' && s[state] <= 'z') || (s[state] >= 'A' && s[state]
<= 'Z') || (s[state] >= '0' && s[state] <= '9'))
{
state++;
continue;
}
if ((s[state] < '0' || s[state] > '9') && (s[state] < 'a' || s[state] >
'z') && ((s[state] < 'A' || s[state] > 'Z')))
{
string ss = s.substr(0, s.size() - 1);
if (mp.find(ss) == mp.end())
{

cout << "Token Issued Successfully <id," << ct << ">" << endl;
mp[ss] = ct;
ct++;
}
else
{
cout << "Token Issued Successfully <id," << mp[ss] << ">" << endl;
}
return true;
}
else
{
return false;
}
}
return false;
}

int main()
{
string s;
getline(cin, s);

int lexemeBegin = 0, forward = 0;


// Written without s.length() - 1, which wraps around for an empty string.
while (lexemeBegin + 1 < s.length())
{
while (forward < s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isKeyword(s1))
{
lexemeBegin = forward;
break;
}
else
{
forward++;
}
}
forward = lexemeBegin;
while (forward < s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isParenthesis(s1))
{
lexemeBegin = forward + 1;
break;
}
else
{
forward++;
}
}
forward = lexemeBegin;
while (forward < s.length())
{
if ((s[forward] < '0' || s[forward] > '9') && forward < s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isNumber(s1))
{
lexemeBegin = forward;
break;
}
else
{
break;
}
}
else
{
forward++;
}
}
forward = lexemeBegin;
while (forward < s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isPunctuation(s1))
{
lexemeBegin = forward + 1;
break;
}
else
{
forward++;
}
}
forward = lexemeBegin;
while (forward < s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isArith(s1))
{
lexemeBegin = forward + 1;
break;
}
else
{
forward++;
}
}
forward = lexemeBegin;

while (forward < s.length())


{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isRelop(s1))
{
lexemeBegin = forward + 1;
break;
}
else
{
forward++;
}
}
forward = lexemeBegin;
while (forward < s.length())
{
if ((s[forward] != ' ') && forward < s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isWhiteSpace(s1))
{
lexemeBegin = forward;
break;
}
else
{
break;
}
}
else
{
forward++;
}
}
forward = lexemeBegin;
while (forward < s.length())
{
if (((s[forward] < '0' || s[forward] > '9') && (s[forward] < 'a' ||
s[forward] > 'z') && (s[forward] < 'A' || s[forward] > 'Z')) && forward <
s.length())
{
string s1 = s.substr(lexemeBegin, forward - lexemeBegin + 1);
if (isVariable(s1))
{
lexemeBegin = forward;
break;
}
else
{
break;
}
}
else
{

forward++;
}
}
forward = lexemeBegin;
if (lexemeBegin < s.length() && s[lexemeBegin] != '=' && s[lexemeBegin] != ' '
&& s[lexemeBegin] != '+' && s[lexemeBegin] != '-' && s[lexemeBegin] != '<' &&
s[lexemeBegin] != '>' && s[lexemeBegin] != '$' && s[lexemeBegin] != ';' &&
s[lexemeBegin] != ':' && s[lexemeBegin] != '(' && s[lexemeBegin] != ')' &&
(s[lexemeBegin] < '0' || s[lexemeBegin] > '9') && (s[lexemeBegin] < 'a' ||
s[lexemeBegin] > 'z') && ((s[lexemeBegin] < 'A' || s[lexemeBegin] > 'Z')))
{
cout << "Error " << s[lexemeBegin] << " is not identified so token not issued" << endl;
return 0;
}
if (forward == 0)
{
break;
}
}

return 0;
}

OUTPUT :
