Compiler LAB

The document is a lab report submitted by Kushal Aryal to his professor, Mr. Prithvi Raj Paneru. It details four experiments conducted as part of a Compiler Design and Construction course. The experiments involve writing C programs to check the validity of identifiers and comments, perform tokenization, and recognize specific strings. For each experiment, the student outlines objectives, provides relevant theory, discusses implementation details and results, and concludes with what was learned.

Uploaded by

Safal Neupane

Prithvi Narayan Campus

Institute of Science and Technology

Tribhuvan University

Lab Report
On
Compiler Design and Construction (CSC-365)

Submitted to:
Mr. Prithvi Raj Paneru

Department of Computer Science and Information Technology


Prithvi Narayan Campus, Pokhara

Submitted by:
Kushal Aryal
25013/076

Kushal Aryal 25013


INDEX
S.N.    Practical    Performed on    Submitted on    Remarks
1.

2.

3.

4.

5.

6.

7.

8.

9.

10.

11.

12.


LAB-1: Lexical Analysis
EXPERIMENT-1: To test whether the given identifier is valid or not using
C-program.
1. OBJECTIVES:
• To check whether a given identifier is valid.
• To write a valid C program that implements the check.
2. THEORY:
C-identifiers:
C identifiers are the names given to entities such as variables, functions and structures.
Identifiers must be unique within their scope. They give each entity a unique name so that
it can be identified during the execution of the program. A valid identifier can contain
letters (both uppercase and lowercase), digits and underscores. The first character of an
identifier must be either a letter or an underscore. Keywords such as int and while cannot
be used as identifiers. The language sets no upper limit on identifier length, but some
compilers treat only the first 31 characters as significant, so longer identifiers may
cause problems.
Valid identifiers are number, money, test_, _1number, money2, etc.
Invalid identifiers are 123, 123number, 123_, -money, etc.
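
The rules above can be sketched as a single function. This is a minimal sketch and not the report's own implementation: the names is_keyword and is_valid_identifier are hypothetical, and the keyword list is assumed to be the 32 keywords of C89. Unlike the program in the implementation section, it also applies the keyword rule.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if s is a reserved word (assumed list: the 32 C89 keywords). */
static int is_keyword(const char *s)
{
    static const char *const keywords[] = {
        "auto", "break", "case", "char", "const", "continue", "default",
        "do", "double", "else", "enum", "extern", "float", "for", "goto",
        "if", "int", "long", "register", "return", "short", "signed",
        "sizeof", "static", "struct", "switch", "typedef", "union",
        "unsigned", "void", "volatile", "while"
    };
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(keywords[i], s) == 0)
            return 1;
    }
    return 0;
}

/* Returns 1 if s satisfies the three identifier rules, otherwise 0. */
int is_valid_identifier(const char *s)
{
    size_t i;

    assert(s != NULL); /* precondition: the caller passes a real string */

    /* Rule 1: the first character is a letter or an underscore.
       This also rejects the empty string. */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;

    /* Rule 2: every later character is a letter, digit or underscore. */
    for (i = 1; s[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;
    }

    /* Rule 3: a keyword cannot be used as an identifier. */
    return !is_keyword(s);
}
```

Calling is_valid_identifier("_1number") returns 1, while "123number", "-money" and "while" all return 0.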

3. IMPLEMENTATION:
Source code:
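#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(void)
{
    char a[64];
    int flag = 0, i = 1; /* flag stays 0 unless the first character is valid */

    printf("\n Enter the identifier\t");
    /* fgets replaces gets, which cannot limit the length of the input */
    if (fgets(a, sizeof a, stdin) == NULL)
        return 1;
    a[strcspn(a, "\n")] = '\0'; /* drop the newline that fgets keeps */

    /* The first character must be a letter or an underscore */
    if (isalpha((unsigned char)a[0]) || a[0] == '_')
        flag = 1;

    /* Every remaining character must be a letter, digit or underscore */
    while (flag == 1 && a[i] != '\0')
    {
        if (!isalnum((unsigned char)a[i]) && a[i] != '_')
            flag = 0;
        i++;
    }

    /* Note: keywords such as int are not rejected by this check */
    if (flag == 1)
        printf("\n Valid identifier");
    else
        printf("\n Not a valid identifier");
    return 0;
}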
Output:

4. RESULT & DISCUSSION:


The program above checks whether a given identifier is valid. Kusal and _47Kusal were
reported as valid identifiers, while 1234567 and 1234kusal were reported as invalid
identifiers.

5. CONCLUSION:
Hence, we can conclude that the validity of identifiers was checked using a C program.


LAB-1: Lexical Analysis
EXPERIMENT-2: To test whether the given string is a valid comment or not
using a C program.
1. OBJECTIVES:
• To check whether a given string is a valid comment.
• To write a valid C program that implements the check.
2. THEORY:
C-comments:
Comments explain code and make it more readable. They can also be used to keep a piece
of code from being executed while testing alternative code. Comments can be single-line
or multi-line.
Single-line comments: Single-line comments start with two forward slashes (//). Any text
between // and the end of the line is ignored by the compiler (will not be executed).
Example:
// Single-line comments
Multi-line comments: Multi-line comments start with /* and end with */. Any text
between /* and */ is ignored by the compiler. Multi-line comments can span
multiple lines and are used for longer comments or for commenting out blocks of code.
Example:
/* This is a multi-line comment
This is a multi-line comment
This is a multi-line comment */
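
The two comment forms above can be told apart with a short classifier. This is only a sketch under stated assumptions: the enum comment_kind and the function classify_comment are hypothetical names, the whole input string is assumed to be a single candidate comment, and a multi-line comment is required to end exactly at the closing delimiter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum comment_kind { NOT_COMMENT, SINGLE_LINE, MULTI_LINE };

/* Classifies s as a single-line comment, a multi-line comment or neither. */
enum comment_kind classify_comment(const char *s)
{
    size_t len;

    assert(s != NULL); /* precondition: the caller passes a real string */
    len = strlen(s);

    /* Single-line form: the text starts with two forward slashes. */
    if (len >= 2 && s[0] == '/' && s[1] == '/')
        return SINGLE_LINE;

    /* Multi-line form: slash-star at the start and star-slash at the end.
       The two delimiters must not overlap, so at least four characters
       are needed. */
    if (len >= 4 && s[0] == '/' && s[1] == '*' &&
        s[len - 2] == '*' && s[len - 1] == '/')
        return MULTI_LINE;

    return NOT_COMMENT;
}
```

Requiring the closing delimiter at the very end is a stricter test than searching for it anywhere in the string. It is a design choice for this sketch, not a rule stated in the report.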

3. IMPLEMENTATION:
Source code:
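#include <stdio.h>
#include <string.h>

int main(void)
{
    char com[100];
    size_t i, len;
    int a = 0; /* set to 1 once the closing delimiter is found */

    printf("\n Enter Text : ");
    /* fgets replaces gets, which cannot limit the length of the input */
    if (fgets(com, sizeof com, stdin) == NULL)
        return 1;
    com[strcspn(com, "\n")] = '\0'; /* drop the newline that fgets keeps */
    len = strlen(com);

    if (com[0] == '/')
    {
        if (com[1] == '/')
            printf("\n It is a Comment.");
        else if (com[1] == '*')
        {
            /* Search only within the text that was actually entered */
            for (i = 2; i + 1 < len; i++)
            {
                if (com[i] == '*' && com[i + 1] == '/')
                {
                    printf("\n It is a Comment.");
                    a = 1;
                    break;
                }
            }
            if (a == 0)
                printf("\n It is Not a Comment.");
        }
        else
            printf("\n It is Not a Comment.");
    }
    else
        printf("\n It is Not a Comment.");
    return 0;
}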
Output:

4. RESULT & DISCUSSION:


The program above checks whether a given string is a valid comment. Strings starting
with “//” and strings enclosed in “/* */” were reported as valid comments. Strings that
start with “/*” but contain no closing “*/”, and strings that start with neither “//”
nor “/*”, were reported as not being comments.

5. CONCLUSION:
Hence, we can conclude that the validity of comments was checked using a C program.


LAB-1: Lexical Analysis
EXPERIMENT-3: To implement tokenization in C program.
1. OBJECTIVES:
• To tokenize the code of a C program.
• To write a valid C program that implements the tokenizer.
2. THEORY:
Tokenization in C:
In the context of programming, tokenization refers to the process of breaking down a
sequence of characters or a string of text into smaller units known as tokens. In the C
programming language, tokenization is a fundamental step in the process of parsing and
analyzing source code.
In C, tokens are the smallest meaningful units that make up a program. They can represent
keywords, identifiers, constants, operators, punctuation symbols, and other elements of the
language. During tokenization, the compiler's lexical analyzer (scanner) reads the source
code and identifies these tokens, which are then passed to the parser for syntax analysis.
Each token represents a specific element or instruction in the C program, and the order and
combination of tokens determine the program's structure and behavior. Tokenization is a
crucial step in the compilation process as it enables the compiler to understand and analyze
the code's syntax and semantics.
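
The process described above can be sketched on a string held in memory. This is a minimal sketch and not the report's program: the names token_kind, token, is_keyword and tokenize are hypothetical, the keyword list is a deliberately small subset, and only single-character operators and unsigned integer constants are assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum token_kind {
    TK_KEYWORD, TK_IDENTIFIER, TK_CONSTANT, TK_OPERATOR, TK_PUNCTUATION
};

struct token {
    enum token_kind kind;
    char text[32];
};

/* Small illustrative keyword subset, not the full C keyword list. */
static int is_keyword(const char *s)
{
    static const char *const kw[] = {
        "int", "char", "float", "if", "else", "while", "for", "return"
    };
    size_t i;

    for (i = 0; i < sizeof kw / sizeof kw[0]; i++) {
        if (strcmp(kw[i], s) == 0)
            return 1;
    }
    return 0;
}

/* Splits src into at most max tokens and returns how many were stored. */
size_t tokenize(const char *src, struct token *out, size_t max)
{
    size_t n = 0;

    assert(src != NULL && out != NULL);
    while (*src != '\0' && n < max) {
        unsigned char c = (unsigned char)*src;
        size_t len = 0, copy;

        if (isspace(c)) { /* whitespace only separates tokens */
            src++;
            continue;
        }
        if (isalpha(c) || c == '_') {
            /* A word: letters, digits and underscores. */
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
            out[n].kind = TK_IDENTIFIER; /* may become a keyword below */
        } else if (isdigit(c)) {
            /* An unsigned integer constant. */
            while (isdigit((unsigned char)src[len]))
                len++;
            out[n].kind = TK_CONSTANT;
        } else {
            /* Any other single character. */
            len = 1;
            out[n].kind = strchr("+-*/%=", c) ? TK_OPERATOR : TK_PUNCTUATION;
        }

        /* Store the token text, truncated if it does not fit. */
        copy = len < sizeof out[n].text - 1 ? len : sizeof out[n].text - 1;
        memcpy(out[n].text, src, copy);
        out[n].text[copy] = '\0';

        if (out[n].kind == TK_IDENTIFIER && is_keyword(out[n].text))
            out[n].kind = TK_KEYWORD;

        src += len;
        n++;
    }
    return n;
}
```

For the input "int a = b + 10;" the sketch stores seven tokens: the keyword int, the identifiers a and b, the operators = and +, the constant 10, and the punctuation symbol ;.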

3. IMPLEMENTATION:
Text File:

Source code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Returns 1 if b is one of the 32 C89 keywords, otherwise 0 */
int iskeyword(const char b[])
{
    static const char keywords[32][10] = {"auto", "break", "case", "char", "const",
        "continue", "default", "do", "double", "else", "enum", "extern", "float", "for",
        "goto", "if", "int", "long", "register", "return", "short", "signed", "sizeof",
        "static", "struct", "switch", "typedef", "union", "unsigned", "void", "volatile",
        "while"};
    int i;
    for (i = 0; i < 32; i++)
    {
        if (strcmp(keywords[i], b) == 0)
            return 1;
    }
    return 0;
}

/* Prints the class of the word collected in buffer */
void classify(const char buffer[])
{
    if (iskeyword(buffer) == 1)
        printf("%s is a keyword\n", buffer);
    else if (isdigit((unsigned char)buffer[0]))
        printf("%s is a constant\n", buffer);
    else
        printf("%s is identifier\n", buffer);
}

int main(void)
{
    char buffer[32], operators[] = "+-*/%=";
    FILE *fp;
    int ch; /* int, not char, so that EOF can be told apart from data */
    size_t j = 0;

    fp = fopen("Test.txt", "r");
    if (fp == NULL)
    {
        printf("\n file cannot be opened");
        exit(1);
    }
    while ((ch = fgetc(fp)) != EOF)
    {
        if (isalnum(ch) || ch == '_')
        {
            if (j < sizeof buffer - 1) /* ignore characters that do not fit */
                buffer[j++] = (char)ch;
            continue;
        }
        /* Any other character ends the word collected so far */
        if (j != 0)
        {
            buffer[j] = '\0';
            j = 0;
            classify(buffer);
        }
        if (ch != '\0' && strchr(operators, ch) != NULL)
            printf("%c is operator \n", ch);
    }
    if (j != 0) /* a word that runs up to the end of the file */
    {
        buffer[j] = '\0';
        classify(buffer);
    }
    fclose(fp);
    return 0;
}
Output:

4. RESULT & DISCUSSION:


The program above tokenizes C code. The input was read from the file Test.txt and the
code in it was successfully tokenized.

5. CONCLUSION:
Hence, we can conclude that the given code was tokenized using a C program.


LAB-1: Lexical Analysis
EXPERIMENT-4: Write a program in C to recognize strings matching the patterns
a*, a*b+ and abb.
1. OBJECTIVES:
• To write a valid C program that recognizes strings matching the patterns a*, a*b+ and abb.
2. THEORY:
String recognizer in Lexical analysis:
In lexical analysis, a string recognizer is a component that identifies and recognizes string
literals in the source code. String literals are sequences of characters enclosed within double
quotes (e.g., "Hello, world!").
The string recognizer, typically implemented as part of a lexical analyzer (lexer) or scanner,
scans the input source code character by character and detects string literals based on
specific rules or patterns. It ensures that the recognized strings are properly formatted and
extracts them as tokens for further processing.
In this experiment, the recognizer checks whether a whole input string matches one of the
regular expressions a*, a*b+ or abb. Such a recognizer is typically implemented as a
deterministic finite automaton (DFA): it reads the input one character at a time, moves
from state to state according to a transition rule, and accepts the string only if it
stops in an accepting state.
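
One common way to build such a recognizer for the patterns named in this experiment (a*, a*b+ and abb) is a table-driven finite automaton. The sketch below is one possible form and not the report's own code: the names rule and recognize are hypothetical, the state numbering is an assumption, and two choices are made here that the report does not state. The empty string is accepted under a*, and the string abb is reported as abb even though it also matches a*b+.

```c
#include <assert.h>
#include <stddef.h>

enum rule { RULE_NONE, RULE_A_STAR, RULE_A_STAR_B_PLUS, RULE_ABB };

/* Returns the rule that the whole string s matches, or RULE_NONE. */
enum rule recognize(const char *s)
{
    /* next_state[state][column]: column 0 is 'a', column 1 is 'b',
       column 2 is any other character. State 6 is the dead state. */
    static const int next_state[7][3] = {
        {1, 2, 6}, /* 0: nothing read yet          */
        {3, 4, 6}, /* 1: read "a"                  */
        {6, 2, 6}, /* 2: read a*b+ (other than abb) */
        {3, 2, 6}, /* 3: read two or more a        */
        {6, 5, 6}, /* 4: read "ab"                 */
        {6, 2, 6}, /* 5: read "abb"                */
        {6, 6, 6}  /* 6: dead, no pattern can match */
    };
    /* Rule reported when the input ends in each state. */
    static const enum rule accepts[7] = {
        RULE_A_STAR, RULE_A_STAR, RULE_A_STAR_B_PLUS, RULE_A_STAR,
        RULE_A_STAR_B_PLUS, RULE_ABB, RULE_NONE
    };
    int state = 0;
    size_t i;

    assert(s != NULL); /* precondition: the caller passes a real string */
    for (i = 0; s[i] != '\0'; i++) {
        int column = (s[i] == 'a') ? 0 : (s[i] == 'b') ? 1 : 2;
        state = next_state[state][column];
    }
    return accepts[state];
}
```

Keeping the transitions in a table means a new pattern changes only the data, not the loop that reads the input.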

3. IMPLEMENTATION:
Source code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[100], c;
    int state = 0, i = 0;

    printf("\n Enter a string: ");
    /* fgets replaces gets, which cannot limit the length of the input */
    if (fgets(s, sizeof s, stdin) == NULL)
        return 1;
    s[strcspn(s, "\n")] = '\0'; /* drop the newline that fgets keeps */

    /* State 6 is the dead state, so scanning stops once it is reached */
    while (s[i] != '\0' && state != 6)
    {
        c = s[i++];
        switch (state)
        {
        case 0: /* nothing read yet */
            if (c == 'a')
                state = 1;
            else if (c == 'b')
                state = 2;
            else
                state = 6;
            break;
        case 1: /* read "a" */
            if (c == 'a')
                state = 3;
            else if (c == 'b')
                state = 4;
            else
                state = 6;
            break;
        case 2: /* read a*b+ */
            if (c == 'b')
                state = 2;
            else
                state = 6;
            break;
        case 3: /* read two or more a */
            if (c == 'a')
                state = 3;
            else if (c == 'b')
                state = 2;
            else
                state = 6;
            break;
        case 4: /* read "ab" */
            if (c == 'b')
                state = 5;
            else
                state = 6;
            break;
        case 5: /* read "abb" */
            if (c == 'b')
                state = 2;
            else
                state = 6;
            break;
        }
    }
    /* State 0 is accepting because the empty string matches a* */
    if (state == 0 || state == 1 || state == 3)
        printf("\n %s is accepted under rule a*", s);
    else if (state == 2 || state == 4)
        printf("\n %s is accepted under rule a*b+", s);
    else if (state == 5)
        printf("\n %s is accepted under rule abb", s);
    else
        printf("\n %s is not recognized", s);
    return 0;
}
Output:

4. RESULT & DISCUSSION:


The program above recognizes strings matching the patterns a*, a*b+ and abb. The input
strings aabbaa, aaaaaaaaa and aaaabbbbbbbbbb were tested. By the transitions in the source
code, aaaaaaaaa is accepted under a*, aaaabbbbbbbbbb is accepted under a*b+, and aabbaa
is not recognized because an a follows a b.

5. CONCLUSION:
Hence, we can conclude that the given strings were recognized using a C program.
