CC Lab 1-4

The document outlines practical experiments for a Compiler Construction course, focusing on lexical analysis, token counting, left recursion removal, and LL(1) grammar checking using C programming. Each experiment includes an aim, theoretical background, and code implementation to demonstrate the concepts. The experiments facilitate understanding of compiler design principles and the importance of tokenization and grammar analysis.

Uploaded by

walterblancolovesmeth

COMPILER CONSTRUCTION

CSE304
Practical file

AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY

AMITY UNIVERSITY, UTTAR PRADESH

Submitted by:
Krish Dogra
A2305222573
6CSE-9X

Submitted to:
Dr Roshan Lal
Experiment 1
Aim: Write a program to identify keywords, constants, special characters and
identifiers from a given input string.
Language Used: C
Theory: Lexical analysis is a fundamental part of compiler design, where a
given input string is processed to identify meaningful components called tokens.
Tokens include keywords, constants, special characters, and identifiers, each
playing a crucial role in programming languages.
Keywords are predefined reserved words such as int, return, and if, which have
specific meanings in the C programming language. Constants refer to fixed
numeric values like 100 or 3.14, which do not change during execution.
Identifiers are user-defined names for variables, functions, and arrays, following
the language's naming conventions. Special characters include symbols like +,
-, *, {, }, which define operations and control structures in the code.
The program reads an input string, processes each character, and groups
alphanumeric runs into tokens. It checks whether each run matches a keyword or
consists only of digits (a constant), treats any other run as an identifier,
and reports special symbols individually. Because '.' is a special symbol, a
value such as 3.14 is reported as two integer constants separated by a special
character.
Lexical analysis is the first phase of compilation. It catches malformed
tokens early and hands a clean token stream to the later phases, such as
syntax analysis, semantic analysis, and optimization.
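The theory lists 3.14 as a constant, while a digits-only check accepts only unsigned integers. The following is a minimal sketch of a wider check, assuming a constant is a run of digits with at most one decimal point (no sign, exponent, or suffix). The name isNumericConstant is hypothetical and is not part of the program below; the tokenizer would also have to keep '.' inside a numeric run for this check to see 3.14 as one word.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: true for digits with at most one '.',
   provided at least one digit is present. */
bool isNumericConstant(const char *word) {
    bool seenDigit = false;
    bool seenDot = false;
    for (int i = 0; word[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)word[i];
        if (isdigit(ch)) {
            seenDigit = true;
        } else if (ch == '.' && !seenDot) {
            seenDot = true;      /* first decimal point is allowed */
        } else {
            return false;        /* second '.', letter, or other symbol */
        }
    }
    return seenDigit;            /* rejects "" and "." */
}
```

Under these assumptions, isNumericConstant("100") and isNumericConstant("3.14") are true, while "1.2.3", "x1", and "." are rejected.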

Code:
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdbool.h>

/* Reserved words of C (C11). */
const char *keywords[] = {"int", "float", "char", "double", "return", "if",
    "else", "for", "while", "do", "switch", "case", "break", "continue", "void",
    "static", "struct", "typedef", "const", "sizeof", "volatile", "enum", "union",
    "default", "extern", "goto", "register", "short", "signed", "unsigned", "long",
    "auto", "inline", "restrict", "_Alignas", "_Alignof", "_Atomic", "_Bool",
    "_Complex", "_Generic", "_Imaginary", "_Noreturn", "_Static_assert",
    "_Thread_local"};
int keyword_count = sizeof(keywords) / sizeof(keywords[0]);

/* Returns true if word matches one of the reserved words. */
bool isKeyword(const char *word) {
    for (int i = 0; i < keyword_count; i++) {
        if (strcmp(word, keywords[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Returns true if word consists only of decimal digits. */
bool isNumber(const char *word) {
    for (int i = 0; word[i] != '\0'; i++) {
        /* The cast avoids undefined behaviour for negative char values. */
        if (!isdigit((unsigned char)word[i])) {
            return false;
        }
    }
    return true;
}

bool isSpecialSymbol(char ch) {
    const char special_symbols[] = "!@#$%^&*()-+=|<>?/{}[]:;.,'\\";
    for (int i = 0; special_symbols[i] != '\0'; i++) {
        if (ch == special_symbols[i]) {
            return true;
        }
    }
    return false;
}

void identifyTokens(const char *str) {
    char token[50];
    int index = 0;
    /* The loop also visits the terminating '\0' so the last word is flushed. */
    for (int i = 0; ; i++) {
        unsigned char ch = (unsigned char)str[i];
        if (isalnum(ch) || ch == '_') {
            /* Characters beyond the buffer size are dropped, not written. */
            if (index < (int)sizeof(token) - 1) {
                token[index++] = (char)ch;
            }
        } else {
            if (index > 0) {
                token[index] = '\0';
                if (isKeyword(token)) { printf("Keyword: %s\n", token); }
                else if (isNumber(token)) { printf("Constant: %s\n", token); }
                else { printf("Identifier: %s\n", token); }
                index = 0;
            }
            if (ch == '\0') {
                break;
            }
            if (isSpecialSymbol((char)ch)) {
                printf("Special Character: %c\n", ch);
            }
        }
    }
}

int main(void) {
    char input[100];
    printf("Enter a C statement: ");
    if (fgets(input, sizeof(input), stdin) == NULL) {
        return 1;
    }
    identifyTokens(input);
    return 0;
}
Output:
Experiment 2
Aim: Write a program to count the total number of tokens in the source code.
Language Used: C
Theory: A token is the smallest meaningful unit in a programming language,
including keywords, identifiers, operators, constants, and special symbols.
Tokenization is the process of breaking source code into tokens, which is a
fundamental step in lexical analysis for compilers. This program reads one line
of C source code from standard input, scans it character by character, and
counts the tokens. A run of letters, digits, and underscores counts as one
token, and each special character (punctuation or operator symbol) counts as a
token of its own, so a two-character operator such as == is counted twice.
Whitespace only separates tokens. Tokenization is widely used in compiler
design, syntax highlighting, and static code analysis. By implementing this
process, we can better understand how source code is structured and interpreted.
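The standard library function strtok() is another way to split text on delimiters. The sketch below is an illustration only, assuming that just whitespace separates tokens; the helper name countWords is hypothetical. It is coarser than a character scan, because punctuation attached to a word is not split off.

```c
#include <string.h>

/* Hypothetical helper: counts whitespace-separated words in line.
   strtok() writes '\0' bytes into its argument, so line must be a
   modifiable buffer, not a string literal. */
int countWords(char *line) {
    const char *delims = " \t\r\n";
    int count = 0;
    for (char *tok = strtok(line, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        count++;
    }
    return count;
}
```

For the line `int a = b + 10;` this returns 6, because `10;` stays one word, whereas a scan that counts each special character separately reports 7.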
Code:
#include <stdio.h>
#include <ctype.h>
#define MAX_LENGTH 100

/* Returns 1 if ch is a punctuation or operator character counted as a token. */
int isSpecialCharacter(char ch) {
    const char specialChars[] = "(){}[];,=+-*/<>!&|\"'";
    for (int i = 0; specialChars[i] != '\0'; i++) {
        if (ch == specialChars[i]) {
            return 1;
        }
    }
    return 0;
}

int main(void) {
    char sourceCode[MAX_LENGTH];
    printf("Enter the source code: \n");
    if (fgets(sourceCode, MAX_LENGTH, stdin) == NULL) {
        return 1;
    }
    int tokenCount = 0;
    int inWord = 0; /* 1 while inside a run of letters, digits, or '_' */

    for (int i = 0; sourceCode[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)sourceCode[i];
        if (isalnum(ch) || ch == '_') {
            inWord = 1;
        } else {
            if (inWord) {
                tokenCount++; /* the word just ended */
                inWord = 0;
            }
            if (isSpecialCharacter(sourceCode[i])) {
                tokenCount++;
            }
        }
    }
    if (inWord) {
        tokenCount++; /* input ended in the middle of a word */
    }
    printf("Total number of tokens: %d\n", tokenCount);
    return 0;
}
Output:
Experiment 3
Aim: Write a program to remove left recursion from given grammar.
Language Used: C
Theory: Left recursion occurs when a non-terminal in a grammar can
eventually derive itself as the first symbol on the right-hand side of its own
production. This causes problems for top-down parsers, like recursive descent
parsers, as they can enter an infinite loop when attempting to parse such
productions.
To remove left recursion, we apply a transformation to the grammar. For a
production of the form:
𝑨 → 𝑨𝜶 | 𝜷
where A is a non-terminal and α and β are sequences of terminals and/or non-
terminals, the left recursion is eliminated by introducing a new non-terminal A'.
The transformation is as follows:
𝑨 → 𝜷𝑨′
𝑨′ → 𝜶𝑨′ | 𝜺
Here, A' represents the new non-terminal, and ε is the empty string.
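This transformation lets top-down parsers handle the grammar without infinite
recursion. It is a necessary step toward an LL(1) grammar, although left
factoring may still be required. The program below handles only immediate left
recursion, where A appears as the first symbol of its own production.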
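As a concrete case, E → E+T | T becomes E → TE' and E' → +TE' | ε. The sketch below is a small recursive descent recognizer for that transformed grammar, under the assumption (not taken from the lab program) that T derives only the single terminal i. All function names are hypothetical. A routine written directly for E → E+T would call itself before consuming any input, which is the infinite loop described above.

```c
#include <stdbool.h>

/* Position in the input; a file-scope variable keeps the sketch short. */
static const char *cursor;

static bool parseT(void) {            /* T -> i (assumed for illustration) */
    if (*cursor == 'i') {
        cursor++;
        return true;
    }
    return false;
}

static bool parseEPrime(void) {       /* E' -> + T E' | epsilon */
    if (*cursor == '+') {
        cursor++;                     /* input is consumed before recursing */
        return parseT() && parseEPrime();
    }
    return true;                      /* epsilon: consume nothing */
}

static bool parseE(void) {            /* E -> T E' */
    return parseT() && parseEPrime();
}

/* Returns true when the whole input string is derived from E. */
bool acceptsExpression(const char *input) {
    cursor = input;
    return parseE() && *cursor == '\0';
}
```

Every recursive call in parseEPrime happens only after a '+' has been consumed, so the recursion depth is bounded by the input length.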
Code:
#include <stdio.h>
#include <string.h>
#define MAX 10

/* Rewrites A -> A alpha | beta as A -> beta A' and A' -> alpha A' | epsilon.
   Only immediate left recursion is handled. */
void removeLeftRecursion(char nonTerminal, char productions[MAX][MAX],
                         int prodCount) {
    char alpha[MAX][MAX], beta[MAX][MAX];
    int alphaCount = 0, betaCount = 0;
    /* Split the alternatives into left-recursive ones (alpha) and the rest (beta). */
    for (int i = 0; i < prodCount; i++) {
        if (productions[i][0] == nonTerminal) {
            strcpy(alpha[alphaCount++], productions[i] + 1);
        } else {
            strcpy(beta[betaCount++], productions[i]);
        }
    }
    if (alphaCount == 0) {
        /* No left recursion: print the grammar unchanged. */
        printf("%c -> ", nonTerminal);
        for (int i = 0; i < prodCount; i++) {
            printf("%s", productions[i]);
            if (i < prodCount - 1) printf(" | ");
        }
        printf("\n");
        return;
    }
    if (betaCount == 0) {
        /* Without a beta alternative there is nothing to start A' from. */
        printf("%c has no non-recursive alternative; nothing to transform.\n",
               nonTerminal);
        return;
    }
    printf("%c -> ", nonTerminal);
    for (int i = 0; i < betaCount; i++) {
        printf("%s%c'", beta[i], nonTerminal);
        if (i < betaCount - 1) printf(" | ");
    }
    printf("\n");
    printf("%c' -> ", nonTerminal);
    for (int i = 0; i < alphaCount; i++) {
        printf("%s%c'", alpha[i], nonTerminal);
        if (i < alphaCount - 1) printf(" | ");
    }
    printf(" | ε\n");
}

int main(void) {
    char nonTerminal;
    int prodCount;
    char productions[MAX][MAX];
    printf("Enter the non-terminal: ");
    scanf(" %c", &nonTerminal);
    printf("Enter the number of productions: ");
    if (scanf("%d", &prodCount) != 1 || prodCount < 1 || prodCount > MAX) {
        printf("Expected a count between 1 and %d.\n", MAX);
        return 1;
    }
    printf("Enter the productions (e.g., A->Aa|b, enter only 'Aa' and 'b'):\n");
    for (int i = 0; i < prodCount; i++) {
        scanf("%9s", productions[i]); /* at most MAX - 1 characters each */
    }
    printf("\nGrammar after removing left recursion:\n");
    removeLeftRecursion(nonTerminal, productions, prodCount);
    return 0;
}
Output:
Experiment 4
Aim: Write a program to check whether a given grammar is LL(1).
Language Used: C
Theory:
LL(1) parsing is a top-down method that uses a parsing table and one symbol of
lookahead to decide the next action. A grammar can be LL(1) only if it has:
1. No Left Recursion – Avoids infinite loops.
2. No Ambiguity – An ambiguous grammar always produces a table conflict.
3. No Common Prefixes – Productions of the same non-terminal must be left
factored.
These conditions are necessary but not sufficient. To check if a grammar is
LL(1):
1. Compute First and Follow sets.
2. Build the Parsing Table using these sets.
3. If any cell has multiple entries, the grammar is not LL(1).
A grammar that passes this check can be parsed predictively without
backtracking.
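To illustrate how a conflict-free table drives the parse, the sketch below hard-codes the table for the grammar E → TE', E' → +TE' | ε, T → i. This grammar is an assumption chosen for illustration, with P standing for E' so that every symbol is one character. Its sets are FIRST(E) = FIRST(T) = {i}, FIRST(E') = {+, ε}, and FOLLOW(E') = {$}, so each cell holds at most one production. The names tableEntry and ll1Accepts are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hand-built table: returns the right-hand side to push, "" for epsilon,
   or NULL for an empty cell. P stands for E'. */
static const char *tableEntry(char nonTerminal, char lookahead) {
    switch (nonTerminal) {
    case 'E': return lookahead == 'i' ? "TP" : NULL;
    case 'P': return lookahead == '+' ? "+TP" : (lookahead == '$' ? "" : NULL);
    case 'T': return lookahead == 'i' ? "i" : NULL;
    default:  return NULL;
    }
}

/* Predictive parse of input (given without the end marker). */
bool ll1Accepts(const char *input) {
    char stack[64] = "$E";             /* bottom marker, then the start symbol */
    int top = 1;                       /* index of the top of the stack */
    size_t pos = 0, len = strlen(input);
    for (;;) {
        char lookahead = pos < len ? input[pos] : '$';
        char symbol = stack[top];
        if (symbol == '$') {
            return pos == len;         /* accept only if all input was read */
        }
        if (symbol == 'E' || symbol == 'P' || symbol == 'T') {
            const char *rhs = tableEntry(symbol, lookahead);
            if (rhs == NULL) return false;            /* empty cell: error */
            top--;                                    /* pop the non-terminal */
            size_t rhsLen = strlen(rhs);
            if ((size_t)top + rhsLen >= sizeof stack) return false;
            for (size_t k = rhsLen; k > 0; k--) {
                stack[++top] = rhs[k - 1];            /* push in reverse order */
            }
        } else {
            if (symbol != lookahead) return false;    /* terminal mismatch */
            top--;
            pos++;
        }
    }
}
```

Each step looks at one stack symbol and one lookahead character and never revisits earlier input, which is what a single entry per cell makes possible.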
Code:
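#include <stdio.h>
#include <string.h>
#define MAX 10
#define SET_MAX 128
#define EPSILON '#' /* assumption: '#' is typed for the empty string */

typedef struct {
    char nonTerminal;
    char productions[MAX][MAX];
    int prodCount;
    char first[SET_MAX];
    char follow[SET_MAX];
} Grammar;

int n;
Grammar grammar[MAX];

int isNonTerminal(char ch) {
    return (ch >= 'A' && ch <= 'Z');
}

/* Index of the grammar entry for nt, or -1 if nt has no productions. */
int findIndex(char nt) {
    for (int i = 0; i < n; i++)
        if (grammar[i].nonTerminal == nt) return i;
    return -1;
}

/* Adds ch to a set stored as a string; returns 1 if the set grew. */
int addToSet(char *set, char ch) {
    size_t len = strlen(set);
    if (strchr(set, ch) != NULL || len + 1 >= SET_MAX) return 0;
    set[len] = ch;
    set[len + 1] = '\0';
    return 1;
}

/* Writes FIRST of the symbol string s into out, using the FIRST sets
   computed so far. EPSILON is included if every symbol can vanish. */
void firstOfString(const char *s, char *out) {
    out[0] = '\0';
    for (; *s != '\0' && *s != EPSILON; s++) {
        int idx = isNonTerminal(*s) ? findIndex(*s) : -1;
        if (idx < 0) {                 /* terminal: it starts the string */
            addToSet(out, *s);
            return;
        }
        int nullable = 0;
        for (const char *f = grammar[idx].first; *f != '\0'; f++) {
            if (*f == EPSILON) nullable = 1;
            else addToSet(out, *f);
        }
        if (!nullable) return;
    }
    addToSet(out, EPSILON);
}

/* Repeats until no FIRST set changes (fixed point). */
void calculateFirst(void) {
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < grammar[i].prodCount; j++) {
                char tmp[SET_MAX];
                firstOfString(grammar[i].productions[j], tmp);
                for (char *t = tmp; *t != '\0'; t++)
                    changed |= addToSet(grammar[i].first, *t);
            }
        }
    }
}

/* For A -> xBy: FOLLOW(B) gets FIRST(y) minus epsilon, plus FOLLOW(A)
   when y can vanish. The first non-terminal entered is the start symbol. */
void calculateFollow(void) {
    addToSet(grammar[0].follow, '$');
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < grammar[i].prodCount; j++) {
                const char *prod = grammar[i].productions[j];
                for (int k = 0; prod[k] != '\0'; k++) {
                    int b = isNonTerminal(prod[k]) ? findIndex(prod[k]) : -1;
                    if (b < 0) continue;
                    char tmp[SET_MAX];
                    firstOfString(prod + k + 1, tmp);
                    for (char *t = tmp; *t != '\0'; t++) {
                        if (*t != EPSILON) {
                            changed |= addToSet(grammar[b].follow, *t);
                        } else {
                            for (char *f = grammar[i].follow; *f != '\0'; f++)
                                changed |= addToSet(grammar[b].follow, *f);
                        }
                    }
                }
            }
        }
    }
}

/* Returns 1 if no parsing table cell would receive two productions. */
int checkLL1(void) {
    for (int i = 0; i < n; i++) {
        char seen[SET_MAX] = ""; /* terminals already used in this table row */
        for (int j = 0; j < grammar[i].prodCount; j++) {
            char tmp[SET_MAX], predict[SET_MAX] = "";
            firstOfString(grammar[i].productions[j], tmp);
            for (char *t = tmp; *t != '\0'; t++) {
                if (*t != EPSILON) {
                    addToSet(predict, *t);
                } else {
                    for (char *f = grammar[i].follow; *f != '\0'; f++)
                        addToSet(predict, *f);
                }
            }
            for (char *p = predict; *p != '\0'; p++)
                if (!addToSet(seen, *p)) return 0; /* cell already filled */
        }
    }
    return 1;
}

int main(void) {
    printf("Enter the number of non-terminals: ");
    if (scanf("%d", &n) != 1 || n < 1 || n > MAX) {
        printf("Expected a count between 1 and %d.\n", MAX);
        return 1;
    }
    for (int i = 0; i < n; i++) {
        printf("Enter non-terminal %d: ", i + 1);
        scanf(" %c", &grammar[i].nonTerminal);
        printf("Enter the number of productions for %c: ",
               grammar[i].nonTerminal);
        if (scanf("%d", &grammar[i].prodCount) != 1 ||
            grammar[i].prodCount < 1 || grammar[i].prodCount > MAX) {
            printf("Expected a count between 1 and %d.\n", MAX);
            return 1;
        }
        printf("Enter productions for %c (one per line, %c for epsilon):\n",
               grammar[i].nonTerminal, EPSILON);
        for (int j = 0; j < grammar[i].prodCount; j++) {
            scanf("%9s", grammar[i].productions[j]); /* at most MAX - 1 chars */
        }
    }
    calculateFirst();
    calculateFollow();

    if (checkLL1()) {
        printf("The given grammar is LL(1).\n");
    } else {
        printf("The given grammar is NOT LL(1).\n");
    }
    return 0;
}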
Output:
