Ccfile

Uploaded by

s98230358

AMITY SCHOOL OF ENGINEERING & TECHNOLOGY

AMITY UNIVERSITY CAMPUS, SECTOR-125, NOIDA-201303

B.Tech CSE (Data Science)


Practical File

Compiler Construction
[CSE304]

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


AMITY SCHOOL OF ENGINEERING AND TECHNOLOGY
AMITY UNIVERSITY, NOIDA, UTTAR PRADESH

Submitted To: Dr. Nancy Gulati

Submitted By: Shreya Sankhwar
A023167022058
6CSE-DS-1Y

Practical 1

Aim: Check whether given regular expressions are valid, i.e., whether they can be accepted or not.

Theory:
A regular expression is a sequence of characters that defines a search pattern. It is mainly
used for pattern matching against strings and gives a general way to describe sets of
character sequences.
The POSIX regular expression library in C lets developers work with regular
expressions through the <regex.h> header. The key functions are:

1. regcomp(): Compiles a regular expression.
2. regexec(): Executes the compiled regular expression to match input strings.
3. regfree(): Frees memory allocated for the compiled regular expression.

If regcomp() encounters an invalid pattern, it returns a non-zero error code. This
experiment evaluates various regex patterns for validity using these functions.

Procedure:
1. Include the <regex.h> header.
2. Create an array of regex patterns, including valid and invalid ones.
3. Use the regcomp() function to compile each pattern.
4. Check the return value of regcomp() to determine if the regex is valid or not.
5. Print the result for each pattern.

CODE:
// C program for illustration of regcomp()
#include <regex.h>
#include <stdio.h>

// Driver Code
int main()
{
    // Variable to create regex
    regex_t reegex;

    // Variable to store the return
    // value after creation of regex
    int value;

    // Function call to create regex
    value = regcomp(&reegex, "[:word:]", 0);

    // If compilation is successful
    if (value == 0) {
        printf("RegEx compiled successfully.\n");
        // Free the memory held by the compiled regex
        regfree(&reegex);
    }
    // Else for Compilation error
    else {
        printf("Compilation error.\n");
    }
    return 0;
}

RESULT-

Time Complexity- O(n)

Practical 2

Aim: Count the number of tokens in a program.

Theory:
In C programming, tokens are the smallest individual units of a program. These include
keywords, identifiers, constants, operators, and special symbols. Tokenizing a program
involves parsing its contents and separating it into these individual components. This
process helps in lexical analysis and understanding the structure of the program.

Lexical analysis is the first phase of a compiler; the component that performs it is also
known as the scanner. It converts the input program into a sequence of tokens. For example:

1) Keywords:
Examples- for, while, if etc.

2) Identifiers:
Examples- variable names, function names etc.

3) Operators:
Examples- '+', '++', '-' etc.

4) Separators:
Examples- ',', ';' etc.

Procedure:
1. Prepare Input File: Write or use an existing C program file to serve as input for
token counting.

2. Implement Tokenizer: Use standard file I/O in C to read the file, and process the
content character by character to identify tokens based on delimiters like spaces,
tabs, and special symbols.

3. Count Tokens: Maintain a count of tokens identified during the process.

4. Display Results: Output the total number of tokens detected in the program.

Code:
#include <stdio.h>
#include <string.h>
#include <ctype.h>

// Function to check if a character is a delimiter
// (any whitespace counts, so newlines and tabs also end a token)
int isDelimiter(char ch) {
    return (isspace((unsigned char)ch) || ch == '+' || ch == '-' || ch == '*' || ch == '/' ||
            ch == ',' || ch == ';' || ch == '>' || ch == '<' || ch == '=' ||
            ch == '(' || ch == ')' || ch == '[' || ch == ']' || ch == '{' || ch == '}');
}

// Function to check if a character is an operator
int isOperator(char ch) {
    return (ch == '+' || ch == '-' || ch == '*' || ch == '/' || ch == '=' ||
            ch == '<' || ch == '>' || ch == '&' || ch == '|');
}

// Function to count tokens in the given input code
void countTokens(char *code) {
    char buffer[100];
    int i = 0, tokenCount = 0;

    printf("Tokens found in the code:\n");

    for (int j = 0; code[j] != '\0'; j++) {
        char ch = code[j];

        // Check for delimiters
        if (isDelimiter(ch)) {
            if (i != 0) {
                buffer[i] = '\0';
                printf("%s\n", buffer);
                tokenCount++;
                i = 0;
            }
            if (!isspace((unsigned char)ch)) {
                printf("%c\n", ch);
                tokenCount++;
            }
        } else if (isOperator(ch)) {
            if (i != 0) {
                buffer[i] = '\0';
                printf("%s\n", buffer);
                tokenCount++;
                i = 0;
            }
            printf("%c\n", ch);
            tokenCount++;
        } else if (i < (int)sizeof(buffer) - 1) {
            // Keep room for the terminating '\0'
            buffer[i++] = ch;
        }
    }
    if (i != 0) {
        buffer[i] = '\0';
        printf("%s\n", buffer);
        tokenCount++;
    }
    printf("\nTotal number of tokens: %d\n", tokenCount);
}

int main() {
    char code[1000];

    printf("Enter the C program code (end with ~ on a new line):\n");

    // Reading multiple lines of code as input
    char line[200];
    code[0] = '\0'; // Initialize code as an empty string
    while (fgets(line, sizeof(line), stdin) != NULL) { // Stop at end of input
        if (line[0] == '~') // End input with ~
            break;
        if (strlen(code) + strlen(line) >= sizeof(code)) // Do not overflow code[]
            break;
        strcat(code, line);
    }
    countTokens(code);
    return 0;
}
RESULT-

Time Complexity- O(n)

Practical 3

Aim: Evaluate a postfix expression using push and pop operations in C.

Theory:
Postfix expression: An expression of the form “a b operator” (for example, ab+), i.e., a pair
of operands followed by an operator.
Postfix notation (Reverse Polish Notation) is a mathematical notation where
operators follow their operands. It eliminates the need for parentheses. For example:

1. Infix: (3 + 4) * 5
2. Postfix: 3 4 + 5 *

Evaluation Steps:
1) Scan the postfix expression from left to right.
2) Use a stack to store operands.
3) If an operand is encountered, push it onto the stack.
4) If an operator is encountered, pop two elements from the stack, perform the
operation, and push the result back.
5) At the end of the expression, the stack will contain the result.
Stack Operations:
1. Push: Insert an element into the stack.
2. Pop: Remove and return the top element from the stack.

Procedure:
1) Define a stack structure and functions for push and pop.
2) Traverse the postfix expression:
1. If it is an operand, push it onto the stack.
2. If it is an operator, pop two elements from the stack, perform the operation, and
push the result back.
3) Continue until the end of the expression.
4) The result will be at the top of the stack.

Code:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>

// Define stack and its functions
#define MAX 100
int stack[MAX];
int top = -1;

void push(int value) {
    if (top >= MAX - 1) {
        printf("Stack overflow\n");
        exit(1);
    }
    stack[++top] = value;
}

int pop() {
    if (top == -1) {
        printf("Stack underflow\n");
        exit(1);
    }
    return stack[top--];
}

int evaluatePostfix(char* expression) {
    int i = 0, operand1, operand2, result;

    while (expression[i] != '\0') {
        if (isdigit((unsigned char)expression[i])) {
            // Convert character to integer and push onto stack
            push(expression[i] - '0');
        } else {
            // It's an operator, so pop two operands
            operand2 = pop();
            operand1 = pop();

            switch (expression[i]) {
                case '+':
                    result = operand1 + operand2; break;
                case '-':
                    result = operand1 - operand2; break;
                case '*':
                    result = operand1 * operand2; break;
                case '/':
                    if (operand2 == 0) {
                        printf("Division by zero\n");
                        exit(1);
                    }
                    result = operand1 / operand2; break;
                default:
                    printf("Invalid operator encountered\n");
                    exit(1);
            }

            // Push the result back onto the stack
            push(result);
        }
        i++;
    }

    // Final result is the last element in the stack
    return pop();
}

int main() {
    char postfixExpression[MAX];

    printf("Enter a postfix expression (e.g., 23*54*+): ");
    // Limit the read to MAX - 1 characters to avoid overflowing the buffer
    scanf("%99s", postfixExpression);

    int result = evaluatePostfix(postfixExpression);

    printf("The result of the postfix evaluation is: %d\n", result);

    return 0;
}

RESULT-

Time Complexity- O(n)


Practical 4

Aim: Convert an NFA to a DFA using C.

Theory:

1. A Nondeterministic Finite Automaton (NFA) is an automaton where, for some state
and input, the machine may transition to multiple states or to no state at all. It may also
include epsilon transitions (transitions that consume no input symbol).
2. A Deterministic Finite Automaton (DFA) is an automaton where, for each state and
input, there is exactly one possible next state. No epsilon transitions are allowed.

The process of converting an NFA to a DFA involves:

1) Identifying the possible state combinations of the NFA.
2) Creating a state transition table for the DFA by considering the power set of the
NFA states.
3) Minimizing the DFA if necessary.

Procedure:
1. Input: The states, input symbols, transition function, start state, and accepting
states of the NFA.
2. Construct DFA states:
1) Identify the epsilon-closure (set of reachable states from a given state using
epsilon transitions).
2) For each state of the NFA, create new DFA states by combining possible states of
the NFA.
3. Transition function: For each new DFA state, calculate the possible transitions for
each input symbol.
4. Final DFA construction: Once all possible transitions are calculated, print the
DFA state transition table.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_STATES 10
#define MAX_SYMBOLS 2
#define MAX_TRANSITIONS 5

int nfaStates, nfaAlphabetSize,
    nfaTransitions[MAX_STATES][MAX_SYMBOLS][MAX_TRANSITIONS];
int nfaStartState, nfaFinalStates[MAX_STATES];
int dfaStates[MAX_STATES][MAX_SYMBOLS], dfaStateCount = 0;
int dfaState[MAX_STATES]; // NFA state represented by each DFA state

void convertNFAtoDFA() {
    int currentState, nextState, i, j, found;

    dfaState[dfaStateCount++] = nfaStartState;

    for (i = 0; i < dfaStateCount; i++) {
        currentState = dfaState[i];
        for (j = 0; j < nfaAlphabetSize; j++) {
            // Simplification: only the first defined target for this symbol is
            // followed. A full subset construction would collect all targets.
            nextState = -1;
            for (int k = 0; k < MAX_TRANSITIONS; k++) {
                if (nfaTransitions[currentState][j][k] != -1) {
                    nextState = nfaTransitions[currentState][j][k];
                    break;
                }
            }
            dfaStates[i][j] = -1; // -1 means no transition on this symbol
            if (nextState != -1) {
                found = -1;
                for (int k = 0; k < dfaStateCount; k++) {
                    if (dfaState[k] == nextState) {
                        found = k;
                        break;
                    }
                }
                if (found == -1) {
                    found = dfaStateCount;
                    dfaState[dfaStateCount++] = nextState;
                }
                dfaStates[i][j] = found; // Record the DFA transition
            }
        }
    }
}

void printDFA() {
    printf("DFA States and Transitions:\n");
    for (int i = 0; i < dfaStateCount; i++) {
        printf("DFA State %d:", i);
        for (int j = 0; j < nfaAlphabetSize; j++) {
            printf(" symbol %d -> %d", j, dfaStates[i][j]);
        }
        printf("\n");
    }
}

int main() {
    nfaStates = 3;
    nfaAlphabetSize = 2;
    nfaStartState = 0;

    // Define NFA transitions: -1 means no transition
    memset(nfaTransitions, -1, sizeof(nfaTransitions));
    nfaTransitions[0][0][0] = 0;
    nfaTransitions[0][1][1] = 1;
    nfaTransitions[1][0][1] = 1;
    nfaTransitions[1][1][2] = 2;
    nfaTransitions[2][0][2] = 2;
    nfaTransitions[2][1][2] = 2;

    convertNFAtoDFA();
    printDFA();

    return 0;
}

RESULT-

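Time Complexity- O(2^n · m · n) in the worst case for a full subset construction, where n is the number of NFA states and m is the number of input symbols.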

Practical 5

Aim:
To write a C program that checks whether a given line is a comment in C programming.
Theory:
1. Comments in C:
o Comments are used to add explanations or notes within the code without
affecting program execution.
o C supports two types of comments:
▪ Single-line comments: Begin with // and extend to the end of the
line.
▪ Multi-line comments: Begin with /* and end with */, spanning
multiple lines if needed.
2. Approach to Identifying Comments:

o Read the given line of text.
o Check if it starts with // (indicating a single-line comment).
o Check if it starts with /* and ends with */ (indicating a multi-line comment).
o If neither condition is met, the line is not a comment.

Procedure:
1. Take input from the user as a string (line of code).
2. Check the first two characters of the string:
o If "//", it is a single-line comment.

o If "/*" at the start and "*/" at the end, it is a multi-line comment.

o Otherwise, it is not a comment.

3. Display the result accordingly.

Code:
#include <stdio.h>
#include <string.h>

void checkComment(char line[]) {
    int len = strlen(line);

    if (line[0] == '/' && line[1] == '/') {
        printf("The given line is a single-line comment.\n");
    }
    // len >= 4 ensures the opening and closing markers do not overlap
    else if (len >= 4 && line[0] == '/' && line[1] == '*' && line[len - 2] == '*' && line[len - 1] == '/') {
        printf("The given line is a multi-line comment.\n");
    }
    else {
        printf("The given line is NOT a comment.\n");
    }
}

int main() {
    char line[200];

    printf("Enter a line of code: ");

    if (fgets(line, sizeof(line), stdin) == NULL) {
        return 1; // No input was read
    }

    // Removing newline character if present
    line[strcspn(line, "\n")] = 0;

    checkComment(line);

    return 0;
}

Output:

Time Complexity:
Best Case (O(1)): If the input starts with characters that immediately determine the result.
Worst Case (O(n)): When scanning a long string to verify a multi-line comment.

Practical 6

Aim:
To write a C program that simulates an automaton that accepts strings belonging to the
regular expression a(a+b)*a, where the symbol set = {a, b}.
Theory:
1. Regular Expression (RE) Explanation:
o a(a+b)*a represents strings that:
▪ Start with a.
▪ Contain any number of a or b in between (including none).
▪ End with a.
o Example of accepted strings:
▪ aa, aba, aaa, abba, ababaaa
o Example of rejected strings:
▪ bba, ab, ba, bbba
2. Finite Automaton for RE:
o States: {q0, q1, q2, qf}
o Transitions:
▪ q0 → q1 on a (first a)
▪ q1 → q1 on a or b (middle part: (a+b)*)
▪ q1 → qf on a (last a)
o Start State: q0
o Final State: qf
o Reject State: If input does not follow the above rules.

Procedure:
1. Read the input string.
2. Start from state q0.
3. Transition through states based on the input symbol:
o If first character is a, move to q1.
o Read the middle part (accept a or b in q1).
o If the last character is a, move to the final state qf.
4. If the string reaches qf, accept it; otherwise, reject it.

Code:
#include <stdio.h>
#include <string.h>

// Function to check if the given string is accepted by the automaton
int isAccepted(char str[]) {
    int len = strlen(str);

    // Check if length is at least 2 (must start and end with 'a')
    if (len < 2) return 0;

    // Check first and last character
    if (str[0] == 'a' && str[len - 1] == 'a') {
        for (int i = 1; i < len - 1; i++) {
            if (str[i] != 'a' && str[i] != 'b') {
                return 0; // Invalid character found
            }
        }
        return 1; // String is accepted
    }
    return 0; // Not matching the required pattern
}

int main() {
    char input[100];

    printf("Enter a string (symbols {a, b}): ");
    // Limit the read to 99 characters to avoid overflowing the buffer
    scanf("%99s", input);

    if (isAccepted(input)) {
        printf("The string is ACCEPTED.\n");
    } else {
        printf("The string is REJECTED.\n");
    }

    return 0;
}

Output:

Time Complexity:
• O(n) (where n is the length of the input string) since it scans the string once.

Practical 7

Aim:
To write a C program to check whether a given identifier is valid according to the rules
of C programming.

Theory:
1. Definition of an Identifier:
o An identifier is the name of a variable, function, array, etc. in a programming language.
2. Rules for a Valid Identifier in C:
o Must start with a letter (A-Z or a-z) or an underscore (_).
o Can contain letters, digits (0-9), and underscores (_).
o Cannot be a reserved keyword in C.
o Cannot contain special characters like @, #, $, %, &, etc.
o Cannot start with a digit.
3. Examples:
✅ Valid Identifiers:
o name, var_1, _count, MyVar
❌ Invalid Identifiers:
o 1var (starts with a digit)
o my-var (contains -)
o int (reserved keyword)

Procedure:
1. Read the input string.
2. Check if the first character is a letter or an underscore.
3. Check the remaining characters to ensure they contain only letters, digits, or
underscores.
4. Verify that the string is not a C reserved keyword.
5. Display the result accordingly.
Code:
#include <stdio.h>
#include <ctype.h>
#include <string.h>

// List of C reserved keywords
const char *keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof",
    "static", "struct", "switch", "typedef", "union", "unsigned", "void",
    "volatile", "while"
};
#define KEYWORDS_COUNT (sizeof(keywords) / sizeof(keywords[0]))

// Function to check if a string is a reserved keyword
int isKeyword(char *str) {
    for (size_t i = 0; i < KEYWORDS_COUNT; i++) {
        if (strcmp(str, keywords[i]) == 0)
            return 1; // It's a keyword
    }
    return 0;
}

// Function to check if the identifier is valid
int isValidIdentifier(char *str) {
    int len = strlen(str);

    // Check if the first character is a letter or underscore
    if (!isalpha((unsigned char)str[0]) && str[0] != '_') {
        return 0;
    }

    // Check remaining characters (should be letters, digits, or underscore)
    for (int i = 1; i < len; i++) {
        if (!isalnum((unsigned char)str[i]) && str[i] != '_') {
            return 0;
        }
    }

    // Check if it's a keyword
    if (isKeyword(str)) {
        return 0;
    }

    return 1;
}

int main() {
    char identifier[100];

    printf("Enter an identifier: ");
    // Limit the read to 99 characters to avoid overflowing the buffer
    scanf("%99s", identifier);

    if (isValidIdentifier(identifier)) {
        printf("The identifier '%s' is VALID.\n", identifier);
    } else {
        printf("The identifier '%s' is INVALID.\n", identifier);
    }

    return 0;
}

Output:
Time Complexity:
• Best Case (O(1)): If the first character is invalid.
• Worst Case (O(n)): If the identifier is long, it scans each character and checks for
keywords.

Practical 8

Aim: To write a C program that simulates a lexical analyzer to validate
operators: <, >, <=, >=, !, !=, |, ||, &, &&, =, ==

Theory:
1. Lexical Analysis in Compiler Design:
o Lexical analysis is the first phase of a compiler.
o It scans the input program and recognizes tokens such as keywords,
identifiers, operators, and symbols.
2. Operators to be Validated:
o Relational Operators: <, >, <=, >=, ==, !=
o Logical Operators: !, ||, &&
o Bitwise Operators: |, &
o Assignment Operator: =
3. Lexical Analyzer Working:
o Read the input string.
o Check if the string matches any of the valid operators.
o If valid, print "Valid Operator."
o If invalid, print "Invalid Operator."

Procedure:
1. Read the input string.
2. Compare it with valid operators using strcmp().
3. If the input matches any valid operator, print "Valid Operator."
4. Otherwise, print "Invalid Operator."

Code:
#include <stdio.h>
#include <string.h>

// List of valid operators
const char *operators[] = { "<", ">", "<=", ">=", "!", "!=", "|",
                            "||", "&", "&&", "=", "==" };
#define OPERATORS_COUNT (sizeof(operators) / sizeof(operators[0]))

// Function to check if the input is a valid operator
int isValidOperator(char *op) {
    for (size_t i = 0; i < OPERATORS_COUNT; i++) {
        if (strcmp(op, operators[i]) == 0) {
            return 1; // Valid operator
        }
    }
    return 0; // Invalid operator
}

int main() {
    // Operators are at most 2 characters long, but the buffer is larger so
    // that longer (invalid) input is read safely and then rejected
    char input[16];

    printf("Enter an operator: ");
    scanf("%15s", input);

    if (isValidOperator(input)) {
        printf("'%s' is a VALID operator.\n", input);
    } else {
        printf("'%s' is an INVALID operator.\n", input);
    }

    return 0;
}

Output:

Time Complexity:
• Best Case (O(1)): If the operator is found at the beginning of the list.
• Worst Case (O(n)): If it has to check all operators.
