Compiler Design
Introduction
In compiler design, two critical components are the lexical analyzer and the syntax analyzer. LEX and
YACC are tools that help automate the creation of these components.
LEX (Lexical Analyzer Generator): Generates a lexical analyzer (scanner) that identifies
tokens in the input code.
YACC (Yet Another Compiler-Compiler): Generates a parser that analyzes the syntactic
structure of the token stream produced by the scanner.
Example (LEX):
%{
#include <stdio.h>
%}
%%
[0-9]+      { printf("NUMBER\n"); }
[a-zA-Z]+   { printf("IDENTIFIER\n"); }
"+"         { printf("PLUS\n"); }
\n          { /* ignore newline */ }
.           { printf("UNKNOWN\n"); }
%%
/* Tell the scanner there is no further input after end of file. */
int yywrap(void) { return 1; }
int main(void) {
    yylex();
    return 0;
}
Explanation:
- The above example identifies numbers, identifiers, and the plus operator.
- The pattern `[0-9]+` matches sequences of digits, and `[a-zA-Z]+` matches sequences of letters.
Advantages of LEX:
Automated Lexical Analysis: Automates the process of recognizing tokens using regular
expressions.
Efficiency: Generates fast and optimized C code for token recognition.
Ease of Use: Simplifies the development of lexical analyzers.
Example (YACC):
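%{
#include <stdio.h>
/* yylex() comes from the scanner; it must return NUMBER with the value in yylval. */
int yylex(void);
void yyerror(const char *s);
%}
%token NUMBER
%%
expression: NUMBER '+' NUMBER { printf("Sum: %d\n", $1 + $3); }
          ;
%%
/* Called by the parser when the input does not match the grammar. */
void yyerror(const char *s) {
    fprintf(stderr, "Error: %s\n", s);
}
int main(void) {
    yyparse();
    return 0;
}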
Explanation:
- The rule `expression: NUMBER '+' NUMBER` defines an expression that consists of two numbers
separated by a plus sign.
- `$1` and `$3` are the semantic values of the first and third symbols (the two numbers); `$2` would be the value of the plus sign.
Advantages of YACC:
- Automated Parser Generation: Simplifies parser creation from complex grammars.
- Error Detection and Handling: Reports syntax errors through `yyerror` and supports error recovery through the special `error` token.
- Flexibility: Allows defining complex grammar rules for different programming languages.
Output:
Input: 4 + 5
Output: Sum: 9
Conclusion
Using LEX and YACC in compiler design offers a streamlined approach to implementing lexical and
syntax analysis. By automating token recognition and grammar parsing, these tools help developers
create efficient and reliable compilers for different programming languages.
References
- "Lex & Yacc" by John R. Levine, Tony Mason, and Doug Brown
- Compiler Construction: Principles and Practice by Kenneth C. Louden
- Official GNU Documentation for Flex and Bison
Practical-2
Objective: Write a program to check whether a string includes a keyword or not.
Code:
#include <iostream>
#include <string>
int main() {
    std::string str, keyword;
    std::cout << "Enter a string: ";
    std::getline(std::cin, str);
    std::cout << "Enter a keyword: ";
    std::getline(std::cin, keyword);
    if (str.find(keyword) != std::string::npos) {
        std::cout << "The keyword is present in the string." << std::endl;
    } else {
        std::cout << "The keyword is not present in the string." << std::endl;
    }
    return 0;
}
Output:
Practical-3
Objective: Write a program to check whether a string contains an alphabetic character or not.
Code:
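#include <iostream>
#include <string>
#include <cctype> // For isalpha function
int main() {
    std::string str;
    bool hasAlphabet = false;
    std::cout << "Enter a string: ";
    std::getline(std::cin, str);
    // Scan the input and stop at the first alphabetic character.
    for (unsigned char ch : str) {
        if (std::isalpha(ch)) {
            hasAlphabet = true;
            break;
        }
    }
    if (hasAlphabet) {
        std::cout << "The string contains at least one alphabet character." << std::endl;
    } else {
        std::cout << "The string does not contain any alphabet characters." << std::endl;
    }
    return 0;
}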
Output:
Practical-4
Objective: Write a program to show all the operations of a stack.
Code:
#include <iostream>
#include <stack>
int main() {
    std::stack<int> stack;
    int choice, value;
    do {
        std::cout << "\nStack Operations Menu:";
        std::cout << "\n1. Push";
        std::cout << "\n2. Pop";
        std::cout << "\n3. Top";
        std::cout << "\n4. Is Empty";
        std::cout << "\n5. Size";
        std::cout << "\n6. Exit";
        std::cout << "\nEnter your choice: ";
        std::cin >> choice;
        switch (choice) {
        case 1:
            std::cout << "Enter value to push: ";
            std::cin >> value;
            stack.push(value);
            std::cout << value << " pushed into the stack." << std::endl;
            break;
        case 2:
            if (!stack.empty()) {
                std::cout << "Popped value: " << stack.top() << std::endl;
                stack.pop();
            } else {
                std::cout << "Stack is empty." << std::endl;
            }
            break;
        case 3:
            if (!stack.empty()) {
                std::cout << "Top value: " << stack.top() << std::endl;
            } else {
                std::cout << "Stack is empty." << std::endl;
            }
            break;
        case 4:
            std::cout << (stack.empty() ? "Stack is empty." : "Stack is not empty.") << std::endl;
            break;
        case 5:
            std::cout << "Stack size: " << stack.size() << std::endl;
            break;
        case 6:
            std::cout << "Exiting..." << std::endl;
            break;
        default:
            std::cout << "Invalid choice. Please try again." << std::endl;
        }
    } while (choice != 6);
    return 0;
}
Output:
Practical-5
Objective: Write a program to remove left recursion from a grammar.
Code:
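#include <iostream>
#include <vector>
#include <string>
using namespace std;
struct Production { char nonTerminal; vector<string> alternatives; };
// Prints A -> Aa | b rewritten as A -> bA' and A' -> aA' | e (e denotes epsilon).
void removeLeftRecursion(const Production &prod) {
    string primed = string(1, prod.nonTerminal) + "'";
    vector<string> recursive, nonRecursive;
    for (const string &alt : prod.alternatives)
        if (alt[0] == prod.nonTerminal) recursive.push_back(alt.substr(1));
        else nonRecursive.push_back(alt);
    for (const string &alt : nonRecursive)
        cout << prod.nonTerminal << " -> " << alt << (recursive.empty() ? "" : primed) << endl;
    for (const string &alt : recursive)
        cout << primed << " -> " << alt << primed << endl;
    if (!recursive.empty()) cout << primed << " -> e" << endl;
}
int main() {
    // Define original productions
    vector<Production> productions = {
        {'E', {"E+T", "T"}},
        {'T', {"T*F", "F"}},
        {'F', {"(E)", "id"}}
    };
    for (const Production &prod : productions) removeLeftRecursion(prod);
    return 0;
}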
Output:
Practical-6
Objective: Write a program to perform left factoring on a grammar.
Code:
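#include <iostream>
#include <vector>
#include <string>
#include <map>
using namespace std;
// Returns the longest prefix shared by a and b.
string longestCommonPrefix(const string &a, const string &b) {
    size_t i = 0;
    while (i < a.size() && i < b.size() && a[i] == b[i]) i++;
    return a.substr(0, i);
}
// Factors out the prefix common to all productions of one non-terminal:
// A -> ab | ac becomes A -> aA' and A' -> b | c (e denotes epsilon).
void leftFactor(const string &nonTerminal, const vector<string> &productions) {
    string commonPrefix;
    if (productions.size() > 1) {
        commonPrefix = productions[0];
        for (size_t i = 1; i < productions.size(); i++) {
            commonPrefix = longestCommonPrefix(commonPrefix, productions[i]);
            if (commonPrefix.empty()) break;
        }
    }
    if (commonPrefix.empty()) { // Nothing to factor: print the rule unchanged
        for (const string &p : productions) cout << nonTerminal << " -> " << p << endl;
        return;
    }
    string primed = nonTerminal + "'";
    cout << nonTerminal << " -> " << commonPrefix << primed << endl;
    for (const string &p : productions) {
        string rest = p.substr(commonPrefix.size());
        cout << primed << " -> " << (rest.empty() ? "e" : rest) << endl;
    }
}
int main() {
    map<string, vector<string>> grammar;
    int n;
    cout << "Enter the number of grammar rules: ";
    cin >> n;
    // Input grammar
    for (int i = 0; i < n; i++) {
        string nonTerminal, arrow, production;
        cout << "Enter non-terminal followed by '->': ";
        cin >> nonTerminal >> arrow; // Arrow input: '->'
        vector<string> productions;
        cout << "Enter productions (separated by space, end with newline): ";
        while (cin >> production) {
            productions.push_back(production);
            if (cin.peek() == '\n') break; // End input when newline is encountered
        }
        grammar[nonTerminal] = productions;
    }
    // Print the left-factored grammar
    for (const auto &rule : grammar) leftFactor(rule.first, rule.second);
    return 0;
}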
Output:
Practical-7
Objective: Write a program to find out the FIRST of the Nonterminals in a grammar.
Code:
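#include <iostream>
#include <vector>
#include <unordered_set>
#include <unordered_map>
#include <string>
#include <cctype> // For isupper function
using namespace std;
unordered_map<string, vector<string>> productions;
unordered_map<string, unordered_set<char>> firstSets;
// Computes FIRST(nonTerminal) from the leading symbol of each production.
// 'e' denotes epsilon; symbols after a nullable leading non-terminal are not examined.
void calculateFirstSet(const string &nonTerminal) {
    if (firstSets.count(nonTerminal)) return; // Already computed
    unordered_set<char> firstSet;
    for (const string &production : productions[nonTerminal]) {
        char symbol = production[0];
        if (isupper(static_cast<unsigned char>(symbol))) { // Non-terminal symbol
            calculateFirstSet(string(1, symbol));
            firstSet.insert(firstSets[string(1, symbol)].begin(), firstSets[string(1, symbol)].end());
        } else {
            firstSet.insert(symbol); // Terminal symbol
        }
    }
    firstSets[nonTerminal] = firstSet;
}
int main() {
    productions["S"] = {"aBC", "b"};
    productions["B"] = {"b", "C"};
    productions["C"] = {"c", "e"};
    for (const string &nonTerminal : vector<string>{"S", "B", "C"}) {
        calculateFirstSet(nonTerminal);
        cout << "FIRST(" << nonTerminal << ") = { ";
        for (char symbol : firstSets[nonTerminal]) cout << symbol << ' ';
        cout << "}" << endl;
    }
    return 0;
}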
Output:
Practical-8
Objective: Implementing Programs using Flex (Lexical analyzer tool).
Theory:
Introduction of Flex:
Flex is a lexical analyzer generator. Given a flex specification of patterns (regular
expressions) and their associated actions, it generates a C scanner that recognizes
those lexical patterns in the input.
Output:
Program (b): Program to check if the given letter is a vowel or not.
%{
#include <stdio.h>
#undef yywrap
#define yywrap() 1
void display(int);
%}
%%
[aeiouAEIOU] { display(1); }
\n           { /* ignore newline */ }
.            { display(0); }
%%
void display(int flag) {
    if (flag == 1)
        printf("The given letter [%s] is a vowel\n", yytext);
    else
        printf("The given letter [%s] is NOT a vowel\n", yytext);
}
int main() {
    printf("Enter a letter: ");
    yylex();
    return 0;
}
Output:
Practical-9
Objective: Elaborate DAG Representation with examples.
Theory:
Directed Acyclic Graph:
A Directed Acyclic Graph (DAG) represents the structure of a basic block: it shows how values
flow between the statements of the block and serves as the basis for optimizing it. The DAG is
built from the three-address code produced by intermediate code generation, and optimization
techniques are then applied to the basic block through the DAG.
• A DAG is a data structure used to apply transformations to basic blocks.
• A DAG is an efficient method for identifying common subexpressions.
• It shows how the value computed by a statement is used in subsequent statements.
Examples of directed acyclic graph:
Each three-address statement of a basic block has one of the following forms:
Case (1): x = y OP z
Case (2): x = OP y
Case (3): x = y
The Directed Acyclic Graph for these cases can be built as follows:
Step 1 –
• If node(y) is not defined, create a leaf node(y).
• For case (1), if node(z) is not defined, create a leaf node(z).
Step 2 –
• For case (1), find a node(OP) whose left child is node(y) and whose right child is node(z); if there is none, create it. Call this node n.
• For case (2), find a node(OP) whose only child is node(y); if there is none, create it. Call this node n.
• For case (3), node n is node(y).
Step 3 –
• Remove x from the list of identifiers attached to any other node, then add x to the list of attached identifiers for node n.
Example 1:
T0 = a + b : Expression 1
T1 = T0 + c : Expression 2
d = T0 + T1 : Expression 3
Example 2:
T1 = a + b
T2 = T1 + c
T3 = T1 * T2
Example 3:
T1 = a + b
T2 = a - b
T3 = T1 * T2
T4 = T1 - T3
T5 = T4 + T3