CDLabmanual
Department of
Computer Engineering & Information Technology
Laboratory Manual
“Compiler Design”
Subject Code: 3170701
Student Name
Enrollment Number
Academic Term
Certificate
Date: / /
Teaching and Examination Scheme
Course Outcome
Index
Sr. no.   Date   Title of Experiments   Page No.   Signature   Remarks
1 Introduction to LEX
A lexical analyzer takes an input stream and divides it into tokens. This division into units is known as
lexical analysis. Lex takes a set of rules for valid tokens and produces a C program, called a lexical
analyzer or lexer, that can identify these tokens.
Lex is a lexical analyzer generator: a programming tool that recognizes lexical patterns in the input
with the help of Lex specifications. Lex is generally used in the manner described below.
First, a specification of a lexical analyzer is prepared by writing a program lex.l in the Lex language.
Then lex.l is run through the Lex compiler to produce a C program, lex.yy.c. This program consists
of a tabular representation of a transition diagram constructed from the regular expressions of lex.l,
together with a standard routine that uses the table to recognize lexemes.
The actions associated with the regular expressions in lex.l are pieces of C code and are carried over directly
to lex.yy.c. Finally, lex.yy.c is run through the C compiler to produce an object program a.out, which is
the lexical analyzer that transforms an input stream into a sequence of tokens.
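The idea of a transition table driven by a standard routine can be sketched in plain C. This is a minimal illustration, not the code that Lex actually generates: the state names, the character classes and the function match_identifier() are all assumptions made for this example. The table recognizes a letter followed by letters, digits or underscores.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Character classes: the columns of the transition table. */
enum { C_LETTER, C_DIGIT, C_UNDERSCORE, C_OTHER, NUM_CLASSES };

/* States: the rows of the table. S_IN_ID is the only accepting state. */
enum { S_START, S_IN_ID, S_DEAD, NUM_STATES };

static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* letter    digit     '_'       other  */
    {  S_IN_ID,  S_DEAD,   S_DEAD,   S_DEAD },  /* S_START */
    {  S_IN_ID,  S_IN_ID,  S_IN_ID,  S_DEAD },  /* S_IN_ID */
    {  S_DEAD,   S_DEAD,   S_DEAD,   S_DEAD },  /* S_DEAD  */
};

/* Maps a character to its column in the table. */
static int classify(char c) {
    if (isalpha((unsigned char)c)) return C_LETTER;
    if (isdigit((unsigned char)c)) return C_DIGIT;
    if (c == '_') return C_UNDERSCORE;
    return C_OTHER;
}

/* Returns the length of the longest identifier at the start of s,
   or 0 if s does not start with an identifier. */
size_t match_identifier(const char *s) {
    assert(s != NULL);
    int state = S_START;
    size_t i = 0, last_accept = 0;
    while (s[i] != '\0') {
        state = transition[state][classify(s[i])];
        if (state == S_DEAD) break;
        i++;
        if (state == S_IN_ID) last_accept = i;  /* remember the last accepting position */
    }
    return last_accept;
}
```

Calling match_identifier("count1 = 0") walks the table over "count1" and stops at the space, so the lexeme is the first six characters. A generated lexer works on the same principle, with a much larger table built from all the rules.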
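A Lex specification has three sections, separated by lines containing only %%:
Definition Section
%%
Rules Section
%%
User Subroutines Section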
Definition Section
It contains the user-defined Lex options used by the lexer and sets up the environment in which the
generated lexer runs.
The part of this section enclosed between "%{" and "%}" contains C code, such as global declarations,
#include directives for library files and other declarations. It is copied verbatim into the lexical
analyzer (lex.yy.c) when the specification is passed through the Lex tool.
The rest of the definition section mainly contains name definitions, which give names to regular
expressions in order to simplify the scanner specification, and declarations of start conditions. The
rules section can then refer to these names.
Example:
%{
#include "calc.h"
#include <stdio.h>
#include <stdlib.h>
char name[10];
%}
/* Regular expressions */
/* ------------------- */
white [\t\n ]+
letter [A-Za-z]
digit [0-9]
identifier {letter}(_|{letter}|{digit})*
Rule Section
It contains the patterns and actions that make up the Lex specification. Each pattern is a
regular expression, and the lexer matches the longest possible string against it.
Once a pattern is matched, the corresponding action is invoked. The action consists of ordinary
C statements. If there is more than one statement, enclose them in braces ("{" and "}") so that
they form a single block.
%%
{letter}({letter}|{digit})* {
    printf("\nIt is an identifier: %s\n", yytext);
}
%%
Always use braces when the action has more than one statement or spans more than one line; this
keeps the code clear. The lexer always tries to match the longest possible string. When two
rules match strings of the same length, the lexer uses the rule that appears first in the Lex
specification and invokes its action.
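The two rules for resolving ambiguity (longest match first, then the earliest rule) can be sketched in C as follows. This is only an illustration of the selection logic; the matcher functions kw_if(), ident() and number() and the function pick_rule() are hypothetical names chosen for this example and are not part of Lex.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Each matcher returns the length of the prefix of s it accepts (0 if none). */
typedef size_t (*matcher_fn)(const char *);

static size_t kw_if(const char *s) {
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

static size_t ident(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t number(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Rules in specification order: 0 = keyword "if", 1 = identifier, 2 = number. */
static const matcher_fn rules[] = { kw_if, ident, number };

/* Returns the index of the winning rule, or -1 if no rule matches.
   The length of the match is stored in *len. */
int pick_rule(const char *s, size_t *len) {
    assert(s != NULL && len != NULL);
    int best = -1;
    *len = 0;
    for (int r = 0; r < (int)(sizeof rules / sizeof rules[0]); r++) {
        size_t n = rules[r](s);
        if (n > *len) {  /* only a strictly longer match replaces an earlier rule */
            *len = n;
            best = r;
        }
    }
    return best;
}
```

For the input "if", both the keyword rule and the identifier rule match two characters, so the keyword rule wins because it is listed first. For "iffy", the identifier rule matches four characters and wins on length.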
The Lex tool converts all the rules in the rules section into C statements and places them in a function named yylex().
Whenever yylex() is called, the C statements corresponding to the matched rules are executed.
This is why main() can call yylex() even though the specification does not define it anywhere.
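Example:
int main(void)
{
    yylex();
    return 0;
}
A complete specification with empty definition and rules sections:
%%
%%
int main(void)
{
    yylex();
    return 0;
}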
Let the above program be in a file called practical.l. To generate a lexical analyzer, enter
the following command:
$ lex practical.l
When this command is executed, Lex translates the Lex specification into a C source file called
lex.yy.c, which is the lexical analyzer. It can be compiled using the following
command:
$ cc lex.yy.c -ll
This compiles lex.yy.c with the C compiler and links it with the Lex library
through the -ll option. By default, the compiled output is written to the file "a.out".
The resulting program is executed using one of the following commands:
$ ./a.out
$ ./a.out < filename
Lex variables
yyin      Of type FILE*. Points to the current file being read by the lexer. By default, yyin points to standard input.
yyout     Of type FILE*. Points to the location where the output of the lexer is written. By default, yyout points to standard output.
yytext    Of type char*. Holds the text of the matched pattern.
yyleng    Gives the length of the matched pattern.
yylineno  Provides the current line number.
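To make the roles of these variables concrete, the following hand-written C sketch keeps analogous values while reading words from a file. It is not how Lex implements them: struct scanner, its fields and next_word() are hypothetical names introduced for this illustration, and the 64-byte buffer is an arbitrary assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for yyin, yytext, yyleng and yylineno. */
struct scanner {
    FILE  *in;        /* like yyin: where the input is read from */
    char   text[64];  /* like yytext: the most recent match */
    size_t leng;      /* like yyleng: the length of that match */
    int    lineno;    /* like yylineno: the current line number */
};

/* Reads the next run of non-whitespace characters into sc->text.
   Returns 1 if a word was found and 0 at end of input.
   Words longer than the buffer are truncated. */
int next_word(struct scanner *sc) {
    assert(sc != NULL && sc->in != NULL);
    int c;
    /* Skip whitespace, counting newlines as we go. */
    while ((c = fgetc(sc->in)) != EOF && isspace(c)) {
        if (c == '\n') sc->lineno++;
    }
    if (c == EOF) return 0;
    sc->leng = 0;
    do {
        if (sc->leng < sizeof sc->text - 1) sc->text[sc->leng++] = (char)c;
    } while ((c = fgetc(sc->in)) != EOF && !isspace(c));
    if (c != EOF) ungetc(c, sc->in);  /* leave the delimiter for the next call */
    sc->text[sc->leng] = '\0';
    return 1;
}
```

A caller would initialize the structure with in set to stdin (or a file opened with fopen) and lineno set to 1, then call next_word() in a loop, in the same way that a Lex action reads yytext and yyleng after each match.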
Program:
%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
. {
    printf("hello world\n");
    exit(0);
}
Output:
Aim: Lex program for identifying and classifying the input as keywords,
identifiers, digits, or words.
Program:
%{
#include <stdio.h>
#include <string.h>
#include <ctype.h>
%}
keyword "if"|"else"|"for"|"while"
digit [0-9]+
identifier [a-zA-Z_][a-zA-Z0-9_]*
word [a-zA-Z]+
%%
{keyword} {
printf("Keyword: %s\n", yytext);
}
{digit} {
printf("Digit: %s\n", yytext);
}
{word} {
    /* Listed before {identifier}: every word is also a valid identifier, so
       this rule could never match if it came second. */
    printf("Word: %s\n", yytext);
}
{identifier} {
    printf("Identifier: %s\n", yytext);
}
[ \t\n]+ ; /* skip white spaces */
. {
printf("Unknown: %c\n", yytext[0]);
}
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Output:
Aim: Lex program for displaying a message when the Enter key is
pressed.
Program:
%{
#include <stdio.h>
%}
%%
\n {
printf("Enter key pressed!\n");
}
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Output:
Aim: Lex program for checking whether the input string contains all
lowercase characters, all uppercase characters, or a mixture of both.
Program:
%{
#include <stdio.h>
#include <string.h>
#include <ctype.h>
%}
alpha [a-zA-Z]+
%%
{alpha} {
int lower_count = 0;
int upper_count = 0;
for (int i = 0; i < strlen(yytext); ++i) {
if (islower(yytext[i])) {
++lower_count;
} else if (isupper(yytext[i])) {
++upper_count;
}
}
if (lower_count == strlen(yytext)) {
printf("All lowercase\n");
} else if (upper_count == strlen(yytext)) {
printf("All uppercase\n");
} else {
printf("Mixture of both\n");
}
}
[ \t\n]+ ; /* skip white spaces */
. { /* ignore other characters */ }
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Output:
Aim: Lex program for checking whether the input string contains both
consonants and vowels.
Program:
%{
#include <stdio.h>
#include <string.h>
#include <ctype.h>
%}
alpha [a-zA-Z]+
%%
{alpha} {
int vowel_count = 0;
int consonant_count = 0;
for (int i = 0; i < strlen(yytext); ++i) {
if (tolower(yytext[i]) == 'a' || tolower(yytext[i]) == 'e' ||
tolower(yytext[i]) == 'i' || tolower(yytext[i]) == 'o' ||
tolower(yytext[i]) == 'u') {
++vowel_count;
} else if (isalpha(yytext[i])) {
++consonant_count;
}
}
if (vowel_count > 0 && consonant_count > 0) {
printf("Contains both vowels and consonants\n");
} else {
printf("Does not contain both vowels and consonants\n");
}
}
[ \t\n]+ ; /* skip white spaces */
. { /* ignore other characters */ }
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Output:
Aim: Lex program for creating a Lexer that takes input from a text file
and counts the number of characters, number of lines, and other relevant
metrics.
Program:
%{
#include <stdio.h>
int char_count = 0;
int line_count = 0;
int word_count = 0;
int non_whitespace_count = 0;
%}
%%
\n {
++line_count;
++char_count;
}
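[^ \t\n]+ {
    ++word_count;
    char_count += yyleng;
    non_whitespace_count += yyleng;
}
. {
    ++char_count;
}
%%
int yywrap() {
    return 1;
}
/* The start of main() is missing from the source. Opening the file named by
   the first command-line argument and printing the totals are assumptions. */
int main(int argc, char *argv[]) {
    FILE *file;
    if (argc < 2 || (file = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Usage: %s <input-file>\n", argv[0]);
        return 1;
    }
    yyin = file;
    yylex();
    fclose(file);
    printf("Characters: %d\nLines: %d\nWords: %d\nNon-whitespace characters: %d\n",
           char_count, line_count, word_count, non_whitespace_count);
    return 0;
}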
Output:
Program:
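%{
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define SHIFT 3
/* The definition of caesar_cipher() is missing from the source. This version,
   which shifts each letter by SHIFT places within its own case, is an assumption. */
void caesar_cipher(char *text) {
    for (int i = 0; text[i] != '\0'; ++i) {
        if (isupper((unsigned char)text[i])) {
            text[i] = 'A' + (text[i] - 'A' + SHIFT) % 26;
        } else if (islower((unsigned char)text[i])) {
            text[i] = 'a' + (text[i] - 'a' + SHIFT) % 26;
        }
    }
}
%}
%%
[a-zA-Z]+ {
    caesar_cipher(yytext);
    printf("Encrypted: %s\n", yytext);
}
[ \t\n]+ ; /* skip white spaces */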
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Output:
Aim: Lex program for extracting single and multi-line comments from a
C program.
Program:
%{
#include <stdio.h>
%}
single_comment "//"[^\n]*
multi_comment "/*"([^*]|\*+[^*/])*\*+"/"
%%
{single_comment} {
    printf("Single line comment: %s\n", yytext);
}
{multi_comment} {
    printf("Multi line comment: %s\n", yytext);
}
.|\n ; /* discard everything that is not a comment */
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Output:
Program:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
char s[20], stack[100];
int main() {
    /* Parsing table for the grammar
       e -> tb, b -> +tb | epsilon, t -> fc, c -> *fc | epsilon, f -> i | (e).
       Rows: e, b, t, c, f. Columns: i, +, *, (, ), $.
       "n" marks an epsilon production; an empty string marks an error entry. */
    char m[5][6][4] = {
        {"tb", "",    "",    "tb",  "",  "" },
        {"",   "+tb", "",    "",    "n", "n"},
        {"fc", "",    "",    "fc",  "",  "" },
        {"",   "n",   "*fc", "",    "n", "n"},
        {"i",  "",    "",    "(e)", "",  "" }
    };
    int size[5][6] = {
        {2,0,0,2,0,0},
        {0,3,0,0,1,1},
        {2,0,0,2,0,0},
        {0,1,3,0,1,1},
        {1,0,0,3,0,0}
    };
    int i, j, k, n, str1, str2;
    printf("\n Enter the input string: ");
    scanf("%18s", s);
    strcat(s, "$");
    n = strlen(s);
    stack[0] = '$';
    stack[1] = 'e';
    i = 1;
    j = 0;
    printf("\nStack Input\n");
    printf("__________________\n");
    while (stack[i] != '$') {
        /* Print the current stack and the remaining input. */
        for (k = 0; k <= i; k++) printf("%c", stack[k]);
        printf(" ");
        for (k = j; k < n; k++) printf("%c", s[k]);
        printf("\n");
        str1 = str2 = -1;
        switch (stack[i]) {
            case 'e': str1 = 0; break;
            case 'b': str1 = 1; break;
            case 't': str1 = 2; break;
            case 'c': str1 = 3; break;
            case 'f': str1 = 4; break;
        }
        switch (s[j]) {
            case 'i': str2 = 0; break;
            case '+': str2 = 1; break;
            case '*': str2 = 2; break;
            case '(': str2 = 3; break;
            case ')': str2 = 4; break;
            case '$': str2 = 5; break;
        }
        if (str1 < 0) {
            /* A terminal is on top of the stack: it must match the input. */
            if (stack[i] != s[j]) {
                printf("\nERROR");
                exit(0);
            }
            i--;
            j++;
        } else if (str2 < 0 || m[str1][str2][0] == '\0') {
            printf("\nERROR");
            exit(0);
        } else if (m[str1][str2][0] == 'n') {
            i--; /* epsilon production: pop the nonterminal */
        } else if (m[str1][str2][0] == 'i') {
            stack[i] = 'i';
        } else {
            /* Replace the nonterminal with its right-hand side, pushed in reverse. */
            for (k = size[str1][str2] - 1; k >= 0; k--) {
                stack[i] = m[str1][str2][k];
                i++;
            }
            i--;
        }
    }
    if (s[j] != '$') {
        printf("\nERROR");
        exit(0);
    }
    printf("\n SUCCESS");
    return 0;
}
Output:
Program:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void push(char *, int *, char);
char stacktop(char *);
void isproduct(char, char);
int ister(char);
int isnter(char);
int isstate(char);
void error();
void isreduce(char, char);
char pop(char *, int *);
void printt(char *, int *, char [], int);
void rep(char [], int);
struct action {
char row[6][5];
};
const struct action A[12] = {
{"sf","emp","emp","se","emp","emp"},
{"emp","sg","emp","emp","emp","acc"},
{"emp","rc","sh","emp","rc","rc"},
Output: