Compiler-Design Notes

Compiler design notes

Uploaded by

Vivek Rawat

COMPILER DESIGN LAB

Lexical Analysis
Lexical analysis is the first phase of a compiler. It takes the source program as a stream of characters
and produces a sequence of tokens, a process also known as tokenization. Tokens can be classified into
identifiers, separators, keywords, operators, constants and special characters.
It performs three tasks:
• Tokenization: converts the stream of characters into tokens.
• Error messages: reports lexical errors such as an identifier exceeding the allowed length, an
unterminated string, etc.
• Eliminating comments and whitespace: removes comments, spaces, blank lines, new lines and indentation.
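
The error-reporting and comment-elimination tasks can be illustrated with a small hand-written C sketch. It is only an illustration under simplifying assumptions: the function name strip_comments_and_spaces is hypothetical, only /* ... */ comments are handled, and string literals are not treated specially.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Copies src into dst, dropping C-style block comments and collapsing each
 * run of whitespace into a single space. dst must hold at least
 * strlen(src) + 1 bytes. Returns 0 on success, or -1 if a comment is
 * opened but never closed (a typical lexical error). */
int strip_comments_and_spaces(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    int pending_space = 0;      /* a whitespace run is waiting to be emitted */

    for (size_t i = 0; src[i] != '\0'; ) {
        if (src[i] == '/' && src[i + 1] == '*') {
            i += 2;             /* skip the comment opener */
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] == '\0') {
                dst[out] = '\0';
                return -1;      /* unterminated comment */
            }
            i += 2;             /* skip the closing star-slash */
            pending_space = 1;  /* a comment separates tokens like a space */
        } else if (isspace((unsigned char)src[i])) {
            pending_space = 1;
            i++;
        } else {
            if (pending_space && out > 0)
                dst[out++] = ' ';
            pending_space = 0;
            dst[out++] = src[i++];
        }
    }
    dst[out] = '\0';
    return 0;
}
```

Calling it on "a  /* c */ b" leaves "a b" in dst, while an input such as "x /* open" returns -1, which a lexical analyzer would report as an error message.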

LEX
• Lex is a program that generates lexical analyzers. It is commonly used together with the YACC parser
generator.
• A lexical analyzer is a program that transforms an input stream into a sequence of tokens.
• Lex reads a specification of patterns and actions and produces C source code that implements the
lexical analyzer.

Lex works as follows:


• First, the specification of the lexical analyzer is written as a program lex.l in the Lex language.
The Lex compiler then translates lex.l into a C program, lex.yy.c.
• Next, the C compiler compiles lex.yy.c into an executable program, a.out.
• a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.

Lex file format


A Lex program is separated into three sections by %% delimiters. The format of a Lex source file is as
follows:

{ definitions }
%%
{ rules }
%%
{ user subroutines }

Definitions include declarations of constants, variables and regular definitions.


Rules are statements of the form p1 {action1} p2 {action2} ... pn {actionn},
where each pi is a regular expression and each actioni describes what the lexical analyzer should do
when pattern pi matches a lexeme.
User subroutines are auxiliary procedures needed by the actions. They can be compiled separately and
loaded with the lexical analyzer.
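
How a list of pi {actioni} rules is applied can be sketched in plain C. When several patterns match at the same position, Lex prefers the longest match and, on a tie, the rule listed first. The sketch below imitates only that selection step; the matcher functions, the three sample rules and the name pick_rule are hypothetical and are not what Lex itself generates.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* A matcher returns how many leading characters of s fit its pattern
 * (0 means no match). It plays the role of a pattern p_i. */
typedef size_t (*matcher_fn)(const char *s);

/* Pattern: if */
static size_t match_if(const char *s)
{
    return (s[0] == 'i' && s[1] == 'f') ? 2 : 0;
}

/* Pattern: [a-zA-Z_][a-zA-Z0-9_]* */
static size_t match_word(const char *s)
{
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;
    size_t n = 1;
    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;
    return n;
}

/* Pattern: [0-9]+ */
static size_t match_digits(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in the order they would appear in the rules section. */
static const matcher_fn rules[] = { match_if, match_word, match_digits };

/* Returns the index of the rule that wins at the start of s, or -1 if no
 * rule matches. The length of the matched lexeme is stored in *len. */
int pick_rule(const char *s, size_t *len)
{
    assert(s != NULL && len != NULL);
    int best = -1;
    *len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](s);
        if (n > *len) {         /* strictly longer, so earlier rules win ties */
            *len = n;
            best = (int)i;
        }
    }
    return best;
}
```

For the input "if(" rule 0 wins with length 2 (a tie with the word rule, resolved by order), while for "iffy" the word rule wins because it matches all four characters.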

MANIKA SINGH /B.Tech CSE / SEC-Q/ ROLL_NO-33



Q1) Design a LEX Code to count the number of lines, spaces, tab characters and all remaining
characters in a given input.

PROGRAM:-
%{
#include <stdio.h>
/* One counter per character class. */
int line = 0, space = 0, tab = 0, total_char = 0;
%}

%%
\n { line++; /* newline */ }
" " { space++; /* single space */ }
\t { tab++; /* tab */ }
. { total_char++; /* any other character */ }
%%

int main(void)
{
printf("Enter the text:\n");
yylex();
printf("number of lines : %d\n", line);
printf("number of spaces : %d\n", space);
printf("number of tabs : %d\n", tab);
printf("number of other characters : %d\n", total_char);
return 0;
}

int yywrap(void)
{
return 1;
}
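
For comparison, the same counting can be written by hand in C. This is only a sketch of what the four rules do; the struct counts type and the count_chars function are names assumed for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One field per rule of the Lex program. */
struct counts {
    int lines;
    int spaces;
    int tabs;
    int others;
};

/* Scans text once and classifies every character, mirroring the rules
 * \n, " ", \t and . of the Lex program. */
struct counts count_chars(const char *text)
{
    assert(text != NULL);
    struct counts c = {0, 0, 0, 0};
    for (size_t i = 0; text[i] != '\0'; i++) {
        switch (text[i]) {
        case '\n': c.lines++;  break;
        case ' ':  c.spaces++; break;
        case '\t': c.tabs++;   break;
        default:   c.others++; break;
        }
    }
    return c;
}
```

For the text "a b\tc\nde\n" the function reports 2 lines, 1 space, 1 tab and 5 other characters.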



Q2) Design a LEX Code to identify and print valid C/C++ identifiers in a given input pattern.

PROGRAM:-
%{
#include <stdio.h>
%}

%%
^[a-zA-Z_][a-zA-Z0-9_]*$ { printf("Valid Identifier: %s\n", yytext); }
.+ { printf("Invalid Identifier: %s\n", yytext); }
\n { /* skip the newline that ends each input line */ }
%%

int main(void)
{
printf("Enter any identifier you want to check: \n");
yylex();
return 0;
}

int yywrap(void)
{
return 1;
}
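
The identifier rule (a letter or underscore followed by any number of letters, digits or underscores) can also be checked by hand in C. The sketch below assumes a hypothetical function name, is_valid_identifier, and, like the Lex program, does not reject reserved keywords.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if s has the form [a-zA-Z_][a-zA-Z0-9_]* and 0 otherwise.
 * The empty string is not an identifier. */
int is_valid_identifier(const char *s)
{
    assert(s != NULL);
    /* The first character must be a letter or an underscore. */
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;
    /* Every following character must be a letter, digit or underscore. */
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!(isalnum((unsigned char)s[i]) || s[i] == '_'))
            return 0;
    }
    return 1;
}
```

With this check, "_count1" is accepted, while "9lives" and "a-b" are rejected.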



Q3) Design a LEX Code to identify and print integer and float values in a given input pattern.

PROGRAM:-
%{
#include <stdio.h>
%}

%%
[0-9]+ { printf("This is an integer number\n"); }
[0-9]*\.[0-9]+ { printf("This is a floating-point number\n"); }
.+ { printf("You have entered a wrong number\n"); }
\n { /* ignore line breaks */ }
%%

int main(void)
{
printf("Enter any number you want to check: \n");
yylex();
return 0;
}

int yywrap(void)
{
return 1;
}
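
The two number patterns, [0-9]+ for integers and [0-9]*\.[0-9]+ for floats (with the dot escaped so that it matches only a literal '.'), can be mirrored by a small hand-written C classifier. The enum num_kind type and the classify_number function are hypothetical names used only for this sketch, and signs and exponents are deliberately not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum num_kind { NUM_INVALID, NUM_INTEGER, NUM_FLOAT };

/* Classifies the whole string s:
 *   [0-9]+          gives NUM_INTEGER
 *   [0-9]*\.[0-9]+  gives NUM_FLOAT
 *   anything else   gives NUM_INVALID */
enum num_kind classify_number(const char *s)
{
    assert(s != NULL);
    size_t i = 0, int_digits = 0, frac_digits = 0;

    /* Digits before the optional decimal point. */
    while (isdigit((unsigned char)s[i])) {
        i++;
        int_digits++;
    }
    if (s[i] == '\0')
        return int_digits > 0 ? NUM_INTEGER : NUM_INVALID;
    if (s[i] != '.')
        return NUM_INVALID;
    i++;                        /* skip the decimal point */

    /* At least one digit must follow the decimal point. */
    while (isdigit((unsigned char)s[i])) {
        i++;
        frac_digits++;
    }
    return (s[i] == '\0' && frac_digits > 0) ? NUM_FLOAT : NUM_INVALID;
}
```

Here "123" is classified as an integer, "3.14" and ".5" as floats, and "5." or "1a2" as invalid.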



Q4) Design a LEX Code for tokenizing {Identify and print OPERATORS, SEPARATORS, KEYWORDS,
IDENTIFIERS}.

PROGRAM:-
%{
#include <stdio.h>
%}

%%
auto|double|int|struct|break|else|long|switch|case|enum|register|typedef|char|extern|return |
union|continue|for|signed|void|do|if|static|while|default|goto|sizeof|volatile|const|float|short|unsigned {printf("KEYWORD: %s\n", yytext);}
[{};,()] {printf("SEPARATOR: %s\n", yytext);}
[-+*/=%] {printf("OPERATOR: %s\n", yytext);}
[a-zA-Z_][a-zA-Z0-9_]* {printf("IDENTIFIER: %s\n", yytext);}
[0-9]+ {printf("DIGITS: %s\n", yytext);}
.|\n { /* ignore anything else */ }
%%

int yywrap(void)
{
return 1;
}

int main(void)
{
printf("Enter any program: \n");
yylex();
return 0;
}
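
The classification performed by these rules can be imitated in C for a lexeme that has already been cut out of the input. The sketch is only an illustration: the names token_kind, is_keyword and classify_lexeme are hypothetical, and the keyword table is assumed to be the 32 keywords of C89.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum token_kind {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_OPERATOR,
    TOK_SEPARATOR, TOK_NUMBER, TOK_UNKNOWN
};

/* Assumed keyword table: the 32 keywords of C89. */
static const char *const keywords[] = {
    "auto", "break", "case", "char", "const", "continue", "default", "do",
    "double", "else", "enum", "extern", "float", "for", "goto", "if",
    "int", "long", "register", "return", "short", "signed", "sizeof",
    "static", "struct", "switch", "typedef", "union", "unsigned", "void",
    "volatile", "while"
};

/* Linear search through the keyword table. */
static int is_keyword(const char *word)
{
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(word, keywords[i]) == 0)
            return 1;
    return 0;
}

/* Classifies one complete lexeme in the same order as the Lex rules:
 * keywords before identifiers, so that "int" is not reported as a name. */
enum token_kind classify_lexeme(const char *lexeme)
{
    assert(lexeme != NULL);
    if (lexeme[0] == '\0')
        return TOK_UNKNOWN;
    if (lexeme[1] == '\0') {                    /* single character */
        if (strchr("{};,()", lexeme[0]) != NULL)
            return TOK_SEPARATOR;
        if (strchr("+-*/=%", lexeme[0]) != NULL)
            return TOK_OPERATOR;
    }
    if (is_keyword(lexeme))
        return TOK_KEYWORD;

    size_t i = 0;
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        /* [a-zA-Z_][a-zA-Z0-9_]* */
        for (i = 1; isalnum((unsigned char)lexeme[i]) || lexeme[i] == '_'; i++)
            ;
        return lexeme[i] == '\0' ? TOK_IDENTIFIER : TOK_UNKNOWN;
    }
    /* [0-9]+ */
    while (isdigit((unsigned char)lexeme[i]))
        i++;
    return (i > 0 && lexeme[i] == '\0') ? TOK_NUMBER : TOK_UNKNOWN;
}
```

For example, "while" is classified as a keyword, "whilst" as an identifier, "+" as an operator, ";" as a separator and "42" as a number.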

