System Software Manual

The document describes a system software lab involving the use of Lex and Yacc tools. It outlines 10 programs to be executed using Lex that involve tasks like counting characters, recognizing expressions, and identifying keywords. It also lists 10 programs to be implemented using Yacc, including evaluating expressions, recognizing grammars, and validating code. The document then provides details on Lex including its file structure, rules, and regular expressions used. It also explains how to compile and run Lex programs.

Uploaded by

Ashwini SD


SYSTEM SOFTWARE LAB

Part A

Execution of the following programs using LEX:

1) Program to count the number of vowels and consonants in a given string.

2) Program to count the number of characters, words, spaces and lines in a given
input file.
3) Program to count the number of
a) positive and negative integers
b) positive and negative fractions
4) Program to count the number of comment lines in a given C program. Also
eliminate them and copy the program into a separate file.
5) Program to count the number of 'scanf' and 'printf' statements in a C
program and replace them with 'readf' and 'writef' statements respectively.
6) Program to recognize a valid arithmetic expression and identify the identifiers
and operators present.
7) Program to recognize whether a given sentence is simple or compound.
8) Program to recognize and count the number of identifiers in a given input file.
9) Write a lex program to identify the hyperlinks in the given input string.
10) Write a lex program to identify the capital strings in the given input string.

Part B

Execution of the following programs using YACC:

1) Program to test the validity of a simple expression involving the operators
+, -, * and /.
2) Program to recognize nested IF control statements and display the number of
levels of nesting.
3) Program to recognize the grammar a^n b, where n >= 0.
4) Program to recognize a valid variable, which starts with a letter, followed by
any number of letters or digits.
5) Program to evaluate an arithmetic expression involving the operators +, -, *
and /.
6) Program to recognize the strings 'aaab', 'abbb', 'ab' and 'a' using the grammar
a^m b^n, where m > 0 and n >= 0.
7) Program to recognize the grammar a^n b, where n >= 10.
8) Program to check the validity of simple if else statements.
9) Program to accept and print the name, salary and age of an employee.
10) Program to recognize the grammar a^m b^n, where m >= 0 and n > 2.

Lex and Yacc

Lex is a tool for building lexers (lexical analyzers). It takes an arbitrary input
stream and tokenizes it. The Lex utility generates C code that defines a yylex()
function, which can serve as the interface to Yacc. The man pages cover Lex in
detail; a practical approach to certain fundamentals is given here.
The general format of a Lex file consists of three sections:
1. Definitions
2. Rules
3. User Subroutines
The Definitions section holds any external C definitions used in the lex actions or
subroutines, e.g. preprocessor directives such as #include and #define macros. These
are simply copied to the lex.yy.c file. The other kind of definitions are Lex
definitions: substitution strings, start states and table size declarations. The
Rules section is the core of the file; it specifies the regular expressions and
their corresponding actions. The User Subroutines section holds the definitions of
the functions that are used in the Lex actions.

Things to remember:
1. Input that matches no regular expression is copied to the standard output.
2. When more than one rule matches, Lex chooses the longest match; if the matches
are the same length, it chooses the rule given first.
3. The matched text is held in yytext, whose length is yyleng.
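
The second point can be sketched in plain C. This is an illustration only, not code
that Lex generates: the function name pick_rule and the idea of passing in one match
length per rule are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Choose the winning rule as described above: the longest match wins and a
   tie goes to the rule listed first. match_len[i] is the number of characters
   rule i matched at the current position (0 means no match).
   Returns the rule index, or -1 if no rule matched. */
int pick_rule(const size_t *match_len, size_t nrules)
{
    assert(match_len != NULL || nrules == 0);
    int best = -1;
    size_t best_len = 0;
    for (size_t i = 0; i < nrules; i++) {
        /* strictly longer only, so an earlier rule keeps a tie */
        if (match_len[i] > best_len) {
            best_len = match_len[i];
            best = (int)i;
        }
    }
    return best;
}
```

For example, with a keyword rule "if" listed before an identifier rule, the input
"ifx" gives lengths 2 and 3, so the identifier rule wins; the input "if" gives 2 and
2, so the keyword rule wins because it comes first.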

Structure of Lex program

Definition Section
%%
Rules Section
%%
User Subroutines Section

1. Definition Section: It includes the literal block, definitions and start conditions.

i. Literal block: C code bracketed by the lines
%{
C code, declarations
%}
ii. Definition: allows us to give a name to all or part of an RE (regular expression),
which can then be referred to by name in the rules section.

2. Rules Section: Contains pattern lines and C code. The pattern is written as an RE,
and the C code, also called the action part, runs when the pattern matches. If the C
code exceeds one line, it must be enclosed in braces { }.

3. User Subroutine Section: This section includes routines called from the rules.
int main(void)
{
    yylex(); /* lexer or scanner */
    return 0;
}

A Lex specification is a set of patterns, the pattern part of the rules section, which
Lex matches against the input. Each time one of the patterns matches, the Lex
program invokes the C code in the action part of the rule, which acts on the
matched token.

Lex translates the specification into a file containing a C routine called yylex().
yylex() recognizes expressions in a stream and performs the specified action for
each expression as it is detected.

The pattern part of the rules section is written using regular expressions (REs). An
RE is a pattern description in a meta language. REs are composed of normal characters
and metacharacters.
The characters and metacharacters that form regular expressions are listed below
with their descriptions:

.    Matches any single character except the newline character "\n".

[]   Matches any one of the characters within the brackets; also called a character
class. If the first character is a circumflex "^", the meaning changes to match any
character except those within the brackets. A range of characters is indicated with "-".
Example:
1. [a-z0-9] is the character class containing all the lower case
letters and the digits.
2. [^ask] matches any character except a, s and k.

*    Matches zero or more of the preceding expression.
Ex: [A-Za-z][A-Za-z0-9]* => ap90, a1, z23, w, ... i.e. all alphanumeric strings
with a leading alphabetic character. This is a typical expression for recognizing
identifiers in computer languages.

+    Matches one or more of the preceding expression. Ex: a+ => a, aa, aaa, ...
[a-z]+ is all non-empty strings of lower case letters. [ab]+ => a, b, ab, ba, abab, ...
(any non-empty string of a's and b's); use (ab)+ to get ab, abab, ababab, ...

?    Marks an optional element: matches zero or one occurrence of the preceding RE.
Ex: ab?c matches either ac or abc; here b is optional.

$    As the very last character of an RE, matches the end of a line. Ex: ab$ matches
ab only when it is immediately followed by a newline.

{}   Specify either repetitions (if they enclose numbers) or definition expansion (if
they enclose a name). Ex: {digit} looks up the definition named digit and inserts
it at that point in the expression. A{1,5} matches 1 to 5 occurrences of A. With one
or two numbers inside, the braces indicate how many times the previous RE is allowed
to match.

|    Indicates alternation: matches either the preceding RE or the following RE.
Ex: (ab|cd) matches either ab or cd.

()   Groups a series of REs together into a new RE. Ex: (ab|cd+)?(ef)* matches
strings such as abefef, efef, cdef and cddd.

".." Interprets everything within the quotation marks literally. Metacharacters
other than C escape sequences lose their meaning. Ex: "/*" matches the two characters
/ and *.

^    As the first character of an RE, matches the beginning of a line. Also used for
negation within [].

\    Used to escape metacharacters. If the following character is a lower case
letter, then it is a C escape sequence such as \t, \n etc.

/    Matches the preceding RE, but only if it is followed by the following RE. Ex: 0/1
matches '0' in the string '01' but does not match anything in the strings '0' or '02'.
Only one slash is permitted per pattern.

<>   A name or list of names in angle brackets at the beginning of a pattern makes
that pattern apply only in the given start states.
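
Several of the operators above can be tried out from C with the POSIX <regex.h>
functions. This is only a sketch under stated assumptions: POSIX extended regular
expressions are not lex patterns (there is no {name} expansion, no "..." quoting and
no trailing context with /), but ? + * | () [] and {m,n} behave the same way for the
examples in the list. The helper name full_match is made up for this illustration,
and the caller wraps the pattern in ^( ... )$ so that the whole string must match.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches the POSIX extended regular expression pattern,
   0 if it does not, and -1 if the pattern cannot be compiled. */
int full_match(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&re, text, 0, NULL, 0);   /* 0 means a match was found */
    regfree(&re);
    return rc == 0;
}
```

With this helper, full_match("^(ab?c)$", "ac") and full_match("^(ab?c)$", "abc") both
return 1, while full_match("^(ab?c)$", "abbc") returns 0.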

Commands to compile and execute lex programs:

Lex programs must be stored in a file with the .l extension. Compiling a lex program
takes two steps.
1. The Lex source must be turned into a program in the host general-purpose
language, C, using the command
$ lex filename.l
The lex compiler generates a C file called lex.yy.c. The literal block, the action
part of the rules section and the user subroutine section of the lex program,
which contain valid C statements, are copied as they are into lex.yy.c. This C
file contains the lexer, yylex(). When the lex scanner runs, it matches the input
against the patterns in the rules section. Every time it finds a match, it executes
the C code associated with the pattern. When nothing matches, lex copies the
unmatched input to the output. Lex executes the action for the longest possible
match for the current input.
2. The C file is compiled with the C compiler and linked, usually with a
library of lex subroutines. The command is
$ cc lex.yy.c -ll
where -ll is the loader flag that links the lex library. The resulting program is
placed in the usual file a.out for later execution. Alternatively, we can name the
executable ourselves with
$ cc lex.yy.c -o filename -ll
where filename is the executable file. To end the input while the program is
running, press Ctrl+D.

Lex source program (filename.l) --> Lex compiler --> lex.yy.c

lex.yy.c --> C compiler --> a.out

Input stream --> a.out --> Sequence of tokens

Fig : Creating a Lexical Analyzer with Lex

Lex Practice

Metacharacter   Matches

.               any character except newline
\n              newline
*               zero or more copies of the preceding expression
+               one or more copies of the preceding expression
?               zero or one copy of the preceding expression
^               beginning of line
$               end of line
a|b             a or b
(ab)+           one or more copies of ab (grouping)
"a+b"           literal "a+b" (C escapes still work)
[]              character class

Table 1: Pattern Matching Primitives

Expression      Matches

abc             abc
abc*            ab, abc, abcc, abccc, ...
abc+            abc, abcc, abccc, abcccc, ...
a(bc)+          abc, abcbc, abcbcbc, ...
a(bc)?          a, abc
[abc]           one of: a, b, c
[a-z]           any letter, a through z
[a\-z]          one of: a, -, z
[-az]           one of: -, a, z
[A-Za-z0-9]+    one or more alphanumeric characters
[ \t\n]+        whitespace
[^ab]           anything except: a, b
[a^b]           one of: a, ^, b
[a|b]           one of: a, |, b
a|b             one of: a, b

Table 2: Pattern Matching Examples

Regular expressions in lex are composed of metacharacters (Table 1). Pattern-matching
examples are shown in Table 2. Within a character class, normal operators lose
their meaning. Two operators allowed in a character class are the hyphen ("-") and
circumflex ("^"). When used between two characters, the hyphen represents a range of
characters. The circumflex, when used as the first character, negates the expression. If
two patterns match the same string, the longest match wins. In case both matches are
the same length, then the first pattern listed is used.

... definitions ...

%%
... rules ...
%%
... subroutines ...

Input to Lex is divided into three sections, with %% dividing the sections. This is
best illustrated by example. The first example is the shortest possible lex file:

%%

Input is copied to output, one character at a time. The first %% is always required, as
there must always be a rules section. However, if we don't specify any rules, then the
default action is to match everything and copy it to output. Defaults for input and
output are stdin and stdout, respectively. Here is the same example, with defaults
explicitly coded:

%%
    /* match everything except newline */
.   ECHO;
    /* match newline */
\n  ECHO;

%%

int yywrap(void) {
    return 1;
}

int main(void) {
    yylex();
    return 0;
}

Two patterns have been specified in the rules section. Each pattern must begin in
column one. This is followed by whitespace (space, tab or newline), and an optional
action associated with the pattern. The action may be a single C statement, or multiple C
statements enclosed in braces. Anything not starting in column one is copied verbatim
to the generated C file. We may take advantage of this behavior to specify comments
in our lex file. In this example there are two patterns, "." and "\n", with an ECHO
action associated with each pattern. Several macros and variables are predefined by lex.
ECHO is a macro that writes the text matched by the pattern. This is the default action
for any unmatched strings. Typically, ECHO is defined as:

#define ECHO fwrite(yytext, yyleng, 1, yyout)

Variable yytext is a pointer to the matched string (null-terminated), and yyleng is
the length of the matched string. Variable yyout is the output file, and defaults to
stdout. Function yywrap is called by lex when input is exhausted. Return 1 if you are
done, or 0 if more processing is required. Every C program requires a main function.
In this case, we simply call yylex, the main entry point for lex. Some implementations
of lex include copies of main and yywrap in a library, eliminating the need to code
them explicitly. This is why our first example, the shortest lex program, functioned
properly.

Name                Function

int yylex(void)     call to invoke lexer, returns token
char *yytext        pointer to matched string
yyleng              length of matched string
yylval              value associated with token
int yywrap(void)    wrapup, return 1 if done, 0 if not done
FILE *yyout         output file
FILE *yyin          input file
INITIAL             initial start condition
BEGIN condition     switch start condition
ECHO                write matched string

Table 3: Lex Predefined Variables

Here is a program that does nothing at all. All input is matched, but no action is
associated with any pattern, so there will be no output.

%%
.
\n

The following example prepends line numbers to each line in a file. Some
implementations of lex predefine and calculate yylineno. The input file for lex is
yyin, and defaults to stdin.

%{
int yylineno;
%}
%%
^(.*)\n printf("%4d\t%s", ++yylineno, yytext);
%%
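int main(int argc, char *argv[]) {
    /* the file to number is expected as the first argument */
    if (argc < 2 || (yyin = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }
    yylex();
    fclose(yyin);
    return 0;
}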
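
For comparison, the effect of the rule ^(.*)\n can be written by hand in C. The sketch
below is an assumption-laden stand-in, not what lex generates: the function name
number_lines and the string-in, string-out interface are invented for the example, and
a final line without a trailing newline is simply left out, whereas the lex program
would echo it unnumbered through the default rule.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy each newline-terminated line of `in` to `out`, prefixed with a
   4-column line number and a tab, like printf("%4d\t%s", ++yylineno, yytext).
   Returns the number of lines written, or -1 if `out` (capacity cap) is too
   small. */
int number_lines(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);
    size_t used = 0;
    int lineno = 0;
    out[0] = '\0';
    while (*in != '\0') {
        const char *nl = strchr(in, '\n');
        if (nl == NULL)
            break;                        /* unterminated tail: this sketch stops here */
        int len = (int)(nl - in) + 1;     /* line length including the '\n' */
        int w = snprintf(out + used, cap - used, "%4d\t%.*s", lineno + 1, len, in);
        if (w < 0 || (size_t)w >= cap - used) {
            out[used] = '\0';             /* drop the partial line */
            return -1;
        }
        used += (size_t)w;
        lineno++;
        in = nl + 1;
    }
    return lineno;
}
```

Called on the text "alpha\nbeta\n", the function fills the buffer with two numbered
lines and returns 2.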

The definitions section is composed of substitutions, code, and start states. Code in
the definitions section is simply copied as-is to the top of the generated C file, and
must be bracketed with "%{" and "%}" markers. Substitutions simplify pattern-matching
rules. For example, we may define digits and letters:

digit [0-9]
letter [A-Za-z]
%{
int count;
%}
%%
/* match identifier */
{letter}({letter}|{digit})* count++;
%%
int main(void) {
yylex();
printf("number of identifiers = %d\n", count);
return 0;
}

Whitespace must separate the defining term and the associated expression. References
to substitutions in the rules section are surrounded by braces ({letter}) to distinguish
them from literals. When we have a match in the rules section, the associated C code
is executed. Here is a scanner that counts the number of characters, words, and lines
in a file (similar to Unix wc):

%{
int nchar, nword, nline;
%}
%%
\n { nline++; nchar++; }
[^ \t\n]+ { nword++, nchar += yyleng; }
. { nchar++; }
%%
int main(void) {
yylex();
printf("%d\t%d\t%d\n", nchar, nword, nline);
return 0;
}
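
To see what the three rules count, here is a hand-written C equivalent. It is a
sketch for checking results by hand, not generated code; the names struct counts and
count_text are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct counts { int nchar, nword, nline; };

/* Apply the three scanner rules to a string, one match at a time. */
struct counts count_text(const char *s)
{
    assert(s != NULL);
    struct counts c = {0, 0, 0};
    size_t i = 0;
    while (s[i] != '\0') {
        if (s[i] == '\n') {                          /* rule: \n */
            c.nline++;
            c.nchar++;
            i++;
        } else if (s[i] != ' ' && s[i] != '\t') {    /* rule: [^ \t\n]+ */
            size_t start = i;
            while (s[i] != '\0' && s[i] != ' ' && s[i] != '\t' && s[i] != '\n')
                i++;                                 /* longest run of word characters */
            c.nword++;
            c.nchar += (int)(i - start);             /* same as nchar += yyleng */
        } else {                                     /* rule: .  (a space or tab here) */
            c.nchar++;
            i++;
        }
    }
    return c;
}
```

For the input "hello world\nbye\n" this gives 16 characters, 3 words and 2 lines.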

Yacc (Yet Another Compiler Compiler)

Yacc provides a general tool for imposing structure on the input to a computer
program. The Yacc utility generates the function yyparse(), which is the parser.
Yacc accepts a context-free LALR(1) grammar and generates a bottom-up (shift-reduce)
parser. The general format of a Yacc file is very similar to that of a Lex file:

1. Declarations
2. Grammar Rules
3. Subroutines
In the Declarations section, apart from legal C declarations, there are a few
Yacc-specific declarations, each beginning with a % sign.

1. %union       Defines the stack type for the parser. It is a union of the
                various data types, structures or objects that values can have.

2. %token       Declares the terminals returned by the yylex function to yacc.
                A token can also have a type associated with it for type
                checking and syntax-directed translation. The type of a token
                can be specified as %token <stack member> tokenName.

3. %type        Specifies the type of a non-terminal symbol in the grammar
                rules. The format is %type <stack member> non-terminal.

4. %nonassoc    Specifies that a terminal symbol has no associativity.

5. %left        Specifies the left associativity of a terminal symbol.

6. %right       Specifies the right associativity of a terminal symbol.

7. %start       Specifies the L.H.S non-terminal symbol of a production rule
                which should be taken as the starting point of the grammar
                rules.

8. %prec        Changes the precedence level associated with a particular rule
                to that of the following token name or literal.

The grammar rules are specified as follows:
Context-free grammar production:
    p -> AbC
Yacc rule:
    p : A b C { /* C actions */ }
The general style for coding the rules is to have all terminals in upper case and all
non-terminals in lower case.
To facilitate syntax-directed translation, Yacc provides pseudo-variables, which
form a bridge between the values of terminals and non-terminals and the actions.
These pseudo-variables are $$, $1, $2, $3, ... The $$ is the L.H.S
value of the rule, whereas $1 is the first R.H.S value of the rule, $2 the second,
and so on. The default type for pseudo-variables is integer unless another type is
specified with %type, %token <type> etc.

Structure of Yacc program

Declaration section
%%
Rules section
%%
User subroutine section

Declaration/Definition section:
. Includes declarations of the tokens used in the grammar. It can also include a
literal block, C code enclosed in
%{
%}
. Includes %token, %union, %start, %type, %left, %right, and %nonassoc
declarations.
Rules section:
. Contains the grammar rules and actions containing C code.
. Each rule starts with a non-terminal symbol and a colon, followed by a
possibly empty list of symbols or tokens and actions. Ex: e : e '+' e
. Blanks, tabs and newlines are ignored, except that they may not appear in
names or multi-character reserved symbols.
User Subroutine section:
. Yacc copies the contents of this section verbatim to the C file.
. Typically includes routines called from the actions.
int main(void)
{
    yyparse(); /* parser */
    return 0;
}

Compiling and executing Yacc programs:
Yacc programs must be stored in a file with the .y extension. Compiling a Yacc
program takes two steps.

1. The Yacc source must be turned into a program in the host general-purpose
language, C, using the command
$ yacc -d filename.y
(-d also writes the token definitions to the header file y.tab.h). The yacc
compiler generates a C file called y.tab.c. The literal block, the action part of
the rules section and the user subroutine section of the Yacc program, which
contain valid C statements, are copied as they are into y.tab.c. This C file
contains the parser, yyparse(). When the parser runs, it repeatedly calls yylex(),
the lexical analyzer, which supplies tokens to yacc as and when required. If an
error is detected, yyparse() returns the value 1. If the lexical analyzer returns
the end-marker token and the parser accepts, yyparse() returns the value 0.
2. The C files are compiled with the C compiler and linked, usually with the
libraries of yacc and lex subroutines. First run the lex program through lex as
usual, which generates the C file lex.yy.c, then run the Yacc program through
yacc, which generates the C file y.tab.c. Now compile both C files:
$ cc lex.yy.c y.tab.c -ll -ly
where -ly is the loader flag that links the Yacc library.
The resulting program is placed in the usual file a.out for later execution. To
end the input while the program is running, press Ctrl+D.

Yacc specification --> Yacc compiler --> y.tab.c

y.tab.c --> C compiler --> a.out

Input stream --> a.out --> Sequence of tokens

Fig : Parser construction using yacc

Special characters and Library routines:

1. yylex() => The scanner/lexer created by Lex has the entry point yylex(). It
scans the input. All code in the rules section is copied into yylex().

2. yytext => Whenever the lexer matches a token, the text of the token is
stored in the null-terminated string yytext. Its contents are replaced each
time a new token is matched.

3. yywrap() => When the lexer encounters an end of file, it calls the routine
yywrap() to find out what to do next. If yywrap() returns 0, the scanner
continues scanning; if it returns 1, the scanner returns a zero token to report
end of file.

4. yyin, yyout => The input and output files of lex. They default to stdin and
stdout.

5. ECHO => Writes the token to the current output file yyout. Equivalent to
fprintf(yyout, "%s", yytext);

6. input() => Provides the next character to the lexer. Also yyinput().

7. output() => Writes its argument to the output file yyout, i.e. putc(c, yyout).
Also yyoutput() in some implementations.

8. unput() => Returns a character to the input stream. Also yyunput().

9. yyleng => Stores the length of yytext. Same as strlen(yytext).

10. yyless() => yyless(n) keeps the first n characters of the token and pushes
the rest back onto the input.

11. yymore() => Tells the lexer to append the next token to the current one
instead of replacing it.

12. yyparse() => The entry point to the yacc-generated parser. Returns zero on
success and non-zero on failure.

13. yyerror() => Simple error reporting routine, yyerror(char *msg).

14. % => Used to introduce declarations such as %token, %start, %type, %left,
%right, %union.

15. $ => Introduces a value reference in actions. Ex: $3 refers to the value of
the third symbol on the RHS of the rule; for the rule e : e '+' e applied to
12+89, $3 refers to the value 89.

16. ' => Used to define literal tokens. Ex: '+', '-', ...

17. ; => Each rule in the rules section ends with a semicolon.

18. | => Specifies an alternative RHS for the same LHS in a rule. Ex:
e : e '+' e | e '-' e | e '*' e.

19. : => Used to separate the LHS and RHS of a rule.

20. %token => Declares the symbols that the lexer passes to the parser. The parser
calls yylex(), which returns the tokens it requires. All tokens must be
explicitly declared in the definition section.

21. %left, %right, %nonassoc => Explicit means of specifying left, right, and
no associativity.

22. %start <rule name> => Specifies the rule with which the parser should start.

23. %prec => Changes the precedence level associated with a particular
grammar rule. Ex: unary minus may be given the highest level of precedence,
whereas binary minus has a lower level of precedence.

24. %s or %x => Declares a start condition (in Lex).

25. %union => Symbol values may have more than one type; for example, expressions
may have double values, and the lexer should then return the value of the variable
as a double. %union declares the set of possible types.

26. %type => Sets the type for non-terminals. Ex: %union { double dval; }
%type <dval> expression

27. YYABORT => Causes yyparse() to return immediately with a non-zero
value (failure).

28. YYACCEPT => Causes yyparse() to return immediately with zero (success).

Yacc Practice, Part I

... definitions ...

%%
... rules ...
%%
... subroutines ...

Input to yacc is divided into three sections. The definitions section consists of token
declarations, and C code bracketed by "%{" and "%}". The BNF grammar is placed
in the rules section, and user subroutines are added in the subroutines section.

This is best illustrated by constructing a small calculator that can add and subtract
numbers. We'll begin by examining the linkage between lex and yacc. Here is the
definitions section for the yacc input file:

%token INTEGER

This definition declares an INTEGER token. When we run yacc with the -d option, it
generates a parser in file y.tab.c, and also creates an include file, y.tab.h:

#ifndef YYSTYPE
#define YYSTYPE int
#endif
#define INTEGER 258
extern YYSTYPE yylval;

Lex includes this file and utilizes the definitions for token values. To obtain tokens,
yacc calls yylex. Function yylex has a return type of int, and returns the token. Values
associated with the token are returned by lex in variable yylval. For example,

[0-9]+ {
yylval = atoi(yytext);
return INTEGER;
}

would store the value of the integer in yylval, and return token INTEGER to yacc.
The type of yylval is determined by YYSTYPE. Since the default type is integer, this
works well in this case. Token values 0-255 are reserved for character values. For
example, if you had a rule such as

[-+] return *yytext; /* return operator */

the character value for minus or plus is returned. Note that we placed the minus sign
first so that it wouldn't be mistaken for a range designator. Generated token values
typically start around 258, as yacc reserves several values for end-of-file and error
processing. Here is the complete lex input specification for our calculator:

%{
#include "y.tab.h"
#include <stdlib.h>
void yyerror(char *);
%}
%%

[0-9]+ {
yylval = atoi(yytext);
return INTEGER;
}

[-+\n] return *yytext;

[ \t] ; /* skip whitespace */

. yyerror("invalid character");

%%

int yywrap(void) {
return 1;
}
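
The same token scheme can be followed in a hand-written scanner, which makes the
return values easy to inspect. This is only a sketch: the names next_token and
yylval_sketch are invented for the example, the value 258 for INTEGER is the one
shown in the y.tab.h listing above, and an invalid character is reported by
returning -1 rather than by calling yyerror and continuing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { INTEGER = 258 };   /* above 255, so it cannot collide with a character code */

int yylval_sketch;        /* stands in for yylval */

/* Scan one token starting at *pp and advance *pp past it.
   Returns INTEGER, the character code of '+', '-' or '\n',
   0 at end of input, or -1 for any other character. */
int next_token(const char **pp)
{
    assert(pp != NULL && *pp != NULL);
    const char *p = *pp;
    int tok;
    while (*p == ' ' || *p == '\t')                 /* rule: [ \t] ; */
        p++;
    if (*p == '\0') {
        tok = 0;
    } else if (isdigit((unsigned char)*p)) {        /* rule: [0-9]+ */
        int v = 0;
        while (isdigit((unsigned char)*p))
            v = v * 10 + (*p++ - '0');
        yylval_sketch = v;
        tok = INTEGER;
    } else if (*p == '+' || *p == '-' || *p == '\n') {   /* rule: [-+\n] */
        tok = (unsigned char)*p++;
    } else {                                        /* rule: . */
        p++;
        tok = -1;
    }
    *pp = p;
    return tok;
}
```

Scanning "12 + 7\n" yields INTEGER (value 12), '+', INTEGER (value 7), '\n' and then
0 for end of input.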

Internally, yacc maintains two stacks in memory; a parse stack and a value stack. The
parse stack contains terminals and nonterminals, and represents the current parsing
state. The value stack is an array of YYSTYPE elements, and associates a value with
each element in the parse stack. For example, when lex returns an INTEGER token,
yacc shifts this token to the parse stack. At the same time, the corresponding yylval is
shifted to the value stack. The parse and value stacks are always synchronized, so
finding a value related to a token on the stack is easily accomplished. Here is the yacc
input specification for our calculator:

%{
#include <stdio.h>
int yylex(void);
void yyerror(char *);
%}
%token INTEGER

%%

program:
        program expr '\n'    { printf("%d\n", $2); }
        |
        ;

expr:
        INTEGER              { $$ = $1; }
        | expr '+' expr      { $$ = $1 + $3; }
        | expr '-' expr      { $$ = $1 - $3; }
        ;

%%

void yyerror(char *s) {
    fprintf(stderr, "%s\n", s);
}

int main(void) {
    yyparse();
    return 0;
}

The rules section resembles the BNF grammar discussed earlier. The left-hand side of
a production, or nonterminal, is entered left-justified, followed by a colon. This is
followed by the right-hand side of the production. Actions associated with a rule are
entered in braces.

By utilizing left-recursion, we have specified that a program consists of zero or more
expressions. Each expression terminates with a newline. When a newline is detected,
we print the value of the expression. When we apply the rule

expr: expr '+' expr { $$ = $1 + $3; }

we replace the right-hand side of the production in the parse stack with the left-hand
side of the same production. In this case, we pop "expr '+' expr" and push "expr".
We have reduced the stack by popping three terms off the stack, and pushing back one
term. We may reference positions in the value stack in our C code by specifying "$1"
for the first term on the right-hand side of the production, "$2" for the second, and so
on. "$$" designates the top of the stack after reduction has taken place. The above
action adds the value associated with two expressions, pops three terms off the value
stack, and pushes back a single sum. Thus, the parse and value stacks remain
synchronized.

Numeric values are initially entered on the stack when we reduce from INTEGER to
expr. After INTEGER is shifted to the stack, we apply the rule

expr: INTEGER { $$ = $1; }

The INTEGER token is popped off the parse stack, followed by a push of expr. For
the value stack, we pop the integer value off the stack, and then push it back on again.
In other words, we do nothing. In fact, this is the default action, and need not be
specified. Finally, when a newline is encountered, the value associated with expr is
printed.

In the event of syntax errors, yacc calls the user-supplied function yyerror. If you
need to modify the interface to yyerror, you can alter the canned file that yacc
includes to fit your needs. The last function in our yacc specification is main, in case
you were wondering where it was. This example still has an ambiguous grammar.
Yacc will issue shift-reduce warnings, but will still process the grammar using shift as
the default operation.
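
One visible consequence, assuming each conflict is resolved by shifting as described,
is that a chain such as 10 - 4 - 3 groups from the right. The two helper functions
below are hypothetical and only contrast the two groupings; they do not parse
anything.

```c
#include <assert.h>
#include <stddef.h>

/* Left-associative reading: ((v0 - v1) - v2) - ... */
int subtract_left(const int *v, size_t n)
{
    assert(v != NULL && n > 0);
    int acc = v[0];
    for (size_t i = 1; i < n; i++)
        acc -= v[i];
    return acc;
}

/* Right-associative reading: v0 - (v1 - (v2 - ...)),
   which is what always shifting on a conflict produces. */
int subtract_right(const int *v, size_t n)
{
    assert(v != NULL && n > 0);
    int acc = v[n - 1];
    for (size_t i = n - 1; i-- > 0; )
        acc = v[i] - acc;
    return acc;
}
```

For the operands 10, 4 and 3, the left-associative reading gives 3 while the
right-associative reading gives 9, so the ambiguity affects results and is worth
removing.
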

Yacc Practice, Part II

In this section we will extend the calculator from the previous section to incorporate
some new functionality. New features include the arithmetic operators multiply and
divide. Parentheses may be used to override operator precedence, and single-character
variables may be specified in assignment statements. The following illustrates sample
input and calculator output:

user: 3 * (4 + 5)
calc: 27
user: x = 3 * (4 + 5)
user: y = 5
user: x
calc: 27
user: y
calc: 5
user: x + 2*y
calc: 37

The lexical analyzer returns VARIABLE and INTEGER tokens. For variables, yylval
specifies an index to sym, our symbol table. For this program, sym merely holds
the value of the associated variable. When INTEGER tokens are returned, yylval
contains the number scanned. Here is the input specification for lex:

%{
#include <stdlib.h>
#include "y.tab.h"
void yyerror(char *);
%}

%%

    /* variables */
[a-z]       {
                yylval = *yytext - 'a';
                return VARIABLE;
            }

    /* integers */
[0-9]+      {
                yylval = atoi(yytext);
                return INTEGER;
            }

    /* operators */
[-+()=/*\n] { return *yytext; }

    /* skip whitespace */
[ \t]       ;

    /* anything else is an error */
.           yyerror("invalid character");

%%

int yywrap(void) {
    return 1;
}

The input specification for yacc follows. The tokens for INTEGER and VARIABLE
are utilized by yacc to create #defines in y.tab.h for use in lex. This is followed by
definitions for the arithmetic operators. We may specify %left, for left-associative, or
%right, for right-associative. The last definition listed has the highest precedence.
Thus, multiplication and division have higher precedence than addition and
subtraction. All four operators are left-associative. Using this simple technique, we
are able to disambiguate our grammar.

%token INTEGER VARIABLE

%left '+' '-'
%left '*' '/'

%{
#include <stdio.h>
void yyerror(char *);
int yylex(void);
int sym[26];
%}

%%

program:
program statement '\n'
|
;

statement:
expr { printf("%d\n", $1); }
| VARIABLE '=' expr { sym[$1] = $3; }
;

expr:
INTEGER
| VARIABLE { $$ = sym[$1]; }
| expr '+' expr { $$ = $1 + $3; }
| expr '-' expr { $$ = $1 - $3; }
| expr '*' expr { $$ = $1 * $3; }
| expr '/' expr { $$ = $1 / $3; }
| '(' expr ')' { $$ = $2; }
;

%%
void yyerror(char *s) {
fprintf(stderr, "%s\n", s);
}

int main(void) {
yyparse();
return 0;
}
