Compiler Construction
. ;
printf ("\n\n Total vowels is %d", vc);
return (0);
}
%%
main()
{
printf ("enter how many numbers");
yylex();
}
yyerror()
{
printf ("error");
}
yywrap()
{
return (1);
}

Now compile this program as:
$ lex sum.l
$ cc lex.yy.c -ll
$ ./a.out
enter how many numbers
3
enter number 10
enter number 20
enter number 30
→ sum is 60

We can also change the a.out file name, that is, we can choose our own executable file name.
$ lex sum.l
$ cc lex.yy.c -o sumout -ll
$ ./sumout
enter how many numbers
2
enter number 5
enter number 10
sum is 15.

%%
main()
{
printf ("enter input \n");
yylex();
}

Lexical Analysis (Scanner)

Example10: A lex program to count the number of vowels and the number of words per line, and the total number of lines ending with '.'.
Solution:
%{
#include <stdio.h>
int lcnt = 0, vcnt = 0, wno = 0;
%}
%%
[.]+ { lcnt++; wno++;
printf ("lcnt = %d vowel = %d word = %d \n", lcnt, vcnt, wno);
vcnt = 0; wno = 0; }
[ ]+ { wno++; }
[aeiouAEIOU] { vcnt++; }
. ;
%%
main()
{
printf ("Enter input : \n");
yylex();
printf (" \n Total lines = %d", lcnt);
}
Example12: A lex program to find the factorial of a given number.
Solution:
%{
#include <stdio.h>
#include <stdlib.h>
int i, fact, n;
%}
%%
[0-9]+ { n = atoi (yytext);
fact = 1; /* start from 1, otherwise the product stays 0 */
for (i=1; i<=n; i++)
fact = fact * i;
printf ("Factorial of number %d is %d\n", n, fact);
return (0); }
%%
main()
{ printf ("\n Enter number");
yylex();
}
$ lex fact.l
$ cc lex.yy.c -ll
$ ./a.out
Output:
Enter number
5
Factorial of number 5 is 120
Enter number
4
Factorial of number 4 is 24.

Example13: A lex program to find the sum of the first n numbers.
Solution:
%{
#include <stdio.h>
#include <stdlib.h>
int i, x, sum = 0;
%}
%%
[0-9]+ {
x = atoi (yytext);
for (i=1; i<=x; i++)
{
sum = sum + i;
printf ("%d ", i);
}
printf ("\nThe sum of first %d numbers is %d\n", x, sum);
return (0);
}
%%
main()
{
printf ("Enter number \n");
yylex();
}
$ lex sum.l
$ cc lex.yy.c -o sumofn -ll
Output:
$ ./sumofn
Enter number
5
1 2 3 4 5
The sum of first 5 numbers is 15
$ ./sumofn
Enter number
10
1 2 3 4 5 6 7 8 9 10
The sum of first 10 numbers is 55.

A lex program that returns the tokens id, if and for.
Solution:
%{
#include <stdio.h>
/* "if" and "for" are C keywords, so the tokens are returned as named codes */
enum { ID = 1, IF, FOR };
%}
%%
[iI][fF] {return IF;}
[Ff][Oo][Rr] {return FOR;}
[a-zA-Z][a-zA-Z0-9]* {return ID;}
%%
main()
{
printf ("\n Enter word");
yylex();
}
The keyword rules are listed before the identifier rule because, when two rules match text of the same length, lex chooses the rule that appears first.

Recursive Descent Parsing (RDP): A parsing technique that uses a set of recursive procedures to recognize its input without backtracking.
Handle
During reduction, the substring of a sentential form that matches the RHS of a production rule, and whose reduction is one step in the reverse of a rightmost derivation, is called a "handle".

Annotated parse tree
A parse tree that shows the values of the attributes at each node is called an annotated parse tree.

Syntax-directed translation (SDT)
Syntax-directed translation fundamentally works by adding actions to the productions of a context-free grammar, resulting in a Syntax-Directed Definition (SDD).
Syntax-directed translation refers to a method of compiler implementation where the source language translation is completely driven by the parser.
The main idea behind syntax-directed translation is that the semantics, or meaning, of the program is closely tied to its syntax.

S-attributed
An SDD is S-attributed if every attribute is synthesized.
For an S-attributed SDD, the attributes are evaluated in bottom-up order of the nodes of the parse tree, that is, by a postorder traversal of the parse tree.
Since bottom-up parsing corresponds to a postorder traversal, S-attributed definitions can be implemented during bottom-up (LR) parsing.
Synthesized attributes can be evaluated by a bottom-up parser as the input is being parsed.
L-Attributed
L-attributed definitions contain both synthesized and inherited attributes, but no dependency graph needs to be built in order to evaluate them.
The idea behind L-attributed definitions is that, between the attributes associated with a production body, dependency-graph edges can go from left to right, but not from right to left (hence "L-attributed").
The class of syntax-directed definitions whose attributes can always be evaluated in depth-first order is called L-attributed definitions.

Syntax-directed definition (SDD)
A context-free grammar in which the productions are shown along with their associated semantic rules is called a syntax-directed definition (SDD).
An SDD is a context-free grammar together with attributes and rules, where attributes are associated with grammar symbols and rules are associated with productions.
If S is a symbol and a is one of its attributes, then we write S.a for the value of a at a particular node of the tree labeled S.

There are two classes of SDDs used to construct translators:
1. S-attributed (LR-parsable)
2. L-attributed (LL-parsable)

BOOTSTRAPPING
Bootstrapping is a process in which a simple language is used to translate a more complicated program, which in turn may handle an even more complicated program.
Bootstrapping is an approach for making a self-compiling compiler, that is, a compiler written in the source programming language that it is intended to compile.
A bootstrap compiler can compile the compiler, and we can then use this compiled compiler to compile everything else, including future versions of itself.

Some Lex library functions are:
1. yylex(): This function is used to start or resume scanning. The next call to yylex() in the program continues from the point where the previous call left off. All code in the rules section is copied into yylex().
2. yytext: Whenever the lexer matches a token, the text of the token is stored in the null-terminated string yytext (it works just like a character pointer in C). Whenever a new token is matched, the contents of yytext are replaced by the new token.
3. yywrap(): The purpose of the yywrap() function is to do additional processing in order to "wrap" things up before terminating. When yylex() reaches the end of its input file, it calls yywrap(), which returns a value of 0 or 1. A value of 1 indicates that no further input is available. The default version in the lex library always returns 1.
4. yyerror(): The yyerror() function is called to report an error to the user.
Global optimization: The optimizing transformations are applied over a program unit, i.e. over a function or a procedure.

Local optimization: The optimizing transformations are applied over small segments of a program consisting of a few statements.

A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language).

Recursive Descent Parser
Left-recursive grammars are not suitable.
It accepts LL(1) grammars.
It uses recursive procedures.
The parser requires more space in memory since it is recursive.
Precise error indication is not possible.
FIRST and FOLLOW functions are not required.
Predictive Parser
Left-recursive grammars are not suitable.
It accepts LL(1) grammars.
It uses a parse table.
This parser requires less space in memory.
It detects errors using the parse table.
FIRST and FOLLOW functions are required.

PARSERS
The program performing syntax analysis is known as a parser.
The main objective of the parser is to check the input tokens and analyze their grammatical correctness.
The parser is one of the components of a compiler; it determines whether a string of tokens can be generated by a grammar.

CROSS COMPILER
Definition: A compiler which runs on one machine and produces target code for another machine is known as a cross compiler.

Sentinels
A sentinel is a special character that cannot be part of the source program; a natural choice is the character eof.
With sentinels, a special character that is not part of the source program is placed at the end of each buffer half. Each time the lookahead pointer advances it checks only for this character, and when the sentinel is found the other half is loaded.

Dead code
Dead code is code that can be omitted from a program without affecting the results.
Dead code is detected by checking whether the value assigned in an assignment statement is used anywhere in the program.

Definition of Parsing
Parsing takes input from the lexical analyzer and builds a parse tree, which is used in later steps to develop the machine code.
Determining the syntactic structure of the input received from lexical analysis is called parsing.
The goals of parsing are to check the validity of a source string and to determine its syntactic structure.

PARSER GENERATOR (YACC)
YACC stands for "Yet Another Compiler-Compiler". YACC assists in the next phase of the compiler.
YACC creates a parser, which is output in a form suitable for inclusion in the next phase.

Attribute grammar
An attribute grammar is a special form of context-free grammar where some additional information (attributes) is appended to one or more of its non-terminals in order to provide context-sensitive information.

Synthesized Attributes
An attribute is said to be a synthesized attribute if its value at a parse tree node is determined by the attribute values at the child nodes.
A synthesized attribute at node n is defined only in terms of attribute values at the children of n and at n itself.
Synthesized attributes pass information up the parse tree.
Synthesized attributes can be held by both terminals and non-terminals.

Inherited Attributes
An attribute is said to be an inherited attribute if its value at a parse tree node is determined by the attribute values at the parent and/or sibling nodes.
An inherited attribute at node n is defined only in terms of attribute values at n's parent, n itself, and n's siblings.
Inherited attributes pass information down the parse tree.
Inherited attributes cannot be held by terminals; they are held only by non-terminals.
Dependency Graph
The inter-dependencies among the inherited and synthesized attributes at the nodes in a parse tree can be shown by a directed graph called a dependency graph.

Code optimization involves improving the performance of code in terms of speed, memory usage, or efficiency.
Here are several key optimization techniques:
Loop Optimization, Inline Expansion, Minimizing Function Calls, Memory Optimization, Data Structure Optimization.

Definition of Basic Block:
A basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halting or branching except at the last instruction.