Working
i. Instead of invoking one read command per character, N input characters are
read into each half of the buffer with one read command.
ii. If fewer than N characters are left in the input stream, those characters
are read and a special character eof is placed in the buffer immediately after
the input characters.
iii. Two pointers into the input buffer are maintained. The string of characters
between the two pointers is the current lexeme (a sketch of this scheme follows the list).
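A minimal C sketch of this two-buffer scheme is given below; the buffer size N and the names load_half, lexemeBegin and forward are illustrative assumptions, not names taken from the text.
/* Sketch of the two-buffer input scheme with a sentinel (all names are illustrative). */
#include <stdio.h>
#define N 1024                          /* characters loaded per half with one read  */
#define SENTINEL '\0'                   /* stands in for the special eof character   */
static char buf[2 * (N + 1)];           /* two halves, each followed by a sentinel   */
static char *lexemeBegin = buf;         /* first pointer: start of current lexeme    */
static char *forward = buf;             /* second pointer: scans ahead in the buffer */
/* Load one half of the buffer with a single read and append the sentinel. */
static void load_half(FILE *in, char *half)
{
    size_t got = fread(half, 1, N, in); /* fewer than N chars are read near the end */
    half[got] = SENTINEL;               /* eof marker follows the input characters  */
}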
# Write a program to find the area of a circle.
%{
#include <stdio.h>
#include <stdlib.h>
#define PI 3.1415
int r;
float area;
%}
%%
[0-9]+ {
r = atoi(yytext);
area = PI * r * r;
printf("Area = %f\n", area);  /* print the computed area */
return 0; }
%%
main() {
yylex(); }
# Write a program to find the factorial of a number.
%{
#include <stdio.h>
#include <stdlib.h>
int num;
long fact = 1;
%}
%%
[0-9]+ {
num = atoi(yytext);
while (num > 1) {   /* the original fragment omitted the loop body; a factorial computation is assumed */
fact = fact * num;
num--; }
printf("Factorial = %ld\n", fact);
return 0; }
%%
main() {
printf("enter number: ");
yylex(); }
3 Parser
1. Define Parser.
Syntax analysis checks the syntax of the input statement using the stream of
tokens received from the lexical analyzer and produces a parse tree for semantic
analysis; the component which performs this task is called a parser.
2. Grammar: - A grammar is used by the parser; it specifies the syntax of the
statements used in the source code.
3. CFG: - A CFG is a collection of 4 tuples (V, T, P, S), as illustrated after this list:
V - Variables or non-terminals (uppercase)
T - Terminals (lowercase)
P - Set of productions
S - Start symbol
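For example (an illustrative grammar, not one given in the text), a CFG for simple arithmetic expressions is:
V = { E, T, F }
T = { id, +, *, (, ) }
P = { E → E + T | T,  T → T * F | F,  F → ( E ) | id }
S = E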
# Basic Terminology: -
1) Sentence: - A string of terminals derived from the grammar is called a sentence.
2) Sentential Form: - A string of terminals as well as non-terminals is called a
sentential form.
3) Derivation: - This process starts from the starting non-terminal. It is the
replacement of a non-terminal by the RHS of its production rule.
4) Reduction: - This process starts from the sentence. It finds a substring of the
sentence which matches the RHS of some production rule and replaces it by the
LHS; this replacement is called a reduction.
5) Syntax Tree: - This is also called a parse tree or derivation tree. It is a
graphical representation of a sentence.
6) Ambiguous Grammar: - If a sentence can be derived from the starting
non-terminal by more than one derivation process, then the grammar is called an
ambiguous grammar (see the example after this list).
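As an illustration of derivation and ambiguity (the grammar below is only an example), take E → E + E | E * E | id. The sentence id + id * id has two different leftmost derivations, so the grammar is ambiguous:
E ⇒ E + E ⇒ id + E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id
E ⇒ E * E ⇒ E + E * E ⇒ id + E * E ⇒ id + id * E ⇒ id + id * id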
4. Top-down Parser: -
Top-down parsing is an attempt to find the leftmost derivation for an input
string. It constructs a parse tree for the input starting from the root (start
symbol) and creates the nodes of the parse tree in pre-order.
1. A top-down parser uses the derivation process.
2. It cannot handle an ambiguous grammar.
3. It cannot handle a left-recursive grammar (a minimal recursive-descent sketch follows this list).
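Below is a minimal recursive-descent (top-down) parser sketch in C for the toy grammar E → id '+' id; the function names, the single-letter id convention and the string input are assumptions made for illustration only.
#include <stdio.h>
#include <ctype.h>
static const char *ip;                    /* input pointer */
static int match(char c) {                /* consume one expected terminal */
    if (*ip == c) { ip++; return 1; }
    return 0;
}
/* E -> id '+' id   (id is a single lowercase letter in this sketch) */
static int E(void) {
    return islower((unsigned char)*ip) && match(*ip)
        && match('+')
        && islower((unsigned char)*ip) && match(*ip);
}
int main(void) {
    ip = "a+b";
    printf(E() && *ip == '\0' ? "accepted\n" : "rejected\n");
    return 0;
}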
5. Bottom-Up Parser
As we have seen in the definition part of bottom-up parser, it is constructed
beginning at the leaves (bottom) and working up towards the root (top). It
reduces an input string "w" to the start symbol of the grammar. During every
reduction step, a particular sub-string matching the right side of a production
is replaced by the non-terminal on the left side of that production.
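For example, with the grammar E → E + E | id and the input w = id + id, a bottom-up parser performs the following reductions:
id + id  →  E + id   (reduce id to E)
E + id   →  E + E    (reduce id to E)
E + E    →  E        (reduce E + E to the start symbol E)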
6. Left Recursion & Left Factoring: -
A grammar is called left recursive when the leftmost symbol in the RHS of a
production rule of some non-terminal is that non-terminal itself. Left factoring
is the transformation that factors out the common prefix of two or more
alternatives of a non-terminal so that a predictive parser can choose among them.
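The standard transformations, shown here on schematic productions, are:
Left recursion:   A → Aα | β       becomes    A → βA'    and    A' → αA' | ε
Left factoring:   A → αβ1 | αβ2    becomes    A → αA'    and    A' → β1 | β2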
7. Predictive Parser (LL(1) Parser)
In many cases, after writing a grammar, eliminating left recursion and left
factoring the resulting grammar, we get a grammar that can be parsed by a
Recursive Descent Parser without backtracking. This type of parser is called a
Predictive Parser.
Advantages
i. It is a simple and easy-to-implement parsing technique.
ii. The operator precedence parser is constructed by hand after understanding
the grammar. It is simple to debug.
Disadvantages: - i. It is hard to handle tokens like minus (-), which has two
different precedence values depending on whether it is used as a binary or
unary operator.
ii. This technique does not take the grammar as the input and generate a
parser. Any addition or deletion of production rules would require a rewrite of
the parser.
8. LR Parser
It is the most powerful shift-reduce parsing method.
It is used to parse a large class of context-free grammars.
Advantages of LR Parsing
i. LR parsers can be constructed to recognize virtually all programming
language constructs for which context free grammars are written.
ii. LR parsing is most general non-backtracking shift-reduce parsing, yet it is still
efficient.
iii. The class of grammars that can be parsed using LR methods is a proper
superset of the class of grammars that can be parsed with predictive parsers.
9. YACC
i. YACC stands for Yet Another Compiler Compiler.
ii. It is an automatic parser generator utility provided with UNIX/Linux (a
minimal sketch of a YACC specification follows).
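The sketch below is only illustrative; the file name calc.y, the token NUM and the tiny hand-written scanner are assumptions, not part of the text.
/* calc.y - illustrative YACC specification (build with: yacc calc.y && cc y.tab.c && ./a.out) */
%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}
%token NUM
%%
expr : expr '+' NUM   { printf("matched a '+'\n"); }
     | NUM
     ;
%%
/* A trivial scanner so the sketch is self-contained; normally yylex()
   would come from a companion LEX specification (lex.yy.c). */
int yylex(void) {
    int c = getchar();
    if (c == ' ' || c == '\t') return yylex();
    if (c >= '0' && c <= '9') return NUM;
    return (c == EOF || c == '\n') ? 0 : c;
}
int main(void) { return yyparse(); }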
4 Syntax Directed Definition
1. Define SDD
A syntax directed definition specifies the values of attributes, by associating
semantic rule with the grammar production. An attribute is any property of a
symbol.
Ex. E → E1 + T    E.code = E1.code || T.code || '+'
2. Synthesized attribute:
It is computed from the attribute values of a node's children or from the
lexical value of the token itself.
i.e., a synthesized attribute for a nonterminal A at a parse tree node N, is
defined by a semantic rule associated with the production at N. A synthesized
attribute at node N is defined only in terms of attribute value at the children of
N and at N itself.
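For example, a 'val' attribute defined by the following rules is synthesized, because each value is computed only from the children of the node:
E → E1 + T     E.val = E1.val + T.val
T → num        T.val = num.lexval
For the input 3 + 4, the annotated parse tree carries the values 3 and 4 at the lower nodes and E.val = 7 at the root.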
3. Inherited attributes:
The attribute value of a parse-tree node is determined from attribute values of
its parent and siblings. i.e. an inherited attribute for a non-terminal A at a
parse tree node N is defined by a semantic rule associated with the production
at the parent of N. An inherited attribute at node N is defined only in terms of
attribute values at N's parent, N itself and N's siblings.
4. Dependency graph :-
If an attribute b at a node in a parse tree depends on an attribute c, then the
semantic rule for b at that node must be evaluated after the semantic rule that
defines c. The interdependencies among the inherited and synthesized
attributes at the nodes in a parse tree can be depicted by a directed graph
called a dependency graph.
5 Code Generation and Optimization
1 Register Descriptor
A partial result is the value of some subexpression computed while evaluating
an expression. Partial results are maintained in CPU registers. If the number of
results exceeds the number of available CPU registers, some of them have to be
moved to memory.
2 Write a short note on a code optimization technique.
Code optimization is most effective in the loops of a program. The inner loop
is the place where a program spends a large amount of its time. If the number
of instructions in the inner loop is reduced, the running time of the program
decreases to a large extent. Hence loop optimization is a technique in which
code optimization is performed on inner loops.
3 Explain in detail two optimization techniques with examples.
i) Frequency Reduction: The execution time of a program can be reduced by moving
code from a part of the program which is executed very frequently (a loop) to
another part of the program which is executed fewer times.
Example, the loop
for i := 1 to 500 do
begin
x := k;
z := 200*a;
y := z + x;
end
will be transformed into (the loop-invariant computation z := 200*a is moved out of the loop):
z := 200*a;
for i := 1 to 500 do
begin
x := k;
y := z + x;
end
ii) Strength Reduction: The strength reduction optimization replaces the
occurrences of a time-consuming (high-strength) operation by occurrences of a
faster (low-strength) operation.
For example,
For (i=1; i<=50; i++){…
count = i*7; …}
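A possible strength-reduced form of this loop (the temporary t is an illustrative name) replaces the multiplication by a running addition:
int count = 0, t = 7;
for (int i = 1; i <= 50; i++) {
    count = t;       /* same value as i*7       */
    t = t + 7;       /* cheaper additive update */
}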
4) Define Directed Acyclic Graph (DAG).
In mathematics, particularly graph theory, and computer science, a directed
acyclic graph (DAG) is a directed graph with no directed cycles. That is, it
consists of vertices and edges (also called arcs), with each edge directed from
one vertex to another, such that following those directions will never form a
closed loop.
A flow graph is a directed graph. It contains the flow-of-control information
for a set of basic blocks. A control flow graph is used to depict how program
control is passed among the blocks. It is useful in loop
optimization.
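For example, in the small basic block below, a DAG would contain a single node for the common subexpression b * c, which both a and d share:
a = b * c
d = b * c + e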
1) Define Operand descriptors.
The operand descriptor (OD) describes the value and attributes of an
operand. A typical HISC instruction consists of an operation code and indexes
to source and destination operands referenced by operand descriptors.
2) Define Annotated parse tree.
An annotated parse tree is a parse tree showing the values of the
attributes at each node. The process of computing the attribute values at the
nodes is called annotating or decorating the parse tree.
3) State True or False. Bottom-up parsing uses the process of derivation.
False. Bottom-up parsing uses reduction: it is an attempt to reduce the input
string w to the start symbol of the grammar by tracing out the rightmost
derivation of w in reverse. For example, the general shift-reduce parsers,
such as LR parsers, belong to this class.
4) Define cross compiler.
A cross compiler is a compiler capable of creating executable code for a
platform other than the one on which the compiler is running. For
example, a compiler that runs on a Windows 7 PC but generates code
that runs on an Android smartphone is a cross compiler.
5) State True or False. The yywrap( ) lex library function by default always
return 1.
True. The default yywrap() supplied by the lex library always returns 1. If
yywrap() returns false (zero), it is assumed that the function has gone ahead
and set up yyin to point to another input file, and scanning continues. If it
returns true (non-zero), the scanner terminates, returning 0 to its caller.
Note that in either case the start condition remains unchanged; it does not
revert to INITIAL.
6) List the techniques used in code optimization.
Compile Time Evaluation
Common Sub-Expression Elimination
Code Movement
Dead Code Elimination
7) What is the purpose of augmenting the grammar?
The augmented grammar adds a new start non-terminal S' with the sole
production S' → S. This helps in detecting acceptance: if the parser reduces by
this particular production (to the non-terminal S'), it is accepting. Reducing
to the start non-terminal of the original grammar tells you nothing, since it
might also appear on some right-hand side.
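For example, if the original grammar is S → ( S ) | a, the augmented grammar only adds one production:
S' → S
S  → ( S ) | a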
8) Define one pass & Multipass compilers
A one-pass compiler is a compiler that passes through the source code of each
compilation unit only once. A multi-pass compiler is a type of compiler that
processes the source code or abstract syntax tree of a program several times.
A one-pass compiler is faster than a multi-pass compiler.
9) Give the name of the file which is obtained after compilation of the lex
program by the Lex compiler.
The lex compiler transforms lex.l to a C program known as lex.yy.c.
lex.yy.c is then compiled by the C compiler into a file called a.out.
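On a typical UNIX/Linux system the steps are as follows; the file name prog.l is only an example:
lex prog.l          # writes the scanner to lex.yy.c
cc lex.yy.c -ll     # compiles it and links the lex library, producing a.out (use -lfl with flex)
./a.out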
10) What is the output of LEX program?
It takes a LEX source program as its input and produces a lexical analyzer as
its output. The lexical analyzer converts the input string entered by the user
into tokens. LEX is a program generator designed for lexical processing of
character input streams.
11) List the phases of compiler.
Lexical Analyzer
Syntax Analyzer
Semantic Analyzer
Intermediate Code Generator
Code Optimizer
Target Code Generator
1. Write a LEX Program which identifies the tokens like id, if, for and while.
%{
#include <stdio.h>
/* The keyword rule below is added so that if, for and while are reported as keywords, as the question asks. */
%}
%%
"if"|"for"|"while" printf("Keyword");
^[a-zA-Z_][a-zA-Z0-9_]* printf("Valid Identifier");
^[^a-zA-Z_] printf("Invalid Identifier");
.|\n ;
%%
main()
{
yylex(); }
2. Write a LEX program to implement a simple arithmetic calculator.
%{
#include <stdio.h>
#include <stdlib.h>
int digi();               /* forward declaration for the helper defined after the rules */
int op = 0, i;
float a, b;
%}
dig [0-9]+|([0-9]*)"."([0-9]+)
add "+"
sub "-"
mul "*"
div "/"
pow "^"
ln \n
%%
{dig} {digi();}
{add} {op=1;}
{sub} {op=2;}
{mul} {op=3;}
{div} {op=4;}
{pow} {op=5;}
{ln} {printf("Answer = %f\n", a); op = 0; /* print the running result on newline; this rule is assumed, the original fragment never printed the result */}
%%
digi(){
if(op==0)
a=atof(yytext);
else{
b=atof(yytext);
switch(op){
case 1:a=a+b;
break;
case 2:a=a-b;
break;
case 3:a=a*b;
break;
case 4:a=a/b;
break;
case 5:
for(i=a;b>1;b--)
a=a*i;
break;}
op=0;}}
main() {
yylex(); }
yywrap(){
return 1;}