
LR parsers :

LR parsing is an efficient bottom-up syntax analysis technique that can be used to parse large classes of context-free grammars. In the name LR(0):
L stands for left-to-right scanning of the input
R stands for constructing a rightmost derivation in reverse
0 stands for the number of input symbols of lookahead

Advantages of LR parsing :
 It recognizes virtually all programming language constructs for which a CFG can be written
 It is able to detect syntactic errors
 It is an efficient non-backtracking shift-reduce parsing method.

SLR Parser :
SLR stands for Simple LR. It works on the smallest class of grammars and has a small number of states, so the SLR table is very easy to construct; the method is similar to LR(0) parsing. The only difference between the SLR parser and the LR(0) parser is that in the LR(0) parsing table there is a chance of a 'shift-reduce' conflict, because we enter 'reduce' under all terminal columns whenever a state contains a completed item. We can solve this problem by entering 'reduce' only under the terminals in the FOLLOW set of the LHS of the completed production. The result is called the SLR(1) parsing table.
Steps for constructing the SLR parsing table :
1. Write the augmented grammar
2. Find the LR(0) collection of items
3. Find FOLLOW of the LHS of each production
4. Define the two table functions, action[list of terminals] and goto[list of non-terminals], in the parsing table
EXAMPLE – Construct the SLR parsing table for the given context-free grammar:
S –> AA
A –> aA | b
Solution:
STEP 1 – Find the augmented grammar
The augmented grammar of the given grammar, with the initial items shown, is:
S' –> .S [0th production]
S –> .AA [1st production]
A –> .aA [2nd production]
A –> .b [3rd production]
STEP 2 – Find the LR(0) collection of items
The LR(0) collection of items (the states shown in the original figure) is built as follows. We will understand everything one by one.
The terminals of this grammar are {a, b}.
The non-terminals of this grammar are {S, A}.
RULE –
If any non-terminal has ' . ' preceding it, we have to write all of its productions and add ' . ' at the start of the right-hand side of each of those productions (the closure operation).
RULE –
From each state to the next state, the ' . ' shifts one place to the right.
 I0 consists of the items of the augmented grammar.
 I0 goes to I1 when the ' . ' of the 0th production is shifted to the right of S (S'->S.). This is the accepting state; S has been seen by the compiler.
 I0 goes to I2 when the ' . ' of the 1st production is shifted to the right (S->A.A). A has been seen by the compiler.
 I0 goes to I3 when the ' . ' of the 2nd production is shifted to the right (A->a.A). a has been seen by the compiler.
 I0 goes to I4 when the ' . ' of the 3rd production is shifted to the right (A->b.). b has been seen by the compiler.
 I2 goes to I5 when the ' . ' of the 1st production is shifted to the right (S->AA.). A has been seen by the compiler.
 I2 goes to I4 when the ' . ' of the 3rd production is shifted to the right (A->b.). b has been seen by the compiler.
 I2 goes to I3 when the ' . ' of the 2nd production is shifted to the right (A->a.A). a has been seen by the compiler.
 I3 goes to I4 when the ' . ' of the 3rd production is shifted to the right (A->b.). b has been seen by the compiler.
 I3 goes to I6 when the ' . ' of the 2nd production is shifted to the right (A->aA.). A has been seen by the compiler.
 I3 goes to I3 when the ' . ' of the 2nd production is shifted to the right (A->a.A). a has been seen by the compiler.
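Since the original figure is not reproduced here, the following is a plain-text reconstruction of the complete LR(0) collection of items, assembled from the transitions described above:

I0: S' -> .S    S -> .AA    A -> .aA    A -> .b
I1: S' -> S.                            (accepting state)
I2: S -> A.A    A -> .aA    A -> .b
I3: A -> a.A    A -> .aA    A -> .b
I4: A -> b.                             (reduce by production 3)
I5: S -> AA.                            (reduce by production 1)
I6: A -> aA.                            (reduce by production 2)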
STEP 3 –
Find FOLLOW of the LHS of each production
FOLLOW(S) = { $ }
FOLLOW(A) = { a, b, $ }
To see how FOLLOW of non-terminals is computed, please read about FOLLOW sets in syntax analysis.
STEP 4 –
Define the two table functions, action[list of terminals] and goto[list of non-terminals], in the parsing table. Below is the SLR parsing table.
 $ is the end-of-input marker, written as the last terminal column; the accept action goes under $ in row 1 (the accepting state I1).
 0,1,2,3,4,5,6 denote I0,I1,I2,I3,I4,I5,I6.
 I0 goes to I2 on A, so 2 is added in the A column, row 0.
 I0 goes to I1 on S, so 1 is added in the S column, row 0.
 Similarly, 5 is written in the A column, row 2, and 6 is written in the A column, row 3.
 I0 goes to I3 on a, so S3 (shift 3) is added in the a column, row 0.
 I0 goes to I4 on b, so S4 (shift 4) is added in the b column, row 0.
 Similarly, S3 (shift 3) is added in the a column, rows 2 and 3, and S4 (shift 4) is added in the b column, rows 2 and 3.
 I4 is a reduce state, as ' . ' is at the end of its item, which completes the 3rd production of the grammar (A –> b). The LHS of this production is A, and FOLLOW(A) = {a, b, $}, so write r3 (reduce by production 3) in the a, b and $ columns of row 4.
 I5 is a reduce state, as ' . ' is at the end of its item, which completes the 1st production of the grammar (S –> AA). The LHS of this production is S, and FOLLOW(S) = {$}, so write r1 (reduce by production 1) in the $ column of row 5.
 I6 is a reduce state, as ' . ' is at the end of its item, which completes the 2nd production of the grammar (A –> aA). The LHS of this production is A, and FOLLOW(A) = {a, b, $}, so write r2 (reduce by production 2) in the a, b and $ columns of row 6.
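Collecting all of the entries above, the completed SLR(1) parsing table is (sN = shift and go to state N, rN = reduce by production N, acc = accept; blank cells are errors):

State |  a  |  b  |  $  |  S  |  A
------+-----+-----+-----+-----+-----
  0   | s3  | s4  |     |  1  |  2
  1   |     |     | acc |     |
  2   | s3  | s4  |     |     |  5
  3   | s3  | s4  |     |     |  6
  4   | r3  | r3  | r3  |     |
  5   |     |     | r1  |     |
  6   | r2  | r2  | r2  |     |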
Advantages of Construction of LL(1) Parsing Table:
1. Deterministic Parsing: LL(1) parsing tables give a deterministic parsing process, meaning that for a given input program and grammar there is a unique parsing action, determined by the current non-terminal symbol and the lookahead token. This deterministic nature simplifies the parsing algorithm and guarantees that the parsing process is unambiguous and predictable.
2. Efficiency: LL(1) parsing tables allow for efficient parsing of programming languages. Once the parsing table is built, the parsing algorithm can determine the next parsing action by directly indexing the table, resulting in a constant-time lookup. This efficiency is especially beneficial for large-scale programs and can significantly reduce the time required for parsing.
3. Predictive Parsing: LL(1) parsing tables facilitate predictive parsing, where the parsing action is determined solely by the current non-terminal symbol and the lookahead token, without the need for backtracking or guessing. This predictive nature makes the LL(1) parsing algorithm straightforward to implement and reason about. It also contributes to better error handling and recovery during parsing.
4. Error Detection: The construction of an LL(1) parsing table enables the parser to detect errors efficiently. By analysing the entries in the parsing table, the parser can identify conflicts, such as multiple entries for the same non-terminal and lookahead combination. These conflicts indicate ambiguities or mistakes in the grammar definition, allowing early detection and resolution of issues.
5. Non-Left Recursion: LL(1) parsing tables require the elimination of left recursion in the grammar. While left recursion is a common issue in grammars, the process of eliminating it results in a more structured and unambiguous grammar. The construction of an LL(1) parsing table encourages the use of non-left-recursive productions, which leads to clearer and more efficient parsing algorithms (a short example follows this list).
6. Readability and Maintainability: LL(1) parsing tables are generally straightforward to understand and maintain. The parsing table represents the entire parsing algorithm in a tabular format, with clear mappings between non-terminal symbols, lookahead tokens, and parsing actions. This tabular representation improves the readability of the parsing algorithm and simplifies changes to the grammar, making it more maintainable in the long run.
7. Language Design: Building an LL(1) parsing table plays a significant part in the design and development of programming languages. LL(1) grammars are often preferred because of their simplicity and predictability. By ensuring that a grammar is LL(1) and building the associated parsing table, language designers can shape the syntax and define the expected behaviour of the language more effectively.
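As a brief illustration of point 5 (the grammar here is a standard textbook example, not from the original notes), eliminating left recursion turns a left-recursive grammar into an equivalent right-recursive one that an LL(1) parser can handle:

E –> E + T | T        (left-recursive, not LL(1))

becomes

E  –> T E'
E' –> + T E' | epsilon   (right-recursive, suitable for LL(1))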
Here we will study the concept and uses of the Parse Tree in Compiler Design. First, let us check out two terms:
PARSE TREE IN COMPILER DESIGN
 Parse : to resolve (a sentence) into its component parts and describe their syntactic roles; or, simply, the act of parsing a string or a text.
 Tree : a tree is a widely used abstract data type that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.
Parse Tree:
 A parse tree is the hierarchical representation of terminals and non-terminals.
 These symbols (terminals and non-terminals) represent the derivation of the grammar that yields the input string.
 In parsing, the string is derived starting from the start symbol.
 The start symbol of the grammar must be used as the root of the parse tree.
 Leaves of the parse tree represent terminals.
 Each interior node represents a production of the grammar.
Rules to Draw a Parse Tree:
1. All leaf nodes need to be terminals.
2. All interior nodes need to be non-terminals.
3. Reading the leaves from left to right (an in-order traversal) gives the original input string, as in the sketch below.
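For example, using the grammar from the SLR section above (S –> AA, A –> aA | b), the parse tree for the input string abb is:

          S
        /   \
       A     A
      / \    |
     a   A   b
         |
         b

Reading the leaves from left to right gives a b b, the original input string.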
Uses of Parse Tree:
 It helps in performing syntax analysis by reflecting the syntax of the input language.
 It provides an in-memory representation of the input with a structure that conforms to the grammar.
 The advantage of using parse trees rather than semantic actions: you can make multiple passes over the data without having to re-parse the input.
Parse Tree and Syntax Tree

Parse Tree: A parse tree is created by a parser, which is a component of a compiler that processes the source code and checks it for syntactic correctness.
Syntax Tree: A syntax tree is created by the compiler based on the parse tree, after the parser has finished processing the source code.

Parse Tree: Parse trees are typically more detailed and larger than syntax trees, as they contain more information about the source code.
Syntax Tree: Syntax trees are simpler and more abstract, as they only include the information necessary to generate machine code or intermediate code.

Parse Tree: Parse trees are used as an intermediate representation during the compilation process.
Syntax Tree: Syntax trees are the final representation used by the compiler to generate machine code or intermediate code.

Parse Tree: Parse trees are typically represented using a tree structure, with nodes representing the different elements in the source code and edges representing the relationships between them.
Syntax Tree: Syntax trees are also typically represented using a tree structure, but the nodes and edges may be arranged differently.

Parse Tree: Parse trees can be represented in different ways, such as a tree structure, a graph, or an s-expression.
Syntax Tree: Syntax trees are usually represented using a tree structure or an s-expression.

Parse Tree: Parse trees are intended for use by the compiler and are not usually intended to be read by humans.
Syntax Tree: Syntax trees are also primarily used by the compiler, but they can also be read and understood by humans, as they provide a simplified and abstract view of the source code.

Parse Tree: Parse trees include information about the source code that is not needed by the compiler, such as comments and white space.
Syntax Tree: Syntax trees do not include this information.

Parse Tree: Parse trees may include nodes for error recovery and disambiguation, which are used by the parser to recover from errors in the source code and resolve ambiguities.
Syntax Tree: Syntax trees do not include these nodes.

LEX

 Lex is officially known as a "Lexical Analyser".
 Its main job is to break up an input stream into more usable elements. Or, in other words, to identify the "interesting bits" in a text file.
 For example, if you are writing a compiler for the C programming language, the symbols { } ( ) ; all have significance on their own.
 The letter a usually appears as part of a keyword or variable name, and is not interesting on its own.
 Instead, we are interested in the whole word. Spaces and newlines are completely uninteresting, and we want to ignore them completely, unless they appear within quotes "like this".
 All of these things are handled by the Lexical Analyser.
 Lex is a tool widely used to specify lexical analyzers for a variety of languages.
 We refer to the tool as the Lex compiler, and to its input specification as the Lex language.

Lex specifications:

A Lex program (the .l file) consists of three parts:

declarations

%%

translation rules

%%

auxiliary procedures

1. The declarations section includes declarations of variables, manifest constants (a manifest constant is an identifier that is declared to represent a constant, e.g. #define PIE 3.14), and regular definitions.
2. The translation rules of a Lex program are statements of the form:

p1 {action 1}

p2 {action 2}

p3 {action 3}

……

where each p is a regular expression and each action is a program fragment describing what action the lexical analyzer should take when the pattern p matches a lexeme. In Lex the actions are written in C.

3. The third section holds whatever auxiliary procedures are needed by the actions. Alternatively, these procedures can be compiled separately and loaded with the lexical analyzer.
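Putting the three parts together, here is a minimal sketch of a complete Lex specification (the patterns and printed labels are illustrative, not from the original notes):

%{
/* Declarations section: C code copied verbatim into the scanner. */
#include <stdio.h>
%}
digit   [0-9]
letter  [a-zA-Z]
%%
{digit}+     { printf("NUMBER: %s\n", yytext); }
{letter}+    { printf("WORD: %s\n", yytext); }
[ \t\n]+     ; /* ignore white space */
.            { printf("OTHER: %s\n", yytext); }
%%
/* Auxiliary procedures section. */
int yywrap(void) { return 1; }          /* signal end of input */
int main(void) { yylex(); return 0; }   /* scan stdin until EOF */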

How does this lexical analyzer work?

 The lexical analyzer created by Lex behaves in concert with a parser in the following manner.
 When activated by the parser, the lexical analyzer begins reading its remaining input, one character at a time, until it has found the longest prefix of the input that is matched by one of the regular expressions p.
 Then it executes the corresponding action. Typically the action will return control to the parser.
 However, if it does not, then the lexical analyzer proceeds to find more lexemes, until an action causes control to return to the parser.
 The repeated search for lexemes until an explicit return allows the lexical analyzer to process white space and comments conveniently.
 The lexical analyzer returns a single quantity, the token, to the parser. To pass an attribute value with information about the lexeme, we can set the global variable yylval.
 e.g. Suppose the lexical analyzer returns a single token for all the relational operators, in which case the parser won't be able to distinguish between "<=", ">=", "<", ">", "==" etc. We can set yylval appropriately to specify the nature of the operator, as in the sketch after this note.
 Note: To know the exact syntax and the various symbols that you can use to write the regular expressions, visit the manual page of flex on Linux:

$ man flex
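A minimal sketch of this relational-operator idea (the token code RELOP and the operator constants below are hypothetical; in a real lex/yacc project they would come from y.tab.h):

%{
#define RELOP 258                   /* hypothetical token code */
enum { LT, LE, GT, GE, EQ, NE };    /* hypothetical attribute values */
int yylval;                         /* attribute passed to the parser */
%}
%%
"<="    { yylval = LE; return RELOP; }
">="    { yylval = GE; return RELOP; }
"=="    { yylval = EQ; return RELOP; }
"!="    { yylval = NE; return RELOP; }
"<"     { yylval = LT; return RELOP; }
">"     { yylval = GT; return RELOP; }
%%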

The two variables yytext and yyleng

 Lex makes the lexeme available to the routines appearing in the third section through two variables, yytext and yyleng.
1. yytext is a variable that is a pointer to the first character of the lexeme.
2. yyleng is an integer telling how long the lexeme is.
 A lexeme may match more than one pattern. How is this conflict resolved?
 Take for example the lexeme if. It matches the patterns for both the keyword if and an identifier.
 If the pattern for the keyword if precedes the pattern for identifiers in the rules list of the Lex program, the conflict is resolved in favour of the keyword.
 In general, this ambiguity-resolving strategy makes it easy to reserve keywords by listing them ahead of the pattern for identifiers, as in the fragment below.
 Lex's strategy of selecting the longest prefix matched by a pattern makes it easy to resolve other conflicts, like the one between "<" and "<=".
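A small rules fragment illustrating this ordering (the token names IF and IDENTIFIER are illustrative; they would normally be defined in the parser's header):

%%
"if"                      { return IF; }         /* keyword listed first wins */
[a-zA-Z_][a-zA-Z0-9_]*    { return IDENTIFIER; } /* general identifier pattern */
%%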

In the Lex program, a main() function is generally included as:

int main() {
    yyin = fopen(filename, "r");   /* open the input file for reading */
    while (yylex());               /* fetch tokens until end of input */
    return 0;
}

 Here filename corresponds to the input file, and the yylex routine is called, which returns the tokens.

YACC

 Yacc is officially known as a "parser generator"; the program it produces is the parser.
 Its job is to analyse the structure of the input stream and operate on the "big picture".
 In the course of its normal work, the parser also verifies that the input is syntactically sound.
 Consider again the example of a C compiler. In the C language, a word can be a function name or a variable, depending on whether it is followed by a ( or an =. There should be exactly one } for each { in the program.
 YACC stands for "Yet Another Compiler Compiler". This is because this kind of analysis of text files is normally associated with writing compilers.

How does yacc work?

 yacc is designed for use with C code and generates a parser written in C.
 The parser is configured for use in conjunction with a lex-generated scanner and relies on standard shared features (token types, yylval, etc.), calling the function yylex as a scanner coroutine.
 You provide a grammar specification file, which is traditionally named using a .y extension.
 You invoke yacc on the .y file and it creates the y.tab.h and y.tab.c files containing a thousand or so lines of intense C code that implements an efficient LALR(1) parser for your grammar, including the code for the actions you specified.
 The file provides an extern function yyparse() that will attempt to successfully parse a valid sentence.
 You compile that C file normally, link it with the rest of your code, and you have a parser! By default, the parser reads from stdin and writes to stdout, just like a lex-generated scanner does.
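A minimal sketch of what such a .y grammar file looks like (the grammar and the token name NUMBER are illustrative, not a definitive implementation):

%{
#include <stdio.h>
int yylex(void);   /* a matching yylex (e.g. lex-generated) is needed at link time */
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}
%token NUMBER
%%
expr : expr '+' term    { $$ = $1 + $3; }   /* actions are plain C */
     | term             { $$ = $1; }
     ;
term : NUMBER           { $$ = $1; }
     ;
%%
int main(void) { return yyparse(); }        /* yyparse is generated by yacc */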

Difference between LEX and YACC

 Lex is used to split the text into a list of tokens; which text becomes which token can be specified using regular expressions in the lex file.
 Yacc is used to give some structure to those tokens. For example, in programming languages we have assignment statements like int a = 1 + 2; and we want to make sure that the left-hand side of '=' is an identifier and the right-hand side is an expression [which could be more complex than this]. This can be coded using a CFG rule, and this is what you specify in the yacc file; you cannot do this using lex (lex cannot handle recursive languages).
 A typical application of lex and yacc is for implementing programming languages.
 Lex tokenizes the input, breaking it up into keywords, constants, punctuation, etc.
 Yacc then implements the actual computer language; recognizing a for statement, for instance, or a function definition.
 Lex and yacc are normally used together. This is how you usually construct an application using both:
 Input Stream (characters) -> Lex (tokens) -> Yacc (Abstract Syntax Tree) -> Your Application
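A typical build sequence for such a lex+yacc project might look like this (the file names scanner.l and parser.y are illustrative):

$ yacc -d parser.y             # produces y.tab.c and y.tab.h
$ lex scanner.l                # produces lex.yy.c
$ cc y.tab.c lex.yy.c -o app   # compile and link parser + scanner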
