
Unit-2 CD

Syntax analysis, or parsing, is the second phase of compiler design that checks the syntactical structure of input code by constructing a parse tree using the language's grammar. It involves various parsing algorithms such as LL and LR parsing, and is responsible for error detection and generating an intermediate representation of the code. The process also includes error handling, which detects and reports errors while allowing the compiler to recover and continue processing.

Unit-2

Introduction to Syntax Analysis in Compiler Design:


When an input string (source code or a program in some language) is given to a
compiler, the compiler processes it in several phases, starting from lexical analysis (scans
the input and divides it into tokens) to target code generation.
Syntax Analysis or Parsing is the second phase, i.e. after lexical analysis. It checks the
syntactical structure of the given input, i.e. whether the given input is in the correct syntax
(of the language in which the input has been written) or not. It does so by building a data
structure, called a Parse tree or Syntax tree.
The parse tree is constructed by using the pre-defined grammar of the language and
the input string. If the given input string can be derived using the grammar (in the
derivation process), the input string is in the correct syntax; if not, an error is
reported by the syntax analyzer.
Syntax analysis, also known as parsing, is a process in compiler design where the
compiler checks if the source code follows the grammatical rules of the programming
language. This is typically the second stage of the compilation process, following lexical
analysis.
The main goal of syntax analysis is to create a parse tree or abstract syntax tree
(AST) of the source code, which is a hierarchical representation of the source code that
reflects the grammatical structure of the program.
There are several types of parsing algorithms used in syntax analysis (this is the role
of the parser), including:
 LL parsing: This is a top-down parsing algorithm that starts with the root of the parse
tree and constructs the tree by successively expanding non-terminals. LL parsing is
known for its simplicity and ease of implementation.
 LR parsing: This is a bottom-up parsing algorithm that starts with the leaves of the parse
tree and constructs the tree by successively reducing substrings of the input to
non-terminals. LR parsing is more powerful than LL parsing and can handle a larger class
of grammars.
 LR(1) parsing: This is a variant of LR parsing that uses lookahead to disambiguate the
grammar.
 LALR parsing: This is a variant of LR parsing that uses a reduced set of lookahead
symbols to reduce the number of states in the LR parser.
 Once the parse tree is constructed, the compiler can perform semantic analysis to check
if the source code makes sense and follows the semantics of the programming language.
 The parse tree or AST can also be used in the code generation phase of the compiler
design to generate intermediate code or machine code.
Features of syntax analysis:
Syntax Trees: Syntax analysis creates a syntax tree, which is a hierarchical representation of
the code’s structure. The tree shows the relationship between the various parts of the code,
including statements, expressions, and operators.
Context-Free Grammar: Syntax analysis uses context-free grammar to define the syntax
of the programming language. Context-free grammar is a formal language used to describe
the structure of programming languages.
Top-Down and Bottom-Up Parsing: Syntax analysis can be performed using two main
approaches: top-down parsing and bottom-up parsing. Top-down parsing starts from the
highest level of the syntax tree and works its way down, while bottom-up parsing starts
from the lowest level and works its way up.
Error Detection: Syntax analysis is responsible for detecting syntax errors in the code. If
the code does not conform to the rules of the programming language, the parser will report
an error and halt the compilation process.
Intermediate Code Generation: Syntax analysis generates an intermediate representation
of the code, which is used by the subsequent phases of the compiler. The intermediate
representation is usually a more abstract form of the code, which is easier to work with
than the original source code.
Optimization: Syntax analysis can perform basic optimizations on the code, such as
removing redundant code and simplifying expressions.
A pushdown automaton (PDA) is the machine model used to design the syntax analysis phase.

The Grammar for a Language consists of Production rules.


Example: Suppose Production rules for the Grammar of a language are :
S -> cAd
A -> bc|a
And the input string is “cad”.

To generate the string "cad", the parser applies the rules in sequence: S ⇒ cAd (using S -> cAd), then cAd ⇒ cad (using A -> a).
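The derivation above can be sketched as a tiny backtracking parser. The code below is an illustrative sketch for this two-rule grammar only, not part of the original text:

```python
# Backtracking parser sketch for the grammar  S -> cAd ,  A -> bc | a.
# The parser tries each alternative of A in turn and backtracks on failure.

def parse_A(s, i):
    """Try each alternative of A at position i; return new position or None."""
    for alt in ("bc", "a"):              # alternatives tried in order
        if s[i:i + len(alt)] == alt:
            return i + len(alt)
    return None                          # no alternative matched: backtrack

def parse_S(s):
    """S -> c A d ; succeed only if the whole input is consumed."""
    if not s.startswith("c"):
        return False
    j = parse_A(s, 1)
    return j is not None and s[j:] == "d"

print(parse_S("cad"))   # True  ("cad" is derived via S -> cAd, A -> a)
print(parse_S("cbcd"))  # True  (via A -> bc)
print(parse_S("cbd"))   # False (syntax error)
```

Note that the order in which alternatives are tried matters when alternatives share a common prefix; here the two alternatives of A start with different symbols, so any order works.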
Advantages of using syntax analysis in compiler design include:

 Structural validation: Syntax analysis allows the compiler to check if the source code
follows the grammatical rules of the programming language, which helps to detect and
report errors in the source code.
 Improved code generation: Syntax analysis can generate a parse tree or abstract syntax
tree (AST) of the source code, which can be used in the code generation phase of the
compiler design to generate more efficient and optimized code.
 Easier semantic analysis: Once the parse tree or AST is constructed, the compiler can
perform semantic analysis more easily, as it can rely on the structural information
provided by the parse tree or AST.

Disadvantages of using syntax analysis in compiler design include:

 Complexity: Parsing is a complex process, and the quality of the parser can greatly
impact the performance of the resulting code. Implementing a parser for a complex
programming language can be a challenging task, especially for languages with
ambiguous grammars.
 Reduced performance: Syntax analysis can add overhead to the compilation process,
which can reduce the performance of the compiler.
 Limited error recovery: Syntax analysis algorithms may not be able to recover from
errors in the source code, which can lead to incomplete or incorrect parse trees and
make it difficult for the compiler to continue the compilation process.
 Inability to handle all languages: Not all languages have formal grammars, and some
languages may not be easily parseable.
Grammar :
It is a finite set of formal rules for generating syntactically correct sentences or meaningful
correct sentences.
Constitute Of Grammar :
Grammar is basically composed of two basic elements –
1. Terminal Symbols –
Terminal symbols are those which are the components of the sentences generated using
a grammar and are represented using lowercase letters like a, b, c, etc.
2. Non-Terminal Symbols –
Non-Terminal Symbols are those symbols which take part in the generation of the
sentence but are not the component of the sentence. Non-Terminal Symbols are also
called Auxiliary Symbols and Variables. These symbols are represented using a capital
letter like A, B, C, etc.
Formal Definition of Grammar :
Any grammar can be represented by a 4-tuple <N, T, P, S>
 N – Finite Non-Empty Set of Non-Terminal Symbols.
 T – Finite Set of Terminal Symbols.
 P – Finite Non-Empty Set of Production Rules.
 S – Start Symbol (Symbol from where we start producing our sentences or strings).

Production Rules :
A production or production rule in computer science is a rewrite rule specifying a symbol
substitution that can be recursively performed to generate new symbol sequences. It is of
the form α-> β where α is a Non-Terminal Symbol which can be replaced by β which is a
string of Terminal Symbols or Non-Terminal Symbols.

Example-1 :
Consider Grammar G1 = <N, T, P, S>
N = {A} #Set of non-terminal symbols
T = {a,b} #Set of terminal symbols
P = {A->Aa, A->Ab, A->a, A->b, A->ε} #Set of all production rules
S = {A} #Start Symbol
Derivation Of Strings :

Deriving "a":
A -> a #using production rule 3

Deriving "ba":
A -> Aa #using production rule 1
Aa -> ba #using production rule 4
OR
A -> Aa #using production rule 1
Aa -> AAa #using production rule 1
AAa -> bAa #using production rule 4
bAa -> ba #using production rule 5 (A -> ε)

Error Handling in Compiler Design:


The tasks of the Error Handling process are to detect each error, report it to the
user, and then apply some recovery strategy to handle the error. This whole process
should not noticeably slow down the compilation of the program.
Functions of Error Handler:
 Error Detection
 Error Report
 Error Recovery
Error handler=Error Detection+Error Report+Error Recovery.
Errors in the program should be detected and reported by the parser. Whenever an
error occurs, the parser should handle it and continue to parse the rest of the input.
Although the parser is mostly responsible for checking for errors, errors may occur at
various stages of the compilation process.
Types or Sources of Error :
There are three types of error: logic, run-time and compile-time errors:
1. Logic errors occur when programs operate incorrectly but do not terminate abnormally
(or crash). Unexpected or undesired outputs or other behaviour may result from a logic
error, even if it is not immediately recognized as such.
2. A run-time error is an error that takes place during the execution of a program and
usually happens because of adverse system parameters or invalid input data. The lack of
sufficient memory to run an application or a memory conflict with another program are
examples of this. Logic errors occur when executed code does not produce the expected
result; they are best handled by meticulous program debugging.
3. Compile-time errors arise at compile time, before the execution of the program. A syntax
error or a missing file reference that prevents the program from successfully compiling is
an example of this.

Classification of Compile-time error –


1. Lexical : This includes misspellings of identifiers, keywords or operators
2. Syntactical : a missing semicolon or unbalanced parentheses
3. Semantical : incompatible value assignment or type mismatches between operator and
operand
4. Logical : code not reachable, infinite loop.

What is Context-Free Grammar?


A context-free grammar (CFG) is a type of formal grammar that can be used to describe
the syntax or structure of a formal language. The grammar is a four-tuple (V, T, P, S).
V - It is the collection of variables or non-terminal symbols.
T - It is a set of terminals.
P - It is the production rules that consist of both terminals and non-terminals.
S - It is the starting symbol.

A grammar is said to be a context-free grammar if every production is of the form:


A -> (V∪T)*, where A ∊ V
 The left-hand side of a production can only be a single variable; it cannot be a
terminal.
 The right-hand side can be any combination of variables and terminals.
The above form states that every production whose left-hand side is a single variable 'V'
and whose right-hand side is any combination of variables and 'T' terminals is
context-free.
For example, consider the grammar with variables {S} and terminals {a, b}, having
productions:
 Here S is the starting symbol and the only variable.
 {a, b} are the terminals, generally represented by lowercase characters.
S -> aS
S -> bSa
S -> ε
but
a -> bSa, or
a -> ba is not a CFG, as on the left-hand side there is a terminal, which does not
follow the CFG rule.
Let us consider the string "aba" and try to derive it from the productions given. We
start with the symbol S, apply S -> aS, then S -> bSa, and finally S -> ε, to get the
string "aba": S ⇒ aS ⇒ abSa ⇒ aba.

Parse tree of string “aba”


WRITING A GRAMMAR
A grammar consists of a number of productions. Each production has an abstract
symbol called a nonterminal as its left-hand side, and a sequence of one or more nonterminal
and terminal symbols as its right-hand side. For each grammar, the terminal symbols are
drawn from a specified alphabet.
Starting from a sentence consisting of a single distinguished nonterminal, called
the goal symbol, a given context-free grammar specifies a language, namely, the set of
possible sequences of terminal symbols that can result from repeatedly replacing any
nonterminal in the sequence with a right-hand side of a production for which the nonterminal
is the left-hand side.
REGULAR EXPRESSION
It is used to describe the tokens of programming languages.
It is used to check whether the given input is valid or not using transition diagrams.
The transition diagram has set of states and edges.
It has no start symbol.
It is useful for describing the structure of lexical constructs such as identifiers, constants,
keywords, and so forth.
CONTEXT-FREE GRAMMAR
It consists of a quadruple where S → start symbol, P → production, T → terminal, V
→ variable or non- terminal.
It is used to check whether the given input is valid or not using derivation.
The context-free grammar has set of productions.
It has a start symbol.
It is useful for describing nested structures such as balanced parentheses, matching
begin-end blocks, and so on.
Classification of Top Down Parsers
Parsing is classified into two categories, i.e. Top-Down Parsing, and Bottom-Up
Parsing. Top-Down Parsing is based on Left Most Derivation whereas Bottom-Up Parsing is
dependent on Reverse Right Most Derivation.
The process of constructing the parse tree which starts from the root and goes down to the
leaf is Top-Down Parsing.
1. Top-Down Parsers are constructed from grammars that are free from ambiguity and left
recursion.
2. Top-Down Parsers use leftmost derivation to construct a parse tree.
3. It does not allow Grammar With Common Prefixes.

Classification of Top-Down Parsing


1. With Backtracking: Brute Force Technique
2. Without Backtracking:
1. Recursive Descent Parsing
2. Predictive Parsing or Non-Recursive Parsing or LL(1) Parsing or Table-Driven Parsing.
Recursive Descent Parsing
1. Whenever a non-terminal is expanded for the first time, try its first alternative and
compare it with the given input string.
2. If matching doesn't occur, try the second alternative and compare it with the given
input string.
3. If matching is not found again, try the next alternative, and so on.
4. If matching occurs for at least one alternative, then the input string is parsed
successfully.
Recursive Descent Parsing
S()
{
    Choose an S-production, S -> X1 X2 ... Xk;
    for (i = 1 to k)
    {
        if (Xi is a non-terminal)
            call procedure Xi();
        else if (Xi equals the current input symbol)
            advance the input;
        else
            /* an error has occurred: backtrack and try another alternative */
    }
}
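The pseudocode above can be made concrete. The sketch below specializes it (as an assumption, since the text gives no concrete grammar at this point) to the grammar E -> id E', E' -> + id E' | ε, which requires no backtracking:

```python
# Recursive-descent sketch of the pseudocode above, specialized to the
# (assumed, illustrative) grammar:  E -> id E' ,  E' -> + id E' | ε
# Each non-terminal becomes one procedure; `pos` is the current input index.

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def match(self, expected):
        if self.pos < len(self.tokens) and self.tokens[self.pos] == expected:
            self.pos += 1            # Xi equals the current input: advance
        else:
            raise SyntaxError(f"expected {expected!r} at position {self.pos}")

    def E(self):                     # E -> id E'
        self.match("id")
        self.E_prime()

    def E_prime(self):               # E' -> + id E' | ε
        if self.pos < len(self.tokens) and self.tokens[self.pos] == "+":
            self.match("+")
            self.match("id")
            self.E_prime()
        # else: take the ε-alternative (consume no input)

def accepts(tokens):
    p = Parser(tokens)
    try:
        p.E()
        return p.pos == len(tokens)  # parsed successfully iff all input used
    except SyntaxError:
        return False

print(accepts(["id", "+", "id"]))   # True
print(accepts(["id", "+"]))         # False
```

Because the grammar has no left recursion or common prefixes, one symbol of lookahead is always enough to choose an alternative, which is why this version never needs to backtrack.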
Non-Recursive Predictive Parsing:
This type of parsing does not require backtracking. Predictive parsers can be
constructed for LL(1) grammar, the first ‘L’ stands for scanning the input from left to right,
the second ‘L’ stands for leftmost derivation, and ‘1’ for using one input symbol lookahead
at each step to make parsing action decisions.
LL(1) or Table-Driven or Predictive Parser
1. In LL(1), the first L stands for Left-to-Right scanning, and the second L stands for
Left-most Derivation. 1 stands for the number of lookahead tokens used by the parser
while parsing a sentence.
2. LL(1) parsing is constructed from the grammar which is free from left recursion,
common prefix, and ambiguity.
3. LL(1) parser depends on 1 look ahead symbol to predict the production to expand the
parse tree.
4. This parser is Non-Recursive
Construction of LL(1)predictive parsing table
For each production A -> α repeat the following steps:
 Add A -> α under M[A, b] for all b in FIRST(α)
 If FIRST(α) contains ε then add A -> α under M[A,c] for all c in FOLLOW(A).
 Size of parsing table = (No. of terminals + 1) * #variables

Example:
Consider the grammar (productions numbered for the table entries):
1. S -> (L)
2. S -> a
3. L -> SL'
4. L' -> ε
5. L' -> SL'
Here FIRST(S) = FIRST(L) = {(, a}, FIRST(L') = {(, a, ε}, and FOLLOW(L') = FOLLOW(L) = {)}.

M     (     )     a     $

S     1           2

L     3           3

L'    5     4     5

For any grammar, if M has multiple entries then it is not LL(1) grammar.
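The two construction rules can be sketched for this same grammar. FIRST and FOLLOW are hard-coded below (as hand-computed assumptions) to keep the sketch short; a real generator would compute them from the grammar:

```python
# LL(1) table construction sketch for the example grammar
#   1: S -> ( L )   2: S -> a   3: L -> S L'   4: L' -> ε   5: L' -> S L'

EPS = "ε"
productions = {                      # number: (head, body)
    1: ("S",  ["(", "L", ")"]),
    2: ("S",  ["a"]),
    3: ("L",  ["S", "L'"]),
    4: ("L'", [EPS]),
    5: ("L'", ["S", "L'"]),
}
FIRST = {"S": {"(", "a"}, "L": {"(", "a"}, "L'": {"(", "a", EPS}}
FOLLOW = {"S": {"$", "(", "a", ")"}, "L": {")"}, "L'": {")"}}

def first_of_body(body):
    """FIRST of a production body; handles only a leading terminal,
    ε, or a non-nullable non-terminal -- enough for this grammar."""
    x = body[0]
    if x == EPS:
        return {EPS}
    return FIRST[x] if x in FIRST else {x}   # non-terminal vs terminal

table = {}
for num, (head, body) in productions.items():
    f = first_of_body(body)
    for b in f - {EPS}:                      # rule 1: M[A, b] for b in FIRST(α)
        table.setdefault((head, b), set()).add(num)
    if EPS in f:                             # rule 2: ε case uses FOLLOW(A)
        for c in FOLLOW[head]:
            table.setdefault((head, c), set()).add(num)

print(table[("S", "(")])   # {1}
print(table[("L'", ")")])  # {4}
conflicts = [cell for cell, rules in table.items() if len(rules) > 1]
print(conflicts)           # []  -> every cell has one entry: grammar is LL(1)
```

The final check mirrors the note above: if any cell ends up with more than one production number, the grammar is not LL(1).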
Example:
S→iEtSS'/a
S'→eS/ε
E→b
This (dangling-else) grammar is ambiguous: for the input symbol e, both S'→eS and S'→ε
are candidates (e is in FIRST(eS) and in FOLLOW(S')), so M[S', e] gets two entries and
the grammar is not LL(1).

Important Notes
If a grammar contains a common prefix (i.e., it needs left factoring), it cannot be LL(1).
Eg - S -> aS | a
---- both productions go into M[S, a]
If a grammar contains left recursion, it cannot be LL(1).
Eg - S -> Sa | b
S -> Sa goes under FIRST(Sa) = FIRST(S) = {b}
S -> b also goes under b; thus M[S, b] has 2 entries, hence not LL(1)
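The common-prefix case can be checked mechanically: both alternatives of S -> aS | a have FIRST = {a}, so both land in the same table cell. A small illustrative sketch:

```python
# Why S -> aS | a (common prefix) cannot be LL(1): both alternatives
# start with 'a', so both land in the same table cell M[S, a].

FIRST_alt = {"aS": {"a"}, "a": {"a"}}   # FIRST of each alternative of S

cell = {}
for alt, first in FIRST_alt.items():
    for b in first:
        cell.setdefault(("S", b), []).append(alt)

print(cell[("S", "a")])   # ['aS', 'a'] -> two entries in one cell: not LL(1)
```

Left-factoring the grammar into S -> aS', S' -> S | ε removes the common prefix and the conflict.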

Shift Reduce Parser in Compiler


Shift Reduce parser attempts for the construction of parse in a similar manner as
done in bottom-up parsing i.e. the parse tree is constructed from leaves(bottom) to the
root(up). A more general form of the shift-reduce parser is the LR parser.
This parser requires some data structures i.e.
 An input buffer for storing the input string.
 A stack for storing and accessing the grammar symbols.
Basic Operations –
 Shift: This involves moving symbols from the input buffer onto the stack.
 Reduce: If the handle appears on top of the stack then, its reduction by using
appropriate production rule is done i.e. RHS of a production rule is popped out of a
stack and LHS of a production rule is pushed onto the stack.
 Accept: If only the start symbol is present on the stack and the input buffer is empty
then the parsing action is called accept. Obtaining the accept action means the input
was parsed successfully.
 Error: This is the situation in which the parser can neither perform shift action nor
reduce action and not even accept action.
Example 1 – Consider the grammar
S –> S + S
S –> S * S
S –> id
Perform Shift Reduce parsing for input string “id + id + id”.
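A minimal sketch of shift-reduce parsing for this grammar and input follows. The greedy reduce-first strategy is an illustrative assumption; a real parser consults a table to resolve shift/reduce choices:

```python
# Shift-reduce parsing sketch for the grammar
#   S -> S + S | S * S | id
# on the input "id + id + id".  At each step we reduce when a production
# body (a handle) appears on top of the stack, otherwise we shift.

rules = [("S", ["S", "+", "S"]), ("S", ["S", "*", "S"]), ("S", ["id"])]

def shift_reduce(tokens):
    stack, i, trace = [], 0, []
    while True:
        reduced = False
        for head, body in rules:             # try to reduce a handle on top
            if stack[len(stack) - len(body):] == body and len(stack) >= len(body):
                stack[len(stack) - len(body):] = [head]
                trace.append(("reduce " + head + " -> " + " ".join(body),
                              list(stack)))
                reduced = True
                break
        if reduced:
            continue
        if i < len(tokens):                  # otherwise shift the next token
            stack.append(tokens[i]); i += 1
            trace.append(("shift " + tokens[i - 1], list(stack)))
        elif stack == ["S"]:
            trace.append(("accept", list(stack)))   # only S left, input empty
            return trace
        else:
            trace.append(("error", list(stack)))    # no shift, reduce or accept
            return trace

for action, stack in shift_reduce(["id", "+", "id", "+", "id"]):
    print(f"{action:20} stack: {stack}")
```

The trace ends with the stack holding only S and the input buffer empty, i.e. the accept action described above.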

Advantages:
 Shift-reduce parsing is efficient and can handle a wide range of context-free grammars.
 It can parse a large variety of programming languages and is widely used in practice.
 It is capable of handling both left- and right-recursive grammars, which can be
important in parsing certain programming languages.
 The parse table generated for shift-reduce parsing is typically small, which makes the
parser efficient in terms of memory usage.

Disadvantages:
 Shift-reduce parsing has a limited lookahead, which means that it may miss some syntax
errors that require a larger lookahead.
 It may also generate false-positive shift-reduce conflicts, which can require additional
manual intervention to resolve.
 Shift-reduce parsers may have difficulty in parsing ambiguous grammars, where there
are multiple possible parse trees for a given input sequence.
 In some cases, the parse tree generated by shift-reduce parsing may be more complex
than other parsing techniques.
LR Parser
LR parser is a bottom-up parser for context-free grammars that is widely used by
compilers for programming languages and other associated tools. LR parsers read their
input from left to right and produce a rightmost derivation (in reverse). It is called a
bottom-up parser because it attempts to reach the top-level grammar productions by
building up from the leaves. LR parsers are the most powerful of all deterministic
parsers used in practice.

Description Of LR Parser :
In the term LR(k) parser, L refers to the left-to-right scanning, R refers to the
rightmost derivation in reverse, and k refers to the number of unconsumed "lookahead"
input symbols that are used in making parser decisions. Typically, k is 1 and is often
omitted. A context-free grammar is called LR(k) if an LR(k) parser exists for it. The
parser reduces the input step by step, working from left to right; read in reverse, this
sequence of reductions is a rightmost derivation.
1. Initially the stack is empty, and the goal is to reduce the input to the augmented
start rule S'→S$.
2. A "." in a rule marks how much of that rule's right-hand side is already on the stack.
3. A dotted item, or simply, item, is a production rule with a dot indicating how much of
the RHS has so far been recognized. The closure of an item is used to see what production
rules can be used to expand the current structure. It is calculated as follows :
Rules for LR parser
The rules of the LR parser are as follows.
1. The start item from the given grammar rules (the augmented production with the dot at
the left end) forms the first closure set by itself.
2. If an item of the form A→ α . B γ is present in a closure, where the symbol B after
the dot is a non-terminal, add B's production rules with the dot preceding the first
symbol.
3. Repeat step (2) for all newly added items.
LR parser algorithm :
The LR parsing algorithm is the same for all LR parsers; only the parsing table differs
from parser to parser. It consists of the following components.
Input Buffer:
It contains the given string, and it ends with a $ symbol.
Stack :
The combination of the state symbol on top of the stack and the current input symbol is
used to index the parsing table in order to take the parsing decisions.
Parsing Table:
Parsing table is divided into two parts- Action table and Go-To table. The action
table gives a grammar rule to implement the given current state and current terminal in the
input stream. There are four cases used in action table as follows.
1. Shift n - the present terminal is removed from the input stream and the state n is
pushed onto the stack, becoming the new present state.
2. Reduce m - the rule number m is written to the output stream; as many states are
popped from the stack as there are symbols on the right-hand side of rule m; then the
non-terminal on the left-hand side of rule m, together with the newly exposed state, is
looked up in the goto table, and the resulting state is pushed onto the stack as the new
current state.
3. Accept - the string is accepted.
4. No action - a syntax error is reported.
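The driver loop described above can be sketched as follows. The tiny grammar and its hand-built SLR(1) ACTION/GOTO tables are illustrative assumptions, not from the text; the loop itself is the same for SLR, LALR and canonical LR parsers:

```python
# Table-driven LR parse loop sketch for the tiny grammar
#   1: S -> ( S )    2: S -> a
# ACTION entries: ("s", n) shift to state n, ("r", m) reduce by rule m,
# ("acc", 0) accept; a missing entry means a syntax error.

rules = {1: ("S", 3), 2: ("S", 1)}          # rule -> (head, length of RHS)
ACTION = {
    (0, "("): ("s", 2), (0, "a"): ("s", 3),
    (1, "$"): ("acc", 0),
    (2, "("): ("s", 2), (2, "a"): ("s", 3),
    (3, ")"): ("r", 2), (3, "$"): ("r", 2),
    (4, ")"): ("s", 5),
    (5, ")"): ("r", 1), (5, "$"): ("r", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}

def lr_parse(tokens):
    tokens = tokens + ["$"]                  # input buffer ends with $
    stack, i = [0], 0                        # stack of states
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            return "error"                   # no action: syntax error
        kind, n = act
        if kind == "s":                      # shift: push state n, consume token
            stack.append(n); i += 1
        elif kind == "r":                    # reduce by rule n
            head, length = rules[n]
            del stack[len(stack) - length:]  # pop |RHS| states
            stack.append(GOTO[(stack[-1], head)])
        else:
            return "accept"

print(lr_parse(["(", "(", "a", ")", ")"]))  # accept
print(lr_parse(["(", "a"]))                 # error
```

Swapping in different ACTION/GOTO tables (built by the SLR, LALR or CLR construction) changes which grammars are handled, but never this driver.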

LR parser diagram :

LALR Parser :
LALR Parser is the lookahead LR parser. It can handle large classes of grammars. The
size of the CLR parsing table is quite large compared to other parsing tables; LALR
reduces the size of this table. LALR works similarly to CLR, the only difference being
that it combines the similar states of the CLR parsing table into one single state.
The general syntax becomes [A->∝.B, a ]
where A->∝.B is production and a is a terminal or right end marker $
LR(1) items=LR(0) items + look ahead
How to add lookahead with the production?
CASE 1 –
A->∝.BC, a
Suppose this is the 0th production. Since ' . ' precedes B, we must also add B's
productions.
B->.D [1st production]
The lookahead of this new item is computed from the previous, i.e. 0th, production:
whatever comes after B there, we take FIRST(of that value); that is the lookahead of the
1st production. Here, in the 0th production, C comes after B. Assuming FIRST(C) = d, the
1st production becomes:
B->.D, d
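The lookahead rule above (FIRST of whatever follows B, falling back to the item's own lookahead when that can derive ε) can be sketched as:

```python
# Lookahead computation sketch for LR(1) items.  For an item
#   A -> α . B γ , a
# the new items  B -> . δ  get lookahead FIRST(γ a): that is FIRST(γ),
# plus the item's own lookahead a if γ can derive ε.

EPS = "ε"

def first_seq(symbols, FIRST):
    """FIRST of a sequence of grammar symbols (a terminal stands for itself)."""
    out = set()
    for x in symbols:
        fx = FIRST.get(x, {x})       # terminal: FIRST(x) = {x}
        out |= fx - {EPS}
        if EPS not in fx:
            return out               # x not nullable: stop here
    out.add(EPS)                     # whole sequence can derive ε
    return out

def item_lookahead(gamma, a, FIRST):
    f = first_seq(gamma, FIRST)
    if EPS in f:
        f = (f - {EPS}) | {a}        # γ nullable: inherit the item's lookahead
    return f

# From the text: in  A -> α . B C , a  with FIRST(C) = {d},
# the new item  B -> . D  gets lookahead FIRST(C a) = {d}.
FIRST = {"C": {"d"}}
print(item_lookahead(["C"], "a", FIRST))   # {'d'}
print(item_lookahead([], "a", FIRST))      # {'a'}  (nothing after B)
```

When nothing follows B, or everything that follows is nullable, the new item simply inherits the parent item's lookahead.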

Steps for constructing the LALR parsing table :


1. Writing the augmented grammar
2. Finding the LR(1) collection of items
3. Defining 2 functions: action[for the terminals] and goto[for the non-terminals] in
the LALR parsing table
Error Handling in Compiler Design
Finding error or reporting an error – The viable-prefix property of a parser allows
early detection of syntax errors.
 Goal: detect an error as soon as possible, without consuming unnecessary further
input.
 How: detect an error as soon as the prefix of the input no longer matches a prefix of
any string in the language.
Error Recovery:
The minimal requirement for the compiler is to simply stop, issue a message, and
cease compilation. Beyond that, there are some common recovery methods, described below.
Having discussed the kinds of errors, let us now try to understand the recovery of
errors in each phase of the compiler.
1. Panic mode recovery :
This is the easiest way of error recovery, and it also prevents the parser from
developing infinite loops while recovering from an error.
The parser discards input symbols one at a time until one of a designated set of
synchronizing tokens (typically statement or expression terminators such as end or the
semicolon) is found.
This is adequate when the presence of multiple errors in the same statement is rare.
Example: Consider the erroneous expression- (1 + + 2) + 3. Panic-mode recovery: Skip
ahead to the next integer and then continue. Bison: use the special terminal error to
describe how much input to skip.
E->int|E+E|(E)|error int|(error)
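A minimal sketch of the skip-to-synchronizing-token step follows; the token list and error position are illustrative assumptions:

```python
# Panic-mode recovery sketch: on a syntax error, discard input tokens
# one at a time until a synchronizing token (here ';' or 'end') is
# found, then resume parsing just past it.

SYNC = {";", "end"}

def recover(tokens, i):
    """Skip tokens from position i until just past a synchronizing token."""
    while i < len(tokens) and tokens[i] not in SYNC:
        i += 1                       # discard one input symbol at a time
    return i + 1 if i < len(tokens) else i

tokens = ["x", "=", "(", "1", "+", "+", "2", ")", ";", "y", "=", "3", ";"]
# Suppose the parser detects the error at the second '+': position 5.
resume = recover(tokens, 5)
print(tokens[resume:])   # ['y', '=', '3', ';']  -> parsing continues here
```

Everything between the error and the semicolon is discarded, which is why panic mode can silently skip over a second error inside the same statement.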
2. Phrase-level recovery :
When an error is discovered, the parser performs local correction on the remaining
input.
If a parser encounters an error, it makes the necessary corrections on the remaining
input so that the parser can continue to parse the rest of the statement.
To prevent going in an infinite loop during the correction, utmost care should be
taken.
Whenever any prefix is found in the remaining input, it is replaced with some string.
In this way, the parser can continue to operate on its execution.
3. Error productions :
The error-production method can be used when the compiler writer is aware of common
mistakes: the grammar is augmented with productions that generate the erroneous
constructs.
When such a production is used, an error message can be generated during the parsing
process, and the parsing can continue. Example: writing 5x instead of 5*x
4. Global correction :
In order to recover from erroneous input, the parser analyzes the whole program
and tries to find the closest match for it, which is error-free.
The closest match is one that does not do many insertions, deletions, and changes of
tokens. This method is not practical due to its high time and space complexity.

Advantages of Error Handling in Compiler Design:


1. Robustness:
Error handling improves the robustness of the compiler by allowing it to handle and
recover from different kinds of errors gracefully.
This ensures that even in the presence of errors, the compiler can keep processing the
input program and give meaningful error messages.
2. Error detection:
By incorporating error-handling mechanisms, a compiler can identify and locate errors
in the source code.
This includes syntactic errors, semantic errors, type errors, and other potential
issues that might make the program behave unexpectedly or produce incorrect results.
3. Error reporting:
Compiler error handling facilitates effective error reporting to the user or
programmer.
It produces descriptive error messages that help developers understand the nature and
location of an error, enabling them to fix the issues efficiently.
4. Error recovery:
Error handling allows the compiler to recover from errors and continue the compilation
process whenever possible.
This is accomplished through techniques such as error correction, error
synchronization, and resynchronization. The compiler attempts to repair the errors and
proceeds with compilation, preventing the whole process from being terminated
abruptly.
5. Incremental compilation:
Error handling enables incremental compilation, where a compiler can compile and
execute the correct portions of a program even when other sections contain errors.
This feature is especially helpful for large-scale projects, as it allows developers
to test and debug specific modules without recompiling the entire codebase.
Disadvantages of error handling in compiler design:
1.Increased complexity:
Error handling in compiler design can significantly increase the complexity of the
compiler.
This can make the compiler more challenging to develop, test, and maintain. The more
complex the error handling mechanism is, the more difficult it becomes to ensure that it is
working correctly and to find and fix errors.
2.Reduced performance:
Error handling in compiler design can also impact the performance of the compiler. This
is especially true if the error handling mechanism is time-consuming and computationally
intensive.
As a result, the compiler may take longer to compile programs and may require more
resources to operate.
3.Increased development time:
Developing an effective error handling mechanism can be a time-consuming process.
This is because it requires significant testing and debugging to ensure that it works as
intended. This can slow down the development process and result in longer development
times.
4.Difficulty in error detection:
While error handling is designed to identify and handle errors in the source code, it
can also make it more difficult to detect errors. This is because the error handling
mechanism may mask some errors, making it harder to identify them. Additionally, if the
error handling mechanism is not working correctly, it may fail to detect errors altogether.
Introduction to YACC
YACC is an LALR parser generator developed at the beginning of the 1970s by
Stephen C. Johnson for the Unix operating system. It automatically generates the LALR(1)
parsers from formal grammar specifications.
YACC plays an important role in compiler and interpreter development since it
provides a means to specify the grammar of a language and to produce parsers that either
interpret or compile code written in that language.
Key Concepts and Features of YACC
 Grammar Specification: The input to YACC is a context-free grammar (usually in the
Backus-Naur Form, BNF) that describes the syntax rules of the language it parses.
 Parser Generation: YACC translates the grammar into C code for a function that
efficiently parses input text according to the specified rules.
 LALR(1) Parsing: This is a bottom-up parsing method that makes use of a single token
lookahead in determining the next action of parsing.
 Semantic Actions: These are code fragments (usually in C) associated with grammar
productions; they enable the execution of code for the construction of abstract syntax
trees, the generation of intermediate representations, or error handling.
 Attribute Grammars: These grammars consist of non-terminal grammar symbols with
attributes, which through semantic actions are used in the construction of parse trees or
the output of code.
 Integration with Lex: YACC is often used along with Lex, a tool that generates
lexical analyzers (scanners), which break the input into tokens that are then processed
by the YACC parser.
For Compiling a YACC Program:
1. Write the lex program in a file file.l and the yacc program in a file file.y
2. Open a terminal and navigate to the directory where you have saved the files.
3. Type lex file.l
4. Type yacc -d file.y (the -d option generates the header y.tab.h used by the scanner)
5. Type cc lex.yy.c y.tab.c -ll
6. Type ./a.out
