Unit 5 SP
Phases of a Compiler
A compiler has two main phases, namely the Analysis phase and the Synthesis phase. The analysis phase creates an intermediate representation from the given source code. The synthesis phase creates an equivalent target program from that intermediate representation.
Symbol Table – It is a data structure used and maintained by the compiler, consisting of all the identifiers' names along with their types. It helps the compiler function smoothly by letting it find identifiers quickly.
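As a concrete sketch, a minimal symbol-table entry could be represented in C as follows (the field names and sizes here are illustrative assumptions, not part of these notes):

/* A minimal symbol-table entry: an identifier's name and type. */
struct symbol {
    char name[64];  /* identifier name */
    char type[16];  /* its type, e.g. "int" or "float" */
    int scope;      /* nesting level of the enclosing scope */
};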
The analysis of a source program is divided into three main phases. They are:
1. Linear Analysis –
This involves a scanning phase where the stream of characters is read from left to right and grouped into tokens having a collective meaning.
2. Hierarchical Analysis –
In this phase, the tokens are grouped hierarchically into nested collections with a collective meaning.
3. Semantic Analysis-
This phase is used to check whether the components of the source program are
meaningful or not.
The compiler has two modules, namely the front end and the back end. The front end consists of the lexical analyzer, syntax analyzer, semantic analyzer, and intermediate code generator; the remaining phases together form the back end.
1. Lexical Analyzer –
It is also called a scanner. It takes as input the output of the preprocessor (which performs file inclusion and macro expansion), which is in a pure high-level language. It reads the characters of the source program and groups them into lexemes (sequences of characters that "go together"). Each lexeme corresponds to a token. Tokens are defined by regular expressions, which the lexical analyzer understands. It also detects lexical errors (e.g., erroneous characters) and removes comments and white space.
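For instance, identifier and number tokens are often described with regular expressions like the following (a standard textbook formulation, shown purely for illustration):

id     → letter ( letter | digit )*
number → digit digit*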
2. Syntax Analyzer – It is sometimes called a parser. It constructs the parse tree. It
takes all the tokens one by one and uses Context-Free Grammar to construct the
parse tree.
Why Grammar?
The syntactic rules of a programming language can be represented by a few productions. Using these productions we can describe what a valid program is, and check whether the input is in the desired format.
The parse tree is also called the derivation tree. Parse trees are generally
constructed to check for ambiguity in the given grammar. There are certain rules
associated with the derivation tree.
Any identifier is an expression
Any number can be called an expression
Performing any operations in the given expression will always result in an
expression. For example, the sum of two expressions is also an expression.
The parse tree can be compressed to form a syntax tree
Syntax errors can be detected at this level if the input does not conform to the grammar.
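As an illustration, a small expression grammar capturing the rules above might be written as follows (a standard textbook grammar, not taken from these notes):

E → E + E
E → E * E
E → ( E )
E → id
E → num

Under these productions, id + num is an expression, and so is ( id + num ) * id. Note that this grammar is ambiguous: the string id + num * id has two distinct parse trees, which is exactly the kind of problem that constructing parse trees helps to detect.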
3. Semantic Analyzer – It verifies the parse tree, checking whether it is meaningful or not, and produces a verified parse tree. It also performs type checking, label checking, and flow-control checking.
4. Intermediate Code Generator – It generates intermediate code, a form that can be readily translated into target machine code. There are many popular intermediate representations; three-address code is a common example. The intermediate code is converted to machine language by the last two phases, which are platform dependent.
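For example, the assignment a = b + c * d could be translated into three-address code along these lines (the temporaries t1 and t2 are illustrative names):

t1 = c * d
t2 = b + t1
a = t2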
Up to the intermediate code, compilation is the same for every compiler out there; after that, it depends on the platform. To build a new compiler we do not need to build it from scratch: we can take the intermediate code from an already existing compiler and build only the last two parts.
5. Code Optimizer – It transforms the code so that it consumes fewer resources and runs faster, without altering the meaning of the code. Optimization can be categorized into two types: machine-dependent and machine-independent.
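A small machine-independent example is constant folding, which evaluates constant expressions at compile time (shown below in illustrative three-address code):

Before optimization:
t1 = 3 * 4
a = t1 + 2
After optimization:
a = 14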
6. Target Code Generator – The main purpose of the target code generator is to produce code that the machine can understand, performing register allocation, instruction selection, and so on. The output is dependent on the type of assembler. This is the final stage of compilation: the optimized code is converted into relocatable machine code, which then forms the input to the linker and loader.
Lexical Analysis
Lexical Analysis is the first phase of the compiler, also known as scanning. It converts the high-level input program into a sequence of tokens.
Lexical analysis can be implemented with a deterministic finite automaton.
The output is a sequence of tokens that is sent to the parser for syntax analysis.
What is a token?
A lexical token is a sequence of characters that can be treated as a unit in the grammar of a programming language.
Examples of tokens:
Type tokens (id, number, real, ...)
Punctuation tokens (';', ',', '(', ')', ...)
Alphabetic tokens (keywords)
Keywords; examples: for, while, if, etc.
Identifiers; examples: variable names, function names, etc.
Operators; examples: '+', '++', '-', etc.
Separators; examples: ',', ';', etc.
Example of Non-Tokens:
Comments, preprocessor directive, macros, blanks, tabs, newline, etc.
Lexeme: The sequence of characters matched by a pattern to form the corresponding token, i.e., the sequence of input characters that comprises a single token, is called a lexeme. E.g., “float”, “abs_zero_Kelvin”, “=”, “-”, “273”, “;”.
How the Lexical Analyzer Functions
The lexical analyzer identifies errors with the help of its automaton and the grammar of the given language on which it is based (e.g., C or C++), and reports the row number and column number of each error.
Suppose we pass the statement a = b + c; through the lexical analyzer. It will generate a token sequence like this:
id = id + id;
where each id refers to its variable's entry in the symbol table, which holds all of its details.
For example, consider the program:
int main()
{
    // 2 variables
    int a, b;
    a = 10;
    return 0;
}
All the valid tokens are:
'int' 'main' '(' ')' '{' 'int' 'a' ',' 'b' ';'
'a' '=' '10' ';' 'return' '0' ';' '}'
You can observe that the comment has been omitted from the token stream.
The syntax analysis phase is the second phase of a compiler. It takes input from the lexical analyzer and provides an output that serves as input to the semantic analyzer.
The syntax analyzer, also referred to as the parser, reads the string of tokens from the lexical analyzer and confirms that it can be generated from the grammar of the source language.
The syntax analyzer then forwards this parse tree to the rest of the front end for processing.
Besides building the parse tree, the syntax analyzer also collects information about each token and stores it in the symbol table. Along with this, it also performs:
Type checking.
Semantic analysis.
Intermediate code generation.
Types of Parsing
The three common types of parsing are as follows:
1. Universal Parsing
2. Top-down Parsing
3. Bottom-up Parsing
Universal Parsing
Universal parsing methods can parse any grammar, but they are far too inefficient to be used in a production compiler. So usually only two methods are used for parsing: top-down and bottom-up.
Top-down Parsing
In the top-down method, the parser builds the parse tree starting from the top. That means it starts from the root of the parse tree and traverses down towards the leaves.
Bottom-up Parsing
In the bottom-up method, the parser builds the parse tree starting from the bottom. This implies it starts from the leaves of the parse tree and traverses upwards to the root.
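To make the difference concrete, consider the grammar E → E + E | id and the input id + id (a standard textbook example, not from these notes):

Top-down (from the root to the leaves, a leftmost derivation):
E ⇒ E + E ⇒ id + E ⇒ id + id
Bottom-up (from the leaves to the root, a sequence of reductions):
id + id → E + id → E + E → E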
Note: Whichever type of parsing the parser chooses, it starts scanning the parse tree from the left and continues traversing the tree towards the right. Remember, it scans only one symbol or node at a time.
However, the main task of the parser is to detect syntactic errors efficiently.
Error handling in the parser involves reporting the location of the error in the program: the error handler must point to the line at which the error occurred, which helps in detecting and fixing it.
Context-Free Grammar
In this section, we will first see the definition of context-free grammar and introduce
terminologies used in parsing technology.
A context-free grammar has four components:
A set of non-terminals (V). Non-terminals are syntactic variables that denote
sets of strings. The non-terminals define sets of strings that help define the
language generated by the grammar.
A set of tokens, known as terminal symbols (Σ). Terminals are the basic
symbols from which strings are formed.
A set of productions (P). The productions of a grammar specify the manner in which the terminals and non-terminals can be combined to form strings. Each production consists of a non-terminal called the left side of the production, an arrow, and a sequence of tokens and/or non-terminals, called the right side of the production.
One of the non-terminals is designated as the start symbol (S), from which the production begins.
Strings are derived from the start symbol by repeatedly replacing a non-terminal (initially the start symbol) with the right side of a production for that non-terminal.
Example
We take the problem of the palindrome language, which cannot be described by means of a regular expression. That is, L = { w | w = w^R } is not a regular language. But it can be described by means of a CFG, as illustrated below:
G = ( V, Σ, P, S )
Where:
V = { Q, Z, N }
Σ = { 0, 1 }
P = { Q → Z | N | 0 | 1 | ε, Z → 0Q0, N → 1Q1 }
S = Q
This grammar describes the palindrome language, generating strings such as 1001, 11100111, 00100, 1010101, 11111, etc.
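For example, the string 00100 can be derived from this grammar as follows:
Q ⇒ Z ⇒ 0Q0 ⇒ 0Z0 ⇒ 00Q00 ⇒ 00100 (the last step uses Q → 1 for the middle symbol)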
Syntax Analyzers
A syntax analyzer or parser takes the input from a lexical analyzer in the form of token
streams. The parser analyzes the source code (token stream) against the production
rules to detect any errors in the code. The output of this phase is a parse tree.
This way, the parser accomplishes two tasks, i.e., parsing the code, looking for errors
and generating a parse tree as the output of the phase.
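For instance, for the statement a = b + c (tokenized as id = id + id), the resulting parse tree can be sketched roughly as follows (a simplified view, not a full derivation tree):

      =
     / \
   id   +
       / \
     id   id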
What is LEX?
It is a tool which automatically generates a lexical analyzer (a finite automaton). It takes a LEX source program as its input and produces a lexical analyzer as its output. The lexical analyzer then converts the input string entered by the user into tokens.
LEX is a program generator designed for the lexical processing of character input streams. It can generate anything from a simple text-search program that looks for patterns in its input to the scanner of a C compiler. In programs with structured input, two tasks occur over and over: dividing the input into meaningful units, and discovering the relationships among those units. For a C program, the units are variable names, constants, strings, and so on. This division into units (called tokens) is known as lexical analysis, or LEXING. LEX helps by taking a set of descriptions of possible tokens and producing a routine called a lexical analyzer, LEXER, or scanner.
A LEX source program consists of two parts:
Auxiliary Definitions
Translation Rules
Auxiliary Definitions
These denote regular expressions of the form:
D1 = R1
D2 = R2
...
Dn = Rn
where each Di is a distinct name and each Ri is a regular expression over the input alphabet and the previously defined names.
Translation Rules
These are a set of rules or actions which tell the lexical analyzer what it has to do or what it has to return to the parser on encountering a token.
They consist of statements of the form:
P1 {Action1}
P2 {Action2}
...
Pn {Actionn}
where:
Pi → a pattern or regular expression consisting of input alphabets and auxiliary definition names.
Actioni → a piece of code that gets executed whenever the corresponding token is recognized. Each Actioni specifies a set of statements to be executed whenever the regular expression or pattern Pi matches the input string.
Example
Translation Rules for "Keywords"
begin {return 1}
We can see that if the lexical analyzer is given the input "begin", it will recognize the token "begin" and return 1 as its integer code to the parser.
Translation Rules for "Identifiers"
letter (letter + digit)* {Install(); return 6}
If the lexical analyzer is given a token which is an "identifier", the action taken is to install (store) the name in the symbol table and return 6 as the integer code to the parser.
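A minimal sketch in C of what such an Install() routine might look like is given below; the table layout and the function's signature are illustrative assumptions (in a real LEX-generated scanner the matched lexeme is available in the global yytext):

#include <string.h>

#define MAXSYMS 256

static char symtab[MAXSYMS][64]; /* illustrative symbol table */
static int nsyms = 0;

/* Store the lexeme in the symbol table if it is not already there,
   and return its index. */
int Install(const char *lexeme) {
    for (int i = 0; i < nsyms; i++)
        if (strcmp(symtab[i], lexeme) == 0)
            return i; /* already installed */
    if (nsyms >= MAXSYMS)
        return -1;    /* table full */
    strncpy(symtab[nsyms], lexeme, 63);
    symtab[nsyms][63] = '\0';
    return nsyms++;
}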
YACC
A parser generator is a program that takes as input a specification of a syntax,
and produces as output a procedure for recognizing that language. Historically,
they are also called compiler-compilers.
YACC (yet another compiler-compiler) is an LALR(1) (LookAhead, Left-to-right,
Rightmost derivation producer with 1 lookahead token) parser generator. YACC
was originally designed for being complemented by Lex.
Input File:
YACC input file is divided into three parts.
/* definitions */
....
%%
/* rules */
....
%%
/* auxiliary routines */
....
Input File: Definition Part:
The definition part includes information about the tokens used in the syntax
definition:
%token NUMBER
%token ID
Yacc automatically assigns numbers to tokens, but this can be overridden by specifying an explicit number after the token name in its declaration.
The definition part can include C code external to the definition of the parser
and variable declarations, within %{ and %} in the first column.
It can also include the specification of the starting symbol in the grammar:
%start nonterminal
Input File: Rule Part:
The rules part contains the grammar of the language in a modified BNF form, with the action for each rule given as C code enclosed in { }.
Input File: Auxiliary Routines Part:
The auxiliary routines part is plain C code. If yylex() is not defined there, the Lex-generated scanner should be included with:
#include "lex.yy.c"
The YACC input file conventionally has the extension .y.
Output Files:
Yacc's main output is a C source file (conventionally y.tab.c) containing the generated parser. If called with the –d option in the command line, Yacc also produces a header file y.tab.h with all its specific definitions (particularly important are the token definitions, to be included, for example, in a Lex input file).
Example:
Yacc File (.y)
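As a minimal sketch, a Yacc file for a toy grammar that adds single-digit numbers might look like this, following the three-part layout shown above (the grammar and the hand-written yylex() are illustrative, not from the original):

%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
%}
%token DIGIT
%%
line : expr '\n'     { printf("%d\n", $1); }
     ;
expr : expr '+' term { $$ = $1 + $3; }
     | term
     ;
term : DIGIT
     ;
%%
/* A tiny hand-written scanner used in place of a Lex-generated one. */
int yylex(void) {
    int c = getchar();
    if (isdigit(c)) { yylval = c - '0'; return DIGIT; }
    return c;
}
int main(void) { return yyparse(); }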
What Is An Interpreter?
Just like a compiler, an interpreter translates a high-level language into a low-level form. The difference is that an interpreter executes the source program directly, typically line by line, instead of producing a separate target program.
Advantages Of Interpreter
1. Cross-Platform → With an interpreted language we share the source code directly, which can run on any system that has the interpreter, without incompatibility issues.
2. Easier To Debug → Debugging is easier with an interpreter, since it reads the code line by line and reports errors on the spot. Also, a client with the source code can easily debug or modify the code if needed.
3. Less Memory and Steps → Unlike a compiler, an interpreter does not generate new separate files, so it does not take extra memory and we do not need to perform an extra step to execute the source code; it is executed on the fly.
4. Execution Control → An interpreter reads the code line by line, so you can stop execution and edit the code at any point, which is not possible in a compiled language. However, after being stopped, execution will start from the beginning when you run the code again.
Disadvantages Of Interpreter
1. Slower → An interpreter is often slower than a compiler, as it reads, analyzes, and converts the code line by line.
2. Interpreter required → A client, or anyone with the shared source code, needs to have an interpreter installed on their system in order to execute the code.
3. Less Secure → Unlike compiled languages, an interpreter does not generate any executable file, so to share the program with others we need to share our source code, which is neither secure nor private. So it is not good for any company or corporation concerned about privacy.
Types of Errors
There are five different types of errors in C.
1. Syntax Error
2. Run Time Error
3. Logical Error
4. Semantic Error
5. Linker Error
1. Syntax Error
Syntax errors occur when a programmer makes typing mistakes or typos in the code. In other words, syntax errors occur when a programmer does not follow the set of rules defined for the syntax of the C language.
Syntax errors are sometimes also called compilation errors because they are always
detected by the compiler. Generally, these errors can be easily identified and rectified
by programmers.
void main() {
    var = 5; // we did not declare the data type of the variable
}
Output:
Since a value is assigned to the variable var without defining the variable's data type, the compiler throws a syntax error.
void main() {
    for (int i = 0;) { // incorrect syntax of the for loop
        printf("Scaler Academy");
    }
}
Output:
A for loop needs three clauses (initialization; condition; update) to run. Since we entered only one, the compiler throws a syntax error.
2. Run Time Error
Errors that occur during the execution (or running) of a program are called Run Time
Errors. These errors occur after the program has been compiled successfully. When a
program is running, and it is not able to perform any particular operation, it means that
we have encountered a run time error. For example, while a certain program is running,
if it encounters the square root of -1 in the code, the program will not be able to
generate an output because calculating the square root of -1 is not possible. Hence, the
program will produce an error.
Run time errors can be a little tricky to identify because the compiler can not detect
these errors. They can only be identified once the program is running. Some of the most
common run time errors are: division by zero, array index out of bounds, string index out of bounds, etc.
Run time errors can occur because of various reasons. Some of the reasons are:
1. Mistakes in the Code: Let us say that during the execution of a while loop, the programmer forgets to enter a break statement. This will cause the loop to run forever, resulting in a run time error.
2. Memory Leaks: If a programmer creates an array in the heap but forgets to delete the
array's data, the program might start leaking memory, resulting in a run time error.
3. Mathematically Incorrect Operations: Dividing a number by zero, or calculating the
square root of -1 will also result in a run time error.
4. Undefined Variables: If a programmer forgets to define a variable in the code, the
program will generate a run time error.
Example 1:
// A program that calculates the square root of integers
#include <stdio.h>
#include <math.h>

int main() {
    for (int i = 4; i >= -2; i--) {
        printf("%f", sqrt(i));
        printf("\n");
    }
    return 0;
}
Output (the exact representation of the invalid results depends on the compiler):
2.000000
1.732051
1.414214
1.000000
0.000000
-1.#IND00
-1.#IND00
or, on other systems:
2.000000
1.732051
1.414214
1.000000
0.000000
-nan
-nan
In the above example, we used a for loop to calculate the square roots of seven integers. But because we also tried to calculate the square roots of two negative numbers, the program generated two errors (the IND written above stands for "Indeterminate"). These errors are run time errors. -nan is similar to IND.
Example 2:
#include <stdio.h>

void main() {
    int var = 2147483649;
    printf("%d", var);
}
Output:
-2147483647
This is an integer overflow error. The maximum value an int can hold in C is 2147483647. Since in the above example we assigned 2147483649 to the variable var, the value wraps around (2147483649 - 2^32 = -2147483647), and we get -2147483647 as the output.
3. Logical Error
Sometimes, we do not get the output we expected after the compilation and execution
of a program. Even though the code seems error free, the output generated is different
from the expected one. These types of errors are called Logical Errors. Logical errors
are those errors in which we think that our code is correct, the code compiles without
any error and gives no error while it is running, but the output we get is different from
the output we expected.
In 1999, NASA lost a spacecraft due to a logical error. This happened because of miscalculations between English (imperial) and metric units: the software was coded to work with one system of units but was fed data in the other.
For Example:
#include <stdio.h>

void main() {
    float a = 10;
    float b = 5;
    if (b = 0) { // we wrote = instead of ==
        printf("Division by zero is not possible");
    } else {
        printf("The output is: %f", a/b);
    }
}
Output:
The output is: inf
INF signifies a division by zero error. In the above example, we wanted to check in the if condition whether the variable b was equal to zero. But instead of using the equal-to comparison operator (==), we used the assignment operator (=). Because of this, b became 0, the if condition evaluated to false, and the else clause got executed, dividing a by zero.
4. Semantic Error
Errors that occur because the compiler is unable to understand the written code are
called Semantic Errors. A semantic error will be generated if the code makes no sense
to the compiler, even though it is syntactically correct. It is like using the wrong word in
the wrong place in the English language. For example, adding a string to an integer will
generate a semantic error.
Semantic errors are different from syntax errors, as syntax errors signify that the
structure of a program is incorrect without considering its meaning. On the other hand,
semantic errors signify the incorrect implementation of a program by considering the
meaning of the program.
The most commonly occurring semantic errors are: use of uninitialized variables, type incompatibility, and array index out of bounds.
Example 1:
#include <stdio.h>

void main() {
    int a, b, c;
    a * b = c; // This will generate a semantic error
}
Output:
A compilation error such as "lvalue required as left operand of assignment". When we have an expression on the left-hand side of an assignment operator (=), the program generates a semantic error. Even though the code is syntactically correct, the compiler cannot make sense of it.
Example 2:
#include <stdio.h>

void main() {
    int arr[5] = {5, 10, 15, 20, 25};
    int arraySize = sizeof(arr)/sizeof(arr[0]);
    for (int i = 0; i <= arraySize; i++) // note: <= runs one element past the end
        printf("%d\n", arr[i]);
}
Output:
5
10
15
20
25
32764
In the above example, the loop printed six values while the array arr only has five elements. Because we tried to access the sixth element of the array, we got a semantic error, and the program printed a garbage value.
5. Linker Error
The linker is a program that takes the object files generated by the compiler and combines them into a single executable file. Linker errors are encountered when the executable file of the code cannot be generated even though the code compiles successfully. This error occurs when an object file is unable to link with the main object file. We can run into a linker error if we have imported an incorrect header file in the code, have a wrong function declaration, etc.
For Example:
#include <stdio.h>

void Main() { // note the capital M
    int var = 10;
    printf("%d", var);
}
Output:
A linker error such as "undefined reference to `main'". In the above code, as we wrote Main() instead of main(), the program generated a linker error. This happens because every C program must have a main() function. Since the above program does not define main(), the linker cannot produce an executable, and we get an error. This is one of the most common types of linker error.
Difference between Unix and Windows
1. Licensing: UNIX is an open-source system that can be used under the General Public License; Windows is proprietary software owned by Microsoft.
2. User Interface: UNIX has a text-based interface, making it harder to grasp for newcomers; Windows has a Graphical User Interface, making it simpler to use.
3. Processing: UNIX supports multiprocessing; Windows supports multithreading.
4. File System: UNIX uses the Unix File System (UFS); Windows uses the File Allocation Table (FAT32) and the New Technology File System (NTFS).
5. Security: UNIX is more secure, as all changes to the system require explicit user permission; Windows is less secure compared to UNIX.
6. Data Backup & Recovery: Creating a backup and recovery system is tedious in UNIX, but it is improving with the introduction of new distributions; Windows has an integrated backup and recovery system that makes it simpler to use.
8. Hardware: Hardware support is limited in UNIX systems, and some hardware might not have drivers built for it; on Windows, drivers are available for almost all hardware.
9. Reliability: Unix and its distributions are well known for being very stable to run; although Windows has been stable in recent years, it is yet to match the stability provided by Unix systems.