

GROUP 1 CSC312

COMPILER DESIGN AND CONSTRUCTION II.
TOPIC: TOP DOWN AND BOTTOM UP APPROACH.

WHAT IS PARSING?
Parsing means dividing something into parts so that each part can be examined
individually. You can take a sentence, split it into grammatical components,
then identify them and the relations between them. An example is URL parsing,
which is common in web development: it involves breaking down a URL into
components such as the protocol, domain, path and query parameters. For
instance, parsing the URL https://www.example.com/page?query=123 would extract
the protocol (“https”), domain (“www.example.com”), path (“/page”) and query
parameter (“query=123”).
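As a rough illustration of the URL example above, the sketch below uses Python's standard urllib.parse module. The helper name split_url and the dictionary keys are chosen here for illustration only and are not part of any library.

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url):
    """Break a URL into the components named in the text (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.netloc,
        "path": parts.path,
        # parse_qs maps each query parameter name to a list of its values
        "query": parse_qs(parts.query),
    }


components = split_url("https://www.example.com/page?query=123")
for name, value in components.items():
    print(name, "=", value)
```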
WHAT IS A PARSER?
A parser is a program that is usually part of a compiler. It receives input in
the form of sequential source program instructions, interactive online
commands, markup tags or some other defined interface. Parsers break the input
they get into parts such as objects, methods and their attributes or options.
A practical example is a parser reading and understanding a shopping list.
Your shopping list has items like apples, milk and bread, and you want to
organise the list into categories like fruits, dairy and bread products. It
can be organised using the following steps:
1. Reading the list.
2. Understanding categories.
3. Organizing the list.
4. Creating a structured list.
The parser carries out the steps above and makes your shopping easier by
dividing the list into sections.
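The four steps of the shopping list example could be sketched roughly as follows. The comma-separated input format, the CATEGORIES table and the function name parse_shopping_list are assumptions made for this illustration.

```python
# Step 2 (understanding categories): an assumed lookup table for this example.
CATEGORIES = {
    "apples": "fruits",
    "milk": "dairy",
    "bread": "bread products",
}


def parse_shopping_list(text):
    """Turn a comma-separated list into a dictionary grouped by category."""
    # Step 1 (reading the list): split the raw text into individual items.
    items = [item.strip().lower() for item in text.split(",") if item.strip()]
    structured = {}
    for item in items:
        # Step 3 (organizing the list): find the category of each item.
        category = CATEGORIES.get(item, "other")
        # Step 4 (creating a structured list): group items under their category.
        structured.setdefault(category, []).append(item)
    return structured


print(parse_shopping_list("apples, milk, bread"))
```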
WHAT DO PARSING AND PARSERS ENTAIL?
The word "parse" means to analyse an object specifically. It is commonly used
in computer science to refer to reading program code. For example, after a
program is written, whether it be in C++, Java, or any other language, the code
needs to be parsed by the compiler in order to be compiled.
Parsing can also refer to breaking up ordinary text. For example, search
engines typically parse search phrases entered by users so that they can more
accurately search for each word.
Parsing can also be described as the process of transforming data from one
format to another. In the world of software, every entity has its own criteria
for the data it can process, so parsing transforms the data in such a way that
it can be understood by a specific piece of software.
This process is carried out by the parser. The parser is a component of the
translator that organises linear text into a structure by following a set of
defined rules known as a grammar.
Parsing, in general, refers to the process of analysing a string of symbols
according to the rules of a formal grammar. It’s a fundamental concept used in
various fields, including linguistics, mathematics, and computer science. When
we talk about parsing in programming, we’re specifically referring to the
process of analysing strings of text according to the syntax rules of a
programming language or a data format.

Types of Parsing:
There are two types of Parsing:

● The Top-down Parsing

● The Bottom-up Parsing

Top-Down Parsing:
Top-down parsing is a parsing technique that first looks at the highest level
of the parse tree and works down the tree by using the rules of the grammar.
You start from the root of the tree and work your way down towards the leaves.
It's like starting from the top of the tree and trying to figure out how it
branches out.
Let’s take this grammar:
S -> aABe , A -> Abc | a , B -> d
where S is the start symbol and there are three production rules. The input
string aabcde can be derived from S by a leftmost derivation:
S => aABe => aAbcBe => aabcBe => aabcde
Process:
You begin at the top node (the root) of the tree, which represents the start
symbol of the grammar.
You recursively descend the tree, following branches based on production
rules in the grammar.
At each node, you try to match the current input with the expected symbol or
pattern.
If there's a match, you continue descending further down the tree. If not, you
backtrack and try another branch.
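The process above can be sketched as a small backtracking recognizer for the grammar S -> aABe, A -> Abc | a, B -> d. One caveat: the rule A -> Abc is left-recursive, and a naive top-down parser would call itself forever on it, so this sketch assumes the equivalent rewritten rules A -> a A' and A' -> bc A' | epsilon. The function names are illustrative. Each function yields every position where its non-terminal could end, so a failed branch simply falls through to the next alternative.

```python
def parse_A(s, i):
    # A -> a A'  (left recursion removed; an assumption of this sketch)
    if s.startswith("a", i):
        yield from parse_A_tail(s, i + 1)


def parse_A_tail(s, i):
    # A' -> b c A'  (first alternative)
    if s.startswith("bc", i):
        yield from parse_A_tail(s, i + 2)
    # A' -> epsilon (tried when the caller rejects the first alternative)
    yield i


def parse_B(s, i):
    # B -> d
    if s.startswith("d", i):
        yield i + 1


def parse_S(s):
    """Return True if s can be derived from S -> a A B e."""
    if not s.startswith("a"):
        return False
    for after_A in parse_A(s, 1):          # try each way of matching A
        for after_B in parse_B(s, after_A):
            if s.startswith("e", after_B) and after_B + 1 == len(s):
                return True
    return False                           # every branch failed


print(parse_S("aabcde"))
print(parse_S("aabde"))
```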
Similitude (Analogy):
Imagine you're trying to identify a specific type of tree by starting with its
general characteristics (like the shape of its leaves, the color of its bark). Then,
based on these characteristics, you narrow down your search to find the exact
type of tree.
Example:
Let's say you're parsing an arithmetic expression. Starting with the root symbol
for an expression, you recursively break it down into smaller components like
terms, factors, and individual tokens until you reach the leaves, which
represent the tokens in the expression.
Bottom-Up Parsing:
Bottom-up parsing is a parsing technique that first looks at the lowest level
of the parse tree and works up the parse tree by using the rules of grammar.
In bottom-up parsing, you start from the leaves of the tree and work your way
up towards the root. It's like starting with individual pieces and gradually
assembling them into larger structures.
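In a production, the LHS (Left-Hand Side) can be rewritten as the RHS
(Right-Hand Side). In the bottom-up approach the reverse happens: an RHS is
reduced to its LHS, and we keep reducing until we get to the start symbol.
For the grammar S -> aABe, A -> Abc | a, B -> d, the string aabcde is reduced
as follows:
aabcde => aAbcde (using A -> a) => aAde (using A -> Abc) => aABe (using B -> d)
=> S (using S -> aABe)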
Process:
You begin with the individual tokens (leaves) of the tree, treating each token as
a small parse tree by itself.
You combine adjacent tokens according to production rules to create larger
and larger parse trees.
Eventually, you merge these smaller parse trees together until you reach the
root of the tree, which represents the entire input.
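To make the idea of reduction concrete, here is a minimal sketch that searches for a sequence of reductions from an input string back to the start symbol, using the grammar S -> aABe, A -> Abc | a, B -> d. It is a brute-force search written only for illustration (practical bottom-up parsers use a stack and a parsing table instead), and the names RULES and reduce_to_start are made up for this example.

```python
RULES = [("S", "aABe"), ("A", "Abc"), ("A", "a"), ("B", "d")]


def reduce_to_start(form, seen=None):
    """Return the list of forms from `form` down to 'S', or None if impossible."""
    if seen is None:
        seen = set()
    if form == "S":
        return [form]
    if form in seen:              # already explored, known dead end
        return None
    seen.add(form)
    for lhs, rhs in RULES:
        start = form.find(rhs)
        while start != -1:
            # Replace one occurrence of a right-hand side by its left-hand side.
            reduced = form[:start] + lhs + form[start + len(rhs):]
            rest = reduce_to_start(reduced, seen)
            if rest is not None:
                return [form] + rest
            start = form.find(rhs, start + 1)
    return None


print(reduce_to_start("aabcde"))
```

For the input aabcde the search finds the same chain of reductions described earlier: aabcde, aAbcde, aAde, aABe, S.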
Similitude (Analogy):
Think of it as assembling a jigsaw puzzle from individual pieces. You start with
the pieces (tokens) and gradually connect them based on their shapes and how
they fit together, until you form the complete picture (the parsed input).

Example:
Using the arithmetic expression example again, you start with individual tokens
representing numbers, operators, and parentheses. Then, based on grammar
rules, you combine adjacent tokens into larger components like factors, terms,
and finally, the entire expression.

Conclusion:
Both top-down and bottom-up parsing approaches offer different perspectives
on how to understand and process the structure of a language. While top-
down parsing starts with the big picture and breaks it down into smaller parts,
bottom-up parsing begins with the individual pieces and builds them up into
larger structures. Each approach has its own strengths and weaknesses,
depending on the specific grammar and parsing requirements.

Top-Down Parsing | Bottom-Up Parsing
It is a parsing strategy that first looks at the highest level of the parse tree and works down the parse tree by using the rules of grammar. | It is a parsing strategy that first looks at the lowest level of the parse tree and works up the parse tree by using the rules of grammar.
Top-down parsing attempts to find the leftmost derivation for an input string. | Bottom-up parsing can be defined as an attempt to reduce the input string to the start symbol of a grammar.
In this parsing technique we start parsing from the top (the start symbol of the parse tree) and move down (to the leaf nodes of the parse tree) in a top-down manner. | In this parsing technique we start parsing from the bottom (the leaf nodes of the parse tree) and move up (to the start symbol of the parse tree) in a bottom-up manner.
This parsing technique uses leftmost derivation. | This parsing technique uses rightmost derivation in reverse.
The main decision is to select what production rule to use in order to construct the string. | The main decision is to select when to use a production rule to reduce the string to get the start symbol.
Example: Recursive Descent parser. | Example: Shift-Reduce parser.

Recursive Descent Parser:


It is a kind of top-down parser. A top-down parser builds the parse tree from
the top down, starting with the start non-terminal. A predictive parser is a
special case of recursive descent parser, where no backtracking is required.
By carefully writing the grammar, that is, eliminating left recursion and
applying left factoring to it, the resulting grammar can be parsed by a
recursive descent parser.
Example:

Before removing left recursion:
E –> E + T | T
T –> T * F | F
F –> ( E ) | id

After removing left recursion:
E –> T E’
E’ –> + T E’ | e
T –> F T’
T’ –> * F T’ | e
F –> ( E ) | id

Here e is epsilon (the empty string).
For a recursive descent parser, we write one procedure (function) for every
variable (non-terminal) of the grammar.
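A minimal sketch of this idea for the grammar with left recursion removed is shown below, with one method per non-terminal. It only checks whether the input is valid (it does not build a tree). The class name, the whitespace-separated token format and the "$" end marker are assumptions of this example.

```python
class RecursiveDescent:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):            # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):      # E' -> + T E' | epsilon
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()

    def T(self):            # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):      # T' -> * F T' | epsilon
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):            # F -> ( E ) | id
        if self.peek() == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")


def accepts(text):
    """Return True if the space-separated tokens in text form a valid E."""
    parser = RecursiveDescent(text.split())
    try:
        parser.E()
        parser.match("$")   # the whole input must be consumed
        return True
    except SyntaxError:
        return False


print(accepts("id + id * id"))
print(accepts("id + * id"))
```

Because each method decides what to do by looking only at the next token, this particular parser never needs to backtrack.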

In LL(1), the first L stands for scanning the input from left to right, and
the second L stands for leftmost derivation. The 1 stands for the number of
lookahead tokens used by the parser while parsing a sentence. An LL(1) parser
is constructed from a grammar which is free from left recursion, common
prefixes, and ambiguity.
Let’s take a practical example.
Imagine you have a recipe, and you want to follow it step by step to cook a
dish. Each step in the recipe corresponds to a rule in our grammar. Our goal is
to understand the recipe (grammar) and follow it correctly to make the dish
(parse the input). Now, let's say our recipe (grammar) is written in such a way
that each step (rule) tells us exactly what ingredient or action to take next
without any ambiguity. This means that for any step we're on and any ingredient
or action we're considering, there's only one choice to make to move forward.
Here comes the LL(1) part: "LL" means we read the input from left to right and
construct a leftmost derivation (at every step we replace the leftmost
non-terminal with one of its productions, until only terminals remain). And
"1" means we only need to look at the next 1 token in the input to make
decisions.
So, in our cooking scenario, being LL(1) means that for each step in the recipe,
we only need to look at the current ingredient or action we're dealing with to
know exactly what to do next. There's no need to look ahead multiple steps or
backtrack. To put it simply, LL(1) parsing is like following a super clear and
straightforward recipe where each step tells you exactly what to do next based
on what you're currently dealing with, without any confusion or need to guess.
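A table-driven LL(1) parser can be sketched as follows, using the expression grammar E -> T E', E' -> + T E' | e, T -> F T', T' -> * F T' | e, F -> ( E ) | id. The parsing table below was worked out by hand for this example from the FIRST and FOLLOW sets of that grammar. The names TABLE and ll1_accepts and the "$" end marker are assumptions of the sketch.

```python
# (non-terminal, lookahead token) -> right-hand side to expand ([] is epsilon)
TABLE = {
    ("E", "id"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],
    ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],
    ("T", "("): ["F", "T'"],
    ("T'", "+"): [],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],
    ("T'", "$"): [],
    ("F", "id"): ["id"],
    ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_accepts(tokens):
    """Return True if the token list is a valid expression."""
    tokens = list(tokens) + ["$"]
    stack = ["$", "E"]                 # start symbol on top of the end marker
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]        # the single token of lookahead
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:            # empty table cell: syntax error
                return False
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == lookahead:
            pos += 1                   # terminal matches: consume it
        else:
            return False
    return pos == len(tokens)


print(ll1_accepts("id + id * id".split()))
print(ll1_accepts("id + * id".split()))
```

Each table cell holds at most one production, which is exactly the "only one choice at each step" property described above.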
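As an example, in the grammar below each kind of statement begins with a
different kind of token, so a single lookahead token is enough to choose the
rule:
Program -> Statement*
Statement -> VariableDeclaration | Assignment | IfStatement | WhileLoop | FunctionDefinition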
VariableDeclaration -> 'var' Identifier ';'
Assignment -> Identifier '=' Expression ';'
IfStatement -> 'if' '(' Expression ')' '{' Statement* '}' ('else' '{' Statement* '}' )?
WhileLoop -> 'while' '(' Expression ')' '{' Statement* '}'
FunctionDefinition -> 'function' Identifier '(' Parameters ')' '{' Statement* '}'
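For the statement grammar above, the choice of rule from one lookahead token might look like the sketch below. It assumes that the keywords are reserved words and that an Identifier looks like a Python identifier, since the grammar does not define Identifier. The names KEYWORD_RULES and predict_statement are hypothetical.

```python
KEYWORD_RULES = {
    "var": "VariableDeclaration",
    "if": "IfStatement",
    "while": "WhileLoop",
    "function": "FunctionDefinition",
}


def predict_statement(lookahead):
    """Pick the Statement alternative from the next token alone."""
    if lookahead in KEYWORD_RULES:
        return KEYWORD_RULES[lookahead]
    if lookahead.isidentifier():       # assumed shape of an Identifier
        return "Assignment"
    raise SyntaxError(f"no statement can start with {lookahead!r}")


print(predict_statement("while"))
print(predict_statement("total"))
```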

LR Parser
LR parsing is a general form of shift-reduce parsing. The L stands for scanning
the input from left to right and the R stands for constructing a rightmost
derivation in reverse.
Benefits of LR parsing:
1. Many programming languages are parsed using some variation of an LR
parser. It should be noted that C++ and Perl are exceptions.
2. LR parsers can be implemented very efficiently.
3. Of all the parsers that scan their symbols from left to right, LR parsers
detect syntactic errors as soon as possible.
Imagine you have a bunch of Lego blocks scattered on a table, and you
want to build a spaceship out of them. Each Lego block represents a part
of the spaceship, like a wing, a fuselage, or an engine. Now, you have a
set of assembly instructions (grammar rules) that tell you how to put
these blocks together to build the spaceship.
In essence, LR parsing is like building a Lego spaceship by scanning
through the blocks, following assembly instructions, and gradually
assembling them into larger structures until you've constructed the
entire spaceship. It's a bottom-up process because you start with
individual blocks and build up to the final structure, just like how
bottom-up parsing starts with individual tokens and builds up to the
complete parse tree.
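A table-driven shift-reduce (LR) parser can be sketched as follows for the grammar S -> aABe, A -> Abc | a, B -> d. The ACTION and GOTO tables were built by hand for this illustration using the SLR method, and the state numbers are arbitrary; a parser generator would normally produce such tables. The function returns the reductions in the order they are applied.

```python
# production number -> (left-hand side, length of right-hand side, text)
PRODUCTIONS = {
    1: ("S", 4, "S -> aABe"),
    2: ("A", 3, "A -> Abc"),
    3: ("A", 1, "A -> a"),
    4: ("B", 1, "B -> d"),
}
# (state, next token) -> action; a missing entry means a syntax error
ACTION = {
    (0, "a"): ("shift", 2),
    (1, "$"): ("accept", None),
    (2, "a"): ("shift", 4),
    (3, "b"): ("shift", 6), (3, "d"): ("shift", 7),
    (4, "b"): ("reduce", 3), (4, "d"): ("reduce", 3),
    (5, "e"): ("shift", 8),
    (6, "c"): ("shift", 9),
    (7, "e"): ("reduce", 4),
    (8, "$"): ("reduce", 1),
    (9, "b"): ("reduce", 2), (9, "d"): ("reduce", 2),
}
GOTO = {(0, "S"): 1, (2, "A"): 3, (3, "B"): 5}


def lr_parse(text):
    """Return the list of reductions used, or None on a syntax error."""
    tokens = list(text) + ["$"]        # one character per token
    stack = [0]                        # stack of parser states
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:
            return None
        kind, arg = action
        if kind == "shift":            # push the new state, consume the token
            stack.append(arg)
            pos += 1
        elif kind == "reduce":         # pop the handle, then follow GOTO
            lhs, length, text_form = PRODUCTIONS[arg]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(text_form)
        else:                          # accept
            return reductions


print(lr_parse("aabcde"))
```

Read from last to first, the reductions printed for aabcde form a rightmost derivation of the string, which is what the R in LR refers to.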

Operator precedence parser – An operator precedence parser is a bottom-up
parser that interprets an operator grammar. This parser is only used for
operator grammars. Ambiguous grammars are not allowed in any parser except
operator precedence parser. There are two methods for determining what
precedence relations should hold between a pair of terminals:
