Chapter 3

Chapter 3 discusses syntax analysis, the second phase of compiler design, which organizes tokens into a hierarchical structure like a parse tree based on programming language grammar. It highlights the role of the parser in validating syntax, generating parse trees, detecting errors, and preparing for semantic analysis, along with the components of Context-Free Grammar (CFG) such as terminals, non-terminals, production rules, and start symbols. The chapter also covers parsing techniques, including top-down and bottom-up parsing methods.

Uploaded by

Shafi Esa
Copyright
© All Rights Reserved
Chapter 3

Syntax Analysis
• Syntax analysis, also known as parsing, is the second
phase of compilation, following lexical analysis.
• It takes the tokens generated by the lexical analysis
phase and organizes them into a hierarchical
structure, usually represented as a parse tree or
syntax tree, based on the grammar of the
programming language.
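As a rough illustration of "tokens in, hierarchy out", the sketch below writes down a token list and one possible syntax tree for it by hand. The nested-tuple layout (operator, left, right) and the helper name leaves are assumptions made for this example only, not a standard representation.

```python
# Flat token sequence for the input "3 + 5 * 2", as a lexer might deliver it.
tokens = ["3", "+", "5", "*", "2"]

# One possible syntax tree for the same input, written by hand.
# Interior nodes are (operator, left, right); leaves are plain strings.
# The multiplication sits deeper in the tree, so it groups more tightly.
tree = ("+", "3", ("*", "5", "2"))


def leaves(node):
    """Read the tree left to right and return the tokens it covers."""
    if isinstance(node, str):
        return [node]
    operator, left, right = node
    return leaves(left) + [operator] + leaves(right)
```

Reading the tree from left to right gives back the original token sequence, which is one way to see that the tree adds structure without changing the input.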
Role of Parser
• Validate Syntax: Ensure the program's structure
conforms to the language's grammar rules.
• Generate Parse Tree: Represent the syntactic structure
of the source code.
• Error Detection and Recovery: Identify and recover
from syntax errors for further analysis.
• Prepare for Semantic Analysis: Provide a structure for
semantic analysis and intermediate code generation.
Context-Free Grammar (CFG)
• A Context-Free Grammar (CFG) is a formalism used in
syntax analysis to define the syntax of a programming
language.
• Its rules describe the structure of the valid strings
(programs) of the language.
• A CFG consists of the following key components:
Terminals
Non-terminals
Production rules
Start symbol.
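The four components can be held in a small data structure. The following is a minimal sketch, assuming a hypothetical Grammar class and a toy grammar chosen for illustration; the field names and the problems method are not taken from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A CFG as its four components (hypothetical layout for illustration)."""
    terminals: frozenset
    nonterminals: frozenset
    productions: tuple  # each entry is (lhs, rhs), with rhs a tuple of symbols
    start: str

    def problems(self):
        """Return a list of basic consistency problems (empty if none)."""
        found = []
        if self.terminals & self.nonterminals:
            found.append("a symbol is both terminal and non-terminal")
        if self.start not in self.nonterminals:
            found.append("start symbol is not a non-terminal")
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                found.append(f"left-hand side {lhs!r} is not a non-terminal")
            for symbol in rhs:
                if symbol not in self.terminals | self.nonterminals:
                    found.append(f"unknown symbol {symbol!r} in rule for {lhs}")
        return found


# Toy grammar: Expr -> Expr + Term | Term,  Term -> id
toy = Grammar(
    terminals=frozenset({"+", "id"}),
    nonterminals=frozenset({"Expr", "Term"}),
    productions=(
        ("Expr", ("Expr", "+", "Term")),
        ("Expr", ("Term",)),
        ("Term", ("id",)),
    ),
    start="Expr",
)
```

Calling toy.problems() on a well-formed grammar returns an empty list; choosing a terminal as the start symbol would be reported as a problem.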
Terminals
• Definition: The basic symbols of the language,
which cannot be broken down further (the tokens).
• By convention written in lowercase letters or as
literal symbols and keywords.
• Examples:
• In arithmetic expressions: +, *, (, ), id (identifier).
• In programming languages: if, else, while.
Non-Terminals
• Symbols that represent groups of strings or structures in
the language. By convention they are written with capital
letters, and they can be derived further (expanded by
production rules).
• They are placeholders for patterns defined in terms of
terminals and other non-terminals.
• Example:
• Expr (Expression), Stmt (Statement), Term, Factor
Start Symbol
• A specific non-terminal symbol from which every
derivation (and therefore parsing) begins.
• Example:
For arithmetic expressions, the start symbol might be Expr.
Production Rules
• Rules that describe how a non-terminal can be replaced with
a sequence of terminals and/or non-terminals.
• Applying the rules repeatedly, beginning with the start
symbol, produces the strings of the language.
• Form: A production rule has the format:
• A → α
Where:
• A is a non-terminal.
• α is a (possibly empty) sequence of terminals and/or
non-terminals.
Example 1
A CFG for arithmetic expressions such as
3 + 5, 10 * (2 + 3), or 5 - (3 / 2).
Key Components of the CFG:
Variables (Non-terminal symbols): These represent
syntactic categories.
• Expr: Represents an arithmetic expression.
• Term: Represents a term (part of an expression that can be
multiplied or divided).
• Factor: Represents a factor (the smallest unit, such as a
number or a parenthesized sub-expression).
Terminals: These are the actual symbols in the
language (operators, numbers, parentheses).
Operators: +, -, *, /
Parentheses: (, )
NUMBER: Represents any numeric value, e.g., 3, 5,
10, etc.
Production Rules: These define how variables (non-
terminals) can be expanded into terminals or other non-
terminals.
Start Symbol: This is the symbol from which the
parsing starts.
In this case, the start symbol is Expr.
Example 2
Variables (Non-terminals): Expr, Term, Factor
Terminals: +, -, *, /, (, ), NUMBER
Production Rules:
1. Expr → Expr + Term
2. Expr → Expr - Term
3. Expr → Term
4. Term → Term * Factor
5. Term → Term / Factor
6. Term → Factor
7. Factor → ( Expr )
8. Factor → NUMBER
Start Symbol: Expr
For example, the expression 3 + 5 * (2 - 1) would be derived as follows,
always expanding the leftmost non-terminal and writing each NUMBER
as its value once it is reached:
Expr → Expr + Term
→ Term + Term
→ Factor + Term
→ NUMBER + Term
→ 3 + Term
→ 3 + Term * Factor
→ 3 + Factor * Factor
→ 3 + NUMBER * Factor
→ 3 + 5 * Factor
→ 3 + 5 * (Expr)
→ 3 + 5 * (Expr - Term)
→ 3 + 5 * (Term - Term)
→ 3 + 5 * (Factor - Term)
→ 3 + 5 * (NUMBER - Term)
→ 3 + 5 * (2 - Term)
→ 3 + 5 * (2 - Factor)
→ 3 + 5 * (2 - NUMBER)
→ 3 + 5 * (2 - 1)
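A derivation in this grammar can be replayed mechanically: keep a list of symbols and, at every step, replace the leftmost non-terminal using one of the eight rules. The sketch below assumes a hypothetical helper named leftmost_derivation and works at the token level, so every number appears as NUMBER.

```python
# The eight production rules of Example 2, keyed by their rule number.
RULES = {
    1: ("Expr", ["Expr", "+", "Term"]),
    2: ("Expr", ["Expr", "-", "Term"]),
    3: ("Expr", ["Term"]),
    4: ("Term", ["Term", "*", "Factor"]),
    5: ("Term", ["Term", "/", "Factor"]),
    6: ("Term", ["Factor"]),
    7: ("Factor", ["(", "Expr", ")"]),
    8: ("Factor", ["NUMBER"]),
}
NONTERMINALS = {"Expr", "Term", "Factor"}


def leftmost_derivation(start, rule_numbers):
    """Apply the rules in order to the leftmost non-terminal; return all steps."""
    form = [start]
    steps = [" ".join(form)]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        index = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if index is None or form[index] != lhs:
            raise ValueError(f"rule {number} does not apply to: {' '.join(form)}")
        form[index:index + 1] = rhs  # replace the non-terminal by the rule body
        steps.append(" ".join(form))
    return steps


# Rule sequence for the token-level form of 3 + 5 * (2 - 1).
steps = leftmost_derivation("Expr", [1, 3, 6, 8, 4, 6, 8, 7, 2, 3, 6, 8, 6, 8])
```

The last entry of steps is the token string NUMBER + NUMBER * ( NUMBER - NUMBER ), reached after 14 rule applications.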
Description of Components
Parsing Techniques
Top-Down Parsing:
Begins at the start symbol and expands productions down the
parse tree, from the root toward the leaves, until the
input tokens are matched.
Examples: Recursive Descent, LL(1).
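A minimal recursive-descent sketch for the arithmetic grammar is given below. One caveat: a rule such as Expr → Expr + Term is left-recursive, and a recursive-descent procedure that calls itself first would never terminate. The sketch therefore assumes the equivalent loop form (an Expr is a Term followed by any number of + Term or - Term parts, and likewise for Term). The class name Parser, the tuple tree layout, and the tokenizer are choices made for this example.

```python
import re


def tokenize(text):
    """Split the input into digit runs and single non-space characters."""
    return re.findall(r"\d+|\S", text)


class Parser:
    """One method per non-terminal: expr, term, factor."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # Expr: Term followed by zero or more (+ Term) or (- Term)
        node = self.term()
        while self.peek() in ("+", "-"):
            operator = self.advance()
            node = (operator, node, self.term())
        return node

    def term(self):
        # Term: Factor followed by zero or more (* Factor) or (/ Factor)
        node = self.factor()
        while self.peek() in ("*", "/"):
            operator = self.advance()
            node = (operator, node, self.factor())
        return node

    def factor(self):
        # Factor: NUMBER or ( Expr )
        token = self.advance()
        if token is not None and token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

For the input 3 + 5 * (2 - 1), parse returns ("+", 3, ("*", 5, ("-", 2, 1))). The loops build left-leaning trees, so 8 - 3 - 2 groups as (8 - 3) - 2, matching the left-recursive rules.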
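Bottom-Up Parsing:
Begins at the input tokens and reduces them step by step
using the productions, building the parse tree from the
leaves up to the root (the start symbol).
Examples: LR(0), SLR(1), LALR(1), Canonical LR(1).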
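The shift and reduce moves behind bottom-up parsing can be sketched with a hand-written loop for the arithmetic grammar. This is not a table-driven LR parser: it assumes whitespace-separated tokens, tries the rules in a fixed order, and uses one token of lookahead only to postpone reducing to Expr while a * or / is still coming. The names shift_reduce and find_reduction are hypothetical.

```python
# (rule number, left-hand side, right-hand side); longer bodies are tried first.
RULES = [
    (1, "Expr", ("Expr", "+", "Term")),
    (2, "Expr", ("Expr", "-", "Term")),
    (4, "Term", ("Term", "*", "Factor")),
    (5, "Term", ("Term", "/", "Factor")),
    (7, "Factor", ("(", "Expr", ")")),
    (8, "Factor", ("NUMBER",)),
    (6, "Term", ("Factor",)),
    (3, "Expr", ("Term",)),
]
OPERATORS = {"+", "-", "*", "/", "(", ")"}


def tokenize(text):
    """Map whitespace-separated words to terminal symbols."""
    tokens = []
    for word in text.split():
        if word.isdigit():
            tokens.append("NUMBER")
        elif word in OPERATORS:
            tokens.append(word)
        else:
            raise SyntaxError(f"unknown token {word!r}")
    return tokens


def find_reduction(stack, lookahead):
    """Return (number, lhs, size) of the first applicable rule, or None."""
    for number, lhs, rhs in RULES:
        if tuple(stack[-len(rhs):]) != rhs:
            continue
        # Do not finish an Expr while a higher-precedence operator follows.
        if lhs == "Expr" and lookahead in ("*", "/"):
            continue
        return number, lhs, len(rhs)
    return None


def shift_reduce(text):
    """Parse text; return the rule numbers in the order they were reduced."""
    tokens = tokenize(text)
    stack, reductions, pos = [], [], 0
    while True:
        lookahead = tokens[pos] if pos < len(tokens) else "$"
        found = find_reduction(stack, lookahead)
        if found:
            number, lhs, size = found
            del stack[-size:]          # pop the rule body ...
            stack.append(lhs)          # ... and push its left-hand side
            reductions.append(number)
        elif lookahead != "$":
            stack.append(lookahead)    # shift the next input token
            pos += 1
        elif stack == ["Expr"]:
            return reductions          # accept: everything reduced to the start symbol
        else:
            raise SyntaxError(f"cannot reduce stack {stack}")
```

Read in reverse, the returned rule numbers form a rightmost derivation of the input, which is the order in which bottom-up parsers discover the tree.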

You might also like