CD Unit 1
1. Explain various phases of a compiler with a neat diagram and example
2. Explain Token, Pattern, and Lexeme with an example
3. Explain error recovery strategy in a parser
4. Explain the application of compiler technology
5. Explain Lexical Analyzer in detail
6. Explain the Lexical Analyzer generator tool (LEX)
7. Write the algorithm for FIRST and FOLLOW
8. Explain the structure of a compiler with a diagram
9. Discuss the role of a lexical analyzer
10. Explain syntax analyzer in detail
11. Explain Recursive Descent Parser with an example
12. Write the Recursive Descent Parser for the context-free grammar: S → +SS | SS | a
1. Explain various phases of a compiler with a neat diagram and example
🚀 Phases of a Compiler
1️⃣ Lexical Analysis (Tokenization)
Task: Breaks the source code into tokens.
Input: Source code
Output: Tokens
Example:
int a = b + 5;
2️⃣ Syntax Analysis (Parsing)
Task: Arranges the tokens into a parse tree according to the grammar rules.
Input: Tokens
Output: Parse Tree
Example:
    =
   / \
  a   +
     / \
    b   5
3️⃣ Semantic Analysis
Task: Checks the meaning of the program (type checking, declaration checking).
Input: Parse Tree
Output: Annotated Parse Tree
Example:
If `b` has not been declared, an error is thrown.
4️⃣ Intermediate Code Generation
Task: Produces machine-independent three-address code.
Example:
t1 = b + 5
a = t1
5️⃣ Code Optimization
Task: Improves the intermediate code so execution is faster.
Example:
a = b + 5
6️⃣ Code Generation
Task: Generates the target machine code.
Example:
MOV b, R1
ADD 5, R1
MOV R1, a
📝 Diagram of Compiler Phases
Source Program
|
v
Lexical Analysis → Tokens
|
v
Syntax Analysis → Parse Tree
|
v
Semantic Analysis → Annotated Tree
|
v
Intermediate Code Generation → Intermediate Code
|
v
Code Optimization → Optimized Intermediate Code
|
v
Code Generation → Target Code
|
v
Target Machine Code
Example:
int a = b + 5;
Phases Output:
1. Lexical Analysis: Tokens → `int`, `a`, `=`, `b`, `+`, `5`, `;`
2. Syntax Analysis: A parse tree is built.
3. Semantic Analysis: Types are checked.
4. Intermediate Code: `t1 = b + 5` → `a = t1`
5. Code Optimization: Directly `a = b + 5`
6. Code Generation: Machine code instructions are generated.
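A toy Python sketch (with hypothetical helper functions, not a real compiler) of how the first and fourth phases could be simulated for this statement:

import re

def lexical_analysis(code):
    # Phase 1: split the source string into tokens
    return re.findall(r'[A-Za-z_]\w*|\d+|\S', code)

def intermediate_code(lhs, op1, op, op2):
    # Phase 4: emit three-address code for a binary assignment
    return [f't1 = {op1} {op} {op2}', f'{lhs} = t1']

print(lexical_analysis('int a = b + 5;'))    # ['int', 'a', '=', 'b', '+', '5', ';']
print(intermediate_code('a', 'b', '+', '5')) # ['t1 = b + 5', 'a = t1']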
2. Explain Token, Pattern, and Lexeme with an example.
🚀 1. Token:
Definition: A token is a category of lexemes that the compiler recognizes.
Example: Keywords (`int`, `if`, `else`), Identifiers (`a`, `count`), Operators (`+`, `=`).
Explanation: A token is a general classification such as `IDENTIFIER`, `NUMBER`, `KEYWORD`, etc.
Example Code:
int a = b + 5;
Tokens:
`int` → KEYWORD
`a` → IDENTIFIER
`=` → OPERATOR
`b` → IDENTIFIER
`+` → OPERATOR
`5` → NUMBER
`;` → SPECIAL SYMBOL
🚀 2. Lexeme:
Definition: A lexeme is the sequence of characters that represents an instance of a token.
Example: The exact word or symbol written in the source code is the lexeme.
Example Code:
int a = b + 5;
Lexemes:
`int` → Token: KEYWORD
`a` → Token: IDENTIFIER
`=` → Token: OPERATOR
`b` → Token: IDENTIFIER
`+` → Token: OPERATOR
`5` → Token: NUMBER
`;` → Token: SPECIAL SYMBOL
🚀 3. Pattern:
Definition: A pattern is a rule or regular expression that describes how a lexeme is matched.
Example:
`int|float|char` → KEYWORD Pattern
`[a-zA-Z][a-zA-Z0-9]*` → IDENTIFIER Pattern
`[0-9]+` → NUMBER Pattern
Example Code:
int a = b + 5;
Patterns:
`int` matches `int|float|char`
`a`, `b` match `[a-zA-Z][a-zA-Z0-9]*`
`5` matches `[0-9]+`
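These patterns can be tried directly with Python's `re` module; the token names and their ordering below are illustrative (keywords must be listed before the general identifier pattern):

import re

patterns = [
    ('KEYWORD',        r'int|float|char'),
    ('IDENTIFIER',     r'[a-zA-Z][a-zA-Z0-9]*'),
    ('NUMBER',         r'[0-9]+'),
    ('OPERATOR',       r'[+=]'),
    ('SPECIAL_SYMBOL', r';'),
]
# Build one scanner; each alternative is a named group, so the group that
# matched tells us which token the lexeme belongs to.
# (A real lexer would also enforce longest-match, e.g. for `integer`.)
scanner = re.compile('|'.join(f'(?P<{name}>{pat})' for name, pat in patterns))

for m in scanner.finditer('int a = b + 5;'):
    print(f'lexeme {m.group()!r} -> token {m.lastgroup}')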
✅ Comparison Table
Term Definition Example
Token Category of lexemes `KEYWORD`, `IDENTIFIER`
Lexeme Actual character sequence in the source `int`, `a`, `5`
Pattern Rule/regex that matches lexemes `[0-9]+`, `[a-zA-Z][a-zA-Z0-9]*`
📝 Summary Example:
int a = b + 5;
Here `int` is the lexeme, KEYWORD is its token, and `int|float|char` is the pattern it matches.
3. Explain error recovery strategy in a parser.
🚀 1. Panic Mode Recovery
The parser discards input tokens until it finds a synchronizing token (such as `;` or `}`), then resumes parsing.
Example:
int a = 5 + ;
int b = 10;
The parser skips ahead to `;` and continues parsing from the next statement, `int b = 10;`.
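A minimal Python sketch of this strategy (the token list and helper name are illustrative), assuming `;` is the synchronizing token:

SYNC_TOKENS = {';'}  # synchronizing tokens: statement boundaries

def panic_mode_recover(tokens, pos):
    # Skip tokens until a synchronizing token, then resume just past it
    while pos < len(tokens) and tokens[pos] not in SYNC_TOKENS:
        pos += 1
    return pos + 1

# Error detected at '+ ;' in "int a = 5 + ;": skip to ';' and resume
tokens = ['int', 'a', '=', '5', '+', ';', 'int', 'b', '=', '10', ';']
print(panic_mode_recover(tokens, 4))  # 6 -> parsing resumes at 'int b = 10;'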
🚀 2. Phrase-Level Recovery
The parser performs a local correction on the remaining input, for example inserting a missing token.
Example:
int a = 5 +
int b = 10;
is corrected locally to:
int a = 5 + 0;
🚀 3. Error Productions
The grammar is augmented with productions for common errors so the parser can recognize and report them.
Example Rule:
While handling `if (expr stmt)`, if the `)` is missing, an error production handles it.
Example Code:
if (x == 5 {
printf("Error detected");
}
🚀 4. Global Correction
The parser finds the minimum set of changes (insertions, deletions, replacements) over the whole input that yields a valid program.
Example:
int a = 5 +
b = 10;
is corrected globally to:
int a = 5 + 0;
int b = 10;
✅ Comparison Table
Strategy Approach Pros Cons
Panic Mode Skips tokens Fast and simple May miss valid tokens
Phrase-Level Local correction Fixes specific errors Cannot handle every error
Error Production Predefined error rules Handles common errors Does not cover all errors
Global Correction Minimum global changes Optimal correction Slow and computationally expensive
📝 Key Takeaways
1. Panic Mode: Quick recovery by skipping tokens.
2. Phrase-Level: Small corrections in syntax.
3. Error Productions: Specific error rules for common mistakes.
4. Global Correction: Tries to minimize changes globally.
4. Explain the application of compiler technology.
🚀 6. Embedded Systems
Role: In embedded systems, compilers are used to optimize code for resource-constrained hardware devices.
Example:
Compiler tools for ARM-based microcontrollers
✅ Summary Table
Application Area Example Tools Purpose
Programming Languages GCC, Javac Source to machine code
JIT Compilation JVM, CLR Runtime optimization
📝 Key Takeaways:
Compiler technology is everywhere – from software development to hardware design.
Its main goal is to ensure optimization and efficiency.
Newer technologies such as AI/ML frameworks and JIT compilers also use compiler techniques.
5. Explain Lexical Analyzer in detail.
The Lexical Analyzer is the first phase of the compiler: it reads the source program character by character, groups the characters into lexemes, and produces a token for each.
Example:
int a = b + 5;
Lexeme Token Pattern
int KEYWORD int
a IDENTIFIER [a-zA-Z][a-zA-Z0-9]*
= OPERATOR =
b IDENTIFIER [a-zA-Z][a-zA-Z0-9]*
+ OPERATOR +
5 NUMBER [0-9]+
; SYMBOL ;
🚀 Example of Tokenization
int x = 10;
Steps:
1. Read the input characters and group them into lexemes (`int`, `x`, `=`, `10`, `;`).
2. Match each lexeme against its token pattern.
3. Emit a (token, lexeme) pair for each match.
Output:
(KEYWORD, int)
(IDENTIFIER, x)
(OPERATOR, =)
(NUMBER, 10)
(SYMBOL, ;)
🚀 6. Lexical Errors
Common Lexical Errors:
Invalid or illegal characters (e.g. `@` in an identifier)
Malformed numbers or identifiers
Unterminated strings or comments
Error Handling:
The analyzer can delete the offending character, insert a missing character, or skip ahead until a valid token is found.
Example: a small Lex specification that recognizes numbers, identifiers, and `=`:
digit [0-9]
letter [a-zA-Z]
%%
{digit}+ printf("NUMBER");
{letter}+ printf("IDENTIFIER");
"=" printf("OPERATOR");
%%
✅ Key Takeaways:
1. Lexical Analyzer: First phase of a compiler, generates tokens.
2. Tokens: Logical unit representing keywords, identifiers, operators.
3. Lexemes: Actual text corresponding to tokens.
4. Tools: Lex, Flex, ANTLR.
5. Error Handling: Detect invalid characters, illegal identifiers.
If you want more clarity or a specific example explained, just ask! 😊
6. Explain the Lexical Analyzer generator tool (LEX).
🚀 What is LEX?
Definition:
LEX is a Lexical Analyzer Generator that uses regular expressions to identify the tokens in source code.
Input: A Lex specification file (where rules and patterns are defined).
Output: A C program (in the form of a scanner/lexer).
🚀 Working of LEX
1. Specification File: The user writes a file with the `.l` extension containing token definitions and rules.
2. Lex Compiler: The LEX tool processes the specification file and generates a C program (`lex.yy.c`).
3. Scanner Output: The generated C program is compiled into an executable that recognizes tokens and prints them.
General format of a LEX program:
{Definitions}
%%
{Rules}
%%
{Auxiliary Functions}
Example definitions section:
digit [0-9]
letter [a-zA-Z]
Example rules section:
{digit}+ printf("NUMBER\n");
{letter}+ printf("IDENTIFIER\n");
"=" printf("OPERATOR\n");
Complete example (example.l):
digit [0-9]
letter [a-zA-Z]
%%
{letter}({letter}|{digit})* printf("IDENTIFIER: %s\n", yytext);
{digit}+ printf("NUMBER: %s\n", yytext);
"=" printf("OPERATOR: =\n");
"+" printf("OPERATOR: +\n");
";" printf("SPECIAL SYMBOL: ;\n");
[ \t\n]+ /* Ignore whitespace */
. printf("UNKNOWN: %s\n", yytext);
%%
int main() {
yylex(); // Calls the scanner
return 0;
}
Steps to Execute:
1. Generate the scanner source:
lex example.l
2. Compile the generated C file (linking the lex library, assuming it is installed):
cc lex.yy.c -o scanner -ll
3. Run the scanner:
./scanner
Input:
int x = 10 + 20;
Output:
IDENTIFIER: int
IDENTIFIER: x
OPERATOR: =
NUMBER: 10
OPERATOR: +
NUMBER: 20
SPECIAL SYMBOL: ;
🚀 Advantages of LEX
1. Efficiency: Regular-expression-based scanning gives high performance.
2. Automation: No need to hand-write a lexer.
3. Error Handling: Automatic error reporting for undefined patterns.
4. Portability: The generated C code can run on any platform.
🚀 Limitations of LEX
1. Limited to Regular Languages: It cannot handle complex grammars.
2. Integration Needed: LEX must be integrated with a parser (e.g., YACC).
3. Debugging Issues: Debugging large LEX files can be difficult.
✅ Key Takeaways:
LEX: A lexical analyzer generator that uses regular expressions.
Input: Specification file with token patterns.
Output: C program acting as a scanner.
Use cases: Language compilers, interpreters, data filters.
If you need more detail or anything is unclear, do ask! 😊
7. Write the algorithm for FIRST and FOLLOW.
🚀 1. FIRST Set
📌 Definition:
FIRST(X): The set of terminals (or ε, epsilon) that can appear as the starting symbols of a non-terminal or production.
Algorithm FIRST(X):
1. Initialize FIRST(X) = {}
2. If X is a terminal, add X to FIRST(X) and return.
3. If X → ε is a production, add ε to FIRST(X).
4. For each production X → Y₁ Y₂ ... Yₙ:
a. For each Yᵢ:
i. Add FIRST(Yᵢ) (excluding ε) to FIRST(X).
ii. If ε ∈ FIRST(Yᵢ), continue to Yᵢ₊₁.
iii. If ε ∉ FIRST(Yᵢ), break the loop.
b. If all Y₁ Y₂ ... Yₙ can derive ε, add ε to FIRST(X).
5. Return FIRST(X).
📚 Example of FIRST Set:
Grammar:
1. `E → T E'`
2. `E' → + T E' | ε`
3. `T → F T'`
4. `T' → * F T' | ε`
5. `F → ( E ) | id`
Solution:
FIRST(E) = FIRST(T) = FIRST(F) = { `(`, `id` }
FIRST(E') = { `+`, ε }
FIRST(T') = { `*`, ε }
🚀 2. FOLLOW Set
📌 Definition:
FOLLOW(A): The set of terminals that can appear immediately after the non-terminal `A`.
Algorithm FOLLOW(A):
1. Initialize FOLLOW(A) = {}
2. If A is the start symbol, add $ to FOLLOW(A).
3. For each production rule:
a. For A → αBβ:
i. Add FIRST(β) (excluding ε) to FOLLOW(B).
ii. If ε ∈ FIRST(β) or β is empty, add FOLLOW(A) to FOLLOW(B).
b. For A → αB:
i. Add FOLLOW(A) to FOLLOW(B).
4. Repeat until no changes occur in FOLLOW sets.
5. Return FOLLOW(A).
📚 Example of FOLLOW Set (same grammar):
1. `E → T E'`
2. `E' → + T E' | ε`
3. `T → F T'`
4. `T' → * F T' | ε`
5. `F → ( E ) | id`
Solution:
FOLLOW(E) = FOLLOW(E') = { `)`, `$` }
FOLLOW(T) = FOLLOW(T') = { `+`, `)`, `$` }
FOLLOW(F) = { `+`, `*`, `)`, `$` }
A Python sketch computing both sets appears below.
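Both algorithms can be implemented directly as fixpoint iterations. A minimal Python sketch for the example grammar (the grammar encoding and helper names are illustrative):

EPS = 'ε'
grammar = {
    'E':  [['T', "E'"]],
    "E'": [['+', 'T', "E'"], [EPS]],
    'T':  [['F', "T'"]],
    "T'": [['*', 'F', "T'"], [EPS]],
    'F':  [['(', 'E', ')'], ['id']],
}
nonterminals = set(grammar)

def first_of_seq(seq, first):
    # FIRST of a sequence of grammar symbols, using the current FIRST sets
    out = set()
    for X in seq:
        fx = first[X] if X in nonterminals else {X}
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    out.add(EPS)  # every symbol in the sequence can derive ε
    return out

# FIRST sets: iterate until nothing changes (fixpoint)
first = {A: set() for A in grammar}
changed = True
while changed:
    changed = False
    for A, prods in grammar.items():
        for rhs in prods:
            new = first_of_seq(rhs, first)
            if not new <= first[A]:
                first[A] |= new
                changed = True

# FOLLOW sets: also a fixpoint iteration
follow = {A: set() for A in grammar}
follow['E'].add('$')  # $ goes into FOLLOW(start symbol)
changed = True
while changed:
    changed = False
    for A, prods in grammar.items():
        for rhs in prods:
            for i, B in enumerate(rhs):
                if B not in nonterminals:
                    continue
                beta = rhs[i + 1:]
                f = first_of_seq(beta, first) if beta else {EPS}
                add = (f - {EPS}) | (follow[A] if EPS in f else set())
                if not add <= follow[B]:
                    follow[B] |= add
                    changed = True

print('FIRST: ', {A: sorted(first[A]) for A in grammar})
print('FOLLOW:', {A: sorted(follow[A]) for A in grammar})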
🚀 Summary Table
Property FIRST(X) FOLLOW(X)
Definition Start symbols of X Symbols after X
Includes Terminals, ε Terminals, `$`
Use Top-Down Parsing LL(1) Parsing
Algorithm Focus Starts from the start of productions Follows symbols after non-terminal
✅ Key Takeaways:
1. FIRST(X): Starting terminals of a non-terminal or production.
2. FOLLOW(X): Terminals that can follow a non-terminal.
3. Both sets are essential for LL(1) Parsing Table Construction.
4. Algorithms are iterative and depend on grammar rules.
If you want more clarity or a worked example, just ask! 😊
8. Explain the structure of a compiler with a diagram.
🚀 1. Phases of a Compiler
A compiler is divided into two main parts: the Front-end (Analysis) and the Back-end (Synthesis). The Back-end includes:
Intermediate Code Optimization: Optimizes the code so that execution is faster.
Code Generation: Generates the target machine code.
Code Optimization: Performs machine-specific optimization for better performance.
+------------------------+
| Source Program |
+-----------+------------+
|
v
+-----------+------------+
| Lexical Analysis |
+-----------+------------+
|
v
+-----------+------------+
| Syntax Analysis |
+-----------+------------+
|
v
+-----------+------------+
| Semantic Analysis |
+-----------+------------+
|
v
+-----------+------------+
| Intermediate Code Gen. |
+-----------+------------+
|
v
+-----------+------------+
| Code Optimization |
+-----------+------------+
|
v
+-----------+------------+
| Code Generation |
+-----------+------------+
|
v
+-----------+------------+
| Target Code |
+------------------------+
2. Syntax Analysis
Input: Tokens
Output: Parse Tree
Task: Checks the syntax rules and generates the parse tree.
Example: Detecting a syntax error such as `int = 5 a;`
3. Semantic Analysis
Task: Checks meaning, e.g. type checking and declaration checking.
4. Intermediate Code Generation
Task: Produces machine-independent intermediate code such as three-address code.
5. Code Optimization
Task: Improves the intermediate code without changing its meaning.
6. Code Generation
Task: Produces the final target machine code.
Error Handler:
Detects and handles errors in every phase (Lexical, Syntax, Semantic).
Example: Syntax Errors, Type Errors, Runtime Errors.
🚀 5. Front-end vs Back-end
Aspect Front-end (Analysis) Back-end (Synthesis)
Focus Source Code Analysis Target Code Generation
Phases Lexical, Syntax, Semantic Optimization, Code Generation
Output Intermediate Code Machine Code
Error Handling Syntax & Semantic Errors Runtime & Logical Errors
✅ Key Takeaways:
1. A compiler is divided into two main parts: the Front-end and the Back-end.
2. Each phase has a specific task that makes the code efficient and optimized.
3. The Symbol Table and Error Handler support every phase.
4. The diagram makes clear that each phase feeds into the next.
If you need the diagram or any phase explained further, do ask! 😊
9. Discuss the role of a lexical analyzer
🚀 Roles of the Lexical Analyzer
1. Tokenization
Breaks the source code into tokens, the smallest meaningful units.
Example:
int x = 10;
Tokens: `KEYWORD (int)`, `IDENTIFIER (x)`, `OPERATOR (=)`, `NUMBER (10)`, `SEMICOLON (;)`
Example: `if (x > 0)`
Tokens: `KEYWORD (if)`, `SYMBOL (()`, `IDENTIFIER (x)`, `OPERATOR (>)`, `NUMBER (0)`, `SYMBOL ())`
2. Removing Whitespace and Comments
Strips out whitespace and comments, which are not needed for parsing.
Example:
// This is a comment
int a = 5;
Only `int a = 5;` is passed on as tokens.
3. Identifying Lexemes and Generating Tokens
Matches each lexeme against a pattern and emits the corresponding token.
4. Error Detection
Detects lexical errors such as invalid characters or unknown symbols.
Example:
int @a = 5;
Here `@` is an invalid character and is reported as a lexical error.
5. Symbol Table Management
Stores identifiers and literals in the symbol table along with their type, value, and scope.
Example:
int x = 10;
Lexeme Token Pattern
int KEYWORD `int`
x IDENTIFIER `[a-zA-Z][a-zA-Z0-9]*`
= OPERATOR `=`
10 NUMBER `[0-9]+`
; SYMBOL `;`
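A minimal sketch (the dictionary layout is illustrative) of how the analyzer might record `x` from `int x = 10;` in the symbol table:

symbol_table = {}

def add_symbol(name, sym_type, value=None, scope='global'):
    # Store an identifier with its type, value, and scope
    symbol_table[name] = {'type': sym_type, 'value': value, 'scope': scope}

add_symbol('x', 'int', 10)
print(symbol_table)  # {'x': {'type': 'int', 'value': 10, 'scope': 'global'}}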
🚀 Lexical Errors
Common lexical errors include invalid characters (e.g. `@`), malformed identifiers, and unterminated strings or comments.
🚀 Diagram: Role of the Lexical Analyzer
+---------------------------------------------+
| Source Program (Characters) |
+-------------------------+-------------------+
|
v
+-------------------------+-------------------+
| Lexical Analyzer (Scanner) |
+-------------------------+-------------------+
|
+--------------+-----------------+
| |
v v
+-----------------+ +------------------+
| Symbol Table | | Token Stream |
+-----------------+ +------------------+
✅ Key Takeaways:
1. The Lexical Analyzer breaks the source code into tokens.
2. It ignores whitespace and comments.
3. It detects errors and maintains the symbol table.
4. It passes tokens on to the Syntax Analyzer.
5. This phase improves efficiency and clarity in compiler design.
If you want more clarity or a specific example worked out, do ask! 😊
10. Explain syntax analyzer in detail.
🚀 Types of Parsers
Parsers are mainly classified into two types:
1. Top-Down Parsers
They begin from the start symbol and build the syntax tree by applying the grammar rules.
Examples:
Recursive Descent Parsing
LL(1) Parsing (Left-to-right, Left-most derivation)
2. Bottom-Up Parsers
They process tokens from the bottom up, applying production rules in reverse order.
Examples:
Shift-Reduce Parsing
LR Parsing (Left-to-right, Right-most derivation)
🚀 Example Grammar
1. `E → E + T`
2. `E → T`
3. `T → T * F`
4. `T → F`
5. `F → (E)`
6. `F → id`
Expression:
`id + id * id`
Step-by-Step Parsing:
1. Input Tokens:
`id`, `+`, `id`, `*`, `id`
The parse tree for `id + id * id`:
        E
      / | \
     E  +  T
     |    /|\
     T   T * F
     |   |   |
     F   F   id
     |   |
     id  id
🚀 Parsing Techniques
1. Recursive Descent Parsing
Description: A top-down approach in which a function is written for each non-terminal. The method may use backtracking if the current choice fails.
Advantages:
Simple to implement.
Follows the structure of the grammar naturally.
Disadvantages:
Cannot handle left recursion directly.
Backtracking can cause performance problems for complex grammars.
2. LL(1) Parsing
Description: Parses with a Left-to-right scan and a Left-most derivation. It is a non-recursive, table-driven approach.
Advantages:
Efficient parsing.
No backtracking required.
Disadvantages:
Cannot handle left recursion.
The grammar must be transformed (left factoring, recursion elimination).
3. Shift-Reduce Parsing
Description: A bottom-up, stack-based approach. It uses shift (move an input symbol onto the stack) and reduce (apply a production rule) operations, as traced below.
Advantages:
Efficient for large grammars.
Supports bottom-up derivations.
Disadvantages:
More complex to implement than LL parsers.
Difficult to handle ambiguity in grammar.
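A hand-worked shift-reduce trace for `id + id * id` with the grammar above (the stack grows to the right, `$` marks the ends):
Stack            Input            Action
$                id + id * id $   shift
$ id             + id * id $      reduce F → id
$ F              + id * id $      reduce T → F
$ T              + id * id $      reduce E → T
$ E              + id * id $      shift
$ E +            id * id $        shift
$ E + id         * id $           reduce F → id
$ E + F          * id $           reduce T → F
$ E + T          * id $           shift
$ E + T *        id $             shift
$ E + T * id     $                reduce F → id
$ E + T * F      $                reduce T → T * F
$ E + T          $                reduce E → E + T
$ E              $                accept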
4. LR Parsing
Description: A bottom-up approach that uses an LR parsing table; it is an advanced form of shift-reduce parsing. LR parsers are efficient for a wide class of context-free grammars.
Advantages:
Efficient and can parse large grammars.
No backtracking required.
Disadvantages:
More complex to implement than LL(1).
Requires detailed parse tables.
✅ Key Takeaways:
1. The role of the Syntax Analyzer (Parser) is to verify the program's structure against the grammar rules.
2. It generates a parse tree that represents the structure of the source code.
3. Parsers are either Top-Down or Bottom-Up.
4. The parser must be efficient so that parsing is fast.
5. Error detection and error recovery are important tasks of the syntax analyzer.
If you need a specific example or more detail, do ask! 😊
11. Explain Recursive Descent Parser with an example
Grammar:
E → E + T
E → T
T → id
Parser Functions:
1. E()
2. T()
Python implementation (the grammar's left recursion is handled by parsing a T and then repeatedly matching `+ T`):

tokens = ['id', '+', 'id']
current_token = 0  # Pointer to the current token

def match(expected):
    global current_token
    if current_token < len(tokens) and tokens[current_token] == expected:
        current_token += 1  # Move to the next token
    else:
        raise SyntaxError(f"Expected {expected}")

def T():
    match('id')  # T → id

def E():
    T()  # E → T ...
    while current_token < len(tokens) and tokens[current_token] == '+':
        match('+')  # ... followed by '+ T' parts (E → E + T, recursion removed)
        T()

E()
if current_token == len(tokens):
    print("Parsing Successful!")
else:
    print("Parsing Failed!")
1. E() function:
`E()` is called first. It calls T() to process one term.
After that, if a + symbol is found, `match('+')` is called, and `T()` is called again to process the next term.
2. T() function:
As soon as `T()` is called, it matches an id (which is the term part of our grammar).
3. match() function:
`match()` checks the token; if the expected token is found, it moves the pointer to the next token.
If the expected token is not found, a SyntaxError is raised.
Step-by-Step Example
Input Tokens:
`id + id`
1. Initial Call:
The parser calls `E()`. Inside `E()`, `T()` is called.
2. T() Call:
As soon as `T()` is called, id is matched. The token pointer moves to the next token (`+`).
3. Back to E():
Now `E()` checks whether the next token is +. Yes, `+` is found, so `match('+')` is called and the token pointer moves to the next token (`id`).
4. Second T() Call:
`T()` is called again and matches the second id; the pointer reaches the end of the input.
5. Parsing Complete:
If the input tokens parse successfully and the token pointer reaches the end of the tokens, parsing is successful.
🚀 Left Recursion Problem
A grammar of the form
A → Aα | β
is left-recursive. Our grammar is left-recursive too:
E → E + T | T
T → id
If we implement this grammar directly in a Recursive Descent parser, it can cause infinite recursion, which is why the implementation above matches `+ T` iteratively.
Key Takeaways:
1. Recursive Descent Parsing is a simple top-down parsing technique in which a function is written for each non-terminal.
2. The technique can use backtracking if a production rule fails.
3. Recursive Descent parsers cannot handle left recursion and can be inefficient for complex grammars.
4. The parser is effective for LL(1) grammars.
If you need any further clarification or an additional example, let me know! 😊
12. Write the Recursive Descent Parser for the context-free grammar: S → +SS | SS | a.
S → +SS | SS | a
We will now write the Recursive Descent Parser for this grammar.
🚀 Step-by-Step Explanation:
1. We will define a function `S()` to handle non-terminal `S` and process it according to the three
rules of the grammar:
Rule 1: `S → +SS`
Rule 2: `S → SS`
Rule 3: `S → a`
2. The parser will use a global token list which will be parsed one by one. We will define a `match()`
function to consume the current token and move to the next one.
3. The parsing process will proceed recursively by matching the tokens with the grammar rules.
# The S → SS production makes this grammar ambiguous and unsuitable for
# naive predictive parsing (two alternatives can begin with the same token),
# so this sketch handles SS at the top level: after one S has been parsed,
# any remaining input is parsed as another S.
tokens = ['a', '+', 'a', 'a']  # input string: a + a a
current_token = 0

def S():
    global current_token
    # Rule 1: S → +SS
    if current_token < len(tokens) and tokens[current_token] == '+':
        match('+')  # Consume the '+' symbol
        S()         # Parse first S
        S()         # Parse second S
    # Rule 3: S → a
    elif current_token < len(tokens) and tokens[current_token] == 'a':
        match('a')  # Consume the 'a' symbol
    else:
        raise SyntaxError("Invalid syntax")

def parse():
    S()  # first S
    while current_token < len(tokens):
        S()  # Rule 2: S → SS — parse the remaining input as further S's
    print("Parsing Successful!")
# Function to match the expected token and move to the next token
def match(expected_token):
    global current_token
    if current_token < len(tokens) and tokens[current_token] == expected_token:
        current_token += 1  # Move to the next token
    else:
        found = tokens[current_token] if current_token < len(tokens) else 'end of input'
        raise SyntaxError(f"Expected {expected_token}, but got {found}")

parse()  # prints: Parsing Successful!
1. Start with `parse()`:
`parse()` calls `S()`. The current token is `a`, so Rule 1 (`S → +SS`) is skipped; Rule 3 matches, `a` is consumed, and `current_token` moves to the next token (`+`).
2. Rule `S → SS`:
Tokens remain, so `parse()` applies `S → SS` at the top level and calls `S()` again for the second `S`.
3. Second Recursive Call to `S()`:
Now, we process the second `S()`. The current token is `+`, so we apply the rule `S → +SS`,
consume the `+`, and recursively call `S()` twice:
First `S()` matches `a` (consumes `a`).
Second `S()` matches `a` (consumes `a`).
4. Completion:
The `current_token` pointer has moved through all tokens, and the parsing is successful.
Key Takeaways:
1. Recursive Descent Parser is simple to implement and works by recursively processing each non-
terminal according to its grammar rules.
2. The given grammar `S → +SS | SS | a` is successfully parsed with this approach.
3. The `S → SS` production makes the grammar unsuitable for simple predictive parsing; a fully general parser would need backtracking, which the simplified scheme above avoids by handling `SS` at the top level.
4. Error Handling is achieved by raising a `SyntaxError` when the token does not match the expected
token.
If you need any further explanation or additional examples, feel free to ask! 😊