ch2 3

This document discusses syntax analysis in programming language compilation. It reviews lexical analysis, parsing, context-free grammars, LL and LR parsing, and implementation of parsers. Specifically, it covers top-down and bottom-up parsing with examples, the hierarchy of linear parsers from CFGs to regular grammars, and implementation of recursive descent parsers. It also briefly mentions semantic analysis and attribute grammars.


CSCI 431 Programming Languages

Fall 2003

Syntax Analysis
(Section 2.2-2.3)

A modification of slides developed by Felix Hernandez-Campos at UNC Chapel Hill

1
Review: Compilation/Interpretation

[Figure: Source Code → Compiler or Interpreter → Target Code; stages labeled Translation, Execution, Interpretation]

2
Review: Syntax Analysis

• Specifying the form of a programming language
– Tokens
» Regular Expressions
(also F.A.s & Reg. Grammars)
– Syntax
» Context-Free Grammars
(also P.D.A.s)

[Figure: Source Code → Compiler or Interpreter (Translation, Execution) → Target Code]

3
Phases of Compilation

4
Syntax Analysis

• Syntax:
– Webster’s definition: 1 a : the way in which linguistic
elements (as words) are put together to form constituents
(as phrases or clauses)
• The syntax of a programming language
– Describes its form
» Organization of tokens
» Context Free Grammars (CFGs)
– Must be recognizable by compilers and interpreters
» Parsing
» LL and LR parsers

5
Context Free Grammars

• CFGs
– Add recursion to regular expressions
» Nested constructions
– Notation
expression → identifier | number | - expression
| ( expression )
| expression operator expression
operator → + | - | * | /

» Terminal symbols
» Non-terminal symbols
» Production rule (i.e. substitution rule)
non-terminal symbol → terminal and non-terminal symbols
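
The notation above can be mirrored in a small data structure. The following is a minimal sketch, assuming a hypothetical encoding in which each non-terminal maps to a list of alternatives and each alternative is a list of symbols; the names `GRAMMAR`, `non_terminals`, and `terminals` are illustrative and not part of the course material.

```python
# Hypothetical encoding of the expression grammar from this slide:
# non-terminal -> list of alternatives, each alternative a list of symbols.
GRAMMAR = {
    "expression": [
        ["identifier"],
        ["number"],
        ["-", "expression"],
        ["(", "expression", ")"],
        ["expression", "operator", "expression"],
    ],
    "operator": [["+"], ["-"], ["*"], ["/"]],
}


def non_terminals(grammar):
    """Symbols that appear on the left-hand side of some production."""
    return set(grammar)


def terminals(grammar):
    """Symbols that appear only on right-hand sides."""
    used = {sym for alts in grammar.values() for alt in alts for sym in alt}
    return used - non_terminals(grammar)


if __name__ == "__main__":
    print(sorted(non_terminals(GRAMMAR)))
    print(sorted(terminals(GRAMMAR)))
```

Under this encoding a production rule is one (left-hand side, alternative) pair, and the recursion that CFGs add to regular expressions shows up as `expression` appearing inside its own alternatives.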

6
Parsing

• Parsing an arbitrary Context Free Grammar
– O(n^3)
– Too slow for large programs
• Linear-time parsing
– LL parsers (a ‘Left-to-right, Left-most’ derivation)
» Recognize LL grammars
» Use a top-down strategy
– LR parsers (a ‘Left-to-right, Right-most’ derivation)
» Recognize LR grammars
» Use a bottom-up strategy

7
Parsing example

• Example: comma-separated list of identifiers

– CFG

id_list → id id_list_tail
id_list_tail → , id id_list_tail
id_list_tail → ;

– Parsing

A, B, C;

8
Top-down derivation of A, B, C;

CFG

Left-to-right,
Left-most derivation
LL(1) parsing
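
The left-most derivation can be traced programmatically. This is a rough sketch, assuming the productions id_list → id id_list_tail, id_list_tail → , id id_list_tail, and id_list_tail → ; and assuming the scanner has already reduced A, B, C; to the token sequence id , id , id ;. The function name `leftmost_derivation` is hypothetical.

```python
def leftmost_derivation(tokens):
    """Return the sentential forms of a left-most derivation of `tokens`.

    Tokens are the strings 'id', ',' and ';'.
    Raises SyntaxError if the input is not a valid id_list.
    """
    form = ["id_list"]      # current sentential form
    steps = [list(form)]    # every form produced so far
    pos = 0                 # index of the next unmatched input token
    i = 0                   # index of the left-most unprocessed symbol
    while i < len(form):
        sym = form[i]
        look = tokens[pos] if pos < len(tokens) else None
        if sym == "id_list":
            rhs = ["id", "id_list_tail"]
        elif sym == "id_list_tail":
            # One token of lookahead selects the production: LL(1).
            if look == ",":
                rhs = [",", "id", "id_list_tail"]
            elif look == ";":
                rhs = [";"]
            else:
                raise SyntaxError(f"unexpected token {look!r}")
        else:
            # Terminal symbol: it must match the input token.
            if sym != look:
                raise SyntaxError(f"expected {sym!r}, found {look!r}")
            pos += 1
            i += 1
            continue
        form[i:i + 1] = rhs  # replace the left-most non-terminal
        steps.append(list(form))
    if pos != len(tokens):
        raise SyntaxError("unexpected input after end of list")
    return steps


if __name__ == "__main__":
    for step in leftmost_derivation(["id", ",", "id", ",", "id", ";"]):
        print(" ".join(step))
```

Each printed line is one sentential form; at every step the left-most non-terminal is the one replaced, and the choice of production depends only on the current input token.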

9
Top-down derivation of A, B, C;

CFG

10
Bottom-up parsing of A, B, C;

CFG

Left-to-right,
Right-most derivation
LR parsing
(a shift-reduce parser)

11
Bottom-up parsing of A, B, C;

CFG

12
Bottom-up parsing of A, B, C;

CFG

13
LR Parsing vs. LL Parsing

• LL
– A ‘top-down’ or ‘predictive’ parser
– Predict needed productions based on the current left-most
non-terminal in the tree and the current input token
– The top-of-stack contains the left-most non-terminal
– The stack contains a record of what the parser expects to
see
• LR
– A ‘bottom-up’ or shift-reduce parser
– Shifts tokens onto the stack until it recognizes a right-hand
side then reduces those tokens to their left-hand side
– The stack contains a record of what the parser has already
seen
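
The shift-reduce behaviour described above can be imitated with a plain list as the stack. The sketch below is deliberately naive: it assumes the productions id_list → id id_list_tail, id_list_tail → , id id_list_tail, and id_list_tail → ; and it finds right-hand sides by comparing the top of the stack against each rule (longest rule first). A real LR parser instead consults a table of states and the lookahead token. The names `RULES` and `shift_reduce` are hypothetical.

```python
# Right-hand sides are listed longest first so that the longest match wins
# when two right-hand sides both match the top of the stack.
RULES = [
    ("id_list_tail", [",", "id", "id_list_tail"]),
    ("id_list", ["id", "id_list_tail"]),
    ("id_list_tail", [";"]),
]


def shift_reduce(tokens):
    """Return (accepted, trace) for a token list over 'id', ',' and ';'."""
    stack, trace = [], []
    for tok in tokens:
        stack.append(tok)                      # shift
        trace.append(("shift", list(stack)))
        reduced = True
        while reduced:                         # reduce as long as possible
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    del stack[-len(rhs):]      # pop the right-hand side
                    stack.append(lhs)          # push the left-hand side
                    trace.append(("reduce", list(stack)))
                    reduced = True
                    break
    return stack == ["id_list"], trace


if __name__ == "__main__":
    ok, trace = shift_reduce(["id", ",", "id", ",", "id", ";"])
    for action, stack in trace:
        print(f"{action:6} {' '.join(stack)}")
    print("accepted" if ok else "rejected")
```

With this grammar the parser shifts all six tokens before the first reduction is possible, so the stack holds everything it has already seen; the reductions then collapse the list from the right.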

14
An appropriate LR Grammar

id_list → id_list_prefix ;
id_list_prefix → id_list_prefix , id
id_list_prefix → id
This grammar can’t be parsed top-down!
Problems for LL grammars:
- left recursion, example above
- common prefixes, example:
stmt → id := expr | id (arg_list)
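
Both problems can be spotted mechanically. The following is an illustrative sketch only, assuming the grammars are encoded as Python dictionaries (non-terminal to list of alternatives); the function names are hypothetical. It checks just for immediate left recursion and for alternatives that start with the same symbol; a full LL(1) test would need FIRST and FOLLOW sets.

```python
def immediate_left_recursion(grammar):
    """Non-terminals with a production whose right-hand side starts with itself."""
    return {
        lhs
        for lhs, alts in grammar.items()
        if any(alt and alt[0] == lhs for alt in alts)
    }


def common_prefixes(grammar):
    """Non-terminals with two alternatives that begin with the same symbol."""
    result = set()
    for lhs, alts in grammar.items():
        firsts = [alt[0] for alt in alts if alt]
        if len(firsts) != len(set(firsts)):
            result.add(lhs)
    return result


# The two examples from this slide, in the assumed encoding.
LR_ID_LIST = {
    "id_list": [["id_list_prefix", ";"]],
    "id_list_prefix": [["id_list_prefix", ",", "id"], ["id"]],
}
STMT = {
    "stmt": [["id", ":=", "expr"], ["id", "(", "arg_list", ")"]],
}

if __name__ == "__main__":
    print(immediate_left_recursion(LR_ID_LIST))
    print(common_prefixes(STMT))
```

A predictive parser cannot choose between the two `stmt` alternatives after seeing only `id`, and expanding `id_list_prefix` top-down would recurse without consuming any input.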
15
LL(1) Grammar for the Calculator Language

16
LR(1) Grammar for the Calculator Language

17
Hierarchy of Linear Parsers

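• Basic containment relationship
– Not every CFG can be recognized by an LR parser; LR parsers handle only a subset of the CFGs
– LL parsers handle a smaller subset: every LL(k) grammar is also LR(k), but not vice versa

[Figure: nested regions, CFGs ⊃ LR parsing ⊃ LL parsing]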

18
Bigger Picture

• Chomsky Hierarchy of Grammars (each class contains the ones below it)
– Unrestricted Grammar
– Context Sensitive Grammar
– Context Free Grammar
– Regular Grammar

19
Implementation of an LL Parser

• Two options:
– A recursive descent parser (section 2.2.3)
» For LL grammars only
– Parse table and a driver (section 2.2.5)
» LR parsers covered in section 2.2.6
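
The second option, a parse table plus a driver, might look like the sketch below. It is not the textbook's driver: it assumes the id_list grammar (id_list → id id_list_tail, id_list_tail → , id id_list_tail | ;), a hand-written table, and a hypothetical end marker `$$`. The names `TABLE`, `NON_TERMINALS`, `END`, and `ll1_parse` are illustrative.

```python
# Hypothetical parse table: (non-terminal, lookahead token) -> right-hand side.
TABLE = {
    ("id_list", "id"): ["id", "id_list_tail"],
    ("id_list_tail", ","): [",", "id", "id_list_tail"],
    ("id_list_tail", ";"): [";"],
}
NON_TERMINALS = {"id_list", "id_list_tail"}
END = "$$"  # assumed end-of-input marker


def ll1_parse(tokens):
    """Return True if `tokens` (strings 'id', ',' and ';') form a valid id_list."""
    stack = [END, "id_list"]          # the top of the stack is the end of the list
    tokens = list(tokens) + [END]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NON_TERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False          # no prediction for this lookahead
            stack.extend(reversed(rhs))  # left-most symbol ends up on top
        elif top == look:
            pos += 1                  # terminal matches: consume the token
        else:
            return False
    return pos == len(tokens)


if __name__ == "__main__":
    print(ll1_parse(["id", ",", "id", ",", "id", ";"]))
    print(ll1_parse(["id", ",", ";"]))
```

The driver is independent of the grammar; only the table changes. The stack holds what the parser still expects to see, which is the LL property noted earlier in the slides.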

20
Recursive Descent Parser Example

• LL(1) grammar

21
Recursive Descent Parser Example

• Outline of recursive parser
– This parser only verifies syntax
– match is the scanner
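
As a rough illustration of the outline, here is a recursive descent recognizer with one method per non-terminal. It is a sketch under stated assumptions, not the slide's parser: it uses the small id_list grammar (id_list → id id_list_tail, id_list_tail → , id id_list_tail | ;) rather than the calculator grammar, the `Parser` class and the `$$` end marker are hypothetical, and `match` here simply checks the current token and advances over a pre-scanned token list.

```python
class Parser:
    """Recursive descent recognizer for the id_list grammar (syntax check only)."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$$"]  # assumed end-of-input marker
        self.pos = 0

    @property
    def input_token(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Consume the current token if it is the expected one."""
        if self.input_token != expected:
            raise SyntaxError(
                f"expected {expected!r}, found {self.input_token!r}"
            )
        self.pos += 1

    def id_list(self):
        # id_list -> id id_list_tail
        self.match("id")
        self.id_list_tail()

    def id_list_tail(self):
        # The current token predicts which production to use.
        if self.input_token == ",":
            # id_list_tail -> , id id_list_tail
            self.match(",")
            self.match("id")
            self.id_list_tail()
        elif self.input_token == ";":
            # id_list_tail -> ;
            self.match(";")
        else:
            raise SyntaxError(f"unexpected token {self.input_token!r}")

    def parse(self):
        self.id_list()
        self.match("$$")


if __name__ == "__main__":
    Parser(["id", ",", "id", ",", "id", ";"]).parse()
    print("syntax ok")
```

Like the parser outlined on the slide, this one only verifies syntax: it builds no tree and either returns normally or raises `SyntaxError`.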

22
Recursive Descent Parser Example

23
Recursive Descent Parser Example

24
Recursive Descent Parser Example

A program that generates recursive descent parsers: JavaCC

25
Semantic Analysis

• Specifying the meaning of a programming language
– Attribute Grammars

[Figure: Source Code → Compiler or Interpreter (Translation, Execution) → Target Code]

26
