
Lecture 6 (6-2-23)

The document discusses parsing and syntax analysis in programming languages. It defines context-free grammars and how they are used to specify the syntax of programming languages. Parsers can be generated from context-free grammars to check if a program's syntax follows the grammar rules.

Uploaded by

Tahsk

Parsing/Syntax Analysis

• A parser for a grammar of a programming language:
• verifies that the string of tokens for a program in that language can indeed be generated from that grammar
• reports any syntax errors in the program
• constructs a parse tree representation of the program (not necessarily explicit)
• usually calls the lexical analyzer to supply a token to it when necessary
• could be hand-written or automatically generated
• is based on context-free grammars
• Grammars are generative mechanisms like regular expressions
• Pushdown automata are the machines that recognize context-free languages, just as finite-state automata (FSA) recognize regular languages (RL)

• A regular expression (RE) specifies a pattern for a set of strings (a language); an FSA checks whether a string belongs to the set of strings generated by the RE
• A context-free grammar specifies the syntax of a programming language; a pushdown automaton checks whether a given program follows the syntax specified by that grammar
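The role of the stack in a pushdown automaton can be illustrated with a small sketch. It is not taken from the lecture: the function name balanced is hypothetical, and the language of properly nested parentheses is chosen only because it is context-free but not regular.

```python
def balanced(text: str) -> bool:
    """Accept strings of '(' and ')' that are properly nested.

    The explicit stack plays the role of the pushdown store: a finite
    automaton alone cannot remember how many '(' are still open.
    """
    stack = []
    for ch in text:
        if ch == "(":
            stack.append(ch)      # push on an opening parenthesis
        elif ch == ")":
            if not stack:         # nothing to match: reject
                return False
            stack.pop()           # pop the matching '('
        else:
            return False          # symbol outside the alphabet: reject
    return not stack              # accept only if every '(' was closed
```

For instance, balanced("(()())") accepts, while balanced("(()") and balanced(")(") reject.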
Grammars
• Every programming language has precise grammar rules that describe the syntactic structure of well-formed programs
• In C, for example, the rules state how functions are made out of parameter lists, declarations, and statements, how statements are made of expressions, and so on
• Grammars are easy to understand, and parsers for programming languages can be constructed automatically from certain classes of grammars
• Parsers or syntax analyzers are generated for a particular grammar
• Context-free grammars are usually used for syntax specification of programming languages
Context-free Grammars
• A CFG is denoted as G = (N, T, P, S)
• N: Finite set of non-terminals
• T: Finite set of terminals
• S ∈ N: The start symbol
• P: Finite set of productions, each of the form A → α, where A ∈ N and α ∈ (N ∪ T)∗
• Usually, only P is specified and the first production corresponds to that of the start symbol
• Examples
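As one illustrative sketch (not from the lecture), the 4-tuple G = (N, T, P, S) can be written down as a small Python data structure. The class name Grammar, its field names, and the helper is_well_formed are hypothetical. The sample grammar uses the productions E → E + E and E → id, with the start symbol taken from the first production.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset   # N
    terminals: frozenset      # T
    productions: tuple        # P: pairs (A, alpha), alpha a tuple of symbols
    start: str                # S

    def is_well_formed(self) -> bool:
        """Check S ∈ N, N ∩ T = ∅, and every production A → α has
        A ∈ N and α ∈ (N ∪ T)*."""
        symbols = self.nonterminals | self.terminals
        return (
            self.start in self.nonterminals
            and not (self.nonterminals & self.terminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# Sample grammar: E → E + E | id
EXPR = Grammar(
    nonterminals=frozenset({"E"}),
    terminals=frozenset({"id", "+"}),
    productions=(("E", ("E", "+", "E")), ("E", ("id",))),
    start="E",
)
```

Here EXPR.is_well_formed() holds; dropping "+" from the terminal set would make the first production ill-formed.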
Derivations
• E ⇒ E + E ⇒ id + E ⇒ id + id is a derivation of the terminal string id + id from E (in the slide, each ⇒ is subscripted with the production used at that step: E → E + E, then E → id, then E → id)


• In a derivation, a production is applied at each step to replace a nonterminal by the right-hand side of the corresponding production
• In the above example, the productions E → E + E, E → id, and E → id are applied at steps 1, 2, and 3, respectively
• The above derivation is written in short as E ⇒∗ id + id, and is read as "E derives id + id"
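The derivation of id + id can be replayed mechanically. The following is a minimal sketch, assuming sentential forms are stored as tuples of symbols; the names apply_production and derive_id_plus_id are hypothetical.

```python
def apply_production(form, index, production):
    """Replace the nonterminal at form[index] by the right-hand side
    of the given production (lhs, rhs)."""
    lhs, rhs = production
    if form[index] != lhs:
        raise ValueError(f"symbol at position {index} is not {lhs}")
    return form[:index] + rhs + form[index + 1:]


E_PLUS = ("E", ("E", "+", "E"))   # E → E + E
E_ID = ("E", ("id",))             # E → id


def derive_id_plus_id():
    """Return the sentential forms of E ⇒ E + E ⇒ id + E ⇒ id + id."""
    steps = [("E",)]
    steps.append(apply_production(steps[-1], 0, E_PLUS))  # step 1
    steps.append(apply_production(steps[-1], 0, E_ID))    # step 2
    steps.append(apply_production(steps[-1], 2, E_ID))    # step 3
    return steps
```

Each element of the returned list is one sentential form, so the list as a whole witnesses E ⇒∗ id + id.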
Context-free languages

• Context-free grammars (CFGs) generate context-free languages (CFLs)


• The language generated by G, denoted L(G), is
• L(G) = {w | w ∈ T∗, and S ⇒∗ w} i.e., a string is in L(G), if
• the string consists solely of terminals
• the string can be derived from S
• Examples
• L(G1) = Set of all expressions with +, *, names, and balanced ’(’ and ’)’
• L(G2) = Set of palindromes over 0 and 1
• L(G3) = {aⁿbⁿ | n ≥ 0}
• L(G4) = {x | x has equal no. of a’s and b’s}
• A string α ∈ (N ∪ T)∗ is a sentential form if S ⇒∗ α
• Two grammars G1 and G2 are equivalent, if L(G1) = L(G2)
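The definition L(G) = {w | w ∈ T∗ and S ⇒∗ w} can be explored by enumerating derivations up to a length bound. The sketch below makes several assumptions: the grammar S → aSb | ε is a standard choice for {aⁿbⁿ | n ≥ 0} and is not necessarily the G3 of the lecture; nonterminals are uppercase letters and terminals are lowercase letters; and the pruning rule is adequate for grammars like this one whose sentential forms carry at most one nonterminal. The function name language_up_to is hypothetical.

```python
from collections import deque


def language_up_to(productions, start, max_len):
    """Return every terminal string of length <= max_len derivable
    from start, expanding the leftmost nonterminal at each step."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:               # only terminals left: form is in L(G)
            words.add(form)
            continue
        for lhs, rhs in productions:
            if lhs != form[pos]:
                continue
            new = form[:pos] + rhs + form[pos + 1:]
            n_terminals = sum(c.islower() for c in new)
            # Prune forms that can no longer yield a short enough word.
            if n_terminals <= max_len and len(new) <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return words


# Assumed grammar for {a^n b^n | n >= 0}: S → aSb | ε
G3_PRODUCTIONS = (("S", "aSb"), ("S", ""))
```

With a bound of 6, language_up_to(G3_PRODUCTIONS, "S", 6) yields the empty string, ab, aabb, and aaabbb.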
Derivation Trees

• Derivations can be displayed as trees


• The internal nodes of the tree are all non-terminals and the leaves are all terminals
• Corresponding to each internal node A, there exists a production in P whose right-hand side is the list of the children of A, read from left to right
• The yield of a derivation tree is the list of the labels of all the leaves read from left to right
• If α is the yield of some derivation tree for a grammar G, then S ⇒∗ α and conversely
Derivation Tree Example

Productions applied at each step:
• Step 1: S → aAS
• Step 2: A → SbA
• Step 3: S → a
• Step 4: A → ba
• Step 5: S → a
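The tree built by these five steps can be sketched in code to check the two properties of derivation trees: every internal node matches a production, and the yield is the derived string. The representation (a leaf is a string, an internal node is a pair of a label and a list of children) and the names tree_yield and uses_only_productions are assumptions made for illustration. Only the four distinct productions listed in the example are included.

```python
# Productions that appear in the example.
PRODUCTIONS = {
    ("S", ("a", "A", "S")),
    ("S", ("a",)),
    ("A", ("S", "b", "A")),
    ("A", ("b", "a")),
}

# Tree obtained by applying steps 1-5 in order.
TREE = ("S", ["a",
              ("A", [("S", ["a"]), "b", ("A", ["b", "a"])]),
              ("S", ["a"])])


def label(node):
    """Label of a node: the string itself for a leaf, else the nonterminal."""
    return node if isinstance(node, str) else node[0]


def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    if isinstance(node, str):
        return node
    return "".join(tree_yield(child) for child in node[1])


def uses_only_productions(node, productions):
    """Check that each internal node A with children X1..Xk
    corresponds to a production A → X1..Xk."""
    if isinstance(node, str):
        return True
    head, children = node
    rhs = tuple(label(child) for child in children)
    return (head, rhs) in productions and all(
        uses_only_productions(child, productions) for child in children
    )
```

The yield of TREE is aabbaa, and every internal node corresponds to one of the listed productions.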
Leftmost and Rightmost Derivations

• If at each step in a derivation a production is applied to the leftmost nonterminal, then the derivation is said to be leftmost. A rightmost derivation is defined similarly.
• If w ∈ L(G) for some G, then w has at least one parse tree, and corresponding to each parse tree, w has a unique leftmost derivation and a unique rightmost derivation
• If some word w in L(G) has two or more parse trees, then G is said to be ambiguous
• i.e., w has more than one leftmost derivation or more than one rightmost derivation
• A CFL for which every grammar G that generates it is ambiguous is said to be an inherently ambiguous CFL
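Ambiguity can be made concrete by counting parse trees. The sketch below is specific to the grammar E → E + E | id used in the derivation example earlier; the function name count_parse_trees is hypothetical. It splits the token sequence at each "+" that could be the operator of the root production and multiplies the counts for the two sides.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Number of distinct parse trees for tokens under E → E + E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0   # E → id
        total = 0
        for k in range(i + 1, j - 1):              # E → E + E, split at k
            if tokens[k] == "+":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0
```

The string id + id has one parse tree, but id + id + id has two (grouping to the left or to the right), so this grammar is ambiguous.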
Leftmost and Rightmost Derivations: An Example

[Figure: leftmost and rightmost derivations of the string aabbaa]
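Both derivations of aabbaa can be read off a single parse tree. The following is a sketch under assumptions: the tree is the one built from the productions S → aAS, A → SbA, S → a, A → ba, S → a of the derivation tree example, a leaf is a string, an internal node is a pair of a label and a list of children, and the name derivation is hypothetical.

```python
# Parse tree for aabbaa (assumed from the derivation tree example).
TREE = ("S", ["a",
              ("A", [("S", ["a"]), "b", ("A", ["b", "a"])]),
              ("S", ["a"])])


def render(form):
    """Write a sentential form (list of subtrees) as a string of labels."""
    return "".join(n if isinstance(n, str) else n[0] for n in form)


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by repeatedly expanding the
    leftmost (or rightmost) nonterminal according to the tree."""
    form = [tree]
    steps = [render(form)]
    while True:
        pending = [i for i, n in enumerate(form) if not isinstance(n, str)]
        if not pending:
            return steps
        i = pending[0] if leftmost else pending[-1]
        form[i:i + 1] = form[i][1]    # replace the node by its children
        steps.append(render(form))
```

The leftmost derivation is S ⇒ aAS ⇒ aSbAS ⇒ aabAS ⇒ aabbaS ⇒ aabbaa, and the rightmost derivation is S ⇒ aAS ⇒ aAa ⇒ aSbAa ⇒ aSbbaa ⇒ aabbaa. Both come from the same tree, which is why a parse tree determines a unique leftmost and a unique rightmost derivation.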
