Lecture 3
Describing Syntax and Semantics
Tentative Course Outline
• Introduction
• Describing Syntax: Terminology
• Formal Methods of Describing Syntax
• Additional Notes on Terminals and Nonterminals
• Grammars and Derivations
(SYNTAX) (SEMANTICS)
Elements of Syntax
Chomsky described four classes of grammars that define four classes of languages.
Two of these grammar classes, named context-free and regular, turned out to be useful
for describing the syntax of programming languages.
Regular grammars: The forms of the tokens of programming languages
Context-free grammars: The syntax of whole programming languages
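To illustrate how the forms of tokens can be described by regular patterns, here is a minimal sketch using Python regular expressions. The token classes (CONST, IDENT, OP) and their patterns are assumptions chosen for illustration; they are not taken from the lecture.

```python
import re

# Hypothetical token classes for a tiny assignment language.
# Each pattern is a regular expression, i.e. a regular-language description.
TOKEN_SPEC = [
    ("CONST", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*;]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("a = b + 42"))
```

Each pair couples a token (the category) with its lexeme (the matched text); a context-free grammar then describes how such tokens combine into whole programs.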
Context-Free Grammars
BNF Fundamentals
• The abstractions in a BNF description are called variables or nonterminals of a grammar.
• Lexemes and tokens are the terminals of a grammar.
• Nonterminals are often enclosed in angle brackets, e.g. <stmt>.
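• A rule has a left-hand side (LHS), which is the nonterminal being defined, and a right-hand side (RHS): LHS → RHS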
Copyright © 2017 Pearson Education, Ltd. All rights reserved. 1-18
Additional Notes on Terminals and Nonterminals
Terminals
Terminals are the smallest building blocks we consider in our
grammars. Some typical terminals:
• identifiers: these are the names used for variables,
classes, functions, methods and so on
• comments
Additional Notes on Terminals and Nonterminals
Nonterminals
Identifiers
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const
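One way to make the terminal/nonterminal distinction concrete is to store the grammar above as data. The following is a sketch only: the dictionary representation and the names GRAMMAR, nonterminals and terminals are assumptions made for illustration, not part of the lecture.

```python
# Each nonterminal (LHS) maps to a list of alternatives (RHSs);
# each alternative is a list of symbols.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def nonterminals(grammar):
    """Nonterminals are exactly the symbols that have a rule (an LHS)."""
    return set(grammar)


def terminals(grammar):
    """Terminals are the symbols that appear on some RHS but never as an LHS."""
    symbols = {s for alts in grammar.values() for alt in alts for s in alt}
    return symbols - set(grammar)


print(sorted(nonterminals(GRAMMAR)))
print(sorted(terminals(GRAMMAR)))
```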
Grammars and Derivations
• In order to check if a given string represents a valid program in the language, we try to derive
it in the grammar.
• Derivation starts from the start symbol <program>.
• At each step we replace a nonterminal with its definition (RHS of the rule).
• Every string of symbols in a derivation is a sentential form.
• A sentence is a sentential form that has only terminal symbols.
• A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the
one that is expanded.
• A grammar is ambiguous if and only if it generates a sentential form that has two or more
distinct parse trees.
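The derivation steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the lecture's own code: the dictionary encoding of the example grammar, the function name leftmost_derivation, and the list of rule choices are all assumptions. The sentence derived is a = b + const.

```python
# The lecture's example grammar, encoded as nonterminal -> list of alternatives.
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(grammar, start, choices):
    """Expand the leftmost nonterminal once per entry in `choices`.

    Each entry is the index of the alternative (RHS) to apply.
    Returns every sentential form produced, beginning with the start symbol.
    """
    form = [start]
    forms = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in grammar)
        form = form[:index] + grammar[form[index]][choice] + form[index + 1:]
        forms.append(" ".join(form))
    return forms


for step in leftmost_derivation(GRAMMAR, "<program>", [0, 0, 0, 0, 0, 0, 1, 1]):
    print("=>", step)
```

The last form printed contains only terminals, so it is a sentence of the language; every earlier line is a sentential form that still contains at least one nonterminal.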
Ambiguity
The statement A = B + C * A can be parsed as
A = (B + C) * A or
A = B + (C * A)
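The two readings can be reproduced by enumerating parse trees. The sketch below assumes an ambiguous expression grammar of the form <expr> → <expr> + <expr> | <expr> * <expr> | <id>; that grammar and the function name parse_trees are assumptions for illustration, since the slide text does not show the grammar itself. The input is assumed to be a well-formed list that alternates identifiers and operators.

```python
def parse_trees(tokens):
    """Return every parse tree for `tokens` under the assumed ambiguous grammar
    <expr> -> <expr> + <expr> | <expr> * <expr> | <id>.

    A tree is either an identifier string or a tuple (left, operator, right).
    """
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Operators sit at the odd positions; any of them may be the root.
    for i in range(1, len(tokens), 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                trees.append((left, tokens[i], right))
    return trees


for tree in parse_trees(["B", "+", "C", "*", "A"]):
    print(tree)
```

Two distinct trees come back for the right-hand side of the assignment, one grouping B + C first and one grouping C * A first, which is exactly what makes the grammar ambiguous.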
Grammars and Derivations
Handling Ambiguity
Operator Precedence
In mathematics * operation
has a higher precedence than
+
This can be implemented
with extra nonterminals
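A minimal sketch of the extra-nonterminal idea follows. It assumes the layered grammar <expr> → <expr> + <term> | <term>, <term> → <term> * <factor> | <factor>, <factor> → <id>; this grammar and the function names are assumptions for illustration. Each nonterminal becomes one function, and the left-recursive rules are written as loops, a common way to hand-code them.

```python
def parse_expr(tokens, pos=0):
    """<expr> -> <expr> + <term> | <term>, written as a loop over '+'."""
    tree, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        tree = (tree, "+", right)
    return tree, pos


def parse_term(tokens, pos):
    """<term> -> <term> * <factor> | <factor>, written as a loop over '*'."""
    tree, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "*":
        right, pos = parse_factor(tokens, pos + 1)
        tree = (tree, "*", right)
    return tree, pos


def parse_factor(tokens, pos):
    """<factor> -> <id>; any token that is not an operator counts as an <id>."""
    if pos >= len(tokens) or tokens[pos] in ("+", "*"):
        raise SyntaxError(f"identifier expected at position {pos}")
    return tokens[pos], pos + 1


tree, _ = parse_expr(["B", "+", "C", "*", "A"])
print(tree)
```

Because * is only reachable through <term>, which sits below <expr>, every multiplication is grouped before the surrounding addition, so only one parse tree is possible.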
Handling Ambiguity
Associativity of Operators
Associativity
• In a BNF rule, if the LHS appears at the beginning of the RHS, the rule is said to be
left recursive.
• Left recursion specifies left associativity.
A parse tree for A = B + C + A illustrating the
associativity of addition
Left associativity
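The link between left recursion and left associativity can be sketched by mirroring a rule directly in code. The rule <expr> → <expr> + <id> | <id> and the function names below are assumptions for illustration; the right-recursive variant is included only for contrast and does not come from the lecture. Inputs are assumed to be well-formed token lists.

```python
def parse_left(tokens):
    """<expr> -> <expr> + <id> | <id>   (left recursive).

    The LHS reappears at the start of the RHS, so everything before the
    last '+' is itself an <expr>.
    """
    if len(tokens) == 1:
        return tokens[0]
    return (parse_left(tokens[:-2]), tokens[-2], tokens[-1])


def parse_right(tokens):
    """<expr> -> <id> + <expr> | <id>   (right recursive, shown for contrast)."""
    if len(tokens) == 1:
        return tokens[0]
    return (tokens[0], tokens[1], parse_right(tokens[2:]))


tokens = ["B", "+", "C", "+", "A"]
print(parse_left(tokens))
print(parse_right(tokens))
```

The left-recursive rule nests the tree toward the left, grouping B + C before adding A, while the right-recursive rule groups C + A first.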
Grammars and Derivations
An Unambiguous Grammar for “if then else”