
pl9ch3 Backup

Chapter 3 discusses the syntax and semantics of programming languages, defining syntax as the structure of expressions and semantics as their meaning. It introduces formal methods for describing syntax, including context-free grammars and Backus-Naur Form (BNF), and explains the concept of attribute grammars for linking syntax to semantics. The chapter also addresses issues of ambiguity in grammars and presents extended BNF for more expressive syntax definitions.

Uploaded by

sikadeg953

Chapter 3

Describing Syntax
and Semantics

ISBN 0-321-49362-1
Chapter 3 Topics

• Introduction
• The General Problem of Describing Syntax
• Formal Methods of Describing Syntax
• Attribute Grammars
• Describing the Meanings of Programs:
Dynamic Semantics

Copyright © 2009 Addison-Wesley. All rights reserved. 1-2


Introduction

• Syntax: the form or structure of the
expressions, statements, and program units
• Semantics: the meaning of the
expressions, statements, and program
units
• Syntax and semantics provide a language’s
definition
– Users of a language definition
• Other language designers
• Implementers
• Programmers (the users of the language)

The General Problem of Describing
Syntax: Terminology

• A sentence is a string of characters over
some alphabet

• A language is a set of sentences

• A lexeme is the lowest level syntactic unit
of a language (e.g., *, sum, begin)

• A token is a category of lexemes (e.g.,
identifier)
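
The lexeme/token distinction can be sketched in a few lines of Python. This is only an illustration: the token category names other than identifier, and the choice of begin and end as keywords, are assumptions made for the example and do not come from the slides.

```python
import re

# Hypothetical token categories; the slides name only "identifier".
TOKEN_SPEC = [
    ("keyword", r"\b(?:begin|end)\b"),
    ("identifier", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("int_literal", r"\d+"),
    ("operator", r"[*+\-/=]"),
    ("skip", r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Return (token, lexeme) pairs for text, dropping whitespace."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = PATTERN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "skip":
            # lastgroup is the token category; group() is the lexeme itself.
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs
```

Calling tokenize("begin sum * 2 end") pairs each lexeme with its category, so sum is reported as an identifier and * as an operator.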



Delimiting lexemes

Scanning Fortran

← This is an identifier

← This is a loop

Formal Definition of Languages

• Recognizers
– A recognition device reads input strings over the alphabet
of the language and decides whether the input strings
belong to the language
– Example: syntax analysis part of a compiler
- Detailed discussion of syntax analysis appears in
Chapter 4

• Generators
– A device that generates sentences of a language
– One can determine whether a particular sentence is
syntactically correct by comparing it to the structure of
the generator



Regular Expressions

Extending Regular Expressions

What is a grammar?

BNF and Context-Free Grammars

• Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators, meant to describe the
syntax of natural languages
– Define a class of languages called context-free
languages

• Backus-Naur Form (1959)
– Invented by John Backus to describe Algol 58
– BNF is equivalent to context-free grammars

BNF Fundamentals
• In BNF, abstractions are used to represent classes of syntactic
structures; they act like syntactic variables (also called nonterminal
symbols, or just nonterminals)

• Terminals are lexemes or tokens

• A rule has a left-hand side (LHS), which is a nonterminal, and a
right-hand side (RHS), which is a string of terminals and/or
nonterminals

• Nonterminals are often enclosed in angle brackets

– Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>

• Grammar: a finite non-empty set of rules

• A start symbol is a special element of the nonterminals of a
grammar

BNF Rules

• An abstraction (or nonterminal symbol)
can have more than one RHS
<stmt> → <single_stmt>
       | begin <stmt_list> end



Describing Lists

• Syntactic lists are described using
recursion
<ident_list> → ident
             | ident, <ident_list>

• A derivation is a repeated application of
rules, starting with the start symbol and
ending with a sentence (all terminal
symbols)
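
The recursive <ident_list> rule can be read both as a recognizer and as a generator. The sketch below assumes the input has already been reduced to a list of token strings ("ident" and ","); the function names are hypothetical and chosen for this example only.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> ident | ident , <ident_list> over a token list."""
    if not tokens or tokens[0] != "ident":
        return False
    if len(tokens) == 1:
        # First alternative: a single ident.
        return True
    # Second alternative: ident , <ident_list> (the recursive case).
    return tokens[1] == "," and is_ident_list(tokens[2:])


def generate_ident_list(n):
    """Generate the sentence with n identifiers by applying the recursive
    alternative n - 1 times and the base alternative once."""
    if n < 1:
        raise ValueError("an <ident_list> holds at least one ident")
    if n == 1:
        return ["ident"]
    return ["ident", ","] + generate_ident_list(n - 1)
```

Every sentence produced by generate_ident_list is accepted by is_ident_list, while a trailing comma such as ["ident", ","] is rejected because the recursive call receives an empty list.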



Example Grammar - 1

Example Grammar - 2

<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const



An Example Derivation

<program> => <stmts> => <stmt>
          => <var> = <expr>
          => a = <expr>
          => a = <term> + <term>
          => a = <var> + <term>
          => a = b + <term>
          => a = b + const
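
The derivation above can be replayed mechanically. The following sketch encodes the grammar from the "Example Grammar - 2" slide as a Python dictionary and expands the leftmost nonterminal at each step; the GRAMMAR encoding, the function name, and the convention of selecting alternatives by index are assumptions made for illustration.

```python
# Each nonterminal maps to its list of alternatives (right-hand sides).
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Return the sentential forms obtained by applying one alternative
    (given by index) to the leftmost nonterminal at each step."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        index = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        rhs = GRAMMAR[form[index]][choice]
        form = form[:index] + rhs + form[index + 1:]
        steps.append(" ".join(form))
    return steps
```

With the choices [0, 0, 0, 0, 0, 0, 1, 1] the function reproduces the eight steps shown on the slide and ends with the sentence a = b + const. Supplying more choices than there are nonterminals left raises StopIteration, which a fuller version would handle explicitly.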



Derivations

• Every string of symbols in a derivation is a
sentential form
• A sentence is a sentential form that has
only terminal symbols
• A leftmost derivation is one in which the
leftmost nonterminal in each sentential
form is the one that is expanded
• A derivation may be neither leftmost nor
rightmost



Parse Tree

• A hierarchical representation of a derivation

               <program>
                   |
                <stmts>
                   |
                <stmt>
              /    |    \
         <var>     =     <expr>
           |            /   |   \
           a      <term>    +    <term>
                    |              |
                  <var>          const
                    |
                    b

Another Derivation Example

Parse Tree

Ambiguity in Grammars

• A grammar is ambiguous if and only if it
generates a sentential form that has two
or more distinct parse trees



An Ambiguous Expression Grammar

<expr> → <expr> <op> <expr> | const
<op> → / | -

              <expr>
           /     |     \
      <expr>    <op>    <expr>
     /   |   \    |        |
<expr> <op> <expr> /     const
   |     |     |
const    -   const

       <expr>
    /     |     \
<expr>   <op>    <expr>
   |      |     /   |   \
const     -  <expr> <op> <expr>
               |     |     |
             const   /   const

Both trees derive the sentence const - const / const
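
The ambiguity can be checked by counting parse trees. This is a minimal sketch, assuming the sentence is given as a list of token strings; the function name and the counting strategy (trying every operator as the root of <expr> <op> <expr>) are choices made for this example, not something the slides prescribe.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees that the grammar
    <expr> -> <expr> <op> <expr> | const,  <op> -> / | -
    assigns to the given token sequence."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of <expr> parse trees spanning tokens[lo:hi].
        if hi - lo == 1:
            return 1 if tokens[lo] == "const" else 0
        total = 0
        # Try every operator position as the root <op>.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("/", "-"):
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))
```

For const - const / const the count is 2, matching the two trees above; any sentence with a count greater than 1 demonstrates that the grammar is ambiguous.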



An Unambiguous Expression Grammar

• If the grammar is written so that the parse tree
indicates the precedence levels of the operators, the
ambiguity over precedence is removed
<expr> → <expr> - <term> | <term>
<term> → <term> / const | const

            <expr>
         /     |     \
    <expr>     -      <term>
       |            /    |    \
    <term>     <term>    /    const
       |          |
     const      const
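
A small parser makes the effect of the two-level grammar visible. The sketch below assumes a list of token strings as input and represents trees as nested tuples of the form (operator, left, right); both conventions are assumptions for illustration. Because a recursive-descent parser cannot call a left-recursive rule directly, each left-recursive rule is implemented as a loop that grows the tree to the left.

```python
def parse_term(tokens):
    """Parse <term> -> <term> / const | const; return (tree, remaining tokens)."""
    if not tokens or tokens[0] != "const":
        raise SyntaxError("expected const")
    tree, rest = "const", tokens[1:]
    while rest and rest[0] == "/":
        if len(rest) < 2 or rest[1] != "const":
            raise SyntaxError("expected const after /")
        tree = ("/", tree, "const")
        rest = rest[2:]
    return tree, rest


def parse_expr(tokens):
    """Parse <expr> -> <expr> - <term> | <term>; return (tree, remaining tokens)."""
    tree, rest = parse_term(tokens)
    while rest and rest[0] == "-":
        right, rest = parse_term(rest[1:])
        tree = ("-", tree, right)
    return tree, rest


def parse(tokens):
    """Parse a whole sentence and reject trailing tokens."""
    tree, rest = parse_expr(list(tokens))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree
```

For const - const / const the only result is ("-", "const", ("/", "const", "const")): the division sits lower in the tree, so it is grouped first.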
Example: Is it IF-THEN or IF-THEN-ELSE?

Example: Is it IF-THEN or IF-THEN-ELSE?
(Ambiguity removed)

Associativity of Operators

• Operator associativity can also be indicated by a
grammar

<expr> → <expr> + <expr> | const (ambiguous)
<expr> → <expr> + const | const (unambiguous)

              <expr>
            /   |   \
       <expr>   +   const
      /   |   \
 <expr>   +   const
    |
  const

Extended BNF

• Optional parts are placed in brackets [ ]
<proc_call> → ident [(<expr_list>)]
• Alternative parts of RHSs are placed
inside parentheses and separated via
vertical bars
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside
braces { }
<ident> → letter {letter|digit}



BNF and EBNF

• BNF
<expr> → <expr> + <term>
| <expr> - <term>
| <term>
<term> → <term> * <factor>
| <term> / <factor>
| <factor>
• EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
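
The EBNF form maps almost directly onto code: each pair of braces becomes a while loop. The sketch below evaluates an expression while parsing it. The slides leave <factor> undefined, so this example assumes a factor is a single number, and it assumes the input is a list mixing numbers and operator strings; the function name is hypothetical.

```python
def evaluate(tokens):
    """Evaluate a token list using the EBNF rules for <expr> and <term>."""
    pos = 0

    def factor():
        # Assumption: <factor> is a single number (not defined in the slides).
        nonlocal pos
        if pos >= len(tokens) or not isinstance(tokens[pos], (int, float)):
            raise SyntaxError(f"expected a number at position {pos}")
        value = tokens[pos]
        pos += 1
        return value

    def term():
        # <term> -> <factor> {(* | /) <factor>}
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expr():
        # <expr> -> <term> {(+ | -) <term>}
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return result
```

Because each loop folds its operands from left to right, the operators come out left associative, and because <term> is parsed inside <expr>, multiplication and division bind more tightly than addition and subtraction.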



Attribute Grammars



Attribute Grammars: Definition

• Def: An attribute grammar is a context-free
grammar G = (S, N, T, P) with the following
additions:
– For each grammar symbol x there is a set A(x) of
attribute values
– Each rule has a set of functions that define
certain attributes of the nonterminals in the rule
– Each rule has a (possibly empty) set of
predicates to check for attribute consistency



Attribute Grammars: An Example

• Syntax
<assign> → <var> = <expr>
<expr> → <var> + <var> | <var>
<var> → A | B | C
• actual_type: synthesized for <var>
and <expr>
• expected_type: inherited for <expr>



Attribute Grammars (continued)

• How are attribute values computed?
– If all attributes were inherited, the tree could be
decorated in top-down order.
– If all attributes were synthesized, the tree could
be decorated in bottom-up order.
– In many cases, both kinds of attributes are used,
and it is some combination of top-down and
bottom-up that must be used.



Attribute Grammars (continued)

<expr>.expected_type ← inherited from parent

<var>[1].actual_type ← lookup (A)
<var>[2].actual_type ← lookup (B)
<var>[1].actual_type =? <var>[2].actual_type

<expr>.actual_type ← <var>[1].actual_type
<expr>.actual_type =? <expr>.expected_type
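
The attribute computations above can be traced in code. This is a rough sketch under stated assumptions: the symbol table contents are invented for the example (the slides give no declared types), and check_assign is a hypothetical helper that handles only the two <expr> alternatives of this small grammar.

```python
# Hypothetical declared types; the slides do not specify them.
SYMBOL_TABLE = {"A": "int", "B": "int", "C": "real"}


def lookup(name):
    """Return the declared type of a variable."""
    return SYMBOL_TABLE[name]


def check_assign(target, left, right=None):
    """Decorate <assign> -> <var> = <expr> and test both predicates.

    Returns a list of predicate failures; an empty list means the
    attributes are consistent."""
    errors = []
    expected_type = lookup(target)      # inherited by <expr> from its parent
    left_type = lookup(left)            # <var>[1].actual_type
    if right is not None:
        right_type = lookup(right)      # <var>[2].actual_type
        if left_type != right_type:     # predicate on the two operands
            errors.append(f"operand types differ: {left_type} vs {right_type}")
    actual_type = left_type             # <expr>.actual_type (synthesized)
    if actual_type != expected_type:    # predicate on <expr>
        errors.append(f"expected {expected_type}, got {actual_type}")
    return errors
```

With these assumed types, check_assign("A", "A", "B") returns no errors, while check_assign("A", "A", "C") fails the operand predicate and check_assign("C", "A", "B") fails the expected-type predicate. Note the mixed evaluation order: expected_type flows down from the parent, actual_type flows up from the leaves.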



Example: Parse Tree for A=A+B

Example: Derivation of Attributes

Summary

• BNF and context-free grammars are
equivalent meta-languages
– Well-suited for describing the syntax of
programming languages
• An attribute grammar is a descriptive
formalism that can describe both the syntax
and the semantics of a language
