
PrLanguage 3

This document discusses the design and implementation of programming languages, focusing on syntax and semantics. It covers formal methods for describing programming languages, including context-free grammars and attribute grammars, as well as challenges in language description. The importance of clear syntax and semantics for successful programming language implementation is emphasized.
Design and Implementation of Programming Languages
Imam Khomeini International University

Shokooh Kermanshahani

Chapter Three

Ref. Book: Concepts of Programming Languages by Sebesta

Discussion of this chapter

• Defining syntax and semantics
• The most common method of describing syntax: context-free grammars (also known as Backus-Naur Form)
  • Derivations
  • Parse trees
  • Ambiguity
  • Descriptions of operator precedence and associativity
  • Extended Backus-Naur Form
• Attribute grammars
  • These can be used to describe both the syntax and the static semantics of programming languages
• A brief discussion of three formal methods of describing semantics

Description of a programming language is crucial
• A concise yet understandable description of a programming language is difficult to write but essential to the language's success; the two goals trade off against each other.
Some challenges:
• Diversity of the people who must understand the description
  • Programming language implementors must be able to determine how the expressions, statements, and program units of a language are formed, and also their intended effect when executed. The difficulty of the implementors' job is, in part, determined by the completeness and precision of the language description.
  • Language users must be able to determine how to encode software solutions by referring to a language reference manual.

Syntax and semantics description

• The syntax of a programming language is the form of its expressions, statements, and program units.
• Its semantics is the meaning of those expressions, statements, and program units.
• Ex. The syntax of a Java while statement is
    while (boolean_expr) statement
• The semantics of this statement form is that when the current value of the Boolean expression is true, the embedded statement is executed. Then control implicitly returns to the Boolean expression to repeat the process. If the Boolean expression is false, control transfers to the statement following the while construct.
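The semantics of the while statement described here can be sketched operationally. This is only an illustration written in Python; the helper name `run_while` and the dictionary-based program state are assumptions made for the example.

```python
def run_while(condition, body, state):
    """Mimic `while (condition) body` step by step and count the iterations."""
    iterations = 0
    while True:
        if not condition(state):
            # Boolean expression is false: control leaves the construct.
            return iterations
        body(state)          # Boolean expression is true: run the embedded statement
        iterations += 1      # control then returns to the Boolean expression


def add_next(state):
    """Embedded statement of the example loop."""
    state["total"] += state["i"]
    state["i"] += 1


state = {"i": 0, "total": 0}
iterations = run_while(lambda s: s["i"] < 4, add_next, state)
```

The loop body runs while `i < 4` holds, and the first false test transfers control past the construct.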

Syntax and semantics description (2)
• Although they are often separated for discussion purposes, syntax and semantics are closely related.
• In a well-designed programming language, semantics should follow directly from syntax; that is, the appearance of a statement should strongly suggest what the statement is meant to accomplish.

• Describing syntax is easier than describing semantics
  • A concise and universally accepted notation is available for syntax description, but none has yet been developed for semantics
  • Semantics is naturally more complicated

Syntax

• A language, whether natural (such as English) or artificial (such as Java), is a set of strings of characters from some alphabet. The strings of a language are called sentences or statements.
• The syntax rules of a language specify which strings of characters from the language's alphabet are in the language.

• Formal descriptions of the syntax of programming languages, for simplicity's sake, often do not include descriptions of the lowest-level syntactic units. These small units are called lexemes.

Syntax: Lexeme

• The lexemes of a programming language include its numeric literals, operators, and special words, among others.
• One can think of programs as strings of lexemes rather than of characters.
• The description of lexemes can be given by a lexical specification, which is usually separate from the syntactic description of the language.
• Lexemes are partitioned into groups; each group is represented by a token.

Syntax: Token

• A token of a language is a category of its lexemes
  • For example, the names of variables, methods, classes, and so forth in a programming language form a group called identifiers
  • An identifier is a token that can have lexemes, or instances, such as sum and total.
• In some cases, a token has only a single possible lexeme.
  • For example, the token for the arithmetic operator symbol + has just one possible lexeme

Syntax: Lexeme and token example
• index = 2 * count + 17;
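The split of this statement into lexemes and tokens can be sketched with a small scanner. The token names (`identifier`, `int_literal`, `equal_sign`, `mult_op`, `plus_op`, `semicolon`) and the regular expressions are assumptions chosen for illustration, not a lexical specification of any real language.

```python
import re

# Hypothetical lexical specification: (token name, pattern for its lexemes).
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return the (lexeme, token) pairs of `text`, ignoring white space."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


pairs = tokenize("index = 2 * count + 17;")
for lexeme, token in pairs:
    print(f"{lexeme:8} {token}")
```

Here `index` and `count` are two lexemes of the same token (identifier), while the token for `+` has a single possible lexeme.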

Two ways of defining a language: recognition and generation
• Language recognizer
  For a language L over an alphabet Σ:
  • We need to construct a mechanism R, called a recognition device,
    • capable of reading strings of characters from the alphabet Σ
    • that indicates whether a given input string is or is not in L
    • R would either accept or reject the given string
  • Then R is a description of L
  • Because most useful languages are, for all practical purposes, infinite, this might seem like a lengthy and ineffective process.

• However, the syntax analyzer (parser) of a compiler is a recognizer.
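A recognition device R can be sketched for a toy language. The language chosen here, L = { aⁿbⁿ : n ≥ 1 } over Σ = {a, b}, and the function name `recognize` are assumptions made only for illustration.

```python
def recognize(string):
    """Accept (True) if `string` is in L = { a^n b^n : n >= 1 }, else reject (False)."""
    n = len(string) // 2
    return n >= 1 and string == "a" * n + "b" * n


accepted = [s for s in ["ab", "aabb", "aab", "ba", ""] if recognize(s)]
```

Although L is infinite, R describes it completely: any candidate string can be checked in finite time.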
Two ways of defining a language: recognition and generation (2)

• Language generator
  • A device that can be used to generate the sentences of a language.
  • We can think of the generator as having a button that produces a sentence of the language every time it is pushed.
  • Because the particular sentence produced is unpredictable, a generator seems to be a device of limited usefulness as a language descriptor.

• There is a close connection between formal generation and recognition devices for the same language.
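The "button" view of a generator can be sketched with a Python generator function. The toy language L = { aⁿbⁿ : n ≥ 1 } and the name `generate` are assumptions for illustration; unlike the abstract device, this sketch produces sentences in a fixed order, shortest first.

```python
from itertools import count, islice


def generate():
    """Yield the sentences of L = { a^n b^n : n >= 1 }, shortest first."""
    for n in count(1):
        yield "a" * n + "b" * n


button = generate()
first_sentence = next(button)          # one push of the button
next_two = list(islice(button, 2))     # two more pushes
```

Each call to `next` plays the role of one push of the button.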

Formal Methods of Describing Syntax

• Context-free grammar
• Backus-Naur Form (BNF)
  • Derivation
  • Parse tree
  • Ambiguity
  • Operator precedence
  • Associativity
  • An unambiguous grammar for if-else (page 227 pdf)
• Extended BNF

Design / definition of a language with BNF

[Figure: a BNF rule, or production, with its left-hand side (LHS) and right-hand side (RHS) labeled]

• A metalanguage is a language that is used to describe another language.
• BNF is a metalanguage for programming languages.
• BNF uses abstractions for syntactic structures
Formal Methods of Describing Syntax

6 rules

Formal Methods of Describing Syntax

Generated by the leftmost derivation:
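A leftmost derivation can be sketched as repeated replacement of the leftmost nonterminal. The grammar below is an assumed, textbook-style assignment grammar, not necessarily the one used in the lecture, and the function name `leftmost_derivation` is hypothetical.

```python
# Assumed grammar: each nonterminal maps to its list of right-hand sides.
GRAMMAR = {
    "<assign>": [["<id>", "=", "<expr>"]],
    "<id>": [["A"], ["B"], ["C"]],
    "<expr>": [
        ["<id>", "+", "<expr>"],
        ["<id>", "*", "<expr>"],
        ["(", "<expr>", ")"],
        ["<id>"],
    ],
}


def leftmost_derivation(start, choices):
    """Apply the chosen rule (by index) to the leftmost nonterminal at each step."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Locate the leftmost nonterminal in the current sentential form.
        index = next(i for i, symbol in enumerate(form) if symbol in GRAMMAR)
        form[index:index + 1] = GRAMMAR[form[index]][choice]
        steps.append(" ".join(form))
    return steps


steps = leftmost_derivation("<assign>", [0, 0, 1, 1, 2, 0, 0, 3, 2])
for step in steps:
    print("=>", step)
```

Every line printed is a sentential form; the last one, containing only terminals, is the generated sentence.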

BNF: Parse tree

BNF: Ambiguity

BNF: Ambiguity
Leftmost derivation
Rightmost derivation

Solution:
Make a decision:
Right-recursive rules
or
left-recursive rules

BNF: Ambiguity

Solution:
Make a decision, using more nonterminals:
Right-recursive rules
or
left-recursive rules
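Ambiguity can be made concrete by counting the distinct parse trees a grammar allows for one sentence. The sketch below assumes the ambiguous grammar `<expr> → <expr> + <expr> | <expr> * <expr> | id`; both the grammar and the function name `count_parse_trees` are illustrative assumptions.

```python
from functools import lru_cache

OPERATORS = {"+", "*"}


def count_parse_trees(tokens):
    """Count parse trees under <expr> -> <expr> + <expr> | <expr> * <expr> | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of trees that derive tokens[lo:hi] from <expr>.
        if hi - lo == 1:
            return 0 if tokens[lo] in OPERATORS else 1
        total = 0
        for k in range(lo + 1, hi - 1):
            if tokens[k] in OPERATORS:
                # Take the operator at position k as the root of the tree.
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))


trees = count_parse_trees("A + B * C".split())
```

A count above 1 means the grammar is ambiguous for that sentence; here the two trees correspond to grouping as (A + B) * C or as A + (B * C).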

BNF: No ambiguity

BNF: Operator precedence

BNF: Associativity of operators

BNF: Associativity of operators
Left recursion: left associativity
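How rule ordering gives precedence and how the direction of recursion gives associativity can be sketched with a small parser. The grammar in the comments is an assumed textbook-style one (`+` lowest, then `*`, then a right-associative `**`); the left-recursive rules are implemented as loops, a common hand-coding choice rather than part of BNF itself.

```python
def parse(text):
    """Parse space-separated tokens into a nested (operator, left, right) tuple."""
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():       # <expr> -> <expr> + <term> | <term>         (left recursive)
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())        # tree grows to the left
        return node

    def term():       # <term> -> <term> * <factor> | <factor>     (left recursive)
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():     # <factor> -> <primary> ** <factor> | <primary>  (right recursive)
        node = primary()
        if peek() == "**":
            advance()
            node = ("**", node, factor())     # tree grows to the right
        return node

    def primary():    # <primary> -> id
        if peek() is None or not peek().isalnum():
            raise SyntaxError("identifier expected")
        return advance()

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


precedence_tree = parse("A + B * C")
left_assoc_tree = parse("A + B + C")
right_assoc_tree = parse("A ** B ** C")
```

The operator generated by the deeper rule (`*`) ends up lower in the tree, so it is applied first; the left-recursive rule groups `A + B` first, while the right-recursive rule groups `B ** C` first.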
Design / definition of a language with BNF
• BNF is a metalanguage for programming languages.
• BNF uses abstractions for syntactic structures

How to define a language?
• Unambiguous rules:
  • More nonterminals / more rules
  • Each rule: right recursion or left recursion
• Associativity
  • Decide for each rule whether it is right recursive or left recursive
  • Right recursive: right associativity
  • Left recursive: left associativity
• Operator precedence: ordering of the rules
  • The deeper an operator appears in the parse tree, the higher its precedence
• Others (continued)
BNF: Continued
A grammar for a small language

BNF: Continued
A derivation in the grammar for a small language

BNF: Continued
A grammar for identifier declarations in C

<declaration> → <id_type> <spaces> <ident_list> ;

<ident_list> → identifier
             | identifier , <ident_list>

<spaces> → space
         | space <spaces>

BNF: Continued
if statement

Ambiguity

BNF: Continued
An unambiguous grammar for if-else
More nonterminals

Using this grammar, there is just one possible parse tree for the following sentential form:

Extended BNF: EBNF

Extended BNF: EBNF

BNF and EBNF
Last slide

Attribute Grammars

• An attribute grammar is a device used to describe more of the structure of a programming language than can be described with a context-free grammar.
  • It is an extension to a context-free grammar
  • It allows certain language rules to be conveniently described, such as type compatibility

• Context-free grammar + static semantics

Static Semantics

• There are some characteristics of programming languages that are difficult to describe with BNF
  • Consider type compatibility rules
  • In Java, for example, a floating-point value cannot be assigned to an integer type variable, although the opposite is legal
  • Although this restriction can be specified in BNF, it requires additional nonterminal symbols and rules. If all of the typing rules of Java were specified in BNF, the grammar would become too large to be useful, because the size of the grammar determines the size of the syntax analyzer.

Static Semantics (2)

• There are some characteristics of programming languages that are impossible to describe with BNF:
  • Consider the common rule that all variables must be declared before they are referenced. It has been proven that this rule cannot be specified in BNF.
• These problems exemplify the categories of language rules called static semantics rules.
  • The static semantics of a language is only indirectly related to the meaning of programs during execution
  • Rather, it has to do with the legal forms of programs (syntax rather than semantics).
• Static semantics is so named because the analysis required to check these specifications can be done at compile time.
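The declared-before-referenced rule is easy to check with a symbol table even though BNF cannot express it. In this minimal sketch, a program is reduced to a list of ("declare", name) and ("use", name) events; that representation and the function name `undeclared_uses` are assumptions made for illustration.

```python
def undeclared_uses(events):
    """Return the names that are referenced before they are declared."""
    declared = set()      # plays the role of a symbol table
    errors = []
    for kind, name in events:
        if kind == "declare":
            declared.add(name)
        elif name not in declared:
            errors.append(name)
    return errors


program = [("declare", "x"), ("use", "x"), ("use", "y"), ("declare", "y")]
errors = undeclared_uses(program)
```

The check needs only the program text, not its execution, which is why it can be done at compile time.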

Static Semantics (3)

• Because of the problems of describing static semantics with BNF, a variety of more powerful mechanisms has been devised for that task.

• One such mechanism, attribute grammars, was designed by Knuth (1968) to describe both the syntax and the static semantics of programs.

Attribute Grammars

• Attribute grammars are a formal approach both to describing and checking the correctness of the static semantics rules of a program.

• Attribute grammars are context-free grammars to which have been added:
  • Attributes
  • Attribute computation functions (semantic functions)
  • Predicate functions

Attribute Grammars (2)

• Attributes, which are associated with grammar symbols (the terminal and nonterminal symbols), are similar to variables in the sense that they can have values assigned to them.
• Attribute computation functions, sometimes called semantic functions, are associated with grammar rules.
  • They are used to specify how attribute values are computed
• Predicate functions, which state the static semantic rules of the language, are associated with grammar rules

Attribute Grammars (3)

• Associated with each grammar symbol X is a set of attributes A(X):
  • Synthesized attributes S(X): used to pass semantic information up a parse tree
  • Inherited attributes I(X): used to pass semantic information down and across a tree
• Associated with each grammar rule is:
  • A set of semantic functions
  • A set of predicate functions (possibly empty)
  over the attributes of the symbols in the grammar rule

Attribute Grammars (4): Ex.

Attribute Grammars (5)
Intrinsic attributes
• Intrinsic attributes are synthesized attributes of leaf nodes whose values are determined outside the parse tree.

• For example, the type of an instance of a variable in a program could come from the symbol table

• Initially, on an unattributed parse tree, the only attributes with values are the intrinsic attributes of the leaf nodes.

Attribute Grammars (6)
Intrinsic attributes
• Given the intrinsic attribute values on a parse tree, the semantic functions can be used to compute the remaining attribute values

Attribute Grammars (7): Ex.
Predicate functions, which state the static semantic rules of the language, are associated with grammar rules.
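How intrinsic, inherited, and synthesized attributes combine with a predicate function can be sketched for an assignment of the form `target = left + right`. The attribute names `expected_type` and `actual_type`, the typing rule (int + int gives int, anything else gives real), and the symbol table contents are all assumptions chosen for illustration.

```python
# Hypothetical symbol table: the source of the intrinsic attributes.
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}


def check_assign(target, left, right=None):
    """Compute the attributes for `target = left [+ right]` and test the predicate."""
    # Intrinsic attributes of the leaf nodes, determined outside the parse tree.
    target_type = SYMBOL_TABLE[target]
    left_type = SYMBOL_TABLE[left]

    # Inherited attribute: passed across the tree from the target to <expr>.
    expected_type = target_type

    # Synthesized attribute: computed from the children and passed up the tree.
    if right is None:
        actual_type = left_type
    elif left_type == "int" and SYMBOL_TABLE[right] == "int":
        actual_type = "int"
    else:
        actual_type = "real"

    # Predicate function: the static semantic rule attached to the grammar rule.
    return {
        "expected_type": expected_type,
        "actual_type": actual_type,
        "predicate": actual_type == expected_type,
    }


ok = check_assign("A", "A", "B")     # real = real + int
bad = check_assign("B", "A", "C")    # int = real + int
```

A false predicate signals a violation of the static semantics, here a type mismatch between the two sides of the assignment.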

Dynamic Semantics:
Describing the Meanings of Programs

• Dynamic semantics
  The meaning of the expressions, statements, and program units of a programming language.

• Recall
  Because of the power and naturalness of the available notation, describing syntax is a relatively simple matter.
• BUT
  No universally accepted notation or approach has been devised for dynamic semantics.

Dynamic Semantics:
Describing the Meanings of Programs

• Solution?
  Several methods have been developed:
  • Operational semantics
  • Denotational semantics
  • Axiomatic semantics
  • Others

PAUSE
We will return to this topic again

Design and design issues of a programming language

Chapter 5:
Fundamental semantic issues of variables
Names, Bindings, and Scopes
Two primary components of the von Neumann computer architecture:
• Memory
  Stores both instructions and data
• Processor
  Provides operations for modifying the contents of the memory
• The abstractions in a language for the memory cells of the machine are variables

Abstraction for memory cells: Variables

A variable has some attributes:

• Name
• Address
• Type
• Value

Name and its design issues

• A name is a string of characters used to identify some entity in a program

• The following are the primary design issues for names:
  • Are names case sensitive?
  • Are the special words of the language reserved words or keywords?
Special words
• Special words in programming languages are used to make programs more readable by naming actions to be performed

Name
Special words

Special words are used:
• To make programs more readable by naming actions to be performed
• To separate the syntactic parts of statements and programs
Reserved words
• A reserved word is a special word of a programming language that cannot be used as a name.
• One potential problem with reserved words: if the language includes a large number of reserved words, the user may have difficulty making up names that are not reserved. The best example of this is COBOL, which has 300 reserved words.
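The difference between a reserved word and a merely predefined name can be observed directly. Python is used here only as a convenient illustration; the helper `is_legal_name` is hypothetical, while `keyword.iskeyword` and `compile` are standard Python.

```python
import keyword


def is_legal_name(name):
    """Return True if `name` can be used as a variable name in Python."""
    try:
        compile(f"{name} = 1", "<example>", "exec")
        return True
    except SyntaxError:
        return False


while_is_reserved = keyword.iskeyword("while")   # reserved word
len_is_reserved = keyword.iskeyword("len")       # predefined name, not reserved

can_name_while = is_legal_name("while")
can_name_len = is_legal_name("len")
```

A reserved word such as `while` is rejected as a name by the compiler, whereas a predefined name such as `len` can still be rebound by the programmer.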

Name
Special words

Special words
• Reserved word
  • In most languages, names that are defined in other program units, such as Java packages and C and C++ libraries, can be made visible to a program. These names are predefined, but visible only if explicitly imported. Once imported, they cannot be redefined.
• Keyword
  • In some languages the special words are keywords rather than reserved words, which means they can be redefined
  • Ex. Fortran
  • Which meaning is used?


Variable
Address

• The address of a variable is the machine memory address with which it is associated.
• This is not as simple as it may appear
  • It is possible for the same variable to be associated with different addresses at different times during the execution of the program.
  • Ex. A local variable of a subprogram: each call creates a different instantiation of the same variable
• The address of a variable is sometimes called its l-value


Variable
Type

• The type of a variable determines the range of values the variable can store and the set of operations that are defined for values of the type

• Ex. Can a value of type "char" appear in a sum expression?


Variable
Value

• The value of a variable is the contents of the memory cell or cells associated with the variable.
• It is convenient to think of computer memory in terms of abstract cells, rather than physical cells.
  • A single byte is too small for most program variables
  • An abstract memory cell has the size required by the variable with which it is associated.

• Henceforth, the term memory cell will mean abstract memory cell.
• The value of a variable is sometimes called its r-value

Abstraction for memory cells: Variables

A variable has some attributes:

• Name
• Address
• Type
• Value

• Binding

Binding

• Definition: A binding is an association between an attribute and an entity
  • Such as between a variable and its attributes
• The time at which a binding takes place is called binding time.
• Binding and binding times are prominent concepts in the semantics of programming languages.

• Binding times:
  • Language design time: e.g., binding * to multiplication
  • Language implementation time: e.g., binding a size or range of values to a variable type
  • Compile time: e.g., binding a type to a variable in Java
  • Load time: e.g., binding a variable to a storage cell
  • Link time: e.g., binding a call to a library subprogram to the subprogram code
  • Run time: e.g., some value bindings and some storage bindings

Binding
Example

count = count + 5;

• Some of the bindings and their binding times:
  • The type of count is bound at compile time.
  • The set of possible values of count is bound at compiler design time.
  • The meaning of the operator symbol + is bound at compile time, when the types of its operands have been determined.
  • The internal representation of the literal 5 is bound at compiler design time.
  • The value of count is bound at execution time with this statement.

• A complete understanding of the binding times for the attributes of program entities is a prerequisite for understanding the semantics of a programming language
Binding of Attributes to Variables
• Static binding
  A binding is static if it first occurs before run time begins and remains unchanged throughout program execution.

• Dynamic binding
  A binding is dynamic if it first occurs during run time or can change in the course of program execution.

Variables
Type binding
• Static type binding
  • Explicit declaration
  • Implicit declaration
    • Implicit variable type binding is done by the language processor, either a compiler or an interpreter.
    • The type may be determined by the syntactic form of the variable's name
    • Implicit declarations can be detrimental to reliability
  • Type inference
    • For example, in C# a var declaration of a variable must include an initial value, whose type is taken as the type of the variable.

Variables
Type binding
• Dynamic type binding
  • The type is specified neither by an explicit declaration nor by an implicit one
  • The variable is bound to a type when it is assigned a value in an assignment statement
  • (Such an assignment may also bind the variable to an address and a memory cell, because different type values may require different amounts of storage.)
  • Advantage: more programming flexibility
    • Generic programs become possible
    • For example, a program can deal with data of any numeric type

Variables
Type binding
• Dynamic type binding
  • Before the mid-1990s, the most commonly used programming languages used static type binding, the primary exceptions being some functional languages such as Lisp
  • Since then there has been a significant shift to languages that use dynamic type binding. In Python, Ruby, JavaScript, and PHP, type binding is dynamic
  • Ex. list = [10.2, 3.5]; ... list = 47; ...

• The option of dynamic type binding was included in C# 2010:
    dynamic any;
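The rebinding in the example above can be observed in Python, one of the dynamically typed languages listed. The variable is named `data` here, rather than `list` as on the slide, only to avoid shadowing Python's built-in `list`.

```python
data = [10.2, 3.5]
first_type = type(data).__name__     # the name is currently bound to a list

data = 47                            # the assignment rebinds the name to an int
second_type = type(data).__name__

print(first_type, second_type)
```

The type belongs to the value currently assigned, so the same name can denote a list at one point of execution and an integer at another.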

Variables
Type binding
• Dynamic type binding, disadvantage 1
  • It causes programs to be less reliable
  • Less error detection capability
  • For example, suppose that in a particular JavaScript program, i and x are currently the names of scalar numeric variables and y is currently the name of an array. Furthermore, suppose that the program needs the assignment statement
      i = x;
  • but because of a keying error, it has the assignment statement
      i = y;
  • In JavaScript (or any other language that uses dynamic type binding), no error is detected in this statement by the interpreter; the type of the variable named i is simply changed to an array
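The same keying error can be reproduced in Python, which also uses dynamic type binding. The values are made up for illustration; the point is that the faulty assignment itself is accepted and the mistake only surfaces later, when `i` is used as a number.

```python
x = 3
y = [1, 2, 3]

i = y            # keying error: `i = x` was intended, yet no error is reported here

try:
    result = i + 1                   # the mistake shows up only at this later use
except TypeError as error:
    result = f"TypeError: {error}"

print(result)
```

With static type binding, a compiler could have rejected the assignment `i = y` itself.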

Variables
Type binding
• Dynamic type binding, disadvantage 2
  • Cost

• The cost of implementing dynamic attribute binding is considerable, particularly in execution time. Type checking must be done at run time. Furthermore, every variable must have a run-time descriptor associated with it to maintain the current type.
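The source of the run-time cost can be sketched by modeling a variable together with its descriptor. The class `Variable`, its `type_tag` field, and the helper functions are hypothetical; real interpreters store this information in their own internal structures.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    type_tag: str = "unbound"    # run-time descriptor: the current type
    value: object = None


def assign(variable, value):
    """Every assignment must also update the descriptor."""
    variable.type_tag = type(value).__name__
    variable.value = value


def checked_add(left, right):
    """Type checking happens here, at run time, on every addition."""
    numeric = {"int", "float"}
    if left.type_tag not in numeric or right.type_tag not in numeric:
        raise TypeError(f"cannot add {left.type_tag} and {right.type_tag}")
    return left.value + right.value


x, y = Variable(), Variable()
assign(x, 2)
assign(y, 0.5)
total = checked_add(x, y)

assign(y, [1, 2])                # rebinding changes the descriptor of y
```

The extra field per variable and the extra test per operation are the kind of overhead that static type binding lets the compiler avoid.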

Variables
Storage binding and lifetime
• Allocation
  The memory cell to which a variable is bound must somehow be taken from a pool of available memory.
• Deallocation
  The process of placing a memory cell that has been unbound from a variable back into the pool of available memory.
• Lifetime
  The time during which the variable is bound to a specific memory location. The lifetime of a variable begins when it is bound to a specific cell and ends when it is unbound from that cell.

Variables
Storage binding and lifetime
Scalar variables fall into four categories, according to their storage bindings and lifetimes:
• Static
• Stack-dynamic
• Explicit heap-dynamic
• Implicit heap-dynamic

Variables
Scope
Visibility of a variable
