
Concept

The document outlines the goals and topics of a course on programming languages, including language design issues, modern programming features, and different programming paradigms. It discusses the importance of studying programming languages for better language usage, learning new languages, and making informed language choices. Additionally, it covers evaluation criteria for programming languages, influences on language design, and various implementation methods.

Uploaded by

asem

CS341

CONCEPT OF
PROGRAMMING
LANGUAGES
Ch1 : Preliminaries

Dr. Nada Mobark


COURSE GOALS
▪ Survey of language design issues and their implications for
translation and run-time support.
▪ Overview of modern programming languages and their
features including abstract data and control structures,
binding and scope rules, subprograms, parameter passing
mechanisms, Exception Handling, as well as support for
concurrency.
▪ Describe different paradigms of programming languages
such as: Object-oriented, functional, and Logic programming
languages.

Dr. Nada Mobark, 2025 2


TEXT BOOK
Robert W. Sebesta, Concepts
of Programming Languages
(12th edition), Pearson
Education (2019).
ISBN 9780134997186



SOFTWARE
▪ C, C++, Fortran, and Ada: gcc.gnu.org
▪ C# and F#: microsoft.com
▪ Java: java.sun.com
▪ Scheme: www.plt-scheme.org/software/drscheme
▪ Python: www.python.org
▪ Ruby: www.ruby-lang.org
▪ JavaScript is included in virtually all browsers; PHP is included in virtually all Web servers



ALL IN ONE!



TOPICS
1.1 Reasons for Studying Concepts of Programming Languages
1.2 Programming Domains
1.3 Language Evaluation Criteria
1.4 Influences on Language Design
1.5 Language Categories
1.6 Language Design Trade-Offs
1.7 Language Implementation Methods


WHY CONCEPTS OF PROGRAMMING LANGUAGES??

▪ Better use of languages that are already known


▪ New features and unknown constructs
▪ Increased ability to express ideas

▪ Increased ability to learn new languages


▪ See how concepts are incorporated into the design of a language

▪ Improved background for choosing appropriate languages


▪ Choose based on features rather than familiarity

▪ Better understanding of significance of implementation


▪ Understanding design issues leads to intelligent use



JUST WONDERING!
▪ How Many Computer Programming Languages Are There?
▪ According to Wikipedia, there are about 700 programming
languages
▪ Other sources that only list notable languages still count up to an
impressive 245 languages.
▪ Another list called HOPL, that claims to include every
programming language to ever exist, puts the total number of
programming languages at 8,945.



PROGRAMMING DOMAINS
▪ Scientific applications
• Large numbers of floating-point computations; use of arrays
• Fortran
▪ Business applications
• Produce reports, use decimal numbers and characters
• COBOL
▪ Artificial intelligence
• Symbols rather than numbers manipulated; use of linked lists
• LISP
▪ Systems programming
• Need efficiency because of continuous use
• C
▪ Web software
• Eclectic collection of languages: markup (e.g., HTML), scripting (e.g., PHP)


LANGUAGE EVALUATION CRITERIA

▪ Readability: the ease with which programs can be read and understood
▪ Writability: the ease with which a language can be used to create programs
▪ Reliability: conformance to specifications
▪ Cost: the ultimate total cost


READABILITY
▪ Ease of maintenance is determined in large part by the
readability of programs
▪ Characteristics that affect readability:
▪ Syntax design
▪ Meaningful keywords that indicate their purpose
▪ Special words and methods of forming compound statements (e.g., end if in Ada)
▪ Simplicity
▪ A manageable set of features and constructs
▪ Minimal feature multiplicity
▪ Orthogonality
▪ A relatively small set of primitive constructs can be combined in a relatively small number of ways, where every possible combination is legal → fewer exceptions


WRITABILITY
▪ Writability must be considered in the context of the target
problem domain of a language
▪ Visual Basic vs. C for a GUI application
▪ Characteristics that affect writability:
▪ Simplicity and orthogonality
▪ Few constructs, a small number of primitives, a small set of rules for combining them
▪ Expressivity
▪ A set of relatively convenient ways of specifying operations
▪ E.g., for loops simplify writing counting loops


RELIABILITY
▪ A program is said to be reliable if it performs to its
specification under all conditions
▪ Related characteristics:
▪ Type checking
▪ Testing for type errors (e.g., in function parameters)
▪ Exception handling
▪ Intercept run-time errors and take corrective measures
▪ Aliasing
▪ Two or more distinct names that refer to the same memory cell
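
To make the aliasing point concrete, here is a minimal Python sketch. Python has object references rather than raw memory cells, so a shared list object stands in for the memory cell; the names `readings`, `alias`, `copy`, and `scale_in_place` are made up for illustration.

```python
def scale_in_place(values, factor):
    """Multiply every element of `values` by `factor`, modifying the list itself."""
    for i in range(len(values)):
        values[i] *= factor


readings = [1.0, 2.0, 3.0]
alias = readings        # a second name for the same list object (an alias)
copy = list(readings)   # an independent copy, not an alias

scale_in_place(alias, 10)

print(readings)  # changed, although only `alias` was passed to the function
print(copy)      # unchanged
```

The change made through `alias` is visible through `readings`, which is why aliasing makes programs harder to reason about and is listed as a reliability concern.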



COST
▪ Training programmers to use the language
▪ A function of simplicity and orthogonality
▪ Writing programs
▪ Closeness to particular applications
▪ Reliability: poor reliability leads to high costs
▪ Critical applications → very high cost
▪ Non-critical applications → lost future business or lawsuits
▪ Maintaining programs
▪ Usually done by different programmers → readability is an issue!
▪ For large software systems with relatively long lifetimes, maintenance costs can be two to four times the development costs


EVALUATION CRITERIA: OTHER
▪ Portability
▪ The ease with which programs can be moved from one
implementation to another
▪ Generality
▪ The applicability to a wide range of applications

▪ Well-definedness
▪ The completeness and precision of the language’s official definition


INFLUENCES ON LANGUAGE DESIGN
▪ Computer Architecture
▪ Languages are developed around the prevalent computer architecture, known as the von Neumann architecture
▪ Program Design Methodologies
▪ New software development methodologies (e.g., object-oriented software development) led to new programming paradigms and, by extension, new programming languages
▪ 1950s and early 1960s: focus on machine efficiency
▪ Late 1960s: structured programming (top-down design)
▪ Late 1970s: process-oriented to data-oriented
▪ Middle 1980s: object-oriented programming


LANGUAGE CATEGORIES



LANGUAGE DESIGN TRADE-OFFS
▪ Readability vs. writability
▪ Example: APL provides many powerful operators (and a large
number of new symbols), allowing complex computations to be
written in a compact program but at the cost of poor readability

▪ Reliability vs. cost of execution
▪ Example: Java demands all references to array elements be checked for proper indexing, which leads to increased execution cost
▪ Writability (flexibility) vs. reliability
▪ Example: C++ pointers are powerful and very flexible but are unreliable
▪ The easier a program is to write, the more likely it is to be correct!


IMPLEMENTATION METHODS
▪ Compilation: translate a high-level program (source language) into machine code (machine language)
▪ Slow translation, fast execution
▪ The compilation process has several phases


IMPLEMENTATION METHODS
▪ Programs are interpreted by another program known as an interpreter
▪ No translation
▪ Interpreter is a virtual machine with a fetch-decode-execute cycle
▪ Produces a result from each program statement
▪ Easier implementation of programs (run-time errors can easily and immediately be displayed)
▪ Slower execution (10 to 100 times slower than compiled programs)
▪ Due to statement decoding
▪ Now rare for traditional high-level languages
▪ Significant comeback with some Web scripting languages (e.g., JavaScript, PHP)
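
The fetch-decode-execute cycle of a pure interpreter can be sketched in a few lines of Python. The statement language below (`name = a op b` and `print name`) is invented for this illustration and is not taken from the course; the point is only that every statement is decoded again each time it runs, and that a run-time error is reported at the statement where it occurs.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def value(token, env, line):
    """Return the value of an integer literal or a previously assigned variable."""
    if token.lstrip("-").isdigit():
        return int(token)
    if token not in env:
        raise NameError(f"statement {line}: undefined variable {token!r}")
    return env[token]


def run(program):
    """Interpret a list of statements one at a time; return the printed values."""
    env, output = {}, []
    pc = 0                                          # index of the next statement
    while pc < len(program):
        stmt = program[pc]                          # fetch
        parts = stmt.split()                        # decode
        pc += 1
        if len(parts) == 2 and parts[0] == "print":         # execute: print <name>
            output.append(value(parts[1], env, pc))
        elif len(parts) == 5 and parts[1] == "=":           # execute: <name> = <a> <op> <b>
            a, b = value(parts[2], env, pc), value(parts[4], env, pc)
            env[parts[0]] = OPS[parts[3]](a, b)
        else:
            raise SyntaxError(f"statement {pc}: cannot decode {stmt!r}")
    return output


print(run(["x = 2 + 3", "y = x * 10", "print y"]))
```

No machine code is produced at any point; the decoding work in `run` is repeated on every execution, which is the source of the slowdown mentioned above.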



COMPARISON

https://www.programiz.com/article/difference-compiler-interpreter


IMPLEMENTATION METHODS
▪ Hybrid Implementation Systems
▪ A compromise between compilers and pure
interpreters
▪ A high-level language program is
translated to an intermediate language
that allows easy interpretation
▪ Faster than pure interpretation
▪ Use: Small and medium systems when
efficiency is not the first concern

https://www.tutorialspoint.com/execute_ruby_online.php


SUMMARY
▪ The study of programming languages is valuable for a
number of reasons:
▪ Increase our capacity to use different constructs
▪ Enable us to choose languages more intelligently
▪ Makes learning new languages easier

▪ Most important criteria for evaluating programming languages include:
▪ Readability, writability, reliability, cost
▪ Major influences on language design have been machine architecture and software development methodologies


ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch2 : Evolution of programming languages

Dr. Nada Mobark


GOAL

Explore the environment and motivation behind the development of a collection of programming languages.


TOPICS
2.1 Zuse’s Plankalkül
Languages over years:
2.3 Fortran
2.4 ALGOL
2.5 Lisp
2.6 COBOL
2.7 BASIC
2.13 Prolog
2.14 Ada



LANGUAGE EVOLUTION

https://www.youtube.com/watch?v=Og847HVwRSI


MACHINE LANGUAGE
▪ In the late 1940s and early 1950s, machines were:
▪ slow, unreliable, expensive,
▪ with extremely small memories,
▪ no indexing or floating point
▪ difficult to program

▪ What was wrong with using machine code?
▪ Numeric codes → Poor readability
▪ Absolute addressing → Poor modifiability


2.1 ZUSE’S PLANKALKÜL
▪ by the German computer pioneer Konrad Zuse, the
creator of the first relay computer
▪ Designed in 1945 for the Z4, proposed in his PhD
dissertation.
▪ Plankalkül, means program calculus.
▪ Not published until 1972
▪ Never implemented
▪ Advanced data structures
▪ floating point, arrays, records

▪ Syntax: an assignment statement that assigns the expression A[4] + 1 to A[5]

  | A + 1 => A
V | 4        5      (subscripts)
S | 1.n      1.n    (data types)


GENEALOGY OF COMMON LANGUAGES



2.3 IBM 704 AND FORTRAN
▪ Fortran Environment of development
▪ Computers were small and unreliable
▪ Applications were scientific
▪ No efficient programming models
▪ Machine efficiency was the most
important concern

▪ Developed by John W. Backus at IBM for their 704 mainframes.
▪ Indexing and floating-point hardware instructions


FORTRAN
▪ Highly optimizing compilers
▪ “A programmer writes only 5 percent of all instructions, and the
program generates (compiles) the remaining 95 percent for the
computer”

▪ Helped open the door to modern computing
▪ Made code comprehensible to mathematicians and scientists.

[Photo: 25th reunion of the Fortran team, 1982]


FORTRAN VERSIONS
▪ Fortran 0 (1954)
• Not implemented
▪ Fortran I (1957)
• Index registers and floating-point hardware
• Compiled programming
• Code very fast
• Programs less than 400 lines
▪ Fortran II (1958)
• Independent compilation
• Fixed the bugs
▪ Fortran IV (1960-62)
• Explicit type declarations
• Logical selection statement
• Subprogram names could be parameters
▪ Fortran 77 (1978)
• Character string handling
• Logical loop control statement
• IF-THEN-ELSE statement
▪ Fortran 90/95
• Modules
• Dynamic arrays, pointers
• Recursion
• CASE statement
• Parameter type checking
▪ Fortran 2003
• OOP
• Procedure pointers
• Interoperability with C
▪ Fortran 2008
• Concurrent programming
▪ Fortran 2018
• Parallel processing


Example code



FORTRAN EVALUATION

▪ The first widely used/acceptable programming language

▪ Originally designed to implement a compiler, only for IBM machines.
▪ Impressive effect on use of computers and design of programming languages.
▪ Static typing/allocation → simple, efficient yet not flexible.


2.4 THE FIRST STEP TOWARD SOPHISTICATION:
ALGOL
▪ Environment of development
▪ FORTRAN had (barely) arrived for IBM 70x
▪ Many other languages were being developed, all for specific
machines
▪ No portable language; all were machine-dependent
▪ No universal language for communicating algorithms

▪ ACM and GAMM met for four days for design (May 27 to June 1, 1958). Goals of the language:
▪ Close to mathematical notation
▪ Good for describing algorithms
▪ Must be translatable to machine code


ALGOL EVOLUTION

▪ ALGOL 58
• Concept of type formalized
• Names could be any length
• Arrays could have any number of subscripts
• Subscripts were placed in brackets
• Parameters were separated by mode (in & out)
• Compound statements (begin ... end)
• Semicolon as a statement separator
• := as the assignment operator
• if had an else-if clause
• No I/O
▪ ALGOL 60
• Block structure (local scope)
• Two parameter passing methods
• Subprogram recursion
• Stack-dynamic arrays
• Still no I/O
• No string handling
▪ ALGOL 68
• Design is based on the concept of orthogonality
• User-defined data structures
• Reference types
• Dynamic arrays (called flex arrays)
• New metalanguage (key words and terms)


EXAMPLE CODE



ALGOL EVALUATION
▪ Successes
• It was the standard way to publish algorithms for over 20 years
• All subsequent imperative languages are based on it
• First machine-independent language
• First language whose syntax was formally defined (BNF)
▪ Failure
• Never widely used, especially in U.S.
• Lack of I/O and the character set made programs non-portable
• Too flexible--hard to implement
• Entrenchment of Fortran
• Formal syntax description (BNF)
• Lack of support from IBM


2.5 FUNCTIONAL PROGRAMMING: LISP
▪ AI research needed a language to
▪ Process data in lists (rather than arrays)
▪ Symbolic computation (rather than numeric)

▪ LISt Processing language
▪ Designed at MIT by McCarthy
▪ Declarative programming:
▪ What to do, not how to do it!!
▪ Only two data types: atoms and lists
▪ Syntax is based on lambda calculus


REPRESENTATION OF TWO LISP LISTS

The lists
(A B C D)
and
(A (B C) D (E (F G)))
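
The internal representation of these two lists can be sketched with linked cells, each holding a pointer to an element and a pointer to the rest of the list. The Python below is an illustration only: the class name `Cons`, the helpers `make_list` and `show`, and the use of `None` for the empty list are assumptions made for this sketch, not Lisp's actual implementation.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Cons:
    car: Any          # first element: an atom (a string here) or another list
    cdr: Any = None   # rest of the list; None plays the role of NIL


def make_list(*items):
    """Build a chain of cells from Python values; nested tuples become sublists."""
    head = None
    for item in reversed(items):
        if isinstance(item, tuple):
            item = make_list(*item)
        head = Cons(item, head)
    return head


def show(cell):
    """Render a chain of cells in Lisp's parenthesized notation."""
    parts = []
    while cell is not None:
        parts.append(show(cell.car) if isinstance(cell.car, Cons) else str(cell.car))
        cell = cell.cdr
    return "(" + " ".join(parts) + ")"


flat = make_list("A", "B", "C", "D")
nested = make_list("A", ("B", "C"), "D", ("E", ("F", "G")))
print(show(flat))
print(show(nested))
```

In the nested list, the second cell's `car` points to another chain of cells rather than to an atom, which is how sublists are represented.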



LISP EVALUATION
▪ Pioneered functional programming
▪ No need for variables or assignment
▪ Control via recursion and conditional expressions

▪ Still the dominant language for AI
▪ Common Lisp and Scheme are contemporary dialects of Lisp
▪ ML, Haskell, and F# are also functional programming languages, but use very different syntax


2.6 COMPUTERIZING BUSINESS RECORDS:
COBOL
▪ Environment of development
▪ UNIVAC was beginning to use FLOW-MATIC
▪ USAF was beginning to use AIMACO
▪ IBM was developing COMTRAN (COMmercial TRANslator)

▪ First Design Meeting (Pentagon) - May 1959
▪ Members were all from computer manufacturers and DoD branches
▪ Design Goals:
▪ Must look like simple English
▪ Must be easy to use, even if that means it will be less powerful
▪ Must broaden the base of computer users
▪ Must not be biased by current compiler problems



RECORDS IN COBOL
▪ Record is a collection of fields that is used to describe an
entity.
▪ Field is used to indicate the data stored about an element.

▪ File is a collection of related records.
▪ Simple text files cannot be used in COBOL; instead, PS (Physical Sequential) and VSAM files are used.


EXAMPLE CODE



COBOL EVALUATION
▪ Contributions
▪ First macro facility in a high-level language
▪ Hierarchical data structures (records)
▪ Nested selection statements
▪ Long names (up to 30 characters), with hyphens
▪ Separate data division

▪ First language required by DoD
▪ The poor performance of the early compilers made the language too expensive to use.
▪ Led to the electronic mechanization of accounting.
▪ Still the most widely used business applications language


2.7 THE BEGINNING OF TIMESHARING: BASIC

▪ Specially designed for “liberal arts” students
▪ Design Goals:
▪ Easy to learn and use for non-science students
▪ Must be “pleasant and friendly”
▪ Fast turnaround for homework
▪ Free and private access
▪ User time is more important than computer time
▪ Poorly structured programs
▪ Current popular dialect: Visual Basic (1990s)
▪ First widely used language with time sharing
▪ Terminals connected to a computer


EXAMPLE CODE



2.13 PROGRAMMING BASED ON LOGIC: PROLOG

▪ Developed by Colmerauer and Roussel (University of Aix-Marseille), with help from Kowalski (University of Edinburgh)
▪ Based on formal logic
▪ Non-procedural
▪ Can be summarized as being an intelligent database system that uses an inferencing process to infer the truth of given queries
▪ Comparatively inefficient
▪ Few application areas
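
The "intelligent database" idea can be imitated in Python as a set of stored facts plus a rule that infers new facts from them. This is only a rough analogy: real Prolog uses unification and backtracking, whereas the sketch below does a brute-force search, and the family data is hypothetical.

```python
# Facts, in Prolog style: parent(bill, jake). parent(jake, ann). parent(jake, tom).
PARENT = {("bill", "jake"), ("jake", "ann"), ("jake", "tom")}


def grandparent(x, z):
    """Rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    people = {person for pair in PARENT for person in pair}
    # The query is true if some intermediate Y makes both sub-goals true.
    return any((x, y) in PARENT and (y, z) in PARENT for y in people)


print(grandparent("bill", "ann"))   # query: ?- grandparent(bill, ann).
print(grandparent("jake", "ann"))   # query: ?- grandparent(jake, ann).
```

The program never states how to search for a grandparent in procedural steps beyond the rule itself; the answer to each query is inferred from the facts.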



2.14 HISTORY’S LARGEST DESIGN EFFORT: ADA

▪ Huge design effort, involving hundreds of people, much money, and about eight years
▪ Sequence of requirements (1975-1978)
▪ Named after Augusta Ada Byron, the first programmer
▪ Wrote the first published algorithm specifically tailored for implementation on a computer
▪ Contributions
• Packages - support for data abstraction
• Exception handling - elaborate
• Generic program units
• Concurrency - through the tasking model
▪ Comments
• Competitive design
• Included all that was then known about software engineering and language design
• First compilers were very difficult; the first really usable compiler came nearly five years after the language design was completed


EXAMPLE CODE



SUMMARY
▪ Development, development environment, and evaluation of a
number of important programming languages
▪ Perspective into current issues in language design



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch3(a) : Describing Syntax

Dr. Nada Mobark


GOAL
How does a computer understand the representation of computer programs?

Compile == understand !


TOPICS
3.1 Introduction
3.2 The General Problem of Describing Syntax
3.3 Formal Methods of Describing Syntax
▪ Grammar
▪ Rules
▪ Derivation
▪ Parse trees
▪ EBNF



WHAT IS A LANGUAGE?
▪ A natural language is a structured system of communication
used by humans consisting of sounds or gestures.

▪ A programming language is a formal language comprising a set of instructions for computers.


HOW TO DESCRIBE A LANGUAGE?
▪ The study of programming languages, like the study of natural languages, can be divided into syntax and semantics, which are closely related.
▪ Syntax: the form or structure of the expressions, statements, and program units
▪ Semantics: the meaning of the expressions, statements, and program units


EXAMPLE : SENTENCE STRUCTURE



LANGUAGE DEFINITIONS

▪ Recognizers
• A recognition device (mechanism):
• reads input strings over the alphabet of the language
• decides whether the input strings belong to the language
• Accept/reject
▪ Generators
• A device that generates sentences of a language
• used to enumerate all of the sentences of a language
• Like a button
• Preferred by programmers for learning a language


WHY TO DESCRIBE A LANGUAGE?



WHY TO DESCRIBE A LANGUAGE?
▪ Syntax and semantics provide a language’s definition
▪ Difficult but essential!

▪ Challenge: diversity of users
▪ Initial evaluators, other language designers
▪ Implementers
▪ Users (Programmers)


FORMAL METHODS FOR DESCRIBING SYNTAX

▪ Context-Free Grammars
• Developed by Noam Chomsky in the mid-1950s
• Language generators, meant to describe the syntax of natural languages
• Two grammar classes: context-free and regular
▪ Backus-Naur Form (BNF)
• Invented by John Backus to describe the syntax of ALGOL 58
• Revised by Peter Naur for ALGOL 60
• Concise formal descriptions
• Not easily understandable, new notation
• Not immediately accepted, but later became the standard

Grammars: formal language-generation mechanisms


TERMINOLOGY OF DESCRIBING SYNTAX

▪ A language is a set of sentences
▪ A sentence is a string of characters over some alphabet
▪ A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin)
▪ A token is a category of lexemes (e.g., identifier)


EXAMPLE:
int index = 2 * count + 17;

Lexemes    Tokens
int        keyword
index      identifier
=          equal_sign / assignment_op
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semicolon or delimiter
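
The lexeme/token table above can be reproduced by a small lexical analyzer. The Python sketch below handles only the handful of token categories needed for this one statement; the token names follow the table, and the regular expressions and the `tokenize` function are assumptions made for illustration.

```python
import re

KEYWORDS = {"int"}
TOKEN_SPEC = [
    ("int_literal",   r"\d+"),
    ("identifier",    r"[A-Za-z_]\w*"),
    ("assignment_op", r"="),
    ("plus_op",       r"\+"),
    ("mult_op",       r"\*"),
    ("semicolon",     r";"),
    ("skip",          r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def tokenize(source):
    """Return (lexeme, token) pairs for the tiny statement language above."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        pos = match.end()
        kind, lexeme = match.lastgroup, match.group()
        if kind == "skip":
            continue                      # whitespace separates lexemes but is not a token
        if kind == "identifier" and lexeme in KEYWORDS:
            kind = "keyword"              # reserved words look like identifiers
        pairs.append((lexeme, kind))
    return pairs


for lexeme, token in tokenize("int index = 2 * count + 17;"):
    print(f"{lexeme:8} {token}")
```

Note that `index` and `count` are different lexemes that fall into the same token category, identifier.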



GRAMMARS FUNDAMENTALS
▪ A metalanguage is a language that is used to describe another language.
▪ BNF (a grammar) is a metalanguage for programming languages, i.e., the language of languages.
▪ Single words correspond to terminals
▪ Keywords: class, public, while, for in Java
▪ Literals: 1234, ‘d’
▪ Separators and delimiters: semicolons, commas, brackets, braces
▪ All the structures built on top of terminals (sentences, periods, paragraphs, chapters, and entire documents) correspond to non-terminals.


GRAMMAR, RULES
▪ A rule (production) has a left-hand side (LHS), which is a nonterminal, and a right-hand side (RHS), which is a string of terminals and/or non-terminals

▪ nonterminal symbols :
▪ often enclosed in angle brackets
▪ act like syntactic variables

▪ Grammar:
▪ a finite non-empty set of rules
▪ A generative device for defining languages



GRAMMAR, MULTIPLE RULE DEFINITION
▪ Two or more possible syntactic forms in the language:
▪ multiple rules:
▪ Single rule ( | ➔ OR)



GRAMMAR, DESCRIBING LISTS
▪ Variable-length lists in mathematics are written using an
ellipsis (. . .)
▪ Example : 1, 2, . . .

▪ Syntactic lists are described using recursion



EXAMPLE

A start symbol is a special element of the non-terminals of a grammar


GRAMMAR, DERIVATIONS
▪ A derivation is a repeated application of rules, starting with
the start symbol
▪ The derivation continues until the sentential form contains no non-
terminals.
▪ A derivation may be either leftmost or rightmost
▪ leftmost derivation is one in which the leftmost nonterminal in
each sentential form is the one that is expanded



DERIVATION
▪ Example sentence: begin A = B + C ; B = C end
• The symbol => is read “derives.”
• Sentential form: every string of symbols in a derivation
• Generated sentence: a sentential form consisting of only terminals, or lexemes.
• Objective: recognition
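
A leftmost derivation of the sentence `begin A = B + C ; B = C end` can be replayed mechanically. The grammar below is the small textbook-style grammar for assignment statements; it is assumed here, since the slide's own grammar figure is not reproduced in the text. The list `choices` says which alternative of the rule to apply at each step.

```python
GRAMMAR = {
    "<program>":    [["begin", "<stmt_list>", "end"]],
    "<stmt_list>":  [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>":       [["<var>", "=", "<expression>"]],
    "<var>":        [["A"], ["B"], ["C"]],
    "<expression>": [["<var>", "+", "<var>"], ["<var>", "-", "<var>"], ["<var>"]],
}


def leftmost_derivation(choices, start="<program>"):
    """Apply one rule per step to the leftmost nonterminal; return every sentential form."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Position of the leftmost nonterminal in the current sentential form.
        i = next(k for k, symbol in enumerate(form) if symbol in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


steps = leftmost_derivation([0, 1, 0, 0, 0, 1, 2, 0, 0, 1, 2, 2])
for number, sentential_form in enumerate(steps):
    print("  " if number == 0 else "=>", sentential_form)
```

Each printed line is a sentential form; the last one contains only terminals and is therefore the generated sentence.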



EXAMPLE

A = B * ( A + C )



PARSE TREE
▪ A hierarchical representation of a derivation
▪ Every internal node of a parse tree is labeled with a
nonterminal symbol;
▪ every leaf is labeled with a terminal symbol.
▪ Every subtree of a parse tree describes one instance of an
abstraction in the sentence.



EXAMPLE

A = B * ( A + C )



AMBIGUITY IN GRAMMARS
▪ A grammar is ambiguous if and only if it generates a
sentential form that has two or more distinct parse trees
▪ If a language structure has more than one parse tree, then the
meaning of the structure cannot be determined uniquely.
▪ Reasons:
▪ Operator precedence
▪ Associativity

▪ Normally, an ambiguous grammar can be rewritten into an unambiguous grammar.
▪ New non-terminals, new rules: to represent operands, and force different operators to different levels in the parse tree.


EXAMPLE



EXAMPLE, PARSE TREE



ASSOCIATIVITY OF OPERATORS
▪ Associativity: when an expression contains two operators with the same precedence
▪ a semantic rule is required to specify which operator is applied first

▪ Example,
A+B+C
▪ left and right associative orders of evaluation mean the same
thing:
(A + B) + C = A + (B + C)
▪ Subtraction and division are not associative, whether in
mathematics or in a computer
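
A quick numeric check of the point about subtraction, as a minimal Python sketch. The last two lines go one step beyond the slide: with floating-point arithmetic even addition can give different results under the two groupings, because of rounding. The specific values are chosen only for illustration.

```python
# Subtraction: the grouping changes the result.
left_sub = (10 - 4) - 3      # left associative reading
right_sub = 10 - (4 - 3)     # right associative reading
print(left_sub, right_sub)

# Integer addition: both groupings agree.
print((10 + 4) + 3 == 10 + (4 + 3))

# Floating-point addition: rounding can make the groupings differ.
left_add = (1e20 + -1e20) + 1.0
right_add = 1e20 + (-1e20 + 1.0)
print(left_add, right_add)
```

This is why a language definition has to fix the associativity of each operator rather than leave it to the implementation.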



EXAMPLE
▪ For + and *, left and right associative orders of evaluation mean the same thing:
(A + B) + C = A + (B + C)
▪ In syntax, left recursion specifies left associativity


AMBIGUITY EXAMPLE: “DANGLING-ELSE”
▪ Consider the grammar (ambiguous or not?)

<if_stmt> ➔ if <logic_expr> then <stmt>
          | if <logic_expr> then <stmt> else <stmt>
<stmt> ➔ <if_stmt>

▪ How to derive the following statement? The else can be attached to either if:

if <logic_expr> then
    if <logic_expr> then
        <stmt>
    else
        <stmt>

if <logic_expr> then
    if <logic_expr> then
        <stmt>
else
    <stmt>


PARSE TREE
▪ Some languages (like Java) match each else with the nearest preceding else-less if

if <logic_expr> then
if <logic_expr> then
<stmt>
else
<stmt>

if <logic_expr> then
if <logic_expr> then
<stmt>
else
<stmt>



EXTENDED BNF
▪ New meta-symbols:
▪ Optional parts are placed in brackets [ ]

▪ Alternative parts of RHSs are placed inside parentheses and separated via vertical bars
▪ Repetitions (0 or more) are placed inside braces { }


EXAMPLE
▪ If you have a rule such as:
<id> = <letter>
| <id><letter>
| <id><digit>
▪ You can replace it with:
<id> = <letter> {(<letter> | <digit>)}
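
The EBNF form maps almost directly onto a regular expression: the braces `{ }` become `*`, and the parenthesized alternatives become a character class. The Python sketch below assumes that `<letter>` means an ASCII letter and `<digit>` means 0-9, which the slide does not state.

```python
import re

# <id> = <letter> {(<letter> | <digit>)}
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_id(text):
    """Return True if the whole string is a valid <id> under the rule above."""
    return ID_PATTERN.fullmatch(text) is not None


for candidate in ["count", "x1", "1x", ""]:
    print(repr(candidate), is_id(candidate))
```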



EXAMPLE

BNF
<signed_int> = + <int>
             | - <int>
<int> = <digit>
      | <int><digit>

EBNF
<signed_int> = [ +|- ] <digit> {<digit>}


MORE EXAMPLES …

BNF
<expr> → <expr> + <term>
       | <expr> - <term>
       | <term>
<term> → <term> * <factor>
       | <term> / <factor>
       | <factor>

EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(*|/) <factor>}
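
The EBNF rules translate naturally into a recursive-descent parser: each nonterminal becomes a function and each `{ }` repetition becomes a loop. The Python sketch below is an illustration under assumptions: the slide does not define `<factor>`, so it is taken here to be an identifier or a parenthesized expression, and tokens are expected to be separated by spaces.

```python
def parse_expr(tokens):
    """<expr> → <term> {(+ | -) <term>}"""
    node, rest = parse_term(tokens)
    while rest and rest[0] in ("+", "-"):           # the EBNF braces become a loop
        right, after = parse_term(rest[1:])
        node, rest = (rest[0], node, right), after  # fold to the left: left associative
    return node, rest


def parse_term(tokens):
    """<term> → <factor> {(* | /) <factor>}"""
    node, rest = parse_factor(tokens)
    while rest and rest[0] in ("*", "/"):
        right, after = parse_factor(rest[1:])
        node, rest = (rest[0], node, right), after
    return node, rest


def parse_factor(tokens):
    """<factor> → ( <expr> ) | <id>   (assumed rule, not given on the slide)"""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    if tokens[0] == "(":
        node, rest = parse_expr(tokens[1:])
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return node, rest[1:]
    return tokens[0], tokens[1:]


def parse(text):
    """Parse a space-separated expression into a nested (operator, left, right) tuple."""
    tree, rest = parse_expr(text.split())
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree


print(parse("A - B - C"))
print(parse("A + B * C"))
print(parse("B * ( A + C )"))
```

Because `<term>` sits one level below `<expr>`, multiplication binds tighter than addition, and because each loop folds its result to the left, operators of equal precedence associate to the left, matching the left-recursive BNF.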



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch3(b) : Describing Semantics

Dr. Nada Mobark


GOAL
Briefly discuss formal
methods of describing
semantics



TOPICS
3.4 Attribute Grammars
3.5 Dynamic Semantics
Operational
4.1 Introduction to Compiler Design



WHY DESCRIBING SEMANTICS?
▪ Several needs for a methodology and notation for semantics:
▪ Programmers need to understand what the statements of a language do before developing programs.
▪ Compiler writers must know exactly what language constructs
mean to design implementations for them correctly.
▪ Correctness proofs would be possible without testing.
▪ Compiler generators would be possible
▪ language designers could discover ambiguities and
inconsistencies in their designs.



SEMANTICS DESCRIPTIONS

▪ Static
• Extension to BNF grammar
• BNF cannot describe all of the syntax of programming languages
• Checked at compile time
• Example: data-type compatibility
▪ Dynamic
• Related to the program meaning during execution
• Can be used to prove correctness without testing
• Example: loops


3.4 ATTRIBUTE GRAMMARS


ATTRIBUTE GRAMMARS: DEFINITION
▪ An attribute grammar is a context-free grammar with the
following additions:
▪ For each grammar symbol X there is a set A(X) of attribute values
consisting of:
▪ S(X): synthesized attributes
▪ I(X): inherited attributes
▪ intrinsic attributes on the leaves:
▪ symbol table → declaration
▪ Each rule has:
▪ a set of functions that define certain attributes of the non-terminals in
the rule
▪ a (possibly empty) set of predicates (Boolean functions) to check for
attribute consistency
▪ A false predicate function value indicates a violation of the syntax or
static semantics rules of the language.



EXAMPLE
▪ Rule → the name on the end of an Ada procedure must match
the procedure’s name.



EXAMPLE
▪ Syntax

<assign> -> <var> = <expr>


<expr> -> <var> + <var> | <var>
<var> -> A | B | C



EXAMPLE, PARSE TREE



EXAMPLE
▪ Syntax
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
▪ The variables can be one of two types: int or real.

▪ The types of operands in the right side can be mixed, but the
assignment is valid only if the target and the value resulting from
evaluating the right side have the same type.



EXAMPLE, RULES

<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C

Semantic rules:
1. When there are two variables on the right side of an assignment:
▪ If they have the same type, the expression type is that of the operands.
▪ If the operand types are not the same, the expression type is always real.

2. The type of the left side of the assignment must match the type of the right side.


EXAMPLE, ATTRIBUTES
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C

▪ actual_type: synthesized for <var> and <expr>
▪ Used to store the actual type, int or real, of a variable or expression
▪ Variable → actual type is intrinsic
▪ Expression → determined from the actual types of the child node or children nodes of the <expr> nonterminal.
▪ expected_type: inherited for <expr>
▪ Used to store the type, int or real, that is expected for the expression
▪ Determined by the type of the variable on the left side of the assignment statement.
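
The two attributes and the consistency predicate can be sketched in Python. This is an illustration under assumptions, not a full attribute-grammar evaluator: the declared types in `SYMBOL_TABLE` are hypothetical, and an expression is passed in simply as the list of its operand names.

```python
# Intrinsic attributes: declared types looked up in the symbol table (hypothetical declarations).
SYMBOL_TABLE = {"A": "real", "B": "int", "C": "int"}


def actual_type_of_expr(operands):
    """Synthesized attribute: computed bottom-up from the <var> children of <expr>."""
    types = [SYMBOL_TABLE[name] for name in operands]
    # Same types give that type; mixed int/real operands always give real.
    return "int" if all(t == "int" for t in types) else "real"


def check_assign(target, operands):
    """Predicate for <assign> -> <var> = <expr>: actual_type must equal expected_type."""
    expected_type = SYMBOL_TABLE[target]           # inherited by <expr> from the left side
    actual_type = actual_type_of_expr(operands)    # synthesized from the right side
    return actual_type == expected_type


print(check_assign("A", ["A", "B"]))   # A = A + B
print(check_assign("B", ["A", "C"]))   # B = A + C
print(check_assign("C", ["B"]))        # C = B
```

A `False` result corresponds to a false predicate, that is, a violation of the static semantics rule that both sides of the assignment have the same type.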



DECORATING A PARSE TREE
▪ How are attribute values computed?
▪ If all attributes are inherited, the tree is decorated in top-down order.
▪ If all attributes are synthesized, the tree is decorated in bottom-up order.
▪ If both kinds of attributes are used, some combination of top-down and bottom-up must be used.


EXAMPLE, FINAL RESULT



EVALUATION
▪ Static semantics is an essential part of all compilers
▪ Decorating parse trees is an expensive process
▪ When used to describe modern programming languages,
attribute grammar becomes large in size and very complex.



3.5 DYNAMIC SEMANTICS


DYNAMIC SEMANTICS
▪ Dynamic semantics reflects the meaning of the expressions, statements, and program units of a programming language
➢ There is no single widely acceptable notation or formalism
for describing semantics
➢ Programmers usually rely on language manuals
➢ Imprecise, incomplete

▪ Three methods:
✓ Operational semantics
▪ Denotational semantics
▪ Axiomatic semantics



OPERATIONAL SEMANTICS
▪ Operational Semantics
▪ Describe the meaning of a program by executing its statements on
a machine, either simulated or actual.
▪ The change in the state of the machine (memory, registers, etc.)
defines the meaning of the statement
▪ the concept is frequently used in programming textbooks and
programming language reference manuals



THE BASIC PROCESS
▪ First step : design an appropriate intermediate language,
where the primary desired characteristic of the language is
clarity
▪ Every construct of the intermediate language must have an
obvious and unambiguous meaning



EXAMPLE



JAVA DO-WHILE

do:
…..
……..
while : if condition == 1 goto do
end:
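
To make the idea concrete, the following Python sketch runs a do-while loop that has been translated into a tiny intermediate language of assignments and conditional gotos. The instruction format, the function run, and the sample loop are assumptions made for illustration; they are not the notation used in the slides.

```python
def run(program, state):
    """Execute a list of (label, instruction) pairs and return the final state."""
    labels = {label: i for i, (label, _) in enumerate(program) if label}
    pc = 0
    while pc < len(program):
        _, instr = program[pc]
        op = instr[0]
        if op == "assign":                 # ("assign", variable, function of state)
            _, var, expr = instr
            state[var] = expr(state)
            pc += 1
        elif op == "if_goto":              # ("if_goto", condition, target label)
            _, cond, target = instr
            pc = labels[target] if cond(state) else pc + 1
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return state

# Translation of: do { total += i; i += 1; } while (i < 4);
do_while = [
    ("do",    ("assign", "total", lambda s: s["total"] + s["i"])),
    (None,    ("assign", "i",     lambda s: s["i"] + 1)),
    ("while", ("if_goto", lambda s: s["i"] < 4, "do")),
]
```

The meaning of the loop is the change it makes to the state dictionary: run(do_while, {"i": 0, "total": 0}) executes the body before the first test, which is exactly what distinguishes do-while from while.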
EVALUATION
▪ Good if used informally (language manuals, etc.) or for
teaching programming languages
▪ Extremely complex if used formally
▪ Vienna Definition Language (VDL) was used for describing
semantics of PL/I.
▪ can lead to circularities, in which concepts are indirectly
defined in terms of themselves



SUMMARY
▪ An attribute grammar is a descriptive formalism that can
describe both the syntax and the static semantics of a language
▪ Operational semantics describe the meaning of a program by
executing its statements on a machine using an intermediate
language



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch4 : Lexical and Syntax Analysis

Dr. Nada Mobark


GOAL
Compile == understand !!



TOPICS
4.1 Introduction
4.2 Lexical Analysis
4.3 The Parsing Problem
▪ Top-down Parsing
▪ Bottom-Up Parsing



4.1 INTRODUCTION
▪ Language implementation systems analyze source code,
regardless of the specific implementation approach
▪ Nearly all syntax analysis is based on a formal description of
the syntax of the source language (BNF)
▪ Advantages of Using BNF to describe Syntax
▪ Provides a clear and concise syntax description
▪ The parser can be built directly from the BNF
▪ Parsers based on BNF are easy to update



SYNTAX ANALYSIS
▪ The syntax analysis portion of a compiler nearly always consists
of two parts:
  ▪ A low-level part called a lexical analyzer
  ▪ A high-level part called a syntax analyzer, or parser (based
  on BNF)

[Figure: compiler pipeline for the source line position = initial + rate * 60: Source Code → Lexical Analyzer → Parser → Semantic Analyzer → Code Generator → Machine Code → Computer]
4.2 LEXICAL ANALYSIS
▪ A lexical analyzer is a pattern matcher for character strings
▪ It is the “front-end” of a parser

▪ Identifies substrings of the source program that belong
together (lexemes)
▪ Lexemes match a character pattern (e.g., while), which is associated
with a category of lexemes (e.g., keyword) called a token



TERMINOLOGY

▪ A language is a set of sentences
▪ A sentence is a string of characters over some alphabet
▪ A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin)
▪ A token is a category of lexemes (e.g., identifier)



EXAMPLE
▪ A lexeme is the lowest level syntactic unit of a language (e.g.,
*, sum, begin, ;, {, }, [, ])
▪ A token is a category of lexemes (e.g., identifier)

index = 2 * count + 17;

Lexeme    Token
index     identifier
=         equal_sign
2         int_literal
*         mult_op
count     identifier
+         plus_op
17        int_literal
;         semicolon
LEXICAL ANALYSIS
▪ A lexical analyzer also . . .
▪ Skips comments
▪ Skips blanks outside lexemes
▪ Inserts lexemes for identifiers and literals into a symbol table
▪ Detects syntactic errors in lexemes
▪ For example, ill-formed floating-point literals, 12,345.21, 12,213, 4us

Symbol Table
i    Lexeme    Token         …
1    sum       IDENT         …
2    (         LEFT_PAREN    …
3    +         ADD_OP        …
4    47        INT_LIT



LEXICAL ANALYSIS
▪ A lexical analyzer typically has several instance variables
  ▪ Character nextChar
  ▪ int charClass (letter, digit, etc.)
  ▪ String lexeme
  ▪ int tokenType
▪ Essential functions of a lexical analyzer
  ▪ getChar - gets the next character of input, puts it in nextChar,
  determines its class and puts the class in charClass
  ▪ addChar - puts the character from nextChar into the place the
  lexeme is being accumulated, lexeme
  ▪ lookup - determines whether the string in lexeme is a reserved
  word (returns a code)



STATE DIAGRAM
[Figure: state diagram of the lexical analyzer that ends in a Token. Annotations on the diagram:
▪ gets the next character of input, and determines its class
▪ adds the character from nextChar to the lexeme string
▪ determines whether the string in lexeme is a reserved word (returns a code)
▪ recognizes single-char tokens (returns a code)]



LEXICAL ANALYZER
▪ Implementation: front.c (Figure 4.1)

//Character classes
#define LETTER 0
#define DIGIT 1
#define UNKNOWN 99
//Token codes
#define INT_LIT 10
#define IDENT 11
#define ASSIGN_OP 20
#define ADD_OP 21
#define SUB_OP 22
#define MULT_OP 23
#define DIV_OP 24
#define LEFT_PAREN 25
#define RIGHT_PAREN 26

▪ Input:
(sum + 47) / total
▪ Output:
Next token is: 25 Next lexeme is (
Next token is: 11 Next lexeme is sum
Next token is: 21 Next lexeme is +
Next token is: 10 Next lexeme is 47
Next token is: 26 Next lexeme is )
Next token is: 24 Next lexeme is /
Next token is: 11 Next lexeme is total
Next token is: -1 Next lexeme is EOF
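
The same token stream can be reproduced with a short Python sketch. It is not front.c: it is a simplified stand-in that reuses the token codes listed on this slide, and the names char_class, lex, and OPERATORS are assumptions introduced for the example. Like the C version, it classifies each character, accumulates a lexeme, and returns a code.

```python
LETTER, DIGIT, UNKNOWN, EOF = 0, 1, 99, -1
INT_LIT, IDENT = 10, 11
OPERATORS = {"=": 20, "+": 21, "-": 22, "*": 23, "/": 24, "(": 25, ")": 26}

def char_class(ch):
    """Classify one character as LETTER, DIGIT, or UNKNOWN."""
    if ch.isalpha():
        return LETTER
    if ch.isdigit():
        return DIGIT
    return UNKNOWN

def lex(source):
    """Yield (token_code, lexeme) pairs, ending with (EOF, "EOF")."""
    pos = 0
    while True:
        while pos < len(source) and source[pos].isspace():   # skip blanks
            pos += 1
        if pos == len(source):
            yield EOF, "EOF"
            return
        start, cls = pos, char_class(source[pos])
        pos += 1
        if cls == LETTER:            # identifier: letter (letter | digit)*
            while pos < len(source) and char_class(source[pos]) in (LETTER, DIGIT):
                pos += 1
            yield IDENT, source[start:pos]
        elif cls == DIGIT:           # integer literal: digit+
            while pos < len(source) and char_class(source[pos]) == DIGIT:
                pos += 1
            yield INT_LIT, source[start:pos]
        else:                        # single-character token; unknown ones map to EOF
            lexeme = source[start:pos]
            yield OPERATORS.get(lexeme, EOF), lexeme
```

Iterating over lex("(sum + 47) / total") yields one pair per line of the output listed on the slide.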



4.3 THE PARSING PROBLEM
▪ Goals of the parser
  ▪ Produce the parse tree
  ▪ Find all syntax errors
    ▪ produce an appropriate diagnostic message and recover quickly
▪ Two categories of parsers
  ▪ Top-down parser - produces the parse tree, beginning at the root
  ▪ Bottom-up parser - produces the parse tree, beginning at the
  leaves

[Figure: compiler pipeline: Source Code → Lexical Analyzer → Parser → Semantic Analyzer → Code Generator → Machine Code → Computer]
THE TOP-DOWN PARSER
▪ An LL parser is a top-down parser.
▪ parses the input from Left to right
▪ performs Leftmost derivation of the sentence.

▪ A nonterminal symbol, A, can be replaced using any one of a nonempty
set of production rules, namely the A-rules
▪ Given a sentential form, xAα, the parser must choose the correct
A-rule to get the next sentential form
  ▪ x is a string of terminal symbols
  ▪ A is a single nonterminal symbol
  ▪ α is a mixed string of terminals and/or nonterminals
▪ Leftmost derivation
  ▪ keep replacing the leftmost nonterminal A by the appropriate A-rule
  ▪ look only one token ahead in the input



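TOP-DOWN PARSER EXAMPLE
▪ Look at the following grammar
E → T + E | T
T → int * T | int | (E)
▪ Consider the string:
int * int + int
▪ Left-most derivation, start from the root!

[Figure: parse tree grown from the root E. E has the children T + E; the left T expands to int * T, whose inner T expands to int; the right E expands to T, which expands to int.]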
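
A top-down parser for this grammar can be sketched as a set of mutually recursive Python functions, one per nonterminal. Two assumptions are made here: the tokens arrive as a plain list of strings, and the alternatives that share a prefix (T + E versus T, int * T versus int) are decided by parsing the common prefix first and then looking one token ahead. The function names and the tuple shape of the tree are illustrative only.

```python
def parse(tokens):
    """Parse a token list with E -> T + E | T and T -> int * T | int | ( E )."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        pos += 1

    def parse_E():
        left = parse_T()
        if peek() == "+":              # lookahead selects E -> T + E
            expect("+")
            return ("E", left, "+", parse_E())
        return ("E", left)             # otherwise E -> T

    def parse_T():
        if peek() == "(":              # T -> ( E )
            expect("(")
            inner = parse_E()
            expect(")")
            return ("T", "(", inner, ")")
        expect("int")
        if peek() == "*":              # T -> int * T
            expect("*")
            return ("T", "int", "*", parse_T())
        return ("T", "int")            # T -> int

    tree = parse_E()                   # start at the root nonterminal
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For example, parse("int * int + int".split()) returns a nested tuple whose outermost node is E, built from the root downward in the order the functions are called.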



BOTTOM-UP PARSER
▪ An LR parser is a bottom-up parser
  ▪ parses the input from Left to right
  ▪ produces a Rightmost derivation of the sentence, in reverse order
▪ Steps (reduction process):
  ▪ Start with the tokens of the program and work back to the start
  symbol
  ▪ Repeatedly pick a substring of the current sentential form and
  reduce it back to a nonterminal
  ▪ That is, match the RHS of some production rule with a substring of
  tokens (the handle), and replace the substring with the LHS of the
  production rule



BOTTOM-UP PARSER EXAMPLE
E → E + T | T
T → T * int | int | (E)
▪ Apply the right-most derivation in reverse order → start from the
terminals!
int * int + int
T * int + int
T + int
E + int
E + T
E

[Figure: parse tree grown from the leaves int * int + int up to the root E.]
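
The reduction process can be imitated with a small hand-written shift-reduce loop in Python. This is a sketch for this one grammar only, not a table-driven LR parser: the handle patterns are hard-coded, and the rule "do not reduce T to E while the next token is *" is an assumption standing in for the decisions an LR parsing table would make.

```python
def shift_reduce(tokens):
    """Return the sentential forms visited while reducing tokens to E."""
    stack, rest = [], list(tokens)
    forms = [" ".join(rest)]

    def reduce_once(lookahead):
        # Try each handle pattern on top of the stack.
        if stack[-3:] == ["T", "*", "int"]:          # T -> T * int
            stack[-3:] = ["T"]
        elif stack[-1:] == ["int"]:                  # T -> int
            stack[-1:] = ["T"]
        elif stack[-3:] == ["(", "E", ")"]:          # T -> ( E )
            stack[-3:] = ["T"]
        elif stack[-1:] == ["T"] and lookahead != "*":
            if stack[-3:-1] == ["E", "+"]:           # E -> E + T
                stack[-3:] = ["E"]
            else:                                    # E -> T
                stack[-1:] = ["E"]
        else:
            return False
        return True

    while True:
        lookahead = rest[0] if rest else None
        if reduce_once(lookahead):
            forms.append(" ".join(stack + rest))     # record the new sentential form
        elif rest:
            stack.append(rest.pop(0))                # shift the next token
        else:
            break
    if stack != ["E"]:
        raise SyntaxError("input cannot be reduced to E")
    return forms
```

shift_reduce("int * int + int".split()) returns the list of sentential forms from the token string down to the start symbol E; read backwards, that list is a rightmost derivation.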



PRACTICE
▪ Look at the following grammar
S –> AB
A –> aA | ε
B –> b | bB
▪ Given the string:
aaab (written aaaεb on the following slides to show where A derives the empty string ε)
▪ Draw top-down and bottom-up parse trees for the string



TOP-DOWN
S –> AB
A –> aA | ε
B –> b | bB

S => A B
=> a A B
=> a a A B
=> a a a A B
=> a a a ε B
=> a a a ε b

[Figure: parse tree grown from the root S down to the leaves a a a ε b.]
BOTTOM-UP
S –> AB
A –> aA | ε
B –> b | bB

=> a a a ε b
=> aaaAb
=> aaAb
=> aAb
=> Ab
=> AB
=> S

[Figure: parse tree grown from the leaves a a a ε b up to the root S.]
SEPARATE LEXICAL AND SYNTAX ANALYSIS
▪ Reasons to Separate Lexical and Syntax Analysis
▪ Simplicity - less complex approaches can be used for lexical
analysis; separating them simplifies the parser
▪ Efficiency - separation allows optimization of the lexical analyzer
▪ Portability - parts of the lexical analyzer may not be portable, but
the parser always is portable



[Figure: phases of a compiler, traced for one statement]
▪ Source Code
position = initial + rate * 60
▪ Lexical Analyzer → tokens and symbol table
Symbol Table
i    Lexeme      Token      …
1    position    id         …
2    initial     id         …
3    rate        id         …
4    60          int_lit
▪ Syntax Analyzer → parse tree
= ( <id 1>, + ( <id 2>, * ( <id 3>, int_lit ) ) )
▪ Semantic Analyzer → Intermediate Code
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
▪ Optimized Code
t1 = id3 * 60.0
t3 = id2 + t1
id1 = t3
▪ Code Generator → Assembly/machine Code, executed by the Computer
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
SUMMARY
▪ The major methods of implementing programming languages
are: compilation, pure interpretation, and hybrid
implementation
▪ Syntax analysis is a common part of language implementation
▪ A lexical analyzer is a pattern matcher that isolates small-scale
parts of a program
▪ A parser
  ▪ Detects syntax errors
  ▪ Produces a parse tree
▪ Parsing problem for bottom-up parsers: find the substring of the
current sentential form that must be reduced (the handle)



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch5 : Names, Bindings, and Scopes

Dr. Nada Mobark


GOAL
Discuss the
fundamental semantic
issues of variables



TOPICS
5.1 Introduction
5.3 Variables
▪ Name
▪ Address
▪ Value
▪ Type
▪ Lifetime

5.4 The Concept of Binding


▪ Type binding
▪ Storage binding

5.5 Scope
▪ Static
▪ dynamic

5.6 Scope and Lifetime



INTRODUCTION
▪ Imperative languages are abstractions of Von Neumann
architecture
▪ Processor: execute programs modifying the contents of the
memory
▪ Memory : store data and instructions
▪ Variables are essential!



WHAT IS A VARIABLE?
▪ Machine and assembly
languages
▪ Readability, writability, and
maintainability issues



WHAT IS A VARIABLE?
▪ A variable is an abstraction of a memory cell or collection of
cells
▪ Variables are
▪ denoted by name
▪ stored based on type
▪ used based on scope and lifetime



VARIABLE ATTRIBUTES
▪ Name
▪ Address
▪ Value
▪ Type
▪ Lifetime
▪ Scope

index = 2 * count + 17;

Symbol Table
     Name (Lexemes)    Token    Addr    Value    Type    Lifetime    Scope
1    index             id                        int                 local
2    17                const                     int



VARIABLE ATTRIBUTES, NAME
▪ A name is a string of characters that identifies some entity in a
program
▪ Examples: variable, subprogram, constant, etc.

▪ The term identifier is often used interchangeably with name.


▪ Design issues for names:
▪ What forms are legal?
▪ Are names case sensitive?
▪ Are special words reserved words or keywords?



5.2 DESIGN ISSUES FOR NAMES, FORM
▪ Most programming languages use the same form:
  ▪ a letter followed by a string consisting of letters, digits, and
  underscore characters ( _ ).
▪ Special characters may be allowed at the beginning
  ▪ PHP: all variable names must begin with a dollar sign
  ▪ Perl: all variable names begin with special characters, which
  specify the variable's type
▪ No spaces → a multiple-word naming convention is needed:
  ▪ camel notation: all of the words except the first are capitalized, as in
  myStack
  ▪ use of underscores and mixed case
  ▪ A programming style, not a language issue!



DESIGN ISSUES FOR NAMES, LENGTH
▪ If too short, they cannot be descriptive
▪ Language examples:
▪ FORTRAN 95: maximum of 31
▪ C99: no limit but only the first 63 are significant;
▪ C++: no limit, but implementers often impose a limit on name length to
simplify the symbol table
▪ C# and Java: no limit, and all are significant



DESIGN ISSUES FOR NAMES, CASE
SENSITIVITY
▪ Names in the C-based languages are case sensitive
▪ Names in many other languages are not
▪ Stick to a convention to avoid confusion
  ▪ In C, variable names are conventionally lowercase.
  ▪ In C++, Java, and C#, predefined names are mixed case (e.g.
  IndexOutOfBoundsException)
▪ Disadvantages of case sensitivity:
  ▪ Poor readability: names that look alike are different
  ▪ Poor writability: the need to remember specific case usage makes
  it more difficult to write correct programs



DESIGN ISSUES FOR NAMES, SPECIAL WORDS
▪ Special words are used to separate the syntactic(action) parts
of statements and programs.
▪ Are they allowed to be used as names?
▪ No → reserved words
▪ cannot be redefined ( used as a user-defined name)
▪ Problem: If there are too many, many collisions occur
▪ Eg. : COBOL (300 reserved words)
▪ LENGTH, BOTTOM, COUNT
▪ Solution : names are visible only when explicitly imported.
▪ Yes → keywords
▪ can be redefined, have special meaning only in certain contexts
▪ e.g., in Fortran
▪ Real VarName (Real is a data type)
▪ Real = 3.4 (Real is a variable)



VARIABLE ATTRIBUTES, NAME
▪ not all variables have names!

int add(int nX, int nY)
{
    return nX + nY;
}

▪ Anonymous variable: the temporary that holds the result of nX + nY has no name



VARIABLE ATTRIBUTES, TYPE
▪ The type of a variable determines :
▪ the range of values of variables
▪ floating point type also determines the precision
▪ the set of operations that are defined for values of that type



VARIABLE ATTRIBUTES, ADDRESS
▪ The memory address with which a variable is associated
▪ sometimes called its l-value

▪ Design issues: Where and When?



WHAT IS BINDING?
▪ A binding is an association (between a name and the thing that
is named )
▪ between an entity and an attribute
▪ between a variable and its type or value,
▪ between an operation and a symbol



BINDING TIME
▪ Binding time is the time at which a binding takes place
▪ Language design time -- bind operator symbols to operations
▪ Language implementation time-- bind a data type to a
representation
▪ Compile time -- bind a variable to a type
▪ Runtime -- bind a local variable to a memory cell

int count = x + 5;



BINDING ATTRIBUTES TO VARIABLES

▪ Static: a binding is static if it first occurs before run time and
remains unchanged throughout program execution.
▪ Dynamic: a binding is dynamic if it first occurs during execution or
can change during execution of the program
▪ A complete understanding of the binding times for program entities
is a prerequisite for understanding the semantics of a
programming language



TYPE BINDINGS
▪ Before the variable is referenced in a program, it must be
bound to a data type
▪ Design issues:
▪ How is a type specified?
▪ When does the binding take place?



STATIC TYPE BINDINGS
▪ If static, the type may be specified by either an explicit or an
implicit declaration
▪ An explicit declaration is a program statement used for declaring
the types of variables



STATIC TYPE BINDINGS
▪ An implicit declaration is a default mechanism for specifying
types of variables other than declaration statements
▪ Advantage: writability
▪ Disadvantage: reliability (hard to detect errors)
▪ Naming conventions: requiring names of specific types to
begin with particular special characters
  ▪ Perl:
  $apple    String or Numeric
  @apple    Array
▪ Type inference: using the context of the values assigned to the
variable in a declaration statement
  ▪ C#:
  var sum = 0;
  var total = 0.0;
  var name = "Fred";



DYNAMIC TYPE BINDING
▪ Variable type is not specified by a declaration statement
▪ A type is determined when a variable is assigned a value
▪ May affect address-binding

▪ Variable can change its type at run time


▪ Example: JavaScript
list = [2, 4.33, 6, 8];
list = 17.3;
▪ Advantage:
▪ flexibility (generic program units)

▪ Disadvantages:
▪ High cost (dynamic allocation/de-allocation)
▪ Type error detection by the compiler is difficult
▪ Usually implemented by interpreter (slow)
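
Python behaves the same way as the JavaScript example, so the points above can be tried directly. The sketch below is illustrative only; the names data, double, and late_error are made up for the example.

```python
data = [2, 4.33, 6, 8]          # the name is bound to a list
first = type(data).__name__
data = 17.3                     # the same name is now bound to a float
second = type(data).__name__

def double(x):
    """Generic unit: works for any value whose type supports *."""
    return x * 2

def late_error():
    """The type error is detected only when this statement executes."""
    text = "17"
    return text - 1             # raises TypeError at run time
```

double accepts an int, a string, or a list without any declaration, which is the flexibility advantage; late_error shows the cost, because nothing flags the bad subtraction until the function is actually called.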



STORAGE BINDING
▪ The variable is bound to a specific memory location.



VARIABLE ATTRIBUTES, LIFETIME
▪ The lifetime of a variable is the time during which it is bound
to a particular memory cell
▪ Allocation - getting a cell from some pool of available cells
▪ Deallocation - putting a cell back into the pool

▪ Categories:
▪ Static variables
▪ Stack-dynamic variables
▪ heap-dynamic variables
▪ Explicit
▪ Implicit



STATIC VARIABLES
▪ A variable is bound to memory cells before execution begins
and remains bound to the same memory cell until program
execution terminates.
▪ e.g., C and C++ static variables in functions

▪ Advantages:
▪ efficiency (direct addressing),
▪ history-sensitive subprogram support

▪ Disadvantage:
▪ lack of flexibility (no recursion)



EXAMPLE
#include <stdio.h>
#define MAX 5
void sumIt(void); /* prototype: sumIt is defined after main */
int main(){
    int i = 0;
    printf("Enter 5 numbers to be summed\n");
    for(i = 0; i<MAX; ++i)
        sumIt();
    printf("\nProgram completed\n");
    getchar();
    return 0;
}
void sumIt(void){
    static int sum = 0; /* initialized once; keeps its value between calls */
    int num;
    printf("\nEnter a number: ");
    scanf("%d", &num);
    sum+=num;
    printf("The current sum is: %d",sum);
}

Output:
C:\>test_static.o
Enter 5 numbers to be summed
Enter a number: 1
The current sum is: 1
Enter a number: 2
The current sum is: 3
Enter a number: 3
The current sum is: 6
Enter a number: 4
The current sum is: 10
Enter a number: 5
The current sum is: 15
Program completed



STACK-DYNAMIC VARIABLES
▪ Storage bindings are created for variables when their
declaration statements are elaborated.
▪ A declaration is elaborated when the executable code associated
with it is executed

▪ If scalar, all attributes except address are statically bound
  ▪ local variables in C subprograms (not declared static) and Java
  methods
▪ Advantage: allows recursion; conserves storage
▪ Disadvantages:
  ▪ Overhead of allocation and deallocation
  ▪ Subprograms cannot be history sensitive
  ▪ Inefficient references (indirect addressing)



HEAP-DYNAMIC VARIABLES
▪ What is heap?
▪ Allocation specified by the
programmer
▪ takes effect during execution
▪ Advantage:
▪ dynamic storage management
▪ Flexibility



EXPLICIT HEAP-DYNAMIC VARIABLES
▪ Allocated and deallocated by explicit directives
▪ Referenced only through pointers or references
▪ e.g.
  ▪ all objects in Java
  ▪ dynamic objects in C++ (via new and delete)
▪ Disadvantages:
  ▪ Inefficient: cost of allocations, references, and deallocations
  ▪ Explicit deallocation makes programs unreliable
IMPLICIT HEAP-DYNAMIC VARIABLES
▪ Allocation and deallocation caused by assignment statements
▪ All attributes are bound every time they are assigned

▪ e.g.
  ▪ all strings and arrays in Perl, JavaScript, and PHP
Highs = 5
Highs = [1, 2, 3, 4, 5]
▪ Advantage:
  ▪ flexibility (generic code)

▪ Disadvantages:
▪ Inefficient- Introduce the run-time overhead of maintaining all the
dynamic attributes.
▪ Loss of error detection by the compiler



STORAGE BINDING TIMES



In C++, the same name x can denote different variables (bound to
different memory cells).

#include <iostream>
using namespace std;

int x = 1; //global variable

int main( )
{   cout << "global x in main is " << x << endl; // 1
    int x = 5; //local variable to main
    cout << "local x in main's outer scope is " << x << endl; // 5
    { //start new scope
        int x = 7; //hides both x in outer scope and global x
        cout << "local x in main's inner scope is " << x << endl; // 7
    } //end new scope
    cout << "local x in main's outer scope is " << x << endl; // 5
} //end of main



VARIABLE ATTRIBUTES, VALUE
▪ The contents of the location with which the variable is
associated
▪ The address of a variable → the l-value of the variable
▪ The value of a variable → the r-value of the variable

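▪ To access the r-value, the l-value needs to be evaluated first.

Sum = Sum + age ;
▪ Sum on the left-hand side denotes an address (l-value)
▪ Sum and age on the right-hand side denote contents (r-values)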



VARIABLE ATTRIBUTES, SCOPE
▪ The scope of a variable is the range of statements over which
it is visible
▪ The scope rules of a language determine how references to names
are associated with variables
▪ Scope and lifetime are sometimes closely related, but are
different concepts
▪ scope is a textual, or spatial, concept whereas lifetime is a
temporal concept



STATIC SCOPE
▪ Scope can be statically determined – prior to execution
▪ To connect a name reference to a variable, you (or the
compiler) must search for the declaration of the variables
▪ Scope-based variable categories:
▪ The local variables of a program unit are those that are declared in
that unit
▪ The nonlocal variables of a program unit are those that are visible
in the unit but not declared there
▪ Global variables are a special category of nonlocal variables



GLOBAL SCOPE
▪ C, C++, PHP, JavaScript, and Python support a program
structure that consists of a sequence of function definitions in
a file
▪ These languages allow variable declarations to appear outside
function definitions
▪ C and C++ :
  ▪ A global variable is implicitly visible in all subsequent
  functions in the file,
  ▪ except in functions that redeclare a variable with the same name.



BLOCKS
▪ Some languages allow a section of code to have its own local
variables whose scope is minimized
▪ Defined by blocks, which can be nested
▪ Treated like subprograms → variables are stack dynamic
▪ Storage is allocated when the block is entered and deallocated
when the block is exited.
▪ The scope of loop counter variables is restricted to the for
construct



NESTED BLOCKS
▪ A variable that is defined in an outer scope is accessible in all
(following) inner scopes.



EXAMPLE
▪ Variables with same name in nested scopes

▪ Is this allowed?
▪ Legal in C and C++, but not in Java and C#, because it is error-prone



VARIABLE SHADOWING
▪ Block scoping may hide another
variable in a larger enclosing scope
▪ Variables can be hidden from a unit
by having a "closer" variable with
the same name
▪ can be accessed with selective
references
▪ C++ uses the scope resolution
operator (:: )



NESTED FUNCTIONS
▪ A function defined inside another function is called a nested function.
▪ Static scoping
  ▪ Once your program finds a name reference, the search goes as follows:
    ▪ search declarations, first locally, then in increasingly
    larger enclosing scopes, until one is found for the given name
  ▪ Enclosing static scopes (to a specific scope) are called its
  static ancestors; the nearest static ancestor is called a
  static parent

function big() {
    function sub1() {
        var x = 7;
        sub2();
        print(x);
    }

    function sub2() {
        var y = x;
        print(x);
    }

    var x = 3;
    sub1();
}

EVALUATION OF STATIC SCOPING
▪ Works well in many situations
▪ Problems:
▪ In most cases, too much access is possible
▪ As a program evolves, the initial structure is destroyed and local
variables often become global; subprograms also gravitate toward
becoming global, rather than nested



DYNAMIC SCOPE
▪ Based on calling sequences of program units, not their textual
layout (temporal versus spatial)
▪ References to variables are connected to declarations by
searching back through the chain of subprogram calls that
forced execution to this point



EXAMPLE
▪ Dynamic scoping
  ▪ Reference to x in sub2 is to sub1's x
  ▪ big calls sub1
  ▪ sub1 calls sub2
  ▪ sub2 uses x

function big() {
    function sub1() {
        var x = 7;
        sub2();
        print(x);
    }

    function sub2() {
        var y = x;
        print(x);
    }

    var x = 3;
    sub1();
}
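
The difference between the two rules can be simulated in Python by giving every activation two links: one to its textually enclosing unit and one to its caller. The Frame class and the lookup and run functions below are hypothetical names for this sketch; real implementations use static and dynamic chains on the run-time stack.

```python
class Frame:
    """One activation of a program unit with its local variables."""
    def __init__(self, name, static_parent=None, caller=None):
        self.name = name
        self.vars = {}
        self.static_parent = static_parent   # textually enclosing unit
        self.caller = caller                 # unit that called this one

def lookup(frame, var, rule):
    """Resolve var from frame, following static or dynamic links."""
    while frame is not None:
        if var in frame.vars:
            return frame.vars[var]
        frame = frame.static_parent if rule == "static" else frame.caller
    raise NameError(var)

def run(rule):
    """Model big -> sub1 -> sub2 and return the x that sub2 sees."""
    big = Frame("big")
    big.vars["x"] = 3
    sub1 = Frame("sub1", static_parent=big, caller=big)
    sub1.vars["x"] = 7
    sub2 = Frame("sub2", static_parent=big, caller=sub1)
    return lookup(sub2, "x", rule)
```

With the static rule the search goes from sub2 to its static parent big; with the dynamic rule it goes from sub2 back to its caller sub1, so the same reference finds a different variable.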



EVALUATION OF DYNAMIC SCOPING
▪ Advantage:
▪ No need to pass arguments → convenience

▪ Disadvantages:
▪ While a subprogram is executing, its variables are visible to all
subprograms it calls → less reliable
▪ A statement in a subprogram that contains a reference to a
nonlocal variable can refer to different nonlocal variables during
different executions of the sub-programs → Impossible to
statically determine attributes
▪ Takes longer time to resolve → inefficient
▪ You need to know the sequence of subprogram calls to understand
a reference to a variable → Poor readability



EXAMPLE
✓ Trace the program by hand, and predict the output of the program.
✓ What happens if we remove the line:
using namespace std;

#include <iostream>

using namespace std;

int i = 5;
void p(){
    int i = -1;
    i = i + 1;
    cout << i << endl;
}

int main(){
    cout << i << endl;
    char ch;
    int i = 6;
    i = i + 1;
    p();
    cout << i << endl;
    return 0;
}



EXAMPLE
✓ Trace the program by hand, and predict the output of the program.
✓ Does it compile correctly or not? Explain.

#include <iostream>

using namespace std;

int i;

int main(){
    int i;
    i = 5;

    for( int i = 1;
         i<10 && cout << i << ' ';
         ++i )
    {
        int i = -1;
        cout << i << ' ';
    }

    cout << i << endl;

    return 0;
}



SUMMARY
▪ Case sensitivity and the relationship of names to special
words represent design issues of names
▪ Variables are characterized by the sextuples: name, address,
value, type, lifetime, scope
▪ Binding is the association of attributes with program entities
▪ Scalar variables are categorized as: static, stack dynamic,
explicit heap dynamic, implicit heap dynamic
▪ The scope of a variable can be determined either statically or
dynamically



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch6 : Data types

Dr. Nada Mobark


GOAL
Explore categories,
characteristics, design
issues, and
implementation of the
common data types in
different programming
languages.



TOPICS
6.1 Introduction
6.2 Primitive Data Types
6.3 String Types
6.4 Enumeration Types
6.5 Array Types
6.6 Associative Arrays
6.11 Pointer and Reference Types



6.1 INTRODUCTION
▪ A data type defines a collection of data objects and a set of
predefined operations on those objects
▪ Earlier programming languages offered limited data structure
(types)
▪ The concept evolved over time:
▪ FORTRAN → arrays
▪ COBOL → decimal data , records
▪ ALGOL → user-defined data types



INTRODUCTION
▪ Uses of type system:
▪ Type-checking
▪ ensuring that the operands of an operator are of compatible types
▪ Program modularization
▪ Proper calling of Methods and interfaces
▪ Understanding semantics
▪ Expected program output

▪ One design issue for all data types: What operations are
defined and how are they specified?



6.2 PRIMITIVE DATA TYPES
▪ Almost all programming languages
provide a set of primitive data types
▪ Those not defined in terms of other
data types
▪ Some primitive data types are
merely reflections of the hardware



PRIMITIVE DATA TYPES, INTEGER
▪ A string of bits
  ▪ one of the bits (typically the leftmost) represents the sign
▪ Different sizes of integers
  ▪ Java: byte, short, int, long
▪ Negative values representation
  ▪ Signed magnitude
  ▪ Ones complement
  ▪ Twos complement
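
The three representations of a negative value can be compared with a few lines of Python. The function names and the default width of 8 bits are assumptions for this sketch, and no range checking is done, so the caller is expected to pass values that fit in the chosen width.

```python
def sign_magnitude(n, bits=8):
    """Sign bit followed by the magnitude in the remaining bits."""
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), f"0{bits - 1}b")

def ones_complement(n, bits=8):
    """Negative values are the bitwise complement of the magnitude."""
    if n >= 0:
        return format(n, f"0{bits}b")
    return format(~abs(n) & ((1 << bits) - 1), f"0{bits}b")

def twos_complement(n, bits=8):
    """Negative values are the ones complement plus one (n modulo 2**bits)."""
    return format(n & ((1 << bits) - 1), f"0{bits}b")
```

For -5 in 8 bits the three functions return 10000101, 11111010, and 11111011 respectively, while +5 is 00000101 in all three.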



PRIMITIVE DATA TYPES, FLOATING POINT
▪ Model real numbers, but only as approximations
▪ Precision
▪ range

▪ Languages for scientific use support at least two floating-point
types (e.g., float and double; sometimes more)
▪ Usually exactly like the hardware, but not always
▪ IEEE Floating-Point Standard 754
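
A short Python experiment shows what "only as approximations" means in practice. It assumes that Python's float is an IEEE 754 double, which holds on common platforms; the helper double_bits is a hypothetical name introduced here to expose the sign, exponent, and fraction fields.

```python
import math
import struct

total = 0.1 + 0.2
exact = (total == 0.3)                 # False: neither side is stored exactly
close = math.isclose(total, 0.3)       # True: compare with a tolerance instead

def double_bits(x):
    """Return the 64-bit IEEE 754 pattern of x as (sign, exponent, fraction)."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]
```

The 11 exponent bits determine the range and the 52 fraction bits determine the precision, which is why the two are listed as separate properties above.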



PRIMITIVE DATA TYPES, COMPLEX
▪ Some languages support a complex type
  ▪ e.g., C99, Fortran, and Python

▪ Each value consists of two floats, the real part and the
imaginary part



PRIMITIVE DATA TYPES, DECIMAL
▪ For business applications (money)
▪ Essential to COBOL
▪ C# offers a decimal data type

▪ Store a fixed number of decimal digits, in coded form (BCD)
  ▪ one digit per byte, or packed two digits per byte
▪ Operations are done in hardware or simulated in software
▪ Advantage: accuracy
▪ Disadvantages: limited range, wastes memory (6 digits == 24
bits)
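
The packed form and the accuracy advantage can both be illustrated in Python. The functions pack_bcd and unpack_bcd are hypothetical helpers written for this sketch; Decimal is the standard library's decimal type, used here only to show exact decimal arithmetic.

```python
from decimal import Decimal

def pack_bcd(digits):
    """Pack a string of decimal digits two per byte (4 bits per digit)."""
    if len(digits) % 2:
        digits = "0" + digits              # pad to a whole number of bytes
    return bytes(int(digits[i]) << 4 | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

def unpack_bcd(packed):
    """Recover the digit string from packed BCD bytes."""
    return "".join(f"{byte >> 4}{byte & 0x0F}" for byte in packed)

exact_sum = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")   # True
```

pack_bcd("123456") occupies three bytes, that is 24 bits for six digits, whereas a 24-bit binary integer could hold values up to 16,777,215.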



PRIMITIVE DATA TYPES, BOOLEAN
▪ Simplest of all
▪ Range of values: two elements
▪ “true”
▪ “false”

▪ C89 has no Boolean data type (numeric expressions serve as conditions); C99 added _Bool

▪ Advantage:
▪ Readability over integers to represent switches or flags

▪ Could be implemented as bits, but often as bytes


▪ Why?



PRIMITIVE DATA TYPES, CHARACTER
▪ Stored as numeric coding
▪ Most commonly used coding: ASCII
▪ An alternative, Unicode
▪ Includes characters from most natural
languages
▪ 16-bit coding(UCS-2)
▪ 32-bit Unicode (UCS-4)
▪ Originally used in Java
▪ Now supported by many languages
▪ Supported by Fortran, starting with 2003



6.3 CHARACTER STRING TYPES
▪ Values are sequences of characters

▪ Typical operations:
▪ Assignment and copying
▪ Comparison (=, >, etc.)
▪ Catenation
▪ Substring reference(slice)
▪ Pattern matching

▪ Design issues:
▪ Is it a primitive type or just a special kind of array?
▪ Should the length of strings be static or dynamic?



CHARACTER STRING TYPE IN LANGUAGES
▪ C and C++
  ▪ Not primitive
  ▪ Use char arrays and a library of functions that provide operations
▪ Java (and C#, Ruby, and Swift)
  ▪ Primitive via the String class
▪ Fortran and Python
  ▪ Primitive type with assignment and several operations
▪ Perl, JavaScript, Ruby, and PHP
  ▪ built-in pattern matching, using regular expressions



CHARACTER STRING LENGTH OPTIONS
▪ Static:
  ▪ Java (immutable)
  ▪ length cannot be changed after the string is created
  ▪ requires no special dynamic storage allocation
▪ Limited dynamic:
  ▪ C and C++
  ▪ any number of chars, from 0 up to a fixed maximum
  ▪ either maintain the length or, as in C, use a special
  end-of-string character
  ▪ requires no special dynamic storage allocation
▪ Dynamic length:
  ▪ JavaScript, Perl
  ▪ Variable length with no maximum
  ▪ Dynamic storage: must grow and shrink dynamically.
    ▪ Adjacent cells (mostly used): overhead in allocation and deallocation
    ▪ Linked list or array of char pointers: extra storage, complex operations
▪ A descriptor is the collection of the attributes of a variable:
  ▪ Static: built at compilation time as part of the symbol table
  ▪ Dynamic: part or all has to be maintained at run time



CHARACTER STRING TYPE EVALUATION
▪ Aid to writability
▪ As a primitive type with static length, they are inexpensive to
provide--why not have them?
▪ Dynamic length is nice, but is it worth the expense?



IMMUTABLE STRINGS IN RUBY

>> greeting = 'Hello'
=> "Hello"

>> greeting
=> "Hello"

>> greeting.object_id
=> 70101471431160

>> whazzup = greeting
=> "Hello"

>> greeting = 'Dude!'
=> "Dude!"

>> puts whazzup
Hello
=> nil



6.11 POINTER AND REFERENCE TYPES
▪ A pointer type variable has a range of values that consists of
memory addresses and a special value, nil
▪ Provide the power of indirect addressing
▪ Provide a way to manage dynamic memory
▪ storage is allocated from the heap

▪ Design issues:
▪ What are the scope and lifetime of a pointer variable?
▪ What is the lifetime of a heap-dynamic variable?
▪ Are pointers restricted as to the type of value to which they can
point?
▪ Are pointers used for dynamic storage management, indirect
addressing, or both?
▪ Should the language support pointer types, reference types, or
both?



POINTER OPERATIONS
▪ Two fundamental operations: assignment and dereferencing
▪ Assignment is used to set a pointer variable’s value to some
useful address
▪ Dereferencing yields the value stored at the location
represented by the pointer’s value
▪ Dereferencing can be explicit or implicit



EXAMPLE: POINTERS IN C AND C++
▪ Explicit dereferencing with the (*) operator; addresses are taken with the address-of (&) operator
int *ptr = &x;   // ptr holds the address of x
j = *ptr;        // j gets the value stored at that address
▪ Extremely flexible but must be used with care
▪ Pointers can point at any variable regardless of when or where it was allocated
▪ Pointer arithmetic is possible
▪ Domain type need not be fixed
▪ void * can point to any type and can’t be type checked (cannot be dereferenced)



PROBLEMS WITH POINTERS
▪ Aliasing
▪ Lost heap-dynamic variable (memory leakage)
▪ An allocated heap-dynamic variable that is no longer accessible
to the user program (often called garbage)
▪ Dangling pointers (dangerous)
▪ A pointer points to a heap-dynamic variable that has been
deallocated

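float *stuff = new float[100];
float *p;
stuff = new float[1000];  // the 100-element array is now unreachable: a lost heap-dynamic variable
p = stuff;                // p and stuff are aliases for the same array
delete [] stuff;          // the array is deallocated, so p is now a dangling pointer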



REFERENCE TYPES
▪ C++ includes a special kind of pointer type called a reference
type that is used primarily for formal parameters
▪ Advantages of both pass-by-reference and pass-by-value

▪ Java extends C++’s reference variables and allows them to replace pointers entirely
▪ References are references to objects, rather than being addresses

▪ C# includes both the references of Java and the pointers of C++
▪ What about Python?
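
One possible answer for Python, sketched with plain lists standing in for heap-dynamic arrays: every variable is a reference to an object, so assignment creates an alias rather than a copy. There is no explicit deallocation of the object, so the dangling pointer from the C++ example does not arise in this form. The names stuff and p are borrowed from that example purely for illustration.

```python
stuff = [0.0, 0.0, 0.0]
p = stuff                    # p and stuff now refer to the same list object

p[0] = 1.5                   # a change made through p ...
first_via_stuff = stuff[0]   # ... is visible through stuff (aliasing)
aliased = p is stuff

stuff = [0.0] * 5            # rebinding stuff does not destroy the old list
still_valid = p              # p keeps the old list alive, so nothing dangles
```

Aliasing is still present, so the readability concerns listed under "Problems with pointers" apply to mutable objects in Python as well.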



REFERENCES IN RUBY
▪ Pointer arithmetic, as in C, is not possible with Ruby.
▪ Some types are immutable

>> number = 3
=> 3
>> number
=> 3
>> number = 2 * number
=> 6
>> number
=> 6



6.4 ENUMERATION TYPES
▪ All possible values, which are
named constants, are provided in
the definition
▪ Design issues
▪ Is an enumeration constant allowed
to appear in more than one type
definition, and if so, how is the type
of an occurrence of that constant
checked?
▪ Are enumeration values coerced to
integer?
▪ Is any other type coerced to an
enumeration type?



EXAMPLE, C#



EVALUATION OF ENUMERATED TYPE
▪ Aid to readability,
▪ no need to code a color as a number

▪ Aid to reliability,
▪ compiler can check:
▪ operations (don’t allow colors to be added)
▪ No enumeration variable can be assigned a value outside its defined
range
▪ In C#, F#, Swift, and Java 5.0, enumeration type variables :
▪ are not coerced into integer types
▪ can’t be assigned a value outside the predefined range.
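
Python's standard enum module behaves much like the languages listed above, except that the checks happen at run time rather than at compile time. A minimal sketch (Color and the two helper functions are illustrative names, not taken from the slides):

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


def try_add(a, b):
    """Return a + b, or the exception name if the operation is rejected."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__


def try_make(value):
    """Return Color(value), or the exception name if value is out of range."""
    try:
        return Color(value)
    except ValueError as exc:
        return type(exc).__name__


add_result = try_add(Color.RED, Color.GREEN)  # colors cannot be added
out_of_range = try_make(7)                    # 7 is not a defined Color
equals_int = (Color.RED == 1)                 # no coercion to integer
```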



6.5 ARRAY TYPES
▪ An array is a homogeneous aggregate of data elements in
which an individual element is identified by its position in the
aggregate, relative to the first element.



ARRAY TYPES
▪ Design issues:
▪ When does allocation take place?
▪ What types are legal for subscripts?
▪ What is the maximum number of subscripts?
▪ When are subscript ranges bound?
▪ Are subscripting expressions in element references range
checked?
▪ Are ragged or rectangular multidimensional arrays allowed, or
both?
▪ Is any kind of slice supported?



ARRAY INDEXING
▪ Indexing (or subscripting) is a mapping from indices to
elements
array_name (index_value_list) → an element
▪ Index Syntax
▪ Fortran and Ada use parentheses
▪ Ada explicitly uses parentheses to show uniformity between array
references and function calls because both are mappings
▪ Most other languages use brackets

▪ In some languages, the lower bound of the subscript range is implicit
▪ Perl allows negative subscripts
▪ Offset from the end of the array



ARRAY CATEGORIES
▪ When are subscript type/ranges bound?
▪ Static: subscript ranges are statically bound and storage allocation is
static (before run-time)
▪ Advantage: efficiency (no dynamic allocation)
▪ eg. C/C++ static arrays
▪ Fixed stack-dynamic: subscript ranges are statically bound, but the
allocation is done at declaration/elaboration time during execution
▪ Advantage: space efficiency
▪ eg. C/C++ local arrays declared in functions
▪ Fixed heap-dynamic: like fixed stack-dynamic, the subscript ranges and the storage binding are fixed after allocation, but both are bound when the program requests them at run time
▪ storage is allocated from the heap, not the stack
▪ Advantage: flexibility – allocated space fits the problem
▪ eg. C/C++ arrays allocated with malloc or new
▪ Heap-dynamic: binding of subscript ranges and storage allocation is
dynamic and can change any number of times
▪ Advantage: flexibility (arrays can grow or shrink during program execution)
▪ eg. Java ArrayList



ARRAY INITIALIZATION
▪ Some languages allow initialization at the time of storage allocation
▪ C# example:
int[] list = {4, 5, 7, 83};
▪ C and C++ examples
char name [] = "freddie";
char *names [] = {"Bob", "Jake", "Joe"};
▪ Java example
String[] names = {"Bob", "Jake", "Joe"};
▪ Python
list = [x ** 2 for x in range(12) if x % 3 == 0]   # [0, 9, 36, 81]



SLICES
▪ A slice is some substructure of an array; nothing more than a
referencing mechanism
▪ Slices are only useful in languages that have array operations
▪ Python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
▪ vector[3:6] is a three-element array: [8, 10, 12]
▪ mat[0][0:2] is the first and second elements of the first row of mat: [1, 2]

▪ Ruby supports slices with the slice method
▪ list.slice(2, 2) returns the third and fourth elements of list
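
The Python slices above can be checked directly. The sketch below reuses vector and mat from the slide; the other variable names are illustrative. Note that a slice of a Python list is a new list (a copy), not a reference into the original.

```python
vector = [2, 4, 6, 8, 10, 12, 14, 16]
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

middle = vector[3:6]                     # elements at indices 3, 4 and 5
row_part = mat[0][0:2]                   # first two elements of the first row
second_column = [row[1] for row in mat]  # nested lists have no column slice
from_end = vector[-2:]                   # negative subscripts count from the end
every_other = vector[0:7:2]              # a third value gives the step

middle[0] = 99                           # modifies the copy only
original_unchanged = vector[3]
```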



RECTANGULAR AND JAGGED ARRAYS
▪ A rectangular array is a multi-dimensioned array in which all
of the rows have the same number of elements and all
columns have the same number of elements
▪ A jagged matrix has rows with varying number of elements
▪ Possible when multi-dimensioned arrays actually appear as arrays
of arrays
▪ C, C++, C# and Java support jagged arrays



ARRAY IMPLEMENTATION, SINGLE
DIMENSIONED
▪ Addressing array elements
▪ Use information in the descriptor
▪ Static → compile time!

[Figure: compile-time descriptor for single-dimensioned arrays]



ARRAY IMPLEMENTATION, MULTI-
DIMENSIONED
▪ An actual address value requires finding the number of
preceding elements
▪ Two common ways:
▪ Row major order (by rows) – used in most languages
▪ Column major order (by columns) – used in Fortran

[Figure: the location of the [i, j] element in a matrix]
[Figure: a compile-time descriptor for a multidimensional array]


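
The address computations behind these descriptors can be sketched as small functions. This is an illustration under stated assumptions, not the slides' own figure: base is the address of the first element, lower bounds come from the descriptor, and all function and parameter names are hypothetical.

```python
def address_1d(base, k, lower_bound, element_size):
    """Address of list[k] in a single-dimensioned array."""
    return base + (k - lower_bound) * element_size


def address_row_major(base, i, j, row_lb, col_lb, n_cols, element_size):
    """Address of a[i, j] when the matrix is stored row by row."""
    preceding = (i - row_lb) * n_cols + (j - col_lb)
    return base + preceding * element_size


def address_column_major(base, i, j, row_lb, col_lb, n_rows, element_size):
    """Address of a[i, j] when the matrix is stored column by column."""
    preceding = (j - col_lb) * n_rows + (i - row_lb)
    return base + preceding * element_size
```

For a 3 x 4 matrix of 4-byte elements starting at address 1000 with zero lower bounds, element [1, 2] has 6 preceding elements in row major order and 7 in column major order.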
HIGHER DIMENSIONAL ARRAYS
▪ Color images can be viewed as a 3D array of pixel values



ASSOCIATIVE ARRAYS
▪ An associative array is an unordered collection of data
elements that are indexed by an equal number of values
called keys
▪ User-defined keys must be stored

▪ Design issues:
▪ What is the form of references to elements?
▪ Is the size static or dynamic?

▪ Built-in type in Perl, Python, Ruby, and Swift
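
As a brief illustration, Python's built-in dict is an associative array whose size is dynamic and whose elements are referenced with subscript syntax. The names and values below are made up for the example.

```python
salaries = {"Gary": 75000, "Perry": 57000}

salaries["Mary"] = 55750         # adding an element grows the structure
del salaries["Gary"]             # removing an element shrinks it
has_perry = "Perry" in salaries  # membership test on the keys
perry_salary = salaries["Perry"] # reference by key
names = sorted(salaries)         # the keys, in sorted order
```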



SUMMARY
▪ The data types of a language are a large part of what
determines that language’s style and usefulness
▪ The primitive data types of most imperative languages
include numeric, character, and Boolean types
▪ The user-defined enumeration and subrange types are
convenient and add to the readability and reliability of
programs
▪ Arrays are included in most languages
▪ Pointers are used for addressing flexibility and to control
dynamic storage management



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch7 : Expressions and Assignment

Dr. Nada Mobark


GOAL
Understand the semantics
of operators, expression
evaluation, type
conversions, and
assignment.



TOPICS
7.1 Introduction
7.2 Arithmetic Expressions
7.3 Overloaded Operators
7.4 Type conversions
7.5 Relational and Boolean Expressions
7.6 Short-Circuit Evaluation
7.7 Assignment Statements



7.1 INTRODUCTION
▪ Arithmetic evaluation was one of the motivations for the
development of the first programming languages
▪ Expressions are the fundamental means of specifying
computations in a programming language
▪ To understand expression evaluation, one needs to be familiar with
the orders of operator and operand evaluation



7.2 ARITHMETIC EXPRESSIONS
▪ Similar to mathematics, arithmetic expressions consist of
operators, operands, parentheses, and function calls
▪ Design issues:
▪ Types of operators?
▪ Operator precedence rules?
▪ Operator associativity rules?
▪ Order of operand evaluation?
▪ Operand evaluation side effects?
▪ Operator overloading?
▪ Type mixing in expressions?



OPERATORS, CATEGORIES
▪ Example, Ruby



OPERATORS, NUMBER OF OPERANDS
▪ A unary operator has one operand
▪ A binary operator has two operands
▪ A ternary operator has three operands
average = (count == 0)? 0 : sum / count



OPERATORS, NOTATION
▪ In most languages, binary operators are infix, except in
Scheme and LISP, in which they are prefix; Perl also has some
prefix binary operators
(Infix) a + b * c → (prefix) ??

▪ Most unary operators are prefix, but the ++ and -- operators in C-based languages can be either prefix or postfix



OPERATORS, PRECEDENCE
▪ The operator precedence rules for expression evaluation
define the order in which “adjacent” operators of different
precedence levels are evaluated
▪ Example:

What is the relative precedence of the unary minus ??



EXAMPLES



OPERATORS, ASSOCIATIVITY
▪ The operator associativity rules for expression evaluation
define the order in which adjacent operators with the same
precedence level are evaluated
▪ Typical associativity rules
▪ Left to right, except **, which is right to left
▪ Sometimes unary operators associate right to left
▪ In APL, all operators have equal precedence and all operators
associate right to left
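
These rules can be observed in Python, which follows the typical scheme (left to right, with ** associating right to left and binding tighter than unary minus). The variable names are illustrative.

```python
right_assoc = 2 ** 3 ** 2      # parsed as 2 ** (3 ** 2)
left_grouped = (2 ** 3) ** 2   # parentheses force the other grouping
unary_vs_power = -2 ** 2       # ** binds tighter than unary minus: -(2 ** 2)
left_assoc = 10 - 4 - 3        # parsed as (10 - 4) - 3
mixed = 2 + 3 * 4              # * has higher precedence than +
```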



OPERATORS, ASSOCIATIVITY
▪ With floating-point numbers, the order in which operations are grouped can cause overflow
▪ A & C → very large +ve values, B & D → very large –ve values

A + C + B + D
▪ Evaluated left to right, A + C may overflow
▪ Programmers can alter the precedence and associativity rules by
placing parentheses

(A + B) + (C + D)
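
A small sketch of this effect with Python floats (IEEE double precision), where overflow in addition produces infinity rather than an error. The magnitudes are chosen only to trigger the overflow.

```python
import math

A = C = 1.0e308     # very large positive values
B = D = -1.0e308    # very large negative values

left_to_right = A + C + B + D   # A + C overflows to inf before B and D are added
regrouped = (A + B) + (C + D)   # each pair cancels, so nothing overflows

overflowed = math.isinf(left_to_right)
```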



7.5 RELATIONAL AND BOOLEAN EXPRESSIONS

▪ A relational operator is an operator that compares the values of its two operands
▪ Relational Expressions
▪ Use relational operators and operands of various types
▪ Evaluate to some Boolean representation

▪ Operator symbols used vary somewhat among languages; for inequality: !=, /=, ~=, .NE., <>, #



EXAMPLE, RUBY



BOOLEAN EXPRESSIONS
▪ Boolean Expressions
▪ Operands are Boolean and
the result is Boolean
▪ Example operators : AND,
OR, &&, ||



EXAMPLE, RUBY



BOOLEAN EXPRESSIONS
▪ C89 has no Boolean type -- it uses int type with 0 for false
and nonzero for true
▪ eg, a < b < c is a legal expression:
▪ The left operator is evaluated first, producing 0 or 1
▪ The result is then compared with the third operand (i.e., c)
▪ b is never compared with c
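
Python treats a < b < c differently (as a chained comparison), so the C behaviour has to be mimicked explicitly to see it. The two helper functions below are illustrative names only.

```python
def c_style_chain(a, b, c):
    """Mimic C's (a < b) < c: the first comparison yields 0 or 1."""
    return int(a < b) < c


def python_chain(a, b, c):
    """Python reads a < b < c as (a < b) and (b < c)."""
    return a < b < c


c_result = c_style_chain(5, 10, 1)    # (5 < 10) is 1, and 1 < 1 is false
c_surprise = c_style_chain(10, 5, 3)  # (10 < 5) is 0, and 0 < 3 is true
py_result = python_chain(10, 5, 3)    # false, because 10 < 5 fails
```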



PRECEDENCE
▪ Arithmetic expressions can be the operands of relational
expressions, and relational expressions can be the operands
of Boolean expressions → different precedence levels



SHORT CIRCUIT EVALUATION
▪ An expression in which the result is determined without
evaluating all of the operands and/or operators
▪ Examples;
(13 * a) * (b / 13 - 1)
▪ If a is zero, there is no need to evaluate (b / 13 - 1)
(a > b) || (b++ / 3)
▪ b is not incremented when a > b



SHORT CIRCUIT EVALUATION
▪ AND operation does not short circuit in
▪ FORTRAN (1956)
▪ BASIC (1964) and VB
▪ Pascal (1970)
▪ SQL (1974)

▪ Problem with non-short-circuit evaluation

index = 0;
while ((index < length) && (LIST[index] != value))
    index++;

▪ When index = length, LIST[index] is still evaluated and causes an indexing problem
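
The same search can be sketched in Python, where and short-circuits while the & operator on Booleans evaluates both operands. The function names are illustrative; the eager version is written only to show the failure.

```python
def find_short_circuit(items, value):
    index = 0
    # `and` skips items[index] once index reaches len(items)
    while index < len(items) and items[index] != value:
        index += 1
    return index


def find_eager(items, value):
    index = 0
    # `&` evaluates both operands, so items[len(items)] is attempted
    while (index < len(items)) & (items[index] != value):
        index += 1
    return index


def outcome(search, items, value):
    """Run a search and report an indexing failure by name."""
    try:
        return search(items, value)
    except IndexError:
        return "IndexError"
```

With a value that is absent from the list, the short-circuit version returns the list length, while the eager version fails on the out-of-range subscript.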



OPERAND, EVALUATION ORDER
▪ Variables: fetch the value from memory
▪ Constants: sometimes a fetch from memory; sometimes the
constant is in the machine language instruction
▪ Parenthesized expressions: evaluate all operands and
operators first
▪ The most interesting case is when an operand is a function call
▪ May be subject to side effects!



POTENTIALS FOR SIDE EFFECTS
▪ Functional side effects: when a function changes a two-way
parameter or a non-local variable
▪ Problem with functional side effects:
▪ When a function referenced in an expression alters another
operand of the expression;

a = 10;
b = a + fun(&a);
//assume fun returns 10 and changes
//its parameter to 20
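
The effect of operand evaluation order on this example can be sketched in Python, which evaluates operands left to right. A one-element list stands in for the C variable passed by address; that modelling choice and the variable names are assumptions made for illustration.

```python
def fun(cell):
    """Return 10 and, as a side effect, set the caller's variable to 20."""
    cell[0] = 20
    return 10


a = [10]
left_first = a[0] + fun(a)   # a[0] is fetched before fun runs: 10 + 10

a = [10]
call_first = fun(a) + a[0]   # fun runs first, so a[0] is already 20: 10 + 20
```

The two results differ, which is exactly why the order of operand evaluation matters when functions have side effects.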



SOLUTIONS TO FUNCTIONAL SIDE EFFECTS
▪ Write the language definition to disallow functional side
effects
▪ No two-way parameters in functions
▪ No non-local (global) references in functions
▪ Advantage: it works!
▪ Disadvantage: inflexibility of one-way parameters and lack of non-local references
▪ Write the language definition to demand that operand
evaluation order be fixed
▪ Java requires that operands appear to be evaluated in left-to-right
order
▪ Disadvantage: limits some compiler optimizations



REFERENTIAL TRANSPARENCY
▪ A program has the property of referential transparency if any
two expressions in the program that have the same value can
be substituted for one another anywhere in the program,
without affecting the action of the program
▪ Advantage: Semantics of a program is much easier to
understand
▪ eg.
result1 = (fun(a) + b) / (fun(a) - c);
temp = fun(a);
result2 = (temp + b) / (temp - c);
▪ If fun has no side effects, result1 = result2
▪ Otherwise they may differ, and referential transparency is violated
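
A sketch of the violation in Python, assuming a hypothetical fun that counts its own calls (a hidden side effect). make_fun exists only so that each computation starts from a fresh counter.

```python
def make_fun():
    calls = 0

    def fun(a):
        nonlocal calls
        calls += 1           # side effect: hidden state changes on every call
        return a + calls

    return fun


a, b, c = 10, 2, 1

fun = make_fun()
result1 = (fun(a) + b) / (fun(a) - c)   # fun(a) yields 11, then 12

fun = make_fun()
temp = fun(a)                           # fun(a) yields 11, once
result2 = (temp + b) / (temp - c)
```

Replacing the two calls by temp changes the result, so fun(a) cannot be treated as a value that is interchangeable with an equal expression.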



7.4 TYPE CONVERSIONS
▪ A narrowing conversion is one that converts an object to a
type that cannot include all of the values of the original type
e.g., float to int
▪ A widening conversion is one in which an object is converted
to a type that can include at least approximations to all of the
values of the original type e.g., int to float



TYPE CONVERSIONS: IMPLICIT
▪ A mixed-mode expression is one that has operands of
different types
▪ A coercion is an implicit type conversion
▪ In most languages, all numeric types are coerced in expressions,
using widening conversions

int a;
float b, c, d;
. . .
d = b * a;
▪ Disadvantage of coercions:
▪ They decrease the type error detection ability of the compiler

▪ In ML, Ada, and F#, there are no coercions in expressions → increased error detection



TYPE CONVERSIONS : EXPLICIT
▪ Called casting in C-based languages
▪ Examples
▪ C: (int)angle
▪ F#: float(sum)

Note that F#’s syntax is similar to that of function calls
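
Python also uses function-call syntax for explicit conversions. The sketch below contrasts a coercion, an explicit narrowing conversion, and a widening conversion that only approximates the original value; the variable names are illustrative.

```python
widened = 3 + 2.5          # the int operand is coerced to float
narrowed = int(3.99)       # explicit narrowing truncates toward zero
explicit_wide = float(7)   # explicit widening


def mixed_text_and_number():
    """Python does not coerce between str and int."""
    try:
        return "3" + 4
    except TypeError:
        return "TypeError"


no_coercion = mixed_text_and_number()

big = 2 ** 53 + 1
approximated = float(big)  # widening to float may only approximate the int
```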



COERCION MADNESS!!
▪ JavaScript Example



7.7 ASSIGNMENT STATEMENTS
▪ The general syntax
<target_var> <assign_operator> <expression>

▪ The assignment operator
▪ = Fortran, BASIC, the C-based languages
▪ := Ada

▪ = can be confusing when it is also overloaded as the relational operator for equality
▪ that’s why the C-based languages use == as the relational operator



COMPOUND ASSIGNMENT OPERATORS
▪ A shorthand method of specifying a commonly needed form
of assignment
▪ Introduced in ALGOL; adopted by C and the C-based
languages
▪ Example
a = a + b
▪ can be written as
a += b

counter = 2
while counter < 68
  puts counter
  counter **= 2
end



UNARY ASSIGNMENT OPERATORS
▪ Unary assignment operators in C-based languages combine
increment and decrement operations with assignment
▪ Examples
sum = ++count   // count incremented, then assigned to sum
sum = count++   // count assigned to sum, then incremented
count++         // count incremented
-count++        // equivalent to -(count++): count is incremented; the expression's value is the negation of the old count
▪ Ruby does not support the ++ operator!



MULTIPLE ASSIGNMENTS
▪ Perl and Ruby allow multiple-target multiple-source
assignments
($first, $second, $third) = (20, 30, 40);
▪ Also, the following is legal and performs an interchange:
($first, $second) = ($second, $first);
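
Python offers the same facility through tuple assignment. A minimal sketch mirroring the two Perl statements above (variable names follow the Perl example without the sigils):

```python
first, second, third = 20, 30, 40

# The right-hand side is evaluated completely before any target is assigned,
# so this interchanges the two values without a temporary variable.
first, second = second, first
```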



ASSIGNMENT, CONDITIONAL TARGETS
▪ In Perl
($flag ? $total : $subtotal) = 0;

▪ equivalent to
if ($flag) {
    $total = 0;
} else {
    $subtotal = 0;
}



ASSIGNMENT AS AN EXPRESSION
▪ In the C-based languages, Perl, and JavaScript, the assignment statement produces a result and can be used as an operand
▪ Examples:
▪ while ((ch = getchar()) != EOF) {…}
▪ a = b + (c = d / b) - 1
▪ sum = count = 0;
▪ if (x = y) ... (legal in C, but usually a mistyped x == y)

▪ Java and C# allow only boolean expressions in their if statements
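
Python takes a middle path, sketched below: plain = is a statement and is rejected inside a condition, while the separate := operator (Python 3.8 and later) yields a value. An in-memory stream stands in for getchar(), and the function names are illustrative.

```python
import io


def read_chars(stream):
    chars = []
    # := assigns and yields the value, much like (ch = getchar()) != EOF in C
    while (ch := stream.read(1)) != "":
        chars.append(ch)
    return chars


collected = read_chars(io.StringIO("abc"))


def plain_assignment_in_condition_compiles():
    """Check whether `if x = y:` is accepted by the Python compiler."""
    try:
        compile("if x = y:\n    pass\n", "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


allowed = plain_assignment_in_condition_compiles()

total = count = 0   # chained assignment is still available as a statement
```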



7.3 OVERLOADED OPERATORS
▪ Use of an operator for more than one purpose is called
operator overloading
▪ Some are common (e.g., + for addition and string concatenation)
▪ The compiler will choose the correct meaning based on the types
of the operands



OVERLOADED OPERATORS
▪ C++, C#, and Ruby allow user-defined overloaded operators
▪ Not allowed in Java!



OVERLOADED OPERATORS, PROS
▪ When sensibly used → aid to readability (avoid method calls,
expressions appear natural)
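
Python also allows user-defined overloading, through specially named methods. A minimal sketch with a hypothetical Vec2 class, showing how + can read naturally for a user-defined type:

```python
class Vec2:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Called for vec + other
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for vec == other
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


total = Vec2(1, 2) + Vec2(3, 4)   # reads like arithmetic instead of a method call
text = "ab" + "cd"                # the built-in overload of + for strings
```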



OVERLOADED OPERATORS, CONS
▪ potential troubles:
▪ Loss of compiler error detection
▪ omission of an operand is not an error!
▪ eg.
▪ & → addressOf / Bitwise AND
▪ - → subtraction, unary minus
▪ loss of readability
▪ Users can define nonsense operations
▪ Using a symbol that is unrelated to the operation
▪ Building a system from modules written by different groups
▪ The same operators may be overloaded in different ways
▪ These differences need to be eliminated before the modules are combined



SUMMARY
▪ Expressions are the fundamental means of specifying computations in a language.
▪ Operator precedence and associativity rules define the order of evaluation within expressions.
▪ Operator overloading is a feature that improves writability but
may have a negative effect on code readability.
▪ Various forms of assignment operators are supported in
different languages.



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch8 : Statement-Level Control Structures

Dr. Nada Mobark


GOAL
Examine the flow of control
among statements



TOPICS
8.1 Introduction
8.2 Selection Statements
▪ Two-way
▪ Multiple-way

8.3 Iterative Statements
▪ Counter/Logically controlled Loops
▪ User-controlled mechanisms
▪ Iteration based on Data Structures

8.4 Unconditional Branching



8.1 LEVELS OF CONTROL FLOW
▪ Control statements are the statements which alter the flow of
execution and provide better control to the programmer on
the flow of execution
▪ Levels:
▪ Within expressions (Chapter 7):
▪ Associativity, precedence rules
▪ Among program statements (this chapter)
▪ Selection & looping
▪ Among program units (Chapter 9)
▪ Sub-programs



STATEMENT-LEVEL CONTROL
• Together, they form flexible, complex, and powerful program logic.
• More structures → higher writability
• Too few structures → may affect readability



CONTROL STATEMENTS: DESIGN
▪ Goal: what is the best collection of control structures that
provides the required capability and the desired writability?
▪ Design issue
▪ Should a control structure have multiple entries?
▪ goto, labels
▪ What about exit?
▪ no explicit exit → all exits from a control structure are restricted to
transferring control to the first statement following the structure → no
harm to readability and also no danger.



8.4 UNCONDITIONAL BRANCHING
▪ Transfers execution control to a
specified place in the program
▪ Represented one of the most heated
debates in 1960’s and 1970’s
▪ Major concerns: Readability,
maintainability
▪ Java : Not supported
▪ C# : goto statement can be used in
switch statements.
▪ Any program that uses a goto can be
rewritten so that it doesn't need the
goto



C++ EXAMPLE



8.2 SELECTION STATEMENTS
▪ A selection statement provides the means of choosing
between two or more paths of execution
▪ Two general categories:
▪ Two-way selectors
▪ Multiple-way selectors



TWO-WAY SELECTION STATEMENTS
▪ General form:
if control_expression
    then clause
    else clause

▪ Design Issues:
▪ What is the form and type of the control expression?
▪ How are the then and else clauses specified?
▪ How should the meaning of nested selectors be specified?



TWO-WAY SELECTION, THE CONTROL EXPRESSION
▪ If the then reserved word or some other syntactic marker is used to
introduce the then clause, the control expression need not be placed in parentheses

▪ Expression :
▪ In C89, C99, Python, and C++, the control expression can be
arithmetic
▪ In most other languages, the control expression must be Boolean



TWO-WAY SELECTION, CLAUSE FORM
▪ In many contemporary languages, the then and else clauses
can be single statements or compound statements
▪ In Perl, must be compound → all clauses must be delimited by
braces
▪ In Python and Ruby, clauses are statement sequences → ??



TWO-WAY SELECTION, NESTING SELECTORS

▪ Java example
if (sum == 0)
    if (count == 0)
        result = 0;
else result = 1;
▪ Which if gets the else?
▪ static semantics rule: else matches the nearest previous unmatched if
▪ To force an alternative semantics, compound statements may be used
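
In Python the question is settled by indentation, so both readings can be written unambiguously. The two functions below are an illustrative sketch; total stands in for sum, which is the name of a built-in function in Python.

```python
def inner_else(total, count):
    result = None
    if total == 0:
        if count == 0:
            result = 0
        else:               # belongs to the inner if
            result = 1
    return result


def outer_else(total, count):
    result = None
    if total == 0:
        if count == 0:
            result = 0
    else:                   # belongs to the outer if
        result = 1
    return result
```

inner_else corresponds to the Java rule above; outer_else is the alternative semantics that Java would need braces to express.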



NESTING SELECTORS - RUBY
▪ Statement sequences as clauses,
▪ use of a special word resolves ambiguity and adds to the readability:
if sum == 0 then
  if count == 0 then
    result = 0
  else
    result = 1
  end
end



NESTING SELECTORS - RUBY

▪ Ruby : If - elsif Statement:
▪ used to make more complex branching statements.



MULTIPLE-WAY SELECTION STATEMENTS
▪ Allow the selection of one of any number of statements or
statement groups
▪ an n-way branch to statements of code, where n is the number of
selectable segments



MULTIPLE-WAY SELECTION USING IF
▪ Multiple Selectors
can appear as direct
extensions to two-way
selectors, using else-
if clauses → more
flexible



MULTIPLE-WAY SELECTION
▪ Design Issues:
▪ What is the form and type of the control expression?
▪ How are the selectable segments specified?
▪ Is execution flow through the structure restricted to include just a
single selectable segment?
▪ How are case values specified?
▪ What is done about unrepresented expression values?



THE SWITCH STATEMENT
▪ C, C++, Java, and JavaScript.
▪ The control expression and the constant expressions are some
discrete type:
▪ Integer
▪ Characters
▪ enumeration types

▪ Testing for equality.

▪ default clause
▪ for unrepresented values
▪ Optional



THE SWITCH STATEMENT - C
▪ Design choices for C’s switch statement
▪ Control expression can be only an integer type
▪ Selectable segments can be compound statements
▪ no implicit branch at the end of selectable segments
▪ Any number of segments can be executed in one execution of the construct → increase in flexibility, decrease in reliability
▪ The break statement (restricted goto) should be used for exiting



THE SWITCH STATEMENT - C#
▪ C#’s switch statement differs from C’s in that:
▪ it disallows the implicit execution of more than one segment
▪ each selectable segment must end with an unconditional branch (goto or break)
▪ the control expression and the case constants can be strings



MULTIPLE-WAY SELECTION - RUBY
▪ Case statement
▪ Allows range checking
▪ Implicit branch at the end of selectable segments



IMPLEMENTING MULTIPLE SELECTORS
▪ Approaches:
▪ Implement multiple conditional branches using hard coded labels

          goto branches
label1:   code for statement1
          goto out
. . .
labeln:   code for statementn
          goto out
default:  code for statementn+1
          goto out
branches: if t = constant_expression1 goto label1
. . .
          if t = constant_expressionn goto labeln
          goto default
out:



IMPLEMENTING MULTIPLE SELECTORS
▪ Approaches:
▪ Store case values in a table and use a linear search of the table
▪ When there are more than about ten cases, a hash table of case values can be used instead
▪ Use an array whose indices are the case values and whose values are the case labels
▪ Useful when the number of cases is small and more than half of the whole range of case values is represented
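
The array and hash-table approaches can be sketched in Python, with functions standing in for the labelled code segments. This only illustrates the lookup idea and is not how any particular compiler is implemented; all names are hypothetical.

```python
def on_zero():
    return "zero"


def on_one():
    return "one"


def on_three():
    return "three"


def on_default():
    return "other"


# Array approach: indices are the case values, entries are the "labels".
# The unrepresented value 2 is filled with the default segment.
JUMP_ARRAY = [on_zero, on_one, on_default, on_three]


def select_by_array(t):
    if 0 <= t < len(JUMP_ARRAY):
        return JUMP_ARRAY[t]()
    return on_default()


# Hash-table approach: only represented case values are stored.
JUMP_HASH = {0: on_zero, 1: on_one, 3: on_three}


def select_by_hash(t):
    return JUMP_HASH.get(t, on_default)()
```

The array gives constant-time selection but wastes entries when the case values are sparse, which is why it suits a small, densely populated range.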



8.3 ITERATIVE STATEMENTS
▪ The repeated execution of a statement or compound statement is accomplished either by iteration or recursion
▪ The body of an iterative statement is the collection of statements whose execution is controlled by the iteration statement

▪ General design issues for iteration control statements:
1. How is iteration controlled?
2. Where is the control mechanism in the loop?



LOGICALLY-CONTROLLED LOOPS
▪ Repetition control is based on a Boolean expression
▪ Design issues:
▪ Pretest or posttest?
▪ pretest to mean that the test for loop completion occurs before the loop
body is executed
▪ posttest to mean that it occurs after the loop body is executed.



LOGICALLY-CONTROLLED LOOPS
▪ C and C++ have both pretest
and posttest forms
▪ the control expression can be
arithmetic
▪ it is legal to branch into the
body of a logically-controlled
loop
▪ Java:
▪ the control expression must
be Boolean
▪ the body can only be entered
at the beginning ; Java has no
goto



COUNTER-CONTROLLED LOOPS
▪ A counting iterative statement has a loop variable and a means of specifying the initial, terminal, and step size values
▪ Design Issues:
▪ Should it be a special case of the logically controlled loop or a
separate statement?
▪ The loop variable :
▪ the type and scope
▪ Is it legal to be changed in the loop body, and if so, does the change
affect loop control?
▪ Should the loop parameters be evaluated only once, or once for every iteration?
▪ What is its value after loop termination?



COUNTER-CONTROLLED LOOPS – C-BASED
▪ Loop parameters:
▪ Initial, terminal, step-size specs of a loop variable

▪ Syntax:
for ([expr_1] ; [expr_2] ; [expr_3]) statement

▪ Semantics:
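
The usual meaning of the C-style for statement is: evaluate expr_1 once, test expr_2 before every iteration, execute the statement, then evaluate expr_3. The sketch below spells this out as a while loop in Python; the function c_style_for and its callable parameters are hypothetical names used only to make the order of evaluation visible.

```python
def c_style_for(init, test, step, body):
    """Run body under C's for-loop rules; every argument is a callable."""
    init()            # expr_1: evaluated once, before the loop
    while test():     # expr_2: evaluated before every iteration
        body()        # the loop statement
        step()        # expr_3: evaluated after every execution of the body


state = {"i": None, "visited": []}

# Equivalent in spirit to: for (i = 0; i < 3; i++) visited.append(i);
c_style_for(
    init=lambda: state.update(i=0),
    test=lambda: state["i"] < 3,
    step=lambda: state.update(i=state["i"] + 1),
    body=lambda: state["visited"].append(state["i"]),
)
```

The sketch does not model break or continue.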



COUNTER-CONTROLLED LOOPS - C
▪ C Design choices:
▪ There is no explicit loop variable
▪ The first expression is evaluated once, but the other two are
evaluated with each iteration
▪ If the second expression is absent, it is an infinite loop
▪ Everything can be changed in the loop
▪ It is legal to branch into the body of a for loop in C
▪ The expressions can be whole statements, or even statement
sequences, with the statements separated by commas



COUNTER-CONTROLLED LOOPS – C++
▪ C++ differs from C in two ways:
▪ The control expression can also be Boolean
▪ The initial expression can include variable definitions (scope is
from the definition to the end of the loop body)
▪ Java and C#
▪ Differ from C++ in that the control expression must be Boolean



USER-LOCATED LOOP CONTROL MECHANISMS
▪ Sometimes it is convenient for the programmers to decide a
location for loop control (other than top or bottom of the loop)
▪ Simple design for single loops (e.g., break)

▪ Design issues for nested loops
▪ Should the conditional be part of the exit?
▪ Should control be transferable out of more than one loop?



EXAMPLE - BREAK
▪ C, C++, Python, Ruby, C#, and Java have unconditional
unlabeled exits (break); Perl uses last
▪ Exit the innermost loop
▪ Can be used to quit infinite loops



EXAMPLE



EXAMPLE - CONTINUE
▪ C, C++, and Python have an unlabeled control statement,
continue, that skips the remainder of the current iteration,
but does not exit the loop



RUBY

i = 1
while true
  if i*5 >= 25
    break
  end
  puts i*5
  i += 1
end

for i in 5...11
  if i == 7 then
    next
  end
  puts i
end

Redo ??

ITERATION BASED ON DATA STRUCTURES
▪ The number of elements in a data structure controls loop
iteration
▪ The mechanism is a call to an iterator function that returns the next
element in some chosen order, if there is one; otherwise the loop
terminates
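
In Python this mechanism is the iterator protocol: a for loop calls iter() on the data structure and then next() repeatedly until the iterator is exhausted. A minimal sketch with a hypothetical Countdown class:

```python
class Countdown:
    """Iterable that yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator function: each yield hands the next element to the loop
        current = self.start
        while current > 0:
            yield current
            current -= 1


collected = [n for n in Countdown(3)]   # the loop ends when no element is left

it = iter(Countdown(2))                 # the same protocol, driven by hand
manual = [next(it), next(it)]
exhausted = next(it, "done")            # the default is returned at exhaustion
```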



EXAMPLE, C#
public interface IEnumerable<T>
▪ Implementing this interface allows an object to be the target of the
foreach statement.



SUMMARY
▪ Variety of statement-level structures
▪ Choice of control statements beyond selection and logical
pretest loops is a trade-off between language size and
writability



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch9 : Subprograms

Dr. Nada Mobark


GOAL
explore the
design/implementation of
subprograms and
parameter-passing
methods



TOPICS
9.1 Introduction
9.2 Fundamentals of Subprograms
9.3 Design Issues for Subprograms
9.5 Parameter-Passing Methods
9.6 Parameters That Are Subprograms
9.8 Design Issues for Functions



9.1 INTRODUCTION
▪ The first programmable computer design, Babbage’s Analytical Engine
▪ Designed in the 1830s and 1840s (never completed)
▪ Had the capability of reusing collections of instruction cards at several different places in a program



9.1 INTRODUCTION
▪ Two fundamental abstraction facilities
▪ Process abstraction
▪ Discussed in this chapter
▪ Emphasized from early days
▪ Advantages:
▪ abstract details, improve logical structure, better readability
▪ Reuse code, save coding time, save memory!
▪ Data abstraction
▪ OOP - Emphasized in the 1980s
▪ Discussed in depth in Chapter 11



9.2 FUNDAMENTALS OF SUBPROGRAMS
▪ Subprograms are collections of
statements that define
parameterized computations
▪ Each subprogram has a single
entry point
▪ The calling program is
suspended during execution of
the called subprogram
▪ Control always returns to the
caller when the called
subprogram’s execution
terminates



PROCEDURES VS. FUNCTIONS
▪ Two categories of subprograms
▪ Function:
▪ returns value(s)
▪ They are expected to produce no side
effects
▪ no change to the parameters, nor any
variable outside the function body
▪ In practice, functions have side
effects
▪ Procedure:
▪ does not return a value
▪ can produce results in the calling
program unit by two methods:
▪ through variables that are not formal
parameters but are still visible in both
the procedure and the calling
program unit,
▪ through formal parameters that allow
the transfer of data to the caller, those
parameters can be changed.



BASIC DEFINITIONS
▪ A subprogram header is the first
part of the definition
▪ including the name, the kind of
subprogram (procedure/function),
and the formal parameters
▪ The parameter profile (signature) is
the number, order, and types of its
parameters
▪ The protocol is a subprogram’s
parameter profile and, if it is a
function, its return type
▪ A subprogram definition describes
the interface to and the actions of
the subprogram abstraction
▪ A subprogram call is an explicit
request that the subprogram be
executed



EXAMPLES, C
▪ Function declarations in C and C++ are often called
prototypes
▪ A subprogram declaration provides the protocol, but not the body,
of the subprogram
▪ Such declarations are often placed in header files



METHODS IN RUBY



PARAMETERS
▪ Subprograms typically describe
computations that need data!

▪ There are two ways that a subprogram
can gain access to the data that it is to
process:
▪ through direct access to nonlocal variables
(declared elsewhere but visible in the
subprogram) → can reduce reliability
▪ through parameter passing → more flexible

▪ A formal parameter is a dummy variable listed
in the subprogram header and used in the
subprogram
▪ An actual parameter represents a value or
address used in the subprogram call
statement



ACTUAL/FORMAL PARAMETER
CORRESPONDENCE
▪ Positional
▪ The binding of actual parameters to formal
parameters is by position
▪ the first actual parameter is bound to the first
formal parameter and so forth
▪ Safe and effective

▪ Keyword
▪ The name of the formal parameter to which an
actual parameter is to be bound is specified
with the actual parameter
▪ Advantage: Parameters can appear in any order,
thereby avoiding parameter correspondence
errors
▪ Disadvantage: The user must know the formal
parameters' names
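
Python supports both kinds of correspondence, so the difference can be shown side by side. This is a small sketch; `describe` and its parameter names are hypothetical.

```python
def describe(name, length, width):
    """Hypothetical subprogram with three formal parameters."""
    return f"{name}: {length} x {width}"


# Positional correspondence: actuals are bound to formals in order.
by_position = describe("desk", 120, 60)

# Keyword correspondence: each actual names its formal, so order is free.
by_keyword = describe(width=60, name="desk", length=120)
```

Both calls bind the same values to the same formals; the keyword call only works because the caller knows the names `name`, `length`, and `width`.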



FORMAL PARAMETER, DEFAULT VALUES
▪ If no actual parameter is
passed, a formal parameter
can take a default value
▪ Allowed in certain languages
(e.g., C++, Python, PHP).
▪ In C++, default parameters
must appear last because
parameters are positionally
associated (no keyword
parameters)
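
A short Python sketch of default values follows. The function `compute_pay` and its arithmetic are hypothetical and chosen only to make the effect of each default visible.

```python
def compute_pay(income, exemptions=1, tax_percent=20):
    """Hypothetical pay calculation; the last two formals have defaults."""
    tax = income * tax_percent // 100
    return income - tax + exemptions * 100


all_defaults = compute_pay(1000)                 # exemptions=1, tax_percent=20
one_override = compute_pay(1000, 2)              # positional override of exemptions
skip_middle = compute_pay(1000, tax_percent=10)  # keyword skips over exemptions
```

Because Python has keyword parameters, a caller can skip `exemptions` and still set `tax_percent`. In a language with only positional association, omitted actuals can only be trailing ones.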



VARIABLE PARAMETERS
▪ C# methods can accept a variable number of parameters as
long as they are of the same type—the corresponding formal
parameter is an array preceded by params
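
Python offers a comparable facility, shown here as an analogue rather than as C# code: a formal parameter prefixed with `*` collects any number of positional actuals into a tuple. Unlike C# `params`, Python does not require the actuals to share a type. The name `display_all` is hypothetical.

```python
def display_all(*values):
    """`values` arrives as a tuple holding every positional actual."""
    return ", ".join(str(v) for v in values)


three_values = display_all(2, 4, 6)
no_values = display_all()
```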



9.3 DESIGN ISSUES FOR SUBPROGRAMS
▪ Are local variables static or dynamic?
▪ What parameter passing methods are provided?
▪ Are parameter types checked?
▪ Can subprograms be passed as parameters? If so, what is the
referencing environment of a passed subprogram?
▪ Can subprograms be nested?
▪ Can subprograms be overloaded?
▪ Can subprograms be generic?



9.8 DESIGN ISSUES FOR FUNCTIONS
▪ Are side effects allowed?
▪ Parameters should always be in-mode to reduce side effects (as in
Ada)
▪ What types of return values are allowed?
▪ Most imperative languages restrict the return types
▪ C allows any type except arrays and functions
▪ C++ is like C but also allows user-defined types
▪ Java and C# methods can return any type (but because methods are not
types, they cannot be returned)
▪ Python and Ruby treat methods as first-class objects, so they can be
returned, as can objects of any class

▪ What is the maximum number of returned values?
▪ In most languages, only a single value can be returned from a
function
▪ Ruby can return many values by storing them in an array
▪ ML, F#, Python return multiple values as a tuple
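
In Python, returning several values means returning one tuple that the caller may unpack. A minimal sketch, with the hypothetical function `min_max`:

```python
def min_max(values):
    """Return the smallest and largest element as a single tuple."""
    return min(values), max(values)


low, high = min_max([3, 9, 1, 7])  # the tuple is unpacked at the call site
packed = min_max([5])              # or kept whole
```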



9.6 PARAMETERS THAT ARE SUBPROGRAMS
▪ In some situations, it is convenient to be able to transmit
computations, rather than data, as parameters to
subprograms.
▪ pass subprogram names as parameters

▪ Example: when a subprogram must sample some
mathematical function
▪ integration by sampling a function at a number of points



EXAMPLE, PYTHON
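
A minimal sketch of integration by sampling, where the function to integrate is passed as a parameter. The midpoint rule, the name `integrate`, and the default of 1000 samples are assumptions made for this illustration.

```python
import math


def integrate(f, lower, upper, samples=1000):
    """Approximate the integral of f over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / samples
    total = 0.0
    for i in range(samples):
        midpoint = lower + (i + 0.5) * width
        total += f(midpoint)  # call the subprogram that was passed in
    return total * width


def square(x):
    return x * x


area_square = integrate(square, 0.0, 3.0)    # exact answer is 9
area_sine = integrate(math.sin, 0.0, math.pi)  # exact answer is 2
```

`integrate` never needs to know which function it samples; the caller decides by passing `square`, `math.sin`, or any other one-argument function.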



9.5 PARAMETER PASSING
▪ Parameter-passing methods are the ways in which parameters
are transmitted to and/or from called subprograms.
▪ In mode : can receive data from the corresponding actual
parameter
▪ Out mode : can transmit data to the actual parameter
▪ In-out mode : can do both



IMPLEMENTATION MODELS OF PARAMETER
PASSING
▪ Passing modes:
• Physically move a value
• Move an access path to a value

▪ A variety of models developed by language designers:
▪ Pass-by-value
▪ Pass-by-result
▪ Pass-by-value-result
▪ Pass-by-reference
▪ Pass-by-name



PASS-BY-VALUE (IN MODE)
▪ The value of the actual parameter
is used to initialize the
corresponding formal parameter
▪ Normally implemented by copying
▪ Physical move: additional storage is
required (stored twice) and the actual
move can be costly (for large
parameters)
▪ Passing Access path : not
recommended, must write-protect in
the called subprogram, and accesses
cost more (indirect addressing)



PASS-BY-REFERENCE (IN-OUT MODE)
▪ Pass an access path
▪ Also called pass-by-sharing

▪ Advantage:
▪ Passing process is efficient (no
copying and no duplicated storage)
▪ Disadvantages
▪ Slower accesses (compared to pass-
by-value) to formal parameters
▪ Potential for unwanted side effects
(collisions)
▪ Unwanted aliases (access broadened)

fun(total, total);
fun(list[i], list[j]); // aliases when i == j

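
The aliasing problem can be reproduced in Python by passing the same mutable object twice. Python does not pass variables by reference, so a one-element list stands in for a variable here; `add_twice` is a hypothetical subprogram.

```python
def add_twice(first, second):
    """Increment the value held in each one-element list."""
    first[0] += 1
    second[0] += 1


total = [10]
add_twice(total, total)  # first and second are aliases for the same object

a, b = [10], [10]
add_twice(a, b)          # distinct objects: no aliasing
```

With aliasing, `total` is incremented twice even though each formal parameter is updated only once inside the subprogram.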
PASS-BY-REFERENCE (IN-OUT MODE)
▪ Another issue:
▪ Can the passed reference be changed in the called
subprogram?
▪ In C, it is possible
▪ But in some other languages, such as Pascal and C++, formal
parameters that are addresses are implicitly dereferenced, which
prevents such changes



PASS-BY-RESULT (OUT MODE)
▪ When a parameter is passed by
result:
▪ no value is transmitted to the
subprogram
▪ the corresponding formal
parameter acts as a local variable
▪ its value is transmitted to caller’s
actual parameter when control is
returned to the caller, by physical
move
▪ Requires an extra storage location and
a copy operation



PASS-BY-VALUE-RESULT (IN-OUT MODE)
▪ A combination of pass-by-value and pass-by-result
▪ Sometimes called pass-by-copy
▪ Formal parameters have local storage
▪ Disadvantages:
▪ Those of pass-by-result
▪ Those of pass-by-value
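
The difference between pass-by-value-result and pass-by-reference shows up when the same actual is passed twice. The sketch below simulates both in Python, using one-element lists as storage cells. The helper names, the body of `fun`, and the left-to-right copy-out order are all assumptions of this illustration.

```python
def fun(a, b):
    """Hypothetical subprogram body operating on two storage cells."""
    a[0] = a[0] + 1
    b[0] = b[0] * 2


def call_by_reference(body, *actuals):
    # The formals are the caller's cells themselves.
    body(*actuals)


def call_by_value_result(body, *actuals):
    local_cells = [[cell[0]] for cell in actuals]  # copy in (by value)
    body(*local_cells)                             # work on local storage
    for actual, local in zip(actuals, local_cells):
        actual[0] = local[0]                       # copy out (by result)


ref_total = [5]
call_by_reference(fun, ref_total, ref_total)       # a and b alias one cell

vr_total = [5]
call_by_value_result(fun, vr_total, vr_total)      # a and b are separate copies
```

By reference, the cell goes 5, 6, 12. By value-result, the two local copies become 6 and 10, and the last copy-out wins, so the result depends on the copy-out order.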



IMPLEMENTING PARAMETER-PASSING METHODS
• In most languages, parameter communication takes
place through the run-time stack
• Pass-by-reference is the simplest to implement;
only an address is placed on the stack

Function header: void sub(int a, int b, int c, int d)


Function call in main: sub(w, x, y, z)

(pass w by value, x by result, y by value-result, z by reference)



PARAMETER PASSING METHODS OF MAJOR
LANGUAGES
▪ C
▪ Pass-by-value
▪ Pass-by-reference is achieved by using pointers as parameters

▪ C++
▪ A special pointer type called reference type for pass-by-reference

▪ Java
▪ All non-object parameters are passed by value
▪ no method can change any of these parameters
▪ Object parameters are, in effect, passed by reference (the reference to the object is itself passed by value)

▪ C#
▪ Default method: pass-by-value
▪ Pass-by-reference is specified by preceding both a formal parameter
and its actual parameter with ref
▪ Python and Ruby
▪ use pass-by-assignment (all data values are objects); the actual is
assigned to the formal
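
Pass-by-assignment has a practical consequence that a short Python sketch makes visible: rebinding the formal does not affect the caller, while mutating the shared object does. The function names are hypothetical.

```python
def rebind(items):
    items = [0]      # rebinds the local name only; the caller's object is untouched


def mutate(items):
    items.append(0)  # changes the object that caller and callee share


data = [1, 2]
rebind(data)
after_rebind = list(data)  # snapshot of data after the rebind call
mutate(data)
```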



DESIGN CONSIDERATIONS FOR PARAMETER
PASSING
▪ Two important considerations
▪ Efficiency
▪ One-way or two-way data transfer

▪ But the above considerations are in conflict
▪ Good programming suggests limited access to variables, which
means one-way whenever possible
▪ But pass-by-reference is more efficient to pass structures of
significant size



TYPE CHECKING PARAMETERS
▪ The types of actual parameters are checked for consistency
with the types of the corresponding formal parameters.
▪ Considered very important for reliability
▪ FORTRAN 77 and original C: none
▪ Pascal and Java: it is always required
▪ ANSI C and C++: the choice is made by the user, through prototypes
▪ Relatively new languages Perl, JavaScript, and PHP do not
require type checking
▪ In Python and Ruby, variables do not have types, so parameter
type checking is not possible



SUMMARY
▪ A subprogram definition describes the actions represented
by the subprogram
▪ Subprograms can be either functions or procedures
▪ Local variables in subprograms can be stack-dynamic or
static
▪ Three models of parameter passing: in mode, out mode, and
in-out mode
▪ (extra) Subprograms can be overloaded



ANY Q??
CS341
CONCEPT OF
PROGRAMMING
LANGUAGES
Ch10: Implementing subprograms

Dr. Nada Mobark


GOAL
Explore the implementation
of subprograms



TOPICS
10.1 The General Semantics of Calls and Returns
10.2 Implementing “Simple” Subprograms
10.3 Implementing Subprograms with Stack-Dynamic Local
Variables



THE GENERAL SEMANTICS OF CALLS AND
RETURNS
▪ The subprogram call and return operations of a language are
together called its subprogram linkage



THE GENERAL SEMANTICS OF CALLS AND
RETURNS
▪ General semantics of calls to a subprogram
▪ Parameter passing methods
▪ Stack-dynamic allocation of local variables
▪ Save the execution status of calling program
▪ Transfer of control and arrange for the return
▪ If subprogram nesting is supported, access to nonlocal variables
must be arranged
▪ General semantics of subprogram returns:
▪ Out mode and inout mode parameters must have their values
returned
▪ Deallocation of stack-dynamic locals
▪ Restore the execution status
▪ Return control to the caller



10.2 IMPLEMENTING “SIMPLE” SUBPROGRAMS
▪ Don’t support recursion
▪ Call Semantics:
- Save the execution status of the caller
- Pass the parameters
- Pass the return address to the called
- Transfer control to the called
▪ Return Semantics:
▪ If pass-by-value-result or out mode parameters are used, move the
current values of those parameters to their corresponding actual
parameters
▪ If it is a function, move the functional value to a place the caller can
get it
▪ Restore the execution status of the caller
▪ Transfer control back to the caller



IMPLEMENTING “SIMPLE” SUBPROGRAMS
▪ Required storage:
▪ Status information about the caller
▪ parameters,
▪ return address,
▪ return value for functions,
▪ temporaries



IMPLEMENTING “SIMPLE” SUBPROGRAMS
▪ Two separate parts:
▪ the actual code and
▪ the non-code part (local variables and data that can change)

▪ The format, or layout, of the non-code part of an executing
subprogram is called an activation record
▪ Data are relevant only during activation/execution
▪ No recursion → one activation record
▪ Fixed size → statically allocated



EXAMPLE
▪ Three subprograms
▪ Separate code and data (activation record) segments
▪ Alternatively, each activation record can be attached to its code segment

▪ May be compiled separately and put together by a linker
▪ The linker does:
▪ Find/Load main and all referenced
subprograms code in memory,
including library calls.
▪ Load all activation records in memory
▪ Patch in the target address of all calls to
subprograms.



10.3 USING STACK-DYNAMIC LOCAL
VARIABLES
▪ More complex activation record, because:
▪ The compiler must generate code to cause implicit allocation and
deallocation of local variables
▪ Recursion must be supported
▪ adds the possibility of multiple simultaneous activations of a
subprogram



ACTIVATION RECORD
▪ An activation record instance is dynamically created when a
subprogram is called
▪ Instances reside on the run-time stack
▪ Last called, first complete
▪ Return address points to the next instruction following the call.
▪ The dynamic link points to the base of the activation record of the
caller
▪ In static-scoped languages: used to provide traceback information when a run-time error occurs
▪ In dynamic-scoped languages: used to access nonlocal variables
▪ Return address, dynamic link, and parameters are placed first by
caller.
▪ Local variables are allocated and possibly initialized by the called subprogram →
placed last



AN EXAMPLE: C FUNCTION
void sub(float total, int part)
{
int list[5];
float sum;

}



ACTIVATION RECORD
▪ The activation record format is static, but its size may be
dynamic
▪ Local data may not have fixed size

▪ The Environment Pointer (EP) must be maintained by the
run-time system.
▪ It always points at the base of the activation record instance of the
currently executing program unit
▪ Used as the base of the offset addressing of the data contents of the
activation record



REVISED SEMANTIC CALL/RETURN ACTIONS
▪ Caller Actions:
▪ Create an activation record instance
▪ Save the execution status of the current program unit
▪ Compute and pass the parameters
▪ Pass the return address to the called
▪ Transfer control to the called

▪ Prologue actions of the called (at the start of its execution):
▪ Save the old EP as the dynamic link in the activation record
▪ Set the EP to point to the base of the new activation record instance
▪ Allocate local variables



REVISED SEMANTIC CALL/RETURN ACTIONS
▪ Epilogue (at the end of call) actions of the called:
▪ If there are pass-by-value-result or out-mode parameters, the
current values of those parameters are moved to the
corresponding actual parameters
▪ If the subprogram is a function, its value is moved to a place
accessible to the caller
▪ Restore the stack pointer by setting it to the value of the current
EP-1 and set the EP to the old dynamic link
▪ Restore the execution status of the caller
▪ Transfer control back to the caller
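
The caller, prologue, and epilogue actions can be simulated with a small Python model of the run-time stack. This is a sketch under stated assumptions: each stack cell holds one value, a record is laid out as return address, dynamic link, parameters, then locals, and saving the execution status and returning values are left out. The class `RunTimeStack` and the return-address strings are hypothetical.

```python
class RunTimeStack:
    """Toy run-time stack: a list of cells plus an environment pointer (EP)."""

    def __init__(self):
        self.cells = []
        self.ep = None  # index of the base of the current activation record

    def call(self, return_address, params, local_count):
        base = len(self.cells)
        self.cells.append(return_address)        # caller: pass return address
        self.cells.append(self.ep)               # prologue: old EP becomes dynamic link
        self.cells.extend(params)                # caller: pass parameters
        self.ep = base                           # prologue: EP -> base of new record
        self.cells.extend([None] * local_count)  # prologue: allocate locals

    def ret(self):
        return_address = self.cells[self.ep]
        dynamic_link = self.cells[self.ep + 1]
        del self.cells[self.ep:]                 # epilogue: stack top becomes EP - 1
        self.ep = dynamic_link                   # epilogue: EP = old dynamic link
        return return_address                    # control goes back to the caller

    def dynamic_chain(self):
        """Follow the dynamic links from the current record to the first one."""
        chain, base = [], self.ep
        while base is not None:
            chain.append(base)
            base = self.cells[base + 1]
        return chain


rts = RunTimeStack()
rts.call("system", [], 1)            # main: one local
rts.call("main+1", [1.5], 2)         # main calls a subprogram with two locals
rts.call("fun1+1", [7], 1)           # which calls another with one local
chain_at_deepest = rts.dynamic_chain()
resume_at = rts.ret()                # innermost call returns
```

After the innermost return, the EP again points at the base of the second record and the stack has shrunk to its size before the last call.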



AN EXAMPLE WITHOUT RECURSION
void fun1(float r) {
    int s, t;
    ...
    fun2(s);
    ...
}
void fun2(int x) {
    int y;
    ...
    fun3(y);
    ...
}
void fun3(int q) {
    ...
}
void main() {
    float p;
    ...
    fun1(p);
    ...
}



AN EXAMPLE WITHOUT RECURSION

main calls fun1


fun1 calls fun2
fun2 calls fun3



DYNAMIC CHAIN AND LOCAL OFFSET
▪ The collection of dynamic links in the stack at a given time is
called the dynamic chain, or call chain
▪ Local variables can be accessed by their offset from the
beginning of the activation record, whose address is in the EP.
This offset is called the local_offset
▪ The local_offset of a local variable can be determined by the
compiler at compile time
▪ Based on order, type, and size
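
A compiler's offset calculation can be sketched as follows. The sketch assumes that every scalar occupies one cell and that the record starts with the return address and the dynamic link, followed by the parameters; real layouts depend on the machine and the implementation. The function `local_offsets` is hypothetical.

```python
def local_offsets(param_count, local_vars):
    """Map each local name to its offset from the base of the activation record.

    local_vars is a list of (name, cell_count) pairs in declaration order.
    """
    offsets = {}
    next_offset = 2 + param_count  # skip return address, dynamic link, parameters
    for name, cell_count in local_vars:
        offsets[name] = next_offset
        next_offset += cell_count
    return offsets


# Locals of: void sub(float total, int part) { int list[5]; float sum; ... }
offsets = local_offsets(2, [("list", 5), ("sum", 1)])
```

Under these assumptions `list` starts at offset 4 and `sum` sits at offset 9, so at run time `sum` is found at the address in the EP plus 9.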



AN EXAMPLE WITH RECURSION
▪ The activation record used in the previous example supports
recursion

int factorial(int n) {
    /* <----------------------------- 1 */
    if (n <= 1) return 1;
    else return (n * factorial(n - 1));
    /* <----------------------------- 2 */
}
void main() {
    int value;
    value = factorial(3);
    /* <----------------------------- 3 */
}



STACKS FOR CALLS TO FACTORIAL
▪ Each call results in a
fresh copy of the
activation record
placed on the stack.
▪ While a call is still active, its
functional value is undefined



STACKS FOR RETURNS FROM FACTORIAL
▪ Functional value is
returned before the
call ends.
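
The growth and shrinkage of the stack during the factorial calls can be traced with an explicit stack in Python. This is a simplified model: each activation record is a dictionary holding only the parameter and the functional value, and main's record is omitted. The name `factorial_trace` is hypothetical.

```python
def factorial_trace(n):
    """Compute n! with an explicit stack; also report the deepest stack size."""
    stack = []
    max_depth = 0

    # Call phase: every call pushes a fresh record with no functional value yet.
    while True:
        stack.append({"n": n, "functional_value": None})
        max_depth = max(max_depth, len(stack))
        if n <= 1:
            break
        n -= 1

    # Return phase: the last record pushed is the first to complete.
    returned = None
    while stack:
        record = stack[-1]
        if record["n"] <= 1:
            record["functional_value"] = 1
        else:
            record["functional_value"] = record["n"] * returned
        returned = stack.pop()["functional_value"]

    return returned, max_depth


value, depth = factorial_trace(3)
```

For `factorial(3)` three records are live at the deepest point, and each one receives its functional value only while the stack unwinds.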



SUMMARY
▪ Subprogram linkage semantics requires many actions by the
implementation
▪ Simple subprograms have relatively basic actions
▪ Stack-dynamic languages are more complex
▪ Subprograms with stack-dynamic local variables have two
components
▪ actual code
▪ activation record

▪ Activation record instances contain formal parameters and
local variables, among other things



ANY Q??
