CONCEPT OF PROGRAMMING LANGUAGES
Ch1 : Preliminaries
▪ Ruby www.ruby-lang.org
▪ JavaScript is included in virtually all browsers;
PHP runs on the Web server
▪ Reliability: conformance to specifications
▪ Cost: the ultimate total cost
▪ Exception handling
▪ Intercept run-time errors and take corrective measures
▪ Aliasing:
▪ Different names to the same memory cell.
▪ Writing programs
▪ closeness to particular applications
▪ Maintaining programs:
▪ Usually done by different programmers → readability is an issue!
▪ For large software systems with relatively long lifetimes, maintenance costs
can be as high as two to four times as much as development costs
▪ Well-definedness
▪ The completeness and precision of the language’s official
definition
▪ Late 1960s: structured programming (top-down design)
▪ Middle 1980s: object-oriented programming
Dr. Nada Mobark, 2025
LANGUAGE CATEGORIES
https://www.programiz.com/article/difference-compiler-interpreter
IMPLEMENTATION METHODS
▪ Hybrid Implementation Systems
▪ A compromise between compilers and pure
interpreters
▪ A high-level language program is
translated to an intermediate language
that allows easy interpretation
▪ Faster than pure interpretation
▪ Use: Small and medium systems when
efficiency is not the first concern
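The hybrid approach can be sketched as follows. This is only an illustration under stated assumptions: the source language is limited to arithmetic expressions, the intermediate instruction names (PUSH, ADD, SUB, MUL, DIV) and the functions `translate` and `interpret` are made up for this sketch, and Python's standard `ast` module stands in for the front end.

```python
import ast

def translate(source):
    """Translate an arithmetic expression into a list of stack instructions."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}
    code = []

    def emit(node):
        if isinstance(node, ast.Constant):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.BinOp):
            emit(node.left)                       # operands first ...
            emit(node.right)
            code.append((ops[type(node.op)], None))  # ... then the operator
        else:
            raise SyntaxError("unsupported construct")

    emit(ast.parse(source, mode="eval").body)
    return code

def interpret(code):
    """Interpret the intermediate code on a simple operand stack."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
            continue
        right = stack.pop()
        left = stack.pop()
        if op == "ADD":
            stack.append(left + right)
        elif op == "SUB":
            stack.append(left - right)
        elif op == "MUL":
            stack.append(left * right)
        elif op == "DIV":
            stack.append(left / right)
    return stack.pop()

program = translate("2 + 3 * 4")   # translation step, done once
print(program)
print(interpret(program))          # interpretation step, can be repeated
```

The translation step handles precedence once, so the interpreter loop only has to walk a flat instruction list, which is the sense in which the intermediate language "allows easy interpretation".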
https://www.tutorialspoint.com/execute_ruby_online.php
Explore the environment and motivation behind the development of a collection of programming languages.
https://www.youtube.com/watch?v=Og847HVwRSI
▪ Syntax:
  | A + 1 => A
V | 4        5        (subscripts)
S | 1.n      1.n      (data types)
GENEALOGY OF COMMON LANGUAGES
25th reunion of the Fortran team, 1982
▪ ACM and GAMM met for four days for design (May 27 to June
1, 1958). Goals of the language:
▪ Close to mathematical notation
▪ Good for describing algorithms
▪ Must be translatable to machine code
ALGOL 58
▪ Concept of type was formalized
▪ Names could be any length
▪ Arrays could have any number of subscripts
▪ Subscripts were placed in brackets
▪ Parameters were separated by mode (in & out)
▪ Compound statements (begin ... end)
▪ Semicolon as a statement separator
▪ Assignment operator was :=
▪ if had an else-if clause
▪ No I/O
ALGOL 60
▪ Block structure (local scope)
▪ Two parameter passing methods
▪ Subprogram recursion
▪ Stack-dynamic arrays
▪ Still no I/O
▪ No string handling
ALGOL 68
▪ Concept of orthogonality
▪ User-defined data structures
▪ Reference types
▪ Dynamic arrays (called flex arrays)
▪ New metalanguage (key words and terms)
The lists
(A B C D)
and
(A (B C) D (E (F G)))
▪ Design Goals:
▪ Easy to learn and use for non-science students
▪ Must be “pleasant and friendly”
▪ Fast turnaround for homework
▪ Free and private access
▪ User time is more important than computer time
▪ Non-procedural
Contributions Comments
Compile == understand !
Syntax Semantics
Recognizers Generators
Context-Free Grammars
• Developed by Noam Chomsky in the mid-1950s
• Language generators, meant to describe the syntax of natural languages
• Two grammar classes: context-free and regular
Backus-Naur Form (BNF)
• Invented by John Backus to describe the syntax of ALGOL 58
• Revised by Peter Naur for ALGOL 60
• Concise formal descriptions
• Not easily understandable, new notation
• Not immediately accepted, but later became the standard
Lexemes   Tokens
int       keyword
index     identifier
count     identifier
=         equal_sign/assignment_op
+         plus_op
*         mult_op
2         int_literal
17        int_literal
;         semicolon or delimiter
▪ nonterminal symbols :
▪ often enclosed in angle brackets
▪ act like syntactic variables
▪ Grammar:
▪ a finite non-empty set of rules
▪ A generative device for defining languages
A = B * ( A + C )
▪ Example,
A+B+C
▪ left and right associative orders of evaluation mean the same
thing:
(A + B) + C = A + (B + C)
▪ Subtraction and division are not associative, whether in
mathematics or in a computer
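A quick check of the associativity claims above, using arbitrary sample values (10, 4, 3) chosen only for illustration:

```python
a, b, c = 10, 4, 3

# Addition: both groupings give the same result.
left_add = (a + b) + c
right_add = a + (b + c)

# Subtraction: the grouping changes the result.
left_sub = (a - b) - c    # 6 - 3
right_sub = a - (b - c)   # 10 - 1

# Division: the grouping changes the result as well.
left_div = (a / b) / c
right_div = a / (b / c)

print(left_add == right_add)   # True
print(left_sub, right_sub)     # 3 9
print(left_div == right_div)   # False
```

This is why a grammar has to fix left or right associativity for operators such as - and /, while for + the choice does not affect the value in this integer example.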
if <logic_expr> then
if <logic_expr> then
<stmt>
else
<stmt>
BNF
<expr> → <expr> + <term>
       | <expr> - <term>
       | <term>
<term> → <term> * <factor>
       | <term> / <factor>
       | <factor>
EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
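The EBNF rules above map almost directly onto code: each `{ ... }` repetition becomes a loop. The following is a minimal sketch, not a definitive implementation. The rule for <factor> is not given in the slides, so the sketch assumes `<factor> → number | ( <expr> )`; the names `tokenize`, `Parser`, and `evaluate` are hypothetical.

```python
import re

def tokenize(text):
    """Split the input into numbers, operators, and parentheses."""
    return re.findall(r"\d+|[-+*/()]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # <expr> -> <term> {(+ | -) <term>}
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        # <term> -> <factor> {(* | /) <factor>}
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        # Assumed rule: <factor> -> number | ( <expr> )
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or not token.isdigit():
            raise SyntaxError(f"unexpected token {token!r}")
        return int(token)

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value

print(evaluate("2 + 3 * 4"))
print(evaluate("10 - 4 - 3"))
```

Because each loop folds operands from left to right, the sketch gives - and / left associativity, which matches the left-recursive BNF version.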
Static
▪ Extension to BNF grammar
▪ BNF cannot describe all of the syntax of programming languages
▪ Checked at compile time
▪ Example: data-type compatibility
Dynamic
▪ Related to the program meaning during execution
▪ Can be used to prove correctness without testing
▪ Example: loops
▪ The types of operands on the right side can be mixed, but the
assignment is valid only if the target and the value resulting from
evaluating the right side have the same type.
Semantic rules:
1. When there are two variables on the right side of an
assignment:
▪ If they have the same type, the expression type is that of the operands
▪ If the operand types are not the same, the expression type is always real.
2. The type of the left side of the assignment must match the
type of the right side.
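The two semantic rules can be written as small checking functions. This is a sketch only: the type names "int" and "real" are represented as plain strings, and the function names `expression_type` and `check_assignment` are assumptions made for illustration.

```python
def expression_type(left_type, right_type):
    """Rule 1: equal operand types keep that type; mixed types yield real."""
    return left_type if left_type == right_type else "real"

def check_assignment(target_type, left_type, right_type):
    """Rule 2: the target type must match the type of the right side."""
    return target_type == expression_type(left_type, right_type)

print(expression_type("int", "int"))            # int
print(expression_type("int", "real"))           # real
print(check_assignment("real", "int", "real"))  # True
print(check_assignment("int", "int", "real"))   # False
```

In an attribute grammar the same logic would be attached to the grammar rules as attribute computations and predicates; here it is shown as ordinary functions.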
▪ Three methods:
✓ Operational semantics
▪ Denotational semantics
▪ Axiomatic semantics
do:
…..
……..
while : if condition == 1 goto do
end:
EVALUATION
▪ Good if used informally (language manuals, etc.) or for
teaching programming languages
▪ Extremely complex if used formally
▪ Vienna Definition Language (VDL) was used for describing
semantics of PL/I.
▪ can lead to circularities, in which concepts are indirectly
defined in terms of themselves
[Figure: Source Code → Compiler (Lexical Analyzer → Syntax Analyzer → Semantic Analyzer) → Assembly/machine Code → Computer]
Symbol table (rows shown):
2  initial  id  …
3  rate     id  …
4  60       Int_lit
Syntax tree: = ( <id 1>, + ( <id 2>, * ( <id 3>, Int_Lit ) ) )
Assembly/machine Code:
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
SUMMARY
▪ An attribute grammar is a descriptive formalism that can
describe both the syntax and the semantics of a language
▪ Operational semantics describe the meaning of a program by
executing its statements on a machine using an intermediate
language
▪ The syntax analysis portion of a compiler nearly always consists
of two parts:
▪ A low-level part is called a lexical analyzer
▪ A high-level part is called a syntax analyzer, or parser (based
on BNF)
[Figure: Compiler (Lexical Analyzer → Parser → Semantic Analyzer → Code Generator) → Machine Code → Computer]
4.2 LEXICAL ANALYSIS
▪ A lexical analyzer is a pattern matcher for character strings
▪ It is the “front-end” of a parser
index = 2 * count + 17;
Lexeme   Token
index    identifier
=        equal_sign
2        int_literal
*        mult_op
count    identifier
+        plus_op
17       int_literal
;        semicolon
LEXICAL ANALYSIS
▪ A lexical analyzer also . . .
▪ Skips comments
▪ Skips blanks outside lexemes
▪ Inserts lexemes for identifiers and literals into a symbol table
▪ Detects syntactic errors in lexemes
▪ For example, ill-formed floating-point literals, 12,345.21, 12,213, 4us
Symbol Table
i Lexeme Tokens
1 sum IDENT …
2 ( LEFT_PAREN …
3 + ADD_OP …
4 47 INT_LIT
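The duties listed above (pattern matching, skipping blanks, filling a symbol table, rejecting ill-formed lexemes) can be sketched with regular expressions. The token names follow the lexeme/token table for `index = 2 * count + 17;`; the patterns, the function name `lex`, and the simplified symbol table (a plain list of lexemes) are assumptions made for this sketch.

```python
import re

TOKEN_PATTERNS = [
    ("int_literal", r"\d+\b"),          # \b rejects lexemes such as 4us
    ("identifier",  r"[A-Za-z_]\w*"),
    ("equal_sign",  r"="),
    ("plus_op",     r"\+"),
    ("mult_op",     r"\*"),
    ("semicolon",   r";"),
    ("skip",        r"\s+"),            # blanks outside lexemes
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})"
                             for name, pattern in TOKEN_PATTERNS))

def lex(source):
    """Return (tokens, symbol_table) for the given source string."""
    tokens = []
    symbol_table = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"ill-formed lexeme at position {pos}: {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue
        tokens.append((lexeme, kind))
        # Identifiers and literals are entered into the symbol table once.
        if kind in ("identifier", "int_literal") and lexeme not in symbol_table:
            symbol_table.append(lexeme)
    return tokens, symbol_table

tokens, table = lex("index = 2 * count + 17;")
for lexeme, kind in tokens:
    print(f"{lexeme:8}{kind}")
print(table)
```

Comment skipping and floating-point literals are left out to keep the sketch short; they would be additional patterns in the same list.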
[Figure annotations: "add the character from nextChar to the lexeme string"; "Recognizes single-char tokens (returns a code)"]
▪ Goals of the parser
▪ Produce the parse tree
▪ Find all syntax errors
▪ Produce an appropriate diagnostic message and recover quickly
THE TOP-DOWN PARSER
▪ An LL parser is a top-down parser.
▪ parses the input from Left to right
▪ performs Leftmost derivation of the sentence.
Grammar:
E → E + T | T
T → T * int | int | (E)
Input: int * int + int
▪ Apply right-most derivation, in reverse order → start from
terminals!
int * int + int
T * int + int
T + int
E + int
E + T
E
S –> AB
A –> aA | ε
B –> b | bB
S => A B
=> a A B
=> a a A B
=> a a a A B
=> a a a ε B
=> a a a ε b
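The leftmost derivation of a a a b can be reproduced by a small recursive-descent recognizer for the grammar S → AB, A → aA | ε, B → b | bB. This is a sketch under assumptions: the function name `recognize` is hypothetical, and the choice between B → b and B → bB is made by looking one symbol past the current b.

```python
def recognize(text):
    """Return the productions applied (leftmost order), or None if rejected."""
    applied = []

    def parse_A(pos):
        # A -> aA | ε : choose aA when the next symbol is 'a', otherwise ε
        if pos < len(text) and text[pos] == "a":
            applied.append("A -> aA")
            return parse_A(pos + 1)
        applied.append("A -> ε")
        return pos

    def parse_B(pos):
        # B -> b | bB : consume one 'b'; recurse if another 'b' follows
        if pos >= len(text) or text[pos] != "b":
            return None
        if pos + 1 < len(text) and text[pos + 1] == "b":
            applied.append("B -> bB")
            return parse_B(pos + 1)
        applied.append("B -> b")
        return pos + 1

    applied.append("S -> AB")
    end = parse_B(parse_A(0))
    return applied if end == len(text) else None

print(recognize("aaab"))
print(recognize("aaa"))   # rejected: B needs at least one 'b'
```

The list printed for "aaab" mirrors the derivation steps: S is expanded first, then the leftmost nonterminal A three times with aA and once with ε, and finally B.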
BOTTOM-UP
S –> AB
A –> aA | ε
B –> b | bB
a a a ε b
=> aaaAb
=> aaAb
=> aAb
=> Ab
=> AB
=> S
SEPARATE LEXICAL AND SYNTAX ANALYSIS
▪ Reasons to Separate Lexical and Syntax Analysis
▪ Simplicity - less complex approaches can be used for lexical
analysis; separating them simplifies the parser
▪ Efficiency - separation allows optimization of the lexical analyzer
▪ Portability - parts of the lexical analyzer may not be portable, but
the parser always is portable
SUMMARY
▪ The major methods of implementing programming languages
are: compilation, pure interpretation, and hybrid
implementation
▪ Syntax analysis is a common part of language implementation
▪ A lexical analyzer is a pattern matcher that isolates small-
scale parts of a program
▪ A parser:
▪ Detects syntax errors
▪ Produces a parse tree
5.5 Scope
▪ Static
▪ dynamic
2 17 const int
Anonymous
variable
int count = x + 5;
static dynamic
Perl:
$apple   String or Numeric
@apple   Array
C#:
var sum = 0;
var total = 0.0;
var name = "Fred";
▪ Disadvantages:
▪ High cost (dynamic allocation/de-allocation)
▪ Type error detection by the compiler is difficult
▪ Usually implemented by interpreter (slow)
▪ Categories:
▪ Static variables
▪ Stack-dynamic variables
▪ heap-dynamic variables
▪ Explicit
▪ Implicit
▪ Advantages:
▪ efficiency (direct addressing),
▪ history-sensitive subprogram support
▪ Disadvantage:
▪ lack of flexibility (no recursion)
▪ Disadvantage:
▪ Inefficient: cost of allocations,
references, and de-allocations
▪ Explicit de-allocation makes
programs unreliable
IMPLICIT HEAP-DYNAMIC VARIABLES
▪ Allocation and deallocation caused by assignment statements
▪ All attributes are bound every time they are assigned
▪ eg.
▪ all strings and arrays in Perl, JavaScript, and PHP
Highs = 5
Highs = [1, 2, 3, 4, 5]
▪ Advantage:
▪ flexibility (generic code)
▪ Disadvantages:
▪ Inefficient: introduces the run-time overhead of maintaining all the
dynamic attributes.
▪ Loss of error detection by the compiler
Content Content
Sum = Sum + age ;
Address
Is this allowed ?
var x = 3;
sub1();
}
▪ Disadvantages:
▪ While a subprogram is executing, its variables are visible to all
subprograms it calls → less reliable
▪ A statement in a subprogram that contains a reference to a
nonlocal variable can refer to different nonlocal variables during
different executions of the sub-programs → Impossible to
statically determine attributes
▪ Takes longer time to resolve → inefficient
▪ You need to know the sequence of subprogram calls to understand
a reference to a variable → Poor readability
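The difference between static and dynamic scoping can be simulated with explicit activation records. The scenario below is an assumption made for illustration: a unit `big` declares x = 3 and two nested subprograms, `sub1` (which declares its own x = 7 and calls `sub2`) and `sub2` (which references x without declaring it). The record layout and function names are hypothetical.

```python
def run(scoping):
    """Simulate big -> sub1 -> sub2 and return the x that sub2 sees."""
    call_stack = []   # activation records, most recent last

    def lookup_dynamic(name):
        # Dynamic scoping: search the callers, most recent first.
        for frame in reversed(call_stack):
            if name in frame["locals"]:
                return frame["locals"][name]
        raise NameError(name)

    def lookup_static(name, frame):
        # Static scoping: follow the chain of textually enclosing units.
        while frame is not None:
            if name in frame["locals"]:
                return frame["locals"][name]
            frame = frame["static_parent"]
        raise NameError(name)

    big = {"locals": {"x": 3}, "static_parent": None}
    call_stack.append(big)
    sub1 = {"locals": {"x": 7}, "static_parent": big}
    call_stack.append(sub1)
    # sub2 is declared inside big but called from sub1.
    sub2 = {"locals": {}, "static_parent": big}
    call_stack.append(sub2)

    if scoping == "dynamic":
        return lookup_dynamic("x")
    return lookup_static("x", sub2)

print(run("static"))    # 3: x of the enclosing unit big
print(run("dynamic"))   # 7: x of the caller sub1
```

Under dynamic scoping the answer depends on who called `sub2`, which is exactly why the reference cannot be resolved by reading the program text alone.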
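EXAMPLE
✓ Trace the program by hand, and predict the output of the program.
✓ What happens if we remove the line: using namespace std;

#include <iostream>
using namespace std;

int i = 5;              // global i

void p(){
    int i = -1;         // local i hides the global i
    i = i + 1;
    cout << i << endl;
}

int main(){
    cout << i << endl;  // global i is still visible here
    char ch;
    int i = 6;          // from here on, local i hides the global i
    i = i + 1;
    p();
    cout << i << endl;
    return 0;
}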
EXAMPLE
✓ Trace the program by hand, and predict the output of the program.
✓ Does it compile correctly or not? Explain.

int i;

int main(){
    int i;
    i = 5;
    for(
        int i = 1;
        i<10 && cout << i << ' ';
        ++i )
    {
        int i = -1;
        cout << i << ' ';
    }
▪ One design issue for all data types: What operations are
defined and how are they specified?
▪ Each value consists of two floats, the real part and the
imaginary part
▪ Advantage:
▪ Readability over integers to represent switches or flags
▪ Typical operations:
▪ Assignment and copying
▪ Comparison (=, >, etc.)
▪ Catenation
▪ Substring reference(slice)
▪ Pattern matching
▪ Design issues:
▪ Is it a primitive type or just a special kind of array?
▪ Should the length of strings be static or dynamic?
▪ Limited dynamic:
▪ C and C++
▪ any number of chars 0 – max
▪ maintain the length, or use a special end-of-string character
▪ require no special dynamic storage allocation
▪ Dynamic length
▪ JavaScript, Perl
▪ Variable length with no maximum
▪ Dynamic storage
▪ must grow and shrink dynamically.
▪ Adjacent cells (mostly used)
▪ Overhead in allocation and deallocation
▪ Linked list or array of char pointers
▪ Extra storage, complex operations
A descriptor is the collection of the attributes of a variable:
• Static: built at compilation time as part of the symbol table
• Dynamic: part or all has to be maintained at run time
>> greeting
=> "Hello"
>> greeting.object_id
=> 70101471431160
▪ Design issues:
▪ What are the scope and lifetime of a pointer variable?
▪ What is the lifetime of a heap-dynamic variable?
▪ Are pointers restricted as to the type of value to which they can
point?
▪ Are pointers used for dynamic storage management, indirect
addressing, or both?
▪ Should the language support pointer types, reference types, or
both?
p = stuff;
delete []stuff;
>> number = 3
=> 3
>> number
=> 3
>> number
=> 6
▪ Aid to reliability,
▪ compiler can check:
▪ operations (don’t allow colors to be added)
▪ No enumeration variable can be assigned a value outside its defined
range
▪ In C#, F#, Swift, and Java 5.0, enumeration type variables :
▪ are not coerced into integer types
▪ can’t be assigned a value outside the predefined range.
Compile-time descriptor for single-dimensioned arrays
▪ Design issues:
▪ What is the form of references to elements?
▪ Is the size static or dynamic?
A+C+B+D
▪ Programmers can alter the precedence and associativity rules by
placing parentheses
(A + B) + (C + D)
a = 10;
b = a + fun(&a);
//assume fun returns 10 and changes
//its parameter to 20
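The effect of operand evaluation order on `b = a + fun(&a)` can be simulated. Python has no address-of operator, so this sketch assumes a one-element list as a stand-in for the variable a passed by address; the helper names `left_to_right` and `right_to_left` are hypothetical.

```python
def fun(cell):
    """Stand-in for fun(&a): changes its parameter to 20 and returns 10."""
    cell[0] = 20
    return 10

def left_to_right():
    a = [10]
    left = a[0]        # a is fetched first, while it is still 10
    right = fun(a)     # then fun runs and changes a
    return left + right

def right_to_left():
    a = [10]
    right = fun(a)     # fun runs first and changes a to 20
    left = a[0]        # a is fetched afterwards
    return left + right

print(left_to_right())   # 20
print(right_to_left())   # 30
```

The same source expression yields two different values, which is why a language definition has to either fix the operand evaluation order or disallow functional side effects.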
int a;
float b, c, d;
. . .
d = b * a;
▪ Disadvantage of coercions:
▪ They decrease the type error detection ability of the compiler
counter = 2
while counter < 68
  puts counter
  counter**=2
end
▪ equivalent to
if ($flag){
$total = 0
} else {
$subtotal = 0
}
▪ Design Issues:
▪ What is the form and type of the control expression?
▪ How are the then and else clauses specified?
▪ How should the meaning of nested selectors be specified?
▪ Expression :
▪ In C89, C99, Python, and C++, the control expression can be
arithmetic
▪ In most other languages, the control expression must be Boolean
▪ Java example
if (sum == 0)
if (count == 0)
result = 0;
else result = 1;
▪ Which if gets the else?
▪ static semantics rule: else matches with the nearest elseless-if
▪ To force an alternative semantics, compound statements may be
used
▪ default clause
▪ for unrepresented values
▪ Optional
    goto branches
label1: code for statement1
    goto out
. . .
labeln: code for statementn
    goto out
default: code for statementn+1
    goto out
branches: if t = constant_expression1 goto label1
. . .
    if t = constant_expressionn goto labeln
    goto default
out:
▪ Syntax:
for ([expr_1] ; [expr_2] ; [expr_3]) statement
▪ Semantics:
i = 1
while true
  if i*5 >= 25
    break
  end
  puts i*5
  i += 1
end

for i in 5...11
  if i == 7 then
    next
  end
  puts i
end
Redo ??
ITERATION BASED ON DATA STRUCTURES
▪ The number of elements in a data structure controls loop
iteration
▪ Mechanism is a call to an iterator function that returns the next
element in some chosen order, if there is one; otherwise the loop
terminates
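The iterator mechanism described above can be sketched with Python's iterator protocol, where `__next__` plays the role of the iterator function. The `Countdown` container and its iterator class are hypothetical examples, not part of the slides.

```python
class Countdown:
    """A hypothetical container whose elements are n, n-1, ..., 1."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return CountdownIterator(self.n)

class CountdownIterator:
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        # Return the next element if there is one; otherwise end the loop.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

collected = []
for item in Countdown(3):   # the for loop calls __next__ until StopIteration
    collected.append(item)
print(collected)            # [3, 2, 1]
```

The loop body never sees an index or a length: the number of elements in the data structure alone decides how many iterations run.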
▪ Keyword
▪ The name of the formal parameter to which an
actual parameter is to be bound is specified
with the actual parameter
▪ Advantage: Parameters can appear in any order,
thereby avoiding parameter correspondence
errors
▪ Disadvantage: User must know the formal
parameter’s names
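Python supports keyword parameters, so the trade-off above can be shown directly. The subprogram `summer` and its formal parameter names are made up for this sketch.

```python
def summer(length, values, total):
    """Hypothetical subprogram: add the first `length` values to `total`."""
    return total + sum(values[:length])

# Positional: the caller must remember the order of the formals.
positional = summer(2, [10, 20, 30], 5)

# Keyword: any order works, but the formal names must be known.
keyword = summer(total=5, values=[10, 20, 30], length=2)

print(positional, keyword)   # 35 35

# Using a name that is not a formal parameter is rejected.
try:
    summer(len=2, values=[10, 20, 30], total=5)
except TypeError as error:
    print("rejected:", error)
```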
▪ Advantage:
▪ Passing process is efficient (no
copying and no duplicated storage)
▪ Disadvantages
▪ Slower accesses (compared to pass-
by-value) to formal parameters
▪ Potentials for unwanted side effects
(collisions)
▪ Unwanted aliases (access broadened)
fun(total, total);
fun(list[i], list[j]); // i == j
PASS-BY-REFERENCE (IN-OUT MODE)
▪ Another issue:
▪ Can the passed reference be changed in the called
subprogram?
▪ In C, it is possible
▪ But in some other languages, such as Pascal and C++, formal
parameters that are addresses are implicitly dereferenced, which
prevents such changes
▪ C++
▪ A special pointer type called reference type for pass-by-reference
▪ Java
▪ All non-object parameters are passed by value
▪ no method can change any of these parameters
▪ Object parameters are, in effect, passed by reference
▪ C#
▪ Default method: pass-by-value
▪ Pass-by-reference is specified by preceding both a formal parameter
and its actual parameter with ref
▪ Python and Ruby
▪ use pass-by-assignment (all data values are objects); the actual is
assigned to the formal
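Pass-by-assignment in Python can be illustrated as follows. The function names are hypothetical; the point is the difference between rebinding the formal parameter and mutating the object it refers to.

```python
def rebind(formal):
    formal = [99]        # rebinds the formal only; the actual is unaffected

def mutate(formal):
    formal.append(99)    # changes the shared object; the caller sees it

def increment(formal):
    formal += 1          # ints are immutable: formal is rebound to a new object
    return formal

numbers = [1, 2]
rebind(numbers)
print(numbers)           # [1, 2]

mutate(numbers)
print(numbers)           # [1, 2, 99]

count = 5
result = increment(count)
print(count, result)     # 5 6
```

So the behavior looks like pass-by-value for immutable objects and like pass-by-reference for mutations of mutable objects, although the mechanism (assigning the actual to the formal) is the same in both cases.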