CSC 533: Organization of Programming Languages Spring 2005: Background
Background
machine → assembly → high-level languages
software development methodologies
key languages
Syntax
grammars, BNF
derivation trees, parsing
EBNF, syntax graphs
parsing
Semantics
operational, axiomatic, denotational
Evolution of programming
first computers (e.g., ENIAC) were not stored-program machines
had to be rewired/reconfigured for different computations
next step: programs coded directly in machine language (binary), e.g.:
011111110100010101001100010001100000000100000010000000010000000000000000000000
000000000000000000000000000000000000000000000000000000000000000001000000000000
001000000000000000000000000000000001000000000000000000000000000000000000000000
000000000000000000000000000000000000000000001010000100000000000000000000000000
000000000000000000110100000000000000000000000000000000000000000000101000000000
000000100000000000000000010000000000101110011100110110100001110011011101000111
001001110100011000010110001000000000001011100111010001100101011110000111010000
000000001011100111001001101111011001000110000101110100011000010000000000101110
011100110111100101101101011101000110000101100010000000000010111001110011011101
000111001001110100011000010110001000000000001011100111001001100101011011000110
000100101110011101000110010101111000011101000000000000101110011000110110111101
101101011011010110010101101110011101000000000000000000000000000000000010011101
111000111011111110010000000100110000000000000000000000001001000000010010011000
000000000000010101000000000000000000000000100100100001001010100000000000000100
000000000000000000000000000000000001000000000000000000000000101000000001000000
000000000010001001000000010000000000000001000000010101000000000000000000000000
100100100001001010100000000000000100000000000000000000000000000000000001000000
000000000000000000101100000001000000000000000100001000000000000000000000100000
000100000000000000000000000010000001110001111110000000001000100000011110100000
000000000000000000000000000000000000000000000001001000011001010110110001101100
011011110111011101101111011100100110110001100100001000010000000000000000000000
000000000000000000000000000000000000000001000000000000000000000000000000000000
000000000000000000000000000000000100000000001111111111110001000000000000000000
000000000000010000000000000000000000000000000000000000000000000000000000000000
000001000000000011111111111100010000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000001100000000000000000000
001100000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000010000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000110000000
000000000000000100000000000000000000000000001101000000000000000000000000000000
000000000000000000000000000000000000001000000000000000000000000000000000000000
000000000000000100010000000000000000000000000000000000000000000000000000000000
000000000010000000000000000000000000000000000000000000000000000010001100000000
000000000000000000000000000000000000000000000000000000000000100000000000000000
000000000000000000000000000000000000101100000000000000000000000000000000000000
000000000000000000000000000000001000000000000000000000000000000000000000000000
000000001101001000000000000000000000000000000000000000000000000000000000100100
000010010000000000000000000000010000000000000000000000000011011100000000000000
000000000000000000000000000000000000000000000000000000100000000000000000000000
000000000000001101000011001010110110001101100011011110010111001100011011100000
111000000000000011001110110001101100011001100100101111101100011011011110110110
101110000011010010110110001100101011001000010111000000000010111110101000101011
111011100010111010001101111011001000000000001011111010111110110110001110011010
111110101111100110111011011110111001101110100011100100110010101100001011011010
101000001000110010100100011011101101111011100110111010001110010011001010110000
101101101010111110101001000110111011011110111001101110100011100100110010101100
001011011010000000001011111010111110110110001110011010111110101111100110111011
011110111001101110100011100100110010101100001011011010101000001000011011000110
000000001100101011011100110010001101100010111110101111101000110010100100011011
101101111011100110111010001110010011001010110000101101101000000000110110101100
001011010010110111000000000011000110110111101110101011101000000000000000000000
000000000000000000000000000000000000000000000000000000
Evolution of programming (cont.)
mid 1950s: assembly languages developed
.file "hello.cpp"
gcc2_compiled.:
.global _Q_qtod
.section ".rodata"
.align 8
.LLC0: .asciz "Hello world!"
.section ".text"
.align 4
.global main
.type main,#function
.proc 04
main:
!#PROLOGUE# 0
save %sp,-112,%sp
!#PROLOGUE# 1
sethi %hi(cout),%o1
or %o1,%lo(cout),%o0
sethi %hi(.LLC0),%o2
or %o2,%lo(.LLC0),%o1
call __ls__7ostreamPCc,0
nop
mov %o0,%l0
mov %l0,%o0
sethi %hi(endl__FR7ostream),%o2
or %o2,%lo(endl__FR7ostream),%o1
call __ls__7ostreamPFR7ostream_R7ostream,0
nop
mov 0,%i0
b .LL230
nop
.LL230: ret
restore
.LLfe1: .size main,.LLfe1-main
.ident "GCC: (GNU) 2.7.2"
Evolution of programming (cont.)
late 1950s: high-level languages developed
// File: hello.cpp
// Author: Dave Reed
//
// This program prints "Hello world!"
////////////////////////////////////////
#include <iostream>
using namespace std;
int main()
{
cout << "Hello world!" << endl;
return 0;
}
C
C     FORTRAN program
C     Prints "Hello world" 10 times
C
      PROGRAM HELLO
      DO 10, I=1,10
         PRINT *,'Hello world'
   10 CONTINUE
      STOP
      END
#include <stdio.h>
/* C99 or later: loop-scoped declarations require an explicit int return type */
int main(void) {
    for (int i = 0; i < 10; i++) {
        printf("Hello World!\n");
    }
    return 0;
}
#include <iostream>
using namespace std;
int main() {
for(int i = 0; i < 10; i++) {
cout << "Hello World!" << endl;
}
return 0;
}
class HelloWorld {
public static void main (String args[]) {
for(int i = 0; i < 10; i++) {
System.out.print("Hello World ");
}
}
}
<html>
<body>
<script language="JavaScript">
for(i = 0; i < 10; i++) {
document.write("Hello World<br>");
}
</script>
</body>
</html>
COBOL
artificial intelligence: LISP/Scheme or Prolog
systems programming: C
software engineering
Web development
Syntax
syntax: the form of expressions, statements, and program units in a programming language
programmers & implementers need a clear, unambiguous description
BNF is a meta-language
a grammar is a collection of rules that define a language
BNF rules define abstractions in terms of terminal symbols and abstractions
<ASSIGN> → <VAR> := <EXPRESSION>
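
As a rough illustration, a BNF grammar can be held as plain data: each abstraction maps to its alternative right-hand sides. The Python sketch below encodes the <ASSIGN> rule; the rules given for <VAR> and <EXPRESSION> are placeholders assumed for the example and are not taken from the slide.

```python
# A grammar as data: each abstraction (nonterminal) maps to a list of
# alternatives, and each alternative is a sequence of symbols.
GRAMMAR = {
    "<ASSIGN>": [["<VAR>", ":=", "<EXPRESSION>"]],
    # The two rules below are assumed placeholders for illustration only.
    "<VAR>": [["A"], ["B"], ["C"]],
    "<EXPRESSION>": [["<VAR>"], ["<VAR>", "+", "<VAR>"]],
}


def terminals(grammar):
    """Return every symbol used in a rule body that has no rule of its own."""
    used = {sym for alts in grammar.values() for alt in alts for sym in alt}
    return used - set(grammar)


print(sorted(terminals(GRAMMAR)))
```

Anything that appears on a right-hand side but never on a left-hand side is a terminal symbol; everything else is an abstraction.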
<identifier> ⇒ <identifier> <digit>
             ⇒ <identifier> <letter> <digit>
             ⇒ <letter> <letter> <digit>
             ⇒ C <letter> <digit>
             ⇒ CU <digit>
             ⇒ CU1
derivation tree for CU1

              <identifier>
              /          \
      <identifier>      <digit>
       /        \          |
<identifier>  <letter>     1
     |            |
  <letter>        U
     |
     C
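
The derivation of CU1 can be replayed mechanically by rewriting the leftmost abstraction at each step. The sketch below is only an illustration: the RULES table is an assumption (the slide's rule list did not survive), chosen so that it permits exactly the steps shown in the derivation.

```python
import re
import string

# Assumed rules, for illustration only:
#   <identifier> -> <letter> | <identifier> <letter> | <identifier> <digit>
RULES = {
    "<identifier>": ["<letter>", "<identifier> <letter>", "<identifier> <digit>"],
    "<letter>": list(string.ascii_uppercase),
    "<digit>": list(string.digits),
}


def leftmost_derivation(start, choices):
    """Apply one chosen alternative per step to the leftmost abstraction."""
    steps = [start]
    form = start
    for replacement in choices:
        match = re.search(r"<[a-z]+>", form)  # leftmost abstraction, if any
        if match is None or replacement not in RULES[match.group()]:
            raise ValueError(f"cannot apply {replacement!r} to {form!r}")
        form = form[:match.start()] + replacement + form[match.end():]
        steps.append(form)
    return steps


steps = leftmost_derivation(
    "<identifier>",
    ["<identifier> <digit>", "<identifier> <letter>", "<letter>", "C", "U", "1"],
)
for step in steps:
    print(step)
```

Each printed line is one sentential form; the last one contains only terminal symbols (printed with spaces between them).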
Ambiguous grammars
consider a grammar for simple assignments
<assign> → <id> := <expr>
<id>     → A | B | C
<expr>   → <expr> + <expr>
         | <expr> * <expr>
         | ( <expr> )
         | <id>
A grammar is ambiguous if there exist sentences with 2 or more distinct parse trees
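e.g., A := A + B * C

parse tree 1 (+ at the top)

      <assign>
     /   |    \
 <id>   :=    <expr>
   |        /    |    \
   A   <expr>    +    <expr>
          |         /    |    \
        <id>   <expr>    *    <expr>
          |       |              |
          A     <id>           <id>
                  |              |
                  B              C

parse tree 2 (* at the top)

          <assign>
        /    |     \
    <id>    :=      <expr>
      |           /    |    \
      A      <expr>    *    <expr>
            /   |   \          |
      <expr>    +   <expr>   <id>
         |            |        |
       <id>         <id>       C
         |            |
         A            B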
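
One way to check the ambiguity claim is to count parse trees directly. The sketch below is a small memoized counter written for this particular <expr> grammar only (it is not a general parsing algorithm); the function name and the token-list input format are assumptions made for the example.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count distinct <expr> parse trees for a token list under the
    ambiguous grammar: <expr> -> <expr> + <expr> | <expr> * <expr>
                                | ( <expr> ) | <id>,  <id> -> A | B | C."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_expr(lo, hi):
        if hi - lo <= 0:
            return 0
        if hi - lo == 1:
            return 1 if tokens[lo] in ("A", "B", "C") else 0
        total = 0
        # <expr> -> ( <expr> )
        if tokens[lo] == "(" and tokens[hi - 1] == ")":
            total += count_expr(lo + 1, hi - 1)
        # <expr> -> <expr> + <expr> | <expr> * <expr>:
        # any operator may serve as the root of the tree
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                total += count_expr(lo, mid) * count_expr(mid + 1, hi)
        return total

    return count_expr(0, len(tokens))


print(count_parse_trees("A + B * C".split()))
print(count_parse_trees("( A + B ) * C".split()))
```

The first call reports two trees for the right-hand side of A := A + B * C; adding parentheses leaves only one.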
Ambiguity is bad!
programmer perspective: need to know how code will behave
<assign> → <id> := <expr>
<id>     → A | B | C
<expr>   → <expr> + <term> | <term>
<term>   → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>
Operator precedence
<assign> → <id> := <expr>
<id>     → A | B | C
<expr>   → <expr> + <term> | <term>
<term>   → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>

A := A + B * C

            <assign>
          /    |     \
      <id>    :=     <expr>
        |          /    |    \
        A     <expr>    +    <term>
                 |         /    |    \
              <term>  <term>    *    <factor>
                 |       |              |
             <factor> <factor>        <id>
                 |       |              |
               <id>    <id>             C
                 |       |
                 A       B
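
The layered <expr>/<term>/<factor> grammar maps naturally onto a small hand-written parser. The sketch below is one possible rendering, not a definitive implementation: the left-recursive rules are realized as loops (a common workaround, since recursive descent cannot follow left recursion directly), the result is an abstract tree of nested tuples rather than the full parse tree, and the tokenizer silently skips characters it does not recognize.

```python
import re


def parse_assign(text):
    """Parse '<id> := <expr>' and return a nested-tuple tree."""
    tokens = re.findall(r":=|[A-C+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def ident():           # <id> -> A | B | C
        tok = take()
        if tok not in ("A", "B", "C"):
            raise SyntaxError(f"expected an <id>, found {tok!r}")
        return tok

    def expr():            # <expr> -> <expr> + <term> | <term>
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    def term():            # <term> -> <term> * <factor> | <factor>
        node = factor()
        while peek() == "*":
            take("*")
            node = ("*", node, factor())
        return node

    def factor():          # <factor> -> ( <expr> ) | <id>
        if peek() == "(":
            take("(")
            node = expr()
            take(")")
            return node
        return ident()

    target = ident()
    take(":=")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse_assign("A := A + B * C"))
print(parse_assign("A := (A + B) * C"))
```

Because * is handled one level below +, B * C is grouped first unless parentheses say otherwise.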
Operator associativity
similarly, can build in associativity
left-recursive definitions → left-associative
right-recursive definitions → right-associative

A := A + B + C

              <assign>
            /    |     \
        <id>    :=     <expr>
          |          /    |    \
          A     <expr>    +    <term>
              /    |   \          |
        <expr>     +   <term>  <factor>
           |             |        |
        <term>       <factor>   <id>
           |             |        |
        <factor>       <id>       C
           |             |
         <id>            B
           |
           A
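
The two groupings can be compared side by side with a pair of folds. This is a sketch under stated assumptions: group_left, group_right, and evaluate are hypothetical helper names, and subtraction is used in the numeric check only because it makes the difference visible (it is not an operator in the slide's grammar).

```python
from functools import reduce


def group_left(operands, op):
    """((a op b) op c): the grouping a left-recursive rule produces."""
    return reduce(lambda acc, x: (op, acc, x), operands)


def group_right(operands, op):
    """(a op (b op c)): the grouping a right-recursive rule produces."""
    *init, last = operands
    return reduce(lambda acc, x: (op, x, acc), reversed(init), last)


def evaluate(tree):
    """Evaluate a nested (op, left, right) tuple over numbers."""
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    lhs, rhs = evaluate(left), evaluate(right)
    return lhs + rhs if op == "+" else lhs - rhs


print(group_left(["A", "B", "C"], "+"))
print(group_right(["A", "B", "C"], "+"))
print(evaluate(group_left([10, 4, 3], "-")), evaluate(group_right([10, 4, 3], "-")))
```

For + the two groupings give the same value, but for an operator such as subtraction they do not, which is why the grammar has to fix one of them.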
Right associativity
suppose we wanted exponentiation ^ to be right-associative
need to add right-recursive level to the grammar hierarchy
A := A ^ B ^ C

        <assign>
      /    |     \
  <id>    :=     <expr>
    |               |
    A            <term>
                    |
                <factor>
              /     |     \
         <exp>      ^      <factor>
           |             /    |    \
         <id>       <exp>     ^     <factor>
           |          |                |
           A        <id>             <exp>
                      |                |
                      B              <id>
                                       |
                                       C
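
A right-recursive level can be parsed by plain recursion, with no loop needed. The sketch below assumes a rule of the shape <factor> → <exp> ^ <factor> | <exp> (inferred from the tree, since the rule itself is not shown) and simplifies <exp> to a single operand token; parse_power is a hypothetical name.

```python
def parse_power(tokens):
    """Parse a chain like A ^ B ^ C into a right-nested tuple."""

    def factor(pos):
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        operand = tokens[pos]                  # <exp>, simplified to one token
        if pos + 1 < len(tokens) and tokens[pos + 1] == "^":
            right, end = factor(pos + 2)       # right recursion on <factor>
            return ("^", operand, right), end
        return operand, pos + 1

    tree, end = factor(0)
    if end != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[end]!r}")
    return tree


print(parse_power("A ^ B ^ C".split()))
```

The result groups as A ^ (B ^ C). Python's own ** operator follows the same convention, so 2 ** 3 ** 2 means 2 ** (3 ** 2).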
In ALGOL 60
<math expr>   → <simple math>
              | <if clause> <simple math> else <math expr>
<if clause>   → if <boolean expr> then
<simple math> → <term>
              | <add op> <term>
              | <simple math> <add op> <term>
<term>
<factor>
<add op>      → + | -
<mult op>     → * | / | %
<primary>     → <unsigned number> | <variable>
              | <function designator> | ( <math expr> )
precedence? associativity?
Dangling else
consider the C++ grammar rule:
<selection stmt> → if ( <expr> ) <stmt>
                 | if ( <expr> ) <stmt> else <stmt>
potential problems?
if (x > 0)
    if (x > 100)
        cout << foo << endl;
else
    cout << bar << endl;
ambiguity!
to which if does the else belong?
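
C++ resolves this by attaching each else to the nearest preceding if that does not yet have one. A greedy recursive parser produces that pairing without extra effort, as the sketch below suggests; the flat token list (each condition and each simple statement treated as a single token) and the name parse_stmt are simplifying assumptions made for the example.

```python
def parse_stmt(tokens, pos=0):
    """<stmt> -> if COND <stmt> [else <stmt>] | SIMPLE
    Returns (tree, next_position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        return tokens[pos], pos + 1            # a simple statement
    if pos + 1 >= len(tokens):
        raise SyntaxError("missing condition after 'if'")
    cond = tokens[pos + 1]
    then_branch, pos = parse_stmt(tokens, pos + 2)
    else_branch = None
    # Greedy choice: the innermost open 'if' sees the 'else' first.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse_stmt(tokens, pos + 1)
    return ("if", cond, then_branch, else_branch), pos


tokens = ["if", "x > 0", "if", "x > 100", "foo", "else", "bar"]
tree, _ = parse_stmt(tokens)
print(tree)
```

The else ends up inside the inner if (x > 100), whatever the indentation of the source suggests.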
<uncond stmt>
<compound stmt>
<cond stmt>
<if stmt>
<if clause>
if x > y then
    if y > z then
        printstring("foo");
else
    printstring("bar");
ambiguous?
top-down parsers build the parse tree from the root (top-level abstraction) down to the leaves (terminal symbols)
e.g., recursive descent (LL): simple, but limited (e.g., no left recursion)
bottom-up parsers build the parse tree from the leaves (terminal symbols) up to the root (top-level abstraction)
e.g., shift-reduce (LR): implemented as a PDA, more complex but more general
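
To give a feel for the bottom-up style, here is a hand-coded shift-reduce recognizer for a deliberately tiny grammar, E → E + T | T and T → id. It is only a sketch: the grammar and the function name are assumptions chosen for the example, and real LR parsers decide when to shift or reduce from tables generated from the grammar rather than from hard-wired checks like these.

```python
def shift_reduce(tokens):
    """Recognize E -> E + T | T, T -> id. Returns (accepted, trace)."""
    stack, trace = [], []

    def reduce_all():
        # Keep reducing while the top of the stack matches a rule body.
        while True:
            if stack[-1:] == ["id"]:
                stack[-1:] = ["T"]
                trace.append("reduce T -> id")
            elif stack[-3:] == ["E", "+", "T"]:
                stack[-3:] = ["E"]
                trace.append("reduce E -> E + T")
            elif stack == ["T"]:
                stack[:] = ["E"]
                trace.append("reduce E -> T")
            else:
                return

    for tok in tokens:
        stack.append(tok)                      # shift
        trace.append(f"shift {tok}")
        reduce_all()
    return stack == ["E"], trace


accepted, trace = shift_reduce(["id", "+", "id"])
print(accepted)
for action in trace:
    print(action)
```

The stack plays the role of the pushdown store: terminals are shifted onto it and replaced by abstractions until only the top-level abstraction remains.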
Semantics
generally much trickier than syntax
3 common approaches
operational semantics: describe meaning of a program by executing it on a machine (either real or abstract)
Pascal code
    for i := first to last do
    begin
        ...
    end
Operational semantics
    i = first
    loop: if i > last goto out
        ...
        i = i + 1
        goto loop
    out:
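
The goto-style translation can be run step by step to see what the loop means operationally. The sketch below mirrors it line for line in Python, with the loop body passed in as a function; run_translated_for and the body parameter are hypothetical names introduced for the example.

```python
def run_translated_for(first, last, body):
    """Execute the low-level translation of 'for i := first to last do'."""
    i = first                 # i = first
    while True:
        if i > last:          # loop: if i > last goto out
            break
        body(i)               # ... (the loop body)
        i = i + 1             # i = i + 1
        # goto loop
    # out:


visited = []
run_translated_for(1, 4, visited.append)
print(visited)
```

Recording the values of i seen by the body shows that it runs once for each value from first to last, and not at all when first > last.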
axiomatic semantics: describe meaning using assertions about conditions, can prove properties of program using formal logic
Pascal code
    while (x > y) do
    begin
        ...
    end
Axiomatic semantics
    while (x > y) do
    begin
        ASSERT: x > y
        ...
    end
    ASSERT: x <= y
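
The two assertions can be checked at run time as a sanity test, although the point of axiomatic semantics is that they can be proved for all inputs rather than tested. In the sketch below the loop body (x = x - 1) is a hypothetical stand-in, since the slide leaves the body unspecified.

```python
def run_with_assertions(x, y):
    """Run the while loop with the slide's assertions checked as it goes."""
    while x > y:
        assert x > y      # ASSERT: x > y holds on entry to the body
        x = x - 1         # hypothetical body, assumed for illustration
    assert x <= y         # ASSERT: x <= y holds once the loop exits
    return x


print(run_with_assertions(5, 2))
```

The first assertion follows from the loop test being true; the second follows from the loop test being false at exit, whatever the body does.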