Java - Program Specifications (Full Notes)
Program Specifications
So far in this course we have focused on Haskell (functional) programs. Haskell functions
are termed “pure” or “side-effect free”, which makes them much simpler to reason about.
Challenge: The execution of some Java code can modify the program
state in a way that depends on conditions before the execution.
// PRE: P
somecode
// POST: Q
The above specification expresses that if the state satisfies P, then after
the execution of somecode, the state will satisfy Q.
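As a rough illustration, a pre-/post-condition pair can be mimicked at run time by checking P before the code and Q after it. The class and method names below, and the particular choice of P and Q, are assumptions made for this sketch only; they do not come from the notes.

```java
// Hypothetical sketch: P and Q are chosen purely for illustration.
class SpecCheck {
    // PRE:  0 <= x < Integer.MAX_VALUE   (P)
    // POST: result > 0                   (Q)
    static int incrementChecked(int x) {
        if (!(x >= 0 && x < Integer.MAX_VALUE)) {
            throw new IllegalArgumentException("pre-condition P violated");
        }
        int result = x + 1; // plays the role of somecode
        if (!(result > 0)) {
            throw new IllegalStateException("post-condition Q violated");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(incrementChecked(4)); // prints 5
    }
}
```

A run-time check only examines the states we happen to execute, whereas a specification makes a claim about every state that satisfies P.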
a) // PRE: a = 3 ∧ x = 7
int y = a + x;
// POST: a = 3 ∧ x = 7 ∧ y = 10
b) // PRE: a = 5 ∧ x = 2
int y = a + x;
// POST: x = 2 ∧ y = 7 ∧ a = 5
c) // PRE: a = 3 ∧ x = 7
int y = a + x;
// POST: a = 6 ∧ x = 2 ∧ y = 8
// PRE: a = 3 ∧ x = 7
int y = a + x;
// POST: a = 3 ∧ x = 7 ∧ y = 10
This specification does hold. The values stored in a and x are unmodified by the code, so
remain the same in the post-condition. Also 3 + 7 = 10, so the value stored in y in the
post-condition is also correct.
// PRE: a = 5 ∧ x = 2
int y = a + x;
// POST: x = 2 ∧ y = 7 ∧ a = 5
This specification also holds. The ordering of variables in the post-condition is not
important (remember that ∧ is both associative and commutative). Since 5 + 2 = 7, the
value stored in y in the post-condition is correct.
// PRE: a = 3 ∧ x = 7
int y = a + x;
// POST: a = 6 ∧ x = 2 ∧ y = 8
This specification does not hold. Whilst 6 + 2 = 8, so the post-condition might appear to
be self-consistent, the code does not modify a or x, so the assertions a = 6 and x = 2
cannot hold in the post-condition from the given pre-condition.
If the pre-condition were instead to state a = 6 ∧ x = 2, then the specification would
hold.
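The three specifications discussed above can be sanity-checked by running the code from a state that satisfies each pre-condition and then evaluating the post-condition. This is only a sketch: the class and method names are hypothetical, and checking works here because each pre-condition fixes the values of a and x exactly.

```java
// Hypothetical helper class, not part of the notes.
class SpecExamples {
    // Runs `int y = a + x;` from the given pre-state and returns {a, x, y}.
    static int[] run(int a, int x) {
        int y = a + x;
        return new int[] {a, x, y};
    }

    // PRE: a = 3 ∧ x = 7    POST: a = 3 ∧ x = 7 ∧ y = 10
    static boolean firstSpecHolds() {
        int[] s = run(3, 7);
        return s[0] == 3 && s[1] == 7 && s[2] == 10;
    }

    // PRE: a = 5 ∧ x = 2    POST: x = 2 ∧ y = 7 ∧ a = 5
    static boolean secondSpecHolds() {
        int[] s = run(5, 2);
        return s[1] == 2 && s[2] == 7 && s[0] == 5;
    }

    // PRE: a = 3 ∧ x = 7    POST: a = 6 ∧ x = 2 ∧ y = 8
    static boolean thirdSpecHolds() {
        int[] s = run(3, 7);
        return s[0] == 6 && s[1] == 2 && s[2] == 8;
    }
}
```

The first two checks succeed and the third fails, matching the reasoning given for each specification.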
It is important that this universal quantification is over the whole specification, otherwise
we would change its meaning. For example, if the post-condition above were written as
∀u, v. [a = u ∧ x = v ∧ y = u + v]
then we would be saying that the variable a has all possible values after running this
code, which is clearly nonsense!
Similarly, if the post-condition were written as
∃u, v. [a = u ∧ x = v ∧ y = u + v]
then we would be saying that the variables a and x have some values, but we are not
being terribly precise about what values. This specification is at least true, and does
enforce that the value of y is the sum of the values of a and x, but we usually want to
make stronger assertions about our program state.
“If property P holds before the execution of code, then after the
execution of code property Q will hold”.
We can formally express this as a Hoare Triple:
{ P } code { Q }
For example:
{ true } x = 5; { x > 0 }
{ 0 ≤ x < 10 } x++; { 0 ≤ x ≤ 10 }
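To prove a triple such as
{ true } x = 5; { x > 0 }
we show that the pre-condition and the effect of the code together imply the post-condition:
true ∧ x = 5 → x > 0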
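{ 0 ≤ x < 10 } x++; { 0 ≤ x ≤ 10 }
Naively combining the pre-condition with the effect of the code gives:
0 ≤ x < 10 ∧ x = x + 1 → 0 ≤ x ≤ 10
but x = x + 1 can never be true, so we need to distinguish the value of x before the
code is executed (its old value) from its value afterwards.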
So now to prove:
{ 0 ≤ x < 10 } x++; { 0 ≤ x ≤ 10 }
we show that:
0 ≤ x_old < 10 ∧ x = x_old + 1 → 0 ≤ x ≤ 10
To simplify the presentation, we will only use the old annotation when a
variable is actually modified by the code.
For example, to prove:
{ x = 1 ∧ y = 2 } x = x + y; { x = 3 ∧ y = 2 }
we show that:
x_old = 1 ∧ y = 2 ∧ x = x_old + y → x = 3 ∧ y = 2
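The role of the old annotation can be mimicked in Java by taking a snapshot of a variable before it is modified. The sketch below does this for the assignment x = x + y; the class, the method and the local name xOld are hypothetical and exist only for this illustration.

```java
// Hypothetical sketch of the "old value" convention for x = x + y.
class OldValueDemo {
    // Returns true if, starting from the given x and y, the effect of the
    // assignment is x = x_old + y and the post-condition x = 3 ∧ y = 2 holds.
    static boolean postHolds(int x, int y) {
        int xOld = x;                      // snapshot: plays the role of x_old
        x = x + y;                         // the code under consideration
        boolean effect = (x == xOld + y);  // x = x_old + y
        boolean post = (x == 3 && y == 2); // x = 3 ∧ y = 2
        return effect && post;
    }
}
```

Only x needs a snapshot, because y is not modified by the code.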
We can also apply this convention to array references (as with any other variable).
That is:
a_old[k] refers to the current value, at index k, in the array that is referenced by a
before the code is executed.
a_old[k]_old refers to the value stored in the array that is referenced by a before the
code is executed, at index k, before the code is executed.
a_old[k_old]_old refers to the value stored in the array that is referenced by a before the
code is executed, at the index that is stored in k before the code is executed, before
the code is executed.
As you can see, this can get a little complicated. However, in practice we will rarely
need to make use of such array reference annotations (a_pre[..] or a_old[..]), as it is quite
uncommon for Java programs to destroy or replace entire arrays.
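To make the three annotations concrete, the following sketch keeps a reference to the old array, a copy of its old contents, and the old index, and then reads each combination after some code has run. All names here (aOld, aOldContents, kOld) and the example values are assumptions for this illustration.

```java
// Hypothetical sketch distinguishing old references, old contents and old indices.
class ArrayOldDemo {
    static int[] demo() {
        int[] a = {10, 20, 30};
        int k = 1;

        int[] aOld = a;                // the array referenced by a before the code
        int[] aOldContents = a.clone(); // its contents before the code
        int kOld = k;                  // the value of k before the code

        // The code under consideration:
        a[k] = 99;                     // updates the original array in place
        k = 2;
        a[k] = 77;
        a = new int[] {1, 2, 3};       // a now references a different array

        return new int[] {
            aOld[k],            // old array, current contents, current index: 77
            aOldContents[k],    // old array, old contents, current index: 30
            aOldContents[kOld], // old array, old contents, old index: 20
            a[k]                // the array a references now: 3
        };
    }
}
```

The last entry differs from the first three only because the code replaces the whole array, which is the uncommon case.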
Mod(x = E) = {x}
Mod(i++) = {i}
Mod(i--) = {i}
Mod(a[k] = E) = {a[k]}
Mod(C1; C2) = Mod(C1) ∪ Mod(C2)
Mod(if(E){C1}else{C2}) = Mod(C1) ∪ Mod(C2)
Mod(while(E){C}) = Mod(C)
There are other programming constructs in Java besides those listed above, but the Mod
function can be extended to those in the (hopefully) obvious way.
Throughout these notes, we will refer to the Mod function whenever we need to justify
the substitutions we are applying for any of our examples. However, this is not something
that we expect you to memorise beyond the simple intuition of annotating any modified
variables when writing proof obligations or full formal proofs.
The Mod function can also be extended to handle side-effecting expressions, in which
case the right-hand sides above will be extended by ... ∪ ModE (E) whenever there is an
expression E on the left-hand side. For example:
Mod(if(E){C1 }else{C2 }) = ModE (E) ∪ Mod(C1 ) ∪ Mod(C2 )
The function ModE (Mod : E → IDn ) returns the set of variables/object-attributes that
are modified by an expression. We define ModE for a small number of expressions below,
but this can be lifted to the full set of Java expressions in a similar way:
ModE(c) = {}
ModE(x++)∗ = {x}
ModE(++x)∗ = {x}
ModE(E1 + E2) = ModE(E1) ∪ ModE(E2)
ModE(x += E) = {x} ∪ ModE(E)
ModE(E1 == E2) = ModE(E1) ∪ ModE(E2)
where c is an arbitrary constant value and E, E1 and E2 are arbitrary expressions.
∗ similarly for --.
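The Mod function for commands is a straightforward recursion over the structure of the code. The sketch below implements it over a tiny hand-rolled command tree; the types Cmd, Update, Seq, IfElse and While are hypothetical and written only for this example, and all expressions are assumed to be side-effect free (so ModE is ignored).

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical mini-AST for commands, used only to illustrate Mod.
class ModDemo {
    interface Cmd {
        Set<String> mod();
    }

    // x = E, i++, i-- or a[k] = E: modifies exactly its target.
    static final class Update implements Cmd {
        private final String target;
        Update(String target) { this.target = target; }
        public Set<String> mod() {
            Set<String> result = new TreeSet<>();
            result.add(target);
            return result;
        }
    }

    // C1; C2
    static final class Seq implements Cmd {
        private final Cmd first, second;
        Seq(Cmd first, Cmd second) { this.first = first; this.second = second; }
        public Set<String> mod() { return union(first.mod(), second.mod()); }
    }

    // if(E){C1}else{C2}
    static final class IfElse implements Cmd {
        private final Cmd thenBranch, elseBranch;
        IfElse(Cmd thenBranch, Cmd elseBranch) {
            this.thenBranch = thenBranch;
            this.elseBranch = elseBranch;
        }
        public Set<String> mod() { return union(thenBranch.mod(), elseBranch.mod()); }
    }

    // while(E){C}
    static final class While implements Cmd {
        private final Cmd body;
        While(Cmd body) { this.body = body; }
        public Set<String> mod() { return body.mod(); }
    }

    static Set<String> union(Set<String> left, Set<String> right) {
        Set<String> result = new TreeSet<>(left);
        result.addAll(right);
        return result;
    }

    // Mod(x = E; while(E){ if(E){ i++ } else { a[k] = E } })
    static Set<String> example() {
        Cmd program = new Seq(
            new Update("x"),
            new While(new IfElse(new Update("i"), new Update("a[k]"))));
        return program.mod();
    }
}
```

For the example program the result is the set containing x, i and a[k], which are exactly the names that would need an old annotation in a proof obligation.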
Hoare Logic
In 1969 Sir Tony Hoare developed a new logic which allows for formal
reasoning directly with Hoare Triples (hence the name).
For example, there is an axiom for dealing with assignment:
P[x ↦ x_old] ∧ x = E[x ↦ x_old] → Q
───────────────────────────────────
{ P } x = E; { Q }
This axiom states that after the assignment we can establish any property
that is derivable from the pre-condition and the effect of the assignment.
The substitution [x ↦ x_old] reflects the modification of the variable x.
For example:
{ x = 5 } x = x + 2; { x = 7 }
can be proven by showing that:
x_old = 5 ∧ x = x_old + 2 → x = 7
In the above we write P[x ↦ y] to denote the predicate P with all free occurrences of x
replaced with y and similarly for the expression E. In this specific case, we are replacing
all occurrences of the variable x with references to that variable's old value x_old in the
pre-condition P (the state before execution of the code) and the assignment expression E
(which is evaluated before we have updated the value stored in x).
Hopefully you will recall from the Logic course last term, that when we write rules of
inference of the form:
 P
───
 Q
then this states that it is possible to prove the conclusion Q holds if we have shown that
the premise P holds.
For the rest of this course we will stick to a simplified presentation of such formal
arguments. However, we will be relating our approach to the formal Hoare Logic rules
that guide our style. In later years of the degree you will have the chance to study such
formalisms in more detail.
The interested reader can find a primer on Hoare Logic on Wikipedia at:
https://en.wikipedia.org/wiki/Hoare_logic
Hoare's original paper that launched the program verification field, "An Axiomatic
Basis for Computer Programming", can be found at:
https://www.cs.cmu.edu/~crary/819-f09/Hoare69.pdf
Hoare has also written a retrospective article on his career for the ACM's online magazine:
http://cacm.acm.org/magazines/2009/10/42360-retrospective-an-axiomatic-basis-for-computer-programming/fulltext
// PRE: P
code1
// MID: R
code2
// POST: Q

{ P } code1 { R }    { R } code2 { Q }
──────────────────────────────────────
{ P } code1; code2 { Q }
If you were developing your code, you would hopefully write some comments to guide
you. In a similar way, we use mid-conditions to guide us through the proof of the code.
This allows us to break down reasoning about a complex program into reasoning about
smaller fragments of the code.
1 // PRE: a = x ∧ b = y ∧ c = z (P)
2 c = a * b;
3 // MID: a = x ∧ b = y ∧ c = xy (M1)
4 b = b * b;
5 // MID: a = x ∧ b = y² ∧ c = xy (M2)
6 a = a * a;
7 // MID: a = x² ∧ b = y² ∧ c = xy (M3)
8 c = c + c;
9 // MID: a = x² ∧ b = y² ∧ c = 2xy (M4)
10 int result = a + b + c;
11 // POST: result = x² + 2xy + y² (Q)
You need to take care to preserve the values of a, b and c throughout the program to
ensure that you have all of the necessary information to establish the post-condition.
The proof obligations for this program are:
line 2:
P[c ↦ c_old] ∧ c = a * b; → M1
a = x ∧ b = y ∧ c_old = z ∧ c = a ∗ b → a = x ∧ b = y ∧ c = xy
line 4:
M1[b ↦ b_old] ∧ b = b * b; → M2
a = x ∧ b_old = y ∧ c = xy ∧ b = b_old ∗ b_old → a = x ∧ b = y² ∧ c = xy
line 6:
M2[a ↦ a_old] ∧ a = a * a; → M3
a_old = x ∧ b = y² ∧ c = xy ∧ a = a_old ∗ a_old → a = x² ∧ b = y² ∧ c = xy
line 8:
M3[c ↦ c_old] ∧ c = c + c; → M4
a = x² ∧ b = y² ∧ c_old = xy ∧ c = c_old + c_old → a = x² ∧ b = y² ∧ c = 2xy
line 10:
M4 ∧ int result = a + b + c; → Q
a = x² ∧ b = y² ∧ c = 2xy ∧ result = a + b + c → result = x² + 2xy + y²
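The mid-conditions of this program can also be checked dynamically, one statement at a time. The sketch below runs the code from an arbitrary state satisfying the pre-condition and tests M1 to M4 and Q as it goes; the class name and the check helper are hypothetical, and a passing run is evidence for particular inputs rather than a proof.

```java
// Hypothetical sketch: executes the program and checks each mid-condition.
class SquareOfSumTrace {
    static int run(int x, int y, int z) {
        int a = x, b = y, c = z; // PRE: a = x ∧ b = y ∧ c = z
        c = a * b;
        check(a == x && b == y && c == x * y, "M1");
        b = b * b;
        check(a == x && b == y * y && c == x * y, "M2");
        a = a * a;
        check(a == x * x && b == y * y && c == x * y, "M3");
        c = c + c;
        check(a == x * x && b == y * y && c == 2 * x * y, "M4");
        int result = a + b + c;
        check(result == x * x + 2 * x * y + y * y, "Q");
        return result;
    }

    static void check(boolean condition, String label) {
        if (!condition) {
            throw new IllegalStateException(label + " does not hold");
        }
    }
}
```

For instance, run(3, 4, 0) passes every check and returns 49, which is (3 + 4)².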
Pre-/Post-/Mid-conditions
Pre-condition:
required to hold before some code is run
an assumption that the code can make
Post-condition:
Note that if a mid-condition evaluates to false, then this specifies that the program
cannot reach that location in the code. This may sometimes be correct, but more often
than not it suggests that your mid-condition is incorrect.
Method Specifications
type someMethod(type x1, ..., type xn)
// PRE: P
// POST: Q
{
    ...
}
In the above we write P[x1 ↦ v1, ..., xn ↦ vn] to denote the predicate P with all free
occurrences of x1, ..., xn replaced with v1, ..., vn respectively.
Note that v1, ..., vn are values, while x1, ..., xn are program variables. We will sometimes
refer to such lists of values/variables with vector notation to simplify the presentation.
e.g. x or v.
{
    int c = x * y;
    int b = y * y;
    int a = x * x;
    c = c + c;
    return a + b + c;
}
Method Bodies
{
    code
}
If property P holds before the execution of code then after the execution
of code property Q must hold.
i.e.
{ P } code { Q }
Method Bodies
What if the body of the method consists of multiple lines of code?
type someMethod(type x1, ..., type xn)
// PRE: P
// POST: Q
{ // MID: R
    code1
    // MID: S
    code2
    // MID: T
}
Then, as before, we introduce appropriate mid-conditions, such that:
1. We can establish R from P.
Using properties R and T helps us to account for the method parameter book-keeping.
Sometimes we omit properties R and T if the code's behaviour is straightforward.
The substitution Q[x ↦ x_pre] accounts for Java's call-by-value semantics.
6 int c = x * y;
7 // MID: c = xy (M1)
8 int b = y * y;
9 // MID: b = y² ∧ c = xy (M2)
10 int a = x * x;
11 // MID: a = x² ∧ b = y² ∧ c = xy (M3)
12 c = c + c;
13 // MID: a = x² ∧ b = y² ∧ c = 2xy (M4)
14 return a + b + c;
15 }
The code above does not modify the input parameter x, so you do not need to distinguish
between x, x_pre or x_old throughout. Similarly for y, y_pre and y_old.
As with the straight-line version of the code, you need to take care to preserve the values
of a, b and c throughout the program to ensure that you have all of the necessary
information to establish the post-condition. However, we also require some additional
book-keeping to deal with the call-by-value method parameter passing semantics of Java.
In full, the proof obligations for this program are:
line 6:
M0 ∧ int c = x * y; → M1
true ∧ c = x ∗ y → c = xy
line 8:
M1 ∧ int b = y * y; → M2
c = xy ∧ b = y ∗ y → b = y² ∧ c = xy
line 10:
M2 ∧ int a = x * x; → M3
b = y² ∧ c = xy ∧ a = x ∗ x → a = x² ∧ b = y² ∧ c = xy
line 12:
M3[c ↦ c_old] ∧ c = c + c; → M4
a = x² ∧ b = y² ∧ c_old = xy ∧ c = c_old + c_old → a = x² ∧ b = y² ∧ c = 2xy
line 14:
M4 ∧ return a + b + c; → Q[x ↦ x_pre, y ↦ y_pre]
a = x² ∧ b = y² ∧ c = 2xy ∧ r = a + b + c → r = x_pre² + 2·x_pre·y_pre + y_pre²
6 int z = x * y;
7 // MID: x = x_pre ∧ y = y_pre ∧ z = x_pre·y_pre (M1)
8 y = y * y;
9 // MID: x = x_pre ∧ y = y_pre² ∧ z = x_pre·y_pre (M2)
10 x = x * x;
11 // MID: x = x_pre² ∧ y = y_pre² ∧ z = x_pre·y_pre (M3)
12 z = z + z;
13 // MID: x = x_pre² ∧ y = y_pre² ∧ z = 2·x_pre·y_pre (M4)
14 return x + y + z;
15 }
Drossopoulou & Wheelhouse (DoC) Discrete Mathematics, Logic & Reasoning 22 / 23
This version of the code does modify the input parameters x and y, so we need to track
the values of these variables more carefully during our specification of the code.
In full, the proof obligations for this version of the program are:
line 6:
M0 ∧ int z = x * y; → M1
x = x_pre ∧ y = y_pre ∧ z = x ∗ y → x = x_pre ∧ y = y_pre ∧ z = x_pre·y_pre
line 8:
M1[y ↦ y_old] ∧ y = y * y; → M2
x = x_pre ∧ y_old = y_pre ∧ z = x_pre·y_pre ∧ y = y_old ∗ y_old → x = x_pre ∧ y = y_pre² ∧ z = x_pre·y_pre
line 10:
M2[x ↦ x_old] ∧ x = x * x; → M3
x_old = x_pre ∧ y = y_pre² ∧ z = x_pre·y_pre ∧ x = x_old ∗ x_old → x = x_pre² ∧ y = y_pre² ∧ z = x_pre·y_pre
line 12:
M3[z ↦ z_old] ∧ z = z + z; → M4
x = x_pre² ∧ y = y_pre² ∧ z_old = x_pre·y_pre ∧ z = z_old + z_old → x = x_pre² ∧ y = y_pre² ∧ z = 2·x_pre·y_pre
line 14:
M4 ∧ return x + y + z; → Q[x ↦ x_pre, y ↦ y_pre]
x = x_pre² ∧ y = y_pre² ∧ z = 2·x_pre·y_pre ∧ r = x + y + z → r = x_pre² + 2·x_pre·y_pre + y_pre²
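The need to talk about the values of the parameters on entry can be seen directly in Java. The sketch below wraps the parameter-modifying body in a method, snapshots the parameters on entry, and shows that the caller's variables are untouched by the call. The method name squareOfSum, the snapshot names xPre and yPre, and the caller are all assumptions made for this illustration.

```java
// Hypothetical sketch of call-by-value book-keeping.
class CallByValueDemo {
    static int squareOfSum(int x, int y) {
        final int xPre = x, yPre = y; // values of the parameters on entry
        int z = x * y;
        y = y * y;                    // modifies the local copy of y only
        x = x * x;                    // modifies the local copy of x only
        z = z + z;
        int r = x + y + z;
        // The result is stated in terms of the entry values, not the current x and y.
        if (r != xPre * xPre + 2 * xPre * yPre + yPre * yPre) {
            throw new IllegalStateException("post-condition does not hold");
        }
        return r;
    }

    // Returns {p, q, r} as seen by the caller after the call.
    static int[] callerView() {
        int p = 3, q = 4;
        int r = squareOfSum(p, q);
        return new int[] {p, q, r};   // p and q are unchanged by the call
    }
}
```

Inside the method x ends up holding 9, yet the caller still sees 3, which is why the post-condition has to be phrased over the entry values.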
[Extra] Weakest Pre-conditions and Strongest Post-conditions
Whilst we do not cover this area as part of the Reasoning About Programs course
curriculum, the general idea is actually quite simple.
The Weakest Pre-condition for a program is the most general (or weakest) property
which a program requires to function correctly (i.e. to satisfy its post-condition without
faulting) and which can be inferred from all correct pre-conditions.
For example, the weakest pre-condition of the tiny program n++ with a desired post-
condition of n > 1 would be n > 0. Then any concrete case, such as n = 5, would imply
the weakest pre-condition.
Conversely, the Strongest Post-condition for a program is the most specific (or strongest)
property which holds after the program has run on a state satisfying its pre-condition,
and from which any correct post-condition can be derived.
Again, taking the simple program n++ as an example, now with a pre-condition of n = 5,
the strongest post-condition would be n = 6. Any other more general case, such as n > 1,
could then be derived from the strongest post-condition.
Weakest pre-conditions and strongest post-conditions are primarily used in automated
verification techniques. Typically you use only one, depending on whether you are working
through the program backwards or forwards.
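The n++ examples above can be explored with a small brute-force check. The sketch below compares the candidate weakest pre-condition n > 0 against the start values from which n++ actually establishes n > 1, and computes the state reached from n = 5. The class and method names are hypothetical, and the comparison is restricted to a small range so that integer overflow at Integer.MAX_VALUE is deliberately ignored.

```java
// Hypothetical sketch: brute-force exploration of wp and sp for n++.
class WpSpDemo {
    // Does the post-condition n > 1 hold after running n++ from this start value?
    static boolean postHoldsAfterIncrement(int n) {
        n++;
        return n > 1;
    }

    // Candidate weakest pre-condition from the text: n > 0.
    static boolean candidateWp(int n) {
        return n > 0;
    }

    // True if, on [lo, hi], the candidate holds for exactly those start values
    // from which the code establishes the post-condition.
    static boolean wpMatchesOnRange(int lo, int hi) {
        for (int n = lo; n <= hi; n++) {
            if (candidateWp(n) != postHoldsAfterIncrement(n)) {
                return false;
            }
        }
        return true;
    }

    // The state reached by n++ from a single start value; for the
    // pre-condition n = 5 this gives the strongest post-condition n = 6.
    static int stateAfterIncrementFrom(int n) {
        n++;
        return n;
    }
}
```

Checking from the post-condition back to the start values corresponds to working backwards, while computing the reached state from n = 5 corresponds to working forwards.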