Abstraction and Modular Reasoning
for the Verification of Software
Corina Pasareanu
October, 2001
Thesis Committee:
Matthew Dwyer, Major Advisor
David Schmidt
Michael Huth
George Strecker
Kenneth Kemp, Outside Chairperson
Finite-state Verification
[Figure: a finite-state system and a specification are given to a verification tool, which answers either OK or an error trace listing program lines (Line 5: …, Line 12: …, Line 15: …, Line 21: …, Line 25: …, Line 27: …, …, Line 41: …, Line 47: …).]
Finite-state Verification of Software
Techniques for checking correctness of concurrent programs:
Formal verification
– That a program satisfies a complete specification
Dynamic methods/testing
– That examine the outcome of executions of the program
– That check assertions during executions
Finite state verification (FSV)
– Automated (like testing)
– Guarantees satisfaction of properties (like formal verification)
– Useful for debugging (counter-examples)
FSV is a potentially very useful technique for ensuring
high-quality software
Finite State Verification of Software
Challenges:
Creation of tractable models
… software has large/infinite state space
Modular reasoning about software components
… using tools that reason about complete systems
Solutions:
Abstract interpretation
… to reduce the data domains of a program
Program completion
… with code for missing components
… use environment assumptions, assume-guarantee paradigm
Goals of our work …
Develop multiple forms of tool support for …
… abstraction and program completion
… applicable to program source code
… largely automated, usable by non-experts
We concentrate on building …
… finite-state models, not finite-state tools
We evaluate the effectiveness of this tool support through…
… implementation
… application to real Java and Ada programs
General Methodology
[Figure: methodology flow.
Inputs: Partial Program, Property Y, Assumption F.
Program Completion (universal and synthesized environments) and Data Abstraction (heuristics for abstraction selection) produce the Abstracted Completed Program.
A Translator produces the Model of Program.
Model Checking of <F,Y> (SPIN, SMV, JPF…) answers yes (Property True!) or no (Counter-example).
Counter-example Analysis (choose-free search and guided simulation) either reports Property False! or feeds back into Refine Selection.]
Contributions
Integration of existing technologies into a coherent
methodology that enables FSV of software systems
Development of new technologies:
– Use of theorem proving to automatically derive
abstractions of data types
– Instrumentation of source code to enable assume-
guarantee reasoning about partial programs
– Two techniques for analysis of abstract counter-
examples
Formalization and implementation of these technologies
Evaluation of technologies on case studies (Java, Ada)
Context
The Bandera toolset (KSU)
– Collection of program analysis and transformation
components
– Allows users to analyze properties of Java programs
– Tailor analysis to property to minimize analysis time
The Ada translation toolset
Java PathFinder (NASA Ames)
– Model checker for Java programs
– Built on top of a custom-made Java Virtual Machine
– Checks for deadlock and violations of assertions
Related Projects
The Microsoft Research SLAM project
G. Holzmann’s Feaver tool (Bell Labs)
Stoller’s stateless checker for Java programs
Godefroid’s Verisoft (Bell Labs)
David Dill’s Hardware Verification Group
Eran Yahav’s Java checking tool
Outline of the Talk
Introduction
Data Abstraction (FSE’98, ICSE’01)
Program Completion
Abstract Counter-example Analysis
Formal Justification
Case Studies: Java and Ada Programs
Conclusions, Limitations and Future Work
Finite-state Verification
Effective for analyzing properties of
hardware systems
Widespread success and
adoption in industry
Recent years have seen many efforts to
apply those techniques to software
Limited success due to the
enormous state spaces
associated with most software systems
Abstraction: the key to scaling up
[Figure: abstraction maps the original system to an abstract system; each symbolic state of the abstract system represents a set of states of the original system.]
Safety: The set of behaviors of the abstract system over-approximates
the set of behaviors of the original system
Data Type Abstraction
Collapses data domains via abstract interpretation:
Concrete code (data domain: int):
  int x = 0;
  if (x == 0)
    x = x + 1;
Abstraction function:
  (n<0) : NEG
  (n==0): ZERO
  (n>0) : POS
Abstracted code (data domain: Signs = {NEG, ZERO, POS}):
  Signs x = ZERO;
  if (Signs.eq(x,ZERO))
    x = Signs.add(x,POS);
Our hypothesis
Abstraction of data domains is necessary
Automated support for
– Defining abstract domains (and operators)
– Selecting abstractions for program
components
– Generating abstract program models
– Interpreting abstract counter-examples
will make it possible to
– Scale property verification to realistic systems
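Abstraction in Bandera
[Figure: abstraction pipeline in Bandera. An Abstraction Definition written in the Bandera Abstraction Specification Language (BASL) is checked with PVS and compiled by the BASL Compiler into the Abstraction Library. The Abstract Code Generator uses the library and the per-variable type table to turn the Program into the Abstracted Program.]
Variable   Concrete Type   Abstract / Inferred Type
x          int             Signs
y          int             Signs Signs
done       bool            bool
count      int             int
….         ….
o          Object          Point
b          Buffer          Buffer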
Definition of Abstractions in BASL
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
  begin
    n < 0  -> {NEG};
    n == 0 -> {ZERO};
    n > 0  -> {POS};
  end
  operator + add
  begin
    (NEG , NEG)  -> {NEG} ;
    (NEG , ZERO) -> {NEG} ;
    (ZERO, NEG)  -> {NEG} ;
    (ZERO, ZERO) -> {ZERO} ;
    (ZERO, POS)  -> {POS} ;
    (POS , ZERO) -> {POS} ;
    (POS , POS)  -> {POS} ;
    (_,_)        -> {NEG,ZERO,POS};  /* case (POS,NEG),(NEG,POS) */
  end
end
Automatic generation: the operator table for + is generated from the definition of abstract(n).
Example: Start safe, then refine: +(NEG,NEG)={NEG,ZERO,POS}
Proof obligations submitted to PVS...
Forall n1,n2: neg?(n1) and neg?(n2) implies not pos?(n1+n2)
Forall n1,n2: neg?(n1) and neg?(n2) implies not zero?(n1+n2)
Forall n1,n2: neg?(n1) and neg?(n2) implies not neg?(n1+n2)
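The "start safe, then refine" step can be illustrated with a bounded check. The sketch below is only a stand-in for the PVS proof obligations: rather than proving an obligation for all integers, it tests a small range and keeps a token only when some concrete pair of operands witnesses it. The names `Sign`, `SignsRefine`, `refine` and the `bound` parameter are hypothetical, and a bounded test does not replace the theorem prover.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration: a bounded stand-in for the PVS proof obligations.
enum Sign { NEG, ZERO, POS }

class SignsRefine {
    // Abstraction function from the slide: n<0 -> NEG, n==0 -> ZERO, n>0 -> POS.
    static Sign abs(int n) {
        if (n < 0) return Sign.NEG;
        if (n == 0) return Sign.ZERO;
        return Sign.POS;
    }

    // Tokens that +(a,b) can produce for concrete operands in [-bound, bound].
    // A token that is never produced corresponds to an obligation of the form
    // "a(n1) and b(n2) implies not t(n1+n2)" that holds on the tested range only.
    static Set<Sign> refine(Sign a, Sign b, int bound) {
        Set<Sign> result = EnumSet.noneOf(Sign.class);
        for (int n1 = -bound; n1 <= bound; n1++) {
            if (abs(n1) != a) continue;
            for (int n2 = -bound; n2 <= bound; n2++) {
                if (abs(n2) != b) continue;
                result.add(abs(n1 + n2));
            }
        }
        return result;
    }
}
```

For +(NEG,NEG) the check starts from the safe answer {NEG,ZERO,POS} and retains only NEG, which matches the first two obligations succeeding and the third failing. For the mixed case +(POS,NEG) all three tokens survive.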
Compiling BASL Definitions
BASL source:
  abstraction Signs abstracts int
  begin
    TOKENS = { NEG, ZERO, POS };
    abstract(n)
    begin
      n < 0  -> {NEG};
      n == 0 -> {ZERO};
      n > 0  -> {POS};
    end
    operator + add
    begin
      (NEG , NEG)  -> {NEG} ;
      (NEG , ZERO) -> {NEG} ;
      (ZERO, NEG)  -> {NEG} ;
      (ZERO, ZERO) -> {ZERO} ;
      (ZERO, POS)  -> {POS} ;
      (POS , ZERO) -> {POS} ;
      (POS , POS)  -> {POS} ;
      (_,_)        -> {NEG, ZERO, POS};  /* case (POS,NEG), (NEG,POS) */
    end
  end
Compiled Java:
  public class Signs {
    public static final int NEG  = 0; // mask 1
    public static final int ZERO = 1; // mask 2
    public static final int POS  = 2; // mask 4
    public static int abs(int n) {
      if (n < 0) return NEG;
      if (n == 0) return ZERO;
      return POS; // n > 0
    }
    public static int add(int arg1, int arg2) {
      if (arg1==NEG && arg2==NEG) return NEG;
      if (arg1==NEG && arg2==ZERO) return NEG;
      if (arg1==ZERO && arg2==NEG) return NEG;
      if (arg1==ZERO && arg2==ZERO) return ZERO;
      if (arg1==ZERO && arg2==POS) return POS;
      if (arg1==POS && arg2==ZERO) return POS;
      if (arg1==POS && arg2==POS) return POS;
      return Bandera.choose(7); /* case (POS,NEG), (NEG,POS) */
    }
  }
Data Type Abstractions
Library of abstractions for base types contains:
– Range(i,j), i..j modeled precisely, e.g., Range(0,0) is the
signs abstraction
– Modulo(k), Set(v,…)
– Point maps all concrete values to unknown
– User extendable for base types
Array abstractions
– Specified by an index abstraction and an element
abstraction
Class abstractions
– Specified by abstractions for each field
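As a rough illustration of two of the library entries, the sketch below gives possible abstraction functions for Range(i,j) and Modulo(k). The class names and the token names `LT` and `GT` are assumptions made for this example, not the names used in the Bandera library.

```java
// Hypothetical sketch of two base-type abstractions; token encodings are assumptions.
class RangeAbs {
    final int lo, hi;

    RangeAbs(int lo, int hi) {
        if (lo > hi) throw new IllegalArgumentException("lo must not exceed hi");
        this.lo = lo;
        this.hi = hi;
    }

    // Values in lo..hi are modeled precisely (returned as text);
    // everything below maps to "LT" and everything above to "GT".
    String abs(int n) {
        if (n < lo) return "LT";
        if (n > hi) return "GT";
        return Integer.toString(n);
    }
}

class ModuloAbs {
    final int k;

    ModuloAbs(int k) {
        if (k <= 0) throw new IllegalArgumentException("k must be positive");
        this.k = k;
    }

    // Maps n to its residue class 0..k-1; floorMod keeps the result
    // non-negative for negative n.
    int abs(int n) {
        return Math.floorMod(n, k);
    }
}
```

With these definitions, `new RangeAbs(0, 0)` has exactly three tokens (`LT`, `0`, `GT`), which is the sense in which Range(0,0) coincides with the signs abstraction.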
Comparison to Related Work
Predicate abstraction (Graf, Saidi)
– We use PVS to abstract operator definitions, not
complete systems
– We can reuse abstractions for different systems
Tool support for program abstraction
– e.g., SLAM, JPF, Feaver
Abstraction at the source-code level
– Supports multiple checking tools
– e.g., JPF, Java Checker/Verisoft, FLAVERS/Java, …
Outline of the Talk
Introduction
Data Abstraction
Program Completion (FSE’98, SPIN’99)
Abstract Counter-example Analysis
Formal Justification
Case Studies: Java and Ada Programs
Conclusions, Limitations and Future Work
Program Completion
Software systems are collections of software
components
Approach similar to unit testing of software
– Applied to components (units) that are code complete
– Stubs and drivers simulate environment for unit
We complete a system with a source code
representation for missing components
– Universal environments
– Environment assumptions used to refine definitions of
missing components
Example of Universal Environment
The stub for an Ada partial program, with an
Insert procedure and a Remove function
procedure stub is
  choice: Integer;
  theObject: Object_Type;
begin
  loop
    case choice is
      when 1 => Insert(theObject);
      when 2 => theObject:=Remove;
      when 3 => null;
      when others => exit;
    end case;
  end loop;
end stub;
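The same idea can be sketched in Java. This is an illustrative analogue of the Ada stub, not code produced by the toolset: the `Container` interface, the `UniversalStub` class and the `IntSupplier` that stands in for the model checker's nondeterministic choice are all assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical interface of the component under analysis.
interface Container<T> {
    void insert(T item);
    T remove();
}

class UniversalStub {
    // Drives the component with an arbitrary sequence of interface operations.
    // `choice` stands in for the nondeterministic choice made by the model checker.
    // Returns the names of the interface operations that were invoked, in order.
    static <T> List<String> run(Container<T> c, T theObject, IntSupplier choice) {
        List<String> trace = new ArrayList<>();
        while (true) {
            switch (choice.getAsInt()) {
                case 1:
                    c.insert(theObject);
                    trace.add("Insert");
                    break;
                case 2:
                    theObject = c.remove();
                    trace.add("Remove");
                    break;
                case 3:
                    break; // no interface operation in this step
                default:
                    return trace; // leave the loop, as "exit" does in the Ada stub
            }
        }
    }
}
```

Because every choice is allowed in every iteration, the stub places no restriction on the order of Insert and Remove calls, which is what makes the environment universal.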
Assume-guarantee Model Checking
A system specification is a pair <F,Y>
– Y describes the property of the system
– F describes the assumption about the environment
Problem: check property Y vs. System, assuming F
Linear temporal paradigm:
– Check F -> Y (LTL) vs. Universal environment || System
2nd approach:
– Check Y (LTL, ACTL) vs. Synthesized environment || System
– The environment is synthesized from F (LTL)
Synthesis of Environments
Environment assumptions F
– Encode behavioral info. about system interfaces
– Written in LTL
“Classical” tableau construction for LTL
– Modified to assume that only one interface
operation can occur at a time
– Builds a model of F :
• I.e., a graph with arcs labeled by interface operations
• Can be easily translated into source code
Graph Generated from Assumption: (! Remove) U Insert
[Figure: graph with numbered states (8, 9 and 16 are shown) and arcs labeled Insert, Remove and t.]
Synthesized Code
procedure stub is
  state,choice: Integer;
  theObject: Object_Type;
begin
  state:=0;
  loop
    case state is
      when 0 => case choice is
          when 1 => Insert(theObject);state:=8;
          when 2 => null;state:=9;
          when others => exit;
        end case;
      when 8 => case choice is
          when 1 => Insert(theObject);state:=8;
          when 2 => theObject:=Remove;state:=16;
          when 3 => null; state:=8;
          when others => exit;
        end case;
      when 9 => case choice is
          when 1 => Insert(theObject);state:=8;
          when 2 => null;state:=9;
          when others => exit;
        end case;
      when 16=> case choice is
          when 1 => Insert(theObject);state:=16;
          when 2 => theObject:=Remove;state:=16;
          when 3 => null; state:=16;
          when others => exit;
        end case;
    end case;
  end loop;
end stub;
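A Java rendering of the same state machine makes the effect of the assumption easier to test. The sketch below mirrors the states 0, 8, 9 and 16 of the Ada stub but only records operation names instead of calling a real component. `SynthesizedStub` and the `IntSupplier` used for nondeterministic choice are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class SynthesizedStub {
    // Runs the environment synthesized from (! Remove) U Insert.
    // `choice` stands in for the model checker's nondeterministic choice.
    // Returns the interface operations emitted, in order.
    static List<String> run(IntSupplier choice) {
        List<String> trace = new ArrayList<>();
        int state = 0;
        while (true) {
            int ch = choice.getAsInt();
            switch (state) {
                case 0:
                case 9:
                    // Before the first Insert, Remove is never offered.
                    if (ch == 1) { trace.add("Insert"); state = 8; }
                    else if (ch == 2) { state = 9; }
                    else return trace;
                    break;
                case 8:
                    if (ch == 1) { trace.add("Insert"); state = 8; }
                    else if (ch == 2) { trace.add("Remove"); state = 16; }
                    else if (ch == 3) { state = 8; }
                    else return trace;
                    break;
                case 16:
                    // The assumption is discharged; any operation is allowed.
                    if (ch == 1) trace.add("Insert");
                    else if (ch == 2) trace.add("Remove");
                    else if (ch != 3) return trace;
                    break;
                default:
                    throw new IllegalStateException("unknown state " + state);
            }
        }
    }
}
```

States 0 and 9 are merged here because the Ada code gives them identical transitions. Whatever choices are supplied, a non-empty trace begins with Insert, so the stub never produces a Remove before the first Insert.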
Comparison to Related Work
Much theoretical work on compositional
verification
Avrunin, Dillon, Corbett:
– Analysis of real-time systems, described as a mixture of
source code and specifications
Colby, Godefroid, Jagadeesan:
– Automatable approach to complete a partially specified
system
– Use the VeriSoft toolset
– No environment assumptions
Long:
– Synthesis of environments from ACTL
Outline of the Talk
Introduction
Data Abstraction
Program Completion
Abstract Counter-example Analysis (TACAS’01)
Formal Justification
Case Studies: Java and Ada Programs
Conclusions, Limitations and Future Work
Abstract Counter-example Analysis
For an abstracted program, a counter-example
may be infeasible because:
– Over-approximation introduced by abstraction
Example:
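Concrete: x = -2; if(x + 2 == 0) then ...
Abstract: x = NEG; if(Signs.eq(Signs.add(x,POS),ZERO)) then ...
Here Signs.add(x,POS) evaluates to {NEG,ZERO,POS}, so the abstract program can take either branch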
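The example can be replayed in a few lines of Java. The sketch below is illustrative only: `SpuriousBranch` and its methods are hypothetical names, and the abstract addition is written directly from the Signs table rather than taken from generated code.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

enum Sign { NEG, ZERO, POS }

class SpuriousBranch {
    static Sign abs(int n) {
        return n < 0 ? Sign.NEG : (n == 0 ? Sign.ZERO : Sign.POS);
    }

    // Abstract addition following the Signs table; mixed signs yield every token.
    static Set<Sign> add(Sign a, Sign b) {
        if (a == Sign.ZERO) return EnumSet.of(b);
        if (b == Sign.ZERO) return EnumSet.of(a);
        if (a == b) return EnumSet.of(a);
        return EnumSet.allOf(Sign.class);
    }

    // Outcomes of the test "x + 2 == 0" that the abstract program allows.
    static Set<Boolean> abstractOutcomes(int x) {
        Set<Boolean> outcomes = new HashSet<>();
        for (Sign s : add(abs(x), abs(2))) {
            outcomes.add(s == Sign.ZERO); // only ZERO equals ZERO
        }
        return outcomes;
    }

    // Outcome of the same test in the concrete program.
    static boolean concreteOutcome(int x) {
        return x + 2 == 0;
    }
}
```

For x = -2 the concrete test is true, while the abstract program allows both true and false. Any counter-example that follows the false branch is therefore infeasible in the concrete program.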
Our Solutions
Choice-bounded State Space Search
– “on-the-fly”, during model checking
Abstract Counter-example Guided
Concrete Simulation
– Exploit implementations of abstractions
for Java programs
– Effective in practice
Choose-free state space search
Theorem [Saidi:SAS’00]
Every path in the abstracted program where all
assignments are deterministic is a path in the
concrete program.
Bias the model checker
– to look only at paths that do not include
instructions that introduce non-determinism
JPF model checker modified
– to detect non-deterministic choice (i.e. calls to
Bandera.choose()); backtrack from those
points
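The biasing step can be sketched as a depth-first search that refuses to follow transitions introduced by nondeterministic choice. This is a simplified model and not the JPF modification itself: the explicit edge list, the `viaChoose` flag and the class `ChooseFreeSearch` are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ChooseFreeSearch {
    // One transition of the abstracted program's state graph (hypothetical encoding).
    static final class Edge {
        final int from, to;
        final boolean viaChoose; // true if the step executes Bandera.choose()

        Edge(int from, int to, boolean viaChoose) {
            this.from = from;
            this.to = to;
            this.viaChoose = viaChoose;
        }
    }

    // Depth-first search for a path from `start` to `error` that never takes a
    // choose() step. Returns the path as a list of states, or null if none exists.
    static List<Integer> search(List<Edge> edges, int start, int error) {
        List<Integer> path = new ArrayList<>();
        path.add(start);
        Set<Integer> visited = new HashSet<>();
        visited.add(start);
        return dfs(edges, start, error, visited, path) ? path : null;
    }

    private static boolean dfs(List<Edge> edges, int s, int error,
                               Set<Integer> visited, List<Integer> path) {
        if (s == error) return true;
        for (Edge e : edges) {
            if (e.from != s || e.viaChoose) continue; // backtrack at choose() points
            if (!visited.add(e.to)) continue;         // state already explored
            path.add(e.to);
            if (dfs(edges, e.to, error, visited, path)) return true;
            path.remove(path.size() - 1);
        }
        return false;
    }
}
```

A path returned by `search` contains only deterministic steps, so by the theorem above it corresponds to a path of the concrete program. A `null` result says nothing about feasibility, since errors reachable only through choose() steps are not explored.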
Choice-bounded Search
[Figure: state space searched; branches that begin with a choose() step are cut off (marked X).]
Counter-example guided simulation
Use abstract counter-example to
guide simulation of concrete
program
Why it works:
– Correspondence between concrete and
abstracted program
– Unique initial concrete state (Java
defines default initial values for all data)
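A heavily simplified sketch of the idea follows. It assumes a program with a single int variable, models each program step as a function on that variable, and represents the abstract counter-example as the sequence of Signs tokens expected after each step. `GuidedSimulation` and `firstDivergence` are hypothetical names; real programs would need the full program state and thread schedule.

```java
import java.util.List;
import java.util.function.IntUnaryOperator;

enum Sign { NEG, ZERO, POS }

class GuidedSimulation {
    static Sign abs(int n) {
        return n < 0 ? Sign.NEG : (n == 0 ? Sign.ZERO : Sign.POS);
    }

    // Replays `steps` on the concrete variable, starting from Java's default
    // value 0, and compares the abstraction of each concrete value with the
    // token recorded in the abstract counter-example.
    // Returns -1 if the whole abstract trace is matched (the counter-example is
    // feasible), otherwise the index of the first step where the runs diverge.
    static int firstDivergence(List<IntUnaryOperator> steps, List<Sign> abstractTrace) {
        if (steps.size() != abstractTrace.size()) {
            throw new IllegalArgumentException("trace and program lengths differ");
        }
        int x = 0; // unique initial concrete state
        for (int i = 0; i < steps.size(); i++) {
            x = steps.get(i).applyAsInt(x);
            if (abs(x) != abstractTrace.get(i)) return i;
        }
        return -1;
    }
}
```

For the program `x = -2; x = x + 2`, the abstract trace NEG, ZERO is matched by the concrete run, while the abstract trace NEG, POS diverges at the second step and is therefore infeasible.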
Comparison to Related Work
Previous work:
– After model checking; analyze the
counter-example to see if it is feasible
Pre-image computations; theorem
prover based (InVest)
Forward simulation (CMU)
Symbolic execution (SLAM)
Outline of the Talk
Introduction
Data Abstraction
Program Completion
Abstract Counter-example Analysis
Formal Justification
Case Studies: Java and Ada Programs
Conclusions, Limitations and Future Work
Formal Justification
Our techniques build safe abstractions of software systems
P < P’ means “P’ is a safe abstraction of P”
– < is a simulation relation (Milner)
– Properties are expressed in universal temporal logics (LTL, ACTL)
preserved by <
Data Abstraction builds abstractions Pabs of systems P s.t.
P < Pabs
Universal Environments U:
P||E < P||U, for all environments E
Synthesized Environments TF from assumptions F:
P||E < P||TF, for all environments E that satisfy F
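For reference, the simulation preorder and the preservation result it gives can be written out as follows. This is the standard textbook formulation, not a formula taken from the slides. The notation is assumed for this sketch: systems are tuples of states, initial state, transition relation and state labeling, and the symbol \preceq plays the role of the slides' "<". The arrow macro requires the amsmath package.

```latex
% Assumed notation: P = (S, s_0, \to, L), P' = (S', s_0', \to', L').
P \preceq P' \iff \exists R \subseteq S \times S' \text{ with } (s_0, s_0') \in R
\text{ such that for all } (s, s') \in R:
\quad L(s) = L'(s')
\quad\text{and}\quad
\forall t.\; s \to t \implies \exists t'.\; s' \to' t' \,\wedge\, (t, t') \in R

% Preservation for universal properties \varphi (LTL, ACTL):
P \preceq P' \;\wedge\; P' \models \varphi \implies P \models \varphi

% Instance for synthesized environments, with E any environment satisfying F
% and T_F the environment synthesized from F:
P \parallel E \preceq P \parallel T_F \;\wedge\; P \parallel T_F \models Y \implies P \parallel E \models Y
```

The last line is the step that justifies checking Y against the synthesized environment instead of every environment that satisfies F.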
Outline of the Talk
Introduction
Data Abstraction
Program Completion
Abstract Counter-example Analysis
Formal Justification
Case Studies: Java and Ada Programs
Conclusions, Limitations and Future Work
Case Studies
Filter-based Model Checking of a Reusable Parameterized
Programming Framework (FSE’98)
– Ada implementation of the Replicated Workers Framework
Model Checking Generic Container Implementations (GP’98)
– Ada implementations of generic data structures:
• queue, stack and priority queue
Abstracted using data type abstraction
Completed with code for universal environments
Environments’ behavior refined based on LTL assumptions
We used the SPIN tool:
– To check 8/5 properties; 5/3 required use of assumptions
– To detect seeded errors
Case Studies (contd.)
Assume-guarantee Model Checking of Software: A Comparative
Case Study (SPIN’99)
– Ada implementations of software components:
• The Replicated Workers Framework, the Generic Containers
• The Gas Station, the Chiron Client
Out of 39 properties, 15 required use of assumptions
For <F,Y>, we compared two ways of completing systems:
– Universal environment: check F -> Y (LTL), using SPIN
– Synthesized environment from F (LTL): check Y (LTL/ACTL), using SPIN/SMV
Synthesized environments enable faster verification
Case Studies (contd.)
Analysis of the Honeywell DEOS Operating System (NASA’00,ICSE’01)
– A real time operating system for integrated modular avionics
– Non-trivial concurrent program (1433 lines of code, 20 classes, 6 threads)
– Written in C++, translated into Java and Promela
– With a known bug
Verification of the system exhausted 4 Gigabytes of memory without
completion
Abstracted using data type abstraction
Completed with code for
– User threads that run on the kernel
– A system clock and a system timer
Environments’ behavior refined based on LTL assumptions
Checked using JPF and SPIN
Defect detected using choice-bounded search
Case Studies (contd.)
Finding Feasible Counter-examples when Model Checking
Abstracted Java Programs (TACAS’01)
We applied the two counter-example analysis techniques to defective
Java applications:
– Remote agent experiment, Pipeline, Readers-Writers, DEOS
Both techniques are fast:
– Choose-free search is depth-bounded
– Cost of simulation depends on the length of the counter-example
Choose-free counter-examples …
– are common
– are short
– enable more aggressive abstraction
Summary
We integrated several technologies into a coherent methodology
that enables FSV of complex, concurrent software systems
We developed technologies for…
– Data type abstraction:
• Enable non-experts to generate models that make FSV less costly
• Facilities to define, select and generate abstracted programs
– Program completion
• Enable verification of properties of individual components, collections of
components and entire systems
• Take into account assumptions about environment behavior
– Analysis of abstract counter-examples
• Choose-free state space search
• Counter-example guided simulation
We implemented and formalized these technologies
We evaluated them on extensive case studies (Ada and Java)
Future Work
Generalize choice-bounded search
technique
– CTL* model checking of abstracted programs
Extend abstraction support for objects
– Heap abstractions to handle an unbounded
number of dynamically allocated objects
Extend automation
– Automated selection and refinement of
abstractions based on counter-example analysis
Conclusions
Tool support for:
– Abstraction of data domains
– Program completion
– Interpretation of counter-examples
… is essential for the verification of
realistic software systems
Ensures the safety of the verification
process