Software Constraints For Large Application Systems: The Computer Journal October 1997

This document summarizes a research paper about using software constraints to improve structure and consistency in large, long-lived application systems. The paper proposes: 1) Categorizing different types of software constraints that can be automatically checked, such as constraints within or between software components. 2) An architecture for constraint checking tools consisting of two subsystems - one to collect information from software components, and one to evaluate constraints and check for violations. 3) Examples of generally applicable constraints and constraints specific to programming methods to ensure rules around software development are followed.


Software Constraints for Large Application Systems

Article  in  The Computer Journal · October 1997


DOI: 10.1093/comjnl/40.10.598 · Source: DBLP



Software Constraints for Large
Application Systems
DAG I. K. SJØBERG¹, RAY WELLAND² AND MALCOLM P. ATKINSON²
¹ Department of Informatics, University of Oslo, PO Box 1080 Blindern, N-0316 Oslo, Norway
² Department of Computing Science, University of Glasgow, 17 Lilybank Gardens, Glasgow G12 8QQ, UK
Email: [email protected]

As application systems live longer and grow in size and complexity, there is an ever increasing
need for methods and tools that can support software builders in constructing maintainable, well-
structured and consistent systems. This paper describes the notion of software constraints as an aid
to developing such systems. Software constraints make rules and conventions commonly agreed to
in a given programming environment explicit and automatically checkable. The potential usefulness
of software constraints was investigated in both industrial and research environments. A framework
for categorization of such constraints is defined. Constraints are proposed that are generally
applicable and others that are tightly connected to and support a certain programming method.
Tools for automatic checking are crucial if software constraints are to be used. An architecture for
such tools and two realizations are described.

Received October 16, 1996; revised December 2, 1997

1. INTRODUCTION

Large and long-lived application systems that satisfy a complete area of information-processing requirements, such as management information systems, health management systems, CAD/CAM systems, CASE tools, etc., must continuously undergo change in order to reflect change in their environments [1, 2, 3, 4]. To satisfy new requirements, code must be modified, which in turn may cause its structure to deteriorate [5] and introduce inconsistencies. Consequently, to ensure consistency in such systems, better methods, techniques and tools are required, and there is an increased need for standardization and discipline in software construction and maintenance.

The subject of this paper is how software constraints can improve structure and consistency of application systems, which we believe will simplify the maintenance of such systems. Automatically checkable software constraints can support software engineers who have chosen and made explicit commonly agreed rules and conventions. In order to understand how software constraints can be used and in which contexts, a framework for categorizing them is proposed.

The compiler of a programming language already performs many forms of consistency check such as type checking, ensuring that identifiers are declared and unique within a scope, etc. Modern compilers give warnings of local redundancy [6, 7]. Our main concern is complementary checks at a more macroscopic level: those among software components, and those between software components and data on a secondary (persistent) storage.

Our automated technology consists of two primary subsystems. The information collection system observes the software components and the maintenance process, and extracts pertinent data. This data is organized around names and is assembled in a structure we call the thesaurus, although in some senses it resembles a concordance, a data dictionary, a cross-reference database or a repository. The other subsystem performs various checks on the contents of the thesaurus, evaluating various constraints and providing advisory information about the current state of software consistency.

This architecture permits generality and language independence. The collection systems are independent analysers that scan stores, schema definitions, program sources, scripts, etc. Each such analyser must be specific to its information source, for example capable of analysing its language or data structures. Having collected this information it can be held in the thesaurus in a source-independent structure, but with references to those sources. The constraints can then describe rules that should be satisfied either within a source or between sources in a consistent notation.

The remainder of this paper is organized as follows. Section 2 discusses the problem domain in further detail, describes assumptions of our work, and defines some basic terms frequently used in the paper. Section 3 categorizes software constraints and provides several examples. A general implementation architecture and two constraint-checking tools are described in Section 4. Evaluation of our proposed constraints and constraint-checking tools is the issue of Section 5. Section 6 describes related work and Section 7 concludes.
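
As a rough illustration of the two-subsystem architecture outlined in this introduction, the following Python sketch models a thesaurus as a name-indexed collection of occurrences, filled by a source-specific collector and queried by a checker. It is a minimal sketch only: the names `Occurrence`, `Thesaurus`, `collect` and `check_unused`, the `declare`/`use` roles and the toy `decl x` / `use x` input format are all assumptions made for illustration and are not taken from the tools described in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    name: str    # the name that was observed
    source: str  # the component (file, schema, script) it was seen in
    role: str    # hypothetical roles: "declare" or "use"


class Thesaurus:
    """Source-independent store of name occurrences, indexed by name."""

    def __init__(self):
        self._by_name = defaultdict(list)

    def add(self, occurrence):
        self._by_name[occurrence.name].append(occurrence)

    def names(self):
        return sorted(self._by_name)

    def occurrences(self, name):
        return list(self._by_name.get(name, []))


def collect(source, lines, thesaurus):
    """Toy collector for an invented format: one 'decl x' or 'use x' per line.

    A real collector would be specific to the language or data structure
    of the source it analyses.
    """
    for line in lines:
        keyword, name = line.split()
        role = "declare" if keyword == "decl" else "use"
        thesaurus.add(Occurrence(name, source, role))


def check_unused(thesaurus):
    """Advisory check: names declared somewhere but used in no source."""
    warnings = []
    for name in thesaurus.names():
        roles = {occ.role for occ in thesaurus.occurrences(name)}
        if "declare" in roles and "use" not in roles:
            warnings.append(name)
    return warnings


thesaurus = Thesaurus()
collect("schema.sql", ["decl patient", "decl ward"], thesaurus)
collect("report.c", ["use patient"], thesaurus)
print(check_unused(thesaurus))
```

The checker never looks at the sources themselves, only at the thesaurus, which is what allows one constraint to span components written in different notations.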

THE COMPUTER JOURNAL, Vol. 40, No. 10, 1997



2. PROBLEM DOMAIN AND ASSUMPTIONS

A typical software construction or maintenance team will not find it difficult to agree that certain general rules should describe their software, for example:

• that for every component, all other components that are needed to use it or understand it should also exist; and
• that the body of software they maintain should not contain redundant parts.

However, such rules are not very helpful as they are so general that they are difficult to support automatically and lack the precision necessary to report violations usefully or to permit control of rule application and reporting. Hence, crucial issues for the successful exploitation of software constraints are:

• the actual constraints, which may depend on the programming language(s), the software engineering environment, the programming methodology being used, etc.;
• the availability of supporting tools for automatic system analysis and constraint checking; and
• the controls available to software engineers over the checks performed and the ways in which they are reported.

We assume that all useful constraint maintenance systems must be automated as it is well known that imposing extra work on programmers to supply information manually fails. Under the pressure of development work with tight deadlines, programmers either circumvent the requirement, or supply minimal information and fail to maintain it.

A violation of a constraint could be a logical error, it could indicate a situation that might eventually cause problems, or it could be a transient consequence of the stage the development has reached. In the last case, constraints may be deliberately violated. The constraint checking may thus be perceived as advisory more than mandatory.

The significance of constraints increases as the size and longevity of a project increase. A greater number of people who are working on or have worked on the software increases the chance of inconsistency as one programmer is unaware of decisions taken by others. The turnover of staff increases the number of programmers who have to spend increased time finding their way around the software if it is inconsistent. These points are illustrated by two concrete examples we have observed in two quite different systems.

In a hospital administrative system various programmers edited the relational schema, the set of ‘canned’ queries, the GUI scripts and the C/C++ programs. As programmers were not confident that they knew what all the other programmers were using, they were extremely reluctant to remove anything. Unsurprisingly, a consistency check across these components showed that many relations, queries and scripts were no longer used. However, the software team was spending considerable time carefully working around these redundant items each time they made a change.

In a language system constructed from many packages that had been developed by a succession of programmers, the space management code was significantly more complex because of various constructs to support multiple paths to objects. However, the rest of the software only ever used a single path. Revisions of the store manager had lovingly (but expensively!) preserved the multiple path capability, presumably because programmers working on that package had no convenient mechanism for establishing that the facility was not used in any of the other packages.

One underlying assumption of our approach is that names are highly significant in an engineering enterprise such as the construction and maintenance of a large software system by a team of people. In particular, we contend that these software engineers will name everything that is important to them, and depend on this naming when they return to a subsystem and when they communicate with one another. The habit of defining policies for naming and the prevalence of jargon within any group is anecdotal evidence that this general human property also applies in the software engineering context.

Other assumptions on which our work is based are:

• that we have access to all relevant information about the application system under development;
• that the primary data from which we extract information is reliable and accurate, including the applications’ components (source code, object code, data, etc.);
• that it will be possible to implement constraint checking so that it is economically justified and sufficiently unobtrusive and useful that programmers do not circumvent the system; and
• that the ultimate responsibility for application system quality remains with software engineers, so that they should retain control of what is checked, when it is checked, how it is reported and what responses should be made.

Five terms used frequently in the paper are defined as follows:

• Code: any fragment of program, for example a procedure.
• Software component: an association between a name and any collection of code that can be edited, rebuilt (recompiled, linked, etc.) and maintained as a single unit. Software components would typically be modelled as modules in Pascal, Modula-2 [8], Modula-3 [9] and Standard ML [10, 11], packages in Ada [12] and classes, interfaces, etc., in Java™ [13], etc. Schemata in relational databases and scripts are other examples.
• Store: persistent store, persistent object store, object-oriented database, relational database, file store or similar data repository.
• Persistent component: a software component in a store or an association between a name and a value, object, root, relation, file, or other collection of data also in a store.
• Persistent Application System (PAS): a software system with a common purpose comprising one or


TABLE 1. Dimensions of constraint categorization.

Dimension: Kind of constraint
Explanation: The purpose of the constraint
Examples:
    Eliminating redundant code or persistent components
        – eliminating unused code or persistent components
        – eliminating duplicate code
    Minimizing mutability
    Ensuring existence of required components at run-time

Dimension: Locality
Explanation: The containers of the names being subject to the constraint
Examples:
    Local, i.e. within software components
    Global
        – between names internal to different software components
        – between names internal to a software component and name of another software component
        – between names internal to a software component and named stored data
    External, i.e. between software components and external components

Dimension: Generality
Explanation: Context dependence of the constraint
Examples:
    Any combination of:
        programming language dependent
        data model dependent
        methodology dependent
        application dependent

more stores with a set of persistent components. A PAS has the potential to be large, long-lived and concurrently accessed.

3. SOFTWARE CONSTRAINTS

This section discusses categories of constraints and gives examples of constraints applicable in most programming environments.

3.1. Categorizing software constraints

Various kinds of software constraint (automatically checked or not) have been defined in both academia and industry. Meyers et al. [14, 15] distinguish among stylistic, implementation and design constraints, but there exists no commonly used framework for categorizing software constraints. Table 1 shows our first attempt to develop one. We categorize constraints along three dimensions: kind of constraint, locality and generality. (A discussion of the need for categorization and taxonomies in computer science in general can be found in [16].)

3.1.1. Kind of constraint

The kind of constraint categories are abstractions over the purpose of collections of constraints. The purpose of constraints for ‘eliminating redundant code or persistent components’ is to help prevent applications from becoming unnecessarily large, complex and confusing. Redundant source code and stored components are in particular likely to distract future maintenance programmers. It is a widely experienced problem in industry, also confirmed by our experiments, that programmers rarely remove code or files etc. because they are worried by the potential effects.

However, elimination of redundant information militates against planning for change. The constraints discourage, for example, the common practice by many programmers where identifier³ declarations are deliberately left in the code in the belief that the identifiers will be used later. (Their use may also be temporarily commented out.) Although we have experienced that this is an example of a poor strategy, particularly in the long run, there may be cases where change can be ‘anticipated’ and a sensible plan would be to place hooks for future changes. Therefore, it should be possible to omit parts of the application from checking.

³ In the context of source code, names are usually called identifiers.

Constraints aiming to ‘minimize mutability’ are applicable to both source code identifiers and persistent components. If a name denotes a variable, it should be updated with a new value, which in turn should be used somewhere. If it is not, it should be a constant, which may lead to improved performance and programming precision. Unnecessary use of variables can lead to redundant code, which in turn can lead to further redundant code. For example, if the expression on the right-hand side of an assignment is the only use of some other identifier, that use also becomes redundant. Moreover, a maintenance team might waste time trying to inspect the current value of a variable or ensure that it is correctly assigned.

If a component specified to be used by a software component does not exist at run-time, run-time problems will occur. Constraints for ‘ensuring existence of required items’ aim to reduce the likelihood of such cases.

An important kind of constraint not discussed further concerns documentation, which includes conventions for layout, commenting and naming. Naming is of particular interest as names are central to system builders’ thinking and thus influence the way software is organized. The choice of names is crucial to the readability of programs and is particularly important when trying to administer and manage


FIGURE 1. Components and relationships in a PAS.

TABLE 2. Local constraints.


(a) Eliminating unused identifiers: A declared identifier should be used at least once within a software
component
(b) Minimizing mutability: For every identifier declared as variable there should be some code in the
system that might update it, otherwise it should be constant
(c) Eliminating unused variables: If there is code that updates a variable, then there should also be code
that might subsequently access it
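
The three local constraints of Table 2 can be read as set operations over facts about a single software component. The sketch below is illustrative only: the `ComponentFacts` record and its fields are hypothetical stand-ins for whatever a real analyser would extract. The exemptions for identifiers of persistent components follow the discussion of constraints 2a and 2c in Subsection 3.1.2, and restricting check (b) to variables that are used at all is an added assumption that avoids reporting the same identifier twice.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentFacts:
    """Hypothetical per-component summary produced by some analyser."""
    declared: set = field(default_factory=set)    # all declared identifiers
    variables: set = field(default_factory=set)   # those declared as variable
    persistent: set = field(default_factory=set)  # identifiers of persistent components
    updated: set = field(default_factory=set)     # identifiers some code may update
    read: set = field(default_factory=set)        # identifiers some code may access


def check_local(facts):
    """Return advisory (constraint, identifier) pairs for Table 2 (a)-(c)."""
    used = facts.updated | facts.read
    warnings = []
    # (a) declared but never used; persistent identifiers are exempt.
    for name in sorted(facts.declared - used - facts.persistent):
        warnings.append(("a", name))
    # (b) variables that are used but never updated should be constants.
    for name in sorted((facts.variables & used) - facts.updated):
        warnings.append(("b", name))
    # (c) variables updated but never accessed; persistent ones are exempt.
    for name in sorted((facts.variables & facts.updated) - facts.read - facts.persistent):
        warnings.append(("c", name))
    return warnings


facts = ComponentFacts(
    declared={"total", "limit", "tmp", "log_file", "spare"},
    variables={"total", "limit", "tmp"},
    persistent={"log_file"},
    updated={"total", "tmp"},
    read={"total", "limit"},
)
print(check_local(facts))
```

Because the result is a plain list of warnings, it is easy to keep the checks advisory: a caller can filter out constraints or identifiers that the software engineers have chosen to exclude.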

change. Various naming guidelines have been proposed in the literature [17, 18]. The important point is that there is a naming scheme, not its exact form.

3.1.2. Locality

The locality dimension categorizes constraints according to the kind of container of the names and the kind of components they denote. The two kinds of persistent components, software components and data components, and the different relationships we consider are shown in Figure 1.

It is possible to define constraints at the fine granularity of statement, declaration, assignment, etc. However, the constraints at the finest granularity we consider are those local to a software component, as illustrated by relationship (1) in Figure 1. These are constraints within single compilation units, exemplified by those in Table 2, and are well understood and described in the literature, for example in the data flow community [19]. Exempt from constraint 2a are identifiers to persistent components. A file, relation, etc., may correctly be declared, but not used, within one software component. Similarly, mutable persistent components are exempt from constraint 2c since they may be accessed in other software components. Whether the persistent components actually are accessed elsewhere is captured by constraints defined at the global level.

Global software constraints operate at the PAS level, that is, they are defined across code in different compilation units and components in stores. Table 1 distinguishes among three categories of global constraints in compliance with three kinds of relationship shown in Figure 1. Relationship (2) is between names within different software components. Relationship (3) is between a name within a software component and the name of another software component. Relationship (4) is between a name within a software component and the name of a data component. Since our work focuses on global constraints, we present some detailed examples in Subsection 3.2.

External constraints involve relationships between a name within a software component and a name in an external PAS, for example a library or legacy system (relationship (5)). A constraint could be that the component denoted by the external name should be accessible.

3.1.3. Generality

The generality dimension categorizes software constraints according to whether or not they are dependent on the application programming language, data model, and methodology being applied by the software engineers, and the semantics of the actual application being developed. These four elements describing the context of a software constraint are not mutually exclusive; a constraint may depend or not depend on each of them, which gives 16 combinations.

Date [20] describes constraints in databases that are programming language and methodology independent. He


TABLE 3. Eliminating redundant and ambiguous meta-data.


(a) Eliminating unused types. There should be at least one occurrence of a statement that will generate
an instance of any declared type identifier
(b) Eliminating unused substructures of types. If a type definition introduces internal substructure,
such as named fields in a record, then code should exist that may utilise values from each
substructure
(c) Eliminating duplicate and ambiguous type definitions. A type name should be declared only once
within an application system:
(i) Type definitions should not be duplicated
(ii) Type definitions should have unique names
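
Constraint (c) of Table 3 can be illustrated by grouping type definitions by name and by structure. The following is a sketch under stated assumptions: type definitions are given as hypothetical `(name, component, structure)` triples, and `structure` is any hashable description of the type, so that structural equivalence reduces to equality. A real checker would need a language-specific notion of type equivalence.

```python
from collections import defaultdict


def check_type_definitions(definitions):
    """Group (name, component, structure) triples and report Table 3(c) issues.

    Returns three lists: names defined more than once with one structure
    (c)(i), names bound to different structures (c)(ii), and groups of
    distinct names that share a structure (informational only).
    """
    by_name = defaultdict(list)
    by_structure = defaultdict(set)
    for name, component, structure in definitions:
        by_name[name].append((component, structure))
        by_structure[structure].add(name)

    duplicated, clashing = [], []
    for name, entries in sorted(by_name.items()):
        if len(entries) < 2:
            continue
        structures = {structure for _, structure in entries}
        if len(structures) == 1:
            duplicated.append(name)  # same name, same structure
        else:
            clashing.append(name)    # same name, different types
    equivalent = sorted(
        sorted(names) for names in by_structure.values() if len(names) > 1
    )
    return duplicated, clashing, equivalent


person = (("name", "string"), ("age", "int"))
definitions = [
    ("Person", "billing", person),
    ("Person", "admissions", person),
    ("Id", "billing", "int"),
    ("Id", "pharmacy", "string"),
    ("Count", "ward", "int"),
]
print(check_type_definitions(definitions))
```

The third result is reported separately because distinct names with the same structure are not treated as a violation here.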

refers to application-independent constraints as ‘general integrity rules’ or ‘meta-rules’. They constrain the application structure independently of the actual application. ‘Specific integrity rules’ express constraints in the real-world application being modelled; general integrity rules are independent of a specific application but may depend on the type of data model being used (e.g. the relational model).

All the software constraint examples we present in this paper are independent of the programming language, data model and application. Those in Subsection 3.2 are also methodology independent, while those in Subsection 4.3.1 are examples of methodology-dependent constraints.

It may not always be simple to classify a constraint. For example, we consider the constraints we present that involve relationships of category (4) (Figure 1) as being programming language independent. However, one may argue that to some extent they are language dependent since there are, for example, functional languages that do not support I/O to secondary storage.

3.2. Examples of global software constraints

This section describes a set of global constraints among software components and another set of constraints between software components and a store. The samples we have identified are obviously not exhaustive; many other constraints may be applicable.

3.2.1. Constraints among software components

Table 3 defines five constraints on type definitions. A permitted violation of constraint 3a is the case where the type (class) is abstract in the sense that it is solely used for modelling purposes. In those cases one should be able to annotate the code with ‘virtual’, for example, enabling the constraint-checking tool to avoid reporting warning messages for those cases. One might also introduce a constraint that says that an abstract class should have at least one (or two) specialization(s).

In systems with flat naming space, the compiler ensures that a type name is declared in only one place (constraint 3c). In systems where types can be defined in different scopes, the constraint may be violated in two ways. First, two or more types might be defined with the same name and type structure in the overall application system. In that case they should be replaced by exactly one definition at a higher level in the hierarchy of scopes, that is the type definitions should be more global. Maintaining consistency requires that all declarations describing the same concept (e.g. Person) must be changed consistently if the intention is to modify the implementation of the concept (e.g. add a new attribute). It is difficult to arrange consistent changes when several programmers (responsible for several components), who require use of a common type, each write out equivalent type definitions (particularly if they are complex). It is even harder to ensure that when the type is amended, the same amendments are applied in every usage context. One concept should therefore be represented by only one type definition.

Second, type definitions may have the same name but denote different types. To avoid confusion, they should then be renamed to acquire unique names. This might be unrealistic in large PASs where names may be reused in different naming contexts, but the programmers should at least be aware of such clashes. The inverse, that several names denote structurally equivalent types, is accepted because semantically different types with different names may in practice have the same type structure (e.g. integer). A useful by-product of a tool for checking constraint 3c could be a list of equivalent type names.

An issue for future work is to examine the use of type functions and formulate and check appropriate constraints on these. Templates in object-oriented languages, type functions in Quest [21], functors in ML [22], parametric types in Napier88 [23], etc., have similar sets of rules as type definitions: they should be used at least once, they should not be multiply defined, etc.

Other constraints among software components of an application concern source code involving persistent components such as files in a file store, relations in a relational database, objects in an OODB or values in a persistent store (Table 4). There might be a few permitted exceptional cases to constraint 4a. For example, logging files may be frequently written to within an application, but they may be inspected only in an ad hoc way by programmers. Nevertheless, even that example indicates an unsatisfactory situation. The application should include software components that read the file for the purpose of undo, restore or examination of the audit trail.

Constraint 4b states that there should be exactly one


TABLE 4. Eliminating unused and duplicated persistent components.


(a) Eliminating unused persistent components. A persistent component specified to be created in a
storeᵃ by one software component should be used, in particular:
(i) There must exist at least one software component that potentially updates the component with a
meaningful (i.e. non-null) value
(ii) There must exist at least one software component that potentially reads that value (it should be a
different software component to that which specified the component creation)
(b) Eliminating duplicate persistent component creation. For each persistent component potentially
used by a software component there should be exactly one corresponding software component that
specifies the creation of that component
(c) Eliminating duplicate persistent component deletion. A persistent component should potentially be
deleted in at most one software component
ᵃ ‘Create file’, ‘insert object’, ‘create table’, etc.
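
To make the constraints of Table 4 concrete, the sketch below checks them over a flat list of operations that software components may perform on persistent components. The `(software_component, op, persistent_name)` triples, the four operation labels and the `imported` parameter are assumptions made for illustration; the exemption for components imported from an external application follows the discussion of constraint 4b in the text.

```python
from collections import defaultdict


def check_persistent_usage(operations, imported=frozenset()):
    """Check Table 4 over (software_component, op, persistent_name) triples.

    op is one of "create", "update", "read", "delete" (hypothetical labels).
    Names listed in `imported` are exempt from constraint 4b.
    """
    who = defaultdict(lambda: defaultdict(set))  # name -> op -> components
    for component, op, name in operations:
        who[name][op].add(component)

    warnings = []
    for name in sorted(who):
        ops = who[name]
        creators = ops["create"]
        if creators:
            if not ops["update"]:
                warnings.append(("4a(i)", name))   # created but never updated
            if not (ops["read"] - creators):
                warnings.append(("4a(ii)", name))  # no reader other than the creator
        used = ops["update"] | ops["read"]
        if name not in imported and used and len(creators) != 1:
            warnings.append(("4b", name))          # zero or several creators
        if len(ops["delete"]) > 1:
            warnings.append(("4c", name))          # deleted in several places
    return warnings


operations = [
    ("setup", "create", "patients"),
    ("admit", "update", "patients"),
    ("report", "read", "patients"),
    ("setup", "create", "audit_log"),
    ("admit", "update", "audit_log"),
    ("report", "read", "tariffs"),
    ("cleanup", "delete", "tmp_export"),
    ("nightly", "delete", "tmp_export"),
]
print(check_persistent_usage(operations))
```

In this example `audit_log` resembles the logging-file case discussed in the text: it is written but never read by another software component, so the warning for 4a(ii) is one that a team may decide to tolerate.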

software component that specifies the creation of a persistent component used in an application. (Exempt from this constraint are persistent components imported from an external application.) If no such software component exists, there is a risk of the component being non-existent when a software component attempts to write to or read from it, thereby causing a run-time error.

There is also a risk of confusion and run-time problems if several software components specify creation of the same persistent component. Some examples will illustrate the potential problems. O2 [24] permits objects (that become persistent roots) and values to be named. It would be confusing if two software components created an object (root) with the same name. It would be a race condition as to which one made it. If the other was then run, it would either generate an error or lose information. The same potential problems arise in a filing system if two software components attempt to generate the same file, and in a system that can dynamically create relations.

Constraint 4c has been defined in order to avoid confusion and reduce the chances of attempting to delete a component that has already been deleted (which may cause a run-time error).

3.2.2. Constraints between software components and store

The constraints presented in Table 5 concern relationships between source code and components accessible from a store at the time of analysis. This store should be either the store the code will eventually work with, or a store that is representative of this target store. These constraints are similar to those in Table 4, but instead of comparing source code with source code, source code is compared with the actual contents of a store. It is checked whether the association between names in the source code and components in a store, specified to be established at execution-time, is likely to succeed. The constraints described in this section must be rechecked when the application code is installed in the environment in which it will be executed.

The purpose of constraint 5a is to prevent the situation where a part of a store associated with an application has accumulated components unused by the application. For example, a relation or file, once used by a software component, may still reside in a store even though that software component has been deleted or changed in such a way that the relation or file will never again be used. It is particularly during development and ad hoc programming that programmers tend to forget to remove unused components.

A component in the store that is not referred to in the source code should probably be removed. However, it could be the case that the source code was changed or a source program deleted by accident. Hence, it is impossible to automate deletion of components entirely without any user intervention.

Moreover, the reader may correctly object that data is often collected with a view to future use. New programs that analyse it may be written. There is also the possibility of using general tools outside the application software for reading files such as editors, browsers and ad hoc query notations, for example SQL. Even though it still may be useful to be told that certain data is currently unused by the application software, this is an example of a constraint that we clearly may wish to switch off.

If a file, for example, used by a program is unintentionally deleted or renamed, or if the programmer forgets to change the program in accordance with a file deletion or renaming, then the inconsistency would be detected by constraint 5b before a software component attempts to access the file at run-time.

Cases where the persistent component deliberately is not present at the time of the constraint checking are allowable exceptions to this constraint. For example, a software component that creates a file to be used by another software component may be executed just before the latter is executed.

Constraint 5c concerns compliance between components in a persistent store and the source code that specifies the component creation. (Components that are imported as part of another application are exempt from this constraint.) If a component in the store has no corresponding creation specification, then that specification must have been changed or deleted by mistake, or the programmer must have forgotten to delete the component when the code was

THE COMPUTER JOURNAL, Vol. 40, No. 10, 1997


D. I. K. SJØBERG et al.

TABLE 5. Constraints between software components and components in a store.


(a) Eliminating unused data. A component present in a store should be used by at least one software
component
(b) Used data must exist. A persistent component specified to be used in a software component should
be present in a store (unless something else is indicated by the programmer)
(c) Unique component creator must exist. For each component present in a store there should be
exactly one software component that creates the persistent component
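
The three constraints of Table 5 can be read as comparisons between sets of names. The following Python fragment is a minimal sketch under that reading, not a description of either tool discussed in this paper. The function name `check_table5`, its parameters and the sample component names are hypothetical; the `allowed_absent` and `imported` parameters stand in for the exceptions mentioned for constraints 5b and 5c.

```python
from collections import Counter


def check_table5(creations, uses, store,
                 allowed_absent=frozenset(), imported=frozenset()):
    """Return the violations of constraints 5a, 5b and 5c as sorted lists.

    creations:      (software component, persistent component) pairs found
                    in the source code (hypothetical representation)
    uses:           names of persistent components the source code uses
    store:          names of components present in the store when analysed
    allowed_absent: names the programmer marked as deliberately absent (5b)
    imported:       names imported from another application (exempt from 5c)
    """
    creators = Counter(component for _, component in creations)
    return {
        # 5a: present in the store but used by no software component
        "5a": sorted(store - uses),
        # 5b: specified as used but not present in the store
        "5b": sorted(uses - store - allowed_absent),
        # 5c: stored component without exactly one creating component
        "5c": sorted(c for c in store - imported if creators[c] != 1),
    }


violations = check_table5(
    creations=[("setup.N", "team"), ("setup.N", "libraryKey"), ("init.N", "team")],
    uses={"team", "budget"},
    store={"team", "libraryKey", "report"},
)
print(violations)
```

In this invented example, `report` violates 5a and 5c (it is neither used nor created by any source code), `team` violates 5c because two components create it, and `budget` violates 5b.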

deliberately changed. Alternatively, components may have been created in an ad hoc way.
This constraint helps ensure that persistent components can be recreated on the basis of the source code (stores may get corrupted, be remote, be isolated, or use different value representations), and that the whole application can be installed in another environment if need be. Although components may be copied directly between stores, it should still be possible to recreate a system. Furthermore, the source programs serve as documentation for the declaration and usage of the components in the store. To avoid confusion and make it easier for programmers to understand the application, there should not be more than one place for potential component creation.

4. IMPLEMENTING CONSTRAINT CHECKING

It is generally difficult (in some cases hardly possible) and invariably time consuming to check software constraints manually. Hence, their success depends heavily upon a supporting environment that automatically checks adherence and provides relevant information in the case of violation.
Our assumption is that constraint checking will take place after successful compilation of the application software. This is similar to invoking a compiler after the source code has been written, as opposed to invoking a compiler interactively as part of using a syntax-directed editor. The constraints are post-checked because it is not sensible to check most of the constraints interactively (code for declaring and using a component cannot be written at exactly the same time).
Subsection 4.1 presents a general architecture for the implementation of a constraint management system. Subsection 4.2 describes an exploratory study in an industrial, multilingual environment that preceded the development of this general architecture. Subsection 4.3 describes a realization in a research, monolingual environment.

4.1. Implementation architecture

The major elements of our architecture are shown in Figure 2 and are described below:
• The application components comprise all persistent components and all the source code written in all the languages used to build the application programs, user interfaces and databases of a PAS.
• The observer scans all the application components, extracts relevant name information and stores it in the thesaurus.
• The thesaurus is a persistent data structure that contains information about each name occurrence in all application components. In a multilingual environment an additional thesaurus structure should record dependencies among names used in code written in the different languages.
• The constraint checker encapsulates all the constraints that are defined in the system and actually checks those that have been selected by the application builder. The constraint checking is performed on the basis of the contents of the thesaurus.
• The results of the constraint checking, including information about constraints that were violated, the sources of the violations and the time of the last thesaurus update, are stored in a structure called constraint-checking results.
• The presenter displays the constraint-checking results using various textual or graphical presentations that illustrate the kind of constraint violation and the affected parts of the application. These may be requested at various levels of detail. Statistical summaries of constraint violation can be provided for evaluation and quality management.

4.1.1. The thesaurus
Below are described the attributes of thesaurus entries which can be used to check the constraints discussed in Section 3. In practice, an implementation of the thesaurus must be tailored to the actual programming language(s) and environment.
• Name denotes a persistent component or an identifier in the source text of a software component.
• Kind of the name is a base type (integer, real, etc.), a constructed type (record, class, procedure, etc.) or a construct with a loose connection to the notion of type (relation, file, query, etc.).
• Constancy shows whether the name was declared constant or variable.
• Container identifies the enclosing unit of the name occurrence. The name, access path and type of the container are recorded. We divide the various types into two categories:
    – Persistent components are contained within a store, or they may be nested within other persistent components.



SOFTWARE CONSTRAINTS FOR LARGE APPLICATION SYSTEMS

FIGURE 2. General implementation architecture.
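
To make the roles of the thesaurus and the constraint checker in this architecture more concrete, the following Python fragment gives a minimal sketch. It models one thesaurus entry with the attributes of Subsection 4.1.1 and derives three local anomalies by set difference, in the spirit of Subsection 4.1.3. The class `Entry`, the usage codes `declare`, `assign` and `use`, and the sample data are hypothetical and are not taken from the HMS or Napier88 tools. Counting an identifier as used when it is either read or assigned, and reading constraint 2c as "updated but never read", are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Entry:
    """One name occurrence; fields follow the thesaurus attributes."""
    name: str
    kind: str        # e.g. "integer", "procedure", "relation"
    constancy: str   # "constant" or "variable"
    container: str   # enclosing software component (or store)
    usage: str       # hypothetical codes: "declare", "assign", "use"
    date: datetime = field(default_factory=datetime.now)


def local_anomalies(entries, component):
    """Check one software component by set differences over its entries."""
    own = [e for e in entries if e.container == component]
    declared = {e.name for e in own if e.usage == "declare"}
    variables = {e.name for e in own
                 if e.usage == "declare" and e.constancy == "variable"}
    assigned = {e.name for e in own if e.usage == "assign"}
    read = {e.name for e in own if e.usage == "use"}
    return {
        "unused": declared - (read | assigned),   # assumed reading of 2a
        "never_updated": variables - assigned,    # 2b
        "updated_not_read": assigned - read,      # assumed reading of 2c
    }


def occ(name, usage, constancy="variable"):
    """Build a sample entry for the invented component 'payroll.N'."""
    return Entry(name, "integer", constancy, "payroll.N", usage)


thesaurus = [
    occ("total", "declare"), occ("rate", "declare"),
    occ("tmp", "declare"), occ("log", "declare"),
    occ("total", "assign"), occ("log", "assign"),
    occ("rate", "use"), occ("total", "use"),
]
result = local_anomalies(thesaurus, "payroll.N")
print(result)
```

Here `tmp` is reported as unused, `rate` and `tmp` as variables that are never updated, and `log` as updated but never read.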

    – If the name is an identifier in a source text, then the container is its enclosing software component. Block depth and block sequence, which yield information about the scope of the name, may be recorded for block-structured languages.
• Usage is applicable only where the container is a software component and records what the name occurrence represents, which is categorized as follows:
    – Declaration of a type (kind).
    – Use of a type (kind), which may be specialized into more detailed usage such as creation of an instance of the type, specialization of a polymorphic procedure, specification of the type of a parameter, etc.
    – Declaration of a value, which may be specialized into declaration of a local identifier, declaration of a parameter of a procedure, class etc., declaration of a substructure such as an attribute of a class or relation, named field of a record or tag of a variant, etc., creation of a persistent component, specification of a persistent component to be used in a program (i.e. reading a persistent value), etc. Checking many of the constraints requires that the name of the unit to which the declared name belongs also be recorded, for example the procedure of a parameter, the class or relation of an attribute, etc. Similarly, when the name occurrence denotes a persistent component, the name of its container is also recorded. The same kind of additional information is recorded for the following two categories of usage.
    – Assignment of a value, which may be specialized into assignment of a local or global variable and update of the value of a persistent component.
    – Use of a value, which may be specialized into the right context of an assignment, a procedure or function call, a de-referencing of an attribute of a class, field of a record, etc., or a deletion of a persistent component.
• Date keeps track of the date and time when the entry was inserted.

4.1.2. The observer
The observer consists of two parts: a source code analyser, which provides the information required to check local constraints and part of the information required to check global constraints, and a store analyser, which provides the other part of the information required for global constraints. The source code analyser may be part of an enhanced compiler where additional information relevant to the thesaurus is extracted. A requirement would be that it must be easy to switch the extraction of this information on and off, since this process will degrade performance and is an unnecessary overhead if the compiler is invoked for the purpose of syntax checking only. However, we implemented our source code analysers as tailored tools, decoupled from a compiler, because they are then simpler to implement and can be made more flexible for the users. Whereas most programmers use the same compiler with minor adaptations, different programmers will benefit from different sets of constraints depending on context, level of experience (e.g. tighter constraints for novices than for experienced programmers), etc. One may build a decoupled source code analyser from scratch, or one may reuse the lexical and syntax analysers of a compiler, but replace executable code generation with generation of thesaurus-relevant information.
The most convenient way of implementing a store analyser depends on the kind of store, for example shell scripts for scanning file stores, SQL code for querying the catalogue of relational databases, browsers for scanning persistent stores or object-oriented databases, etc.
To help ensure that changes have been correctly implemented, the constraints should be checked at intermediate states of the software product during its development and maintenance, particularly after each major change. Constraint checking is typically performed after successful compilation and before execution of software components. Since the constraint checker extracts information from the thesaurus, the timing of the thesaurus update is crucial. The thesaurus content is automatically maintained and is accurate up to the time of the last update.
In the two constraint-checking systems we have built, the observers analyse the whole application and update


the thesaurus regularly at times specified by the user. A full analysis and update can also be initiated at any time (e.g. after a major set of changes). Analysing all software components may be expensive. A smarter approach would be to analyse only the changed components (e.g. indicated by timestamps) and incrementally update the thesaurus.

4.1.3. The constraint checker
The checking of the constraints described in Section 3 exploits the information in the thesaurus attributes. How to tackle certain problems, for example scoping, depends on the actual programming environment and requires tailored thesaurus information. This section outlines a model for the implementation of constraint checking.
The constraints in Table 2 are checked by scanning all the thesaurus entries of a software component. All identifier declarations are added to one set; all identifier uses to another set. Subtracting the set of uses from the set of declarations yields the unused identifiers violating constraint 2a. Regarding constraint 2b, variables not updated are detected by subtracting the set of names occurring in assignments from the set of variable declarations. The same principle also applies to constraint 2c.
Regarding constraints on meta-data (Table 3), we assume a model where types can be defined in special, globally accessible units such as declaration files or database schemata. Constraint 3a is checked by performing a set difference on one set of all global type definitions and another set of all type identifiers used in instance creations. In systems where types may also be defined and used locally to a software component, an additional check must be carried out for each component (see footnote 4).
The most difficult problem of checking constraint 3b is to provide sufficiently detailed information to identify uniquely the definition and use of substructures. If this is achieved, checking is done by a simple set difference.
Identifying duplicate use of type names (constraint 3c (ii)) is trivial. Regarding checking duplicate declarations of the same type (constraint 3c (ii)), the thesaurus yields information about kinds only. Checking complete equality between two complex types would require access to the full type graph.
Checking the constraints in Table 4 involves creating four sets that contain information about software components that respectively create, read the value of, update the value of, and delete persistent components. The information required to create the first two sets is extracted from the third category of the usage attribute (Subsection 4.1.1). The last two sets exploit information in the fourth and fifth category of the usage attribute, respectively. Having created these four sets, checking the constraints is straightforward.
Checking the constraints in Table 5 exploits the first two sets described above and another set that identifies the components present in the store when the store analyser of the observer was last run. Checking is straightforward by comparing the elements in these three sets.

Footnote 4: In our programming environment, we are unable to distinguish between the use of a type that is defined globally and the use of a type that is defined locally. However, this potential problem is avoided if constraint 3c (ii) is complied with.

4.1.4. Constraint-checking results
Constraint checking may be carried out regularly, typically in quiescent periods. Storing the constraint-checking results in an appropriate format in a database, including information about the time and extent of the checking, enables the developers to obtain this information when convenient for them. Such a database may also be exploited to provide useful summary information over time and thus document aspects of the system development process.

4.1.5. The presenter
The most straightforward presentation of the constraint-checking results is a textual interface giving a list, in which each entry displays which constraint was violated, when and where. Constraint checking can potentially generate a large number of violations. Therefore it is useful to allow the users to select subsets of the violation entries either by kind of violation or by location. Alternatively, it would be possible to graphically depict the areas affected by constraint violations. Colour could be used to enhance the feedback provided. This would require a graphical presentation of the application, which could be generated from the thesaurus.
Statistical summaries could be generated to provide input to the evaluation of the constraint checking, see Section 5. Such summaries could also be used to compare development technologies, for example the use of different programming languages.

4.1.6. Constraint rectification
The architecture illustrated in Figure 2 does not include constraint rectification. Generally, there are many ways of violating software constraints, and for each violation there are several ways of rectification. For example, if a persistent component is never used, a program could be modified or a new one created to use it, or it could be removed from the store. One may envisage automatic support for the latter but not the former. Since it is a semantic problem to rectify inconsistent states, fully automatic supporting tools are generally infeasible, but future research should investigate the possibility of semi-automatic tools that interact with the programmer.

4.2. Exploratory study in a multilingual environment

The Health Management System (HMS) is a large application system currently running in several hospitals in the UK. It was developed in a C/C++, X Window system and relational database environment. To speed up development time, tailored languages (a screen definition language, a procedural language, a query dictionary language and a schema definition language) running on top of this base technology were used to implement the system. To help solve problems of maintenance experienced in the HMS project, we built the HMS thesaurus tool [3]. This work


• NAME: a textual form of the entry
• SEQ NO: system-generated key
• NAME TYPE: one of the following codes:
    Action Name (AN), Action Script name (AS), Class Name (CN), Datum Name (DN), Field Name (FN), Function name (FU), Query Name (QN), Relation Name (RN), Screen Macro name (SM), Transaction Name (TN) or Update function Name (UN)
• CONTAINER: a textual name describing where a name is used
• CONTAINER TYPE: codes appropriate to the type of the CONTAINER value:
    Action Script (AS), Display Language program (DL), Hippo Program (HP), Query (QN), Query Dictionary (QD), Relation (RN), Schema (SC), Transaction (TN) or Update function (UN)
• DEFINITION USE (D/U): indicates definition or use of the name
FIGURE 3. HMS thesaurus.
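
The fields of Figure 3 map naturally onto a single relation, and the two global anomalies that the HMS checker reports ('names defined but not used' and 'names used but not defined') can be expressed as set differences over that relation. The following fragment is a minimal sketch using Python's built-in sqlite3 module. It is an illustration only: the HMS tool used its own relational database and one complex SQL query, neither of which is reproduced here, and the column names, the function names and the sample rows are hypothetical. The function `history_delta` is a simplified, set-based stand-in for a file difference that yields additions ('A') and deletions ('D') for a history relation.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    """CREATE TABLE thesaurus (
           seq_no INTEGER PRIMARY KEY,   -- system-generated key
           name TEXT, name_type TEXT,
           container TEXT, container_type TEXT,
           def_use TEXT CHECK (def_use IN ('D', 'U')))"""
)
sample = [  # invented rows, using codes listed in Figure 3
    ("patient", "RN", "hospital", "SC", "D"),
    ("patient", "RN", "admit", "HP", "U"),
    ("ward", "RN", "hospital", "SC", "D"),
    ("bed_no", "FN", "admit", "HP", "U"),
]
con.executemany(
    "INSERT INTO thesaurus (name, name_type, container, container_type, def_use)"
    " VALUES (?, ?, ?, ?, ?)",
    sample,
)


def names_with_only(con, present, absent):
    """Names having a 'present' entry (D or U) but no 'absent' entry."""
    sql = """SELECT name, name_type FROM thesaurus WHERE def_use = ?
             EXCEPT
             SELECT name, name_type FROM thesaurus WHERE def_use = ?
             ORDER BY name"""
    return con.execute(sql, (present, absent)).fetchall()


defined_not_used = names_with_only(con, "D", "U")
used_not_defined = names_with_only(con, "U", "D")


def history_delta(unloaded_history, newly_generated):
    """Tag rows only in the new data with 'A', rows only in the old with 'D'."""
    added = [(row, "A") for row in sorted(newly_generated - unloaded_history)]
    deleted = [(row, "D") for row in sorted(unloaded_history - newly_generated)]
    return added + deleted


delta = history_delta(
    {("patient", "hospital"), ("ward", "hospital")},
    {("patient", "hospital"), ("bed", "hospital")},
)
print(defined_not_used, used_not_defined, delta)
```

With the invented rows above, `ward` is defined but not used and `bed_no` is used but not defined.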

preceded and gave input to the development of the general architecture described above.
The thesaurus consists of three relations; the main one is shown in Figure 3. The categories of NAME TYPE and CONTAINER TYPE reflect particular constructs of the languages used in the HMS project. To carry out global constraint checking, dependencies among identifier occurrences in the software written in the different languages had to be recorded. Hence, we created another relation that describes direct correspondences between fields of the database relations and identifiers used in the user interface or application code. That is, an occurrence where a variable in such code reads or updates a database field, or vice versa, gives rise to an entry in this thesaurus relation (duplicates are not included). The third relation keeps track of the history of thesaurus entries and is defined similarly to the main relation except for two additional fields: ADD DELETE, which indicates whether an entry represents an addition or a deletion, and INTRODUCED, which stores the date for the addition/deletion.
The observer, which analyses the schemata and other software and subsequently updates the thesaurus, is implemented as a combination of SQL scripts, Unix csh and awk scripts, and one C program. The analysis part of the observer is implemented by SQL scripts that operate on the schemata and Unix csh and awk scripts that implement parsers for the display, query dictionary and the application languages. The first two thesaurus relations are updated by simply deleting the existing contents and loading the newly generated information in the relations. The history relation is updated by first performing a difference (Unix diff) on a file with the newly generated information and a file with the thesaurus history unloaded. Entries that are only in the former file are inserted with an ‘A’ for addition in the ADD DELETE field; those only in the latter are inserted with a ‘D’ indicating a deletion.
The HMS project comprised about 150,000 lines of source code when we collected measurements. The total analysis and thesaurus update was carried out at 02:00 every night and took about 30 minutes.
The constraint checker identifies two global anomalies: ‘names defined but not used’, and worse, ‘names used but not defined’. These basically capture constraints 2a, 2c, 3a, 4a, 5a and 5b, and are a generalization of the conventional definition–use anomalies within software components [25]. More detailed thesaurus information would have been needed to check the other constraints. The checks were implemented as one (complex) SQL query over the thesaurus relation.
There is no database that stores the results of the checking; the results are presented directly to the user as two tables. Both the invocation of the checking and the presentation of the results take place through a coloured, X Window system user interface. Details of the user interface and example thesaurus data can be found in another paper [3].

4.3. Realization in a persistent programming environment

After the HMS experience, we implemented a constraint management system in and for the persistent programming language Napier88 [23]. The concept of persistence tackles the mismatch between database systems and programming languages [26, 27, 28]; a uniform model for representations and operations on persistent and transient data is provided. Tools, programs and data may all reside in the same persistent store. The experimental work has been carried out using Napier88 because it has several properties we needed:
• it provides longevity for all data, including the data we generate to represent the thesaurus;
• as it provides persistence through reachability [28], it maintains references and hence relationships in the system reliably [27];
• it provides strong typing, which assisted in our information collection and gave us accurate type information about application components;
• we had access to the source code of the compiler and all other program development tools, which greatly facilitated the collection of data; and
• there was an active local user community giving us access to several applications under construction.

The Structured Persistent Application System Model (SPASM) is a set of software constraints introduced to support programmers using Napier88. SPASM includes the general constraints described in Section 3 and a set of methodology constraints. The concrete interpretation and implementation of the constraints have been tailored for the


programming environment of Napier88 and reflect particular constructs of the language. For example, consider constraint 3a: ‘There should be at least one occurrence of a statement that will generate an instance of any declared type identifier’. Napier88 provides polymorphic procedures in which a type identifier may occur as a parameter in an instantiation of such a procedure. This is not an instantiation of the type, but is sufficient to justify the existence of the type identifier. Therefore, this requires an exception to constraint 3a, in addition to the exception for abstract classes discussed in Subsection 3.2.1.

4.3.1. Methodology-dependent constraints
SPASM defines a set of constraints that support a software construction method, called location binding, described elsewhere [29, 30, 31, 32, 33]. We believe similar sets of constraints would be useful where other methods are used to build applications with this technology or for systems using other technologies, such as the synthesis of an object-oriented language and a database system. This particular set is therefore considered illustrative and a valid foundation for preliminary experiments.
Following the method, applications are built incrementally by placing procedures in the store, so that they can be found via named constructs in the stable store called environments. Each procedure is given a name that is unique within an environment. Programs that place the procedures in the store obtain other procedures they use by looking up the name in a specified environment. The target of construction is application programs that use procedures to implement required tasks.
To support the construction method we have further tailored the general constraints presented in Section 3 and defined some specific method-support constraints. Inter-component constraints, including those between a stable store (effectively a database for our purposes) and software components, are expressed in terms of procedures, environments and programs. The tailored constraints include:
• for every program all the environments it requires are in the store;
• for every program all the procedures it requires are in the place it specifies in the store;
• every procedure is used directly or indirectly by a program; and
• every environment contains at least one procedure.
The above constraints are illustrative simplifications of those necessary in the Napier88 programming environment in which we perform our experiments. However, they are representative. For example, the first two are tokens for those constraints that ensure that all that is needed is available, whilst the last two are typical of constraints to avoid redundant vestiges that often remain from earlier versions of a system or from abandoned lines of development.
A simplified subset of the method-support constraints used in our experiment is:
• a software component is restricted to perform only one kind of construction step (place procedure in the store, update procedure value, delete procedure);
• each procedure declaration should occur only once and have exactly one software component to create it, one to update it and at most one to delete it; and
• there should be a partial order among software components that create procedures in the persistent store and those that use them.

The full description and justification of these constraints are given in [33].

4.3.2. Implementation
The implementation of the prototype Napier88 constraint-checking system is based on the general architecture described in Subsection 4.1. The implementation of the thesaurus particularly affects the general model of containers. The containers of persistent components in Napier88 are environments. The thesaurus records the name of an environment and its path from a persistent root. If the container of an identifier is a software component and that identifier denotes a component that is inserted into, updated within, or deleted from an environment, then the thesaurus also records the name and access path of that environment. This information is required in order to check the constraints of Tables 4 and 5, and the method-support constraints outlined above.
The part of the observer that processes Napier88 source programs is based on the Napier88-in-Napier88 compiler [31]. The lexical and syntax analysers have been adjusted to conform to the special information needs of the thesaurus. Instead of generating executable code, relevant information is inserted into the thesaurus. The part of the thesaurus analyser that extracts information from the persistent store reuses low-level procedures used in the implementation of a Napier88 browser [34].
The Napier88 constraint checker includes the full set of SPASM constraints, which are embedded in the tool. A programmer can select which constraints should be applied in a particular tool invocation. Those constraints that apply to the contents of the persistent store are simple to implement since the same language model is used both for transient data during program executions and for persistent data in a store.
In the present version of the constraint-checking system, the constraint-checking results are presented directly to the user, but could easily be retained in the persistent store. The first version of our presenter has only a primitive textual user interface for displaying constraint violations. The example in Figure 4 shows the output after a check of constraint 4a (ii). Two software components each have one persistent component that is not used. A component is identified by its name and path of Napier88 environments (separated by a ‘\’ in the figure), analogously to file identification in a filing system. The built-in procedure ‘PS’ yields a persistent root environment.
Although the constraint-checking results are not persis-


CONSTRAINT VIOLATION: PERSISTENT COMPONENTS CREATED BUT NOT READ
-------------------------------------------------------------------------------
SOFTWARE COMPONENT                      UNUSED PERSISTENT COMPONENT
-------------------------------------------------------------------------------
~Personnel/Project/libraryKey_I.N       PS()\Personnel\Project\libraryKey
~Personnel/Project/team_I.N             PS()\Personnel\Project\team
-------------------------------------------------------------------------------
Persistent components created in total    : 24
Persistent components created but not used: 2
Proportion not used                       : 8%
-------------------------------------------------------------------------------

FIGURE 4. Presentation of constraint-checking results.
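
A textual report of the kind shown in Figure 4, including its summary lines, could be produced from a list of violations along the following lines. This Python fragment is a minimal sketch only; the actual presenter is part of the Napier88 prototype and is not reproduced here. The function name `present`, the column width and the rounding of the proportion to a whole percentage are assumptions.

```python
def present(created_total, unused):
    """Format a created-but-not-read report.

    created_total: number of persistent components created in total
    unused:        (software component, persistent component) pairs
    """
    rule = "-" * 79
    percent = round(100 * len(unused) / created_total) if created_total else 0
    lines = [
        "CONSTRAINT VIOLATION: PERSISTENT COMPONENTS CREATED BUT NOT READ",
        rule,
        f"{'SOFTWARE COMPONENT':<40}UNUSED PERSISTENT COMPONENT",
        rule,
    ]
    # One line per violation: source file on the left, store path on the right
    lines += [f"{source:<40}{component}" for source, component in unused]
    lines += [
        rule,
        f"Persistent components created in total    : {created_total}",
        f"Persistent components created but not used: {len(unused)}",
        f"Proportion not used                       : {percent}%",
        rule,
    ]
    return "\n".join(lines)


report = present(
    24,
    [
        ("~Personnel/Project/libraryKey_I.N", r"PS()\Personnel\Project\libraryKey"),
        ("~Personnel/Project/team_I.N", r"PS()\Personnel\Project\team"),
    ],
)
print(report)
```

With 2 unused components out of 24 created, the proportion is 2/24, which rounds to the 8% shown in the figure.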

tently stored and the presenter is minimal, our Napier88 prototype demonstrates the utility of the general constraint-checking architecture. This prototype has also provided us with experience of constraint checking, which is discussed in the next section.

5. EVALUATION

A constraint should be of practical value in that either it is violated in a significant number of applications or the constraint violation is expensive. Therefore measurements of inconsistency in real applications should be collected. The results of a study we carried out are described in Subsection 5.1. Owing to limited resources we were unable to set up an experiment that measured the consequences of constraint violation, but some experiences are reported in Subsection 5.2.
Some of the anomalies and inconsistencies detected by automatic constraint checking may cause errors that can be detected during testing, but there are always cases not identified in a test. The errors that we aim to avoid may occur when the system has been operational for any period from one day to many years. There are also limitations to software constraint checking, however. Not all cases of constraint violation can be detected, and several constraints would have been more useful if they could have been defined more accurately. This is the issue of Subsection 5.3.
The subsequent three sections discuss, respectively, checking of method adherence, adapting constraint checking to individual needs, and some viewpoints on how to deploy constraint checking in an organization.

5.1. Extent of constraint violation

This section presents statistics concerning the extent of software constraint violations in a Napier88 context. We collected data from 20 applications consisting of more than 108,000 lines of source code with developers ranging from students to experienced programmers from three separate universities. This section reports some of the results; more can be found in [35]. All the results are based on measurements of source code and the actual contents of a persistent store, which were automatically collected by the Napier88 observer.
The study checked inconsistencies that are ‘violations’ of constraints imposed a posteriori. Each application was constructed with no formal requirement for programmers to adhere to the constraints tested; they had no aids for checking these constraints themselves, and they were unaware that their software would be analysed. The applications were operational at the time of the analysis. One would expect an even larger proportion of inconsistencies during periods of development.
Measurement 1a in Table 6 indicates that 35% of all value identifiers declared as variables were never updated and could have been declared as constants. The table (1b) also shows that 8% of all value identifiers (as opposed to type identifiers) were unused. Interviews with the programmers revealed that there are several reasons why this kind of redundancy occurs: collections of declarations are copied indiscriminately from other programs; too many identifiers are declared in the belief that they would be needed later; and code using identifiers is removed without the programmer remembering to remove the corresponding declarations. Other factors may also affect such figures, such as programmer expertise, tool support and programming language. Hence, there may be many reasons why, for example, as much as 28% of all identifiers were reported unused in a study of production PL/1 programs [36].
In our study, 10% of the names declared to denote external software or data components were unused (not shown in the table). Other studies of unused imported names report similar figures (from 7% to 20%) [37, 38].
The 4% violation in Table 6 (1c) is a lower bound on the frequency of redundant updates as we have not analysed actual paths through programs. Table 6 (2b) shows that 24% of the type identifiers are unused. Some applications use all the type identifiers declared within the application; other applications use only one-third. In the latter extreme cases the reason is that when libraries are used, all the types associated with the library are copied even though only a small part of the library is actually used in the application. This indiscriminate copying of types is indicative of a requirement for a tool to collect required items (types or values).
Inconsistencies 2c, 2d and 2e show that 10% of all statements specifying deletion of components are repeated,

THE COMPUTER JOURNAL, Vol. 40, No. 10, 1997


TABLE 6. Measurements of inconsistency.

Measurement                                          Percentage  Percentage of
------------------------------------------------------------------------------------------
1. Local
(a) Value identifiers declared variable when they        35      All value identifiers
    could have been constant                                     declared variable
(b) Value identifiers declared but never used             8      All declared value identifiers
(c) Variables updated but not otherwise accessed          4      All updated variables

2. Global: between software components
(a) Repeated declarations of type identifiers            29      All type identifiers
(b) Unused type identifiers                              24      All type identifiers
(c) Repeated delete statements                           10      All delete statements
(d) Inserted components not used                          8      All inserted components
(e) Repeatedly inserted components                        7      All inserted components

3. Global: between software components and persistent store
(a) Unused components in the store                        9      All stored components
(b) Used components that do not exist                     2      All use-declared components
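
The local measurements in Table 6 (rows 1a to 1c) can be pictured with a
minimal Python sketch. It assumes a hypothetical per-identifier summary (the
ValueIdentifier record, holding statically counted reads and updates), which
is not part of the Napier88 tools, and it adopts one plausible reading of
each row: 1a counts variables that are never updated after declaration, 1b
counts identifiers that are neither read nor updated, and 1c counts updated
variables that are never read. These names and definitions are assumptions
made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ValueIdentifier:
    """Hypothetical static summary of one declared value identifier."""
    name: str
    declared_variable: bool  # True: declared variable; False: declared constant
    reads: int               # read accesses found in the source code
    updates: int             # assignments found after the declaration


def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole (0.0 when whole is zero)."""
    return 100.0 * part / whole if whole else 0.0


def local_measurements(identifiers: List[ValueIdentifier]) -> Dict[str, float]:
    """Compute assumed analogues of rows 1a, 1b and 1c of Table 6."""
    variables = [i for i in identifiers if i.declared_variable]
    updated = [i for i in variables if i.updates > 0]
    # 1a: declared variable but never updated, so it could have been constant.
    never_updated = [i for i in variables if i.updates == 0]
    # 1b: declared but neither read nor updated anywhere.
    unused = [i for i in identifiers if i.reads == 0 and i.updates == 0]
    # 1c: updated at least once but never read.
    write_only = [i for i in updated if i.reads == 0]
    return {
        "1a": percentage(len(never_updated), len(variables)),
        "1b": percentage(len(unused), len(identifiers)),
        "1c": percentage(len(write_only), len(updated)),
    }


# Illustrative data only; the identifiers below are invented.
sample = [
    ValueIdentifier("total", True, reads=2, updates=0),
    ValueIdentifier("counter", True, reads=1, updates=3),
    ValueIdentifier("scratch", True, reads=0, updates=1),
    ValueIdentifier("limit", False, reads=0, updates=0),
]

if __name__ == "__main__":
    for row, value in local_measurements(sample).items():
        print(f"{row}: {value:.1f}%")
```

Because the counts are purely static, a sketch like this shares the
limitation noted for row 1c: it gives a lower bound, since it does not
analyse actual paths through programs.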

8% of the components declared to be inserted into a persistent store are not
used elsewhere, and 7% of such declarations are re-declarations.

Inconsistency 3a shows that 9% of all components in the persistent store are
not used in any software component. Finally, 3b shows the inverse, that 2%
of the components specified to be used in a software component do not exist.
As part of another study on maintaining Napier88 applications [39], we
collected statistics on the extent of run-time errors and found in
particular that the error ‘component cannot be found’ is a significant
problem.

The applications were divided into four groups depending on the experience
of the application programmers. There was no noticeable difference between
experts and novices regarding the extent of inconsistencies.

The study described above investigated the extent of certain
inconsistencies, and thereby one aspect of the relevance of constraints to
help prevent them. Most of the programmers thought that their programs were
relatively free of the kinds of inconsistency that were detected and claimed
to try to avoid them. They were therefore surprised by the high rate of
inconsistencies found in the studies. They found the results interesting and
were curious about the quality of their own software compared with other
people’s software.

5.2. Consequences of constraint violation

The purpose of defining and checking software constraints is to prevent
states that may have undesirable consequences. For example, a study of
FORTRAN programs found a correlation between the proportion of unused
variables and fault rate [40], which justifies the constraint that a
declared name should be used. It may not necessarily be a causal
relationship, but that study indicated that reducing the number of unused
variables might reduce the fault rate. Identifying possible causes of faults
and then constraints that will help prevent such faults occurring is
essential when defining programming methods and standards.

However, the consequences of violation are very hard to measure, and thus
conducting a proper cost/benefit analysis is extremely difficult. Such an
analysis should aim to quantify the effect of software constraints and their
supporting tools on the basis of empirical data. This may be achieved by
measuring software before and after the constraints and tools have been
used. Such an experiment is beyond the scope of this paper, but feedback
received from industrial software engineers, students and researchers who
have used our tools indicates, at least in our context, that checking the
particular constraints we have defined is worthwhile.

In the HMS project the usefulness of the constraints was evaluated on the
basis of programmer feedback. Some of the experiences were:

• Many unused identifiers were found. In the case of procedures, the most
  common reason was that existing procedures were replaced with new ones
  without the programmers remembering to delete the old ones.
• Several references to undeclared identifiers were detected. Typically,
  those were identifiers used in one language and supposed to be declared in
  another, for example, queries called from code written in the screen
  definition language that did not have a corresponding declaration in code
  written in the query dictionary language.
• Inconsistencies were also detected at the macro command level: shell
  environment variables not set as appropriate, non-existent files included
  (because file names had changed but not the corresponding file name in an
  #include statement), etc.

The automatic checks on identifiers defined but not used, and vice versa,
proved more useful than they may seem at first sight since they operate
across software components
written in different languages. Performing inspection of the direct and
indirect relationships among identifiers in the various components manually
is infeasible in a system of the size and complexity of the HMS.

The constraint checker was not used in all parts of the HMS. Constraint 5b:
‘Used data must exist’, which applies regardless of the type (text,
multimedia data, etc.) of the content of the persistent component being
used, could have prevented a very expensive service interruption and the
loss of information in the HMS in a particular hospital. The reason for the
problem was that a file containing a font used by the user interface
management system had been lost [28].

Examples of the usefulness of constraint checking similar to those described
above were experienced in the context of Napier88, but more importantly
there we were able to experiment with method-specific constraints, see
Subsection 5.4.

We have described some examples of usefulness, but more investigation on
whether using a constraint-checking system significantly improves the
structure and consistency of an application is needed. Demonstrating the
effect on maintenance requires long-term experiments, which could not be
carried out within the current work. The longevity and scale of apparatus
that are needed can be illustrated by the work at the Software Engineering
Laboratory, where empirical data about software development processes has
been collected for two decades [41].

In a long-term experiment the constraint-checking system could be
instrumented to collect information automatically, such as which constraint
was checked and when, whether it was adhered to or violated (if possible, to
what extent), name and size of application, etc. More challenging is to
define measurable criteria of maintainability. This is still an open
research issue.

5.3. Completeness and accuracy of constraint checking

All of our constraints concerning software components are based on the use
of names, which are extracted statically from source code. This restricts
the kinds of languages to which we can completely apply our constraints.
Using constraints based on names causes some problems even for the
programming languages we have investigated. Because we depend on static
analysis, we cannot determine whether specific code fragments will be
executed at run-time. Conditional execution of code reduces the accuracy of
the constraint we are able to specify.

Our approach to constraint checking is not generally applicable to languages
using type inference because some of our constraints depend on explicit type
declarations. Languages in which code can be generated dynamically
(reflective programming languages [42, 43], dynamic SQL, etc.) cause us
problems because the complete source code is not available statically.
However, even for such languages there will usually be a substantial amount
of code that can be statically analysed and subjected to constraint
checking.

The use of names as the basis of constraint checking is one reason why we
are unable to identify completely all constraint violations. For example,
the constraints of Tables 4 and 5 concern persistent components that are
identified by their name and the name of their context (e.g. a path name)
although this may not be possible in all cases. During static analysis it
may be hard to identify a file given by a name in a program. The identity of
the file may depend on dynamic issues such as the directory of the executing
program or the contents of Unix environment variables, for example.

Restricting the constraints to named components eliminates the possibility
of handling unnamed files (e.g. in VMS/VME) or other components that are
statically bound, such as Ada code, Fibonacci code [44] or Hyper-programs
[45, 46]. In all these environments, names have been eliminated or never
existed. Unnamed components are typically computationally identified by a
persistent identifier (PID) or object identifier (OID) and reached by
navigation or query. There is a new and difficult challenge of how to detect
and report constraint violation among such unnamed components. This is an
issue for future work.

Some of the constraints involve statements in source code that could be part
of procedures or conditions, implying that they are not necessarily executed
in a given program execution. For example, consider constraint 2b: ‘For
every identifier declared as variable there should be some code in the
system that might update it’. It cannot be checked statically that
particular updates are actually executed. It can be checked statically that
no code exists to do the update, but we cannot say that the variable will be
updated. That would require an execution profile, and in this case
constraint 2b could be replaced by a stronger constraint: ‘For every
identifier declared as variable code will eventually be executed that
modifies its value’.

Data flow analysis techniques [19] could have been exploited to improve the
accuracy of our constraint checking. Consider constraint 2b again; data flow
analysis would allow us to eliminate cases where the update code can never
be executed. This is more accurate than our present checking, which will not
detect constraint violations in such cases. However, there are still cases
that can only be detected at run-time.

An even stronger constraint could aim also to prevent cases where variables
are updated but due to various conditions could be replaced by constants.
One example is the case where the only use of a variable occurs after the
last of several assignments and its final value can be determined
statically. Constant propagation techniques can be exploited to detect such
cases [47].

In addition to improving the accuracy of the checking of our existing
constraints, data flow analysis would allow the definition of further
constraints, for example detection of dead code within software components.

It has to be recognized that we can never specify the software constraints
for consistency and maintainability completely. Were this possible, we would
be able to specify completely all possible future systems. It is impractical
to do this for two reasons. We have insufficient information and knowledge
to write such constraints, and many of
them could not be checked because they would require information about
future changes, states of the system, etc., or because they would be
computationally infeasible. Nevertheless, our work suggests that there
remain useful constraints that can be written and checked, and we believe
that supporting them will give significant benefit.

5.4. Checking method adherence

Programmers who share a common view of how to develop applications in their
environment form a particular programming culture. The rules and conventions
of such a culture implicitly express software construction and maintenance
methods adhered to within that culture. Automatically checkable software
constraints make such methods explicit, which has the following advantages:

• they allow adherence to the methods to be checked;
• they communicate the methods to others, that is programmers elsewhere and
  programmers who subsequently maintain the actual application systems;
• they will most likely refine the methods.

Possible disadvantages are:

• they remove important flexibility and variability;
• they hinder growth of better methods.

In compliance with the first advantage, we exploited the Napier88
constraint-checking system to investigate whether programmers follow the
location binding method (Subsection 4.3.1). The applications described in
Subsection 5.1 were divided into four groups: OLD applications developed
before the latest methods were developed, applications of the STUDENTS who
were taught the latest methods, new applications of experienced programmers
who were AWARE of those methods, without being fully committed to them, and
finally, applications with authors who were explicitly COMMITTED. Not
surprisingly, major differences were measured among the groups. Constraint
violation was smallest in the COMMITTED group but still not negligible.
Details can be found in [35].

5.5. Adaptability

Software engineers may wish to control the application and the reporting of
constraint checking. Therefore, the user interface should permit them to
control the checks performed and to manage the consequent reports. There are
several ways of tailoring a constraint checking tool to individual needs.
For example, software engineers should be enabled to perform the following:

• selection of a subset of constraints to be used in an application;
• selection of components to which the constraints should or should not
  apply;
• specification as to when observers should collect information, for example
  concurrently with each build, nightly or each time a version is checked
  in;
• determination of when all or specific constraints should be checked, for
  example after each build, nightly, etc.;
• specification as to where the constraint-checking reports should be stored
  and what should be included; and
• identification of what summary or detailed extracts are to be presented
  during checking.

There is an interesting point about how programmers indicate the exceptions
to the default constraint checking. It is clearly tedious to indicate the
exceptions repeatedly every time a check is run. However, the presence of
exceptions reduces the effectiveness of constraint checking and their
validity as exceptions may be questionable or temporary. If the tool
remembers exceptions indefinitely, the programmers will forget them, they
will accumulate, and the program’s structure will deteriorate. So the
handling of such exceptions is a major challenge, including the design of a
convenient user interface.

The Napier88 constraint checker allows individual constraints to be
disabled. For example, a programmer may know that certain constraints will
not be adhered to during a certain period of the development (typically
during initial construction) and may wish to avoid the noise of unnecessary
inconsistency messages. There are several possibilities for specifying the
period for ignoring the constraint checking including: turning off checking
for the invoked session only; specifying a given period (e.g. in terms of
weeks or months); turning off checking until a given action occurs (e.g. no
checking before a quality assurance final report is to be produced). It is
also possible to select parts (e.g. source programs or environments in the
persistent store) that should or should not be the subject of checking.

Another aspect of adaptability is the possibility of modifying the set of
constraints for specific purposes. Some of the SPASM constraints reflect a
methodology that has been developed over years by experienced programmers.
Methodologies do not change that fast, and we have received very few
suggestions for new constraints. Therefore, in our current implementation
the constraint specifications are hard-wired into the Napier88 constraint
checker. One way to achieve flexibility in constraint specification would be
to design and implement a constraint specification language (cf. CCEL [14],
see Section 6) that is tailored to and exploits the software engineering
features of persistent language technology.

5.6. Constraint checking in an organizational context

Developing large and long-lived application systems is a complex and
time-consuming task; with many people involved it is crucial that commonly
agreed practices and conventions are followed. Within an organization there
is a potential conflict between allowing flexibility in constraint checking
and the need for standardization and method adherence. Increased flexibility
makes maintenance more difficult, but a very rigid standardization policy
may lead to programmers circumventing the constraint-checking system. For
example, software developers at ICL (e.g. working on
VME) were required to use CADES [48, 49] and to conform to its built-in
constraints. The S3 compiler was set up only to take source from properly
installed items in the CADES database. The result was unreproducible and
undocumented patches in machine code as programmers circumvented CADES’ lack
of speed and restrictions. At one stage, in consequence, a new release of
VME was prevented because the developers could not rebuild the kernel.

Software constraints provide one mechanism for standardization of software
structure, which may eliminate peculiar programming styles and simplify
collaboration, maintenance, software reuse, etc. It is essential that the
software engineers and programmers find it worthwhile to learn and apply the
standards; it should be easier to fulfil the programmers’ tasks by using
them, and they should not hinder normal working practice. In particular,
experienced software builders may feel that constraints inhibit their
personal programming styles⁵. However, it is crucial that the organization
invest in setting up and preserving structure to reduce maintenance costs.

    ⁵ On the contrary, novices will probably appreciate the support given by
    the rules for organizing their applications.

Our experience from an industrial environment indicates that when
introducing a tool that automatically checks the quality of software, one
should consider the following questions. Who should use the tool? How should
the working process be organized to benefit as much as possible from the
tool? How should the project management motivate and encourage active use of
the tool? It is particularly important that inexperienced and immature
programmers find bugs and inconsistencies by themselves before the software
is released. The only purpose of the tool should be to improve the quality
of the software. A negative attitude may be created if it is felt that the
tool is used for individual monitoring purposes, for example by the project
management.

6. RELATED WORK

Constraints are employed in many domains of computer science, from hardware
verification to graphical user interfaces to computational linguistics [50].
In software engineering, rule checkers of many CASE tools support
constraints defined on design structures, data model specifications, etc.,
and may also support particular system development methodologies [51].
Recently, work has been carried out on automatic checking of consistency
rules in requirements specifications [52, 53]. We are concerned with
constraints that apply to the implementation phase of the software life
cycle, which we therefore have termed ‘software constraints’. One of the
contributions of this paper is the classification scheme for such
constraints (Table 1), which will be used to describe related work below.

The focus of our work is application-independent constraints. In the area of
formal specification of requirements and constraint specification languages,
for example Kaleidoscope [54, 55], constraints are typically application
dependent. Work on application-dependent constraints also exists in the area
of extensible compilers [56].

We have generalized work on software constraints also to include those that
operate between software components and a secondary storage (file stores,
databases, etc.). In the related work we have been able to find, the
constraints are always defined within or between compilation units (software
components). In the latter case, they operate at the interface level. For
example, to support Ada software builders in incremental development, the
AdaPIC tool set [57] provides consistency analyses on interfaces among (and
within) modules. Consistency violations are divided into errors and
anomalies.

The majority of related work concerns constraints local to software
components. Within this category, one of the few language-independent works
described in the literature is the Law of Demeter [58, 59], which intends to
improve the style and structure of object-oriented programs. The aim is to
help ensure that the software is as modular as possible. Transformations are
defined that modify any program to become consistent according to the law in
the case of violation.

Language-dependent related work includes many enhanced compilers [6, 7] and
tailored static analysis tools that perform code-rule checking beyond that
of conventional compilers. The classical C program checker in Unix
environments is lint [60, 61]. It checks many kinds of inconsistencies that
are due to weak typing and memory management in C. However, it also detects
anomalies such as unused #include directives, variables and procedures,
unused variables after assignments, and uses of variables before they are
initialized. For C and C++ there are also many commercial tools [62, 63, 64,
65]. An example in another language context is FORTVER [66], which
identifies unused, unevaluated, unassigned, and other anomalous use of
variables in FORTRAN programs.

In our tools the constraint checking is hard-coded. Programmers do not need
to specify the constraints; they just indicate the ones they wish to be
checked. Other work has focused on providing programmers with the
flexibility of defining constraints themselves. In the context of C++,
Meyers et al. [15, 67] have identified several programming anomalies that
are not language errors detected by compilers. To help prevent such
anomalies, they developed the C++ Constraint Expression Language (CCEL)
[14], a meta-language for C++ that enables software builders to specify a
whole range of constraints on programs. Violations are automatically
detected. In CCEL one can also specify parts of a system (program files,
functions or classes) where the constraints should (or should not) apply.

The basic functionality provided by LCLint [68] is similar to that of lint.
However, LCLint gives enhanced support if the (C) program conforms to
certain stylistic guidelines, or if the programmer writes formal program
specifications in the LCLanguage [69, 70]. LCLint reports inconsistencies
between a program and its specification.

Finally, our techniques for implementing constraint checking are simple.
More sophisticated and accurate
techniques for checking constraints within software components have been
developed by the data flow community. The program dependency graph used in
program slicing [71, 72] to capture relationships among statements and
control predicates can be compared with our thesaurus, and the slicing
method can be compared with our constraint checking. Nevertheless, the
software constraints discussed in this paper are at a higher level of
abstraction than those constraints commonly tackled in program slicing.
Regarding implementation, our contribution is the general architecture for
building constraint-checking systems.

7. CONCLUSIONS

Software constraints that identify missing components, redundant components,
components whose definition permits more operations than are actually used,
etc., are commonly used within compilation units. We have generalized the
scope of such constraints to include also intercomponent and secondary
storage relationships. A classification of software constraints was given.

Various strategies for reifying general rules as specific software
constraints are possible. We have chosen to focus on named constructs as
elements that can appear in particular constraints. Constraints of this form
have three advantages:

• they are in some sense ‘natural’ as it is an established human practice to
  name concepts, processes, constructs and components of importance;
• they are neutral in the sense that naming pervades all technologies and
  components out of which an application is built; and
• they permit extensive, though not necessarily complete, automation of
  consistency checking as we have demonstrated in two different contexts.

We anticipate that various kinds of software constraint will help software
engineers in building and maintaining long-lived and large application
systems. As yet we have only common-sense inference and anecdotal
observation with which to validate this claim to utility. The prototypical
experiments described in this paper demonstrated that it was possible to
construct software to check many of the example constraints over some
medium-scale application systems. However, we suspect that the gains are
greatest for large and particularly long-lived systems. Although inevitably
difficult to conduct, because of the significance of scale and duration, we
believe that experiments to test this hypothesis are worthwhile.

We plan to rebuild the technology from scratch using Java™ and one of its
available persistent technologies. This will bring us into a much larger
community of developers with whom we would hope to refine the example
constraints and conduct larger-scale experiments. Java provides an
interesting context because its incremental construction methodology and
dynamic binding mechanisms mean that the intercomponent properties are not
fully specified⁶. In addition, it has libraries that permit combination with
many legacy and database components. With the addition of groups at
different sites semi-independently constructing libraries of components,
Java provides a domain where macroscopic constraint support will have a
large pay-off.

    ⁶ For example, the class loader will not necessarily find a class that
    some other class expects to use.

ACKNOWLEDGEMENTS

The St Andrews persistent programming team provided the underlying language
technology for the work described in the paper. We would like to thank our
colleagues in the Persistence Systems Research Group at the University of
Glasgow for stimulating discussions. We would also like to thank the
anonymous referees for their extensive feedback which improved this paper
and provided us with many useful ideas for future work. The authors
benefited from a collaboration project between Norway and the UK funded by
the Research Council of Norway and the British Council, and the Pastel
Working Group (EP 22552) funded by the European Community.

REFERENCES

[1] Lientz, B. P. and Swanson, E. B. (1980) Software Maintenance Management:
    A Study of the Maintenance in 487 Data Processing Organizations.
    Addison-Wesley Publishing Company, Reading, MA.
[2] Nosek, J. T. and Palvia, P. (1990) Software maintenance management:
    changes in the last decade. J. Software Maintenance, 2, 157–174.
[3] Sjoberg, D. I. K. (1993) Quantifying schema evolution. Information and
    Software Technology, 35, 35–44.
[4] Jorgensen, M. (1995) An empirical study of software maintenance tasks.
    J. Software Maintenance, 7, 27–48.
[5] Lehman, M. M. and Belady, L. (1985) Program Evolution, Processes of
    Software Change. APIC. Studies in Data Processing, No. 27. Academic
    Press, London.
[6] Borland C++, User’s Guide, Version 4.5. (1994) Borland International
    Inc., Scotts Valley, CA.
[7] CodeWarrior User’s Guide. (1995) Metrowerks, Inc., Austin, TX.
[8] Wirth, N. (1983) Programming in Modula-2. Springer-Verlag, New York.
[9] Cardelli, L., Donahue, J., Glassman, L., Jordan, M., Kalsow, B. and
    Nelson, G. (1989) Modula-3 Report (revised). Research Report 52, Systems
    Research Center, Digital Equipment Corporation, Palo Alto, CA.
[10] MacQueen, D. B. (1985) Modules for Standard ML. Polymorphism, 2(2).
[11] Appel, A. W. and MacQueen, D. B. (1994) Separate compilation for
    standard ML. Proc. International Conference on Programming Language
    Design and Implementation, Orlando, FL, pp. 13–23. ACM.
[12] Barnes, J. G. P. (1991) Programming in Ada plus Language Reference
    Manual. Addison-Wesley, New York.
[13] Gosling, J. and McGilton, H. (1995) The Java Language Environment: A
    White Paper. Sun Microsystems, Inc., San Jose, CA.
[14] Meyers, S., Duby, C. K. and Reiss, S. P. (1993) Constraining the
    structure and style of object-oriented programs. Proc. First Workshop on
    Principles and Practice of Constraint
    Programming (PPCP93), Newport, RI, 28–30 April, pp. 200–209.
[15] Chowdhury, A. and Meyers, S. (1993) Facilitating software maintenance
    by automated detection of constraint violations. Proc. Conference on
    Software Maintenance, Montreal, Quebec, Canada, 27–30 September,
    pp. 262–271. IEEE Computer Society Press.
[16] Glass, R. L. and Vessey, I. (1995) Contemporary application-domain
    taxonomies. IEEE Software, 12, 63–76.
[17] Anand, N. (1988) Clarify function! ACM SIGPLAN Notices, 23, 69–79.
[18] Einbu, J. M. (1988) An architectural approach to improved program
    maintainability. Software Practice Experience, 18, 51–62.
[19] Keables, J., Roberson, K. and von Mayrhauser, A. (1988) Data flow
    analysis and its application to software maintenance. Proc. Conference
    on Software Maintenance, Phoenix, AR, 24–27 October, pp. 335–347. IEEE
    Computer Society Press.
[20] Date, C. J. (1990) An Introduction To Database Systems. The Systems
    Programming Series. Addison Wesley, Reading, MA.
[21] Cardelli, L. (1989) The Quest Language and System (Tracking Draft).
    Digital Equipment Corporation, Systems Research Center, Palo Alto, CA.
[22] Milner, R., Tofte, M. and Harper, R. (1989) The Definition of Standard
    ML. MIT Press, Cambridge, MA.
[23] Morrison, R., Brown, F., Connor, R. and Dearle, A. (1989) The Napier88
    Reference Manual. Technical Report PPRR-77-89, Universities of Glasgow
    and St Andrews, UK.
[24] Bancilhon, F., Delobel, C. and Kanellakis, P. (1992) Building an
    Object-Oriented Database System: The Story of O2. Morgan Kaufmann
    Publishers, San Mateo, CA.
[25] Taylor, R. N. and Osterweil, L. J. (1980) Anomaly detection in
    concurrent software by static data flow analysis. IEEE Trans. Software
    Engng, 6, 265–278.
[26] Atkinson, M. P. (1978) Programming languages and databases. Proc.
    Fourth International Conference on Very Large Data Bases, Berlin, West
    Germany, 13–15 September, pp. 408–419. IEEE and ACM.
[27] Morrison, R., Connor, R. C. H., Cutts, Q. I., Dunstan, V. S. and Kirby,
    G. N. C. (1995) Exploiting persistent linkage in software engineering
    environments. Comput. J., 38, 1–16.
[28] Atkinson, M. and Morrison, R. (1995) Orthogonally persistent object
    systems. VLDB J., 4, 319–401.
[29] Dearle, A. (1988) On the Construction of Persistent Programming
    Environments. Ph.D. Thesis, Department of Mathematical and Computational
    Sciences, University of St Andrews, UK.
[34] Kirby, G. N. C. and Dearle, A. (1990) An Adaptive Graphical Browser for
    Napier88. University of St Andrews, UK.
[35] Sjøberg, D. I. K., Cutts, Q., Welland, R. and Atkinson, M. P. (1994)
    Analysing persistent language applications. Proc. Sixth International
    Workshop on Persistent Object Systems, Tarascon, Provence, France, 5–9
    September, pp. 235–255. Springer-Verlag, Berlin, Germany and British
    Computer Society, Swindon, UK.
[36] Elshoff, J. L. (1976) An analysis of some commercial PL/1 programs.
    IEEE Trans. Software Engng, SE-2, 113–120.
[37] Kamel, R. F. (1984) Further experience with separate compilation at
    BNR. Proc. IFIP WG2.4/IFORS System Implementation and Languages:
    Experience and Assessment, Canterbury, UK, September. North-Holland.
[38] Conradi, R. and Wanvik, D. H. (1985) Mechanisms and Tools for Separate
    Compilation. Technical Report 25/85, The Norwegian Institute of
    Technology, University of Trondheim, Norway.
[39] Sjøberg, D. I. K., Welland, R., Atkinson, M. P., Jorgensen, M.,
    Martinussen, J. P. and Maus, A. (1996) Evaluating software maintenance
    technology. Proc. Norwegian Conference in Informatics, Alta, Norway,
    18–20 November, pp. 49–61. TAPIR.
[40] Card, D. N., Church, V. E. and Agresti, W. W. (1986) An empirical study
    of software design practices. IEEE Trans. Software Engng, SE-12,
    264–270.
[41] SEL (1994) Annotated Bibliography of Software Engineering Laboratory
    Literature. SEL-82-1306, Software Engineering Laboratory, NASA/Goddard
    Space Flight Center, MD.
[42] Maes, P. (1987) Concepts and experiments in computational reflection.
    Proc. Conference on Object-Oriented Programming Systems, Languages and
    Applications, Orlando, FL, 4–8 October, pp. 147–155. ACM, New York.
[43] Stemple, D. et al. (1992) Type-Safe Linguistic Reflection: A Generator
    Technology. Technical Report FIDE/92/49, ESPRIT Basic Research Action,
    Project Number 3070–FIDE1.
[44] Albano, A., Ghelli, G. and Orsini, R. (1995) Fibonacci: a programming
    language for object databases. VLDB J., 4, 403–444.
[45] Kirby, G., Connor, R., Cutts, Q., Dearle, A., Farkas, A. and Morrison,
    R. (1992) Persistent hyper-programs. Proc. Fifth International Workshop
    on Persistent Object Systems. Design, Implementation and Use, San
    Miniato, Italy, 1–4 September, pp. 86–106. Springer-Verlag, Berlin,
    Germany, and British Computer Society, Swindon, UK.
[46] Kirby, G. N. C. (1992) Reflection and Hyper-Programming in
[30] Connor, R. C. H. (1991) Types and Polymorphism in Persistent Programming Systems. Ph.D. Thesis, Department
Persistent Programming Systems. Ph.D. Thesis, Department of Mathematical and Computational Sciences, University of
of Mathematical and Computational Sciences, University of St Andrews, UK.
St Andrews, UK. [47] Wegman, M. N. and Zadeck, F. K. (1991) Constant
[31] Cutts, Q. I. (1993) Delivering the Benefits of Persistence propagation with conditional branches. ACM Trans. Progr.
to System Construction and Execution. Ph.D. Thesis, Languages Systems, 13, 181–210.
Department of Mathematical and Computational Sciences, [48] Hutchings, A. F., McGuffin, R. W., Elliston, A. E.,
University of St Andrews, UK. Tranter, B. R. and Westmacott, P. N. (1979) CADES—
[32] Dearle, A., Cutts, Q. and Connor, R. (1993) Using persistence software engineering in practice. Proc. Fourth International
to support incremental system construction. Microprocessors Conference on Software Engineering, Munich, September,
Microsystems, 17, 161–171. pp. 136–144. IEEE Computer Society Press.
[33] Sjøberg, D. I. K., Welland, R., Atkinson, M. P., Philbrow, P. [49] Snowdon, R. A. (1981) CADES and Software System
and Waite, C. (1997) Exploiting persistence in build Development. Software Engineering Environments. North
management. Software Practice Experience, 27, 447–480. Holland.


[50] CP96 (1996) Proc. Second International Conference on Principles and Practice of Constraint Programming, Cambridge, Massachusetts, 19–22 August. Lecture Notes in Computer Science, 1118. Springer-Verlag, Berlin.
[51] Jankowski, D. J. (1994) The feasibility of CASE structured analysis methodology support. Software Engng Notes, 19, 72–82.
[52] Heitmeyer, C. L., Jeffords, R. D. and Labaw, B. G. (1996) Automatic consistency checking of requirements specifications. ACM Trans. Software Engng Methodol., 5, 231–261.
[53] ACM SIGSOFT’96 Workshops (1996) Proc. of Viewpoints’96, San Francisco, CA, 14–16 October.
[54] Freeman-Benson, B. and Borning, A. (1992) Integrating constraints with an object-oriented language. Proc. European Conference on Object-Oriented Programming, Lecture Notes in Computer Science, 615, 268–286. Springer-Verlag, Berlin.
[55] Lopez, G., Freeman-Benson, B. and Borning, A. (1994) Kaleidoscope: a constraint imperative programming language. In Mayoh, B., Tougu, E. and Penjam, J. (eds) Constraint Programming, pp. 313–329, Springer-Verlag, Berlin.
[56] Hedin, G. (1997) Attribute extensions - a technique for enforcing programming conventions. Nordic J. Comput., 4, 93–122.
[57] Wolf, A. L., Clarke, L. A. and Wileden, J. C. (1989) The AdaPIC tool set: supporting interface control and analysis throughout the software development process. IEEE Trans. Software Engng, 15, 250–263.
[58] Lieberherr, K. J., Holland, I. M. and Riel, A. J. (1988) Object-Oriented Programming: An Objective Sense of Style. Proc. Object-Oriented Programming Systems, Languages and Applications, New York, pp. 323–334. ACM Press.
[59] Lieberherr, K. J. and Holland, I. M. (1989) Assuring good style for object-oriented programs. IEEE Software, 6, 38–48.
[60] lint - a C program verifier (1988), Manual page, Unix System V release 4.
[61] Darwin, I. F. (1988) Checking C Programs with lint. O’Reilly, Sebastopol, CA.
[62] Programming Research Ltd (1993) QA C/C++. Hersham, Surrey, UK.
[63] VERILOG SA (1994) LOGISCOPE C Programming Rules Checker, Reference Manual, Version 1.0.
[64] Gimpel Software (1994) PC-lint for C/C++. Collegeville, PA, USA.
[65] Rational Software Corporation (1995) Rational Apex C/C++. User’s guide, Release 2.1, Boulder, CO.
[66] Conradi, R. (1987) Experience with Fortran verifier. Proc. European Software Engineering Conference (ESEC’87), Lecture Notes in Computer Science, 289, 263–275. Springer-Verlag, Berlin.
[67] Meyers, S. and Lejter, M. (1991) Automatic detection of C++ programming errors: initial thoughts on a lint++. Proc. USENIX C++ Conference Proceedings, pp. 29–40.
[68] Evans, D., Guttag, J., Horning, J. and Tan, Y. M. (1994) LCLint: a tool for using specifications to check code. Software Engng Notes, 19, 87–96.
[69] Guttag, J. V., Horning, J. J., Garland, S. J., Jones, K. D., Modet, A. and Wing, J. M. (1993) Larch: Languages and Tools for Formal Specification. Springer-Verlag, Berlin.
[70] Tan, Y. M. (1994) Formal Specification Techniques for Promoting Software Modularity, Enhancing Software Documentation, and Testing Specifications. Technical report MIT/LCS/TR-619, MIT Laboratory for Computer Science, MA.
[71] Weiser, M. (1979) Program Slicing: Formal, Psychological and Practical Investigations of an Automatic Program Abstraction Method. Ph.D. Thesis, University of Michigan, Ann Arbor.
[72] Tip, F. (1995) A survey of program slicing techniques. J. Progr. Languages, 3, 121–189.
