Re-Engineering Software
How to Reuse Programming to Build New, State-of-the-Art Software
Second Edition
Roy Rada
First published 1999 by Glenlake Publishing Company, Ltd.
Notice:
Product or corporate names may be trademarks or registered
trademarks, and are used only for identification and explanation
without intent to infringe.
Contents
Preface
Chapter 1 - Introduction
  The Need
  What is Reuse?
  Types of Reuse
  Domain Analysis
  Hypertext
  Epilogue
Section 1 - Background
  Scheduling
  Epilogue
Chapter 5 - Standards
  Expectations
  Existing Related Standards
  Recommendations
  Conclusion
Chapter 6 - Organizing
  Indexing
  Document Outlines
  Domain Models
  Code Organization
  Frameworks
  Epilogue
Appendix II - References
Index
Preface
First Edition
The first drafts of this book began in 1988 when Section 1, ‘Background,’
was written for a course on software engineering. Section 2, ‘Reuse
Processes’ and Section 3, ‘Practical Examples,’ began with the joining of
Roy Rada's research group in the European ESPRIT Practitioner Project
on software reuse. Three researchers at the University of Liverpool,
Weigang Wang, Karl Strickland, and Cornelia Boldyreff, were particular-
ly active in the Practitioner work. Hafedh Mili of the University of
Quebec at Montreal is the leader of the SoftClass project that is described
Second Edition
One of the greatest impacts on reuse is the widespread accessibility of
document and software archives through the World Wide Web. The
spread of the web has also stimulated further development of software
and information systems whose components can be reused. In other
words, the needs and opportunities for reuse have significantly grown.
Since this book appeared in 1995, several other reuse books have
appeared on the market, a good sign of the topic's increasing importance.
The other new books fall into a few general categories: selected
conference papers, managing software reuse, special tools or libraries for
reuse, and general reuse overviews. For example, new books in the
category of selected conference papers include one from a conference on
correctness and reusability (Wieringa and Feenstra, 1995). Such a book is
more focused and less cohesive than this book. On the topic of managing
reuse, new books by Lim (1995) and Tracz (1995) deal specifically with
managing or institutionalizing software reuse. The book in your
hand does emphasize the importance of managerial and institutional
approaches to software reuse, but also covers, in some depth, the more
asset-specific topics. In the category of special tools are many books,
such as a book on reusable Unix software (Krishnamurthy, 1995) and one on
reuse metrics (Poulin, 1997). The book in your hand addresses reuse of
Unix software components and metrics of reuse, but only as one part of
the book rather than as its theme.
Finally comes the category of general-purpose books, into which this
book falls. Here there seem to be fewer books than in the other three
categories, but the reader still has options. For instance, the book by
Bassett (1997) suits the strong, contemporary interest in frameworks.
That particular book is less scholarly in its intentions than the book in
your hand. Comparisons could continue in this fashion, but suffice it to
say that this book attempts to cover the field broadly and concisely.
Furthermore, this second edition has been updated to reflect developments
of the past two years.
The Background Section looks at the software life cycle and software
management. The Enterprise and Standards Section presents first a con-
ceptual framework for reuse that emphasizes enterprise issues and second
the important standards that are germane to reuse. The Organize, Retrieve,
and Reorganize Section examines reuse from the perspective of organiz-
ing a library, retrieving items from the library, and reorganizing or tailor-
ing the assets thus retrieved to make a new product. The Practical
During the last decade, the gap between the demand for new complex
software systems and the supply has widened. This gap and the difficul-
ties faced by software engineers in bridging it have been described as the
Software Crisis, whereby systems have become so large and complex that
creating software for them is increasingly difficult to complete on
time and within the constraints of the project budget. Software reuse is of
growing importance as a major factor in alleviating some of the problems
resulting from the Software Crisis.
The Need
Engineering is about using knowledge of natural principles from science
and technology to design and build artifacts. In the early 20th century, an
engineer was one who designed and supervised the execution of physical
systems. In the late 20th century, the notion of engineering has been
extended. For instance, Webster’s Dictionary defines engineering as ‘the
application of science and mathematics by which the properties of matter
and the sources of energy in nature are made useful to man in structures,
machines, products, systems, and processes’. A process is not necessari-
ly physical.
In the first 35 years of computer history the emphasis was on hard-
ware developments, but now the emphasis has shifted more toward
human concerns. As late as the mid-1950s, 90 percent of application
costs were devoted to hardware, but now 90 percent of the costs go to
software. This reversal reflects not only the decline in hardware costs and the
That software reuse has not been widely accepted calls into question the
suitability of existing management practices, organisational structures, and
technologies involved in the development of software. In short, a rethink
of software development is needed.
What is Reuse?
The distinction between use and reuse is sometimes a subtle one. We
would argue that success in society is intimately linked, in the first
instance, to the ability to create products and/or services that are used.
The grander success occurs when what one produces becomes a critical
building block in what others create: this is reuse.
The popular reuse icon (three green arrows in a cycle) is typically
about decomposing natural products and incorporating them in new natural
products in a cyclic way. Software can, however, be copied arbitrarily
often, and software reuse should lead to new products in a spiraling way
(see Figure 1.1 Reuse Spiral).
Figure 1.1-Reuse Spiral: The figure on the left shows the cycle of physical reuse,
whereas the software reuse spiral can lead to progressively more software from
the same original software.
nent for use in each context. In summary, reuse is the process of adapting
a generalized component to various contexts of use.
Types of Reuse
While current management practices are not suitable for reuse-oriented
development methodologies, the unsuitability is often overstated. Part of
the problem is due to a blurring of the distinction between information
life-cycles and development methodologies, and one often gets blamed
for the shortcomings of the other (Agresti, 1986). Roughly speaking, an
information life-cycle is a model for organising, planning, and controlling
the activities associated with software development and maintenance. For
the most part, a life-cycle prescribes a division of labor, and identifies and
standardises intermediary work products. A development methodology, on
the other hand, specifies a notation with which to describe those work
products and a process by which to arrive at those products.
Activities associated with the life-cycle involve financial and human
resources. Diverting resources, both human and financial, into building a
base of reusable information has a number of organisational implications,
including team structures and cost imputations. In addition to the typical
project team structure of information organisations, a reuse library team
is needed. Minimally, the library team would be responsible for packag-
ing and controlling the quality of what gets added to the reuse library
(Prieto-Diaz and Freeman, 1987). The librarians may work closely with
the project teams that develop information as those project teams both use
material from the library and contribute new material to the library.
Development methodologies broadly use either generative or build-
ing blocks approaches. The generative approach shortens the typical
information life-cycle by removing design, implementation, and testing.
Developers specify the desired product in some high-level specification
language. The generated information is usually correct by construction,
and no testing is needed (Simos, 1988).
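As a minimal sketch of the generative approach, assume a toy specification language in which a record type is given only by a name and a list of field names. The spec format and the function names generate_record_source and build_record are illustrative assumptions, not taken from any tool cited in the text. The developer writes the specification; the generator emits the source code.

```python
"""Toy generator: turns a high-level record specification into Python source."""


def generate_record_source(spec):
    """Return Python source for the class described by spec.

    spec uses a hypothetical format: {"name": str, "fields": [str, ...]}.
    """
    name = spec["name"]
    fields = spec["fields"]
    for identifier in [name, *fields]:
        if not identifier.isidentifier():
            raise ValueError(f"not a valid identifier: {identifier!r}")
    lines = [f"class {name}:"]
    if fields:
        lines.append(f"    def __init__(self, {', '.join(fields)}):")
        lines.extend(f"        self.{f} = {f}" for f in fields)
    else:
        lines.append("    def __init__(self):")
        lines.append("        pass")
    return "\n".join(lines) + "\n"


def build_record(spec):
    """Compile the generated source and return the resulting class."""
    namespace = {}
    exec(generate_record_source(spec), namespace)
    return namespace[spec["name"]]


if __name__ == "__main__":
    Point = build_record({"name": "Point", "fields": ["x", "y"]})
    p = Point(3, 4)
    print(p.x, p.y)
```

The sketch only illustrates the division of labor: every class produced this way has the same shape because it comes from one template, which is the sense in which generated output can be trusted more than hand-written code.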
The building blocks approach typically incorporates:
Domain Analysis
Recently there has been a growing interest in domain analysis and
domain model ‘reuse,’ extending the scope of reusable information to
earlier in the development life-cycle than the code stage (Mili et al.,
1994). At this stage the software developer is able to look at the structure
of a component, expressed perhaps in some formal specification method,
without the important concepts of the component being masked by
implementation details. This method does not offer the huge productivity
gains made possible by reusing a piece of code directly, but it has other
advantages. Storing components in this manner allows the range of
requirements satisfied by any component to be extended: the items are in
a simple generic form and can thus be more widely applied, and changes
to the design can be made directly. With components stored in this
manner, it should be possible to reach a stage at which, if a component at
the program code level is a close but not exact match for a developer’s
desired component, the developer can trace back through that
component’s development history and find a level of abstraction at which
the component is general enough to be reused.
The main problem with reuse is how to render the software items
readily reusable. Domain analysis can be a fundamental step in creating
truly reusable components. Organisations that have conducted domain
analysis prior to creating reusable components have met with greater suc-
cess in software reuse.
Domain Analysis is a method for analysing a software domain by
studying existing software systems, emerging technology, and the devel-
opments in terminology of the software field (Lung and Urban, 1993). In
domain analysis common characteristics from similar systems are gener-
alised, objects and operations common to all systems within the same
domain are identified, and a domain model is defined to describe the
relationships between the objects.
Domain modelling in software reuse aims to provide a framework for
the identification of objects, operations, and other structures that can be
captured as reusable software concepts. Both domain expertise and exper-
tise in design-with-reuse are used for efficient domain modelling.
Whether the domain analyst is an expert in the domain or not, she will
require access to, and experience in the use of, tools that can aid in pro-
viding an overview of the domain. Such tools and techniques have been
developed in areas such as systems analysis and knowledge engineering,
where the problem of domain comprehension is also a central issue.
The concepts of ‘domain analysis’ and ‘domain modelling’ are fun-
damental to all object-oriented approaches to software modelling. One
commonly cited and well understood example of a domain is that of
Mathematical Applications. The topics in this area can be modelled into
classes, such as ‘equations’, ‘set theory’, ‘calculus’ and so on. The equa-
tions class, for example, can be subdivided into further classes, called
subclasses, such as ‘simultaneous equations’, ‘differential equations’ and
so on, as can the other classes. Often these subclasses too can be split fur-
ther into subclasses, and so on (see Figure 1.2 Domain Analysis).
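The class hierarchy just described can be sketched as a small tree. The classes and subclasses are the ones named in the text; the DomainClass type and its methods are illustrative assumptions rather than part of any particular domain analysis tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DomainClass:
    """One class in a domain model, with any number of subclasses."""
    name: str
    subclasses: List["DomainClass"] = field(default_factory=list)

    def add(self, name):
        """Create a subclass, attach it, and return it."""
        child = DomainClass(name)
        self.subclasses.append(child)
        return child

    def walk(self, depth=0):
        """Yield (depth, name) pairs in top-down order."""
        yield depth, self.name
        for child in self.subclasses:
            yield from child.walk(depth + 1)


def mathematical_applications():
    """Build the part of the example domain that the text names."""
    domain = DomainClass("Mathematical Applications")
    equations = domain.add("equations")
    equations.add("simultaneous equations")
    equations.add("differential equations")
    domain.add("set theory")
    domain.add("calculus")
    return domain


if __name__ == "__main__":
    for depth, name in mathematical_applications().walk():
        print("  " * depth + name)
```

Because each subclass is itself a DomainClass, the subdivision can continue to any depth, which matches the way subclasses can be split further.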
A long document that is to be read by people usually has a table of
contents or an outline. This table of contents corresponds to a hierarchy
of headings in the document and gives readers an overview of the con-
tents, enabling them to find thematically-organised sections. Due to the
nature of this division, a well-organised outline is a ready-made form of
domain model. This is an example of reuse of information already con-
tained in documents helping the reuse process.
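To illustrate how an outline can serve as a domain model, the sketch below turns an indented table of contents into a nested structure. The two-space indentation convention and the function name outline_to_tree are assumptions made for the example; the headings in the demonstration are those listed for Chapter 6 of this book.

```python
def outline_to_tree(lines, indent="  "):
    """Convert indented outline lines into nested (heading, children) tuples."""
    root = ("", [])
    stack = [root]  # stack[d] is the most recent node at depth d - 1
    for raw in lines:
        if not raw.strip():
            continue
        depth = 0
        while raw.startswith(indent * (depth + 1)):
            depth += 1
        if depth + 1 > len(stack):
            raise ValueError(f"heading skips a level: {raw!r}")
        node = (raw.strip(), [])
        del stack[depth + 1:]          # forget deeper headings already closed
        stack[depth][1].append(node)   # attach to the enclosing heading
        stack.append(node)
    return root[1]


if __name__ == "__main__":
    contents = [
        "Organizing",
        "  Indexing",
        "  Document Outlines",
        "  Domain Models",
        "  Code Organization",
        "  Frameworks",
    ]
    for heading, children in outline_to_tree(contents):
        print(heading, [child for child, _ in children])
```

The resulting tree has the same shape as a hierarchy of domain classes, so an existing outline can seed a domain model without further analysis.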
Hypertext
Hypertext is a richly-linked, document-like information structure. It
allows the reader of a document to access the information stored in it
from many perceived points of view, in any order and to follow many dif-
ferent paths through the information. In a hypertext system, information
is stored in ‘chunks’ which can be of any size, depending upon imple-
mentation. These chunks are called nodes, and they can be linked togeth-
er to make up a document (see Figure 1.3 Hypertext Document).
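A minimal sketch of this structure, assuming nodes are identified by short string ids and links are one-directional, might look as follows. The Hypertext class, its method names, and the sample node texts are hypothetical.

```python
class Hypertext:
    """A document stored as text nodes joined by directed links."""

    def __init__(self):
        self.nodes = {}  # node id -> chunk of text
        self.links = {}  # node id -> list of node ids it links to

    def add_node(self, node_id, text):
        self.nodes[node_id] = text
        self.links.setdefault(node_id, [])

    def add_link(self, source, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must be existing nodes")
        self.links[source].append(target)

    def read(self, path):
        """Return the text seen by a reader who follows path, node by node."""
        for here, there in zip(path, path[1:]):
            if there not in self.links[here]:
                raise ValueError(f"no link from {here!r} to {there!r}")
        return [self.nodes[node_id] for node_id in path]


if __name__ == "__main__":
    doc = Hypertext()
    doc.add_node("intro", "Why reuse?")
    doc.add_node("types", "Types of reuse")
    doc.add_node("domain", "Domain analysis")
    doc.add_link("intro", "types")
    doc.add_link("intro", "domain")
    doc.add_link("types", "domain")
    print(doc.read(["intro", "domain"]))
    print(doc.read(["intro", "types", "domain"]))
```

Two readers starting from the same node can see different sequences of text, depending only on which links they choose to follow.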
Figure 1.2-Domain Analysis: The ‘Domain Analyst’ divides up the domain into
its various classifications.
Figure 1.3-Hypertext Document: The text of the document is held in nodes; the
order in which information is presented to the user will depend on the path
followed through the nodes.
Figure 1.4-Ccode1: A sample C program. The variables q1, q2, name, and buf
must be understood.
strcpy(name, (int) q1 + 2);
getblock();
q2 = (char *) strchr(buf, '"');
if (!q2)
{
    /* No luck this time */
    strcat(name, buf);
    getblock();
}
keep track of who has accessed what and when. It can then be an aid to
understanding which are the most important parts of the documents, who
in the team understands which parts of the system, and so forth.
Epilogue
There is increasingly a need for more reliable and complex computer pro-
grams which will be delivered on time and will be cost effective to main-
tain. Traditional software engineering techniques do not fulfill these
needs. Software reuse techniques may help.
Inhibitors to the widespread acceptance of reuse are both managerial
and technological. This book emphasises a technological approach to
reuse based on domain analysis and software libraries. The management
problems and solutions are also described.
Section 1
Background
This section provides background in the form of two chapters. The first
chapter describes the software life-cycle and the second chapter intro-
duces issues in the management of software engineers. Inhibitors to effec-
tive software reuse are largely managerial in character. Software engi-
neers must be persuaded that creating material which fits well into a
library is of long-term value. These two chapters relate to management in
software engineering and to the life cycle of software components. One
cannot properly understand the management of the people and their tools
without also understanding the processes through which the software
objects themselves go. Software reuse involves many aspects that
concern both people and software.
Chapter 2
The Software Life Cycle
The classic software life-cycle was not conceived with reuse in mind.
This life-cycle has been criticized for being inherently top-down, where-
as good software reuse techniques require a combination of top-down and
bottom-up approaches. Nevertheless, an understanding of the traditional
approach is important as a foundation for understanding reuse, and this
chapter provides that foundation.
The traditional software life-cycle emphasizes the need for each step to
meet the specifications of the previous step (see Figure 2.1 Software life-
cycle). The five basic steps are:
This stage is made easier if the previous stages have been well executed.
Figure 2.1-Software life-cycle: The software life-cycle seen in a typical
production engineering flow chart.
Requirements
Requirements should:
Design
Two different designers using the same design method and same require-
ments document would not necessarily generate the same design (though
some methodologies such as Jackson Structured Design state that they
should). The designer must still rely on his or her own insight and cre-
ativity in decomposing the system into its constituent structures and
ensuring that the design adequately captures the system specifications.
Design methods in common use have been criticized because they are
largely informal. Nevertheless they have been applied successfully in
many large projects and have resulted in significant cost reductions.
There is no single design tool which is best for all types of software
design. In fact, there are hundreds of different design tools and design
notations, each of which may be useful for describing different levels of
design within particular application areas. Some of the more common,
generic design tools include data flow diagrams, structure charts and
trees. An experiment was run in which a group of students was asked to
use various methods and a group of experts assessed the results (Yadav et
al., 1988). The conclusion of the study was that the Data Flow Diagram
method was easier to use and learn.
Data-flow diagrams may be part of a top-down design methodology.
Top-down design involves decomposing the system into its functional
sub-components and then creating a design for each sub-component. The
designs for the collection of sub-components are joined to create a design
for the overall system. For top-down design there are four phases:
Figure 2.4-Data Flow Diagram: Data flow diagram for Office Information
Retrieval System. DBMS means DataBase Management System.
Implementation
After design comes implementation of code or software. Correspondence
from design to code should be correct and traceable. It should preserve all
decisions made earlier. With a well-defined design notation, it may be
possible to specify explicit guidelines for mapping design constructs to
code.
If a project has the opportunity to select the programming language
to be used, it is desirable to consider the support for high-quality component
construction that each candidate language provides. Some of the
considerations include:
should help define the relationship between internal support, such as
screen design, and external support, such as manuals. It should clarify the
relative costs and benefits of information services, such as teaching and
consulting, versus information products, such as manuals and tutorial
disks. One principle of a theory of documentation is that:
Manuals are not the lone creations of individual writers or artists. Each
publication should be written to an engineered specification, not created
in private by an artisan. Each part of the document should go into a data
base where it will be maintained and reused by other writers. Manuals
will be developed through models and prototypes and tested before they
are drafted, by applying the principles to models.
Case studies reveal interesting patterns in investment in documenta-
tion. Firms which make a large investment in a product of which only a
few copies are sold tend to rely heavily on personal contacts with the buy-
ers to help the buyers use the product. Firms which sell large quantities
of a modestly priced product find it cost effective to have excellent doc-
umentation because the firm could not afford the number of service calls
which would be required for helping the users in person. More specifi-
cally, two variables dominate management’s decision on the relative size
of its investment in documentation. High projected sales volume seems to
predict a relatively large investment, because high volume precludes per-
sonal contact with customers before and after sales. High unit cost, on the
other hand, predicts a lower relative investment in documentation,
because personal contact is more effective in influencing the purchase
decision and competes with documentation as a means of delivering
maintenance support.
In the case of low-investment documentation the vast majority of
time spent on documentation goes towards writing. Relatively little time
Maintenance
The primary business of the software industry has historically been new
development; now it is maintenance and evolution. Today more software
professionals are employed to maintain and evolve existing applications
than to develop new systems from scratch. Software engineers need a
variety of analytical skills, tools, and methodologies to cope with the
challenges of maintaining large, aging software systems. The software
industry critically depends on enhancing the maintenance processes of
legacy or heritage systems which potentially constitute immense corpo-
rate advantages if managed effectively.
Every time an alteration is made to any aspect of the software, be it
designs or code, it is necessary to make the corresponding change in the
documentation to indicate the change. Otherwise, it may be very difficult
for another programmer to understand the software sufficiently to main-
tain it. Software maintenance is the term given to the process of modify-
ing the program after it has been delivered and is in use. These modifica-
tions may involve simple changes to correct coding errors, more exten-
sive changes to correct design errors, or drastic rewrites to accommodate
new requirements.
Programmers may hope that what is important about their programs
is immediately visible. Realistically the problems are many, and cost lies
under the surface. Generally the cost of software maintenance has
steadily increased during the past 20 years. A typical software development
organization spends about 40% to 60% of its money on software
maintenance. A common error made by maintainers is that, when an error
is encountered, the code is investigated and corrected, but the
documentation is not correspondingly updated. Whenever a change is made to
the software, whatever its format or type, the documentation must also be
updated. Otherwise a subsequent user or maintainer would find it difficult
to recognize the change that has been made to the original version of the
software.
Standards
The importance of standards in software engineering cannot be overstated.
Reuse hinges on standards. A person cannot reuse an object when the
description is not in a language (and this means not only the natural
language but also the software design language) that the person understands.
The interfaces of the object to other objects have to speak the same
language. If the components of the life cycle are to communicate with one
another, the first standard that is required is directed to the commonality
in the language that is used to describe the software life cycle. This
chapter has summarized some of the key views on and components of the
software life cycle. Next, standards for the software life cycle are considered
after an introductory look at standards organizations and processes.
The standard-making process involves many different groups whose
mode of operation is a complex combination of commerce and government.
Consensus must be obtained, and often this involves significant
expenditure of resources in time and money. An advanced country tends
to have at least one major standards institute. The United States has the
American National Standards Institute, the United Kingdom has the
British Standards Institution, and so on. An alliance of European nations
has created a standards organization called the European Committee for
Standardization (CEN), which receives input from 18 European countries. The
International Organization for Standardization receives input from virtually all
countries. In addition to the national and supranational organizations,
many commercial and volunteer groups play important roles in developing
standards (Rada et al., 1994). For instance, the Institute of Electrical
and Electronics Engineers (IEEE) sponsors the development of numerous
standards.
The International Organization for Standardization (ISO) is an independent
organization for fostering international agreement on standards with a
view to expanding international trade. ISO consists of national represen-
tatives only. The work of ISO is undertaken by Technical Committees. A
draft standard is advanced by a Technical Committee to the membership
of ISO, and if 75% of the membership of ISO approve, then the draft
becomes an International Standard.
American Standards
Traditionally, each sector interested in regulating the development and
maintenance of software has written its own specification to summarize
the requirements of interest. Over twenty years ago the U.S. government
organized a software life cycle around ten documents (NBS, 1976). At
roughly the same time, the U.S. Navy commissioned a standard for the
software life cycle (USN, 1976). The Federal Aviation Administration and
other regulatory bodies also standardized the software life cycle they
expected contractors to follow in developing quality software.
The commercial sector also helped to proliferate life cycle standards.
The Institute of Electrical and Electronics Engineers (IEEE) created a
standard for the definition of life cycle processes, IEEE Std 1074. Unlike
the aforementioned standards which placed requirements on the external
characteristics of a life cycle, 1074 focused on the life cycle model. It
specified a number of process fragments along with their inputs and out-
puts. The process architect could assemble a life cycle model from the
pieces specified by the standard. Even private companies, like IBM, had
their own life cycle definitions, treated as proprietary material because
they were presumed to confer a competitive advantage.
If the 1970s and 1980s were a period of differentiation in life cycle
standards, the 1990s were a period of consolidation. The U.S. Department of
Defense (DoD) undertook an effort to unify various software life cycle
standards it had sponsored. The IEEE and the Electronic Industries
Association then produced a joint standard that the American National
Standards Institute (ANSI) issued as ANSI Joint Standard 016.
ISO 12207
While each country was developing its own software life cycle standards,
the international standards community was also active. ISO/IEC
JTC1/SC7 (the software engineering subcommittee of Joint Technical
Committee 1 of the International Organization for Standardization and
the International Electrotechnical Commission) developed a standard
known as ISO/IEC 12207. Whereas ANSI 016 placed requirements on
only the development process, 12207 specified four additional primary
processes (acquisition, supply, maintenance, and operation), as well as
eight supporting processes and four organizational processes. Although
12207 is useful in organizational or individual contexts, its conformance
requirement specifically applies to the relationship between an acquirer
and a supplier in the development, maintenance, or operation of software.
The standard is intended to be tailored by the deletion of inapplicable
tasks (specific parts of processes) when it is applied to any particular con-
tract. (Of course, additional requirements may be placed in a contract.)
On the other hand, organizations that adopt the standard as a “condition
of trade” are expected to publish the minimum set of acceptable tasks.
Figure 2.6-IEEE and ISO: Existing IEEE Standards support the ISO 12207
Process Framework. The leftmost column is the high-level process addressed by
ISO 12207, while the second column is the next-level process in ISO 12207. The
third column is an IEEE standard that relates directly to the corresponding ISO
process.
ISO 12207 Process            Corresponding IEEE SESC Standard
Joint Review                 1028 Software Reviews and Audits
Problem Resolution           1044 Classification of Software Anomalies
Organizational Management    1058 Software Project Management Plans
Infrastructure               1209 Evaluation and Selection of CASE Tools
Tailoring                    1074 Developing Software Life Cycle Processes
A 12207 badge suits the needs of the U.S. defense industry, one of the pri-
mary customers of software engineering standards. When the Department
of Defense stopped writing and imposing its own process standards, it
endorsed the concept that contractors should develop their own organiza-
tional processes conforming to generally accepted standards. Defense
contractors are among the first to display the 12207 badge. ISO is exploit-
ing the success of 12207 in several ways. A guidebook providing advice
on 12207 usage is available. ISO is reshaping its other software engi-
neering standards projects so they will “plug into” the 12207 life cycle
processes.
Process Assessment Methods
A badge already familiar to some U.S. software developers is the
Capability Maturity Model (CMM) developed by the Software Engineering
Institute at Carnegie Mellon University. The CMM began as a
self-assessment mechanism intended to guide the improvement of
organizational software development processes. It grew into an evaluation
mechanism, applied by some agencies of the U.S. Department of Defense to
judge the capabilities of potential suppliers.
U.S. software developers may not be as familiar with assessment
models and methods, such as Trillium (Bell, 1994), Bootstrap and per-
haps a half-dozen others, developed by organizations in other countries.
The proliferation of assessment mechanisms presents a barrier to organi-
zations desiring to act as suppliers in the international software market-
place.
Epilogue
Reuse may take place at any level of the life-cycle model, and thus the
model needs to be addressed at each level to reflect reuse practices.
Perhaps one of the most fundamental criticisms of the traditional software
life cycle is its separation of design and implementation. In a reuse envi-
ronment, design and implementation are linked. To implement with a pre-
designed (reusable) component may mean going back and changing the
design. Analysts may need to look ahead in the process to determine what
components are available to them in the components library and tailor
their design accordingly. It would therefore be better if these two stages
were linked.
Reuse of components is greatly facilitated if the components are in
machine-readable form. This is essential for effective computer organization
and retrieval of the material (which is discussed later in this book).
It also means that the components can be loaded into software engineering
support tools, which usually offer the easiest way to manipulate this
material. This portability of representations is unlikely to
be generally possible until there are more widespread standards for these
representations and tools.
The plethora of existing software life cycle models and languages is
one barrier to reuse. At least, within a given domain such as manufactur-
ing software or health care software, software engineering organizations
in that discipline need to agree on some standards. This agreement on
standards is a salient mark of a profession. This conformance to the stan-
dards will be associated with badges. Software product and service
providers will earn and wear these badges to signify their compliance
with professional standards.
Chapter 3
Management
responsibility of the person given the task. A serious problem arises when
no team members have substantial experience, for then the authority of
the experienced members is missing and coordination suffers.
directed by the other program managers. The quality manager has a direct
line to the director so as to avoid distortion of the quality assessment
results. The program managers have project managers who report to
them, and team leaders report to the project managers.
The chief programmer team approach utilizes an experienced chief
programmer and provides him with substantial support (Baker, 1972). All
communication goes through the team chief. The other members of the
team might include:
Figure 3.2-Chief Programmer: The Chief Programmer paradigm allows the
expert programmer to take advantage of the resources of a team of assistants.
Process Modeling
Reuse can apply to more than code; it can also apply to the process models
of an organization. These models were discussed loosely in the preceding
section. Here, one particular method and language for developing such
process models is presented. Activity standards are prescribed in places
such as ISO 12207 (see the preceding chapter), but the process model for a
particular organization must include, and thus go beyond, the activity
models.
Software reuse has moved steadily away from an emphasis on
simply reusing code toward an emphasis on appreciating and reusing
information throughout the software life cycle. This has corresponded with an
appreciation of the importance of describing the entire organizational
operation and reusing that information as well (Cockburn, 1996). This
higher level analysis of activities in an organization may go under the
heading of patterns work, and an issue of the Communications of the
ACM was devoted to such patterns in 1996 (Schmidt et al, 1996).
A simple taxonomy of business process models distinguishes formal
methods from empirical ones. The object-oriented paradigm used here is
empirical. The modeling language is called Gertrude (Succi et al, 1997)
and uses four entities: people, roles, processes, and infrastructures.
People fill roles. Roles execute processes through activity profiles of the
roles and processes. Infrastructure provides materials for processes.
Activities are atomic processes. People are the employees of the firm;
they play roles to perform the activities. Infrastructures are passive
physical objects, such as equipment and facilities. A process may be further
categorized as interfacing (to roles and infrastructure outside the firm),
as controlling (other processes inside the firm), or as neither interfacing
nor controlling.
As employees in the firm must benefit from the model, they need to
understand the modeling language to some extent. The modeler has to
know the details of the language. Other people may be given external
views that hide complexities or details of the model and its language so
that those people can derive from the model just what they need to know.
To do quantitative evaluations, such as productivity and profitability,
activity-based costing is incorporated in the Gertrude approach. When a
process is executed, data is collected about resource consumption. If a
process is reused with changes, it is necessary to evaluate the effects of
the changes on the profitability of the process.
The modeling process is approached in an iterative fashion both off-
line and on-line. Off-line means by benefiting from accumulated infor-
mation. On-line means the firm provides data on a day-to-day basis that
is directly incorporated in the model building.
An example of the modeling is presented. Employees describe the
operations in the firm. Some general patterns appear. For instance, a ver-
bal description like this might appear:
The client asks for a design. Sue handles the order and requirements.
Sue informs John about the resource requirements. John uses a
resource-allocation strategy to choose which designers will handle
the project. The designers work on the project. The design is passed
to Ann who meets with the client to discuss the design’s suitability.
Ann then reports to John on the success of the design, and John deter-
mines the next step in resource allocation.
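
The entities in this description can be sketched in code. The following is a minimal Python illustration, not Gertrude syntax: the class layout, the role names (order handler, resource manager, designer, client liaison), and the process names are assumptions read off the verbal description above. It shows only how people fill roles and roles execute processes.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A role and the processes it executes (names are illustrative)."""
    name: str
    processes: list[str] = field(default_factory=list)


@dataclass
class Person:
    """An employee and the roles he or she fills."""
    name: str
    roles: list[Role] = field(default_factory=list)


# Hypothetical roles inferred from the verbal description; not Gertrude syntax.
order_handler = Role("order handler", ["take order", "capture requirements"])
resource_manager = Role("resource manager", ["allocate designers", "decide next step"])
designer = Role("designer", ["produce design"])  # the designers are unnamed in the text
client_liaison = Role("client liaison", ["review design with client", "report outcome"])

people = [
    Person("Sue", [order_handler]),
    Person("John", [resource_manager]),
    Person("Ann", [client_liaison]),
]


def who_executes(process: str, staff: list[Person]) -> list[str]:
    """Return the names of the people whose roles execute the given process."""
    return [p.name for p in staff for r in p.roles if process in r.processes]
```

Asking `who_executes("allocate designers", people)` returns John's name, because John fills the role that executes that process. Patterns such as "one person fills one role" become visible once several descriptions are encoded this way.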
Figure 3.3-Roles Hierarchy: The boxes describe roles. The lines indicate
taxonomy hierarchical relations.
Each new person brought into a project needs instructions about what
he or she is to do, which takes time from those who give the instructions.
The new person can do work but also needs guidance, which costs the
work time of others. In the worst case, the communication needs are such
that a new person must communicate with every person already on the
project. If there are $p$ people and each must spend time in contact with the
other $p-1$ people, then the amount of contact is $p(p-1)$, or roughly $p^2$. The
effort that can be expended by $p$ people over time $t$ when there is no
overhead cost is $p \times t$. Given that the effort $e$ satisfies $e = p \times t$, the
time to complete a given task is $t = e/p$. As $p$ grows, $t$ declines. If the
communication costs are considered, then these costs are proportional to
the square of the number of people. In this case, the curve of $t$ versus $p$
no longer declines monotonically with $p$; rather, there is a turning
point after which $t$ rises with rising $p$ (see Figure 3.5 Time Rising).
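
The shape of this curve can be checked numerically. The sketch below is one possible reading of the argument, not a formula given in the text: it assumes each ordered pair of people adds a fixed amount of effort (the hypothetical parameter `pair_cost`), so total effort is the task effort plus `pair_cost` times p(p-1), shared equally by the p people.

```python
def completion_time(effort: float, people: int, pair_cost: float = 0.0) -> float:
    """Time to finish a task under the assumed communication-cost model.

    Assumption: total effort = effort + pair_cost * p * (p - 1),
    divided equally among p people. With pair_cost = 0 this is t = e / p.
    """
    if people < 1:
        raise ValueError("people must be at least 1")
    total_effort = effort + pair_cost * people * (people - 1)
    return total_effort / people


def best_team_size(effort: float, pair_cost: float, max_people: int = 1000) -> int:
    """Team size, searched from 1 to max_people, that minimizes completion time."""
    return min(
        range(1, max_people + 1),
        key=lambda p: completion_time(effort, p, pair_cost),
    )
```

With an effort of 100 units and a pair cost of 1 unit, completion time falls until the team reaches 10 people and rises after that, which is the behavior the paragraph describes.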
Figure 3.5-Time Rising: Curve showing the initial decline but the subsequent
rise in time as people are added to the job.
Scheduling
Without some estimate of programmer productivity, project scheduling is
impossible. Also, some of the advantages of new programming
and management methodologies are difficult to assess without quantitative
measures of programmer productivity. The most commonly used
measure of programmer productivity is lines of source code per
programmer-month. This is computed by dividing the number of lines of
source code delivered by the programmer-months in the project. The
programmer-months include analysis and design, coding, testing, and
documentation time. One of the difficulties in applying this measure of
productivity is defining ‘a line of code’ (Jones, 1978). Another difficulty is
that this measure does not take into account the quality of the code
produced, only the quantity.
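
The productivity measure is simple arithmetic, sketched here in Python. The function names are illustrative. The schedule estimate is an added assumption: it divides the work evenly among programmers and ignores the communication overhead discussed earlier.

```python
def productivity(lines_delivered: int, programmer_months: float) -> float:
    """Lines of source code delivered per programmer-month."""
    if programmer_months <= 0:
        raise ValueError("programmer_months must be positive")
    return lines_delivered / programmer_months


def estimated_months(expected_lines: int, loc_per_programmer_month: float,
                     programmers: int) -> float:
    """Rough calendar time, assuming work divides evenly with no overhead."""
    if loc_per_programmer_month <= 0 or programmers < 1:
        raise ValueError("rate must be positive and programmers at least 1")
    return expected_lines / (loc_per_programmer_month * programmers)
```

For example, a past project that delivered 12,000 lines in 40 programmer-months has a productivity of 300 lines per programmer-month. At that rate, four programmers would need about 10 months for a new project of the same size.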
Figure 3.6-Bar Chart: Person A does tasks T1, T2, and T4, while Person B does
tasks T3 and T5.
Epilogue
Producing large systems is a difficult task of managing both people and
information. This is true in a reuse environment where extra personnel,
such as those that manage and retrieve elements from the reuse library,
are needed. The chief programmer approach to software team organization
includes reuse-oriented roles, such as the librarian role.
Of the many models that have been developed to describe the development
of software systems, the COCOMO model is perhaps the best
known. It starts from an estimate of the number of lines of code that a
project will need to produce and shows interesting relations among project
type and effort required. Reuse parameters can be folded into the COCOMO model.
The mythical man-month notion (Brooks, 1975) shows that adding
more people to a project does not necessarily reduce the time needed to
finish the project. Adding a person to the project may bring more com-
munication costs than productivity benefits. Initiating a reuse effort may
initially require additional staff and their concomitant costs.
To schedule a software project, tasks and their dependencies must be
appreciated. Software reuse introduces new tasks and dependencies. The
challenge then is to manage these dependencies so that overall benefit
exceeds overall cost.
Section 2
Enterprise and
Standards
This section contains two chapters. The first details a conceptual framework
for reuse. A reuse life cycle for software assets is described that
centers around the library of assets. The human issues in the reuse
framework are also emphasized in a cycle of activity that goes from planning
to enactment to learning. The enterprise chapter also addresses economics
and legal issues. Software reuse involves many aspects, some
concerned with people and others with software. For instance, to establish
a software reuse library one should first estimate the costs of developing
software with or without the library. Legal matters, such as copyright,
may play an important role in determining what is or is not reused.
The second chapter of this section looks at standards relevant to
reuse. Reuse in some fundamental ways requires standardization. People
have to agree on the language of discourse, assets have to be able to com-
municate with other assets through standard interfaces, and organizations
must communicate in clear, systematic ways to their employees about
processes for reuse. Certainly, standards within an organization exist but
these are being supported increasingly by formal, global standards, and
those are the subjects of the second chapter in this Section.
Chapter 4
Reuse Framework
The vision for reuse is to move from the current ‘re-invent the software’
cycle to a library-based way of constructing software (DoD, 1992). A
conceptual framework for reuse should provide the technological and
management basis to influence and enable this paradigm shift. In this new
paradigm the standard approach to software development is to derive sys-
tems principally from existing assets rather than to create the systems
anew. Reusable assets are thus a central concept of the reuse vision, and
they imply a need for processes to create such assets, manage them and
utilize them to produce new systems.
Experience suggests that this library-based approach must be
domain-specific. Being domain-specific means that the reusable assets,
the development processes, and the supporting technology are appropri-
ate to the application domain for which the software is being developed.
Application domains are generally considered to be broad in scope, for
example communication systems. The effectiveness of domain-specific
assets depends on a number of factors, including the maturity of the appli-
cation domain and the investment applied to create the assets. As a
domain matures, it generally becomes more stable and better understood,
thus increasing the likelihood that assets will be reusable. However, even
in mature domains, asset reusability and quality will be maximized only
if suitable investment has been applied to identify and exploit key reuse
opportunities. Domain analysis and its resultant models are critical to the
success of a domain-specific reuse program.
Process Idioms and Sources
Software engineering should be done in accordance with well defined,
repeatable processes. One framework for reuse consists of dual, interconnected
‘process idioms’ called Reuse Management and Reuse
Engineering (see Figure 4.1 Reuse Management and Reuse Engineering).
Outputs from the Framework are software systems and new reusable
assets.
Reuse Management
Reuse involves both people and information. The Reuse Management
process idiom focuses on people. It describes a cyclical pattern of plan-
ning, enacting, and learning (see Figure 4.2 Reuse Management). Reuse
Management incorporates emerging general theories of organizational
learning (Senge, 1990) that have been adapted to the reuse-based soft-
ware engineering context. The following subsections deal with managing
people, but later this chapter will emphasize the information side of reuse
under the headings of Asset Creation, Asset Management, and Asset
Utilization.
Reuse Planning
The Reuse Planning process encompasses both strategic planning and
tactical, project-oriented planning within a reuse program. One key
strategic reuse planning function, which augments traditional product line
planning within an organization, is to select the key domains of focus for
the reuse program and determine how the domain assets will support the
organization’s product engineering efforts. One key focus of Reuse
Planning is the reuse infrastructure that is required to sustain a reuse-
Asset Creation
Reuse Engineering addresses the creation, management, and utilization of
reusable assets. Asset Management serves a brokerage role between
Asset Creation and Asset Utilization and reflects common marketplace
interactions (see Figure 4.3 Decomposition of Reuse Engineering). In an
organization that has a mature reuse program underway, there will likely
be multiple Asset Creation, Asset Management, and Asset Utilization
projects in operation simultaneously.
The goals of Asset Creation are to capture, organize, and represent
knowledge about a domain, and use that knowledge to develop reusable
assets. Asset Creation can be viewed as consisting of:
• Reverse engineering,
• Knowledge acquisition,
• Technology forecasting,
• Domain modeling, and
• Asset specification.
Figure 4.3-Decomposition of Reuse Engineering:
• Asset Creation
    Domain Analysis and Modeling
    Domain Architecture Development
    Asset Implementation
• Asset Management
    Library Operation
    Library Data Modeling
    Library Usage Support
    Asset Brokering
    Asset Acquisition
    Asset Acceptance
    Asset Cataloging
    Asset Certification
• Asset Utilization
    Asset Criteria Determination
    Asset Identification
    Asset Selection
    Asset Tailoring
    Asset Integration
Asset Management
The Asset Management processes fall into two general classes: processes
that focus on acquiring, installing, and evaluating individual assets in a
library, and processes that focus on developing and operating libraries
that house collections of assets, provide access to those assets, and sup-
port their utilization (see Figure 4.4 Managing Engineering). Asset
Management overlaps in some ways with Reuse Management.
Organizational assets, such as plans, are generally treated as part of the
reuse infrastructure, while library support technology is generally
Figure 4.4-Managing Engineering: The reuse cycle is shown with detail provided
for the Management phase in which Library and Asset processes occur.
Library Processes
A library houses managed asset collections. A library need not be auto-
mated to effectively manage a collection of assets and serve a useful
mediator role between Asset Creation and Asset Utilization processes.
The goal of Library Operation processes is to ensure the availability
and accessibility of the library and its associated assets for Asset
Utilization. This can involve a variety of activities, such as:
ity, and rationale for its request. The librarians evaluate each proposal for
a component to ensure that the proposer has at least one application in
mind, to assess the cost-effectiveness of the component, and to identify
other beneficial characteristics, such as wide applicability and low
complexity.
The library acquires reusable components from numerous sources.
There is no absolute need for the library to contain a physical copy of
each component. A component would be incorporated by reference only,
if it is:
Asset Utilization
The goal of Asset Utilization is to construct new application products
using previously developed assets. The outputs from Asset Utilization
include:
Figure 4.5-Reuse Costs: Initially the cost of the reuse program is very high, as
the library of reusable items is built. Returns cannot start until the library is usable,
and then will tend to be low, as the library needs to become suited to the
developers who use it and gaps in the library’s content are filled. The final level of
return is uncertain; it should become high and remain high, but this will depend
on the quality of the reuse library, its retrieval system, and the willingness of
developers to reuse.
Money
Costs versus Benefits
Software reuse is difficult for companies to initiate because it has the least
desirable cost structure: costs are very high initially and are recouped
only over time. Start-up
costs include setting up the software component library that will be
needed, training staff to set up the library, populating the library and train-
ing the software development teams to use the library software and reuse
the actual components. Gradual returns over the life of the library should
pay for initial costs over a period of time, and return profits, but may not.
There are other possible benefits in the quality of the software produced
and minimization of developer time and resources, but these benefits may
be difficult to measure (see Figure 4.5 Reuse Costs).
To be willing to provide an investment in reuse, companies will need
proof of the benefits of reuse. Some of the benefits of reuse are tangible,
for example the time to build a project should be reduced. But the way to
prove that project development time had been reduced would be to build
a project using conventional means and then to ‘go back in time’ to before
it was written and to create it again using exactly the same people, with-
out the knowledge they gained by building the system in the first case,
with reusable components. This is of course impossible, and would in any
case be inconclusive since the benefits of reuse will vary from project to
project. Most of the benefits of reuse, increased quality of programs and
documentation, less testing required, fewer skills needed within the
development team, and the production of components available for future
projects, are difficult to measure.
Reuse is only cost effective if it actually takes place. If on a project
reuse options are investigated, i.e., requests for reusable components are
formulated and submitted to the library system, and some components
retrieved and investigated for suitability and none of them prove suitable
or none are found, then a perhaps significant amount of time has been
wasted by the developers. Also the library has failed to live up to its
expectations on this project and thus no money used to set up and run the
reuse library has been recouped. For a new library and reuse methodology,
low returns on reuse should be expected quite regularly unless very
significant amounts of money have been spent creating the library.
The components created (since reusable ones could not be found)
should be used to add to the library, but this will be more expensive than
custom-creating the components for just this project.
The method used to estimate reuse costs and benefits should be com-
patible with the methods used by the rest of the company. The two quan-
tities of primary importance are:
• the net saving to the individual user for each instance of reuse of a
component and
• the net saving to the company from all reuses of the component.
• The cost to reuse, $C_R$, is the cost incurred each time the component
is reused, including retrieval and tailoring costs.
• The accession time, $T_A$, is the amount of time between the decision
to acquire the component and its availability in the library.
• The accession cost, $C_A$, is the cost to add the component to the
library.
• The maintenance cost, $C_M$, is the cost to maintain the component
in the library. Again if yearly costs vary, the maintenance cost distribution
$C_{My}$, where $y = 1, 2, \ldots, L$, is the cost to maintain the component
for each year of its service life.
be given the discount rate $i$. Thus the annual discounted NSP can be given
as $DNSP_y = ((NSR \times N_y) - C_{My}) / (1 + i)^y$.
To compare two potential reusable components to determine which to
acquire for the library and which not, the cumulative discounted cash
flow CDCF for each would be determined, and the component with the
higher CDCF would be preferred:
$CDCF = DNSP_1 + DNSP_2 + \cdots + DNSP_L - C_A$.
With the appropriate data and these formulas, a company
can better plan its reuse investments.
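
The DNSP and CDCF formulas can be evaluated directly, as in the Python sketch below. Two readings are assumptions, because the definitions do not survive in the text above: NSR is taken to be the net saving per instance of reuse, and N_y the number of reuses in year y. The function and parameter names are illustrative.

```python
from typing import Sequence


def dnsp(nsr: float, reuses: float, maintenance: float,
         rate: float, year: int) -> float:
    """Discounted net saving for one year: ((NSR * N_y) - C_My) / (1 + i)^y."""
    return (nsr * reuses - maintenance) / (1 + rate) ** year


def cdcf(nsr: float, reuses_per_year: Sequence[float],
         maintenance_per_year: Sequence[float],
         rate: float, accession_cost: float) -> float:
    """Cumulative discounted cash flow: sum of DNSP_y over the service life, minus C_A."""
    if len(reuses_per_year) != len(maintenance_per_year):
        raise ValueError("need one reuse count and one maintenance cost per year")
    yearly = zip(reuses_per_year, maintenance_per_year)
    total = sum(dnsp(nsr, n, m, rate, y) for y, (n, m) in enumerate(yearly, start=1))
    return total - accession_cost
```

To choose between two candidate components, compute `cdcf` for each with the same discount rate and prefer the one with the higher value. The result is only as good as the estimates of reuse counts and maintenance costs fed into it.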
Depending on the management strategy adopted by the software
development company, the costs of reuse may be absorbed by the devel-
oper in the hope of creating similar projects later, and thus recouping the
investment that way. Or alternatively the developer may form a partner-
ship with the client, where initial startup costs and any later benefits are
both shared.
Legal Issues
If the client of a software house is willing to allow the software house to
reuse components developed for them for other clients who need similar
systems, then the software house has more reasons to practice software
reuse. However, who owns a component developed for
a project could be a thorny issue. Does permission need to be sought to
include a component developed for a project in a library, if that compo-
nent is generic? Should commission be paid to the instigating client?
These issues may be soluble by written agreements listing terms and con-
ditions between the client and the developers.
Legal issues affect a reuse effort as they affect any other endeavor
that involves the utilization of the work of others in the creation of new
work. There must be clear legal boundaries defining what is reuse and
what would constitute plagiarism. Copyright exists in most western coun-
tries to some extent, and motion pictures and books have well defined
protection in law (Lahore and Dworkin, 1984). In nearly all countries
legal protection of software is more hazy as the law attempts to get to
grips with and catch up with the technologies to hand.
Within the European Union moves are under way to standardize leg-
islation between the various member states. Computer software is gener-
ally regarded as intellectual property and as such may be protected under
patent and copyright law. Copyright protects the expression of the
software (the routines and the order in which they are called), whereas
a patent protects the industrial realization of the concept. That the rationale
Impact
Successfully setting up a reuse program involves transferring research
approaches to reuse into standard industrial practices. Four main stages
are identifiable in the transfer process:
• the Audit,
• the Planning Stage,
• the Implementation, and
• the Evaluation.
Epilogue
The themes about reuse as expressed in the Conceptual Framework for
Reuse Processes are:
level. It provides a basis for the analysis of reuse processes and the defi-
nition of reusable assets.
The individual reuse processes gain synergistic value when viewed as
modular building blocks that can be used to construct a wide variety of
reuse-specific process configurations reflecting different planning levels,
organizational structures, and interaction patterns. To support the con-
struction of process configurations, the Framework should include a set
of composition techniques to connect the processes together in a variety
of ways. These techniques provide a flexible and scalable composition
approach enabling the Framework to capture aspects of reuse-based engi-
neering practice not easily described with traditional software life cycle
models.
Modifying the software life cycle for reuse affects everyone
involved in the creation of software. Formulas exist for determining the
costs and benefits, but obtaining realistic values for the variables in the
formulas is challenging. Everyone from developers to those who manage
budgets and plan projects and even the final customers for the programs
produced will play some role in the ultimate value of the reuse effort. If
reuse is successful at a company, then everyone involved will benefit.
Developers will not have to waste time developing software repetitively
and costs will be reduced in the long term. These savings can be passed
to the customer in reduced costs, faster delivery, and increased reliability.
Chapter 5
Standards
The Reuse Planning Group will define for SESC a statement of direc-
tion for IEEE standards related to the analysis, design, implementa-
tion, validation, verification, documentation, and maintenance of
reusable software assets as well as their supporting infrastructure in
the creation of new applications.
Expectations
User expectations have been annotated to categorize the kind of standard
that might address such an expectation. The lists of expectations here are
based on those contained in the IEEE SESC Master Plan for Software
Engineering Standards. At the beginning is a list of postulated roles that
reuse standards might fill. Each expectation is annotated with one or more
roles indicating that standards fulfilling the annotated role should address
the expectation.
The postulated roles are:
The lists of expectations can be regrouped by role. For instance, the soft-
ware asset manager expects the standards to provide requirements or
guidelines that ensure that the theory for software depreciation schedules
has a practical business foundation and that cost models for modifying
versus replacing software are reliable.
Existing Related Standards
Which existing documents might be of use in a standardization program?
Documents are considered that are normative (i.e., standard-like) in
nature. Each document is described in terms of its purpose, scope, and
audience. An evaluation is performed to assign each document a candi-
date usage.
Evaluation Criteria
The purpose of evaluating existing documents is to determine how they
might appropriately contribute to standardization efforts regarding soft-
ware reuse. Evaluation criteria are formulated to support the making of
such recommendations. The result of each evaluation will be an assign-
ment of a candidate usage for the document. There are four categories of
usage called base document, normative advice, helpful information, and
not useful with the following descriptions:
The Impact criterion evaluates the extent to which the document has
already achieved acceptance in the reuse and software engineering com-
munity. Impact is determined on the basis of the following characteristics:
Figure 5.1-ARPA Evaluation: The document titles are in the first column. The
document attributes are given in the first row. The other entries are the values on
each attribute for each document.

Document          Relevance   Impact   Nature     Currency     Quality        Usage
CFRP              High        High     Guidance   Far Seeing   Good to High   Base Document
Strategy          High        Some     Conform    Far Seeing   Good           Normative Advice
Direction-Level   Relevant    Some     Guidance   Current      Good           Normative Advice
Figure 5.2-DoD Document Evaluations: Two DoD documents are listed in the
leftmost column and their attributes are named in the top row.

Document   Relevance   Impact           Normative Nature   Currency   Quality   Candidate Usage
Primer     Relevant    Some to Little   Information        Current    Good      Helpful Information
Figure 5.3-Armed Forces Standards: The two Armed Forces documents are
listed in the leftmost column and their attribute values in the interior cells of the
array.

Document      Relevance   Impact   Normative Nature   Currency   Quality   Candidate Usage
Acquisition   Relevant    Some     Information        Current    Good      Helpful Information
Recommendations
The Master Plan of the IEEE SESC describes an evolutionary approach
to reorganizing the collection of SESC standards. Future SESC standards
related to reuse can be positioned within this master plan. The master plan
provides a four-layer model structured as follows (see Figure 5.4
Organization of SESC):
1. Terminology
2. Master Road Map
3. Program Elements
4. Technique Standards
Figure 5.4-Organization of SESC: The four major components are shown in
their order
• ANSI J-016
• CARDS Direction Level Handbook
• DoD Software Reuse Institute Reuse Business Model
• Software Productivity Consortium Reuse Adoption Guidebook
Conclusion
In 1992 the North Atlantic Treaty Organization (NATO) issued three doc-
uments about software reuse as standards for NATO. When one reads the
NATO documents, one appreciates that they are NATO-specific. In other
words, they are plans for how NATO itself will conduct software reuse
activities.
The observation about the NATO-specific standards is merely one
example of a more general phenomenon in reuse standardization—the
most tangible reuse standards are specific to a single organization. That’s
because successful reuse practices touch many parts of an organization’s
methods and culture for doing business.
To better appreciate the organization-specific character of the most
tangible reuse guidelines, consider an example. Assume a standard says
that any reusable subroutine must be documented with its “intended function.”
In an organization indoctrinated in Harlan Mills’s structured programming
techniques, this has a precise mathematical definition that is
well understood because of the corporate culture of pursuing Mills’s
techniques (Mills et al, 1986). If one generalizes this standard to other
organizations and expects them to use Mills’s techniques, one would be
faced with the complaint that “this organization documents its software
differently.” So in an effort to be general one would relax the standard to
a requirement that software be documented, a very non-specific requirement
that can be satisfied in a variety of superficial ways. In broadening
the applicability of the standard, one robs the standard of its ability to
discriminate. This phenomenon appears over and over again in software
reuse standards.
Standards can be developed by many different kinds of organizations
(Rada and Berg, 1995). IBM and Motorola have developed corporate
standards for reuse among their employees. Government agencies have
developed standards for how government contractors must follow soft-
ware reuse processes. Relatively little has been done explicitly about soft-
ware reuse by official standards development organizations.
The major international standards organization relative to information
technology is ISO-IEC JTC1. The standards from the US Military
and NATO are not JTC1 standards. JTC1 has one project underway that
The recommendations are notable also for the efforts that were con-
sidered but rejected. For example, the group rejected efforts for standards
related to “reuse development practices” because they were regarded as
being too specific to tools and technologies and being too low-level and
voluminous. Standards for library organization and operation were reject-
ed because reliable measures of effectiveness do not yet exist. Guides for
reuse adoption were rejected because there is little evidence of repeatable
effectiveness and because there is little evidence of de facto consensus.
Perhaps most notably, an effort to write a standard for the assessment of
reusable components was rejected because of the concerns regarding the
ability to “scale” reuse practices beyond the scope of a single organiza-
tion.
The US Department of Defense is already writing software development
contracts that require reuse. Yet no internationally recognized standard
exists for how software reuse should occur. Such standards are
needed.
Document-oriented and object-oriented approaches to all three phases are
discussed and compared, as are applications of both techniques to program
code and to higher levels of abstraction such as software requirements. The
discussion of program document retrieval is augmented by practical
examples.
A major problem to be overcome in the deployment of software reuse
techniques is to identify appropriate methods for the classification and
retrieval of software items. If retrieval is not made easy for developers,
they will prefer to re-write the component from scratch. At the same time
classification must be simple and cost effective to justify the cost of set-
ting up and maintaining the library of items against the cost of develop-
ing the items each time from scratch. These twin aims of simple classifi-
cation and powerful retrieval are contradictory in themselves, as will be
explained.
Chapter 6
Organizing
Indexing
After material has been collected and analyzed (as described earlier), it
should be indexed. The two most important basic approaches to docu-
ment indexing are the interpretive and structural approaches. With inter-
pretive indexing the document is read and understood before index terms
are assigned (Kaplan and Maarek, 1990). In contrast, the structural
approach uses the frequency of word usage in natural text as an indicator
of relevance of the contents to a topic without semantic interpretation
Figure 6.1-Component Sources: Components may arrive for the library from a
software house, or from public domain sources, or from projects on-going which
have developed new components suitable for reuse.
“Computer Aided Instruction (CAI) systems have improved greatly over the
years. Their knowledge content is still however low, and the system decisions,
though usually correct cannot be analyzed for the reasoning behind them.
Knowledge can be stored in the domain of a learner model in a variety of ways,
and thus a student may have to acquire knowledge using several different
methods.”
Document Outlines
Successful interpretive indexing requires understanding a document’s
content. A document’s outline may provide a valuable guide to this
understanding. Most well prepared documents have an outline. The out-
line manifests itself in the layout of the document as highlighted and
numbered headings in the document body, as well as in a separate listing
at the beginning of the document in the ‘table of contents.’ This physical
layout helps people understand the logical structures of the document and
find thematically-organized sections in the document. This means that
there is an existing organization of themes (or sort of domain model) in
the document that can be easily observed and perhaps utilized in orga-
nizing the document for reuse systems. Analysis of the outline provides a
quick and simple insight into the content of the document. Tools can auto-
matically extract an outline from a document and thus split documents
into themed sections, showing the relations between these sections (see
Figure 6.3 Outline Analysis).
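As a rough sketch of such a tool, the following Python fragment splits a document into sections by its headings. It assumes that headings carry dotted section numbers such as ‘1.1 Title’ (the text does not fix a heading format), and the name extract_outline and the sample text are hypothetical.

```python
import re

# Assumed heading format: a dotted section number, a space, then the title.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")

def extract_outline(text):
    """Return a list of (depth, number, title, body_lines), one per numbered heading."""
    sections = []
    for line in text.splitlines():
        match = HEADING.match(line.strip())
        if match:
            number, title = match.groups()
            depth = number.count(".") + 1   # '1.1' is one level below '1'
            sections.append((depth, number, title, []))
        elif sections and line.strip():
            # Ordinary text belongs to the most recent heading.
            sections[-1][3].append(line.strip())
    return sections

sample = """1 Basic Concepts
Software is more than code.
1.1 What is software?
Programs plus documents.
2 Software Specification
2.1 Requirements Documents
"""

for depth, number, title, body in extract_outline(sample):
    print("  " * (depth - 1) + number + " " + title)
```

The depth of each heading is taken from its section number, so the themed sections and their nesting fall out of the numbering alone. A document whose body lines begin with a number would need a stricter heading test.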
Obviously the size, balance and depth of this structure vary from
document to document. There are no strict rules for the construction of
headings or outlines, though some patterns in headings do exist. In many
cases, headings are noun phrases, such as ‘Introduction to Database
System Concepts,’ ‘Physical Data Organization’ and ‘Protecting the
Database Against Misuse.’ But headings may even be complete sen-
tences, such as ‘What Makes Interlisp Unique?.’ The heading of a section
should briefly describe the contents of that section, and headings can be
seen as ‘content index’ terms for a document. It follows therefore that one
may extract index terms from the headings to represent more generically
the sections under those headings. For example, both ‘The Network
Model’ and ‘The Relational Model’ may be indexed by the key word
‘Model.’
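One simple way to find such generic index terms, offered here only as a sketch, is to look for content words shared by several headings. The stop list and the function names below are assumptions made for illustration.

```python
from collections import Counter

# A tiny, assumed stop list; a real indexer would use a fuller one.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "against", "what", "is"}

def heading_terms(heading):
    """Lower-cased content words of one heading, in order of appearance."""
    words = [w.strip("?.,:'").lower() for w in heading.split()]
    return [w for w in words if w and w not in STOP_WORDS]

def shared_terms(headings):
    """Terms that occur in more than one heading: candidate generic index terms."""
    counts = Counter()
    for heading in headings:
        counts.update(set(heading_terms(heading)))
    return sorted(term for term, n in counts.items() if n > 1)

headings = ["The Network Model", "The Relational Model", "Physical Data Organization"]
print(shared_terms(headings))  # ['model']
```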
In many documents, some subheadings inherit attributes from their
parent heading. For example, the heading ‘Hardware’ under ‘Site
Requirement’ means ‘the site requirement on hardware.’ This depen-
dence may continue on several levels. In some scientific documents,
headings repeat. Documents of a given type, such as requirements docu-
ments, may all have the same outline. Some company policies require
such conformity of outlines. Furthermore, the documents of a given soft-
ware life-cycle fall themselves within a kind of high-level outline whose
headings may show some relationship across documents (see Figure 6.4
Relations among Outlines). When one studies the nature of the
relationship between headings in an outline, one observes three kinds of
relations: structural, hierarchical and attributive (Mili et al., 1996).
Figure 6.3-Outline Analysis: By analyzing the hierarchy of the outline of a book
it may be possible to automatically break the book into domains.
Software Engineering Book, etc.
    Ch 1: Basic Concepts
        What is software?
        What is Software Engineering
    Ch 2: Software Specification
        Requirements Documents
        Modelling Requirements
    Ch 3: Software Design
        Top Down Design
        Bottom Up Design
    Ch 4: Software Implementation
        Design
        Testing
    Ch 5: Methodology
    etc.
Figure 6.4-Relations among Outlines: Each box represents a document of the
software life-cycle whose title is indicated in its upper left corner. The arrows show
how a heading of a document in one phase of the software life-cycle can be
related to a heading in another document of the same software life-cycle.
Domain Models
A high-level organization of an information space may be reflected in a
model of that space. Given that the information space addresses a partic-
ular topic or domain, the model could be called a topic model or a domain
model. A domain model should identify the objects, and the operations on
those objects, that are common to an application domain. The
relationships and constraints between the objects, and the properties or
attributes that developers are likely to use when searching for reusable
components, are equally important and must be made explicit. One part of
a domain model may be a classification. A classification groups together
like things. An enumerated classification scheme
(Ranganathan, 1937) assumes a universe of knowledge divided into suc-
cessively narrower classes that include all possible classes (see Figure 6.5
Enumerated Example). These are then arranged to display their hierar-
chical relationships. An example of this scheme is the Dewey Decimal
system.
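A minimal sketch of an enumerated scheme is a table in which every class has exactly one parent, so that each class sits at a single place in the hierarchy. The four entries below follow well-known Dewey Decimal classes with abbreviated captions; the table and the function name are illustrative only.

```python
# Each class has exactly one parent, so the scheme is a strict tree.
# Captions are abbreviated; the selection of classes is illustrative.
PARENT = {
    "500 Science": None,
    "510 Mathematics": "500 Science",
    "530 Physics": "500 Science",
    "516 Geometry": "510 Mathematics",
}

def path_to_root(term):
    """Return the chain of successively broader classes, narrowest first."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain

print(" < ".join(path_to_root("516 Geometry")))
# 516 Geometry < 510 Mathematics < 500 Science
```

Because the only attribute of a class is its parent, nothing but hierarchical position can be asked of such a scheme.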
Thesauri
The enumerated classification has little flexibility, as it is usually
represented as a strict hierarchy of terms in which no term has further
attributes and no term occurs in more than one place. A thesaurus
extends the enumerated classification by allowing a few other attributes
for each term, in addition to the attribute of hierarchical location. A
thesaurus may be presented in a hierarchical, ‘Table of Contents’ form or as
an alphabetical sorting. It includes preferred terms (descriptors) for
indexing, and non-preferred terms as lead-in terms (or synonyms) to cor-
responding preferred terms. The preferred term essentially labels a con-
cept. Two basic relationships between concepts are hierarchical (broader
than, narrower than) and associative relations. Several other attributes for
a thesaurus term can be defined, including dates of entry (see Figure 6.6
Thesaurus Terms Table).
A thesaurus supports organizing and finding documents in both
object- and document-oriented systems (though in itself it is object-ori-
ented). With a thesaurus one document can be indexed under several
terms. A user can broaden or restrict the results of his or her search by
asking the system to refer to the thesaurus. Thus the thesaurus provides
the retrieval system with some natural language ‘ability.’
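The following sketch suggests how a retrieval system might consult a thesaurus to map a lead-in term to its preferred term and then broaden or narrow a query by one step. The entries borrow the ‘Snakes’ example of Figure 6.6; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    preferred: str
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)   # non-preferred lead-in terms

THESAURUS = {
    "snakes": Term("snakes", broader=["reptiles"], narrower=["rattlesnakes"],
                   synonyms=["vipers"]),
    "reptiles": Term("reptiles", broader=["animals"], narrower=["snakes"]),
    "rattlesnakes": Term("rattlesnakes", broader=["snakes"]),
}

def preferred(word):
    """Map a lead-in term (synonym) to its preferred term; unknown words pass through."""
    word = word.lower()
    for term in THESAURUS.values():
        if word == term.preferred or word in term.synonyms:
            return term.preferred
    return word

def expand(word, direction):
    """Broaden or narrow a query by one step; direction is 'broader' or 'narrower'."""
    base = preferred(word)
    entry = THESAURUS.get(base)
    related = getattr(entry, direction) if entry else []
    return [base] + list(related)

print(expand("Vipers", "narrower"))  # ['snakes', 'rattlesnakes']
print(expand("snakes", "broader"))   # ['snakes', 'reptiles']
```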
Thesauri are normally part of a larger system and are best built with
regard to their function in that system. It is possible to build a thesaurus
first and then use it to index the documents to be added to the library. This
is described as a top-down approach. To do this requires an amount of
prior knowledge of what the library is likely to contain. Alternatively, in
the bottom-up approach the documents may be indexed using free terms
and then the thesaurus is constructed after accumulating a number of
these free terms. Consistency must be maintained in the terms used and
their structure as preferred and non-preferred terms. When the function
required of the thesaurus is retrieval, as it largely is in the software reuse
world, the best method for building the thesaurus is a mixture of the two.
The drawback of using thesauri is the effort needed to build and maintain
them.
Figure 6.6-Thesaurus Terms Table: This shows the terms associated with the
preferred term ‘Snakes.’ Top terms represent main classes in a classification
system, in this case the main class being ‘Animals.’ It is also possible to define a
relationship between prior designations for a concept and its current naming, in this
case ‘Serpents’ as an older naming for the concept of ‘Snakes.’ In software
documents this is useful for version following.
Attribute           Value
Preferred Term      Snakes
Definition
Date of Entry       June 3 1992
Top Term            Animals
Broader Term        Reptiles
Narrower Term       Rattlesnakes
Associated Terms    Worms
Synonym             Vipers
Prior Term          Serpents
Faceted Classification
A thesaurus basically provides a hierarchy of concepts with little other
information about each concept than its hierarchical position. To extend
the thesaurus, one may expand the representation of each concept in it. A
faceted classification is a kind of extended thesaurus. Faceted schemes
are easier to expand than enumerative schemes. They are more flexible,
more precise, and better suited for large, continuously expanding collec-
tions. A software component may be described by a set of {facet, facet
term} pairs:
Figure 6.7-Descriptions of Sort Routines: The facets for two different software
components, called Rigid Component and Flexible Component, are given. Note
that the ‘Flexible Component’ does not use the facet ‘object.’
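A faceted description can be pictured as a small table of facet and term pairs. In the sketch below the facet names and terms are assumptions chosen for illustration; only the fact that the flexible component omits the ‘object’ facet comes from the figure caption. The matching rule, under which an unspecified facet accepts any value, is likewise an assumption rather than a rule stated in the text.

```python
# Facet names and terms are illustrative assumptions, except that the caption
# of Figure 6.7 notes that the flexible component omits the 'object' facet.
rigid_component = {"function": "sort", "object": "integers", "language": "Ada"}
flexible_component = {"function": "sort", "language": "Ada"}

def matches(component, query):
    """A component matches when it contradicts no facet in the query.

    A facet that the component leaves unspecified is treated as 'any value'."""
    return all(component.get(facet, term) == term for facet, term in query.items())

library = {"Rigid Component": rigid_component, "Flexible Component": flexible_component}
query = {"function": "sort", "object": "strings"}
print([name for name, facets in library.items() if matches(facets, query)])
# ['Flexible Component']
```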
Code Organization
Reuse of code, usually in very informal ways, is almost as old as
programming itself. All high-level programming languages are in
themselves a kind of code reuse system. They provide a method of
manipulating the hardware at quite a high level, in which each instruction
reuses large amounts of very low-level code. Some aspects of operating
systems offer a similar function. Libraries of functions in high-level
languages continue this abstraction a level higher. Systems like the UNIX
pipe function (which is described later in this book) extend this
abstraction to include reuse of whole programs, typically themselves
written in a high-level language using subroutines. Newer languages, such
as object-oriented languages like C++, and newer language features, such
as generic functions and packages in Ada, provide powerful facilities for
the programmer to make abstractions.
Figure 6.8-Component Code: A sample piece of code to show how keywords
can be considered to exist in ordinary code.
PACKAGE int_queue (parameters...)
---- Description ----
AUTHOR: D. Chaplin
DATE: 28th May 1992
NAME: int_queue
COMPANY: NO Software
VERSION: 4.2
FUNCTION: A variable size queue for
          integers. queue is a first-in-
          first-out (or FIFO)
          data structure.
---- Classification ----
CATEGORIES: Data Storage,
            Abstract Data Type.
KEYWORDS: queue,
          first-in-first-out,
          data storage.
---- Storage Requirements ----
MEMORY ALLOCATION: as needed.
---- Dependencies ----
LIBRARIES: needs math.lib.
---- ** Definitions ** ----
end int_queue;
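A header of this kind can be read by a very small program. The sketch below assumes the ‘FIELD: value’ layout of Figure 6.8, with indented lines continuing the previous field; the function names are hypothetical and the header is abridged.

```python
import re

HEADER = """\
AUTHOR: D. Chaplin
NAME: int_queue
VERSION: 4.2
CATEGORIES: Data Storage,
            Abstract Data Type.
KEYWORDS: queue,
          first-in-first-out,
          data storage.
"""

# A field starts at the left margin with an upper-case name and a colon.
FIELD = re.compile(r"^([A-Z][A-Z ]*):\s*(.*)$")

def parse_header(text):
    """Collect FIELD: value pairs; indented lines continue the previous field."""
    fields, current = {}, None
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            current = match.group(1).strip()
            fields[current] = match.group(2).strip()
        elif current and line.strip():
            fields[current] += " " + line.strip()
    return fields

def index_terms(fields):
    """Split the KEYWORDS field into individual lower-cased index terms."""
    raw = fields.get("KEYWORDS", "").rstrip(".")
    return [term.strip().lower() for term in raw.split(",") if term.strip()]

print(index_terms(parse_header(HEADER)))
# ['queue', 'first-in-first-out', 'data storage']
```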
Code that is to be reused can vary in size from a small block of code,
of say a dozen lines, that performs some minor function, to a procedure
of thousands of lines that performs a complex operation. There are
problems and advantages to reusing components of either size. Smaller
components yield less per reuse, since their functionality is limited by
their size, but they are easier to combine and modify to achieve a goal
different from that originally intended, since they encapsulate a single function.
level this should lead to subclasses that contain program segments which
are different versions of the same program segment or functionally equiv-
alent program segments.
Frameworks
One extension to the object-oriented methodology is sometimes
described as frames or frameworks. Of course, the term framework has
many meanings in the world at large and may be used in generic ways in
software engineering also. However, for the purposes of software reuse,
a framework is defined as a reusable, semi-complete application that can
be specialized to produce custom applications (Fayad and Schmidt,
1997). In contrast to earlier object-oriented reuse techniques based on
class libraries, frameworks are targeted for particular business units or
application domains.
A framework enhances extensibility by providing explicit hook
methods that allow applications to extend its stable interfaces. Hook
methods systematically decouple the stable interfaces and behaviors of an
application domain from the variations required by instantiation of an
application in a particular context. The run-time architecture of a frame-
work is characterized by an inversion of control. This architecture enables
canonical application processing steps to be customized by event handler
objects that are invoked via the framework’s reactive dispatching mecha-
nism. The framework’s dispatcher reacts by invoking hook methods on
pre-registered handler objects, which perform application-specific pro-
cessing on the events. The framework determines which set of applica-
tion-specific methods to invoke—this is the inversion of control.
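The inversion of control can be made concrete with a toy example. In the sketch below, which is not drawn from any particular framework, the class Framework owns the processing loop and calls back into handler objects that the application has registered in advance; all class and method names are hypothetical.

```python
class EventHandler:
    """Stable interface that applications extend by overriding the hook method."""

    def handle_event(self, payload):
        raise NotImplementedError


class Framework:
    """A toy reactive dispatcher: the framework, not the application, owns the loop."""

    def __init__(self):
        self._handlers = {}

    def register(self, event_type, handler):
        """Pre-register a handler object for one type of event."""
        self._handlers.setdefault(event_type, []).append(handler)

    def run(self, events):
        """Dispatch each event to the hook methods of its registered handlers."""
        results = []
        for event_type, payload in events:
            for handler in self._handlers.get(event_type, []):
                results.append(handler.handle_event(payload))  # hook method call
        return results


class Shouter(EventHandler):
    """Application-specific processing supplied through the hook."""

    def handle_event(self, payload):
        return payload.upper()


framework = Framework()
framework.register("key", Shouter())
print(framework.run([("key", "a"), ("mouse", "click"), ("key", "b")]))  # ['A', 'B']
```

The application never calls its own handler; it only registers it, and the framework decides when to invoke it.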
Developers in certain domains have been using frameworks for many
years. The Microsoft Foundation Classes is a contemporary graphical
user interface framework that has become the de facto industry standard
for creating graphical applications on personal computer platforms. For
numerous complex domains, off-the-shelf frameworks do not exist, but
the developers in those domains have each developed their own
frameworks. Java is spreading new frameworks such as AWT and Beans.
Frameworks are components in the sense that vendors sell them as
products. But frameworks are more customizable than most components,
and have more complex interfaces. Classes in the object-oriented sense
have not realized much success in general as reuse devices. Frameworks
are harder to learn than individual components or classes. However, a
good framework lends itself more to reuse.
Epilogue
The way in which software information is organized in a library is cru-
cial. It constrains how the developer may use the library. Techniques for
the organization of components may be document-oriented or object-
oriented.
The document-oriented approach relies on the extant structure of doc-
uments and on free-word indexing. A document-oriented system is easy
to set up and offers many methods for organizing documents fully or, at
least partially, automatically. But document-oriented methods are weak in
many retrieval situations, since the information they provide about the
structure of documents and the structure into which documents are
organized is quite simple. Only basic retrieval methods, such as searching
for topics explicitly mentioned in a document, are possible. Automatically
extracted information is usually a subset of the information stored in the
document, and thus offers less information about the documents than the
documents themselves and only weak associations between them. This
can be improved by human indexers reading the documents and assign-
ing keywords to describe the contents of the document, but it is hard for
this indexing to be kept consistent over large databases of thousands of
documents, or with several indexers working simultaneously. This caus-
es many problems at the retrieval stage, which are described in the fol-
lowing chapter.
Object-oriented systems basically depend on some form of high-level
abstraction or model, commonly called a domain model. Object-oriented
techniques are labor-intensive for the set-up stage. However a rich struc-
ture is produced that can be interrogated in many ways and provides
many advantages to a searcher. A thesaurus provides a system with some
basic knowledge of a domain, thus allowing ‘intelligent’ searches which
find documents using terms related to desired terms rather than simply
failing. The emphasis can be placed on making retrieval easier, at a cost
of making organization more complex, time-consuming and expensive.
Alternatively, organization can be kept simple, at the cost of retrieval
that requires expertise and may prove unproductive.
Chapter 7
Retrieving
Retrieval Specification
As described earlier, the software life-cycle can be viewed as a refine-
ment process, which starts with the developer having only very informal
and abstract specifications of what the final product should look like.
Unfortunately, it may be important to know in the early stages of this
process which reusable components are applicable, since existing
reusable items can only be integrated if the system is designed to
accommodate them. However, at these early stages, the user of the
library might have difficulty expressing what precisely is required.
Despite the vagueness of the retrieval specification, the resulting search
of the library should lead to helpful information. The developer should be
Figure 7.3-Guaranteed Return: The user has requested text on ‘depth-first,’
and the thesaurus has been followed to find related text.
[Diagram: the request ‘Depth First Traversal’ finds no match in the IR system; the
thesaurus supplies the broader term ‘Graph Algorithms,’ and the item indexed
under that term is returned.]
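The behavior pictured in Figure 7.3 can be sketched as a search that climbs the thesaurus whenever a request finds nothing. The broader-term links and item titles below are hypothetical data, and the second level (‘algorithms’) is added only to show that the climb may take more than one step.

```python
# Hypothetical data: broader-term links and a small index of library items.
BROADER = {
    "depth first traversal": "graph algorithms",
    "graph algorithms": "algorithms",
}
INDEX = {
    "graph algorithms": ["Notes on graph algorithms"],
    "algorithms": ["Algorithms primer"],
}

def guaranteed_search(request):
    """Return (term_used, items); climb broader terms until something is found."""
    term = request.lower()
    seen = set()
    while term is not None and term not in seen:   # 'seen' guards against cycles
        seen.add(term)
        if INDEX.get(term):
            return term, INDEX[term]
        term = BROADER.get(term)
    return None, []

print(guaranteed_search("Depth First Traversal"))
# ('graph algorithms', ['Notes on graph algorithms'])
```

Reporting the term that was actually used lets the user see that the result is related material rather than an exact match.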
Document Retrieval
Document-oriented retrieval approaches include searching for key terms
expressed in a natural language, such as English. When a large library
containing many thousands of items is indexed by more than one person,
the terms adopted can become inconsistent, so that different keywords
are used to represent the same concept in two components. Problems may
also arise in searching the library for very common terms like ‘system’
that may have been used as an indexing term for very many unrelated
items. Careful control over the indexing terms can ease both problems.
The keyword form of Information Retrieval is widely used in
Japanese software houses, where reuse of software components is an inte-
grated activity in software development. The stored components are
indexed manually using keywords covering the technical or application-
oriented aspects of the component. The tools applied for retrieval are very
simple (Matsumoto, 1987). Much of the success enjoyed with this informal
method is due to specific Japanese social and working conditions rather
than to the method itself. The Japanese use training procedures that
encourage software reuse and standardized methods for software
description and development. In addition, most Japanese software houses
have very low staff turnover, with the result that informal contacts with
fellow employees allow useful components to be found without relying
solely on the retrieval system.
Free-text retrieval requires that the user knows, or is able to
determine, what free-text term would represent the document for which he
or she is looking. This approach will fail to find documents that are
relevant but use a different term, such as ‘ranking’ where the user
searches for ‘sorting’ (unless good thesaurus support is given, which,
because of the effort required to maintain thesauri, in some ways defeats
the advantages of using this method).
If the system is using the concept model described in the previous
chapter, then techniques for finding documents can use the attribute
values stored for each of the components. These attributes and their values
carry more information about a document than a simple list of keywords.
They give not only an indication of what subjects a document is dis-
cussing, but how those subjects are relevant to a topic (see Figure 7.4
Concept Model Links). This kind of linking among concept models is
especially important in situations where hundreds of documents are avail-
able and many thousands of links are possible.
Program Retrieval
Several methods for the retrieval of software documents have been
discussed above. Next, methods that can be used to retrieve existing code
are outlined. Some of these approaches are similar to those for retrieving
ordinary documents, but there are specific constraints to take into account
when retrieving program code.
The simplest technique for specifying a search is to use a small
piece of code that the developer would expect to find in the target routine,
such as an Ada statement or a list of commands that are expected to be in
the program. This is in many ways analogous to full-text searching for
code. The string may need to be specified exactly, or in a more
sophisticated system wildcards may be allowed; for example, printf(*%f*)
may find all programs that contain the command printf with a floating
point number, regardless of the other parameters used in each case. This
searching method makes it difficult for the library to find related
components without some automatic way of generalizing from such a precise
definition. It is more useful, particularly in the early stages of the
software development life-cycle, to use a less specific representation,
one which describes a component in terms of what it needs to be able to
do rather than how it does it.
Figure 7.4-Concept Model Links: This diagram shows three concept models. A
plain text system would link all three with equal weighting. The links can be seen
to be accurate as there is a logical ‘chain’ of topics from ‘Archivers’ to ‘Dynamic
Interfaces.’ Although compression is significant to the ‘MPEG II’ Model, it is a
sub-issue of dealing with motion video. In the ‘Archivers’ Model it is central. Similarly
the ‘MPEG II’ document, though relevant, is not as relevant to ‘Dynamic
Interfaces’ as a document about ‘Novel Interaction’ would be. This difference in
emphasis is important.
-- Available types are.
type box is private;
type item is private;
-- Available functions are.
procedure MOVE (item, box, box);
-- Move an item from one box to another.
procedure COPY (item, box, box);
-- Copy an item from one box to another.
procedure DELETE (item, box);
-- Delete an item from a box.
procedure CLEAR (item, box);
-- Clear an item in a box.
end an_ADT;
(2)
package another_ADT is
-- Available types are.
type box is private;
type item is private;
-- Available functions are.
procedure COPY (item, box, box);
-- Copy an item from one box to another.
procedure DELETE (item, box);
-- Delete an item from a box.
procedure CLEAR (item, box);
-- Clear an item in a box.
end another_ADT;
Figure 7.6-Second ADT: This specifies the minimum or family interface for the
two components.
package generic_type_ADT is
type box is private;
type item is private;
procedure COPY (item, box, box);
procedure DELETE (item, box);
procedure CLEAR (item, box);
end generic_type_ADT;
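The family interface of Figure 7.6 is simply what the two packages have in common. As a sketch, each interface can be held as a set of operation signatures and the family interface computed as their intersection; the representation and the function name are assumptions made for illustration.

```python
# Each interface is a set of (operation, parameter types) pairs from the two packages.
an_adt = {
    ("MOVE", ("item", "box", "box")),
    ("COPY", ("item", "box", "box")),
    ("DELETE", ("item", "box")),
    ("CLEAR", ("item", "box")),
}
another_adt = {
    ("COPY", ("item", "box", "box")),
    ("DELETE", ("item", "box")),
    ("CLEAR", ("item", "box")),
}

def family_interface(*interfaces):
    """Operations offered, with identical signatures, by every member of the family."""
    common = set(interfaces[0])
    for interface in interfaces[1:]:
        common &= set(interface)
    return common

print(sorted(name for name, _ in family_interface(an_adt, another_adt)))
# ['CLEAR', 'COPY', 'DELETE']
```

MOVE drops out because only the first package offers it, which matches the three procedures listed in the figure.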
Retrieval Systems
Many software document retrieval systems exist in the form of on-line
help systems. These systems typically lack a faceted classification or
domain model but do illustrate powerful features of document-oriented
systems. The UNIX and Andrew help systems are next described in terms
of their retrieval support.
UNIX man Command
Every major command on a UNIX system has an associated man page
which describes its function (Kernighan and Pike, 1984). These entries
vary in size between one and several pages and are all stored in a stan-
dard format to make retrieval of specific information from a document
easy (see Figure 7.7 Sample Man Page). Users may retrieve these ‘man’
pages in many ways depending on how man is invoked at the command
line:
Any term mentioned in the main title of a help file of a utility can be
used as a search term. There is also an option to search for a keyword in
a specific section of the man page, for example to search for printer set-
tings in the ‘External Influences’ section.
This is a very useful system for experienced users, but not for begin-
ners or people who are looking for a new command, since it demands that
the user knows quite specifically what they are looking for before they
can find it. If the on-line help does not contain an entry on a particular
topic, then the user is simply informed there is no manual entry corre-
sponding to that topic, and no help is given to enable the user to find a
related or equivalent program held in the system.
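The difference between an exact lookup and a title search can be seen in a few lines. The sketch below models the help files as a table of one-line titles; the entries are hypothetical and the functions only imitate the behavior described above, not the real man program.

```python
# Hypothetical one-line titles, in the style of the title line of a help file.
TITLES = {
    "lpr": "send files to a printer",
    "sort": "sort lines of text files",
    "grep": "print lines matching a pattern",
}

def man(topic):
    """Exact lookup: succeeds only if the user already knows the command name."""
    if topic in TITLES:
        return topic + " - " + TITLES[topic]
    return "No manual entry for " + topic

def keyword_search(word):
    """Title search: every entry whose name or title mentions the word."""
    word = word.lower()
    return sorted(name for name, title in TITLES.items()
                  if word in name.lower() or word in title.lower())

print(man("print"))             # No manual entry for print
print(keyword_search("print"))  # ['grep', 'lpr']
```

Even the title search depends on the user guessing a word that the author happened to use, so a request for ‘output’ would still return nothing here.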
The Andrew Help System
The Andrew help system is part of the Andrew Toolkit (Zanger, 1996).
The Andrew Toolkit provides a total environment for the integration of
diagrams, animations, raster images and other multimedia elements, the
sending and receipt of multimedia mail, and the easy creation of new
Andrew packages, seamlessly in one user environment built under X
Windows on UNIX. The Help System is designed to support the user
more comprehensively than the basic man command. The system is set up
to provide help on any of the programs that make up the Andrew Toolkit
or the underlying UNIX system. To get help on any topic in the system
the user first types the command “help,” which displays a window (see
Figure 7.8 Andrew Help Screen) listing, on its right side, the main
programs on which help is available. This list can be expanded to
list all the programs for which help can be found by selecting an option
from the menus.
Help can be found by using a specific word which describes some-
thing the user wants, such as ‘Bitmap,’ or a specific program, such as
‘ez.’ The system can also maintain a history of the help requests made by
the user (see Figure 7.9 Andrew Help Screen 2). This allows the user to
backtrack through the help screens they have accessed and thus follow-
up keywords or return to previously viewed information.
This system supports the user better than the standard man command,
since users choose the subject they require from a list of meaningful
options rather than trying to guess at what might be the correct term.
There are overview documents that tell the user what each available
topic is, rather than the user having to wade through a
series of entries to find out whether each matches its one-line
description. The system also incorporates manual pages defined to a strict
format, which is designed to allow the user to find specific sections
quickly without having to read the whole entry, and which contains a
section on related topics in the system. To view any of these related
topics, or if at any time a term in the text is not understood, the term
can be highlighted with the mouse and the system will attempt to find an
entry explaining it.
A similar program, called ‘xman,’ has been developed for the standard
UNIX system. It too lists all the available commands, but it provides
fewer features and is an optional addition to the standard ‘man’ system
rather than the de facto system, as ‘help’ is for Andrew.
Figure 7.8-Andrew Help Screen: This screen shows the large left-hand window
which displays the help page for the user’s chosen topic, here the start-up
contents of ‘help.’ On the right are two smaller windows showing document titles on
general aspects of the system in ‘Overviews’ and a list of the programs available
under Andrew in ‘Programs.’
Figure 7.9-Andrew Help Screen 2: This shows the history window in the bottom
right hand corner, showing the programs on which the user has sought help.
Web
The World Wide Web provides an interface to libraries like the world has
never seen previously. The web provides a massive collection of retrieval
tools that expedite finding information anywhere on the web. Features of
tools that were originally developed independently of the web and oper-
ated independently of it have become incorporated rapidly into features
of web systems. For instance, Gopher had much wider usage than the web
for some years. That’s no longer true and whatever can be done with
Gopher is now done through a web interface and typically more easily.
The WAIS information servers at one point competed with the web for
information access provision, but the word frequency tools used by WAIS
are now part of web retrieval tools.
Aside from these generic interfaces to web information, one can go
via the web to specific sites for retrieving software-related assets. For
instance, the United States Army maintains an active web site for the
“Army Reuse Center.” There one can not only learn much about the
Army’s views on reuse, but can also gain access to depositories of software
assets (see Figure 7.10 Army Center).
Figure 7.10-Army Center: This screen image shows the web-based tutorial to
the Army Reuse Center reusable software library (Army, 1997).
Monitoring Retrieval
If items retrieved from the library are separated into ‘relevant’ and ‘non-
relevant’ sets (see Figure 7.12 Retrieved Sets), then Recall and Precision
may be defined:
A good system is one which exhibits both high recall and high preci-
sion. A mathematical model of recall versus precision has been described
and can be used to quantify the trade-off in recall and precision (Gordon
and Kochen, 1989). In general most systems with high recall have low
precision, and most high precision systems have relatively low recall.
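The two measures can be computed directly from the retrieved and relevant sets. The following is a minimal sketch in Python; the function name and the sample component names are assumptions made for illustration, not part of any library system described here.

```python
def recall_and_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    retrieved: set of item identifiers returned by the library search
    relevant:  set of item identifiers that actually satisfy the need
    """
    hits = len(retrieved & relevant)  # items both retrieved and relevant
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical query: 4 items retrieved, 3 of them relevant,
# out of 6 relevant items held in the library.
retrieved = {"stack", "queue", "list", "parser"}
relevant = {"stack", "queue", "list", "deque", "heap", "tree"}
recall, precision = recall_and_precision(retrieved, relevant)
```

Here recall is 3/6 and precision is 3/4. Retrieving more items can only raise recall, but it usually lowers precision, which is the trade-off noted above.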
The size of a library relates to its value. The larger the stock of a
library, the more likely the library is to contain the desired component.
Retrieval effort then pays off more often, so developers are more likely
to use the library to find components. A larger library costs more to
build, but its use becomes more cost effective through the savings from
reusing components. These two pressures must be traded off against each
other. Policies can be defined for adding components to the library. The
library retrieval system could be set up to
monitor requests for components and produce statistics for the rate of
retrieval for each of the components in the library. This will show which
types of components are the most useful in the library and thus should be
updated or expanded. If the library system also produces statistics based
on components that are requested but do not exist in the library, the
library organizers could monitor which components should be added. If a
well organized method is used to search the library, the failed searches
could provide detailed specifications of needed components.
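A monitor of this kind needs little more than two counters, one for successful retrievals and one for failed requests. The sketch below assumes that requests arrive as component names; the class name, method names and component names are hypothetical.

```python
from collections import Counter


class RetrievalMonitor:
    """Count successful and failed requests against a component library."""

    def __init__(self, library):
        self.library = set(library)
        self.hits = Counter()    # retrievals per existing component
        self.misses = Counter()  # requests for components not in the library

    def request(self, name):
        """Record one request; return True if the component exists."""
        if name in self.library:
            self.hits[name] += 1
            return True
        self.misses[name] += 1
        return False

    def most_used(self, n=3):
        """Components worth updating or expanding."""
        return self.hits.most_common(n)

    def most_wanted(self, n=3):
        """Missing components worth adding to the library."""
        return self.misses.most_common(n)


monitor = RetrievalMonitor(["queue_adt", "bubble_sort"])
for name in ["queue_adt", "queue_adt", "hash_table",
             "bubble_sort", "hash_table", "hash_table"]:
    monitor.request(name)
```

After these six requests, `most_used` ranks `queue_adt` first and `most_wanted` reports `hash_table` as the leading candidate for addition.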
Figure 7.12-Retrieved Sets: The total retrieved items divide into
non-relevant items and relevant items.
Recall = (Number of items retrieved and relevant) / (Total relevant)
Precision = (Number of items retrieved and relevant) / (Total retrieved)
Epilogue
The concept of retrieving documents from a library is intrinsically linked
with that of storing the data in the library. Software documents commu-
nicate some model of the world, and for reuse of these concepts a devel-
oper needs to be able to access this model.
components they need. This tension has been a contributing factor to the
lack of acceptance of software reuse as a standard practice in software
development in the software industry.
The widespread use of the World Wide Web has changed many
things. The web does not resolve the problem of building thesauri. If
anything, the plethora of information on the web makes good organization
and thesauri both more important and more challenging.
However, the popularity of the web interface creates a certain uniformity
of expectation and use that supports the use of web libraries. Even simple
domain models and free text searching can be good enough when the
assets are good and people are comfortable with the interface.
The goal of software reuse is not simply to find program and document
components which might subsequently be reused; it is also to allow the
developer to modify and combine components and concepts to create new
software. In general, the components and concepts retrieved by the soft-
ware developer from the library cannot be directly reused without being
modified. They must be revised to fit with the target problem. This stage
may be called asset utilization or reorganization. How much work is
needed to reorganize an item depends upon many factors.
If the library contains program code for reuse, then unless the
components were designed from the outset to be reused in other projects and
in other domains, reorganizing a component may not be a simple
case of instantiating the existing general component for the current
problem in hand. Fundamental changes to a component may be necessary. For
example, to change the data-type or language, a redesign may be
required. Alternatively, the library may contain the top-level design doc-
uments abstracted from the brief of the prior project. These are general
documents and changing aspects of them is easier than with more specif-
ic documents, but the number of abstractional stages that are reused is
reduced.
The application in which the component was originally developed
will affect it in many ways, some perhaps extremely subtle. In a particu-
lar application it may be important for a component to be efficient in its
memory allocation, or to be fast and efficient in execution. This is just
one of many compromise decisions that may be encoded in a software
document, be it a code segment or a design document. There may be
many compromises made in the design of the software, dealing with every
aspect of the described component: usability versus functionality,
Retrieved Component Suitability
Even if a search for an item does not simply fail, the retrieved item may
not fit the expectations of the developer. The cause of the mismatch
will depend on the skills and methodologies of both the indexer and the
developer. Systems for retrieval might allow searches to account for the
quality of desired components, since items may be inefficient or poorly tested.
A code segment or design, although well written, may have
requirements for control over data structures or system resources that are
unreasonable in the context of the new target system. As a simplified example,
a retrieved item may be a fast sorting algorithm that requires two copies
of the data in order to operate. This may be undesirable, or
infeasible given storage restrictions on the target machine.
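The sorting example can be made concrete. The sketch below contrasts a merge sort, which is fast but builds a second copy of the data, with an insertion sort, which is slower but rearranges the data where it lies. Both routines are illustrative stand-ins chosen for this comparison, not components from any particular library.

```python
def merge_sort(items):
    """Fast, O(n log n), but builds new lists: needs room for a second copy."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def insertion_sort_in_place(items):
    """Slower, O(n^2), but rearranges the caller's list with no second copy."""
    for k in range(1, len(items)):
        value, pos = items[k], k - 1
        while pos >= 0 and items[pos] > value:
            items[pos + 1] = items[pos]
            pos -= 1
        items[pos + 1] = value


data = [5, 2, 9, 1]
copy_sorted = merge_sort(data)   # data is untouched; a new list is allocated
insertion_sort_in_place(data)    # data itself is now sorted
```

A developer on a machine with tight storage might have to accept the slower routine, even though the faster one was the better match for the query.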
The library may only store components expressed in one design
methodology or programming language (or both). However, if mixed
methodologies are used, a retrieved component may be
expressed in a specification or programming language with which the
developer is unfamiliar. In that case the developer cannot
rely on understanding the component to the required degree. Only in
extreme circumstances would it be considered worthwhile for the
developer to learn a new software engineering technique or programming
language in order to comprehend the component. A similar problem would
exist if the library contained specifications which were a mixture of
data-flow oriented and control-flow oriented designs, as these are
fundamentally different methodologies and cannot easily be reconciled.
It may be very clear to a developer that a component exactly fits the
specification, or is totally unsuitable. However, in most cases the
usefulness of the retrieved component or components lies somewhere between
these two extremes. It would be ideal if some automatic system could be
designed to quantify how close a match exists between the retrieved
component or components and the desired one. This is a non-trivial exercise
and could only be achieved (if at all) by complex and formal
specification of both the retrieved components and the target component, which
would in many cases be undesirable.
1. Those that describe the problem domain and model the solution,
2. Constraints imposed by the solution space, such as power and type
of target machine and programming language used, and
3. Stand alone decisions that have little or no effect on the rest of the
program.
Decisions of Type 1 are the most important ones in the reuse process;
they are affected by Type 2 decisions. Type 3 decisions are in general irrelevant.
Some design decisions are explicitly documented, though usually these
will only be the fundamental decisions that affected large areas of the
component’s structure. Most decisions, and in many cases all
decisions, go undocumented, and thus the reason that the decision was made
and to a large extent the specific result of that decision are lost. The only
representation of a decision is then the parts of the program that it
influenced. These parts may be scattered across several places
in the program and influenced by other factors and decisions.
So it is important to be able to find these hidden global and local
decisions. They are most detectable at different levels of abstraction. The
major decisions are best viewed at a higher level of abstraction than a
representation suitable for finding the localized decisions. Reverse
engineering is helpful in this process; it is the process of extracting a higher level
description of a component from a lower level one, such as pseudo-code
from a C code program (Hausler et al, 1990). Performing this task reveals
decisions that are obscured by the perhaps pages of code that are required
to implement them (Rugaber et al, 1990).
Document Reorganizing
Reorganization of non-code documents forms an important part of
utilizing reusable components, since the software development cycle
produces critical documents as well as the next abstraction of the
algorithm. Some of this documentation will be created anew, but for the
reused components attempts should be made to generate the new documents
from their existing documentation (see Figure 8.1 Document Reorganization).
Figure 8.1-Document Reorganization: Original documents are processed
using their outlines and thesauri to form new outlines and thus new documents.
Figure 8.2-Document Reorganization Using a Thesaurus: The documents
are indexed using common terms, shown in the large box above. Those
documents that share indexing terms can be considered to cover some common
ground and thus are candidates for integration.
documents have been indexed using some or all of the same terms, then
it is not unreasonable to assume that those two sections cover related top-
ics and thus could be usefully compared or combined (see Figure 8.2
“Document Reorganization Using a Thesaurus”). If not enough links are
generated by the raw indexing material, this method of concatenation can
be extended by using a thesaurus to broaden the range of the linked terms.
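The linking of sections through shared index terms, with optional broadening through a thesaurus, can be sketched as follows. The section names, index terms and thesaurus entries are invented for illustration, and the thesaurus is assumed to be a simple mapping from a term to its related terms.

```python
from itertools import combinations


def expand(terms, thesaurus):
    """Add the thesaurus entries for each index term."""
    expanded = set(terms)
    for term in terms:
        expanded |= thesaurus.get(term, set())
    return expanded


def candidate_links(index, thesaurus=None):
    """Return pairs of sections sharing at least one (expanded) index term."""
    thesaurus = thesaurus or {}
    terms = {doc: expand(t, thesaurus) for doc, t in index.items()}
    return {
        (a, b): terms[a] & terms[b]
        for a, b in combinations(sorted(terms), 2)
        if terms[a] & terms[b]
    }


index = {
    "design.3": {"queue", "buffer"},
    "manual.7": {"fifo"},
    "tests.2": {"buffer"},
}
thesaurus = {"queue": {"fifo"}, "fifo": {"queue"}}
raw = candidate_links(index)               # links from raw index terms only
broad = candidate_links(index, thesaurus)  # links after thesaurus expansion
```

With the raw terms only `design.3` and `tests.2` are linked; once the thesaurus relates ‘queue’ and ‘fifo’, `design.3` and `manual.7` also become candidates for comparison or combination.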
Program Reorganizing
Retrieved software components may be modified in various ways (Mili
et al, 1994):
-- Operators
procedure add_item(to_queue : queue; new_item : item);
-- Conditions
function queue_empty(test_queue : queue) return boolean;
private
-- Code to implement the procedures and functions.
end queue_adt;
Every component has the same input and output mechanism, and the
output of one can be joined to the input of another using a
simple operator called a ‘pipe.’ Joining two commands together is
simply a case of placing the pipe symbol ‘|’ between the two commands
(see Figure 8.5 UNIX Pipe). This means that if a program is required to,
for example, open a file, process it (for example to add certain
components), and print the result, then instead of a new program being written, or
even a sequence of code modules being joined together and compiled, the
three programs are simply linked together at the command line (see
Figure 8.6 Inter-Program Communication Using Pipe). This forms what
is called a ‘pipeline.’
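The same joining of programs can be driven from a program rather than typed at the command line. The sketch below uses Python's standard `subprocess` module to connect the Std-Out of each process to the Std-In of the next. The three one-line programs are hypothetical stand-ins for utilities such as ‘cat’ and ‘sort’, written in Python only so that the example does not depend on a particular UNIX installation.

```python
import subprocess
import sys

# Three tiny stand-alone programs. The first only writes to Std-Out;
# the other two read Std-In and write Std-Out.
EMIT = "print('pear'); print('apple'); print('fig')"
SORT = "import sys; sys.stdout.writelines(sorted(sys.stdin))"
COUNT = "import sys; print(sum(1 for _ in sys.stdin))"


def run_pipeline(*programs):
    """Run the programs like 'p1 | p2 | p3' and return the final output."""
    previous_out = None
    processes = []
    for source in programs:
        proc = subprocess.Popen(
            [sys.executable, "-c", source],
            stdin=previous_out,        # read from the previous stage
            stdout=subprocess.PIPE,    # write into a new pipe
        )
        if previous_out is not None:
            previous_out.close()       # the child now owns this end of the pipe
        previous_out = proc.stdout
        processes.append(proc)
    output = previous_out.read().decode()
    for proc in processes:
        proc.wait()
    return output


sorted_lines = run_pipeline(EMIT, SORT).split()
line_count = run_pipeline(EMIT, SORT, COUNT).strip()
```

None of the three programs knows about the others; adding the counting stage changes only the pipeline, not the programs in it.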
Figure 8.4-Template: An example of a code template, showing the parameters
in bold that should or could be changed by the reusing developer.
Function BubbleSort (arrays)
+ Template written by Andy Gray.
+ Declare Parameters:
Begin
for counter_var = first_input to length_input - 1
+ Check for counter_var element greater than next
if
then
+ Swap the items
endif
end
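One way to read the template of Figure 8.4 is as a routine whose comparison is left open for the reusing developer to supply. The following Python sketch is an assumed instantiation, not the author's code: the parameter name `out_of_order` is hypothetical, and the outer loop over passes is added because the template shows only the inner loop.

```python
import operator


def bubble_sort(items, out_of_order=operator.gt):
    """Bubble sort with the comparison left as a template parameter."""
    items = list(items)  # work on a copy of the input
    for last in range(len(items) - 1, 0, -1):
        for counter in range(last):
            # Check for the counter element out of order with the next
            if out_of_order(items[counter], items[counter + 1]):
                # Swap the items
                items[counter], items[counter + 1] = (
                    items[counter + 1], items[counter])
    return items


ascending = bubble_sort([3, 1, 2])
descending = bubble_sort([3, 1, 2], out_of_order=operator.lt)
```

Changing only the comparison parameter reuses the whole routine for a descending sort.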
Figure 8.5-UNIX Pipe: The process on the left takes its input from Std-In and
produces its output on Std-Out. The ‘pipe’ connects the Std-Out from Process
One and the Std-In from Process Two. Thus the output from Process One
becomes the input to Process Two.
Figure 8.6-Inter-Program Communication Using Pipe: ‘cat’ opens the file and
its output is fed (via a pipe) into ‘new_process’ input and then ‘new_process’
output is fed (again via a pipe) into ‘lp’s input.
There is no problem with version numbers, since the programs are not
actually compiled together and the versions used will automatically be the
latest ones. UNIX provides several of these basic programs for opening
files, printing them, sorting them, searching them, word counting, and so
forth. Since the code used to implement these routines is very basic C
code, and all UNIX systems use C, they should be very portable from sys-
tem to system.
A very useful application of this pipe technique is to use programs as
what are called filters. These are extensively used in many systems. For
example, say that the totals program ‘new_process’ above will only work
with floating point numbers, due to the way in which it was written. It
would be possible to modify the source code to create another program to
add up integers, but this may be bad practice. For example, if the way in
which the totals are calculated is changed, then two programs, one for floats
and one for integers, must be modified, not one. In UNIX one solution is to write
a filter, another program which goes into the pipeline, that takes integers
and converts them to floats. Thus only the pipeline is changed and
‘new_process’ is unmodified.
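The filter idea can be sketched with two small routines standing in for the programs in the pipeline. Both are assumptions made for illustration: `total_floats` plays the part of ‘new_process’ and deliberately rejects integers, while `int_to_float_filter` is the added filter.

```python
def total_floats(lines):
    """Stand-in for 'new_process': only accepts numbers written as floats."""
    total = 0.0
    for line in lines:
        if "." not in line:
            raise ValueError(f"not a floating point number: {line!r}")
        total += float(line)
    return total


def int_to_float_filter(lines):
    """The filter: rewrite each integer as a float, pass floats through."""
    for line in lines:
        text = line.strip()
        yield text if "." in text else f"{int(text)}.0"


values = ["3", "4.5", "10"]
result = total_floats(int_to_float_filter(values))
```

The totals routine is reused unchanged; only the chain feeding it is different, so a later change to the way totals are calculated is made in one place.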
This filter technique can be and is used to solve common problems,
such as that of picture display and conversion. Many different file formats
exist for storing raster picture information. Many micro-computer
platforms have their own most prevalent image formats, such as TIFF and
PCX. Other formats are more common on workstations, such as
Encapsulated PostScript rasters and X Bitmaps. More recently standard
cross-platform formats like GIF and JPEG have become widely used.
These formats are all incompatible with one another. A program for
displaying GIFs cannot display a PCX file and an X Bitmap viewer cannot
display a GIF.
A user may have a display program for every format he or she uses,
for example an X Bitmap display program, a TIFF display program and so
forth. Each program must handle different screen displays and is usually
quite complex. An alternative approach is to have a set of conversion
programs. But converting files one at a time is tedious. It would be better to
have a single display program that could handle all formats. However,
building such a large system is a huge task. With the UNIX pipe another
alternative is available. The user has one complex display program that
can display, for example, Portable PixMap (PPM) files, and has a set of
conversion programs to convert JPEG to PPM, X Bitmap to PPM, and so forth.
Pipes filter the input file to create a generic image format and send that to
the display program (see Figure 8.7 Image Conversion).
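The arrangement of many small converters feeding one display program can be sketched in a few lines. To keep the example self-contained it uses the plain-text PBM (P1) and PGM (P2) formats as stand-ins for JPEG and X Bitmap, and it assumes whitespace-separated files with no comment lines. The ‘display’ is reduced to printing characters; all function names are hypothetical.

```python
def _tokens(text, magic):
    """Split a plain Netpbm file and check its magic number."""
    tokens = text.split()
    if not tokens or tokens[0] != magic:
        raise ValueError(f"expected a {magic} file")
    return tokens[1:]


def pbm_to_ppm(text):
    """Convert plain PBM (P1, where 1 means black) to plain PPM (P3)."""
    width, height, *bits = _tokens(text, "P1")
    pixels = ["0 0 0" if bit == "1" else "255 255 255" for bit in bits]
    return f"P3\n{width} {height}\n255\n" + "\n".join(pixels) + "\n"


def pgm_to_ppm(text):
    """Convert plain PGM (P2, grey levels) to plain PPM (P3)."""
    width, height, maxval, *greys = _tokens(text, "P2")
    pixels = [f"{g} {g} {g}" for g in greys]
    return f"P3\n{width} {height}\n{maxval}\n" + "\n".join(pixels) + "\n"


# One converter per input format; PPM itself needs no conversion.
CONVERTERS = {"pbm": pbm_to_ppm, "pgm": pgm_to_ppm, "ppm": lambda text: text}


def display_ppm(text):
    """The single display program; it understands PPM only."""
    width, height, maxval, *samples = (int(t) for t in _tokens(text, "P3"))
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            start = 3 * (y * width + x)
            brightness = sum(samples[start:start + 3]) / 3
            row += "#" if brightness < maxval / 2 else "."
        rows.append(row)
    return "\n".join(rows)


def show(text, fmt):
    """Filter the input into PPM, then hand it to the display program."""
    return display_ppm(CONVERTERS[fmt](text))


picture = show("P1\n2 2\n1 0\n0 1\n", "pbm")
```

Supporting a further format means adding one converter to the table; the display routine, which in a real system holds the complex code, is not touched.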
The UNIX pipe mechanism is easy to use due to its simplicity, but
this is also a problem. It is often important to pass many different
parameters in different forms, such as complex, abstract data structures like
arrays, stacks or queues, for which the above is not an appropriate model.
It is only suited to simple flows of information. The handling of complex
Figure 8.7-Image Conversion: The original image format is filtered by a
relevant program to produce a PPM or Portable PixMap. This is a very simple
‘lowest common denominator’ file format, which is easy to handle. Then this file is
displayed via a pipe by another program which can only display PPMs, which
performs all the complex display handling. A large number of programs are
available to perform functions on PPMs, such as to double the size.
Code Generators
An alternative method of creating code efficiently is to generate it auto-
matically from formal requirements or design. This code generation tech-
nique is not strictly speaking software reuse. On the other hand, it is
another use of the specification and bypasses the organization and
retrieval stages of the reuse cycle. The code generation approach may
shorten the waterfall life-cycle by removing design, implementation, and
testing from the software production process. Developers specify the
desired program in some high-level language, with a possible mix of
declarative and procedural constructs. The generated programs are
usually correct by construction, thus alleviating the need for testing.
In some views of software engineering, software development is seen
as simply a series of transformations (Balzer, 1981; Rich and Waters,
1988; Green and Westfold, 1982; Smith et al, 1985) from formal specifi-
cations to the finished program (see Figure 8.8 Program Transformation).
Code generating systems are designed such that, given a program
specification written using some formal method, like Z-Schema or VDM, that
specification is transformed, via a series of intermediate representations,
to either an executable form or a form that can be readily made
executable. Code generation systems have four main advantages:
[Figure 8.8 Program Transformation: Formal Specification, Transform One,
Transform Two, Transform Three, Program Code]
Change Log
Figure 8.9-Program Transformation Example: Here can be seen part of a
formal specification (in a textual form, for simplicity). This specifies that at that
particular point the user may wish to load an example file. This can be transformed
or expanded to show the tasks involved in the two intermediate representations
(I.R. 1 and I.R. 2) choosing a file and opening the file. This in turn can be
expanded to list the steps involved in each of these stages. This produces an executable
specification, which could be compiled or perhaps transformed (via T’form 1 and
T’form 2) into an executable language like C or Pascal.
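The expansion described in Figure 8.9 can be imitated by a small rewriting routine. The sketch below is only an illustration of stepwise transformation, not a real code generator: the rule table is invented, steps without a rule are treated as executable primitives, and the rules are assumed to contain no cycles.

```python
# Hypothetical transformation rules: each abstract step expands into
# more concrete steps. Steps with no rule are treated as primitives.
RULES = {
    "load example file": ["choose file", "open file"],
    "choose file": ["list example files", "read user selection"],
    "open file": ["check file exists", "read file contents"],
}


def transform(steps):
    """Apply one transformation pass: expand every step that has a rule."""
    result = []
    for step in steps:
        result.extend(RULES.get(step, [step]))
    return result


def generate(specification):
    """Repeat passes until nothing changes; return every intermediate form."""
    forms = [list(specification)]
    while True:
        next_form = transform(forms[-1])
        if next_form == forms[-1]:
            return forms
        forms.append(next_form)


forms = generate(["load example file"])
```

The list `forms` holds the original specification followed by each intermediate representation, ending with a form in which every step is a primitive.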
Testing after Reuse
In the vast majority of cases software will not be reused in the same state
in which it was retrieved from the library; instead it will be modified or
re-written. These changes mean that any validation or testing techniques
used to check the original program are no longer guaranteed to be
applicable. The same is true of programs that are modified as part of the
normal software maintenance process, and it is for this reason that
the selective revalidation of software has been most extensively
researched in the past. This technique involves designing testing
strategies that, instead of exhaustively re-testing a program which had already
been fully tested prior to being modified, simply test the modified
sections. This is difficult, as the interactions between perhaps hundreds
of components are extremely complex.
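A simple form of selective revalidation can be sketched under two assumptions: that the dependencies between components are known, and that each test records which components it exercises. The component names, test names and data structures below are hypothetical.

```python
def affected_components(modified, depends_on):
    """Modified components plus everything that uses them, directly or not."""
    affected = set(modified)
    changed = True
    while changed:
        changed = False
        for component, used in depends_on.items():
            if component not in affected and affected & used:
                affected.add(component)
                changed = True
    return affected


def select_tests(modified, depends_on, covers):
    """Pick only the tests that exercise an affected component."""
    affected = affected_components(modified, depends_on)
    return sorted(t for t, comps in covers.items() if comps & affected)


# Which components each component uses.
depends_on = {
    "report": {"totals"},
    "totals": {"parser"},
    "parser": set(),
    "logger": set(),
}
# Which components each test exercises.
covers = {
    "test_parser": {"parser"},
    "test_totals": {"totals"},
    "test_report": {"report"},
    "test_logger": {"logger"},
}
to_rerun = select_tests({"totals"}, depends_on, covers)
```

Modifying `totals` selects the tests for `totals` and for `report`, which depends on it, and leaves the others alone. The difficulty noted above lies in obtaining accurate dependency and coverage information for hundreds of interacting components.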
However, in a reuse situation this revalidation is more complex due to
the more subtle ways in which the original testing procedures can be
invalidated. For example, the original testing can be made invalid when the
component is reused in a situation where the data the routine is
required to act upon differs in emphasis and range of values.
Reuse provides special challenges to a developer who reaches the
testing stage. Testing is easier to do if the developer of a piece of software
is at hand, since that individual will know that piece of software inti-
mately and will have a good idea of what they think are difficult parts of
the solution that the program provides. In many ways of course it could
be argued that equally important is to use an impartial tester who does not
have a large set of preconceived ideas about how the routine is written
and what special cases and so forth are handled in the code. However, in
a reuse scenario the choice of using either or both will not exist in many
cases, since the original developer of a reused component might have no
connection at all with the current team. The program may come with test
data, but if the program has been changed, then this test data may be
inappropriate. The programmer or analyst who has modified the component
will obviously have gained an understanding of the component, but this
will not usually be as deep as that held by its original author.
Whilst talking about software testing it is appropriate to talk about the
quality of software. All through this book it has been put forward that the
Figure 8.10-Quality Review: From the original specification the developer
creates a software package following the guidelines. The component is checked for
quality by the reviewer comparing it with the guidelines and a review of how well
it meets the criteria is written.
Epilogue
Precise methods for tailoring or adapting software or other assets are not
generally useful. Instead the reorganization of assets remains largely an
art. The analogy to reproduction with change in biology is partially valid.
The organisms do not know exactly how to modify their genes, but use
some broad heuristics like the cross-over operator and hope for the best.
Software reuse is a little like this. The problems relate to the complexity
of the assets to be reused and the complexity of the target, new products.
One recent experience highlights these difficulties. A team attempted
to reuse three standard pieces (Garlan et al, 1995):
• an object-oriented database,
• a toolkit for constructing graphical user interfaces, and
• an event-based tool-integration mechanism.
• excessive code,
• poor performance,
• the need to modify external packages, and
• the need to reinvent existing functions.
research and tools. There are several reasons for this. The first and
foremost reason is that reorganization cannot occur until there has been
organization and retrieval and those two steps remain themselves the subject
of much debate and research.
The organization and retrieval stages are based on much long term
work in the field of library science. The reorganization stage enjoys no
such parallel field. Reorganization is the most complex and difficult to
perform. It will be no easier to automate this stage of the reuse process
than to write programs which can themselves write programs, adapting to
problems along the way.
The reorganizing stage raises many questions:
This section contains three chapters. The first describes reuse tools, while
the second presents case studies in the management of reuse at compa-
nies. The third chapter is about systems for multimedia courseware reuse.
Chapter 9
Software Reuse Tools
Within this chapter several prototype systems are examined which
support document and object-oriented software reuse. First software
engineering tools are reviewed, and then two prototype reuse systems, called
Practitioner and SoftClass. Finally a user interface generator is described.
CASE
Computer Aided Software Engineering (CASE) may increase productivi-
ty in software development. The productivity gained by using CASE
comes from the following areas:
1. Operating System,
2. Database System,
3. Object Management,
4. Public Tools Interface, and
5. User Interface.
Figure 9.1-Case Tools: The database supports the wider functions taking place
in the development of software.
Hypertext CASE
Many interconnections exist among the components of the software life-
cycle but these interconnections are difficult and time-consuming to
maintain in paper forms. Hypertext makes it practical to connect all these
pieces together automatically and dynamically (Biggerstaff, 1989).
Furthermore, hypertext supports information reuse. For example, when a
paragraph about a component’s design is used as a comment in the pro-
gram documentation and as a paragraph in both the user and design doc-
umentation, in a conventional system this means creating and maintain-
ing multiple copies of the same information. In hypertext, this configura-
tion can be implemented with appropriate links from all the occurrences
to a master copy of the information node. This gives many advantages,
not the least being that any modifications made to the section are auto-
matically reflected in all its occurrences. Also the relationship between
such documents as requirements and code can be traced.
The Hypertext Abstract Machine (HAM) from Tektronix is a gener-
al-purpose hypertext storage system which can be used as a base engine
for other hypertext systems and in CASE systems (Campbell and
Goodman, 1988). HAM stores its database in a centralized file server.
The storage model is based on five objects: graphs, contexts, nodes, links,
and attributes. A graph is the highest-level object which, in turn, contains
contexts. Each context has one parent context and zero or more child con-
texts. Contexts contain nodes and links, while attributes can be attached
to contexts, nodes, or links. HAM is designed to work in a networked
environment.
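The five kinds of object in the storage model can be pictured as a handful of record types. The sketch below is not HAM's interface, which is not described here; it is an assumed in-memory rendering of graphs, contexts, nodes, links and attributes, and it lets the top-level context have no parent as a simplification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    content: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Link:
    source: Node
    target: Node
    attributes: dict = field(default_factory=dict)


@dataclass
class Context:
    name: str
    # parent is excluded from repr and comparison to avoid infinite recursion
    parent: Optional["Context"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list)
    nodes: list = field(default_factory=list)
    links: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)

    def add_child(self, name):
        """Create a child context beneath this one."""
        child = Context(name, parent=self)
        self.children.append(child)
        return child


@dataclass
class Graph:
    """The highest-level object; it holds the tree of contexts."""
    root: Context = field(default_factory=lambda: Context("root"))


graph = Graph()
project = graph.root.add_child("project")
requirement = Node("The system shall queue print jobs.")
design = Node("Print jobs are held in a FIFO queue.")
project.nodes += [requirement, design]
project.links.append(Link(requirement, design, {"relatesTo": "leadsTo"}))
```

Attributes are plain dictionaries here, so they can be attached to contexts, nodes or links alike.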
Tektronix has developed a CASE system called Neptune which uses
HAM and is extensible (Delisle and Schwartz, 1986). Neptune holds all
the project components: requirements, design, source code, test data and
results, and documentation. In Neptune a link or node can have any
number of attribute/value pairs. The attribute ‘projectComponent’ can have
any value from the set of project components: requirements, design, source
code, tests or documentation. The attribute ‘relatesTo’ is applied to links and
can have any value from the set ‘leadsTo,’ ‘comments,’ ‘refersTo,’
‘callsProcedure,’ ‘followsFrom,’ ‘implements,’ or ‘isDefinedBy.’ For example,
a node with ‘projectComponent’ value of ‘requirements’ would have a
‘relatesTo’ value of ‘leadsTo’ with the node whose ‘projectComponent’ was
‘design’ (see Figure 9.2 Neptune).
A node may contain any amount or type of information. A link is not
restricted to pointing to an entire node but can point to any point within a
node. Contexts are defined by grouping nodes and links with certain val-
ues. For instance, nodes with the ‘projectComponent’ value of ‘code’ are
implicitly grouped into a context and each node gets an attribute called
‘System.’ The values of ‘System’ can be UNIX or VMS. If a query is made for
the node predicate of ‘System=VMS,’ then only those nodes whose source
code is applicable to Digital Equipment Corporation’s VAX/VMS
operating system are returned.
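A query by node predicate amounts to filtering nodes on their attribute/value pairs. The sketch below shows the idea with nodes held as dictionaries; the node names and the `query` function are hypothetical and do not reproduce Neptune's own query interface.

```python
def query(nodes, **predicate):
    """Return the nodes whose attributes match every attribute=value pair."""
    return [
        node for node in nodes
        if all(node["attributes"].get(key) == value
               for key, value in predicate.items())
    ]


nodes = [
    {"name": "spool.c",
     "attributes": {"projectComponent": "code", "System": "UNIX"}},
    {"name": "spool_vms.c",
     "attributes": {"projectComponent": "code", "System": "VMS"}},
    {"name": "spool.req",
     "attributes": {"projectComponent": "requirements"}},
]
vms_code = query(nodes, projectComponent="code", System="VMS")
```

The first predicate selects the implicit ‘code’ context and the second narrows it to the VMS nodes.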
Figure 9.2-Neptune: Showing two nodes related by one attribute, as described
in the text.
With Neptune one can copy a subset of nodes and links from one
context into another context. Contexts can be used to define a workspace and
partition a project into local and global workspaces. A local workspace
lets a developer abstract a subset of nodes and links from the global
workspace and place them in a workspace where he or she can make local
modifications, test these modifications against the rest of the project, and
when satisfied merge the changes back into the global workspace.
Practitioner
Practitioner was a five-year project funded by the European Commission
and comprising teams in Germany, the United Kingdom and Denmark.
The ultimate goal was the development of a support system for
developers involved in the pragmatic reuse of software concepts. This system was
realized as a set of prototypes, called PRESS (Practitioner Reuse Support
System). Practitioner was concerned with the reuse of software concepts
from the ‘ideas’ embodied in requirements documents through to code.
PRESS was designed not to be bound to a specific programming
language but to emphasize the description of software concepts. Thus it tried
to bridge the gap between:
Figure 9.3-Level One Press: Here the three main components of the system
can be seen, in conjunction with their interactions with the user-interface.
PRESSTO
PRESSTO is the simplest of the Practitioner tools; it is designed
specifically to deal with plain text, with some thesaurus support. PRESSTO
supports searching across document collections using words taken from a
word index by the user. PRESSTO is very much based on features in the
UNIX operating system and does not use a database system. PRESSTO
instead simply accesses files containing reusable components or concepts
as they are stored in directories on the host computer (see Figure 9.4
ToPD Bubble).
Figure 9.5-PRESSTO Indexer: Showing a document that has been loaded to be
indexed and the functions that are available to act on it.
The key concept around which PRESSTO is built is the notion of the
occurrence of a token (a term important to the developer) in a file. The
main question that is being asked by a developer of the software is ‘Which
files contain which tokens?’ To set up PRESSTO two files have to be created:
an ‘ls-file,’ which lists all the relevant files containing reusable
information, and a ‘talc-file,’ which lists all the relevant indexing words in a
natural language (such as English or German). PRESSTO can then be used
to extract all user defined symbols from programs written in a
programming language such as C or to extract only those terms that appear in the
talc and a document. The activation of the indexing function is controlled
from a menu-based interface (see Figure 9.5 PRESSTO Indexer).
The retrieval function of the PRESSTO tool involves two modes, ‘File
Mode’ and ‘Index Term Mode’ (see Figure 9.6 PRESSTO Retrieval). In
Index Term Mode, clicking on an index term, for example ‘giraffe,’ causes
all files that contain giraffe as an indexing term to be highlighted in the
file list. Clicking on a file name then causes that file to be displayed in
the text window at the bottom of the screen. File Mode is similar in
principle but works in the opposite direction: clicking on a file name
causes all keywords contained in that file to be highlighted in the terms
list.
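The two retrieval modes are two directions of lookup over the same term-to-file index. The sketch below illustrates this in Python; the function names and the sample index are assumptions made for the example and are not taken from PRESSTO.

```python
def files_for_term(index, term):
    """Index Term Mode: files to highlight when a term is clicked."""
    return sorted(index.get(term, set()))

def terms_for_file(index, file_name):
    """File Mode: terms to highlight when a file name is clicked."""
    return sorted(term for term, names in index.items() if file_name in names)

# Hypothetical index: term -> set of files in which the term occurs.
index = {
    "giraffe": {"zoo.c", "notes.txt"},
    "zebra": {"notes.txt"},
}
```

Calling `files_for_term(index, "giraffe")` gives the files to highlight in the file list, while `terms_for_file(index, "notes.txt")` gives the terms to highlight in the terms list.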
PRESSTIGE
PRESSTIGE provides methods and tools for building a concept model
called a ‘questionnaire,’ and provides techniques for retrieving
information relating to software concepts. One method used to provide a
structured form for entering information about software is to create a
concept model. One of the main purposes of the concept model is to guide,
in a structured way, the analysis of documents being added and the
extraction of the information that will be needed to reuse the component
later. A concept model should ideally contain all the information that
may be useful to a potential reuser (Albrechtsen, 1990). It may be
possible to derive some aspects of the concept model automatically using
facilities offered by software engineering tools. The concept model
should store information about itself in addition to the information it
stores about the component it describes. It should say who created the
concept model, whether it is tied to a specific project, and so on.
PRESSTIGE offers a set of supporting functions for three tasks:
Figure 9.6-PRESSTO Retrieval: The PRESSTO search tool interface. More
power can be given by using the buttons in the middle of the screen. If two terms
are selected, then clicking on ‘OR’ will list all files that contain either keyword,
clicking on ‘AND’ would display all files that contain both. The ‘NOT’ button is
designed to allow the developer to search for files that do not contain a particular
term or set of terms.
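The ‘OR,’ ‘AND,’ and ‘NOT’ buttons described in the caption of Figure 9.6 correspond to set operations on the files recorded for each term. The Python sketch below shows one way to express them; the function names and sample data are assumptions for illustration only.

```python
def search_or(index, a, b):
    """Files that contain either term."""
    return index.get(a, set()) | index.get(b, set())

def search_and(index, a, b):
    """Files that contain both terms."""
    return index.get(a, set()) & index.get(b, set())

def search_not(index, all_files, terms):
    """Files that contain none of the given terms."""
    excluded = set()
    for term in terms:
        excluded |= index.get(term, set())
    return set(all_files) - excluded

# Hypothetical index and file list.
index = {
    "giraffe": {"zoo.c", "notes.txt"},
    "zebra": {"notes.txt"},
    "lion": {"cats.c"},
}
all_files = {"zoo.c", "notes.txt", "cats.c"}
```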
Figure 9.7-Hierarchical Concept Model: The answers to the questions on the
‘Top Level’ questionnaire are in turn ‘Component 6,’ ‘String Lib,’ ‘Math Lib’ and for
‘Component 6,’ ‘A.D.T. 6.’ This provides a hierarchically structured semantic
net-like structure of descriptions of a software item.
Figure 9.8-HLQS DFD: Data-Flow diagram of the software module. HLQS has
the three components Translate, Query, and Report.
Definition:
  Function: HLQS provides a high level interface...
Interfaces
  Interfaces Provided
    User's Report
  Interfaces Required
    User's Input
    Relational DB
Concept Decomposition
  Subconcept being Instantiated
    Translate
  Interface Bindings
    External concept interface bindings
      IN: User's Input
    Internal subconcept interface bindings
      OUT: SQL query
      OUT: Format template
  Subconcept being Instantiated
    Query
  Interface Bindings
    External concept interface bindings
      IN: Relational DB
    Internal subconcept interface bindings
      IN: SQL Query
      OUT: Table
  Subconcept being Instantiated
    Report
  Interface Bindings
    External concept interface bindings
      OUT: User's Report
    Internal subconcept interface bindings
      IN: Table
      IN: Format template
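A concept decomposition of this kind can be checked mechanically: every input of a subconcept should be supplied either by a required external interface or by the output of another subconcept. The Python sketch below encodes the HLQS bindings listed above and performs that check. The class and function names are hypothetical, and the spelling ‘SQL query’ is normalized for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Subconcept:
    name: str
    inputs: tuple
    outputs: tuple

# External interfaces of HLQS, as listed in the questionnaire.
REQUIRED = {"User's Input", "Relational DB"}
PROVIDED = {"User's Report"}

SUBCONCEPTS = [
    Subconcept("Translate", ("User's Input",), ("SQL query", "Format template")),
    Subconcept("Query", ("Relational DB", "SQL query"), ("Table",)),
    Subconcept("Report", ("Table", "Format template"), ("User's Report",)),
]

def unbound_inputs(required, subconcepts):
    """Return (subconcept, input) pairs that nothing supplies."""
    available = set(required)
    for sub in subconcepts:
        available.update(sub.outputs)
    return [(s.name, i) for s in subconcepts for i in s.inputs
            if i not in available]

def missing_outputs(provided, subconcepts):
    """Return provided interfaces that no subconcept produces."""
    produced = {o for s in subconcepts for o in s.outputs}
    return sorted(set(provided) - produced)
```

For the full decomposition both checks return empty lists; removing Translate leaves the ‘SQL query’ and ‘Format template’ inputs unbound.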
These tasks constitute a reuse scenario, where the domain model pro-
vides a standard vocabulary that can be applied for subject representation
(indexing) as well as for subject retrieval of software concepts. The the-
saurus links together questionnaires and aids the developer in browsing
groups of related documents by creating networks of documents based on
linking terms in separate texts via the thesaurus.
The key components in the PRESSTIGE questionnaire are derived
by the developer answering various questions about the component to be
stored. What these questions are in a particular company would depend
on the domain in which the PRESSTIGE system was to be used. Typical
ones include:
Figure 9.10-The Base Window of the Search Tool: This window shows the
PRESS search tool. An example of the result of a search for all the concept
models indexed on ‘heating’ is shown. The names in the window are the titles of
concept models found using that term.
Where entries between brackets are optional. Search terms may contain
wild cards. For example, the search term
ind?
will find questionnaires that have been indexed with a term starting with
‘ind’ such as indexing, indexed, or indices. The qualifier can constrain the
matches found by a search by specifying a questionnaire property (for
example the ‘Function’ of a component) for which the match must occur.
The qualifier <thes. relations> is used to extend search terms. For
example, the query:
that form the thesaurus tool display. The user can then see the BT
(broader term), the date of creation, and so forth. Any of the terms
generated by the search, for example the Narrower Term, can now be used as
a search term to retrieve further related items. Usually only the domain
analyst has control over adding terms to and deleting terms from the
thesaurus. Control over terms is important if consistency is to be
maintained throughout the thesaurus.
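The wild-card matching and the thesaurus-based extension of search terms described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PRESS query language itself: only a trailing ‘?’ is handled, the relation codes follow the BT/NT convention mentioned in the text, and the function names, questionnaire titles, and thesaurus entries are invented for the example.

```python
def match_terms(pattern, vocabulary):
    """Terms matched by a pattern; a trailing '?' matches any ending."""
    if pattern.endswith("?"):
        prefix = pattern[:-1]
        return sorted(t for t in vocabulary if t.startswith(prefix))
    return sorted(t for t in vocabulary if t == pattern)

def expand(term, thesaurus, relation):
    """The term plus its directly related terms for one relation, e.g. 'NT'."""
    return [term] + thesaurus.get(term, {}).get(relation, [])

def search(index, terms):
    """Questionnaire titles indexed on any of the given terms."""
    found = set()
    for term in terms:
        found |= index.get(term, set())
    return sorted(found)

# Hypothetical index (term -> questionnaire titles) and thesaurus.
index = {
    "indexing": {"Q1"}, "indexed": {"Q2"}, "indices": {"Q3"},
    "heating": {"Q4"}, "furnace": {"Q5"},
}
thesaurus = {"heating": {"BT": ["process control"], "NT": ["furnace"]}}
```

Extending ‘heating’ with its narrower terms retrieves the questionnaires indexed on ‘furnace’ as well as those indexed on ‘heating’ itself.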
Figure 9.12-Practitioner Reuse Mill: This diagram shows how the different
programs that were developed for the Practitioner project fit together and exchange
information.
MUCH
The MUCH (Many Using and Creating Hypermedia) system supports
collaborative hypertext authoring (Rada et al., 1989). MUCH enhances a
number of functionalities of PRESS and helps integrate PRESSTO and
PRESSTIGE into a single multi-faceted system (see Figure 9.12
Practitioner Reuse Mill). PRESSTO testing demonstrated that
‘understanding’ tasks were not well supported when searching separate but
interconnected documents, because the user had to move from one document
to the next via word search. The MUCH system can alleviate the problems
in this situation. The MUCH system is programmed in C, runs on networked
UNIX workstations, and has its own database system and an X Window
interface.
MUCH puts stress on the outline of a document, since the outline forms
the existing structure of the document and can be extracted automatically.
Outlines can provide an overview of a domain and can aid a user in
domain exploration. In MUCH, a fisheye view of a document is implemented
by the user ‘folding/unfolding’ the outline. Different links can be
distinguished by labels. Annotation and discussion are also supported by
a ‘typed link,’ called ‘Comment,’ as a communication mechanism among
co-authors. Users can add links between hypertext nodes at will (see
Figure 9.13 Create Link). The MUCH system is particularly useful for
writing with reuse. To this end two particularly strong features have been
incorporated into the MUCH system:
Figure 9.13-Create Link: The outline is on the left. Paragraphs that are attached
to the nodes appear on the right when the user selects an item from the windows
on the left. The user has elected to create another link and is about to enter
relevant information in the small, ‘pop-up’ window in the center of the screen. (Note
that links have types).
The MUCH system allows a user to take existing documents and cre-
ate new documents based on these. The user selects a start heading and a
level to which a depth-first traversal should proceed. New links can be
created and certain links may be defined as dead ends so that the traver-
sals can generate documents with significantly different outlines from
any currently in the library. The reorganization efforts in MUCH aim at
providing different views of the same library according to users’ specifi-
cation. Thus the result in some sense can be seen as a draft of a new doc-
ument, and users can readily modify the structure and the contents of the
draft with the MUCH authoring facilities.
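The traversal just described, which starts at a chosen heading, proceeds depth-first to a chosen level, and skips links marked as dead ends, can be sketched as follows. The Python code is a minimal illustration; the function name `traverse`, the representation of links as a dictionary, and the sample headings are assumptions and do not reflect the MUCH implementation, which is written in C.

```python
def traverse(links, start, max_depth, dead_ends=frozenset()):
    """Depth-first outline from `start`, at most `max_depth` levels below it.

    links: dict heading -> ordered list of linked headings.
    dead_ends: set of (source, target) links that must not be followed.
    Returns a list of (depth, heading) pairs in outline order.
    """
    outline = []
    visited = set()

    def visit(node, depth):
        if node in visited:          # never place a heading twice
            return
        visited.add(node)
        outline.append((depth, node))
        if depth == max_depth:       # stop at the requested level
            return
        for child in links.get(node, []):
            if (node, child) not in dead_ends:
                visit(child, depth + 1)

    visit(start, 0)
    return outline

# Hypothetical library of linked headings.
links = {
    "Reuse": ["Tools", "Case Studies"],
    "Tools": ["PRESSTO", "MUCH"],
    "Case Studies": ["IBM"],
}
shallow = traverse(links, "Reuse", 1)
pruned = traverse(links, "Reuse", 2, dead_ends={("Reuse", "Case Studies")})
```

Changing the depth limit or the set of dead-end links yields drafts with different outlines from the same library.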
Figure 9.14-Traversal System: Here the MUCH outline system can be seen,
and options available to the user for manipulating the outline are shown. These
allow the user to control which of the types of links between nodes that exist in
the text should be followed.
SoftClass
The SoftClass project was funded jointly by Canadian research granting
agencies and Tandem Corporation (Mili, 1994). The project aimed to
enhance software reuse for distributed management software and priori-
ties hinged upon two alternative goals:
Figure 9.15-SoftText Extraction: Analyzing this document and having identified
‘Function’ as an attribute name, and ‘HLQS’ as a software component name,
SoftText knows that ‘Function’ is an attribute of ‘HLQS,’ and that the text
immediately following section 1.1 is the textual description of the value ‘Function’ for
HLQS.
1. HLQS System
1.1 Function
HLQS provides a high-level interface
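The extraction illustrated in Figure 9.15 can be approximated by a small parser that tracks the current component and attribute as it reads numbered headings. The Python sketch below is a simplified illustration and not SoftText's actual algorithm; the function name `extract` and the heading pattern are assumptions, and the lists of known component and attribute names are taken to be given in advance.

```python
import re

# A numbered heading such as "1. HLQS System" or "1.1 Function".
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(.*)$")

def extract(lines, component_names, attribute_names):
    """Return {(component, attribute): text} from a numbered document."""
    values = {}
    component = attribute = None
    for line in lines:
        text = line.strip()
        match = HEADING.match(text)
        if match:
            title = match.group(2)
            attribute = None
            if title in attribute_names:
                attribute = title
            else:
                # A heading naming a known component starts a new context.
                found = [w for w in title.split() if w in component_names]
                if found:
                    component = found[0]
        elif component and attribute and text:
            key = (component, attribute)
            values[key] = (values.get(key, "") + " " + text).strip()
    return values

document = [
    "1. HLQS System",
    "1.1 Function",
    "HLQS provides a high-level interface",
]
result = extract(document, {"HLQS"}, {"Function"})
```

The text following section 1.1 is stored as the value of the attribute ‘Function’ for the component ‘HLQS.’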
code. Designs for a particular domain are expressed in a formal way, lead-
ing to an understanding of the inputs/outputs and processes. DRACO is
based upon domains and involves three new human roles in the software
development process:
Once the DRACO domain is large enough, new systems can be built
from the existing component information. This is reuse at the software
concept level, but the result can then be transformed into code. The
system has many domains, organized hierarchically, some of which contain
executable information to allow the transformation of code to take place.
level constructs and allows the user, in a sense, to readily reorganize the
components into a new product. In this sense the application generator is
a reuse tool. On the other hand, some would argue that an application
generator is basically another piece of software and not a reuse tool. The
remainder of this section describes a graphical user interface generator
and treats it as an example of a reuse tool.
The main overall requirement of a user interface is ease of use. In
several ways this can be seen as subjective, though there are several
sets of guidelines to help developers. One of the main needs when
designing an interface is for consistency (Nielson, 1989). Consistency
should exist at many levels, from window to window in a program, from
program to program on a platform, and so on. Suppose all the gadgets
(a gadget is a functional part of the interface, like a button or scroll
bar) work in the same way. Then the user should be able to transfer skills
from gadget to gadget when gadgets are used for the same thing (e.g., all
scrolling windows have the same sort of scroll bars, whether they scroll
text or graphics). The user, when faced with a new interface item, should
also be able to predict what it does from the gadgets it uses. This
consistency also reduces potentially costly mistakes by users.
Furthermore, consistency relates to code reuse, as the same code can be
used to ensure that all functions of a particular type behave
identically.
Figure 9.16-Graphical User Interface Example: This shows the standard Motif
file selector.
specific, the rest is generated from reusable templates and code segments
by the system.
Interface Architect is a software package by Hewlett Packard for the
creation of X Window applications. It allows the user to create interfaces
that use the international standard widget set of Motif, with very little,
if any, actual code having to be written by hand. Interfaces are created by
selecting items from menus and drawing on the screen, the interface
being drawn as it will look on the screen when completed. The user can
then attach text areas, buttons, and so forth to a window by selecting the
desired items from menus and dragging out their area. Each of these items
can be edited to move it, change its font, indicate how it behaves when
activated, and so on. The developers can add their own code to the
various buttons so that when a button is clicked a program is executed.
This is in stark contrast to the usual methods of creating these
interfaces, which involve writing many hundreds of lines of code.
Interface creation is typically complex, but unless something
specialized is needed a developer can use Architect to create an
interface without having to learn anything about the actual code
needed to perform a task such as opening a window. This allows the
developer to concentrate on the application-specific segments. Once the
interface has been designed, its code can be generated automatically.
Architect generates stand-alone C code. Developers never need to
manipulate the generated source code directly. Instead the constructs can
be loaded into Architect and manipulated graphically.
Generic components can be created. It is possible to define parts of
the interface, such as the title of a window, using variables. The actual
value used at run time (and thus the text that is displayed) is the value of
the variables when the call is made to create the window. This feature is
described in Hewlett Packard’s Architect documentation as a reusable
parametric component. It means that if similar dialogs are needed for a
group of tasks, only one need be defined in full; the others can be
obtained by modifying the values.
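The idea of a parametric component can be illustrated independently of any particular interface builder. The Python sketch below defines one dialog description whose title and message contain variables and then instantiates it with different values. It does not use or imitate the Architect API; the names `CONFIRM_DIALOG` and `create_dialog` and the dialog contents are hypothetical.

```python
from string import Template

# Hypothetical dialog definition; $title, $action and $target are
# variables whose values are supplied when the dialog is created.
CONFIRM_DIALOG = {
    "title": Template("$title"),
    "message": Template("Really $action $target?"),
    "buttons": ("OK", "Cancel"),
}

def create_dialog(definition, **values):
    """Instantiate a dialog description using the values current at call time."""
    return {
        key: part.substitute(values) if isinstance(part, Template) else part
        for key, part in definition.items()
    }

delete_dialog = create_dialog(
    CONFIRM_DIALOG, title="Delete File", action="delete", target="report.txt")
print_dialog = create_dialog(
    CONFIRM_DIALOG, title="Print File", action="print", target="report.txt")
```

Both dialogs share one definition and the same buttons; only the values passed at creation time differ.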
As well as creating programs, users can create partial interfaces. A
developer could thus create a set of reusable higher-level components,
built of widgets and similar in level of functionality to the file selector
(which is already predefined), and load them into Architect whenever they
are needed. Architect also provides for reuse by allowing graphical user
interfaces to be added quite simply to existing command-line-driven UNIX
programs.
Epilogue
CASE tools support systematic software engineering and facilitate reuse
processes. This chapter has emphasized two such tools, namely
Practitioner and SoftClass. A user interface generator has also been pre-
sented as an example of another kind of reuse tool.
The Practitioner and SoftClass systems emphasize domain analysis.
Each helps the user analyze software-related information and encode that
analysis into a computerized representation. Some of the analysis may be
done automatically. Retrieval in the Practitioner system is based on word
searches, thesaurus searches, or hypertext browsing. SoftClass supports
retrieval of software descriptions from queries that are partial descrip-
tions. The authoring of new software or software-related documents is
supported by collaborative, hypertext authoring tools in Practitioner and
transformational processes in SoftClass.
An application generator is a very different type of reuse tool.
Hewlett Packard’s Interface Architect is a user interface generator. It has
been described in some detail to illustrate the domain model and the
assets that application generators need and the benefits of consistency
and efficiency that they allow.
Chapter 10
Case Studies
Successful Commercial Cases
The following brief sketches illustrate the potential benefits of reuse
(GTE, 1992a). The companies tend to be large and several are dealing
with military applications. Later in this chapter detailed managerial
mechanisms of the corporate giants IBM, HP, and Motorola are
presented.
The Toshiba Fuchu Software Factory produces software using a stan-
dardized life-cycle model. This factory produces process control software
products. At Fuchu, the use of metrics has been recognized from the start
and reuse has been measured since 1977. A fourteen percent gain in
productivity has been achieved annually. ‘Promote Reuse’ is a company
motto, and reuse is fully supported by the management. A large library of
reusable software items has been developed, and it is standard practice for
every project to review possible candidates for reuse at the start and
throughout a development project.
The GTE Asset Management Programme has been based on a proto-
type system developed at the University of California at Irvine. This pro-
totype was redeveloped by GTE and resulted in a successful transfer of
technology. At GTE, the aim was to support reuse of any asset with the
emphasis on software. The production system developed by GTE held
over 200 COBOL components. GTE found that in 1987 it achieved a
reuse factor of 14% and saved $1.5 million.
Raytheon Missile Systems recognized the redundancy in its business
application systems and instituted a reuse program. In an analysis of over
5000 production COBOL programs, three major classes were identified.
Templates with standard architectures were designed for each class, and
a library of parts developed by modifying existing modules to fit the
architectures. Raytheon reports an average of 60% reuse and 50% net
productivity increase in new developments.
NEC Software Engineering Laboratory analyzed its business appli-
cations and identified 32 logic templates and 130 common algorithms. A
reuse library was established to catalogue these templates and compo-
nents. The library was automated and integrated into NEC’s software
development environment, which enforces reuse in all stages of develop-
ment. NEC reports a 7:1 productivity improvement and 3:1 quality
improvement.
Bofors Electronics had a requirement to develop command, control,
and communications systems for five ship classes. As each ship class was
specific to a different country, there were significantly different require-
ments for each. To benefit from reuse, Bofors developed a single generic
architecture and a set of large-scale reusable parts to fit that architecture.
Because of a well-structured design, internal reuse, and a transition to
modern CASE tools, Bofors experienced a sizeable productivity
improvement in the number of lines of code generated per hour.
Universal Defense Systems in Australia develops Ada command and
control applications. The company began its work in this business with a
reuse focus, and has developed a company-owned library of 400 Ada
modules comprising 500 thousand lines of code. With this base, the
company developed the Australian Maritime Intelligent Support Terminal
with approximately 60% reuse, delivering a 700 thousand line system in
18 months.
detail. Also expectations of software now are much greater than they
would have been 20 years ago. An example of the life span of these
systems comes from ABB, where one of the steel mill lines was first made
operational in 1963 and was enhanced in 1975 and again in 1985. This line
has always been computer controlled. The Practitioner Project aimed to
find a way of reusing at least some of the concepts that this software
addressed.
Several studies of steel mills and steel mill processes were analyzed
and combined to form a set of concept models or questionnaires to
describe processes in the ‘Hot Mill Rolling Area’ of the steel mill. This
formed the domain analysis necessary to form a useful model of the steel
mill, the processes involved and the materials transformed by these
processes. The Practitioner Project built on the results of AISE.
Questionnaires were produced to ‘provide a high level conceptual view’
of the control systems in various areas of the plant, for example ‘Hot
Rolling Line Roll Management Systems.’ These questionnaires both
complemented and augmented existing documentation about the system
and planned systems.
Experiments were carried out in-house by the Practitioner Developers
and at ABB with the Practitioner tools. Two sets of experiments tested the
functionality of PRESSTIGE. In one experiment, a C software package
amounting to 2,800 lines of code was redocumented using questionnaires.
The entire functionality of the package was documented in question-
naires. Filling the questionnaires took most of the 240 hours spent on the
task. Users complained that some aspects of the concept model were
either irrelevant or unimportant.
After an introduction to the concepts underlying Practitioner and
the tools, two engineers performed tasks related to three major areas:
The preparation of a tender for a project was chosen for the purpose
of demonstrating the reuse process, as it provides a short illustration of
reuse, while at the same time bearing enough resemblance to the design
process to make realistic use of the Practitioner methods and tools. The
engineers were impressed with the functionality of PRESSTIGE, and
especially the browse tool which they felt was a powerful instrument to
list relations between objects in the database. However, they felt that
IBM Reuse
Most major companies have quality management programs. IBM’s qual-
ity management approach has as a major element the increased reuse of
valuable assets such as software, designs, and experiences to prevent
redundant development and maintenance efforts. In the late 1980s, IBM
launched a worldwide campaign to implement reuse formally into the
processes of its internal operations.
Significant accomplishments have been made within IBM since the
early 1980s in reuse technology. Sites such as Boblingen, Germany,
Houston, Texas, and Poughkeepsie, New York have participated in this
work. Part of the effort to formalize the management of reuse in IBM can
be traced to the work in Boblingen on building blocks. Subsequently the
IBM Corporate Reuse Council was established. The Council established
broad communication channels. These took many forms, including
newsletters, a ‘starter kit’, and electronic bulletin boards (Tiro and
Gregorius, 1993).
A focal group called the Reuse Technology Support Center was
formed in January 1991. Its responsibility was to coordinate the reuse
effort within IBM, provide consulting to technical organizations, and pro-
vide funds for tools and assets. In addition, reusable parts technology cen-
ters were established. As writing reusable software costs more initially
than other software, management support is needed to make that invest-
ment possible.
The application of reuse at IBM was recognized to occur at different
levels. The implementation of reuse in a business area entails exploiting
opportunities for software reuse across multiple contracts or products. For
project-level reuse the key activity is the establishment of a project reuse
team leader. The leader participates in all of the project reviews and must
be aware of external sources for reusable components.
Implementing reuse for a site requires additional coordination. A site
champion is given broad coordination responsibilities. A common library
of reusable parts is established. The primary focus of the IBM reuse pro-
gram is to establish reuse across an entire site. Different sites within IBM
have taken different approaches to populating their reuse libraries. Some
examine their current development efforts and identify and build reuse
candidates, while other sites solicit donations.
When the IBM Reuse Technology Support Center was formed, it targeted
five sites for support during the first year. By 1993, 30 sites worldwide
were involved with the Center. The best programs showed savings in the
millions of dollars, and reuse accounted for 25 percent of the components
in a software product. There have been cases where the finely tuned data
abstractions provided by the building blocks exhibited better performance
characteristics than custom-built data structures. These projects have
benefited from reduced maintenance costs as well as improved performance.
Few major breakthroughs are necessary to exploit reuse. Certain
attributes of software make software easier to reuse, but these are not nec-
essary for reuse. IBM experience shows that reuse can be accomplished
successfully in existing products with existing techniques and knowledge.
a list of requirements for parts. This kind of requirement allows for prop-
er subsequent monitoring of progress in reuse. The state of reuse is col-
lected in every development step.
Maintenance effectiveness for reused parts is gained by redefining
the appropriate process step such that error reports of customers can be
quickly routed to the owner of an erroneous component. More and more
products should be composed of building blocks owned by different orga-
nizations rather than created new each time. To exploit the economic
attributes of reuse, new accounting methods are needed. Experience at
IBM shows that there is a lack of progress in determining the value of a
reusable part and in making flexible charging mechanisms available
between organizations.
The IBM experience with modifying processes indicates that immature
processes are not appropriate to modify. With maturity measured on
a scale from least mature to most mature, a good starting point for
modifying processes to better suit reuse was deemed to be a process
in the middle of that scale. Even at this level one cannot expect the
staff to change their day-to-day operations right away.
The implementation of the process modifications is time-consuming
and ideally one would have many years to slowly adapt an organization
to a new emphasis. However, the IBM reuse goal was to reach significant
results within two years. Accordingly an accelerator was needed. IBM
adopted incentive programs for reuse. The provider of a reusable part gets
credit depending on the size and usage of the part, and the user gets
credit for integrating available parts. The award program at IBM Boblingen
is also consistent with existing award schemes in its schedule and its
emphasis on quality, and it rewards usage of unmodified code only.
Considering the benefits of reuse, the incentives are a relatively low-cost
investment.
The third activity in validating the plan concerns the communication
paths. Several communication channels were deemed important at IBM,
including personal communication, electronic bulletin boards, and
databases. Personal communication always proves to be the most important
channel when working with representatives of product areas. Such a
representative should be recognized as a competent professional and be
able to support
also heavily used for reuse communications.
In the initial phase of reuse, when the repository of parts is small, a
simple list of these parts is circulated around the site. Later as the repos-
itory grows in size, a database is used. IBM developed a sophisticated
database for corporate-wide storage and retrieval of reusable parts for
MACRO (512).
The reuse report therefore contained 512 source instructions and
5120 reused instructions, but did not fairly represent the degree of reuse.
In finding sources for reuse, IBM Boblingen had the advantage of
multiple existing sources. As previously noted, electronic bulletin boards
are a popular medium of exchange about reusable parts. While these
bulletin boards have the disadvantage of ‘as-is’ usage without
certification of parts, they are attractive to staff. A reusable parts
center was also constructed. Finally, as new software is developed it is
reviewed by a reuse
board for spin-offs that can be contributed to the reusable parts center.
A curriculum for helping people understand and employ reuse was
designed. The concept of reuse was best introduced in the course of tran-
sition from third-generation language to object-oriented technologies
because object-oriented technologies have many reuse characteristics.
Support tools should be available for reuse. IBM identified three par-
ticularly important tool types:
Figure 10.1-Fingertip Reuse: The figure shows two fingers walking across a
book.
Execution
Two extreme modes of inserting new technologies are the grass-roots and
the edict approaches. IBM used both. Edicts quickly generate some
results. However, experience showed that acceptance by staff of the
HP Reuse
Hewlett-Packard (HP) has been engaged in software reuse since the early
1980s (Griss, 1993). Early work included the development of libraries of
software components written in the BASIC language and more recently
the development of libraries in object-oriented programs. Some of these
libraries have been widely distributed within HP and some provided to
the outside. At the end of the 1980s HP established a corporate-wide
reuse strategy. This led in the early 1990s to the successful application
of reuse on a larger scale and the development of further software
libraries.
The HP corporate reuse strategy involves a core team of software
reuse experts with additional people working on assignments with sever-
al HP pilot projects. HP is divided into several large divisions, such as the
printer division, and unlike some corporations, HP is not building a
single corporate-wide reuse library. Rather, each division creates reuse
programs and products customized to its needs. The core team works with
the different divisions to help them exploit reuse. The core team develops
economic models, coding guidelines, educational handbooks, and gener-
ally consults with the divisions. The core team focuses on domain-spe-
cific approaches to software reuse and has developed a domain analysis
methodology for HP.
A study of reuse practice at HP has made it strikingly clear that the
impediments to improving software reuse are predominantly nontechni-
cal and socioeconomic. When confronted with their first reuse failure, a
division should pursue an incremental improvement process. For a reuse
program to be effective, the specific inhibitors likely to affect it must be
identified. To better visualize these inhibitors, HP divides these factors
into the following categories:
Figure 10.2-Kit: In the domain-specific kit, components are placed within a
framework and connected with glue.
Figure 10.3-Kit Factory: The two inputs on the left go through kit production and
kit use inside the factory before applications result.
Motorola Reuse
Until the 1990s Motorola was primarily a hardware producer. From the
beginning of the 1990s the company committed itself to becoming a
premier producer of software also. The role of reuse in this multi-year,
CIM-EXP
CIM-EXP Limited is a typical small enterprise consisting of 10 pro-
grammer-engineers working on different research and development pro-
jects (Kovacs, 1997). Some of the projects are one-of-a-kind to serve spe-
cific industrial needs. In these one-of-a-kind projects, any kind of stan-
dardization is very hard and complicated.
Other projects employ standards and reuse, but the projects are a
challenge to manage due to the specificity of the tool set. Most program-
ming is done in C and C++ and various computer-aided software engi-
neering tools are used along with the programming language. The CIM-
EXP projects tend to address communication and networking problems,
and the design of real-time control systems for flexible manufacturing
systems.
One of the approaches to reuse is based on faceted classification of
reusable components. Reuse is valuable when components can be found
that fit into new system needs. The basic approach is to classify part of
the design for reuse by strictly using the facets of the classification lan-
guage to describe all documents and components of the software life
cycle. The advantage to this was that the employees developed a certain
familiarity with the classification and could use it fairly easily. The problem
has been that not all facets apply to all assets. The range of assets that
CIM-EXP uses is so wide that the classification system is either too
broad to be useful or so specific that it only partially applies to many
assets.
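The difficulty that not all facets apply to all assets can be made concrete with a small sketch. The facet names and the two sample assets below are hypothetical, chosen only for illustration; they are not the classification language CIM-EXP used. An asset simply omits a facet that does not apply, and a query matches only on the facets it names.

```python
from dataclasses import dataclass, field

# Hypothetical facets; a real classification language would define its own.
FACETS = ("function", "object", "medium", "language")


@dataclass
class Asset:
    name: str
    # Maps facet name -> term. Facets that do not apply are simply absent.
    facets: dict = field(default_factory=dict)


def matches(asset, query):
    """True if the asset carries every facet term named in the query."""
    return all(asset.facets.get(f) == term for f, term in query.items())


def coverage(asset):
    """Fraction of the classification's facets that this asset fills in."""
    return sum(1 for f in FACETS if f in asset.facets) / len(FACETS)


library = [
    Asset("ring_buffer.c", {"function": "buffer", "object": "message",
                            "medium": "memory", "language": "C"}),
    # A design document has no meaningful "medium" or "language" facet.
    Asset("cell_controller_design.doc", {"function": "control",
                                         "object": "robot cell"}),
]

hits = [a.name for a in library if matches(a, {"language": "C"})]
print(hits)                            # ['ring_buffer.c']
print([coverage(a) for a in library])  # [1.0, 0.5]
```

A low coverage value across many assets is one way to see that the classification only partially applies to them.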
Epilogue
The construction of domain models and libraries that support software
reuse has occurred in numerous organizations. The costs of developing
and maintaining the libraries are high and only systematic, reuse-orient-
ed management of the software staff leads to long-term benefits exceed-
ing cost. This chapter has documented several experiences of software
reuse.
At ABB the sophisticated Practitioner tool set and its domain model
methodology were not attractive enough for ABB divisions to be willing
to further invest in the tool set. The challenge is to fit into the work flow
of software engineers. For new reuse efforts this fit may require simple
tools.
At IBM, fingertip reuse has proved critical to user acceptance. If soft-
ware engineers must consult with the corporate or division reuse
librarians in formalized ways, the engineers will not bother to follow the
reuse plan. With fingertip reuse, a designer can look for reusable parts
within seconds, just when it comes to mind.
The HP experience is consistent with that of ABB and IBM. HP
found that the most effective reuse programs concentrate on a small,
high-quality set of useful components, and make sure that the engineers
know about this small library. At Motorola a cash incentive scheme has
proven most helpful to reuse. Again and again the conclusion is that care-
ful focused management of incremental change is the key to reuse.
Chapter 11
Courseware Reuse
Courseware Standards
Courseware components can be reused when appropriately classified and
embedded within environments that have standard interfaces. Standards
have been developed by the aviation industry that standardize what these
components should be like so that such reuse can be facilitated.
• Text,
• Graphics,
• Video,
• Audio, and
• Logic.
Files are the most common data structure in computer science and by
asking that the courseware structure be represented in files, the standards
developers have reached to the lowest common denominator among the
target audience, as standards developers are expected to do.
In the past, authoring systems made the courseware author and stu-
dent user a captive of the authoring system vendor. If the customer want-
ed to manage a set of students in a class, he had two choices:
In either case, the management system works only for course content
from a single vendor. This is fine, until the customer acquires course
material prepared with a different authoring system. Standards should
promote interoperability (AICC, 1997). Interoperability means the ability
of a given management system to handle lessons from different origins.
It also means the ability for a given lesson to exchange data with differ-
ent management systems.
There are two ways to enable interoperability of management with
lesson delivery:
• When the student leaves the lesson, the lesson system updates and
completes the file of information for the management system.
• The management system reads the lesson-to-management file,
updates applicable student data, and determines the next student
assignment or routing activity.
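The file exchange described in the two steps above can be sketched as follows. This is a minimal illustration, not the AICC file format: the JSON layout, the field names, the lesson names, and the pass mark are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def lesson_finish(path, student_id, lesson_id, score, completed):
    """Lesson side: on exit, write the lesson-to-management file."""
    record = {"student": student_id, "lesson": lesson_id,
              "score": score, "completed": completed}
    Path(path).write_text(json.dumps(record))


def management_update(path, students, sequence, pass_mark=70):
    """Management side: read the file, update student data, pick next lesson."""
    record = json.loads(Path(path).read_text())
    history = students.setdefault(record["student"], {})
    history[record["lesson"]] = record["score"]
    passed = record["completed"] and record["score"] >= pass_mark
    if not passed:
        return record["lesson"]          # repeat the same lesson
    position = sequence.index(record["lesson"])
    if position + 1 < len(sequence):
        return sequence[position + 1]    # route to the next lesson
    return None                          # course finished


students = {}
sequence = ["radiation-basics", "shielding", "dosimetry"]  # hypothetical course
with tempfile.TemporaryDirectory() as folder:
    exchange = Path(folder) / "lesson_to_management.json"
    lesson_finish(exchange, "s01", "radiation-basics", 85, True)
    next_lesson = management_update(exchange, students, sequence)
print(next_lesson, students)  # shielding {'s01': {'radiation-basics': 85}}
```

Because the two sides share only the file, a lesson from one vendor and a management system from another can cooperate as long as both agree on the file layout.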
Small Company
Integrated Radiological Services Limited (IRS Ltd) is a small company
with seventeen employees that specializes in diagnostic radiology and
authors courseware about radiological safety. IRS Ltd has a large number
of potential reusable components around the office, the majority of which
are not currently being used by courseware developers at IRS Ltd. IRS
Ltd develops its courseware with an authoring package called Toolbook
from Asymmetrix Corporation and decided to develop facilities in
Toolbook to support courseware reuse.
Figure 11.1-Entry: This screen from the IRS Ltd system shows the ‘Table of
Contents’ and ‘Media Index’ options which the user first faces.
System Architecture
The IRS Ltd system supports librarians in entering material into the
library and authors in accessing material from the library. The authors at
IRS Ltd identified eight types of material that they wanted the library to
contain:
• Text
• Diagrams
• Photographs
• Graphs
• Tables
• References
• Questions/Answers
• Programs
After the library is ready, the author uses the material within the
library to author courseware. The time taken for the author to do this task
is noted. Hence, the time taken for the author to develop courseware with
the aid of the tool can be calculated and this can be compared with the
time taken for the author to develop a similar piece of courseware with-
out the aid of the tool. The values obtained are substituted into the fol-
lowing relationship:
At IRS Ltd, the librarian-skilled staff are paid $7 per hour, while the
authors, who are also domain experts, are paid $15 per hour. For the sin-
gle course developed in this exercise, the cost of the author’s time is about
$80 and the librarian costs are about $500. Developing the course by the
author alone costs $240. How many courses of similar size to the one
already developed and such that all components come from the reuse
library would have to be developed before the cost of the library was less
than the benefit of the library? The inequality given earlier of
For this inequality about 40 courses must be written from the library
before the library proves cost effective.
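The break-even reasoning can be illustrated with a short calculation. Since the relationship itself is not reproduced here, the following is only a sketch under a simple assumed model: the librarian cost is paid once, each course built from the library costs only the author's reuse time, and each course built without the library costs the full authoring time. The function name and parameters are hypothetical. Under this assumption alone, the dollar figures quoted above give a break-even at 4 courses; the original inequality may contain further terms that account for the larger figure stated in the text.

```python
import math


def break_even_courses(library_cost, cost_with_library, cost_without_library):
    """Smallest whole number of courses n that satisfies
    library_cost + n * cost_with_library < n * cost_without_library."""
    saving = cost_without_library - cost_with_library
    if saving <= 0:
        return None  # the library never pays for itself under this model
    return math.floor(library_cost / saving) + 1


n = break_even_courses(library_cost=500, cost_with_library=80,
                       cost_without_library=240)
print(n)  # 4, because 500 / (240 - 80) = 3.125
```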
Media Index versus Table of Contents
The Table of Contents was a useful guide for text retrieval (Rada, 1996).
Unexpected problems occurred when the author tried to use the table of
contents within the reusable courseware library to retrieve material other
than text, for example, diagrams or photographs. If there was a section
within the book called Introduction, then this would suggest that the text
in this section of the book was introductory. If however, a diagram, graph,
photograph or table was included in this introductory section of the book,
the author could not anticipate the content of the media. The author was
unable to develop courseware using the table of contents alone because
he was only able to effectively retrieve text from the library, and the
courseware which the author wished to develop was multimedia.
The author found the media index easier to use than the table of con-
tents in some ways. The author was able to first decide the medium he
wished to examine and then enter a keyword describing the topic on
which he wished to retrieve material. However, the author could not in
practice develop courseware using the media index alone, because the
media index by itself did not give him an adequate overview of what was
available within the courseware library.
Results show that neither the media index nor the table of contents
alone is enough to support good recall or precision, but together the
media index and table of contents do support effective retrieval (Acquah,
1994). The author found the table of contents index useful as he was able
to see, from the headings in the contents window, what was present with-
in the library. This enabled the author to access easily the text he required
and also helped him to decide on relevant keywords to enter when using
the media index to retrieve text. The author spent the majority of his time
retrieving material using the media index, but used the table of contents
index when he wished to orientate himself. To improve the speed with
which items could be retrieved from the library and thus increase author-
ing speed, both the table of contents and the media index should be
present.
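The complementary roles of the two views can be sketched in code. The item fields, the sample items, and the function names below are hypothetical and do not describe the Toolbook implementation at IRS Ltd. The table of contents gives an overview by section regardless of medium, the media index retrieves by medium and keyword, and precision and recall are computed with their standard definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    medium: str          # e.g. "text", "diagram", "photograph"
    section: str         # table-of-contents heading the item sits under
    keywords: frozenset


LIBRARY = [
    Item("What is ionising radiation?", "text", "Introduction",
         frozenset({"radiation", "basics"})),
    Item("X-ray tube cross-section", "diagram", "Introduction",
         frozenset({"x-ray", "tube"})),
    Item("Lead apron in use", "photograph", "Protection",
         frozenset({"shielding", "apron"})),
    Item("Shielding rules", "text", "Protection",
         frozenset({"shielding", "rules"})),
]


def table_of_contents(library):
    """Overview: section heading -> titles, regardless of medium."""
    toc = {}
    for item in library:
        toc.setdefault(item.section, []).append(item.title)
    return toc


def media_index(library, medium, keyword):
    """Retrieval: pick a medium first, then filter by keyword."""
    return [i.title for i in library
            if i.medium == medium and keyword in i.keywords]


def precision_recall(retrieved, relevant):
    """Standard definitions over sets of titles."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


print(table_of_contents(LIBRARY))
found = media_index(LIBRARY, "photograph", "shielding")
print(found, precision_recall(found, {"Lead apron in use"}))
```

In this sketch the section heading "Introduction" says nothing about what the diagram filed under it shows, which is the retrieval problem described above; the keyword suggested by browsing the contents ("shielding") is what makes the media index query effective.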
Collecting material which might go into a courseware reuse library is
a major task. At some juncture, this material must be assessed for its true
value to the library and such assessments are themselves difficult. Getting
the material into the proper format for the library is a more complicated
job for multimedia than for text. Digitising video, for instance,
requires powerful hardware. Indexing the material for the library is anoth-
er major activity. While some indexing can be done automatically, much
experience suggests that human indexing, while laborious, is important.
With these various costs to acquiring and evaluating material for the
library the challenge of building a large enough library to be useful is
clearly daunting. Furthermore, the contents of the library must be contin-
ually updated and this must be done in close communication with the
needs of the users of the library. In the fast evolving world of hyperme-
dia, new formats themselves are regularly introduced and old ones made
extinct. Maintaining the format converters for this multimedia library is
a technical problem and in a sense easier to handle than the complex
Coordination
The experiences with the IRS Ltd system have indicated various important
features of courseware reuse libraries. The conceptual model for the
library and the mechanisms for supporting coordination can be extended.
One project for the training division of a large, Italian, aerospace manu-
facturing company, called Augusta SpA, has produced a particularly
sophisticated prototype courseware reuse system. The overall system is
called Open System for Collaborative Authoring and Reuse of course-
ware (OSCAR).
Reuse Architecture
The OSCAR architecture represents the way in which the OSCAR ser-
vices are organized, what functional level they realize, and what relation-
ship exists between them. To better represent the organization of services
provided by OSCAR and the relationship between them, OSCAR services
have been grouped in layers (see Figure 11.3 The OSCAR Layers).
OSCAR provides the following layers:
Figure 11.3-The OSCAR Layers: Four layers are depicted here. OODBMS
means object-oriented database management system.
Operating System
Client workstations represent the user entry point into the OSCAR
system. The OSCAR client workstations are mainly multimedia personal
computers on which library applications run. They also allow access to
shared services such as email, file transfer, and information management.
They can be remotely connected to allow a distant author to get access to
the OSCAR services. The OSCAR server provides multi-user services in
the distributed environment. Operating system services provide the man-
agement of all physical resources of a computer system and establish the
basic execution environment for applications. UNIX serves as the multi-
user operating system. MS-WINDOWS is the reference operating system
for the client workstation.
The OSCAR Common Information Space (CIS) allows different soft-
ware components and different users of the system to share information,
update it consistently, and base their work on the work of others. The
Figure 11.5-Screen onto Media Units: This screen dump from the OSCAR system
shows some of the features of the CIS, particularly media units.
Figure 11.6-Converter: Screen dump from the OSCAR system which presents
information about a particular image or bitmap and gives the user an option to
convert that bitmap into a variety of formats.
• Messages are objects that flow between the role instances associ-
ated with an activity.
• Information Units are used in building messages.
• Rules constrain the behavior of components.
The populator prepares the material for entry into the library and
physically enters it. This may involve scanning material or converting
formats. The indexers assign index terms to the library. Simultaneously,
the indexers work with the indexing language experts to create an indexing
language. As the library and its index grow, maintaining the indexing
language itself becomes a job (Mili and Rada, 1988). The quality
assurer does quality control and specialists on quality are needed to cor-
respond with every other role just mentioned.
For the Coordination Services all intermediate products can be treat-
ed as messages. For example, when an indexer proposes changes to an
indexing language expert, a message is created in the indexer workspace
using a template from the organizational manual. The message records
information about the person who created it, the role the creator was play-
ing, and the time it was created. The indexer completes the message.
The workspace tries to determine which person should deal with the
message next based on attributes of the message. In this case the work-
space forwards the message to the indexing language expert workspace
and tells the indexing language expert role that a message is awaiting
attention. If the message can be processed, the role instance locks it until
the process finishes.
The indexing language expert workspace retrieves an ‘assessment of
proposal’ template from the organizational manual. By default, the person
performing the role would fill in the details in the appropriate information
unit of the message. However, some of the fields within the information
unit in the message may be filled in automatically by the role
agent. After the person or the role agent fills in the fields, the workspace
unlocks the message and informs the current information unit that it is
complete. At this stage, the information unit triggers its rules which check
the validity of the field values and determine which will be the new current
unit and which role will process it. The message then routes itself to
the appropriate workspace.
This circuit is repeated until all the information units are completed.
At this point the message is considered complete and the next message is
activated and routed to the appropriate workspace. In this way, indexing
language maintenance is supported by the computer.
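The circuit just described (lock, fill in, unlock, apply rules, route) can be sketched in a few lines. This is not the OSCAR implementation; the class names, the field names, and the single validity rule are assumptions for illustration, and the activation of a follow-on message once one message completes is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class InformationUnit:
    name: str
    role: str                      # role that must complete this unit
    required: tuple                # field names the rule checks
    fields: dict = field(default_factory=dict)

    def is_valid(self):
        """Rule: every required field must carry a non-empty value."""
        return all(self.fields.get(k) for k in self.required)


@dataclass
class Message:
    creator: str
    creator_role: str
    units: list
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    current: int = 0
    locked: bool = False

    @property
    def complete(self):
        return self.current >= len(self.units)


def route(message, workspaces):
    """Place the message in the inbox of the role owning the current unit."""
    if not message.complete:
        workspaces[message.units[message.current].role].append(message)


def process(message, role, values, workspaces):
    """One circuit: lock, fill in, unlock, apply the rule, route onward."""
    unit = message.units[message.current]
    if message.locked or unit.role != role:
        return False
    message.locked = True
    unit.fields.update(values)
    message.locked = False
    if not unit.is_valid():
        return False               # unit stays current until its fields are valid
    workspaces[role].remove(message)
    message.current += 1
    route(message, workspaces)
    return True


workspaces = {"indexer": [], "indexing language expert": []}
msg = Message("Ann", "indexer", [   # "Ann" is a made-up creator
    InformationUnit("proposal", "indexer", ("term", "change")),
    InformationUnit("assessment of proposal", "indexing language expert",
                    ("decision",)),
])
route(msg, workspaces)
process(msg, "indexer", {"term": "dosimetry", "change": "add"}, workspaces)
process(msg, "indexing language expert", {"decision": "accept"}, workspaces)
print(msg.complete, workspaces)
```

After both roles have completed their information units, the message is complete and no workspace holds it any longer.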
The preceding sketch of indexing language maintenance is only a
small part of library maintenance. The ‘reuse assurance role’ monitors the
extent to which authors are using the library. A search librarian helps the
author find material. In the OSCAR scenario, searching and browsing the
CIS is supported by computer programs, but experience suggests that
human assistance would also be important.
interested in learning to use these tools. Further, while authoring tools can
empower individuals to create educational objects, they do not help an
author answer the question “does what I want to create already exist?”
Because of this, duplication of effort is common.
To address the issues of duplication of effort, lack of organization,
and the (relatively) small authoring community, an on-line community
and searchable resources are needed. From a developer’s perspective, the
EOE can help creators of Java educational applets know what has already
been created and by whom. The EOE is also intended to help educators
and learners access this material and the creators of the Java applets.
Working together, the educators, learners, and developers can collaborate
to enhance existing material and produce new innovations. The EOE
vision relies on a strong, diverse community of users and creators who
form small partnerships to modify something that exists or create some-
thing completely new.
After four months of operation, the EOE had a library of over 1,000
pointers to Java applets, over 25% of which made source code available.
In addition, about 100 people had signed up to be members of the com-
munity. The organizers of EOE believe that over time a whole class of
domain-specific education communities will start to develop around
repositories and directories. Accordingly, the EOE has made the infrastructure
available for download for anyone who’d like to use it to start
their own EOE.
EOE Plans
The EOE started as a directory of freely available resources on the web
with a community of people who are willing to work together to share and
add value to these resources. Nevertheless, in the long term the EOE encourages
the development of EOE-related businesses and for-profits. Just as
libraries do not put bookstores out of business, but spur demand for books
through a more literate population, the EOE of free resources should
expand the market for “for-fee” educational resources. For example,
teachers who start using the EOE would want to be able to know which EOE
objects are associated with chapters in the textbooks that they are using in
their classes. So publishers might make indices from their textbooks
freely available, and offer for-fee web resources that could augment
the printed textbook. Meta-data and micropayment systems are part of the
infrastructure that is being developed to support these sorts of
businesses.
The EOE has worked with developers to produce different types of
license agreements for sharing source code. For example, if a developer
Epilogue
Courseware is a kind of software and courseware reuse problems are a
particular case of software reuse problems. This chapter has examined
two courseware reuse tools and experience with their use. As was the case
for software reuse at IBM, HP, and Motorola, one general impression is
that the tools may need to be simple to fit into the workflow for initial
courseware reuse efforts.
A model of courseware development via reuse from a courseware
library has been elaborated. This model has been contrasted to a model of
courseware development without reuse. A major challenge for course-
ware reuse which is not confronted for software reuse in general concerns
the wide variety of incompatible media formats. The challenges of con-
verting courseware components from one format to another have been
largely overcome through the provision of various conversion tools. The
conceptual overview of the library contents has been divided into two
high-level types, namely a media view and a contents view. Neither alone
supports adequate retrieval but both together do.
If a courseware library is developed and is only used by one author to
develop one piece of courseware, then the efficiency of the reuse process
will be very low. A reusable courseware library is most efficient when it
is used to develop numerous courses. As the costs of developing and
maintaining a multimedia courseware library are relatively high, a small
firm might best choose a simple facility and try to quickly realize some
benefit from the system. Larger firms may be able to afford larger start-
up investments in the library. The critical factor in cost efficiency is a
repeated use of the library which will depend in part on the firm’s man-
agement policy.
Chapter 12
Conclusion
Representations
Traditionally reuse focuses on the reuse of code only. This requires the least
effort from the developer and offers the most immediate returns, when
successful. It also has its roots in the component libraries associated with
languages like Fortran, or in systems such as X Windows with its widget
code that implements buttons and windows and is easily accepted by
developers. The code is pre-written, pre-documented and pre-tested.
However, more than code can and should be reused, if full reuse is to be
achieved. All information produced during the software life-cycle should
be reusable to some extent and tools should be available to the developer
to help him or her in the reuse of this information. Typically, the knowl-
edge used and produced at the earlier development stages of software
tends to be expressed and presented in a human or human-manipulable
language, while in the later stages of the development process represen-
tations are closer to, or actually are, computer languages. Neither form is
problem-free for reuse.
Often the software library will not have a suitable component (code
is very specific). Developers may find the component difficult to under-
stand. Any changes other than very minor ones may involve reverse-engi-
neering the component to reach a state where it can be reliably modified.
Often the testing advantage is lost, since the component has to be modi-
fied or is being used in an environment different enough from its devel-
opment domain to warrant retesting.
Industry View
The increased size of the global market increases the potential for the
number of units to be sold. This permits a business to justify increased
capital investment in a product while lowering per-unit price, if penetra-
tion of a large percentage of the now larger potential market can be
ensured. Penetration, though, is crucially linked to being first to the mar-
ket. Software developers, therefore, find themselves in a situation of cop-
ing with the commodity pricing of high-capitalization software with mar-
ket share strongly linked to the speed of introduction. This trend is favor-
able to software reuse because reuse practices support the economical
capitalization of development effort in a manner that can accelerate the
introduction of new products to the market.
Experience in software development has frequently shown that the
challenge to software reuse is less the development of new programming
languages or technologies, but rather the way an organization rewards
software reuse on the part of its software engineers. Motorola software
engineers in one division were given financial rewards for storing assets
in a library and then given significant further rewards each time the asset
was used in another product. Those incentives helped that division
become the premier example of software reuse in Motorola.
Government View
Companies often see their approach to internal software reuse standards
as important to their competitive advantage. Some of these companies
will keep these software reuse standards confidential. Government agen-
cies are more likely to want to share reuse libraries with the public.
Government agencies have played a particularly active role in advancing
reuse standards and libraries that are shared with the public.
By law, United States government procurements are required to be
fair, a virtue valued beyond even effectiveness. In general, government
cannot award a follow-on contract to a company simply because it per-
formed well on the predecessor program—a fair competition among
interested parties is required and the award will be made based on some
combination of proposed actions and estimated cost rather than track
record. In this sort of contracting environment, government has an essen-
tial need to ensure that one contractor can reuse the products of previous
contracts.
Costs and Benefits
Reuse may involve significant change to traditional practice, and there
are a number of challenges to overcome in achieving its full benefits.
Making software that is reusable generally requires investment above and
beyond that required for a one-time system. This effort goes into making
the software more flexible, ensuring its quality, and providing additional
documentation. Each organization must make decisions about how the
investment is supported.
Today’s usual contracting methods can create a disincentive for con-
tractors to reuse existing software or to provide software for reuse by oth-
ers. Legal issues arise over liabilities and warranties. Responsibility for
maintenance must be identified.
Reuse should reduce maintenance cost. Because proven parts are
used, expected defects are fewer. Also, there is a smaller body of software
to be maintained. For example, if a maintenance organization is respon-
sible for several different systems with a common graphic user interface,
only one fix is required to correct a problem in the user interface rather
than one for each system.
Reuse should improve interoperability among systems. Through the
use of single implementations of interfaces, systems will be able to more
effectively interoperate with other systems. For example, if multiple com-
munications systems use a single software package to implement one
standard communication protocol, it is very likely that they will be able
to interact correctly—more so than when each package is written by a dif-
ferent company but is supposed to follow the same standard.
Another benefit of reuse is support for rapid prototyping. A library of
reusable components provides an effective basis for quickly building
application prototypes. With these prototypes the software group can get
customer feedback on the capability of the system and revise the require-
ments as dictated by the customer.
Analogy to Traditional Libraries
Software reuse could not have occurred more than about 50 years ago
because there was no software. Document reuse has, however, occurred
for centuries, at least. One domain of document reuse is scientific
research. There reuse by reference is fundamental. A quality research
journal article typically contains citations to about 20 other journal
articles.
A small research team may maintain its own small library.
Investment in this library may include in the first instance the purchase of
subscriptions to some journals. A member of the team may be assigned
on a part-time basis to somehow organize the journals in the library so
that others can find them.
As research teams cooperate and see the advantage to larger libraries,
they may pool their resources. In the extreme case, the national govern-
ment is convinced to establish a comprehensive library. One example of
such a library within the medical domain is the National Library of
Medicine (NLM) in the U.S.A. NLM subscribes to all 20,000 of the
world’s biomedical journals. A continual and extensive quality assess-
ment of these journals selects the 3,000 best from the 20,000 and every
article within those 3,000 is indexed with about 10 concepts from a the-
saurus (Bachrach and Charen, 1978). The thesaurus itself contains about
100,000 concepts and is maintained by a full-time staff of about 10
people. The indexing section of NLM employs about 400 full-time, pro-
fessional indexers. The results of indexing are distributed world-wide via
paper publications, electronic network and CD-ROM. In short, the
national level effort to maintain a kind of reuse library is a massive effort.
Prior to the summer of 1997, NLM charged users a small fee for hourly
connect time to the online database. Since the summer of 1997 access to
this online library is completely free to anyone in the world.
The parallels of the traditional library situation to the software reuse
situation are instructive. Software teams that begin a reuse effort will naturally
start with a small library. For software, the counterpart of the journal article
citation might be a call to a program in the library. As the size of the
software library and the number of its users grow, the importance of a
systematic approach to the library also increases. National or internation-
al efforts may ultimately be the most appropriate.
Researchers who use the NLM system may also write journal articles
which would ultimately be indexed in the NLM system. For publicly-
funded, medical researchers a quantitative measure of success is the num-
ber of published, journal articles. For commercially-funded, medical
researchers the objective may instead be to suggest methods or products
which the commercial body can later exploit on the marketplace.
Accordingly, the commercially-funded researcher may be forbidden from
publishing some research results. For instance, a drug company may not
want its researchers to publish work about the new drug which the com-
pany is investigating.
Much software is made by companies that do not want to freely con-
tribute their products to a library for other companies to use. The exam-
ple of national libraries of research literature suggests an approach to
software reuse. The government could require that successful bidders for
a government software development contract would provide the product
to a public reuse library. This kind of approach is being taken by the
American Department of Defense and may be an important step in the
wider acceptance of software reuse methods.
Epilogue
The US Army finished in 1997 an extensive survey of Army personnel
responsible for software reuse (Army, 1997b). The survey covered the
major areas of Reuse Management, Reuse Education, Domain Analysis,
Domain Implementation, and Reusable Asset Acquisition. These are the
major areas of interest as regards reuse. The results of the study are con-
sistent with the experiences of other organizations and are the basis for
the final recommendations of this book.
Many projects reported that they had created a working group to
address reuse-related issues and were using reuse language in Requests
for Proposals. However, overall adherence to reuse management guide-
lines varied widely from one part of the organization to another part of the
organization. The recommendation is for further establishment of con-
nections among projects so that reuse efforts can be further harmonized.
The greatest potential for improvement was documented in the area
of education. Very few employees have received any significant level of
formalized reuse education. However, 70% of staff requested reuse edu-
cation. Extension of reuse education opportunities should be a high pri-
ority.
The domain analysis part of the survey highlighted
the importance of high-level domain analysis. This domain analysis
should occur not only within an area such as “command and control” but
across areas to highlight opportunities for horizontal reuse—namely,
reuse across areas. Such further domain analysis would be consistent with
the emphasis in software engineering on frameworks or patterns which
require high level domain analysis activities. The areas of the organiza-
tion which show the most overlap in the domain analysis results should
be the most appropriate for reuse across areas.
Within domain implementation, both opportunistic and systematic
reuse activities were reported. To further promote a systematic approach
which contributes to greater cost benefits, reuse needs to be incorporated
earlier in the software development life cycle. This approach must
include varied types of products, such as requirements, designs, architec-
tures, models, application program interfaces, schema, and tests. For each
of these different product types, a common set of reuse metrics must be
developed, documented, collected, and evaluated.
With regard to Reusable Asset Acquisition, an effort needs to be made
to extend the availability of reuse repositories and encourage quality
donations via an incentive program. This will facilitate an increase in sys-
tematic reuse as more products of higher quality and various types are
made available. The increasing availability of information across com-
puter networks, particularly the web, leads to the prediction that the con-
tents of reuse repositories will improve in quality and increase in size.
However, efforts to reduce the cost of acquisitions imply a reduction in
the ability to require reuse features in delivered products. The tension
remains between the desire to get a product finished and the desire to con-
tribute a flexible asset to a reuse library.
This book does not per se provide reusable software assets. It does
provide a kind of domain analysis of the area of software reuse itself.
Furthermore, it serves an education function. The book is available for
free across the Internet. The book is also part of an online course. The
goal of the author is to work with others to build a virtual information
technology college for which reuse will be a fundamental tenet.
Contributors to the Virtual Information Technology College (Rada,
1997) will include students, teachers, and administrators. Additionally,
software developers can contribute to the College by providing software
assets that become part of the College. These assets must be provided in
such a way that they are open to inspection and fit into the domain model
for the College. The infrastructure of the College will be a kind of library
of reusable software modules itself.
The financial incentive for contributing to the infrastructure will be
based on a revenue sharing scheme. The College will be self-financing
and each contributed asset that is used will earn a certain percentage of
the revenue stream that comes to the College. Of course, this same rev-
enue stream has to pay the human teachers and administrators of the
College, but the software is expected to play a very active role in the
running of the College, and the developers of such software need to be
adequately rewarded. Normally an organization has difficulty adequately
defining the financial incentive for software reuse activities, but this
challenge will be addressed directly by the Virtual Information
Technology College. Other virtual organizations could follow this same
model.
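The revenue sharing scheme can be made concrete with a small sketch. The text does not say how the percentage for each asset is computed. The sketch below assumes, purely for illustration, that a fixed fraction of revenue is reserved for teachers and administrators and that the remainder is divided among contributed assets in proportion to how often each was used. The function name, the `staff_fraction` parameter, and the asset names are hypothetical.

```python
def share_revenue(revenue, usage_counts, staff_fraction=0.5):
    """Split a revenue stream between staff and contributed software assets.

    Assumption (not from the text): staff receive a fixed fraction of the
    revenue, and the rest is divided among assets in proportion to usage.
    Returns (staff_pool, payouts), where payouts maps asset name to amount.
    """
    if not 0.0 <= staff_fraction <= 1.0:
        raise ValueError("staff_fraction must be between 0 and 1")
    staff_pool = revenue * staff_fraction
    asset_pool = revenue - staff_pool
    total_uses = sum(usage_counts.values())
    if total_uses == 0:
        # No asset was used, so no asset earns a share; in this sketch the
        # whole revenue stream then goes to the staff pool.
        return revenue, {name: 0.0 for name in usage_counts}
    payouts = {
        name: asset_pool * uses / total_uses
        for name, uses in usage_counts.items()
    }
    return staff_pool, payouts


# Hypothetical usage: two contributed assets with different usage counts.
staff, payouts = share_revenue(
    1000.0, {"quiz_engine": 30, "grade_book": 10}, staff_fraction=0.6
)
print(staff, payouts)
```

A proportional rule is only one possible policy. Any rule that ties payment to actual use would serve the same purpose of rewarding developers whose assets are reused.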
Students in the College will be able to examine the infrastructure of
their own college to see examples of systematic software reuse. Software
reuse will be a key topic in education in this new college. Students who
graduate from this College will be better acclimated to a culture of reuse
and will contribute more effectively to reuse within other organizations.
Appendix I
Selected Glossary
A
application domain: The knowledge and concepts that pertain to a par-
ticular computer application area. Examples include battle management,
avionics, and nuclear physics. Each application domain can be decom-
posed into more specialized subdomains where the decomposition is
guided by the overall purpose or mission of systems in the domain.
application engineering: The development or evolution of a system to
meet particular application requirements.
application generator: A software tool that generates software work
products from nonprocedural user specifications of desired capability.
asset: A unit of information of value to a software engineering enterprise.
Assets can include a wide variety of items, such as software life cycle
products, domain models, processes, documents, and case studies.
asset base: A coherent set of assets, addressing one or more domains and
residing in one or more asset libraries.
C
certification: The process of determining to what extent something can be
trusted to satisfy its requirements without error.
chief programmer team: A group of people who work together under the
guidance of a chief programmer with key support from the team’s librar-
ian.
component: Synonymous with asset.
D
design: The process of defining the software structure, components, mod-
ules, interfaces, and data for an application system to satisfy specified
requirements.
document: Any information product, such as a requirements document or
a computer program.
document-oriented system: A system in which the integrity of documents
is paramount and their available structure in, for instance, tables of
contents is needed to provide overviews. In such a system documents
are often located by a string search.
domain: An area of activity or knowledge. A number of different classi-
fication schemes have been proposed for domains; some of the classes of
domains that have been identified include: application, horizontal, and
vertical.
domain analysis: The process of identifying, collecting, organizing, ana-
lyzing, and modeling domain information by studying and characterizing
existing systems, underlying theory, domain expertise, emerging technol-
ogy, and development histories within a domain of interest. A primary
goal is to produce domain models to support the development and evolu-
tion of domain assets.
domain engineering: The development and evolution of domain-specific
knowledge and assets to support the development and evolution of appli-
cation systems in a domain. Includes engineering of domain models,
architectures, components, generators, methods, and tools.
domain model: A definition of the characteristics of existing and envi-
sioned application products within a domain in terms of what the prod-
ucts have in common and how they may vary.
domain-specific reuse: Reuse in which the reusable assets, the develop-
ment processes, and the supporting technology are appropriate to, and
perhaps developed or tailored for, the application domain for which a sys-
tem is being developed.
G
generation: A technique or method that involves generating software
work products from nonprocedural user specifications of desired
capability.
H
horizontal domain: The knowledge and concepts that pertain to particu-
lar functional capabilities that can be utilized across more than one appli-
cation domain. Examples include user interfaces, database systems, and
statistics. Most horizontal domains can be decomposed into more spe-
cialized subdomains where the decomposition is often guided by charac-
teristics of the solution software.
I
interpretive indexing: The assignment of concepts to a document to indi-
cate its fundamental meaning.
L
legacy systems: Software systems in domains of interest that can impart
legacy knowledge about the domains and feed domain analysis or reengi-
neering efforts to produce domain assets or new application systems.
library: A collection of components, together with the procedures and
support functions required to provide the components to users.
library data model: The information (sometimes called meta-data) that
describes the structure of the data in an asset library.
life cycle: The stages a software or software-related product passes
through from its inception until it is no longer useful.
life cycle model: A model describing the processes, activities, and tasks
involved in the development and maintenance of software and
software-related products, spanning the products’ life cycles.
M
methodology: A set or system of methods and principles for achieving a
goal such as producing a software system.
O
object-oriented system: A system in which objects and their relations are
paramount. Hierarchical relations are particularly important as they sup-
port inferencing along inheritance paths.
opportunistic reuse: The ad hoc reuse of assets in the development of
software systems using a software development process that has not been
altered to accommodate systematic reuse. In opportunistic reuse, the
developer determines where reuse can be applied to develop a software
system without the organized use of domain engineering products during
successive stages of a software engineering process.
organizing: The collecting, analyzing, indexing and storing of informa-
tion so that it can be easily accessed later.
P
portability: The extent to which a software component originally devel-
oped on one computer and operating system can be used on another com-
puter and operating system.
precision: A measure of the ability to reject non-relevant materials.
process: A description of a series of steps, actions, or activities to bring
about a desired result.
process-driven software engineering: An approach in which software is
developed or evolved in accordance with well defined, repeatable
processes that are subject to continuous measurement and improvement
and are enforced through management policies.
Q
query: A request for identification of a set of assets, expressed in terms
of a set of criteria which the identified items must satisfy.
R
recall: A measure of the ability of a system to retrieve relevant documents.
reorganizing: The tailoring of information to suit a new purpose after that
information has been first organized into a library and then retrieved from
that library.
requirement: A condition or capability that must be met or possessed by
a software system or software-related product.
retrieval system: An automated tool that supports classification and
retrieval of assets.
reusability: The extent to which information can be reused.
reuse: The application of existing information. In software engineering,
reuse usually involves the application of information encoded in soft-
ware-related work products. A simple example of the reuse of software
work products is reuse of subroutine libraries for string manipulations or
mathematical calculations. A simple example of the reuse of information
not encoded in software work products is consultation with a human
expert to obtain desired knowledge.
reuse-based software engineering: An approach to software-intensive
system development in which systems are constructed principally from
existing software assets rather than through new development.
reuse cycle: One pass through the Reuse Planning, Enactment, and
Learning processes in a particular reuse program.
reuse infrastructure: The collection of capabilities that is needed to sup-
port and sustain reuse projects within a reuse program. Includes tools and
technology; organizational structure, policies, and procedures; and edu-
cation and training.
reuse library: A set of assets and associated services for accessing and
reusing the assets. A library typically consists of assets, corresponding
asset descriptions, a library data model, and a set of services (manual or
automated) for managing, finding, retrieving, and reusing assets. Such
services can include reuse consultation services.
reuse library interoperability: The ability of two or more distinct,
heterogeneous software reuse libraries to dynamically provide access to
each other’s assets, asset descriptions, and other available information.
reuser: An individual or organization that reuses assets.
reverse engineering: The process of analyzing a computer system’s
software to identify components and their interrelationships.
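The glossary entries for precision and recall can be illustrated with the ratios conventionally used in information retrieval: precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that are retrieved. The glossary itself gives no formulas, so the sketch below rests on that conventional reading. The function name, the asset names, and the choice to report 0.0 for an empty set are assumptions made for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query against a reuse library.

    retrieved: identifiers of the assets the retrieval system returned.
    relevant:  identifiers of the assets that actually satisfy the need.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant assets that were retrieved
    # Assumption: an empty denominator is reported as 0.0 in this sketch.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: four assets retrieved, three assets relevant.
p, r = precision_recall(
    retrieved=["sort", "search", "parse", "hash"],
    relevant=["sort", "search", "merge"],
)
print(p, r)
```

Here two of the four retrieved assets are relevant, and two of the three relevant assets were found, so the two measures differ.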
S
software engineering environment: The computer hardware, operating
system, tools, and encoded processes and rules that an individual soft-
ware engineer works with to develop a software system.
specification: A document or formal representation that prescribes, in a
precise manner, the requirements, design, behavior, or other characteris-
tics of a software product.
T
tailoring: The process of adapting products for application in new, spe-
cific situations.
thesaurus: A set of concepts in which each concept may have hierarchi-
cal and associative relations to other concepts. A concept is labeled with
a preferred term. Synonymous or non-preferred terms are also provided.
traceability: The characteristic of software-related products that docu-
ments the derivation path.
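The thesaurus entry above describes a data structure, which can be sketched as follows. Each concept carries a preferred term, optional non-preferred terms, hierarchical relations (recorded here as broader concepts), and associative relations. The class names, field names, and sample concepts are hypothetical and serve only to illustrate the definition.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    preferred: str                                   # the preferred term
    non_preferred: set = field(default_factory=set)  # synonyms and variants
    broader: set = field(default_factory=set)        # hierarchical relations
    related: set = field(default_factory=set)        # associative relations


class Thesaurus:
    def __init__(self):
        self.concepts = {}  # preferred term -> Concept
        self._entry = {}    # any term, lowercased -> preferred term

    def add(self, concept):
        self.concepts[concept.preferred] = concept
        for term in {concept.preferred, *concept.non_preferred}:
            self._entry[term.lower()] = concept.preferred

    def preferred_term(self, term):
        """Map a preferred or non-preferred term to its preferred term."""
        return self._entry.get(term.lower())

    def narrower(self, preferred):
        """List concepts that name the given concept as a broader concept."""
        return sorted(
            c.preferred for c in self.concepts.values() if preferred in c.broader
        )


# Hypothetical concepts for a small library of reusable components.
thesaurus = Thesaurus()
thesaurus.add(Concept("algorithms"))
thesaurus.add(Concept("sorting", non_preferred={"ordering"},
                      broader={"algorithms"}, related={"searching"}))
print(thesaurus.preferred_term("Ordering"), thesaurus.narrower("algorithms"))
```

Mapping every term to its preferred term lets indexers and searchers use different words for a concept and still meet at the same entry.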
W
word frequency indexing: An automatic assignment of words to a document
based on their frequency of occurrence in the document.
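Word frequency indexing, as defined above, can be sketched in a few lines. The sketch counts the words in a document, discards very common words, and assigns the most frequent remaining words as index terms. The stop-word list, the `top_n` cutoff, and the function name are assumptions made for illustration; the glossary does not prescribe them.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list, not a standard one.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "for", "that", "it"}
)


def word_frequency_index(text, top_n=3, stop_words=STOP_WORDS):
    """Return the top_n most frequent non-stop words in text as index terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in stop_words)
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical document text.
document = "Reuse of software reduces cost. Software reuse needs a reuse library."
print(word_frequency_index(document, top_n=2))
```

Unlike interpretive indexing, this assignment requires no judgment about meaning, which is why it can be automated.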
Appendix II
References
Index
documents; Overview documents; Requirements; Software-related documents.
  components, 109
  outline, 94
  reorganization, 131
DoD. See United States Department of Defense.
Domain, 232. See also Horizontal domain.
  analyst, 7
  engineering, 232
  implementation, 229
  modeling, 7, 56
  models, 97-101, 106, 232
Domain analysis, 6-7, 184, 229, 232
  processes, 56
  standard, 84
Domain Engineering Guidebook, 79
Domain-related information, 154
Domain-specific approach, 49
Domain-specific kits, 194
Domain-specific reuse, 232
  kits, 196
DRACO, 175, 176
E
Education, 187, 197
Educational methodology/motivation, 3
Educational object economies (EOE), 217-220
  history, 218-219
  plans, 219-220
Educational objects, 219
Embedded firmware products, 194
Engineering, 1. See also Domain; Reverse engineering; Software reengineering.
  concerns, 221
  perspectives. See Assets.
English, 100
Enterprise. See Small enterprise.
EOE. See Educational object economies.
European Commission, 154
European Union, 66
Evaluation
  criteria, 74-75
  stage, 68
Exceptions, 23
Executable code, 61
Existing related standards, 74-80
Expectations, 72-74. See also User expectations.
External system behavior, 17
Extraction algorithm, 172
F
Facet, 100
  term, 100
Faceted classification, 99-101, 198
File mode, 158
Files, 203
Filters, 137
Financial incentive, 230
Fingertip reuse, 191, 198
Format converters, 210
Fortran, 221
Fourth Generation Languages, 176
Framework concepts, 51
Frameworks, 104-105. See also Generic framework; Reuse framework; Software Productivity Consortium.
Free-text retrieval, 111
Full-text searching, 112
Functional testing, 24
G
Generated programs, 139
Generation, 232
Generative approach, 5, 176
Generic framework, 51
Generic packages, 132
GIF, 137
Global dependencies, 192
Global workspaces, 153
Gopher, 120
Government agencies, 224
Government procurements, 224
Grass roots phase. See Motorola reuse.
Grass-roots project, 191
Group productivity, 39
GTE Asset Management Programme, 182
Guidance, 75
Guide for Reusable Software. See Aerospace Applications assessment criteria.
Guidelines for Successful Acquisition and Management of Software Intensive Systems. See Software Intensive Systems.
H
HAM. See Hypertext Abstract Machine.
Hardware platforms, 211
Heuristic methods, 174
Hewlett-Packard (HP), 178-181, 199, 220
  corporate reuse strategy, 193
  HP-VEE, 196
  reuse, 193-196
National Aeronautics and Space Administration (NASA), 69
National Institute of Standards and Technology, 80
National libraries, 228
National Library of Medicine (NLM), 227, 228
National Science Foundation (NSF), 218
NATO. See North Atlantic Treaty Organization.
Natural language requirements, 18
NEC Software Engineering Laboratory, 182
Neptune, 152, 154
Net saving NSR, 65
Netscape, 224
NLM. See National Library of Medicine.
Non-code documents, 130
Non-hierarchical relation, 95
Non-operational requirements, 19
Non-preferred terms, 97, 98
Non-technical factors, 68
Normative advice, 74
Normative documents, 74
North Atlantic Treaty Organization (NATO), 51, 86
Not Invented Here Syndrome, 234
NSF. See National Science Foundation.
NSR. See Net saving NSR.
O
Object economies. See Educational object economies.
Open System for Collaborative Authoring and Reuse (OSCAR), 211-217
  services, 211
OpenDoc, 105
Operating systems, 151, 211, 212
Opportunistic reuse, 234
Optical character recognition software, 207
Organisms, 144
Organization, 86, 89, 91. See also Benevolent organization; Code organization; High-level organization; Software team organization.
Organizing, 234
Original author, 142
OSCAR. See Open System for Collaborative Authoring and Reuse.
Outline-extraction, form, 103
Outlines, 168. See also Documentation; Documents; High-level outline.
  relations, 95
Overview documents, 118
Owner, 66
P
Packages, 114. See also Generic packages.
Paradox. See Reorganizing.
Parallel-processing computer, 42
Parameters, 137
Parts centers, 192
  responsibility, 191
Parts sources, 187
Patents, 67
  chip, 23
  client requirements, 15
  configuration management, 122
  domain architecture, 83
  engineers, 27, 197, 209
  factory, 196
  items, classification/retrieval, 89
  maintenance, 27
  project modeling, 42-44
  tools, 150
  users, 73
Software assets, 121
  managers, 73
Software components, 99, 125, 172
  descriptions, 171
Software Crisis, 1
Software development, 36
  cycle, 24
  life cycle, 112, 125
  rethink, 3
Software engineering, 198. See also Process-driven software engineering; Reuse-based software engineering.
  environment, 236
Software Engineering Institute. See Carnegie-Mellon University.
Software Engineering Standards Committee (SESC), 32, 33, 76
  Master Plan, 33, 71, 72, 80, 82
  Master Road Map, 80, 81
  Program Elements, 80, 81
  Survey, 33
  Technic Standards, 80, 81
Software Intensive Systems, acquisition/management guidelines, 78
Software life cycle, 13, 15, 94, 107, 222
  design, 20-22
  implementation, 22-23
  maintenance, 27
  requirements, 17-19
  standards, 28-34
  testing/documentation, 24-27
Software Process Improvement and Capability dEtermination (SPICE), 32-34, 85, 87
Software Productivity Consortium, 79
  Reuse Adoption Guidebook, 79, 84
  Synthesis framework, 79
Software reengineering
  conclusion, 221
  costs/benefits, 226-227
  government view, 224-226
  industry view, 222-224
  recommendations, 229
Software reengineering, introduction, 1
  need, 1-3
Software reuse, 127. See also Rewards software reuse.
  practice, 75
  tools, 149
Software Reuse Business Model, 78
Software Reuse Guidelines, 79
Software Reuse Initiative, 77
Software team organization, 35-39
Software Technology for Adaptable Reliable Systems (STARS), 51, 77, 84, 225, 226
Software-related documents, 60
Sorting, 111
Source code, 152
  configuration management tool, 190
Specialization, 132-133, 227
Specification, 236
SPICE. See Software Process Improvement and Capability dEtermination.
SQL, 172
  relational database management system, 155
Stable interfaces, 104
Standardization, 78
  efforts, 74
  suitability, 85
Standards, 71, 202. See also American standards; Courseware reuse; Documentation; Domain analysis; Existing related standards; Policy standards; Potential new standards; Process; Reuse; Software life cycle.
  application, 187
  conservative approach, 82
  development organizations, 86
  expectations, 71
  international recognition, 87
  kinds, 88
  recommendations, 80-86
  rejected alternatives, 85-86
STARS. See Software Technology for Adaptable Reliable Systems.
Start-up costs, 63
Steel mills, 183
String searching, 116
Structural indexing, 93
Structural testing, 24
Structure clashes, 103
Structure relation, 95
Student, performance, 205
X
X Bitmaps, 137
X Windows, 221. See also Interface Architect.
  xman, 119
Y
Yellow Pages, 191
Z
Z-Schema, 114, 139