DeMIMA A Multilayered Approach For Design Pattern Identification
Abstract—Design patterns are important in object-oriented programming because they offer design motifs, elegant solutions to
recurrent design problems, which improve the quality of software systems. Design motifs facilitate system maintenance by helping
maintainers to understand design and implementation. However, after implementation, design motifs are spread throughout the source
code and are thus not directly available to maintainers. We present DeMIMA, an approach to semiautomatically identify
microarchitectures that are similar to design motifs in source code and to ensure the traceability of these microarchitectures between
implementation and design. DeMIMA consists of three layers: two layers to recover an abstract model of the source code, including
binary class relationships, and a third layer to identify design patterns in the abstract model. We apply DeMIMA to five open-source
systems and, on average, we observe 34 percent precision for the 12 design motifs considered. Through the use of explanation-based
constraint programming, DeMIMA ensures 100 percent recall on the five systems. We also apply DeMIMA on 33 industrial
components.
• Y.G. Guéhéneuc is with the Département d’Informatique et Recherche Opérationnelle, Université de Montréal, C.P. 6128, succ. Centre Ville, Montréal, Québec, H3C 3J7, Canada. E-mail: [email protected].
• G. Antoniol is with the Département d’Informatique, Ecole Polytechnique de Montréal, C.P. 6079, succ. Centre Ville, Montréal, Québec, H3C 3A7, Canada. E-mail: [email protected].
Manuscript received 18 Apr. 2007; revised 1 Apr. 2008; accepted 29 May 2008; published online 27 June 2008.
Recommended for acceptance by R. Taylor.
For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TSE-2007-04-0133.
Digital Object Identifier no. 10.1109/TSE.2008.48.

1 INTRODUCTION
Maintainers must be aware of design choices in order to modify an object-oriented software system appropriately. Design choices include all decisions made by developers when designing and implementing the system: the structures of classes and the relationships among them. However, design choices are often scattered in the source code of systems after implementation because, with available object-oriented programming languages, they do not transcribe directly into source code; developers must write several lines of code using constructs of the languages to implement their choices. Moreover, documentation is often obsolete, if it even exists, and these choices are thus lost.

However, design choices are often implemented with recurring patterns, “a form or model proposed for imitation” [1], to facilitate writing and understanding the source code. Idioms and design patterns are two types of patterns; architectural patterns and micropatterns are others. Idioms are low-level patterns specific to some programming languages and to the implementation of particular characteristics of classes or their relationships. They are intraclass patterns describing typical implementation of, for example, relationships, object containment, and collection traversal. Design patterns [2] are recurring interclass patterns that define solutions to common design problems in the organization of classes. They are “tactics” that generate the structure and behavior of classes and their relationships [3]. They influence the design of modules and classes but not the overall architecture. They are defined in terms of classes and relationships; thus their implementation uses idioms.

We use the term motif to express the solution of a pattern as “a reliable sample of traits, acts, tendencies, or other observable characteristics” [1]. We distinguish between patterns and motifs because patterns often encompass information that is not readily available for their identification. For example, the Composite design pattern [2, p. 163] also includes information about its intent, motivation, applicability, and consequences, which are not observable characteristics. Only its structure, its participants, and their collaborations are observable in the source code. Thus, strictly speaking, we cannot use the terms design pattern “identification,” “detection,” or “instantiation” but rather the instantiation and identification of microarchitectures similar to some motifs; thus, we use the term “design motif identification” for the process traditionally called design pattern identification.

We define the term microarchitectures as concrete manifestations of some motifs in the implementation of a system. A microarchitecture is composed of classes, methods, fields, and relationships having structure and organization similar to one or more motifs. A microarchitecture can be similar to more than one motif because only developers may decide intent, motivation, and consequences.

Developers usually search for some kinds of patterns in order to understand a system [4]; by recognizing concrete manifestations of these patterns, they deduce, from their experience, the design choices underlying the presence of motifs in the source code. During maintenance and evolution, maintainers would greatly benefit from knowing the design choices made during implementation, see, for example, [5].

To support design pattern identification and program comprehension, we combine and extend our previous work [6], [7], [8] in a new multilayered approach named the Design Motif Identification Multilayered Approach (DeMIMA). DeMIMA makes it possible to recover two kinds of design
Authorized licensed use limited to: ULAKBIM UASL - Izmir Ekonomi Univ. Downloaded on March 20,2024 at 21:03:58 UTC from IEEE Xplore. Restrictions apply.
0098-5589/08/$25.00 © 2008 IEEE Published by the IEEE Computer Society
668 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 34, NO. 5, SEPTEMBER/OCTOBER 2008
choices from source code: idioms pertaining to the relationships among classes and design motifs characterizing the organization of the classes. DeMIMA is extensible and scalable; it ensures traceability between motifs and source code by first identifying idioms related to binary class relationships to obtain an idiomatic model of the source code and then using this model to identify design motifs and generate a design model of the system. On average, we observe 34 percent precision for the 12 design motifs considered and the five open-source systems on which we apply our approach. DeMIMA ensures 100 percent recall on the five systems. We also apply DeMIMA on industrial system source code and designs.

The remainder of the paper is organized as follows: In Section 2, we give an overview of the approach and justify its rationale. In Section 3, we summarize related work and present essential characteristics of the identification steps. In Sections 4.2 and 4.3, we describe our approach and discuss its characteristics. In Section 6, we apply the approach on a testbed of open source and industrial systems. In Section 7, we summarize our work and discuss future challenges.

2 DESIGN MOTIF IDENTIFICATION
2.1 Context
We have broken down the comprehension process that maintainers use to identify recurring motifs in the source code into three tasks.

1. Identifying a microarchitecture $A$ similar to some motifs from a set of known patterns $S_{DP}$. Maintainers analyze a system source code $S$, either manually or using tools, and identify subsets of the source code that are similar to known motifs.
2. Contextualizing $A$ to keep a unique motif from $S_{DP}$ using semantic data extrinsic to $S$. Maintainers choose in $S_{DP}$ the pattern $DP$ whose corresponding motif $DM$ is embodied by $A$. Contextualization depends on the system domain and on the maintainers’ experience and understanding of the system.
3. Comprehending $S$. Maintainers deduce from $DP$, whose motif $DM$ was manifested by $A$ during the implementation of $S$, the design choice behind $A$, including the intent and motivation of the developers and the consequences on the overall system design.

Because tasks 2 and 3 depend on the maintainers’ experience and the system domain, they are difficult to automate. In contrast, task 1, which is tedious and error prone [9], [10], is a good candidate for automation.

2.2 Problem
Design motifs are described with UML-like class and sequence diagrams (see footnote 1), which represent different aspects of software systems [11]. Class diagrams are global models of systems, representing their entities and the relationships among entities, while sequence diagrams specify local interactions in entities and sequences of method calls among entities.

1. Design motifs notation borrows from OMT class diagrams, OBJECTORY interaction diagrams, and the BOOCH method [2].

In the rest of this paper, we only consider class diagrams because they are most frequently used to describe design motifs [12]. Also, class diagrams are often produced early in the development cycle and are the sole reliable documentation because they can be reverse engineered with reasonable accuracy. We will use other information in future work.

DeMIMA assists maintainers in task 1 by providing a three-step identification process of a design motif $DM$ in the source code $S$ of a system based on UML-like class diagram models:

1. Model the source code $S$ as a model $M_S$ using a subset of the language used to describe models of motifs and including all of the constituents corresponding to constructs of $S$, as explained in Section 4.1.
2. Enrich model $M_S$ with idioms that reveal binary class relationships to obtain a model $M_I$, which uses the same language used to describe models of motifs, as detailed in Section 4.2.
3. Enrich the model $M_I$ through the following three substeps, as shown in Section 4.3:
   a. Build a model $M_{DM}$ of a motif $DM$ as a class diagram with the formalism used to describe $M_I$.
   b. Identify microarchitectures similar to $M_{DM}$ in $M_I$. A microarchitecture $A$ might be either a complete form if its entities and their relationships match one to one the entities and relationships in $M_{DM}$ or an approximate form if they do not, e.g., if a suggested relationship between two entities does not exist.
   c. Instantiate a model $M_D$ based on $M_I$ and enriched with models $M_A$ of the identified microarchitectures.

Any approach to design motif identification should maintain a traceability link between the different layers from source code up to the identified microarchitectures:

$$S \xrightarrow{1} M_S \xrightarrow{2} M_I \xrightarrow{3} M_D\{M_A\}, \tag{1}$$

where $\xrightarrow{x}$ describes the $x$th layer to produce the next model.

Example. In the rest of this paper, we use the simple example taken from [6] and shown in Fig. 1 to illustrate the different steps performed by DeMIMA. The example uses two classes, C1 and C2, linked by an aggregation relationship. The aggregation relationship exists through the field C2 c2 and the void operation1() method body.

We want to identify in this source code any microarchitecture similar to the design motif represented by the UML-like class diagram in Fig. 2a. Thus, we need to first recover a model $M_S$ of the system, then refine this model into $M_I$, which includes the aggregation relationship, and, finally, model and match the motif $M_{DM}$ against $M_I$ to create a model $M_D$, which includes the result of the matching, $M_A$, as shown in Fig. 2.

2.3 Our Solution
In DeMIMA, we characterize the constituents of class diagrams and propose algorithms to identify these constituents in source code. Basically, class diagrams consist of classes, fields, methods, interfaces, inheritance, and implementation relationships. We concur with Dave Thomas that “Every model needs a metamodel” [13]. Thus, we define a metamodel, Pattern and Abstract-level Description
GUÉHÉNEUC AND ANTONIOL: DEMIMA: A MULTILAYERED APPROACH FOR DESIGN PATTERN IDENTIFICATION 669
Fig. 2. Models of the motif and of the source code for the running example. (a) The bottom part shows the UML-like diagram of a simple motif; this part, together with the upper part, represents $M_{DM}$. (b) UML-like diagram of $M_S$. (c) UML-like diagram of $M_I$ (some instantiation links are omitted). (d) UML-like diagram of $M_D$ (some instantiation links are omitted).
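The chain of models shown in Fig. 2 can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the DeMIMA implementation: the class and function names (`Entity`, `Field`, `Method`, `build_ms`, `build_mi`, `build_md`) are hypothetical, the aggregation rule is reduced to "a class declares a field of another modeled class and sends it a message", and the motif is assumed to consist of a single aggregation between two roles named `Whole` and `Part`.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    name: str
    type_name: str


@dataclass
class Method:
    name: str
    # Names of the fields that the method body sends messages to.
    receivers: list = field(default_factory=list)


@dataclass
class Entity:
    name: str
    fields: list = field(default_factory=list)
    methods: list = field(default_factory=list)


def build_ms():
    """Layer 1: hand-built model M_S of the two-class running example."""
    c1 = Entity("C1", [Field("c2", "C2")], [Method("operation1", ["c2"])])
    c2 = Entity("C2")
    return {e.name: e for e in (c1, c2)}


def build_mi(ms):
    """Layer 2: enrich M_S with (kind, source, target) relationships."""
    relationships = set()
    for entity in ms.values():
        used = {r for m in entity.methods for r in m.receivers}
        for f in entity.fields:
            # Simplified rule (assumption): a used field of a modeled type.
            if f.type_name in ms and f.name in used:
                relationships.add(("aggregation", entity.name, f.type_name))
    return {"entities": ms, "relationships": relationships}


def build_md(mi, motif):
    """Layer 3: bind the motif roles to entities; each binding is one M_A."""
    kind, role_a, role_b = motif
    matches = [
        {role_a: src, role_b: dst}
        for (k, src, dst) in sorted(mi["relationships"])
        if k == kind
    ]
    return {"idiom_model": mi, "microarchitectures": matches}


md = build_md(build_mi(build_ms()), ("aggregation", "Whole", "Part"))
print(md["microarchitectures"])  # [{'Whole': 'C1', 'Part': 'C2'}]
```

Each binding in `microarchitectures` keeps the names of the entities it came from, which is one simple way to preserve a trace from the design-level result back to the source-level model.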
and $M_D(M_A)$) can be traced back to the source code constructs in $S$ (respectively, to the constituents in $M_S$ and $M_I$) from which it originates.

3 RELATED WORK
We classify the related work according to the recovered models because obtaining and abstracting the data needed to identify design motifs is problematic. We conclude with a summary of essential characteristics of any identification approach for design motifs.

3.1 Related Work on $M_S$
Building a model of the source code is the first step of any static analysis. The objective of this step is to obtain a model of the source code that can be manipulated programmatically. This step can be performed using a readily available parser technology such as JAVACC or COLUMBUS [14].

3.2 Related Work on $M_I$
Several authors proposed approaches to extract binary class relationships, which is an important concern when building models of source code. Indeed, these relationships are not explicit constructs of mainstream object-oriented programming languages, such as C++, Java, or Smalltalk, and they lack precise definitions.

Jahnke et al. [15] and Niere et al. [16] introduced generic fuzzy reasoning nets (GFRN) to recover association relationships among entities in the context of the Fujaba project. They proposed a set of clichés from source code. Source code clichés used together with GFRN allow identifying association relationships while managing variations of implementation. Although their work is promising, the use of GFRN is complex and they consider association relationships only, not aggregation and composition relationships. More recently, Niere et al. [17] introduced an approach based on fuzzy beliefs able to recover association and aggregation relationships in large software systems while handling impreciseness.

Jackson and Waingold [18] developed WOMBLE, a tool for the lightweight extraction of object models from Java bytecodes. They described an object model as a graph wherein nodes are entities and links are binary class relationships. Relationships considered in WOMBLE are inheritance, association, and aggregation. WOMBLE includes heuristics to infer the target entities of association and aggregation relationships. This work is a source of inspiration even though it did not consider composition relationships.

In general, previous work was limited by the lack of commonly agreed upon definitions for binary class relationships. Moreover, to the best of our knowledge, no definitions of the association, aggregation, and composition relationships existed, describing how these relationships must be implemented in source code. For example, [19], [20], [21], [22] proposed definitions of these relationships, but there were no hints on their concrete implementation. Thus, the first step toward design motif identification is to define the association, aggregation, and composition relationships and to obtain models of systems that integrate these relationships. A complete survey of the subject is available in [6].

3.3 Related Work on $M_D$ (Including $M_{DM}$ and $M_A$)
Several authors proposed approaches to identify microarchitectures similar to design motifs. In general, these approaches rely on a design motif library; thus they are similar to the program understanding and architectural recovery approaches based on clichés matching and plan recognition. The main problem of these approaches, as identified by Wills’ precursor work [23] and put forward recently by Niere et al. [24], is that a design motif may appear in several different forms due to variants. Wills classifies the main sources of variants as syntactic variation, implementation variation, delocalization, organization variation, redundancy, unrecognizable code, and function sharing. Syntactic variation is mostly with regard to the syntactic level clichés. Cliché recognizers traditionally embody the knowledge of all of the different forms that a certain cliché can assume. This is not the case in our approach, where the use of explanation-based constraint programming accounts for syntactic variants. Implementation variation is related to the fact that a given concept may be implemented in different ways: An aggregation may be implemented with a list or a set or any other user-defined type. We define such relationships using language-independent properties to avoid this problem. Another example concerns the depth of the inheritance tree between a superclass and a derived class participating in a motif (see, for example, the Composite design motif). Again, the use of explanation-based constraint programming deals with such variants. The other problems highlighted by Wills (delocalization, redundancy, unrecognizable code, and function sharing) do not concern our approach.

Rich and Waters [4] proposed the use of constraint programming to recognize plans in Cobol source code. Cobol systems are modeled by their abstract syntax trees. A plan is modeled as nodes of the abstract syntax tree and constraints among nodes (control and data flow, function calls, ...). The identification of a plan in source code is converted to a constraint satisfaction problem in which nodes of the plan are variables, constraints among nodes are constraints among variables, and the source code abstract syntax tree is the domain of the variables. This work is the first account of the use of constraint programming for plan identification. However, it does not apply to design motif identification because plans are low level and it does not identify approximate forms of the plans. Nevertheless, we draw from this work two important characteristics of design motif identification: the need for explanations and for approximations [4, pp. 83 and 181].

Other approaches to design motif identification used clichés recognition algorithms such as unification, see the precursor work by Krämer and Prechelt [25]. An example is the SOUL environment [5], a logic programming environment based on Smalltalk that directly manipulates Smalltalk constructs through predicates. The SOUL environment allows direct representation of the abstract syntax tree of the Smalltalk source code managed by the underlying environment as logic facts. Using these facts, it is possible to build a library of predicates and to identify entities whose structures and organizations correspond to design motifs. However, the use of logic programming requires the definitions of predicates for all possible variants, i.e., all expected variations of implementation. The definition of all variants of implementation is cumbersome. Also, the use of logic programming does not explain the presence or absence of microarchitectures similar to design motifs.
Other authors introduced the use of queries to identify entities whose structure and organization are similar to design motifs [26], [27]. In particular, Keller et al. [27] introduced the SPOOL environment for reverse engineering, which allows manual, semiautomated, or automated identification of abstract design components using queries on source code models. A query is manually associated with an abstract design component and applied to a source code model. The main limitation of this work is the need to develop and associate queries with abstract design components manually and with each possible variant of their implementation.

Generic fuzzy reasoning nets have also been applied to the identification of design motifs [24], [28]. A design motif is described as a generic fuzzy reasoning net representing rules to identify microarchitectures similar to its implantation in source code. However, this approach has not been pursued or implemented despite its promises. Moreover, it is difficult to express design motifs as generic fuzzy reasoning nets and to modify them.

Graphs and graph-transformation techniques also have been used to describe and identify design motifs in system source code [29], [30]. A design motif is described as a graph whose nodes represent entities and whose edges represent relationships among entities. The identification of microarchitectures corresponds to a graph isomorphism: the identification of a subgraph similar to a given graph in a graph, which is a difficult problem [31]. Pettersson and Löwe [32] proposed transforming graphs of systems into planar graphs to improve performance with interesting results. An approach based on similarity scoring has also been proposed [33] which provides an efficient means to compute the similarity between the graph of a design motif and the graph of a system to identify classes potentially playing a role in the design motif. Although efficient, these approaches are not interactive, do not explain their results, and only allow a limited set of approximations.

Finally, several authors proposed dedicated syntactic analyses to identify design motifs in source code, for example, [34], [35], [36], [37]. These analyses are efficient in time, recall, and precision but are specialized to particular design motifs. We propose a more general solution that uses standard algorithms, as offered by constraint programming. Some authors, such as Heuzeroth et al. [38], combined static and dynamic analyses to improve the precision of the identification but faced the problem of the choice of the methods to instrument and of the scenarios to execute.

3.4 Summary of the Characteristics of DeMIMA
From our study of the related work, DeMIMA must possess the following characteristics:

• Models of source code must differentiate among use, association, aggregation, and composition relationships so that design motif models are as close as possible to their usual descriptions in [2].
• A given model of a design motif must serve to identify both complete and approximate forms of microarchitectures similar to the design motif without explicitly enumerating all variants.
• The algorithms must be semiautomatic or automatic and must explain the identified microarchitectures so that maintainers can direct their search to easily distinguish possible false positives.

Contributions of DeMIMA are the following: For the first time, as suggested in previous work, an approach brings a solution to the identification of microarchitectures similar to design motifs using commonly agreed-upon definitions of the unidirectional binary class relationships, unique representations of design motifs, and semiautomated and/or automated algorithms explaining identified microarchitectures. Thus, it complies with the characteristics of the identification of microarchitectures similar to design motifs. In particular, explanation-based constraint programming explains identified microarchitectures for maintainers to direct their search and discriminate among possible false positives easily. Explanation and constraint relaxation lead to interactive or automatic algorithms while naturally tackling the problem of variants identified by Wills [23].

4 MULTILAYERED APPROACH
DeMIMA relies on a multilayered approach, detailed in the following sections.

4.1 First Layer: Source Code Model $M_S$
The first layer consists of an infrastructure, e.g., parsers, to obtain models $M_S$ of the source code of systems. $M_S$ is expressed using the language defined by the metamodel shown in Fig. 3 (Part 1 exclusively) and inspired by UML. It includes all of the constituents found directly in any Java object-oriented system: class, interface, member class and interface, method, field, inheritance and implementation relationships, and rules controlling their interactions. The constituents describe the structure of systems and a subset of their behavior. The main constituents in the metamodel and their relationships are the following:

• Class Entity, to describe entities of a system. An entity might be a Class or an Interface.
• Class Element, to describe elements of entities. An element might be a Method or a Field.

A model of a system is an instance of class ProgramModel. It contains a set of entities, each of which contains a set of elements.

We have implemented the first layer to cope with any number of parsers for various programming languages (e.g., C++ and Java) and produce an instance of ProgramModel representative of the parsed source code:

$$S \xrightarrow{1} M_S. \tag{2}$$

Example. Fig. 2b shows a UML-like diagram of the model $M_S$ of the source code illustrated in Fig. 1, as well as the instantiation links between the objects in $M_S$ and their classes reported in Part 1 of Fig. 3.

4.2 Second Layer: Idiom-Level Model $M_I$
The second layer describes systems at a higher level of abstraction than their source code by making explicit certain programming idioms. Idioms reveal particular characteristics of classes or their relationships. For example, a class could be stereotyped as a UML Data Type according to certain idioms used in its implementation [39]. Thus, in general, idioms can implement other characteristics of classes than binary class relationships. Nevertheless, in the rest of this paper, we only study binary class relationships as they are relevant to design motif identification; the
terms idioms and binary class relationships are therefore interchangeable.

This layer provides models $M_I$ of systems in which binary class relationships are reified as first-class entities. We focus on the use, association, aggregation, and composition unidirectional binary class relationships as commonly advocated in UML-like notations because these relationships are used to describe design patterns [2]. Parts 1 and 2 (exclusively) in Fig. 3 present the language to describe idiom-level models.

4.2.1 Informal Definitions
An extensive survey of the literature related to the relationships in different domains such as database, software engineering, or reverse engineering can be found in [6]. Table 1 summarizes the definitions of the relationships used in DeMIMA from the existing links among instances. Association, aggregation, and composition are relationships among instances of classes. Relationships involving classes (not instances) are modeled as use relationships.

Let A and B be two classes. Association and aggregation relationships allow multiple instances of A and B to take part in the relationship. The composition relationship allows multiple instances of B to be in a relationship with one instance of A at a time. In an aggregation relationship, instances of A access instances of B through a field as a particular type of message receiver. In a composition relationship, instances of B are exclusive to their corresponding instances of A and instances of A and B have related lifetimes.

4.2.2 Definitions of the Properties
The definitions of the binary class relationships use four language-independent properties. We present here only information needed to explain the subsequent formal definitions; more details and examples of each property are available elsewhere [6].

An instance of class B involved at a given time in a relationship with an instance of class A may also participate in another relationship at the same time. We name $\mathbb{B}$ the set $\{\mathit{true}, \mathit{false}\}$. We define the exclusivity property $EX$ as

$$EX : \mathit{Class} \times \mathit{Class} \to \mathbb{B}.$$

Instances of class A involved in a relationship send messages to instances of class B. We name $\mathit{any}$ the set of all possible message receivers:

$$\mathit{any} = \{\text{field}, \text{array field}, \text{collection field}, \text{parameter}, \text{array parameter}, \text{collection parameter}, \text{local variable}, \text{local array}, \text{local collection}\}.$$

We distinguish three types of message receivers: fields, parameters, and local variables. Also, we distinguish “simple” message receivers from arrays and collections because they imply different sets of programming idioms for their declarations and uses and thus different identification strategies. The set $\mathit{any}$ of receivers is language independent and its elements correspond to concepts available in object-oriented programming languages, such as C++, Java, and Smalltalk. We define the receiver type property $RT$ (see footnote 2) as

$$RT : \mathit{Class} \times \mathit{Class} \to \mathit{any}.$$

2. The RT property was formerly named “invocation site” IS in [6] but is renamed to avoid confusion with the location of a method invocation.

The lifetime property $LT$ constrains the lifetime of all instances of class B with respect to the lifetime of all instances of class A. It relates to the difference between the
TABLE 1
Definitions and Applicability of the Unidirectional Relationships in Our Model
times of destruction $LT_d$ of two instances of classes A and B [21]. The time is in any convenient unit such as seconds or CPU ticks (see footnote 3):

$$LT_d : \mathit{Instance} \to \mathbb{N}.$$

3. $\mathbb{N}$ represents the set of all natural numbers.

In programming languages with garbage collection, $LT_d$ matches the moment where an instance is ready to be collected for garbage. We infer from $LT_d$ a relation between the lifetimes of all instances of two classes A and B. We name $k$ the set $\{-, +\}$:

$$LT : \mathit{Class} \times \mathit{Class} \to k.$$

The multiplicity property $MU$ specifies the number of instances of class B allowed in a relationship. We express this property (see footnote 4) as

$$MU : \mathit{Class} \times \mathit{Class} \to \mathbb{N} \cup \{+\infty\}.$$

4. We need $+\infty$ to denote multiplicities with no limit in the numbers of instances in the relationships.

The four properties are orthogonal, but the exclusivity and multiplicity properties are closely related. For example, in the Country-Language relationship, we have the following:

• The multiplicity property states the number of instances of class Language that each instance of class Country possesses:

$$MU(\mathit{Country}, \mathit{Language}) = [1, +\infty].$$

(For example, Canada possesses two official languages, English and French, and several spoken languages, Inuktitut, Punjabi, Portuguese, and so on.)
• The exclusivity property states that an instance of class Language is shared among instances of class Country and of other classes:

$$EX(\mathit{Country}, \mathit{Language}) = \mathit{false}.$$

(French is spoken in Canada, in France, ...)

Example. The values of the four properties are reported and commented on in Table 2 for the source code of the running example in Fig. 1.

4.2.3 Formalizations of the Relationships
Using $EX$, $LT$, $MU$, and $RT$, formalizations of the relationships are expressed as three conjunctions, respectively, $AS$, association, $AG$, aggregation, and $CO$, composition. The formalizations of the relationships are important because they are the basis of the identification algorithms needed to abstract $M_S$ into $M_I$.

An association between classes A and B characterizes the ability of an instance of A to send a message to an instance of B. Nothing prevents other relationships from linking classes A and B. We define $AS(A, B)$ as

$$AS(A, B) = (RT(A, B) = \mathit{any}) \land (RT(B, A) = \emptyset).$$

An aggregation exists between classes A and B when the definition of A, the whole, contains instances of B, its part. The whole must define a field (“simple,” array, or collection) of the type of its part. Instances of the whole send messages to instances of its part. We formalize $AG(A, B)$ as

$$AG(A, B) = (RT(A, B) \subseteq \{\text{field}, \text{array field}, \text{collection field}\}) \land (RT(B, A) = \emptyset) \land (MU(A, B) = [1, +\infty]) \land (MU(B, A) = [0, +\infty]).$$

A composition is an aggregation with a constraint between the lifetimes of the whole and its part and a constraint on the ownership of the part by the whole. Instances of the whole own the instances of its part. Instances of the part might be instantiated before the whole is instantiated, but they must not belong to any other whole. They are exclusive to the instance of the whole. The definition of the composition relationship allows only an association between part and whole to ensure the lifetime and ownership properties between whole and part. We define $CO(A, B)$ as
674 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 34, NO. 5, SEPTEMBER/OCTOBER 2008
TABLE 2
Values of the Four Properties Instantiated for the Running Example
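To make the formalizations of AS and AG concrete, the following minimal Python sketch encodes them as predicates over property values. It is an illustration only, not the identification algorithm of DeMIMA. The `Properties` container, the reading of RT(A, B) = any as "at least one relationship type," the requirement that RT(A, B) be non-empty for an aggregation, and the RT and MU(Language, Country) values chosen for the Country-Language example are all assumptions.

```python
import math
from dataclasses import dataclass, field

INF = math.inf  # stands for the unbounded multiplicity +infinity

# Relationship types that count as a field of the whole (from the AG formalization).
FIELD_KINDS = frozenset({"field", "array field", "collection field"})


@dataclass
class Properties:
    """Hypothetical container for property values of ordered class pairs."""
    rt: dict = field(default_factory=dict)  # (A, B) -> set of relationship-type names
    mu: dict = field(default_factory=dict)  # (A, B) -> (low, high) multiplicity interval

    def RT(self, a, b):
        return self.rt.get((a, b), frozenset())

    def MU(self, a, b):
        return self.mu.get((a, b))


def is_association(p, a, b):
    # AS(A, B): RT(A, B) = any (assumed: non-empty) and RT(B, A) is empty.
    return len(p.RT(a, b)) > 0 and len(p.RT(b, a)) == 0


def is_aggregation(p, a, b):
    # AG(A, B): A reaches B only through fields, B does not reach A,
    # MU(A, B) = [1, +inf] and MU(B, A) = [0, +inf].
    rt_ab = p.RT(a, b)
    return (
        len(rt_ab) > 0               # assumption: the whole defines at least one field
        and rt_ab <= FIELD_KINDS
        and len(p.RT(b, a)) == 0
        and p.MU(a, b) == (1, INF)
        and p.MU(b, a) == (0, INF)
    )


# Country-Language example; only MU(Country, Language) comes from the text,
# the other values are assumed for illustration.
props = Properties(
    rt={("Country", "Language"): {"collection field"}},
    mu={("Country", "Language"): (1, INF), ("Language", "Country"): (0, INF)},
)

print(is_association(props, "Country", "Language"))
print(is_aggregation(props, "Country", "Language"))
print(is_aggregation(props, "Language", "Country"))
```

Under these assumed values, every aggregation is also an association, which matches the order among relationships used later for constraint relaxation.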
composition relationships would be replaced by aggregation relationships and, thus, its recall would not
be impacted.

4.2.6 $M_I$ Construction in Summary
We formalized the definitions of the use, association, aggregation, and composition relationships and
developed algorithms based on dynamic and static analyses to build a model $M_I$ of a system from its
source code model $M_S$, thus creating the traceability link:
\[ M_S \overset{2}{\longleftrightarrow} M_I. \]
Example. Fig. 2c shows how the UML-like model $M_S$ is enriched into a model $M_I$ by adding an
aggregation relationship between C1 and C2, instance of the Aggregation relationship class.

4.3 Third Layer: Design-Level Model, $M_D$ ($\supset M_A$)
In the third layer, we first describe a model $M_{DM}$ of a design motif with the same language used for
$M_I$. Then, DeMIMA looks for microarchitectures $M_A$ similar to the design motif DM in a model $M_I$
of a system. To identify microarchitectures similar to $M_{DM}$, it transforms $M_{DM}$ into a
constraint system. It then solves the constraint satisfaction problem using explanation-based constraint
programming [8]. The solutions of the constraint satisfaction problem represent microarchitectures
similar to $M_{DM}$ in $M_I$.

4.3.1 Modeling of Design Motifs
Parts 1, 2, and 3 in Fig. 3 show the language used to describe design motifs as first-class entities
that can be manipulated programmatically. A design motif is represented by an instance of the class
DesignMotifModel and is composed of Participants, each having different Elements.
Example. Fig. 2a shows the UML-like diagram of the model $M_{DM}$ of the motif that we want to identify
as well as instantiation links with some of the classes in Fig. 3.

4.3.2 Transformation of Design Motifs
With DeMIMA, the identification of microarchitectures similar to a design motif translates into a
constraint satisfaction problem, which we define as follows:
• Variables correspond to the participants of the design motif model, $M_{DM}$.
• Domains of the variables correspond to the entities of $M_I$ in which to identify microarchitectures.
• Constraints among variables correspond to the relationships among the participants of $M_{DM}$.
The transformation of a design motif into a constraint system requires dedicated constraints that
represent relationships among participants. For example, constraint Strict Inheritance, in the case of
Java-like single inheritance, creates a partial order on the set of entities and is satisfied for any
couple $(v_1, v_2)$ if the domain $D_1$ of $v_1$ represents a set of entities inheriting from the
entities in the domain $D_2$ of $v_2$.
We proceed in a similar fashion for all relationships and define the following constraints:
• Inheritance constraint. The domains of two variables may contain the same entities, in contrast to
strict inheritance.
• Strict transitive inheritance constraint. The domains of two variables contain entities that belong to
the same branch of the inheritance tree.
• Transitive inheritance constraint. The domains of two variables contain entities that belong to the
same branch of the inheritance tree or that are identical.
• Use constraint. The entities in the domain of variable $v_1$ use the entities in the domain of
variable $v_2$.
• Ignorance constraint. This constraint explicitly states that two entities must not have any
relationship.
• Association constraint. Association relationships link the entities in the domain of $v_1$ with the
entities in the domain of $v_2$.
• Aggregation constraint. Aggregation relationships link the entities in the domain of $v_1$ with the
entities in the domain of $v_2$.
• Composition constraint. Composition relationships link the entities in the domain of $v_1$ with the
entities in the domain of $v_2$.
• Creation constraint. Entities in the domain of $v_1$ instantiate (at least once) entities in the
domain of $v_2$.
We add standard (in)equality constraints to these constraints which ensure that different entities play
different roles. We associate a weight with each constraint, an integer value
$p \in \{1, 2, 3, \ldots, 100\}$, which indicates the relative importance of the constraints with one
another or an order among constraints.
Example. The model $M_{DM}$ of the motif of the running example transforms into a constraint system with
two variables $v_{Z1}$ and $v_{Z2}$ corresponding to the classes Z1 and Z2 and the composition
constraint $\mathit{composition}(v_{Z1}, v_{Z2}, 100)$.

4.3.3 Resolution of the Constraint System
DeMIMA uses explanation-based constraint programming [8], [41] as a technique to solve constraint
satisfaction problems translated from the identification of microarchitectures similar to design motifs.
Explanation-based constraint programming justifies solutions, and lack thereof, of a constraint
satisfaction problem by remembering constraints that can or cannot be satisfied. Explanation-based
constraint programming is an extension of constraint programming in which the solver justifies its
behavior at each step of the resolution process.
We implemented an explanation-based constraint resolution system dedicated to design motif
identification reusing the JPALM [42] explanation-based constraint library. This extension includes a
generic algorithm for the resolution of constraint satisfaction problems with explanations and a
backtrack algorithm to manage contradiction.
Example. In the running example, no solution of the constraint system is found and, thus, no
microarchitecture is identified and reported.

4.3.4 Relaxation of the Constraint System
Constraint relaxation consists of replacing the constraints that led to a contradiction with
semantically weaker constraints.
As shown in Table 1 and from the formalizations of the binary class relationships, an order exists among
the use, association, aggregation, and composition relationships. The properties of the use relationship
are less constraining than those of the association relationship, which in turn are
less constraining than those of the aggregation relationship. Finally, the properties of the aggregation
relationship are less constraining than those of the composition relationship. Inheritance-related
constraints are also ordered from the most constraining to the least: strict inheritance, inheritance,
strict transitive inheritance, and transitive inheritance.
We take advantage of these orders; for example, if a composition relationship between two entities
prevents microarchitectures from being found, then this constraint can be replaced by an aggregation
relationship between the same two entities. The microarchitectures found are semantically similar to the
design motif model to the extent of the semantic similarity between the relationships. Problem
relaxation is a special case of constraint relaxation in which no semantically weaker constraint is
added to the constraint system.
DeMIMA enables experts to relax constraints and problems interactively as a guide in the identification
of microarchitectures similar to a design motif. Relaxation is important because entities or
relationships among entities in a model may differ from the expected entities and their relationships as
defined in a design motif model. First, the solver searches for microarchitectures identical to a design
motif model and provides maintainers with explanations of contradiction. A maintainer chooses one or
more constraints which she believes are not essential to the design motif model and removes them from
the constraint system dynamically, replacing them with semantically weaker constraints; the solver then
searches for approximate microarchitectures. This process goes on until the maintainer decides that too
many constraints have been relaxed and the microarchitectures are becoming too distant from the design
motif model. Weights associated with each constraint are used to score a microarchitecture to help
maintainers in choosing which constraints to relax. The score of a microarchitecture is
\[ \mathit{score} = \left( \sum_{p \in \{p_1, \ldots, p_n\}} p \right)
 - \left( \sum_{p \in \{p_j, \ldots, p_k\}} p/100 \right), \]
where $\{p_1, \ldots, p_n\}$ is the set of weights of all constraints and $\{p_j, \ldots, p_k\}$ is the
set of weights of the relaxed constraints. If all constraints from the design motif model are satisfied,
then $\mathit{score} = 100$ else $\mathit{score} < 100$.
The solver may be automated to compute all combinations of constraint relaxations. The set of all
possible microarchitectures (complete and approximate) is identical whether relaxation is performed
manually or automatically. This set only depends on the design motif and system models. The difference
between automated and manual constraint relaxation is that maintainers may choose to relax constraints
in a different order than that suggested by the design motif model and thus may direct the search more
quickly toward useful microarchitectures.
Example. The composition constraint would be relaxed into an aggregation constraint
$\mathit{aggregation}(v_{Z1}, v_{Z2}, 100)$ according to Table 1. A solution to this constraint system
exists with $v_{Z1} = C1$ and $v_{Z2} = C2$.

describe models $M_D$ and the set of models $\{M_A\}$ of microarchitectures similar to design motifs:
• The MicroArchitecture class describes microarchitectures similar to design motif models. A
microarchitecture model aggregates a set of entities which play a role in the microarchitecture. It also
records the score of the solution and the set of relaxed constraints.
• An instance of class ProgramModel may contain instances of class MicroArchitecture.
Thus, DeMIMA can build models $M_A$ of microarchitectures identified as similar to $M_{DM}$ models in
$M_I$ and ensure the traceability between their constituents:
\[ M_I \overset{3}{\longleftrightarrow} M_D\ (\supset M_A). \]
Example. The model $M_I$ is enriched by the microarchitecture corresponding to the found approximate
solution into a model $M_D$ shown in Fig. 2d.

5 TOOLING
We implement DeMIMA on top of the PTIDEJ framework. The main programming language for the tools is Java.
We use Prolog for the computation of the EX and LT properties and JPALM to implement the constraint
solver to benefit from existing libraries. We present here only the components of the PTIDEJ framework
relevant to DeMIMA:
1. PADL provides the language needed to describe models $M_S$, $M_I$, and $M_D$ of systems. Its
implementation is general enough to cope with different programming languages, such as C++ and Java.
2. The PADL CLASSFILE CREATOR parser analyzes the Java class files associated with a system to produce a
model $M_S$ of the system.
3. RELATIONSHIP STATIC ANALYSER computes values of the RT and MU properties and infers use, association,
and aggregation relationships among entities of $M_S$ to refine $M_S$ into $M_I$.
4. CAFFEINE performs dynamic analyses of a system to compute values for the EX and LT properties. Results
are integrated within $M_I$ to refine aggregation relationships into composition relationships if
required.
5. PTIDEJ UI allows the visualization and refinement of $M_S$, $M_I$, and $M_D$. It displays the models
as UML-like class diagrams with a Sugiyama-based layout algorithm. It is also responsible for converting
a chosen design motif $M_{DM}$ into a constraint system and $M_I$ into a domain for its variables.
6. Finally, the constraint solver PTIDEJ SOLVER is applied on the generated constraint satisfaction
problem to solve the problem either interactively or automatically. The constraint solver produces
microarchitectures $M_A$ similar to the design motif to create $M_D$.
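The resolution and relaxation steps of Sections 4.3.3 and 4.3.4 can be illustrated on the running example with a small Python sketch. It is a simplification under stated assumptions, not the PTIDEJ SOLVER: it enumerates assignments by brute force instead of using explanation-based constraint programming, it treats a relationship as satisfying every weaker kind in the use, association, aggregation, composition order, and it automates relaxation by weakening the first relaxable constraint. All function names are hypothetical.

```python
from itertools import permutations

# Relationship kinds ordered from least to most constraining (Section 4.3.4).
ORDER = ["use", "association", "aggregation", "composition"]


def holds(model, kind, a, b):
    """True if the relationship recorded from a to b is at least as strong as kind."""
    found = model.get((a, b))
    return found is not None and ORDER.index(found) >= ORDER.index(kind)


def solve(entities, variables, constraints, model):
    """Enumerate assignments of distinct entities to variables (brute force)."""
    solutions = []
    for combo in permutations(entities, len(variables)):
        binding = dict(zip(variables, combo))
        if all(holds(model, kind, binding[v1], binding[v2])
               for kind, v1, v2, _weight in constraints):
            solutions.append(binding)
    return solutions


def relax(constraint):
    """Return the next weaker constraint, or None if no weaker kind exists."""
    kind, v1, v2, weight = constraint
    i = ORDER.index(kind)
    return (ORDER[i - 1], v1, v2, weight) if i > 0 else None


def solve_with_relaxation(entities, variables, constraints, model):
    """Weaken one constraint at a time until a solution appears or none can be weakened."""
    relaxed = []
    while True:
        solutions = solve(entities, variables, constraints, model)
        if solutions:
            return solutions, relaxed
        for i, c in enumerate(constraints):
            weaker = relax(c)
            if weaker is not None:
                relaxed.append(c)
                constraints = constraints[:i] + [weaker] + constraints[i + 1:]
                break
        else:
            return [], relaxed


# Running example: M_I contains an aggregation between C1 and C2,
# and the motif requires composition(vZ1, vZ2, 100).
model = {("C1", "C2"): "aggregation"}
entities = ["C1", "C2"]
variables = ["vZ1", "vZ2"]
motif = [("composition", "vZ1", "vZ2", 100)]

strict = solve(entities, variables, motif, model)
solutions, relaxed = solve_with_relaxation(entities, variables, motif, model)
print(strict)
print(solutions)
print(relaxed)
```

The strict search finds nothing, as in the example of Section 4.3.3; after the composition constraint is weakened to an aggregation constraint with the same weight, the assignment of C1 to vZ1 and C2 to vZ2 is reported together with the constraint that was relaxed.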
Fig. 4. Comparison of JHOTDRAW documented and recovered MD model. The list and box show one selected MA similar to Composite.
where several microarchitectures are similar to design motifs yet do not implement their intents and
motivations.
As described above and shown in Tables 3 and 4, our subjects are comprised of systems from different
domains, complexities, and sizes. Thus, results reported below support the feasibility of DeMIMA and its
ability to identify design motifs based on structural properties captured by AS, AG, and CO
relationships and the set of defined constraints. Results are encouraging; future work will include
studying generalization to other object-oriented programming languages, domains, and design motifs.
Internal validity is defined as the ability to detect a cause-effect relationship between independent
and dependent variables. DeMIMA obviously detects design motifs and thus highlights microarchitectures
to help program comprehension and documentation of reverse-engineered design choices; however, the
extent to which these microarchitectures correspond to the intention or motivation of the developers has
not been assessed and will be studied in future work.

6.4 A Step-by-Step Identification of Composite in JHOTDRAW
We perform a step-by-step identification of the Composite design motif in JHOTDRAW to illustrate the use
of DeMIMA.
The top-left part of Fig. 4 shows a subset of the system design as presented in its documentation. We
apply DeMIMA to build a model $M_I$ of JHOTDRAW from its source code. Fig. 4 compares the recovered
design-level model of the system and its documented design. The recovered model presents essentially the
same data as the documented architecture. Some relationships among classes and interfaces differ because
the authors of the documentation summarized the main classes and interfaces of the framework and
reported against these entities some relationships existing only among their subclasses. For example,
the instantiation relationship between interfaces Figure and Handle only exists between class
StandardDrawingView (which implements Figure) and class NullHandle (which implements Handle). Thus, with
TABLE 5
Results of Design Motif Identification in Public-Domain Systems
DeMIMA, we obtain a model $M_I$ of a system source code S and ensure the traceability between $M_I$ and
S:
\[ S \longleftrightarrow M_S \longleftrightarrow M_I. \]
The Composite design motif [2, p. 163] defines three participants, Component, Composite, and Leaf, and
three relationships among them, an inheritance between Component and Composite and between Component and
Leaf and a composition between Composite and Component. It translates into the following constraint
system: three variables, component, composite, and leaf, and three constraints:
• Two inheritance constraints between variables leaf and component, composite and component:
inheritance(component, composite, 100) and inheritance(component, leaf, 100).
• A composition constraint between variables composite and component:
composition(composite, component, 100).
DeMIMA solves the constraint satisfaction problem defined by the constraint system from the Composite
design motif using as domain the JHOTDRAW idiom-level model. During the process, the composition
constraint is relaxed because only aggregation relationships are present in the model $M_I$ of JHOTDRAW;
the inheritance constraints are also relaxed because an intermediate class, AbstractFigure, exists in
the framework. Then, the identified microarchitectures are integrated in a design-level model. The
bottom part of Fig. 4 shows the design-level model $M_D$ of JHOTDRAW and, together with the top-right
list, highlights a microarchitecture similar to the Composite design motif.
Thus, with DeMIMA, we obtain models $M_A$ similar to a design motif model $M_{DM}$ in a model $M_D$.
DeMIMA also ensures the traceability between $M_A$, $M_D$, and S:
\[ S \longleftrightarrow M_S \longleftrightarrow M_I \longleftrightarrow M_D \supset \{M_A\}. \]
Models $M_A$ of microarchitectures similar to the Composite design motif help maintainers in
understanding the design of the JHOTDRAW system by explaining the roles of the highlighted classes, which
solve the problem of composing “objects into tree structures to represent part-whole hierarchies” and
“let clients treat individual objects and compositions of objects uniformly,” as defined by the
Composite design pattern. Maintainers are guided by the identification in their comprehension of the
system. Thus, DeMIMA may ease Task 3 of comprehending the system, as presented in Section 2.

6.5 Open Source Systems Case Studies
Table 5 gives the number of microarchitectures identified for each system in the public domain for each
design motif. Columns labeled with I report detected motifs, with T microarchitectures manually
classified as true motifs and with P the corresponding precision. It can be observed that the most
frequently found design motifs are the Abstract Factory and Factory Method because they use
characteristics at the core of object-oriented programming. The last row of Table 5 gives the precision
of the design motif identification. Precision is computed over all motifs by summing the numbers of each
column and then computing T/I, assuming a precision of 100 percent when I = T = 0.
In some cases, DeMIMA does not identify any microarchitecture similar to some design motifs. The reason
is twofold: First, we only allow one approximation for each type of relationship; thus, it is possible
that we do not
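The Composite constraint system of Section 6.4 can be tried on a toy model with the following Python sketch. It is an illustration under assumptions, not the DeMIMA implementation or the recovered JHOTDRAW model: only Figure and AbstractFigure are names taken from the text, CompositeFigure and LeafFigure are hypothetical, the search is a brute-force enumeration, and the first argument of an inheritance constraint is assumed to be the superclass, following the order used in inheritance(component, composite, 100).

```python
from itertools import permutations

# Toy idiom-level model: direct superclass of each class, and aggregations found.
PARENT = {
    "AbstractFigure": "Figure",
    "CompositeFigure": "AbstractFigure",
    "LeafFigure": "AbstractFigure",
}
AGGREGATIONS = {("CompositeFigure", "Figure")}
COMPOSITIONS = set()  # no composition relationship in the toy model
ENTITIES = ["Figure", "AbstractFigure", "CompositeFigure", "LeafFigure"]
VARIABLES = ["component", "composite", "leaf"]


def ancestors(cls):
    """All direct and indirect superclasses of cls in the toy model."""
    result = []
    while cls in PARENT:
        cls = PARENT[cls]
        result.append(cls)
    return result


# Each check receives the entities bound to the two constraint arguments.
CHECKS = {
    "inheritance": lambda sup, sub: PARENT.get(sub) == sup,
    "transitive inheritance": lambda sup, sub: sup in ancestors(sub),
    "composition": lambda whole, part: (whole, part) in COMPOSITIONS,
    "aggregation": lambda whole, part: (whole, part) in AGGREGATIONS | COMPOSITIONS,
}


def solve(constraints):
    """Enumerate assignments of distinct entities to the three variables."""
    found = []
    for combo in permutations(ENTITIES, len(VARIABLES)):
        binding = dict(zip(VARIABLES, combo))
        if all(CHECKS[kind](binding[x], binding[y]) for kind, x, y, _w in constraints):
            found.append(binding)
    return found


# Constraint system of the Composite design motif (Section 6.4).
motif = [
    ("inheritance", "component", "composite", 100),
    ("inheritance", "component", "leaf", 100),
    ("composition", "composite", "component", 100),
]
# Relaxed system: weaker inheritance and aggregation instead of composition.
relaxed_motif = [
    ("transitive inheritance", "component", "composite", 100),
    ("transitive inheritance", "component", "leaf", 100),
    ("aggregation", "composite", "component", 100),
]

exact = solve(motif)
approximate = solve(relaxed_motif)
print(exact)
for solution in approximate:
    print(solution)
```

On this toy model, the exact motif has no solution, while the relaxed system binds component to Figure and composite to CompositeFigure. Both AbstractFigure and LeafFigure are reported as candidate leaves, which illustrates why approximate microarchitectures have to be classified manually before precision is computed.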
TABLE 6
Results of Design Motif Identification in the Source Code of Industrial Components
TABLE 7
Results of Design Motif Identification in the Design of Industrial Components
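The precision reported with these results follows the rule stated in Section 6.5: the I and T columns are summed over all rows before computing T/I, and precision is taken to be 100 percent when I = T = 0. A minimal Python sketch of that rule follows; the function names are hypothetical and the counts are invented for illustration, they are not values from Tables 5 to 7.

```python
def precision(identified, true_positives):
    """Precision T/I in percent; defined as 100 when I = T = 0."""
    if identified == 0 and true_positives == 0:
        return 100.0
    return 100.0 * true_positives / identified


def overall_precision(rows):
    """Sum the I and T columns over all rows first, then compute T/I."""
    total_i = sum(i for i, _t in rows)
    total_t = sum(t for _i, t in rows)
    return precision(total_i, total_t)


# Invented (I, T) pairs, one per system, for a single design motif.
rows = [(10, 4), (0, 0), (5, 1)]

print([precision(i, t) for i, t in rows])
print(overall_precision(rows))
```

Summing before dividing means that a system with I = T = 0 does not change the overall figure, whereas averaging the per-system percentages would count it as a perfect 100 percent.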
[16] J. Niere, J.P. Wadsack, and A. Zündorf, “Recovering UML Diagrams from Java Code Using Patterns,” Proc. Second Workshop Soft Computing Applied to Software Eng., J.H. Jahnke and C. Ryan, eds., pp. 89-97, http://trese.cs.utwente.nl/scase/scase-2/Proceedings.pdf, Feb. 2001.
[17] J. Niere, J.P. Wadsack, and L. Wendehals, “Handling Large Search Space in Pattern-Based Reverse Engineering,” Proc. 11th Int’l Workshop Program Comprehension, K. Wong and R. Koschke, eds., pp. 274-280, http://portal.acm.org/citation.cfm?id=857020, May 2003.
[18] D. Jackson and A. Waingold, “Lightweight Extraction of Object Models from Bytecode,” Proc. 21st Int’l Conf. Software Eng., D. Garlan and J. Kramer, eds., pp. 194-202, http://sdg.lcs.mit.edu/ dnj/, May 1999.
[19] Object Management Group, UML v1.5 Specification, http://www.omg.org/cgi-bin/doc?formal/03-03-01, Mar. 2003.
[20] J. Noble and J. Grundy, “Explicit Relationships in Object-Oriented Development,” Proc. 18th Conf. Technology of Object-Oriented Languages and Systems, B. Meyer, ed., pp. 211-226, http://citeseer.nj.nec.com/noble95explicit.html, Nov. 1995.
[21] F. Civello, “Roles for Composite Objects in Object-Oriented Analysis and Design,” Proc. Eighth Conf. Object-Oriented Programming, Systems, Languages, and Applications, A. Paepcke, ed., pp. 376-393, http://www.it.bton.ac.uk/staff/frc/papers/aboops93.html, Sept. 1993.
[22] S. Ducasse, M. Blay-Fornarino, and A.-M. Pinna-Dery, “A Reflective Model for First Class Dependencies,” Proc. 10th Conf. Object-Oriented Programming, Systems, Languages, and Applications, F. Manola, ed., pp. 265-280, http://www.iam.unibe.ch/~ducasse/WebPages/Publications.html, Oct. 1995.
[23] L. Wills, “Automated Program Recognition by Graph Parsing,” PhD dissertation, Massachusetts Inst. of Technology, 1992.
[24] J. Niere, W. Schäfer, J.P. Wadsack, L. Wendehals, and J. Welsh, “Towards Pattern-Based Design Recovery,” Proc. 24th Int’l Conf. Software Eng., M. Young and J. Magee, eds., pp. 338-348, http://portal.acm.org/citation.cfm?id=581382, May 2002.
[25] C. Krämer and L. Prechelt, “Design Recovery by Automated Search for Structural Design Patterns in Object-Oriented Software,” Proc. Third Working Conf. Reverse Eng., L.M. Wills and I. Baxter, eds., pp. 208-215, http://www.computer.org/proceedings/wcre/7674/76740208abs.htm, Nov. 1996.
[26] B. Kullbach and A. Winter, “Querying as an Enabling Technology in Software Reengineering,” Proc. Third Conf. Software Maintenance and Reengineering, P. Nesi and C. Verhoef, eds., pp. 42-50, http://www.computer.org/proceedings/csmr/0090/00900042abs.htm, Mar. 1999.
[27] R.K. Keller, R. Schauer, S. Robitaille, and P. Pagé, “Pattern-Based Reverse-Engineering of Design Components,” Proc. 21st Int’l Conf. Software Eng., D. Garlan and J. Kramer, eds., pp. 226-235, http://www.iro.umontreal.ca/~schauer/Private/Publications/icse1999/icse1999.html, May 1999.
[28] J.H. Jahnke and A. Zündorf, “Rewriting Poor Design Patterns by Good Design Patterns,” Proc. First ESEC/FSE Workshop Object-Oriented Reengineering, S. Demeyer and H.C. Gall, eds., http://www.iam.unibe.ch/~famoos/ESEC97/, Distributed Systems Group, Technical Univ. of Vienna, UV-1841-97-10, Sept. 1997.
[29] G. Antoniol, R. Fiutem, and L. Cristoforetti, “Design Pattern Recovery in Object-Oriented Software,” Proc. Sixth Int’l Workshop Program Comprehension, S. Tilley and G. Visaggio, eds., pp. 153-160, http://citeseer.nj.nec.com/antoniol98design.html, June 1998.
[30] J. Seemann and J.W. von Gudenberg, “Pattern-Based Design Recovery of Java Software,” Proc. Fifth Int’l Symp. Foundations of Software Eng., B. Scherlis, ed., pp. 10-16, http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/s/Seemann:Jochen.html, Nov. 1998.
[31] D. Eppstein, “Subgraph Isomorphism in Planar Graphs and Related Problems,” Proc. Sixth Ann. Symp. Discrete Algorithms, K. Clarkson, ed., pp. 632-640, www.ics.uci.edu/~eppstein/pubs/Epp-TR-94-25.pdf, Jan. 1995.
[32] N. Pettersson and W. Löwe, “Efficient and Accurate Software Pattern Detection,” Proc. 13th Asia Pacific Software Eng. Conf., P. Jalote, ed., pp. 317-326, http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=4137387&arnumber=4137433&count=65&index=43, Dec. 2006.
[33] N. Tsantalis, A. Chatzigeorgiou, G. Stephanides, and S. Halkidis, “Design Pattern Detection Using Similarity Scoring,” IEEE Trans. Software Eng., vol. 32, no. 11, Nov. 2006.
[34] K. Brown, “Design Reverse-Engineering and Automated Design Pattern Detection in Smalltalk,” Technical Report TR-96-07, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, http://citeseer.nj.nec.com/context/734211/0, July 1996.
[35] G. Hedin, “Language Support for Design Patterns Using Attribute Extension,” Proc. First ECOOP Workshop Language Support for Design Patterns and Frameworks, J. Bosch and S. Mitchell, eds., Springer, pp. 137-140, http://www.cs.lth.se/Research/ProgEnv/LSDF.html, June 1997.
[36] H. Albin-Amiot and Y.-G. Guéhéneuc, “Meta-Modeling Design Patterns: Application to Pattern Detection and Code Synthesis,” Proc. First ECOOP Workshop Automating Object-Oriented Software Development Methods, P. van den Broek, P. Hruby, M. Saeki, G. Sunyé, and B. Tekinerdogan, eds., http://www.iro.umontreal.ca/~ptidej/Publications/Documents/ECOOP01AOOSDM.doc.pdf, Centre for Telematics and Information Technology, Univ. of Twente, TR-CTIT-01-35, Oct. 2001.
[37] I. Philippow, D. Streitferdt, M. Riebisch, and S. Naumann, “An Approach for Reverse Engineering of Design Patterns,” Software and System Modeling, vol. 4, no. 1, pp. 55-70, http://www.springerlink.com/content/0dn4pmqh5uhnbk69/, Feb. 2005.
[38] D. Heuzeroth, T. Holl, and W. Löwe, “Combining Static and Dynamic Analyses to Detect Interaction Patterns,” Proc. Sixth World Conf. Integrated Design and Process Technology, H. Ehrig, B.J. Krämer, and A. Ertas, eds., http://www.info.uni-karlsruhe.de/publications.php/bib=281, June 2002.
[39] Y.-G. Guéhéneuc, “A Systematic Study of UML Class Diagram Constituents for Their Abstract and Precise Recovery,” Proc. 11th Asia-Pacific Software Eng. Conf., D.-H. Bae and W.C. Chu, eds., pp. 265-274, http://www.iro.umontreal.ca/~ptidej/Publications/Documents/APSEC04.doc.pdf, Nov.-Dec. 2004.
[40] A. Donovan, A. Kiezun, M.S. Tschantz, and M.D. Ernst, “Converting Java Programs to Use Generic Libraries,” Proc. 19th Conf. Object-Oriented Programming Systems, Languages, and Applications, D. Schmidt, ed., pp. 15-34, http://portal.acm.org/citation.cfm?id=1035292.1028979, Oct. 2004.
[41] N. Jussien and V. Barichard, “The PaLM System: Explanation-Based Constraint Programming,” Proc. Techniques for Implementing Constraint Programming Systems, N. Beldiceanu, W. Harvey, M. Henz, F. Laburthe, E. Monfroy, T. Müller, L. Perron, and C. Schulte, eds., pp. 118-133, Sept. 2000, School of Computing, Nat’l Univ. of Singapore, TRA9/00.
[42] N. Jussien, “e-Constraints: Explanation-Based Constraint Programming,” Proc. First CP Workshop User-Interaction in Constraint Satisfaction, B. O’Sullivan and E. Freuder, eds., http://www.emn.fr/jussien/publications/jussien-WCP01.pdf, Dec. 2001.
[43] E. Gamma and T. Eggenschwiler, “JHotDraw,” http://members.pingnet.ch/gamma/JHD-5.1.zip, 1998.
[44] W.B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures and Algorithms. Prentice Hall, 1992.
[45] J. Bieman, G. Straw, H. Wang, P.W. Munger, and R.T. Alexander, “Design Patterns and Change Proneness: An Examination of Five Evolving Systems,” Proc. Ninth Int’l Software Metrics Symp., M. Berry and W. Harrison, eds., pp. 40-49, http://csdl.computer.org/comp/proceedings/metrics/2003/1987/00/19870040abs.htm, Sept. 2003.
[46] G. Antoniol, G. Casazza, M. di Penta, and R. Fiutem, “Object-Oriented Design Patterns Recovery,” J. Systems and Software, vol. 59, pp. 181-196, http://web.soccerlab.polymtl.ca/~antoniol/publications/index.html, Nov. 2001.
[47] Y.-G. Guéhéneuc, H. Sahraoui, and F. Zaidi, “Fingerprinting Design Patterns,” Proc. 11th Working Conf. Reverse Eng., E. Stroulia and A. de Lucia, eds., pp. 172-181, http://www.iro.umontreal.ca/~ptidej/Publications/Documents/WCRE04.doc.pdf, Nov. 2004.
[48] M. Fowler, Patterns of Enterprise Application Architecture, first ed. Addison-Wesley Professional, http://www.amazon.com/Patterns-Enterprise-Application-Architecture-Martin/dp/0321127420, Nov. 2002.
Yann-Gaël Guéhéneuc received the engineering diploma from the Ecole des Mines of Nantes, France, in 1998
and the PhD degree in software engineering from the University of Nantes, France (under Professor Pierre
Cointe’s supervision) in 2003. His PhD thesis was funded by Object Technology International, Inc. (now
IBM OTI Labs.) in 1999 and 2000. He is an assistant professor in the Department of Computing Science and
Operations Research at the University of Montreal, where he leads the Ptidej team on evaluating and
enhancing the quality of object-oriented programs by promoting the use of patterns at the language,
design, or architectural levels. His research interests are program understanding and program quality
during development and maintenance, in particular through the use and the identification of recurring
patterns. He is also interested in empirical software engineering; he uses eye trackers to understand
and to develop theories about program comprehension. He has published many papers in international
conference proceedings and journals. He is a member of the IEEE.

Giuliano Antoniol received the degree in electronic engineering from the Università di Padova in 1982
and the PhD degree in electrical engineering from the Ecole Polytechnique de Montréal, Canada, in 2004.
He has worked in companies, research institutions, and universities. He is currently an associate
professor at the Ecole Polytechnique de Montréal, where he works on software evolution, software
traceability, software quality, and maintenance. He has published more than 100 papers in journals and
international conference proceedings. He has served as a member of the program committees of
international conferences and workshops such as the International Conference on Software Maintenance,
the International Conference on Program Comprehension, and the International Symposium on Software
Metrics. He is currently a member of the editorial board of the Journal Software Testing Verification
and Reliability, the Journal Information and Software Technology, the Journal of Empirical Software
Engineering, and the Journal of Software Quality. In 2005, he was awarded the Canada Research Chair
Tier I in software change and evolution. He is a member of the IEEE.