
Employing Clustering for Assisting Source Code Maintainability Evaluation according to ISO/IEC-9126
Panagiotis Antonellis¹, Dimitris Antoniou¹, Yiannis Kanellopoulos¹,², Christos Makris¹, Evangelos Theodoridis¹, Christos Tjortjis², Nikos Tsirakis¹
¹ University of Patras, Computer Engineering and Informatics Department, Greece
² The University of Manchester, School of Computer Science, U.K.

Abstract

This paper elaborates on how to use clustering for the evaluation of a software system's maintainability according to the ISO/IEC-9126 quality standard. More specifically, it proposes a methodology that combines clustering and multicriteria decision aid techniques for knowledge acquisition by integrating groups of data from source code with the expertise of a software system's evaluators. A process for extracting elements from source code and the Analytical Hierarchy Process for assigning weights to these data are provided; the k-Attractors clustering algorithm is then applied on these data in order to produce system overviews and deductions. The methodology is evaluated on Apache Geronimo, a large open source application server; results are discussed and conclusions are presented together with directions for future work.

1. Introduction

Software maintenance is considered the most difficult stage in the software lifecycle. According to the National Institute of Standards and Technology (NIST), it costs the U.S. economy $60 billion per year [12]. Given this high cost, maintenance processes can be considered an area of competitive advantage. Several studies address the evaluation of a system's maintainability and the control of the effort required to carry out maintenance activities [2], [14], [18]. According to ISO/IEC-9126, maintainability is the capability of a software product to be modified. Evaluating such a characteristic is difficult, as many contradictory criteria must be considered in order to reach a decision.

This paper presents a methodology that facilitates the evaluation of a software product's maintainability according to the ISO/IEC-9126 software engineering quality standard. The intuition behind this methodology is to integrate groups of measurement data extracted from source code elements with the expertise of a system's evaluators, by giving them the ability to define a number of attributes suitable for such an evaluation. For this reason:
• Metrics are extracted from elements of the system's source code.
• Relative weights are assigned to these metrics by employing the Analytical Hierarchy Process, reflecting their importance in evaluating maintainability.
• The k-Attractors clustering algorithm [4] is then applied on the derived ISO/IEC-9126 maintainability values, in order to provide the evaluator with a quick and rough grasp of the system.

We attempt to evaluate the usefulness of this methodology by employing as a test-bed Geronimo 1.0, an open source application server used in real-life industrial applications. The remainder of this paper is organized as follows: Section 2 reviews existing work in the area of data mining and software evaluation. Section 3 outlines the logic behind the main parts of the proposed methodology. Section 4 assesses the accuracy of the output of the proposed framework, analyses its results and outlines deductions from its application. Finally, conclusions and directions for future work are presented in Section 5.

2. Background

Data mining [3] is the process of extracting implicit, previously unknown, and potentially useful information from data, by searching large volumes of them for patterns and by employing techniques such as classification, association rules mining, and clustering. It is a complex topic that has links with multiple core fields of computer science and adds value to seminal computational techniques from statistics, information retrieval, machine learning and pattern recognition. Its ability to deal with vast amounts of data has been considered a suitable solution in assisting software maintenance, often with remarkable results [1], [7], [8], [10], [20]. As previous studies have shown, data mining is capable of obtaining useful knowledge about the structure of large systems.

Sartipi et al. used data mining for architectural design recovery [16]. They proposed a model for the evaluation of the architectural design of a system based on associations among system components, and used system modularity measurement as an indication of design quality and its decomposition into subsystems. Besides association rules, the clustering data mining technique has been used to support software maintenance and software systems knowledge discovery [21], [15]. The work in [15] proposes a methodology for grouping Java code elements together according to their similarity, and focuses on achieving a high-level system understanding.

Understanding low/medium level concepts and relationships among components at the function, paragraph or
even line of code level by mining C and COBOL legacy systems source code was addressed in [19]. For C programs, functions were used as entities, and attributes were defined according to the use and types of parameters and variables, and the types of returned values. Clustering was then applied to identify subsets of source code that were grouped together according to custom-made similarity metrics [19]. An approach for the evaluation of dynamic clustering is presented in [22]. The scope of this solution is to evaluate the usefulness of providing dynamic dependencies as input to software clustering algorithms. Finally, clustering over a Module Dependency Graph (MDG) [9] uses a collection of algorithms which facilitate the automatic recovery of the modular structure of a software system from its source code. The method creates a hierarchical view of the system architecture into subsystems, based on the components and the relationships between components that can be detected in source code.

Recently, Kothari et al. [6] presented an approach that examines the evolution of code stored in source control repositories. This technique identifies Change Clusters, which can help managers classify different code change activities as either software maintenance or new development. On the other hand, [20] analyzes whether a change coupling between source code entities is significant or whether only minor textual adjustments have been checked in, as reflected by the changes to the source code entities. An approach for analyzing and classifying change types based on code revisions has been developed. Finally, in [8] language processing techniques are applied to extend human judgment into situations where obtaining direct human judgment is impractical due to the volume of information that must be considered.

What differentiates this work from the approaches presented above is that we do not cluster raw software measurement data. Instead, we give the evaluator the ability to employ a Multicriteria Analysis (MA) method, the Analytical Hierarchy Process (AHP), for assigning relative weights to the extracted metrics in order to reflect their importance in evaluating maintainability. This helps to combine the evaluator's domain expertise with the measurement data extracted from source code, which may lead to more accurate and interesting clustering results.

3. Description of the Methodology

The proposed methodology is supported by the Code4Thought tool [24]. Our main purpose when implementing this tool was to use open source and portable technologies. Thus, we decided to use the Java programming language for implementing the main functionality of the tool, the MySQL database for storing our data, and the PHP scripting language for designing the user interface.

This section presents the logic behind the following modules that constitute the Code4Thought tool:
• Data extraction and preparation
• Weights assignment
• Data analysis

3.1. Data Extraction and Preparation

The objective of data extraction and preparation is two-fold:
• First, to collect appropriate elements that describe the software architecture and its characteristics. These elements include native source code attributes and metrics.
• Then, to analyze the collected elements, choose a refined subset of them and store them in a relational database system for further analysis.

Native attributes include definition files, classes, structure blocks, etc. Metrics, on the other hand, provide additional system information and describe the system's characteristics and behaviour more effectively.

Every metric is associated with a native source code attribute, e.g. the lack of cohesion is associated with a class member method. All of the collected attributes and metrics are stored in appropriately structured XML files. We have chosen XML because of its interoperability and its wide acceptance as a de facto standard for data representation and exchange. Storing the metrics in XML files enables further processing and analysis with a variety of tools.

For simplicity, we chose to analyse a refined subset of the most important collected elements. This subset should be small enough to be easily analyzed and large enough to contain all the necessary system information. Based on this requirement, we stored and further analyzed only the metrics and their associated native attributes.

The chosen elements need to be extracted from the XML files and stored permanently in a relational database. For this reason we used tools that map XML elements and nodes into any relational database, keeping the extraction method independent of the underlying database.

Figure 1 depicts the general architecture of the data extraction and preparation module.

3.2. Weights Assignment

As mentioned above, we have adopted the Analytical Hierarchy Process (AHP) for the weights assignment. AHP is a decision making technique that allows consideration of both qualitative and quantitative aspects of decisions [25]. It reduces complex decisions to a series of one-on-one comparisons and then synthesizes the results. Compared to other techniques, such as ranking or rating, AHP emulates the human ability to compare single properties of alternatives. It not only helps decision makers choose the best alternative, but also provides a clear rationale for the choice.

AHP compares a list of objectives or alternatives in a systematic way. When used in the systems engineering process, AHP can be a powerful tool for comparing alternative design concepts. Assume that a set of objectives has been established and that we are trying to establish a normalized set of weights to be used when comparing alternatives against these objectives. AHP forms a pairwise comparison matrix A, where the entry a(i,j) in the i-th row and j-th column gives the relative importance of objective O(i) as compared with O(j). The values are usually taken from a 1–9 scale, with a(i,j) = 1 if the two objectives are equal in importance, a(i,j) = 3 if O(i) is weakly more important than O(j), a(i,j) = 5 if O(i) is strongly more important than O(j), a(i,j) = 7 if O(i) is very strongly more important than O(j), and a(i,j) = 9 if O(i) is absolutely more important than O(j). After this procedure the comparison matrix is normalized and its eigenvalues are
Figure 1. Architecture of data extraction and preparation module

Figure 2. Weights assignment hierarchy

Figure 3. Data analysis module

computed. These eigenvalues (strictly speaking, the components of the normalized principal eigenvector) play the role of coefficients/weights when evaluating the alternatives for the examined objectives.

In our case, where we aim at evaluating maintainability (see Figure 2) from a set of employed metrics, we apply the AHP procedure at each level of the maintainability metrics hierarchy. At the first level we evaluate the characteristics (analyzability, changeability, etc.) from the extracted metrics, and at the second level we evaluate maintainability from the characteristics by applying the AHP procedure again. So, at the first level we construct a pairwise comparison table for each of the characteristics, reflecting the expert's knowledge of how much each metric influences that characteristic. Then, by applying the normalization and eigenvalue extraction to each matrix, we find the weight of each metric for calculating a score for each characteristic. At the higher level a pairwise comparison table is constructed as well, reflecting the expert's knowledge of how much each characteristic influences maintainability, and the weights are again calculated by normalization and eigenvalue extraction.

3.3. Data Analysis

As depicted in Figure 3, the k-Attractors algorithm accepts data from the source code analyzer by performing queries on the database where the data reside. The outcome of the analysis is stored in XML files, in order to be visualized by the corresponding module.

In the case of software maintainability evaluation, clustering produces overviews of systems by creating mutually exclusive groups of classes, member data or methods, according to their similarities in terms of technical (source code) measurements [16]. This helps to reduce the time required to understand and evaluate the overall system. Another contribution of clustering is that it helps to discover programming patterns and "unusual" or outlier cases which may require attention.

For this purpose the k-Attractors algorithm was employed, which is tailored for numerical data such as measurements from source code [4]. The main characteristics of k-Attractors are:
o It defines the desired number of clusters (i.e. the value of k) without user intervention.
o It locates the initial attractors of cluster centers with great precision.
o It measures similarity based on a composite metric that combines the Hamming distance and the inner product of transactions and clusters' attractors.

The k-Attractors algorithm employs maximal frequent itemset discovery and partitioning in order to define the number of desired clusters and the initial attractors of the centers of these clusters. The intuition is that a frequent itemset, in the case of software metrics, is a set of measurements that occur together in a minimum part of a software system's classes. Classes with similar measurements are expected to be in the same cluster. The term attractor is used instead of centroid, as it is not determined randomly, but by its frequency in the whole population of a software system's classes.

4. Application - Results Evaluation

The evaluation of Apache Geronimo's maintainability according to ISO/IEC-9126 involved the study of 1440 classes. Figure 4 depicts the clusters derived from clustering the maintainability values of Geronimo's classes. The higher the values on the X axis, the less maintainable the classes are. Table 1 presents statistics for the derived clusters.

Figure 4. Apache Geronimo ISO/IEC-9126 maintainability clusters

Table 1: Clusters Statistics
S/N  Population  Percentage  Mean   Standard Deviation
0    419         29%         1.10   0.29
1    130         9%          2.45   0.60
2    7           0.49%       13.75  2.27
3    856         59%         0.39   0.16
4    28          1.94%       5.02   1.55

Cluster 3, which has the biggest population, contains classes whose maintainability values range between 0 and 0.9. This shows that the vast majority of Geronimo's classes are highly maintainable. Furthermore, clusters 0, 1 and 4 contain classes whose maintainability values range from 0.9 – 2, 2 – 4 and 4 – 9.2 respectively, which can be considered good in terms of maintainability.

However, outliers are detected in cluster 2, which consists of only seven (7) classes. These are the least maintainable classes, i.e. the ones with the highest values. These classes are:
1. KernelManagementHelper.java, a class of 1024 Lines Of Code (LOC).
2. TradeDirect.java, a class of 2312 LOC.
3. ClientApp.java, a class of 1633 LOC.
4. CdrInputStream.java, a class of 1569 LOC.
5. CdrOutputStream.java, a class of 1241 LOC.
6. ASN1Encodable.java, a class of only 62 LOC.
7. DERObject.java, a class of only 38 LOC.

Table 2 presents the metric values for the classes in cluster 2. A further study of these values indicates that the classes in cluster 2 fall into two categories:
• The first category includes the first five classes, which have the following characteristics:
  o They do not follow the principle of low coupling/high cohesion. On the contrary, they exhibit low cohesion and high coupling.
  o They are highly complex.
  o All of them have polymorphic methods, which indicates that encapsulation is not applied in these classes.
• The second category includes the classes ASN1Encodable and DERObject, which are difficult to maintain for different reasons. More specifically, these two classes have the following characteristics:
  o Interestingly, they are not complex, and their size is very small, unlike the classes in the first category. They also follow the principle of low coupling/high cohesion.
  o They have an excessive number of children. This probably indicates that these classes are fundamental elements of Apache Geronimo's structure.
  o The number of classes depending on them (Ca) is large.

Table 3 presents statistics for the metrics of Apache Geronimo's classes in clusters 0, 1, 3 and 4. This table indicates that:
• The lower the metric values, the lower the probability of low maintainability.
• There is limited use of inheritance, as shown by the low DIT and NOC values.
• The majority of the classes follow the low coupling/high cohesion principle.
• Most of the classes exhibit low complexity.
• The design property of encapsulation is applied to most of the classes.

5. Conclusions and Future Work

The application of the proposed methodology proved to be efficient in terms of time and performance. The extraction process, which is the most time-consuming part of this methodology, analyzed the 1440 classes of Apache Geronimo 1.0 and stored the corresponding metrics and elements in a limited amount of time. A domain expert previewed the stored metrics and assigned the corresponding weights easily and efficiently, according to his priorities and concerns. After clustering was applied, the resulting clusters proved to be representative of the code artifacts, helping the domain expert to identify relations between specific metrics and global maintainability, as well as to spot individual outlier classes that may need reconsideration.

As future work, we intend to enhance our extraction method by calculating metrics for other languages such as C++, C and COBOL, which were used for the development of the majority of legacy systems, a category of software systems that is very interesting in terms of program comprehension and maintainability evaluation.

Acknowledgements

This research work has been partially supported by the Greek General Secretariat for Research and Technology (GSRT) and Dynacomp S.A. within the program "P.E.P. of Western Greece Act 3.4".
Table 2: Cluster 2 Metrics
S/N  WMC    NPM    DAM   CBO    POM    DIT   NOC     LCOM   Ca
1    9.15   11.13  1.62  17.40  40.00  0.72  0.00    42.69  0.00
2    11.58  4.52   1.62  35.65  30.00  0.72  0.00    45.21  0.51
3    10.68  0.32   1.62  2.99   2.50   0.72  0.00    81.97  2.53
4    18.38  11.45  1.62  14.37  20.00  0.72  0.00    64.96  9.61
5    14.77  11.29  1.62  13.14  12.50  0.72  0.00    47.82  9.61
6    0.42   0.81   0.00  0.33   0.00   0.72  149.49  0.27   26.30
7    0.28   0.48   0.00  0.00   0.00   1.44  76.27   0.18   52.10
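
The two categories of cluster 2 classes discussed in Section 4 can be recovered directly from the Table 2 values. The following Python fragment is only a minimal sketch: the rows are copied from Table 2, but the function `categorize`, the category labels and the thresholds (NOC > 0 and WMC < 1) are assumptions introduced for illustration and are not part of the Code4Thought tool.

```python
# Rows of Table 2, keyed by S/N (1-7 in the order the classes are listed in Section 4).
COLUMNS = ("WMC", "NPM", "DAM", "CBO", "POM", "DIT", "NOC", "LCOM", "Ca")
TABLE_2 = {
    1: (9.15, 11.13, 1.62, 17.40, 40.00, 0.72, 0.00, 42.69, 0.00),
    2: (11.58, 4.52, 1.62, 35.65, 30.00, 0.72, 0.00, 45.21, 0.51),
    3: (10.68, 0.32, 1.62, 2.99, 2.50, 0.72, 0.00, 81.97, 2.53),
    4: (18.38, 11.45, 1.62, 14.37, 20.00, 0.72, 0.00, 64.96, 9.61),
    5: (14.77, 11.29, 1.62, 13.14, 12.50, 0.72, 0.00, 47.82, 9.61),
    6: (0.42, 0.81, 0.00, 0.33, 0.00, 0.72, 149.49, 0.27, 26.30),
    7: (0.28, 0.48, 0.00, 0.00, 0.00, 1.44, 76.27, 0.18, 52.10),
}


def categorize(row):
    """Assign a Table 2 row to one of the two categories.

    Hypothetical rule: a class with children (NOC > 0) and low
    complexity (WMC < 1) is a small but heavily inherited class;
    everything else in cluster 2 is a large, complex class.
    """
    metrics = dict(zip(COLUMNS, row))
    if metrics["NOC"] > 0 and metrics["WMC"] < 1.0:
        return "small, heavily inherited"
    return "large, complex, low cohesion"


# Group the S/N values by category, preserving table order.
categories = {}
for serial, row in TABLE_2.items():
    categories.setdefault(categorize(row), []).append(serial)

for label, serials in categories.items():
    print(label, serials)
```

With these illustrative thresholds, rows 1 to 5 fall into the first category and rows 6 and 7 (ASN1Encodable and DERObject) into the second, matching the grouping described in the text.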

Table 3: Cluster 0, 1, 3 and 4 Metrics Statistics
Metric  Min.  Max.   Mean  Median  Stand. Dev.
WMC     0.07  12.55  0.96  0.55    1.20
NPM     0.00  8.71   0.98  0.65    1.17
DAM     0.00  1.62   1.00  1.62    0.76
CBO     0.00  16.54  0.95  0.41    1.54
POM     0.00  37.50  0.93  0.00    2.88
DIT     0.72  3.60   1.00  0.72    0.49
NOC     0.00  70.17  0.85  0.00    3.87
LCOM    0.00  26.84  0.81  0.11    2.43
Ca      0.00  81.94  0.93  0.00    3.28
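
As an illustration of the weights assignment of Section 3.2, the following Python fragment sketches how weights could be derived from a pairwise comparison matrix and then used to score one characteristic. It is a sketch under stated assumptions, not the Code4Thought implementation: the function names `ahp_weights` and `weighted_score`, the choice of power iteration, the comparison matrix and the pairing of three metrics with one characteristic are all hypothetical. The weights are taken as the principal eigenvector of the matrix, normalized to sum to 1, following the usual AHP formulation [25].

```python
def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration, normalized so the weights sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply the matrix by the current weight vector.
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        # Renormalize so that the weights keep summing to 1.
        weights = [value / total for value in product]
    return weights


def weighted_score(weights, values):
    """Combine metric values into one score using the AHP weights."""
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical expert judgments on the 1-9 scale for three metrics:
# metric 0 is weakly more important than metric 1 (3) and strongly
# more important than metric 2 (5); metric 1 is weakly more important
# than metric 2 (3). Entries below the diagonal are the reciprocals.
comparison = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

weights = ahp_weights(comparison)

# Illustrative metric values for one class (here the WMC, CBO and LCOM
# means from Table 3); which metrics feed which characteristic is an
# assumption made only for this example.
score = weighted_score(weights, [0.96, 0.95, 0.81])

print([round(w, 3) for w in weights], round(score, 3))
```

The same two functions can be applied a second time at the upper level of the hierarchy, with a comparison matrix over the characteristics and the characteristic scores as input values, to obtain a single maintainability value per class.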
References
[1] N. Anquetil and T. C. Lethbridge, "Experiments with Clustering as a Software Remodularization Method", Proc. 6th Working Conf. Reverse Engineering (WCRE 99), IEEE Comp. Soc. Press, (1999) 235-255.
[2] Erik Arisholm, Lionel C. Briand, Audun Foyen, "Dynamic Coupling Measurement for Object-Oriented Software", IEEE Transactions on Software Engineering, vol. 30, No. 8, August 2004, pp. 491-506.
[3] Dunham, M. H. Data Mining: Introductory and Advanced Topics. Prentice Hall PTR, 2002.
[4] Kanellopoulos Y., Antonellis P., Tjortjis C., Makris C., "k-Attractors, A Clustering Algorithm for Software Measurement Data Analysis", Proc. IEEE 19th International Conference on Tools for Artificial Intelligence (ICTAI 2007), IEEE Computer Society Press, 2007.
[5] Kan, S. H. Metrics and Models in Software Quality Engineering. Addison-Wesley, Second Edition, 2002.
[6] Jay Kothari, Ali Shokoufandeh, Spiros Mancoridis, Ahmed E. Hassan, "Studying the Evolution of Software Systems Using Change Clusters", 14th IEEE International Conference on Program Comprehension (ICPC'06), 2006, pp. 46-55.
[7] T. Kunz and J. P. Black, "Using Automatic Process Clustering for Design Recovery and Distributed Debugging", IEEE Transactions on Software Engineering, 21(6), (1995) 515-527.
[8] Dawn J. Lawrie, Henry Feild, David Binkley, "Leveraged Quality Assessment using Information Retrieval Techniques", 14th IEEE International Conference on Program Comprehension (ICPC'06), 2006, pp. 149-158.
[9] S. Mancoridis, B.S. Mitchell, Y. Chen and E.R. Gansner, "Bunch: A Clustering Tool for the Recovery and Maintenance of Software System Structures", Proc. Int'l Conf. Software Maintenance (ICSM 99), IEEE Comp. Soc. Press, (1999) 50-59.
[10] O. Maqbool, H.A. Babri, A. Karim, and M. Sarwar, "Metarule-guided association rule mining for program understanding", IEE Proceedings - Software, 152(6) (2005) 281-296.
[11] Storey Margaret-Anne, "Theories, Methods and Tools in Program Comprehension: Past, Present and Future", Proc. IEEE 13th Int'l Workshop Program Comprehension (IWPC 2005), 2005.
[12] National Institute of Standards and Technology (NIST), "The Economic Impacts of Inadequate Infrastructure for Software Testing", Washington D.C., 2002.
[13] C. M. de Oca and D. L. Carver, "Identification of Data Cohesive Subsystems Using Data Mining Techniques", Proc. Int'l Conf. Software Maintenance (ICSM 98), IEEE Comp. Soc. Press, (1998) 16-23.
[14] Rajendra K. Bandi, Vijay K. Vaishnavi, Daniel E. Turk, "Predicting Maintenance Performance Using Object Oriented Design Complexity Metrics", IEEE Transactions on Software Engineering, vol. 29, No. 1, January 2003, pp. 77-87.
[15] D. Rousidis and C. Tjortjis, "Clustering Data Retrieved from Java Source Code to Support Software Maintenance: A Case Study", Proc. IEEE 9th European Conf. Software Maintenance and Reengineering (CSMR 05), IEEE Comp. Soc. Press, (2005) 276-279.
[16] K. Sartipi, K. Kontogiannis and F. Mavaddat, "Architectural Design Recovery Using Data Mining Techniques", Proc. 2nd European Working Conf. Software Maintenance Reengineering (CSMR 00), IEEE Comp. Soc. Press, (2000) 129-140.
[17] Spinellis D., "Code Quality: The Open Source Perspective", Addison-Wesley, 2006.
[18] Yong Tan, Vijay S. Mookerjee, "Comparing Uniform and Flexible Policies for Software Maintenance and Replacement", IEEE Transactions on Software Engineering, vol. 31, No. 3, March 2005, pp. 238-255.
[19] C. Tjortjis, N. Gold, P.J. Layzell and K. Bennett, "From System Comprehension to Program Comprehension", Proc. IEEE 26th Int'l Computer Software Applications Conf. (COMPSAC 02), IEEE Comp. Soc. Press, (2002) 427-432.
[20] C. Tjortjis, L. Sinos and P.J. Layzell, "Facilitating Program Comprehension by Mining Association Rules from Source Code", Proc. IEEE 11th Int'l Workshop Program Comprehension (IWPC 03), IEEE Comp. Soc. Press, (2003) 125-132.
[21] V. Tzerpos and R. Holt, "Software Botryology: Automatic Clustering of Software Systems", Proc. 9th Int'l Workshop Database Expert Systems Applications (DEXA 98), IEEE Comp. Soc. Press, (1998) 811-818.
[22] C. Xiao and V. Tzerpos, "Software Clustering on Dynamic Dependencies", Proc. IEEE 9th European Conf. Software Maintenance and Reengineering (CSMR 05), IEEE Comp. Soc. Press, (2005) 124-133.
[23] http://geronimo.apache.org/downloads.htm
[24] http://www.code4thought.org
[25] Saaty T., Multicriteria Decision Making: The Analytic Hierarchy Process, Vol. 1, AHP Series, RWS Publications, 502 pp., 1990.
