
27th Iranian Conference on Electrical Engineering (ICEE2019)

Tarimliq: A new internal metric for software clustering analysis
Masoud Kargar, Dept. of Computer Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran ([email protected])
Habib Izadkhah, Dept. of Computer Science, University of Tabriz, Tabriz, Iran ([email protected])
Ayaz Isazadeh, Dept. of Computer Science, University of Tabriz, Tabriz, Iran ([email protected])

Abstract—Clustering techniques are utilized in the maintenance process of a software system in order to understand it, and different clustering algorithms have been proposed for this purpose in the literature. In the field of software clustering, a number of external criteria have been presented to evaluate and validate the obtained clusters. External criteria use a reference clustering to evaluate an achieved clustering; because of this comparison with a reference, their assessments are valid and accurate. When no reference clustering exists, internal criteria are used to validate clustering algorithms. Since there is no internal criterion designed specifically for evaluating software clustering algorithms, the internal criteria available in data clustering are employed instead. In this paper, we propose an internal metric for evaluating software clustering algorithms. The results on Mozilla Firefox, as a large-scale software system, demonstrate that the proposed internal metric is more accurate than the tested internal criteria and can also be a suitable alternative to external criteria.

Index Terms—clustering algorithms, program understandability, cluster validity, validity indices

Fig. 1. A sample software system graph

I. INTRODUCTION
Fig. 2. An obtained clustering for Fig. 1

Lehman points out, in one of his laws of software evolution, that software systems continue to evolve over time [1]. Maintenance and improvement consume a major portion of the total life-cycle cost of a software system, and it is estimated that a large part of the software budget in large organizations is allocated to maintaining existing software systems. According to [1], approximately 90% of software costs are evolution costs. Although this rate may not be exactly right, the fact remains that a large percentage of software costs is spent on the software maintenance process. A proper understanding of the software has a major impact on the maintenance and development of a software system. One of the ways to greatly help the process of understanding a large application from its source code is to form a meaningful partition of its structure into smaller, more controllable subsystems [2]. To fulfill this aim, clustering methods are utilized: clustering techniques make a program easier to understand by partitioning it.

The purpose of software clustering is to overcome the complexity of a large program by replacing a set of artifacts (e.g., files, functions, classes) with a cluster, a representative abstraction of all the artifacts grouped within it. Consequently, the obtained partition is straightforward to understand. The resulting decomposition is called the software architecture (or software structure). Different clustering algorithms have been presented in the literature for this purpose. Figure 1 represents a graph constructed for a software system, such that the nodes of the graph represent artifacts and the edges between the nodes represent the relationships between the artifacts. Figure 2 shows a sample clustering of Figure 1. According to the principles of software engineering, clustering should be done in such a way that the relationships within clusters are maximized and the relationships between clusters are minimized.

Clustering is an unsupervised process, which means that the user does not interfere in the clustering process. Usually, there exist no predefined classes or examples showing whether the results are credible. Hence, various criteria for assessing the obtained clusters have been presented. These criteria are categorized into two classes: external and internal criteria. In the external metrics, the clustering obtained by an algorithm is compared with a predetermined clustering created by an expert; this predetermined clustering is also called the ground-truth architecture [3].


For example, the ground-truth architecture of the mtunis academic operating system is outlined in its documentation. However, most software does not have a ground-truth structure, while it is still necessary to validate the obtained clustering; in these cases, internal criteria are used. The aim of the internal criteria is to determine how well the clusters are separated. It is important to note that the outcome of the external metrics is more reliable than that of the internal criteria in evaluating a clustering algorithm. Therefore, it is essential to provide an internal measure that can simulate the behavior of the external metrics.

In this paper, we present an internal metric for software cluster validity. The results of our experiments on Mozilla Firefox demonstrate that it can be a good alternative to external criteria. Hence, it can be used to evaluate the clusterings achieved by the algorithms.

The structure of this paper is organized as follows: Section II addresses the related works, Section III presents the proposed internal metric, Section IV evaluates the proposed metric, and Section V concludes the paper.

II. RELATED WORKS

This section is organized into two sub-sections. In the first part, we examine the criteria for clustering assessment, and in the second part, we examine the clustering algorithms.

A. Clustering Validity Criteria

In most cases, the number of clusters is an unknown parameter, because clustering is unsupervised and the user has very little knowledge about the data. Thus, the evaluation of different clustering algorithms is an important research problem in cluster analysis. The evaluation of clustering results can be examined from two perspectives: internal indexes and external indexes [4].

Internal Index. The purpose of an internal evaluation is to examine the achievement of the clustering goals. Validation of clustering results is as difficult as the clustering itself. The overall goal of the internal criteria is to evaluate the obtained clusters from the two aspects of compactness and separation [4]. Some of the internal criteria are Homogeneity, Separation, the Dunn index, Davies-Bouldin, and the Silhouette coefficient. We have selected three internal criteria, namely Homogeneity, Separation, and the Dunn index, for our comparisons; the reason for this choice is to compare against criteria that evaluate the clusters from different perspectives. The following is a brief overview of these methods.

Cluster Homogeneity index: This index reflects the pairwise similarity of the artifacts in a cluster. Intra-cluster homogeneity is a concept related to the degree of similarity between artifacts in the same cluster.

Dunn index: If a data set contains well-separated clusters, the distances among the clusters are usually large and the diameters of the clusters are expected to be small [3]. Therefore, a higher Dunn index indicates a better clustering.

Separation: This index measures how well separated a cluster is from the other clusters. Separation is measured by the between-cluster sum of squares.
External validity. External validity indexes are preferable when ground-truth labels are available. External measures are used to compare the similarity of two clustering results. The most important external criteria in the evaluation of software architectures are Precision [1], Recall [1], FM [1], MoJo [5], and MoJoFM [6]. Here is a brief overview of these criteria.

Precision: This criterion is the intersection of the extracted and ground-truth architectures divided by the extracted architecture. Precision concerns the positive class (i.e., true positives and false positives).

Recall: This criterion is the intersection of the extracted and ground-truth architectures divided by the ground-truth architecture (it is also known as sensitivity).

F-measure: This criterion, denoted by FM, is the weighted harmonic mean of Precision and Recall.

MoJo: This metric gauges how "close" two different clusterings are. It counts the minimum number of operations (move and join operations) one needs to perform in order to transform one partition into the other. Because this criterion does not produce an answer within a fixed interval and only counts the number of operations, MoJoFM was introduced.

MoJoFM: Let C1 and C2 indicate the clustering achieved by a clustering algorithm and an authoritative decomposition, respectively, and let mno(C1, C2) indicate the least number of move and join operations required to transform C1 into C2. The MoJoFM measure is given by Eq. 1; the number produced by this criterion lies in the range 0 to 100, and a larger number indicates a higher similarity between the two clusterings.

$$MoJoFM(A, B) = \left( 1 - \frac{mno(A, B)}{\max(mno(\forall A, B))} \right) \times 100\% \quad (1)$$
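Under the pairwise reading of these definitions (the "intersection" of two architectures is the set of artifact pairs that both clusterings place in the same cluster), Precision, Recall, and FM can be sketched as below. The artifact names are hypothetical, and the equal weighting in FM is an assumption; MoJo and MoJoFM are omitted because computing mno requires a move-and-join minimization, not a simple count.

```python
from itertools import combinations

def intra_pairs(clustering):
    """All unordered pairs of artifacts that share a cluster."""
    return {frozenset(p)
            for cluster in clustering
            for p in combinations(sorted(cluster), 2)}

def precision_recall_fm(extracted, ground_truth):
    ext, gt = intra_pairs(extracted), intra_pairs(ground_truth)
    precision = len(ext & gt) / len(ext) if ext else 0.0
    recall = len(ext & gt) / len(gt) if gt else 0.0
    # FM taken as the balanced harmonic mean (equal weights assumed).
    fm = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, fm

# Hypothetical clusterings of artifacts a..e:
extracted = [{"a", "b", "c"}, {"d", "e"}]
ground_truth = [{"a", "b"}, {"c", "d", "e"}]
print(precision_recall_fm(extracted, ground_truth))  # (0.5, 0.5, 0.5)
```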


B. Clustering Algorithms

Due to the NP-hardness of the clustering problem, most algorithms presented in this field use evolutionary methods. E-CDGM [7], Bunch [8], DAGC [9], SAHC [10], NAHC [10], Multiple-HC [11], MCA [12], ECA [12], GA-SMCP [13], PSOMC [14], HSBRA [15], MAE [16], and BCA [17] are a number of software clustering algorithms that use evolutionary techniques to perform clustering. The following is a brief explanation of some of these algorithms, which are of particular interest in the community.

Bunch algorithm: One of the most popular algorithms for software clustering, provided by Mitchell in his Ph.D. thesis. It uses a genetic algorithm with a vector-based encoding for software clustering. It is a single-objective algorithm and, by optimizing a quality function called TurboMQ, it creates clusters with maximum cohesion and minimum coupling.

DAGC algorithm: This algorithm is similar to the Bunch algorithm, with the difference that the encoding used is permutation-based.

Hill climbing algorithms: These algorithms use a local search for clustering, and there are two different versions, called NAHC and SAHC; the difference between the two lies in how neighbors are searched.

MCA: This algorithm uses a multi-objective function to cluster a software system. The objectives used include maximizing the sum of the intra-edges of all clusters, minimizing the sum of the inter-edges of all clusters, maximizing the number of clusters, maximizing TurboMQ, and minimizing the number of isolated clusters.

ECA: The objectives used in this algorithm are the same as those used in MCA, with the difference that the last one is replaced by minimizing the difference between the maximum and minimum number of modules in a cluster.

III. PROPOSED INTERNAL METRIC

This section proposes a metric that can measure a clustering obtained by an algorithm. According to software maintenance principles, a good software system should consist of clusters that are as understandable and functionally independent as possible. To this end, the following requirements should hold:

1) The relationships between different clusters should be minimized.
2) The relationships within each cluster should be maximized.
3) For the structure of a software system to be understandable, the direction of the edges between different clusters should be as consistent as possible; one-way, consistent directions between clusters facilitate comprehension of the achieved clustering.
4) All utility artifacts existing in the software system should be placed in one individual cluster.

Utility artifacts are artifacts that are called by more than two different clusters but do not call other clusters. These artifacts tend to have large degrees, which often means they are the basic files or classes of the software system. Because of their high dependence on many different artifacts, utility artifacts fall into different clusters in different clusterings, and so they make a program difficult to understand. Therefore, it is suggested that such artifacts be placed within a single, individual cluster. In Fig. 1, the utility files are represented by g and h.

To handle these four requirements, this paper proposes a new metric named Tarimliq. Suppose a program is represented by a directed graph G = (V, E), such that the vertices V denote the artifacts and the edges E denote the relationships between the artifacts. The numbers generated by this metric lie in the range 0 to 1, and if the principles described above are met by a clustering, the metric attains its maximum value. Let x denote an artifact. In Eq. 2, Adj(x) denotes the set of artifacts adjacent to x. power(x) (Eq. 3) represents the power of the artifact x, i.e., the sum of the relationships that the artifact x shares with other artifacts. Eq. 4 gives the common power between two artifacts x and y, where x is the source and y is the destination; the common power is directional, so power(x, y) is not generally equal to power(y, x). Here Z denotes the set of intermediate artifacts z that are destinations of x and sources of y (i.e., E(x, z) and E(z, y) both exist); of course, Z can be empty. To calculate the common power, the common neighboring artifacts between x and y are identified, and the relationships between x and y through those neighbors are summed. Eq. 5 defines the dissimilarity between the two artifacts x and y; if x and y do not communicate, directly or indirectly, they have the maximum dissimilarity and the value 1 is assigned to them. Eq. 6 calculates the total internal dissimilarity of a cluster c_i, and Eq. 7 calculates the total external dissimilarity of the artifacts located in cluster c_i. In Eqs. 6 and 7, the direction of the relationships matters. Finally, Eqs. 8 and 9 compute the Tarimliq value for each cluster and for an achieved clustering, respectively.

$$Adj(x) = \{\, y \in V \mid (x, y) \in E \,\} \quad (2)$$

$$power(x) = \sum_{y \in Adj(x)} E(x, y) \quad (3)$$

$$power(x, y) = \sum_{z \in Z} \big( E(x, z) + E(z, y) \big), \qquad Z = \{\, z \in Adj(x) \mid y \in Adj(z) \,\} \quad (4)$$

$$DisSimilarity(x, y) = \begin{cases} 1, & power(x) = 0 \text{ and } power(y) = 0 \\[4pt] \dfrac{\left| power(x, y) \cdot power(x) - power(y, x) \cdot power(y) \right|}{\left( power(x) + power(y) \right)^{2}}, & \text{otherwise} \end{cases} \quad (5)$$

$$DisSimIn(c_i) = \sum_{x, y \in c_i} DisSimilarity(x, y) \quad (6)$$

$$DisSimOut(c_i) = \sum_{x \in c_i,\; y \notin c_i,\; (x, y) \in E} DisSimilarity(x, y) \quad (7)$$

$$Tarimliq(c_i) = 1 - \frac{DisSimIn(c_i)}{DisSimIn(c_i) + DisSimOut(c_i)} \quad (8)$$

$$Tarimliq = \frac{1}{n} \sum_{i=1}^{n} Tarimliq(c_i) \quad (9)$$

where n is the number of clusters.
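To make Eqs. 2-9 concrete, the following is a minimal Python sketch of one possible reading of the metric. It assumes an unweighted directed graph, so E(x, z) and E(z, y) each contribute 1 when the corresponding edge exists, and the degenerate cases (an empty Z, or a cluster whose total dissimilarity is zero) are handled by conventions inferred from the prose above rather than prescribed by it.

```python
from collections import defaultdict

def tarimliq(edges, clusters):
    """One possible transcription of Eqs. 2-9 for an unweighted
    directed artifact graph. edges: iterable of (source, target)
    pairs; clusters: list of sets of artifact names."""
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)                       # Adj(u), Eq. 2

    def power(x):                            # Eq. 3 (out-relationships of x)
        return len(succ[x])

    def common_power(x, y):                  # Eq. 4: E(x,z) + E(z,y) over Z
        return sum(2 for z in succ[x] if y in succ[z])

    def dissimilarity(x, y):                 # Eq. 5
        px, py = power(x), power(y)
        if px == 0 and py == 0:              # no communication: maximal dissimilarity
            return 1.0
        num = abs(common_power(x, y) * px - common_power(y, x) * py)
        return num / (px + py) ** 2

    total = 0.0
    for c in clusters:
        dis_in = sum(dissimilarity(x, y)     # Eq. 6 (ordered pairs; direction matters)
                     for x in c for y in c if x != y)
        dis_out = sum(dissimilarity(x, y)    # Eq. 7 (edges leaving the cluster)
                      for x in c for y in succ[x] if y not in c)
        denom = dis_in + dis_out
        # Eq. 8; a cluster with zero total dissimilarity is scored 1 by convention here.
        total += (1.0 - dis_in / denom) if denom else 1.0
    return total / len(clusters)             # Eq. 9

# Toy usage with hypothetical artifacts (in the style of Fig. 1):
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
print(tarimliq(edges, [{"a", "b", "c"}, {"d", "e"}]))
```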


IV. RESULTS

Our goal is to show that the proposed criterion can replace the external criteria and can simulate their behavior. To this end, we chose Mozilla Firefox for the experiments; the reason for this choice is that its expert architecture is available, so an external evaluation can also be performed on it. Table I shows the specifications of this software.

TABLE I
SPECIFICATION OF MOZILLA FIREFOX

Folder Name    # Clusters or Sub-folders    # Files
Accessible                 8                   179
Browser                    4                    54
Build                      2                    21
Content                   13                   881
Db                         4                    97
Dom                        5                   163
Extensions                13                   179
Gfx                        7                   324
Intl                       7                   573
Ipc                        4                   391

Fig. 6. Ranking and grouping of six algorithms in terms of Recall

Fig. 3. The average of ten independent runs of six algorithms in terms of external metrics (mean Precision, Recall, FM, and MoJoFM per algorithm: Bunch, DAGC, ECA, MCA, NAHC, SAHC)

Fig. 7. Ranking and grouping of six algorithms in terms of FM
We chose six state-of-the-art clustering algorithms for evaluation. NAHC and SAHC are two local-search algorithms, while Bunch and DAGC are global-search algorithms. The remarkable point is that these four algorithms are single-objective. We also selected two multi-objective algorithms, namely ECA and MCA. Figures 3 and 4 show, respectively, the average of ten runs of these six algorithms on the Mozilla folders in terms of the external and internal metrics.
One of the one-way ANOVA post-hoc techniques, the Duncan test, has been used to compare the averages of the different criteria and thereby evaluate the efficiency of the algorithms [18]. This technique groups the algorithms and prioritizes them according to the different indexes. Figures 5-8 depict the grouping of the algorithms and their priority order under the four criteria Precision, Recall, FM, and MoJoFM. It is important to note that the differences between groups are meaningful, while the differences within a group are not significant.

Fig. 8. Ranking and grouping of six algorithms in terms of MoJoFM
Fig. 9. Ranking and grouping of six algorithms using Factor Analysis

Fig. 4. The average of ten independent runs of six algorithms in terms of internal metrics (mean Homogeneity, Separation, Dunn index, and Tarimliq per algorithm: Bunch, DAGC, ECA, MCA, NAHC, SAHC)

Fig. 10. Ranking and grouping of six algorithms in terms of homogeneity

Fig. 5. Ranking and grouping of six algorithms in terms of Precision

Fig. 11. Ranking and grouping of six algorithms in terms of Separation


Fig. 12. Ranking and grouping of six algorithms in terms of Dunn

Fig. 13. Ranking and grouping of six algorithms in terms of Tarimliq

For example, in Figure 5, according to the Precision criterion, the algorithms are prioritized from left to right by the proximity of the clusters they create to the ground-truth clustering (based on this criterion, the MCA algorithm performed best). Based on this indicator, the Bunch and ECA algorithms come after MCA and fall into the same group (that is, on the basis of this index, the two algorithms performed equivalently). What is clear in these figures is that the Recall index has not been able to prioritize the algorithms and is therefore not a good indicator, in contrast to the FM index, which is. After removing the Recall index, the three remaining indicators were analyzed using Factor Analysis [18] and a single factor was extracted from them. The purpose of the factor analysis is to reduce the three indicators to one equivalent index that simulates the behavior of all three simultaneously. Figure 9 shows the ranking of the different algorithms based on this newly extracted index; based on this factor, the algorithms were divided into four groups. Figures 10-12 show the grouping of the algorithms and their priority order under the three internal criteria Homogeneity, Separation, and the Dunn index, and Figure 13 shows how the proposed metric ranked and grouped the clustering algorithms. It is clear from these figures that the categorization produced by the proposed metric is better than that of the other tested internal criteria and is similar to the Factor Analysis of the external criteria. This demonstrates that the proposed metric can replace external metrics.

V. CONCLUSION

Due to the NP-hardness of the clustering problem, the existing algorithms produce different clusterings. Data clustering uses internal criteria to evaluate an obtained clustering; however, given that the purpose of software clustering is different from that of data clustering, the internal metrics provided for data clustering may not be able to properly gauge a software clustering. The purpose of software clustering is to partition the software into clusters that, in addition to having maximum cohesion and minimum coupling, should be well understood. To this end, we presented an internal metric in this paper that can assess an achieved clustering.

REFERENCES

[1] Isazadeh, Ayaz, Habib Izadkhah, and Islam Elgedawy. Source Code Modularization: Theory and Techniques. Springer, 2017.
[2] Beck, Fabian, and Stephan Diehl. "On the impact of software evolution on software clustering." Empirical Software Engineering, vol. 18, no. 5, pp. 970-1004, 2013.
[3] Garcia, Joshua, Ivo Krka, Chris Mattmann, and Nenad Medvidovic. "Obtaining ground-truth software architectures." In Proceedings of the 2013 International Conference on Software Engineering, pp. 901-910, 2013.
[4] Duran, Benjamin S., and Patrick L. Odell. Cluster Analysis: A Survey. Vol. 100. Springer Science & Business Media, 2013.
[5] Tzerpos, Vassilios, and Richard C. Holt. "MoJo: A distance metric for software clusterings." In Proceedings of the Sixth Working Conference on Reverse Engineering, pp. 187-193. IEEE, 1999.
[6] Wen, Zhihua, and Vassilios Tzerpos. "An effectiveness measure for software clustering algorithms." In Proceedings of the 12th IEEE International Workshop on Program Comprehension, pp. 194-203. IEEE, 2004.
[7] Izadkhah, Habib, Islam Elgedawy, and Ayaz Isazadeh. "E-CDGM: An evolutionary call-dependency graph modularization approach for software systems." Cybernetics and Information Technologies, vol. 16, no. 3, pp. 70-90, 2016.
[8] Mitchell, Brian S. A Heuristic Search Approach to Solving the Software Clustering Problem. Ph.D. thesis, Drexel University, 2002.
[9] Parsa, Saeed, and Omid Bushehrian. "A new encoding scheme and a framework to investigate genetic clustering algorithms." Journal of Research and Practice in Information Technology, vol. 37, no. 1, p. 127, 2005.
[10] Mitchell, Brian S., and Spiros Mancoridis. "On the automatic modularization of software systems using the Bunch tool." IEEE Transactions on Software Engineering, vol. 32, no. 3, pp. 193-208, 2006.
[11] Mahdavi, Kiarash. A Clustering Genetic Algorithm for Software Modularisation with a Multiple Hill Climbing Approach. Ph.D. dissertation, Brunel University, 2005.
[12] Praditwong, Kata, Mark Harman, and Xin Yao. "Software module clustering as a multi-objective search problem." IEEE Transactions on Software Engineering, vol. 37, no. 2, pp. 264-282, 2011.
[13] Huang, Jinhuang, and Jing Liu. "A similarity-based modularization quality measure for software module clustering problems." Information Sciences, vol. 342, pp. 96-110, 2016.
[14] Rajapati, Amarjeet, and Jitender Kumar Chhabra. "A particle swarm optimization-based heuristic for software module clustering problem." Arabian Journal for Science and Engineering, vol. 43, no. 12, pp. 7083-7094, 2018.
[15] Chhabra, Jitender Kumar. "Harmony search based remodularization for object-oriented software systems." Computer Languages, Systems & Structures, vol. 47, pp. 153-169, 2017.
[16] Huang, Jinhuang, Jing Liu, and Xin Yao. "A multi-agent evolutionary algorithm for software module clustering problems." Soft Computing, vol. 21, no. 12, pp. 3415-3428, 2017.
[17] Chhabra, Jitender Kumar. "Many-objective artificial bee colony algorithm for large-scale software module clustering problem." Soft Computing, vol. 22, no. 19, pp. 6341-6361, 2018.
[18] Coakes, Sheridan J., and Lyndall Steed. SPSS: Analysis Without Anguish Using SPSS Version 14.0 for Windows. John Wiley & Sons, Inc., 2009.
