
2IS55 Software Evolution

Software metrics (2)


Alexander Serebrenik
Assignment 6
Assignment 6:
Deadline: May 11
1-2 students
Assignment 8:
Open: June 1
Deadline: June 22
1-2 students
ReqVis
http://www.student.tue.nl/Q/w.j.p.v.ravensteijn/index.html
Try it!
Give us feedback before June 1!
Mac-fans: Talk to Wiljan!
Sources
So far
Metrics
Size: length ((S)LOC), number of files and classes, amount of functionality
Structure: control flow, data flow, modularity
Complexity metrics: Halstead (1977)
Sometimes classified as a size metric rather than a complexity metric
Unit of measurement
Line: LOC, SLOC, LLOC
Units, files, classes
Parts of a statement: operators and operands
Packages, directories
Operators: traditional (+, ++, >), keywords (return, if, continue)
Operands: identifiers, constants
Halstead metrics
Length: N = N1 + N2
Vocabulary: n = n1 + n2
Volume: V = N log2(n)
Insensitive to lay-out
VerifySoft boundaries: 20 ≤ Volume(function) ≤ 1000, 100 ≤ Volume(file) ≤ 8000
The four basic metrics of Halstead:
           Total   Unique
Operators  N1      n1
Operands   N2      n2
Halstead metrics: Example
void sort ( int *a, int n ) {
int i, j, t;
if ( n < 2 ) return;
for ( i=0 ; i < n-1; i++ ) {
for ( j=i+1 ; j < n ; j++ ) {
if ( a[i] > a[j] ) {
t = a[i];
a[i] = a[j];
a[j] = t;
}
}
}
}
Ignore the function definition
Count operators and operands:
           Total     Unique
Operators  N1 = 50   n1 = 17
Operands   N2 = 30   n2 = 7
V = 80 log2(24) ≈ 392
Inside the boundaries [20; 1000]
Further Halstead metrics
Volume: V = N log2(n)
Difficulty: D = (n1 / 2) * (N2 / n2)
Sources of difficulty: new operators and repeated operands
Example: 17/2 * 30/7 ≈ 36
Effort: E = V * D
Time to understand/implement (sec): T = E / 18
Running example: 793 sec ≈ 13 min
Does this correspond to your experience?
Bugs delivered: B = E^(2/3) / 3000
For C/C++: known to underestimate
Running example: 0.19
           Total   Unique
Operators  N1      n1
Operands   N2      n2
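As a minimal sketch (not from the slides; class and variable names are mine), the derived Halstead metrics can be computed from the four basic counts:

// Minimal sketch: derived Halstead metrics from the four basic counts.
// Counting the operators and operands themselves needs a lexer and is not shown here.
public class HalsteadMetrics {
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    public static void main(String[] args) {
        int N1 = 50, N2 = 30;   // total operators / operands (sort example)
        int n1 = 17, n2 = 7;    // unique operators / operands (sort example)

        int length = N1 + N2;                                  // N = N1 + N2
        int vocabulary = n1 + n2;                              // n = n1 + n2
        double volume = length * log2(vocabulary);             // V = N log2(n)
        double difficulty = (n1 / 2.0) * ((double) N2 / n2);   // D = (n1/2) * (N2/n2)
        double effort = volume * difficulty;                   // E = V * D
        double seconds = effort / 18.0;                        // T = E / 18
        double bugs = Math.pow(effort, 2.0 / 3.0) / 3000.0;    // B = E^(2/3) / 3000

        System.out.printf("V=%.0f D=%.0f E=%.0f T=%.0fs B=%.2f%n",
                volume, difficulty, effort, seconds, bugs);
    }
}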
Halstead metrics are sensitive to...
What would be your answer?
Syntactic sugar:
Solution: normalization (see the code duplication slides)

i = i+1:       Total     Unique
  Operators    N1 = 2    n1 = 2
  Operands     N2 = 3    n2 = 2

i++:           Total     Unique
  Operators    N1 = 1    n1 = 1
  Operands     N2 = 1    n2 = 1
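A minimal sketch of the normalization idea (my own illustration, not the normalizer from the code duplication slides): rewriting the shorthand before counting makes i++ and i = i+1 yield the same counts.

// Sketch: normalize "x++" / "x--" to "x = x + 1" / "x = x - 1" before counting.
// A real normalizer would work on the parse tree, not on raw text.
public class Normalize {
    static String normalize(String stmt) {
        return stmt
                .replaceAll("(\\w+)\\+\\+", "$1 = $1 + 1")
                .replaceAll("(\\w+)--", "$1 = $1 - 1");
    }

    public static void main(String[] args) {
        System.out.println(normalize("i++"));       // i = i + 1
        System.out.println(normalize("i = i+1"));   // unchanged
    }
}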
Structural complexity
Structural complexity:
Control flow
Data flow
Modularity
Commonly represented as graphs
Graph-based metrics:
Number of vertices
Number of edges
Maximal path length (depth)
McCabe complexity (1976)
In general
v(G) = #edges - #vertices + 2
For control flow graphs
v(G) = #binaryDecisions + 1, or
v(G) = #IFs + #LOOPs + 1
Number of linearly independent paths in the control flow graph
A.k.a. cyclomatic complexity
Each path should be tested!
v(G) is a testability metric
Boundaries:
v(function) ≤ 15
v(file) ≤ 100
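A rough sketch (my own, not a real tool) of the v(G) = #IFs + #LOOPs + 1 approximation by counting decision keywords in the source text; a real tool would build the control flow graph.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Rough approximation of cyclomatic complexity: v(G) = #IFs + #LOOPs + 1.
// Counts decision keywords textually; ignores comments, strings, &&/||, switch cases.
public class McCabeApprox {
    private static final Pattern DECISION = Pattern.compile("\\b(if|for|while)\\b");

    static int cyclomatic(String source) {
        Matcher m = DECISION.matcher(source);
        int decisions = 0;
        while (m.find()) decisions++;
        return decisions + 1;
    }

    public static void main(String[] args) {
        String sort = "if ( n < 2 ) return; for (...) { for (...) { if (a[i] > a[j]) {...} } }";
        System.out.println(cyclomatic(sort));   // 2 IFs + 2 LOOPs + 1 = 5
    }
}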
McCabe complexity: Example
void sort ( int *a, int n ) {
int i, j, t;
if ( n < 2 ) return;
for ( i=0 ; i < n-1; i++ ) {
for ( j=i+1 ; j < n ; j++ ) {
if ( a[i] > a[j] ) {
t = a[i];
a[i] = a[j];
a[j] = t;
}
}
}
}
Count IFs and LOOPs:
IFs: 2, LOOPs: 2
v(G) = 2 + 2 + 1 = 5
Structural complexity
Question to you
Is it possible that McCabe's complexity is higher than the number of possible execution paths in the program?
Lower than this number?
McCabe's complexity in the Linux kernel [Israeli, Feitelson 2010]
Multiple versions and variants:
Production (blue dashed)
Development (red)
Current 2.6 (green)
McCabe's complexity in Mozilla [Røsdal 2005]
Most of the modules have low cyclomatic complexity
Complexity of the system seems to stabilize
Summarizing: Maintainability index (MI) [Coleman, Oman 1994]
MI1 = 171 - 5.2 ln(V) - 0.23 v(G) - 16.2 ln(LOC)
MI2 = MI1 + 50 sin(sqrt(2.46 * perCM))
where V = Halstead volume, v(G) = McCabe complexity, LOC = lines of code, perCM = % comments
MI2 can be used only if comments are meaningful
If more than one module is considered, use average values for each of the parameters
Parameters were estimated by fitting to expert evaluation
BUT: only on a few, not very big, systems!
Scale marks: 85, 65, 0 (above 85: good maintainability; 65 to 85: moderate; below 65: poor)
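A minimal sketch (names are mine) of both MI variants as defined above; the inputs are assumed to be already computed, and averaged over modules where needed.

// Sketch: maintainability index as defined on this slide.
// v = Halstead volume, vg = McCabe complexity, loc = lines of code,
// perCM = percentage of comments (0..100).
public class MaintainabilityIndex {
    static double mi1(double v, double vg, double loc) {
        return 171 - 5.2 * Math.log(v) - 0.23 * vg - 16.2 * Math.log(loc);
    }

    static double mi2(double v, double vg, double loc, double perCM) {
        return mi1(v, vg, loc) + 50 * Math.sin(Math.sqrt(2.46 * perCM));
    }

    public static void main(String[] args) {
        // Values of the running sort example from the next slide.
        System.out.printf("MI1 = %.0f%n", mi1(392, 5, 14));   // about 96
    }
}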
McCabe complexity: Example
void sort ( int *a, int n ) {
int i, j, t;
if ( n < 2 ) return;
for ( i=0 ; i < n-1; i++ ) {
for ( j=i+1 ; j < n ; j++ ) {
if ( a[i] > a[j] ) {
t = a[i];
a[i] = a[j];
a[j] = t;
}
}
}
}
Halstead's V ≈ 392
McCabe's v(G) = 5
LOC = 14
MI1 ≈ 96
Easy to maintain!
Comments?
The comment term 50 sin(sqrt(2.46 * perCM)) peaks where sqrt(2.46 * perCM) = π/2 + 2kπ:
at perCM ≈ 25% (OK), but also at 1% and 81% - ???
[Liso 2001]
Better: 50 sin(sqrt(K * perCM)) with 0.12 ≤ K ≤ 0.2
Another alternative: percentage as a fraction in [0; 1] [Thomas 2008, Ph.D. thesis]
The more comments the better?
[Plot: contribution of the comment term to MI (0 to 50) against the percentage of comments (0.0 to 1.0)]
Evolution of the maintainability index in Linux
Size, Halstead volume and McCabe complexity decrease
% comments decreases as well
BUT they use the [0; 1] definition, so the impact is limited
[Israeli, Feitelson 2010]
What about modularity?
Squares are modules, lines are calls,
ends of the lines are functions.
Which design is better?
Design A vs. Design B
Cohesion: calls inside the module
Coupling: calls between the modules
            A    B
Cohesion    Lo   Hi
Coupling    Hi   Lo
Do you still remember?
Many intra-package dependencies: high cohesion
Few inter-package dependencies: low coupling
Joint measure
A_i = mu_i / N_i^2   or   A_i = mu_i / (N_i (N_i - 1))
E_i,j = eps_i,j / (2 N_i N_j)
MQ = (1/k) * sum_{i=1..k} A_i - (2 / (k (k - 1))) * sum_{i<j} E_i,j
k = number of packages, N_i = number of units in package i,
mu_i = number of intra-package dependencies of package i,
eps_i,j = number of dependencies between packages i and j
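A small sketch (my own data structures, with hypothetical inputs) of MQ using the A_i = mu_i / N_i^2 variant above:

// Sketch: MQ = (1/k) * sum A_i - (2/(k(k-1))) * sum_{i<j} E_ij,
// with A_i = mu_i / N_i^2 and E_ij = eps_ij / (2 * N_i * N_j).
public class ModularizationQuality {
    static double mq(int[] units, int[] intraDeps, int[][] interDeps) {
        int k = units.length;   // number of packages
        double cohesion = 0;
        for (int i = 0; i < k; i++)
            cohesion += (double) intraDeps[i] / (units[i] * units[i]);              // A_i
        double coupling = 0;
        for (int i = 0; i < k; i++)
            for (int j = i + 1; j < k; j++)
                coupling += (double) interDeps[i][j] / (2.0 * units[i] * units[j]); // E_ij
        double penalty = k > 1 ? 2.0 / (k * (k - 1.0)) * coupling : 0;
        return cohesion / k - penalty;
    }

    public static void main(String[] args) {
        int[] units = {4, 3};              // N_i: units per package (hypothetical)
        int[] intra = {5, 3};              // mu_i: intra-package dependencies
        int[][] inter = {{0, 2}, {2, 0}};  // eps_ij: inter-package dependencies
        System.out.printf("MQ = %.2f%n", mq(units, intra, inter));
    }
}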
Modularity metrics: Fan-in and Fan-out
Fan-in of M: number of modules calling functions in M
Fan-out of M: number of modules called by M
Modules with fan-in = 0: what are these modules?
Dead code
Outside of the system boundaries
Approximation of the call relation is imprecise
Henry and Kafura's information flow complexity [HK 1981]
Fan-in and fan-out can be defined for procedures
HK: take global data structures into account: read for fan-in, write for fan-out
Henry and Kafura: procedure as a HW component connecting inputs to outputs
hk = sloc * (fanin * fanout)^2
Shepperd: s = (fanin * fanout)^2
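A minimal sketch (my own, with a hypothetical call relation) of fan-in, fan-out and the two formulas above; the global-variable reads/writes that HK also count are omitted for brevity.

import java.util.List;
import java.util.Map;

// Sketch: fan-in/fan-out from a caller -> callees relation, plus
// hk = sloc * (fanin * fanout)^2 and Shepperd's s = (fanin * fanout)^2.
public class InformationFlow {
    static long fanOut(Map<String, List<String>> calls, String p) {
        return calls.getOrDefault(p, List.of()).stream().distinct().count();
    }

    static long fanIn(Map<String, List<String>> calls, String p) {
        return calls.entrySet().stream().filter(e -> e.getValue().contains(p)).count();
    }

    static long shepperd(long fanIn, long fanOut) {
        long flow = fanIn * fanOut;
        return flow * flow;                        // (fanin * fanout)^2
    }

    static long henryKafura(long sloc, long fanIn, long fanOut) {
        return sloc * shepperd(fanIn, fanOut);     // sloc * (fanin * fanout)^2
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = Map.of(  // hypothetical call relation
                "parse",  List.of("lex", "report"),
                "check",  List.of("report"),
                "report", List.of("format"),
                "format", List.of(),
                "lex",    List.of());
        long in = fanIn(calls, "report"), out = fanOut(calls, "report");
        // report: fan-in 2, fan-out 1, sloc 30 -> 30 * (2*1)^2 = 120
        System.out.println(henryKafura(30, in, out));
    }
}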
Information flow complexity of Unix procedures
Solid: #procedures within the complexity range
Dashed: #changed procedures within the complexity range
Highly complex procedures are difficult to change, but they are changed often!
Complexity comes from the most complex procedures
[Histogram: frequency of procedures vs. Henry-Kafura complexity, 1e+00 to 1e+06 (log scale)]
Evolution of the information flow complexity
Mozilla, Shepperd version
Above: the metric over all modules
Below: the 3 largest modules
What does this tell us?
Summary so far
Complexity metrics:
Halstead's effort
McCabe (cyclomatic)
Henry-Kafura / Shepperd (information flow)
Are these related? And what about bugs?
[Henry, Kafura, Harris 1981] 165 Unix procedures
Correlations:
            Bugs    Halstead   McCabe
Halstead    0.95
McCabe      0.96    0.89
HK          0.84    0.36       0.38
What does this tell us?
From imperative to OO
All metrics so far were designed for imperative languages
Applicable for OO on the method level
Also:
Number of files → number of classes/packages
Fan-in → afferent coupling (Ca)
Fan-out → efferent coupling (Ce)
But they do not reflect OO-specific complexity: inheritance, class fields, abstractness, ...
Popular metric sets: Chidamber and Kemerer, Li and Henry, Lorenz and Kidd, Abreu, Martin
Chidamber and Kemerer
WMC: weighted methods per class
Sum of metrics(m) for all methods m in class C
DIT: depth of inheritance tree
Count up to java.lang.Object? Include library classes?
NOC: number of children
Direct descendants
CBO: coupling between object classes
A is coupled to B if A uses methods/fields of B
CBO(A) = |{B : A is coupled to B}|
RFC: #methods that can be executed in response to a message received by an object of the class
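As an illustration only (a toy hierarchy of my own, not from the slides), some of these values annotated as comments; the main method shows that WMC with unit weights is just the number of declared methods.

// Toy hierarchy annotated with some CK values.
public class CkExample {
    static class Shape {                  // NOC(Shape) = 2 (Circle, Square)
        double area() { return 0; }       // WMC/unity(Shape) = 1
    }
    static class Circle extends Shape {   // DIT(Circle) = DIT(Shape) + 1
        double r;
        @Override double area() { return Math.PI * r * r; }
        double diameter() { return 2 * r; }    // WMC/unity(Circle) = 2
    }
    static class Square extends Shape {
        double side;
        @Override double area() { return side * side; }
    }
    static class Report {                 // CBO(Report) = 1: it uses Shape's methods
        String describe(Shape s) { return "area = " + s.area(); }
    }

    public static void main(String[] args) {
        // WMC with unit weights = number of declared methods (via reflection).
        System.out.println(Circle.class.getDeclaredMethods().length);   // 2
    }
}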
Chidamber and Kemerer
WMC: weighted methods per class
Sum of metrics(m) for all methods m in class C
Popular choices of metrics(m): McCabe's complexity and unity
WMC/unity = number of methods
Statistically significant correlation with the number of defects
WMC/unity distributions:
Dark: Basili et al.
Light: Gyimóthy et al. [Mozilla 1.6]
Red: high-quality NASA system
Chidamber and Kemerer
WMC: weighted methods per class
Sum of metrics(m) for all methods m in class C
Popular choices of metrics(m): McCabe's complexity and unity
WMC/unity = number of methods
Statistically significant correlation with the number of defects
Average WMC/unity [Gyimóthy et al.]
Depth of inheritance - DIT
Variants: where to start and which classes to include?
Example: java.lang.Object ← java.awt.Component ← java.awt.Container ← java.awt.Window ← java.awt.Frame ← JFrame ← MyFrame
DIT(MyFrame) = 1 if JFrame is a library class and library classes are excluded
DIT(MyFrame) = 2 if JFrame, a library class, is included
DIT(MyFrame) = 7 if the whole chain up to java.lang.Object is counted
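A small sketch (my own) of the counting issue: walking getSuperclass() via reflection counts the whole chain including library classes, which yields the value 7 above.

import javax.swing.JFrame;

// Sketch: DIT by walking the superclass chain via reflection.
public class DitExample {
    static class MyFrame extends JFrame { }

    // Number of classes in the chain from c up to and including java.lang.Object.
    static int ditIncludingLibraries(Class<?> c) {
        int depth = 0;
        for (Class<?> k = c; k != null; k = k.getSuperclass()) depth++;
        return depth;
    }

    public static void main(String[] args) {
        // MyFrame -> JFrame -> Frame -> Window -> Container -> Component -> Object
        System.out.println(ditIncludingLibraries(MyFrame.class));   // 7
    }
}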
DIT: what is good and what is bad?
Three NASA systems
What can you say about the use of inheritance in systems A, B and C?
Observation: quality assessment depends not just on one class but on the entire distribution
Average DIT in Mozilla
How can you explain the decreasing trend?
Other CK metrics
NOC: number of children
CBO: coupling between object classes
RFC: #methods that can be executed in response to a message received by an object of the class
More or less exponentially distributed
Significance of CK metrics to predict the number of faults
Modularity metrics: LCOM
LCOM: lack of cohesion of methods
Chidamber and Kemerer:
LCOM(C) = P - Q if P > Q, and 0 otherwise
where
P = #pairs of distinct methods in C that do not share variables
Q = #pairs of distinct methods in C that share variables
[BBM] 180 classes
Discriminative ability is insufficient
What about get/set?
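A minimal sketch (my own representation: each method mapped to the set of variables it uses) of the Chidamber-Kemerer P/Q definition above:

import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: LCOM(C) = P - Q if P > Q, else 0, where P/Q count method pairs
// that share no variable / at least one variable.
public class LcomCk {
    static int lcom(Map<String, Set<String>> usedVars) {
        List<Set<String>> methods = List.copyOf(usedVars.values());
        int p = 0, q = 0;
        for (int i = 0; i < methods.size(); i++)
            for (int j = i + 1; j < methods.size(); j++) {
                Set<String> shared = new HashSet<>(methods.get(i));
                shared.retainAll(methods.get(j));
                if (shared.isEmpty()) p++; else q++;
            }
        return Math.max(p - q, 0);
    }

    public static void main(String[] args) {
        System.out.println(lcom(Map.of(          // hypothetical class
                "getX",  Set.of("x"),
                "getY",  Set.of("y"),
                "reset", Set.of("x", "y"))));    // P = 1, Q = 2 -> LCOM = 0
    }
}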
First solution: LCOMN
Defined similarly to LCOM but allows negative values
LCOMN(C) = P - Q
[Distributions of LCOM and LCOMN]
Still...
Method × method tables
Light blue: Q, dark blue: P
Calculate the LCOMs
Does this correspond to your intuition?
Henderson-Sellers, Constantine and Graham 1996
m = number of methods
v = number of variables (attributes)
m(Vi) = #methods that access Vi
LCOM = ( (1/v) * sum_{i=1..v} m(Vi) - m ) / (1 - m)
Cohesion is maximal: all methods access all variables, i.e. m(Vi) = m, and LCOM = 0
No cohesion: every method accesses a unique variable, i.e. m(Vi) = 1, and LCOM = 1
Can LCOM exceed 1?
LCOM > 1?
If some variables are not accessed at all, then m(Vi) = 0 for those variables, and
LCOM = ( (1/v) * sum_i m(Vi) - m ) / (1 - m) ≤ (0 - m) / (1 - m) = m / (m - 1) = 1 + 1/(m - 1)
Hence LCOM ≤ 2
LCOM is undefined for m = 1
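A minimal sketch (my own method-to-variables representation) of the Henderson-Sellers formula above; it reproduces the two boundary cases (LCOM = 0 and LCOM = 1).

import java.util.Collection;
import java.util.List;
import java.util.Set;

// Sketch: LCOM_HS = ( (1/v) * sum_i m(V_i) - m ) / (1 - m); undefined for m = 1.
public class LcomHs {
    static double lcom(Collection<Set<String>> methodVars, Set<String> allVars) {
        int m = methodVars.size();
        double avgAccess = allVars.stream()     // (1/v) * sum_i m(V_i)
                .mapToLong(v -> methodVars.stream().filter(mv -> mv.contains(v)).count())
                .average().orElse(0);
        return (m - avgAccess) / (m - 1.0);     // same value as (avg - m) / (1 - m)
    }

    public static void main(String[] args) {
        Set<String> vars = Set.of("x", "y");
        // maximal cohesion: every method accesses every variable, m(V_i) = m -> 0.0
        System.out.println(lcom(List.of(Set.of("x", "y"), Set.of("x", "y")), vars));
        // no cohesion: every method accesses its own variable, m(V_i) = 1 -> 1.0
        System.out.println(lcom(List.of(Set.of("x"), Set.of("y")), vars));
    }
}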
Evolution of LCOM [Henderson-Sellers et al.]
Project 6 (a commercial human resource system) suggests stabilization, but no similar conclusion can be made for the other projects
[Sato, Goldman, Kon 2007]
Shortcomings of LCOM [Henderson-Sellers]
Due to [Fernández, Peña 2006]
Method-variable diagrams: dark spot = access
[Three method × variable diagrams]
All three diagrams have LCOM = 0.67, yet one of them seems to be less cohesive than the other two!
Alternative [Hitz, Montazeri 1995]
LCOM as the number of strongly connected components in the following graph:
Vertices: methods
Edge between a and b if
a calls b, or
b calls a, or
a and b access the same variable
LCOM values:
0: no methods
1: cohesive component
2 or more: lack of cohesion
[Two method × variable diagrams]
Question: LCOM?
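Because the edge relation above is symmetric, ordinary connected components give the same count; a rough sketch (my own representation: per method, the methods it calls and the variables it accesses) counting them:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: LCOM_HM = number of connected components of the graph whose vertices are methods,
// with an edge when one method calls the other or both access a common variable.
public class LcomHm {
    static int lcom(Map<String, Set<String>> calls, Map<String, Set<String>> vars) {
        List<String> methods = List.copyOf(vars.keySet());
        Set<String> seen = new HashSet<>();
        int components = 0;
        for (String start : methods) {
            if (!seen.add(start)) continue;
            components++;
            Deque<String> todo = new ArrayDeque<>(List.of(start));   // flood fill
            while (!todo.isEmpty()) {
                String a = todo.pop();
                for (String b : methods)
                    if (!seen.contains(b) && connected(a, b, calls, vars)) {
                        seen.add(b);
                        todo.push(b);
                    }
            }
        }
        return components;
    }

    static boolean connected(String a, String b,
                             Map<String, Set<String>> calls, Map<String, Set<String>> vars) {
        if (calls.getOrDefault(a, Set.of()).contains(b)) return true;   // a calls b
        if (calls.getOrDefault(b, Set.of()).contains(a)) return true;   // b calls a
        Set<String> shared = new HashSet<>(vars.get(a));
        shared.retainAll(vars.get(b));
        return !shared.isEmpty();                                       // common variable
    }

    public static void main(String[] args) {
        Map<String, Set<String>> vars = Map.of(     // hypothetical class
                "getX", Set.of("x"), "setX", Set.of("x"), "log", Set.of("msg"));
        Map<String, Set<String>> calls = Map.of();  // no calls in this toy class
        System.out.println(lcom(calls, vars));      // 2 components: {getX, setX} and {log}
    }
}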
Experimental evaluation of LCOM variants
[Cox, Etzkorn, Hughes 2006] Correlation with expert assessment:
                     Group 1            Group 2
Chidamber-Kemerer    -0.43 (p = 0.12)   -0.57 (p = 0.08)
Henderson-Sellers    -0.44 (p = 0.12)   -0.46 (p = 0.18)
Hitz-Montazeri       -0.47 (p = 0.06)   -0.53 (p = 0.08)
[Etzkorn, Gholston, Fortune, Stein, Utley, Farrington, Cox] Correlation with expert assessment:
                     Group 1              Group 2
Chidamber-Kemerer    -0.46 (rating 5/8)   -0.73 (rating 1.5/8)
Henderson-Sellers    -0.44 (rating 7/8)   -0.45 (rating 7/8)
Hitz-Montazeri       -0.51 (rating 2/8)   -0.54 (rating 5/8)
LCC and TCC [Bieman, Kang 1994]
Recall: LCOM_HM has an edge when a and b access the same variable
What if a calls a', b calls b', and a' and b' access the same variable?
Metrics:
NDP: number of pairs of methods directly accessing the same variable
NIP: number of pairs of methods directly or indirectly accessing the same variable
NP: number of pairs of methods, n(n-1)/2
Tight class cohesion: TCC = NDP/NP
Loose class cohesion: LCC = NIP/NP
NB: constructors and destructors are excluded
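A minimal sketch (my own representation: per method, the variables it directly accesses; indirect connection is taken here as the transitive closure of direct variable sharing, and constructors are assumed to be filtered out beforehand):

import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: TCC = NDP/NP, LCC = NIP/NP.
// NDP: pairs of methods sharing a variable directly;
// NIP: pairs connected directly or via a chain of such sharings (transitive closure);
// NP : n(n-1)/2 pairs of methods.
public class TccLcc {
    public static void main(String[] args) {
        Map<String, Set<String>> vars = Map.of(     // hypothetical class
                "a", Set.of("x"),
                "b", Set.of("x", "y"),
                "c", Set.of("y"),
                "d", Set.of("z"));
        List<String> m = List.copyOf(vars.keySet());
        int n = m.size(), np = n * (n - 1) / 2;

        boolean[][] direct = new boolean[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                Set<String> shared = new HashSet<>(vars.get(m.get(i)));
                shared.retainAll(vars.get(m.get(j)));
                direct[i][j] = i != j && !shared.isEmpty();
            }

        // Transitive closure (Warshall) for indirect connections.
        boolean[][] reach = new boolean[n][n];
        for (int i = 0; i < n; i++) reach[i] = direct[i].clone();
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (reach[i][k] && reach[k][j]) reach[i][j] = true;

        int ndp = 0, nip = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) {
                if (direct[i][j]) ndp++;
                if (reach[i][j]) nip++;
            }
        System.out.printf("TCC = %d/%d, LCC = %d/%d%n", ndp, np, nip, np);   // 2/6 and 3/6
    }
}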
Experimental evaluation of LCC/TCC
[Etzkorn, Gholston, Fortune, Stein, Utley, Farrington, Cox] Correlation with expert assessment:
                     Group 1              Group 2
Chidamber-Kemerer    -0.46 (rating 5/8)   -0.73 (rating 1.5/8)
Henderson-Sellers    -0.44 (rating 7/8)   -0.45 (rating 7/8)
Hitz-Montazeri       -0.51 (rating 2/8)   -0.54 (rating 5/8)
TCC                  -0.22 (rating 8/8)   -0.057 (rating 8/8)
LCC                  -0.54 (rating 1/8)   -0.73 (rating 1.5/8)
Conclusions: Metrics so far
Level      Metrics
Method     LOC, McCabe, Henry-Kafura
Class      WMC, NOC, DIT, LCOM (and variants), LCC/TCC
Packages   ???
Next time:
Package-level metrics (Martin)
Metrics of change