A Complexity Measure


IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-2, NO. 4, DECEMBER 1976
THOMAS J. MCCABE

Abstract-This paper describes a graph-theoretic complexity measure and illustrates how it can be used to manage and control program complexity. The paper first explains how the graph-theory concepts apply and gives an intuitive explanation of the graph concepts in programming terms. The control graphs of several actual Fortran programs are then presented to illustrate the correlation between intuitive complexity and the graph-theoretic complexity. Several properties of the graph-theoretic complexity are then proved which show, for example, that complexity is independent of physical size (adding or subtracting functional statements leaves complexity unchanged) and complexity depends only on the decision structure of a program.

The issue of using nonstructured control flow is also discussed. A characterization of nonstructured control graphs is given and a method of measuring the "structuredness" of a program is developed. The relationship between structure and reducibility is illustrated with several examples.

The last section of this paper deals with a testing methodology used in conjunction with the complexity measure; a testing strategy is defined that dictates that a program can either admit of a certain minimal testing level or the program can be structurally reduced.

Index Terms-Basis, complexity measure, control flow, decomposition, graph theory, independence, linear, modularization, programming, reduction, software, testing.

Manuscript received April 10, 1976.
The author is with the Department of Defense, National Security Agency, Ft. Meade, MD 20755.

I. INTRODUCTION

There is a critical question facing software engineering today: How to modularize a software system so the resulting modules are both testable and maintainable? That the issues of testability and maintainability are important is borne out by the fact that we often spend half of the development time in testing [2] and can spend most of our dollars maintaining systems [3]. What is needed is a mathematical technique that will provide a quantitative basis for modularization and allow us to identify software modules that will be difficult to test or maintain. This paper reports on an effort to develop such a mathematical technique which is based on program control flow.

One currently used practice that attempts to ensure a reasonable modularization is to limit programs by physical size (e.g., IBM: 50 lines, TRW: 2 pages). This technique is not adequate, which can be demonstrated by imagining a 50 line program consisting of 25 consecutive "IF THEN" constructs. Such a program could have as many as 33.5 million distinct control paths, only a small percentage of which would probably ever be tested. Many such examples of live Fortran programs that are physically small but untestable have been identified and analyzed by the tools described in this paper.

II. A COMPLEXITY MEASURE

In this section a mathematical technique for program modularization will be developed. A few definitions and theorems from graph theory will be needed, but several examples will be presented in order to illustrate the applications of the technique.

The complexity measure approach we will take is to measure and control the number of paths through a program. This approach, however, immediately raises the following nasty problem: "Any program with a backward branch potentially has an infinite number of paths." Although it is possible to define a set of algebraic expressions that give the total number of possible paths through a (structured) program,¹ using the total number of paths has been found to be impractical. Because of this the complexity measure developed here is defined in terms of basic paths that, when taken in combination, will generate every possible path.

¹ See the Appendix.

The following mathematical preliminaries will be needed, all of which can be found in Berge [1].

Definition 1: The cyclomatic number v(G) of a graph G with n vertices, e edges, and p connected components is

v(G) = e - n + p.

Theorem 1: In a strongly connected graph G, the cyclomatic number is equal to the maximum number of linearly independent circuits.

The applications of the above theorem will be made as follows: Given a program we will associate with it a directed graph that has unique entry and exit nodes. Each node in the graph corresponds to a block of code in the program where the flow is sequential and the arcs correspond to branches taken in the program. This graph is classically known as the program control graph (see Ledgard [6]) and it is assumed that each node can be reached by the entry node and each node can reach the exit node. For example, the following is a program control graph with entry node "a" and exit node "f."

[Figure: program control graph G with entry node a and exit node f]
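As a concrete check of Definition 1 and Theorem 1 on the example graph G, the following is a minimal Python sketch. The edge list is an assumption: the figure is not available here, so the endpoints are inferred from the circuits and the ten-entry edge vectors used in the remainder of this section, with edge 10 being the imagined branch from f back to a. The helper names (`path_vector`, `rank`) are hypothetical and not part of the paper.

```python
from fractions import Fraction

# ASSUMPTION: edge endpoints inferred from the circuits and edge vectors of
# Section II. Edge 10 (f -> a) is the added branch that makes G strongly connected.
EDGES = {
    1: ("a", "b"), 2: ("a", "c"), 3: ("a", "d"),
    4: ("b", "e"), 5: ("e", "b"), 6: ("c", "f"),
    7: ("d", "c"), 8: ("e", "f"), 9: ("e", "a"),
    10: ("f", "a"),
}
INDEX = {pair: number for number, pair in EDGES.items()}


def path_vector(path):
    """Count how often each numbered edge is traversed along a node string."""
    vec = [0] * len(EDGES)
    for u, w in zip(path, path[1:]):
        vec[INDEX[(u, w)] - 1] += 1
    return vec


def rank(rows):
    """Rank of an integer matrix by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


nodes = {n for pair in EDGES.values() for n in pair}
v_control = 9 - len(nodes) + 2 * 1           # e - n + 2p on the control graph
v_strong = len(EDGES) - len(nodes) + 1       # e - n + p once edge 10 is added

B1 = ["abefa", "beb", "abea", "acfa", "adcfa"]
basis = [path_vector(c) for c in B1]

# (abeabebebef) closed by f -> a should equal (abefa) + 2(beb) + (abea).
target = path_vector("abeabebebefa")
combo = [x + 2 * y + z for x, y, z in zip(basis[0], basis[1], basis[2])]
```

Under these assumptions both formulas give 5, the five circuits of B1 have rank 5, and `combo` equals `target`, which is the vector-addition argument made in the text.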

Authorized licensed use limited to: University of London: Online Library. Downloaded on May 14,2023 at 10:53:09 UTC from IEEE Xplore. Restrictions apply.

Theorem 1 is applied to G in the following way. Imagine that the exit node (f) branches back to the entry node (a). The control graph G is now strongly connected (there is a path joining any pair of arbitrary distinct vertices) so Theorem 1 applies. Therefore, the maximum number of linearly independent circuits in G is 9 - 6 + 2 = 5. For example, one could choose the following 5 independent circuits in G:

B1: (abefa), (beb), (abea), (acfa), (adcfa).

It follows that B1 forms a basis for the set of all circuits in G and any path through G can be expressed as a linear combination of circuits from B1. For instance, the path (abeabebebef) is expressible as (abea) + 2(beb) + (abefa). To see how this works it is necessary to number the edges on G as in

[Figure: control graph G with its edges numbered 1 through 10]

Now for each member of the basis B1 associate a vector as follows:

           1  2  3  4  5  6  7  8  9  10
(abefa)    1  0  0  1  0  0  0  1  0  1
(beb)      0  0  0  1  1  0  0  0  0  0
(abea)     1  0  0  1  0  0  0  0  1  0
(acfa)     0  1  0  0  0  1  0  0  0  1
(adcfa)    0  0  1  0  0  1  1  0  0  1

The path (abea(be)^3 fa) corresponds to the vector 2 0 0 4 2 0 0 1 1 1 and the vector addition of (abefa), 2(beb), and (abea) yields the desired result.

In using Theorem 1 one can choose a basis set of circuits that correspond to paths through the program. The set B2 is a basis of program paths.

B2: (abef), (abeabef), (abebef), (acf), (adcf).

Linear combination of paths in B2 will also generate any path. For example,

(abea(be)^3 f) = 2(abebef) - (abef)

and

(a(be)^2 abef) = (a(be)^2 f) + (abeabef) - (abef).

The overall strategy will be to measure the complexity of a program by computing the number of linearly independent paths v(G), control the "size" of programs by setting an upper limit to v(G) (instead of using just physical size), and use the cyclomatic complexity as the basis for a testing methodology.

A few simple examples may help to illustrate. Below are the control graphs of the usual constructs used in structured programming and their respective complexities.

CONTROL STRUCTURE    CYCLOMATIC COMPLEXITY*  v = e - n + 2p
SEQUENCE             v = 1 - 2 + 2 = 1
IF THEN ELSE         v = 4 - 4 + 2 = 2
WHILE                v = 3 - 3 + 2 = 2
UNTIL                v = 3 - 3 + 2 = 2

* The role of the variable p will be explained in Section IV. For these examples assume p = 1.

Notice that the sequence of an arbitrary number of nodes always has unit complexity and that cyclomatic complexity conforms to our intuitive notion of "minimum number of paths." Several properties of cyclomatic complexity are stated below:

1) v(G) ≥ 1.
2) v(G) is the maximum number of linearly independent paths in G; it is the size of a basis set.
3) Inserting or deleting functional statements to G does not affect v(G).
4) G has only one path if and only if v(G) = 1.
5) Inserting a new edge in G increases v(G) by unity.
6) v(G) depends only on the decision structure of G.

III. WORKING EXPERIENCE WITH THE COMPLEXITY MEASURE

In this section a system which automates the complexity measure will be described. The control structures of several PDP-10 Fortran programs and their corresponding complexity measures will be illustrated.

To aid the author's research into control structure complexity a tool was built to run on a PDP-10 that analyzes the structure of Fortran programs. The tool, FLOW, was written in APL to input the source code from Fortran files on disk. FLOW would then break a Fortran job into distinct subroutines and analyze the control structure of each subroutine. It does this by breaking the Fortran subroutines into blocks that are delimited by statements that affect control flow: IF, GOTO, referenced LABELS, DO, etc. The flow between the blocks is then represented in an n by n matrix (where n is the number of blocks), having a 1 in the i-jth position if block i can branch to block j in 1 step. FLOW also produces the "blocked" listing of the original program, computes the cyclomatic complexity, and produces a reachability matrix (there is a 1 in the i-jth position if block i can branch to block j in any number of steps). An example of FLOW's output is shown below.

      IMPLICIT INTEGER(A-Z)
      COMMON / ALLOC / MEM(2048),LM,LU,LV,LW,LX,LY,LQ,LWEX,
     1 NCHARS,NWORDS
      DIMENSION MEMORY(2048),INHEAD(4),ITRANS(128)
      TYPE 1
1     FORMAT('DOMOLKI STRUCTURE FILE NAME?' $)
      NAMDML= S
      ACCEPT 2,NAMDML
2     FORMAT(A5)
      CALL ALCHAN(ICHAN)
      CALL IFILE(ICHAN,'DSK',NAMDML,'AT',Oo0)
      CALL READB(ICHAN,INHEAD,1?2,NREAD,$990,$990)
      NCHARS=INHEAD(1)
      NWORDS=INHEAD(2)


      NTCT= (NCHARS+ 7 ) "NWORDS
      LTOT=(NCHARS+5)*NWORDS
******** BLOCK NO. 1 ********************
      IF(LTOT.GT.2048) GO TO 900
******** BLOCK NO. 2 ********************
      CALL READB(ICHAN,MEMORY,LTOT,NREAD,$990,$990)
      LM=0
      LU=NCHARS*NWORDS+LM
      LV=NWORDS+LU
      LW=NWORDS+LV
      LX=NWORDS+LW
      LY=NWORDS+LX
      LQ=NWORDS+LY
      LWEX=NWORDS+LQ
******** BLOCK NO. 3 ********************
      DO 700 I=1,NWORDS
      MEMORY(LWEX+I)=(MEMORY(LW+I).OR.(MEMORY(LW+I)*2))
700   CONTINUE
******** BLOCK NO. 4 ********************
      CALL EXTEXT(ITRANS)
      STOP
******** BLOCK NO. 5 ********************
900   TYPE 3,LTOT
3     FORMAT(' STRUCTURE TOO LARGE FOR CORE; ',I8,' WORDS'
     1 ' SEE COOPER' /)
      STOP
******** BLOCK NO. 6 ********************
990   TYPE 4
4     FORMAT(' READ ERROR, OR STRUCTURE FILE ERROR; '
     1 ' SEE COOPER' /)
      STOP
      END

[Figure: control graph, V(G) = 2]

[Figure: control graph, V(G) = 3]

CONNECTIVITY MATRIX

       1  2  3  4  5  6  7
  1    0  1  1  0  0  0  0
  2    0  0  0  0  0  1  0
  3    0  0  0  1  0  0  0
  4    0  0  0  1  1  0  0
  5    0  0  0  0  0  1  0
  6    0  0  0  0  0  0  1
  7    0  0  0  0  0  0  0

CYCLOMATIC COMPLEXITY = V(G) = 3

CLOSURE OF CONNECTIVITY MATRIX

       1  2  3  4  5  6  7
  1    0  1  1  1  1  1  1
  2    0  0  0  0  0  1  1
  3    0  0  0  1  1  1  1
  4    0  0  0  1  1  1  1
  5    0  0  0  0  0  1  1
  6    0  0  0  0  0  0  1
  7    0  0  0  0  0  0  0

END

[Figure: control graph, V(G) = 6]
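The two matrix outputs that FLOW is described as producing can be illustrated with a short Python sketch. It is not FLOW (which was written in APL); the function names are hypothetical, and the matrix literal is a reading of the partly illegible printed output, so it should be treated as an assumption. The closure is computed with Warshall's algorithm, a standard technique that the paper does not name.

```python
# ASSUMPTION: 7-block connectivity matrix as read from the printed FLOW output.
# Entry [i][j] is 1 when block i+1 can branch to block j+1 in one step.
CONNECTIVITY = [
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]


def cyclomatic_complexity(matrix, components=1):
    """v = e - n + 2p, with e the number of 1 entries and n the matrix order."""
    n = len(matrix)
    e = sum(sum(row) for row in matrix)
    return e - n + 2 * components


def closure(matrix):
    """Reachability in one or more steps (Warshall's algorithm)."""
    n = len(matrix)
    reach = [row[:] for row in matrix]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = 1
    return reach


v = cyclomatic_complexity(CONNECTIVITY)
reach = closure(CONNECTIVITY)
```

With this reading the matrix has 8 edges over 7 blocks, so v = 8 - 7 + 2 = 3, and the computed closure has block 1 reaching every other block while block 7 reaches none.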
At this point a few of the control graphs that were found in live programs will be presented. The actual control graphs from FLOW appear on a DATA DISK CRT but they are hand drawn here for purposes of illustration. The graphs are presented in increasing order of complexity in order to suggest the correlation between the complexity numbers and our intuitive notion of control flow complexity.


One of the more interesting aspects of the automatic approach is that although FLOW could be implemented much more efficiently in a compiler level language, it is still possible to go through a year's worth of a programmer's Fortran code in about 20 min. After seeing several of a programmer's control graphs on a CRT one can often recognize "style" by noting similar patterns in the graphs. For example, one programmer had an affinity for sequencing numerous simple loops as in

[Figure: control graph of a sequence of simple loops, V(G) = 10]


It was later revealed that these programs were eventually to run on a CDC6600 and the "tight" loops were designed to stay within the hardware stack.

These results have been used in an operational environment by advising project members to limit their software modules by cyclomatic complexity instead of physical size. The particular upper bound that has been used for cyclomatic complexity is 10 which seems like a reasonable, but not magical, upper limit. Programmers have been required to calculate complexity as they create software modules. When the complexity exceeded 10 they had to either recognize and modularize subfunctions or redo the software. The intention was to keep the "size" of the modules manageable and allow for testing all the independent paths (which will be elaborated upon in Section VII). The only situation in which this limit has seemed unreasonable is when a large number of independent cases followed a selection function (a large case statement), which was allowed.

It has been interesting to note how an individual programmer's style relates to the complexity measure. The author has been delighted to find several programmers who never had formal training in structured programming but consistently write code in the 3 to 7 complexity range which is quite well structured. On the other hand, FLOW has found several programmers who frequently wrote code in the 40 to 50 complexity range (and who claimed there was no other way to do it). On one occasion the author was given a DEC tape of 24 Fortran subroutines that were part of a large real-time graphics system. It was rather disquieting to find, in a system where reliability is critical, subroutines of the following complexity: 16, 17, 24, 24, 32, 34, 41, 54, 56, and 64. After confronting the project members with these results the author was told that the subroutines on the DEC tape were chosen because they were troublesome and indeed a close correlation was found between the ranking of subroutines by complexity and a ranking by reliability (performed by the project members).

IV. DECOMPOSITION

The role of p in the complexity calculation v = e - n + 2p will now be explained. Recall in Definition 1 that p is the number of connected components. The way we defined a program control graph (unique entry and exit nodes, all nodes reachable from the entry, and the exit reachable from all nodes) would result in all control graphs having only one connected component. One could, however, imagine a main program M and two called subroutines A and B having a control structure shown below:

[Figure: control graphs of main program M and subroutines A and B]

Let us denote the total graph above with 3 connected components² as $M \cup A \cup B$. Now, since p = 3 we calculate complexity as

$v(M \cup A \cup B) = e - n + 2p = 13 - 13 + 2 \times 3 = 6.$

² A graph is connected if for every pair of vertices there is a chain going from one to the other. Given a vertex a, the set of vertices that can be connected to a, together with a itself, is a connected component.

This method with p ≠ 1 can be used to calculate the complexity of a collection of programs, particularly a hierarchical nest of subroutines as shown above.

Notice that $v(M \cup A \cup B) = v(M) + v(A) + v(B) = 6$. In general, the complexity of a collection C of control graphs with k connected components is equal to the summation of their complexities. To see this let $C_i$, $1 \le i \le k$, denote the k distinct connected components, and let $e_i$ and $n_i$ be the number of edges and nodes in the ith connected component. Then

$$v(C) = e - n + 2p = \sum_{i=1}^{k} e_i - \sum_{i=1}^{k} n_i + 2k = \sum_{i=1}^{k} (e_i - n_i + 2) = \sum_{i=1}^{k} v(C_i).$$

V. SIMPLIFICATION

Since the calculation v = e - n + 2p can be quite tedious for a programmer an effort has been made to simplify the complexity calculations (for single-component graphs). There are two results presented in this section: the first allows the complexity calculations to be done in terms of program syntactic constructs, the second permits an easier calculation from the graph form.

In [7] Mills proves the following: if the number of function, predicate, and collecting nodes in a structured program is $\theta$, $\pi$, and $\gamma$, respectively, and e is the number of edges, then

$e = 1 + \theta + 3\pi.$

Since for every predicate node there is exactly one collecting node and there are unique entry and exit nodes it follows that

$n = \theta + 2\pi + 2.$

Assuming p = 1 and substituting in v = e - n + 2 we get

$v = (1 + \theta + 3\pi) - (\theta + 2\pi + 2) + 2 = \pi + 1.$

This proves that the cyclomatic complexity of a structured program equals the number of predicates plus one.
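The additivity argument of Section IV and the identity v = π + 1 of Section V can both be checked numerically. The sketch below is illustrative only: the three small graphs are hypothetical stand-ins for M, A, and B (they are not the graphs of the paper's figure, which has 13 edges and 13 nodes), and the function names are assumptions.

```python
def count_components(nodes, edges):
    """Number of connected components, ignoring edge direction (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, w in edges:
        parent[find(u)] = find(w)
    return len({find(n) for n in nodes})


def complexity(nodes, edges):
    """v = e - n + 2p for a graph that may have several components."""
    return len(edges) - len(nodes) + 2 * count_components(nodes, edges)


# HYPOTHETICAL graphs: an IF THEN ELSE, a WHILE loop, and a plain sequence.
M = (["m1", "m2", "m3", "m4"],
     [("m1", "m2"), ("m1", "m3"), ("m2", "m4"), ("m3", "m4")])
A = (["a1", "a2", "a3"], [("a1", "a2"), ("a2", "a1"), ("a1", "a3")])
B = (["b1", "b2"], [("b1", "b2")])

parts = [M, A, B]
all_nodes = [n for nodes, _ in parts for n in nodes]
all_edges = [e for _, edges in parts for e in edges]

total = complexity(all_nodes, all_edges)            # computed with p = 3
by_part = [complexity(nodes, edges) for nodes, edges in parts]


def mills_complexity(theta, pi):
    """Substitute Mills' counts e = 1 + theta + 3*pi, n = theta + 2*pi + 2."""
    e = 1 + theta + 3 * pi
    n = theta + 2 * pi + 2
    return e - n + 2
```

For these stand-in graphs the component complexities are 2, 2, and 1, and the union evaluated with p = 3 gives their sum, 5. `mills_complexity` returns π + 1 for any non-negative θ and π, since θ cancels in the substitution.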


For example, in

[Figure: control graph of a structured program with three predicates]

complexity $v(G) = \pi + 1 = 3 + 1 = 4$. Notice how in this case complexity can be computed by simply counting the number of predicates in the code and not having to deal with the control graph.

In practice compound predicates such as IF "C1 AND C2" THEN are treated as contributing two to complexity since without the connective AND we would have

IF C1 THEN IF C2 THEN

which has two predicates. For this reason and for testing purposes it has been found to be more convenient to count conditions instead of predicates when calculating complexity.³

³ For the CASE construct with N cases use N-1 for the number of conditions. Notice, once again, that a simulation of case with IF's will have N-1 conditions.

It has been proved that in general the complexity of any (unstructured) program is $\pi + 1$.

The second simplification of the calculation of e - n + 2p reduces the calculation to visual inspection of the control graph. We need Euler's formula which is as follows. If G is a connected plane graph with n vertices, e edges, and r regions, then

n - e + r = 2.

Just changing the order of the terms we get r = e - n + 2 so the number of regions is equal to the cyclomatic complexity. Given a program with a plane control graph one can therefore calculate v by counting regions, as in

[Figure: plane control graph G with its regions numbered, v(G) = 5]

VI. NONSTRUCTURED PROGRAMMING

The main thrust in the recent popularization of structured programming is to make programmers aware of a few syntactic constructs⁴ and tell them that a structured program is one written with only these constructs. One of the difficulties with this approach is it does not define for programmers what constructs they should not use, i.e., it does not tell them what a structured program is not. If the programming population had a notion of what constructs to avoid and they could see the inherent difficulty in these constructs, perhaps the notion of structuring programming would be more psychologically palatable. A clear definition of the constructs that structured programming excludes would also sensitize programmers to their use while programs are being created, which (if we believe in structured programming) would have a desirable effect.

⁴ The usual ones used (sometimes called D-structures) are

[Figure: control graphs of the D-structures]

One of the reasons that the author thinks this is important is that, as Knuth [4] points out, there is a time and a place when an unstructured goto is needed. The author has had a similar experience structuring Fortran jobs: there are a few very specific conditions when an unstructured construct works best. If it is the case that unstructured constructs should only be allowed under special circumstances, one needs then to distinguish between the programmer that makes judicious use of a few unstructured goto's as compared to the programmer that is addicted to them. What would help is first the definition of the unstructured components and second a measure of the structuredness of a program as well as the complexity of a program.

Rao Kosaraju [5] has a result which is related: a flow graph is reducible⁵ to a structured program if and only if it does not contain a loop with two or more exits. This is a deep result but not, however, what we need since many programs that are reducible to structured programs are not structured programs. In order to have programmers explicitly identify and avoid unstructured code we need a theorem that is analogous to a theorem like Kuratowski's theorem in graph theory. Kuratowski's theorem states that any nonplanar graph must contain at least one of two specific nonplanar graphs that he describes. The proof of nonplanarity of a graph is then reducible to locating two specific subgraphs whereas showing nonplanarity without Kuratowski's result is, in general, much more difficult.

⁵ Reducibility here means the same function is computed with the same actions and predicates although the control structure may differ.

The following four control structures were found to generate all nonstructured programs.

[Figure: the four nonstructured control graphs a), b), c), and d)]

A number of theorems and results will be stated below without proof.

⁶ Assuming the program does not contain unconditional GOTO's.

Result 1: A necessary and sufficient condition that a program⁶ is nonstructured (one that is not written with just


D-structures) is that it contains as a subgraph either a), b), or c).

The reason why graph d) was slighted in Result 1 is that any 3 of the 4 graphs will generate all the unstructured programs; this will be illustrated later. It is convenient to verbalize the graphs a)-d), respectively, as follows:

a) branching out of a loop;
b) branching into a loop;
c) branching into a decision; and
d) branching out of a decision.

The following version of Result 1 may seem more intuitively appealing.

A structured program can be written by not "branching out of or into a loop, or out of or into a decision."

The following result gives insight into why a nonstructured program's logic often becomes convoluted and entwined.

Result 2: A nonstructured program cannot be just a little nonstructured. That is, any nonstructured program must contain at least 2 of the graphs a)-d). Part of the proof of Result 2 will be shown here because it helps to illustrate how the control flow in a nonstructured program becomes entangled. We show, for an example, how graph b) cannot occur alone. Assuming we have graph b):

[Figure: graph b) with entry node E]

the entry node E occurs either before, after, or from a node independent of the loop. Each of these three cases will be treated separately.

Case 1: E is "before" the loop. E is on a path from entry to the loop so the program must have a graph as follows:

[Figure: Case 1 control graph containing graphs c) and b)]

Notice how E is a split node at the beginning of a decision that is branched into. So in this case we have a c) graph along with the original b) graph.

Case 2: E is "after" the loop. The control graph would appear as follows:

[Figure: Case 2 control graph containing graphs b) and a)]

Notice how a type a) graph must appear.

Case 3: E is independent of the loop. The control graph would look as follows:

[Figure: Case 3 control graph containing graphs b), c), and d)]

The graph c) must now be present with b). If there is another path that can go to a node after the loop from E then a type d) graph is also generated. Things are often this bad, and in fact much worse.

Similar arguments can be made for each of the other nonstructured graphs to show that a)-d) cannot occur alone. If one generates all the possible pairs from a)-d) it is interesting to note that they all reduce to 4 basic types:

[Figure: the four basic pairs (a, b), (a, d), (b, c), and (c, d)]

which leads us to the following result.

Result 3: A necessary and sufficient condition for a program to be nonstructured is that it contains at least one of: (a, b), (a, d), (b, c), (c, d). Result 4 is now obvious.

Result 4: The cyclomatic complexity of a nonstructured program is at least 3. It is interesting to notice that when the orientation is taken off the edges each of the 4 basic graphs a)-d) is isomorphic to the following nondirected graph.

[Figure: nondirected graph underlying graphs a)-d)]


Also if the graphs (a, b) through (c, d) have their directions taken off they are all isomorphic to:

[Figure: nondirected graph underlying the pairs (a, b) through (c, d)]

By examining the graphs (a, b) through (c, d) one can formulate a more elegant nonstructured characterization:

Result 5: A structured program can be written by not branching out of loops or into decisions; a) and c) provide a basis.

Result 6: A structured program can be written by not branching into loops or out of decisions; b) and d) provide a basis.

A way to measure the lack of structure in a program or flow graph will be briefly commented upon. One of the difficulties with the nonstructured graphs mentioned above is that there is no way they can be broken down into subgraphs with one entry and one exit. This is a severe limitation since one way in which program complexity can be controlled is to recognize when the cyclomatic complexity becomes too large, and then identify and remove subgraphs with unique entry and exit nodes.

Result 7: A structured program is reducible⁷ to a program of unit complexity.

⁷ Reduction is the process of removing subgraphs (subroutines) with unique entry and exit nodes.

The following example illustrates how a structured program can be reduced.

[Figure: successive reductions of a structured control graph, with v = 4, v = 3, v = 2, and v = 1]

Notice in the nonstructured graphs below, however, that such a reduction process is not possible.

[Figure: nonstructured graphs G1, G2, and G3, each with v = 6]

Let m be the number of proper subgraphs with unique entry and exit nodes. Notice in G1, G2, and G3 m is equal to 0, 1, and 2, respectively. The following definition of essential complexity ev is used to reflect the lack of structure.

Definition: ev = v - m.

For the above graphs we have ev(G1) = 6, ev(G2) = 5, and ev(G3) = 4. Notice how the essential complexity indicates the


extent to which a graph can be reduced: G1 cannot be reduced at all since its complexity is equal to its essential complexity. G2 and G3, however, can be reduced as follows:

[Figure: reduced graph G2', v(G2') = 5]

[Figure: reduced graph G3', v(G3') = 4]

This last result is stated for completeness.

Result 8: The essential complexity of a structured program is one.

VII. A TESTING METHODOLOGY

The complexity measure v is designed to conform to our intuitive notion of complexity and since we often spend as much as 50 percent of our time in test and debug mode the measure should correlate closely with the amount of work required to test a program. In this section the relationship between testing and cyclomatic complexity will be defined and a testing methodology will be developed.

Let us assume that a program P has been written, its complexity v has been calculated, and the number of paths tested is ac (actual complexity). If ac is less than v then one of the following conditions must be true:

1) there is more testing to be done (more paths to be tested);
2) the program flow graph can be reduced in complexity by v-ac (v-ac decisions can be taken out); or
3) portions of the program can be reduced to in line code (complexity has increased to conserve space).

Up to this point the complexity issue has been considered purely in terms of the structure of the control flow. This testing issue, however, is closely related to the data flow because it is the data behavior that either precludes or makes realizable the execution of any particular control path. A few simple examples may help to illustrate. Assume we start with the following flow graph:

[Figure: flow graph G from entry E through a1 or a2, decision b, and c1 or c2 to exit x]

Suppose that ac = 2 and the two tested paths are [E, a1, b, c2, x] and [E, a2, b, c1, x]. Then given that paths [E, a1, b, c1, x] and [E, a2, b, c2, x] cannot be executed we have ac < v so case 2 holds and G can be reduced by removing decision b as in

[Figure: reduced flow graph G1]

Notice how in G v ≠ ac and the complexity of G1 is less than the complexity of G.

In experience this approach is most helpful when programmers are required to document their flow graph and complexity and show explicitly the different paths tested. It is often the case when the actual number of paths tested is compared with the cyclomatic complexity that several additional paths are discovered that would normally be overlooked. It should be noted that v is only the minimal number of independent paths that should be tested. There are often additional paths to test. It should also be noted that this procedure (like any other testing method) will by no means guarantee or prove the software; all it can do is surface more bugs and improve the quality of the software.

Two more examples are presented without comment.

[Figure: flow graph G1 with v = 6, ac = 5; TESTS: a1b1, a2b2, c1b1, c2b2, c3b1; reduced graph with v = 5, ac = 5]


[Figure: flow graph G2 with v = 3, ac = 2; TESTS: acdfghik, acefgijabk; reduced graph G2' with v = 2, ac = 2]

APPENDIX

A method of computing the number of possible paths in a structured program will be briefly outlined. This method associates an algebraic expression C with each of the structured constructs and assumes that the complexity of a basic functional or replacement statement is one. The various syntactic constructs used in structured programming and their control flow and complexity expressions are shown below. The symbol a stands for the number of iterations in a loop.

[Table: CONSTRUCT, CONTROL FLOW, and C(CONSTRUCT) for SEQUENCE, IF THEN ELSE, WHILE, CASE, and DO UNTIL. Recoverable entries: SEQUENCE A; B has C(A) × C(B); IF A THEN B ELSE C has C(A) × [C(B) + C(C)].]

The program SEARCH below is used to illustrate. SEARCH performs a binary search for input parameter ITEM on a table T of length N. SEARCH sets F to 1 and J to ITEM's index within T if the search is successful; otherwise F is set to 0, indicating that ITEM is not in T.

PROCEDURE SEARCH (ITEM) INTEGER ITEM
BEGIN
  INTEGER L, H;
  F ← 0;
  L ← 0;
  H ← N;
  While H > L and F = 0 Do
    If T[J ← (H + L) DIV 2] ≠ item
    THEN
      If item < T[J]
      THEN
        H ← J - 1
      ELSE
        L ← J + 1
    ELSE
      F ← 1
END

The flow graph for SEARCH is

[Figure: flow graph for SEARCH]
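The four test cases that the Appendix goes on to derive for SEARCH (immediate exit from the loop, ITEM equal to, less than, and greater than T[J]) can be exercised with a small Python sketch. This is a conventional binary search over a zero-indexed list, not a transcription of the listing above; the `trace` return value and the function name are assumptions added purely to make the branch taken on each iteration visible.

```python
def search(table, item):
    """Binary search on a sorted list.

    Returns (f, j, trace): f is 1 when item is found and 0 otherwise, j is the
    index of item (or None), and trace records the branch taken per iteration.
    """
    low, high = 0, len(table) - 1
    trace = []
    while low <= high:
        mid = (low + high) // 2
        if table[mid] == item:
            trace.append("equal")
            return 1, mid, trace
        if item < table[mid]:
            trace.append("less")
            high = mid - 1
        else:
            trace.append("greater")
            low = mid + 1
    return 0, None, trace


# One call per basic path: loop skipped, equal, less, greater.
immediate_exit = search([], 5)
equal_case = search([1, 3, 5], 3)
less_case = search([1, 3, 5], 1)
greater_case = search([1, 3, 5], 5)
```

An empty table forces the immediate exit from the loop, and the other three calls each begin with a different one of the three ways through the loop body, which together give the four paths named in the text.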


The algebraic complexity C would be computed as

C(SEARCH) = ... = {1 + (3)^a}.

Assuming a to be at least 1, the lower bound for the expression {1 + (3)^a} is 4 which indicates there are at least 4 paths to be tested. The first test would be from the immediate exit from the WHILE loop which could be tested by choosing H less than L initially. The next three tests (the three ways through the body of the loop) correspond to cases where ITEM = T[J], ITEM < T[J], and ITEM > T[J].

REFERENCES

[1] C. Berge, Graphs and Hypergraphs. Amsterdam, The Netherlands: North-Holland, 1973.
[2] B. W. Boehm, "Software and its impact: A quantitative assessment," Datamation, vol. 19, pp. 48-59, May 1973.
[3] W. B. Cammack and H. J. Rogers, "Improving the programming process," IBM Tech. Rep. TR 00.2483, Oct. 1973.
[4] D. E. Knuth, "Structured programming with GO TO statements," Computing Surveys, vol. 6, pp. 261-301, Dec. 1974.
[5] R. Kosaraju, "Analysis of structured programs," J. Comput. Syst. Sci., vol. 9, pp. 232-255, Dec. 1974; also, Dep. Elec. Eng., The Johns Hopkins Univ., Baltimore, MD, Tech. Rep. 72-11, 1972.
[6] H. Ledgard and M. Marcotty, "A genealogy of control structures," Commun. Assoc. Comput. Mach., vol. 18, pp. 629-639, Nov. 1975.
[7] H. D. Mills, "Mathematical foundations for structured programming," Federal System Division, IBM Corp., Gaithersburg, MD, FSC 72-6012, 1972.

Thomas J. McCabe was born in Central Falls, RI, on November 28, 1941. He received the A.B. degree in mathematics from Providence College, Providence, RI and the M.S. degree in mathematics from the University of Connecticut, Storrs, in 1964 and 1966, respectively.
He has been employed since 1966 by the Department of Defense, National Security Agency, Ft. Meade, MD in various systems programming and programming management positions. He also, during a military leave, served as a Captain in the Army Security Agency engaged in large-scale compiler implementation and optimization. He has recently been active in software engineering and has developed and taught various software related courses for the Institute for Advanced Technology, the University of California, and Massachusetts State College System.
Mr. McCabe is a member of the American Mathematical Association.
