A Metric For Software Readability
Raymond P.L. Buse and Westley R. Weimer
ABSTRACT

In this paper, we explore the concept of code readability and investigate its relation to software quality. With data collected from human annotators, we derive associations between a simple set of local code features and human notions of readability. Using those features, we construct an automated readability measure and show that it can be 80% effective, and better than a human on average, at predicting readability judgments. Furthermore, we show that this metric correlates strongly with two traditional measures of software quality, code changes and defect reports. Finally, we discuss the implications of this study for programming language design and engineering practice. For example, our data suggests that comments, in and of themselves, are less important than simple blank lines to local judgments of readability.

Categories and Subject Descriptors

D.2.9 [Management]: Software quality assurance (SQA); D.2.8 [Software Engineering]: Metrics

General Terms

Measurement, Human Factors

Keywords

software readability, program understanding, machine learning, software maintenance, code metrics, FindBugs

1. INTRODUCTION

We define “readability” as a human judgment of how easy a text is to understand. The readability of a program is related to its maintainability, and is thus a critical factor in overall software quality. Typically, maintenance will consume over 70% of the total lifecycle cost of a software product [5]. Aggarwal claims that source code readability and documentation readability are both critical to the maintainability of a project [1]. Other researchers have noted that the act of reading code is the most time-consuming component of all maintenance activities [29, 36, 38]. Furthermore, maintaining software often means evolving software, and modifying existing code is a large part of modern software engineering [35]. Readability is so significant, in fact, that Elshoff and Marcotty proposed adding a development phase in which the program is made more readable [11]. Knight and Myers suggested that one phase of software inspection should be a check of the source code for readability [26]. Haneef proposed the addition of a dedicated readability and documentation group to the development team [19].

We hypothesize that everyone who has written code has some intuitive notion of this concept, and that program features such as indentation (e.g., as in Python [43]), choice of identifier names [37], and comments are likely to play a part. Dijkstra, for example, claimed that the readability of a program depends largely upon the simplicity of its sequencing control, and employed that notion to help motivate his top-down approach to system design [10].

We present a descriptive model of software readability based on simple features that can be extracted automatically from programs. This model of software readability correlates strongly both with human annotators and also with external notions of software quality, such as defect detectors and software changes.

To understand why an empirical and objective model of software readability is useful, consider the use of readability metrics in natural languages. The Flesch-Kincaid Grade Level [12], the Gunning-Fog Index [18], the SMOG Index [31], and the Automated Readability Index [24] are just a few examples of readability metrics that were developed for ordinary text. These metrics are all based on simple factors such as average syllables per word and average sentence length. Despite their relative simplicity, they have each been shown to be quite useful in practice. Flesch-Kincaid, which has been in use for over 50 years, has not only been integrated into popular text editors including Microsoft Word, but has also become a United States governmental standard. Agencies, including the Department of Defense, require many documents and forms, internal and external, to have a Flesch readability grade of 10 or below (DOD MIL-M-38784B). Defense contractors also are often required to use it when they write technical manuals.

These metrics, while far from perfect, can help organizations gain some confidence that their documents meet goals for readability very cheaply, and have become ubiquitous for that reason.
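To make the simplicity of such formulas concrete, consider the Flesch-Kincaid Grade Level, shown here only for comparison (its coefficients are the standard published ones, not something derived in this paper). It is commonly given as:

\[
\text{Grade} = 0.39\,\frac{\text{total words}}{\text{total sentences}} \;+\; 11.8\,\frac{\text{total syllables}}{\text{total words}} \;-\; 15.59
\]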
We believe that similar metrics, targeted specifically at source code and backed with empirical evidence for effectiveness, can serve an analogous purpose in the software domain.

It is important to note that readability is not the same as complexity, for which some existing metrics have been empirically shown useful [44]. Brooks claims that complexity is an “essential” property of software; it arises from system requirements, and cannot be abstracted away [13]. Readability, on the other hand, is purely “accidental.” In the Brooks model, software engineers can only hope to control accidental difficulties: coincidental readability can be addressed far more easily than intrinsic complexity.

While software complexity metrics typically take into account the size of classes and methods, and the extent of their interactions, the readability of code is based primarily on local, line-by-line factors. It is not related, for example, to the size of a piece of code. Furthermore, our notion of readability arises directly from the judgments of actual human annotators who need not be familiar with the purpose of the system. Complexity factors, on the other hand, may have little relation to what makes code understandable to humans. Previous work [34] has shown that attempting to correlate artificial code complexity metrics directly to defects is difficult, but not impossible. In this study, we have chosen to target readability directly both because it is a concept that is independently valuable, and also because developers have great control over it. We show in Section 4 that there is indeed a significant correlation between readability and quality.

The main contributions of this paper are:

• An automatic software readability metric based on local features. Our metric correlates strongly with both human annotators and also external notions of software quality.

• A survey of 120 human annotators on 100 code snippets that forms the basis for our metric. We are unaware of any published software readability study of comparable size (12,000 human judgments).

• A discussion of the features involved in that metric and their relation to software engineering and programming language design.

There are a number of possible uses for an automated readability metric. It may help developers to write more readable software by quickly identifying code that scores poorly. It can assist project managers in monitoring and maintaining readability. It can serve as a requirement for acceptance. It can even assist inspections by helping to target effort at parts of a program that may need improvement.

The structure of this paper is as follows. In Section 2 we investigate the extent to which our study group agrees on what readable code looks like, and in Section 3 we determine a small set of features that is sufficient to capture the notion of readability for a majority of annotators. In Section 4 we discuss the correlation between our readability metric and external notions of software quality. We discuss some of the implications of this work on programming language design in Section 5, place our work in context in Section 6, discuss possibilities for extension in Section 7, and conclude in Section 8.

2. HUMAN READABILITY ANNOTATION

A consensus exists that readability is an essential determining characteristic of code quality [1, 5, 10, 11, 19, 29, 34, 35, 36, 37, 38, 44], but there is no consensus about which factors contribute most to human notions of software readability. A previous study by Tenny looked at readability by testing comprehension of several versions of a program [42]. However, such an experiment is not sufficiently fine-grained to extract precise features. In that study, the code samples were large, and thus the perceived readability arose from a complex interaction of many features, potentially including the purpose of the code. In contrast, we choose to measure the software readability of small (7.7 lines on average) selections of code. Using many short code selections increases our ability to tease apart which features are most predictive of readability. We now describe an experiment designed to extract a large number of readability judgments over short code samples from a group of human annotators.

Formally, we can characterize software readability as a mapping from a code sample to a finite score domain. In this experiment, we present a sequence of short code selections, called snippets, through a web interface. Each annotator is asked to individually score each snippet based on a personal estimation of how readable it is. There are two important parameters to consider: snippet selection policy and score range.

2.1 Snippet Selection Policy

We claim that the readability of code is very different from that of natural languages. Code is highly structured and consists of elements serving different purposes, including design, documentation, and logic. These issues make the task of snippet selection an important concern. We have designed an automated policy-based tool that extracts snippets from Java programs.

First, snippets should be relatively short to aid feature discrimination. However, if snippets are too short, then they may obscure important readability considerations. Second, snippets should be logically coherent to give the annotators the best chance at appreciating their readability. We claim that they should not span multiple method bodies and that they should include adjacent comments that document the code in the snippet. Finally, we want to avoid generating snippets that are “trivial.” For example, the readability of a snippet consisting entirely of boilerplate import statements or entirely of comments is not particularly meaningful.

Consequently, an important tradeoff exists such that snippets must be as short as possible to adequately support analysis, yet must be long enough to allow humans to make significant judgments on them. Note that it is not our intention to “simulate” the reading process, where context may be important to understanding. Quite the contrary: we intend to eliminate context and complexity to a large extent and instead focus on the “low-level” details of readability. We do not imply that context is unimportant; we mean only to show that there exists a set of local features that, by themselves, have a strong impact on readability and, by extension, software quality.

With these considerations in mind, we restrict snippets for Java programs as follows. A snippet consists of precisely three consecutive simple statements [16], the most basic unit of a Java program. Simple statements include field declarations, assignments, function calls, breaks, continues, throws, and returns.
We find by experience that snippets with fewer such instructions are sometimes too short for a meaningful evaluation of readability, but that three statements are generally both adequate to cover a large set of local features and sufficient for a fine-grained feature-based analysis.

A snippet does include preceding or in-between lines that are not simple statements, such as comments, function headers, blank lines, or headers of compound statements like if-else, try-catch, while, switch, and for. Furthermore, we do not allow snippets to cross scope boundaries. That is, a snippet neither spans multiple methods nor starts inside a compound statement and then extends outside it. We find that with this set of policies, over 90% of statements in all of the programs we considered (see Figure 8) are candidates for incorporation in some snippet. The few non-candidate lines are usually found in functions that have fewer than three statements. This snippet definition is specific to Java but extends directly to similar languages like C and C++.
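The policy above is straightforward to mechanize. The following is a minimal sketch of such an extractor, not the authors' actual tool: it works on raw source lines and approximates “simple statement” and “scope boundary” with crude textual tests, whereas a real implementation would consult a Java parser.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch (not the authors' tool) of the snippet policy described
 * above: a snippet is three consecutive simple statements plus any comments,
 * blank lines, or compound-statement headers that appear before or between
 * them. "Simple statement" is approximated crudely as a line ending in ';';
 * a line ending in '}' is treated as a scope boundary.
 */
public class SnippetExtractor {

    /** Returns each snippet as one string of consecutive source lines. */
    public static List<String> extract(List<String> sourceLines) {
        List<String> snippets = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int statements = 0;

        for (String line : sourceLines) {
            String t = line.trim();
            if (t.endsWith("}")) {        // closing brace: never cross a scope boundary
                current.clear();
                statements = 0;
                continue;
            }
            current.add(line);            // comments, blank lines, and headers ride along
            if (t.endsWith(";")) {        // crude test for a simple statement
                statements++;
            }
            if (statements == 3) {        // emit a three-statement snippet
                snippets.add(String.join("\n", current));
                current.clear();
                statements = 0;
            }
        }
        return snippets;
    }
}
```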
Figure 2: Distribution of readability scores made by
120 human annotators on code snippets taken from
several open source projects (see Figure 8).
Figure 5: Annotator agreement with a model obtained by averaging the scores of 100 annotators, with the addition of our metric.

Average  Maximum  Feature Name
   X        X     line length (characters)
   X        X     identifiers
   X        X     identifier length
   X        X     indentation (preceding whitespace)
   X        X     keywords
   X        X     numbers
   X              comments
   X              periods
   X              commas
   X              spaces
   X              parenthesis
   X              arithmetic operators
   X              comparison operators
   X              assignments (=)
   X              branches (if)
   X              loops (for, while)
   X              blank lines
            X     occurrences of any single character
            X     occurrences of any single identifier
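To illustrate how local these features are, the sketch below computes a few of them (average line length, average identifiers per line, and maximum indentation) for a snippet. It is not the authors' implementation, and the identifier test is a deliberately rough regular expression that also matches keywords.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** A rough sketch of per-snippet feature extraction, not the authors' tool. */
public class SnippetFeatures {

    // Deliberately crude: any word starting with a letter or underscore.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    public static double averageLineLength(String[] lines) {
        double total = 0;
        for (String line : lines) total += line.length();
        return total / lines.length;
    }

    public static double averageIdentifiersPerLine(String[] lines) {
        double total = 0;
        for (String line : lines) {
            Matcher m = IDENTIFIER.matcher(line);
            while (m.find()) total++;
        }
        return total / lines.length;
    }

    public static int maxIndentation(String[] lines) {
        int max = 0;
        for (String line : lines) {
            int spaces = 0;
            while (spaces < line.length() && line.charAt(spaces) == ' ') spaces++;
            max = Math.max(max, spaces);
        }
        return max;
    }

    public static void main(String[] args) {
        String[] snippet = {
            "// update the running total",
            "int total = base + offset;",
            "count++;",
            "notifyListeners(total);"
        };
        System.out.printf("avg line length: %.1f%n", averageLineLength(snippet));
        System.out.printf("avg identifiers/line: %.1f%n", averageIdentifiersPerLine(snippet));
        System.out.printf("max indentation: %d%n", maxIndentation(snippet));
    }
}
```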
Figure 8: The open source Java programs used as benchmarks.

Project Name    Version  KLOC  Maturity  Description
JasperReports   2.04      269     6      Dynamic content
Hibernate*      2.1.8     189     6      Database
jFreeChart*     1.0.9     181     5      Data rep.
FreeCol*        0.7.3     167     3      Game
jEdit*          4.2       140     5      Text editor
Gantt Project   3.0       130     5      Scheduling
soapUI          2.0.1      98     6      Web services
Xholon          0.7        61     4      Simulation
Risk            1.0.9.2    34     4      Game
JSch            0.1.37     18     3      Security
jUnit*          4.4         7     5      Software dev.
jMencode        0.64        7     3      Video encoding
Figure 7: Relative power of features as determined by a singleton (one-feature-at-a-time) analysis. The direction of correlation for each is also indicated.

…in the group. Two interesting observations arise. First, for all groups except graduate students, our automatic metric agrees with the human average more closely than the humans agree with each other. We suspect that the difference with respect to graduates may be a reflection of the more diverse background of the graduate student population, their more sophisticated opinions, or some other external factor. Second, we see a gradual trend toward increased agreement with experience.

We investigated which features have the most predictive power by re-running our all-annotators analysis using only one feature at a time. The relative magnitude of the performance of the classifier is indicative of the comparative importance of each feature. Figure 7 shows the result of that analysis with the magnitudes normalized between zero and one.

We find, for example, that factors like ‘average line length’ and ‘average number of identifiers per line’ are very important to readability. Conversely, ‘average identifier length’ is not, in and of itself, a very predictive factor; neither are if constructs, loops, or comparison operators. Section 5 includes a discussion of some of the possible implications of this result.

We prefer this singleton feature analysis to a leave-one-out analysis (which judges feature power based on decreases in classifier performance) because the latter may be misleading due to significant feature overlap. Such overlap occurs when two or more features, though different, capture the same underlying phenomenon.
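A hedged sketch of this singleton analysis appears below. Classifier training and scoring are abstracted behind a hypothetical interface (in practice this could wrap, for example, a cross-validated Weka classifier); the names here are illustrative, not the authors' API.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Sketch of the singleton (one-feature-at-a-time) analysis described above.
 * The evaluator hook is hypothetical: it should train and score a model that
 * is allowed to see only the named feature.
 */
public class SingletonFeatureAnalysis {

    /** Hypothetical hook: build and score a model restricted to one feature. */
    public interface SingleFeatureEvaluator {
        double performance(String featureName);   // e.g., agreement with annotators
    }

    /** Returns each feature's relative power, normalized to [0, 1]. */
    public static Map<String, Double> relativePower(String[] features,
                                                    SingleFeatureEvaluator eval) {
        Map<String, Double> raw = new TreeMap<>();
        double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;
        for (String f : features) {
            double p = eval.performance(f);
            raw.put(f, p);
            min = Math.min(min, p);
            max = Math.max(max, p);
        }
        Map<String, Double> normalized = new TreeMap<>();
        double span = (max == min) ? 1.0 : (max - min);
        for (Map.Entry<String, Double> e : raw.entrySet()) {
            normalized.put(e.getKey(), (e.getValue() - min) / span);
        }
        return normalized;
    }
}
```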
4. CORRELATING READABILITY WITH SOFTWARE QUALITY

In the previous section we constructed an automated model of readability that mimics human judgments. We implemented our model in a tool that assesses the readability of programs using a fixed classifier. In this section we use that tool to investigate whether our model of readability compares favorably with external conventional metrics of software quality. Specifically, we first look for a correlation between readability and FindBugs, a popular static bug-finding tool [22]. Second, we look for a similar correlation with changes to code between versions of several large open source projects. We chose FindBugs defects and version changes related to code churn in part because they can be measured objectively. Finally, we look for trends in code readability across those projects.

The set of open source Java programs we have employed as benchmarks can be found in Figure 8. They were selected because of their relative popularity, diversity in terms of development maturity and application domain, and availability in multiple versions from SourceForge, an open source software repository. Maturity is self reported, and categorized by SourceForge into 1-planning, 2-pre-alpha, 3-alpha, 4-beta, 5-production/stable, 6-mature, 7-inactive. Note that some projects present multiple releases at different maturity levels; in such cases we selected the release for the maturity level indicated.

Running our readability tool (including feature detection and the readability judgment) was quite rapid. For example, the 98K lines of code in soapUI took less than 16 seconds to process on a machine with a 2GHz processor and a disk with a maximum 150 MBytes/sec transfer rate.
Figure 9: f-measure for using readability to predict functions with FindBugs defect reports and functions which change between releases.

Figure 10: Mean ratio of the classifier probabilities (predicting ‘less readable’) assigned to functions that contained a FindBugs defect or that will change in the next version to those that were not. For example, Risk functions with FindBugs errors were assigned a probability of ‘less readable’ that was nearly 150% greater on average than the probabilities assigned to functions without such defects.

4.1 Readability Correlations

Our first experiment attempts to correlate defects detected by FindBugs with our readability metric at the function level. We first ran FindBugs on the benchmarks, noting defect reports. Second, we extracted all of the functions and partitioned them into two sets: those containing at least one reported defect, and those containing none. We normalized function set sizes to avoid bias between programs for which more or fewer defects were reported. We then ran the already-trained classifier on the set of functions, recording an f-measure for “contains a bug” with respect to the classifier judgment of “less readable.”

Our second experiment relates future code churn to readability. Version-to-version changes capture another important aspect of code quality. This experiment used the same setup as the first, but used readability to predict which functions will be modified between two successive releases of a program. For this experiment, “successive release” means the two most recent stable versions. In other words, instead of “contains a bug” we attempt to predict “is going to change soon.” We consider a function to have changed in any case where the text is not exactly the same, including changes to whitespace. Whitespace is normally ignored in program studies, but since we are specifically focusing on readability we deem it relevant.

Figure 9 summarizes the results of these two experiments. Guessing randomly yields an f-measure of 0.5 and serves as a baseline, while 1.0 represents a perfect upper bound. The average f-measure over 11 of our benchmarks for the FindBugs correlation is 0.61. The average f-measure for version changes over all 12 of our benchmarks is 0.63. It is important to note that our goal is not perfect correlation with FindBugs or any other source of defect reports: projects can run FindBugs directly rather than using our metric to predict its output. Instead, we argue that our readability metric has general utility and is correlated with multiple notions of software quality.

A second important consideration is the magnitude of the difference. We claim that classifier probabilities (i.e., continuous output vs. discrete classifications) are useful in evaluating readability. Figure 10 presents this data in the form of a ratio: the mean probability assigned by the classifier to functions positive for FindBugs defects or version changes to functions without these features. A ratio over 1 (i.e., > 100%) for many of the projects indicates that the functions with these features tend to have lower readability scores than functions without them. For example, in the jMencode and soapUI projects, functions judged less readable by our metric were dramatically more likely to contain FindBugs defect reports, and in the JasperReports project less-readable methods were very likely to change in the next version.

For both of these external quality indicators we found that our tool exhibits a substantial degree of correlation. Predicting based on our readability metric yields an f-measure over 0.7 in some cases. Again, our goal is not a perfect correlation with version changes and code churn. These moderate correlations do, however, imply a connection between code readability, as described by our model, and defects and upcoming code changes.
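To make the evaluation concrete, the following is a rough sketch (not the authors' tooling) of how such an f-measure can be computed once each function has been labeled; the boolean arrays are hypothetical stand-ins for per-function results.

```java
/**
 * Sketch of the evaluation in Section 4.1: treating "judged less readable"
 * as a prediction of "contains a FindBugs defect report" (or "will change"),
 * and summarizing the agreement as an f-measure.
 */
public class FMeasureSketch {

    public static double fMeasure(boolean[] predictedLessReadable, boolean[] actualPositive) {
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < predictedLessReadable.length; i++) {
            if (predictedLessReadable[i] && actualPositive[i]) tp++;
            else if (predictedLessReadable[i] && !actualPositive[i]) fp++;
            else if (!predictedLessReadable[i] && actualPositive[i]) fn++;
        }
        double precision = tp == 0 ? 0 : (double) tp / (tp + fp);
        double recall = tp == 0 ? 0 : (double) tp / (tp + fn);
        return (precision + recall == 0) ? 0 : 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Hypothetical per-function labels for a small example.
        boolean[] predicted = { true, true, false, false, true, false };
        boolean[] defective = { true, false, false, true, true, false };
        System.out.printf("f-measure: %.2f%n", fMeasure(predicted, defective));
    }
}
```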
4.2 Software Lifecycle

To further investigate the relation of our readability metric to external factors, we investigate changes over long periods of time. Figure 11 shows how readability tends to change over the lifetime of a project. To construct this figure we selected several projects with rich version histories and calculated the average readability level over all of the functions in each.

Note that newly-released versions of open source projects are not always more stable than their predecessors. Projects often undergo major overhauls or add additional crosscutting features. Consider jUnit, which has recently adopted a “completely different API . . . [that] depends on new features of Java 5.0 (annotations, static import. . . )” [15]. We thus conducted an additional experiment to measure readability against maturity and stability.

Figure 12 plots project readability against project maturity, as self-reported by developers. The data shows a noisy upward trend implying that projects that reach maturity tend to be more readable. The results of these two experiments would seem not to support the Fred Brooks argument that “all repairs tend to destroy the structure, to increase the entropy and disorder of the system . . . as time passes, the system becomes less and less well ordered” [14] for the readability component of “order”. While Brooks was not speaking specifically of readability, a lack of readability can be a strong manifestation of disorder.
Figure 11: Average readability metric of all functions in a project as a function of project lifetime. Note that over time, the readability of some projects tends to decrease, while it gradually increases in others.

Figure 12: Average readability metric of all functions in a project as a function of self-reported project maturity, with best-fit linear trend line. Note that projects of greater maturity tend to exhibit greater readability.

5. DISCUSSION

This study includes a significant amount of empirical data about the relation between local code features and readability. We believe that this information may have implications for the way code should be written and evaluated, and for the design of programming languages. However, we caution that this data may only be truly relevant to our annotators; it should not be interpreted to represent a comprehensive or universal model for readability.

To start, we found that the length of identifier names has almost no influence on readability (0% relative predictive power). Recently there has been a significant movement toward “self documenting code,” which is often characterized by long and descriptive identifier names and few abbreviations. The movement has had particular influence on the Java community. Furthermore, naming conventions, like the “Hungarian” notation which seeks to encode typing information into identifier names, may not be as advisable as previously thought [39]. While descriptive identifiers certainly can improve readability, perhaps some additional attention should be paid to the fact that they may also reduce it; our study indicates the net gain may be near zero.

For example, forcing readers to process long names, where shorter ones would suffice, may negatively impact readability. Furthermore, identifier names are not always an appropriate place to encode documentation. There are many cases where it would be more appropriate to use comments, possibly associated with variable or field declarations, to explain program behavior. Long identifiers may be confusing, or even misleading. We believe that in many cases sophisticated integrated development environments (IDEs) and specialized static analysis tools designed to aid in software inspections (e.g., [3]) may constitute a better approach to the goal of enhancing program understanding.

Unlike identifiers, comments are a very direct way of communicating intent. One might expect their presence to increase readability dramatically. However, we found that comments are only moderately well-correlated with readability (33% relative power). One conclusion may be that while comments can enhance readability, they are typically used in code segments that started out less readable: the comment and the unreadable code effectively balance out. The net effect would appear to be that comments are not always, in and of themselves, indicative of high or low readability.

The number of identifiers and characters per line has a strong influence on our readability metric (100% and 96% relative power respectively). It would appear that just as long sentences are more difficult to understand, so are long lines of code. Our findings support the conventional wisdom that programmers should keep their lines short, even if it means breaking up a statement across multiple lines.
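The following illustrative (and deliberately trivial) example is not from the study; it simply shows the kind of restructuring the preceding paragraph favors: the same statement written as one long line and then broken across several short lines.

```java
/** Illustrative only: one long line versus the same computation on short lines. */
public class LineLengthExample {
    public static void main(String[] args) {
        int itemCount = 7, unitPrice = 12, shipping = 5, discount = 3;

        // One long line packs many identifiers and operators together:
        int totalLong = (itemCount * unitPrice) + shipping - discount + (itemCount > 5 ? 0 : shipping);

        // The same statement broken across short lines, as the data above favors:
        int totalShort = (itemCount * unitPrice)
                + shipping
                - discount
                + (itemCount > 5 ? 0 : shipping);

        System.out.println(totalLong == totalShort);   // prints: true
    }
}
```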
When designing programming languages, readability is an important concern. Languages might be designed to force or encourage improved readability by considering the implications of various design and language features on this metric. For example, Python enforces a specific indentation scheme in order to aid comprehension [43, 32]. In our experiments, the importance of character count per line suggests that languages should favor the use of constructs, such as switch statements and pre- and post-increment, that encourage short lines. Our conclusion, that readability does not appear to be negatively impacted by repeated characters or words, runs counter to the common perception that operator overloading is necessarily confusing. Finally, our data suggests that languages should add additional keywords if it means that programs can be written with fewer new identifiers.
The techniques presented in this paper offer a means for conducting such experiments.

6. RELATED WORK

Previously, we identified several of the many automated readability metrics that are in use today for natural language [12, 18, 24, 31]. While we have not found analogous metrics targeted at source code (as presented in this paper), some metrics do exist outside the realm of traditional language. For example, utility has been claimed for a readability metric for computer generated math [30], for the layout of treemaps [4], and for hypertext [20].

Perhaps the bulk of work in the area of source code readability today is based on coding standards (e.g., [2, 6, 41]). These conventions are primarily intended to facilitate collaboration by maintaining uniformity between code written by different developers. Style checkers such as PMD [9] and The Java Coding Standard Checker are employed as a means to automatically enforce these standards.

We also note that machine learning has, in the past, been used for defect prediction, typically by training on data from source code repositories (e.g., [7, 17, 23, 25]). We believe that machine learning has substantial advantages over traditional statistics and that much room yet exists for the exploitation of such techniques in the domains of Software Engineering and Programming Languages.

7. FUTURE WORK

The techniques presented in this paper should provide an excellent platform for conducting future readability experiments, especially with respect to unifying even a very large number of judgments into an accurate model of readability.

While we have shown that there is significant agreement between our annotators on the factors that contribute to code readability, we would expect each annotator to have personal preferences that lead to a somewhat different weighting of the relevant factors. It would be interesting to investigate whether a personalized or organization-level model, adapted over time, would be effective in characterizing code readability. Furthermore, readability factors may also vary significantly based on application domain. Additional research is needed to determine the extent of this variability, and whether specialized models would be useful.

Another possibility for improvement would be an extension of our notion of local code readability to include broader features. While most of our features are calculated as average or maximum value per line, it may be useful to consider the size of compound statements, such as the number of simple statements within an if block. For this study, we intentionally avoided such features to help ensure that we were capturing readability rather than complexity. However, in practice, achieving this separation of concerns is likely to be less compelling.

Readability measurement tools present their own challenges in terms of programmer access. We suggest that such tools could be integrated into an IDE, such as Eclipse, in the same way that natural language readability metrics are incorporated into word processors. Software that seems readable to the author may be quite difficult for others to understand [19]. Such a system could alert programmers as such instances arise, in a way similar to the identification of syntax errors.

Finally, in line with conventional readability metrics, it would be worthwhile to express our metric using a simple formula over a small number of features (the PCA from Section 3.2 suggests this may be possible). Using only the truly essential and predictive features would allow the metric to be adapted easily into many development processes. Furthermore, with a smaller number of coefficients, the readability metric could be parameterized or modified in order to better describe readability in certain environments, or to meet more specific concerns.

8. CONCLUSION

It is important to note that the metric described in this paper is not intended as the final or universal model of readability. Rather, we have shown how to produce a metric for software readability from the judgments of human annotators, relevant specifically to those annotators. In fact, we have shown that it is possible to create a metric that agrees with these annotators as much as they agree with each other by only considering a relatively simple set of low-level code features. In addition, we have seen that readability, as described by this metric, exhibits a significant level of correlation with more conventional metrics of software quality, such as defects, code churn, and self-reported stability. Furthermore, we have discussed how considering the factors that influence readability has potential for improving programming language design and engineering practice with respect to this important dimension of software quality.

9. REFERENCES

[1] K. Aggarwal, Y. Singh, and J. K. Chhabra. An integrated measure of software maintainability. In Proceedings of the Annual Reliability and Maintainability Symposium, pages 235–241, September 2002.
[2] S. Ambler. Java coding standards. Softw. Dev., 5(8):67–71, 1997.
[3] P. Anderson and T. Teitelbaum. Software inspection using CodeSurfer. In WISE '01: Proceedings of the First Workshop on Inspection in Software Engineering, July 2001.
[4] B. B. Bederson, B. Shneiderman, and M. Wattenberg. Ordered and quantum treemaps: Making effective use of 2d space to display hierarchies. ACM Trans. Graph., 21(4):833–854, 2002.
[5] B. Boehm and V. R. Basili. Software defect reduction top 10 list. Computer, 34(1):135–137, 2001.
[6] L. W. Cannon, R. A. Elliott, L. W. Kirchhoff, J. H. Miller, J. M. Milner, R. W. Mitze, E. P. Schan, N. O. Whittington, H. Spencer, D. Keppel, and M. Brader. Recommended C Style and Coding Standards: Revision 6.0. Specialized Systems Consultants, Inc., Seattle, Washington, June 1990.
[7] T. J. Cheatham, J. P. Yoo, and N. J. Wahl. Software testing: a machine learning experiment. In CSC '95: Proceedings of the 1995 ACM 23rd Annual Conference on Computer Science, pages 135–141, 1995.
[8] T. Y. Chen, F.-C. Kuo, and R. Merkel. On the statistical properties of the f-measure. In QSIC '04: Fourth International Conference on Quality Software, pages 146–153, 2004.
[9] T. Copeland. PMD Applied. Centennial Books, Alexandria, VA, USA, 2005.
[10] E. W. Dijkstra. A Discipline of Programming. Prentice Hall PTR, 1976.
[11] J. L. Elshoff and M. Marcotty. Improving computer program readability to aid modification. Commun. ACM, 25(8):512–521, 1982.
[12] R. F. Flesch. A new readability yardstick. Journal of Applied Psychology, 32:221–233, 1948.
[13] F. P. Brooks, Jr. No silver bullet: essence and accidents of software engineering. Computer, 20(4):10–19, 1987.
[14] F. P. Brooks, Jr. The Mythical Man-Month: Essays on Software Engineering, 20th Anniversary Edition. Addison-Wesley Professional, August 1995.
[15] A. Goncalves. Get acquainted with the new advanced features of JUnit 4. DevX, http://www.devx.com/Java/Article/31983, 2006.
[16] J. Gosling, B. Joy, and G. L. Steele. The Java Language Specification. The Java Series. Addison-Wesley, Reading, MA, USA, 1996.
[17] T. L. Graves, A. F. Karr, J. S. Marron, and H. Siy. Predicting fault incidence using software change history. IEEE Trans. Softw. Eng., 26(7):653–661, 2000.
[18] R. Gunning. The Technique of Clear Writing. McGraw-Hill International Book Co, New York, 1952.
[19] N. J. Haneef. Software documentation and readability: a proposed process improvement. SIGSOFT Softw. Eng. Notes, 23(3):75–77, 1998.
[20] A. E. Hatzimanikatis, C. T. Tsalidis, and D. Christodoulakis. Measuring the readability and maintainability of hyperdocuments. Journal of Software Maintenance, 7(2):77–90, 1995.
[21] G. Holmes, A. Donkin, and I. Witten. Weka: A machine learning workbench. In Proceedings of the Second Australia and New Zealand Conference on Intelligent Information Systems, 1994.
[22] D. Hovemeyer and W. Pugh. Finding bugs is easy. SIGPLAN Not., 39(12):92–106, 2004.
[23] T. M. Khoshgoftaar, E. B. Allen, N. Goel, A. Nandi, and J. McMullan. Detection of software modules with high debug code churn in a very large legacy system. In ISSRE '96: Proceedings of the Seventh International Symposium on Software Reliability Engineering, page 364, Washington, DC, USA, 1996. IEEE Computer Society.
[24] J. P. Kincaid and E. A. Smith. Derivation and validation of the automated readability index for use with technical materials. Human Factors, 12:457–464, 1970.
[25] P. Knab, M. Pinzger, and A. Bernstein. Predicting defect densities in source code files with decision tree learners. In MSR '06: Proceedings of the 2006 International Workshop on Mining Software Repositories, pages 119–125, 2006.
[26] J. C. Knight and E. A. Myers. Phased inspections and their implementation. SIGSOFT Softw. Eng. Notes, 16(3):29–35, 1991.
[27] R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence, 14(2):1137–1145, 1995.
[28] R. Likert. A technique for the measurement of attitudes. Archives of Psychology, 140:44–53, 1932.
[29] L. E. Deimel, Jr. The uses of program reading. SIGCSE Bull., 17(2):5–14, 1985.
[30] S. MacHaffie, R. McLeod, B. Roberts, P. Todd, and L. Anderson. A readability metric for computer-generated mathematics. Technical report, Saltire Software, http://www.saltire.com/equation.html, retrieved 2007.
[31] G. H. McLaughlin. SMOG grading: a new readability formula. Journal of Reading, May 1969.
[32] R. J. Miara, J. A. Musselman, J. A. Navarro, and B. Shneiderman. Program indentation and comprehensibility. Commun. ACM, 26(11):861–867, 1983.
[33] T. Mitchell. Machine Learning. McGraw Hill, 1997.
[34] N. Nagappan and T. Ball. Use of relative code churn measures to predict system defect density. In ICSE '05: Proceedings of the 27th International Conference on Software Engineering, pages 284–292, 2005.
[35] C. V. Ramamoorthy and W.-T. Tsai. Advances in software engineering. Computer, 29(10):47–58, 1996.
[36] D. R. Raymond. Reading source code. In CASCON '91: Proceedings of the 1991 Conference of the Centre for Advanced Studies on Collaborative Research, pages 3–16. IBM Press, 1991.
[37] P. A. Relf. Tool assisted identifier naming for improved software readability: an empirical study. In Proceedings of the 2005 International Symposium on Empirical Software Engineering, November 2005.
[38] S. Rugaber. The use of domain knowledge in program understanding. Ann. Softw. Eng., 9(1-4):143–192, 2000.
[39] C. Simonyi. Hungarian notation. MSDN Library, November 1999.
[40] S. E. Stemler. A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability. Practical Assessment, Research and Evaluation, 9(4), 2004.
[41] H. Sutter and A. Alexandrescu. C++ Coding Standards: 101 Rules, Guidelines, and Best Practices. Addison-Wesley Professional, 2004.
[42] T. Tenny. Program readability: Procedures versus comments. IEEE Trans. Softw. Eng., 14(9):1271–1279, 1988.
[43] A. Watters, G. van Rossum, and J. C. Ahlstrom. Internet Programming with Python. MIS Press/Henry Holt Publishers, New York, 1996.
[44] E. J. Weyuker. Evaluating software complexity measures. IEEE Trans. Softw. Eng., 14(9):1357–1365, 1988.