What Are Software Metrics?
• The continuous application of measurement-based techniques to the software development process and its products, to supply meaningful and timely management information, together with the use of those techniques to improve that process and its products.
• Software metrics help answer questions such as:
• 1. What is the size of the program?
• 2. What is the estimated cost and duration of the software?
• 3. Is the requirement testable?
• 4. When is the right time to stop testing?
• 5. What is the effort expended during the maintenance phase?
• 6. How many defects reported during the maintenance phase have been corrected?
• 7. How many defects have been detected using a given activity, such as inspections?
• 8. What is the complexity of a given module?
• 9. What is the estimated cost of correcting a given defect?
• 10. Which technique or process is more effective than another?
• 11. What is the productivity of persons working on a project?
• 12. Is there any requirement to improve a given process, method, or technique?
Characteristics of Software Metrics
• A metric is only relevant if it is easily understood, calculated, valid, and economical:
• Quantitative: The metric should be expressible in values.
• Understandable: The way of computing the metric must be easy to understand.
• Validatable: The metric should capture the same attribute that it is designed for.
• Economical: It should be economical to measure the metric.
• Repeatable: The values should be the same if measured repeatedly; that is, the measurement can be consistently repeated.
• Language independent: The metric should not depend on any language.
• Applicability: The metric should be applicable in the early phases of software development.
• Comparable: The metric should correlate with another metric capturing the same feature or concept.
Product and Process Metrics
• 1. Process: The process is defined as the way in which the product is developed.
• 2. Product: The final outcome of following a given process or a set of processes is known as a product.
• The product includes documents, source code, or other artifacts that are produced during the software development life cycle.
Process Metrics
• Process metrics can be used to:
• 1. Measure the cost and duration of an activity.
• 2. Measure the effectiveness of a process.
• 3. Compare the performance of various processes.
• 4. Improve the processes and guide the selection of future processes.
• For example, the effectiveness of the inspection activity can be measured by computing the costs and resources spent on it and the number of defects detected during it.
• By assessing whether the number of faults found outweighs the costs incurred during the inspection activity, project managers can decide about the effectiveness of the inspection activity.
Differentiating Between Product and Process Metrics
• In discerning between the two metric types, we follow Henderson-Sellers' (1996) definitions of product and process metrics: a product metric refers to a software "snapshot" at a particular point of time, while a process metric reflects changes over time, e.g., the number of code changes.
• Recently, the term "historical metrics" has been used with growing frequency in place of "process metrics," e.g., Illes-Seifert and Paech (2010).
Internal and External Attributes
• Process and product metrics can further be classified as internal or external attributes.
• Internal attributes concern the internal structure of the process or product. Common internal attributes are size, coupling, and complexity.
• External attributes concern the behavioral aspects of the process or product. External attributes such as testability, understandability, maintainability, and reliability can be measured using process or product metrics.
• The difference between attributes and metrics is that metrics are used to measure a given attribute.
• For example, size is an attribute that can be measured through the lines of code (LOC) metric.
• The internal attributes of a process or product can be measured without executing the source code. Examples of internal attributes are the number of paths, number of branches, coupling, and cohesion.
• External attributes include quality attributes of the system. They can be measured by executing the source code, for example, the number of failures, response time, and ease of navigation of an item.
Continuous Data
• Data can be classified into two types: continuous or categorical.
• Continuous data represents the amount or magnitude of a given entity, for example, the number of faults in a class or the number of LOC added or deleted during the maintenance phase.
• Continuous data can be measured on an interval, ratio, or absolute scale.
Categorical Data
• Discrete or categorical data is represented in the form of categories or classes, for example, weather is sunny, cloudy, or rainy.
Measurement Scales
• What is a scale? A scale is a device or an object used to measure or quantify an event or another object.
• Levels of measurement: there are five different scales of measurement, and any data can be classified as being on one of them:
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale
• Absolute Scale
Nominal Scale
• A nominal scale is the 1st level of measurement scale, in which the numbers serve as "tags" or "labels" to classify or identify the objects. A nominal scale usually deals with non-numeric variables or with numbers that do not have any value.
• Characteristics of the nominal scale:
• A nominal scale variable is classified into two or more categories. In this measurement mechanism, the answer should fall into one of the classes.
• It is qualitative. The numbers are used only to identify the objects.
• The numbers do not define the object characteristics.
• The only permissible aspect of numbers in the nominal scale is "counting."
Ordinal Scale
• The ordinal scale is the 2nd level of measurement, which reports the ordering and ranking of data without establishing the degree of variation between them. Ordinal represents the "order." Ordinal data is known as qualitative or categorical data. It can be grouped, named, and also ranked.
• Characteristics of the ordinal scale:
• The ordinal scale shows the relative ranking of the variables.
• It identifies and describes the magnitude of a variable.
• Along with the information provided by the nominal scale, ordinal scales give the rankings of the variables.
• The interval properties are not known.
• Surveyors can quickly analyse the degree of agreement concerning the identified order of variables.
• Examples:
• Ranking of school students: 1st, 2nd, 3rd, etc.
• Ratings in restaurants.
• Evaluating the frequency of occurrences: very often, often, not often, not at all.
• Assessing the degree of agreement: totally agree, agree, neutral, disagree, totally disagree.
Interval Scale
• The interval scale is the 3rd level of measurement scale. It is defined as a quantitative measurement scale in which the difference between two values is meaningful and measured exactly, not in a relative way; its zero point, however, is arbitrary.
• Characteristics of the interval scale:
• The interval scale is quantitative, as it can quantify the difference between values.
• It allows calculating the mean and median of the variables.
• To understand the difference between the variables, you can subtract the values of the variables.
• The interval scale is a preferred scale in statistics, as it helps to assign numerical values to arbitrary assessments such as feelings, calendar types, etc.
• Examples: Likert scale, Net Promoter Score (NPS), bipolar matrix table.
• The interval scale is used when the interpretation of the difference between values is the same. For example, the difference between 40°C and 50°C is the same as between 70°C and 80°C.
• On an interval scale, one value cannot be represented as a multiple of another value, as the scale does not have an absolute (true) zero point. For example, if the temperature is 20°C, it cannot be said to be twice as hot as when the temperature was 10°C.
Ratio Scale
• The ratio scale is the 4th level of measurement scale, which is quantitative. The ratio scale has a unique feature: it possesses a true origin, or zero point.
• Characteristics of the ratio scale:
• The ratio scale has an absolute zero.
• It does not have negative numbers, because of its zero-point feature.
• It affords unique opportunities for statistical analysis. The variables can be meaningfully added, subtracted, multiplied, and divided, and the mean, median, and mode can be calculated.
• The ratio scale has unique and useful properties. One such feature is that it allows unit conversions, such as kilogram-calories, gram-calories, etc.
• Example of a ratio scale question: What is your weight in kgs? Less than 55 kgs; 55–75 kgs; 76–85 kgs; 86–95 kgs; more than 95 kgs.
Quiz Question
Consider the count of the number of faults detected during the inspection activity:
• 1. What is the measurement scale for this definition?
• 2. What is the measurement scale if the number of faults is classified between 1 and 5, where 1 means very high, 2 means high, 3 means medium, 4 means low, and 5 means very low?
Software Quality
• Software quality can be defined as the capability of a software product to satisfy stated and implied needs under specified conditions.
• Additionally, quality refers to the degree to which software products meet their stated requirements.
• Quality is a basic parameter of software engineering efforts, whose primary goal is the delivery of maximum stakeholder value while balancing cost and schedule.
Maintainability
• Characteristics related to the effort needed to make modifications, including corrections, improvements, or adaptation of software to changes in the environment, requirements, and functional specifications.
Reliability
• "Software reliability is defined as the ability of a system or component to perform its required functions under stated conditions for a specified period of time."
• "Software reliability is defined as the probability of failure-free operation under stated conditions for a specified period of time."
Software Quality Metrics Based on Defects
• A defect is defined by IEEE/ANSI as "an accidental condition that causes a unit of the system to fail to function as required" (IEEE/ANSI Standard 982.2).
• A failure occurs when a fault executes, and more than one failure may be associated with a given fault.
• Defect-based metrics can be classified at the product and process levels.
• The difference between the terms fault and defect is unclear from the definitions. In practice, the difference between the two terms is not significant, and they are used interchangeably.
• The commonly used product metrics for measuring defects are defect density and defect rate.
Defect Density
• The defect density metric can be defined as the ratio of the number of defects to the size of the software. The size of the software is usually measured in thousands of lines of code (KLOC):
• Defect density = Number of defects / KLOC
• Defect density during testing is an effective metric that can be used during formal testing. It measures the defect density during formal testing after completion of the source code and its addition to the software library.
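The defect density formula above can be sketched in a few lines of Python; the defect and size counts below are made-up illustrative numbers, not data from the source:

```python
def defect_density(num_defects: int, loc: int) -> float:
    """Defect density = number of defects / KLOC (thousands of lines of code)."""
    kloc = loc / 1000
    return num_defects / kloc

# Hypothetical example: 120 defects found in a 40,000-LOC system.
print(defect_density(120, 40_000))  # 3.0 defects per KLOC
```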
• If the value of the defect density metric during testing is high, then the tester should ask questions such as whether the software is well designed and developed.
Phase-Based Defect Density
• This is an extension of the defect density metric: instead of calculating defect density at the system level, it is calculated at the various phases of the software development life cycle, including verification techniques such as reviews, walkthroughs, inspections, and audits before validation testing begins.
• This metric provides insight into the procedures and standards being used during software development. Some organizations even set "alarming values" for these metrics so that the quality of the software can be assessed and monitored and appropriate remedial actions can be taken.
Defect Removal Effectiveness
• Defect removal effectiveness (DRE) is defined as:
• DRE = Defects removed in a given life cycle phase / Latent defects
• For a given phase in the software development life cycle, the latent defects are not known. Thus, they are estimated as the sum of the defects removed during the phase and the defects detected later.
• The higher the value of DRE, the more efficient and effective is the process followed in that particular phase. The ideal value of DRE is 1.
• The DRE of a product can also be calculated by:
– DRE = DB / (DB + DA)
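A minimal sketch of the product-level DRE = DB/(DB + DA) calculation, where, per the source, DB and DA are the defects encountered before and after software delivery; the counts used are hypothetical:

```python
def dre(defects_before: int, defects_after: int) -> float:
    """Defect removal effectiveness: DB / (DB + DA).
    The closer the value is to the ideal of 1, the more effective the process."""
    return defects_before / (defects_before + defects_after)

# Hypothetical counts: 90 defects removed before delivery, 10 found after.
print(dre(90, 10))  # 0.9
```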
• DB depicts the defects encountered before software delivery.
• DA depicts the defects encountered after software delivery.
Usability Metrics
• The ease of use, user-friendliness, learnability, and user satisfaction of a given software can be measured through usability.
• Bevan (1995) used the MUSIC project to measure usability attributes.
Task Effectiveness
• Task effectiveness is defined as follows:
• Task effectiveness = 1/100 × (quantity × quality)%
where:
– Quantity is defined as the amount of the task completed by a user.
– Quality is defined as the degree to which the output produced by the user satisfies the targets of the given task.
• For example, consider the problem of proofreading an eight-page document. Quantity is defined as the percentage of proofread words, and quality is defined as the percentage of the correctly proofread document. Suppose quantity is 90% and quality is 70%; then task effectiveness is 63%.
Other Usability Metrics
• Temporal efficiency = Effectiveness / Task time
• Productive period = (Task time − Unproductive time) / Task time
• There are various other measures that can be used to assess the usability of a system:
– 1. Time for learning the system
– 2. Productivity increase from using the system
– 3. Response time
Usability Metrics of Web-Based Applications
• In testing web-based applications, usability can be measured by conducting a survey, based on a questionnaire, to measure the satisfaction of the customer.
• The questionnaire must be developed by an expert with relevant knowledge. The sample size should be large enough to build confidence in the survey results. The results are rated on a scale.
• For example, the difficulty level may be measured in terms of very easy, easy, difficult, and very difficult for questions such as the following:
• Can the user easily learn the interface paths in a web page?
• Are the interface titles understandable?
• Can the topics be found easily in the 'help'?
• Charts, such as bar charts, pie charts, scatter plots, and line charts, can be used to depict and assess the satisfaction level of the customer. The satisfaction level of the customer must be continuously monitored over time.
Testing Metrics
• Testing metrics are used to capture the progress and level of testing for a given software.
• The amount of testing done is measured using test coverage metrics. These metrics can be used to measure various levels of coverage, such as statement, path, condition, and branch, and are given below:
1. The percentage of statements covered while testing is defined by the statement coverage metric.
2. The percentage of branches covered while testing the source code is defined by the branch coverage metric.
3. The percentage of operations covered while testing the source code is defined by the operation coverage metric.
4. The percentage of conditions covered (both for true and false) is evaluated using the condition coverage metric.
5. The percentage of paths covered in a control flow graph is evaluated using the path coverage metric.
6. The percentage of loops covered while testing a program is evaluated using the loop coverage metric.
7. All the possible combinations of conditions are covered by the multiple condition coverage metric.
• NASA developed a test focus (TF) metric, defined as the ratio of the amount of effort spent in finding and removing "real" faults in the software to the total number of faults reported in the software. The TF metric is given as (Stark et al. 1992):
– TF = Number of STRs fixed and closed / Total number of STRs
where STR is a software trouble report.
• The fault coverage metric (FCM) is given as:
– FCM = (Number of faults addressed × severity of faults) / (Total number of faults × severity of faults)
Object Oriented Paradigm
• The key concepts of the OO paradigm are: classes, objects, attributes, methods, modularity, encapsulation, inheritance, and polymorphism.
• An object is made up of three basic components: an identity, a state, and a behavior (Booch 1994).
• The identity distinguishes two objects with the same state and behavior.
• The state of the object represents the different possible internal conditions that the object may experience during its lifetime.
• The behavior of the object is the way the object will respond to a set of received messages.
• A class is a template consisting of a number of attributes and methods, and every object is an instance of a class.
• The attributes in a class define the possible states in which an instance of that class may be.
• The behavior of an object depends on the class methods and on the state of the object, as methods may respond differently to input messages depending on the current state.
• Attributes and methods are said to be encapsulated into a single entity. Encapsulation and data hiding are key features of OO languages.
OO Metrics Suites
• Chidamber and Kemerer (1994) defined a suite of six popular metrics. This suite has received the widest attention in the literature for predicting external quality attributes such as maintainability and defect-proneness.
• Popular OO metric suites include: Chidamber and Kemerer (1991); Li and Henry (1993); Lorenz and Kidd (1994), who proposed a suite of 11 metrics; Bieman and Kang (1995); Lee et al. (1995); Henderson-Sellers (1996); Briand et al. (1997); Benlarbi and Melo (1999).
C&K Metrics Suite
• WMC: counts the number of methods weighted by complexity in a class. (Construct: Size)
• RFC: counts the number of external and internal methods in a class. (Construct: Coupling)
• LCOM: lack of cohesion in methods. (Construct: Cohesion)
• CBO: counts the number of other classes to which a class is linked. (Construct: Coupling)
• NOC: counts the number of immediate subclasses of a given class. (Construct: Inheritance)
• DIT: counts the number of steps from the class to the root of the inheritance hierarchy. (Construct: Inheritance)
Li and Henry Metric Suite
• DAC: counts the number of abstract data types in a class. (Construct: Coupling)
• MPC: counts the number of unique send statements from a class to other classes. (Construct: Coupling)
• SIZE1: counts the number of semicolons. (Construct: Size)
• SIZE2: the sum of the number of attributes and methods in a class. (Construct: Size)
Lorenz and Kidd Metric Suite
• NOP: counts the number of immediate parents of a given class.
• NOD: counts the number of indirect and direct subclasses of a given class.
• NMO: counts the number of methods overridden in a class.
• NMI: counts the number of methods inherited in a class.
• NMA: counts the number of new methods added in a class.
• SIX: specialization index.
Question
• Calculate WMC and RFC for the following example.
Details of Various Metric Suites, Dynamic Metrics, and Code Churn Metrics
• Study the details of the various metric suites and dynamic metrics from the document provided.
CKJM Tool
• Study in detail from: https://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/
Metrics Calculated by ckjm
• WMC - Weighted methods per class: A class's WMC metric is simply the sum of the complexities of its methods. As a measure of complexity we can use the cyclomatic complexity, or we can arbitrarily assign a complexity value of 1 to each method. The ckjm program assigns a complexity value of 1 to each method, and therefore the value of WMC is equal to the number of methods in the class.
• DIT - Depth of inheritance tree: The DIT metric provides, for each class, a measure of the inheritance levels from the top of the object hierarchy. In Java, where all classes inherit from Object, the minimum value of DIT is 1.
• NOC - Number of children: A class's NOC metric simply measures the number of immediate descendants of the class.
• CBO - Coupling between object classes: The CBO metric represents the number of classes coupled to a given class (efferent and afferent couplings). This coupling can occur through method calls, field accesses, inheritance, arguments, return types, and exceptions.
• RFC - Response for a class: The RFC metric measures the number of different methods that can be executed when an object of the class receives a message (when a method is invoked for that object). Ideally, we would want to find, for each method of the class, the methods it will call, and repeat this for each called method, calculating what is called the transitive closure of the method's call graph. This process can, however, be both expensive and quite inaccurate. In ckjm, a rough approximation of the response set is calculated by simply inspecting method calls within the class's method bodies. The value of RFC is the sum of the number of methods called within the class's method bodies and the number of the class's methods. This simplification was also used in the 1994 Chidamber and Kemerer description of the metrics.
• LCOM - Lack of cohesion in methods:
– A class's LCOM metric counts the sets of methods in a class that are not related through the sharing of some of the class's fields.
– The original definition of this metric (which is the one used in ckjm) considers all pairs of a class's methods. In some of these pairs both methods access at least one common field of the class, while in other pairs the two methods do not share any common field accesses.
– The lack of cohesion in methods is then calculated by subtracting, from the number of method pairs that do not share a field access, the number of method pairs that do.
– Note that subsequent definitions of this metric used as a measurement basis the number of disjoint graph components of the class's methods. Others modified the definition of connectedness to include calls between the methods of the class. The program ckjm follows the original (1994) definition by Chidamber and Kemerer.
• Ca - Afferent couplings: A class's afferent couplings metric measures how many other classes use the specific class. Coupling has the same definition in the context of Ca as that used for calculating CBO.
• Ce - Efferent couplings: A class's efferent couplings metric measures how many other classes are used by the specific class. Coupling has the same definition in the context of Ce as that used for calculating CBO.
• NPM - Number of public methods: The NPM metric simply counts all the methods in a class that are declared as public. It can be used to measure the size of an API provided by a package.
• LOC - Lines of code: The lines are counted from the Java binary code; LOC is the sum of the number of fields, the number of methods, and the number of instructions in every method of the given class.
• DAM - Data access metric: This metric is the ratio of the number of private (and protected) attributes to the total number of attributes declared in the class. A high value of DAM is desired. (Range 0 to 1)
• MOA - Measure of aggregation: This metric measures the extent of the part-whole relationship. It is a count of the number of data declarations (class fields) whose types are user-defined classes.
• MFA - Measure of functional abstraction: This metric is the ratio of the number of methods inherited by a class to the total number of methods accessible by member methods of the class. The constructors and java.lang.Object (as parent) are ignored. (Range 0 to 1)
• CAM - Cohesion among methods of a class: This metric computes the relatedness among the methods of a class based on the parameter lists of the methods. It is computed as the sum of the number of different types of method parameters in every method, divided by the product of the number of different method parameter types in the whole class and the number of methods. A metric value close to 1.0 is preferred. (Range 0 to 1)
• IC - Inheritance coupling: This metric provides the number of parent classes to which a given class is coupled. A class is coupled to its parent class if one of its inherited methods is functionally dependent on a new or redefined method in the class, that is, if one of the following conditions is satisfied:
– One of its inherited methods uses a variable (or data member) that is defined in a new/redefined method.
– One of its inherited methods calls a redefined method.
– One of its inherited methods is called by a redefined method and uses a parameter that is defined in the redefined method.
• CBM - Coupling between methods: This metric measures the total number of new/redefined methods to which all the inherited methods are coupled. There is a coupling when one of the conditions given in the IC metric definition holds.
• AMC - Average method complexity: This metric measures the average method size for each class, where the size of a method is equal to the number of Java binary codes in the method.
• CC - McCabe's cyclomatic complexity: It is equal to the number of different paths in a method (function) plus one. The cyclomatic complexity is defined as:
– CC = E - N + P
where:
– E is the number of edges of the graph
– N is the number of nodes of the graph
– P is the number of connected components
• LCOM3 - Lack of cohesion in methods: LCOM3 varies between 0 and 2 and is defined in terms of:
– m: number of procedures (methods) in the class
– a: number of variables (attributes) in the class
– µ(A): number of methods that access a variable (attribute)
Understand Tool
It is a proprietary, paid application developed by SciTools (http://www.scitools.com). It is a static code analysis tool mainly employed for purposes such as reverse engineering, automatic documentation, and the calculation of source code metrics for software projects with large code bases. Understand functions through an integrated development environment (IDE), which is designed to aid the maintenance of old code and the understanding of new code by employing detailed cross references and a wide range of graphical views. Understand supports a large number of programming languages, including Ada, C, ANSI C, C++, C#, Cobol, the style sheet language CSS, Delphi, Fortran, Java, JavaScript, JOVIAL, PHP, Python, HTML, and the hardware description language VHDL. The calculated metrics include complexity metrics (such as McCabe's CC), size and volume metrics (such as LOC), and other OO metrics (such as depth of inheritance tree [DIT] and coupling between object classes [CBO]).
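The cyclomatic complexity formula CC = E - N + P described for ckjm above can be checked on a small control-flow graph. The graph below is a made-up example for a single if/else; note that the E - N + P form (rather than E - N + 2P) assumes the conventional extra edge from the exit node back to the entry node:

```python
def cyclomatic_complexity(num_edges: int, num_nodes: int, num_components: int = 1) -> int:
    """McCabe's cyclomatic complexity as given above: CC = E - N + P."""
    return num_edges - num_nodes + num_components

# Hypothetical control-flow graph of a single if/else statement, with the
# conventional edge from the exit node back to the entry node included:
# decision -> then, decision -> else, then -> exit, else -> exit, exit -> decision
# => 5 edges, 4 nodes, 1 connected component.
print(cyclomatic_complexity(5, 4, 1))  # 2: one decision point plus one
```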