DSCI-644-01.

Software Engineering for Data Science


Individual Assignment 01
V.2.2

Part 1. Software Quality Metrics

Description:
This assignment is to evaluate the quality of a given project based on:
• Complexity Metrics: McCabe cyclomatic complexity (its formula is recalled below), comment-to-code ratio.
• Volume Metrics: lines of code; number of files, functions, and variables.
• Object-Oriented: CK metrics.
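
For reference, McCabe's cyclomatic complexity is computed from a method's control-flow graph as M = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components (1 for a single method). Equivalently, M is the number of decision points plus one; for example, a method containing a single if statement has M = 2.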

Peter Drucker, a famous management consultant, said: “If you can’t measure it, you can’t manage it.” It may not be possible for a project team to successfully manage a project if they are not able to accurately measure all aspects of that project. Metrics have played a vital role in the software industry, and the failure rate of software projects has decreased considerably thanks to the use of software project management tools and techniques. Nowadays, more efficient, robust, and quantitative measures for effective software requirements, analysis, design, development, quality assurance, integration, deployment, and support are in practice. The quality of software systems should be measured during each phase of software system development.
In this context, the CK metrics suite is an initiative to examine various aspects of object-oriented system design, such as size, coupling, cohesion, and inheritance; it comprises the six Chidamber and Kemerer metrics WMC, DIT, NOC, CBO, RFC, and LCOM.
Understand is a commercial tool that calculates various metrics for your code during build cycles and warns you, via the Problems view, of 'range violations' for each metric. This allows you to stay continuously aware of the health of your codebase. You may also export the metrics to HTML for public display, or to CSV or XML format for further analysis. Note that although metrics can be useful in a software development effort, they should not take the place of good taste and experience. Also, metrics are more useful as indicators of unhealthy code than of healthy code: a codebase with many range-violation warnings probably needs refactoring, but the absence of range-violation warnings does not necessarily mean that the code is good.
Download link (if you’re a student you can get a free license):
https://scitools.com/download/

Tasks:
• Search for a small/medium (≤ 100,000 LOC) open-source Java project that has at least 5 releases/versions (v1.0, v1.1, v2.0, etc.).
• In order to find variance in the values of the metrics between versions, it is highly recommended to choose versions that vary in size. Practically, choose 5 versions of the software such that there are at least two versions whose difference in LOC is statistically significant; try a test similar to the Wilcoxon rank-sum test. Make sure to show the 5 boxplots of LOC, and a screenshot of the statistical-test results, as part of your submission report (a first sketch follows this task list).
• (Optional) Slack me the project name if you’re not sure about your selection.
• Use the tool Understand to calculate the metrics for each of the 5 releases/versions. The results should be saved in an Excel file. Use the values of these metrics to plot the variation of the metrics throughout the evolution of the software.
• The metrics to extract are:
o Complexity Metrics: McCabe cyclomatic complexity, comment-to-code ratio.
o Volume Metrics: lines of code; number of files, functions, and variables.
o Object-Oriented: CK metrics.
• An example of the Excel file, with some of the needed graphs for a given project, is provided under Content -> Assignments -> Assignment 1.
• Use these graphs to evaluate the evolution of the project from a quality perspective:
1) comment on each of these graphs individually to describe the evolution of each metric;
2) using all these graphs combined, try to locate any patterns that can be seen as indicators of good or bad quality, and use them to give insights about possible recommendations for developers in the future.
• Now, we want to determine which of these metrics has been evolving most drastically across the five versions that you have considered for this assignment. For each metric, compare its values in the first version with its values in the last version. The comparison is performed using the Wilcoxon signed-rank test, to verify whether the metric’s difference between the two compared versions is statistically significant. For example, let us consider the first version (e.g., v1.0) and the last version (e.g., v3.5), where v1.0 contains 5 classes and v3.5 contains 9 classes. If we are comparing the LOC between these two versions, then we will end up with a column of 5 values representing v1.0 and another column of 9 values representing v3.5. When feeding these columns to the Wilcoxon test, we need to add 4 zeros to the first column, since the signed-rank test only accepts columns of equal size. You can hypothesize that v3.5’s values are higher than v1.0’s and verify whether your hypothesis holds, i.e., whether the p-value is statistically significant (a second sketch follows this list).
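
As a first sketch, here is a minimal Python example of the version-selection check above. It assumes per-class LOC values have been exported from Understand into one CSV file per version, with a column named CountLineCode; the file and column names are placeholders, so adjust them to your actual export. It draws the five LOC boxplots and runs the Wilcoxon rank-sum test on two of the versions:

# Version-selection check: 5 LOC boxplots + Wilcoxon rank-sum test.
# Assumes one CSV per version, exported from Understand, with a
# per-class LOC column named "CountLineCode" (adjust to your export).
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import ranksums

versions = ["v1.0", "v1.1", "v2.0", "v2.1", "v3.5"]   # hypothetical version tags
loc = [pd.read_csv(f"metrics_{v}.csv")["CountLineCode"].dropna()
       for v in versions]

# Boxplots of per-class LOC for the five versions.
plt.boxplot(loc)
plt.xticks(range(1, len(versions) + 1), versions)
plt.ylabel("Lines of code per class")
plt.savefig("loc_boxplots.png")

# Rank-sum test between the first and the last version; unlike the
# signed-rank test, it does not require equal sample sizes.
stat, p = ranksums(loc[0], loc[-1])
print(f"rank-sum statistic = {stat:.3f}, p-value = {p:.4f}")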
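
As a second sketch, the first-versus-last comparison with the zero-padding procedure described in the last bullet. The LOC values below are toy numbers standing in for the 5 classes of v1.0 and the 9 classes of v3.5, and the one-sided alternative encodes the hypothesis that the last version's values are higher:

# First-vs-last comparison with the Wilcoxon signed-rank test.
# Following the procedure above, the shorter column is padded with
# zeros so both columns have equal length (paired data is required).
import numpy as np
from scipy.stats import wilcoxon

v_first = np.array([120, 80, 45, 200, 95])                  # 5 classes in v1.0 (toy values)
v_last = np.array([150, 90, 60, 240, 110, 30, 75, 55, 40])  # 9 classes in v3.5 (toy values)

v_first_padded = np.concatenate(
    [v_first, np.zeros(len(v_last) - len(v_first))])        # add 4 zeros

# One-sided test: is v3.5 stochastically higher than v1.0?
stat, p = wilcoxon(v_first_padded, v_last, alternative="less")
print(f"signed-rank statistic = {stat:.1f}, p-value = {p:.4f}")
if p < 0.05:
    print("The difference is statistically significant at the 5% level.")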

Part 2: Design Defects Detection

Description:
Fault prediction is necessary in the software development life cycle in order to reduce probable software failures, and it is carried out mostly during initial planning to identify fault-prone modules. Fault prediction not only gives insight into the need for increased quality monitoring during software development, but also provides the necessary tips for undertaking suitable verification and validation approaches that eventually lead to improving the efficiency and effectiveness of fault correction.
• inFusion is a very efficient commercial tool to detect anti-patterns (also called code smells or design flaws). Each detected design flaw is assigned a numerical score (probability) representing the relative severity of the design flaw, as well as its absolute negative impact on overall quality. Using this prioritization scheme, you save time and maximize the efficiency of allocating maintenance effort in your project. inFusion can detect several types of code smells, listed below.
• DesigniteJava is a code quality assessment tool for code written in Java. It detects numerous architecture, design, and implementation smells that reveal maintainability issues present in the analyzed code. DesigniteJava also computes many commonly used object-oriented metrics. It helps you reduce technical debt and improve the maintainability of your software.
Both tools are able to identify several design and code defects such as:
• God Class
• Blob
• Data Clumps
• SAP Breakers
• Data Class
• Refused Parent Bequest
• Feature Envy
• Cyclic Dependencies
• Internal Duplication
• Schizophrenic Class
• External Duplication
• Intensive Coupling

Downloads:
• You can find inFusion in Content -> Assignments -> Assignment 2. If the tool generates errors, it means that you do not have the right version of the JRE or JDK installed.
• You can find DesigniteJava, with its description, at http://www.designite-tools.com/designitejava/enterprise ; I also added the JAR file to the Assignment 2 folder. To execute DesigniteJava, follow the instructions in the Usage section of the website.

Tasks:
• Search for a small/medium (≤ 100,000 LOC) open-source Java project that has at least 5 releases/versions (v1.0, v1.1, v2.0, etc.). You may use the same project from assignment 1 if possible.
• Use the tool inFusion to generate a report about the design flaws for each of the 5 releases/versions. The results should be saved in an Excel file. Use the design-flaw values to plot the variation of the quality deficit indices (complexity, coupling, cohesion, encapsulation, inheritance) throughout the 5 versions of the software.
• An example of the Excel file, with some of the needed graphs for a given project, is provided under Content -> Assignments -> Assignment 2.
• Analyze the evolution of the design flaws as the software size increases, and point out which of the design qualities are in deficit (refer to the graphs to determine the highest flaw).
• Use the tool DesigniteJava to generate a report about the design smells and implementation smells for each of the 5 releases/versions. Include the files generated by DesigniteJava as part of your final submission. Calculate the sum of the design smells (adding all smells from all files) and of the implementation smells, to obtain two totals (design and implementation) per version. Use these totals to plot their variation throughout the 5 versions of the software (a sketch follows this list).
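
As a sketch of the smell totals in the last bullet, assuming DesigniteJava emits one CSV file per smell category and per version, with one row per detected smell (the DesignSmells.csv and ImplementationSmells.csv names below are assumptions; adjust the paths to the tool's actual output):

# Sum design and implementation smells per version and plot the totals.
# File names below are assumptions; match them to DesigniteJava's output.
import pandas as pd
import matplotlib.pyplot as plt

versions = ["v1.0", "v1.1", "v2.0", "v2.1", "v3.5"]  # hypothetical version tags
design_totals, impl_totals = [], []
for v in versions:
    # One row per detected smell, so each total is simply the row count.
    design_totals.append(len(pd.read_csv(f"{v}/DesignSmells.csv")))
    impl_totals.append(len(pd.read_csv(f"{v}/ImplementationSmells.csv")))

plt.plot(versions, design_totals, marker="o", label="Design smells")
plt.plot(versions, impl_totals, marker="s", label="Implementation smells")
plt.ylabel("Total smells")
plt.legend()
plt.savefig("smell_totals.png")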

Part 3: Design Defects Correction

Description:
Locating and fixing software defects is one of the most expensive tasks involved in software development. It tends to be subjective and manual, and it requires extensive knowledge of the target software design. In this part, you are assigned to fix detected defects in a given project: you will be given a set of defect types that need to be fixed, and you will need to provide a sequence of refactoring operations that fixes multiple instances of those defects. This requires analyzing the project to look for code fragments to fix (infected with smells) and then providing guidelines on how to fix them, by giving a set of refactorings that you recommend. Your output will be used to optimize the quality of the software.

You are required to justify your choices in terms of refactoring decisions for each fixed defect type, but you are not required to execute the recommended refactorings; you just need to enumerate them and provide their execution order. You are required to fix at least 6 instances of at least 4 defect types; for example, you can fix 2 God Class instances, 1 Data Class instance, 2 Lazy Class instances, and 1 Feature Envy instance. The expected number of refactoring operations to propose is in the range of [6 ... 10].

Table 1. Refactorings catalog

Refactorings        Actors    Roles
Move method         class     source class, target class
                    method    moved method
Move field          class     source class, target class
                    field     moved field
Pull up field       class     sub classes, super class
                    field     moved field
Pull up method      class     sub classes, super class
                    method    moved method
Push down field     class     super class, sub classes
                    field     moved field
Push down method    class     super class, sub classes
                    method    moved method
Inline class        class     source class, target class
Extract class       class     source class, new class
                    field     moved fields
                    method    moved methods
Move class          package   source package, target package
                    class     moved class
Extract interface   class     source classes, new interface
                    field     moved fields
                    method    moved methods

Defect Types priority list:
1. God Class
2. Data Class
3. Lazy Class
4. Feature Envy
5. Shotgun Surgery
6. Brain Method
7. Refused Parent Bequest
8. Intensive Coupling

Task:
For this exercise, you will have inFusion analyze a given version of a software system that you choose. You can keep using the one you used for the previous parts. You will then be asked to provide some details regarding the project; those details are indicated in the table above (number of classes, KLOC, KLOC of the classes responsible for testing the project, etc.). You may use Understand to get these details and to visualize the source code when recommending refactoring operations.

1. Run inFusion to generate the number of defects in the project.
2. Run Understand to visualize the class diagram of the project before you refactor it.
3. Locate at least 6 defect instances that belong to at least 4 different types from the priority list given in the description above. Recall that it is just a priority list; you can still fix defects that are not in the list if needed.

4. List the necessary refactoring operations that can fix them, along with your justification for the choice of refactorings you made. Your overall number of refactorings should be between 6 and 20. An example of the Excel file with some sample refactorings for a given project is provided under Content -> Assignments -> Assignment 2.
5. Run Understand to visualize the class diagram of the project after you refactor it. Point out any differences you notice.
6. Fill out the Excel file template provided with your work.

Grading:
Part 1. Graphs - 35%
Part 2. Graphs - 35%
Part 3. Refactorings - 35%

Total: 105% (5% bonus).

Submission:
Submit 1 PDF file and one ZIP file containing separate folders for each part. The PDF should contain 3 parts: the first part contains the graphs and your comments/recommendations; the second part contains your screenshots of the defect graphs you generated; in the third part, you provide the refactorings you suggested and your explanation of their corrections.
The ZIP file, with a folder for each part, should contain the following:
Part1.
• You need to provide the source code (compressed) of the project you studied, along with the Excel file that generated the graphs you included in your report.
• The Excel file must be labeled with the name of the studied project (the given template Excel file was named Notepad++).
Part2.
• You need to include the Excel file and the zipped source code of every chosen project release. The Excel file should be renamed “ProjectName+Detection”, where ProjectName is the name of the project you analyzed (example: Notepad++Detection.xlsx).
Part3.
• You need to provide the Excel file containing the refactorings that you are recommending. It should be renamed “ProjectName+Correction”, where ProjectName is the name of the project you analyzed (example: Notepad++Correction.xlsx). Also provide the output files of DesigniteJava.

Change Log
V.2.2 2021_02_04. Removed Designite for now.
V.2.1 2021_02_04. Switched testing with defects detection and correction.
V.1.1 2021_02_01. Added the statistical significance between two versions of all metrics.
V.1.0 2021_01_20. Initial verification by TA.
