DELHI TECHNOLOGICAL UNIVERSITY
Empirical Software Engineering Lab File
Submitted To: Prof. Sanjay Kumar
Submitted By: Sahil Bhasin (2K16/SE/071)
Q 1. An online loan system has two modules for its two basic services, namely the Car loan service
and House loan service. The two modules have been named as Car_Loan_Module and
House_Loan_Module. Car_Loan_Module has 2000 lines of uncommented source code.
House_Loan_Module has 3000 lines of uncommented source code. Car_Loan_Module was
completely implemented by Mike. House_Loan_Module was completely implemented by
John. Mike took 100 person hours to implement Car_Loan_Module. John took 200 person
hours to implement House_Loan_Module. Mike’s module had 5 defects. John’s module had
6 defects. With respect to the context given, which among the following is an INCORRECT
statement?
Justify and choose one:
a) John’s Quality is better than Mike’s Quality
b) John’s Productivity is more than Mike’s Productivity
c) John introduced more defects than Mike
d) John’s Effort is more than Mike’s Effort.
ANS. For John,
Size = 3000 LOC
Effort = 200 person-hours
Defect = 6
Productivity = size/effort = 3000/200 LOC/person-hours = 15 LOC/person-hours
Quality = defects/size = 6 defects / 3000 LOC = 0.002 defects/LOC
For Mike,
Size = 2000 LOC
Effort = 100 person-hours
Defect = 5
Productivity = size/effort = 2000/100 LOC/person-hours = 20 LOC/person-hours
Quality = defects/size = 5 defects / 2000 LOC = 0.0025 defects/LOC
Mike’s productivity (20 LOC/person-hour) is more than John’s (15 LOC/person-hour), so option
b) is the INCORRECT statement. John’s defect density (0.002) is lower than Mike’s (0.0025), so
his quality is indeed better; he introduced more defects in absolute terms (6 vs. 5); and his
effort (200 person-hours) is higher. Hence statements a), c), and d) are all correct.
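These figures can be checked quickly in R (a minimal sketch; the variable names are illustrative):
> size = c(mike = 2000, john = 3000)    # LOC
> effort = c(mike = 100, john = 200)    # person-hours
> defects = c(mike = 5, john = 6)
> size / effort     # productivity: mike 20, john 15 LOC per person-hour
> defects / size    # defect density: mike 0.0025, john 0.0020 defects per LOC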
Q 2. Defect detection activities like reviews and testing help in identifying the defects in the
artefacts (deliverables). These defects must be classified into various buckets before carrying
out the root cause analysis. Following are some of the defect categories: Logical, User interface,
Maintainability, and Standards.
In the context of the above defect categories, classify the following statements under the defect
categories.
a) Divide by Zero Error is not guarded
b) Usage of 3.14 in the statement Circle_Area = 3.14 * Radius * Radius;
c) 3500 lines of code in a single function
d) A pointer is declared but not initialized. It is used in the program for storing a value.
e) A program designed to handle 1000 simultaneous users crashed when the 1001st user
logged in.
f) A “while” loop never exits
g) User interface displays “MALFUNCTION 54” when something goes wrong in the
back-end. No documentation (comments) for the source code
h) Hungarian Notation not followed while coding, even though the coding guidelines
mandate to use Hungarian Notation
i) Pressing of “Tab” key moves the cursor in different fields of a web form randomly.
ANS.
1. Divide by Zero Error is not guarded. Category: Logical
2. Usage of 3.14 in the statement Circle_Area = 3.14 * Radius * Radius. Category: Logical
3. 3500 lines of code in a single function. Category: Maintainability
4. A pointer is declared but not initialized, and it is used in the program for storing a
value. Category: Standards
5. A program designed to handle 1000 simultaneous users crashed when the 1001st user
logged in. Category: Maintainability
6. A “while” loop never exits. Category: Logical
7. User interface displays “MALFUNCTION 54” when something goes wrong in the back-end;
no documentation (comments) for the source code. Category: User Interface
8. Hungarian Notation not followed while coding, even though the coding guidelines mandate
its use. Category: Standards
9. Pressing the “Tab” key moves the cursor between the fields of a web form
randomly. Category: User Interface
Q3. Statistical Hypothesis Testing in R: statisticians use hypothesis testing to formally check
whether a hypothesis should be accepted or rejected. Consider an example or data of your
choice and identify the following:
• State the Hypotheses
• Formulate an Analysis Plan
• Analyze Sample Data
• Interpret Results
• Estimate type-I and type-II error
ANS.
Suppose the mean weight of King Penguins found in an Antarctic colony last year was 15.4
kg. Assume that in a random sample of 35 penguins, the standard deviation of the weight is
2.5 kg. We test the null hypothesis H0: μ = 15.4 against the two-sided alternative Ha: μ ≠ 15.4
at the 5% significance level.
We begin by computing the standard error estimate, SE.
> n = 35 # sample size
> s = 2.5 # sample standard deviation
> SE = s/sqrt(n); SE # standard error estimate
[1] 0.42258
We next compute the lower and upper bounds of sample means for which the null
hypothesis μ = 15.4 would not be rejected.
> alpha = .05 # significance level
> mu0 = 15.4 # hypothetical mean
> I = c(alpha/2, 1-alpha/2)
> q = mu0 + qt(I, df=n-1) * SE; q
[1] 14.541 16.259
Therefore, so long as the sample mean is between 14.541 and 16.259 in a hypothesis test,
the null hypothesis will not be rejected. Since we assume that the actual population mean is
15.1, we can compute the lower tail probabilities of both end points.
> mu = 15.1 # assumed actual mean
> p = pt((q - mu)/SE, df=n-1); p
[1] 0.097445 0.995168
A type I error is also called a “false positive”: a result when an actual “miss” is erroneously
seen as a “hit”, i.e. a true null hypothesis is wrongly rejected. The type I error rate is denoted
by the Greek letter alpha (α) and usually equals the significance level of the test. A type II
error is the probability of wrongly not rejecting the null hypothesis when the null hypothesis
is false. In other words, a type II error occurs when the null hypothesis is actually false, but
somehow it fails to get rejected. It is also known as a “false negative”: a result when an actual
“hit” is erroneously seen as a “miss.” The type II error rate is denoted by the Greek letter
beta (β) and is related to the power of the test (which equals 1 − β).
Probability of Type I error = significance level = 0.05, or 5%.
Finally, the probability of a type II error is the probability that the sample mean falls between
the two end points, which is 89.8%.
> diff(p) # p[2]-p[1]
[1] 0.89772
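The same kind of test can also be run end-to-end with R's built-in t.test() (a minimal sketch
on simulated data; the rnorm() sample below is an assumption, not real penguin weights):
> set.seed(42)
> weights = rnorm(35, mean = 15.1, sd = 2.5)   # hypothetical sample
> t.test(weights, mu = 15.4)                   # two-sided one-sample t-test
The reported p-value can then be compared directly against the significance level α.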
Q4. Write a two-page report in your own words on the usage of various machine learning
techniques for fault prediction.
ABSTRACT
Software fault prediction is a significant part of software engineering. Fault
prediction means to identify fault prone modules at the early stage of software
development. It helps to reduce overall testing time, effort, and cost. It significantly
improves the goodwill and profit of the organization by providing customer
satisfaction.
INTRODUCTION
Today, we are living in the world of computers, where software is used in almost every
field of life. In 2018, the worldwide software development market was worth about $389 billion
according to the IT research and advisory firm Statista. This figure shows the importance of
software, so it is necessary for a software development company to deliver error-free
software. Practically, however, it is not possible to make software 100% error free. We can
reduce errors by using well-known techniques called fault prediction models, which constitute
the topic of this paper. Software fault prediction (SFP) models are used to identify fault-prone
modules at an early stage of software development, because detecting a fault at a later stage
increases the cost exceptionally, decreases quality, and leads to customer dissatisfaction.
SFP models therefore help the testing team to focus more on fault-prone modules and enable
optimal utilization of resources.
MACHINE LEARNING TECHNIQUES USED FOR SOFTWARE FAULT
PREDICTION
Machine learning methods are mainly categorized into two main categories:
• Supervised learning: a method where both the predictors and the response variable are
given. Techniques such as Decision Tree, Random Forest, and Naïve Bayes come under this
category.
• Unsupervised learning: an approach used in situations where no fault data is given. Here
the algorithm finds hidden structure or patterns in unlabelled data. In the case of fault
prediction, if we need to predict faults at different levels, then clustering is the better
approach. A minimal supervised example is sketched below.
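A minimal sketch of supervised fault prediction with a decision tree, using the rpart package
on a synthetic dataset (all metric names and values here are illustrative assumptions, not a
real defect repository):
> library(rpart)
> set.seed(1)
> n = 200
> defects = data.frame(loc = rpois(n, 120), complexity = rpois(n, 8),
+                      churn = rpois(n, 15))
> defects$faulty = factor(defects$complexity + defects$churn/5 + rnorm(n, 0, 2) > 11,
+                         labels = c("clean", "faulty"))   # synthetic class label
> model = rpart(faulty ~ loc + complexity + churn, data = defects, method = "class")
> pred = predict(model, defects, type = "class")
> table(predicted = pred, actual = defects$faulty)   # confusion matrix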
COMPARISON OF DIFFERENT MLT FOR FAULT PREDICTION
In the above section we have discussed various machine learning techniques used by
researchers for prediction of faults. The performance of each technique varies with the
dataset, so it is the responsibility of the practitioner to select the best technique depending
upon the requirements of the dataset.
Figure 1: Comparative use of various ML techniques over the last few years (figure not reproduced here)
It is clear from the extensive literature survey that the decision tree is the most widely used
technique for prediction of faults. Bayesian learners and regression are also used by
researchers depending upon the requirements of the dataset. In future it would be better to
implement hybrid approaches to improve the accuracy of the model.
RESEARCH CONTRIBUTION
This paper presents a comprehensive survey to show the current trend of various machine
learning techniques to predict faults in software modules at different levels. Most of the
experimental work done by researchers is based on the PROMISE data repository, which does
not reflect real-life problems. After a detailed literature survey, we found some limitations in
the existing research work, which are also presented in tabular form. In previous research
works there are various issues that need to be reconsidered, such as class rebalancing,
threshold-dependent performance measures, and the unavailability of well-documented
modelling scripts from published settings, which makes it difficult to generalize the findings
of the studies. This survey will guide practitioners to explore more problems and solve them
by providing relevant solutions.
CONCLUSION AND FUTURE WORK
This paper presents a detailed review of various machine learning techniques for SFP. SFP is
necessary for minimizing the cost as well as the time of software testing. Modules which are
more prone to errors require more resources. SFP enables the testing team to utilize resources
optimally, which helps to improve the quality of the system. The aim of this study is to assess
the research work done by various researchers on machine learning techniques for software
fault prediction, so that it helps practitioners who are interested in building fault prediction
models. After a detailed review we found that random forest, neural networks, and naïve Bayes
are good enough for SFP, but no single technique is appropriate for all kinds of datasets. So it
is better to choose the result from a set of prediction models. Hence, in future we are planning
to implement heterogeneous ensembling to increase the overall efficiency of the system.
Q5. Why is version control important? How many types of version control systems are there?
Version control is important for all code, files, and assets that multiple team members will
collaborate on. Using version control software helps you keep track of changes — and keep
every team member working off the latest version.
It needs to do more than just manage and track files. It should help you develop and ship
products faster. This is especially important for teams practicing DevOps.
That’s because using the right one:
• Improves visibility.
• Helps teams collaborate around the world.
• Accelerates product delivery.
There are two broad types of version control systems in use today: centralized systems, where
the full history lives in a single central repository, and distributed systems, where every
developer has a complete copy of the repository.
Here are a few of the most popular version control systems:
• Helix Core (Perforce)
• Git
• SVN
• ClearCase
• Mercurial
• TFS
Q6. Demonstrate how Git can be used to perform version control.
Git is one of the most popular version control systems. It is a distributed version control
system: changes do not have to be committed to a single central repository, which would
require every person working on the project to access that central repository and download
the latest code in order to save changes. Instead, everyone can have their own local
repository with its entire history.
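A minimal example session (the file name and commit messages are illustrative):
git init demo                      # create a new repository
cd demo
echo "hello" > README.txt
git add README.txt                 # stage the new file
git commit -m "Initial commit"     # record the first snapshot
echo "more" >> README.txt
git status                         # shows README.txt as modified
git commit -am "Update README"     # stage and commit the change
git log --oneline                  # lists both commits
Each commit records a snapshot in the local repository; git push would then share these
commits with a remote, as described in the next question.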
Q7. List the various Git commands and their description.
1) Git config
To set your user name and email in the main configuration file.
How to : To check your name and email, type git config --global user.name and git config
--global user.email. To set a new name or email, use git config --global user.name
"Your Name" and git config --global user.email "you@example.com" (the name and address
here are placeholders).
2) git init
To initialise a git repository for a new or existing project.
How to : git init in the root of your project directory.
3) git clone
To copy a git repository from remote source, also sets the remote to original source so that
you can pull again.
How to : git clone <:clone git url:>
4) git status
To check the status of files you’ve changed in your working directory, i.e., what all has
changed since your last commit.
How to : git status in your working directory; it lists all the files that have been changed.
5) git add
Adds changes to stage/index in your working directory.
How to : git add .
6) git commit
Records your staged changes as a new commit object in the repository history.
How to : git commit -m "sweet little commit message"
7) git push/git pull
Push or Pull your changes to remote. If you have added and committed your changes and you
want to push them. Or if your remote has updated and you want those latest changes.
How to : git pull <:remote:> <:branch:> and git push <:remote:> <:branch:>
8) git branch
Lists out all the branches.
How to : git branch or git branch -a to list all the remote branches as well.
9) git checkout
Switch to different branches
How to : git checkout <:branch:> or git checkout -b <:branch:> if you want to create
and switch to a new branch.
10) git stash
Save changes that you don’t want to commit immediately.
How to : git stash in your working directory. git stash apply if you want to bring your saved
changes back.
11) git merge
Merge two branches you were working on.
How to : Switch to branch you want to merge everything in. git
merge <:branch_you_want_to_merge:>
12) git reset
Moves the current branch back to a specified commit (and, depending on the mode, also
resets the index and working tree); useful when you have committed changes that are not
complete.
How to : git reset <:mode:> <:COMMIT:>
13) git remote
To check what remote/source you have or add a new remote.
How to : git remote to check and list remotes, and git remote add <:remote_name:> <:remote_url:> to add a new one.
These are the essential Git commands for getting day-to-day version control work done.
Q8. Perform a comparison of the following data analysis tools.
a. WEKA
b. KEEL
c. SPSS
d. MATLAB
e. R
Ans.
WEKA: Operating system: Mac/Windows; License: GNU GPL; Interface: Syntax/Menu; Open
source: Yes; Language: Java; Graphics: Excellent; Availability of help: Good; Speciality:
used for machine learning techniques.
KEEL: Operating system: Mac/Windows; License: GNU GPL; Interface: Menu; Open source: Yes;
Language: Java; Graphics: Moderate; Availability of help: Moderate; Speciality: used for
evolutionary algorithms.
SPSS: Operating system: Mac/Windows; License: Proprietary; Interface: Syntax/Menu; Open
source: No; Language: Java; Graphics: Very Good; Availability of help: Good; Speciality:
used for multivariate and statistical testing.
MATLAB: Operating system: Mac/Windows; License: Proprietary; Interface: Syntax/Menu; Open
source: No; Language: C++/Java; Graphics: Good; Availability of help: Very Good; Speciality:
best for developing new mathematical techniques; used for image and signal processing.
R: Operating system: Mac/Windows; License: GNU GPL; Interface: Syntax; Open source: Yes;
Language: Fortran/C; Graphics: NA; Availability of help: Average; Speciality: extensive
library support.
Q9. Consider a defect dataset and implement the following statistical tests using the SPSS tool.
a. t-test
b. Chi-Square Test
c. Wilcoxon Signed-Rank Test
d. Friedman Test
e. Kruskal-Wallis Test
Ans. SPSS Statistics is a software package that is used for statistical analysis. It was acquired
by IBM in 2009 and the current versions (2014) are officially named IBM SPSS Statistics. The
software name stands for Statistical Package for Social Sciences, which reflects the original
market.
SPSS is one of the most powerful tools that can be used for carrying out almost any type of
data analysis. This data analysis could be either in the field of social sciences, natural
sciences, or in the world of business and management. This tool is widely used for research
and interpretation of the data.
a. t-test
The dataset used is from NASA metric data program (MDP). The hypothesis is stated
as follows:
H0: μ = 25 (Mean of total lines of code is equal to 25.)
Ha: μ ≠ 25 (Mean of total lines of code is not equal to 25.)
After implementing a one-sample t-test using the SPSS tool, the following results were
obtained (output not reproduced here):
The obtained p-value is 0.000, which is less than 0.001. Therefore, the null hypothesis is
rejected. Thus, it is concluded that the mean total lines of code is statistically significantly
different from 25.
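An equivalent one-sample t-test can be sketched in R (the LOC values below are simulated
assumptions, not the MDP data):
> set.seed(7)
> loc = rpois(50, lambda = 40)   # simulated total-LOC values
> t.test(loc, mu = 25)           # two-sided one-sample t-test against mu = 25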
b. Chi-Square Test
After implementing the Chi-Square test using the SPSS tool, the following results were
obtained (output not reproduced here).
In the SPSS output, for LOC (Lines Of Code), 1 represents software with LOC greater than or
equal to 100 and 2 represents software with LOC less than 100.
The Chi-Square Tests table shows that the p-value (Asymptotic Significance, 2-sided) is 0.000
and the computed χ2 value is 12.96. The results are significant at the 0.05 significance
level. Hence, we reject the null hypothesis; the results show that modules having 100 or more
lines of code are more likely to be defective.
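An equivalent chi-square test of independence can be sketched in R (the counts below are
illustrative assumptions, not the SPSS output):
> tab = matrix(c(30, 10, 12, 28), nrow = 2,
+   dimnames = list(LOC = c(">=100", "<100"), Defective = c("yes", "no")))
> chisq.test(tab)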
c. Wilcoxon Signed-Rank Test
Wilcoxon signed-ranks test is a nonparametric test that is used to perform pairwise
comparisons among different treatments (Wilcoxon 1945). It is also called Wilcoxon
matched pairs test and is used in the scenario of two related samples (Kothari 2004).
The comparison of performance of two machine learning techniques i.e. RF (Random
Forest) and LR (Linear Regression) are done in terms of AUC (Area Under Curve)
metric on 11 datasets from the NASA metric data program (MDP). The hypothesis is
stated as:
H0: The performance of the two techniques does not differ significantly.
Ha: The performance of the two techniques differs significantly.
Using the Wilcoxon signed-rank test, the following results are obtained (output not
reproduced here):
The p-value obtained is 0.062, which is greater than α = 0.05. Thus, the results are not
significant at the critical value α = 0.05. Therefore, the null hypothesis is not rejected, and
the performance of the two techniques does not differ significantly.
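An equivalent paired comparison can be sketched in R (the AUC values are simulated
assumptions):
> set.seed(7)
> auc_rf = runif(11, 0.70, 0.90)             # AUC of RF on 11 datasets
> auc_lr = auc_rf - runif(11, -0.05, 0.10)   # AUC of LR on the same datasets
> wilcox.test(auc_rf, auc_lr, paired = TRUE)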
d. Friedman Test
The hypothesis is stated as:
H0: There is no statistical difference between the performances of various methods
Ha: There is statistical significant difference between the performances of various
methods.
Using the Friedman test, the following results are obtained (output not reproduced here):
Since the p-value obtained is 0.003, which is less than 0.05, the results are significant at
the critical value α = 0.05. Thus, the null hypothesis is rejected, and the performances of
the six methods differ significantly.
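An equivalent Friedman test can be sketched in R (the performance matrix is a simulated
assumption):
> set.seed(7)
> perf = matrix(runif(11 * 6, 0.6, 0.9), nrow = 11)  # 11 datasets x 6 methods
> friedman.test(perf)   # columns are treated as methods, rows as datasets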
e. Kruskal Wallis Test
The Kruskal-Wallis H test showed that there was a statistically significant difference in
cyclomatic complexity between defective and non-defective software modules
(p = 0.000 < 0.05).
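An equivalent Kruskal-Wallis test can be sketched in R (the complexity values are simulated
assumptions):
> set.seed(7)
> complexity = c(rpois(25, 12), rpois(25, 6))
> group = factor(rep(c("defective", "non-defective"), each = 25))
> kruskal.test(complexity ~ group)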
Q10. Consider a defect dataset and perform the following feature reduction techniques using
the Weka tool. Validate the dataset using 10-fold cross-validation.
a. Correlation based feature evaluation
b. Relief Attribute feature evaluation
c. Information gain feature evaluation
d. Principal Component
Ans.
a. Correlation based feature evaluation
Correlation based feature evaluation measures the worth of each attribute by its correlation
with the class.
The following results are obtained after using Correlation based feature evaluation (output
not reproduced here):
b. Relief Attribute feature evaluation
The Relief Attribute feature evaluation method evaluates the worth of an attribute by
repeatedly sampling an instance and considering the value of the given attribute for
the nearest instance of the same and different class.
The following results are obtained after using Relief Attribute feature evaluation (output
not reproduced here):
c. Information gain feature evaluation
The following results are obtained after using Information gain feature evaluation (output
not reproduced here):
d. Principal Component
The principal component method (or P.C. method) is a standard technique used to find the
interdependence among a set of variables.
The following results are obtained after using Principal Components (output not reproduced
here):
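Since the Weka output is not reproduced here, the ideas behind (a) correlation-based
evaluation and (d) principal components can be sketched in base R on a synthetic dataset
(all names and values are illustrative assumptions):
> set.seed(1)
> n = 200
> features = data.frame(loc = rpois(n, 120), complexity = rpois(n, 8),
+                       churn = rpois(n, 15))
> faulty = as.numeric(features$complexity + features$churn/5 + rnorm(n, 0, 2) > 11)
> # (a) rank attributes by absolute correlation with the class
> sort(abs(cor(features, faulty)[, 1]), decreasing = TRUE)
> # (d) principal components on the standardized features
> pca = prcomp(features, scale. = TRUE)
> summary(pca)   # proportion of variance explained per component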