Lecture 8 - SCM and 7 Quality Tools
Software Quality Assurance
Lecture 7
Outline
• Software Configuration Management
• Git
• Seven Basic Quality Tools
Software Configuration Management
• Definition of CM
• What is CM
• Why do we have CM
Configuration management is the art of identifying, organizing, and
controlling modifications to the software (& documentation) being built by
a programming team.
SCM activities:
– identify change
– control change
– ensure that change is being properly implemented
– report change to others who may have an interest
Baselines
“A specification or product that has been formally reviewed and agreed
upon, that thereafter serves as the basis for further development, and
that can be changed only through formal change control procedures.”
SCM Tasks
• According to IEEE, CM comprises these primary activities:
1. Identification of objects
2. Management
3. Configuration auditing
4. Status reporting
1. Identification of objects
• Identification is the task of identifying our artifacts, i.e. the
items that make up our system, e.g.
– SDLC Documents
– Software
– Hardware
2. Management
• Management is the introduction of controls (procedures and quality gates) to
ensure that the product evolves appropriately
• Keep your focus on
– Version control
– Change control
Change Control Procedure
• Need for change is recognized
• User creates Change Request (CR)
• Developer evaluates CR, and produces report
• The change control body decides one of:
– Yes: ECO (engineering change order) is queued
– Maybe: Report rejected – need more info.
– No: CR is denied; user is informed
3. Configuration Audit
• Audit is the review of the organizational process against the
defined/required standards
• Areas of focus include
– Adherence to process
– Conformance to security
– Configuration Verification
4. Status reporting
• Configuration Status Reporting
• A CSR report is generated on a regular basis and is intended to keep
management and practitioners apprised of important changes
• What happened?
• Who did it?
• When did it happen?
• What else will be affected?
Git
• Git is a version control system.
• It helps you keep track of code changes.
• The name Git is not a true acronym; "Global Information Tracker" is a backronym sometimes given for it.
• With Git, every time you commit, or save the state of your project, Git basically
takes a picture of what all your files look like at that moment and stores a reference
to that snapshot.
• To be efficient, if files have not changed, Git doesn’t store the file again, just a link
to the previous identical file it has already stored.
• Git thinks about its data more like a stream of snapshots.
Git Has Integrity
• Everything in Git is checksummed before it is stored and is then referred to by that
checksum.
• You can’t lose information in transit or get file corruption without Git being able to
detect it.
• The mechanism that Git uses for this checksumming is called a SHA-1 hash. This is a
40-character string composed of hexadecimal characters (0–9 and a–f) and calculated
based on the contents of a file or directory structure in Git.
• A SHA-1 hash looks something like this:
24b9da6552252987aa493b52f8696cd6d3b00373
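As a rough illustration of the checksumming described above, the following Python sketch computes the 40-character ID Git would give to a file's contents. It assumes the default SHA-1 object format and Git's blob header layout ("blob", the size in bytes, a NUL byte, then the content); the function name git_blob_hash is made up for this example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a file (blob) with this content."""
    # Git does not hash the raw bytes alone: it prepends a short header
    # of the form b"blob <size>\0" and hashes header + content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    digest = git_blob_hash(b"hello world\n")
    # Always 40 hexadecimal characters, whatever the size of the input.
    print(digest, len(digest))
```

Changing a single byte of the content yields a completely different ID, which is how Git can detect corruption or loss in transit.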
What is a Git repository?
A repository is a file structure where Git stores all the project files.
Git can store the files in either a local or a remote repository.
GitHub has a wide range of features that extend what Git can do. Its user
interface lets users perform many Git operations without having to type Git
commands, and it makes communicating with collaborators and other developers
easier.
Although Git and GitHub can each be used independently, they
go hand in hand. Using GitHub looks easy for a non-programmer,
but as a programmer, learning Git commands is necessary.
HEAD
HEAD is the reference to the last commit on the currently checked-out branch.
A new repository always has a default branch, usually called master or main, that HEAD points to.
What is a conflict?
Sometimes while working in a team environment, there might be cases of conflicts such
as:
1. When two separate branches have made changes to the same line in a file
2. A file is deleted in one branch but has been modified in the other.
These conflicts have to be resolved manually after discussion with the team, as Git
cannot predict which and whose changes should take precedence.
Before committing changes, the developer can format and review the files
and make further modifications to them. All of this is done in an
intermediate area known as the ‘Index’ or ‘Staging Area’.
• The staging area gives you control to keep commits small. Make
one logical change in the code and add the changed files to the staging area.
If the changes turn out to be bad, check out the previous commit;
otherwise, commit the changes.
• It gives the flexibility to split a task into smaller tasks and commit
smaller changes. With the staging area it is easier to focus on small tasks.
• The staging area acts as a middle ground between your working directory
(where you make changes) and the committed history.
• Selective Staging: With the staging area, you can selectively stage specific
changes within a file. This means you can commit only portions of a file
while keeping other changes for a later commit. It provides fine-grained
control over what exactly goes into each commit.
• Unstaged changes are modifications made to files in your working directory that Git
is not currently tracking for the next commit.
• Git is aware of these changes, but they have not been explicitly marked for inclusion
in the next commit.
• Unstaged changes are not part of the upcoming commit until you explicitly stage
them using git add.
• Staged changes are modifications that have been marked for inclusion in the next
commit.
• These changes are temporarily stored in the staging area (also known as the index).
• Staged changes represent the state of the files as you intend to commit them.
• When you run git commit, Git records the changes that are staged at that moment.
• Staged changes are not part of the committed history until you complete the
commit process with git commit.
• Git has three main states that your files can
reside in: modified, staged, and committed.
• The basic Git workflow goes something like this:
– You modify files in your working directory.
– You selectively stage just those changes you want to be part of your next commit,
which adds only those changes to the staging area.
– You do a commit, which takes the files as they are in the staging area and stores that
snapshot permanently to your Git directory.
If a particular version of a file is in the Git directory, it’s considered committed. If it has
been modified and was added to the staging area, it is staged. And if it was changed since
it was checked out but has not been staged, it is modified.
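The three states can be sketched as a comparison between three copies of one file: the committed version, the staged version, and the working-directory version. The Python below is only a toy model under that assumption (the function name and the version strings are hypothetical, and real Git compares content hashes rather than labels).

```python
def file_states(committed: str, staged: str, working: str) -> set:
    """Classify one tracked file by comparing its three copies."""
    states = set()
    if working != staged:
        # Changed in the working directory but not yet added to the index.
        states.add("modified")
    if staged != committed:
        # Added to the index and waiting for the next commit.
        states.add("staged")
    if not states:
        # All three copies agree: nothing to stage, nothing to commit.
        states.add("committed")
    return states


if __name__ == "__main__":
    print(file_states("v1", "v1", "v1"))  # unchanged file
    print(file_states("v1", "v1", "v2"))  # edited, not staged
    print(file_states("v1", "v2", "v2"))  # staged, not committed
```

Note that a file can be staged and modified at the same time, for example when it is edited again after git add.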
Git Commit
• Since we have finished our work, we are ready to move from staging to
committing to our repo.
• Git considers each commit a change point or "save point".
• It is a point in the project you can go back to if you find a bug, or
want to make a change.
• When we commit, we should always include a message.
• You can even switch between branches and work on different projects
without them interfering with each other.
• Branching in Git is very lightweight and fast!
• You create branches to isolate your code changes, which you test
before merging to the main branch
• Main branch is the first branch made when you initialize a Git
repository using the git init command.
• Branching means you diverge from the main line of development and
continue to do work without messing with that main line. In many VCS
tools, this is a somewhat expensive process, often requiring you to create
a new copy of your source code directory, which can take a long time for
large projects.
• When you make a commit, Git stores a commit object that contains a
pointer to the snapshot of the content you staged. This object also
contains the author’s name and email address, the message that you
typed, and pointers to the commit or commits that directly came before
this commit (its parent or parents): zero parents for the initial commit,
one parent for a normal commit, and multiple parents for a commit that
results from a merge of two or more branches.
A branch in Git is simply a lightweight movable pointer to one of these
commits. The default branch name in Git is master. As you start making
commits, you’re given a master branch that points to the last commit
you made. Every time you commit, the master branch pointer moves
forward automatically.
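The idea of a branch as a movable pointer can be sketched with a toy model. The Python below is not how Git is implemented; Commit, ToyRepo, and their fields are hypothetical names chosen to mirror the description above (snapshot pointer, author, message, parents).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Commit:
    message: str
    author: str
    snapshot: str                          # stand-in for the pointer to the staged snapshot
    parents: Tuple["Commit", ...] = ()     # zero, one, or several parent commits


class ToyRepo:
    def __init__(self, default_branch: str = "master") -> None:
        # A branch is just a name mapped to the commit it points at.
        self.branches: Dict[str, Optional[Commit]] = {default_branch: None}
        self.head = default_branch         # name of the checked-out branch

    def commit(self, message: str, author: str, snapshot: str) -> Commit:
        tip = self.branches[self.head]
        parents = (tip,) if tip is not None else ()
        new_commit = Commit(message, author, snapshot, parents)
        self.branches[self.head] = new_commit   # the branch pointer moves forward
        return new_commit

    def branch(self, name: str) -> None:
        # Creating a branch only copies a pointer; no files are duplicated.
        self.branches[name] = self.branches[self.head]

    def checkout(self, name: str) -> None:
        self.head = name


if __name__ == "__main__":
    repo = ToyRepo()
    first = repo.commit("initial commit", "dev@example.com", "snapshot-1")
    second = repo.commit("add page", "dev@example.com", "snapshot-2")
    print(len(first.parents), len(second.parents))
```

Because creating a branch only adds one entry to a table of pointers, the model also hints at why branching in Git is cheap compared with copying the whole source directory.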
Branching
• In Git, a branch is a new/separate version of the main repository.
• Let's say you have a large project, and you need to update the design
on it.
• Branches allow you to work on different parts of a project without
impacting the main branch.
• When the work is complete, a branch can be merged with the main
project.
How will you create a git repository?
• Have Git installed on your system.
• Then, to create a Git repository, create a folder for the project and
run git init inside it.
• Doing this creates a hidden .git directory in the project folder, which indicates that the
repository has been created.
git init
Consider that you are developing a project with your team, and you
finish a feature. You contact the client to request them to see the feature,
but they are too busy, so you send them the link to have a look at the
project.
Project Development through linear development.
You have been working on a project with the client being happy until
this point.
Now, you decide to develop a feature and start developing it on the same code (denoted by
black commits).
In the meantime, you decide to develop another feature (let's say xyz) and wait for the
client's approval (xyz denoted by brown commits).
The client disapproves of the feature (black commits) and requests to delete it (denoted by
grey color depicting deletion).
Now, since you were following the linear development method, you need to delete the
complete code and go through the hectic process of adjustments and removing glitches
repeatedly to achieve the following:
Developing the project through branching.
Let's see the same scenario using the Git branching
technique.
You have been working on a project with the client being happy until this
point.
After that, you decide to develop a feature and create a new branch called
feature for the same purpose and start working on it.
Let's assume you are working on a project along with your friend. You are
working on two different features and hence on two different
branches.
Once you both finish work on your features, the feature
branches are merged into the master branch and accepted into the main
stable code.
Later on, you start working on a different branch and a distinct feature. But,
for some reason, the feature is not required this time. Therefore, the
master branch does not include it, and the branch (shown in grey) is
deleted without being merged.
Although the team is working on the master branch continuously,
some time later an urgent fix comes up that must be addressed immediately. The
team fixes it and merges it into the master.
Meanwhile, once this fix arrived, the code was pulled into a feature branch to add
some additional functionality.
After work on this feature branch is finished, it is finally merged into the master branch.
Different Operations On Branches
• Create a Branch: This is the first step in the process. You can start on the default branch
or create a new branch for development.
• Merge a Branch: An existing branch can be merged with any other branch in
your Git repository. Merging helps when you are done with a branch
and want its code integrated into another branch.
• Delete a Branch: An existing branch can be deleted from your Git repository.
Deleting helps when the branch has done its job, i.e., it is already merged,
or you no longer need it in your repository for any reason.
• Checkout a Branch: An existing branch can be checked out so that you can
work on it. Checking out a separate branch helps
when you don't want to disturb the older branch and want to experiment on the new one.
ToDo
• Run the basic commands to get hands-on practice
Here are some basic Git commands along with brief explanations:
git init: Initializes a new Git repository in the current directory. This command
creates a hidden directory called ".git" where Git stores its internal data for version
control.
git clone <repository_url>: Clones an existing Git repository from a remote server to
your local machine. This command creates a copy of the remote repository on your
local machine, allowing you to work on the codebase locally.
git add <file(s)>: Adds file(s) to the staging area, preparing them to be included in
the next commit. This command stages changes for commit.
git commit -m "commit message": Commits the changes staged in the current
branch to the repository. A commit represents a snapshot of the project at a specific
point in time.
git status: Displays the current status of the repository, including the state of tracked
and untracked files, changes in the working directory, and the branch you are
currently on.
git pull: Fetches changes from the remote repository and merges them into the current
branch. This command is used to update your local repository with changes made by
others.
git push: Pushes committed changes from your local repository to the remote repository.
This command is used to share your work with others and update the remote repository
with your changes.
git branch: Lists all local branches in the repository. By default, this command shows the
current branch with an asterisk (*) next to it.
git checkout <branch_name>: Switches to the specified branch. This command is used
to navigate between branches or restore files in the working directory to their state in a
specific branch.
git merge <branch_name>: Merges changes from the specified branch into the current
branch. This command combines the changes made in the specified branch with the
changes in the current branch.
git remote -v: Lists the remote repositories associated with your local repository along
with their URLs. This command is useful for viewing the remote repositories you are
working with.
Sample
Initialize a Git Repository:
mkdir my-website
cd my-website
git init
Create Some Files:
Create some files for your website. For this example, let's create an index.html file:
<!-- index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My Website</title>
</head>
<body>
<h1>Hello, World!</h1>
</body>
</html>
Add and Commit Changes:
Add the index.html file to the staging area and commit it to the repository:
git add index.html
git commit -m "commit message"
Collaboration: Imagine your colleague wants to contribute to the project. They clone
the repository to their local machine:
git clone <repository_url>
Pull Changes (You): You pull the changes made by your colleague to your local
repository:
git pull
Review Changes:
You review the changes made by your colleague in the index.html file
Make Further Changes:
You decide to make further changes to the index.html file. You update the heading to
make it more prominent:
<!-- index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>My Website</title>
</head>
<body>
<h1>Welcome to Our Website!</h1> <!-- Updated heading -->
<p>Hello, World!</p>
<p>Welcome to our website!</p>
</body>
</html>
Add and Commit Changes (You): You add and commit your changes:
git add index.html
git commit -m "commit message"
Push Changes: Finally, you push your committed changes to the remote repository:
git push
Seven Basic Quality Tools
• On their own, the seven basic quality tools do not provide specific information to software
developers on how to improve the quality of their designs or implementation.
• Software development is a complex process that involves a high degree of creativity
and mental activity.
• It is extremely difficult, if not impossible, to define the process capability of software development
in statistical terms.
• However, good use of the seven basic tools can lead to positive long-term results for process
improvement and quality management in software development.
Check sheet
• A check sheet is a paper form with printed items to be checked.
• Its main purposes are to facilitate gathering data and to arrange data
while collecting it so the data can be easily used later.
• Another type of check sheet is the check-up confirmation sheet. It is
concerned mainly with the quality characteristics of a process or a
product. To distinguish this confirmation check sheet from the
ordinary data-gathering check sheet, we use the term checklist.
Classification check sheet: A trait such as a defect is classified into a category. If you
tracked only the total, you would know that you had 101 total defects. That is
somewhat useful, but on its own it gives little insight into which day is
the worst or which source of defects is in the worst shape. A classification
check sheet provides a visual overview of the problem areas.
Frequency check sheet: The presence or absence of a trait, or a combination of traits, is
indicated. The number of occurrences of a trait on a part can also be indicated. If
you tracked only the number of defects, you might not realize that Wrong Color has the
highest frequency of occurrence. Furthermore, if Wrong Color were not broken down
further, you might not realize that GREEN is giving you the most defects.
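A check sheet is essentially a tally, so it can be sketched in a few lines of Python. The observations below are invented for illustration (only the labels Wrong Color and GREEN come from the example above), and the function name tally is hypothetical.

```python
from collections import Counter

# Hypothetical observations, recorded as (defect type, detail) pairs as they occur.
observations = [
    ("Wrong Color", "GREEN"),
    ("Wrong Color", "GREEN"),
    ("Wrong Color", "RED"),
    ("Scratch", "front"),
    ("Wrong Color", "GREEN"),
    ("Dent", "side"),
]


def tally(records):
    """Return counts per defect type and per (type, detail) combination."""
    by_type = Counter(kind for kind, _detail in records)
    by_detail = Counter(records)
    return by_type, by_detail


if __name__ == "__main__":
    by_type, by_detail = tally(observations)
    for kind, count in by_type.most_common():
        print(f"{kind:12s} {'|' * count} ({count})")
    print("Most frequent detail:", by_detail.most_common(1)[0])
```

Counting by type alone shows which defect dominates; counting by type and detail shows which variant within that type is responsible.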
WHEN TO USE CHECK SHEET
• For collecting real-time data from production processes, for example a production data
recording check sheet organized part-wise, model-wise, machine-wise, and operator-wise.
• To help build a bar graph, histogram, or Pareto chart.
• To help make decisions at a glance about controlling product- and process-related
nonconformance.
Pareto diagram
• A Pareto Chart is a graph that indicates the frequency of defects, as well as
their cumulative impact. Pareto Charts are useful to find the defects to
prioritize in order to observe the greatest overall improvement.
• A Pareto Chart is a combination of a bar graph and a line graph
• By arranging the causes based on defect frequency, a Pareto diagram can
identify the few causes that account for the majority of defects.
• It indicates which problems should be solved first in eliminating defects
and improving the operation.
• Pareto analysis is commonly referred to as the 80–20 principle (20% of the
causes account for 80% of the defects), although the cause-defect
relationship is not always in an 80–20 distribution
1) A Pareto Chart is a combination of a bar graph and a line graph. Notice the presence of
both bars and a line on the Pareto Chart below.
2) Each bar usually represents a type of defect or problem. The height of the bar represents
any important unit of measure — often the frequency of occurrence or cost.
3) The bars are presented in descending order (from tallest to shortest). Therefore, you can
see which defects are more frequent at a glance.
4) The line represents the cumulative percentage of defects.
• For Collar Defects, the % of Total is simply (10/59)*100
• Cumulative percentages indicate what percentage of all defects can be removed if the
most important types of defects are solved.
• In the example above, solving just the two most important types of defects — Button
Defects and Pocket Defects – will remove 66% of all defects.
• In any Pareto Chart, for as long as the cumulative percentage line is steep, the types of
defects have a significant cumulative effect. Therefore, it is worth finding the cause of
these types of defects and solving them. When the cumulative percentage line starts to
flatten, the types of defects do not deserve as much attention since solving them will not
influence the outcome as much.
• A Pareto Chart is a quality tool: it helps analyze and prioritize issue resolution.
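The percentages behind a Pareto chart can be computed as in the sketch below. Only the Collar count (10) and the total (59) come from the example; the other counts and category names are hypothetical values chosen so that the two largest categories account for about 66% of defects, and pareto_table is a made-up function name.

```python
def pareto_table(counts):
    """Return rows of (name, count, percent of total, cumulative percent),
    sorted from the most frequent category to the least frequent."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for name, count in ordered:
        running += count
        rows.append((name, count, 100 * count / total, 100 * running / total))
    return rows


# Collar = 10 and total = 59 follow the example; the rest are assumed values.
defects = {"Collar": 10, "Button": 23, "Pocket": 16, "Cuff": 7, "Sleeve": 3}

if __name__ == "__main__":
    for name, count, pct, cumulative in pareto_table(defects):
        print(f"{name:8s} {count:3d} {pct:5.1f}% {cumulative:6.1f}%")
```

The bars of the chart correspond to the count column in descending order, and the line corresponds to the cumulative column.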
What is the Pareto Principle?
• The 80/20 rule is one of the most helpful concepts for life and time
management. Also known as the Pareto Principle, this rule suggests that 20
percent of your activities will account for 80 percent of your results.
• The Pareto Principle, also known as the 80/20 rule, can be used to analyze Pareto Charts.
• The Pareto Principle states that 80% of the results are determined by 20% of the
causes. Therefore, you should try to find the 20% of defect types that cause 80%
of all defects. While the 80/20 rule does not apply perfectly to the example
above, focusing on just 2 types of defects (Button and Pocket) has the potential
to remove the majority of all defects (66%).
Histogram
• Histogram is a type of Bar Chart that graphs the frequency of
occurrence of continuous data, and will aid you in analyzing your
data.
• A Histogram will group your data into Bins or Ranges while a bar chart
displays discrete data by categories. If your data is discrete or in
Categories, then you should use a Bar chart instead of a Histogram.
• The purpose of the histogram is to show the distribution
characteristics of a parameter such as overall shape, central tendency,
dispersion, and skewness.
• It enhances understanding of the parameter of interest.
Step 1 – Minimum Data Points
To accurately analyze a data set, it’s commonly recommended that you have at least 50 data points. Without an
adequate amount of data, you cannot draw reasonable conclusions about your data;
you may miss the pattern in the variation.
On the flip side, one of the strengths of the Histogram is that it allows you to easily analyze large
data sets, so don’t be shy about collecting or analyzing a lot of data.
Step 2 – Number of Bins
Now that you’ve collected an adequate amount of data, it’s time to calculate the number of Bars, sometimes called Bins
or Ranges, for your data set. The number of Bars for your Histogram will depend on the number of data points you
collected.
Selecting the correct number of Bins is important as it can drastically affect the appearance of your data, which might
lead you to the wrong conclusion.
Step 3 – Determine Bin Width
Once you’ve determined the number of Bins for your Histogram, it’s
time to calculate the Width or Range of each individual Bin.
To do that you take the entire Range of the data (Max data point
minus Min data point) and divide by the total number of Bins.
So for example, let’s say you’re creating a Histogram of students’
test scores on an exam, and the maximum score was 100 and the
minimum score was 20; then your Range is 80 (100 – 20).
Then you divide your data Range (80) by the total number of
Bins, let’s say 8 in this instance. So the Width of each Bin is 80 / 8 =
10.
Similar to selecting the right number of total Bins, it’s important that
you keep all the Bin widths the same or this will skew the
distribution of the data.
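The bin-width arithmetic from Step 3 can be sketched in Python as follows. The function names are made up for this example, the choice of 8 bins is taken from the text, and values are assumed to lie between the minimum and maximum.

```python
def bin_edges(minimum, maximum, bins):
    """Return the common bin width and the list of bin boundaries."""
    width = (maximum - minimum) / bins          # e.g. (100 - 20) / 8 = 10
    edges = [minimum + i * width for i in range(bins + 1)]
    return width, edges


def histogram(values, minimum, maximum, bins):
    """Count how many values fall into each equal-width bin."""
    width, edges = bin_edges(minimum, maximum, bins)
    counts = [0] * bins
    for value in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((value - minimum) / width), bins - 1)
        counts[index] += 1
    return edges, counts


if __name__ == "__main__":
    width, edges = bin_edges(20, 100, 8)
    print(width, edges)
```

Keeping every bin the same width, as the function does, avoids distorting the shape of the distribution.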
Scatter diagram
• A scatter diagram vividly portrays the relationship of two interval
variables.
• In a cause-effect relationship, the X-axis is for the independent
variable and the Y-axis for the dependent variable.
• Each point in a scatter diagram represents an observation of both the
dependent and independent variables.
• Scatter diagrams aid data-based decision making
(e.g., if action is planned on the X variable and some effect is
expected on the Y variable).
A variable that influences another variable is called an independent variable or control
parameter.
A scatter diagram can also be created for two variables even when there is no control
parameter; in this case either variable can be plotted on either axis.
Types of correlation
1. Positive: The pattern of observations slants from the lower left to the upper right of
the chart. As the value of the independent variable increases, the value of the
dependent variable also increases. For example, the productivity of a team member
increases with experience.
2. Negative: The pattern of observations slants from the upper left to the lower right of
the chart. As the value of the independent variable increases, the value of the dependent
variable decreases. For example, the number of farm workers in a country decreasing over the years.
3. Null: There is no correlation between the two variables, and the observations are scattered
across the chart. For example, there is no basis for expecting a correlation between the number
of vacation days granted to a team member and their height.
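One way to put a number on the three patterns is a correlation coefficient. The slides do not name a specific measure; the sketch below uses Pearson's r as one common choice, and the 0.3 cut-off used for labelling is an arbitrary assumption for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def correlation_type(r, threshold=0.3):
    """Label r as positive, negative, or null (threshold is an assumed cut-off)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "null"


if __name__ == "__main__":
    experience = [1, 2, 3, 4, 5]          # hypothetical years of experience
    productivity = [10, 14, 15, 19, 22]   # hypothetical output per week
    r = pearson_r(experience, productivity)
    print(round(r, 2), correlation_type(r))
```

A coefficient near +1 matches the lower-left to upper-right pattern, near -1 the upper-left to lower-right pattern, and near 0 the scattered pattern.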
Run chart
• A run chart is a line graph of data plotted over time. By collecting and
charting data over time, you can find trends or patterns in the process.
• Because they do not use control limits, run charts cannot tell you if a
process is stable. However, they can show you how the process is running.
• Control limits are horizontal lines drawn on a chart (adding them turns a run chart
into a control chart) to represent the upper and lower bounds
within which the process is expected to operate under normal conditions.
• These control limits are typically calculated statistically based on the data
collected from the process.
• One common method for calculating control limits is using the process
mean (average) and standard deviation. The control limits are often set at a
certain number of standard deviations above and below the mean, such as
±3 standard deviations for a control chart based on normal distribution.
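The mean ± 3 standard deviations method described above can be sketched as follows. This is a simplified illustration: the function names and the baseline data are hypothetical, and it uses the population standard deviation of a baseline sample, whereas real control charts often estimate the spread in other ways.

```python
from statistics import mean, pstdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower limit, center line, upper limit) from baseline data."""
    center = mean(baseline)
    spread = pstdev(baseline)   # population standard deviation of the baseline
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(samples, baseline, sigmas=3.0):
    """Return the samples that fall outside the control limits."""
    lower, _center, upper = control_limits(baseline, sigmas)
    return [x for x in samples if x < lower or x > upper]


if __name__ == "__main__":
    # Hypothetical daily bug counts from a period considered normal.
    baseline = [10, 12, 11, 9, 10, 11, 10, 9, 12, 10]
    lower, center, upper = control_limits(baseline)
    print(round(lower, 2), round(center, 2), round(upper, 2))
    print(out_of_control([11, 25, 8], baseline))
```

Points returned by out_of_control are candidates for the causal analysis that a control chart is meant to trigger.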
Example: Bug Fixing Process
Imagine a software development team working on a web application. As part of their QA
process, they track the number of bugs reported by users over time and use run charts to
monitor the effectiveness of their bug-fixing efforts.
Data Collection: The QA team collects data on the number of reported bugs each day from
their bug tracking system.
Run Chart Creation: Using this data, they create a run chart where the x-axis represents
time (days) and the y-axis represents the number of reported bugs.
Monitoring Trends: By regularly updating the run chart, the team can visually monitor the
trend in bug reports over time. They observe that initially, the number of reported bugs is
high but gradually decreases as the team fixes them.
Identifying Patterns and Shifts: The run chart helps the team identify patterns or shifts in
the bug reports. For example, they notice a sudden spike in bug reports after a new feature
release, indicating potential issues with the new functionality.
Assessing Process Stability: The team uses the run chart to get an informal view of how
their bug-fixing process is behaving. Because a run chart has no control limits, it cannot
confirm that the process is stable, but unexpected variations may indicate areas for improvement.
Process Improvement: Based on the insights gained from the run chart, the team can make
data-driven decisions to improve their bug-fixing process. For instance, they might allocate
more resources to address critical bugs or introduce code reviews to prevent future issues.
Continuous Monitoring: The team continues to update and analyze the run chart over time
to ensure ongoing improvement in software quality. They may also use additional metrics
alongside bug reports, such as customer satisfaction ratings or system performance metrics,
to gain a comprehensive understanding of software quality.
Limitations of Run Charts
● They don’t have any statistical control limits; they don’t show you the upper and lower
tolerance and threshold limits.
● Run charts cannot show you if the process is stable and in control.
Control chart
• A control chart can be regarded as an advanced form of a run chart for
situations where the process capability can be defined.
• A control chart is one of the seven basic quality tools and is a modified version
of the run chart. If you add control limits to a run chart, it becomes a
control chart.
• It consists of a central line, a pair of control limits (and sometimes a pair of
warning limits within the control limits), and values of the parameter of
interest plotted on the chart, which represent the state of a process.
• If all values of the parameter are within the control limits and show no
particular tendency, the process is regarded as being in a controlled state.
• If they fall outside the control limits or indicate a trend, the process is
considered out of control.
• Such cases call for causal analysis and corrective action.
Example: Defect Tracking and Resolution
Consider a software development team working on a mobile application. They use
control charts to monitor the number of defects reported during the testing phase and
to track the time taken to resolve these defects.
Data Collection: The QA team collects data on the number of defects reported during
each testing cycle, along with the time taken to resolve each defect. These data points
are collected over multiple iterations of the software development process.
Calculation of Control Limits: Based on historical data and statistical analysis, the team
calculates control limits for both the number of defects reported (defect count) and the
time taken to resolve defects (defect resolution time).
Plotting Data Points: Using the control limits, the team creates control charts for both
defect count and defect resolution time. They plot the data points for each testing cycle
on the respective control charts.
Interpretation of Control Charts:
Defect Count Control Chart: The team monitors the number of defects reported in
each testing cycle. If the number of defects falls within the control limits and shows
random variation around a central value, it indicates that the testing process is stable
and under control. However, if there are sudden spikes or trends exceeding the control
limits, it suggests the presence of special causes of variation, such as coding errors or
insufficient testing, which require investigation and corrective action.
Defect Resolution Time Control Chart: Similarly, the team tracks the time taken to
resolve defects in each testing cycle. Control limits are used to identify whether the
defect resolution process is stable or experiencing variations beyond acceptable
bounds. If the resolution time consistently exceeds the control limits, it may indicate
inefficiencies in the development or QA process, such as resource constraints or
inadequate prioritization of defects.
Usage and Limitations of Control Charts
● They are used to find and correct errors in an ongoing process.
● A control chart may signal a false special-cause variation, which wastes time and
resources.
● Although control charts are easy to understand, they require knowledge of mathematical
concepts like mean and standard deviation to draw the diagram.
Cause-and-effect (fishbone) diagram
A fishbone diagram’s causes and subcauses are usually grouped into six
main groups: measurements, materials, personnel, environment,
methods, and machines. These categories can help you identify the probable
source of your problem while keeping your diagram structured and orderly.
Common uses of the fishbone diagram are to:
- Identify potential causes of problems in new product design
- Prevent quality defects
- Identify potential factors that can cause a defect
- Identify the symptoms of a cause
Man: Are there any people-related causes of the problem?
Machine: What are the machine-related problems?
Method: What is wrong in the associated method that is giving rise to the problem?
Measurement: Is there any tool or standard error that needs rectification?
Material: What changes occurred in the properties of the material?
Environment: What were the environmental conditions (temperature, pressure, etc.)?
Checklist
• Checklists that summarize the key points of a process are much
more effective than lengthy process documents
• Several examples of checklists are
– design review checklist
– code inspection checklist
– moderator (for design review and code inspection) checklist
– pre-code-integration (into the system library) checklist
– entrance and exit criteria for system tests checklist
– product readiness checklist.
Next agenda
• SW testing