
New Metrics of Testing

Pass/Fail Tests Aren't Enough; You Must Analyze Metrics Throughout Production for a Successful App

White Paper
November 2012

Table of Contents
• Introduction
• Testing in Production
  - Continuous Testing & TiP
  - TiP Methods
• The Metrics
  - End Users
  - CPU
  - API
  - System Response
• Tips
  - API Issues
  - System Response Best Practices
• Conclusion
• About uTest

"The ultimate users of the program are not robots or automated test frameworks. They are live human beings with lots of different skill sets, backgrounds, geographic locations and expectations."
- Matt Evans
QA Director, Mozilla


Introduction
As mobile and desktop devices advance and more people weave technology seamlessly into their everyday lives, how applications function – and how they're produced and tested – need to change. We've entered a time when people expect more from their technology. They expect applications to function correctly, run smoothly and not bog down their devices from the moment they are released. If there is a bug, users expect it to be corrected quickly. This has led to near-constant releases and version updates, and to the need for deep metric analysis to ensure apps are running as efficiently as possible. Enter continuous development (which calls for continuous testing), Testing in Production and the rise of modern application metrics.

Adopting a mindset that embraces continuous testing and Testing in Production will help teams perform traditional testing while also providing them with the vital metrics all developers need to pay attention to.

In this whitepaper we’ll cover which metrics are most important to modern
development and how continuous testing and Testing in Production can help you
analyze and act on that data.

Testing in Production
Testing in Production (TiP) is a testing method that advocates releasing your product to
the public while having developers on hand to monitor and immediately fix any bugs.
Bear in mind that this method will not work for applications that have a vetting process
(such as iOS mobile apps). It may seem like a risky option, but if done correctly, TiP can
be extremely useful. But first, you need to be sure that your application is thoroughly
tested before being released into production. That’s where continuous testing comes in.

Continuous Testing & TiP


We've reached a point of technology saturation where companies are continuously releasing new versions of their applications. Whether you are using the continuous release method of Agile development or are working to complete a new version in its entirety before release, employing continuous testing will help ensure your application is as bug free as possible when it hits end users. Integrating testing throughout the development process gives teams more time to find and fix bugs and helps prevent bugs from affecting later code. Remember, testers excel at spotting problems. Including them early in the development process can prevent hassle down the line. Plus, doing the heavy lifting of testing up front frees teams up for later metric analysis and the changes it will dictate.


Testing in Production (TiP) is a natural extension of continuous testing. By the time you
near launch, your product should be largely bug free from the continuous testing. This is
what makes TiP feasible. Though Testing in Production involves releasing your
application to the public, keep in mind that this can be accomplished through a limited
release – either to a subset of users or during a lull in activity. This gives you the same
end user insight while limiting exposure of a potentially faulty product. Since the product
is in active use, development teams have the ability not only to find bugs that eluded
them in the lab, but also see if a bug fix works almost immediately. This last line of
testing defense, which is intended to find fringe use case or in-the-wild bugs that didn’t
appear in traditional testing, helps developers find and fix issues before they cause any
major impact.
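
One common way to implement such a limited release is a percentage-based gate. Here is a minimal sketch under that assumption; the rollout fraction, feature name and hashing scheme are illustrative, not taken from any specific TiP tool.

```python
import hashlib

def in_rollout(user_id: str, feature: str, fraction: float) -> bool:
    """Expose a feature to a stable fraction of users.

    Hashing keeps the same user in (or out of) the rollout across
    sessions, so bug reports can be tied to the limited-release group.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    return (int(digest, 16) % 10_000) / 10_000 < fraction

# Example (hypothetical feature name): ship to 5% of users first.
print(in_rollout("user-42", "new-sync-engine", 0.05))
```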

Another advantage of TiP is that it's a great way to gather data from real-life conditions. When testing in production, you see how an application works for a real user, with a real everyday device, under real conditions. The only way to replicate this kind of insight without releasing your application to the public is to test in-the-wild with a crowdsourced testing company. This window into real-life use doesn't only flush out bugs missed in the lab, it also provides teams with the modern metrics they need to analyze. By watching how an application functions for real end users, teams are able to gather and act on real metrics that can be used to optimize applications before they reach a larger crowd.

"It's no secret that products like Bing, Amazon, Google and Facebook launch experiments all the time, exposing code and features to a small number of users. This is a powerful way to get information on your product in real-world conditions that simply cannot be reproduced in a test lab."
- Seth Eliot
Senior Test Engineer, Test Excellence, Microsoft

TiP Methods
So how do you Test in Production? Do you just release your application in the middle of
the night and watch what happens? Not quite. Here are a few methods (identified by
Seth Eliot, a Senior Test Engineer of Testing Excellence at Microsoft) that will give you
an idea of what you should look for and what you can achieve using TiP.

• Data Mining: Data mining allows you to track real usage patterns and tease out defects. It also lends itself to optimization, since you can look at statistics from before and after changes. Another option is to collect the data in real time for team members to analyze later. This can influence future changes.
• User Performance Testing: This issue will come up again when we discuss which metrics are most important. As far as testing goes, use TiP to get an idea of how your app performs across the hardware/software matrix. As with data mining, this gives you access to real-life results from a user's perspective.
• Environment Validation: You can run environment validation during the initial launch or collect data continuously. This category involves traditional pass/fail tests that look for version compatibility, connection health, installation success and other important factors across the user matrix.


• Experimentation for Design: This is your typical A/B test. Divide your users into two (or more) groups and give each a different design to see which one users respond to best (a minimal bucketing sketch follows this list).
• Load Testing in Production: Release your application into TiP and add synthetic load on top of the real users. You'll be able to see exactly what happens if something goes wrong when your app is hit with heavy traffic – before you disappoint a tidal wave of visitors. Load testing this way can help you identify issues that may not appear in traditional automated load testing – such as images not loading properly.
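
To make the experimentation method concrete, here is a minimal sketch of deterministic variant assignment, assuming a hash-based split; the function name, experiment name and two-variant setup are illustrative, not part of any specific experimentation framework.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user into an experiment variant.

    Hashing the user ID together with the experiment name means a given
    user always sees the same design, and different experiments split
    the user base independently of one another.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Example: route a user to one of two designs, then log which variant
# they saw alongside the behavior metrics you care about.
print(assign_variant("user-42", "checkout-redesign"))
```

A deterministic split like this is preferable to random assignment at request time: a user who sees a different design on every visit would contaminate the comparison.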

Remember, ideally these tests shouldn’t have adverse effects on users. Most major
issues should have been caught already. Many of these tests can be performed in ways
that don’t involve releasing your product to your real-life audience. With that being said,
it’s important to remember that your real-life audience will be the ones who actually use
your application in the end. So do both. Use traditional testing methods initially, then use
TiP as a sanity check. You will get the most accurate and useful information by testing
with end users – actionable metrics you can’t get from a controlled environment.

The Metrics
Much of the testing you've likely done to date consists of a simple pass or fail assessment. While this type of testing is important, pass/fail tests alone aren't enough to make your app succeed. Performance-based metrics, especially when gleaned from real-life devices, are particularly important these days. Users will quickly abandon your application if it is slow to load, takes up too much memory or data, or doesn't interact properly with other aspects of their device. An end user doesn't care how many test cases passed or failed; those are useless metrics in the long run. Instead, metrics like CPU usage, API performance and system response time should all be considered while testing. These are the things your users are concerned about.

"If the team really needs to plot the history of pass/fail, the team is living in the past."
- Jason Arbon
Former Google Engineer

As technology continues to become more pervasive, everyday users will get savvier and more comfortable with the "tech" part of technology. There are already a number of consumer-facing applications that help users measure their connection speeds, data usage and a variety of other information that used to be strictly in the realm of developers and testers. With the glut of big data flooding in, it is helpful to focus specifically on the metrics users themselves can access. These will be the ones they are paying attention to and the ones that will be influencing their usage habits. In many cases, ignoring these metrics can cost you users – and ultimately revenue.

End Users
What's the point of putting out an application if no one uses it? Software testing should be extremely end-user focused. After all, end users are the ones who will ultimately be making you money by using your application. Understanding your end user can help contain testing costs, and looking at metrics from an end user's point of view will help your app succeed. Users will dump your app if it's a data hog or takes too long to respond to their actions. Always test with your end users in mind.

Don't waste time testing on devices your target audience likely won't be using. Likewise, don't only test one device, operating system, browser, platform, etc. Your users will be using a multitude of hardware/software combinations, and bugs are guaranteed to slip through if you don't test on as many of them as possible. Identify the most common combinations within your target demographic and begin your testing with those. This is another instance when crowdsourced testing or Testing in Production comes in handy. Testing with your actual users is a way to be sure you've covered the most important matrix considerations.

CPU
Using too much processing power is one of the biggest reasons people abandon an application, so you absolutely must measure it. Slow response time is an indicator to users that something is bogging down their system, and there is a selection of apps across the different markets that let users see how much power each application is using. If your application is flagged as the biggest abuser, it's likely that users will look for a better-tuned replacement, particularly if your app isn't absolutely necessary.

Another important factor to remember – and another reason to monitor CPU usage – is that not all devices have the same processing capabilities. By not monitoring CPU usage across a range of devices, you may miss major device-specific issues. It is particularly important to monitor and test CPU usage as you release new versions or as popular new handsets hit the market. Do not assume that because your application was working fine at one point, it will always be fine.
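
As a sketch of what this kind of monitoring can look like in practice, here is a minimal process-level CPU sampler using the third-party psutil library; the sample count and interval are assumptions for the example.

```python
import time
import psutil  # third-party: pip install psutil

def sample_cpu(pid: int, samples: int = 10, interval: float = 1.0) -> list:
    """Record a process's CPU usage over time.

    Returns one CPU percentage per interval for the given process ID,
    so readings can be compared across devices or app versions.
    """
    proc = psutil.Process(pid)
    proc.cpu_percent(None)  # prime the counter; the first call returns 0.0
    readings = []
    for _ in range(samples):
        time.sleep(interval)
        readings.append(proc.cpu_percent(None))
    return readings
```

Collecting the same samples on low-end and high-end hardware, and again after each release, is what surfaces the device-specific regressions described above.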

API
API requests can also affect an application's response time. If not properly integrated, APIs can slow down an application or fail to complete a desired action altogether. Whether you create a custom API or use an open source version, be sure to test the functionality and security of the API itself in addition to testing how it works on devices.

APIs are becoming increasingly common in the world of mobile and web, but unlike
CPUs, API technology is less understood by the common user. This should be an even
bigger incentive to carefully vet and monitor any API requests integrated into your
application. Users will recognize that something is wrong, but they won’t be able to
pinpoint the problem, which will lead to general frustration and anger.


Once you move out of initial testing and into a production environment, continue to
monitor your APIs to ensure outside factors don’t adversely affect the application’s
quality or the effectiveness of the API’s service.

System Response
Testing CPU usage and APIs isn’t enough to ensure good system response time – there
is still a lot more to look at specifically related to system performance.

Measuring system response time has its own set of sub-metrics. Nexcess, a web hosting company, highlights these specific measurements:
• Payload: The total size in bytes sent to the test application, including all resource files.
• Bandwidth: The minimum bandwidth, in bits per second, across all network links from client to server.
• AppTurns: The number of components (images, scripts, CSS, etc.) needed for the page to render.
• Round-Trip Time: The amount of time in milliseconds it takes to communicate from client to server.
• Concurrency: The number of simultaneous requests an application will make for resources.
• Server Compute Time: The time it takes for the server to parse the request, run application code, fetch data and compose a response.
• Client Compute Time: The time it takes for the application to render client-facing features (e.g. HTML, scripts and stylesheets for web apps).

When testing system response, be sure to look at how long an action takes to complete from start to finish – from the second a user hits a button to the second a page finishes loading. It is imperative that you monitor system response under real-world conditions. Tests that only look at server-side response won't account for real-use factors that can drastically increase the time an action takes to complete in the user's eyes. Some information may not return fully, information cached in browsers could have trouble loading, and some data centers may have slower response times than others; a variety of things can go wrong that you won't know about unless you are monitoring real users in real situations.

"Billions of dollars in revenue are lost each year by websites that are unable to measure and control client perceived response time. Clients get frustrated and leave websites before completing a transaction, often never to return."
- Software Systems Laboratory, Columbia University
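
As an illustration of client-perceived timing, here is a minimal sketch that measures one request from the user's side using the third-party requests library. The URL is hypothetical, and per-phase numbers such as server compute time would need server-side support (e.g. a Server-Timing response header), which is not shown.

```python
import time
import requests  # third-party: pip install requests

def measure_request(url: str) -> dict:
    """Time one request from the client's point of view.

    round_trip_ms covers DNS, connection setup, server work and transfer
    together, i.e. the delay as the user actually experiences it, while
    payload_bytes records the size of the response body.
    """
    start = time.perf_counter()
    response = requests.get(url, timeout=10)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {
        "round_trip_ms": elapsed_ms,
        "payload_bytes": len(response.content),
        "status": response.status_code,
    }

print(measure_request("https://example.com/"))  # hypothetical endpoint
```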


Tips
Now you know what to monitor once your application is released to the public – but what if you turn up a problem? Here are a few recommendations to help you process the data you're collecting and troubleshoot issues as they arise.

API Issues
The best way to avoid API issues is to reduce API request complexity. Combine as many queries as possible into a single request instead of sending many individual requests. Similarly, if you are working with an application that has caching capabilities, figure out which items can be cached to cut down on the amount of data that needs to be retrieved every time your app is used.
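
As a sketch of both recommendations, assuming a hypothetical HTTP API and using the third-party requests library: batching folds several lookups into one request, and a small cache keeps repeat answers off the network.

```python
from functools import lru_cache

import requests  # third-party: pip install requests

API = "https://api.example.com"  # hypothetical endpoint

def fetch_users(user_ids) -> dict:
    """Batch several user lookups into one request instead of N round trips."""
    ids = ",".join(sorted(set(user_ids)))
    return requests.get(f"{API}/users", params={"ids": ids}, timeout=10).json()

@lru_cache(maxsize=1024)
def fetch_config(key: str) -> str:
    """Cache rarely-changing data so repeated calls skip the network entirely."""
    return requests.get(f"{API}/config/{key}", timeout=10).text
```

Whether a given endpoint accepts batched IDs, and how long cached entries stay valid, depends entirely on the API in question; treat both as assumptions to verify.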

It's also extremely important to remember that APIs can be affected by platform version, particularly within the Android ecosystem. Each version of the Android operating system supports only specific API levels. Identify the most popular devices within your target demographic and see which platform versions those devices support – it is often not the most recent platform release. Tailor your API integration to work with the dominant platform version (and remember to update as new versions disseminate).

System Response Best Practices


How you measure system response time can drastically affect the data. Here are a few
tips for collecting system response data that will most closely resemble what your users
are experiencing.

• Work with percentiles, not averages. Taking a broad measurement and finding the average response speed will not give you an accurate portrayal. That practice obscures the top and bottom speeds (important information) and doesn't give you any idea of how many users are experiencing which speeds. Measuring response speed and dividing the data into percentile designations will give you a clearer picture of the dominant response speeds (see the sketch after this list). If the biggest percentile has slow response times, you have an issue.
• Not all performance data is the same. Don't lump initial load time in with response time for logging in – they are two separate actions and should be analyzed as such. One action may be slower than others (especially if an action relies on API calls), but if you look at all response time data together you will not know which action needs addressing.
• Cache what you can. As with API requests, caching whatever data you can will reduce response time.
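
Here is a minimal sketch of the percentile approach, using only Python's standard library; the sample numbers are invented for illustration, and the two actions are kept separate as recommended above.

```python
from statistics import quantiles

def latency_percentiles(samples_ms):
    """Summarize response times by percentile rather than a single mean."""
    cuts = quantiles(samples_ms, n=100)  # 99 cut points: p1 .. p99
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}

# Invented data: one slow outlier per action, which an average would hide.
initial_load_ms = [212, 230, 198, 1250, 241, 205, 219, 233, 227, 244]
login_ms = [95, 102, 88, 640, 99, 101, 93, 97, 105, 92]

for name, data in [("initial load", initial_load_ms), ("login", login_ms)]:
    print(name, latency_percentiles(data))
```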

Conclusion
Understanding that these metrics are important to your users – and often accessible to them – is a vital part of modern application development and testing. You cannot push aside data like CPU usage, API response and system response time to deal with another day.


Users will notice an issue and know that it isn’t a price they have to pay for technology.
They can and will go elsewhere.

But don't be overwhelmed by the flood of big data. Take advantage of testing methods such as continuous testing and Testing in Production to help you not only find and fix bugs, but also see your application from an end user's perspective. These practices will only become more important as people grow more and more involved with their everyday technology.

For more information …


About uTest
uTest provides in-the-wild testing services that span the entire software development lifecycle –
including functional, security, load, localization and usability testing. The company’s community
of 80,000+ professional testers from 190 countries put web, mobile and desktop applications
through their paces by testing on real devices under real-world conditions.

Thousands of companies – from startups to industry-leading brands – rely on uTest as a critical component of their testing processes for fast, reliable and cost-effective testing results.

More info is available at www.utest.com or blog.utest.com, or you can watch a brief online
demo at www.utest.com/demo.

uTest, Inc.
153 Cordaville Road
Southborough, MA 01772
p: 1.800.445.3914
e: [email protected]
w: www.utest.com
