Load Testing Overview
Demandware Services
Damian Goshey
5/2/2016
This document introduces load testing concepts to engineers and testers who have minimal background in
load testing.
Table of Contents
Load Testing Basics
  Overview
  Terminology
  About Load Testing
  Load Test Tool
Standardized Load Test
Load Test Types
  Baseline
  Comparison
  User Experience
  Capacity
  Longevity
  Latency
Load Test Best Practices
  Developing Realistic Load
  Scripting
  Test Execution
  Iterative Testing
  Results Analysis
Load Testing Basics
Overview
This document introduces load testing basic concepts with a focus on commerce-based websites. You
can apply this information to load testing on the Demandware Platform. This document is a primer; it
does not provide certification for load testing.
Terminology
Load Test
Load Testing is often referred to as performance testing or performance load testing. In this document,
these terms are equivalent and refer to a test that evaluates the performance of a customized storefront
under load by simulating user activity.
Load Test Partner or Entity
A service organization or expert that is contracted to perform load testing can be referred to as a load
test partner or load test entity. That team is responsible for delivering the load test.
Storefront
A storefront is a web site that provides commerce services for a user.
Custom Code
Any code that enables a commerce site to perform, render, and deliver content to a storefront site.
Load Test Tool
Demandware does not recommend a specific load testing tool to load test a storefront on the
Demandware platform. Your selected tool can be industry standard or open source. It should meet
some basic requirements:
• Highly scalable – Can leverage load generation agents from a public or private cloud. Can
execute from various geographic locations.
• Realistic load profiles – Easily creates load distribution profiles.
• Transaction-based dynamic load adjustment – Can run a load test at a specified hourly
transaction rate.
• Robust scripting language – Uses a non-proprietary, object-oriented programming language.
• Comprehensive reports – Flexible, easy-to-use reporting of test results. Reports should easily
and quickly compare runs with configurable content and layout.
Standardized Load Test
There are many types of load testing. The type used is typically based on the partner’s testing
experience and the load test objectives. To ensure the most effective outcome, it is important to follow
a standard template for any load test that is run on the Demandware platform. It should contain the
following phases.
Execution
Run, adjust and monitor the load test. Incrementally increase volume per run to uncover any issues before
moving to the target volume runs.
1. Load test execution – 25% target load.
2. Adjust scripts as needed.
3. Load test execution – 50% target load.
4. Load test execution – 100% target load.
5. Monitor test environment during load test run(s).
Analysis
Integrate the load test results, test environment metrics, and profiler data into a cohesive load test results
report. This phase has multiple owners.
1. Review load test results report.
2. Review realm instance platform metrics.
3. Review any profiler data.
4. Generate data into results report for target audience.
Resolution
Address identified areas of custom code where there may be poor performance and engage appropriate
resources to resolve problems.
1. Engage development team to address any issues.
2. Apply code optimizations as identified.
3. Validate code optimizations with additional load test runs.
4. Sign off on success criteria established in scope phase.
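The incremental runs of the Execution phase can be sketched in a few lines of Python. This is only an illustration: the function name `ramp_schedule` and the target of 12,000 transactions per hour are assumptions made for the example, and only the 25%, 50%, and 100% steps come from the phase description.

```python
def ramp_schedule(target_tph, steps=(0.25, 0.50, 1.00)):
    """Return the transactions/hour to request for each incremental run."""
    return [round(target_tph * fraction) for fraction in steps]


# Hypothetical target volume; replace with the volume agreed for the project.
for run_number, volume in enumerate(ramp_schedule(12_000), start=1):
    print(f"Run {run_number}: {volume} transactions/hour")
```

Adjusting scripts between runs, as the phase calls for, stays a manual step; the sketch only fixes the volume requested for each run.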
Load Test Types
Baseline
This is the most common type of performance load test. It establishes a baseline of the site’s response times
and behavior under load while the site is in a known state. Any later change to the storefront
configuration or code can then be compared to the baseline. A known state is a state that the tester
can restore (typically by resetting a site or database).
This test helps to:
• Determine the direct impact of custom code changes on storefront performance.
• Evaluate the impact of platform configuration changes on storefront performance.
• Define the maximum acceptable load.
Comparison
This type of load test compares a site or application in a known state to that same site or application with
code changes. It uses the established baseline to determine how performance after a change in site
configuration or code compares with performance at the platform’s known state.
This test helps to:
• Determine the direct impact of custom code changes on storefront performance.
• Evaluate the impact of platform configuration changes on storefront performance.
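A comparison against the baseline can be reduced to a percentage change per action. The sketch below is a minimal illustration in Python; the action names, the response times, and the 5% tolerance are all made-up assumptions, not values from a real test.

```python
def percent_change(baseline_ms, candidate_ms):
    """Percentage change of a candidate response time relative to the baseline."""
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0


def compare_to_baseline(baseline, candidate, tolerance_pct=5.0):
    """Map each action to (percent change, True if slower than the tolerance)."""
    report = {}
    for action, baseline_ms in baseline.items():
        change = percent_change(baseline_ms, candidate[action])
        report[action] = (change, change > tolerance_pct)
    return report


# Hypothetical average response times in milliseconds.
baseline_run = {"search": 420.0, "add_to_cart": 380.0}
changed_run = {"search": 462.0, "add_to_cart": 361.0}

for action, (change, regressed) in compare_to_baseline(baseline_run, changed_run).items():
    print(f"{action}: {change:+.1f}% {'REGRESSION' if regressed else 'ok'}")
```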
User Experience
This load test helps you understand the average response time a user experiences under expected normal
production traffic for a defined load profile (set of user scenarios) when a parallel spike or short burst of high
volume occurs. User response times before, during, and after the event can be
reviewed separately to understand the behavior and trends that the event may cause.
This test helps to:
• Determine how a flash sale could impact normal user experience.
• Determine how a special sale event could impact normal user experience.
Capacity
This load test measures the ability of a site to support a large user load. Use it to determine the load at
which storefront response is no longer acceptable. It also helps determine the volume of traffic that a
customized storefront can handle with given underlying site resources.
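One way to read the results of a capacity test is to step the load upward and record the highest load at which the response was still acceptable. The Python sketch below assumes each run is summarized as a (load, response time) pair; the function name, the sample figures, and the 1,000 ms threshold are hypothetical.

```python
def capacity_limit(step_results, max_acceptable_ms):
    """Return the highest tested load whose response time is still acceptable.

    step_results is a list of (transactions_per_hour, response_ms) pairs.
    Returns None if even the lowest tested load is unacceptable.
    """
    highest_acceptable = None
    for load, response_ms in sorted(step_results):
        if response_ms > max_acceptable_ms:
            break  # response is no longer acceptable from this load upward
        highest_acceptable = load
    return highest_acceptable


# Hypothetical stepped runs: (transactions/hour, 90th percentile response in ms).
runs = [(2000, 350), (4000, 410), (6000, 620), (8000, 1900)]
print(capacity_limit(runs, max_acceptable_ms=1000))
```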
Longevity
This type of load test, also called a reliability test, simulates sustained load over 24 to 48 consecutive
hours during which transaction volume varies. The purpose of this test is to evaluate both the storefront
custom code under test and the user experience throughout the run.
This test helps to:
• Identify memory leaks (and similar issues) caused by custom storefront code over time.
• Evaluate the ability of third-party services to support wave load over time.
Latency
This type of load test simulates users at a low connection speed. Knowing where problems arise for low
bandwidth users enables you to identify areas of the storefront where optimization might improve the
low bandwidth user experience. Latency tests can also identify areas for improvement for mobile device
users.
Load Test Best Practices
Developing Realistic Load
Determine Target Volume
The most important information needed to set up a realistic load test is the target load. Derive this
number using either real data or business projections from sources who understand:
• The intended driver (marketing, sales event).
• Operational traffic expectations from this driver.
Measurement Units
You can specify the load test volume in either Number of Virtual Users or Transactions/Hour.
However, using Transactions/Hour makes it much easier to test to a specific load in a given time
period. A Virtual Users count can produce higher or lower traffic volume depending on
factors that are often dynamic during a load test run, such as server response time.
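The arithmetic behind this can be sketched as follows, assuming each virtual user runs one transaction after another with no idle gap between them. The function name and the figures are illustrative assumptions.

```python
def hourly_rate(virtual_users, seconds_per_transaction):
    """Transactions/hour produced by a fixed number of virtual users.

    seconds_per_transaction is the full time one user needs for one
    transaction, including response times and think times.
    """
    return virtual_users * 3600 / seconds_per_transaction


# Hypothetical run with 500 virtual users: the volume drops by a third
# when each transaction slows from 60 to 90 seconds.
print(hourly_rate(500, 60))
print(hourly_rate(500, 90))
```

With a fixed user count, any slowdown in the storefront lowers the volume actually delivered, which is why a Transactions/Hour target is easier to test against.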
Scenario Definition
Load test scenarios represent detailed actions that are scripted. They make up the set of actions that
are available to help model traffic under load.
Consider these standard commerce actions when defining scenarios for the load test:
• Search and browse products
• Add to cart
• Checkout
• Place order
Separate special business cases into unique scenarios. For example:
• Promotions purchases
• Gift certificate redemption
• Custom campaign cases
The scripted scenarios for your load test project may be very different from these standard and special
scenarios.
Load Distribution Profile
A realistic target load should be distributed over different actions. Load distribution exercises custom
code in a way similar to production usage patterns, which is important for identifying the performance hot spots
that are most likely to impact the storefront during high-traffic events.
Base load distribution on analytics data available through conventional site administration tools, for
example, Demandware Business Manager. Distribution information can also come from business
analysts familiar with business use cases and traffic patterns. Load distribution can vary widely based
on the event being simulated and the goal of the load test. Here is an example load distribution for
some common commerce scenarios.
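As a rough sketch of how a profile turns a target volume into per-scenario volumes, the Python snippet below splits an hourly target across scenarios. The percentages are made-up assumptions for illustration only, not Demandware figures, and the function name is hypothetical.

```python
def distribute_load(target_tph, profile_pct):
    """Split a transactions/hour target across scenarios by whole percentages."""
    if sum(profile_pct.values()) != 100:
        raise ValueError("profile percentages must add up to 100")
    # Integer division keeps whole transactions; any remainder is dropped.
    return {scenario: target_tph * pct // 100 for scenario, pct in profile_pct.items()}


# Illustrative percentages only; derive real ones from analytics data.
EXAMPLE_PROFILE = {
    "search_and_browse": 60,
    "add_to_cart": 25,
    "checkout": 10,
    "place_order": 5,
}

for scenario, volume in distribute_load(10_000, EXAMPLE_PROFILE).items():
    print(f"{scenario}: {volume} transactions/hour")
```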
Scripting
Good Coding Standards
• Have script design reviewed before beginning coding.
• Develop modular and reusable script code.
• Develop easily extensible script code.
• Include comments to document your script code.
• Organize scripts by user scenario.
• Name tests and test measurements to reflect user scenario actions.
• Use think times in all scripts to mimic actual user activity.
Determining Think Times
The project manager (or the person who is closest to the Scoping and Planning load test phases) can
help to determine the think times to apply to specific script actions. These can be specified at an action
category level. For example:
• 15 – 25 second think time for the Details Viewing Action
• 10 – 15 second think time for the Search Results Viewing Action
• 2 – 5 second think time for the General Page Wait Action
Random vs. Specific Think Times
To realistically simulate user actions, think times should never be constant. They should be randomized
each time they are called within a script. Most load testing tools can randomize think times for both
actions (individual script steps) and transactions (multiple script actions).
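Most tools have their own randomization settings, but the idea can be sketched in plain Python. The ranges below are the example ranges given for the action categories above; the dictionary keys and the function name are hypothetical.

```python
import random

# Think time ranges in seconds, per action category.
THINK_TIME_RANGES = {
    "details_view": (15.0, 25.0),
    "search_results_view": (10.0, 15.0),
    "general_page_wait": (2.0, 5.0),
}


def think_time(category, rng=random):
    """Draw a fresh random think time (seconds) each time it is called."""
    low, high = THINK_TIME_RANGES[category]
    return rng.uniform(low, high)


# Each call yields a different pause, so no two iterations wait identically.
print(think_time("details_view"))
print(think_time("details_view"))
```

A script would pass the returned value to whatever wait call the load test tool provides.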
Script Testing
Application behavior under load cannot be predicted in most cases. Therefore, it is important to have a
high level of confidence in the script functionality. This starts with a comprehensive knowledge of and
high comfort level with the load test tool. In addition, use levels of testing (ranging from single user
debug runs to low volume load profile runs).
• Development testing – run each script as a single user at frequent checkpoints using the IDE’s debug
function.
• Dry run testing – test each scenario script separately at low volume, and then as part of a load profile
(multiple scenarios simultaneously) at low volume, to ensure the virtual user load does not
encounter unexpected errors.
• Volume testing – run a few moderate volume load tests to evaluate load generator agents and
validate the test environment.
Test Execution
Where to Run the Test
If possible, run the load test in the same production environment that will support the storefront when it
is live. If this is not an option, it is important that the test environment resources and configuration
match the exact configuration that the storefront will be running on in production.
Dynamic Model (Transaction Rate)
A dynamic load is determined by the maximum transaction rate, which is typically applied over one
hour. The tool uses any user load specified as an upper limit only. The tool dynamically adjusts the user
load based on the execution time per iteration of each user (transaction time). If the server slows down,
the transaction time increases, so the tool adds users to hold the target transaction rate. In some scenarios where
the server is already overloaded, this loads the system faster and faster and makes the overload worse.
This model allows for testing to a specific expected target load, which is derived either from historic
peak event data or from business projections (as in the case of a new storefront).
The dynamic model is recommended for most load testing.
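The adjustment can be approximated with a short calculation. This sketch assumes each user runs transactions back to back; real tools may use a different control loop, and the function name and figures are hypothetical.

```python
import math


def users_for_rate(target_tph, transaction_seconds, user_limit):
    """Users needed to hold target_tph, capped at the configured upper limit."""
    needed = math.ceil(target_tph * transaction_seconds / 3600)
    return min(needed, user_limit)


# Hypothetical run: 6,000 transactions/hour with at most 400 virtual users.
# As the transaction time grows, more users are needed until the cap is hit.
for seconds in (60, 120, 300):
    print(f"{seconds} s per transaction -> {users_for_rate(6_000, seconds, user_limit=400)} users")
```

Once the cap is reached, the target rate can no longer be held, which is a useful signal in itself when reviewing a run.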
Results Analysis
Results Confidence
Results of load tests may vary from one test run to another, bringing the validity of those results into
question. You can increase confidence in test results by following some standard practices and
preparing results data before reporting it.
Allow Time for Warm-Up
Applications initially run slower than usual. During the warm-up period, an application caches pages
and initializes databases, after which the application runs in what is called a steady state. Begin
measuring load test results only after the application has reached a steady state.
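If the tool does not exclude the warm-up period itself, the measurements can be filtered afterwards. The Python sketch below assumes samples are (elapsed seconds, response time) pairs and that the warm-up length is already known; both the data layout and the 600-second cutoff are assumptions.

```python
def steady_state_samples(samples, warmup_seconds):
    """Keep only samples recorded at or after the end of the warm-up period.

    samples is a list of (elapsed_seconds, response_ms) pairs.
    """
    return [sample for sample in samples if sample[0] >= warmup_seconds]


# Hypothetical measurements: the first two fall inside a 600-second warm-up.
measurements = [(30, 2100.0), (300, 900.0), (600, 410.0), (1200, 395.0)]
print(steady_state_samples(measurements, warmup_seconds=600))
```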
Understand the Test Sample Size
Test results data commonly have outliers. Consequently, your test sample size should be big enough
that those outliers are merely noise that does not skew the results. You can increase the size of your sample
by running the test for a longer time, or by doing multiple test runs and averaging the results of those
runs. You should have a minimum of one hour of measurement data.
Validate Results Consistency
Start by gauging the spread of the test results using the standard deviation. What percentage of the median does the
standard deviation represent? Next, compute the 80th and 90th percentiles, which are the values at or below which
80% and 90% of the measurements fall, and use those numbers.
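These checks can be computed with Python's standard `statistics` module. The sketch below is one possible reading of the checks; the function name and the sample data are hypothetical, and the "inclusive" percentile method is a choice made for the example (other methods give slightly different values).

```python
import statistics


def consistency_summary(response_ms):
    """Summarize spread and upper percentiles of response times (ms)."""
    median = statistics.median(response_ms)
    stdev = statistics.stdev(response_ms)
    # quantiles(n=10) returns the 9 decile cut points: index 7 is the
    # 80th percentile and index 8 is the 90th percentile.
    deciles = statistics.quantiles(response_ms, n=10, method="inclusive")
    return {
        "median": median,
        "stdev_pct_of_median": stdev / median * 100.0,
        "p80": deciles[7],
        "p90": deciles[8],
    }


# Hypothetical steady-state response times in milliseconds.
sample = [380, 395, 400, 402, 410, 415, 420, 433, 450, 470, 900]
for name, value in consistency_summary(sample).items():
    print(f"{name}: {value:.1f}")
```

A standard deviation that is a large percentage of the median suggests the results are not yet consistent enough to report.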
Investigate Results Trends
Look at the results measurements graphed through the entire test run. What does the trend of the
measurements graph show? Optimally, it should be nearly flat once the application reaches its steady state.
Investigate spikes and upward trends.
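Besides reading the graph, the trend can be quantified as the slope of a least-squares line through the measurements. This is a minimal Python sketch under the assumption that samples are (elapsed seconds, response time) pairs; the function name and data are hypothetical, and a slope near zero corresponds to the flat graph described above.

```python
def trend_slope(samples):
    """Least-squares slope of response time over elapsed time (ms per second).

    samples is a list of (elapsed_seconds, response_ms) pairs with at least
    two distinct elapsed times.
    """
    count = len(samples)
    mean_time = sum(t for t, _ in samples) / count
    mean_response = sum(r for _, r in samples) / count
    covariance = sum((t - mean_time) * (r - mean_response) for t, r in samples)
    variance = sum((t - mean_time) ** 2 for t, _ in samples)
    return covariance / variance


# Hypothetical steady-state samples: one flat run and one that degrades.
flat_run = [(0, 400.0), (600, 400.0), (1200, 400.0)]
rising_run = [(0, 100.0), (60, 160.0), (120, 220.0)]
print(trend_slope(flat_run), trend_slope(rising_run))
```

The slope does not reveal isolated spikes, so it complements the graph rather than replacing it.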
Results Interpretation
The load test engineer (or a load testing expert) analyzes the results and communicates them to the team.
Some small spikes and trends may be expected or acceptable within any load test.
This insight helps keep the focus on the problems that need to be addressed.