Scalability Security: Functional Vs Non-Functional Testing
Functional testing refers to tests that verify a specific action or function of the code. These are
usually found in the code requirements documentation, although some development
methodologies work from use cases or user stories. Functional tests tend to answer questions
such as "Can the user do this?" or "Does this particular feature work?"
Non-functional testing refers to aspects of the software that may not be related to a specific
function or user action, such as scalability or security. Non-functional testing tends to answer
questions such as "How many people can log in at once?" or "How easy is it to hack this
software?"
Not all software defects are caused by coding errors. One common source of expensive defects is
requirement gaps, i.e., unrecognized requirements that result in errors of omission by the
program designer[14]. A common source of requirement gaps is non-functional requirements
such as testability, scalability, maintainability, usability, performance, and security.
Usability: Usability testing is the measurement of ease of use of a product or piece of software.
Testability: Software testability is the degree to which a software artifact (i.e., a software system,
software module, requirements or design document) supports testing in a given test context.
Testability is not an intrinsic property of a software artifact and cannot be measured directly (as,
for example, software size can). Instead, testability is an extrinsic property which results from the
interdependency of the software to be tested and the test goals, test methods used, and test
resources (i.e., the test context).
Maintainability: In software engineering, maintainability is the ease with which a software product
can be modified, for example in order to correct defects.
There are many approaches to software testing. Reviews, walkthroughs, or inspections are
considered as static testing, whereas actually executing programmed code with a given set of test
cases is referred to as dynamic testing. Static testing can be (and unfortunately in practice often
is) omitted. Dynamic testing takes place when the program itself is used for the first time (which
is generally considered the beginning of the testing stage). Dynamic testing may begin before the
program is 100% complete in order to test particular sections of code (modules or discrete
functions).
Verification: Have we built the software right? (i.e., does it match the specification).
Validation: Have we built the right software? (i.e., is this what the customer wants).
Black box testing has the advantage of "an unaffiliated opinion," on the one hand, and the
disadvantage of "blind exploring," on the other.
Boundary value analysis is a software testing design technique in which tests are designed to include
representatives of boundary values: values at the edge of an equivalence partition, or the smallest
incremental value on either side of an edge. The values may belong to either the input or output
ranges of a software component. Since these boundaries are common locations for errors that result
in software faults, they are frequently exercised in test cases.
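A minimal sketch of boundary value tests, assuming a hypothetical eligibility rule that accepts ages 18 through 65:

import unittest


def is_eligible(age):
    # Hypothetical rule under test: valid ages are 18 through 65 inclusive.
    return 18 <= age <= 65


class BoundaryValueTests(unittest.TestCase):
    def test_boundaries_of_valid_partition(self):
        # Exercise the edges of the valid partition and the values just outside them.
        self.assertFalse(is_eligible(17))   # just below the lower edge
        self.assertTrue(is_eligible(18))    # lower edge
        self.assertTrue(is_eligible(65))    # upper edge
        self.assertFalse(is_eligible(66))   # just above the upper edge


if __name__ == "__main__":
    unittest.main()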
Equivalence partitioning is a software testing technique that divides the input data of a software
unit into partitions of data from which test cases can be derived. In principle, test cases are
designed to cover each partition at least once. This technique tries to define test cases that
uncover classes of errors, thereby reducing the total number of test cases that must be
developed.
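A corresponding sketch of equivalence partitioning, using the same hypothetical eligibility rule and one representative value per partition:

import unittest


def is_eligible(age):
    # Same hypothetical rule as above: valid ages are 18 through 65 inclusive.
    return 18 <= age <= 65


class EquivalencePartitionTests(unittest.TestCase):
    def test_one_representative_per_partition(self):
        # Three partitions: below the valid range, inside it, and above it.
        self.assertFalse(is_eligible(10))   # representative of "too young"
        self.assertTrue(is_eligible(40))    # representative of "valid"
        self.assertFalse(is_eligible(80))   # representative of "too old"


if __name__ == "__main__":
    unittest.main()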
Fuzz testing or fuzzing is a software testing technique that provides invalid, unexpected, or
random data to the inputs of a program. If the program fails (for example, by crashing or failing
built-in code assertions), the defects can be noted.
File formats and network protocols are the most common targets of fuzz testing, but any type of
program input can be fuzzed. Interesting inputs include environment variables, keyboard and
mouse events, and sequences of API calls. Even items not normally considered "input" can be
fuzzed, such as the contents of databases, shared memory, or the precise interleaving of threads.
For the purpose of security, input that crosses a trust boundary is often the most interesting[1]. For
example, it is more important to fuzz code that handles a type of file commonly downloaded
from the Web than it is to fuzz the parsing of a configuration file.
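The following is a minimal sketch of a fuzzer; the parse_config() function is an invented program under test, and the fuzzer simply records inputs that make it crash rather than judging correctness.

import random
import string


def parse_config(text):
    # Hypothetical parser under test: "key=value" lines, blank lines ignored.
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, value = line.split("=", 1)   # raises ValueError on malformed lines
        result[key.strip()] = value.strip()
    return result


def fuzz(iterations=1000, seed=0):
    # Feed random printable strings to the parser and record any crash.
    rng = random.Random(seed)
    alphabet = string.printable
    failures = []
    for i in range(iterations):
        data = "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 200)))
        try:
            parse_config(data)
        except Exception as exc:            # a crash is a finding, not a verdict
            failures.append((i, repr(data)[:60], type(exc).__name__))
    return failures


if __name__ == "__main__":
    for case in fuzz()[:5]:
        print(case)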
Uses
Fuzz testing is often used in large software development projects that employ black box testing.
These projects usually have a budget to develop test tools, and fuzz testing is one of the
techniques that offers a high benefit-to-cost ratio.
However, fuzz testing is not a substitute for exhaustive testing or formal methods: it can only
provide a random sample of the system's behavior, and in many cases passing a fuzz test may
only demonstrate that a piece of software handles exceptions without crashing, rather than
behaving correctly. Thus, fuzz testing can only be regarded as a bug-finding tool rather than an
assurance of quality.
As a gross measurement of reliability, fuzzing can suggest which parts of a program should get
special attention, in the form of a code audit, application of static analysis, or partial rewrites.
A traceability matrix is a document, usually in the form of a table, that correlates any two
baselined documents that require a many to many relationship to determine the completeness of
the relationship. It is often used with high-level requirements (these often consist of marketing
requirements) and detailed requirements of the software product to the matching parts of high-
level design, detailed design, test plan, and test cases.
For instance a requirements traceability matrix is used to check to see if the current project
requirements are being met, and to help in the creation of a Request for Proposal, various
deliverable documents, and project plan tasks.[1]
Common usage is to take the identifier for each of the items of one document and place them in
the left column. The identifiers for the other document are placed across the top row. When an
item in the left column is related to an item across the top, a mark is placed in the intersecting
cell. The number of relationships is added up for each row and each column. This value
indicates the strength of the mapping between the two items. A zero value indicates that no
relationship exists, and it must then be determined whether one should be created. A large value
implies that the relationship is too complex and should be simplified.
To ease the creation of traceability matrices, it is advisable to add the relationships to the source
documents for both backward traceability and forward traceability. In other words, when an item
is changed in one baselined document, it's easy to see what needs to be changed in the other.
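As a rough illustration of the row and column totals described above, the following sketch uses invented requirement and test case identifiers:

# Requirement and test-case identifiers here are hypothetical.
requirements = ["REQ-1", "REQ-2", "REQ-3"]
test_cases = ["TC-A", "TC-B", "TC-C", "TC-D"]

# A mark in a cell means the test case covers the requirement.
links = {
    ("REQ-1", "TC-A"), ("REQ-1", "TC-B"),
    ("REQ-2", "TC-C"),
    # REQ-3 has no linked test case yet.
}

# Row totals: how many test cases cover each requirement.
for req in requirements:
    count = sum(1 for tc in test_cases if (req, tc) in links)
    flag = "  <-- no coverage" if count == 0 else ""
    print(f"{req}: {count}{flag}")

# Column totals: how many requirements each test case touches.
for tc in test_cases:
    print(f"{tc}: {sum(1 for req in requirements if (req, tc) in links)}")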
Grey Box Testing
Grey box testing involves having access to internal data structures and algorithms for purposes of
designing the test cases, but testing at the user, or black-box level. Manipulating input data and
formatting output do not qualify as grey box, because the input and output are clearly outside of the
"black-box" that we are calling the system under test. This distinction is particularly important when
conducting integration testing between two modules of code written by two different developers,
where only the interfaces are exposed for test. However, modifying a data repository does qualify as
grey box, as the user would not normally be able to change the data outside of the system under test.
Grey box testing may also include reverse engineering to determine, for instance, boundary values or
error messages.
Testing Levels
Unit Testing
Unit testing refers to tests that verify the functionality of a specific section of code, usually at the
function level. In an object-oriented environment, this is usually at the class level, and the minimal
unit tests include the constructors and destructors. [25]
These types of tests are usually written by developers as they work on code (white-box style), to
ensure that the specific function is working as expected. One function might have multiple tests,
to catch corner cases or other branches in the code. Unit testing alone cannot verify the
functionality of a piece of software, but rather is used to assure that the building blocks the
software uses work independently of each other.
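A minimal sketch of such unit tests, using a hypothetical Stack class as the unit under test:

import unittest


class Stack:
    # Hypothetical unit under test.
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


class StackUnitTests(unittest.TestCase):
    def test_push_then_pop_returns_last_item(self):
        s = Stack()
        s.push(1)
        s.push(2)
        self.assertEqual(s.pop(), 2)

    def test_pop_on_empty_stack_raises(self):
        # A corner case: the unit's behaviour on misuse.
        with self.assertRaises(IndexError):
            Stack().pop()


if __name__ == "__main__":
    unittest.main()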
Integration Testing
Integration testing (sometimes called Integration and Testing, abbreviated "I&T") is the
activity[1] of software testing in which individual software modules are combined and tested as a group.
It occurs after unit testing and before system testing. Integration testing takes as its input modules that
have been unit tested, groups them in larger aggregates, applies tests defined in an integration test plan
to those aggregates, and delivers as its output the integrated system ready for system testing.
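A minimal sketch of an integration test, assuming two hypothetical modules (an order parser and a pricing function) that have each passed unit testing in isolation:

import unittest


# Two hypothetical modules that have already been unit tested in isolation.
def parse_order(line):
    name, qty = line.split(",")
    return {"item": name.strip(), "quantity": int(qty)}


def price_order(order, price_list):
    return price_list[order["item"]] * order["quantity"]


class OrderIntegrationTest(unittest.TestCase):
    def test_parse_and_price_work_together(self):
        # The integration test exercises the pair through their shared interface.
        order = parse_order("widget, 3")
        total = price_order(order, {"widget": 2.50})
        self.assertEqual(total, 7.50)


if __name__ == "__main__":
    unittest.main()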
System Testing
As a rule, system testing takes, as its input, all of the "integrated" software components that have
successfully passed integration testing and also the software system itself integrated with any
applicable hardware system(s). The purpose of integration testing is to detect any inconsistencies
between the software units that are integrated together (called assemblages) or between any of
the assemblages and the hardware. System testing is a more limited type of testing; it seeks to
detect defects both within the "inter-assemblages" and also within the system as a whole.
System testing is performed on the entire system in the context of a Functional Requirement
Specification(s) (FRS) and/or a System Requirement Specification (SRS). System testing is an
investigatory testing phase, where the focus is to have almost a destructive attitude and to test
not only the design, but also the behaviour and even the believed expectations of the
customer. It is also intended to test up to and beyond the bounds defined in the
software/hardware requirements specification(s).
The following examples are different types of testing that should be considered during System
testing:
System Integration Testing verifies that a system is integrated to any external or third party
systems defined in the system requirements.
Regression Testing
Regression testing focuses on finding defects after a major code change has occurred.
Specifically, it seeks to uncover software regressions, or old bugs that have come back. Such
regressions occur whenever software functionality that was previously working correctly stops
working as intended. Typically, regressions occur as an unintended consequence of program
changes, when the newly developed part of the software collides with the previously existing
code. Common methods of regression testing include re-running previously run tests and
checking whether previously fixed faults have re-emerged. The depth of testing depends on the
phase in the release process and the risk of the added features. It can range from complete
regression testing, for changes added late in the release or deemed to be risky, to very shallow
testing, consisting of positive tests on each feature, when the changes come early in the release or
are deemed to be of low risk.
Background
Experience has shown that as software is fixed, emergence of new and/or reemergence of old
faults is quite common. Sometimes reemergence occurs because a fix gets lost through poor
revision control practices (or simple human error in revision control). Often, a fix for a problem
will be "fragile" in that it fixes the problem in the narrow case where it was first observed but not
in more general cases which may arise over the lifetime of the software. Frequently, a fix for a
problem in one area inadvertently causes a software bug in another area. Finally, it has often
been the case that when some feature is redesigned, the same mistakes that were made in the
original implementation of the feature were made in the redesign.
Therefore, in most software development situations it is considered good practice that when a
bug is located and fixed, a test that exposes the bug is recorded and regularly retested after
subsequent changes to the program. Although this may be done through manual testing
procedures using programming techniques, it is often done using automated testing tools. Such a
test suite contains software tools that allow the testing environment to execute all the regression
test cases automatically; some projects even set up automated systems to automatically re-run all
regression tests at specified intervals and report any failures (which could imply a regression or
an out-of-date test). Common strategies are to run such a system after every successful compile
(for small projects), every night, or once a week. Those strategies can be automated by an
external tool, such as BuildBot.
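A minimal sketch of such a regression test, assuming a hypothetical format_name() function whose earlier bug (empty names producing stray punctuation) has been fixed:

import unittest


def format_name(first, last):
    # The fix for a previously reported bug: empty names used to produce ", ".
    if not first and not last:
        return ""
    return f"{last}, {first}"


class RegressionTests(unittest.TestCase):
    def test_empty_names_do_not_regress(self):
        # Pins the old bug so it cannot silently come back after later changes.
        self.assertEqual(format_name("", ""), "")

    def test_normal_names_still_work(self):
        # Guards the behaviour that was already correct before the fix.
        self.assertEqual(format_name("Ada", "Lovelace"), "Lovelace, Ada")


if __name__ == "__main__":
    unittest.main()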
Regression testing is an integral part of the extreme programming software development method.
In this method, design documents are replaced by extensive, repeatable, and automated testing of
the entire software package at every stage in the software development cycle.
Traditionally, in the corporate world, regression testing has been performed by a software quality
assurance team after the development team has completed work. However, defects found at this
stage are the most costly to fix. This problem is being addressed by the rise of developer testing.
Although developers have always written test cases as part of the development cycle, these test
cases have generally been either functional tests or unit tests that verify only intended outcomes.
Developer testing compels a developer to focus on unit testing and to include both positive and
negative test cases.[1]
Uses
Regression testing can be used not only for testing the correctness of a program, but often also
for tracking the quality of its output. For instance, in the design of a compiler, regression testing
should track the code size, the simulation time, and the execution time of the test suite cases.
A software regression is a software bug which makes a feature stop functioning as intended
after a certain event (for example, a system upgrade, system patching or a change to daylight
saving time). A software performance regression is a situation where the software still
functions correctly, but performs slowly or uses more memory when compared to previous
versions.
Regressions are often caused by bug fixes included in software patches. One
approach to avoiding this kind of problem is regression testing. It is essential that a
comprehensive test plan covers this possibility before releasing any software patch. Automated
testing and well-written test cases can reduce the likelihood of a regression.
The term regression was originally used to refer to a change that caused a previously fixed bug
to reappear and the software to thus regress to a prior, incorrect state. Regression test suites were
accordingly composed of tests which tried to ensure the continued effectiveness of past bug
fixes. However, the term has since evolved to its current, more general meaning of any change
that breaks existing functionality.
Acceptance testing
In engineering and its various subdisciplines, acceptance testing is black-box testing performed
on a system (e.g. software, lots of manufactured mechanical parts, or batches of chemical
products) prior to its delivery.[1] It is also known as functional testing, black-box testing, release
acceptance, QA testing, application testing, confidence testing, final testing, validation testing, or
factory acceptance testing.
In software development, acceptance testing by the system provider is often distinguished from
acceptance testing by the customer (the user or client) prior to accepting transfer of ownership. In
such environments, acceptance testing performed by the customer is known as user acceptance
testing (UAT). This is also known as end-user testing, site (acceptance) testing, or field
(acceptance) testing.
A smoke test is used as an acceptance test prior to introducing a build to the main testing
process.
Overview
Acceptance testing generally involves running a suite of tests on the completed system. Each
individual test, known as a case, exercises a particular operating condition of the user's
environment or feature of the system, and will result in a pass or fail boolean outcome. There is
generally no degree of success or failure. The test environment is usually designed to be
identical, or as close as possible, to the anticipated user's environment, including extremes of
such. These test cases must each be accompanied by test case input data or a formal description
of the operational activities (or both) to be performed—intended to thoroughly exercise the
specific case—and a formal description of the expected results.
Acceptance Tests/Criterion (in Agile Software Development) are usually created by business
customers and expressed in a business domain language. These are high level tests to test the
completeness of a user story or stories 'played' during any sprint/iteration. These tests are created
ideally through collaboration between business customers, business analysts, testers, and
developers; however, the business customers (product owners) are the primary owners of these
tests. As the user stories pass their acceptance criteria, the business owners can be confident
that the developers are progressing in the right direction with regard to how the application was
envisaged to work, so it is essential that these tests include both business logic tests and
UI validation elements (if need be).
In software engineering, a smoke test generally consists of a collection of tests that can be
applied to a newly created or repaired computer program. Sometimes the tests are performed by
the automated system that builds the final software. In this sense a smoke test is the process of
validating code changes before the changes are checked into the larger product’s official source
code collection or the main branch of source code.
In software testing, a smoke test is a collection of written tests that are performed on a system
prior to being accepted for further testing. This is also known as a build verification test. This is a
"shallow and wide" approach to the application. The tester "touches" all areas of the application
without getting too deep, looking for answers to basic questions like, "Can I launch the test item
at all?", "Does it open to a window?", "Do the buttons on the window do things?".
The purpose is to determine whether or not the application is so badly broken that testing
functionality in a more detailed way is unnecessary. These written tests can either be performed
manually or using an automated tool. When automated tools are used, the tests are often initiated
by the same process that generates the build itself. This is sometimes referred to as "rattle"
testing - as in "if I shake it does it rattle?".
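A minimal sketch of a "shallow and wide" smoke test; the Application class here is an invented stand-in so the example is self-contained, where a real smoke test would start the actual build under test:

import unittest


class Application:
    # Minimal stand-in for the system under test.
    def is_running(self):
        return True

    def open_main_window(self):
        return object()


class SmokeTest(unittest.TestCase):
    # Shallow checks: can the application start at all and show its main window?
    def test_application_starts(self):
        self.assertTrue(Application().is_running())

    def test_main_window_opens(self):
        self.assertIsNotNone(Application().open_main_window())


if __name__ == "__main__":
    unittest.main()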
Alpha testing
Alpha testing is simulated or actual operational testing by potential users/customers or an
independent test team at the developers' site. It is often employed for off-the-shelf software as a
form of internal acceptance testing, before the software goes to beta testing.
Beta testing
Beta testing comes after alpha testing. Versions of the software, known as beta versions, are
released to a limited audience outside of the programming team. The software is released to
groups of people so that further testing can ensure the product has few faults or bugs. Sometimes,
beta versions are made available to the open public to increase the feedback field to a maximal
number of future users.
1. Performance testing
Performance testing, or load testing, checks whether the software can handle large
quantities of data or users. This is generally referred to as software scalability. This
activity of non-functional software testing is also often referred to as endurance testing.
Load Testing
This is the simplest form of performance testing. A load test is usually conducted to understand
the behavior of the application under a specific expected load. This load can be the expected
number of concurrent users on the application performing a specific number of transactions
within a set duration. This test will give the response times of all the important business-critical
transactions. If the database, application server, etc., are also monitored, then this simple test can
itself point towards bottlenecks in the application software.
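A minimal sketch of a load test; handle_request() is an invented stand-in for a business-critical transaction, and the user and request counts are assumptions, where a real test would drive the actual application:

import concurrent.futures
import statistics
import time


def handle_request():
    # Stand-in for a business-critical transaction; replace with a real call.
    time.sleep(0.01)
    return "ok"


def load_test(concurrent_users=20, requests_per_user=10):
    # Apply the expected concurrent load and collect per-request response times.
    timings = []

    def one_user():
        for _ in range(requests_per_user):
            start = time.perf_counter()
            handle_request()
            timings.append(time.perf_counter() - start)

    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        list(pool.map(lambda _: one_user(), range(concurrent_users)))

    return {
        "requests": len(timings),
        "mean_s": statistics.mean(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    print(load_test())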
Stress Testing
This testing is normally used to break the application. The number of users is doubled and the
test is run again, repeatedly, until the application breaks down. This kind of test is done to
determine the application's robustness in times of extreme load, and it helps application
administrators determine whether the application will perform sufficiently if the current load
goes well above the expected load.
Endurance Testing
This test is usually done to determine whether the application can sustain the continuous
expected load. During endurance tests, memory utilization is monitored to detect potential leaks.
Also important, but often overlooked, is performance degradation: the test should ensure that the
throughput and/or response times after some long period of sustained activity are as good as, or
better than, at the beginning of the test.
Spike Testing
Spike testing, as the name suggests, is done by suddenly spiking the number of users and
observing whether the application will go down or whether it will be able to handle the dramatic
change in load.
Scalability Testing
Scalability testing is an extension of performance testing and is part of the battery of non-
functional tests. It is the testing of a software application to measure its capability to scale up or
scale out in terms of any of its non-functional capabilities, be it the user load supported, the
number of transactions, the data volume, etc. The purpose of scalability testing is to identify
major workloads and mitigate bottlenecks that can impede the scalability of the application.
The performance testing environment should not be combined with the UAT or development
environment: if UAT, integration testing, or other testing is going on in the same environment,
then the results obtained from the performance testing may not be reliable. As a best practice, it
is always advisable to have a separate performance testing environment resembling the
production environment as closely as possible.
2. Stability testing
Stability testing checks whether the software can continuously function well for an acceptable
period or longer. This activity of non-functional software testing is oftentimes referred to as
load (or endurance) testing.
3. Usability testing
Usability testing is needed to check if the user interface is easy to use and understand.
4. Security testing
Security testing is essential for software that processes confidential data to prevent system
intrusion by hackers.
5. Internationalization and localization
Internationalization and localization testing is needed to check these aspects of the software, for
which a pseudolocalization method can be used. It verifies that the application still works even
after it has been translated into a new language or adapted for a new culture (such as different
currencies or time zones).
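A minimal sketch of pseudolocalization, using an invented accent mapping and padding marker to make untranslated or truncated strings easy to spot:

# Accented look-alikes for ASCII vowels; padding simulates longer translations.
ACCENTED = str.maketrans("AEIOUaeiou", "ÀÉÏÖÜàéïöü")


def pseudolocalize(message):
    # Wrap the string in markers and pad it so truncation and hard-coded
    # layout problems show up before real translations exist.
    padded = message.translate(ACCENTED) + " ~~~~"
    return f"[{padded}]"


if __name__ == "__main__":
    print(pseudolocalize("Save file"))   # e.g. [Sàvé fïlé ~~~~]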
6. Destructive testing
Destructive testing attempts to cause the software or a sub-system to fail, in order to test its
robustness.
Another practice is to start software testing at the same moment the project starts and to carry it
on as a continuous process until the project finishes.[30]
In counterpoint, some emerging software disciplines such as extreme programming and the agile
software development movement, adhere to a "test-driven software development" model. In this
process, unit tests are written first, by the software engineers (often with pair programming in the
extreme programming methodology). Of course these tests fail initially, as they are expected to.
Then, as code is written, it passes incrementally larger portions of the test suites. The test suites
are continuously updated as new failure conditions and corner cases are discovered, and they are
integrated with any regression tests that are developed. Unit tests are maintained along with the
rest of the software source code and generally integrated into the build process (with inherently
interactive tests being relegated to a partially manual build acceptance process).
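A minimal sketch of this test-first flow, using an invented fizzbuzz() example: the tests are written first and would fail until the implementation below them is added.

import unittest


class FizzBuzzTest(unittest.TestCase):
    # Step 1: these tests are written first and fail until fizzbuzz() exists.
    def test_multiples_of_three(self):
        self.assertEqual(fizzbuzz(3), "Fizz")

    def test_multiples_of_five(self):
        self.assertEqual(fizzbuzz(5), "Buzz")

    def test_other_numbers_pass_through(self):
        self.assertEqual(fizzbuzz(7), "7")


# Step 2: the minimal implementation written afterwards to make the tests pass.
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


if __name__ == "__main__":
    unittest.main()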
Although variations exist between organizations, there is a typical cycle for testing[31]. The
sample below is common among organizations employing the Waterfall development model.
Requirements analysis: Testing should begin in the requirements phase of the software
development life cycle. During the design phase, testers work with developers in determining
what aspects of a design are testable and with what parameters those tests work.
Test planning: Test strategy, test plan, testbed creation. Since many activities will be carried out
during testing, a plan is needed.
Test development: Test procedures, test scenarios, test cases, test datasets, test scripts to use
in testing software.
Test execution: Testers execute the software based on the plans and tests and report any errors
found to the development team.
Test reporting: Once testing is completed, testers generate metrics and make final reports on
their test effort and whether or not the software tested is ready for release.
Test result analysis: or defect analysis, is done by the development team, usually along with the
client, in order to decide which defects should be treated, fixed, rejected (i.e., the software is
found to be working properly), or deferred to be dealt with later.
Defect Retesting: once a defect has been dealt with by the development team, it is retested by
the testing team. This is also known as resolution testing.
Regression testing: It is common to have a small test program built of a subset of tests, for each
integration of new, modified, or fixed software, in order to ensure that the latest delivery has
not ruined anything, and that the software product as a whole is still working correctly.
Test Closure: Once the test meets the exit criteria, the activities such as capturing the key
outputs, lessons learned, results, logs, documents related to the project are archived and used
as a reference for future projects.
Automated testing
Many programming groups are relying more and more on automated testing, especially groups
that use Test-driven development. There are many frameworks to write tests in, and Continuous
Integration software will run tests automatically every time code is checked into a version
control system.
While automation cannot reproduce everything that a human can do (and all the strange ways
they think of to do it), it can be very useful for regression testing. However, it does require a
well-developed test suite of testing scripts in order to be truly useful.
Testing artifacts
The software testing process can produce several artifacts.
Test plan
A test specification is called a test plan. The developers are well aware of what test plans
will be executed, and this information is made available to management and the
developers. The idea is to make them more cautious when developing their code or
making additional changes. Some companies have a higher-level document called a test
strategy.
Traceability matrix
A traceability matrix is a table that correlates requirements or design documents to test
documents. It is used to change tests when the source documents are changed, or to verify
that the test results are correct.
Test case
A test case normally consists of a unique identifier, requirement references from a design
specification, preconditions, events, a series of steps (also known as actions) to follow,
input, output, expected result, and actual result. Clinically defined, a test case is an input
and an expected result.[32] This can be as pragmatic as 'for condition x your derived result
is y', whereas other test cases describe in more detail the input scenario and what results
might be expected. It can occasionally be a series of steps (but often steps are contained
in a separate test procedure that can be exercised against multiple test cases, as a matter
of economy) but with one expected result or expected outcome. The optional fields are a
test case ID, test step, or order of execution number, related requirement(s), depth, test
category, author, and check boxes for whether the test is automatable and has been
automated. Larger test cases may also contain prerequisite states or steps, and
descriptions. A test case should also contain a place for the actual result. These steps can
be stored in a word processor document, spreadsheet, database, or other common
repository. In a database system, you may also be able to see past test results, who
generated the results, and what system configuration was used to generate those results.
These past results would usually be stored in a separate table.
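As a rough illustration, a test case might be recorded as a structured record like the following; the field names follow the description above, and all identifiers and values are invented.

# Field names follow the description above; identifiers and values are invented.
test_case = {
    "id": "TC-042",
    "requirement": "REQ-7",
    "preconditions": ["user account exists", "user is logged out"],
    "steps": [
        "open the login page",
        "enter a valid username and password",
        "press the login button",
    ],
    "input": {"username": "alice", "password": "secret"},
    "expected_result": "the user's dashboard is displayed",
    "actual_result": None,   # filled in when the case is executed
    "automated": False,
}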
Test script
The test script is the combination of a test case, test procedure, and test data. Initially the
term was derived from the product of work created by automated regression test tools.
Today, test scripts can be manual, automated, or a combination of both.
Test suite
The most common term for a collection of test cases is a test suite. The test suite often
also contains more detailed instructions or goals for each collection of test cases. It
definitely contains a section where the tester identifies the system configuration used
during testing. A group of test cases may also contain prerequisite states or steps, and
descriptions of the following tests.
Test data
In most cases, multiple sets of values or data are used to test the same functionality of a
particular feature. All the test values and changeable environmental components are
collected in separate files and stored as test data. It is also useful to provide this data to
the client along with the product or project.
Test harness
The software, tools, samples of data input and output, and configurations are all referred
to collectively as a test harness.