
Testing Techniques for Data Warehousing and Business Intelligence Systems


Muralidharan Subbukutty
Nov 2008

© 2008 MindTree Consulting


Agenda

• Introduction
• WHY of DW/BI Testing
• WHAT of DW/BI Testing
• Testing Life Cycle – HOW of DW Testing
• Test Case Scenarios
• Unit Testing
• Performance Testing
• System Testing
• User Acceptance Testing


Introduction

• Testing is crucial in DW/BI projects.
• Testing for DW/BI carries unique challenges and requires specialized approaches.
• The testing function for this highly dynamic technology area is at a nascent stage of maturity.
• This session walks through how to perform ETL testing and report testing.

Challenges:
• Lack of awareness
• Absence of tools
• Lack of a standard approach/methodology
• Unwillingness on the part of DW developers


WHY of DW/BI Testing

• To verify that the requirements are met
• To ensure the system is well integrated
• To ensure data is accurate (complete and correct): data reconciliation
• To ensure data quality issues are handled
• To check performance: data loading and reporting
• To check scalability
• To err is human: test for obvious errors


WHAT of DW/BI Testing

• Data warehouse testing is divided into two parts:
  - 'Back-end' testing, where the source system data is compared to the end-result data in the loaded area.
  - 'Front-end' testing, where the user checks the data by comparing their MIS with the data displayed by end-user tools such as OLAP.


Why is DW/BI testing different?

• The answer lies in the definition of what constitutes DW/BI:
  - In-depth analysis of detailed business data
  - BI is a broad category of application programs and technologies
  - It contains historical data
  - It supports reporting and analysis
• DWs tend to have these distinguishing features:
  - Use a subject-oriented dimensional data model
  - Contain publishable data from potentially multiple sources
  - Contain integrated reporting tools
  - Serve as the single version of truth


User-Triggered vs. System-Triggered

• Most production/source system testing covers the processing of individual transactions, which are driven by user input (application form, servicing request). Very few test cycles cover system-triggered scenarios (such as billing or valuation).
• In a data warehouse, most of the testing is system-triggered, driven by the ETL (Extraction, Transformation and Loading) scripts, the view refresh scripts, and so on.


Batch vs. Online Gratification

• A transaction system provides instant, or at least overnight, gratification to users: a transaction they enter is processed either online or, at most, in an overnight batch.
• In a data warehouse, most of the action happens in the back end, and users have to trace individual transactions through to the MIS and the views produced by the OLAP tools. This is the same challenge as asking users to test the mammoth month-end reports and financial statements churned out by transaction systems.
• This makes it a challenge to retain users' interest.


Volume of Test Data

• The test data in a transaction system is a very small sample of the overall production data. To keep matters simple, we typically include only as many test cases as are needed to cover all possible test scenarios within a limited set of test data.
• A data warehouse typically has large test data, because one tries to cover the maximum possible combinations and permutations of dimensions and facts.
• For example, if you are testing the location dimension, you would like the location-wise sales revenue report to have some revenue figures for most of the 100 cities and the 44 states. This means you need thousands of sales transactions at sales-office level (assuming the sales office is the lowest level of granularity for the location dimension).
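One way to check whether the test data really covers a dimension is to list the dimension members that have no fact rows at all. The sketch below is only an illustration: the table names (dim_location, fact_sales), the column names and the sample cities are assumptions, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

def uncovered_members(conn, dim_table, dim_key, fact_table, fact_key):
    """Return dimension keys that have no matching rows in the fact table."""
    sql = (
        f"SELECT d.{dim_key} FROM {dim_table} d "
        f"LEFT JOIN {fact_table} f ON f.{fact_key} = d.{dim_key} "
        f"WHERE f.{fact_key} IS NULL ORDER BY d.{dim_key}"
    )
    return [row[0] for row in conn.execute(sql)]

# Hypothetical sample data: three cities, sales for only two of them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_location (city TEXT PRIMARY KEY, state TEXT);
    CREATE TABLE fact_sales (city TEXT, revenue REAL);
    INSERT INTO dim_location VALUES ('Austin', 'TX'), ('Boston', 'MA'), ('Denver', 'CO');
    INSERT INTO fact_sales VALUES ('Austin', 120.0), ('Austin', 80.0), ('Denver', 45.5);
""")

missing = uncovered_members(conn, "dim_location", "city", "fact_sales", "city")
print(missing)  # cities that would show no revenue in the location-wise report
```

Any member returned here is a gap in the test data: the report would show nothing for it, so that part of the dimension is not being exercised.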


Possible Scenarios/Test Cases

• If a transaction system has (say) a hundred different scenarios, the valid and possible combinations of those scenarios are limited.
• In a data warehouse, the permutations and combinations one can test are virtually unlimited, because the core objective of a data warehouse is to allow all possible views of the data. In other words, 'You can never fully test a data warehouse.'
• Therefore one has to be creative in designing test scenarios to gain a high level of confidence.


Test Data Preparation

• This is linked to the points on possible test scenarios and volume of data. Given that a data warehouse needs lots of both, the effort required to prepare test data is much greater.

Programming for Testing Challenge

• In transaction systems, users and business analysts typically test the output of the system.
• However, in a data warehouse, as most of the action happens at the back end, most of the 'data warehouse data quality testing' and 'Extraction, Transformation and Loading' testing is done by running separate stand-alone scripts. These scripts compare pre-transformation data with post-transformation data (for example, by comparing aggregates) and flag any pilferage, that is, data lost along the way. Users come into play when their help is needed to analyze the discrepancies (if designers or business analysts cannot figure them out).
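A stand-alone reconciliation script of the kind described above could look roughly like this. It is a minimal sketch, not the method used on any particular project: the table names (src_sales, dw_sales), the grouping key and the measure are assumptions, and SQLite stands in for the real source and warehouse databases.

```python
import sqlite3

def reconcile(conn, source_table, target_table, key, measure):
    """Compare row count and summed measure per key between two tables.

    Returns (key, source_count, source_sum, target_count, target_sum)
    for every key where the two sides disagree.
    """
    def profile(table):
        sql = (f"SELECT {key}, COUNT(*), ROUND(SUM({measure}), 2) "
               f"FROM {table} GROUP BY {key}")
        return {row[0]: (row[1], row[2]) for row in conn.execute(sql)}

    src, tgt = profile(source_table), profile(target_table)
    diffs = []
    for k in sorted(set(src) | set(tgt)):
        s = src.get(k, (0, 0.0))
        t = tgt.get(k, (0, 0.0))
        if s != t:
            diffs.append((k, s[0], s[1], t[0], t[1]))
    return diffs

# Hypothetical data: the 'South' row is lost between source and warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_sales (region TEXT, amount REAL);
    CREATE TABLE dw_sales  (region TEXT, amount REAL);
    INSERT INTO src_sales VALUES ('North', 100.0), ('North', 50.0), ('South', 70.0);
    INSERT INTO dw_sales  VALUES ('North', 100.0), ('North', 50.0);
""")

diffs = reconcile(conn, "src_sales", "dw_sales", "region", "amount")
print(diffs)
```

Comparing both counts and sums per key narrows a mismatch down to a group of rows, which is usually where the designers or business analysts start their analysis.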


HOW of DW/BI Testing
Testing Life Cycle

Standard Development Methodology

Requirements → Design → Develop → Test → Roll Out

Activities for Testing

Define Testing Strategy → Prepare Test Cases / Prepare Test Data → Unit Testing → System Testing / Performance Testing → User Acceptance Testing → Post Deployment Testing


HOW of DW/BI Testing - Test Scenarios

Test Scenarios:
• Simple scenarios are relatively straightforward and can be the first step in understanding the health of the system. Examples:
  - Extraction: complete table extraction from a core system with a robust DBMS.
  - Transformation: creation of simple derived attributes (for example, the complete bill amount from individual billing items) or creation of aggregates.
  - Loading: loading a dimension set with fewer attributes and without any transformation during loading.
  - OLAP: testing using 'Basic Functions'.


Negative Testing
• Checks how the system handles negative conditions:
  - Extraction: wrong or unexpected data in the table (for example, a customer ID in the wrong format, or character data in a field that should be numeric).
  - Transformation: negative sales numbers, an age of 200 years, etc. This is important because the transformation logic should handle not only the data it expects but everything it could face.
  - Loading: wrong data sets, for example a 'location' dimension data set that has two columns missing, does not exist, or contains null values. The loading system should run some fundamental checks before it starts bulk loading.
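The fundamental checks mentioned for negative testing can be sketched as a small validation routine that is fed deliberately bad rows. Everything here is illustrative: the column names, the allowed ranges and the function validate_rows are assumptions, not part of any specific ETL tool.

```python
def validate_rows(rows, required_columns, numeric_columns=(), ranges=None):
    """Return (row_index, reasons) for every row that fails a basic check."""
    ranges = ranges or {}
    rejects = []
    for i, row in enumerate(rows):
        reasons = []
        # Required columns must be present and non-null.
        for col in required_columns:
            if row.get(col) in (None, ""):
                reasons.append(f"{col}: missing or null")
        # Numeric columns must parse as numbers and fall within range.
        for col in numeric_columns:
            value = row.get(col)
            if value in (None, ""):
                continue
            try:
                number = float(value)
            except (TypeError, ValueError):
                reasons.append(f"{col}: not numeric ({value!r})")
                continue
            low, high = ranges.get(col, (None, None))
            if (low is not None and number < low) or (high is not None and number > high):
                reasons.append(f"{col}: out of range ({value!r})")
        if reasons:
            rejects.append((i, reasons))
    return rejects

# Negative test data: one clean row and two rows with planted problems.
rows = [
    {"customer_id": "C001", "age": "34", "sales": "120.50"},
    {"customer_id": "C002", "age": "200", "sales": "-15"},    # age of 200, negative sales
    {"customer_id": None, "age": "forty", "sales": "10"},     # null key, text in numeric field
]
rejects = validate_rows(
    rows,
    required_columns=["customer_id", "age", "sales"],
    numeric_columns=["age", "sales"],
    ranges={"age": (0, 120), "sales": (0, None)},
)
for index, reasons in rejects:
    print(index, reasons)
```

The negative test passes when every planted problem is reported and the clean row is not rejected.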


HOW of DW/BI Testing - Test Data

Test data ranges from a small set to the complete production data.
Limited Data Warehouse Test Data
• This involves feeding a limited number of transactions into the source systems (typically fewer than a few thousand for each data mart).
• This should ideally cover the key scenarios for the different transformation logic.
• For example, if the transformation does de-duplication, include a couple of duplicate cases. For the customer dimension, include customers of different ages, income groups, etc.
• After these transactions have been processed by the source systems, the entire processing is run and results are checked at each interim stage and in the end-user tools.
• The expertise lies in covering the maximum number of scenarios with the minimum number of transactions. However, never try to include all scenarios. The guideline is to include scenarios that are complex or have complex programming logic (like 'de-dup' or 'standardize').
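As an illustration of covering a complex rule with very few transactions, the sketch below tests a de-dup step with four customers. The matching rule (same normalized name and date of birth) and the function dedupe_customers are assumptions made for the example; a real project would test its own de-dup logic.

```python
def dedupe_customers(rows):
    """Hypothetical de-dup rule: same normalized name and date of birth is one customer."""
    seen = {}
    for row in rows:
        # Normalize case and collapse repeated spaces before matching.
        key = (" ".join(row["name"].lower().split()), row["dob"])
        seen.setdefault(key, row)  # keep the first occurrence
    return list(seen.values())

# Four rows cover three scenarios: a true duplicate, a near miss, and a clean row.
test_rows = [
    {"id": 1, "name": "Asha Rao", "dob": "1980-02-14"},
    {"id": 2, "name": "asha  rao", "dob": "1980-02-14"},   # duplicate of id 1
    {"id": 3, "name": "Asha Rao", "dob": "1991-07-01"},    # same name, different person
    {"id": 4, "name": "Vikram Shah", "dob": "1975-11-30"},
]

result = dedupe_customers(test_rows)
print([r["id"] for r in result])
```

The near-miss row matters as much as the duplicate: it checks that the rule does not merge customers it should keep apart.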


Limited Production Data for Data Warehouse Testing

• The next step is to expand both the scenarios and the data. This is achieved by fine-tuning the extract scripts so that they pick up a limited amount of production data from the source system. The filtering is typically kept simple.

Full Production Data

• This is the final, must-do test for a data warehouse. Users will generally not accept a system until they have reconciled it with their reports from the source system. (For example, the data warehouse should show the same number of telecom customers as the core production systems.)


Testing Life Cycle

Unit Testing:
• Unit testing is done at the individual component level.
• Types of unit testing are:
  - Extraction Testing
  - Transformation Testing
  - Loading Testing
  - End User Browsing and OLAP Testing
  - Ad-hoc Query Testing
  - Down Stream Flow Testing
  - Data Migration Verification


HOW of DW/BI Testing – Unit Testing

Extraction Testing
This testing checks the following:
• The extraction is able to extract the required fields.
• The extraction logic for each source system is working.
• Extraction scripts are granted security access to the source systems.
• The extract audit log is updated and time stamped.
• Source-to-extraction-destination movement is complete and accurate.
• Extraction completes within the expected window.
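Completeness and accuracy of an extract can be checked with a two-way set difference between the source and the extraction destination. This is a sketch under assumptions: the tables src_customer and stg_customer and their columns are invented for the example, and SQLite's EXCEPT operator stands in for the equivalent in the project's database.

```python
import sqlite3

def extraction_diff(conn, source, staging, columns):
    """Return (rows missing from staging, rows in staging not found in source)."""
    cols = ", ".join(columns)
    missing = conn.execute(
        f"SELECT {cols} FROM {source} EXCEPT SELECT {cols} FROM {staging}"
    ).fetchall()
    unexpected = conn.execute(
        f"SELECT {cols} FROM {staging} EXCEPT SELECT {cols} FROM {source}"
    ).fetchall()
    return sorted(missing), sorted(unexpected)

# Hypothetical data: one row not extracted, one row extracted with a wrong value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (id INTEGER, name TEXT, balance REAL);
    CREATE TABLE stg_customer (id INTEGER, name TEXT, balance REAL);
    INSERT INTO src_customer VALUES (1, 'Asha', 100.0), (2, 'Vikram', 250.0), (3, 'Meera', 75.0);
    INSERT INTO stg_customer VALUES (1, 'Asha', 100.0), (2, 'Vikram', 205.0);
""")

missing, unexpected = extraction_diff(
    conn, "src_customer", "stg_customer", ["id", "name", "balance"]
)
print("missing from staging:", missing)
print("unexpected in staging:", unexpected)
```

Note that EXCEPT works on distinct rows, so this check does not detect rows that were extracted twice; a row-count comparison is needed alongside it.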


Transformation Testing
• Transformation scripts are transforming the data as per the expected logic.
• The one-time transformation for historical snapshots is working.
• Detailed and aggregated data sets are created and match each other.
• The transformation audit log is updated and time stamped.
• There is no pilferage of data during the transformation process.
• Transformation completes within the given window.
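A unit test for a transformation usually fixes a small input, states the expected output, and also checks that the aggregated result still adds up to the detail. The sketch below assumes a hypothetical transformation, total_per_bill, that derives a bill amount from its billing items; the field names are illustrative.

```python
import math
from collections import defaultdict

def total_per_bill(items):
    """Hypothetical transformation: derive each bill amount from its billing items."""
    totals = defaultdict(float)
    for item in items:
        totals[item["bill_id"]] += item["quantity"] * item["unit_price"]
    return dict(totals)

items = [
    {"bill_id": "B1", "quantity": 2, "unit_price": 10.0},
    {"bill_id": "B1", "quantity": 1, "unit_price": 5.5},
    {"bill_id": "B2", "quantity": 3, "unit_price": 4.0},
]

bills = total_per_bill(items)

# Check 1: the derived attribute follows the expected logic.
assert bills == {"B1": 25.5, "B2": 12.0}

# Check 2: aggregated and detailed data match (nothing lost in transformation).
detail_total = sum(i["quantity"] * i["unit_price"] for i in items)
assert math.isclose(sum(bills.values()), detail_total)
print(bills, detail_total)
```

The second check is the small-scale version of the pilferage test: the grand total must be the same before and after the transformation.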


Loading Testing
• There is no pilferage during the loading process.
• Any transformations during the loading process are working.
• The movement of data sets from staging to the loading destination is working.
• One-time historical snapshots are working.
• Both incremental and total refresh are working.
• Loading completes within the expected window.
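One simple way to test that incremental and total refresh both work is to check that they end in the same state: an initial load plus the incremental changes should equal a total refresh taken after those changes. The functions below are stand-ins written for this example, with the target modelled as a dictionary keyed by id.

```python
def full_refresh(source_rows):
    """Rebuild the target from scratch as a mapping of key -> row."""
    return {row["id"]: row for row in source_rows}

def incremental_load(target, changed_rows):
    """Apply only new or changed rows to an existing target (upsert by key)."""
    updated = dict(target)
    for row in changed_rows:
        updated[row["id"]] = row
    return updated

# Hypothetical source snapshots and the delta between them.
day1 = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Chennai"}]
delta = [{"id": 2, "city": "Mumbai"}, {"id": 3, "city": "Delhi"}]
day2 = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Mumbai"}, {"id": 3, "city": "Delhi"}]

via_incremental = incremental_load(full_refresh(day1), delta)
via_full = full_refresh(day2)
print(via_incremental == via_full)
```

This sketch covers inserts and updates only. If the source can delete rows, the test data and the incremental logic both need a delete case.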


End User Browsing and OLAP Testing

• The business views and dashboards display the data as expected.
• The scheduled reports are accurate, correct and complete.
• Scheduled reports and other batch operations, such as view refresh, complete within the expected window.
• 'Analysis Functions' and 'Data Analysis' are working.
• There is no pilferage of data between the source systems and the views.


Ad-hoc Query Testing

• Ad-hoc query creation works as per the expected functionality.
• Ad-hoc query response time is as expected.

Down Stream Flow Testing

• Data is extracted from the data warehouse and updated in the downstream systems/data marts.
• There is no pilferage.

One-Time Population Testing (Data Migration Verification)

• The one-time ETL for the production data is working.
• The production reports and the data warehouse reports match.
• The time taken for the one-time processing is manageable within the conversion weekend.


HOW of DW/BI Testing - Templates

DW BO Universe Reports



Testing Life Cycle - Performance Testing

Performance Testing:
• Performance testing addresses an often neglected area: system performance.
• It is often neglected because performance typically does not become an issue when the application goes live; it has an impact only later, when the load on the application increases.
• It ensures that the application has been designed to scale based on acceptable performance benchmarks.
• Performance testing is also referred to as stress testing or volume testing.


Stress and Volume Testing

• This part of testing involves applying maximum volume or introducing failure points to check the robustness and capacity of the system.
• The level of stress testing depends on the configuration of the test environment and the level of capacity planning done.
• Here are some examples from the ideal world:
  - Server shutdown during a batch process.
  - Extraction, Transformation and Loading with two to three times the maximum imagined data volume (for which the capacity is planned).
  - Two to three times more users placing large numbers of ad-hoc queries.
  - Running a large number of scheduled reports.
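The idea of loading two to three times the planned volume can be sketched as a small timing harness. This is only a toy illustration: PLANNED_ROWS, the fact_sales table and the in-memory SQLite database are assumptions, and real volume tests run against the actual ETL jobs in a properly sized test environment.

```python
import sqlite3
import time

def timed_load(row_count):
    """Load row_count generated rows and return (rows_loaded, seconds_taken)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact_sales (id INTEGER, amount REAL)")
    rows = ((i, i * 0.5) for i in range(row_count))
    start = time.perf_counter()
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start
    loaded = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
    conn.close()
    return loaded, elapsed

PLANNED_ROWS = 10_000  # assumed planned capacity, kept small for the example

# Run the same load at 1x, 2x and 3x the planned volume.
results = {factor: timed_load(PLANNED_ROWS * factor) for factor in (1, 2, 3)}
for factor, (loaded, elapsed) in results.items():
    print(f"{factor}x planned volume: {loaded} rows in {elapsed:.3f}s")
```

What to look for is how the load time grows with volume and whether the 3x run still fits the batch window; the absolute timings depend entirely on the environment.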


Testing Life Cycle – System Testing

System Testing:
• System testing is done after all the components have been integrated and deployed in the UAT/QA environment.
• Its purpose is to test the system in detail, identifying and resolving all potential issues in the application before go-live.
• It checks that the end-to-end data flow from the source system to the OLAP reports is complete and accurate.
• It tests that all the requirements of the system are met.
• These checks are also performed during User Acceptance Testing.


Testing Life Cycle - User Acceptance Testing

• User Acceptance Testing (UAT) needs to be owned by the client team; the MindTree team provides support and assistance during the process.
• UAT is carried out in the QA environment.
• MindTree suggests the following approaches to effectively validate the data element values:
  - One-to-one comparison: to test the data, reports can be generated from the source system/existing applications, and the same reports generated from BI.
  - Business validation: once the values in the reports match, the reports can be sent to business users to validate accuracy from a business perspective.
  - Operations validation: the operations team tests the modules from an operations perspective, checking whether operational metadata is captured and accurate, email notifications are sent, rejects are handled, and appropriate messages are logged in the error/log files.
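The one-to-one comparison can be partly automated by exporting the same report from both systems and comparing them line by line. The sketch below assumes both reports are available as CSV text with a key column and a value column; the column names, the sample figures and the tolerance are illustrative.

```python
import csv
import io

def load_report(text, key_field):
    """Parse CSV text into a mapping of key -> row."""
    return {row[key_field]: row for row in csv.DictReader(io.StringIO(text))}

def compare_reports(source_text, bi_text, key_field, value_field, tolerance=0.005):
    """Return (key, source_value, bi_value) for every line that does not match."""
    source = load_report(source_text, key_field)
    bi = load_report(bi_text, key_field)
    mismatches = []
    for key in sorted(set(source) | set(bi)):
        s = source.get(key)
        b = bi.get(key)
        if s is None or b is None:
            # The line exists in only one of the two reports.
            mismatches.append((key,
                               s[value_field] if s else None,
                               b[value_field] if b else None))
        elif abs(float(s[value_field]) - float(b[value_field])) > tolerance:
            mismatches.append((key, s[value_field], b[value_field]))
    return mismatches

# Hypothetical exports of the same revenue report from both systems.
source_report = "region,revenue\nNorth,150.00\nSouth,70.00\nWest,33.10\n"
bi_report = "region,revenue\nNorth,150.00\nSouth,69.00\n"

mismatches = compare_reports(source_report, bi_report, "region", "revenue")
print(mismatches)
```

Each mismatch then has to be either fixed or explained before the reports go to business users for validation.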


DW/BI Other Testing Methods

Parallel Testing
• In parallel testing, the data warehouse is run on production data as it would be in real life, and its outputs are compared with the existing set of reports to ensure that they are in sync or that any mismatches are explained.
Security Framework Testing
• Check all possible aspects of the security framework:
  - Does the user have access to all of the data that they are entitled to?
  - Does any user have access to data that they are not entitled to?
  - Are the roles and security defined properly?
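The two access questions above map onto two set differences: what a user is entitled to but cannot see, and what a user can see but is not entitled to. The sketch below assumes the entitlements and the observed access have already been collected as plain mappings; the user names and data areas are invented for the example.

```python
def access_violations(entitlements, observed):
    """Compare what each user should see with what they can actually see.

    Both arguments map user -> set of data areas.
    Returns (missing_access, excess_access), each a dict of user -> set.
    """
    missing, excess = {}, {}
    for user in sorted(set(entitlements) | set(observed)):
        allowed = entitlements.get(user, set())
        actual = observed.get(user, set())
        if allowed - actual:
            missing[user] = allowed - actual   # entitled but not accessible
        if actual - allowed:
            excess[user] = actual - allowed    # accessible but not entitled
    return missing, excess

# Hypothetical security matrix and the access actually observed during testing.
entitlements = {"analyst": {"sales", "inventory"}, "hr_user": {"payroll"}}
observed = {"analyst": {"sales"}, "hr_user": {"payroll", "sales"}}

missing, excess = access_violations(entitlements, observed)
print("missing access:", missing)
print("excess access:", excess)
```

Excess access is normally the more serious finding, but missing access also fails the test because the user cannot do their job.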


Full Production Simulation

• One takes a backup of the source systems from an earlier date and runs the complete ETL and end-user tool operations to look at the results.
• Production simulation is more of a lab test by the technology team before the system is released to full user view in parallel testing.
• This is typically a step before parallel testing.
• It can be a full-scale parallel test, but it is something more than that: whereas parallel testing is done in sync with production, production simulation does not have to be.




Imagination Action Joy

Muralidharan Subbukutty
[email protected]
+91 98802 81785
www.mindtree.com
