ETL Testing and Datawarehouse Testing
Before we learn anything about ETL testing, it's important to learn about Business
Intelligence and data warehousing. Let's get started.
What is BI?
Business Intelligence is the process of collecting raw business data and turning
it into information that is useful and more meaningful. The raw data is the record of
an organization's daily transactions, such as interactions with customers,
administration of finance, management of employees and so on. This data is then
used for reporting, analysis, data mining, data quality and interpretation, and
predictive analysis.
What is ETL?
ETL stands for Extract-Transform-Load. It is the process by which data is loaded from
the source system into the data warehouse. Data is extracted from an OLTP database,
transformed to match the data warehouse schema and loaded into the data warehouse
database. Many data warehouses also incorporate data from non-OLTP systems such as
text files, legacy systems and spreadsheets.
1. Extract
2. Transform
Cleansing of data: After the data is extracted, it moves into the
next phase, the cleaning and conforming of data. Cleaning handles
omissions in the data and identifies and fixes errors.
Conforming means resolving the conflicts between data that
are incompatible, so that they can be used in an enterprise data
warehouse. In addition, this step creates metadata that
is used to diagnose source system problems and improve data
quality.
3. Load
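The three steps above can be sketched in a few lines of Python. This is only a minimal illustration using the standard-library sqlite3 module: the table names (orders, fact_orders), the column names and the cleaning rules (drop rows with no customer name, trim and upper-case names, round amounts) are assumptions made for the example, not rules taken from any particular ETL tool.

```python
import sqlite3

def extract(source):
    # Extract: read raw rows from the (hypothetical) OLTP table "orders".
    return source.execute("SELECT id, customer, amount FROM orders").fetchall()

def transform(rows):
    # Transform: clean and conform rows to the (hypothetical) warehouse schema.
    cleaned = []
    for order_id, customer, amount in rows:
        if customer is None or not customer.strip():
            continue  # assumed rule: skip rows without a usable customer name
        cleaned.append((order_id, customer.strip().upper(), round(float(amount), 2)))
    return cleaned

def load(warehouse, rows):
    # Load: write the conformed rows into the warehouse table "fact_orders".
    warehouse.executemany(
        "INSERT INTO fact_orders (order_id, customer_name, amount) VALUES (?, ?, ?)",
        rows,
    )
    warehouse.commit()

def run_etl_demo():
    # In-memory databases stand in for the source system and the warehouse.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount TEXT)")
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, " alice ", "10.50"), (2, None, "3.00"), (3, "Bob", "7.256")],
    )
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_orders (order_id INTEGER, customer_name TEXT, amount REAL)"
    )
    load(warehouse, transform(extract(source)))
    return warehouse.execute(
        "SELECT order_id, customer_name, amount FROM fact_orders ORDER BY order_id"
    ).fetchall()
```

Calling run_etl_demo() returns the rows that reached the warehouse, so a tester can compare them with what the transformation rules say should arrive.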
2. Data acquisition
5. Build Reports
Types Of Testing
Testing Process
Table balancing or production reconciliation: this
type of ETL testing is done on data as it is being
moved into production systems. To support your
business decision, the data in your production
Application Upgrades
Metadata Testing
GUI/Navigation Testing
While performing ETL testing, an ETL tester will always use two documents:
1. ETL mapping sheets: An ETL mapping sheet contains all the information
about the source and destination tables, including each and every column and
its look-up in reference tables. ETL testers need to be comfortable
with SQL queries, as ETL testing may involve writing big queries with
multiple joins to validate data at any stage of ETL. ETL mapping sheets
provide significant help when writing queries for data verification.
2. DB schema of source and target: It should be kept handy to verify any
detail in the mapping sheets.
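As an illustration of the kind of multi-join query a mapping sheet leads to, the sketch below checks one hypothetical mapping rule: the target column tgt_customer.country_name should equal the reference-table look-up of src_customer.country_code. All table and column names are invented for the example, and the query uses SQLite's null-safe IS NOT comparison, which other databases may spell differently.

```python
import sqlite3

def build_mapping_demo_db():
    # Hypothetical source, reference and target tables with sample rows.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE src_customer (cust_id INTEGER, country_code TEXT);
        CREATE TABLE ref_country (country_code TEXT, country_name TEXT);
        CREATE TABLE tgt_customer (customer_key INTEGER, country_name TEXT);
        INSERT INTO src_customer VALUES (1, 'DE'), (2, 'FR'), (3, 'US');
        INSERT INTO ref_country VALUES
            ('DE', 'Germany'), ('FR', 'France'), ('US', 'United States');
        INSERT INTO tgt_customer VALUES (1, 'Germany'), (2, 'Spain');
    """)
    return db

# Join source to its look-up, then to the target, and keep only the rows
# where the loaded value differs from the expected one (or is missing).
MISMATCH_QUERY = """
    SELECT s.cust_id, r.country_name AS expected, t.country_name AS actual
    FROM src_customer AS s
    JOIN ref_country AS r ON r.country_code = s.country_code
    LEFT JOIN tgt_customer AS t ON t.customer_key = s.cust_id
    WHERE t.country_name IS NOT r.country_name
    ORDER BY s.cust_id
"""

def find_mapping_mismatches(db):
    # Returns (cust_id, expected, actual) for every row that breaks the rule.
    return db.execute(MISMATCH_QUERY).fetchall()
```

An empty result means the rule holds for every source row. In the sample data, customer 2 was loaded with the wrong country and customer 3 was never loaded, so both are reported.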
Test Cases
Verify mapping doc whether corresponding ETL
Validation
Constraint Validation
as expected
Completeness Issues
column of target tables
recorded
Correctness Issues
Transformation
Data Quality
Null Validate
Precision Check
Data check
Null check
Duplicate Check
Date Validation
Data Cleanness
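Three of the scenarios named above (null check, duplicate check and date validation) lend themselves to small automated checks. The following is a rough sketch, assuming a hypothetical target table dim_employee with columns emp_id and hire_date and an assumed date format of YYYY-MM-DD. Table and column names are interpolated directly into the SQL, so they are expected to come from the tester's own configuration and never from untrusted input.

```python
import sqlite3
from datetime import datetime

def null_count(db, table, column):
    # Null check: how many rows have no value in the given column?
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return db.execute(query).fetchone()[0]

def duplicate_values(db, table, column):
    # Duplicate check: which values occur more than once in the column?
    query = (
        f"SELECT {column} FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return [row[0] for row in db.execute(query)]

def invalid_dates(db, table, column, fmt="%Y-%m-%d"):
    # Date validation: which non-null text values do not parse as real dates?
    query = f"SELECT {column} FROM {table} WHERE {column} IS NOT NULL"
    bad = []
    for (value,) in db.execute(query):
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            bad.append(value)
    return bad

def build_quality_demo_db():
    # Hypothetical target table with one null, one duplicate key, one bad date.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE dim_employee (emp_id INTEGER, hire_date TEXT)")
    db.executemany(
        "INSERT INTO dim_employee VALUES (?, ?)",
        [(1, "2020-01-15"), (2, None), (2, "2021-02-30"), (3, "2019-07-01")],
    )
    return db
```

Each function returns plain values, so the checks can be wired into whatever assertion style the test framework in use prefers.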
Type of Bugs
Description
User interface bugs/cosmetic bugs
Mathematical errors
Equivalence Class Partitioning (ECP) related bug
Input/Output bugs
Calculation bugs
Load Condition bugs
No logo matching
H/W bugs
Verifies whether data is moved as expected, following the rules/standards defined in
the Data Model
Verifies whether counts in the source and target are matching
Target table loading from stage file or table after applying a transformation.
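The count comparison between source and target described above can be automated with two COUNT(*) queries. This is a minimal sketch: the table names stage_sales and fact_sales are hypothetical, and both tables are assumed to be reachable through the same database connection, which will often not be the case in practice.

```python
import sqlite3

def count_rows(db, table):
    # Table names are assumed to come from trusted test configuration.
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

def reconcile_counts(db, source_table, target_table):
    # Compare row counts and report both numbers along with the verdict.
    source_count = count_rows(db, source_table)
    target_count = count_rows(db, target_table)
    return {
        "source": source_count,
        "target": target_count,
        "match": source_count == target_count,
    }

def build_count_demo_db():
    # Hypothetical stage and target tables; one row is missing from the target.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE stage_sales (sale_id INTEGER, amount REAL);
        CREATE TABLE fact_sales (sale_id INTEGER, amount REAL);
        INSERT INTO stage_sales VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        INSERT INTO fact_sales VALUES (1, 10.0), (2, 20.0);
    """)
    return db
```

Matching counts do not prove the data is correct, since a dropped row and a duplicated row cancel out, so a check like this is usually paired with column-level comparisons.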
Create, design and execute test cases, test plans and test harness
2. Projected data should be loaded into the data warehouse without any data loss
or truncation.
3. Ensure that the ETL application appropriately rejects invalid data, replaces
it with default values and reports it.
4. Ensure that data is loaded into the data warehouse within the prescribed and
expected time frames, to confirm scalability and performance.
6. To measure their effectiveness, all unit tests should use appropriate coverage
techniques.
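The point about rejecting invalid data, replacing values with defaults and reporting the problem can be exercised against a small transformation function. The sketch below is hypothetical throughout: the rule that a missing id causes rejection, the rule that a missing country is replaced by a default, and the default value itself are assumptions chosen only to show the shape of such a test.

```python
DEFAULT_COUNTRY = "UNKNOWN"  # hypothetical default value for a missing country

def apply_load_rules(records):
    # Split incoming records into rows to load and rejected rows with a reason.
    loaded, rejected = [], []
    for record in records:
        if record.get("id") is None:
            # Assumed rule: a record without a key cannot be loaded at all.
            rejected.append((record, "missing id"))
            continue
        # Assumed rule: a missing or empty country is replaced by the default.
        country = record.get("country") or DEFAULT_COUNTRY
        loaded.append({"id": record["id"], "country": country})
    return loaded, rejected

def sample_records():
    # One valid record, one needing a default, one that must be rejected.
    return [
        {"id": 1, "country": "DE"},
        {"id": 2, "country": None},
        {"id": None, "country": "FR"},
    ]
```

A test would then assert on both return values: the loaded rows should contain the default where expected, and the rejected list should name every invalid record together with its reason.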