DW Design and Data Model Example
D:\D\Documents\DW Me\0 Publish\DW Design and Data Model Example.docx February 13, 2018
Matthew Lawler [email protected] DW Design and Data Model Example
INTRODUCTION
CONTEXT
SOURCE DESIGN
SAL DESIGN
CAL DESIGN
EUL DESIGN
OBIEE DESIGN
TESTING
Introduction
Licence
As this is a generic software document, it is covered by the 'Creative Commons Zero v1.0
Universal' (CC0) licence.
Warranty
The author does not make any warranty, express or implied, that any statements in this document
are free of error, or are consistent with any particular standard of merchantability, or that they will
meet the requirements for any particular application or environment. They should not be relied on for solving
a problem whose incorrect solution could result in injury or loss of property. If you do use this
material in such a manner, it is at your own risk. The author disclaims all liability for direct or
consequential damage resulting from its use.
Purpose
The primary goal of this document is to provide an example of DW design with data models.
Audience
The primary audience of this document is DW designers and data modellers. It is assumed that
the reader is familiar with DW and Kimball standards.
Assumptions
This is only an example, so its applicability to other cases is limited.
Approach
The document is a normal design document. It begins with context and definitions. It then provides
high level requirements traceability, scope, high level design and environments. Then it examines
the source data model and design in detail, as it imposes constraints on the subsequent Kimball
reporting data model. Source data quality was an issue, so it is discussed at length. Another issue
was date casting. After that the SAL or landing zone design is covered, where most of the data
quality issues were resolved. Then the CAL or Inmon zone design is covered. Finally, the EUL or
reporting zone design is discussed. The EUL data model is a standard Kimball fact dimensional
model. The end user tool here was OBIEE, so its design and impact on the EUL is also covered.
Finally, there is a discussion of testing design.
Documents
Normally there would be a list of documents, covering requirements, other design documents,
mapping spreadsheets, Visio data models, and resultant artefacts such as DDL, SQL, etc.
Tags
Business Intelligence ; Data Design ; Data Load ; Data Mapping ; Data Model ; Data Transformation ;
Data Vault ; Data Warehouse ; Database ; Database Design ; Design Pattern ; DW Appliance ; Extract
Load Transform - ELT ; Extract Transform Load - ETL ; Fact / Dimension ; Hierarchy ; Inmon ; Kimball ;
Massive Parallel Processing - MPP ; Master Data Management ; Metadata ; Netezza ; Oracle ; SQL ;
Standards ; Teradata ; Data Architect ; Data Architecture ;
Context
Source System Context Diagram
This shows the flow of data and major components in the SCEM/BAM source system.
[Figure: Source System Context Diagram. Components shown: Supply Co-ordination Centre, Supply Chain
Event Management, Supplier, 3PL, and the Oracle Inventory, Purchasing, Account Payables, Quality and
Order Management modules. Labelled flows include: SOW File, 1. Oracle Inventory Events, 2. Oracle
Purchasing Events, 3. Oracle AP Events, 4. Returns, 5. Oracle OM Events, 6. 3PL Upload, 9. Alert
Notifications, 13. Tickets, 16. Tickets.]
This is meant to highlight the key terms that are used in the BAM system.
Term Definition
Alert This represents the lowest level of violation. It is displayed as either ‘Medium’ severity
or as an ‘Amber’ Traffic Light. An alert will not raise a Remedy ticket. An Alert can exist
without an Exception, but not vice versa. The business prefers to use the term Alert as
the common or abstract term to cover both Alert and Exception. This will be
implemented in the data model. It is assumed that a green or unalerted event is simply
one without any alert.
BAM BAM is a module of the SCEM package. For the sake of simplicity, the source system
will be referred to as BAM within this document, as this is the actual data source.
EMEM The application name is EMEM or Event Monitoring and Exception Management. This
Acronyms
These are standard acronyms in the BAM system. Some are Supply acronyms, and some are IT
acronyms. Many are used in column names.
Technical Requirements
These requirements are provided here, as they were not in the documents provided by the end
users. They are incorporated into the final requirements document.
Scope
Element Item
Deletes The BAM data is very ephemeral, and is deleted when the alerts are resolved. The
users need to retain this data, which is the central reason for the UDS load.
Granularity There is no need to sum event attribute quantities (for example, PO
Quantity). Summations will be based on event types and alert types only.
Frequency Daily loads are sufficient. There is a weekly reporting cycle.
Join All reporting will be supported from BAM + Remedy data, joined by the BAM Remedy
ticket # and Remedy ticket #.
OOS Oracle Quality Schema as a source. Instead, there is some Tolerance data already
within BAM, and this will be made available to the users.
OOS SCEM as a source. This was formerly the UDS source, but it was decided to use BAM as
the UDS source instead, as BAM data was in table form, whereas SCEM uses a more
complex internal structure.
Source BAM for events and alerts
Source EBS for Demand Planning Category
Source Remedy for Ticket info
Scope: Drops
The project was implemented using an agile approach with multiple drops.
Drop Table
1 BIA_BA_EUL.D_BAM_ALERT_TYPE
1 BIA_BA_EUL.D_BAM_EVENT_DATA_EP2P
1 BIA_BA_EUL.D_BAM_EVENT_DATA_IP2P
1 BIA_BA_EUL.D_BAM_EVENT_DATA_MO
1 BIA_BA_EUL.D_BAM_EVENT_DATA_MR
1 BIA_BA_EUL.F_BAM_ALERT
1.5 BIA_BA_EUL.D_BAM_EVENT_PARTY_CUST
1.5 BIA_BA_EUL.D_BAM_EVENT_PARTY_SUPPLIER
1.5 BIA_BA_EUL.D_BAM_EVENT_TYPE
2.1 BIA_BA_EUL.D_BAM_DP_CATEGORY
2.1 BIA_BA_EUL.D_BAM_EVENT_DATA_REMEDY
2.1 BIA_BA_EUL.D_BAM_REMEDY_TICKET
Environments
Database Env Type Purpose Database Schema
Oracle Test Source EBS04EE SOA_ORABAM
Oracle Test UDS Load EBS04EE EIM_IS_BAM
Oracle Prod Source EBS01PR SOA_ORABAM
Oracle Prod UDS Load EBS01PR EIM_IS_BAM
Netezza Dev SAL BIA01SB BIA_SA_SCEM
Netezza Dev EUL BIA01SB BIA_BA_EUL
Netezza Test SAL BIA02SB BIA_SA_SCEM
Netezza Test EUL BIA02SB BIA_BA_EUL
Netezza Prod SAL BIA01IT BIA_SA_SCEM
Netezza Prod EUL BIA01IT BIA_BA_EUL
Source Design
All the BAM tables come from the EBS02PR database, and the schema SOA_ORABAM.
10 tables will be used for EUL. These contain 626 columns. 6 of these tables have referential
integrity issues. 2 of these tables are empty, but are included as they are part of the scope.
The Visio file has more detailed A3 versions of the data models, if required.
[Figure: EBS02PR SOA_ORABAM, CAL Data Model, with Table and Column names. By Matthew Lawler.]
[Figure: EBS02PR, BAM Subject Area, with Table and Key names only. By Matthew Lawler.]
Remedy
The Remedy tables will be the source for a Remedy ticket dimension. They will also be a source for a
Remedy alert fact that can be unioned with the other alert fact source tables. The Remedy alert fact
will be filtered based on Supply Tenancy. It also has an ‘EMEM’ string as a prefix of the name.
DP Category
DP Category or Demand Plan Category is a 3 level hierarchy contained within an EBS Item Master
Flex field. This data is contained in a single source column, using a full stop as a delimiter. It is keyed
by Item Number, which will need to be on the fact table. The flex field data will be pivoted out into
4 columns, for Item Number, and the 3 DP Category levels. The correct organisation level for the flex
field is ‘MASTER’.
The correct source for this was the tables EBS_MTL_ITEM_CATEGORIES, EBS_MTL_CATEGORIES_B
and EBS_MTL_SYSTEM_ITEMS_B. This could not be determined until we had received snapshots of
the actual screens used to define this code. Once this was available, the internal EBS design was
discoverable. See the appendix for these snapshots.
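The pivot of the flex field into 4 columns can be sketched as follows. This is a minimal illustration only, assuming the flex field arrives as a single string such as 'A.B.C'. The function name, the output column names and the sample values are hypothetical and are not taken from the deliverable.

```python
from typing import NamedTuple, Optional


class DpCategoryRow(NamedTuple):
    # Hypothetical output columns: the item key plus the 3 DP Category levels.
    item_number: str
    dp_level_1: Optional[str]
    dp_level_2: Optional[str]
    dp_level_3: Optional[str]


def pivot_dp_category(item_number: str, flex_value: Optional[str]) -> DpCategoryRow:
    """Split a full-stop-delimited flex field into three hierarchy levels."""
    parts = flex_value.split(".") if flex_value else []
    # Pad short values with None and ignore anything beyond the third level.
    levels = (parts + [None, None, None])[:3]
    return DpCategoryRow(item_number, *levels)


# Made-up sample values, for illustration only.
print(pivot_dp_category("ITEM-1", "A.B.C"))
```

Missing levels are left empty here; in the EUL they would need a default value, since OBIEE expects no nulls.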
Tolerance
This tolerance data is available in the Alert type dimension, sourced from BAM directly.
Date Casting
The sources used different ways to represent dates. BAM had taken actual dates from EBS and cast
them as strings into VARCHARs, so these needed to be transformed back to be usable by reporting
users. Remedy used UNIX time (Epoch) to store dates. These cast functions were applied wherever
needed. 52 BAM columns and 25 Remedy columns were cast. No EBS date columns were
required in the solution. Note that the pattern handles null values as well. This works in Netezza.
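The Netezza cast pattern itself is not reproduced in this text. The logic it describes can be sketched in Python as below. This is an illustration under stated assumptions: the BAM string layout shown is assumed (the real format is not given here), Remedy values are assumed to be whole seconds since 1970-01-01 UTC, and both function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed string layout; the real BAM format is not given in this document.
BAM_DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def cast_bam_date(value: Optional[str]) -> Optional[datetime]:
    """Cast a date held as text back to a datetime, passing nulls through."""
    if value is None or value.strip() == "":
        return None
    return datetime.strptime(value.strip(), BAM_DATE_FORMAT)


def cast_remedy_epoch(seconds: Optional[int]) -> Optional[datetime]:
    """Cast UNIX epoch seconds to a UTC datetime, passing nulls through."""
    if seconds is None:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Both functions return None for a null input rather than raising an error, which mirrors the null handling noted above.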
Using IS_DELETED_YN
This became an issue that needed to be clarified. The IS_DELETED_YN flag must be used differently
depending on the source. The IS_DELETED_YN flag is used as a redundant flag to indicate
TRANSACTION_TYPE_C = 'D'.
SAL Design
SAL Design Overview
Question Answer
Performance The ETL that loaded the SAL tables does not use distribution keys. This means that
there may be a performance hit, but hopefully the low volumes will mean that this
will not be an issue. Nonetheless the distribution keys are known, and can be
applied later if ETL remediation is required. Performance issues were encountered,
but these were resolved in the cleansing and EUL.
DP Category The ITEM_NUMBER_ID will be the key to join to this dimension.
Remedy The Remedy ticket numbers are stored in the four Event Alert Remedy Link tables.
They will join to INCIDENT_NUMBER_ID in the SAL Remedy table.
Transparency As the SAL tables are one to one mappings from the BAM source tables, refer to the
source metadata as this will be the same for SAL tables.
True Keys The SAL model shows the source tables and their true keys.
Referential integrity The BAM source system does not enforce Referential Integrity (RI) well, as is
evident from the production data. This will be resolved by cleansing the data,
which will discard duplicate data.
Time Variance The ETL that loaded the SAL tables does not use the true keys. This means that
they are in effect snapshot tables, and that time variant views will be needed to
present keys and values correctly to the EUL layer.
SCD1 Invariant (I_*) tables will be used to filter out all but the most recent data, which
will be used for SCD1 dimensions.
Data Quality To resolve the many data quality issues, data quality tables called Nub (N_*) tables
will be created. These will de-duplicate the data, add in default values, cast dates
correctly, change column lengths correctly and discard boilerplate columns.
Surrogate Keys Surrogate keys for the Facts and Dimensions will be generated using 2 approaches.
Firstly, for small dimensions, the keys will use the hash function. For large
dimensions, surrogate key lookup (N_*_SK and N_*_DK) tables will be created.
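The Nub (N_*) cleansing described in the Data Quality row can be sketched as below. This is a hedged illustration, not the delivered procedure. It assumes that duplicates are resolved by keeping the row with the latest load timestamp per true key, which the document does not state. The function name, column names and sample rows are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def nub(rows: Iterable[Row], key_cols: Tuple[str, ...], order_col: str,
        defaults: Dict[str, object]) -> List[Row]:
    """De-duplicate rows on the true key, then fill nulls with defaults."""
    latest: Dict[tuple, Row] = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        current = latest.get(key)
        # Assumption: the most recently loaded row wins.
        if current is None or row[order_col] >= current[order_col]:
            latest[key] = row
    cleaned = []
    for row in latest.values():
        # Replace a null only where a default is defined for that column.
        filled = {col: (defaults.get(col) if val is None else val)
                  for col, val in row.items()}
        cleaned.append(filled)
    return cleaned


# Made-up sample rows, for illustration only.
sample = [
    {"EVENT_ID": 1, "LOAD_TS": 1, "SUPPLIER": None},
    {"EVENT_ID": 1, "LOAD_TS": 2, "SUPPLIER": None},
    {"EVENT_ID": 2, "LOAD_TS": 1, "SUPPLIER": "ACME"},
]
print(nub(sample, ("EVENT_ID",), "LOAD_TS", {"SUPPLIER": "N/A"}))
```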
[Figure: BIA01SB BIA_SA_SCEM, SAL Data Model, with Table and Key names. By Matthew Lawler.]
The model above shows the expanded keys in the link tables, discovered during later profiling.
These are additional helper tables added to the SAL to eliminate data quality issues.
Remedy
Remedy based dimensions, D_BAM_REMEDY_TICKET and D_BAM_EVENT_DATA_REMEDY are based
on the source joins below. All the columns in RMDY_T2115_BASE are included in the
D_BAM_REMEDY_TICKET dimension. The RMDY_FILED_ENUM_VALUEs is used to expand 6 code
columns. The RMDY_T2115__C_C1000000151 is used to provide the ticket description information.
DP Category
The correct sources for this are the tables EBS_MTL_SYSTEM_ITEMS_B, EBS_MTL_ITEM_CATEGORIES
and EBS_MTL_CATEGORIES_B. The original join within EBS is below. The UDS version is contained
within the final deliverable.
ON MIC.CATEGORY_ID = MCB.CATEGORY_ID
ON MIC.INVENTORY_ITEM_ID = MSIB.INVENTORY_ITEM_ID
AND MIC.ORGANIZATION_ID = 86
As Datastage captures SQL errors, a hard error (255 error) is needed from any PROC. Otherwise, it
will not detect the error and it will not fail the job. Therefore, remove the clause “EXCEPTION WHEN
OTHERS THEN” from the PROC, and allow all errors. This also works in practice when testing, as
Aginity handles the SQL error as well.
Implementation Script
No Description
1 Log in as BIA_SA_SCEM, and Select BIA01IT database
2 Create the following Procedure: BIA_BA_EUL.Z_DROP_OBJECT_IF_EXISTS_P.SQL (In Aginity,
select Query, and then Execute as a Single Batch, for each file. )
3 Create the following Procedure: BIA_SA_SCEM.F_BAM_ALERT_P.SQL (In Aginity, select Query,
and then Execute as a Single Batch, for each file.)
4 Execute the deploy script called 01_DEPLOY.BIA01IT.BIA_SA_SCEM.F_BAM_ALERT_P.SQL
5 Execute the post load proc by CALL BIA_SA_SCEM.F_BAM_ALERT_P()
6 If there are unresolvable issues, then the rollback script is
02_ROLLBACK.BIA01IT.BIA_SA_SCEM.F_BAM_ALERT_P.SQL
If Datastage fails
In this case, refer to the standard Datastage troubleshooting procedure.
In all cases, rerun the F_BAM_ALERT_P() PROC manually, at least once. Note that this PROC is
completely rerunnable, so it can be rerun as many times as required.
The PROC normally returns a 'F_BAM_ALERT_P SUCCEEDED' output text string that will contain start
and stop times for each inserted table.
If the following text string 'F_BAM_ALERT_P SUCCEEDED' DOES NOT appear at the end of the return
value, then there has been an error in this PROC.
The last successful command will be at the end of the file, and indicated by the following text string
'F_BAM_ALERT_P executed for: ' <tableName>. The next command will be the one that failed.
Search the PROC to determine which table load the PROC failed at. Determine if there is some
environmental cause for this error. If so, fix the error, and then rerun the PROC.
If the cause of the error is unclear, and rerunning does not return a successful outcome, then
capture the output text string and forward it to the development team for resolution.
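The check described above can be sketched as a small helper. This is illustrative only. The two marker strings are taken from the text above, but the exact layout of the output (table names separated by whitespace) is an assumption, and the function name and sample table names are hypothetical.

```python
from typing import Optional, Tuple

SUCCESS_MARKER = "F_BAM_ALERT_P SUCCEEDED"
STEP_MARKER = "F_BAM_ALERT_P executed for: "


def check_proc_output(output: str) -> Tuple[bool, Optional[str]]:
    """Return (succeeded, last table reported as executed)."""
    succeeded = SUCCESS_MARKER in output
    last_table = None
    position = output.rfind(STEP_MARKER)
    if position != -1:
        # Assumption: the table name is the next whitespace-delimited token.
        remainder = output[position + len(STEP_MARKER):].split()
        last_table = remainder[0] if remainder else None
    return succeeded, last_table


# Made-up output text, for illustration only.
failed_run = ("F_BAM_ALERT_P executed for: N_BAM_EVENT "
              "F_BAM_ALERT_P executed for: I_BAM_EVENT")
print(check_proc_output(failed_run))
```

When the first element is False, the second element names the last successful table load, so the failing command is the one that follows it in the PROC.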
CAL Design
CAL Design Overview
The CAL was not directly needed as there were no cross schema integration requirements. It was
cleaner to design all joins within a single SAL schema, before exposing the results into the EUL.
These design patterns were tested in the build phase. The goal was to optimise for the end
reporting user, using the integrated capability of the Netezza and OBIEE environments. They are
recorded as design decisions here, in order to make clear the choices and considerations behind
each decision.
Decision: Physical tables are preferable to views for performance reasons, except for small tables (<
50,000 rows), in a Netezza database
negative Extra work needed. Unable to rerun scripts or procedures when dropping and
creating tables. Unlike Oracle, Netezza generates an error when a DROP is used on a
table that does not exist.
To cover this kind of missing Netezza database infrastructure, there needs to be a common script
available and visible across all Netezza instances.
Decision: Use separate composable steps for large dimensions, and use the integrated for simpler,
smaller cases
So the pattern is to use a series of composable steps that convert the raw SAL data into a form that
can be used for reporting.
3. Filter out non-current rows to give Invariant data K_* -> I_*
4. Move data into EUL as Facts or Dims I_* -> F_*, D_*
Extract SAL source data into Nub Tables (N_*) in order to discard boilerplate columns, de-duplicate
rows, add default values (e.g. N/A for nulls), convert types (e.g. Text -> Dates), fix column lengths,
etc. Note that this step does not include name changes.
Add distribution Key to Nubbed data, which is critical for adequate Netezza performance. Note that
this step does not include name changes.
Filter out non-Current rows, to provide Invariant or SCD1 data. Note that this step can also be used
to provide Time Variant data.
Some conversion of raw timestamps into time periods (Effective from and Effective to) can be done
here.
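The filtering step can be sketched as below, showing both the Invariant (latest row only) and the Time Variant (Effective from and Effective to) outcomes. This is a simplified illustration, assuming each snapshot row carries a business key, a load timestamp and a value. The names and sample data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (business_key, load_timestamp, value)
Snapshot = Tuple[str, int, str]


def invariant_rows(snapshots: List[Snapshot]) -> Dict[str, Snapshot]:
    """Keep only the most recent snapshot per business key (SCD1 style)."""
    latest: Dict[str, Snapshot] = {}
    for snap in snapshots:
        key, ts, _ = snap
        if key not in latest or ts > latest[key][1]:
            latest[key] = snap
    return latest


def time_variant_rows(
        snapshots: List[Snapshot]) -> List[Tuple[str, int, Optional[int], str]]:
    """Turn raw timestamps into (key, effective_from, effective_to, value)."""
    ordered = sorted(snapshots, key=lambda s: (s[0], s[1]))
    result = []
    for i, (key, ts, value) in enumerate(ordered):
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        # A row is effective until the next row for the same key, if any.
        effective_to = nxt[1] if nxt is not None and nxt[0] == key else None
        result.append((key, ts, effective_to, value))
    return result


# Made-up sample snapshots, for illustration only.
snaps = [("PO-1", 10, "OPEN"), ("PO-1", 20, "CLOSED"), ("PO-2", 15, "OPEN")]
print(invariant_rows(snaps))
print(time_variant_rows(snaps))
```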
[Figure: SAL Tables; I_* (Invariant), T_* (Time Variant).]
EUL Design
EUL Design Overview
Question Answer
All Any tables starting with D_* are dimensions, and with F_* are facts. All tables are
derived from the SAL tables already defined, unless otherwise stated. Note that the
SAL snapshot tables are not used in the EUL model. They should still be retained in SAL
if the business needs to later access these tables.
Dimension Dimension for Date is D_DATE. The current EUL D_DATE dimension will be used.
Dimension Dimension for Demand Plan Product hierarchy will be taken from the EBS SAL.
Dimension Dimension for Event data is based on each of the Snapshot tables.
Dimension Dimension for Event Party is derived from the BAM SAL only, to be able to report on
Suppliers, etc. This is not a conformed dimension, as it is simply extracted from the
BAM SAL tables.
Dimension Dimension for Event types is manually defined as one of the 5 events.
Dimension Dimension for Alert types is derived from the EMEM_REMEDY reference tables.
Dimension Dimension for Remedy will be taken from the Remedy SAL.
Fact The Fact table will be based on a union over all the Event Alert Remedy Link tables.
This will allow integration across all event and alert types, using a single fact table. A
default value will be added for non-applicable events.
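The union described in the Fact row can be sketched as follows. This is a rough illustration only, assuming each link table is available as a list of rows tagged by its event type. The fact column names, the 'N/A' default and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical common column set for the single fact table.
FACT_COLUMNS = ("EVENT_TYPE", "EVENT_ID", "ALERT_TYPE", "REMEDY_TICKET")
DEFAULT_VALUE = "N/A"


def union_link_tables(
        link_tables: Dict[str, Iterable[Dict[str, object]]]) -> List[Dict[str, object]]:
    """Union all link tables into one fact shape, defaulting missing columns."""
    fact = []
    for event_type, rows in link_tables.items():
        for row in rows:
            fact_row = {}
            for col in FACT_COLUMNS:
                value = row.get(col)
                # Columns that do not apply to this event type get the default.
                fact_row[col] = DEFAULT_VALUE if value is None else value
            fact_row["EVENT_TYPE"] = event_type
            fact.append(fact_row)
    return fact


# Made-up sample rows, for illustration only.
tables = {
    "MO": [{"EVENT_ID": 1, "ALERT_TYPE": "LATE"}],
    "MR": [{"EVENT_ID": 2, "ALERT_TYPE": "SHORT", "REMEDY_TICKET": "INC001"}],
}
print(union_link_tables(tables))
```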
[Figure: BIA01SB BIA_BA_EUL, EUL Data Model, with Table and Key names. By Matthew Lawler.
Last Edit: 20/05/2015 By: Matthew Lawler Filename: BAM.EBS02PR.SOA_ORABAM.vsd]
This shows all tables and keys of the final model. The diagram below shows all columns based on
BAM sourced data. The Remedy and DP category columns will be added for drop 2. Primary and
foreign keys are in bold case, all other columns are unbolded.
[Figure: BIA01SB BIA_BA_EUL, EUL Data Model, with Table and Column names. By Matthew Lawler.]
OBIEE Design
OBIEE Design Overview
Question Answer
Kimball OBIEE needs a clearly defined fact table grain. This is provided in the EUL model.
Kimball The business model mapping layer works best with a properly formed Kimball star
schema. All group bys needed will be defined in the CAL and EUL layers.
Security User access can be managed directly in OBIEE.
Single RPD An RPD cannot have unconnected tables. Therefore, this forms a natural way to split
RPD subject area releases. However, in this case, as there is only one fact table, a single
RPD will be deployed. Modified RPDs will be redeployed as a unit.
Key Structure OBIEE requires that the dimensions have a single key. When the true dimension key is
made up of compound keys, these may be concatenated into a single key, using a
unique separator to prevent inadvertent key duplication. Alternately the key can be
created using a lookup table.
Folder size Folders should contain between 10 and 20 columns only. Larger folders impact
usability and comprehensibility. Also, users cannot deselect columns in OBIEE; they can
only select them. Therefore, large dimensions will be organised in sets of
folders that follow some logical pattern.
User Control OBIEE folders contain shared dimensions, and shared facts. The first dimension in the
dimension folder will be D_DATE, followed by all other dimensions.
User Control The presentation layer supports simple WHERE clauses, and simple aggregations.
Data Types Only standard Netezza data types can be used. In particular, do not use BIGINT, as
OBIEE does not recognise this data type. All surrogate keys in OBIEE must be integers.
The standard Netezza data type for this should be NUMERIC (19, 0).
Table counts OBIEE provides some automatic quality checks, including that the row count must equal
the distinct count of the primary key.
SCD2 Use the standard Kimball surrogate key design pattern to represent SCD2. OBIEE does
not allow BETWEEN joins.
No Nulls OBIEE assumes that all columns have values, so null values must be filled with defaults.
Keys While an additional dimension key does not fit OBIEE's standard join pattern, the
additional distribution key join is essential to get acceptable Netezza performance.
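The Key Structure row notes that compound keys may be concatenated using a unique separator. A brief sketch of why the separator matters is given below. The '|' character and the function name are assumptions for illustration, and the key parts are made-up values.

```python
from typing import Sequence

# Assumed separator; any character that cannot occur in a key part will do.
SEPARATOR = "|"


def single_key(parts: Sequence[str], separator: str = SEPARATOR) -> str:
    """Concatenate compound key parts into one key, guarding the separator."""
    for part in parts:
        if separator in part:
            raise ValueError(
                f"key part {part!r} contains the separator {separator!r}")
    return separator.join(parts)


# Without a separator, two different compound keys collapse into one value.
assert "".join(["AB", "C"]) == "".join(["A", "BC"])
# With a separator, they stay distinct.
assert single_key(["AB", "C"]) != single_key(["A", "BC"])
```

Rejecting parts that already contain the separator is a design choice made here so that the concatenated key stays unambiguous.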
Testing
Overview
System Testing (ST) can start when the EUL data is in place. ST will focus on verifying transformations
such as those below. UAT can start when the OBIEE RPD is in place. UAT will focus on the business logic.
Testable statements
A testable statement is some system property that will remain unchanged when some
transformation is applied to some system object.
Deployment Tests
These are the actual invariants or tests performed each time new code was deployed.
The reason is that the BAM munges all rows together and generates a false, synthetic, surrogate key,
completely unrelated to the actual underlying business data. The synthetic key is used to detect
SCD2 in the DAF ETL.
So, whenever EBS data changes are propagated through to BAM, BAM creates a NEW row, and does
not update a current row. With a new key each time, the DAF ETL will never detect an update, and a
TRANSACTION_TYPE_C = ‘U’ cannot occur. Therefore, the only possible TRANSACTION_TYPE_C
values will be 'I' or 'D'. D occurs whenever a BAM row is deleted, which occurs for housekeeping
reasons. This D event has no business value, and is just a consequence of the BAM data
administration. Therefore, all BAM data will now be filtered for 'I' TRANSACTION_TYPE_C only.
Secondly, one of the key tasks of the N_* tables is to reduce the tables, so that the true business key
is actually unique on these tables. So, when generating test rows, it is important not to reuse
the business keys of previous rows.
There are 10 duplicates here, which is to be expected, due to the duplicate rows on the BAM source.
Clearly, this is also test data. So to fix it, the data is filtered onto the N_* table.
This still produced one duplicate, which is an error. The reason is that 2 test case rows
were entered with the same key, but with different values. This situation could not occur in reality,
as each time a new event is detected, it must have a new timestamp by definition. The true keys for
all tables are all defined in the spreadsheet.
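The duplicate check discussed above can be sketched as below. It is a minimal illustration of testing that a true key is unique, which is equivalent to the row count equalling the distinct count of the key. The function name, column names and sample rows are hypothetical, and the actual test queries are not reproduced in this text.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def duplicate_keys(rows: Iterable[Dict[str, object]],
                   key_cols: Tuple[str, ...]) -> List[tuple]:
    """Return every true-key value that occurs on more than one row."""
    counts = Counter(tuple(row[c] for c in key_cols) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Made-up sample rows, for illustration only.
test_rows = [
    {"EVENT_ID": 1, "EVENT_TS": 10},
    {"EVENT_ID": 1, "EVENT_TS": 10},
    {"EVENT_ID": 1, "EVENT_TS": 20},
]
print(duplicate_keys(test_rows, ("EVENT_ID", "EVENT_TS")))
```

An empty result means the key is unique on the table being tested.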