Web Content: With The SQL Server 2008R2 Platform
Chapter 2 – Example of Type 1 and Type 2 attribute change tracking techniques
The best way to understand the concepts of Type 1 and Type 2 change
tracking, and the substantial impact of accurately tracking changes, is with an
example. As we see in the Data Mining chapter, Adventure Works Cycles collects
demographic information from its Internet customers. These attributes, such as
gender, homeowner status, education, and commute distance, all go into the
customer dimension. Many of these attributes will change over time. If a customer
moves, her commute distance will change. If a customer bought a bike when he
went to college, his education status will change when he graduates. These
attributes all certainly qualify as slowly changing, but should you treat them as
Type 1 or Type 2 attributes?
If you’ve been paying attention, you know the correct answer is to ask the
business users how they’ll use this information. You already know from the
requirements gathering information in Chapter 1 that Marketing intends to use
some data mining techniques to look for demographic attributes that predict
buying behaviors. In particular, they want to know if certain attributes are
predictive of higher-than-average buying. They could use those attributes to create
targeted marketing campaigns. Let’s examine some example customer-level data
to see how decisions on handling attribute changes can affect the information you
provide your users.
Table 2.1 shows the row in the customer dimension for a customer named
Jane Rider as of January 1, 2011. Notice that the customer dimension includes the
business key from the transaction system along with the attributes that describe
Jane Rider. The business key allows users to tie back to the transaction system if
need be.
Table 2.2 shows some example rows from an abridged version of the AWC
Orders fact table in the data warehouse database. We have used surrogate keys for
Table 2.2: Example Order Fact Table Rows for Jane Rider as of Feb 22, 2011
Date        Customer_Key  Product_Key  Item_Count  Dollar_Amount
1/7/2009    1552          95           1           1,798.00
3/2/2009    1552          37           1           27.95
5/7/2010    1552          87           2           320.26
8/21/2010   1552          33           2           129.99
2/21/2011   1552          42           1           19.95
Table 2.3: Example Customer Dimension Table Row as of Jan 2, 2011 with Type 2 Change Tracking
Customer_Key  BKCustomer_Id  Customer_Name  Commute_Distance  Gender  Home_Owner_Flag  Eff_Date  End_Date
1552          31421          Jane Rider     3                 Female  No               1/7/2009  1/1/2011
2387          31421          Jane Rider     31                Female  Yes              1/2/2011  12/31/9999
The customer dimension table has been augmented to help manage the
Type 2 process. Two columns have been added to indicate the effective date and
end date of each row. This is how you can tell exactly which row was in effect at
any given time.
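The effective-date lookup described above can be sketched in a few lines of Python. This is only an illustration using the two rows from Table 2.3. The list-of-dictionaries layout and the function name row_in_effect are assumptions made for the example, not part of the Adventure Works schema.

```python
from datetime import date

# The two Jane Rider rows from Table 2.3 (business key 31421).
# The dictionary layout is an assumption made for this illustration.
customer_dim = [
    {"customer_key": 1552, "bk_customer_id": 31421, "commute_distance": 3,
     "home_owner_flag": "No",
     "eff_date": date(2009, 1, 7), "end_date": date(2011, 1, 1)},
    {"customer_key": 2387, "bk_customer_id": 31421, "commute_distance": 31,
     "home_owner_flag": "Yes",
     "eff_date": date(2011, 1, 2), "end_date": date(9999, 12, 31)},
]


def row_in_effect(rows, bk_customer_id, as_of):
    """Return the row whose Eff_Date..End_Date range (inclusive) covers as_of.

    Returns None when no row for that business key was in effect on as_of.
    """
    for row in rows:
        if (row["bk_customer_id"] == bk_customer_id
                and row["eff_date"] <= as_of <= row["end_date"]):
            return row
    return None


# Which version of Jane Rider was current on two different dates?
before_change = row_in_effect(customer_dim, 31421, date(2010, 5, 7))
after_change = row_in_effect(customer_dim, 31421, date(2011, 2, 21))
print(before_change["customer_key"], after_change["customer_key"])
```

Because the two date ranges do not overlap, at most one row per business key can match any given date.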
As Table 2.4 shows, the fact table has to change as well because it contains a
row for an order placed after the Type 2 change took effect. The last row of the
fact table needs to join to the dimension row that was in effect when the order
occurred on February 21, 2011. This is the new dimension row with
Customer_Key = 2387.
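One way to picture how each order picks up the right surrogate key is a join on the business key combined with a date-range test. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the SQL is generic rather than T-SQL. The incoming_orders staging table is hypothetical, and dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (
    customer_key   INTEGER PRIMARY KEY,
    bk_customer_id INTEGER,
    eff_date       TEXT,   -- ISO yyyy-mm-dd so text comparison follows date order
    end_date       TEXT
);
INSERT INTO customer_dim VALUES
    (1552, 31421, '2009-01-07', '2011-01-01'),
    (2387, 31421, '2011-01-02', '9999-12-31');

-- Hypothetical staging table: orders arrive with the business key only.
CREATE TABLE incoming_orders (
    order_date     TEXT,
    bk_customer_id INTEGER,
    product_key    INTEGER
);
INSERT INTO incoming_orders VALUES
    ('2010-08-21', 31421, 33),
    ('2011-02-21', 31421, 42);
""")

# For each order, find the dimension row in effect on the order date.
keyed_orders = conn.execute("""
SELECT o.order_date, d.customer_key, o.product_key
FROM incoming_orders AS o
JOIN customer_dim AS d
  ON d.bk_customer_id = o.bk_customer_id
 AND o.order_date BETWEEN d.eff_date AND d.end_date
ORDER BY o.order_date
""").fetchall()

for order_date, customer_key, product_key in keyed_orders:
    print(order_date, customer_key, product_key)
```

The August 2010 order resolves to Customer_Key 1552 and the February 2011 order to 2387, which matches the date ranges in Table 2.3.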
Table 2.4: Updated Order Fact Table Rows for Jane Rider as of Feb 22, 2011
Date        Customer_Key  Product_Key  Item_Count  Dollar_Amount
1/7/2009    1552          95           1           1,798.00
3/2/2009    1552          37           1           27.95
5/7/2010    1552          87           2           320.26