
Data Warehouse Data Design

drill down, drill across, handling time, top-down, bottom-up

Uploaded by

Sweet Emme

BAFWARE: Fundamentals of Data Warehouse

Week 04

IBM Confidential IBM Global Center for Smarter Analytics.


© Copyright IBM Corporation 2013. All rights reserved.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS FOR INFORMATIONAL PURPOSES ONLY. IBM SHALL
NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO,
THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

IBM, the IBM logo, ibm.com, Cognos, SPSS and iLog are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other countries, or both. If these and other IBM
trademarked terms are U.S. registered or common law trademarks owned by IBM at the time this
information was published. Trademarks may also be registered or common law trademarks in other
countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark
information” at http://www.ibm.com/legal/copytrade.html. The IBM logo must not be moved, added to
or altered in any way.

Other company, product, or service names may be trademarks or service marks of others.

IBM Global Center for Smarter Analytics. © 2013 IBM Corporation


Agenda
Module 4: Designing the Data Delivery Database

Purpose of Dimensional Model

Visualization

Data Warehouse Fundamental Themes

Data Marts and Usage

Dimensional Model

Data Warehousing and Business Intelligence Approaches



Readings

Kimball, R. and Ross, M. (2010). The Kimball Group Reader: Relentlessly Practical Tools for Data
Warehousing and Business Intelligence. John Wiley & Sons.
Reeves, L. (2009). A Manager's Guide to Data Warehousing. John Wiley & Sons.
Laberge, R. (2011). The Data Warehouse Mentor: Practical Data Warehouse and Business Intelligence
Insights. McGraw-Hill/Osborne.
Whitney, H. (2013). Data Insights: New Ways to Visualize and Make Sense of Data. Morgan Kaufmann
Publishers.
Ponniah, P. (2010). Data Warehousing Fundamentals for IT Professionals. 2nd edition. John
Wiley & Sons.


Module 4: Designing the Data Delivery Database


Understanding Your Data[2]
● What are the possible values for this data element?
● What is the frequency with which these values are used?
● What percentage of the rows in this table contain data?
● Has this data element ever been populated?
● When was the last time this table was updated?
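The profiling questions above can be answered mechanically once a table has been extracted. The following is a minimal sketch, assuming the table is available as a list of Python dictionaries; the sample rows, the column names, and the `profile_column` helper are hypothetical and only illustrate the idea. The "last updated" question is not covered here because it depends on metadata kept by the source system.

```python
from collections import Counter

# Hypothetical sample rows; None marks a value that was never populated.
rows = [
    {"customer_id": 1, "state": "NY", "fax": None},
    {"customer_id": 2, "state": "CA", "fax": None},
    {"customer_id": 3, "state": "NY", "fax": None},
    {"customer_id": 4, "state": None, "fax": None},
]


def profile_column(rows, column):
    """Summarize one data element: fill rate, possible values, value frequency."""
    values = [row.get(column) for row in rows]
    populated = [v for v in values if v is not None]
    return {
        "percent_populated": 100.0 * len(populated) / len(values) if values else 0.0,
        "ever_populated": bool(populated),
        "possible_values": sorted(set(populated)),
        "frequency": dict(Counter(populated)),
    }


state_profile = profile_column(rows, "state")
fax_profile = profile_column(rows, "fax")
print(state_profile)
print(fax_profile)
```

A column such as `fax` that has never been populated is usually a candidate for exclusion from the dimensional model.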


DW/BI Design Task[1]
● Business Requirements
● Strategic Data Profiling
● Tactical Data Profiling
● Integration and Latency
● Compliance and Security


Data Warehousing and Business Intelligence Approaches



Top-Down Enterprise [3]
Enterprise data with and without a business usage perspective


Bottom-Up [3]
Pure end-user usage perspective


Hybrid [3]
● All data is fully understood and can support any business application, including a data warehouse and business intelligence application
● Uses both of the preceding perspectives (top-down and bottom-up) within the same development project


Essential Steps for the Integrated Enterprise Data Warehouse[1]
● Detailed textual descriptions of entities
● Aggregated groupings
● Constraints posed by business intelligence applications
● Numeric facts represented consistently across subject areas
● International differences in languages, location descriptions, time zones, currencies, and business rules
● Auditing, compliance, authentication, and authorization
● Coordination with industry standards

The Key Boundaries[1]
Boundaries are guidelines, rules, or limits.
● Boundaries with the business users: find the business users, interview them, and interpret what they tell you into specific DW/BI deliverables
● Boundaries with finance: finance should work out the logical and political implications of the cost allocations, and you can quietly implement them
● Boundaries across organizations: executives must establish a corporate culture that sends a very clear message to all the separate departments
● Boundaries with legal: provide adequate security, privacy, archiving, and compliance across the DW/BI system
● Boundaries with IT: be able to rely on other groups within IT for storage


Data Warehouse Process[1]
● Change data capture: performed either on the production source computer or, after extraction, on another machine
● Extraction: the transfer of data from the source system to the DW/BI environment; occurs before or after change data capture
● Data staging: supports all the types of backup
● Archiving: serves as an important step when you are forced to deal with compliance-sensitive data

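Change data capture can be implemented in several ways (log readers, timestamps, triggers, snapshot comparison). The sketch below illustrates only the snapshot-comparison variant, assuming two successive extracts of the same source table are available as lists of dictionaries keyed by an `id` column. The sample data and the `row_hash` and `capture_changes` helpers are hypothetical.

```python
import hashlib


def row_hash(row):
    """Hash the non-key attributes so that changed rows can be detected cheaply."""
    payload = "|".join(f"{k}={row[k]}" for k in sorted(row) if k != "id")
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def capture_changes(previous, current):
    """Compare two extracts keyed by 'id' and classify each row."""
    prev_by_id = {r["id"]: r for r in previous}
    curr_by_id = {r["id"]: r for r in current}
    inserts = [r for i, r in curr_by_id.items() if i not in prev_by_id]
    deletes = [r for i, r in prev_by_id.items() if i not in curr_by_id]
    updates = [
        r for i, r in curr_by_id.items()
        if i in prev_by_id and row_hash(r) != row_hash(prev_by_id[i])
    ]
    return {"inserts": inserts, "updates": updates, "deletes": deletes}


# Hypothetical extracts from two consecutive days.
yesterday = [
    {"id": 1, "name": "Ann", "city": "Manila"},
    {"id": 2, "name": "Ben", "city": "Cebu"},
    {"id": 3, "name": "Cy", "city": "Davao"},
]
today = [
    {"id": 1, "name": "Ann", "city": "Manila"},
    {"id": 2, "name": "Ben", "city": "Makati"},
    {"id": 4, "name": "Di", "city": "Baguio"},
]

changes = capture_changes(yesterday, today)
print(changes)
```

Only the rows returned in `changes` need to flow on to extraction and staging, which limits the volume the DW/BI environment has to process.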
Visualization


Why do we need Visualization?[1]
● Help us understand and manage important data streams that are flowing around us
● Move easily from "the big picture" to the minute details at will
● Allow us to quickly spot errors in the data
● Enable us to perceive things we were not considering or expecting, and help us to better deal with the unexpected
● Have deep simplicity
● Have some level of interactivity and qualities of good collaborators


Why do we need Visualization?[4]
● Help a spectrum of people make better real-life, day-to-day decisions
● Help viewers gain new insights and enjoy exploring data
● Have some flexibility and be able to adapt and adjust to the changing needs and contexts of the user


Descriptive Analytics – Visualization Types[5]


List Reports

Crosstabs

Charts

Graphs



List Reports[6]
A report that shows data in rows and columns


Crosstabs[6]
Show information in a more compact form than in a grouped list
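
To illustrate why a crosstab is more compact than a grouped list, the sketch below pivots list-report rows into a crosstab. The sample data and the `to_crosstab` helper are hypothetical; a reporting tool such as the one cited in [6] would do this internally.

```python
from collections import defaultdict

# Hypothetical list-report rows: (region, quarter, revenue).
list_report = [
    ("East", "Q1", 100), ("East", "Q2", 120),
    ("West", "Q1", 90), ("West", "Q2", 150),
]


def to_crosstab(rows):
    """Pivot (row_label, column_label, value) triples into a nested dict."""
    table = defaultdict(dict)
    for row_label, col_label, value in rows:
        table[row_label][col_label] = table[row_label].get(col_label, 0) + value
    return {label: dict(cells) for label, cells in table.items()}


crosstab = to_crosstab(list_report)
columns = sorted({col for _, col, _ in list_report})

# Print one line per region with one column per quarter.
print("Region".ljust(8) + "".join(c.rjust(6) for c in columns))
for region in sorted(crosstab):
    cells = "".join(str(crosstab[region].get(c, 0)).rjust(6) for c in columns)
    print(region.ljust(8) + cells)
```

Four list rows collapse into two crosstab rows because the quarter moves from a repeated row value to a column heading.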



Charts[6]
Graphical way of presenting data



Graphs
Are used to represent, model and analyze data



Advanced Visualization Techniques[5]
● Chart Manipulation - a user can rotate a chart or dynamically change the chart type to get a clearer view of the results
● Drill Down - the visualization first presents the results at the summary level; the user can then drill down to further levels of detail
● Advanced Interaction - the user simply double-clicks a part of the visualization and then drags and drops representations of data


Data Warehouse Fundamental Themes


Drilling Down[1]
● Probably the most basic capability that a data warehouse needs to support
● Most directly addresses the natural business user need to see more detail in an interesting result
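In SQL terms, drilling down can be read as adding a further grouping column (a row header) to a summary query. The sketch below shows this with Python's built-in `sqlite3` module; the `sales` table, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "Widget", 100.0), ("East", "Gadget", 50.0),
        ("West", "Widget", 70.0), ("West", "Gadget", 30.0),
    ],
)

# Summary level: one row per region.
summary = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Drill down: add 'product' as a further row header to see more detail.
detail = conn.execute(
    "SELECT region, product, SUM(amount) FROM sales "
    "GROUP BY region, product ORDER BY region, product"
).fetchall()

print(summary)
print(detail)
```

The detail rows for each region add up to that region's summary row, which is what lets a user move between the two levels without the numbers disagreeing.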


Drilling Across [1]
● Is at the heart of data warehouse architecture
● Navigating from one report to another report
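One common reading of drilling across is combining the results of separate queries, each against its own fact table, on a dimension value that both share. The sketch below assumes two such result sets have already been produced and are keyed by product name; the data and the `drill_across` helper are hypothetical.

```python
# Hypothetical per-product totals, each obtained from a different fact table.
sales_by_product = {"Widget": 170.0, "Gadget": 80.0}
returns_by_product = {"Widget": 12.0, "Gizmo": 5.0}


def drill_across(*result_sets):
    """Merge result sets on their shared row header, keeping unmatched rows."""
    keys = sorted(set().union(*result_sets))
    return {key: tuple(rs.get(key) for rs in result_sets) for key in keys}


combined = drill_across(sales_by_product, returns_by_product)
for product, (sales, returns) in combined.items():
    print(product, sales, returns)
```

The merge only makes sense when both result sets label products the same way, which is why consistently defined dimensions matter for this kind of navigation.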


Handling Time [1]
● Support the natural ways people have of viewing data over time
● These include seeing instantaneous events, regular periodic reports, and latest status
● The data warehouse shall preserve history


Data Marts and Usage


Data Marts and Usage[3]
Business Information Aspects:
Depending on your field of study, you may have studied these fields
● Information on the final reports
● Performance or speed of obtaining the report information
● Underlying data mart designs with dimensions and measures
● Volumetrics
● Indexes
● Data partitions


Data Marts and Usage[3]
Business Information Aspects:
● Loading of data
● Quality of data
● Source of data
● Granularity of data
● Association of one data mart to another
● Operational support
● Synchronization with the enterprise data model


Data Marts and Usage[3]
Why Model the Data?
Data Modelling: Abstracting the individual data elements and how they interact with one another
● Enhances the effectiveness of individual queries
● Increases the ease of use for the user


Dimensional Model


What is a Dimensional Model?[2]
● A data model organized for the purpose of user understandability and high performance
● Supports the business perspective of the data, and today's technology ensures that it can be effectively implemented
● Basic parts of a dimensional model: the dimensions and the facts


The Purpose of Dimensional Models[2]
Two principles guide the entire dimensional modeling process:
1. Ease of Use
2. Query Performance


Dimensions[2]
● Are groupings of data elements in major business categories
● Are the basic stable entities in our environment, such as customers, products, locations, marketing promotions, and calendars


Fact Table[2]
● Facts are typically the amounts and counts that show up as the body of reports
● These are captured as specific information about a business event or transaction
● These are measured, monitored, and tracked over time
● Used as the basis for all calculations
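To show how dimensions and facts fit together, here is a minimal star-schema sketch using Python's built-in `sqlite3` module. The table names (`dim_product`, `dim_date`, `fact_sales`), their columns, and the sample rows are assumptions made for illustration and are not taken from the cited texts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT);
-- The fact table holds one row per sale: dimension keys plus numeric facts.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO dim_date VALUES (20130101, '2013-01-01', '2013-01'),
                            (20130201, '2013-02-01', '2013-02');
INSERT INTO fact_sales VALUES (1, 20130101, 2, 20.0),
                              (1, 20130201, 1, 10.0),
                              (2, 20130101, 3, 45.0);
""")

# Dimension attributes become the row headers; facts are summed in the body.
report = conn.execute("""
    SELECT p.category, d.month, SUM(f.quantity), SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()

for row in report:
    print(row)
```

The query reads the way a business user would phrase the question ("quantity and amount by category and month"), which reflects the ease-of-use goal of the dimensional model.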


Factless Fact [2]
● Captures the many-to-many relationships between dimensions, but contains no numeric facts
● No measure to calculate
● Example: a table that tracks attendance, although it holds no measured facts

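The attendance example can be sketched as rows that hold only dimension keys. The sketch below assumes hypothetical date, student, and course keys; because there is no measure, analysis comes down to counting rows.

```python
from collections import Counter

# Hypothetical factless fact rows: (date, student, course) keys, no measures.
attendance_fact = [
    ("2013-06-03", "S1", "DW101"),
    ("2013-06-03", "S2", "DW101"),
    ("2013-06-04", "S1", "DW101"),
    ("2013-06-04", "S1", "BI201"),
]

# Each row records that an attendance event happened, so counts are the result.
attendance_per_course = Counter(course for _, _, course in attendance_fact)
days_attended_by_s1 = len(
    {day for day, student, _ in attendance_fact if student == "S1"}
)

print(dict(attendance_per_course))
print(days_attended_by_s1)
```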
Implementing Changing Dimensions [7]
Type 1: Overwriting History
● Overwrite the old attribute value in the dimension row, replacing it with the current value
● Attribute always reflects the most recent assignment
● Simplest approach to dealing with dimension attribute changes
● It is fast and easy

Implementing Changing Dimensions (continued)[7]
Reasons for Type 1: Overwriting History
● Usually, changes relate to correction of errors in source systems
● Sometimes the change in the source system has no significance
● Sometimes the old value in the source system needs to be discarded
● Change in the source system need not be preserved in the data warehouse

Implementing Changing Dimensions (continued)
Type 1 example (original record compared with the Type 1 result): just overwrite the old value.
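
A minimal sketch of a Type 1 change, assuming the dimension is held as a list of dictionaries. The `customer_dim` rows, the column names, and the `apply_type1` helper are hypothetical.

```python
def apply_type1(dimension, natural_key, attribute, new_value):
    """Overwrite the attribute in place on every row for the natural key."""
    for row in dimension:
        if row["customer_id"] == natural_key:
            row[attribute] = new_value
    return dimension


# Hypothetical customer dimension with one row.
customer_dim = [
    {"customer_key": 1, "customer_id": "C100", "name": "Ann Lee", "city": "Manila"},
]

apply_type1(customer_dim, "C100", "city", "Cebu")
print(customer_dim)
```

No row is added and the surrogate key is unchanged, so the previous value can no longer be recovered from the dimension.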

All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.

Implementing Changing Dimensions (continued)[7]
Type 2: Preserving History
● Predominant technique for supporting this requirement when it comes to slowly changing dimensions
● Requires the use of surrogate keys
● Writes a record with the new attribute information
● Preserves a record of the old dimensional data

Implementing Changing Dimensions (continued)[7]
Reasons for Type 2: Preserving History
● Usually relate to true changes in source systems
● There is a need to preserve history in the data warehouse
● Type of change partitions history in the data warehouse
● Every change for the same attribute must be preserved

Implementing Changing Dimensions (continued)
Type 2 example (original record compared with the Type 2 result): create a new row with a new surrogate key that reflects the changes.
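
A minimal sketch of a Type 2 change, again assuming a list-of-dictionaries dimension. The `valid_from`, `valid_to`, and `is_current` columns are a common convention added here as an assumption; the slides only require a new row with a new surrogate key. The `apply_type2` helper and the sample data are hypothetical.

```python
from datetime import date

# Hypothetical customer dimension; customer_key is the surrogate key.
customer_dim = [
    {"customer_key": 1, "customer_id": "C100", "city": "Manila",
     "valid_from": date(2012, 1, 1), "valid_to": None, "is_current": True},
]


def apply_type2(dimension, natural_key, changes, effective):
    """Expire the current row and append a new row with a new surrogate key."""
    current = next(
        row for row in dimension
        if row["customer_id"] == natural_key and row["is_current"]
    )
    new_row = {
        **current,
        **changes,
        "customer_key": max(row["customer_key"] for row in dimension) + 1,
        "valid_from": effective,
        "valid_to": None,
        "is_current": True,
    }
    # The old row keeps its attribute values; only its validity markers change.
    current["valid_to"] = effective
    current["is_current"] = False
    dimension.append(new_row)
    return new_row


new_row = apply_type2(customer_dim, "C100", {"city": "Cebu"}, date(2013, 6, 1))
for row in customer_dim:
    print(row)
```

Fact rows loaded before the change keep pointing at surrogate key 1, and later fact rows point at key 2, which is how the change partitions history.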

Implementing Changing Dimensions (continued)[7]
Type 3: Preserving a Version of History
● Places a value for the change in the original dimensional record
● Appropriate when there's a strong need to support two views of the world simultaneously
● Preserves the change

Implementing Changing Dimensions (continued)[7]
Reasons for Type 3: Preserving a Version of History
● Usually relate to “soft” or tentative changes in the source systems
● There is a need to keep track of history with old and new values of the changed attribute
● Used to compare performance across the transition
● Provide the ability to track forward and backward

Implementing Changing Dimensions (continued)
Type 3 example (original record compared with the Type 3 result):
● Add an “old” field in the dimension table for the affected attribute
● Keep the new value of the attribute in the “current” field; the current date may also be added
● No new dimension row is needed
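
A minimal sketch of a Type 3 change on a single dimension row. The `old_` prefix, the effective-date column name, the sample row, and the `apply_type3` helper are hypothetical naming choices.

```python
from datetime import date


def apply_type3(row, attribute, new_value, effective=None):
    """Move the current value into an 'old' field and store the new value."""
    row[f"old_{attribute}"] = row[attribute]
    row[attribute] = new_value
    if effective is not None:
        row[f"{attribute}_effective_date"] = effective
    return row


# Hypothetical sales representative row being reassigned to a new territory.
sales_rep = {"rep_key": 7, "name": "Ben Cruz", "territory": "North"}
apply_type3(sales_rep, "territory", "Metro", date(2013, 6, 1))
print(sales_rep)
```

Both the old and the current territory live on the same row, so reports can be grouped either way, but only one prior value is retained.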

Implementing Changing Dimensions (continued)
Comparison of the original record with the Type 1, Type 2, and Type 3 results.

Steps in Developing Business Dimensional Model [2]
● Preparing for Modeling Sessions
● Creating the First Draft
● Completing or Fleshing Out the Model
● Business Reviews of the Model
● Expanding Business Data Over Time


For the Next Session


For the Next Sessions


Information Management

Data Governance

Data Security

Data Ownership

Data Dictionary



References
[1]: Kimball, R. and Ross, M. (2010). The Kimball Group Reader: Relentlessly Practical Tools for Data Warehousing
and Business Intelligence. John Wiley & Sons.

[2]: Reeves, L. (2009). A Manager's Guide to Data Warehousing. John Wiley & Sons.

[3]: Laberge, R. (2011). The Data Warehouse Mentor: Practical Data Warehouse and Business Intelligence Insights.
McGraw-Hill/Osborne.

[4]: Whitney, H. (2013). Data Insights: New Ways to Visualize and Make Sense of Data. Morgan Kaufmann
Publishers.

[5]: Ponniah, P. (2010). Data Warehousing Fundamentals for IT Professionals. 2nd edition. John Wiley & Sons.

[6]: n.d. (2008). IBM Cognos 8 Business Intelligence. IBM Corporation.

[7]: Mohanty, S. (2006). Data Warehousing: Design, Development and Best Practices. Tata McGraw-Hill
Publishing Company, India.
