BW Multi Dimensional Data Modelling
with BW
ASAP FOR BW ACCELERATOR
BUSINESS INFORMATION WAREHOUSE
SAP (SAP America, Inc. and SAP AG) assumes no responsibility for errors or omissions in these materials.
These materials are provided “as is” without a warranty of any kind, either express or implied, including but not limited to, the
implied warranties of merchantability, fitness for a particular purpose, or non-infringement.
SAP shall not be liable for damages of any kind including without limitation direct, special, indirect, or consequential damages
that may result from the use of these materials.
SAP does not warrant the accuracy or completeness of the information, text, graphics, links or other items contained within
these materials. SAP has no control over the information that you may access through the use of hot links contained in these
materials and does not endorse your use of third party web pages nor provide any warranty whatsoever relating to third party
web pages.
MULTI-DIMENSIONAL MODELING WITH BW
Table of Contents
MULTI-DIMENSIONAL MODELING WITH BW
ASAP FOR BW ACCELERATOR
TABLE OF CONTENTS
1 INTRODUCTION
1.1 SOFTWARE VERSION SUPPORTED
1.2 REFERENCES
1.3 OVERVIEW
2 FROM MULTI-DIMENSIONAL MODEL TO INFOCUBE – FIRST APPROACH
2.1 THE GOALS OF MULTI-DIMENSIONAL DATA MODELS
2.2 SUBJECT AREA
2.3 THE ROLE OF THE BW BUSINESS CONTENT
2.4 BASIC MODELING STEPS
2.4.1 Step 1: Develop a complete understanding of the underlying business processes
2.4.1.1 Reaping benefits of BW's Business Content
2.4.2 Step 2: Create a valid Schema
2.4.2.1 The Multi-Dimensional Model (MDM)
2.4.2.2 The Star Schema
2.4.3 Step 3: Create an InfoCube Description
2.5 RESUME
3 STAR SCHEMA BASICS AND MODELING ISSUES
3.1 HOW THE STAR SCHEMA WORKS
3.2 STAR SCHEMA ISSUES
4 MULTI-DIMENSIONAL SCHEMAS IN BW
4.1 OVERVIEW
4.2 CONNECTING MASTER TABLES TO INFOCUBES
4.3 DIMENSIONS IN A BW SCHEMA
4.3.1 Master Data Table
4.3.1.1 Reference Characteristic Assignment
4.3.1.2 Master Table Existence
4.3.1.3 Assigning Attributes
4.3.1.4 Attributes and Querying
4.3.1.5 InfoObject Names and Names of Attributes
4.3.1.6 Time Dependent Attributes
4.3.1.7 Compound Attributes
4.3.2 Text Tables
4.3.3 SID Tables
4.3.3.1 InfoObject Definition and SID Tables
4.3.3.2 SID Tables Maintenance
4.3.3.3 InfoCube Access and SID Tables
4.3.4 External Hierarchy Table
4.3.4.1 External Hierarchy Types
4.3.4.2 Tables for External Hierarchies
4.3.4.3 Loading External Hierarchy Data
4.3.4.4 External Hierarchies and InfoCube Access
4.3.5 Dimension Tables of an InfoCube
4.3.5.1 Defining Dimension Tables
4.3.5.2 Columns of a Dimension Table
4.3.5.3 Limitations
4.3.5.4 Dimensions and Navigation
4.3.5.5 Loading data into Dimension Tables
4.3.5.6 Special BW Dimensions
4.3.5.6.1 Packet Dimension
4.3.5.6.2 Unit / Currency Dimension
4.3.5.7 Dimensions with only one Characteristic (Line Item Dimensions)
4.4 FACT TABLE
4.4.1 Multiple Fact Tables
4.4.2 Fact Table Partitioning
4.5 BW TERMINOLOGY
5 MODELING ISSUES AND THE BW SCHEMA
5.1 GRANULARITY
5.1.1 Fact Tables and Granularity
5.1.2 Impacts on Storage
5.1.3 Impacts on Performance
5.2 LOCATION OF DEPENDENT ATTRIBUTES IN THE BW SCHEMA
5.2.1 Performance and Location of Dependent Attributes
5.2.2 Enterprise Data Warehouse and Location of Dependent Attributes
5.2.3 Data Load and Location of Dependent Attributes
5.3 TRACKING HISTORY IN THE BW SCHEMA
5.3.1 History and InfoCube
5.3.2 Slowly Changing Dimensions
5.3.2.1 Scenario I: Report the data to today's constellation - Today is Yesterday
5.3.2.1.1 Scenario I: Description
5.3.2.1.2 Scenario I: Solutions with BW
5.3.2.2 Scenario II: Report the data to yesterday's constellation as well - Yesterday is Today
5.3.2.2.1 Scenario II: Description
5.3.2.2.2 Scenario II: Solutions with BW
5.3.2.3 Scenario III: Report the data to the respective constellation - Today or Yesterday
5.3.2.3.1 Scenario III: Description
5.3.2.3.2 Scenario III: Solution with BW
5.3.2.4 Scenario IV: Report only on data for constellations valid today and yesterday - Today and Yesterday
5.3.2.4.1 Scenario IV: Description
5.3.2.4.2 Scenario IV: Solution with BW
5.3.3 Usage of Time Scenarios
5.4 M:N RELATIONSHIPS
5.4.1 M:N Relationships and the Fact Table
5.4.2 M:N Relationships within a Dimension
5.4.2.1 Designing M:N Relationships using the Dimension Table
5.4.2.2 Designing M:N Relationships using a Compound Attribute
5.5 FREQUENTLY CHANGING ATTRIBUTES (STATUS ATTRIBUTES)
5.6 INFLATION OF DIMENSIONS
5.7 MULTIPLE PROCESS REPORTING SCENARIOS
5.7.1 MultiCubes
5.7.2 Partitioning Attributes
5.8 ATTRIBUTE OR FACT (KEY FIGURE)
5.9 SAME CHARACTERISTIC SEVERAL TIMES IN THE MODEL
5.10 ARTIFICIAL KEY FIGURES
5.10.1 Factless Fact tables
5.10.2 Counting
5.11 BIG DIMENSIONS
5.12 HIERARCHIES IN THE BW SCHEMA
5.12.1 Hierarchies within a Dimension
5.12.2 Hierarchies within a Master Data table of a Characteristic
5.12.3 External Hierarchies
1 Introduction
This document provides background on the techniques used to create multi-dimensional structures
within SAP BW, which are called InfoCubes, and offers suggestions to help the customer understand
when to apply the various techniques available.
1.2 References
For more detailed information on the SAP BW architecture, please refer to the BW ODS Whitepaper
and to the paper Hierarchies in BW.
1.3 Overview
BW version 2.0 was a major step in the evolution of the BW architecture and functionality. From
the architecture point of view, the introduction of the new BW Operational Data Store (BW ODS) is
the most important change.
Note: The new BW ODS introduced with version 2.0B must not be confused with the layer that was
called ODS in version 1.2B. That layer is renamed Persistent Staging Area (PSA) in version 2.0B.
The BW ODS is a multi-level layer in the BW data warehouse that stores the results of the data
cleansing and data transformation process in transparent tables called ODS Objects. In this way
the BW ODS forms the historical foundation of the data warehouse.
To enable process integration, multiple BW ODS Objects can feed other ODS Objects or InfoCubes.
Business rules can be applied in the integration process.
The length of this integration chain of ODS Objects is not limited by BW.
[Figure: BW architecture; surviving labels: Meta Data, Master Data, Business Rules, Cleansing &
Transformation, Extraction, ETL Tools, Integration, Granularity, Models, sources (CRM, BBP, Legacy,
Applications), consumers (Ad Hoc Queries, Reporting)]
The BW architecture graphic illustrates that InfoCubes should be founded on the integration layer
for transactional data, the BW ODS, although this is optional. Furthermore, the InfoCubes are
linked to common master reference data located in master data tables, text tables, and (external)
hierarchy tables. Thus the BW infrastructure provides the structure for building InfoCubes on
a common integrated basis. This approach allows for partial solutions based on a blueprint for an
enterprise-wide data warehouse.
The ability to report on BW ODS Objects offers additional functionality that is also relevant in
the context of this paper.
BW ODS Objects can either be accessed directly or serve as a drill-through target from the
InfoCube point of view.
[Figure not reproduced; surviving labels: Meta Data, Provider, Master Data]
However, when to use which BW structure (InfoCubes or ODS Objects) as the foundation for reporting
and analysis is not discussed in this paper. This is covered in the BW ODS Whitepaper.
The focus of this paper is the support of Online Analytical Processing (OLAP) in BW. OLAP
functionality is one of the major requirements in data warehousing. Roughly speaking, OLAP gives
even inexperienced end-users the capability to analyze business process data (KPIs) in terms of
the business lines involved. This is normally done step by step, starting with business terms that
show the KPIs at an aggregated level and proceeding to business terms at a more detailed level.
A simple example:
Nevertheless, Business Content InfoCubes and, even more so, the Business Content InfoSources
(data structures offered by R/3 applications) help at least to shorten the modeling process. We
will not discuss the Business Content and how to benefit from it during the modeling process, as
this is done in separate papers.
If, however, we have to create an InfoCube based partly or even entirely on non-R/3 applications
(so-called legacy systems), the general procedure offers a proven approach.
[Figure not reproduced; surviving labels: Address, Material DIM ID, OrgStr DIM ID, Time Code ID,
Quantity, SAP BW; caption: Build the solution as a part of an integrated data warehouse]
There is no strict rule on how to develop a complete understanding of the underlying business
process. Nevertheless, an Entity Relationship Model (ERM) is a good way of seeing the relevant
business objects and their relationships. Depending on the situation and your experience, it will
sometimes be sufficient just to draw a diagram showing the entities and their relations.
Tools like VISIO or Erwin, or any other modeling tool, can be very helpful in this step.
An example may be the most efficient means of showing how to arrive at a Multi-Dimensional Model /
Star Schema and finally at a valid BW implementation, and of introducing the basic terms.
Suppose the end-user describes his information needs, and thus the subject area, as follows:
‘Track the performance of materials with respect to customers and sales persons’.
The following nouns tell how the end-user looks at the world:
Material
Customer
Sales Person
These nouns are basic business terms and are usually called Strong Entities.
Ask the end-user about the relationships between his basic business terms (strong entities).
Normally the relationships between strong entities are N:M relationships, i.e. a customer can
purchase multiple materials and a material can be purchased by multiple customers:
[Figure: ERM excerpt; surviving labels: Material group, Sales Department, Sales Transaction
(Intersection Entity)]
Now the customer is asked to be more precise. The customer determines that additional details for
material, customer and sales person are also required.
This gives you additional entities and attributes, where attributes are the ”describing fields” of
an entity. In ERM diagrams, attributes correspond to the ”fields” in relational tables.
The attributes show to what extent data concerning the entity can be stored.
Sales Transaction
Date
Customer no
Material no
Sales pers no
Amount
Quantity
Currency
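The intersection entity listed above can be sketched as a simple record type. The following Python fragment is only an illustration: the field names follow the attribute list, while the sample values and the two lookup dictionaries are hypothetical. It shows how rows of the intersection entity carry the N:M relationship between customers and materials.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesTransaction:
    """Intersection entity linking customer, material and sales person."""
    date: str
    customer_no: str
    material_no: str
    sales_pers_no: str
    amount: float
    quantity: int
    currency: str


# Hypothetical sample rows, for illustration only
transactions = [
    SalesTransaction("1998-11-18", "C1", "M1", "S1", 1000.0, 2, "USD"),
    SalesTransaction("1998-11-19", "C1", "M2", "S1", 250.0, 1, "USD"),
    SalesTransaction("1998-11-20", "C2", "M1", "S2", 500.0, 1, "USD"),
]

# Derive both directions of the N:M relationship from the transactions
materials_by_customer = defaultdict(set)
customers_by_material = defaultdict(set)
for t in transactions:
    materials_by_customer[t.customer_no].add(t.material_no)
    customers_by_material[t.material_no].add(t.customer_no)
```

Customer C1 appears with two materials and material M1 with two customers, which is exactly the N:M situation that the intersection entity resolves.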
It is useful for the following steps to ask the end-user for details concerning relationships between
entities and relationships between entities and their attributes.
It gives you an idea of ‘abnormal’ situations such as N:M relationships between an entity and an
attribute (see material and color). These relationships have to be treated carefully:
[Figure: ERM excerpt; surviving labels: Material Group, Price, Sales order]
After these steps you will have a good idea of the business terms and the relationships between
them. This gives you a good basis for a multi-dimensional model.
In SAP product based scenarios, the Business Content InfoSources give you a good foundation for
identifying the entities, attributes and facts (key figures) of the underlying subject area. As BW
offers the InfoSources ordered by application, it is easy to identify the InfoSource(s) that
cover(s) your subject area. If the subject area is based on customer-generated structures like LIS
and CO-PA, you have to examine these structures. The result is normally a complete set of entities
and attributes. The relationships can be derived from the SAP product data model if they are not
obvious.
Even if the solution is not entirely SAP product based, or you plan to migrate a source legacy
system to R/3, for example, in the future, the respective InfoSources should be taken into account.
Overcoming model complexity means creating a schema that is comprehensible both for the end-user
and for software.
(Publications by Ralph Kimball provide the details of the multi-dimensional data model.)
Comprehensibility for the end-user is reached by organizing the entities and attributes from Step 1
that are related in a parent-child relationship (1:N) into groups. We call such groups Dimensions
and the members of the dimensions Dimension Attributes or just Attributes. One could say that the
strong entities define the dimensions. For the end-user, the attributes of a dimension represent a
specific business view on the facts (or key figures or KPIs), which are derived from the
intersection entities. In line with the end-user's usual understanding, the attributes of a
dimension are organized hierarchically, and the most atomic attribute, which forms the leaves of
the hierarchy, defines the Granularity of the dimension. Granularity determines the level of detail
of the information. This model is called the Multi-Dimensional Model (MDM). The Multi-Dimensional
Model, with the facts at the center and the dimensions surrounding them, is a simple but powerful
concept that is understood by technical resources as well as by the end-user.
The Star Schema offers comprehensibility for software. The Star Schema is the most popular
way to implement a Multi-Dimensional Model in a relational database. Snowflake Schemas are another
solution in this area. BW InfoCubes are based on a Star Schema; therefore we give a short
introduction to the main terms and capabilities of the Star Schema.
In a Star Schema, each dimension is represented by one table. These Dimension Tables surround the
Fact Table, which contains the facts (key figures), and are linked to that Fact Table via unique
keys, one per Dimension Table. Each dimension key uniquely identifies a row in the associated
Dimension Table. Together these dimension keys uniquely identify a specific row in the Fact Table.
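As a minimal sketch of these key relations, the following Python fragment builds a small star schema in an in-memory SQLite database. The table and column names are assumptions chosen for illustration, the sample values are hypothetical, and nothing here reflects how BW itself creates its tables. The composite primary key of the fact table expresses that the dimension keys together identify one fact row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE material_dim (
    material_id INTEGER PRIMARY KEY, material_name TEXT, material_group TEXT);
CREATE TABLE customer_dim (
    customer_id INTEGER PRIMARY KEY, customer_name TEXT, city TEXT);
CREATE TABLE time_dim (
    time_code_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE sales_fact (
    material_id  INTEGER REFERENCES material_dim(material_id),
    customer_id  INTEGER REFERENCES customer_dim(customer_id),
    time_code_id INTEGER REFERENCES time_dim(time_code_id),
    sales_amount REAL,
    quantity     INTEGER,
    -- the dimension keys together identify exactly one fact row
    PRIMARY KEY (material_id, customer_id, time_code_id)
);
""")

# Hypothetical sample data
con.execute("INSERT INTO material_dim VALUES (1, 'Monitor', 'Hardware')")
con.execute("INSERT INTO customer_dim VALUES (1, 'Ides GmbH', 'Sample City')")
con.execute("INSERT INTO time_dim VALUES (19981118, 1998, 11)")
con.execute("INSERT INTO sales_fact VALUES (1, 1, 19981118, 1000.0, 2)")

# A second fact row with the same combination of dimension keys is rejected
try:
    con.execute("INSERT INTO sales_fact VALUES (1, 1, 19981118, 500.0, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

In this sketch a further sale for the same key combination would have to be added to the existing row or distinguished by an additional dimension key.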
Star Schema
FACT (Table): Material ID, Sales Rep ID, Time Code ID, Customer ID, Sales Amount, Quantity
Material Dimension (Table): Material ID, Material Name, Material Type, Material Group
Sales Org Dimension (Table): Sales Rep ID, LastName, SalesDep
Customer Dimension (Table): Customer ID, Customer Name, City, Region, Office Name
Time Dimension (Table): Time Code ID, Year, Fiscal Year, Quarter, Month, Day of the Week
[Figure: two example records.
First record, columns: Customer, Street, SalesPers, SalesRegion, Material, Unit, Date; surviving
values: Ides GmbH, Meier, Monitor, 981118.
Second record, columns: Customer, SalesPers, Material, Date, Amount, Quantity; values: Ides GmbH,
Meier, Monitor, 981118, 1000, 2.]
The basic procedure for mapping an ERM to the MDM / Star Schema is shown in the following graphic:
[Figure: mapping of the ERM to the MDM / Star Schema; surviving labels: Material Group, Region,
Sales Dept. Loc., Price, Sales order, Unit Price, City, Office Name, Fiscal Year, Quarter, Month,
Day of the Week, FACT, Customer Dimension, Time Dimension]
A central intersection entity defines a Fact Table. An intersection entity such as a document
number is normally described by facts (Sales Amount, Quantity), which form the non-key columns of
the Fact Table. One could say that the M:N relationships between strong entities meet in the
Fact Table, which thus defines the cut between dimensions.
Dimensions (Tables):
Attributes with 1:N conditional relationships, such as material group and material, should be
stored in the same dimension.
The foreign key -> primary key relations define the dimensions.
Time:
The time dimension is a special case because it has no counterpart in the ERM. We therefore
have to introduce time attributes (like day, week, year, ...) in the MDM process to cover the
analysis needs.
These considerations provide a starting point for dimension analysis, but additional considerations
will affect the grouping of the attributes and are discussed in detail later.
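Because the time attributes have no counterpart in the ERM, they have to be derived from the transaction date. The following Python sketch shows one way to do this. The function name, the attribute names and the use of ISO week numbering are assumptions for illustration, and fiscal year variants are not covered.

```python
from datetime import date


def time_dimension_row(d: date) -> dict:
    """Derive the attributes of a time dimension row from a calendar date."""
    _iso_year, iso_week, iso_weekday = d.isocalendar()
    return {
        "time_code_id": int(d.strftime("%Y%m%d")),  # e.g. 19981118
        "day_of_week": iso_weekday,                  # 1 = Monday ... 7 = Sunday
        "week": iso_week,                            # ISO 8601 week number
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "year": d.year,
    }


row = time_dimension_row(date(1998, 11, 18))
```

Each attribute derived this way is a parent of the day in the time hierarchy, so facts stored at day level can later be reported by week, month, quarter or year.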
Build the solution within BW with respect to the analytical needs and as a part of an integrated
data warehouse.
Translating the MDM / Star Schema, i.e. the results of Step 1 and Step 2, into an InfoCube
Description is the topic of this paper and is investigated in depth in the following chapters.
[Figure not reproduced; surviving labels: Address, Material DIM ID, OrgStr DIM ID, Time Code ID,
Quantity]
2.4 Resume
Ralph Kimball writes in his book ‘The Data Warehouse Toolkit’:
The nine database design decision points for a dimensional data warehouse consist of
deciding on the following:
1. The processes, and hence the identity of the Fact Tables (one Fact Table - one InfoCube ... ->
intersection entities)
2. The dimensions of each Fact Table (-> strong entities)
3. The dimension attributes with complete descriptions and proper terminology (-> attributes and
entities)
4. The grain of each Fact Table
5. The facts, including pre-calculated facts
6. How to track slowly changing dimensions
7. The aggregations, heterogeneous dimensions, mini-dimensions, query modes and other
physical storage decisions
8. The historical duration of the database (archiving aspects)
9. The urgency with which the data is extracted and loaded into the data warehouse (time frame
for loading)
[Figure: Star Schema with the FACT table (Material ID, Sales Rep ID, Time Code ID, Customer ID,
Sales Amount, Quantity) surrounded by the Material, Sales Org, Customer and Time Dimension tables]
As a result of these three browsing activities, there are a number of key values
(Customer IDs, Material IDs, Time Code IDs), one set from each Dimension Table affected.
Accessing the Fact Table
Using the key values determined during browsing, select all records in the Fact Table
whose record key contains these values.
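The two phases, browsing the Dimension Tables and then accessing the Fact Table, can be sketched as follows. This is an illustration with SQLite from Python, not a description of how BW generates its SQL; the table names, column names and sample rows are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE material_dim (material_id INTEGER PRIMARY KEY, material_group TEXT);
CREATE TABLE time_dim (time_code_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE sales_fact (customer_id INTEGER, material_id INTEGER,
                         time_code_id INTEGER, sales_amount REAL);
INSERT INTO customer_dim VALUES (1, 'New York'), (2, 'Boston');
INSERT INTO material_dim VALUES (10, 'Hardware'), (11, 'Books');
INSERT INTO time_dim VALUES (19981118, 1998), (19981119, 1998), (19990105, 1999);
INSERT INTO sales_fact VALUES (1, 10, 19981118, 1000.0), (2, 10, 19981118, 400.0),
                              (1, 11, 19981119, 30.0),   (1, 10, 19990105, 700.0);
""")


def browse(sql, value):
    """Step 1 (browsing): return the key values that satisfy one restriction."""
    return [row[0] for row in con.execute(sql, (value,))]


customer_ids = browse("SELECT customer_id FROM customer_dim WHERE city = ?", "New York")
material_ids = browse(
    "SELECT material_id FROM material_dim WHERE material_group = ?", "Hardware")
time_ids = browse("SELECT time_code_id FROM time_dim WHERE year = ?", 1998)


def marks(values):
    """Build a '?,?,...' placeholder list for an IN clause."""
    return ",".join("?" * len(values))


# Step 2 (fact table access): select the fact rows carrying those key values
sql = (f"SELECT SUM(sales_amount) FROM sales_fact "
       f"WHERE customer_id IN ({marks(customer_ids)}) "
       f"AND material_id IN ({marks(material_ids)}) "
       f"AND time_code_id IN ({marks(time_ids)})")
total = con.execute(sql, customer_ids + material_ids + time_ids).fetchone()[0]
```

Only the single fact row that matches all three key sets contributes to the result; a relational database would normally combine both steps in one join.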
Changes in the real world -> new rows in the fact table
Star-II. The role of the Dimension Tables
There are also changes in the assignment between values of attributes within the same
dimension (e.g. material X no longer belongs to material group Y but to material group
Z). These changes usually occur more or less frequently, and in theory they are therefore
called ‘slowly changing dimensions’. How these changes are handled has a big impact on
reporting possibilities and on data warehouse management. The possible time scenarios
and how you can solve them with BW are discussed in detail in the next sections.
Reporting
Star-III. Many reports can be created by accessing only the Dimension Tables (master data
reporting).
Star-IV. The Star Schema stores information about things that did happen, not about things that
did not happen (e.g. a report on the revenue for the customers in New York within a certain
time span would show the customers that have revenue but not the customers that have no
revenue).
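The effect described in Star-IV can be reproduced with a small sketch. The example uses SQLite from Python with hypothetical tables and rows. A report that starts from the fact table only sees customers with transactions, whereas a report that starts from the Dimension Table and uses an outer join also lists customers without revenue.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, customer_name TEXT, city TEXT);
CREATE TABLE sales_fact (customer_id INTEGER, sales_amount REAL);
INSERT INTO customer_dim VALUES (1, 'Customer A', 'New York'), (2, 'Customer B', 'New York');
INSERT INTO sales_fact VALUES (1, 1000.0), (1, 250.0);
""")

# Inner join: driven by the facts, so Customer B (no transactions) is missing
with_revenue = con.execute("""
    SELECT c.customer_name, SUM(f.sales_amount)
    FROM sales_fact f JOIN customer_dim c ON c.customer_id = f.customer_id
    WHERE c.city = 'New York'
    GROUP BY c.customer_name ORDER BY c.customer_name
""").fetchall()

# Outer join: driven by the dimension, so customers without revenue appear with 0
all_customers = con.execute("""
    SELECT c.customer_name, COALESCE(SUM(f.sales_amount), 0)
    FROM customer_dim c LEFT JOIN sales_fact f ON f.customer_id = c.customer_id
    WHERE c.city = 'New York'
    GROUP BY c.customer_name ORDER BY c.customer_name
""").fetchall()
```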
Aggregation
Star-V. Only the information at the granularity of the Dimension Table keys (Material ID, Customer
ID, Time Code ID, SalesRep ID) needs to be stored to make any desired aggregation level
of information available.
Star-VI. More precisely: any summarized information can be retrieved at run time, i.e. from a
functionality point of view there is no need to store precalculated aggregated data, but
Star-VII. With large Fact Tables and / or large Dimension Tables (in terms of number of rows),
precalculated aggregates must be introduced for performance reasons.
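The trade-off between Star-VI and Star-VII can be illustrated with a few lines of Python. The data structures and values below are hypothetical and stand in for a Dimension Table and a Fact Table. The same roll-up from material to material group is once computed at run time and once stored as a precalculated aggregate.

```python
from collections import defaultdict

# Hypothetical dimension rows: material_id -> attributes
material_dim = {
    1: {"name": "Monitor", "group": "Hardware"},
    2: {"name": "Keyboard", "group": "Hardware"},
    3: {"name": "Manual", "group": "Books"},
}

# Hypothetical fact rows at material granularity: (material_id, sales_amount)
fact_rows = [(1, 1000.0), (2, 50.0), (1, 500.0), (3, 30.0)]


def aggregate_by_group(rows):
    """Roll facts stored at material level up to material group level."""
    totals = defaultdict(float)
    for material_id, amount in rows:
        totals[material_dim[material_id]["group"]] += amount
    return dict(totals)


# Star-VI: the summary can always be computed at run time from the detailed facts
runtime_result = aggregate_by_group(fact_rows)

# Star-VII: for large tables the same summary is computed once after loading and
# stored, so that a query reads the small aggregate instead of scanning all facts
precalculated_aggregate = aggregate_by_group(fact_rows)
hardware_total = precalculated_aggregate["Hardware"]
```

Both paths return the same figures. The precalculated aggregate only saves work at query time and has to be rebuilt or adjusted whenever new facts are loaded.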
Attribute Relationships (Hierarchies)
In the Star Schema there is one (real) attribute, the most granular one, that serves as the
unique identifier of each Dimension Table row joining the Fact Table. The other attributes of
a Dimension Table are normally parents of this identifying attribute. This leads to the term
Hierarchy. Hierarchies pose a number of challenges:
Star-VIII. N:M relationship within a dimension
There is no simple way to handle an N:M relationship between two attributes within a
Dimension Table, such as materials that come in different colors. If material is the lowest
level, it is not possible to put both material and material color into one normal star
Dimension Table, because one material value would then be associated with multiple
colors. In that case material is no longer a unique key.
Star-IX. No leaf attribute values (see 6.1.6)
Again, there is no easy way to handle transactional input to a Star Schema where the
facts are delivered at different attribute levels of the same dimension. For example,
assume the attributes material and material group are in the same dimension. Some
subsidiaries can deliver transactional data at material level, whereas others can only
deliver data at material group level. The result in the latter case is Dimension Table rows
with blank or null values for the material, which destroys the unique key material.
Star-X. Unbalanced Hierarchies
Very often a dimension contains attributes where a relationship exists between some
attribute values but not between others. As the relations between attribute values of
different attributes within a dimension form a tree, this results in paths of different
lengths from the root to the leaves. These unbalanced hierarchies produce reports with
dummy hierarchy tree nodes.
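The notion of an unbalanced hierarchy in Star-X can be checked mechanically. The following Python sketch uses a hypothetical parent map and hypothetical node names. It computes the path from every leaf to the root and reports whether all paths have the same length.

```python
# Hypothetical hierarchy: child -> parent; the root has no entry of its own
parent = {
    "Monitor": "Hardware",
    "Hardware": "All Materials",
    "Gift Voucher": "All Materials",  # a material without a material group
}


def path_to_root(node):
    """Follow the parent links from a node up to the root."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


# Leaves are nodes that never occur as a parent
leaves = [node for node in parent if node not in parent.values()]
path_lengths = {leaf: len(path_to_root(leaf)) for leaf in leaves}

# The hierarchy is balanced only if every leaf is equally far from the root
is_balanced = len(set(path_lengths.values())) == 1
```

In this sample the material without a group sits one level higher than the other leaf, so a report that expects a group level has to show a dummy node for it.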
Table Sizes and Performance
Star-XI. Don't destroy browsing performance. Dimension Tables should have a 'relatively' small
number of rows in comparison to the Fact Table (a ratio of at least 1:10 to 1:20).
Schema Maintenance
Star-XII. The Star Schema places no limits on the number of attributes in the
Dimension and Fact Tables, apart from the limits imposed by the underlying relational
database.
Star-XIII. Characteristics and key figures can be added to the schema flexibly, thanks to the
properties of relational databases.
4 Multi-Dimensional Schemas in BW
BW ‘Developed’ Star Schema
Based on experience with the Star Schema, the SAP BW Schema uses
a more sophisticated approach to guarantee consistency in the data warehouse and to offer
schema-based functionality that covers the end users' analysis needs.
Creating a valid multi-dimensional Schema in BW always means that you have to bear in mind both the
overall enterprise data warehouse requirements and the solution-specific analysis and reporting
needs. Wrong decisions in this area have a deep impact on the solution. The result can be bad
performance or even an invalid schema.
Overview
The picture shows you a multi-dimensional BW Schema using the example from the previous
chapters. Only those parts are included which are important from the modeling point of view.
[Figure: multi-dimensional BW Schema. The InfoCube Fact Table in the center holds the keys
SalesOrg_Dimension_ID, Material_Dimension_ID, Customer_Dimension_ID and Time_Dimension_ID. It is
surrounded by the Sales Org, Material, Customer and Time Dimensions, each with its Dimension Table.
Text Tables (Sales Rep, Material, Customer: number, language code, name), attributes such as
Material Type, and a Hierarchy Table (Sales Organization with Region 1 to Region 3) are attached
to the dimensions. The Time Dimension Table holds Year, Quarter, Month and Day.]
Observations:
The Fact Table forms the center of a multi-dimensional Schema in BW.
The facts of the Fact Table are called Key Figures in BW (e.g. Sales Amount).
The Fact Table is surrounded by Dimensions.
A Dimension consists of different table types:
Dimension Table
The attributes of the Dimension Tables are called Characteristics in BW (e.g. Material).
The meta data object that describes Characteristics and Key Figures (facts) in BW is
called an InfoObject.
Master Tables:
Master Data Table
Dependent attributes of a characteristic can be stored in a separate table called the
Master Data Table of the characteristic. In BW terminology they are called
Attributes (e.g. Material Type).
Text Tables
Textual descriptions of a characteristic are stored in a separate Text Table. This allows
the system to run consistently in several languages at the same time.
Important
A possible point of confusion is the use of the term hierarchy in BW. The normal
understanding of hierarchy is defined as a sequence of parent-child relationships
between characteristics. From this perspective, there are hierarchies in the
Dimension Tables, Master Tables, and in Hierarchy Tables.
The Master Tables are solution independent and shared: they are valid for use with any InfoCube
and BW ODS Object in the data warehouse.
These Master Tables are the glue of the data warehouse and are discussed in depth in
the next chapter.
To cover all these requirements, the Master Tables of a BW Schema are not linked directly to
InfoCubes, as the following simplified picture illustrates:
[Figure: Multi-Dimensional Schema in BW. Master, Text and Hierarchy tables are connected to the
InfoCube through SID Tables.]
As you can see, the BW Schema uses pointer or translation tables called SID (Surrogate-ID) Tables
to link the solution-independent Master Tables to InfoCubes.
The graphic is a simplified version of reality. Which kinds of SID Tables exist, and what their
tasks are, is discussed in detail in the SID Tables section.
Dimensions in a BW Schema
Earlier we introduced some basic rules for defining the dimensions based on the results of the
prior analysis. As rules of thumb:
Attributes with 1:N conditional relationships should be stored in the same Dimension, such as
material group and material.
The foreign -> primary key relations define the dimensions.
Once we have decided on the members of a dimension, we have to consider that a
Dimension in the BW Schema might consist of different parts:
[Figure: parts of the Material Dimension. The Material Dimension Table (Material_Dimension_ID,
Material Number) is connected to the Material Master Table (Material Number, Material Type), the
Material Text Table (Material Number, Language Code, Material Name) and the Material Hierarchy
Table. The sample hierarchy shows a Sales Organization with Region 1 to Region 3 and Districts 1
to 8 (including District 3a) below them.]
One of the primary goals of this paper is to show the different modeling aspects that lead to
different locations of an attribute within a dimension of a multi-dimensional BW schema.
[Figure: Material Dimension. Material group can be placed in the Material Dimension Table (as a
Characteristic?), in the Material Master Table (as a Navigational / Display Attribute?) or in the
Material Hierarchy Table (as a Hierarchy?).]
As the graphic shows, the Material - Material group relation can be designed by defining Material group
either as a Characteristic, i.e. a member of a Material Dimension Table,
or as an Attribute, i.e. a member of the Material Master Table,
or as a node-describing attribute of the Material Hierarchy Table,
or as any combination of the above options.
Which choice fits best depends primarily on the desired time aspects in your queries and is discussed
in chapter 5.
Important
To avoid confusion we emphasize:
In BW the terms characteristic and attribute only indicate different locations in the Schema. As
shown above, the Material group can occur even within the same schema both as a Characteristic in
the Material Dimension Table and as an Attribute of Material in the Material Master Data Table.
Without regard to a specific schema location, as in the meta data definition, we simply talk about
InfoObjects of type characteristic.
When defining an InfoObject of type Characteristic you have the following modeling-relevant options
with respect to the definition of the Master Data Table.
When defining an InfoObject of type characteristic you are asked whether you want to refer to another
existing characteristic. If you do so, the new characteristic will, among other things, use the
master table of the referenced characteristic.
For example: the characteristics ‘sending costcenter’ and ‘receiving costcenter’ refer to the same
characteristic 0COSTCENTER and thus to the same Master Tables.
Does a Master Data Table exist at all? (tab strip: Master Data -> check box)
This allows you to add InfoObjects as attributes in the attribute tab strip section.
For example, in your schema all attributes of a document number may be assigned to other
characteristics like customer or material.
One result of the modeling phase is the set of attributes of a characteristic that shall reside in its
Master Data Table. The attributes are added using the ‘Attributes’ tab strip in the InfoObject
maintenance. These attributes form the communication structure for the InfoSource that loads the
master data.
Whether an attribute can potentially be used for query navigation (such as drill-down, up, across, or
within) on an InfoCube or ODS Object can be defined individually (Attribute tab strip -> Navigational
check boxes). If you mark the navigation check box of an attribute, this attribute is called a
Navigational Attribute.
Note: you have to activate the Navigational Attributes in the InfoCube definition to allow navigation
with respect to this InfoCube.
From the navigation point of view, Navigational Attributes behave like characteristics in an InfoCube.
However, the reporting behavior of Navigational Attributes in Master Tables differs from the behavior
of characteristics.
Attributes not used for navigation are called Display Attributes. If an InfoObject of type
characteristic is an attribute and is not marked as a Navigational Attribute, it is only possible to
report on this attribute in conjunction with a characteristic or with a Navigational Attribute.
For attributes of type key figure the following applies:
It is possible to create schemas that have the same InfoObject as a characteristic in a Dimension Table
of an InfoCube and as a Navigational Attribute of another characteristic that is in the InfoCube as well.
To avoid confusion you should give the Navigational Attribute a name that differs from its
characteristic name. The name is defined in the attribute tabstrip for each Navigational Attribute.
For example:
The InfoObject MMATERIAL is in the InfoCube and MMATGR is a Navigational Attribute of
MMATERIAL. Let us assume that, as a result of the model, MMATGR is also a characteristic in the
InfoCube. ‘Material group’ is the name of the InfoObject MMATGR. If you now used the same
name ‘Material group’ for the Navigational Attribute, this name would occur twice in the InfoCube
description of the query builder. This would certainly confuse the end user.
Material MatType
AAA 100
BBB 200
CCC 100
DDD 100
The InfoObject Material has the attribute MatGroup. We are also interested in former Material -
MatGroup constellations. MatGroup is therefore defined as a time dependent attribute (check the time
dependent check box). Let us assume that Material ‘BBB’ has MatGroup ‘X’. Then from October 1st,
1998 a new assignment of MatGroup ‘Y’ to Material ‘BBB’ is valid. The result is a new record in the
Time Dependent Attribute Master Data Table with the respective validity. The old constellation
only gets a new ‘Date To’ value:
In the Attribute tabstrip section there is one ‘time dependent’ check box for each attribute.
Time dependency of an attribute allows you to keep track of the changes over time in the
relation between the characteristic and the time dependent attribute values.
From the technical implementation point of view there are two Master Data Tables if a characteristic
has both not time dependent and time dependent attributes:
one master data table to store all relations to not time dependent attributes (name of the
table: /BIC/Pinfobjectname) and
one table for relations to time dependent attributes (name of the table: /BIC/Qinfobjectname).
The time dependent attributes master data table has the additional system attributes DATETO and
DATEFROM. In queries the different constellations are addressed using the key date (->
Query properties). The validity attributes are not available for navigation.
Note: The table names of BW business content InfoObjects start with /BI0/ ...
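The key-date mechanism can be sketched in a few lines of Python. This is an illustration only, not BW code: the rows mirror the Material ‘BBB’ example above, while the open-ended validity dates (year 1000 and year 9999) and the function name `attribute_on` are assumptions made for the sketch.

```python
from datetime import date

# Hypothetical rows of a time dependent master data table (/BIC/Q...):
# (material, date_from, date_to, mat_group)
Q_MATERIAL = [
    ("BBB", date(1000, 1, 1), date(1998, 9, 30), "X"),
    ("BBB", date(1998, 10, 1), date(9999, 12, 31), "Y"),
]

def attribute_on(rows, material, key_date):
    """Return the MatGroup that is valid for `material` on `key_date`, or None."""
    for mat, date_from, date_to, mat_group in rows:
        if mat == material and date_from <= key_date <= date_to:
            return mat_group
    return None

# The same material resolves to different groups depending on the key date.
print(attribute_on(Q_MATERIAL, "BBB", date(1998, 6, 30)))  # X
print(attribute_on(Q_MATERIAL, "BBB", date(1999, 1, 1)))   # Y
```

Because exactly one validity interval covers any key date, a query evaluated with a single key date sees exactly one constellation per material.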
A closer look at the reporting possibilities of time dependent attributes is given in chapter 5.
Important
There are no precalculated aggregates at time-dependent attribute level!
A characteristic may not be unique on its own, i.e. another characteristic is necessary to address
the data.
Example: the InfoObject 0COSTCENTER (cost center) offered from R/3 applications is only unique
in combination with the InfoObject 0CO_AREA (Controlling Area).
These additional characteristics can be defined in the compound tabstrip section of the
characteristic InfoObject maintenance.
Text Tables
The Text Table of an InfoObject of Type characteristic keeps the descriptions of the characteristic
values. The existence of a text table and different description types as short, middle and long text
descriptions and language dependency can be defined in the master data tabstrip section.
The Text Table or better the description attributes may be defined as time dependent.
Transfer Rules may be applied during text data load.
SID Tables
SID Tables play an important role linking the data warehouse information structures to the subject-
oriented InfoCubes and ODS Objects.
To speed up access to InfoCubes and to allow a master data layer that is independent of InfoCubes
and ODS Objects, each characteristic and attribute is assigned a SID column, and its values are
encoded as 4-byte integer values.
Note:
The algorithm to determine a SID value works fastest if the characteristic is numeric and does not
exceed a length of nine, as in this case the characteristic values themselves are the SIDs. No
traditional SID table has to be accessed, as the characteristic or attribute values correspond 1:1
to their SIDs.
To offer optimal performance for master data access with the various schemas, three
different SID tables might be generated.
SID tables with respect to Master Data:
The ‘Traditional’ SID table, which we already know from earlier versions, is always generated if
an InfoObject is not defined as ‘Attribute Only’ (Tabstrip general). This table is used if the access
to an InfoCube or ODS Object uses a Navigational Attribute or if the access is via a characteristic
without attributes.
The Not Time Dependent Attribute SID table of a characteristic for access via not time
dependent attributes.
The Time Dependent Attribute SID table of a characteristic for access via time dependent
attributes.
Example:
Suppose the InfoObject Material has attributes of type ‘not time dependent’ and ‘time dependent’.
The activation of this InfoObject generates the following tables (for illustration purposes we use the
example from the Master Table section):
Material Master Table for not time dependent attributes: /BIC/Pmaterial
Material MatType
AAA 100
BBB 200
CCC 100
DDD 100
Material-SID Material
001 AAA
002 BBB
003 CCC
004 DDD
Material-SID Material DateFrom DateTo MatGroup-SID
001 AAA 01/1000 12/9999 910
002 BBB 01/1000 09/1998 910
002 BBB 10/1998 12/9999 920
003 CCC 01/1000 12/9999 920
004 DDD 01/1000 12/9999 920
All these SID tables are automatically maintained during master data load.
SID tables are also maintained during InfoCube load if no referential integrity check is enforced
(InfoPackage).
To understand the function of these SID tables, consider a simple example of how the result
of a query is evaluated. Suppose we need the following information:
Show me the Sales Amount for Customers located in 'New York' with Material group 'X'
and ‘Y’ in the Year = '1999'
Let us assume that Material group is a Navigational Attribute (not time dependent) of the
characteristic Material in the Material Master Data Table and that we have no predefined aggregates
at Material group level.
The following picture shows how the different tables of the Material Dimension work together to
access the InfoCube Fact Table:
[Figure: SID Tables for InfoCube Access. Example: Show me the sales values for material group X and Y.
The Material Master Table (/BIC/PMATERIAL) holds Material / MatGroup pairs (AAA / X, CCC / Y,
DDD / Y) and is not used for InfoCube access. The MatGroup SID Table (/BIC/SMATGROUP) maps
MatGroup to SID (X = 345, Y = 678, Z = 999).]
The result set for the Material groups is then determined in two steps:
Browsing the tables that form the Dimensions
Material Dimension
Access the ‘traditional’ Material group SID Table and select the Material group
SIDs (here ‘345’ and ‘678’) for Material group = 'X' and ‘Y’.
Access the Material not time dependent Attribute SID Table with these Material
group SIDs and determine the Material SID values (here ‘111’, ‘222’ and ‘333’).
Access the Material Dimension Table with these Material SID values and
determine the Material Dimension Table Dim-Id values (here ‘1’, ‘2’ and ‘3’).
Customer Dimension: same procedure
Time Dimension: same procedure
As a result of these three browsing activities, there is a set of key values (Material
Dimension Table Dim-Ids, Customer Dimension Table Dim-Ids, Time Dimension Table
Dim-Ids), one set from each Dimension Table affected.
Accessing the Fact Table
Using the key values (Dim-Ids) determined during browsing, select all records in the Fact
Table that have these values in the Fact Table record key.
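The browsing steps for the Material Dimension can be sketched as follows. This is a simplified Python illustration, not the way BW actually executes queries: the SID and Dim-Id values follow the example above, whereas the assignment of SIDs to individual materials, the fourth material with group 'Z', the sales amounts and all variable names are assumptions made for the sketch. Customer and Time are omitted for brevity.

```python
# 'Traditional' SID table of MatGroup (/BIC/SMATGROUP): value -> SID
matgroup_sid = {"X": 345, "Y": 678, "Z": 999}

# Not time dependent Attribute SID table of Material:
# Material SID -> MatGroup SID (assignment assumed for illustration)
material_attr_sid = {111: 345, 222: 678, 333: 678, 444: 999}

# Material Dimension Table: Dim-Id -> Material SID
material_dim = {1: 111, 2: 222, 3: 333, 4: 444}

# Fact Table rows: (Material Dim-Id, sales amount); amounts are invented
fact_table = [(1, 100.0), (2, 50.0), (3, 25.0), (4, 70.0), (1, 10.0)]

def sales_for_matgroups(groups):
    # Step 1: Material group values -> Material group SIDs
    group_sids = {matgroup_sid[g] for g in groups}
    # Step 2: Material group SIDs -> Material SIDs
    material_sids = {m for m, g in material_attr_sid.items() if g in group_sids}
    # Step 3: Material SIDs -> Dim-Ids of the Material Dimension Table
    dim_ids = {d for d, m in material_dim.items() if m in material_sids}
    # Step 4: select the Fact Table rows that carry these Dim-Ids
    return sum(amount for dim_id, amount in fact_table if dim_id in dim_ids)

print(sales_for_matgroups(["X", "Y"]))  # 185.0
```

Note that the lookup never touches a table holding the real Material values; only SIDs and Dim-Ids are compared.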
In summary, no ‘real value’ Master Data Tables are used when accessing an InfoCube. The
following graphic illustrates this:
[Figure: SID Tables and InfoCube Access. Legend: (1) Fact Table, (2) Dimension Tables,
(3) time-independent SID, (4) time-dependent SID, (5) ‘traditional’ SID.]
Hierarchies in general are essential structures for navigation. Characteristics and attributes in the
Dimension Tables and Master Data Tables that are related in a sequence of parent-child relationships
also form hierarchies, but these are internal hierarchies.
External Hierarchies of a characteristic are defined separately from the other master data and are, as
mentioned above, independent of specific InfoCubes. They are therefore called External Hierarchies.
The different model properties of ‘internal’ and ‘external’ hierarchies in the BW Schema will be
discussed in chapter 5.
During the creation of an InfoObject of type characteristic you define whether External
Hierarchies exist at all for this InfoObject and, if so, their basic functionality
(Tabstrip: Hierarchies).
1. Allow versioning and / or time dependency of the whole external hierarchy structure (DateTo,
DateFrom)
2. Or (exclusively) allow time dependency for each external hierarchy node (time dependent
structure)
With both structure types you can allow intervals for the leaf nodes, which makes the definition of an
External Hierarchy easier.
Important
From the performance perspective it is important to know that with external hierarchies of type 1
precalculated aggregates are possible at each level, even for specific node values.
With external hierarchies of type 2 there are no precalculated aggregates.
The activation of the InfoObject Material results in the creation of the following tables:
Material Hierarchy Table: /BIC/HMaterial
Material Hierarchy SID Table: /BIC/KMaterial
Material SID-Structure Hierarchy Table: /BIC/IMaterial
External hierarchies can be transferred into BW directly from an SAP product environment (e.g.
standard cost center hierarchy from R/3), defined manually in BW, or loaded via flat file. The latter
is discussed in a separate paper.
BW allows you to define multiple External Hierarchies for a characteristic. External Hierarchies
can be used for query navigation with characteristics in the Dimension Tables and with activated
Navigational Attributes.
Example:
Consider a simple external hierarchy for the characteristic Country. Country is a member of the
Customer Dimension Table, but it could instead or additionally be a Navigational Attribute in the
Customer Master Data Table. The nodes are of textual nature. If ‘Continent’ were an InfoObject of type
characteristic, we could use this InfoObject to define the nodes using its characteristic values like
‘Europe’:
Country Hierarchy
-3 World
   -2 Europe
      3 Germany
      4 Austria
      5 Switzerland
   -1 America
      1 USA
      2 Canada
   6* Japan
Fact Table
11 1711 1
22 1712 1
33 2711 2
44 3711 3
55 4711 4
66 5711 5
77 6711 6
A node of a hierarchy can be either textual or an InfoObject with a specified value e.g. InfoObject
Material group with value ‘X’. All Display Attributes of the InfoObject Material Group are associated
with this node.
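How a query could roll up facts along such an external hierarchy can be sketched in Python. The sketch rests on assumptions: the node and country numbers are read from the Country Hierarchy example above (negative numbers for text nodes, positive numbers for countries), Japan is assumed to hang directly below World, and the sales values and function names are invented for illustration.

```python
# child -> parent; negative IDs are text nodes, positive IDs are countries
PARENT = {-2: -3, -1: -3, 3: -2, 4: -2, 5: -2, 1: -1, 2: -1, 6: -3}
NAMES = {-3: "World", -2: "Europe", -1: "America", 1: "USA", 2: "Canada",
         3: "Germany", 4: "Austria", 5: "Switzerland", 6: "Japan"}

# Invented sales amounts per country
sales_by_country = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0, 5: 50.0, 6: 60.0}

def leaves_under(node):
    """Collect all leaves below `node`; a leaf returns itself."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return {node}
    result = set()
    for child in children:
        result |= leaves_under(child)
    return result

def node_total(node):
    """Sum the sales of all countries below a hierarchy node."""
    return sum(sales_by_country.get(leaf, 0.0) for leaf in leaves_under(node))

for node in (-2, -1, -3):
    print(NAMES[node], node_total(node))
```

Because the hierarchy is stored apart from the facts, restructuring it (for example moving a country to another node) changes the roll-up without reloading any transaction data.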
In the definition of an InfoCube you select all the InfoObjects of type characteristic that shall be
direct members of this InfoCube. After this you define your Dimensions and assign the selected
characteristics to a Dimension.
Important
BW does not force you to assign only related characteristics to the same Dimension Table. This
offers you additional schema potential. Nevertheless, as a rule of thumb you should put only
characteristics that have a parent-child relationship into the same Dimension.
The activation of the InfoCube then results (with one exception, which we discuss later) in the
generation of InfoCube Dimension Tables, one for each Dimension.
The columns of a Dimension Table are not the characteristics themselves but the SIDs of the
characteristics you have chosen to be members of the InfoCube Dimension (Table). The unique key
of a Dimension Table is the Dimension ID (DIM-ID), which is a surrogate key (4-byte integer).
11 1711 1
22 1712 1
33 2711 2
44 3711 3
55 4711 4
66 5711 5
77 6711 6
We emphasize:
In the BW Schema a surrogate key, not the real most granular characteristic within the dimension,
is used as the unique key of each Dimension Table. That is, each unique combination of SID values of
the different characteristics within a Dimension Table is assigned a unique surrogate key value.
So in BW the Dimension Tables are joined to the Fact Table using surrogate keys.
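The assignment of surrogate keys can be pictured with a small Python sketch. It is a simplified model and not the BW implementation: the class `DimensionTable`, its method `dim_id_for` and the SID values are hypothetical names and numbers chosen for illustration.

```python
class DimensionTable:
    """Maps each unique combination of SIDs to one surrogate DIM-ID."""

    def __init__(self):
        self._dim_id_by_sids = {}  # tuple of SIDs -> DIM-ID
        self._next_dim_id = 1

    def dim_id_for(self, *sids):
        key = tuple(sids)
        if key not in self._dim_id_by_sids:
            # New combination: create a new Dimension Table row
            self._dim_id_by_sids[key] = self._next_dim_id
            self._next_dim_id += 1
        return self._dim_id_by_sids[key]

material_dim = DimensionTable()
first = material_dim.dim_id_for(2, 910)   # (Material SID, MatGroup SID)
second = material_dim.dim_id_for(2, 920)  # same material, other group
again = material_dim.dim_id_for(2, 910)   # known combination is reused
print(first, second, again)  # 1 2 1
```

The same Material SID appears in two rows with different DIM-IDs, which would be impossible if the Material itself were the key of the Dimension Table.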
Important
The use of a surrogate key as the unique key of a Dimension Table allows modeling patterns like N:M
relationships within the same dimension or leafless hierarchies. Most importantly, it allows you
to track changes over time in the constellations between values of different characteristics within
the same dimension (time rows). This will be discussed in depth in chapter 5.
4.1.2.3 Limitations
An InfoCube allows
16 Dimensions
3 Dimensions exist with each InfoCube (whether they are used and thus visible or not)
Time Dimension
Unit/ Currency Dimension
Packet Dimension
Important
It should be mentioned that in the market each attribute / characteristic is sometimes called a
dimension. This is a potential source of misunderstanding: simply stating that the BW Schema has
16 dimensions, three of which are used internally, may sound very limited. Using this
definition of a dimension, 13 X 248 dimensions are possible with BW, plus the dimensions
defined by the Navigational Attributes.
All characteristics that are assigned to Dimension Tables can be used for navigation (drilling) and
filtering within queries. Navigation with Navigational Attributes of InfoCube characteristics has to be
explicitly switched on for each Navigational Attribute (Tabstrip: Navigation).
A Navigational Attribute can also be activated for an InfoCube later on. Deactivation of
Navigational Attributes is not possible!
Time Dimension
Unit/ Currency Dimension
Packet Dimension
With each load into an InfoCube a unique Packet-ID is assigned. This allows you to purge
erroneous loads without recreating the whole InfoCube. The Packet Dimension can cause
overhead during querying. It can therefore be eliminated, once the loads up to a certain Packet-ID
have been proven correct, using the compress feature of the InfoCube.
Unit/ Currency Dimension
The respective Dimension Table is generated if in the InfoCube key figures are selected which are of
type Amount or Quantity.
Important
If you are not interested in Unit or Currency calculations you should define the key figures as
Numbers and then introduce the Unit in the Key figure header (like: Sales in HL). This will reduce
overhead.
Very often our model lets us assign only one characteristic to a Dimension.
This will probably occur if, for example, you have the document line item in your model, or with
specific reporting requirements (Chapter 5: all scenarios except no. 3).
In these situations a Dimension Table means only overhead. BW allows you to define this kind of
Dimension as a Line Item Dimension (check box in the Dimension definition).
In this case no Dimension Table is generated for this Dimension. The SID Table of the characteristic
serves as the Dimension Table, and the key in the Fact Table is the SID from that SID Table.
Line-Item Dimension:
[Figure: Line-Item Dimension. Legend: (1) Fact Table, (2) Dimension Tables,
(3) time-independent SID, (4) time-dependent SID, (5) ‘traditional’ SID.]
4.1.3
[Figure: multiple Fact Tables. Legend: (F) F-Fact Table, Requid > 0; (E) E-Fact Table, Requid = 0;
(3) time-independent SID, (4) time-dependent SID, (5) ‘traditional’ SID.]
Both Fact Tables have the same columns. The F-Table uses b-tree indexes; the E-Table uses bitmap
indexes, except for Line-Item Dimensions, where a b-tree index is used.
The InfoCube compression feature moves the fact records of all selected Requests from the F- to the
E-Fact Table. In doing so the Request-ID of each fact record is set to zero.
The separation into two fact tables is fully transparent.
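The effect of compression can be illustrated with a minimal Python sketch. It is a model under stated assumptions, not the BW implementation: fact rows are simplified to (Request-ID, dimension key, amount), the function name `compress` is hypothetical, and the summing of rows that share the same dimension key once their Request-ID is zero is an assumption of this sketch that the text above does not spell out.

```python
from collections import defaultdict

def compress(f_table, e_table, up_to_request):
    """Move rows with Request-ID <= up_to_request from the F- to the E-table.

    Rows are tuples (request_id, dim_key, amount). Moved rows get
    Request-ID 0; rows with the same dim_key are summed (assumption).
    """
    totals = defaultdict(float)
    for _request_id, dim_key, amount in e_table:
        totals[dim_key] += amount
    remaining = []
    for request_id, dim_key, amount in f_table:
        if request_id <= up_to_request:
            totals[dim_key] += amount
        else:
            remaining.append((request_id, dim_key, amount))
    new_e_table = [(0, key, amount) for key, amount in sorted(totals.items())]
    return remaining, new_e_table

f_rows = [(1, (1, 1), 10.0), (2, (1, 1), 5.0), (3, (2, 1), 7.0)]
f_rows, e_rows = compress(f_rows, [], up_to_request=2)
print(f_rows)  # [(3, (2, 1), 7.0)]
print(e_rows)  # [(0, (1, 1), 15.0)]
```

After compression an individual request can no longer be identified in the E-table, which is why compression should only follow loads that have been verified.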
To partition a table you have to define a criterion that allows the database engine to decide where a
specific record has to be stored and to find it afterwards.
In BW the Fact Table can be partitioned either by the InfoObject 0CALMONTH, i.e. calendar year and
month, or by 0FISCPER, i.e. fiscal year and period.
[Figure: Partitioning of the Fact Tables, showing the Time Dimension and the Packet Dimension.
Legend: (F) F-Fact Table, Requid > 0; (E) E-Fact Table, Requid = 0; (2) Dimension Tables,
(3) time-independent SID, (4) time-dependent SID, (5) ‘traditional’ SID.]
The value range for each partition is determined from the entire value range that you expect for
your partitioning InfoObject, together with the optional maximum number of partitions.
Note: Partitioning is a database functionality. Check OSS to find out to what degree and how
your database provider supports partitioning!
For example:
Let us assume we want to partition a Fact Table using 0CALMONTH. We want to have data
in our Fact Table starting from ‘199901’. Let us further assume that we expect a lifetime of
our InfoCube until ‘201012’. Without specifying a maximum value for the Partitions we would
have
11 years x 12 months + 2 = 134 Partitions
The additional 2 Partitions are reserved for data that have a 0CALMONTH value less than or
greater than our expected values.
To bring 1 Quarter into each Partition we proceed as follows:
134 Part. / 4 = 33.5 => maximum = 34
Important
Partitioning for a Fact Table has to be defined before you activate the InfoCube. It cannot be
done afterwards!
The Fact Table Partitioning described above affects only the E-Fact Table. The F-Fact Table is
automatically partitioned by the Request-ID. For this and other reasons, do not forget to
compress your InfoCube on a regular basis!
4.3 BW Terminology
The following picture shows the differences in the terminology.
Important
It should be mentioned that in the market each attribute / characteristic is sometimes called a
dimension. This is a potential source of misunderstanding: simply stating that the BW Schema has
16 dimensions, three of which are used internally, sounds very limited. Using this
definition of a dimension, 13 X 248 dimensions are possible with BW, plus the dimensions
defined by the Navigational Attributes.
[Figure: Design Aspects: Analysis Aspects, Performance, Global Data Warehouse Aspects.]
5.1 Granularity
The decision about Granularity, that is, the level of detail of your data, is one of the important
results of the data modeling phase. Granularity deeply influences
Reporting capabilities
Performance
Space needed
Load Time....
You have to decide whether you really need the data in an InfoCube, whether it is more meaningful to
store detailed data in an ODS object, or even whether you do not store detailed data in your data
warehouse at all and instead address the detailed data in your source system directly via Drill Thru.
These decisions influence not only your current scope but the entire data
warehouse approach and architecture.
This topic is discussed in a special paper.
Volume is a concern for Fact Tables. How can the number of rows of data in a Fact Table be
estimated? Consider the following:
How long shall the data be stored in the Fact Table?
How granular shall the data be?
The first point is quite understandable. However, the grain of the information has a large impact on
querying efficiencies and overall storage requirements. The grain of the Fact Table is directly
impacted by Dimension Table design because the most atomic characteristic in each dimension
determines the grain of the Fact Table. For example, assume the need to analyze the performance of
outlets and articles. Attributes exist which describe:
Outlet
Receipts
Articles
Customers
Time
Limit the analysis to articles and time, and further assume that 1,000 articles are grouped into 10
article groups. To track the article group performance on a weekly basis:
Granularity: article group, week, and 300 sales days a year (45 weeks)
10 X 45 = 450 records in the Fact Table per year due to only these two attributes if every
article group is sold within each week
Granularity: article, week, 300 sales days a year (45 weeks)
1,000 X 45 = 45,000 records in the Fact Table per year due to only these two attributes if all
articles are sold within a week
Granularity: article, day, 300 sales days a year
1,000 X 300 = 300,000 records in the Fact Table per year due to only these two attributes if
all articles are sold within a day
Granularity: article, hour, 300 sales days a year, 12 sales hours a day
500 X 300 X 12 = 1,800,000 records in the Fact Table per year due to only these two
attributes if on average 500 articles are sold within an hour
Finally, assuming 500 outlets, there will be 900,000,000 records a year in the Fact Table.
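The estimates above are simple products of the number of distinct values per attribute. A short Python sketch reproduces them; the function name `fact_rows_per_year` is a hypothetical helper introduced only for this illustration.

```python
import math

def fact_rows_per_year(*cardinalities):
    """Upper bound of Fact Table rows: product of the attribute cardinalities."""
    return math.prod(cardinalities)

article_group_week = fact_rows_per_year(10, 45)       # 10 groups x 45 weeks
article_week = fact_rows_per_year(1_000, 45)          # 1,000 articles x 45 weeks
article_day = fact_rows_per_year(1_000, 300)          # 1,000 articles x 300 days
article_hour = fact_rows_per_year(500, 300, 12)       # 500 articles x 300 days x 12 hours
with_outlets = fact_rows_per_year(500, 300, 12, 500)  # additionally 500 outlets

for label, rows in [("article group / week", article_group_week),
                    ("article / week", article_week),
                    ("article / day", article_day),
                    ("article / hour", article_hour),
                    ("article / hour / outlet", with_outlets)]:
    print(f"{label}: {rows:,}")
```

Each additional attribute or finer time level multiplies the row count, which is why the grain should be fixed deliberately before the InfoCube is built.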
Quite obviously granularity directly impacts the storage space needed. The Fact Table stores the
transaction data so is the largest table in the InfoCube. Therefore, reviewing the size of the Fact
Table provides a rough idea of space required for the InfoCube.
For each Dimension Table a four byte integer DIM ID (Dimension ID) is used, in conjunction with the
other DIM IDs, to point to the associated row of data in the Fact Table. In addition, the length of all
the key figures in the Fact Table must be considered:
((Number Of DIM IDs) * 4 + (Total Length of All Key Figures)) * Number of Records
Important
Remember the three required dimensions are time, unit, and packet.
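The sizing formula can be applied directly. In the Python sketch below only the formula itself and the 4-byte DIM ID come from the text; the number of dimensions, the key figure lengths and the function name `fact_table_bytes` are assumptions chosen for illustration.

```python
def fact_table_bytes(num_dim_ids, key_figure_bytes, num_records):
    """((Number of DIM IDs) * 4 + total key figure length) * number of records."""
    return (num_dim_ids * 4 + key_figure_bytes) * num_records

# Assumed example: 3 required dimensions (time, unit, packet) plus 2 others,
# and two key figures of 8 bytes each (hypothetical lengths).
size = fact_table_bytes(num_dim_ids=5, key_figure_bytes=16,
                        num_records=900_000_000)
print(f"{size:,} bytes, about {size / 1e9:.1f} GB")
```

The result covers the raw Fact Table rows only; indexes, Dimension Tables and aggregates need additional space.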
Large Fact Tables impact reporting and analysis. Apart from hardware considerations, there are a
few additional considerations to keep in mind
Aggregation
For large Fact Tables consider the use of precalculated aggregates. See the implications,
such as the increase in the storage space required, in an earlier section of this document.
Partitioning
Partition the Fact Table. The option exists to divide a table with respect to the values of a
specific attribute, into several physical tables. This process is transparent to the user. This
technique is useful with large Fact Tables because it provides access via smaller indexes.
[Figure: Parent Attributes in BW. In the Material Dimension, Material group can be placed in the
Material Dimension Table (as a Characteristic?), in the Material Master Table (as a Navigational /
Display Attribute?) or in the Material Hierarchy Table (as a Hierarchy?).]
The freedom to choose between different locations for dependent attributes is no real freedom, as
the reporting behavior and possibilities differ depending on the location. Thus the reporting needs
investigated during the blueprint phase of the project normally define exactly where a dependent
attribute is located. This is discussed in detail in the following chapters.
The reporting needs should guide you in deciding where to put a dependent attribute. From a
performance point of view there is little or nothing to be said in favor of attributes in an InfoCube
Dimension Table rather than in Master or Hierarchy Tables.
From an enterprise data warehouse point of view, and apart from analysis demands and performance
issues, the following hint should be observed:
Parent attributes should be placed in master tables (->Navigational/ Display Attributes) or
designed as an external hierarchy to minimize redundancy and to guarantee integration in the
data warehouse.
Data warehousing should mean controlled redundancy to achieve a high degree of integration. From
this point of view all the dependent attributes should reside in master tables, which means in the
extreme case that there is only one characteristic in each Dimension Table (see Line Item Dimension).
5.2.4 We now discuss the most important term with data warehouses : time
5.2.5.1 Changes over time are normally tracked in the Fact Table by loading
transaction data.
It is the task of the Fact Table to track changes (e.g. sales) between characteristics of different
dimensions.
For example:
If the material 'EEE' is purchased by customer '123' on day '19990630', this sale will occur as a new
row in the Fact Table, and thus the new relationship between material 'EEE', customer '123' and
date '19990630' becomes visible.
Things that did happen
The Fact Table normally reports things that did happen. There is no easy way to report on things that
did not happen.
[Figure: SID tables for Material (AAA 001 to EEE 005) and Material group (X 910, Y 920), the Material
Dimension Table (Mat-GR-SID, Mat-SID, Mat-DIM-ID) and the Fact Table (Mat-DIM-ID, Time-DIM-ID,
Revenue) with revenue 100 per row for 09/1998 and 10/1998. A new Material / Material group
combination adds a new record to the dimension table; a new transaction adds a new record to the
fact table.]
Changes in the relationship between the values of parent - child attributes within a Dimension are
discussed in detail in the next chapter.
Tracking changes between attributes of different dimensions (like a sales transaction) is the 'normal'
business of an InfoCube and is covered by the Fact Table.
But there are also changes in the assignment of dependent attribute values to characteristic values.
For example:
The Material 'BBB' no longer belongs to Material group 'X' but to Material group 'Y'.
Usually these changes occur rarely, and in theory they are addressed as 'slowly changing
dimensions'. How these changes are handled has a big impact on the reporting possibilities and on
data warehouse management.
We emphasize again:
The reporting possibilities differ depending on whether you define a dependent attribute as a
characteristic, a Navigational Attribute or a node of an external hierarchy, because the locations
offer different time scenarios.
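To explain the different time scenarios we will use the following example:
Constellation 09/1998:
Material  Material group
AAA       X
BBB       X
CCC       Y
DDD       Y
Constellation 10/1998:
Material  Material group
AAA       X
BBB       Y (changed)
CCC       Y
DDD       Y
EEE       Y (new)
Fact Table:
Material  Date     Revenue
AAA       09/1998  100
BBB       09/1998  100
CCC       09/1998  100
DDD       09/1998  100
AAA       10/1998  100
BBB       10/1998  100
CCC       10/1998  100
DDD       10/1998  100
EEE       10/1998  100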
The example shows the Material – Material group value constellations in 09/1998 and in 10/1998.
The Fact Table shows the transactions which occurred during the same time span.
With this simple example we are able to produce 4 reports with different results, all of which can
claim to report the truth. But the truth depends on how you treat changes in the relationships
between Materials and Material groups.
The reader is invited to implement this little example (just 9 rows in the Fact Table) on BW to verify
the following scenarios:
1st Solution:
The parent attribute (Material group) resides in the Master Data Table of the child characteristic
(Material) (BW Admin WB: InfoObject maintenance -> Attributes)
The parent attribute has to be defined as a Navigational Attribute to allow drill and filter functions
(BW Admin WB : InfoObject maintenance-> Attributes and InfoCube maintenance -> Navigation)
2nd Solution :
Today is Yesterday or Today‘s constellation is the truth – 2nd solution:
Define the dependent attribute of your multi-dimensional model as Node Attribute of an External
Hierarchy of your characteristic.
As of BW version 1.2B, no precalculated aggregates would be possible for any attribute of Material,
even if time dependency is desired only for the Material - Material group relationship
(see the section about aggregates).
Example – 2nd Solution:
Material group as node-attribute of an External Material Hierarchy.
Parent attribute resides in the Hierarchy Table as node attribute of an External Hierarchy of the child
characteristic
No time dependent hierarchy name, structure or versions are necessary for the External
Hierarchy to implement this scenario.
Today is Yesterday or Today‘s constellation is the truth – conclusion :
If you want to report your fact data always with respect to the latest characteristic-attribute value
constellations, the dependent attributes have to be either Navigational Attributes or Nodes of an
External Hierarchy of the characteristic. When new constellations (Master or Hierarchy data) are
loaded, the fact data stored at characteristic level is automatically realigned to the new
Navigational Attribute or Node values.
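The realignment can be illustrated outside BW with the example data. The following Python sketch is not BW code: it simply stores the facts at Material level and resolves the Material group at query time from whatever master data is currently loaded. The function name revenue_by_group is an assumption made for this illustration.

```python
from collections import defaultdict

# Fact data is stored at Material level only (values from the example).
facts = [(m, "09/1998", 100) for m in ("AAA", "BBB", "CCC", "DDD")]
facts += [(m, "10/1998", 100) for m in ("AAA", "BBB", "CCC", "DDD", "EEE")]

# Latest master data (constellation 10/1998); it overwrites the old assignment.
material_group = {"AAA": "X", "BBB": "Y", "CCC": "Y", "DDD": "Y", "EEE": "Y"}


def revenue_by_group(facts, master):
    """Resolve the Material group at query time and sum the revenue."""
    totals = defaultdict(int)
    for material, period, revenue in facts:
        totals[(master[material], period)] += revenue
    return dict(totals)


report = revenue_by_group(facts, material_group)
for (group, period), revenue in sorted(report.items()):
    print(group, period, revenue)
```

Because BBB is looked up in today's master data, its 09/1998 revenue is reported under Material group Y, although it belonged to X when the sale happened.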
Important
Constellation 09/98:
Material  Material group
AAA       X
BBB       X
CCC       Y
DDD       Y
Constellation 10/98:
Material  Material group
AAA       X
BBB       Y (changed)
CCC       Y
DDD       Y
EEE       Y (new)
Fact Table:
Material  Date     Revenue
AAA       09/1998  100
BBB       09/1998  100
CCC       09/1998  100
DDD       09/1998  100
AAA       10/1998  100
BBB       10/1998  100
CCC       10/1998  100
DDD       10/1998  100
EEE       10/1998  100
Reporting demands: report using yesterday's constellation
Material group  Rev 09/98  Rev 10/98
X               200        200
Y               200        200
?
This scenario may be of interest if you want to report the effects of organizational changes.
Example:
When the Materials are reorganized using new Material group assignments, this scenario allows
one query to report last year's sales data with today's Material group assignment and another query
to report it with the assignment that was valid last year, thus offering a basis for comparisons.
The question may come up how to handle revenues in the Fact Table that cannot be assigned to a
Material group because the Material does not exist in yesterday's master data.
1st Solution :
The Material group is a Time Dependent Navigational Attribute of Material. (BW Admin WB :
InfoObject maintenance-> Attributes)
As emphasized above there would be no precalculated aggregates possible at Material group level.
How to address different constellations
The parent attribute (Material group) resides in the master table of the child characteristic (BW
Admin WB : InfoObject maintenance-> Attributes)
The DateTo and DateFrom attributes are not available for navigation and do not appear directly in
the Query Builder. Different master data records of the same characteristic value are addressed
using the Key Date in the Properties Window of a query.
E.g. a Key Date of 30.09.1998 means: select master records with DateTo >= 30.09.1998 and
DateFrom <= 30.09.1998.
Hint: Define a BW variable to allow flexible reports and analysis (BEx Query Builder) with different
Key Dates.
Important
The Key Date of a Query allows you to address different master data records having the same
characteristic value.
This Key Date is valid for all master records of characteristics having time dependent attributes.
Using the time dependent feature you cannot report more than one master record
(constellation) for a characteristic value in a single query execution!
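The Key Date selection can be sketched in Python as follows. This is an illustration of the selection rule only, not of how BW implements it. The exact validity dates (30.09.1998 / 01.10.1998 as the change date, 01.01.1900 and 31.12.9999 as open limits) are assumptions, and the bucket 'not assigned' for Materials without a valid master record is borrowed from the hierarchy report shown later in this section.

```python
from collections import defaultdict
from datetime import date

FIRST, LAST = date(1900, 1, 1), date(9999, 12, 31)   # assumed open limits

# Time dependent master records: (material, group, DateFrom, DateTo)
master_records = [
    ("AAA", "X", FIRST, LAST),
    ("BBB", "X", FIRST, date(1998, 9, 30)),
    ("BBB", "Y", date(1998, 10, 1), LAST),
    ("CCC", "Y", FIRST, LAST),
    ("DDD", "Y", FIRST, LAST),
    ("EEE", "Y", date(1998, 10, 1), LAST),
]
facts = [(m, "09/1998", 100) for m in ("AAA", "BBB", "CCC", "DDD")]
facts += [(m, "10/1998", 100) for m in ("AAA", "BBB", "CCC", "DDD", "EEE")]


def constellation_at(records, key_date):
    """Select master records with DateFrom <= key date <= DateTo."""
    return {m: g for m, g, d_from, d_to in records if d_from <= key_date <= d_to}


def revenue_by_group(facts, constellation):
    totals = defaultdict(int)
    for material, period, revenue in facts:
        group = constellation.get(material, "not assigned")
        totals[(group, period)] += revenue
    return dict(totals)


yesterday = revenue_by_group(facts, constellation_at(master_records, date(1998, 9, 30)))
today = revenue_by_group(facts, constellation_at(master_records, date(1998, 10, 31)))
```

One key date yields exactly one constellation per query run, which is why two queries (or one query executed with two variable values) are needed to compare yesterday's and today's view.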
2nd Solution :
Today is Yesterday or Today‘s constellation is the truth – 2nd solution:
Define the dependent attribute of your multi-dimensional model as Node Attribute of an External
Hierarchy of your characteristic where the entire Hierarchy or even the structure is time dependent.
[Figure: Material Hierarchy Table evaluated with key date 09/1998: root node -1 (All) with child
nodes -2 (X) and -3 (Y).]
Report using yesterday's constellation:
Material group  Rev 9/98  Rev 10/98
+ X             200       200
+ Y             200       200
not assigned              100
Allow versions and/ or entire Hierarchy time dependent or even time-dependent structures for
external hierarchies of the child characteristic (Material) (BW Admin WB : InfoObject maintenance->
Hierarchies)
The parent attribute resides as a node attribute of an external hierarchy in the Hierarchy Table of the
child characteristic (BW Admin WB : InfoObject maintenance-> Hierarchies)
Important
If all dependent attributes of a characteristic are Navigational (time dependent or not) or Display
Attributes in the characteristic’s Master Data Table or Nodes (time dependent or not) of an
External Hierarchy (time dependent or not) then remember the possibility to define this
characteristic as Line Item Dimension!
This scenario is of interest if you want reports that track the organizational changes (time rows):
e.g. With Human Resources
The parent attribute (Material group) resides as a characteristic in the Dimension Table of the child
characteristic (Material) (BW Admin WB : InfoCube maintenance -> Characteristics).
If the parent characteristic is not delivered with the transaction data load, an update rule has to be
created that determines the parent characteristic value by automatic lookup in the master data of
the characteristic.
This scenario illustrates a strength of the BW Schema. The usage of surrogate keys (DIM IDs) for the
Dimension Tables makes this time scenario possible.
It allows you to track all the constellation changes and to assign the validity of such constellation
implicitly via the Time in the Fact Table.
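A minimal Python sketch of this mechanism, using the example data: each new Material / Material group combination receives its own surrogate key, and every fact row keeps pointing to the combination that was valid when it was loaded. The helper names and the way DIM IDs are numbered are assumptions for illustration; the constellation passed to load() stands in for the update rule lookup mentioned above.

```python
from collections import defaultdict

dimension = {}    # (material, material_group) -> DIM ID (surrogate key)
fact_table = []   # (dim_id, period, revenue)


def dim_id_for(material, group):
    """Return the DIM ID of a combination, adding a dimension row if it is new."""
    key = (material, group)
    if key not in dimension:
        dimension[key] = len(dimension) + 1
    return dimension[key]


def load(materials, constellation, period, revenue=100):
    """Load transactions; the group valid at load time is looked up per row."""
    for material in materials:
        dim_id = dim_id_for(material, constellation[material])
        fact_table.append((dim_id, period, revenue))


load(("AAA", "BBB", "CCC", "DDD"),
     {"AAA": "X", "BBB": "X", "CCC": "Y", "DDD": "Y"}, "09/1998")
load(("AAA", "BBB", "CCC", "DDD", "EEE"),
     {"AAA": "X", "BBB": "Y", "CCC": "Y", "DDD": "Y", "EEE": "Y"}, "10/1998")


def revenue_by_group():
    group_of = {dim_id: group for (_, group), dim_id in dimension.items()}
    totals = defaultdict(int)
    for dim_id, period, revenue in fact_table:
        totals[(group_of[dim_id], period)] += revenue
    return dict(totals)


report = revenue_by_group()
```

BBB now occurs twice in the dimension table, once with X and once with Y, so its 09/1998 revenue stays with X and its 10/1998 revenue is reported under Y.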
5.2.6.4 Scenario IV: Report only on data for constellations valid today and yesterday
-Today and Yesterday-
As in the Yesterday is Today scenario we store all the different Parent-Child constellations which
occurred over time.
The parent attribute (Material group) resides in the master table of the child characteristic (BW
Admin WB : InfoObject maintenance-> Attributes)
The key date mechanism for addressing specific master data records does not allow time ranges.
Furthermore, the DateTo and DateFrom (To-Sys / From-Sys) attributes, which are generated
automatically to handle Time Dependent Attributes, cannot be used for user-defined navigation or
filters.
We have to define our own DateTo and DateFrom attributes (To-User and From-User) in the
master table.
During master data load the user DateTo value of the old master record has to be updated.
Hint: Define time variables with intervals for DateFrom and DateTo to allow flexible reports and
analysis (BEx Query Builder).
E.g. to make a query with comparable data for the period 9/1998 to 10/1998 you have to define
the intervals as follows:
(user-defined) DateFrom: 011900 - 091998
(user-defined) DateTo: 101998 - 129999
The Query Key Date must be in 9/1998 or 10/1998.
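The effect of the two interval filters can be sketched in Python. The sketch is an illustration only: months are written as integers in YYYYMM form so that they compare correctly (the intervals above are written as MMYYYY), and the validity months of the master records are assumptions derived from the example.

```python
from collections import defaultdict

# Master records with user-defined validity: (material, group, From-User, To-User)
master_records = [
    ("AAA", "X", 190001, 999912),
    ("BBB", "X", 190001, 199809),
    ("BBB", "Y", 199810, 999912),
    ("CCC", "Y", 190001, 999912),
    ("DDD", "Y", 190001, 999912),
    ("EEE", "Y", 199810, 999912),
]
facts = [(m, "09/1998", 100) for m in ("AAA", "BBB", "CCC", "DDD")]
facts += [(m, "10/1998", 100) for m in ("AAA", "BBB", "CCC", "DDD", "EEE")]


def comparable_constellation(records, period_start, period_end):
    """Keep records with DateFrom <= period start and DateTo >= period end."""
    return {m: g for m, g, d_from, d_to in records
            if d_from <= period_start and d_to >= period_end}


def revenue_by_group(facts, constellation):
    totals = defaultdict(int)
    for material, period, revenue in facts:
        if material in constellation:      # other facts are filtered out
            totals[(constellation[material], period)] += revenue
    return dict(totals)


valid = comparable_constellation(master_records, 199809, 199810)
report = revenue_by_group(facts, valid)
```

Only AAA, CCC and DDD have an assignment that is valid throughout both months, so the revenue of BBB (changed) and EEE (new) does not appear in the report.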
As shown in the previous chapter, BW supports a wide range of time scenarios. Summarizing what
we learned in the previous sections, we emphasize:
It is possible to design all time scenarios within one BW Schema.
Using different time scenarios in a Schema increases the potential value of our solution.
Thus it is quite understandable that during analysis the end-user may wish to have all time
scenarios in the BW Schema, just in case.
If this wish comes up and there is no fundamental information need, one has to warn the end-user,
because he will have to pay for it:
He will lose the simplicity of the Multi-Dimensional Model and, besides this, produce overhead
during load and querying. Thus:
With each additional time scenario in a BW Schema the complexity increases, and with it
the potential for erroneous and misleading queries.
A direct consequence is:
Additional training has to be done for ad hoc users and for query authors to explain the
differences between the time scenarios and how and in which case to use them.
If, on the other hand, the end-user has a real need to report using different time scenarios, the
following rules have to be observed:
No precalculated aggregates are possible at the level of a Time Dependent Attribute. Thus
introducing time dependency for an attribute without any need might make performance
improvements impossible. The same is true of external hierarchies with a time-dependent
structure.
5.3.1.1
Normally, N:M relationships between two attributes discovered during analysis mean that they
reside as characteristics in different Dimension Tables, like customer and material, and the Fact
Table resolves this N:M relationship. This kind of relationship is described by facts / key figures
like revenue.
N:M relationships may also occur within the same dimension, like Material and Color or Customer and
Communication-Possibilities.
E.g. Material and Color:
Color is an attribute of the characteristic Material. A Material can have multiple colors and vice versa.
Intuitively, color should be in the Master Data Table of material, like material type.
But this is not possible because the material is the unique key of the master table. Thus we cannot
have one material with multiple colors in the master table. (This is a typical challenge with Star
Schemas.)
The BW Schema allows such N:M relationships by locating the parent attribute Color as a
characteristic in the Material Dimension Table. This is possible due to the usage of surrogate keys
(DIM IDs) in the Dimension Tables, which allows the same Material to occur several times in the
Dimension Table.
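A small Python sketch of this idea follows. The materials M1 and M2, the colors and the revenues are hypothetical values chosen for illustration, and the dictionary merely stands in for a Dimension Table keyed by a surrogate DIM ID.

```python
from collections import defaultdict

dimension = {}    # (material, color) -> DIM ID (surrogate key)
fact_table = []   # (dim_id, revenue)


def dim_id_for(material, color):
    """Return the DIM ID of a Material/Color combination, creating it if new."""
    key = (material, color)
    if key not in dimension:
        dimension[key] = len(dimension) + 1
    return dimension[key]


# Hypothetical sales: M1 is sold in two colors, red is shared by M1 and M2.
sales = [("M1", "red", 100), ("M1", "blue", 50), ("M2", "red", 70), ("M1", "red", 30)]
for material, color, revenue in sales:
    fact_table.append((dim_id_for(material, color), revenue))


def revenue_by(position):
    """Sum revenue by Material (position 0) or by Color (position 1)."""
    key_of = {dim_id: key[position] for key, dim_id in dimension.items()}
    totals = defaultdict(int)
    for dim_id, revenue in fact_table:
        totals[key_of[dim_id]] += revenue
    return dict(totals)


by_material = revenue_by(0)
by_color = revenue_by(1)
```

Material M1 appears in two dimension rows, which a master table keyed by Material alone could not hold.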
5.3.3.2 Designing M:N Relationships using a Compound Attribute
It is possible to achieve the uniqueness of a characteristic by defining one or even multiple
attributes as Compound Attributes (InfoObject maintenance, tabstrip Compound).
Compound Attributes
If you can avoid Compounding, do it!
Compound Attributes always mean an overhead with respect to
reporting, as you will always have to qualify the Compound Attributes within a query,
and performance.
Bear in mind:
Compounding is always a heritage of the Source Systems. What makes sense in Source Systems
does not necessarily make sense in data warehousing.
Remember that data warehousing does not mean copy management!
ONUM: Order Number (C) ONUM: Order Number (C) ONUM: Order Number (C)
ODAT: Order Date (C) DDAT: Delivery Date (C) BDAT: Billing Date (C)
SALP: Sales Person (C) DELP: Delivery Person (C) BILP: Billing Person (C)
OQTY: Order Quantity (K) DQTY: Delivered Quantity (K) BQTY: Billing Quantity (K)
OPRI: Order Price (K) DPRI: Delivery Price (K) BPRI: Billing Price (K)
5.6.1 MultiCubes
Looking at the example introduced above one might come to the following conclusion:
As you frequently want to report data from these processes together, the first approach might be to
create one common multi-dimensional model and subsequently one InfoCube.
Creating a solution using one InfoCube without any further schema improvements, we would get:
Order-Delivery-Billing Cube
ONUM  CUS  PROD  ODAT  SALP  DDAT  DELP  BDAT  BILP  OQTY  OPRI  DQTY  DPRI  BQTY  BPRI
1     C1   P1    1998  S1    *     *     *     *     5     100   0     0     0     0
2     C2   P1    1998  S2    *     *     *     *     10    200   0     0     0     0
3     C1   P2    1997  S3    *     *     *     *     4     130   0     0     0     0
4     C2   P2    1997  S2    *     *     *     *     8     150   0     0     0     0
4     C2   P2    1998  S2    *     *     *     *     -2    -40   0     0     0     0
1     C1   P1    *     *     1998  D2    *     *     0     0     5     100   0     0
2     C2   P1    *     *     1999  D1    *     *     0     0     7     120   0     0
2     C2   P1    *     *     1999  D2    *     *     0     0     3     80    0     0
3     C1   P2    *     *     1998  D1    *     *     0     0     2     60    0     0
4     C2   P2    *     *     1998  D2    *     *     0     0     6     110   0     0
1     C1   P1    *     *     *     *     1999  B1    0     0     0     0     5     100
2     C2   P1    *     *     *     *     1999  B1    0     0     0     0     10    200
3     C1   P2    *     *     *     *     1998  B2    0     0     0     0     4     130
If the same InfoObject of type Key Figure occurs multiple times, you have to decide whether to
add the values from the different cubes or to choose the Key Figure from one cube only. In
some scenarios adding makes sense (for example: MultiCube on country-specific Basic Cubes
with revenue data); in other scenarios (for example: actual and plan) adding is nonsense.
The best way to handle Key Figures is not to use one Key Figure InfoObject with different
semantics, such as a Key Figure QTY for ordered quantity in the Order cube and for invoiced
quantity in the Invoice cube.
Order-Cube
ONUM  CUS  PROD  ODAT  SALP  OQTY  OPRI
1     C1   P1    1998  S1    5     100
2     C2   P1    1998  S2    10    200
3     C1   P2    1997  S3    4     130
4     C2   P2    1997  S2    8     150
4     C2   P2    1998  S2    -2    -40
Delivery-Cube
ONUM  CUS  PROD  DDAT  DELP  DQTY  DPRI
1     C1   P1    1998  D2    5     100
2     C2   P1    1999  D1    7     120
2     C2   P1    1999  D2    3     80
3     C1   P2    1998  D1    2     60
4     C2   P2    1998  D2    6     110
Billing-Cube
ONUM  CUS  PROD  BDAT  BILP  BQTY  BPRI
1     C1   P1    1999  B1    5     100
2     C2   P1    1999  B1    10    200
3     C1   P2    1998  B2    4     130
Based on these Basic InfoCubes and a MultiCube defined on them, a query showing Sales and
Delivered Quantity would look like this:
This result is evaluated by sending two queries in parallel to the Order and Delivery Cubes. A
subsequent union creates the result table.
[Figure: Multi-Cube Queries on Sales, Delivery and Billing are sent to a Multi-Cube, which is split
into Basic-Cube Queries against the Sales Cube, the Delivery Cube and the Billing Cube.]
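The evaluation of a MultiCube query as sub-queries plus a union can be sketched in Python with the Order and Delivery data from the example. This is a simplified illustration, not BW's query processing: the two sub-queries run one after the other here, the grouping by order number (ONUM) is an assumption, and the function names are made up.

```python
from collections import defaultdict

# (ONUM, CUS, PROD, quantity) taken from the Order-Cube and Delivery-Cube tables
order_cube = [(1, "C1", "P1", 5), (2, "C2", "P1", 10), (3, "C1", "P2", 4),
              (4, "C2", "P2", 8), (4, "C2", "P2", -2)]
delivery_cube = [(1, "C1", "P1", 5), (2, "C2", "P1", 7), (2, "C2", "P1", 3),
                 (3, "C1", "P2", 2), (4, "C2", "P2", 6)]


def sub_query(cube):
    """Basic-Cube query: total quantity per order number."""
    totals = defaultdict(int)
    for onum, _cus, _prod, qty in cube:
        totals[onum] += qty
    return dict(totals)


def union(order_rows, delivery_rows):
    """Combine both partial results; a missing key figure stays at 0."""
    result = defaultdict(lambda: {"OQTY": 0, "DQTY": 0})
    for onum, qty in order_rows.items():
        result[onum]["OQTY"] += qty
    for onum, qty in delivery_rows.items():
        result[onum]["DQTY"] += qty
    return {onum: result[onum] for onum in sorted(result)}


result = union(sub_query(order_cube), sub_query(delivery_cube))
for onum, row in result.items():
    print(onum, row["OQTY"], row["DQTY"])
```

In this result order 3 shows an ordered quantity of 4 but a delivered quantity of only 2, which is exactly the kind of cross-process question the MultiCube is meant to answer.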
In the modeling phase the situation often appears that there are dozens of key figures (facts) like
Actual Sales / Planned Sales / Forecast Sales / Budget Sales... / Planned Units / Forecast Units ...
Furthermore Actual and Plan key figures are normally defined on different granular levels like
Actual data on Product and Daily level
Plan data on Product group and Monthly level
Question:
Shall I introduce all these key figures into the Fact Table of a single InfoCube ?
Answer:
Bearing in mind what we discussed with respect to MultiCube scenarios, it does not make sense
to create n cubes, one for each scenario.
It makes sense to think of two basic reporting scenarios and to create two cubes, one for actual
sales and one for plans, forecasts and budgets.
This takes into consideration the different granularity levels in the scenarios.
Question:
What will happen if the users want to introduce a 3-month forecast, a 6-month forecast ...?
Answer:
Think of plan, budget and forecast as values of a characteristic, for example named 'ValueType'
(abbreviated ValType), located in a separate dimension (table), for example named 'Scenario'.
ValueType replicates the remaining structure of the schema. We will then have only one key figure,
e.g. Sales Amount, which has a meaning only in conjunction with the characteristic ValueType.
These attributes are often called Partitioning Attributes and their dimension a Partitioning
Dimension.
The structure is easy to expand: if, for instance, another scenario like a 3-month forecast is
needed, this will be just a new ValType value.
Example:
CUS PROD DAT ValType QTY
C1 P1 199801 P 10
C2 P1 199801 P 10
C1 P2 199801 P 4
C2 P2 199801 P 8
C1 P1 199801 F6 80
C2 P1 199801 F6 70
C1 P2 199801 F6 30
C2 P2 199801 F6 60
It is important to remember that reporting the Sales Amount is not meaningful without specifying
the ValType (as a filter, in a restricted key figure, ...). Otherwise you would, for example, add up
plan data and forecast data.
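Using the rows of the example table, a short Python sketch shows why the key figure has to be restricted by ValType. The function total_qty is a hypothetical helper for this illustration; passing no value type mimics a query without the restriction.

```python
# (CUS, PROD, DAT, ValType, QTY) rows from the example table
rows = [
    ("C1", "P1", "199801", "P", 10), ("C2", "P1", "199801", "P", 10),
    ("C1", "P2", "199801", "P", 4), ("C2", "P2", "199801", "P", 8),
    ("C1", "P1", "199801", "F6", 80), ("C2", "P1", "199801", "F6", 70),
    ("C1", "P2", "199801", "F6", 30), ("C2", "P2", "199801", "F6", 60),
]


def total_qty(rows, value_type=None):
    """Sum QTY, restricted to one ValType unless value_type is None."""
    return sum(qty for *_, vt, qty in rows if value_type is None or vt == value_type)


plan = total_qty(rows, "P")          # restricted key figure: plan
forecast = total_qty(rows, "F6")     # restricted key figure: 6-month forecast
unrestricted = total_qty(rows)       # plan and forecast added up: not meaningful
print(plan, forecast, unrestricted)
```

A new scenario such as a 3-month forecast would only add rows with a new ValType value; neither the row layout nor the helper changes.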
Thus using both features, the MultiCube and a Partitioning Attribute, offers a good implementation:
[Figure: Plan / Actual Multi-Cube Queries are sent to a Multi-Cube, which is split into Basic-Cube
Queries against the Plan, Forecast, ... Data Cube and the Actual Data Cube.]
From a reporting point of view it can also be of interest to store an attribute of type key figure
additionally as a characteristic or as an attribute of type characteristic.
This would allow navigation on prices using, for example, External Hierarchies.
Same Characteristic several times in the Model
It may occur that you find the same characteristic several times in your BW Schema, just playing a
different role.
Example:
Sales Employee, Delivery Employee, Billing Employee
Create one InfoObject Employee. Create the other characteristics as InfoObjects that refer to
Employee.
It often makes sense to introduce Employee additionally into the schema to allow simple questions
like: show me all transactions in which a specific Employee was involved, no matter in which role.
There might be a Fact Table without a “true” fact e.g. with attendance questions (attendance
intersection entity). The same applies to human resource statistics.
Those situations could be solved introducing an artificial key figure which is always ‘1’ :
Month Course Student Attendance
199905 TABW10 Haupt 1
199905 TABW10 Brugna 1
5.8.2 Counting
Often it makes sense to additionally introduce an artificial key figure to allow easy counting. This
key figure is filled with '1' for each record during load.
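With the attendance rows from the table above, the counting key figure can be sketched in Python as follows. The load step and the grouping by month and course are illustrative assumptions; the point is only that summing the constant '1' yields a count.

```python
from collections import Counter

# Source records without any "true" fact: (Month, Course, Student)
records = [("199905", "TABW10", "Haupt"), ("199905", "TABW10", "Brugna")]

# During load every record receives the artificial key figure 1.
fact_rows = [(month, course, student, 1) for month, course, student in records]

# Counting is now an ordinary summation of the key figure.
attendance = Counter()
for month, course, _student, count in fact_rows:
    attendance[(month, course)] += count

print(attendance[("199905", "TABW10")])
```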
A typical example of a hierarchy fitting into this context is a time hierarchy with levels such as
millennium, century, decade, year, month, day, hour etc. Another typical example is a
geographic hierarchy with levels such as continent, country, state, region, city etc.
Hierarchies that can be modeled within a Dimension Table have certain properties:
The number of levels should be fixed, i.e. each path from the root to a leaf should have the same
length; each level is represented by an InfoObject.
Example: A geographic dimension with InfoObjects 0COUNTRY (country), 0REGION (region)
and 0CITY (city).
But as the BW does not know anything about parent-child relationships within Dimension Tables,
it can make sense to design even unbalanced hierarchies in a Dimension Table if the end-users
know about this behavior and can therefore choose a meaningful child attribute. Note that there
are no predefined drill down paths within a Dimension Table. (As Kimball says: the true meaning
of drilling is just adding or removing row headers.)
Due to the fact that surrogate keys are used in the Dimension Tables, it is possible to design even
'leafless' hierarchies. This situation often arises when different OLTP source systems offer data
at different attribute (Hierarchy) levels:
Fact table
Dim ID  Sales
1       10.000
2       12.000
3       25.000
4       50.000
5       40.000
Dimension table
Dim ID  Material*  Materialgroup*
1       A          beverage
2       B          sweets
3       C          beverage
4       '_'        beverage
5       '_'        sweets
1
Actually, this is the notion of a hierarchy that is used in documents related to BW.
Queries on InfoCubes that use this kind of hierarchy are generally faster than the same queries
on InfoCubes that model the same scenario with one of the two other hierarchy modeling
techniques.
The BW does not explicitly know about the hierarchical dependencies. Therefore precalculated
aggregates that summarize data over regions are not used for queries that summarize over
countries if the country is not included in that precalculated aggregate as well! Therefore
you should always include in such an aggregate the hierarchical levels that are above the level
over which data is summarized.
Example 1: If an aggregate summarizes data over 0REGION then do include 0COUNTRY in that
aggregate too.
Example 2: If an aggregate summarizes data over months then do include years, decades, ...
too!
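The rule can be pictured with a deliberately simplified Python model. It assumes that an aggregate can serve a query only if every characteristic of the query is contained in the aggregate; this is a sketch of the statement above, not BW's actual aggregate selection logic, and the function name is made up.

```python
def aggregate_usable(aggregate_chars, query_chars):
    """Simplified check: all query characteristics must be in the aggregate."""
    return set(query_chars) <= set(aggregate_chars)


# BW does not know that a region determines its country, so an aggregate
# on 0REGION alone cannot answer a query that summarizes over 0COUNTRY.
region_only = aggregate_usable({"0REGION"}, {"0COUNTRY"})
region_and_country = aggregate_usable({"0REGION", "0COUNTRY"}, {"0COUNTRY"})
print(region_only, region_and_country)
```

Including the higher levels costs no additional aggregate rows as long as the hierarchy is strict, because each region belongs to exactly one country.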
The reporting aspects of this technique are:
The BW does not explicitly know about the hierarchical dependencies. Therefore there is no
predefined drill down path with this hierarchy design.
This case is very similar to the one discussed in section 5.9.1. The difference is the increased
flexibility (i.e. realignment facilities) that comes with Navigational Attributes. The hierarchy should still
have a fixed number of levels. However, changes to that hierarchy (i.e. changes to attribute values)
can be easily applied to facts that are already loaded into a cube.
A typical example is the hierarchy of sales office – sales group – sales person. This hierarchy has a
fixed number of levels but is frequently reorganized.
From a performance perspective this is the least attractive hierarchy modeling technique.
A typical example is a cost center hierarchy in which several (sub-)cost centers belong to one cost
center, which itself belongs to another cost center, and so on. Such a hierarchy has no fixed number
of levels, as cost centers usually correspond to departments or groups within a company which might
be reorganized into new subgroups. Thus new levels might be introduced, old ones might disappear,
and the hierarchy might be deeper at one end (due to a deeper hierarchical organization) and
shallower at the other end.
Another major advantage of external hierarchies over their alternatives is that an InfoObject can
have several such hierarchies and all of them can be used within the same cube. The same effect
could only be achieved through awkward work-arounds when using the alternative approaches.
The performance issues connected to this type of hierarchy are the following:
These hierarchies usually perform worse than those modeled within dimensions.
They usually perform at least as well as the hierarchies based on Navigational Attributes.
Problems can arise for big external hierarchies with many thousands of nodes and leaves. In that
case it might be better to consider one of the two alternatives.
We have the following types of external Hierarchies:
Versions and / or time dependency of the whole external hierarchy structure (DateTo,
DateFrom): precalculated aggregates are possible at each level, even for specific node
values.
Or (exclusive) time dependency for each external hierarchy node (time dependent structure):
no precalculated aggregates are possible.