0% found this document useful (0 votes)
30 views14 pages

Bi Unit 2

This document discusses key concepts in dimensional data modeling including fact tables, dimensions, grain, additive and non-additive facts, slowly changing dimensions, clickstream data, multivalued dimensions, and dimension attributes. Fact tables contain numeric measurements while dimension tables contain descriptive attributes. Together they define the structure for analyzing business processes and events.

Uploaded by

Jagdish Kapadnis
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
30 views14 pages

Bi Unit 2

This document discusses key concepts in dimensional data modeling including fact tables, dimensions, grain, additive and non-additive facts, slowly changing dimensions, clickstream data, multivalued dimensions, and dimension attributes. Fact tables contain numeric measurements while dimension tables contain descriptive attributes. Together they define the structure for analyzing business processes and events.

Uploaded by

Jagdish Kapadnis
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 14

Star Schema

Snow flake schema


Fact constellation schema
What is a Fact Table?

• Fact tables contain the data corresponding to


a particular business process.
• Each row represents a single event associated
with that process and contains the
measurement data associated with that event.
• The information contained within a fact table
is typically numeric data.
What are Dimensions?

• Dimensions describe the objects involved in a


business intelligence effort. While facts
correspond to events, dimensions correspond
to people, items, or other objects.
• Dimension tables contain details about each
instance of an object.
Fact table Vs dimension table
• Fact tables and dimension tables are related
to each other.
• If we take example of retail model, the fact
table for a customer transaction would likely
contain a foreign key reference to the item
dimension table, where the entry corresponds
to a primary key in that table for a record
describing the item purchased.
Dimensional Modeling - Grain

• Grain: The grain describe the level of detail.


• It can be also seen as:
• the unique key of a table of a SQL Statement of the unique
identifier of a hierarchy level in a dimension.

At the top level, there are two main options in choosing the level
of granularity:
• Unsummarized/Atomic (transaction level granularity): this is
the highest level of granularity where each fact table row
corresponds to a single transaction or line item
• Summarized: transactions may be summarized by a subset of
dimensions or dimensional attributes. In this case, each row in
the fact table corresponds to multiple transactions
Dimensional Modeling - Grain
• The most granular or atomic data (atomic as an indivisible unit of work)
has the most dimensionality. Atomic data is highly dimensional.
Preferably, you should develop dimensional models for the most atomic
information captured by a Event. Atomic data is the most detailled
information collected: such data cannot be subdivided further.
Example: If a high grain is the month whereas a low or detail grain can
be the day
• A data warehouse almost always demands data to the lowest possible
grain of each dimension not because queries want to see individual low
level rows but because queries need to cut through the details in very
precise ways.
• The lower the level of granularity (or conversely, the higher the level of
summarization), the less storage space required and the faster queries
will be executed.
Additive, Semi-Additive, and Non-Additive Facts

• The numeric measures in a fact table fall into three categories.


The most flexible and useful facts are fully additive; additive
measures can be summed across any of the dimensions
associated with the fact table. Semi-additive measures can be
summed across some dimensions, but not all; balance amounts
are common semi-additive facts because they are additive across
all dimensions except time. Finally, some measures are
completely non-additive, such as ratios. A good approach for
non-additive facts is, where possible, to store the fully additive
components of the non-additive measure and sum these
components into the final answer set before calculating the final
non-additive fact. This final calculation is often done in the BI
layer or OLAP cube.
Slowly Changing Dimensions(SCD)
• dimensions have been assumed to be independent
of time. Unfortunately, this is not the case in the
real world.
• dimension table attributes are relatively static, they
are not fixed forever.
• Dimension attributes change, though rather slowly,
over time. Dimensional designers must engage
business representatives proactively to help
determine the appropriate change-handling
strategy.
Clickstream Data
• Clickstream data is an information trail a user
leaves behind while visiting a website.
• It is typically captured in semi-structured
website log files.
• These website log files contain data elements
such as a date and time stamp, the visitor's IP
address, the destination URLs of the pages
visited, and a user ID that uniquely identifies
the website visitor.
Clickstream Data logs
Multivalued Dimensions
• An open-ended many-valued attribute can be associated with a
dimension row by using a bridge table to associate the many-
valued attributes with the dimension.
Example:
In some financial services companies, the individual customer is
identified and associated with each transaction. For example,
credit card companies often issue unique card numbers to each
cardholder. John and Mary Smith may have a joint credit card
account, but the numbers on their respective pieces of plastic are
unique. In this case there is no need for an account-to-customer
bridge table because the atomic transaction facts are at the
discrete customer grain. Account and customer would both be
foreign keys in this fact table.
Dimension attributes.
• Dimension tables are integral companions to a fact table. The dimension
tables contain the textual descriptors of the business.
• In a well-designed dimensional model, dimension tables have many
columns or attributes. These attributes describe the rows in the dimension
table.
• Dimension attributes serve as the primary source of query constraints,
groupings, and report labels. In a query or report request, attributes are
identified as the by words.
• For example, when a user states that he or she wants to see dollar sales by
week by brand, week and brand must be available as dimension attributes.
• Dimension table attributes play a vital role in the data warehouse. Since
they are the source of virtually all interesting constraints and report labels,
they are key to making the data warehouse usable and understandable.
• The data warehouse is only as good as the dimension attributes. The power
of the data warehouse is directly proportional to the quality and depth of
the dimension attributes.

You might also like