Session-9 Final Notes PRM 45

The document discusses data warehouse schemas, including Star, Snowflake, and Fact Constellation schemas, highlighting their structures and normalization processes. It also introduces OLAP technology, emphasizing its advantages for business intelligence through operations like slicing, dicing, pivoting, roll-up, and drill-down. Additionally, it outlines the benefits of data warehousing, such as improved data quality and faster decision-making capabilities.

Uploaded by

Chaitanya Joshi
Copyright
© All Rights Reserved

MIS

Session-9
Kushal Anjaria and Akshay Aggarwal

Data Warehouse schemas

A schema is a logical description of the entire database. It includes the name and description of records of all record types, including all associated data items and aggregates. Much like a database, a data warehouse also requires maintaining a schema. A database uses a relational model, while a data warehouse uses the Star, Snowflake, and Fact Constellation schemas. This chapter discusses the schemas used in a data warehouse.

Star Schema

• Each dimension in a star schema is represented with only one dimension table.
• This dimension table contains the set of attributes.
• The following diagram shows the sales data of a company with respect to the four dimensions, namely time, item, branch, and location.
• There is a fact table at the center. It contains the keys to each of the four dimensions.
• The fact table also contains the attributes, namely dollars sold and units sold.
• Each dimension has only one dimension table, and each table holds a set of attributes. For example, the location dimension table contains the attribute set {location_key, street, city, province_or_state, country}. This constraint may cause data redundancy. For example, "Vancouver" and "Victoria" are both cities in the Canadian province of British Columbia. The entries for such cities may cause data redundancy along the attributes province_or_state and country.

Snowflake Schema

• Some dimension tables in the Snowflake schema are normalized.
• The normalization splits up the data into additional tables.
• Unlike the Star schema, the dimension tables in a snowflake schema are normalized. For example, the item dimension table in the star schema is normalized and split into two dimension tables, namely the item and supplier tables.
• Now the item dimension table contains the attributes item_key, item_name, type, brand, and supplier_key.
• The supplier key is linked to the supplier dimension table. The supplier dimension table contains the attributes supplier_key and supplier_type.
• Note: Due to normalisation in the Snowflake schema, redundancy is reduced, which makes the data easier to maintain and saves storage space.

Fact Constellation Schema/Galaxy Schema/Hybrid Schema

• A fact constellation has multiple fact tables. It is also known as a galaxy schema.
• The following diagram shows two fact tables, namely sales and shipping.
• The sales fact table is the same as that in the star schema.
• The shipping fact table has five dimensions, namely item_key, time_key, shipper_key, from_location, and to_location.
• The shipping fact table also contains two measures: dollars sold and units sold.
• It is also possible to share dimension tables between fact tables. For example, the time, item, and location dimension tables are shared between the sales and shipping fact tables.

In the previous session, we presented the concept of OLTP. Next, we will understand OLAP.

OLAP:

OLAP (Online Analytical Processing) is the technology behind many Business Intelligence (BI) applications. OLAP is a powerful data discovery technology, including limitless report viewing capabilities, complex analytical calculations, and predictive “what if” scenario (budget, forecast) planning.

Advantages of OLAP:

Knowledge is the foundation of all successful decisions. Successful businesses continuously plan, analyse and report on sales and operational activities to maximise efficiency, reduce expenditures and gain greater market share. Statisticians will tell you that the more sample data you have, the more likely the resulting statistic will be accurate. Naturally, the more data a company can access about a specific activity, the more likely the plan to improve that activity will be effective. All businesses collect data using many different systems, and the challenge remains: how to get all the data together to create accurate, reliable, fast information about the business. A company that can take advantage of this data and turn it into shared knowledge accurately and quickly will surely be better positioned to make successful business decisions and rise above the competition.

OLAP technology has been defined as achieving “fast access to shared multidimensional information.” Given OLAP technology’s ability to create very fast aggregations and calculations of underlying data sets, one can understand its usefulness in helping business leaders make better, quicker, “informed” decisions.

Using OLAP, the following operations can be performed:

Pivot is the technique of changing from one dimensional orientation to another. It is also known as rotation.

Before pivoting, the original data cube is shown in the figure below:

After pivoting, the block would be changed. The status of the data block after pivoting is visible in the figure below:

Slice and Dice: In slicing and dicing, cross-tabulation is done for specific values of one or more dimensions rather than for all values of every dimension.

The data cube before slice and dice is shown in the figure below:

The dice operation selects a sub-cube from the OLAP cube by selecting two or more dimensions. In the cube given in the overview section, a sub-cube is selected by selecting the following dimensions with criteria:

• Location = “Delhi” or “Kolkata”
• Time = “Q1” or “Q2”
• Item = “Car” or “Bus”

After the dice operation, the block would change, and the changed block is shown in the figure below:
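
The dice and pivot operations described above can be tried on a toy example. The following is only a minimal sketch, not part of the session material: it builds a small star schema in an in-memory SQLite database, and every table name, column name and sales figure in it (time_dim, item_dim, location_dim, sales_fact, units_sold) is an assumption made for illustration. The dice uses the same three criteria listed above, and the pivot is shown by swapping the row and column dimensions of a cross-tabulation.

```python
import sqlite3

# Hypothetical miniature star schema: one fact table keyed to three
# dimension tables. All names and figures are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    units_sold   INTEGER
);
INSERT INTO time_dim VALUES (1,'Q1'),(2,'Q2'),(3,'Q3');
INSERT INTO item_dim VALUES (1,'Car'),(2,'Bus'),(3,'Truck');
INSERT INTO location_dim VALUES (1,'Delhi'),(2,'Kolkata'),(3,'Mumbai');
INSERT INTO sales_fact VALUES
    (1,1,1,10),(1,2,1,4),(2,1,1,2),(2,1,2,7),
    (3,1,1,9),(1,3,3,5),(2,2,3,6),(2,2,2,3);
""")

# Join the fact table to its dimensions so rows carry readable labels.
BASE = """
SELECT l.city, t.quarter, i.item_name, f.units_sold
FROM sales_fact f
JOIN time_dim t     ON f.time_key = t.time_key
JOIN item_dim i     ON f.item_key = i.item_key
JOIN location_dim l ON f.location_key = l.location_key
"""

def dice(conn):
    """Dice: keep only the listed values on three dimensions."""
    sql = BASE + """
    WHERE l.city IN ('Delhi','Kolkata')
      AND t.quarter IN ('Q1','Q2')
      AND i.item_name IN ('Car','Bus')
    ORDER BY l.city, t.quarter, i.item_name
    """
    return conn.execute(sql).fetchall()

def crosstab(rows, row_dim, col_dim):
    """Sum units_sold into {row: {col: total}}; swapping the two
    dimension arguments rotates (pivots) the same data."""
    names = {"city": 0, "quarter": 1, "item": 2}
    r, c = names[row_dim], names[col_dim]
    table = {}
    for row in rows:
        cell = table.setdefault(row[r], {})
        cell[row[c]] = cell.get(row[c], 0) + row[3]
    return table

diced = dice(conn)
by_city = crosstab(diced, "city", "quarter")     # cities as rows
by_quarter = crosstab(diced, "quarter", "city")  # pivoted: quarters as rows
```

Here the dice is an ordinary WHERE clause over the joined dimensions, while the pivot changes nothing in the data and only changes which dimension is laid out as rows and which as columns.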
The slice operation selects a single value along one dimension of the OLAP cube, creating a new sub-cube. In the cube in the overview section, the slice is performed on the dimension Time = “Q1”.

After the slice operation, the data would be as follows:
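
The slice with Time = “Q1” can be illustrated in a few lines. This is a hedged sketch under assumed data: the cube is modelled as a plain dictionary keyed by (location, time, item), and the cell values and the helper name slice_cube are hypothetical, not taken from the figures in the notes.

```python
# Hypothetical cube cells: (location, time, item) -> units sold.
cube = {
    ("Delhi", "Q1", "Car"): 10,
    ("Delhi", "Q1", "Bus"): 4,
    ("Delhi", "Q2", "Car"): 2,
    ("Kolkata", "Q1", "Car"): 6,
    ("Kolkata", "Q2", "Bus"): 3,
}
DIMS = ("location", "time", "item")

def slice_cube(cube, dim, value):
    """Fix one dimension to a single value and drop it from the keys,
    leaving a sub-cube with one dimension fewer."""
    pos = DIMS.index(dim)
    return {
        key[:pos] + key[pos + 1:]: measure
        for key, measure in cube.items()
        if key[pos] == value
    }

q1 = slice_cube(cube, "time", "Q1")
```

The result keeps only the Q1 cells and is indexed by (location, item), which is why a slice of a three-dimensional cube can be drawn as a two-dimensional table.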

Roll-up and Drill-down: Roll-up is the operation that converts data of finer granularity to coarser granularity with the help of aggregation.

Drill-down is the conversion of coarser to finer granularity.

First, the normal data cube is demonstrated in the figure below:

After that, the drill-down operation is applied. After the drill-down operation, the data cube will be as follows:

After applying the roll-up operation, the data cube is shown in the figure below:

Next, we will study the data warehousing concepts:

• A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data. This data helps analysts to take informed decisions in an organisation.
• Using data warehousing, keeping data summarised at different levels of granularity is efficient.
• The geographical and structural arrangements must be reflected while establishing data warehousing principles with appropriate technological adoption.
• It requires a fitness exercise between the organisation, systems and technology layers.

Benefits of data warehousing:
Organisations that use a data warehouse to assist their
analytics and business intelligence see several substantial
benefits:

Better data: Adding data sources to a data warehouse enables organisations to ensure that they are collecting consistent and relevant data from that source. They don’t need to wonder whether the data will be inaccessible or inconsistent as it comes into the system. This ensures higher data quality and data integrity for sound decision-making.

Faster decisions: Data in a warehouse is in such consistent formats that it is ready to be analysed. It also provides the analytical power and a complete dataset to base decisions on hard facts. Therefore, decision-makers no longer need to rely on hunches or on incomplete or poor-quality data and risk delivering slow and inaccurate results.
