DWDM UNIT-1 Lecture Notes
Introduction to Data warehouse:
A data warehouse is a repository of information collected from multiple sources, stored under a
unified schema, and that usually resides at a single site. Data warehouses are constructed via a
process of data cleaning, data integration, data transformation, data loading, and periodic data
refreshing. The figure below shows the typical framework for the construction and use of a data
warehouse, using an AllElectronics sales warehouse as the example.
Time-variant: Data are stored to provide information from a historical perspective (e.g., the past
5–10 years). Every key structure in the data warehouse contains, either implicitly or explicitly,
an element of time.
Non-volatile: A data warehouse is always a physically separate store of data, transformed from
the application data found in the operational environment, and is permanent in nature.
Operational Database Systems: The major task of on-line operational database systems is to
perform on-line transaction and query processing. These systems are called on-line transaction
processing (OLTP) systems. They cover most of the day-to-day operations of an organization.
Data warehouse systems: The major task of Data warehouse systems is to serve users or
knowledge workers in the role of data analysis and decision making. These systems are known as
on-line analytical processing (OLAP) systems.
The major distinguishing features between OLTP and OLAP are summarized as follows:
Feature           OLTP                                  OLAP
Characteristic    operational processing                informational processing
Orientation       transaction                           analysis
User              clerk, DBA, database professional     knowledge worker (e.g., manager, executive, analyst)
Function          day-to-day operations                 long-term informational requirements, decision support
DB design         ER based, application-oriented        star/snowflake, subject-oriented
Data              current; guaranteed up-to-date        historical; accuracy maintained over time
Summarization     primitive, highly detailed            summarized, consolidated
View              detailed, flat relational             summarized, multidimensional
Unit of work      short, simple transaction             complex query
Access            read/write                            mostly read
Focus             data in                               information out
Operations        index/hash on primary key             lots of scans
Records accessed  tens                                  millions
Number of users   thousands                             hundreds
DB size           100 MB to GB                          100 GB to TB
Priority          high performance, high availability   high flexibility, end-user autonomy
Metric            transaction throughput                query throughput, response time
Bottom Tier - The bottom tier of the architecture is the data warehouse database server. It is the
relational database system. Back end tools and utilities are used to feed data into the bottom tier
from operational databases or other external sources. These back-end tools and utilities perform
the extraction, cleaning, and transformation functions, as well as the load and refresh functions used to update the
data warehouse. This tier also contains a metadata repository, which stores information about the
data warehouse and its contents.
Middle Tier: The middle tier is an OLAP server, typically implemented using either a relational
OLAP (ROLAP) model or a multidimensional OLAP (MOLAP) model.
Top-Tier - This tier is the front-end client layer. This layer holds the query tools and reporting
tools, analysis tools and data mining tools.
Enterprise warehouse: An enterprise warehouse collects all of the information about subjects
spanning the entire organization. It provides corporate-wide data integration, usually from one or
more operational systems or external information providers, and is cross-functional in scope.
Data mart: A data mart contains a subset of corporate-wide data that is of value to a specific
group of users. The scope is confined to specific selected subjects. For example, a marketing data
mart may confine its subjects to customer, item, and sales. The data contained in data marts tend
to be summarized.
Virtual warehouse: A virtual warehouse is a set of views over operational databases. For
efficient query processing, only some of the possible summary views may be materialized. A
virtual warehouse is easy to build but requires excess capacity on operational database servers.
Data Extraction: Typically gathers data from multiple, heterogeneous, and external sources.
The main objective of the extract step is to retrieve all the required data from the source system
using as few resources as possible.
Data cleaning: The cleaning step is one of the most important, as it ensures the quality of the
data in the data warehouse: it detects errors in the data and rectifies them.
Data transformation: The transform step applies a set of rules to transform the data from the
source to the target. This converts data from legacy or host format to warehouse format.
Load: The load step sorts, summarizes, consolidates, computes views, checks integrity, and
builds indices and partitions.
Refresh: The refresh step propagates updates from the data sources to the warehouse.
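As a concrete illustration, the extraction, cleaning, transformation, and load steps above can be sketched in Python. This is a minimal sketch; the source records, field names, and cleaning rules are invented for the example:

```python
# Minimal ETL sketch: extract from two hypothetical operational sources,
# clean, transform to a common warehouse format, and load.

def extract(sources):
    """Gather raw records from multiple heterogeneous sources."""
    return [row for source in sources for row in source]

def clean(rows):
    """Detect and rectify errors: drop records with missing keys,
    normalize inconsistent country codes."""
    fixes = {"USA": "US", "U.S.": "US"}
    cleaned = []
    for row in rows:
        if row.get("item_id") is None:
            continue  # reject records that cannot be keyed
        row["country"] = fixes.get(row["country"], row["country"])
        cleaned.append(row)
    return cleaned

def transform(rows):
    """Convert source (legacy) format to warehouse format:
    cents -> dollars, rename fields."""
    return [
        {"item_id": r["item_id"],
         "country": r["country"],
         "dollars_sold": r["amount_cents"] / 100}
        for r in rows
    ]

def load(rows):
    """Summarize and consolidate while loading: total sales per
    (item_id, country) plays the role of a precomputed view."""
    summary = {}
    for r in rows:
        key = (r["item_id"], r["country"])
        summary[key] = summary.get(key, 0) + r["dollars_sold"]
    return summary

# Two operational sources with slightly inconsistent data.
source_a = [{"item_id": 1, "country": "USA", "amount_cents": 250000}]
source_b = [{"item_id": 1, "country": "US", "amount_cents": 150000},
            {"item_id": None, "country": "US", "amount_cents": 99}]

warehouse = load(transform(clean(extract([source_a, source_b]))))
print(warehouse)  # {(1, 'US'): 4000.0}
```

Note how the inconsistent country codes from the two sources are consolidated into a single warehouse key during cleaning, so the load step can correctly sum across them.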
Metadata Repository:
Metadata are data about data. Metadata are the data that define warehouse objects.
A metadata repository should contain the following:
A description of the structure of the data warehouse, which includes the warehouse
schema, view, dimensions, hierarchies, and derived data definitions, as well as data mart
locations and contents.
Operational metadata, which include data lineage (history of migrated data and the
sequence of transformations applied to it), currency of data (active, archived, or purged),
and monitoring information (warehouse usage statistics, error reports, and audit trails).
The algorithms used for summarization, which include measure and dimension definition
algorithms, data on granularity, partitions, subject areas, aggregation, summarization, and
predefined queries and reports.
The mapping from the operational environment to the data warehouse, which includes
source databases and their contents, gateway descriptions, data partitions, data extraction,
cleaning, transformation rules and defaults, data refresh and purging rules, and security.
Data related to system performance, which include indices and profiles that improve data
access and retrieval performance, in addition to rules for the timing and scheduling of
refresh, update, and replication cycles.
Business metadata, which include business terms and definitions, data ownership
information, and charging policies.
Facts are numerical measures. A multidimensional data model is typically organized around a
central theme, this theme is represented by a fact table. Examples of facts for a sales data
warehouse include dollars sold (sales amount in dollars), and units sold (number of units sold).
The fact table contains the names of the facts, or measures, as well as keys to each of the related
dimension tables.
Figure: A 3-D data cube representation of the data in Table 3.3, according to the dimensions
time, item, and location. The measure displayed is dollars sold (in thousands).
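A fact table of this kind can be sketched as follows; the rows and figures are illustrative, and `cube_cell` shows how a single cell of the 3-D cube over (time, item, location) is derived from the facts:

```python
# Sketch of a sales fact table: each row holds keys into the time, item,
# and location dimensions, plus the numeric measures dollars_sold and
# units_sold. (The specific rows and values are invented.)

fact_sales = [
    # (time_key, item_key,             location_key, dollars_sold, units_sold)
    ("Q1", "home entertainment", "Vancouver", 605, 825),
    ("Q1", "computer",           "Vancouver", 825, 14),
    ("Q2", "home entertainment", "Vancouver", 680, 952),
    ("Q1", "computer",           "Toronto",   968, 38),
]

def cube_cell(facts, time, item, location):
    """Return total dollars_sold for one cell of the 3-D cube."""
    return sum(d for t, i, loc, d, u in facts
               if t == time and i == item and loc == location)

print(cube_cell(fact_sales, "Q1", "computer", "Vancouver"))  # 825
```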
Star schema: The most common modeling paradigm is the star schema, in which the data
warehouse contains (1) a large central table (fact table) containing the bulk of the data, with no
redundancy, and (2) a set of smaller attendant tables (dimension tables), one for each dimension.
The schema graph resembles a starburst, with the dimension tables displayed in a radial pattern
around the central fact table.
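A minimal sketch of a star schema, assuming hypothetical surrogate keys and table contents: one central fact table whose rows reference a small dimension table per dimension, recombined by a "star join":

```python
# Star schema sketch: one dimension table per dimension, keyed by
# surrogate keys, around a central fact table. (Data is hypothetical.)

dim_time = {1: {"quarter": "Q1", "year": 2024}}
dim_item = {10: {"name": "computer", "brand": "Acme"}}
dim_location = {100: {"city": "Vancouver", "country": "Canada"}}

fact_sales = [
    {"time_key": 1, "item_key": 10, "location_key": 100,
     "dollars_sold": 825, "units_sold": 14},
]

def denormalize(fact):
    """Join one fact row with its three dimensions (a 'star join')."""
    return {**dim_time[fact["time_key"]],
            **dim_item[fact["item_key"]],
            **dim_location[fact["location_key"]],
            "dollars_sold": fact["dollars_sold"],
            "units_sold": fact["units_sold"]}

row = denormalize(fact_sales[0])
print(row["city"], row["dollars_sold"])  # Vancouver 825
```

The fact table stays narrow (keys plus measures) while descriptive attributes live once in the dimension tables, which is the redundancy-free layout the schema describes.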
Snowflake schema: The snowflake schema is a variant of the star schema model, where some
dimension tables are normalized, thereby further splitting the data into additional tables. The
resulting schema graph forms a shape similar to a snowflake.
The major difference between the snowflake and star schema models is that the
dimension tables of the snowflake model may be kept in normalized form to reduce
redundancies.
Example: Snowflake schema of a data warehouse for sales.
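The normalization can be sketched as follows (hypothetical data): supplier attributes that a star-style item dimension would repeat per item are split into their own table, removing the redundancy at the cost of an extra join at query time:

```python
# Snowflake sketch: the item dimension is normalized by splitting
# supplier attributes into a separate table. (Data is hypothetical.)

# Star form: supplier attributes are repeated in every item row.
dim_item_star = {
    10: {"name": "computer", "supplier": "SupCo", "supplier_type": "wholesale"},
    11: {"name": "printer",  "supplier": "SupCo", "supplier_type": "wholesale"},
}

# Snowflake form: item rows keep only a supplier_key.
dim_supplier = {7: {"supplier": "SupCo", "supplier_type": "wholesale"}}
dim_item_snow = {
    10: {"name": "computer", "supplier_key": 7},
    11: {"name": "printer",  "supplier_key": 7},
}

def item_with_supplier(item_key):
    """The extra join needed in the snowflake form to recover all
    attributes of an item."""
    item = dim_item_snow[item_key]
    return {**{k: v for k, v in item.items() if k != "supplier_key"},
            **dim_supplier[item["supplier_key"]]}

# The join reconstructs exactly the denormalized star-form row.
print(item_with_supplier(10) == dim_item_star[10])  # True
```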
Fact constellation: Sophisticated applications may require multiple fact tables to share
dimension tables. This kind of schema can be viewed as a collection of stars, and hence is called
a galaxy schema or a fact constellation.
Factless Fact Table
A fact table is a collection of facts and measures, with multiple keys joined to one or more
dimension tables; its facts are typically numeric and additive. A factless fact table is
different: it is a fact table that contains no facts, only dimensional keys. It captures events
that happen at the information level but are not included in any calculation. A factless fact
table thus records the many-to-many relationships between dimensions while containing no numeric
or textual facts; it is often used to record events or coverage information, or to track a
process and collect statistics. It is called "factless" because the table has no aggregatable
numeric values.
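A sketch of a factless fact table, with invented keys: the table only records that an event occurred (here, a promotion shown to a customer on a date), so the only available "measure" is a row count over the dimensional keys:

```python
# Factless fact table sketch: rows hold only dimension keys, no numeric
# measure. (The keys and event data are hypothetical.)

factless_promotion = [
    # (date_key, customer_key, promotion_key)
    ("2024-01-05", "C1", "P1"),
    ("2024-01-05", "C2", "P1"),
    ("2024-01-06", "C1", "P2"),
]

def event_count(rows, **filters):
    """Stats come from counting rows - effectively COUNT(*) with a
    WHERE clause over the dimensional keys."""
    names = ("date_key", "customer_key", "promotion_key")
    return sum(1 for row in rows
               if all(row[names.index(k)] == v for k, v in filters.items()))

print(event_count(factless_promotion, promotion_key="P1"))  # 2
```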
Drill-down: Drill-down is the reverse of roll-up (which aggregates data by climbing up a concept
hierarchy for a dimension or by dimension reduction). It navigates from less detailed data to
more detailed data, either by stepping down a concept hierarchy for a dimension or by introducing
additional dimensions.
Slice: The slice operation performs a selection on one dimension of the given cube, resulting in
a subcube.
Dice: The dice operation defines a subcube by performing a selection on two or more dimensions.
Ex: Dice for (location = “Toronto” or “Vancouver”) and (time = “Q1” or “Q2”) and (item =
“home entertainment” or “computer”).
Pivot (rotate): Pivot (also called rotate) is a visualization operation that rotates the data axes in
view in order to provide an alternative presentation of the data.
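The slice and dice operations can be sketched as selections over the cells of a cube keyed by (time, item, location). The cell values below are illustrative, and the dice reproduces the Toronto/Vancouver example from the text:

```python
# Slice and dice as selections over cube cells keyed by
# (time, item, location); values are dollars_sold (illustrative).

cube = {
    ("Q1", "computer",           "Toronto"):   968,
    ("Q1", "home entertainment", "Toronto"):   605,
    ("Q2", "computer",           "Toronto"):   746,
    ("Q1", "computer",           "Vancouver"): 825,
    ("Q3", "phone",              "New York"):  400,
}

def slice_op(cube, dim_index, value):
    """Slice: select on ONE dimension, yielding a subcube."""
    return {k: v for k, v in cube.items() if k[dim_index] == value}

def dice_op(cube, time_set, item_set, location_set):
    """Dice: select on two or more dimensions at once."""
    return {(t, i, loc): v for (t, i, loc), v in cube.items()
            if t in time_set and i in item_set and loc in location_set}

print(len(slice_op(cube, 0, "Q1")))  # 3 cells where time = Q1

# The text's example: (location = Toronto or Vancouver) and
# (time = Q1 or Q2) and (item = home entertainment or computer).
sub = dice_op(cube, {"Q1", "Q2"},
              {"home entertainment", "computer"},
              {"Toronto", "Vancouver"})
print(len(sub))  # 4 cells survive the dice
```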
Other OLAP operations: Some OLAP systems offer additional drilling operations. For
example, Drill-Across executes queries involving (i.e., across) more than one fact table. The
Drill-Through operation uses relational SQL facilities to drill through the bottom level of a data
cube down to its back-end relational tables.
Relational OLAP (ROLAP) Servers: These are the intermediate servers that stand in between a
relational back-end server and client front-end tools. They use a relational or extended-relational
DBMS to store and manage warehouse data, and OLAP middleware to support missing pieces.
ROLAP servers include optimization for each DBMS back end, implementation of aggregation
navigation logic, and additional tools and services. ROLAP technology tends to have greater
scalability than MOLAP technology.
Multidimensional OLAP (MOLAP) servers: These servers support multidimensional views of data
through array-based multidimensional storage engines, mapping multidimensional views directly to
data cube array structures for fast access to precomputed summarized data.
Hybrid OLAP (HOLAP) servers: The hybrid OLAP approach combines ROLAP and MOLAP
technology, benefiting from the greater scalability of ROLAP and the faster computation of
MOLAP. For example, a HOLAP server may allow large volumes of detail data to be stored in a
relational database, while aggregations are kept in a separate MOLAP store.
Specialized SQL servers: To meet the growing demand of OLAP processing in relational
databases, some database system vendors implement specialized SQL servers that provide
advanced query language and query processing support for SQL queries over star and snowflake
schemas in a read-only environment.