Data Warehouse Development Approach

The document discusses different approaches to developing a data warehouse, including top-down vs bottom-up and big bang vs incremental. It describes Inmon and Kimball models and notes there is no single best approach. Top-down starts at the enterprise level and risks failure, while bottom-up implements smaller data marts first for quicker results but risks redundancy. Dimensional modeling, requirements definition, and considerations for storage and performance are also covered.

Uploaded by

Viktor Suwiyanto

Data Warehouse Development Approaches


Fundamental Questions
Before deciding to build a data warehouse for your organization, you need to ask the following basic and fundamental questions and address the relevant issues:

Top-down or bottom-up approach?

Enterprise-wide or departmental?

Which first: data warehouse or data mart?

Build a pilot or go with a full-fledged implementation?

Dependent or independent data marts?

Data Warehouse Development Approaches

Inmon Model: EDW approach

Kimball Model: Data mart approach

Which model is better?

There is no one-size-fits-all strategy for data warehousing.

One alternative is the hosted warehouse.

General Data Warehouse Development Approaches

Big bang approach

Incremental approach:

Top-down incremental approach

Bottom-up incremental approach

ISQS 6339, Data Mgmt & BI, Zhangxi Lin

Big Bang Approach

Analyze enterprise requirements

Build enterprise data warehouse

Report in subsets or store in data marts


Incremental Approach to Warehouse Development

Multiple iterations

Shorter implementations

Validation of each phase

[Figure: Increment 1 passes through the phases Strategy, Definition, Analysis, Design, Build, and Production; the diagram labels the cycle as iterative.]

Top-Down Approach
Analyze requirements at the enterprise level

Develop conceptual information model

Identify and prioritize subject areas

Complete a model of selected subject area

Map to available data

Perform a source system analysis

Implement base technical architecture

Establish metadata, extraction, and load processes for the initial subject area

Create and populate the initial subject area data mart within the overall warehouse framework

Top-Down
The advantages of this approach are:

A truly corporate effort, an enterprise view of data

Inherently architected, not a union of disparate data marts

Single, central storage of data about the content

Centralized rules and control

May see quick results if implemented with iterations

The disadvantages are:

Takes longer to build even with an iterative method

High exposure to the risk of failure

Needs high level of cross-functional skills

High outlay without proof of concept

Bottom-Up Approach
Define the scope and coverage of the data warehouse and analyze the source systems within this scope

Define the initial increment based on political pressure, assumed business benefit, and data volume

Implement base technical architecture and establish metadata, extraction, and load processes as required by the increment

Create and populate the initial subject areas within the overall warehouse framework

Bottom-Up
The advantages of this approach are:

Faster and easier implementation of manageable pieces

Favorable return on investment and proof of concept

Less risk of failure

Inherently incremental; can schedule important data marts first

Allows project team to learn and grow

The disadvantages are:

Each data mart has its own narrow view of data

Permeates redundant data in every data mart

Perpetuates inconsistent and irreconcilable data

Proliferates unmanageable interfaces

Dimensional Modeling Process

High-level dimensional model design:

Choosing the business process

Declaring the grain

Choosing dimensions

Identifying the facts

Detailed dimensional model development

Dimensional model review and validation, with IS, core users, and the business community

Final design iteration
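
The design choices listed above (grain, dimensions, facts) can be illustrated with a small star schema. This is a minimal sketch, assuming a hypothetical retail sales subject area; the names fact_sales, dim_date, and dim_product and all sample values are illustrative and do not come from the slides.

```python
import sqlite3

# Hypothetical example: the subject is retail sales,
# and the grain is one fact row per product per day.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY,
        full_date TEXT,
        month TEXT,
        year INTEGER
    );
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        category TEXT
    );
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        sales_amount REAL
    );
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)", [
    (20240101, "2024-01-01", "January", 2024),
    (20240201, "2024-02-01", "February", 2024),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Gadget", "Hardware"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (20240101, 1, 2, 20.0),
    (20240101, 2, 1, 15.0),
    (20240201, 1, 3, 30.0),
])

# Facts are aggregated along a dimension attribute (here: month).
sales_by_month = dict(conn.execute("""
    SELECT d.month, SUM(f.sales_amount)
    FROM fact_sales f JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.month
""").fetchall())
```

The fact table holds only keys and numeric measures; descriptive attributes such as month and category live in the dimension tables.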


Supplemental Slides : Data Warehouse Design Phases

Defining the Business Requirements

The concept of business dimensions is fundamental to the requirements definition for a data warehouse.

Information package
Your primary goal in the requirements definition phase is to compile information packages.

Once you have firmed up the information packages, you'll be able to proceed to the other phases. Essentially, information packages enable you to:

Define the common subject areas

Design key business metrics

Decide how data must be presented

Determine how users will aggregate or roll up

Decide the data quantity for user analysis or query

Decide how data will be accessed
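
One way to picture an information package is as a record of a subject, its business dimensions with their roll-up levels, and its key metrics. The sketch below is an assumption about a convenient representation, not a format given in the slides; the class InformationPackage, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InformationPackage:
    """Hypothetical container for one subject area's requirements."""
    subject: str
    # Each dimension maps to its hierarchy, most aggregated level first.
    dimensions: dict[str, list[str]] = field(default_factory=dict)
    metrics: list[str] = field(default_factory=list)

    def rollup_path(self, dimension):
        """Levels from most detailed to most aggregated."""
        return list(reversed(self.dimensions[dimension]))


package = InformationPackage(
    subject="Sales analysis",
    dimensions={
        "time": ["year", "quarter", "month", "day"],
        "location": ["country", "state", "city"],
    },
    metrics=["units_sold", "sales_amount"],
)
```

Here `package.rollup_path("time")` lists the levels along which users would aggregate, starting from the day level.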

Supplemental Slides : The Others

Snowflake Schema Model

[Figure: hierarchy of Country, State, County, City]

Direct use by some tools

More flexible to change

Provides for speedier data loading

Can become large and unmanageable

Degrades query performance

More complex metadata
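
The Country, State, County, City hierarchy can be sketched as a snowflaked dimension in which each level is its own table. This is a minimal illustration, assuming hypothetical table and column names and sample rows; it shows why a query that resolves a city to its country needs several joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (country_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE state (state_id INTEGER PRIMARY KEY, name TEXT,
                        country_id INTEGER REFERENCES country(country_id));
    CREATE TABLE county (county_id INTEGER PRIMARY KEY, name TEXT,
                         state_id INTEGER REFERENCES state(state_id));
    CREATE TABLE city (city_id INTEGER PRIMARY KEY, name TEXT,
                       county_id INTEGER REFERENCES county(county_id));
    INSERT INTO country VALUES (1, 'USA');
    INSERT INTO state VALUES (10, 'Texas', 1);
    INSERT INTO county VALUES (100, 'Lubbock County', 10);
    INSERT INTO city VALUES (1000, 'Lubbock', 100);
""")

# Resolving a city to its country takes three joins in the snowflake form.
row = conn.execute("""
    SELECT city.name, county.name, state.name, country.name
    FROM city
    JOIN county ON city.county_id = county.county_id
    JOIN state ON county.state_id = state.state_id
    JOIN country ON state.country_id = country.country_id
    WHERE city.city_id = 1000
""").fetchone()
```

A star schema would hold all four attributes in one denormalized dimension table, trading redundancy for fewer joins.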


Degenerate Dimensions
Example: order_number and order_line in the fact table

Suppose you want the average number of products per order. You have to relate the products to the order number to calculate the average. Attributes such as order_number and order_line in this example are called degenerate dimensions; they are kept as attributes of the fact table rather than in a dimension table of their own.
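
The average-products-per-order calculation can be sketched as follows. This is an illustration under assumptions: the table name fact_order_line, the product_key and quantity columns, and the sample rows are hypothetical; only order_number and order_line come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_order_line (
        order_number INTEGER,   -- degenerate dimension: no dimension table
        order_line INTEGER,     -- degenerate dimension
        product_key INTEGER,
        quantity INTEGER
    )
""")
conn.executemany("INSERT INTO fact_order_line VALUES (?, ?, ?, ?)", [
    (5001, 1, 1, 2),
    (5001, 2, 2, 1),
    (5001, 3, 3, 4),
    (5002, 1, 1, 1),
])

# Group fact rows by the degenerate dimension, count the distinct
# products in each order, then average those counts.
avg_products_per_order = conn.execute("""
    SELECT AVG(product_count) FROM (
        SELECT COUNT(DISTINCT product_key) AS product_count
        FROM fact_order_line
        GROUP BY order_number
    )
""").fetchone()[0]
```

With the sample rows, order 5001 has three products and order 5002 has one, so the average is 2.0.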

Storage and Performance Considerations

Database sizing

Data partitioning

Indexing

Star query optimization


Database Sizing - Test Load Sampling

Analyze a representative sample of the data chosen using proven statistical methods. Ensure that the sample reflects:

Test loads for different periods

Day-to-day operations

Seasonal data and worst-case scenarios

Indexes and summaries
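
Extrapolating from a test load can be sketched as a simple calculation: average the sampled row sizes, multiply by the expected row count, and add an allowance for indexes and summaries. The function name, the index_overhead parameter, and the sample figures below are assumptions for illustration, not values from the slides.

```python
def estimate_table_size_bytes(sample_row_sizes, expected_rows, index_overhead=0.3):
    """Extrapolate table size from the byte sizes of sampled rows.

    index_overhead is an assumed fraction added for indexes and summaries.
    """
    if not sample_row_sizes:
        raise ValueError("sample must not be empty")
    avg_row = sum(sample_row_sizes) / len(sample_row_sizes)
    return avg_row * expected_rows * (1 + index_overhead)


# Hypothetical sample: four rows measured during a test load.
sample = [100, 120, 80, 100]
estimate = estimate_table_size_bytes(
    sample, expected_rows=1_000_000, index_overhead=0.5
)
```

The estimate is only as good as the sample, which is why the sample should cover different periods, seasonal peaks, and worst-case loads.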


Data Partitioning
Breaking up of data into separate physical units that can be handled independently

Types of data partitioning:

Horizontal partitioning

Vertical partitioning
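
The two partitioning types can be contrasted in a few lines. This is a minimal in-memory sketch, assuming hypothetical order rows and helper names; a real database partitions at the storage level, but the split is the same in principle: horizontal partitioning divides rows, vertical partitioning divides columns.

```python
from collections import defaultdict

rows = [
    {"order_id": 1, "order_date": "2024-01-15", "amount": 20.0, "notes": "gift wrap"},
    {"order_id": 2, "order_date": "2024-02-03", "amount": 15.0, "notes": ""},
    {"order_id": 3, "order_date": "2024-02-20", "amount": 30.0, "notes": "rush"},
]


def horizontal_partition(rows, key_func):
    """Split rows into groups; every group keeps all columns."""
    parts = defaultdict(list)
    for row in rows:
        parts[key_func(row)].append(row)
    return dict(parts)


def vertical_partition(rows, key, column_groups):
    """Split columns into groups; every group keeps the key column."""
    return [
        [{c: row[c] for c in (key, *cols)} for row in rows]
        for cols in column_groups
    ]


# Horizontal: one partition per month (the first 7 characters of the date).
by_month = horizontal_partition(rows, lambda r: r["order_date"][:7])

# Vertical: frequently read columns apart from rarely read ones.
hot, cold = vertical_partition(
    rows, "order_id", [["order_date", "amount"], ["notes"]]
)
```

Each resulting unit can then be loaded, backed up, or queried independently of the others.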


Indexing

Indexing is used for the following reasons:

It can save substantial cost by greatly improving performance and scalability.

It can replace a full table scan with a quick read of the index, followed by a read of only those disk blocks that contain the rows needed.
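
The difference between a full table scan and an index lookup can be observed by asking a database for its query plan. The sketch below uses SQLite through Python's sqlite3 module as a stand-in; the sales table, the index name idx_sales_customer, and the data are hypothetical, and the exact plan wording varies by SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def query_plan(conn, sql):
    """Return the planner's description of how the query will run."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


sql = "SELECT amount FROM sales WHERE customer_id = 42"
plan_before = query_plan(conn, sql)   # no index yet: the table is scanned

conn.execute("CREATE INDEX idx_sales_customer ON sales (customer_id)")
plan_after = query_plan(conn, sql)    # the plan now names the index
```

Comparing `plan_before` and `plan_after` shows the planner switching from reading every row to searching the index for the matching customer.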


Parallelism
[Figure: Sales table divided into partitions P1, P2, P3; Customers table; parallel execution servers]
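
The idea of several execution servers each working on one partition can be sketched with a thread pool. This is only an analogy under assumptions: the partition contents and the partial_total helper are hypothetical, and a real database uses its own server processes rather than Python threads.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions P1..P3 of a sales table: (customer_id, amount).
partitions = {
    "P1": [(1, 10.0), (2, 5.0)],
    "P2": [(1, 7.5), (3, 2.5)],
    "P3": [(2, 20.0)],
}


def partial_total(rows):
    """Work done by one execution server on one partition."""
    return sum(amount for _, amount in rows)


# One worker per partition; map() returns results in partition order.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(partial_total, partitions.values()))

# A coordinator combines the partial results into the final answer.
total_sales = sum(partials)
```

Because each partition is handled independently, the partial totals can be computed at the same time and merged in a final step.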


Using Summary Data

Designing summary tables offers the following benefits:

Provides fast access to precomputed data

Reduces use of I/O, CPU, and memory
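
A summary table can be sketched as an aggregate computed once at load time and then queried in place of the detail rows. The table names fact_sales and summary_sales_by_month and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (sale_date TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", [
    ("2024-01-05", "Widget", 10.0),
    ("2024-01-20", "Widget", 5.0),
    ("2024-02-02", "Gadget", 20.0),
])

# Precompute monthly totals once, at load time.
conn.execute("""
    CREATE TABLE summary_sales_by_month AS
    SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total_amount
    FROM fact_sales
    GROUP BY month
""")

# Queries then read the small summary table instead of the detail rows.
monthly_totals = dict(conn.execute(
    "SELECT month, total_amount FROM summary_sales_by_month"
).fetchall())
```

The trade-off is that the summary must be refreshed whenever the detail data changes.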

