Introduction To Data Warehouses. Data Warehouse Development Lifecycle (Kimball's Approach)
By Dr. Gabriel
Key Definitions
• Data mart is a specific, subject-oriented
repository of data that was designed to answer
specific questions
– Usually, multiple data marts exist to serve the needs
of multiple business units (sales, marketing,
operations, collections, accounting, etc.)
• Data warehouse is a single organizational
repository of enterprise-wide data across many
or all subject areas.
– Data warehouse is an enterprise-wide collection of
data marts
Key Definitions
• “Business Intelligence” refers to reporting
and analysis of data stored in the
warehouse
• Data warehouse is the foundation for
business intelligence.
• “Data warehouse/business intelligence”
(DW/BI) refers to the complete end-to-end
system.
Two Main Data Warehouse
Development Methodologies
• Top-down approach
– Inmon’s approach
– DW is developed based on the enterprise-wide data model
– DW as a single repository feeds data into data marts
– Longer to implement
• May fail due to the lack of patience and commitment
• Bottom-up approach
– Kimball’s approach
– Starts with one data mart (e.g., sales); later on, additional data marts
are added (e.g., collections, marketing, etc.)
– Data flows from source into data marts, then into the data
warehouse
– Faster to implement
• Implementation in stages
– Need to ensure consistency of metadata
• Making sure each data mart calls an apple an apple (the same thing has the same name and definition everywhere)
• The Hybrid approach
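The metadata-consistency concern noted under the bottom-up approach can be illustrated with a minimal Python sketch. It assumes each data mart publishes its attribute names and data types. The mart names, attribute names, and types below are hypothetical and only serve to show the kind of check involved.

```python
from collections import defaultdict

# Hypothetical metadata: attribute name -> declared data type, per data mart.
DATA_MARTS = {
    "sales": {"customer_id": "INTEGER", "product_name": "TEXT", "order_date": "DATE"},
    "marketing": {"customer_id": "INTEGER", "product_name": "VARCHAR(50)", "campaign": "TEXT"},
    "collections": {"customer_id": "TEXT", "due_date": "DATE"},
}


def find_inconsistencies(marts):
    """Return attributes that are declared with different types in different marts."""
    seen = defaultdict(dict)  # attribute -> {mart: declared type}
    for mart, attributes in marts.items():
        for name, dtype in attributes.items():
            seen[name][mart] = dtype
    return {
        name: by_mart
        for name, by_mart in seen.items()
        if len(set(by_mart.values())) > 1
    }


conflicts = find_inconsistencies(DATA_MARTS)
for name, by_mart in sorted(conflicts.items()):
    print(name, by_mart)
```

A check like this only catches shared names with differing types. Detecting the opposite problem (the same thing stored under different names) would need a shared business glossary, which this sketch does not model.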
The Kimball Lifecycle Diagram
The Kimball Lifecycle
• Illustrates the general flow of a DW
implementation
• Identifies task sequencing and highlights
activities that should happen concurrently
• May need to be customized to address the
unique needs of your organization
• Not every detail of every Lifecycle task will
be performed on every project
The Kimball Lifecycle,
SDLC, and DBLC
[Diagram: Kimball Lifecycle tasks set against SDLC and DBLC phases (Analysis, Detailed System Design, DB Design, Implementation, Testing, Operation, Maintenance)]
Program/Project Planning
• Kimball’s view of programs and projects
– Project refers to a single iteration of the Kimball
Lifecycle
• from launch through deployment
– Program refers to the broader, ongoing coordination
of resources, infrastructure, timelines, and
communication across multiple projects
• a program contains multiple projects
– In the real world, programs do not necessarily start before
projects, although ideally they should.
Program/Project Planning
• Project planning
– Scope definition (requires understanding the business
requirements)
– Task identification
– Scheduling
– Resource planning
– Workload assignment
– The end document represents a blueprint of
the project
Program/Project Management
• Enforces the project plan
• Activities:
– Status monitoring
– Issue tracking
– Development of a comprehensive
communication plan that addresses both the
business and IT units
Business Requirements Definition
• Success of the project depends on a solid
understanding of the business
requirements!
• Understanding the key factors driving the
business is crucial for successful
translation of the business requirements
into design considerations
What follows the business
requirements definition?
• 3 concurrent tracks focusing on
– Technology
– Data
– Business intelligence applications
• Arrows in the diagram indicate the activity
workflow along each of the parallel tracks
• Dependencies between the tasks are
illustrated by the vertical alignment of the task
boxes.
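The idea of parallel tracks with dependencies between tasks can be sketched as a small dependency graph. The Python example below uses the standard-library `graphlib` module (Python 3.9+) to group tasks into stages that could run concurrently. Only tasks named in these slides are included, and the Data and BI applications tracks are collapsed into single nodes as a simplifying assumption.

```python
from graphlib import TopologicalSorter

# Task -> set of tasks that must be finished first.
# The Data and BI tracks are shown as single nodes (a simplification).
DEPENDS_ON = {
    "Business Requirements Definition": {"Program/Project Planning"},
    "Technical Architecture Design": {"Business Requirements Definition"},
    "Product Selection and Installation": {"Technical Architecture Design"},
    "Data Track": {"Business Requirements Definition"},
    "BI Applications Track": {"Business Requirements Definition"},
}


def concurrent_stages(depends_on):
    """Group tasks into stages; tasks in the same stage have no mutual dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything that can start now
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = concurrent_stages(DEPENDS_ON)
for number, tasks in enumerate(stages, start=1):
    print(number, tasks)
```

In this sketch the three tracks all become ready in the same stage, right after the business requirements definition, which mirrors the concurrency the diagram conveys through vertical alignment.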
Technology Track
• Technical Architecture Design
– Overall architectural framework and vision
– Considerations:
• the business requirements
• current technical environment
• planned strategic technical directions
Technology Track
• Product Selection and Installation
– Based on the designed technical architecture
• Evaluation and selection of
– Products that will deliver needed capabilities
– Hardware platform
– Database management system
– Extract-transform-load (ETL) tools
– Data access query tools
– Reporting tools
• Installation of selected products/components/tools
• Testing of installed products to ensure appropriate
end-to-end integration within the data warehouse
environment.
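As an illustration of what an end-to-end integration test might look like, here is a minimal Python sketch of a smoke test. It uses an in-memory SQLite database as a stand-in for whichever DBMS is selected, and plain Python as a stand-in for the ETL tool. The table names, columns, and sample rows are hypothetical.

```python
import sqlite3


def run_smoke_test():
    """Push a few rows from a source table through extract, transform, load, and query."""
    conn = sqlite3.connect(":memory:")  # stand-in for the selected DBMS
    cur = conn.cursor()

    # Hypothetical source system table and warehouse fact table.
    cur.execute("CREATE TABLE source_orders (order_id INTEGER, amount_cents INTEGER)")
    cur.execute("CREATE TABLE fact_sales (order_id INTEGER, amount REAL)")
    cur.executemany(
        "INSERT INTO source_orders VALUES (?, ?)",
        [(1, 1250), (2, 800), (3, 4995)],
    )

    # Extract from the source.
    rows = cur.execute("SELECT order_id, amount_cents FROM source_orders").fetchall()
    # Transform: convert cents to currency units.
    transformed = [(order_id, cents / 100) for order_id, cents in rows]
    # Load into the warehouse table.
    cur.executemany("INSERT INTO fact_sales VALUES (?, ?)", transformed)
    conn.commit()

    # Query step: the kind of statement a reporting tool would issue.
    source_count = cur.execute("SELECT COUNT(*) FROM source_orders").fetchone()[0]
    loaded_count, total = cur.execute(
        "SELECT COUNT(*), SUM(amount) FROM fact_sales"
    ).fetchone()
    conn.close()
    return source_count, loaded_count, total


source_count, loaded_count, total = run_smoke_test()
print(source_count, loaded_count, round(total, 2))
```

A real integration test would exercise the installed products themselves. The point of the sketch is only the shape of the check: rows that leave the source should arrive in the warehouse and be visible to the query layer.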
Data Track