DWH PPT Topics

This chapter discusses planning and project management for a data warehouse. It covers key topics such as business requirements, top management support, justifying the data warehouse, developing an overall plan and project, assessing readiness, using a life-cycle approach, defining roles and responsibilities, and ensuring user participation. It emphasizes adopting a practical approach and provides guidance on common warning signs and success factors.

Uploaded by

hayyatali
Copyright
© All Rights Reserved

4 Planning and Project Management 63

  Chapter Objectives 63
  Planning Your Data Warehouse 64
  Key Issue 64
  Business Requirements, Not Technology 66
  Top Management Support 67
  Justifying Your Data Warehouse 67
  The Overall Plan 68
  The Data Warehouse Project 69
  How is it Different? 70
  Assessment of Readiness 71
  The Life-Cycle Approach 71
  The Development Phases 73
  The Project Team 74
  Organizing the Project Team 75
  Roles and Responsibilities 75
  Skills and Experience Levels 77
  User Participation 78
  Project Management Considerations 80
  Guiding Principles 81
  Warning Signs 82
  Success Factors 82
  Anatomy of a Successful Project 83
  Adopt a Practical Approach 84
  Chapter Summary 86
  Review Questions 86
  Exercises 87
5 Defining the Business Requirements 89
  Chapter Objectives 89
  Dimensional Analysis 90
  Usage of Information Unpredictable 90
  Dimensional Nature of Business Data 90
  Examples of Business Dimensions 92
  Information Packages—A New Concept 93
  Requirements Not Fully Determinate 93
  Business Dimensions 95
  Dimension Hierarchies/Categories 95
  Key Business Metrics or Facts 96
  Requirements Gathering Methods 97
  Interview Techniques 99
  Adapting the JAD Methodology 102
  Review of Existing Documentation 103
  Requirements Definition: Scope and Content 104
  Data Sources 105
  Data Transformation 105
  Data Storage 105
  Information Delivery 105
  Information Package Diagrams 106
  Requirements Definition Document Outline 106
  Chapter Summary 106
  Review Questions 107
  Exercises 107
6 Requirements as the Driving Force for Data Warehousing 109
  Chapter Objectives 109
  Data Design 110
  Structure for Business Dimensions 112
  Structure for Key Measurements 112
  Levels of Detail 113
  The Architectural Plan 113
  Composition of the Components 114
  Special Considerations 115
  Tools and Products 118
  Data Storage Specifications 119
  DBMS Selection 120
  Storage Sizing 120
  Information Delivery Strategy 121
  Queries and Reports 122
  Types of Analysis 123
  Information Distribution 123
  Decision Support Applications 123
  Growth and Expansion 123
  Chapter Summary 124
  Review Questions 124
  Exercises 125
Part 3 ARCHITECTURE AND INFRASTRUCTURE
7 The Architectural Components 127
  Chapter Objectives 127
  Understanding Data Warehouse Architecture 127
  Architecture: Definitions 127
  Architecture in Three Major Areas 128
  Distinguishing Characteristics 129
  Different Objectives and Scope 130
  Data Content 130
  Complex Analysis and Quick Response 131
  Flexible and Dynamic 131
  Metadata-driven 132
  Architectural Framework 132
  Architecture Supporting Flow of Data 132
  The Management and Control Module 133
  Technical Architecture 134
  Data Acquisition 135
  Data Storage 138
  Information Delivery 140
  Chapter Summary 142
  Review Questions 142
  Exercises 143
8 Infrastructure as the Foundation for Data Warehousing 145
  Chapter Objectives 145
  Infrastructure Supporting Architecture 145
  Operational Infrastructure 147
  Physical Infrastructure 147
  Hardware and Operating Systems 148
  Platform Options 150
  Server Hardware 158
  Database Software 164
  Parallel Processing Options 164
  Selection of the DBMS 166
  Collection of Tools 167
  Architecture First, Then Tools 168
  Data Modeling 169
  Data Extraction 169
  Data Transformation 169
  Data Loading 169
  Data Quality 169
  Queries and Reports 170
  Online Analytical Processing (OLAP) 170
  Alert Systems 170
  Middleware and Connectivity 170
  Data Warehouse Management 170
  Chapter Summary 170
  Review Questions 171
  Exercises 171
9 The Significant Role of Metadata 173
  Chapter Objectives 173
  Why Metadata is Important 173
  A Critical Need in the Data Warehouse 175
  Why Metadata is Vital for End-Users 177
  Why Metadata is Essential for IT 179
  Automation of Warehousing Tasks 181
  Establishing the Context of Information 183
  Metadata Types by Functional Areas 183
  Data Acquisition 184
  Data Storage 186
  Information Delivery 186
  Business Metadata 187
  Content Overview 188
  Examples of Business Metadata 188
  Content Highlights 189
  Who Benefits? 190
  Technical Metadata 190
  Content Overview 190
  Examples of Technical Metadata 191
  Content Highlights 192
  Who Benefits? 192
  How to Provide Metadata 193
  Metadata Requirements 193
  Sources of Metadata 194
  Challenges for Metadata Management 196
  Metadata Repository 196
  Metadata Integration and Standards 198
  Implementation Options 199
  Chapter Summary 200
  Review Questions 201
  Exercises 201
Part 4 DATA DESIGN AND DATA PREPARATION
10 Principles of Dimensional Modeling 203
  Chapter Objectives 203
  From Requirements to Data Design 203
    Design Decisions 204
    Dimensional Modeling Basics 204
    E-R Modeling Versus Dimensional Modeling 209
    Use of CASE Tools 209
  The STAR Schema 210
    Review of a Simple STAR Schema 210
    Inside a Dimension Table 212
    Inside the Fact Table 214
    The Factless Fact Table 216
    Data Granularity 217
  STAR Schema Keys 218
    Primary Keys 218
    Surrogate Keys 219
    Foreign Keys 219
  Advantages of the STAR Schema 220
    Easy for Users to Understand 220
    Optimizes Navigation 221
    Most Suitable for Query Processing 222
    STARjoin and STARindex 223
  Chapter Summary 223
  Review Questions 224
  Exercises 224
11 Dimensional Modeling: Advanced Topics 225
  Chapter Objectives 225
  Updates to the Dimension Tables 226
    Slowly Changing Dimensions 226
    Type 1 Changes: Correction of Errors 227
    Type 2 Changes: Preservation of History 228
    Type 3 Changes: Tentative Soft Revisions 230
  Miscellaneous Dimensions 231
    Large Dimensions 231
    Rapidly Changing Dimensions 233
    Junk Dimensions 235
  The Snowflake Schema 235
    Options to Normalize 235
    Advantages and Disadvantages 238
    When to Snowflake 238
  Aggregate Fact Tables 239
    Fact Table Sizes 241
    Need for Aggregates 242
    Aggregating Fact Tables 243
    Aggregation Options 247
  Families of STARS 249
    Snapshot and Transaction Tables 250
    Core and Custom Tables 251
    Supporting Enterprise Value Chain or Value Circle 251
    Conforming Dimensions 253
    Standardizing Facts 254
    Summary of Family of STARS 254
  Chapter Summary 255
  Review Questions 255
  Exercises 256
12 Data Extraction, Transformation, and Loading 257
  Chapter Objectives 257
  ETL Overview 258
    Most Important and Most Challenging 259
    Time-consuming and Arduous 260
    ETL Requirements and Steps 260
    Key Factors 261
  Data Extraction 262
    Source Identification 263
    Data Extraction Techniques 263
    Evaluation of the Techniques 270
  Data Transformation 271
    Data Transformation: Basic Tasks 272
    Major Transformation Types 273
    Data Integration and Consolidation 275
    Transformation for Dimension Attributes 277
    How to Implement Transformation 277
  Data Loading 279
    Applying Data: Techniques and Processes 280
    Data Refresh Versus Update 282
    Procedure for Dimension Tables 283
    Fact Tables: History and Incremental Loads 284
    ETL Summary 285
    ETL Tool Options 285
    Reemphasizing ETL Metadata 286
    ETL Summary and Approach 287
  Chapter Summary 288
  Review Questions 288
  Exercises 289
13 Data Quality: A Key to Success 291
  Chapter Objectives 291
  Why is Data Quality Critical? 292
    What is Data Quality? 292
    Benefits of Improved Data Quality 295
    Types of Data Quality Problems 296
  Data Quality Challenges 299
    Sources of Data Pollution 299
    Validation of Names and Addresses 301
    Costs of Poor Data Quality 302
  Data Quality Tools 303
    Categories of Data Cleansing Tools 303
    Error Discovery Features 303
    Data Correction Features 303
    The DBMS for Quality Control 304
  Data Quality Initiative 304
    Data Cleansing Decisions 305
    Who Should be Responsible? 307
    The Purification Process 309
    Practical Tips on Data Quality 311
  Chapter Summary 311
  Review Questions 312
  Exercises 312
Part 5 INFORMATION ACCESS AND DELIVERY
14 Matching Information to the Classes of Users 315
  Chapter Objectives 315
  Information from the Data Warehouse 316
    Data Warehouse Versus Operational Systems 316
    Information Potential 318
    User-Information Interface 321
    Industry Applications 323
  Who Will Use the Information? 323
    Classes of Users 323
    What They Need 326
    How to Provide Information 329
  Information Delivery 329
    Queries 331
    Reports 332
    Analysis 333
    Applications 334
  Information Delivery Tools 335
    The Desktop Environment 335
    Methodology for Tool Selection 335
    Tool Selection Criteria 338
    Information Delivery Framework 340
  Chapter Summary 341
  Review Questions 341
  Exercises 341
15 OLAP in the Data Warehouse 343
  Chapter Objectives 343
  Demand for Online Analytical Processing 344
    Need for Multidimensional Analysis 344
    Fast Access and Powerful Calculations 345
    Limitations of Other Analysis Methods 347
    OLAP is the Answer 349
    OLAP Definitions and Rules 349
    OLAP Characteristics 352
  Major Features and Functions 353
    General Features 353
    Dimensional Analysis 353
    What are Hypercubes? 357
    Drill-Down and Roll-Up 360
    Slice-and-Dice or Rotation 362
    Uses and Benefits 363
  OLAP Models 363
    Overview of Variations 364
    The MOLAP Model 365
    The ROLAP Model 366
    ROLAP Versus MOLAP 367
  OLAP Implementation Considerations 368
    Data Design and Preparation 368
    Administration and Performance 370
    OLAP Platforms 372
    OLAP Tools and Products 373
    Implementation Steps 374
  Chapter Summary 374
  Review Questions 374
  Exercises 375
16 Data Warehousing and the Web 377
  Chapter Objectives 377
  Web-Enabled Data Warehouse 378
    Why the Web? 378
    Convergence of Technologies 380
    Adapting the Data Warehouse for the Web 381
    The Web as a Data Source 382
  Web-Based Information Delivery 383
    Expanded Usage 383
    New Information Strategies 385
    Browser Technology for the Data Warehouse 387
    Security Issues 389
  OLAP and the Web 389
    Enterprise OLAP 389
    Web-OLAP Approaches 390
    OLAP Engine Design 390
  Building a Web-Enabled Data Warehouse 391
    Nature of the Data Webhouse 391
    Implementation Considerations 393
    Putting the Pieces Together 394
    Web Processing Model 394
  Chapter Summary 396
  Review Questions 396
  Exercises 396
17 Data Mining Basics 399
  Chapter Objectives 399
  What is Data Mining? 400
    Data Mining Defined 401
    The Knowledge Discovery Process 402
    OLAP Versus Data Mining 404
    Data Mining and the Data Warehouse 406
  Major Data Mining Techniques 408
    Cluster Detection 409
    Decision Trees 411
    Memory-Based Reasoning 413
    Link Analysis 415
    Neural Networks 417
    Genetic Algorithms 418
    Moving into Data Mining 419
  Data Mining Applications 422
    Benefits of Data Mining 423
    Applications in Retail Industry 424
    Applications in Telecommunications Industry 425
    Applications in Banking and Finance 426
  Chapter Summary 426
  Review Questions 426
  Exercises 427
Part 6 IMPLEMENTATION AND MAINTENANCE
18 The Physical Design Process 429
  Chapter Objectives 429
  Physical Design Steps 430
    Develop Standards 430
    Create Aggregates Plan 431
    Determine the Data Partitioning Scheme 431
    Establish Clustering Options 432
    Prepare an Indexing Strategy 432
    Assign Storage Structures 432
    Complete Physical Model 433
  Physical Design Considerations 433
    Physical Design Objectives 433
    From Logical Model to Physical Model 434
    Physical Model Components 435
    Significance of Standards 436
  Physical Storage 438
    Storage Area Data Structures 439
    Optimizing Storage 440
    Using RAID Technology 442
    Estimating Storage Sizes 442
  Indexing the Data Warehouse 443
    Indexing Overview 443
    B-Tree Index 445
    Bitmapped Index 446
    Clustered Indexes 448
    Indexing the Fact Table 448
    Indexing the Dimension Tables 449
  Performance Enhancement Techniques 449
    Data Partitioning 449
    Data Clustering 450
    Parallel Processing 450
    Summary Levels 451
    Referential Integrity Checks 451
    Initialization Parameters 451
    Data Arrays 452
  Chapter Summary 452
  Review Questions 452
  Exercises 453
19 Data Warehouse Deployment 455
  Chapter Objectives 455
  Major Deployment Activities 456
    Complete User Acceptance 456
    Perform Initial Loads 457
    Get User Desktops Ready 458
    Complete Initial User Training 459
    Institute Initial User Support 460
    Deploy in Stages 460
  Considerations for a Pilot 462
    When Is a Pilot Data Mart Useful? 462
    Types of Pilot Projects 463
    Choosing the Pilot 465
    Expanding and Integrating the Pilot 466
  Security 467
    Security Policy 467
    Managing User Privileges 468
    Password Considerations 469
    Security Tools 469
  Backup and Recovery 470
    Why Back Up the Data Warehouse? 470
    Backup Strategy 471
    Setting Up a Practical Schedule 472
    Recovery 472
  Chapter Summary 473
  Review Questions 474
  Exercises 474
20 Growth and Maintenance 477
  Chapter Objectives 477
  Monitoring the Data Warehouse 478
    Collection of Statistics 478
    Using Statistics for Growth Planning 480
    Using Statistics for Fine-Tuning 480
    Publishing Trends for Users 481
  User Training and Support 481
    User Training Content 482
    Preparing the Training Program 482
    Delivering the Training Program 484
    User Support 485
  Managing the Data Warehouse 487
    Platform Upgrades 487
    Managing Data Growth 488
    Storage Management 488
    ETL Management 489
    Data Model Revisions 489
    Information Delivery Enhancements 489
    Ongoing Fine-Tuning
