0% found this document useful (0 votes)
94 views46 pages

Chapter 13 - Data Warehouse

This document discusses business intelligence and data warehousing. It defines business intelligence as a set of tools and processes for collecting, analyzing and presenting data to support business decision making. A key component of business intelligence is the data warehouse, which stores historical data from multiple sources to support analysis and reporting. The document explains how data is transformed and loaded from operational systems into a data warehouse through the ETL process to prepare the data for analysis and decision making. It also discusses data marts, star schemas, and other concepts related to business intelligence architectures.

Uploaded by

Aida Gd
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
94 views46 pages

Chapter 13 - Data Warehouse

This document discusses business intelligence and data warehousing. It defines business intelligence as a set of tools and processes for collecting, analyzing and presenting data to support business decision making. A key component of business intelligence is the data warehouse, which stores historical data from multiple sources to support analysis and reporting. The document explains how data is transformed and loaded from operational systems into a data warehouse through the ETL process to prepare the data for analysis and decision making. It also discusses data marts, star schemas, and other concepts related to business intelligence architectures.

Uploaded by

Aida Gd
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 46

11e Database Systems

Design, Implementation, and Management

Coronel | Morris

Chapter 13
Business Intelligence and Data
Warehouses
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Learning Objectives
▪ In this chapter, students will learn:
▪ How business intelligence provides a comprehensive
business decision support framework
▪ About business intelligence architecture, its evolution,
and reporting styles
▪ About the relationship and differences between
operational data and decision support data
▪ What a data warehouse is and how to prepare data for
one

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 2
Learning Objectives
▪ In this chapter, students will learn:
▪ What star schemas are and how they are constructed
▪ About data analytics, data mining, and predictive
analytics
▪ About online analytical processing (OLAP)
▪ How SQL extensions are used to support OLAP-type
data manipulations

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 3
Business Intelligence (BI)
▪ Comprehensive, cohesive, integrated set of tools and
processes
▪ Captures, collects, integrates, stores, and analyzes data
▪ Purpose - Generate and present information to
support business decision making
▪ Allows a business to transform:
▪ Data into information
▪ Information into knowledge
▪ Knowledge into wisdom

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 4
Figure 13.1 - Business Intelligence
Framework

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 5
Business Intelligence Tools
▪ Dashboards and business activity monitoring
▪ Dashboards: Shows key business performance
indicators in a single integrated view
▪ Portals: Integrate data using web browser from
multiple sources into a single webpage
▪ Data analysis and reporting tools
▪ Data-mining tools
▪ Data warehouses (DW)
▪ OLAP tools and data visualization

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 6
Practices to Manage Data
▪ Master data management (MDM): Collection of
concepts, techniques, and processes for identification,
definition, and management of data elements
▪ Governance: Method of government for controlling
business health and for consistent decision making
▪ Key performance indicators (KPI): Numeric or
scale-based measurements that assess company’s
effectiveness in reaching its goals

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 7
Practices to Manage Data
▪ Data visualization: Abstracting data to provide
information in a visual format
▪ Enhances the user’s ability to efficiently comprehend
the meaning of the data
▪ Techniques
▪ Pie charts and bar charts
▪ Line graphs
▪ Scatter plots
▪ Gantt charts
▪ Heat maps
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 8
Reporting Styles of a Modern BI
System

Monitoring and
Advanced reporting
alerting

Advanced data
analytics

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 9
Business Intelligence Benefits

Improved decision making

Integrating architecture

Common user interface for data reporting and analysis

Common data repository fosters single version of company data

Improved organizational performance

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 10
Table 13.4 - Business Intelligence
Evolution

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 11
Table 13.4 - Business Intelligence
Evolution

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 12
Figure 13.3 - Evolution of BI
Information Dissemination Formats

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 13
Business Intelligence Technology
Trends
Data storage improvements

Business intelligence appliances

Business intelligence as a service

Big Data analytics

Personal analytics

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 14
Decision Support Data
▪ Effectiveness of BI depends on quality of data
gathered at operational level
▪ Operational data
▪ Seldom well-suited for decision support tasks
▪ Stored in relational database with highly normalized
structures
▪ Optimized to support transactions representing daily
operations

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 15
Decision Support Data
▪ Differ from operational data in:
▪ Time span
▪ Granularity
▪ Drill down: Decomposing a data to a lower level
▪ Roll up: Aggregating a data into a higher level
▪ Dimensionality

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 16
Table 13.5 - Contrasting Operational and
Decision Support Data Characteristics

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 17
Decision Support Database
Requirements
▪ Database schema
▪ Must support complex, non-normalized data
representations
▪ Data must be aggregated and summarized
▪ Queries must be able to extract multidimensional time
slices

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 18
Decision Support Database
Requirements
▪ Data extraction and loading
▪ Allow batch and scheduled data extraction
▪ Support different data sources and check for
inconsistent data or data validation rules
▪ Support advanced integration, aggregation, and
classification
▪ Database size should support:
▪ Very large databases (VLDBs)
▪ Advanced storage technologies
▪ Multiple-processor technologies
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 19
Table 13.8 - Characteristics of Data Warehouse
Data and Operational Database Data

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 20
Figure 13.5 - The ETL Process

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 21
Data Marts
▪ Small, single-subject data warehouse subset
▪ Provide decision support to a small group of people
▪ Benefits over data warehouses
▪ Lower cost and shorter implementation time
▪ Technologically advanced
▪ Inevitable people issues

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 22
Table 13.9 - Twelve Rules for a Data
Warehouse

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 23
Table 13.9 - Twelve Rules for a Data
Warehouse

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 24
Star Schema
▪ Data-modeling technique
▪ Maps multidimensional decision support data into a
relational database
▪ Creates the near equivalent of multidimensional
database schema from existing relational database
▪ Yields an easily implemented model for
multidimensional data analysis

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 25
Components of Star Schemas
Facts

• Numeric values that represent a specific business aspect

Dimensions

• Qualifying characteristics that provide additional perspectives to a


given fact

Attributes

• Used to search, filter, and classify facts


• Slice and dice: Ability to focus on slices of the data cube for more
detailed analysis

Attribute hierarchy

• Provides a top-down data organization

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 26
Star Schema Representation
▪ Facts and dimensions represented by physical tables
in data warehouse database
▪ Many-to-one (M:1) relationship between fact table
and each dimension table
▪ Fact and dimension tables
▪ Related by foreign keys
▪ Subject to primary and foreign key constraints

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 27
Star Schema Representation
• Primary key of a fact table
▪ Is a composite primary key because the fact table is
related to many dimension tables
▪ Always formed by combining the foreign keys
pointing to the related dimension tables

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 28
Techniques Used to Optimize Data
Warehouse Design
▪ Normalizing dimensional tables
▪ Snowflake schema: Dimension tables can have their
own dimension tables
▪ Maintaining multiple fact tables to represent different
aggregation levels
▪ Denormalizing fact tables

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 29
Techniques Used to Optimize Data
Warehouse Design
▪ Partitioning and replicating tables
▪ Partitioning: Splits tables into subsets of rows or
columns and places them close to customer location
▪ Replication: Makes copy of table and places it in a
different location
▪ Periodicity: Provides information about the time span
of the data stored in the table

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 30
Data Analytics
▪ Encompasses a wide range of mathematical,
statistical, and modeling techniques to extract
knowledge from data
▪ Subset of BI functionality
▪ Classification of tools
▪ Explanatory analytics: Focuses on discovering and
explaining data characteristics and relationships based
on existing data
▪ Predictive analytics: Focuses on predicting future
outcomes with a high degree of accuracy

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 31
Data Mining
▪ Analyzing massive amounts of data to:
▪ Uncover hidden trends, patterns, and relationships
▪ Form computer models to stimulate and explain the
findings
▪ Use the models to support business decision making
▪ Run in two modes
▪ Guided
▪ Automated

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 32
Figure 13. 15 - Extracting Knowledge
from Data

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 33
Figure 13. 16 - Data-Mining Phases

Cengage Learning © 2015


©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 34
Predictive Analytics
▪ Employs mathematical and statistical algorithms,
neural networks, artificial intelligence, and other
advanced modeling tools
▪ Creates actionable predictive models based on
available data
▪ Next logical step after data mining
▪ Adds value to an organization
▪ Helps optimize the existing processes
▪ Identify hidden problems
▪ Anticipate future problems or opportunities
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 35
Online Analytical Processing
▪ Advanced data analysis environment that supports
decision making, business modeling, and operations
research
▪ Characteristics
▪ Multidimensional data analysis techniques
▪ Advanced database support
▪ Easy-to-use end-user interfaces

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 36
Multidimensional Data Analysis
Techniques
▪ Data are processed and viewed as part of a
multidimensional structure
▪ Augmenting functions
▪ Advanced data presentation functions
▪ Advanced data aggregation, consolidation, and
classification functions
▪ Advanced computational functions
▪ Advanced data-modeling functions

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 37
Advanced Database Support
▪ Advanced data access features
▪ Access to many different kinds of DBMSs, flat files,
and internal and external data sources
▪ Access to aggregated data warehouse data and to the
detail data found in operational databases
▪ Advanced data navigation features
▪ Rapid and consistent query response times
▪ Ability to map end-user requests to appropriate data
source and to proper data access language
▪ Support for very large databases
©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 38
Easy-to-Use End-User Interface
▪ Proper implementation leads to simple navigation and
accelerated decision making or data analysis
▪ Advanced OLAP features are more useful when
access is kept simple
▪ Many interface features are borrowed from previous
generations of data analysis tools

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 39
Figure 13.19 - OLAP Architecture

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 40
Figure 13.20 - OLAP Server with Local
Miniature Data Marts

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 41
Relational Online Analytical
Processing (ROLAP)
▪ Provides OLAP functionality using relational
databases and familiar relational tools to store and
analyze multidimensional data
▪ Extensions added to traditional RDBMS technology
▪ Multidimensional data schema support within the
RDBMS
▪ Data access language and query performance optimized
for multidimensional data
▪ Support for very large databases (VLDBs)

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 42
Multidimensional Online Analytical
Processing (MOLAP)
▪ Extends OLAP functionality to multidimensional
database management systems (MDBMSs)
▪ MDBMS: Uses proprietary techniques store data in
matrix-like n-dimensional arrays
▪ End users visualize stored data as a 3D data cube
▪ Grow to n dimensions, becoming hypercubes
▪ Held in memory in a cube cache to speed access
▪ Sparsity: Measures the density of the data held in the
data cube

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 43
Table 13.12 - Relational vs.
Multidimensional OLAP

Cengage Learning © 2015

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 44
SQL Extensions for OLAP
The ROLLUP extension

• Used with GROUP BY clause to generate aggregates by


different dimensions
• Enables subtotal for each column listed except for the last
one, which gets a grand total
• Order of column list important

The CUBE extension

• Used with GROUP BY clause to generate aggregates by the


listed columns
• Includes the last column

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 45
Materialized View
▪ Dynamic table that contains SQL query command to
generate rows and stores the actual rows
▪ Created the first time query is run
▪ Summary rows are stored in the table
▪ Automatically updated when base tables are updated
▪ Requires specified privileges

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. 46

You might also like