Jukic Chapter07

Chapter 7 - Data Warehousing Concepts

Database Systems - Introduction to Databases and Data Warehouses

By Jukić, Vrbsky, & Nestorov

Business Intelligence and Analytics

Professor Traci Hess


Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 1
Data Warehouse Progression

[Figure: data sources (packaged application, transient data source, legacy system, other internal applications) -> Extract, Transform, Cleanse, Load -> data warehouse and data mart]

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 2


DIMENSIONAL MODELING
A dimensional model (star schema)

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 3


Example: Dimensional Model Based on A Single Source
ER diagram : ZAGI Retail Company Sales Dept Database (Source)

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 4


Example: Dimensional Model Based on A Single Source
Relational schema: ZAGI Retail Company Sales Dept Database (source)

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 5


Example: Dimensional Model Based on A Single Source
Relational schema: ZAGI Retail Company Sales Dept Database (source)

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 6


Example: Dimensional Model Based on A Single Source
Data records: ZAGI Retail Company Sales Dept Database (source)

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 7


Example: Dimensional Model Based on A Single Source
STAR Schema - ZAGI Retail Company dimensional model for the subject sales
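
The star schema figure itself did not survive extraction, so the following is only a minimal sketch of the general shape of a star schema for the subject sales: one fact table referencing several dimension tables. Every table name, column name, and data value here is an illustrative assumption, not the actual ZAGI model from the textbook.

```python
import sqlite3

# Hypothetical star schema: a central fact table plus dimension tables.
# All names and rows below are assumptions made for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar_dim (calendar_key INTEGER PRIMARY KEY, full_date TEXT, year INTEGER);
CREATE TABLE product_dim  (product_key  INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE store_dim    (store_key    INTEGER PRIMARY KEY, store_zip TEXT, region TEXT);
CREATE TABLE sales_fact (
    calendar_key INTEGER REFERENCES calendar_dim(calendar_key),
    product_key  INTEGER REFERENCES product_dim(product_key),
    store_key    INTEGER REFERENCES store_dim(store_key),
    units_sold   INTEGER,
    dollars_sold REAL
);
""")
conn.executemany("INSERT INTO calendar_dim VALUES (?, ?, ?)",
                 [(1, "2013-01-01", 2013), (2, "2013-01-02", 2013)])
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [(1, "Tent", "Camping"), (2, "Boots", "Footwear")])
conn.executemany("INSERT INTO store_dim VALUES (?, ?, ?)",
                 [(1, "60600", "Region A"), (2, "60605", "Region B")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 1, 100.0), (1, 2, 2, 2, 140.0), (2, 1, 2, 1, 100.0)])


def dollars_by_category(connection):
    """Join the fact table to one dimension and aggregate a measure."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.dollars_sold)
        FROM sales_fact AS f
        JOIN product_dim AS p ON f.product_key = p.product_key
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


totals = dollars_by_category(conn)
print(totals)
```

The same join-and-aggregate pattern applies to any other dimension (calendar, store), which is the main appeal of the star layout for analytical queries.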

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 8


INTRODUCTION
• Organizations typically maintain and utilize a number of
operational data sources
• Operational data sources include databases and other data
repositories which support an organization’s day-to-day
operations
• A data warehouse is created and used as a separate analytical
data store for the purpose of data analysis

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 9


INTRODUCTION
• Two main reasons for creating a separate data warehouse
(analytical database)
– Performance of operational databases can be severely
diminished if day-to-day tasks have to share computing
resources with analytical queries
– Difficult to structure a database which can be efficiently
used for both operational and analytical purposes

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 10


INTRODUCTION
• Operational (transactional) information - information
collected and used in support of day-to-day operational needs
• Analytical information - information collected and used in
support of analytical tasks
• Analytical information is based on operational (transactional)
information

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 11


OPERATIONAL VS. ANALYTICAL INFORMATION
Operational Data                     Analytical Data

Data Makeup Differences
Time-Horizon: Days/Months            Time-Horizon: Years
Detailed                             Summarized (and/or Detailed)
Current                              Values over time (Snapshots)

Technical Differences
Small Amounts used in a Process      Large Amounts used in a Process
High frequency of Access             Low/Modest frequency of Access
Can be Updated                       Read (and Append) Only
Non-Redundant                        Redundancy not an Issue

Functional Differences
Used by all types of employees       Used by narrower set of
for tactical purposes                users for decision making
Application Oriented                 Subject Oriented
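
Two of the rows above ("Current" vs. "Values over time (Snapshots)" and "Can be Updated" vs. "Read (and Append) Only") can be illustrated with a small sketch. The product, prices, dates, and function names are all hypothetical; the point is only that the operational value is overwritten in place while the analytical history is appended to.

```python
from datetime import date

# Operational store: holds only the current value and is updated in place.
operational_price = {"tent": 100.0}

# Analytical store: append-only snapshots of (snapshot_date, product, price).
price_history = [(date(2013, 1, 1), "tent", 100.0)]


def change_price(product, new_price, as_of):
    """Overwrite the operational value and append a snapshot for analysis."""
    operational_price[product] = new_price              # old value is lost here
    price_history.append((as_of, product, new_price))   # history is preserved here


def price_on(product, day):
    """Return the most recent snapshot price on or before `day`, if any."""
    known = [(d, p) for d, name, p in price_history if name == product and d <= day]
    return max(known)[1] if known else None


change_price("tent", 110.0, date(2013, 6, 1))
change_price("tent", 95.0, date(2014, 1, 1))

print(operational_price["tent"])             # only the current value is available
print(price_on("tent", date(2013, 12, 31)))  # the history answers "as of" questions
```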

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 12


Application vs. Subject Oriented Example

An application-oriented database serving the Vitality Health Club Visits and Payments Application

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 13


Application vs. Subject Oriented Example

A subject-oriented database for analyzing revenue in the Vitality Health Club

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 14


THE DATA WAREHOUSE (DW) DEFINITION

• A structured repository of integrated, subject-oriented,
enterprise-wide, historical, and time-variant data.
• Purpose: retrieval of analytical information. A data
warehouse can store detailed and/or summarized data.

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 15


THE DATA WAREHOUSE (DW) DEFINITION
• Structured repository
– A DW is a database containing analytically useful information
– Any database is a structured repository, with structure
represented by metadata
• Integrated
– Integrates useful data from operational databases & other sources
– Integration: the process of bringing the data from multiple data
sources into a singular data warehouse

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 16


THE DATA WAREHOUSE (DW) DEFINITION

• Subject-oriented
– Refers to fundamental difference in purpose of operational
database system vs. a data warehouse.
– Operational database system: developed to support a specific
business operation
– Data warehouse: developed to analyze specific business
subject areas

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 17


THE DATA WAREHOUSE (DW) DEFINITION
• Enterprise-wide
– Data warehouse provides organization-wide view of
analytically useful information

• Historical
– Refers to larger time horizon in DW compared to operational
databases
– Operational databases typically hold about 1 year of data
– A DW holds many years of data
Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 18
THE DATA WAREHOUSE (DW) DEFINITION
• Time variant
– A data warehouse contains slices or snapshots of data from
different periods of time across the time horizon
– With the data slices, the user can create reports for various
periods of time within the time horizon
• Detailed and/or summarized data
– A DW may include detailed data or summary data or both
– A DW that contains data at the finest level of detail is most
powerful
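
One way to see why the finest level of detail is the most powerful: any summary can be derived from detailed rows, but detailed rows cannot be recovered from a summary. The sketch below uses made-up sales rows and a hypothetical `summarize` helper to roll the same detail up by year and by product.

```python
from collections import defaultdict

# Hypothetical detailed rows: (sale_date, product, dollars_sold).
detailed_sales = [
    ("2013-01-01", "tent", 100.0),
    ("2013-01-01", "boots", 70.0),
    ("2013-01-02", "tent", 100.0),
    ("2014-01-05", "boots", 70.0),
]


def summarize(rows, key):
    """Sum dollars_sold per group, where `key` picks the group for each row."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[2]
    return dict(totals)


# The same detailed data supports several different summaries.
by_year = summarize(detailed_sales, key=lambda r: r[0][:4])
by_product = summarize(detailed_sales, key=lambda r: r[1])

print(by_year)
print(by_product)
```

Had only `by_year` been stored, the per-product breakdown could not be reconstructed from it.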
Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 19
THE DATA WAREHOUSE (DW) DEFINITION

• Retrieval of analytical information
– Developed for retrieval of analytical information, not direct
data entry by users
– Retrieval is the only functionality available to DW users
– DW data is not subject to changes by users (new data is
appended; existing data is not modified)
– Data in a DW is referred to as non-volatile, static, or read-only

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 20


DATA WAREHOUSE COMPONENTS

• Data warehouse components
– Source systems
– Extraction-transformation-load (ETL) infrastructure
– Data warehouse
– Front-end applications

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 21


DATA WAREHOUSE COMPONENTS

Example: core components of a data warehousing system

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 22


DATA WAREHOUSE COMPONENTS

• Source systems
– Operational databases and repositories that provide
analytically useful information in DW subject areas
– Each operational data store has two purposes:
• Original operational purpose
• Source system for the data warehouse
– Source systems can include external data sources

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 23


DATA WAREHOUSE COMPONENTS

Example: A data warehouse with internal and external source systems

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 24


DATA WAREHOUSE COMPONENTS
• Data warehouse
– DW sometimes referred to as target system - a destination
for data from source systems
– Typical DW retrieves selected, analytically useful data from
operational data sources

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 25


DATA WAREHOUSE COMPONENTS
• ETL infrastructure
– Facilitates retrieval of data from operational databases into DW
– ETL includes the following tasks:
• Extracting analytically useful data from operational data
sources
• Transforming data to conform to structure of the subject-
oriented target DW model
• Loading transformed and quality-assured data into target DW
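
The three ETL tasks listed above can be sketched as three small functions. This is a simplified illustration under stated assumptions, not a production ETL design: the source table `sales_transaction`, the target table `sales_fact`, the sample rows, and the specific cleaning rules are all hypothetical.

```python
import sqlite3

# Hypothetical operational source with some messy data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales_transaction (customer TEXT, sale_date TEXT, amount REAL)")
source.executemany("INSERT INTO sales_transaction VALUES (?, ?, ?)",
                   [(" tina ", "2013-01-01", 100.0),
                    ("MILES", "2013-02-03", -5.0),   # invalid amount
                    ("Liz", "2014-03-04", 70.5)])

# Hypothetical target DW table, structured for analysis by year.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE sales_fact (customer TEXT, sale_year TEXT, amount REAL)")


def extract(connection):
    """Extract: pull the analytically useful columns from the source."""
    return connection.execute(
        "SELECT customer, sale_date, amount FROM sales_transaction ORDER BY rowid"
    ).fetchall()


def transform(rows):
    """Transform: drop rows failing a quality check and conform to the target structure."""
    clean = []
    for customer, sale_date, amount in rows:
        if amount is None or amount < 0:
            continue  # quality assurance: skip missing or negative amounts
        clean.append((customer.strip().title(), sale_date[:4], round(float(amount), 2)))
    return clean


def load(connection, rows):
    """Load: append the transformed rows to the target DW table."""
    connection.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    connection.commit()


load(target, transform(extract(source)))
loaded = target.execute(
    "SELECT customer, sale_year, amount FROM sales_fact ORDER BY rowid"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions makes each one testable on its own, which matters because ETL tends to be the most labor-intensive part of the system.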

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 26


DATA WAREHOUSE COMPONENTS

• Data warehouse front-end (BI) applications
– Provide access to the DW for users who are engaging in
indirect use

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 27


DATA WAREHOUSE COMPONENTS

Example: data warehouse with front-end applications

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 28


DATA MARTS
• Data mart
– A data store with same principles as DW, but with a more
limited scope

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 29


DATA MARTS

• Independent data mart
– Stand-alone data mart, created in the same fashion as a DW
– Has its own source systems and ETL infrastructure
• Dependent data mart
– Does not have its own source systems
– Data comes from a DW
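
A dependent data mart can be pictured as a limited-scope copy drawn from the DW rather than from source systems. The sketch below assumes a hypothetical DW table `revenue_fact` and a hypothetical helper `build_dependent_mart`; it populates a separate in-memory database with one department's rows only.

```python
import sqlite3

# Hypothetical enterprise-wide DW table.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE revenue_fact (department TEXT, year INTEGER, amount REAL)")
dw.executemany("INSERT INTO revenue_fact VALUES (?, ?, ?)",
               [("sales", 2013, 100.0), ("sales", 2014, 120.0), ("service", 2013, 40.0)])


def build_dependent_mart(warehouse, department):
    """Create a data mart whose only data source is the warehouse itself."""
    mart = sqlite3.connect(":memory:")
    mart.execute("CREATE TABLE revenue_fact (year INTEGER, amount REAL)")
    rows = warehouse.execute(
        "SELECT year, amount FROM revenue_fact WHERE department = ? ORDER BY year",
        (department,),
    ).fetchall()
    mart.executemany("INSERT INTO revenue_fact VALUES (?, ?)", rows)
    mart.commit()
    return mart


sales_mart = build_dependent_mart(dw, "sales")
mart_rows = sales_mart.execute(
    "SELECT year, amount FROM revenue_fact ORDER BY year"
).fetchall()
print(mart_rows)
```

An independent data mart would instead need its own extract, transform, and load steps against operational sources.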

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 30


STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 31


STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

• Requirements collection, definition, and
visualization - specifies desired functionalities
– Based on data in the internal data source systems and
external data sources
– Requirements are collected through interviewing various
stakeholders
– Collected requirements should be clearly defined in a
written document, and visualized as a conceptual data
model

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 32


STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

• Data warehouse modeling (logical data
warehouse modeling) - creation of the data
warehouse data model that is implementable by
the DBMS software
• Creating the data warehouse - using a
DBMS to implement the data warehouse data
model as an actual data warehouse
– Typically, data warehouses are implemented using
relational DBMS (RDBMS) software
Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 33
STEPS IN THE DEVELOPMENT OF DW
• Creating ETL infrastructure - creating
necessary procedures and code for:
– Automatic extraction of relevant data from operational
data sources
– Transformation of extracted data - quality is assured
and structure conforms to the structure of the modeled
and implemented DW
– Seamless load of the transformed data into DW
– Creating ETL infrastructure is often the most time- and
resource-consuming part of the data warehouse
development process
Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 34
STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

• Developing front-end (BI) applications -
design and create applications for end-users
– Front-end applications included in most data
warehousing systems, referred to as business
intelligence (BI) applications
– Front-end applications contain interfaces (such as
forms and reports) accessible via a navigation
mechanism (such as a menu)

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 35


STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

• Data warehouse deployment - releasing the
data warehouse and its front-end (BI)
applications for use by the end users

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 36


STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

• Data warehouse use - the retrieval of the
data in the data warehouse
– Indirect use
• Via the front-end (BI) applications
– Direct use
• Via the DBMS
• Via the OLAP (BI) tools
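
The difference between indirect and direct use can be sketched as follows. The `revenue_fact` table, its rows, and the `yearly_revenue_report` function are hypothetical stand-ins: the function plays the role of a canned front-end report, while the final query plays the role of ad hoc SQL issued directly through the DBMS.

```python
import sqlite3

# Hypothetical DW table used by both kinds of access.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE revenue_fact (year INTEGER, category TEXT, amount REAL)")
dw.executemany("INSERT INTO revenue_fact VALUES (?, ?, ?)",
               [(2013, "membership", 500.0),
                (2013, "visits", 120.0),
                (2014, "membership", 650.0)])


def yearly_revenue_report(connection, year):
    """Indirect use: a predefined report; the user supplies only a parameter."""
    total = connection.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM revenue_fact WHERE year = ?", (year,)
    ).fetchone()[0]
    return f"Revenue for {year}: {total:.2f}"


report = yearly_revenue_report(dw, 2013)

# Direct use: the user writes the retrieval query against the DW directly.
ad_hoc = dw.execute(
    "SELECT category, SUM(amount) FROM revenue_fact GROUP BY category ORDER BY category"
).fetchall()

print(report)
print(ad_hoc)
```

Both paths only retrieve data, which matches the read-only nature of DW use.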

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 37


STEPS IN THE DEVELOPMENT OF DATA WAREHOUSES

• Data warehouse administration and
maintenance - perform activities that support
the data warehouse end user, such as:
– Provide security for information contained in the data
warehouse
– Ensure sufficient hard-drive space for the data
warehouse content
– Implement backup and recovery procedures
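
As a small illustration of the backup and recovery item, the sketch below uses the `Connection.backup` method from Python's standard `sqlite3` module. It assumes a toy SQLite database standing in for the DW; real warehouse platforms have their own backup tooling, and the table, rows, and helper names here are hypothetical.

```python
import sqlite3

# Toy stand-in for a data warehouse.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales_fact (year INTEGER, amount REAL)")
dw.executemany("INSERT INTO sales_fact VALUES (?, ?)", [(2013, 100.0), (2014, 120.0)])
dw.commit()


def back_up(source):
    """Copy the whole database into a separate connection."""
    copy = sqlite3.connect(":memory:")  # in practice this would be a file on other storage
    source.backup(copy)
    return copy


def recover(backup_copy):
    """Rebuild a working database from a backup copy."""
    restored = sqlite3.connect(":memory:")
    backup_copy.backup(restored)
    return restored


backup_copy = back_up(dw)

dw.execute("DELETE FROM sales_fact")  # simulate accidental data loss
dw.commit()

restored = recover(backup_copy)
restored_rows = restored.execute(
    "SELECT year, amount FROM sales_fact ORDER BY year"
).fetchall()
print(restored_rows)
```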

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 38


THE NEXT VERSION OF THE DATA WAREHOUSE

• The next (new) version of the data warehouse
follows the same development steps as the
initial version

Database Systems - Jukić, Vrbsky, Nestorov Ch. 7, slide 39
