Defining Data Warehouse Concepts and Terminology
Chapter 3
Data Warehouse: subject-oriented, integrated, time-variant, nonvolatile
Subject-Oriented
Data is categorized and stored by business subject rather than by application
[Figure: OLTP applications (equity plans, shares, insurance, savings, loans) mapped to a data warehouse subject]
Integrated
Data on a given subject is defined and stored once.
[Figure: OLTP applications (savings, current accounts, loans) integrated into a single Customer subject in the data warehouse]
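The idea of defining a subject once can be sketched in a few lines. This is a minimal illustration only: the three extracts, their field names, and the helper `integrate_customers` are hypothetical and do not come from the course material or any Oracle product.

```python
# Hypothetical extracts from three OLTP applications; each one names the
# customer key and the customer name differently.
savings = [{"cust_no": 101, "name": "A. Jones", "balance": 2500.0}]
current_accounts = [{"customer_id": 101, "cust_name": "A. Jones", "balance": 310.0}]
loans = [{"cid": 101, "customer": "A. Jones", "outstanding": 12000.0}]


def integrate_customers(savings, current_accounts, loans):
    """Merge per-application rows into one record per customer."""
    customers = {}

    def record(cust_id, name):
        # Define the customer once; later sources reuse the same record.
        return customers.setdefault(
            cust_id, {"customer_id": cust_id, "name": name, "products": {}}
        )

    for row in savings:
        record(row["cust_no"], row["name"])["products"]["savings"] = row["balance"]
    for row in current_accounts:
        record(row["customer_id"], row["cust_name"])["products"]["current_account"] = row["balance"]
    for row in loans:
        record(row["cid"], row["customer"])["products"]["loan"] = row["outstanding"]
    return customers


warehouse_customers = integrate_customers(savings, current_accounts, loans)
```

Here the customer appears once in `warehouse_customers`, with the figures from each application attached to that single record.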
Time-Variant
Data is stored as a series of snapshots, each representing a period of time
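One way to picture a series of snapshots is a store keyed by the period each snapshot represents. The sketch below assumes month-end snapshots of account balances; the names `add_snapshot` and `balance_as_of` and the data layout are illustrative assumptions.

```python
from datetime import date

# Each snapshot is keyed by the end of the period it represents
# (a hypothetical layout chosen for illustration).
snapshots = {}


def add_snapshot(period_end, balances):
    """Store a copy of the balances as they stood at period_end."""
    snapshots[period_end] = dict(balances)


def balance_as_of(customer_id, as_of):
    """Return the balance from the latest snapshot on or before as_of."""
    eligible = [p for p in snapshots if p <= as_of]
    if not eligible:
        return None
    return snapshots[max(eligible)].get(customer_id)


add_snapshot(date(2023, 1, 31), {101: 2500.0})
add_snapshot(date(2023, 2, 28), {101: 2750.0})
```

Because earlier snapshots are kept rather than overwritten, a query can ask what a value was at any point covered by the history.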
Nonvolatile
Typically, data in the data warehouse is not updated or deleted.
[Figure: operational database compared with the warehouse, which is loaded and then read]
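The load-and-read pattern can be sketched as a table that simply offers no update or delete operation. The class `WarehouseTable` is a hypothetical toy, not an Oracle interface.

```python
class WarehouseTable:
    """Toy table exposing only the load and read operations."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Append copies so later changes to the source do not alter history.
        self._rows.extend(dict(r) for r in rows)

    def read(self, predicate=lambda row: True):
        # Return copies so callers cannot modify the stored rows.
        return [dict(r) for r in self._rows if predicate(r)]


table = WarehouseTable()
source_rows = [{"account": 1, "balance": 100.0}]
table.load(source_rows)
source_rows[0]["balance"] = 0.0  # a later change in the operational system
```

The change made to `source_rows` after the load does not reach the rows already held in `table`.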
Changing Data
[Figure: first-time load from the operational database into the warehouse database, followed by periodic refreshes]
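A first-time load followed by refreshes might be sketched as below. The sketch assumes each operational row carries a `modified` counter that can serve as a watermark; that field, the integer values, and both function names are assumptions made for illustration.

```python
def rows_changed_since(operational_rows, watermark):
    """Select operational rows modified after the last load."""
    return [r for r in operational_rows if r["modified"] > watermark]


def load_or_refresh(warehouse, operational_rows, watermark):
    """Append new rows to the warehouse and return the new watermark.

    A watermark of 0 stands for the first-time load, which takes everything.
    """
    batch = rows_changed_since(operational_rows, watermark)
    warehouse.extend(dict(r) for r in batch)
    return max((r["modified"] for r in batch), default=watermark)


operational = [
    {"order_id": 1, "amount": 40.0, "modified": 1},
    {"order_id": 2, "amount": 75.0, "modified": 2},
]
warehouse = []
watermark = load_or_refresh(warehouse, operational, 0)  # first-time load

operational.append({"order_id": 3, "amount": 20.0, "modified": 3})
watermark = load_or_refresh(warehouse, operational, watermark)  # refresh
```

Each refresh moves only the rows added since the previous load, so running it again with no new operational data leaves the warehouse unchanged.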
[Table: comparison of operational and data warehouse systems; surviving entries: Data Warehouse (seconds to hours, primarily read only); Nature of data (30-60 days)]
Usage Curves
Operational system is predictable
Data warehouse:
- Variable
- Random
User Expectations
Control expectations
Set achievable targets for query response
Set SLAs
Educate
Growth and use are exponential
Enterprisewide Warehouse
Large-scale implementation
Scope the entire business
Data from all subject areas
Developed incrementally
Single source of enterprisewide data
Single distribution point to dependent data marts
Marketing
Data Integration
Source data
Methodology
Ensures a successful data warehouse
Encourages incremental development
Provides a staged approach to an enterprisewide warehouse
- Safe
- Manageable
- Proven
- Recommended
Modeling
Warehouses differ from operational structures:
- Analytical requirements
- Subject orientation
Data must map to subject-oriented information:
- Identify business subjects
- Define relationships between subjects
- Name the attributes of each subject
Modeling is iterative
Modeling tools are available
[Figure: OLTP databases, staging file, warehouse database]
Purchase specialist tools, or develop programs
Extraction--select data using different methods
Transformation--validate, clean, integrate, and time stamp data
Transportation--move data into the
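The three steps can be sketched as three small functions chained together. This is a minimal sketch of a hand-written program, not a description of any specialist tool; the row fields (`region`, `amount`, `status`), the validation rule, and the function names are all assumptions.

```python
from datetime import datetime, timezone


def extract(oltp_rows):
    """Extraction: select the rows of interest (here, completed orders)."""
    return [r for r in oltp_rows if r.get("status") == "complete"]


def transform(rows, loaded_at):
    """Transformation: validate, clean, and time stamp each row."""
    staged = []
    for r in rows:
        if r.get("amount") is None or r["amount"] < 0:
            continue  # validation: reject missing or negative amounts
        staged.append({
            "region": r["region"].strip().upper(),  # cleaning
            "amount": float(r["amount"]),
            "loaded_at": loaded_at,  # time stamp
        })
    return staged


def transport(staged_rows, warehouse):
    """Transportation: move the staged rows into the target store."""
    warehouse.extend(staged_rows)
    return len(staged_rows)


oltp_rows = [
    {"region": " east ", "amount": 120, "status": "complete"},
    {"region": "west", "amount": -5, "status": "complete"},
    {"region": "west", "amount": 80, "status": "open"},
]
warehouse = []
loaded_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
moved = transport(transform(extract(oltp_rows), loaded_at), warehouse)
```

Of the three sample rows, one is dropped at extraction (still open) and one at validation (negative amount), so a single cleaned, time-stamped row reaches the target.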
Data Management
Efficient database server and management tools for all aspects of data management
Imperatives
- Productive
- Flexible
- Robust
- Efficient
Hardware, operating system and
Drill-down
Tools that retrieve data for business analysis
Imperatives
- Ease of use
- Intuitive
- Metadata
- Training
More than one tool may be required
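Drill-down, as offered by data access tools, means moving from a summary figure to the detail behind it. A rough sketch follows, assuming a two-level hierarchy of region and country; the sample data and the functions `summarize` and `drill_down` are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows with a region > country hierarchy.
sales = [
    {"region": "North America", "country": "USA", "amount": 500.0},
    {"region": "North America", "country": "Canada", "amount": 200.0},
    {"region": "Europe", "country": "France", "amount": 300.0},
]


def summarize(rows, level):
    """Total the amount for each distinct value of the given level."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[level]] += row["amount"]
    return dict(totals)


def drill_down(rows, parent_level, parent_value, child_level):
    """Break one summary value down by the next, more detailed level."""
    subset = [r for r in rows if r[parent_level] == parent_value]
    return summarize(subset, child_level)


by_region = summarize(sales, "region")
north_america = drill_down(sales, "region", "North America", "country")
```

The user starts from `by_region` and, on selecting one region, sees the same total split by country.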
[Figure: operational data; relational/multidimensional, spatial, Web, and audio/video data; relational tools, OLAP tools, and applications/Web]
[Figure: Warehousing engines in Oracle 8i. Sources pass through filter and transform steps (PL/SQL and Java transforms, transform driver, PL/SQL and Java wrapper, external functions) into target tables; SQL*Plus]
Business users
Oracle Reports
- Analysis: Current
- Task: Production reporting
- Question: What were sales by region last quarter?
Oracle Discoverer
- Analysis: Tactical
- Task: Ad hoc query and analysis
- Question: What is driving the increase in North American sales?
Oracle Express
- Analysis: Strategic
- Task: Advanced analysis
- Question: Given the rapid increase in Web sales, what will total sales be for the rest of the year?
Customers
Summary
This lesson covered the following topics:
- Identifying a common, broadly accepted definition of the data warehouse
- Distinguishing the differences between OLTP systems and analytical systems
- Defining some of the common data warehouse terminology
- Identifying some of the elements and processes in a data warehouse
- Identifying and positioning the Oracle Warehouse vision, products, and services