Data Warehouse
Lutfi Freij
Konstantin Rimarchuk
Vasken Chamlaian
John Sahakian
Suzan Ton
Inmon
Father of the data warehouse
Co-creator of the Corporate Information Factory
He has 35 years of experience in database technology management and data warehouse design.
Inmon-Cont’d
Bill has written about a variety of topics on the building, usage, and maintenance of the data warehouse and the Corporate Information Factory.
Characteristics of Data Warehouse
Subject oriented
Data integrated
Time variant
Nonvolatile
The time horizons for warehouse and operational data elements differ. Data in the operational environment are fresh, whereas warehouse data are generally much older, so there is minimal opportunity for the data to overlap between the two environments.
Given these factors, Inmon suggests that data redundancy between the two environments is a rare occurrence, with a typical redundancy factor of less than 1%.
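The time-variant and nonvolatile characteristics can be illustrated with a minimal Python sketch. It assumes a hypothetical customer-balance subject area; the names `Snapshot` and `WarehouseTable` are invented for illustration and do not come from Inmon or from any product. Each row carries the date it describes, and rows are only ever appended, never updated in place.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One warehouse row, stamped with the date it describes (time variant)."""
    customer_id: int
    balance: float
    as_of: date


class WarehouseTable:
    """Hypothetical append-only table: rows are added, never changed (nonvolatile)."""

    def __init__(self):
        self._rows = []

    def load(self, row):
        # Loading adds a new snapshot; existing snapshots are left untouched.
        self._rows.append(row)

    def history(self, customer_id):
        # Return every snapshot for one customer, oldest first.
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return sorted(rows, key=lambda r: r.as_of)


table = WarehouseTable()
table.load(Snapshot(1, 100.0, date(2004, 1, 31)))
table.load(Snapshot(1, 250.0, date(2004, 2, 29)))
```

An operational system would typically overwrite the January balance with the February one; the sketch keeps both, which is why warehouse data tend to be older than operational data.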
The Data Warehouse Architecture
The architecture consists of various interconnected elements:
Operational and external database layer – the source data for the DW
Information access layer – the tools the end user uses to extract and analyze the data
Data access layer – the interface between the operational and information access layers
Metadata layer – the data directory or repository of metadata information
Components of the Data Warehouse Architecture
The Data Warehouse Architecture
Additional layers are:
Process management layer – the scheduler or job controller
Application messaging layer – the “middleware” that transports information around the firm
Physical data warehouse layer – where the actual data used in the DSS are located
Data staging layer – all of the processes necessary to select, edit, summarize and load warehouse data from the operational and external databases
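The four steps of the data staging layer (select, edit, summarize, load) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, field names, and the `status == "posted"` rule are hypothetical, and the warehouse is a plain dictionary rather than a real database.

```python
from collections import defaultdict

# Hypothetical rows as they might arrive from an operational system.
operational_rows = [
    {"region": " west ", "amount": "120.50", "status": "posted"},
    {"region": "East", "amount": "80.00", "status": "posted"},
    {"region": "west", "amount": "30.00", "status": "void"},
    {"region": "WEST", "amount": "19.50", "status": "posted"},
]


def select(rows):
    # Select: keep only the rows the warehouse cares about.
    return [r for r in rows if r["status"] == "posted"]


def edit(rows):
    # Edit: clean and standardize values so sources agree on format.
    return [
        {"region": r["region"].strip().lower(), "amount": float(r["amount"])}
        for r in rows
    ]


def summarize(rows):
    # Summarize: aggregate detail rows into totals per region.
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)


def load(warehouse, summary):
    # Load: write the summarized data into the (in-memory) warehouse.
    warehouse.update(summary)
    return warehouse


warehouse = load({}, summarize(edit(select(operational_rows))))
```

Keeping each step as its own function mirrors the way the staging layer is described: a chain of separate processes that sit between the source databases and the physical warehouse.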
Data Warehousing Typology
The virtual data warehouse – the end users have direct access to the data stores, using tools enabled at the data access layer
The central data warehouse – a single physical database contains all of the data for a specific functional area
The distributed data warehouse – the components are distributed across several physical databases
The Metadata
The name suggests some high-level technological concept, but it really is fairly simple. Metadata are “data about data”.
With the emergence of the data warehouse as a decision support structure, the metadata are considered as much a resource as the business data they describe.
Metadata are abstractions: high-level data that provide concise descriptions of lower-level data.
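As a small illustration of “data about data”, the Python sketch below derives a metadata record from a handful of business rows. The `orders` rows, the `describe` helper, and the source label are all hypothetical; a real metadata repository would hold far more (definitions, owners, load history).

```python
# Business data: the lower-level rows themselves (hypothetical values).
orders = [
    {"order_id": 1, "amount": 120.5},
    {"order_id": 2, "amount": 80.0},
]


def describe(table_name, rows, source):
    # Build a concise, higher-level description of the rows: metadata.
    columns = {name: type(value).__name__ for name, value in rows[0].items()}
    return {
        "table": table_name,
        "source": source,
        "row_count": len(rows),
        "columns": columns,
    }


orders_metadata = describe("orders", orders, source="hypothetical order-entry system")
```

The metadata record says nothing about any individual order; it describes where the table came from, how large it is, and what each column holds.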
Stephen Brobst
Chief Technology Officer
Teradata
Success of Data Warehouse Projects
Most challenging type of deployment for an enterprise
Used by:
Continental Airlines in the US: rerouting passengers on delayed flights, reissuing tickets, and reserving rooms through a hotel booking system
Southwest Airlines: savings of between $1.2 million and $1.4 million
Identity Theft
Government Regulation of Personal Data is Needed
(National Consumer Protection Standards)
ChoicePoint Folly
http://seattletimes.nwsource.com/html/editorialsopinion/2002191098_credited27.html
Seattle Times, plugging holes in data warehousing
ON THE MARK
Mark Hall. Computerworld. Framingham: Oct 18, 2004. Vol. 38, Iss. 42; p. 6 (1 page)
Optimization: It's All About the Data. Brandweek: Ellen Pederson, Mark Anderson
http://www.computerworld.com/printthis/2001/0,4814,56969,00.html Micro-segmentation – Computerworld