Unit 1 Data Warehouse Fundamentals
Structure:
1.1 Introduction
Objectives
1.2 OLTP Systems
1.3 Characteristics & Functions of Data Warehouses
1.4 Advantages and Applications of Data Warehouse
1.5 Top-Down and Bottom-Up Development Methodology
1.6 Tools for Data Warehouse Development
1.7 Data Warehouse Types
1.8 Summary
1.9 Terminal Questions
1.10 Answers
1.1 Introduction
Data Warehouses and Data Warehouse applications are designed primarily
to support executives, senior managers, and business analysts in making
complex business decisions. Data Warehouse applications provide the
business community with access to accurate consolidated information from
various internal and external sources. The goal of using a Data Warehouse
is to have an efficient way of managing information and analyzing data.
Nowadays, corporate organizations generate gigabytes of data daily and
store it in various database systems. But the question is, how
efficiently do people use such a huge amount of data to control and monitor
their business performance? Are they able to get timely information without
errors? Are they able to get useful data for analysis? The answer to these
questions is a Data Warehouse. So what is a Data Warehouse, and how is it
used? You will get the answers to these questions in the subsequent
paragraphs.
Objectives:
After studying this unit, you will be able to:
describe the differences between Online Transaction Processing (OLTP)
systems and Data Warehouse systems
define the characteristics of a Data Warehouse
describe the functionality of a Data Warehouse
Sikkim Manipal University
B1633
OLTP systems alone cannot answer all of these questions. Once again, the
answer is a Data Warehouse. So it is time to look at the differences
between OLTP and Data Warehouse systems.
Differences between OLTP and Data Warehouse
Application databases are OLTP (On-Line Transaction Processing)
systems where every transaction has to be recorded as and when it
occurs. Consider the scenario where a bank ATM has disbursed cash to
a customer but was unable to record this event in the bank records. If
this happens frequently, the bank wouldn't stay in business for too long.
So the banking system is designed to make sure that every transaction
gets recorded within the time you stand before the ATM.
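The ATM scenario above can be sketched in code. The following is a minimal illustration, assuming Python's built-in sqlite3 module and two hypothetical tables, accounts and withdrawals (neither name comes from this unit). It shows one possible way to make the cash debit and its record succeed or fail together; it is not a description of how any real banking system is built.

```python
import sqlite3


def withdraw(conn, account_id, amount):
    """Debit the account and record the withdrawal as one atomic unit.

    Returns True if both steps were committed, False if nothing changed.
    The table and column names are hypothetical, chosen for illustration.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE id = ? AND balance >= ?",
                (amount, account_id, amount),
            )
            if cur.rowcount != 1:
                raise ValueError("unknown account or insufficient funds")
            conn.execute(
                "INSERT INTO withdrawals (account_id, amount) VALUES (?, ?)",
                (account_id, amount),
            )
        return True
    except (sqlite3.Error, ValueError):
        return False


def demo():
    """Run one successful and one rejected withdrawal on sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE withdrawals (account_id INTEGER NOT NULL, amount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 500)")
    conn.commit()
    ok = withdraw(conn, 1, 200)
    rejected = withdraw(conn, 1, 1000)
    balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
    logged = conn.execute("SELECT COUNT(*) FROM withdrawals").fetchone()[0]
    return ok, rejected, balance, logged
```

With the sample data, demo() returns (True, False, 300, 1): the first withdrawal is debited and recorded, and the second is rejected without touching the balance. If the record could not be written at all, the debit would be rolled back as well.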
A Data Warehouse (DW), on the other hand, is a database (yes, you are
right, it's a database) that is designed for facilitating querying and
analysis. Often designed as OLAP (On-Line Analytical Processing)
systems, these databases contain read-only data that can be queried
and analyzed far more efficiently than your regular OLTP
application databases. In this sense an OLAP system is designed to be
read-optimized.
Separation from your application database also ensures that your
business intelligence solution is scalable (your bank and ATMs don't go
down just because the CFO asked for a report), better documented, and
better managed.
Creation of a DW leads to a direct increase in quality of analysis as the
table structures are simpler (you keep only the needed information in
simpler tables), standardized (well-documented table structures), and
often de-normalized (to reduce the linkages between tables and the
corresponding complexity of queries). A well-designed DW is the
foundation upon which successful BI (Business Intelligence)/Analytics
initiatives are built.
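To illustrate how de-normalization reduces the linkages between tables, here is a small sketch that again assumes Python's built-in sqlite3 module. The tables customers, products, orders and sales_flat, and the sample rows, are hypothetical and are not taken from this unit. The same question (total sales per region) is asked of a normalized layout and of a flat, warehouse-style table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (OLTP-style) tables: each fact is stored once.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE products  (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER,
                            product_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO products  VALUES (10, 'Books'), (20, 'Toys');
    INSERT INTO orders    VALUES (100, 1, 10, 30), (101, 1, 20, 20),
                                 (102, 2, 10, 50);

    -- De-normalized (warehouse-style) table: region and category are
    -- copied onto every sales row, so analysis needs no joins.
    CREATE TABLE sales_flat AS
        SELECT o.id AS order_id, c.region, p.category, o.amount
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        JOIN products  p ON p.id = o.product_id;
""")

# Total sales per region against the normalized tables: a join is required.
NORMALIZED_QUERY = """
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region ORDER BY c.region
"""

# The same question against the flat table: one table, no joins.
FLAT_QUERY = """
    SELECT region, SUM(amount)
    FROM sales_flat
    GROUP BY region ORDER BY region
"""

by_region_normalized = conn.execute(NORMALIZED_QUERY).fetchall()
by_region_flat = conn.execute(FLAT_QUERY).fetchall()
```

Both queries return [('North', 50), ('South', 50)] for the sample rows. The flat table repeats the region and category values on every row, which is the duplication a DW accepts in exchange for simpler queries.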
Data Warehouses usually store many months or years of data. This is to
support historical analysis. OLTP systems usually store data from only a
few weeks or months. An OLTP system stores historical data only as
needed to successfully meet the requirements of the current transaction.
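The difference in retention can be pictured with a short sketch. It assumes Python's built-in sqlite3 module, two hypothetical tables (oltp_transactions and dw_transactions) and a hypothetical helper, load_then_purge, none of which come from this unit. It shows only one simple way the two stores could end up holding different spans of history; real loading processes are usually more involved.

```python
import sqlite3


def load_then_purge(conn, cutoff_date):
    """Copy new OLTP rows into the warehouse table, then trim old OLTP rows.

    cutoff_date is an ISO 'YYYY-MM-DD' string; such strings sort in date
    order, so a plain text comparison is enough for this sketch.
    """
    with conn:  # both steps commit together or not at all
        conn.execute(
            "INSERT INTO dw_transactions (id, txn_date, amount) "
            "SELECT id, txn_date, amount FROM oltp_transactions "
            "WHERE id NOT IN (SELECT id FROM dw_transactions)"
        )
        conn.execute(
            "DELETE FROM oltp_transactions WHERE txn_date < ?", (cutoff_date,)
        )


def demo():
    """Load three sample transactions and keep only recent ones in OLTP."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE oltp_transactions "
        "(id INTEGER PRIMARY KEY, txn_date TEXT, amount INTEGER)"
    )
    conn.execute(
        "CREATE TABLE dw_transactions "
        "(id INTEGER PRIMARY KEY, txn_date TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO oltp_transactions VALUES (?, ?, ?)",
        [(1, "2023-01-15", 100), (2, "2023-06-30", 250), (3, "2024-02-01", 75)],
    )
    conn.commit()
    load_then_purge(conn, "2024-01-01")
    oltp_rows = conn.execute("SELECT COUNT(*) FROM oltp_transactions").fetchone()[0]
    dw_rows = conn.execute("SELECT COUNT(*) FROM dw_transactions").fetchone()[0]
    return oltp_rows, dw_rows
```

For the sample data, demo() returns (1, 3): the OLTP table keeps only the recent transaction, while the warehouse table retains all three for historical analysis.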
Property            OLTP                Data Warehouse
Data structure      3NF                 Multidimensional
Indexes             Few                 Many
Joins               Many                Some
Duplicate data      Normalized          Denormalized
Aggregate data      Rare                Common
Queries             Mostly predefined   Mostly ad hoc
Nature of queries   Mostly simple       Mostly complex
Updates
Historical data                         Essential
Self Assessment Questions
1. OLTP stands for ________________________
2. OLTP handles day-to-day business transactions (true/false)
3. Updates on the Data Warehouse are allowed (true/false)
4. A Data Warehouse is a database that is designed for facilitating
_________ and __________.
1.8 Summary
Popular data warehouse tools include Cognos, Informatica, and SAS.
1.10 Answers
Self Assessment Questions
1. On-Line Transaction Processing
2. True
3. False
4. Query and Analysis
5. Non-Volatile
6. True
7. Real time, federated and distributed
8. time variant
Terminal Questions
1. A Data Warehouse is a relational database that is designed for query
and analysis rather than for transaction processing. Refer to sections
1.3 and 1.4.
2. Data Warehouses are designed to help you analyze data. Refer to
section 1.3.
3. Data Warehouses can be developed using top-down or bottom-up
methodologies. Refer to section 1.5.
4. A Data Mart is a small Data Warehouse with a limited scope,
usually departmental-level data. Refer to section 1.6.
5. Often designed as OLAP (On-Line Analytical Processing) systems, these
databases contain read-only data that can be queried. Refer to section 1.2.
6. Refer to section 1.6.