Chapter 2 Data Warehousing
A data warehouse periodically pulls data from internal applications such as
sales, marketing, and finance; from customer-facing applications; and from
external partner systems.
This data is then made available for decision-makers to access and analyze.
Data warehouses are exclusively intended to perform queries and analysis and
often contain large amounts of historical data. The data within a data
warehouse is usually derived from a wide range of sources such as application
log files and transaction applications.
Key Characteristics of a Data Warehouse
Subject-Oriented
Integrated
Non-Volatile
Time-Variant
A typical data warehouse often includes
the following elements:
• A relational database to store and manage data
• An extraction, loading, and transformation (ELT)
solution for preparing the data for analysis
• Statistical analysis, reporting, and data mining
capabilities
• Client analysis tools for visualizing and presenting data
to business users
• Other, more sophisticated analytical applications that
generate actionable information by applying
data science and artificial intelligence (AI) algorithms,
or graph and spatial features that enable more kinds of
What is ETL?
ETL stands for extract, transform, and load. It is a data integration process
that collects data from multiple sources, transforms it into a single,
consistent format that is usable for analysis, and loads it into a data
warehouse or other target system. A typical ETL pipeline performs three steps:
• Extract data from legacy systems
• Cleanse the data to improve data quality and establish consistency
• Load data into a target database
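The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production pipeline: the source rows in LEGACY_ROWS, the sales table, and the function names extract, transform, load, and run_etl are all hypothetical, and an in-memory SQLite database stands in for the target system.

```python
import sqlite3

# Hypothetical rows as they might arrive from a legacy sales system.
LEGACY_ROWS = [
    {"customer": " Alice ", "amount": "120.50", "region": "north"},
    {"customer": "BOB", "amount": "80", "region": "North"},
    {"customer": "", "amount": "15.00", "region": "south"},     # missing customer
    {"customer": "Carol", "amount": "n/a", "region": "South"},  # unparseable amount
]


def extract():
    """Extract: return raw records from the (hypothetical) source system."""
    return list(LEGACY_ROWS)


def transform(rows):
    """Cleanse: trim and normalise text, parse amounts, drop unusable rows."""
    clean = []
    for row in rows:
        customer = row["customer"].strip().title()
        region = row["region"].strip().lower()
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if not customer:
            continue  # skip rows with no customer name
        clean.append((customer, region, amount))
    return clean


def load(conn, rows):
    """Load: write the cleansed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run extract, transform, and load against an in-memory SQLite target."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return conn


if __name__ == "__main__":
    conn = run_etl()
    query = "SELECT customer, region, amount FROM sales ORDER BY customer"
    for row in conn.execute(query):
        print(row)
```

Keeping each step in its own function means the cleansing rules can be changed or tested without touching the code that reads the source or writes the target.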
Database vs Data Warehouse
A data warehouse and a traditional database share some
similarities, but they are not the same thing.
The main difference is that in a database, data is collected for
multiple transactional purposes.
In a data warehouse, data is collected on an extensive scale to
perform analytics.
Databases provide real-time data, while warehouses store data to
be accessed for big analytical queries.
A data warehouse is an example of an online analytical processing (OLAP)
system, which answers queries over stored data. An online transaction
processing (OLTP) system modifies data as transactions occur; an ATM is
an example.
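The contrast between the two workloads can be illustrated with a small Python sketch. Everything here is assumed for illustration: the accounts and withdrawals tables, the sample figures, and the function names withdraw and withdrawals_by_year are hypothetical, and both workloads share one in-memory SQLite database only to keep the example short. In practice the analytical query would run against the warehouse, not the transactional database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute(
    "CREATE TABLE withdrawals (account_id INTEGER, year INTEGER, amount REAL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 500.0)")
conn.executemany(
    "INSERT INTO withdrawals VALUES (?, ?, ?)",
    [(1, 2022, 40.0), (1, 2022, 60.0), (1, 2023, 25.0)],
)
conn.commit()


def withdraw(conn, account_id, year, amount):
    """OLTP-style work: a short transaction that modifies one account."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO withdrawals VALUES (?, ?, ?)",
            (account_id, year, amount),
        )


def withdrawals_by_year(conn):
    """OLAP-style work: a read-only aggregate over historical rows."""
    return conn.execute(
        "SELECT year, SUM(amount) FROM withdrawals GROUP BY year ORDER BY year"
    ).fetchall()


withdraw(conn, 1, 2023, 75.0)
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
summary = withdrawals_by_year(conn)
```

The transactional function touches a single row and must complete quickly, while the analytical query scans the whole history and changes nothing.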
Data Warehouse Architecture
Simple
Simple with a staging area
Hub and spoke
Sandboxes
Business Intelligence Tools for Data Warehouses
Dundas BI
Sisense
IBM Cognos Analytics
InetSoft
SAP Business Intelligence
Halo
OLAP for Multidimensional Analysis
Volume of data | It can process a limited volume of data. | It processes enormous data. | It can process huge volumes of data.
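As a rough illustration of multidimensional analysis, the following Python sketch aggregates a handful of fact rows along chosen dimensions. It is a toy model under stated assumptions, not how any particular OLAP product works: the FACTS rows, the DIMENSIONS names, and the helper functions roll_up and slice_facts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, year, sales_amount).
FACTS = [
    ("north", "laptop", 2022, 100.0),
    ("north", "phone", 2022, 50.0),
    ("south", "laptop", 2022, 70.0),
    ("north", "laptop", 2023, 120.0),
    ("south", "phone", 2023, 30.0),
]
DIMENSIONS = ("region", "product", "year")


def roll_up(facts, *dims):
    """Sum the sales amount grouped by the named dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # the last field is the measure
    return dict(totals)


def slice_facts(facts, dim, value):
    """Keep only the rows where one dimension equals a fixed value."""
    position = DIMENSIONS.index(dim)
    return [row for row in facts if row[position] == value]


if __name__ == "__main__":
    print(roll_up(FACTS, "region"))
    print(roll_up(slice_facts(FACTS, "year", 2022), "region", "product"))
```

Rolling up by a single dimension gives a coarse summary, while slicing first and then grouping by two dimensions gives a more detailed view of one part of the data.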