CDM Class1,2,3
A Data Warehouse is a relational database management system (RDBMS) constructed to meet requirements
that differ from those of transaction processing systems. It can be loosely described as any centralized
data repository that can be queried for business benefit. It is a database that stores information oriented
toward satisfying decision-making requests. Data warehousing is a group of decision support technologies
aimed at enabling the knowledge worker (executive, manager, and analyst) to make better and faster
decisions. Data warehousing therefore provides architectures and tools for business executives to
systematically organize, understand, and use their information to make strategic decisions.
A Data Warehouse environment contains an extraction, transformation, and loading (ETL) solution, an
online analytical processing (OLAP) engine, customer analysis tools, and other applications that handle
the process of gathering information and delivering it to business users.
A Data Warehouse (DW) is a relational database that is designed for query and analysis rather than
transaction processing. It includes historical data derived from transaction data from one or more
sources.
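The difference between query-and-analysis work and transaction processing can be made concrete with a small sketch. The example below uses Python's standard sqlite3 module purely as a stand-in for a warehouse. The sales table, its columns, and its rows are hypothetical and are not part of these notes.

```python
import sqlite3


def build_sales_table():
    """Create an in-memory table of hypothetical historical sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
    rows = [
        ("2023-01-15", "north", 100.0),
        ("2023-01-20", "south", 250.0),
        ("2023-02-03", "north", 175.0),
        ("2023-02-11", "south", 50.0),
        ("2023-02-27", "north", 25.0),
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def monthly_totals_by_region(conn):
    """Analytical query: scan the whole history and aggregate it."""
    cursor = conn.execute(
        "SELECT substr(sale_date, 1, 7) AS month, region, SUM(amount) "
        "FROM sales "
        "GROUP BY substr(sale_date, 1, 7), region "
        "ORDER BY month, region"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for month, region, total in monthly_totals_by_region(build_sales_table()):
        print(month, region, total)
```

A transactional system would typically read or change one row at a time (for example, a single order), whereas the query above reads every row and summarizes it, which is the access pattern a warehouse is designed for.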
A Data Warehouse provides integrated, enterprise-wide, historical data and focuses on supporting
decision-makers with data modeling and analysis.
A Data Warehouse is a collection of data that serves the entire organization, not only a particular group
of users.
It is not used for daily operations and transaction processing but for making decisions.
A Data Warehouse can be viewed as a data system with the following attributes:
o It is a database designed for investigative tasks, using data from various applications.
o It supports a relatively small number of clients with relatively long interactions.
o It includes current and historical data to provide a historical perspective of information.
o Its usage is read-intensive.
o It contains a few large tables.
Subject-Oriented
A data warehouse focuses on the modeling and analysis of data for decision-makers. Therefore, data
warehouses typically provide a concise and straightforward view of a particular subject, such as
customer, product, or sales, rather than of the organization's ongoing operations as a whole. This is
done by excluding data that is not useful for the subject and including all data the users need to
understand the subject.
Integrated
A data warehouse integrates various heterogeneous data sources such as RDBMSs, flat files, and online
transaction records. Data cleaning and integration are performed during data warehousing to
ensure consistency in naming conventions, attribute types, etc., among the different data sources.
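As a rough illustration of the integration step, the sketch below renames fields from two sources to one naming convention and coerces their attribute types. The field names, the alias table, and the sample records are hypothetical, chosen only to show the idea.

```python
# Hypothetical mapping from source-specific field names to warehouse names.
FIELD_ALIASES = {
    "cust_id": "customer_id",
    "CustomerID": "customer_id",
    "amt": "amount",
    "Amount": "amount",
}


def normalize_record(record):
    """Rename fields to the warehouse convention and coerce attribute types."""
    unified = {}
    for key, value in record.items():
        unified[FIELD_ALIASES.get(key, key)] = value
    # Flat files usually deliver strings, so types are made uniform here.
    unified["customer_id"] = int(unified["customer_id"])
    unified["amount"] = float(unified["amount"])
    return unified


def integrate(*sources):
    """Combine records from several sources into one consistent list."""
    rows = []
    for source in sources:
        rows.extend(normalize_record(record) for record in source)
    return rows


if __name__ == "__main__":
    rdbms_rows = [{"cust_id": 7, "amt": 19.5}]
    flat_file_rows = [{"CustomerID": "7", "Amount": "5.25"}]
    print(integrate(rdbms_rows, flat_file_rows))
```

After integration, both records use the same field names and types, so they can be queried together regardless of which source they came from.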
Time-Variant
Historical information is kept in a data warehouse. For example, one can retrieve data from 3 months, 6
months, 12 months, or even further back from a data warehouse. This contrasts with a transaction
system, where often only the most current data is kept.
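The contrast between keeping history and keeping only the current value can be sketched as follows. Both classes and their method names are hypothetical illustrations, not a description of any particular product.

```python
from datetime import date


class OperationalStore:
    """Transaction-style store: each write overwrites the previous value."""

    def __init__(self):
        self._current = {}

    def write(self, key, value, as_of):
        self._current[key] = value  # the as_of date is not retained

    def read(self, key):
        return self._current[key]


class WarehouseHistory:
    """Time-variant store: every value is kept together with its date."""

    def __init__(self):
        self._versions = {}

    def write(self, key, value, as_of):
        self._versions.setdefault(key, []).append((as_of, value))

    def read_as_of(self, key, when):
        """Return the latest value recorded on or before `when`, else None."""
        candidates = [(d, v) for d, v in self._versions.get(key, []) if d <= when]
        if not candidates:
            return None
        return max(candidates, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    history = WarehouseHistory()
    history.write("price", 10, date(2024, 1, 1))
    history.write("price", 12, date(2024, 4, 1))
    print(history.read_as_of("price", date(2024, 2, 15)))
```

The operational store can only answer "what is the value now?", while the history store can also answer "what was the value three months ago?".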
Non-Volatile
The data warehouse is a physically separate data store that holds data transformed from the source
operational RDBMS. Operational updates of data do not occur in the data warehouse, i.e., update,
insert, and delete operations are not performed. It usually requires only two procedures for data access:
initial loading of data and access to data. Therefore, the DW does not require transaction processing,
recovery, and concurrency control capabilities, which allows for a substantial speedup of data retrieval.
Non-volatile means that once data has entered the warehouse, it should not change.
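A minimal sketch of the two access procedures (loading and reading) is shown below. The class name and its methods are hypothetical, and real warehouses enforce this behavior through their load processes and permissions rather than through a class like this one.

```python
class NonVolatileTable:
    """Illustrative store that supports only loading and reading."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Load a batch of rows; copies are stored so callers cannot alter them."""
        self._rows.extend(dict(row) for row in rows)

    def query(self, predicate):
        """Return copies of the rows that satisfy the predicate."""
        return [dict(row) for row in self._rows if predicate(row)]

    def update(self, *args, **kwargs):
        raise NotImplementedError("warehouse rows are not updated in place")

    def delete(self, *args, **kwargs):
        raise NotImplementedError("warehouse rows are not deleted")


if __name__ == "__main__":
    table = NonVolatileTable()
    table.load([{"region": "north", "amount": 100}])
    print(table.query(lambda row: row["region"] == "north"))
```

Because readers never compete with in-place updates in this model, no locking or rollback logic is needed, which is the simplification the paragraph above describes.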
The idea of data warehousing emerged in the late 1980s, when IBM researchers Barry Devlin and Paul
Murphy developed the "Business Data Warehouse."
In essence, the data warehousing idea was intended to provide an architectural model for the flow of
information from operational systems to decision support environments. The concept attempted to
address the various problems associated with this flow, mainly the high costs associated with it.
In the absence of a data warehousing architecture, a vast amount of space was required to support
multiple decision support environments. In large corporations, it was common for various decision
support environments to operate independently.
1. Business users: Business users require a data warehouse to view summarized data from the
past. Since these users are non-technical, the data may be presented to them in a simple
form.
2. Store historical data: A data warehouse is required to store time-variant data from the
past. This data can then be used for various purposes.
3. Make strategic decisions: Some strategies may depend on the data in the data
warehouse. So, the data warehouse contributes to making strategic decisions.
4. Data consistency and quality: By bringing data from different sources to a
common place, the user can effectively bring uniformity and consistency to the data.
5. Fast response time: A data warehouse has to be ready for somewhat unexpected loads and
types of queries, which demands a significant degree of flexibility and quick response time.