Lecture 1
The term "Data Warehouse" was first coined by Bill Inmon in 1990.
According to Inmon, a data warehouse is a subject-oriented, integrated,
time-variant, and non-volatile collection of data. This data helps analysts
make informed decisions in an organization. An operational database undergoes
frequent changes on a daily basis on account of the transactions that take
place. Suppose a business executive wants to analyze previous feedback on some
data, such as a product, a supplier, or a consumer. If only the operational
database is available, the executive will have no data to analyze, because the
previous data has been overwritten by later transactions. A data warehouse
provides generalized and consolidated data in a multidimensional view. Along
with this generalized and consolidated view of data, a data warehouse also
provides Online Analytical Processing (OLAP) tools. These tools help in
interactive and effective analysis of data in a multidimensional space. This
analysis results in data generalization and data mining. Data mining functions
such as association, clustering, classification, and prediction can be
integrated with OLAP operations to enhance the interactive mining of knowledge
at multiple levels of abstraction. That is why the data warehouse has now
become an important platform for data analysis and online analytical
processing.
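The idea of viewing the same data at multiple levels of abstraction can be illustrated with a minimal sketch. The fact rows, the dimension names (month, product, region), and the roll_up helper below are all hypothetical and are not taken from any particular OLAP tool. Rows are only appended and each carries a date, which loosely mirrors the non-volatile and time-variant properties described above.

```python
from collections import defaultdict

# Hypothetical fact rows: (date, product, region, units_sold).
# Rows are only appended, never updated, so history stays available.
FACTS = [
    ("2023-01-15", "laptop", "north", 3),
    ("2023-01-20", "laptop", "south", 2),
    ("2023-02-03", "laptop", "north", 4),
    ("2023-02-10", "phone", "north", 5),
    ("2023-02-18", "phone", "south", 1),
]

# How each (assumed) dimension is read from a fact row.
DIMENSIONS = {
    "month": lambda row: row[0][:7],   # "YYYY-MM" prefix of the date
    "product": lambda row: row[1],
    "region": lambda row: row[2],
}


def roll_up(facts, *dims):
    """Sum units_sold grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(DIMENSIONS[d](row) for d in dims)
        totals[key] += row[3]
    return dict(totals)


# Detailed view: units per product per month.
by_product_month = roll_up(FACTS, "product", "month")
# Rolled-up view: the month dimension is dropped, giving a coarser summary.
by_product = roll_up(FACTS, "product")
```

Calling roll_up with fewer dimensions gives a more general view, and calling it with more dimensions drills back down into detail.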
Data Warehouse and Data Mining 2023-2024
Data Warehousing (DW) is a process for collecting and managing data from
varied sources to provide meaningful business insights. A data warehouse is
typically used to connect and analyze business data from heterogeneous
sources. The data warehouse is the core of the Business Intelligence (BI)
system, which is built for data analysis and reporting.
You may know that a 3NF-designed database for an inventory system has many
tables related to each other. For example, a report on current inventory
information can include more than 12 join conditions. This can quickly
slow down the response time of the query and report. A data warehouse
provides a new design that helps reduce the response time and enhance the
performance of queries for reports and analytics.
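The contrast between a normalized design and a report-oriented design can be sketched with Python's built-in sqlite3 module. The table and column names (supplier, product, stock, inventory_wide) are hypothetical. The example only shows that the same report needs joins in one design and none in the other. It does not measure response time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (3NF-style) operational tables; all names are hypothetical.
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT,
                          supplier_id INTEGER REFERENCES supplier(supplier_id));
    CREATE TABLE stock (product_id INTEGER REFERENCES product(product_id),
                        location TEXT, quantity INTEGER);
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO product VALUES (10, 'bolt', 1), (11, 'nut', 1), (12, 'gear', 2);
    INSERT INTO stock VALUES (10, 'A', 100), (10, 'B', 50),
                             (11, 'A', 70), (12, 'B', 5);
""")

# Report against the normalized tables: every question repeats the joins.
joined = conn.execute("""
    SELECT s.name, SUM(st.quantity)
    FROM stock st
    JOIN product p ON p.product_id = st.product_id
    JOIN supplier s ON s.supplier_id = p.supplier_id
    GROUP BY s.name ORDER BY s.name
""").fetchall()

# Warehouse-style wide table: the joins are resolved once, at load time.
conn.execute("""
    CREATE TABLE inventory_wide AS
    SELECT s.name AS supplier, p.name AS product, st.location, st.quantity
    FROM stock st
    JOIN product p ON p.product_id = st.product_id
    JOIN supplier s ON s.supplier_id = p.supplier_id
""")

# The same report now reads a single table with no join conditions.
flat = conn.execute(
    "SELECT supplier, SUM(quantity) FROM inventory_wide "
    "GROUP BY supplier ORDER BY supplier"
).fetchall()
```

Both queries return the same totals per supplier. The difference is that the second one moves the join work from query time to load time, which is the trade-off a reporting-oriented design makes.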
Structured
Semi-structured
Unstructured data
The data is processed, transformed, and ingested so that users can access
the processed data in the Data Warehouse through Business Intelligence
tools, SQL clients, and spreadsheets. A data warehouse merges
information coming from different sources into one comprehensive
database.
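As a rough illustration of merging different sources into one store, the sketch below extracts records from a structured CSV source and a semi-structured JSON source, transforms them into a common shape, and loads them into a single list standing in for the warehouse. The sample data, the field names, and the extract, transform, and load functions are assumptions made for this example.

```python
import csv
import io
import json

# Hypothetical structured source: CSV export from an order system.
ORDERS_CSV = "customer_id,amount\n1,250.00\n2,99.50\n1,40.00\n"

# Hypothetical semi-structured source: JSON records from a web application.
CUSTOMERS_JSON = (
    '[{"id": 1, "profile": {"name": " alice "}},'
    ' {"id": 2, "profile": {"name": "BOB"}}]'
)


def extract():
    """Read raw records from both sources."""
    orders = list(csv.DictReader(io.StringIO(ORDERS_CSV)))
    customers = json.loads(CUSTOMERS_JSON)
    return orders, customers


def transform(orders, customers):
    """Clean names, convert types, and combine both sources into one shape."""
    names = {c["id"]: c["profile"]["name"].strip().title() for c in customers}
    rows = []
    for order in orders:
        customer_id = int(order["customer_id"])
        rows.append({
            "customer_id": customer_id,
            "customer_name": names[customer_id],
            "amount": float(order["amount"]),
        })
    return rows


def load(rows, warehouse):
    """Append the processed rows to the (in-memory) warehouse table."""
    warehouse.extend(rows)
    return warehouse


warehouse = load(transform(*extract()), [])
```

After loading, every row has the same fields regardless of which source it came from, so a reporting tool can query one consistent table.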
An Operational Data Store (ODS) is a data store required when neither the
data warehouse nor the OLTP systems support an organization's reporting
needs. In an ODS, the data warehouse is refreshed in real time. Hence, it is
widely preferred for routine activities like storing records of the
employees.
3. Data Mart:
The following are general stages of use of the data warehouse (DWH):
In this stage, the data warehouse is updated whenever a transaction takes
place in the operational database, for example in an airline or railway
booking system.
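One simple way to picture this stage is a database trigger that copies each new transaction into a warehouse table as it happens. The sketch below uses Python's built-in sqlite3 module. The booking and dw_booking_fact tables and the booking_to_dw trigger are hypothetical, and both tables live in one in-memory database only to keep the example short. In practice the operational system and the warehouse are usually separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical operational table for a booking system.
    CREATE TABLE booking (booking_id INTEGER PRIMARY KEY,
                          route TEXT, seats INTEGER);

    -- Hypothetical warehouse fact table that keeps a load timestamp.
    CREATE TABLE dw_booking_fact (booking_id INTEGER, route TEXT,
                                  seats INTEGER, loaded_at TEXT);

    -- Every insert into the operational table is copied at once.
    CREATE TRIGGER booking_to_dw AFTER INSERT ON booking
    BEGIN
        INSERT INTO dw_booking_fact
        VALUES (NEW.booking_id, NEW.route, NEW.seats, datetime('now'));
    END;
""")

# Two operational transactions; no separate batch load is run.
conn.execute("INSERT INTO booking (route, seats) VALUES ('DEL-BOM', 2)")
conn.execute("INSERT INTO booking (route, seats) VALUES ('DEL-BOM', 1)")
conn.commit()

# The warehouse table already reflects both bookings.
fact_rows = conn.execute(
    "SELECT route, SUM(seats), COUNT(*) FROM dw_booking_fact GROUP BY route"
).fetchall()
```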
1. Data Reporting
2. Query Tools
3. Application development tools
4. EIS tools
5. OLAP tools and data mining tools.
Airline:
In the airline system, it is used for operational purposes like crew
assignment, analysis of route profitability, frequent flyer program
promotions, etc.
Banking:
Healthcare:
Public sector:
In this sector, the warehouses are primarily used to analyze data patterns,
customer trends, and to track market movements.
Retail chain:
Telecommunication:
Hospitality Industry:
The best way to address the business risk associated with a data warehouse
implementation is to employ a three-pronged strategy, as below:
6. A data warehouse allows users to access critical data from a number
of sources in a single place. Therefore, it saves users the time of
retrieving data from multiple sources.
7. A data warehouse stores a large amount of historical data. This helps
users to analyze different time periods and trends to make future
predictions.