Data Mining UNIT - 2 (Data Warehouse Architecture)
Bottom Tier :
The bottom tier consists of a warehouse database server, typically implemented as a relational
database system.
Back-end tools and utilities are used to load data into the bottom tier of the architecture from
operational databases or external sources.
Data extraction.
Data cleaning.
Data transformation.
The data are extracted using application program interfaces known as gateways.
A gateway, supported by the underlying DBMS, enables client programs to generate SQL code
for execution on a server.
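As a rough illustration of the gateway idea, the sketch below uses Python's built-in sqlite3 module as a stand-in for a gateway: the client program builds SQL text and passes it to the database engine for execution. The in-memory database, the `sales` table, and the helper `total_for_item` are assumptions made for this illustration; a real warehouse would sit on a server-based DBMS.

```python
import sqlite3

# In-memory SQLite database standing in for a remote warehouse server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO sales (item, amount) VALUES (?, ?)",
    [("pen", 2.5), ("book", 12.0), ("pen", 3.0)],
)


def total_for_item(connection, item):
    """Client-side helper: builds a parameterized SQL query and runs it on the database."""
    sql = "SELECT SUM(amount) FROM sales WHERE item = ?"
    (total,) = connection.execute(sql, (item,)).fetchone()
    return total  # None when no row matches the item


pen_total = total_for_item(conn, "pen")
```

The client never touches the stored rows directly; it only sends SQL through the connection object, which is the role the text assigns to a gateway.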
Example :
This tier also contains a metadata repository, which stores information about the data
warehouse and its contents.
Data Cleaning : Detects errors in the data and rectifies them when possible.
Data Transformation : Converts data from legacy or host format to warehouse format.
Load :
Sort.
Summarize.
Consolidate.
Compute views.
Check integrity.
Refresh : Transfers updates from the data sources to the data warehouse.
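The back-end steps listed above can be sketched as a small pipeline. This is a minimal illustration, assuming SQLite as the warehouse database; the sample rows, the table names `fact_sales` and `sales_by_customer`, and the functions `clean`, `transform`, and `load` are all hypothetical.

```python
import sqlite3

# Hypothetical rows from an operational source: (customer, date as DD/MM/YYYY, amount as text).
source_rows = [
    ("alice", "03/01/2024", "10.50"),
    ("bob", "04/01/2024", "not-a-number"),  # dirty amount, dropped by cleaning
    ("  Alice ", "05/01/2024", "4.50"),     # inconsistent name formatting
]


def clean(rows):
    """Data cleaning: drop rows whose amount cannot be parsed as a number."""
    cleaned = []
    for name, date, amount in rows:
        try:
            cleaned.append((name, date, float(amount)))
        except ValueError:
            continue
    return cleaned


def transform(rows):
    """Data transformation: normalize names and convert dates to YYYY-MM-DD."""
    out = []
    for name, date, amount in rows:
        day, month, year = date.split("/")
        out.append((name.strip().lower(), f"{year}-{month}-{day}", amount))
    return out


def load(connection, rows):
    """Load: sort the rows, insert them, and recompute a summary table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(customer TEXT, sale_date TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", sorted(rows))
    connection.execute("DROP TABLE IF EXISTS sales_by_customer")
    connection.execute(
        "CREATE TABLE sales_by_customer AS "
        "SELECT customer, SUM(amount) AS total FROM fact_sales GROUP BY customer"
    )
    connection.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(clean(source_rows)))
summary = warehouse.execute(
    "SELECT customer, total FROM sales_by_customer ORDER BY customer"
).fetchall()
```

In this sketch, calling `load` again with newly extracted rows appends them and rebuilds the summary table, which loosely plays the role of the refresh step.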
Middle Tier :
The middle tier is an OLAP server that is typically implemented using either :
A Relational OLAP (ROLAP) model, which extends a traditional relational DBMS by mapping
multidimensional data operations to standard relational operations.
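To make the ROLAP mapping concrete, the sketch below expresses two common multidimensional operations as ordinary SQL: a roll-up becomes a GROUP BY and a slice becomes a WHERE clause. It assumes SQLite as the relational engine; the `sales` fact table, its columns, and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table with two dimensions (city, quarter) and one measure (units).
conn.execute("CREATE TABLE sales (city TEXT, quarter TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Delhi", "Q1", 10), ("Delhi", "Q2", 15), ("Mumbai", "Q1", 7), ("Mumbai", "Q2", 8)],
)


def roll_up(connection, dimension):
    """Roll-up: aggregate the measure along one dimension using GROUP BY."""
    # Column names cannot be bound as SQL parameters, so check against a whitelist.
    if dimension not in ("city", "quarter"):
        raise ValueError(f"unknown dimension: {dimension}")
    sql = (
        f"SELECT {dimension}, SUM(units) FROM sales "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return connection.execute(sql).fetchall()


def slice_by_quarter(connection, quarter):
    """Slice: fix one value of the quarter dimension using WHERE."""
    sql = "SELECT city, units FROM sales WHERE quarter = ? ORDER BY city"
    return connection.execute(sql, (quarter,)).fetchall()


by_city = roll_up(conn, "city")
q1_slice = slice_by_quarter(conn, "Q1")
```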
Top Tier :
The top tier is a front-end client layer, which contains :
Analysis tools.
The Data Mart.
Enterprise Warehouse :
It collects all of the information about subjects spanning the entire organization.
It typically contains detailed data as well as summarized data and can range in size from a
few gigabytes to hundreds of gigabytes, terabytes or beyond.
Traditional mainframes.
Data Mart :
Example : A marketing data mart may confine its subjects to customer, item, and sales.
The implementation cycle of a data mart is more likely to be measured in weeks rather than
months or years.
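One way to picture a data mart is as a copy of only the subject tables a department needs. The sketch below is an illustration under assumptions, not a prescribed method: the enterprise warehouse is a toy SQLite database, every table has the same two hypothetical columns, and `build_data_mart` is a made-up helper.

```python
import sqlite3

# Hypothetical enterprise warehouse with four subject tables.
enterprise = sqlite3.connect(":memory:")
for table in ("customer", "item", "sales", "employee"):
    enterprise.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
    enterprise.execute(f"INSERT INTO {table} (name) VALUES (?)", (f"sample {table}",))

# Subjects of the marketing data mart from the example above.
MARKETING_SUBJECTS = ("customer", "item", "sales")


def build_data_mart(source, subjects):
    """Copy only the selected subject tables into a separate, smaller database."""
    mart = sqlite3.connect(":memory:")
    for table in subjects:
        mart.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, name TEXT)")
        rows = source.execute(f"SELECT id, name FROM {table}").fetchall()
        mart.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    mart.commit()
    return mart


marketing_mart = build_data_mart(enterprise, MARKETING_SUBJECTS)
mart_tables = sorted(
    name
    for (name,) in marketing_mart.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
```

The `employee` table stays in the enterprise warehouse only, so the mart covers fewer subjects and is quicker to build.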
Virtual Warehouse :
For efficient query processing, only some of the possible summary views may be
materialized.
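The difference between a virtual summary view and a materialized one can be sketched as follows. SQLite is assumed here as a stand-in for the operational database, and since it has no native materialized views, a snapshot table emulates one. The `orders` table, the view names, and the `materialize` helper are hypothetical.

```python
import sqlite3

ops = sqlite3.connect(":memory:")  # stand-in for an operational database (assumption)
ops.execute("CREATE TABLE orders (region TEXT, product TEXT, amount REAL)")
ops.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("north", "pen", 5.0), ("north", "book", 20.0), ("south", "pen", 7.0)],
)

# Virtual (non-materialized) views: recomputed from `orders` on every query.
ops.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)
ops.execute(
    "CREATE VIEW sales_by_product AS "
    "SELECT product, SUM(amount) AS total FROM orders GROUP BY product"
)


def materialize(connection, view_name):
    """Store a view's current result in a snapshot table and return the table name."""
    if view_name not in ("sales_by_region", "sales_by_product"):
        raise ValueError(f"unknown view: {view_name}")
    table = f"{view_name}_mat"
    connection.execute(f"DROP TABLE IF EXISTS {table}")
    connection.execute(f"CREATE TABLE {table} AS SELECT * FROM {view_name}")
    return table


# Materialize only one of the two possible summary views.
mat_table = materialize(ops, "sales_by_region")
region_totals = ops.execute(
    f"SELECT region, total FROM {mat_table} ORDER BY region"
).fetchall()
```

Queries against the snapshot table avoid recomputing the aggregate, but the snapshot goes stale when `orders` changes until `materialize` is called again.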