
L4. Data Warehouse Architecture

The three-tier architecture is the most widely used data warehouse architecture. It consists of three tiers - a bottom tier database that stores cleansed and transformed data, a middle tier OLAP server that acts as an interface between users and the database, and a top tier client layer of query and analysis tools. ETL tools are used to extract, transform and load data from multiple sources into the bottom tier database. Metadata provides information about the data such as where it came from and how it is transformed. Query tools allow users to analyze and retrieve information from the data warehouse.

Data Warehouse Architecture


Data warehouse architecture is complex because a data warehouse is an information
system that contains historical and cumulative data from multiple sources. There are
three approaches to constructing data warehouse layers: single-tier, two-tier, and
three-tier. Each is described below.
Single-tier architecture
• The objective of a single-tier architecture is to minimize the amount of data
stored by removing data redundancy. This architecture is not frequently used in practice.
Two-tier architecture
• A two-tier architecture physically separates the available data sources from
the data warehouse. This architecture is not expandable and does not support
a large number of end users. It also has connectivity problems because of
network limitations.
Three-Tier Data Warehouse Architecture
This is the most widely used data warehouse architecture.
• Bottom Tier: The database of the data warehouse serves as the bottom
tier. It is usually a relational database system. Data is cleansed,
transformed, and loaded into this layer using back-end tools.
• Middle Tier: The middle tier is an OLAP server, implemented using either
the ROLAP or the MOLAP model. For the user, this application tier presents
an abstracted view of the database. This layer also acts as a mediator
between the end user and the database.
• Top Tier: The top tier is the front-end client layer. It consists of the tools
and APIs that connect to the data warehouse and get data out of it. These
can be query tools, reporting tools, managed query tools, analysis tools, and
data mining tools.
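
The three tiers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `sales` table, the `OlapServer` class, and the `report` function are hypothetical names, and an in-memory SQLite database stands in for the warehouse database. The middle tier here translates requests into SQL over relational tables, which is in the spirit of ROLAP rather than a real OLAP server.

```python
import sqlite3

# Bottom tier: a relational database holding cleansed, loaded data.
# The table and column names are hypothetical.
def build_bottom_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", 2023, 100.0), ("North", 2024, 150.0), ("South", 2024, 80.0)],
    )
    return conn

# Middle tier: mediates between the client and the database and
# presents an abstracted view (the caller never writes SQL).
class OlapServer:
    ALLOWED_DIMENSIONS = {"region", "year"}

    def __init__(self, conn):
        self._conn = conn

    def total_by(self, dimension):
        # Only whitelisted column names are ever placed into the SQL text.
        if dimension not in self.ALLOWED_DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        rows = self._conn.execute(
            f"SELECT {dimension}, SUM(amount) FROM sales "
            f"GROUP BY {dimension} ORDER BY {dimension}"
        ).fetchall()
        return dict(rows)

# Top tier: a front-end reporting tool that only talks to the middle tier.
def report(server, dimension):
    return [f"{key}: {total:.2f}" for key, total in server.total_by(dimension).items()]

server = OlapServer(build_bottom_tier())
lines = report(server, "region")
```

The client code calls only `report` and `total_by`, so the database behind the middle tier could change without affecting the top tier.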
Data Warehouse Back-End Tools and Utilities
• Data extraction: get data from multiple, heterogeneous, and external sources
• Data cleaning: detect errors in the data and rectify them when possible
• Data transformation: convert data from legacy or host format to warehouse format
• Load: sort, summarize, consolidate, compute views, check integrity, and build
indices and partitions
• Refresh: propagate the updates from the data sources to the warehouse
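
The five back-end steps can be sketched as one small pipeline. This is a minimal illustration, assuming two hypothetical sources (`CSV_SOURCE` and `API_SOURCE`) and an in-memory SQLite database as the warehouse. The cleaning step here only drops rows it cannot parse, so it does not attempt to rectify them.

```python
import csv
import io
import sqlite3

# Two hypothetical, heterogeneous sources: a CSV export and an in-memory feed.
CSV_SOURCE = "id,customer,amount\n1,alice,10.5\n2,bob,not-a-number\n"
API_SOURCE = [{"id": 3, "customer": "Carol", "amount": 7}]

def extract():
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(API_SOURCE)
    return rows

def clean(rows):
    # Detect errors: keep only rows whose amount parses as a number.
    good = []
    for row in rows:
        try:
            float(row["amount"])
        except (TypeError, ValueError):
            continue
        good.append(row)
    return good

def transform(rows):
    # Convert both source formats to one warehouse format.
    return [(int(r["id"]), str(r["customer"]).title(), float(r["amount"])) for r in rows]

def load(conn, records):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # INSERT OR REPLACE lets a later refresh overwrite changed source rows.
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_orders_customer ON orders (customer)")
    conn.commit()

def refresh(conn):
    load(conn, transform(clean(extract())))

conn = sqlite3.connect(":memory:")
refresh(conn)
API_SOURCE[0]["amount"] = 9  # the source changes...
refresh(conn)                # ...and the refresh propagates the update
loaded = conn.execute("SELECT id, customer, amount FROM orders ORDER BY id").fetchall()
```

Re-running the whole pipeline is the simplest possible refresh; a real warehouse would more likely propagate only the changed rows.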

Sourcing, Acquisition, Clean-up and Transformation Tools (ETL)
The data sourcing, transformation, and migration tools perform all the conversions,
summarizations, and other changes needed to transform data into a unified format in the data
warehouse. They are also called Extract, Transform and Load (ETL) tools.
Their functionality includes:
• Anonymizing data as per regulatory stipulations.
• Preventing unwanted data in operational databases from being loaded into the data warehouse.
• Searching for and replacing common names and definitions for data arriving from different sources.
• Calculating summaries and derived data.
• Populating missing data with defaults.
• De-duplicating repeated data arriving from multiple data sources.
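
Several of the functions listed above can be sketched as small, composable Python helpers. This is a minimal sketch under assumptions: the record fields (`order_id`, `email`, `country`, `amount`) and the `DEFAULTS` mapping are hypothetical, and the truncated SHA-256 digest merely stands in for whatever anonymization the applicable regulation actually requires.

```python
import hashlib

DEFAULTS = {"country": "UNKNOWN"}  # hypothetical default values

def anonymize(record, field="email"):
    # Replace a personal identifier with a SHA-256 digest prefix.
    out = dict(record)
    out[field] = hashlib.sha256(record[field].encode("utf-8")).hexdigest()[:12]
    return out

def fill_defaults(record):
    # Populate missing (None or empty) values with defaults.
    out = dict(record)
    for key, default in DEFAULTS.items():
        if out.get(key) in (None, ""):
            out[key] = default
    return out

def deduplicate(records, key="order_id"):
    # Keep the first record seen for each key.
    seen, unique = set(), []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            unique.append(rec)
    return unique

def total_by_country(records):
    # A simple summary (derived data): total amount per country.
    totals = {}
    for rec in records:
        totals[rec["country"]] = totals.get(rec["country"], 0) + rec["amount"]
    return totals

raw = [
    {"order_id": 1, "email": "a@example.com", "country": "IN", "amount": 20},
    {"order_id": 1, "email": "a@example.com", "country": "IN", "amount": 20},
    {"order_id": 2, "email": "b@example.com", "country": "", "amount": 5},
]
staged = [fill_defaults(anonymize(r)) for r in deduplicate(raw)]
summary = total_by_country(staged)
```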
Metadata
The name metadata may suggest a high-level technological data
warehousing concept, but it is quite simple: metadata is data
about data, and it defines the data warehouse. It is used for building,
maintaining, and managing the data warehouse.

In the data warehouse architecture, metadata plays an important role
as it specifies the source, usage, values, and features of data
warehouse data. It also defines how data can be changed and
processed. It is closely connected to the data warehouse.
Metadata helps to answer the following questions:
• What tables, attributes, and keys does the data warehouse contain?
• Where did the data come from?
• How many times has the data been reloaded?
• What transformations were applied during cleansing?
Metadata can be classified into the following categories:
• Technical metadata: contains information about the warehouse that is
used by data warehouse designers and administrators.
• Business metadata: contains details that give end users an easy way
to understand the information stored in the data warehouse.
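
One way to picture metadata is as a small catalog record per warehouse table. The sketch below is only an illustration: the `TableMetadata` class, its field names, and the `crm_export.csv` source are hypothetical, chosen so that each of the four questions above maps to one field.

```python
from dataclasses import dataclass, field

@dataclass
class TableMetadata:
    # Technical metadata (for designers and administrators)
    table: str
    columns: list          # what attributes the table contains
    keys: list             # what keys the table contains
    source: str            # where the data came from
    transformations: list = field(default_factory=list)  # applied during cleansing
    reload_count: int = 0  # how many times the data was reloaded
    # Business metadata (for end users)
    description: str = ""

    def record_reload(self):
        self.reload_count += 1

catalog = {}

def register(meta):
    catalog[meta.table] = meta

register(TableMetadata(
    table="orders",
    columns=["id", "customer", "amount"],
    keys=["id"],
    source="crm_export.csv",
    transformations=["trim whitespace", "drop rows with invalid amount"],
    description="One row per customer order",
))
catalog["orders"].record_reload()
catalog["orders"].record_reload()
```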
Query Tools
One of the primary objectives of data warehousing is to provide
information that businesses can use to make strategic decisions. Query
tools allow users to interact with the data warehouse system.
These tools fall into four categories:
• Query and reporting tools
• Application Development tools
• Data mining tools
• OLAP tools
Design of Data Warehouse: A Business Analysis Framework

Four views regarding the design of a data warehouse:
• Top-down view: allows selection of the relevant information necessary for the data warehouse
• Data source view: exposes the information being captured, stored, and managed by operational systems
• Data warehouse view: consists of fact tables and dimension tables
• Business query view: sees the perspectives of data in the warehouse from the view of the end user
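
The data warehouse view, with its fact and dimension tables, can be illustrated with a small star schema. This is a sketch under assumptions: the table names (`fact_sales`, `dim_product`, `dim_date`) and the sample rows are hypothetical, and SQLite stands in for the warehouse database. The final query is the kind of question the business query view would ask.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/when" of each measurement.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
-- The fact table stores the measurements plus keys into the dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    units INTEGER,
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
INSERT INTO dim_date VALUES (10, 2024, 1), (11, 2024, 2);
INSERT INTO fact_sales VALUES (1, 10, 5, 10.0), (1, 11, 3, 6.0), (2, 11, 1, 25.0);
""")

# Business query: total revenue per product category.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```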

Data Warehouse Design Process
• A data warehouse can be designed top-down, bottom-up, or with a combination of both
• Top-down: starts with overall design and planning (mature)
• Bottom-up: starts with experiments and prototypes (rapid)
• From a software engineering point of view, the design and construction of a data warehouse may consist
of the following steps: planning, requirements study, problem analysis, warehouse design, data
integration and testing, and finally deployment of the data warehouse.
• Waterfall: structured and systematic analysis at each step before proceeding to the next
• Spiral: rapid generation of increasingly functional systems, with short turnaround time

Data Warehouse Usage
Three kinds of data warehouse applications:
• Information processing: supports querying, basic statistical analysis, and reporting using
crosstabs, tables, charts, and graphs
• Analytical processing: multidimensional analysis of data warehouse data; supports basic OLAP
operations such as slice-and-dice, drilling, and pivoting
• Data mining: knowledge discovery from hidden patterns; supports associations, constructing
analytical models, performing classification and prediction, and presenting the mining results
using visualization tools
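
The basic OLAP operations named under analytical processing can be sketched on a tiny in-memory cube. This is an illustration only: the cube contents, the dimension names in `DIMS`, and the helper functions are hypothetical, and a real OLAP server would run these operations over far larger, pre-aggregated data.

```python
from collections import defaultdict

# A tiny hypothetical cube: each cell is (year, region, product, sales).
CELLS = [
    (2023, "North", "Pen", 10),
    (2023, "South", "Pen", 4),
    (2024, "North", "Pen", 7),
    (2024, "North", "Lamp", 3),
]
DIMS = ("year", "region", "product")

def slice_(cells, dim, value):
    # Slice: fix one dimension to a single value.
    i = DIMS.index(dim)
    return [c for c in cells if c[i] == value]

def dice(cells, **allowed):
    # Dice: restrict several dimensions to sets of values.
    return [
        c for c in cells
        if all(c[DIMS.index(d)] in vals for d, vals in allowed.items())
    ]

def roll_up(cells, *keep):
    # Roll-up: aggregate away every dimension not listed in keep.
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for c in cells:
        totals[tuple(c[i] for i in idx)] += c[3]
    return dict(totals)

def pivot(cells, row_dim, col_dim):
    # Pivot: rotate two dimensions into a rows-by-columns table.
    table = defaultdict(dict)
    for (row, col), total in roll_up(cells, row_dim, col_dim).items():
        table[row][col] = total
    return dict(table)

north = slice_(CELLS, "region", "North")
pens_2024 = dice(CELLS, year={2024}, product={"Pen"})
by_year = roll_up(CELLS, "year")
by_year_region = roll_up(CELLS, "year", "region")  # drilling down from by_year
year_by_region = pivot(CELLS, "year", "region")
```

Drilling down is the reverse of rolling up: `by_year_region` keeps one more dimension than `by_year`, so each yearly total is broken out by region.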
