Data Warehousing
Data Warehouse
A data warehouse is a collection of data that supports decision-making processes. It is an integrated collection of databases rather than a single database, and it should be regarded as the single source of information for all decision support processing and all informational applications throughout the organization. It provides the following features:
It is subject-oriented.
It is integrated and consistent.
It shows its evolution over time and it is not volatile.
It is used to support:
management decision-making processes;
business intelligence;
surfacing the information and knowledge needed to manage the organization effectively;
investigation of the key challenges and research directions for this discipline.
It comprises data that belongs to different information subject areas.
It contains different categories of data.
The data warehouse carries out the process of accessing heterogeneous data, cleansing and transforming it, and storing it in a structure that is easy to access, understand, and use. This data is finally used for report generation, querying, and data analysis.
Warehouse catalog
The warehouse catalog is the subsystem that stores and manages all the metadata.
The metadata refers to such information as data element mapping from source to
target, data element meaning information for information systems and business
users, the data models (both logical and physical), a description of the use of the
data, and temporal information.
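To make the role of the catalog concrete, the following is a minimal sketch of what one catalog record might hold. The element names and fields (`source`, `meaning`, `valid_from`, and so on) are hypothetical, chosen only to mirror the kinds of metadata listed above: source-to-target mapping, business meaning, model information, and temporal information.

```python
# Hypothetical warehouse catalog: one record per warehouse data element.
catalog = {
    "dw.sales.revenue": {
        "source": "erp.orders.line_total",               # source-to-target mapping
        "meaning": "Gross revenue per order line, before tax",  # business meaning
        "model": "fact_sales (logical) / FACT_SALES (physical)",
        "valid_from": "2023-01-01",                      # temporal information
    },
}

def describe(element):
    """Return the business meaning recorded for a warehouse data element."""
    entry = catalog.get(element)
    return entry["meaning"] if entry else "unknown element"
```

A business user (or a tool) can then look up what a warehouse column actually means, e.g. `describe("dw.sales.revenue")`.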
Two Layer
The requirement for separation plays a fundamental role in defining the typical architecture for a data warehouse system. Although it is commonly called a two-layer architecture, to highlight the separation between the physically available sources and the data warehouse, it consists of the following layers:
1) Source layer
Legacy databases
Information systems outside the corporate walls
2) Data Staging
The data stored to sources should be extracted, cleansed to remove inconsistencies
and fill gaps, and integrated to merge heterogeneous sources into one common
schema. The so called Extraction, Transformation, and Loading tools (ETL) can
merge heterogeneous schemata, extract, transform, cleanse, validate, filter, and load
source data into a data warehouse. ETL takes place once when a data warehouse is
populated for the first time, then it occurs every time the data warehouse is regularly
updated ETL consists of four separate phases: extraction (or capture), cleansing (or
cleaning or scrubbing), transformation, and loading.
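The four phases above can be sketched as a tiny in-memory pipeline. This is only an illustration under simplifying assumptions (rows are plain dictionaries, the "warehouse" is a Python list, and the field names `id` and `amount` are invented); real ETL tools operate on databases, not lists.

```python
def extract(source_rows):
    """Extraction (capture): obtain the relevant rows from a source."""
    return [r for r in source_rows if r.get("relevant", True)]

def cleanse(rows):
    """Cleansing: drop duplicates and rows with a missing key."""
    seen, clean = set(), []
    for r in rows:
        key = r.get("id")
        if key is not None and key not in seen:
            seen.add(key)
            clean.append(r)
    return clean

def transform(rows):
    """Transformation: map source fields onto the common warehouse schema."""
    return [{"id": r["id"], "amount": float(r["amount"])} for r in rows]

def load(warehouse, rows):
    """Loading: append the transformed rows to the warehouse table."""
    warehouse.extend(rows)
    return warehouse

warehouse = []
source = [{"id": 1, "amount": "10.5"},
          {"id": 1, "amount": "10.5"},   # duplicate, removed by cleansing
          {"id": 2, "amount": "3"}]
load(warehouse, transform(cleanse(extract(source))))
```

Note how the phases compose in order: extract, then cleanse, then transform, then load.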
Extraction
Relevant data is obtained from the sources in the extraction phase. Static extraction is used when a data warehouse needs populating for the first time; incremental extraction is used to update data warehouses regularly. The data to be extracted is mainly selected on the basis of its quality.
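The difference between the two extraction modes can be shown with a small sketch. The `updated` timestamp field is an assumption for illustration; incremental extraction here simply takes rows changed since the last load (ISO date strings compare correctly as plain strings).

```python
def static_extract(source):
    """Static extraction: take everything, for the first population."""
    return list(source)

def incremental_extract(source, last_loaded):
    """Incremental extraction: take only rows changed since the last load."""
    return [r for r in source if r["updated"] > last_loaded]

source = [
    {"id": 1, "updated": "2024-01-01"},
    {"id": 2, "updated": "2024-02-01"},
]
first_load = static_extract(source)                      # full capture
delta = incremental_extract(source, last_loaded="2024-01-15")  # changes only
```

In practice the `last_loaded` watermark would itself be stored in warehouse metadata between runs.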
Cleansing
The cleansing phase is crucial in a data warehouse system because it is supposed to improve data quality. A few of the mistakes and inconsistencies that make data dirty are:
Duplicate data
Inconsistent values that are logically associated
Missing data, such as a customer's job
Impossible or wrong values
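A cleansing step that flags the kinds of dirty data listed above might look like the following sketch. The record fields (`id`, `job`, `age`) and the age bounds are hypothetical, chosen only to illustrate duplicate detection, missing-value checks, and impossible-value checks.

```python
def find_problems(rows):
    """Flag duplicates, missing values, and impossible values in customer rows."""
    problems = []
    seen = set()
    for r in rows:
        if r["id"] in seen:                       # duplicate data
            problems.append((r["id"], "duplicate"))
        seen.add(r["id"])
        if r.get("job") is None:                  # missing data
            problems.append((r["id"], "missing job"))
        if not (0 <= r.get("age", 0) <= 130):     # impossible value
            problems.append((r["id"], "impossible age"))
    return problems

rows = [
    {"id": 1, "job": "clerk", "age": 34},
    {"id": 1, "job": "clerk", "age": 34},   # duplicate
    {"id": 2, "job": None, "age": 200},     # missing job, impossible age
]
issues = find_problems(rows)
```

A real cleansing tool would also repair or standardize the flagged values, not merely report them.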
Loading
Loading into a data warehouse is the last step. Loading can be carried out in two ways:
Refresh: Data warehouse data is completely rewritten, which means that older data is replaced.
Update: Only the changes applied to the source data are added to the data warehouse. An update is typically carried out without deleting or modifying pre-existing data. This technique is used in combination with incremental extraction to update data warehouses regularly.
3) Data warehouse layer :
Data warehouses have emerged to meet the organization's decision-support needs. Surrounded by analytical tools and models, data warehouses have the potential to transform operational data into business intelligence, enabling effective problem and opportunity identification, critical decision making, and strategy formulation, implementation, and evaluation.
Content Management :
Managing the content of a data warehouse is a daunting task. Operational systems draw data from a variety of databases that run on different hardware platforms, use different operating systems and DBMSs, and have different database structures with varying structural, conceptual, and instance-level semantics.
Major challenges remain for data warehouse content management:
Identifying and accessing the appropriate data sources, and coordinating data capture from them in an appropriate timeframe.
A data warehouse serves as a repository for data extracted from diverse operational information systems.
The extraction, transformation, and loading (ETL) functions in a data warehouse are considered the most time-consuming and expensive portion of the development lifecycle.
Often such operational systems were not designed to be integrated, and data extracts are performed manually or on a schedule determined by the operational systems.
As a result, data in the data warehouse may reflect different states of different systems; data extracted from an inventory system, for example, may not be synchronized with data extracted from other systems.
Coordination mechanisms must therefore be established.
Clearly, the data warehouse must go beyond its current role as a repository of historical data describing the operations and transactions in which the organization has engaged. It must include data describing partners and partnerships, policies and rules of the business, competitors and markets, goals and standards, opportunities and problems, and alternatives and predicted futures.
Support
Organizations are using data warehousing to support strategic and mission-critical applications. Data deposited into the data warehouse must be transformed into information and knowledge and appropriately disseminated to decision makers within the organization and to critical partners in various supply chains. Problems that need to be addressed in this area are:
1) Selection of proper analytical and data mining tools
2) Privacy and security of data
3) System performance
4) An adequate level of training and support
A data warehouse is an integrated, subject-oriented collection of strategic information that serves as a single source for the decision support environment.
Data warehouse model: an abstract model, supported by graphical and lexical documentation, representing the data warehouse content that is involved in analytics applications.
Difference between Data warehousing and OLTP model
Global
A global warehouse is designed and created based on the holistic needs of the enterprise. It can act as a common repository for decision support data across the entire enterprise.
The term global in this warehouse architecture does not refer only to a centralized scheme (or a physical location); it reflects the scope and access of data across the organization. The data warehouse could also be distributed across different physical locations.
The major issues in setting up this kind of data warehouse are the time and cost involved when it spans multiple geographic locations.
Top-Down Implementation
A top-down implementation requires more planning and design work to be completed
at the beginning of the project. This brings with it the need to involve people from
each of the workgroups, departments, or lines of business that will be participating in
the data warehouse implementation. Decisions concerning data sources to be used,
security, data structure, data quality, data standards, and an overall data model will
typically need to be completed before actual implementation begins. However, the
cost of the initial planning and design can be significant. It is a time-consuming
process and can delay actual implementation, benefits, and return-on-investment.
Bottom-Up Implementation
A bottom-up implementation involves the planning and designing of data marts
without waiting for a more global infrastructure to be put in place. This does not
mean that a more global infrastructure will not be developed; it will be built
incrementally as initial data mart implementations expand. This approach is more
widely accepted today than the top down approach because immediate results from
the data marts can be realized and used as justification for expanding to a more
global implementation. The bottom-up implementation approach has become the choice of many organizations, especially among business management, because of the faster payback.
Considerations While Choosing a Data Warehouse Modelling Approach
1) Fact : Facts contain:
Dimension keys (each dimension key is a reference to a dimension)
Grain
Measures and supportive measures
2) Dimension : A dimension is a collection of members or units of the same type. A dimension provides a certain business context to each measure. Common dimensions could be:
Time
Location/region
Customers
Salesperson
Grain of a dimension : The grain of a dimension is the lowest level of detail available within that dimension.
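The relationship between facts, dimension keys, measures, and grain can be sketched with toy records. All table and field names here (`time_dim`, `store_dim`, `fact_sales`, etc.) are hypothetical; the point is that a fact row holds only keys and measures, while the dimensions supply the business context.

```python
# Dimension tables: keyed rows that give measures their business context.
time_dim = {101: {"date": "2024-03-01", "month": "2024-03", "year": 2024}}
store_dim = {7: {"city": "Pune", "region": "West"}}

# Fact table at the grain "one row per store per day":
# dimension keys plus measures.
fact_sales = [
    {"time_key": 101, "store_key": 7, "units": 12, "revenue": 480.0},
]

# Following a fact's dimension keys recovers the business context.
row = fact_sales[0]
context = (time_dim[row["time_key"]]["month"],
           store_dim[row["store_key"]]["region"])
```

Choosing a finer grain (e.g. per store, per day, per product) multiplies the number of fact rows but enables more detailed analysis.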
Drill-down
Exploring facts at more detailed levels
Roll-up
Aggregating facts at less detailed levels
Slice and dice are the operations for browsing the data through the visualized cube. Slicing cuts through the cube by fixing a value on one dimension so that users can focus on a specific perspective. Dicing selects a subcube by fixing values on two or more dimensions so that users can be more specific in their data analysis.
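The cube operations above can be illustrated over plain fact rows. The data and attribute names are invented; roll-up is shown as aggregation over one attribute, and slicing as fixing a dimension value (drill-down would simply return to the more detailed rows).

```python
from collections import defaultdict

# Fact rows at the daily grain (hypothetical data).
facts = [
    {"date": "2024-03-01", "region": "West", "units": 10},
    {"date": "2024-03-02", "region": "West", "units": 5},
    {"date": "2024-03-01", "region": "East", "units": 7},
]

def roll_up(rows, by):
    """Roll-up: aggregate the measure at a less detailed level."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[by]] += r["units"]
    return dict(totals)

def slice_cube(rows, **fixed):
    """Slice: fix dimension values to focus on one perspective."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]

units_by_region = roll_up(facts, by="region")   # less detail: per region
west_only = slice_cube(facts, region="West")    # focus on one region
```

Dicing would pass several fixed dimensions to `slice_cube` at once, yielding a subcube rather than a single slice.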
Requirement Analysis
Requirement analysis is used to build an initial dimensional model that represents the end user requirements which were previously captured in an informal way. The output of this phase acts as an input for the requirement modeling activities, once it has passed the requirements validation phase. The deliverables of this phase consist of a combination of:
Initial dimensional data models
A business directory, or metadata definitions of all elements of the multi-dimensional model
The end user requirements can be classified into two major categories:
Process-oriented requirements: These represent the major information processing elements which the end users are performing. Process-oriented requirements may be Business Objectives or Business Queries.
Information-oriented requirements: These represent the major data items which
the end users require for their data analysis activities.
The ultimate scope of requirement analysis can be summarized as:
Gather and interpret business requirements and formulate a business question.
Candidate measures, facts, and dimensions are determined.
Grains of dimensions and granularities of measures and facts are determined.
Query-oriented approach
In this approach, the dimensions are determined first; then the facts are established. This follows the natural query-oriented approach of picking the end user queries as the first source of information.
Business-oriented approach
This approach tries to capture the fundamental elements of the business problem. First, the facts are determined through analysis of the problem domain from the business point of view. Then, the dimensions and measures are added to the model.
Data source oriented approach
This approach focuses on the source database models to determine the dimensions, followed by the measures and facts.
Requirements modelling
After the requirements have been validated, they can be represented as a model. The model can be an initial multi-dimensional model, or a concrete model represented using cubes or a mathematical notation technique representing points in a multi-dimensional space. These representations may be appealing, especially cubes, but their complexity increases exponentially as the dimensionality increases. For simplicity, we'll keep the model as a cubical dimensional model.
The requirements modelling activities can be distinguished into two broad groups:
(i) Base techniques - used for producing the logical models for the dimensions in the initial model. These dimension modeling techniques involve:
Adding dimension attributes, which aid in selecting the relevant facts
Dimension browsing - exploring the dimension to detect and set the appropriate selection and aggregation constraints used in subsequent analysis of facts
Once the dimension attributes and facts are gathered, a detailed dimension model is prepared.
(ii) Detailed dimension modeling - which should incorporate the structure of the dimension as well as all of its attributes.
The proposed approach for modeling the dimensions consists of the following
activities for each dimension hierarchy:
Create an entity for each of the aggregation levels within the hierarchy and add
identifiers for each of the dimension entities.
Link the entities in a hierarchical structure and add the required attributes to each dimension entity (those useful, relevant, or requested by the end user).
Demote aggregation levels which do not have any associated attributes from dimension entities into dimension attributes.
This kind of approach leads to the so-called snowflake models, because it standardizes the dimension hierarchies and aggregation levels.
There are two basic models that can be used in dimensional modeling:
Star model
Snowflake model
Star Schema
Star schema has become a common term used to connote a dimensional model.
Database designers have long used the term star schema to describe dimensional
models because the resulting structure looks like a star and the logical diagram looks
like the physical schema. Each dimension table is a denormalized construct which holds all the attributes of all the aggregation levels in all of the hierarchies of a given dimension.
Very pragmatic approach
Easier to use
Snowflake Schema
This is a representation of a multidimensional data model in which the dimension hierarchies are structured and normalized. Because the data is normalized, redundancy is minimal compared to the star schema. This model is highly useful in situations where the dimensions are very complex, and it offers increased modeling and design flexibility.
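The structural difference between the two schemas can be sketched with toy tables. All names (`star_product_dim`, `snow_category`, the `Pen`/`Stationery`/`Office` hierarchy) are hypothetical: the star version stores every aggregation level in one denormalized row, while the snowflake version normalizes the hierarchy into linked tables that must be joined at query time.

```python
# Star: one denormalized dimension table holds all aggregation levels.
star_product_dim = [
    {"product_key": 1, "product": "Pen",
     "category": "Stationery", "dept": "Office"},
]

# Snowflake: the hierarchy is normalized into linked tables (less redundancy).
snow_product = [{"product_key": 1, "product": "Pen", "category_key": 10}]
snow_category = [{"category_key": 10, "category": "Stationery", "dept_key": 100}]
snow_dept = [{"dept_key": 100, "dept": "Office"}]

def snow_lookup_dept(product_key):
    """Resolve a product's department by joining along the snowflaked hierarchy."""
    p = next(r for r in snow_product if r["product_key"] == product_key)
    c = next(r for r in snow_category if r["category_key"] == p["category_key"])
    d = next(r for r in snow_dept if r["dept_key"] == c["dept_key"])
    return d["dept"]
```

The star form answers the same question with a single row access, which is why it is the more pragmatic and easier-to-use choice; the snowflake form trades extra joins for reduced redundancy and flexibility.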
Hints and Tips While Making a Multi-Dimensional Model
Properties of measures
Business-related facts
Fact identifiers, dimension keys and uniqueness
Dimension roles