Module 2: ADBMS

Data Warehousing
▪ Introduction to DW
▪ DW Architecture
▪ ETL Process
▪ Top-Down and Bottom-Up Approaches
▪ Characteristics & Benefits of Data Mart
▪ Differences between OLAP & OLTP
▪ Dimensional Analysis
▪ Drill down and Roll up
▪ OLAP Models
▪ Schemas – Star, Snowflake and Fact Constellation

1. Introduction to Data Warehousing

A Data Warehouse is a subject-oriented, integrated, non-volatile and time-variant collection of data in support of the management's decisions.

Subject Oriented Data
▪ In every industry, data sets are organized
around individual applications to support
those particular operational systems.
▪ In a DW, data is stored by subjects, not by
applications.
▪ Business subjects differ from enterprise to
enterprise. Eg: in a manufacturing company
– sales, shipments, inventory are critical
business subjects
▪ In the operational systems shown, data for each
application is organized separately by application: Order
processing, customer loans, billing, accounts receivable,
claims processing and savings account.
▪ CLAIMS is a critical business subject for an insurance
company.
▪ Claims under automobile insurance policies are
processed in the Auto Insurance Application.
▪ But in the DW for the insurance company, claims data
are organized around the subject of CLAIMS (and not by
application).
▪ NOTE: Data in a DW cuts across applications.
Integrated Data
▪ For proper decision making, one needs
relevant data from various applications.
▪ The data in the DW comes from several
operational systems.
▪ Source data are in different databases and
files – these are disparate applications; so
the operational platforms and OS could be
different.
Integrated Data
▪ In addition to data from internal
operational systems, for many enterprises,
data from outside sources is likely to be
very important.
▪ The DW may need data from such
sources.

[Figure: Data from three operational applications feeding the ACCOUNT subject area in the DW]
▪ Here the data fed into the subject area of ACCOUNT in
the DW comes from 3 different operational applications
▪ There could be several variations:
▪ Naming conventions could be different
▪ Attributes for data items may be different
▪ Account number in the savings bank application may
be 8 bytes long, but only 6 bytes in the checking
application
▪ Before moving the data to the DW, it must go through the process of transformation, consolidation and integration of the source data.
Non-volatile Data
▪ Data extracted from the various operational systems
and pertinent data obtained from outside sources are
transformed, integrated and stored in the DW.
▪ Data in the DW is not intended to run the day-to-day
business.
▪ Data in the operational systems are moved to the DW
at specific intervals.
▪ Data movements are scheduled based on the requirements of the users.
▪ Business transactions update the operational
systems databases in real time.
▪ We add, change, or delete data from an
operational database as each transaction
takes place.
▪ But we do not usually update or delete the
data in a DW.
▪ Data in a DW is not as volatile as the data in an operational database.
▪ It is primarily used for query and analysis.

Time-Variant Data
▪ For an operational system, the stored data contains
CURRENT values.
▪ Eg: In an accounts receivable system, the balance is the
current outstanding balance in the customer’s account.
▪ We store past transactions, but essentially operational
systems reflect current information because these
systems support day-to-day current operations

Time-Variant Data
▪ On the other hand, the data in the DW is meant for
analysis and decision making.
▪ Eg: If the user is looking at the buying pattern of a
customer, he requires data about current and past
purchases.
▪ A DW, because of the very nature of its purpose, has to
contain historical data (and not just current values).

Time-Variant Data
▪ Every data structure in a DW contains the time element.
▪ Eg: In a DW containing units of sale, quantity stored in
each record relates to a specific time element.
Depending on the level of details in the DW, sales
quantity may relate to a specific date, week, month,
quarter or even year.

COMPONENTS OF A DATA WAREHOUSE

Building blocks of a DW
▪ Include
▪ 1. Source Data Component (Production,
External, Internal, Archived)
▪ 2. Data Staging Component
▪ 3. Data Storage Component
▪ 4. Information Delivery Component
▪ 5. Metadata
▪ 6. Management & Control Component
1. Source Data Component
▪ Source data in the DW may be grouped into 4
broad categories

[Figure: The four categories of source data: Production Data, Internal Data, Archived Data, External Data]
A) Production Data

▪ This category of data comes from the various operational systems of the enterprise.
▪ There may be variations in data formats: the data may reside on different hardware platforms and may be supported by different databases and operating systems.
▪ The challenge is to standardize and transform the disparate data from the different production systems, convert the data, and integrate the pieces into useful data for storage in the DW.
B) Internal Data

▪ In every organization, users keep their "private" spreadsheets, documents, customer profiles and sometimes departmental databases.
▪ This internal data may be useful in the DW.
▪ Internal data adds complexity to the process of transforming and integrating the data before it can be stored in the DW.
C) Archived Data

▪ In every operational system, old data is periodically taken out and stored in archived files.
▪ Sometimes data is archived after a year, sometimes only after 5 years.
▪ A DW keeps historical snapshots of data, which are required for analysis over a period of time.
▪ To obtain historical information, archived data must be accessed.
▪ This type of data is useful for discerning patterns and analysing trends.
D) External Data

▪ Most executives depend on data from external sources for a high percentage of the information they use.
▪ E.g., they may require the market share of competitors, or may use standard values of financial indicators for their business to check their performance.
▪ Usually, data from outside sources does not conform to the organization's formats. Conversions must take place to make the external data adhere to the internal requirements.
2. Data Staging Component
▪ Now we need to prepare the data for storage in the DW.
▪ The three major functions involved, all of which take place in the staging area, are:
▪ A) Extraction
▪ B) Transformation
▪ C) Loading
▪ Data staging provides an area with a set of functions to clean, change, combine, convert, deduplicate and prepare the source data for storage in the DW.
3. Data Storage Component
▪ The DW must store large volumes of historical data for analysis.
▪ Further, the data in the DW must be kept in structures
suitable for analysis.
▪ Hence, the data storage component for the DW is a
separate repository.

3. Data Storage Component
▪ When analysts use the data in the DW for analysis, they
need to know that the data is stable and that it
represents snapshots at specific periods.
▪ As they are working with the data, storage must not be
in a state of updating.
▪ For this reason, the DW is a "read-only" repository.

4. Information Delivery Component
Novice users need prefabricated reports and preset
queries.
Casual users also need prepackaged information once in a
while, not regularly.
Business Analysts look for the ability to perform complex
queries.
Power users want to be able to navigate through the DW,
pick up interesting data, format their own queries, drill
through the data layers and create custom reports and
queries.

[Figure: Information delivery matched to user classes: Novice/Casual Users, Business Analysts/Power Users, Senior/Executive-Level Managers, KDD]
5. Metadata
Metadata is the data about the data in the DW

6. Management & Control Component
This component coordinates the services and activities
within the DW.
It controls the data transformation and the data transfer to
the DW storage.
It also monitors the movement of data into the staging
area.
It interacts with the metadata component to perform its necessary functions.

DATA MARTS

Definition
A data mart is focused on a single functional area of an organization and contains a subset of the data stored in the DW.
A data mart is a condensed version of a DW and is designed for use by a specific department, unit, or a set of users in an organization.
Definition
It is often controlled by a
single department in an
organization.
It usually draws data from
only a few sources compared
to a DW.
They are small in size and
more flexible than a DW.

TOP-DOWN & BOTTOM-UP
APPROACH

Bottom-Up (Characteristics & Benefits of Data Mart)
TOP-DOWN APPROACH
This is the big-picture approach in which the
overall, big, enterprise-wide DW is built.
There is no collection of fragmented islands of
information.
The DW is large and integrated.
This approach would take longer to build and
has a high risk of failure.

ADVANTAGES
1) Truly corporate-effort, enterprise-view of data
2) Inherently architected – not a union of disparate
data marts.
3) Single, central repository of data about the
content.
4) Centralized rules and control.
5) May see quicker results if implemented with iterations.
DISADVANTAGES

1) Takes longer to build even with an iterative method.

2) High risk of failure.

3) High-level of cross-functional skills required.

4) High outlay without proof of concept.

BOTTOM-UP APPROACH (Characteristics
& Benefits of Data Mart)
Here departmental data marts are built one by
one.
A priority scheme is required to determine which
data marts must be built first.
Most severe drawback of this approach is data
fragmentation.
Each independent data mart will be blind to the overall requirements of the entire organization.
ADVANTAGES

1) Faster and easier implementation of manageable pieces

2) Favourable return on investment and proof of concept.

3) Less risk of failure.

4) Inherently incremental; the important data marts can be scheduled first.

5) Allows the project team to learn and grow.


DISADVANTAGES

1) Each data mart has its own narrow view of data.

2) Spreads redundant data across every data mart.

3) Risk of inconsistent and irreconcilable data.

4) Proliferates unmanageable interfaces.

| Data Warehouse | Data Marts |
| --- | --- |
| 1) Corporate/enterprise-wide | 1) Departmental |
| 2) Union of all data marts | 2) Single business process |
| 3) Data received from the staging area | 3) Star join |
| 4) Structure for a corporate view of data (strategic decision making) | 4) Structure for a departmental view of data (tactical decision making) |
| 5) Data comes from many sources | 5) Data comes from a few sources |
Extraction, Transformation & Loading (ETL)
ETL functions reshape the relevant data from
the source systems into useful information to be
stored in the DW.

ETL is a 3-step process.
1. Extraction
In this step of the ETL architecture, data is
extracted from the source data into the staging
area.
Transformations (if any) are done in the staging
area.
The staging area gives an opportunity to validate
extracted data before it moves into the DW.

Three data extraction methods
1. Full Extraction – Involves extracting all the data from the sources.

2. Partial Extraction (with update notification) – Easier and faster in comparison to full extraction. The source system notifies which records have changed, and only the modified data is extracted.

3. Partial Extraction (without update notification) – Involves extracting the data based on certain key features. E.g., if data has already been extracted up to yesterday, it is possible to extract only today's data and identify the changes in it (see the sketch below).
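As a rough sketch of the third method in Python, using an illustrative orders table with a last_modified timestamp column (the table, columns and values are made up for the example), only the rows changed since the previous run are pulled into the staging area:

```python
import sqlite3

# Hypothetical source table; in a real system this lives in the live OLTP database.
src = sqlite3.connect(":memory:")
src.executescript("""
    CREATE TABLE orders (order_id INTEGER, amount REAL, last_modified TEXT);
    INSERT INTO orders VALUES (1, 120.0, '2024-01-01 09:00:00');
    INSERT INTO orders VALUES (2,  75.5, '2024-03-15 14:30:00');
""")

def extract_changed_rows(conn, last_run_ts):
    # Partial extraction without update notification: pull only the rows whose
    # timestamp is later than the previous run's high-water mark.
    cur = conn.execute(
        "SELECT order_id, amount, last_modified FROM orders "
        "WHERE last_modified > ?", (last_run_ts,))
    return cur.fetchall()

# Only order 2 has changed since the last extraction run.
print(extract_changed_rows(src, "2024-02-01 00:00:00"))
```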
Regardless of the method used, extraction
should not affect the performance and response
time of the source systems.
These source systems are live production
databases; any slowdown could affect the
organization.

Some validations done during Extraction:
1. Reconcile records with source data.
2. Ensure no unwanted data is loaded.
3. Data type check.
4. Remove duplicated data.
5. Check whether keys are in place.

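A small illustrative sketch of a few of these validations in Python/pandas (the order_id and amount columns are hypothetical; the numbered comments refer to the list above):

```python
import pandas as pd

# Hypothetical extracted records; order_id is the key field.
extracted = pd.DataFrame({
    "order_id": [101, 102, 102, 103],
    "amount":   ["50.0", "75.0", "75.0", "60.0"],
})

# 4. Remove duplicated data pulled from the source.
extracted = extracted.drop_duplicates()

# 5. Check whether keys are in place (no missing order_id).
assert extracted["order_id"].notna().all()

# 3. Data type check: the measure should be numeric before it moves on.
extracted["amount"] = pd.to_numeric(extracted["amount"])
print(extracted)
```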
2. Transformation
Data that does not require any transformation is called a DIRECT MOVE.
Data extracted from the source is raw and generally not usable in its original form.
This is an important ETL step, where important functions are applied on the extracted data.
These operations can be customised as per the users' needs:
2. Transformation (contd.)
E.g., if the user wants a sum-of-sales figure that is not in the original databases, it can be derived here.
If the first name, middle name and last name are in different fields, it is possible to concatenate them before loading.
Problems that arise during transformation
1. Different spellings of the name of the same
person (Jon, John).
2. Multiple ways in which we denote a company
name (Google, Google Inc.).
3. Different names used for the same place (Cleaveland, Cleveland).
4. Different account numbers generated for the
same person.
5. Required data field left blank.
6. Invalid manual entry.
Validations done during this stage
1. Filtering – select only certain columns to load.
2. Using rules for data standardization.
3. Character set conversion.
4. Conversion of units of measurement.
5. Data threshold validation check (e.g., age cannot be more than 2 digits).
6. Required columns are not left blank.
7. Cleaning (e.g., map NULL to 0, MALE to M, FEMALE to F).
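A minimal transformation sketch in Python/pandas, with made-up column names (first_name, last_name, gender, sales, age), showing a few of the validations listed above:

```python
import pandas as pd

# Hypothetical records sitting in the staging area after extraction.
staging = pd.DataFrame({
    "first_name": ["Jon", "Mary"],
    "last_name":  ["Smith", "Jones"],
    "gender":     ["MALE", "FEMALE"],
    "sales":      [1200.0, None],
    "age":        [34, 41],
})

# Cleaning: map NULL to 0 and standardize coded values (MALE -> M, FEMALE -> F).
staging["sales"] = staging["sales"].fillna(0)
staging["gender"] = staging["gender"].map({"MALE": "M", "FEMALE": "F"})

# Derived field: concatenate the name parts before loading.
staging["full_name"] = staging["first_name"] + " " + staging["last_name"]

# Threshold validation: age should not be more than 2 digits.
assert (staging["age"] < 100).all()

# Filtering: select only the columns to be loaded into the DW.
to_load = staging[["full_name", "gender", "sales"]]
print(to_load)
```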
3. Loading
In a DW, huge volumes of data need to be
loaded in a relatively short period.
In case of a load failure, recovery mechanisms
should be configured to restart from the point of
failure.
DW administrators need to monitor, resume, or
cancel loads as per prevailing server
performance.

Types of Loading

Initial Load
Incremental Load
Full Refresh
1. Initial Load
Populating all the DW tables.

2. Incremental Load
Applying ongoing changes as and when
required.

3. Full Refresh
Erasing all the contents of one or more tables and reloading with fresh data.
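A rough Python/sqlite3 sketch of the three load types against a simplified, hypothetical sales_fact table (not a prescribed design):

```python
import sqlite3

dw = sqlite3.connect(":memory:")   # stands in for the warehouse database
dw.execute("CREATE TABLE sales_fact (sale_id INTEGER PRIMARY KEY, amount REAL)")

def initial_load(rows):
    # Initial load: populate the empty warehouse table for the first time.
    dw.executemany("INSERT INTO sales_fact VALUES (?, ?)", rows)

def incremental_load(rows):
    # Incremental load: apply ongoing changes; INSERT OR REPLACE keeps reruns idempotent.
    dw.executemany("INSERT OR REPLACE INTO sales_fact VALUES (?, ?)", rows)

def full_refresh(rows):
    # Full refresh: erase the table contents and reload with fresh data.
    dw.execute("DELETE FROM sales_fact")
    dw.executemany("INSERT INTO sales_fact VALUES (?, ?)", rows)

initial_load([(1, 100.0), (2, 250.0)])
incremental_load([(3, 80.0)])
print(dw.execute("SELECT COUNT(*) FROM sales_fact").fetchone())  # (3,)
```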
Load Verification
1. Ensure that the key field data is neither missing nor NULL.
2. Test modelling views based on the target tables.
3. Check combined values and calculated measures.
4. Data checks in dimension tables and history tables.
5. Check the BI reports generated.
OLTP (Online Transaction Processing Systems)
Definition
Operational systems are OLTP systems.
These are systems that are used to run the
day-to-day core business of the company.
They support the basic business processes of
the company.
These systems typically get data IN the system.

Decision Support Systems (DSS)
On the other hand, specially designed DSS are
not meant to run the core business processes.
They are used to watch how the business runs,
and then to make strategic decisions to improve
the business.
DSS are developed to get strategic information
OUT of the database.

GET THE DATA IN (Operational Systems):
Take an order
Process a claim
Make a shipment
Generate an invoice
Receive cash
Reserve an airline seat

GET THE INFORMATION OUT (DSS):
Show me the top selling products
Show me the problem regions
Tell me why (Drill Down)
Let me see other data (Drill Across)
Show me the highest margins
Alert me when sales in a district go below a certain level
|  | Operational | Informational |
| --- | --- | --- |
| Data Content | Current values | Archived, derived, summarized |
| Data Structure | Optimized for transactions | Optimized for complex queries |
| Access Frequency | High | Medium to low |
| Access Type | Read, update, delete | Read |
| Usage | Predictable, repetitive | Ad hoc, random, heuristic |
| Response Time | Sub-seconds | Several seconds to minutes |
| Users | Large number | Relatively smaller number of users |
OLAP (Online Analytical Processing)
OLAP is a category of software technology
that enables analysts, managers and executives
to gain insight into data
Through fast, consistent and interactive access
With a wide variety of possible views of
information
(that has been transformed from raw data)
To reflect the real dimensionality of the
enterprise.

1. Lets users have a multidimensional and logical view of the data in the DW.
2. Facilitates interactive queries and complex analysis by the users.
3. Allows users to drill down for greater detail or roll up for aggregations along a single business dimension or across multiple dimensions.
4. Provides the ability to perform intricate calculations and comparisons.
5. Presents results in meaningful ways, including charts and graphs.
Data Cube

Definition
A data cube in a DW is a multidimensional
structure used to store data.
Data cubes represent the data in terms of
dimensions and facts.
In a DW, we can implement an n-dimensional
data cube.

Dimensions are the attributes with respect to
which an organization wants to keep records.
For example, AllElectronics may create a sales
data warehouse in order to keep records with
respect to the dimensions time, item, branch,
and location.
These dimensions allow the store to keep track
of things like monthly sales of items, and the
branches and locations at which the items were
sold.
Each dimension may have a table associated with it, called a dimension table.
A multidimensional data model is typically organized
around a central theme, like sales, for instance.
This theme is represented by a fact table.
Facts are numerical measures.
Facts are the quantities used to analyze relationships
between dimensions.
Examples of facts for a sales data warehouse include
sales amount in dollars, number of units sold and
amount budgeted.
The fact table contains the names of the facts, or measures, as well as keys to each of the related dimension tables.
2-D View of Sales Data
2-D view of sales details for the city Vancouver
with respect to the dimensions time and item.
Key:
Home Entertainment – HE
Computers – C
Phone – P
Security - S

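As an illustrative sketch (Python/pandas, with invented figures), a small fact table can be pivoted into a 2-D time-by-item view like the one just described:

```python
import pandas as pd

# Hypothetical fact records: dimension values plus a numeric measure.
fact = pd.DataFrame({
    "time":       ["Q1", "Q1", "Q2", "Q2"],
    "item":       ["HE", "C",  "HE", "C"],
    "location":   ["V"] * 4,          # Vancouver only, as in the 2-D view
    "units_sold": [605, 825, 680, 952],
})

# 2-D view for one city: time down the rows, item across the columns.
view_2d = fact.pivot_table(index="time", columns="item",
                           values="units_sold", aggfunc="sum")
print(view_2d)
```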
3-D View of Sales Data
Suppose we would like to view the data with a
3rd dimension – cities.
The 3D view is as shown below:
Key:
Chicago – Ch
New York – NY
Toronto – T
Vancouver - V
3-D View of Sales Data
Conceptually the same data may be represented
in the form of 3D data cubes.

4-D Cuboid
Suppose we now want to view the sales data with an additional fourth dimension, such as supplier.
In the example given below, the 4D cuboid is a
representation of sales data according to the dimensions
– time, item, location, supplier.

The topmost 0-D cuboid, which holds the highest level of summarization, is known as the APEX CUBOID.
Here, the total sales figure is summarized over all 4 dimensions.

Operations on Data Cubes
Operations: Roll Up, Drill Down, Slice & Dice, Pivot/Rotation
Roll Up

Drill Down

▪ When the drill down operation is performed on any
dimension, the data (on that dimension) is
fragmented into granular form.
▪ Figure above shows the drill down operation on the
time dimension where each quarter is fragmented
into months.

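A rough sketch of roll-up and drill-down on the time dimension in Python/pandas, assuming a hypothetical month-to-quarter concept hierarchy and invented figures:

```python
import pandas as pd

sales = pd.DataFrame({
    "quarter": ["Q1",  "Q1",  "Q1",  "Q2",  "Q2",  "Q2"],
    "month":   ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "units":   [150, 130, 170, 160, 140, 155],
})

# Roll up: aggregate the finer month level up to quarters.
by_quarter = sales.groupby("quarter")["units"].sum()

# Drill down: fragment each quarter back into its months.
by_month = sales.groupby(["quarter", "month"])["units"].sum()

print(by_quarter)
print(by_month)
```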
Slice

▪ The Slice operation picks up one dimension of the
data cube and forms a subcube out of it.
▪ In the figure above, the slice operation has been
performed on the data cube on the basis of the time
dimension.

Dice

▪ The Dice operation selects more than one
dimension to form a subcube.
▪ The figure above shows the subcube formed by
selecting the dimensions – location, item and time.

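A minimal slice-and-dice sketch in Python/pandas on an invented cube of time, item and location:

```python
import pandas as pd

cube = pd.DataFrame({
    "time":     ["Q1", "Q1", "Q2", "Q2"],
    "item":     ["HE", "P",  "HE", "P"],
    "location": ["Toronto", "Vancouver", "Toronto", "Vancouver"],
    "units":    [605, 825, 680, 952],
})

# Slice: fix a single dimension (time = Q1) to obtain a subcube.
slice_q1 = cube[cube["time"] == "Q1"]

# Dice: select on several dimensions at once.
diced = cube[(cube["time"].isin(["Q1", "Q2"])) &
             (cube["item"] == "HE") &
             (cube["location"] == "Toronto")]

print(slice_q1)
print(diced)
```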
Pivot/Rotation

▪ The Pivot operation rotates the data cube in order
to view it from a different dimension.

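A small pivot/rotation sketch in Python/pandas (invented data): the row and column dimensions of a 2-D view are simply swapped:

```python
import pandas as pd

cube = pd.DataFrame({
    "item":     ["HE", "HE", "P", "P"],
    "location": ["Toronto", "Vancouver", "Toronto", "Vancouver"],
    "units":    [605, 825, 680, 952],
})

# Original orientation: item on the rows, location on the columns.
view = cube.pivot_table(index="item", columns="location", values="units")

# Pivot/rotate: transpose to view the same data from the other dimension.
rotated = view.T
print(rotated)
```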
Summary
A data cube is a multidimensional data structure model for storing data in the data warehouse.
A data cube can be 2-D, 3-D or n-dimensional in structure.
Data cubes represent data in terms of dimensions and facts.
A dimension in a data cube represents attributes in the data set.
Each cell of a data cube holds aggregated data.
Data cubes provide fast computation and easy access to data, and thereby increase the efficiency of the DW.
Data cubes use indexing to access their dimensions.
Summary
Data cubes can be categorized into two main types: the multidimensional data cube and the relational data cube.
Multidimensional data cube has fast computation.
The relational data cube is scalable and is efficient for
growing data.
Roll-up, drill-down, slice and dice, pivoting are the
operations that can be performed on a data cube.

MOLAP & ROLAP
MOLAP (Multidimensional OLAP)
Multidimensional arrays are used to store data, which ensures a multidimensional view of the data.
Multidimensional data cube helps in storing a
large amount of data.
It implements indexing to represent each
dimension of a data cube.
This improves the accessing, retrieving and
storing of data in a data cube.

ROLAP (Relational OLAP)
The relational data cube is an extended version of
the relational DBMS (RDBMS).
Relational tables are used to store data and each
relational table represents a dimension of the data
cube.
When it comes to performance, the relational data
cube is slower than the multidimensional data
cube.
But the relational data cube is scalable for steadily increasing data.
HOLAP (Hybrid OLAP)
It is possible to get a combination of both the relational data cube and the multidimensional data cube, which is called the hybrid data cube (HOLAP).
This has the scalability of the relational data
cube and the performance of the
multidimensional data cube.

OLAP vs. OLTP
|  | OLTP | OLAP |
| --- | --- | --- |
| Functionality | Manages transactions that modify data in the databases | Used for analytical and reporting purposes |
| Source | Real-time transactions of the organization | Data consolidated from various OLTP databases |
| Storage Format | Tabular form in an RDBMS | Multidimensional form in OLAP cubes |
| Operation | Read and write | Read only |
| Response Time | Fast processing, since queries are simple | Slower than OLTP |
| Users | Programmers, professionals | Executives |
SCHEMAS
Schemas in a DW
Like a database, a DW also requires schemas.
The three types of schemas are:
1. Star Schema
2. Snowflake Schema
3. Fact Constellation Schema (Galaxy Schema)

Star Schema
Each dimension in a star schema is represented by only a single dimension table.
This dimension table contains the set of attributes for that dimension.
The following diagram shows the sales data of a company with respect to 4 dimensions: time, item, branch and location.
There is a fact table at the centre containing keys to each of the 4 dimensions.
Further, it contains the attributes dollars_sold and units_sold.
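A minimal star-schema sketch using Python's sqlite3 module; the dimension and fact tables follow the example above, but the key columns and descriptive attributes are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one denormalized table per dimension.
    CREATE TABLE time_dim     (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
    CREATE TABLE item_dim     (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
    CREATE TABLE branch_dim   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
    CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

    -- Fact table at the centre: keys to every dimension plus the measures.
    CREATE TABLE sales_fact (
        time_key     INTEGER REFERENCES time_dim(time_key),
        item_key     INTEGER REFERENCES item_dim(item_key),
        branch_key   INTEGER REFERENCES branch_dim(branch_key),
        location_key INTEGER REFERENCES location_dim(location_key),
        dollars_sold REAL,
        units_sold   INTEGER
    );
""")
print("star schema created")
```

In a snowflake version of the same sketch, item_dim would be normalized further, e.g. by moving supplier details into a separate SUPPLIER table referenced from item_dim.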
Snowflake Schema
Unlike the star schema, dimension tables in the snowflake schema are normalized.
The normalization splits the data into additional tables.
E.g., the ITEM dimension is normalized and split into 2 dimension tables: ITEM and SUPPLIER.
The LOCATION dimension is split into LOCATION and CITY tables.
Due to normalization, redundancy is reduced and the data becomes easier to maintain and store.
Fact Constellation Schema (Galaxy Schema)
A fact constellation schema has multiple fact
tables.
It is also possible to share dimension tables
between fact tables.
Following example shows two fact tables – sales
and shipping.
Here, TIME and ITEM are shared between the fact
tables.

Role of Concept Hierarchies in defining
dimensions of a data cube
A concept hierarchy is a set of variables which represent different levels of aggregation of the same dimension and are linked by a mapping.
For example:
City -> State -> Region -> Country

In a multidimensional database, different cubes are stored, each of which is defined with different dimensions.
The roll-up operator decreases the detail of the measure, aggregating it along the concept hierarchy.
In the example, it allows the level to be changed from City to State, recomputing the values of the measure.
The drill-down operator increases the detail of the measure by moving to the lower level of the concept hierarchy.
In the example, one can move from State to City, retrieving the values of the measure that were previously stored in the cube.
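A small sketch of rolling up along such a concept hierarchy in Python/pandas, with an invented City -> State mapping:

```python
import pandas as pd

# Hypothetical concept hierarchy mapping for the location dimension.
city_to_state = {"San Jose": "California", "Los Angeles": "California",
                 "Austin": "Texas", "Dallas": "Texas"}

sales = pd.DataFrame({
    "city":  ["San Jose", "Los Angeles", "Austin", "Dallas"],
    "units": [120, 300, 90, 210],
})

# Roll up from City to State along the concept hierarchy,
# recomputing the measure at the coarser level.
sales["state"] = sales["city"].map(city_to_state)
by_state = sales.groupby("state")["units"].sum()
print(by_state)
```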
