OLAP2

Data warehouses use a multi-tiered architecture with operational databases at the bottom extracting and transforming data that is loaded into the data warehouse. The data warehouse uses an OLAP server and front-end analysis tools. OLAP server architectures can be relational, multidimensional, or hybrid. Data warehousing provides a foundation for data mining by integrating data into a multi-dimensional structure optimized for analysis.

Uploaded by

Srinivasa Rao T

Data Warehouses and OLAP

*Slides by Nikos Mamoulis

Multi-Tiered Architecture

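[Figure: the multi-tiered architecture, in four tiers from left to right.]
Data Sources: operational DBs and other sources, which are extracted, transformed, loaded, and refreshed into the warehouse
Data Storage: data warehouse and data marts, with metadata and a monitor & integrator
OLAP Engine: OLAP server
Front-End Tools: analysis, query, reports, data mining
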

OLAP Server Architectures
 Relational OLAP (ROLAP)
 Use relational or extended-relational DBMS to store and manage
warehouse data and OLAP middleware to support missing pieces
 Include optimization of DBMS backend, implementation of
aggregation navigation logic, and additional tools and services
 greater scalability
 Multidimensional OLAP (MOLAP)
 Array-based multidimensional storage engine (sparse matrix
techniques)
 fast indexing to pre-computed summarized data
 Hybrid OLAP (HOLAP)
 User flexibility, e.g., low level: relational, high-level: array
 Specialized SQL servers
 specialized support for SQL queries over star/snowflake schemas
Data Warehousing and OLAP Technology for Data Mining
 What is a data warehouse?

 A multi-dimensional data model

 Data warehouse architecture

 Data warehouse implementation

 Further development of data cube technology

 From data warehousing to data mining

Efficient Data Cube Computation
 Data cube can be viewed as a lattice of cuboids
 The bottom-most cuboid is the base cuboid
 The top-most cuboid (apex) contains only one cell
 How many cuboids are there in an n-dimensional cube where dimension i has L_i levels?
$T = \prod_{i=1}^{n} (L_i + 1)$
 Materialization of data cube
 Materialize every (cuboid) (full materialization), none (no
materialization), or some (partial materialization)
 Selection of which cuboids to materialize
 Based on size, sharing, access frequency, etc.
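As a quick check of the cuboid-count formula above, the following Python sketch multiplies (L_i + 1) over the dimensions, where L_i is the number of hierarchy levels of dimension i and the extra 1 accounts for the virtual top level ALL. The function name `count_cuboids` and the sample level counts are illustrative assumptions, not part of the slides.

```python
from math import prod

def count_cuboids(levels_per_dimension):
    """Return T = product over all dimensions of (L_i + 1).

    L_i is the number of hierarchy levels of dimension i, excluding the
    virtual top level ALL, which the "+ 1" accounts for.
    """
    return prod(levels + 1 for levels in levels_per_dimension)

# Three dimensions without hierarchies (one level each): 2^3 cuboids.
print(count_cuboids([1, 1, 1]))
# Hypothetical dimensions with 4, 3, and 4 levels below ALL.
print(count_cuboids([4, 3, 4]))
```

With one level per dimension the formula reduces to 2^n, the number of group-bys produced by a cube over n attributes.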
Cube Operations
 Cube definition and computation in DMQL
define cube sales[item, city, year]: sum(sales_in_dollars)
compute cube sales
 Transform it into a SQL-like language (with a new operator
cube by, introduced by Gray et al.’96)
SELECT item, city, year, SUM (amount)
FROM SALES
CUBE BY item, city, year
 Need to compute the following group-bys
(city, item, year),
(city, item), (city, year), (item, year),
(city), (item), (year),
()
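The effect of the cube by operator can be sketched in a few lines of Python: for n grouping attributes it evaluates one group-by per subset of the attributes, 2^n in total. This is a naive illustration that rescans the rows for every group-by, not an efficient cubing algorithm; the function name `cube` and the sample rows are made up for the example.

```python
from collections import defaultdict
from itertools import combinations

def cube(rows, dims, measure):
    """Compute SUM(measure) for every subset of dims (2^n group-bys)."""
    result = {}
    for k in range(len(dims), -1, -1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            result[group] = dict(totals)
    return result

# Made-up sample rows, for illustration only.
sales = [
    {"item": "TV", "city": "Athens", "year": 2000, "amount": 10},
    {"item": "TV", "city": "Paris", "year": 2000, "amount": 20},
    {"item": "PC", "city": "Athens", "year": 2001, "amount": 5},
]
result = cube(sales, ("item", "city", "year"), "amount")
print(len(result))         # number of group-bys
print(result[()])          # apex cuboid: the grand total
print(result[("item",)])   # totals per item
```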
Cube Computation: ROLAP-Based Method
 Efficient cube computation methods
 ROLAP-based cubing algorithms (Agarwal et al’96)
 Array-based cubing algorithm (Zhao et al’97)
 Bottom-up computation method (Beyer & Ramakrishnan’99)
 ROLAP-based cubing algorithms
 Sorting, hashing, and grouping operations are applied to the
dimension attributes in order to reorder and cluster related
tuples
 Grouping is performed on some sub-aggregates as a “partial
grouping step”
 Aggregates may be computed from previously computed
aggregates, rather than from the base fact table
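The last point, computing aggregates from previously computed aggregates, can be illustrated with a small Python sketch. It derives coarser cuboids from an (item, city) cuboid without touching the base fact table. This works for distributive aggregates such as SUM; the helper `roll_up` and the sample values are assumptions made for the example.

```python
from collections import defaultdict

def roll_up(cuboid, keep):
    """Aggregate a computed cuboid to a coarser one by summing out dimensions.

    cuboid maps tuples of dimension values to a SUM; keep lists the
    positions of the dimensions to retain.
    """
    coarser = defaultdict(int)
    for key, total in cuboid.items():
        coarser[tuple(key[i] for i in keep)] += total
    return dict(coarser)

# Made-up (item, city) cuboid, assumed to have been computed already.
item_city = {("TV", "Athens"): 10, ("TV", "Paris"): 20, ("PC", "Athens"): 5}
by_item = roll_up(item_city, keep=[0])   # (item) from (item, city)
by_city = roll_up(item_city, keep=[1])   # (city) from (item, city)
apex = roll_up(by_item, keep=[])         # () from (item)
print(by_item, by_city, apex)
```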
Aggregation for Cube Computation
 Partition arrays into chunks (a small subcube which fits in memory).
 Compressed sparse array addressing: (chunk_id, offset)
 Compute aggregates in “multiway” by visiting cube cells in the order
which minimizes the # of times to visit each cell, and reduces memory
access and storage cost.

[Figure: a 3-D array with dimensions A (a0 to a3), B (b0 to b3), and C (c0 to c3), partitioned into 64 chunks numbered 1 to 64; the chunk number increases fastest along A, then along B, then along C.]
What is the best traversing order to do multi-way aggregation?
Aggregation for Cube Computation (Cont.)
 Method: the planes should be sorted and
computed according to their size in ascending
order.
 See the details of Example 4.4
 Idea: keep the smallest plane in the main memory,
fetch and compute only one chunk at a time for the
largest plane
 Limitation of the method: works well only for a
small number of dimensions
 If there are a large number of dimensions, “bottom-up
computation” and iceberg cube computation methods
can be explored
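The core idea of multiway aggregation, reading each chunk once and updating several group-bys at the same time, can be sketched as follows. This is a simplified illustration under stated assumptions: the array is dense, the chunk edge divides every dimension size, and the three result planes are kept whole in memory (the method described above keeps only the minimum number of chunks per plane). The function name `multiway_aggregate` is hypothetical.

```python
def multiway_aggregate(cube, chunk):
    """Compute the AB, AC, and BC sum planes in one pass over the chunks.

    cube is a dense 3-D list indexed cube[a][b][c]; chunk is the chunk edge
    length (assumed to divide every dimension size).
    """
    na, nb, nc = len(cube), len(cube[0]), len(cube[0][0])
    ab = [[0] * nb for _ in range(na)]
    ac = [[0] * nc for _ in range(na)]
    bc = [[0] * nc for _ in range(nb)]
    # Visit chunks with A varying fastest, then B, then C.
    for c0 in range(0, nc, chunk):
        for b0 in range(0, nb, chunk):
            for a0 in range(0, na, chunk):
                # Every cell of the chunk is read once and feeds all planes.
                for a in range(a0, a0 + chunk):
                    for b in range(b0, b0 + chunk):
                        for c in range(c0, c0 + chunk):
                            value = cube[a][b][c]
                            ab[a][b] += value
                            ac[a][c] += value
                            bc[b][c] += value
    return ab, ac, bc

# A 4x4x4 array with distinct cell values, split into 2x2x2 chunks.
cube = [[[a * 16 + b * 4 + c for c in range(4)] for b in range(4)]
        for a in range(4)]
ab, ac, bc = multiway_aggregate(cube, 2)
print(ab[0][0], ac[0][0], bc[0][0])
```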
Indexing OLAP Data: Bitmap Index
 Index on a particular column
 Each value in the column has a bit vector: bit-op is fast
 The length of the bit vector: # of records in the base table
 The i-th bit is set if the i-th row of the base table has the
value for the indexed column
 not suitable for high cardinality domains

Base table
Cust  Region   Type
C1    Asia     Retail
C2    Europe   Dealer
C3    Asia     Dealer
C4    America  Retail
C5    Europe   Dealer

Index on Region
RecID  Asia  Europe  America
1      1     0       0
2      0     1       0
3      1     0       0
4      0     0       1
5      0     1       0

Index on Type
RecID  Retail  Dealer
1      1       0
2      0       1
3      0       1
4      1       0
5      0       1
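The bitmap index of the example can be reproduced with Python integers used as bit vectors. This is a minimal sketch: the helper names `build_bitmap_index` and `matching_rows` are made up, and bit i (counting from the least significant bit) stands for row i of the base table.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of column to a bit vector stored as an int.

    Bit i, counted from the least significant bit, is set when row i holds
    the value.
    """
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index

def matching_rows(bitmap):
    """Return the positions of the rows whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

base_table = [
    {"cust": "C1", "region": "Asia", "type": "Retail"},
    {"cust": "C2", "region": "Europe", "type": "Dealer"},
    {"cust": "C3", "region": "Asia", "type": "Dealer"},
    {"cust": "C4", "region": "America", "type": "Retail"},
    {"cust": "C5", "region": "Europe", "type": "Dealer"},
]
region = build_bitmap_index(base_table, "region")
kind = build_bitmap_index(base_table, "type")
# Dealers in Asia: a single bitwise AND of two bit vectors.
hits = matching_rows(region["Asia"] & kind["Dealer"])
print([base_table[i]["cust"] for i in hits])
```

Combining conditions costs one bitwise operation per pair of bitmaps, which is why bit operations make this index fast for low-cardinality columns.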
Indexing OLAP Data: Join Indices
 Join index: JI(R-id, S-id) where R(R-id, …) ⋈ S(S-id, …)
 Traditional indices map the values to a list
of record ids
 It materializes relational join in JI file and
speeds up relational join — a rather costly
operation
 In data warehouses, a join index relates the
values of the dimensions of a star schema
to rows in the fact table.
 E.g. fact table: Sales and two dimensions
city and product
 A join index on city maintains for each
distinct city a list of R-IDs of the tuples
recording the Sales in the city
 Join indices can span multiple dimensions
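A join index on a dimension can be pictured as a map from each dimension value to the ids of the fact rows that reference it. The sketch below builds such a map in Python; the function name `build_join_index` and the Sales rows are invented for illustration.

```python
from collections import defaultdict

def build_join_index(fact_rows, dimension_key):
    """Map each dimension value to the ids of the fact rows that reference it."""
    index = defaultdict(list)
    for rid, row in enumerate(fact_rows):
        index[row[dimension_key]].append(rid)
    return dict(index)

# Made-up Sales fact rows with two dimensions, city and product.
sales = [
    {"city": "Athens", "product": "TV", "amount": 10},
    {"city": "Paris", "product": "TV", "amount": 20},
    {"city": "Athens", "product": "PC", "amount": 5},
]
city_index = build_join_index(sales, "city")
# Total sales in Athens, touching only the rows listed in the join index.
athens_total = sum(sales[rid]["amount"] for rid in city_index["Athens"])
print(city_index, athens_total)
```

A join index spanning several dimensions could use a tuple such as (city, product) as the key instead of a single value.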
Efficient Processing of OLAP Queries
 Determine which operations should be performed
on the available cuboids:
 transform drill, roll, etc. into corresponding SQL and/or
OLAP operations, e.g., dice = selection + projection
 Determine to which materialized cuboid(s) the
relevant operations should be applied.
 Exploring indexing structures and compressed vs.
dense array structures in MOLAP
Metadata Repository
 Metadata is the data defining warehouse objects. It comes in the
following kinds:
 Description of the structure of the warehouse
 schema, view, dimensions, hierarchies, derived data defn, data mart
locations and contents
 Operational meta-data
 data lineage (history of migrated data and transformation path),
currency of data (active, archived, or purged), monitoring information
(warehouse usage statistics, error reports, audit trails)
 The algorithms used for summarization
 The mapping from operational environment to the data
warehouse
 Data related to system performance
 warehouse schema, view and derived data definitions
 Business data
 business terms and definitions, ownership of data, charging policies
Data Warehouse Back-End Tools and Utilities
 Data extraction:
 get data from multiple, heterogeneous, and external
sources
 Data cleaning:
 detect errors in the data and rectify them when possible
 Data transformation:
 convert data from legacy or host format to warehouse
format
 Load:
 sort, summarize, consolidate, compute views, check
integrity, and build indices and partitions
 Refresh
 propagate the updates from the data sources to the
warehouse
Discovery-Driven Exploration of Data Cubes
 Hypothesis-driven: exploration by user, huge search space
 Discovery-driven (Sarawagi et al.’98)
 pre-compute measures indicating exceptions, guide user in the
data analysis, at all levels of aggregation
 Exception: significantly different from the value anticipated,
based on a statistical model
 Visual cues such as background color are used to reflect the
degree of exception of each cell
 Computation of exception indicators (model fitting and
computing SelfExp, InExp, and PathExp values) can be
overlapped with cube construction
Examples: Discovery-Driven Data Cubes
Data Warehouse Usage
 Three kinds of data warehouse applications
 Information processing
 supports querying, basic statistical analysis, and reporting
using crosstabs, tables, charts and graphs
 Analytical processing
 multidimensional analysis of data warehouse data
 supports basic OLAP operations, slice-dice, drilling, pivoting
 Data mining
 knowledge discovery from hidden patterns
 supports associations, constructing analytical models,
performing classification and prediction, and presenting the
mining results using visualization tools.
 Differences among the three tasks
From On-Line Analytical Processing to On-Line Analytical Mining (OLAM)
 Why online analytical mining?
 High quality of data in data warehouses
 DW contains integrated, consistent, cleaned data

 Available information processing structure surrounding data
warehouses
 ODBC, OLEDB, Web accessing, service facilities, reporting
and OLAP tools
 OLAP-based exploratory data analysis
 mining with drilling, dicing, pivoting, etc.
 On-line selection of data mining functions
 integration and swapping of multiple mining functions,
algorithms, and tasks.

 Architecture of OLAM
An OLAM Architecture
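[Figure: the OLAM architecture in four layers, from top to bottom.]
Layer 4, User Interface: the User GUI API accepts mining queries and returns mining results
Layer 3, OLAP/OLAM: the OLAM engine and the OLAP engine, on top of the Data Cube API
Layer 2, MDDB: the multidimensional database and its metadata, on top of the Database API (filtering & integration)
Layer 1, Data Repository: databases and the data warehouse, populated through data cleaning, data integration, and filtering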
Summary
 Data warehouse
 A subject-oriented, integrated, time-variant, and nonvolatile collection of
data in support of management’s decision-making process
 A multi-dimensional model of a data warehouse
 Star schema, snowflake schema, fact constellations
 A data cube consists of dimensions & measures
 OLAP operations: drilling, rolling, slicing, dicing and pivoting
 OLAP servers: ROLAP, MOLAP, HOLAP
 Efficient computation of data cubes
 Partial vs. full vs. no materialization
 Multiway array aggregation
 Bitmap index and join index implementations
 Further development of data cube technology
 Discovery-driven and multi-feature cubes
 From OLAP to OLAM (on-line analytical mining)
References (I)
 S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S.
Sarawagi. On the computation of multidimensional aggregates. In Proc. 1996 Int. Conf. Very
Large Data Bases, 506-521, Bombay, India, Sept. 1996.
 D. Agrawal, A. E. Abbadi, A. Singh, and T. Yurek. Efficient view maintenance in data
warehouses. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data, 417-427, Tucson,
Arizona, May 1997.
 R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high
dimensional data for data mining applications. In Proc. 1998 ACM-SIGMOD Int. Conf.
Management of Data, 94-105, Seattle, Washington, June 1998.
 R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. In Proc. 1997
Int. Conf. Data Engineering, 232-243, Birmingham, England, April 1997.
 K. Beyer and R. Ramakrishnan. Bottom-Up Computation of Sparse and Iceberg CUBEs. In Proc.
1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), 359-370, Philadelphia, PA,
June 1999.
 S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM
SIGMOD Record, 26:65-74, 1997.
 OLAP council. MDAPI specification version 2.0. In
http://www.olapcouncil.org/research/apily.htm, 1998.
 J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H.
Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab and
sub-totals. Data Mining and Knowledge Discovery, 1:29-54, 1997.
References (II)
 V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In
Proc. 1996 ACM-SIGMOD Int. Conf. Management of Data, pages 205-216, Montreal,
Canada, June 1996.
 Microsoft. OLEDB for OLAP programmer's reference version 1.0. In
http://www.microsoft.com/data/oledb/olap, 1998.
 K. Ross and D. Srivastava. Fast computation of sparse datacubes. In Proc. 1997 Int. Conf.
Very Large Data Bases, 116-125, Athens, Greece, Aug. 1997.
 K. A. Ross, D. Srivastava, and D. Chatziantoniou. Complex aggregation at multiple
granularities. In Proc. Int. Conf. of Extending Database Technology (EDBT'98), 263-277,
Valencia, Spain, March 1998.
 S. Sarawagi, R. Agrawal, and N. Megiddo. Discovery-driven exploration of OLAP data cubes.
In Proc. Int. Conf. of Extending Database Technology (EDBT'98), pages 168-182, Valencia,
Spain, March 1998.
 E. Thomsen. OLAP Solutions: Building Multidimensional Information Systems. John Wiley &
Sons, 1997.
 Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous
multidimensional aggregates. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data,
159-170, Tucson, Arizona, May 1997.
Selection of tables, attributes, domains in the DW design process
 If you are asked to design a data
warehouse for a set of operational
databases how would you do it?
 Use specifications of requirements to design the
data warehouse schema and select:
 The central theme(s) of the analysis (e.g., sales)
 The measures on the central themes (e.g.,
sum(dollars))
 The dimensions used by analytical processing
 The attributes and hierarchies of the dimensions
 Clean, transform, and integrate information
Example
 A large company which sells engine parts
 Database 1 (Los Angeles)
 employee(id, name, dept, lot, salary, age)
 department(id, name, type, manager_id)
 part(id, name, type, brand, manufacturer, color)
 customer(id, name, type, age, city, state, zip, tel)
 sales(id, part_id, customer_id, quantity, price)
 Database 2 (New York)
 employee(id, ename, dept_id, salary, age)
 department(id, name, type, manager)
 part(id, title, type, brand, manufacturer, color)
 customer(id, name, type, zip, tel)
 location(zip, city_id)
 city(city_id, state, country)
 sales(id, part_id, customer_id, quantity, price)
Example: requirements and selection
 Requirements of the data warehouse
 we want to analyze the total sales in dollars
and the average price of sold units with respect
to time, part, customer.
 Selection of the basic features of the
warehouse
 central theme(s): sales
 measure(s): sum(sales_in_dollars),
avg(price_sold_units)
 dimensions: time, part, customer
Example: selection of hierarchies
 To determine the dimension hierarchies, we
have to select which dimensional attributes
must be included in the analysis
 We go back to the requirements and ask
the analyst:
 time: day, week, month, quarter, year
 part: name, type, color, brand, manufacturer
 customer: name, type, city, state, country
Exercise
 Find the hierarchies for time, part,
customer

•time: day, week, month, quarter, year

•part: name, type, color, brand,
manufacturer
•customer: name, type, city, state, country
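Example: selection of hierarchies
 Definition of the hierarchies:
[Figure: one hierarchy diagram per dimension, listed here level by level from the most general (top) to the most detailed (bottom).]
time: year; quarter; month, week; day
part: manufacturer; type, color, brand; name
customer: country; state; type, city; name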
Example: is the information that we need to analyze available?
 We have to check if the required
information for analysis exists in the
databases to be integrated
 All requested attributes exist, except for
time. This can be determined by accessing the
transaction logs of the databases.
Example: Design the DW schema
 We use the star schema in this example
Fact table: time_id, part_id, cust_id, quantity, price
Time: time_id, day, week, month, quarter, year
Part: part_id, name, type, color, brand, manufacturer
Customer: cust_id, name, type, city, state, country
We need just quantity and (unit) price to derive the measures:
sum(sales_in_dollars) = Σ(quantity*price)
avg(price_sold_units) = Σ(quantity*price) / Σ(quantity)
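The two measures can be derived from the quantity and price columns of the fact table as in the following Python sketch. The function name `sales_measures` and the sample rows are assumptions for illustration.

```python
def sales_measures(fact_rows):
    """Return (total sales in dollars, average price per sold unit)."""
    dollars = sum(row["quantity"] * row["price"] for row in fact_rows)
    units = sum(row["quantity"] for row in fact_rows)
    # The average is weighted by quantity; it is undefined with no units sold.
    return dollars, (dollars / units if units else None)

# Made-up fact rows; only quantity and unit price are needed.
facts = [
    {"quantity": 2, "price": 10.0},
    {"quantity": 3, "price": 20.0},
]
total, avg_price = sales_measures(facts)
print(total, avg_price)
```

Because the average is a ratio of two sums, a cuboid that stores both Σ(quantity*price) and Σ(quantity) can be rolled up further, whereas stored averages alone cannot be combined correctly.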
Integration tasks
convert:
•attribute names
•attribute types
join tables
derive data not stored explicitly in the databases
fill in missing values
ignore irrelevant data:
•tables
•attributes
These tasks apply to the source databases of the example:
 A large company which sells engine parts
 Database 1 (Los Angeles)
 employee(id, name, dept, lot, salary, age)
 department(id, name, type, manager_id)
 part(id, name, type, brand, manufacturer, color)
 customer(id, name, type, age, city, state, zip, tel)
 sales(id, part_id, customer_id, quantity, price)
 Database 2 (New York)
 employee(id, ename, dept_id, salary, age)
 department(id, name, type, manager)
 part(id, title, type, brand, manufacturer, color)
 customer(id, name, type, zip, tel)
 location(zip, city_id)
 city(city_id, state, country)
 sales(id, part_id, customer_id, quantity, price)
Example: how to populate the DW?
Load, Clean, Integrate
 Convert attribute names and types
 E.g., part.name = part.title
 Convert values to be consistent
 Join tables if necessary
 Join customer,location,city from “New York” DB
 Derive time if not present
 Check transaction log for sales table to get the time and convert it to
the required format
 Complete missing values
 The “Los Angeles” database does not record customer country
information because all its customers are from the US. In the integrated
data from LA, the country value is set to “USA”
 Ignore irrelevant tables and attributes
 Tables employee, department are ignored
 Attributes zip, tel, id are ignored.
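The integration steps listed above can be sketched in Python for the customer and part dimensions. This is only an illustration under assumptions: rows are plain dictionaries, the function names are hypothetical, and because the New York schema lists no city name, `city_id` stands in for the city value.

```python
def integrate_la_customer(c):
    """LA rows already carry city and state; country is filled in as USA."""
    return {"name": c["name"], "type": c["type"], "city": c["city"],
            "state": c["state"], "country": "USA"}

def integrate_ny_customer(c, location, city):
    """Join customer -> location -> city to recover the geographic attributes.

    location maps zip -> city_id; city maps city_id -> (state, country).
    """
    city_id = location[c["zip"]]
    state, country = city[city_id]
    return {"name": c["name"], "type": c["type"], "city": city_id,
            "state": state, "country": country}

def integrate_ny_part(p):
    """Rename title to name so that both sources share one Part schema."""
    return {"name": p["title"], "type": p["type"], "color": p["color"],
            "brand": p["brand"], "manufacturer": p["manufacturer"]}

# Made-up source rows; zip, tel, age, and id are ignored by the functions.
la = integrate_la_customer({"id": 1, "name": "Ann", "type": "retail",
                            "age": 40, "city": "LA", "state": "CA",
                            "zip": "90001", "tel": "555-0100"})
ny = integrate_ny_customer({"id": 7, "name": "Bob", "type": "dealer",
                            "zip": "10001", "tel": "555-0101"},
                           location={"10001": 12},
                           city={12: ("NY", "USA")})
part = integrate_ny_part({"id": 3, "title": "piston", "type": "engine",
                          "brand": "B1", "manufacturer": "M1", "color": "grey"})
print(la, ny, part)
```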
How many cuboids ?
 How do we compute the total number of
cuboids of a data cube?
 Compute the product of the number of
levels for each dimension
 Number of cuboids =
(levels for time)*
(levels for part)*
(levels for customer) =
5*4*5 = 100
What is the data cube?
 The data cube is NOT a cube
 Multiple dimensions, variable range and interpretations
of the cells; it does not always look like a “cube”
 The data cube is the set of all non-redundant,
multidimensional views from which we can
analyze the measures on the central theme(s)
 Remember: the terms multidimensional view and
cuboid have the same meaning
 A multidimensional view is non-redundant if
there are no hierarchical relationships between its
dimensional attributes
Redundancy in views
 A non-redundant combination of attributes:
 (month, part_type)
[Figure: a month × part_type table of values in which almost every cell is non-zero.]
 A redundant (not useful) combination of attributes:
 (part_brand, part_manufacturer)
[Figure: a part_brand × manufacturer table in which each brand column has a non-zero value for only one manufacturer:]
23 12  0  0  0  0
 0  0 43 33 13  0
 0  0  0  0  0 25
Exercise
 Find the non-redundant combinations of
the attributes of part.
[Figure: the part hierarchy, from top to bottom: manufacturer; type, color, brand; name.]
Visualization of non-redundant attribute combinations for part
[Figure: lattice of the 13 non-redundant attribute combinations for part, from ALL down to name.]
ALL
manufacturer; type; color; brand
(type,color); (type,manufacturer); (color,manufacturer); (color,brand); (type,brand)
(type,color,manufacturer); (type,color,brand)
name
How many (non-redundant) cuboids?
 How do we compute the total number of cuboids of a data
cube?
 Compute the product of the number of combinations for
each dimension
 Number of cuboids =
(combinations for time)*
(combinations for part)*
(combinations for customer) =
6*13*9 = 702
 Note: sometimes we use the term cuboid to also denote
multidimensional views with selections:
 total sales for each (part.type,customer.city) for year = “2001”
 We do not count such cuboids in the computation above
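The counts 13 and 9 used above can be checked by enumerating attribute subsets and discarding those that contain two hierarchically related attributes. The sketch below assumes the relationships suggested by the hierarchy slides (name lies below every other attribute; brand below manufacturer; city below state below country); the count of 6 for time is taken from the slide rather than recomputed. The function name is hypothetical.

```python
from itertools import combinations

def non_redundant_combinations(attributes, related_pairs):
    """List the attribute subsets with no two hierarchically related members.

    related_pairs holds (lower, higher) pairs, transitive ones included.
    The empty subset stands for ALL.
    """
    related = {frozenset(pair) for pair in related_pairs}
    result = []
    for k in range(len(attributes) + 1):
        for combo in combinations(attributes, k):
            if not any(frozenset(pair) in related
                       for pair in combinations(combo, 2)):
                result.append(combo)
    return result

part_attrs = ["name", "type", "color", "brand", "manufacturer"]
part_related = ([("name", a) for a in part_attrs[1:]]
                + [("brand", "manufacturer")])
customer_attrs = ["name", "type", "city", "state", "country"]
customer_related = ([("name", a) for a in customer_attrs[1:]]
                    + [("city", "state"), ("city", "country"),
                       ("state", "country")])

n_part = len(non_redundant_combinations(part_attrs, part_related))
n_customer = len(non_redundant_combinations(customer_attrs, customer_related))
n_time = 6  # taken from the slide
print(n_part, n_customer, n_time * n_part * n_customer)
```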
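Cube: A Lattice of Cuboids
E.g., (location) is dependent on (time,item,location)
[Figure: lattice of cuboids for four dimensions, listed here level by level.]
0-D (apex) cuboid: all
1-D cuboids: time; item; location; supplier
2-D cuboids: time,item; time,location; time,supplier; item,location; item,supplier; location,supplier
3-D cuboids: time,item,location; time,item,supplier; time,location,supplier; item,location,supplier
4-D (base) cuboid: time,item,location,supplier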

How to answer queries from a set of materialized cuboids?
 In real-life examples it is not possible to materialize the
whole data cube
 Typically, a small subset of cuboids is materialized
 To answer a query we have to select the cuboid that results
in the minimum cost for the query
 A query typically consists of
 a set of group-by attributes
 a set of selection clauses
 E.g., compute the total sales per part.type, cust.city for
year=2002.
 The factors for selecting the best cuboid for a particular query
are:
 the size of the cuboid
 any indexes on the attributes of the “select” clause of the query
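The size-based part of this choice can be sketched as follows: among the materialized cuboids that contain every attribute the query needs (group-by and selection attributes alike), pick the smallest. The sketch ignores indexes, and the function name `pick_cuboid` and the sizes are hypothetical.

```python
def pick_cuboid(materialized, needed_attributes):
    """Pick the smallest materialized cuboid containing all needed attributes.

    materialized maps a frozenset of dimension names to an estimated size.
    Returns None when no materialized cuboid can answer the query.
    """
    needed = frozenset(needed_attributes)
    usable = [cuboid for cuboid in materialized if needed <= cuboid]
    return min(usable, key=materialized.get, default=None)

# Hypothetical sizes (in rows) of a few materialized cuboids.
materialized = {
    frozenset({"time", "item", "location", "supplier"}): 1_000_000,
    frozenset({"time", "item"}): 50_000,
    frozenset({"item", "supplier"}): 8_000,
}
print(sorted(pick_cuboid(materialized, ["item"])))
print(sorted(pick_cuboid(materialized, ["location"])))
```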
Cube: A Lattice of Cuboids
How would you compute the following queries?
Q1: <time,item>
Q2: <supplier>
Q3: <location>
Q4: <time,supplier>
Q5: <item>
[Figure: the lattice of cuboids over time, item, location, and supplier, from the apex cuboid "all" down to the base cuboid, with the materialized cuboids marked.]

Which cuboids should we materialize?
 In real-life examples it is not possible to materialize the
whole data cube
 We have to select the most beneficial cuboids to materialize
 This depends mainly on the size of the cuboids and their
usage by queries
 Thus to select we need information about (i) the size of
cuboids, (ii) the queries and their frequency
 The base cuboid almost always corresponds to the fact table,
which is already materialized. For example, if products and
customers have unique names, we can use:
 time_id to derive the day
 part_id to derive part.name
 cust_id to derive customer.name
Which cuboids should we materialize?
 Example
 candidate cuboids:
 (day,pname,cname) – 100GB (already materialized)
 (day,pname) – 60GB
 (day,cname) – 20GB
 (pname,cname) – 1GB
 (day) – 10GB
 (pname) – 200MB
 (cname) – 30MB
 (ALL) – 8 bytes
 queries (with equal probability)
 Q1: total sales per (pname,cname)
 Q2: total sales per (pname)
 Q3: total sales per (cname)
Exercise: Which views should we materialize if the available space is:
1. 10GB
2. 1GB
3. 100MB
Which cuboids should we materialize?
 Case 1: Available space = 10GB
 We can materialize all three views (pname,cname) ,
(pname), and (cname)
 The cost of Q1 is reading 1GB
 The cost of Q2 is reading 200MB
 The cost of Q3 is reading 30MB
 Average query cost = (1230MB)/3=410 MB/query
Which cuboids should we materialize?
 Case 2: Available space = 1GB
 We have two choices:
 materialize (pname,cname) using 1GB
 Q1 costs 1GB, Q2 costs 1GB, Q3 costs 1GB
 Average query cost = 1GB
 materialize (pname) and (cname) using 230MB
 Q1 costs 100GB, Q2 costs 200MB, Q3 costs 30MB
 Average query cost = (100,230MB)/3 ≈ 33.4GB
 First choice is better than the second!
Which cuboids should we materialize?
 Case 3: Available space = 100MB
 We can only materialize (cname)
 Q1 costs 100GB
 Q2 costs 100GB
 Q3 costs 30MB
 Average query cost = (200,030MB)/3 = 67GB
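The average costs of the three cases can be recomputed with a short Python sketch. It assumes, as the slides do, that a query costs as much as reading the smallest available cuboid that contains its attributes, and it takes 1GB as 1000MB. The names `SIZE_MB`, `QUERIES`, and `average_cost` are illustrative.

```python
BASE = frozenset({"day", "pname", "cname"})
PC = frozenset({"pname", "cname"})
P = frozenset({"pname"})
C = frozenset({"cname"})

# Sizes from the example, in MB, with 1GB taken as 1000MB.
SIZE_MB = {BASE: 100_000, PC: 1_000, P: 200, C: 30}
QUERIES = [PC, P, C]  # Q1, Q2, Q3, all equally likely

def average_cost(materialized):
    """Average MB read per query, using the smallest usable cuboid each time."""
    available = set(materialized) | {BASE}  # the base cuboid always exists
    costs = [min(SIZE_MB[c] for c in available if q <= c) for q in QUERIES]
    return sum(costs) / len(costs)

print(average_cost([PC, P, C]))  # case 1
print(average_cost([PC]))        # case 2, first choice
print(average_cost([P, C]))      # case 2, second choice
print(average_cost([C]))         # case 3
```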
Bitmap Indexes
 The bitmap index is used to index attributes with
small domains
 For each attribute value, a bitmap is defined to
indicate the rows of the table that contain this
value
 Bitmaps are useful especially when we want to
join some attribute values
 Example: find the total sales of red parts to
female customers
Bitmap Indexes - Example
Table (100 bytes per row, 1 billion rows):
date      pcolor  cname  cgender  sales
10/10/00  red     Smith  M        21
12/10/00  green   Jones  F        13
15/10/00  green   Kane   M        14
20/10/00  blue    Nike   M        23
22/11/00  red     Ellis  F        9
26/11/00  red     Jones  F        92
...       ...     ...    ...      ...

index for pcolor      index for gender
red  green  blue      M    F
1    0      0         1    0
0    1      0         0    1
0    1      0         1    0
0    0      1         1    0
1    0      0         0    1
1    0      0         0    1
...  ...    ...       ...  ...

Exercise:
1. what are the sizes of the table and indexes
2. what is the cost of the query: “find the total sales of red parts to female customers” if:
a) the table is used
b) the indexes are used
Bitmap Indexes - Example
 The size of the table is 100GB=100bytes*1billion
 The sizes of the indexes are:
 3bits*1billion = 3000Mbits = 375MB
 2bits*1billion = 2000Mbits = 250MB
 The cost of evaluating the query directly on the table is reading
100GB and for each of the 1 billion tuples perform a comparison
with “red” and “F” (expensive)
 The cost of evaluating the query using the indexes is:
 read bitmap for red, read bitmap for F and join them.
 This costs reading 1Gbits+1Gbits = 250MB
 for each join result accumulate the sales for the corresponding rid
 this retrieves (1/3)*(1/2) = 1/6 of the tuples (estimated) and accumulates them
 we will probably read the whole table (since we want to avoid random
accesses), but we will avoid any comparisons.
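The size arithmetic above can be verified with a few lines of Python. The sketch assumes uncompressed bitmaps with one bit per row and decimal units (1MB = 10^6 bytes, 1GB = 10^9 bytes); the names are illustrative.

```python
ROWS = 10**9       # 1 billion rows
ROW_BYTES = 100    # bytes per row

def bitmap_bytes(num_bitmaps, rows=ROWS):
    """Size in bytes of num_bitmaps uncompressed bitmaps, one bit per row."""
    return num_bitmaps * rows // 8

table_bytes = ROWS * ROW_BYTES
pcolor_index = bitmap_bytes(3)   # red, green, blue
gender_index = bitmap_bytes(2)   # M, F
query_read = bitmap_bytes(2)     # the bitmap for red plus the bitmap for F

print(table_bytes / 10**9, "GB table")
print(pcolor_index / 10**6, "MB pcolor index")
print(gender_index / 10**6, "MB gender index")
print(query_read / 10**6, "MB of bitmaps read by the query")
```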
