05 Cube Tech
Concepts and
Techniques
1
Data Cube Technology
Summary
2
Data Cube: A Lattice of Cuboids
0-D (apex) cuboid: all
1-D cuboids: time; item; location; supplier
2-D cuboids: time,item; time,location; time,supplier; item,location; item,supplier; location,supplier
3-D cuboids: time,item,location; time,item,supplier; time,location,supplier; item,location,supplier
4-D (base) cuboid: time,item,location,supplier
Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land)
2. (9/15, milk, Urbana, *)
3. (*, milk, Urbana, *)
4. (*, milk, Urbana, *)
5. (*, milk, Chicago, *)
6. (*, milk, *, *)
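The ancestor/descendant and parent/child relations among the cells above can be checked mechanically. The following is a minimal sketch, assuming a cell is represented as a tuple with "*" for an aggregated dimension; the function names `is_ancestor` and `is_parent` are illustrative and not from the slides.

```python
def is_ancestor(a, d):
    """True if cell a generalizes cell d (a is an ancestor of d)."""
    if a == d or len(a) != len(d):
        return False
    # Every instantiated value of a must also appear in d.
    return all(x == "*" or x == y for x, y in zip(a, d))

def is_parent(a, d):
    """A parent is an ancestor with exactly one more '*' than its child."""
    return is_ancestor(a, d) and a.count("*") == d.count("*") + 1

base = ("9/15", "milk", "Urbana", "Dairy_land")   # cell 1 (base cell)
c2 = ("9/15", "milk", "Urbana", "*")              # cell 2
c3 = ("*", "milk", "Urbana", "*")                 # cell 3
c5 = ("*", "milk", "Chicago", "*")                # cell 5
c6 = ("*", "milk", "*", "*")                      # cell 6
```

Under this representation, cell 2 is a parent of cell 1, cell 3 is an ancestor (but not a parent) of cell 1, and cell 6 is a parent of both cell 3 and cell 5.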
4
Cube Materialization:
Full Cube vs. Iceberg Cube
Full cube vs. iceberg cube
compute cube sales_iceberg as
select month, city, customer_group, count(*)
from salesInfo
cube by month, city, customer_group
having count(*) >= min_support
(the having clause is the iceberg condition)
How many aggregate cells if “having count >= 1”?
What about “having count >= 2”?
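To make the two questions concrete, here is a brute-force sketch that counts cells under different thresholds. The three `sales` rows are hypothetical, and the function `iceberg_cube` is an illustration rather than an efficient method: it simply lets every tuple contribute to all 2^n of its generalizations.

```python
from itertools import combinations
from collections import Counter

def iceberg_cube(rows, min_support):
    """Return {cell: count} for every cell whose count >= min_support."""
    n = len(rows[0])
    counts = Counter()
    for row in rows:
        # Each tuple contributes to 2^n cells: keep any subset of its values.
        for k in range(n + 1):
            for keep in combinations(range(n), k):
                cell = tuple(row[i] if i in keep else "*" for i in range(n))
                counts[cell] += 1
    return {cell: c for cell, c in counts.items() if c >= min_support}

# Hypothetical salesInfo rows: (month, city, customer_group)
sales = [
    ("Jan", "Chicago", "Edu"),
    ("Jan", "Chicago", "Bus"),
    ("Feb", "Urbana", "Edu"),
]
full = iceberg_cube(sales, 1)      # threshold 1 keeps every non-empty cell
iceberg = iceberg_cube(sales, 2)   # threshold 2 keeps only shared cells
```

On this toy input the threshold of 1 yields 18 cells (3 base cells plus 15 aggregate cells), while a threshold of 2 leaves only 5, which illustrates how quickly the iceberg condition shrinks the result.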
5
Iceberg Cube, Closed Cube & Cube
Shell
Is iceberg cube good enough?
2 base cells: {(a1, a2, a3 . . . , a100):10, (a1, a2, b3, . . . , b100):10}
How many cells will the iceberg cube have if having count(*)
>= 10? Hint: A huge but tricky number!
Closed cube:
Closed cell c: if there exists no cell d, s.t. d is a descendant of
c, and d has the same measure value as c.
Closed cube: a cube consisting of only closed cells
What is the closed cube of the above base cuboid? Hint: only
3 cells
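The two hints can be checked by brute force on a scaled-down version of the example. The sketch below assumes 6 dimensions instead of 100 (the constant `N` and all function names are illustrative); the two base cells share their first two values and differ everywhere else, as in the slide.

```python
from itertools import product

def all_cells(base_cells):
    """Map every cell generalizing some base cell to its total count."""
    counts = {}
    for values, cnt in base_cells.items():
        for mask in product([False, True], repeat=len(values)):
            cell = tuple("*" if m else v for v, m in zip(values, mask))
            counts[cell] = counts.get(cell, 0) + cnt
    return counts

def is_descendant(d, c):
    """d is a descendant of c if d specializes c in at least one dimension."""
    return d != c and all(x == "*" or x == y for x, y in zip(c, d))

def closed_cells(counts):
    """Keep cells that have no descendant carrying the same count."""
    return {c: n for c, n in counts.items()
            if not any(n == m and is_descendant(d, c) for d, m in counts.items())}

N = 6  # scaled down from 100 dimensions so brute force stays small
cell_a = ("a1", "a2") + tuple(f"a{i}" for i in range(3, N + 1))
cell_b = ("a1", "a2") + tuple(f"b{i}" for i in range(3, N + 1))
counts = all_cells({cell_a: 10, cell_b: 10})
closed = closed_cells(counts)
```

With N = 6 every one of the 2^7 - 4 = 124 cells passes count >= 10, yet only 3 are closed: the two base cells and (a1, a2, *, ..., *) with count 20. The same counting argument gives 2^101 - 4 cells (2^101 - 6 of them aggregate cells) for the 100-dimensional case.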
Cube Shell
Precompute only the cuboids involving a small # of
dimensions, e.g., 3
For (A1, A2, …, A10), how many combinations to compute?
More dimension combinations will need to be computed on the fly
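The count asked for above is a short sum of binomial coefficients. This sketch assumes the shell covers cuboids with 1 to 3 dimensions and leaves the apex cuboid out; the helper name `shell_cuboids` is illustrative.

```python
from math import comb

def shell_cuboids(num_dims, max_dims):
    """Number of cuboids with between 1 and max_dims dimensions."""
    return sum(comb(num_dims, k) for k in range(1, max_dims + 1))

shell = shell_cuboids(10, 3)   # C(10,1) + C(10,2) + C(10,3) = 10 + 45 + 120
full = 2 ** 10                 # all cuboids of the full cube, apex included
```

So a 3-dimensional shell over 10 dimensions needs 175 cuboids, compared with 1024 for the full cube.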
6
Roadmap for Efficient
Computation
General cube computation heuristics (Agarwal et al.’96)
Computing full/iceberg cubes: 3 methodologies
Bottom-Up: Multi-Way array aggregation (Zhao, Deshpande &
Naughton, SIGMOD’97)
Top-down:
BUC (Beyer & Ramakrishnan, SIGMOD’99)
H-cubing technique (Han, Pei, Dong & Wang: SIGMOD’01)
Integrating Top-Down and Bottom-Up:
Star-cubing algorithm (Xin, Han, Li & Wah: VLDB’03)
High-dimensional OLAP: A Minimal Cubing Approach (Li, et al.
VLDB’04)
Computing alternative kinds of cubes:
Partial cube, closed cube, approximate cube, etc.
7
General Heuristics (Agarwal et al.
VLDB’96)
Sorting, hashing, and grouping operations are applied to the
dimension attributes in order to reorder and cluster related
tuples
Aggregates may be computed from previously computed
aggregates, rather than from the base fact table
Smallest-child: computing a cuboid from the smallest,
previously computed cuboid
Cache-results: caching results of a cuboid from which
other cuboids are computed to reduce disk I/Os
Amortize-scans: computing as many cuboids as possible
at the same time to amortize disk reads
Share-sorts: sharing sorting costs across multiple cuboids
when a sort-based method is used
Share-partitions: sharing the partitioning cost across
multiple cuboids when hash-based algorithms are used
Data Cube Computation
Methods
BUC
Star-Cubing
High-Dimensional OLAP
10
Multi-Way Array Aggregation
Array-based “bottom-up” algorithm
Using multi-dimensional chunks
No direct tuple comparisons
Simultaneous aggregation on multiple dimensions
Intermediate aggregate values are re-used for computing ancestor cuboids
Cannot do Apriori pruning: No iceberg optimization
[Figure: cuboid lattice with levels All; A, B, C; AB, AC, BC; ABC]
Multi-way Array Aggregation for Cube
Computation (MOLAP)
Partition arrays into chunks (a small subcube which fits in memory).
Compressed sparse array addressing: (chunk_id, offset)
Compute aggregates in “multiway” by visiting cube cells in the order
which minimizes the # of times to visit each cell, and reduces
memory access and storage cost.
[Figure: a 3-D array over A (a0 to a3), B (b0 to b3), and C (c0 to c3), partitioned into 64 chunks numbered 1 to 64; chunks 1 to 4 run along A at b0, c0]
What is the best traversing order to do multi-way aggregation?
Multi-way Array Aggregation for Cube
Computation (3-D to 2-D)
[Figure: cuboid lattice with levels All; A, B, C; AB, AC, BC; ABC]
14
Multi-Way Array Aggregation for Cube
Computation (Method Summary)
Method: the planes should be sorted and
computed according to their size in ascending
order
Idea: keep the smallest plane in the main
memory, fetch and compute only one chunk at
a time for the largest plane
Limitation of the method: it works well only for
a small number of dimensions
If there are a large number of dimensions, “top-down”
computation and iceberg cube
computation methods can be explored
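The idea of simultaneous aggregation can be sketched in a few lines. The code below is a simplified illustration, assuming a small dense 3-D array held entirely in memory as nested lists; it visits the array chunk by chunk (A varying fastest, then B, then C) and updates the AB, AC, and BC planes in the same pass. It does not model the memory-minimizing plane ordering described above, and the name `multiway_aggregate` is hypothetical.

```python
def multiway_aggregate(cube, chunk):
    """One pass over a dense 3-D array, chunk by chunk, updating AB, AC, BC together."""
    na, nb, nc = len(cube), len(cube[0]), len(cube[0][0])
    ab = [[0] * nb for _ in range(na)]
    ac = [[0] * nc for _ in range(na)]
    bc = [[0] * nc for _ in range(nb)]
    # Chunk order: A varies fastest, then B, then C.
    for c0 in range(0, nc, chunk):
        for b0 in range(0, nb, chunk):
            for a0 in range(0, na, chunk):
                for a in range(a0, min(a0 + chunk, na)):
                    for b in range(b0, min(b0 + chunk, nb)):
                        for c in range(c0, min(c0 + chunk, nc)):
                            v = cube[a][b][c]
                            ab[a][b] += v   # aggregate out C
                            ac[a][c] += v   # aggregate out B
                            bc[b][c] += v   # aggregate out A
    return ab, ac, bc

# Hypothetical 4 x 4 x 4 array with distinct cell values.
cube = [[[a * 16 + b * 4 + c for c in range(4)] for b in range(4)] for a in range(4)]
ab, ac, bc = multiway_aggregate(cube, 2)
```

Each base cell is read once and contributes to three 2-D cuboids at the same time, which is the point of the "multiway" label.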
Bottom-Up Computation (BUC)
Iceberg pruning
If a partition does not satisfy min_sup, its descendants can be pruned
If minsup = 1, compute the full CUBE!
No simultaneous aggregation
[Figure: BUC processing order over the lattice: 1 all; 2 A, 10 B, 14 C, 16 D; 4 ABC, 6 ABD, 8 ACD, 12 BCD; 5 ABCD]
BUC: Partitioning
Usually, entire data set
can’t fit in main memory
Sort distinct values
partition into blocks that fit
Continue processing
Optimizations
Partitioning
External Sorting, Hashing, Counting Sort
Ordering dimensions to encourage pruning
Cardinality, Skew, Correlation
Collapsing duplicates
Can’t do holistic aggregates anymore!
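The partition-and-prune recursion can be sketched as follows. This is a minimal in-memory illustration with count as the measure; it skips the external sorting, dimension ordering, and duplicate-collapsing optimizations listed above, and the function name `buc` and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

def buc(rows, num_dims, min_sup):
    """Return {cell: count} for all cells with count >= min_sup, BUC style."""
    result = {}

    def recurse(part, cell, start):
        # `part` holds the tuples matching the values instantiated in `cell`.
        result[tuple(cell)] = len(part)
        for d in range(start, num_dims):
            groups = defaultdict(list)
            for row in part:
                groups[row[d]].append(row)      # partition on dimension d
            for value, sub in groups.items():
                if len(sub) >= min_sup:         # iceberg (Apriori) pruning
                    cell[d] = value
                    recurse(sub, cell, d + 1)
                    cell[d] = "*"

    if len(rows) >= min_sup:
        recurse(rows, ["*"] * num_dims, 0)
    return result

# Hypothetical 4-dimensional base table.
rows = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c4", "d3"),
    ("a1", "b2", "c2", "d2"),
    ("a2", "b3", "c3", "d4"),
    ("a2", "b4", "c3", "d4"),
]
cells = buc(rows, 4, 2)
```

With min_sup = 2 only 11 cells survive, because any partition with fewer than 2 tuples is dropped together with everything beneath it; with min_sup = 1 the same routine enumerates the full cube.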
Star-Cubing: An Integrating
Method
D. Xin, J. Han, X. Li, B. W. Wah, Star-Cubing: Computing Iceberg
Cubes by Top-Down and Bottom-Up Integration, VLDB'03
Explore shared dimensions
E.g., dimension A is the shared dimension of ACD and AD
ABD/AB means cuboid ABD has shared dimensions AB
Allows for shared computations
e.g., cuboid AB is computed simultaneously as ABD
Aggregate in a top-down manner but with the bottom-up sub-layer underneath which will allow Apriori pruning
Shared dimensions grow in bottom-up fashion
[Figure: lattice annotated with shared dimensions: C/C, D; AC/AC, AD/A, BC/BC, BD/B, CD; ACD/A, ABC/ABC, ABD/AB, BCD; ABCD/all]
Iceberg Pruning in Shared Dimensions
21
Cell Trees
22
Star Attributes and Star Nodes
Intuition: If a single-dimensional aggregate on an attribute value p does not satisfy the iceberg condition, it is useless to distinguish such values during the iceberg computation
E.g., b2, b3, b4, c1, c2, c4, d1, d2, d3
Solution: Replace such attributes by a *. Such attributes are star attributes, and the corresponding nodes in the cell tree are star nodes

A   B   C   D   Count
a1  b1  c1  d1  1
a1  b1  c4  d3  1
a1  b2  c2  d2  1
a2  b3  c3  d4  1
a2  b4  c3  d4  1
Example: Star Reduction
Suppose minsup = 2
Perform one-dimensional aggregation. Replace attribute values whose count < 2 with *. And collapse all *’s together
Resulting table has all such attributes replaced with the star-attribute
With regards to the iceberg computation, this new table is a lossless compression of the original table

After replacing low-count values with *:
A   B   C   D   Count
a1  b1  *   *   1
a1  b1  *   *   1
a1  *   *   *   1
a2  *   c3  d4  1
a2  *   c3  d4  1

After collapsing identical rows:
A   B   C   D   Count
a1  b1  *   *   2
a1  *   *   *   1
a2  *   c3  d4  2
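The reduction is easy to reproduce in code. The sketch below applies one-dimensional counting and star replacement to the five-row table from the star attributes slide; the function name `star_reduce` is illustrative.

```python
from collections import Counter

def star_reduce(rows, min_sup):
    """Replace values whose 1-D count is below min_sup with '*' and collapse rows."""
    n = len(rows[0])
    # One-dimensional aggregation: count each value per dimension.
    one_d = [Counter(row[d] for row in rows) for d in range(n)]
    reduced = Counter()
    for row in rows:
        starred = tuple(v if one_d[d][v] >= min_sup else "*"
                        for d, v in enumerate(row))
        reduced[starred] += 1   # identical starred rows collapse together
    return dict(reduced)

table = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c4", "d3"),
    ("a1", "b2", "c2", "d2"),
    ("a2", "b3", "c3", "d4"),
    ("a2", "b4", "c3", "d4"),
]
compressed = star_reduce(table, 2)
```

The result has three rows, (a1, b1, *, *) with count 2, (a1, *, *, *) with count 1, and (a2, *, c3, d4) with count 2, matching the compressed table on the slide.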
24
Star Tree
Given the new compressed table, it is possible to construct the corresponding cell tree, called star tree
Keep a star table at the side for easy lookup of star attributes
The star tree is a lossless compression of the original cell tree

A   B   C   D   Count
a1  b1  *   *   2
a1  *   *   *   1
a2  *   c3  d4  2
25
Star-Cubing Algorithm: DFS on Lattice Tree
[Figure: lattice tree (all; A/A, B/B, C/C, D/D; AB/AB, AC/AC, AD/A, BC/BC, BD/B, CD; ABC/ABC, ABD/AB, ACD/A, BCD; ABCD) next to the base star tree (root: 5; a1: 3, a2: 2; then b*, b1, c*, c3, d*, d4 nodes) and the BCD tree]
Multi-Way Aggregation
[Figure: trees BCD, ACD/A, ABD/AB, ABC/ABC, and ABCD]
Star-Cubing Algorithm: DFS on Star-Tree
[Figure: trees BCD, ACD/A, ABD/AB, ABC/ABC, and ABCD]
Multi-Way Star-Tree Aggregation
[Figure: trees BCD, ACD/A, ABD/AB, ABC/ABC, and ABCD]
The Curse of Dimensionality
31
Motivation of High-D OLAP
32
Fast High-D OLAP with Minimal
Cubing
34
Example Computation
Let the cube aggregation function be count
tid A B C D E
1 a1 b1 c1 d1 e1
2 a1 b2 c1 d2 e1
3 a1 b2 c1 d1 e2
4 a2 b1 c1 d1 e2
5 a2 b1 c1 d1 e3
35
1-D Inverted Indices
Build a traditional inverted index or RID list
36
Shell Fragment Cubes: Ideas
Generalize the 1-D inverted indices to multi-dimensional ones in the data cube sense
Compute all cuboids for data cubes ABC and DE while retaining the inverted indices
For example, shell fragment cube ABC contains 7 cuboids:
A, B, C
AB, AC, BC
ABC
This completes the offline computation stage

Cell    Intersection           TID List   List Size
a1 b1   {1, 2, 3} ∩ {1, 4, 5}  {1}        1
a1 b2   {1, 2, 3} ∩ {2, 3}     {2, 3}     2
a2 b1   {4, 5} ∩ {1, 4, 5}     {4, 5}     2
a2 b2   {4, 5} ∩ {2, 3}        {}         0
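The offline stage can be sketched directly from the five-tuple example table: build the 1-D inverted index, then derive a multi-dimensional cuboid by intersecting TID lists. The sketch assumes attribute values are unique across dimensions (a1, b1, and so on), so one dictionary keyed by value is enough; the names `inverted_index` and `cuboid` are illustrative.

```python
from itertools import product

DIMS = "ABCDE"
TABLE = {
    1: ("a1", "b1", "c1", "d1", "e1"),
    2: ("a1", "b2", "c1", "d2", "e1"),
    3: ("a1", "b2", "c1", "d1", "e2"),
    4: ("a2", "b1", "c1", "d1", "e2"),
    5: ("a2", "b1", "c1", "d1", "e3"),
}

def inverted_index(table):
    """Map each attribute value to the set of tuple IDs containing it."""
    index = {}
    for tid, row in table.items():
        for value in row:
            index.setdefault(value, set()).add(tid)
    return index

def cuboid(index, table, dims):
    """Cells of the cuboid over `dims` (e.g. "AB"), each with its TID list."""
    positions = [DIMS.index(d) for d in dims]
    domains = [sorted({row[p] for row in table.values()}) for p in positions]
    cells = {}
    for cell in product(*domains):
        # Intersect the 1-D TID lists of the cell's values.
        cells[cell] = set.intersection(*(index[v] for v in cell))
    return cells

index = inverted_index(TABLE)
ab = cuboid(index, TABLE, "AB")
```

Here `ab` reproduces the AB cuboid shown above, including the empty TID list for (a2, b2); with count as the measure, the cell value is simply the list size.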
37
Shell Fragment Cubes: Size and
Design
Given a database of T tuples, D dimensions, and F shell
fragment size, the fragment cubes’ space requirement is:
$O\left(T \left\lceil \frac{D}{F} \right\rceil (2^F - 1)\right)$
For F < 5, the growth is sub-linear
Shell fragments do not have to be disjoint
Fragment groupings can be arbitrary to allow for
maximum online performance
Known common combinations (e.g., <city, state>)
should be grouped together.
Shell fragment sizes can be adjusted for optimal balance
between offline and online computation
38
ID_Measure Table
If measures other than count are present, store in
ID_measure table separate from the shell
fragments
tid count sum
1 5 70
2 3 10
3 8 20
4 5 40
5 2 30
39
The Frag-Shells Algorithm
(P1,…,Pk).
bottom-up fashion.
40
Frag-Shells (2)
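Dimensions: A B C D E F …
[Figure: the dimensions are split into the fragments ABC and DEF, giving an ABC cube and a DEF cube; the DEF cube includes the D, EF, and DE cuboids]
DE Cuboid
Cell    Tuple-ID List
d1 e1   {1, 3, 8, 9}
d1 e2   {2, 4, 6, 7}
d2 e1   {5, 10}
…       …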
41
Online Query Computation: Query
A query has the general form <a1, a2, …, an : M>
Each ai takes one of three forms:
1. Instantiated value
2. Aggregate * function
3. Inquire ? function
For example, <3 ? ? * 1 : count> returns a 2-D data cube.
42
Online Query Computation: Method
[Figure: dimensions A B C D E F G H I J K L M N …; the instantiated base table and the online cube]
44
Experiment: Size vs.
Dimensionality (50 and 100
cardinality)
Problems for Drilling in Multidim.
Space
Data is only a sample of population but samples
could be small when drilling to certain
multidimensional space
[Table: rows Age 18, 19, 20; columns Education High-school, College, Graduate]
OLAP on Survey (i.e., Sampling)
Data
Semantics of query is unchanged
Input data has changed
Challenges for OLAP on Sampling
Data
Computing confidence intervals in OLAP
context
No data?
Not exactly. No data in subspaces in cube
Sparse data
Causes include sampling bias and query
selection bias
Curse of dimensionality
Survey data can be high dimensional
Over 600 dimensions in real world example
Impossible to fully materialize
53
Example 1: Confidence Interval
What is the average income of 19-year-old high-school
students?
Return not only query result but also confidence interval
[Table: rows Age 18, 19, 20; columns Education High-school, College, Graduate]
Confidence Interval
Confidence interval: $\bar{x} \pm t_c \hat{\sigma}_{\bar{x}}$
$x$ is a sample of data set; $\bar{x}$ is the mean of sample
$t_c$ is the critical t-value, calculated by a look-up
$\hat{\sigma}_{\bar{x}}$ is the estimated standard error of the mean
Example: $50,000 ± $3,000 with 95% confidence
Treat points in cube cell as samples
Compute confidence interval as traditional sample
set
Return answer in the form of confidence interval
Indicates quality of query answer
User selects desired confidence interval
55
Efficient Computing Confidence Interval
Measures
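The estimated standard error $\hat{\sigma}_{\bar{x}} = s / \sqrt{l}$ is algebraic
where both $s$ (the sample standard deviation) and $l$ (count) are algebraic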
Thus one can calculate cells efficiently at more general
cuboids without having to start at the base cuboid each
time
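One way to see why the measure rolls up cheaply is to keep three distributive values per cell (count, sum, and sum of squares) and derive the interval from them. The sketch below follows that idea; the income values are hypothetical, the function names are illustrative, and the critical value `t_c` is passed in by the caller because it comes from a t-table look-up.

```python
from math import sqrt

def summarize(values):
    """Distributive summary of a cell: (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))

def merge(s1, s2):
    """Roll two cells up into a more general cell by adding the summaries."""
    return tuple(a + b for a, b in zip(s1, s2))

def confidence_interval(summary, t_c):
    """Return (mean, half_width), with half_width = t_c * s / sqrt(l); needs l >= 2."""
    l, total, total_sq = summary
    mean = total / l
    variance = (total_sq - l * mean * mean) / (l - 1)   # sample variance s^2
    std_err = sqrt(variance / l)                        # s / sqrt(l)
    return mean, t_c * std_err

# Hypothetical incomes in two child cells of the same parent cell.
cell_1 = [40000, 50000, 60000]
cell_2 = [45000, 55000]
parent = merge(summarize(cell_1), summarize(cell_2))
# 2.776 is the two-sided 95% critical t-value for 4 degrees of freedom.
mean, half_width = confidence_interval(parent, 2.776)
```

The parent summary equals the summary computed from the five raw values, so the parent's interval (here roughly $50,000 ± $9,815) is obtained without revisiting the base cuboid.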
56
Example 2: Query Expansion
What is the average income of 19-year-old college students?
Boosting Confidence by Query
Expansion
From the example: The queried cell “19-year-old
college students” contains only 2 samples
Confidence interval is large (i.e., low confidence).
Why?
Small sample size
High standard deviation within the samples
Small sample sizes can occur at relatively low
dimensional selections
Collect more data? Expensive!
Use data in other cells? Maybe, but have to be
careful
58
Intra-Cuboid Expansion: Choice 1
Expand query to include 18 and 20 year olds?
Intra-Cuboid Expansion: Choice 2
Expand query to include high-school and graduate students?
Query Expansion
61
Intra-Cuboid Expansion
Combine other cells’ data into own to “boost”
confidence
If the cells share semantic and cube similarity
Use only if necessary
Bigger sample size will narrow the confidence
interval
Cell segment similarity
Some dimensions are clear: Age
Some are fuzzy: Occupation
May need domain knowledge
Cell value similarity
How to determine if two cells’ samples come
from the same population?
Two-sample t-test (confidence-based)
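A two-sample t-test for cell value similarity might look like the sketch below. The slides do not say which variant is used, so Welch's unequal-variance form is an assumption here, as are the function names and the sample values; the critical value is supplied by the caller from a t-table.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom (assumed variant)."""
    va = variance(sample_a) / len(sample_a)
    vb = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1))
    return t, df

def similar(sample_a, sample_b, t_critical):
    """Treat two cells as similar when |t| does not exceed the critical value."""
    t, _ = welch_t(sample_a, sample_b)
    return abs(t) <= t_critical

# Hypothetical incomes (in thousands) for three neighboring cells.
cell_19 = [48, 52, 50, 49, 51]
cell_18 = [50, 53, 49, 51, 52]
cell_20 = [80, 85, 78, 82, 84]
t_stat, df = welch_t(cell_19, cell_18)
```

For these samples the test against the 18-year-old cell gives t = -1 with 8 degrees of freedom, well inside the 95% critical value of 2.306, so that cell would be a candidate for expansion; the 20-year-old cell would be rejected.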
62
Inter-Cuboid Expansion
If a query dimension is
Not correlated with cube value
But is causing small sample size by drilling
down too much
Remove dimension (i.e., generalize to *) and
move to a more general cuboid
Can use two-sample t-test to determine similarity
between two cells across cuboids
Can also use a different method to be shown later
63
Query Expansion Experiments
Real world sample data: 600 dimensions and
750,000 tuples
A 0.05% subset is used to simulate the “sample” (allows error
checking)
64
Ranking Cube: Partition Data on Both
Selection and Ranking Dimensions
One single data partition as the template
Slice the data partition by selection conditions
[Figure: the partition for all data, with axis marks at 800 and 1000 and blocks 11 and 15 highlighted. Without ranking-cube: start search from here. With ranking-cube: start search from here. Measure for LA: {11, 15}; {11: t6, t7; 15: t5}]
Processing Ranking Query: Execution
Trace
Select top 1 from Apartment
where city = “LA”
order by [price – 1000]^2 + [sq feet - 800]^2 asc
Execution Trace:
1. Retrieve High-level measure for LA {11, 15}
2. Estimate lower bound score for block 11, 15
   f(block 11) = 40,000, f(block 15) = 160,000
3. Retrieve block 11
4. Retrieve low-level measure for block 11
5. f(t6) = 130,000, f(t7) = 97,600
Output t7, done!
[Figure: data partition with axis marks at 800 and 1000 and blocks 11 and 15 highlighted. With ranking-cube: start search from here. Measure for LA: {11, 15}; {11: t6, t7; 15: t5}]
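The trace above can be mimicked with a small block-pruning search. The block bounding boxes and tuple coordinates below are hypothetical, chosen only so that the lower bounds and scores match the numbers on the slide; the slides do not give the actual geometry, and the function names are illustrative.

```python
def score(price, sqft):
    """Ranking function from the query: squared distance to (1000, 800)."""
    return (price - 1000) ** 2 + (sqft - 800) ** 2

def lower_bound(block):
    """Smallest possible score inside a block's bounding box."""
    (p_lo, p_hi), (s_lo, s_hi) = block["price"], block["sqft"]
    p = min(max(1000, p_lo), p_hi)   # closest price to the target within the box
    s = min(max(800, s_lo), s_hi)    # closest sq feet to the target within the box
    return score(p, s)

# Hypothetical geometry chosen to reproduce the scores in the trace.
BLOCKS = {
    11: {"price": (1200, 1400), "sqft": (600, 1100),
         "tuples": {"t6": (1300, 1000), "t7": (1240, 1000)}},
    15: {"price": (1400, 1600), "sqft": (600, 1100),
         "tuples": {"t5": (1500, 900)}},
}

def top1(block_ids):
    """Visit blocks by ascending lower bound; stop once no block can beat the best."""
    best, retrieved = None, []
    for bid in sorted(block_ids, key=lambda b: lower_bound(BLOCKS[b])):
        if best is not None and best[0] <= lower_bound(BLOCKS[bid]):
            break   # no tuple in this or later blocks can beat the current best
        retrieved.append(bid)
        for tid, (price, sqft) in BLOCKS[bid]["tuples"].items():
            s = score(price, sqft)
            if best is None or s < best[0]:
                best = (s, tid)
    return best, retrieved

best, retrieved = top1([11, 15])   # the high-level measure for LA is {11, 15}
```

Block 15 is never retrieved: after block 11 yields t7 with score 97,600, the lower bound of 160,000 for block 15 already rules it out.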
Ranking Cube: Methodology and
Extension
71
Multidimensional Data Analysis in
Cube Space
Cubes
73
Data Mining in Cube Space
Data cube greatly increases the analysis bandwidth
Four ways in which OLAP-style analysis and data
mining can interact
Using cube space to define data space for mining
Using OLAP queries to generate features and targets
for mining, e.g., multi-feature cube
Using data-mining models as building blocks in a
multi-step mining process, e.g., prediction cube
Using data-cube computation techniques to speed up
repeated model construction
Cube-space data mining may require building a
model for each candidate data space
Sharing computation across model-construction for
different candidates may lead to efficient mining
74
Prediction Cubes
Prediction cube: A cube structure that stores
prediction models in multidimensional data space
and supports prediction in OLAP manner
Prediction models are used as building blocks to
define the interestingness of subsets of data, i.e.,
to answer which subsets of data indicate better
prediction
75
How to Determine the Prediction
Power of an Attribute?
Ex. A customer table D:
Two dimensions Z: Time (Month, Year ) and
Complex Aggregation at Multiple
Granularities: Multi-Feature Cubes
Multi-feature cubes (Ross, et al. 1998): Compute complex
queries involving multiple dependent aggregates at
multiple granularities
Ex. Grouping by all subsets of {item, region, month}, find
the maximum price in 2010 for each group, and the total
sales among all maximum price tuples
select item, region, month, max(price), sum(R.sales)
from purchases
where year = 2010
cube by item, region, month: R
such that R.price = max(price)
Continuing the last example, among the max price tuples,
find the min and max shelf life, and find the fraction of the
total sales due to tuples that have min shelf life within the
set of all max price tuples
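The first query (max price per group, and total sales among the max-price tuples) can be emulated with a small brute-force routine. The `purchases` rows are hypothetical and assumed to be already filtered to year = 2010; `multi_feature_cube` is an illustrative name, not part of the extended SQL.

```python
from itertools import combinations

DIMS = ("item", "region", "month")

def multi_feature_cube(rows):
    """For every group-by over subsets of DIMS: (max price, sales at that price)."""
    out = {}
    for k in range(len(DIMS) + 1):
        for dims in combinations(DIMS, k):
            groups = {}
            for row in rows:
                key = tuple(row[d] if d in dims else "*" for d in DIMS)
                groups.setdefault(key, []).append(row)
            for key, members in groups.items():
                top = max(r["price"] for r in members)
                # Dependent aggregate: only tuples with R.price = max(price).
                sales = sum(r["sales"] for r in members if r["price"] == top)
                out[key] = (top, sales)
    return out

# Hypothetical purchases in 2010.
purchases = [
    {"item": "TV", "region": "East", "month": "Jan", "price": 500, "sales": 3},
    {"item": "TV", "region": "East", "month": "Feb", "price": 500, "sales": 2},
    {"item": "TV", "region": "West", "month": "Jan", "price": 450, "sales": 7},
    {"item": "PC", "region": "East", "month": "Jan", "price": 900, "sales": 1},
]
cube = multi_feature_cube(purchases)
```

For the group (TV, *, *) the maximum price is 500 and the sales at that price total 5; the West tuple is excluded because its price is below the group maximum. The second aggregate depends on the first, which is what makes the query "multi-feature".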
Discovery-Driven Exploration of Data
Cubes
Hypothesis-driven
exploration by user, huge search space
Discovery-driven (Sarawagi, et al.’98)
Effective navigation of large OLAP data cubes
Pre-compute measures indicating exceptions to
guide the user in the data analysis, at all levels of
aggregation
Exception: significantly different from the value
anticipated, based on a statistical model
Visual cues such as background color are used to
reflect the degree of exception of each cell
81
Kinds of Exceptions and their
Computation
Parameters
SelfExp: surprise of cell relative to other cells at
same level of aggregation
InExp: surprise beneath the cell
PathExp: surprise beneath cell for each drill-
down path
Computation of exception indicators (model
fitting and computing SelfExp, InExp, and PathExp
values) can be overlapped with cube construction
Exceptions themselves can be stored, indexed and
retrieved like precomputed aggregates
82
Examples: Discovery-Driven Data
Cubes
83
Data Cube Technology: Summary
Data Cube Computation: Preliminary Concepts
Data Cube Computation Methods
MultiWay Array Aggregation
BUC
Star-Cubing
High-Dimensional OLAP with Shell-Fragments
Processing Advanced Queries by Exploring Data Cube
Technology
Sampling Cubes
Ranking Cubes
Multidimensional Data Analysis in Cube Space
Discovery-Driven Exploration of Data Cubes
Multi-feature Cubes
Prediction Cubes
85
Ref.(I) Data Cube Computation Methods
S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the
computation of multidimensional aggregates. VLDB’96
D. Agrawal, A. E. Abbadi, A. Singh, and T. Yurek. Efficient view maintenance in data warehouses. SIGMOD’97
K. Beyer and R. Ramakrishnan. Bottom-Up Computation of Sparse and Iceberg CUBEs. SIGMOD’99
M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J. D. Ullman. Computing iceberg queries efficiently.
VLDB’98
J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data cube:
A relational aggregation operator generalizing group-by, cross-tab and sub-totals. Data Mining and Knowledge
Discovery, 1:29–54, 1997.
J. Han, J. Pei, G. Dong, K. Wang. Efficient Computation of Iceberg Cubes With Complex Measures. SIGMOD’01
L. V. S. Lakshmanan, J. Pei, and J. Han, Quotient Cube: How to Summarize the Semantics of a Data Cube,
VLDB'02
X. Li, J. Han, and H. Gonzalez, High-Dimensional OLAP: A Minimal Cubing Approach, VLDB'04
Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional
aggregates. SIGMOD’97
K. Ross and D. Srivastava. Fast computation of sparse datacubes. VLDB’97
D. Xin, J. Han, X. Li, B. W. Wah, Star-Cubing: Computing Iceberg Cubes by Top-Down and Bottom-Up Integration,
VLDB'03
D. Xin, J. Han, Z. Shao, H. Liu, C-Cubing: Efficient Computation of Closed Cubes by Aggregation-Based Checking,
ICDE'06
86
Ref. (II) Advanced Applications with Data Cubes
D. Burdick, P. Deshpande, T. S. Jayram, R. Ramakrishnan, and S. Vaithyanathan. OLAP over
uncertain and imprecise data. VLDB’05
X. Li, J. Han, Z. Yin, J.-G. Lee, Y. Sun, “Sampling Cube: A Framework for Statistical OLAP over
Sampling Data”, SIGMOD’08
C. X. Lin, B. Ding, J. Han, F. Zhu, and B. Zhao. Text Cube: Computing IR measures for
multidimensional text database analysis. ICDM’08
D. Papadias, P. Kalnis, J. Zhang, and Y. Tao. Efficient OLAP operations in spatial data
warehouses. SSTD’01
N. Stefanovic, J. Han, and K. Koperski. Object-based selective materialization for efficient
implementation of spatial data cubes. IEEE Trans. Knowledge and Data Engineering, 12:938–
958, 2000.
T. Wu, D. Xin, Q. Mei, and J. Han. Promotion analysis in multidimensional space. VLDB’09
T. Wu, D. Xin, and J. Han. ARCube: Supporting ranking aggregate queries in partially
materialized data cubes. SIGMOD’08
D. Xin, J. Han, H. Cheng, and X. Li. Answering top-k queries with multi-dimensional selections:
The ranking cube approach. VLDB’06
J. S. Vitter, M. Wang, and B. R. Iyer. Data cube approximation and histograms via wavelets.
CIKM’98
D. Zhang, C. Zhai, and J. Han. Topic cube: Topic modeling for OLAP on multi-dimensional text
databases. SDM’09
87
Ref. (III) Knowledge Discovery with Data Cubes
R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. ICDE’97
B.-C. Chen, L. Chen, Y. Lin, and R. Ramakrishnan. Prediction cubes. VLDB’05
B.-C. Chen, R. Ramakrishnan, J.W. Shavlik, and P. Tamma. Bellwether analysis: Predicting global
aggregates from local regions. VLDB’06
Y. Chen, G. Dong, J. Han, B. W. Wah, and J. Wang, Multi-Dimensional Regression Analysis of
Time-Series Data Streams, VLDB'02
G. Dong, J. Han, J. Lam, J. Pei, K. Wang. Mining Multi-dimensional Constrained Gradients in Data
Cubes. VLDB’01
R. Fagin, R. V. Guha, R. Kumar, J. Novak, D. Sivakumar, and A. Tomkins. Multi-structural
databases. PODS’05
J. Han. Towards on-line analytical mining in large databases. SIGMOD Record, 27:97–107, 1998
T. Imielinski, L. Khachiyan, and A. Abdulghani. Cubegrades: Generalizing association rules. Data
Mining & Knowledge Discovery, 6:219–258, 2002.
R. Ramakrishnan and B.-C. Chen. Exploratory mining in cube space. Data Mining and Knowledge
Discovery, 15:29–54, 2007.
K. A. Ross, D. Srivastava, and D. Chatziantoniou. Complex aggregation at multiple granularities.
EDBT'98
S. Sarawagi, R. Agrawal, and N. Megiddo. Discovery-driven exploration of OLAP data cubes.
EDBT'98
G. Sathe and S. Sarawagi. Intelligent Rollups in Multidimensional OLAP Data. VLDB'01
88
Surplus Slides
89
Data Cube Technology
Efficient Methods for Data Cube Computation
Preliminary Concepts and General Strategies for Cube
Computation
Multiway Array Aggregation for Full Cube Computation
BUC: Computing Iceberg Cubes from the Apex Cuboid Downward
H-Cubing: Exploring an H-Tree Structure
Star-cubing: Computing Iceberg Cubes Using a Dynamic Star-tree
Structure
Precomputing Shell Fragments for Fast High-Dimensional OLAP
Data Cubes for Advanced Applications
Sampling Cubes: OLAP on Sampling Data
Ranking Cubes: Efficient Computation of Ranking Queries
Knowledge Discovery with Data Cubes
Discovery-Driven Exploration of Data Cubes
Complex Aggregation at Multiple Granularity: Multi-feature Cubes
Prediction Cubes: Data Mining in Multi-Dimensional Cube Space
Summary
90
H-Cubing: Using H-Tree Structure
Bottom-up computation
Exploring an H-tree structure
If the current computation of an H-
[Figure: cuboid lattice with levels all; A, B, C, D; AB, AC, AD, BC, BD, CD; ABC, ABD, ACD, BCD; ABCD]
91
H-tree: A Prefix Hyper-tree
[Figure: H-tree built from a base table with columns Month, City, Cust_grp, Prod, Cost, Price (rows shown: Jan, Tor, Edu, Printer, 500, 485; Jan, Tor, Hhd, TV, 800, 1200). A header table lists each attribute value (Edu, Hhd, Bus, Jan, Feb, Tor, Van, Mon, …) with its Quant-Info (e.g., Sum: 2285) and side-link. Tree levels: root; edu, hhd, bus; Jan, Mar, Jan, Feb; Tor, Van, Tor, Mon; the leaves carry Quant-Info (Q.I.)]
From (*, *, Tor) to (*, Jan, Tor)
[Figure: the same H-tree with a second header table HTor (Attr. Val., Q.I., Side-link) built for the Tor nodes; Quant-Info shown: Sum: 1765, Cnt: 2]
93
Computing Cells Involving Month But No
City
root