Implementation of Multidimensional Databases with Document-Oriented NoSQL
Abstract. NoSQL (Not Only SQL) systems are becoming popular due to known
advantages such as horizontal scalability and elasticity. In this paper, we study
the implementation of data warehouses with document-oriented NoSQL
systems. We propose mapping rules that transform the multidimensional data
model to logical document-oriented models. We consider three different
logical translations and we use them to instantiate multidimensional data
warehouses. We focus on data loading, model-to-model conversion and cuboid
computation.
1 Introduction
NoSQL solutions have shown clear advantages with respect to relational database
management systems (RDBMS) [14]. Nowadays, research attention has moved towards the
use of these systems for storing and analyzing "big" data. This work follows our previous
work on the use of NoSQL solutions for data warehousing [3] and joins substantial ongoing
work [6, 9, 15]. In this paper, we focus on one class of NoSQL stores, namely
document-oriented systems [7].
Document-oriented systems are one of the best-known families of NoSQL systems.
Data is stored in collections, which contain documents. Each document is composed of
key-value pairs, and a value can itself be a nested sub-document. Document-oriented
stores enable more flexibility in schema design: they allow the storage of complex
structured data and of heterogeneous data in one collection. Although document-oriented
databases are said to be "schemaless" (no schema is required), most uses still conform
to some data model.
When it comes to data warehouses, previous work has shown that they can be instantiated
with different logical models [10]. We recall that data warehousing relies mostly on the
multidimensional data model. The latter is a conceptual model,1 and we need to map it
onto document-oriented logical models. Mapping the multidimensional model to relational
databases is quite straightforward, but until now no work (apart from our previous work
[3]) has considered the direct mapping from the multidimensional conceptual model to
NoSQL logical models (Fig. 1). NoSQL models support more complex data structures than
the relational model, i.e. we are not limited to describing data and relations with
atomic attributes; they offer flexible data structures (e.g. nested elements). In this
context, more than one logical model is a candidate for mapping the multidimensional
model. Moreover, evolving needs may require switching from one model to another. This is
the scope of our work: NoSQL logical models and their use for multidimensional data
warehousing.
Fig. 1. At the logical level, the multidimensional conceptual model can be translated either into relational or into NoSQL models; OLAP analyses are performed on top of both.
1 The conceptual level describes the data in a generic way, regardless of the information
technology, whereas the logical level uses a specific technique to implement the
conceptual level.
Illustration: Let us consider an excerpt of the star schema benchmark [12]. It models
the monitoring of a sales system: orders are placed by customers and the lines of the
orders are analyzed. A line concerns a part (a product) bought from a supplier and sold
to a customer on a specific date. The conceptual schema of this case study is presented
in Fig. 2.
Fig. 2. Conceptual (multidimensional) schema of the case study: the fact LineOrder (measures Quantity, Discount, Revenue, Tax) is analyzed along four dimensions: CUSTOMER (hierarchy HCust: Customer, City, Region, Nation, All), PART (hierarchies HBrand: Partkey, Brand, Type, All and HCateg: Partkey, Category, All; weak attributes Size, Prod_Name), DATE (hierarchy HTime: Date, Month, Year, All; weak attribute Month_Name) and SUPPLIER (hierarchy HSuppl: Supplier, City, Region, Nation, All; weak attribute Name).
– F_SSB = {F_LineOrder}
– D_SSB = {D_Customer, D_Part, D_Date, D_Supplier}
– Star_SSB(F_LineOrder) = {D_Customer, D_Part, D_Date, D_Supplier}
From this schema, called E_SSB, we can define cuboids, for instance:
– (F_LineOrder, {D_Customer, D_Date, D_Supplier}),
– (F_LineOrder, {D_Customer, D_Date}).
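For concreteness, the sketch below renders the schema E_SSB and the two cuboids above as plain Python data structures; the variable names are ours and purely illustrative, not part of the benchmark.

```python
# Illustrative only: a plain-Python rendering of the star schema E_SSB
# and of two of its cuboids. Names are hypothetical.
from typing import Dict, List, Tuple

# Fact and dimensions of E_SSB
F_LINEORDER = {"measures": ["Quantity", "Discount", "Revenue", "Tax"]}
D_SSB: Dict[str, List[str]] = {
    "Customer": ["Customer", "City", "Region", "Nation"],
    "Part":     ["Partkey", "Brand", "Type", "Category", "Size", "Prod_Name"],
    "Date":     ["Date", "Month", "Year", "Month_Name"],
    "Supplier": ["Supplier", "City", "Region", "Nation", "Name"],
}

# A cuboid pairs the fact with a subset of its dimensions.
Cuboid = Tuple[str, List[str]]
cuboid_cds: Cuboid = ("LineOrder", ["Customer", "Date", "Supplier"])
cuboid_cd:  Cuboid = ("LineOrder", ["Customer", "Date"])
```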
We consider three document-oriented logical models: MLD0, MLD1 and MLD2.
MLD0: For a given fact, all measures and all dimension attributes are stored as simple
attributes of one flat collection, with no nesting and no references.
MLD1: For a given fact, all dimension attributes are nested under the respective
dimension name and all measures are nested in a sub-document with key "measures". This
model is inspired by [3]. Note that there are different ways to nest data; this is just
one of them.
MLD2: For a given fact and its dimensions, we store data in dedicated collections: one
per dimension and one for the fact. Each collection is kept simple, with no
sub-documents. The fact documents hold references to the dimension documents. We call
this model MLD2 (or the shattered model). This model has known advantages such as lower
memory usage and better data integrity, but it can slow down querying.
Table 1. Mapping rules from the conceptual model to the logical models
Conceptual Model to MLD0: To instantiate this model from the conceptual model,
these rules are applied:
• Each cuboid O (F_O and its dimensions D_O) is translated into a collection C.
• Each measure m ∈ M_F is translated into a simple attribute of C (i.e. C[id]{m}).
• For each dimension D ∈ D_O, each attribute a ∈ A_D of the dimension D is converted
into a simple attribute of C (i.e. C[id]{a}).
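As an illustration of these rules, the following sketch (using pymongo, with database, collection and field names of our own choosing) builds one MLD0 document for a single line order and inserts it; it is a minimal example of the flat mapping, not the generator used in our experiments.

```python
# Minimal MLD0 sketch: every measure and every dimension attribute becomes a
# simple attribute of one flat document. All names and values are illustrative.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["ssb_mld0"]

line_order_flat = {
    # measures of the fact
    "quantity": 17, "discount": 4, "revenue": 2116823, "tax": 2,
    # attributes of the Customer dimension
    "customer": "Customer#001", "c_city": "Lyon",
    "c_region": "Europe", "c_nation": "France",
    # attributes of the Date dimension
    "date": "1997-03-04", "month": "1997-03", "year": 1997,
    # Part and Supplier attributes would be flattened in the same way
}
db["lineorder"].insert_one(line_order_flat)
```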
Conceptual Model to MLD1: To instantiate this model from the conceptual model,
these rules are applied:
• Each cuboid O (F_O and its dimensions D_O) is translated into a collection C.
• The attributes of the fact F_O are nested in a dedicated sub-document C[id]{N_F}.
Each measure m ∈ M_F is translated into a simple attribute C[id]{N_F:m}.
• For each dimension D ∈ D_O, its attributes are nested in a dedicated sub-document
C[id]{N_D}. Every attribute a ∈ A_D of the dimension D is mapped into a simple
attribute C[id]{N_D:a}.
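The same illustrative line order, shaped according to MLD1, would look as follows (again a minimal pymongo sketch with hypothetical names):

```python
# Minimal MLD1 sketch: measures nested under "measures", each dimension's
# attributes nested under the dimension's name. Names and values are illustrative.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["ssb_mld1"]

line_order_nested = {
    "measures": {"quantity": 17, "discount": 4, "revenue": 2116823, "tax": 2},
    "customer": {"customer": "Customer#001", "city": "Lyon",
                 "region": "Europe", "nation": "France"},
    "date": {"date": "1997-03-04", "month": "1997-03", "year": 1997},
    # "part" and "supplier" sub-documents would follow the same pattern
}
db["lineorder"].insert_one(line_order_nested)
```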
Conceptual Model to MLD2: To instantiate this model from the conceptual model,
these rules are applied:
• For each cuboid O (F_O and its dimensions D_O), the fact F_O is translated into a
collection C_F and each dimension D ∈ D_O into a collection C_D.
• Each measure m ∈ M_F is translated within C_F into a simple attribute (i.e. C_F[id']{m}).
• For each dimension D ∈ D_O, each attribute a ∈ A_D of the dimension D is mapped into
C_D as a simple attribute (i.e. C_D[id]{a}), and if a = id_D the document C_F is
completed with a simple attribute C_F[id']{a} (the value referencing the linked
dimension document).
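For MLD2, the same illustrative data is split across collections, with the fact document holding only measures and references; the sketch below uses identifiers and names of our own choosing.

```python
# Minimal MLD2 (shattered) sketch: dedicated collections per dimension, fact
# documents keep measures plus references to dimension documents.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["ssb_mld2"]

# One document per dimension member, in dedicated collections.
db["customer"].insert_one({"_id": "CUST001", "customer": "Customer#001",
                           "city": "Lyon", "region": "Europe", "nation": "France"})
db["date"].insert_one({"_id": "19970304", "date": "1997-03-04",
                       "month": "1997-03", "year": 1997})

# The fact document holds its measures and the identifiers of the dimension
# documents it refers to (the value references).
db["lineorder"].insert_one({
    "quantity": 17, "discount": 4, "revenue": 2116823, "tax": 2,
    "custkey": "CUST001", "datekey": "19970304",
    # "partkey" and "suppkey" references would be added likewise
})
```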
5 Experiments
Our experimental goal is to validate the instantiation of data warehouses with the three
approaches mentioned earlier. Then, we consider converting data from one model to the
other. In the end, we generate OLAP cuboids and we compare the effort needed by
model. We rely on the SSB + benchmark that is popular for generating data for decision
support systems. As data store, we rely on MongoDB one of the most popular document-
oriented system.
5.1 Protocol
Data: We generate data using the SSB+ benchmark [4]. The benchmark models a simple
product-retail scenario. It contains one fact table "LineOrder" and 4 dimensions
"Customer", "Supplier", "Part" and "Date". This corresponds to a star schema. The
dimensions are hierarchical, e.g. "Date" has the hierarchy of attributes [d_date,
d_month, d_year]. We have extended the benchmark to generate raw data specific to our
models in JSON file format, which is convenient for our experimental purposes since
JSON is the most natural file format for MongoDB data loading. We use different scale
factors, namely sf = 1, sf = 10, sf = 25 and sf = 100, in our experiments. The scale
factor sf = 1 generates approximately 10^7 lines for the LineOrder fact, sf = 10
approximately 10^8 lines, and so on. In the MLD2 model we will have sf × 10^7 lines
for LineOrder and considerably fewer for the dimensions.
Data loading: Data is loaded into MongoDB using native instructions, which are supposed
to load data faster from files. The current version of MongoDB could not load data
matching our logical models from CSV files, so we had to use JSON files.
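For instance, MongoDB's native bulk loader can be driven from a script as sketched below; the database, collection and file names are placeholders and not necessarily those used in our experiments.

```python
# Sketch of driving MongoDB's native bulk loader (mongoimport) on a JSON file
# produced by the extended SSB+ generator. Names are placeholders.
import subprocess

subprocess.run(
    ["mongoimport",
     "--db", "ssb_mld0",
     "--collection", "lineorder",
     "--file", "lineorder_mld0.json"],
    check=True,
)
```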
5.2 Results
In Table 2, we summarize data loading times by model and scale factor. At scale factor
SF1, we have 10^7 lines in each LineOrder collection, for 4.2 GB of disk usage with
MLD2 (15 GB for MLD0 and MLD1). At scale factors SF10 and SF100 we have respectively
10^8 and 10^9 lines, for 42 GB (150 GB for MLD0 and MLD1) and 420 GB (1.5 TB for MLD0
and MLD1) of disk usage. We observe that memory usage is lower with the MLD2 model.
This is explained by the absence of redundancy in the dimensions: the collections
"Customer", "Supplier", "Part" and "Date" have respectively 50000, 3333, 3333333 and
2556 records.
In Fig. 4, we show the time needed to convert data of one model into another at SF1.
When we convert data from MLD0 to MLD1 and vice versa, conversion times are comparable.
To transform data from MLD0 to MLD1 we simply introduce a nesting depth of 1 in the
documents; in the other direction (MLD1 to MLD0), we reduce the depth by one. The
conversion is more complicated when we consider MLD0 and MLD2. To convert MLD0 data
into MLD2 we need to split the data into multiple collections: we have to apply 5
projections on the original data and keep only distinct keys for the dimensions.
Although we produce less data (in terms of storage), we need more processing time than
when we convert data to MLD1. Converting from MLD2 to MLD0 is by far the slowest
process. This is due to the fact that most NoSQL systems (including MongoDB) do not
natively support joins. We had to test different hand-coded optimization techniques;
the conversion times range from 5 h to 125 h for SF1. It might be possible to optimize
this conversion further, but the results are illustrative of the join issues in MongoDB.
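One possible hand-coded strategy, sketched below with pymongo and illustrative names, caches the small dimension collections in memory and flattens each fact document; it is given only to illustrate why this conversion amounts to an application-side join, not as the exact code we benchmarked.

```python
# Application-side join from MLD2 back to MLD0 (one possible strategy):
# cache the dimension collections in memory, then flatten each fact document.
from pymongo import MongoClient

src = MongoClient("mongodb://localhost:27017")["ssb_mld2"]
dst = MongoClient("mongodb://localhost:27017")["ssb_mld0"]

# Cache each dimension keyed by its identifier (only two shown for brevity).
customers = {d["_id"]: d for d in src["customer"].find()}
dates     = {d["_id"]: d for d in src["date"].find()}

batch = []
for fact in src["lineorder"].find():
    # Keep the measures, drop the internal id and the reference attributes.
    flat = {k: v for k, v in fact.items() if k not in ("_id", "custkey", "datekey")}
    # Replace each reference by the attributes of the referenced dimension document.
    flat.update({k: v for k, v in customers[fact["custkey"]].items() if k != "_id"})
    flat.update({k: v for k, v in dates[fact["datekey"]].items() if k != "_id"})
    batch.append(flat)
    if len(batch) == 10_000:          # insert in bulk to limit round trips
        dst["lineorder"].insert_many(batch)
        batch = []
if batch:
    dst["lineorder"].insert_many(batch)
```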
Fig. 4. Model-to-model conversion times at SF1: conversions between MLD0 and MLD1 take 550 s and 720 s; MLD0 to MLD2 takes 870 s; MLD2 to MLD0 takes between 5 h and 125 h.
OLAP lattice excerpt at SF1 (record counts and computation times): the base cuboid CSPD contains 10,000,000 documents (loading time only, no processing time); the two-dimension cuboids CS, CP, CD, SP, SD and PD contain between 21,250 and 937,478 documents and are computed in 35 s to 237 s; the apex cuboid (All) contains 1 document and is computed in less than 1 s.
We observe, as expected, that the number of records decreases from one level to the next
lower level. The same holds for computation time: we need between 300 and 500 s to
compute the cuboids at the first level (3 dimensions), between 30 s and 250 s at the
second level (2 dimensions), and less than one second at the third and fourth levels
(1 and 0 dimensions).
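As an illustration, a two-dimension cuboid such as (Customer, Date) can be materialized from the MLD0 collection with MongoDB's aggregation framework; the pipeline below is a sketch with illustrative field names and is not necessarily the implementation used in our experiments.

```python
# Sketch: compute and materialize the (Customer, Date) cuboid from the flat
# MLD0 collection by grouping on the two dimension keys and summing measures.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["ssb_mld0"]

pipeline = [
    {"$group": {
        "_id": {"customer": "$customer", "date": "$date"},   # grouping keys
        "sum_quantity": {"$sum": "$quantity"},                # aggregated measures
        "sum_revenue":  {"$sum": "$revenue"},
    }},
    {"$out": "cuboid_customer_date"},                         # materialize the cuboid
]
db["lineorder"].aggregate(pipeline, allowDiskUse=True)
```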
OLAP computation using the MLD1 model provides similar results. The performance is
significantly lower with the MLD2 model due to joins. These differences only concern
layer 1 (depth one) of the OLAP lattice, because the other layers can be computed from
it. We do not report these results due to space constraints.
Observations: We observe that the time needed to load data into one model is comparable
to the conversion times (except for MLD2 to MLD0). We also observe reasonable times for
computing OLAP cuboids. These observations are important. On the one hand, we show that
we can instantiate data warehouses in document-oriented systems. On the other hand, we
can think of pivot models or materialized views that can be computed in parallel with a
chosen data model.
6 Conclusion
In this paper, we have studied the instantiation of data warehouses with document-
oriented systems. We propose three approaches at the document-oriented logical level.
Using a simple formalism, we describe the mapping from the multidimensional conceptual
data model to the logical level.
Our experimental work illustrates the instantiation of data warehouses with each of the
three approaches. Each model has its weaknesses and strengths. The shattered model
(MLD2) uses less disk space, but it is quite inefficient when it comes to answering
queries that require joins. The simple models MLD0 and MLD1 do not show significant
performance differences. Passing from one model to another is shown to be easy and
comparable in time to loading data from scratch. One conversion performs significantly
worse: the mapping from multiple collections (MLD2) to one collection. Interesting
results are also obtained in the computation of the OLAP lattice with document-oriented
models, where computation times remain reasonable.
For future work, we will consider logical models for column-oriented and graph-oriented
systems. After exploring data warehouse instantiation across different NoSQL systems,
we need to generalize across logical models. We need a simple formalism to express
model differences, and we need to compare models within each paradigm and across
paradigms (e.g. document versus column).
References
1. Bosworth, A., Gray, J., Layman, A., Pirahesh, H.: Data cube: A relational aggregation
operator generalizing group-by, cross-tab, and sub-totals. Technical report MSR-TR-95-22,
Microsoft Research, February 1995
2. Chaudhuri, S., Dayal, U.: An overview of data warehousing and OLAP technology. SIGMOD
Rec. 26, 65–74 (1997)
3. Chevalier, M., El Malki, M., Kopliku, A., Teste, O., Tournier, R.: Implementing
multidimensional data warehouses into NoSQL. In: 17th International Conference on
Enterprise Information Systems (ICEIS), April 2015
4. Chevalier, M., El Malki, M., Kopliku, A., Teste, O., Tournier, R.: Benchmark for OLAP on
NoSQL technologies, comparing NoSQL multidimensional data warehousing solutions. In:
9th International Conference on Research Challenges in Information Science (RCIS). IEEE
(2015)
5. Colliat, G.: OLAP, relational, and multidimensional database systems. SIGMOD Rec. 25(3),
64–69 (1996)
6. Cuzzocrea, A., Song, I.Y., Davis, K.C.: Analytics over large-scale multidimensional data:
The big data revolution! In: 14th International Workshop on Data Warehousing and OLAP
(DOLAP 2011), pp. 101–104. ACM (2011)
7. Dede, E., Govindaraju, M., Gunter, D., Canon, R.S., Ramakrishnan, L.: Performance
evaluation of a MongoDB and Hadoop platform for scientific data analysis. In: 4th ACM
Workshop on Scientific Cloud Computing (ScienceCloud 2013), pp. 13–20. ACM (2013)
8. Dehdouh, K., Boussaid, O., Bentayeb, F.: Columnar NoSQL star schema benchmark. In: Ait
Ameur, Y., Bellatreche, L., Papadopoulos, G.A. (eds.) MEDI 2014. LNCS, vol. 8748, pp.
281–288. Springer, Heidelberg (2014)
9. Golfarelli, M., Maio, D., Rizzi, S.: The dimensional fact model: A conceptual model for data
warehouses. Int. J. Coop. Inf. Syst. 7, 215–247 (1998)
10. Kimball, R., Ross, M.: The Data Warehouse Toolkit: The Complete Guide to Dimensional
Modeling, 2nd edn. Wiley, New York (2002)
11. Mior, M.J.: Automated schema design for NoSQL databases. In: SIGMOD (2014)
12. O’Neil, P., O’Neil, E., Chen, X., Revilak, S.: The star schema benchmark and augmented fact
table indexing. In: Nambiar, R., Poess, M. (eds.) TPCTC 2009. LNCS, vol. 5895, pp. 237–
252. Springer, Heidelberg (2009)
13. Ravat, F., Teste, O., Tournier, R., Zurfluh, G.: Algebraic and graphic languages for OLAP
manipulations. IJDWM 4(1), 17–46 (2008)
14. Stonebraker, M.: New opportunities for new SQL. Commun. ACM 55(11), 10–11 (2012).
http://doi.acm.org/10.1145/2366316.2366319
15. Zhao, H., Ye, X.: A practice of TPC-DS multidimensional implementation on NoSQL
database systems. In: Nambiar, R., Poess, M. (eds.) TPCTC 2013. LNCS, vol. 8391, pp. 93–
108. Springer, Heidelberg (2014)