or “NoRel”, which marks the principal difference between that technology and the already existing ones [13]. The origin of NoSQL can be traced to BigTable, a model developed by Google [7]. That database system, BigTable, was used to store data for Google's projects, such as Google Earth. Later, Amazon developed its own system, Dynamo [5]. Both of those projects contributed greatly to the development and evolution of NoSQL. However, the term NoSQL was not popular or widely known until the meeting held in San Francisco in 2009 [20, 21]. Ever since then, NoSQL has become a buzzword.

This paper is focused on testing NoSQL databases and comparing the performance of two widely used databases, MongoDB and Cassandra. We describe the main characteristics and advantages of NoSQL databases compared to commonly used relational databases. Some advantages and innovations brought by the NoSQL model and the different existing types of NoSQL databases are also discussed, as well as the benchmarking of the two NoSQL databases under test, MongoDB and Cassandra.

The experimental evaluation of both databases tests the differences in managing data volume and scalability, and verifies how the databases respond to read/update mixes while running on a single node without a lot of memory and processor resources, much like a personal computer. More specifically, a Virtual Machine environment is used. It is common to benchmark databases on clusters with high processing power and large capacity, but the main goal of our study is to focus on servers with less capacity.

The remainder of this paper is organized as follows. Section 2 reviews related work on the topic and Section 3 gives a brief summary of NoSQL databases. Section 4 describes the comparison between MongoDB and Cassandra. Section 5 describes the YCSB – Yahoo! Cloud Serving Benchmark. In Section 6 the experimental results obtained in the study are shown. Finally, Section 7 presents our conclusions and suggests future work.

2. RELATED WORK
The performance and functional principles of NoSQL databases have been studied ever since those databases gained popularity. When analyzing different papers and studies of NoSQL databases, two types of approaches can be identified. The first is focused on comparing commonly used SQL databases with NoSQL databases, evaluating and studying performance in order to distinguish those two types of databases. The other consists of comparisons only between NoSQL databases; such studies commonly pick the best-known NoRel databases and compare their performance. However, both kinds of comparison in most cases focus on analyzing the number of operations per second and the latency of each database. While latency may be considered an important factor when working in a cluster environment, it has little value in a single-node study.

Brian F. Cooper et al. analyzed the performance of NoSQL databases and MySQL using the YCSB benchmark by relating latency to the number of operations per second [4]. In our paper the main focus is to perform studies that prioritize different execution parameters. More specifically, our goal is to relate execution time to the number of records used in each execution. More importantly, while benchmarking is commonly done on clusters with high processing power and lots of memory, it is also important to understand how these databases behave in simpler environments and when using just one server.

The main difference of our paper is its goal of studying how execution time evolves as database size increases. Although all the different studies performed are important and allow a better understanding of the capabilities of NoSQL databases and how they differ, we consider data volume a very important factor that must be evaluated. At the same time, execution time provides a better perception of performance, while the number of operations per second may be hard to analyze. When examining related work, it is also important to notice that there are not many papers discussing the performance and benchmarking of NoSQL databases. With all the aspects defined above, the main aim of our study is to increase the number of analyses and studies available, while focusing on different parameters compared to existing papers.

3. NOSQL DATABASES
The main reason for NoSQL development was Web 2.0, which increased both the use of databases and the quantity of data stored in them [8, 11]. Social networks and large companies, such as Google, interact with data at a very large scale [12]. In order not to lose performance, the need arises to scale data horizontally. Horizontal scaling can be defined as an attempt to improve performance by increasing the number of storage units [19]. A large number of computers can be connected to create a cluster whose performance exceeds that of a single node with a lot of additional CPUs and memory. With the increased adherence to social networks, information volume grew enormously. In order to fulfill users' demands and capture even more attention, multimedia sharing became more widely used and popular. Users became able to upload and share multimedia content, so the difficulty of keeping performance up and satisfying users became higher [19]. Enterprises became even more aware of efficiency and of the importance of information promptness. The most widely used social network, Facebook, developed by Mark Zuckerberg, grew rapidly, and with that, meeting all the requirements of its users became a hard task. It is difficult to state how many millions of people use this network at the same time to perform different activities. Internally, the interaction of all those users with multimedia data is represented by millions of simultaneous requests to the database. The system must be designed to handle a large number of requests and to process data in a fast and efficient way. In order to keep up with all demands as well as keep performance high, companies invest in horizontal scaling [18]. Beyond efficiency, costs are also reduced: it is cheaper to have a large number of computers with fewer resources than to build a supercomputer.

Relational databases allow data to be scaled horizontally, but NoSQL provides this in an easier way. This is due to the ACID principles and transaction support described in the next section. Since data integrity and consistency are highly important for relational databases, communication channels between nodes and clusters would have to synchronize all transactions instantly. NoSQL databases are designed to handle all types of failures. A variety of hardware failures may occur and the system must be prepared, so it is more practical to treat those concerns as eventual occurrences rather than as exceptional events.

The next sections describe the principles of operation, characteristics and different types of NoSQL databases.
3.1 ACID vs BASE
Relational databases are based on a set of principles to optimize performance. The principles used by relational or NoSQL databases are derived from the CAP theorem [11]. According to this theorem, the following guarantees can be defined:

Consistency – all nodes have the same data at the same time;
Availability – all requests get a response;
Partition tolerance – if part of the system fails, the whole system will not collapse.

ACID is a principle based on the CAP theorem and used as the set of rules for relational database transactions. ACID's guarantees are [17]:

Atomic – a transaction is completed only when all of its operations are completed, otherwise a rollback (an operation that returns the database to a consistent state) is performed;
Consistent – a transaction cannot leave the database in an invalid state; if an operation is illegal, a rollback is performed;
Isolated – all transactions are independent and cannot affect each other;
Durable – once a commit (the operation that confirms all changes made to the database as permanent) is performed, the transaction cannot be undone.

It is noticeable that those guarantees are important in order to have a robust and correct database. But when the amount of data is large, ACID may be hard to attain. That is why NoSQL focuses on the BASE principle [17, 20]:

Basically Available – all data is distributed, so even when there is a failure the system continues to work;
Soft state – there is no consistency guarantee;
Eventually consistent – the system guarantees that even when data is not consistent, eventually it will be.

It is important to notice that BASE still follows the CAP theorem and, if the system is distributed, two of the three guarantees must be chosen [1]. What to choose depends on individual needs and the purpose of the database. BASE is more flexible than ACID, and the big difference concerns consistency. If consistency is crucial, relational databases may be a better solution, but when there are hundreds of nodes in a cluster, consistency becomes very hard to accomplish.

3.2 Data access
When it comes to data access, data interaction and extraction in NoSQL databases is different: the usual SQL language cannot be used anymore. NoSQL databases tend to favor Linux, so data can be manipulated with UNIX commands. All information can be easily manipulated using simple commands such as ls, cp, cat, etc. and extracted with I/O and redirection mechanisms. Even so, since SQL became a standard and is widely used, there are NoSQL databases where an SQL-like query language can be used, for example UnQL – the Unstructured Query Language developed by Couchbase [22] – or CQL – the Cassandra Query Language [23].
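As an illustration of how close such languages stay to SQL, the following is a minimal CQL sketch; the keyspace-level details are omitted and the table and column names are hypothetical, chosen only for this example:

    CREATE TABLE users (
        user_id text PRIMARY KEY,
        name text,
        email text
    );

    INSERT INTO users (user_id, name, email)
    VALUES ('user123423', 'John', 'john@example.org');

    SELECT name, email FROM users WHERE user_id = 'user123423';

Apart from the column-family data model underneath, the statements read almost exactly like their SQL counterparts, which is what keeps the learning curve low.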
3.3 Types of NoSQL databases
With the high adherence to NoSQL, many different databases have been developed; currently there are over 150 different NoSQL databases. All of them are based on the same principles but have some distinct characteristics. Typically, four categories can be defined [9]:

Key-value Store. All data is stored as a set of keys and values. All keys are unique and data access is done by relating those keys to values. A hash contains all keys in order to provide the information when needed, but a value may not be the actual information – it may be another key. Examples of Key-value Store databases are: DynamoDB, Azure Table Storage, Riak, Redis.

Document Store. These databases can be defined as a set of key-value stores that are subsequently transformed into documents. Every document is identified by a unique key and documents can be grouped together. The document format follows well-known standards, such as XML or JSON. Data access can be done using the key or a specific value. Some examples of Document Store databases are: MongoDB, Couchbase Server, CouchDB, RavenDB.

Column-family. This is the type most similar to the relational database model. Data is structured in columns, which may be countless. One of the projects with this approach is HBase, based on Google's Bigtable [24]. The data structure and organization consist of:

o Column – represents a unit of data identified by a key and a value;
o Super-column – a group of related columns;
o Column family – a set of structured data similar to a relational database table, constituted by a variety of super-columns.

The structure of the database is defined by super-columns and column families. New columns can be added whenever necessary. Data access is done by indicating the column family, key and column in order to obtain the value, using the following structure (a concrete instance is sketched after this list):

<columnFamily>.<key>.<column> = <value>

Examples of Column-family databases: HBase, Cassandra, Accumulo, Hypertable.

Graph database. These databases are used when data can be represented as a graph, for example, social networks. Examples of Graph databases: Neo4J, Infinite Graph, InfoGrid, HyperGraphDB.
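To make the column-family addressing concrete, a record shaped like the YCSB-style records used later in this paper (a key such as user123423 with fields field0, field1, ...) could be addressed, purely as an illustration with a hypothetical column family named Users, as:

    Users.user123423.field0 = "aB3xK9..."
    Users.user123423.field1 = "pQ7mZ2..."

The keys and field names follow the record layout described in Section 5, and the values stand for the random strings generated by the benchmark.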
In the next section we describe the main characteristics of the two popular NoSQL databases under test.

4. MONGODB VS CASSANDRA
In this section we describe MongoDB and Cassandra, the databases chosen for our analysis and tests. The main characteristics to be analyzed are: data loading, reads only, a mix of reads and updates, read-modify-write, and updates only. These databases were chosen in order to compare two different types of
databases: MongoDB as a Document Store and Cassandra as a Column-family store.

4.1 MongoDB
MongoDB uses master-slave replication, in which writes are accepted by the Master and propagated asynchronously, so data on a replica will be older compared to the Master and will not match the last updates done.
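As a minimal illustration of MongoDB's document-oriented access through the mongo shell (the collection name and field values are hypothetical, mirroring the YCSB record layout used later, and this sketch is not taken from the original description):

    // insert a document; the schema is implicit in the document itself
    db.usertable.insert({ _id: "user123423", field0: "aB3xK9", field1: "pQ7mZ2" })

    // read the record back by key
    db.usertable.find({ _id: "user123423" })

    // update a single field in place
    db.usertable.update({ _id: "user123423" }, { $set: { field0: "new value" } })

These operations correspond to the insert, read and update operations exercised by the benchmark workloads.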
4.2 Cassandra
In Cassandra all nodes in the cluster play the same role, which means that there is no master. That architecture is known as peer-to-peer and overcomes master-slave limitations, providing high availability and massive scalability. Data is replicated over multiple nodes in the cluster, and it is possible to store terabytes or petabytes of data. Failed nodes are detected by gossip protocols and can be replaced with no downtime. The total number of replicas is referred to as the replication factor. For example, a replication factor of 1 means that there is only one copy of each row on one node, while a replication factor of 2 means that there are two copies of the same records, each one on a different node. There are two available replication strategies (a configuration sketch is given after this list):

Simple Strategy: recommended when using a single data center. A data center can be defined as a group of related nodes in a cluster, grouped for replication purposes. The first replica is defined by the system administrator and the additional replica nodes are chosen clockwise in the ring.

Network Topology Strategy: the recommended strategy when the cluster is deployed across multiple data centers. Using this strategy it is possible to specify the number of replicas per data center. Commonly, in order to keep fault tolerance and consistency, two or three replicas should be used in each data center.
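As a concrete, purely illustrative CQL sketch of how these strategies are selected when a keyspace is created (the keyspace names, data center names and replica counts are hypothetical):

    -- single data center, two copies of every row
    CREATE KEYSPACE example_simple
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};

    -- multiple data centers, replicas specified per data center
    CREATE KEYSPACE example_topology
      WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 2};

The replication factor and strategy are therefore properties of the keyspace rather than of individual tables.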
One of the important features of Cassandra is durability. There are two available replication types, synchronous and asynchronous, and the user is able to choose which one to use. A commit log is used to capture all writes and redundancies in order to ensure data durability.

Another important feature of Cassandra is indexing. Each node maintains all the indexes of the tables it manages, and each node knows the range of keys managed by the other nodes, so requested rows can be located by contacting only the relevant nodes. Indexes are implemented as a hidden table, separate from the actual data. In addition, multiple indexes can be created over different fields. However, it is important to understand when indexes should be used: with larger data volumes and a large number of unique values, more overhead is needed to manage the indexes. For example, in a database with millions of client records, indexing by an e-mail field, which is usually unique, will be highly inefficient.
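A hedged CQL sketch of the trade-off described above (the table and column names are hypothetical):

    -- reasonable: 'country' has few distinct values across millions of rows
    CREATE INDEX ON clients (country);

    -- inefficient: 'email' is unique per client, so the index grows as fast as the data
    CREATE INDEX ON clients (email);

Secondary indexes of this kind work best on low-cardinality columns; for unique values such as an e-mail address, querying by the row key (or modelling the e-mail as the key of a separate column family) is the usual alternative.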
All stored data can be easily manipulated using CQL – the Cassandra Query Language, based on the widely used SQL. Since the syntax is familiar, the learning curve is reduced and it is easier to interact with the data. Figure 2 shows a Cassandra client console.

Figure 2 – Cassandra console

There are different ways to use Cassandra; some of the most prominent areas of use are finance, social media, advertising, entertainment, health care, government, etc. Many companies use Cassandra, for example IBM, HP, Cisco and eBay [24].

4.3 Features comparison
In order to better understand the differences between MongoDB and Cassandra, we examine some features of these NoSQL databases, such as development language, storage type, replication, data storage, usage and some other characteristics. All those characteristics are shown in Table 1.

Table 1. MongoDB and Cassandra features

Feature               | MongoDB                                              | Cassandra
Development language  | C++                                                  | Java
Storage type          | BSON files                                           | Column
Protocol              | TCP/IP                                               | TCP/IP
Transactions          | No                                                   | Local
Concurrency           | Instant update                                       | MVCC
Locks                 | Yes                                                  | Yes
Triggers              | No                                                   | Yes
Replication           | Master-Slave                                         | Multi-Master
CAP theorem           | Consistency, Partition tolerance                     | Partition tolerance, High Availability
Operating systems     | Linux / Mac OS / Windows                             | Linux / Mac OS / Windows
Data storage          | Disc                                                 | Disc
Characteristics       | Retains some SQL properties such as query and index  | A cross between BigTable and Dynamo; high availability
Areas of use          | CMS, comment storage                                 | Banking, finance, logging

By analyzing the core properties it is possible to conclude that there are similarities when it comes to file types, querying, transactions, locks, data storage and operating systems. But it is important to notice the main difference: according to the CAP theorem, MongoDB is a CP-type system – Consistency and Partition tolerance – while Cassandra is an AP-type system – Availability and Partition tolerance. In terms of replication, MongoDB uses Master-Slave while Cassandra uses peer-to-peer replication, typically referred to as Multi-master.

In terms of usage and best application, MongoDB is better suited for Content Management Systems (CMS), with dynamic queries and frequently written data, while Cassandra is optimized to store and interact with large amounts of data in areas such as finance or advertising. In the following section we describe the benchmark used to test the MongoDB and Cassandra databases.
5. YCSB BENCHMARK
The YCSB – Yahoo! Cloud Serving Benchmark – is one of the most widely used benchmarks to test NoSQL databases [10]. YCSB has a client that consists of two parts: a workload generator and a set of scenarios. Those scenarios, known as workloads, are combinations of read, write and update operations performed on randomly chosen records. The predefined workloads are:

Workload A: Update heavy workload. This workload has a 50/50 mix of reads and updates.

Workload B: Read mostly workload. This workload has a 95/5 reads/updates mix.

Workload C: Read only. This workload is 100% reads.

Workload D: Read latest workload. In this workload, new records are inserted, and the most recently inserted records are the most popular.

Workload E: Short ranges. In this workload, short ranges of records are queried, instead of individual records.

Workload F: Read-modify-write. In this workload, the client reads a record, modifies it, and writes back the changes.

Because our focus is on update and read operations, workloads D and E are not used. Instead, to better understand update and read performance, two additional workloads were defined (a sketch of how such a workload is specified is given after this list):

Workload G: Update mostly workload. This workload has a 5/95 reads/updates mix.

Workload H: Update only. This workload is 100% updates.
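YCSB workloads are plain property files. The following is a hedged sketch of how the additional workload H (100% updates) could be written, following the CoreWorkload property names; the proportions are the defining part, while the record and operation counts shown here are only illustrative:

    # workloadh - update only
    workload=com.yahoo.ycsb.workloads.CoreWorkload
    recordcount=700000
    operationcount=700000
    readproportion=0
    updateproportion=1.0
    scanproportion=0
    insertproportion=0
    requestdistribution=zipfian

Workload G differs only in the proportions (readproportion=0.05, updateproportion=0.95).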
The loaded data consists of a set of records, each one with a certain number of fields. Each record is identified by a key, a string like “user123423”, and each field is named field0, field1 and so on. The values of the fields are random characters. For our tests we use records with 10 fields of 100 bytes each, meaning 1 KB per record.

Since the client and the server are hosted on the same node, latency is not part of this study. YCSB allows the number of threads and the number of operations per thread to be configured. During initial tests we observed that, when using several threads, the number of operations per second actually decreased. That is due to the fact that the tests run on a virtual machine with even fewer resources than the host.
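To make the procedure concrete, a single load-and-run cycle can be sketched as follows. The binding names (mongodb, cassandra-10) and the exact script invocation depend on the YCSB version, so this should be read as an assumption rather than a transcript of our runs:

    # load 700,000 records of 10 x 100-byte fields, single client thread
    bin/ycsb load mongodb -P workloads/workloadh \
        -p recordcount=700000 -p fieldcount=10 -p fieldlength=100 -threads 1

    # execute the workload and record the elapsed time
    bin/ycsb run mongodb -P workloads/workloadh -p operationcount=700000 -threads 1

For Cassandra, the same commands are issued with the Cassandra binding instead of the MongoDB one.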
6. EXPERIMENTAL EVALUATION
In this section we describe the experiments with the different workloads and data volumes. Tests were run on an Ubuntu Server 12.04 32-bit Virtual Machine on VMware Player. As experimental setup, it is important to note that the VM had 2 GB of RAM available and the host was a single-node Core 2 Quad 2.40 GHz with 4 GB of RAM running the Windows 7 operating system. The tested versions of the NoSQL databases are MongoDB 2.4.3 and Cassandra 1.2.4.

As the focus of the study, we take execution time to evaluate database performance. All workloads were executed three times, with a reset of the computer between tests. All values are shown as minutes:seconds and represent the average of the three executions.

In the following figures we show the data loading phase tests and the execution times for the different types of workloads: A, B, C, F, G and H.

Figure 3 - Data loading test (execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:45   02:00   04:42
Cassandra     00:59   02:24   05:42

To compare loading speed and throughput, different volumes of data were loaded, with 100,000, 280,000 and 700,000 records, as shown in Figure 3. Observing the results, it is possible to see that there was no significant difference between MongoDB and Cassandra. MongoDB had a slightly lower insert time, regardless of the number of records, compared to Cassandra, which has an average overhead of 24%. When the size of the loaded data increases, the execution time increases in a similar proportion for both databases, with the highest times being 04:42 for MongoDB and 05:42 for Cassandra when inserting 700,000 records.
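The 24% average overhead can be reproduced from the values in Figure 3 by converting the times to seconds and averaging Cassandra's per-volume overhead relative to MongoDB:

    100K:  59 s / 45 s  ≈ 1.31
    280K: 144 s / 120 s = 1.20
    700K: 342 s / 282 s ≈ 1.21

The mean of these ratios is about 1.24, i.e. an average overhead of roughly 24%.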
Figure 4 - Workload A experiments (50/50 reads and updates; execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:19   00:31   00:28
Cassandra     00:10   00:14   00:11

Compared to MongoDB, Cassandra had a better execution time regardless of database size. Cassandra can be 2.54 times faster than MongoDB with a mix of 50/50 reads and updates and 700,000 records. Another important fact that can be observed is the decrease in execution time when the number of records goes from 280,000 up to 700,000, for both databases (see Figure 4). This happens due to the optimization of the databases for working with larger volumes of data.
Figure 5 - Workload B experiments (95/5 reads and updates; execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:12   00:22   00:32
Cassandra     00:29   00:21   00:18

When we test the databases with a 95/5 reads/updates mix, Cassandra and MongoDB show different behavior, as illustrated in Figure 5. While the execution time for MongoDB kept increasing, Cassandra was able to reduce its time as the data volume became larger. The highest time for Cassandra was 00:29 and corresponds to querying over 100,000 records, whereas for MongoDB the highest time was 00:32 for 700,000 records. The performance of Cassandra with this workload is 56% better than MongoDB when using 700,000 records, although for small data sizes (100,000 records) MongoDB has better results.

Figure 6 - Workload C experiments (100% reads; execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:16   00:27   00:35
Cassandra     00:43   00:24   00:20

In this workload we have 100% reads. As in the previous experiments, when it comes to a large amount of read operations Cassandra becomes more efficient with a bigger quantity of data, as illustrated in Figure 6. MongoDB showed behavior similar to the previous workload, where execution time is directly proportional to data size. MongoDB is 2.68 times faster when using 100,000 records but 1.75 times slower for 700,000 records, when compared to Cassandra's execution time. The fastest execution time of MongoDB is 00:16 and of Cassandra is 00:20; however, those results correspond to opposite volumes of data, with the better execution time for Cassandra at the high number of records and for MongoDB with just 100,000 records.

Figure 7 - Workload F experiments (read-modify-write; execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:19   00:21   00:36
Cassandra     00:40   00:21   00:20

In this workload the client reads a record, modifies it, and writes back the changes. Here Cassandra and MongoDB showed opposite behavior, as illustrated in Figure 7. Cassandra's highest execution time occurred with the small data volume and kept decreasing as the volume increased, while MongoDB had its worst time with the biggest data size. MongoDB is 2.1 times faster for querying over 100,000 records but 1.8 times slower for 700,000 records, with the same value for 280,000 records, when compared to Cassandra's execution time. The smallest execution time variations were 00:01 for Cassandra when increasing the number of records from 280,000 up to 700,000, and 00:02 for MongoDB when lowering the number of records from 280,000 down to 100,000.

Figure 8 - Workload G experiments (5/95 reads and updates; execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:23   00:31   00:36
Cassandra     00:01   00:02   00:03

This workload has a 5/95 reads/updates mix. The results shown in Figure 8 clearly demonstrate the superiority of Cassandra over MongoDB for all database sizes; in every execution Cassandra showed better results. With the growth of data volume both Cassandra and MongoDB started having higher execution times, but MongoDB was not even close to Cassandra. The performance of Cassandra with this workload varies from 23 to 12 times faster than MongoDB. That difference in performance allows us to conclude that, in this environment, Cassandra is more optimized for update operations than MongoDB, showing surprisingly high performance results.
Figure 9 - Workload H experiments (100% updates; execution time, min:sec, by number of records)

              100K    280K    700K
MongoDB       00:25   00:27   00:43
Cassandra     00:01   00:01   00:01

With a 100% update workload, Cassandra had stable performance even with an increased number of records, as shown in Figure 9. Similarly to the results of workload G, Cassandra showed great results compared to MongoDB, being from 25 to 43 times faster. For MongoDB, the difference in execution time between 100,000 and 280,000 records was not big, but the time almost doubled when using 700,000 records.

7. CONCLUSIONS AND FUTURE WORK
The development of the Web needs databases able to store and process big data effectively, with high performance for both reads and writes, so traditional relational databases are facing many new challenges. NoSQL databases have gained popularity in recent years and have been successful in many production systems. In this paper we analyze and evaluate two of the most popular NoSQL databases: MongoDB and Cassandra. In the experiments we test the execution time as a function of database size and workload type. We test six different types of workloads: a 50/50 mix of reads and updates; a 95/5 reads/updates mix; read only; a read-modify-write cycle; a 5/95 reads/updates mix; and update only. With the increase of data size, MongoDB started to lose performance, sometimes showing poor results, while Cassandra just got faster as the amount of data grew. After running the different workloads to analyze read/update performance, it is also possible to conclude that, when it comes to update operations, Cassandra is faster than MongoDB, providing lower execution times independently of the database size used in our evaluation. The overall analysis shows that MongoDB fell short as the number of records increased, while Cassandra still had a lot to offer; in conclusion, Cassandra shows the best results for almost all scenarios.

As future work, we intend to analyze the number of operations per second versus database size. That would help to understand how these databases behave as the number of records to read/update grows with the data volume.

8. ACKNOWLEDGMENTS
Our thanks to ISEC – Coimbra Institute of Engineering of the Polytechnic Institute of Coimbra for allowing us to use the facilities of the Laboratory of Research and Technology Innovation of the Computer Science and Systems Engineering Department.

9. REFERENCES
[1] Brewer, E., "CAP twelve years later: How the 'rules' have changed," Computer, vol. 45, no. 2, pp. 23-29, Feb. 2012. doi:10.1109/MC.2012.37.
[2] Codd, E. F. 1970. A relational model of data for large shared data banks. Communications of the ACM 13, 6 (June 1970), 377-387. doi:10.1145/362384.362685.
[3] Codd, E. F. 1985. "Is your DBMS Really Relational?" and "Does your DBMS Run by the Rules?" ComputerWorld, October 14 and October 21.
[4] Cooper, B. F., Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking cloud serving systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC '10). ACM, New York, NY, USA, 143-154. doi:10.1145/1807128.1807152. http://doi.acm.org/10.1145/1807128.1807152.
[5] DeCandia, Giuseppe, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. Dynamo: Amazon's highly available key-value store. In Proceedings of the Twenty-First ACM SIGOPS Symposium on Operating Systems Principles (SOSP '07). ACM, New York, NY, USA, 205-220.
[6] Donald D. Chamberlin, Raymond F. Boyce: SEQUEL: A Structured English Query Language. SIGMOD Workshop, Vol. 1, 1974: 249-264.
[7] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2006. Bigtable: a distributed storage system for structured data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7 (OSDI '06), Vol. 7. USENIX Association, Berkeley, CA, USA, 15-15.
[8] Hecht, R.; Jablonski, S., "NoSQL evaluation: A use case oriented survey," Cloud and Service Computing (CSC), 2011 International Conference on, pp. 336-341, 12-14 Dec. 2011. doi:10.1109/CSC.2011.6138544.
[9] Indrawan-Santiago, M., "Database Research: Are We at a Crossroad? Reflection on NoSQL," Network-Based Information Systems (NBiS), 2012 15th International Conference on, pp. 45-51, 26-28 Sept. 2012. doi:10.1109/NBiS.2012.95.
[10] Jayathilake, D.; Sooriaarachchi, C.; Gunawardena, T.; Kulasuriya, B.; Dayaratne, T., "A study into the capabilities of NoSQL databases in handling a highly heterogeneous tree," Information and Automation for Sustainability (ICIAfS), 2012 IEEE 6th International Conference on, pp. 106-111, 27-29 Sept. 2012. doi:10.1109/ICIAFS.2012.6419890.
[11] Jing Han; Haihong, E.; Guan Le; Jian Du, "Survey on NoSQL database," Pervasive Computing and Applications (ICPCA), 2011 6th International Conference on, pp. 363-366, 26-28 Oct. 2011. doi:10.1109/ICPCA.2011.6106531.
[12] Leavitt, N., "Will NoSQL Databases Live Up to Their Promise?," Computer, vol. 43, no. 2, pp. 12-14, Feb. 2010. doi:10.1109/MC.2010.58.
[13] Lith, Adam; Jakob Mattson. 2010. "Investigating storage solutions for large data: A comparison of well performing and scalable data storage solutions for real time extraction and batch insertion of data." Göteborg: Department of Computer Science and Engineering, Chalmers University of Technology.
[14] Lombardo, S.; Di Nitto, E.; Ardagna, D., "Issues in Handling Complex Data Structures with NoSQL Databases," Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), 2012 14th International Symposium on, pp. 443-448, 26-29 Sept. 2012. doi:10.1109/SYNASC.2012.59.
[15] M. M. Astrahan, A history and evaluation of System R, Performance Evaluation, Volume 1, Issue 1, January 1981, Page 95, ISSN 0166-5316. doi:10.1016/0166-5316(81)90053-5.
[16] nosql-database.org, accessed on 30th April 2013.
[17] Roe, C. 2012. "ACID vs. BASE: The Shifting pH of Database Transaction Processing." http://www.dataversity.net/acid-vs-base-the-shifting-ph-of-database-transaction-processing/.
[18] Shidong Huang; Lizhi Cai; Zhenyu Liu; Yun Hu, "Non-structure Data Storage Technology: A Discussion," Computer and Information Science (ICIS), 2012 IEEE/ACIS 11th International Conference on, pp. 482-487, May 30 2012-June 1 2012. doi:10.1109/ICIS.2012.76.
[19] Silberstein, A.; Jianjun Chen; Lomax, D.; McMillan, B.; Mortazavi, M.; Narayan, P. P. S.; Ramakrishnan, R.; Sears, R., "PNUTS in Flight: Web-Scale Data Serving at Yahoo," Internet Computing, IEEE, vol. 16, no. 1, pp. 13-23, Jan.-Feb. 2012. doi:10.1109/MIC.2011.142.
[20] Tudorica, B. G.; Bucur, C., "A comparison between several NoSQL databases with comments and notes," Roedunet International Conference (RoEduNet), 2011 10th, pp. 1-5, 23-25 June 2011. doi:10.1109/RoEduNet.2011.5993686.
[21] Yahoo! Developer Network 2009. Notes from NoSQL Meetup. http://developer.yahoo.com/blogs/ydn/notes-nosql-meetup-7663.html.
[22] http://www.couchbase.com/press-releases/unql-query-language, accessed on 30th April 2013.
[23] http://www.datastax.com/docs/1.0/references/cql/index, accessed on 30th April 2013.
[24] http://cassandra.apache.org/, accessed on 30th April 2013.
[25] http://docs.mongodb.org/ecosystem/tools/administration-interfaces/, accessed on 30th April 2013.
[26] http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra, accessed on 30th April 2013.