Comparison of Relational Database With Document-Oriented Database (MongoDB) For Big Data Applications
Satyadhyan Chickerur
Centre for High Performance Computing
B V Bhoomaraddi College of Engineering and Technology
Hubli, India
Anoop Goudar
Tata Consultancy Services
Hyderabad, India
Ankita Kinnerkar
Tesco HSC
Bengaluru, India
ankita.kinnerkar@gmail.com
Abstract— Databases can accommodate a very large number of users on an on-demand basis. The main limitations of
conventional relational database management systems (RDBMS) are that they are hard to scale for data warehousing, grid, Web
2.0 and cloud applications, have non-linear query execution times, have unstable query plans and have static schemas. Although
RDBMSs have provided database users with the best mix of simplicity, robustness, flexibility, performance, scalability and
compatibility, they are not able to satisfy present-day users and applications for the reasons mentioned above. The
next-generation NoSQL databases are mostly non-relational, distributed and horizontally scalable, and are able to satisfy
most of the needs of present-day applications. The main characteristics of these databases are that they are schema-free,
join-free and non-relational, offer easy replication support and a simple API, and are eventually consistent. The aim of this
paper is to illustrate how a problem being solved using MySQL performs when MongoDB is used on a big data dataset. The
results are encouraging and clearly showcase the comparisons made. Queries are executed on a big data airlines database using
both MongoDB and MySQL. Select, update, delete and insert queries are executed and their performance is evaluated.
I. INTRODUCTION
Relational databases are well suited to enforcing data integrity. They are the tool of choice for online transaction processing
(OLTP) applications such as data entry systems or online ordering applications. An RDBMS requires that data be normalized so
that it can provide quality results and prevent orphan records and duplicates. It uses primary and secondary keys and indexes to
allow queries to retrieve data quickly. However, the data integrity that an RDBMS ensures comes at a cost. Normalizing data
requires more tables, which requires more table joins, and thus more keys and indexes. As databases grow into the terabytes,
performance starts to fall off significantly. Often, hardware is thrown at the problem, which can be expensive both from a
capital standpoint and from an ongoing maintenance and support standpoint [1,2].
One of the popular document-oriented databases is MongoDB [3]. It is part of the NoSQL family of database systems.
Instead of storing data in tables as is done in a "classical" relational database, MongoDB stores structured data as JSON-like
documents with dynamic schemas, making the integration of data in certain types of applications easier and faster [4].
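The document model described above can be illustrated with a minimal sketch. It assumes plain Python dictionaries as stand-ins for JSON-like documents and a Python list as a stand-in for a collection; the field values, the carrier code "XX" and the nested "Cancellation" field are hypothetical and are not taken from the dataset.

```python
import json

# Hypothetical flight records; the values are illustrative only.
flight_a = {"Year": 1988, "Month": 1, "UniqueCarrier": "XX", "FlightNum": 1451, "DepTime": 957}

# A second document may carry extra (even nested) fields without any schema change.
flight_b = {
    "Year": 1988,
    "Month": 1,
    "UniqueCarrier": "XX",
    "FlightNum": 1453,
    "Cancelled": 1,
    "Cancellation": {"Code": "A"},
}

# A plain list stands in for a MongoDB collection in this sketch.
collection = [flight_a, flight_b]


def fields_used(docs):
    """Return the sorted union of top-level field names across all documents."""
    return sorted({key for doc in docs for key in doc})


if __name__ == "__main__":
    for doc in collection:
        print(json.dumps(doc))
    print(fields_used(collection))
```

In a relational table both rows would need the same columns, so adding the cancellation fields would require altering the table; here the two documents simply differ in shape.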
II. COMPARISON BETWEEN MYSQL AND MONGODB
A large airlines database [5] with 1,050,000 records is considered. The attributes are Year, Month, DayofMonth,
DayOfWeek, DepTime, CRSDepTime, ArrTime, CRSArrTime, UniqueCarrier, FlightNum, TailNum, ActualElapsedTime,
CRSElapsedTime, AirTime, ArrDelay, DepDelay, Origin, Dest, Distance, TaxiIn, TaxiOut, Cancelled, CancellationCode,
In SQL,
Insert into project1988
Values ( );
In SQL,
select * from project1988;
In MongoDB,
db.project1988.update ( );
In SQL,
Update project1988 set deptime='957';
In MongoDB,
db.project1988.remove ( );
In SQL,
delete from project1988 where deptime='957';
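The four operations compared above can be exercised end to end with a small sketch. It makes several assumptions: Python's built-in sqlite3 module stands in for MySQL so that the example runs without a server, the table is reduced to two columns (flightnum is a hypothetical simplification of the FlightNum attribute), and the inserted values are invented. The pymongo calls named in the comments are the usual driver counterparts and are shown for orientation only; they are not executed here and are not the arguments used in the paper.

```python
import sqlite3

# sqlite3 stands in for MySQL so the sketch runs without a database server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project1988 (flightnum INTEGER, deptime TEXT)")

# Insert. pymongo counterpart: collection.insert_one({"flightnum": 101, "deptime": "930"})
conn.executemany(
    "INSERT INTO project1988 (flightnum, deptime) VALUES (?, ?)",
    [(101, "930"), (102, "957"), (103, "1015")],
)

# Select. pymongo counterpart: collection.find({})
rows_after_insert = conn.execute(
    "SELECT flightnum, deptime FROM project1988 ORDER BY flightnum"
).fetchall()

# Update one row.
# pymongo counterpart: collection.update_many({"flightnum": 101}, {"$set": {"deptime": "957"}})
conn.execute("UPDATE project1988 SET deptime = '957' WHERE flightnum = 101")

# Delete every row with the given departure time.
# pymongo counterpart: collection.delete_many({"deptime": "957"})
conn.execute("DELETE FROM project1988 WHERE deptime = '957'")

remaining = conn.execute("SELECT flightnum, deptime FROM project1988").fetchall()

if __name__ == "__main__":
    print(rows_after_insert)
    print(remaining)
```

Note that an update without a WHERE clause changes every row, and that a SQL DELETE always removes whole rows rather than a single column.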
The implementation has two major steps:
A. Extracting the data from MySQL to csv files
Input: Table from MySQL
Output: csv file generation
Algorithm:
Step1: Select the particular table for migration.
Step2: Generate the csv file for the particular table selected.
Step3: Save the generated csv file in the project folder.
Step4: If the csv file is generated successfully then go to step 6.
Step5: Else go to step 7.
Step6: Move the generated csv file to the next major step for migrating it to MongoDB, then go to step 8.
Step7: Ask the user to once again identify the table for migration and return to step 1.
Step8: End.
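The extraction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: sqlite3 stands in for MySQL so the example is self-contained, and the function name export_table_to_csv, the two-column table and its single row are hypothetical.

```python
import csv
import os
import sqlite3
import tempfile


def export_table_to_csv(conn, table, csv_path):
    """Write every row of `table` to `csv_path` with a header row; return the row count."""
    # Table names cannot be bound as SQL parameters, so accept simple identifiers only.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    header = [column[0] for column in cursor.description]
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            count += 1
    return count


def demo():
    """Export a tiny hypothetical table and read the csv file back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE project1988 (flightnum INTEGER, deptime TEXT)")
    conn.execute("INSERT INTO project1988 VALUES (101, '957')")
    with tempfile.TemporaryDirectory() as folder:
        csv_path = os.path.join(folder, "project1988.csv")
        count = export_table_to_csv(conn, "project1988", csv_path)
        with open(csv_path, newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
    conn.close()
    return count, rows


if __name__ == "__main__":
    print(demo())
```

A caller that wants the retry behaviour of steps 4 to 7 could catch sqlite3.Error (raised, for example, when the table does not exist) and ask the user for another table name.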
B. Dumping the extracted data in the csv files to MongoDB
Input: csv file
Output: Collection of migrated data in MongoDB.
Algorithm:
Step1: Start the MongoDB server.
Step2: Start the MongoDB client.
Step3: Set the path to the MongoDB bin directory.
Step4: Import the data from csv file.
Step5: Display the imported data to user.
Step6: End.
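The import step above is normally handled by MongoDB's own mongoimport tool, which accepts csv input with a header line. As a hedged sketch of what such an import does, the code below converts csv rows into documents and hands them to any object exposing an insert_many method, which is the method a pymongo collection provides. The FakeCollection class, the helper names and the sample rows are hypothetical and exist only so that the example runs without a MongoDB server.

```python
import csv
import io


def csv_to_documents(handle):
    """Turn csv rows into dicts keyed by the header row, converting integers where possible."""
    documents = []
    for row in csv.DictReader(handle):
        doc = {}
        for key, value in row.items():
            # Keep non-numeric values (carrier codes, "NA" markers) as strings.
            try:
                doc[key] = int(value)
            except (TypeError, ValueError):
                doc[key] = value
        documents.append(doc)
    return documents


def load_into_collection(collection, documents):
    """Insert documents through any object exposing insert_many; return how many were sent."""
    if documents:  # an empty batch is skipped rather than sent to the database
        collection.insert_many(documents)
    return len(documents)


class FakeCollection:
    """In-memory stand-in for a MongoDB collection (hypothetical, for illustration)."""

    def __init__(self):
        self.docs = []

    def insert_many(self, documents):
        self.docs.extend(documents)


def demo():
    """Import two hypothetical rows and return the stored documents."""
    sample = io.StringIO("Year,UniqueCarrier,DepTime\n1988,XX,957\n1988,YY,NA\n")
    target = FakeCollection()
    load_into_collection(target, csv_to_documents(sample))
    return target.docs


if __name__ == "__main__":
    for document in demo():
        print(document)
```

Replacing FakeCollection with a real driver collection would send the same documents to a running server; that substitution is not exercised here.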
After executing step A, the csv files can be viewed as in Fig. 2. Fig. 3 shows the data in MongoDB after the execution of
step B.
Figure 3. Imported data in MongoDB from csv file.
IV. RESULTS
The results for the queries briefly stated above are shown in this section. They come from an application that is still
under development.
A. Relational table insert vs. MongoDB insert
The execution time is calculated for different numbers of attributes. MySQL takes 0.08 seconds to insert a tuple
whereas MongoDB takes only 0.06 seconds. The results of experimentation are as shown in Fig. 4 and Table I.
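The paper does not state how the execution times were measured. One plausible way to obtain a per-operation figure is to average wall-clock time over repeated runs, as in the sketch below. The helper name mean_seconds, the repeat count and the use of sqlite3 as a stand-in for MySQL are assumptions made for illustration; the numbers it produces are not comparable with those reported here.

```python
import sqlite3
import time


def mean_seconds(operation, repeats=100):
    """Return the mean wall-clock time of calling `operation()` over `repeats` runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        operation()
    return (time.perf_counter() - start) / repeats


def demo(repeats=100):
    """Time single-row inserts into a hypothetical two-column table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE project1988 (flightnum INTEGER, deptime TEXT)")

    def insert_one_row():
        conn.execute("INSERT INTO project1988 VALUES (101, '957')")

    elapsed = mean_seconds(insert_one_row, repeats)
    inserted = conn.execute("SELECT COUNT(*) FROM project1988").fetchone()[0]
    conn.close()
    return elapsed, inserted


if __name__ == "__main__":
    seconds, rows = demo()
    print(f"{rows} rows inserted, mean {seconds:.6f} s per insert")
```

The same helper can time any callable, so an insert issued through a MongoDB driver could be measured in the same way for a like-for-like comparison.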
TABLE I. TIME TAKEN FOR INSERT OPERATION
Figure 6. Update operation in MongoDB and MySQL.
TABLE IV. TIME TAKEN FOR DELETE OPERATION
V. CONCLUSION
This paper compares MongoDB and MySQL and reports performance-testing results for insert, select, update
and delete operations. The results are encouraging for the various operations that may be carried out in big data
applications involving very large databases.
REFERENCES
[1] G. Eason, Lara Nichols, “A comparison of object-relational and relational databases”, presented to the Faculty of California, chapter 4, pp. 6-7.
[2] Jae Jin Koh (3-6 October 2007), “Relational database schema integration by overlay and redundancy elimination methods”, in International Forum on
Strategic Technology (2007), Institute of Electrical and Electronics Engineers, IEEE Computer Society.
[3] https://en.wikipedia.org/wiki/MongoDB. Last accessed on: July 30, 2015.
[4] “MongoDB”, http://www.mongodb.org/. Last accessed on: July 30, 2015.
[5] “The Airline Data Set”, http://stat-computing.org/dataexpo/2009/. Last accessed on: August 1-7-2015.
[6] Department of Education Office of Federal Student Aid, Data Migration Roadmap, “A Best Practice Summary”, pp. 5-6.