
World Academy of Science, Engineering and Technology

International Journal of Computer, Electrical, Automation, Control and Information Engineering Vol:9, No:12, 2015

Data Migration Methodology from Relational to NoSQL Databases
Mohamed Hanine, Abdesadik Bendarag, Omar Boutkhoum

H.M. is with the Department of Computer Science, Laboratory of Engineering and Information Systems, Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakesh, Morocco (corresponding author; phone: +212 677 34 63 41; e-mail: [email protected]).
B.A. is with the Department of Mathematics and Computer Science, Faculty of Polydisciplinary, Safi, Cadi Ayyad University, Morocco (e-mail: [email protected]).
B.O. is with the Department of Computer Science, Laboratory of Engineering and Information Systems, Faculty of Sciences Semlalia, Cadi Ayyad University, Marrakesh, Morocco (e-mail: [email protected]).

Abstract—Data migration is currently a very active topic. As the number of applications has grown rapidly, the ever-increasing volume of collected data has driven an architectural migration from Relational Database Management Systems (RDBMS) to NoSQL (Not Only SQL) databases. This recent technology has become important in the field of database management. The main aim of this paper is to present a methodology for data migration from an RDBMS to a NoSQL database. To illustrate this methodology, we implement a software prototype using MySQL as the RDBMS and MongoDB as the NoSQL database. Although this is demanding engineering work, our results show that the proposed methodology can successfully accomplish the goal of this study.

Keywords—Data Migration, MySQL, RDBMS, NoSQL, MongoDB.

I. INTRODUCTION

Currently, there are many types of databases in which large amounts of data are stored. Sometimes it is necessary to migrate data from one type of database to another, or to create a new database on a different system and move the data from the old database to the new one. In these cases, the process of data migration consists of three steps called ETL (Extract, Transform and Load) [1]: (1) extraction of data from the source database, (2) transformation of the data, and (3) loading of the data into the target database.

There are various reasons for data migration, including server or storage equipment replacements or upgrades. Here we state only a few of these reasons [1]-[3]:
• Upgrading to a new database version: When software in the organization is upgraded, including to a new version of the database, the data must be migrated to the new database.
• Existing database is insufficient: When the volume of stored data grows substantially, the capacity or speed of the existing database becomes insufficient.
• Change of organization policy: When security or another type of policy in the organization changes, we need to upgrade to a more suitable database.
• Economic problems in the organization: These can lead to uninstalling expensive commercial database software and installing a cheaper open-source database instead.
• Unifying various applications and databases: Several databases need to be unified into one specific database to ensure better efficiency and consistency of data.

In this article, two types of databases are used for data migration. The source database is relational, and the target database is a more recent type on the market: a NoSQL database [4].
• Relational database: A relational database, managed by an RDBMS, is a database based on the relational model that was developed by Edgar Codd in 1970 [5]. The basic notion of this database is the separation of the data into tables. The tables are structured in rows and columns and can be linked with each other by foreign keys. A big advantage of this database type is its ease of use, so that even untrained business users can create their own simple databases. However, the huge growth of new applications that depend on storing and processing large amounts of data has added more challenges for the RDBMS, and classical SQL systems have proved inappropriate in a variety of ways, which has led to a new database model called NoSQL.
• NoSQL database: NoSQL databases have grown tremendously in recent years owing to their less constrained structure, scalable schema design, and faster access compared to relational databases. The main feature that distinguishes the NoSQL data model is that it does not use the table as the storage structure for the data. In addition, its schema is very efficient at handling unstructured data. NoSQL databases use several modeling techniques, such as key-value stores, the document data model, and graphs, as described below [6], [7]:
- Key-value stores: Data is stored as values with a unique key assigned to each value. This type of NoSQL database sustains high performance in reading and/or writing. Currently, well-known systems that have adopted the key-value model are Voldemort, Redis and Riak.
- Document stores: A document-based database is a kind of extension of the key-value database in which the value is a document containing data represented in standard formats (JSON: JavaScript Object Notation, XML, etc.). All documents are stored in collections (equivalent to tables in SQL). The advantage of document-oriented databases is that a set of hierarchically structured information can be retrieved from a unique key. The most widely used implementations are CouchDB and MongoDB.
- Graph databases: An important aspect of graph-oriented databases is index-free adjacency: each element contains a direct pointer to its adjacent elements, so traversal does not require an index lookup for every element. The most widely used implementations are Neo4j, HyperGraphDB and FlockDB.

This paper presents a methodology for data migration from a relational database to NoSQL. During data migration, the NoSQL schema is created automatically. The proposed methodology is verified on a data migration from a MySQL relational database to a NoSQL database implemented in MongoDB.

The main reasons that led us to choose MongoDB as the target database are its many advantages:
- It is open source.
- It is a document-oriented NoSQL database.
- It functions as a distributed architecture with a central master: data is replicated on multiple servers following the master-slave principle, which allows greater fault tolerance.
- The number and type of fields in a document do not have to be defined in advance.
- It is usable from major development languages (C, Java, PHP, Python, ...) via drivers.
- It is available in multiple environments, such as Linux and Windows.
- It is a very fast database.
- It is characterized by a dynamic schema, high scalability, and strong query performance.

Fig. 1 shows the model structure of MongoDB, and Table I shows some SQL concepts and terminology and their MongoDB counterparts [8], [9].

Fig. 1 Model structure of MongoDB

TABLE I
BASIC SQL VS MONGODB SYNTAX

SQL Terminology/Concepts/Syntax | MongoDB Terminology/Concepts/Syntax
table | collection
row | document or BSON document
column | field
table joins | embedded documents and linking
primary key | primary key
create table users (id mediumint not null auto_increment, user_id varchar(30), age number, status char(1), primary key (id)) | db.users.insert({ user_id: "abc123", age: 55, status: "a" })
alter table users add xdate datetime | db.users.update({ }, { $set: { xdate: new Date() } }, { multi: true })
alter table users drop column xdate | db.users.update({ }, { $unset: { xdate: "" } }, { multi: true })
create index idx_user_id_asc on users(user_id) | db.users.ensureIndex({ user_id: 1 })
drop table users | db.users.drop()
insert into users(user_id, age, status) values ("bcd001", 45, "a") | db.users.insert({ user_id: "bcd001", age: 45, status: "a" })
select * from users where status = "a" | db.users.find({ status: "a" })

The rest of the paper is organized as follows. The second section concisely explains the proposed methodology. The third section illustrates the software prototype that demonstrates the proposed methodology. Finally, conclusions and further research are offered in the last section.

II. PROPOSED METHODOLOGY

The main aim of this study is to develop a methodology that facilitates migration from an RDBMS (MySQL) to a NoSQL database (MongoDB). The main steps of the methodology are displayed in Fig. 2, in which the steps highlighted in black circles explain how the methodology works. To present each step, we use the example database in Fig. 3, which represents the orders of a company.

Fig. 2 A methodology for data migration
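To make the extract, transform, and load flow concrete before the individual steps are presented, the following Python sketch migrates one table into one collection, following the correspondences of Table I (table to collection, row to document, column to field). It is only an illustration under stated assumptions, not the authors' Java prototype: an in-memory SQLite database stands in for MySQL, a plain dictionary of lists stands in for MongoDB, and the `customer` table, its columns, and the function names `extract`, `transform`, and `load` are hypothetical.

```python
import sqlite3


def extract(conn, table):
    """Extract: read every row of `table` as a dict keyed by column name."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def transform(rows, key_column):
    """Transform: turn each row into a document.

    The primary key becomes the document's _id, and NULL columns are
    simply left out (an assumption of this sketch, not of the paper).
    """
    docs = []
    for row in rows:
        doc = {k: v for k, v in row.items()
               if k != key_column and v is not None}
        doc["_id"] = row[key_column]
        docs.append(doc)
    return docs


def load(target, collection, docs):
    """Load: append documents to a named collection in the stand-in target."""
    target.setdefault(collection, []).extend(docs)


# Hypothetical source data; SQLite stands in for the MySQL source.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO customer VALUES (1, 'Alice', 'Marrakesh'), (2, 'Bob', NULL);
""")

target = {}  # stand-in for a MongoDB database: collection name -> documents
load(target, "customer", transform(extract(source, "customer"), "customer_id"))
print(target["customer"])
```

Leaving NULL columns out instead of storing them is one possible design choice; it relies on the property noted earlier that the fields of a document do not have to be defined in advance.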


Fig. 3 Relational database logical structure

A. Loading the Logical Structure of the Source Database

First, it is necessary to specify the relational source database. We then connect to the source database to obtain all information about its type and version. We also need to specify all the information about the NoSQL target database. Furthermore, in this step we obtain a representation of the relational model of the source database. For this we need the names of the tables, their attributes, and their relationships. The information on the relationships can be retrieved from the primary key and foreign key constraints of each table. Fig. 3 is an instance of our model representing a portion of our source database.

B. Mapping Between Relational Model and MongoDB Model

This step defines the mapping between the relational model of MySQL and the document-oriented model of MongoDB. It consists in defining which attributes of the relational model of the source database will be linked to the attributes of the target model. Loading database tables and their attributes can be implemented in different ways; one of them is to use a JDBC driver. During the migration of the relational model, the system proposes a set of suitable tables to migrate to the target database. This proposal is based on the type of the MySQL source database.

The main concern of those coming from a relational background is the absence of JOINs in NoSQL databases. As demonstrated below, the document model makes JOINs redundant in many cases. In Fig. 4, the relational database uses the “Customer_ID” field to join the “Customer” table with the “Order” table so that the application can attribute each order to the right customer. Using the document model, embedded sub-documents and arrays effectively pre-join data by aggregating related data within a single data structure. Columns and rows that were traditionally normalized and distributed across separate tables can now be stored together in a single document, eliminating the need to JOIN separate tables when the application has to retrieve complete records [10].

In this example, the application relies on the RDBMS to join four separate tables. With MongoDB, all of the data is aggregated within a single document, linked with a single reference to a customer document (Fig. 6) [10].

Fig. 4 Relational Schema, Flat 2-D Tables (Customer and Order)

Fig. 5 Richly Structured MongoDB Document

Fig. 6 The MongoDB model result of the mapping from relational model
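The pre-joining described in this step can be sketched as follows. The Python example below reads the foreign key of the child table from the database catalog, as the first step of the methodology suggests, and then embeds each order as a sub-document inside its customer document. It is an illustration under assumptions rather than the authors' implementation: SQLite (with its `PRAGMA foreign_key_list` command) stands in for MySQL and JDBC metadata, the two-table schema only mimics the Customer and Order tables of Fig. 4, the `Name` and `Total` columns are invented, and `foreign_keys`, `fetch_dicts`, and `embed` are hypothetical helper names.

```python
import sqlite3


def foreign_keys(conn, table):
    """Return (child_column, parent_table, parent_column) per foreign key."""
    rows = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    # Result columns: id, seq, table, from, to, on_update, on_delete, match
    return [(r[3], r[2], r[4]) for r in rows]


def fetch_dicts(conn, table):
    """Read all rows of a table as dicts, ordered by the first column."""
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY 1')
    names = [d[0] for d in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]


def embed(conn, parent, child, array_field):
    """Build one document per parent row with its child rows embedded.

    Assumes every child row references an existing parent row.
    """
    child_col, parent_col = next(
        (frm, to) for frm, tbl, to in foreign_keys(conn, child) if tbl == parent
    )
    docs = {}
    for row in fetch_dicts(conn, parent):
        docs[row[parent_col]] = {**row, array_field: []}
    for row in fetch_dicts(conn, child):
        # The foreign key is implied by the nesting, so it is not repeated.
        sub = {k: v for k, v in row.items() if k != child_col}
        docs[row[child_col]][array_field].append(sub)
    return list(docs.values())


# Hypothetical source schema and data; SQLite stands in for MySQL.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customer (Customer_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE orders (
        Order_ID INTEGER PRIMARY KEY,
        Customer_ID INTEGER REFERENCES customer(Customer_ID),
        Total REAL
    );
    INSERT INTO customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

customers = embed(source, "customer", "orders", "orders")
for doc in customers:
    print(doc)
```

Embedding suits this one-to-many relationship when orders are read together with their customer. The alternative listed in Table I, linking, would keep the orders in a separate collection and store only a reference to the customer.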
Modeling the same data in MongoDB enables us to create a schema in which we embed an array of sub-documents for each order directly within the customer document (Fig. 5).

III. PROPOSED SOFTWARE

In order to test the validity of the proposed methodology, we developed a software prototype programmed in Java on a PC platform. The operation sequence is demonstrated in the following paragraphs through several screenshots.

Initially, the user must connect to the MySQL system through the interface shown in Fig. 7 to choose the source database that will be migrated to MongoDB.

Fig. 7 Connection to source database

Next, Fig. 8 presents the user with the set of tables of the source database, from which the user chooses the tables to consult or migrate. If the migrated table is joined to another table, both are migrated automatically.

Fig. 8 Migration to MongoDB

Finally, the data are inserted into MongoDB. Fig. 9 presents the migrated data as collections; the user can access all the features of MongoDB using the tabs presented in this interface: Documents, Collections, and Requests.

Fig. 9 Result of migration

IV. CONCLUSION

In this paper, we have focused on data migration from relational to NoSQL databases. We have also discussed the advantages that NoSQL databases offer over RDBMS, notably a less constrained structure and faster access. Next, we presented a methodology for data migration from MySQL as the RDBMS to MongoDB as a document-oriented NoSQL database. During data migration, the model structure of the target (NoSQL) database was created automatically from the relational schema. For illustration, we proposed an application developed in Java that is based on the proposed methodology.

As a future direction, the software that we developed could be enhanced with other features and improvements, such as covering all available types of RDBMS and NoSQL databases so that migration is possible between any of them, not just between MySQL and MongoDB.

ACKNOWLEDGMENT

The authors wish to acknowledge the contributions of other members of the department of computer science for their helpful discussions and for the availability of all the resources that helped carry out this work in the best conditions. They also wish to thank Mr. Redouane Boulguid for pointing out many English corrections that led to the improvement of the paper. The authors would also like to thank the reviewers for their remarks and suggestions.


REFERENCES
[1] B. Walek, K. Cyril, “A methodology for Data Migration between Different Database Management Systems”. World Academy of Science, Engineering and Technology, Vol. 6, 2012-05-23.
[2] B. Walek, K. Cyril, “Data Migration between Document-Oriented and Relational Databases”. World Academy of Science, Engineering and Technology, Vol. 6, 2012-09-22.
[3] M. Alam and Krishan Wasan S., “Migration from Relational Database into Object Oriented Database”. Journal of Computer Science 2 (10): 781-784, 2006.
[4] X. Lixian, L. Yanhong, “Design and application of data migration system in heterogeneous Database”. International Forum on Information Technology and Applications, 2010.
[5] N. Cory, L. Travis, I. Reenu, H. Gary, “Nosql Vs Rdbms - Why There Is Room for Both”. Proceedings of the Southern Association for Information Systems Conference, Savannah, GA, USA, March 8th–9th, 2013.
[6] M.A. Mohamed, O.G. Altrafi, M.O. Ismail, “Relational vs. NoSQL Databases: A Survey”. International Journal of Computer and Information Technology (ISSN: 2279 – 0764), Volume 03 – Issue 03, May 2014.
[7] R. Cattell, “Scalable SQL and NoSQL Data Stores”. SIGMOD Record, December 2010 (Vol. 39, No. 4).
[8] MongoDB, (2014) The MongoDB 2.6 Manual: http://docs.mongodb.org/manual
[9] NoSQL, (2014) Making sense of NoSQL, A guide for managers and the rest of us: Dan McCreary and Ann Kelly. 2014.
[10] Migration Guide, (2015) RDBMS to MongoDB Migration Guide: Considerations and Best Practices, a MongoDB White Paper, February 2015.
