
Bhandari and Chitrakar, J Comput Eng Inf Technol 2020, 9:6
DOI: 10.37532/jceit.2020.9(6).241

Journal of Computer Engineering & Information Technology

Review Article (A SciTechnol Journal)

Comparison of Data Migration Techniques from SQL Database to NoSQL Database

Hira Lal Bhandari* and Roshan Chitrakar

*Corresponding author: Hira Lal Bhandari, Faculty of Science, Health and Technology, Nepal Open University, Nepal. E-mail: [email protected]

Received: November 01, 2020; Accepted: December 14, 2020; Published: December 21, 2020

Abstract

With the rapid and multi-dimensional growth of data, Relational Database Management Systems (RDBMS) with Structured Query Language (SQL) support are facing difficulties in managing huge data due to the lack of a dynamic data model, performance and scalability issues, etc. NoSQL databases address these issues by providing the features that SQL databases lack, so many organizations are migrating from SQL to NoSQL. An RDBMS deals with structured data, whereas a NoSQL database deals with structured, unstructured and semi-structured data. As applications develop continuously, a huge volume of collected data has already undergone architectural migration from SQL databases to NoSQL databases. Since NoSQL is an emerging and evolving technology in the field of database management, and because of the increased maturity of NoSQL database technology, many applications have already switched to NoSQL in order to extract information from big data. This study discusses, analyzes and compares 7 (seven) different techniques of data migration from SQL database to NoSQL database. The migration is performed using the appropriate tools / frameworks available for each technique, and the results are evaluated, analyzed and validated using a system tool called SysGauge. The parameters used for the analysis and comparison are Speed, Execution Time, Maximum CPU Usage and Maximum Memory Usage. At the end of the entire work, the most efficient techniques are recommended.

Keywords

Data Migration; MySQL; RDBMS; Unstructured Data; SysGauge

Introduction

In 1970, Edgar Frank Codd introduced an architectural framework for the relational database approach in his paper "A relational model of data for large shared data banks" [1]. Some time later, Codd introduced Structured English Query Language, later renamed Structured Query Language (SQL), to provide a way to access data in a relational database [2]. Since then, the relational model has been the dominant form in the database market. The most popular database management systems are Oracle, Microsoft SQL Server and MySQL [2]. All three DBMSs are based on the relational database model and use SQL as the query language. When the NoSQL database was introduced by Carlo Strozzi in 1998 as a file-based database, it was used to represent a relational database without using Structured Query Language. However, it was not able to compete with relational databases. Later, Eric Evans, an employee of the Rackspace company, explained the ambition of the NoSQL movement as a new trend to solve problems for which relational databases are not fit. The increasing usage of NoSQL products has energized other companies to develop their own solutions and has led to the emergence of generic NoSQL database systems. As a result, there are more than 150 NoSQL products. These products come with issues such as suitability to certain areas of application, security and reliability [3].

NoSQL databases have been emerging over the last few years due to their less constrained structure, scalable schema design, and faster access in comparison to relational databases. The key attribute that makes them different from relational databases is that they do not use the table as the storage structure of the data. In addition, their schemas are very efficient in handling unstructured data. NoSQL databases also use many modeling techniques such as key-value stores, document data models, and graph databases [1].

This research study aims to present a comparative study of data migration techniques from SQL database to NoSQL database. It analyses 7 (seven) recent approaches [4] which have been proposed for data migration from SQL database to NoSQL database.

Statement of the problem

There is nothing wrong in using a traditional RDBMS for database management. However, with the huge influx of data from social sites and other digital media, it is simply not enough for applications dealing with huge databases. Also, NoSQL databases run on cheap hardware. Hence, some relational databases need to be converted to NoSQL databases, which makes it possible to overcome the drawbacks found in relational databases. Some drawbacks of relational database management systems are:

1. They do not encompass a wide range of data models in data management.

2. They are not easily scalable because of their constrained structure.

3. They are not efficient and flexible for unstructured and semi-structured data.

4. They cannot handle data during hardware failure.

Due to the massive use of mobile computing, cloud computing, the Internet of Things and many other digital technologies, a large volume of streaming data is available nowadays. Such huge amounts of data pose great challenges to the traditional relational database paradigm. Those challenges are related to performance, scalability and distribution. To overcome them, enterprises have begun to move towards implementing a new database paradigm known as NoSQL [5].

On the other hand, NoSQL databases offer several different models for accessing and managing data, each suited to specific use cases. This is also a significant reason to migrate data from SQL database to NoSQL database. The models are summarized in Table 1.
NoSQL DBMSs are distributed, non-relational databases. They

All articles published in Journal of Computer Engineering & Information Technology are the property of SciTechnol, International Publisher of Science, Technology and Medicine, and are protected by copyright laws. Copyright © 2020, SciTechnol, All Rights Reserved.
Citation: Bhandari HL, Chitrakar R (2020) Comparison of Data Migration Techniques from SQL Database to NoSQL Database. J Comput Eng Inf Technol 9:6. doi: 10.37532/jceit.2020.9(6).241

Table 1: NoSQL database models.

Model Characteristics
Document Store Data and metadata are stored hierarchically in JSON-based documents inside the database.
Key-Value Store The simplest of the NoSQL databases; data is represented as a collection of key-value pairs.
Wide-Column Store Related data is stored as a set of nested key/value pairs within a single column.
Graph Store Data is stored in a graph structure as nodes, edges, and data properties.
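As a concrete illustration of Table 1, the same logical record can be laid out under each of the four models. This is a minimal sketch; the record contents and key layouts are invented for illustration:

```python
# The same logical record expressed under each of the four NoSQL data
# models of Table 1. Record contents and key layouts are illustrative.
row = {"user_id": 1, "user_name": "sita", "Status": "active"}

# Document store: a hierarchical, JSON-like document.
document = {"_id": row["user_id"],
            "user": {"name": row["user_name"], "status": row["Status"]}}

# Key-value store: one opaque value addressed by a single key.
key_value = {f"user:{row['user_id']}": '{"name": "sita", "status": "active"}'}

# Wide-column store: nested key/value pairs grouped under a row key
# and a column family.
wide_column = {"user:1": {"info": {"name": "sita", "status": "active"}}}

# Graph store: nodes, edges and properties.
graph = {"nodes": [{"id": 1, "label": "User",
                    "props": {"name": row["user_name"]}}],
         "edges": []}
```

The point of the comparison is that the storage unit differs in each case (document, opaque value, column family, node), even though the logical record is identical.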

are designed for large-scale data storage and for massive parallel data processing across a large number of commodity servers. They use non-SQL languages and mechanisms to interact with data. The use of NoSQL database systems in database management increased in major Internet companies, such as Google, Amazon, and Facebook, which faced challenges in dealing with huge quantities of data that conventional RDBMS solutions could not cope with. These systems can support multiple activities, including exploratory and predictive analytics, ETL-style data transformation, and non-mission-critical OLTP. They are designed to scale to thousands or millions of users doing updates as well as reads, in contrast to traditional DBMSs and data warehouses [6].

The focus of the study is a comparative study of seven different techniques to migrate data from a relational database to a NoSQL database. Migration of data from a relational database to a NoSQL database refers to the transformation of data from a structured and normalized database to a flexible, scalable and less constrained NoSQL database. The main objective of this research is to find out the most efficient data migration technique among seven major migration techniques from SQL database to NoSQL database.

Scope and Limitations of the Research Study

The scope and limitations of this research cover the following. This study analyzes different techniques to migrate data from SQL database to NoSQL database in order to identify the most efficient migration technique, so that one can efficiently adopt this emerging technology in the database world. Therefore, the study does not include technical discussion of the risks identified, or of implementation guidelines. The demand for NoSQL databases is increasing because of their diversified characteristics, which offer rapid and smooth scalability, great availability, distributed architecture, significant performance and rapid development agility. NoSQL provides a wide range of data models to choose from and is easily scalable without requiring database administrators. Some SQL-to-NoSQL data migration providers, such as Riak and Cassandra, are programmed to handle hardware failures and are faster, more efficient and flexible. The technology has evolved at a very high pace.

However, some data migration techniques and NoSQL databases are still immature, and they do not have a standard query language. Some NoSQL databases are not ACID compliant. Lack of standards and data loss are the major problems while migrating data from SQL database to NoSQL database.

Review of Related Works

This research study provides a comparative study of different data migration approaches from SQL database to NoSQL databases. It focuses on the study of major migration techniques and suggests the most efficient approach for data migration. The migration process is performed with the help of available tools/frameworks.

SQL databases and other traditional databases strictly follow a structured way to organize the data generated from various applications, but NoSQL databases provide flexibility and scalability in organizing the data, which makes it easy to access. The data generated from social networking sites and real-time applications needs a flexible and scalable system, which increases the requirement for NoSQL. Hence, a multidimensional model has been proposed for data migration. The biggest challenge is the migration of existing data residing in a data warehouse to a NoSQL database while maintaining the characteristics of the data. The growing use of web applications has raised the demand for NoSQL because traditional databases are unable to handle the rapidly growing data [4].

The concept of NoSQL was first used in 1998 by Carlo Strozzi to represent an open source database that does not use an SQL interface. Strozzi likes to refer to NoSQL as "noseequel" since there is a difference between this technology and the relational model. The white paper published by Oracle mentions techniques and utilities for migrating non-Oracle databases to Oracle databases [7]. Abdelsalam Maatuk [8] describes an investigation into approaches and techniques used for database conversion. The origin of NoSQL is also attributed to the invention of Google's BigTable model. This database system, BigTable, is used for storage of projects developed by Google, for example Google Earth. BigTable is a compressed, high-performance database which was initially released in 2005 and is built on the Google File System. It was developed using the C and C++ languages. It provides consistency, fault tolerance and persistence. It is designed to scale across thousands of machines, and it is easy to add more machines to it [9]. Later, Amazon developed the fully managed NoSQL database service DynamoDB, which provides fast, highly reliable and cost-effective NoSQL database services designed for internet-scale applications [9]. These projects directed a step towards the evolution of NoSQL.

However, the term re-emerged only in 2009, at a meeting in San Francisco organized by Johan Oskarsson. The name for the meeting, NoSQL meetup, was given by Eric Evans, and from there on NoSQL became a buzzword [8]. Many early papers discussed the relationship between relational and NoSQL databases, giving a brief introduction to NoSQL databases, their types and characteristics. They also discussed structured and non-structured databases and explained how the use of a NoSQL database like Cassandra improved the performance of the system; in addition, it can scale the network without changing any hardware or buying a bigger server, improving network scalability with low-cost commodity hardware [10].

Sunita Ghotiya [4] gave a literature review of some of the recent approaches proposed by various researchers to migrate data from relational to NoSQL databases. Arati Koli and Swati Shinde [11] presented a comparison among five different techniques to migrate from SQL database to NoSQL database with the help of different research paper reviews. Shabana Ramzan, Imran Sarwar Bajwa and Rafaqut Kazmi [12] presented a comparison of transformations in tabulated format with different parameters such as source database, target database, schema conversion, data conversion, conversion time, data set, techniques and reference papers, which clearly shows the research gap that currently no approach or tool supports automated transformation of MySQL to Oracle NoSQL for both data and

Volume 9 • Issue 6 • 1000241 • Page 2 of 10 •



schema transformation. Arnab Chakrabarti and Manasi Jayapal [13] present an empirical comparative study to compare and evaluate data transformation methodologies between varied data sources, and discuss the challenges and opportunities associated with those transformation methodologies. The databases used in the transformation were heterogeneous in nature.

In this way, this research study explores the issues regarding relational databases, their features and shortcomings, as well as NoSQL and its features. It emphasizes a comparative study of the migration approaches from structured (SQL) databases to NoSQL databases. In the present scenario, most applications are being transformed to NoSQL databases because of the incremental growth of heterogeneous data. Under such conditions, an SQL database no longer has the ability to handle such complex datasets, so there is a need to migrate structured and normalized datasets into NoSQL databases. Accordingly, the research study is focused on performing major migration techniques to transfer data from an SQL database, i.e. MySQL, to NoSQL databases, i.e. MongoDB, Hadoop database, etc. Seven major migration approaches are discussed and used to perform the migration task.

The comparative study presented in this research could serve as guidelines for organizations which are shifting their applications towards NoSQL databases. This research will be helpful in choosing the most efficient migration approach to transfer a structured and normalized database into a NoSQL database.

Figure 1: Workflow to run the transformation.
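The transformation workflow just described (run one migration job in isolation, record its quantitative characteristics, then reset) can be sketched as follows. This is a simplified sketch: the in-memory "migration" stands in for the real tools, and the CPU/memory figures, which the study takes from SysGauge, are not captured here.

```python
import time

def run_and_measure(migrate, rows):
    """Run one migration job in isolation and record the quantities
    from this study that a script can capture directly: execution
    time and speed (records per second)."""
    start = time.perf_counter()
    migrated = migrate(rows)
    elapsed = time.perf_counter() - start
    return {"records": len(migrated),
            "execution_time_s": elapsed,
            "speed_records_per_s": (len(migrated) / elapsed
                                    if elapsed > 0 else float("inf"))}

# Stand-in job: copy 1000 rows (the study's data set size) between
# two in-memory stores.
result = run_and_measure(lambda rows: [dict(r) for r in rows],
                         [{"user_id": i} for i in range(1000)])
```

In the study itself, each approach's tool plays the role of `migrate`, and the machine is reset between jobs so that no other variable affects the result.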

Methodology

This research study evaluates major migration approaches which have been proposed in previous research papers. The evaluation is done through a comparative study of the migration approaches' efficiency, measured with different parameters: Speed, Execution Time, Maximum CPU Usage, and Maximum Memory Usage. Migration of data from SQL database to NoSQL database for the different migration approaches is done using the available frameworks/tools.

In Figure 1 we present the workflow that has been followed during the entire process of data transformation. This helps to systematically run and verify each job, which was essential in concluding the study among the major migration approaches performed. In this way we can trace the most efficient migration approach to transform data from a traditional normalized database to a NoSQL database.

Figure 1 shows how data is migrated from the source data store to the destination data store, i.e. from SQL database to NoSQL databases. In the diagram, each migration approach is implemented with the help of its respective technology, i.e. tools/frameworks. Data store 1 signifies the SQL database, i.e. MySQL, and data store 2 implies MongoDB and HBase. Until the migration process completes, the SysGauge tool is run to check whether other processes are running or not. Any running processes are shut down, and only then is the migration technology run for the respective migration approach using its tools/framework.

Data Description

The source is a sample database to be migrated from SQL database to NoSQL database. The database used in the migration process is a structured database. The data set contained in the database table consists of 1000 records. The database table schema, which clarifies the structure of the data, is presented below. Table 2 includes six different columns and seven different rows. The first column consists of fields such as user id, user name, last name, Gender, password and Status. They have the int and varchar data types: int is basically the numeric data type and varchar is the character data type.

Environment and Comparison Characteristics

Implementation Details: This section includes the details of the implementation of the study, in which an experiment to execute the data migration between the data stores was set up. A Microsoft Windows machine with the configuration shown in Table 2.1 is used to run all types of data migration approaches using the respective tools.

Only the migrating tools and the concerned databases were allowed to run, whereas all others were shut down to make sure that no other variable had an impact on the result. After the completion of each job, the tools and databases were restarted. The SysGauge tool was used to analyse the processes running on the machine with respect to CPU and memory utilization. The process specific to each technology was studied using SysGauge, and the quantitative characteristics are documented as Maximum CPU Load, Maximum Memory Usage and CPU Time. Figure 2 shows an instance of the SysGauge tool in which the characteristics are highlighted.

Characteristics of Comparison: In this section, a set of well-defined characteristics is discussed which can be considered for the comparative study. Previous studies state that NoSQL databases are often evaluated on the basis of scalability, performance and consistency. In addition, system- or platform-dependent characteristics could be complexity, cost, time, loss of information and fault tolerance, and algorithm-dependent characteristics could be real-time processing, data size support, etc. To meet the scope of this research, quantitative characteristics are considered, hence actual values are




Table 2: Database table schema.

Field Type Null Key Default Extra
user_id int(11) No PRI Null auto_increment
user_name varchar(255) Yes - Null -
last_name varchar(50) Yes - Null -
Gender varchar(50) Yes - Null -
password varchar(50) Yes - Null -
Status varchar(50) Yes - Null -

Figure 3: Migration Module Working Diagram.
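Following the Table 2 schema, one row of the user table can be mapped to a MongoDB-style document as sketched below. This is a hypothetical sketch: the sample values are invented, and promoting `user_id` to the document `_id` is a design assumption, not part of the study's tooling.

```python
# Mapping one row of the user table (Table 2 schema) to a MongoDB-style
# document. Field names follow Table 2; the sample values are invented.
FIELDS = ["user_id", "user_name", "last_name", "Gender", "password", "Status"]

def row_to_document(row):
    """Turn a MySQL result tuple into a document, promoting the
    primary key user_id to the document _id."""
    doc = dict(zip(FIELDS, row))
    doc["_id"] = doc.pop("user_id")
    return doc

doc = row_to_document((1, "sita", "sharma", "F", "secret", "active"))
```

Applied over all 1000 records, such a mapping produces one document per row in the destination collection.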

Table 2.1: Configuration of the experimental machine.

Processor Intel® Core(TM) i3-3217U CPU @ 1.80 GHz
Installed Memory (RAM) 2.00 GB
Operating System Windows 7 Professional
Processor type 64-bit
Hard disk 500 GB

Figure 4: Data Mapping Module Working Diagram.

Figure 5: Original system with RDB only.

retained, and the actual results observed from performing the migration of data from SQL database to NoSQL database can be traced. These numerical aspects were carefully studied before collecting the data to give the best comparison (Figures 3-5). We present the metrics that have been used to evaluate our results.

Maximum CPU Load: This refers to the maximum percentage of processor time used during the data migration. This is a key performance metric, useful for investigating issues, and was monitored by shutting down all other unnecessary processes.

Maximum Memory Usage: Maximum memory usage refers to the maximum percentage of the physical RAM used by the process during data migration. It is an important metric to keep track of resource consumption and the impact it has on the time.

Analysis of changes in resource consumption is an important performance metric. Maximum CPU load, CPU time and maximum memory usage were calculated for each of the migration approaches using the SysGauge tool on the Windows operating system.

Execution Time: This is the total time taken to complete the data migration. It was measured using the respective tools for the migration techniques in order to compare the faster means of migrating data between SQL databases and NoSQL databases. This time included the time taken to establish a connection to the source and destination databases, to read data from the source and to write data to the destination. As a common unit, all the results were converted into seconds. However, some migrations that took a long time to complete were expressed in minutes.

Figure 2: SysGauge Instance.

Speed: Speed is computed as the size of data transformed per second. For each of the migration techniques, this value was obtained from the tool with which the migration was performed. The value of speed was important, for example, in the migration of data from MySQL to MongoDB.

Methods of Migration

When comparing SQL databases with NoSQL databases, the structure of SQL databases is more complex because they use a structured way to access and store data as well as the concept of normalization. According to the rules of normalization, they split their information into different tables with join relationships. On the other hand, NoSQL databases store their information in a de-normalized way, which is unstructured or semi-structured. Therefore, a successful migration from relational to NoSQL with data accuracy and reliability would not be an easy task. To come to a conclusion, a comparison of the major data migration techniques is done with the help of different tools such as MysqlToMongo, phpMyAdmin, Sqoop, mysql2, etc. Speed, Execution Time, Maximum CPU Usage and Maximum Memory Usage are checked for the comparison of the major approaches for data migration from relational to NoSQL databases.

Mid-model Approach using Data and Query Features: This model is used for transition and for migration of data from SQL database to NoSQL database. It works on two basic concepts: data features and query features. First, the mid-model is migrated to the physical model, which is the destination database; when that has been performed successfully, the data is migrated from the SQL to the NoSQL database [4].
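The mid-model flow just described (capture data and query features, map the mid-model onto the destination physical model, then move the data) might be sketched as follows. All names and structures here are invented for illustration; the study's actual tool for this approach is MysqlToMongo.

```python
# A sketch of the mid-model approach: the source table is first
# captured as a mid-model holding data features (columns) and query
# features (access patterns); the mid-model is then mapped onto the
# destination physical model; only after that are rows migrated.
def build_mid_model(table, columns, queries):
    return {"entity": table, "data_features": columns,
            "query_features": queries}

def to_physical_model(mid):
    # Destination chosen here: one document collection per entity.
    return {"collection": mid["entity"], "fields": mid["data_features"]}

def migrate_rows(rows, physical):
    return [dict(zip(physical["fields"], r)) for r in rows]

mid = build_mid_model("user", ["user_id", "user_name"], ["lookup_by_user_id"])
physical = to_physical_model(mid)
docs = migrate_rows([(1, "sita"), (2, "ram")], physical)
```

The separation matters because the physical model is derived and validated before any data moves, so a failed mapping step does not leave a half-migrated destination.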




To perform the migration task, an application 'MysqlToMongo' is used to perform the migration of data using its data and query features.

Algorithm 1 Mid-model approach (for executed transactions)

Goal: Execute Transaction
Assumption: Once a transaction starts execution, it is not interrupted
Input: Keys: in which transactions will operate
Operations: (kind of operation required for each key: read, write); the data of each sub-transaction and its operations reside in the memory of the layer
Output: Transaction Data

Steps:
1. Inform data migration to get data of the required keys.
2. If data is ok in memory
3. Inform secondary middle layer to start execution.
4. Lock data in key status by saving the required operation on it.
5. For each key in SubTransactionKeys
6. do the required operation using current data in memory
7. write operation with data
8. If Transaction.status=="Running"
9. Transaction.status="Completed" so no transaction can interrupt it
10. Update data in layer memory
11. If (updated data status is delete)
12. Change current data status to delete
13. Else if (current data status is insert)
14. Leave it insert
15. Else if (current data status is update)
16. Leave it update
17. End return selected data
18. Else
19. Go to transactions in waiting

Algorithm 2 Mid-model approach (for waiting transactions)

Goal: Execute Transaction
Input: Keys: in which transactions will operate; the list of currently locked keys and the operation on each key (read or write), residing in the locked table; the list of waiting transactions (transactions in waiting that arrived before the current transaction and use any of the keys associated with the current transaction)
Output: Locking keys of the transaction, then go to transaction execution

Steps:
1. while (transaction.status=="waiting")
2. if (no keys were locked)
3. transaction.status="running"
4. go to Execute Transaction
5. else
6. for each transaction in waiting transactions
7. if (all transaction status==completed or errored)
8. remove all keys from locked table
9. current transaction.status=running
10. go to execute transaction

NoSQLayer Approach: This migration approach works on the basis of two modules: a data migration module and a data mapping module. In the data migration module, the elements, for example columns and rows, are identified from the source database and then mapped automatically into the NoSQL model. The data mapping module consists of the persistence layer, designed to be an interface between the application and the DBMS, which monitors all SQL transactions from the application, translates these operations and redirects them to the NoSQL model created in the previous module. Finally, the result of each operation is treated and transformed to the standard expected by the SQL application. The pictorial representations presented below describe each of these modules [11].

This migration approach migrates a dataset from MySQL to MongoDB. To perform the NoSQLayer migrating process, the software 'MysqlToMongo' is used so that data is migrated from MySQL to MongoDB. MysqlToMongo is data conversion software that helps database users convert MySQL database data to MongoDB.

Content Management System Approach for Schema De-normalization: Almost all web-based applications and Content Management System (CMS) solutions use relational databases for data management. But when the number of internet and cloud users grows rapidly, it is difficult for relational databases to handle the huge data traffic. This is why this database design approach transforms a real CMS SQL database into a NoSQL database. The approach consists of two steps: first, de-normalize the SQL database, and then choose a unique identifier key as a primary key for a big table [12,13]. Conversion from RDBMS to NoSQL by schema mapping and migration is centered on two forms of analysis: qualitative and quantitative. In the evaluation, the goal of the qualitative analysis is to provide a proof of concept by showing the schema migration and mapping framework execution in practice; in the quantitative one, we aim to verify whether the application of NoSQL, with our framework, leverages the system performance [14].

The schema migration and query mapping framework consists of: a schema migration layer, reverting normal forms and row-key selection, and schema migration.

The algorithm below shows a schema migration algorithm that uses table-level de-normalization. We first generate a schema graph from the relational schema and make it acyclic if needed. We then transform the schema graph into a set of schema trees. For each schema tree, we create a collection for the root node and replace a foreign key in each node with the child node that the foreign key refers to (i.e., the primary key table).

Algorithm 3 A schema migration using table-level de-normalization
Input: relational schema RS
Output: MongoDB schema




1. Generate a schema graph G from RS
2. Make G acyclic based on user's decision if needed
3. Transform G into a set ST of schema trees
4. for (each schema tree T in ST) {
5. create a collection for the root of T
6. for (each non-root node n of T) {
7. embed n into the parent node np of n
8. remove the foreign key in np that refers to n
9. }
10. }
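A runnable sketch of this table-level de-normalization follows. It assumes the schema graph is already acyclic and represents the relational schema as a simple dict of foreign-key references; that input format is invented for illustration.

```python
# Table-level de-normalization in the spirit of Algorithm 3. The
# (already acyclic) schema is given as {table: {fk_column: child_table}};
# each schema-tree root becomes a collection, and every foreign key is
# replaced by the embedded child node it refers to.
def embed(table, schema):
    node = {"table": table, "children": {}}
    for fk_column, child in schema[table].items():
        node["children"][fk_column] = embed(child, schema)  # replace FK
    return node

def migrate_schema(schema):
    referenced = {child for fks in schema.values() for child in fks.values()}
    roots = [t for t in schema if t not in referenced]  # schema-tree roots
    return {root: embed(root, schema) for root in roots}

schema = {"order": {"customer_id": "customer"}, "customer": {}}
collections = migrate_schema(schema)
```

Here `order` becomes the root collection with `customer` embedded in place of its foreign key, which is exactly the join that de-normalization removes.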
Figure 6: System architecture with data adapter and its components.
HBase Database Technique: HBase is the Hadoop database,
a distributed and scalable big data store. HBase consists of some
the architecture of data adapter system consisting of: a Relational
features such as linear and modular scalability, strictly consistent
Database, a NoSQL Database, DB Adapter, and DB Converter. Above
reads and writes convenient base classes for backing Hadoop Map
mentioned system is the coordinator between applications and two
Reduce jobs with Apache HBase tables [15]. By using Sqoop we can
databases. It controls query flow and transformation process. The
import information from a NoSQL database from social website
DB Converter is needed for transformation of data and reporting
framework into HDFS. The information to the import procedure is a
transformation progress to DB Adapter for further actions.
database table. Sqoop read the table column by line into HDFS [16].
When direct access is available to the RDBMS source system, we may choose a File Processing method; otherwise, we may choose RDBMS processing while database client access is available [17].

Algorithm 4 Migration from MySQL to HBase

1. Steps to migrate from MySQL to HBase
2. Set up Hadoop on the system.
3. Use Sqoop to migrate data (tables) from MySQL to the Hadoop Distributed File System.
4. Convert the data stored in HDFS to a designated data store format such as XML or CSV.
5. Set up HBase on top of the Hadoop framework.
6. Map the data onto tables created on the HBase column-oriented database, based on the data access needs of the applications.

Data Adapter Approach: The data adapter system is highly modularized and layered between application and databases. It basically lies on the concept of performing queries from applications and data transformation between databases at the same time. This system provides a SQL interface to parse query statements, which enables access to both a relational database and a NoSQL database.

This approach offers a mechanism to control the database transformation process and to let applications perform queries whether or not the target data (tables) are being transformed. After data are transformed, a patch mechanism synchronizes inconsistent tables [18]. We present the data adapter system with its design and implementation in the following manner.

Without using an adapter, i.e. mysql2, the available system only allows the application to connect to a relational database. Figure 6 depicts the application, i.e. Ruby on Rails, accessing databases through the DB Adapter, i.e. mysql2. The DB Adapter parses queries, submits queries, and gets result sets from the databases. The system needs some necessary information, such as transformation progress, from the DB Converter, and then decides when a query can be performed to access the database. The DB Converter migrates data from a relational database to a NoSQL database. The data adapter system accepts queries while the transformation is performed, but the data in the two databases may not be consistent. The DB Adapter will detect this and ask the DB Converter to perform a synchronization process to maintain data consistency.

Automatic Mapping Framework: This approach of migration provides a framework which is generally used for automatic mapping of relational databases to a NoSQL database. Data migration to a column-oriented database is beneficial in several cases because the data can be appended on one dimension, which is technically simpler and faster: the data are added one after the other, which yields much higher write speeds with very low latency. This technique also offers better scalability: since the growth of data occurs only on one dimension, partitioning is simpler to perform and can be distributed across multiple servers [12].

The framework 'NoSQLBooster' for MongoDB is used for automatic database mapping from MySQL to MongoDB. NoSQLBooster for MongoDB (formerly MongoBooster) is a shell-centric cross-platform GUI tool for MongoDB, which provides comprehensive server monitoring tools, a fluent query builder, SQL query support, ES2017 syntax support and a true IntelliSense experience.

Here is an algorithm for the automatic mapping of MySQL relational databases to MongoDB. The algorithm uses the MySQL INFORMATION SCHEMA, which provides access to database metadata. Metadata is data about the data, such as the name of a database or table, the data type of a column, or access privileges. INFORMATION SCHEMA is the information database, the place that stores information about all the other databases that the MySQL server maintains. Inside INFORMATION SCHEMA there are several read-only tables. They are actually views, not base tables.
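The relationship facts the mapping algorithm needs (how many foreign keys a table holds, and whether other tables refer to it) can be derived from rows shaped like the INFORMATION SCHEMA's KEY_COLUMN_USAGE view. A minimal sketch, assuming such rows; the table and column names below are illustrative, not from the paper:

```python
# Sketch: summarize foreign-key metadata per table from rows mimicking
# INFORMATION_SCHEMA.KEY_COLUMN_USAGE: (table, column, referenced_table),
# where referenced_table is None for non-FK columns.

def fk_summary(key_column_usage):
    """Return {table: (fk_count, referred_to_by_others)}."""
    tables = {}
    referred = set()
    for table, _column, ref_table in key_column_usage:
        tables.setdefault(table, 0)
        if ref_table is not None:
            tables[table] += 1          # this table holds a foreign key
            referred.add(ref_table)     # the referenced table is referred to
            tables.setdefault(ref_table, 0)
    return {t: (n, t in referred) for t, n in tables.items()}

# Hypothetical schema: orders references customers and products.
rows = [
    ("customers", "id", None),
    ("products", "id", None),
    ("orders", "customer_id", "customers"),
    ("orders", "product_id", "products"),
]
print(fk_summary(rows))
# customers and products: 0 FKs, referred to; orders: 2 FKs, not referred to
```

This per-table summary is exactly the input the classification steps of Algorithm 5 below consume.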
Algorithm 5 Automatic Migration Framework

Volume 9 • Issue 6 • 1000241 • Page 6 of 10 •


Citation: Bhandari HL, Chitrakar R (2020) Comparison of Data Migration Techniques from SQL Database to NoSQL Database. J Comput Eng Inf Technol
9:6.

doi: 10.37532/jceit.2020.9(6).241

1. Creating the MongoDB database. The user must specify the MySQL database that will be represented in MongoDB. The database is created with the following MongoDB command: use DATABASE NAME.

2. Creating tables in the new MongoDB database. The algorithm verifies, for each table, in what relationships it is involved: whether it has foreign keys and/or is referred to by other tables.

3. If the table is not referred to by other tables, it will be represented by a new MongoDB collection.

4. If the table has no foreign keys but is referred to by another table, it will be represented by a new MongoDB collection.

5. If the table has one foreign key and is referred to by another table, it will be represented by a new MongoDB collection. In our framework, for this type of table we use the linking method, using the same concept as a foreign key.

6. If the table has one foreign key but is not referred to by another table, the proposed algorithm uses the one-way embedding model. So, the table is embedded in the collection that represents the table from part 1 of the relationship.

7. If the table has two foreign keys and is not referred to by another table, it will be represented using the two-way embedding model, described in section 2.4.

8. If the table has 3 or more foreign keys, so that it is the result of N:M ternary or quaternary relationships, the algorithm uses the linking model, with foreign keys that refer to all the tables initially implied in that relationship and already represented as MongoDB collections. This solution holds whether or not the table is referred to by other tables.

Extract-Transform-Load approach: The term ETL came into existence from data warehousing and is an acronym for Extract-Transform-Load. ETL describes the process of how the data are loaded from the source system to the data warehouse [19, 20]. These days, ETL adds a cleaning step as a separate phase; the sequence is then Extract-Clean-Transform-Load.

Extract: The Extract step consists of extracting the data from the source system and making it accessible for further processing. The main aim of the Extract step is to fetch all the necessary data from the source system with as minimal an amount of resources as possible.

Transform: The Transform step applies a set of rules to transform the data from the source to the target. This includes converting any measured data to the same dimension using the same units so that they can later be joined. The transformation step also requires joining data from several sources, generating aggregates, generating surrogate keys, sorting, deriving new calculated values, and applying advanced validation rules.

Load: During the Load step, it is necessary to ensure that the load is performed correctly and with as few resources as possible. The target of the Load process is often a database. In order to make the load process efficient, it is helpful to disable any constraints and indexes before the load and enable them back only after the load completes. The referential integrity needs to be maintained by the ETL tool to ensure consistency.

Steps:
1. Lock the target database in the source system.
2. Lock the target database in the destination system.
3. Extract information from the target database in the source system.
4. Transform information to the destination database.
5. Release the locks on the source and destination systems.

Discussion

In this section we discuss the results of the experiment and also report the challenges that we faced during the entire phase.

Comparing Quantitative Characteristics of Migration Approaches: This determinative evaluation was used to check whether the study was going in the right direction. The data migration methodologies implemented in this research study are compared with one another and evaluated in the matrix as described. Since each aspect could not be predicted at the start of the study, and due to unexpected changes that happened at different phases, a revision of the methodologies was necessary at every stage.

Migrating Results

The implementation details described earlier formed the environmental setup; the values of maximum CPU load, CPU time, and maximum memory usage were retrieved using the SysGauge tool, while execution time and speed were documented from the respective technology used in the migration process, and the results are compiled as shown in Table 3. There were 3 target data stores, namely MongoDB, CMS Database and Hadoop Database, used in the research study. The tools and frameworks involved in the transformation were MysqlToMongo, phpMyAdmin, mysql2, NoSQLBooster for MongoDB, Sqoop and Studio 3T.

Transformation results vary from one migration technique to another; they were evaluated according to the values obtained from the execution of the respective methodologies. That execution was performed with the help of the tools or frameworks belonging to the different migration approaches. The evaluated results of the different migration approaches are discussed below:

Mid-model Approach using Data and Query Features (MongoDB using MysqlToMongo Framework): The MysqlToMongo tool is used to migrate data from MySQL to MongoDB. It uses data and query features. It transforms structured data at 2833.3 KB per second from MySQL to MongoDB. A data set of size 85 KB containing 1000 rows is transformed in 0.03 sec. During data transformation from MySQL to MongoDB using the MysqlToMongo tool, maximum CPU usage is 23 percent and maximum memory consumption is 9.1 percent, and the database size after transformation and conversion of the SQL database is 4 KB.
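The table-classification rules of Algorithm 5 above (steps 3-8) can be restated as a small decision function. This is a sketch of the rules as written, not the authors' implementation; the strategy labels are ours, and cases the algorithm leaves unspecified fall through to a plain collection:

```python
# Sketch of Algorithm 5's table-classification rules (steps 3-8).
# fk_count: number of foreign keys the table holds;
# referred: whether any other table refers to this one.

def mapping_strategy(fk_count, referred):
    if fk_count >= 3:
        return "linking"             # step 8: result of N:M ternary/quaternary
    if fk_count == 2 and not referred:
        return "two-way embedding"   # step 7
    if fk_count == 1 and not referred:
        return "one-way embedding"   # step 6: embed in the parent's collection
    if fk_count == 1 and referred:
        return "collection+linking"  # step 5: own collection, linked by FK
    return "collection"              # steps 3-4 (and unspecified cases)

print(mapping_strategy(0, False))  # -> collection
print(mapping_strategy(2, False))  # -> two-way embedding
```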

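The five locked ETL steps listed above can be simulated end to end with in-memory stand-ins for the two systems; the dictionaries and the transformation rule here are illustrative only, not part of any of the evaluated tools:

```python
import threading

# Sketch of the lock -> extract -> transform -> load -> release sequence,
# using in-memory dicts as stand-ins for the source and destination systems.
source = {"users": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}
destination = {}
src_lock, dst_lock = threading.Lock(), threading.Lock()

def migrate(table):
    with src_lock, dst_lock:                          # steps 1-2: lock both
        rows = list(source[table])                    # step 3: extract
        docs = [{"_id": r["id"], **r} for r in rows]  # step 4: transform
        destination[table] = docs                     # step 4: load
    # step 5: both locks released on leaving the `with` block

migrate("users")
print(destination["users"][0])  # -> {'_id': 1, 'id': 1, 'name': 'a'}
```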

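The speed column of Table 3 follows directly from the 85 KB data set and each technique's execution time; a quick arithmetic check using the values from the table:

```python
# Verify Table 3: speed (KB/s) = dataset size (85 KB) / execution time (s).
DATASET_KB = 85
reported = {  # approach: (reported speed in KB/s, reported time in s)
    "Mid-model approach":      (2833.3,  0.03),
    "NoSQLayer approach":      (8500,    0.01),
    "CMS approach":            (44.97,   1.89),
    "HBase database technique": (0.39,   215.4),
    "Data Adapter approach":   (850,     0.1),
    "Automatic Mapping":       (56.67,   1.5),
    "ETL approach":            (1214.29, 0.07),
}
for name, (speed, seconds) in reported.items():
    derived = DATASET_KB / seconds
    assert abs(derived - speed) / speed < 0.02, name  # within rounding (2%)
print("all reported speeds are consistent with 85 KB / execution time")
```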
Table 3: Comparison of the data migration approaches.

Approaches | Speed (KB/sec.) | Execution time (sec.) | Maximum CPU Usage (%) | Maximum Memory Usage (%)
Mid-model approach | 2833.3 | 0.03 | 23 | 9.1
NoSQLayer approach | 8500 | 0.01 | 21 | 7.1
Content Management System approach | 44.97 | 1.89 | 26 | 50
HBase database technique | 0.39 | 215.4 | 84 | 59.4
Data Adapter approach | 850 | 0.1 | 14 | 5.4
Automatic Mapping Framework | 56.67 | 1.5 | 63 | 16.9
Extract-Transform-Load approach | 1214.29 | 0.07 | 70 | 17.6

NoSQLayer Approach: The MysqlToMongo tool was used to migrate data from MySQL to MongoDB. It uses data and query features. It transforms structured data at 8500 KB per second from MySQL to MongoDB. A data set of size 85 KB containing 1000 rows is transformed in 0.01 sec. During data transformation from MySQL to MongoDB using the MysqlToMongo tool, maximum CPU usage is 21 percent and maximum memory consumption is 7.1 percent, and the database size after transformation and conversion of the SQL database is 1 KB.

Content Management System Approach for Schema Denormalization: It transforms structured data at 44.97 KB per second from MySQL to WordPress. A data set of size 85 KB containing 1000 rows is transformed in 1.89 sec. During data transformation from MySQL to WordPress using phpMyAdmin, maximum CPU usage is 26 percent and maximum memory consumption is 50 percent, and the database size after transformation and conversion of the SQL database is 84.7 KB.

HBase Database Technique: It transforms structured data at 0.39 KB per second from MySQL to the Hadoop database using Sqoop. A data set of size 85 KB containing 1000 rows is transformed in 215.4 sec. During data transformation from MySQL to the Hadoop database, maximum CPU usage is 84 percent and maximum memory consumption is 59.4 percent, and the database size after transformation and conversion of the SQL database is 65.1 KB.

Data Adapter Approach: It transforms structured data at 850 KB per second from MySQL to the MongoDB database using the mysql2 data adapter on Ruby on Rails. A data set of size 85 KB containing 1000 rows is transformed in 0.1 sec. During data transformation from MySQL to MongoDB using the mysql2 adapter, maximum CPU usage is 14 percent and maximum memory consumption is 5.4 percent, and the database size after transformation and conversion of the SQL database is 88 KB.

Automatic Mapping Framework: It transforms structured data at 56.67 KB per second from MySQL to the MongoDB database using NoSQLBooster for MongoDB. A data set of size 85 KB containing 1000 rows is transformed in 1.5 sec. During data transformation from MySQL to MongoDB using NoSQLBooster, maximum CPU usage is 63 percent and maximum memory consumption is 16.9 percent, and the database size after transformation and conversion of the SQL database is 1 KB.

Figure 7: Data Migration Speed.

Figure 8: Data Migration Execution Time.

Extract-Transform-Load Approach: It transforms structured data at 1214.29 KB per second from MySQL to the MongoDB database using Studio 3T. A data set of size 85 KB containing 1000 rows is transformed in 0.07 sec. During data transformation from MySQL to the MongoDB database, maximum CPU usage is 70 percent and maximum memory consumption is 17.6 percent, and the database size after transformation and conversion of the SQL database is 88 KB.

From the results evaluated during migration of the data set from SQL database to NoSQL database, in totality the 'Data Adapter Approach' was found the most efficient from the point of view of CPU usage and memory usage. On the other hand, the NoSQLayer Approach is the most efficient from the execution time and data migration speed point of view. The bases of comparison were speed, maximum CPU usage percentage, maximum memory usage percentage and execution time. The resource consumption of the migrating procedure was evaluated using the 'SysGauge' tool. Data conversion/transformation speed and total execution time were evaluated using the framework/tools of the respective migration approach.

Migrating Efficiency of Transformation Techniques: The overall evaluation of all transformation techniques involved in transforming data from the SQL database, i.e. MySQL, to NoSQL databases such as MongoDB, Hadoop Database and CMS Database has been plotted as shown in Figures 7-10. This provides a clear picture of which technology was the most efficient in comparison to the others. The average data size per second, database size, maximum CPU usage


percentage, and maximum memory usage percentage for each migration approach have also been plotted to convey the efficiency of each migrating technique.

Summarization of the Results: Although a final result for migrating speed amongst the major migration techniques has been drawn, there were other results which further verify the efficiency of the migration techniques and which helped validate our results in measuring the efficiency of the transformation techniques. To depict a clear picture of the migrating techniques' efficiency, the results for each parameter have been presented.

In Figure 7, the horizontal axis shows the techniques used in migration and the vertical axis represents the data, in kilobytes, migrated per second during the migrating process from SQL database to NoSQL database. From Figure 7, the NoSQLayer Approach migrates the largest data size, i.e. 8,500 kilobytes per second, from SQL database to NoSQL database. The Mid-model Approach, Extract-Transform-Load Approach and Data Adapter Approach are the next best from the data migrating speed point of view; the migrating speeds of these approaches are 2833.3 KB, 1214.29 KB and 850 KB per second respectively. Thus, we can conclude that NoSQLayer is the most efficient migrating technique from the migrating speed point of view.

In Figure 8, the horizontal axis shows the techniques used in migration and the vertical axis represents the total execution time consumed during the completion of the data migrating process from SQL database to NoSQL database. From Figure 8, the NoSQLayer Approach has taken 0.01 sec. to migrate 1000 records from SQL database to NoSQL database. The Mid-model Approach, Extract-Transform-Load Approach and Data Adapter Approach are the other techniques which consume less time in data migration; their execution times for completing the data migration are 0.03 sec., 0.07 sec. and 0.1 sec. respectively. Thus, we can conclude that NoSQLayer is the most efficient migrating technique from the execution time point of view.

Figure 9: Maximum CPU Usage Percentage.

In Figure 9, the horizontal axis shows the techniques used in migration and the vertical axis represents the maximum CPU usage percentage consumed during the completion of the data migrating process from SQL Database to NoSQL Database. The maximum CPU usage of the Data Adapter Approach is 14 percent, which is comparatively the least among the seven migration techniques. The NoSQLayer Approach and Mid-model Approach, at 21 percent and 23 percent CPU usage respectively, are the two other techniques with lower CPU usage. Thus, we can conclude that the Data Adapter Approach is the most efficient from the CPU usage point of view, i.e. it uses only 14 percent of the CPU load during the complete migration of 1000 records from SQL Database to NoSQL Database.

Figure 10: Maximum Memory Usage Percentage.

In Figure 10, the horizontal axis shows the techniques used in migration and the vertical axis represents the maximum memory usage percentage consumed during the completion of the data migrating process from SQL Database to NoSQL Database. The maximum memory usage of the Data Adapter Approach is 5.4 percent, which is comparatively the least among the seven migration techniques. The NoSQLayer Approach and Mid-model Approach, at 7.1 percent and 9.1 percent memory usage respectively, are the two other techniques with lower memory usage. Thus, we can conclude that the Data Adapter Approach is the most efficient from the memory usage point of view, i.e. it uses only 5.4 percent of the memory load during the complete migration of 1000 records from SQL database to NoSQL databases.

The experiments, results, analysis and comparisons show that the HBase Database Technique, Content Management System Approach, Automatic Mapping Framework and ETL Approach reached higher maximum CPU and memory loads than the other techniques during the migration process. It is also seen that, from the viewpoint of speed of data migration and execution time, the NoSQLayer Approach is the most efficient, and, from the CPU usage and memory usage point of view, the Data Adapter is the most efficient technique.

Conclusion

The main objective of this study is to compare various approaches of data migration from SQL to NoSQL by using well defined characteristics and datasets. In order to address the growing demands of modern applications to manage huge/big data in an efficient manner, there emerges a need for schema-less NoSQL databases capable of managing large amounts of data in terms of storage, access and efficiency. The main focus of this research is to carry out a comparative study and analysis of the most common migrating approaches using the most appropriate tools (other than commercially available ones) that prefer basic and practical conversion from structured data to unstructured data. In this work, 7 (seven) migration procedures have been performed one by one and separately by using freely available resources (data and tools), and then the performance of each procedure has been evaluated


on the basis of performance parameters. Further, all the challenges faced during the course of this work have been documented for future reference. The main contribution of this work is that it will serve as a guideline for organizations looking to migrate data from a structured to a semi-structured or unstructured repository in the most efficient way.

References

1. Mohamed H, Omar B, Abdesadik B (2015) "Data Migration Methodology from Relational to NoSQL Databases," Inter J Comp App, 9: 2511–2515.
2. Pretorius D (2013) "NoSQL database considerations and implications for businesses," Inter J Comp App.
3. Mughees M (2013) "NoSQL: Data migration from standard SQL to NoSQL."
4. Ghotiya S, Mandal J, Kandasamy S (2017) "Migration from relational to NoSQL database," IOP Conf Ser Mater Sci Eng, 263: 1-4.
5. Yassine F, Awad M (2018) "Migrating from SQL to NOSQL Database: Practices and Analysis," Proc 13th Int Conf Innov Inf Technol, 58-62.
6. Moniruzzaman B, Akhter H (2013) "NoSQL Database: New Era of Databases for Big Data."
7. Potey M, Digrase M, Deshmukh G, Nerkar M (2015) "Database Migration from Structured Database to Non-Structured Database," Int J Comput Appl, 8975–8887.
8. Abramova P, Veronika B, Jorge F (2014) "Experimental Evaluation of Indoor Visual Comfort," Int J Database Manag Syst, 6: 1-16.
9. Ameya N, Anil P, Dikshay P (2013) "Types of NOSQL databases and its comparison with relational databases," Int J Appl Inf Syst, 5: 16-19.
10. Mohamed A, Altrafi G, Ismail O (2014) "Relational Vs. NoSQL databases: A survey," Int J Comput Inf Technol, 2279–2764.
11. Koli A, Shinde S (2017) "Approaches used in efficient migration from Relational Database to NoSQL Database," Proc Second Int Conf Res Intell Comput Eng, 10: 223–227.
12. Ramzan S, Bajwa S, Kazmi R (2018) "An intelligent approach for handling complexity by migrating from conventional databases to big data," Symmetry (Basel), 10: 1-12.
13. Chakrabarti A, Jayapal M (2017) "Data transformation methodologies between heterogeneous data stores: A comparative study," Data 2017 – Proc 6th Int Conf Data Sci Technol Appl, 241–248.
14. Kuderu N, Kumari V (2016) "Relational Database to NoSQL Conversion by Schema Migration and Mapping," Int J Comput Eng Res Trends, 3: 506.
15. Khourdifi Y, Bahaj M, Elalami A (2018) "A new approach for migration of a relational database into column-oriented NoSQL database on Hadoop," J Theor Appl Inf Technol, 96: 6607.
16. Tiyyagura N, Rallabandi M, Nalluri R (2016) "Data Migration from RDBMS to Hadoop," 184.
17. Seshagiri V, Vadaga M, Shah J, Karunakaran P (2016) "Data Migration Technology from SQL to Column Oriented Databases (HBase)," 5: 1-11.
18. Liao T (2016) "Data adapter for querying and transformation between SQL and NoSQL database," Futur Gener Comput Syst, 65: 111–121.
19. Lalitha R (2016) "Classical Data Migration Technique in Multi-Database Systems (SQL and NOSQL)," Int J Comput Sci Inf Technol, 7: 2472–2475.
20. Yangui R, Nabli A, Gargouri F (2017) "ETL based framework for NoSQL warehousing," Lect Notes Bus Inf Process, 299: 40-53.

Author Affiliation


Faculty of Science Health and Technology, Nepal Open University, Nepal

