This document discusses importing and exporting data between a relational database (RDBMS) and HDFS. It provides steps to: 1. Create a table in RDBMS and import the data into HDFS using Sqoop. 2. Export HDFS data to a new table in RDBMS using Sqoop. 3. Export HDFS data to an existing RDBMS table, updating records and inserting new ones.


Table of Contents

1. Creating a table in RDBMS

2. Importing RDBMS data into HDFS

3. Exporting HDFS data to RDBMS

Creating a Table in RDBMS:
1. Open a MySQL instance, either in the MySQL editor or on the command line.

2. Create a table named customer in the test database using the query below:


CREATE TABLE test.customer (cust_id int PRIMARY KEY, name varchar(50), age int, country varchar(100));

3. Insert data into the customer table:


INSERT INTO test.customer VALUES(101,'John',25,'US');
INSERT INTO test.customer VALUES(102,'Jack',35,'US');
INSERT INTO test.customer VALUES(103,'Peter',28,'UK');
INSERT INTO test.customer VALUES(104,'Katie',30,'US');
INSERT INTO test.customer VALUES(105,'Daniel',22,'UK');

4. Check whether the data was inserted into the customer table:


SELECT * FROM test.customer;

Importing RDBMS data into HDFS:


5. Check whether mysql-connectorXXXX.jar is present under the /usr/lib/sqoop directory of your Hadoop machine.

6. If not, download the MySQL connector JAR and copy it to the /usr/lib/sqoop directory, as shown below.
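For reference, both checks can be done from the shell along these lines (the exact JAR file name varies by connector version; the one below is only an example):

$ ls /usr/lib/sqoop/mysql-connector*.jar
$ sudo cp mysql-connector-java-5.1.49.jar /usr/lib/sqoop/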

7. Import the customer table's data to HDFS using the sqoop-import command:


$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--target-dir /etl/input/customer/

Note: Please delete the customer directory from HDFS (if present) before running the sqoop command.
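If the directory is present, it can be removed with the HDFS shell (path assumed from step 7):

$ hadoop fs -rm -r /etl/input/customer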

8. Check whether the data was imported from the customer table to HDFS:
$ hadoop fs -cat /etl/input/customer/part-m*

9. Import only those records of the customer table where country is 'US':


$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--where "country = 'US'" \
--target-dir /etl/input/US/

10. Import only cust_id, name and age of the customers belonging to UK and load them to the /etl/input/selected/customer path in HDFS:

$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--query "SELECT cust_id, name, age FROM test.customer WHERE country = 'UK' AND \$CONDITIONS" \
--split-by cust_id \
--target-dir /etl/input/selected/customer
Note: --split-by is used to slice the data across multiple parallel map tasks. It is usually applied on the primary key of the table.

11. Define the number of mappers explicitly:


$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--target-dir /etl/input/ \
-m 5
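To confirm the parallelism, list the target directory; a successful run with -m 5 should show five part-m files:

$ hadoop fs -ls /etl/input/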

12. Import the customer table data to HDFS as an Avro data file:


$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--target-dir /etl/input/ \
--as-avrodatafile

13. Import the data of the customer table into a Hive table:
$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--hive-import

14. Check whether the data was loaded to the Hive table correctly (run from the Hive shell):


SELECT * FROM customer;
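If you are not already inside the Hive shell, the same check can be run non-interactively, for example:

$ hive -e "SELECT * FROM customer;"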

15. Add one row to your customer table in MySQL:


INSERT INTO test.customer VALUES(106,'Vipul',25,'IN');

16. Check whether the new row was inserted into the customer table:


SELECT * FROM test.customer;

17. Append the newly added row (incremental import) of the customer table to the /etl/input/customer/ directory of HDFS:
$ sqoop-import --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--incremental append \
--check-column cust_id \
--last-value 105 \
--target-dir /etl/input/customer/

18. Import the data of the customer table into HBase:

$ sqoop-import -Dsqoop.hbase.add.row.key=true \
--connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer \
--hbase-table customer \
--column-family market
Note: The HBase table and column family must exist before importing with Sqoop.
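For reference, the table and its column family can be created (and the imported rows later inspected) from the HBase shell:

hbase> create 'customer', 'market'
hbase> scan 'customer'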

Exporting HDFS data into RDBMS:


1. Create a new table named customer_new in the test database in MySQL, into which we will export data from HDFS:
CREATE TABLE test.customer_new (cust_id int PRIMARY KEY, name varchar(50), age int, country varchar(100));

2. Check whether data is present in the HDFS path /etl/input/customer/:


$ hadoop fs -cat /etl/input/customer/part*

3. Export data from the HDFS path /etl/input/customer/ to the customer_new table in the test database:

$ sqoop-export --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer_new \
--export-dir /etl/input/customer/ \
-m 2

4. Check whether the data was exported to the customer_new table in MySQL:


SELECT * FROM test.customer_new;

5. Export the data from HDFS to the MySQL table customer_new, updating existing records and inserting newly added ones:

$ sqoop-export --connect jdbc:mysql://192.168.1.121/test \
--username root --password root \
--table customer_new \
--update-key cust_id \
--update-mode allowinsert \
--export-dir /etl/input/customer/
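As in step 4, the updated and newly inserted records can be verified from MySQL:

SELECT * FROM test.customer_new;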
