DATE: DATABASES
AIM: To install Sqoop and execute basic commands of Hadoop ecosystem component Sqoop.
The import command is used to import a table from a relational database into HDFS. In our case, we are going to import tables from a MySQL database into HDFS. As you can see in the below image, we have the employees table in the employees database, which we will be importing into HDFS.
After executing this command, Map tasks are executed at the back end. Once the code has executed, you can check the Web UI of HDFS, i.e. localhost:50070, where the data has been imported.
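The step above can be sketched as a single Sqoop command. This is a minimal sketch: the connection URL, username, and table name are taken from the employees example used in this section, and the interactive password prompt (-P) is an assumption.

```shell
# Import the employees table from the MySQL employees database into HDFS.
# --connect  : JDBC connection string to the source database
# --username : MySQL user (edureka, as used elsewhere in this section)
# -P         : prompt for the database password interactively
# --table    : source table to import
sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username edureka \
  -P \
  --table employees
```

By default the imported files land under the user's home directory in HDFS (e.g. /user/&lt;user&gt;/employees), which can then be browsed in the HDFS Web UI at localhost:50070.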
Sqoop imports data in parallel from most database sources. You can specify the number of map tasks (parallel processes) to use for the import with the -m or --num-mappers argument. Each argument takes an integer value that corresponds to the degree of parallelism to employ.
As you can see in the below image, the number of mapper tasks is 1. The number of files created while importing a MySQL table is equal to the number of mappers used.
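The mapper behaviour described above can be sketched as follows (connection details reused from the employees example in this section; the password prompt is an assumption):

```shell
# Import with a single mapper: exactly one output file (part-m-00000)
# is created in the target directory in HDFS.
sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username edureka \
  -P \
  --table employees \
  -m 1

# Equivalent long form; with 4 mappers, 4 part files are created.
sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username edureka \
  -P \
  --table employees \
  --num-mappers 4
```

Note that when more than one mapper is used, Sqoop needs a column to split the work on; it uses the table's primary key by default, or the column named with --split-by.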
Sqoop – IMPORT Command with Where Clause
You can import a subset of a table using the 'where' clause of the Sqoop import tool. Sqoop executes the corresponding SQL query on the database server and stores the result in a target directory in HDFS. You can use the following command to import data with the 'where' clause:
sqoop import --connect jdbc:mysql://localhost/employees --username edureka --table employees -m 3 --where "emp_no > 49000" --target-dir /Latest_Employees
Finally, Sqoop is used to transfer data from an RDBMS (relational database management system) such as MySQL or Oracle to HDFS (Hadoop Distributed File System). Sqoop can also be used to export data that has been transformed in Hadoop MapReduce back into an RDBMS.
3. Importing Data into a Database:
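A minimal sketch of the reverse direction, exporting HDFS data back into MySQL. The destination table name new_employees is a hypothetical example (the table must already exist in MySQL with a matching schema); the HDFS path reuses the /Latest_Employees directory from the where-clause example above.

```shell
# Export HDFS files back into an existing MySQL table.
# --table      : destination table in MySQL (hypothetical name new_employees)
# --export-dir : HDFS directory holding the data to export
sqoop export \
  --connect jdbc:mysql://localhost/employees \
  --username edureka \
  -P \
  --table new_employees \
  --export-dir /Latest_Employees
```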
RESULT:
Thus the importing and exporting of data between MySQL and HDFS using Sqoop were executed successfully.