
BDA Lab2

This document discusses using Sqoop to load data from a relational database (RDBMS) into Hadoop and analyze the data using Pig and Hive. It provides an overview of Sqoop and its features for transferring data between Hadoop and external data stores like RDBMS. The document then outlines steps to use Sqoop to load a dataset from a MySQL database into HDFS and analyze it using Hive, including importing and querying tables and adding new rows.


Name: Utsav Vijay Gavli
Roll No.: 16
Name: Mohit Gangwani
Roll No.: 17
Div: D17A

Aim: Use Sqoop to load data from an RDBMS (weblog/transaction data) into Hadoop and analyze it using Pig and Hive.

Theory:

1. Sqoop is a tool in the Hadoop ecosystem used to load data from relational database management
systems (RDBMS) into Hadoop and to export it back to the RDBMS. Simply put, Sqoop helps
professionals work with large amounts of data in Hadoop.
2. Sqoop transfers bulk data between Hadoop and external datastores, such as relational
databases (MS SQL Server, MySQL). Before Sqoop existed, loading data from several
heterogeneous sources into Hadoop was extremely challenging.

3. The problems administrators encountered included:

a. Maintaining data consistency
b. Ensuring efficient utilization of resources
c. Loading bulk data into Hadoop was not possible
d. Loading data using scripts was slow
The solution was Sqoop. Using Sqoop in Hadoop overcame all the challenges of the traditional
approach: it could load bulk data from an RDBMS into Hadoop with ease.
4. Sqoop Features:
a. Parallel Import/Export
Sqoop uses the YARN framework to import and export data. This provides fault tolerance on top of
parallelism.
b. Import Results of an SQL Query
Sqoop enables us to import the results returned by an SQL query into HDFS.
c. Connectors for All Major RDBMS Databases
Sqoop provides connectors for multiple RDBMSs, such as MySQL and Microsoft SQL Server.
d. Kerberos Security Integration
Sqoop supports the Kerberos network authentication protocol, which enables nodes
communicating over an insecure network to authenticate one another securely.
e. Full and Incremental Load
Sqoop can load an entire table, or only the rows added since the last import, with a single
command (see the sketch after this list).
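
As a sketch of the incremental load in (e): the command below appends only rows whose id exceeds the last imported value. The JDBC URL, credentials, table, and column are assumptions for illustration, not values from this lab.

    $ sqoop import \
        --connect jdbc:mysql://localhost/salesdb \
        --username root -P \
        --table sales \
        --incremental append \
        --check-column id \
        --last-value 100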
5. Sqoop Architecture:
a. The client submits the import/export command.
b. Sqoop fetches data from different databases. Here, we have an enterprise data warehouse,
document-based systems, and a relational database. We have a connector for each of these; connectors
help Sqoop work with a range of accessible databases.

c. Multiple mappers perform map tasks to load the data onto HDFS.

d. Similarly, multiple map tasks export the data from HDFS back to the RDBMS when the Sqoop export
command is used.
6. Sqoop Processing:
Processing takes place step by step, as shown below:
a. Sqoop runs in the Hadoop cluster.
b. It imports data from the RDBMS or NoSQL database to HDFS.
c. It uses mappers to slice the incoming data into multiple chunks and loads the data into HDFS.
d. It exports data back into the RDBMS while ensuring that the schema of the data in the database is
maintained.

Conclusion: Sqoop was used to load data from an RDBMS into Hadoop, and the data was analyzed using Hive.
Steps:

1) Logging in to MySQL
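
A typical login command (assuming a local MySQL server and the root user; credentials vary per setup):

    $ mysql -u root -p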

2) Creating a database
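
For illustration, assume a database named salesdb (the name is hypothetical):

    mysql> CREATE DATABASE salesdb;
    mysql> USE salesdb;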

3) Creating tables
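
A minimal sketch of a transactions-style table; the column names and types are assumptions, not the exact schema used in the lab:

    mysql> CREATE TABLE sales (
        ->   id INT PRIMARY KEY,
        ->   product VARCHAR(50),
        ->   amount DECIMAL(10,2),
        ->   sale_date DATE
        -> );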

4) Describing the table created in MySQL
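
Verifying the schema:

    mysql> DESCRIBE sales;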


5) Loading the dataset from the local machine into the sales table
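
Assuming the dataset is a comma-separated file at a hypothetical path, and that the server permits LOAD DATA LOCAL INFILE:

    mysql> LOAD DATA LOCAL INFILE '/home/cloudera/sales.csv'
        -> INTO TABLE sales
        -> FIELDS TERMINATED BY ','
        -> LINES TERMINATED BY '\n';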

6) Selecting rows from the table
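
A quick sanity check on the loaded data:

    mysql> SELECT * FROM sales LIMIT 10;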


7) Using Sqoop to list all the tables present in the MySQL database
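
A sketch of the list-tables command; the JDBC URL and credentials depend on the environment:

    $ sqoop list-tables \
        --connect jdbc:mysql://localhost/salesdb \
        --username root -P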

8) Importing tables from the RDBMS to HDFS using Sqoop
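
A single-mapper import into a hypothetical HDFS target directory (-m 1 keeps the output in one file):

    $ sqoop import \
        --connect jdbc:mysql://localhost/salesdb \
        --username root -P \
        --table sales \
        --target-dir /user/cloudera/sales \
        -m 1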

9) Checking whether the tables were imported properly
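
Listing the target directory and printing the part file Sqoop produced (the file name assumes the single-mapper import above):

    $ hdfs dfs -ls /user/cloudera/sales
    $ hdfs dfs -cat /user/cloudera/sales/part-m-00000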

10) Importing the tables from HDFS into Hive
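
One way to move the imported files into Hive is to create a matching table and load the files into it; Sqoop's --hive-import option does the same in one step. The schema and path below are the assumed ones from the earlier sketches:

    hive> CREATE TABLE sales (id INT, product STRING, amount DOUBLE, sale_date STRING)
        > ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
    hive> LOAD DATA INPATH '/user/cloudera/sales/part-m-00000' INTO TABLE sales;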


11) Checking that the table has been created in Hive
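
The table's files should appear under the Hive warehouse directory (the default path shown is configuration-dependent):

    $ hdfs dfs -ls /user/hive/warehouse/sales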

12) Connecting to Hive and showing the tables
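
Opening the Hive shell and listing the tables:

    $ hive
    hive> SHOW TABLES;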

13) Executing queries in Hive
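
Two illustrative queries over the assumed columns:

    hive> SELECT * FROM sales LIMIT 5;
    hive> SELECT product, SUM(amount) AS total_sales FROM sales GROUP BY product;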


14) Writing a query to add a new tuple to the Hive table
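
Hive supports row-level inserts from version 0.14 onward; the values below are made up for illustration:

    hive> INSERT INTO TABLE sales VALUES (101, 'Keyboard', 799.00, '2023-01-15');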

15) Checking that the row was added
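
Querying for the newly inserted row:

    hive> SELECT * FROM sales WHERE id = 101;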


16) Exporting the table back to MySQL
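
Sqoop export expects the target table to already exist in MySQL. A sketch assuming a pre-created sales_export table and the comma-delimited warehouse files from the earlier steps:

    $ sqoop export \
        --connect jdbc:mysql://localhost/salesdb \
        --username root -P \
        --table sales_export \
        --export-dir /user/hive/warehouse/sales \
        --input-fields-terminated-by ',' \
        -m 1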

17) Checking that the rows were exported to the MySQL table
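
Back in MySQL, the exported rows should be visible:

    mysql> SELECT * FROM sales_export LIMIT 10;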
