Session9 DataIngestion SQOOP
LINUX --> 20 COMMANDS
HDFS --> 15 COMMANDS
===========================================
ETL or ELT (BD)
============================================
EXTRACT TERMINOLOGY -->
INGESTION PHASE --> TAKING THE DATA FROM THE SOURCE SYSTEM and keeping it in the DATA LAKE
TALEND
INFORMATICA
SQOOP --> THIS ONE --> SIMPLE --> BD projects --> DATA INGESTION TOOL (CLOUDERA, CLOUDXLAB)
SPARK SQL PULL (INGESTION)
KAFKA (BD and REAL TIME)
SSIS
AWS GLUE
AZURE ADF
AZURE SYNAPSE ANALYTICS
APACHE FLINK
============================
SQOOP --> SQL + HADOOP --> TOOL which takes the data from an RDBMS to HDFS
Q) WHERE IS SQOOP INSTALLED?
a) EDGE NODE
b) HDFS
c) BOTH
d) NONE
DETAILS NEEDED TO CONNECT TO THE SOURCE:
1) HOST ID
2) USERNAME
3) PASSWORD
4) DATABASE
5) TABLE NAME
6) ACCESS KEY
7) S3 bucket name
VALUES FOR THIS SESSION (CLOUDXLAB):
1) HOST ID
2) username
3) password
4) retail_db
5) customers
6) HDFS directory
SQOOP -->
1) IMPORT --> RDBMS to HDFS
2) EXPORT --> HDFS TO RDBMS
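As a sketch, the two directions look like this. Host, user, password, and paths here are placeholders (not the class environment), so substitute real values before running:

```shell
# IMPORT: RDBMS table --> HDFS directory (placeholder host/credentials).
sqoop import \
  --connect jdbc:mysql://<host>/retail_db \
  --username <user> --password <password> \
  --table customers \
  --target-dir /user/<hdfs-user>/customers \
  -m 1

# EXPORT: HDFS directory --> an existing RDBMS table (placeholders again).
# The target table must already exist in the database.
sqoop export \
  --connect jdbc:mysql://<host>/retail_db \
  --username <user> --password <password> \
  --table customers_copy \
  --export-dir /user/<hdfs-user>/customers
```

Note that `-m 1` runs the import with a single map task; without it, Sqoop splits the work across mappers using the table's primary key.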
CLOUDERA -->
============================================
CLOUDXLAB -->
1) open mysql
mysql command -->
mysql -h cxln2.c.thelab-240901.internal -u sqoopuser -pNHkkP876rp
2) Go inside retail_db:
use retail_db;
3) Run the sqoop import (from the edge node shell, not the mysql prompt):
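Before importing, it helps to confirm the table exists and check its size. A quick way, reusing the same host and credentials from step 1 (output depends on the current state of retail_db):

```shell
# List tables and count rows in customers before running the import.
mysql -h cxln2.c.thelab-240901.internal -u sqoopuser -pNHkkP876rp \
  -e "USE retail_db; SHOW TABLES; SELECT COUNT(*) FROM customers;"
```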
sqoop import --connect jdbc:mysql://cxln2.c.thelab-240901.internal/retail_db \
  --username sqoopuser --password NHkkP876rp \
  --table customers -m 1 \
  --target-dir /user/gadirajumidhun2082/Sqoop_B17_MIDHUN
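Once the import finishes, the result can be checked from the same edge node; the directory name below is the one used in the command above. With `-m 1` Sqoop writes a single map-output file (`part-m-00000`):

```shell
# List the files Sqoop wrote to the target directory.
hdfs dfs -ls /user/gadirajumidhun2082/Sqoop_B17_MIDHUN

# Peek at the first few imported rows.
hdfs dfs -cat /user/gadirajumidhun2082/Sqoop_B17_MIDHUN/part-m-* | head -5
```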
==============================================
1) SQOOP COMMAND
2) SQOOP IMPORT ARCHITECTURE
=============================================