HADOOP Notes Unit 3 and 4
The data flow in a Hadoop ecosystem describes the stages through which data passes from its source to processing and finally to output. The flow typically follows a path from data ingestion to storage, then processing, and finally output.
1. Data Ingestion:
o Data is collected from various sources and ingested into the Hadoop system.
o This step involves tools like Flume, Sqoop, and custom data ingestion processes.
2. Data Storage:
o HDFS is the backbone storage system that supports scalable and fault-tolerant data storage.
3. Data Processing:
o Frameworks such as MapReduce, Hive, or Spark process the data stored in HDFS.
4. Data Output:
o The processed results are written back to HDFS or handed to downstream tools for further analysis and reporting.
A company might use Flume to ingest logs from servers into HDFS. The data is processed
using MapReduce or Hive, and finally, the processed data is output to HDFS, or it might be
used for further analysis in tools like Apache Spark.
2. Data Ingestion with Flume and Sqoop
Data ingestion refers to the process of collecting and importing data into Hadoop. Two
commonly used tools for this task are Flume and Sqoop.
Flume:
Apache Flume is a distributed, reliable, and available service for collecting, aggregating, and
moving large amounts of streaming data into HDFS.
Stream Processing: Flume can ingest streaming data from many sources, such as application logs, social media feeds, and sensors.
Reliability: Flume guarantees reliable delivery through transactional hand-offs between its Source, Channel, and Sink components.
Scalability: Flume is highly scalable and can be extended with custom plugins.
Integration: Flume integrates well with Hadoop, HDFS, and other data storage systems.
Flume Architecture:
Source: Receives events from data generators (for example, web server logs) and places them on a Channel.
Channel: The medium through which data flows between Source and Sink (e.g., memory, file, or database).
Sink: Removes events from the Channel and delivers them to the destination, typically HDFS.
Example: collecting logs from multiple web servers and ingesting them into HDFS, as sketched in the configuration below.
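A minimal single-agent configuration sketch (the agent name webagent, the spool directory, and the HDFS path are illustrative assumptions, not values from these notes):

# Source: pick up log files the web servers drop into a spooling directory
webagent.sources = logsrc
webagent.channels = memch
webagent.sinks = hdfssink
webagent.sources.logsrc.type = spooldir
webagent.sources.logsrc.spoolDir = /var/log/webservers
webagent.sources.logsrc.channels = memch
# Channel: buffer events in memory between the source and the sink
webagent.channels.memch.type = memory
webagent.channels.memch.capacity = 10000
# Sink: write the events into HDFS
webagent.sinks.hdfssink.type = hdfs
webagent.sinks.hdfssink.channel = memch
webagent.sinks.hdfssink.hdfs.path = hdfs://namenode:8020/data/weblogs
webagent.sinks.hdfssink.hdfs.fileType = DataStream

The agent would then be started with something like: flume-ng agent --name webagent --conf-file webagent.conf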
Sqoop:
Apache Sqoop is a tool designed for efficiently transferring bulk data between Hadoop and
relational databases like MySQL, PostgreSQL, and Oracle.
Bulk Data Transfer: Sqoop allows efficient bulk transfers of data using parallelism.
Data Types: Sqoop supports a wide range of data types and maps RDBMS column types to the corresponding Hadoop/Java types during import.
Sqoop Operations:
Sqoop supports two core operations, import and export, both invoked from the command line as shown in the examples below.
Sqoop Features:
Data Import: Efficiently imports large datasets from an RDBMS into HDFS.
Data Export: Exports processed data from Hadoop back into relational databases for downstream consumption.
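Example commands (a sketch only; the JDBC connection string, table names, and directories are hypothetical):

sqoop import --connect jdbc:mysql://dbhost/sales --username etl --password-file /user/etl/.dbpass --table orders --target-dir /data/orders --num-mappers 4
sqoop export --connect jdbc:mysql://dbhost/sales --username etl --password-file /user/etl/.dbpass --table order_summary --export-dir /data/order_summary

Sqoop runs each transfer as parallel map tasks (four in the import above), which is where its bulk-transfer efficiency comes from.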
Hadoop Archives (HAR) are used to bundle multiple files into a single archive file in HDFS. HAR files help reduce the overhead associated with having too many small files in HDFS; an example of creating and accessing an archive follows the feature list below.
Key Features:
Storage Optimization: HAR files reduce the number of files in HDFS and
optimize storage.
Efficient Access: Improves data retrieval and access when working with many
small files.
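An archive is created with the hadoop archive command and read back through the har:// URI scheme (the paths below are illustrative):

hadoop archive -archiveName logs.har -p /user/data/logs /user/data/archives
hdfs dfs -ls har:///user/data/archives/logs.har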
Hadoop I/O focuses on how data is compressed, serialized, and stored efficiently for
processing and storage in Hadoop. This includes the handling of compression, serialization,
and formats like Avro and other file-based data structures.
Compression in Hadoop I/O
Compression in Hadoop helps reduce the amount of disk space required for storing data and
improves the I/O performance of the system by reducing network bandwidth and disk
storage requirements.
1. Gzip: Commonly used compression format in Hadoop, particularly for text files.
2. Bzip2: Offers better compression than Gzip but has slower performance.
3. LZO: Provides a good balance between compression ratio and speed, used in real-time applications.
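As a sketch of how a codec is used through Hadoop's compression framework (the output file name is hypothetical; GzipCodec ships with Hadoop):

import java.io.FileOutputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class CompressDemo {
    public static void main(String[] args) throws Exception {
        // Instantiate a codec through Hadoop's codec framework
        Configuration conf = new Configuration();
        CompressionCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
        // Wrap a plain output stream so everything written is gzip-compressed
        try (OutputStream out = codec.createOutputStream(new FileOutputStream("data.txt.gz"))) {
            out.write("hello hadoop compression\n".getBytes("UTF-8"));
        }
    }
}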
Serialization is the process of converting an object into a byte stream for storage or transmission, and deserialization is the reverse process. Hadoop's main serialization options include:
1. Writable: Hadoop's native serialization mechanism, defined by the Writable interface and used for MapReduce keys and values (e.g., IntWritable, Text).
2. Avro: A more flexible serialization format, offering compact storage and schema-based data serialization. It is ideal for complex data structures and is often used in Big Data systems.
Example of Serialization with Writable:
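A minimal sketch of serializing and deserializing one of Hadoop's built-in Writable types through an in-memory byte stream (the class name and value are arbitrary):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;

public class WritableDemo {
    public static void main(String[] args) throws IOException {
        // Serialize an IntWritable into a byte stream
        IntWritable original = new IntWritable(163);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        original.write(out);                // Writable.write() performs the serialization
        out.close();

        // Deserialize the bytes back into a fresh IntWritable
        IntWritable restored = new IntWritable();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        restored.readFields(in);            // Writable.readFields() performs the deserialization
        System.out.println(restored.get()); // prints 163
    }
}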
4. Avro
Avro is a binary serialization format used by Hadoop for encoding data. It provides a
compact, fast, and efficient way of encoding data, and is particularly useful for serializing
data that can be written to disk or transferred over the network.
Schema-based: Avro data is always serialized with its schema, ensuring the
data’s structure is always known and consistent.
Data Evolution: Avro supports schema evolution, meaning new fields can be
added to data without breaking compatibility with older versions.
2. Reading Avro Data:
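A minimal sketch using Avro's generic Java API (the file name users.avro is an assumption; the schema is read from the file's own header):

import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;

public class ReadAvro {
    public static void main(String[] args) throws IOException {
        // No separate schema file is needed: Avro stores the writer's schema in the data file.
        DatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        try (DataFileReader<GenericRecord> fileReader =
                 new DataFileReader<>(new File("users.avro"), datumReader)) {
            for (GenericRecord record : fileReader) {
                System.out.println(record);   // each record prints in a JSON-like form
            }
        }
    }
}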
File-based data structures are formats used for storing data on HDFS or similar distributed
file systems. These structures help in organizing data for better access, processing, and
compression. Some common file-based data structures are:
1. SequenceFile:
SequenceFile is a flat file format used for storing binary key/value pairs.
It is highly efficient for sequential access and works well for intermediate
MapReduce results.
Example:
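A sketch of writing and then reading key/value pairs (the path and the records are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path("/data/demo.seq");   // hypothetical path

        // Write a few IntWritable/Text pairs
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(IntWritable.class),
                SequenceFile.Writer.valueClass(Text.class))) {
            for (int i = 0; i < 3; i++) {
                writer.append(new IntWritable(i), new Text("record-" + i));
            }
        }

        // Read the pairs back sequentially
        try (SequenceFile.Reader reader = new SequenceFile.Reader(conf,
                SequenceFile.Reader.file(path))) {
            IntWritable key = new IntWritable();
            Text value = new Text();
            while (reader.next(key, value)) {
                System.out.println(key + "\t" + value);
            }
        }
    }
}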
2. Parquet:
Parquet is a columnar storage format for the Hadoop ecosystem.
Optimized for Read-heavy Workloads: Great for use in data warehouses and for analytical queries.
Efficient Compression: Parquet’s columnar storage format allows for
higher compression ratios compared to row-based formats.
3. ORC (Optimized Row Columnar):
ORC is a columnar format designed to optimize read and write performance in Hive.
Benefits of ORC:
Efficient Compression: ORC files can reduce storage space by up to 75% compared
to text-based formats.
Efficient Querying: ORC enables faster query performance for Hive queries due to
its efficient indexing and predicate pushdown.
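For example, a Hive table can be stored in ORC simply by declaring the format (the table and columns below are illustrative):

CREATE TABLE sales_orc (id INT, amount DOUBLE, region STRING)
STORED AS ORC;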
MODULE- 4
1. Hadoop Ecosystem
The Hadoop Ecosystem refers to a suite of tools and frameworks that extend the
capabilities of the Hadoop platform. It consists of various components, each serving a
specific purpose, ranging from data ingestion, processing, and storage to resource
management, workflow
orchestration, and data analysis.
1. Hadoop Common: The foundational libraries and utilities that support other
Hadoop modules.
7. HBase: A distributed NoSQL database that runs on top of HDFS and is suitable
for real-time random access to large datasets.
8. Sqoop: A tool for transferring data between relational databases and HDFS.
9. Flume: A distributed service for collecting, aggregating, and moving log data or real-time data into HDFS.
12. Mahout: A machine learning library built on top of Hadoop for scalable algorithms.
These components work together to form a robust environment for storing, processing, and
analyzing large-scale data.
2. Pig: Introduction to PIG
Apache Pig is a platform for analyzing large datasets. It is built on top of the Hadoop
ecosystem and provides a simpler way to write MapReduce programs. Pig is often used for
ETL (Extract, Transform, Load) tasks.
Extensibility: Pig allows users to write custom User Defined Functions (UDFs)
to extend its capabilities.
Data Processing: It is designed for batch processing of large datasets rather than low-latency, real-time queries.
With Pig, the developer can focus more on the logic and transformations
without worrying about the intricacies of the MapReduce programming model.
Pig can run in different modes based on the needs of the user and the environment in which
it is deployed. The execution modes define how and where Pig queries are executed.
1. Local Mode:
Description: In Local Mode, Pig runs in a single JVM (Java Virtual Machine),
typically on a local machine or a single node.
Use Case: Useful for small-scale datasets or for development and testing purposes.
Limitations: Limited scalability as it runs on a single node and does not take
full advantage of Hadoop’s distributed nature.
2. MapReduce Mode:
Description: In MapReduce Mode (the default), Pig runs on a Hadoop cluster and executes scripts as MapReduce jobs.
Use Case: Ideal for large-scale data processing when the system needs to scale
out and distribute the workload.
Execution: Pig scripts are converted into MapReduce jobs and are executed
across the Hadoop cluster.
3. Tez Mode:
Use Case: Used when higher performance and optimization are needed,
especially for complex queries.
Execution: Tez runs on top of YARN and supports more efficient DAG-based processing than the traditional MapReduce model.
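The execution mode is selected with the -x (exectype) flag when launching Pig; the script name below is hypothetical:

pig -x local myscript.pig        # Local Mode: single JVM, local file system
pig -x mapreduce myscript.pig    # MapReduce Mode: runs on the Hadoop cluster
pig -x tez myscript.pig          # Tez Mode: runs on the cluster using the Tez engine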
Pig and traditional databases (such as SQL databases) differ in their design, purpose, and capabilities. The key differences include:
Language: Pig Latin is a procedural data-flow language, whereas SQL is declarative.
Schema: Pig applies a schema at read time and tolerates loosely structured data, while traditional databases enforce a schema when data is written.
Workload: Pig targets batch analysis of very large datasets on HDFS, while traditional databases are optimized for low-latency queries, transactions, and updates.
5. Grunt
Grunt is the interactive shell that comes with Pig, which allows users to run Pig Latin scripts
interactively from the command line. Grunt is useful for testing Pig Latin scripts and for
performing ad-hoc data exploration.
Key Features:
Access to Pig Functions: Users can access Pig’s built-in functions and libraries
directly through Grunt.
For example (assuming a small sample file named data.txt):
grunt> A = LOAD 'data.txt' AS (name:chararray, age:int);
grunt> DUMP A;
6. Pig Latin
Pig Latin is a data flow language used to write Pig scripts. It is procedural in nature, meaning
that it allows users to describe the sequence of operations to perform on the data.
Key Features:
Simplicity: Pig Latin scripts are simpler to write than MapReduce code and are
more readable.
Optimized Execution: Pig Latin scripts are automatically optimized by the Pig
engine for efficient execution.
Basic Syntax:
LOAD: Reads data from HDFS into a relation.
FILTER: Selects the tuples that satisfy a condition.
GROUP: Groups tuples by a key.
DUMP: Displays the output of the query.
Example (the file students.txt and its fields are assumptions used for illustration):
A = LOAD 'students.txt' USING PigStorage(',') AS (name:chararray, age:int);
B = FILTER A BY age > 18;
C = GROUP B BY name;
DUMP C;
7. User Defined Functions (UDFs)
User Defined Functions (UDFs) are custom functions that can be written in Java, Python, or Ruby to extend the capabilities of Pig Latin. UDFs allow you to implement complex business logic that is not available through the built-in operators; a minimal Java Eval UDF is sketched after the list below.
Types of UDFs:
1. Eval Functions: These are used to transform the data (e.g., changing data types
or performing calculations).
2. Load/Store Functions: These are used for custom data loading and
storing mechanisms.
3. Bag and Tuple UDFs: These allow you to manipulate complex data structures
like bags, tuples, and maps.
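A minimal Eval UDF sketch in Java (the class name ToUpper and the jar name below are assumptions):

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Eval UDF that upper-cases its first chararray argument.
public class ToUpper extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;                      // pass nulls through untouched
        }
        return ((String) input.get(0)).toUpperCase();
    }
}

Once packaged into a jar, the function is made available with REGISTER myudfs.jar; and called like any built-in, e.g. B = FOREACH A GENERATE ToUpper(name);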
8. Data Processing Operators in Pig
Pig provides several built-in operators for data processing. These operators allow users to
perform various types of transformations, aggregations, and joins on datasets.
Key Operators:
The main operators are illustrated by the following sequence (the file students.txt and its fields are assumed for illustration):
A = LOAD 'students.txt' USING PigStorage(',') AS (name:chararray, age:int); -- LOAD: read data into a relation
B = FILTER A BY age > 18;               -- FILTER: keep tuples that match a condition
C = GROUP B BY name;                    -- GROUP: group tuples by a key
D = FOREACH C GENERATE group, COUNT(B); -- FOREACH ... GENERATE: transform or aggregate
E = FOREACH B GENERATE name, age;       -- projection used by the statements below
F = ORDER E BY age;                     -- ORDER: sort by one or more fields
G = DISTINCT F;                         -- DISTINCT: remove duplicate tuples
H = CROSS A, B;                         -- CROSS: cross product of two relations
J = LIMIT E 100;                        -- LIMIT: restrict the number of output tuples