Apache Hadoop
Apache Hadoop (/həˈduːp/) is a collection of open-source software utilities that facilitate using a
network of many computers to solve problems involving massive amounts of data and
computation. It provides a software framework for distributed storage and processing of big
data using the MapReduce programming model. Originally designed for computer clusters built
from commodity hardware[3]—still the common use—it has also found use on clusters of higher-
end hardware.[4][5] All the modules in Hadoop are designed with a fundamental assumption that
hardware failures are common occurrences and should be automatically handled by the
framework.[2]
The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File
System (HDFS), and a processing part, which is the MapReduce programming model. Hadoop
splits files into large blocks and distributes them across nodes in a cluster. It then
transfers packaged code into nodes to process the data in parallel. This approach takes
advantage of data locality,[6] where nodes manipulate the data they have access to. This allows
the dataset to be processed faster and more efficiently than it would be in a more
conventional supercomputer architecture that relies on a parallel file system where computation
and data are distributed via high-speed networking.[7][8]
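To illustrate the processing model described above, the sketch below shows the canonical word-count job written against Hadoop's Java MapReduce API (the org.apache.hadoop.mapreduce classes). It is a minimal example rather than a production configuration: the map tasks run on the nodes holding the input blocks and emit (word, 1) pairs, and the reduce tasks sum the counts for each word after the shuffle. The input and output HDFS paths are assumed to be passed as command-line arguments.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map phase: runs next to each HDFS block (data locality) and
      // emits a (word, 1) pair for every token in the input line.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reduce phase: after the shuffle, sums all counts seen for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on map nodes
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

A job like this is typically packaged into a JAR and submitted with the hadoop jar command; the framework then schedules the map and reduce tasks across the cluster, preferring nodes that already hold the relevant data blocks.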
The base Apache Hadoop framework is composed of the following modules:
Hadoop Common – contains libraries and utilities needed by other Hadoop modules;
Hadoop Distributed File System (HDFS) – a distributed file system that stores data on
commodity machines, providing very high aggregate bandwidth across the cluster;
Hadoop YARN – introduced in 2012, a platform responsible for managing computing
resources in clusters and using them to schedule users' applications;[9][10]