Danairat T., 2013, danairat@gmail.com: Big Data Hadoop – Hands On Workshop
Setting up Hadoop Clustering
Hands-On Workshop
Danairat T.
Line ID: Danairat
FB: Danairat Thanabodithammachari
+668-1559-1446, danairat@gmail.com, Certified Java Programmer
Big Data Introduction
Volume: GB, TB, PB, EB, ZB
Variety: DB Table, Delimited Text, XML, HTML, Free Form Text, Image, Music, Video, Binary
Velocity: Batch, Near real time, Real time
Big Data Architecture
Big Data Applications: Fraud Analysis, Cyber Security, Talent Search
BI/Report, Next Best Action
Analytics: Predictive, Descriptive, Prescriptive
Big Data Platform: Distributed Data Processing, Integration and Metadata Framework, Distributed Data Store and DWH, Monitoring and Management Framework, Security Framework
Big Data Infrastructure: Hardware, Storage, Network
Hadoop Timeline
Apache Hadoop Core Technology
j2eedev.org/ecosystem-hadoop
Apache Hadoop Ecosystem
j2eedev.org/ecosystem-hadoop
Big Data Platform & Big Data Analytics
Hadoop Technology
HDFS: Hadoop Distributed File System
Block Size = 64 MB (the Hadoop 1.x default; Hadoop 2.x defaults to 128 MB)
Replication Factor = 3
Cost/GB is a few ¢/month vs $/month
apache.org/hadoop/
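To make the block size and replication factor concrete, here is a small arithmetic sketch. The 1024 MB file size is a hypothetical example; the 64 MB block size and replication factor of 3 are the values from the slide.

```shell
# Hypothetical file size; block size and replication factor are taken from the slide.
file_mb=1024
block_mb=64
replication=3

# Number of HDFS blocks the file is split into (rounded up).
blocks=$(( (file_mb + block_mb - 1) / block_mb ))

# Every block is stored 'replication' times across the DataNodes.
replicas=$(( blocks * replication ))

# Raw disk space consumed across the cluster.
raw_mb=$(( file_mb * replication ))

echo "blocks=$blocks replicas=$replicas raw_mb=$raw_mb"
```

With these numbers the file occupies 16 blocks, 48 block replicas and 3072 MB of raw storage.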
YARN: Yet Another Resource Negotiator
hadoop.apache.org
MRv2 maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of MRv2 with just a recompile.
Hadoop 1.0 vs Hadoop 2.0
Hortonworks.com
Hadoop 2
Hortonworks.com
Hadoop Symbols and Reasons Behind
Clone the Hadoop master to slave1 and slave2
Nodes: master, slave1, slave2
At master node: edit the hosts file
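The screenshot for this step is not part of the transcript, so the following is only a sketch of what the hosts entries might look like. It assumes the names master, slave1 and slave2; the master address is a hypothetical placeholder, and the slave addresses are inferred from the scp commands later in the deck. The entries are drafted in a scratch file so nothing is changed until you have reviewed them.

```shell
# Draft the entries in a scratch file first; nothing here touches /etc/hosts.
hosts_draft=$(mktemp)

cat > "$hosts_draft" <<'EOF'
172.31.0.10   master   # hypothetical address, replace with the master's private IP
172.31.1.8    slave1   # inferred from the host name ip-172-31-1-8
172.31.15.16  slave2   # address used in the scp command
EOF

# Review the draft before applying it.
cat "$hosts_draft"

# After reviewing, the draft could be appended with:
#   sudo sh -c "cat '$hosts_draft' >> /etc/hosts"
```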
At master node: copy the public key file to slave1 and slave2
scp /home/ubuntu/.ssh/id_dsa.pub ip-172-31-1-8:/home/ubuntu/.ssh/master.pub
scp /home/ubuntu/.ssh/id_dsa.pub 172.31.15.16:/home/ubuntu/.ssh/master.pub
After this slide, three cascaded windows represent the master, slave1 and slave2 nodes.
At slave1 and slave2: cat /home/ubuntu/.ssh/master.pub >> /home/ubuntu/.ssh/authorized_keys
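A slightly more defensive version of the append step might look like the sketch below. It runs against a scratch directory that stands in for /home/ubuntu/.ssh, and the key text is a placeholder rather than a real key. Note also that the slides copy an id_dsa.pub key; newer OpenSSH releases disable DSA keys by default, so an RSA or Ed25519 key may be needed there.

```shell
# Scratch directory stands in for /home/ubuntu/.ssh so the sketch is safe to run anywhere.
ssh_dir=$(mktemp -d)
echo "ssh-rsa AAAAexample ubuntu@master" > "$ssh_dir/master.pub"   # placeholder, not a real key
touch "$ssh_dir/authorized_keys"

# Append the key only if it is not already present, so re-running adds no duplicates.
grep -qxF -f "$ssh_dir/master.pub" "$ssh_dir/authorized_keys" \
  || cat "$ssh_dir/master.pub" >> "$ssh_dir/authorized_keys"

# With StrictModes on (the sshd default), key files writable by other users are rejected.
chmod 700 "$ssh_dir"
chmod 600 "$ssh_dir/authorized_keys"
```

On the real nodes, replace the scratch directory with /home/ubuntu/.ssh and skip the placeholder key line.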
At master: test ssh to slave1 and slave2
$ ssh ip-172-31-1-8
$ exit
$ ssh ip-172-31-15-16
$ exit
At master: add slave1 and slave2 to the Hadoop slaves file
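As a rough illustration, the slaves file is a plain list with one worker host per line. The sketch assumes the names slave1 and slave2 resolve from the master; the workshop may have used the EC2 host names instead. It writes to a scratch file rather than the real configuration.

```shell
# Draft the worker list in a scratch file; one host name per line.
slaves_draft=$(mktemp)
printf '%s\n' slave1 slave2 > "$slaves_draft"

# Review the result.
cat "$slaves_draft"

# In a Hadoop 2.x install the file read by start-dfs.sh and start-yarn.sh
# lives in the configuration directory as etc/hadoop/slaves.
```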
At master: edit hdfs-site.xml
At master: edit hdfs-site.xml for 2 replication servers
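The edited file is only shown as a screenshot in the deck, so this is a guess at the relevant part: it assumes "2 replication servers" means setting the standard dfs.replication property to 2. The fragment is written to a scratch directory; on a real node, merge the property into the existing hdfs-site.xml rather than replacing the file, since that file normally carries other properties too.

```shell
# Write a minimal draft of the property to a scratch directory for review.
conf_draft=$(mktemp -d)

cat > "$conf_draft/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Assumed setting: keep two copies of every block, one per slave DataNode. -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF

cat "$conf_draft/hdfs-site.xml"
```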
At all nodes: remove the NameNode and DataNode directories
At master: format the NameNode
At master: execute start-dfs.sh
At slave1 and slave2: check the jps result; DataNode should be running
At master: execute start-yarn.sh
At slave1 and slave2: check the jps result; NodeManager should be running
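The jps checks can also be scripted instead of read by eye. The helper below is a hypothetical convenience function, not part of Hadoop; it relies on jps printing one "PID ClassName" pair per line. The sample output and its process IDs are illustrative only.

```shell
# Succeeds if the named daemon appears in jps-style output read from stdin.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Illustrative sample of what jps might print on a slave node.
sample_jps_output='2345 DataNode
2501 NodeManager
2799 Jps'

for daemon in DataNode NodeManager; do
  if printf '%s\n' "$sample_jps_output" | has_daemon "$daemon"; then
    echo "$daemon is running"
  else
    echo "$daemon is NOT running"
  fi
done
```

On a real node the sample text would be replaced by the live output, for example `jps | has_daemon DataNode`.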
Importing data into HDFS Cluster
At master: import data into HDFS
At slave1 and slave2: review the imported data in HDFS
Running MapReduce in Cluster Mode
At master: execute the YARN MapReduce program
At slave1 and slave2: you will see the Application Master and YARN child containers
At master: review the output file from HDFS
At slave1 and slave2: review the output file from HDFS using the command:
hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000
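If the job was the stock word count example, as the output directory name suggests, each output line should be a word and its count separated by a tab. Under that assumption, a small hypothetical helper can rank the most frequent words; the sample input below is illustrative, not actual job output.

```shell
# Sort "word<TAB>count" lines by count, highest first, and keep the top N.
top_words() {
  # $1 = how many rows to keep (defaults to 10)
  sort -t "$(printf '\t')" -k2,2nr | head -n "${1:-10}"
}

# Illustrative sample in the assumed output format.
printf 'hadoop\t3\nyarn\t5\nhdfs\t2\n' | top_words 2
```

Against the cluster, the same helper could be fed from the command above, for example `hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000 | top_words 20`.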
At master: review the output result data from the web console
Stopping Hadoop Cluster
At master: execute stop-yarn.sh
At slave1 and slave2: use jps to confirm that NodeManager has stopped
At master: execute stop-dfs.sh
At slave1 and slave2: use jps to confirm that DataNode has stopped
Thank you very much
