MapReduce_commands

# load Hadoop module

--------------------

module load Hadoop/2.6.0-cdh5.8.0-native

# find out where Hadoop is installed (variable $HADOOP_HOME)


echo $HADOOP_HOME
#/opt/apps/software/Hadoop/2.6.0-cdh5.8.0-native/share/hadoop/mapreduce

# find the streaming library


find /opt/apps/software/Hadoop/2.6.0-cdh5.8.0-native -name "hadoop-streaming*jar"
# . . .
#/opt/apps/software/Hadoop/2.6.0-cdh5.8.0-native/share/hadoop/tools/lib/hadoop-streaming-2.6.0-cdh5.8.0.jar

# save library in the variable $STREAMING


export STREAMING=/opt/apps/software/Hadoop/2.6.0-cdh5.8.0-native/share/hadoop/tools/lib/hadoop-streaming-2.6.0-cdh5.8.0.jar

# start a simple MapReduce job
#-----------------------------

# Simple job
############

# remove the output directory if it already exists (a job fails when its output path exists)


hdfs dfs -rm -r output

# copy the file to HDFS


hdfs dfs -put wiki_1k_lines

# launch MapReduce job


hadoop jar $STREAMING \
-input wiki_1k_lines \
-output output \
-mapper /bin/cat \
-reducer '/bin/wc -l'
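
# (this job simply counts lines: the cat mapper passes every line through
# and the single default reducer counts the lines it receives with wc -l)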

# check if job was successful (output should contain a file named _SUCCESS)
hdfs dfs -ls output
# check result
hdfs dfs -cat output/part-00000

# Simple job with 4 mappers
###########################

hdfs dfs -rm -r output

# launch MapReduce job
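# (note: mapreduce.job.maps is only a hint to the framework; the actual
# number of map tasks also depends on how the input is split)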


hadoop jar $STREAMING \
-D mapreduce.job.maps=4 \
-input wiki_1k_lines \
-output output \
-mapper /bin/cat \
-reducer '/bin/wc -l'

# Wordcount with MapReduce
##########################

# use mapper.py and reducer.py


# mini-test of mapper and reducer
echo "carrot carrot apple carrot" | ./mapper.py | sort -k1 | ./reducer.py

# run wordcount job


# upload the input file to HDFS (skip if already uploaded above;
# -put fails if the target file already exists)
hdfs dfs -put data/wiki_1k_lines
# remove output directory
hdfs dfs -rm -r output

# note: -files takes a comma-separated list; repeating the option would
# override the earlier value
hadoop jar $STREAMING \
-files mapper.py,reducer.py \
-mapper mapper.py \
-reducer reducer.py \
-input wiki_1k_lines \
-output output

# check if output contains _SUCCESS


hdfs dfs -ls output
# check result
hdfs dfs -cat output/part-00000|head

# sort output by frequency


hdfs dfs -cat output/part-00000|sort -k2nr|head

# use swap_keyval.py to turn each "word<TAB>count" line into "count<TAB>word"

# remove output2 if it already exists (might not be necessary on a first run)
hdfs dfs -rm -r output2

hadoop jar $STREAMING \
-files swap_keyval.py \
-input output \
-output output2 \
-mapper swap_keyval.py
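
# A sketch of what swap_keyval.py might look like (assumed implementation;
# the actual course script may differ): it swaps the two tab-separated
# fields so that the count becomes the key the framework sorts on.

# --- swap_keyval.py (sketch) ---
#!/usr/bin/env python3
import sys

# turn "word<TAB>count" into "count<TAB>word"
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t", 1)
    print(value + "\t" + key)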

# check if output2 contains _SUCCESS


hdfs dfs -ls output2
# check result

hdfs dfs -cat output2/part-00000|head


# 10021 his
# 1005 per
# 101 merely
# . . .
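
# note: the counts are compared as plain strings here, which is why
# "10021" sorts before "1005"; the key comparator below fixes that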

hdfs dfs -rm -r output2
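
# sort the job output by key with KeyFieldBasedComparator; the options
# -nr compare keys numerically (-n) in reverse order (-r)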

comparator_class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator

hadoop jar $STREAMING \
-D mapreduce.job.output.key.comparator.class=$comparator_class \
-D mapreduce.partition.keycomparator.options=-nr \
-files swap_keyval.py \
-input output \
-output output2 \
-mapper swap_keyval.py

hdfs dfs -cat output2/part-00000|head


# 193778 the
# 117170 of
# 89966 and
# 69186 in

# Run MapReduce examples
########################

# list all examples


hadoop jar $HADOOP_HOME/hadoop-mapreduce-examples-2.6.0-cdh5.8.0.jar
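# (running the jar without an example name prints the list of available
# example programs, e.g. wordcount, grep, pi, terasort)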
