Introduction to MapReduce Programming

• Introduction
• Mapper
• RecordReader
• Map
• Combiner
• Partitioner
• Reducer
• Shuffle
• Sort
• Reduce
• Output Format
• Combiner
• Partitioner
• Searching
• Sorting
• Compression
"The alchemists in their search for gold discovered many other things ofgreater value."
- Arthur Schopenhauer, German Philosopher
WHAT'S IN STORE?
We assume that you are familiar with the basic concepts of HDFS and MapReduce Programming discussed in Chapters 4 and 5. The focus of this chapter will be to build on this knowledge to understand optimization techniques of MapReduce Programming such as combiner, partitioner, and compression. We will also discuss how to write MapReduce Programming for sorting and searching.
We suggest you refer to some of the learning resources provided at the end of this chapter for better learning and comprehension.
Big Data and Analytics
8.1 INTRODUCTION
In MapReduce Programming, Jobs (Applications) are split into a set of map tasks and reduce tasks. Then these tasks are executed in a distributed fashion on the Hadoop cluster. Each task processes a small subset of data that has been assigned to it. This way, Hadoop distributes the load across the cluster. A MapReduce job takes a set of files that is stored in HDFS (Hadoop Distributed File System) as input.
A map task takes care of loading, parsing, transforming, and filtering. The responsibility of a reduce task is grouping and aggregating the data that is produced by map tasks to generate the final output. Each map task is broken into the following phases:
1. RecordReader
2. Mapper
3. Combiner
4. Partitioner
The output produced by a map task is known as intermediate keys and values. These intermediate keys and values are sent to the reducer. The reduce tasks are broken into the following phases:
1. Shuffle
2. Sort
3. Reducer
4. Output Format
Hadoop assigns map tasks to the DataNode where the actual data to be processed resides. This way, Hadoop ensures data locality. Data locality means that data is not moved over the network; only computational code is moved to process data, which saves network bandwidth.
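The phases above can be traced end to end with a small plain-Java simulation (illustrative only; this is not Hadoop code, and the class and method names are invented for the sketch): map emits (word, 1) pairs, shuffle/sort groups them by key, and reduce sums each group.

```java
import java.util.*;

public class WordCountFlow {
    // Map phase: emit an intermediate (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
        }
        return pairs;
    }

    // Shuffle and sort: group intermediate values by key, with keys in sorted order.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: aggregate the list of values for each key into a single count.
    static Map<String, Integer> reduce(SortedMap<String, List<Integer>> grouped) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) sum += v;
            out.put(e.getKey(), sum);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = reduce(shuffle(map("to be or not to be")));
        System.out.println(counts); // {be=2, not=1, or=1, to=2}
    }
}
```

In real Hadoop, the framework performs the shuffle and sort between the map and reduce phases; the simulation just makes the hand-off between the phases visible.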
8.2 MAPPER
A mapper maps the input key-value pairs into a set of intermediate key-value pairs. Maps are individual
tasks that have the responsibility of transforming input records into intermediate key-value pairs.
1. RecordReader: RecordReader converts a byte-oriented view of the input (as generated by the InputSplit) into a record-oriented view and presents it to the Mapper tasks. It presents the tasks with keys and values. Generally, the key is the positional information and the value is the chunk of data that constitutes the record.
2. Map: The map function works on the key-value pair produced by RecordReader and generates zero or more intermediate key-value pairs. The intermediate pairs are emitted through the MapReduce context.
3. Combiner: It is an optional function but provides high performance in terms of network bandwidth and disk space. It takes the intermediate key-value pairs provided by the mapper and applies a user-specific aggregate function to only that mapper. It is also known as a local reducer.
4. Partitioner: The partitioner takes the intermediate key-value pairs produced by the mapper, splits them into shards, and sends each shard to a particular reducer as per the user-specific code. Usually, all values for the same key go to the same reducer. The partitioned data of each map task is written to the local disk of that machine and pulled by the respective reducer.
8.3 REDUCER
The primary chore of the Reducer is to reduce a set of intermediate values (the ones that share a key) to a smaller set of values. The Reducer has three primary phases: Shuffle and Sort, Reduce, and Output Format.
1. Shuffle and Sort: This phase takes the output of all the partitioners and downloads them into the local machine where the reducer is running. Then these individual data pipes are sorted by keys, which produce larger data pipes. The main purpose of this sort is grouping similar words so that their values can be easily iterated over by the reduce task.
2. Reduce: The reducer takes the grouped data produced by the shuffle and sort phase, applies the reduce function, and processes one group at a time. The reduce function iterates all the values associated with that key. The reducer function provides various operations such as aggregation, filtering, and combining data. Once it is done, the output (zero or more key-value pairs) of the reducer is sent to the output format.
3. Output Format: The output format separates the key-value pair with a tab (default) and writes it out to a file using record writer.
The chores of Mapper, Combiner, Partitioner, and Reducer for the word count problem are discussed under "Combiner" and "Partitioner".
8.4 COMBINER
It is an optimization technique for a MapReduce job. Generally, the reducer class is set to be the combiner class. The difference between the combiner class and the reducer class is as follows:
1. Output generated by the combiner is intermediate data and it is passed to the reducer.
2. Output of the reducer is passed to the output file on disk.
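The saving a combiner buys can be seen in a small plain-Java simulation (not Hadoop code; the class and method names are invented for this sketch): pre-aggregating (word, count) pairs on the map side shrinks the number of intermediate records shipped to the reducer, while the final result is unchanged.

```java
import java.util.*;

public class CombinerEffect {
    // Without a combiner, every word in the split produces one intermediate record.
    static List<String> mapOutput(String line) {
        return Arrays.asList(line.toLowerCase().split("\\s+"));
    }

    // A combiner acts as a local reducer: it sums counts per word on the map side,
    // so each distinct word leaves the mapper as a single record.
    static Map<String, Integer> combine(List<String> words) {
        Map<String, Integer> local = new TreeMap<>();
        for (String w : words) local.merge(w, 1, Integer::sum);
        return local;
    }

    public static void main(String[] args) {
        List<String> raw = mapOutput("big data big ideas big jobs");
        Map<String, Integer> combined = combine(raw);
        System.out.println(raw.size());      // 6 records without the combiner
        System.out.println(combined.size()); // 4 records after the combiner
    }
}
```

The reducer then merges these partial counts exactly as it would merge the raw pairs, which is why the reducer class itself can usually serve as the combiner class.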
Objective: Write a MapReduce program to count the occurrence of similar words in a file. Use combiner for optimization.
Note: Refer Chapter 5 - Hadoop for Mapper Class and Reduce Class and Driver Program.
Input Data:
Act: In the driver program, set the combiner class as shown below.
job.setCombinerClass(WordCounterRed.class);
hadoop jar <jar name> <driver class> <input path> <output path>
Here driver class name, input path, and output path are optional arguments.
Output:
(HDFS browser screenshot: the job is run with "hadoop jar wordcount.jar ..." and the word count output is browsed in HDFS)
8.5 PARTITIONER
The partitioning phase happens after the map phase and before the reduce phase. Usually the number of partitions is equal to the number of reducers. The default partitioner is the hash partitioner.
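The core logic of the default hash partitioner can be sketched in plain Java; the formula below is the one used by Hadoop's HashPartitioner, though the surrounding class is invented for illustration.

```java
public class HashPartitionDemo {
    // The same formula Hadoop's default HashPartitioner uses:
    // mask off the sign bit, then take the remainder by the number of reducers.
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // The same key always lands in the same partition,
        // so all of its values meet at a single reducer.
        System.out.println(getPartition("hadoop", 4) == getPartition("hadoop", 4)); // true
        System.out.println(getPartition("hadoop", 4)); // some value in 0..3
    }
}
```

A custom partitioner, such as the alphabet-based one below, replaces this formula with user-specific routing while keeping the same contract: return a partition number in the range 0 to numReduceTasks - 1.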
Objective: Write a MapReduce program to count the occurrence of similar words in a file. Use a partitioner to partition keys based on the first letter of the word.
Note: Refer Chapter 5 - Hadoop for Mapper Class and Reduce Class and Driver Program.
Input Data:
Act:
WordCountPartitioner.java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class WordCountPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        String word = key.toString();
        char alphabet = word.toUpperCase().charAt(0);
        int partitionNumber = 0;
        switch (alphabet) {
            case 'A': partitionNumber = 1; break;
            case 'B': partitionNumber = 2; break;
            case 'C': partitionNumber = 3; break;
            case 'D': partitionNumber = 4; break;
            case 'E': partitionNumber = 5; break;
            case 'F': partitionNumber = 6; break;
            case 'G': partitionNumber = 7; break;
            case 'H': partitionNumber = 8; break;
            case 'I': partitionNumber = 9; break;
            case 'J': partitionNumber = 10; break;
            case 'K': partitionNumber = 11; break;
            case 'L': partitionNumber = 12; break;
            case 'M': partitionNumber = 13; break;
            case 'N': partitionNumber = 14; break;
            case 'O': partitionNumber = 15; break;
            case 'P': partitionNumber = 16; break;
            case 'Q': partitionNumber = 17; break;
            case 'R': partitionNumber = 18; break;
            case 'S': partitionNumber = 19; break;
            case 'T': partitionNumber = 20; break;
            case 'U': partitionNumber = 21; break;
            case 'V': partitionNumber = 22; break;
            case 'W': partitionNumber = 23; break;
            case 'X': partitionNumber = 24; break;
            case 'Y': partitionNumber = 25; break;
            case 'Z': partitionNumber = 26; break;
            default: partitionNumber = 0; break;
        }
        return partitionNumber;
    }
}
8.6 SEARCHING
Input Data:
1001,John,45
1002,Jack,39
1003,Alex,44
1004,Smith,38
1005,Bob,33
Act:
WordSearcher.java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordSearchMapper extends Mapper<LongWritable, Text, Text, Text> {
    static String keyword;
    static int pos = 0;

    protected void setup(Context context) throws IOException,
            InterruptedException {
        // ... (the remainder of the mapper is not legible in the source)
WordSearchReducer.java
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordSearchReducer extends Reducer<Text, Text, Text, Text> {
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            context.write(key, value);
        }
    }
}
Output:
1002,Jack,39    student.csv,2,5
(HDFS browser screenshot of the search output file part-r-00000)
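The searching job can be simulated in plain Java (not Hadoop code; all names here are invented for the sketch): scan records for the keyword and tag each hit with the file it came from and its position, much as the mapper above does with the FileSplit information.

```java
import java.util.*;

public class KeywordSearchDemo {
    // Scan records for a keyword; emit "record -> source,lineNumber" for each hit,
    // mimicking a search mapper that tags matches with their origin.
    static List<String> search(List<String> records, String keyword, String source) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < records.size(); i++) {
            if (records.get(i).contains(keyword)) {
                hits.add(records.get(i) + " -> " + source + "," + (i + 1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList(
                "1001,John,45", "1002,Jack,39", "1003,Alex,44");
        System.out.println(search(data, "Jack", "student.csv"));
        // [1002,Jack,39 -> student.csv,2]
    }
}
```

In the real job, many mappers run this scan in parallel, each over its own input split, and the reducer merely collects the matches.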
8.7 SORTING
Input Data:
1001,John,45
1002,Jack,39
1003,Alex,44
1004,Smith,38
1005,Bob,33
Act:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SortStudNames {
    public static class SortMapper extends
            Mapper<LongWritable, Text, Text, Text> {
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] token = value.toString().split(",");
            context.write(new Text(token[1]), new Text(token[0] + " - " + token[1]));
        }
    }
    // here, the value is sorted by key (the student name)
    public static class SortReducer extends
            Reducer<Text, Text, NullWritable, Text> {
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // (body reconstructed: emit each record, now ordered by name)
            for (Text value : values) {
                context.write(NullWritable.get(), value);
            }
        }
    }
}
Output:
(HDFS browser screenshot of the sorted output file part-r-00000)
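The sorting trick rests on the shuffle phase: whatever the mapper emits as the key reaches the reducer in sorted order. A plain-Java simulation of that idea (not Hadoop code; the names are invented for the sketch):

```java
import java.util.*;

public class SortByNameDemo {
    // Emitting the student name as the key makes the framework's
    // shuffle/sort phase deliver the records ordered by name.
    static List<String> sortByName(List<String> records) {
        SortedMap<String, String> byKey = new TreeMap<>();
        for (String record : records) {
            String[] token = record.split(",");
            byKey.put(token[1], token[0] + " - " + token[1]); // key = name
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<String> out = sortByName(Arrays.asList(
                "1001,John,45", "1002,Jack,39", "1003,Alex,44",
                "1004,Smith,38", "1005,Bob,33"));
        System.out.println(out);
        // [1003 - Alex, 1005 - Bob, 1002 - Jack, 1001 - John, 1004 - Smith]
    }
}
```

This is why the mapper in SortStudNames writes token[1] (the name) as the key: no explicit sort is coded, the framework's sort on keys does the work.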
8.8 COMPRESSION
In MapReduce programming, you can compress the MapReduce output file. Compression provides two benefits as follows:
1. Reduces the space to store files.
2. Speeds up data transfer across the network.
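Both benefits can be illustrated with Java's built-in gzip support (plain Java, independent of Hadoop; gzip is the same algorithm behind Hadoop's GzipCodec):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipSavings {
    // Compress a byte array with gzip, the algorithm wrapped by Hadoop's GzipCodec.
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Repetitive output, like word count results, compresses very well.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.append("hadoop\t1\n");
        byte[] raw = sb.toString().getBytes();
        byte[] packed = gzip(raw);
        System.out.println(raw.length);                 // 9000 bytes uncompressed
        System.out.println(packed.length < raw.length); // true: less space, less transfer
    }
}
```

Fewer bytes on disk is the first benefit directly; the second follows because the same smaller payload is what crosses the network between map and reduce nodes.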
You can specify compression format in the Driver Program as shown below:
conf.setBoolean("mapred.output.compress", true);
conf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);
Here, codec is the implementation of a compression and decompression algorithm. GzipCodec is the compression codec for gzip. This compresses the output file.
REMIND ME
• Mapper maps the input key-value pairs to intermediate key-value pairs.
• Reducer then reduces the set of intermediate key-value pairs that share a common key to a smaller set of values.
• The Reducer has three primary phases:
  • Shuffle and Sort
  • Reduce
  • Output Format
• Combiner and Partitioner are optimization techniques.
POINT ME (BOOK)