
Big Data Computing Practical No. 3

Program Source code:


import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                // Emit each word with a count of 1
                value.set(tokenizer.nextToken());
                context.write(value, new IntWritable(1));
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable x : values) {
                sum += x.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "My Word Count Program");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        Path outputPath = new Path(args[1]);
        // Configuring the input/output path from the filesystem into the job
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, outputPath);
        // Deleting the output path automatically from HDFS so that we don't have to delete it explicitly
        outputPath.getFileSystem(conf).delete(outputPath, true);
        // Exit with status 0 if the job completes successfully, 1 otherwise
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The entire MapReduce program can be fundamentally divided into three parts:
A. Mapper Phase Code
B. Reducer Phase Code
C. Driver Code

We will walk through the code for each of these three parts in turn.

Mapper Code:
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            value.set(tokenizer.nextToken());
            context.write(value, new IntWritable(1));
        }
    }
}

• We have created a class Map that extends the class Mapper, which is already defined in the MapReduce framework.
• We define the data types of the input and output key/value pairs after the class declaration, using angle brackets.
• Both the input and the output of the Mapper are key/value pairs.
• Input:
◦ The key is the byte offset of each line in the text file: LongWritable
◦ The value is each individual line: Text
• Output:
◦ The key is each tokenized word: Text
◦ The value is hardcoded to 1 in our case: IntWritable
◦ Example – Dear 1, Bear 1, etc.
• In the Java code we tokenize each line into individual words and emit each word with the hardcoded count 1, as the sketch after this list illustrates.
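To make the mapper's behaviour concrete, here is a minimal sketch in plain Java (runnable outside Hadoop) of what map() emits for one sample line; the class name MapTrace and the line "Dear Bear River Dear" are made up for illustration.

import java.util.StringTokenizer;

public class MapTrace {
    public static void main(String[] args) {
        // A sample "value" as the mapper would receive it: one line of text
        String line = "Dear Bear River Dear";
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            // The real mapper wraps these in Text/IntWritable before context.write()
            System.out.println(tokenizer.nextToken() + "\t1");
        }
        // Prints (tab-separated): Dear 1, Bear 1, River 1, Dear 1
    }
}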

Reducer Code:
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable x : values) {
            sum += x.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
• We have created a class Reduce which extends the class Reducer, just as the Map class extends Mapper.
• We define the data types of the input and output key/value pairs after the class declaration, using angle brackets, as done for the Mapper.
• Both the input and the output of the Reducer are key/value pairs.
• Input:
◦ The key is one of the unique words generated by the sorting and shuffling phase: Text
◦ The value is the list of counts grouped under that key: IntWritable
◦ Example – Bear, [1, 1], etc.
• Output:
◦ The key is each unique word present in the input text file: Text
◦ The value is the number of occurrences of that word: IntWritable
◦ Example – Bear, 2; Car, 3, etc.
• We aggregate the values in the list corresponding to each key to produce the final count, as the sketch after this list illustrates.
• In general, the reduce() method is called once for each unique key; the number of reducer tasks can be configured in mapred-site.xml or set on the Job.
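Here is a similar minimal sketch in plain Java of the summation reduce() performs for one key; the key "Bear" and the value list [1, 1] are made-up samples of what the shuffle phase would deliver.

import java.util.Arrays;
import java.util.List;

public class ReduceTrace {
    public static void main(String[] args) {
        String key = "Bear";                         // one unique word from the shuffle
        List<Integer> values = Arrays.asList(1, 1);  // the counts grouped under that key
        int sum = 0;
        for (int x : values) {
            sum += x;                                // aggregate the counts
        }
        System.out.println(key + "\t" + sum);        // prints: Bear  2
    }
}

The number of reducer tasks can also be set programmatically on the job, e.g. job.setNumReduceTasks(2);.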

Driver Code:
Configuration conf = new Configuration();
Job job = new Job(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

Path outputPath = new Path(args[1]);

// Configuring the input/output path from the filesystem into the job
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, outputPath);

• In the driver class, we set the configuration of our MapReduce job to run on Hadoop.
• We specify the name of the job and the data types of the input/output of the mapper and reducer.
• We also specify the names of the mapper and reducer classes.
• The paths of the input and output folders are also specified.
• The method setInputFormatClass() specifies how the Mapper reads the input data, i.e. what the unit of work is. Here we have chosen TextInputFormat, so the mapper reads a single line at a time from the input text file.
• The main() method is the entry point for the driver. In this method, we instantiate a new Configuration object for the job.
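Note that on Hadoop 2.x and later the Job constructor used above is deprecated; an equivalent, non-deprecated way to create the job is the factory method:

// Replaces "new Job(conf, ...)" on Hadoop 2.x+
Job job = Job.getInstance(conf, "My Word Count Program");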

Run the MapReduce code:


The command for running the MapReduce job is:
hadoop jar hadoop-mapreduce-example.jar WordCount /sample/input /sample/output
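Assuming the source file is saved as WordCount.java, one possible sequence to compile the program, package the jar, run the job, and inspect the result is sketched below (the jar name and HDFS directories simply mirror the command above):

mkdir -p classes
javac -classpath $(hadoop classpath) -d classes WordCount.java
jar cf hadoop-mapreduce-example.jar -C classes .
hadoop jar hadoop-mapreduce-example.jar WordCount /sample/input /sample/output
hadoop fs -cat /sample/output/part-r-00000    # view the word counts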
