
Steps to create a jar file and execute the word count problem with Mapper and Reducer

1. First open Eclipse -> select File -> New -> Java Project -> name it WordCount -> then
Finish.

2. Create three Java classes in the project. Name them WCDriver (having the main
function), WCMapper, and WCReducer.

3. You have to add the Hadoop reference libraries to the project:

Right-click on the project -> select Build Path -> click Configure Build Path. You can see
the Add External JARs option on the right-hand side.
3.1 Go to C:\hadoop-3.3.6\share\hadoop\common and select all the jar files listed in this folder.
3.2 Go to C:\hadoop-3.3.6\share\hadoop\mapreduce and select all the jar files listed in this folder.
3.3 Click Apply.

4. Create a class file named WCMapper in the WordCount project.

Mapper code: copy and paste this program into the WCMapper Java class file.

// Importing libraries
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function: emit (word, 1) for every word in the line
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter rep) throws IOException
    {
        String line = value.toString();

        // Splitting the line on spaces
        for (String word : line.split(" "))
        {
            if (word.length() > 0)
            {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
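Stripped of the Hadoop types, the map step above is plain tokenization. The following standalone sketch (plain Java, no Hadoop jars needed; the class name MapSketch and the "(word, 1)" string form are illustrative only) mimics what WCMapper emits for one input line:

```java
import java.util.ArrayList;
import java.util.List;

public class MapSketch {
    // Mimics WCMapper.map(): emit a (word, 1) pair for every non-empty token
    static List<String> map(String line) {
        List<String> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                pairs.add("(" + word + ", 1)");
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [(hello, 1), (hadoop, 1), (hello, 1)]
        System.out.println(map("hello hadoop hello"));
    }
}
```

Note that the mapper does not count anything itself; duplicate keys like "hello" are emitted separately, and the framework groups them before the reducer runs.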

5. Reducer code: copy and paste this program into the WCReducer Java class file.
// Importing libraries
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function: sum the counts collected for each word
    public void reduce(Text key, Iterator<IntWritable> value,
                       OutputCollector<Text, IntWritable> output,
                       Reporter rep) throws IOException
    {
        int count = 0;

        // Counting the frequency of each word
        while (value.hasNext())
        {
            IntWritable i = value.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }
}
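The reduce step is just a sum over the grouped values. Here is a minimal standalone sketch (plain Java, no Hadoop jars; the class name ReduceSketch is illustrative) of what WCReducer computes for a single key after the shuffle phase has grouped its values:

```java
import java.util.Iterator;
import java.util.List;

public class ReduceSketch {
    // Mimics WCReducer.reduce(): sum the 1s collected for one key
    static int reduce(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    public static void main(String[] args) {
        // After the shuffle, "hello" arrives with all of its 1s grouped together
        System.out.println(reduce(List.of(1, 1, 1).iterator()));  // prints 3
    }
}
```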

6. Driver code: copy and paste this program into the WCDriver Java class file.
// Importing libraries
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {

    public int run(String args[]) throws IOException
    {
        if (args.length < 2)
        {
            System.out.println("Please give valid inputs");
            return -1;
        }

        JobConf conf = new JobConf(WCDriver.class);

        // ToolRunner strips the generic options, so the input and output
        // paths arrive here as args[0] and args[1]
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        JobClient.runJob(conf);
        return 0;
    }

    // Main method
    public static void main(String args[]) throws Exception
    {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.exit(exitCode);
    }
}
7. Now you have to make a jar file:
Right-click on the project -> click Export -> select Jar File as the export destination -> name the jar
file (WordCount.jar) -> click Next -> at last click Finish. Now copy this file into
C:/hadoop-3.3.6/share/hadoop/mapreduce/

8. Create a txt file named test.txt containing some repeated words.
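For example, a small file with repeated words can be created from a Unix-style shell (such as Git Bash on Windows); on plain cmd you could instead type the same lines into Notepad. The file contents below are just an example:

```shell
printf "hello hadoop\nhello mapreduce\nhello world\n" > test.txt
cat test.txt
```

With this input, the finished job should report "hello" three times and each of the other words once.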

9. Copy that data file into the input directory:

C:\hadoop-3.3.6\sbin>hadoop fs -put C:/Users/IIITK/Documents/files/test.txt /input3

10. List the contents of HDFS:

C:\hadoop-3.3.6\sbin>hadoop fs -ls /input3/

11. Display the contents of the test.txt file:

C:\hadoop-3.3.6\sbin>hadoop fs -cat /input3/test.txt

12. Run the WordCount.jar file saved in the shared directory of Hadoop:
C:\hadoop-3.3.6\sbin>hadoop jar C:/hadoop-3.3.6/share/hadoop/mapreduce/WordCount.jar WCDriver /input3 /output3

13. Display the output stored in the /output3 directory:

14. C:\hadoop-3.3.6\sbin>hadoop fs -cat /output3/*

15. We can also see the output in the browser:

Open localhost:9870 and go to Utilities -> Browse the file system.
