MapReduce Programs
1. Implementation of Word Count with Hadoop MapReduce
Mapper Code:
package wordcount;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
// Mapper: emits (word, 1) for every whitespace-separated token in the input line
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
for (String word : value.toString().split("\\s+")) context.write(new Text(word), new IntWritable(1));
}
}
Reducer Code:
package wordcount;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable value : values) sum += value.get();
context.write(key, new IntWritable(sum));
}
}
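As a quick sanity check of the mapper and reducer logic: for an input line such as "hadoop map reduce hadoop", the mapper emits (hadoop,1), (map,1), (reduce,1), (hadoop,1); after the shuffle, the reducer receives hadoop -> [1,1], map -> [1], reduce -> [1] and writes hadoop 2, map 1, reduce 1.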
Driver Code:
package wordcount;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
public class WordCountDriver {
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCountDriver.class);
job.setMapperClass(WordCountMapper.class);
job.setReducerClass(WordCountReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// Delete the output directory if it already exists so the job can be rerun
FileSystem fs = FileSystem.get(conf);
fs.delete(new Path(args[1]), true);
job.waitForCompletion(true);
}
}
1. Now create a jar file: right-click the project -> Export -> select the export destination as JAR file -> name the jar file (WordCount.jar) -> click Next -> finally click Finish.
2. Copy this jar file into the workspace directory of Cloudera.
3. Copy the input file from the local filesystem to the HDFS filesystem.
4. Syntax: hdfs dfs -put <source-path> <destination-path>
5. Example: hdfs dfs -put wordcount.txt /user/cloudera/Input/hfs
Command:
hadoop jar <jar-filename> <package-name>.<class-name> <input-file-path> <output-file-path>
Example:
hadoop jar WordCount.jar wordcount.WordCountDriver /user/cloudera/Input/hfs/wordcount.txt /user/cloudera/Input/output
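Once the job finishes, the word counts can be inspected directly in HDFS. A typical check (assuming the default reducer output file name part-r-00000) is:
hdfs dfs -cat /user/cloudera/Input/output/part-r-00000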
2. Implementation of Matrix Multiplication with Hadoop MapReduce
Driver Code:
package MatrixMultiply;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
public class MatrixMultiply {
public static void main(String[] args) throws Exception {
if (args.length != 2) {
System.err.println("Usage: MatrixMultiply <in_dir> <out_dir>");
System.exit(2);
}
Configuration conf = new Configuration();
// Matrix dimensions: M is m x n and N is n x p. These values should match the
// actual input matrices (for the 2 x 2 sample files below they would be 2, 2, 2).
conf.set("m", "1000");
conf.set("n", "100");
conf.set("p", "1000");
@SuppressWarnings("deprecation")
Job job = new Job(conf, "MatrixMultiply");
job.setJarByClass(MatrixMultiply.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.waitForCompletion(true);
}
}
Mapper Code:
package MatrixMultiply;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import java.io.IOException;
public class Map extends Mapper<LongWritable, Text, Text, Text>
{
@Override
public void map(LongWritable key, Text value, Context context) throws IOException,
InterruptedException
{
Configuration conf = context.getConfiguration();
int m = Integer.parseInt(conf.get("m"));
int p = Integer.parseInt(conf.get("p"));
String line = value.toString();
// (M, i, j, Mij);
String[] indicesAndValue = line.split(",");
Text outputKey = new Text();
Text outputValue = new Text();
if (indicesAndValue[0].equals("M"))
{
for (int k = 0; k < p; k++)
{
outputKey.set(indicesAndValue[1] + "," + k);
// outputKey.set(i,k);
outputValue.set(indicesAndValue[0] + "," + indicesAndValue[2] + "," +
indicesAndValue[3]);
// outputValue.set(M,j,Mij);
context.write(outputKey, outputValue);
}
} else {
// (N, j, k, Njk);
for (int i = 0; i < m; i++)
{
outputKey.set(i + "," + indicesAndValue[2]); outputValue.set("N," +
indicesAndValue[1] + "," + indicesAndValue[3]); context.write(outputKey, outputValue);
}
}
}
}
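To make the mapper's replication strategy concrete: with p = 2, the record M,0,1,2 (element M[0][1] = 2) is emitted once for every column k of the result, as (0,0) -> M,1,2 and (0,1) -> M,1,2; likewise, with m = 2, the record N,1,0,7 (element N[1][0] = 7) is emitted once for every row i, as (0,0) -> N,1,7 and (1,0) -> N,1,7. Each result cell (i,k) therefore receives every M and N element it needs for the dot product.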
Reducer Code:
package MatrixMultiply;
import java.io.IOException;
import java.util.HashMap;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class Reduce extends Reducer<Text, Text, Text, Text> {
public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
String[] value;
//key=(i,k),
//Values = [(M/N,j,V/W),..]
HashMap<Integer, Float> hashA = new HashMap<Integer, Float>();
HashMap<Integer, Float> hashB = new HashMap<Integer, Float>();
for (Text val : values) {
value = val.toString().split(",");
if (value[0].equals("M")) {
hashA.put(Integer.parseInt(value[1]), Float.parseFloat(value[2]));
}
else {
hashB.put(Integer.parseInt(value[1]), Float.parseFloat(value[2]));
}
}
int n = Integer.parseInt(context.getConfiguration().get("n"));
float result = 0.0f;
float m_ij;
float n_jk;
for (int j = 0; j < n; j++) {
m_ij = hashA.containsKey(j) ? hashA.get(j) : 0.0f;
n_jk = hashB.containsKey(j) ? hashB.get(j) : 0.0f;
result += m_ij * n_jk;
}
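// A null key makes TextOutputFormat write only the value, so each output line is "i,k,result"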
if (result != 0.0f) {
context.write(null, new Text(key.toString() + "," + Float.toString(result)));
}
}
}
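As a worked example of the reduce step: for key (0,0) the grouped values from the sample matrices below are M,0,1 and M,1,2 (row 0 of M) together with N,0,5 and N,1,7 (column 0 of N), so the reducer computes 1*5 + 2*7 = 19 and writes the line 0,0,19.0.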
Input Files (each line is of the form matrixName,rowIndex,columnIndex,value):
M.txt
M,0,0,1
M,0,1,2
M,1,0,3
M,1,1,4
N.txt
N,0,0,5
N,0,1,6
N,1,0,7
N,1,1,8
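For these sample matrices (M = [[1,2],[3,4]], N = [[5,6],[7,8]]), the job's output file should contain one line per cell of the product M x N:
0,0,19.0
0,1,22.0
1,0,43.0
1,1,50.0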
3. Implementation of weather mining on a weather data set using MapReduce
Mapper Code:
package Maxmin;
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
//Converting the record (single line) to String and storing it in a String variable line
String line = value.toString();
//Processing only non-empty lines
if (!(line.length() == 0))
{
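Only the fragment above survives of the mapper body. A minimal sketch of what the rest of the map() method might look like is given below; it assumes each input record carries a date and a daily temperature as whitespace-separated fields (the actual field positions depend on the weather data set used), and the class name, field indices, and hot/cold thresholds are illustrative assumptions, not taken from the original code.
public static class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, Text> {
@Override
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
//Converting the record (single line) to String and storing it in a String variable line
String line = value.toString();
if (!(line.length() == 0)) {
//Assumed record layout: the first field is the date, the second is the temperature
String[] fields = line.split("\\s+");
String date = fields[0];
float temperature = Float.parseFloat(fields[1]);
//Tag the day as hot or cold using assumed thresholds of 30 and 15 degrees
if (temperature > 30.0f) {
context.write(new Text("Hot Day " + date), new Text(String.valueOf(temperature)));
} else if (temperature < 15.0f) {
context.write(new Text("Cold Day " + date), new Text(String.valueOf(temperature)));
}
}
}
}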
Reducer Code:
/**
 * The MaxTemperatureReducer class is static and extends the Reducer abstract class
 * with four Hadoop generic types: Text, Text, Text, Text.
 */
/**
 * @method reduce
 * This method takes the key and the list of values from the mapper as input,
 * aggregates them by key, and writes the final output to the context.
 */
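The reducer body itself is missing from the listing; only the comments above remain. A minimal sketch consistent with those comments (the original implementation may differ) is:
public static class MaxTemperatureReducer extends Reducer<Text, Text, Text, Text> {
@Override
public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
//Write each (day, temperature) pair received from the mapper to the final output
for (Text value : values) {
context.write(key, value);
}
}
}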
Driver Code:
/**
 * @method main
 * This method is used for setting all the configuration properties.
 * It acts as the driver for the MapReduce job.
 */
//reads the default configuration of cluster from the configuration xml files
Configuration conf = new Configuration();
//Initializing the job with the default configuration of the cluster
Job job = new Job(conf, "weather example");
//Setting the mapper and reducer classes (named as in the comments and sketches above)
job.setMapperClass(MaxTemperatureMapper.class);
job.setReducerClass(MaxTemperatureReducer.class);
//Defining the output key and value classes of the job (both Text, matching the reducer)
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
//Defining input Format class which is responsible to parse the dataset into a key value pair
job.setInputFormatClass(TextInputFormat.class);
//Defining output Format class which is responsible to parse the dataset into a key value pair
job.setOutputFormatClass(TextOutputFormat.class);
//Configuring the input path from the filesystem into the job
FileInputFormat.addInputPath(job, new Path(args[0]));
//Configuring the output path from the filesystem into the job
Path OutputPath = new Path(args[1]);
FileOutputFormat.setOutputPath(job, OutputPath);
//Deleting the output path automatically from hdfs so that we don't have to delete it explicitly
OutputPath.getFileSystem(conf).delete(OutputPath, true);
//Submitting the job and exiting only after it has completed
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
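The job is packaged and run in the same way as the word count example; the jar name, driver class name, and HDFS paths below are placeholders to be adapted to your own project:
hadoop jar MaxMin.jar Maxmin.<driver-class-name> <weather-input-path> <output-path>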