Tutorial Partitioner

This document provides code for a MapReduce job in Java that finds the maximum temperature by year from data files. It includes a Mapper class to extract temperature readings from the data, a Reducer class to find the max for each year, and a driver class to run the job locally or on a cluster. Instructions are given to modify the code to run on a Hadoop cluster by commenting out hardcoded paths and copying input files to HDFS before submitting the job.

Uploaded by

pavan2711
Map reduce - Partitioner

Start Eclipse.
Create a Java project named Partitioner and create the following Java classes:

Hpot-Tech

Add the User Libraries:

package com.hp.partitioner;

// cc MaxTemperatureMapper Mapper for maximum temperature example
// vv MaxTemperatureMapper
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final int MISSING = 9999;

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String line = value.toString();
    String year = line.substring(15, 19);
    int airTemperature;
    if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
      airTemperature = Integer.parseInt(line.substring(88, 92));
    } else {
      airTemperature = Integer.parseInt(line.substring(87, 92));
    }
    String quality = line.substring(92, 93);
    if (airTemperature != MISSING && quality.matches("[01459]")) {
      context.write(new Text(year), new IntWritable(airTemperature));
    }
  }
}
// ^^ MaxTemperatureMapper
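
The mapper reads fixed-width fields: the year from columns 15 to 18, a signed temperature from columns 87 to 91, and a quality code from column 92. The following plain-Java sketch repeats that extraction outside Hadoop so it can be tried without a cluster. The class RecordParseSketch, its helper methods, and the synthetic record are assumptions made for illustration; they are not part of the tutorial's project, and the record is not real weather data.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of the tutorial's project.
class RecordParseSketch {
    static final int MISSING = 9999;

    // Builds a synthetic 93-character record. Only the columns the mapper
    // reads are meaningful; every other column is filled with '0'.
    static String syntheticRecord(String year, String signedTemp, char quality) {
        char[] line = new char[93];
        Arrays.fill(line, '0');
        year.getChars(0, 4, line, 15);        // year in columns 15..18
        signedTemp.getChars(0, 5, line, 87);  // sign plus four digits in columns 87..91
        line[92] = quality;                   // quality code in column 92
        return new String(line);
    }

    // Same substring as the mapper uses for the year.
    static String parseYear(String line) {
        return line.substring(15, 19);
    }

    // Same extraction and filter as the mapper; returns null when the
    // mapper would skip the record.
    static Integer parseTemperature(String line) {
        int airTemperature;
        if (line.charAt(87) == '+') {
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            return airTemperature;
        }
        return null;
    }

    public static void main(String[] args) {
        String line = syntheticRecord("1950", "-0011", '1');
        // Prints the year and the parsed temperature separated by a tab.
        System.out.println(parseYear(line) + "\t" + parseTemperature(line));
    }
}
```

A record whose temperature field is +9999, or whose quality code is outside 0, 1, 4, 5, 9, yields null here, which corresponds to the mapper writing nothing for that line.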

package com.hp.partitioner;

// cc MaxTemperatureReducer Reducer for maximum temperature example
// vv MaxTemperatureReducer
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {

  @Override
  public void reduce(Text key, Iterable<IntWritable> values,
      Context context)
      throws IOException, InterruptedException {
    int maxValue = Integer.MIN_VALUE;
    for (IntWritable value : values) {
      maxValue = Math.max(maxValue, value.get());
    }
    context.write(key, new IntWritable(maxValue));
  }
}
// ^^ MaxTemperatureReducer
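
The driver in this tutorial registers this same reducer class as the combiner. That is safe for a maximum because taking the maximum of partial maxima gives the same answer as taking the maximum of all values at once. The plain-Java sketch below illustrates the point; the class CombinerSketch and the sample readings are made up for illustration and do not come from the tutorial's data.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of the tutorial's project.
class CombinerSketch {

    // Same folding logic as MaxTemperatureReducer.reduce, on plain integers.
    static int max(List<Integer> values) {
        int maxValue = Integer.MIN_VALUE;
        for (int value : values) {
            maxValue = Math.max(maxValue, value);
        }
        return maxValue;
    }

    public static void main(String[] args) {
        // Made-up readings for one year, as emitted by two separate map tasks.
        List<Integer> mapOutput1 = Arrays.asList(0, 20, 10);
        List<Integer> mapOutput2 = Arrays.asList(25, 15);

        // Without a combiner the reducer sees every value.
        int withoutCombiner = max(Arrays.asList(0, 20, 10, 25, 15));

        // With a combiner each map task sends only its local maximum.
        int withCombiner = max(Arrays.asList(max(mapOutput1), max(mapOutput2)));

        System.out.println(withoutCombiner + " " + withCombiner); // prints "25 25"
    }
}
```

The same shortcut would not work for a function such as the mean, where combining partial results changes the answer.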

package com.hp.partitioner;

// cc MaxTemperatureWithCombiner Application to find the maximum temperature, using a combiner function for efficiency
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// vv MaxTemperatureWithCombiner
public class MaxTemperatureWithCombiner {

  public static void main(String[] args) throws Exception {
    /*if (args.length != 2) {
      System.err.println("Usage: MaxTemperatureWithCombiner <input path> " +
          "<output path>");
      System.exit(-1);
    }*/
    Job job = new Job();
    job.setJarByClass(MaxTemperatureWithCombiner.class);
    job.setJobName("Max temperature");

    FileInputFormat.addInputPath(job, new Path("in"));
    FileOutputFormat.setOutputPath(job, new Path("out" + System.currentTimeMillis()));

    job.setMapperClass(MaxTemperatureMapper.class);
    job.setCombinerClass(MaxTemperatureReducer.class);
    job.setReducerClass(MaxTemperatureReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
// ^^ MaxTemperatureWithCombiner
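
The driver above never calls job.setPartitionerClass, so the job relies on Hadoop's default hash partitioner to decide which reducer receives each year. That partitioner takes the key's hash code, masks off the sign bit, and takes the remainder modulo the number of reduce tasks. The plain-Java sketch below mirrors that formula. The class HashPartitionSketch and the reducer count of 3 are assumptions for illustration, and String keys stand in for Hadoop's Text keys, whose hash codes differ, so the partition numbers printed here will not match a real job.

```java
// Hypothetical illustration class; not part of the tutorial's project.
class HashPartitionSketch {

    // Mirrors the hash-partitioning formula: non-negative hash modulo
    // the number of reduce tasks.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 3; // assumed reducer count, chosen only for this sketch
        for (String year : new String[] {"1949", "1950", "1951"}) {
            System.out.println(year + " -> partition " + partitionFor(year, reducers));
        }
    }
}
```

Because the partition depends only on the key, every record for a given year reaches the same reducer, which is what lets the reducer compute one maximum per year.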

Copy the following files in :

Run the application as a Java application.

Output:

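Submit the application to the cluster.

1) Comment out the hardcoded paths and take the input and output paths from the command-line arguments instead (the hadoop jar command below passes in and output).
2) Copy the files to HDFS.
3) Execute the class.
# hadoop fs -copyFromLocal /hadoop/data/19*gz in/
# hadoop jar /hadoop/hadoop/mypartitioner.jar com.hp.partitioner.MaxTemperatureWithCombiner in output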
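
The hadoop jar command above passes two arguments after the class name, in and output. One way the driver could pick them up once the hardcoded paths are commented out is sketched below in plain Java, reusing the usage message from the commented-out check in the driver. The class JobPathsSketch and the method resolvePaths are hypothetical names introduced for this sketch; the tutorial does not show this change.

```java
// Hypothetical helper for illustration; not part of the tutorial's project.
class JobPathsSketch {

    // Returns {inputPath, outputPath} taken from the command-line arguments,
    // or fails with the driver's usage message when the count is wrong.
    static String[] resolvePaths(String[] args) {
        if (args.length != 2) {
            throw new IllegalArgumentException(
                "Usage: MaxTemperatureWithCombiner <input path> <output path>");
        }
        return new String[] {args[0], args[1]};
    }

    public static void main(String[] args) {
        // Stand-in for the arguments given on the hadoop jar command line.
        String[] paths = resolvePaths(new String[] {"in", "output"});
        // In the driver these two values would replace the hardcoded ones:
        //   FileInputFormat.addInputPath(job, new Path(paths[0]));
        //   FileOutputFormat.setOutputPath(job, new Path(paths[1]));
        System.out.println("input=" + paths[0] + " output=" + paths[1]);
    }
}
```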

View the JobTracker:

http://192.168.92.128:50030/jobtracker.jsp
And view the output:
http://192.168.92.128:50070/dfshealth.jsp
