
Developing a MapReduce Application

By
Dr. K. Venkateswara Rao
Professor
Department of CSE
Contents
1. Hadoop Cluster
2. YARN Architecture
3. Data Flow in Hadoop Computing Model
4. MapReduce Model
5. The Configuration API
6. A simple configuration file
7. Accessing configuration properties
8. The MapReduce Web UI
9. Hadoop Logs
10. Tuning a Job
Hadoop Cluster
(Cluster topology diagram: a primary master, a secondary master, and a client connect to a master switch; the master switch connects to rack switches Rack Switch-1 through Rack Switch-k, and each rack switch connects the slave nodes in its rack, e.g. S-11 … S-14 and S-k1 … S-k5.)
YARN Architecture
Data Flow in Hadoop Computing Model

Input Blocks → Map Tasks → Shuffling & Sorting → Reducer Tasks → Output Blocks
MapReduce Model
• MapReduce consists of two distinct tasks: Map and Reduce.
• As the name MapReduce suggests, the reducer phase takes place after the mapper phase has been completed.
• The first is the map job, where a block of data is read and processed to produce key-value pairs as intermediate output.
• The output of a mapper or map job (key-value pairs) is the input to the reducer.
• The reducer receives key-value pairs from multiple map jobs.
• The reducer then aggregates those intermediate data tuples (intermediate key-value pairs) into a smaller set of tuples or key-value pairs, which is the final output.
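• For example (an illustrative trace, not taken from the slides), mapping the input line "deer bear river deer" produces the intermediate pairs (deer, 1), (bear, 1), (river, 1), (deer, 1); after shuffling and sorting, the reducer receives (bear, [1]), (deer, [1, 1]), (river, [1]) and emits the final pairs (bear, 1), (deer, 2), (river, 1).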
MapReduce and its Components
• MapReduce has three major classes: the Mapper class, the Reducer class, and the Driver class.
• Mapper Class
• The first stage in data processing using MapReduce is the Mapper class. Here, the RecordReader processes each input record and generates the corresponding key-value pair. Hadoop stores this intermediate mapper output on the local disk.
• Input Split
• It is the logical representation of data. It represents a unit of work that is processed by a single map task in the MapReduce program.
• RecordReader
• It interacts with the input split and converts the data it reads into key-value pairs.
Input Splits

Relation between input splits and HDFS Blocks
MapReduce and its Components
• Reducer Class
• The Intermediate output generated from the mapper is fed to the reducer which
processes it and generates the final output which is then saved in the HDFS.
• Driver Class
• The major component in a MapReduce job is the Driver class. It is responsible for setting up a MapReduce job to run in Hadoop. It allows you to specify the names of the Mapper and Reducer classes, along with the data types of their inputs and outputs and the job name.
• The entire MapReduce program can be fundamentally divided into three parts:
1. Mapper Phase Code
2. Reducer Phase Code
3. Driver Code
Mapper Code: Word Count Example
public class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            value.set(tokenizer.nextToken());
            context.write(value, new IntWritable(1));
        }
    }
}
Mapper Code: Word Count Example
• A Map class extends the class Mapper which is already defined in the MapReduce Framework.
• The data types of input and output key/value pair are defined after the class declaration using angle
brackets.
• Both the input and output of the Mapper is a key/value pair.
• Input:
• The key is nothing but the offset of each line in the text file: LongWritable
• The value is each individual line : Text
• Output:
• The key is the tokenized word: Text
• The Value is hardcoded (in this case it is 1): IntWritable
• We have written Java code in which each line is tokenized and each word is associated with a hardcoded value equal to 1.
• The map() method also provides an instance of Context to write the output.
Hadoop Data Types
• Rather than using built-in Java types, Hadoop provides its own set of basic
types that are optimized for network serialization.
• These are found in the org.apache.hadoop.io package.
LongWritable corresponds to a Java Long
Text is like Java String
IntWritable corresponds to Java Integer.
• The Iterable interface:
The Iterable interface must be implemented by any class whose objects will be used by the for-each version of the for loop. Iterable is a generic interface that can be declared as
interface Iterable<T>, where T is the type of the object being iterated.
Writable Classes and Writable Interface
• Hadoop defines its own ‘Box’ classes for Java primitive types
• IntWritable for int
• LongWritable for long
• FloatWritable for float
• DoubleWritable for double
• Text for String
• Example usage:
    IntWritable writable = new IntWritable();
    writable.set(163);
• The Writable interface has two methods:
    public interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }
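A minimal sketch (not from the slides) showing how these two methods round-trip an IntWritable through a byte stream; the class name WritableRoundTrip is illustrative:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;

public class WritableRoundTrip {
    public static void main(String[] args) throws IOException {
        IntWritable writable = new IntWritable();
        writable.set(163);

        // Serialize the value into a byte array via write(DataOutput)
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writable.write(new DataOutputStream(out));

        // Deserialize it into a fresh IntWritable via readFields(DataInput)
        IntWritable copy = new IntWritable();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(out.toByteArray())));

        System.out.println(copy.get()); // prints 163
    }
}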
Context in Hadoop MapReduce
• Context is extensively used to write output pairs out of map and reduce tasks.
• Context object: allows the Mapper/Reducer to interact with the rest of the Hadoop system. It includes configuration data for the job as well as interfaces that allow it to emit output.
• Applications can use the Context:
to report progress
to set application-level status messages
to update Counters
to indicate they are alive
to get the values that are stored in the job configuration across the map/reduce phases.
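A minimal sketch (not taken from the slides) of a mapper that uses its Context for several of these purposes; the property name wordcount.case.sensitive and the counter names are illustrative assumptions:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ContextAwareMap extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Read an application-level property from the job configuration (assumed name)
        boolean caseSensitive =
                context.getConfiguration().getBoolean("wordcount.case.sensitive", false);
        String line = caseSensitive ? value.toString() : value.toString().toLowerCase();

        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            context.write(new Text(tokenizer.nextToken()), new IntWritable(1));
            // Update an application-defined counter
            context.getCounter("WordCount", "TokensEmitted").increment(1);
        }
        // Set an application-level status message for this task attempt
        context.setStatus("Processed line at offset " + key.get());
    }
}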
StringTokenizer class in Java
• StringTokenizer class in Java is used to break a string into tokens.
• A StringTokenizer object internally maintains a current position within the
string to be tokenized.
StringTokenizer Constructors
• StringTokenizer(String str): creates a tokenizer for str using the default delimiters (space, tab, newline, carriage return, and form feed)
• StringTokenizer(String str, String delim): creates a tokenizer for str using the characters in delim as delimiters
• StringTokenizer(String str, String delim, boolean returnDelims): as above; if returnDelims is true, the delimiter characters are also returned as tokens
Methods of StringTokenizer
• public boolean hasMoreTokens()
• Tests if there are more tokens available from this tokenizer's string.
• If this method returns true, then a subsequent call to nextToken with no
argument will successfully return a token.
• Returns: true if and only if there is at least one token in the string after the current position; false otherwise.
• public String nextToken()
• Returns the next token from this string tokenizer.
• The following import statement must be used to use StringTokenizer
• import java.util.StringTokenizer;
Use of StringTokenizer class in Example
• Loops are often used to extract tokens from a string.
StringTokenizer strTokenizer =
new StringTokenizer("One Two Three");
while (strTokenizer.hasMoreTokens())
{
System.out.println(strTokenizer.nextToken());
}
• This code will produce the following output:
One
Two
Three
Reducer Code: Word Count Example
public class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable x : values) {
            sum += x.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
Reducer Code: Word Count Example
• A Reduce class extends the class Reducer, like that of the Mapper.
• The data types of the input and output key/value pairs are defined after the class declaration using angle brackets.
• Both the input and the output of the Reducer are key-value pairs.
• Input:
• The key is nothing but the unique words that have been generated after the sorting and shuffling phase: Text
• The value is a list of integers corresponding to each key: IntWritable
• Output:
• The key is each unique word present in the input text file: Text
• The value is the number of occurrences of that word: IntWritable
• The values in the list corresponding to each key are aggregated to produce the final answer.
• The reduce() method is called once for each unique key. The number of reduce tasks can be configured, for example with job.setNumReduceTasks() or via the mapreduce.job.reduces property (which may be set in mapred-site.xml).
Iterator
• Iterator:
• Allows a user to step through each item in a data structure (array, vector, linked list, tree, etc.)
• Is an interface implemented by an ADT
• A for-each loop can be used to access elements:

    public <T> void doSomething(Iterable<T> list) {
        for (T temp : list) {
            // do something with temp
        }
    }

• A typical for loop with an explicit Iterator can also be used:

    public <T> void doSomething(Iterable<T> list) {
        for (Iterator<T> itr = list.iterator(); itr.hasNext(); ) {
            T temp = itr.next();
            // do something with temp
        }
    }
Driver Code: Word Count Example
Configuration conf = new Configuration();
Job job = new Job(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
Path outputPath = new Path(args[1]);
// Configuring the input/output paths from the filesystem into the job
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, outputPath);
Job Class
• The Job class is the most important class in the MapReduce API.
• It allows the user to configure the job, submit it, control its execution, and query the state.
• The set methods only work until the job is submitted, afterwards they will throw an
IllegalStateException.
• Following are the constructors of Job class.
Job()
Job(Configuration conf)
Job(Configuration conf, String jobName)
• Create a new Job
Job job = new Job(new Configuration());
job.setJarByClass(MyJob.class);
• Specify various job-specific parameters, for example:
    job.setJobName("myjob");
Driver Code: Word Count Example
• In the driver class, the configuration of the MapReduce job is set to run in Hadoop
• The name of the job, the data type of input/output of the mapper and reducer are
specified.
• The names of the mapper and reducer classes are also specified
• The path of the input and output folder is also specified.
• The method setInputFormatClass() is used to specify how a Mapper will read the input data, i.e., what the unit of work will be. Here, TextInputFormat is chosen so that the mapper reads a single line at a time from the input text file.
• The main() method is the entry point for the driver. In this method, a new Configuration object for the job is instantiated.
• The main() method contains System.exit(job.waitForCompletion(true) ? 0 : 1) at the end. This makes the driver wait for the job to complete and exit with a status code that reflects success or failure.
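A minimal sketch of the complete driver class implied by the snippets above; the class name WordCount and the use of args[0]/args[1] follow the slides, and the Map and Reduce classes shown earlier are assumed to be available (for example, as nested static classes of WordCount):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "My Word Count Program");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
        // Wait for the job to finish; exit 0 on success, 1 on failure
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}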
Running the MapReduce
• The command for running the MapReduce job is:
hadoop jar hadoop-mapreduce-example.jar WordCount /sample/input /sample/output
The Configuration API
• Components in Hadoop are configured using Hadoop’s own
configuration API.
• An instance of the Configuration class (found in the
org.apache.hadoop.conf package) represents a collection of
configuration properties and their values.
• Each property is named by a String, and its value may be one of several types, including Java primitives such as boolean, int, long, and float; other useful types such as String, Class, and java.io.File; and collections of Strings.
• Configurations read their properties from resources—XML files with a
simple structure for defining name-value pairs.
A simple configuration file, configuration-1.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>color</name>
    <value>yellow</value>
    <description>Color</description>
  </property>
  <property>
    <name>size</name>
    <value>10</value>
    <description>Size</description>
  </property>
  <property>
    <name>weight</name>
    <value>heavy</value>
    <final>true</final>
    <description>Weight</description>
  </property>
  <property>
    <name>size-weight</name>
    <value>${size},${weight}</value>
    <description>Size and weight</description>
  </property>
</configuration>
Accessing configuration properties
• Configuration conf = new Configuration();
• conf.addResource("configuration-1.xml");
• assertThat(conf.get("color"), is("yellow"));
• assertThat(conf.getInt("size", 0), is(10));
• assertThat(conf.get("breadth", "wide"), is("wide"));
• assert is used to create an assertion, which is a condition that should be true during execution of the program.
• assertThat is one of the JUnit assertion methods from the Assert class that can be used to check whether a specific value matches an expected one.
• It primarily accepts two parameters: the first is the actual value and the second is a matcher object. It compares the two and fails the assertion if they do not match.
• The get() methods allow you to specify a default value, which is used if the property is not defined in the XML file, as in the case of breadth above.
A second configuration file, configuration-2.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>size</name>
    <value>12</value>
  </property>
  <property>
    <name>weight</name>
    <value>light</value>
  </property>
</configuration>

• Resources are added to a Configuration in order:
    Configuration conf = new Configuration();
    conf.addResource("configuration-1.xml");
    conf.addResource("configuration-2.xml");
• Properties defined in resources that are added later override the earlier definitions, so the size property takes its value from the second configuration file, configuration-2.xml.
• However, properties that are marked as final cannot be overridden in later definitions.
Variable Expansion
• Configuration properties can be defined in terms of other properties, or
system properties.
For example, the property size-weight in the first configuration file is
defined as ${size},${weight}, and these properties are expanded using the
values found in the configuration:
assertThat(conf.get("size-weight"), is("12,heavy"));
• System properties take priority over properties defined in resource files:
System.setProperty("size", "14");
assertThat(conf.get("size-weight"), is("14,heavy"));
• This feature is useful for overriding properties on the command line by using
-Dproperty=value JVM arguments.
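A minimal sketch (assuming a hypothetical driver named ConfigurationPrinter) of the Tool/ToolRunner pattern that makes -D command-line overrides visible to the job's Configuration:

import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ConfigurationPrinter extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() returns the Configuration already populated with -D overrides
        Configuration conf = getConf();
        for (Map.Entry<String, String> entry : conf) {
            System.out.printf("%s=%s%n", entry.getKey(), entry.getValue());
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options such as -D, -conf, and -fs
        // before passing the remaining arguments to run()
        System.exit(ToolRunner.run(new ConfigurationPrinter(), args));
    }
}

For example, running hadoop ConfigurationPrinter -D color=yellow would print color=yellow among the configuration entries.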
Managing Configuration
• When developing Hadoop applications, it is common to switch between running the application locally and running it on a cluster.
• We assume the existence of a directory called conf that contains three configuration files: hadoop-local.xml, hadoop-localhost.xml, and hadoop-cluster.xml.
• The hadoop-local.xml file contains the default Hadoop configuration for the default filesystem and the local (in-JVM) framework for running MapReduce jobs (PAGE: 146).
• The settings in hadoop-localhost.xml point to a namenode and a YARN resource manager both running on localhost (PAGE: 146-147).
• Finally, hadoop-cluster.xml contains details of the cluster’s namenode and YARN resource manager addresses (PAGE: 147).
Managing Configuration
• the following command shows a directory listing on the HDFS server running
in pseudodistributed mode on localhost:
• % hadoop fs -conf conf/hadoop-localhost.xml -ls .
• Found 2 items
• drwxr-xr-x - tom supergroup 0 2014-09-08 10:19 input
• drwxr-xr-x - tom supergroup 0 2014-09-08 10:19 output
• If the -conf option is omitted, Hadoop picks up its configuration from the etc/hadoop subdirectory under $HADOOP_HOME.
Or, if HADOOP_CONF_DIR is set, Hadoop configuration files will be read from that location.
The MapReduce Web UI: The resource manager page
• Hadoop comes with a web UI for viewing information about the resource
manager and MapReduce jobs.
The MapReduce Web UI: The MapReduce job page
The MapReduce Web UI: The Tasks page
Hadoop Logs
Tuning a Job: Best Practices
WC_Mapper.java Code
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WC_Mapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}
WC_Reducer.java Code
package com.javatpoint;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WC_Reducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}
WC_Runner.java Code
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class WC_Runner {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(WC_Runner.class);
        conf.setJobName("WordCount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(WC_Mapper.class);
        conf.setCombinerClass(WC_Reducer.class);
        conf.setReducerClass(WC_Reducer.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
Anatomy of MapReduce Job Run
By
Dr. K. Venkateswara Rao
Professor, CSE
(Some slides are taken from CMU ppt presentation)
Hadoop 2.X MapReduce Components (Entities)
How Hadoop Runs a MapReduce Job
Job Submission
• Job.runJob()
• Creates a new JobClient instance
• Calls submit() on the Job object
• Job.submit()
• Creates an internal JobSubmitter instance and calls submitJobInternal() on it (step 1 in the figure).
• Having submitted the job, waitForCompletion() polls the job’s progress once per second and reports the progress to the console if it has changed since the last report.
• The job submission process is implemented by JobSubmitter, which does the five things listed on the next slide.
Five things done in Job Submission Process
1. Asks the resource manager for a new application ID, used for the
MapReduce job ID (step 2).
2. Checks the output specification of the job.
3. Computes the input splits for the job.
4. Copies the resources needed to run the job, including the job JAR
file, the configuration file, and the computed input splits, to the
shared filesystem in a directory named after the job ID (step 3).
5. Submits the job by calling submitApplication() on the resource
manager (step 4).
Input Splits

Relation between input splits and HDFS Blocks
Job Initialization
1. When the resource manager receives a call to its submitApplication()
method, it hands off the request to the YARN scheduler.
2. The scheduler allocates a container, and the resource manager then
launches the application master’s process there, under the node manager’s
management (steps 5a and 5b).
3. The application master for MapReduce jobs is a Java application whose main
class is MRAppMaster. It initializes the job by creating a number of
bookkeeping objects to keep track of the job’s progress, as it will receive
progress and completion reports from the tasks (step 6).
4. Next, it retrieves the input splits computed in the client from the shared
filesystem (step 7). It then creates a map task object for each split, as well as
a number of reduce task objects. Tasks are given IDs at this point.
Task Assignment
1. The application master requests containers for all the map and
reduce tasks in the job from the resource manager (step 8).
2. Reduce tasks can run anywhere in the cluster, but requests for map
tasks have data locality constraints that the scheduler tries to
honor.
3. Requests for reduce tasks are not made until 5% of map tasks have
completed
4. Requests also specify memory requirements and CPUs for tasks. By
default, each map and reduce task is allocated 1,024 MB of
memory and one virtual core.
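As a hedged illustration (Hadoop 2.x property names; the values are examples, not recommendations), these defaults can be changed in mapred-site.xml or per job:

<!-- Task resource requests: defaults are 1024 MB and 1 vcore per task -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>1</value>
</property>
<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>2</value>
</property>
<!-- Fraction of map tasks that must complete before reduce requests are made -->
<property>
  <name>mapreduce.job.reduce.slowstart.completedmaps</name>
  <value>0.05</value>
</property>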
Task Execution
1. Once a task has been assigned resources for a container on a particular
node by the resource manager’s scheduler, the application master starts
the container by contacting the node manager (steps 9a and 9b).
2. The task is executed by a Java application whose main class is YarnChild.
3. Before it can run the task, it localizes the resources that the task needs,
including the job configuration and JAR file, and any files from the
distributed cache (step 10).
4. Finally, it runs the map or reduce task (step 11).
5. The YarnChild runs in a dedicated JVM, so that any bugs in the user-defined
map and reduce functions (or even in YarnChild) don’t affect the node
manager—by causing it to crash or hang, for example.
Progress Measure
• Following operations constitute task progress in Hadoop:
1. Reading an input record (in a mapper or reducer)
2. Writing an output record (in a mapper or reducer)
3. Setting the status description (via Reporter’s or
TaskAttemptContext’s setStatus() method)
4. Incrementing a counter (using Reporter’s incrCounter() method
or Counter’s increment() method)
5. Calling Reporter’s or TaskAttemptContext’s progress() method
Progress and Status Updates
1. When a task is running, it keeps track of its progress (i.e., the proportion of the task
completed). Progress is not always measurable.
2. For map tasks, this is the proportion of the input that has been processed. For reduce
tasks, it’s a little more complex, but the system can still estimate the proportion of
the reduce input processed.
3. Tasks also have a set of counters that count various events as the task runs. The
counters are either built into the framework, such as the number of map output
records written, or defined by users.
4. The task reports its progress and status (including counters) back to its application
master, which has an aggregate view of the job.
5. During the course of the job, the client receives the latest status by polling the
application master every second
Propagation of Status Updates Through the MapReduce System
Job Completion
1. When the application master receives a notification that the last
task for a job is complete, it changes the status for the job to
“successful.”
2. When the Job polls for status, it learns that the job has completed
successfully, so it prints a message to tell the user and then returns
from the waitForCompletion() method.
3. Job statistics and counters are printed to the console at this point.
4. Finally, on job completion, the application master and the task
containers clean up their working state.
5. Job information is archived by the job history server to enable later
interrogation by users if desired.
Failures in Hadoop
• In the real world,
1. User code is buggy,
2. Processes crash, and
3. Machines fail.
• One of the major benefits of using Hadoop is its ability to handle failures.
• Various entities that may fail in Hadoop
1. Task Failure
2. Application Master Failure
3. Node Manager Failure
4. Resource Manager Failure
Task Failure
• User code in the map or reduce task throws a runtime exception
The task JVM reports the error back to its parent application master before it exits. The
error ultimately makes it into the user logs. The application master marks the task
attempt as failed, and frees up the container so its resources are available for another
task.
• Sudden exit of the task JVM—perhaps there is a JVM bug
The node manager notices that the process has exited and informs the application
master so it can mark the attempt as failed.
• Hanging tasks are dealt with differently
The application master notices that it hasn’t received a progress update for a while and
proceeds to mark the task as failed.
• When the application master is notified of a task attempt that has failed, it will reschedule
execution of the task. The application master will try to avoid rescheduling the task on a
node manager where it has previously failed.
Application Master Failure
• An application master sends periodic heartbeats to the resource manager, and
in the event of application master failure, the resource manager will detect the
failure and start a new instance of the master running in a new container.
• In the case of the MapReduce application master, it will use the job history to
recover the state of the tasks that were already run by the (failed) application
so they don’t have to be rerun. Recovery is enabled by default.
• The MapReduce client polls the application master for progress reports, but if
its application master fails, the client needs to locate the new instance.
If the application master fails, however, the client will experience a timeout
when it issues a status update, at that point the client will go back to the
resource manager to ask for the new application master’s address.
Node Manager Failure
• If a node manager fails by crashing or running very slowly, it will stop sending
heartbeats to the resource manager.
The resource manager will notice a node manager that has stopped sending heartbeats
if it hasn’t received one for 10 minutes and remove it from its pool of nodes to schedule
containers on.
Any task or application master running on the failed node manager will be recovered
using the mechanisms described under “Task Failure” and “Application Master Failure”
sections respectively.
The application master arranges for map tasks (which were scheduled on failed nodes)
to be rerun if they belong to incomplete jobs.
Node managers may be blacklisted if the number of failures for the application is high,
even if the node manager itself has not failed. Blacklisting is done by the application
master.
Resource Manager Failure
• The resource manager is a single point of failure.
• To achieve high availability (HA), it is necessary to run a pair of resource managers in an active-
standby configuration. If the active resource manager fails, then the standby can take over
without a significant interruption to the client.
• Information about all the running applications is stored in a highly available state store (backed
by ZooKeeper or HDFS), so that the standby can recover the core state of the failed active
resource manager.
• Node manager information can be reconstructed by the new resource manager as the node
managers send their first heartbeats.
• When the new resource manager starts, it reads the application information from the state
store, then restarts the application masters for all the applications running on the cluster.
• The transition of a resource manager from standby to active is handled by a failover controller.
• Clients and node managers must be configured to handle resource manager failover.
Hadoop MapReduce: A Closer Look
(Figure: two nodes shown side by side. On each node, files are loaded from the local HDFS store; the InputFormat breaks them into splits; RecordReaders (RR) turn each split into input (K, V) pairs; Map tasks emit intermediate (K, V) pairs; the Partitioner and Sort steps route and order them, with intermediate (K, V) pairs exchanged between all nodes during the shuffling process; Reduce tasks produce the final (K, V) pairs; and the OutputFormat writes them back to the local HDFS store.)
Input Files
• Input files are where the data for a MapReduce task is initially
stored
• The input files typically reside in a distributed file system (e.g.
HDFS)
• The format of input files is arbitrary
 Line-based log files
 Binary files
 Multi-line input records
 Or something else entirely
InputFormat
• How the input files are split up and read is defined by the
InputFormat
• InputFormat is a class that does the following:
Selects the files that should be used for input
Defines the InputSplits that break up a file
Provides a factory for RecordReader objects that read the file
InputFormat Types
 Several InputFormats are provided with Hadoop, including TextInputFormat (the default), KeyValueTextInputFormat, and SequenceFileInputFormat.
Input Splits
 An input split describes a unit of work that comprises a single map task in a MapReduce program
 By default, the InputFormat breaks a file up into 64MB splits
 By dividing the file into splits, we allow several map tasks to operate on a single file in parallel
 If the file is very large, this can improve performance significantly through parallelism
 Each map task corresponds to a single input split
Input Splits and RecordReader
RecordReader
 The input split defines a slice of work but does not describe how to access it
 The RecordReader class actually loads data from its source and converts it into (K, V) pairs suitable for reading by Mappers
 The RecordReader is invoked repeatedly on the input until the entire split is consumed
 Each invocation of the RecordReader leads to another call of the map function defined by the programmer
Mapper and Reducer
 The Mapper performs the user-defined work of the first phase of the MapReduce program
 A new instance of Mapper is created for each split
 The Reducer performs the user-defined work of the second phase of the MapReduce program
 A new instance of Reducer is created for each partition
 For each key in the partition assigned to a Reducer, the Reducer is called once
Combiners and Partitioners
Combiner Example
Partitioner
 Each mapper may emit (K, V) pairs to any partition
 Therefore, the map nodes must all agree on where to send different pieces of intermediate data
 The partitioner class determines which partition a given (K, V) pair will go to
 The default partitioner computes a hash value for a given key and assigns it to a partition based on this result
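A minimal sketch (not the slides' code) of a custom Partitioner for the word-count (Text, IntWritable) pairs that mirrors the default hash-based behaviour described above:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class WordPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // Mask off the sign bit so the result is non-negative,
        // then map the key's hash onto one of the numPartitions partitions
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

It would be registered in the driver with job.setPartitionerClass(WordPartitioner.class).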
Sort
 Each Reducer is responsible for reducing the values associated with (several) intermediate keys
 The set of intermediate keys on a single node is automatically sorted by MapReduce before they are presented to the Reducer
OutputFormat
Shuffle and Sort: The Map Side
• MapReduce guarantees that the input to every reducer is sorted by key.
• The process by which the system performs the sort—and transfers the map
outputs to the reducers as inputs—is known as the shuffle
• When the map function starts producing output, it is not simply written to
disk. It takes advantage of buffering by writing in main memory and doing
some presorting for efficiency reasons.
• Each map task has a circular memory buffer that it writes the output to. The
buffer is 100 MB by default.
• When the contents of the buffer reach a certain threshold size (default value
0.80, or 80%), background thread will start to spill the contents to disk.
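A hedged note on configuration: the buffer size and spill threshold mentioned above are controlled by these Hadoop 2.x properties (the values shown are the defaults):

<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>100</value><!-- size of the circular in-memory buffer, in MB -->
</property>
<property>
  <name>mapreduce.map.sort.spill.percent</name>
  <value>0.80</value><!-- spill to disk when the buffer is 80% full -->
</property>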
Shuffle and Sort in MapReduce
Shuffle and Sort: The Map Side
• Spills are written in round-robin fashion to the specified directories.
• Before it writes to disk, the thread first divides the data into partitions corresponding
to the reducers that they will ultimately be sent to.
• Within each partition, the background thread performs an in-memory sort by key,
and if there is a combiner function, it is run on the output of the sort.
• Running the combiner function makes for a more compact map output, so there is
less data to write to local disk and to transfer to the reducer.
• Each time the memory buffer reaches the spill threshold, a new spill file is created, so
after the map task has written its last output record, there could be several spill files.
• Before the task is finished, the spill files are merged into a single partitioned and
sorted output file. If there are at least three spill files, the combiner is run again
before the output file is written.
Shuffle and Sort: The Reduce Side
• The map output file is sitting on the local disk of the machine that ran the map
task, but now the map output is needed by the machine that is about to run
the reduce task for the partition.
• Moreover, the reduce task needs the map output for its particular partition
from several map tasks across the cluster.
• The map tasks may finish at different times, so the reduce task starts copying
their outputs as soon as each completes. This is known as the copy phase of
the reduce task.
• The reduce task has a small number of copier threads, by default 5, so that it
can fetch map outputs in parallel.
• A thread in the reducer periodically asks the master for map output hosts until
it has retrieved them all.
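A hedged note on configuration: the number of parallel copier threads mentioned above is set by this Hadoop 2.x property (default shown):

<property>
  <name>mapreduce.reduce.shuffle.parallelcopies</name>
  <value>5</value>
</property>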
Shuffle and Sort: The Reduce Side
• Map outputs are copied to the reduce task JVM’s memory if they are small enough; otherwise, they are copied to disk.
• When the in-memory buffer reaches a threshold size or reaches a threshold number
of map outputs, it is merged and spilled to disk. If a combiner is specified, it will be run
during the merge to reduce the amount of data written to disk.
• As the copies accumulate on disk, a background thread merges them into larger,
sorted files.
• When all the map outputs have been copied, the reduce task moves into the sort
phase, which merges the map outputs, maintaining their sort ordering. This is done in
rounds.
• During the reduce phase, the reduce function is invoked for each key in the sorted
output.
• The output of this phase is written directly to the output filesystem, typically HDFS.
Speculative Execution
• The MapReduce model is to break jobs into tasks and run the tasks in parallel to make
the overall job execution time smaller than it would be if the tasks ran sequentially.
• A MapReduce job is dominated by the slowest task
• Hadoop doesn’t try to diagnose and fix slow-running tasks (stragglers); instead, it tries to detect when a task is running slower than expected and launches another, equivalent task as a backup. This is termed speculative execution of tasks.
• Only one copy of a straggler is allowed to be speculated
• Whichever copy (among the two copies) of a task commits first, it becomes the
definitive copy, and the other copy is killed.
• Speculative execution is an optimization, and not a feature to make jobs run more
reliably.
• Speculative execution is turned on by default.
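A hedged note on configuration: speculative execution can be enabled or disabled separately for map and reduce tasks with these Hadoop 2.x properties (both default to true):

<property>
  <name>mapreduce.map.speculative</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.reduce.speculative</name>
  <value>true</value>
</property>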
Output Committers
• Hadoop MapReduce uses a commit protocol to ensure that jobs and
tasks either succeed or fail cleanly.
• The behavior is implemented by the OutputCommitter in use for the
job.
• In the old MapReduce API, the OutputCommitter is set by calling the
setOutputCommitter() on JobConf or by setting
mapred.output.committer.class in the configuration.
• In the new MapReduce API, the OutputCommitter is determined by
the OutputFormat, via its getOutputCommitter() method.