MapReduce and Yarn

The document describes MapReduce, a programming model for processing large datasets in a distributed manner. It breaks the processing into two phases - map and reduce. The map phase processes the input data and generates intermediate key-value pairs. The reduce phase merges all intermediate values associated with the same key. The document also provides an example of finding the highest recorded temperature for each year using MapReduce on a weather dataset.

Uploaded by

Alekhya Abbaraju
Copyright: © All Rights Reserved

MapReduce

• MapReduce is a programming model for data processing. The model is simple, yet not too simple to express useful programs.

• MapReduce processes and analyzes huge datasets in parallel by dividing the work among the machines of a cluster.

• The map step turns input records into intermediate key-value pairs and can drop bad records along the way; the reduce step aggregates the values grouped under each key, retaining only the information the query needs.

• Hadoop can run MapReduce programs written in various languages, e.g. Java, Ruby, Python, and C++.

• Most importantly, MapReduce programs are inherently parallel, thus putting very large-scale data analysis into the hands of anyone with enough machines at their disposal.
A Weather Dataset

• Weather sensors collect data every hour at many locations across the globe and gather a large volume of log data, which is a good candidate for analysis with MapReduce because it involves processing all the data, and the data is semi-structured and record-oriented.

Data Format:

• The data used in the example is from the National Climatic Data Center, or NCDC.

• Data files are organized by date and weather station.

• There is a directory for each year from 1901 to 2001, each containing a gzipped file for each weather station with its readings for that year.

• There are tens of thousands of weather stations, so the whole dataset is made up of a large number of relatively small files.
Format of a National Climatic Data Center record
0057
332130 # USAF weather station identifier
99999 # WBAN weather station identifier
19500101 # observation date
0300 # observation time
4
+51317 # latitude (degrees x 1000)
+028783 # longitude (degrees x 1000)
FM-12
+0171 # elevation (meters)
99999
V020
320 # wind direction (degrees)
1 # quality code
N
0072
1
00450 # sky ceiling height (meters)
1 # quality code
C
N
010000 # visibility distance (meters)
1 # quality code
N
9
-0128 # air temperature (degrees Celsius x 10)
1 # quality code
-0139 # dew point temperature (degrees Celsius x 10)
1 # quality code
10268 # atmospheric pressure (hectopascals x 10)
1 # quality code

• It’s generally easier and more efficient to process a smaller number of relatively large files, so the data is preprocessed so that each year’s readings are concatenated into a single file.

The query: What’s the highest recorded global temperature for each year in the dataset?

Analyzing the Data with Hadoop

• To take advantage of the parallel processing that Hadoop provides, we need to express the query as a MapReduce job.

• MapReduce works by breaking the processing into two phases:
  - the map phase and
  - the reduce phase.

• Each phase has key-value pairs as input and output, the types of which may be chosen by the programmer.

• The programmer also specifies two functions:
  - the map function and
  - the reduce function.

• The input to our map phase is the raw NCDC data.

Map Function
• A simple function.

• It pulls out the year and the air temperature, because these are the only fields useful for the query.

• In this case, the map function is just a data preparation phase, setting up the data in such a way that the reduce function can do its work on it: finding the maximum temperature for each year.

• The map function is also a good place to drop bad records: filter out temperatures that are missing, suspect, or erroneous.

• To visualize the way the map works, consider the following sample lines of input data:
0067011990999991950051507004...9999999N9+00001+99999999999...
0043011990999991950051512004...9999999N9+00221+99999999999...
0043011990999991950051518004...9999999N9-00111+99999999999...
0043012650999991949032412004...0500001N9+01111+99999999999...
0043012650999991949032418004...0500001N9+00781+99999999999...
• These lines are presented to the map function as the key-value pairs:
(0, 0067011990999991950051507004...9999999N9+00001+99999999999...)
(106, 0043011990999991950051512004...9999999N9+00221+99999999999...)
(212, 0043011990999991950051518004...9999999N9-00111+99999999999...)
(318, 0043012650999991949032412004...0500001N9+01111+99999999999...)
(424, 0043012650999991949032418004...0500001N9+00781+99999999999...)

• The keys are the line offsets within the file, which the map function ignores.

• The map function merely extracts the year and the air temperature and emits them as its output:
(1950, 0)
(1950, 22)
(1950, −11)
(1949, 111)
(1949, 78)
• The output from the map function is processed by the MapReduce framework before being sent to the reduce function.

• This processing sorts and groups the key-value pairs by key.

• The reduce function sees the following input:
(1949, [111, 78])
(1950, [0, 22, −11])

• The reduce function iterates through each list and picks the maximum reading, which gives the final output:
(1949, 111)
(1950, 22)
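
The sort, group, and reduce steps described above can be imitated in plain Java without Hadoop. The following is only a minimal sketch: the class name MaxTemperatureSketch and the method maxByYear are illustrative names that do not come from the slides, and a TreeMap stands in for the framework's sort-and-group step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class MaxTemperatureSketch {

    // Groups (year, temperature) pairs by year, as the framework does between
    // map and reduce, then reduces each group to its maximum reading.
    static SortedMap<Integer, Integer> maxByYear(int[][] mapOutput) {
        SortedMap<Integer, List<Integer>> grouped = new TreeMap<>();
        for (int[] pair : mapOutput) {
            grouped.computeIfAbsent(pair[0], year -> new ArrayList<>()).add(pair[1]);
        }
        SortedMap<Integer, Integer> result = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), Collections.max(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // The five pairs emitted by the map function in the example above.
        int[][] mapOutput = { {1950, 0}, {1950, 22}, {1950, -11}, {1949, 111}, {1949, 78} };
        System.out.println(maxByYear(mapOutput)); // prints {1949=111, 1950=22}
    }
}
```

In a real job the grouping is done by the framework across machines; here it happens in one process only to make the data flow visible.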

[Figure: MapReduce logical data flow]


import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NewMaxTemperature {

    // Mapper: extracts (year, air temperature) from each NCDC record.
    static class NewMaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String year = line.substring(15, 19);
            int airTemperature;
            if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
                airTemperature = Integer.parseInt(line.substring(88, 92));
            } else {
                airTemperature = Integer.parseInt(line.substring(87, 92));
            }
            String quality = line.substring(92, 93);
            // Emit only readings that are present and carry an acceptable quality code.
            if (airTemperature != MISSING && quality.matches("[01459]")) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }

    // Reducer: picks the maximum temperature among all readings for a year.
    static class NewMaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(); // replaces the deprecated new Job() constructor
        job.setJarByClass(NewMaxTemperature.class);

        FileInputFormat.addInputPath(job, new Path("input/ncdc/sample.txt"));
        FileOutputFormat.setOutputPath(job, new Path("output"));

        job.setMapperClass(NewMaxTemperatureMapper.class);
        job.setReducerClass(NewMaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
WORD COUNT
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountMR {

    // Mapper: emits (word, 1) for every token in the input line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sums the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        // BasicConfigurator.configure();

        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountMR.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("input/N"));
        FileOutputFormat.setOutputPath(job, new Path("output2"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
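
The word count driver registers IntSumReducer as both the combiner and the reducer. This is safe because addition is associative and commutative, so summing partial sums gives the same total as summing every individual count. The plain-Java sketch below illustrates the idea without Hadoop; the class name CombinerSketch and the sample counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {

    // Sums the counts for one key; plays the role of IntSumReducer.
    static int sum(List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical counts for one word, emitted by two different map tasks.
        List<Integer> fromMap1 = Arrays.asList(1, 1, 1);
        List<Integer> fromMap2 = Arrays.asList(1, 1);

        // Without a combiner, the reducer receives every individual count.
        List<Integer> all = new ArrayList<>(fromMap1);
        all.addAll(fromMap2);
        int direct = sum(all);

        // With a combiner, each map task pre-sums locally and the reducer
        // only sums the partial results.
        int combined = sum(Arrays.asList(sum(fromMap1), sum(fromMap2)));

        System.out.println(direct + " " + combined); // prints 5 5
    }
}
```

A function such as an average would not behave this way, which is why a reducer can be reused as a combiner only when its operation tolerates partial aggregation.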
YARN
• Apache YARN (Yet Another Resource Negotiator) is Hadoop’s cluster resource management system.

• YARN was introduced in Hadoop 2 to improve the MapReduce implementation.

• It supports other distributed computing paradigms as well.

• YARN provides APIs for requesting and working with cluster resources, but these APIs are not typically used directly by user code.

• Users write to higher-level APIs provided by distributed computing frameworks (MapReduce, Spark, and so on), which themselves are built on YARN and hide the resource management details from the user.

• Pig, Hive, and Crunch are all examples of processing frameworks that run on MapReduce, Spark, or Tez (or on all three), and don’t interact with YARN directly.

YARN Applications
Anatomy of a YARN Application Run
• YARN provides its core services via two types of long-running daemon:
  - a resource manager (one per cluster) to manage the use of resources across the cluster, and
  - node managers running on all the nodes in the cluster to launch and monitor containers.
• A container executes an application-specific process with a constrained set of resources (memory, CPU, and so on).
Resource Requests

• YARN has a flexible model for making resource requests.

• A request for a set of containers can express
  - the amount of compute resources required for each container (memory and CPU), and
  - locality constraints for the containers in that request.

• Resource requests can be made at any time by a YARN application.

• An application can make all of its requests up front, or it can take a more dynamic approach whereby it requests more resources as the needs of the application change.

• Spark takes the first approach, starting a fixed number of executors on the cluster.

• MapReduce has two phases: the map task containers are requested up front, but the reduce task containers are not started until later.

• Also, if any tasks fail, additional containers will be requested so the failed tasks can be rerun.
Application Lifespan

• The lifespan of a YARN application can vary widely:
  - a short-lived application that runs for a few seconds, or
  - a long-running application that runs for days or even months.

• Applications can be categorized in terms of how they map to the jobs that users run:
  - The simplest case is one application per user job, which is the approach that MapReduce takes.
  - The second model is to run one application per workflow or user session of (possibly unrelated) jobs.
  - The third model is a long-running application that is shared by different users.



YARN Compared to MapReduce 1

• “MapReduce 1” - the distributed implementation of MapReduce in the original version of Hadoop (version 1 and earlier).
• “MapReduce 2” - the implementation that uses YARN (in Hadoop 2 and later).

• A comparison of MapReduce 1 and YARN components:

MapReduce 1    YARN
-----------    -----------------------------------------------------
Jobtracker     Resource manager, application master, timeline server
Tasktracker    Node manager
Slot           Container
MapReduce 1 vs. YARN

In MapReduce 1, there are two types of daemon that control the job execution process:
• A jobtracker, which
  - coordinates all the jobs run on the system by scheduling tasks to run on tasktrackers (done by the resource manager in YARN);
  - keeps a record of the overall progress of each job and reschedules a failed task on a different tasktracker (done in YARN by the application master, one for each MapReduce job [similar to Google MapReduce]);
  - stores job history for completed jobs (done by the timeline server in YARN).

• One or more tasktrackers, which
  - run tasks and send progress reports to the jobtracker (done by the node manager in YARN).
Benefits of YARN

• Scalability
  - YARN can run on larger clusters than MapReduce 1.
  - MapReduce 1 hits scalability bottlenecks in the region of 4,000 nodes and 40,000 tasks.
  - YARN is designed to scale up to 10,000 nodes and 100,000 tasks.

• Availability
  - With the jobtracker’s responsibilities split between the resource manager and application master in YARN, making the service highly available is much simpler.
  - High availability can be provided first for the resource manager, then for YARN applications (on a per-application basis).

• Utilization
  - In YARN, a node manager manages a pool of resources, rather than a fixed number of designated slots.
  - Resources in YARN are fine grained, so an application can make a request for what it needs, rather than for an indivisible slot, which may be too big (which is wasteful of resources) or too small (which may cause a failure) for the particular task.

• Multitenancy
  - In some ways, the biggest benefit of YARN is that it opens up Hadoop to other types of distributed application beyond MapReduce. MapReduce is just one YARN application among many.
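
The utilization point above can be made concrete with a little arithmetic. The sketch below compares fixed-size slots with requests sized to the task; the node and slot sizes are illustrative assumptions rather than Hadoop defaults, and the class SlotVsContainerSketch is a hypothetical name, not a YARN API.

```java
public class SlotVsContainerSketch {

    // Number of tasks that fit on a node when memory is handed out in fixed-size slots.
    static int tasksWithSlots(int nodeMemoryMb, int slotMb, int taskMb) {
        if (taskMb > slotMb) {
            return 0; // the slot is too small, so the task cannot run successfully
        }
        return nodeMemoryMb / slotMb; // each task occupies a whole slot, even if it needs less
    }

    // Number of tasks that fit when each task requests exactly the memory it needs.
    static int tasksWithContainers(int nodeMemoryMb, int taskMb) {
        return nodeMemoryMb / taskMb;
    }

    public static void main(String[] args) {
        int nodeMemoryMb = 8192; // assumed node size
        int slotMb = 2048;       // assumed slot size

        // Small tasks: half of every slot is wasted.
        System.out.println(tasksWithSlots(nodeMemoryMb, slotMb, 1024));  // prints 4
        System.out.println(tasksWithContainers(nodeMemoryMb, 1024));     // prints 8

        // Large tasks: no slot is big enough, but a container can be.
        System.out.println(tasksWithSlots(nodeMemoryMb, slotMb, 3072));  // prints 0
        System.out.println(tasksWithContainers(nodeMemoryMb, 3072));     // prints 2
    }
}
```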
MAP REDUCE
Anatomy of a MapReduce Job Run
• You can run a MapReduce job with a single method call: submit() on a Job object.
• Alternatively, call waitForCompletion(), which submits the job if it hasn’t been submitted already, then waits for it to finish.

There are five independent entities:

• The client, which submits the MapReduce job.

• The YARN resource manager, which coordinates the allocation of compute resources on the cluster.

• The YARN node managers, which launch and monitor the compute containers on machines in the cluster.

• The MapReduce application master, which coordinates the tasks running the MapReduce job. The application master and the MapReduce tasks run in containers that are scheduled by the resource manager and managed by the node managers.

• The distributed filesystem (normally HDFS), which is used for sharing job files between the other entities.
Job Submission
• The submit() method on Job creates an internal JobSubmitter instance and calls submitJobInternal() on it.

• Having submitted the job, waitForCompletion() polls the job’s progress once per second.
  - When the job completes successfully, the job counters are displayed.
  - Otherwise, the error that caused the job to fail is logged to the console.

• The job submission process implemented by JobSubmitter does the following:
  - Asks the resource manager for a new application ID, used for the MapReduce job ID.
  - Checks the output specification of the job.
  - Computes the input splits for the job.
  - Copies the resources needed to run the job, including the job JAR file, the configuration file, and the computed input splits, to the shared filesystem in a directory named after the job ID.
  - Submits the job by calling submitApplication() on the resource manager.


Job Initialization

• When the resource manager receives a call to its submitApplication() method, it hands off the request to the YARN scheduler.

• The scheduler allocates a container, and the resource manager then launches the application master’s process there, under the node manager’s management.

• The application master for MapReduce jobs is a Java application whose main class is MRAppMaster.

• It initializes the job by creating a number of bookkeeping objects to keep track of the job’s progress.

• Next, it retrieves the input splits computed in the client from the shared filesystem.

• It then creates a map task object for each split, as well as a number of reduce task objects determined by the mapreduce.job.reduces property.
Task Assignment
• The application master requests containers for all the map and reduce tasks in the job from the resource manager.

• Requests for map tasks are made first and with a higher priority than those for reduce tasks.

• Requests for reduce tasks are not made until 5% of map tasks have completed.
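
The 5% rule above amounts to a simple threshold check. The following sketch shows that check in isolation; the class and method names are hypothetical and this is not the application master's actual code. In Hadoop the threshold is configurable through the mapreduce.job.reduce.slowstart.completedmaps property (a detail not stated in the slides), whose default of 0.05 matches the 5% figure.

```java
public class ReduceSlowStartSketch {

    // True once the completed fraction of map tasks reaches the slow-start threshold.
    static boolean canRequestReducers(int completedMaps, int totalMaps, double slowStart) {
        if (totalMaps == 0) {
            return true; // no map tasks to wait for
        }
        return (double) completedMaps / totalMaps >= slowStart;
    }

    public static void main(String[] args) {
        double slowStart = 0.05; // 5% of map tasks, as quoted above
        System.out.println(canRequestReducers(4, 100, slowStart)); // prints false
        System.out.println(canRequestReducers(5, 100, slowStart)); // prints true
    }
}
```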

Task Execution
• The application master starts the container by contacting the node manager.

• The task is executed by a Java application whose main class is YarnChild.

• Before it can run the task, it localizes the resources that the task needs, including the job configuration and JAR file, and any files from the distributed cache.

• Finally, it runs the map or reduce task.

• The YarnChild runs in a dedicated JVM, so that any bugs in the user-defined map and reduce functions (or even in YarnChild) don’t affect the node manager.
Progress and Status Updates
• A job and each of its tasks have a status, which includes the state of the job or task (e.g., running, successfully completed, failed), the progress of maps and reduces, the values of the job’s counters, and a status message or description.
Job Completion
• When the application master receives a notification that the last task for a job is complete, it changes the status for the job to “successful.”

• When the Job polls for status, it prints a message about job completion and returns from the waitForCompletion() method.

• Job statistics and counters are printed to the console at this point.

• On job completion, the application master and the task containers clean up their working state (so intermediate output is deleted).

• The OutputCommitter’s commitJob() method is called.

• Job information is archived by the job history server.



Failures
• Failures can be due to
  - bugs in user code,
  - processes crashing, or
  - machines failing.

• One of the major benefits of using Hadoop is its ability to handle such failures and allow your job to complete successfully.

• Failure of any of the following entities has to be considered:
  - the task,
  - the application master,
  - the node manager, and
  - the resource manager.
Task Failure
Task failure can have various causes:

• User code in the map or reduce task throws a runtime exception.
  - The task JVM reports the error back to its parent application master before it exits.
  - The application master marks the task attempt as failed and frees up the container so its resources are available for another task.

• The task JVM exits suddenly, perhaps because a JVM bug causes the JVM to exit.
  - In this case, the node manager notices that the process has exited and informs the application master so it can mark the attempt as failed.

• The task hangs. The application master notices that it hasn’t received a progress update for a while (the task timeout period) and proceeds to mark the task as failed.
  - The task JVM process is killed automatically after this period.
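
The hanging-task case boils down to comparing the time since the last progress update with a timeout. The sketch below shows only that comparison; the names are hypothetical and it is not the application master's implementation. In Hadoop the timeout is set by the mapreduce.task.timeout property, in milliseconds, with a default of 600000 (10 minutes), and a value of zero disables it; these details come from the Hadoop configuration and are not stated in the slides.

```java
public class TaskTimeoutSketch {

    // True if the task has gone longer than the timeout without reporting progress.
    // A timeout of zero is treated as "timeout disabled".
    static boolean isHung(long lastProgressMs, long nowMs, long timeoutMs) {
        return timeoutMs > 0 && nowMs - lastProgressMs > timeoutMs;
    }

    public static void main(String[] args) {
        long timeoutMs = 600_000L; // 10 minutes

        System.out.println(isHung(0L, 300_000L, timeoutMs)); // prints false
        System.out.println(isHung(0L, 600_001L, timeoutMs)); // prints true
        System.out.println(isHung(0L, 600_001L, 0L));        // prints false
    }
}
```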
Application Master Failure

• Applications in YARN are retried in the event of failure.

• The maximum number of attempts to run a MapReduce application master is set by the mapreduce.am.max-attempts property.
  - The default value is 2.

• YARN imposes a limit on the maximum number of attempts for any YARN application master running on the cluster.
  - The limit is set by yarn.resourcemanager.am.max-attempts.
  - The default value is 2.

• Recovery works as follows:
  - An application master sends periodic heartbeats to the resource manager.
  - If the application master fails, the resource manager detects the failure and starts a new instance of the master running in a new container (managed by a node manager).
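
Because the cluster-wide limit applies to every application master, the job-level setting is effectively capped by it. The sketch below expresses that relationship and the resulting retry decision; the class and method names are hypothetical, and the code illustrates the rule rather than the resource manager's implementation.

```java
public class AmAttemptsSketch {

    // The job-level setting cannot exceed the cluster-wide limit.
    static int effectiveMaxAttempts(int jobSetting, int clusterLimit) {
        return Math.min(jobSetting, clusterLimit);
    }

    // A failed application master is restarted while attempts remain.
    static boolean shouldRetry(int failedAttempts, int maxAttempts) {
        return failedAttempts < maxAttempts;
    }

    public static void main(String[] args) {
        // Raising only the job-level value has no effect beyond the cluster limit.
        int max = effectiveMaxAttempts(4, 2);
        System.out.println(max);                 // prints 2
        System.out.println(shouldRetry(1, max)); // prints true
        System.out.println(shouldRetry(2, max)); // prints false
    }
}
```

To allow more attempts for a job, both properties therefore need to be raised.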
Node Manager Failure

• A node manager can fail by crashing or by running very slowly.

• It will then stop sending heartbeats to the resource manager (or send them very infrequently).

• The resource manager will notice a failed node manager if it has stopped sending heartbeats for 10 minutes.

• The failed node manager is then removed from the resource manager’s pool of nodes to schedule containers on.

• Any task or application master running on the failed node manager will be recovered.

• Node managers may be blacklisted if the number of failures for the application is high, even if the node manager itself has not failed.

• Blacklisting is done by the application master.

• The user may set the threshold with the mapreduce.job.maxtaskfailures.per.tracker job property.
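
Blacklisting can be pictured as a per-node failure counter kept by the application master. The sketch below is an illustration only: the class BlacklistSketch, its methods, and the node names are hypothetical, and whether the real check uses "at least" or "more than" the threshold is an assumption made here.

```java
import java.util.HashMap;
import java.util.Map;

public class BlacklistSketch {
    private final int maxFailuresPerNode;
    private final Map<String, Integer> failures = new HashMap<>();

    BlacklistSketch(int maxFailuresPerNode) {
        this.maxFailuresPerNode = maxFailuresPerNode;
    }

    // Records a failed task attempt on a node and reports whether the node is now blacklisted.
    boolean recordFailure(String node) {
        failures.merge(node, 1, Integer::sum);
        return isBlacklisted(node);
    }

    // Assumption: a node is blacklisted once its failure count reaches the threshold.
    boolean isBlacklisted(String node) {
        return failures.getOrDefault(node, 0) >= maxFailuresPerNode;
    }

    public static void main(String[] args) {
        BlacklistSketch tracker = new BlacklistSketch(3); // illustrative threshold
        System.out.println(tracker.recordFailure("node-a")); // prints false
        System.out.println(tracker.recordFailure("node-a")); // prints false
        System.out.println(tracker.recordFailure("node-a")); // prints true
        System.out.println(tracker.isBlacklisted("node-b")); // prints false
    }
}
```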
Resource Manager Failure

• Failure of the resource manager is serious, because without it, neither jobs nor task containers can be launched.

• To achieve high availability (HA), a pair of resource managers is run in an active-standby configuration.

• Information about all the running applications is stored in a highly available state store (backed by ZooKeeper or HDFS).

• When the new resource manager starts, it reads the application information from the state store, then restarts the application masters for all the applications.

• The transition of a resource manager from standby to active is handled by a failover controller (which by default uses ZooKeeper leader election).
Shuffle and Sort

• MapReduce makes the guarantee that the input to every reducer is sorted by key.
• The process by which the system performs the sort, and transfers the map outputs to the reducers as inputs, is known as the shuffle.

[Figure: Shuffle and sort in MapReduce]

The Map Side
• Each map task has a circular memory buffer that it writes the output to.
• The buffer is 100 MB by default (the size can be tuned by changing the mapreduce.task.io.sort.mb property).
• When the contents of the buffer reach a certain threshold size (mapreduce.map.sort.spill.percent, which defaults to 0.80, or 80%), a background thread will start to spill the contents to disk.

The Reduce Side

• The reduce task needs the map output for its particular partition from several map tasks across the cluster.
• The map tasks may finish at different times, so the reduce task starts copying their outputs as soon as each completes. This is known as the copy phase of the reduce task.
• When the in-memory buffer reaches a threshold size (mapreduce.reduce.shuffle.merge.percent) or reaches a threshold number of map outputs (mapreduce.reduce.merge.inmem.threshold), it is merged and spilled to disk.
• When all the map outputs have been copied, the reduce task moves into the sort phase, which merges the map outputs, maintaining their sort ordering.
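
With the defaults quoted above, the spill starts once 80 MB of the 100 MB buffer is in use. The sketch below just does that arithmetic; the class and method names are hypothetical.

```java
public class SpillThresholdSketch {

    // Buffer fill level, in bytes, at which the background spill begins.
    static long spillThresholdBytes(int sortBufferMb, double spillPercent) {
        return Math.round(sortBufferMb * 1024.0 * 1024.0 * spillPercent);
    }

    public static void main(String[] args) {
        // Defaults from the text: a 100 MB buffer and a 0.80 spill threshold.
        long thresholdBytes = spillThresholdBytes(100, 0.80);
        System.out.println(thresholdBytes);                 // prints 83886080
        System.out.println(thresholdBytes / (1024 * 1024)); // prints 80
    }
}
```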
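
The sort phase is really a merge: each map output arrives already sorted, so the reducer only has to interleave the sorted runs. The sketch below shows one common way to do this, a k-way merge driven by a priority queue. It illustrates the technique only and is not Hadoop's merge code; the class MergeSketch and the sample keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class MergeSketch {

    // Merges already-sorted runs into one sorted list using a min-heap of cursors.
    static List<String> merge(List<List<String>> runs) {
        // Each heap entry is {runIndex, positionInRun}, ordered by the key at that position.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                Comparator.comparing((int[] cursor) -> runs.get(cursor[0]).get(cursor[1])));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {r, 0});
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] cursor = heap.poll();
            List<String> run = runs.get(cursor[0]);
            merged.add(run.get(cursor[1]));
            if (cursor[1] + 1 < run.size()) {
                heap.add(new int[] {cursor[0], cursor[1] + 1}); // advance within the same run
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Three sorted map outputs for one reducer's partition (keys only).
        List<List<String>> runs = Arrays.asList(
                Arrays.asList("1949", "1951"),
                Arrays.asList("1950", "1952"),
                Arrays.asList("1949", "1950"));
        System.out.println(merge(runs)); // prints [1949, 1949, 1950, 1950, 1951, 1952]
    }
}
```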
Matrix Multiplication

• The cells of the two matrices are labelled Aij and Bjk.

• One-step matrix multiplication uses 1 mapper and 1 reducer.

• The formula for the mapper is:

Mapper for Matrix A: (key, value) = ((i, k), (A, j, Aij)) for all k
Mapper for Matrix B: (key, value) = ((i, k), (B, j, Bjk)) for all i

Computing the mapper for Matrix A:

• The ranges of i, j, and k determine how many pairs are emitted.
• Here every dimension is 2, so for k=1, i can take the 2 values 1 and 2,
• and each case has 2 further values, j=1 and j=2.
• Substituting all values into the formula (the emitted values correspond to A11=1, A12=2, A21=3, A22=4):

k=1  i=1  j=1  ((1, 1), (A, 1, 1))
          j=2  ((1, 1), (A, 2, 2))
     i=2  j=1  ((2, 1), (A, 1, 3))
          j=2  ((2, 1), (A, 2, 4))

k=2  i=1  j=1  ((1, 2), (A, 1, 1))
          j=2  ((1, 2), (A, 2, 2))
     i=2  j=1  ((2, 2), (A, 1, 3))
          j=2  ((2, 2), (A, 2, 4))

Computing the mapper for Matrix B (the emitted values correspond to B11=5, B12=6, B21=7, B22=8):

i=1  j=1  k=1  ((1, 1), (B, 1, 5))
          k=2  ((1, 2), (B, 1, 6))
     j=2  k=1  ((1, 1), (B, 2, 7))
          k=2  ((1, 2), (B, 2, 8))

i=2  j=1  k=1  ((2, 1), (B, 1, 5))
          k=2  ((2, 2), (B, 1, 6))
     j=2  k=1  ((2, 1), (B, 2, 7))
          k=2  ((2, 2), (B, 2, 8))

The formula for the reducer is:

Reducer(key, values): for each key (i, k), make a sorted Alist and Blist
(i, k) => sum over j of (Aij * Bjk)
Output => ((i, k), sum)

Computing the reducer:

The mapper computation shows that 4 keys are common to both matrices: (1, 1), (1, 2), (2, 1), and (2, 2).
For each key, make separate lists for Matrix A and Matrix B from the values emitted in the mapper step above:

(1, 1) => Alist = {(A, 1, 1), (A, 2, 2)}
          Blist = {(B, 1, 5), (B, 2, 7)}
          Aij x Bjk: [(1*5) + (2*7)] = 19 -------(i)

(1, 2) => Alist = {(A, 1, 1), (A, 2, 2)}
          Blist = {(B, 1, 6), (B, 2, 8)}
          Aij x Bjk: [(1*6) + (2*8)] = 22 -------(ii)

(2, 1) => Alist = {(A, 1, 3), (A, 2, 4)}
          Blist = {(B, 1, 5), (B, 2, 7)}
          Aij x Bjk: [(3*5) + (4*7)] = 43 -------(iii)

(2, 2) => Alist = {(A, 1, 3), (A, 2, 4)}
          Blist = {(B, 1, 6), (B, 2, 8)}
          Aij x Bjk: [(3*6) + (4*8)] = 50 -------(iv)

From (i), (ii), (iii), and (iv) we conclude that the output is:
((1, 1), 19)
((1, 2), 22)
((2, 1), 43)
((2, 2), 50)
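
The whole computation above can be checked with a small in-memory simulation. The sketch below follows the same mapper and reducer formulas in plain Java, without Hadoop. The class name MatrixMultiplySketch, the "i,k" string keys, and the numeric tags 0 and 1 standing for A and B are illustrative choices; the matrices are the ones implied by the emitted values, A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MatrixMultiplySketch {

    // One-step MapReduce matrix multiplication, simulated in a single process.
    static Map<String, Integer> multiply(int[][] a, int[][] b) {
        int rowsA = a.length, inner = b.length, colsB = b[0].length;

        // Map phase: key "i,k" -> values {tag, j, cell}, with tag 0 for A and 1 for B.
        Map<String, List<int[]>> grouped = new TreeMap<>();
        for (int i = 0; i < rowsA; i++) {
            for (int j = 0; j < inner; j++) {
                for (int k = 0; k < colsB; k++) {
                    // Aij is emitted once for every column k of B.
                    grouped.computeIfAbsent((i + 1) + "," + (k + 1), key -> new ArrayList<>())
                           .add(new int[] {0, j, a[i][j]});
                    // Bjk is emitted once for every row i of A.
                    grouped.computeIfAbsent((i + 1) + "," + (k + 1), key -> new ArrayList<>())
                           .add(new int[] {1, j, b[j][k]});
                }
            }
        }

        // Reduce phase: for each key (i, k), pair up A and B values by j and sum the products.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<int[]>> entry : grouped.entrySet()) {
            int[] aByJ = new int[inner];
            int[] bByJ = new int[inner];
            for (int[] value : entry.getValue()) {
                if (value[0] == 0) {
                    aByJ[value[1]] = value[2];
                } else {
                    bByJ[value[1]] = value[2];
                }
            }
            int sum = 0;
            for (int j = 0; j < inner; j++) {
                sum += aByJ[j] * bByJ[j];
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] a = { {1, 2}, {3, 4} };
        int[][] b = { {5, 6}, {7, 8} };
        System.out.println(multiply(a, b)); // prints {1,1=19, 1,2=22, 2,1=43, 2,2=50}
    }
}
```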
