Big Data Analytics [15CS82]
MODULE 1
Chapter 2: Running Example Programs and Benchmarks
When using new or updated hardware or software, simple examples and benchmarks help
confirm proper operation. Apache Hadoop includes many examples and benchmarks to aid
in this task. This chapter provides instructions on how to run, monitor, and manage some
basic MapReduce examples and benchmarks.
Running MapReduce Examples
All Hadoop releases come with MapReduce example applications. Running the existing
MapReduce examples is a simple process once the example files are located. For
example, if you installed Hadoop version 2.6.0 from the Apache sources under
/opt, the examples will be in the following directory:
/opt/hadoop-2.6.0/share/hadoop/mapreduce/
Once you define the examples path, you can run the Hadoop examples using the commands
discussed in the following sections.
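For convenience, define an environment variable that points to the examples directory. The path below matches the Hadoop 2.6.0 installation described above; note that in some distributions the examples jar carries a version suffix (e.g., hadoop-mapreduce-examples-2.6.0.jar). Running the jar with no arguments prints the list of available example programs:
$ export HADOOP_EXAMPLES=/opt/hadoop-2.6.0/share/hadoop/mapreduce
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar
Among the examples listed are the following: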
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input
files.
wordmedian: A map/reduce program that counts the median length of the words in the input
files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the
length of the words in the input files.
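Any of these can be run directly once input data exist in HDFS. As a minimal sketch, the following runs wordcount on a hypothetical input directory (the /user/hdfs paths here are illustrative, and the output directory must not already exist):
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar wordcount /user/hdfs/input /user/hdfs/output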
To illustrate several features of Hadoop and the YARN ResourceManager service GUI, the
pi and terasort examples are presented next.
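The pi example estimates the value of pi using a quasi-Monte Carlo method spread across map tasks. The following is a sketch of an invocation consistent with the run shown in the figures below (16 map tasks, as reported later in this chapter; the samples-per-map value is illustrative):
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar pi 16 1000000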
Figure 2.1 Hadoop RUNNING Applications web GUI for the pi example
For those readers who have used or read about Hadoop version 1, if you look at the Cluster
Metrics table, you will see some new information. First, you will notice that the
“Map/Reduce Task Capacity” has been replaced by the number of running containers. If
YARN is running a MapReduce job, these containers can be used for both map and reduce
tasks. Unlike in Hadoop version 1, the number of mappers and reducers is not fixed. There
are also memory metrics and links to node status. If you click on the Nodes link (left menu
under About), you can get a summary of the node activity and state. For example, Figure 2.2
is a snapshot of the node activity while the pi application is running. Notice the number of
containers, which are used by the MapReduce framework as either mappers or reducers.
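The same node summary can also be pulled from the command line with the YARN CLI:
$ yarn node -list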
Going back to the main Applications/Running window (Figure 2.1), if you click on the
application_14299… link, the Application status window in Figure 2.3 will appear. This
window provides an application overview and metrics, including the cluster node on which
the Application Master container is running.
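A command-line view of the same overview is available through the YARN CLI; for example, using the application-id shown later in this chapter:
$ yarn application -status application_1429912013449_0044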
Clicking the Application Master link next to “Tracking URL:” in Figure 2.3 leads to the
window shown in Figure 2.4. Note that the link to the application’s Application Master is
also found in the last column on the main Running Applications screen shown in Figure 2.1.
In the MapReduce Application window, you can see the details of the MapReduce
application and the overall progress of mappers and reducers. Instead of containers, the
MapReduce application now refers to maps and reducers. Clicking job_14299… brings up
the window shown in Figure 2.5. This window displays more detail about the number of
pending, running, completed, and failed mappers and reducers, including the elapsed time
since the job started.
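The MapReduce framework reports similar job-level progress on the command line; as a sketch, using the job-id referenced below:
$ mapred job -status job_1429912013449_0044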
Figure 2.4 Hadoop YARN Application Master for MapReduce application
Going back to the job summary page (Figure 2.6), you can also examine the logs for the
Application Master by clicking the “logs” link. To find information about the mappers and
reducers, click the numbers under the Failed, Killed, and Successful columns. In this
example, there were 16 successful mappers and one successful reducer. All the numbers in
these columns lead to more information about individual map or reduce processes.
For instance, clicking the “16” under “Successful” in Figure 2.6 displays the table of map
tasks in Figure 2.8. The metrics for the Application Master container are displayed in table
form. There is also a link to the log file for each process (in this case, a map process).
Viewing the logs requires that the yarn.log-aggregation-enable property in the yarn-site.xml
file be set to true.
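When log aggregation is enabled, the aggregated logs can also be retrieved from the command line once the application finishes; for example, using this chapter’s application-id:
$ yarn logs -applicationId application_1429912013449_0044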
If you return to the main cluster window (Figure 2.1), choose Applications/Finished, and
then select our application, you will see the summary page shown in Figure 2.9.
There are a few things to notice in the previous windows. First, because YARN manages
applications, all information reported by the ResourceManager concerns the resources
provided and the application type (in this case, MAPREDUCE). In Figure
2.1 and Figure 2.4, the YARN ResourceManager refers to the pi example by its application-id
(application_1429912013449_0044). YARN has no data about the actual application other
than the fact that it is a MapReduce job. Data from the actual MapReduce job are provided by
the MapReduce framework and referenced by a job-id (job_1429912013449_0044) in
Figure 2.6. Thus, two clearly different data streams are combined in the web GUI: YARN
applications and MapReduce framework jobs. If the framework does not provide job
information, then certain parts of the web GUI will not have anything to display.
Another interesting aspect of the previous windows is the dynamic nature of the mapper and
reducer tasks. These tasks are executed as YARN containers, and their number will change as
the application runs. Users may request specific numbers of mappers and reducers, but the
ApplicationMaster uses them in a dynamic fashion. As mappers complete, the
ApplicationMaster will return the containers to the ResourceManager and request a smaller
number of reducer containers. This feature provides for much better cluster utilization
because mappers and reducers are dynamic—rather than fixed—resources.
Running the terasort Benchmark
This benchmark provides combined testing of the HDFS and MapReduce layers of a Hadoop
cluster. A full terasort benchmark run consists of the following three steps:
1. Generating the input data via the teragen program.
2. Running the actual terasort benchmark on the input data.
3. Validating the sorted output data via the teravalidate program.
In general, each row is 100 bytes long; thus the total amount of data written is
100 times the number of rows specified as part of the benchmark (i.e., to write 100GB of
data, use 1 billion rows). The input and output directories need to be specified in HDFS. The
following sequence of commands will run the benchmark for 50GB of data as user hdfs.
Make sure the /user/hdfs directory exists in HDFS before running the benchmarks.
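If needed, the directory can be created (as the hdfs user) with:
$ hdfs dfs -mkdir -p /user/hdfs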
1. Run teragen to generate rows of random data to sort.
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar teragen 500000000 /user/hdfs/TeraGen-50GB
2. Run terasort to sort the database.
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar terasort /user/hdfs/TeraGen-50GB /user/hdfs/TeraSort-50GB
3. Run teravalidate to validate the sort.
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar teravalidate /user/hdfs/TeraSort-50GB /user/hdfs/TeraValid-50GB
To report results, the time for the actual sort (terasort) is measured and the benchmark rate in
megabytes/second (MB/s) is calculated. For best performance, the actual terasort benchmark
should be run with a replication factor of 1. In addition, the default number of terasort reducer
tasks is set to 1. Increasing the number of reducers often helps with benchmark performance.
For example, the following command will instruct terasort to use four reducer tasks:
$ yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar terasort -Dmapred.reduce.tasks=4 /user/hdfs/TeraGen-50GB /user/hdfs/TeraSort-50GB
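To measure the sort time yourself, one simple approach (a sketch using the shell’s built-in time command) is to time the terasort step and divide the data size by the elapsed wall-clock seconds:
$ time yarn jar $HADOOP_EXAMPLES/hadoop-mapreduce-examples.jar terasort -Dmapred.reduce.tasks=4 /user/hdfs/TeraGen-50GB /user/hdfs/TeraSort-50GB
Taking 50GB as 50,000 MB, a sort that takes, say, 500 seconds corresponds to a rate of 50,000/500 = 100 MB/s.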
Also, do not forget to clean up the terasort data between runs (and after testing is finished).
The following command will perform the cleanup for the previous example:
$ hdfs dfs -rm -r -skipTrash Tera*