www.edureka.co/big-data-and-hadoop
EDUREKA HADOOP CERTIFICATION TRAINING
Agenda for Today's Session
• MapReduce Way
• Classes and Packages in MapReduce
• Explanation of a Complete MapReduce Program
• MapReduce Examples on Analytics
• MapReduce Example on Testing
MapReduce Example on Word Count Process
MapReduce Way
MapReduce Way – Word Count Process
Input/Output Classes in MapReduce
Input Format – Class Hierarchy
Output Format – Class Hierarchy
Packages and Classes in Word Count
MapReduce Example
Packages to Import
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
The org.apache.hadoop.fs, org.apache.hadoop.conf, and org.apache.hadoop.io packages are present in hadoop-common.jar.
The org.apache.hadoop.mapreduce packages are present in hadoop-mapreduce-client-core.jar.
Mapper Class
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
Map is the name of the mapper class, which inherits from the superclass Mapper.
The Mapper class takes 4 type parameters, i.e.
Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
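To make the four type parameters concrete, here is a minimal plain-Java sketch of what a word-count map step does. It is not the Hadoop API: Long, String, and Integer stand in for LongWritable, Text, and IntWritable, and the class name WordCountMapSketch is made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java stand-in for a word-count map step (no Hadoop dependency).
// Long/String/Integer stand in for LongWritable/Text/IntWritable.
final class WordCountMapSketch {

    // KEYIN = byte offset of the line, VALUEIN = the line itself,
    // KEYOUT = one word, VALUEOUT = the count 1 for that occurrence.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            // Emit (word, 1) for every token, duplicates included.
            out.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Word count ignores the offset; only the line text matters.
        System.out.println(map(0L, "deer bear river deer"));
    }
}
```

The map step does no counting itself. It only emits one pair per word, and the framework groups the pairs by key before the reduce step.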
Reducer Class
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
Reduce is the name of the reducer class, which inherits from the superclass Reducer.
The Reducer class takes 4 type parameters, i.e.
Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
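As a rough illustration of the reducer's four type parameters, the following plain-Java sketch sums the values grouped under one word. It does not use the Hadoop API: String and Integer stand in for Text and IntWritable, and the class name WordCountReduceSketch is hypothetical.

```java
import java.util.Arrays;

// Plain-Java stand-in for a word-count reduce step (no Hadoop dependency).
final class WordCountReduceSketch {

    // KEYIN = a word, VALUEIN = each count grouped under that word,
    // KEYOUT = the same word, VALUEOUT = the total (returned here).
    static int reduce(String word, Iterable<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two occurrences of "deer" arrive as the values 1 and 1.
        System.out.println("deer\t" + reduce("deer", Arrays.asList(1, 1)));
    }
}
```

The reducer's input types (Text, IntWritable) have to match the mapper's output types, which is why both classes in the word count program share them.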
It's Time to See Some MapReduce Examples
MapReduce is useful in a wide range of applications across multiple domains.
It is mainly used for 2 things:
• Analytics: Process the data and produce the desired results
• Testing: Run a few test cases using MRUnit
Let us see a few MapReduce Examples on Analytics
MapReduce Temperature Example
Weather Forecasting
• Problem Statement:
» Analyse weather data for Austin to determine hot and cold days.
We have a weather data set for Austin from NCEI.
NOAA's National Centers for Environmental Information (NCEI) (previously NCDC) is responsible for preserving, monitoring, assessing, and providing public access to the Nation's treasure of climate and historical weather data and information.
Refer -> ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/daily01
Temperature Example
Temperature Example - Weather Dataset
6th Column: Max Temp
7th Column: Min Temp
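A minimal plain-Java sketch of the hot/cold classification a map step could perform on one record. Everything specific here is an assumption, not something the slides state: the thresholds (above 40 is hot, below 10 is cold), the -9999.0 missing-value marker, whitespace-separated fields with the date in the 2nd column, and the sample records, which are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the map logic of the temperature example.
final class HotColdDaySketch {

    // Assumed thresholds in degrees Celsius; the slides do not give them.
    static final double HOT_ABOVE = 40.0;
    static final double COLD_BELOW = 10.0;
    // Assumed marker for a missing reading.
    static final double MISSING = -9999.0;

    // Returns zero, one, or two labels for a single daily record.
    static List<String> classify(String line) {
        String[] fields = line.trim().split("\\s+");
        String date = fields[1];                          // assumed: 2nd column is the date
        double maxTemp = Double.parseDouble(fields[5]);   // 6th column: max temp
        double minTemp = Double.parseDouble(fields[6]);   // 7th column: min temp

        List<String> labels = new ArrayList<>();
        if (maxTemp != MISSING && maxTemp > HOT_ABOVE) {
            labels.add("Hot Day " + date);
        }
        if (minTemp != MISSING && minTemp < COLD_BELOW) {
            labels.add("Cold Day " + date);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up records in the assumed layout, not real station data.
        System.out.println(classify("23907 20150101 2.423 -98.08 30.62 2.2 -0.6"));
        System.out.println(classify("23907 20150703 2.423 -98.08 30.62 41.5 24.0"));
    }
}
```

Checking for the missing-value marker first matters, because a placeholder like -9999.0 would otherwise count as a cold day.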
MapReduce Example
Last.fm Example
Last.fm is an online music website where users listen to various tracks, and the data is collected as shown below. Write a MapReduce program to get the number of unique listeners.
The data arrives in log files and looks like this:
UserId  TrackId  Shared  Radio  Skip
100001  150      1       1      0
100005  103      0       0      1
100142  78       1       0      0
110005  289      1       0      1
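One way to approach this, sketched in plain Java without the Hadoop API: key each record by TrackId, collect the UserIds under it, and count the distinct ones. Counting per track is an assumption about what the exercise intends, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java stand-in for a unique-listeners job (no Hadoop dependency).
final class UniqueListenersSketch {

    static Map<String, Integer> uniqueListenersPerTrack(String[] logLines) {
        // "Map" step: key each record by TrackId and carry the UserId.
        Map<String, Set<String>> usersByTrack = new TreeMap<>();
        for (String line : logLines) {
            String[] fields = line.trim().split("\\s+");
            // Skip the header row and malformed records.
            if (fields.length < 5 || !fields[0].chars().allMatch(Character::isDigit)) {
                continue;
            }
            usersByTrack.computeIfAbsent(fields[1], k -> new HashSet<>()).add(fields[0]);
        }
        // "Reduce" step: the set drops repeat listens, so its size is the answer.
        Map<String, Integer> counts = new TreeMap<>();
        usersByTrack.forEach((track, users) -> counts.put(track, users.size()));
        return counts;
    }

    public static void main(String[] args) {
        String[] log = {
            "UserId TrackId Shared Radio Skip",
            "100001 150 1 1 0",
            "100005 103 0 0 1",
            "100142 78 1 0 0",
            "110005 289 1 0 1"
        };
        System.out.println(uniqueListenersPerTrack(log));
    }
}
```

A set is used rather than a counter so that a user who plays the same track several times is counted only once.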
Let us see a MapReduce Example on Testing
MRUnit Testing Framework
• Provides 4 drivers for separately testing MapReduce code
» MapDriver
» ReduceDriver
» MapReduceDriver
» PipelineMapReduceDriver
• Helps fill the gap between MapReduce programs and JUnit*
• Gives better control over log messages through JUnit integration
*JUnit is a simple framework to write repeatable tests.
MapReduce MRUnit Example
Learning Resources
• Hadoop Tutorial: www.edureka.co/blog/hadoop-tutorial
• MapReduce Tutorial: www.edureka.co/blog/mapreduce-tutorial
• MapReduce Interview Questions: www.edureka.co/blog/interview-questions/hadoop-interview-questions-mapreduce
Thank You …
Questions/Queries/Feedback

MapReduce Example | MapReduce Programming | Hadoop MapReduce Tutorial | Edureka