Experiment 6 BDA

Uploaded by

pabocon672

EXPERIMENT NO - 6

AIM: To write a program that implements word count using MapReduce.

THEORY:
WordCount is a simple program that counts the number of occurrences of each word in a
given text input data set. WordCount fits the MapReduce programming model very well,
which makes it a good example for understanding the Hadoop MapReduce programming
style. The implementation consists of three main parts:
1. Mapper
2. Reducer
3. Driver

Step-1. Write a Mapper

A Mapper overrides the "map" function of the class
"org.apache.hadoop.mapreduce.Mapper", which receives <key, value> pairs as input.
A Mapper implementation may output <key, value> pairs using the provided Context.
The input value of the WordCount map task is a line of text from the input data file.
With TextInputFormat, the key is the byte offset of that line within the file rather
than a line number: <byte_offset, line_of_text>. The map task outputs <word, 1> for
each word in the line of text.

Pseudo-code

void Map (key, value){
    for each word x in value:
        output.collect(x, 1);
}
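The map step can also be tried without a cluster. The following is a minimal plain-Java sketch, assuming whitespace-separated words as in the StringTokenizer-based program in this report; the class MapSketch and its method are hypothetical names used for illustration and are not part of the Hadoop API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java illustration of the map step (hypothetical class, no Hadoop dependency).
final class MapSketch {

    // Emits one <word, 1> pair per whitespace-separated token in the line.
    // The input key is accepted but ignored, as in WordCount.
    static List<Map.Entry<String, Integer>> map(long key, String line) {
        List<Map.Entry<String, Integer>> output = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            output.add(new SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return output;
    }

    public static void main(String[] args) {
        // Prints the emitted pairs for one sample line.
        System.out.println(map(0L, "hello world hello"));
    }
}
```

Note that the mapper does not combine repeated words: "hello" appears twice in the sample line and is emitted twice, each time with the value 1. The summing is left to the reducer.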

Step-2. Write a Reducer

A Reducer collects the intermediate <key, value> output from multiple map tasks and
assembles a single result. Here, the WordCount program sums the occurrences of each
word and emits pairs of the form <word, occurrence>.

Pseudo-code

void Reduce (keyword, <list of value>){
    sum = 0;
    for each x in <list of value>:
        sum += x;
    final_output.collect(keyword, sum);
}
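Between map and reduce, the framework groups all values that share a key, which is why the reducer receives a list of values per word. The plain-Java sketch below imitates that grouping and then applies the reduce logic from the pseudo-code. ReduceSketch, groupByKey and reduce are hypothetical names chosen for illustration; in Hadoop the grouping is done by the framework, not by user code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of grouping followed by reduce (hypothetical class, no Hadoop dependency).
final class ReduceSketch {

    // Stands in for the framework's shuffle: collects a 1 for every emitted word under its key.
    // A TreeMap is used so that keys come out in sorted order.
    static Map<String, List<Integer>> groupByKey(List<String> emittedWords) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : emittedWords) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    // The reduce step: sums the list of values belonging to one key.
    static int reduce(String keyword, List<Integer> values) {
        int sum = 0;
        for (int x : values) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = groupByKey(Arrays.asList("hello", "world", "hello"));
        // Prints one "word<TAB>count" line per key.
        grouped.forEach((word, ones) -> System.out.println(word + "\t" + reduce(word, ones)));
    }
}
```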
Code:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.fs.Path;

public class WordCount
{

    // Mapper: emits <word, 1> for every token in the input line
    public static class Map extends Mapper<LongWritable,Text,Text,IntWritable>
    {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                value.set(tokenizer.nextToken());
                context.write(value, new IntWritable(1));
            }
        }
    }

    // Reducer: sums the counts emitted for each word
    public static class Reduce extends Reducer<Text,IntWritable,Text,IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable x : values) {
                sum += x.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    // Driver: configures and submits the job
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "My Word Count Program");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        Path outputPath = new Path(args[1]);

        // Configuring the input/output path from the filesystem into the job
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, outputPath);

        // Deleting the output path (recursively) from HDFS so that we don't
        // have to delete it explicitly before each run
        outputPath.getFileSystem(conf).delete(outputPath, true);

        // Exiting with status 0 if the job succeeds and 1 if it fails
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Final OUTPUT:
hello 2
world 1
hadoop 2
mapreduce 1
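
The report does not show the input file. As a sanity check, the sketch below runs the same counting logic locally in plain Java on a hypothetical two-line input that was chosen only because it yields the counts listed above. LocalWordCount and the sample lines are assumptions made for illustration, and nothing here touches Hadoop or HDFS.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Local stand-in for the whole job (hypothetical class, no Hadoop dependency).
final class LocalWordCount {

    // Tokenizes every line on whitespace and accumulates a count per word.
    // A TreeMap keeps the words sorted by key.
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                counts.merge(tokenizer.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input; the actual input file is not given in the report.
        List<String> lines = Arrays.asList("hello world hello hadoop", "hadoop mapreduce");
        count(lines).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Hadoop passes keys to a reducer in sorted order, so with a single reducer the output file would normally list the words alphabetically; the listing above keeps the order given in the report.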

CONCLUSION:

We have studied the concept of MapReduce and implemented a word count program
using it.

Date: 23/09/2024

Name of Student: Huzaif H Shaikh

Roll No.: 53

Marks: Signature of Supervisor
