Group 11 Assignment 4

This document describes running a MapReduce word count program on Hadoop to count the words in a large text file. It provides steps to install MapReduce with Hadoop, write and compile the WordCount Java program, run the program to count words in a sample input file, and view the output results. It also discusses problems encountered with incorrect hostnames, and shows that results did not change after shutting down one VM. Finally, it describes running a MapReduce program to calculate average grades for courses using a generated input file of student grades.

Uploaded by

saurabh nakra

COL733 CLOUD COMPUTING

TECHNO.FUNDA.

Guided by Prof. S C Gupta

Assignment 4

MapReduce

Group 11

Abhijit Kawale (2012CS50273)
Prerit Patidar (2012CS50294)
Arvind Bhuria (2012CS50280)
Saurabh Nakra (2012CS50298)
Bhavesh Chauhan (2012CS10221)

Installation Steps:
MapReduce was installed with Hadoop in the previous assignment.
Files required to change were: mapred-site.xml and yarn-site.xml
mapred-site.xml contains the host and port for the MapReduce job tracker, and
yarn-site.xml contains the properties for the node to work as a YARN node.

Part A: Large text file word count

Running MapReduce:
To run the WordCount program, the code was written in WordCount.java and uploaded to
the master VM.
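The report does not reproduce WordCount.java, so the following is only a minimal plain-Java sketch of the data flow a word count job follows: the map step emits a (word, 1) pair per token and the reduce step sums the pairs for each word. It uses no Hadoop classes, and the class and method names (WordCountSketch, map, reduce, countWords) are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch of the word-count data flow; no Hadoop classes are used.
// All names here are hypothetical and not taken from the group's WordCount.java.
final class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(Map.entry(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle + reduce phase: group the pairs by word and sum the ones.
    // A TreeMap keeps the words sorted, similar to the sorted keys a reducer sees.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Convenience driver: run map over every line, then reduce once.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "sample file here are the random words",
                "sample file here are the random words");
        System.out.println(countWords(lines));
    }
}
```

On a real cluster the map and reduce steps run as separate tasks on different nodes; here they run in one process only to make the flow easy to follow.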

Step 1
Directory units was created in the home folder to store all the .class files.
Command: mkdir units

Step 2
hadoop-core.jar was needed for the word count program to compile and execute. So,
hadoop-core-1.2.1.jar was downloaded to the master VM.
Command: wget http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1

Step 3
The following commands are used for compiling the WordCount.java program and
creating a jar for the program.
javac -classpath hadoop-core-1.2.1.jar -d units WordCount.java
jar -cvf units.jar -C units/ .

Step 4
The following command is used to create an input directory in HDFS.
$HADOOP_HOME/bin/hadoop fs -mkdir /user/hadoop/input_dir

Step 5
The following command is used to copy the input file named sample.txt into the input
directory of HDFS.
$HADOOP_HOME/bin/hadoop fs -put /home/hduser/sample.txt /user/hadoop/input_dir

Step 6
After this, the application was run using:
$HADOOP_HOME/bin/hadoop jar units.jar WordCount /user/hadoop/input_dir /user/hadoop/output_dir

Step 7
The following command is used to see the output in the part-00000 file. This file is
written to the HDFS output directory by the MapReduce job.
$HADOOP_HOME/bin/hadoop fs -cat /user/hadoop/output_dir/part-00000

Problems Encountered:
When running the application we got the following error:

Got exception: java.net.ConnectException: Call From
baadaldesktopvm/127.0.1.1 to
baadalservervm.cse.iitd.ernet.in:54310 failed on connection
exception: java.net.ConnectException: Connection refused

We figured out that the problem was in the /etc/hostname file, which contains the VM
hostname. It was baadaldesktopvm by default, but it should have been the same as the
master or slave name added to /etc/hosts. So, we changed /etc/hostname in each of the
VMs and set the hostname to master for the master VM, slave1 for slave1, slave2 for
slave2 and slave3 for slave3. Finally, the problem got fixed and we ran the word count
application successfully.
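The error above shows the local hostname paired with the loopback address 127.0.1.1. As a rough illustration of the kind of check that can flag such a mapping, the sketch below scans the text of a hosts file and reports whether a hostname maps only to non-loopback addresses. The class HostsCheck and its methods are hypothetical helpers, and the address 10.0.0.1 in the usage example is made up; the report does not list the actual IP addresses of the VMs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: inspects hosts-file text for a hostname.
final class HostsCheck {

    // Return every address that the hosts-file text maps to the given hostname.
    static List<String> addressesFor(String hostsFileText, String hostname) {
        List<String> addresses = new ArrayList<>();
        for (String raw : hostsFileText.split("\n")) {
            String line = raw.replaceAll("#.*", "").trim(); // drop comments
            if (line.isEmpty()) continue;
            String[] fields = line.split("\\s+"); // address, then one or more names
            for (int i = 1; i < fields.length; i++) {
                if (fields[i].equals(hostname)) addresses.add(fields[0]);
            }
        }
        return addresses;
    }

    // True only if the hostname has at least one mapping and none of its
    // mappings is a loopback address (127.x.x.x or ::1).
    static boolean resolvesOffLoopback(String hostsFileText, String hostname) {
        List<String> addresses = addressesFor(hostsFileText, hostname);
        if (addresses.isEmpty()) return false;
        for (String address : addresses) {
            if (address.startsWith("127.") || address.equals("::1")) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Reads the local files named in the report; run on the VM being checked.
        String hostname = Files.readString(Path.of("/etc/hostname")).trim();
        String hosts = Files.readString(Path.of("/etc/hosts"));
        System.out.println(hostname + " resolves off loopback: "
                + resolvesOffLoopback(hosts, hostname));
    }
}
```

For example, with the hosts text "127.0.1.1 baadaldesktopvm" followed by "10.0.0.1 master", the check returns false for baadaldesktopvm and true for master.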

Here are the screenshots after running:

Results for WORDCOUNT:
The input file used was sample.txt, present in the submission folder. It was a 2.4 MB file
which contains the line "sample file here are the random words" 62856 times.
Results obtained after running the word count program were as follows:

As can be seen, the output of the program was accurate.
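Since every line of the input is identical, the expected result is easy to state: each of the seven words should have a count of 62856. The sketch below shows one way to check a saved copy of the output against that expectation. It assumes the output uses one "word<TAB>count" pair per line, and the class name WordCountOutputCheck and the file path passed to main are hypothetical. It also computes the input size implied by the line length, assuming ASCII text and one newline per line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical checker for the word count output described in the report.
final class WordCountOutputCheck {
    static final String LINE = "sample file here are the random words";
    static final int REPEATS = 62856;

    // Parse "word<TAB>count" lines (assumed output layout) into a map.
    static Map<String, Long> parse(List<String> outputLines) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : outputLines) {
            if (line.isBlank()) continue;
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                throw new IllegalArgumentException("unexpected line: " + line);
            }
            counts.put(parts[0], Long.parseLong(parts[1].trim()));
        }
        return counts;
    }

    // True when the output holds exactly the words of LINE, each counted REPEATS times.
    static boolean matchesExpected(Map<String, Long> counts) {
        String[] words = LINE.split(" ");
        if (counts.size() != words.length) return false;
        for (String word : words) {
            if (!Long.valueOf(REPEATS).equals(counts.get(word))) return false;
        }
        return true;
    }

    // Input size in bytes, assuming ASCII text and one '\n' after every line.
    static long expectedInputBytes() {
        return (long) (LINE.length() + 1) * REPEATS;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: local copy of part-00000 (hypothetical path supplied by the user).
        Map<String, Long> counts = parse(Files.readAllLines(Path.of(args[0])));
        System.out.println("matches expected counts: " + matchesExpected(counts));
        System.out.println("expected input bytes: " + expectedInputBytes());
    }
}
```

Under these assumptions the input comes to 38 bytes per line times 62856 lines, which is 2,388,528 bytes and agrees with the stated file size of about 2.4 MB.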

(b) After shutting down one VM, the results did not change. Here are the
screenshots.

Computing Average grade of the courses using
MapReduce

The input file was generated using the Java code Average.java present in the submission folder.
It contains 10,000 rows, and 1250 students are distributed among 8 courses.
Average.java was created to calculate the average grade of each of the 8 courses.
For the input file grades_our.txt present in the submission folder, the output was:

Approach Used:
The Mapper returns a key-value pair. The key was the course id, and the value was the
corresponding grade of a student in that course.
The Combiner and Reducer take the input key as Text (course id) and the input values as an
iterator of <FloatWritable>, and output the key as Text (course id) and the value as
FloatWritable (average grade).
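Average.java is not reproduced in the report, so the following is not the group's code. It is a minimal plain-Java sketch of one common way to structure a per-course average with a combine step: each partial result carries a (sum, count) pair, and the division happens once at the end. This variant is shown because averaging partial averages is exact only when every partial covers the same number of grades. The record layout "courseId grade" and all names (CourseAverageSketch, SumCount, combine, reduce) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch with hypothetical names; no Hadoop classes are used.
final class CourseAverageSketch {

    // Partial aggregate for one course: sum of grades seen and how many were seen.
    static final class SumCount {
        final double sum;
        final long count;

        SumCount(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        SumCount plus(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }

        double average() {
            return sum / count;
        }
    }

    // Map + combine for one input split: parse "courseId grade" records
    // (assumed layout) and fold them into one partial per course.
    static Map<String, SumCount> combine(List<String> records) {
        Map<String, SumCount> partial = new TreeMap<>();
        for (String record : records) {
            String[] fields = record.trim().split("\\s+");
            SumCount one = new SumCount(Double.parseDouble(fields[1]), 1);
            partial.merge(fields[0], one, SumCount::plus);
        }
        return partial;
    }

    // Reduce: merge the partials from every split, then divide once per course.
    static Map<String, Double> reduce(List<Map<String, SumCount>> partials) {
        Map<String, SumCount> merged = new TreeMap<>();
        for (Map<String, SumCount> partial : partials) {
            partial.forEach((course, sc) -> merged.merge(course, sc, SumCount::plus));
        }
        Map<String, Double> averages = new TreeMap<>();
        merged.forEach((course, sc) -> averages.put(course, sc.average()));
        return averages;
    }

    public static void main(String[] args) {
        Map<String, SumCount> splitA = combine(List.of("C1 10", "C1 20", "C2 5"));
        Map<String, SumCount> splitB = combine(List.of("C1 30"));
        System.out.println(reduce(List.of(splitA, splitB)));
    }
}
```

In the usage example, course C1 has grades 10 and 20 in one split and 30 in the other. Merging sums and counts gives 60 / 3 = 20, whereas averaging the two partial averages (15 and 30) would give 22.5.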

Here are the screenshots after running:
