BDM Lab Manual 2
Ex.No. 6
Load and Execute Wordcount MapReduce Java code in Hadoop
Load and Execute Wordcount MapReduce Python code in Hadoop
Here, wordcount is the MapReduce program that runs on the test2.txt file and generates the
output file inside the /output folder.
Note: On a subsequent execution of the program, Hadoop cannot create a new output file under
the same name inside the same /output folder. So either delete the existing /output folder before
running the code again, or choose a different folder as the output folder.
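For example, before a re-run the previous output folder can be removed from HDFS and the job submitted again. The jar name, driver class name and file names below are only illustrative assumptions; replace them with the ones actually used in this exercise.
$ hadoop fs -rm -r /output                      # remove the previous output folder from HDFS
$ hadoop jar wordcount.jar WordCount /test2.txt /output
$ hadoop fs -cat /output/part-r-00000           # part-r-00000 is the default reducer output file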
Ex.No. 7
Load and execute existing WordCount MapReduce code in IntelliJ
iii) Load the word count files into the current project
Copy all three WordCount Java source files from their folder
Paste them inside the src folder of the Project Explorer window in IntelliJ
The file WordCount.java contains the main() method, which is executed by choosing
the menu option Run or the “Run WordCount.main()” option from the right-click
menu. At this stage it is still erroneous, since it requires the supporting Hadoop
libraries, which are not available in IntelliJ by default.
v) Now select the menu option Run or the “Run WordCount.main()” option from the
right-click menu to execute the code.
Even on successful execution, it does not show the output, since the program expects
the input file name as a command-line argument (supplied through the Run Configuration).
Ex.No. 8
Create the Jar file of your sample Java code in IntelliJ, upload it into the VM and
execute that code in Hadoop
i) To create the Jar file, first tell IntelliJ where the Jar file should be created.
This creates a META-INF folder inside the src folder (refer to the Project Explorer window):
Project Explorer > src folder > META-INF folder > MANIFEST.MF file
iii) If any further modifications are made to the code, rebuild the artifact so that the changes
are reflected in the .jar file as well.
Otherwise, create a new folder in the VM’s file system and then upload the .jar file into it.
Verify that the Hadoop daemons are running by using
$ jps
... Jps
... ResourceManager
... NameNode
... SecondaryNameNode
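If any of these daemons are missing, they can usually be started with the standard Hadoop start-up scripts (assuming Hadoop’s sbin directory is on the PATH in this VM):
$ start-dfs.sh      # starts NameNode, DataNode and SecondaryNameNode
$ start-yarn.sh     # starts ResourceManager and NodeManager
$ jps               # check the daemons again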
vii) To run the WordCount.jar file in the VM, upload the input text file into HDFS
$ vi test.txt
$ hadoop fs -put test.txt /test.txt
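With the input file in HDFS, the uploaded jar can then be executed and its result inspected. The jar name, driver class name and output path below are assumptions based on this exercise; adjust them to your actual setup.
$ hadoop jar WordCount.jar WordCount /test.txt /output
$ hadoop fs -ls /output                         # the _SUCCESS marker and part files appear here
$ hadoop fs -cat /output/part-r-00000           # word counts produced by the reducer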
viii) If the text file (the input file of the word count problem) exists in the host OS file
system (i.e., Windows) of some other system on the same network, you can transfer it into
the guest OS file system through either of the two ways provided in MobaXterm.
ix) Check the daemons that are created during the execution of the MapReduce code.
While the MapReduce WordCount program is running in the current terminal
window, open a new terminal window and give the command jps in that new
terminal. It will list the following daemons:
... Jps
... ResourceManager
... NameNode
... SecondaryNameNode
... NodeManager
... MRAppMaster
... YarnChild
... YarnChild
The MR Application Master (MRAppMaster) monitors and manages the currently running job in the
cluster. The YarnChild processes are the daemons of the Mapper and Reducer tasks.
xi) View the number of blocks required to handle the input file of the WordCount
MapReduce program.
Browser window > Double-click the test.txt file > Select Block >
Block 0
Details such as Slave 1 and Slave 2 will be shown if it is a multi-node cluster configuration.
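The same block information can also be obtained from the command line with the hdfs fsck utility:
$ hdfs fsck /test.txt -files -blocks -locations   # lists the blocks of test.txt and the nodes holding them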
In the browser window:
localhost:19888/jobhistory
It shows the log information of both the Mapper code and the Reducer code.
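As a command-line alternative to the job history web page, the logs of a completed job can be fetched with yarn logs, provided log aggregation is enabled in the cluster. The application id shown below is only a placeholder; the real id is printed on the terminal when the job is submitted.
$ yarn logs -applicationId application_1234567890123_0001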