Lab Programs on HDFS and MapReduce
Perform the following tasks by interacting with the Hadoop Distributed File System (HDFS).
Count the frequency of each character in a text file using the Hadoop MapReduce framework.
• Prepare Input Data: Create a text file with sample content (e.g., input.txt)
• Upload Input to HDFS: Upload the input file to HDFS
• Write Mapper Class: Implement a Mapper that reads characters and emits each character with
a count of 1
• Write Reducer Class: Implement a Reducer that sums up counts for each character
• Write Driver Class: Set up the job, define input/output paths, and specify Mapper/Reducer
classes
• Compile Java Code: Compile the Mapper, Reducer, and Driver classes into a JAR file
• Run the MapReduce Job: Execute the job on the input file using the hadoop command
• Download Output: Download the result from HDFS to the local system and verify
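Before writing the Hadoop classes, it can help to check the map and reduce logic from the steps above in plain Java. The sketch below is only an in-memory imitation with no Hadoop dependency: the class name CharCountSketch and the methods map, shuffle, reduce and countChars are hypothetical names chosen for illustration, not part of the Hadoop API. It assumes that every character, including spaces and punctuation, is counted and that each input line arrives without its line terminator; adjust this if the lab expects something different. It needs Java 9 or later because of Map.entry and List.of.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class: simulates the three stages of a character-count
// MapReduce job in memory. It does not use any Hadoop classes.
class CharCountSketch {

    // Map stage: emit one (character, 1) pair per character of an input line.
    static List<Map.Entry<Character, Integer>> map(String line) {
        List<Map.Entry<Character, Integer>> pairs = new ArrayList<>();
        for (char c : line.toCharArray()) {
            pairs.add(Map.entry(c, 1));
        }
        return pairs;
    }

    // Shuffle stage: group all emitted values by their key. In a real job the
    // framework performs this step between the Mapper and the Reducer.
    static Map<Character, List<Integer>> shuffle(List<Map.Entry<Character, Integer>> pairs) {
        Map<Character, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<Character, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce stage: sum the counts collected for a single character.
    static int reduce(char key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs map, shuffle and reduce over all input lines and returns the
    // per-character totals, sorted by character.
    static Map<Character, Integer> countChars(List<String> lines) {
        List<Map.Entry<Character, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<Character, Integer> totals = new TreeMap<>();
        for (Map.Entry<Character, List<Integer>> group : shuffle(emitted).entrySet()) {
            totals.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Sample lines standing in for the contents of input.txt.
        Map<Character, Integer> totals = countChars(List.of("hello", "world"));
        // Print tab-separated key/value pairs, one per line.
        totals.forEach((c, n) -> System.out.println(c + "\t" + n));
    }
}
```

In the actual job, the body of map would go into the Mapper class and the body of reduce into the Reducer class, while the grouping done here by shuffle is handled by the framework itself.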