HarshYadav 20CS3032 Assignment1

This document outlines the steps to install and configure Hadoop on Ubuntu, including installing Java, creating a Hadoop user, downloading and extracting Hadoop, configuring core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml files, formatting the namenode, starting HDFS and YARN services, and running a word count example job. The objectives are to learn how to install Hadoop, configure HDFS, and create and run Java mappers and reducers.

Uploaded by

Devanshu Kaushik

WORKING WITH HADOOP, HDFS AND CONFIGURATION

- Ashutosh Singh
20CS3013
TEAM MEMBERS

• Devanshu Kaushik
• Harsh Aditya
• Ashutosh Singh
• Animay Prakash
• Harsh Yadav
OBJECTIVES

• Installation of Hadoop in our system


• Learning Hadoop File System (HDFS)
• Creating Java Mapper and Reducer objects
• Launching Java Mappers and Reducers using Hadoop
INSTALLING JAVA ON UBUNTU

• Command – sudo apt install default-jdk default-jre -y
• Verification – java -version
CREATE HADOOP USER AND CONFIGURE SSH
• Create user - sudo adduser hadoop
• Add to sudo - sudo usermod -aG sudo hadoop
• Configure SSH : Generate keys, add public key to authorized_keys
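The SSH step can be sketched as follows. This is a minimal sketch, assuming it runs as the hadoop user; the RSA key type, the empty passphrase, and the SSH_DIR and KEY_FILE names are assumptions, since the slide does not show the exact commands.

```shell
# Sketch: passwordless SSH to localhost for the hadoop user.
# SSH_DIR, KEY_FILE and the RSA key type are assumptions, not taken from the slides.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"
KEY_FILE="$SSH_DIR/id_rsa"

mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Generate a key pair with an empty passphrase unless one already exists.
if [ ! -f "$KEY_FILE" ]; then
  ssh-keygen -q -t rsa -P '' -f "$KEY_FILE"
fi

# Authorize the public key once; skip it if the entry is already present.
touch "$SSH_DIR/authorized_keys"
if ! grep -qxF "$(cat "$KEY_FILE.pub")" "$SSH_DIR/authorized_keys"; then
  cat "$KEY_FILE.pub" >> "$SSH_DIR/authorized_keys"
fi
chmod 600 "$SSH_DIR/authorized_keys"
```

If this worked, ssh localhost should log in without asking for a password, which the Hadoop start scripts rely on.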
DOWNLOAD AND INSTALL APACHE HADOOP ON UBUNTU
• Download - wget https://downloads.apache.org/hadoop/common/stable/hadoop-3.3.4.tar.gz
• Extract and move - tar -xvzf hadoop-3.3.4.tar.gz, then sudo mv hadoop-3.3.4 /usr/local/hadoop
CONFIGURE HADOOP ON UBUNTU

• Edit .bashrc – Set Hadoop environment variables
• Edit hadoop-env.sh – Set Java environment variables
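The slides do not list the variables themselves, so the lines below are only a sketch of what such entries commonly look like. HADOOP_HOME follows the /usr/local/hadoop install path used earlier; the JAVA_HOME fallback path is an assumption about where Ubuntu's default-jdk places its symlink and should be checked on the actual machine.

```shell
# Sketch of lines for the hadoop user's ~/.bashrc.
# HADOOP_HOME matches the install path from the extract-and-move step.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"

# Sketch of the line for hadoop-env.sh.
# /usr/lib/jvm/default-java is an assumed location; an existing JAVA_HOME wins.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
```

After editing .bashrc, running source ~/.bashrc applies the variables to the current shell.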
CONFIGURING JAVA ENVIRONMENT VARIABLES
• Edit hadoop-env.sh - Set Java environment variables
• Download Activation File - sudo wget https://jcenter.bintray.com/javax/activation/javax.activation-api/1.2.0/javax.activation-api-1.2.0.jar
• Check Hadoop Version - hadoop version
CONFIGURE CORE-SITE.XML

• Edit core-site.xml - Set default file system URI.


• Create Directories - sudo mkdir -p /home/hadoop/hdfs/{namenode,datanode}
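A core-site.xml for a single-node setup might look like the sketch below. The property name fs.defaultFS is the standard Hadoop key for the default file system URI; the value hdfs://localhost:9000 and the CONF_DIR scratch directory are assumptions, not values taken from the slides.

```shell
# Sketch: write a minimal core-site.xml into a scratch directory.
# Set CONF_DIR=/usr/local/hadoop/etc/hadoop to write the real file instead.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Default file system URI; host and port are assumed single-node values. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
```

Note that this overwrites the target file; when editing an existing core-site.xml by hand, only the property element needs to go inside the existing configuration element.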
CONFIGURE HDFS-SITE.XML

• Edit hdfs-site.xml - Set replication and data directories
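As a sketch of the replication and data-directory settings: the property names below are standard HDFS keys, and the paths reuse the /home/hadoop/hdfs directories created earlier. The replication factor of 1 is an assumption that suits a single-node install, and CONF_DIR is a hypothetical scratch location.

```shell
# Sketch: write a minimal hdfs-site.xml into a scratch directory.
# Set CONF_DIR=/usr/local/hadoop/etc/hadoop to write the real file instead.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- One copy of each block; an assumed value for a single-node setup. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Where the NameNode keeps file system metadata. -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hdfs/namenode</value>
  </property>
  <!-- Where the DataNode stores block data. -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hdfs/datanode</value>
  </property>
</configuration>
EOF
```

Because the directories were created with sudo, they may belong to root; something like sudo chown -R hadoop:hadoop /home/hadoop/hdfs is probably needed before the daemons can write to them.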


CONFIGURE MAPRED-SITE.XML

• Edit mapred-site.xml - Set MapReduce framework name.
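The framework name is set through the standard mapreduce.framework.name key. The sketch below assumes the value yarn, which matches the YARN services started later in these slides; CONF_DIR is a hypothetical scratch location.

```shell
# Sketch: write a minimal mapred-site.xml into a scratch directory.
# Set CONF_DIR=/usr/local/hadoop/etc/hadoop to write the real file instead.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Run MapReduce jobs on YARN rather than in local mode. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```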


CONFIGURE YARN-SITE.XML

• Edit yarn-site.xml - Set YARN settings.
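The slide does not say which YARN settings were used. A common minimal choice, shown here as an assumption, is to enable the MapReduce shuffle service on the NodeManager through the standard yarn.nodemanager.aux-services key; CONF_DIR is a hypothetical scratch location.

```shell
# Sketch: write a minimal yarn-site.xml into a scratch directory.
# Set CONF_DIR=/usr/local/hadoop/etc/hadoop to write the real file instead.
CONF_DIR="${CONF_DIR:-./hadoop-conf-sketch}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Auxiliary service that MapReduce needs for its shuffle phase. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```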


VALIDATE HADOOP CONFIGURATION

• Format NameNode – hdfs namenode -format
• Verify Configuration – hadoop version
START THE HADOOP CLUSTER

• Start NameNode and DataNode – start-dfs.sh


• Start NodeManager and ResourceManager – start-yarn.sh
• Verify running services – jps
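The jps check can be automated with a small helper. The function below is a hypothetical sketch, not part of Hadoop: it reads jps-style output (process ID, then class name) from standard input and reports which of the four daemons named above are missing.

```shell
# Sketch: verify that the daemons named in the slides appear in jps output.
# check_daemons is a hypothetical helper; it reads "PID Name" lines from stdin.
check_daemons() {
  listing="$(cat)"
  status=0
  for name in NameNode DataNode ResourceManager NodeManager; do
    # Match the second column exactly so SecondaryNameNode does not count as NameNode.
    if ! printf '%s\n' "$listing" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "not running: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

On the cluster machine it would be used as jps | check_daemons, which returns a non-zero status if any of the four daemons is absent.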
ACCESS HADOOP WEB INTERFACE

• Access Interface – http://server-IP:9870


INPUT TEXT
WORD COUNT JAVA FILE
COMPILATION OF WORD COUNT JAVA FILE TO JAR FILE
JAR FILE TO MAPPER REDUCER FILE
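The screenshots for the compile and run steps are not available here, so the following is only a sketch of how those steps are commonly done from the command line. The function names, file names, class name, and HDFS paths are all hypothetical; the javac, jar, hadoop, and hdfs invocations use standard options.

```shell
# Sketch: compile a word count source file against the Hadoop classpath and package it.
# Arguments: <source .java file> <output jar>; both names are placeholders.
build_wordcount_jar() {
  src="$1"
  jar_file="$2"
  mkdir -p classes || return 1
  javac -classpath "$(hadoop classpath)" -d classes "$src" || return 1
  jar cf "$jar_file" -C classes . || return 1
}

# Sketch: upload the input text to HDFS, run the job, and print the job output.
# Arguments: <jar> <main class> <local input file> <HDFS input dir> <HDFS output dir>
run_wordcount() {
  jar_file="$1"; main_class="$2"; input_file="$3"; in_dir="$4"; out_dir="$5"
  hdfs dfs -mkdir -p "$in_dir" || return 1
  hdfs dfs -put -f "$input_file" "$in_dir" || return 1
  # The HDFS output directory must not exist yet, or the job refuses to start.
  hadoop jar "$jar_file" "$main_class" "$in_dir" "$out_dir" || return 1
  # The glob is quoted so that HDFS, not the local shell, expands it.
  hdfs dfs -cat "$out_dir/part-*"
}
```

A possible invocation, with assumed names, is build_wordcount_jar WordCount.java wc.jar followed by run_wordcount wc.jar WordCount input.txt /input /output.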
OUTPUT
OUTPUT
THANK YOU
