HADOOP PPT
Installation Steps on Single-node Cluster
As Java is successfully installed, we will extract the Hadoop tar file.
After downloading the Hadoop tar file, we extract it.
After extracting, we get another file in .tar form.
So we have to extract the Hadoop file again, i.e. 2 times.
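For example, with the tar tool bundled with recent Windows 10/11 builds (the archive name here is an assumption based on the hadoop-3.2.4 paths shown later in the slides):
tar -xf hadoop-3.2.4.tar.gz    (first extraction, yields hadoop-3.2.4.tar)
tar -xf hadoop-3.2.4.tar       (second extraction, yields the hadoop-3.2.4 folder)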
Now we will configure the files and set the environment variables.
First we will set the configuration in Hadoop.
So we will go to: etc folder > hadoop folder; inside this we have multiple configuration files.
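The slides pick up partway through the XML edits; every single-node setup also edits core-site.xml, so a typical entry is sketched here as an assumption (hdfs://localhost:9000 is the conventional default, not taken from the slides):
core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>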
hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>C:\Users\aksha\Downloads\hadoop-3.2.4.tar\hadoop-3.2.4\data\namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>C:\Users\aksha\Downloads\hadoop-3.2.4.tar\hadoop-3.2.4\data\datanode</value>
</property>
dfs.replication is set to 1 because a single-node cluster has only one datanode to hold each block.
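The two dir properties point at data\namenode and data\datanode inside the extracted folder. If those folders do not exist yet, they can be created up front (paths copied from the values above):
mkdir C:\Users\aksha\Downloads\hadoop-3.2.4.tar\hadoop-3.2.4\data\namenode
mkdir C:\Users\aksha\Downloads\hadoop-3.2.4.tar\hadoop-3.2.4\data\datanode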
mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
hadoop-env.cmd:
set JAVA_HOME=C:\java\jdk1.8.0_351
Set the Java JDK location here; the path must match your local JDK install. (If the JDK sits under a path with spaces, such as Program Files, the 8.3 short form PROGRA~1 avoids start-up errors.)
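A quick sanity check in cmd that the JDK path is valid (a sketch):
echo %JAVA_HOME%
"%JAVA_HOME%\bin\java" -version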
Now we have configured all the files; next we will set the environment variables and add Hadoop to the Path variable:
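The slides do this through the Windows Environment Variables dialog. The equivalent commands for the current cmd session (the install path is assumed from the folder shown earlier in the slides) are:
set HADOOP_HOME=C:\Users\aksha\Downloads\hadoop-3.2.4.tar\hadoop-3.2.4
set PATH=%PATH%;%HADOOP_HOME%\bin;%HADOOP_HOME%\sbin
Variables set this way last only for the open session; permanent ones go through the System Properties dialog as in the slides.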
Now the last step of configuring Hadoop: we will download another bin folder and replace the older one with it. This replacement folder supplies the Windows native utilities (such as winutils.exe) that the stock Apache release does not ship.
Link: https://drive.google.com/file/d/1zuT8G3D2JFkbkdv6fMhnhBOj8YSsgJc-/view
Unzip the folder and replace all the files in the current bin folder with its contents.
Test the Hadoop
Now that Hadoop is successfully configured and installed, we have to check it.
For this we will open cmd and type the command: hdfs namenode -format
This will print the format log and show the namenode storage being set up.
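If the format succeeds, the log typically ends with a line like the following (paraphrased from Hadoop's namenode format output; the exact path will match dfs.namenode.name.dir):
INFO common.Storage: Storage directory C:\...\data\namenode has been successfully formatted.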
To launch Hadoop, open cmd, go to the sbin folder, and then type the command: "start-all.cmd"
This will launch all the daemons of Hadoop:
1. Namenode
2. Datanode
3. Resourcemanager
4. Nodemanager
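A quick way to confirm all four daemons are up (assuming the JDK's bin folder is on the Path) is the JDK's jps tool, which lists running Java processes:
jps
(the output should include NameNode, DataNode, ResourceManager, and NodeManager)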
Start-up scripts in Hadoop
Some scripts used to launch the Hadoop DFS and Hadoop Map/Reduce daemons are listed below (a usage sketch follows the list):
1. start-dfs.sh: starts the Hadoop DFS daemons, the namenode and datanode. Used before start-mapred.sh.
2. stop-dfs.sh: stops the Hadoop DFS daemons.
3. start-mapred.sh: starts the Hadoop Map/Reduce daemons, the jobtracker and tasktracker.
4. stop-mapred.sh: stops the Hadoop Map/Reduce daemons.
5. start-all.sh: starts all the Hadoop daemons: the namenode, datanode, resourcemanager, and nodemanager.
6. stop-all.sh: stops all the Hadoop daemons.
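On the Windows installation described above, the shipped .cmd equivalents live in the sbin folder. A typical two-step launch (a sketch: start-dfs.cmd and start-yarn.cmd are the Hadoop 3.x counterparts of the DFS and Map/Reduce start scripts, and the install path is assumed from the slides) would be:
cd C:\Users\aksha\Downloads\hadoop-3.2.4.tar\hadoop-3.2.4\sbin
start-dfs.cmd   (starts namenode and datanode)
start-yarn.cmd  (starts resourcemanager and nodemanager)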