
On Master Node: Install and Edit Bashrc on All Nodes for Java and Hadoop

1. The document describes the steps to configure a Hadoop cluster with one master node and multiple slave nodes.
2. It includes instructions for setting up passwordless SSH access between nodes, editing the Hadoop configuration files, formatting the HDFS namenode, and starting the Hadoop services.
3. Finally, it validates that the cluster is running properly by checking the jps processes and the YARN resource manager web UI.

Uploaded by

Anju Yadav


Create the same username (hadoop is used below) on all nodes (one master, the others slaves).

Note down the IP address and hostname of every node.

Edit the /etc/hosts file on all nodes (remove the 127.0.1.1 entry on the master and all slaves) and map every node's hostname to its IP address:

IP1 hostname1

IP2 hostname2

IP3 hostname3
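For example, /etc/hosts on every node could look like the sketch below; the addresses are hypothetical placeholders, and the hostnames are the ones used later in this guide:

```text
192.168.1.10 masternode
192.168.1.11 slave1node
192.168.1.12 slave2node
```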

On the master node:


ssh-keygen -t rsa -b 4096

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@master

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@slave1

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@slave2

Install Java and Hadoop, and edit ~/.bashrc on all nodes to set the JAVA and HADOOP environment variables.
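A sketch of the ~/.bashrc additions; the install locations are assumptions, so adjust JAVA_HOME and HADOOP_HOME to wherever you actually unpacked Java and Hadoop:

```shell
# Append to ~/.bashrc on every node.
# NOTE: both paths below are assumed example locations -- edit to match your setup.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```

After `source ~/.bashrc`, the jps and hadoop version checks below should work from any directory.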

Check JPS

Check hadoop version

Edit hadoop-env.sh and set the JAVA path on both master and slave nodes.
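In hadoop-env.sh (under $HADOOP_HOME/etc/hadoop/), that means a line like the one below; the path is an assumed example and must match the JAVA_HOME from your ~/.bashrc:

```shell
# hadoop-env.sh: Hadoop daemons do not reliably inherit JAVA_HOME from the
# login shell, so set it explicitly here.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
```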
On Master Node:
1. core-site.xml

<property>

<name>fs.default.name</name>

<value>hdfs://Master_hostname:54310</value>

</property>

<property>

<name>dfs.permissions</name>

<value>false</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/tmp</value>

<description>Temporary Directory.</description>

</property>

<property>

<name>fs.defaultFS</name>

<value>hdfs://Master_hostname:54310</value>

<description>Use HDFS as file storage engine</description>

</property>
2. hdfs-site.xml

<property>

<name>dfs.namenode.name.dir</name>

<value>/home/hadoop/data/nameNode</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>/home/hadoop/data/dataNode</value>

</property>

<property>

<name>dfs.replication</name>

<value>2</value>

</property>
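The dfs.namenode.name.dir and dfs.datanode.data.dir paths above should exist and be writable by the hadoop user before the namenode is formatted; a minimal sketch, assuming you are logged in as that user (so $HOME is /home/hadoop):

```shell
# Create the NameNode and DataNode storage directories from hdfs-site.xml.
mkdir -p "$HOME/data/nameNode" "$HOME/data/dataNode"
```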
3. mapred-site.xml

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>mapreduce.jobtracker.address</name>

<value>Master_hostname:54311</value>

</property>

4. yarn-site.xml

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>Master_hostname:8030</value>

</property>

<property>

<name>yarn.resourcemanager.address</name>

<value>Master_hostname:8032</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>Master_hostname:8088</value>

</property>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>Master_hostname:8031</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>Master_hostname:8033</value>

</property>
5. Create the masters file and add the master node's hostname to it.

6. Update the slaves file.

Replace localhost with the hostnames of:


masternode
slave1node
slave2node
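A sketch of steps 5 and 6; the files are written to the current directory here and belong with the other Hadoop configuration files (e.g. $HADOOP_HOME/etc/hadoop/, or conf/ on older releases):

```shell
# masters: the host that runs the SecondaryNameNode (the master node here).
echo "masternode" > masters
# slaves: one worker hostname per line; the master is included so it also
# runs a DataNode and NodeManager.
printf "masternode\nslave1node\nslave2node\n" > slaves
```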
On All Slave Nodes:
1. core-site.xml

<property>

<name>fs.default.name</name>

<value>hdfs://Master_hostname:54310</value>

</property>

<property>

<name>dfs.permissions</name>

<value>false</value>

</property>

<property>

<name>fs.defaultFS</name>

<value>hdfs://Master_hostname:54310</value>

<description>Use HDFS as file storage engine</description>

</property>

2. hdfs-site.xml

<property>

<name>dfs.namenode.name.dir</name>

<value>/home/hadoop/data/nameNode</value>

</property>

<property>
<name>dfs.datanode.data.dir</name>

<value>/home/hadoop/data/dataNode</value>

</property>

<property>

<name>dfs.replication</name>

<value>2</value>

</property>

3. yarn-site.xml

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>Master_hostname:8030</value>

</property>

<property>

<name>yarn.resourcemanager.address</name>

<value>Master_hostname:8032</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>Master_hostname:8088</value>

</property>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>Master_hostname:8031</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>Master_hostname:8033</value>

</property>
Running the Cluster

On Master Node only:


hdfs namenode -format

start-dfs.sh

start-yarn.sh

Run jps on the master. Expected processes (DataNode and NodeManager also appear because the master is listed in the slaves file):

NodeManager
Jps
SecondaryNameNode
NameNode
DataNode
ResourceManager

Run jps on each slave. Expected processes:

Jps
NodeManager
DataNode

Check the running cluster nodes in the YARN web UI: http://master_IP:8088/cluster/nodes
