Hadoop

Key characteristics:
• Fault tolerance
• Easy to use
• Economic
• Scalability
• Open Source
Nodes
Java Installation
mv hadoop-3.1.2 hadoop
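The rename above fits into a short install sequence. This is one possible sketch; the Apache archive URL is an assumption (any mirror carrying the 3.1.2 release works), and it should be run on every node:

```shell
# Download and unpack the Hadoop 3.1.2 release
# (URL is an assumption; substitute your preferred Apache mirror)
wget https://archive.apache.org/dist/hadoop/common/hadoop-3.1.2/hadoop-3.1.2.tar.gz
tar -xzf hadoop-3.1.2.tar.gz

# Rename to a version-independent directory name
mv hadoop-3.1.2 hadoop
```

Renaming the directory keeps paths such as /hadoop/etc/hadoop stable across future version upgrades.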
Enable passwordless SSH access between all machines in the cluster to
facilitate communication and remote execution:
• Generate an SSH key pair on every node, starting with the machine designated as the master node:
  ssh-keygen -t rsa (press Enter 4 times; do this on all the nodes)
• Copy each node's id_rsa.pub (in the .ssh directory) into a new file named authorized_keys
• Test SSH connectivity by logging into each machine from the master node
  using SSH, without requiring a password.
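The steps above can be sketched as a script. The hostnames `slave1` and `slave2` are placeholders for the actual worker nodes, and the default key path `~/.ssh/id_rsa` is assumed:

```shell
# Run on EVERY node: generate an RSA key pair non-interactively
# (equivalent to pressing Enter at each prompt)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# On the master: collect each node's public key into authorized_keys
# (these first connections will still ask for a password)
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh slave1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh slave2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

# Distribute the combined authorized_keys file back to every node
scp ~/.ssh/authorized_keys slave1:~/.ssh/
scp ~/.ssh/authorized_keys slave2:~/.ssh/

# Verify passwordless login from the master to each node
ssh slave1 hostname
ssh slave2 hostname
```

If the final `ssh` commands print the remote hostname without prompting for a password, the cluster is ready for Hadoop's remote-execution scripts.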
Configure Hadoop files on both master and slave nodes
for clustering.
"etc/hadoop" (Hadoop configuration files are located
here)
Core Hadoop configuration files:
▪ hadoop-env.sh (set the JAVA_HOME path)
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre
/hadoop/etc/hadoop/yarn-site.xml
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>1536</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
/hadoop/etc/hadoop/mapred-site.xml
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>256</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>256</value>
</property>
Summary of the memory settings (values in MB):
yarn.nodemanager.resource.memory-mb     1536
yarn.scheduler.maximum-allocation-mb    1536
yarn.scheduler.minimum-allocation-mb    128
yarn.app.mapreduce.am.resource.mb       512
mapreduce.map.memory.mb                 256
mapreduce.reduce.memory.mb              256
bin/hdfs namenode -format
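Formatting is followed by starting the daemons. One possible sequence, assuming the commands are run from the Hadoop installation directory on the master node:

```shell
# Format HDFS once, on the master only
# (this erases any existing HDFS metadata)
bin/hdfs namenode -format

# Start the HDFS and YARN daemons across the cluster
sbin/start-dfs.sh
sbin/start-yarn.sh

# Verify: jps should list NameNode and ResourceManager on the master,
# and DataNode and NodeManager on the slave nodes
jps
```

The start scripts use the passwordless SSH configured earlier to launch daemons on the slave nodes remotely.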