Installing Hadoop 2.6.5 on Ubuntu 16.04 and 18.04 (Single-Node Cluster)
Note:
1. JDK 8 is recommended.
2. Copy the Java path found in Step 4; it is required for Steps 15 and 18.
3. Make sure Steps 15 and 18 use the same JAVA_HOME path.
Step 1:
This updates the package lists, picking up upgrades for packages that need upgrading as well as new packages
that have just arrived in the repositories.
bhaskar@D:~$ sudo apt-get update
bhaskar@D:~$ java -version
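The notes above refer to the Java path found in Step 4. One common way to locate it for an apt-installed OpenJDK 8 (an assumption; your path may differ) is:
bhaskar@D:~$ readlink -f /usr/bin/java
This typically prints something like /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java; the JAVA_HOME value is the JVM directory at the start of that path, e.g. /usr/lib/jvm/java-8-openjdk-amd64.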
The next step is to create a dedicated user and group for our Hadoop installation. This insulates the
installation from the rest of the environment and allows tighter security measures to be enforced
(useful in a production environment). We will create a user hduser and a group hadoop, and add the
user to the group. This can be done using the following commands.
bhaskar@D:~$ sudo addgroup hadoop
Step 6:
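The command for this step is not shown in this copy. Based on the description above, a typical way to create the hduser account inside the hadoop group (an assumed command sequence, not taken from the original) is:
bhaskar@D:~$ sudo adduser --ingroup hadoop hduser
bhaskar@D:~$ sudo adduser hduser sudo      # optional: give hduser sudo rights for the later install steps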
Step 7: We can check whether the hadoop group and the hduser user were created:
bhaskar@D:~$groups hduser
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine.
For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost for
the hduser user we created earlier.
We have to generate an SSH key for the hduser user.
Hadoop uses SSH (to access its nodes), which would normally require the user to enter a password.
This requirement can be eliminated by creating and setting up SSH keys using the
following commands. If asked for a filename, just leave it blank and press the Enter key to continue.
bhaskar@D:~$ su hduser
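The key-generation command itself is not reproduced in this copy; the usual command (an assumption, consistent with the "leave the filename blank" instruction above) is:
hduser@D:/home/bhaskar$ ssh-keygen -t rsa -P ""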
The following command adds the newly created key to the list of authorized keys so that Hadoop
can use ssh without prompting for a password.
hduser@D:/home/bhaskar$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
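As an optional check (not part of the original steps), you can confirm that passwordless SSH works; the first connection will ask you to accept the host key:
hduser@D:/home/bhaskar$ ssh localhost
Type exit to leave the test session and return to the previous shell.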
Step 12: We will move the Hadoop installation to the /usr/local/hadoop directory, so that directory has to be
created first (see the sketch after the mv command below). Change to the folder where Hadoop 2.6.5 was downloaded, or download it:
hduser@D:/home/bhaskar$ cd /home/bhaskar/Downloads/hadoop-2.6.5/
(or)
hduser@D:/home/bhaskar$
wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.5/hadoop-2.6.5.tar.gz
(check for a proper mirror)
Move to the folder where your Hadoop download is available and execute the following:
sudo mv * /usr/local/hadoop/
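The directory-creation and ownership commands are not shown in this copy; a typical sequence around the mv above (assumed, consistent with the hduser/hadoop account created earlier) is:
tar xzf hadoop-2.6.5.tar.gz                        # unpack the downloaded archive first, if not already done
sudo mkdir -p /usr/local/hadoop                    # create the target directory before running the mv
sudo chown -R hduser:hadoop /usr/local/hadoop      # give hduser ownership of the installation after the mv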
Step 15:
Before editing the .bashrc file in hduser's home directory, we need to find the path where Java has
been installed, so that we can set the JAVA_HOME environment variable (see Step 4).
hduser@D:/home/bhaskar$ vim ~/.bashrc
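The lines to append to ~/.bashrc are not reproduced in this copy; a commonly used set of entries for this layout (an assumption, using the same OpenJDK 8 path as the hadoop-env.sh step below) is:
# Java and Hadoop environment for hduser (assumed paths)
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
After saving the file, run source ~/.bashrc so the new variables take effect in the current shell.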
Step 16:
Step 17:
hduser@D:/home/bhaskar$ vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Step 18:
Step 19:
hduser@D:/home/bhaskar$ vi /usr/local/hadoop/etc/hadoop/core-site.xml
Open the file and enter the following in between the <configuration></configuration> tag:
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
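The hadoop.tmp.dir directory above must exist and be writable by hduser; the commands that create it are not shown here, but a typical sequence (an assumption matching the value above) is:
sudo mkdir -p /app/hadoop/tmp                # create the base temporary directory
sudo chown hduser:hadoop /app/hadoop/tmp     # make it writable by the Hadoop user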
Step 20:
hduser@D:/home/bhaskar$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template
/usr/local/hadoop/etc/hadoop/mapred-site.xml
Step 21:
hduser@D:/home/bhaskar$ vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
Open the file and enter the following in between the <configuration></configuration> tag:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
Step 22:
Step 23:
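The file opened in this step is not named in this copy, but the dfs.* properties below belong in hdfs-site.xml, so the command is presumably:
hduser@D:/home/bhaskar$ vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml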
Open the file and enter the following in between the <configuration></configuration> tag:
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
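The namenode and datanode directories referenced above also have to be created and handed over to hduser before the namenode is formatted; a likely sequence (assumed, matching the paths in the configuration) is:
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store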
Step 24:
Important Note :
Note that the hadoop namenode -format command should be executed only once, before we start
using Hadoop for the first time.
If this command is executed again after Hadoop has been used, it will destroy all the data on
the Hadoop file system.
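The format command referred to above is run as hduser:
hduser@D:/home/bhaskar$ hadoop namenode -format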
hduser@D:/home/bhaskar$ start-all.sh
Step 26:
hduser@D:/home/bhaskar$ jps
To check which processes are running in our Hadoop cluster, we use the jps command. JPS stands
for Java Virtual Machine Process Status Tool.
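The expected output is not reproduced in this copy; on a healthy single-node Hadoop 2.6.x cluster started with start-all.sh, jps typically lists the following daemons, each preceded by its process ID:
NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps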
Note: Your Hadoop installation is successful only if all of the above daemons start.
hduser@D:/home/bhaskar$ stop-all.sh