
INSTALLING HADOOP 2.7.2 ON UBUNTU 20 (SINGLE-NODE CLUSTER)
1- Install Java 1.8.

aramadan@ubuntu: ~$ cd ~

# Update the source list

aramadan@ubuntu: ~$ sudo apt-get update

aramadan@ubuntu: ~$ sudo apt-get upgrade

aramadan@ubuntu: ~$ sudo apt-get install openjdk-8-jdk

# Verify Java Installation

aramadan@ubuntu: ~$ java -version


openjdk version "1.8.0_292"

OpenJDK Runtime Environment (build 1.8.0_292-8u292-b10-0ubuntu1~20.04-b10)

OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)
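
The JDK's install path will be needed later when setting JAVA_HOME. One way to find it (the path shown is typical for openjdk-8 on Ubuntu 20.04 and may differ on your system):

aramadan@ubuntu: ~$ readlink -f /usr/bin/java
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java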

2- Add a dedicated Hadoop user.

aramadan@ubuntu: ~$ sudo addgroup hadoop


Adding group `hadoop' (GID 1003) ...

Done.

aramadan@ubuntu: ~$ sudo adduser --ingroup hadoop hduser


Adding user `hduser' ...

Adding new user `hduser' (1002) with group `hadoop' ...

Creating home directory `/home/hduser' ...

Copying files from `/etc/skel' ...

New password: hduser

Retype new password: hduser

passwd: password updated successfully

Changing the user information for hduser

Enter the new value, or press ENTER for the default

Full Name []:

Room Number []:

Work Phone []:

Home Phone []:

Other []:
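
The transcript above shows the IDs assigned on this machine; a quick way to confirm the new account and its primary group (the numeric values will differ per system):

aramadan@ubuntu: ~$ id hduser
uid=1002(hduser) gid=1003(hadoop) groups=1003(hadoop)
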
3- Install SSH.

SSH has two main components:
• ssh: the client, i.e. the command we use to connect to remote machines.
• sshd: the daemon that runs on the server and accepts connections from clients.

aramadan@ubuntu: ~$ sudo apt-get install ssh

#Verify ssh installation
aramadan@ubuntu: ~$ which ssh
/usr/bin/ssh
aramadan@ubuntu: ~$ which sshd
/usr/sbin/sshd

4- Create and set up SSH certificates.

• Hadoop requires SSH access to manage its nodes, i.e. remote machines plus our local machine. For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost.

aramadan@ubuntu: ~$ su hduser

hduser@ubuntu: ~$ ssh-keygen -t rsa -P ""

hduser@ubuntu: ~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
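
Before moving on, it is worth confirming that passwordless SSH to localhost works (the first connection prompts you to accept the host key; type yes):

hduser@ubuntu: ~$ ssh localhost
hduser@ubuntu: ~$ exit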


5- Install Hadoop

hduser@ubuntu: ~$ wget "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz"

#The download may take some time to finish.

hduser@ubuntu: ~$ tar xvzf hadoop-2.7.2.tar.gz

hduser@ubuntu: ~$ cd hadoop-2.7.2 #Note: the extracted directory name is lowercase.

hduser@ubuntu:~/hadoop-2.7.2$ sudo mv * /usr/local/hadoop

[sudo] password for hduser:

hduser is not in the sudoers file. This incident will be reported.

hduser@ubuntu:~/hadoop-2.7.2$ su aramadan #Switch to your primary (sudo-capable) user.

aramadan@ubuntu:~$ sudo adduser hduser sudo #Add hduser to the sudo group.

aramadan@ubuntu:~$ sudo su hduser #Switch back to hduser; the new group membership now applies.

hduser@ubuntu:~/hadoop-2.7.2$ sudo mv * /usr/local/hadoop

mv: target '/usr/local/hadoop' is not a directory

hduser@ubuntu:~/hadoop-2.7.2$ sudo mkdir /usr/local/hadoop #Create Hadoop directory.

hduser@ubuntu:~/hadoop-2.7.2$ sudo mv * /usr/local/hadoop

#Verify that the files were moved.

hduser@ubuntu:~/hadoop-2.7.2$ ls /usr/local/hadoop/
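
If the move succeeded, the listing should show the standard layout of the Hadoop 2.7.2 tarball, roughly (abridged):

LICENSE.txt  NOTICE.txt  README.txt  bin  etc  include  lib  libexec  sbin  share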

hduser@ubuntu:~/hadoop-2.7.2$ sudo chown -R hduser:hadoop /usr/local/hadoop

6- Set up the configuration files:

1. ~/.bashrc
2. /usr/local/hadoop/etc/hadoop/hadoop-env.sh
3. /usr/local/hadoop/etc/hadoop/core-site.xml
4. /usr/local/hadoop/etc/hadoop/mapred-site.xml (created from mapred-site.xml.template)
5. /usr/local/hadoop/etc/hadoop/hdfs-site.xml

1. ~/.bashrc

hduser@ubuntu:~ $ update-alternatives --config java

There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java

Nothing to configure.

hduser@ubuntu:~ $ gedit ~/.bashrc #sudo is not needed; .bashrc belongs to hduser.

# Add the following to the end of the file (Java & Hadoop environment variables)

#HADOOP VARIABLES START

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

export HADOOP_INSTALL=/usr/local/hadoop

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin

export HADOOP_MAPRED_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_HOME=$HADOOP_INSTALL

export HADOOP_HDFS_HOME=$HADOOP_INSTALL

export YARN_HOME=$HADOOP_INSTALL

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"

#HADOOP VARIABLES END

hduser@ubuntu:~ $ source ~/.bashrc

hduser@ubuntu:~ $ javac -version

hduser@ubuntu:~ $ which javac

hduser@ubuntu:~ $ readlink -f /usr/bin/javac #Resolves the symlink chain; should print /usr/lib/jvm/java-8-openjdk-amd64/bin/javac, matching JAVA_HOME.
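
At this point the hadoop command should also be on the PATH. A quick sanity check (version banner abridged; the exact build details will differ):

hduser@ubuntu:~ $ hadoop version
Hadoop 2.7.2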

2. /usr/local/hadoop/etc/hadoop/hadoop-env.sh

hduser@ubuntu:~ $ sudo gedit /usr/local/hadoop/etc/hadoop/hadoop-env.sh

#Find the existing JAVA_HOME line and set it to: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64


3. /usr/local/hadoop/etc/hadoop/core-site.xml

hduser@ubuntu:~ $ sudo mkdir -p /app/hadoop/tmp

hduser@ubuntu:~ $ sudo chown hduser:hadoop /app/hadoop/tmp

hduser@ubuntu:~ $ sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml

#Replace Configuration tag with the following block

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation. The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class. The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>

4. /usr/local/hadoop/etc/hadoop/mapred-site.xml

hduser@ubuntu:~ $ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

hduser@ubuntu:~ $ sudo gedit /usr/local/hadoop/etc/hadoop/mapred-site.xml

#Replace Configuration tag with the following block

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>The host and port that the MapReduce job tracker runs
    at. If "local", then jobs are run in-process as a single map
    and reduce task.
    </description>
  </property>
</configuration>

5. /usr/local/hadoop/etc/hadoop/hdfs-site.xml

hduser@ubuntu:~ $ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode

hduser@ubuntu:~ $ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode

hduser@ubuntu:~ $ sudo chown -R hduser:hadoop /usr/local/hadoop_store

hduser@ubuntu:~ $ sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml

#Replace Configuration tag with the following block

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
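
Once all three XML files are in place, one way to confirm Hadoop is actually reading them is hdfs getconf (available in Hadoop 2.x; querying the legacy key name may also print a deprecation warning):

hduser@ubuntu:~ $ hdfs getconf -confKey fs.default.name
hdfs://localhost:54310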

7- Format the new Hadoop filesystem & run Hadoop


hduser@ubuntu:~ $ hadoop namenode -format #One-time format of HDFS ('hdfs namenode -format' is the non-deprecated form).

hduser@ubuntu:~ $ start-all.sh #Start all Hadoop services (deprecated; start-dfs.sh followed by start-yarn.sh is equivalent).

hduser@ubuntu:~ $ jps #Verify that the Hadoop daemons are running.

15888 Jps

15682 NodeManager

15218 DataNode

15415 SecondaryNameNode

15050 NameNode

15550 ResourceManager
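
If all five daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) appear alongside Jps, the single-node cluster is up. With the default Hadoop 2.x ports, the NameNode web UI should be reachable at http://localhost:50070 and the ResourceManager UI at http://localhost:8088. As a final smoke test, you can run one of the bundled example jobs; the jar path below is the standard location inside the 2.7.2 tarball:

hduser@ubuntu:~ $ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 2 5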
