Hadoop Installation Guide
Welcome to your comprehensive guide for setting up Hadoop and embarking on your
journey into the world of Big Data! Below, I've included a step-by-step guide that will help
you install Hadoop on your system. Let's dive right in!
1 of 18 3/5/2025, 11:30 AM
Hadoop Installation Guide https://kongu.edu/support/hadoop/index.html
To start, you'll need to install the Java Development Kit (JDK) on your Ubuntu system. The
default Ubuntu repositories offer both Java 8 and Java 11, but it's recommended to use
Java 8 for compatibility with Hive. You can use the following command to install it:
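The install command itself is missing from this copy of the guide. A minimal sketch, assuming the standard Ubuntu package name 'openjdk-8-jdk', which installs under the '/usr/lib/jvm/java-8-openjdk-amd64' path used for JAVA_HOME later in this guide. The function name 'install_java8' is only illustrative:

```shell
# Assumption: openjdk-8-jdk is the package that provides Java 8 on Ubuntu.
install_java8() {
  sudo apt update                     # refresh the package index first
  sudo apt install -y openjdk-8-jdk   # -y answers the confirmation prompt
}
```

Call 'install_java8' from an account with sudo rights to run both commands.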
Once the Java Development Kit is successfully installed, you should check the version to
ensure it's working correctly:
java -version
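Since this guide recommends Java 8, it can help to check the major version explicitly. The helper below is a sketch, not part of the original guide; 'java_major' is a hypothetical name, and it assumes the usual 'version "..."' format on the first line of the 'java -version' output:

```shell
# Extract the major Java version from a line such as:
#   openjdk version "1.8.0_392"
java_major() {
  local ver
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.}; echo "${ver%%.*}" ;;   # legacy scheme: 1.8.0_392 -> 8
    *)   echo "${ver%%.*}" ;;                  # modern scheme: 11.0.21 -> 11
  esac
}
```

Because 'java -version' writes to standard error, a typical call would be: java_major "$(java -version 2>&1 | head -n 1)"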
Hadoop's start and stop scripts use SSH (Secure Shell) to log in to every node of the
cluster, including localhost on a single-node setup, and launch the daemons there.
Install the SSH server and client if they are not already present:
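The command for this step is missing from this copy of the guide. A minimal sketch, assuming the OpenSSH packages from the Ubuntu repositories ('openssh-server' and 'openssh-client'); the function name is illustrative:

```shell
# Assumption: SSH is provided by the stock OpenSSH packages.
install_ssh() {
  sudo apt install -y openssh-server openssh-client
}
```

Call 'install_ssh' from an account with sudo rights.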
You must create a user specifically for running Hadoop components. This user will also be
used to log in to Hadoop's web interface. Run the following command to create the user
and set a password:
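The command is missing from this copy of the guide. One way to do this on Ubuntu is 'adduser', which prompts for the password interactively. The sketch below assumes the user name 'hadoop' used in the rest of this guide and skips creation if the account already exists; the function name is illustrative:

```shell
# Create the dedicated user unless it is already present.
create_hadoop_user() {
  local user="${1:-hadoop}"
  if id "$user" >/dev/null 2>&1; then
    echo "user $user already exists"
  else
    sudo adduser "$user"   # prompts for a password and optional details
  fi
}
```

Call 'create_hadoop_user' with no arguments to create the 'hadoop' user.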
Switch to the newly created 'hadoop' user using the following command:
su - hadoop
Next, set up password-less SSH access for the 'hadoop' user to streamline
authentication. Generate an SSH key pair for this purpose and accept the defaults,
leaving the passphrase empty. This avoids the need to enter a password or passphrase
each time you access the Hadoop system:
ssh-keygen -t rsa
Copy the generated public key to the 'authorized_keys' file and set the proper permissions:
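The commands are missing from this copy of the guide. A minimal sketch, assuming the default key location '~/.ssh/id_rsa.pub' produced by the 'ssh-keygen' step and owner-only permissions (600) on the key file; the function name and the optional directory argument are illustrative:

```shell
# Append the user's own public key to authorized_keys and restrict access.
authorize_own_key() {
  local ssh_dir="${1:-$HOME/.ssh}"
  cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
  chmod 700 "$ssh_dir"                   # only the owner may enter the directory
  chmod 600 "$ssh_dir/authorized_keys"   # only the owner may read or write the file
}
```

Run 'authorize_own_key' as the 'hadoop' user. The SSH server usually rejects key files that other users can write to, which is why the permissions are tightened.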
Test the password-less login by connecting to localhost. On the first connection you will
be asked to confirm the host key before it is added to the known hosts; type 'yes' and
hit Enter:
ssh localhost
If you have logged out in the meantime, switch back to the 'hadoop' user:
su - hadoop
Then download the Hadoop 3.3.6 release archive:
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
Once the download is complete, extract the contents of the downloaded file using the 'tar'
command. Optionally, you can rename the extracted folder to 'hadoop' for easier
configuration:
mv hadoop-3.3.6 hadoop
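The 'tar' command mentioned above is missing from this copy of the guide. Taken together with the rename, the step can be sketched as follows, assuming the archive unpacks into a directory named after the archive (as the 'mv' line implies); the function name is illustrative:

```shell
# Unpack a Hadoop release archive in the current directory and rename the result.
extract_hadoop() {
  local tarball="$1" target="${2:-hadoop}"
  local name
  name=$(basename "$tarball" .tar.gz)   # hadoop-3.3.6.tar.gz -> hadoop-3.3.6
  tar -xzf "$tarball"                   # x: extract, z: gunzip, f: archive file
  mv "$name" "$target"
}
```

For example: extract_hadoop hadoop-3.3.6.tar.gz. Make sure no 'hadoop' directory exists yet, otherwise 'mv' places the extracted folder inside it.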
Next, set up environment variables for Java and Hadoop. Open the '~/.bashrc' file in
your preferred text editor. If you're using 'nano', you can paste with 'Ctrl+Shift+V';
to save and exit, press 'Ctrl+X', then 'Y', and hit 'Enter':
nano ~/.bashrc
Append the following lines to the end of the file:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Save and close the file, then load the new variables into the current session:
source ~/.bashrc
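A quick sanity check can confirm that the variables took effect. This is a sketch rather than part of the original guide; 'check_hadoop_env' is a hypothetical helper that only inspects the current shell:

```shell
# Report missing settings; the return status is 0 only when everything is in place.
check_hadoop_env() {
  local status=0
  [ -n "$JAVA_HOME" ]   || { echo "JAVA_HOME is not set"; status=1; }
  [ -n "$HADOOP_HOME" ] || { echo "HADOOP_HOME is not set"; status=1; }
  case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) ;;   # Hadoop commands are reachable
    *) echo "HADOOP_HOME/bin is not on PATH"; status=1 ;;
  esac
  return "$status"
}
```

If 'check_hadoop_env' prints nothing, commands such as 'hdfs' and 'start-all.sh' should resolve from any directory.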
Additionally, you should configure the 'JAVA_HOME' in the 'hadoop-env.sh' file. Edit this
file with a text editor:
nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh
Set 'JAVA_HOME' in the file to the same path used in '~/.bashrc':
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
Save and close the file.
Create the namenode and datanode directories within the 'hadoop' user's home directory
using the following commands:
cd hadoop/
mkdir -p ~/hadoopdata/hdfs/{namenode,datanode}
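Next, edit the 'core-site.xml' file:
nano $HADOOP_HOME/etc/hadoop/core-site.xml
Set the default file system URI as shown below. The example uses 'localhost'; replace it
with your system hostname if you want to use a different name: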
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
Save and close the file. Then, edit the 'hdfs-site.xml' file to set the replication factor
and the namenode and datanode directories created earlier:
nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>
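Once both files are saved, you can ask Hadoop which values it actually reads. The sketch below uses 'hdfs getconf -confKey', which reads the local configuration and does not need the cluster to be running; 'show_hdfs_config' is a hypothetical helper name, and it assumes the environment variables from '~/.bashrc' are loaded:

```shell
# Print the effective value of each property set in core-site.xml and hdfs-site.xml.
show_hdfs_config() {
  local key
  for key in fs.defaultFS dfs.replication dfs.namenode.name.dir dfs.datanode.data.dir; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}
```

If a value differs from what you typed, the edit most likely landed outside the <configuration> element or in a different file.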
Save and close the file. Then, edit the 'mapred-site.xml' file:
nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
<configuration>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop</value>
  </property>
</configuration>
Save and close the file. Then, edit the 'yarn-site.xml' file:
nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Save and close the file.
Before starting the Hadoop cluster for the first time, format the Namenode as the
'hadoop' user with the following command:
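The command is missing from this copy of the guide; the standard one is 'hdfs namenode -format'. Because formatting discards existing HDFS metadata, the sketch below adds a guard that is not in the original: it skips the step when the Namenode directory already contains 'current/VERSION'. The function names are illustrative, and the default path is the directory created earlier:

```shell
# A formatted Namenode directory contains current/VERSION.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Format only when the directory has not been formatted before.
format_namenode_once() {
  local dir="${1:-$HOME/hadoopdata/hdfs/namenode}"
  if namenode_is_formatted "$dir"; then
    echo "already formatted: $dir"
  else
    hdfs namenode -format
  fi
}
```

Run 'format_namenode_once' as the 'hadoop' user.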
Once the Namenode directory is successfully formatted with the HDFS file system, you
will see the message "Storage directory /home/hadoop/hadoopdata/hdfs/namenode has
been successfully formatted." Start the Hadoop cluster using:
start-all.sh
You can check the status of all Hadoop services using the command:
jps
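On a single-node setup started with 'start-all.sh', 'jps' should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. The helper below is a sketch that reads 'jps' output and prints whichever of these is absent; 'missing_daemons' is a hypothetical name:

```shell
# Read jps output on standard input and print each expected daemon that is not listed.
missing_daemons() {
  local names daemon
  names=$(awk '{print $2}')   # second column of jps output is the process name
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$names" | grep -qx "$daemon" || echo "$daemon"
  done
}
```

Use it as: jps | missing_daemons. No output means all five daemons are running; otherwise check the logs under '$HADOOP_HOME/logs' for the daemon that is named.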
To find your server's IP address, run:
ifconfig
To access the Namenode, open your web browser and visit http://your-server-ip:9870.
Replace 'your-server-ip' with your actual IP address. You should see the Namenode web
interface.
To access the Resource Manager, open your web browser and visit the URL
http://your-server-ip:8088. You should see the Resource Manager web interface.
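If no browser is at hand, both web interfaces can be probed from the terminal. This is a sketch, assuming 'curl' is installed; 'check_ui' is a hypothetical helper that prints the final HTTP status code after following redirects:

```shell
# Print the HTTP status code returned by a web interface.
check_ui() {
  local host="$1" port="$2"
  # -s: silent, -L: follow redirects, -o /dev/null: discard the page body
  curl -sL -o /dev/null -w '%{http_code}\n' "http://$host:$port/"
}
```

For example, 'check_ui localhost 9870' probes the Namenode and 'check_ui localhost 8088' the Resource Manager; a status of 200 indicates that the page is being served.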
The Hadoop cluster is installed and configured. Next, we will create some directories in
the HDFS filesystem to test Hadoop. Create directories in the HDFS filesystem using the
following command:
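The commands are missing from this copy of the guide, so the directory names are not known. A minimal sketch using 'hdfs dfs -mkdir' and 'hdfs dfs -ls'; the function name and the example names '/test1' and '/logs' are hypothetical:

```shell
# Create one or more HDFS directories, then list the HDFS root to confirm.
make_hdfs_dirs() {
  hdfs dfs -mkdir -p "$@"   # -p: no error if a directory already exists
  hdfs dfs -ls /
}
```

For example: make_hdfs_dirs /test1 /logs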
Also, put some files into the Hadoop file system. For example, put log files from the host
machine into the Hadoop file system:
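The command is missing from this copy of the guide. A minimal sketch using 'hdfs dfs -put', assuming the target HDFS directory already exists; the function name, the '/logs' directory, and the '/var/log/*.log' pattern are hypothetical choices:

```shell
# Upload local files to an existing HDFS directory, then list its contents.
upload_to_hdfs() {
  local dest="$1"; shift       # first argument: HDFS directory, the rest: local files
  hdfs dfs -put "$@" "$dest"
  hdfs dfs -ls "$dest"
}
```

For example: upload_to_hdfs /logs /var/log/*.log. Files that the 'hadoop' user cannot read are reported as errors by 'hdfs dfs -put'.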
You can also verify the above files and directories in the Hadoop web interface. Go to the
Namenode web interface and click Utilities => Browse the file system. You should see the
directories you created earlier.
To stop the Hadoop services, run the following command as the 'hadoop' user:
stop-all.sh
In summary, you've learned how to install Hadoop on Ubuntu. Now, you're ready to unlock
the potential of big data analytics. Happy exploring!