Hadoop Installation
DATE:
AIM:
The Oracle JDK is the official JDK; however, it is no longer provided by Oracle as a default
installation for Ubuntu. You can still install it using apt-get.
Then, depending on the version you want to install, execute one of the following commands:
Follow these steps to set up Java on Windows and validate the install.
Download Java for Windows 10
Download the latest Java Development Kit installation file for Windows 10 to have the latest
features and bug fixes.
1. Using your preferred web browser, navigate to the Oracle Java Downloads page.
2. On the Downloads page, click the x64 Installer download link under
the Windows category. At the time of writing this article, Java version 17 is the latest
long-term support Java version.
After downloading the installation file, proceed with installing Java on your Windows system.
Follow the steps below:
After running the installation file, the installation wizard welcome screen appears.
1. Click Next on the welcome screen to proceed with the installation.
2. Choose the destination folder for the Java installation files or stick to the default path.
Click Next to proceed.
3. Wait for the wizard to finish the installation process until the Successfully Installed message
appears. Click Close to exit the wizard.
Step 1: Set Java Environment Variables
Set Java environment variables to enable program compiling from any directory. To do so,
follow the steps below:
1. Open the Start menu and search for environment variables.
2. Select the Edit the system environment variables result.
3. In the System Properties window, under the Advanced tab, click Environment Variables…
4. Under the System variables category, select the Path variable and click Edit:
5. Click the New button and enter the path to the Java bin directory:
Step 2: Add JAVA_HOME Variable
Some applications require the JAVA_HOME variable. Follow the steps below to create the
variable:
1. In the Environment Variables window, under the System variables category, click
the New… button to create a new variable.
2. Name the variable JAVA_HOME.
3. In the variable value field, paste the path to your JDK installation directory and click OK.
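The same two variables can also be set from an elevated Command Prompt with setx instead of the GUI. The JDK path below is the default for a version 17 install and is an assumption; substitute the folder you actually installed into:

```batch
:: Set JAVA_HOME and extend PATH from the command line (run as Administrator).
:: The install path is an example - adjust it to your actual JDK folder.
setx /M JAVA_HOME "C:\Program Files\Java\jdk-17"
setx /M PATH "%PATH%;C:\Program Files\Java\jdk-17\bin"
```

Note that setx changes take effect in new Command Prompt windows, not the one you ran it in.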
Run the java -version command in the command prompt to make sure Java is installed correctly:
RESULT:
DATE:
AIM:
PROCEDURE:
STEPS:
Create a separate user account for the Hadoop installation. This step is optional, but
recommended because it gives you the flexibility to keep the Hadoop installation separate
from other software installations.
common/hadoop-2.7.0/hadoop-2.7.0.tar.gz
mv hadoop-2.7.0 hadoop2
vim /home/hadoop_dev/hadoop2/etc/hadoop/hadoop-env.sh
• uncomment JAVA_HOME and update it with the following line:
(Please check for your relevant Java installation and set this value accordingly. Recent
versions of Hadoop require JDK 1.7 or later.)
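The exact line depends on where your JDK lives; assuming OpenJDK 7 on Ubuntu, it would look like the following (the path is an example, not prescriptive):

```shell
# Example JAVA_HOME line for hadoop-env.sh; the path below is an
# assumption for OpenJDK 7 on Ubuntu - point it at your own JDK.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
```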
8. Let us run a sample Hadoop program that is provided to you in the download
package:
$ cp etc/hadoop/*.xml input (copy over all the xml files to the input folder)
(grep/find all the files matching the pattern ‘dfs[a-z.]+’ and copy those files
to the output directory)
$ cat output/* (look for the output in the output directory that Hadoop creates
for you).
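The grep step described in the parentheses above is run through the examples jar shipped with the release; with Hadoop 2.7.0 the usual invocation is `bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'`. What that job computes can be sketched locally with plain grep (file and directory names below are illustrative only):

```shell
# Stand-in for the Hadoop grep example: extract every match of the
# pattern 'dfs[a-z.]+' from some xml input (local illustration only).
mkdir -p /tmp/grep_demo
printf '<name>dfs.replication</name>\n<name>fs.defaultFS</name>\n' > /tmp/grep_demo/sample.xml
grep -ohE 'dfs[a-z.]+' /tmp/grep_demo/sample.xml   # prints: dfs.replication
```

Only the first line matches, since "fs.defaultFS" contains no "dfs" followed by lowercase letters or dots.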
Edit the file etc/hadoop/core-site.xml and add the following:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Edit the file etc/hadoop/hdfs-site.xml and add the following:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
Note: This change sets the default replication count for blocks used by HDFS.
3. We need to set up password-less login so that the master will be able to do a password-
less ssh to start the daemons on all the slaves.
a. ssh localhost (enter your password; if you are able to log in, then the ssh server is
running)
We can run Hadoop jobs locally or on YARN in this mode.
b. If the login in step (a) prompts for a password, generate an ssh key pair and append
the public key to ~/.ssh/authorized_keys.
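One common way to set up the password-less login described above, assuming the default rsa key location and leaving any existing key untouched:

```shell
# Set up password-less ssh to localhost: create a key pair (only if one
# doesn't already exist) and authorize it for this user.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```

Afterwards, ssh localhost should log in without prompting for a password.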
5. Format the file system. When we format the namenode, it formats the meta-data related to
the data-nodes. By doing that, all the information on the datanodes is lost and they can be
reused for new data:
$ bin/hdfs namenode -format
6. Start the HDFS daemons:
$ sbin/start-dfs.sh
Check that the NameNode web UI comes up at http://localhost:50070.
If you are unable to see this, try to check the logs in the /home/hadoop_dev/hadoop2/logs
folder.
7. You can check whether the daemons are running or not by issuing the jps command.
10. Stop the daemons when you are done executing the jobs, with the below command:
sbin/stop-dfs.sh
Edit the file etc/hadoop/yarn-site.xml and add the following:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
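For MapReduce jobs to actually be submitted to YARN, the standard 2.x setup also sets the framework name in etc/hadoop/mapred-site.xml (created by copying mapred-site.xml.template); this file is not shown above but is part of the usual configuration:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```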
sbin/start-yarn.sh
This starts the daemons ResourceManager and NodeManager.
Once this command is run, you can check whether the ResourceManager is running by visiting
the following URL in a browser: http://0.0.0.0:8088. If you are unable to see it, check the
logs in the directory /home/hadoop_dev/hadoop2/logs.
5. To check whether the services are running, issue a jps command. The following shows all
the services necessary to run YARN on a single server:
$ jps
15933 Jps
15567 ResourceManager
15785 NodeManager
6. Let us run the same example as we ran before:
7. Stop the daemons when you are done executing the jobs, with the below command:
sbin/stop-yarn.sh
RESULT: