0% found this document useful (0 votes)
173 views

CS8711-Cloud Computing Lab

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
173 views

CS8711-Cloud Computing Lab

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 55

GROUP INSITUTIONS

Komarapalayam - 637303

College Name : EXCEL ENGINEERING COLLEGE

Department : COMPUTER SCIENCE AND ENGINEERING

LABORATORY RECORD NOTEBOOK

Certified that this bonafide record of work done by

Selvan/Selvi G.SANJEEV Roll.No/Reg.No 18CS091/730918104091

of the VII semester B.E CSE branch during the academic year 2021-2022 in

the CS8711 Cloud Computing Laboratory.

Signature of Lab-In charge Signature of Head of the Department


Submitted for the practical examination

held on________________________________

Internal Examiner External Examiner


Engineering College
Department of Computer Science and Engineering
Sub. Code/ Name: CS8711 Cloud Computing Lab Sem./Year : VII / IV
Academic Year: 2021 - 22 Batch: 2018 – 22
INDEX

Experiment
Date Name of the Experiment Marks Signature
No.
Install Virtualbox/VMware Workstation
1 22/09/21 with different flavours of linux or
windows OS on top of windows7 or 8.
Install a C compiler in the virtual machine
2 06/10/21 created using virtual box and execute
Simple Programs
Install Google App Engine. Create hello
3 13/10/21 world app and other simple web
applications using python/java.
Use GAE launcher to launch the web
4
20/10/21 applications.

Simulate a cloud scenario using CloudSim


5 27/10/21 and run a scheduling algorithm that is not
present in CloudSim.
Find a procedure to transfer the files from
6 27/10/21 one virtual machine to another virtual
machine
Find a procedure to launch virtual machine
7 24/11/21 using trystack (Online Openstack Demo
Version)
Install Hadoop single node cluster and run
8
24/11/21 simple applications like wordcount.

Internal Marks Staff Incharge


CS8711 CLOUD COMPUTING LABORATORY LTPC 0032

OBJECTIVES:

The student should be made to:


 Be exposed to tool kits for grid and cloud environment.
 Be familiar with developing web services/Applications in grid framework
 Learn to run virtual machines of different configuration.
 Learn to use Hadoop

LIST OF EXPERIMENTS:

Course Objective:
 To develop web applications in cloud
 To learn the design and development process involved in creating a cloud based application
 To learn to implement and use parallel programming using Hadoop
Exercises:
1. Install Virtualbox/VMware Workstation with different flavours of linux or windows OS on top
of windows7 or 8.
2. Install a C compiler in the virtual machine created using virtual box and execute Simple
Programs
3. Install Google App Engine. Create hello world app and other simple web applications using
python/java.
4. Use GAE launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in
CloudSim.
6. Find a procedure to transfer the files from one virtual machine to another virtual machine.
7. Find a procedure to launch virtual machine using trystack (Online Openstack Demo Version)
8. Install Hadoop single node cluster and run simple applications like wordcount.
Course Outcome:

On completion of this course, the students will be able to:

 Configure various virtualization tools such as Virtual Box, VMware workstation.


 Design and deploy a web application in a PaaS environment.
 Learn how to simulate a cloud environment to implement new schedulers.
 Install and use a generic cloud environment that can be used as a private cloud.
 Manipulate large data sets in a parallel environment.

TOTAL: 45 PERIODS
CLOUD COMPUTING

EX.No:1 Install Virtualbox/VMware Workstation

DATE: 22/09/21

Aim:
Find procedure to Install Virtualbox/VMware Workstation with different flavours of linux or
windows OS on top of windows7 or 8.

This experiment is to be performed through portal.

PROCEDURE TO INSTALL

Step 1- Download Link


Link for downloading the software is https://www.vmware.com/products/workstation-pro/workstation-
pro-evaluation.html. Download the software for windows. Good thing is that there is no signup process.
Click and download begins. Software is around 541 MB.

Step 2- Download the installer file


It should probably be in the download folder by default, if you have not changed the settings in your
browser. File name should be something like VMware-workstation-full-15.5.1-15018445.exe. This file
name can change depending on the version of the software currently available for download. But for
now, till the next version is available, they will all be VMware Workstation 15 Pro.

Step 3- Locate the downloaded installer file


For demonstration purpose, I have placed the downloaded installer on my desktop. Find the installer on
your system and double click to launch the application.
Step 4- User Access Control (UAC) Warning
Now you should see User Access Control (UAC) dialog box. Click yes to continue.

VMware Workstation 12 Pro installer windows 10 UAC screenshot


Initial Splash screen will appear. Wait for the process to complete.
Step 5- VMware Workstation Setup wizard
Now you will see VMware Workstation setup wizard dialog box. Click next to continue.

Step 6- End User Licence Agreement


This time you should see End User Licence Agreement dialog box. Check “I accept the terms in the
Licence Agreement” box and press next to contin
Step 7- Custom Setup options
Select the folder in which you would like to install the application. There is no harm in leaving the
defaults as it is. Also select Enhanced Keyboard Driver check box.

Step 8- User Experience Settings


Next you are asked to select “Check for Updates” and “Help improve VMware Workstation Pro”. Do as
you wish. I normally leave it to defaults that is unchecked.
Step 9- Application Shortcuts preference
Next step is to select the place you want the shortcut icons to be placed on your system to launch the
application. Please select both the options, desktop and start menu and click next.

Step 10- Installation begins


Now you see the begin installation dialog box. Click install to start the installation process.
Below screenshot shows Installation in progress. Wait for this to complete.

Screenshot for VMware Workstation 15 pro installation process.

At the end you will see installation complete dialog box. Click finish and you are done with
the
installation process. You may be asked to restart your computer. Click on Yes to restart.
Step 11- Launch VMware Workstation
After the installation completes, you should see VMware Workstation icon on the desktop. Double
clickon it to launch the application.

Step 12- Licence Key


If you see the dialog box asking for licence key, click on trial or enter the licence key. Then what you
have is the VMware Workstation 15 Pro running on your windows 10 desktop. If don’t have the
licencekey, you will have 30 days trial

VMware Workstation 15 Pro home screen

Step 13- At some point if you decide to buy


At some point of time if you decide to buy the Licence key, you can enter the Licence key by
going to Help->Enter a Licence Key You can enter the 25 character licence key in the dialog box
shown below and click OK. Now you have the licence version of the software.
EX NO:2 Setting up C Programming Environment
DATE: 06/10/21

Before you start..

o Please note that this is “NOT” a required part of the course and is not a
homework.

o This manual is written for someone who has never programmed / never
used Linux before.
o If you have background knowledge, you can ignore this.

o If you have any questions, please email Yoonji Shin


[email protected]

o Don’t be scared, programming is fun! Enjoy :)


Install Virtual Box

1. Visit http://www.virtualbox.org/wiki/downloads
2. Download VirtualBox platform packages for your OS
3. Open the Installation Package by double clicking

MAC PC
Install VirtualBox

4.Click continue and finish installing VirtualBox

MAC

PC

5.When finished installation, close the window


Download Linux

1. Visit the page


http://www.ubuntu.com/download/ubuntu/download
2. Choose the Latest version of Ubuntu and 32-bitand click
“Start Download”

CLI
CK
Install Linux using Virtual Box

1. Run VirtualBox by double-clicking the icon

2. Click “New” button on the top left corner

MAC PC
Install Linux usingVirtual Box

3.Click “Continue” on the pop-up window


4.Type VM name, select “Linux” for the OS and choose
“Ubuntu” for the version

MAC PC
Install Linux using Virtual Box

5.Choose the amount of memory to allocate (I suggest


choosing between 5I2 MB to I024 MB)

6.Click Continue or Next

MAC PC
Install Linux using Virtual Box

7. Choose create a new virtual hard disk

8. Click Continue or Next

MAC PC
Install Linux using Virtual Box

9. Choose VDI (VirtualBox Disk Image)


10. Click Continue or Next

MAC PC
Install Linux using Virtual Box

11. Choose “Dynamically Allocated” click continue.


This way, the size of your Virtual Hard Disk will growas
you use.

MAC PC
Install Linux using Virtual Box

I2. Click the folder icon and choose the ubuntu isofile you
downloaded.
I3. Select the size of the Virtual Disk (I recommend
choosing 8 GB) and click continue

MAC PC
Install Linux using Virtual Box

I4. Click Create

MAC PC
Running Linux

`1. Choose Ubuntu from left column and click Start

MAC & PC
Running Linux

1. Click continue on pop-up window

MAC PC
Running Linux

2. Click the folder icon and choose the ubuntu isofile you
downloaded and click continue and start

MAC PC
Running Linux

3. Click Install Ubuntu

4.Check “Download updates” and click Forward


Running Linux

4. Choose “Erase disk and install Ubuntu” and click Forward (Don’t
worry, it won’t wipe your computer)

5. Click “Install Now” and wait. Maybe grab a snack.

6. .When finished, click Restart and press Enter.


C Programming on Linux

1.Open Terminal (Applications-Accessories-Terminal)

2.Open gedit by typing “gedit &” on terminal


(You can also use any other Text Editor application)
C Programming on Linux

1.Type the following on gedit(or any


other text editor)
#include<stdio.h>
Main()
{
Printf(“hello world\n”);
}

2.Save this file as “helloworld.c”

3.Type “ls” on Terminal to see all files under current folder

4.Confirm that “helloworld.c” is in the current directory.If not, type cd


DIRECTORY_PATH to go to the directory that has “helloworld.c”

5.Type “gcc helloworld.c” to compile, and type “ls” to confirm that a new
executable file “a.out” is created
C Programming on Linux

8.Type “./a.out” on Terminal to run the program


9. if you see “Hello World” on the next line,you just successfully ran
your first C program!
10.Try other codes from “A Shotgun Introduction to C” on professor
Edwards’s webpage.You can also find many C programing guides online. (just
google it!) Enjoy ;) Enjoy :)
EX NO:3 Install Google App Engine. Create hello
DATE: 13/10/21 orld app and other simple web applications using python/java.

This document describes the installation of the Google App Engine Software
Development Kit (SDK) on a Microsoft Windows and running a simple “hello world”
application.

The App Engine SDK allows you to run Google App Engine Applications on your local
computer. It simulates the run-­­time environment of the Google App Engine
infrastructure.

Pre--Requisites: Python 2.5.4

If you don't already have Python 2.5.4 installed in your computer, download and
Install Python 2.5.4 from:

http://www.python.org/download/releases/2.5.4/

Download and Install

You can download the Google App Engine SDK by going to:

http://code.google.com/appengine/downloads.html

and download the appropriate install package.

Download the Windows installer – the simplest thing is to download it to your


Desktop or another folder that you remember.

1
Double Click on the GoogleApplicationEngine installer.

Click through the installation wizard, and it should install the App Engine. If you do
not have Python 2.5, it will install Python 2.5 as well.

Once the install is complete you can discard the downloaded installer

2
Making your First Application

Now you need to create a simple application. We could use the “+” option to have
the launcher make us an application – but instead we will do it by hand to get a
better sense of what is going on.

Make a folder for your Google App Engine applications. I am going to make the
Folder on my Desktop called “apps” – the path to this folder is:

C:\Documents and Settings\csev\Desktop\apps

And then make a sub--­folder in within apps called “ae--01--trivial” – the path to this
folder would be:

C:\ Documents and Settings \csev\Desktop\apps\ae--01--trivial

Using a text editor such as JEdit (www.jedit.org), create a file called app.yaml in the
ae--01--trivial folder with the following contents:

application: ae-01-trivial
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
script: index.py

Note: Please do not copy and paste these lines into your text editor – you might end
up with strange characters – simply type them into your editor.

Then create a file in the ae--01--trivial folder called index.py with three lines in it:

print 'Content-Type: text/plain'


print ' '
print 'Hello there Chuck'
EX NO:4 Use GAE launcher to launch the Web applications
DATE: 20/10/21

Then start the GoogleAppEngineLauncher program that can be found under


Applications. Use the File --> Add Existing Application command and navigate
into the apps directory and select the ae--01--trivial folder. Once you have added
the application, select it so that you can control the application using the launcher.

Once you have selected your application and press Run. After a few moments your
application will start and the launcher will show a little green icon next to your
application. Then press Browse to open a browser pointing at your application
which is running at http://localhost:8080/

Paste http://localhost:8080 into your browser and you should see your
application as follows:

Just for fun, edit the index.py to change the name “Chuck” to your own name and
press Refresh in the browser to verify your updates.
Watching the Log

You can watch the internal log of the actions that the web server is performing when
you are interacting with your application in the browser. Select your application in
the Launcher and press the Logs button to bring up a log window:

Each time you press Refresh in your browser – you can see it retrieving the output
with a GET request.

Dealing With Errors

With two files to edit, there are two general categories of errors that you may
encounter. If you make a mistake on the app.yaml file, the App Engine will not start
and your launcher will show a yellow icon near your application:

To get more detail on what is going wrong, take a look at the log for the application:
In this instance – the mistake is mis-­ indenting the last line in the app.yaml (line 8).

If you make a syntax error in the index.py file, a Python trace back error will appear
in your browser.

The error you need to see is likely to be the last few lines of the output – in this case
I made a Python syntax error on line one of our one--­line application.

Reference: http://en.wikipedia.org/wiki/Stack_trace

When you make a mistake in the app.yaml file – you must the fix the mistake and
attempt to start the application again.

If you make a mistake in a file like index.py, you can simply fix the file and press
refresh in your browser – there is no need to restart the server.

Shutting Down the Server

To shut down the server, use the Launcher, select your application and press the
Stop button.

This materials is Copyright All Rights Reserved – Charles Severance

Comments and questions to [email protected] www.dr-­­chuck.com


EX NO:5 Simulate a cloud scenario using CloudSim and run a
DATE: 27/10/21 scheduling algorithm that is not present in CloudSim.

How to use CloudSim in Eclipse


CloudSim is written in Java. The knowledge you need to use CloudSim is basic Java
programming and some basics about cloud computing. Knowledge of programming IDEs such
as Eclipse or NetBeans is also helpful. It is a library and, hence, CloudSim does not have to be
installed. Normally, you can unpack the downloaded package in any directory, add it to the Java
classpath and it is ready to be used. Please verify whether Java is available on your system.
To use CloudSim in Eclipse:
1. Download CloudSim installable files
from https://code.google.com/p/cloudsim/downloads/list and unzip
2. Open Eclipse
3. Create a new Java Project: File -> New
4. Import an unpacked CloudSim project into the new Java Project
5. The first step is to initialise the CloudSim package by initialising the CloudSim library, as
follows:

CloudSim.init(num_user, calendar, trace_flag)


6. Data centres are the resource providers in CloudSim; hence, creation of data centres is a
second step. To create Datacenter, you need the DatacenterCharacteristics object that stores the
properties of a data centre such as architecture, OS, list of machines, allocation policy that covers
the time or spaceshared, the time zone and its price:

Datacenter datacenter9883 = new Datacenter(name, characteristics, new VmAllocationPolicySimple(hostList), s


7. The third step is to create a broker:

DatacenterBroker broker = createBroker();


8. The fourth step is to create one virtual machine unique ID of the VM, userId ID of the VM’s
owner, mips, number Of Pes amount of CPUs, amount of RAM, amount of bandwidth, amount
of storage, virtual machine monitor, and cloudletScheduler policy for cloudlets:

Vm vm = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm, new
CloudletSchedulerTimeShared())
9. Submit the VM list to the broker:

broker.submitVmList(vmlist)
10. Create a cloudlet with length, file size, output size, and utilisation model:

Cloudlet cloudlet = new Cloudlet(id, length, pesNumber, fileSize, outputSize, utilizationModel, utilizationMode
11. Submit the cloudlet list to the broker:

broker.submitCloudletList(cloudletList)
12. Start the simulation:
CloudSim.startSimulation()
Sample Output from the Existing Example:
Starting CloudSimExample1...
Initialising...
Starting CloudSim version 3.0
Datacenter_0 is starting...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>null
Broker is starting...
Entities started.
0.0 : Broker: Cloud Resource List received with 1 resource(s)
0.0: Broker: Trying to Create VM #0 in Datacenter_0
0.1 : Broker: VM #0 has been created in Datacenter #2, Host #0
0.1: Broker: Sending cloudlet 0 to VM #0
400.1: Broker: Cloudlet 0 received
400.1 : Broker: All Cloudlets executed. Finishing...
400.1: Broker: Destroying VM #0
Broker is shutting down...
Simulation: No more future events
CloudInformationService: Notify all CloudSim entities for shutting down.
Datacenter_0 is shutting down...
Broker is shutting down...
Simulation completed.
Simulation completed.
========== OUTPUT ==========
Cloudlet ID STATUS Data center ID VM ID Time Start Time Finish Time
0 SUCCESS 2 0 400 0.1 400.1
*****Datacenter: Datacenter_0*****
User id Debt
3 35.6

CloudSimExample1 finished!
EX NO:6 Find a procedure to transfer the files from
DATE: 27/10/21 one virtual machine to another virtual machine
1. You can copy few (or more) lines with copy & paste mechanism.
For this you need to share clipboard between host OS and guest OS, installing Guest Addition on
both the virtual machines (probably setting bidirectional and restarting them). You copy from guest
OS in the clipboard that is shared with the host OS.Then you paste from the host OS to the second
guest OS.
2. You can enable drag and drop too with the same method (Click on the machine, settings,
general, advanced, drag and drop: set to bidirectional )
3. You can have common Shared Folders on both virtual machines and use one of the
directory shared as buffer to copy.Installing Guest Additions you have the possibility to set
Shared Folders too. As you put afile in a shared folder from host OS or from guest OS, is
immediately visible to the other. (Keep in mind that can arise some problems for date/time of
the files when there are different clock settings on the different virtual machines).If you use
the same folder shared on more machines you can exchange files directly copying them in this
folder
4.You can use usual method to copy files between 2 different computer with client-server
application. (e.g. scp with sshd active for linux, winscp... you can get some info about SSH servers
e.g. here)You need an active server (sshd) on the receiving machine and a client on the sending
machine. Of course you need to have the authorization setted (via password or, better, via an
automatic authentication method).
Note: many Linux/Ubuntu distribution install sshd by default: you can see if it is running with
pgrep sshd from a shell. You can install with sudo apt-get install openssh-server.
5. You can mount part of the file system of a virtual machine via NFS or SSHFS on theother,
or you can share file and directory with Samba.You may find interesting the article Sharing
files between guest and host without VirtualBox shared folders with detailed step by step
instructions.
You should remember that you are dialling with a little network of machines with different
operative systems, and in particular:
 Each virtual machine has its own operative system running on and acts as a physical
machine.
 Each virtual machine is an instance of a program owned by an user in the hosting operative
system and should undergo the restrictions of the user in the hosting OS.
E.g Let we say that Hastur and Meow are users of the hosting machine, but they did not
allow each other to see their directories (no read/write/execute authorization). When each of
them run a virtual machine, for the hosting OS those virtual machine are two normal
programs owned by Hastur and Meow and cannot see the private directory of the other user.
This is a restriction due to the hosting OS. It's easy to overcame it: it's enough to give
authorization to read/write/execute to a directory or to chose a different directory in which
both users can read/write/execute.
 Windows likes mouse and Linux fingers. :-)
I mean I suggest you to enable Drag & drop to be cosy with the Windows machines and
the Shared folders or to be cosy with Linux.
When you will need to be fast with Linux you will feel the need of ssh-keygen and to Generate once
SSH Keys to copy files on/from a remote machine without writing password anymore. In this way it
functions
bash auto-completion remotely too!
EX NO:7 Find a procedure to launch virtual machine
DATE: 24/11/21 using trystack (Online Openstack Demo Version)

OpenStack is an open-source software cloud computing platform. OpenStack is primarily used


for deploying an infrastructure as a service (IaaS) solution like Amazon Web Service (AWS). In
other words, you can make your own AWS by using OpenStack. If you want to try out
OpenStack, TryStack is the easiest and free way to do it.
In order to try OpenStack in TryStack, you must register yourself by joining TryStack Facebook
Group. The acceptance of group needs a couple days because it’s approved manually. After you
have been accepted in the TryStack Group, you can log in TryStack.

TryStack.org Homepage
I assume that you already join to the Facebook Group and login to the dashboard. After you log
in to the TryStack, you will see the Compute Dashboard like:
OpenStack Compute Dashboard

Overview: What we will do?

In this post, I will show you how to run an OpenStack instance. The instance will be accessible
through the internet (have a public IP address). The final topology will like:

Network topology
As you see from the image above, the instance will be connected to a local network and the local
network will be connected to internet.
Step 1: Create Network

Network? Yes, the network in here is our own local network. So, your instances will be not
mixed up with the others. You can imagine this as your own LAN (Local Area Network) in the
cloud.

1. Go to Network > Networks and then click Create Network.


2. In Network tab, fill Network Name for example internal and then click Next.
3. In Subnet tab,
1. Fill Network Address with appropriate CIDR, for example 192.168.1.0/24. Use private
network CIDR block as the best practice.
2. Select IP Version with appropriate IP version, in this case IPv4.
3. Click Next.
4. In Subnet Details tab, fill DNS Name Servers with 8.8.8.8 (Google DNS) and then
click Create.

Step 2: Create Instance


Now, we will create an instance. The instance is a virtual machine in the cloud, like AWS EC2. You
need the instance to connect to the network that we just created in the previous step.

1. Go to Compute > Instances and then click Launch Instance.


2. In Details tab,
1. Fill Instance Name, for example Ubuntu 1.
2. Select Flavor, for example m1.medium.
3. Fill Instance Count with 1.
4. Select Instance Boot Source with Boot from Image.
5. Select Image Name with Ubuntu 14.04 amd64 (243.7 MB) if you want install Ubuntu
14.04 in your virtual machine.
3. In Access & Security tab,
1. Click [+] button of Key Pair to import key pair. This key pair is a public and private key that
we will use to connect to the instance from our machine.
2. In Import Key Pair dialog,
1. Fill Key Pair Name with your machine name (for example Edward-Key).
2. Fill Public Key with your SSH public key (usually is in ~/.ssh/id_rsa.pub). See
description in Import Key Pair dialog box for more information. If you are using
Windows, you can use Puttygen to generate key pair.
3. Click Import key pair.
3. In Security Groups, mark/check default.
4. In Networking tab,
1. In Selected Networks, select network that have been created in Step 1, for example internal.
5. Click Launch.
6. If you want to create multiple instances, you can repeat step 1-5. I created one more instance
with instance name Ubuntu 2.
Step 3: Create Router

I guess you already know what router is. In the step 1, we created our network, but it is isolated.
It doesn’t connect to the internet. To make our network has an internet connection, we need a
router that running as the gateway to the internet.

1. Go to Network > Routers and then click Create Router.


2. Fill Router Name for example router1 and then click Create router.
3. Click on your router name link, for example router1, Router Details page.
4. Click Set Gateway button in upper right:
1. Select External networks with external.
2. Then OK.
5. Click Add Interface button.
1. Select Subnet with the network that you have been created in Step 1.
2. Click Add interface.
6. Go to Network > Network Topology. You will see the network topology. In the example, there
are two network, i.e. external and internal, those are bridged by a router. There are
instances those are joined to internal network.

Step 4: Configure Floating IP Address

Floating IP address is public IP address. It makes your instance is accessible from the internet.
When you launch your instance, the instance will have a private network IP, but no public IP. In
OpenStack, the public IPs is collected in a pool and managed by admin (in our case is TryStack).
You need to request a public (floating) IP address to be assigned to your instance.
1. Go to Compute > Instance.
2. In one of your instances, click More > Associate Floating IP.
3. In IP Address, click Plus [+].
4. Select Pool to external and then click Allocate IP.
5. Click Associate.
6. Now you will get a public IP, e.g. 8.21.28.120, for your instance.

Step 5: Configure Access & Security

OpenStack has a feature like a firewall. It can whitelist/blacklist your in/out connection. It is
called Security Group.
1. Go to Compute > Access & Security and then open Security Groups tab.
2. In default row, click Manage Rules.
3. Click Add Rule, choose ALL ICMP rule to enable ping into your instance, and then click Add.
4. Click Add Rule, choose HTTP rule to open HTTP port (port 80), and then click Add.
5. Click Add Rule, choose SSH rule to open SSH port (port 22), and then click Add.
6. You can open other ports by creating new rules.

Step 6: SSH to Your Instance

Now, you can SSH your instances to the floating IP address that you got in the step 4. If you are
using Ubuntu image, the SSH user will be ubuntu.
EX NO:8 Install Hadoop single node cluster and run
DATE: 24/11/21 simple applications like wordcount.

Install Hadoop
Step 1: Click here to download the Java 8 Package. Save this file in your
home directory.

Step 2: Extract the Java Tar File.

Command: tar -xvf jdk-8u101-linux-i586.tar.gz

Fig: Hadoop Installation – Extracting Java Files

Step 3: Download the Hadoop 2.7.3 Package.

Command: wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-


2.7.3.tar.gz

Fig: Hadoop Installation – Downloading Hadoop

Step 4: Extract the Hadoop tar File.

Command: tar -xvf hadoop-2.7.3.tar.gz

Fig: Hadoop Installation – Extracting Hadoop Files

Step 5: Add the Hadoop and Java paths in the bash file (.bashrc).

Open. bashrc file. Now, add Hadoop and Java Path as shown below.

Command: vi .bashrc
Fig: Hadoop Installation – Setting Environment Variable

Then, save the bash file and close it.

For applying all these changes to the current Terminal, execute the source command.

Command: source .bashrc

Fig: Hadoop Installation – Refreshing environment variables

To make sure that Java and Hadoop have been properly installed on your system
and can be accessed through the Terminal, execute the java -version and hadoop
version commands.

Command: java -version


Fig: Hadoop Installation – Checking Java Version

Command: hadoop version

Fig: Hadoop Installation – Checking Hadoop Version

Step 6: Edit the Hadoop Configuration files.

Command: cd hadoop-2.7.3/etc/hadoop/

Command: ls

All the Hadoop configuration files are located in hadoop-2.7.3/etc/hadoop directory as


you can see in the snapshot below:
Fig: Hadoop Installation – Hadoop Configuration Files

Step 7: Open core-site.xml and edit the property mentioned below inside
configuration tag:

core-site.xml informs Hadoop daemon where NameNode runs in the cluster. It contains
configuration settings of Hadoop core such as I/O settings that are common to HDFS &
MapReduce.

Command: vi core-site.xml

Fig: Hadoop Installation – Configuring core-site.xml

1
<?xml version="1.0" encoding="UTF-8"?>
2 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
3 <configuration>
4 <property>
5 <name>fs.default.name</name>
6 <value>hdfs://localhost:9000</value>
</property>
7 </configuration>
8
Step 8: Edit hdfs-site.xml and edit the property mentioned below inside
configuration tag:
hdfs-site.xml contains configuration settings of HDFS daemons (i.e. NameNode,
DataNode, Secondary NameNode). It also includes the replication factor and block size
of HDFS.
Command: vi hdfs-site.xml

Fig: Hadoop Installation – Configuring hdfs-site.xml

1
2 <?xml version="1.0" encoding="UTF-8"?>
3 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
4 <property>
5 <name>dfs.replication</name>
6 <value>1</value>
7 </property>
8 <property>
<name>dfs.permission</name>
9 <value>false</value>
10 </property>
11 </configuration>
12

Step 9: Edit the mapred-site.xml file and edit the property mentioned below
inside configuration tag:

mapred-site.xml contains configuration settings of MapReduce application like number


of JVM that can run in parallel, the size of the mapper and the reducer process, CPU
cores available for a process, etc.

In some cases, mapred-site.xml file is not available. So, we have to create the mapred-
site.xml file using mapred-site.xml template.

Command: cp mapred-site.xml.template mapred-site.xml


Command: vi mapred-site.xml.

Fig: Hadoop Installation – Configuring mapred-site.xml

1
<?xml version="1.0" encoding="UTF-8"?>
2 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
3 <configuration>
4 <property>
5 <name>mapreduce.framework.name</name>
6 <value>yarn</value>
</property>
7 </configuration>
8

Step 10: Edit yarn-site.xml and edit the property mentioned below inside
configuration tag:

yarn-site.xml contains configuration settings of ResourceManager and NodeManager


like application memory management size, the operation needed on program &
algorithm, etc.

Command: vi yarn-site.xml
Fig: Hadoop Installation – Configuring yarn-site.xml

1
2
<?xml version="1.0">
3 <configuration>
4 <property>
5 <name>yarn.nodemanager.aux-services</name>
6 <value>mapreduce_shuffle</value>
</property>
7 <property>
8 <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</
9 name>
1 <value>org.apache.hadoop.mapred.ShuffleHandler</value>
0 </property>
</configuration>
1
1

Step 11: Edit hadoop-env.sh and add the Java Path as mentioned below:

hadoop-env.sh contains the environment variables that are used in the script to run
Hadoop like Java home path, etc.

Command: vi hadoop–env.sh

Fig: Hadoop Installation – Configuring hadoop-env.sh


Step 12: Go to Hadoop home directory and format the NameNode.

Command: cd

Command: cd hadoop-2.7.3

Command: bin/hadoop namenode -format

Fig: Hadoop Installation – Formatting NameNode

This formats the HDFS via NameNode. This command is only executed for the first
time. Formatting the file system means initializing the directory specified by the
dfs.name.dir variable.

Never format, up and running Hadoop filesystem. You will lose all your data stored in
the HDFS.

Step 13: Once the NameNode is formatted, go to hadoop-2.7.3/sbin directory


and start all the daemons.

Command: cd hadoop-2.7.3/sbin

Either you can start all daemons with a single command or do it individually.

Command: ./start-all.sh

The above command is a combination of start-dfs.sh, start-yarn.sh & mr-jobhistory-


daemon.sh

Or you can run all the services individually as below:

Start NameNode:

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of
all files stored in the HDFS and tracks all the file stored across the cluster.

Command: ./hadoop-daemon.sh start namenode


Fig: Hadoop Installation – Starting NameNode

Start DataNode:

On startup, a DataNode connects to the Namenode and it responds to the requests


from the Namenode for different operations.

Command: ./hadoop-daemon.sh start datanode

Fig: Hadoop Installation – Starting DataNode

Start ResourceManager:

ResourceManager is the master that arbitrates all the available cluster resources and
thus helps in managing the distributed applications running on the YARN system.
Its work is to manage each NodeManagers and the each application’s
ApplicationMaster.

Command: ./yarn-daemon.sh start resourcemanager


Fig: Hadoop Installation – Starting ResourceManager

Start NodeManager:

The NodeManager in each machine framework is the agent which is responsible for
managing containers, monitoring their resource usage and reporting the same to the
ResourceManager.

Command: ./yarn-daemon.sh start nodemanager

See Batch Details

Fig: Hadoop Installation – Starting NodeManager

Start JobHistoryServer:

JobHistoryServer is responsible for servicing all job history related requests from client.

Command: ./mr-jobhistory-daemon.sh start historyserver


Step 14: To check that all the Hadoop services are up and running, run the
below command.

Command: jps

Fig: Hadoop Installation – Checking Daemons

Step 15: Now open the Mozilla browser and go


to localhost:50070/dfshealth.html to check the NameNode interface.

Fig: Hadoop Installation – Starting WebUI

Congratulations, you have successfully installed a single node Hadoop cluster

You might also like