CCS335 - Cloud Computing Laboratory: Lab Manual

The document outlines a series of experiments related to cloud computing and virtual machine management, including installation of VirtualBox, Google App Engine, and Hadoop, as well as executing simple programs in a C compiler. It details procedures for setting up virtual machines, simulating cloud scenarios with CloudSim, and transferring files between virtual machines. Each experiment includes aims, procedures, and applications, emphasizing practical skills in cloud technology and programming.


LIST OF EXPERIMENTS

1. Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of
Windows 7 or 8.
2. Install a C compiler in the virtual machine created using VirtualBox and execute simple programs.
3. Install Google App Engine. Create a hello world app and other simple web applications using
Python/Java.
4. Use the GAE Launcher to launch the web applications.
5. Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in
CloudSim.
6. Find a procedure to transfer files from one virtual machine to another virtual machine.
7. Find a procedure to launch a virtual machine using TryStack (online OpenStack demo version).
8. Install a Hadoop single node cluster and run simple applications like wordcount.
9. Creating and executing your first container using Docker.
10. Run a container from Docker Hub.


EX NO. : 1

DATE:
Install VirtualBox / VMware Workstation with different flavours of Linux or
Windows OS on top of Windows 7 or 8.

Aim:
To install VirtualBox / VMware Workstation with different flavours of Linux or
Windows OS on top of Windows 7 or 8.

Procedure:

Steps to install VirtualBox:

1. Download the VirtualBox installer (.exe), run it and click the Next button.
2. Click the Next button.
3. Click the Next button.
4. Click the Yes button.
5. Click the Install button.
6. When the installation completes, the VirtualBox icon is shown on the desktop screen.
Steps to import the OpenNebula sandbox:
1. Open VirtualBox.
2. Go to File > Import Appliance.
3. Browse to the OpenNebula-Sandbox-5.0.ova file.
4. Then go to Settings, select USB and choose USB 1.1.
5. Then start the OpenNebula appliance.
6. Log in using username: root, password: opennebula.
Steps to create a virtual machine through OpenNebula:
1. Open a browser and type localhost:9869.
2. Log in using username: oneadmin, password: opennebula.
3. Click on Instances, select VMs, then follow these steps to create a virtual machine:
a. Click the + symbol.
b. Select the user oneadmin.
c. Then enter the VM name, number of instances and CPU.
d. Then click on the Create button.
e. Repeat steps c and d to create more than one VM.
Applications:
There are various applications of cloud computing in today's networked world. Many search
engines and social websites use the concept of cloud computing, for example
www.amazon.com, hotmail.com, facebook.com, linkedin.com, etc. The advantages of cloud
computing with respect to scalability include reduced risk, low-cost testing, the ability to segment the
customer base, and auto-scaling based on application load.

Result:
EX.NO.:2
DATE:
Install a C compiler in the virtual machine created using VirtualBox and
execute simple programs

Aim:
To install a C compiler in the virtual machine created using VirtualBox and
execute simple programs.

Procedure:

Steps to import the .ova file:

1. Open VirtualBox.
2. Go to File > Import Appliance.
3. Browse to the ubuntu_gt6.ova file.
4. Then go to Settings, select USB and choose USB 1.1.
5. Then start the ubuntu_gt6 machine.
6. Log in using username: dinesh, password: 99425.
Steps to run a C program:

1. Open the terminal.
2. Type cd /opt/axis2/axis2-1.7.3/bin and press Enter.
3. Create the source file, e.g. gedit hello.c (or first.c).
4. Type the C program in the editor and save it.
5. Compile it with gcc hello.c.
6. Run it with ./a.out and check the displayed output.
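A minimal hello.c that could be typed in step 4 is shown below; the file name and the printed message are just examples.

File : hello.c

#include <stdio.h>

/* Print a greeting and exit. */
int main(void)
{
    printf("Hello, world!\n");
    return 0;
}

Compiling with gcc hello.c and running ./a.out should then print "Hello, world!" in the terminal.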

Applications:
Simple programs can be compiled and run in the same way in a grid or cloud environment.

Result:
EX NO.:3
DATE:
Install Google App Engine. Create a hello world app and other simple web
applications using Python/Java.

Aim:
To install Google App Engine and create a hello world app and other simple web
applications using Python/Java.
Procedure:

1. Install Google Plugin for Eclipse


Read this guide on how to install the Google Plugin for Eclipse. If you installed the Google App Engine
Java SDK together with the "Google Plugin for Eclipse", go to step 2. Otherwise, get the
Google App Engine Java SDK and extract it.

2. Create New Web Application Project


In the Eclipse toolbar, click on the Google icon and select "New Web Application Project…".
Click Finish; the Google Plugin for Eclipse will generate a sample project automatically.

3. Hello World
Review the generated project directory.
Nothing special, a standard Java web project structure.

HelloWorld/
src/
...Java source code...
META-INF/
...other configuration...
war/
...JSPs, images, data files...
WEB-INF/
...app configuration...
lib/
...JARs for libraries...
classes/
...compiled classes...
The extra file is "appengine-web.xml"; Google App Engine needs it to run and deploy the
application.

File : appengine-web.xml

<?xml version="1.0" encoding="utf-8"?>


<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
<application></application>
<version>1</version>

<!-- Configure java.util.logging -->


<system-properties>
<property name="java.util.logging.config.file" value="WEB-INF/logging.properties"/>
</system-properties>

</appengine-web-app>

4. Run it locally
Right click on the project and run it as a "Web Application".

Eclipse console :

//...
INFO: The server is running at http://localhost:8888/
30 Mac 2012 11:13:01 PM com.google.appengine.tools.development.DevAppServerImpl start
INFO: The admin console is running at http://localhost:8888/_ah/admin

Access the URL http://localhost:8888/ to see the output,
and also the hello world servlet at http://localhost:8888/helloworld
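The hello world servlet lives under src/ in the generated project. A minimal sketch of such a servlet is shown below; the package and class names are assumptions (the plugin derives them from the project name), and the servlet must be mapped to /helloworld in war/WEB-INF/web.xml.

File : HelloWorldServlet.java (sketch)

package com.example.helloworld; // hypothetical package name

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Responds to GET /helloworld with a plain-text greeting.
public class HelloWorldServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("text/plain");
        resp.getWriter().println("Hello, world");
    }
}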

5. Deploy to Google App Engine


Register an account at https://appengine.google.com/, and create an application ID for your
web application.

In this demonstration, I created an application ID named "mkyong123" and put it in
appengine-web.xml.

File : appengine-web.xml

<?xml version="1.0" encoding="utf-8"?>


<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
<application>mkyong123</application>
<version>1</version>

<!-- Configure java.util.logging -->


<system-properties>
<property name="java.util.logging.config.file" value="WEB-INF/logging.properties"/>
</system-properties>

</appengine-web-app>
To deploy, follow these steps:

Figure 1.1 – Click on GAE deploy button on the toolbar.

Figure 1.2 – Sign in with your Google account


Result:
EX. NO.:4
DATE:
Simulate a cloud scenario using CloudSim and run a scheduling
algorithm that is not present in CloudSim.

Aim:
To simulate a cloud scenario using CloudSim and run a scheduling algorithm
that is not present in CloudSim.
Steps:
How to use CloudSim in Eclipse
CloudSim is written in Java. The knowledge you need to use CloudSim is basic Java programming
and some basics about cloud computing. Knowledge of programming IDEs such as Eclipse or
NetBeans is also helpful. It is a library and, hence, CloudSim does not have to be installed.
Normally, you can unpack the downloaded package in any directory, add it to the Java classpath and
it is ready to be used. Please verify whether Java is available on your system.

To use CloudSim in Eclipse:


1. Download CloudSim installable files
from https://code.google.com/p/cloudsim/downloads/list and unzip
2. Open Eclipse
3. Create a new Java Project: File -> New
4. Import an unpacked CloudSim project into the new Java Project
The first step is to initialise the CloudSim package by initialising the CloudSim library, as
follows:

CloudSim.init(num_user, calendar, trace_flag);


5. Data centres are the resource providers in CloudSim; hence, creating a data
centre is the second step. To create a Datacenter, you need a
DatacenterCharacteristics object that stores the properties of a data centre such as
architecture, OS, list of machines, the allocation policy (time-shared or
space-shared), the time zone and its prices:
Datacenter datacenter9883 = new Datacenter(name, characteristics, new
VmAllocationPolicySimple(hostList), storageList, 0);
6. The third step is to create a broker:
DatacenterBroker broker = createBroker();
7. The fourth step is to create a virtual machine, specifying the unique ID of the VM, the user
ID of the VM's owner, MIPS, the number of PEs (CPUs), the amount of
RAM, bandwidth and storage, the virtual machine monitor (VMM), and the
CloudletScheduler policy for cloudlets:
Vm vm = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm,
new CloudletSchedulerTimeShared());
8. Submit the VM list to the broker:
broker.submitVmList(vmlist);
9. Create a cloudlet with length, file size, output size, and utilisation model:
Cloudlet cloudlet = new Cloudlet(id, length, pesNumber, fileSize, outputSize,
utilizationModel, utilizationModel, utilizationModel);
10. Submit the cloudlet list to the broker:
broker.submitCloudletList(cloudletList);
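Putting steps 1 to 10 together, a minimal simulation class could look like the sketch below. This is only a sketch against the CloudSim 3.0.x API; the class name MyCloudSimExample and all numeric values (MIPS, RAM, cloudlet length, costs) are illustrative assumptions. A scheduling algorithm that is not shipped with CloudSim (for example shortest job first) can be plugged in at the marked point, before the cloudlets are submitted.

import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;

import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerTimeShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

public class MyCloudSimExample {
    public static void main(String[] args) throws Exception {
        // Step 1: initialise the CloudSim library (1 user, no trace events)
        CloudSim.init(1, Calendar.getInstance(), false);

        // Step 2: create a datacenter with one host (1000 MIPS, 2 GB RAM)
        List<Pe> peList = new ArrayList<Pe>();
        peList.add(new Pe(0, new PeProvisionerSimple(1000)));
        List<Host> hostList = new ArrayList<Host>();
        hostList.add(new Host(0, new RamProvisionerSimple(2048),
                new BwProvisionerSimple(10000), 1000000, peList,
                new VmSchedulerTimeShared(peList)));
        DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
                "x86", "Linux", "Xen", hostList, 10.0, 3.0, 0.05, 0.001, 0.0);
        Datacenter datacenter = new Datacenter("Datacenter_0", characteristics,
                new VmAllocationPolicySimple(hostList), new LinkedList<Storage>(), 0);

        // Step 3: create a broker that mediates between the user and the datacenter
        DatacenterBroker broker = new DatacenterBroker("Broker");
        int brokerId = broker.getId();

        // Step 4: create one VM and submit it to the broker
        List<Vm> vmList = new ArrayList<Vm>();
        vmList.add(new Vm(0, brokerId, 1000, 1, 512, 1000, 10000, "Xen",
                new CloudletSchedulerTimeShared()));
        broker.submitVmList(vmList);

        // Step 5: create one cloudlet (length 400000 MI) and submit it
        List<Cloudlet> cloudletList = new ArrayList<Cloudlet>();
        Cloudlet cloudlet = new Cloudlet(0, 400000, 1, 300, 300,
                new UtilizationModelFull(), new UtilizationModelFull(),
                new UtilizationModelFull());
        cloudlet.setUserId(brokerId);
        cloudletList.add(cloudlet);
        // A scheduling algorithm not present in CloudSim (e.g. shortest job first)
        // can be applied here by sorting cloudletList, or by calling
        // broker.bindCloudletToVm(cloudletId, vmId), before submission.
        broker.submitCloudletList(cloudletList);

        // Run the simulation and print the status of every received cloudlet
        CloudSim.startSimulation();
        CloudSim.stopSimulation();
        DecimalFormat dft = new DecimalFormat("###.##");
        List<Cloudlet> finished = broker.getCloudletReceivedList();
        for (Cloudlet received : finished) {
            System.out.println("Cloudlet " + received.getCloudletId() + ": "
                    + received.getCloudletStatusString() + ", finished at "
                    + dft.format(received.getFinishTime()));
        }
        System.out.println("Simulation finished. (Datacenter id: " + datacenter.getId() + ")");
    }
}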
Sample Output from the Existing Example:
Starting CloudSimExample1...
Initialising...
Starting CloudSim version 3.0
Datacenter_0 is starting...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>null
Broker is starting...
Entities started.
0.0: Broker: Cloud Resource List received with 1 resource(s)
0.0: Broker: Trying to Create VM #0 in Datacenter_0
0.1: Broker: VM #0 has been created in Datacenter #2, Host #0
0.1: Broker: Sending cloudlet 0 to VM #0
400.1: Broker: Cloudlet 0 received
400.1: Broker: All Cloudlets executed. Finishing...
400.1: Broker: Destroying VM #0
Broker is shutting down...
Simulation: No more future events
CloudInformationService: Notify all CloudSim entities for shutting down.
Datacenter_0 is shutting down...
Broker is shutting down...
Simulation completed.
Simulation completed.

========== OUTPUT ==========
Cloudlet ID    STATUS     Data center ID    VM ID    Time    Start Time    Finish Time
0              SUCCESS    2                 0        400     0.1           400.1

*****Datacenter: Datacenter_0*****
User id    Debt
3          35.6

CloudSimExample1 finished!

Result:
EX.NO.:5
DATE:
Use GAE launcher to launch the web applications.

Aim:
To use the GAE Launcher to launch the web applications.

Steps:

Making your First Application

Now you need to create a simple application. We could use the “+” option to have the
launcher make us an application – but instead we will do it by hand to get a better sense of
what is going on.

Make a folder for your Google App Engine applications. I am going to make the folder
on my Desktop called "apps" – the path to this folder is:

C:\Documents and Settings\csev\Desktop\apps

And then make a sub-folder within apps called "ae-01-trivial" – the path to this folder
would be:

C:\Documents and Settings\csev\Desktop\apps\ae-01-trivial


Using a text editor such as JEdit (www.jedit.org), create a file called app.yaml in the ae-01-trivial
folder with the following contents:

application: ae-01-trivial
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: index.py
Note: Please do not copy and paste these lines into your text editor – you might end
up with strange characters – simply type them into your editor.
Then create a file in the ae-01-trivial folder called index.py with three lines in it:
print 'Content-Type: text/plain'
print ' '
print 'Hello there Chuck'
Then start the GoogleAppEngineLauncher program that can be found
under Applications. Use the File > Add Existing Application command,
navigate into the apps directory and select the ae-01-trivial folder. Once you have added
the application, select it so that you can control the application using the launcher.
Once you have selected your application, press Run. After a few moments your
application will start and the launcher will show a little green icon next to your
application. Then press Browse to open a browser pointing at your application,
which is running at http://localhost:8080/

Paste http://localhost:8080 into your browser and you should see your
application as follows:

Just for fun, edit the index.py to change the name “Chuck” to your own name
and press Refresh in the browser to verify your updates.

Watching the Log

You can watch the internal log of the actions that the web server is performing
when you are interacting with your application in the browser. Select your
application in the Launcher and press the Logs button to bring up a log window:

Each time you press Refresh in your browser, you can see it retrieving the
output with a GET request.
Dealing With Errors

With two files to edit, there are two general categories of errors that you may
encounter. If you make a mistake on the app.yaml file, the App Engine will not start
and your launcher will show a yellow icon near your application:

To get more detail on what is going wrong, take a look at the log for the application:
In this instance – the mistake is mis-indenting the last line in the app.yaml (line 8).
If you make a syntax error in the index.py file, a Python trace back error will appear in
your browser.

The error you need to see is likely to be in the last few lines of the output – in this
case I made a Python syntax error on line one of our one-line application.
Reference: http://en.wikipedia.org/wiki/Stack_trace
When you make a mistake in the app.yaml file, you must fix the mistake
and attempt to start the application again.
If you make a mistake in a file like index.py, you can simply fix the file and
press refresh in your browser – there is no need to restart the server.
Shutting Down the Server
To shut down the server, use the Launcher, select your application and press the
Stop button.

Result:
EX.NO:6
DATE:
Find a procedure to transfer the files from one virtual machine
to another virtual machine.

Aim:
To find a procedure to transfer files from one virtual machine
to another virtual machine.

Steps:

1. You can copy a few (or more) lines using the copy & paste mechanism.
For this you need to share the clipboard between the host OS and the guest OS, by installing
Guest Additions on both virtual machines (and probably setting the clipboard to bidirectional
and restarting them). You copy from the first guest OS into the clipboard that is shared
with the host OS, then you paste from the host OS into the second guest OS.
2. You can enable drag and drop too with the same method (click on the
machine, Settings, General, Advanced, Drag and Drop: set to bidirectional).
3. You can have common Shared Folders on both virtual machines and
use one of the shared directories as a buffer to copy through.
Installing Guest Additions also gives you the possibility to set up Shared Folders.
As soon as you put a file in a shared folder from the host OS or from a guest OS, it is
immediately visible to the other. (Keep in mind that some problems can arise
with the date/time of the files when there are different clock settings on the
different virtual machines.)
If you use the same folder shared on several machines, you can exchange files
directly by copying them into this folder.
4. You can use the usual methods to copy files between two different computers with a
client-server application, as shown in the sketch after this list (e.g. scp with sshd active for
Linux, or WinSCP for Windows; you can find more information about SSH servers online).
You need an active server (sshd) on the receiving machine and a client on
the sending machine. Of course you need to have the authorization set up
(via password or, better, via an automatic authentication method).
Note: many Linux/Ubuntu distributions install sshd by default; you can check whether
it is running with pgrep sshd from a shell. You can install it with sudo apt-get
install openssh-server.
5. You can mount part of the file system of one virtual machine via NFS or
SSHFS on the other, or you can share files and directories with Samba.
You may find the article "Sharing files between guest and host without
VirtualBox shared folders", with detailed step-by-step instructions, interesting.
You should remember that you are dealing with a small network of machines
with different operating systems, and in particular:
Each virtual machine has its own operating system running and acts
like a physical machine.
Each virtual machine is an instance of a program owned by a user in the
hosting operating system and is subject to the restrictions of that user in the
hosting OS.
E.g., let's say that Hastur and Meow are users of the hosting machine, but
they did not allow each other to see their directories (no read/write/execute
permission). When each of them runs a virtual machine, for the hosting OS
those virtual machines are two normal programs owned by Hastur and Meow
and cannot see the private directory of the other user. This is a restriction due
to the hosting OS. It is easy to overcome: it is enough to grant read/write/execute
permission on a directory, or to choose a different directory in which both
users can read/write/execute.
Windows favours the mouse and Linux the keyboard. :-)
In other words, I suggest you enable drag & drop to be comfortable with the Windows
machines, and Shared Folders or SSH to be comfortable with Linux.
When you need to be fast with Linux, you will feel the need for ssh-keygen and
to generate SSH keys once, so you can copy files to/from a remote machine without typing a
password any more. In this way bash auto-completion works remotely too!
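As a concrete illustration of option 4, assuming the receiving VM runs sshd, is reachable at the hypothetical IP address 192.168.1.20 and has a user named user, a file can be copied from a terminal on the sending VM like this:

$ scp /home/user/data.txt user@192.168.1.20:/home/user/
$ ssh user@192.168.1.20 ls -l /home/user/data.txt   # optional check that the file arrived

The file name, user name and paths above are examples only; replace them with your own.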

Procedure (VM migration through OpenNebula):
Steps:
1. Open a browser and type localhost:9869.
2. Log in using username: oneadmin, password: opennebula.
3. Then follow these steps to prepare the hosts for migration:
a. Click on Infrastructure.
b. Select Clusters and enter the cluster name.
c. Then select the Hosts tab and select all hosts.
d. Then select the VNets tab and select all vnets.
e. Then select the Datastores tab and select all datastores.
f. And then choose Hosts under the Infrastructure tab.
g. Click on the + symbol to add a new host, name the host, then click on Create.
4. Click on Instances, select the VMs to migrate, then follow these steps:
a. Click on the 8th icon; a drop-down list is displayed.
b. Select Migrate; a popup window is displayed.
c. In the popup, select the target host to migrate to, then click on Migrate.
Before migration
Host:SACET

Host:one-sandbox
After Migration:

Host:one-sandbox
Host:SACET

Applications:
Easily migrate your virtual machines from one PC to another.

Result:
EX.NO:7
DATE:
Find a procedure to launch a virtual machine using TryStack (online
OpenStack demo version)

Aim:
To find a procedure to launch a virtual machine using TryStack.
Steps:

OpenStack is an open-source cloud computing software platform. OpenStack is

primarily used for deploying an infrastructure as a service (IaaS) solution like
Amazon Web Services (AWS). In other words, you can make your own AWS by
using OpenStack. If you want to try out OpenStack, TryStack is the easiest and
free way to do it.
In order to try OpenStack on TryStack, you must register yourself by joining the
TryStack Facebook group. Acceptance into the group can take a couple of days because
it is approved manually. After you have been accepted into the TryStack group,
you can log in to TryStack.

TryStack.org Homepage

I assume that you have already joined the Facebook group and logged in to the dashboard.
After you log in to TryStack, you will see the Compute Dashboard like:
OpenStack Compute Dashboard
Overview: What we will do?

In this post, I will show you how to run an OpenStack instance. The instance will
be accessible through the internet (it will have a public IP address). The final topology
will look like:
Network topology
As you can see from the image above, the instance will be connected to a local
network and the local network will be connected to the internet.

Step 1: Create Network

Network? Yes, the network here is our own local network, so your instances
will not be mixed up with the others. You can imagine this as your own LAN
(Local Area Network) in the cloud.
1. Go to Network > Networks and then click Create Network.
2. In Network tab, fill Network Name for example internal and then click Next.
3. In Subnet tab,
1. Fill Network Address with appropriate CIDR, for example
192.168.1.0/24. Use private network CIDR block as the best practice.
2. Select IP Version with appropriate IP version, in this case IPv4.
3. Click Next.
4. In Subnet Details tab, fill DNS Name Servers with 8.8.8.8 (Google
DNS) and then click Create.

Step 2: Create Instance

Now, we will create an instance. The instance is a virtual machine in the cloud,
like AWS EC2. You need the instance to connect to the network that we just
created in the previous step.
1. Go to Compute > Instances and then click Launch Instance.
2. In Details tab,
1. Fill Instance Name, for example Ubuntu 1.
2. Select Flavor, for example m1.medium.
3. Fill Instance Count with 1.
4. Select Instance Boot Source with Boot from Image.
5. Select Image Name with Ubuntu 14.04 amd64 (243.7 MB) if you want to install Ubuntu
14.04 in your virtual machine.
3. In Access & Security tab,
1. Click [+] button of Key Pair to import key pair. This key pair is a public and
private key that we will use to connect to the instance from our machine.
2. In Import Key Pair dialog,
1. Fill Key Pair Name with your machine name (for example Edward-Key).
2. Fill Public Key with your SSH public key (usually is in
~/.ssh/id_rsa.pub). See description in Import Key Pair dialog box for
more information. If you are using Windows, you can use Puttygen
to generate key pair.
3. Click Import key pair.
3. In Security Groups, mark/check default.
4. Click Launch.
5. If you want to create multiple instances, you can repeat steps 1-5. I created one
more instance with instance name Ubuntu 2.
Step 3: Create Router

I guess you already know what a router is. In Step 1, we created our network,
but it is isolated. It doesn't connect to the internet. To give our network an
internet connection, we need a router running as the gateway to the internet.
1. Go to Network > Routers and then click Create Router.
2. Fill Router Name for example router1 and then click Create router.
3. Click on your router name link, for example router1, to open the Router Details page.
4. Click Set Gateway button in upper right:
1. Select External networks with external.
2. Then OK.
5. Click Add Interface button.
1. Select Subnet with the network that you have been created in Step 1.
2. Click Add interface.
6. Go to Network > Network Topology. You will see the network topology. In the
example, there are two networks, external and internal, which are bridged by a
router. The instances are joined to the internal network.

Step 4: Configure Floating IP Address

A floating IP address is a public IP address. It makes your instance accessible from

the internet. When you launch your instance, the instance will have a private
network IP, but no public IP. In OpenStack, the public IPs are collected in a pool
and managed by the admin (in our case, TryStack). You need to request a public
(floating) IP address to be assigned to your instance.
1. Go to Compute > Instance.
2. In one of your instances, click More > Associate Floating IP.
3. In IP Address, click Plus [+].
4. Select Pool to external and then click Allocate IP.
5. Click Associate.
6. Now you will get a public IP, e.g. 8.21.28.120, for your instance.
Step 5: Configure Access & Security

OpenStack has a feature like a firewall. It can whitelist/blacklist your inbound/outbound
connections. It is called a Security Group.
1. Go to Compute > Access & Security and then open Security Groups tab.
2. In default row, click Manage Rules.
3. Click Add Rule, choose ALL ICMP rule to enable ping into your instance, and then click
Add.
4. Click Add Rule, choose HTTP rule to open HTTP port (port 80), and then click Add.
5. Click Add Rule, choose SSH rule to open SSH port (port 22), and then click Add.
6. You can open other ports by creating new rules.
Step 6: SSH to Your Instance

Now, you can SSH into your instance using the floating IP address that you got in
Step 4. If you are using the Ubuntu image, the SSH user will be ubuntu.
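For example, with the floating IP 8.21.28.120 obtained in Step 4 and the key pair imported in Step 2, the login could look like this (the private key path is an assumption):

$ ssh -i ~/.ssh/id_rsa ubuntu@8.21.28.120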

Result:
EX NO.:8
DATE :
Install a Hadoop single node cluster and run simple
applications like wordcount.

Aim:
To install a Hadoop single node cluster and run simple applications like
wordcount.

Steps:

Install Hadoop

Step 1: Download the Java 8 package and save the file in your home
directory.

Step 2: Extract the Java Tar File.

Command: tar -xvf jdk-8u101-linux-i586.tar.gz

Fig: Hadoop Installation – Extracting Java Files


Step 3: Download the Hadoop 2.7.3 Package.

Command: wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz

Fig: Hadoop Installation – Downloading Hadoop

Step 4: Extract the Hadoop tar File.

Command: tar -xvf hadoop-2.7.3.tar.gz


Step 5: Add the Hadoop and Java paths in the bash file (.bashrc). Open the .bashrc

file, then add the Hadoop and Java paths as shown below.

Command: vi .bashrc

Fig: Hadoop Installation – Setting Environment Variable
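The entries added to .bashrc could look like the following lines; the install locations are assumptions based on extracting both archives in the home directory, so adjust them to your actual paths.

export JAVA_HOME=$HOME/jdk1.8.0_101
export HADOOP_HOME=$HOME/hadoop-2.7.3
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin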


Then, save the bash file and close it.

To apply all these changes to the current terminal, execute the source command.
Command: source .bashrc

Fig: Hadoop Installation – Refreshing environment variables

To make sure that Java and Hadoop have been properly installed on your system and can be
accessed through the Terminal, execute the java -version and hadoop version commands.

Command: java -version


Fig: Hadoop Installation – Checking Java Version
Command: hadoop version

Fig: Hadoop Installation – Checking Hadoop Version

Step 6: Edit the Hadoop Configuration files.

Command: cd hadoop-2.7.3/etc/hadoop/

Command: ls

All the Hadoop configuration files are located in hadoop-2.7.3/etc/hadoop directory as you can
see in the snapshot below:

Fig: Hadoop Installation – Hadoop Configuration Files


Step 7: Open core-site.xml and edit the property mentioned below inside
configuration tag:

core-site.xml informs Hadoop daemon where NameNode runs in the cluster. It contains
configuration settings of Hadoop core such as I/O settings that are common to HDFS &
MapReduce.

Command: vi core-site.xml

Fig: Hadoop Installation – Configuring core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

Step 8: Edit hdfs-site.xml and edit the property mentioned below inside
configuration tag:

hdfs-site.xml contains configuration settings of HDFS daemons (i.e. NameNode, DataNode,


Secondary NameNode). It also includes the replication factor and block size of HDFS.

Command: vi hdfs-site.xml
Fig: Hadoop Installation – Configuring hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permission</name>
<value>false</value>
</property>
</configuration>

Step 9: Edit the mapred-site.xml file and edit the property mentioned below
inside configuration tag:

mapred-site.xml contains configuration settings of a MapReduce application, like the number of JVMs

that can run in parallel, the size of the mapper and the reducer processes, the CPU cores available for a
process, etc.

In some cases, mapred-site.xml file is not available. So, we have to create the mapred- site.xml
file using mapred-site.xml template.

Command: cp mapred-site.xml.template mapred-site.xml

Command: vi mapred-site.xml

Fig: Hadoop Installation – Configuring mapred-site.xml


<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

Step 10: Edit yarn-site.xml and edit the property mentioned below inside
configuration tag:

yarn-site.xml contains configuration settings of the ResourceManager and NodeManager, like

the application memory management size, the operations needed on programs & algorithms, etc.

Command: vi yarn-site.xml

Fig: Hadoop Installation – Configuring yarn-site.xml

<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>

Step 11: Edit hadoop-env.sh and add the Java Path as mentioned below:

Command: vi hadoop-env.sh
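Inside hadoop-env.sh, the Java path is set by pointing JAVA_HOME at the extracted JDK. The path below is an assumption matching the .bashrc example in Step 5; adjust it to your installation.

export JAVA_HOME=$HOME/jdk1.8.0_101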

Fig: Hadoop Installation – Configuring hadoop-env.sh

Step 12: Go to the Hadoop home directory and format the NameNode.

Command: cd

Command: cd hadoop-2.7.3

Command: bin/hadoop namenode -format

Fig: Hadoop Installation – Formatting NameNode

This formats the HDFS via the NameNode. This command is only executed the first time.
Formatting the file system means initializing the directory specified by the dfs.name.dir
variable.

Never format an up-and-running Hadoop file system; you will lose all the data stored in the
HDFS.

Step 13: Once the NameNode is formatted, go to hadoop-2.7.3/sbin directory and start all the daemons.

Command: cd hadoop-2.7.3/sbin

Either you can start all daemons with a single command or do it individually.

Command: ./start-all.sh

The above command is a combination of start-dfs.sh, start-yarn.sh & mr-jobhistory-


daemon.sh

Or you can run all the services individually as below:


Start NameNode:

The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files
stored in HDFS and tracks all the files stored across the cluster.

Command: ./hadoop-daemon.sh start namenode


Start DataNode:

On startup, a DataNode connects to the NameNode and responds to requests from
the NameNode for different operations.

Command: ./hadoop-daemon.sh start datanode

Fig: Hadoop Installation – Starting DataNode

Start ResourceManager:

ResourceManager is the master that arbitrates all the available cluster resources and
thus helps in managing the distributed applications running on the YARN system.
Its job is to manage each NodeManager and each application's
ApplicationMaster.

Command: ./yarn-daemon.sh start resourcemanager

Fig: Hadoop Installation – Starting ResourceManager

Start NodeManager:

The NodeManager in each machine is the framework agent which is responsible for
managing containers, monitoring their resource usage and reporting the same to the
ResourceManager.

Command: ./yarn-daemon.sh start nodemanager


Fig: Hadoop Installation – Starting NodeManager

Start JobHistoryServer:

JobHistoryServer is responsible for servicing all job-history-related requests from clients.

Command: ./mr-jobhistory-daemon.sh start historyserver

Step 14: To check that all the Hadoop services are up and running, run the below
command.

Command: jps

Fig: Hadoop Installation – Checking Daemons


Step 15: Now open the Mozilla browser and go
to localhost:50070/dfshealth.html to check the NameNode interface.

Fig: Hadoop Installation – Starting WebUI

Congratulations, you have successfully installed a single node Hadoop cluster.
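To run the wordcount application mentioned in the aim, a possible sequence is shown below. It assumes the commands are run from the hadoop-2.7.3 directory and that sample.txt is a hypothetical local text file to be counted.

Command: bin/hdfs dfs -mkdir -p /input

Command: bin/hdfs dfs -put sample.txt /input

Command: bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output

Command: bin/hdfs dfs -cat /output/part-r-00000

The last command prints each word in sample.txt together with the number of times it occurs.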

Result:
EX.NO:9 Creating and executing your first container using Docker

DATE:
Aim:
To create and execute your first container using Docker.

Steps:

Step 1: Run your first container

We are going to use the Docker CLI to run our first container.

1. Open a terminal on your local computer

2. Run docker container run -t ubuntu top

Use the docker container run command to run a container with the ubuntu image using the top command.
The -t flag allocates a pseudo-TTY, which we need for top to work correctly.
$ docker container run -it ubuntu top
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
aafe6b5e13de: Pull complete
0a2b43a72660: Pull complete
18bdd1e546d2: Pull complete
8198342c3e05: Pull complete
f56970a44fd4: Pull complete
Digest: sha256:f3a61450ae43896c4332bda5e78b453f4a93179045f20c8181043b26b5e79028
Status: Downloaded newer image for ubuntu:latest
The docker run command will result first in a docker pull to download the ubuntu image onto your host.
Once it is downloaded, it will start the container. The output for the running container should look like
this:
top - 20:32:46 up 3 days, 17:40, 0 users, load average: 0.00, 0.01, 0.00
Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2046768 total, 173308 free, 117248 used, 1756212 buff/cache
KiB Swap: 1048572 total, 1048572 free, 0 used. 1548356 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND


1 root 20 0 36636 3072 2640 R 0.3 0.2 0:00.04 top
top is a Linux utility that prints the processes on a system and orders them by resource consumption.
Notice that there is only a single process in this output: it is the top process itself. We don't see other
processes from our host in this list because of the PID namespace isolation. Containers use Linux
namespaces to provide isolation of system resources from other containers or the host. The PID
namespace provides isolation for process IDs. If you run top while inside the container, you will notice
that it shows the processes within the PID namespace of the container, which is much different from what
you see if you run top on the host.

3. Inspect the container with docker container exec. The docker container exec command is a way to
"enter" a running container's namespaces with a new process.
Open a new terminal. On cognitiveclass.ai, select Terminal > New Terminal.

Using play-with-docker.com, to open a new terminal connected to node1, click "Add New Instance" on
the lefthand side, then ssh from node2 into node1 using the IP that is listed by 'node1 '. For example:
[node2] (local) [email protected] ~
$ ssh 192.168.0.18
[node1] (local) [email protected] ~
$
In the new terminal, use the docker container ls command to get the ID of the running container you just
created.
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS
PORTS NAMES
b3ad2a23fab3 ubuntu "top" 29 minutes ago Up 29 minutes
goofy_nobel
Then use that ID to run bash inside that container using the docker container exec command. Since we are
using bash and want to interact with this container from our terminal, use the -it flags to run in interactive
mode while allocating a pseudo-terminal.
$ docker container exec -it b3ad2a23fab3 bash
root@b3ad2a23fab3:/#
And Voila! We just used the docker container exec command to "enter" our container's namespaces with
our bash process. Using docker container exec with bash is a common pattern to inspect a docker
container.

Notice the change in the prefix of your terminal. e.g. root@b3ad2a23fab3:/. This is an indication that we
are running bash "inside" of our container.

Note: This is not the same as ssh'ing into a separate host or a VM. We don't need an ssh server to
connect with a bash process. Remember that containers use kernel-level features to achieve isolation and
that containers run on top of the kernel. Our container is just a group of processes running in isolation on
the same host, and we can use docker container exec to enter that isolation with the bash process. After
running docker container exec, the group of processes running in isolation (i.e. our container)
include top and bash.

From the same terminal, run ps -ef to inspect the running processes.
root@b3ad2a23fab3:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 20:34 ? 00:00:00 top
root 17 0 0 21:06 ? 00:00:00 bash
root 27 17 0 21:14 ? 00:00:00 ps -ef
For comparison, exit the container and run ps -ef or top on the host. These commands work on Linux
or Mac. For Windows, you can inspect the running processes using tasklist.

root@b3ad2a23fab3:/# exit
exit
$ ps -ef
# Lots of processes!
Technical deep dive: PID is just one of the Linux namespaces that provide containers with isolation of
system resources. Other Linux namespaces include MNT (mount and unmount directories without
affecting other namespaces) and NET (containers have their own network stack).

Clean up the container running the top process by typing <ctrl>-c, then list all containers and remove
the container by its ID:
4. docker ps -a
5. docker rm <CONTAINER ID>

Result:
EX.NO:10 Run a container from Docker Hub

DATE:

Aim:
To run a container from Docker Hub.
Steps:
1. Explore the Docker Hub

The Docker Hub is the public central registry for Docker images, which contains community and official
images.

When searching for images you will find filters for "Docker Certified", "Verified Publisher" and
"Official Images". Select the "Docker Certified" filter to find images that are deemed enterprise-ready
and are tested with the Docker Enterprise Edition product. It is important to avoid using unverified
content from the Docker Store when developing your own images that are intended to be deployed into
the production environment. These unverified images may contain security vulnerabilities or possibly
even malicious software.

In Step 2 of this lab, we will start a couple of containers using some verified images from the Docker
Hub: nginx web server, and mongo database.

2. Run an Nginx server

Let's run a container using the official Nginx image from the Docker Hub.
$ docker container run --detach --publish 8080:80 --name nginx nginx
Unable to find image 'nginx:latest' locally
latest: Pulling from library/nginx
36a46ebd5019: Pull complete
57168433389f: Pull complete
332ec8285c50: Pull complete
Digest: sha256:c15f1fb8fd55c60c72f940a76da76a5fccce2fefa0dd9b17967b9e40b0355316
Status: Downloaded newer image for nginx:latest
5e1bf0e6b926bd73a66f98b3cbe23d04189c16a43d55dd46b8486359f6fdf048
We are using a couple of new flags here. The --detach flag will run this container in the background.
The --publish flag publishes port 80 in the container (the default port for nginx) via port 8080 on our host.
Remember that the NET namespace gives the processes of the container their own network stack. The
--publish flag is a feature that allows us to expose networking through the container onto the host.

How do you know port 80 is the default port for nginx? Because it is listed in the documentation on the
Docker Hub. In general, the documentation for the verified images is very good, and you will want to
refer to them when running containers using those images.

We are also specifying the --name flag, which names the container. Every container has a name, if you
don't specify one, Docker will randomly assign one for you. Specifying your own name makes it easier
to run subsequent commands on your container since you can reference the name instead of the id of the
container. For example: docker container inspect nginx instead of docker container inspect 5e1.

Since this is the first time you are running the nginx container, it will pull down the nginx image from
the Docker Store. Subsequent containers created from the Nginx image will use the existing image
located on your host.

3. Access the nginx server on localhost:8080: curl localhost:8080


<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
4. If you are using play-with-docker, look for the 8080 link near the top of the page; or, if you are running
a Docker client with access to a local browser, open localhost:8080 in your browser.

5. Run a mongo DB server

Now, run a mongoDB server. We will use the official mongoDB image from the Docker Hub. Instead of
using the latest tag (which is the default if no tag is specified), we will use a specific version of the
mongo image: 4.4.
$ docker container run --detach --publish 8081:27017 --name mongo mongo:4.4
Unable to find image mongo:4.4 locally
4.4: Pulling from library/mongo
d13d02fa248d: Already exists
bc8e2652ce92: Pull complete
3cc856886986: Pull complete
c319e9ec4517: Pull complete
b4cbf8808f94: Pull complete
cb98a53e6676: Pull complete
f0485050cd8a: Pull complete
ac36cdc414b3: Pull complete
61814e3c487b: Pull complete
523a9f1da6b9: Pull complete
3b4beaef77a2: Pull complete
Digest: sha256:d13c897516e497e898c229e2467f4953314b63e48d4990d3215d876ef9d1fc7c
Status: Downloaded newer image for mongo:4.4
d8f614a4969fb1229f538e171850512f10f490cb1a96fca27e4aa89ac082eba5
Again, since this is the first time we are running a mongo container, we will pull down the mongo image
from the Docker Store. We are using the --publish flag to expose the 27017 mongo port on our host. We
have to use a port other than 8080 for the host mapping, since that port is already exposed on our host.
Again refer to the official docs on the Docker Hub to get more details about using the mongo image.

6. Access localhost:8081 to see some output from mongo.

7. curl localhost:8081
This will return a warning from MongoDB:
It looks like you are trying to access MongoDB over HTTP on the native driver port.
8. If you are using play-with-docker, look for the 8081 link near the top of the page.

9. Check your running containers with docker container ls


$ docker container ls
CONTAINER ID   IMAGE    COMMAND                  CREATED                  STATUS          PORTS                     NAMES
d6777df89fea   nginx    "nginx -g 'daemon ..."   Less than a second ago   Up 2 seconds    0.0.0.0:8080->80/tcp      nginx
ead80a0db505   mongo    "docker-entrypoint..."   17 seconds ago           Up 19 seconds   0.0.0.0:8081->27017/tcp   mongo
af549dccd5cf   ubuntu   "top"                    5 minutes ago            Up 5 minutes                              priceless_kepler
You should see that you have an Nginx web server container and a MongoDB container running on your
host. Note that we have not configured these containers to talk to each other.

You can see the "nginx" and "mongo" names that we gave to our containers, and the random name (in
my case "priceless_kepler") that was generated for the ubuntu container. You can also see the port
mappings that we specified with the --publish flag. For more detailed information on these running
containers you can use the docker container inspect [container id] command.

One thing you might notice is that the mongo container is running the docker-entrypoint command. This
is the name of the executable that is run when the container is started. The mongo image requires some
prior configuration before kicking off the DB process. You can see exactly what the script does by
looking at it on github. Typically, you can find the link to the github source from the image description
page on the Docker Store website.

Containers are self-contained and isolated, which means we can avoid potential conflicts between
containers with different system or runtime dependencies. For example: deploying an app that uses Java
7 and another app that uses Java 8 on the same host. Or running multiple nginx containers that all have
port 80 as their default listening ports (if exposing on the host using the --publish flag, the ports selected
for the host will need to be unique). Isolation benefits are possible because of Linux Namespaces.

Note: You didn't have to install anything on your host (other than Docker) to run these processes! Each
container includes the dependencies that it needs within the container, so you don't need to install
anything on your host directly.

Running multiple containers on the same host gives us the ability to fully utilize the resources (cpu,
memory, etc) available on single host. This can result in huge cost savings for an enterprise.

While running images directly from the Docker Hub can be useful at times, it is more useful to create
custom images, and refer to official images as the starting point for these images. We will dive into
building our own custom images in Lab 2.

Step 3: Clean Up

Completing this lab results in a bunch of running containers on your host. Let's clean these up.

1. First get a list of the containers running using docker container ls.

$ docker container ls
CONTAINER ID   IMAGE    COMMAND                  CREATED         STATUS         PORTS                     NAMES
d6777df89fea   nginx    "nginx -g 'daemon ..."   3 minutes ago   Up 3 minutes   0.0.0.0:8080->80/tcp      nginx
ead80a0db505   mongo    "docker-entrypoint..."   3 minutes ago   Up 3 minutes   0.0.0.0:8081->27017/tcp   mongo
af549dccd5cf   ubuntu   "top"                    8 minutes ago   Up 8 minutes                             priceless_kepler

2. Next, run docker container stop [container id] for each container in the list. You can also use the
names of the containers that you specified before.
$ docker container stop d67 ead af5
d67
ead
af5

3. Remove the stopped containers.

docker system prune is a really handy command to clean up your system. It will remove any stopped
containers, unused volumes and networks, and dangling images.
$ docker system prune
WARNING! This will remove:
- all stopped containers
- all volumes not used by at least one container
- all networks not used by at least one container
- all dangling images
Are you sure you want to continue? [y/N] y
Deleted Containers:
7872fd96ea4695795c41150a06067d605f69702dbcb9ce49492c9029f0e1b44b
60abd5ee65b1e2732ddc02b971a86e22de1c1c446dab165462a08b037ef7835c
31617fdd8e5f584c51ce182757e24a1c9620257027665c20be75aa3ab6591740

Total reclaimed space: 12B

Result:
