Ccs335 Lab Manual-2
7 HADOOP INSTALLATION
8 A WORD COUNT PROGRAM USING MAP-REDUCE TASKS
9 CREATE AND EXECUTE CONTAINER USING DOCKER
10 RUN A CONTAINER FROM DOCKER HUB
CONTENT BEYOND THE SYLLABUS
AIM:
PROCEDURE :
Locate the installer on the system and double-click it to launch the application.
The User Account Control (UAC) dialog box appears. Click Yes to continue.
The initial splash screen appears. Wait for the process to complete.
The VMware Workstation Setup wizard dialog box appears. Click Next to continue.
The End User Licence Agreement dialog box appears. Select the "I accept the terms
in the Licence Agreement" check box and press Next to continue.
Select the folder in which to install the application. There is no harm in leaving the
defaults as they are. Also select the Enhanced Keyboard Driver check box.
Next come the "Check for Updates" and "Help improve VMware Workstation Pro" options. Normally leave
them at their defaults, that is, unchecked.
The next step is to select where on the system the shortcut icons for launching the
application should be placed. Select both options, Desktop and Start Menu, and click Next.
Now the installation dialog box appears. Click Install to start the installation process.
The installation begins. Wait for the installation process to complete.
VMware Workstation 15 pro installation process
After the installation process is completed, the installation complete dialog box appears. Click
Finish. We may be asked to restart the computer; click Yes to restart.
After the installation is completed, the VMware Workstation icon appears on the desktop. Double-click it to launch the
application.
RESULT:
EX NO: 2
INSTALL C COMPILER IN THE VIRTUAL MACHINE CREATED
USING VIRTUAL BOX AND EXECUTE SIMPLE PROGRAM
DATE:
AIM:
REQUIREMENTS:
PROCEDURE:
STEP 1:
ubuntu_gt6 installation:
Open VirtualBox
Select File > Import Appliance
Browse to the ubuntu_gt6.ova file
Then go to Settings, select USB and choose USB 1.1
Then start ubuntu_gt6
Log in using username: dinesh, password: 99425.
STEP 2:
Open the terminal
STEP 3:
# To install gcc
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-6 gcc-6-base
STEP 4:
Type a sample C program and save it:
gedit hello.c
STEP 5:
To compile and run the sample C program:
gcc hello.c
./a.out
OUTPUT:
RESULT:
EX NO: 3
INSTALL GOOGLE APP ENGINE AND CREATE SIMPLE WEB
APPLICATIONS
DATE:
AIM:
PROCEDURE:
The App Engine SDK allows you to run Google App Engine applications on your local computer. It
simulates the run-time environment of the Google App Engine infrastructure.
STEP 1:
STEP 2:
Download and install the Google App Engine SDK from the link below, choosing the appropriate
installer package:
http://code.google.com/appengine/downloads.html
STEP 3:
Download the Windows installer into the Desktop folder.
STEP 4:
Double Click on the Google App Engine installer.
STEP 5 :
Click through the installation wizard to install the Google App Engine SDK. If Python
2.5 is not already present, it will be installed as well. Once the install is complete, you can discard the downloaded
installer.
STEP 6:
Now create a simple application. Use the "+" option to have the launcher make an application. Make a
folder "apps" for your Google App Engine applications on the Desktop: C:\Desktop\apps
STEP 7:
Then make a sub-folder within apps called "ae-01-trivial"; the path to this folder would
be: C:\Desktop\apps\ae-01-trivial
STEP 8:
Using a text editor such as JEdit (www.jedit.org), create a file called app.yaml in
the ae-01-trivial folder with the following contents:
application: ae-01-trivial
version: 1
runtime: python
api_version: 1
handlers:
- url: /.*
  script: index.py
STEP 9:
Then create a file in the ae-01-trivial folder called index.py containing three lines of Python.
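The manual does not reproduce the three lines; a sketch consistent with the classic App Engine tutorial example (assumed contents) is shown below. The original SDK runs Python 2.5, but the parenthesised print calls work under both Python 2 and 3, so the script can be smoke-tested locally:

```shell
# Create index.py with a three-line CGI-style response
# (assumed contents, modelled on the classic App Engine tutorial)
cat > index.py <<'EOF'
print('Content-Type: text/plain')
print('')
print('Hello, World')
EOF

# Quick local check of the response the dev server would emit
python3 index.py
```

The first printed line is the Content-Type header, followed by a blank line and the body.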
Then start the Google App Engine Launcher program that can be found under Applications.
RESULT:
EX NO: 4 USE GAE LAUNCHER TO LAUNCH THE WEB APPLICATIONS
DATE:
AIM:
PROCEDURE:
STEP 1:
Start the Google App Engine Launcher program that can be found under Applications.
STEP 2:
Use the File > Add Existing Application command, navigate into the apps directory and select the
ae-01-trivial folder. Once you have added the application, select it so that you can control it
using the launcher.
STEP 3:
Once you have selected your application, press Run. After a few moments your application will start
and the launcher will show a little green icon next to it.
STEP 4:
Then press Browse to open a browser pointing at your application, which is running at http://localhost:8080/
Paste http://localhost:8080 into the browser and you should see your application as follows:
STEP 5:
With two files to edit, there are two general categories of errors that you may encounter. If you make a
mistake in the app.yaml file, the App Engine will not start and the launcher will show a yellow icon
near your application.
To get more detail on what is going wrong, take a look at the log for the application:
In this instance, the mistake is mis-indenting the last line in the app.yaml (line 8).
STEP 7:
When we make a mistake in the app.yaml file, we must fix the mistake and attempt to start the
application again. But if we make a mistake in a file like index.py, we can simply fix the file and press
refresh in the browser; there is no need to restart the server.
STEP 8:
To shut down the server, use the Launcher: select the application and press the Stop button.
RESULT :
EX NO: 5
SIMULATE A CLOUD SCENARIO USING CLOUDSIM AND RUN A
SCHEDULING ALGORITHM
DATE:
AIM:
PROCEDURE:
CloudSim does not have to be installed. Normally, we can unpack the downloaded package in any
directory, add it to the Java classpath and it is ready to be used. Please verify whether Java is
installed on your system.
STEP 1:
The first step is to initialise the CloudSim package by initialising the CloudSim library,
as follows:
CloudSim.init(num_user, calendar, trace_flag)
The second step is to create a Datacenter, and the third step is to create a DatacenterBroker.
The fourth step is to create one virtual machine, specifying the unique ID of the VM, userId (the ID of the VM's
owner), mips, numberOfPes (the number of CPUs), the amount of RAM, the amount of bandwidth, the amount of
storage, the virtual machine monitor, and the cloudletScheduler policy for cloudlets, then submit it to the broker:
broker.submitVmList(vmlist)
The fifth step is to create a cloudlet with length, file size, output size, and utilisation model.
STEP 6:
After completing all the processes, start the simulation as follows:
CloudSim.startSimulation()
OUTPUT:
Sample Output from the Existing Example:
Starting CloudSimExample1...
Initialising...
Starting CloudSim version 3.0
Datacenter_0 is starting...
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>null
CloudSimExample1 finished!
RESULT:
EX NO: 6
FILE TRANSFER FROM ONE VIRTUAL MACHINE TO OTHER
DATE:
AIM:
REQUIREMENTS:
1. ORACLE VIRTUAL BOX
2. OPEN NEBULA SANDBOX
3. UBUNTU_GT6.OVA
PROCEDURE:
STEP:1
Open Browser, type localhost:9869
STEP:2
Login using username: oneadmin, password: opennebula
STEP:3
STEP:4
Before migration
After Migration:
RESULT:
EX NO :7
HADOOP INSTALLATION
DATE:
AIM:
PROCEDURE:
STEP 1: INSTALLING JAVA
Step 1.1: Download the JDK tar.gz file for Ubuntu 64-bit OS
$ tar zxvf jdk-8u60-linux-x64.tar.gz
$ cd jdk1.8.0_60/
$ pwd
/home/Downloads/jdk1.8.0_60 (copy the path of the JDK)
Step 1.2: To set the environment variables for Java
$ sudo nano /etc/profile
(enter the password when prompted)
Add the following three lines in the middle of the file:
JAVA_HOME=/home/Downloads/jdk1.8.0_60 (paste the copied path here)
PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME
Save the file by pressing Ctrl+X, then press Y and Enter.
Step 1.3: source /etc/profile
Step 1.4: java -version
You will get "Java HotSpot(TM) 64-Bit Server" as the last line.
If you do not get this, update the Java alternatives:
# update-alternatives --install "/usr/bin/java" "java" /home/Downloads/jdk1.8.0_60/bin/java 1
# java -version
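The three profile lines from Step 1.2 behave like the following shell snippet, using the same example JDK path as above:

```shell
# Set JAVA_HOME and put the JDK's bin directory on the PATH
# (same example path as in Step 1.2)
JAVA_HOME=/home/Downloads/jdk1.8.0_60
PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME

# Verify that the variable is set as expected
echo "$JAVA_HOME"
```

Because /etc/profile is read at login, the exports take effect in new shells, or immediately after source /etc/profile as in Step 1.3.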
Add the following property to etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Add the following property to etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Add the following property to etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
STEP 5: FORMAT THE HDFS FILE SYSTEM (via the name node, for the first time)
#cd $HADOOP_PREFIX
#bin/hadoop namenode -format
Step 5.1: Start the name node and data node daemons (port 50070)
#sbin/start-dfs.sh
Press y, yes
If an error occurs, run ssh-add
To see the running name node and data node daemons, type:
#jps
[In a browser, type localhost:50070 and press Enter; the name node information is displayed]
Step 5.2: Start the resource manager and node manager daemons (port 8088)
#sbin/start-yarn.sh
#jps
Step 5.3:To stop the running process
#sbin/stop-dfs.sh
#sbin/stop-yarn.sh
OUTPUT:
RESULT:
EX NO :8
A WORD COUNT PROGRAM USING MAP-REDUCE TASKS
DATE:
AIM:
PROCEDURE:
STEP: 1
Download Hadoop-core-1.2.1.jar, which is used to compile and execute the MapReduce program.
Visit the following link http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1 to
download the jar. Let us assume the downloaded folder is /home/hadoop/.
STEP: 2
The following commands are used for compiling the WordCount.java program.
javac -classpath hadoop-core-1.2.1.jar -d . WordCount.java
STEP: 3
Create a jar for the program.
jar -cvf sample1.jar sample1/
STEP: 4
cd $HADOOP_PREFIX
bin/hadoop namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh
jps
STEP: 5
The following command is used to create an input directory in HDFS.
bin/hdfs dfs -mkdir /input
STEP: 6
The following command is used to copy the input file named sal.txt into the input directory of HDFS.
bin/hdfs dfs -put /home/it08/Downloads/sal.txt /input
STEP: 7
The following command is used to run the application by taking the input files from the input directory.
bin/hadoop jar /home/it08/Downloads/sample1.jar sample1.WordCount /input /output
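Before running the job on the cluster, the expected counts can be sanity-checked locally: the classic shell pipeline below performs the same map (split into words), shuffle (sort), and reduce (count) phases as the MapReduce program. The sample input is hypothetical, since the contents of sal.txt are not given in the manual.

```shell
# Hypothetical sample input (sal.txt itself is not listed in the manual)
printf 'apple banana apple\nbanana apple cherry\n' > sal_sample.txt

# map: one word per line; shuffle: sort; reduce: count duplicates
tr -s ' ' '\n' < sal_sample.txt | sort | uniq -c
```

For the sample above, apple appears 3 times, banana 2, and cherry 1, which is exactly the (word, count) output the WordCount job writes to /output.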
PROGRAM:
WordCount.java
package sample1;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCount {
public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{
EX NO :9
CREATE AND EXECUTE CONTAINER USING DOCKER
DATE:
AIM:
PROCEDURE:
$ docker --help
docker version
Client:
Version: 19.03.6
...
Note that Docker installs both a client and a server, the Docker Engine. For instance, if you run the same
command for podman, you will see only a CLI version, because podman runs daemonless and relies on an
OCI-compliant container runtime (runc, crun, runv, etc.) to interface with the OS and create the running containers.
Use the docker container run command to run a container with the ubuntu image, running the top command.
The -it flags allocate an interactive pseudo-TTY, which top needs to work correctly.
$ docker container run -it ubuntu top
from library/ubuntu
aafe6b5e13de: Pull complete
0a2b43a72660: Pull complete
18bdd1e546d2: Pull complete
8198342c3e05: Pull complete
f56970a44fd4: Pull complete
Digest: sha256:f3a61450ae43896c4332bda5e78b453f4a93179045f20c8181043b26b5e79028
The docker run command will result first in a docker pull to download the ubuntu image onto your host.
Once it is downloaded, it will start the container. The output for the running container should look like
this:
top - 20:32:46 up 3 days, 17:40, 0 users, load average: 0.00, 0.01, 0.00
Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2046768 total, 173308 free, 117248 used, 1756212 buff/cache
KiB Swap: 1048572 total, 1048572 free, 0 used. 1548356 avail Mem
Containers use linux namespaces to provide isolation of system resources from other containers or the
host. The PID namespace provides isolation for process IDs. If you run top while inside the container, you
will notice that it shows the processes within the PID namespace of the container, which is much different
than what you can see if you ran top on the host.
Even though we are using the ubuntu image, it is important to note that our container does not have its
own kernel. It uses the kernel of the host, and the ubuntu image is used only to provide the file system
and tools available on an ubuntu system.
List the running containers with docker container ls and copy the container id. Then use that id to run
bash inside the container using the docker container exec command. Since we are
using bash and want to interact with this container from our terminal, use the -it flags to run in interactive
mode while allocating a pseudo-terminal.
Notice the change in the prefix of your terminal. e.g. root@b3ad2a23fab3:/. This is an indication that
we are running bash "inside" of our container.
From the same terminal, run ps -ef to inspect the running processes.
root@b3ad2a23fab3:/# ps -ef
UID PID PPID C STIME TTY TIME
CMD root 1 0 0 20:34 ? 00:00:00
top
root 17 0 0 21:06 ? 00:00:00 bash
root 27 17 0 21:14 ? 00:00:00 ps -ef
You should see only the top process, bash process and our ps process.
For comparison, exit the container, and run ps -ef or top on the host. These commands work on Linux
or Mac. On Windows, you can inspect the running processes using tasklist.
root@b3ad2a23fab3:/# exit
exit
$ ps -ef
# Lots of processes!
Step 4: Clean up the container running the top process by typing <ctrl>-c, then list all containers and remove
them by their id.
docker ps -a
docker rm <CONTAINER ID>
RESULT:
EX NO :10
RUN A CONTAINER FROM DOCKER HUB
DATE:
AIM:
PROCEDURE:
docker version
docker login
STEP: 3 - Run the downloaded Docker Image & Access the Application
docker ps
docker ps -a
docker ps -a -q
docker rm <container-name>
docker images
RESULT:
EX NO :11
PROGRAM TO FIND MAXIMUM DATA USING
MAP AND REDUCE TASKS
DATE:
AIM:
PROCEDURE:
STEP: 1
The following commands are used for compiling the MaxData.java program.
javac -classpath hadoop-core-1.2.1.jar -d . MaxData.java
STEP: 2
Create a jar for the program.
jar -cvf sample1.jar sample1/
STEP: 3
cd $HADOOP_PREFIX
bin/hadoop namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh
jps
STEP: 4
The following command is used to create an input directory in HDFS.
bin/hdfs dfs -mkdir /input
STEP: 5
The following command is used to copy the input file named sal.txt into the input directory of HDFS.
bin/hdfs dfs -put /home/it56/Downloads/sal.txt /input
STEP: 6
The following command is used to run the application by taking the input files from the input directory.
bin/hadoop jar /home/it08/Downloads/sample1.jar sample1.MaxData /input /output
PROGRAM:
MaxData.java
package sample1;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
while (itr.hasMoreTokens()) {
word.set(Integer.parseInt(itr.nextToken()));
context.write(new Text(str1),word);
}
}
}
OUTPUT:
RESULT:
EX NO :12
ATTACH A VIRTUAL BLOCK TO VIRTUAL
MACHINE
DATE:
AIM:
REQUIREMENTS:
PROCEDURE:
METHOD 1:
1. Open the VirtualBox application.
2. Power off the VM to which you want to add a virtual disk.
3. Then right-click on that VM and select Settings.
4. Then click on Storage and find Controller: IDE.
5. At the top right, find the Add Hard Disk icon; a pop-up window is displayed.
6. In that window select Create New Disk, then click Next, Next, and then Finish.
7. Then find the Attributes section and set the hard disk as IDE Secondary Slave.
METHOD 2:
AIM:
REQUIREMENTS:
PROCEDURE:
Nova compute instances support the attachment and detachment of Cinder storage volumes. This procedure
details the steps involved in creating a logical volume in the cinder-volumes volume group using the cinder
command line interface.
METHOD: 2
STEP: 1
Select the appropriate project from the drop down menu at the top left.
STEP: 3
On the Project tab, open the Compute tab and click Access & Security category.
STEP: 4
On the Access & Security tab, click API Access category and click Download Openstack RC File V2.0
STEP: 5
source ./admin-openrc.sh
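The RC file downloaded in Step 4 is a short shell script that exports the OS_* credentials the cinder client reads. A minimal sketch of its shape follows; the variable names are the standard OpenStack v2.0 ones, but the values here are placeholders, not real credentials:

```shell
# admin-openrc.sh (sketch; placeholder values, not real credentials)
export OS_USERNAME=admin
export OS_PASSWORD=secret
export OS_TENANT_NAME=admin
export OS_AUTH_URL=http://controller:5000/v2.0

# After "source ./admin-openrc.sh", clients such as cinder read these variables
echo "$OS_AUTH_URL"
```

Sourcing the file (rather than executing it) is what makes the exports visible to subsequent cinder commands in the same shell.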
STEP: 6
Create Cinder Volume. Use the cinder create command to create a new volume.
$ cinder create --display_name NAME SIZE
OUTPUT:
RESULT:
EX NO :14
MOUNT THE ONE NODE HADOOP CLUSTER USING FUSE
DATE:
AIM:
PROCEDURE:
STEP: 1
wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb
STEP: 2
sudo dpkg -i cdh5-repository_1.0_all.deb
STEP: 3
sudo apt-get update
STEP: 4
sudo apt-get install hadoop-hdfs-fuse
STEP: 5
sudo mkdir -p xyz
STEP: 6
cd hadoop-2.7.0/
STEP: 7
bin/hadoop namenode -format
STEP: 8
sbin/start-all.sh
STEP: 9
hadoop-fuse-dfs dfs://localhost:9000 /home/it08/Downloads/xyz/
STEP: 10
sudo chmod 777 /home/it08/Downloads/xyz/
STEP: 11
hadoop-fuse-dfs dfs://localhost:9000 /home/it08/Downloads/xyz/
STEP: 12
cd /home/it08/Downloads/xyz/
STEP: 13
mkdir a
ls
OUTPUT:
RESULT: