Cloud Computing Record

INDEX

1. Implementation of Para-Virtualization using VMware's Workstation / Oracle's VirtualBox and Guest OS (09/09/2022)
2. Installation and Configuration of Hadoop (09/09/2022)
3. Hadoop Commands (09/09/2022)
4. Word count application in Hadoop using Java (12/09/2022)
5. Finding the maximum and minimum value in Hadoop using Java (26/09/2022)
6. Python Streaming on Hadoop (26/09/2022)
7. Word count application in Hadoop using Python (29/09/2022)
8. Finding the Maximum and Minimum value in Hadoop using Python (06/10/2022)
9. Cricket score analysis in Hadoop using Python Streaming (31/10/2022)
09/09/2022 Implementation of Para-Virtualization using VMware's Workstation / Oracle's VirtualBox and Guest OS

PROGRAM 1

Para-Virtualization
Para-Virtualization is a computer hardware virtualization technique that allows virtual
machines (VMs) to present an interface similar to that of the underlying or host hardware. The
technique aims to improve the VM's performance by modifying the guest operating system
(OS).
Para-Virtualization (PV) is an enhancement of virtualization technology in which a guest
operating system (guest OS) is modified prior to installation inside a virtual machine (VM),
so that all guest OSes within the system can share resources and collaborate successfully,
rather than attempting to emulate an entire hardware environment.
With para-virtualization, virtual machines can be accessed through interfaces that are similar
to the underlying hardware. This capability minimizes overhead and optimizes system
performance by supporting the use of VMs that would otherwise be underutilized in
conventional, full hardware virtualization.

Installing VirtualBox


STEP 1: Download the VirtualBox 6.1 installation file by navigating to
https://www.virtualbox.org/ in your browser and clicking on the option 'Download
VirtualBox' displayed in the middle of the page.

STEP 2: The above step opens a new page that lists the links to the VirtualBox
binaries and source code. Click on the option 'Windows hosts' listed below the
heading 'VirtualBox 6.1 platform packages', and the VirtualBox EXE file will be
downloaded.


STEP 3: Once the VirtualBox EXE file is downloaded double-click on the EXE file to open
up the VirtualBox installation window.

STEP 4: Follow these steps to install VirtualBox:


➢ Click on the ‘Next’ option


➢ Choose the appropriate options and click on the ‘Next’ option.

➢ Now, click on the option ’Yes’ to proceed with the installation process.


➢ Click on the option ‘Yes’

➢ Click on the option ‘Yes’ when prompted

➢ Once the installation is completed, this prompt opens up. Check 'Start Oracle VM
VirtualBox 6.1.12 after installation' and then click on the 'Finish' option. This will
now open the successfully installed VirtualBox, which enables you to create a
virtual machine to run any operating system on your PC.

Installing Lubuntu on a New Virtual Machine


STEP 1: Once VirtualBox is successfully installed, you are ready to create a new
virtual machine. Click on the 'New' option in the top left corner of the window.

STEP 2: The 'Create Virtual Machine' dialog opens. Choose a descriptive name for the new
virtual machine, along with the type and version of the operating system to be installed.


STEP 3: Now, select the amount of memory to be allocated to the virtual machine. Here the
recommended memory size of 1024 MB is selected. Then click on 'Next'.

STEP 4: To add a new virtual hard disk to the new machine, select the 'Create a virtual
hard disk now' option and click on the 'Create' button.


STEP 5: Choose the hard disk file type as VDI (VirtualBox Disk Image) and click on 'Next'.

STEP 6: Choose the storage type on the physical hard disk as 'Dynamically allocated'
and click on 'Next'.

STEP 7: Now select the file location and the size of the virtual hard disk. Then click
on 'Create'.


STEP 8: Customize the virtual machine. You have created a new virtual machine; now
customize it. Click on the 'Start' option on the menu bar and, when the drop-down box
appears, select 'Normal Start'.

STEP 9: Now select a start-up disk from your PC and click on ‘Start’.


STEP 10: Now select the Preferred language as English

STEP 11: Click on the ‘Install Ubuntu’ option to start the installation process.


STEP 12: Now select the language to be used for the installation process as English and
press Enter.

STEP 13: Perform the following steps to configure your keyboard.


a. Click on 'No' to skip auto-detection of the keyboard layout.

b. Select ‘English (US)’ as the country of origin of the keyboard and the layout of
the keyboard.


STEP 14: The installer now starts loading the additional components.

STEP 15: Now enter an appropriate hostname to configure the network.

STEP 16: Now set up the users and the passwords in the following steps:
a. Enter the full name of the user.

b. Enter the username for your account.

c. Choose a password for the user and retype the password


STEP 17: To configure the clock, click on 'Yes'.

STEP 18: Choose the ‘Guided – use entire disk and set up LVM’ option to partition the
disks.

STEP 19: Select the provided option to partition the disks.


STEP 20: Click on ‘Yes’ to write the changes to disks and configure LVM.

STEP 21: Set the amount of volume group to use for guided partitioning to 10.7 GB.

STEP 22: Click on ‘Yes’ to write the changes to the disk.

STEP 23: The installation of the base system begins.


STEP 24: To configure the package manager, leave the HTTP proxy information blank to
indicate none and click on 'Continue'.

STEP 25: The installation process continues.

STEP 26: Once the installation process has completed successfully, a window pops
up for the user to log in to the system.


09/09/2022 Installation and Configuration of Hadoop


PROGRAM 2

STEP 1: Install OpenJDK on Ubuntu


Use the following command to update your system before initiating a new installation:
sudo apt update

Type the following command in your terminal to install OpenJDK 8:

sudo apt install openjdk-8-jdk -y


Once the installation process is complete, verify the current version:


java -version; javac -version

STEP 2: Set Up a Non-Root User for Hadoop Environment


a. Install OpenSSH on Ubuntu
Install the OpenSSH server and client using the following command:
sudo apt install openssh-server openssh-client -y

b. Enable Passwordless SSH for Hadoop User


Generate an SSH key pair and define the location it is to be stored in:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa


c. Use the cat command to store the public key as authorized_keys in the ssh
directory. Set the permissions for your user with the chmod command:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys

The new user is now able to SSH without needing to enter a password every time.
Verify everything is set up correctly by using the hdoop user to SSH to localhost:
ssh localhost

STEP 3: Download and Install Hadoop on Ubuntu


Visit the official Apache Hadoop project page, and select the version of Hadoop you want to
implement. Use the provided mirror link and download the Hadoop package with the wget
command:
wget https://apachemirror.wuchna.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz

Once the download is complete, extract the files to initiate the Hadoop installation:
tar xzf hadoop-3.2.1.tar.gz

The Hadoop binary files are now located within the hadoop-3.2.1 directory.


STEP 4: Single Node Hadoop Deployment (Pseudo-Distributed Mode)


A. Configure Hadoop Environment Variables (bashrc)
Edit the .bashrc shell configuration file using a text editor of your choice (we will be
using nano):
sudo nano .bashrc

Define the Hadoop environment variables by adding the following content to the end of the
file:
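The screenshot with the variable definitions is omitted here; a typical set of entries, assuming Hadoop 3.2.1 was extracted to the hdoop user's home directory (adjust the paths to your own installation), looks like this:

export HADOOP_HOME=/home/hdoop/hadoop-3.2.1
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"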

Apply the changes to the current running environment by using the following command:
source ~/.bashrc
B. Edit hadoop-env.sh File
sudo nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh


Uncomment the $JAVA_HOME variable (i.e., remove the # sign) and set it to the full path of
the OpenJDK installation on your system by adding the following line:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
The path needs to match the location of the Java installation on your system.

C. Edit core-site.xml File


Open the core-site.xml file in a text editor:
sudo nano $HADOOP_HOME/etc/hadoop/core-site.xml

Add the following configuration to override the default values for the temporary
directory and add your HDFS URL to replace the default local file system setting.
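The configuration itself appears only as a screenshot; a minimal sketch, with an assumed temporary directory and an assumed single-node HDFS URL, is:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hdoop/tmpdata</value>  <!-- assumed temporary directory -->
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://127.0.0.1:9000</value>  <!-- assumed single-node HDFS URL -->
  </property>
</configuration>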


D. Edit hdfs-site.xml File


Use the following command to open the hdfs-site.xml file for editing:
sudo nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
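The properties added here (shown only in the omitted screenshot) normally define the NameNode and DataNode storage directories and set the replication factor to 1 for a single-node cluster; the directory paths below are assumptions:

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hdoop/dfsdata/namenode</value>  <!-- assumed path -->
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hdoop/dfsdata/datanode</value>  <!-- assumed path -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>  <!-- single node, so no replication -->
  </property>
</configuration>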

E. Edit mapred-site.xml File


Use the following command to access the mapred-site.xml file and define MapReduce values:
sudo nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
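For a pseudo-distributed setup the only value usually needed is the framework name; a sketch:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>  <!-- run MapReduce jobs on YARN -->
  </property>
</configuration>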


F. Edit yarn-site.xml File


sudo nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
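The screenshot with the YARN properties is omitted; a typical single-node configuration is sketched below (the hostname is an assumption):

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>  <!-- enable the shuffle service -->
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>127.0.0.1</value>  <!-- assumed single-node host -->
  </property>
</configuration>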

STEP 5: Format HDFS NameNode


It is important to format the NameNode before starting Hadoop services for the first time:
hdfs namenode -format
The shutdown notification signifies the end of the NameNode format process.


STEP 6: Start Hadoop Cluster


Navigate to the hadoop-3.2.1/sbin directory and execute the following commands to start the
NameNode and DataNode:
./start-dfs.sh

Once the namenode, datanodes, and secondary namenode are up and running, start the YARN
resource manager and node managers by typing:
./start-yarn.sh

Type this simple command to check if all the daemons are active and running as Java
processes:
jps

If everything is working as intended, the resulting list of running Java processes contains all
the HDFS and YARN daemons.


STEP 7: Access Hadoop UI from Browser


Use your preferred browser and navigate to your localhost URL or IP. The default port
number 9870 gives you access to the Hadoop NameNode UI: http://localhost:9870
The NameNode user interface provides a comprehensive overview of the entire cluster.

The default port 9864 is used to access individual DataNodes directly from your browser:
http://localhost:9864


The YARN Resource Manager is accessible on port 8088:


http://localhost:8088


09/09/2022 HDFS Commands


PROGRAM 3

1. Navigate to the Hadoop-3.3.4 directory and start DFS

2. Listing the directories in the Hadoop folder

3. Create a directory on your Desktop named MSc_Practicals

4. Move the MSc_Practicals folder from the desktop to HDFS

5. Delete the MSc_Practicals folder from the desktop and move it from HDFS to
desktop


6. Create another folder named CC_lab inside MSc_Practicals

7. Create a text file titled file on the desktop. Move the text file to the Hadoop file
system and open the text file

8. Read the text file using the cat command
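The shell sessions for these steps exist only as screenshots; a sketch of the corresponding commands, with the file name and HDFS destination paths assumed, is:

cd ~/hadoop-3.3.4 && sbin/start-dfs.sh            # 1. start DFS
hdfs dfs -ls /                                    # 2. list the directories
mkdir ~/Desktop/MSc_Practicals                    # 3. create the Desktop folder
hdfs dfs -put ~/Desktop/MSc_Practicals /          # 4. move the folder into HDFS
rm -r ~/Desktop/MSc_Practicals                    # 5. delete the local copy...
hdfs dfs -get /MSc_Practicals ~/Desktop           #    ...and move it back from HDFS
hdfs dfs -mkdir /MSc_Practicals/CC_lab            # 6. create CC_lab inside MSc_Practicals
hdfs dfs -put ~/Desktop/file.txt /MSc_Practicals  # 7. move the text file into HDFS
hdfs dfs -cat /MSc_Practicals/file.txt            # 8. read the file with cat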


12/09/2022 Word count application in Hadoop using Java


PROGRAM 4

Create a job and submit it to the cluster


1. Check the Hadoop and Java versions

2. Start the daemons for DFS and YARN

3. Create a folder WordCount on the Desktop

4. Paste the WordCount.java file into WordCount folder

WordCount.java

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}


5. a. Create a folder input_data inside WordCount folder

b. Create an input.txt file and add some content to it

c. Create a folder tutorial_classes inside WordCount folder

6. Export Hadoop classpath
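The command in the omitted screenshot is presumably the standard one:

export HADOOP_CLASSPATH=$(hadoop classpath)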


7. Make a WordCountApp directory and another folder input inside the WordCountApp
folder in HDFS

8. Move the input.txt file into Hadoop Folder

9. Compile the Java code, placing the generated class files in the tutorial_classes folder

10. Go to the WordCount directory using cd and put the output files in one jar file


Run the jar file on Hadoop
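The compile, packaging, and run commands appear only as screenshots; a sketch under the folder layout above (the HDFS output path is an assumption):

javac -classpath $HADOOP_CLASSPATH -d tutorial_classes WordCount.java
jar -cvf WordCount.jar -C tutorial_classes .
hadoop jar WordCount.jar WordCount /WordCountApp/input /WordCountApp/output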


Execute the cat command to view the output
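Assuming the output path used above, this amounts to:

hadoop fs -cat /WordCountApp/output/*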


localhost:8088

localhost:9870


localhost:9864


26/09/2022 Finding the maximum and minimum value in Hadoop using Java

PROGRAM 5

1. Verify if all the nodes are working properly

2. Create a directory named maxmin on the Desktop

3. Place the MyMaxMin.java file into maxmin folder on Desktop


MyMaxMin.java

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MyMaxMin {

  public static class MaxTemperatureMapper
      extends Mapper<LongWritable, Text, Text, Text> {

    public void map(LongWritable arg0, Text Value, Context context)
        throws IOException, InterruptedException {
      // Convert the input record to a string before slicing out the fields
      String line = Value.toString();
      if (!(line.length() == 0)) {
        String date = line.substring(6, 14);
        float temp_Max = Float.parseFloat(line.substring(39, 45).trim());
        float temp_Min = Float.parseFloat(line.substring(47, 53).trim());
        if (temp_Max > 35.0) {
          context.write(new Text("Hot Day " + date),
              new Text(String.valueOf(temp_Max)));
        }
        if (temp_Min < 10) {
          context.write(new Text("Cold Day " + date),
              new Text(String.valueOf(temp_Min)));
        }
      }
    }
  }

  public static class MaxTemperatureReducer
      extends Reducer<Text, Text, Text, Text> {

    public void reduce(Text Key, Iterable<Text> Values, Context context)
        throws IOException, InterruptedException {
      // Emit the first temperature reported for each day
      String temperature = Values.iterator().next().toString();
      context.write(Key, new Text(temperature));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "weather example");
    job.setJarByClass(MyMaxMin.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setMapperClass(MaxTemperatureMapper.class);
    job.setReducerClass(MaxTemperatureReducer.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    Path OutputPath = new Path(args[1]);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Delete the output directory if it already exists so the job can rerun
    OutputPath.getFileSystem(conf).delete(OutputPath, true);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

4. Create a folder input_data inside maxmin folder


5. Put the weather_data.txt file inside the input_data folder

6. Create a folder weather_dataset inside maxmin folder

7. Export Hadoop classpath

8. Make a maxmin directory and another folder input inside maxmin folder in
HDFS


9. Move the weather_data.txt file into Hadoop folder

10. Compile java code – Create a file based on the Class folder

11. Go to the maxmin directory using cd and put the output files in one jar file

12. Run the jar file on Hadoop
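As with the word count program, the commands exist only as screenshots; a sketch, assuming weather_dataset is the class-output folder and the HDFS paths from step 8:

javac -classpath $HADOOP_CLASSPATH -d weather_dataset MyMaxMin.java
jar -cvf MyMaxMin.jar -C weather_dataset .
hadoop jar MyMaxMin.jar MyMaxMin /maxmin/input /maxmin/output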


13. Use the cat command to view the output


29/09/2022 Python Streaming on Hadoop


PROGRAM 6

STEP 1: Executing mapreduce programs using python

• To serve the input to the python program, we use the cat command
cat input_file | mapfunction.py

• To run the entire MapReduce in one line, we can make use of:
cat input_file | mapfunction.py | reducefunction.py

STEP 2: Changing the permission of the python script

• To make sure that the python script can successfully be executed, we must give or change
the permission rights to executable

• This can be achieved with the help of the chmod command


chmod +x python_filename.py

STEP 3: Move the input file to HDFS

• Move the input file to HDFS using the put command.

• Make sure the file exists in HDFS by making use of the ls command.

STEP 4: Components of the hadoop jar command

• HADOOP_STREAMING_PATH: $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-3.2.1.jar
This notifies the hadoop jar command of the presence of Hadoop streaming so that we can use
this instead of the javac compiler. Make sure that the jar actually exists at the given
path.

• INPUT_DIRECTORY: -input /python/input/sample.txt


This gives the current position of the input file in HDFS

• OUTPUT_DIRECTORY: -output /python/output


This notifies the hadoop jar command of the exact position where the output file should
appear.


• MAPPER_DIRECTORY: -mapper ~/Desktop/map-red-python/mapper.py


This notifies the hadoop jar command of the exact file path of the mapper program.

• REDUCER_DIRECTORY: -reducer ~/Desktop/map-red-python/reducer.py


This notifies the hadoop jar command of the exact directory of the reducer program.

STEP 5: Running the python script in Hadoop

• hadoop jar HADOOP_STREAMING_PATH INPUT_DIRECTORY OUTPUT_DIRECTORY MAPPER_DIRECTORY REDUCER_DIRECTORY

The python script can be run with the help of Hadoop streaming using the hadoop jar
command. The above line gives the syntax of the same.
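Substituting the components from STEP 4 gives a command of this shape (the jar version and all paths are assumptions that must match your setup):

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-3.2.1.jar \
  -input /python/input/sample.txt \
  -output /python/output \
  -mapper ~/Desktop/map-red-python/mapper.py \
  -reducer ~/Desktop/map-red-python/reducer.py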

STEP 6: Viewing the output from HDFS

• hadoop fs -cat OUTPUT_DIRECTORY/*


The output which is saved in the OUTPUT_DIRECTORY can be viewed using the cat
command as shown in the above code


29/09/2022 Word count application in Hadoop using Python


PROGRAM 7

Check the Hadoop and Java versions

Start all the daemons and verify using the jps command

Paste the mapper.py and reducer.py files into the map-red-python folder
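The two scripts appear only as screenshots; a minimal word-count pair in the usual Hadoop streaming style (reading stdin, writing tab-separated key/value pairs to stdout) would look roughly like this:

mapper.py:

#!/usr/bin/env python3
# Emit each word with a count of 1
import sys

for line in sys.stdin:
    for word in line.strip().split():
        print(f"{word}\t1")

reducer.py:

#!/usr/bin/env python3
# Sum the counts per word; streaming delivers the mapper output sorted by key
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.strip().rsplit("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, int(count)
if current_word is not None:
    print(f"{current_word}\t{current_count}")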


Create a sample.txt file in the map-red-python folder


Executing the map reduce program in python

Changing the permissions of the python script using chmod command


Create input directory in python folder and move the sample.txt to Hadoop file system

Running the python script in Hadoop using jar command


View the output using cat command


06/10/2022 Finding the Maximum and Minimum value in Hadoop using Python

PROGRAM 8

Start all the daemons and verify they are working properly

Create a folder minmaxtemp on the Desktop

Paste the mapper.py and reducer.py files into the minmaxtemp folder
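These scripts are also shown only as screenshots; a minimal sketch, assuming each input line carries a date and a temperature separated by whitespace, could be:

mapper.py:

#!/usr/bin/env python3
# Pass each (date, temperature) record through as a tab-separated pair
import sys

for line in sys.stdin:
    parts = line.strip().split()
    if len(parts) == 2:
        date, temp = parts
        print(f"{date}\t{temp}")

reducer.py:

#!/usr/bin/env python3
# Track the overall maximum and minimum temperature across all records
import sys

max_temp = float("-inf")
min_temp = float("inf")
for line in sys.stdin:
    _, temp = line.strip().split("\t")
    value = float(temp)
    max_temp = max(max_temp, value)
    min_temp = min(min_temp, value)
print(f"Maximum\t{max_temp}")
print(f"Minimum\t{min_temp}")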


Create a sample.txt file in the minmaxtemp folder

Executing the map reduce program in python

Changing the permissions of the python script using chmod command


Move the temperature.txt file to Hadoop file system

Running the python script in Hadoop using jar command


View the output using cat command


31/10/2022 Cricket score analysis in Hadoop using Python Streaming

PROGRAM 9

Start all the daemons and verify they are working properly

Create a folder cricket on the desktop

Create a .txt file named player.txt in the cricket folder


The txt file consists of the name of the player (first 4 letters), the runs scored in an
innings, and the wickets taken in the same innings. Multiple records for the same player
represent different innings.
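The actual file contents are shown only in a screenshot; hypothetical lines in the described format (player, runs, wickets) would look like:

SACH 45 0
SACH 112 1
VIRA 77 0
JADE 23 3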


Create a mapper file in the cricket folder for finding the batting average

Create another mapper file in the cricket folder for finding the wickets average

Create a reducer file in the cricket folder for finding the batting average


Create another reducer file in the cricket folder for finding the wickets average
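All four scripts appear only as screenshots; a minimal sketch of the batting-average pair, assuming the player.txt format above, is given below (the wickets-average pair is analogous, emitting the third field instead of the second; the file names are hypothetical):

mapper_runs.py:

#!/usr/bin/env python3
# Emit player and runs scored for each innings
import sys

for line in sys.stdin:
    parts = line.strip().split()
    if len(parts) == 3:
        player, runs, wickets = parts
        print(f"{player}\t{runs}")

reducer_runs.py:

#!/usr/bin/env python3
# Average the runs per player; input arrives sorted by player
import sys

current, total, innings = None, 0, 0
for line in sys.stdin:
    player, runs = line.strip().split("\t")
    if player != current:
        if current is not None:
            print(f"{current}\t{total / innings:.2f}")
        current, total, innings = player, 0, 0
    total += int(runs)
    innings += 1
if current is not None:
    print(f"{current}\t{total / innings:.2f}")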

Changing the permissions of the python script using chmod command

View the output of the mapper files in sorted order using the cat command


Run the reducer files on their respective mapper files and check if the python code is
working properly

Move the playerstats.txt file to Hadoop file system


Running the python script in Hadoop using jar command


Average Runs scored


Average Wickets Taken

View the output using cat command


Output for Average runs scored

Output for Average Wickets Taken
