Big Data Analytics – Lab Manual

‭LAB MANUAL‬

‭PRACTICAL NO – 1‬

‭Exp No:‬

‭Date:‬

‭Aim:‬‭Installation of Single Node Hadoop Cluster on Ubuntu‬

THEORY:
Apache Hadoop 3.1 has noticeable improvements and many bug fixes over the previous stable 3.0 releases. This version has many improvements in HDFS and MapReduce. This how-to guide will help you to set up a Hadoop 3.1.0 single-node cluster on CentOS/RHEL 7/6/5, Ubuntu 18.04, 17.10, 16.04 & 14.04, Debian 9/8/7 and Linux Mint systems. This article has been tested with Ubuntu 18.04 LTS.

1. Prerequisites
Java is the primary requirement for running Hadoop on any system, so make sure you have Java installed. If you don't have Java installed on your system, use one of the following links to install it first. Hadoop supports only Java 8; if any other version is already present, uninstall it using these commands.
sudo apt-get purge openjdk-\* icedtea-\* icedtea6-\*
OR
sudo apt remove openjdk-8-jdk

‭∙‬‭Step 1.1 – Install Oracle Java 8 on Ubuntu‬


You need to enable an additional repository on your system to install Java 8 on an Ubuntu VPS. After that, install Oracle Java 8 on the Ubuntu system using apt-get. This repository contains a package named oracle-java8-installer, which is not an actual Java package; instead, it contains a script that installs Java on Ubuntu. Run the commands below to install Java 8 on Ubuntu and Linux Mint.
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
OR
sudo apt install openjdk-8-jre-headless
sudo apt install openjdk-8-jdk
∙ Step 1.2 – Verify Java Installation
The apt repository also provides the package oracle-java8-set-default to set Java 8 as your default Java version. This package will be installed along with the Java installation. To make sure, run the command below.
sudo apt-get install oracle-java8-set-default
After successfully installing Oracle Java 8 using the above steps, let's verify the installed version using the following command.
java -version
java version "1.8.0_201"

Java(TM) SE Runtime Environment (build 1.8.0_201-b09)


Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
∙ Step 1.3 – Setup the JAVA_HOME and JRE_HOME Variables
Add the Java path to the JAVA_HOME variable in the .bashrc file. Go to your home directory and, in the folder options, enable "show hidden files"; a .bashrc file will then be visible. Open the file and add the following line at the end.
NOTE: The path will be the location on your PC where Java is installed.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
NOTE: After making all the changes and saving the file, run the following command to apply the changes from .bashrc.
source ~/.bashrc
All done, you have successfully installed Java 8 on a Linux system.

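For reference, a minimal sketch of the lines this step adds to ~/.bashrc, assuming the Oracle Java 8 path used above (adjust the path to wherever Java is installed on your machine; the JRE_HOME line is optional and is included only because the step title mentions it):
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin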
2. Create Hadoop User
We recommend creating a normal (non-root) account for Hadoop to work under. To create the account, use the following commands.
adduser hadoop
passwd hadoop
Set up this new user for Hadoop separately from the normal users. NOTE: It is important to create a separate user with the username hadoop, otherwise you may run into path and file issues later.
Also run these commands from a user with admin privileges on the machine.
sudo adduser hadoop sudo
If you have already created the user and want to give sudo/root privileges to it, then run the following command.
sudo usermod -a -G sudo hadoop
Otherwise, you can directly edit the permission lines in the sudoers file. Switch to root by running
sudo -i or su - <username>
then open the sudoers file with the following command and add the line below to it.
visudo
hadoop ALL=(ALL:ALL) ALL
After creating the account, it is also required to set up key-based SSH for the hadoop account itself. To do this, execute the following commands.
su - hadoop
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
Let's verify key-based login. The command below should not ask for a password, but the first time it will prompt for adding the RSA key to the list of known hosts.
ssh localhost
exit
Disable all firewall restrictions.
sudo ufw disable

If the above command doesn't work, then go with:
service iptables stop
OR
sudo chkconfig iptables off
Sometimes it's better to manage the firewall using third-party software, e.g. YaST.

3. Download Hadoop 3.1 Archive

In this step, download the Hadoop 3.1 archive file using the command below. You can also select an alternate download mirror to increase the download speed.
‭cd ~‬
‭wget http://www-eu.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz‬
‭tar xzf hadoop-3.1.0.tar.gz‬
‭mv hadoop-3.1.0 hadoop‬

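As a quick sanity check (not part of the original steps), you can confirm that the extracted archive works by printing the Hadoop version; the path below assumes the archive was extracted into your home directory and renamed to hadoop as above:
~/hadoop/bin/hadoop version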
4. Setup Hadoop Pseudo-Distributed Mode

4.1. Setup Hadoop Environment Variables
First, we need to set the environment variables used by Hadoop. Edit the ~/.bashrc file and append the following values at the end of the file.
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

Now apply the changes in the current running environment:
source ~/.bashrc
Now edit the $HADOOP_HOME/etc/hadoop/hadoop-env.sh file and set the JAVA_HOME environment variable. Change the Java path as per the installation on your system; this path may vary with your operating system version and installation source, so make sure you are using the correct path.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

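If you are unsure of the Java path on your machine (a general tip, not from the original manual), you can usually resolve it with:
readlink -f $(which java)
This prints the full path of the java binary, for example /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java for the Ubuntu OpenJDK 8 package; JAVA_HOME is then the JVM installation directory, e.g. /usr/lib/jvm/java-8-openjdk-amd64.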
4.2. Setup Hadoop Configuration Files

Hadoop has many configuration files, which need to be configured as per the requirements of your Hadoop infrastructure. Let's start with the configuration for a basic Hadoop single-node cluster setup. First, navigate to the location below.
cd $HADOOP_HOME/etc/hadoop
Edit core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

Edit hdfs-site.xml
‭<configuration>‬
‭<property>‬
‭<name>dfs.replication</name>‬
‭<value>1</value>‬
‭</property>‬
‭<property>‬
‭<name>dfs.name.dir</name>‬
‭<value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>‬
‭</property>‬
‭<property>‬
‭<name>dfs.data.dir</name>‬
‭<value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>‬
‭</property>‬
‭</configuration>‬

Edit mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

Edit yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
4.3. Format Namenode
Now format the namenode using the following command; check in the output that the storage directory has been successfully formatted.
hdfs namenode -format

Sample output:
WARNING: /home/hadoop/hadoop/logs does not exist. Creating.
2018-05-02 17:52:09,678 INFO namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.1.0
‭...‬
‭...‬
‭...‬
‭2018-05-02 17:52:13,717 INFO common.Storage: Storage directory‬
‭/home/hadoop/hadoopdata/hdfs/namenode has been successfully formatted. 2018-05-02‬
‭17:52:13,806 INFO namenode.FSImageFormatProtobuf: Saving image file‬
‭/home/hadoop/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 using‬
‭no‬
‭compression‬
‭2018-05-02 17:52:14,161 INFO namenode.FSImageFormatProtobuf: Image file‬
‭/home/hadoop/hadoopdata/hdfs/namenode/current/fsimage.ckpt_0000000000000000000 of size‬
‭391 bytes saved in 0 seconds .‬
‭2018-05-02 17:52:14,224 INFO namenode.NNStorageRetentionManager: Going to retain‬
‭1 images with txid >= 0‬
‭2018-05-02 17:52:14,282 INFO namenode.NameNode: SHUTDOWN_MSG:‬
‭/************************************************************‬
‭SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.1.1‬
‭************************************************************/‬

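As an optional check (not in the original text), you can confirm that the format step created the metadata directory configured in hdfs-site.xml; it should contain files such as fsimage_* and VERSION:
ls /home/hadoop/hadoopdata/hdfs/namenode/current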
5. Start Hadoop Cluster

Let's start your Hadoop cluster using the scripts provided by Hadoop. Just navigate to your $HADOOP_HOME/sbin directory and execute the scripts one by one.
cd $HADOOP_HOME/sbin/
Now run the start-dfs.sh script.
./start-dfs.sh
Sample output:
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [localhost]
2018-05-02 18:00:32,565 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Now run the start-yarn.sh script.
./start-yarn.sh
Sample output:
Starting resourcemanager
Starting nodemanagers

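To confirm that the daemons are running (a common check, not part of the original text), list the Java processes with the jps tool that ships with the JDK. On a single-node setup you would expect to see NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager, along with Jps itself:
jps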
6. Access Hadoop Services in Browser

The Hadoop NameNode starts on port 9870 by default. Access your server on port 9870 in your favorite web browser.
http://localhost:9870/
Now access port 8042 for information about the cluster and all applications.
http://localhost:8042/
Access port 9864 to get details about your Hadoop node.
http://localhost:9864/

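Besides the web interfaces, you can also query the cluster state from the command line (a supplementary check, not part of the original steps); the following prints the configured capacity and the live datanodes with their usage:
hdfs dfsadmin -report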
7. Test Hadoop Single Node Setup

7.1 Make the required HDFS directories using the following commands.
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/hadoop
7.2 Copy all files from the local file system /var/log/apache2 to the Hadoop distributed file system using the command below.
bin/hdfs dfs -put /var/log/apache2 logs
7.3 Browse the Hadoop distributed file system by opening the URL below in the browser. You will see an apache2 folder in the list.
http://localhost:9870/explorer.html#/user/hadoop/logs/
‭PRACTICAL NO – 2‬

Exp No:
Date:

‭Aim:‬‭Hadoop Programming: Word Count MapReduce Program Using Eclipse‬

‭THEORY:‬

Steps to run the WordCount application in Eclipse
Step-1
Download Eclipse if you don't have it (64-bit Linux OS or 32-bit Linux OS).
Step-2
Open Eclipse and make a Java project.
In Eclipse, click on the File menu -> New -> Java Project. Enter your project name there; here it is WordCount. Make sure the Java version is 1.6 or above. Click on Finish.

Step-3
Make a Java class file and write the code.
Click on the WordCount project. There will be an 'src' folder. Right-click on the 'src' folder -> New -> Class. Enter the class file name; here it is Wordcount. Click on Finish.

‭Copy and Paste below code in Wordcount.java. Save it.‬


You will get lots of errors, but don't panic. This is because of the external Hadoop libraries that are required to run a MapReduce program.
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Wordcount {

    // Mapper: emits (word, 1) for every token in the input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums up the counts for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(Wordcount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Step-4
Add external libraries from Hadoop.
Right-click on the WordCount project -> Build Path -> Configure Build Path -> click on Libraries -> click on the 'Add External JARs...' button.
Select the below files from the Hadoop folder.
In my case: /usr/local/hadoop/share/hadoop
4.1 Add jar files from the /usr/local/hadoop/share/hadoop/common folder.
4.2 Add jar files from the /usr/local/hadoop/share/hadoop/common/lib folder.
4.3 Add jar files from the /usr/local/hadoop/share/hadoop/mapreduce folder (no need to add hadoop-mapreduce-examples-2.7.3.jar).
4.4 Add jar files from the /usr/local/hadoop/share/hadoop/yarn folder.
Click on OK. Now you can see that all errors in the code are gone.
Step 5
Running the MapReduce code.
5.1 Make an input file for the WordCount project.
Right-click on the WordCount project -> New -> File. Write the file name and click on OK. You can copy and paste the contents below into your input file.
car bus bike
bike bus aeroplane
truck car bus
5.2 Right-click on the WordCount project -> click on Run As -> click on Run Configurations… Make a new configuration by clicking on 'new launch configuration'. Set the configuration name, project name and class file name.
Output of the WordCount application and the output logs appear in the console.
Refresh the WordCount project: right-click on the project -> click on Refresh. You will find an 'out' directory in the project explorer. Open the 'out' directory. There will be a 'part-r-00000' file. Double-click to open it.
‭PRACTICAL NO – 3‬

‭Exp No:‬

‭Date:‬

‭Aim:‬‭Implementing Matrix Multiplication Using One Map-Reduce Step.‬

‭THEORY:‬
In mathematics, matrix multiplication or the matrix product is a binary operation that produces a matrix from two matrices. In more detail, if A is an n × m matrix and B is an m × p matrix, their matrix product AB is an n × p matrix, in which the m entries across a row of A are multiplied with the m entries down a column of B and summed to produce an entry of AB. When two linear transformations are represented by matrices, then the matrix product represents the composition of the two transformations.

Algorithm for Map Function:

for each element mij of M do
    produce (key, value) pairs as ((i,k), (M, j, mij)), for k = 1, 2, 3, ... up to the number of columns of N

for each element njk of N do
    produce (key, value) pairs as ((i,k), (N, j, njk)), for i = 1, 2, 3, ... up to the number of rows of M

return the set of (key, value) pairs, where each key (i,k) has a list with values (M, j, mij) and (N, j, njk) for all possible values of j.

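As an illustration (using the 2 × 2 matrices M and N that appear later in Step 8), the map function emits, for the element M(0,0) = 1, the pairs ((0,0), (M,0,1)) and ((0,1), (M,0,1)), one for each column k of N; and for the element N(0,0) = 5 it emits ((0,0), (N,0,5)) and ((1,0), (N,0,5)), one for each row i of M.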
‭Algorithm for Reduce Function:‬

for each key (i,k) do
    sort values that begin with M by j into listM
    sort values that begin with N by j into listN
    multiply mij and njk for the jth value of each list
    sum up mij x njk
return (i,k), Σj mij x njk
‭Step 1. Download the hadoop jar files with these links.‬

Download the Hadoop Common jar file: https://goo.gl/G4MyHp
$ wget https://goo.gl/G4MyHp -O hadoop-common-2.2.0.jar

Download the Hadoop MapReduce jar file: https://goo.gl/KT8yfB
$ wget https://goo.gl/KT8yfB -O hadoop-mapreduce-client-core-2.7.1.jar

Step 2. Creating the Mapper file for Matrix Multiplication.

import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import java.io.IOException;

public class Map extends org.apache.hadoop.mapreduce.Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        int m = Integer.parseInt(conf.get("m")); // number of rows of M
        int p = Integer.parseInt(conf.get("p")); // number of columns of N
        String line = value.toString();
        // Each input line is (M, i, j, Mij) or (N, j, k, Njk)
        String[] indicesAndValue = line.split(",");
        Text outputKey = new Text();
        Text outputValue = new Text();
        if (indicesAndValue[0].equals("M")) {
            for (int k = 0; k < p; k++) {
                outputKey.set(indicesAndValue[1] + "," + k); // key = (i,k)
                outputValue.set(indicesAndValue[0] + "," + indicesAndValue[2] + "," + indicesAndValue[3]); // value = (M,j,Mij)
                context.write(outputKey, outputValue);
            }
        } else {
            // (N, j, k, Njk)
            for (int i = 0; i < m; i++) {
                outputKey.set(i + "," + indicesAndValue[2]); // key = (i,k)
                outputValue.set("N," + indicesAndValue[1] + "," + indicesAndValue[3]); // value = (N,j,Njk)
                context.write(outputKey, outputValue);
            }
        }
    }
}

Step 3. Creating the Reducer.java file for Matrix Multiplication.
‭import org.apache.hadoop.io.Text;‬

import org.apache.hadoop.mapreduce.Reducer;
import java.io.IOException;
import java.util.HashMap;

public class Reduce extends org.apache.hadoop.mapreduce.Reducer<Text, Text, Text, Text> {

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        String[] value;
        // key = (i,k), values = [(M/N, j, V/W), ...]
        HashMap<Integer, Float> hashA = new HashMap<Integer, Float>();
        HashMap<Integer, Float> hashB = new HashMap<Integer, Float>();
        for (Text val : values) {
            value = val.toString().split(",");
            if (value[0].equals("M")) {
                hashA.put(Integer.parseInt(value[1]), Float.parseFloat(value[2]));
            } else {
                hashB.put(Integer.parseInt(value[1]), Float.parseFloat(value[2]));
            }
        }
        int n = Integer.parseInt(context.getConfiguration().get("n")); // common dimension
        float result = 0.0f;
        float m_ij;
        float n_jk;
        for (int j = 0; j < n; j++) {
            m_ij = hashA.containsKey(j) ? hashA.get(j) : 0.0f;
            n_jk = hashB.containsKey(j) ? hashB.get(j) : 0.0f;
            result += m_ij * n_jk;
        }
        if (result != 0.0f) {
            context.write(null,
                    new Text(key.toString() + "," + Float.toString(result)));
        }
    }
}

Step 4. Creating the MatrixMultiply.java file for Matrix Multiplication.

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MatrixMultiply {

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MatrixMultiply <in_dir> <out_dir>");
            System.exit(2);
        }
        Configuration conf = new Configuration();
        // Matrix dimensions: M is m x n, N is n x p
        conf.set("m", "1000");
        conf.set("n", "100");
        conf.set("p", "1000");

        @SuppressWarnings("deprecation")
        Job job = new Job(conf, "MatrixMultiply");
        job.setJarByClass(MatrixMultiply.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}

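Note (an observation, not from the original text): the driver hard-codes the dimensions as m = 1000, n = 100 and p = 1000. For the small 2 × 2 sample matrices used in Step 8 you could instead set
conf.set("m", "2");
conf.set("n", "2");
conf.set("p", "2");
The job still produces the correct products with the larger values, because missing entries are treated as 0 and zero results are not written, but it generates far more intermediate keys than necessary.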
‭Step 5. Compiling the program in particular folder named as operation/‬

$ javac -cp hadoop-common-2.2.0.jar:hadoop-mapreduce-client-core-2.7.1.jar:operation/:. -d operation/ Map.java

$ javac -cp hadoop-common-2.2.0.jar:hadoop-mapreduce-client-core-2.7.1.jar:operation/:. -d operation/ Reduce.java

$ javac -cp hadoop-common-2.2.0.jar:hadoop-mapreduce-client-core-2.7.1.jar:operation/:. -d operation/ MatrixMultiply.java
‭Step 6. Let’s retrieve the directory after compilation.‬

‭$ ls -R operation/‬

‭operation/:‬

‭www‬

‭operation/www:‬

‭ehadoopinfo‬

‭operation/www/ehadoopinfo:‬

‭com‬

‭operation/www/ehadoopinfo/com:‬

‭Map.class MatrixMultiply.class Reduce.class‬


‭Step 7. Creating Jar file for the Matrix Multiplication.‬

‭$ jar -cvf MatrixMultiply.jar -C operation/ .‬

‭added manifest‬

‭adding: www/(in = 0) (out= 0)(stored 0%)‬

‭adding: www/ehadoopinfo/(in = 0) (out= 0)(stored 0%)‬

‭adding: www/ehadoopinfo/com/(in = 0) (out= 0)(stored 0%)‬

adding: www/ehadoopinfo/com/Reduce.class(in = 2919) (out= 1271)(deflated 56%)
adding: www/ehadoopinfo/com/MatrixMultiply.class(in = 1815) (out= 932)(deflated 48%)
adding: www/ehadoopinfo/com/Map.class(in = 2353) (out= 993)(deflated 57%)

Step 8. Uploading the M and N files, which contain the matrix data, to HDFS.
‭$ cat M‬
‭M,0,0,1‬

‭M,0,1,2‬

‭M,1,0,3‬

‭M,1,1,4‬

‭$ cat N‬

‭N,0,0,5‬

‭N,0,1,6‬

‭N,1,0,7‬

‭N,1,1,8‬

‭$ hadoop fs -mkdir Matrix/‬

‭$ hadoop fs -copyFromLocal M Matrix/‬

‭$ hadoop fs -copyFromLocal N Matrix/‬

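At this point the job itself is run with the hadoop jar command. The exact invocation is not shown in the original manual; a likely form, assuming the classes were compiled into the www.ehadoopinfo.com package suggested by the directory listing in Step 6 and that the output directory is named result as read in Step 9, is:
$ hadoop jar MatrixMultiply.jar www.ehadoopinfo.com.MatrixMultiply Matrix result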

Step 9. Getting the output from part-r-00000 that was generated after the execution of the hadoop command.

‭$ hadoop fs -cat result/part-r-00000‬

‭0,0,19.0‬

‭0,1,22.0‬

‭1,0,43.0‬

‭1,1,50.0‬

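These values can be checked by hand from the reduce formula: for key (0,0) the sum is 1 × 5 + 2 × 7 = 19, for (0,1) it is 1 × 6 + 2 × 8 = 22, for (1,0) it is 3 × 5 + 4 × 7 = 43, and for (1,1) it is 3 × 6 + 4 × 8 = 50, matching the part-r-00000 output above.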