
1. MapReduce Program for Matrix Multiplication

Aim:

To implement matrix multiplication using MapReduce in Hadoop.

Procedure:

1. Prepare two input matrices in HDFS with the format <MatrixName,Row,Column,Value> (a sample fragment is shown after this list).

2. Write a Mapper to emit intermediate key-value pairs for matrix entries.

3. Write a Reducer to calculate partial products and sum them up for final results.

4. Execute the MapReduce job using Hadoop.
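
For illustration, the step 1 input for hypothetical 2x2 matrices A and B would contain lines such as:

A,0,0,1.0
A,0,1,2.0
B,0,0,3.0
B,1,1,4.0

The program below hard-codes the dimension as 10x10; adjust the constant for other sizes.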

Program:

import java.io.IOException;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;

public class MatrixMultiplication {

    // Dimension of the (square) matrices; the example assumes 10x10.
    private static final int N = 10;

    public static class MatrixMapper extends Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each input line has the form: MatrixName,Row,Column,Value
            String[] elements = value.toString().split(",");
            String matrix = elements[0];
            String row = elements[1];
            String column = elements[2];
            String val = elements[3];

            if (matrix.equals("A")) {
                // A(i,j) contributes to every cell (i,k) of the product.
                for (int k = 0; k < N; k++) {
                    context.write(new Text(row + "," + k), new Text("A," + column + "," + val));
                }
            } else {
                // B(j,k) contributes to every cell (i,k) of the product.
                for (int i = 0; i < N; i++) {
                    context.write(new Text(i + "," + column), new Text("B," + row + "," + val));
                }
            }
        }
    }

    public static class MatrixReducer extends Reducer<Text, Text, Text, Text> {

        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            double[] A = new double[N];
            double[] B = new double[N];

            // Collect the row of A and the column of B that meet at this
            // cell, indexed by the shared inner dimension j.
            for (Text value : values) {
                String[] elements = value.toString().split(",");
                if (elements[0].equals("A")) {
                    A[Integer.parseInt(elements[1])] = Double.parseDouble(elements[2]);
                } else {
                    B[Integer.parseInt(elements[1])] = Double.parseDouble(elements[2]);
                }
            }

            // The cell value is the dot product of the collected entries.
            double sum = 0;
            for (int j = 0; j < N; j++) {
                sum += A[j] * B[j];
            }
            context.write(key, new Text(String.valueOf(sum)));
        }
    }
}

Output:

The output is the product matrix, one line per cell: a (row,column) key followed by the cell value, stored in the HDFS output directory.

Result:

Matrix multiplication was successfully implemented using MapReduce.


2. Hive Commands

Aim:

To execute Hive commands for importing, distributing, sorting, clustering, and exporting data.

Procedure:

1. Start the Hive environment.

2. Execute commands step by step for importing, distributing, sorting, clustering, and exporting.

Commands and Output:

1. IMPORT:

IMPORT TABLE table_name FROM 'hdfs_path';

Output: Data is imported into the Hive table.

2. DISTRIBUTE BY:

SELECT * FROM table_name DISTRIBUTE BY column_name;


Output: Rows with the same value in the column are routed to the same reducer (partition).

3. EXPORT:

EXPORT TABLE table_name TO 'hdfs_path';

Output: Data is exported to the specified HDFS path.

4. SORT BY:

SELECT * FROM table_name SORT BY column_name;

Output: Data is sorted by the column within each reducer; unlike ORDER BY, SORT BY does not guarantee a global order.

5. CLUSTER BY:

SELECT * FROM table_name CLUSTER BY column_name;

Output: Rows are distributed and sorted by the column; CLUSTER BY column_name is shorthand for DISTRIBUTE BY column_name combined with SORT BY column_name.
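
As a hypothetical combined example (the employees table and its columns are assumed), the following routes rows to reducers by department and sorts each reducer's rows by salary:

SELECT * FROM employees DISTRIBUTE BY dept SORT BY salary DESC;

With matching columns and ascending order, this pair collapses to CLUSTER BY dept.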

Result:

Hive commands were executed successfully.


3. Hadoop File Management Commands

Aim:

To manage files in HDFS using Hadoop commands.

Procedure:

1. Use the Hadoop file system commands to create directories, add files, and list their contents.

Commands:

# Create a directory in HDFS

hadoop fs -mkdir /user/mydir

# Upload a file to HDFS

hadoop fs -put localfile.txt /user/mydir


# List the contents of a directory

hadoop fs -ls /user/mydir

# Delete a file

hadoop fs -rm /user/mydir/localfile.txt

Output:

1. Directory created.

2. File uploaded.

3. Contents listed.

4. File deleted.
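
Beyond these four operations, two related commands are often useful (paths illustrative):

# Display the contents of a file
hadoop fs -cat /user/mydir/localfile.txt

# Copy a file from HDFS back to the local file system
hadoop fs -get /user/mydir/localfile.txt .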

Result:

File management tasks were successfully performed in HDFS.


4. MapReduce Program for Word Count

Aim:

To count the number of words in a text file using MapReduce.

Procedure:

1. Write a Mapper to emit (word, 1) pairs for each word.

2. Write a Reducer to sum up counts for each word.

3. Execute the program on input text files stored in HDFS.

Program:

import java.io.IOException;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split each line on whitespace and emit (word, 1) per word.
            String[] words = value.toString().split("\\s+");
            for (String w : words) {
                if (!w.isEmpty()) {   // split can yield an empty leading token
                    word.set(w);
                    context.write(word, one);
                }
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Add up all the 1s emitted for this word.
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
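
To satisfy step 3 of the procedure, the job also needs a driver. A minimal sketch, assuming the enclosing class above is named WordCount and that input and output paths arrive as command-line arguments:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Add inside the WordCount class.
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // safe here: addition is associative
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}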

Output:

The output is a list of words with their respective counts stored in HDFS.

Result:

Word count was successfully implemented using MapReduce.

5. Download and Install HBase with Start-Up Scripts

Aim:

To install and configure HBase.

Procedure:

1. Download HBase from the official Apache HBase website.


2. Extract HBase and configure hbase-site.xml for ZooKeeper and HDFS (a sample configuration is shown after this list).

3. Start HBase services.
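
For step 2, a minimal hbase-site.xml sketch for a pseudo-distributed setup; the hostname, port, and paths are assumptions and must match the local Hadoop installation:

<configuration>
  <!-- Store HBase data in HDFS; the URI must match fs.defaultFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- Run the master and region server as separate processes -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper ensemble used by HBase -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
</configuration>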

Commands:

# Download and extract HBase

wget https://archive.apache.org/dist/hbase/X.X.X/hbase-X.X.X-bin.tar.gz

tar -xzvf hbase-X.X.X-bin.tar.gz

cd hbase-X.X.X

# Start HBase

bin/start-hbase.sh

Output:

HBase is installed and running.

Result:

HBase was successfully downloaded, installed, and started.


6. HBase Commands

Aim:

To create, manipulate, and delete tables in HBase.

Procedure:

1. Start the HBase shell.

2. Execute commands for table creation, data insertion, and retrieval.

Commands:

1. Create Table:

create 'customer', 'info'

2. Insert Data:

put 'customer', '1', 'info:name', 'John'

3. Get Data:

get 'customer', '1'

4. Delete Data:

delete 'customer', '1', 'info:name'
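
5. Delete Table (the aim also covers deleting tables; in the HBase shell a table must be disabled before it can be dropped):

disable 'customer'
drop 'customer'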

Output:

The commands create the table, insert, retrieve, and delete data, and finally drop the table.

Result:

HBase operations were successfully performed.

7. MapReduce Program for Counting Characters

Aim:

To count the number of characters in a text file using MapReduce.


Procedure:

1. Write a Mapper to emit (character, 1) pairs for each character.

2. Write a Reducer to sum up counts for each character.

Program:

Similar to word count, but split the input into characters instead of words.
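
A minimal mapper sketch, assuming the same class structure and driver as the word count program (CharMapper is an illustrative name; the reducer is the unchanged IntSumReducer):

public static class CharMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text character = new Text();

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Emit (character, 1) for every character in the line.
        for (char c : value.toString().toCharArray()) {
            character.set(String.valueOf(c));
            context.write(character, one);
        }
    }
}

Output:

The output is a list of characters with their respective counts stored in HDFS.

Result:

Character counting was successfully implemented using MapReduce.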

8. Hive DML Commands

Aim:

To execute Hive DML commands such as INSERT, SELECT, and DELETE.

Procedure:

1. Use Hive to create and manipulate tables.


Commands:

1. INSERT:

INSERT INTO TABLE student VALUES (1, 'John', 'CS');

2. SELECT:

SELECT * FROM student;

3. DELETE:

DELETE FROM student WHERE id=1;

Note: In Hive, DELETE (and UPDATE) run only on transactional (ACID) tables.
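
A sketch of a table definition that supports all three statements, assuming Hive ACID transactions are enabled (the schema is inferred from the INSERT above):

CREATE TABLE student (id INT, name STRING, dept STRING)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

Result:

Hive DML commands were executed successfully.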
