Exp 5 Bdafinal
Workflow of MapReduce
● During a MapReduce job, Hadoop sends the Map and Reduce tasks to the appropriate servers in the
cluster. Generally, the MapReduce paradigm is based on sending the computation to where the data
resides.
● A MapReduce program executes in three stages, namely the map stage, the shuffle stage, and the reduce stage.
● Map stage − The map or mapper’s job is to process the input data. Generally the input data is in the
form of a file or directory and is stored in the Hadoop file system (HDFS). The input file is passed to
the mapper function line by line. The mapper processes the data and creates several small chunks of
data.
● Reduce stage − This stage is the combination of the shuffle stage and the reduce stage. The
Reducer’s job is to process the data that comes from the mapper. After processing, it produces a new
set of output, which is stored in HDFS.
● The framework manages all the details of data passing, such as issuing tasks, verifying task
completion, and copying data around the cluster between the nodes.
● Most of the computing takes place on nodes with data on local disks, which reduces network traffic.
● After completion of the given tasks, the cluster collects and reduces the data to form an appropriate
result, and sends it back to the Hadoop server.
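The three stages above can be sketched in plain Java for the matrix-multiplication example this experiment uses. This is a standalone simulation without Hadoop dependencies, not the actual classroom code; the input format `matrixName,row,col,value` and the matrix names A and B are assumptions:

```java
import java.util.*;

// A plain-Java simulation of the MapReduce matrix-multiplication workflow
// (map -> shuffle -> reduce). Input records use the assumed format
// "matrixName,row,col,value". A is m x n, B is n x p.
public class MapReduceMatrixDemo {

    // Map stage + shuffle: each input line is expanded into (key, value) pairs,
    // where the key "i,k" identifies one cell of the result matrix C = A x B,
    // and the shuffle groups all values sharing a key.
    static Map<String, List<double[]>> mapAndShuffle(List<String> lines, int m, int n, int p) {
        Map<String, List<double[]>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] t = line.split(",");
            int r = Integer.parseInt(t[1]), c = Integer.parseInt(t[2]);
            double v = Double.parseDouble(t[3]);
            if (t[0].equals("A")) {
                // A[r][c] contributes to every result cell C[r][k]
                for (int k = 0; k < p; k++)
                    grouped.computeIfAbsent(r + "," + k, x -> new ArrayList<>())
                           .add(new double[]{0, c, v}); // 0 marks matrix A, c is the join index j
            } else {
                // B[r][c] contributes to every result cell C[i][c]
                for (int i = 0; i < m; i++)
                    grouped.computeIfAbsent(i + "," + c, x -> new ArrayList<>())
                           .add(new double[]{1, r, v}); // 1 marks matrix B, r is the join index j
            }
        }
        return grouped;
    }

    // Reduce stage: for one result cell, join the A and B values on j
    // and sum the products: C[i][k] = sum_j A[i][j] * B[j][k].
    static double reduce(List<double[]> values, int n) {
        double[] a = new double[n], b = new double[n];
        for (double[] v : values) {
            if (v[0] == 0) a[(int) v[1]] = v[2]; else b[(int) v[1]] = v[2];
        }
        double sum = 0;
        for (int j = 0; j < n; j++) sum += a[j] * b[j];
        return sum;
    }

    public static void main(String[] args) {
        // A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]]
        List<String> input = Arrays.asList(
            "A,0,0,1", "A,0,1,2", "A,1,0,3", "A,1,1,4",
            "B,0,0,5", "B,0,1,6", "B,1,0,7", "B,1,1,8");
        Map<String, List<double[]>> grouped = mapAndShuffle(input, 2, 2, 2);
        for (Map.Entry<String, List<double[]>> e : grouped.entrySet())
            System.out.println(e.getKey() + " -> " + reduce(e.getValue(), 2));
    }
}
```

In a real Hadoop job, the same map and reduce logic would live in Mapper and Reducer classes, and the framework would perform the shuffle (grouping by key) between the two stages.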
Steps to follow:
∙ Create a project folder (e.g., C:\hadoop_project\). Inside this folder, right-click -> New -> Text Document.
∙ Copy the Java code (Mapper, Reducer, Driver) and paste it into Notepad. (The Java code is provided in the classroom.)
∙ Click File → Save As.
∙ Choose All Files as the file type.
∙ Save the file as MatrixMultiplicationMapper.java inside the project folder (e.g., C:\hadoop_project\).
∙ Repeat the same process for:
∙ MatrixMultiplicationReducer.java
∙ MatrixMultiplicationDriver.java
Step 2: Compile the Java Files
∙ Open Command Prompt (cmd) and navigate to the project folder: C:\hadoop_project
∙ Compile the Java files with the Hadoop dependencies on the classpath:
javac -classpath "C:\hadoop\share\hadoop\common\*;C:\hadoop\share\hadoop\mapreduce\*;C:\hadoop\share\hadoop\hdfs\*" -d . MatrixMultiplicationMapper.java MatrixMultiplicationReducer.java MatrixMultiplicationDriver.java
This will generate .class files inside the current directory.
Open Notepad and copy the below data. Save the file as matrix_input.txt in C:\hadoop_project\.
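The data itself is provided in the classroom, so the lines below are only an assumed illustration of a common input format for MapReduce matrix multiplication, where each line carries the matrix name, row index, column index, and value (the exact format must match the Mapper code you were given):

```
A,0,0,1
A,0,1,2
A,1,0,3
A,1,1,4
B,0,0,5
B,0,1,6
B,1,0,7
B,1,1,8
```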
Upload to HDFS: This you will have to do in another Command Prompt. First launch all daemons using
start-all.cmd and follow the steps below.
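The upload and job-submission steps can be sketched as follows using the standard Hadoop shell commands. The HDFS paths and the jar name matrixmul.jar are assumptions; adjust them to match your setup:

```shell
:: Start the Hadoop daemons (HDFS and YARN)
start-all.cmd

:: Create an input directory in HDFS and upload the input file
:: (the /input path is an assumption; any HDFS path works)
hdfs dfs -mkdir -p /input
hdfs dfs -put C:\hadoop_project\matrix_input.txt /input

:: Package the compiled classes into a jar and run the job
:: (matrixmul.jar is a hypothetical name; /output must not already exist)
jar cf matrixmul.jar *.class
hadoop jar matrixmul.jar MatrixMultiplicationDriver /input/matrix_input.txt /output
```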
Viewing the output:
Getting the output file into a specified local folder:
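Both steps can be done with the standard HDFS shell commands (the /output path is an assumption and must match the output path given when the job was submitted):

```shell
:: Print the job output stored in HDFS
hdfs dfs -cat /output/part-r-00000

:: Copy the output folder from HDFS to a local folder
hdfs dfs -get /output C:\hadoop_project\output
```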
Observations and learning: MapReduce is a processing technique and a programming model for distributed
computing based on Java. The MapReduce algorithm contains two important tasks, namely Map and Reduce. Map
takes a set of data and converts it into another set of data, where individual elements are broken down into tuples
(key/value pairs).
Questions
Consider matrices A and B of dimension 2 × 2 and perform matrix multiplication using MapReduce. Write all
the steps as discussed in class.