
MapReduce

What it is, and why it is so popular

Luigi Laura

Dipartimento di Informatica e Sistemistica


“Sapienza” Università di Roma

Rome, May 9th and 11th, 2012


Motivations: From the description of this course...

...This is a tentative list of questions that are likely to be covered in
the class:

- The running times obtained in practice by scanning a moderately large
  matrix by row or by column may be very different: what is the reason?
  Is the assumption that memory access times are constant realistic?
  (See the first sketch after this list.)
- How would you sort 1TB of data? How would you measure the performance
  of algorithms in applications that need to process massive data sets
  stored in secondary memory? (A single-machine baseline sketch follows
  the sorting records below.)
- Do memory allocation and free operations really require constant time?
  How do real memory allocators work? (A simplified allocator sketch
  follows below.)
- ...
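
A minimal sketch of what the first question is getting at (not from the
slides; the matrix size N and the timing harness are illustrative
assumptions). A row-major matrix scanned row by row touches memory with
unit stride, so most accesses hit the cache; scanned column by column,
each access jumps N doubles ahead and misses almost every time:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 4096   /* 4096 x 4096 doubles = 128 MB, far larger than any cache */

    int main(void) {
        /* one contiguous row-major allocation: a[i][j] lives at a[i*N + j] */
        double *a = malloc((size_t)N * N * sizeof *a);
        double sum = 0.0;
        clock_t t;
        if (!a) return 1;

        for (size_t k = 0; k < (size_t)N * N; k++) a[k] = 1.0;

        t = clock();   /* row by row: unit stride, cache friendly */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += a[(size_t)i * N + j];
        printf("by row:    %.3f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        t = clock();   /* column by column: stride of N doubles, so nearly
                          every access is a cache miss */
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += a[(size_t)i * N + j];
        printf("by column: %.3f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        printf("checksum: %.0f\n", sum);   /* keeps the loops from being optimized away */
        free(a);
        return 0;
    }

Both loops perform exactly the same N^2 additions, yet on typical
hardware the column scan is several times slower: the constant-time
memory access assumption of the RAM model is unrealistic at this scale.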
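For the third question, a first-fit free-list allocator is a useful
mental model. This is a deliberately simplified sketch, not how any
particular libc allocator works; the arena size, the alignment policy
and the my_malloc/my_free names are assumptions. Note that allocation
scans the free list, so it is not constant time in general:

    #include <stddef.h>

    #define ARENA_SIZE (1 << 20)   /* illustrative 1 MB arena */

    typedef struct Block {         /* header placed in front of each payload */
        size_t size;               /* payload size in bytes */
        int free;
        struct Block *next;
    } Block;

    static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
    static Block *head = NULL;

    void *my_malloc(size_t n) {
        if (!head) {                           /* lazy init: one big free block */
            head = (Block *)arena;
            head->size = ARENA_SIZE - sizeof(Block);
            head->free = 1;
            head->next = NULL;
        }
        n = (n + 15) & ~(size_t)15;            /* simplified 16-byte rounding */
        for (Block *b = head; b; b = b->next) {  /* first fit: a linear scan */
            if (b->free && b->size >= n) {
                if (b->size >= n + sizeof(Block) + 16) {   /* split the block */
                    Block *rest = (Block *)((unsigned char *)(b + 1) + n);
                    rest->size = b->size - n - sizeof(Block);
                    rest->free = 1;
                    rest->next = b->next;
                    b->next = rest;
                    b->size = n;
                }
                b->free = 0;
                return b + 1;                  /* payload follows the header */
            }
        }
        return NULL;                           /* arena exhausted */
    }

    void my_free(void *p) {
        if (!p) return;
        Block *b = (Block *)p - 1;
        b->free = 1;
        if (b->next && b->next->free) {        /* coalesce with the next block;
                                                  real allocators also merge backwards */
            b->size += sizeof(Block) + b->next->size;
            b->next = b->next->next;
        }
    }

Real allocators (dlmalloc, jemalloc, tcmalloc) add size-class bins,
per-thread caches and backward coalescing precisely to make the common
cases close to constant time, but the worst case still is not.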
Motivations: sorting one Petabyte
Motivations: sorting...

- Nov. 2008: 1TB, 1000 computers, 68 seconds.
  The previous record was 910 computers, 209 seconds.
- Nov. 2008: 1PB, 4000 computers, 6 hours; 48k hard disks...
- Sept. 2011: 1PB, 8000 computers, 33 minutes.
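
For scale, the classic single-machine answer to "how would you sort
1TB?" is a two-phase external merge sort: sort memory-sized chunks into
runs on disk, then merge the runs. A sketch under stated assumptions
(fixed 100-byte records as in the sort benchmarks; the file names and
the REC/MEM_RECS constants are made up for illustration; error handling
omitted):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define REC 100          /* record size in bytes */
    #define MEM_RECS 1000000 /* records sorted in memory at once (~100 MB) */

    static int cmp(const void *a, const void *b) {
        return memcmp(a, b, REC);
    }

    /* Phase 1: read memory-sized chunks, sort each, write it as a run. */
    static int make_runs(FILE *in) {
        char *buf = malloc((size_t)MEM_RECS * REC);
        int nruns = 0;
        size_t got;
        while ((got = fread(buf, REC, MEM_RECS, in)) > 0) {
            char name[32];
            qsort(buf, got, REC, cmp);
            snprintf(name, sizeof name, "run%d.tmp", nruns++);
            FILE *out = fopen(name, "wb");
            fwrite(buf, REC, got, out);
            fclose(out);
        }
        free(buf);
        return nruns;
    }

    /* Phase 2: k-way merge, scanning the run heads for the minimum
       (a heap would replace the scan for large k). */
    static void merge_runs(int nruns, FILE *out) {
        FILE **run = malloc(nruns * sizeof *run);
        char (*head)[REC] = malloc((size_t)nruns * REC);
        int *alive = calloc(nruns, sizeof *alive);
        for (int i = 0; i < nruns; i++) {
            char name[32];
            snprintf(name, sizeof name, "run%d.tmp", i);
            run[i] = fopen(name, "rb");
            alive[i] = fread(head[i], REC, 1, run[i]) == 1;
        }
        for (;;) {
            int min = -1;
            for (int i = 0; i < nruns; i++)
                if (alive[i] && (min < 0 || cmp(head[i], head[min]) < 0))
                    min = i;
            if (min < 0) break;            /* every run is exhausted */
            fwrite(head[min], REC, 1, out);
            alive[min] = fread(head[min], REC, 1, run[min]) == 1;
        }
        for (int i = 0; i < nruns; i++) fclose(run[i]);
        free(run); free(head); free(alive);
    }

    int main(void) {
        FILE *in = fopen("input.bin", "rb");    /* hypothetical input file */
        FILE *out = fopen("sorted.bin", "wb");
        merge_runs(make_runs(in), out);
        fclose(in); fclose(out);
        return 0;
    }

Every byte is read and written twice, so a single machine is limited by
the bandwidth of its disks; the cluster results above attack exactly
that bottleneck by spreading the data over thousands of machines and
disks (48,000 of them in the 2008 petabyte run).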
