Large Scale Distributed Graph Processing: Data Mining (CS6720)

The document summarizes models for large scale distributed graph processing including the massively parallel computation (MPC) model. It describes how the MPC model works with distributed memory across machines, synchronous communication rounds, and limitations on memory and message sizes. It then provides examples of how fundamental graph algorithms like broadcasting and finding a maximal matching can be implemented in the MPC model.



Large Scale Distributed Graph Processing
Data Mining (CS6720)
John Augustine
Jan 16, 2020

Parallel & Distributed Computing Models

[Overview figure: shared-memory models (PRAM); programming models (MapReduce, "think like a vertex"); message-passing models (Massively Parallel Computation, 𝑘-machine model).]

Massively Parallel Computation (MPC) Model

• Input data size 𝑁 words; each word = 𝑂(log 𝑁) bits.
• The number of machines 𝑘. (Machines identified by {1, 2, …, 𝑘}.)
• Memory size per machine 𝑆 words.
  • 𝑆 ≥ 𝑁 is uninteresting. Assume 𝑆 is sublinear: 𝑆 = 𝑂(𝑁^(1−𝜖)) for some 𝜖 ∈ (0,1].
  • Also, require 𝑆𝑘 ≥ 𝑁.
• Synchronous communication rounds:
  • Local computation within each machine.
  • Create messages for other machines. Sum of message sizes ≤ 𝑆.
  • Send… Receive. Ensure no machine requires more than 𝑆 words of memory.
• Goal: Solve the problem in as few rounds as possible.
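As a concrete illustration of the round structure just described, here is a minimal Python sketch of an MPC-style simulator. The class and function names (MPCMachine, run_mpc, compute_round) are illustrative choices, not part of any standard library, and the memory check simply counts words.

```python
# Minimal sketch of the MPC round structure described above (illustrative only).
# Each "machine" holds at most S words; in every synchronous round it computes
# locally, emits messages whose total size must not exceed S, and then receives.

class MPCMachine:
    def __init__(self, machine_id, data, S):
        self.id = machine_id
        self.memory = list(data)   # local words
        self.S = S                 # memory budget in words

    def compute_round(self, round_number):
        """Return {destination_machine_id: list_of_words}. Problem-specific."""
        return {}

def run_mpc(machines, num_rounds):
    for r in range(num_rounds):
        outboxes = {m.id: m.compute_round(r) for m in machines}
        # Enforce the model's constraint: total outgoing size <= S per machine.
        for m in machines:
            sent = sum(len(words) for words in outboxes[m.id].values())
            assert sent <= m.S, f"machine {m.id} exceeded its message budget"
        # Synchronous delivery.
        inboxes = {m.id: [] for m in machines}
        for src, out in outboxes.items():
            for dst, words in out.items():
                inboxes[dst].extend(words)
        for m in machines:
            m.memory.extend(inboxes[m.id])
            assert len(m.memory) <= m.S, f"machine {m.id} exceeded its memory"
```

A problem-specific algorithm would subclass MPCMachine and override compute_round; the assertions make the model's two constraints (outgoing message budget and local memory) explicit.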

Initial Data Distribution

• Typically, data is split into words (often as ⟨𝑘𝑒𝑦, 𝑣𝑎𝑙𝑢𝑒⟩ pairs).
• The words could be either randomly distributed or arbitrarily distributed.
• Load balanced, so that no machine has much more than the other machines.
• Output: usually distributed & depends on the problem.
• Questions:
  • How to achieve a random, load-balanced distribution?
  • How to remove duplicates?

On Graphs

[Figure: total input size 𝑁 = 𝑂(𝑛 + 𝑚) = 𝑂(𝑚); memory size 𝑆 per machine shown on a scale with three regimes: (strongly) superlinear 𝑆 = 𝑛^(1+𝜖), near-linear 𝑆 = Õ(𝑛), and (strongly) sublinear 𝑆 = 𝑛^𝛼 for 𝛼 ∈ (0,1).]

Broadcasting

• Let 𝑆 = 𝑛^(1+𝜖) for some constant 𝜖 > 0.
• One machine src needs to broadcast 𝑛 words.
• Approach 1: the machine sends 𝑘 messages of size 𝑛. If 𝑘 > 𝑛^𝜖, the total outgoing message size 𝑘𝑛 exceeds 𝑆, so this approach does not fit the model.
• Approach 2: Build an 𝑛^𝜖-ary tree with src as root.
  • Broadcast takes 𝑂(ℎ𝑒𝑖𝑔ℎ𝑡) rounds.
  • ℎ𝑒𝑖𝑔ℎ𝑡 = 𝑂(log_(𝑛^𝜖) 𝑘) = 𝑂(1/𝜖), since 𝑁 = 𝑝𝑜𝑙𝑦(𝑆) (𝑁 = 𝑂(𝑛²) for graphs), so log 𝑘 = 𝑂(log 𝑛).

Maximal Matching

• A matching in a graph 𝐺 = (𝑉, 𝐸) is a set of edges that don't share common vertices.
• A maximum matching is a matching of maximum possible cardinality.
• A maximal matching is a matching that ceases to be one when any edge is added to it.
• A maximal matching has cardinality at least half of a maximum matching. Homework: Prove this.
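Returning to Approach 2 on the Broadcasting slide above, here is a small Python sketch (illustrative; the function name and the fan-out choice S // n are assumptions consistent with the slide, not stated there verbatim) that simulates the tree broadcast and counts rounds.

```python
def broadcast_rounds(k, n, S, src=0):
    """Simulate the tree broadcast: in each round, every machine that already
    holds the n words forwards them to up to S // n new machines (so its total
    outgoing message size stays within S). Returns the number of rounds."""
    fanout = max(1, S // n)        # children per node in the broadcast tree
    have = {src}
    pending = [m for m in range(k) if m != src]
    rounds = 0
    while pending:
        rounds += 1
        capacity = len(have) * fanout          # new machines reachable this round
        newly, pending = pending[:capacity], pending[capacity:]
        have.update(newly)
    return rounds

# With S = n^(1+eps), the fan-out is n^eps, so the number of rounds is
# O(log_{n^eps} k), i.e. O(1/eps) assuming k = poly(n).
print(broadcast_rounds(k=10_000, n=1_000, S=1_000_000))   # fan-out 1000 -> 2 rounds
```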

Sequential Algorithm for finding a maximal matching

1. Let 𝑋 = ∅.
2. For each 𝑒 = (𝑢, 𝑣) ∈ 𝐸:
   1. If neither 𝑢 nor 𝑣 is an endpoint of any edge in 𝑋, then 𝑋 = 𝑋 ∪ {𝑒}.
3. Output 𝑋.

Correctness:
• Invariant: 𝑋 is a matching at all times.
• Suppose 𝑋 is not maximal at the end. Then some edge 𝑒 can be added to it and it will remain a matching. But why was 𝑒 rejected? (It can only have been rejected because one of its endpoints was already matched, a contradiction.)

Filtering: Idea to find a maximal matching in the superlinear memory regime

Preprocessing: Let ℓ be a designated "leader" machine (say, machine 0). Assume it doesn't hold any edge at the beginning. (Why is this OK?) During the course of the algorithm, ℓ maintains a matching (initially empty). Other machines are called regular machines. 𝐺_𝑟 = (𝑉_𝑟, 𝐸_𝑟) denotes the graph during phase 𝑟. We use 𝑚_𝑟 for the number of edges in 𝐺_𝑟. 𝐺_0 ← 𝐺.

Steps in each phase 𝑟 = 0, 1, … (until 𝐺_𝑟 becomes empty):
1. Each regular machine marks each local edge independently with probability 𝑝 = 𝑛^(1+𝜖)/(2𝑚_𝑟) and sends the marked edges to the leader ℓ.
2. The leader ℓ recomputes the maximal matching with the edges it received but without losing any edge from the previous matching. (How?)
3. The leader ℓ broadcasts the matching so computed (≤ 𝑛/2 edges) to all machines.
4. Each regular machine removes edges that have at least one common vertex with the received matching. Isolated vertices are also removed.
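The Python sketch below (illustrative) ties the two slides together: greedy_maximal_matching is the sequential algorithm above, and filtering_matching simulates the phases of the filtering idea on a single machine, reusing the greedy routine as the leader's "recompute without losing any previous matching edge" step. The sampling probability p = S/(2·m_r) is an assumption consistent with the analysis on the following slides, not a value stated verbatim here.

```python
import random

def greedy_maximal_matching(edges, matching=None):
    """Sequential algorithm: scan edges, add an edge iff neither endpoint is
    already matched. Starting from a previous matching keeps all its edges."""
    matching = list(matching or [])
    matched = {v for e in matching for v in e}
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

def filtering_matching(edges, S, rng=random.Random(0)):
    """Filtering idea: repeatedly sample edges to a 'leader', extend its matching
    greedily, then drop edges touching matched vertices, until no edges remain."""
    leader_matching = []
    remaining = list(edges)
    while remaining:
        p = min(1.0, S / (2 * len(remaining)))          # assumed sampling rate
        sampled = [e for e in remaining if rng.random() < p]
        leader_matching = greedy_maximal_matching(sampled, leader_matching)
        matched = {v for e in leader_matching for v in e}
        # Regular machines remove edges with at least one matched endpoint.
        remaining = [(u, v) for u, v in remaining
                     if u not in matched and v not in matched]
    return leader_matching
```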

Outline of the Analysis

• Correctness is obvious (similar to the sequential algorithm) if the bandwidth limitation is not violated.
• Claims:
  • The leader ℓ receives at most 𝑛^(1+𝜖) edges (whp) in step 1. (Homework)
  • If a phase 𝑟 starts with 𝑚_𝑟 edges, then the number of edges at the end of round 𝑟 is 𝑂(𝑚_𝑟/𝑛^𝜖) with high probability.
  • The total number of rounds is log_(𝑛^𝜖) 𝑚 ∈ 𝑂(1/𝜖). Why?

Claim: At most 2𝑛/𝑝 edges remain whp at the end of round 𝑟

• Recall 𝐺_𝑟 = (𝑉_𝑟, 𝐸_𝑟) is the leftover graph at the end of round 𝑟 − 1 (i.e., the graph at the start of round 𝑟).
• For a pair of vertices 𝑢, 𝑣 ∈ 𝑉_(𝑟+1), can 𝑒 = (𝑢, 𝑣) have been sent to the leader in round 𝑟? No! (Why? If sent, at least one of 𝑢 or 𝑣 would have been matched, and therefore discarded.)
• Consider any set of vertices 𝐽 with more than 2𝑛/𝑝 edges of 𝐺_𝑟 with both endpoints in 𝐽.
• What is the chance that 𝑉_(𝑟+1) = 𝐽?
  Pr[all induced edges not sent] ≤ (1 − 𝑝)^(2𝑛/𝑝) ≤ 𝑒^(−2𝑛).
• There are at most 2^𝑛 subsets of 𝑉, so by the union bound, the result holds.
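To spell out the probability bound in the last two bullets, assume (as in the reconstruction above) that each edge of 𝐺_𝑟 is sent to the leader independently with probability 𝑝, and write 𝑒(𝐽) for the number of edges of 𝐺_𝑟 with both endpoints in 𝐽. For any fixed 𝐽 with 𝑒(𝐽) > 2𝑛/𝑝:

```latex
\Pr[\text{no edge inside } J \text{ is sent}] \;=\; (1-p)^{e(J)} \;\le\; (1-p)^{2n/p} \;\le\; e^{-2n}.
```

Taking a union bound over the at most 2^𝑛 candidate sets 𝐽 bounds the failure probability by 2^𝑛 · 𝑒^(−2𝑛) = 𝑒^(−(2 − ln 2)𝑛) ≤ 𝑒^(−𝑛), which is the "whp" statement in the claim.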

The 𝑘-machine Model

• Input data size 𝑁 words; each word = 𝑂(log 𝑁) bits.
• The number of machines 𝑘. (Machines identified by {1, 2, …, 𝑘}.)
• Each pair of machines connected by a link.
• Memory size is unbounded (but usually not abused).
• Synchronous communication rounds:
  • Local computation within each machine.
  • Each machine creates one message of 𝑂(log 𝑛) bits for every other machine.
  • Send… Receive.
• Goal: Solve the problem in as few rounds as possible.

Data Distribution: The Random Vertex Partitioning (RVP)

• Typically, data is split into words (often as ⟨𝑘𝑒𝑦, 𝑣𝑎𝑙𝑢𝑒⟩ pairs).
• The words could be either randomly distributed or arbitrarily distributed.
• Typically used in processing large graphs.
• RVP: The most common approach is to randomly partition the vertices into 𝑘 parts and place each part into one of the machines. Then, a copy of each edge is placed in the (≤ 2) machines that contain either of its end points.
• Other partitionings of graph data are also conceivable (e.g., random edge partitioning, arbitrary edge partitioning, etc.).
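A minimal sketch of the random vertex partitioning just described (Python, illustrative; function and variable names are assumptions): vertices are assigned to machines uniformly at random, and a copy of each edge goes to the at most two machines holding its endpoints.

```python
import random

def random_vertex_partition(vertices, edges, k, rng=random.Random(0)):
    """Random Vertex Partitioning (RVP): map each vertex to a uniformly random
    machine in {0, ..., k-1}; a copy of each edge goes to the (<= 2) machines
    that hold one of its endpoints."""
    home = {v: rng.randrange(k) for v in vertices}
    machine_vertices = {i: set() for i in range(k)}
    machine_edges = {i: [] for i in range(k)}
    for v, i in home.items():
        machine_vertices[i].add(v)
    for u, v in edges:
        for i in {home[u], home[v]}:          # one copy per distinct machine
            machine_edges[i].append((u, v))
    return machine_vertices, machine_edges

# Example: a 4-cycle split across 2 machines.
verts = [0, 1, 2, 3]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(random_vertex_partition(verts, cycle, k=2))
```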

RVP is Load Balanced

Mapping Lemma: Under RVP of a graph 𝐺 = (𝑉, 𝐸) with 𝑛 vertices and 𝑚 edges, whp,
1. every machine has at most Õ(𝑛/𝑘) vertices, and
2. the number of edges associated with each link is at most Õ(𝑚/𝑘² + Δ/𝑘),
where Δ is the maximum degree in 𝐺.

Proof of part 1 is easy. Just use the Chernoff bound.
Proof of part 2 is more complicated and uses Bernstein's inequality, which we are not covering. So the proof is omitted.

How to design 𝑘-machine algorithms?

Answer: Think like a vertex.
You have two "think like a vertex" (point-to-point message passing) models:
(i) the Congested Clique (CC), and
(ii) the Node Capacitated Clique (NCC).

Simulation:
1. Design an algorithm in CC or NCC with good bounds.
2. Automatically simulate the CC/NCC algorithm in the 𝑘-machine model using the standard simulator.
3. Claim bounds in the 𝑘-machine model.
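For part 1, a hedged sketch of the Chernoff argument, assuming 𝑛/𝑘 = Ω(log 𝑛) (a condition the slide does not state explicitly): the number 𝑋_𝑖 of vertices placed on machine 𝑖 is a sum of 𝑛 independent indicator variables, each equal to 1 with probability 1/𝑘, so E[𝑋_𝑖] = 𝑛/𝑘 and

```latex
\Pr\!\left[ X_i > 2\,\tfrac{n}{k} \right] \;\le\; \exp\!\left( -\tfrac{1}{3}\cdot\tfrac{n}{k} \right) \;\le\; n^{-c}
```

for a constant 𝑐 > 0 whenever 𝑛/𝑘 ≥ 3𝑐 ln 𝑛; a union bound over the 𝑘 ≤ 𝑛 machines then gives 𝑂(𝑛/𝑘) vertices on every machine whp.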

The CC and NCC Models (point to point)

• We have 𝑛 nodes 𝑉 = {1, 2, …, 𝑛}.
• The input graph 𝐺 = (𝑉, 𝐸) is known locally, i.e., each node 𝑣 ∈ 𝑉 knows its incident edges.
• The nodes can communicate via synchronous message passing, but each message must be at most 𝑂(log 𝑛) bits.

CC: Each node can send 𝑛 − 1 messages (one for every other node).
NCC: Each node can send at most 𝑂(log 𝑛) messages to 𝑂(log 𝑛) carefully chosen nodes.

Simulating CC/NCC in the 𝑘-machine model

• Assume there is a hash function ℎ: 𝑉 → {1, 2, …, 𝑘} that is a simple uniform hash function. (Claims will hold under 𝑂(log 𝑛)-universal families.)
• Assume that each node 𝑣 ∈ 𝑉 is placed in machine ℎ(𝑣).
• Each machine 𝑖 now contains (and therefore simulates) the nodes 𝑉_𝑖 = {𝑣 | ℎ(𝑣) = 𝑖}. We know that |𝑉_𝑖| ∈ Õ(𝑛/𝑘) whp.

Simulation of one CC/NCC round (at each machine 𝑖):
1. Machine 𝑖 performs the local computation for all nodes in 𝑉_𝑖 as per the CC/NCC algorithm.
2. The messages to be sent are then individually sent to the machine that holds their respective recipient nodes.
3. Incoming messages are received and handed over to the recipient nodes.
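A sketch of one simulated round (Python, illustrative): nodes are hashed to machines, each machine runs the node-level logic for the nodes it hosts, and node-to-node messages are routed to the machine of the recipient. The callback signature node_step and the message format are assumptions for illustration; the sketch shows the routing structure only and does not account for per-link bandwidth.

```python
def simulate_round(nodes, node_step, h, k, node_state, inbox):
    """Simulate one CC/NCC round on k machines.

    nodes:       iterable of node ids
    node_step:   function(node, state, received) -> (new_state, out) where
                 out is a dict {destination_node: message}
    h:           hash function node -> machine in {0, ..., k-1}
    node_state:  dict node -> state
    inbox:       dict node -> list of messages received in the previous round
    """
    machine_nodes = {i: [] for i in range(k)}
    for v in nodes:
        machine_nodes[h(v)].append(v)

    # Step 1: local computation for every simulated node on its machine.
    outgoing = []                       # (src_node, dst_node, message)
    for i in range(k):
        for v in machine_nodes[i]:
            node_state[v], out = node_step(v, node_state[v], inbox.get(v, []))
            outgoing.extend((v, dst, msg) for dst, msg in out.items())

    # Steps 2-3: route each message to the machine h(dst), which hands it to dst.
    new_inbox = {v: [] for v in nodes}
    for src, dst, msg in outgoing:
        new_inbox[dst].append(msg)      # delivered via machine h(dst)
    return node_state, new_inbox
```

A concrete algorithm supplies node_step, which returns the node's new state and a dict mapping destination nodes to messages.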

Conversion Theorem (CC → 𝑘-machine)

Theorem: Consider a CC algorithm 𝐴 in which each node sends and receives at most Δ′ messages per round. Suppose further that 𝐴 takes 𝑇 communication rounds to complete under CC with a total message complexity of 𝑀. Then, 𝐴 can be simulated in the 𝑘-machine model in
  Õ(𝑀/𝑘² + Δ′𝑇/𝑘 + 𝑇)
rounds.

Corollary 1: Consider an NCC algorithm 𝐴 that takes 𝑇 communication rounds to complete under NCC. Then, 𝐴 can be simulated in the 𝑘-machine model in Õ((𝑛/𝑘² + 1)𝑇) rounds. (Proof is obvious, hence omitted.)

Proof of the Conversion Theorem

Consider round 𝑖 of 𝐴 under CC.
Let 𝐺_𝑖 = (𝑉, 𝐸_𝑖) be the graph on the input node set 𝑉 with edges 𝑒 = (𝑢, 𝑣) ∈ 𝐸_𝑖 iff 𝑢 and 𝑣 communicated during round 𝑖.
Note: the max degree of 𝐺_𝑖 (denoted Δ_𝑖) is at most Δ′.
By the Mapping Lemma, at most Õ(|𝐸_𝑖|/𝑘² + Δ_𝑖/𝑘) messages are communicated across each link while simulating round 𝑖; summing over all 𝑇 rounds gives the claimed bound. QED.

Conversion Theorem (Broadcast CC → NCC)

Theorem: Suppose 𝐴 is an algorithm under CC that performs only broadcast-based communication (i.e., a node sending the same message to every other node) and suppose further that 𝐴 requires 𝑇 communication rounds and 𝑅 broadcasts in total. Then, 𝐴 can be simulated in NCC (point-to-point communication) in 𝑂(𝑅 + 𝑇 log 𝑛) rounds comprising 𝑂(𝑅) broadcast calls.

Corollary: 𝐴 can be simulated in the 𝑘-machine model in
  Õ(𝑅/𝑘 + 𝑇)
rounds.

Proof of Theorem

• Arrange the nodes in NCC in the form of a binary tree wherein each node 𝑖, 𝑖 > 1, has parent ⌊𝑖/2⌋.
• Consider any round 𝑟 of the broadcast CC algorithm with 𝑅_𝑟 ≥ 1 broadcasts.
• Each node that has to broadcast a message upcasts the message to the root in a pipelined fashion. This takes 𝑂(log 𝑛 + 𝑅_𝑟) rounds in NCC. See next slide.
• The messages are then broadcast down to all nodes in the tree in 𝑂(log 𝑛 + 𝑅_𝑟) rounds. (HW)
• Thus, each round 𝑟 takes 𝑂(log 𝑛 + 𝑅_𝑟) rounds. QED
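The theorem's bound is the per-round cost from this proof summed over the 𝑇 rounds: with 𝑅_𝑟 broadcasts in round 𝑟 and ∑_𝑟 𝑅_𝑟 = 𝑅,

```latex
\sum_{r=1}^{T} O(\log n + R_r) \;=\; O\!\Big( T \log n + \sum_{r=1}^{T} R_r \Big) \;=\; O(T \log n + R).
```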

Upcasting messages in a tree (under NCC)

Input: some 𝑅 nodes in the tree have a message each.
Output: those messages must reach the root.

Strawman attempt: Those 𝑅 nodes send their messages directly to the root. This violates the NCC bound on how many messages a node may receive per round.

Correct attempt:
• At the start of a round 𝑟, each node 𝑣 ∈ 𝑉 has a set of messages 𝑈_𝑣. Initially, 𝑈_𝑣 contains the message that 𝑣 wishes to broadcast (and is empty otherwise).
• Then in round 𝑟, each 𝑣 picks one message 𝑥 from 𝑈_𝑣 and sends it to its parent. In turn, it receives up to two messages (say, 𝑦 and 𝑧) from its children. Thus, at the end of the round, 𝑈_𝑣 ← (𝑈_𝑣 ∖ {𝑥}) ∪ {𝑦, 𝑧}.

Homework: How can we adapt this algorithm to ensure each node knows when to terminate the algorithm?

Claim: Upcasting takes 𝑂(log 𝑛 + 𝑅) rounds

• A "clump" is a maximal connected collection of vertices whose 𝑈_𝑣's are non-empty.
• There is at most one clump containing the root. Call it the root clump.
• Each clump has a node closest to the root. Call it the root of the clump.
• Claim: two messages are "friends" if they are part of the same clump. Once two messages become friends, they will remain friends.
  • Consequence: clumps can coalesce, but not break apart.
• Claim: The root of any non-root clump moves closer to the root in each round. This can happen in two ways:
  1. The root moves up and does not coalesce with another clump whose root is higher.
  2. The root moves up and coalesces with another clump whose root is higher.
  • Consequence: Every message will be part of the root clump in 𝑂(log 𝑛) rounds.
• Claim: When a clump becomes the root clump, it will reduce to just the root in at most 𝑅 rounds. Homework: articulate why this is true.
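A sketch of the upcast procedure above (Python, illustrative; the function name and message encoding are assumptions): each node keeps its set 𝑈_𝑣, forwards one pending message per round to its parent ⌊𝑣/2⌋, and absorbs whatever its children forward. The simulation counts the rounds until all messages reach the root.

```python
def upcast_rounds(n, sources):
    """Nodes are 1..n arranged as a binary tree with parent(v) = v // 2.
    'sources' is the set of nodes that each start with one message.
    Every round, each non-root node forwards one pending message to its parent;
    returns the number of rounds until all messages have reached the root."""
    U = {v: set() for v in range(1, n + 1)}
    for v in sources:
        U[v].add(("msg", v))
    at_root = set(U[1]); U[1].clear()              # the root's own message needs no upcast
    rounds = 0
    while len(at_root) < len(sources):
        rounds += 1
        arriving = {}                              # parent -> messages arriving this round
        for v in range(2, n + 1):
            if U[v]:
                x = next(iter(U[v])); U[v].remove(x)
                arriving.setdefault(v // 2, []).append(x)
        for parent, msgs in arriving.items():
            (at_root if parent == 1 else U[parent]).update(msgs)
    return rounds

# Example: 3 messages in a 15-node complete binary tree reach the root in 4 rounds,
# within the O(log n + R) bound claimed above.
print(upcast_rounds(15, sources={9, 12, 15}))
```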

Breadth First Search (Take 1)

• Input under CC: each node is aware of its neighbors. Exactly one node is designated as the root node.
• Output under CC: each node must know its level in the BFS tree, its parent, and its children.

• Algorithm:
1. In the very first round, the root sends a "hello" message to all its neighbors.
2. Every other node waits until it receives a "hello" message; when it receives one for the first time, it sends out a "hello" message to the neighbors from which it did not hear any "hello" message.
   (How would you establish the parent-child relationship? Each non-root node 𝑣 picks an arbitrary node 𝑢 among those that first sent "hello" messages to 𝑣. Node 𝑣 then sends 𝑢 a message saying "hi, I am 𝑣 and I am a child of yours.")
• BFS in CC (without broadcasts) takes 𝑂(𝐷) rounds and 𝑂(𝑚) messages, where 𝐷 is the graph diameter. Here Δ′ = Δ.
• Thus, in the 𝑘-machine model, the round complexity is Õ(𝑚/𝑘² + Δ𝐷/𝑘 + 𝐷).

Can we do better?
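A minimal Python sketch of the "hello"-flooding BFS above (a synchronous round-by-round simulation; function and variable names are illustrative): each node records its level and picks its parent among the first senders of "hello".

```python
from collections import defaultdict

def bfs_hello(adj, root):
    """Synchronous simulation of the 'hello'-flooding BFS described above.
    adj: dict node -> set of neighbours. Returns (level, parent) per node."""
    level, parent = {root: 0}, {root: None}
    frontier = {root}
    rounds = 0
    while frontier:
        rounds += 1
        hellos = defaultdict(list)                 # receiver -> senders this round
        for v in frontier:                         # v sends "hello" to neighbours
            for u in adj[v]:
                if u not in level:                 # u has not been reached yet
                    hellos[u].append(v)
        frontier = set()
        for u, senders in hellos.items():
            level[u] = rounds                      # first time u hears "hello"
            parent[u] = senders[0]                 # pick an arbitrary first sender
            frontier.add(u)
    return level, parent

# Example: a path 0-1-2-3 plus a chord 0-2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(bfs_hello(adj, root=0))   # levels: {0: 0, 1: 1, 2: 1, 3: 2}
```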

Breadth First Search (Take 2)

• Algorithm: the same "hello"-flooding algorithm as in Take 1, but each node's "hello" is sent as a broadcast (the same message to every node); nodes that are not neighbors of the sender simply ignore it.
• BFS in CC now takes 𝑂(𝐷) rounds and 𝑂(𝑛) broadcasts (instead of 𝑂(𝑚) point-to-point messages), where 𝐷 is the graph diameter.
• Thus, in the 𝑘-machine model, the round complexity improves from Õ(𝑚/𝑘² + Δ𝐷/𝑘 + 𝐷) to Õ(𝑛/𝑘 + 𝐷), by the broadcast conversion corollary with 𝑅 = 𝑂(𝑛) and 𝑇 = 𝑂(𝐷).
