21 p2p
15-441
Scaling Problem
• Millions of clients ⇒ server and network meltdown
P2P System
Why p2p?
• Scaling: Create system whose capacity grows with # of clients - automatically!
• Self-managing
– This aspect attractive for corporate/datacenter needs
– e.g., Amazon’s 100,000-ish machines, Google’s 300k+
• Harness lots of “spare” capacity at end-hosts
• Eliminate centralization
– Robust to failures, etc.
– Robust to censorship, politics & legislation??
Today’s Goal
• p2p is hot.
• There are tons and tons of instances
• But that’s not the point
[Diagram: a publisher inserts (Key="title", Value=MP3 data…) at a node in the Internet overlay; a client issues Lookup("title") and the query must somehow reach the node (N4, N5, N6, …) that holds it]
Search Approaches
• Centralized
• Flooding
• A hybrid: Flooding between “Supernodes”
• Structured
Different types of searches
• Needles vs. Haystacks
– Searching for top 40, or an obscure punk track from 1981 that nobody’s heard of?
• Search expressiveness
– Whole word? Regular expressions? File names? Attributes? Whole-text search?
• (e.g., p2p gnutella or p2p google?)
Framework
• Common Primitives:
– Join: how do I begin participating?
– Publish: how do I advertise my file?
– Search: how do I find a file?
– Fetch: how do I retrieve a file?
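These four primitives can be written down as a tiny interface; a minimal sketch in Python (the class and method names are illustrative, not from any real system):

```python
from abc import ABC, abstractmethod

class PeerNode(ABC):
    """Illustrative interface for the four p2p primitives."""

    @abstractmethod
    def join(self, bootstrap_addr):
        """Begin participating, e.g. by contacting a known node."""

    @abstractmethod
    def publish(self, filename, data):
        """Advertise (and/or store) a file so others can find it."""

    @abstractmethod
    def search(self, filename):
        """Return addresses of node(s) believed to have the file."""

    @abstractmethod
    def fetch(self, addr, filename):
        """Retrieve the file's bytes from a node returned by search."""
```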
Centralized
• Centralized Database:
– Join: on startup, client contacts central server
– Publish: reports list of files to central server
– Search: query the server => return node(s) that store the requested file
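A toy sketch of the central-database idea (Napster-style); the in-memory dict stands in for the server’s index and all names are made up:

```python
class CentralIndex:
    """Toy central server: maps filename -> set of peer addresses."""

    def __init__(self):
        self.index = {}                    # the O(N) state lives here

    def publish(self, peer_addr, filenames):
        """Join/Publish: a peer reports the files it holds."""
        for name in filenames:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, filename):
        """Search: a single O(1) dictionary lookup."""
        return self.index.get(filename, set())

# Usage
server = CentralIndex()
server.publish("123.2.21.23", ["X", "Y", "Z"])
print(server.search("X"))                  # {'123.2.21.23'}
```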
Napster Example: Publish
[Diagram: peer 123.2.21.23 publishes "I have X, Y, and Z!" to the central server, which records insert(X, 123.2.21.23), …]
Napster: Search
[Diagram: a client asks the central server "Where is file A?"; the server answers the search(A) query with 123.2.0.18, and the client fetches the file directly from 123.2.0.18]
Napster: Discussion
• Pros:
– Simple
– Search scope is O(1) for even complex searches (one index, etc.)
– Controllable (pro or con?)
• Cons:
– Server maintains O(N) State
– Server does all processing
– Single point of failure
• Technical failures + legal (Napster shut down 2001)
Query Flooding
• Join: Must join a flooding network
– Usually, establish peering with a few existing nodes
• Publish: no need, just reply
• Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender.
– TTL limits propagation
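A minimal sketch of TTL-limited flooding over an overlay represented as an adjacency dict (the data layout and names are illustrative):

```python
def flood_search(nodes, start, filename, ttl=4):
    """nodes: {node_id: {"files": set_of_names, "neighbors": [node_ids]}}.
    Returns the ids of nodes within `ttl` hops that have the file."""
    hits, seen = set(), {start}
    frontier = [start]
    while frontier and ttl >= 0:
        next_frontier = []
        for n in frontier:
            if filename in nodes[n]["files"]:
                hits.add(n)                      # would reply to the sender
            for nbr in nodes[n]["neighbors"]:
                if nbr not in seen:              # don't re-flood a node
                    seen.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
        ttl -= 1                                 # TTL limits propagation
    return hits
```

A small TTL keeps the search cheap but may miss a rare file, which is exactly the haystack-vs-needle tradeoff from the earlier slide.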
Example: Gnutella
[Diagram: the query "Where is file A?" floods from neighbor to neighbor through the Gnutella overlay; nodes that have file A ("I have file A.") send a Reply back along the reverse path]
Flooding: Discussion
• Pros:
– Fully de-centralized
– Search cost distributed
– Processing @ each node permits powerful search semantics
• Cons:
– Search scope is O(N)
– Search time is O(???)
– Nodes leave often, network unstable
• TTL-limited search works well for haystacks.
– For scalability, does NOT search every node. May have to re-issue query later
Supernode Flooding
• Join: on startup, client contacts a “supernode” ... may at some point become one itself
Supernode Network Design
[Diagram: ordinary peers connect to a core of “Super Nodes”, which peer with one another]
Supernode: File Insert
[Diagram: peer 123.2.21.23 tells its supernode "I have X!"; the supernode records insert(X, 123.2.21.23), …]
Supernode: File Search
[Diagram: a client asks its supernode "Where is file A?"; the search(A) query is flooded between supernodes, and replies (123.2.22.50, 123.2.0.18) come back to the client]
Supernode: Which nodes?
• Often, bias towards nodes with good:
– Bandwidth
– Computational Resources
– Availability!
Stability and Superpeers
• Why superpeers?
– Query consolidation
• Many connected nodes may have only a few files
• Propagating a query to a sub-node would take more b/w than answering it yourself
– Caching effect
• Requires network stability
• Superpeer selection is time-based
– How long you’ve been on is a good predictor of how long you’ll be around.
Superpeer results
• Basically, “just better” than flood to all
• Gets an order of magnitude or two better scaling
• But still fundamentally: O(search) * O(per-node storage) = O(N)
– central: O(1) search, O(N) storage
– flood: O(N) search, O(1) storage
– Superpeer: can trade between
Structured Search:
Distributed Hash Tables
• Academic answer to p2p
• Goals
– Guaranteed lookup success
– Provable bounds on search time
– Provable scalability
• Makes some things harder
– Fuzzy queries / full-text search / etc.
• Read-write, not read-only
• Hot Topic in networking since introduction in
~2000/2001
Searching Wrap-Up
DHT: Overview
• Abstraction: a distributed “hash-table” (DHT) data structure:
– put(id, item);
– item = get(id);
• Implementation: nodes in system form a distributed data structure
– Can be Ring, Tree, Hypercube, Skip List, Butterfly Network, ...
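From the application’s point of view it really is just a hash table; a hypothetical client-side sketch (the `dht` handle and its put/get methods stand in for some running overlay):

```python
import hashlib

def to_id(name, bits=16):
    """Hash a filename into the DHT's id space (illustrative parameters)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

# `dht` is assumed to be a handle to a running overlay exposing put/get.
def publish_file(dht, name, data):
    dht.put(to_id(name), data)        # put(id, item): routed to the node owning this id

def find_file(dht, name):
    return dht.get(to_id(name))       # item = get(id): routed the same way, so it finds it
```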
DHT: Overview (2)
• Structured Overlay Routing:
– Join: On startup, contact a “bootstrap” node and integrate yourself into the distributed data structure; get a node id
– Publish: Route publication for file id toward a close node id along the data structure
– Search: Route a query for file id toward a close node id. Data structure guarantees that query will meet the publication.
[Ring diagram (consistent hashing): keys K20 and K80 placed on the identifier circle alongside nodes N90 and N105]
A key is stored at its successor: node with next higher ID
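The successor rule in code, a minimal sketch (node ids roughly match the lookup figure below; assumes a sorted list of ids):

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """Node responsible for key_id: first node id >= key_id, wrapping around."""
    i = bisect_left(node_ids, key_id)          # node_ids must be sorted
    return node_ids[i] if i < len(node_ids) else node_ids[0]

nodes = [10, 60, 90, 105, 120]
print(successor(nodes, 80))    # 90  -> K80 is stored at N90
print(successor(nodes, 124))   # 10  -> wraps around past the top of the ring
```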
DHT: Chord Basic Lookup
[Ring diagram: node N10 asks "Where is key 80?"; the basic lookup follows successor pointers around the ring (N10 → N60 → N90) until it reaches N90, the node responsible for K80; other nodes shown: N105, N120]
DHT: Chord “Finger Table”
[Ring diagram: node N80's fingers point 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring]
• Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i
• In other words, the i-th finger points 1/2^(n-i) of the way around the ring
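A sketch of building a finger table and routing with it; this is simplified Chord-style logic with no networking, and the node ids other than N80 are made up:

```python
def successor(ring, k, m):
    """First node id that succeeds or equals k (ids mod 2^m); ring is sorted."""
    k %= 2 ** m
    return next((x for x in ring if x >= k), ring[0])

def finger_table(n, ring, m):
    """Entry i is the first node that succeeds or equals n + 2^i."""
    return [successor(ring, n + 2 ** i, m) for i in range(m)]

def route(start, key, ring, m):
    """Greedy finger routing: each hop jumps to the farthest finger that does
    not overshoot the key's owner, roughly halving the remaining distance."""
    dist = lambda a, b: (b - a) % 2 ** m            # clockwise distance
    target = successor(ring, key, m)                # key lives at its successor
    cur, path = start, [start]
    while cur != target:
        cands = [f for f in finger_table(cur, ring, m)
                 if dist(cur, f) <= dist(cur, target)]
        cur = max(cands, key=lambda f: dist(cur, f))
        path.append(cur)
    return target, path

m, ring = 7, sorted([3, 32, 45, 80, 96, 112, 120])  # 7-bit id space
print(finger_table(80, ring, m))   # [96, 96, 96, 96, 96, 112, 32]
print(route(80, 20, ring, m))      # (32, [80, 32]) - one finger hop to K20's owner
```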
Node Join
• Compute ID
• Use an existing node to route to that ID in the ring.
– Finds s = successor(id)
• ask s for its predecessor, p
• Splice self into ring just like a linked list
– p->successor = me
– me->successor = s
– me->predecessor = p
– s->predecessor = me
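The same splice written out as a sketch with explicit objects; the Node class and find_successor walk are made up for illustration (real Chord also repairs finger tables and runs periodic stabilization):

```python
class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = self          # a lone node points at itself
        self.predecessor = self

def _in_interval(x, a, b):
    """x in the half-open ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)

def find_successor(start, node_id):
    """Walk successor pointers until node_id falls in (cur, cur.successor]."""
    cur = start
    while True:
        nxt = cur.successor
        if cur is nxt or _in_interval(node_id, cur.id, nxt.id):
            return nxt
        cur = nxt

def join(me, existing):
    s = find_successor(existing, me.id)    # route to s = successor(me.id)
    p = s.predecessor                      # ask s for its predecessor
    p.successor = me                       # splice, just like a linked list
    me.successor = s
    me.predecessor = p
    s.predecessor = me

# Usage: ring of ids 1 and 5; node 3 joins via node 1
a, b, c = Node(1), Node(5), Node(3)
a.successor, a.predecessor = b, b
b.successor, b.predecessor = a, a
join(c, a)
print(a.successor.id, c.successor.id, b.predecessor.id)   # 3 5 3
```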
DHT: Chord Join
• Assume an identifier space [0..8]
[Ring diagram: the identifier circle with positions 0 through 7]
DHT: Chord Join
[Ring diagram: the first nodes join the identifier circle; a successor table (columns i, id+2^i, succ) with entries (3, 1), (4, 1), (6, 1) is shown]
DHT: Chord Join
• Nodes n0, n6 join
[Ring diagram: nodes 0, 1, 2, 6 on the circle, each with its successor table (columns i, id+2^i, succ)]
DHT: Chord Join
• Nodes: n1, n2, n0, n6
• Items: f7, f2
[Ring diagram: each node's successor table is shown; items f7 and f2 are stored at their successor nodes (f7 at n0, f2 at n2)]
DHT: Chord Routing
• Upon receiving a query for item id, a node:
– Checks whether it stores the item locally
– If not, forwards the query to the largest node in its successor table that does not exceed id
[Ring diagram: query(7) is forwarded via the successor tables of nodes 0, 1, 2, 6 until it reaches the node responsible for item 7]
DHT: Chord Summary
• Routing table size?
– Log N fingers
• Routing time?
– Each hop is expected to halve the distance to the desired id => expect O(log N) hops.
DHT: Discussion
• Pros:
– Guaranteed Lookup
– O(log N) per node state and search scope
• Cons:
– This line used to say “not used.” But: now being used in a few apps, including BitTorrent.
– Supporting non-exact match search is (quite!) hard
The limits of search:
A Peer-to-peer Google?
• Complex intersection queries (“the” + “who”)
– Billions of hits for each term alone
• Sophisticated ranking
– Must compare many results before returning a subset to user
• Very, very hard for a DHT / p2p system
– Need high inter-node bandwidth
– (This is exactly what Google does - massive clusters)
• But maybe many file sharing queries are okay...
Fetching Data
• Once we know which node(s) have the data we want...
• Option 1: Fetch from a single peer
– Problem: Have to fetch from peer who has whole file.
• Peers not useful sources until they have downloaded the whole file
• At which point they probably log off. :)
– How can we fix this?
Chunk Fetching
• More than one node may have the file.
• How to tell?
– Must be able to distinguish identical files
– Not necessarily same filename
– Same filename not necessarily same file...
• Use hash of file
– Common: MD5, SHA-1, etc.
• How to fetch?
– Get bytes [0..8000] from A, [8001...16000] from B
– Alternative: Erasure Codes
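A sketch of both ideas: name the file by a hash of its contents (so identical files match regardless of filename), and split the byte range across peers (the peer names and chunk size here are arbitrary):

```python
import hashlib

def content_id(data: bytes) -> str:
    """Identify a file by its contents (SHA-1 here), not by its filename."""
    return hashlib.sha1(data).hexdigest()

def plan_chunks(file_size, peers, chunk_size=8000):
    """Assign byte ranges round-robin across peers known to have the file."""
    plan = []
    for i, start in enumerate(range(0, file_size, chunk_size)):
        end = min(start + chunk_size, file_size) - 1
        plan.append((peers[i % len(peers)], start, end))
    return plan

# e.g. a 16 KB file fetched from peers A and B, roughly as on the slide
print(plan_chunks(16000, ["A", "B"]))
# [('A', 0, 7999), ('B', 8000, 15999)]
```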
BitTorrent: Overview
• Swarming:
– Join: contact centralized “tracker” server, get a list of peers.
– Publish: Run a tracker server.
– Search: Out-of-band. E.g., use Google to find a tracker for the file you want.
– Fetch: Download chunks of the file from your peers. Upload chunks you have to them.
• Big differences from Napster:
– Chunk based downloading (sound familiar? :)
– “few large files” focus
– Anti-freeloading mechanisms
BitTorrent
• Periodically get list of peers from tracker
• More often:
– Ask each peer for what chunks it has
• (Or have them update you)
• Request chunks from several peers at a time
• Peers will start downloading from you
• BT has some machinery to try to bias towards helping those who help you
BitTorrent: Publish/Join
[Diagram: peers contact the Tracker to join the swarm and publish what they have]
BitTorrent: Fetch
[Diagram: peers in the swarm download chunks from, and upload chunks to, several other peers at once]
BitTorrent: Summary
• Pros:
– Works reasonably well in practice
– Gives peers incentive to share resources; avoids freeloaders
• Cons:
– Central tracker server needed to bootstrap swarm
– (Tracker is a design choice, not a requirement, as you know from your projects. Modern BitTorrent can also use a DHT to locate peers. But the approach still needs a “search” mechanism)
Writable, persistent p2p
• Do you trust your data to 100,000 monkeys?
• Node availability hurts
– Ex: Store 5 copies of data on different nodes
– When someone goes away, you must replicate the data they held
– Hard drives are *huge*, but cable modem upload bandwidth is tiny - perhaps 10 Gbytes/day
– Takes many days to upload the contents of a 200GB hard drive (≈ 20 days at 10 GB/day). Very expensive leave/replication situation!
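The back-of-the-envelope arithmetic behind that claim, using the figures on this slide:

```python
data_gb = 200            # contents of the departing node's disk
upload_gb_per_day = 10   # rough cable-modem upload budget from the slide
print(data_gb / upload_gb_per_day, "days to re-replicate one node's data")  # 20.0
```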
What’s out there?
              Central    Flood       Super-node flood    Route
Whole File    Napster    Gnutella                        Freenet
Freenet: Overview
• Routed Queries:
– Join: on startup, client contacts a few other nodes it knows about; gets a unique node id
– Publish: route file contents toward the file id. File is stored at node with id closest to file id
– Search: route query for file id toward the closest node id
– Fetch: when query reaches a node containing file id, it returns the file to the sender
Freenet: Routing Tables
• id – file identifier (e.g., hash of file)
• next_hop – another node that stores the file id
• file – file identified by id being stored on the local node
id | next_hop | file
…  | …        | …
– If file id stored locally, then stop
• Forward data back to upstream requestor
– If not, search for the “closest” id in the table, and forward the message to the corresponding next_hop
– If data is not found, failure is reported back
• Requestor then tries next closest match in routing table
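A sketch of this forwarding rule; ids are plain integers and “closest” is numeric distance here, which simplifies what Freenet really does (all structure and names are illustrative):

```python
def route_query(node, file_id, tried=None):
    """node: {"store": {id: data}, "table": {id: neighbor_node_dict}}.
    Returns the data, or None if every path fails."""
    tried = tried if tried is not None else set()
    tried.add(id(node))                      # builtin id() marks visited nodes
    if file_id in node["store"]:
        return node["store"][file_id]        # stored locally: stop, send data back
    # try routing-table entries closest to the requested id first
    for entry in sorted(node["table"], key=lambda k: abs(k - file_id)):
        nxt = node["table"][entry]
        if id(nxt) in tried:
            continue                         # don't revisit a node on this query
        data = route_query(nxt, file_id, tried)
        if data is not None:
            return data                      # data flows back along the query path
    return None                              # failure reported back to requestor
```

The loop continuing after a failed recursive call is the “requestor then tries next closest match” behavior from the slide.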
Freenet: Routing
[Diagram: query(10) issued at n1 is routed hop by hop (n1 → n2 → n3 → …) using each node's (id, next_hop, file) routing table, backtracking to the next-closest entry when a branch fails, until the node storing f10 is reached]
Freenet: Routing Properties
• “Close” file ids tend to be stored on the same node
– Why? Publications of similar file ids route toward the same place
• Network tends to be a “small world”
– Small number of nodes have large number of neighbors (i.e., ~ “six-degrees of separation”)
• Consequence:
– Most queries only traverse a small number of hops to find the file
Freenet: Anonymity & Security
• Anonymity
– Randomly modify source of packet as it traverses the network
– Can use “mix-nets” or onion-routing
• Security & Censorship resistance
– No constraints on how to choose ids for files => easy to have two files collide, creating “denial of service” (censorship)
– Solution: have an id type that requires a private key signature that is verified when updating the file
– Cache file on the reverse path of queries/publications => an attempt to “replace” a file with bogus data will just cause the file to be replicated more!
Freenet: Discussion
• Pros:
– Intelligent routing makes queries relatively short
– Search scope small (only nodes along search path involved); no flooding
– Anonymity properties may give you “plausible deniability”
• Cons:
– Still no provable guarantees!
– Anonymity features make it hard to measure, debug
BitTorrent: Sharing Strategy
• Employ “Tit-for-tat” sharing strategy
– A is downloading from some other people
• A will let the fastest N of those download from him
– Be optimistic: occasionally let freeloaders download
• Otherwise no one would ever start!
• Also allows you to discover better peers to download from when they reciprocate
(a sketch of this choking logic appears below)
• Goal: Pareto Efficiency
– Game Theory: “No change can make anyone better off without making others worse off”
– Does it work? (don’t know!)
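A rough sketch of the choking idea described above: unchoke the N peers currently giving us the best download rates, plus one random optimistic slot (the numbers and names are illustrative, not BitTorrent’s exact algorithm):

```python
import random

def choose_unchoked(download_rate, n_slots=4):
    """download_rate: {peer: bytes/s we are currently receiving from them}.
    Returns the set of peers we will upload to this round."""
    # tit-for-tat: reward the peers who are currently giving us the most
    best = sorted(download_rate, key=download_rate.get, reverse=True)[:n_slots]
    unchoked = set(best)
    # optimistic unchoke: give one other peer a chance so newcomers can
    # bootstrap and we can discover faster partners
    others = [p for p in download_rate if p not in unchoked]
    if others:
        unchoked.add(random.choice(others))
    return unchoked

rates = {"p1": 900, "p2": 150, "p3": 600, "p4": 20, "p5": 0, "p6": 300}
print(choose_unchoked(rates, n_slots=3))   # {p1, p3, p6} plus one random other
```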