Lecture 22:
Peer-to-Peer (P2P) Databases
Nov. 10, 2006
ChengXiang Zhai
Most slides are taken from the following presentations:
[Joe Hellerstein 04] http://db.cs.berkeley.edu/jmh/talks/vldb04-p2ptut-final.ppt
[Aline Viana et al. 03] http://www.euronetlab.com/seminar/viana_040703.ppt
[Ryan Huebsch 03] http://www.cs.berkeley.edu/~kubitron/courses/cs294-4-F03/slides/lec20-dbp2p.ppt
CS511 Advanced Database Management Systems 1
What is Peer-to-Peer (P2P)?
P2P is a class of applications that take advantage of
resources – storage, cycles, content, human presence –
available at the edges of the Internet.
Clay Shirky (www.shirky.com)
P2P refers to a class of systems and applications that
employ distributed resources to perform a critical
function in a decentralized manner.
Milojicic et al. (HP)
Napster? Gnutella? TVUPlayer?
http://www.tvunetworks.com/player/index.html
P2P Properties
• No central control, no central database
– Deployable in an ad-hoc fashion
• No hierarchy
– Every node is both a client and a server
– The communication between peers is symmetric
• No global view of the system (all local decisions)
• Peers are autonomous
• System globally unreliable
– Robustness and security issues
Examples of p2p usage
• File-sharing applications
• Distributed databases
• Distributed computing
• Collaboration
• Distributed games
• Ad hoc networks
• Application-level multicast
• Etc.
Basic P2P
Centralized model (Napster)
• File-sharing system
• Almost distributed system
– The location of a document is centralized
– The "transfer" is peer-to-peer
• Problems
– Robustness
– Scalability
Centralized model (Napster)
[Figure: a peer registers document x with the location server; peer Z asks the server “Document x?” and is answered “OK: IP = a.b.c.d”; Z then fetches x directly from that peer over the Internet]
Non-structured system (Gnutella-like)
• Two phases (like Napster)
– Localization + exchange
• No server
• Open source
– gnutella.wego.com
• Distributed search
– The query is flooded
– Loop avoidance
Gnutella
Lessons and Limitations
• Client-Server performs well
– But not always feasible
• Ideal performance is often not the key issue!
• Things that flood-based systems do well
– Organic scaling
– Decentralization of visibility and liability
– Finding popular stuff
– Fancy local queries
• Things that flood-based systems do poorly
– Finding unpopular stuff [Loo, et al VLDB 04]
– Fancy distributed queries
– Vulnerabilities: data poisoning, tracking, etc.
– Guarantees about anything (answer quality, privacy, etc.)
Gossip Protocols (Epidemic Algorithms)
• Originally targeted at database replication [Demers, et al. PODC ‘87]
– Especially nice for unstructured networks
– Rumor-mongering: propagate newly-received update to k random neighbors
• Extended to routing
– Point-to-point routing [Vahdat/Becker TR, ‘00]
– Rumor-mongering of queries instead of flooding [Haas, et al Infocom ‘02]
• Extended to aggregate computation [Kempe, et al, FOCS 03]
• Mostly theoretical analyses
– Usually of two forms:
• What is the “tipping point” where an epidemic infects the whole population? (Percolation theory)
• What is the expected # of messages for infection?
• A Cornell specialty
– Demers, Kleinberg, Gehrke, Halpern, …
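As a rough illustration of rumor-mongering, the sketch below simulates push gossip and counts rounds and messages until every node has the update. The function name `rumor_monger`, the fully connected topology and the round-based timing are assumptions made for this example; they are not taken from the cited papers.

```python
import random

def rumor_monger(n, k, seed=0, max_rounds=10_000):
    """Simulate push gossip among n nodes; return (rounds, messages)."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    infected = {0}                     # node 0 receives the update first
    rounds = messages = 0
    while len(infected) < n:
        if rounds == max_rounds:
            raise RuntimeError("update did not reach every node")
        newly = set()
        for node in infected:
            peers = [p for p in range(n) if p != node]
            for target in rng.sample(peers, k):   # k distinct random peers
                messages += 1
                newly.add(target)
        infected |= newly
        rounds += 1
    return rounds, messages

rounds, messages = rumor_monger(n=64, k=2)
print(f"64 nodes infected after {rounds} rounds and {messages} messages")
```

With k = 2 the infected set can at most triple per round, so at least 4 rounds are needed for 64 nodes; the message count shows the price paid for redundancy.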
Why P2P Databases?
Infecting the Network, Peer-to-Peer
• The Internet is hard to change.
• But Overlay Nets are easy!
– P2P is a wonderful “host” for infecting network designs
– The “next” Internet is likely to be very different
• “Naming” is a key design issue today
• Querying and data independence key tomorrow?
• Don’t forget:
– The Internet was originally an overlay on the telephone network
– There is no money to be made in the bit-shipping business
• A modest goal for DB research:
– Don’t query the Internet.
– Be the Internet.
Why Databases?
• The problem is placement and retrieval of data…
that would be a data management (or DB) problem
• P2P world is lacking
– Semantics
– Data transformation
– Data relationships
• All of which are core strengths of the DB community
• P2P brings a new environment for DB query
processing systems
– increased scalability, reliability, and performance
Some of the p2p DB groups
• PIER
– http://pier.cs.berkeley.edu
• Stanford Peers
– http://www-db.stanford.edu/peers/
• P-Grid
– http://www.p-grid.org/ (EPFL)
• Pepper
– http://www.cs.cornell.edu/database/pepper/pepper.htm
• BestPeer (PeerDB)
– http://xena1.ddns.comp.nus.edu.sg/p2p/
• Hyperion
– http://www.cs.toronto.edu/db/hyperion/
• Piazza
– http://data.cs.washington.edu/p2p/piazza/
PIER
• Peer-to-Peer Information Exchange & Retrieval
– Aggressively uses DHTs
• Deployed
– Running queries on ~400 nodes around the world (PlanetLab)
– Simulated on up to 10K nodes
• Current Applications
– Improved Filesharing
– Internet Monitoring
– Customizable Routing via Recursive Queries
Vision: Network Oracle
• Suppose there existed a Network Oracle
– Answering questions about current Internet state
• Routing tables, link loads, latencies, firewall events, etc.
– How would this change things?
• Social change (Public Health, safe computing)
• Medium term change in distributed application design
– Currently distributed apps do some of this on their own
• Long term change in network protocols
– App-specific custom routing
– Fault diagnosis
– Etc.
Public Health for the Internet
• Security tools focused on “medicine”
– Vaccines for Viruses
– Improving the world one patient at a time
• Weakness/opportunity in the “Public Health” arena
– Public Health: population-focused, community-oriented
– Epidemiology: incidence, distribution, and control in a population
• A New Approach
– Perform population-wide measurement
– Enable massive sharing of data and query results
• The “Internet Screensaver”
– Engage end users: education and prevention
– Understand risky behaviors, at-risk populations.
• Prototype running over PIER
Routing: Overlay networks
[Figure: an overlay network drawn on top of the underlying IP network]
Structured Overlays: Distributed Hash Tables (DHTs)
High-Level Idea: Indirection
• Indirection in space
– Logical (content-based) IDs, routing to those IDs
• “Content-addressable” network
– Tolerant of churn
• nodes joining and leaving the network
[Figure: a message addressed “to h” reaches node y while h = y, and node z once h = z]
• Indirection in time
– Want some scheme to temporally decouple send and receive
– Persistence required. Typical Internet solution: soft state
• Combo of persistence via storage and via retry
• Metaphor: Distributed Hash Table
What is a DHT?
• Hash Table
– data structure that maps “keys” to “values”
– essential building block in software systems
• Distributed Hash Table (DHT)
– similar, but spread across the Internet
• Interface
– insert(key, value)
– lookup(key)
How?
Every DHT node supports a single operation:
– Given key as input; route messages toward node
holding key
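A minimal sketch of the insert/lookup interface, assuming a consistent-hashing layout in which each key belongs to the first node whose ID is at or after the key's ID on a ring. The class `ToyDHT`, the helper `ident` and the 32-bit ID space are hypothetical; the whole ring lives in one process here, so the hop-by-hop routing that a real DHT performs is replaced by a direct search.

```python
import hashlib
from bisect import bisect_left

def ident(name, bits=32):
    """Hash a string to an integer ID in [0, 2**bits)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

class ToyDHT:
    def __init__(self, node_names):
        # Node IDs sorted around the ring; each node gets a local table.
        self.ring = sorted(ident(n) for n in node_names)
        self.store = {node_id: {} for node_id in self.ring}

    def owner(self, key):
        """First node ID at or after the key's ID, wrapping around the ring."""
        i = bisect_left(self.ring, ident(key))
        return self.ring[i % len(self.ring)]

    def insert(self, key, value):
        self.store[self.owner(key)][key] = value

    def lookup(self, key):
        return self.store[self.owner(key)].get(key)

dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
dht.insert("K1", "V1")
print(dht.lookup("K1"))
```

Both operations reduce to the single primitive on the slide: find the node responsible for a key.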
DHT in action
[Figure: overlay nodes, each holding a local table of (K, V) pairs]
Operation: take key as input; route messages to node holding key
DHT in action: put()
[Figure: insert(K1,V1) is routed through the overlay and the pair (K1,V1) ends up stored at one node]
Operation: take key as input; route messages to node holding key
DHT in action: get()
[Figure: retrieve (K1) is routed through the overlay toward the node holding K1]
Operation: take key as input; route messages to node holding key
Iterative vs. Recursive Routing
Previously showed recursive.
Another option: iterative
[Figure: retrieve (K1) resolved iteratively across the overlay nodes]
Operation: take key as input; route messages to node holding key
DHT Design Goals
• An “overlay” network with:
– Flexible mapping of keys to physical nodes
– Small network diameter
– Small degree (fanout)
– Local routing decisions
– Robustness to churn
– Routing flexibility
– Decent locality (low “stretch”)
• A “storage” or “memory” mechanism with
– No guarantees on persistence
– Maintenance via soft state
An Example DHT: Chord
• Assume n = 2^m nodes for a moment
– A “complete” Chord ring
– We’ll generalize shortly
• Overlaid 2^k-Gons
Routing in Chord
• At most one of each Gon
• E.g. 1-to-0
• What happened?
– We constructed the binary number 15!
– Routing from x to y is like computing (y - x) mod n by summing powers of 2
[Figure: Chord ring with the jumps of the 1-to-0 route labeled 1, 2, 4 and 8]
Diameter: log n (1 hop per gon type)
Degree: log n (one outlink per gon type)
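The 1-to-0 example can be traced with a short sketch. It assumes a complete ring of n = 2^m nodes with m = 4 (so that the distance from 1 to 0 is 15) and takes the longest outstanding jump first; the function name `chord_route` and that jump order are choices made for this illustration.

```python
def chord_route(x, y, m):
    """Greedy route from node x to node y on a complete ring of 2**m nodes."""
    n = 2 ** m
    path = [x]
    distance = (y - x) % n
    for k in reversed(range(m)):       # try the longest jump (2**k) first
        if distance >= 2 ** k:
            x = (x + 2 ** k) % n
            distance -= 2 ** k
            path.append(x)
    return path

print(chord_route(1, 0, 4))   # [1, 9, 13, 15, 0]
```

Each power of 2 is used at most once, so the hop count equals the number of 1 bits in (y - x) mod n and never exceeds m = log n.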
Content-Addressable Networks (CAN)
• Provides a large scale distributed hash table
– Keys are mapped into values
• CAN defines a d-dimensional virtual space
– No relationship with the physical space
• The virtual space is completely distributed among the
peers
– Each peer is responsible for one share of the space
– The peer that is responsible for region R is also
responsible for the values inside R
• Documents must be uniquely identified
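A small sketch of the key-to-peer mapping, assuming d = 2 and a hand-written set of rectangular zones that tile the unit square. The helper names `to_point` and `owner`, the use of SHA-1 to derive coordinates, and the three zones are hypothetical; CAN itself only requires that each point of the virtual space fall in exactly one peer's region.

```python
import hashlib

def to_point(doc_id, d=2):
    """Hash a document ID to a point in the d-dimensional unit cube (d <= 5)."""
    digest = hashlib.sha1(doc_id.encode()).digest()
    # Four bytes of the digest per coordinate, scaled into [0, 1).
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2 ** 32
        for i in range(d)
    )

# Hypothetical zones: peer -> (lower corner, upper corner).
ZONES = {
    "peer1": ((0.0, 0.0), (0.5, 0.5)),
    "peer2": ((0.5, 0.0), (1.0, 0.5)),
    "peer3": ((0.0, 0.5), (1.0, 1.0)),
}

def owner(point, zones=ZONES):
    """Return the peer whose region contains the point."""
    for peer, (lo, hi) in zones.items():
        if all(l <= c < h for l, c, h in zip(lo, point, hi)):
            return peer
    raise ValueError("zones do not cover the point")

print(owner(to_point("document x")))
```

Because the point depends only on the document's identifier, any peer can compute where a document lives without knowing who currently owns that region.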
Example
[Figure sequence: the virtual space is progressively divided among numbered peers (1, 2, 4, 5, 6, 7)]
Association ID → node
[Figure: the virtual space divided among peers 1, 3, 4, 5, 6 and 7]
Ex: Node 3 holds this document
Network Data Independence
SIGMOD Record, Sep. 2003
Recall Codd’s Data Independence
• Decouple app-level API from data organization
– Can make changes to data layout without modifying
applications
– Simple version: location-independent names
– Fancier: declarative queries
The Pillars of Data Independence
• Indexes
– Value-based lookups have to compete with direct access
– Must adapt to shifting data distributions
– Must guarantee performance
– DBMS: B-Tree
– P2P: Content-Addressable Overlay Networks (DHTs)
• Query Optimization
– Support declarative queries beyond lookup/search
– Must adapt to shifting data distributions
– Must adapt to changes in environment
– DBMS: Join Ordering, AM Selection, etc.
– P2P: Multiquery dataflow sharing?
Complex Query Processing
DHTs Gave Us Equality Lookups
• What else might we want?
– Range Search
– Aggregation
– Group By
– Join
– Intelligent Query Dissemination
• Theme
– All can be built elegantly on DHTs!
• This is the approach taken in PIER
– But in some instances other schemes are also reasonable
Range Search
• Numerous proposals in recent years
– Chord w/o hashing, + load-balancing [Karger/Ruhl SPAA ‘04, Ganesan/Bawa VLDB ‘04]
– Mercury [Bharambe, et al. SIGCOMM ‘04]. Specialized “small-world” DHT.
– P-tree [Crainiceanu et al. WebDB ‘04]. A “wrapped” B-tree variant.
– P-Grid [Aberer, CoopIS ‘01]. A distributed trie with random links.
– …
Aggregation
• Two key observations for DHTs
– DHTs are multi-hop, so hierarchical aggregation can
reduce BW
– DHTs provide tree construction in a very natural way
Consider Aggregation in Chord
• Everybody sends their
message to node 0
• Assume greedy jumps
(increasing Gon-order)
• Intercept messages and
aggregate along the way
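The effect of intercepting and aggregating can be sketched on a complete ring of n = 2^m nodes. The code assumes each node forwards toward node 0 by taking its shortest outstanding jump first (one reading of “increasing Gon-order”), so each node has exactly one parent; the names `parent` and `aggregate_sum` and the SUM aggregate are choices made for this example.

```python
def parent(x, n):
    """Next hop from x toward node 0, taking the shortest outstanding jump first."""
    d = (-x) % n                 # clockwise distance from x to node 0
    return (x + (d & -d)) % n    # d & -d isolates the lowest set bit of d

def aggregate_sum(values):
    """Sum one value per node at node 0; return (total, messages sent)."""
    n = len(values)              # assumed to be a power of two
    partial = list(values)
    # Nodes with more hops left send first, so every node has heard from
    # all of its children before it forwards its own partial sum.
    order = sorted(range(1, n),
                   key=lambda x: bin((-x) % n).count("1"), reverse=True)
    messages = 0
    for x in order:
        partial[parent(x, n)] += partial[x]
        messages += 1
    return partial[0], messages

print(aggregate_sum(list(range(16))))                            # (120, 15)
print(sum(bin((-x) % 16).count("1") for x in range(1, 16)))      # 32
```

With aggregation every node sends a single message (15 in total for 16 nodes); forwarding each value unmerged along the same routes would cost 32 hop-messages.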
So what if I don’t have a DHT?
• Need another tree-construction mechanism
– There are many in the NW literature (e.g. for multicast)
– Require maintenance messages akin to DHTs
• Do you maintain for the life of your query engine? Or setup/teardown as
needed?
• Can pick a tree shape of your own
– Not at the mercy of the DHT topologies
– E.g. could do high fan-in trees to minimize latency
• Or, can do aggregation via gossip [Kempe, et al FOCS ‘03]
Group By
• A piece of cake in a DHT
– Every node sends tuples toward the hash ID of the
grouping columns
– An agg tree is naturally constructed per group
• Note nice dual-purpose use of DHT
– Hash-based partitioning for parallel group by
• Just like parallel DBMS (Gamma, the Exchange op in Volcano)
– Agg tree construction in multi-hop overlay network
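A sketch of the hash-partitioning half of this idea, computing COUNT(*) per group. Nodes are modeled as list indices and the “send toward the hash ID” step as a direct append, so the multi-hop aggregation tree is not simulated; `home_node` and `distributed_count` are hypothetical names.

```python
import hashlib
from collections import defaultdict

def home_node(group_key, num_nodes):
    """Node responsible for a group: hash of the grouping value, modulo node count."""
    digest = hashlib.sha1(repr(group_key).encode()).digest()
    return int.from_bytes(digest, "big") % num_nodes

def distributed_count(local_tuples, num_nodes):
    """local_tuples[i] holds the (group, value) tuples stored at node i."""
    per_node = [defaultdict(int) for _ in range(num_nodes)]
    for tuples in local_tuples:
        for group, _value in tuples:
            # Every tuple of a group lands on the same node, which keeps the count.
            per_node[home_node(group, num_nodes)][group] += 1
    return per_node

data = [[("a", 1), ("b", 2)], [("a", 3)], [("b", 4), ("c", 5)]]
for node, counts in enumerate(distributed_count(data, 3)):
    print(node, dict(counts))
```

Each group's final count sits on exactly one node, which is the same partitioning a parallel DBMS would use.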
Hash Join
• We just did hash-based group by.
• Hash-based join is roughly the same deal, twice:
– Given R.a Join S.b
– Each node:
• sends each R tuple toward H(R.a)
• sends each S tuple toward H(S.b)
• Again, DHT gives
– Hash-based partitioning for parallel hash join
– Tree construction (no reduction along the way here, though)
• Note the resulting communication pattern
– A tree is constructed per hash destination!
• That’s a lot of trees!
• No big deal for the DHT -- it already had that topology there.
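The rehash-then-join pattern can be sketched in a few lines. Tuples are plain dicts, Python's built-in `hash` stands in for H, and nodes are modeled as dictionary keys rather than real peers; `dht_hash_join` is a hypothetical name.

```python
from collections import defaultdict

def dht_hash_join(r_tuples, s_tuples, num_nodes):
    """Join R and S on R.a = S.b after rehashing both on the join attribute."""
    r_at = defaultdict(list)   # node -> R tuples sent there
    s_at = defaultdict(list)   # node -> S tuples sent there
    for r in r_tuples:
        r_at[hash(r["a"]) % num_nodes].append(r)   # send toward H(R.a)
    for s in s_tuples:
        s_at[hash(s["b"]) % num_nodes].append(s)   # send toward H(S.b)

    results = []
    for node in range(num_nodes):
        # Local hash join: build on R, probe with S.
        build = defaultdict(list)
        for r in r_at[node]:
            build[r["a"]].append(r)
        for s in s_at[node]:
            for r in build.get(s["b"], []):
                results.append((r, s))
    return results

R = [{"a": 1, "x": "r1"}, {"a": 2, "x": "r2"}]
S = [{"b": 2, "y": "s1"}, {"b": 2, "y": "s2"}, {"b": 3, "y": "s3"}]
print(dht_hash_join(R, S, 4))
```

Matching tuples always meet because equal join values hash to the same node; no node ever needs to see the whole of R or S.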
Query Dissemination
• How do nodes find out about a query?
• Case 1: Broadcast
– All nodes need to participate
– Need to have a broadcast tree out of the query node
– This is the opposite of an aggregation tree!
• But how to instantiate it?
• Naïve solution: Flood
– Each node sends query to all its neighbors
– Problem: nodes will receive query multiple times
• wasted bandwidth
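To make the wasted bandwidth concrete, the sketch below floods a query over a small graph and counts how many deliveries reach a node that already has it. The function `flood` and the 4-node ring are hypothetical; each node forwards to all of its neighbors, including the one it heard from.

```python
from collections import deque

def flood(neighbors, source):
    """Flood from source; return (nodes reached, messages sent, duplicate receipts)."""
    seen = {source}
    queue = deque([source])
    messages = duplicates = 0
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            messages += 1
            if peer in seen:
                duplicates += 1      # peer already has the query
            else:
                seen.add(peer)
                queue.append(peer)
    return len(seen), messages, duplicates

ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(flood(ring, 0))   # (4, 8, 5)
```

On this ring, 8 messages are sent to inform 3 new nodes; a broadcast tree would need only 3.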
Security and Trust
Trustworthy P2P
• Many challenges here. Examples:
– Authenticating peers
– Authenticating/validating data
• Stored (poisoning) and in flight
– Ensuring communication
– Validating distributed computations
– Avoiding Denial of Service
• Ensuring fair resource/work allocation
– Ensuring privacy of messages
• Content, quantity, source, destination
– Abusing the power of the network
– …
Free Riders
• Filesharing studies
– Lots of people download
– Few people serve files
• Is this bad?
– If there’s no incentive to serve, why do people do so?
– What if there are strong disincentives to being a major
server?
Simple Solution: Thresholds
• Many programs allow a threshold to be set
– Don’t upload a file to a peer unless it shares > k files
• Problems:
– What’s k?
– How to ensure the shared files are interesting?
BitTorrent
• Server-based search
– suprnova.org, chat rooms, etc. serve “.torrent” files
• metadata including “tracker” machine for a file
• Bartered “Tit for Tat” download bandwidth
– Download one (random) chunk from a storage peer, slowly
– Subsequent chunks bartered with concurrent downloaders
• As tracked by the tracker for the file
– The more chunks you can upload, the more you can download
• Download speed starts slow, then goes fast
– Great for large files
• Mostly videos, warez
One Slide on Game Theory
• Typical game theory setup
– Assume self-interested (selfish) parties, acting autonomously
– Define some benefit & cost functions
– Parties make “moves” in the game
• With resulting costs and benefits for themselves and others
– A Nash equilibrium:
• A state where no party increases its benefit by moving
• Note:
– Equilibria need not be unique nor equal
– Time to equilibrium is an interesting computational twist
• Mechanism Design
– Design the states/moves/costs/benefits of a game
– To achieve particular globally-acceptable equilibria
• I.e. selfish play leads to global good
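A tiny worked example of the equilibrium definition, using a two-player file-sharing game. The moves and payoff numbers are invented for illustration, and `pure_nash` is a hypothetical helper that checks only pure strategies.

```python
from itertools import product

def pure_nash(payoffs, moves):
    """payoffs[(a, b)] = (benefit to player 1, benefit to player 2)."""
    equilibria = []
    for a, b in product(moves, repeat=2):
        u1, u2 = payoffs[(a, b)]
        # Neither player can increase its own benefit by changing its move alone.
        best1 = all(payoffs[(alt, b)][0] <= u1 for alt in moves)
        best2 = all(payoffs[(a, alt)][1] <= u2 for alt in moves)
        if best1 and best2:
            equilibria.append((a, b))
    return equilibria

MOVES = ("share", "freeride")
PAYOFFS = {
    ("share", "share"): (2, 2),
    ("share", "freeride"): (-1, 3),
    ("freeride", "share"): (3, -1),
    ("freeride", "freeride"): (0, 0),
}
print(pure_nash(PAYOFFS, MOVES))   # [('freeride', 'freeride')]
```

With these payoffs the only equilibrium is mutual free-riding, even though mutual sharing is better for both; changing the payoffs so that sharing becomes the equilibrium is the mechanism-design problem.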
DAMD P2P!
• Distributed Algorithmic Mechanism Design (DAMD)
– A natural approach for P2P
• An Example: Fair-share storage [Ngan, et al., Fudico04]
– Every node n maintains a usage record:
• Advertised capacity
• Hosted list of objects n is hosting (nodeID, objID)
• Published list of objects people host for n (nodeID, objID)
– Can publish if capacity - p⋅∑(published list) > 0
• Recipient of publish request should check n’s usage record
– Need schemes to authenticate/validate usage records
• Selfish Audits: n periodically checks that the elements of its hosted list appear
in published lists of publishers
• Random Audits: n periodically picks a peer and checks all its hosted list items
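A sketch of the usage record and the selfish audit. The slide lists entries as (nodeID, objID); the per-object size field, the default p = 1.0 and the names `UsageRecord`, `can_publish` and `selfish_audit` are assumptions added so that the capacity test has something to sum. Authentication of records is left out.

```python
from dataclasses import dataclass, field

@dataclass
class UsageRecord:
    capacity: int                                   # advertised capacity
    hosted: list = field(default_factory=list)      # (publisher, objID, size) this node hosts
    published: list = field(default_factory=list)   # (host, objID, size) hosted for this node

def can_publish(record, p=1.0):
    """The slide's test: capacity - p * sum(published list) > 0."""
    used = sum(size for _host, _obj, size in record.published)
    return record.capacity - p * used > 0

def selfish_audit(n_id, record, records):
    """Hosted entries that the publisher does not list as published at n_id."""
    missing = []
    for publisher, obj, size in record.hosted:
        if (n_id, obj, size) not in records[publisher].published:
            missing.append((publisher, obj))
    return missing

a = UsageRecord(capacity=10, published=[("B", "o1", 4)])
b = UsageRecord(capacity=10, hosted=[("A", "o1", 4)])
print(can_publish(a), selfish_audit("B", b, {"A": a, "B": b}))
```

A publisher that omits an object from its published list to understate its usage is caught when the hosting node runs the audit.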
Secure Routing in DHTs
• The “Sybil” attack [Douceur, IPTPS 02]
– Register many times with multiple identities
– Control enough of the space to capture particular traffic
Squelching Sybil
• Certificate authority
– Centralize one thing: the signing of ID certificates
• Central server is otherwise out of the loop
– Or have an “inner ring” of trusted nodes do this
• Using practical Byzantine agreement protocols [Castro/Liskov OSDI ‘01]
• Weak secure IDs
– ID = SHA-1(IP address)
– Assume attacker controls a modest number of nodes
– Before routing through a node, challenge it to produce the right
IP address
• Requires iterative routing
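The weak secure ID check can be sketched with the standard library. Hashing the dotted-decimal string of the address is an assumption (the slide does not say how the IP is encoded), and `node_id` and `passes_challenge` are hypothetical names; the addresses are from a documentation range.

```python
import hashlib

def node_id(ip_address):
    """ID = SHA-1(IP address), here over the textual form of the address."""
    return hashlib.sha1(ip_address.encode("ascii")).hexdigest()

def passes_challenge(claimed_id, observed_ip):
    """Accept a next hop only if its claimed ID matches the IP we reached it at."""
    return node_id(observed_ip) == claimed_id

honest = node_id("192.0.2.10")
print(passes_challenge(honest, "192.0.2.10"))   # True
print(passes_challenge(honest, "192.0.2.11"))   # False
```

A node cannot choose where it sits in the ID space without controlling the matching IP address, and the querying node has to contact each hop itself to observe that address, which is why routing must be iterative.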
Piazza
• Peers form small groups called spheres of
cooperation.
– May follow administrative boundaries
– Spheres of cooperation are nested
• Query Optimization problems:
– Exploit commonalities between queries
– Decide where to place data
– What queries to materialize (store answers)
• To make the problem tractable, optimization occurs
within a sphere of cooperation.
Piazza II
Piazza III
• Propagating Information
– Node advertises its materialized views to its neighbors
– Nodes consolidate info they receive and propagate
– Type of gossiping protocol
• Consolidating Queries
– Some queries cannot be evaluated if data is not locally available
– Broadcast all un-evaluatable queries to local sphere of
cooperation, and try to answer them collectively
Data Placement Problem
• Setup
– Set of cooperating nodes (no adversaries)
– Bottlenecks: network, CPU, or memory
– Nodes serve four roles
• Data Origin – producers
• Storage Provider
• Query Evaluator
• Query Initiator – consumers
– Cost of query = (Origin or Storage → Evaluator) + (Evaluator → Initiator)
Design Choices
• Scope of decision making
– Global (hard, optimal) or local (easy, short-sighted)
– Similar to multi-query optimization
• Extent of knowledge sharing
– Knowledge of materialized views on other nodes (a
catalog)
– Centralized or distributed? Hierarchical (like DNS)?
• Heterogeneity of information sources
– Few authoritative sources, lots of data producers
– Heterogeneous data → different schemas
Design Choices II
• Dynamicity of participants
– Node churn
– Some nodes act like servers, some like workstations
– Could place all data on servers → reduced flexibility and performance
• Data granularity
– Atomic granularity → indivisible objects (complete file)
– Hierarchical granularity → groups (albums, directories)
– Value based granularity → objects composed of atomic values (tuples composed of values)
Design Choices III
• Degrees of replication
– One copy all the way to fully replicated
– More replicas make updates harder
– Also make retrieval harder (more choices)
– Consistency is harder, typical solution is to have a
master replica
• Freshness and update consistency
– Invalidation messages, pushed by server on
update or pulled by client on request
– Timeout based, lower overhead, looser guarantees
about freshness and consistency
Metareferences
• Your favorite search engine should find the inline refs
• Project IRIS has a lot of participants’ papers online
– http://www.project-iris.org
• IEEE Distributed Systems Online
– http://dsonline.computer.org/os/related/p2p/
• O’Reilly OpenP2P
– http://www.openp2p.com
• Karl Aberer’s ICDE 2002 tutorial
– http://lsirpeople.epfl.ch/aberer/Talks/ICDE2002-Tutorial.pdf
• Ross/Rubenstein InfoCom 2003 tutorial
– http://cis.poly.edu/~ross/tutorials/P2PtutorialInfocom.pdf
• PlanetLab
– http://www.planet-lab.org
• OpenDHT
– http://www.opendht.org
PlanetLab
• Consortium of academia and industry
– Catalyzed by Intel Research in 2002
– Now hosted at Princeton U
– 25% of SOSP ‘03 papers used PlanetLab
• DB folks should get more involved!
OpenDHT
• A shared DHT service
– The Bamboo DHT
– Hosted on PlanetLab
– Simple RPC API
– You don’t need to deploy or host to play with a real DHT!
• A playground for killer apps?
– Needn’t be as big as PIER!
– Example: FreeDB replacement
• Research in sharing DHT svc!
The DB Community Has Much to Offer
• Complex (multi-operator) queries & optimization
– NW folks have tended to build single-operator “systems”
• E.g. aggregation only, or multi-d range-search only
– Adaptivity required
• But may not look like adaptive QP in databases…
• Declarative language semantics
– Deal with streaming, clock jitter and soft state!
• Data reduction techniques
– For visualization, approximate query processing
• Bulk-computation workloads
– Quite different from the ones the NW and systems folks envision
• Recursive query processing
– The network is a graph!