PubSubHubbub for Developers: Brett Slatkin, Software Engineer, Google Inc.
September 28, 2009
Agenda
• Background
• Intro
• Motivation
• Scale
• Progress
Background
Why do real-time messaging?
• Syndication
o Creating a "flow"
o Simultaneous delivery of an event spurs immediate
conversation
o More participation enables more developed
conversations and a better exchange of ideas
o Cross-site allows promotion, linking, swarming around
sources, mash-ups, growth opportunity
Why do real-time messaging?
• Business, politics
o A 1-minute delay could cost a company millions, cause a
political scandal, harm investors, etc.
o Concrete example: SEC earnings requirements
Why do real-time messaging?
• Future applications (out of scope, but ...)
o Financial data
o Public scientific measurements (e.g., stream of weather
data, traffic status, polling, votes)
o Sensor networks
o Emergency information distribution
o Anything you can think of that's a stream of information!
Why do decentralized messaging?
• Web was built on decentralized protocols
• No single point of failure
• Interoperability is key to network effects and growth
• One API for application developers
Intro
What is PubSubHubbub?
hub.mode=publish&hub.url=<your feed>
hub.mode=subscribe&hub.verify=sync&hub.topic=<feed URL>&hub.callback=<callback URL>
HTTP/1.1 200
...
<echo random>
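The request bodies above are ordinary form-encoded parameters. A minimal sketch of how a publisher and a subscriber might build them, and how a subscriber might echo the hub's random value, using only the Python standard library. The URLs and function names are hypothetical, and the `hub.challenge` parameter name comes from the PubSubHubbub specification rather than from the slide.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical URLs, for illustration only.
FEED_URL = "http://example.com/feed.atom"
CALLBACK_URL = "http://subscriber.example.com/callback"


def publish_body(feed_url):
    """Form body a publisher POSTs to the hub to announce an update."""
    return urlencode({"hub.mode": "publish", "hub.url": feed_url})


def subscribe_body(feed_url, callback_url):
    """Form body a subscriber POSTs to the hub to request a subscription."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.verify": "sync",
        "hub.topic": feed_url,
        "hub.callback": callback_url,
    })


def verification_response(query_string):
    """Answer the hub's verification request by echoing its random value.

    Assumes the random value arrives as a `hub.challenge` query parameter.
    Returns a (status_code, body) pair.
    """
    params = parse_qs(query_string)
    challenge = params.get("hub.challenge", [""])[0]
    if not challenge:
        return 404, ""
    return 200, challenge


if __name__ == "__main__":
    print(publish_body(FEED_URL))
    print(subscribe_body(FEED_URL, CALLBACK_URL))
    print(verification_response("hub.mode=subscribe&hub.challenge=abc123"))
```

The sketch only builds strings; sending them to a real hub would need an HTTP client and a web server for the callback.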
How-to for Subscribers
Process new content from the Hub
POST /callback HTTP/1.1
Content-Type: application/atom+xml
...
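A subscriber's callback mostly has to parse the Atom body the hub POSTs and act on each entry. A minimal sketch of that step, assuming the standard library's `xml.etree.ElementTree`; the function names, the sample feed, and the choice of a 204 response are illustrative assumptions, not part of the talk.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical notification body, for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><id>urn:example:1</id><title>First post</title></entry>
</feed>"""


def extract_entries(body):
    """Return (id, title) pairs for each Atom entry in a hub notification."""
    root = ET.fromstring(body)
    entries = []
    for entry in root.iter(ATOM_NS + "entry"):
        entry_id = entry.findtext(ATOM_NS + "id", default="")
        title = entry.findtext(ATOM_NS + "title", default="")
        entries.append((entry_id, title))
    return entries


def handle_notification(content_type, body):
    """Process a POST from the hub and return an HTTP status code."""
    if not content_type.startswith("application/atom+xml"):
        return 400
    for entry_id, title in extract_entries(body):
        print("new entry:", entry_id, title)
    return 204  # any 2xx tells the hub the delivery succeeded


if __name__ == "__main__":
    handle_notification("application/atom+xml", SAMPLE)
```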
• Distinct functions
o Accept and verify subscriptions to new topics
o Receive pings from publishers, retrieve content
o Extract new/updated items from feed
o Send all subscribers the new content
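The four functions above can be sketched as a toy in-memory hub. This is an illustration under stated assumptions, not how any production hub is built: the `Hub` class, the `fetch` and `deliver` callables, and the dictionary-based change detection are all hypothetical, and subscription verification is left out.

```python
class Hub:
    """Toy in-memory hub; a real hub persists state and does network I/O."""

    def __init__(self, fetch, deliver):
        self.fetch = fetch        # fetch(topic) -> list of (entry_id, content)
        self.deliver = deliver    # deliver(callback, entries) -> None
        self.subscribers = {}     # topic -> set of callback URLs
        self.seen = {}            # topic -> {entry_id: content}

    def subscribe(self, topic, callback):
        # 1. Accept a subscription (verification is omitted in this sketch).
        self.subscribers.setdefault(topic, set()).add(callback)

    def publish(self, topic):
        # 2. A publisher pinged the hub: retrieve the current feed content.
        entries = self.fetch(topic)
        # 3. Keep only items that are new or whose content changed.
        seen = self.seen.setdefault(topic, {})
        fresh = [(i, c) for i, c in entries if seen.get(i) != c]
        seen.update(fresh)
        # 4. Send the new content to every subscriber of the topic.
        if fresh:
            for callback in sorted(self.subscribers.get(topic, ())):
                self.deliver(callback, fresh)
        return fresh


if __name__ == "__main__":
    feed = [("1", "hello")]
    hub = Hub(lambda topic: list(feed), lambda cb, e: print(cb, e))
    hub.subscribe("http://example.com/feed", "http://sub.example.com/cb")
    hub.publish("http://example.com/feed")
```

Injecting `fetch` and `deliver` keeps the sketch free of network code, so the diffing and fan-out logic can be exercised on its own.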
The role of the Hub
• Scalability
o # of subscribers & feeds, update frequency
o Delegation of content distribution (= bandwidth)
• Reliability
o Retry of fetch and delivery; idempotence
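Retries are only safe when a repeated delivery does no harm, which is why retry and idempotence appear together. A minimal sketch of both halves, assuming exponential backoff as the retry policy (the talk does not specify one); `deliver_with_retry` and `IdempotentReceiver` are hypothetical names.

```python
import time


def deliver_with_retry(send, payload, max_attempts=5, base_delay=1.0,
                       sleep=time.sleep):
    """Call send(payload) until it succeeds or the attempts run out.

    Returns the number of attempts used; re-raises the last error on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(payload)
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise
            # Assumed policy: exponential backoff (1s, 2s, 4s, ...).
            sleep(base_delay * 2 ** (attempt - 1))


class IdempotentReceiver:
    """Receiving side: skip entries already processed, so retries are harmless."""

    def __init__(self):
        self.processed = set()

    def receive(self, entries):
        # entries is a list of (entry_id, content) pairs.
        new = [e for e in entries if e[0] not in self.processed]
        self.processed.update(e[0] for e in new)
        return new
```

Passing `sleep` as a parameter is a design choice for this sketch: it lets the backoff schedule be checked without actually waiting.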
How the hub works
TCP
(est. 1974)
Push it to the limit
What is magical about TCP? The Window.
Push it to the limit
Without the window, the tube can't be full.
Push it to the limit
TCP maximizes the throughput of a link
• Dump data in, it will be received
• The window means no waiting for acks!
• When acks are missed, the sender will retransmit
• Receivers reassemble the message in-order, de-dupe
• Good citizenship with congestion control
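The effect of the window can be made concrete with a little arithmetic: without a window the sender moves one segment per round trip, while with a window it can keep a full window in flight. A rough sketch under simplifying assumptions (no loss, fixed RTT, fixed window); the segment size, RTT, window size, and link rate below are made-up numbers for illustration.

```python
def stop_and_wait_throughput(segment_bytes, rtt_seconds):
    """One segment per round trip: the sender idles while waiting for each ack."""
    return segment_bytes / rtt_seconds


def windowed_throughput(window_bytes, rtt_seconds, link_bytes_per_second):
    """Up to a full window in flight per round trip, capped by the link rate."""
    return min(window_bytes / rtt_seconds, link_bytes_per_second)


# Assumed numbers: 1460-byte segments, 100 ms RTT, 64 KiB window, 10 Mbit/s link.
SEGMENT = 1460
RTT = 0.1
WINDOW = 64 * 1024
LINK = 10_000_000 / 8

if __name__ == "__main__":
    print("stop-and-wait B/s:", stop_and_wait_throughput(SEGMENT, RTT))
    print("windowed B/s:     ", windowed_throughput(WINDOW, RTT, LINK))
```

With these assumed numbers the windowed sender moves roughly 45 times more data per second (about 655 KB/s against about 14.6 KB/s), which is the sense in which the window keeps the tube full.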
Push it to the limit
Where is such efficiency for application-level protocols?
• Exists, but often proprietary or an interoperability
nightmare
(cough SOAP cough)
Why another protocol?
• We want interoperable, web-scale messaging