OStatus/PuSH producer
From StatusNet
Combined hub/producer
The general PubSubHubbub model splits producers and hubs into separate pieces, which keeps things simple for the traditional use case -- standalone blog software. Producers don't have to worry about subscription maintenance or about knowing which feed items are new; they only have to send one ping to the hub for each feed update. The hub then follows up by fetching the entire feed, determining what's new, and firing it off to subscribers.
This is a little awkward for us, however:
- we have hundreds of thousands of potential feeds to worry about, at least one per user and one per group, but most won't be subscribed to at any given time
  - no sense pinging the server if nobody is subscribed
  - spec is unclear on whether feed will be fetched if there are no subscribers -- could end up with hub reading thousands of feeds that are never used
- we usually only have one new thing at a time in the feed, but have no way to send update-only "fat pings" directly to the hub
  - we'd have to build a feed with 10 or 20 items for the hub to fetch... on every post.
- slave replication lag could cause problems with an external hub fetching the latest posts
  - recommended fix for this in PuSH world is an integrated hub
For our specialized use, combining the producer and hub lets us take some shortcuts:
- only need to publish feeds that are subscribed
- only need to push actual notice/event updates
The downside:
- have to implement subscription management ourselves
  - already doing that for OMB; it's not really that hard
- have to implement redelivery/cancellation semantics ourselves
  - we didn't even bother doing this for OMB!
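The two shortcuts above (publish only subscribed feeds, push only the new item) can be sketched as a single publish path. This is a minimal illustration in Python rather than StatusNet's PHP; `SubscriptionStore`, `publish_notice`, and the `deliver` callable are hypothetical names, not an existing API.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class SubscriptionStore:
    """In-memory stand-in for the hub's subscription table (hypothetical)."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, List[str]] = defaultdict(list)

    def subscribe(self, feed_url: str, callback_url: str) -> None:
        if callback_url not in self._callbacks[feed_url]:
            self._callbacks[feed_url].append(callback_url)

    def subscribers(self, feed_url: str) -> List[str]:
        # .get() avoids creating empty entries for feeds nobody follows.
        return list(self._callbacks.get(feed_url, []))


def publish_notice(
    store: SubscriptionStore,
    feed_url: str,
    entry_xml: str,
    deliver: Callable[[str, str], None],
) -> int:
    """Push one new entry to current subscribers; return the number of pushes attempted."""
    callbacks = store.subscribers(feed_url)
    if not callbacks:
        # Shortcut 1: nobody is subscribed, so no feed is built and nothing is sent.
        return 0
    for callback_url in callbacks:
        # Shortcut 2: only the new entry goes out, not a 10-20 item feed.
        deliver(callback_url, entry_xml)
    return len(callbacks)
```

With an external hub, the producer would ping on every post regardless of subscribers. Here the subscriber check happens before any feed content is built.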
Todo
Hub:
- accept subscription/unsubscription requests
  - web endpoint
  - reject subs for non-canonical feeds
- reject ping updates from outside as we're not an open-to-the-public hub
  - web endpoint
- send update data to subscribers
  - as queue handler
  - need to be able to individually attempt resend to each subscriber that's offline
- provide internal API to check whether a given feed is subscribed
- provide internal API to push updates to a feed
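As a rough sketch of the web endpoint checks listed above, the function below validates the form parameters of a hub request. The `hub.mode`, `hub.topic`, and `hub.callback` parameter names come from PubSubHubbub; the function name, the `canonical_feeds` set, and the choice of 400/403/404 status codes are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def handle_hub_request(params: Dict[str, str], canonical_feeds: Set[str]) -> Tuple[int, str]:
    """Validate a POST to the hub endpoint; return (HTTP status, message)."""
    mode = params.get("hub.mode", "")
    if mode == "publish":
        # Not an open hub: updates only arrive through the internal API.
        return 403, "publish pings are not accepted from outside"
    if mode not in ("subscribe", "unsubscribe"):
        return 400, "unsupported hub.mode"

    callback = params.get("hub.callback", "")
    if not callback.startswith(("http://", "https://")):
        return 400, "hub.callback must be an http(s) URL"

    topic = params.get("hub.topic", "")
    if topic not in canonical_feeds:
        # Reject aliases of a feed, so each feed has exactly one topic URL.
        return 404, "hub.topic is not a canonical feed on this site"

    # Verification runs later from a queue, so acknowledge with 202 Accepted.
    return 202, f"{mode} request queued for verification"
```

Checking the topic against a set of canonical URLs keeps all subscriptions for a feed under a single key, which is what makes the "is this feed subscribed?" lookup cheap.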
Producer:
- feeds for users
- feeds for groups
- build profile info into feed streams
  - real name
  - location [name / lat / lon] (georss already in there)
  - homepage
  - bio
  - avatar (pulling <logo> on feed for now -- we need per-user)
    - make sure we drop the 96x96 explicit requirement :)
  - tags
- build notice info in feeds:
  - text atom:summary
  - html atom:content
  - attachments as link/rel=enclosure (can delay consumption on that for now)
  - topics stored as atom:category, SHOULD be tags (rel-tag)
  - response stored as thread:in-reply-to
  - attention stored as omb:attn
  - repeat/retweet stored as omb:forward
- feed update+ping on new notices
- feed update+ping on profile change (maybe just let it wait until the next post)
  - activitystreams?
- feed update+ping on other events -- activitystreams?
  - delete notice(?)
  - local user block (?)
  - delete user
  - note we probably still need to be able to provide a regular-readable feed.
  - might need to store an 'outbox' stream including non-notice events
- (backgroundable) profile updates
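To make the notice mapping above concrete, here is a hedged sketch that builds a single Atom entry with Python's standard `xml.etree.ElementTree`. The function name and parameters are hypothetical. `title` and `updated` are included because Atom requires them on every entry, and `thr:in-reply-to` uses the Atom threading extension namespace. `omb:attn` and `omb:forward` are left out because their namespace is not given here.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
THR_NS = "http://purl.org/syndication/thread/1.0"
ET.register_namespace("", ATOM_NS)
ET.register_namespace("thr", THR_NS)


def build_notice_entry(notice_uri, updated, text, html, tags=(), in_reply_to=None):
    """Return one notice serialized as a standalone Atom <entry> string."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = notice_uri
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = text
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated

    # Plain-text rendering goes in atom:summary.
    ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = text

    # HTML rendering goes in atom:content; the serializer escapes the markup.
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": "html"})
    content.text = html

    # Each topic/hashtag becomes an atom:category.
    for tag in tags:
        ET.SubElement(entry, f"{{{ATOM_NS}}}category", {"term": tag})

    # A reply points at its parent notice via thr:in-reply-to.
    if in_reply_to:
        ET.SubElement(entry, f"{{{THR_NS}}}in-reply-to", {"ref": in_reply_to})

    return ET.tostring(entry, encoding="unicode")
```

An entry built this way is the kind of single-item payload that can be pushed to subscribers without rebuilding the whole feed.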
Maybes
Less vital.
Hub:
- send renewal notices to subscribers
  - queue handler? poll?
- actually expire dead subscribers
  - do we need to report this back inside so we can drop the subscriber from SN level or mark the subscriber as broken?
Producer:
- profile info
  - language (currently we don't store it for profiles though, just for user preferences)
- add site meta-info to streams?
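One possible shape for the renewal and expiry bookkeeping is sketched below. Everything here is an assumption for illustration: the `SubState` fields, the `MAX_FAILURES` threshold, and the policy of expiring a subscriber after a run of consecutive failures are not decided anywhere in these notes.

```python
from dataclasses import dataclass

MAX_FAILURES = 5  # assumed threshold; the notes do not pick one


@dataclass
class SubState:
    callback: str
    lease_end: float   # Unix timestamp at which the lease runs out
    failures: int = 0  # consecutive failed deliveries


def needs_renewal(sub: SubState, now: float) -> bool:
    """True once the lease has run out and a renewal check is due."""
    return now >= sub.lease_end


def record_delivery(sub: SubState, ok: bool) -> bool:
    """Update the failure count; return True if the subscriber should be expired."""
    sub.failures = 0 if ok else sub.failures + 1
    return sub.failures >= MAX_FAILURES
```

Because `needs_renewal` only compares timestamps, it could be called while sending an update, which sidesteps the need for a separate scheduler.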
Flow
- Subscriber requests a feed subscription to the hub
  - POST /push/hub -> actions/hub.php
    - confirm this is a user feed we're handling
    - save a HubSub item w/ subscription info
    - enqueue item+'subscribe' to hubverify queue for confirmation callback
- Hub sends a verification request to subscriber's callback
  - hubverify queue -> HubVerifyQueueHandler
    - POST a 'subscribe' verification to subscriber's callback URI (stored in HubSub item)
    - on success, save lease start & end time into HubSub item
    - on failure... retry? cancel and delete the sub?
- ... time passes ...
- A subscribed-to user's post is processed for inbox distribution
  - enqueue to hubdistrib queue
- Prep output for remote PuSH subscribers
  - hubdistrib queue -> HubDistribQueueHandler
    - find all verified subscriptions for this user's feed
    - for each subscriber, enqueue sub & a single-item version of the Atom feed to hubout queue
- Send output
  - hubout queue -> HubOutQueueHandler
    - POST the feed to subscriber callback
    - on success, we're done!
    - on failure... retry? mark another failure on the subscriber?
- ... time passes ...
- Hub sends a subscription renewal check to subscriber's callback MAYBE?
  - (no good way to schedule this right now? maybe while sending if we find we've passed the lease end?)
    - POST a 'subscribe' verify ping to callback
      - on success, update the lease expiration in HubSub record
      - on fail...? Retry a few times? Kill sub immediately?
- ... time passes ...
- Subscriber requests an unsubscribe
  - POST /push/hub -> actions/hub.php
    - confirm this is a user feed we're handling
    - enqueue item+'unsubscribe' to hubverify queue for confirmation callback
- Hub sends a verification request to subscriber's callback
  - hubverify queue -> HubVerifyQueueHandler
    - POST 'unsubscribe' verification to subscriber's callback URI (stored in HubSub item)
    - on success, delete the subscription record
    - on failure... retry? cancel the unsub request?
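The flow above can be simulated end to end with three in-memory queues. This is a sketch under stated assumptions, not the StatusNet implementation: the `Hub` class, its method names, the default lease length, and the two injected transport callables (`send_challenge`, `post_entry`) are hypothetical. Verification is modelled as a random challenge that the subscriber must echo back, and the open questions about failure handling are left as comments.

```python
import secrets
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class HubSub:
    topic: str
    callback: str
    lease_seconds: int = 86400          # assumed default lease
    lease_start: Optional[float] = None  # set once verified
    lease_end: Optional[float] = None


class Hub:
    def __init__(self, send_challenge, post_entry):
        # send_challenge(callback, mode, topic, challenge) -> echoed string or None
        # post_entry(callback, entry_xml) -> True on success
        self.send_challenge = send_challenge
        self.post_entry = post_entry
        self.subs = {}
        self.hubverify, self.hubdistrib, self.hubout = deque(), deque(), deque()

    def request(self, mode, topic, callback):
        """Web endpoint step: record the request and queue its verification."""
        key = (topic, callback)
        if mode == "subscribe":
            self.subs.setdefault(key, HubSub(topic, callback))
        if key in self.subs:
            self.hubverify.append((key, mode))

    def run_hubverify(self, now):
        while self.hubverify:
            key, mode = self.hubverify.popleft()
            sub = self.subs.get(key)
            if sub is None:
                continue
            challenge = secrets.token_hex(16)
            echoed = self.send_challenge(sub.callback, mode, sub.topic, challenge)
            if echoed != challenge:
                continue  # failure policy (retry? cancel?) is still an open question
            if mode == "subscribe":
                sub.lease_start, sub.lease_end = now, now + sub.lease_seconds
            else:
                del self.subs[key]

    def notice_posted(self, topic, entry_xml):
        self.hubdistrib.append((topic, entry_xml))

    def run_hubdistrib(self):
        """Fan one entry out to every verified subscription for its feed."""
        while self.hubdistrib:
            topic, entry_xml = self.hubdistrib.popleft()
            for sub in self.subs.values():
                if sub.topic == topic and sub.lease_start is not None:
                    self.hubout.append((sub.callback, entry_xml))

    def run_hubout(self):
        """Deliver queued entries; return the ones that failed for later handling."""
        failed = []
        while self.hubout:
            callback, entry_xml = self.hubout.popleft()
            if not self.post_entry(callback, entry_xml):
                failed.append((callback, entry_xml))
        return failed
```

Giving each subscriber its own item on the hubout queue means one offline callback can be retried on its own without blocking or repeating delivery to the others.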