On Describing Push Notifications in Web APIs

A few weeks back, I attended my first RESTFest “unconfernece””. This was a really great event put on by some fantastic folks. The “everyone talks” aspect of RESTFest is actually awesome idea for a conference this size. This format is great because if you’re like me, you had some things you wanted to get some feedback on and this provides the forum to do just that. Everyone gets a chance to do a quick 5-in-5 talk about anything. I was looking to get some feedback on some ideas I’ve been considering about push notifications in Web APIs.

I titled my talk “Real-Time Web APIs,” but I’m no so much interested in Web APIs that follow “real-time computing” constrains. Really, I’m looking to see if there’s a good way to describe a service that streams push notifications from a Web API that follows “RESTful” architectural constraints. There’s a lot of good ideas out there, but many of them either assume that the client either a web browser or running an HTTP server. Additionally, the mechanism to describe these services need some work.

The Goal

The core building blocks for push notifications are already available, but not so much the means that aid in discovery and the description of such resources. I’d like to be able to organize these pieces so that the following requirements can be met:

  1. Don’t assume that the subscriber is able to receive a callback using HTTP POST. If the subscriber is a web browser or a thick desktop client like a Swing or JavaFX application (yes, people still make these!), or even a nativ app on iOS, running a web server to receive and HTTP callback isn’t always practical or feasible.
  2. Advertise that a given link is exposing a stream of events via a link relation. This might be similar to the monitor link relation, but not necessarily bound to SIP.
  3. The link relation should be able to indicate the media type that event messages are described in, ideally via the type property.
  4. If there’s a sub protocol involved, it should also advertise what sub-protocol. If event stream is a media type that supports embedded content, it should also express that as well.
  5. Subscribing to a stream or feed should be simple

WebSockets and HTM5 Server-Sent Events (SSE) present some interesting opportunities for Web APIs that demand low-latency push notifications while also removing the need for a consumer to run an HTTP server. Keep in mind that I’m not talking about doing something like “REST over WebSockets (Shay does make some great points though!),” I’m simply looking for push notifications without the need for for the consumer to run a web server as well as describing the stream via link relations. I think it could be done, but there’s some missing bits.

PubSubHubub

PubSubHubub ticks a lot of check boxes for me, such as:

  • While the protocol is built around Atom and Atom concepts, it could support a variety of media types. It’s using the Content-Type header to express what is coming over the wire.
  • It sports a discovery model using the rel="hub" link relation either in a link header or a link within an Atom feed.
  • The subscriber subscribes to the Topic URL from the Topic URL’s declared Hub(s) using the PubSubHubbub subscription protocol.
  • Publishers ping the “Hub” to notify it of updates, aggregates the content, and sends it to the subscriber using an HTTP POST request to the call hub.callback URL.

What I like about it a lot is the use of hypermedia to do discovery and call backs. Additionally, it’s using standards means to describe what’s coming over the wire. The rel="hub" link relation combined with the script ion protocol is super easy an fit works. My challenge with PubSubHubub is the requirement of an HTTP callback on the part of the subscriber. As stated earlier, this isn’t always possible.

From my perspective, PubSubHubub has the right foundational model. It’s simply HTTP callbacks that are the sticking point. So can we do something similar with WebSockets or Server-Sent Events? I think so, but there’s some challenges with existing formats in order to make this work.

Server-Sent Events

I REALLY like HTML5 Server-Sent Events (SSE). A lot. I’m also really annoyed by the fact that Internet Explorer still isn’t supporting SSE in IE11. What I like about SSE is that it’s a media type (text/event-stream) so the subscriber will know that this URL represents an event stream. The event stream model is also dead simple:

event: change
data: 73857293

But it could be also like this:

event: change
data: {"@id" : "15628", "@type" : "ChangeEvent",  "value" : "73857293" }

Or even this:

event: change
data: {
data: "@id" : "15628", 
data: "@type" : "ChangeEvent",  
data: "value" : "73857293" 
data: }

Or like this (If you’re into this sort of thing):

event: change
data: <ChangeEvent><id>15628</id><value>73857293</value></ChangeEvent>

As you can see, the data field could contain nested media type like XML, JSON, or something else. The problem is that that there is no way to indicate that. How does one know that the data field contains structured content such as JSON? The browser gets around the issue by embedding JavaScript in an HTML document that references the stream:

var source = new EventSource(’/updates');
source.addEventListener('add',  changeHandler, false);

Obviously, the changeHandler function will parse and handle the embedded content. This works great where the subscriber is a web browser (or embedded browser), but for other environments it’s not so easy.

We could express this via a link, but it’d be missing some details. Let’s assume we have a link relation called stream that informs a client that this link represents a stream of events:

Link: <http://example.com/events>; rel="stream"; 
      type="text/event-stream";

It works and declares that the link is a stream of event and it exposes the events via SSE. But the subscriber has no hints that the data field contains JSON or XML content. In cases where the browser, or an embedded web browser, are not available, how does can a client get more information as to process a stream? For SSE, one option might be to include a media type parameter, call it data if you will, that would indicate that the nested type is something like JSON-LD:

Link: <http://example.com/events>; rel="stream"; 
      type="text/event-stream;data='application/ld+json'"

It’s just an idea, but it could be workable. I would love feedback on this and would REALLY like to see Microsoft add Server-Sent Events to IE at some point.

WebSockets

WebSockets are neat, but the majority of use cases for streaming notifications only really needs to go one way. The bi-directional nature of the WebSockets protocol is a nice to have but not entirely necessary for most applications. WebSockets by itself really isn’t that useful. A number of WebSocket examples you’ll see are effectively someone’s home-grown, JSON-based, socket protocol. It’s a bit too cowboy for my tastes, but it can get the job done.

Where I do find WebSockets more useful is being able to leverage a well-defined subprotocol over a WebSocket. At the moment, I’m quite of fond of STOMP, particularly STOMP over WebSockets. In a Java shop that is already heavily invested in JMS, STOMP over WebSockets is a reasonable leap given that tools such as ActiveMQ, RabbitMQ, and others are support STOMP over WebSockets now.

Building on the Server-Sent Event examples earlier, we have some similar problems such as:

  • We still don’t have a good way to indicating what might be coming over the wire
  • We have a new problem since we’re dealing with another protocol that supports subprotocols, we don’t have a means to identify the sub-protocol that the WebSocket will be using

At RESTFest, I crapped up a variation of link relation like so:

Link: <wss://example.com/events>;rel=”stream/v12.stomp";
       type=“application/ld+json";

Here, we’d overload the rel field to indicate that it’s a stream but that it’s also using STOMP as the sub-protocol, specifically STOMP v1.2 (note I’m using IANA WebSocket subprotocol IDs here). Because the URI begins with wss://, we know that we’re using WebSockets over SSL. The type property is indicating that the messages will be using application/ld+json. The problem with both approaches is that if I want to offer another alternate message formats (say JSON-LD or XML), then this solution does really work. But maybe that’s not a problem.

Constrained Application Protocol

One of the great things about attending a workshop like RESTFest is that you’re surrounded by people who are smarter or more experienced than you. After my 5-in-5, Mike Amudsen had a few good questions about what I was trying to do. He then asked if I had considered CoAP, or the Constrained Application Protocol. Having never heard of CoAP, I obviously hadn’t taken it into consideration. CoAP more than likely satisfies a number of my needs. Since it’s still in draft form, it’s not an easy sell yet. Without a doubt, CoAP is something to keep an eye one.

Wrapping Up

Right now, I’m going down the STOMP over WebSockets route. I’d REALLY prefer Server-Sent Events, but the fact that Microsoft isn’t supporting SSE in IE10 and IE11 is AND the corporate standard in most shops, it sadly makes SSE a non-starter. In the coming weeks, I’ll be slapping some code up on GitHub to test out some ideas. I’d love to get feedback on these ideas to see if I’m going off the rails or if these ideas have some merit.

Advertisement