How to protect your API clients against breaking changes

In a recent post, Mozilla developer Gervase Markham took a look at Google’s draft spec for the Roughtime protocol, which allows clients to determine the time to within 10 seconds or so without the need for an authoritative trusted timeserver. One part of Google’s ecosystem document caught his eye: he noticed that Roughtime acts like a small “chaos monkey” for protocols, with the Roughtime server intentionally sending out a small subset of responses containing various forms of protocol error:

A healthy software ecosystem doesn’t arise by specifying how software should behave and then assuming that implementations will do the right thing. Rather we plan on having Roughtime servers return invalid, bogus answers to a small fraction of requests. These bogus answers would contain the wrong time, but would also be invalid in another way. For example, one of the signatures might be incorrect, or the tags in the message might be in the wrong order. Client implementations that don’t implement all the necessary checks would find that they get nonsense answers and, hopefully, that will be sufficient to expose bugs before they turn into a Blackhat talk.
 
Gerv dubbed this maxim “Langley’s law” – the principle that a service should “be occasionally evil in what it sends, and conservative in what it accepts.”
 
This flies in the face of conventional wisdom. It’s a complete reversal of the ancient Postel’s Law regarding internet protocols – that you should “be conservative in what you send, be liberal in what you accept.”
 
At Cayan, we’ve been using a similar approach to ensure that we can safely evolve our APIs under what’s known as an agile API versioning strategy. Such a strategy lets us make non-breaking changes to our APIs without issuing a new version. For example, we may want to return additional metadata in a response, or add an optional parameter to a web method’s input. A developer’s POS is free to use the extra metadata and parameters or to ignore them, and it can rely on the existing behavior remaining unchanged.
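
To make the client side of this concrete, here is a minimal sketch in Python of a parser that tolerates additive changes. It is an illustration only: the field names (status, amount, token, loyaltyPoints) and the JSON shape are assumptions made up for the example, not Cayan’s actual API.

```python
import json

# Hypothetical field names, chosen for illustration only.
KNOWN_FIELDS = ("status", "amount", "token")


def parse_response(body: str) -> dict:
    """Keep the fields this client understands and skip everything else."""
    data = json.loads(body)
    return {name: data[name] for name in KNOWN_FIELDS if name in data}


# The same response before and after the server adds a new, optional field.
old_body = '{"status": "APPROVED", "amount": "1.05", "token": "abc123"}'
new_body = (
    '{"status": "APPROVED", "amount": "1.05", "token": "abc123", '
    '"loyaltyPoints": 12}'
)

# The client sees identical data either way, so the additive change is safe.
print(parse_response(old_body) == parse_response(new_body))
```

A client written this way keeps working when new tags appear. A client that rejects any response containing a field it does not recognize would break on the second body.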
 
We don’t want to get into an API versioning morass where every incremental change to our API must be versioned, which increases our maintenance burden and requires our partners to opt in to the new endpoints, especially for trivial, additive changes. But we need to reconcile that with the need for our API clients (largely points of sale) not to break as we begin to return new tags in our responses.
 
Given that Cayan works with over 400 POS partners and over 80,000 merchants, we need to ensure that we can make changes without breaking anyone in the field. We have no way of knowing whether a partner’s POS uses a hand-built XML or JSON parser, a SOAP client that strictly validates against its WSDL, or something else entirely, and it’s impossible (and undesirable) to test against every possible version of every partner’s POS before we release. Our strategy has been to have our APIs return random tags in their responses during our certification process. A client that cannot tolerate unexpected tags is then caught during certification rather than in the field.
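
The idea of returning random tags can be sketched in a few lines of Python. This is a hedged illustration, not Cayan’s implementation: the function names add_random_tags and random_tag, the "x" prefix, the number of extra tags, and the JSON response shape are all assumptions introduced for the example.

```python
import json
import random
import string


def random_tag(rng: random.Random, length: int = 8) -> str:
    """Build a meaningless name such as 'xqhzrtpmk' (the 'x' prefix is arbitrary)."""
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "x" + "".join(letters)


def add_random_tags(response: dict, rng: random.Random, max_tags: int = 3) -> dict:
    """Return a copy of the response with 1..max_tags extra, meaningless fields."""
    noisy = dict(response)  # never mutate the real response
    for _ in range(rng.randint(1, max_tags)):
        tag = random_tag(rng)
        while tag in noisy:  # avoid overwriting a real or earlier random field
            tag = random_tag(rng)
        noisy[tag] = random_tag(rng)
    return noisy


# Hypothetical certification-mode response; field names are illustrative.
rng = random.Random(42)
original = {"status": "APPROVED", "amount": "1.05"}
noisy = add_random_tags(original, rng)
print(json.dumps(noisy))
```

Because the extra tags change from response to response, a client cannot pass by hard-coding an allowance for one specific new field; it has to ignore unknown tags in general.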
 
This simple solution ensures that we can continue to quickly evolve our products and delight our customers, while not leaving anyone behind.