In recent years, REST has been my default choice of protocol for automated interaction between computer systems. It’s easy to understand and easy to implement, but there are also a couple of things that are easy to get wrong when you’re new to RESTful webservices. Defining all the resources in your system, and getting their URIs and the actions that can be invoked on them right, is an art that you have to learn.
As you learn and read more about RESTful systems, sooner or later you’ll run into a discussion about “state”. As the Wikipedia page about REST explains in its section Stateless, “[t]he client–server communication is constrained by no client context being stored on the server between requests.” Over the years, this constraint has caused many long and sometimes even heated debates. In many cases, systems do store part of the client context on the server for practical reasons, to the horror of REST die-hards (or RESTafarians, as they’re sometimes called), who will forbid you to call your system “RESTful”.
But are these RESTafarians entirely consistent? Is it really possible to create a working computer system that fulfills each and every architectural constraint of REST down to the smallest detail? I don’t think so, because if you really applied every constraint of REST in its most literal sense, the result would either be a computer system that doesn’t work at all, or one that violates many good practices of software engineering. I know of at least five such “inconvenient truths about REST”, as I call them, and the first one is that GET calls are never nullipotent.
When you apply REST to webservices, you should use the standard HTTP methods to act on your resources. One of those standard HTTP methods is GET, which retrieves a resource. By definition, the GET method is a “safe method”, or “nullipotent”, which means that it shouldn’t produce side-effects. However, the HTTP standard itself already notes that this should be taken with a grain of salt, and the Safe methods section on the Wikipedia page about HTTP lists examples like logging and caching as “relatively harmless” side-effects that are acceptable for safe methods.
It’s entirely correct to call logging and caching only “relatively” harmless. After all, logging can fill up a disk and consequently crash the whole webserver. Luckily, that doesn’t happen often, but it illustrates that logging definitely isn’t absolutely harmless.
But there may be other reasons for a GET call to have side-effects. Consider, for example, access control and intrusion detection. Maybe it’s OK to try to access a resource you don’t have read access to, once in a while. But if you’re trying to enumerate all the resources that exist on the server, then maybe as a precaution you should be (temporarily) banned from the system after trying to retrieve the first n resources? And even if you’re only accessing the resources you’re entitled to, what if you’re doing that at such a high frequency that you’re bringing down the server, or the cache in front of it, and therefore effectively executing a DoS attack?
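To make the intrusion-detection case concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the class name GuardedStore, the owner-based access rule, and the threshold max_denied (the “n” from the paragraph above) are assumptions chosen for illustration, not part of any real framework. The point is only that a GET never modifies a resource, yet still changes server-side state.

```python
from dataclasses import dataclass, field


@dataclass
class GuardedStore:
    """Hypothetical read-only store that bans clients after too many denied GETs."""

    resources: dict                              # path -> (owner, body)
    max_denied: int = 3                          # the "n" from the text; assumed value
    denied: dict = field(default_factory=dict)   # client -> number of denied GETs
    banned: set = field(default_factory=set)     # clients that are locked out

    def get(self, client: str, path: str) -> tuple:
        """Return (status, body). The resources themselves are never modified."""
        if client in self.banned:
            return 403, "banned"
        entry = self.resources.get(path)
        if entry is None:
            return 404, ""
        owner, body = entry
        if owner != client:
            # Side-effect of a "safe" GET: the denial is counted, and the
            # client is banned once the threshold is reached.
            count = self.denied.get(client, 0) + 1
            self.denied[client] = count
            if count >= self.max_denied:
                self.banned.add(client)
            return 403, "forbidden"
        return 200, body
```

With max_denied set to 2, a client that twice requests a resource it doesn’t own is then refused even the resources it does own. The same GET request now yields a different answer, although no resource was ever touched.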
What if the side-effect is part of your domain functionality? Say you’re working on your own version of Snapchat, and want to offer a one-time download service for a resource. Clearly, since you’re not uploading anything to the server, POST and PUT aren’t the right HTTP verbs to use for this service. DELETE doesn’t seem correct either, because the client is more concerned with downloading the resource than with deleting it. Additionally, the response message to a DELETE call isn’t supposed to contain the resource’s representation, if it contains anything at all. Therefore, GET seems to be the most appropriate HTTP verb to implement this service, but with the huge side-effect that it deletes the resource once it has been retrieved by a client. But does this mean that systems with such services can’t be RESTful by definition?
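A one-time download could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: the class OneTimeStore, its put and get methods, and the choice to answer a second request with 404 are all hypothetical.

```python
import threading


class OneTimeStore:
    """Hypothetical in-memory store where a successful GET consumes the resource."""

    def __init__(self):
        self._items = {}
        self._lock = threading.Lock()

    def put(self, key, body):
        """Store a resource so that it can be downloaded exactly once."""
        with self._lock:
            self._items[key] = body

    def get(self, key):
        """Return (status, body); retrieving a resource also deletes it."""
        # Pop under the lock so that two concurrent GETs can't both
        # receive the same resource.
        with self._lock:
            body = self._items.pop(key, None)
        if body is None:
            return 404, b""
        return 200, body
```

The first call to get for a given key returns the body with status 200, and every later call returns 404. The side-effect here is the whole point of the service, which makes this GET neither safe nor idempotent in the strict sense.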
I think it’s a good idea to keep your GET calls as nullipotent as possible, but be aware that there will always be side-effects, and that some of them should be there. Don’t turn off access logging for GET calls just to make them more nullipotent; that doesn’t make your system more RESTful, only less secure. Side-effects are allowed when they’re a good practice (like logging and security measures), or part of the domain. In other words, they’re fine when they’re unobservable (say, 99.999% of the time) or expected. The rest is an academic discussion, or a topic to be settled over a beer.
Next time: HTTP Verbs ≠ “REST Verbs”