Optimising REST APIs (stereoplex.com)
52 points by 6ren on Dec 10, 2011 | 16 comments



    The standard approach to expressing errors in a REST API
    is to use HTTP status codes. As is often the case with
    REST, this works fine for simple systems, but is simply
    too limiting for more sophisticated systems, particularly
    those which might submit a JSON or XML document to
    describe a POST request.
That's what the response body is for. If a client PUTs (or POSTs) a malformed representation, it's reasonable for the response to be 400 with error detail in the body.
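A minimal sketch of that idea, assuming a hypothetical order document with a required "toppings" field (the function names and the error format are made up for illustration, not taken from the article): the status code says the request was bad, and the body says why.

```python
import json


def _error(status, code, detail):
    # The error detail travels in the body of the non-200 response.
    body = json.dumps({"error": code, "detail": detail})
    return status, {"Content-Type": "application/json"}, body


def validate_order(raw_body):
    """Return (status, headers, body) for a submitted order document."""
    try:
        doc = json.loads(raw_body)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError
        return _error(400, "malformed_json", str(exc))
    # "toppings" is a hypothetical required field.
    if not isinstance(doc, dict) or "toppings" not in doc:
        return _error(400, "missing_field", "'toppings' is required")
    return 201, {"Content-Type": "application/json"}, json.dumps(doc)
```

A client can branch on the status code first and only parse the body when it wants the detail.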

The "depth" parameter is kind of interesting, since it gives the client a choice as to whether multiple resources are included in the response. I'd be interested to see how others implement this (or if the client is given no choice at all).

But essentially this is a straightforward choice between latency and response size.


The author doesn't seem to realize that response bodies can exist for anything other than a 200 response.

He says he uses Django almost exclusively. I've never used Django - is this perhaps a weakness of Django?

As far as passing options in the header through Accept, I would think it would make more sense to pass it as GET parameters with the request, e.g.:

GET /orders/432544?depth=1

instead of

GET /orders/432544
Accept: application/json,application/json;q=1;depth=1
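For comparison, a rough sketch of what a server would have to do to read "depth" from each place. The helper names are hypothetical, and the Accept parsing is deliberately naive (it ignores quoted parameter values and q-value ordering):

```python
from urllib.parse import parse_qs, urlsplit


def depth_from_url(url, default=0):
    # Query-string variant: GET /orders/432544?depth=1
    values = parse_qs(urlsplit(url).query).get("depth")
    return int(values[0]) if values else default


def depth_from_accept(accept, default=0):
    # Header variant: look for a "depth" parameter on any media range.
    for media_range in accept.split(","):
        parts = [p.strip() for p in media_range.split(";")]
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "depth":
                return int(value)
    return default
```

The query-string version is shorter and is visible to anything that only sees the URL, which is most of the argument for it.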


It's not a limitation of Django - subclasses of HttpResponse exist for certain error codes, and their content can be set in the normal manner.

https://docs.djangoproject.com/en/dev/ref/request-response/#...

Agreed - 'depth' makes more sense as a GET parameter, since it's a resource in its own right.


It definitely makes sense to have it in the query string, and I think people are already doing it that way, but the author makes a good point - it isn't a resource in its own right, it's merely a different representation. But in my experience, requiring Accept manipulation outside of .json versus .xml makes your service less usable by clients, which often don't expose anything besides the url.


I think both interpretations can be defended. Though if you're playing with the Accept: header it's important to correctly set Vary: as well.
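To illustrate why Vary matters, here is a simplified sketch of how a cache might build its key. This is an assumption-laden toy, not how any particular cache is implemented: the request headers named in the response's Vary header become part of the key, so two requests for the same URL with different Accept values are stored separately.

```python
def cache_key(method, url, request_headers, vary):
    # Header names listed in Vary, normalised and sorted for a stable key.
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (method, url, tuple((n, lowered.get(n, "")) for n in names))
```

Without "Vary: Accept", both requests collapse to the same key and a client asking for depth=1 could be handed the cached shallow response.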

> But in my experience, requiring Accept manipulation outside of .json versus .xml makes your service less usable by clients, which often don't expose anything besides the url.

Browsers have that limitation (and numerous others), but if a programmatic HTTP library does not expose this, you need to throw it out. The Accept header is not part of the "immutable" headers of XHR either.


It is not a different representation, it is a different resource. Imagine if you PUT it: it has a different meaning with a different depth.


While I like the standardization afforded by REST, I frequently run into some artificial limitations that are imposed on the different request methods. For example, I find that I would frequently like to pass structure information in a GET that would be more useful to send in a body (as afforded by a POST).

One prime example relevant to this article would be a GET that was effectively a query by example. So a client could basically submit a skeleton of the data that it would like.

This would let a client "go deep" on some portions of the structure and shallow on other aspects. Imagine the following GET request for that calorie counting Android App in the post:

  GET /orders/432544
  { "toppings": 
    [{ "calories": ""}]
  }
With a response that would be:

  200 OK
  { "toppings":
    [ { "calories": 100 }
    , { "calories": 25 }
    ]
  }
This would give any client the ability to ask for exactly what it wanted with different depths to the structure if necessary. While you could serialize that into a URL, that seems like a kludge. I think in general, the HTTP methods made sense for their original design, but as we move on to building more flexible, integrated data systems, a more powerful, flexible API mechanism may be warranted.
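Server-side, the query-by-example idea could be sketched roughly like this. The function name is hypothetical, and it assumes the template and the stored data have compatible shapes (a dict where the template has a dict, a list where it has a list):

```python
def fill(template, data):
    """Return only the parts of `data` that the template asks for."""
    if isinstance(template, dict):
        # Keep only the requested keys, recursing into each one.
        return {k: fill(sub, data[k]) for k, sub in template.items() if k in data}
    if isinstance(template, list):
        if not template:
            return list(data)  # empty list template: return items unfiltered
        # The first element is the template applied to every item.
        return [fill(template[0], item) for item in data]
    # Leaf placeholder such as "": return the stored value as-is.
    return data
```

Applied to a stored order whose toppings also carry, say, a name, the template above would strip everything except the calories.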


Blergh, this isn't a GET if you're doing this.

GET is for retrieving a resource, which is like a fixed representation of data. In your example, you're trying to do a function call, an RPC, and badly mapping it to HTTP semantics.

That's what POST is for.

You could maybe do something like this:

    GET /orders/432544 HTTP/1.1
    Host: www.example.com
    Accept: application/json;bsaunder=1.0;calories
And, of course, technically, you CAN pass a body with your GET. The spec does not disallow bodies on GETs. Roy Fielding, however, would disapprove:

http://stackoverflow.com/questions/978061/http-get-with-requ...


> GET is for retrieving a resource, which is like a fixed representation of data. In your example, you're trying to do a function call, an RPC, and badly mapping it to HTTP semantics.

I don't really agree with that: he wants to retrieve a subset of a resource, that's a resource in itself. Using a more standard selection language, his query could be rewritten:

    GET /orders/432544?path=%2Ftoppings%2Fcalories HTTP/1.1
and would be perfectly sensible.
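A small sketch of how such a path could be evaluated, assuming a made-up convention where a segment applied to a list is mapped over every item (this is not XPath or any standard selection language, and the helper names are hypothetical):

```python
from urllib.parse import parse_qs, urlsplit


def path_from_url(url):
    # parse_qs percent-decodes, so %2F comes back as "/".
    return parse_qs(urlsplit(url).query)["path"][0]


def select_path(data, path):
    current = data
    for segment in filter(None, path.split("/")):
        if isinstance(current, list):
            # Apply the segment to each item of the list.
            current = [item[segment] for item in current]
        else:
            current = current[segment]
    return current
```

The response here is a flat list of values rather than a structure mirroring the request, which is one visible difference from the template approach.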


My read is that without id's, having broken-out calories like the GGP's is a broken data model and/or API. The API should be returning a sum of the calories for a particular order, which supports GP's "RPC" argument. Either that, or returning ingredient ids with their calories counts.


> My read is that without id's, having broken-out calories like the GGP's is a broken data model and/or API.

IDs are not involved at any point, and are not even relevant to bsaunder's comment.

> Either that, or returning ingredient ids with their calories counts.

An API which returns identifiers is not a REST API, so this statement makes no sense.

Now more to the point, bsaunder is simply looking for a way to fetch a subset of a resource by providing some sort of filter. I truly do not see why that would not be restful; it's simply a search query. What do you think happens when you search for something in Google? It returns a resource which is the subset of another resource (their complete index, which they don't expose, but that does not really matter), and as far as I can tell nobody's accused search engines of failing to be restful yet.

(the sum stuff doesn't even make sense)


Have you looked at MQL? - http://wiki.freebase.com/wiki/MQL

There are some good examples here: http://wiki.freebase.com/wiki/MQL_Cookbook


I think I read about MQL a couple of years ago. I like it a lot. Not sure how/if it fits into the REST paradigm.


> This would give any client the ability to ask for exactly what it wanted with different depths to the structure if necessary. While you could serialize that into a URL, that seems like a kludge.

Not sure why: your structure looks like an ad-hoc (and hard to read) selection query language (e.g. XPath, CSS selectors, ...). I'm sure it sounds good when throwing the idea in the air "oh the response has the exact same shape as the body", but I'd expect it to not be very good in practice (much like Go's weird-ass date formats)


A better way is to PUT the template as a resource that you can GET later with query parameters, which would fill that template. Like creating a prepared query.


Most REST APIs are simple, returning flattish records, shallowly nested, with a maximum depth of one list, and some choices within. Simple is attractive, but this leads to the REST problem of "chatty" APIs, going back-and-forth to the server for each lookup/join, to locate the resource.

As soon as you start "denormalizing", it makes sense to consider RDB/SQL solutions: there, you avoid joins/lookups between "normalized" flat records by having the data already pre-assembled in the form needed by frequent requests. Here, the problem isn't performance of database joins, but the latency of the whole network stack. Still, the same solution works.

My playful RESTful proposal to achieve this is to consider an SQL query ("SELECT... WHERE..." etc) as the name of the resource returned. Through caching these names for these resources, it is scalable in the REST way.

This is odd, because the SQL grammar allows all sorts of "names" to identify a "resource", so they don't look like names. But if different queries return the same table - these are just different names for the same resource, which REST permits. (It's more scalable to use the same name for the same resource, but that tends to happen anyway through convention.)

  SELECT/*/WHERE/a/AND/b        # is the same as:
  SELECT/*/WHERE/b/AND/a
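One way to nudge equivalent queries toward the same name is to canonicalise them before they are used as a cache key. A toy sketch, assuming the query is only a column list plus predicates joined by AND (the function and the path layout are hypothetical and nowhere near a real SQL normaliser):

```python
def canonical_name(columns, predicates):
    parts = ["SELECT", ",".join(columns)]
    if predicates:
        parts.append("WHERE")
        # AND is commutative, so sorting (and de-duplicating) is safe here.
        parts.append("/AND/".join(sorted(set(predicates))))
    return "/".join(parts)
```

Both orderings above then map to the single name SELECT/*/WHERE/a/AND/b, so they share one cache entry.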
---

Is this a real problem that businesses need to solve (will pay to fix), beyond annoying developers? Many websites have become slow, as they go back-and-forth aggregating content. Bandwidth is fine; latency is the problem. It's one motivation for Amazon's Silk browser, which assembles content in the cloud, which is local to the services hosted there.

EDIT: "naming resources" is a database; a hierarchical database, the sort that relational databases (largely) replaced. It's not surprising that relational algebra is applicable to REST problems.



