
This doesn't seem unreasonable.

1) Whether or not it's user-friendly, it's still JSON, which makes JSON-P possible (and they do support JSON-P; there's a sketch of how that works after this list).

2) It probably mirrors the way they store their data (in tables; whether that's SQL, Access, or Excel doesn't really matter).

3) It's more compact than traditional JSON, which means less bandwidth, which means less cost (see the size comparison after this list). Keep in mind that this is essentially a not-for-profit API from a not-for-profit organization with the worst possible budgeting scenario. Yes, gzip would largely eliminate this benefit, but they don't have it enabled, and enabling it may be difficult or impossible under whatever constraints they operate under.
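For anyone unfamiliar with JSON-P: the server wraps the JSON payload in a function the caller names, so a plain script tag can fetch it across domains. A minimal sketch -- the endpoint and the callback parameter name here are illustrative guesses, not taken from the Census docs:

  // Define the function the server will call with the data.
  function handleCensus(data) {
    // data is ordinary JSON once it arrives -- here, an array of rows
    console.log(data);
  }

  // Inject a script tag; the response body will be something like
  //   handleCensus([["P0010001","NAME","state"], ...])
  var s = document.createElement('script');
  s.src = 'https://api.example.gov/data?get=P0010001,NAME' +
          '&for=state:*&callback=handleCensus';
  document.head.appendChild(s);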
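And for a sense of what #3 buys, here are the same two rows in the row-oriented array layout the thread describes versus a keyed-object layout (values borrowed from the example further down the thread):

  [["P0010001","NAME","state"],
   ["710231","Alaska","02"],
   ["4779736","Alabama","01"]]

  [{"P0010001":"710231","NAME":"Alaska","state":"02"},
   {"P0010001":"4779736","NAME":"Alabama","state":"01"}]

The keyed version repeats every column name on every row, so uncompressed size grows with columns times rows; over tables with millions of rows, that's real bandwidth.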

Also, it's entirely possible that this API has existed privately for a long, long time in a CSV format, and they simply made a minor enhancement to make it JSON- and JSON-P-compatible and opened the endpoints up to the public.




#3 may matter for a number of reasons:

- I daresay Census is sitting on some _large_ data sets. Fancy data structures are one thing when you want the population of California, and quite another when you're trying to get California by ethnicity and age group, by zip code.

- Data-structure compactness will matter less to HN readers than to someone on the other end of a dial-up phone line in Oklahoma.

- I am but an egg, but don't fancier data structures require particular implementation decisions for particular tables? Census has a lot of tables -> a lot of decisions.

- God only knows how many different systems are involved in holding all their data. The simpler the data structure, the less they have to get into those -- and/or the simpler the layer they need between some mainframe and the API output.

Also, it may not seem very friendly to HN readers, but it is stupid simple: you can _see_ how it is organized, which makes it accessible to a wider audience. And it's so simple that it can be adopted by any agency publishing tables. Note that there are a lot of agencies with a lot of tables -- something like this has a chance of becoming standard among all of them.
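It's also trivial to consume. A minimal sketch, assuming the array-of-arrays layout with a header row that the thread is discussing (the function name is mine, not from any Census docs):

  // Turn [[header...], [row...], ...] into an array of objects.
  function rowsToObjects(data) {
    var header = data[0];
    return data.slice(1).map(function (row) {
      var obj = {};
      header.forEach(function (key, i) { obj[key] = row[i]; });
      return obj;
    });
  }

  // rowsToObjects([["NAME","state"], ["Alaska","02"]])
  //   -> [{ NAME: 'Alaska', state: '02' }]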

A public agency has special accessibility concerns, and it's just a reality that government agencies have particular legacy technology issues. A lowest common denominator format helps get the data over those obstacles.

Maybe they can do better in places, and they're free to; there is no reason they can't support other formats too. But this one will be available for whatever subset isn't served by those extensions.


#3 seems a likely reason. And like I said, a harder-to-work-with data structure is still better than nothing. And it is free.


All good explanations, but I can't help but wish they had chosen a format like:

  {
    P0010001: [
      '710231',
      '4779736',
      // ... etc.
    ],
    NAME: [
      'Alaska',
      'Alabama',
      // ... etc.
    ],
    state: [
      '02',
      '01',
      // ... etc.
    ],
    // ... etc.
  }


I'm having trouble thinking of any situation where that would be more convenient or useful than the format they are using now. Where would your proposed format be an advantage?


Why would that be better? You'd probably still have to convert it for most uses, and that column-oriented structure would be harder to produce from a CSV file or a SQL query result.
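To make that concrete, here's a sketch of producing each shape from rows as they'd come out of a CSV parser or a SQL result set (the variable names are illustrative, not from any real Census code):

  var header = ['P0010001', 'NAME', 'state'];
  var rows = [
    ['710231', 'Alaska', '02'],
    ['4779736', 'Alabama', '01']
  ];

  // Row-oriented (what they serve): prepend the header and you're
  // done. This can be emitted a row at a time, streaming.
  var rowFormat = [header].concat(rows);

  // Column-oriented (the proposal): every row must be read and
  // transposed before anything can be emitted.
  var colFormat = {};
  header.forEach(function (key, i) {
    colFormat[key] = rows.map(function (row) { return row[i]; });
  });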



