
This doesn't seem unreasonable.

1) Whether or not it's user-friendly, it's still JSON, which makes JSON-P possible (and they do support JSON-P; there's a sketch of how that works after this list).

2) It probably mirrors the way they store their data (in tables; whether that's SQL, Access, or Excel doesn't really matter).

3) It's more compact than traditional JSON, which means less bandwidth, which means less cost (see the size comparison after this list). Keep in mind that this is essentially a not-for-profit API from a not-for-profit organization with the worst possible budgeting scenario. Yes, gzip would largely eliminate this benefit, but they don't have it enabled, and enabling it may be difficult or impossible under whatever constraints they operate under.
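For anyone unfamiliar with JSON-P: the server wraps the JSON payload in a function the caller names, so a plain script tag can fetch it across domains. A minimal sketch -- the endpoint and the callback parameter name here are illustrative guesses, not taken from the Census docs:

  // Define the function the server will call with the data.
  function handleCensus(data) {
    // data is ordinary JSON once it arrives -- here, an array of rows
    console.log(data);
  }

  // Inject a script tag; the response body will be something like
  //   handleCensus([["P0010001","NAME","state"], ...])
  var s = document.createElement('script');
  s.src = 'https://api.example.gov/data?get=P0010001,NAME' +
          '&for=state:*&callback=handleCensus';
  document.head.appendChild(s);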
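And for a sense of what #3 buys, here are the same two rows in the row-oriented array layout the thread describes versus a keyed-object layout (values borrowed from the example further down the thread):

  [["P0010001","NAME","state"],
   ["710231","Alaska","02"],
   ["4779736","Alabama","01"]]

  [{"P0010001":"710231","NAME":"Alaska","state":"02"},
   {"P0010001":"4779736","NAME":"Alabama","state":"01"}]

The keyed version repeats every column name on every row, so uncompressed size grows with columns times rows; over tables with millions of rows, that's real bandwidth.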

Also, it's entirely possible that this API has existed privately for a long, long time in a CSV format, and they simply made a minor enhancement to make it JSON- and JSON-P-compatible and opened the endpoints up to the public.




#3 may matter for a number of reasons:

- I daresay Census is sitting on some _large_ data sets. Fancy data structures are one thing when you want the population of California, and quite another when you're trying to get California by ethnicity and age group, by zip code.

- Data-structure compactness will matter less to HN readers than to someone on the other end of a dial-up phone line in Oklahoma.

- I am but an egg, but don't fancier data structures require particular implementation decisions for particular tables? Census has a lot of tables -> a lot of decisions.

- God only knows how many different systems are involved in holding all their data. The simpler the data structure, the less they have to get into those -- and/or the simpler the layer they need between some mainframe and the API output.

Also, it may not seem very friendly to HN readers, but it is stupid simple: you can _see_ how it is organized, which makes it accessible to a wider audience. And it's so simple that it can be adopted by any agency publishing tables. Note that there are a lot of agencies with a lot of tables -- something like this has a chance of becoming standard among all of them.
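It's also trivial to consume. A minimal sketch, assuming the array-of-arrays layout with a header row that the thread is discussing (the function name is mine, not from any Census docs):

  // Turn [[header...], [row...], ...] into an array of objects.
  function rowsToObjects(data) {
    var header = data[0];
    return data.slice(1).map(function (row) {
      var obj = {};
      header.forEach(function (key, i) { obj[key] = row[i]; });
      return obj;
    });
  }

  // rowsToObjects([["NAME","state"], ["Alaska","02"]])
  //   -> [{ NAME: 'Alaska', state: '02' }]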

A public agency has special accessibility concerns, and it's just a reality that government agencies have particular legacy technology issues. A lowest common denominator format helps get the data over those obstacles.

Maybe they can do better in places, and they're free to; there is no reason they can't support other formats too. But this one will be available for whatever subset isn't served by those extensions.


#3 seems a likely reason. And like I said, a harder-to-work-with data structure is still better than nothing. And it is free.


All good explanations, but I can't help but wish they had chosen a format like:

  {
    P0010001: [
      '710231',
      '4779736',
      // ... etc.
    ],
    NAME: [
      'Alaska',
      'Alabama',
      // ... etc.
    ],
    state: [
      '02',
      '01',
      // ... etc.
    ],
    // ... etc.
  }


I'm having trouble thinking of any situation where that would be more convenient or useful than the format they are using now. Where would your proposed format be an advantage?


Why would that be better? You'd probably still have to convert it for most uses, and that column-oriented structure would be harder to produce from a CSV file or a SQL query result.
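To make that concrete, here's a sketch of producing each shape from rows as they'd come out of a CSV parser or a SQL result set (the variable names are illustrative, not from any real Census code):

  var header = ['P0010001', 'NAME', 'state'];
  var rows = [
    ['710231', 'Alaska', '02'],
    ['4779736', 'Alabama', '01']
  ];

  // Row-oriented (what they serve): prepend the header and you're
  // done. This can be emitted a row at a time, streaming.
  var rowFormat = [header].concat(rows);

  // Column-oriented (the proposal): every row must be read and
  // transposed before anything can be emitted.
  var colFormat = {};
  header.forEach(function (key, i) {
    colFormat[key] = rows.map(function (row) { return row[i]; });
  });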



