Show HN: Catj – A new way to display JSON files (github.com/soheilpro)
388 points by soheilpro on June 21, 2019 | 120 comments



I was curious if this could be doable with jq, and apparently it is:

  jq -j '
    [
      [
        paths(scalars)
        | map(
          if type == "number"
          then "[" + tostring + "]"
          else "." + .
          end
        ) | join("")
      ],
      [
        .. | select(scalars) | @json
      ]
    ]
    | transpose
    | map(join(" = ") + "\n")
    | join("") 
  '
EDIT: Got the string quoting and escaping working.

EDIT 2: For those who want to save this script, you can put just the jq code in an executable file with the shebang:

  #!/usr/bin/jq -jf


Holy moly, your jq skills are savage.


Thanks, but I haven't really done much in jq before. Writing the above script involved a lot of looking through the manual and experimenting, so this exercise was a learning experience for me. I was only able to do it because the manpage is well written.


Huh, that's the spirit.


And that's how your skills gradually become savage. Kudos.


I was expecting some mention of jq for this post, and you did not disappoint. Thank you for the script -- it works great and I'm adding it to my collection.


Would probably be shorter if you used `tostream`.


You're right. Good tip:

  jq -r '
    tostream
    | select(length > 1)
    | (
      .[0] | map(
        if type == "number"
        then "[" + tostring + "]"
        else "." + .
        end
      ) | join("")
    ) + " = " + (.[1] | @json)
  '
EDIT: For those who want to save this script, you can put just the jq code in an executable file with the shebang:

  #!/usr/bin/jq -rf


Appending ` |` to each line and adding a final line containing `.` yields a jq program that reproduces the original JSON.

  jq -r '
   ( tostream
     | select(length > 1)
     | (
       .[0] | map(
         if type == "number"
         then "[" + tostring + "]"
         else "." + .
         end
       ) | join("")
     )
     + " = "
     + (.[1] | @json)
     + " |"
   ),
   "."
  '


That's insane. Here I was thinking about how much more challenging it would be to parse and reconstruct the object from jq, and you got the idea to take advantage of the syntax similarity to parse it as jq code itself. Nice. And so, that means the inverse of the jq code I posted would simply be:

  ( jq "$(sed 's/$/ |/;$a.')" <<< '{}' )
As in:

  catj example.json \
  | ( jq "$(sed 's/$/ |/;$a.')" <<< '{}' ) \
  > original.json


And a nice example of the path form being amenable to unix tools.

BTW the input json can be null, so -n works (also using process substitution):

  jq -nf <(sed 's/$/ |/;$a.')


Parsing is awkward in jq, but setpath(PATHS; VALUE) will create the necessary structure. PATHS uses the array form, like ["movie", "cast", 5], not .movie.cast[5]. Since 1.5, jq has PCRE regexes, so you could strip the ]s and split on . and [.
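
For example (a quick sketch of setpath creating structure from null input):

  $ jq -nc 'setpath(["movie", "cast", 2]; "Jessica Chastain")'
  {"movie":{"cast":[null,null,"Jessica Chastain"]}}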


Everything you guys have posted in this thread looks like pure ninja magic to me.


Looks like ancient Greek to us plebs.


This would be more suitable for large json files if used with the `--stream` flag. Here's my take on it:

  jq -c --stream '
    . as $in 
    | select(length == 2) 
    | (
      $in[0] | map(
        if type == "number" 
        then "[" + tostring + "]" 
        else "." + . 
        end
      ) | add
    ) + " = " + ($in[1] | tostring)'
Using `--stream` allows jq to start producing output before parsing the entire json file. In my experience, a 700 MB json file can take up 5 GB of RAM in either jq or python -m json.tool.


You're right about `--stream`, but you didn't need the variable assignment. Also, `-c` (besides not being available in jq-1.5, which some people are still using) is pointless in this situation: we're looking to output text, and `-c` is for outputting objects/arrays in a compact format. The fact that `-r` wasn't used causes jq to output the text encoded as JSON strings. So, instead of outputting:

  .movie.name = "Interstellar"
  .movie.year = 2014
  .movie.is_released = true
  .movie.else = "Christopher Nolan"
  .movie.cast[0] = "Matthew McConaughey"
  .movie.cast[1] = "Anne Hathaway"
  .movie.cast[2] = "Jessica Chastain"
  .movie.cast[3] = "Bill Irwin"
  .movie.cast[4] = "Ellen \\\\ Burstyn"
  .movie.cast[5] = "Michael Caine"
You're outputting:

  ".movie.name = Interstellar"
  ".movie.year = 2014"
  ".movie.is_released = true"
  ".movie.else = Christopher Nolan"
  ".movie.cast[0] = Matthew McConaughey"
  ".movie.cast[1] = Anne Hathaway"
  ".movie.cast[2] = Jessica Chastain"
  ".movie.cast[3] = Bill Irwin"
  ".movie.cast[4] = Ellen \\\\ Burstyn"
  ".movie.cast[5] = Michael Caine"
Another point is how the strings to the right of the `=` are displayed. They should be quoted. The reason they're not is that you piped the second element to `tostring` instead of `@json`.
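
A quick illustration of the difference (using -n for no input):

  $ jq -rn '"Interstellar" | tostring'
  Interstellar
  $ jq -rn '"Interstellar" | @json'
  "Interstellar"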

A better version of your suggestion would've been:

  jq -r --stream '   
    select(length > 1)  
    | (
      .[0] | map(
        if type == "number"
        then "[" + tostring + "]"
        else "." + .
        end
      ) | add
    ) + " = " + (.[1] | @json)
  '
The use of `length > 1` instead of `length == 2` is a minor point, but if a future version of jq decides to sometimes put 3 elements in these arrays, your filter would ignore those when we're likely to want them too. `length > 1` checks exactly what we need (that the elements we're going to use are present), while `length == 2` might someday filter some of those out, even if it doesn't right now.

Your use of `add` is neat, though. I wouldn't have thought of that.



Following https://github.com/stedolan/jq/issues/243, I commonly use https://github.com/joelpurra/har-dulcify/blob/master/src/uti... to explore unfamiliar json, e.g.:

  $ docker inspect 620f55df9177| structure.sh |grep -i addr
   .[].NetworkSettings.GlobalIPv6Address
   .[].NetworkSettings.IPAddress
   .[].NetworkSettings.LinkLocalIPv6Address
   .[].NetworkSettings.MacAddress
   .[].NetworkSettings.Networks.bridge.GlobalIPv6Address
   .[].NetworkSettings.Networks.bridge.IPAddress
   .[].NetworkSettings.Networks.bridge.MacAddress
  
  $ docker inspect 620f55df9177| jq .[].NetworkSettings.IPAddress
   "192.168.0.2"


That's awesome work. The only problem is that it does not properly handle keys which are not valid JS identifiers (like 1foo, @foo, foo-bar, etc.).


Well, there's this option without the blacklist:

  jq -r '
    tostream
    | select(length > 1)
    | (.[0] | map("[" + @json + "]") | join(""))                          
      + " = " + (.[1] | @json)
  '
And this other option with the blacklist patterns:

  jq -r '
    tostream
    | select(length > 1)
    | (
      .[0] | map(
        if type == "number" or (tostring | test("[@-]|^[0-9]|^else$"))
        then "[" + @json + "]"
        else "." + .
        end
      ) | join("")
    ) + " = " + (.[1] | @json)
  '
(The blacklist here is non-exhaustive, but an example.)


It occurs to me that a better way to blacklist would be something like:

  jq -r '
    tostream
    | select(length > 1)
    | (
      .[0] | map(
        if tostring | (
          test("^[A-Za-z$_][0-9A-Za-z$_]*$")
          and (
            . as $property
            | ["if", "else"] | all(. != $property) 
          )
        )
        then "." + .
        else "[" + @json + "]"
        end
      ) | join("")
    ) + " = " + (.[1] | @json)
  '
You whitelist against what the syntax allows for identifiers, and then you blacklist reserved keywords. Writing it this way makes it easier to verify for correctness against the ECMAScript spec. This is still a non-exhaustive blacklist, and the whitelist regex lacks the allowed unicode characters.


  jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Unix shell quoting issues?) at <top-level>, line 3:
  jq -j '      
  jq: 1 compile error


This, uh, doesn't work for me on jq-1.5.1.


Hmm... I just downgraded to 1.5 and it seems to work. jq uses just two numbers in its versioning[1], so the third number must come from your distribution's package build. Maybe the issue is with something your distribution did while building the package (like applying a patch)? It might also be that the syntax error is not in the script, but in the JSON you fed it. Sorry, without being able to reproduce the error, I can't help more.

[1] https://github.com/stedolan/jq/releases


#!/usr/local/bin/jq -rf

tostream | select(length > 1) | ( .[0] | map( if type == "number" then "[" + tostring + "]" else "." + . end ) | join("") ) + " = " + (.[1] | @json)


I need to understand how #! works, i.e. `#!/usr/bin/jq --stream -rf` errors with `/usr/bin/jq: Unknown option --stream -rf`.

`#!/usr/bin/jq -rf` with the tostream wrapper in the code works fine.


Like another thread mentioned, shebang (#!) parsing is non-standard. On macOS, I think what you tried would work like you'd expect, but it works differently on linux. The reason is that on linux, after parsing the path to the executable and a space, everything else is taken as a single argument. So if you were in bash, what you did would be the equivalent of doing:

  jq "--stream -rf" path/to/script
and jq doesn't know of any one option called "--stream -rf".

I haven't seen the discussions around these design decisions in the different OSes, but I imagine the crux of the matter is that you have to pick somewhere to stop, and where you chose to stop is largely arbitrary.

I mean, you can have the OS interpret shebangs with multiple arguments, but then you'll want to be able to put spaces in these arguments, so you'll want quoting, and then you'll want to put special characters like newlines inside, so you'll want escaping, etc.

The OS can implement all these things in execve()'s logic, but it might also be preferable to keep the logic simple in the interest of avoiding security-harming bugs. You know: less code, fewer bugs, fewer vulnerabilities.

If --stream had a single letter option equivalent, you could stick it together with the other ones. However, since it doesn't, your only option to make a portable script is to use a shell shebang like #!/bin/bash, and then do:

  exec jq --stream -rf ...
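Concretely, such a wrapper might look like this (a sketch, using the --stream filter from upthread):

  #!/bin/bash
  # The shell shebang needs no extra args; exec then hands control to jq
  # with as many options as needed. "$@" forwards any input files.
  exec jq --stream -r '
    select(length > 1)
    | (
      .[0] | map(
        if type == "number"
        then "[" + tostring + "]"
        else "." + .
        end
      ) | join("")
    ) + " = " + (.[1] | @json)
  ' "$@"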
You might feel that this single argument restriction sucks and is definitely inferior to any implementation of multiple argument shebangs. I don't know if macOS shebangs support quoting, but if they don't and simply split on spaces, then I can tell you they can't do hacky stuff like writing code in a shebang like this:

> https://unix.stackexchange.com/questions/365436/choose-inter...

Granted, it's bad practice, but a little cool nevertheless.


IIRC args in a hashbang aren't posix compatible.


A hashbang is not defined by posix, so even using it without args isn't "posix compatible".

From https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...

> If the first line of a file of shell commands starts with the characters "#!", the results are unspecified.

There is no more specification for "#!/bin/bash" than for "#!/usr/bin/jq -jf".

The exec page provides even fewer words about how to interpret shebangs, if you thought perhaps I was linking to the wrong portion of the posix spec.


Args in a hashbang?

Maybe not, but I'm pretty sure every system supports a single arg. And very few (none?) support more.


macOS appears to support multiple args just fine. Which is why it annoys me that Shellcheck bitches about using more than one arg even though I'm writing a script for macOS specifically.


ShellCheck is right insofar as compatibility is concerned. You can only rely on the shebang supporting one argument. I'd personally just ignore that warning if I were writing for MacOS specifically, but you can configure ShellCheck to ignore certain errors that you don't care about[1].

[1] https://github.com/koalaman/shellcheck/wiki/Ignore


Yeah, but it's simpler just to move the additional args to `set` calls than to remember the syntax for disabling the directive. It's just irritating, and especially so because, for some reason, in VSCode it ends up highlighting literally the entire file as an error instead of just the shebang.


Have you seen gron[0]? It's similar: flattens JSON documents to make them easily greppable. But it also can revert (ie, ungron) so you can pipe json to gron, grep -v to remove some data, then ungron to get it back to json.
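
For a taste of the round trip (a sketch from memory; gron's exact output may vary by version):

  $ echo '{"cast":["Bill Irwin","Michael Caine"]}' | gron
  json = {};
  json.cast = [];
  json.cast[0] = "Bill Irwin";
  json.cast[1] = "Michael Caine";
  $ echo '{"cast":["Bill Irwin","Michael Caine"]}' \
    | gron | grep -v Caine | gron --ungron
  {
    "cast": [
      "Bill Irwin"
    ]
  }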

[0] https://github.com/tomnomnom/gron


Actively developed and written in Go.

The project linked to is from 2014, with the last update in 2015, and it is on NPM...

What is left to say? Thank you!


What's wrong with it not being updated in 5 years? It's a simple script. It's probably stable and doesn't need to be "actively developed".


I see this sentiment a lot. I agree with you, simple things can be done and don't need active development.


I much prefer jq[1]. Actively developed and written in C.

[1]: https://github.com/stedolan/jq


Used to be written in PHP, right?


I haven't seen that one, but it looks very similar to the one I use: https://github.com/micha/json-table


That's actually a nice lifehack. Much simpler than jq. Unfortunately, it would be harder to express all the kinds of logical conditions that jq allows (even if jq isn't that intuitive).

It still feels like there must be something in between: some way to make queries over json more naturally than with jq, yet with enough power.


jq is certainly a unique language, which makes it unfamiliar to work with. Intuitiveness and a natural feel come once one has gotten familiar with it after a bit of practice and reading the manual, though. It's a very well thought-out language. A very nice design.

It might help to recognize how it's influenced by shell languages and XPath, if you're familiar with those.


Well, no arguing there, it is indeed. And I use it from time to time. However, it's not like I need a tool like that every day, and if I'm not using it for a week I usually need to "learn" it all over again.


What does grep + gron give you over jq?


jq seems very powerful. I don't deal with json all that often; my most common use case (by far) is `jq '.' -C`, and it took a few tries for me to remember that syntax.

The idea of flattening, grepping, then reverting sounds very appealing and sounds like a better fit for me.


> `jq '.' -C` and it took a few tries for me to remember that syntax.

I don't think you really need either `.` or `-C`. Plain `jq` seems to produce the same colored output of the input by default.


It does look like neither is needed if you pipe a file into jq, but `jq . file.json` requires the `.`, and if you're piping into a pager, like less, you need both `.` and `-C` to get colored output (that was the case with the alias I had pulled up). I am using 1.5 and haven't looked to see if 1.6 changes this.


I see. I doubt that behavior has changed, then.

`-C` would be required when piping because most of the time (with the exception of piping into less) when stdout is not a terminal, it doesn't make sense to include terminal color escape sequences. You'd end up with those codes in your files, and grep would be looking at them for matches, for example.
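
So when piping into a pager, the usual incantation is something like:

  # -C forces color escapes even though stdout is a pipe; less -R renders them.
  jq -C . file.json | less -R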

`.` would be required when passing the file as an argument instead of stdin, because jq interprets the first argument as jq-code. If you don't include `.` it would interpret the filename as jq-code.


`.` is still needed if I'm piping in json--but only when I'm piping out. Otherwise, the help text goes to stderr and nothing goes to stdout.

I do honestly think jq is a cool and powerful tool. I also appreciate little things like auto-color when appropriate--git also does this. Git also uses your pager, which might trivialize my personal use case.


Wrong question: It's not a competition.

There are cases when you have some complicated json and just want to search for stuff. Then you use grep + gron.

There are cases when you want a complete json processing tool. Then you use jq.

You can probably simulate each approach with the other, but the code needed to do this is just too tedious to write. So you use whatever tool fits your use case.


No, but you still have to make a decision about which tool to use. So it's helpful to have a sense of the use cases for each.


I find it useful when I don't know what the json schema is. Then you can just do a quick gron + grep and find where the interesting parts of a large json document are.


As far as I can tell, jq doesn't do flattening.


Not built in, but @jolmg posted a script here which does the needful.


Maybe GP meant that jq can do selection as well, i.e. that grepping is redundant after jq. But jq is much more complicated to learn and grep works on all inputs (not just json), so it makes a lot more sense to learn and use grep properly.


Simplicity.


Love the greppability and reconstructability. This should be the submission.


I wrote a similar tool:

https://github.com/twpayne/flatjson

The flat format is great for diffs:

  --- testdata/a.json
  +++ testdata/b.json
  @@ -1,5 +1,6 @@
   root = {};
   root.menu = {};
  +root.menu.disabled = true;
   root.menu.id = "file";
   root.menu.popup = {};
   root.menu.popup.menuitem = [];
  @@ -9,8 +10,5 @@
   root.menu.popup.menuitem[1] = {};
   root.menu.popup.menuitem[1].onclick = "OpenDoc()";
   root.menu.popup.menuitem[1].value = "Open";
  -root.menu.popup.menuitem[2] = {};
  -root.menu.popup.menuitem[2].onclick = "CloseDoc()";
  -root.menu.popup.menuitem[2].value = "Close";
  -root.menu.value = "File";
  +root.menu.value = "File menu";


There is actually a standard around writing paths into JSON objects: JSON Pointer, https://tools.ietf.org/html/rfc6901. It’s straightforward, and avoids ambiguity between separator and key name by simple replacement, e.g. `/foo/bar~0baz~1quux` looks up a key named "foo", then a key named "bar~baz/quux" inside it. It’s not particularly widely used, but I’ve come across it in a few places over the years (it’s not a common thing to need to do), and probably most recently JMAP uses it for backreferences.
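
Decoding a pointer into jq's array-path form is simple enough (a sketch; note the RFC requires replacing ~1 before ~0):

  $ jq -n '"/foo/bar~0baz~1quux" | ltrimstr("/") | split("/")
           | map(gsub("~1"; "/") | gsub("~0"; "~"))'
  [
    "foo",
    "bar~baz/quux"
  ]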

(I haven’t run it, but a skim of the code suggests that this tool will turn `{"foo.bar": "baz", "foo": {"bar": "baz"}}` into `["foo.bar"] = "baz"` and `.foo.bar = "baz"`, resolving the separator ambiguity in a pretty JavaScripty way.)


When XML rose to popularity, I think the first thing I wrote was a small Python program to flatten/unflatten XML into per-line entries quite similar to the example output in the article.

Text streams processed line-by-line by dozens or hundreds of line-based tools are immensely powerful and universal. It's all Unix heritage, often overlooked by fancy modern designs that follow fashion rather than root themselves in substance.

Surely text streams have their share of limitations like everything else, but in practice you can retrofit nearly anything into line-based text streams and get an immediate productivity multiplier by being able to apply a whole array of established tools to that data. Proof of that power is that it has been worthwhile to write converters to and from text and other formats. Not only can you find translators to turn various hierarchical or object-oriented formats into text, but you can even convert a PNG into text and back (with SNG).

Text streams are like roads with lanes. They're ages old, they're pretty good at separating and guiding traffic, and they're somehow suboptimal in several senses yet rarely can anyone point out a single, clear practical improvement on laned roads, not to mention a system for containing traffic flows that is superior to them.


Augeas can do something similar too, and not only for JSON but for XML and 200+ other config file formats.[0]

  $ augtool -r . -L --transform 'JSON.lns incl /catj-eg.json'  <<< 'print /files/catj-eg.json'
  /files/catj-eg.json
  /files/catj-eg.json/dict
  /files/catj-eg.json/dict/entry = "movie"
  /files/catj-eg.json/dict/entry/dict
  /files/catj-eg.json/dict/entry/dict/entry[1] = "name"
  /files/catj-eg.json/dict/entry/dict/entry[1]/string = "Interstellar"
  /files/catj-eg.json/dict/entry/dict/entry[2] = "year"
  /files/catj-eg.json/dict/entry/dict/entry[2]/number = "2014"
  /files/catj-eg.json/dict/entry/dict/entry[3] = "is_released"
  /files/catj-eg.json/dict/entry/dict/entry[3]/const = "true"
  /files/catj-eg.json/dict/entry/dict/entry[4] = "director"
  /files/catj-eg.json/dict/entry/dict/entry[4]/string = "Christopher Nolan"
  /files/catj-eg.json/dict/entry/dict/entry[5] = "cast"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[1] = "Matthew McConaughey"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[2] = "Anne Hathaway"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[3] = "Jessica Chastain"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[4] = "Bill Irwin"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[5] = "Ellen Burstyn"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[6] = "Michael Caine"

[0]

  $ ls  .../share/augeas/lenses/dist/|wc
        221     221    2867


For exploring large files I released a program called "JSONSmash" that runs a shell which lets you browse the data as if it was a filesystem.

https://blog.tedivm.com/open-source/2017/05/introducing-json...


This looks cool! I'm curious: did you consider using FUSE for this, instead of writing a shell? Using FUSE might make installation harder for less technical users, but the upside is that they could use their choice of file manager (shell commands, curses-based, or GUI).

As you've already implemented the most relevant commands (cd, ls, cat), it would probably be easy to make a FUSE version using fs-fuse / fuse-bindings.


What an interesting idea! On my last contract, I had to deal with 100 MB+ JSON responses from a barely-documented API. This would have come in handy when I was figuring it out.

I've used JSONExplorer for this purpose, but it is web based and doesn't handle files this large.

Extending the filesystem metaphor to JSON data and re-using the same commands strikes me as a great idea.

Did another project inspire you, or did you come up with the concept yourself?

Have you done a Show HN yet?


I came up with this idea on my own when dealing with the AWS Bulk API files. I did make a "Show HN" but it got a total of four upvotes.


HN is very sensitive to timing; AFAIK the moderators allow resubmissions when that seems to be the case.


This is something between a joke and a thought experiment.

    function cason(x){
    switch(x[0]){
      case "movie": switch(x[1]) {
        case "name"       : return "Interstellar";
        case "year"       : return 2014;
        case "is_released": return true;
        case "director"   : return "Christopher Nolan";
        case "cast": switch(x[2]){
          case 0: return "Matthew McConaughey";
          case 1: return "Anne Hathaway";
          case 2: return "Jessica Chastain";
          case 3: return "Bill Irwin";
          case 4: return "Ellen Burstyn";
          case 5: return "Michael Caine";
        }
      }
    }
    }


Well, that's kind of the inverse of writing switch statements like

    def license(kernel):
        return {"Linux": "GPL",
                "FreeBSD": "BSD",
                "NT": "Proprietary"}[kernel]


You can do this easily with the json_tree function from SQLite's JSON1 extension. It's given as an example in the documentation:

https://sqlite.org/json1.html#jtree

  SELECT big.rowid, fullkey, value
    FROM big, json_tree(big.json)
   WHERE json_tree.type NOT IN ('object','array');


And with the fileio sqlite extension, you can even directly query files (and their contents, and directories recursively too, no less) from SQL.
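
Something like this sketch, assuming a loadable build of the fileio extension (fsdir() and readfile() live in SQLite's ext/misc/fileio.c; the CAST works around JSON functions rejecting blobs):

  sqlite3 :memory: -cmd '.load ./fileio' \
    "SELECT f.name, fullkey, value
       FROM fsdir('.') AS f, json_tree(CAST(readfile(f.name) AS TEXT))
      WHERE f.name LIKE '%.json'
        AND json_tree.type NOT IN ('object','array');"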


That's nice, I like that.

Related: if you want more of a csv style, see JSON Lines, aka "newline-delimited JSON":

http://jsonlines.org/


I don't see what jsonlines has over yaml; in fact, the jsonlines examples presented there are almost trivially converted to yaml, e.g. the first example is valid yaml if you add a '- ' to each line.

And JSON is (almost) a perfect subset of yaml.

I've been using csv lately. Its reputation is overstated.

What I like is that it's far more compact than yaml or json and trivially pulled into sqlite for ad-hoc queries.
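
For instance (a sketch; data.csv stands in for your file):

  # In csv mode, .import creates table t using the header row for column names.
  sqlite3 :memory: -cmd '.mode csv' -cmd '.import data.csv t' \
    'SELECT count(*) FROM t;'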


One thing I like about JSON Lines is robustness to bad data - if an individual line is corrupted, you can discard it / print a warning and move on, and the parser can start again at the next newline. This makes it useful for log messages / metrics, because if something crashes while emitting a log line, you can recover. If something crashes while emitting an item in a YAML list, you might corrupt the entire rest of the document.

Another is that it makes streaming processing a little easier. Once you have a line, you know you can attempt to process it, and you can shard processing on newlines without a full YAML processor. Tools that work on newlines or tools that can just split on lines can handle the first level of JSON Lines output.
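
A sketch of that recovery property in shell (metrics.jsonl and the .t field are hypothetical):

  # Validate each line independently; a corrupt line costs only itself.
  while IFS= read -r line; do
    if printf '%s\n' "$line" | jq empty 2>/dev/null; then
      printf '%s\n' "$line" | jq .t
    else
      echo "skipping corrupt line" >&2
    fi
  done < metrics.jsonl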


> ... and trivially pulled into sqlite for ad-hoc queries.

Or spreadsheets, to work out a plan, and then MySQL.


Try it with deno:

  deno install catj https://deno.land/std/examples/catjson.ts --allow-read


Similar to python-jsonpipe (8 years ago).

https://github.com/zacharyvoase/jsonpipe

It includes 'jsonunpipe', so you could grep part of the JSON and still get JSON back:

  echo '{"a": 1, "b": 2}' | jsonpipe | grep b | jsonunpipe
  # {"b": 2}


Oh no. This looks like the horrible config format a colleague of mine invented at Yahoo when making an absolutely horrific config system. This brings back bad memories.

This may look cute but it is horrific when dealing with large configs and you have to reconstruct all the structure in your head.

Also, when you have a format that nests using brackets, braces, and parentheses, you can get help from the editor. This format does not give you that.

I'm not a huge fan of JSON (and the above mentioned format was invented because none of us were fans of XML at the time), but it turns out that both XML and JSON are actually easier to work with in practice than this format. Not least because there is ample tooling for JSON (and XML).

The lesson I learnt: I may hate XML (or in this case JSON), but finding an alternative that is better is not easy.


I think the idea is not that this is for storage or editing, but just for querying - you should keep your data in JSON, but if you want to find where something is, do `catj foo.json | grep something` and it'll tell you all the paths in the document where you can find the string "something". The intent is not to open catj format in a text editor, or to use the output of catj for anything other than ad-hoc purposes.


https://github.com/stedolan/jq/issues/243

jq can also do the same thing, with more flexibility. And it's possible to combine it with a bash alias to make it indistinguishable from catj.
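
For instance, as a shell function (easier to quote than an alias; the filter is the tostream one from upthread):

  catj() {
    jq -r '
      tostream
      | select(length > 1)
      | (
        .[0] | map(
          if type == "number"
          then "[" + tostring + "]"
          else "." + .
          end
        ) | join("")
      ) + " = " + (.[1] | @json)
    ' "$@"
  }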


> combine with bash alias

... or you know, you could put the jq script in an executable file and add a shebang like

  #!/usr/bin/jq -jf
or

  #!/usr/bin/jq -rf
In my opinion, aliases should mostly be used to add default options only. Not really to insert whole scripts into them.


I've always been of the mind that aliases should be "whatever the user finds convenient." I don't think I've ever seen code that depends on them.


This is really similar to the format that AWS uses to represent recursive structures (arrays, maps) as a single array of key-value pairs in their APIs. catj could potentially be used to create that format when working with the raw API directly instead of an SDK.


Please don't write JSON like the example; it's like putting 4 files in a hierarchy of 20 folders. Way over-structured.

That said, if you're stuck dealing with bad JSON like this, with a low signal-to-noise ratio, this is a decent way to redisplay it.


Okay, I made one with javascript (go to https://underscorejs.org/ and open the console):

  var json = JSON.parse('{"my": "json"}');
  (function printRecursively(ob, _keys = []) {
    _.map(ob, (val, key) => {
      var k = _.isNumber(key) ? '[' + key + ']' : '.' + key;
      var keys = _keys.concat(k);
      if (_.isObject(val)) printRecursively(val, keys);
      else console.log(keys.join('') + ' = ' + typeof val + ' ' + val);
    });
  })(json);

EDIT: how do I markup code on HN?


> how do I markup code

Indent each line with four spaces. Please please keep line length very short (under forty?) as HN's pre tags are absolutely not mobile friendly and can trash the entire page.

Edit: actually it seems to at least scroll within the comment div on overflow now, that's a huge improvement!


Hm, essentially JSON->Properties file. Cool. I wrote a little script the other day and decided Properties format was a pleasant way to define the config, maybe this could dovetail with similar future endeavors.


That's really smart; in fact this was, maybe until now, the only reason for me to resort to csv. (With pandas, by the way, it's really easy to flatten json into csv.) JSON is such a nice format, but the tools are really not there yet. I guess this makes it possible to combine json with line-based tools like head, tail, sort, uniq, etc.


Looks a lot like ye olde gron: https://github.com/tomnomnom/gron

Here's a Python implementation of gron: https://github.com/venthur/python-gron


The output format is like a slightly better INI (in that it supports deeper hierarchies), although arrays would be a pain to write one index at a time like that.

And for purely aesthetic, nitpicky reasons I think the leading period in each line is redundant.


I don't think that leading period is redundant. It indicates that the top-level value is an object, as opposed to an array or some basic value.


Sweet! I find this tool extremely useful for parsing dmn files in javascript (via xml2js). But right now I am more interested in a similar feature for xml/html documents. Hoping someone can point me to it. Thanks.


It looks almost executable! If it were, that might be handy now and then.


Add in a few nil and undefined checks automatically and it really is executable.


I made this to do something similar; it goes both ways: https://github.com/konsumer/jsflat


Just curious, what would be the use case for something like this?


Author here. I use it all the time when working with jq [1] to find the path of the nodes that I want to select or filter. It also makes it much easier to understand the structure of deeply nested JSON files.

[1] https://stedolan.github.io/jq


I could see it being useful when using grep or writing awk scripts. For the latter, you usually have to carry around some state if you want to deduce the path to a desired leaf node.

I could also see it being useful for just finding the index of a particular array element.


Easier to parse maybe. You can parse it line by line. I also think this format is easier to edit.


I use the "JSON viewer awesome" extension on Chrome. It displays JSON in a collapsible vertical tree and has a copy-path button.

This is a nice CLI-based alternative.


I would recommend an optional flag to disable the output coloring, in the case where someone wants to pipe this to a file or other program.


Reminds me of how Spring configuration works between yaml and properties files. It does a similar flattening to translate between them.
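
For instance (a sketch of the mapping from memory, not verified Spring output), this YAML:

  movie:
    cast:
      - Anne Hathaway

flattens to this properties line:

  movie.cast[0]=Anne Hathaway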


pwsh is an open source shell that runs on Unix and supports JSON natively, so you don't need to flatten it or use jq.

Here's an example from a couple of days ago:

https://twitter.com/mikemaccana/status/1141706132823695362?s...


All those redundancies make me sad


What a simple and clever idea. I would love something like that for XML in C#.


Looks like a good candidate for a lesspipe plugin.


"New", from 2014.


Not much different from PHP's:

  print var_export(json_decode($json, TRUE), TRUE);


this looks nice


cat foo.json | jq .


Might want to put [2015] in the title? Certainly not a new way.


This is not merely a display; it is really a sanitization of the brain-damaged json syntax.


What method would you use to serialize data structures into a format that is both easily read and written by both a computer and a human?


In the very rare cases where you really need to serialize complicated data structures, something like s-expressions or (gasp) json, or even a memory dump, is perfectly appropriate. However, most of the time your data structures should be lists of numbers or lists of strings. For those, you do not really need a "format"; you can simply print and scan them from a text file.


I have never once heard this opinion of JSON. JSON is a breath of fresh air. I'm really struggling to imagine what sort of software you work on.


If you come from xml hell, then yes, json may even seem a reasonable option. But if you are used to the terse beauty of printf and scanf, then a json file looks like bad sarcasm.


Your opinion is bad.


json will be mocked in a few years by the same people who today mock xml. In the end, unix always survives.


"unix" isn't a file format, nor is it in competition with either XML or JSON, so claiming it will "survive" either is a meaningless boast.


While I don't disagree with you, if you look at the comment above this you'll see @enriquto mentions just dumping lists to a text file. Pretty sure that's what was implied by 'unix': use a simple text file, dumped to the fs, that can be easily parsed by standard tools on any Unix-like.


JSON is usually mocked by people who don't get to use Javascript objects natively. The benefits aren't obvious or tangible to them.

XML is native to nothing. It's annoying to use everywhere.

UNIX tools just don't compare. Sure, they stick around, because sometimes they're the easiest tool for some job, until they aren't and you regret starting out with them.

In a few years, Javascript and Javascript-compatible languages are likely to be bigger than ever before. A whole generation of developers has been trained mainly on them. Billions of dollars have been invested into the ecosystem. Whether we like it or not, that's the reality. For that reason alone, JSON will stick around.


People mock xml because it's an over complicated mess with too many features.



