Jd – JSON Diff and Patch (github.com/josephburnett)
184 points by smartmic 4 months ago | 31 comments



I realize it's nice to use short names for applications, but couldn't this also have been called jdiff? I feel like the two-letter tool name is exercising developer memory more than strictly required. I have 42 two-character binaries in /usr/bin. Of course, that's still only about 5% of the available two-letter names.

The developer can always choose to use a shorter local alias for commonly used tools.

That being said, I wonder if this is much better than difftastic, which is more general-purpose but still tree-aware? I suppose this one wouldn't care about JSON dictionary key ordering, at least.


The real headscratchers are the tools that have a proper name, a shortened name, and a command name:

- Stacked Git

- Shortened: stgit

- Command: stg

Lots of “stgit: command not found” ensues.


You have a point of course, but I find it funny that the path is not /users/binaries and, instead, it is a similar abbreviation.

In a way, it is a sort of SEO race for tool devs.


usr stands for user system resources


This is a backronym. /usr is the original user home directory location on classic unix.

https://www.bell-labs.com/usr/dmr/www/notes.html


TIL after so many years that /usr isn't an abbreviation of "user". "UNIX/user system resources" makes a lot more sense in retrospect. Guess I should have RTFM a long time ago!


It looks like that's a newer interpretation than the original:

> As such, some people may now refer to this directory as meaning 'User System Resources' and not 'user' as was originally intended.

https://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/usr.htm...


> couldn't this also have been called jdiff? I feel like the two-letter tool name is exercising developer memory more than strictly required.

Yeah, in retrospect I should have given this a longer name. I was going for a natural fit with `jq`. ¯\_(ツ)_/¯

> I wonder if this is much better than difftastic that is more general purpose, but tree-aware?

There are quite a few good tree-aware JSON diff tools out there. But I wanted one that could also be used for patching. I've tried to maintain the invariant that all diffs can be applied as patches without losing anything. And I also wanted better set (and multi-set) semantics, since the ordering of JSON arrays so often isn't important.
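To make the set and multi-set point concrete, here is a minimal Python sketch of an order-insensitive array comparison. It is not jd's implementation or output format; `canonical` and `multiset_diff` are made-up names for illustration, and it assumes that two elements are "the same" when their key-sorted JSON serializations match.

```python
import json
from collections import Counter


def canonical(value):
    """Serialize a JSON value with sorted keys so equal objects compare equal."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def multiset_diff(old, new):
    """Return (removed, added) elements, ignoring order but respecting counts."""
    old_counts = Counter(canonical(v) for v in old)
    new_counts = Counter(canonical(v) for v in new)
    # Counter subtraction keeps only the positive differences.
    removed = [json.loads(k) for k in (old_counts - new_counts).elements()]
    added = [json.loads(k) for k in (new_counts - old_counts).elements()]
    return removed, added


removed, added = multiset_diff([1, 2, 2, {"a": 1}], [{"a": 1}, 2, 1, 3])
print(removed, added)  # prints [2] [3]
```

Reordering alone produces an empty diff here, while a duplicate that disappears (the second `2`) is still reported, which is the difference between multi-set and plain set semantics.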


Just use gron! greppable json

It turns JSON into JS syntax. It's perfect for these tasks.

https://github.com/tomnomnom/gron


I have recently used jd with great success for some manual snapshot testing. At $work we did a major refactor of $productBackend, so I saved API responses into files for the old and new implementation and used jd (with some jq pre-processing) to analyze the differences. Some changes were expected, so a fully automatic approach wasn't feasible.

This uncovered a few edge cases we likely wouldn't have caught otherwise and I'm honestly really happy with that approach!

One thing I would note is that some restructurings with jq increased the quality of the diff by a lot. This is not a criticism of jd, it's just a consequence of me applying some extra domain knowledge that a generic tool could never have.


> I would note is that some restructurings with jq increased the quality of the diff by a lot.

I would really like to know more about these restructurings. Would you mind dropping me an example here or at https://github.com/josephburnett/jd/issues please? There are some things I won't do with jd (e.g. generic data transformations), but I do plan to add some more semantics-aware metadata with the v2 API.

Also, I'm glad this tool helped you! Made my day to hear it :)


Hello! I sometimes have big JSON files to diff. Their content is a big array of complex objects. The problem I have with all the diff tools I've tried (this one included) is that they can't detect when an element is missing. When that happens, they compute a very long diff where they could have just said "element is missing at index N". Are you aware of a tool without this caveat? Thanks


> When that happens, it computes a very long diff where it could have just said "element is missing at index N".

That's exactly the problem addressed by this issue: https://github.com/josephburnett/jd/issues/50. And I've created a new v2 format to address this and other use cases. The v2 API will compute the longest common subsequence of two arrays and structure the diff around that (a standard way of producing a minimum diff).

I've just released jd 1.9.1 with the `-v2` flag. Would you mind trying one of your use cases to see if the diff looks any better? It should say something exactly like that: "@ (some path) - (some element)".
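For readers who haven't seen the longest-common-subsequence approach, a rough Python sketch follows. It only illustrates the general technique, not jd's v2 code or its diff format; `lcs_table` and `array_diff` are hypothetical names, and elements are compared with plain `==`.

```python
def lcs_table(a, b):
    """table[i][j] is the LCS length of the suffixes a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def array_diff(a, b):
    """Return ("-", index in a, element) and ("+", index in b, element) ops."""
    table = lcs_table(a, b)
    ops = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Part of the common subsequence: nothing to report.
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            ops.append(("-", i, a[i]))
            i += 1
        else:
            ops.append(("+", j, b[j]))
            j += 1
    # Whatever is left on either side is pure removal or pure addition.
    ops.extend(("-", k, a[k]) for k in range(i, len(a)))
    ops.extend(("+", k, b[k]) for k in range(j, len(b)))
    return ops


old = [{"a": "b"}, {"c": "d"}, {"e": "f"}]
new = [{"a": "b"}, {"e": "f"}]
print(array_diff(old, new))  # prints [('-', 1, {'c': 'd'})]
```

A single missing element yields a single removal instead of a rewrite of every following index. The table costs O(len(a) * len(b)) time and memory, so a sketch like this would need refinement for very large arrays.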


I'm probably missing something obvious, but diff seems to be handling this just fine?

    # diff -u <(echo '[{"a": "b"}, {"c": "d"}, {"e": "f"}]' | jq) <(echo '[{"a": "b"}, {"e": "f"}]' | jq)
    --- /dev/fd/63 2024-09-09 16:31:23.376841575 +0200
    +++ /dev/fd/62 2024-09-09 16:31:23.376841575 +0200
    @@ -3,9 +3,6 @@
         "a": "b"
       },
       {
    -    "c": "d"
    -  },
    -  {
         "e": "f"
       }
     ]


Yeah that works. But I also wanted the ability to produce JSON Patch and JSON Merge Patch formats. And to support set semantics, identifying objects by specified keys. And it works on YAML too.
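For context on the JSON Merge Patch format mentioned above, here is a small Python sketch of how such a patch is applied, following the algorithm described in RFC 7396. It is an illustration only and says nothing about how jd produces or applies patches internally; `merge_patch` is just a name chosen for this example.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        # Non-object patches (including arrays) replace the target wholesale.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in a merge patch means "delete this key".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


doc = {"a": "b", "c": {"d": "e", "f": "g"}}
print(merge_patch(doc, {"a": "z", "c": {"f": None}}))
# prints {'a': 'z', 'c': {'d': 'e'}}
```

Because arrays are replaced as a whole, a merge patch cannot say "remove one element at index N", which is one reason the other patch formats exist.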


There is a diff functionality which I have provided in unify-jdocs that I think does exactly what you are looking for. You can get the details here -> https://github.com/americanexpress/unify-jdocs. At present it is only for Java. And if you do take a look, please feel free to give feedback - thanks.


IIRC DeepDiff does something like this:

https://github.com/seperman/deepdiff


Looks neat! Definitely more reliable than my hacky jq script[0] which I had to write for envs with only sh and jq

[0] https://gist.github.com/Checksum/17c84306f563eca40b353f6ed83...


What I've done in the past is pipe JSON into gron then diff. Works sufficiently well for eyeballing.
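Roughly what that pipeline does, sketched in Python for anyone without gron installed. This is a simplification and not gron's actual output (gron uses dotted paths where it can; this version always uses brackets), and `flatten` and `flat_diff` are made-up helper names.

```python
import difflib
import json


def flatten(value, path="json"):
    """Yield one 'path = value;' line per node, loosely in the style of gron."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key in sorted(value):
            yield from flatten(value[key], f"{path}[{json.dumps(key)}]")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, item in enumerate(value):
            yield from flatten(item, f"{path}[{index}]")
    else:
        yield f"{path} = {json.dumps(value)};"


def flat_diff(old, new):
    """Line-diff the flattened forms; n=0 drops the surrounding context lines."""
    old_lines = list(flatten(old))
    new_lines = list(flatten(new))
    return list(difflib.unified_diff(old_lines, new_lines, lineterm="", n=0))


for line in flat_diff({"a": 1, "b": [1, 2]}, {"b": [1, 3], "a": 1}):
    print(line)
```

Sorting the keys makes object key order irrelevant, and every changed line carries its full path. Array elements are still keyed by index, so a removed element shifts everything after it.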


This tool looks great! I’ve been using difftastic lately, which does a fairly good job but struggles with big json files.

One feature I’ve yet to see is applying jq query syntax to the jsons before the diff


> One feature I’ve yet to see is applying jq query syntax to the jsons before the diff

Will you please add this as a feature request? https://github.com/josephburnett/jd/issues. I would like to hear more about how you would use it.


Have been using it in a Go project lately, wonderful library! Ended up with jd after trying a few others that couldn't handle edge cases, such as creating a diff between `[]` and `{}`. Love the diff format as well.


Is this useful for large json files, on the order of GiB?


Was just using it to compare two massive json files, super performant and useful compared to using jq


We're not that far from jsolog.


very very nice, just the tool I needed for the current task - and here it is! :)


Use it daily to make JSON payloads more readable. One of open source's true gems.


I use jq for that!


nice, super useful for debugging API responses. Would be nice to be able to use it as a VSCode extension!


> Would be nice to be able to use it as a VSCode extension!

I've added support to use jd as a Git diff engine: https://github.com/josephburnett/jd?tab=readme-ov-file#use-g.... Can you configure VS Code to use a custom command to show diffs?


This looks neat and useful!



