It is never too late to write your own C/C++ command-line utilities (lemire.me)
66 points by mfiguiere 3 months ago | 40 comments



The very idea of checking a JSON file every second is a problematic one. But if you need to do it once a second, does it really matter whether you can do it in 1ms or 10ms? And if latency is that important, why not put a bit more effort into a better design instead of spawning a new process every time for something that could be in the main code? This looks to me like a bad design where the tools take the blame.

In regard to the choice of language, Python is not the best tool for most jobs, but it is a tool that always lets you do the job. If you've found a better tool for a certain job, good for you, enjoy it. Python will always be there when you just need the next job done.


In this case, the best tool for the job is the one that will let you:

1. Continue to easily iterate on the solution now and in the future (compiled languages have a built-in hindrance, in that you never know whether a given binary corresponds to some state of the source tree)

2. Trivially talk to inotify/ReadDirectoryChangesW/FSEvents/kqueue/etc so that you only need to read the file when it actually changes

3. Minimise maintenance (installing third-party headers/libraries, setting up a project, etc etc)

I think the answer here is the shell, fswatch, and jq.

https://github.com/emcrisostomo/fswatch https://github.com/jqlang/jq http://man.openbsd.org/sh


At the very least, I imagine you could improve this a few ways.

1. Spawn a single process/daemonize it.

2. Wake up and check at least once every second (or whatever the preferred timing is).

3. Don't blindly read the JSON every time (check the last modified time before reading the contents). Obviously if it hasn't been modified, then go back to sleep (perhaps shorten the sleep window if time is of the essence).

4. I don't imagine you can guarantee a process finished writing the file before you finish reading the file... so maybe this belongs in a database.
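
For what it's worth, points 1 to 3 above could be sketched roughly like this in Python. This is only an illustration under assumptions, not anyone's actual code: `JsonWatcher` and `watch` are made-up names, the one-second interval is just the figure from the discussion, and treating a parse error as "try again next time" is one possible way to cope with point 4, not a guarantee.

```python
import json
import os
import time


class JsonWatcher:
    """Re-read a JSON file only when its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._last_mtime_ns = None  # nothing has been read yet
        self.data = None

    def poll(self):
        """Return True if the file was (re)loaded on this call."""
        try:
            mtime_ns = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return False
        if mtime_ns == self._last_mtime_ns:
            return False  # unchanged: skip both the read and the parse
        try:
            with open(self.path, "r", encoding="utf-8") as f:
                data = json.load(f)
        except (json.JSONDecodeError, FileNotFoundError):
            # Possibly a half-written file; leave state alone and retry later.
            return False
        self.data = data
        self._last_mtime_ns = mtime_ns
        return True


def watch(path, interval=1.0):
    """Hypothetical driver loop: one long-lived process, one poll per interval."""
    watcher = JsonWatcher(path)
    while True:
        if watcher.poll():
            print("reloaded:", watcher.data)
        time.sleep(interval)
```

A writer that saves to a temporary file and then renames it over the target would make the half-written case much less likely, but that is on the producer side and outside this sketch.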


With LLMs.

Writing the C or C++ equivalent of Python code is wayyy too easy.

I can write code in any language, even the ones I don't know.

Just define the constraints and I'll meet the goals 100%.


> <Every chatbot> can make mistakes. Please double-check responses. (~ at the bottom of every commercial LLM app)

You can write C/C++ using a chatbot if you could without.

If you couldn't without, how would you find the subtle errors that they all produce?


The answer is obviously that customers call and complain or even cancel contracts!


Or a large class action suit for all the money it lost and info it leaked.


First, I assume the script was more complex than shown, and "jq" was not sufficient for some reason - because "jq" on such a small document would likely be comparable in speed to C++ or even faster, while requiring no development effort at all.

The benchmark measures startup time, and Python is notoriously bad at it. Why does the system require a Python app to be started many times per second? Can the system make a localhost socket connection? Then we could have a persistent Python daemon, and it would be many orders of magnitude faster than 22/second.

If for some reason it has to be a process (3rd party system?), then a small C wrapper which makes a socket connection to the Python process would do the trick. This can potentially help with process management too, as it can launch the Python backend on demand, kinda like bazel does.

(This is all predicated on the task being complex enough that it's worth setting up a dual-language system like this. I have no way to tell if that's true or not from the blog post)
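
To make the daemon idea a bit more concrete, here is a rough Python sketch of the persistent side plus a thin client. Everything in it is an assumption made for illustration: the one-line "dotted key path" protocol, the names `LookupHandler`, `start_daemon` and `query`, and the idea that the document is parsed once up front. The small C wrapper mentioned above would play the role of `query`.

```python
import json
import socket
import socketserver
import threading


class LookupHandler(socketserver.StreamRequestHandler):
    """Answer one request per connection: a dotted key path such as 'user.wealth'."""

    def handle(self):
        key_path = self.rfile.readline().decode("utf-8").strip()
        value = self.server.document
        try:
            for key in key_path.split("."):
                value = value[key]
            reply = json.dumps({"ok": True, "value": value})
        except (KeyError, TypeError):
            reply = json.dumps({"ok": False, "error": "no such key"})
        self.wfile.write(reply.encode("utf-8") + b"\n")


def start_daemon(document, host="127.0.0.1", port=0):
    """Serve lookups from a background thread; port=0 lets the OS pick a free port."""
    server = socketserver.ThreadingTCPServer((host, port), LookupHandler)
    server.daemon_threads = True
    server.document = document  # parsed once, reused for every request
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(address, key_path):
    """Thin client: the only work a per-invocation wrapper would need to do."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(key_path.encode("utf-8") + b"\n")
        with conn.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())
```

The point of the design is that interpreter startup and JSON parsing are paid once by the daemon, while each request only pays for a local connection and a dictionary lookup.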


C++ being only 10x faster than Python also undermines the supposed claim, as the Python code materialized the whole JSON tree on loading while the C++ version used the On Demand feature of simdjson. Even worse, he mentioned the input is "small", which means that I/O can be a very large factor in this particular benchmark. (Remember, simdjson is supposed to reach more than a GB/s, so the input should've been at least 5 MB if no other overhead was involved. Is that even small?) It's a shame, because he definitely knows how and could have done a thorough analysis to prove the point, but didn't.


NEON is the worst case for simdjson performance. It's likely that the code actually ran on a server with better vector extensions than his laptop.


There are also a few options in the middle: pypy, Nuitka, cython, and others can take (almost) unmodified Python code and run it faster. Could be worth checking them before reaching for daemons and rewrites.


how many of those help with startup time?

Those things all make fast extensions/methods, they don't eliminate python startup, and sometimes add to it.
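
If anyone wants to check the startup claim on their own machine, a rough way to measure it is sketched below. This is an assumption-laden illustration, not a rigorous benchmark: `mean_startup_seconds` is a made-up helper, the figure includes process spawn overhead as well as interpreter initialisation, and no numbers are claimed here.

```python
import subprocess
import sys
import time


def mean_startup_seconds(interpreter=sys.executable, runs=10, extra_args=()):
    """Average wall-clock time to launch an interpreter that exits immediately."""
    cmd = [interpreter, *extra_args, "-c", "pass"]
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True)  # spawn, initialise, run nothing, exit
    return (time.perf_counter() - start) / runs
```

Calling `mean_startup_seconds()` gives a baseline for the current interpreter. Passing `extra_args=("-I", "-S")` runs CPython in isolated mode without importing `site`, and passing a different `interpreter` path (for example a PyPy binary, if one is installed) lets you compare alternatives on the same footing.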


I used to think that.


jq is the first thing I thought of. I'd certainly add a bash solution to this comparison. Maybe as a daemon to exclude the startup cost.


> Migrating a Python script to C++ could well be worthwhile in some instances. However, it’s not all sunshine and rainbows with C++. Diving into C++ requires more mental effort.

That's a non-trivial statement! A statement which makes this article seem like it was written 10 years ago. Did anyone believe the C++ cited is all the C++ required to make this work? Go see the C++ as tested: https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/...

By this I mean -- if Python isn't fast enough -- isn't that why we have Rust? Compare to my quick Rust: https://play.rust-lang.org/?version=stable&mode=debug&editio...

It kinda crushes the CLI tool use case. Especially w/r/t JSON.


Are all those lambdas to walk the tree down to the field you want so that it doesn’t have to parse all the bytes you don’t care about? Is it the efficiency of SAX without being horrible to read and write?


> Are all those lambdas to walk the tree down to the field you want so that it doesn’t have to parse all the bytes you don’t care about? Is it the efficiency of SAX without being horrible to read and write?

Just the ugly/simple-minded way I wrote it.

You can index into JSON with serde too, like so:

    let wealth_str: &str = data["user"]["wealth"].as_str().expect("Could not parse JSON value.");

See: https://docs.rs/serde_json/latest/serde_json/value/trait.Ind...


Ah, gotcha, thanks.


Your rust code is 32 lines vs 16 and it probably downloads dozens of micro-dependencies.

Unfortunately, the C++ also uses CMake fetch to get its dependencies. I guess this is fine for a small demo, but it would be better to inline them in the repo. At least that’s a realistic option for C++ and those deps are easier to validate anyway.

As far as readability goes, I’m surprised that the C++ is consistently better than Rust in this regard. Most of those extra lines which double the size are for ceremony which nobody cares about when trying to understand the algorithm.

As an example, we have a Rust tool at work which is quite nice (and memory safe :D), but because it has a fat match in one file, most of the code is actually in the middle of the screen. I also spent half an hour looking at the 150 micro-deps it was pulling in and then another half hour verifying if I unknowingly pulled the serde binary blob on my machine (I didn’t). Validating Rust deps is a major PITA both for regular users and distros.

That’s why I write my command tools either in Golang (preferred), Python (if no unusual external deps needed) or C++ if I need speed. Most of the time I am fine with Golang and Python.


> Your rust code is 32 lines vs 16 and it probably downloads dozens of micro-dependencies.

You counted my comments and whitespace as LOC?

> As far as readability goes, I’m surprised that the C++ is consistently better than Rust in this regard. Most of those extra lines which double the size are for ceremony which nobody cares about when trying to understand the algorithm.

Wait, seriously? I mean -- I'm kinda shocked anyone feels this way, but different strokes for different folks.

For instance, I've never understood why something like this is normal for every C++ code example:

    simdjson::padded_string json = simdjson::padded_string::load(argv[1]);

You can raw dog imports in Rust code too, I suppose, but why not spend another line of code and add a `using` at the top?

Notice, also, one way the C++ version saved LOC is the C++ library takes a string path, opens the file path, and reads its contents all at once, which is like half my program?

I'd also suggest that, for many, the LOC has very little to do with how understandable the program is.

> Validating Rust deps is a major PITA both for regular users and distros.

First, doesn't the C++ version also draw in its own dependency? AFAIK serde is pretty lightweight in terms of deps.

Second, is jerking off CMake and/or writing your own JSON parser better? I'm starting to understand -- C++ is this strange land where mind numbing pain is actually better. Up is down. Down is up! The segfaults are character building!

This reminds me of one of the more amusing conversations I've had re: C++. Someone was writing a JIT compiler for their language and had chosen to implement it in Rust. A person rather pointedly asked: Safety isn't a concern here, so why didn't you use C++? This is amusing to me only because C++ is much more difficult for me than Rust. It's often inscrutably complex. So my response would have been -- Let me get this straight -- someone isn't making me? I have a choice, and you want me to choose C++? Why should I be miserable forever?


Except nowadays vcpkg and conan exist, and I also don't have to wait for the world to be built for my tiny app, thanks to the ecosystem support for binary libraries.


Can you post benchmarks? serde_json isn't known for its throughput.


I think you may have missed my point? Which was -- I think the Rust is more understandable, is easier to write, and is easier to get correct, more quickly.

My version also isn't 1:1. I only wrote an approximation of what the author noted was the most relevant portion in the blog post.

If you'd like to benchmark my version you're certainly welcome to, once you've cleaned it up a bunch. The C++ version seems to be SIMD accelerated though? If you want a more relevant performance comparison, perhaps you should find a SIMD-accelerated Rust JSON library.


Frankly, the C++ version is lacking whitespace and is shorter than it needs to be, and as someone who uses C++ daily and has used simdjson before, I still find it much easier to read than the Rust version. Have you considered that this is somewhat subjective?


They seem very similar to me except that

1) I can see the Rust has error handling and I can't tell whether the C++ just doesn't bother or whether there is hidden control flow, since both these very different outcomes are usual in C++. My impression of simdjson is that it just doesn't care - handling errors wouldn't be "ridiculously fast".

2) "Filenames are text" nonsense in the C++. In a sense this is related to error handling, but it's not an error on the popular systems for your files to have names which aren't text, the OS is too lazy to require that.

Mainly, when I look at the C++ I see bug opportunities, some of which existed in the Python and some did not, and I imagine the expenditure of time to hunt down those bugs and fix them (or work around them) compared to the tiny saving of using this code.


Both C++ and Rust are equally unintuitive to understand, especially compared to Python which is effectively pseudo-code already. Whichever is more “understandable” is left solely up to the familiarity of the developer in question.


Currently the startup time is often the least of the problems when using Python for CLI tools. The Python ecosystem is terrible for CLIs.

It is very hard to reliably and consistently ship standalone tools to users in Python without bundling the damn interpreter in a giant blob/archive with the program itself.

The packaging ecosystem and import system of Python is a mess:

- Any PYTHONPATH entry on the target user machine might break your tool (hello bashrc).

- Any globally installed Python package on the system (/lib/python3.X/site-packages) might break your tool.

- Any Python package present in the user home directory might break your tool (e.g. ~/.local/lib/python3.X).

- Many Python packages have binary dependencies that do not respect the ManyLinux (https://peps.python.org/pep-0513/) standard and have random ABI issues on systems with different compilers / libc.

- Some users mix Conda and system packages together in their environment, with different libc versions, and that blows up with random errors on package import.

- On top of that, you have the problems with the versioning of Python itself.

This is honestly insane. It is a major usability pain compared to a simple "unpack and run" of a Golang, C++ or Rust binary.
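
The first and third bullets are easy to reproduce, and CPython's `-I` (isolated mode) flag is one partial mitigation for them. The sketch below is illustrative only: `child_sys_path` and `pythonpath_leaks` are made-up helper names, and the stray directory is a throwaway temp dir standing in for whatever a user's bashrc might export. It does nothing for the system site-packages or ABI problems in the other bullets.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(flags=(), extra_env=None):
    """Start a fresh interpreter and return its sys.path as a list of strings."""
    env = dict(os.environ)
    env.update(extra_env or {})
    code = "import json, sys; print(json.dumps(sys.path))"
    result = subprocess.run(
        [sys.executable, *flags, "-c", code],
        env=env, check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def pythonpath_leaks(flags=()):
    """True if a directory exported via PYTHONPATH ends up on the child's sys.path."""
    stray = tempfile.mkdtemp(prefix="stray_pythonpath_")  # stand-in for a user's export
    try:
        return stray in child_sys_path(flags, {"PYTHONPATH": stray})
    finally:
        os.rmdir(stray)
```

`pythonpath_leaks()` should report that the stray directory is picked up by a default interpreter, while `pythonpath_leaks(("-I",))` should report that it is not, since `-I` ignores the `PYTHON*` environment variables and the per-user site directory.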


Someone wanting to migrate Python scripts to compiled form for performance reasons might want to look at Nim. Cpp-like speeds with less cognitive load.


I really wish Nim's documentation was more accessible, but once you wrap your head around its peculiarities, it's a joy to work with.


It's all about tradeoffs. Do we lug the Python interpreter around and stay flexible and portable? Or do we go for that sweet performance by hooking directly into syscalls and/or the standard library? I love lean and fast software, and if the refactor surface is small and the solution is known, go for it.


If the data are really that mutable, why not go with a service doing the query in RAM and then back it up to disk every second?

This could be an SQLite :memory: opportunity. Keep everything in tables and blow off the parsing.

Then the AI weenies can go ahead and predict the answers in advance.
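
A minimal sketch of the SQLite `:memory:` idea, using Python's standard `sqlite3` module. The table layout and the helper names (`open_store`, `set_wealth`, `get_wealth`, `backup_to_disk`) are assumptions made up for illustration; the only thing taken from the thread is a per-user "wealth" value and the idea of snapshotting to disk periodically.

```python
import sqlite3


def open_store():
    """Create an in-memory database; it disappears when the connection closes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, wealth REAL NOT NULL)")
    return conn


def set_wealth(conn, name, wealth):
    """Insert or overwrite one user's value."""
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(
            "INSERT OR REPLACE INTO users (name, wealth) VALUES (?, ?)",
            (name, wealth),
        )


def get_wealth(conn, name):
    """Return the stored value, or None for an unknown user. No JSON parsing involved."""
    row = conn.execute("SELECT wealth FROM users WHERE name = ?", (name,)).fetchone()
    return None if row is None else row[0]


def backup_to_disk(conn, path):
    """Snapshot the in-memory database to a file (the 'back it up to disk' step)."""
    target = sqlite3.connect(path)
    try:
        conn.backup(target)
    finally:
        target.close()
```

A service built this way would call `backup_to_disk` from a timer, say once a second as suggested above, and answer queries straight from memory in between.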


Modern C++, alongside vcpkg and conan, makes this relatively easy.

Also don't forget to enable hardened runtime flags to get those bounds checks. They are available in any sensible compiler.

As for raw C, if that is your jam, better use Go instead. Even the language creators moved on.


The speedup is mostly because of using a faster library, not because of the language.

I used py2many, a transpiler that can handle many simple language constructs and then modified the calls to json library to something that works.

The hard part of writing such utilities in python and getting a small, faster binary is library call translation. For this, py2many provides a framework. But someone needs to write library to library mappings.

Recently, py2many added a backend for mojo. It's very early in the game. But if mojo provides a python compatible stdlib, it becomes a lot more interesting as a backend.

Code here:

https://github.com/adsharma/json_parsing_langbench


Edit: The numbers have been revised by adding the missing -O3

The comment was based on what I saw by running an unoptimized C++ binary. You can repeat it yourself on your machine.

But yeah - in general, C++ can be 70x faster than python on many benchmarks, which is why tools that translate between languages exist.


Including the output of py2many --mojo=1 test.py here for completeness.

The generated code didn't run out of the box. Slightly modified below:

    from python import Python


    fn main() raises:
        var json = Python.import_module("json")
        for i in range(100000):
            var file = open("test.json", "r").read()
            var data = json.loads(file)

            var wealth_str = data["user"]["wealth"]
            var wealth = Float64(wealth_str)
        print("OK")

Not much faster:

    # time -p ./test
    OK
    real 1.69
    user 1.16
    sys 0.40

But you get a small shippable binary:

    # du -sh test
    4.1M    test


Solution building: I don't really see anything in that use case that would prevent it from using inotify/incron, so I would probably evaluate that side first.


Not to be that guy, but if you’re doing a new CLI and you want something as fast as C++ but closer to the out-of-the-box cross platform and general use of Python, Rust and the Clap library are a really good option. I can whip out a bug free program with that combination in 15 minutes that works on macOS, Linux, and Windows.


Imagine the person that wrote the C++ program has C++ experience and can whip that out in 10 minutes.


This is the way


Now, write it in Rust



