Justine's work on all this has been hugely inspiring and a big breath of fresh air in an area that always kicks the can of reducing final binary sizes down the road.
That's one of my very few gripes with Rust: compared to what Zig / C / C++ can produce, Rust's release binaries are quite huge. There's likely a lot of possible work to be done in the area of tree shaking and generally only leaving in what's actually used in the final binary. I hope this becomes a priority soon-ish.
Why wouldn't it help? Surely this method (and others) does static analysis on what's actually used in the final program, no?
Example: you use a crate with 13 functions. You only use 1 but it depends on 2 others in the same crate that don't depend on anything else. Why shouldn't the end result only compile 3 of the 13 functions?
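To make the question concrete, a purely hypothetical crate might look like this, and I'd expect only `used_one` and its two helpers to survive in the binary:

```rust
// Hypothetical dependency crate: 13 public functions in reality,
// boiled down to 4 here. The application only calls `used_one`.
pub fn used_one() -> u32 {
    helper_a() + helper_b() // its only transitive dependencies
}

fn helper_a() -> u32 { 1 }
fn helper_b() -> u32 { 2 }

pub fn never_called() -> u32 { 42 } // should not end up in the binary
```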
Or maybe I am misunderstanding or am badly informed (not a compiler / linker engineer, so I'm likely making naive assumptions) -- apologies if so.
I understand it's not easy; I am just a bit surprised that the Rust team is not chasing this a bit more aggressively, let's say. It puts a slight stain on an otherwise excellent project.
Well it helps, but it'd be like spooning water out of a bucket when a faucet is pouring more into it. The NPM dependency model solves the diamond dependency problem by schlepping in multiple versions of the same package. C / C++ / Java (and I assume Zig too) require a single version of any given package. The way Rust and NPM do things adds a whole dimension of complexity to what needs to be shaken. It's sort of like a devil's bargain because it helps their ecosystems grow really fast, but it creates a lot of bloat over the long run. Rust is still relatively new so binary sizes are still pretty good compared to what we might witness in the future. So enjoy the golden age of Rust while it's golden!
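Concretely (hypothetical crate names): if two of your dependencies want semver-incompatible versions of the same crate, Cargo resolves the diamond by building both:

```toml
# Your app's Cargo.toml
[dependencies]
crate_a = "1"   # internally depends on rand 0.7
crate_b = "1"   # internally depends on rand 0.8
```

`cargo tree --duplicates` will then list both `rand 0.7.x` and `rand 0.8.x` from the lock file, and both get compiled into the final binary.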
No I don't think so, but I'm not certain. I think with Go the problem always had more to do with the way the language is designed, making it difficult for the linker to prune dependencies.
Wow, really? I've always thought that Cargo by default forces a package into one particular version, and that having a particular package in multiple versions is something people opt into only in rare circumstances (similar to the Java/Maven shading case).
I still wish binary sizes were taken seriously at some point, though. But I am aware that it's likely (a) not at all a priority currently, and (b) a gigantic effort.
I’m not sure what “taken seriously” would mean to you, but I work in embedded. We get the Rust compiler to spit out programs in the hundreds or thousands of bytes regularly. Binary sizes are more about the people doing the programming than the compiler missing some sort of crucial technology.
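For a sense of scale, here's roughly the shape of a minimal freestanding program (a sketch, assuming a bare x86_64 Linux target and building with `cargo rustc --release -- -C link-arg=-nostartfiles`); everything beyond this is stuff the programmer chose to pull in:

```rust
#![no_std]  // no standard library, only `core`
#![no_main] // no Rust runtime entry point

use core::panic::PanicInfo;

// A panic handler is mandatory without std; here we just spin.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// With no C runtime, the linker looks for `_start` directly.
#[no_mangle]
pub extern "C" fn _start() -> ! {
    // Exit immediately via the Linux `exit` syscall.
    unsafe {
        core::arch::asm!(
            "mov rax, 60",  // SYS_exit
            "xor edi, edi", // status 0
            "syscall",
            options(noreturn)
        )
    }
}
```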
I get that this is possible with `no_std` but that's not an option for me.
Guess I'll dig out the guides for reducing binary sizes, but last time I needed tokio + opentelemetry + a Prometheus adapter, my release binary was always at least 5MB.
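Before reaching for tweaks I'll probably look at where the bytes actually go; cargo-bloat (a third-party tool, assuming it still works the way I remember) is handy for that:

```sh
cargo install cargo-bloat
# Biggest crates in the release binary
cargo bloat --release --crates
# Twenty biggest functions
cargo bloat --release -n 20
```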
Also sorry, didn't mean to come off as dismissive. It's just that to me 5MB is quite a lot and should be shrinkable. I keep wondering if the compiler/linker tech can help Rust there.
It’s all good! I don’t think you were being dismissive at all. I don’t even contribute to the Rust project anymore, and in fact wish they’d prioritize various toolchain improvements. Just in this specific case I don’t really think there’s anything to be done. All the standard stuff is already in there. But maybe I’m wrong!
Is that 5MB with or without symbols? Note that on Linux, binaries aren't stripped by default.
Having said that, if your binary ended up including a hyper-based HTTP server and a TLS implementation, and was built with the default release profile (optimizing for speed rather than size), I can believe it would reach 5MB stripped.
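The easy knobs all live in the release profile. Roughly the standard set (sketched from memory; these are the commonly recommended settings, adjust to taste):

```toml
[profile.release]
strip = true      # strip symbols from the binary (stable since Rust 1.59)
lto = true        # link-time optimization across all crates
opt-level = "z"   # optimize for size instead of speed
panic = "abort"   # abort on panic, dropping the unwinding machinery
```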
Take out that last line if you need to recover from panics at runtime. So far, in my limited use of Rust, I haven't had to. Anyway, those are all the easy tweaks to reduce binary size.
Had the first two already. I always prefer to optimize for speed rather than size, but just for the heck of it I enabled it. Wasn't super sure about the panic thing so I never used it, but your comment is reassuring, so I will just leave it in.
From 5.0M to 3.0M, not bad!
Turned the optimization for speed back on -- 4.4M! Wow.
This doesn't sound too bad, and functions that are not changing could, in theory, also be unified, although a single change in one of the transitively called functions might force you to keep multiple versions.
I'm not really familiar with packaging problems and obviously quadratic is worse than linear, but is having no duplicates a real alternative?
(Disclaimer: I'm mostly guessing here, but am I missing something important?)
AFAIU the choice here also considers how easily things will build and link, and whether your package manager/build tool will need a SAT solver to figure out which version of each library to use -- and even then you can still run into unsatisfiable constraints (`Could not resolve dependencies`) if libraries are not adequately updated/maintained.
It seems that by allowing duplication you "only" pay for the libraries that are not updated (including all their deps), which means that you are trading computer resources (disk, CPU?) for human time spent updating and debugging. That might be a really good deal, especially if you only end up with duplicates in the cases where you lack the human time to keep all deps updated.
One could argue that the time cost of maintaining the libraries can only be deferred, so there's no benefit in deferring it; but the time that people using the libraries save, because they don't need the libraries to get updated as long as they have enough disk, is probably what made Rust and NPM just duplicate dependencies.