> someone published their own code instead of code from the github repo
Sorry for my ignorance, I have not been a real dev for over 10 years, but wouldn't a better way for NPM to take new packages be for NPM to own the build process? As in, the owner of a package would tell NPM: please build and publish from this previously registered repo. Then NPM would have its own Jenkins servers actually run the build?
Based on my very limited understanding, the main problem I see with NPM is that I have no idea whether the code in the repo is the same code that was used to build the package. This is what happened in this case, correct? Wouldn't NPM owning the build process solve this problem?
I realize this would be a huge load for NPM, but NPM has the world's security riding on its shoulders.
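To make the question concrete: the check a consumer would want is that every file in the published package matches the file at the same path in the repository at the tagged commit. Below is a minimal sketch in Python, assuming both trees have already been loaded into memory as dictionaries from path to bytes. The function names and the sample files are hypothetical, and a real npm package often contains generated files that would not match the repo byte for byte.

```python
import hashlib


def manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the SHA-256 hex digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_manifests(repo: dict[str, bytes], package: dict[str, bytes]) -> dict[str, list[str]]:
    """Report package files that are absent from, or differ from, the repo."""
    repo_m, pkg_m = manifest(repo), manifest(package)
    return {
        "only_in_package": sorted(p for p in pkg_m if p not in repo_m),
        "changed": sorted(p for p in pkg_m if p in repo_m and pkg_m[p] != repo_m[p]),
    }


# Hypothetical example data: the published package carries a modified file
# and an extra file that the repository never contained.
repo_files = {"index.js": b"module.exports = 1;\n", "README.md": b"docs\n"}
package_files = {
    "index.js": b"module.exports = 1; // tampered\n",
    "postinstall.js": b"// not in the repo\n",
}

report = diff_manifests(repo_files, package_files)
print(report)
```

A check like this only detects a mismatch. It says nothing about which side is the trustworthy one, which is why the question of who runs the build matters.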
But how would you know that the sources on NPM are the same as the ones you have? Unless you require all NPM packages to be hosted on GitHub, but that does not seem like a good idea to me.
I'm more familiar with the Python/PyPI world, where the "morons" also don't "just" do all the "simple" and "sane" things random internet commenters "helpfully" suggest.
But if you want to spin up a package repo that can, say, build the numpy/scipy packages from source and doesn't end up compromised or overloaded as a result, be my guest. Until then, though, maybe hold off with the "morons" commentary and the attacks on people who are doing their best to solve genuinely hard problems at zero charge to the community?
> people who are doing their best to solve genuinely hard problems at zero charge to the community
NPM is not doing their best. They actively reject ideas like package signing. They allow running arbitrary unsandboxed code when installing a package. Please, do tell me how they are doing their best. Even JS people think it's bad (which is why Deno exists).
Why do no other package managers have this problem? Why are there no incidents where installing an apt-get package stole credentials, even though apt is way older?
When I was Django's release manager, I started our tradition of ensuring that every single package we put on PyPI was also accompanied by us publishing signed checksums of the file.
So. Django signs its packages. Now, what good does that do you? How do you know my key (or, these days, Tim's or Carlton's keys, since they roll the releases) is authorized to release Django?
"Just support package signing" is one of those things that sounds super easy. And in fact, PyPI technically supports it -- you can upload a detached signature along with your package!
But you don't "just" support signing. Signatures, absent a gigantic infrastructure of key management, indicating whose keys are trusted for what purposes and by whom, are basically useless. So when someone says "just support package signing", they don't really mean "just let us upload signatures!" What they really mean is "develop and maintain that web-of-trust infrastructure for me", but they don't like to acknowledge that's what the request really is.
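To illustrate the gap: even after a signature verifies cryptographically, something still has to answer whether that key is allowed to release that package. Here is a rough sketch of just that policy lookup in Python. The `TRUSTED_RELEASERS` table, the fingerprints, and the function name are all made up for illustration, and the hard part (who maintains this table, and how keys get rotated or revoked) is exactly what the code leaves out.

```python
# Hypothetical policy table: package name -> key fingerprints allowed to release it.
TRUSTED_RELEASERS: dict[str, set[str]] = {
    "django": {"FINGERPRINT-A", "FINGERPRINT-B"},
}


def is_authorized(package: str, signer_fingerprint: str) -> bool:
    """Return True only if this key is listed as a releaser for this package.

    Assumes the signature itself has already been verified elsewhere;
    this answers only "is this signer trusted for this package?".
    """
    return signer_fingerprint in TRUSTED_RELEASERS.get(package, set())


print(is_authorized("django", "FINGERPRINT-A"))
print(is_authorized("django", "FINGERPRINT-Z"))
```

A valid signature from an unlisted key fails the second check, and a package with no entry in the table has no trusted releasers at all.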
> Why are there no incidents where installing an apt-get package stole credentials
Because Debian grants package-releasing privileges only to a tiny group of people who are vetted before they get to release. Systems like npm and PyPI, by design, let anyone who wants to sign up and start publishing packages. That's a deliberate tradeoff, and one that comes with both risk (you'll get some bad actors) and reward (you'll get a larger and richer ecosystem of things being published).
I eagerly await your next set of soundbites that have come up, and been rebutted, in every single discussion of npm and PyPI that's come up on HN in the past five years.
You make good points about package signing. It's not a trivial problem. Fortunately it's a solved one. There are package managers (pacman, nuget, rpm, etc) that do this. Yes, maintaining a web of trust is required. Nobody said otherwise. You don't need to put words in the mouths of people who want NPM to be a bit more secure. Point is, it's probably worth the hassle for a fairly critical piece of infrastructure.
At the very least they could just do what Ruby gems do and allow packages to be signed but leave who to trust up to the user. Frankly, it wouldn't be that hard for ESLint to publish a key on their site and for users to run a command like `npm trust /path/to/eslint.pem`. I don't generally think security should be opt-in, but it's still better than having no option at all, as is the case with NPM today.
Also, you didn't touch on the fact that NPM allows executing unsandboxed code on package install. I'm actually curious if you think there's a decent reason for this. It seems like a _really_ serious issue for questionable benefit. As far as I can tell, PyPI doesn't allow this.
> I eagerly await your next set of soundbites that have come up, and been rebutted, in every single discussion of npm and PyPI that's come up on HN in the past five years.
I'll ignore this blanket dismissal of my points. I think given NPM's history of issues (including particularly absurd highlights like left-pad and this eslint incident) maybe the NPM community should stop turning a blind eye to this and consider that they could be doing better.
You're absolutely right! People should hold off on unmerited attacks on those working hard to solve genuinely difficult problems.
Yet, is it perhaps possible that tying together build and distribution systems is not an unsolved or unsolvable problem? Bundling build processes with source packages is not a novel notion or even a novel approach. Builds being reproducible is similarly not a novel idea.
Or, to put it another way, is it possible that working hard is not a good explanation for other-than-optimal solutions when other options are known?
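On the reproducible-builds point, the core idea is that the same inputs should always produce byte-identical output, so anyone can rebuild and compare hashes. Here is a small sketch in Python of the packaging step only, assuming the inputs are already in memory. The helper names are hypothetical, and a real build also has to pin compilers, dependencies, and environment, which this does not attempt.

```python
import hashlib
import io
import tarfile


def build_archive(files: dict[str, bytes]) -> bytes:
    """Pack files into an uncompressed tar with normalized metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(files):  # fixed ordering, independent of dict order
            info = tarfile.TarInfo(name=path)
            info.size = len(files[path])
            info.mtime = 0  # no wall-clock timestamps
            info.uid = info.gid = 0  # no builder-specific ownership
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(files[path]))
    return buf.getvalue()


def digest(files: dict[str, bytes]) -> str:
    """SHA-256 hex digest of the normalized archive."""
    return hashlib.sha256(build_archive(files)).hexdigest()


# Same contents supplied in a different order yield the same digest.
first = digest({"b.js": b"2", "a.js": b"1"})
second = digest({"a.js": b"1", "b.js": b"2"})
print(first == second)
```

Normalizing file order, timestamps, and ownership is what keeps two independent builders from producing different bytes for the same source.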
There's a reason I picked numpy/scipy as examples. Among popular Python packages, they're among the genuinely hardest to build from source. You need several non-Python dependencies, including multiple language build toolchains, to get a working build, and need to dive into notes on things like ABI compatibility between different FORTRAN compilers in order to make sure what you're doing will work.
So, setting up something like a PyPI -- again, because that's what I'm familiar with -- that "just" adds the feature of building the packages on machines owned by the package repo is not exactly a simple thing. And PyPI currently hosts over 1M different released packages, so take that number into account when figuring the complexity of all the different things it might have to support.
> Or, to put it another way, is it possible that working hard is not a good explanation for other-than-optimal solutions when other options are known?
Or, to put it another way, is it possible that people who leave drive-by "helpful" "suggestions" in comment threads about package repositories vastly underestimate what they're asking for, and often don't even really understand the problem domain?
> There's a reason I picked numpy/scipy as examples. Among popular Python packages, they're among the genuinely hardest to build from source.
For context, I wrote my comment with awareness of the level of complexity that exists.
> Or, to put it another way, is it possible that people who leave drive-by "helpful" "suggestions" in comment threads about package repositories vastly underestimate what they're asking for, and often don't even really understand the problem domain?
You're right! It's absolutely possible that drive-by snarky suggestions are not helpful in any way! It's also possible that there is a point to be made here, and perhaps a nasty-but-avoidable failure scenario.
There is, after all, a distinction to be made between problems that are complex and problems that are unsolvable. The general-purpose problem of building packages is, as you say, incredibly complex and difficult. It's worth considering that it might not be unsolvable. After all, every single package in PyPI got built somehow.
Which is to say one doesn't set out to bolt a build system onto a package repo system. One bolts a repo system onto a build system, because then it's an (easier) versioning and binary blob distribution problem when you have a reliable chain of trust.
This is certainly true, and the statement that NPM is _run by_ "morons" is, while immature in its phrasing, at least potentially aiming toward a legitimate point.
However, the post in question also labels NPM's _users_ in broad strokes as "morons", which can make no such claim of legitimacy. It's just a puerile, baseless insult that is devoid of content and detrimental to the poster's credibility.
I'm ready to be told why this is dumb now...
edit: grammar