I'm afraid it can get worse. What happens when AI fuels a proliferation of legitimate-looking npm packages stuffed with ransomware? I can't really figure out a one-size-fits-all solution to that. Any ideas?
One idea that's gaining (marginal) traction in Rust (which really sits in the same boat here) is trusted reviews, where trust is established by a web of trust. You probably have some developers you trust, and they have a different set of people they trust, so you can establish transitive trust (which decays as the chain gets longer).
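To make "transitive trust that decays with chain length" concrete, here's a minimal sketch. The graph shape, function names, and the 0.5 decay factor are all my own illustrative assumptions, not any real tool's API:

```typescript
// Hypothetical sketch: transitive trust that decays per hop in a web of trust.
type TrustGraph = Map<string, string[]>; // reviewer -> people they trust directly

function trustScore(graph: TrustGraph, me: string, target: string, decay = 0.5): number {
  // Breadth-first search from `me`; each extra hop multiplies trust by `decay`.
  const seen = new Set([me]);
  let frontier = [me];
  let score = 1;
  while (frontier.length > 0) {
    if (frontier.includes(target)) return score;
    score *= decay;
    const next: string[] = [];
    for (const person of frontier) {
      for (const trusted of graph.get(person) ?? []) {
        if (!seen.has(trusted)) {
          seen.add(trusted);
          next.push(trusted);
        }
      }
    }
    frontier = next;
  }
  return 0; // no chain of trust at all
}
```

So if I trust Bob directly (score 0.5) and Bob trusts Carol (score 0.25), a review three strangers away is worth very little, which matches the intuition above.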
Indeed, blockchain makes trusting people much harder. The sense of "trust" hijacked by the crypto hype is a narrow technical sense from distributed databases.
Rather, an immutable ledger is a terrible system for trusting /people/: if the data put into the system isn't reliable, there's no way to change it.
You then need to build an actual layer of trust on top of your untrustworthy blockchain, and then you end up spending 1 MWh and $100 per review to recreate Rotten Tomatoes.
What advantages specifically would a blockchain have? Where does the existing solution, of using a fast database and trusting someone's private key, fall short?
Trust is great; but even trust can be broken, either deliberately or accidentally, over time. There's a well-known example of an NPM package that was taken over by a hacker, leaving the thousands (or millions) of dependent packages and apps totally vulnerable.
Check out https://socket.dev for a better NPM solution (not affiliated w/ them at all), though AI's definitely going to accentuate this problem 1000x.
This sounds good. Seems like the easiest way to start is to use the package.json-defined dependencies to create the web/tree. If the developer of package A uses package B, they trust the developer of package B, and so on.
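That bootstrap step is easy to sketch: every dependency declaration becomes an implicit trust edge. The `packages` input shape here is an assumption (package name mapped to its parsed package.json), and `trustEdges` is a name I made up:

```typescript
// Illustrative sketch: derive implicit trust edges from parsed package.json data.
interface PackageJson {
  name: string;
  dependencies?: Record<string, string>; // dep name -> version range
}

function trustEdges(packages: PackageJson[]): Array<[string, string]> {
  // "A depends on B" is read as "A's developer trusts B's developer".
  const edges: Array<[string, string]> = [];
  for (const pkg of packages) {
    for (const dep of Object.keys(pkg.dependencies ?? {})) {
      edges.push([pkg.name, dep]);
    }
  }
  return edges;
}
```

The resulting edge list is exactly the kind of graph the transitive-trust idea upthread would walk over.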
I would love to see this get bigger, not just for package managers but in general. With AI it will be easier than ever to produce spam or just poor content. We need a better way to rank and accept content, and short of large tech companies hiring armies of reviewers, I would think a web of trust can solve it.
I don't think that requires blockchain per se, or even human verification. It would work quite well for me to simply assign my trust to various identities (GitHub accounts, LinkedIn accounts, etc.) and have that trust used when ranking or filtering content.
I don't entirely get this. By adding a dependency to a project, doesn't that already establish a web of trust? I.e. if you trust the dev who made library X, you trust they have good reason to trust library Y that X depends on, etc.
Is this just about being more explicit about review?
It is very hard to turn a black box function into something that can be used reliably. Network and filesystem permissions are baby steps that only prevent genuine developer mistakes, not malicious attacks.
The PDF converter library you're using might not need filesystem or network access, but it can still detect specific text in links and replace the URL with a phishing site. There are no technical shortcuts to trust.
You can sandbox all you want, use three layers of VMs and what not, but if you're allowing me to produce bytes for you and then expect to use them elsewhere in any nontrivial way, I've already won.
I work on a package manager and there are two main philosophies here.
1. Trust but verify - Assumes that some packages are inherently trustworthy and can be relied upon. This is where we are today.
2. Zero trust - Assumes you should not automatically trust anything, even if it appears to come from a trusted source. This is where it seems we're headed.
For OSS/central registries, #1 is followed. For internal registries, #2 is followed.
At the very least, the industry is heading toward constant "verification" gates following the #2 model. Think of the following:
1. Code signing
2. Reproducibility / Integrity
3. Verified sources
4. Least privilege
5. Monitoring tooling
6. 2FA
7. Vulnerability scanning
8. Allowlisting
etc
But are all those even practical for maintaining the ethos of open source? We'll find out.
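Gate #2 (reproducibility/integrity) is the easiest to make concrete. Here's a hedged sketch of verifying a downloaded tarball against a lockfile hash; the function names are mine, but the "sha512-&lt;base64&gt;" subresource-integrity format is the one npm's lockfile `integrity` field actually uses:

```typescript
// Sketch: refuse to install a tarball whose bytes don't match the lockfile.
import { createHash } from "node:crypto";

function integrityOf(tarball: Buffer): string {
  // Produce an SRI-style digest, e.g. "sha512-AbC...".
  return "sha512-" + createHash("sha512").update(tarball).digest("base64");
}

function verifyIntegrity(tarball: Buffer, lockfileIntegrity: string): boolean {
  return integrityOf(tarball) === lockfileIntegrity;
}
```

This only proves the bytes haven't changed since the lockfile was written; it says nothing about whether those bytes were trustworthy in the first place, which is why it's one gate among many.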
Is it common to randomly browse npm for packages to use? Sure, AI can create a copy of an existing package with malware in it, but so can anyone else. It is harder to fake years of posts and community activity around a package that anyone would actually use.
npmjs could do a lot to fight spam by collecting information about all HTTP requests sent by logged-in users (though GDPR may impose some limitations). In many cases, one known spam package (e.g. reported by users) would then be enough to uncover all or most submissions from the same threat actor, via a query against an analytics DB with the right parameters. But most abused services AFAIK don't proactively fight spammers. AI will definitely make this harder, but one can start with the low-hanging fruit: most spammers are not that sophisticated.
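The clustering step described here is simple in principle: pivot from one known spam account to everyone sharing a request fingerprint with it. This sketch assumes a hypothetical log schema (the field names are mine, and a real system would use richer fingerprints than IP + user agent):

```typescript
// Hypothetical sketch: find accounts sharing a request fingerprint with a
// known spammer, standing in for the analytics-DB query described above.
interface RequestLog {
  account: string;
  ip: string;
  userAgent: string;
}

function relatedAccounts(logs: RequestLog[], knownSpammer: string): Set<string> {
  // Collect fingerprints seen on the spammer's requests...
  const spamFingerprints = new Set(
    logs
      .filter((l) => l.account === knownSpammer)
      .map((l) => `${l.ip}|${l.userAgent}`),
  );
  // ...then flag every other account that reused one of them.
  const related = new Set<string>();
  for (const l of logs) {
    if (l.account !== knownSpammer && spamFingerprints.has(`${l.ip}|${l.userAgent}`)) {
      related.add(l.account);
    }
  }
  return related;
}
```

In production this would be a GROUP BY over the analytics DB rather than an in-memory scan, but the pivot logic is the same.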