What's different between the NPM ecosystem, and, say, java/maven? What is the la...

tirpen · on April 11, 2022

> What is the latter doing that the former isn't?

Java has a standard library. A really big and well written standard library.

npm has a package called "is-number".

michaelt · on April 11, 2022

The short answer is nothing - it would be very easy to introduce a malicious Maven package.

The slightly longer answer is maven artefacts are immutable; the default behaviour is to pin precise versions; and norms among java programmers don't favour using libraries for one-liners like left-pad - meaning there are fewer people in a position to launch a supply chain attack.
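As a rough illustration of why immutability and pinning work together, here is a minimal Python sketch. `ImmutableRegistry` and `example-lib` are hypothetical names made up for this example; it is a toy model of the idea, not how Maven Central is implemented.

```python
import hashlib


class ImmutableRegistry:
    """Toy registry that refuses to overwrite an already published version."""

    def __init__(self):
        self._artifacts = {}

    def publish(self, name, version, content):
        key = (name, version)
        if key in self._artifacts:
            raise ValueError(f"{name}@{version} is already published")
        self._artifacts[key] = content

    def fetch(self, name, version):
        return self._artifacts[(name, version)]


registry = ImmutableRegistry()
registry.publish("example-lib", "1.0.0", b"original bytes")

# A build that pins example-lib@1.0.0 can record a checksum once...
pinned_digest = hashlib.sha256(registry.fetch("example-lib", "1.0.0")).hexdigest()

# ...and a later attempt to swap the contents of that version is rejected.
try:
    registry.publish("example-lib", "1.0.0", b"malicious bytes")
except ValueError as err:
    print(err)
```

In this model an attacker has to publish a new version, and a build pinned to 1.0.0 will not pick that up on its own.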

The dam still has cracks in it, but there are fewer cracks and some have sticking plasters over them.

bogwog · on April 11, 2022

> and norms among java programmers don't favour using libraries for one-liners like left-pad - meaning there are fewer people in a position to launch a supply chain attack.

JavaScript is the only ecosystem where I've seen people doing stuff like that. I know this is going to sound elitist, but maybe the problem is that the bar for learning JavaScript is low, and the incentives for a JavaScript developer to improve themselves are also low. You can get away with lazy and bad practices for your entire career, even as a senior full-stack developer. TypeScript raises that bar a little bit, but not by much.

I don't think there's a solution to that particular problem, short of deprecating javascript once WASM reaches a point where it can fully replace it. But even in that scenario, we'll probably start to see JS interpreters ported to WASM anyways.

Maybe those "No Code" products are the solution? Replace all of those JS web/app developer positions with people trained on specific No Code platforms that require basically the same amount of programming knowledge, but outsource things like security and architectural decisions to the platform.

shadowgovt · on April 11, 2022

The bar to learning is low, the payout in the industry is high (everyone wants a web site, web service, or web app), and (key in this problem-space) the JavaScript standard library is basically a tiny raisin of functionality.

It's not so much "incentive to improve self is low" as "it doesn't make sense to rewrite something that exists," and since JS developers, to stereotype, tend to be extremely online, they will tend to solve problems by asking "Is this written yet?" instead of writing Yet Another YAML Parser.

ozim · on April 12, 2022

This is really not a charitable interpretation.

Let's take a more charitable one, where we know that most dev newcomers flock to JS and expect JS to have things like left-pad.

While being a newcomer is not a bad thing in itself, unfortunately there are a bunch of expectations and habits learned in the JS ecosystem that are not aging well.

I don't think 'raising the bar' is the answer as much as going "No Code/Low Code" is. People running companies that depend on newcomers will get what they pay for.

Companies that hire people with experience won't notice a thing.

Since left-pad, a lot of people in the trenches of JS have been learning that adding any dependency might cost a lot, and we now see more and more of that.

dmatech · on April 11, 2022

I don't remember ever hearing about this sort of behavior in the CPAN scene. There's something different at a cultural level with Node. With Node, there have been several instances of otherwise competent coders destroying their own work to make a statement.

bilbo0s · on April 11, 2022

> it would be very easy to introduce a malicious Maven package

From the perspective of a cybersecurity researcher, this is just not true. At a minimum, it's not true in the same way.

Node executes arbitrary code on install. The best an attacker can do in Java is execute arbitrary code at runtime, and even then, only insofar as the developer has directed the securitymanager to execute arbitrary code.

This is a massive difference that I believe people need to be at once more aware of, and more wary about. Don't believe the hype. Don't let people on the internet telling you there is no difference lull you into a false sense of security.

If you work on a machine, or on a project where security is important, check your dependencies people.

michaelt · on April 11, 2022

I'd wager 99% of Maven projects run unit tests on build, so I'm not sure the distinction between install-time and run-time is all that meaningful.

And the Security Manager might have been relevant back in the days of Java Applets and Web Start, but I've never seen it used outside of the OpenJDK test suite - and certainly not for protection against malicious code.

paulmd · on April 11, 2022

> I'd wager 99% of Maven projects run unit tests on build, so I'm not sure the distinction between install-time and run-time is all that meaningful.

Most Java projects don't build their dependencies from source though (unless it's a local project included via gradle/maven). So yes, unit tests run when dependencies are built, but nobody is building dependencies when their web app gets built.

michaelt · on April 11, 2022

But if a library is among your dependencies, I'd wager you're going to call some of its functions.

So you run a maven build, maven retrieves the library, maven runs your tests, your tests call functions from the library - and the library code you've just downloaded gets run.

feross · on April 11, 2022

(Reposting a comment I posted a few days ago.)

There's a few reasons that NPM sees more attacks than other ecosystems.

First, the scale of the JavaScript ecosystem. JavaScript is so much larger than every other ecosystem, so even a very small probability event (somebody introducing malware into a package) can happen surprisingly often given the scale of the ecosystem. Supply chain attacks are a problem in all open source ecosystems – not just JS – but they are a bit rarer elsewhere and don't affect as many people, so fewer people take note.

Second, npm was one of the first package managers to solve the classic "dependency hell" problem. In Python, if you have two dependencies, A and B, which depend on different versions of C, say C@1.0.0 and C@2.0.0 respectively, then you're in trouble. You have a broken project. Python can only install one version of C. So now you're in dependency hell.

Npm on the other hand just installs both versions of C and it gives A the version that it wants, C@1.0.0. And it gives B the version that it wants, C@2.0.0. Both packages are happy - problem solved.
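To make the difference concrete, here is a minimal Python sketch of the two strategies described above. The function names `resolve_flat` and `resolve_nested` are hypothetical, and this is a toy model under simplified assumptions (exact versions only), not how pip or npm actually implement resolution.

```python
# Toy model: each package lists the exact version of each dependency it wants.
REQUIREMENTS = {
    "A": {"C": "1.0.0"},
    "B": {"C": "2.0.0"},
}


def resolve_flat(requirements):
    """One shared copy per dependency name (the Python-style situation above)."""
    installed = {}
    for package, deps in requirements.items():
        for name, version in deps.items():
            if name in installed and installed[name] != version:
                raise RuntimeError(
                    f"conflict: {name}@{installed[name]} vs {name}@{version}"
                )
            installed[name] = version
    return installed


def resolve_nested(requirements):
    """Each package gets its own private copy (the npm-style situation above)."""
    return {package: dict(deps) for package, deps in requirements.items()}


try:
    resolve_flat(REQUIREMENTS)
except RuntimeError as err:
    print(err)  # the flat strategy cannot satisfy both A and B

print(resolve_nested(REQUIREMENTS))  # A and B each keep the C they asked for
```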

This caused Python maintainers to think twice before adding a new dependency lest they cause "dependency hell" for their users. Much better to just copy paste these 50 lines of code rather than adding a dependency. So there was an intrinsic sort of resistance – some pain is involved in adding new dependencies.

Npm maintainers had no such constraints. In a way, npm’s better developer experience led to the whole module ecosystem scaling "too well". Thus, you end up needing to trust more total maintainers, increasing the risk of supply chain attacks.

Disclosure: I started Socket (https://socket.dev) to help solve open source supply chain security. To learn more, see: https://news.ycombinator.com/item?id=30521913

matsemann · on April 11, 2022

- Bigger packages in Java, developed by big organizations one can trust. They seldom depend on small ad-hoc packages.

- Installing a dependency with Maven is downloading a jar file. In npm you often need to run arbitrary code as part of installation, making the attack surface far greater.

- Java projects usually use a specific version of each dependency, so there are no updates unless explicitly wanted.

- In theory, a SecurityManager can also be used to give code from different libraries different permissions. I've not seen it used much in practice, though, only for plugin systems.

bilbo0s · on April 11, 2022

> In npm you often need to run arbitrary code as part of installation, making the attack surface far greater.

This really is the key. It makes it so you can't even really compare JS to Java in an intellectually honest way. It's just not even close to the same. One downloads a jar that will only ever execute at runtime, whereas Node downloads arbitrary code that will execute on your machine immediately if the attacker so chooses. Not only that, but a user may legitimately not even know that Node downloaded that module. The dependency tree is so ridiculous that the user would have to go through it with a fine-toothed comb to spot the unimaginably big security hole.

On the one hand, yes, users should have looked through their dependency tree, familiarized themselves with what was in those dependencies code-wise, and known what they were doing. On the other, come on man. That's kind of like these 100-page EULAs that take away all your rights. I'm not sure that it's reasonable to expect everyone to read those as carefully as you'd need to read them to avoid the problem.

ryukafalz · on April 11, 2022

The difference between code running at install-time and at runtime is not that big, all things considered. How often do you install a dependency without intending to run it almost immediately afterward?

matsemann · on April 11, 2022

It's an order of magnitude difference. Code running at install time is almost 100% guaranteed to run. So a transitive dependency 10 layers down is just as dangerous as any other.

Runtime, however, you need some code path to actually hit some part of that library/import it to be affected.
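A toy Python model of that argument, assuming a made-up dependency graph: every declared package gets installed (and so could run an install script), while only the packages on exercised code paths get a chance to run at runtime. The package names and the `EXERCISED` edges are hypothetical.

```python
# Hypothetical dependency graph: package -> packages it declares as dependencies.
DECLARED = {
    "app": ["web-framework", "logger"],
    "web-framework": ["parser", "color-helper"],
    "logger": [],
    "parser": [],
    "color-helper": ["tiny-util"],
    "tiny-util": [],
}

# Hypothetical subset of edges that the running program actually exercises.
EXERCISED = {
    "app": ["web-framework"],
    "web-framework": ["parser"],
}


def reachable(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    stack = [root]
    while stack:
        current = stack.pop()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


install_time = reachable(DECLARED, "app")  # every installed package could run a script
run_time = reachable(EXERCISED, "app")     # only packages on code paths actually hit

print(sorted(install_time))
print(sorted(run_time))
```

In this made-up graph, `tiny-util` sits three layers down and is never exercised, yet it is still in the install-time set.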

bilbo0s · on April 11, 2022

> The difference between code running at install-time and at runtime is not that big

It is in java, and rust, and other languages that have securitymanagers or make security guarantees. Node is running code not only in a context that the dev never intended code to be run in, but also a context the dev has no control over. In rust or java, (or a lot of languages actually), code only runs in a context controlled by the dev.

I mean, in the worst case, with node, you may not even get the opportunity to run the app you were trying to install. The module may just own you at the outset. The dev would be powerless to stop any malicious behavior in the library.

From a security perspective, these are huge differences.

rootlocus · on April 11, 2022

Most packages you'd pull from maven are developed by large companies or foundations like apache. The dependencies you're pulling are simple jar files that get loaded at runtime and don't execute anything at build or install time.

ecmascript · on April 11, 2022

I have forgotten how Maven works, but at least in npm you have the post-install script, which lets a package run anything after the dependency is downloaded. That means package creators can run any code they want on your machine.
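For anyone who wants to see which dependencies declare such hooks, here is a minimal Python sketch that reads a `package.json` document and lists its install-time lifecycle scripts (`preinstall`, `install`, `postinstall`). The manifest below and the helper name `install_scripts` are made up for illustration.

```python
import json

# npm runs these lifecycle scripts, when present, as part of `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_scripts(manifest_text):
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(manifest_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest, made up for illustration.
EXAMPLE = """
{
  "name": "some-package",
  "version": "1.2.3",
  "scripts": {
    "test": "node test.js",
    "postinstall": "node setup.js"
  }
}
"""

print(install_scripts(EXAMPLE))
```

Running `npm install --ignore-scripts` skips these hooks entirely, at the cost of breaking packages that rely on them for a legitimate build step.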

I don't really remember if Maven has something similar, since it's been years since I did anything in the JVM ecosystem, but I think some package managers (like Composer, if I remember correctly) don't give this opportunity.

But since node doesn't have a large standard library it means people will reach out for third party packages for stuff that is small tasks in most languages / runtimes.

capableweb · on April 11, 2022

Having a post-install step available for malicious use doesn't really change anything. As a programmer downloading a dependency, you're allowing the dependency full control of your system at runtime, at the very least (in most languages; Deno seems to try to address this), so it could do whatever it wants as soon as you include the dependency in your application and run it once.

ecmascript · on April 11, 2022

True true, but at least you will have to check the API and start using the library itself. The more you have to look at it, the better the chances that you will notice something weird about it.

moffkalast · on April 11, 2022

Throw apt/snap/pacman/whatever into the mix and the answer is still "nothing". People act like package managers are somehow the end-all, but they're no more secure than going to a random official site and downloading some shit; they've just streamlined the process somewhat.

In fact the latter is probably more secure, since the more packages depend on each other, the worse it gets. One random dependency can be hijacked and will be autoinstalled everywhere. Or someone can delete it and break half the internet, as we've seen time and time again.

capableweb · on April 11, 2022

You're kind of right, but there is a difference between the default repositories used by apt (Debian & Ubuntu) and pacman (Arch) and things like npm: packages in the default repositories are reviewed by others, and because of the organizations behind them you have some guarantee that they won't disappear overnight. With npm, anyone can publish/unpublish without any sort of review.

imtringued · on April 11, 2022

The difference is that non-malicious NPM package authors are effectively hitting you with saturation attacks (throwing such a huge mass of packages at you that you cannot possibly check all of them), so malware can slip through more easily.

lucideer · on April 11, 2022

Currently working on this general problem for a large corp: java & js are our two main languages (alongside a lot of python & go, small amounts of swift, groovy, kotlin & c, and some very very old php). Trust me when I say Maven/Gradle etc. are orders of magnitude more painful to solve for than others.

Nothing at all unique to NPM about supply chain risk.

Fwiw, I find Composer to be one of the better of the lot.

shadowgovt · on April 11, 2022

How tractable is it to proxy the npm package sources?

Were I to try and solve this as an enterprise project, that's the first thing I'd try: have a team declare the specific subset of packages we have hand-vetted and host them off a corporate package manager. Our software builds from only those packages; if devs need more, they petition to get them vetted. If they need new versions, they petition to get them updated. Our team keeps an eye out for hotfixes and periodically might mandate an upgrade if a vulnerability comes around.
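The gate I have in mind could be as simple as the following Python sketch, which compares a project's pinned dependencies against the vetted allowlist. The allowlist contents, the version numbers, and the helper name `unvetted` are all hypothetical.

```python
# Hypothetical allowlist maintained by the vetting team: name -> approved versions.
VETTED = {
    "left-pad": {"1.3.0"},
    "express": {"4.17.3", "4.18.0"},
}


def unvetted(requested, vetted):
    """Return the (name, version) pairs that are not on the allowlist."""
    return sorted(
        (name, version)
        for name, version in requested.items()
        if version not in vetted.get(name, set())
    )


# Hypothetical pinned dependencies of one project.
REQUESTED = {"left-pad": "1.3.0", "express": "4.17.1", "is-number": "7.0.0"}

# Anything printed here would need a petition before the build is allowed.
print(unvetted(REQUESTED, VETTED))
```

A real setup would enforce this in the corporate registry proxy rather than in each build, but the check itself is the same lookup.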

lucideer · on April 11, 2022

We do, but auditing the mirrored sources is still not straightforward. It's a trade-off between completely blocking all production builds on CVEs and fully transparent mirroring, one we're still trying to balance. It's also expensive - proprietary SaaS offerings in this space are not particularly competitive, and managing it in-house is intensive.

In terms of mandated hand-vetted packages, unless your hand-vetting team is inhumanly well-resourced, you're looking at a stifling corporate environment for engineers there, and/or a lot of attrition. Again, needs to be a balance between central control and autonomy.

Our current approach is simply auditing packages we know are deployed in production and assigning tickets to update/remove within a fixed time period, rather than blocking deployments completely. Probably looking to block select deployments based on criticality in the near future, but again distinguishing between a theoretical exploit and one our code triggers in practice is still pretty difficult to automate without significant noise. And the other issue here is differentiating newly discovered vulns (already in prod - blocking deployment doesn't help) vs newly introduced vulns (not yet deployed).
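The last distinction is the easiest part to automate. A minimal Python sketch, assuming each build can be reduced to a set of `name@version` strings; the package names and the `triage` helper are hypothetical, and this says nothing about whether our code actually triggers the vulnerability.

```python
def triage(candidate, production, vulnerable):
    """Split vulnerable packages in a candidate build by whether prod already has them."""
    flagged = candidate & vulnerable
    return {
        "newly_introduced": sorted(flagged - production),  # blocking the deploy helps
        "already_in_prod": sorted(flagged & production),   # needs a ticket instead
    }


# Hypothetical inputs, each a set of "name@version" strings.
PRODUCTION = {"lib-a@1.0.0", "lib-b@2.1.0"}
CANDIDATE = {"lib-a@1.0.0", "lib-b@2.1.0", "lib-c@0.3.0"}
VULNERABLE = {"lib-b@2.1.0", "lib-c@0.3.0", "lib-z@9.9.9"}

print(triage(CANDIDATE, PRODUCTION, VULNERABLE))
```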