We have pondered capability-based security for Deno in the past. Our conclusion has always been that this is not possible to do securely in JS without freezing all prototypes and objects by default. The reasoning for this is that you need to make sure the capability token does not ever leak. For example, as a malicious user I could override `globalThis.fetch` to exfiltrate the capability token destined for `fetch` and use it myself later.
One could also override `Map.prototype.set` / `Map.prototype.get` to exfiltrate a token every time one is added to or read from a `Map` (people will want to store tokens in a `Map`).
One could also override `Array.prototype[Symbol.iterator]` to exfiltrate tokens stored in arrays if those arrays are destructured, spread, etc.
There are many more cases like this, where one can exfiltrate tokens because of the very dynamic nature of JavaScript.
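To make the shape of the attack concrete, here's a rough sketch; the token format is made up, but the monkey-patching is just ordinary JS:

```js
// Hypothetical capability tokens (the `capabilityToken` property is made up);
// the interception below is plain JavaScript that any dependency could run.
const stolen = [];

// Malicious code that happened to load earlier in the same realm:
const realSet = Map.prototype.set;
Map.prototype.set = function (key, value) {
  if (value && value.capabilityToken) stolen.push(value); // exfiltrate
  return realSet.call(this, key, value);
};

const realFetch = globalThis.fetch;
globalThis.fetch = function (input, init) {
  if (init && init.capabilityToken) stolen.push(init.capabilityToken);
  return realFetch.call(this, input, init);
};

// Innocent code elsewhere now leaks its tokens just by using the built-ins:
const tokens = new Map();
tokens.set("net", { capabilityToken: "cap:net:example.com" });
```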
It is unlikely that freezing all intrinsic prototypes and objects is even enough. People will find ways to exfiltrate tokens.
Yup, SES would address this. But SES also needs to bring with it a paradigm shift for JS:
a) Folks would have to load each bit of code they want separate permissions for in a separate compartment. This won't be easy.
b) Runtimes will need to provide an immutable global realm, which is not the case today.
As I said in a different comment, I think a lot of this can already be addressed by ShadowRealms. Deno will likely allow users to specify per-ShadowRealm permissions, which is probably as granular as most people will want to get.
There's Endo, which does the loading/importing and dependency resolution. It doesn't have a default user experience at this point, but it creates compartments for packages and runs them.
Getting permissions involved will require mapping them to packages in a policy file and there you go - an environment where you can use packages and they can't surprise you with data exfiltration etc.
There's LavaMoat, which enables using SES confinement around normal npm packages by creating a policy file for what can be imported/required by that module (and it can auto-generate a suggested policy file from what appears to be used, which fails toward greater restriction/security and can easily be expanded):
https://github.com/LavaMoat/LavaMoat
Hi there, TC-39 delegate and MetaMask co-founder here.
SES does address this, and strives to achieve "object capability security", wherein access to a function is equivalent to permission to use it.
One difference between an object capability approach and the capability-token approach described in the OP article is that in an ocap approach, you have no need to pass around a capability token just to hand it to the restricted methods: instead, you simply disallow importing modules by default, and pass any restricted methods into the modules that you want to have access to them. I find this approach greatly more ergonomic, and if you ever want to further restrict a function, you don't need a new token, you just write a closure with your own policy defined in it!
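For example, a rough sketch of what I mean (the names here are made up for illustration, not SES or Endo APIs):

```js
// Instead of handing modules a token, hand them (attenuated) functions.
function attenuateFetch(realFetch, allowedOrigin) {
  // The closure is the policy: only requests to one origin are permitted.
  return (url, init) => {
    if (new URL(url).origin !== allowedOrigin) {
      throw new Error(`fetch to ${url} is not permitted`);
    }
    return realFetch(url, init);
  };
}

// A module that needs the network receives only the capability it needs:
function makeLogShipper({ fetch }) {
  return (msg) =>
    fetch("https://logs.example.com/ingest", {
      method: "POST",
      body: JSON.stringify({ msg }),
    });
}

const shipLog = makeLogShipper({
  fetch: attenuateFetch(globalThis.fetch, "https://logs.example.com"),
});
```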
By the way, we've developed a tool called LavaMoat that allows applying SES security to existing npm modules, no token-passing needed, by restricting the environment of each module per a policy file.
https://github.com/LavaMoat/LavaMoat
> It is unlikely that freezing all intrinsic prototypes and objects is even enough. People will find ways to exfiltrate tokens.
This is probably true, but frozen intrinsics would make it a _lot_ harder. Right now it's not reasonable to ask a library to be defensive against capability exfiltration, since it means not using any built-ins, but I think with frozen intrinsics it would be reasonable to treat a library leaking its capabilities as a security bug. There would still absolutely be leaks - most significantly in libraries which export classes and don't freeze the class prototype - but things would no longer be completely insecure by default. It would make malicious code have to work a _lot_ harder.
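For example, a hypothetical library like this would still leak under frozen intrinsics unless it froze its own exports:

```js
// Hypothetical library code. Frozen intrinsics don't cover prototypes the
// library itself defines, so this class is still patchable by anything else
// loaded in the same realm.
export class DbClient {
  constructor(connectionString) {
    this.connectionString = connectionString; // the capability
  }
  query(sql) {
    /* ...connect using this.connectionString, run sql... */
  }
}

// Malicious code elsewhere in the dependency tree could do:
//   import { DbClient } from "some-db-lib";
//   const realQuery = DbClient.prototype.query;
//   DbClient.prototype.query = function (sql) {
//     exfiltrate(this.connectionString);
//     return realQuery.call(this, sql);
//   };

// Treating the leak as a security bug means the library has to do this itself:
Object.freeze(DbClient);
Object.freeze(DbClient.prototype);
```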
I think it's worth a shot. Deno removed the __proto__ getter/setter, and that did require a bunch of libraries to update, but it worked out OK.
Node already has --frozen-intrinsics, if anyone feels like experimenting with whether that would break your code.
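A quick way to try it; the flag is experimental, so exact behavior may vary between Node versions:

```js
// check-frozen.js, run with: node --frozen-intrinsics check-frozen.js
"use strict";

try {
  // Polyfill-style mutation of a built-in prototype is the most common
  // thing that breaks under frozen intrinsics:
  Array.prototype.last = function () {
    return this[this.length - 1];
  };
  console.log("intrinsics are mutable:", [1, 2, 3].last());
} catch (err) {
  // With the flag, this assignment throws a TypeError in strict mode.
  console.log("mutation blocked:", err.message);
}
```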
Yeah, I agree it is definitely worth trying! I think all the talk around SES will push JS as a whole further towards something that could support capability based permissions securely in the future.
These seem like features that could work a bit better in the context of WASM modules and/or components. AIUI, WASM was designed with the expectation that support for capabilities would be required.
Your input is appreciated. But to anyone not using JS on a daily basis, the above reads as a blur of detail, rather than a clear, actionable message about what to do about the overall situation.
So I'd be curious to know if you have a holistic response to the primary claim made by the post, which is:
The fundamental problem with npm is that any package you install has full access to do whatever it wants on your computer.
> The fundamental problem with npm is that any package you install has full access to do whatever it wants on your computer.
Let me try to answer more concisely at a higher level:
Due to JS being a very dynamic language, it is at this time not possible to give different packages inside of the same JS runtime a different set of permissions or capabilities securely.
Because of this, the best we can do for sandboxing right now is permissions / capabilities that are set per JS runtime, rather than per dependency / module.
This is what Deno does. It allows you to set capabilities for your code and all of its dependencies at once. If you want to run a certain module with lesser permissions, you need to move it into a separate JS runtime, either by running it on a separate thread with a web worker, or in a separate Deno process. Both web workers and subprocesses can have different (lesser) permissions than their parent.
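For example, something like this; the `deno.permissions` worker option is the mechanism, though its exact shape has changed across Deno versions, so treat it as a sketch:

```js
// main.js (Deno): giving a worker fewer permissions than its parent.
const worker = new Worker(new URL("./untrusted.js", import.meta.url).href, {
  type: "module",
  deno: {
    permissions: {
      net: false,         // no network access at all
      read: ["./data"],   // read-only access to a single directory
      write: false,
      run: false,
      env: false,
    },
  },
});

worker.postMessage({ task: "parse", file: "./data/input.json" });
```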
it's been fascinating watching the JS people reinvent the concept of an OS from scratch. Today it's the security model, yesterday it's the task scheduling, etc.
Interesting little bit of outsider art, "what if front-end people designed an OS" and all, and it's going about as well as you'd expect with that.
I mostly wouldn't mind, but there's certain thresholds that I think are noxious or dangerous as a user. WebGL strikes me as much more dangerous than most people anticipate, GPU drivers are not really hardened against hostile shaders very well, and it's quite likely that various escapes and data-leaks exist imo. Someone I know worked on implementing that and said that, of all the graphics vendors, none of them liked it, but the blue one's engineers had genuine fear in their eyes when the functionality was laid out to them.
I am also not really looking forward to when javascript realizes the need for persistent services... but thankfully I think web architecture mostly means that's somebody else's computer.
This comment displays a lot of ignorance and prejudice.
"JS people" have not been reinventing much at all. Operating Systems aren't even close to offering what "JS people" needs at the granularity they would benefit from. By that measure, the same complaint can be made about the Java or .NET virtual machine, for example.
If anything, what the "JS people" of today are trying to fix are all those "dangerous parts" that were added with abandon by browser makers and OS people and will probably never be fixed in our lifetime.
> I am also not really looking forward to when javascript realizes the need for persistent services
It might surprise you, but Node.js works in the backend from day zero, and Service Workers have been working in the frontend for several years now. You're more than a decade behind the times.
Exactly, I'm not sure what people are trying to fix here. They want code they import in their projects to be sandboxed so they can import untrusted code. That seems a bit f*cked up to me. Running untrusted apps is already pretty hard; running apps where some code is trusted and some not seems... a bad idea.
IMO since the problem is generic to all languages, we should have a generic solution, using standard sandboxing techniques (containers, VMs, jails, etc)
Makes me think of Plan 9 and its almost religious use of the 9P protocol to control what resources are available to a process. Or perhaps the Erlang VM with communication via messages. Composing systems in this way is very different than adding untrusted code to your application but seems like an interesting alternative approach.
I can't possibly come up with a reason WHY (Maybe Python's standard library is that much better than Node?), but Python dependency trees just tend to be so much simpler. If I "pip install $PACKAGE", it's usually only a couple additional dependencies that get included unless it's one of the huge ones like numpy. If I "npm install $PACKAGE", I can expect 50+ additional dependencies to get added.
For some reason, Node engineers are much more likely to "npm install leftpad" than write the 2 lines of code it takes to implement string padding.
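For reference, the couple of lines in question look something like this:

```js
// Give or take a line, the whole of left-pad:
function leftPad(str, len, ch = " ") {
  str = String(str);
  while (str.length < len) str = ch + str;
  return str;
}
// Or, since ES2017, just: String(str).padStart(len, ch)
```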
The problem in Javascript is localized. Lots of extremely important foundational packages like React, Vue, Typescript, and even ones like Prettier don't really have lots of dependencies.
The biggest culprits of the craziness are things like Babel and Webpack, which require tens or sometimes hundreds of packages for a basic install. Some of those aren't really needed in Python, for example. But in JS there are alternatives to them.
On the other hand, some foundational packages in languages like Python and Ruby have a lot of dependencies. I don't know about Python much, but Rails also has a large dependency graph, despite Ruby having a good standard library.
The JS standard library is very slim. I think this is both its greatest strength and weakness, and to your point, makes it likely that people just install some random package vs. solve the problem themselves. I'm not sure which is better philosophically, but personally, having done a lot of PHP dev in the last decade, I find the JS code I'm writing to be easier to read but much harder to understand and run from a package management perspective. I think the developers working on solutions in JS would probably behave very differently if they were given more built-in tools to work with.
the same is true of java - less so in newer versions, but older versions the stdlib was pretty slim and low-level and there was a massive amount of library code written to wrap around it.
it still doesn't devolve to the javascript phenomenon of one-file (or one-function) libraries and so on. Dozens of libraries, sure, hundreds, maybe, but nobody actually has thousands of dependencies like you do on node.
Apparently "number of repos maintained" is a KPI so there's some gamesmanship there. Maybe it's also being used as a caching thing, smaller files might be more cacheable if you don't minify?
More generally though this may be the result of that "enterprise culture" that is sometimes looked down on in other situations, that at least stuff is getting bundled into appropriate packages for distribution vs just chucking every single file into a node package.
Dynamic languages such as JS allow you to override almost anything. It’s essentially impossible to sandbox “part” of JS code, you must either sandbox the entire runtime (e.g. in browsers) or allow everything (e.g. in node.js).
Individual packages want to run arbitrary JS code to do “safe” things during installation. But it’s very hard to allow them to do anything meaningful like file access, even if it’s “safe”, without potentially allowing exploits.
Not the OP, but the approach sounds roughly similar to SE linux, but for node.
A major problem with a half-secure security solution is that it's not actually secure. You might do everything right, and still get owned. For the threat model the solution needs to be complete (or able to be complete, but turned down to less than complete by the user*).
Part of the way this manifests with SElinux is that to have a fully locked down box with SElinux you have to consider access controls for _everything_ on the box on top of regular unix permissions. And to actually be a full solution, you have to install kernel headers because anything user-space isn't good enough to guarantee full security.
For node to make a similar guarantee locking down everything means turning off a lot of the features that allow javascript to be so dynamic or making major changes to the run-time implementation to support being able to use those features without making the security swiss-cheese.
And then, even though the system is securable, there's the issue of turning it on for all the modules you use. And not just the modules you use, but the modules your modules use and on and on. You could delegate module-to-module permissioning to the module you import, but that means either granting overly broad permissions to the module and hoping they don't screw something up (and that any module they delegate to does the same) or not delegating anything and personally granting explicit permissions to every module, no matter how deep in your hierarchy of requirements. If that sounds exhausting, it kind of is.
In SElinux's case what that leads to instead usually is rearchitecting things such that you can use virtualization to sandbox things and limiting access across the sandboxes (ohai, it's the deno solution). There's still places where a VM or other sandbox isn't appropriate though and if you want to be on linux you have to use SElinux (or a competitor, but i've only touched selinux), and it's not uncommon to have an entire team whose only job is to configure SElinux and support other teams that interact with it (eg, coaching them on how SElinux interacts with their codebase and what they need to tune, auditing teams' SElinux configs, and keeping SElinux working for the base system as that gets upgraded). And if you screw up or get lazy with auditing permissions, you've just limited the effectiveness of SElinux, possibly rendering it useless.
A large part of the reason SElinux is so hard to use and use right (and that directly translates to js and node) is that it's attempting to bolt-on security to an existing system that wasn't designed with (that kind of) security in mind. That's a monumentally hard thing to do in a way that doesn't require rewriting everything that uses it. And not having to rewrite everything is a hard requirement, because if you're going to rewrite everything that uses it, it's usually cheaper and easier to just make something new from scratch (in SElinux's case a new OS, in js's case a new language).
So holistic options:
1) remove all the dynamism that makes javascript javascript. this (potentially) breaks all existing code. Call it rustscript and get that to ship in all the browsers, and get all the websites to use that instead of javascript, and then make a serverside environment for rustscript. Now you can do fine-grained module permissioning.
1.5) remove only the dynamism that breaks this sort of security access control as part of a new ECMAScript spec and add support for the security access control at the same time. This breaks existing code, but the old code can still run in a runtime for the earlier spec. New code can take advantage of the new spec features. Old code can be modified to work with the new features. Broken code can be rewritten to the new spec. This makes rustscript ESNext. It will be up to the various runtimes to support this new spec, so nodeNext will have support for it but it won't get backported. Browsers will require transpilation from ESNext to an earlier ES version as they do now, but eventually even they would drop support for the older js versions.
2) accept that module permissioning systems are easy enough to get around in JS that anything attempting to implement them is at best security theater. The deno solution isn't security theater, but that's because it makes much less stringent guarantees (ie only runtime granularity and not module granularity).
* why allow the user to make themselves insecure? In some cases, the user will choose to be less secure for some external reason or will be using the security solution as a part of a more holistic security solution, so some other part guarantees the security that is given up.
That's a pretty extreme way to say it. Python is in essentially the same boat, as is bash, Ruby, PHP, dart... Calling languages "broken" is to say that one hard problem makes them unsuitable for use, which is hardly true.
I think the main concern here is that it isn't that the language is too permissive -- but that the primary installer (and what many people cite as one of the greatest strengths of JS, largely responsible for ushering in its Golden Age) -- is structurally (and perhaps irreparably) insecure.
Whether this is really so (and in a significant way compared to its other competitors, out in dynamic programming land) -- is what people seem to be trying to suss out in this thread.
Javascript was designed and intended to be used within the context of a browser to manipulate the DOM and script webpages. It is unsuitable for any other use.
The fact that a package you include in your webpage/app pulls in a tree of hundreds of others, any one of which can 'dynamically' overwrite/redefine other pieces of the namespace with impunity, is why it's broken in the first place.
ergo it is insecure by design for all uses, and unsuitable for anything important relying on code you wrote doing what you expect.
It's fair to say it's broken in a world where what we do in webpages MATTERS. It was fine in the past where nothing truly important happened in web pages, but that ship has sailed.
Based on your use case, it may already be solved. Microservices, for example, are put in a container, which allows you to give read-only or full access to selected files. They also usually have some kind of mesh sidecar container that limits connectivity to whitelisted containers or public DNS. And since containers can limit memory and throttle CPU, you are more or less at the best you can get when using untrusted 3rd-party code.
Yup, SES is very interesting, and I think it can solve this problem in the future. As this post is about the here and now though, and SES is not yet ready for widespread adoption, I think my point stands that at this current time it is not possible to securely do capability based security in JS.
Hi, first thanks for your work both in TC39 and Deno.
But that's not the heart of the problem for node. The problem with the node ecosystem is a combination of a refusal from Core to provide a meaningful standard library (a set of packages people can trust, so they don't have to download random things from the internet just to parse the body of a multipart request, for instance) and the fact that NPM was badly architected: it fetches as many versions of the same package as it needs to resolve dependencies. Good package managers don't do that, period. Of course, given the dynamic nature of Javascript, anybody can monkeypatch anything. But the language itself isn't at fault, it REALLY IS both the politics of a paper-thin standard library and bad package management with stupid dependency resolution.
And frankly, Node.js should stop shipping with a package manager whose infrastructure is entirely controlled by Microsoft. It's so bizarre how nobody seems to mind that in the Node community... NPM's servers aren't open source.
The notion of 'securing part of a runtime' is essentially a fallacy.
'Sandboxing' anything really is quite hard.
If you want to make a firewall between blocks of untrusted code, they have to run completely isolated from one another, which creates a lot of overhead.
I don't see any way to mix untrusted with trusted code.
If some of your code is untrusted, it's all untrusted.
We probably need an operational solution to this, which is: you pay a fee to some org that literally sits there and pores through the code to look for hacks, and there's some verification & oversight as to who is submitting what, etc.
It may very well be that the days of regular open source are over, there might have to be some changes to it.
I can imagine, 10 years from now, all of you telling junior devs about the 'good old days' when people just randomly put some code up on a site, and you used it! They will hardly believe you. "That's crazy, a few lines of bad code could wipe out your company!".
Why not just freeze anyway? At the end of the day you don’t know what you don’t know. Deno is well positioned to make these sorts of restrictions, how often is someone doing something as mental as modifying prototypes in a nodejs environment anyway?
(Moment comes to mind actually, but does it really matter? That library is deprecated anyway)
Part of the draw of deno is that for better or worse it's javascript. If you start changing things about the language used in the deno runtime such that it's no longer compliant ecmascript, then you no longer get the benefits of it being ecmascript. The devs' mental model of the language isn't a drop-in, libraries and modules, including popular ones, are no longer guaranteed to work out of the box, etc.
You might say that all that is worth the benefit, but in that case why not just use another language that already gives you the feature you want or why stop there? Why not also fix other issues with javascript at the same time since we're no longer preserving compatibility? Something mental like the automatic type coercions? Or getting rid of var?
Deno has already made similar changes, like https://github.com/denoland/deno/pull/4341. That particular change happens to be allowed by the JS standard. The change discussed here isn't currently allowed, but I suspect TC39 would be open to making it allowed (though obviously it would not be allowed for browsers, in the same way the linked change to __proto__ is not allowed for browsers).
If you change something that most code isn't relying on, most code will still work. This change is plausible because it's very rare for code to be mutating built-ins. That's not true for most other possible "fixes". And most other changes would not have a benefit to consumers of the application (who cares if the library you're using has `var`s?), so they're much less well motivated.
I wouldn't consider that a similar change, because it's fully compliant with a newer js spec, which deprecated that feature. It sounds like, what deno has done is removed native support for older js specs and instead makes you transpile to an earlier spec. And by removing support for the earlier specs, they are able to drop support for deprecated features.
"allowed by the js standard" is key. As long as it's allowed by the standard they're still fully compliant with it. The compliance is necessary because no one wants to deal with "mostly compliant". Users want certainty, so "mostly compliant" becomes "fork their spec and make your own, so i can know what guarantees you make". That's why each change to the spec results in a new version. It's a self-fork of the previous spec.
If the change made it into the ts or js spec, I'm sure they'd add in support for freezing prototype chains, even if just as an option that can be toggled, but I doubt they'd ever want to break the spec just because they don't like parts of it. That opens the door to more changes because they don't like the spec, and eventually you have a new language, or worse, the original language changes out from under you to support something similar in a newer spec (eg typescript and namespaces/modules).
> That's why each change to the spec results in a new version. It's a self-fork of the previous spec.
That's not how it works, no. There is just the spec [1], which is updated frequently. I am editor of the specification. (There are annual editions as well, but no one should pay attention to these.)
> I'm sure they'd add in support for freezing protoype chains, even if just as an option that can be toggled, but i doubt they'd ever want to break the spec just because they don't like parts of it.
Well, like I said, if Deno's only concern is breaking with the spec here, I expect the spec could be updated to allow this behavior.
>>> It is the fourteenth edition of the ECMAScript Language Specification
Fine, then sed s/version/edition/g
From the point of view of a maintainer that makes sense, but for the users, each annual edition or feature moving into stage 4 is its own spec version, and unless the feature is absolutely groundbreaking, thinking in terms of annual editions makes it possible for users to grab various tools with confidence that things will work together smoothly.
Look at how Mozilla interacts with the spec [1]. They're not thinking in terms of the nightly version of the spec. They're looking at the annualized editions and making sure they support them as fully as possible. And then they communicate that support to their own users in terms of that annualized edition.
V8 consumes from nightly, using the up-to-date test suites, and they explain their reasoning here [2], but notably they're still conceptualizing things to their users through the lens of annualized editions even though they're still grabbing features when they're only at stage 3. They even tag blog posts about js with tags for the annualized editions that added the feature: [3].
As a user of the users of the spec, the annualized editions are super super helpful. I can only use features that have actually been implemented. And I have to make sure that each tool I use is only getting code that uses features it's implemented. Can you imagine if every tool had feature-by-feature implementation matrices? "OK, ESlint understands new features A, B and D, but babel only implements B, C, and D, but library X doesn't support feature D yet, so since we want to use that we have to use bluebird instead of the native feature D for now" and on and on. It'd be madness, and we'd end up picking a handful of tools we like enough and then transpiling everything to es5, because our own users don't actually care whether we transpiled down to es5 or shaved enough yaks to realize that our toolchain natively supports feature B but everything else must be transpiled out or have the native implementations replaced with our own in-code implementation. Instead each tool picks an annualized edition, and while slower tool release cycles can be annoying, I can actually turn that guarantee of standardized features into a toolchain with transpilation steps as necessary, which means as a dev I know I can safely use any feature in that annualized edition without worrying that some feature I haven't really used before is going to blow up somewhere in my toolchain because the implementer hasn't gotten around to implementing it yet.
So while I like looking at the draft proposals and consider it important to know how the language is evolving, I have to wait for implementations to permeate enough of my tools before I can use the shiny new feature, which means annualized editions of the spec.
Mozilla is absolutely thinking in terms of the nightly version of the spec. I agree that public messaging sometimes talks about annual editions, but this is mostly because it's a convenient way to talk about when features were added to the language, not because it reflects any underlying reality.
Anyway, that's not really the relevant thing. What I'm addressing is:
> deno has done is removed native support for older js specs and instead makes you transpile to an earlier spec. And by removing support for the earlier specs, they are able to drop support for deprecated features.
And that's just not a thing. That has no relationship to how the ES specification works. The __proto__ accessor was never mandatory; it was added in browsers a long time before it was specified, and then specified as optional (Annex B) when it was first added to the specification, and has been optional since then. This is true whether or not you think of there being a single specification or annual editions.
So, with that said, to address the specific topic of annual editions:
> "OK, ESlint understands new features A, B and D, but babel only implements B, C, and D, but library X doesn't support feature D yet, so since we want to use that we have to use bluebird instead of the native feature D for now" and on an on.
That's exactly how it works. eslint implements proposals at stage 4. Babel implements proposals as they come out and people contribute, but only adds them to preset-env at stage 3. The output for preset-env is based on what's actually supported in the browsers you're targeting. They both take a variable amount of time to land features once they hit the appropriate stage. Neither of them gates anything on annual editions. Neither do browsers. And browsers will frequently not have implemented features from multiple editions ago; for example, regex lookbehind was added in ES2018 and is still not implemented in Safari.
If it was 2000, or even 2010, I'd agree with you. But that ship has unfortunately sailed. People modify the prototype all the time. If we were to freeze all prototypes right now in JS, a lot of existing code would break.
Why not lock the core libs like Map.prototype and Array.prototype? Can't we have a sandbox-like environment (which is not the default but which can be enabled on a per-application basis)? Java Applets which ran inside the browser had this kind of sandbox restriction.
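I'm imagining something in this spirit; a crude sketch, and SES's lockdown() or Node's --frozen-intrinsics do a far more thorough version of the same walk:

```js
// Shallow-freeze a handful of core constructors and their prototypes.
// This misses a lot (iterator prototypes, getters, the globals, ...), which
// is why real implementations like lockdown() are much more exhaustive.
"use strict";

for (const intrinsic of [Object, Array, Map, Set, String, Function]) {
  Object.freeze(intrinsic);
  Object.freeze(intrinsic.prototype);
}

try {
  Map.prototype.set = function () { /* exfiltrate */ };
} catch (err) {
  console.log("patch rejected:", err.message); // TypeError in strict mode
}
```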
The impression I get from the companies I've worked at is to not trust packages that are mostly maintained by one person. Ironically, those packages are usually the ones that don't have outrageously large dependency trees and can usually be audited by a developer on a weekend, and the more "trustworthy" larger packages with hundreds of maintainers typically have monstrous dependency trees.
Not saying 1-man tools are inherently better, but in my experience, these tools seems to have a tighter focus and less chances of scope/dependency creep.
Node has a peculiar place in the pantheon for me where there are a few well known individuals who are maintaining a mountain of code that is still mostly theirs even with external contributions coming in.
TJ in particular, before he fucked off, and Sindre Sorhus, who has a little kingdom of tools in the Unix philosophy but more useful than leftpad.
I probably use code from others but those are the only two I am keenly aware of.
When I find an issue with one of the little packages Sorhus publishes, I often need to traverse 3-5 different GitHub repos to understand how the code works. Many times I find it better to copy paste the “base” code into my repo once I find it than continue to use the web of NPM packages.
>can usually be audited by a developer on a weekend
why are devs expected to do this on a weekend and not just as part of the work week? are we coming from the perspective of the dev working on a side project?
The big problem with one-person packages isn't so much security as it is support. I have been burned more than once by old applications where key features rely on random packages with one maintainer who disappeared years ago. At least with a group, you have options to keep things moving without having to fork the library yourself.
(Of course the root cause here is arguably too much reliance on third-party dependencies, but searchable dropdowns are _such_ a pain to make on your own, and it's so tempting...)
The Sangria GraphQL library in Scala ran into a version of this. The libraries were primarily maintained by one person, who wrote the vast majority of the code and was the only person with write privileges in the main repos. Sadly, he passed away unexpectedly, and it took months (maybe a year or so) before his colleagues and other contributors were able to get access to the GitHub org.
Well, for what it's worth, we have a lot of dependencies maintained by Microsoft of all companies, with lots of production-breaking bugs that they're not too interested in fixing or letting us fix. Even getting fully-functional PRs (with good test coverage and community support) looked at takes a lot of work and time, let alone getting fixes after reporting issues.
One of those packages is a JS package that is hosted by them, so we can't even fork it and host ourselves.
On the other hand, with simple packages that get abandoned, we just fork, publish ourselves with another name or namespaced, and it's solved.
Solo maintainer vs. organization is definitely an imperfect heuristic for long-term support. But it's a decent approximation for dependencies that are low ROI but potentially high impact if they break, like a UI widget that gets used everywhere in your app.
It's the problem with any third-party dependency (ask anyone who's used certain Google products). But then if you build everything in-house, a) it's expensive, and b) you end up with homegrown frameworks written by somebody who left the company five years ago and now everyone is afraid to touch it.
The laws of software thermodynamics come for all of us. Eventually, old systems decay, and you need to roll up your sleeves and do the work to keep them going.
> But it's a decent approximation for dependencies that are low ROI but potentially high impact if they break, like a UI widget that gets used everywhere in your app.
Not really, it's not decent at all. What is a great approximation, however, is the heuristic presented by the grandparent poster: projects that are easy to audit, easy to fork (if necessary) and don't have outrageously large dependency trees. Everything else is a liability.
1-person tools make it easier to audit the person involved, and like you said in general I also find that those tools tend to be smaller and have fewer dependencies, which makes a big difference for security. Limited scope is good when looking at packages/dependencies, and side-projects are kind of required to have limited scope just by virtue of not having a ton of resources.
However, 1-person tools also have less accountability and less bandwidth to respond to emergencies, and (particularly if there aren't a lot of eyes on them), you need to evaluate whether the developer is qualified to build the package -- ie, are they likely to inadvertently introduce a security vulnerability or abandon the package if it needs security updates? People (myself included) have an instinctive bias to assume that 3rd-party code is written by people who know what they're doing. Part of the evaluation process needs to be asking, "does this person actually have the skill to do what they're trying to do?"
I also try to look at how documented the project is -- in an emergency, could I fork the project myself? If a project is being run by only 1 person, then it's more likely that the codebase isn't massive and that it wouldn't require a full team to update. But it's also less likely to be well-documented or extensively tested. Again, balancing act.
I'm not sure there's a single correct answer, I think it depends a lot on the project and on what kinds of packages you're looking at.
Unfortunately, as recent incidents have shown, these many-maintainer, otherwise reliable projects often have dependency chains that include these one-person, one-breakdown-away projects as dependencies.
Would I ever include left-pad as a direct dependency? No. But as we found out, the person who provided a library that was used by react-dom might.
> The impression I get from the companies I've worked at is to not trust packages that are mostly maintained by one person.
That is hardly the point. Regardless of what you think of the solution presented, the author is utterly right in saying any solution has to involve not trusting any packages at all. How many people wrote the package is irrelevant.
Founder of Socket (https://socket.dev) here, a new tool built by npm maintainers to help solve JavaScript supply chain security.
I totally agree with the idea that we should assume all open source packages may be malicious. Socket.dev uses "deep package inspection" to characterize the behavior of an open source package. By actually analyzing the package code, Socket can detect when packages use security-relevant platform capabilities, such as the network, filesystem, or shell.
For instance, to detect if a package uses the network, Socket looks at whether fetch(), or Node's net, dgram, dns, http or https modules are used within the package or any of its dependencies.
This entails running static analysis (and soon, dynamic analysis) on a package – and all of its dependencies – to look for specific risk markers.
In this way, Socket can detect the tell-tale signs of a supply chain attack, including the introduction of install scripts, obfuscated code, high entropy strings, or usage of privileged APIs such as shell, filesystem, eval(), and environment variables.
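To give a flavor of what a "risk marker" looks like in practice, here's a deliberately simplified, regex-based sketch (the real analysis is far more involved), so treat it as illustration only:

```js
// toy-scan.mjs: walk a package directory and flag references to
// security-relevant Node APIs. Naive on purpose: it misses dynamic
// requires, aliased imports, bundled/obfuscated code, and more.
import { readdirSync, readFileSync } from "node:fs";
import { join, extname } from "node:path";

const RISK_MARKERS = {
  network: [/require\(["'](https?|net|dgram|dns)["']\)/, /\bfetch\s*\(/],
  filesystem: [/require\(["']fs["']\)/, /from ["']node:fs["']/],
  shell: [/require\(["']child_process["']\)/],
  eval: [/\beval\s*\(/, /new Function\s*\(/],
};

function scan(dir, findings = {}) {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const path = join(dir, entry.name);
    if (entry.isDirectory()) {
      scan(path, findings);
    } else if ([".js", ".cjs", ".mjs"].includes(extname(entry.name))) {
      const src = readFileSync(path, "utf8");
      for (const [risk, patterns] of Object.entries(RISK_MARKERS)) {
        if (patterns.some((p) => p.test(src))) {
          (findings[risk] ??= []).push(path);
        }
      }
    }
  }
  return findings;
}

console.log(scan("node_modules/some-package"));
```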
We are taking an entirely new approach to one of the hardest problems in security in a stagnant part of the industry that has historically been obsessed with just reporting on known vulnerabilities.
This, in my opinion, is the right answer for the problem identified in the parent blogpost. Rather than trying to get every single package author to adopt some unified capability token scheme in their code, just statically analyze all dependencies from the outside and report the capabilities they actually use.
It would be even better if something like this could be integrated directly into the package management tool itself, so that you could run `npm update` and get back "New dangerous API usage in package X version a.b.c: filesystem access. Type package name to acknowledge and upgrade."
> It would be even better if something like this could be integrated directly into the package management tool itself
We're planning to build this. However right now, the primary way to consume Socket.dev data is through our GitHub app (https://socket.dev/integrations).
What prevents a malicious person from crafting their code until it evades your analysis? It's the same with antiviruses. They're not that useful because adversaries adapt their viruses to pass antivirus heuristics. And, as viruses show, you can make your heuristics however complex you want, and someone smart will find a way around them. Especially in that wild JavaScript environment.
This is a fair question. The answer is that most malware behaves in ways that are deterministically detectable. For example, 93% of malware uses install scripts, which must be declared in the package.json file and are not possible to hide from our analysis.
From recent research:
> We found 93.9% (3,412) of malicious packages had at least one install script, indicating that malicious attackers use install scripts frequently [1]
When malware authors adapt and start doing fancy dynamic stuff, we might not be able to figure out exactly what they're doing, but we can detect the usage of obfuscated code, dynamic requires, and other signals of compromise.
I think Go modules version resolution opting for lowest common release rather than the standard highest is a reasonable and sane option though not a total solution. It prevents users of your library from pulling unvetted versions of your dependencies just by pulling your library alone.
This won't prevent packages you're already giving capabilities to from later doing something evil in a small patch update.
So it's not enough - you should also not rely on "automatic security updates" through some sort of semver trust. Lock your dependencies up completely, choose packages with a small dependency tree or zero dependencies, and at the same time use the newly launched https://socket.dev to know what packages do. Also reading through the source of your dependencies should be on the list.
> This won't prevent packages you're already giving capabilities to - to later do something evil in a small patch update.
True - it's not perfect! But the principle of least privilege should help limit the blast radius. How many packages in an npm dependency tree need access to the filesystem? Or to the network? I bet it's a vanishingly small percentage of the packages in the average nodejs project. Being vulnerable to malicious code in 3 hand-selected packages is much, much better than being vulnerable to malicious code in any of the packages in your dependency tree.
And even then, we can be very specific about what those packages have direct access to. Right now the situation is "every package can read and write to any file on my computer". The fine grained permissions I'm proposing in this post would let us say "only package X can access the filesystem at all, and when it does it only has access to this subdirectory".
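Purely as a strawman for what I have in mind (npm has nothing like this today), the configuration could look something like this, written as an annotated JS object since package.json can't hold comments:

```js
// Hypothetical per-package permissions in a package.json-like manifest.
// Package names and the "permissions" field are made up for illustration.
const packageJson = {
  name: "my-app",
  dependencies: {
    "image-resizer": "^2.1.0",
    "left-pad": "^1.3.0",
  },
  permissions: {
    // Only this package (and nothing else in the tree) may touch the
    // filesystem, and only inside this one directory:
    "image-resizer": { fs: ["./uploads"], net: false },
    // No capabilities at all: pure computation only.
    "left-pad": {},
  },
};
```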
Thanks for sharing Socket.dev! Totally agree that a key part of any supply chain security strategy must be understanding what packages actually do when they run.
For example, see the package `angular-calendar` which is a calendar/date picker web component. When you look it up on Socket.dev [1], you'll see that it actually uses:
- Install scripts
- Telemetry to track you
- Network access
- Shell access
- Environment variable access
- File system access
Which is waaay more capabilities than you'd expect. All of these capabilities turn out to be caused by a single dependency which implements telemetry to track the package usage a la Google Analytics.
Wow.. Yeah that's a great example of exposing what's actually going on!
Btw, is there a specific reason you're not listing the "yellow issues" from a package's dependencies on its front page? For instance https://socket.dev/npm/package/mongoose doesn't really show anything on the front page, but if you go to "dependency issues" you get "uses network, eval, etc.". I think it'd be necessary to treat "dependency issues" the same as the package's own issues.
We're not happy with the noisiness of filesystem and network issues, so we mark them as a bit lower priority for the moment. The specific issue is that we currently only detect when 'fs', 'net', etc. are required and not whether they're actually used and which specific functions are used.
We're working on improving our analysis and are close to shipping a big update at which point we'll increase the severity of these issues.
Author here! I didn't even know that existed. Thanks for the link!
That looks like it has the same problem as Deno's solution, in that it's too coarse for my taste. I want to explicitly give permission to a library, not to the process as a whole. (Since I don't want some errant library deep in my dependency tree to nuke my production databases.)
I love the definitions of scope though - that looks like exactly the sort of thing that I want here.
> I want to explicitly give permission to a library, not to the process as a whole. (Since I don't want some errant library deep in my dependency tree to nuke my production databases.)
Doesn't Java have a SecurityManager feature that can do this? Perhaps we need a JS equivalent.
The trouble with SecurityManager is that it’s only as good as its widespread support.
If it worked, and was widely used, for example, nobody would have had to worry about the possibility of their logging library downloading code from an LDAP server and executing it.
SecurityManager was a bad idea. It might be useful to prevent accidental boundary trespassing. But it turned out too flaky to serve as a security foundation. Basically nobody uses it for security.
Good article. This is what I was excited about the first time I heard about Deno before I eventually learned what Deno's sandboxing model actually was. I'm not sure this is the exact proposal I would want, but I do want something vaguely like scopes or capabilities in Node, and even if it wasn't perfect I think it would go a long way towards mitigating at least some of the current risk in the ecosystem.
Also agreed that for all their use, it would have been better in the long run if install scripts had never existed. It's not just that they're a security vulnerability, they also get in the way of vendoring code, and can introduce additional non-JS dependencies and errors on other systems/platforms. Again, not to say that they don't have any use, I get why they're there. I just wonder if the benefits are worth the downsides.
I think that’s throwing the baby out with the bathwater. Wouldn’t it be logical that when people use snippets from Stack Overflow or classical algorithms from Wikipedia the pasted code was actually treated as a dependency so you can get warnings and updates in case errors or security issues are found? It also helps for proper licensing and attribution.
The problem is that code in npm should allow for trust. Either based on signatures, or code reviews by trusted parties, or something like that. Code signed by a long time trusted developer that’s been published for a month should not be treated the same as a 3 minute old commit to a repo by a mysterious developer. These verifications should be automated and npm could give a final ranking.
> I think that’s throwing the baby out with the bathwater. Wouldn’t it be logical that when people use snippets from Stack Overflow or classical algorithms from Wikipedia the pasted code was actually treated as a dependency so you can get warnings and updates in case errors or security issues are found? It also helps for proper licensing and attribution.
Why are you assuming devs don't just both use random snippets from Stack Overflow and also download packages with 900 transitive dependencies from NPM at the same time? It's not one or the other.
In the early days of node, the ability to have tiny nested modules without opening the gates to dependency hell was such a profoundly new and exciting capability that the community didn't need any convincing to (over-)embrace it. As with microservices today or CORBA in the 90s, moving your design's complexity from its nodes to its edges is a powerful way to convince yourself that you've made it simpler.
Could a path forward also be to unify the most depended upon small packages into one large dependency, managed by some trustworthy entity?
I guess some plug-in to npm could handle the resolution-mapping between the (9 line) strip-ansi package and the node-standard-library package?
Of course this doesn’t solve all problems, but if a create-react-app-style project could lower its number of dependencies by 80% or something, it would be easier to keep track of the remaining ones.
Because even if you have some kind of capability system, you are still pretty vulnerable to misbehaving packages. Even if the scope of badness is dramatically lower, chaos would ensue in many build pipelines if some of these “core” packages just started throwing exceptions / returning empty objects.
The tendency is to make a library for every function. Just look at the lib directory in the Slackware install tree. The functionality (of the distribution) has not changed too much, but the number of libraries needed is astonishing.
No packages or repositories deserve trust, at least not in the current state of affairs. There is no magic fix that will enable you to trust packages just by adding a new framework or anything similar, and what the linked article is addressing is only half the problem.
We also need a way to make packages auditable. Package signing by the publisher and the repository needs to be mandatory. Having an actual link between the package and the commit it was built on and a way to reproduce the build[1] also needs to be possible. This would allow for proper code reviews, not just of your own code but also audits of whatever extra components you are using.
Organizations need to define sensible thresholds for when you can use a package and when you implement the code yourself, to avoid adding a library only to use one function. They also need to define trust; what is needed to trust a package or a maintainer.
And we need a system and proper best practice to guide us on what to trust; do we really want to add a package with 100 direct or transitive dependencies, where some hasn't been updated for the past two years, some are maintained by solo developers and some are just implementing already existing functionality?
All of these are hard to implement and justify when the current situation seems to work for a lot of people.
TL;DR: There's nothing fundamentally less worthy of trust about node's supply chain than any other popular mainstream language ecosystem. They've all got some badness.
---
Here's a trope I'm tired of:
Take X general problem that affects a wide range of systems, attribute it to one narrow system Y. Usually because that system Y's general accessibility and success leads to a higher number of high profile incidents related to problem X.
One of the practical outcomes of this trope is people trying to solve this problem in narrow, ecosystem-specific, non-portable ways.
The issue is especially bad, and needs to be especially called out, for npm because of the insane dependency bloat.
You can write fully featured and useful python apps with just the standard packages that come with python. In JS you need a handful of third party packages just to tell if a number is odd.
A difference in magnitude becomes a difference in kind.
I think you have to appreciate the history with node to know why there are so many packages. Node’s growth coincided with GitHub’s and node’s community really adopted the “social programming” trope. You could really make a name for yourself with a popular node module. Javascript had more limitations to work around at the time, and then the desire to reuse code in node and the browser created even more need to abstract common tasks. The “modularize everything” philosophy resulted and it became a kind of game to make as many modules as you could think to make; after all, isn’t code sharing the joy of the FOSS revolution?
That era has since peaked and declined. Now I see people make way fewer modules because of the difficulty of managing them all. There’s much less cred to earn from a node package. People who did gain social capital from modules are now stuck as maintainers, gaining very little additional value — thus more conversation about paying maintainers with monetary capital, along with abandoned or ownership-transferred code. And, of course, we’re now suffering from the security issues.
It’s still an incredibly valuable corpus of modules, but it’s post-bubble. It wasn’t just “js programmers are too novice to know better.” It was people having fun, playing the social game, trying silly ideas, and chasing a meme-wisdom of programming (modules = good).
> In JS you need a handful of third party packages just to tell if a number is odd
But you absolutely don't.
I get that you're taking a worst-case example, but it also stands that if Node developers would actually take some time to write stuff themselves -- the leftpad situation was absolutely stupid, because `.padStart()` exists -- then the situation wouldn't be nearly as bad.
At my workplace, if you can write the functionality in (depending on the scale) a day to a week, then you're not allowed to use a 3rd party package for it. You can _look_ at what other people have done, but doing it in-house leads to less work overall in the future.
> the leftpad situation was absolutely stupid, because `.padStart()` exists
String.prototype.padStart hit Chrome in January 2017 and Firefox in June 2016. `left-pad` was published in March 2014.
A lot of these small packages that seem ridiculous now addressed (sometimes poorly!) missing aspects of the library specification for pretty good reasons. The JS specification has improved a lot over time--I just chuck `ESNext` into the library tsconfig.json for every Node project--but there is a lot of historical baggage with which this kind of dismissal doesn't adequately come to grips.
Node has horrific dependency bloat, but this is a strawman here because the right response is not "let's assume the dependency bloat is inherent/justified and come up with specific security mitigations" but rather "let's ask ourselves: WHY does Node have such terrible dependency bloat?"
I've no idea why, but I've some personal theories:
Similar to how PHP has historically been associated with bad code, not all of which can realistically be attributed to the spellings of its API identifiers, I think whenever you have a system that solves the ease-of-use/developer-accessibility problem well, you will end up with the standard of contributor to that system being less skilled, since it's easier for less experienced people to start using it. You see this throughout the NPM package ecosystem: packages developed by very inexperienced engineers being relied upon by big popular projects.
This is ultimately a "good problem". You strive to make your tools easy to use, and when you succeed, you end up with more people using them badly.
You can argue that tools should be both easy to use and also foolproof, but that's utopian. Let's work towards that but not expect it as a baseline.
> In JS you need a handful of third party packages just to tell if a number is odd
Maybe it's hyperbole, but you definitely don't "need" any third-party package to tell if a number is odd (`i % 2` does the trick). That there exists a package for it doesn't mean that the majority of users actually use it.
The standard library for JS is pretty small in general, but I don't think hyperbole is the right way of getting your point across, as I agree with you in general.
is-odd gets between 400k and 500k downloads every week[1]. Maybe you don't need it, but lots of JS developers have decided that they do. is-odd also depends on is-number, and is-even depends on is-odd.
I agree that these packages are trivially implemented by a first party instead of imported, but ~half a million JS developers every week choose to import it instead. _That is the problem_.
I'm fairly certain those download numbers include CI runs redownloading the not-cached package over and over, likely via a transitive dependency. I don't think it's fair to say approximately half a million unique JS developers are choosing to import it every week.
These numbers aren't too hard to rack up for well-known packages (even the meme ones like this). e.g. is-odd is a transitive dependency of stuff like handlebars-helpers which gets a lot of downloads and will pull in is-odd automatically.
People complain about these tiny packages they find on npm the whole time, but usually these packages come from people learning how to create their first npm packages, or creating or following tutorials. They aren't serious packages used by typical developers for production apps.
If you go to the isEven github repository you can even see "I created this in 2014, when I was learning how to program." If you hover over his 'organisation' you'll see the text "This is a joke. You'll only see this org if you are attempting to troll me about repositories I created when I was learning to program."
is-odd gets between 400k and 500k downloads every week[1].
The package was written as an exercise in learning how to create a small, useless package. But a huge portion of the JS development community chose to import and use the package anyway.
They're jokes or satire or learning repositories, like `install-is`: "Installing this package installs a bunch of useless packages" or Stalinsort or module-practice-january. The most serious looking dependents are by the same author.
This is absolutely not true, and I'm tired of seeing this.
is-odd, alongside a bunch of other microdependencies are almost all the work of one person, who made as many micropackages as possible and then PRd them into other more popular libraries. There are not 6 million people directly downloading `is-odd` a day. At all.
When this person could make one library to do something (like an ANSI-Colouring package), they would fractalise it into as many dependencies as possible, because that boosts their download count on NPM. I should note that this is just one person who has managed to nestle their way into some larger projects. I apologise for the spam, but this point really needs hammering home:
I think you're missing the point. These packages are stupid and too small, but they get millions of downloads a month. That's the problem with the JS community. Instead of rejecting a dependency for being silly, JS devs will happily import them to save two lines of code.
No. The JS community won't happily import stupid packages to save two lines of code, any more than devs in any other community. A very small minority of devs will do this, and publish their larger packages. Those millions of downloads a month are transitive.
This is the same kind of defense C programmers make regarding their beloved footguns.
That might be true in aggregate, but the exceptions are exceptionally bad, impact a lot of people, and then fools try to rationalize the footgun on HN.
The question is, are we seeing that bad behavior because there is something inherent about the platform that encourages it, or are we just seeing more of it with Node because that community is enormous?
Because people love to bring up is-even as a reason why Node & NPM suck. What exactly did NPM do to create the is-even situation? (other than making it super easy to publish). What should they do differently?
I'm old enough to remember the early days of Node and listening to a podcast (the name escapes me) with Isaac Schlueter explaining how node_modules isn't a hidden directory because you should vendor it.
The problem is that Node and NPM grew faster than people could be introduced to vendoring node_modules. Fast-forward a decade and it seems like people have forgotten all about vendoring and instead optimized for blindly shipping code from the Internet that's warrantied for no purpose.
The excuses why people don't vendor their packages are almost identical to the excuses people don't write tests for their code (i.e., time and velocity impact).
If I remember correctly, that library was a joke - which people immediately started extolling as "the right way to do it", also as a joke, and which, thanks to Poe's law, other people then genuinely understood as the right way to do it...
Those numbers are a little deceiving. It’s likely that those modules are upstream of a popular module or two. It’s not like they are installed by projects directly.
It's only directly depended on by 40 or so npm modules, and most of them look like beginner/joke modules like "is-ten-thousand" where they brag about being featured in a "worst npm libraries" article.
These sorts of conversations quickly turn into just making fun of beginners for using dumb packages. With the popularity of Node, it makes sense to me that you'd have beginners searching npm/google for how to tell if a number is odd, and for whatever reason they find is-odd or whatever.
And perhaps the idea of even collecting packages to do simple things is something fun for them. I remember having a weird maximalist attitude when I was a beginner using Rails. I'd install a Ruby gem and I could use it anywhere in my files without even importing it, usually a single line of abstraction. For some reason that appealed to me even though I could have done it myself. I think I had this idea that people writing libraries were doing things right, and I was right for tapping into them.
I think we should save our denigration for the serious, popular projects that use such packages, rather than for the fact that beginners use them. This sort of package has no business being a transitive dep of Express (it isn't) or of popular Express middleware, mainly because transitive deps are security issues.
Very good points, and I think you're right on the numbers here! I went dependent spelunking - almost half of it comes from is-even depending on it (naturally), but then I didn't notice any dependents of is-even that would explain the rest. I guess people do just google it and grab the module.
Mea culpa for the unsubstantiated assertion before.
JS has a weak stdlib so it's definitely more common to need to pull in some deps vs python. Of course, nothing is forcing you to install silly packages that are wrappers for one-liners.
> In JS you need a handful of third party packages just to tell if a number is odd.
I have no doubt some clueless interns have done this but there is no need to self-inflict this kind of unnecessary pain.
>Of course, nothing is forcing you to install silly packages that are wrappers for one-liners.
That's the thing... I am effectively forced to install a lot of silly packages, because I need some not-so-silly packages, and these in turn pull in all the silly packages as dependencies of their own (a few levels down the chain).
I never installed leftpad (or any package like that) myself, and yet, at one point it was present in basically every node project I ever did, because of indirect dependencies.
While this kind of dependency bloat could happen in any language ecosystem, in node/npm it is from my experience by far the worst. I think it's because the javascript and node standard libraries were/are so very limited combined with npm making it too easy to publish and consume packages, and being early enough in the game so supply chain attacks weren't yet on most people's mind.
I think, aside from node package maintainers being too nonchalant about pulling in basically silly dependencies, it's also a matter of a lot of package maintainers being very laissez-faire about cleaning up cruft and doing the tedious work of removing dependencies that are no longer needed.
An example of that - because it bugs me every time it shows up in my logs, package lock or node_modules - is the isarray package. It's another one-liner; Array.isArray has been part of JS for a long time (even IE 9 supports it, and IE 9 went EOL in 2016), the isarray package just uses the built-in when present (i.e. virtually everywhere), and the author recommends using the built-in Array.isArray directly. And yet it's still omnipresent, with almost 63 million weekly downloads and 858 direct dependents on npm (plus countless indirect dependents, and dependents that aren't published and therefore aren't tracked on npm). And the number of weekly downloads still goes up week by week, month by month.
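For anyone who still has it in their tree, the built-in covers it:

    // Built into the language since ES5 - no package needed
    Array.isArray([1, 2, 3]);      // true
    Array.isArray("not an array"); // false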
Hi! Author here. I write a lot more Rust than Javascript these days, and I agree - this is absolutely a problem I'd love to eventually see addressed in the rust ecosystem as well. (Although dependency madness hasn't kicked in anywhere near as much in the rust ecosystem).
But rust and javascript each have unique challenges. Per-library sandboxing is hard in javascript because of the language's dynamism. It's hard in rust because any code you pull in can always drop down to unsafe, and in unsafe land you can really do whatever you want. We could probably selectively ban eval() in modern server-side javascript pretty easily because it's just not used that much in modern code. But we can't ban unsafe in rust because it's sprinkled everywhere.
> One of the practical outcomes of this trope is people trying to solve this problem in narrow, ecosystem-specific, non-portable ways.
I think trying to solve this in every language at the same time would be an exercise in boiling the ocean. There's no need to implement something like this in every language all at once. Much better to start somewhere, experiment, and hopefully if the solution works well we can see how it might apply in other languages.
Talking about `unsafe` in this context seems like changing the discussion completely.
All of the concerning things done in the JS supply-chain attacks can be done perfectly well in safe Rust. You don't need `unsafe` to exfiltrate secrets or to encrypt or delete files.
Auditing the use of unsafe in dependencies is worthwhile for mostly unrelated reasons; it's not the same thing as sandboxing them, auditing them for malicious code, or assessing how much you trust them.
Right; I'm skipping ahead and imagining an alternate universe standard library for rust with capability support. Like, what if we did the thing I proposed in this blog post, but did it to rust instead of javascript? How would that work?
In that world, all the privileged operations (filesystem, OS, etc) in std would require an extra capability object to be passed at runtime. The capability would grant permission to the caller to perform the privileged action (like writing to a file).
That would stop rogue rust crates from making syscalls that they shouldn't be making. There might also be a way to enforce those permissions at compile time instead of at runtime.
But even if we were willing to do that, it wouldn't matter if any library could still use unsafe blocks. The reason is that there are dozens of ways to make syscalls from unsafe blocks without needing to go through std. For example, you could dynamically call functions in glibc / musl, use an asm! block, or (probably) manually compile a function into a byte array, transmute it into a function pointer, and then execute it.
If you changed std, and also banned 3rd party library code from using unsafe, you could probably make the system secure. But I'm worried that the rust community might not be willing to pay that cost for extra security.
> If you changed std, and also banned 3rd party library code from using unsafe, you could probably make the system secure. But I'm worried that the rust community might not be willing to pay that cost for extra security.
Anyone will be willing to pay a cost when the benefit is well-defined and exceeds that cost. "Make the system secure" doesn't have a well-defined meaning, because security isn't binary like that.
Banning `unsafe` from all non-std deps is pretty hard for most real-world Rust software unless std grows a lot larger. Even apart from domains with a lot of necessary unsafe, like embedded, you have crates like `bytes` or `tokio` that use unsafe for performance reasons.
A realistic discussion about this stuff always has to involve trust. I don't have much issue with tokio containing unsafe, given its track record and the quality of the code. OTOH, I don't allow actix in my dependency trees because its history of using unsafe unnecessarily and unsoundly means that I don't trust it.
To me, a more interesting approach for improving supply chain security in software builds on the issue of trust rather than things like technical capabilities. Rust has some interesting work in this area already, like cargo-crev for distributed code review, cargo-deny for applying rules to your dependency tree, cargo-geiger for seeing which dependencies are using unsafe, etc.
> There might also be a way to enforce those permissions at compile time instead of at runtime.
AIUI, a compile-time capability is just a custom unit type, perhaps using PhantomData to depend at compile time on some generic type or const generic parameter. Then ordinary type checking is enough to ensure that this gets "threaded" correctly throughout the code, as required. 'Narrowing' a capability is just a one-way type conversion, e.g. via .into(). Since you're doing this via a unit type that carries no information, everything should disappear at runtime, with no effects on the ABI. You'd effectively be using the type checker to prove things about what your code is allowed to do.
> It's hard in rust because any code you pull in can always drop down to unsafe, and in unsafe land you can really do whatever you want
There is a quasi-standard "cargo-geiger" tool that can report on unsafe usage in a Rust project's imported crates. (Of course, this should really be an officially provided feature in the first place.)
> I think trying to solve this in every language at the same time would be an exercise in boiling the ocean.
I actually think the opposite is true.
Capabilities are great but limited in their applicability within actual application code. Ultimately, your main application may require some "dangerous" APIs to do something benign, may be loading 3rd party code to wrap those APIs, and we need a way to trust that code. That's a general problem, and - while it is a very hard one to solve - coming up with novel and unique solutions in every ecosystem is going to reinvent a lot more wheels (& boil some seas at least) compared to looking at software composition holistically.
> It's hard in rust because any code you pull in can always drop down to unsafe, and in unsafe land you can really do whatever you want.
Not super familiar with Rust, so this may be a silly question: could Rust mitigate this by forbidding third-party packages from using unsafe code, except for ones you specifically allowlist? e.g. your Cargo.toml might look like this:
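To be clear, nothing like this exists in Cargo today - as a purely hypothetical sketch, piggybacking on Cargo's `[package.metadata]` extension point (which Cargo ignores and external tools can read), it might look like:

    [dependencies]
    tokio = "1"
    bytes = "1"

    # Hypothetical: only the listed crates may contain `unsafe` code;
    # an external tool (or a future Cargo feature) would enforce this.
    [package.metadata.unsafe-allowlist]
    allow = ["tokio", "bytes"]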
The proposed system itself should be fairly portable, and the idea reminds me of SELinux. The problem is SELinux knowledge isn't widespread, it's Linux-only (so no support for OS X and Windows, which are the majority of JS developer machines), and especially it operates at the syscall level - which is perfectly fine for limiting filesystem access (e.g. for all nodejs processes, limit file I/O to the parent directory of node_modules?), but barely usable for sandboxing network access.
Therefore it makes the most sense to define a standard for configuring capabilities/permission whitelists/grant requests/(however else you want to name it) and leave the implementation up to the platform/language.
Of course node is less trustworthy than other platforms. If it wasn't, we would see active exploits on other platforms as well, but they are either incredibly rare or non-existent.
But you are right in that it's not a difference in kind. Except for the inane idea of executing random code on library installation, node doesn't have any kind of vulnerability that isn't shared with every package manager out there. The important differences are social (in the community) and in exposure area, because JS programs tend to rely on one or two orders of magnitude more developers than other languages.
JS didn't need it either. A dev made the library, shared it, and some developers decided to use it. That's on them. Nothing about JS made it necessary.
Node has a more robust standard library than Rust, whose minimal standard library forces devs to download third-party libraries to compute a regular expression or generate a SHA-256 hash.
For now Node is a richer target due to its popularity, but the same issues will hit any language ecosystem that suffers from the same flaws, should it become popular.
> One of the practical outcomes of this trope is people trying to solve this problem in narrow, ecosystem-specific, non-portable ways.
In general I agree, but I disagree that conversations about dependency scopes are ecosystem-specific. Frankly, figuring out how to limit the capabilities of imports is a discussion that literally every language with a package manager should be having right now.
Sure, the implementation is always going to be platform/language-specific, but this should be a serious consideration any time someone designs a new package system; we should expect new package managers to have an answer for how they scope dependencies and handle dependency permissions.
1. The Node supply chain has serious problems when it comes to security: this is true
2. The strong implication (throughout the article, but most prominently in the title) is that these problems are either unique to Node or at least worse in Node than elsewhere: this is definitely not true.
---
The solutions cited in the article are good, and would definitely benefit the Node ecosystem: --ignore-scripts is a well-known option that would be great on by default, but currently tends only to be enforced in large corp CI/CD. The capabilities model is not too different from what Deno does (this article's idea is somewhat more fine-grained but much more complex, which could hamper adoption or lead to misuse).
Overall both would be of somewhat limited, but still very worthwhile benefit.
The hard problem imo is solving more generally for supply chain trust. We implemented --ignore-scripts enforcement last year and have only really seen it have a relatively small impact, despite it being a relatively new idea in the past few years (which motivates more exploits due to its novelty). In reality, most supply chain attacks either target prod envs or integrated build dependencies, rather than piggybacking on CI/CD repo install steps: the latter gives you the same env access and has no package-manager-specific mitigation, so it needs a more general solution. Capabilities also have limited use outside install scripts, because limiting them also limits the "benign" functionality of your normal application code.
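For anyone who hasn't set this up: the enforcement itself is just a one-line npm config per project or machine:

    ; .npmrc - never run package lifecycle (preinstall/install/postinstall) scripts
    ignore-scripts=true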
Debian has ~20,000-30,000 packages total. NPM has well over a million. Contribution frequency and overall contributor numbers are also much much higher.
NPM is a victim of ease of use and popularity. It's a bigger target.
But both systems would benefit from a holistic approach to supply chain security.
Sadly, in Debian every package which gets installed can run a script with root privileges, so this is even more dangerous. Although most packages don't need such privileges.
Nonsense. Package managers need to be able to run scripts as root to do the installation. And yet, in the last 25 or 30 years there's never been a case of a malicious contributor successfully inserting a backdoor in the installation script of any package in any major distribution.
Because there is a vetting process, nothing else.
[And yes, of course, it would be possible to sandbox each package installation to access some very specific paths but so far it's really unnecessary]
The way modern Linux distros work is that a number of volunteers who pay attention to what's going on upstream package software for users and developers. When something gets weird these volunteers change their behavior and prevent the users from being harmed (the recent Audacity mess is a great example of this.) I don't think people pushing eg cargo/node/snap appreciate how safe this has made their OS and languages that rely on distro package managers (such as C.)
You get the safety that you might expect from an app store without actually restricting anyone's freedom. It's very much like a church with elders preventing people from being captured by vices and whatnot. Yes it has issues but it works shockingly well, much better than many of the alternatives.
Also, most distros have the concept of stable releases. This provides a very valuable "focus point". It means that we don't just have to rely on the maintainers. Users can review packages for reasonable behavior too, and this isn't made futile by constantly changing packages. Users and maintainers can focus on just the stable release and at a reasonable cadence, and this focus point being the same for all users means that it has value to everyone else, too.
This is ultimately about scale though. It's easier for Linux because of the relative number of contributors to distro repos.
Ubuntu has 10s of thousands of packages. NPM has well over a million. The average update frequency is also much higher, as is the number of contributors per-package.
> Ubuntu has 10s of thousands of packages. NPM has well over a million. The average update frequency is also much higher, as is the number of contributors per-package.
Node's culture of tiny libraries (partly caused by Javascript's tiny standard lib) is a big part of the problem and increases the number of potential supply chain issues.
It's not that Ubuntu has fewer packages because it's 30x more efficient with how it packages software -- it has fewer libraries because it has less software and less developer attention. It's not at all uncommon for me in Debian systems to have to search out non-distro repositories to pull from. And that's even before we get into the issue that Ubuntu/Debian repositories aren't rolling release. I often find myself jumping outside of the official Ubuntu repos even for software that they provide, just because they're out of date; it's one of the biggest reasons why I eventually moved to Arch.
Yes, JS dependency chains are out of control. No, that's not the only reason why there are over a million packages on NPM. No, the solution to the scalability problem of human-curated package managers can't be, "well, we just won't scale."
Adding a bigger standard library to JS would not be enough to get rid of 970,000 npm packages.
What standard libs are you comparing? Node's to what other language? I've seen so many commenters say this, but still not sure what the magical thing that can't be achieved with Node built-ins is...
Developers don't write libs because you can't do it with built-ins; they write libs because developers like to write code and NPM is easy to use.
You need thousands of packages for a pretty standard React application (which requires hundreds of base packages). That's a cultural problem in Node's packaging community that injects risk into the packaging ecosystem.
This is literally EXACTLY how releases are supposed to work for companies using any package manager out there.
> A number of [employees] who pay attention to what's going on upstream package software for users. When something gets weird these [employees] change their behavior and prevent the users from being harmed.
The problem is - much like a church with elders (and linux distros - frankly) - quality varies dramatically.
Some of them prevent people from being captured by vices, some of them diddle the kids.
Same here: Some companies take the appropriate steps to lock down dependencies and only update after a thorough vetting. Some pull the latest packages on every push to master.
The problem is that as you get deeper into Linux, you become progressively more and more likely to install your own packages from source, and then all of that curation goes out the window.
I'm not 100% convinced that the number of volunteers for package managers like Arch are actually sufficient to catch malware even in its current form; I think they get a lot of benefit out of desktop Linux being a relatively low-value target. But I'm really not convinced that their approach would be scalable if they actually had to scale at the level of npm. From what I can tell, Arch only has in the neighborhood of 13,000 packages[0], and it doesn't easily allow installing arbitrary versions[1]. I have nothing but praise for Linux, but none of the main distros have anything close to the amount of developer activity that the npm ecosystem has.
And that's when curation breaks down: Arch solves the problem of not having a ton of packages in the main repos by allowing multiple upstreams, by supplying AUR, and by compiling packages from source. But once you drop down into AUR, it's a lot more dangerous and would be a lot easier for people to push malicious code. And if you're pulling Makefiles off of Github, all of that goes out the window -- and there are good Linux software packages that encourage that behavior.
Not to mention the number of Linux software packages that straight up just give you a shell script to run that configures and installs the rest of the program (looking at you, Calibre). If you're lucky and you're using Arch, then you might be able to just install Calibre from the main repo. But that's also kind of Arch-specific, on Debian systems it's much more likely that you jump out of the main repos because they're out of date and you want the most recent version; you either start pulling from a dev-controlled upstream or you start running the shell scripts to install that software.
----
Don't get me wrong, I actually think that from a curation perspective, the way Linux package managers work is the best available solution we have for human moderation for software. A single large curated list that fits everyone's needs is impossible, it does not scale[2]. The only scalable solution for curation is to have a lot of separate curated lists that people can subscribe to; and then to recursively have curated lists of lists.
However, curation is not a magical catch-all solution against malware, particularly when you get AUR and source compilation in the mix. Curated lists are one layer of security, and need to be combined with other sandboxing techniques, with user education, and with (when possible) minimizing the number of packages people need to install. It's not as simple as saying, "the volunteers won't let anything bad happen" -- and I definitely wouldn't say that Linux package security is a solved issue, I think a recognition of some of the weaknesses of that model is part of the reason we're seeing so much effort going into Flatpak[3].
[1]: Yes, you can roll back but it's not really something that's advised to do for specific packages. Generally, your system will run smoother if you keep everything up-to-date and don't pin specific versions.
[2]: We've seen this with both iOS and Android, you either make a limited list that doesn't meet everyone's needs, or you have bad curation. Sometimes both. Splitting up lists does a lot to help solve that problem.
[3]: Although in the spirit of having multiple curated lists, I wish we'd start to see more popular upstreams than just Flathub.
> The problem is that as you get deeper into Linux, you become progressively more and more likely to install your own packages from source, and then all of that curation goes out the window.
It needn't go out of the window. The key thing is to keep the set of packages on which you deviate small. Then you can curate the exceptions yourself, or a community can form that share the same needs and they can do it.
It's when you do throw the curation out of the window, or subscribe to an ecosystem that effectively requires it [that curation be thrown out the window], that the problem arises.
It doesn't make sense to build this 'capabilities' feature into the program itself. It feels a bit like taping your mouth shut in order to lose weight.
It doesn't make sense for a program to not trust its own code any more than it makes sense for a person to not trust their own thoughts.
There is no need to pollute your code like this. It should be implemented as an external tool which analyzes dependencies when executed on demand. A company could just run this tool as part of their CI pipeline before code is deployed or executed. It could be a default hook which runs automatically as part of npm install. It should not be part of the code itself. It's ugly and adds unnecessary overhead and complexity.
It's possible that an external tool executed at compile-time would not be able to verify modules which come with C/C++ bindings, but I think it would be difficult to stop these anyway (even at runtime). C/C++ bindings will always be less secure because it's harder to understand what's going on if you don't have access to the code. C/C++ is too powerful; you can do some crazy stuff with buffer overflows which would be difficult to detect anyway even at runtime. The solution is to try to stick to modules which rely only on native Node.js functionality and not on custom C/C++ bindings.
The short answer is nothing - it would be very easy to introduce a malicious Maven package.
The slightly longer answer is maven artefacts are immutable; the default behaviour is to pin precise versions; and norms among java programmers don't favour using libraries for one-liners like left-pad - meaning there are fewer people in a position to launch a supply chain attack.
The dam still has cracks in it, but there are fewer cracks and some have sticking plasters over them.
> and norms among java programmers don't favour using libraries for one-liners like left-pad - meaning there are fewer people in a position to launch a supply chain attack.
Javascript is the only ecosystem I've seen people doing stuff like that. I know this is going to sound elitist, but maybe the problem is that the bar for learning javascript is low, and the incentives for a javascript developer to improve themselves are also low. You can get away with lazy and bad practices for your entire career, even as a senior full-stack developer. Typescript kinda raises that bar a little bit, but not by much.
I don't think there's a solution to that particular problem, short of deprecating javascript once WASM reaches a point where it can fully replace it. But even in that scenario, we'll probably start to see JS interpreters ported to WASM anyways.
Maybe those "No Code" products are the solution? Replace all of those JS web/app developer positions with people trained on specific No Code platforms that require basically the same amount of programming knowledge, but outsource things like security and architectural decisions to the platform.
The bar to learning is low, the payout in the industry is high (everyone wants a web site, web service, or web app), and (key in this problem-space) the JavaScript standard library is basically a tiny raisin of functionality.
It's not so much "incentive to improve self is low" as "it doesn't make sense to rewrite something that exists," and since JS developers, to stereotype, tend to be extremely online, they will tend to solve problems by asking "Is this written yet?" instead of writing Yet Another YAML Parser.
Let's take a more charitable view: we know that most dev newcomers flock to JS and expect JS to have things like left-pad.
While being a newcomer is not a bad thing in itself, unfortunately there is a bunch of expectations and things learned in the JS ecosystem that are not aging well.
I don't think 'raising the bar' is the answer as much as going "No Code/Low Code" is. People running companies that depend on newcomers will get what they pay for.
Companies that hire people with experience won't notice a thing.
Since left-pad, a lot of people in the trenches of JS have been learning that adding any dependency might cost a lot, and now we see more and more of that.
I never remember hearing about this sort of behavior in the CPAN scene. There's something different at a cultural level with Node. With Node, there have been several instances of otherwise competent coders destroying their own work to make a statement.
> it would be very easy to introduce a malicious Maven package
From the perspective of a cybersecurity researcher, this is just not true. At a minimum, it's not true in the same way.
Node executes arbitrary code on install. The best an attacker can do in Java is execute arbitrary code at runtime, and even then only insofar as the SecurityManager policy the developer has configured permits it.
This is a massive difference that I believe people need to be at once more aware of, and more wary about. Don't believe the hype. Don't let people on the internet telling you there is no difference lull you into a false sense of security.
If you work on a machine, or on a project where security is important, check your dependencies people.
I'd wager 99% of Maven projects run unit tests on build, so I'm not sure the distinction between install-time and run-time is all that meaningful.
And the Security Manager might have been relevant back in the days of Java Applets and Web Start, but I've never seen it used outside of the OpenJDK test suite - and certainly not for protection against malicious code.
> I'd wager 99% of Maven projects run unit tests on build, so I'm not sure the distinction between install-time and run-time is all that meaningful.
Most Java projects don't build their dependencies from source though (unless it's a local project included via gradle/maven). So yes, unit tests run when dependencies are built, but nobody is building dependencies when their web app gets built.
But if a library is among your dependencies, I'd wager you're going to call some of its functions.
So you run a maven build, maven retrieves the library, maven runs your tests, your tests call functions from the library - and the library code you've just downloaded gets run.
There's a few reasons that NPM sees more attacks than other ecosystems.
First, the scale of the JavaScript ecosystem. JavaScript is so much larger than every other ecosystem, so even a very small probability event (somebody introducing malware into a package) can happen surprisingly often given the scale of the ecosystem. Supply chain attacks are a problem in all open source ecosystems – not just JS – but they are a bit rarer and don't affect as many people so fewer people take note.
Second, npm was one of the first package managers to solve the classic "dependency hell" problem. In Python, if you have two dependencies, A and B, which both depend on different versions of C, say C@1.0.0 and C@2.0.0 respectively, then you're in trouble. You have a broken project. Python can only install one version of C. So now you're in dependency hell.
Npm on the other hand just installs both versions of C and it gives A the version that it wants, C@1.0.0. And it gives B the version that it wants, C@2.0.0. Both packages are happy - problem solved.
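On disk that looks roughly like this (a simplified sketch; modern npm hoists packages where versions don't conflict):

    node_modules/
      A/
        node_modules/
          C/    <- 1.0.0, the version A asked for
      B/
        node_modules/
          C/    <- 2.0.0, the version B asked for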
This caused Python maintainers to think twice before adding a new dependency lest they cause "dependency hell" for their users. Much better to just copy paste these 50 lines of code rather than adding a dependency. So there was an intrinsic sort of resistance – some pain is involved in adding new dependencies.
Npm maintainers had no such constraints. In a way, npm’s better developer experience led to the whole module ecosystem scaling "too well". Thus, you end up needing to trust more total maintainers, increasing the risk of supply chain attacks.
- Bigger packages in java, developed by big organizations one can trust, which seldom depend on small ad-hoc packages.
- Installing a dependency with maven is downloading a jar file. In npm you often need to run arbitrary code as part of installation, making the attack surface far greater.
- Projects usually use a specific version of dependencies, so no updates unless explicitly wanted.
- In theory, a SecurityManager can also be used to give code from different libraries different permissions. I've not seen it used much in practice, though - only for plugin systems.
> In npm you often need to run arbitrary code as part of installation, making the attack surface far greater.
This really is the key. It makes it so you can't even really compare JS to java in an intellectually honest way. It's just not even close to the same. One downloads a Jar that will only ever execute at runtime, whereas node downloads arbitrary code that will execute on your machine immediately if the attacker so chooses. Not only that, but a user may legitimately not even know that node downloaded that module. The dependency tree is so ridiculous that the user would have to look through it with a fine-toothed comb to spot the unimaginably big security hole.
On the one hand, yes, the user should have looked through their dependency tree, familiarized themselves with what was in those dependencies code-wise, and known what they were doing. On the other, come on man. That's kind of like those 100-page EULAs that take away all your rights. I'm not sure it's reasonable to expect everyone to read those as carefully as you'd need to read them to avoid the problem?
The difference between code running at install-time and at runtime is not that big, all things considered. How often do you install a dependency without intending to run it almost immediately afterward?
It's an order of magnitude difference. Code running at install time is almost 100% guaranteed to run. So a transitive dependency 10 layers down is just as dangerous as any other.
At runtime, however, you need some code path to actually hit some part of that library / import it to be affected.
> The difference between code running at install-time and at runtime is not that big
It is in java, and rust, and other languages that have securitymanagers or make security guarantees. Node is running code not only in a context that the dev never intended code to be run in, but also a context the dev has no control over. In rust or java, (or a lot of languages actually), code only runs in a context controlled by the dev.
I mean, in the worst case, with node, you may not even get the opportunity to run the app you were trying to install. The module may just own you at the outset. The dev would be powerless to stop any malicious behavior in the library.
From a security perspective, these are huge differences.
Most packages you'd pull from maven are developed by large companies or foundations like apache. The dependencies you're pulling are simple jar files that get loaded at runtime and don't execute anything at build or install time.
I have forgotten how maven works, but in npm at least there's the post-install script that lets a package run anything after the dependency is downloaded. So that means package creators can run any code they want on your machine.
I don't really remember if maven has something similar, since it's been years since I did anything in the JVM ecosystem, but I think some package managers (like composer, if I remember correctly) don't give this opportunity.
But since node doesn't have a large standard library, it means people will reach for third-party packages for stuff that is a small task in most languages / runtimes.
Having a post-install step available for malicious use isn't really the core problem. As a programmer downloading a dependency, you're allowing the dependency full control of your system (in most languages; Deno seems to try to address this at least) at runtime (at least), so it could do whatever it wants as soon as you include the dependency in your application and run it once.
True, true, but at least you will have to check the API and start integrating the library yourself. The chances increase that you will see something weird about it the more you have to look at it.
Throw apt/snap/pacman/whatever into the mix and the answer is still "nothing". People act like package managers are somehow the be-all and end-all, but they're no more secure than going to a random official site and downloading some shit; they've just streamlined the process somewhat.
In fact the latter is probably more secure, since the more the packages depend between each other the worse it gets. One random dependency can be hijacked and will be autoinstalled everywhere. Or someone can delete it and break half the internet as we've seen time and time again.
You're kind of right, but there is a difference between the default repositories used by apt (Debian & Ubuntu) and pacman (Arch) and things like npm, in that they are indeed reviewed, and you have some guarantee that packages won't disappear overnight because of the organizations behind them. With npm, anyone can publish/unpublish without any sort of review, while packages in the default repositories are reviewed by others.
The difference is that non-malicious NPM package authors are trying to destroy you with saturation attacks (throw a huge mass of packages at you so you cannot possibly check all of them) so that malware can slip through more easily.
Currently working on this general problem for a large corp: java & js are our two main languages (alongside a lot of python & go, small amounts of swift, groovy, kotlin & c, and some very very old php). Trust me when I say Maven/Gradle etc. are orders of magnitude more painful to solve for than others.
Nothing at all unique to NPM about supply chain risk.
Fwiw, I find Composer to be one of the better of the lot.
How tractable is it to proxy the npm package sources?
Were I to try and solve this as an enterprise project, that's the first thing I'd try: have a team declare the specific subset of packages we have hand-vetted and host them off a corporate package manager. Our software builds from only those packages; if devs need more, they petition to get them vetted. If they need new versions, they petition to get them updated. Our team keeps an eye out for hotfixes and periodically might mandate an upgrade if a vulnerability comes around.
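Mechanically, pointing builds at the corporate registry is the easy part - a one-line .npmrc per repo (hostname here is just a placeholder):

    ; .npmrc at the repo root - all installs resolve against the vetted mirror
    registry=https://npm.corp.example.com/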
We do, but auditing the mirrored sources is still not straightforward. It's a trade-off between completely blocking all production builds on CVEs and fully transparent mirroring, one we're still trying to balance. It's also expensive - proprietary SaaS offerings in this space are not particularly competitive, and managing it in-house is intensive.
In terms of mandated hand-vetted packages, unless your hand-vetting team is inhumanly well-resourced, you're looking at a stifling corporate environment for engineers there, and/or a lot of attrition. Again, needs to be a balance between central control and autonomy.
Our current approach is simply auditing packages we know are deployed in production and assigning tickets to update/remove within a fixed time period, rather than blocking deployments completely. Probably looking to block select deployments based on criticality in the near future, but again distinguishing between a theoretical exploit and one our code triggers in practice is still pretty difficult to automate without significant noise. And the other issue here is differentiating newly discovered vulns (already in prod - blocking deployment doesn't help) vs newly introduced vulns (not yet deployed).
Author here. I totally hear the criticism. I had "Capabilities for nodejs packages" or something like it as a draft title for the post. But it's a boring title. And I believe this article would have had far less reach with a name like that.
I hear you. I don't believe the answer is to just sensationalize everything to 110%. But I personally absolutely can't stand boring writing. Like, holding my attention on anything which isn't at least a bit human and engaging feels like torture. My job is 70% researcher, and one of the worst parts of my job is reading academic papers - I just can't hold my attention on a boring paper for more than a few sentences. My old supervisor used to tease me about using 2 weeks in the lab to save 5 minutes in the library. I think he's more right than he knows.
So, I really don't have a good answer. What feels like sensationalist clickbait for you might seem like a catchy title to me, with a promise of engaging writing for someone else. And what sounds like an accurate title for someone else might turn me off completely because of its blandness.
I suspect there's no middleground here where everyone is happy, and at the end of the day when I write it's up to me and my judgement. And you, the audience, should complain more loudly if you feel like I'm taking it too far. I need that, because I only have my own judgement for calibration. But that's pretty unsatisfying!
We need one or a few well-maintained standard libraries for Node and/or browser JS.
Those must not depend on any other libraries. And then packages can reduce their dependencies vastly, by just referencing one standard library that provides a lot of features.
JS's small standard lib is the cause of some of these issues.
When I look at one of the JS projects I have worked on there is a non trivial amount of 'is-*' packages whose only job is to identify the type of the object.
Exactly. Somebody would need to put together a standard lib, maybe even by repacking some already de-facto standard libraries. But only one library per category (one „is“ library, one date library, …). It’s a very opinionated task, but it could really help.
The small stdlib is part of it. The fact that identifying an object's type is so difficult is another part of it. It's also more foundational to JavaScript.
I was exploring an actual implementation[0] of a capabilities feature in Node.js, utilising seccomp (via libseccomp), on Linux at least, to achieve a greater degree of security than might otherwise be possible by remaining in userland code. The idea is that you'd write your code, import whatever you like, and define your capabilities upfront at initialisation. The problem is there's quite a big disconnect between what you are doing in JavaScript and what's happening with system calls in V8, libuv and the other native parts, so it's difficult to predict what you need to block and what's actually going to happen. So I don't think my approach is really viable in a general sense, although capabilities in general would, I think, improve the situation if the wider community were to adopt the approach.
This article made me wonder about packages calling user code. Using express' example, what if I don't want to give it filesystem access but need it in one of my callbacks? Is the engine capable of distinguishing my (privileged) code from express' (unprivileged) code?
Your callback would have your capabilities in scope, so your user code would still be able to do whatever you want, even when called by an untrusted library.
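A minimal sketch of why that works (names like `fsCap` are illustrative, standing in for whatever capability object the proposal would hand your code; they're not a real API):

    // Your privileged code creates the handler and closes over the capability.
    function makeHandler(fsCap) {
      return function handler(req, res) {
        // express (unprivileged) merely calls this function; it never sees fsCap,
        // but the closure can still use it.
        const page = fsCap.readFileSync('./template.html', 'utf8');
        res.end(page);
      };
    }
    // app.get('/', makeHandler(fsCap));  // wiring it up, schematically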
This is similar to electronics supply chain. If you source from shady component distributors, you’re going to get bitten with something like a faulty capacitor with an annoying frequency in volume. The difference here is that a single vulnerability can take down your entire app or worse. It’s like sourcing components that can potentially do irreversible fire damage to your customers.
This is why they rely on trustworthy distributors and manufacturers. Without payment incentive and with high expectations for free stuff, this is not possible to solve.
Billion dollar companies building on people’s hobby projects. In any other industry, this would be unprofessional.
No, Go solves this problem and Deno explicitly copies its design. Just use Go. Seriously. The stupidest thing npm still does today is naively assume SemVer is an appropriate strategy for automatic updates.
There is no such thing as an appropriate strategy for automatic updates for developers. For end users? Sure. Because the assumption for them is that things are stable.
Developers need to check. Developers need to pin.
If you’re working in the Node.js ecosystem the solution isn’t to throw the baby out with the bath water and use Deno, it’s to `--save-exact`.
I’m sure as hell not throwing away all of the hours invested in now stable JavaScript software.
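For reference: npm's default save prefix is the caret, so a plain `npm install lodash` records something like `"lodash": "^4.17.21"` (any 4.x at or above that version), while `--save-exact` records `"lodash": "4.17.21"`. You can make exact pins the default per project with one line of npm config:

    ; .npmrc - record exact versions instead of ^ ranges
    save-exact=true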
It's hard to make it practical without linking a whole lot of your local environment in. Remember it's death by a thousand cuts - every time you need some new thing, you just add it without considering the consequences too much. Probably lots of people doing this with their whole home directory linked in read/write.
Recently I got a little concerned about this and made myself a basic safety harness with the bubblewrap[1] tool: rather than going all out, I just lock the mount namespace to readonly for everything except the directory I execute it in. Which is at least some protection against system mods or wide-spread home directory destruction.
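Roughly this shape (a sketch; exact flags depend on what the build needs):

    bwrap --ro-bind / / \
          --dev /dev --proc /proc \
          --tmpfs /tmp \
          --bind "$PWD" "$PWD" \
          --unshare-all --share-net \
          npm install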
What's a lot more of a problem is trying to protect truly vital files - i.e. SSH keys and the like - which are also things you're likely to have bound into your VM anyway. selinux is a much better solution there (but so hard to administer as to be almost useless, though I do really like Fedora's default scopes and have used them successfully).
> Is it uncommon to use dedicated VMs for development for this very reason?
... or on the production deployment's computer (which presumably is also "your" computer, and has a similar set of problems). (... or, if you go there, in production inside a VM, but in the same context and with access to everything from and all the memory and capabilities as the rest of the code you wrote, such as access to networking and the database or arbitrary CPU utilization.)
These days, for performance reasons most of the old "development VMs" (=Vagrant or worse) got replaced by Docker containers - orders of magnitude less effort.
And in any case a dedicated VM is not going to protect you against attacks on your network, unless you go the full route of using a VPN to provide internet connectivity to the VM, and let's be honest almost no developer is going to do that simply because how much effort and maintenance it requires.
When I was still a consultant I spun up new VMs for each client engagement -- both macOS and Windows -- using Parallels. When the engagement was over I would dump that to an external HDD. It meant over-spending on my MacBook Pro to have the SSD and RAM space to support multiple VMs but there was never any question of client work corrupting my system (or vice versa) or needing to wipe and reload my machine after a client engagement.
Codespaces is great in my experience. VSCode devcontainers is another very similar approach that will keep it on your machine but at least containerize the development environment raising the bar fairly significantly for a rogue package to do serious damage on your machine.
Seeing how there are dozens of package management toolchains out there, why is NPM so uniquely absolutely horrible?
The only behavioral difference I see is that with most other package tools, installing a dependency pins the specific version of the dependency you installed.
So why does NPM instead pin "any version at least this, or greater"?
Feels like a decision made when the tooling was young to help improve reliability, one that has since created this nightmare. Is Yarn better about it?
I don’t think it’s uniquely bad. For example, Python’s story is just as bad.
The Java world is the best I know of, but even relatively modern stacks like Rust are still making the same errors long-solved in the Java world.
For example, nothing stops me publishing a google-grpc crate today (first come first served on names and that name’s not taken and i don’t need to prove ownership of my domain like in the java world).
C# and nuget is a blind spot for me so I don’t know. Go gets it right arguably even better than Java. Deno borrows the same decentralised idea.
The whole situation is a house of cards right now though. I'm genuinely surprised we haven't seen more specific targeting of high-value maintainers. Github is arguably ahead of the curve here a little bit (e.g. the ability to flag and act on unsigned commits).
I guess so - there have certainly been enough examples of various attacks, from typosquatting and dependency confusion right through to crypto mining and exfiltration of users' data from their machines.
I'm not sure that it is. I think it's a numbers game. I've read* that npm is the largest repository in the world. More actors means more good and bad actors.
> JavaScript is so much larger than every other ecosystem, so even a very small probability event (somebody introducing malware into a package) can happen surprisingly often given the scale of the ecosystem. Supply chain attacks are a problem in all open source ecosystems – not just JS – but they are a bit rarer and don't affect as many people so fewer people take note.
There is a flag you can override to be less aggressive, and it's in our onboarding docs, but only because I added it after a conversation with the only other individual on our team who was using it.
It's the classical situation of the framework having defaults that end up being actively hostile to developer success. These Primrose Path situations get under my skin like few things do.
I think making the capabilities the responsibility of the app (i.e., having the app handle tokens and pass them around) won't work, for a couple reasons:
(A) it's tricky for the app to protect itself. The executing app code tells the system what the executing app code is allowed to do. It's tricky to get right -- you have to manage the trust boundaries internally yourself, and it will be easy to get wrong.
(B) capabilities tend to depend on the environment, not just the app. E.g., production vs. staging vs. dev environments may well want different capabilities.
(C) capabilities are probably best expressed declaratively, but putting tokens into APIs makes the app handle them procedurally. Once an app's capabilities need to go beyond the trivial, this will really explode the complexity, making it brittle and very difficult to handle correctly. E.g. suppose you use express... how would it know how to pass tokens to its dependencies? It would sort of need to understand your app's security model, which it can't do. What would actually happen is express would presume a certain model and pass the tokens it receives accordingly. The app dev would adjust the tokens they pass to express until it works. So that at least works, even if responsibilities are mixed up. But now express can't make internal changes to the way it calls its dependencies without risk of breaking apps that take the update.
Anyway, I'm thinking the capabilities need to be specified external to the app.
Say, in a config file (or files) of some sort. The app could only be allowed at most read access to the capabilities file(s); (A). There are some pretty common and straightforward ways to customize config for different environments; (B). A config file lends itself naturally to a declarative form; (C).
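As a purely hypothetical sketch (file name, keys and permission strings all invented for illustration):

    {
      "$comment": "capabilities.json - readable (at most) by the app, varied per environment",
      "capabilities": {
        "express": ["net.listen"],
        "body-parser": [],
        "my-logging-lib": ["fs.append:/var/log/myapp"]
      }
    }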
Author here. That could well be a better design - especially if we need to explicitly "bless" packages anyway.
One tricky part about that approach is it generates some weird semver problems. Let's say package A uses package B to interact with the filesystem. Package B has some problems, so the author of package A replaces B with B2 (a fork of B).
From a semver perspective, this is totally fine because the exposed API of package A hasn't changed. And this is also true with the capabilities system I explained. But how would we do it in package.json? If the root package needed to explicitly bless B2 instead of B, that means package A must have broken semver compatibility. Maybe each package expresses the permissions its direct dependencies have, and it sort of ripples out getting more specific in the dependency tree?
I think it's a good idea, and there probably is a solution here somewhere. But I'm not quite seeing it. Want to write up a sketch of how you imagine that working?
This is a noble proposal but it requires far too many systems to collaborate on adopting it.
A capabilities system based on optional package metadata seems easier to introduce (think about how typings have been able to progressively layer on top of existing packages)
Every time I read anything about JS, WAT comes to my mind, and that explains why everything in the web UI domain would require a complete bottom-up rewrite.
And of course that is why it will never happen. We just keep digging deeper into the mud, same as with the climate...
Of course this is a problem for all package repositories. Ruby, Python, Docker, Node, etc. But Node seems especially bad. Why? It seems to me that Node packages are mostly the work of a single person, and lack a community around them. You know you can trust Nokogiri because there are many developers working on it, and they're visible and accessible on Twitter et al. But many Node modules are just one person's passion project and/or resume padder. It's too easy for someone to have their npm publish keys stolen and not notice, whereas if there were more people on the project someone would probably notice sooner.
Many C libraries are maintained by a single person, too. But C/C++ programming has no equivalent to the Node culture of making trivial releases, so every single update to a C package gets quite a bit of attention from its users.
A capabilities system like pledge could be a way to more safely use _existing_ packages. However, I think it's not a very nice way to continue. Every application will end up doing its own capability pledging, and mistakes will be made. A lot.
Another approach could be to use an effect system like PureScript does. The main problem with Node.js packages is that any function you use can execute arbitrary code (such as wiping systems whose IP is from the Russian region). With an effect system in place, the library author has no means other than to come forward with the side effect, or the code won't compile.
Hmm, where to begin? This is an old idea. It has all been tried before in the JVM world and yet support for it is now being removed, which is in my view a pity given that Now Is The Time. But the problems encountered trying to make it work well were real and would need to be understood by anyone trying the same in the JS world.
Understand that Java had it relatively easy. Java was designed with a sandbox as part of the design from day one, the venerable SecurityManager. The language has carefully controlled dynamism and is relatively easy to statically and dynamically analyze, at least compared to JavaScript. The libraries were designed more or less with this in mind, and so on.
So what went wrong?
Firstly, the model whereby you start with a powerful "root" capability and then shave bits off doesn't have particularly good developer usability. It requires you to manually thread these little capabilities through the call stack and heap, which is a nightmare refactoring job even in a language like Java let alone something with sketchy refactoring tooling like JavaScript. Lots of APIs become awkward or impossible, something as basic as:
var lines = readFile("library-data.txt");
is now impossible because there's no capability there, yet, developers do expect to be able to write such code. Instead it would have to look like this:
function readFile(appDataPath) {
  // The capability (appDataPath) has to be threaded in explicitly.
  var file = appDataPath.resolve("library-data.txt");
  return file.readLines();
}

var lines = readFile(rootFileSystem.resolve("/app/data"));
Can you do it? Yes. Does it make code that was once concise and obvious verbose and non-obvious? Also yes.
Consider also the pain that occurs when you need a module that has higher privileges than the code calling it (e.g. a graphics library that needs to load native code, but you don't want to let the sandboxed code do that). In the pure caps model you end up needing a master process that "tunnels" powerful caps through to the lower layers of the system, breaking abstractions all over the place.
Secondly, this model means you can never add new permissions, change the permissions model or have different approaches because refining permissions == refactoring all your code, globally, which isn't feasible.
Thirdly, this model imposes cap management costs on everyone even if they don't care about security because they know the code is trustworthy e.g. because their colleagues wrote it, it came from a trustworthy vendor, or because it'll run in a process sandbox. Even if you know the code is good it doesn't matter, you still have to supply it with lots of capabilities, you still have to implement callbacks to give it the capabilities it needs on demand and so on.
These problems caused Java to adopt a mixed capability/ambient permissions model. In the SecurityManager approach you assigned permissions based on where code came from and stack walks were used to intersect all the sources on the stack. Java also allowed libraries to bundle data files within them, and granted libraries read access to their resources by default. That solved the above problems but introduced new ones, in particular, it lowered performance due to the stack walking, plus now library developers had to document what permissions they needed and actually test the code in a sandboxed context. They never did this. Also the approach was beaten from time to time by people finding clever ways to construct pseudo-interpreters out of highly dynamic code, such that malicious code could get run without the bad guy being on the stack at all.
Fourthly, it's dependent on everyone playing defense all the time. If your object might get passed in to malicious code, then it has to be designed with that in mind. A classic mistake:
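The code for the example didn't survive into this comment, but a JavaScript sketch of the kind of bug being described might look like this (the class and method names are made up for illustration):

class CommandRegistry {
  #commands = ['help', 'version'];

  // Intent: callers may read the list of commands, but not change it.
  getCommands() {
    return this.#commands;                        // oops: hands out the live internal array
    // return Object.freeze([...this.#commands]); // what was actually intended
  }
}

const registry = new CommandRegistry();
registry.getCommands().push('format-disk');       // sandboxed code just mutated internal state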
The author's intent was to make an object in which you can read the list of commands but not write them. But, they're returning the collection directly instead of using an immutable wrapper. Fine in normal code, but oops, in sandboxed code now you have a CVE. Bugs like this are non obvious and the tooling needed to find them isn't straightforward. These bugs are a drain on development.
Fifthly, Spectre attacks mean that a library that can get data to an attacker via any route can exfiltrate data from anywhere in the process. You may not care about this, and for many libraries there may be no plausible way they can exfiltrate data. But it's another sharp edge.
Finally, it all depends on the ecosystem having minimal native code dependencies. The moment you have native code in the mix, you can't do this kind of sandboxing at all.
Now. All these are challenges but they don't mean it's impossible. Sandboxing of libraries is clearly and obviously where we have to go as an industry. The Java approach didn't fail only due to the fundamental difficulties outlined above - the SecurityManager was poorly documented and not well tuned for the malicious libraries use case, because it was really meant for applets. After the industry gave the Java team so much shit over that, they just sort of gave up on the whole technology rather than continuing to iterate on it. It may be that a team with fresh eyes and fresh enthusiasm can figure out solutions for the above issues and make in-process sandboxing really happen. I wish them the best, but anyone who wants to work on that should start by spending time understanding the SecurityManager architecture and how it ended up the way it did.
Author here. Thank you so much for this summary of Java's approach. I learned Java when I was a kid in the 90s, and I remember seeing some SecurityManager stuff in the Java standard library and having no idea what it was or why I would want any of it. It's funny to think that decades later I would propose re-inventing it.
As for the code, surely something like this could work?
var lines = readFile("library-data.txt", capabilityToken);
But yeah, even in the example in my post the capability tokens are annoying and feel cumbersome.
Another poster in this thread suggested maybe expressing capabilities in your package.json file. Maybe when you pull in a dependency you can say "oh, and rather than inheriting all my capabilities, only give this library access to capability X". That would provide a nice ramp, but there's a whole new set of problems that way, since you'd need to be able to express something like "the capability you need for the redis client is network access to this specific IP address". And that specific capability needs to be passed all the way through the dependency tree to whatever finally opens the socket.
Expressing this in a granular way in code is easy, but noisy. But if we do it in package.json, maybe that's not going to be expressive enough.
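To make that concrete, a purely hypothetical package.json shape (none of these fields exist in npm today) might look something like:

{
  "dependencies": {
    "redis": "^4.0.0"
  },
  "capabilities": {
    "redis": { "net": ["10.0.0.5:6379"] }
  }
}

Even then, npm (or node) would have to thread that grant down through redis's own dependencies to whatever finally opens the socket, which is the hard part.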
Anyway, like you, I hope someone smart takes the time to have another stab at this. The security model where we trust all software engineers is obviously breaking down at this point. Short of a model like this, I'm not sure how we can really solve this problem at all. In any case, thank you for sharing your wisdom.
Re: your example. What is "capabilityToken" in this case? What does it grant you, precisely? Is it a directory? A file? Something else? The classical approach to using caps with files is you create a File type of some kind, which encapsulates the permissions and lets you derive from it e.g. sub-directories, files in that directory but not navigate up the tree. Or it's associated with some whitelist of files.
For that to work you need not only a carefully designed set of types but also they must be able to protect their internals. JavaScript historically hasn't had this, I don't know about modern versions, but the ability to restrict monkey-patching, reflection over private fields etc is a must.
> For that to work you need not only a carefully designed set of types but also they must be able to protect their internals. JavaScript historically hasn't had this, I don't know about modern versions, but the ability to restrict monkey-patching, reflection over private fields etc is a must.
At the bottom of the post I sketched out how we could make this work in practice in javascript. We can use a Symbol[1], and then have that be a key into a Map owned by the builtin capabilities library. That would make the token itself safe from being messed with.
But as long as the capabilities library uses that object as a key in a JS Map (with the value being the token's scope), we could just as easily use anonymous objects or something else as tokens.
I think the issue is more the code that uses the capability itself. Like, if I can just read the capability straight out of the object that owns it, or monkey-patch the definition of some other object it calls into so I can use its capabilities indirectly, then you still lose. That's what I meant by playing defense all the time. If you give a bit of sandboxed code a generic utility object, it can all go wrong.
The idea here is that there are two things: the token (a Symbol() or something) and the scope of capabilities which that token gives you. The capabilities themselves are stored in a Map that you don't control. JavaScript function scopes give us everything we need to hide that map and make sure nobody can modify it. The only methods which are exposed are things like getScopeForToken(), which reads from the map (and does a deep clone) then returns that scope object.
In privileged methods like fs.writeFile(), you don't pass the scope. You pass the token. And that method would explicitly go and check if that token has the scope that it needs to write to the passed path.
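As a rough sketch of what I mean (the module and function names here are invented, not an existing API):

const capabilities = (() => {
  // Hidden inside this closure; nothing outside the module can read or replace it.
  const scopesByToken = new Map();
  return {
    createToken(scope) {
      const token = Symbol('capability');
      scopesByToken.set(token, Object.freeze({ ...scope }));
      return token;
    },
    getScopeForToken(token) {
      const scope = scopesByToken.get(token);
      return scope ? { ...scope } : undefined;  // hand back a copy, never the stored object
    },
  };
})();

// A privileged method takes the token, not the scope, and does its own check.
function writeFile(path, data, token) {
  const scope = capabilities.getScopeForToken(token);
  if (!scope || !scope.write || !path.startsWith(scope.writeRoot)) {
    throw new Error('no capability to write ' + path);
  }
  // ...do the actual write here...
}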
But I do hear you about playing defense. I mentioned it in the post - there's probably a bunch of subtle ways you could use javascript to mess with things. Covering all of these cases would need some serious rigor.
It was trying to implement capabilities in JavaScript, but failed because JS was too dynamic at the time. It might be that newer language versions have made it possible but it'd be worth researching why they gave up on it.
Caja was designed by Google research scientist Mark S. Miller in 2008[3][4] as a JavaScript implementation for "virtual iframes" based on the principles of object-capabilities. It would take JavaScript (technically, ECMAScript 5 strict mode code), HTML, and CSS input and rewrite it into a safe subset of HTML and CSS, plus a single JavaScript function with no free variables. That means the only way such a function could modify an object, was if it was given a reference to the object by the host page. Instead of giving direct references to DOM objects, the host page typically gives references to wrappers that sanitize HTML, proxy URLs, and prevent redirecting the page; this allowed Caja to prevent certain phishing and cross-site scripting attacks, and prevent downloading malware. Also, since all rewritten programs ran in the same frame, the host page could allow one program to export an object reference to another program; then inter-frame communication was simply method invocation.
I spent some time with one of the Caja developers back in 2010 or so, before it was made public.
From memory, the problem they were trying to solve was a bit different. They wanted to be able to run potentially hostile user-supplied JavaScript code inside the JS VM purely using source-code-level validation. So for example, Caja needed to make sure the sandboxed code didn't access the global object (since then it could escape its sandbox). And because simple code like `(function () { return this })()` evaluates to the global object, they banned the keyword `this` in sandboxed code.
I'm hoping there's a way we can give untrusted code more or less full access to the JS environment, but just limit its access to the rest of the operating system. Javascript was first developed for web browsers, and to this day most javascript still has little to no need to access the rest of the operating system directly.
But JavaScript's obsessively granular modularity works in our favor here. If you look at a library like express, the core library makes vanishingly few calls to the nodejs environment. `app.listen()` is the only method I know about which wouldn't "just work" in this new world I'm proposing, and that's just a convenience wrapper around `require('http').createServer(app)` anyway. All the hard work happens in libraries like express.static, but that's trivially easy to swap out for another package that supports capabilities correctly, if we need to do that.
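For what it's worth, this is roughly what the Express docs themselves say `app.listen()` expands to, so only the application root ever needs to touch `http`:

const http = require('http');
const express = require('express');

const app = express();
// Equivalent to app.listen(3000): the top-level application holds the `http`
// module (the capability), and the express library itself never needs network access.
http.createServer(app).listen(3000);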
A bad library could always be buggy; we can't stop that. I mostly want to stop opportunistic developers from taking advantage of the machines their modules run on, so we can detect them doing nasty things (and stop them). But as a few people have mentioned, this approach might be stuck "always playing defense". The nice thing about Caja is that it was "complete": there were no weird edge cases left over in the language that the sandbox authors didn't consider. That's what I'm most worried about here.
> Lots of APIs become awkward or impossible, something as basic as[...]
I mean, wouldn't you use a `readFile()` function like that by passing in the file handle? So:
var lines = readFile(fs.open("library-data.txt"));
...where, if you're in a library somewhere, `fs` may be a capability to a directory that you've been passed rather than a global granting access to the entire filesystem. This doesn't feel much more awkward than your example of:
var lines = readFile("library-data.txt");
EDIT: I am assuming you have an `fs.open()` that returns a file handle here; Node's doesn't seem to and instead takes a callback as an argument. You get the idea though.
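(Node's promise-based API, `require('fs/promises')`, does return a handle object, for what it's worth. Something like this works today, though it still takes a path rather than a directory capability:)

const fs = require('fs/promises');

async function readLines(path) {
  const handle = await fs.open(path, 'r');   // resolves to a FileHandle, no callback
  try {
    const text = await handle.readFile('utf8');
    return text.split('\n');
  } finally {
    await handle.close();
  }
}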
That's pretty much what I said, no? It gets awkward: now your library can't just load some data table it needs from a file, it has to have either some sort of initialization step where you give it the capabilities it needs or it has to take them in the API call itself.
Now let's say you change the implementation such that it needs a new permission. You have to pass that in, which may well mean passing it in from the root of the app through a long call stack. Quite painful. Programmers like conveniences such as being able to give a string instead of a file handle.
I’m sure programmers do like that convenience, but if the consequence is that we’re giving every library access to everything the rest of the app has access to, I don’t think that’s tenable long-term.
> Now let's say you change the implementation such that it needs a new permission. You have to pass that in, which may well mean passing it in from the root of the app through a long call stack.
Sure, but put another way: you can’t change the implementation of your library to grant yourself more access to the system without the calling application being aware of it. Is this potentially inconvenient? Sure. But it does mean that the developer of the calling program knows pretty dang well what access they’re handing over to the library.
Vanilla JS is not something we should be afraid of. I wish we’d consider it more often in projects.
The latest generations, ES2020 and newer, are pretty useful and pleasant to use. If you leverage their features, it goes a long way. Implementing the odd missing function yourself along the way is perfectly doable if you embrace TDD.
My feeling is that people in the JS ecosystem tend to overestimate how much time they’d allegedly waste re-implementing stuff, and grossly underestimate the true cost of deeply-nested dependency trees.
Sort of a have-your-cake-and-eat-it-too mentality if you ask me. If you're not willing to build your own JS glue, then you should at least take the time to audit the glue you are pulling in from some thankless developer who probably started this as a hobby project in college and whose work is now the backbone of 10 Fortune 500 frontends.
I'm sorry that they can then do whatever the hell they want with that package but this ecosystem exists because companies want free stuff but don't want to provide back into it.
If, instead of loading library code directly into the same process and heap, each package were loaded and run in its own dedicated process with its own dedicated memory, and you communicated via message passing (a la actors), then this approach would really start to make sense.
But most node packages today are designed to run directly in the same process and memory space as the code that imports them.
Also, any package manager that runs arbitrary code should not be trusted.
I was once part of a startup named Intrinsic. We had built out a complex product that protected Node.js processes with extreme granularity. Not much information exists about the product after it was acquired but the blog posts are still up:
And that's the summary of TFA: sandbox everything!
Except there's always sandbox escapes. OK, those are harder, so sandboxing more is a start, for sure.
There's no panacea here. Dependencies are costly. Open coding is costly. Curating external source and packages is costly. It's all costly. We need to recognize a lot of these costs.
We are spreading trust too thinly.
Node packages should be grouped into superset packages with trust concentrated in dedicated maintainers. It makes no sense to upgrade a lot of small packages every time we run "npm update".
I remember reading articles like this since 2016 or 2017. How come nothing has changed in so many years? Is this issue not important enough? Or is the solution just too expensive/impractical to implement?
I think the machine has just gotten too big; it would require a huge effort. Even the original creator of Node.js, Ryan Dahl, is struggling to create sufficient displacement with Deno. Innovation over safety? Like cheap products, I guess: either you join them or get driven out of business, leaving only the cheap products anyway.
Also, if you want to spend a lot of time, effort or money to vastly improve something, would you put it into JS or would you perhaps focus on another language that has better foundations and hope it catches on, where there is the potential to outperform (in quality and consistency) the JS ecosystem? Perhaps those with more sense move to other ecosystems, such as C#, Go, Java, Elixir? I say this as a long-time user of JS who has recently enjoyed his foray into Elixir.
I made another post here, but on a separate note: some of the developers of the most used software in the Node.js ecosystem, as people, don't deserve your trust either.
For the past decade or more I’ve watched and personally interacted with the personalities of some of these developers and the last thing they seem to have on their mind is the stability of your software. They just do not care.
Many of them are totally willing to throw away years of work that you also built years of work on for you to chase after their new toy.
These are guys that built popular packages in their early to mid 20s. They weren’t thinking about software that lasts any meaningful duration of time.
I think this needs to be solved at the OS level, not the language level.
It's a problem for every language -- packages you download can write to any file or make network connections. It might be a little worse in NPM because of the culture, but the problem is pervasive.
Personally I'm interested in the direction of lightweight containers that behave like executables, and that have composable dependencies. Developing in Docker-like containers can work but there are a bunch of downsides to be mitigated.
What is the solution for people who want to use it for frontend only? Just importing JS libraries that only need to work in the browser, so there's no access to system resources beyond what Node.js or the browser exposes? Any ideas?
There are only two solutions: 1) Trust someone 2) Carefully read through their changes
node_modules should be banned from .gitignore
npm should stop moving folders around. If I have placed a module in node_modules/foo/bar it should stay there! Not get moved to node_modules/foo@1.0.0 etc
If the code is managed manually and not by package.json, I would consider putting such code in a "vendor" or "third_party" directory instead of putting into "node_modules".
You use "npm install" to place modules in node_modules, and "npm update" to conveniently update them. But you have to either 1) trust the maintainers of the modules or 2) review the code changes.
There is already a project that does most of this: Deno. There are of course other motivations that differentiate it from Node.js, but I think this should be part of a new runtime that was built with this in mind; Deno is the way forward. Of course, putting your projects in appropriate jails/zones/containers/VMs is also an option, but that has always been true.
It makes me deeply sad to see these sorts of interactions in open source [1].
> Hmm, I think it's a worthwhile fix. Where did you see malware here?
> I think the author of this repo is free to decide what code he publishes. Say thanks to that it's for free
An incredible number of people have dedicated sweat and tears and foreheads (from banging against the desk in frustration) to open source across the entire stack, from the contributors to OSs such as Linux to those working their arses off to create better frameworks, languages and runtimes, that we can all benefit from and use with a reasonable expectation of security, respect and privacy.
As a university student, I feel privileged to have been able to grow up in a world where so much work and knowledge is provided for free with no strings attached, regardless of demographic/location, I would not be where I am without it. A century ago this would not have been possible. To all of you who have tirelessly and selflessly worked on OSS for others, without expecting anything in return or imposing politics, ideologies, infringing on privacy, causing damage, collecting vast quantities of marketable personal information or monopolisation, I give you my heartfelt thanks for your efforts, you know who you are. You have created something that will have forever helped to improve our society and empower those that want to learn and create their own designs.
From my own personal experience, I want to give a shout-out to the smaller projects of Rust, Svelte and Elixir. I think it's incredible that the work and ideas of (often) a single person (Rich Harris, José Valim) can grow into larger, extremely welcoming and helpful communities with many more motivated contributors who are proud of being part of those projects and put in an extraordinary effort to try and do things better than before. I'm sure there are plenty of other worthy names I'm too young/ignorant to know.
Love it or hate it, Node.js has been very empowering for a large number of people to learn and publish their own full-stack applications. The JavaScript ecosystem has improved enormously since its beginnings, but has a tendency to change slowly due to its size, unless a disruptive technology comes along such as TypeScript. Websites are a great way to introduce people to the joy of programming with their visual feedback: you can make a small penguin move across the screen, then move on to play tic tac toe. Even as a younger developer, I admit that the days of FTP, no-build-step pages with a sprinkle of jQuery were easier to understand and actually safer for newcomers than introducing someone to a SPA stack (which can easily have thousands of transitive dependencies) nowadays.
The reason this keeps happening with NPM is because of absurd number of dependencies in the average node app. I have a tiny app I've been playing with using create-react-app. There are over 800 directories in node_modules. That absolutely dwarfs the number of any other language I've used. Even in a medium sized rails app, you likely have some awareness of what every dependency is. It's just impossible with npm.
This makes it easier for someone to inject their package into the ecosystem whether it's actually very useful or not (like the colors package).
One thought I've had to "reboot" the npm culture is to somehow curate packages that are proven to have minimal and safe dependencies, probably through manual review. Maybe it could be recursive, so that safe projects only rely on other safe projects.
> The reason this keeps happening with NPM is because of absurd number of dependencies in the average node app.
But why does that happen? There are now lots of languages that make it trivial to add dependencies. While I find projects in those other languages to also have too many dependencies, it's nowhere near what happens in JS apps. I'm thinking of projects I've recently worked on in Rust, PHP, and Java. Java projects seem to be a distant second place to JavaScript projects when it comes to willy-nilly dependencies.
It's not a rhetorical question: Why is the culture with JS so much worse about this?
I absolutely hate that I'm going to suggest this, but is it just because of the average skill level and experience of people working on JS projects?
Or is it because JS is such a bug-prone programming language that we're all afraid of actually authoring any more code than we absolutely have to, because we know we'll waste hours debugging things that should be relatively simple?
> You almost have to reach for some utility library or build your own ad-hoc one just to use the language.
Sure. But are hundreds of dependencies really required for this?
In Java you would use tools like Guava or Spring for general quality of life improvements, and there would be a few deps for them (under a dozen iirc).
The solution is for the “top tier” libraries and frameworks in the JS world to be designed with minimal dependencies. And where they do have a need for a dependency, they undertake serious consideration of the best option that minimises dependency hell.
And those "top tier" libraries do exist for some stuff- especially if we're talking about the anemic standard library. The famous lodash library doesn't have any (non-lodash-umbrella) dependencies AFAIK.
You are meant to pass an argument to [].sort; if you don't, it falls back to the most generic thing that makes sense -- which is turning everything into a string and sorting lexicographically.
Arrays in JS can have any type in them, such as [number, string, function]. That's obviously very unlikely, but the only thing in common between all types in JS is that they can all be explicitly turned into strings.
And yes, I agree that throwing an error here for no argument would be better here (a linter WILL enforce this), but this is hardly a critical shortcoming of JS's standard library, and you DEFINITELY do not need a utility library just to use the language (especially for your examples).
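For example, passing a comparator gives you the numeric ordering you'd expect:

[10, 1, 3, 2].sort((a, b) => a - b);  // [ 1, 2, 3, 10 ]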
I hear what you're saying, but it still seems pretty bonkers to me that if you try to sort an array of numbers it will cast them to strings and sort alphabetically (!):
> [1, 2, 10, 3].sort()
[ 1, 10, 2, 3 ]
> And yes, I agree that throwing an error here for no argument would be better here (a linter WILL enforce this), but this is hardly a critical shortcoming of JS's standard library
I suppose "critical" is debatable but this seems very fundamental and very unexpected to me.
> you DEFINITELY do not need a utility library just to use the language (especially for your examples)
I hadn't actually considered using a linter to avoid these types of standard library footguns... that's actually a pretty great idea!
Since arrays can contain a mix of anything, I don't think it's that bonkers that the default impl chose to canonicalize values into consistent comparables with String(). It's just not useful for numbers.
But for example, they probably decided that it was more useful defaulting to having a sort order for things that otherwise aren't comparable:
Since null < undefined and undefined < null are both false, then a simple `a < b` comparator wouldn't sort them at all. Same for objects.
If you have an array that's a huge mix of random values, from null to undefined to [] to {} and you sort() it, all of those values will now be grouped together by type.
There would be a significant performance impact if an argumentless `.sort()` had to traverse the list and check that all inputs were numbers. It's better to pass a sort function -- that is how the API is meant to be used.
I agree that without an argument it should just be a fatal error, but if it were to have any sort of functionality, it should convert all to strings.
This is consistent among JS's apis, and pretty much the origin of all the 'js wut' moments on the internet. Everything in JS can be converted into a string. If you do something stupid with disparate types, it will likely turn operands into strings and compare them that way.
I don't see that as the case. Modern JS can do a lot out of the box but the culture looks down on "vanilla" JavaScript. I've seen way too many libraries that are nothing but a thin wrapper around native functionality. When your first (and only) technique is to look for a library, this is where you end up.
I agree and am very pro vanilla JS FWIW. I just find myself reaching for something like lodash's `intersection`/`difference` functions when working with sets, `sortBy` to get more normal (and not in-place) sorting behavior, and `groupBy` to do group by.
What I think is happening: the JS ecosystem has been flooded by developers with minimal experience and/or education.
They learn engineering principles like "don't repeat yourself" and take that to mean installing an entire dependency to implement left pad is a good idea.
You are probably on to something with JavaScript being bug-prone as a factor in that.
JS as an ecosystem has a really big problem with developers not knowing the value of simplicity.
You can't have simplicity when every other week someone is shoving a new framework down your throat. And in fact you are made to look like a loser if you dare do things in vanilla JS.
This is a big problem in the tech industry, in general. Some weeks back I read a comment that described some behavior as "high intelligence, low wisdom". I believe that fits pretty well here.
People design new frameworks (presumably) because they see an array of problems with existing frameworks. In designing their new framework, they try to address the shortcomings that the existing frameworks have.
What they don't realize is that the problems in existing frameworks were _known tradeoffs_. Now, instead of the One True Perfect Framework, we have yet another framework with its own set of problems.
People think that every problem is solvable simultaneously, but that's simply not true. You can make tradeoffs. And this isn't just true in engineering, it's true in life generally.
Some tradeoffs make sense nearly always. Others only make sense in certain contexts.
An example here is the tradeoff between simplicity and high availability. It doesn't matter what you do -- the simplest high availability configuration for an app will ALWAYS be more complex than the simplest non-HA configuration. You're making a trade here. It's a trade that is absolutely sensible, but it's a trade nonetheless.
The lesson to be learned here is: stop thinking you can solve every problem at once. You can't.
One potentially contributing factor is that npm makes it very easy to avoid conflicts with duplicate dependencies, i.e. if one dependency has a transitive version on some other package, and another dependency also has a transitive dependency on that package, but on an incompatible version, that's not a problem in npm: it'll just store both on your disk. That might remove some pressure to minimise the number of dependencies on library authors.
And of course, there are just far more people working on it, I believe.
This reduces package-installing friction, but this is a good thing anyway -- there's no reason why you shouldn't be able to use foobar@1.0 and foobar@2.0 in the same project. For all intents and purposes they're different packages, viz. the version is just as important as the package name.
This may not be the whole story, but one of the reasons is that JavaScript does not have a standard library. Corollaries are: several different module systems, application frameworks and bundlers exist.
I think that's a big part. The standard library isn't great, and progress is tied to browsers. Additionally, tiny packages became the norm early on. Is even, is odd, is negative zero, left pad, etc.
I don't know all the reasons for that, but I think part of it is that developers create them in order to put it on their resume. "My package is downloaded 500k times a week"
Simply because the blast radius for Java is limited to a set of very high quality libraries -- in terms of code, not functionality. These libraries come from the Apache Foundation, Eclipse Foundation, Google, Facebook, Spring, etc. Literally every single Java application depends on something from Apache [OK, I understand stuff like Log4Shell can still happen].
The same is not true for JS. The most mature libraries depend on absurdly vague libraries that no one has ever reviewed.
I was going to ask the obvious question of why the Java ecosystem ended up differently than the JavaScript ecosystem, but I think I know the answer.
It's a giant pain in the ass to publish a Java library. That's already weeding out a ton of low-effort projects. By itself, I wouldn't exactly call that a good thing, but it seems to have a silver lining...
I’ve mostly worked with JS in the browser and my understanding of Node lacks nuance, but it seems like all of this would be mitigated drastically by building out the standard library. For comparison, here’s Python’s list of standard modules:
They don’t make third party libraries obsolete— for example, I tried using the built-in IMAP library the other day and it’s definitely too low-level to make sense for most quick projects that check email, so I used a third party Library— but all of those modules are vetted, stable, and require no external dependencies.
I believe the node maintainers have staunchly opposed such measures. I don’t know what their reasoning is so I don’t have an opinion on whether or not it’s worth it.
I'm also pretty new to JS but totally agree. Half my Python projects have zero dependencies (except for development tools which don't get packaged with the app) because everything I want is already in the standard library.
It feels like a quarter of the time I spend working on projects that use npm is spent debugging my toolchain because of excessive complexity or weird problems in random dependencies. Doesn't seem like it should be too much to ask that I can spend most of my development time working on actual code.
The process of uploading packages onto Maven Central is... not modern. npm (arguably?) makes this a lot easier than any other language, therefore npm developers do more of it.
The interaction between a language and its culture is really complicated, partially because it's an ongoing iterative process, and thus chaotic, in both the English and mathematical senses of the term.
Not having a standard library made people accept needing libraries for even very small things, where in most other languages developers would make at least some effort to use the standard library before reaching for something else.
I think another aspect is that the initial leadership of a language community sets the tone for a long time, but JS in a lot of ways didn't have that, not through any fault of any particular person but simply because Node grew so explosively at the beginning that the usage growth dominated the available leadership growth, and so there was a lot of very wild, woolly growth that got written into the earliest culture. This creates something a lot like a "seed crystal" that has outsized impacts on future paths for a long time.
Finally, I do think it is definitely an issue that JS is somewhere where you get a lot of people who are not "programmers" per se and they are making big decisions about code bases and libraries. They're young in the art. And while there's nothing intrinsically bad about that, people have to start somewhere, there's a lot of ways in which it's good that JS is relatively easy to get into, etc., it is also absolutely true that at scale, as the language and the libraries iterate on each other and seek out their stable points, that's going to affect the landscape. This is an "is", not an "ought". It is what it is. JS has also continued explosive growth, so even as someone who got into JS and programming for the first time in 2017 is now a 5-year "senior" (a crack about our industry terminology, not the dev here) developer who has learned and might be inclined to do things differently than they did 5 years ago, there's another 2.5 newbies "voting" in the community as well.
It's a hard problem. I salute the leaders in the JS and npm world working on it, I wish them the best, and I advocate giving them grace for working in a very hard situation. But I'm also glad not to be part of it.
I think the reason is JS doesn't have much of a standard library. Java, C#, Ruby, Python, Rust, Go, and many others come with a large library you can use to write non-trivial applications without ever needing to fetch an external dependency. JS, particularly outside of Node, doesn't have that. To get functionality most other languages/runtimes include out of the box, you need to write a bunch of code yourself or pull in a dependency to use someone else's implementation.
It may be more about the quality rather than the quantity, though. Rust's standard library isn't very big. People sometimes complain that we have to fetch dependencies that are so ubiquitous that they are effectively part of standard Rust: crates like rand, futures, bytes, etc.
But even JS's built in string stuff isn't so bad that it somehow justifies leftpad existing, so I don't know...
Can't find it now but I remember an interview or article by Ryan Dahl, describing his original vision of Node.js. He related it to childhood enjoyment of building and piecing things together, and used tinker toys as an analogy saying he wanted code to be the same.
In other words it has been an intentional design decision from the start.
My take is that the lack of experience for the average JavaScript developer is absolutely a factor here. I don't think it's the only factor though. Here are some of the other pieces of the puzzle.
JavaScript's standard library is so thin on the ground that there's already a culture of "reaching for a library" to accomplish tasks that many languages do out of the box.
The monoculture is wide enough that the language caters to lots of paradigms and schools of thought. If there's one library that uses classes and method chaining, you can be sure that another will pop-up to re-implement the same functionality in a pure functional style. One will focus on type safety and another will abuse the dynamic bits of the language to make the code you write as terse as possible.
The amount of code shipped has always been a more important metric for JS than for other languages, because the nature of the web means that users have to wait whilst the source code is downloaded before your page becomes interactive (for a huge class of applications). This encourages developers to favour smaller libraries that solve narrower problem domains.
It's become very trendy to write a smaller, faster, better, smarter version of existing libraries. The JavaScript community loves the process of picking a catchy name, registering a domain, designing a logo, and publishing packages as though they were businesses. This creates an abundance of packages that look great on paper, but with no users, patchy/non-existent tests and maintainers that haven't ever used the code in a professional context.
Finally, I think JavaScript is a deceptively simple language. It doesn't take very long before people (mistakenly) think they're close to mastering the language. By comparison, contributing to an open source project in a meaningful way is quite difficult, so these developers assume that other libraries must be written badly if they find it hard to contribute. Then they write their own, because they believe they can do a better job.
The ecosystem as a whole sees a lot of innovation, and pays for that with a lot of churn and a lot of dependencies. From a theoretical standpoint, it's a fascinating corner of modern programming. In a professional context, it horrifies me and I wish I could sanely cut npm out of the chain.
I think it's the sheer amount of programmers using JS. It's also a very approachable language, so it makes it easy to learn the language before learning good practices.
> While I find projects in those other languages to also have too many dependencies, it's no where near what happens in JS apps. I'm thinking of projects I've recently worked on in Rust, PHP, and Java.
My experience with these new languages is such that this feels a bit unfair. It's like insisting that a disaster with 1000 fatalities is "much worse" than one with "only" 200. It's ... true ... I guess, but there's something uncomfortable about making the comparison. Something has gone badly wrong if the comparison even needs to happen in the first place.
What I'm getting at is that e.g. Rust has an enormous problem in this area. It's not uncommon for me to see Node projects with over a thousand transitive dependencies, but on the other hand, I very frequently see Rust projects with over a hundred. And the Node projects tend to be more complicated than the Rust ones; they do more.
Take the last Rust program I tried to use, tealdeer. [1] If you don't know, tldr is a project that provides alternative simplified man pages for commonly used programs, consisting entirely of easy-to-understand examples. [2] What a tldr client needs to do is simply check a local cache for each lookup, and if necessary update the cache online. It's a trivial problem that can be, and has been [3], solved in a few hundred lines of shell (if you're being extremely verbose). How many recursive dependencies would you guess tealdeer uses? It depends on how you count, of course, but as of today the answer is ~133 deduplicated dependencies! For a program that's a glorified wrapper around curl!
Or another Rust program I looked at recently, rua [4]. In Arch Linux, the AUR is a repository of user maintained scripts for building and installing software as native Arch packages. Official tools for building and installing software already exist for Arch, but it is common for users to use a wrapper around these tools that makes fetching and updating the software from the AUR easier. It's a relatively simple task that (once again) can be done with shell scripts. rua is such a wrapper. As of today it uses 137 deduplicated dependencies!
These Rust programs are simple terminal tools to do tasks that are almost trivial in nature. And yet they require hundreds of constantly updating dependencies! The situation may well be better than what you'll find for Node, but it's undeniably disastrous compared to either simpler languages without a built in package manager (like C) or more complicated batteries-included languages where best practices continue to prevail (like Python).
Package curation exists in other languages, too. C++ has Boost, and Haskell has its Haskell Platform. It helps avoid the pitfalls of languages with large standard libraries (where stability guarantees make "batteries included" turn into obsolete and bitrotting "dead batteries").
This is an idea that every ecosystem eventually realises it needs. Once you've got enough versions of enough libraries that A and B both need C but at different versions, you start to need curation, although that need might not become pressing enough to do anything about for a while. But once you've got curation, cryptographically trusting that curation becomes viable in a way that cryptographic trust of the original packages often isn't.
Putting a layer of "distributions" over language ecosystems, in the same way that "distributions" solve the problem of putting enough mutually-compatible library versions together to get Linux to work, is, I think, inevitable.
> Once you've got enough versions of enough libraries that A and B both need C but at different versions, you start to need curation
Node specifically doesn’t have this constraint, though, as A’s C and B’s C can be loaded independently into the same VM, each one hermetic and only visible to its parent. (This is probably half of why Node’s ecosystem became the way it did, now that I think about it; every other ecosystem hits increasing numbers of constraint-resolution conflict problems as dependency hierarchy depth increases, and so limits itself in hierarchy depth to avoid this.)
> Node specifically doesn’t have this constraint, though, as A’s C and B’s C can be loaded independently into the same VM, each one hermetic and only visible to its parent.
Rust can do this as well, though most version variation is handled via semver rules wrt. compatibility. I think this exact requirement led to some controversy in the Go community at some point? Though they should now have a module system that allows for this?
Another aspect I think is, how much open source dependency do you actually need for your project. You can probably get by with fewer packages than you expect, which also makes it more feasible to run a curated package system.
For example, you can't submit any old Haskell library you put together to Hackage, or you couldn't at the last time I checked. It had to meet certain minimum standards to be considered.
I honestly don't know where you would start in terms of curating NPM, precisely because of the dependency on dependencies. You'd end up curating half of the ecosystem.
Then you have languages like Go, where it feels like the norm is relying on the standard lib, and any dependency really must be providing value to justify it.
My node projects typically have thousands of dependencies, my java and C++ projects have hundreds. My go projects typically have 8-12, and have never once exceeded 40.
Not to stan the Go language itself, I just wish this philosophy was the prevalent one in languages.
Cool, I'm glad there's precedent for it. I'm not opposed to adding guardrails as the post suggests, but it feels like a band-aid on much deeper problem.
> That absolutely dwarfs the number of any other language I've used.
Rust has this flaw too. Last time I wanted to compile a simple project using actix-web and probably a database library, I had 200 crates to compile. In both cases I think it's due partially to preferring small packages/crates.
Note that this separation is necessary if you want to achieve somewhat parallel compilation in Rust. Every proc_macro needs its own separate crate, so I end up providing two crates for macro-based stuff. At least, the stdlib seems to have enough batteries included for the dependency count to not be as high as NodeJS projects.
From my experience I would rate the count of dependencies in this order:
NodeJS > Rust > Python > C/C++
Though I cannot explain why this is ordered this way between {Node, Rust, Python}. Again, batteries included? Language popularity? Beginners-friendly programming language?
I would argue that it's ordered by a mix between popularity, need for those packages and ease of installing packages. JS and Python are around the same order of magnitude of popularity, installing packages is way easier in JS than in Python (NPM might not be great but pip is hell). Rust is less popular but it's very easy to have lots of packages, plus what you mentioned about procedural macros. And the standard library, like JS, is relatively small. If you add Go to that (relatively easy to add package, less popular than Python and JS but more than Rust, lots of stuff in the standard library) which would be higher than C/C++ and lower than Python, it does seem to fit.
I don't know anything about the C#/Java ecosystems, same with Ruby and Perl. I'd like to know if they fit this "model" too, that would be interesting. It could give some pointers on how to design/make a language evolve to avoid having lots of packages.
I'm not sure about the standard library of Rust being bigger. Node ships with an HTTP module, random number generation and regexes; Rust has none of those. Maybe more functions in each module?
> I cannot explain why this is ordered this way between {Node, Rust, Python}. Again, batteries included?
My vote is that its cultural. Both for "batteries included" python, and for nodejs.
It's easy to forget now, but npm was extremely innovative when it first came out. It absolutely went out of its way to make adding dependencies as easy as possible, and the culture (especially in the early days of nodejs) went bananas for this philosophy of programming. I had a chat with @isaacs on a bus one time (he was the maintainer of nodejs at the time). I asked him what he thought about package documentation, and what to do when a README isn't enough. He said that he thinks if a readme isn't enough documentation for your library, your library is probably too big and should be split up.
You still see this today with packages like "isobject" (a tiny function published as a package) which still gets 53M downloads / week[1].
Correct me if I'm wrong, but as I understand it python and ruby still don't support parallel dependencies with different versions in the build tree like npm does. If a python or ruby package transitively depends on foo@1.0 and foo@2.0 then my understanding is that ruby and python lose their minds. And this problem is almost impossible for the end user to solve. So libraries like rails sort of need to be designed as one big block of software.
Nodejs has no problem with this - if you do this in nodejs, npm will just quietly install both versions and node will happily wire everything up correctly. The only constraint is that no single package can have a direct dependency on both foo@1 and foo@2 at the same time. But that's not something that you ever really want to do in practice.
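The on-disk layout ends up looking roughly like this (the exact nesting/hoisting depends on your npm version):

node_modules/
  A/
  foo/                 <- foo@1.0.0, hoisted to the top level for A
  B/
    node_modules/
      foo/             <- foo@2.0.0, nested so it can't clash with A's copy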
The tooling supported this too. It was quite common to have a library with some optional features that not many people used. If you wanted to exclude them from your javascript bundle, until recently the main way to do that was by breaking your library up into small pieces so your library's consumer could pick and choose what they wanted.
Building a dependency tree out of thousands of tiny modules is exactly what nodejs is designed for. It should come as no surprise that that's what we got.
Mentioned in a sibling comment already, but Rust has these features as well, and I think Go ended up getting them too although there was some controversy about the need for them.
AIUI, rustc now has ad-hoc parallelization of compilation within a single crate. The defined compilation unit is still the crate though, as opposed to the single file in C/C++.
It's ultimately Rust's reliance on generic code that forces us to deal with so many packages at build time. In idiomatic C/C++ much of the reusable code you compile ultimately turns into reusable shared objects, that sit in binary form on your fs with their corresponding include files. This is only possible in rare cases with Rust, because the "library" equivalents (crates) don't know how their generic types and code will be instantiated downstream. Everything is the equivalent of a "header only" lib.
Yeah, in language design they might be on the cutting edge. But in software engineering they want to stay close to the cultish software design patterns of the 90s and 00s, the DRY principle, and so on.
> And more of that. But just saying "a lot of dependencies bad" is not useful.
I disagree. The primary problem is that proper dependency auditing is incredibly time-consuming, especially if you want to stay up to date. The reality is that most people simply don't do it at all.
In the below graph, create-react-app has 66 package dependencies and 88 links between packages. The purpose of the graph isn't to disagree with the statement of "800 directories" but to illustrate that part of the problem is depth. The maximum depth (in my hand count) is seven.