I'm pretty sure GitHub warns you every single time you pull from or push to a repository where the organisation or repository has been renamed and you're relying on an alias that is not permanent.
Why is this news, and why does it need a name of "RepoJacking" assigned to it when the behaviour is working exactly as designed?
This isn't a novel vulnerability, and I wouldn't say any novel vulnerability research has been conducted here. Sure, there's some value in doing a code search and finding instances where people have automated scripts etc relying on aliases, but the vulnerability lies within these scripts.
All mutable global namespaces have the same problems.
I didn't know you could get street cred pointing this out one by one.
Let's try! If someone buys an expired domain and installs an SSL cert on it then old scripts will think it's legit. Let's call it phantomain, a combination of phantom and domain.
First thing I thought of too. If you've acquired a company, you'll need to plan on keeping the old domain names for a long time, if not indefinitely. Anyone who can later re-register that domain will now own any email sent there and any other network traffic as well.
No no! That's gotta be Ghomainjacking, two buzzwords in one AND ghosts are waaay spookier! Phantoms evoke too much imagery of incels in masks, singing on boats.
I was also trying to allude to Saint-Domingue, the French colony which France woke up one day to realize it no longer had, and to "sans domain", both of which are way too nerdy. Let's go with yours.
Ghomainjacking, but it's pronounced "fimainjacking", using the "gho" from "ghoti", because it's in the class of phishing attacks that use a collision you didn't realize was possible.
After NPR said they weren't going to post anymore because Musk decided to label them as "State Controlled Media" he made a point of saying, "OK, then we'll give your username to someone else after it expires in a month (apparently not posting for a month now expires accounts?)"
Hmm, did you read the article? One specific attack vector they show has nothing to do with Git or pushing/pulling to repositories. It shows an install.sh which curl downloads a master.zip from a public github repo and executes files within it.
>It shows an install.sh which curl downloads a master.zip from a public github repo
A repo that is an alias to another one. Someone can create this repo breaking the alias and thus being able to serve whatever they want. This is the so-called "repojacking" and what GP is also talking about.
> So would I see this message if it was pulled in a script, by npm, go install, cargo or another such tool?
The repository maintainers would, every single time they updated the code.
For your own scripts, you'd see a 301 Moved Permanently response when e.g. fetching a source code ZIP. It's up to you as to whether you want to follow this redirect silently or print a warning.
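To make the "it's up to you" part concrete, here is a minimal Python sketch of a fetcher that refuses permanent redirects instead of following them silently. It assumes, as described above, that a renamed repository answers with a 301; the names `RefusePermanentRedirects` and `fetch_strict` are made up for illustration, and temporary redirects are still followed because a download may legitimately be redirected to a separate download host.

```python
import urllib.error
import urllib.request


class RefusePermanentRedirects(urllib.request.HTTPRedirectHandler):
    """Fail loudly on 301/308 instead of silently following them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code in (301, 308):
            # Assumption: a permanent redirect means the owner or repo moved.
            raise urllib.error.HTTPError(
                req.full_url,
                code,
                f"refusing permanent redirect to {newurl}; update the URL",
                headers,
                fp,
            )
        # Temporary redirects get the default behaviour.
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch_strict(url: str, timeout: float = 30.0) -> bytes:
    """Download `url`, raising HTTPError if it has permanently moved."""
    opener = urllib.request.build_opener(RefusePermanentRedirects)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

A script using `fetch_strict` breaks the first time the upstream is renamed, which is annoying but forces someone to look at where the code is really coming from.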
Nope. The project I work on went through a governance restructuring and we moved half the repos to a new org. I haven't updated most of my remotes, but they redirect seamlessly, no CLI warnings whatsoever.
The redirect behavior gives users the false impression that the old organization name is still permanently associated with them. Making that the actual behavior seems like the easy solution here.
If GitHub is concerned with people burning through names, they could either limit the number of changes before you have to reach out to support or limit the redirect behavior to just the last two org names.
Or just return a 410 and force folk to deal with the situation when their dependency breaks. I'm not convinced the convenience outweighs the risk. Especially given how many folk are just out here git-cloning random stuff from unvetted "vendors".
There are, currently, millions of exploits that are just ready to be taken advantage of, and we have no idea where they are, no way to detect them (because they could all be way upstream), and no way to stop them. That's kind of a big deal. Like an existential, massive, unsolvable, imminent disaster kind of deal.
It's such a big deal that I don't think anyone's going to take it seriously until the compromises start coming in waves. Just one of these compromises could affect thousands of projects. If there are millions of potential exploits? We're talking, like, the majority of the modern software landscape now has a giant hole in it. Not a potential hole, but, actively, right now, is exploitable.
Another reason to include commit ids in the url when fetching files from external repos. I think you should do this anyways in case the external repo maintainer makes a change that silently breaks your build script
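A rough Python sketch of what that could look like: build the archive URL from a full commit id and reject anything that looks like a branch name. The helper name `pinned_archive_url` is hypothetical, and the URL layout is the usual github.com `archive/<commit>.zip` form. If the name is later taken over by someone else, their repository would most likely not contain that commit, so the fetch should fail rather than serve different code.

```python
import re

# A full SHA-1 commit id: exactly 40 lowercase hex characters.
_FULL_COMMIT_ID = re.compile(r"[0-9a-f]{40}")


def pinned_archive_url(owner: str, repo: str, commit: str) -> str:
    """Return a ZIP archive URL pinned to one commit, never to a branch."""
    if not _FULL_COMMIT_ID.fullmatch(commit):
        raise ValueError(f"expected a full 40-character commit id, got {commit!r}")
    return f"https://github.com/{owner}/{repo}/archive/{commit}.zip"
```

Passing `"master"` or an abbreviated id raises `ValueError`, so a moving reference can't sneak back into the build script.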
We carried out a similar analysis using the Packj tool [1] and found that a large number of repos are vulnerable to supply-chain attacks. We received a bunch of bounties for reporting vulnerabilities. Hopefully, these findings will bolster open-source security.
I've thought about this a lot when building GitHub integrations. It's so so so easy to use a GitHub username or an email as the link to the GitHub account. In fact, I don't even know how you connect an account to GitHub in a persistent manner. Email is slightly better because the odds that someone takes a specific email is pretty unlikely, but hey, domains expire too. Does GitHub give you a unique, guaranteed to never change id that corresponds to an account? They should.
After doing some more research, it really doesn't help that GitHub has a GraphQL query to get a user by login and no obvious query to get a user by ID.
What's tricky is that the GitHub API docs[1] appear to explicitly recommend passing the username and not the ID. Both the GraphQL and the REST versions tell you to get a user by passing a username.
Yeah, they seem really focused on usernames. The sad thing is that you shouldn't just figure out a user-ID-to-username endpoint because that just creates a TOCTOU opportunity. You have to have GitHub accept a user ID directly on the operation you want to perform with that user, or else something could change in between getting the username and operating on that username.
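For what it's worth, the REST user objects do carry a numeric `id` next to `login`, so an integration can at least store the id at link time and compare it on every later lookup. A minimal Python sketch, with `verify_same_account` as a made-up helper that takes the already-decoded JSON of a user lookup. This only detects a username that now belongs to a different account; it narrows the window but does not close the TOCTOU gap described above, since the follow-up operation would still be keyed by username.

```python
def verify_same_account(stored_id: int, user_payload: dict) -> str:
    """Return the current login if the payload is the account we linked.

    `stored_id` is the numeric id saved when the account was first linked.
    `user_payload` is the decoded JSON object from a user lookup.
    """
    current_id = user_payload.get("id")
    if current_id != stored_id:
        raise LookupError(
            f"login {user_payload.get('login')!r} now belongs to account "
            f"{current_id!r}, expected {stored_id!r}"
        )
    return user_payload["login"]
```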
> Also, because the instructions included an "npm install" command for the dependency, the attacker's code would achieve arbitrary code execution on the devices of unsuspecting users.
This is debatable. If the owner unpublishes their old releases, then the npm install would simply fail and the package reference string is burned forever. I don't like the implication this post makes that other package managers don't require code execution to install, or that npm is somehow more vulnerable.
> Registry data is immutable, meaning once published, a package cannot change. We do this for reasons of security and stability of the users who depend on those packages. So if you've ever published a package called "bob" at version 1.1.0, no other package can ever be published with that name at that version. This is true even if that package is unpublished.
To hijack an npm package is to exploit sloppy code review somewhere in the dependency tree. The registry has nothing to do with this.
Being aware of this policy, requiring maintainers to review all changes to "package-lock.json", and keeping this lock file in your repo ("npm-shrinkwrap.json" for releases) entirely mitigates this.
It's also important to regularly run npm audit on all releases and unpublish if vulnerabilities are found in the dependency tree of that lock file. It's better to break someone's build and leave a note that they need to upgrade than be part of the problem. Even further, npm audit reports are shown to the end user after an npm install so they can decide for themselves in the event the maintainers haven't unpublished yet (or ever).
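As a rough aid for that lock file review, here is a Python sketch that lists entries in a "package-lock.json" which were not resolved from the npm registry or which lack an integrity hash. It assumes the newer lock file layout with a top-level "packages" object; `suspicious_lock_entries` and `TRUSTED_HOSTS` are illustrative names, and bundled dependencies without a "resolved" field will be flagged too, so treat the output as a list of things to look at, not a verdict.

```python
import json
from urllib.parse import urlparse

# Assumption: everything is expected to come from the public npm registry.
TRUSTED_HOSTS = {"registry.npmjs.org"}


def suspicious_lock_entries(lock_text: str) -> list[str]:
    """Return human-readable findings for entries that deserve a closer look."""
    lock = json.loads(lock_text)
    findings = []
    for path, meta in lock.get("packages", {}).items():
        if path == "" or meta.get("link"):
            continue  # skip the root project and workspace symlinks
        resolved = meta.get("resolved", "")
        host = urlparse(resolved).hostname
        if host not in TRUSTED_HOSTS:
            # e.g. a git URL pointing at a repository name that can be re-registered
            findings.append(f"{path}: resolved from {resolved or '<missing>'}")
        elif "integrity" not in meta:
            findings.append(f"{path}: no integrity hash")
    return findings
```

Dependencies resolved straight from a git hosting URL are the ones exposed to the rename problem discussed in this thread, so those are the findings worth reading first.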
> that other package managers don't require code execution to install
But that's more or less true. Arbitrary code execution isn't a feature needed when installing packages for other languages that don't use C bindings so heavily.
You're spot on that Node.js isn't alone, Python packages are very much the same in that packages can require code execution to install.
But not all packaging systems require the ability to execute package provided code in order to install some packages.
But then, in those languages, binding to C libs is far far less common.
Even simpler than that, something like bsd ports distfiles checksums, or go.sum, or whatever. If you depend on something, you should know what that something is, and you should have some measure of it in your code/project.
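In Python the same idea is a few lines: keep the expected SHA-256 of the dependency in your own repo and refuse to use anything that doesn't match. `verify_sha256` is a made-up helper name, and how you obtain `data` (curl, a package manager cache, whatever) is left out on purpose.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> bytes:
    """Return `data` unchanged if its SHA-256 matches, otherwise raise."""
    actual_hex = hashlib.sha256(data).hexdigest()
    # compare_digest is not strictly needed for public hashes, but it is harmless.
    if not hmac.compare_digest(actual_hex, expected_hex.strip().lower()):
        raise ValueError(
            f"checksum mismatch: expected {expected_hex}, got {actual_hex}"
        )
    return data
```

The digest you commit is the "measure" of the dependency: a hijacked name can still serve bytes, but not bytes that hash to the value you recorded earlier.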