Millions of GitHub repos likely vulnerable to RepoJacking, researchers say

lol768 · on June 24, 2023

I'm pretty sure GitHub warns you every single time you pull from or push to a repository where the organisation or repository has been renamed and you're relying on an alias that is not permanent.

Why is this news, and why does it need a name of "RepoJacking" assigned to it when the behaviour is working exactly as designed?

This isn't a novel vulnerability, and I wouldn't say any novel vulnerability research has been conducted here. Sure, there's some value in doing a code search and finding instances where people have automated scripts etc relying on aliases, but the vulnerability lies within these scripts.

kristopolous · on June 24, 2023

All mutable global namespaces have the same problems.

I didn't know you could get street cred pointing this out one by one.

Let's try! If someone buys an expired domain and installs an SSL cert on it then old scripts will think it's legit. Let's call it phantomain, a combination of phantom and domain.

SoftTalker · on June 24, 2023

First thing I thought of too. If you've acquired a company, you'll need to plan on keeping the old domain names for a long time, if not indefinitely. Anyone who can later re-register that domain will now own any email sent there and any other network traffic as well.

adrianmsmith · on June 24, 2023

Agreed. For example NeXT was bought by Apple in 1997 yet next.com is still owned by Apple all this time later.

topato · on June 24, 2023

No no! That's gotta be Ghomainjacking, two buzzwords in one AND ghosts are waaay spookier! Phantoms evoke too much imagery of incels in masks, singing on boats.

kristopolous · on June 24, 2023

I was also trying to allude to Saint-Domingue, the French colony which France woke up one day to realize it no longer had, and to "sans domain", both of which are way too nerdy. Let's go with yours.

Ghomainjacking but it's pronounced fimainjacking using the gho from ghoti because it's in the class of phishing attacks using a collision you didn't realize was possible

rapnie · on June 24, 2023

If you ditch your Twitter account and someone else adopts it and poses as you, you can only throw a tantrum. Beware of these twantrum attacks.

junon · on June 24, 2023

I thought Twitter famously doesn't allow reallocation of old usernames?

reidrac · on June 24, 2023

In fact, that's how you delete your old tweets: rename old account to a new username, register the old username, delete the renamed account.

https://help.twitter.com/en/using-twitter/delete-tweets

See "How to delete multiple Tweets".

As bad as it sounds.

HeyLaughingBoy · on June 24, 2023

After NPR said they weren't going to post anymore because Musk decided to label them as "State Controlled Media", he made a point of saying, "OK, then we'll give your username to someone else after it expires in a month" (apparently not posting for a month now expires accounts?).

mattigames · on June 24, 2023

They allow it, I closed my account and days later some bot had already taken over my old username.

warent · on June 24, 2023

Hmm, did you read the article? One specific attack vector they show has nothing to do with Git or pushing/pulling to repositories. It shows an install.sh which curl downloads a master.zip from a public github repo and executes files within it.

forgotpwd16 · on June 24, 2023

>It shows an install.sh which curl downloads a master.zip from a public github repo

A repo that is an alias to another one. Someone can create this repo breaking the alias and thus being able to serve whatever they want. This is the so-called "repojacking" and what GP is also talking about.

tgv · on June 24, 2023

So would I see this message if it was pulled in a script, by npm, go install, cargo or another such tool?

But this article describes RepoJacking as reclaiming the original repo. The alias is removed, and the warning you mention too.

lol768 · on June 25, 2023

> So would I see this message if it was pulled in a script, by npm, go install, cargo or another such tool?

The repository maintainers would, every single time they updated the code.

For your own scripts, you'd see a 301 Moved Permanently response when e.g. fetching a source code ZIP. It's up to you as to whether you want to follow this redirect silently or print a warning.
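
A minimal sketch of the "print a warning" option, assuming your script fetches with automatic redirect following turned off and can read the Location header itself. The helper names `repo_slug` and `redirect_renames_repo` are made up for illustration, and the check assumes the first two path segments of both URLs are the owner and the repo:

```python
from urllib.parse import urlsplit


def repo_slug(url: str) -> str:
    """Return 'owner/repo' (lowercased) from the first two path segments."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"no owner/repo in {url!r}")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]  # tolerate clone-style URLs
    return f"{owner}/{repo}".lower()


def redirect_renames_repo(requested_url: str, location: str) -> bool:
    """True if the redirect target names a different owner/repo than requested."""
    return repo_slug(requested_url) != repo_slug(location)
```

If `redirect_renames_repo(...)` returns True, the script is relying on an alias, so it can warn or abort instead of silently following the redirect.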

kdmccormick · on June 24, 2023

Nope. The project I work on went through a governance restructuring and we moved half the repos to a new org. I haven't updated most of my remotes, but they redirect seamlessly, no CLI warnings whatsoever.

lakomen · on June 24, 2023

I believe it's just another FUD article to support the terrible forced 2FA decision.

naillo · on June 24, 2023

Security researchers love to fearmonger to stay employed

runlevel1 · on June 24, 2023

The redirect behavior gives users the false impression that the old organization name is still permanently associated with them. Making that the actual behavior seems like the easy solution here.

If GitHub is concerned with people burning through names, they could either limit the number of changes before you have to reach out to support or limit the redirect behavior to just the last two org names.

djbusby · on June 24, 2023

Or just return a 410 and force folk to deal with the situation when their dependency breaks. I'm not convinced the convenience outweighs the risk. Especially given how many folk are just out here git-cloning random stuff from unvetted "vendors".

throwaway892238 · on June 24, 2023

There are, currently, millions of exploits that are just ready to be taken advantage of, and we have no idea where they are, no way to detect them (because they could all be way upstream), and no way to stop them. That's kind of a big deal. Like an existential, massive, unsolvable, imminent disaster kind of deal.

It's such a big deal that I don't think anyone's going to take it seriously until the compromises start coming in waves. Just one of these compromises could affect thousands of projects. If there are millions of potential exploits? We're talking, like, the majority of the modern software landscape now has a giant hole in it. Not a potential hole, but, actively, right now, is exploitable.

flangola7 · on June 24, 2023

Fine-tuned LLMs will make this nice and scalable. I look forward to the coming hacknado.

1lint · on June 24, 2023

Another reason to include commit IDs in the URL when fetching files from external repos. I think you should do this anyway, in case the external repo maintainer makes a change that silently breaks your build script.
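
A rough sketch of what that could look like in a build script. The function name `pinned_archive_url` is hypothetical, and the URL layout is an assumption based on github.com's `/archive/<ref>.zip` download links:

```python
import re

# A full SHA-1 commit id: exactly 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_archive_url(owner: str, repo: str, commit: str) -> str:
    """Build an archive URL pinned to one commit; refuse branches and short ids."""
    if not FULL_SHA.fullmatch(commit):
        raise ValueError("expected a full 40-character lowercase hex commit id")
    return f"https://github.com/{owner}/{repo}/archive/{commit}.zip"
```

Passing a branch name such as `master` raises, so a mutable ref cannot slip into the script by accident.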

ewhauser421 · on June 24, 2023

Just verify the SHA of the tarball a la Bazel?
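
Something along these lines, as a sketch only: it uses SHA-256 from the Python standard library rather than anything Bazel-specific, and `verify_sha256` is an illustrative name. The expected digest would be recorded in your repo when you first vet the download:

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> None:
    """Raise ValueError unless data hashes to the recorded SHA-256 digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, expected_hex.strip().lower()):
        raise ValueError(f"checksum mismatch: got {actual}")
```

Call it on the downloaded bytes before unpacking or executing anything from the archive.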

glandium · on June 24, 2023

Remember when git archive changed its format and that affected archives downloaded from github?

emidoots · on June 24, 2023

That won't help you very much. There's no guarantee the commit belongs to the named repository with e.g. raw links[0].

[0] https://twitter.com/slimsag/status/1672421999698903043

faangsticle · on June 24, 2023

Of course it will, since you'll either get the commit you wanted at the time you wrote the script, or an error.

bqmjjx0kac · on June 24, 2023

Unless someone is very good at finding SHA1 collisions.

NhanH · on June 24, 2023

The collisions need to deliver malicious payload as well, making it extra hard

manwe150 · on June 24, 2023

Those are still very hard to get for a random hash, and GitHub I think warns (or blocks?) you if you try to push a hash with a known vulnerability.

glandium · on June 24, 2023

If you clone the repo, it won't be there.

ashishbijlani · on June 24, 2023

We carried out a similar analysis using Packj tool [1] and found that a large number of repos are vulnerable to supply-chain attacks. We received a bunch of bounties for reporting vulnerabilities. Hopefully, these findings will bolster open-source security.

1. http://github.com/ossillate-inc/packj flags vulnerable/malicious NPM/PyPI/Rubygems/Cargo/Packagist packages. I'm the lead dev.

hardwaregeek · on June 24, 2023

I've thought about this a lot when building GitHub integrations. It's so so so easy to use a GitHub username or an email as the link to the GitHub account. In fact, I don't even know how you connect an account to GitHub in a persistent manner. Email is slightly better because the odds that someone takes a specific email are pretty low, but hey, domains expire too. Does GitHub give you a unique, guaranteed-to-never-change ID that corresponds to an account? They should.

eslaught · on June 24, 2023

Yes:

https://api.github.com/users/<userhandle>

From:

https://github.com/NixOS/nixpkgs/blob/master/maintainers/mai...

hardwaregeek · on June 25, 2023

After doing some more research, it really doesn't help that GitHub has a GraphQL query to get a user by login and no obvious query to get a user by ID.

hardwaregeek · on June 24, 2023

Is it guaranteed to not change?

LoganDark · on June 24, 2023

GitHub user IDs look like "4723091" (there's mine).

If you look at the IDs for multiple accounts, you'll very quickly notice that they seem to have been assigned sequentially at registration time.

Fairly sure this is a permanent deal.
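
If that holds (an assumption, nothing in this thread confirms it), an integration could store the numeric `id` from the `https://api.github.com/users/<userhandle>` response and treat the login as a display name only. A minimal sketch with made-up helper names and no network calls:

```python
import json


def stable_identity(api_response: str) -> tuple[int, str]:
    """Extract (numeric id, login) from a users API JSON response body."""
    user = json.loads(api_response)
    return int(user["id"]), str(user["login"])


def same_account(stored_id: int, api_response: str) -> bool:
    """True if the account now behind this login is the one originally linked."""
    current_id, _login = stable_identity(api_response)
    return current_id == stored_id
```

If `same_account` returns False for a stored login, the name now belongs to a different account and the link should not be trusted.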

hardwaregeek · on June 25, 2023

What's tricky is that the GitHub API docs[1] appear to explicitly recommend passing the username and not the ID. Both the GraphQL and the REST versions tell you to get a user by passing a username.

[1]: https://docs.github.com/en/rest/users/users?apiVersion=2022-...

LoganDark · on June 25, 2023

Yeah, they seem really focused on usernames. The sad thing is that you shouldn't just figure out a user-ID-to-username endpoint because that just creates a TOCTOU opportunity. You have to have GitHub accept a user ID directly on the operation you want to perform with that user, or else something could change in between getting the username and operating on that username.

sublinear · on June 24, 2023

> Also, because the instructions included an "npm install" command for the dependency, the attacker's code would achieve arbitrary code execution on the devices of unsuspecting users.

This is debatable. If the owner unpublishes their old releases, then the npm install would simply fail and the package reference string is burned forever. I don't like the implication this post makes that other package managers don't require code execution to install, or that npm is somehow more vulnerable.

> Registry data is immutable, meaning once published, a package cannot change. We do this for reasons of security and stability of the users who depend on those packages. So if you've ever published a package called "bob" at version 1.1.0, no other package can ever be published with that name at that version. This is true even if that package is unpublished.

https://docs.npmjs.com/policies/unpublish

https://docs.npmjs.com/cli/v9/commands/npm-unpublish

To hijack an npm package is to exploit sloppy code review somewhere in the dependency tree. The registry has nothing to do with this.

Being aware of this policy, requiring maintainers to review all changes to "package-lock.json", and keeping this lock file in your repo ("npm-shrinkwrap.json" for releases) entirely mitigates this.

https://docs.npmjs.com/cli/v9/configuring-npm/package-lock-j...
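
As a reviewing aid, here is a sketch that lists lock file entries with no `integrity` hash. It assumes the `packages` map used by newer lock file versions, and it assumes that entries resolved from a git URL rather than the registry are the ones that lack a hash. The function name is hypothetical:

```python
import json


def entries_without_integrity(lock_text: str) -> list[str]:
    """Return package-lock paths whose entry has no 'integrity' field."""
    lock = json.loads(lock_text)
    missing = []
    for path, entry in lock.get("packages", {}).items():
        # Skip the root project, workspace links and bundled dependencies,
        # which are not expected to carry their own hash.
        if path == "" or entry.get("link") or entry.get("inBundle"):
            continue
        if "integrity" not in entry:
            missing.append(path)
    return sorted(missing)
```

Anything it reports is worth a closer look in review, since there is no recorded hash to compare the installed contents against.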

It's also important to regularly run npm audit on all releases and unpublish if vulnerabilities are found in the dependency tree of that lock file. It's better to break someone's build and leave a note that they need to upgrade than be part of the problem. Even further, npm audit reports are shown to the end user after an npm install so they can decide for themselves in the event the maintainers haven't unpublished yet (or ever).

https://docs.npmjs.com/cli/v9/commands/npm-audit

EdwardDiego · on June 24, 2023

> that other package managers don't require code execution to install

But that's more or less true. Arbitrary code execution isn't a feature needed when installing packages for other languages that don't use C bindings so heavily.

You're spot on that Node.js isn't alone, Python packages are very much the same in that packages can require code execution to install.

But not all packaging systems require the ability to execute package provided code in order to install some packages.

But then, in those languages, binding to C libs is far far less common.

tedunangst · on June 23, 2023

What if there were some way to specify a difficult to forge checksum for your dependencies?

rrdharan · on June 24, 2023

They could call it an SBOM...

codetrotter · on June 24, 2023

I think tedunangst is referring to https://man.openbsd.org/signify.1

tedunangst · on June 24, 2023

Even simpler than that, something like bsd ports distfiles checksums, or go.sum, or whatever. If you depend on something, you should know what that something is, and you should have some measure of it in your code/project.

throwaway892238 · on June 24, 2023

Like an SBOM

remram · on June 24, 2023

Like a commit hash?

jeffkeen · on June 24, 2023

I’d love to clickjack some repos to merge some PRs.

AHOHA · on June 24, 2023

Another reason not to use GitHub.

taftster · on June 24, 2023

What would you suggest as an alternative? Do you have a list of github problems that tips you over to said alternative(s)?