Millions of GitHub repos likely vulnerable to RepoJacking, researchers say (bleepingcomputer.com)
134 points by pyeri on June 23, 2023 | 50 comments



I'm pretty sure GitHub warns you every single time you pull from or push to a repository where the organisation or repository has been renamed and you're relying on an alias that is not permanent.

Why is this news, and why does it need a name of "RepoJacking" assigned to it when the behaviour is working exactly as designed?

This isn't a novel vulnerability, and I wouldn't say any novel vulnerability research has been conducted here. Sure, there's some value in doing a code search and finding instances where people have automated scripts etc relying on aliases, but the vulnerability lies within these scripts.


All mutable global namespaces have the same problems.

I didn't know you could get street cred pointing this out one by one.

Let's try! If someone buys an expired domain and installs an SSL cert on it then old scripts will think it's legit. Let's call it phantomain, a combination of phantom and domain.


First thing I thought of too. If you've acquired a company, you'll need to plan on keeping the old domain names for a long time, if not indefinitely. Anyone who can later re-register that domain will now own any email sent there and any other network traffic as well.


Agreed. For example NeXT was bought by Apple in 1997 yet next.com is still owned by Apple all this time later.


No no! That's gotta be Ghomainjacking, two buzzwords in one AND ghosts are waaay spookier! Phantoms evoke too much imagery of incels in masks, singing on boats.


I was also trying to allude to Saint-Domingue, the French colony that France woke up one day to realize it no longer had, and to "sans domain", both of which are way too nerdy. Let's go with yours.

Ghomainjacking but it's pronounced fimainjacking using the gho from ghoti because it's in the class of phishing attacks using a collision you didn't realize was possible


If you ditch your Twitter account and someone else adopts it and poses as you, you can only throw a tantrum. Beware of these twantrum attacks.


I thought Twitter famously doesn't allow reallocation of old usernames?


In fact, that's how you delete your old tweets: rename old account to a new username, register the old username, delete the renamed account.

https://help.twitter.com/en/using-twitter/delete-tweets

See "How to delete multiple Tweets".

As bad as it sounds.


After NPR said they weren't going to post anymore because Musk decided to label them as "State Controlled Media" he made a point of saying, "OK, then we'll give your username to someone else after it expires in a month (apparently not posting for a month now expires accounts?)"


They allow it. I closed my account, and days later some bot had already taken over my old username.


Hmm, did you read the article? One specific attack vector they show has nothing to do with Git or with pushing to or pulling from repositories. It shows an install.sh that uses curl to download a master.zip from a public GitHub repo and executes files within it.


>It shows an install.sh which curl downloads a master.zip from a public github repo

A repo that is an alias to another one. Someone can create a repo under the old name, which breaks the alias and lets them serve whatever they want. This is the so-called "repojacking" and what GP is also talking about.


So would I see this message if it was pulled in a script, by npm, go install, cargo or another such tool?

But this article describes RepoJacking as reclaiming the original repo. The alias is removed, and the warning you mention too.


> So would I see this message if it was pulled in a script, by npm, go install, cargo or another such tool?

The repository maintainers would, every single time they updated the code.

For your own scripts, you'd see a 301 Moved Permanently response when e.g. fetching a source code ZIP. It's up to you as to whether you want to follow this redirect silently or print a warning.
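
A minimal sketch of the "print a warning" option, assuming a rename shows up as a redirect whose owner/repo path differs from the one requested. The helper names and the example URLs are hypothetical; only the standard library is used, and the caller is expected to supply the status code and Location header from whatever HTTP client it uses.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

REDIRECT_CODES = (301, 302, 307, 308)


def owner_repo(url: str) -> Tuple[str, str]:
    """Return the first two path segments (owner, repo) of a URL, lowercased."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"no owner/repo in {url!r}")
    return parts[0].lower(), parts[1].lower()


def redirect_warning(requested_url: str, status: int,
                     location: Optional[str]) -> Optional[str]:
    """Return a warning if a redirect points at a different owner/repo."""
    if status not in REDIRECT_CODES or not location:
        return None
    if owner_repo(location) == owner_repo(requested_url):
        # Same repository, e.g. a hop to a separate download host.
        return None
    return f"{requested_url} now redirects to {location}; update the reference"
```

Comparing the owner/repo pair rather than refusing every redirect avoids false alarms when the same repository is simply served from a different host.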


Nope. The project I work on went through a governance restructuring and we moved half the repos to a new org. I haven't updated most of my remotes, but they redirect seamlessly, no CLI warnings whatsoever.


I believe it's just another FUD article to support the terrible forced 2FA decision.


Security researchers love to fearmonger to stay employed


The redirect behavior gives users the false impression that the old organization name is still permanently associated with them. Making that the actual behavior seems like the easy solution here.

If GitHub is concerned with people burning through names, they could either limit the number of changes before you have to reach out to support or limit the redirect behavior to just the last two org names.


Or just return a 410 and force folk to deal with the situation when their dependency breaks. I'm not convinced the convenience outweighs the risk. Especially given how many folk are just out here git-cloning random stuff from unvetted "vendors".


There are, currently, millions of exploits that are just ready to be taken advantage of, and we have no idea where they are, no way to detect them (because they could all be way upstream), and no way to stop them. That's kind of a big deal. Like an existential, massive, unsolvable, imminent disaster kind of deal.

It's such a big deal that I don't think anyone's going to take it seriously until the compromises start coming in waves. Just one of these compromises could affect thousands of projects. If there are millions of potential exploits? We're talking, like, the majority of the modern software landscape now has a giant hole in it. Not a potential hole, but, actively, right now, is exploitable.


Fine-tuned LLMs will make this nice and scalable. I look forward to the coming hacknado.


Another reason to include commit ids in the url when fetching files from external repos. I think you should do this anyways in case the external repo maintainer makes a change that silently breaks your build script
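
A rough sketch of what pinning could look like in a build script. The function name is hypothetical, and the URL pattern is the common `github.com/<owner>/<repo>/archive/<commit>.tar.gz` form, used here as an assumption rather than a guarantee.

```python
import re

# A full SHA-1 commit id: exactly 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def pinned_archive_url(owner: str, repo: str, commit: str) -> str:
    """Build an archive URL that names one specific commit."""
    commit = commit.lower()
    if not FULL_SHA.fullmatch(commit):
        # Branch names, tags, and short hashes can all move or collide.
        raise ValueError("pin to a full 40-character commit id")
    return f"https://github.com/{owner}/{repo}/archive/{commit}.tar.gz"
```

Rejecting anything but a full commit id keeps a mutable name such as `master` from slipping into the script.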


Just verify the SHA of the tarball a la Bazel?
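
For scripts that don't use Bazel, the same idea can be sketched with the standard library. This assumes the expected SHA-256 digest was recorded when the dependency was first vetted; the function name is hypothetical.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> None:
    """Raise ValueError unless data hashes to the recorded SHA-256 digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    if not hmac.compare_digest(actual, expected_hex.strip().lower()):
        raise ValueError(f"checksum mismatch: got {actual}")
```

The check runs on the downloaded bytes before anything is unpacked or executed, so a substituted archive fails the build instead of running.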


Remember when git archive changed its format and that affected archives downloaded from github?


That won't help you very much. There's no guarantee the commit belongs to the named repository with e.g. raw links[0].

[0] https://twitter.com/slimsag/status/1672421999698903043


Of course it will, since you'll either get the commit you wanted at the time you wrote the script, or an error.


Unless someone is very good at finding SHA1 collisions.


The collisions need to deliver malicious payload as well, making it extra hard


Those are still very hard to get for a random hash, and I think GitHub warns (or blocks?) you if you try to push a hash with a known vulnerability.


If you clone the repo, it won't be there.


We carried out a similar analysis using Packj tool [1] and found that a large number of repos are vulnerable to supply-chain attacks. We received a bunch of bounties for reporting vulnerabilities. Hopefully, these findings will bolster open-source security.

1. http://github.com/ossillate-inc/packj flags vulnerable/malicious NPM/PyPI/Rubygems/Cargo/Packagist packages. I'm the lead dev.


I've thought about this a lot when building GitHub integrations. It's so so so easy to use a GitHub username or an email as the link to the GitHub account. In fact, I don't even know how you connect an account to GitHub in a persistent manner. Email is slightly better because it's pretty unlikely that someone takes over a specific email address, but hey, domains expire too. Does GitHub give you a unique ID, guaranteed never to change, that corresponds to an account? They should.



After doing some more research, it really doesn't help that GitHub has a GraphQL query to get a user by login and no obvious query to get a user by ID.


Is it guaranteed to not change?


GitHub user IDs look like "4723091" (there's mine).

If you look at the IDs for multiple accounts, you'll very quickly notice that they seem to have been assigned sequentially at registration time.

Fairly sure this is a permanent deal.
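
If the numeric ID really is permanent (an assumption, as above), an integration could key its link on the ID and treat the username as display data only. A minimal sketch, assuming the fetched user object carries `id` and `login` fields; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedAccount:
    github_id: int      # numeric ID captured when the account was linked
    login_at_link: str  # username seen at that time, for display only


def check_identity(linked: LinkedAccount, profile: dict) -> str:
    """Compare a stored link against a freshly fetched user object."""
    if profile.get("id") != linked.github_id:
        # The username now resolves to a different account.
        return "mismatch"
    if str(profile.get("login", "")).lower() != linked.login_at_link.lower():
        # Same account, new username.
        return "renamed"
    return "ok"
```

This only detects a reassigned username at the moment the profile is fetched; it does not make later operations that take a username safe.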


What's tricky is that the GitHub API docs[1] appear to explicitly recommend passing the username and not the ID. Both the GraphQL and the REST versions tell you to get a user by passing a username.

[1]: https://docs.github.com/en/rest/users/users?apiVersion=2022-...


Yeah, they seem really focused on usernames. The sad thing is that you shouldn't just figure out a user-ID-to-username endpoint because that just creates a TOCTOU opportunity. You have to have GitHub accept a user ID directly on the operation you want to perform with that user, or else something could change in between getting the username and operating on that username.


> Also, because the instructions included an "npm install" command for the dependency, the attacker's code would achieve arbitrary code execution on the devices of unsuspecting users.

This is debatable. If the owner unpublishes their old releases, then the npm install would simply fail and the package reference string is burned forever. I don't like the implication this post makes that other package managers don't require code execution to install, or that npm is somehow more vulnerable.

> Registry data is immutable, meaning once published, a package cannot change. We do this for reasons of security and stability of the users who depend on those packages. So if you've ever published a package called "bob" at version 1.1.0, no other package can ever be published with that name at that version. This is true even if that package is unpublished.

https://docs.npmjs.com/policies/unpublish

https://docs.npmjs.com/cli/v9/commands/npm-unpublish

To hijack an npm package is to exploit sloppy code review somewhere in the dependency tree. The registry has nothing to do with this.

Being aware of this policy, requiring maintainers to review all changes to "package-lock.json", and keeping this lock file in your repo ("npm-shrinkwrap.json" for releases) entirely mitigates this.
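
Part of that lock file review can be scripted. The sketch below assumes a lock file with a top-level `packages` map whose entries carry `resolved` and `integrity` fields (lockfileVersion 2 or 3); the function name and the allowed-host list are hypothetical, and it supplements human review rather than replacing it.

```python
from urllib.parse import urlparse


def suspicious_entries(lock: dict, allowed_hosts=("registry.npmjs.org",)) -> list:
    """List lock file entries fetched from outside the registry or lacking a hash."""
    findings = []
    for path, meta in lock.get("packages", {}).items():
        # Skip the root project, workspace links, and bundled dependencies,
        # which are not expected to carry resolved/integrity fields.
        if path == "" or meta.get("link") or meta.get("inBundle"):
            continue
        host = urlparse(meta.get("resolved", "")).hostname
        if host not in allowed_hosts:
            findings.append((path, "resolved outside allowed hosts"))
        elif not meta.get("integrity"):
            findings.append((path, "missing integrity hash"))
    return findings
```

To use it, load the file with `json.load` and fail the CI job if the returned list is non-empty. Entries that resolve to a Git URL are flagged, which is where a renamed or reclaimed repository would show up.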

https://docs.npmjs.com/cli/v9/configuring-npm/package-lock-j...

It's also important to regularly run npm audit on all releases and unpublish if vulnerabilities are found in the dependency tree of that lock file. It's better to break someone's build and leave a note that they need to upgrade than be part of the problem. Even further, npm audit reports are shown to the end user after an npm install so they can decide for themselves in the event the maintainers haven't unpublished yet (or ever).

https://docs.npmjs.com/cli/v9/commands/npm-audit


> that other package managers don't require code execution to install

But that's more or less true. Arbitrary code execution isn't a feature needed when installing packages for other languages that don't use C bindings so heavily.

You're spot on that Node.js isn't alone: Python packages are very much the same in that packages can require code execution to install.

But not all packaging systems require the ability to execute package provided code in order to install some packages.

But then, in those languages, binding to C libs is far far less common.


What if there were some way to specify a difficult to forge checksum for your dependencies?


They could call it an SBOM...


I think tedunangst is referring to https://man.openbsd.org/signify.1


Even simpler than that, something like bsd ports distfiles checksums, or go.sum, or whatever. If you depend on something, you should know what that something is, and you should have some measure of it in your code/project.


Like an SBOM


Like a commit hash?


I’d love to clickjack some repos to merge some PRs.


Another reason not to use GitHub.


What would you suggest as an alternative? Do you have a list of GitHub problems that tip you over to said alternative(s)?



