> "In almost all cases, whether or not there's a known local privesc bug, assume that code execution on your Linux systems equates to privesc; this is doubly true of machines in your prod deployment environment."
It depends. I've seen "oh well if someone has rce they probably have root anyway" used way too many times as an excuse to avoid defense-in-depth measures.
Those people might be right. Defense in depth is a legitimate tactic, but that's all it is, and it's often an excuse for people to waste time layering stupid stuff on top of real security controls.
ASLR, NX, and CFI would be an example of a defense in depth stack that is meaningful.
SSH, Fail2Ban, and SPA would be an example of a defense in depth stack that basically just wastes time.
I would be more comfortable with a system where I knew I had to burn the box if I lost RCE on it than I would be with a system that somehow depended on RCE not coughing up kernel, and persistence, to an attacker.
The other thing defense in depth can provide is increased attacker cost. That's why there are economically valuable DRM systems (BluRay's BD+ is an example here). All you have to do is push attacker cost across a threshold (for instance with BD+, that's keeping titles secure past the new release window) to make a defense in depth control valuable.
But if someone has a kernel exploit, probably nothing you've done for defense in depth is going to meaningfully increase costs.
> That's why there are economically valuable DRM systems (BluRay's BD+ is an example here). All you have to do is push attacker cost across a threshold (for instance with BD+, that's keeping titles secure past the new release window) to make a defense in depth control valuable.
A really good example of this is Spyro 3: the developers set up a system of overlapping checksums (which could in turn be part of the data being checksummed by other, overlapping checksums) so that it was virtually impossible to change even a single bit without failing the test. It was eventually cracked, since the check only ran at boot time (it required 10 seconds of disk access, and adding 10 seconds to every loading screen in the game would have been unacceptable), but it took pirates over two months to get a crack working (unusual for the time). And since most game sales come in the first two months...
But that's really just me using this as an excuse to share a bit of technical trivia.
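For anyone curious what "checksums covering other checksums" can look like, here is a minimal sketch. It is not the actual Spyro 3 scheme, whose details aren't given here; the window size, step, and use of CRC32 are all made-up assumptions. Each CRC is appended to the image, and later windows run over the tail of the image, so they also cover CRCs that were appended earlier.

```python
import zlib

WINDOW = 64   # bytes covered by each checksum (hypothetical value)
STEP = 16     # distance between window starts; smaller than WINDOW, so windows overlap
CRC_LEN = 4   # a CRC32 is stored as 4 bytes


def seal(payload: bytes) -> bytes:
    """Return payload followed by one CRC32 per overlapping window."""
    image = bytearray(payload)
    for start in range(0, len(payload), STEP):
        # The slice stops at the current end of the image, so late windows
        # also cover CRCs appended by earlier iterations.
        crc = zlib.crc32(bytes(image[start:start + WINDOW]))
        image += crc.to_bytes(CRC_LEN, "big")
    return bytes(image)


def verify(image: bytes, payload_len: int) -> bool:
    """Recompute every window exactly as seal() saw it and compare."""
    starts = range(0, payload_len, STEP)
    if len(image) != payload_len + CRC_LEN * len(starts):
        return False
    for k, start in enumerate(starts):
        crc_pos = payload_len + CRC_LEN * k  # where CRC number k was appended
        window = image[start:min(start + WINDOW, crc_pos)]
        stored = int.from_bytes(image[crc_pos:crc_pos + CRC_LEN], "big")
        if zlib.crc32(window) != stored:
            return False
    return True
```

In this toy, flipping any single bit of the sealed image makes `verify` fail, because every byte is either inside some window or is itself a stored CRC. An attacker who can simply rerun `seal` after patching defeats it, so the interesting part of a real scheme is making that recomputation expensive, which this sketch does not attempt.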
I'm confused, how is SSH an example of defense in depth? It is an access method. You should absolutely harden your SSH configuration. Fail2Ban is useless on a properly configured SSH server (no root, no passwords, no kerberos, only keys). Managing the keys at scale, well that is a different story.
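As a rough illustration of the "no root, no passwords, no kerberos, only keys" setup, the sketch below audits the text of an `sshd_config` for four real options (`PermitRootLogin`, `PasswordAuthentication`, `KerberosAuthentication`, `PubkeyAuthentication`). The function name and the policy of demanding explicit values are my own assumptions; it ignores `Match` blocks, `Include` files, and compiled-in defaults, so treat it as a starting point rather than a real auditor.

```python
# Desired values for the hardening described above (keys lower-cased,
# because sshd_config keywords are case-insensitive).
REQUIRED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
    "kerberosauthentication": "no",
    "pubkeyauthentication": "yes",
}


def audit_sshd_config(text: str) -> dict:
    """Return {option: value_found_or_None} for every option that deviates.

    Hypothetical helper: an option that is not set explicitly is reported
    as None, even if sshd's built-in default would be acceptable.
    """
    seen = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        # sshd uses the first value it obtains for each keyword.
        seen.setdefault(key, value)
    return {k: seen.get(k) for k, want in REQUIRED.items() if seen.get(k) != want}
```

An empty result means all four options are set explicitly to the hardened values; anything else lists what to fix.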
I agree with you that ASLR, NX, and CFI are the most important system level defenses to employ.
I suspect that you're confusing fail2ban and port-knocking (or using fail2ban as a port-knocker).
The point of fail2ban is to prevent an attacker from brute-forcing your server. In a key-only config, the chances of getting brute-forced are smaller (by a few orders of magnitude) than the chances of you getting hit by an asteroid and the server getting hit by one too, so fail2ban doesn't really help.
_In theory_, the same would be true for port-knocking.
However, in practice, sshd can have security holes which a malicious scanner could exploit. And while port-knocking doesn't help against a determined attacker (it's subject to MITM, replay-attacks), it does help with defense-in-depth.
That is true and a good use case for fail2ban. "Useless" was probably too strong a word; what I really meant was "of limited utility in increasing the security of the SSH service".
The main reason I use fail2ban is I got tired of the log file noise/bloat. I use key-only access on my servers already, with the key stored on a hardware token (Yubikey).
I guess the question then is why you're looking at failed auth logs. Failed auths are boring, doubly so on a key-only server. Successful auths are where the fun is at.
When I first set up fail2ban it was because I got annoyed that the machine on my desk was making regular "clunk...clunk...clunk" noises from the hard disk as it wrote another failed-auth attempt to the log every second or so...
Not entirely reasonable for all use cases. If there's a machine that you need access to from many different locations, a keyfile is more of a PITA than a long passphrase.
An HPC center (that is, lots of users coming in via ssh) I know about disabled key logins, IIRC due to some incident where an attacker had got hold of a password-less key.
Too bad that sshd can't enforce use of password-protected keys on the server side.
You got the thing backwards. It's not "too bad that sshd can't enforce some property that happened to be missing in the key attackers got their hands on". It's "too bad the HPC center staff didn't have tools good enough to manage their servers", CFEngine and Puppet being two examples of such tools the staff missed (or didn't know how to put into use in this case).
The problem, AFAIU, was that some user had a password-less key stored on some external system (their personal home computer, for all I know). That system was hacked, and allowed the attacker to access the HPC system. I don't see how the HPC center staff getting the Puppet-gospel could have prevented that person from using a password-less key. Well, except by disabling key-based logins (which, AFAIU, they could have used Puppet/cfengine/whatever for).
My point is that in general it would be better to disable password auth and only use key based auth, but only if you could somehow guarantee that the users wouldn't do crazy things like use password-less keys. But as you can't do that on the server-side, what other options do you have?
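To make the "can't do that on the server side" point concrete: whether a key has a passphrase is a property of the private key file, which never leaves the client, so the only place it can be checked is where that file lives. The sketch below reads the cipher name from a key in the newer OpenSSH private key format (the one that starts with `-----BEGIN OPENSSH PRIVATE KEY-----`); a cipher name of `none` means no passphrase. The function name is hypothetical, and legacy PEM-format keys are not handled.

```python
import base64
import struct

MAGIC = b"openssh-key-v1\x00"
BEGIN = "-----BEGIN OPENSSH PRIVATE KEY-----"
END = "-----END OPENSSH PRIVATE KEY-----"


def openssh_key_is_encrypted(key_text: str) -> bool:
    """Return True if the private key file is passphrase-protected.

    Assumes the newer OpenSSH key file format; raises ValueError otherwise.
    """
    lines = [line.strip() for line in key_text.strip().splitlines()]
    if len(lines) < 3 or lines[0] != BEGIN or lines[-1] != END:
        raise ValueError("not an OpenSSH-format private key")
    blob = base64.b64decode("".join(lines[1:-1]))
    if not blob.startswith(MAGIC):
        raise ValueError("missing openssh-key-v1 magic")
    offset = len(MAGIC)
    # The first field after the magic is a length-prefixed cipher name.
    (name_len,) = struct.unpack_from(">I", blob, offset)
    cipher = blob[offset + 4:offset + 4 + name_len]
    return cipher != b"none"
```

A site could run something like this on machines it controls, but it cannot run it against a key sitting on a user's home computer, which is exactly the scenario described above.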
Control-Flow Integrity.
It's a bit of the new hotness in exploit mitigation; however, it's quite complicated, and there are various solutions with different advantages and disadvantages.
clang docs:
http://clang.llvm.org/docs/ControlFlowIntegrity.html
Shorter CFI: when doing codegen for calls through function pointers (which will involve indirect calls through registers), emit extra code to make sure the register being jumped to is a legit function, thus breaking ROP payloads.
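A toy model of that check, in Python purely for readability: real CFI is emitted by the compiler as machine code, and clang's scheme decides validity from function types rather than from a hand-written list. All names here are hypothetical; the only point is the shape of "check the target before the indirect call".

```python
class CFIViolation(Exception):
    """Raised when an indirect call targets something not on the allow-list."""


def greet(name: str) -> str:
    return f"hello {name}"


def shout(name: str) -> str:
    return name.upper()


def spawn_shell(name: str) -> str:
    # Stands in for code an attacker wants to reach via a corrupted pointer.
    return "pwned"


# Stand-in for the set of legitimate call targets a compiler would compute.
VALID_TARGETS = (greet, shout)


def checked_indirect_call(target, arg: str) -> str:
    """Call target(arg) only if target is a known-good function."""
    if not any(target is fn for fn in VALID_TARGETS):
        name = getattr(target, "__name__", repr(target))
        raise CFIViolation(f"refusing indirect call to {name}")
    return target(arg)
```

Here `checked_indirect_call(greet, "bob")` goes through, while `checked_indirect_call(spawn_shell, "bob")` raises `CFIViolation` instead of running the unexpected function.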