I think this article has come up before? Either way, it's a quirky thing for Gravitational to post, since their flagship project --- Teleport --- basically eliminates bastion servers altogether (you might think of it as an API-controlled self-contained bastion server). Teleport is free, and worth checking out: it solves a bunch of SSH management problems, not just controlling access, but also linking SSH access to SSO, running fleet-wide commands selectively, and generating transcripts of SSH sessions.
Teleport is kind of big and sprawling. But they've repeatedly contracted Doyensec to do assessment work for it, and Doyensec is a fantastic firm. I think parked behind Tailscale, so none of your SSH infra is exposed to the Internet to begin with, it's a pretty great solution, and I'd do that again before I ever hand-tooled an SSH bastion host again.
An interesting category of HOWTO is the one where a company teaches you how to do it yourself, for real. Then, before you get started, they pitch you at the end with a paid-for option that has zero learning curve. That's a good pitch.
Realistically, what is the risk of SSH being exposed if whitelisting, two-factor auth, and key auth are all in use? I suppose someone with a zero-day, or someone spoofing or coming from a whitelisted IP, might successfully exploit it, but really?
We've been experimenting a bit with tailscale and ssh access - and I'm not 100% convinced there's a great way to guarantee continued access - if you bind sshd to the tailscale vpn ip, an update that restarts ssh and tailscale could result in sshd not being able to bind the expected IP - leading to ssh being down. I think this is mostly due to sshd listen directive being somewhat limited.
so far I am mostly using tailscale + firewall. Using a firewall directly on the host as you mentioned seemed a bit dangerous - although we are trying it on a few servers. For now cloud provider firewall + tailscale.
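On the sshd bind problem above, one possible workaround on Linux is to let sshd bind to an address that is not up yet. This is only a sketch and I have not verified it against a Tailscale restart; the file name and the 100.64.0.10 address are placeholders, so substitute the host's own Tailscale IP.

```
# /etc/sysctl.d/99-nonlocal-bind.conf  (hypothetical file name)
# Allow processes to bind() to an IPv4 address that is not currently
# assigned to any interface, so sshd can start before the VPN is up.
net.ipv4.ip_nonlocal_bind = 1

# /etc/ssh/sshd_config
# Placeholder Tailscale address; replace with the host's real one.
ListenAddress 100.64.0.10
```

The trade-off is that the sysctl applies to every process on the host, not just sshd, so it is worth weighing against simply listening on all addresses and filtering at the firewall.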
> Either way, it's a quirky thing for Gravitational to post, since their flagship project --- Teleport --- basically eliminates bastion servers altogether
Not quirky at all. Probably the aim is to inform the reader of the myriad things to do to keep a bastion server secure, and then suggest there is an easier alternative. :)
Do not trust the firewall on the bastion host: if an attacker can get into the bastion host, they can disable the firewall, so it cannot be used to limit egress. It's better than nothing, but consider using a firewall that's managed via a separate management network. I do agree that you should only allow SSH from a few known IPs.
Limiting the number of users is weird, and not recommended. Create all the accounts you need so that each staff member who accesses the bastion host has an individual account; you will need that, as things like HIPAA require named accounts for auditing. None of the accounts need any privileges other than the most basic. Users do not need sudo/root privileges on a jump host.
Other than those two complaints, these are good recommendations.
A final recommendation: If you use AWS though, consider using Session Manager instead of SSH and drop the bastion host. You can still connect using the SSH command, using proxy command in OpenSSH, but no public IP or bastion host is required.
> A final recommendation: If you use AWS though, consider using Session Manager instead of SSH and drop the bastion host. You can still connect using the SSH command, using proxy command in OpenSSH, but no public IP or bastion host is required.
I wrote something similar after I moved our fleet to SSM because I didn't want yet another CLI app to memorize flags for. It's Ruby-based and runs in an interactive mode by default. It doesn't cover the whole `aws ssm` feature set but focuses just on the things that are needed for debugging sorts of tasks. Leaving it here in case it's useful to anyone else: https://github.com/ajbdev/ruby-ssm-ops
Nitpick: the aws-connect quickstart suggests installing it through bpkg. But it turns out that bpkg does not have any "uninstall" or anything similar. I ended up doing just:
> if an attack can get into the bastion host, they can disable the firewall, so it cannot be used to limit egress.
This assumes that the attacker can get unconstrained root access to the system. It's fine to assume that attackers will but it's not as if you can't make that difficult.
At least in the DoD and IC environments I've worked in that had bastion hosts, the bastion host was severely locked down:
- Shell compiled without built-ins
- No coreutils
- No sudo
- Root account disabled
- Read-only root filesystem
- No user home directories
- Destroyed and rebuilt from template every X hours on some maintenance schedule
Effectively, all you can do is ssh in, ssh out, and forward ports. It might be theoretically possible, but as far as I know, no one has ever compromised one, especially since you can already only get to the bastion from a government VPN anyway, and authentication to that requires a smart card, so there are an awful lot of things you need to compromise to get to that point.
This also answers the suggestion down the page of "why don't you just apply the same controls to every host and not have a bastion." Because the bastion is unusable and you want to actually use your other hosts.
From a defense standpoint, one should consider "shell on a box" to usually mean attackers can get root on a box. If they can get persistence, they can wait for a kernel CVE to abuse.
Now, if you're just using a bastion as a jump host, you don't need to offer shells on it. Just allow people to proxy a port to behind the bastion and be done with it.
PermitTTY no
ForceCommand /usr/sbin/nologin
AllowTcpForwarding yes
AllowAgentForwarding no
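For reference, a client-side sketch that would pair with a forwarding-only bastion like the one above. The host names, user name, and address pattern are placeholders, not anything from the article. ProxyJump only asks the bastion to forward a TCP connection, so it does not need a TTY or a shell there.

```
# ~/.ssh/config on the client (all names below are hypothetical)
Host bastion
    HostName bastion.example.com
    User alice

# Internal hosts are reached through the bastion without a shell on it.
Host 10.0.*
    ProxyJump bastion
```

With this in place, `ssh 10.0.1.5` connects through the bastion, while `ssh bastion` on its own gets no usable shell.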
I think it's probably reasonable when performing your incident response or even threat modeling to assume the attacker has or could escalate privileges. The linked article doesn't discuss anything that would make that harder, although perhaps practices like staying patched and minimizing attack surface are somewhat assumed (they do bring up choosing your OS based on minimizing attack surface for example).
There's also a lot you can do to harden that boundary. You can harden your kernel, execute users' shells in constrained environments like Docker containers or restricted shells, leverage sandboxing technologies like AppArmor or SELinux, etc.
The user/root boundary can be a lot thinner than people expect, so I get why you'd want to point out that reliance on the attacker not escalating should be met with an evaluation of that boundary, but I think it may be understating the boundary to unconditionally not trust a host based firewall, or to say that getting onto the bastion itself is enough to disable the firewall when it does indeed require escalation.
I haven't actually tried it, but you can use SSM in your ssh config as a ProxyCommand. As I understand it, that will allow you to just use the ssh command as normal, with all the normal ssh abilities to do tunneling and port forwarding.
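A sketch of what that ssh config entry can look like. As far as I know this mirrors the snippet in the AWS documentation, and it assumes the AWS CLI and the Session Manager plugin are installed on the client and that the instance is managed by SSM.

```
# ~/.ssh/config
# Match EC2 instance IDs (i-*) and managed instance IDs (mi-*).
Host i-* mi-*
    ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"
```

After that, something like `ssh ec2-user@i-0123456789abcdef0` (a made-up instance ID and user) should go through Session Manager, and the usual `-L` and `-D` forwarding options ride on top of it.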
Twice I've seen Bastion Hosts compromised. Both times it practically gave the attackers the highest access. In one case it basically hid where the attack came from (compromised logs and all). In another it let them hijack an admin's password by reading his sudo.
IMHE, Bastion Hosts suck.
If you are forced to use one, send logs to a safer one-way storage encrypted and put tampering triggers everywhere you can in the Bastion Host. Also make sure you log outgoing connections. And make sure you can easily match incoming to outgoing.
If you absolutely have to use sudo on the Bastion Host force it to OTP only. Or if absolutely not possible, use 2FA, but this is a risk as something somewhere might not be properly protected and the password will leak. But the better way would be to have the bastion host run on some read-only image and not letting it upgrade or do any admin task at all. Maybe even remove admin users, SSH, the whole lot.
And related, do not have a single account with god-like access to everything. Isolate permissions. This is probably the hardest to get OK'd but it's the classic SPOF where they got you by the balls.
I agree, any security standards you're going to apply to a bastion host, just apply them to your entire network if possible, add security at every layer. So many times a bastion host just serves as a checkbox with added toil of jumping through a host. I despise them for the most part.
Having seen how bastion hosts or “jump boxes” work inside the enterprise I share your view. In practice they are generally not very well protected and are a very attractive target for attackers. It’s better to use a privileged session manager or regular ssh with mfa and ideally some type of identity proofing.
I can see that you can get a lot of things wrong with a bastion host, but if implemented sensibly, it should just be one more layer of a defense-in-depth strategy. What would you recommend instead of a bastion host?
> What would you recommend instead of a bastion host?
The question isn't to replace, but to remove. If you apply the same security to the actual hosts (which you probably should anyway) then why have an intermediary?
It does not seem to be mentioned here, but my #1 hardening suggestion is to install the Tripwire IDS (Intrusion Detection System). It is probably the best thing you could ever do for yourself as a system administrator. It integrity-checks the entire file system. If anything happens to your system that you didn't authorize, you're notified of it immediately. After the initial install it is important to minimize and exclude false positives so that you end up with a system that rarely changes in ways you don't expect or can't at least explain.
Another really useful tool is logwatch.
I actually caught an intruder this way hijacking my system several years ago. They removed rkhunter, chkrootkit and a variety of log files. And, modified lines in the last logged in users log. But, a combination of logwatch and tripwire caught it.
I personally use OSSEC for File Integrity Monitoring. And it has also actually caught an intruder that modified some PHP-code on a webserver. The attacker forgot to use the prefix @ in the PHP-code so a new error message was sent to the logfile and reported by OSSEC.
The premise sounds iffy ("SSH bastion hosts are an indispensable security enforcement stack for secure infrastructure access").
Every time you build some infrastructure, you expend scarce resources like engineering effort (=opportunity cost), time, and money, and you add complexity by hanging more moving parts on your Christmas tree of technology. You should always critically evaluate what the lowest-hanging fruit is that you can invest in for a given end goal (e.g. improving security), considering the complexity costs. SSH bastions can be worth implementing in some situations, but in many cases they are not top of the list.
The next sentence starts talking about "security compliance standards" - you sometimes have to submit to doing stuff for reasons of ticking boxes, but it's important to remember when you're doing what's best for security and when you're going through motions mainly to tick boxes for someone else.
Good writeup. One thing I would add for bastions if you wanted to harden them would be to disable session multiplexing if you are using MFA/2FA.
MaxSessions 1
The default is 10. The plus side of multiplexing is that subsequent connections using the same ssh connection channels are not validated against the authorization mechanisms such as login or 2FA. This reduces friction and speeds up the login process because login is not actually occurring. The trade-off of multiplexing is that all subsequent logins using that ssh connection are not logged nor are they validated with MFA. This means a person phishing your team members can easily hijack their connections without needing a password or 2FA and there are no lastlog entries. SSH Session multiplexing combined with passwordless sudo makes taking over a company trivial even if they have 2FA and strong passwords.
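For anyone who has not seen it, this is roughly what multiplexing looks like from the client side. It is a sketch with a placeholder host name; the options themselves are standard ssh_config directives. Once the first connection is authenticated, later `ssh` invocations to the same host reuse its socket and skip authentication, which is the behavior `MaxSessions 1` on the server is meant to shut off.

```
# ~/.ssh/config on the client (host name is hypothetical)
Host bastion.example.com
    # Reuse an existing authenticated connection when one is available.
    ControlMaster auto
    # Socket that later sessions (or anyone who can access it) attach to.
    ControlPath ~/.ssh/cm-%r@%h:%p
    # Keep the master connection open for 10 minutes after the last session.
    ControlPersist 10m
```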
Another risk with a bastion model is port forwarding. As an organization you have to decide what is appropriate for that bastion. Unrestricted forwarding? Restricted? Denied?
AllowAgentForwarding no
AllowTcpForwarding yes
PermitOpen 192.168.1.2:22
If this bastion is for a PCI environment then one may want tighter restrictions. If it is for a development environment then maybe fewer restrictions and just better auditing on each host to enable forensic remediation.
If your bastion is also used for automation to drop files into a staging area, you can limit that automation to file transfers and even limit what it may do with files. This prevents the automation from having a shell or performing port forwarding.
The keys should be outside of the home directories to prevent malicious tools from appending additional authorized_keys into the account. Make use of automation to manage key trusts and add a comment to keys to map them to an internal tracking system like Jira. This assumes your MFA/2FA is excluding specific accounts or groups via PAM and permitting the use of ssh keys with specific groups or accounts.
AuthorizedKeysFile /etc/ssh/keys/%u
Match Group sftpusers
Banner /etc/ssh/banner_sftp.txt
PubkeyAuthentication yes
PasswordAuthentication no
PermitEmptyPasswords no
GatewayPorts no
ChrootDirectory /data/sftphome/%u
ForceCommand internal-sftp -l DEBUG1 -f AUTHPRIV -P symlink,hardlink,fsync,rmdir,remove,rename,posix-rename
AllowTcpForwarding no
AllowAgentForwarding no
-P sets limits on what may not be done in sftp. -p does the inverse and limits what may be done. [1] -l DEBUG1 or VERBOSE will give you syslog entries of what commands were executed on the files. This is useful for audits. Some redundant settings above are also useful to set explicitly for audits.
Another thing mentioned in the article is iptables. In a PCI environment one may also want explicit outbound rules using the owner module to limit which users or groups are permitted to ssh out. So if your organization has a group of people allowed to use this host as a bastion, then one could write a rule like
Or specify which CIDR blocks, ports, and protocols may be used. You can use REJECT rules after this rule to make it obvious a connection was not allowed so that people do not spend hours debugging. This module is also handy for limiting which daemons may speak to your infrastructure. How strict or liberal the rule is depends entirely on the needs of your organization.
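As an illustration only of the kind of owner-module rule described above, here is a sketch in iptables-save syntax. The group name `bastion-users` is made up, and the rules assume outbound SSH on port 22.

```
# Allow outbound SSH only for processes running with the bastion-users group.
-A OUTPUT -p tcp --dport 22 -m owner --gid-owner bastion-users -j ACCEPT
# Everyone else gets an immediate, visible failure instead of a silent timeout.
-A OUTPUT -p tcp --dport 22 -j REJECT --reject-with tcp-reset
```

The owner match looks at the process that owns the socket, so it is worth checking how it treats supplementary groups on your system before relying on it.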
Lastly I would add that bastions should have as minimal an OS install as possible and have SELinux enforcing. Actions denied by SELinux should go to a security operations center after you spend some time tuning out the noise and false positives.
Yes, and I ran into it once at a huge telco: while I was building my bastion host in AWS, a security architect installed this and used Keycloak as the policy engine to allow connections using SSH keys. It worked really well, gave us very granular control over who could connect, and provided a great audit trail.
This variable can also be set in tmux and GNU screen. People usually figure out fairly quickly how to bypass the timer, but it is handy when people console into servers via the DRAC/iLO and forget to log out. Some shells don't do anything with TMOUT, so a bastion must only have vetted shells.
You are correct. This is widely copy/pasted bad advice and does the exact opposite of what the comment says.
It is not an idle timeout logout at all. Instead, it causes sshd to periodically send probes to the client. This has a couple of effects, most notably keeping TCP sessions "active" by frequently exchanging packets (this can be useful to keep connections through stateful firewalls alive if you are genuinely idle), and rapidly detecting and disconnecting a client that has actually gone away.
I think the origin of this incorrect description is the CIS documents. They have the exact same gross mistake in them.
I think the ClientAlive probes are useful and should be on, but it's definitely not an "idle logout" as claimed.
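To make the distinction concrete, here is a minimal sshd_config sketch. The values are my own illustrative choices, not ones taken from the article or the CIS documents. With these settings sshd sends a probe after 60 seconds without data from the client and drops the connection after 3 consecutive unanswered probes, so a client that has vanished is cleaned up after roughly 3 minutes, while an idle but reachable client stays connected.

```
# /etc/ssh/sshd_config
# Probe the client after 60 seconds of silence.
ClientAliveInterval 60
# Disconnect only after 3 probes in a row go unanswered.
ClientAliveCountMax 3
```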
A superset of the best practices in the article would be the CIS benchmarks. They are collectively agreed on by industry leaders and provide extensive resources that run the gamut of cloud, networking, and storage infrastructure.
I agree in general but there are a handful of edge cases which Google solved better with IAP: SSM can't forward ports to other hosts or any resource other than EC2. It's great for using SSH, SFTP, even tools like Ansible work fine, but if you need to get a port forward to something like RDS, a service in Fargate, etc. you'll need something else.
If you’re using - say - Debian all over your infra, introducing a whole new OS just for the bastions increases complexity without bringing any significant advantage.
The "how to grant and manage access to resources" issue is still unsolved in my opinion. There is a middle ground somewhere between raw bastions and managed access services or open VPNs that could be filled.
There are a few different players in this space, but the one to watch is Boundary by Hashicorp.
Basically managed authenticated proxy connections to any resource you could possibly need. Still young, so it's missing auditing and some of the convenience features, but give it a year and it will be a compelling open source competitor.
Teleport is great, but their centralized model is not suitable for all situations.. and the pricing (at least for kubernetes) leaves a lot to be desired.
There is also StrongDM, which is very similar with a better pricing model.
Newbie here to VPCs, bastion hosts, VPEs, etc. After spending a while on the topic, I have some questions that you might find redundant or be able to answer.
I am wondering why we need to configure as many steps as the article outlines, and in general. What is the point of Teleport in the first place? Why is there no managed service that takes care of all of that, with me focusing on just deploying an app and running it in the VPC?
Can’t 99% of the use cases be put in a template and managed by a service provider, including the security?
A lot of it is. But unless you stick to the big 3 cloud providers, you need this for bare metal / colocation server deployments, which also happen to be much cheaper.
Can someone explain to me the benefits of limiting the IPs that can SSH into the bastion? It seems to me the main thing that's protecting against are misconfigurations of SSH (accidentally letting root log in with no password or something) or a zero day in SSH but I'm not convinced by either.
The company I work for does it so that bastions hosted on some public cloud hosting service are only accessible from the company network or by machines connected to its VPN. We handle _very_ sensitive data, and some engineer screwing up the configuration for a bastion would be _very_ bad. Defence in depth is important.
Also adds defense-in-depth against stolen credentials -- it means an attacker can't just exfiltrate stolen SSH credentials to use sometime later from somewhere else on the Internet (or sell them / pass them along to a different specialist) -- the attacker either has to use them in-place, or break into some other machine that's also on the allow-list.
Could someone do me a solid and explain best security practices around bastion hosts and vpn?
e.g.
- would you still require users connected to the vpn to go through a bastion host?
- would you ever run bastion/vpn through the same box?
- are there preferred access use cases for each?
Yes, you would still have people connect to the bastion if they're on the VPN; part of the point of a bastion is to have a central place to monitor and control SSH access, which a VPN doesn't really do for you. Additionally, you will inevitably end up with team members who need access to the VPN (to reach staging and test versions of your applications, or to access customer support consoles) but don't get SSH access; a bastion gives you a standard configuration to apply to your fleet to ensure that "on the VPN" doesn't ever equate to "can log into a server".
You should generally do both things.
Wait, I should word that better. You should generally have both sets of controls: network access control with a VPN, and fine-grained, auditable SSH-level access control. I don't love the "Linux shell server" approach to providing those SSH controls.
Thanks for the response, that clears things up quite a bit. Would you create jump-boxes per environment or do you generally just have 1 with all the different service/env access logic?
It depends. It's more important to have some controls in place than to make super-complicated controls. Again: shell servers you SSH into to SSH out of are kind of an anti-pattern. See elsewhere on the thread about Teleport, which, combined with Tailscale, is I think a pretty good answer to these concerns.
I run an "internal" set of bastion hosts that are gateways into a system that runs telnet. This internal system is able to run SSH, but connections stop around 100 because of OS limits. We need to support 400-500 logins, and that has to be telnet. Everybody connecting has to go through these bastions, including VPN users.
I recently built an nspawn container with tinysshd server, with a .profile that execs telnet to the relevant system on login.
We had previously used an old version of Microfocus Reflections (terminal emulation) with stunnels deployed on all the clients and bastions. That was not containerized, but the server stunnels were set to chroot() on startup.
I recently was forced to support the latest version of Reflections, and since it doesn't support chacha-poly, I also built dropbear SSH server just for them. Reflections is very expensive (~$500/seat), and the best that it supports is aes256-ctr, using Tatu Ylonen's commercial ssh.com (which appears to be abandonware). I really hope we can get rid of that.
very nice writeup - one of the better ones i have seen. you can go a step further and eliminate open inbound port 22 (make the sshd server 'dark' to the network) with open source solutions like this:
disclosure: we build SaaS on top of OpenZiti (the open source) so are opinionated in this domain. and, to be clear, the above is just one layer...other layers of security still apply.
i generally end up liking what teleport is doing and what they are all about... i keep meaning to try their opensource stuff out. does teleport's sshd 'listen' on port 22 and does it need an opening in a firewall?
still, having sshd listen on localhost and not a public ip is pretty cool imo. Ken and I did exactly that on a stream one day https://youtu.be/oSlwZcwZcsU if anyone is interested. The one extra step one could take is to make sshd only accept connections from localhost by editing /etc/ssh/sshd_config and setting ListenAddress to only 127.0.0.1
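A sketch of that last step, assuming whatever overlay or tunnel you use terminates on the same machine and connects to sshd over loopback:

```
# /etc/ssh/sshd_config
# Accept connections on the IPv4 loopback interface only.
ListenAddress 127.0.0.1
# Add "ListenAddress ::1" as well if the local tunnel endpoint uses IPv6.
```

Restart sshd afterwards, and keep an existing session open while testing so a mistake does not lock you out.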
> does teleport's sshd 'listen' on port 22 and does it need an opening in a firewall?
Sorry, one of those crappy it depends answers. The teleport node agents, the agent running on the server you want a session on, can be configured to listen to inbound connections from the proxy (but doesn't use port 22 by default), or can be configured in a reverse tunneling mode where it does outbound dialing towards the Teleport proxy service. When using the reverse tunneling mode, you don't need inbound access to the end nodes, but still need the nodes to be able to make an outward connection to the Teleport infrastructure.
This is how the cloud hosted Teleport works as well. We can't be expected to have outbound network access to people's machines, so all the agents dial the cloud hosted proxies and set up reverse tunnels that are then used for the inbound connection requests.
In most setups though, the Teleport Proxies would then still have inbound connectivity and are meant to be internet facing, so a client can request an SSH or other session, but that single way into the environment can be hardened, layered with additional security, as the environment may require.
Note: I'm affiliated with Teleport, my comments are my own.
Maybe I missed it but did they cover logging all keystrokes entered by users over the bastion? (In the case where you need to log into it first vs merely doing port forwarding)
That would make a lot of sense if SOCKS5 proxies weren't commonly used for auditing and didn't provide much more transparency about what operations someone is performing on the internal systems.
Between the client and the SOCKS5 proxy? Of course using the SSH SOCKS proxy will encrypt data; I was rather thinking of a plain SOCKS5 proxy. Are there clients and servers supporting SOCKS-level encryption between the client and the proxy? I didn't see that possibility the last time I read the SOCKS standard (but it was a few years ago).