If you’re not using SSH certificates you’re doing SSH wrong (2019) (smallstep.com)
459 points by noyesno on March 24, 2022 | 336 comments



Rant engaged. As a person who feels responsible for ensuring what I build is secure, the security space feels inscrutably defeating. Is there a dummies guide, MOOC, cert, or other instructional material to better get a handle on all these things?

SSH keys make sense. But certificates? Is this OIDC, SAML, what? Is it unreasonable to request better and deeper "how to do {new security thing}" guides when PKI is a new acronym to someone? Where can I point my data science managers so they can understand the need and how to implement measures to have security on PII-laden dashboards? And so on.


My needs may be entirely different than yours and I don't want to downplay the importance of security, but...

Security writing in "engagement" obsessed media yields lots of people screaming "FIRE!" whenever seeing something even theoretically flammable, and bandwagoneers— already imagining their hair is on fire— lambasting everyone not immediately evacuating for being careless about fire safety. It reminds me of politicians being 'tough on crime'— they reflexively jump at opportunities to tighten the screws regardless of its necessity or efficacy. It's an emotional response involving self-image, peer pressure, and fashion rather than rational cost benefit analysis.

Perfect is the enemy of good. Attacking every theoretical threat like an international bank's network admin yields no practical benefit for most. Not nobody but most. If this TLA is new to me, there will be another new one that people will lambast me for not knowing in a couple of years, max.

For me, this problem was a better fit for the Wizard of Oz than a security education resource— what I really needed was the right frame of mind rather than learning the implementation details of every incremental certificate authority update.

I evaluate my attack surfaces and reduce them if I can, evaluate the real importance of keeping what I'm protecting secret, implement standard precautions and architecture to mitigate those risks, pay attention to the systems, pay attention to new vulnerabilities, and re-evaluate upon changes. The process is technology-agnostic and only requires you to deep-dive into the stuff you need to know, without feeling like you need a new certification every 6 months to run your company's CalDAV server.


Relevant article: How I learned to stop worrying (mostly) and love my threat model

https://arstechnica.com/information-technology/2017/07/how-i...


When you start dealing with hundreds of servers or more (perhaps it starts earlier at the high tens), you start looking at all things as trade-offs, and doing so yields interesting insights that aren't necessarily obvious when you're working at smaller levels.

What is the cost (in time and effort and manpower and complexity) to implement? What is the cost to maintain? What is the cost to manage, when you are adding and removing people often? What are the failure scenarios, when any one server that needs to manage things starts to become a liability for disaster recovery and redundancy purposes?

Sometimes the destination is clearly better than your current place, but the road to get there has a cost all its own that makes traveling it the non-optimal choice.

It's very easy for 1-3 admins to decide to implement something over 10-30 servers and keep themselves up to date and with the right access and knowledge to manage and maintain it. It's quite another thing when you're talking about hundreds of servers and you've implemented clear delineations about access and you have 10-20 admins ranging from junior to expert with associated levels of access to servers and tools, and the fleet of servers has evolved over years (or multiple decades, in some cases). Applying changes over that type of system can be complex and error prone and when it affects your ability to actually access and maintain the systems in question, it can be very hard to reason about the problems until you start encountering them. Change comes with risk, and risk assessment of technology becomes a large part of the planning requirements.


> evaluate the real importance of keeping what I'm protecting secret

Excellent. This is often ignored by the security obsessed, those people yelling FIRE! as you say.

Securing access to my cloud-hosted cat photos does not demand the same energy as securing ICBM launch codes.


I feel for you. Security is a complex, evolving topic, with a dizzying array of concepts.

At work, we develop Teleport (https://goteleport.com/) to provide a secure access solution that is also easy to use and hard to get wrong. (Note: you cannot truly have "hard to use" and "secure" access, because people will always develop "backdoors" that are easier to use but not secure.)

If you are interested in some accessible writing about security check out: https://goteleport.com/blog/

On SAML: https://goteleport.com/blog/how-saml-authentication-works/

On OIDC: https://goteleport.com/blog/how-oidc-authentication-works/

I can recommend the YouTube channel too: https://www.youtube.com/channel/UCmtTJaeEKYxCjfNGiijOyJw


Teleport seems like a genuinely cool product.

With that said, the company really needs to improve its interview process--my experience was downright terrible, and Glassdoor shows that other people had a similar experience.


I'm with you. I really like the concept of their product and would be interested in using it. I applied a while ago but bowed out during the phone screen. There were a couple strange things that came up during the short call but there was one that wasn't forgivable. The post was clearly for a rust developer but they were upfront that they don't have any rust and are primarily a go shop. He said they put rust in the job title because it helps attract smart, passionate people.

It really put me off. I’m not dead set on developing in any given language. I like rust and have been working with it for a while but that isn’t a deal breaker for me. The thing is that if our introduction starts off with dishonesty I don't have any reason to expect it to get better from there. What will they mislead me about after I’m hired?


Roles shouldn't be put up as Rust, but there is clearly Rust in the GitHub repository. It's not a lot, but my understanding is the usage is somewhat growing.

https://github.com/gravitational/teleport/search?l=rust


According to the git logs, this conversation happened about a year before those were added. We talked about this pretty point blank. It was made clear that while they might use Rust in the future and they had Rust fans internally, it was a Go position.


FWIW I had the same experience with Embark Studios. (Game Dev in Stockholm that prides themselves on doing gamedev in rust.)

Applied for a rust job. Got a Go coding assessment. Was told that the job was Go based.


Ah right, that sucks


Hey, I'm Sasha, CTO @ Teleport. I have designed our interview process and have described it here:

https://goteleport.com/blog/coding-challenge/

We are also trying to be as transparent as possible with our challenges being open source:

https://github.com/gravitational/careers/tree/main/challenge...

and requirements being published here:

https://github.com/gravitational/careers/blob/main/levels.pd...

I am sorry to hear that you had a bad experience. Our interview process is a trade-off and has one big downside - it may take more time and effort compared to classic interviews. It could also feel disappointing if the team does not vote in favor of the candidate's application.

However, if there was something else wrong with your experience and you are willing to share, please send me an email to sasha@goteleport.com.


Non-involved opinion here - it appears that a self-confident and clearly communicating C*O person is explaining exactly why the company is completely correct, while at least two actual non-company people show evidence of this not being the case. Isn't it common for self-assured execs to explain away all the objections of outsiders, despite evidence directly presented? Looks like it here. $0.02


bingo. CTOs should realise that job ads are screened by devs just as they attempt to screen for mini-me's and protect from dead weight.

Devs (who aren't desperate for riches) look at your company and think: how cr*p would it be to work there? Where are the indicators?


No specific criticism of the process was offered, so a general justification is warranted.

Personally I became interested in working for Teleport in large measure because the interview process tested my practical skills, rather than having me pull leetcode trivia out of my ass. I haven’t regretted my decision whatsoever, all of my engineering teammates here that I’ve worked directly with are very responsible and competent and the company appears to be growing mostly in the right directions.


I like Teleport. If you're doing work samples, why is your team voting in favor of applications? Part of the point of work samples is factoring out that kind of subjectivity.


That's a fair question. The team votes on specific aspects of implementation that can not be verified by running a program, for example:

* Error handling and code structure - whether the code handles errors well and has a clear, modular structure, or crashes on invalid inputs, or works but is all in one function.

* Communication - whether all PR comments have been acknowledged during the code review process and fixed.

Others, like whether the code uses a proper HTTPS setup and has authn, are more clear-cut.

However, you have a good point. I will chat to the team and see if we can reduce the amount of things that are subject to personal interpretation and see if we can replace them with auto checks going forward.


We're a work-sample culture here too, and one of the big concerns we have is asking people to do work-sample tests and then face a subjective interview. Too many companies have cargo-culted work-sample tests as just another hurdle in the standard interview loop, and everyone just knows that the whole game is about winning the interview loop, not about the homework assignments.

A rubric written in advance that would allow a single person to vet a work sample response mostly cures the problem you have right now. The red flag is the vote.


That's a fair concern. We don't have extra steps in the interview process; our team votes only on the submitted code. However, we did not spend as much time as we should have thinking about automating as many of those steps as possible.

For some challenges we wrote a public linter and tester, so folks can self-test and iterate before they submit the code:

https://github.com/gravitational/fakeiot

I'll go back and revise these with the team, thanks for the hint.


The good news is, if you've run this process a bunch of times with votes, you should have a lot of raw material from which to make a rubric, and then the only process change you need is "lose the vote, and instead randomly select someone to evaluate the rubric against the submission". Your process will get more efficient and more accurate at the same time, which isn't usually a win you get to have. :)


Disclaimer: I'm a Teleport employee, and participate in hiring for our SRE and tools folks.

> A rubric written in advance that would allow a single person to vet a work sample response mostly cures the problem you have right now. The red flag is the vote.

I argue the opposite: Not having multiple human opinions and a hiring discussion/vote/consensus is a red flag.

The one engineer vetting the submission may be reviewing it right before lunch or may have had a bad week, turning a hire into a no-hire. [1] Not a deal breaker in an iterated PR review game, but rough for a single-round hiring game. Beyond that, multiple samples from a population give data closer to the truth than any single sample.

There is also a humanist element related to current employees: Giving peers a role and voice in hiring builds trust, camaraderie, and empathy for candidates. When a new hire lands, I want peers to be invested and excited to see them.

If you treat hiring as a mechanical process, you'll hire machines. Great software isn't built by machines... (yet)

[1] https://en.wikipedia.org/wiki/Hungry_judge_effect


Disclaimer: this comment ticked me off a bit.

If you really, honestly believe that multiple human opinions and a consensus process is a requirement for hiring, I think you shouldn't be asking people to do work samples, because you're not serious about them. You're asking people to do work --- probably uncompensated --- to demonstrate their ability to solve problems. But then you're asking your team to override what the work sample says, mooting some (or all) of the work you asked candidates to do. This is why people hate work sample processes. It's why we go way out of our way not to have processes that work this way.

We've done group discussions about candidates before, too. But we do them to build a rubric, so that we can lock in a consistent set of guidelines about what technically qualifies a candidate. The goal of spending the effort (and inviting the nondeterminism and bias) of having a group process is to get to a point where you can stop doing that, so your engineering team learns, and locks in a consistent decision process --- so that you can then communicate that decision process to candidates and not have them worry if you're going to jerk them around because a cranky backend engineer forgets their coffee before the group vote.

I don't so much care whether you use consensus processes to evaluate "culture fit", beyond that I think "culture fit" is a terrible idea that mostly serves to ensure you're hiring people with the same opinion on Elden Ring vs. HFW. But if you're using consensus to judge a work sample, as was said upthread, I think you're misusing work samples.

You can also not hire people with work samples. We've hired people that way! There are people our team has worked with for years that we've picked up, and there are people we picked up for other reasons (like doing neat stuff with our platform). In none of these cases did we ever take a vote.

(If I had my way, we'd work sample everyone, if only to collect the data on how people we're confident about do against our rubric, so we can tune the rubric. But I'm just one person here.)

Finally: a rubric doesn't mean "scored by machines". I just got finished saying, you build a rubric so that a person can go evaluate it. I've never managed to get to a point where I could just run a script to make a decision, and I've never been tempted to try.

I'll add: I'm not just making this stuff up. This is how I've run hiring processes for about 12 years, not at crazy scale but "a dozen a year" easily? It's also how we hire at our current company. I object, strongly, to the idea that we have a culture of "machines", and not just because if they were machines I'd get my way more often in engineering debates. We have one of the best and most human cultures I've ever worked at here, and we reject the idea that a lack of team votes is a red flag.


Strongly agree with this, two key concepts in particular:

1. Using group discussion to make the principled rubric is incredibly respectful of everyone’s (employee and candidate) time, not just now but future time. Using the rubric is also unreasonably effective at getting clearer pictures of people quickly.

2. Systematic doesn’t mean automated, and hiring should aspire to be systematic to the point that it makes no difference who interviewed the candidate, and all the difference which candidate was interviewed.

I’ll add one …

3. If you have a rubric setting a consistent bar, share feedback with the candidate in real time (such as asking, ‘help me understand a choice you made that I might have done differently?’) as well as synthesized feedback at the end: “This is my takeaway, is it fair?”

Contrary to urban legend, this never got us sued. Every candidate, particularly those being told no, said it was refreshing to hear where they stood and appreciated the opportunity to revisit or clarify before leaving the room. The key is a non-judgmental, clear synthesis followed by, “Is that fair?”


You’re mistaken, we do have a rubric. All of the members of the interview team grade the interviewee according to the rubric, and the scores are then combined into “votes”.


That's good. I'm responding to "Not having multiple human opinions and a hiring discussion/vote/consensus is a red flag". I think having combined scores is an own-goal, but having people vote based on their opinions is something worse than that (if you're having people do work samples).


Thanks for replying!

Here's what I think it boils down to: working on a codebase with your coworkers is (or at least certainly should be) an inherently collaborative process. On the other hand, a job interview is, in a sense, inherently antagonistic. No matter what shape the interview takes, these people aren't your friends, they aren't your coworkers, they are gatekeepers.

I already have a job as a programmer. At work, I can push back on my coworkers and debate the merits of various designs until we all reach a consensus. But with the Teleport interview, there's an inherent power imbalance that makes that impossible: "I'd really like to argue about this, because I don't think I agree, but I'm afraid that will decrease the chances of them hiring me."

And the only people who are in a position to change this process are the ones who have already gotten through it successfully.


From my perspective you’re unfairly projecting bad faith onto Teleport and shooting yourself in the foot in the process.

1) You’re assuming that a good faith argument would decrease the chances of us hiring you, but for the most part that isn’t the case. We’re an engineering company building a complex security product — the only way that can be done well is via a culture that’s perennially open to criticism, debate, and going with the better argument. In my tenure at Teleport, I’ve never experienced explicit or implicit punishment for voicing my opinion, even when it contradicted a more senior engineer’s opinion. The argument has always been evaluated on its merits and the correct option taken. An interviewee making a good argument and proving an interviewer wrong should, and based on my experience would, increase your chances of being hired.

2) I can imagine you retorting that even if that’s truly the case at Teleport, there’s no way you could know that beforehand, and due to the “antagonistic” nature of us being the “gatekeepers”, you’re forced to assume the worst. But if your goal is to work in a collaborative environment where criticism and debate is tolerated, then your implicit strategy makes no sense. If Teleport is that type of place you’d like to work, then pushback in the interview process will be well received; if it isn’t, then you won’t even get an offer. So you have nothing to lose by giving your true opinion, but if you assume the worst and self censor in an attempt to brown nose the hiring team, you risk ending up in a shitty work environment that you were hoping to avoid.


Yep, imbalance, dynamics, so much to skew the process. If you think your interview process works, great, but likely it doesn't and you just get lucky. All the good people you screened out vs all the cruft you saved yourself from... you will never know!

Being a programmer isn't about what you know, it's about how you learn. Born programmers vs learned programmers: you got a coding test for that? Really? If you think you can screen for anything more than familiarity, you've been sniffing that corporate glue for too long.

If you come to me thinking I am suitable for a job, you reach out via LinkedIn, you see my public repos, then ask me to code for you on demand like a monkey?! Pull the other one!

(Not referencing OP; general comment on interview processes.)


I work at Smallstep

We are hiring and we have a non-terrible interview process (and amazing culture)!


Their pricing is bat shit crazy. Stay far, far away.


Sasha, CTO @ Teleport here.

I agree, our enterprise product is quite expensive. Let me explain why:

* We go through security audits by third-party agencies several times per year. We try to hire the best security agencies to audit our code, and it is quite expensive.

* We are recruiting globally and try to place our comp at the 90th+ percentile of compensation as listed on opencomp.com and other sources we have access to.

* Our sales process also takes time, and the sales team employs sales engineers, sales and customer success specialists to assist with deployments of such a critical piece of the infrastructure.

* For all our employees we have wellness benefits for home office improvement, personal development, healthcare packages.

All of these factors above add up and we charge a lot for building a quality security product supported 24/7 across the globe.

However, this might not work for everyone, and we have a completely free and open source version that people can use without ever talking to our sales team:

https://github.com/gravitational/teleport


Hey Sasha :) Price should be justified by value to the customer, not overhead costs of the company. Even though your value/benefits are listed on the site, this is a good opportunity to reiterate them.


It’s an intersection of those two things. Hawks can profitably prey on squirrels, while lions could not.

There’s room in the security market for $10/mo/user products and room for <whatever it is that Teleport charges>. If not, they’ll find out in an expensive and painful fashion…

Given that they have paying customers, their price is justified to at least those customers.


gk1 thanks, this is a valid point!

Teleport solves many quite important problems for our enterprise customers' infrastructure. Our users use Teleport to replace secrets and static keys with short-lived certificates, manage certificate authorities, add audit and compliance controls for access to critical data, and consolidate access for SSH, Kubernetes, databases, and desktops.


You have no idea how much money you are leaving on the table because of your insane pricing strategy. Your expenses do not scale with a customer's use. Amateur mistake.


I don’t follow this comment. The last time I engaged with Teleport’s sales team they quoted somewhere between $40-$80/host (server, VPS, etc). That seems like it would definitely scale with use.

Edit: per year. And there was a minimum order quantity.


Free, extremely capable open source version: https://goteleport.com/teleport/download/

You don't get support and some other things (see: https://goteleport.com/docs/enterprise/introduction/), but this is not a "demo" version where you cannot do actual work.

Kind of crazy indeed.


It's a security product that could be a huge productivity gainer.

All competitors I can think of are also expensive.


If for some bureaucratic reason you can't use SSH, which the industry has been quite happily using for 20+ years....


It's not about not using SSH, it's about:

* having an easy way to connect to all machines in environments where not everything is built the same way and on the same cloud or whatever. A big company can have a ton of teams building stuff across a variety of clouds and DCs. Not to mention those machines could be dynamic, so you need to add discovery. Heck, there might be Windows boxes here and there.

* having audit logs of who ran what command on which server and when

* extra security features like team management, MFA, etc.

You can do all that (minus audit logging) with SSH, sure, but it takes time and effort from the people who care least about those things (practitioners) rather than those who care most (security teams). Buying something like Teleport or Wallix or Boundary solves all those problems at once.


You don't need their paid product. The free (open source) version is excellent.


Is it really expensive? Their website lists a 14 day trial but I don't see any pricing, just links to "Contact Sales".


Wow, a pricing page with no numbers: https://goteleport.com/pricing/ Amazing


I share the dislike for “call us for pricing” model.

But in fairness there is a de facto number on this pricing page, and that’s zero. Their free open source plan.

So I give them a bit of credit for that.

It’s the companies that have no free tier or even an advertised monthly cost plan at all and just a “call us for pricing” that I find a real turn off (even in roles where I have been a potential “enterprise” customer). So I’d definitely draw a distinction between the two.


I work at Smallstep

here's one with numbers: https://smallstep.com/sso-ssh/pricing/#pricing


I previously contacted their sales to get a sense of pricing while evaluating options. Their enterprise pricing starts at $24,000. In the realm of business security products, that might not be overly expensive. I don't know what that translates to per user. I decided not to go beyond the initial email exchanges because their sales process, with excessively opaque pricing, gave me the same vibe as someone trying to sell me a timeshare.


As a current candidate, I'd be interested in hearing more about your interview experience.


I'd love to use this product in my organisation, but I don't want to self host, and it's really unclear what it would cost me.

Seeing an "enterprise, call for a quote" type tier makes me assume it's going to be too expensive for an agency securing 10-20 servers.


Disclaimer: I work at smallstep. https://smallstep.com/pricing/.

For our hosted product, you're looking at $30-$60/month.


This looks great btw - I'm not ready to move yet but this is in my plans.


I can surely recommend reading up on SSH certificates using the ssh-keygen manpage. No extra tools required.

I sign SSH certificates for all my keypairs on my client devices; the principal is set to my unix username and the expiry is a few weeks or months.

The servers have my CA set via TrustedUserCAKeys in sshd_config (see manpages). SSH into root is forbidden by default; I SSH into an account with my principal name and then sudo or doas.

My gain in all of this: I have n clients and m servers. Instead of having to maintain all keys for all clients on all servers, I now only need to maintain the certificate on each client individually. If I lose or forget about a client, its certificate runs out and becomes invalid.
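A minimal sketch of that workflow (file names and the four-week validity here are just illustrative):

    # one-time: create the CA keypair; keep ca_key offline or in a secrets store
    ssh-keygen -t ed25519 -f ca_key -C "user-ca"

    # per client: sign the client's existing public key; the principal matches
    # the unix account name, and the cert is valid for four weeks
    ssh-keygen -s ca_key -I "my-laptop" -n myuser -V +4w ~/.ssh/id_ed25519.pub
    # this writes ~/.ssh/id_ed25519-cert.pub next to the private key

    # per server, once, in /etc/ssh/sshd_config:
    #   TrustedUserCAKeys /etc/ssh/user_ca.pub
    # (the CA *public* key; authorized_keys can stay empty)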


Expiries are not protection against compromise.

Compromises happen in seconds, even milliseconds, and once they do, the attacker will establish persistence. Expiry systems do not and have never been protection against compromise. They're an auxiliary to revocation systems to let you keep revocation lists manageable.

If you don't have revocation lists, or your number of changes is small, you should go ahead and just set your credential expiries to whatever you want - infinity, 100 years, whatever - it won't make the slightest bit of difference.

Particularly in the case when they're protecting sudo user credentials, they're no defense at all.


Yeah, the lack of any mention of a CRL really stood out when reading this. I actually didn't know about SSH certificates until I saw this article (I always assumed that SSH did not support this), but I do run my own CA and authentication for internal web services, EAP-TLS, and VPN. The CRL is your first line of defense in the sense that it blocks the use of that credential instantly when it is revoked.

I will argue though that the use of a short expiry produces slightly better protection than no expiry at all. If an employee leaves the company (with no CRL in place) and their certs expire in 16 hours, then unless their credentials are stolen in that timeframe your systems are still safe.

Likewise, if a CRL is in place and credentials are stolen without you being aware of it, the expiry still provides a form of buffer if the stolen credentials end up being used after the cert expires. In this case the expiry would trigger before you realised that credentials were stolen and updated the CRL. Now yes compromises can happen in seconds, but that's not in every single case.

That being said, I definitely agree that the expiry is not a substitute for a CRL, and any certificate system should have revocation systems in place. In the end you really should have both a CRL and an expiry date if possible.


Rookie mistake: SSH has no CRL, it has a KRL.

And it's actually a separate thing, since it operates largely independently from the CA.

I have one in place. Used it once to terminate access for someone.


Rookie mistake: SSH's KRL is also a CRL. See KEY REVOCATION LISTS in ssh-keygen(1). You can revoke plain keys with it, but also revoke certs (both by serial number and identity) with it.

The infrastructure I built for access control using SSH certs used it. I know it works because I tested for it specifically.
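For reference, the moving parts are small (file names illustrative); this is straight out of ssh-keygen(1):

    # create a KRL, then add revocations to it in place
    ssh-keygen -k -f /etc/ssh/revoked_keys
    ssh-keygen -k -u -f /etc/ssh/revoked_keys departed_user.pub   # plain key
    ssh-keygen -k -u -f /etc/ssh/revoked_keys stolen_cert.pub     # certificate

    # test whether a key or cert has been revoked
    ssh-keygen -Q -f /etc/ssh/revoked_keys suspect.pub

    # and point sshd at it in sshd_config:
    #   RevokedKeys /etc/ssh/revoked_keys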


It sounds like you could be making the rookie mistake instead by not reading what he/she actually wrote.

> Yeah, the lack of any mention of a CRL really stood out when reading this. I actually didn't know about SSH certificates until I saw this article (I always assumed that SSH did not support this), but I do run my own CA and authentication for internal web services, EAP-TLS, and VPN. The CRL is your first line of defense in the sense that it blocks the use of that credential instantly when it is revoked.

This sounds like he/she is running an x509 CA. He/she is generating certs for various use-cases.

It is possible to use x509 certs with SSH of course, and so he/she could leverage his/her pre-existing CA for that function.

Given the above context, CRL is completely accurate, and KRL is not.


No, SSH CA certificates are in no way like OpenSSL (X.509) CA-issued ones.

The certificate formats diverged between the two ecosystems about a decade ago.


If you didn’t know about SSH certs, you shouldn’t be giving advice. You should study the fundamentals


I think you may also have missed the context that he/she used, as they described running an x509 CA first.

In an organizational context, many organizations are not going to jump to creating a novel CA type (SSH CA) when in fact regular x509 CAs are well known and the basis for much security, and many in regulated industries are using them already.

Additionally, given that he/she is running an x509 CA, telling someone with that experience to study the fundamentals is not very polite. It assumes the author of the comment is not educated, but the very description of his/her use-cases shows they are not simplistic ones.

Engineering is all about tradeoffs after all.


That’s a fantastic point. Mea culpa


Your pronoun thing makes your text painful to read.


... it genuinely pains you to read "he/she"?


That’s why I use “one”.


I just didn't want to assume gender, and didn't want to go through comment history in order to find it.


You would seem to have a very low pain threshold.


I'm not familiar with SSH certificates, but I do know the fundamentals of certificate-based authentication. If you don't have a way to revoke the cert, then the server will assume that your properly signed, unexpired certificate is valid. You will need some way to let the server know that the previously issued cert is not valid anymore.

This is how this type of authentication works, and the article did not address the important case of wanting to revoke a user's credentials.


To connect back on my rant -- isn't it amazing the disparity of thoughts around security best practices? How does someone who knows next to nothing become a reliable security professional if even the security professionals disagree on fundamentals?


The fundamentals are that you need roughly more than 80 bits of complexity in your keys (an attack cost on the order of 2^80). Adding some padding to that is a good idea because some algorithms can be simplified in theory (for instance AES-128 is actually simplified down to something like ~118 already through known math).

This is for symmetric encryption; for asymmetric the equivalent is ~1024 bits, so padding it up to 2048 bits is generally the "minimum" for RSA, and some of that math is advancing too, so bumping it to 4096 bits isn't a bad idea. If you want to be quantum-proof, RSA will be broken (as will EC, both via Shor's algorithm), so you would need a post-quantum scheme. AES would be halved (Grover's algorithm gives an O(sqrt(N)) search), so AES-128 becomes the equivalent of AES-64; if you want to be quantum-proof there you need to jump up to AES-256 (unless you are using XTS/tweak mode, in which case AES-512). Keep in mind quantum is also not exactly practical to accomplish short term at the moment.

You can use whatever technology to accomplish that complexity, be it passwords, SSH keys, or SSH certs. Anything else is just technology architecture noise. Passwords absolutely can clear the ~80-bit threshold. It's just about bytes, and how you store them.

Nobody is going to be brute forcing a sufficiently complex password over the network anytime soon unless it isn't actually random but some default password that looks random.
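Back-of-the-envelope arithmetic for that threshold, bits ≈ length × log2(alphabet size):

    awk 'BEGIN {
      printf "16-char alphanumeric password: ~%.0f bits\n", 16 * log(62)   / log(2)   # ~95
      printf "10-word diceware passphrase:   ~%.0f bits\n", 10 * log(7776) / log(2)   # ~129
    }'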

Just look at the title of this post: "If you're not using SSH certificates you're doing SSH wrong". It's completely devoid of any consideration of environment issues, user issues, or datacenter issues, and reeks of elitism. There is no "one true way", despite people's insistence that they are the arbiters of truth. I keep reading here that "you should just use serial over network instead of SSH!" but fail to read about how those serial-over-network connections are usually less secure than SSH itself.

Best practices guides have gone off the rails. They are generally good guidelines, but you have to make sure you are taking into account your own environment and user needs and take them with a grain of salt. Learn for yourself, and read raw facts from real cryptographers and people in the field. Don't take best practices guides as absolute truth, but learn from them.

How does one become a security professional? Maybe not with one of those "become a security professional in 30 minutes" packages, followed by starting a blog about how everyone isn't conforming to their tiny worldview. No matter what, it'll take >10 years of actual experience, just like any profession. One has to start from the bottom and make their way up. Most environments are too complicated for any "one size fits all" solution:

https://xkcd.com/927/

EDIT: Further discussion on this here is interesting. The top comments go all in on SSH certificates, then down the line people start questioning why passwords are bad in the same ways. A lot of the "SSL certificate" push theorized here from their perspective seems to come from VPN providers that need it from lesser skilled clients/users (think, people who bought VPNs off YouTube video recommendations):

https://arstechnica.com/information-technology/2022/02/after...


> You can use whatever technology to accomplish that complexity, be it passwords, SSH keys, or SSH certs. Anything else is just technology architecture noise. Passwords absolutely can clear the ~80-bit threshold. It's just about bytes, and how you store them.

I always try to assume breach in my thought processes, but I recognize that this can lead to overengineered solutions, because sometimes the mitigation is not worth the cost.

> Just look at the title of this post: "If you're not using SSH certificates you're doing SSH wrong". It's just completely devoid of environment issues, user issues, datacenter issues, and reeks of elitism.

I think this is an excellent point you make. There are a few different ways to use SSH securely and I probably lean a little towards the x509 and other alternatives, given the established base of x509 within my industry.

I don't use SSH certificates at work because they really don't make sense for me when I am using a strong credential already (HSMs)

> There is no "one true way" despite people's insistence that they are the arbiters of truth. I keep reading here about "you should just use serial over network instead of SSH!" but fail to read about how those serial over network connections are usually less secure than SSH itself. Best practices guides have gone off the rails. They are generally good guidelines, but you have to make sure you are taking into account your own environment and user needs and take them with a grain of salt. Learn for yourself, and read raw facts from real cryptographers and people in the field. Don't take best practices guides as absolute truth, but learn from them.

These are some other seasoned points you make.

I like to think about "Security Objectives". In most cases what I am concerned about is whether something is secure from a confidentiality or integrity perspective. But since I also deal with an ICS/SCADA community, their context is completely driven by "Availability as Paramount", defined performance within an acceptable range being next, and only after that do the other objectives come into play.

However, given the varying use-cases of machine, mobile, app, connectivity basis or lack thereof (internet, transient, air-gap, etc) and the limitations of each, sometimes a smorgasbord of solutions is needed to satisfy within constraints.

> How does one become a security professional? Maybe not with one of those "become a security professional in 30 minutes" packages then start a blog about how everyone isn't conforming to their tiny worldview. No matter what it'll take >10 years with actual experience, just like any profession. One has to start from the bottom and make their way up. Most environments are too complicated for any "one size fits all" solution:

Appreciate the words of wisdom.

I view security as having much in common with other rapidly evolving fields of expertise. The generalists became specialists, who are now becoming sub-specialties, adding fellowships, etc. When I was a young force-sensitive I had the good fortune to fall in with the right community in which to collaborate.

My opinion is that many of the security communities are among the most welcoming, diverse, and inviting folks around.


> I always try to assume breach in my thought processes, but I recognize that this can lead to overengineered solutions, because sometimes the mitigation is not worth the cost.

I agree with this mindset, I do the same. But at the same time, yes you do have to realize that sometimes it's not worth it. For instance, there are two types of attack you might encounter, a strong nation-state and a drive-by botnet using known exploits and weak passwords to grab the low hanging fruit. If you are patched and using strong passwords, you aren't going to be affected by the drive-by botnet. If you are patched and using MFA and whatever strong credentials, a zero-day sat on by a nation-state is going to plow through anyway. Then they have gotten into that outer ring as a user and you are trying to protect against privilege escalation. Most things to protect against that here that are actually going to work are going to be strong process control or integrity checking (Windows), or Mandatory Access control systems (SELinux), or just basic user silo-ing and not running things as privileged accounts (either one). Most of that is going to be on the OS design itself or architecture of the process.

So we go to privilege escalation exploits. Take this year, at time of writing this is March. I have been patching nothing but privilege escalation flaws on Linux machines (I don't admin Windows, so I don't know that landscape) all year in 2022. It's only been three months. There's no short supply of them being discovered, and many of them are mildly, moderately, or entirely mitigated by just using SELinux. Some of them go all the way past it, though, so sometimes it can be futile.

So the nation-state threat in almost any case will likely have the ability to jump right past the zero-day to root level. So what about in-between? Well, learning about attack and if you are stockpiling or developing zero-days, those tend to add up quick or you just get locked out entirely because they get patched. Your skills also ramp up pretty quickly, too, as an exploit hunter. So you either develop a strong foothold or you fall out of the criminal world entirely. I'm sure it's probably the most paranoia-driven and stressful "job" to have while you are striving not to completely fall apart and get locked out due to defense ramping up or locked up (not that trying not to get hacked isn't paranoia-driven enough).

I also want to emphasize, you REALLY don't want to get compromised AT ALL at this point. Patching is probably the best way to do that, and the most important step. The reason being, you can't necessarily prove that you have kicked out the user after you think you have unless you just completely wiped the machine, and even then you have no idea if they got as far as a firmware exploit (in the instance of a nation-state), which is the more terrifying exploits that are being discovered and sought after.

But regardless, if you find out that you've been compromised and you're using a random password, you're going to change that password anyway if you are doing things right.

> I don't use SSH certificates at work because they really don't make sense for me when I am using a strong credential already (HSMs)

And that's a great point, too. HSMs are a great way to secure SSH as it is, and use the same or similar cryptography as SSH certs as long as they are well developed.

What comes to mind for me for a complicated environment where SSH certs don't help is that there might be inter-organizational issues where you have to make a connection work over multiple crazy hops. So for instance, an end-user's laptop has to connect to Citrix from home, then RDP into a local machine in organization A, then over an existing IPSEC tunnel use OpenVPN software to VPN into organization B, then SSH into a server in organization B. Organization B just did things using OpenVPN, and then SSH, but the rest had to be tacked on due to the client's environment. Real world example. So, the best usage in this case was for organization B to use Yubikeys in OTP mode to type the AES signed secrets typed as a keyboard through the multiple connections. Organization B had no control over organization A's infrastructure or ability to tell them to stop doing anything the way they were doing it, but had to consider the security implications of the way they had set their systems up anyway because the "client" was working in this environment. Then there was the issue of training the users, and explaining SSH certs OR keys to them would have been impossible. Telling them to hit a button was hard enough.

I've heard much crazier stories from the military involving piping encrypted sessions over satellite and jumping it over cable connections, etc (including patching live Super Bowl feeds over serial connections for officers which are always fun stories, especially when dealing with legal copyright issues involving the government in the 80s and fudging reasoning), but there are just some things when you are involved with multiple organizations or multiple connections or inter-organization or international things that you just can't control every single detail of. This is going to get more and more complicated as remote-work gets adopted more as well, so these old stories of network insanity are extremely useful for application level connectivity for sysadmins now.

Long story short, sometimes that thing you think is engineered terribly has a reason for it. Usually it involves stupid logistical nightmares, weird requirements, or bureaucratic/legal hopping. It's only going to get worse, too.


I'm not sure I understand the point here: are you saying that a CRL is an effective protection against compromise? If so, how exactly does that work?


If I'm not using a device for a long time, it ceases to be an authorized client. This is what I want.


`ssh-keygen` #Certificates: https://man7.org/linux/man-pages/man1/ssh-keygen.1.html#CERT...

"DevSec SSH Baseline" ssh_spec.rb, sshd_spec.rb https://github.com/dev-sec/ssh-baseline/blob/master/controls...

"SLIP-0039: Shamir's Secret-Sharing for Mnemonic Codes" https://github.com/satoshilabs/slips/blob/master/slip-0039.m...

> Shamir's secret-sharing provides a better mechanism for backing up secrets by distributing custodianship among a number of trusted parties in a manner that can prevent loss even if one or a few of those parties become compromised.

> However, the lack of SSS standardization to date presents a risk of being unable to perform secret recovery in the future should the tooling change. Therefore, we propose standardizing SSS so that SLIP-0039 compatible implementations will be interoperable.


Now you have a centralized single point of failure. While the ease of use is inherently obvious with the implementation, if/when it does fail you will have to fall back to public key/password auth anyways.


Centralized single points of control are a basic goal of corpsec. They trade availability for security. The alternative model of individual SSH keys is theoretically more highly available, but has many single points of security failure.


Please enlighten me on the ‘many single points of security failure.’


Which failure mode do you mean? The CA is accessible via offline means. I can walk to it and sign me a new keypair.


What happens when the building the CA is in burns down?


The CA is in a gpg-encrypted secrets store (pass) and has a password on itself, so it can be backed up like normal data to an off-site location.


Scan printed QR codes of your private key that you had backed up off-site.


Ideally, k-of-n key shards, stored in safety deposit boxes.


That's actually pretty brilliant.


Provided you keep said papers away from prying cameras in a verifiable way, that is.

For more inspiration, check out the Glacier Protocol.

https://glacierprotocol.org/


Thanks for the heads up!

I wish I'd thought about this when playing with bitcoin a few months after launch and amassing an integer value larger than zero. That wallet died with the hard drive.


Please tell me you still have the hard drive. There’s a chance for recovery, and I have some experience in this area if you want some tips. Step 0 is always keep your drives for future recovery attempts.


It was dumped many, many years ago while BTC was still a novelty, paying for a pizza in the thousands of BTC. I went to see if I still had a backup of the wallet during a USD:BTC spike a few years back, and it was gone.

Life goes on, even when sad things happen :(


Think of it this way: by starving the supply of that one bitcoin, you have contributed in some small way to the eventual loss of all bitcoins through similar events - speeding up the rate at which the world can move on from this silly fad.


ddrescue may be of interest if you still have the disk.

That's `dd` for broken disks. It keeps a log of data it couldn't read, and can keep trying to read it indefinitely, it even supports a save state and can resume trying again later.

I've recovered filesystems from several failed disks using it. It's not fast though!
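A typical two-pass run looks something like this (device name obviously illustrative):

    # pass 1: copy everything that reads easily, record bad areas in a map file
    ddrescue -n /dev/sdX disk.img rescue.map
    # pass 2: go back for the bad areas with direct access and a few retries
    ddrescue -d -r3 /dev/sdX disk.img rescue.map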


The extreme version of this is using an HSM, and putting one in a safe deposit box.


It's not so extreme, you have to trust the HSM manufacturer.

Try generating randomness using casino-grade dice, and xor-ing it with the HSM. Maybe then.


Now I'm wondering who's managed to pull off supply chain attacks on dice, since I'm sure it's happened already.


Also, this doesn’t apply to most real scenarios (especially not “how I run my personal stuff” type scenarios), but is a fun one to contemplate: what happens when your customer has requirements that specify all keys (including root signing keys) to be rotated at a certain point in the future? Having a process for this is an interesting challenge.


The CA is a key, not a network service.


Sign with two or three CAs, and have sshd accept any of them.
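TrustedUserCAKeys takes one CA public key per line, so the rotation can look roughly like this (keys abbreviated):

    # /etc/ssh/user_ca.pub -- sshd accepts certs signed by any key listed here
    ssh-ed25519 AAAA...old... user-ca-2021
    ssh-ed25519 AAAA...new... user-ca-2022

    # sshd_config:
    #   TrustedUserCAKeys /etc/ssh/user_ca.pub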


> SSH keys make sense. But certificates?

I am equally mystified... Never understood how involving a possibly malicious third party can make communication more trustworthy.

But then again, I was also sure when I first heard about it that public key cryptography was obviously impossible. You just could not have secret communication when everything is out in the open! Is there any simple explanation that we ignorant people can read about certificates to get an "aha! insight" moment? For the case of public key cryptography, the moment where everything snapped together was when I read the mathematical description of the Diffie-Hellman key exchange [0].

I'm not interested in how to do certificates with ssh, but on what problem do certificates solve, exactly.

[0] https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exc...


People talk too much about "certificates" as a good thing, but they're just a means to an end. The two major goals you're trying to solve with SSH:

(1) You want all of your authentication to route through a single point of control where you can enforce group-based access control, MFA, onboarding/offboarding, and audit logging.

(2) You want the actual secrets that allow you access to an SSH server not to live for a long time on anyone's laptop, because it is effectively impossible to ensure that, on a sufficiently large engineering team, nobody's laptop will ever get compromised; there's just too many of them, and developers do weird shit so the machines can't be ruthlessly locked down. You want people to have SSH login secrets for exactly as long as they need them for a specific server, and no longer.

Certificates solve the problem of having dynamic access control to SSH servers without having some weird system that is constantly replacing authorized_keys on all your servers; instead, there's a single root of trust on all the SSH servers (the CA public key) and a single place that mints valid certificates that enforces all the stuff I mentioned in (1) above.

It's worth knowing here that SSH certificates are nothing like X.509 certs; they're far simpler, and you could bang out an implementation of them yourself in a couple hours if you wanted.


Using certificates does not provide anything useful for a small private network, with a single administrator, or where all the users are trusted.

On the other hand, they are useful for large organizations, with needs for differentiated access rights and management rights.

The centralized control over the certification authority allows the delegation of restricted rights to other levels of network administration.


A certificate is just a formatted list of attributes that has been signed by a particular private key. Username, UID, GID, membership in this, privilege for that, good-after date and good-until date, for instance.

Everybody knows the public key associated with that private key, so you can verify that the private key did sign this list of attributes.

An ssh keypair is an actual public/private keypair, but a certificate is just signed and encoded (but not encrypted) formatted data.

If an ssh daemon has knowledge of a public key used to sign a cert, and has been instructed to trust that cert, and all the dates are good, then the ssh daemon can accept that cert as proof of identity and allow a login.
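You can see that attribute list for yourself on any signed cert:

    # prints key type, Key ID, Signing CA, serial, validity window,
    # principals, critical options, and extensions
    ssh-keygen -L -f ~/.ssh/id_ed25519-cert.pub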


I understand what you mean, thanks for the explanation.

But why would you want to do that? What problem does it solve? Just that you can connect without having a private key yourself? This doesn't sound very safe.


You still need your own private key plus the certificate.

I have n clients, m servers.

On clients, I sign the local keypair with the CA key and log in via certificate. The client-side certificate basically replaces the line in the server-side authorized_keys. The editing stays local.

On servers, I register the CA key as "certificates signed by this keypair are trustable"; the authorized_keys file stays empty. No further editing required.

During normal daywork, the CA key sits unused and can be shut away.

Key Advantage: I don't need to edit anything on the countless servers anymore.
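On the client side nothing special is needed; ssh offers the cert automatically when it sits next to the key, or you can name it explicitly (host pattern illustrative):

    # ~/.ssh/id_ed25519           private key, never leaves the client
    # ~/.ssh/id_ed25519-cert.pub  certificate signed by the CA

    # optional, in ~/.ssh/config:
    Host *.internal.example
        IdentityFile    ~/.ssh/id_ed25519
        CertificateFile ~/.ssh/id_ed25519-cert.pub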


Because the keys aren't directly coupled to server configurations, but rather indirected through a CA which hosts the only durable key, those "private keys" users have to have can be extremely short-lived, and tailored for each individual access request.

I think people really get into trouble with SSH certificates trying to reason about the properties of certificates versus SSH keys versus passwords. The format isn't the point; making the endpoint keys dynamic is. If you built a secure messaging system that propagated one-time-use SSH keys, it would address the same problem. Nobody will, because certificates are easier and already work, but you could.


The common way of managing ssh keys involves having some central entity that somehow updates the authorized_keys on all relevant hosts, which involves interaction with all the hosts which is somehow triggered by interaction with the user requesting access. With ssh certificates the central trusted node only interacts with the user (by signing the certificate) and does not have to update anything anywhere else.


It partly solves managing authorized_keys files. If you have a team, separate keys can be difficult to manage. Shared keys are even worse. Certs can help with this if you properly manage the cert signing server (like HashiCorp Vault). All of that is currently free and open source. You can also now have short expiry times if desired.


One example also mentioned in the article:

If you connect to an SSH server for the first time, ssh will give you a warning and let you know that you have to verify the fingerprint of the host key.

This becomes annoying when you connect to many different servers and I would not trust everyone (including me) to do this check correctly every single time.

SSH certificates solve this by having the SSH host key be signed in a way that your SSH client can verify, and you only have to add a key-signing key to your known_hosts once.

Now you have to sign the SSH host key, but you only have to do it once per server, as opposed to each user having to do it locally on every first connect.
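Roughly, the host-side flow (names illustrative): sign the host key with a host CA, have sshd present the cert, and trust the CA once on each client.

    # on the CA machine: sign the server's host key (-h marks it as a host cert)
    ssh-keygen -s host_ca_key -h -I "web01" -n web01.internal.example -V +52w \
        /etc/ssh/ssh_host_ed25519_key.pub

    # on the server, in sshd_config:
    #   HostCertificate /etc/ssh/ssh_host_ed25519_key-cert.pub

    # on each client, one known_hosts line replaces per-host fingerprints:
    #   @cert-authority *.internal.example ssh-ed25519 AAAA...host-ca...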



This is a (quite convoluted) explanation of public key cryptography, which I already understand. My question was about certificates.


For public key cryptography my go-to analogy for non-technical people has always been the mailing-padlocks example (where the padlock is the public key, and the key to unlock it is the private key that stays with the sender).


I would say PKI (and especially the associated X509 standards) is by far the least understood (or most misunderstood) part of actually building secure stuff.

It would be nice if there was a dummies guide but I'm not really aware of one. Doesn't help that most of "how to PKI" on the web amounts to a bunch of unexplained cryptic openssl CLI incantations.


I recommend Security without Obscurity: A Guide to PKI Operations by W. Clay Epstein and Bulletproof TLS and PKI by Ivan Ristić.

I started working in this space a year ago (I'm on a project deploying zero trust networking at a large company) and these books have been invaluable.

https://www.amazon.com/gp/product/036765864X

https://www.feistyduck.com/books/bulletproof-tls-and-pki/


It is a combination of two things. X.509 is arguably an overly complex ASN.1/X.500 thing, but that is not the main issue.

Main issue is that most people do not even grasp the concept of a certificate (ie. binding of public key to some additional information that is signed by some other entity).


Also, it is a moving space. Browsers don't accept a single certificate for a site anymore; you also have to have it signed by a CA. You can create such a certificate yourself too, but as of today you will need at least two certificates for browsers to fully accept a TLS-secured connection. It hasn't been that long since that rule has been in place.

So it isn't only the technicalities of asymmetric encryption; there is also specific behavior of applications that use certificates to prove identities.


Practical Cryptography by Bruce Schneier and Niels Ferguson is decent in that it gives a good lay of the land without diving too deep into the mathematical rigor. The first half explains at a high level the concepts of encryption, key exchange, asymmetric encryption, and digital signatures, and lays out the problem statement that PKI solves.

It's nice in that it will list out a bunch of available encryption algorithms or hash algorithms, but at the end of the chapter say "Just use this one, it's considered safe right now." i.e. AES256 and SHA256.

Unfortunately, it mostly avoids the practical steps of web security; it's not going to print out the command to type into your shell to generate an SSL signing certificate. So I wouldn't recommend it if you're looking for an immediately practical book to help you secure your web server. But it orients you to the landscape so you have a general idea of what you're trying to achieve, and can google yourself the rest of the way there.


If they're willing to read a book on security design, I would recommend Security Engineering, 3rd Edition [0]. It includes a broad survey of what matters in the security space (rather than just cryptography), and generally in sufficient depth to understand how we may build secure platforms in the face of adversity.

Also, many of the chapters are available to read for free; see the author's text under the cover photo.

[0]: https://www.cl.cam.ac.uk/~rja14/book.html


I feel this is the exact right thing for me right now -- people trusted in industry. I can follow tutorials and documentation. The part where a concept is explained is often missing and can be guessed at (albeit often wrongly).

I'll look into this and perhaps supplement with some good tutorials for my developers and data scientists. I appreciate your input!


I don't think Practical Cryptography is going to give you much of an intuition about why this article is advocating for certificates.


I spent an afternoon implementing a SSH cert service in Go using the standard "x/crypto/ssh" package, with little to no prior knowledge of SSH internals.

SaaS like Smallstep and Teleport are trying to middleman and monetize what is actually a simple process that more developers should be comfortable implementing themselves.

This isn't "rolling your own crypto", this is standard SSH key stuff plus a bit more nuance and LOC to make things more secure.

When you pay for those services, you are essentially paying for a wrapped SSH command + dead-simple web app + someone to be your CA (read: store the resulting files of `ssh-keygen`, hopefully securely). And all the potential headaches of relying on yet another SaaS.

Plenty of developers are comfortable writing a script, a simple web app, and securely storing a file on the webserver. This is all that is required to build SSH cert support into your internal apps/tools, plus an afternoon understanding how CAs work (in short, CA private key can sign any SSH public key, then that SSH public key can be validated by anyone holding the CA public key, no TOFU required).
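
For reference, the core of that signing step boils down to what ssh-keygen already does from the command line; a minimal sketch (CA file name, identity, principal, and validity period are made up):

  # CA signs a user's public key, producing id_ed25519-cert.pub next to it
  ssh-keygen -s user_ca -I alice@example.com -n alice -V +1h ~/.ssh/id_ed25519.pub

  # sshd_config on the servers: trust any cert signed by that CA
  TrustedUserCAKeys /etc/ssh/user_ca.pub

A service like the one described presumably just performs that same signing step programmatically (the x/crypto/ssh package exposes a Certificate type with a SignCert method) and hands the resulting certificate back to the client.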


I have the same feeling, and it motivated me to recently purchase this "Bulletproof TLS and PKI" [0]

I haven't read it yet, so I'm posting in hope of someone else giving a quick review.

https://www.feistyduck.com/books/bulletproof-tls-and-pki/


What is the scale you are operating on?

If you have 10-50 servers and 5-10 people working on those, SSH keys are definitely good enough; it might be a bit of a hassle to manage keys, but it's quite OK.

If you go into large-corporation territory, with more than 100 servers and more than 50 tech people that need to log in to those servers, you will probably have already found out that there are other options, and you will probably have to run your own internal CA (certificate authority).

If your org grows, you will probably have a CTO and other technical people who have the experience and knowledge to implement things differently.


Check out Teleport. It abstracts away the certificate bit and manages it for you. You run 'tsh login' once and you get a cert good for 12 hours (then you get access to all the Teleport resources you are allowed to use, whether that is ssh server access, db access, Kubernetes access, etc.). I am evaluating the product now and am quite impressed.

https://goteleport.com/


It's not unreasonable, but we need some kind of universal knowledge base for tech stuff. Security is just one of many inscrutable topics in tech where you need weeks of research to understand the best practices.


Scalable and secure access with SSH @FB https://news.ycombinator.com/item?id=12482212


This blog post seemed eminently understandable to me, at least as someone aware of public key, but not certificate based authentication.


I feel that trying to make SSH keys short-lived is becoming more painful each year because there's an increase of tools that use SSH keys for purposes other than SSH logins. For example, age [1] encrypts files with SSH keys, agenix [2] does secrets management with it, Git can now sign commits with it [3], and even ssh-keygen can now sign arbitrary data [4]. All of these become useless the moment you start using short-lived keys.

[1]: https://github.com/FiloSottile/age

[2]: https://github.com/ryantm/agenix

[3]: https://calebhearth.com/sign-git-with-ssh

[4]: https://www.man7.org/linux/man-pages/man1/ssh-keygen.1.html
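
For example, signing and verifying arbitrary data with an ordinary long-lived key ([4]) looks roughly like this in recent OpenSSH versions (the key path, namespace, and signer identity are placeholders):

  # sign a file with an SSH key (writes data.txt.sig)
  ssh-keygen -Y sign -f ~/.ssh/id_ed25519 -n file data.txt

  # verify against an allowed_signers list ("alice@example.com ssh-ed25519 AAAA...")
  ssh-keygen -Y verify -f allowed_signers -I alice@example.com -n file -s data.txt.sig < data.txt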


Umm, please correct me if I'm wrong, but I think you're confusing SSH keys with SSH certificates. An SSH client key can be reused to create short-lived SSH client certificates. You can keep using that SSH client key to encrypt data, sign data, log in to GitHub, etc. There's no such thing as "short-lived keys"; there are short-lived SSH certificates.


Yes, a cert is just a public key that's been "stamped" by a certificate authority (CA), allowing it to be validated by servers holding the CA public key (as well as enforcing other policies like lifespan and principals). It is a totally separate file and does not modify the original public or private key, which indeed have no notion of lifespan.

If you are constantly regenerating uncompromised SSH keys, you are probably doing something wrong.

The GP is misleading in this way.


> If you are constantly regenerating uncompromised SSH keys, you are probably doing something wrong.

Yup, there's no real reason to generate a new SSH key pair each and every time you want to get a short lived certificate. The SSO or CA management system (like Vault) is responsible for verifying your identity.


This is also true for X.509: there is exactly zero reason to generate a new key (if it was not compromised) or even a new CSR for certificate renewal. Yet people tend to do this, which only increases the opportunities to fuck something up in the process. (Well, it does not make much sense, but over the last year I have seen at least three instances of somebody overwriting the only copy of a newly generated private key with the old one…)


> This is also true for X.509, there is exactly zero reason to generate new key (if it was not compromised) or even new CSR for certificate renewal.

It might make sense to regenerate the private key on each certificate renewal if the private keys are kept unencrypted, as they often are in the X.509 scenario. If the keys are encrypted and if your web server manages to get the password to decrypt that key each and every time it serves a request, then yeah, I don't see the point of regenerating the private key on certificate renewal.


One of the first things the article mentions is rekeying. Since the step utility does in fact regenerate keys when obtaining certificates, it actually does have a lifespan. Besides, how do you even tell that your keys have been compromised?


> One of the first things the article mentions is rekeying. Since the step utility does in fact regenerate keys when obtaining certificates, it actually does have a lifespan.

I went through the article again. Essentially, rekeying makes sense if the private keys in question, whether they are host private keys or client private keys, are kept unencrypted on disk. Host private keys typically are, so it might make sense to rekey host private keys. However, if your user private key is kept encrypted on disk, as it should be, there isn't really a good reason to rekey.

The step tool seems to abstract that process, and it also generates a new key pair on each login, but that keypair never even touches the disk, according to the article. This makes sense assuming the step tool generates a key pair and doesn't encrypt the private key. In that case, yes, rotating/regenerating the client keypair on each login makes sense.


Those are choices made by the Smallstep SaaS and are not reflective of the underlying SSH cert technology.


The thing is, the topic of the article centers around the step utility and security practices that it considers is the best. My comments were in relation to the article, not all the possible ways in which SSH certificates can be used.


Yes, but the entire point of the described setup is to get rid of traditional long-lasting keys in favor of ephemeral certificates (which I believe is another way of saying signed keys) obtained through SSO. Signing certificates with your existing keys kind of makes the whole point moot.


I doubt we're getting rid of traditional long lasting keys anytime soon. They do have their uses and it'd be a waste not to use them.

> Signing certificates with your existing keys kind of make the whole point moot.

You mean getting signed certificates from an SSH CA with an existing client public key? Why does it make the whole point moot? The SSO is responsible for verifying your identity. Rather, it seems pointless to generate a new SSH keypair each and every time you want to get a certificate and log in to a machine. You can certainly do it if you want, but I don't see the point. You can keep using your existing SSH public keys to get short-lived certificates and log in to a machine, do your job, log out, and repeat the process with the same keys.


When you sign short term certificates with your existing keys, you're still using your existing keys, managed in the exact same way, to authenticate. The certificate would just be another layer of indirection. I fail to see how that would be a meaningful change.

One of the primary benefits of the described setup is that there would only be a single long term key. It would be managed more securely because it won't be lying around on each user's personal machines.


> The certificate would just be another layer of indirection. I fail to see how that would be a meaningful change.

That certificate would only be issued after you've gone through SSO or some other certificate management process and it would have a defined expiry period, maybe even as short as a few minutes. It is a meaningful change.

When using key based authentication, as long as you had your public key in the authorized_keys file on the server, you would be able to login without issues, even if your keypair is compromised. With certificates, even if your keypair is compromised, you wouldn't be able to login to that server because you'd have to compromise the SSO/MFA authentication step as well, which adds another layer of meaningful security in the process.


I think we're in full agreement here. My prior comment was about using your existing keys as the CA.


Oh, I didn't realize you were talking about not using your SSH keypair as the SSH CA keypair. Well, that's kinda expected. The SSH CA keypair is the single point of failure in this case, which sounds worse than it is but it's okay to have such a thing because of the benefits, so it should be protected and isolated.


Different strokes for different folks.

Different keys for uhh, different complex application use cases.


This article was very illuminating. And, I wish it was written like this:

  To really secure your SSH server, do these three things:
  1. Setup a trusted authority
  2. Use ssh certificates
  3. With ssh certs, you can easily add MFA; logins expire until you reauth.
  Now that we've established this, here are the gory details.
  Lorem ipsum, lorem ipsum, lorem ipsum.

It's a great narrative, but I wish I knew the big payout in advance.


MFA for ssh means you can't automate cluster-level ops. Unless, well, you automate the MFA, which defeats the MFA.

This drives me a bit nuts about security people in the age of cloudscale. They assume you don't mind MFA'ing every hour and are manually doing logins and accesses for everything. Yeah, uh, I need to script orchestration on several hundred machines at once, and orchestrate/access on the scale of hours or even days for some things like "Big Data" backups or restores.

If certificates are anything like SSL certs and the horrorshow cli tools / options / management involved in those, no thanks. I'd rather have an automated sshkey switchover, or for stateless just routinely cycle the infrastructure with new keys.

It's been a long-time tenet of security that you want an open algorithm that gets broadly and publicly challenged so you know it's secure. Well, in the age of state actors, this might not be the whole truth.

I think layering some kludgy not-invented-here obfuscation atop the more battle-tested methods is a useful and important deterrent/delay. Sure, someone will figure it out, but they have to TRY HARD. A lot of the institutional attacks seem to be based on human attacks on standardized systems, which HAVE to allow human access and vectors.

So in AWS land, secrets manager is secure unless you get the permissions. Then you have everything. But if each of those secrets has some whacko obfuscation for the various apps, then that is a big slowdown to the human attack vectors.

And the state actors? Well, even they have budgets. They'll probably move on to easier targets. If a state actor is motivated at targeting your company in particular, well, given that they'll have malware in the firmware of your hard drives and motherboards and the like, you're probably helpless.

Finally, what really bothers me about most security is that it leaves out one of the most important "canary in the coal mine" aspects of security: honeypots. Sure, do the diligence on securing the access, but how about some turnkey approaches for setting up honeypots to detect when people are poking around? Honeypots are perfect for that, because the devs only care about the stuff they are working on. The intruders are doing the scanning.


Cluster-level ops really shouldn't be done by direct SSH. Tools like ansible, salt, chef, puppet, etc... all can be run in some form of daemon mode with a central management server for a reason. You should be authenticating against the management service and running your configuration or automation scripts from there, not as a massive pool of CSSH or whatever directly from your laptop.

Over the last 6 years I've worked for three different companies that all universally disabled ssh and we never had troubles running management scripts or tooling.


Stateful databases or disposable api servers?

Disposable API servers you can eliminate ssh access and the like. Containers generally don't run sshd, but they do still often have kubectl/dockerrun if you absolutely need to.

SSM sucks on a certain level because the output is capped at ?1MB? I think and you need to poll S3 to use it.

Salt daemon polls fine.

As you kind of alluded to, well, you can do SSM to "port knock" or simply do a change to a secgrp to flip on the ssh access, and then flip it off afterward.

I find Salt/Ansible/k8s too big, you can't "step through" to debug your orchestrations. I would categorize them as "heavyweight". SSH is a good substrate for everything else, including adhoc stuff.

Anything that can be run off a laptop can be run off an admin server. Remotely debugged. I love it. With the heavyweight stuff there is too much trial and error: try recipe, it craps, guess what's wrong, it craps, guess again, it craps.

The stuff I use actually doesn't require SSH. As long as you can deliver a command and get the output, I can use SSH, SSM (with the crappy limitations but its GREAT for stuff in China), kubectl, salt, dockerrun, teleport (until the token runs out), or combo ssh-to-bastion then do other stuff.


Until the daemon breaks and someone has to get in with SSH anyway? Chef is notorious for doing this. Or at least it was before we got rid of it. Some random script would break and then the run would be incomplete. And we couldn't just fix cookbooks as the run wouldn't complete.

Some deployments make the daemon approach (that phones home) difficult. Such as management in a corporate network. It's easy to configure AWS and the like to accept requests from well known corporate gateways. It's not as easy to make them from the outside the corporate network in. And even when that's doable, different cloud providers and regions make it difficult. You end up having a bunch of chef (or similar) servers scattered around.


In the rare case this happens we still don't use raw SSH. We rely on something identity-driven like SSM in AWS or IAP in GCP to initiate the tunnel.


GCP IAP sounds like Teleport, which we've already run into issues with since the Teleport daemon will die/not accept connections in some situations, while the good ol sshd does. Like: full disks, memory stress, or (I think) the teleport daemon getting killed.

SSM sounds like an advanced port knock. Or you could toggle the security group port access, or keep the bastion down and spin it up if you need it.


You mean like Thinkst Canary?

Or OpenCanary if you can't afford $arm&leg?


I am not a security consultant, but I've never heard of either.

Which says something... but then again if the security group is doing its job with setting up canaries, why would peon dev like me even be awares?


I agree, something like that. From what I've experienced it's (for me) a common problem in that realm:

a lot of blah-blah that makes my eyes glaze/lose focus before getting to the core/overview.

~15 years ago, when the company I'm working for first implemented PKI, I needed something like 100 hours of "help" from local security engineers and the external software-vendor's programmers to understand how that worked and what the SW was supposed to do. After that, explaining to colleagues at least on a high level how that works became a matter of minutes.

To be fair towards myself, even the local security engineering gurus (relaxed, as they themselves didn't have to "deliver" anything) and the external vendor's programming gurus (hardcore, as nothing would be paid if the SW could not be implemented) were often absolutely not understanding each other, so I ended up becoming their unofficial mediator/translator:

if I felt that one of the parties didn't dare to ask an obvious question, I sacrificed my self-esteem and dared to ask it; if "even I" managed to understand some concept, then both parties were supposed to get it as well, and if not, at least I could act as a gateway to explain/expand offline :P

It was an interesting time - not nice nor very bad (a little bit bad, as we had of course an implementation deadline), but at least I learned a lot, as well in the area of social skills :)


Is that what I get out of the box using tailscale for ssh?

https://tailscale.com/kb/1009/protect-ssh-servers/


Two examples of things Tailscale doesn't give you for this usage model that SSH CAs can:

* Transcript-level audit trails for what people are actually doing on SSH sessions.

* Differential access to different groups of users to the same machines.

Tailscale and SSH CAs work together nicely: require membership in the right Tailscale group to talk to SSH at all, thus tying access to SSH to your (e.g.) Google login and MFA requirement, and use something like Teleport for the actual SSH login, to get the audit log, group access, and an additional authentication factor.


Tailscale (and other similar solutions) works on the network level. This is not a bad idea in itself, but SSH certs operate on the application level.

The fact you can ping the server shouldn't mean you are allowed to actually access it.


Meh, you're not wrong for skipping SSH certs. They're mature security, but not mature enough for everyone. And like everything else, they break, predictably and unpredictably. If you're not ready for someone to emergency-access their way into prod to fix a broken SSH issue on Christmas morning, you're not ready for SSH certs. Maintenance will get you, one way or another...


Is there something wrong with installing client public keys in the server and preventing password based logins?

Seems more secure than handing over the security of your servers to a CA?

Managing the authorized_keys file is trivial and I like simple file-based solutions.

I find simpler is often more secure because its easier to understand.

I have had plenty of ssh servers on the Internet even, (on non-22 ports to prevent logs filling) and not had a problem yet AFAIK.

I have also been on call when certs expired gawd knows how many times.


I'm not sure I buy it, but, TFA tries to explain:

Situation: A person's machine dies and takes the SSH private key with it

No Certs: They now need to get IT to distribute the public keys to a set of hosts, where the full list is possibly not known by any one person, so for the next week it will be "oh shit, I forgot to ask IT to put the pub key on server-12, so I'll open a ticket and not get any work done for the next few hours"

With Certs: They get IT to sign the new key.


> They now need to get IT to distribute the public keys to a set of hosts, where the full list is possibly not known by any one person

If the team responsible for server security isn't able to identify all the servers they're responsible for, you've got far bigger problems than SSH certificates are going to solve.


You could just get your authorized keys via an `AuthorizedKeysCommand`. It will make it slower to load but even something as simple as `AuthorizedKeysCommand aws s3 cp s3://myprivatebucket/authorized_keys - | grep USER` will do the trick. Then revocation is just deletion from the file and adding is just addition to the file.
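
One wrinkle: sshd runs AuthorizedKeysCommand directly rather than through a shell, so a pipeline like that usually lives in a small wrapper script. A sketch (the bucket name is made up; %u passes the username in as $1):

  #!/bin/sh
  # /usr/local/bin/fetch-authorized-keys -- $1 is the username sshd passes in
  aws s3 cp s3://myprivatebucket/authorized_keys - | grep "$1"

  # sshd_config
  AuthorizedKeysCommand /usr/local/bin/fetch-authorized-keys %u
  AuthorizedKeysCommandUser nobody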


"Situation: A person's machine dies and takes the SSH private key with it"

Not sure I buy that; people take backups. No reason why a CA can't lose its private key if we're presuming backups are not being taken, and then everyone is affected.


If the IT team is treating servers like pets, then this is a problem. If they aren't amateurs, then it's not a problem to update the script and push the configuration out to all of the servers.


The second part is only kind of accurate, because usually the "sign a new key" part is automated as part of some other authorization flow. For instance you can log into your corporate SSO provider, and that can actually provision a short-term (i.e. 1 hour) SSH key that is signed by the CA for you. That short-term key is then used to shell into hosts, and you periodically renew either the key or the SSO session token. There's a bunch of points in the design space you can do to make this transparent. Normally it's all wrapped up in some command that invokes ssh for you; so you just say 'corp-auth ssh username@host' instead of just 'ssh username@host' and it all "Just Works" if you're lucky.

You can also bake some basic rules/policies into the certificate, i.e. the above flow returns a signed certificate that is only valid for SSH'ing into hosts in the 'www' group (not the 'database' group.) So then operations people can add you to groups in some other place (LDAP, Exchange, whatever) and when they add you to the 'database' group, any future short-term SSH keys issued by the central authority allow you to now shell into the 'database' hosts. The hosts themselves can also have some automation to check these permissions are accurate when an SSH login request occurs. So you just file an IT ticket asking to shell into a host or group of hosts, they fiddle with some knobs somewhere, and a few minutes later you're done.
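
A rough sketch of how that group scoping can be expressed with stock OpenSSH (the CA name, principal names, account name, and paths are assumptions, not necessarily how any particular shop does it):

  # issue a short-lived cert whose principal encodes group membership
  ssh-keygen -s user_ca -I alice -n group-www -V +1h ~/.ssh/id_ed25519.pub

  # sshd_config on the www hosts
  TrustedUserCAKeys /etc/ssh/user_ca.pub
  AuthorizedPrincipalsFile /etc/ssh/principals/%u

  # /etc/ssh/principals/deploy on a www host lists the principals allowed
  # to log in as the "deploy" account:
  group-www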

So the actual second part is more like "Get a new laptop and log back in through Okta" or whatever, and you have all the exact same permissions and whatnot you did before. It's easier for both the user and the administrators.

An immediate advantage of this is that keys are easy to immediately audit and revoke (because a central authority issues them, so you have a trusted trail). But there are some more subtle advantages; one, for example, is that you make servers more homogeneous and identical. They all just ship some specific sshd_config file, normally identical, rather than each server potentially having a different authorized_keys file; instead, the central authority that issues keys is where the mapping of hosts/authorized users lives, rather than each server having that knowledge "individually" in its own file. That's easier to understand, track, keep up to date (who has access where?) etc. You can just go modify a single global user and have the permissions flow downwards.

None of this really matters if you only have like 1 or 2 people doing everything, or it's your home network, but once you get above like, 5-10 people, or you have at least one sysadmin, it's actually really useful IMO. Also, to some extent, you can mix and match various parts of the above concepts (e.g. running code on every login request, to add extra authorization checks based on out-of-band information.) You have to decide what's appropriate for you. Read the ssh and sshd man pages and you'd be surprised at what you can do.

Source: I basically implemented my own SSH certificate authority infrastructure (server/client automation) "for fun."


I guess I don't see how this is better than using LDAP or AD authentication on each host?

Also, users tend to hate having to always relogin to SSO; maybe that's because the implementations have poor UX, and maybe there's no secure way around it.


Sure, you can use LDAP or AD or any other number of things to control server authentication as well as mapping some global database of user IDs to accounts. You could also do other things like combine this with a 2FA solution like Duo.

One thing SSH certificates certainly have going for them is that they're actually easy to script and integrate with, and "piecewise" migrate to, in my experience, while using a flow you already are pretty familiar with. I personally didn't use any sort of LDAP or AD setup to back my design; you can implement a custom backend for all this pretty easily yourself. There's nothing inherently confusing about the concept of cryptographic certificate authorities or anything, anymore than public key cryptography itself. It's a relatively natural extension of the SSH design you know already, is my point. Again, the man page is worth reading to understand it all a bit better.

> Also, users tend to hate having to always relogin to SSO; maybe that's because the implementations have poor UX, and maybe there's no secure way around it.

Well, I'll be honest, people who tend to use SSH and would be impacted by this stuff tend to hate lots of things and not always for good reasons. Put another way, listening to developers or whatever about what they hate and what's actually good isn't something I would factor into something like this. SSO is mandatory for very good reasons at any reasonable scale (and by "reasonable" my opinion is you should have it in place at, like, 10+ people.)

Anyway, besides that: there's nothing in theory that prevents you from doing something specific like having the backend refresh the SSO token issued for your SSH certificates every time you log into some server, up to some given interval; e.g. logging in at least once a day seems reasonable, but if you log in every 5 minutes to a new set of hosts you can refresh the token.

In my case the flow was something like 'my-ssh-ca-wrapper ssh user@bar', which would ask you for a token. I would then get this token by visiting a little webpage I wrote, but in theory it could also just launch the browser itself with xdg-open with a direct link. I just use a password manager to fill out those "SSO" credentials. It isn't ideal or fully integrated but in practice it would only take a few seconds and it's similar enough to corporate SSO setups. But yes, polish is everything for those final few steps. The actual backbone is pretty straightforward, though.


The SSO flow is way too complex for something that must work on an emergency.


> This makes it operationally challenging to reuse host names. If prod01.example.com has a hardware failure, and it’s replaced with a new host using the same name, host key verification failures will ensue.

A new machine should just have a new name. If one really wants to pretend that it's the old one, they'd better really copy it, including the keys. But even skipping that, sorting this out doesn't seem like a big deal (at least at a small scale; I suspect the article makes more sense in some scenarios than in others).

> Curiously, OpenSSH chooses to soft-fail with an easily bypassed prompt when the key isn’t known (TOFU), but hard-fails with a much scarier and harder to bypass error when there’s a mismatch.

Seems to me like a sensible behaviour for TOFU, not sure what's curious about it. Sounds like it implies that an unknown key is at least as bad as a different-than-known key, but that sounds wrong in context of TOFU.

> Once the user completes SSO, a bearer token (e.g., an OIDC identity token) is returned to the login utility. The utility generates a new key pair and requests a signed certificate from the CA, using the bearer token to authenticate and authorize the certificate request.

So the weakest point will likely be the SSO and the related infrastructure, instead of SSH and actual keys, and you'll probably depend on third-party services and/or custom/uncommon self-hosted infrastructure. Likely with a SPOF too. Doesn't sound good in general.

It probably does make sense in some organizations, but this particular setup doesn't seem to apply to all SSH uses, and to justify the title.


Again, the alternative to SSO's single SPOF is many SPOFs for each individual engineer with access.


Or you could automate key distribution and revocation instead of giving away the keys to your little kingdom to Google.

(Let's not pretend that "SSO" doesn't mean "let Google or Microsoft handle password storage for me".)


No:

(1) Authentication bypass to your email service is already game-over for almost every serious company.

(2) Google (if that's what you're using) has better MFA and access control than what you're going to roll yourself.

(3) Having multiple sources of truth for authentication is a corpsec nightmare, and companies that don't have that invariably wind up accidentally persisting access for departed team members or contractors, and, worse, have no single place to consult for a reliable catalog of who has access to what, which is why if you poll CISOs at large-ish tech companies, they'll universally tell you that one of the first 5 things they did when they took over was get SSO stood up.

(4) The "automated key distribution and revocation" system you roll yourself will be jankier and less safe than the certificate-based systems that already exist.

(5) Because that automated key distribution and revocation system does not in fact exist, what you're really saying is that you're going to live with developers having long-lived keys on their laptops.

If you don't trust Google, set up Shibboleth or something; the Google stuff is a sideshow. But the idea that you should manage SSH authentication separately from the rest of your authentication is pretty unserious. I spent about 4 years, recently, parachuting into dozens of mid-sized startups, all of them clueful, and except for the teams that had SSO-linked SSH access, SSH management was invariably a total nightmare. The "just manage SSH directly" approach is, empirically, a failed model.


I'm sure it's different for large megacorps, but if you have less than 100 devs then the single point of failure in your SSO scheme is a far bigger security and operational risk than having long-lived keys on some dev's laptop.

> the rest of your authentication

What is "the rest of your authentication" in this context? Corporate email? As far as I know, SSH is the only real authentication possible here.


How so? Are you picturing that alternative as regular records in ~/.ssh/authorized_keys, or something else?


It depends on the scale. If a company has a handful of hosts, I'd argue that deploying the full AAA and PKI systems to back cert auth is doing it wrong.

Traditional ssh-key auth is simple and reliable; it's not until you have a large, complex and diverse user base that you need something more. That's why the huge FAANG sites use it. Every org doesn't need to mimic FAANG.


Exactly!

I don't understand this obsession with enterprise-level security on home networks and hobby projects. If you think it's fun and educational to set up, then you're doing it for fun and education, not security. If you're doing it for security, you're basically setting up anti aircraft guns to do what a drone jammer could do with way less resources spent.


Anti-aircraft guns can't actually take out most drones.

Good analogy.


I'm using SSH certificates to manage a few nodes in my homelab and it's a pleasure to not have to deal with managing the known_hosts file on my clients and authorized_keys file on my servers. There's only 1 line in my known_hosts for my nodes and authorized_keys doesn't even exist on any of my servers. If I add a new node to my homelab, I don't have to make any changes in known_hosts or authorized_keys in the existing nodes and it's easy to bootstrap the same known_hosts and sshd_config that I use everywhere in the new node.

SSH keys would make managing these few nodes a lot more complex than it is.


You really don't need to be anywhere close to mega-scale to benefit from SSH certificates and integrated authentication flows, though. Even at the scale of "only" 10 people with SSH access, the whole system can be massively simplified and made more secure by integrating centralized logins, and SSH certificates are rather perfect for this.

I implemented my own SSH certificate authority myself more or less, and while it's overkill for my own homelab-level stuff, I absolutely would never use anything else once I have more than like, 5 people logging into some set of machines. The benefits of centralized SSH access control that you can freely integrate (and pretty easily too, thanks to OpenSSH!) with your existing identity provider is really nice.


True, and while the title is somewhat clickbaity, I think your point was pretty clear in the article.


I still don't care about any of this nonsense for my personal stuff when I can avoid it. Passwords all day for me. 0 security incidents in my lifetime.

Sucks that Github and some other things force SSH keys which are just passwords except always saved to your disk so that anyone who steals your laptop gets access.

It adds insult to injury when you try to capitulate to this malarkey, generate a key in PuTTY's key generator, and then GitHub whines that the default setting isn't overkill enough and you have to make a whole NEW key with some other setting. I miss the good old days.


> Sucks that Github and some other things force SSH keys which are just passwords except always saved to your disk so that anyone who steals your laptop gets access.

This is the reason to encrypt ssh private keys with a passphrase. If the key is leaked it's still protected by the password.

It's a built-in feature of ssh. For an existing key downloaded from a cloud provider, use ssh-keygen -p to add/change the passphrase.
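
Concretely (the key path is assumed):

  # add or change the passphrase on an existing private key
  ssh-keygen -p -f ~/.ssh/id_ed25519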


How hard is it to crack these passwords? Since it's local (unlike the github password) I guess you can run brute force attacks etc at full speed ?


Yes it's local, but also can be taken away to run on a cluster. Looks like ssh-keygen is using 16 rounds of bcrypt_pbkdf. My laptop just took 185ms to try a password. So I guess I could run less than 10 passwords per second (per core?).

I don't keep an ssh key on disk though. I use my gpg key on my hardware security token, which gives you 3 attempts before you have to unblock it with a separate management password, which again you get 3 attempts at before the key is entirely locked.


The longer the better. A memorable sentence is a good place to start.

ssh-agent will cache the passphrase in memory, which helps avoid needing to type in a long phrase repeatedly.

But it's worth saying that if any private key is leaked (passphrase or not), it's time to revoke it and generate a new one.

Having a passphrase in place raises the bar from "key leaked, 3rd party has access to everything" to "key leaked, 3rd party has to now attempt to crack the passphrase". It mitigates a very bad scenario and buys time.


Pass phrase not password. You are going for length to protect from brute force.

Soylent Green is NOT people

Was one of my shorter pass phrases


I'm sceptical about the entropy of easy to remember pass phrases, including negations and simple capitalizations. Even when going for something like "correct horse battery staple", which requires a memorization technique to remember, the space of words we are realistically drawing from when prompted by a shell is probably not that large.


That’s what diceware is for: https://theworld.com/~reinhold/diceware.html


How many times a day did you type that wrong!


That's going to depend on the length of your password. Longer is more entropy and orders of magnitude more difficult to 'brute force' with each character added.


Yes. This is precisely why passphrases are a bad idea - people tend to use their easy-to-remember default password, which gets compromised along the way if an attacker can get their hands on the key file and throw their full processing power at it.

SSH certificates are a solution to that problem.


What mechanism is used to protect the key of the certificate authority?


That’s a different situation - the CA key resides on some high security server, not a developer laptop that may get stolen or compromised by ordinary usage.


sure, but that's why you're using a password manager that lets you generate 24 character mixed everything random passwords and use them easily, right? Right? Guys?


I miss the good old days before passwords were a thing and everyone just trusted others to behave ;-)

You can secure access to Github (and other places) with hardware keys, e.g. from https://www.yubico.com/.


But you add a password to the key, so it's the same.

And not everyone saves it to disk. My ssh key is my gpg key. It's stored on a yubikey and can't ever leave it. If I do a `git pull` then my yubikey flashes and I have to tap it to allow that connection to happen. Steal my yubikey, well you can't unlock it. Hack my laptop and you can't tap the key.


> I still don't care about any of this nonsense for my personal stuff when I can avoid it. Passwords all day for me. 0 security incidents in my lifetime.

Even for personal stuff, why would you want to use passwords? Keys are more secure AND more convenient. Sure you don't need certificates but I don't understand how keys are more 'nonsensical' than passwords.

Keys have more flexibility, you can use SSH Agent, you can do SSH agent forwarding, etc.

> except always saved to your disk so that anyone who steals your laptop gets access.

This is wrong. First of all, your laptop should have disk encryption. Always. I don't care what your threat model is, encrypt the disk. Second, SSH keys can (and SHOULD) have a passphrase.


Ever done the math on the amount of time spent entering passwords? I have. Stopped using passwords for personal stuff.

I multiplied by the cost per minute of downtime in professional support, and the cost of typing passwords was more than my yearly wage.


This contains several misconceptions.

Keys are not just "passwords saved to disk". My private keys exist in hardware, on Yubikeys. They aren't on disk. The hardware requires authentication to access.

You're typing your passwords zillions of times. I log in to my system once, authenticate to my HSM once, and I can access many hosts via scripts and automated tools. This is impossible to do securely with password auth.

You're also training yourself to manually input the entirety of your authentication credential multiple times per day (or hour). This is bad practice, as anyone stealing it then has the keys (ha) to your kingdom (and they have way more opportunities to steal it!). Even if you just replace password auth with a password-protected key on disk, and don't use a password-caching agent that holds the decrypted key in ram (as would be typical), so that you're still typing your password each and every authentication, you've raised the bar substantially because someone would need to steal your encrypted key from disk in addition to obtaining your password.

Then there's the issue of cycling credentials, and the mental loads involved. I can cycle my keys without changing my workflow or having to type anything differently.

Passwords are not good authentication tools. Use actual cryptography.


not if you encrypt your ssh keys, which is what everyone I know does - then your potentially weak password requires physical access to exploit while things accessible over the internet can't really be brute forced.

further, this is even more convenient when paired with an ssh-agent that will securely hold your private key in memory and not allow anyone to export that key...you could dump the memory but that would require root access, which again should be password protected


My servers expect an environment variable to be sent too, (new sshd and ssh can do this). This gives basic 2fa without typing anything.


The github change also messed up my workflow, which involves pulling/cloning my company's git repo from lots of machines, many of them being short lived or disposable. Now I have to save the password forced on me in a file because I'm unable to memorize it easily and that made our setup less secure. Thanks github...


Passwords are safe if you can memorize them. It is not too hard in my opinion. I also think they should always be an option for any kind of auth. Maybe I want to authenticate against a system but I don't want others to know my ID. For that use case a password is the better solution.


The SSH servers that I'm familiar with are spun up with a host cert, so all of the FUD in this article about connecting to an unknown host is a non-issue. Check that the host cert matches the one you expect once, and the tooling makes sure to notify you if it changes.

As far as provisioning, maintaining a secure CA signing practice is a nightmare. It's K8S level of self-inflicted pain for a startup. If you're running at a larger scale and can dedicate a team to it, fine. If you're a dozen people trying to launch, getting the devops guy to run `ssh-copy-id` is not the challenge that this article makes it out to be. Nor is the slightly more automated Terraform script that installs and uninstalls authorized keys from servers.


So now simple and reliable SSH keys must now be replaced by a far more complex security architecture with a lot of interworking parts and a full-blown PKI and certificate authority, and the opportunity for any of the nodes to DoS the CA and prevent me from logging in.

I guess it's time to toss out my local (self-hosted) Userify setup that has been reliably working for years, where I can just instantly update my keys across all servers, and still log in even if some bad guys start DDoS'ing my Userify host, and just switch over to certs.

Oh, wait, now I see. This is a sales pitch for their web-based SSO app. If you don't use it, you're "doing SSH wrong". Good to know.


As long as it is not as easy as let’s encrypt, it won’t take off.


I toyed with SSH certificates a few years ago. They seemed cool and secure but, ultimately, the x509 stuff was quite (!) arcane. And trying to get the IT team on board would have been a nightmare.

I quite agree: as long as it's not as easy as Let's Encrypt then it won't take off.


SSH with x.509 requires patching the SSH client and daemon. It is an unofficial patch to add the functionality.

SSH certificates themselves are similar to x.509 but a lighter version. They are supported out of the box.


man ssh-keygen lists me this:

> Note that OpenSSH certificates are a different, and much simpler, format to the X.509 certificates used in ssl(8).

Where do you work with x.509 certificates when doing an SSH ca? I use SSH certificates productively and only ever used ssh-keygen commands.


Like I said, it's been a few years so maybe things have changed since then. But if I recall...

x509 supports all of the things that ssh key file format does ... and also a lot more. ssh supports x.509 too so that makes it "easier". When you use `ssh` to connect, you'd specify -I and point to the private key and it will automatically try to find a certificate file whose name is identical to the identity file but with "-cert.pub" appended (see `man ssh_config`). There's another option to explicitly specify the key certificate file `CertificateFile` but there's no short-argument version so you have to use `-o CertificateFile [filename]` or add that to your `~/ssh/.config` file.

You'd need to use the `openssl` command (or, of course, any openssl-like command line) to sign your SSH key. And, of course, openssl doesn't understand ssh key files, so you have to use `ssh-keygen` to convert the key to x509 format or else generate the key using `openssl` (but again, at least ssh understands x509 natively, so the downside is having the much longer x509 text). And that's quite the arcane part: the openssl cli is fucking awful. It doesn't follow normal command line conventions, doesn't have tab autocompletion, and its documentation is obscure/difficult to find, extremely terse, and difficult to even understand if you're not already explicitly familiar with exactly your inputs and outputs.

And then there's the whole process of getting your key file signed. At least that process is (in general) identical to having a web SSL key signed -- because the private key is actually identical but the only nominal difference is the format of the file that you normally think of using for it. But the workflow is different because the certificate doesn't get installed to somewhere that a webserver would want. I had ended up creating my own Certificate Authority to test with and that was yet another rabbit hole of anger management.


Yeah, you should re-educate yourself. Like I said, I used SSH certificates managed exclusively with ssh-keygen; I don't have any connection between the OpenSSL and SSH cryptosystems, nor do I see a point in that. My advice: disregard any blog posts mixing SSH certs with OpenSSL and take the same time to read the ssh-keygen manpage.


> Yeah, you should re-educate yourself.

Thanks! That sounds quite condescending. I'd educated myself by reading about ssh keys in ssh and wondering how to use them.

Perhaps you should make your own blog post describing how to do it all using only ssh-keygen then.


Here is my "blogpost": https://www.man7.org/linux/man-pages/man1/ssh-keygen.1.html . The signing process is the `ssh-keygen -I` part.

The source of truth (in sync with your currently installed version) is right on your disk. Invoke `man ssh-keygen`.

Sorry for my tone. I'm pissed that, every second day, some person comes along and demonstrates to me that I'm seemingly the only person in the world who is reading the documentation on the tools we all use. Like, how do you even know what you are doing?


And yet: I did read the documentation and using openssl was the solution I had come up with. Perhaps I was using a nonstandard version of ssh. Telling people to "just read the documentation" is quite condescending when they're demonstrating that they've come up with a different solution. It was years ago and I don't remember what version I was using.

Indeed, using ssh-keygen for the whole process certainly seems easier.


I'm sorry for my harsh words.


I didn't know people thought Let's Encrypt was easy. My first foray into setting up certificates for a web server let me realize how convoluted the system itself is, not just specifically Let's Encrypt. I guess it's easier compared to the alternatives. But holy cow I hate the process.


yeah, it's really confusing.

Many folks are comfortable with jumping through the hoops (following instructions) WITHOUT understanding what's really happening.

Those of us who aren't comfortable with "just works" need to plow through an enormous amount of persnickety jargon and easily forgettable material to reach an understanding that gives us confidence.


Thank you for this reminder! It has slowly dawned on me over the years that when "everyone" thinks something is easy and I don't, it's often because we're comparing a surface level understanding vs. a full grok.


Server setup isn't hard. No need for authorized_keys files anymore, as you just trust the signing server instead. What is hard is the user experience. ssh-agent doesn't support the certs. PuTTY (the most popular client on Windows?) doesn't support certs. I want to use them but need better client support. Vault makes it almost easy to create and manage certs, but you have to use the OpenSSH client only, with an extra argument every time.
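
For anyone curious, the Vault flow being described typically looks something like this (the mount path "ssh-client-signer" and role name "my-role" are assumptions that depend on your setup):

  # ask Vault's SSH secrets engine to sign your public key
  vault write -field=signed_key ssh-client-signer/sign/my-role public_key=@$HOME/.ssh/id_ed25519.pub > ~/.ssh/id_ed25519-cert.pub

  # saved next to the key with the -cert.pub suffix, OpenSSH finds it automatically;
  # stored anywhere else, you need the extra argument every time:
  ssh -o CertificateFile=~/.ssh/some-cert.pub -i ~/.ssh/id_ed25519 user@host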


The USP of Step is to make setting up a custom CA as easy as it is with Let's Encrypt


What seems to be missing from this is how I assign permissions to individual hosts. When I'm using public keys, I do so by only adding the user's public key to the hosts I want them to access. It seems the way that certificates were presented that adding a user implicitly gives them access to all hosts using the same CA. I'm sure there's a solution to this problem, but does anyone have any pointers?

For example, I want hostA and hostB to use the same CA. But some users should only have access to hostA but others should only have access to hostB. Others may have access to both.



Thanks! This seems like a pretty simple approach to implement but I'd also imagine it would scale reasonably well after automating various pieces.


For scalable, centrally managed solution see: https://goteleport.com/docs/access-controls/reference/#rbac-...

Disclaimer: day job.


Set the principal name in the certificates to the name of the user's team.

On the server side, set the AuthorizedPrincipals* for root to the list of allowed teams.
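
A minimal sketch of that (team names, account name, and paths are made up): the cert carries the team as a principal, and each host lists which principals may log in.

  # cert issued to a member of team-a
  ssh-keygen -s user_ca -I carol -n team-a -V +8h ~/.ssh/id_ed25519.pub

  # sshd_config on hostA and hostB
  TrustedUserCAKeys /etc/ssh/user_ca.pub
  AuthorizedPrincipalsFile /etc/ssh/principals/%u

  # /etc/ssh/principals/admin on hostA:
  team-a
  team-both

  # /etc/ssh/principals/admin on hostB:
  team-b
  team-both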


Simply don’t create a user account for these users on the hosts they shouldn’t access.


SSH certificates make sense. But can you use hardware backed ones like the OpenPGP applet on yubikey with this?

I currently use this method to store my SSH keys safely. But I don't know how this would work with certificates. If I have to store them in the computer instead of a hardware token it's a huge step back in security.

By the way, what do home users use to set up a PKI? Scripting everything with OpenSSL is not very nice. It would be cool if there were an open source PKI platform, ideally even with an IdP built in, with a nice web interface and easy to install with Docker. Never found one though.


Yes, ssh-keygen allows you to use private keys from any PKCS#11 backend using the -D option. This includes smartcards and tokens.

Including for the CA key.
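
For signing, that looks roughly like the example below (the library path is just an example; when -D is used, -s takes the public half of the CA key, since the private half stays on the token):

  ssh-keygen -D /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -s ca_key.pub -I alice -n alice -V +1h ~/.ssh/id_ed25519.pub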


Ah Yes I used this method before with PIV cards and OpenSC/OpenCT.

The toolchain is a real PITA though. Unstable, difficult to provision. Proprietary tools for card management. Not all platforms support PKCS#11 modules. As far as I remember, openssh on macOS wasn't compiled with support for it, so I had to replace it with one from brew, which is much harder these days. Mind you, we're talking 5-6 years ago.

OpenPGP is really user-friendly. Nice config menu with gpg --card-edit . SSH agent functionality built into the gpg agent.

I kinda want to retain this level of comfort to be honest.


Why use gpg at all? SSH supports FIDO.


Good point, but it's not well supported yet. There are no mobile clients I know of that support it (let alone with agent forwarding, which I really need for jumpboxes).

And of course once you do fido, you're back to the same issues around SSH keys that certificates are a solution to (as the article demonstrates). So moving to Fido is not a whole lot better than using SSH keys which work very well everywhere.

Also, another important point: I also use GPG a lot to encrypt files. It's great to use the same key (sometimes, for sensitive stuff I use a different OpenPGP card) and toolchain (always) for this. So I need it anyway, might as well use it for SSH authentication as well.


I tried the FIDO2 way of authenticating to SSH when I finally had a Linux system with new enough versions of OpenSSH etc. in the repos.

It works quite nicely, but the PIN has to be entered every time I auth to SSH. This may be a desirable feature to some, but I prefer the GPG way of the PIN being cached until the key is removed from the system.

(This could possibly be a shortcoming of KWallet, as it would pop up the dialog asking for the PIN, but checking the "remember this password" would achieve absolutely nothing, and besides, I wouldn't want to save it permanently, which that checkbox would otherwise do.)


Ooh yeah I definitely don't want this either.

I want GPG to ask for the pincode once, and the yubikey to require a physical touch for each authentication (including the first, obviously). This way it can't be automated by malware either (a pincode can be sniffed and replayed through the keyboard driver, the physical touch can't).

The PIN is great against attackers that find your key on the street. Not against a determined attacker that is already on your computer. For that the physical touch thing is a great solution (though the yubikey doesn't require it by default, you can easily turn it on).

Asking the pin upon first use and a touch every time is the perfect compromise between security and usability IMO. There's still some weakness around attackers with physical access but they are more easily mitigated.


I don't have any PIN entry at all with FIDO.


But then someone with your key just has direct access?


It's a second factor, not first. This is fine.


But this is not FIDO2 in CTAP (passwordless) mode, which is what the previous poster referred to. It's FIDO1 and it's not supported by default in SSH, only with some PAM plugins.

OpenSSH supports FIDO2 passwordless mode natively in the latest versions.


FIDO is still of extremely limited availability. I'm still running into hosts I need to access that can only use RSA/DSA keys, as opposed to the ed25519 key on my Yubikey, nevermind FIDO.


The 'ideal ssh flow' involves a program that I think the author has written and a website login. Do these certificates require an existing system running single sign on or similar to hand out access to other machines, in single point of failure fashion?


I'm not doing anything wrong.

My sshd is behind spiped + logging in requires me to physically tap my yubikey.


Isn’t this the process that Teleport streamlines?


Teleport completely streamlined all sorts of access management in my projects. Every new feature they release (Kubernetes, Database, App access) work as advertised and the Helm chart + Terraform plugin (which for some reason is not published in the marketplace??) are great for automating the whole deployment + configuration. Great piece of software.


The article does a pretty good job explaining the advantages of certificates. It overlooks the existence of solutions to some of the individual problems it mentions with keys, though. Tools like sssd exist, where letting each end user keep their public key updated in a directory takes care of key distribution and at the same time limits who can use sudo, all without changing any per-user files on the server.


> you’re doing SSH wrong

I understand the theoretical superiority over keys, but do we have any data on how often, in practice, key security has actually failed someone?


I would very much like to read about that too.

Setting some sane security parameters for your SSH setup looks like a less jarring/drastic approach into securing SSH further[1]:

- Use keys.

- Allow only strong ciphers.

- Remove weak primes.

[1]: https://disknotifier.com/blog/simple-ssh-security/


Aside from being a pain to set up, what’s wrong with using GSSAPI / Kerberos? (within an org)


Yeah, my org had a Kerberos setup for some DB/middleware/CLI tools. I inherited a bunch of random script tools that required typing in passwords for everything. Screw that; the first thing I did was write a Kerberos module and set up a Kerberos keystore. Presto, no more typing passwords for everything. Typically it ends up boiling down to whether or not the backend thing you're talking to supports Kerberos; many don't and only have the whole SSL thing. A well-set-up Kerberos was much nicer to work with than chained SSL certs.


Kids these days hate using technology that's older than themselves? Kerberos is amazing, and it makes me sad to see all these people reinventing stuff that was solved 30 years ago. The number of companies selling SSO and doing it poorly (see this week's Okta hack) is unfortunate.


21yo here, I think Kerberos is bloody awesome, but then I was introduced to it by a somewhat old timer, who showed me its benefits when properly integrated in the company.


I long ago made up a corollary to Greenspun's tenth rule: any sufficiently complex or mature access regime will re-implement half of Kerberos, poorly.


This typically still involves typing in your domain username and password, no?


I wish revocation was covered as well. The article mentions this issue in SSH:

> Keys are trusted permanently, so mistakes are fail-open.

But you can make the same mistake with certificates by issuing a certificate with a distant expiration date. AFAIR you _can_ revoke valid certs, but that involves making changes to every box running sshd, much like revoking a public key.


You can distribute a key revocation list along with your regular configuration management. The key revocation list can be generated and maintained via ssh-keygen.

I used to have a central location with a revocation list. All servers had a cronjob fetching it.
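A rough sketch of that workflow (file names are just examples):

  # create a KRL revoking one key, then add another key to it later
  ssh-keygen -k -f /etc/ssh/revoked_keys departed_user.pub
  ssh-keygen -k -u -f /etc/ssh/revoked_keys another_key.pub

  # /etc/ssh/sshd_config on every server
  RevokedKeys /etc/ssh/revoked_keys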


And that's pretty much the exact same infrastructure you'd need for distributing authorized_keys.

Instant revocation seems to be a giant hole in this article's argument which isn't addressed at all. If you fire somebody partway through the work day, end-of-day cert expiry is not good enough to prevent compromises.


Active reconfiguration doesn't work well if your machines aren't unconditionally reachable.


Hm, that is true.


Previous discussion from 2019: https://news.ycombinator.com/item?id=20955465


SSH certificates are great in theory, but the whole certificate management, ad-hoc issuance, and revocation require boatloads of infrastructure. If you do it right, certificates will be signed as needed and have a short validity period, say half an hour or something. That means you need an automated signing application, or a very cheap full-time certificate manager.

I’ve actually started working on such an app recently, including a web portal, CA rotation, automated configuration distribution, etc. Still far from usable, but if you’re interested in contributing: https://github.com/Radiergummi/fides


SSH certificates are useful in large environments when scaling, automatic onboarding and offboarding are important, but IMO, small teams can (and should) continue using authorized key files as they have for years. They don't really need these features.


> What you’re supposed to do is verify the key fingerprint out-of-band by asking an administrator or consulting a database or something. But no one does that.

Actually they do where I work, pretty much every time, and I didn't ask them to. Eyebrows rise even higher if I forget to notify everyone that I replaced a server at a domain, causing a new host key (and the scary "possible MITM attack" message). This is a good thing, but I should probably make it more efficient by publishing the fingerprints.

Although the article does point out this specific disadvantage of domain reuse with known_hosts. I can see why this solution could make things easier at larger scales.


> I should probably make it more efficient by publishing the fingerprints.

When I last ran a fleet of servers, I published the known hosts files via git and strongly suggested using them in new user documentation (you needed to set up ssh_config for our jumphost/bastion anyway, so may as well link to the known host keys). There's a tool to generate the files that comes with openssh, iirc.
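The tool in question is presumably ssh-keyscan, which ships with OpenSSH; a sketch (host names are placeholders):

  # collect current host keys and commit the result to the known-hosts repo
  ssh-keyscan -t ed25519,rsa bastion.example.com app1.example.com app2.example.com > ssh_known_hosts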


Agreed--SSH certificate authorities (and principals) are powerful things that can be used to manage SSH access at scale. My workplace is a large enterprise that uses our own CA for getting access to systems--the keys it issues are good for 8 hours, then we have to grab a new key (using an internal utility).

For anyone who is interested, I put together a little playground which can be spun up in Docker that allows you to play around with and learn how SSH CAs and Principals work:

https://github.com/dmuth/ssh-principal-and-ca-playground


I would love this to work, as it indeed fixes several issues addressed in the article.

However, I don't think openssh-server supports OCSP natively, so while you might be doing SSH right, you're doing certificates wrong.


OCSP natively? No.

Does current OpenSSH have its own KRL?


I have built a small signing service that works by a user SSHing in, performing LDAP (password) authentication and 2FA with duo, then injecting a time-limited certificate signed by Hashicorp Vault back to their user agent (although it could be modified to remove the Vault requirement). The UX is very simple (SSH to this address once per day), but the backend is complicated, so if there is any demand for me to put this on Github I am happy to do so.


Did it make good use of the `AuthorizedKeysCommand` option in `sshd_config`?


If this is true then why does ssh not get set up this way by default?

Honestly what I would prefer is a better (and faster) version of monkeysphere [1]. That was honestly the most natural-feeling solution to SSH security.

[1] https://www.systutorials.com/docs/linux/man/1-monkeysphere/


Because it's not suitable for everyone in all circumstances.


So if I am not using certificates then I am not doing it wrong? :)


Yes, SSH certificates are the way to go and pretty easy to set up. But what these articles fail to address is the user management aspect.

For the SSH certificate to be accepted, the unix user must first be present on the system. As far as I can understand, FreeIPA (or similar LDAP systems) cannot be used in conjunction with SSH certs, whereas SSH keys are supported by these systems.

Can anyone provide any insight/experience with this?


It's not the username that needs to match, it's the principal. You can allow any principal for the root user, for example.

You can define principals when allowing a CA via authorized_keys, or you can configure allowed principals globally using sshd_config directives like AuthorizedPrincipals* .
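A minimal sketch of the global variant (paths and the "infra-admins" principal name are made up for illustration):

  # /etc/ssh/sshd_config
  TrustedUserCAKeys /etc/ssh/user_ca.pub
  AuthorizedPrincipalsFile /etc/ssh/principals/%u

  # /etc/ssh/principals/root then lists the principals allowed to log in
  # as root, one per line, e.g.:
  #   infra-admins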


Many years ago, I did all of this with an LDAP system. Public keys were generated by the user and entered into LDAP (or you could auto-generate keys, etc). Users were authenticated with their ssh key (stored in ldap, password based access was restricted). Authorization for access to each host was also in LDAP, as was sudoer status (as a group setting).

It was actually quite an elegant setup. You would still need to setup a CA for generating local certificates for TLS connections to LDAPS, but the auth was handled all in the LDAP server.

I think the main downside would be concentrating the authentication overhead on a single server (the LDAP server) when you are dealing with many hosts. Over a handful of systems, it's great. But it doesn't scale when you're talking thousands of hosts (or cloud VMs that spin up/down).


In most circumstances, you want these two things (user is authorized on the system, user can be identified and authenticated) to be different. Having a process that creates the user on system in order to authorize them to login is pretty similar to all your other configuration management tasks.


I imagine it's a matter of automating the certificate insertion on the target servers when it's updated on the user's account in the LDAP server. In other words, it depends entirely on your systems and how far your administration is willing to go to automate it.


I disagree.

SSH keys seem great at first: 4-8 KiB public/private key pairs are tremendously more secure than something like a 10-character password. The math checks out at an academic level, but the implementation has a glaring flaw: one cannot easily ensure that private keys are themselves protected by a 10-character password unlock. Put another way, people using private keys NOT protected by a passphrase are extremely vulnerable to compromise. For example, physically take the laptop that contains the private key, and BOOM!

That's the gist. Private keys can be set up with passphrases, but the person in control of the private key can change their key's passphrase at any time afterwards, so straightforward key escrow strategies don't work. Inspecting the public key does not indicate whether the associated private key has any protection, and that's good insofar as one key shouldn't leak information about its counterpart.


Stuff like this is really cool. But the problem is that certain clients (as in clients that pay your bills) can't even get public keys to work and we end up allowing password logins. So I don't know how I could make this work in the real world.

That's where security ends, when 'it just has to work' because 'they're the ones paying/in charge'.


You can always host your own SSH CA pubkey server.

It is called an “AuthorizedKeysCommand” in `/etc/ssh/sshd_config`.

https://jpmens.net/2019/03/02/sshd-and-authorizedkeyscommand...
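A minimal sketch of the sshd side, where /usr/local/bin/fetch-user-keys is a hypothetical script that queries your key server and prints authorized_keys lines for the given user:

  # /etc/ssh/sshd_config
  AuthorizedKeysCommand /usr/local/bin/fetch-user-keys %u
  AuthorizedKeysCommandUser nobody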


The comparison to public-key authentication is wrong. Introducing cert authentication brings a third party: public key infrastructure (which isn't addressed at all and is described as some type of security panacea).

The more apt comparison would be to other authentication technologies, in particular Kerberos.


I love ssh certificates for access and indeed we're using this for accessing our production network using a few small pieces of home-grown infrastructure.

However, there's one big issue nobody is talking about: There is zero support for certificate authentication in any SSH clients for iOS and most people who have network access here also have iOS devices.

And even if there were support, just supporting the certificates alone is not enough - there would need to be some automatable way of getting a new certificate into an app as the whole idea of the certificates is that they are very short-lived (days or even hours if possible)


This is definitely true... and a lot of people are doing it wrong.

This is one of those things that drives me nuts as an experienced/old developer, seeing people type passwords for ssh/git/whatever several times per day. Sometimes there are tasks that require copying / checking some file on N servers, and these people seem to think that cannot be done in a shell script because the password needs to be entered interactively.

Then there's SSH port forwarding, X11 forwarding, etc., but it's amazing how many people use ssh for years without so much as glancing at the man page.


It is amazing how many people use any CLI tools for years without reading the man page.

The man page for the shell (man bash; man zsh) is a good place to start.


Yes that too... "the kids of today" seem to regard basic shell tools and scripting as a dark art rather than an everyday part of using a computer and doing development productively. It's kind of sad.


I can't speak for others but personally I've never built up enough motivation to learn shell scripts (and related hackery like awk and sed) properly, even though I've learnt maybe 5-10 programming languages quite well and use the shell interactively all the time.

I can't explain with certainty why, but I think it's due to lack of discoverability, inconsistent conventions for flags and positionals, esoteric syntax for simple control flow, lack of errors/feedback for things like undefined variables, no scoping/namespaces, unclear type system (not asking for much, strings, bools and ints would suffice). That said, piping/streaming is amazing and often better than in modern languages.

In short, it's quite different from other imperative languages - the design feels arbitrary and the learnings non-transferable, even though I know it is useful and ubiquitous.


IIRC, one could have said the same of "the kids of" twenty years ago. It still blows my mind that I didn't learn scripting at university. Could it have been because all the CS classes insisted on tcsh? My first job, I was using an internal tool and thought to myself "this could be better". My boss said "call this guy", and a brief phone call completely changed my understanding of Unix.


> these people seem to think that cannot be done in a shell script because the password needs to be entered interactively.

That's what autopw[1] is for. ;p

[1] https://github.com/jschauma/sshscan/blob/master/src/autopw


Keybase had an elegant solution I use: https://keybase.io/blog/keybase-ssh-ca


I used to like Keybase so much, it felt like a Next Big Thing. I just wish they had stuck to identity management and validation, providing SSO etc. Instead they tried to be a chat client, git host, and crypto wallet too. They spread themselves too thinly trying to compete with dozens of rivals for each function they added.

I wish they'd have been a standard that Microsoft, Apple, Google etc provided implementations of.


AFAIK, unless something has changed, SSH password authentication doesn't use any kind of PAKE or zero-knowledge proof. The client just sends the password straight to the server over the established channel, as the password box on a website login page would.

Really a missed opportunity to add perhaps even more security than the server cert can provide (because of how many users override it).

Come to think of it, password-based website logins shouldn't exist either; they're way too common, and I don't see why there's no PAKE-based HTTP basic auth feature.


Too bad you can't do that with dropbear


Which is why I installed openssh-server on my OpenWrt hosts. However, due to a bug [0] in the openssh-server package in the latest 21.02.2 release of OpenWrt, OpenSSH doesn't let you log in in failsafe mode on an OpenWrt host. Because my OpenWrt host lacked a recovery mode, it was essentially soft-bricked.

I was able to recover it using the serial port, but even after all this, the comfort of using SSH certificates on all of my nodes was enough to keep me using it instead of Dropbear.

[0]: https://github.com/openwrt/packages/issues/17833


Anyone using GitHub as a public key repository? I found it very convenient to set up a cron job that pulls https://github.com/USERNAME.keys into authorized_keys, so I can SSH from any client that has GitHub access.
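A rough sketch of such a cron entry (USERNAME is a placeholder; the temp-file dance is just to avoid wiping authorized_keys if the download fails or comes back empty):

  # crontab: refresh authorized_keys from GitHub once an hour
  0 * * * * curl -fsS https://github.com/USERNAME.keys -o /tmp/gh.keys && [ -s /tmp/gh.keys ] && mv /tmp/gh.keys $HOME/.ssh/authorized_keys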


Another headline that completely ignores the concept of threat modeling. If you're preaching about what's "right" and "wrong" regarding security without considering the threat model, you're only doing harm to your readers.


Great article. All of your internal authentication should be using certificates. Web auth, Wifi, VPN, SSH

In the late 90s we came close to having this for the public internet as well but it never caught on. We paid the price with endless breaches and unmanageable credentials.


I'm only disappointed that SSH certificates didn't leverage OpenSSL for its more powerful capabilities.

I guess that's the small price to pay for speed of development without going through the international committees of ASN.1.


dumb question but .... isn't it a problem that private ssh keys are stored in ~/.ssh and that any random app, npm dependency, build script, etc could copy them across the network?


You can password-protect them and optionally load them into an agent at logon.
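For example (assuming a default key path):

  # add or change the passphrase on an existing private key
  ssh-keygen -p -f ~/.ssh/id_ed25519

  # load it into the running agent for this session
  ssh-add ~/.ssh/id_ed25519

  # or let ssh add it lazily on first use, via ~/.ssh/config:
  #   AddKeysToAgent yes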


Anyone have experience with using SSHFP records to avoid the so-called anti pattern of trust on first use?


The biggest problem with the SSHFP RR is the trustworthiness of DNS to deliver the answer record.

Almost nothing enforces that its DNS resolver return only DNSSEC-verified answer RRs.

It's not a problem at all if you set the resolver to return only DNSSEC-verified answer RRs; then again, many common websites would then stop working simply because they don't use DNSSEC or don't have it set up properly.

Most deployments end up distributing SSH public keys under cover of TLS, IPsec, or some other variant of secure tunneling, precisely because the fingerprints are metadata worth protecting.
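For anyone curious, the moving parts look roughly like this (the host name is a placeholder, and as noted it only really helps when the resolver validates DNSSEC):

  # on the server: print SSHFP records ready to paste into the DNS zone
  ssh-keygen -r host.example.com

  # on the client: accept host keys that match a (DNSSEC-secured) SSHFP record
  ssh -o VerifyHostKeyDNS=yes user@host.example.com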


If you trust a third party you're doing ssh wrong :)


Who really wants to run a CA though?


Cool, how do I set up a CA?


All you need is an ssh key, which you can generate like this:

  ssh-keygen -f ca.key
Then you can generate certificates like this:

  # user key
  ssh-keygen -s ca.key -I key_id /path/to/user_key.pub
  # host key
  ssh-keygen -s ca.key -I key_id -h /path/to/host_key.pub
Secure ca.key according to whatever level of paranoia you desire, e.g. passphrase, hardware security module (a CA key on a PKCS#11 token is supported for signing certs), airgap the machine. Anyone who gets access to ca.key has access to everything that trusts ca.key.
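The other half, which the above leaves implicit, is telling servers and clients to trust that CA; a sketch assuming the CA public key is ca.key.pub and your hosts live under *.example.com:

  # on each server (/etc/ssh/sshd_config): trust user certs signed by the CA
  TrustedUserCAKeys /etc/ssh/ca.key.pub

  # on each client (~/.ssh/known_hosts): trust host certs signed by the CA
  @cert-authority *.example.com ssh-ed25519 AAAA... (the contents of ca.key.pub)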


Precisely this. From what I've read, it isn't that easy to set up a CA.

Look into step-ca though, I've heard it's.. Okay? I don't know. It seems too complicated still - I'd rather stick with pubkey auth


Set up HashiCorp Vault. Almost easy, but actually hard to do right. Policies are easy to make too open and possibly insecure.


> Users are exposed to key material and encouraged to reuse keys across devices. Keys are trusted permanently, so mistakes are fail-open.

What? This just sounds like you're doing it all wrong then blaming the tools.

Maybe I'm just thinking about it from my POV as a mainly hobbyist user of SSH, but I rotate my keys (not as frequently as I should, but I do), I remove old ones from authorized_keys files, and I sometimes use different keypairs for different servers too. I never copy keys to other devices: I run ssh-keygen on every device and add the public key to the authorized_keys file manually.

It isn't even "hard" to do it this way. Sure, it doesn't "scale" for big corps - but I don't need it to scale, so I'm not "doing SSH wrong" by not using certificates.


I decided that I wanted to automate my rotation, so I built a shell script to do it, then wrote about it.

If you Google "ssh key rotation" you will find my article as a "featured snippet."

I don't think that I need a CA, as I only have personal and admin keys (two sets) and I'm flipping these once a quarter. Plus, I don't have expired entries in my authorized_keys (do CA users ever clean out authorized_keys?), and these are very clean as I see them regularly.

https://www.linuxjournal.com/content/ssh-key-rotation-posix-...


It's more effort than I would like. I should not be bothered with updating hidden config files containing data that is mostly not human readable. I want to prove my identity once and let the system take care of the rest.


Well that's fair enough, but I take issue more with the title of the post telling me I'm doing "ssh wrong" because I can't be bothered to set up a CA and fuck around with certificates.

To me, setting up a CA is more effort than I would like.


It becomes feasible when you have a larger number of key pairs that are supposed to have access to the same set of machines. I did it as a private person because I'm an SSH nomad, using several clients with different key pairs each.

I agree with you; for a regular user with a single client device (or two) it's not worth it.


It's very much a use-case and risk driven decision. A company should be using Teleport, which is a lot more than just certificates (but they do use certs). For your personal VPS or GitHub account, nobody is going to go out of their way to get your SSH keys.

The biggest "you're doing it wrong" I see is people who disable host key verification because their servers' IPs change constantly. Do you want MITM?! Because this is how you get MITM! Might as well use Telnet for connections.


> brew install step

The audacity of mac users who think everyone uses a mac.


The same audacity as how-tos that say "apt-get" without any distro-specific language.

Linux desperately needs a standard for writing how-tos. Even better would be definitive how-tos for every common task a user might need to do, posted and maintained on a distro-owned site.


Right? Most folks just need dnf.


I work at smallstep and we support a wide range of Linux distros already. https://smallstep.com/docs/step-cli/installation

One of the things on my punch list is to get the step CLI into upstream Fedora, Debian, and Ubuntu to make installation a bit easier.


Bro you almost made me spit out my latte.


Brew has been available on Linux for a long time and works very very well.


Sure, but if you're not on a Mac, brew shouldn't be the first tool you reach for. You can use brew if you're transitioning from Mac and need time to get acclimated, or if the software hasn't been integrated into the local OS's package management system yet.

Having said that, `brew` appears once in passing in the entire article. The article is almost entirely about ssh without regard to platform. There's nothing "audacious" about suggesting users use brew, so the anti-Apple sniping here is entirely unwarranted.


There are other OSes than Linux and macOS; in fact, one of them is far more popular than both put together: Windows. It has had built-in OpenSSH support since 2018.


I'd argue Windows isn't terribly popular, in the sense that "popular" means people like it more than not; rather, it is used often, if grudgingly, by folks who have experienced alternatives or are unaware objectively better options exist.


I use Linux (bare metal on a System76 machine, and WSL) and Windows daily. I can't stand OSX. As in I literally want to trash it, and the device it's installed on, within 10 minutes. It's a beautiful OS, but as a window manager it's literally unusable, at least with a QWERTY keyboard. Like, what genius put cmd+q right beside cmd+a, with cmd+q not even prompting the user? I could rant for days… so I'm just going to stop.


I also use Linux -- when clients require OSX on furnished equipment I cringe a bit :)

There are orders of magnitude of UX difference between how poor Windows is versus OSX and Linux, in my mind. OSX is still worse than Linux, in part because of the inconsistency of key mappings and the lack of options to make things consistent.


And they have an open issue for producing a chocolatey package: https://github.com/smallstep/cli/issues/365


There are cases where, if you leave SSH enabled, you are doing it wrong. If you only need SSH to do a few things once in a while, you should just turn it off and only turn it on when needed. In these situations, I feel it's OK to just use a password to log in.


Honestly, I agree. My ssh passwords aren't going to be brute-forced anytime this century. It's also pretty easy to put a fake sshd on port 22 and set the real one to some other port (preferably one low enough to still require root privileges though).

I don't have encrypted drives on all my devices. I don't want to have to worry about what could happen if one of those gets lost/stolen. I'd rather not leave keys or certificates lying around.

Also, things sometimes go wrong and I need to get access to a server from a device I've never been on. It's nice to be able to do that. Passwords do that.

To be fair, I usually have a single VPS which I keep as locked down as possible that has VPN access to the server I really need. The VPS doesn't even need to be running most of the time. So I can spin it up to get access to the VPN, then ssh into the server with a password. If the VPS gets compromised, the VPN alone won't give an attacker immediate access to the server like it would if I left keys / certificates on there. I have to trust the VPS, and if it gets compromised without me noticing, and I then log in to my server, yeah, I'm SOL, but certificates don't solve that problem.


How do I turn it on if I can't SSH into the machine? In most cases, SSH is what you use to do management tasks.


Many hosting services have a control panel to do this on a VPS. Some hardware, such as Synology devices, also has a web interface for it.


Ugh. Go read up on ACME (Let's Encrypt). Unless you run your own certificate authority root, or configure things very carefully, using TLS certificates grants host-level access to your DNS provider and to every organization that reliably routes external traffic to your host.

To what end? The threat model rekeying tries to protect against involves compromised authenticated client machines.

Once an attacker has those, they have a shell on the server, and it's game over. There are these things called "advanced persistent threats" that have been in the news a lot already.


This article has nothing to do with Let's Encrypt, ACME, TLS, or DNS.


I'm tempted to write a competing article: "If you're using SSH in 2022, you're doing it wrong"

There's no need to open port 22 if tools like AWS Session Manager (and GCP's equivalent) are available to you.


But even if you do it through that, SSH is a much nicer protocol than typing on a remote console. You get file transfer, X11, agent, and port forwarding, terminal window resizing, and much lower bandwidth usage.

Also, not all servers are on cloud platforms.


Is it possible to run Zmodem over this AWS pseudo-console the way you can over SSH?

Not all servers are on cloud platforms, but there are somewhat comparable ways to shunt a serial or out-of-band management console over the network.


Not sure, I haven't used AWS much. Most of the web consoles I have used are like VNC. So no, not really possible to run anything like Zmodem over.

Not that I'd want to either, of course. Bringing a 1988 solution back to fix a conceptual 2022 problem does not sound like a great fix :)

I understand that in some workload types you want to have full autodeployment on servers, using ansible, kerberos, whatever. In that case interactive login is never needed.

But this is a very specific subset of 'servers' in my opinion. A lot of HN contributors work in this so this approach may work for them but it won't everywhere.


Don't forget about the no-ops "If you're opening a shell on any server, you're doing it wrong (2022)" method.


What's next? "If you're using a computer, you're doing it wrong"?


"If you have data, code, or processor execution, you're doing it wrong."


That's really not that far off from what some people think. "You should never touch your own data, that is what the cloud is for."


Now this is where I want to go! I mean using technology in 2022 feels so outdated. Don't we have cloud to do everything for us.


There are actually people working on things other than CRUD web apps!


And can I access my university's server with that? My office computer from home?


I think you're being downvoted by the same nerds obsessed with self-hosting. "But how can I self-host without ssh?!"



