Since the author is on here - reckon you could beat up on simple random tokens a _little_ bit more? In particular how hard they are to identify, which makes leaks hard to catch (easily fixed by adding a prefix).
I work on secret scanning at GitHub. When token issuers use easily identifiable formats for their tokens we can easily spot them when they're accidentally committed. We can then work with the token issuer to automatically alert them of those leaks. A good example is AWS - if you commit an AWS key and secret to a public GitHub repo we will tell AWS about it and they will tell you about it (and quarantine the exposed keys) within a few seconds. We work with dozens of other token issuers too, though - some of the latest we added were Linear, PlanetScale and Ionic.
The above relies on tokens being identifiable - we can't send hundreds of partners everything that looks like 32 hex chars. In future we want to be able to do even more sophisticated things, like ask users for confirmation before they push code that contains secrets. We recently changed our own token pattern for that reason.
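To make the identifiability point concrete, here's a rough sketch of what pattern-based matching looks like. The patterns below are illustrative approximations of publicly documented prefixes (AKIA, SG., ghp_), not our actual scanning rules:

```python
import re

# Illustrative patterns only. Identifiable prefixes can be matched with very
# few false positives; a bare 32-hex-char string cannot.
PARTNER_PATTERNS = {
    "aws":      re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "sendgrid": re.compile(r"\bSG\.[A-Za-z0-9_\-.]{20,}\b"),
    "github":   re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

def scan_blob(text: str):
    """Yield (partner, matched string) for anything that looks like a known token."""
    for partner, pattern in PARTNER_PATTERNS.items():
        for match in pattern.finditer(text):
            yield partner, match.group(0)
```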
I bring this up every time this is mentioned, but I really wish the API token format included a domain to notify in case of leaks. Having services register with your in-house secret scanning system works very well if you're GitHub, but otherwise it's a very closed mechanism.
If sendgrid tokens were `secret:sendgrid.com/91on9SIkbUfSs` instead of `SG.91on9SIkbUfSs`, or Amazon keys looked like `amazon.com/JGUIERHT` instead of `AKIAJGUIERHT`, we wouldn't need a database of regexes and endpoints to report secret leaks.
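Something like this is all a scanner would need - a rough sketch, where the `secret:` prefix and the reporting path are just made up for illustration, not any existing convention:

```python
import re

# Hypothetical format from the comment above: secret:<domain>/<random-part>
GENERIC_TOKEN = re.compile(r"\bsecret:([a-z0-9.-]+\.[a-z]{2,})/([A-Za-z0-9_\-]+)\b")

def reporting_endpoint(text: str):
    """Return (domain, hypothetical report URL) for a domain-prefixed token, or None."""
    m = GENERIC_TOKEN.search(text)
    if m is None:
        return None
    domain = m.group(1)
    # Where a leak report *might* go -- just one possible convention.
    return domain, f"https://{domain}/.well-known/report-leaked-secrets"
```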
Appreciate your passion here Remi. I don't think a full standardisation of API token formats is ever likely, but I do think there's value in nudging things in that direction.
One big challenge is that it's hard to get service providers to change their token formats. Very few have this at the top of their priority list - they're busy with other things. Here's an example playing out in OSS that is pretty typical: I tried to persuade the (excellent) team at Sentry to update their format, and they essentially told me "we have other priorities" https://github.com/getsentry/sentry/pull/26313. And that's a relatively simple change, not the adoption of a whole standard.
In addition, as Thomas points out in this article, there are a lot of different token types for someone thinking about minting API tokens to choose between. They might rationally have different preferences over them. A standard that is prescriptive of format and approach is likely to struggle given that diversity.
With that said, I do see an opportunity here for a more modest standard targeted at service providers that already use JWTs or Macaroons. Generic tokens of those types are relatively easy for scanning providers to identify, and it's easy (and hopefully uncontroversial) for service providers to encode more information in them, like an "if found" link. I think a standard that defines the attribute name there, and the API for reporting / responding, would be a good start that might see adoption.
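As a sketch (the claim name here is made up, not part of any standard), a provider already issuing JWTs could embed the reporting link directly in the token:

```python
import jwt  # PyJWT, used here just to illustrate the idea

claims = {
    "sub": "api-client-1234",
    "iss": "https://api.example.com",
    # Hypothetical claim: where a scanner should report this token if found in public.
    "if_found": "https://api.example.com/report-leaked-token",
}
token = jwt.encode(claims, "server-side-secret", algorithm="HS256")

# A scanner that spots something shaped like a JWT can read the claim without
# verifying the signature -- it only needs the reporting URL, not trust in the token.
unverified = jwt.decode(token, options={"verify_signature": False})
print(unverified["if_found"])
```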
Importantly I am not proposing a big change at all: the tokens can stay exactly the same (in the database and crypto code), you can still use UUIDs or Macaroons or JWTs, you only change the frontend to add this prefix. Apologies if this wasn't clear in the two examples I posted without explanations. The benefits would also be a bit higher than the PR you reference, which seems to help with scanning on GitHub (you mention that it would already work without the change).
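To be concrete, the change is roughly this (the prefix value is hypothetical; everything behind it stays untouched):

```python
PREFIX = "secret:example.com/"  # hypothetical, per the examples above

def display_token(raw: str) -> str:
    """What the user sees and copies: the raw token unchanged, prefix added on the way out."""
    return PREFIX + raw

def parse_token(presented: str) -> str:
    """What the API strips off before doing exactly what it did before."""
    return presented.removeprefix(PREFIX)
```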
As you note in your PR, many tokens are already identifiable, so standardizing a way to put the reporting domain in there shouldn't reduce security (by obscurity).
Taken a step further, the secret should just be a URL that revokes itself - like https://revoke.sendgrid.com/91on9SIkbUfSs. GitHub should then just make a GET request to every URL (abuse can be whittled down a bit by requiring https://revoke.sendgrid.com/robots.txt to have a `Revoke: YES` section). They and anyone else could maintain an allowlist of revocation URLs to pattern match as well. This makes a global registry unnecessary, and standardises the act of revocation.
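Roughly like this - the `Revoke: YES` marker and URL shape are just the convention I'm proposing here, nothing standard:

```python
import requests
from urllib.parse import urlparse

def try_revoke(secret_url: str) -> bool:
    """Hit a self-revoking secret URL, but only if the host opts in via robots.txt."""
    host = urlparse(secret_url).netloc
    robots = requests.get(f"https://{host}/robots.txt", timeout=5)
    # Opt-in marker proposed above -- not a real robots.txt directive.
    if robots.ok and "Revoke: YES" in robots.text:
        return requests.get(secret_url, timeout=5).ok
    return False

# e.g. try_revoke("https://revoke.sendgrid.com/91on9SIkbUfSs")
```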
"Woops. I hit ctrl+enter instead of ctrl+c while copying my secret. Guess production's down for a bit while we roll new ones!"
I mean your core idea is decent but that's just really funny.
There's some amount of practicality being lost if your secrets start growing massively. There are also potential restrictions on what you can put in them, and a prefix with an underscore or colon might be easier than something that has slashes in it.
Your idea probably belongs as a queryable DNS record on the domain in question. Or a standard subdomain, or even a .well-known path.
The alternative (as in, current reality) is "Whoops. I hit ctrl+enter while copying my secret and no-one noticed for a month. Guess all our data is leaked now!"
It's also up to each provider what actually happens when the "revoke" action is triggered. Maybe they just warn you immediately, which is still better than nothing.
What I had in mind is posting them to <domain>/.well-known/report-leaked-secrets or a location looked up from the domain using DNS. Making them URLs is an interesting idea, but they are likely to look awkward (e.g. include "revoke" like your example) and get a lot of non-revocation traffic (even if we have a way to tell scanning apart from actual revocation requests, we'd probably rather only get the revocation traffic).
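Concretely, something like this - the path, payload shape, and DNS record name are all hypothetical, and the DNS option uses dnspython just as one way to do the lookup:

```python
import requests
import dns.resolver  # dnspython, for the DNS-based discovery variant

def report_leak(domain: str, token: str, where_found: str):
    """POST a leak report to the domain's hypothetical well-known endpoint."""
    url = f"https://{domain}/.well-known/report-leaked-secrets"
    requests.post(url, json={"token": token, "found_at": where_found}, timeout=5)

def reporting_endpoint_from_dns(domain: str):
    """Alternative: discover the reporting endpoint from a (hypothetical) TXT record."""
    for record in dns.resolver.resolve(f"_report-leaks.{domain}", "TXT"):
        return record.to_text().strip('"')
    return None
```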
My professional email also mangles URLs, turning them into urldefense.proofpoint.com/... Such solutions are sure to interfere with tokens looking like URLs.
Maybe don't let anyone delete other people's tokens, even if leaked, but automatically alerting the admin if anyone accesses the "URL" would probably be a good option.
It would be fine, in that repeatedly hitting the URL won't have any effect other than disabling the token the first time.
But yeah, auto-link followers will invalidate them immediately. There's a case to be made for that being a good thing, but don't want to get into that.
I agree with you. Someone brought this up on Twitter and I'm kicking myself for not remembering to include the notion of adding identifiable markings to sensitive tokens (I'd do it now, but I'd feel like I was plagiarizing).
And it's a noodly and somewhat incoherent notion of "safety" I'm using here, because of course, random tokens are unconstrained bearer tokens --- authenticated requests, CATs, Macaroons, and Biscuits all address that weakness. I'm biased by my concern over cryptographic implementation mistakes.
It's a neat property of Macaroons (and maybe Biscuits) that you can come up with sane configurations where checking a Macaroon into source code can be, if not totally safe, at least not a major incident. I wish I'd thought of that, too, since I think "checking the token into source control" is a more vivid example than "emailing tokens" or "passing them around".
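As a sketch (using pymacaroons; the caveat language is whatever your verifier understands, nothing standard), an attenuated Macaroon that only allows read access to one resource for a short window is a much less exciting thing to find in a repo:

```python
from pymacaroons import Macaroon, Verifier

root_key = "server-side-root-key"  # never leaves the service

# Broad token minted by the service...
m = Macaroon(location="api.example.com", identifier="token-42", key=root_key)

# ...attenuated by the holder before it goes anywhere risky (CI config, a script, etc.).
m.add_first_party_caveat("action = read")
m.add_first_party_caveat("resource = reports/2021")
m.add_first_party_caveat("expires < 2021-05-01T00:00:00Z")

# The service checks every caveat is satisfied before honoring the token.
v = Verifier()
v.satisfy_exact("action = read")
v.satisfy_exact("resource = reports/2021")
v.satisfy_general(lambda caveat: caveat.startswith("expires < "))  # a real check would compare times
assert v.verify(m, root_key)
```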
Surely you needn't feel like you're plagiarizing - just give credit. The credit won't be any less deserved next week or next year. Even if you "know" you'd definitely have fixed this without prompting, nobody else knows that, so it looks like you avoided giving credit where it was due. So, don't do that.
Nobody's asking for royalties, so "Shout out to @SomeoneOnTwitter for reminding me" is enough.
tbh I actually do think this is an error. I worked somewhere where we didn't do this and I think of it as a bug rather than a missing feature. It makes it very difficult to identify a leaked token without checking it against our database!
Macaroons are great. I was using them successfully in a (now dead) product. Very easy to reason about once you get the basics. Unfortunately the ecosystem is small and the effort spent getting colleagues on board (i.e. convincing them you're not using some fringe thing) was substantial and ongoing.
I think part of the problem of Macaroons is the belief that there should be an ecosystem of them, and a standard, and standard libraries. They may work best when they're custom-tailored to the applications that really want them.
> In future we want to be able to do even more sophisticated things, like ask users for confirmation before they push code that contains secrets
If you all have thought about it, do you imagine you'd only warn in the presence of some generic token identifier, like `secret-token` a la https://datatracker.ietf.org/doc/html/rfc8959 ? Or, would you be able to warn on everything that matches the regular expressions your partners give you to identify their API tokens?
The latter. Our objective for secret scanning is to prevent as many serious secret leaks as possible. Where a service already has a token format that is highly identifiable we want to take advantage of that, rather than rely on the adoption of generic token identifiers.
GitHub secret scanning program: https://docs.github.com/en/developers/overview/secret-scanni...
GitHub's updated token format: https://github.blog/2021-04-05-behind-githubs-new-authentica...