A few secure, random bytes without `pgcrypto`

mhio · 2024-09-25T01:40:47 1727228447

gen_random_uuid() produces a v4 UUID.

Taking the first 5 bytes of a v6 UUID (time) and last 5 (node) would be a bad random day.

manwe150 · 2024-09-25T02:26:53 1727231213

Wait, is this blog actually about how to introduce a backdoor into your Postgres install by rolling your own very bad rng?

thadt · 2024-09-25T02:35:01 1727231701

Nah, mhio is saying that the blog post has a typo:

> Postgres 13’s gen_random_uuid() which generates a V6 UUID that’s secure...

gen_random_uuid gives you a V4 UUID, not a V6 UUID (it's even in the code comments in the snippet included in the blog). I don't believe Postgres even has a function to generate a V6 UUID - which, indeed, would be a bad idea to use as a source of randomness.

fanf2 · 2024-09-25T02:32:07 1727231527

No, a v4 uuid comes from a good RNG. The blog post just said v6 by mistake when it meant v4.

hinkley · 2024-09-28T09:48:40 1727516920

V6 is just a v4 rearranged to behave more like v7 for the purposes of b-tree insertion.

dwattttt · 2024-09-28T22:39:08 1727563148

I believe V6 is a reordering of V1, not V4. V4 is random aside from the bits specifying version & variant, ~6/7 bits.
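To make the "random aside from version & variant" point concrete, here is a minimal Python sketch (not Postgres code). It assumes the standard v4 layout, where the version sits in the high nibble of byte 6 and the variant in the top two bits of byte 8. The helper names `fixed_bit_count` and `fully_random_bytes` are made up for this illustration.

```python
import uuid

# Standard v4 layout: 4 version bits and 2 variant bits are fixed.
VERSION_MASK = 0xF << 76   # high nibble of byte 6
VARIANT_MASK = 0x3 << 62   # top two bits of byte 8


def fixed_bit_count() -> int:
    """Number of bits in a v4 UUID that are not random."""
    return bin(VERSION_MASK | VARIANT_MASK).count("1")


def fully_random_bytes(u: uuid.UUID) -> bytes:
    """Return the first 5 and last 5 bytes, which avoid bytes 6 and 8."""
    if u.version != 4:
        raise ValueError("expected a v4 UUID")
    return u.bytes[:5] + u.bytes[-5:]


u = uuid.uuid4()
print(u, fixed_bit_count(), fully_random_bytes(u).hex())
```

So 122 of the 128 bits are random, and slicing bytes from either end of a v4 UUID never touches the fixed bits.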

hinkley · 2024-09-28T21:40:38 1727559638

I read this exact reply on this exact article two days ago.

What is happening right now? Why is your comment marked four hours ago?

hinkley · 2024-09-28T21:40:08 1727559608

I read this exact reply on this exact article two days ago.

What is happening right now?

tom_ · 2024-09-28T21:53:41 1727560421

Take the blue pill, press the back button, and pretend you never saw it.

Or take the red pill, pull back the curtains of reality, and see the machinery behind: https://news.ycombinator.com/item?id=41197775

hinkley · 2024-09-29T00:19:58 1727569198

And I got two replies recorded after making an edit to expand on my question.

Glitch in the matrix.

tough · 2024-09-29T00:26:11 1727569571

Man reading this whole thread glitched my brain a bit

hinkley · 2024-09-29T05:15:53 1727586953

There’s a few decades after you stop worrying you’re crazy and before you start worrying you’re senile. Leaving you a lot more energy for other things. Enjoy them.

frutiger · 2024-09-28T23:46:24 1727567184

My HN client uses the HN API which reveals the true post time of the comment. See https://seville.protostome.com/item/?id=41641314.

literalAardvark · 2024-09-29T13:50:03 1727617803

I'm really confused by this post. Wouldn't it be simpler to read a few bytes from /dev/random?

Sure, it wouldn't be portable to windows but that's more of a feature than a bug.

freeqaz · 2024-09-29T14:59:51 1727621991

How do you read from a file like that in SQL? I know that this is in "theory" possible but I've never had a legitimate use case where I've needed to do file I/O from my ORM, lol.

This is the ChatGPT answer that I was able to derive:

> You can read from `/dev/urandom` in a PostgreSQL query using `plperlu`, which allows executing unsafe Perl code.
>
> Create a function to read random bytes:

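  CREATE EXTENSION IF NOT EXISTS plperlu;

  CREATE OR REPLACE FUNCTION get_random_bytes(num_bytes int)
  RETURNS bytea
  LANGUAGE plperlu
  AS $$
  my $num_bytes = $_[0];
  open my $urandom, '<:raw', '/dev/urandom' or die "Cannot open /dev/urandom: $!";
  # read() can return fewer bytes than requested; treat that as an error.
  my $got = read $urandom, my $bytes, $num_bytes;
  close $urandom;
  die "Short read from /dev/urandom" unless defined $got && $got == $num_bytes;
  # PL/Perl hands the result to bytea as text, so escape the raw bytes first.
  return encode_bytea($bytes);
  $$;

  SELECT get_random_bytes(16);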

aaomidi · 2024-09-29T17:22:54 1727630574

/dev/urandom.

Some systems have basically made them equivalent though.

heavensteeth · 2024-09-29T09:30:21 1727602221

> I’m broadly against the use of Postgres extensions because they make upgrades harder and projects less portable [1],

I can't find that footnote anywhere

davidfiala · 2024-09-25T02:24:01 1727231041

Exercise extreme caution.

Having your security strategy rely on quirky behaviors of an implementation detail which might change is incredibly dangerous.

yunohn · 2024-09-25T16:18:46 1727281126

Not everything is a quirky implementation detail? It’s important for us developers to not write pure glue code between others’ functions, but to also understand them and write our own useful code that may extend others’ work.

hinkley · 2024-09-25T02:28:06 1727231286

UUID v6 isn’t going to change. There’s a reason we have seven of them now. And v8, which would warrant your warning.

poincaredisk · 2024-09-25T03:39:00 1727235540

UUIDv6 won't change, but what about gen_random_uuid()?

creatonez · 2024-09-25T10:09:56 1727258996

There is widespread acceptance nowadays that randomized UUIDs must be generated from the system CSPRNG or something equivalent, and that any non-cryptographically secure method is a bug. Most library implementations across languages have converged on this in some way.

That being said, the PostgreSQL documentation doesn't say anything in particular about the predictability of `gen_random_uuid`, so the behavior is unspecified. But it's worth noting the function has an explicit guard to raise an error if secure random is not available, so they were conscious of this possibility and did not attempt any misguided fallbacks.

And unfortunately this requirement is not baked into the UUID spec either, which uses the word "should" instead of "must" when discussing CSPRNG usage.

masklinn · 2024-09-25T04:12:59 1727237579

gen_random_uuid isn’t going to change either, the entire point is to generate a secure uuid4. At most it’ll get faster due to using platform-specific syscalls.

hinkley · 2024-09-25T02:55:20 1727232920

If you’re shopping for a CSPRNG, one of the items that should be very high on your list is being able to call the setSeed function multiple times and have the inputs compose instead of clobber each other.

You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy. Do not chop, rearrange, hash, or bit shift the data trying to make it “stronger”; the CSPRNG will do an infinitely better job of doing that for you. Just treat it like a Mr Fusion. Drop a can, a banana peel and the stale beer in and let it cook.

I gave a similar speech to a team trying to initialize SSL sessions on an embedded machine. “But what if we XOR…” No. Stahp.

ronsor · 2024-09-25T02:59:56 1727233196

Can you give some examples of CSPRNG implementations that allow this?

hinkley · 2024-09-25T17:17:36 1727284656

The ones in Java did in the context of the original discussion. Any cryptographic-hash-based PRNG should have the capability of doing so; it’s an implementation detail whether you restart the data collection or append the data.

Just poke in the setSeed function and see what it does.
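The difference between "restart" and "append" can be sketched in a few lines of Python. This is a toy hash-based pool written only for illustration: the class `ToyHashPool` and its method names are hypothetical, it is not how any particular library implements `setSeed`, and it must not be used as a real CSPRNG.

```python
import hashlib


class ToyHashPool:
    """Toy entropy pool for illustration only. NOT a real CSPRNG."""

    def __init__(self) -> None:
        self._state = bytes(32)
        self._counter = 0

    def add_entropy(self, data: bytes) -> None:
        # Compose: the new state depends on the old state AND the new input.
        self._state = hashlib.sha256(self._state + data).digest()

    def set_seed(self, data: bytes) -> None:
        # Clobber: everything fed in earlier is discarded.
        self._state = hashlib.sha256(data).digest()
        self._counter = 0

    def random_bytes(self, n: int) -> bytes:
        # Counter-mode output derived from the current state.
        out = b""
        while len(out) < n:
            block = self._counter.to_bytes(8, "big")
            out += hashlib.sha256(self._state + block).digest()
            self._counter += 1
        return out[:n]


composed = ToyHashPool()
composed.add_entropy(b"0")
composed.add_entropy(b"1")

only_last = ToyHashPool()
only_last.add_entropy(b"1")

# With composing inputs, the earlier b"0" still influences the output.
print(composed.random_bytes(8) != only_last.random_bytes(8))
```

With `set_seed` in place of `add_entropy`, both pools would produce the same bytes, because the second call throws away whatever the first one contributed.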

andreareina · 2024-09-25T10:19:27 1727259567

The Linux rng allows writing to a file to effect this.

yunohn · 2024-09-25T16:20:57 1727281257

AFAIU the blog author is taking the correctly randomly generated UUID and just cutting out the timestamp portion.

Why are you equating that to a hacky attempt to make less random data more random?

hinkley · 2024-09-25T17:14:08 1727284448

Because it’s essentially setSeed(getTimeMillis()). V6 and v7 are sortable; that’s why they exist. Which means that, like getTimeMillis(), there are a finite number of starting points to try when guessing the seed.

yunohn · 2024-09-25T20:24:17 1727295857

This is v4.

hinkley · 2024-09-28T09:56:07 1727517367

v6 is a transform of v1 UUIDs to behave like v7 keys with respect to database indexing - increasing over time. If it’s a function of time, then it’s still guessable by brute force.
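For anyone who wants to see the v1-to-v6 transform, here is a minimal Python sketch. It assumes the usual field layouts (v1 stores the low timestamp bits first, v6 stores the high bits first), and `v1_to_v6` is a name made up for this example, not a standard library function.

```python
import uuid


def v1_to_v6(u: uuid.UUID) -> uuid.UUID:
    """Reorder a v1 UUID's 60-bit timestamp so the most significant bits come first."""
    if u.version != 1:
        raise ValueError("expected a v1 UUID")
    ts = u.time  # 60-bit count of 100 ns intervals since 1582-10-15
    time_high = (ts >> 28) & 0xFFFFFFFF
    time_mid = (ts >> 12) & 0xFFFF
    time_low_and_version = 0x6000 | (ts & 0x0FFF)
    low64 = u.int & 0xFFFFFFFFFFFFFFFF  # variant, clock sequence, node: unchanged
    return uuid.UUID(
        int=(time_high << 96) | (time_mid << 80) | (time_low_and_version << 64) | low64
    )


v1 = uuid.UUID("c232ab00-9414-11ec-b3c8-9f6bdeced846")
print(v1_to_v6(v1))  # 1ec9414c-232a-6b00-b3c8-9f6bdeced846
```

Every bit of the result comes from the timestamp, clock sequence, or node of the v1 value. Nothing in it is drawn from a random number generator, which is why neither version is a sensible source of random bytes.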

Another responder suggested that the mention of v6 UUIDs is an error. Maybe. But that’s a truly bizarre typo to make. And they still haven’t fixed it.

yunohn · 2024-09-28T10:46:29 1727520389

But you could google Postgres UUID and confirm that they only provide v4? Instead of continuing a rant based on incorrect assumptions.

https://www.postgresql.org/docs/current/functions-uuid.html

amenhotep · 2024-09-25T20:34:17 1727296457

You seem to be saying that

setSeed(0); setSeed(1); rand()

and

setSeed(1); rand()

returning different values is not only a good idea but is already a thing. Am I wrong?

This would confuse the hell out of me, what specifically has this behaviour?

hinkley · 2024-09-28T10:01:13 1727517673

If you’re trying to create a repeatable source of randomness for unit testing for example, you would create a new PRNG for each run, not try to recycle an existing instance. You’re making assumptions about state that aren’t supportable.

Dylan16807 · 2024-09-28T21:22:42 1727558562

The behavior you are describing is not setting a seed. A seed should wipe out all existing state.

Adding entropy is a very different operation.

ynik · 2024-09-25T12:09:20 1727266160

> You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy.

This does not actually work. If an attacker can observe output of the CSPRNG, and knows the initial state (when it did not yet have enough entropy), then piecemeal addition of entropy allows the attacker to bruteforce what the added entropy was. To be safe, you need to add a significant amount of entropy at once, without allowing the attacker to observe output from an intermediate state. But after you've done that, you won't ever need to add entropy again.
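A toy Python demonstration of that attack, under loud assumptions: the attacker knows the initial state, the pool is a bare SHA-256 chain invented for this example (`mix`, `output` and `recover_piecemeal` are hypothetical names), and output is observable after every one-byte addition.

```python
import hashlib


def mix(state: bytes, data: bytes) -> bytes:
    """Toy pool update: fold new input into the existing state."""
    return hashlib.sha256(state + data).digest()


def output(state: bytes) -> bytes:
    """Toy output function: what an observer of the generator sees."""
    return hashlib.sha256(b"out" + state).digest()[:8]


def recover_piecemeal(known_state: bytes, observed: list[bytes]) -> bytes:
    """Recover one-byte additions one at a time: at most 256 guesses each."""
    state, recovered = known_state, b""
    for seen in observed:
        for guess in range(256):
            candidate = mix(state, bytes([guess]))
            if output(candidate) == seen:
                state, recovered = candidate, recovered + bytes([guess])
                break
        else:
            raise ValueError("no one-byte guess matches the observed output")
    return recovered


# Victim: starts from a state the attacker knows, adds 4 secret bytes one at
# a time, and emits output after each addition.
secret = bytes([0x13, 0x37, 0xC0, 0xDE])
initial = bytes(32)
state, observed = initial, []
for b in secret:
    state = mix(state, bytes([b]))
    observed.append(output(state))

print(recover_piecemeal(initial, observed) == secret)
```

The attacker needs at most 4 × 256 guesses here. Had all four bytes been mixed in before any output was emitted, the search space would have been 2^32, and it grows exponentially with every additional byte added in the same batch.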

beng-nl · 2024-09-25T16:46:14 1727282774

You’re right, but I did not read GP to suggest otherwise.

GP does not suggest using the output before enough entropy has been gathered; e.g. see ‘until’ in:

> until you’re satisfied that the RNG has gotten a suitable amount of entropy.

hinkley · 2024-09-25T17:19:07 1727284747

Sibling already answered this. I don’t know how you came to this conclusion.