A few secure, random bytes without `pgcrypto` (brandur.org)
57 points by surprisetalk 3 months ago | 38 comments



gen_random_uuid() produces a v4 UUID.

Taking the first 5 bytes of a v6 UUID (time) and last 5 (node) would be a bad random day.


Wait, is this blog actually about how to introduce a backdoor into your Postgres install by rolling your own very bad rng?


Nah, mhio is saying that the blog post has a typo:

> Postgres 13’s gen_random_uuid() which generates a V6 UUID that’s secure...

gen_random_uuid gives you a V4 UUID, not a V6 UUID (it's even in the code comments in the snippet included in the blog). I don't believe Postgres even has a function to generate a V6 UUID - which, indeed, would be a bad idea to use as a source of randomness.


No, a v4 uuid comes from a good RNG. The blog post just said v6 by mistake when it meant v4.


V6 is just a v4 rearranged to behave more like v7 for the purposes of b-tree insertion.


I believe V6 is a reordering of V1, not V4. V4 is random aside from the bits specifying version & variant: 6 bits (4 for the version, 2 for the variant), leaving 122 random bits.
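
To make the bit layout concrete, here is a minimal Python sketch (not the Postgres implementation) that inspects the fixed bits of a v4 UUID. The helper name `fixed_bits` is made up for illustration. It assumes the standard layout, where the version is the high nibble of byte 6 and the variant is the top two bits of byte 8, so the first five bytes contain no fixed bits.

```python
import uuid


def fixed_bits(u: uuid.UUID) -> tuple:
    """Return (version nibble, top two variant bits) of a UUID."""
    raw = u.bytes
    version = raw[6] >> 4  # high nibble of byte 6
    variant = raw[8] >> 6  # top two bits of byte 8
    return version, variant


u = uuid.uuid4()  # CPython draws these bytes from os.urandom
version, variant = fixed_bits(u)
print(version, bin(variant))  # 4 0b10 for a standard v4 UUID

# Bytes 0..4 come before the version nibble, so they are 40 random bits.
first_five = u.bytes[:5]
print(first_five.hex())
```

The same slicing applied to a time-based UUID would return timestamp bits instead, which is the concern raised at the top of the thread.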


I read this exact reply on this exact article two days ago.

What is happening right now? Why is your comment marked four hours ago?


I read this exact reply on this exact article two days ago.

What is happening right now?


Take the blue pill, press the back button, and pretend you never saw it.

Or take the red pill, pull back the curtains of reality, and see the machinery behind: https://news.ycombinator.com/item?id=41197775


And I got two replies recorded after making an edit to expand on my question.

Glitch in the matrix.


Man reading this whole thread glitched my brain a bit


There’s a few decades after you stop worrying you’re crazy and before you start worrying you’re senile. Leaving you a lot more energy for other things. Enjoy them.


My HN client uses the HN API which reveals the true post time of the comment. See https://seville.protostome.com/item/?id=41641314.


I'm really confused by this post. Wouldn't it be simpler to read a few bytes from /dev/random?

Sure, it wouldn't be portable to windows but that's more of a feature than a bug.


How do you read from a file like that in SQL? I know that this is in "theory" possible but I've never had a legitimate use case where I've needed to do file I/O from my ORM, lol.

This is the ChatGPT answer that I was able to derive:

> You can read from `/dev/urandom` in a PostgreSQL query using `plperlu`, which allows executing unsafe Perl code.
>
> Create a function to read random bytes:

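  CREATE EXTENSION plperlu;

  CREATE OR REPLACE FUNCTION get_random_bytes(num_bytes int)
  RETURNS bytea
  LANGUAGE plperlu
  AS $$
  my $num_bytes = $_[0];
  open my $urandom, '<', '/dev/urandom' or die "Cannot open /dev/urandom: $!";
  binmode $urandom;  # read raw bytes, no encoding layer
  # Fail loudly on a short read instead of returning fewer bytes than requested.
  read($urandom, my $bytes, $num_bytes) == $num_bytes or die "Short read from /dev/urandom";
  close $urandom;
  # PL/Perl expects bytea return values in bytea-encoded form.
  return encode_bytea($bytes);
  $$;

  SELECT get_random_bytes(16);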


/dev/urandom.

Some systems have basically made them equivalent though.


> I’m broadly against the use of Postgres extensions because they make upgrades harder and projects less portable [1],

I can't find that footnote anywhere


Exercise extreme caution.

Having your security strategy rely on quirky behaviors of an implementation detail which might change is incredibly dangerous.


Not everything is a quirky implementation detail? It’s important for us developers not to write pure glue code between others' functions, but to also understand them and write our own useful code that may extend others' work.


UUID v6 isn’t going to change. There’s a reason we have seven of them now. And v8, which would warrant your warning.


UUIDv6 won't change, but what about gen_random_uuid()?


There is widespread acceptance nowadays that randomized UUIDs must be generated from the system CSPRNG or something equivalent, and that any non-cryptographically secure method is a bug. Most library implementations across languages have converged on this in some way.

That being said, the PostgreSQL documentation doesn't say anything in particular about the predictability of `gen_random_uuid`, so the behavior is unspecified. But it's worth noting the function has an explicit guard to raise an error if secure random is not available, so they were conscious of this possibility and did not attempt any misguided fallbacks.

And unfortunately this requirement is not baked into the UUID spec either, which uses the word "should" instead of "must" when discussing CSPRNG usage.


gen_random_uuid isn’t going to change either, the entire point is to generate a secure uuid4. At most it’ll get faster due to using platform-specific syscalls.


If you’re shopping for a CSPRNG, one of the items that should be very high on your list is being able to call the setSeed function multiple times and have the inputs compose instead of clobber each other.

You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy. Do not chop, rearrange, hash, or bit shift the data trying to make it “stronger”; the CSPRNG will do an infinitely better job of that for you. Just treat it like a Mr Fusion. Drop a can, a banana peel and the stale beer in and let it cook.

I gave a similar speech to a team trying to initialize SSL sessions on an embedded machine. “But what if we XOR…” No. Stahp.


Can you give some examples of CSPRNG implementations that allow this?


The ones in Java did in the context of the original discussion. And any cryptographic hash-based PRNG should have the capability of doing so; it’s an implementation detail whether you restart the data collection or append the data.

Just poke in the setSeed function and see what it does.


The Linux rng allows writing to a file to effect this.


AFAIU the blog author is taking the correctly randomly generated UUID and just cutting out the timestamp portion.

Why are you equating that to a hacky attempt to make less random data more random?


Because it’s essentially setSeed(getTimeMillis()). V6 and v7 are sortable, that’s why they exist. Which means like getTimeMillis() there are a finite number of starting points to try to guess the seed.
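
As a rough illustration of the time-seeded case this comment describes (not of `gen_random_uuid()`, which other replies identify as v4), the Python sketch below seeds a non-cryptographic PRNG with a millisecond timestamp and recovers the seed by trying nearby values. The timestamp, the window size, and the function names are all hypothetical.

```python
import random
from typing import Optional


def token_from_seed(seed_ms: int) -> int:
    """Derive a 64-bit token from a PRNG seeded with a timestamp."""
    rng = random.Random(seed_ms)  # deterministic for a given seed
    return rng.getrandbits(64)


def guess_seed(token: int, approx_ms: int, window_ms: int) -> Optional[int]:
    """Try every millisecond within +/- window_ms of a rough time estimate."""
    for candidate in range(approx_ms - window_ms, approx_ms + window_ms + 1):
        if token_from_seed(candidate) == token:
            return candidate
    return None


true_seed = 1_727_000_000_123  # hypothetical millisecond timestamp
token = token_from_seed(true_seed)
found = guess_seed(token, approx_ms=1_727_000_000_000, window_ms=1_000)
print(found == true_seed)  # True
```

An attacker who knows the time to within a second only has about two thousand seeds to try, which is why time-derived values make poor secrets.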


This is v4.


v6 is a transform of v1 UUIDs to behave like v7 keys with respect to database indexing - increasing over time. If it’s a function of time, then it’s still guessable by brute force.

Another responder suggested that the mention of v6 UUIDs is an error. Maybe. But that’s a truly bizarre typo to make. And they still haven’t fixed it.


But you could google Postgres UUID and confirm that they only provide v4? Instead of continuing a rant based on incorrect assumptions.

https://www.postgresql.org/docs/current/functions-uuid.html


You seem to be saying that

setSeed(0); setSeed(1); rand()

and

setSeed(1); rand()

returning different values is not only a good idea but is already a thing. Am I wrong?

This would confuse the hell out of me, what specifically has this behaviour?


If you’re trying to create a repeatable source of randomness for unit testing for example, you would create a new PRNG for each run, not try to recycle an existing instance. You’re making assumptions about state that aren’t supportable.


The behavior you are describing is not setting a seed. A seed should wipe out all existing state.

Adding entropy is a very different operation.
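
The two semantics being argued over can be put side by side in a short Python sketch. Python's `random` module (not a CSPRNG) replaces its state on every `seed()` call, while the `MixingPool` class below is a hypothetical toy, not a real CSPRNG, that hashes each input together with the prior state. For comparison, the Java documentation for `SecureRandom.setSeed` says the given seed supplements, rather than replaces, the existing seed.

```python
import hashlib
import random

# Replacement semantics: an earlier seed() call has no lasting effect.
random.seed(0)
random.seed(1)
a = random.random()
random.seed(1)
b = random.random()
print(a == b)  # True


class MixingPool:
    """Toy entropy pool for illustration only: inputs compose via hashing."""

    def __init__(self) -> None:
        self._state = bytes(32)

    def add(self, data: bytes) -> None:
        # New state depends on the old state and the new input.
        self._state = hashlib.sha256(self._state + data).digest()

    def output(self) -> bytes:
        return hashlib.sha256(b"out" + self._state).digest()


p1 = MixingPool()
p1.add(b"\x00")
p1.add(b"\x01")

p2 = MixingPool()
p2.add(b"\x01")

print(p1.output() == p2.output())  # False
```

With mixing semantics, a weak later input cannot erase what an earlier strong input contributed, which is the property the "compose instead of clobber" comment is asking for.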


> You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy.

This does not actually work. If an attacker can observe output of the CSPRNG, and knows the initial state (when it did not yet have enough entropy), then piecemeal addition of entropy allows the attacker to bruteforce what the added entropy was. To be safe, you need to add a significant amount of entropy at once, without allowing the attacker to observe output from an intermediate state. But after you've done that, you won't ever need to add entropy again.
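
The attack described above can be sketched with a toy hash-based pool in Python. Everything here is an assumption for illustration: the `mix` and `output` functions stand in for a real generator, the attacker knows the all-zero initial state, and one output is observed after each single-byte addition.

```python
import hashlib
import os


def mix(state: bytes, data: bytes) -> bytes:
    """Fold new input into the pool state."""
    return hashlib.sha256(state + data).digest()


def output(state: bytes) -> bytes:
    """Derive an observable output from the current state."""
    return hashlib.sha256(b"out" + state).digest()


def recover_chunks(initial: bytes, observed: list) -> list:
    """Attacker: brute-force each 1-byte addition from the output that follows it."""
    state = initial
    recovered = []
    for seen in observed:
        for guess in range(256):
            candidate = mix(state, bytes([guess]))
            if output(candidate) == seen:
                recovered.append(guess)
                state = candidate  # attacker now tracks the real state
                break
        else:
            raise ValueError("no matching guess")
    return recovered


initial = bytes(32)
secret = list(os.urandom(8))  # 64 bits in total, added one byte at a time

state = initial
observed = []
for byte in secret:
    state = mix(state, bytes([byte]))
    observed.append(output(state))  # output leaks between additions

recovered = recover_chunks(initial, observed)
print(recovered == secret)  # True
```

The attacker needs at most 8 × 256 guesses here, versus up to 2^64 if all eight bytes had been mixed in before any output was exposed.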


You’re right, but I did not read GP to suggest otherwise.

GP does not suggest using the output before enough entropy has been gathered, e.g. see ‘until’ in:

> until you’re satisfied that the RNG has gotten a suitable amount of entropy.


Sibling already answered this. I don’t know how you came to this conclusion.



