Wow, that thread is nuts. Scrolled up just a bit saw this.
My new public key search system is almost ready. I had to reinvent my binary database system because, although the database was lightweight https://bitcointalk.org/index.php?topic=5475626, I had efficiency issues with binary search. This is now a thing of the past. I have designed a system that stores 100 million public keys in an 80 KB file, yes, what you read 80KB!(in the future it will be smaller) that meets maximum efficiency. We would only be limited by the current speed of Secp256k1 when generating the 100 million or more public keys while creating the database. I am finishing designing the search script after months of being stuck due to personal issues, I am finally back on track.
I love these kind of mad inventor rabbit hole corners of the Internet. Kind of brings back the 90s for me when everything was exciting.
How could this work with less than 1 bit of data per key?
Assuming there are no duplicates, which is a sensible assumption, you’d need a minimum of 100,000,000 bits to store 100,000,000 unique entries larger than 1 bit with even a perfect hash function.
In general, when you're storing a list of numbers, there are many situations where you can go below 1 bit per number.
The easiest one to think about is storing the deltas between each number. Let's say 80% of your deltas are 5. If you use arithmetic encoding, then storing a 5 only takes about 1/3 of a bit. It's not hard to come up with probability distributions where the average amount of bits per entry is less than 1.
Also, back in the realm of perfect hashes, once you're more than half full it becomes more efficient to store the missing numbers. If your perfect hash has 100,003,000 possible outputs, then your worst case is around 50k unique entries. By the time you encounter 100k unique entries you only need to keep track of the 3000 you haven't seen yet.
Assuming no duplicates, the only case that would make sense would be if all but a single byte was different (sequentially across all records). Even then you’d end up with more than the number of bytes we’re talking about, even excluding the size of the index (which would be non-trivial).
Why don’t you just read what the guy said by following the links in the forum? Surely, you can find more explanation there that will answer some of your questions? Hahaha! :)
His thing has collisions, so it answers none of the questions.
Also they already did follow the link. That's why they said "they don’t actually store the keys, so the quote is misleading", which you responded to with a laugh and nothing else. And that happened many hours before you made this new comment.
I'm not sure that guy really understood what was going on. If he'd followed the links he would've found the code. Or at least a technical description. So why need to play dumb and ask here, while trying to control the discussion?
I don't like that kind of thing. If you're okay with it, alright. But that's not me.
My new public key search system is almost ready. I had to reinvent my binary database system because, although the database was lightweight https://bitcointalk.org/index.php?topic=5475626, I had efficiency issues with binary search. This is now a thing of the past. I have designed a system that stores 100 million public keys in an 80 KB file, yes, what you read 80KB!(in the future it will be smaller) that meets maximum efficiency. We would only be limited by the current speed of Secp256k1 when generating the 100 million or more public keys while creating the database. I am finishing designing the search script after months of being stuck due to personal issues, I am finally back on track.
I love these kind of mad inventor rabbit hole corners of the Internet. Kind of brings back the 90s for me when everything was exciting.