Currently totally unrealistic in practice, but it's proven that the algorithms are in P (polynomial time), and it's just ("just", I know...) a matter of getting the constant-factor overheads down. It's definitely desirable, because it means some untrusted host could perform operations on your data without ever knowing what it was operating on, e.g. "I added your numbers, but I don't know what they are or what their sum is."
I'm reading this thread and I still can't figure out the point or how it protects things.
What client would want to add some numbers without knowing what the result is?
I can't imagine any interaction with a database where I don't need to do an operation with the resulting data (even if it's just passing it into a different API/DB unchanged).
My interpretation of your example in human terms is "add your day of birth to your bank balance, but don't tell me the answer" and then walking away. What am I missing?
-- edit --
Maybe I'm looking at it in the wrong direction? Is it more like "here are some numbers; add them up on your calculator, but don't look at the result before showing it to me"?
The latter is more like it. It's a case where you want something computed: the sum of your paychecks over the last year, some credit-card risk evaluation, or whether your genome data contains markers for cancer. Instead of sending these values to the server (perhaps encrypted in flight) where they are processed in plaintext (open to malicious intent on the server side, or to honest-but-curious folks who mine your sensitive personal information), the values you upload remain encrypted, so only you know what they mean.

The cloud side can still sum them, perform threshold evaluations, search for things, determine fitness for a loan, and so on. It goes through the motions of the computation for you without being able to decipher anything about the inputs or the result at any step along the way. When the server side is done, it has computed whatever you asked it to, but it knows nothing: it has reliably and deterministically manipulated symbols in a language it can't read. You can read that language, so when the encrypted server-side results are sent back, you decrypt them and learn whether you've been approved, or have genetic markers for cancer, or whatever else you asked.
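To make that concrete, here is a minimal sketch of an additively homomorphic scheme: a toy Paillier implementation with tiny, insecure parameters chosen purely for illustration. The server multiplies ciphertexts, which adds the hidden plaintexts, without ever being able to read either.

    # Toy Paillier scheme -- additively homomorphic, tiny insecure primes,
    # illustration only. Real deployments use ~2048-bit primes and vetted libraries.
    import math
    import random

    p, q = 293, 433                      # toy primes
    n = p * q
    n_sq = n * n
    g = n + 1
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)

    def L(x):
        return (x - 1) // n

    mu = pow(L(pow(g, lam, n_sq)), -1, n)   # modular inverse, Python 3.8+

    def encrypt(m):
        r = random.randrange(2, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(2, n)
        return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

    def decrypt(c):
        return (L(pow(c, lam, n_sq)) * mu) % n

    def add_encrypted(c1, c2):
        # The server multiplies ciphertexts; the hidden plaintexts get added.
        return (c1 * c2) % n_sq

    # Client encrypts two paycheck amounts; the server sums them blindly.
    c1, c2 = encrypt(1200), encrypt(1500)
    c_sum = add_encrypted(c1, c2)   # server never sees 1200, 1500, or 2700
    print(decrypt(c_sum))           # client decrypts: 2700

Fully homomorphic schemes extend this to multiplication as well (and hence to arbitrary circuits), which is what would let the server evaluate thresholds, searches, or a loan-fitness model rather than just sums, at the cost of the huge overheads discussed above.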
“This thing I’m handing you is a $10 bill. Also, I have a proof that its serial number is real and not duplicated, a proof I can share with any third party to prove my payment or validate the ledger of payments, without revealing which serial number it is (thereby preserving your privacy).”
That is essentially what zcash is. So as I said, there are numerous applications in financial crypto (where we can reasonably spend full seconds on a desktop, or minutes on secure hardware, grinding away at a proof to send funds), but it's not so practical to spend an even greater amount of time on each database query.
Polynomial time is pretty bad, I feel, and not due to any constant factor. A search in a regular database with an index is either O(1) or O(log N), depending on the design.
An indexed lookup in an N-element structure (an O(log N) operation, frequently approximated as O(1)) requires O(N) operations when done under homomorphic encryption, because the server can't skip entries without learning which ones are irrelevant. Constant factors exacerbate the problem, but any usefully large database is wholly unsuitable for homomorphic use.
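For intuition about the O(N), here is a sketch of an oblivious lookup built on the same toy Paillier scheme as above (repeated so the snippet runs on its own; oblivious_lookup is a name made up for this sketch, and the records are left in plaintext on the server purely to keep the example short). The client sends an encrypted one-hot selector, and the server has to fold in every one of the N records because it can't tell which positions hold the zeroes.

    # Why encrypted lookup is O(N): the client sends Enc(1) at position i and
    # Enc(0) everywhere else; the server, unable to tell them apart, must touch
    # every record. Toy Paillier again -- insecure, illustrative only.
    import math
    import random

    p, q = 293, 433
    n, g = p * q, p * q + 1
    n_sq = n * n
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    L = lambda x: (x - 1) // n
    mu = pow(L(pow(g, lam, n_sq)), -1, n)

    def encrypt(m):
        r = random.randrange(2, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(2, n)
        return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

    def decrypt(c):
        return (L(pow(c, lam, n_sq)) * mu) % n

    def oblivious_lookup(enc_selector, records):
        # Homomorphically computes Enc(sum_j selector[j] * records[j]).
        # Note the loop over *all* records -- this is the unavoidable O(N).
        acc = encrypt(0)
        for c, rec in zip(enc_selector, records):
            acc = (acc * pow(c, rec, n_sq)) % n_sq   # Enc(a) * Enc(b)^rec = Enc(a + b*rec)
        return acc

    records = [17, 42, 99, 7]              # server-side data
    i = 2                                  # index the client wants, hidden from the server
    selector = [encrypt(1 if j == i else 0) for j in range(len(records))]
    print(decrypt(oblivious_lookup(selector, records)))   # -> 99

Real private-lookup protocols are far more sophisticated, but a single untrusted server generally still has to touch every record, which is where the O(N) comes from.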
Hmm... I don't understand this. O(N) is very fast for most realistic purposes I can come up with. It's faster than sorting - O(N log N) - which I was taught is "basically a free operation". (Coursera class with Tim Roughgarden.)
Sorting (or any other O(N log N) operation) is only "basically a free operation" for small N, and if you're not doing it very often. A billion times log base two of a billion is roughly thirty billion. If an operation takes a billionth of a second, your "basically free operation" is taking thirty seconds. Also, you definitely want to avoid doing an O(N log N) operation more often than you have to. For example, doing
    for element of listB:
        listA.add(element)
        listA.sort()
is going to be much, much slower than
    listA.sort()
    for element of listB:
        listA.insertInSortedPosition(element)
since "insert in sorted position" should be O(log N) — just use binary search to find the correct position. The first algorithm is O(N^2 log N) while the second algorithm is O(N log N). For any non-negligible N, you definitely don't want to be running in polynomial time if you can avoid it! And can you guarantee your algorithm will never be run on non-negligible N?
I mean, yeah, sometimes you can. And in that case, "premature optimization is the root of all evil," sure. If you're just pulling a couple hundred product records out of a database and then sorting them in code, then yeah, an O(N log N) sort is "basically a free operation."
But sometimes you want to do more complicated things than that, and then it's not true. So overall I think it's pretty irresponsible to teach that sorting is basically free (without caveats). That's definitely not always true!
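To make the comparison above concrete, here is a small, self-contained Python sketch (the sizes and function names are arbitrary, and timings will vary by machine) that builds a sorted list both ways: re-sorting after every insert versus inserting in sorted position with bisect.insort.

    # Sketch: compare "re-sort after every insert" with "insert in sorted position".
    import bisect
    import random
    import time

    items = [random.randrange(1_000_000) for _ in range(20_000)]

    def build_by_resorting(values):
        out = []
        for v in values:
            out.append(v)
            out.sort()              # re-sort the whole list every time
        return out

    def build_by_sorted_insert(values):
        out = []
        for v in values:
            bisect.insort(out, v)   # binary search for the position, then insert
        return out

    t0 = time.perf_counter()
    a = build_by_resorting(items)
    t1 = time.perf_counter()
    b = build_by_sorted_insert(items)
    t2 = time.perf_counter()
    assert a == b
    print(f"re-sort each time: {t1 - t0:.2f}s   sorted insert: {t2 - t1:.2f}s")

On a typical machine the second version should finish in a small fraction of the time of the first, and the gap widens quickly as the list grows.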
I am also an amateur in these things. If indexing is O(N), does that mean it has the same performance as a linked list? Basically that would mean you have to go through each and every item until you reach the index you want.
Does homomorphic encryption allow running operations blindly on an encrypted dataset and returning encrypted results without ever decrypting the dataset?