Ugh, that is very bad. I had much higher thoughts of the cryptographic purity of...

nullc · on May 9, 2013

Bitcoin is a distributed consensus algorithm. Software which implements a subtly different algorithm might be able to join the consensus until some case arises which triggers the difference, after that consensus between the implementations would be impossible.

Unfortunately, it's rather hard to demonstrate that two dissimilar complex programs actually implement exactly the same function.

This is doubly true because performance is very important, so some techniques that might help— e.g. define a kind of abstract turing machine, write the rules for the distributed algorithm in its language, then different implementations use the same rules and only have to prove their turing machine implementation is functionally identical— aren't readily available.

It can be hard to gain confidence through testing because some of the most important rules are for cases which are forbidden, so these cases will not show up on the production network— they only happen when an attacker produces them. There are an infinite number (subject to memory, and already the function input is gigabytes of data) of valid and invalid cases, so exhaustive testing of the whole function is not possible. They can also arise out of implicit behavior, "what happens when this value overflows?", "what is the maximum size of this structure", etc.

This is further complicated by the fact that Bitcoin implementations use third party code that wasn't written with this kind of must-have-exact-behavior in mind. The reference implementation uses things like OpenSSL's crypto and bignums, various boost data structures, BDB (previously, now leveldb)... and subtle potentially undocumented and unknown behavior from this third party code may be leaking into the definition of the consensus algorithm. The reference implementation has been slowly disentangling these dependencies, but things like the hidden BDB locking limitations (which depend on the layout of the database on disk) are easy to miss.

(Third party code isn't just a reference implementation issue, e.g. BitcoinJ's full node code has had algorithm inconsistencies arising out of undocumented behavior in a database library it uses)

Satoshi's answer for this was that there should only be one implementation of the full node software. However, "create your own Bitcoin implementation" seems to be the new millennium's "create an IRC client" so it seems that this is unrealistic... and there have been more than a half dozen attempts (including two nearly complete ones in java, two in C, three in C++, one in Haskell, one in erlang, one (two?) in javascript, as well as several non public ones (including one in ruby, I think)).

mechanical_fish · on May 9, 2013

Satoshi's answer for this was that there should only be one implementation of the full node software.

He should have talked to a release engineer. There are always at least two implementations of a piece of software: the stable version, and the prerelease version.

With Herculean effort and focus one could theoretically arrange a system where every customer on Earth runs version N until midnight on Dec 31 and runs version N+1 one microsecond later. You would have to distribute the new code in advance. But it wouldn't ever be perfect, because networks get partitioned: someone's PC would be unplugged when the patch was pushed. And what's the rollback procedure when you push the unforeseen bug?

rmc · on May 9, 2013

That sounds like the opposite of an actual peer-to-peer, decentralized system that BitCoin promises. Funny that.

betterunix · on May 9, 2013

The cryptographic purity of a system without a formal security definition and for which there is a known polynomial time attack? I think you were bound to be disappointed...

fianchetto · on May 9, 2013

Bitcoin is cryptographically pure but programmers aren't quite as perfect.

betterunix · on May 9, 2013

What is the security definition?

nullc · on May 9, 2013

https://socrates1024.s3.amazonaws.com/consensus.pdf

betterunix · on May 9, 2013

I think this paper presents a poor security definition for Bitcoin, at the very least because it assumes a non-adaptive adversary (i.e. the adversary cannot change which nodes it controls over time). That is not how things work in the real world; in the real world, the adversary will keep increasing his computational power (the number of "pawns") until his attack is successful. I am also skeptical of a computational model that assumes the adversary's computation is somehow synchronized with the honest parties'; that is not even remotely realistic, and it is very weak.

In other words, that paper does not tell us anything we did not already know: Bitcoin is secure only if nobody does as much computational work as the rest of the network combined (though there may be other attacks on the protocol that have not yet been discovered). If anything, this paper is a step towards formalizing a security definition for a protocol that is similar to Bitcoin, or a step towards proving that there will always be a polynomial time attack against such protocols (as was the case with Merkle's puzzles).

nullc · on May 9, 2013

It assumes an anonymous protocol. The identity of the nodes does not matter.

> a step towards proving that there will always be a polynomial time attack against such protocols

I had considered that to be the definition of a majority consensus. I find it sort of surprising that you'd think otherwise.

Lets just assume that some alternative protocol has a property where a an attack with the >50% computing power would be ignored. Then it follows that it would also allow an attack with <50% computing power— unless "attack" could be detected as a function of network state, in which case any sane system would just ignore those entirely— as Bitcoin does, e.g. a transaction outputting more coins than it inputs is ignored regardless of the hashpower— so they're not the kind of attacks we're talking about here.

Even if you dispense with all the crypto-computing-power-mumbo-jumbo: A _consensus_ ultimately depends on linear energy applied to an attack. Lets imagine a magical version of Bitcoin solves the sybil problem completely and counts the consensus of _users_ instead of computing power. China (for example) could reorganize the consensus by spending a lot of energy to manufacturer a lot of additional people. So long as the attacker put in more energy mining people than all the honest participants they'd always eventually win.

betterunix · on May 9, 2013

"It assumes an anonymous protocol. The identity of the nodes does not matter."

Nor would it matter in a protocol where each node is assigned a unique ID. All that matters is whether or not the nodes are malicious and whether malicious nodes can violate some security property of the protocol.

"I had considered that to be the definition of a majority consensus."

This is kind of like saying that a Merkle puzzles approach is the only key exchange system.

"Lets just assume that some alternative protocol has a property where a an attack with the >50% computing power would be ignored."

Why even bother with "computing power?" Let's make the adversary more powerful by allowing attacks that run in polynomial time, and furthermore allowing the attacker to coordinate parties of its choosing, and to adaptive corrupt more parties in the system. Security against such attackers is not at all unheard of:

http://eprint.iacr.org/2012/441.pdf

"Then it follows that it would also allow an attack with <50% computing power— unless "attack" could be detected as a function of network state,"

Yes, that is how secure multiparty computation is usually approached. That is why secure protocols in the malicious model (or some multiparty variant of it) often involve zero-knowledge proofs, commitment schemes, and so forth.

"A _consensus_ ultimately depends on linear energy applied to an attack"

Again, it sounds like you are arguing against the use of consensus systems. It sounds like you are saying Bitcoin cannot be secure (at least not as a digital cash system), at least not by any cryptographic security definition.

To put it another way, would you want the person or group with the loudest voice to prevent you from spending your money, or to take money they gave you back against your will? It is one thing for a currency issuer to destroy an economy; it is another for anyone who spins their CPU to be able to cheat or engage in targeted attacks.

moe · on May 10, 2013

It is one thing for a currency issuer to destroy an economy; it is another for anyone who spins their CPU to be able to cheat or engage in targeted attacks.

Ever heard of HFT?

wmf · on May 9, 2013

(Warning: wild-ass rambling follows.)

The only reason an agreement protocol is needed is because nodes are allowed to have different opinions (in Bitcoin's case, on the contents of the block chain) and so the protocol agrees on one of these opinions. If a system could be designed such that there is no room for opinion then it would be obvious whether a block is correct or not and thus having more resources would not benefit an attacker. This might require too much synchronization to be practical, though.