Hacker News new | past | comments | ask | show | jobs | submit login

Bitcoin is a distributed consensus algorithm. Software which implements a subtly different algorithm might be able to join the consensus until some case arises which triggers the difference, after that consensus between the implementations would be impossible.

Unfortunately, it's rather hard to demonstrate that two dissimilar complex programs actually implement exactly the same function.

This is doubly true because performance is very important, so some techniques that might help— e.g. define a kind of abstract turing machine, write the rules for the distributed algorithm in its language, then different implementations use the same rules and only have to prove their turing machine implementation is functionally identical— aren't readily available.

It can be hard to gain confidence through testing because some of the most important rules are for cases which are forbidden, so these cases will not show up on the production network— they only happen when an attacker produces them. There are an infinite number (subject to memory, and already the function input is gigabytes of data) of valid and invalid cases, so exhaustive testing of the whole function is not possible. They can also arise out of implicit behavior, "what happens when this value overflows?", "what is the maximum size of this structure", etc.

This is further complicated by the fact that Bitcoin implementations use third party code that wasn't written with this kind of must-have-exact-behavior in mind. The reference implementation uses things like OpenSSL's crypto and bignums, various boost data structures, BDB (previously, now leveldb)... and subtle potentially undocumented and unknown behavior from this third party code may be leaking into the definition of the consensus algorithm. The reference implementation has been slowly disentangling these dependencies, but things like the hidden BDB locking limitations (which depend on the layout of the database on disk) are easy to miss.

(Third party code isn't just a reference implementation issue, e.g. BitcoinJ's full node code has had algorithm inconsistencies arising out of undocumented behavior in a database library it uses)

Satoshi's answer for this was that there should only be one implementation of the full node software. However, "create your own Bitcoin implementation" seems to be the new millennium's "create an IRC client" so it seems that this is unrealistic... and there have been more than a half dozen attempts (including two nearly complete ones in java, two in C, three in C++, one in Haskell, one in erlang, one (two?) in javascript, as well as several non public ones (including one in ruby, I think)).




Satoshi's answer for this was that there should only be one implementation of the full node software.

He should have talked to a release engineer. There are always at least two implementations of a piece of software: the stable version, and the prerelease version.

With Herculean effort and focus one could theoretically arrange a system where every customer on Earth runs version N until midnight on Dec 31 and runs version N+1 one microsecond later. You would have to distribute the new code in advance. But it wouldn't ever be perfect, because networks get partitioned: someone's PC would be unplugged when the patch was pushed. And what's the rollback procedure when you push the unforeseen bug?


That sounds like the opposite of an actual peer-to-peer, decentralized system that BitCoin promises. Funny that.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: