It's truly amazing to see such a large bounty for an open source project. Even though this bug poses no security threat, the reward offered is akin to those offered by the big powerhouses (FB, Microsoft, Google) for fixes to critical security issues.
Well, to be fair, in this case the bounty was set up before the recent rise in price. It's only getting attention now because it's worth something.
But the donors could have backed out and did not, and there are other similarly-sized bounties in the bitcoin community (I currently make my living off of community donations as a bitcoin-core developer).
The two donors I see are gavin and theymos. The former receives bitcoins, so the rise in price doesn't affect him; the latter paid nothing himself, since those coins were all given to him. There is also someone else who contributed about 1 BTC, but that can't be taken back unless wtogami himself negotiates with this donor to send it back.
Anyway, I think the attention is well deserved and I hope it contributes to a real fix. The current situation makes me not want to use the official bitcoin client under OS X at all. And that will remain the case for a long time, since I have no idea whether the bug will be correctly fixed.
Note: to my knowledge there has not been any loss of wallet due to this bug. You simply have to endure the annoyingly long process of re-indexing the block chain after a corruption.
The bug is unrelated to losing a wallet, simply because the wallet doesn't even use LevelDB (it is still on BDB). Your last sentence was the reason for my earlier paragraph.
Don't set too much store by the MtGox price - it's not real money, because all their bank accounts are frozen. The value is high because it reflects the substantial risk that nobody will get any of that money.
Yes, sometime back in spring or early summer I believe (based on my memory of #bitcoin-dev conversations.. I don't frequent the scum-infested bitcointalk much).
It's an infuriating bug: there's no real reason for it, and no pattern as to what triggers it. Some users see it daily; others never see it. It's been around long enough to seriously irritate some users, hence the bounty.
In particular, it seems to not manifest itself for the technical folks who are likely to actually solve it if they can reproduce it and yet be quite frequent for others.
I wouldn't be shocked if it ultimately turned out to be due to some setting that gurus would never have enabled. :)
Yeah, let's be honest: it's probably something that's a pain in the ass to solve. Unpredictable and difficult to reproduce. Open-source devs clearly need some motivation for this one.
"For applications that require tighter guarantees about the integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl. The F_FULLFSYNC fcntl asks the drive to flush all buffered data to permanent storage. Applications, such as databases, that require a strict ordering of writes should use F_FULLFSYNC to ensure that their data is written in the order they expect. Please see fcntl(2) for more detail."
"F_FULLFSYNC - Does the same thing as fsync(2) then asks the drive to flush all buffered data to the permanent storage device (arg is ignored). This is currently implemented on HFS, MS-DOS (FAT), and Universal Disk Format (UDF) file systems. The operation may take quite a while to complete. Certain FireWire drives have also been known to ignore the request to flush their buffered data."
OS X has aggressive file buffering in memory, and it's getting more aggressive all the time. For example, cfprefsd, introduced in 10.8 (https://developer.apple.com/library/mac/releasenotes/DataMan...), made it so that when a system application read a preferences file, it stayed in memory and the disk version was ignored, until cfprefsd eventually synced it back to disk. In 10.9 the behavior is much worse, to the point that once a pref is in cfprefsd, it's unlikely to leave until the user logs out or the machine reboots.
In this instance, OS X has, for quite some time, had "defrag on the fly" for files under 20MB in size. On access, the file is read into memory and kept there in its entirety until memory pressure from other processes triggers a sync back to disk. When it comes to writing a small file back to disk, OS X will "get around to it" when it's damned well ready, unless you force its hand using the fcntl options above.
Unfortunately, the bit about "This is currently implemented on HFS, MS-DOS (FAT), and Universal Disk Format (UDF) file systems" covers pretty much the full range of filesystem types that OS X can natively read and write - but one that might get past this is ExFAT. I'd be surprised if that were the case, but it is natively supported read+write on OS X, and setting up an ExFAT volume for the database would be a quick and easy way to test it and possibly verify this as the root cause.
(Additionally, third-party read+write access to filesystems like NTFS via Paragon / Tuxera may be able to confirm this as well.)
This appears to be benchmark gaming - the POSIX Rationale for fsync(2) says:
The fsync() function is intended to force a physical write of data from the buffer cache, and to assure that after a system crash or other failure that all data up to the time of the fsync() call is recorded on the disk. Since the concepts of "buffer cache", "system crash", "physical write", and "non-volatile storage" are not defined here, the wording has to be more abstract.
Unfortunately, it's often not that simple on POSIX systems; it's quite common for disk controllers to disobey fsync, for instance, either in their drivers or in their hardware caches or both. I'm actually surprised Apple is even willing to make the guarantees above about F_FULLFSYNC; I'd read it as, at best, only applying to Apple hardware, as if a third-party controller is doing something silly they can't really do much about that.
It is useful for forcing all writes out to the storage device. If your device is battery/UPS-backed and has enough capacity to flush its buffers (to disk or to flash memory) after a power loss, that is sufficient to (eventually) get your data on disk. (Yes, the drive may fail, but if that happens after the data has hit the platter, you have no guarantees either.)
There are two patches linked in the OP that switch to using F_FULLFSYNC on OSX. The OP says that people are still encountering db corruption even on branches with these fixes.
Additional reply here since my other is too old to edit:
Something a tester experiencing corruption at startup may want to try is using the 'purge' command from the Terminal.
While a restart will indeed trigger caching files back to disk before the pending restart of the system, the 'purge' command will simulate a "cold boot"-like empty disk buffer by dropping existing file caches.
If using the command solves the problem of the database corruption without a restart, then you're definitely suffering from disk cache. Easy to confirm.
Additionally, since this is an error on boot-up of LevelDB, this is probably related to file reading - especially since on a second boot-up after a restart, no error is detected. F_FULLFSYNC is for ensuring that a particular file's changes are fully written to disk ...
... but the cache works both ways. A program could also end up reading the disk cache (which sounds like what's happening here), unless you used F_NOCACHE or F_GLOBAL_NOCACHE. Mind you, these don't prevent the accessing of files already in disk cache - they prevent a file from getting into the disk cache in the first place.
(Disclaimer: I can't think of a situation where the disk cache would get out of sync with the on-disk version of the file, if F_FULLFSYNC is used, when accessed by a subsequent launch of a program - except in the case of faulty RAM flipping bits. Your average file operation done by your average application on OS X isn't performing a checksum of the data and generally wouldn't notice a single bit flip. It would be interesting to see which of these machines are using ECC RAM.)
Monetary incentives actually decrease quality in situations like this; they devalue altruistic contributions (see Drive: The Surprising Truth About What Motivates Us).
I've definitely found that for myself, especially for smallish amounts of money. If it's enough money that I can justify it as freelancing, then it's a different category entirely: I'll do a good job in return for being paid. But I'd rather do things I'm interested in without any money involved (writing Wikipedia articles, answering questions I know something about, etc.) than chase $2 bounties. That ends up making it feel not like an interesting hobby, but like a low-paid job, like spending all day on Mechanical Turk.
As a proprietor of such a system you also end up with a whole category of new bad behaviors, as people try to maximize their hourly pay by finding ways to game the system, getting the most payout for the least input.
One thing that does work is turning it into a game: non-monetary reputation points, badges, etc. No surprise, this is what Stack Overflow has pioneered (and Wikipedia should take note).
I actually tend to find those pretty demotivating/bad-behavior-inducing as well. Reputation in the actual sense is one thing (getting a reputation for being a good contributor), but I really dislike karma/points-style "scorekeeping" and up/downvotes. Fortunately on HN nobody takes it seriously, or it could be a problem.
Wikipedia does have miscellaneous scores and badges, of which I find the badges awarded by community members as recognition of a contribution most useful: https://en.wikipedia.org/wiki/Wikipedia%3ABarnstars
There are also just raw counts of contributions, which everyone takes with a large grain of salt.
You can also click "thank" next to specific edits, which just sends the person a notice that someone appreciated their edit. That I think is useful, but I'm not sure how useful it would really be to keep a score of "number of thanks" or whatever. The point is just to say "hey someone noticed you did a good job here and appreciates it" to give some encouragement, not to keep score of who got thanked more.
And finally many people just collect vanity "hey look at what I've done!" lists, which can be a nice way of reflecting on your contributions and feeling good about them. Many people's User Pages are like that, or you could do it externally like e.g. http://www.gwern.net/Wikipedia%20resume
Even in my day job, I'd happily pay someone $20 who had the experience to clarify some annoying nuance of XYZ framework. Sure, I could keep digging through the API docs for another hour to find it, but they could answer it in 15 seconds. I'm certain I could be similarly helpful to others. Why isn't SO doing this yet?
The point is that many people who would otherwise have answered your question on SO out of sheer goodwill would stop doing so if $20 were on offer - a crowding-out effect that has some empirical support in psychology.
Psychological altruism bias notwithstanding, I think bounties surely have their place -- perhaps not side-by-side with altruistic answers, so as not to interfere with them: SO tends to concentrate a lot of attention on interesting questions, and very little otherwise. This leaves an opportunity for low-interest, high-personal-value questions, the dual of SO. There would still be a void for low-interest, low-value questions, but reaching those is asking too much.
Maybe. But the bounty has also gotten the bug a lot of attention (case in point, this HN post).
All it takes is one developer who happens to be an expert in this area to see the bug and say, "I know exactly what is causing this!" and go fix it. What's the saying - "given enough eyeballs, all bugs are shallow" or something.
@cryptocoin: being hellbanned means no one can reply to your posts, and very few people see them. This is usually caused by bad behaviour, but if you believe it's a mistake you should request an unban via email.
Since the issue states that all reports seem to be from OS X 10.8.x onward, I do wonder if it is related to changes introduced around 10.8[1][2]. Could FileVault or the default Gatekeeper permissions also be involved?
I don't know much about LevelDB or the Bitcoin client, but I am currently taking a closer look at OS X first.
I don't know much about this issue, but just a shot in the dark: Apple is using a new technology called Core Storage, "layered between the whole-disk partition scheme and the file system used for a specific partition", to manage FileVault and their Fusion Drives. It's also apparently a technology in flux, as there are certain incompatibilities with Fusion Drives (possibly custom-created ones?) in the Mavericks release. John Siracusa mentioned this in a recent podcast, and even hypothesized that Core Storage might be a stop-gap between HFS+ and a new Apple-designed file system. Could it have something to do with this?