OS X LevelDB Corruption Bounty: 10.00 BTC + 200.2 LTC

guyht · on Nov 25, 2013

It's truly amazing to see such a large bounty for an open source project. Even though this bug poses no security threat, the reward offered is akin to those offered by the big powerhouses (FB, Microsoft, Google) for providing fixes to critical security issues.

maaku · on Nov 25, 2013

Well, to be fair, in this case the bounty was set up before the recent rise in price. It's only getting attention now because it's now worth something.

But the donors could have backed out and did not, and there are other similarly-sized bounties in the bitcoin community (I currently make my living off of community donations as a bitcoin-core developer).

cryptocoin · on Nov 25, 2013

The two donors I see are gavin and theymos. The former receives bitcoins, so the rise in price doesn't affect him; the latter paid nothing himself, since those coins were all given to him. There is also someone else who contributed about 1 BTC, but that cannot be taken back unless wtogami himself negotiates with this donor to send it back.

Anyway, I think the attention is well deserved and I hope this contributes to a real fix. The current situation makes me not want to use the official bitcoin client on OS X at all. And this will be the case for a long time now, since I have no idea whether the bug will be correctly fixed.

maaku · on Nov 25, 2013

Note: to my knowledge there has not been any loss of wallet due to this bug. You simply have to endure the annoyingly long process of re-indexing the block chain after a corruption.

cryptocoin · on Nov 25, 2013

The bug is unrelated to losing a wallet, simply because the wallet does not even use leveldb (it is still on BDB). Your last sentence is the reason for the last paragraph of my earlier comment.

tlrobinson · on Nov 25, 2013

This thread (and funding of the bounty addresses) was started on Nov 18, the day Bitcoin on MtGox touched $900. Did the bounty start earlier?

asuffield · on Nov 25, 2013

Don't set too much store by the MtGox price - it's not real money, because all their bank accounts are frozen. The value is high because it represents the substantial risk that nobody will get any of that money.

bct · on Nov 25, 2013

broostoryco, you are hellbanned.

maaku · on Nov 25, 2013

Yes, sometime back in spring or early summer, I believe (based on my memory of #bitcoin-dev conversations; I don't frequent the scum-infested bitcointalk much).

nwh · on Nov 25, 2013

It's an infuriating bug: there's no real reason for it, and no pattern as to what triggers it. Some users see it daily; others never see it. It's been around long enough to seriously irritate some users, hence the bounty.

nullc · on Nov 25, 2013

In particular, it seems not to manifest itself for the technical folks who would be likely to solve it if they could reproduce it, and yet it is quite frequent for others.

I wouldn't be shocked if it ultimately turned out to be due to some setting that gurus would never have enabled. :)

dmix · on Nov 25, 2013

Yeah, let's be honest, it's probably something that's a pain-in-the-ass to solve. Unpredictable/difficult to repeat. Open-source devs clearly need some motivation for this one.

Aqueous · on Nov 25, 2013

This has all the symptoms of a race condition.

pudquick · on Nov 25, 2013

... are Apple's manpages never read?

https://developer.apple.com/library/mac/documentation/Darwin...

"For applications that require tighter guarantees about the integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl. The F_FULLFSYNC fcntl asks the drive to flush all buffered data to permanent storage. Applications, such as databases, that require a strict ordering of writes should use F_FULLFSYNC to ensure that their data is written in the order they expect. Please see fcntl(2) for more detail."

https://developer.apple.com/library/mac/documentation/Darwin...

"F_FULLFSYNC - Does the same thing as fsync(2) then asks the drive to flush all buffered data to the permanent storage device (arg is ignored). This is currently implemented on HFS, MS-DOS (FAT), and Universal Disk Format (UDF) file systems. The operation may take quite a while to complete. Certain FireWire drives have also been known to ignore the request to flush their buffered data."

OS X has aggressive file buffering in memory, and it's getting more aggressive all the time. For example, cfprefsd, introduced in 10.8 (https://developer.apple.com/library/mac/releasenotes/DataMan...) made it so that when a system application read a preferences file, it stayed in memory and ignored the disk version, until cfprefsd eventually synced it back to disk. In 10.9, the behavior is much worse to the point that as soon as a pref is in cfprefsd, it's unlikely to leave it until the user logs out / the machine reboots.

In this instance, OS X has, for quite some time, had "defrag on the fly" for files under 20MB in size. On access of the file, it's read into memory and kept there in its entirety until memory pressure from other processes triggers a sync back to disk. When it comes to writing a small file back to disk, OS X will "get around to it" when it's damned well ready unless you force its hand using the fcntl options above.

Unfortunately, the bit about "This is currently implemented on HFS, MS-DOS (FAT), and Universal Disk Format (UDF) file systems" covers pretty much the range of filesystem types that OS X can natively read+write on - but one that might get past this is ExFAT. I'd be surprised if that was the case, but it is natively supported read+write on OS X and would be something quick and easy to test (set up an ExFAT volume for the database) and possibly verify this is the root cause.

(Additionally, third-party read+write access to filesystems like NTFS via Paragon / Tuxera may be able to confirm this as well.)

More reading material (MySQL has been dealing with this since 2005): http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072...
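
As a rough illustration of the F_FULLFSYNC fcntl quoted above, here is a minimal C sketch. The helper names `full_sync` and `write_durably` are hypothetical and are not taken from LevelDB or from the Bitcoin patches; on platforms where `F_FULLFSYNC` is not defined, the sketch simply falls back to `fsync()`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: push everything written to fd as far towards
 * stable storage as the platform allows. Returns 0, or -1 with errno set. */
static int full_sync(int fd)
{
#ifdef F_FULLFSYNC
    /* OS X: like fsync(), plus a request that the drive flush its own cache. */
    if (fcntl(fd, F_FULLFSYNC) == 0)
        return 0;
    /* Not every file system implements it; fall through to fsync(). */
#endif
    return fsync(fd);
}

/* Hypothetical helper: write the whole buffer, then force it out. */
static int write_durably(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything; retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return full_sync(fd);
}
```

A caller would use `write_durably(fd, buf, len)` for each record that has to survive a power loss. Falling back to `fsync()` when the fcntl fails is a choice made for this sketch, on the assumption that a weaker flush is better than none on file systems that do not implement F_FULLFSYNC.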

caf · on Nov 25, 2013

This appears to be benchmark gaming - the POSIX Rationale for fsync(2) says:

The fsync() function is intended to force a physical write of data from the buffer cache, and to assure that after a system crash or other failure that all data up to the time of the fsync() call is recorded on the disk. Since the concepts of "buffer cache", "system crash", "physical write", and "non-volatile storage" are not defined here, the wording has to be more abstract.

rsynnott · on Nov 25, 2013

Unfortunately, it's often not that simple on POSIX systems; it's quite common for disk controllers to disobey fsync, for instance, either in their drivers or in their hardware caches or both. I'm actually surprised Apple is even willing to make the guarantees above about F_FULLFSYNC; I'd read it as, at best, only applying to Apple hardware, as if a third-party controller is doing something silly they can't really do much about that.

antirez · on Nov 25, 2013

In other words, what are the OS X default fsync() semantics useful for? I had the same discussion on Twitter a few days ago...

Someone · on Nov 25, 2013

It is useful for forcing all writes out to the storage device. If your device is battery/UPS backed and has enough capacity to flush its buffers (to disk or to flash memory) after a power loss, that is sufficient to (eventually) get your data on disk (yes, the drive may fail, but if that happens after the data has hit the platter, you have no guarantees, either).

From what I understand, that behaviour is in spec (for me, borderline, at best, but I don't make that spec) according to http://pubs.opengroup.org/onlinepubs/009695399/functions/fsy... ("physical write from the buffer cache", not "physical write to the disk") and, AFAIK, is what others do, too (http://ridiculousfish.com/blog/posts/mystery.html)

Edit: http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072..., referenced from that ridiculous fish post, gives more background info.

antirez · on Nov 25, 2013

OK, that makes sense in special cases indeed; however, it's a really unsafe default...

cypherpunks01 · on Nov 25, 2013

There are two patches linked in the OP that switch to using F_FULLFSYNC on OSX. The OP says that people are still encountering db corruption even on branches with these fixes.

https://github.com/sipa/bitcoin/commit/b28d8b423bddc860c5858... https://github.com/gmaxwell/bitcoin/commit/e7bad10c12ce9b5d4...

pudquick · on Nov 25, 2013

Well, I'm glad someone apparently IS reading :)

But again - I'd point to the work of other longstanding database projects that are available on OS X as a source of "how we ensured data correctness".

pudquick · on Nov 25, 2013

Additional reply here since my other is too old to edit:

Something a tester experiencing corruption at startup may want to try is using the 'purge' command from the Terminal.

While a restart will indeed trigger caching files back to disk before the pending restart of the system, the 'purge' command will simulate a "cold boot"-like empty disk buffer by dropping existing file caches.

https://developer.apple.com/library/mac/documentation/Darwin...

If using the command solves the problem of the database corruption without a restart, then you're definitely suffering from disk cache. Easy to confirm.

Additionally, since this is an error on startup of LevelDB, this is probably related to file reading - especially since on the second startup after a restart, no error is detected. F_FULLFSYNC is for ensuring that a particular file's changes are 'fully written to disk' ...

... but the cache works both ways. A program could also end up reading the disk cache (which sounds like what's happening here), unless you used F_NOCACHE or F_GLOBAL_NOCACHE. Mind you, these don't prevent the accessing of files already in disk cache - they prevent a file from getting into the disk cache in the first place.

http://lists.apple.com/archives/darwin-dev/2009/Oct/msg00165...
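
For readers who have not used F_NOCACHE, a minimal C sketch of how a file could be opened with it follows. The helper name `open_uncached` is hypothetical; on platforms where `F_NOCACHE` is not defined the sketch degrades to a plain `open()`. As noted above, this only keeps new data out of the cache and does not evict what is already there.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only and ask the kernel not to keep
 * the data read through this descriptor in the buffer cache.
 * Returns a file descriptor, or -1 with errno set. */
static int open_uncached(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
#ifdef F_NOCACHE
    /* OS X only: a non-zero argument turns data caching off for this fd. */
    if (fcntl(fd, F_NOCACHE, 1) == -1) {
        int saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
#endif
    /* Without F_NOCACHE this is just an ordinary open(). */
    return fd;
}
```

Reading the database files through a descriptor like this, on a test machine that shows the corruption, would be one way to check whether stale cached data plays any part; that is an assumption about how the experiment might be set up, not something the reports confirm.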

(Disclaimer: I can't think of a situation where the disk cache would get out of sync with the on-disk version of the file when it is accessed by a subsequent launch of a program, if F_FULLFSYNC is used, except in the case of faulty RAM on a machine that flipped bits. Your average file operation done by your average application in OS X isn't performing a checksum of data and generally wouldn't notice a single bit flip. It would be interesting to see which of these machines are using ECC RAM.)
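
To make the point about checksums concrete, here is a small C sketch of how a stored checksum exposes a single flipped bit. It uses plain CRC-32 for brevity; the function names `crc32_bytes` and `record_is_intact` are hypothetical, and the sketch is not meant to describe the exact checksum or record layout LevelDB uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, the variant zlib uses). */
static uint32_t crc32_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    assert(data != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical record check: returns 1 if the payload still matches the
 * checksum that was stored alongside it, 0 otherwise. */
static int record_is_intact(const void *payload, size_t len, uint32_t stored_crc)
{
    return crc32_bytes(payload, len) == stored_crc;
}
```

A writer would compute `crc32_bytes()` over each record and store the result next to it; a reader calling `record_is_intact()` then notices any single-bit change in the payload, whether it came from the disk, the cache, or RAM.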

cryptocoin · on Nov 25, 2013

I'm not sure how you reached that conclusion, as leveldb already received that fix some months ago, see https://code.google.com/p/leveldb/issues/detail?id=197

maaku · on Nov 25, 2013

Bitcoin is using an older version of leveldb (although, as mentioned, this fix is backported in a pull request).

jamesaguilar · on Nov 25, 2013

Looks like a free $10k for you if you're right! Let's see!

gigq · on Nov 25, 2013

I believe that to claim the reward you have to reproduce the issue; there is already a patch out for the F_FULLFSYNC change.

https://code.google.com/p/leveldb/issues/attachmentText?id=1...

jliptzin · on Nov 25, 2013

Stack Overflow should allow you to attach BTC bounties to high-priority questions like the one in this thread.

maaku · on Nov 25, 2013

Monetary incentive actually decreases quality in situations like this. It devalues altruistic contributions (see: Drive: The Surprising Truth About What Motivates Us).

_delirium · on Nov 25, 2013

I've definitely found that for myself, especially for smallish amounts of money. If it's enough money that I can justify it as freelancing, then it's a different category entirely: I'll do a good job in return for being paid. But I'd rather do things I'm interested in without any money involved (writing Wikipedia articles, answering questions I know something about, etc.) than chase $2 bounties. That ends up making it feel not like an interesting hobby, but like a low-paid job, like spending all day on Mechanical Turk.

As a proprietor of such a system you also end up with a whole category of new bad behaviors, as people try to maximize their hourly pay by finding ways to game the system, getting the most payout for the least input.

maaku · on Nov 25, 2013

One thing that does work is turning it into a game: non-monetary reputation points, badges, etc. No surprise, this is what stackoverflow has pioneered (and Wikipedia should take note).

_delirium · on Nov 25, 2013

I actually tend to find those pretty demotivating/bad-behavior-inducing as well. Reputation in the actual sense is one thing (getting a reputation for being a good contributor), but I really dislike karma/points-style "scorekeeping" and up/downvotes. Fortunately on HN nobody takes it seriously, or it could be a problem.

Wikipedia does have miscellaneous scores and badges, of which I find the badges awarded by community members as recognition of a contribution most useful: https://en.wikipedia.org/wiki/Wikipedia%3ABarnstars

There's also just raw counts of contributions, which everyone takes with a large grain of salt:

https://en.wikipedia.org/wiki/Wikipedia:List_of_Wikipedians_...

You can also click "thank" next to specific edits, which just sends the person a notice that someone appreciated their edit. That I think is useful, but I'm not sure how useful it would really be to keep a score of "number of thanks" or whatever. The point is just to say "hey someone noticed you did a good job here and appreciates it" to give some encouragement, not to keep score of who got thanked more.

And finally many people just collect vanity "hey look at what I've done!" lists, which can be a nice way of reflecting on your contributions and feeling good about them. Many people's User Pages are like that, or you could do it externally like e.g. http://www.gwern.net/Wikipedia%20resume

fudged71 · on Nov 25, 2013

Absolutely. I was just at a demo day where a group was making a Stack Overflow clone with monetary incentive and they didn't agree with this point.

twistedpair · on Nov 25, 2013

Even in my day job, I'd happily pay someone $20 who had the experience to clarify some annoying nuance of XYZ framework. Sure, I could keep digging through the API docs for another hour to find it, but they could answer it in 15 seconds. I'm certain I could be similarly helpful to others. Why isn't SO doing this yet?

argonaut · on Nov 25, 2013

The point is that many people who would have otherwise answered your question on SO out of sheer goodwill would not answer questions if paid $20 to answer questions, a motivational phenomenon that has some empirical support in psychology.

darkmighty · on Nov 25, 2013

Psychological altruism bias notwithstanding, I think bounties sure have their place -- perhaps not side by side with altruistic ones, so as not to interfere with them. SO tends to get a lot of attention on interesting questions, and very little otherwise. This leaves an opportunity for low-interest, high-personal-value questions, the dual of SO. There would still be a void for low-interest, low-value questions, but reaching those is asking too much.

chaz · on Nov 25, 2013

Check out http://www.airpair.com/, https://helpouts.google.com, and https://www.liveninja.com/. I haven't used any of them, but I think there is potential in this space, especially outside of the tech space.

clamprecht · on Nov 25, 2013

Maybe. But the bounty has also gotten the bug a lot of attention (case in point, this HN post).

All it takes is one developer who happens to be an expert in this area to see the bug and say, "I know exactly what is causing this!" and go fix it. What's the saying - "with enough eyes, all bugs are shallow" or something.

nathan_f77 · on Nov 25, 2013

If they don't do it, it could make an awesome chrome extension.

throwwit · on Nov 25, 2013

Just checking: is it normal for the bitcoin-qt client to upload a burst of 1GB to an IP on startup? Haven't seen it before.

asperous · on Nov 25, 2013

You might be sharing the block chain, i.e. for a new user.

Moral_ · on Nov 25, 2013

cryptocoin you're hellbanned, no one can see your posts.

locusm · on Nov 25, 2013

There are wallets around that use a shared blockchain too. Otherwise the download can be huge.

cryptocoin · on Nov 25, 2013

You probably meant something else here, because everyone shares the same blockchain*. I guess you meant a SPV client, like Multibit.

locusm · on Dec 1, 2013

Yeah, that's the one.

brymaster · on Nov 25, 2013

cryptocoin, I have 'showdead' on and you're hellbanned. Welcome to HN!

cryptocoin · on Nov 25, 2013

I have no idea what that is but I will be glad to leave HN if that is the case, you're managing to do worse than bitcointalk.

nwh · on Nov 25, 2013

You're no longer hellbanned.

jnbiche · on Nov 25, 2013

We don't want you to leave; we're alerting you to the fact that the system has done this. Most of us commenting disagree with hellbanning.

Anderkent · on Nov 25, 2013

@cryptocoin: being hellbanned means no one can reply to your posts, and very few people see them. This is usually caused by bad behaviour, but if you believe it's a mistake you should request an unban via email.

sillysaurus2 · on Nov 25, 2013

It's also caused by using Tor. (Any comments from new accounts posted via Tor are autokilled.)

pirateking · on Nov 25, 2013

Since the issue states that all reports seem to be from OS X 10.8.X on, I do wonder if it is related to changes that were introduced around 10.8[1][2]. Could FileVault or default Gatekeeper permissions also be involved?

I don't know much about LevelDB or the Bitcoin client, but I am currently taking a closer look at OS X first.

[1] https://developer.apple.com/library/mac/releasenotes/General...

[2] https://developer.apple.com/library/mac/releasenotes/macosx/...

archagon · on Nov 25, 2013

I don't know much about this issue, but just a shot in the dark: Apple is using a new technology called Core Storage that's "layered between the whole-disk partition scheme and the file system used for a specific partition" in order to manage FileVault and their fusion drives. It's also apparently a technology that's in flux, as there are certain incompatibilities in regards to fusion drives (possibly custom created ones?) with the release of Mavericks. John Siracusa mentioned this in a recent podcast, and even hypothesized about how Core Storage might be a stop-gap between HFS+ and a new Apple-designed file system. Could it have something to do with this?

Moral_ · on Nov 25, 2013

Why don't they just go directly to the ioctl instead of through all these libraries?

Whenever I need something written to disk _immediately_ I go straight to the drivers:

  /* BLKFLSBUF is a Linux-specific block-device ioctl from <linux/fs.h>. */
  if (ioctl(fd, BLKFLSBUF, NULL))
      perror("BLKFLSBUF failed");

that should work.