
You're welcome to peruse the mailing list thread.

http://lists.basho.com/pipermail/riak-users_lists.basho.com/...




That thread shows that all of the particulars of your claims about Riak are actually false. Further, it seems you didn't bother to understand how Riak can solve your problem, and so you decided that it cannot.


Verbatim, from the mailing list:

"If large-scale mapreduce (more than a few hundred thousand keys) is important, or listing keys is critical, you might consider HBase."

"Riak can also collapse in horrible ways when asked to list huge numbers of keys. Some people say it just gets slow on their large installations. We've actually seen it hang the cluster altogether. Try it and find out!"


I chose the most polite way to point out his error, and now you are compounding it by attempting to rebut me with quotes that, if you know what you're doing, don't actually rebut me. Listing all keys is a function meant for debugging, not for running in production. If you're running M/R jobs based on that, then you don't know what you're doing. The person you're quoting, in fact, said they were doing MR jobs over billions of keys. Further, the person who made that recommendation doesn't work for Basho, and saying he should consider HBase is not the same as saying that Riak can't do it.
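
To make the keylisting point concrete, here is a minimal sketch (not from the thread) of the difference between the two kinds of MapReduce inputs, assuming Riak's HTTP /mapred interface on the default port; the "tweets" bucket and keys are hypothetical. Feeding a whole bucket as input forces Riak to list every key in that bucket before the map phase runs, while feeding explicit bucket/key pairs touches only the named objects.

  import json
  import requests

  RIAK = "http://127.0.0.1:8098"  # assumes a default local Riak HTTP port

  # Full-bucket input: Riak must list every key in the bucket before the
  # map phase runs -- this is the expensive keylisting operation at issue.
  full_bucket_job = {
      "inputs": "tweets",
      "query": [
          {"map": {"language": "javascript",
                   "source": "function(v) { return [v.key]; }"}}
      ],
  }

  # Explicit bucket/key pairs: no keylisting; the map phase only touches
  # the objects you name (e.g. keys you already track in an index).
  explicit_keys_job = {
      "inputs": [["tweets", "key1"], ["tweets", "key2"]],
      "query": [
          {"map": {"language": "javascript",
                   "source": "function(v) { return [v.key]; }"}}
      ],
  }

  resp = requests.post(RIAK + "/mapred",
                       headers={"Content-Type": "application/json"},
                       data=json.dumps(explicit_keys_job))
  print(resp.json())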

If you want to say I'm wrong, make a specific argument. Don't selectively quote things out of context that don't actually rebut my position; that is profoundly dishonest. It is a way of pretending to rebut someone without saying anything yourself, so you can't be pinned down on any statement. It is disingenuous.

I'm really tired of having to rebut these argument-from-ignorance "rebuttals" here on HN.


"The person you're quoting, in fact, said they were doing MR jobs over billions of keys"

I Ctrl-F'd "billions" and found one match in the post I was quoting. There is no other reference to very large MR jobs in the quoted post.

"At Showyou, we're also building a custom backend called Mecha which integrates Riak and SOLR, specifically for this kind of analytics over billions of keys. We haven't packaged it for open-source release yet"

So the OP is supposed to use an unreleased experimental custom backend to do his big mapreduce jobs?


I am the person being quoted. You are correct that keylisting is not suitable for production use. We definitely don't do MR jobs over billions of keys: our huge data queries are powered by Mecha, which uses Solr.
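
For readers unfamiliar with the approach being described, the general shape of "query Solr for matching keys, then fetch those objects from Riak" looks roughly like the sketch below. This is not Mecha's API (which hasn't been released); the Solr core name, bucket name, field names, and the assumption that objects are stored as JSON are all made up for illustration.

  import requests

  SOLR = "http://127.0.0.1:8983/solr/tweets"   # hypothetical Solr core
  RIAK = "http://127.0.0.1:8098/riak/tweets"   # hypothetical Riak bucket (legacy URL scheme)

  # Ask Solr for the ids of matching documents instead of listing every Riak key.
  params = {"q": "user:example AND created_at:[NOW-7DAY TO NOW]",
            "fl": "id", "rows": 100, "wt": "json"}
  docs = requests.get(SOLR + "/select", params=params).json()["response"]["docs"]
  ids = [d["id"] for d in docs]

  # Fetch only those objects from Riak by key (assumes JSON-encoded values).
  objects = [requests.get(RIAK + "/" + i).json() for i in ids]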



