You're not wrong. Whoever downvoted you is pushing an agenda and I'm not happy a...

stickfigure · on Feb 16, 2017

The parent poster (and many others in this thread) are assuming that performance under this deliberately pathological test is reflective of performance in the real world.

Optimistic locking systems inherently perform poorly under contention. But they also perform better than pessimistic concurrency systems overall because in the real world we design applications to avoid contention.
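To make the contention point concrete, here is a minimal in-memory sketch of optimistic concurrency. It is not any real database's API: `VersionedCell`, `try_commit`, and `increment_with_retry` are hypothetical names invented for illustration. A write commits only if the version it read is still current, so two writers that read the same version cannot both succeed.

```python
class VersionedCell:
    """A single value guarded by a version number (hypothetical, for illustration)."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0

    def read(self):
        # Return the value together with the version it was read at.
        return self.value, self.version

    def try_commit(self, new_value, expected_version):
        # Commit only if nobody else has written since our read.
        if self.version != expected_version:
            return False
        self.value = new_value
        self.version += 1
        return True


def increment_with_retry(cell, max_attempts=10):
    """Re-read and retry until the commit succeeds; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        value, version = cell.read()
        if cell.try_commit(value + 1, version):
            return attempt
    raise RuntimeError("too much contention")


# Two writers read the same version, then both try to commit.
cell = VersionedCell()
value_a, version_a = cell.read()
value_b, version_b = cell.read()
a_ok = cell.try_commit(value_a + 1, version_a)  # first writer wins
b_ok = cell.try_commit(value_b + 1, version_b)  # second writer must retry
attempts = increment_with_retry(cell)           # uncontended retry
```

With no contention the retry loop commits on its first attempt, which is the common case the comment above is describing. Under heavy contention on one cell, most attempts are wasted work.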

As an example, the Google App Engine datastore runs zillions of QPS across petabytes of data in a massive distributed cluster. But if you build an app that does nothing but mutate a single piece of state over and over, you'll top out at a couple of transactions per second. This is painful if you're trying to build a simple counter, but with minimal care you can build a system that scales to a dataset and traffic volume of any size.
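One common way to apply that "minimal care" to the counter case is to shard it: spread writes across several independent pieces of state and sum them on read. The sketch below is in-memory only and assumes nothing about the datastore API; `ShardedCounter` and `num_shards` are hypothetical names. In a real datastore each shard would be a separate entity, so concurrent increments rarely touch the same one.

```python
import random


class ShardedCounter:
    """Counter split across independent shards (hypothetical, for illustration)."""

    def __init__(self, num_shards=20):
        # Each list slot stands in for a separately stored shard.
        self.shards = [0] * num_shards

    def increment(self, rng=random):
        # Pick a shard at random so writers spread out instead of colliding.
        index = rng.randrange(len(self.shards))
        self.shards[index] += 1

    def total(self):
        # Reads pay the cost: the total is the sum over all shards.
        return sum(self.shards)


counter = ShardedCounter(num_shards=20)
for _ in range(1000):
    counter.increment()
```

The trade-off is that write throughput grows roughly with the number of shards, while reading the total gets more expensive because it has to touch every shard.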

hinkley · on Feb 16, 2017

So the benchmark is only informative from the standpoint of determining which records should be split to avoid concurrent writes.

For instance, you wouldn't expect a single user to make 5 comments or upvotes per second, so storing data about recent activity with the user isn't a bottleneck that you need to design for. Storing data about responses with an item might also be okay as long as you don't plan to be HN or Reddit (GitHub, for example, would be just fine). But if you want to track activity globally (e.g., managing the watch list notifications in GitHub), you will need to design around that number.

stickfigure · on Feb 16, 2017

Aphyr's test is not a benchmark. It is a test of database correctness under a carefully constructed set of pathological circumstances. It cannot be used to infer real-world performance behavior.

Yes, the general advice for users of the GAE datastore is to build Entity Groups around the data for a single user. That isn't absolute though, and it doesn't cause problems for watch lists; the Watch can be part of the User's EG rather than the Issue's EG. Or it can be its own EG. In practice this doesn't require as much consideration as you probably imagine.

qaq · on Feb 16, 2017

Regardless of locking strategy, that throughput is abysmal. No one would complain about high latency.