
It seems like accord uses neither bounded synchronized clocks nor a shared timestamp oracle. How does it provide strict serializability then?



The loosely synchronized clocks are not required for correctness (i.e. to enforce strict serializability), only for performance, i.e. for achieving 1 WAN round-trip execution for conflicting transactions whose execution overlaps (all other transactions execute in 1 WAN round-trip anyway).

The system maintains a logical clock, and the logical clock enforces the strict serializability condition globally. The logical clock is just cheaper to maintain if the coordinator proposes a value that is already correctly ordered by the loosely synchronized clocks.
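A minimal sketch of that interplay, assuming a simple hybrid-logical-clock style design (the names and structure here are illustrative, not Accord's actual API):

```python
import time

class LogicalClock:
    """Toy hybrid logical clock. Proposals derived from loosely
    synchronized wall clocks are adopted as-is when they already
    advance the clock (the cheap, common case); otherwise the clock
    bumps past the highest timestamp witnessed, so correctness never
    depends on the wall clocks being in sync."""

    def __init__(self):
        self.latest = 0

    def propose(self):
        # A coordinator proposes its (loosely synchronized) wall clock.
        return int(time.time() * 1_000_000)

    def witness(self, proposed):
        # Strict ordering is enforced logically, whatever was proposed.
        self.latest = max(self.latest + 1, proposed)
        return self.latest
```

When the clocks are well synchronized, `witness` usually just adopts the proposal; when they are not, the logical component preserves the ordering at the cost of drifting ahead of real time.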

edit: sorry, I misread your post as interpreting the loosely synchronized clocks as being necessary for correctness. I answer (very loosely) how strict serializability is maintained in a sibling comment, but in brief: a dependency set of all conflicting transactions that might potentially execute earlier is maintained, and a transaction executes in logical timestamp order wrt these dependencies (and, hence, also their dependencies). If a transaction and its dependencies never interact with another transaction or its dependencies, then their execution order is commutative, i.e. any ordering is indistinguishable from any other.


I’m confused by this as well, it seems to only provide “strict serializability” for overlapping transactions, which other databases like Cockroach also provide.


Accord provides global strict serializability. Loosely speaking, strict serializability requires two things:

1) That the result of an operation is reflected in the database before a response is given to the client

2) That every operation that starts after another operation's response was given to a client has the effect of being executed after it.

Accord enforces both of these properties. Accord only avoids enforcing an ordering (2) on transactions that are commutative, i.e. where it is impossible to distinguish one order of execution from another. This requires analysis of a transaction and its entire graph of dependencies, i.e. its conflicts, their conflicts, etc. So, if your most recent transaction operates over the entire database, it is ordered with every transaction that has ever gone before.
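The "entire graph of dependencies" point can be sketched as a reachability check over conflicts (purely illustrative; `must_order` and the key-set representation are made up for this example):

```python
def conflicts(tx_a, tx_b):
    # Two transactions conflict directly if they touch a common key.
    return bool(tx_a["keys"] & tx_b["keys"])

def must_order(tx, other, all_txns):
    """tx must be ordered wrt other if any chain of conflicting
    transactions connects them; otherwise they are commutative and
    either execution order is indistinguishable."""
    seen, frontier = set(), [tx]
    while frontier:
        cur = frontier.pop()
        if id(cur) in seen:
            continue
        seen.add(id(cur))
        if conflicts(cur, other):
            return True
        frontier.extend(t for t in all_txns if conflicts(cur, t))
    return False
```

A transaction over the entire database conflicts with everything, so under this check it is ordered with respect to every transaction that has ever gone before.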

Cockroach does not do this. If it did, it would be globally strict serializable.


This might just be a language issue, because different sources use different words for consistency guarantees.

> Accord only avoids enforcing an ordering (2) on transactions that are commutative

If the committed state in the database records the transactions in a different order than an external observer could have observed them being accepted by the database, then you don't have what Spanner calls "external consistency" - I thought this is what you mean when you say "global strict serializability", but now I'm not so sure.

Cockroach does enforce this property, but only for transactions that touch the same (key-value layer) keys. They talk about this a bit in their article on "life without atomic clocks"[0].

In SpiceDB[1], we take advantage of this property (when Cockroach is selected as the backing datastore) to give us external consistency by forcing all transactions to overlap. This prevents the New Enemy Problem[2], which is what Google calls the problem of transaction ordering when seen from an authorization perspective.

Your description of Accord sounded very similar to how Cockroach works from this perspective, but there may be some subtlety that I am missing. Cockroach exposes snapshots via "as of system time" queries, which can expose "bad" ordering back to the client in future queries - if Accord doesn't allow snapshot queries then I suppose it wouldn't be an issue.

[0]: https://cockroachlabs.com/blog/living-without-atomic-clocks/...

[1]: https://github.com/authzed/spicedb

[2]: https://authzed.com/blog/prevent-newenemy-cockroachdb/


> If the committed state in the database records the transactions in a different order than an external observer could have observed them

So, property (1) provides this for non-overlapping commutative transactions. If w1 writes c1, then finishes; w2 writes c2, then finishes; then if r1, r2, r3, etc. don't overlap w1 or w2 either, they must correctly witness their execution order and be "externally consistent".

If transaction executions overlap, we must work harder to ensure we see each others' strict serializable order, and so we simply avoid ordering ourselves with transactions that cannot affect our outcome. If these other transactions cannot affect our outcome, then we can't witness anything externally about them, by definition.
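That argument can be restated as a small checker (hypothetical, just to pin down the real-time clause): if one transaction finishes before another starts, their logical timestamps must agree with that real-time order.

```python
def real_time_order_ok(history):
    """history: list of (name, start, finish, logical_ts) tuples.
    Returns True iff every pair that is ordered in real time
    (a finished before b started) is also ordered logically."""
    for _, _, a_fin, a_ts in history:
        for _, b_start, _, b_ts in history:
            if a_fin < b_start and not a_ts < b_ts:
                return False
    return True
```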


That makes more sense, thank you for the explanation.

Are there any plans for accord to permit snapshot queries?


Yes, amongst other things. It's certainly doable, and I expect they will arrive at some point, but I cannot say with any confidence when that might be.


I’m not 100% sure, but my reading of the CockroachDB Jepsen test: https://jepsen.io/analyses/cockroachdb-beta-20160829 indicates that it does meet those two requirements, but that it still is not globally strictly serializable due to the presence of an anomaly they call “causal reversal” where: “transactions on disjoint records are visible out of order.”

If you read their Jepsen report and blog posts carefully, the anomaly they suffer from is not caused by a read transaction that started after previous write transactions had completed, but by a read transaction running concurrently with two other write transactions which both commit while the read is still running, and the read sees the writes in the wrong order. The definition you’ve provided above does not cover this case (I think), but it's still a violation of strict serializability, which makes me think your definition is too loose.

Correct me if I’m wrong though, at this level things get quite confusing.


This stuff is very subtle, but you have to consider all three transactions when applying the guarantees.

If w1 and w2 overlap their execution, then any serializable execution is strict serializable; so the interesting case is where w1 finished before w2 began. In that case (2) applies, and w1 must have the effect of executing before w2, to all observers, including r.

If w1 and w2 are commutative then it doesn’t matter in what order they execute as they’re equivalent. If not, then at some point their dependency graphs intersect and will be/have been ordered wrt each other by the same restrictions.

edit: in the example given in the Jepsen report, with Accord r would have to execute either before w1; after w1 and before w2; or after them both. This is because one of the following must occur:

1) r takes a dependency on both w1 and w2 and executes after them both;

2) r takes a dependency on only w1; w2 takes a dependency on r, so w2 executes after r, and r executes after w1

3) w1 and w2 both take a dependency on r, so that r executes before either
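A toy version of that case analysis (illustrative only): because w1 finished before w2 began, w1's logical timestamp is fixed below w2's, and r witnesses exactly the conflicting writes with lower timestamps than its own, so "w2 but not w1" can never appear.

```python
def visible_writes(r_ts, w1_ts, w2_ts):
    # Real-time order (property 2) fixes w1's timestamp below w2's.
    assert w1_ts < w2_ts
    # r executes in logical-timestamp order wrt its dependencies,
    # so it witnesses exactly the writes with lower timestamps.
    return {name for name, ts in (("w1", w1_ts), ("w2", w2_ts)) if ts < r_ts}
```

The only reachable results are `set()`, `{"w1"}`, and `{"w1", "w2"}`, matching cases 3, 2, and 1 respectively.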


Isn’t that ignoring the point that the read is running concurrently with w1 and w2? Your comment about “have the effect” only applies to transactions beginning after a write has completed, but the anomaly cockroach suffers from is for a read running concurrently with two writes with no overlapping execution. In that scenario they fail to be strictly serializable, but so would any system meeting only the two criteria you outlined I think.

Edit: Ah ok I see you edited it. Ok that makes sense if it works.


yeah, to clarify textually as well: the "has the effect of" applies to all witnesses, including those that are concurrent. So if r executes after w1 or w2 then r must see that w1 applies before w2, since w1 "has the effect of" applying before w2, whether or not r is concurrent with them.


Why is it not possible for r to take a dependency on w2 and no dependency on w1?

w2 may hit different nodes than w1 in a sharded database?

Or does the leaderless Paxos approach in Accord operate over all nodes in the database, regardless of the shards the keys fall into? (I guess that is the only explanation.) But then the scalability of these global transactions would be limited, no?

Thank you for your detailed explanations.


So this is where it can start to get a little complicated, but without going into the nitty gritty, taking the example precisely as given in the Jepsen report:

If r arrives at C1 before w1, w1 must witness it and take a dependency on its execution. If it arrives at C2 after w2, then r takes a dependency on w2’s execution. In this case, w1 will not execute until r does, and r will not execute until w2 does, and so it is simply not the case that w1 happened first - w1’s commit will not have been acknowledged to the client yet, and the actual order of execution will end up w2,r,w1, which is entirely correct to all observers since w1 and w2 will now overlap their execution.

> operate over all nodes in the database regardless of the shards the keys fall into

Nope, it’s all entirely independent for each shard. The (implied) dependency graph is what ties shards together, so only when they interact.


Thank you for the explanation.

I get it now. It all comes from separating the operation into 1. agreeing on the WAL, and 2. executing only after all the WAL agreements have stabilized.

There is a separation between the two. This can introduce waits (which could in theory be unbounded for EPaxos, but Tempo and Accord bound them).

Maybe in some sense this is in the same spirit as Calvin's serializing the WAL first and executing it later. But Accord avoids a single entity doing the serialization, achieving it via a leaderless Paxos approach that uses dependency graphs for WAL serialization, and later uses another protocol for WAL execution after settlement.
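The agree-then-execute split might be sketched as follows (toy model assuming consensus has already fixed each transaction's timestamp and dependency set):

```python
def execution_order(agreed):
    """agreed: dict mapping tx -> (timestamp, set of dependencies).
    Once timestamps and dependencies are fixed, execution is
    deterministic: apply each transaction only after all of its
    dependencies, in timestamp order, with no global log needed."""
    order, applied = [], set()
    pending = sorted(agreed, key=lambda t: agreed[t][0])
    while pending:
        for tx in pending:
            if agreed[tx][1] <= applied:  # all dependencies applied?
                order.append(tx)
                applied.add(tx)
                pending.remove(tx)
                break
        else:
            raise RuntimeError("cycle (impossible with fixed timestamps)")
    return order
```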


> Maybe in some sense this is in the same spirit of Calvin's serializing WAL first and executing it later

Yes, it is very similar in effect, with the absence of a global log conferring some important differences with pluses and minuses. I expect to offer interactive transactions eventually, for instance, which may or may not be difficult for Calvin, and faults on a “log shard” are isolated with Accord, but have global impact with Calvin. But snapshot queries are much easier with Calvin.


How do you break dependency cycles across shards (i.e. deadlock detection)?


There aren’t any dependency cycles, as dependencies must take a strictly lower logical timestamp. Dependency resolution is essentially two-step: first, every transaction that MIGHT take a lower timestamp is adopted as a dependency; then, once those transactions have fixed their timestamps, those that took a higher timestamp are simply removed from the dependency set.
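The two-step resolution might look like this (illustrative structures, not Accord's):

```python
def adopt_dependencies(conflicting_in_flight):
    # Step 1: any conflicting transaction whose timestamp is not yet
    # fixed MIGHT execute earlier, so adopt all of them conservatively.
    return set(conflicting_in_flight)

def prune_dependencies(my_final_ts, deps, final_ts):
    # Step 2: once each dependency's timestamp is fixed, drop those
    # that ended up higher. Every survivor is strictly lower than
    # my_final_ts, so the dependency graph cannot contain a cycle.
    return {d for d in deps if final_ts[d] < my_final_ts}
```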

Once we offer interactive transactions we’ll have to detect cycles, but that should be relatively straightforward since execution dependencies are managed explicitly.



