Jepsen: jetcd 0.8.2

marksomnian · 2024-08-08T14:50:35 1723128635

Interesting footnote:

> In the 2022 engagement, the client’s engineers were enthusiastic about the prospect of a public analysis, and Jepsen was allowed to file public issues against systems including etcd. Following the conclusion of the contract, Jepsen independently completed a written report discussing the behaviors we’d found in etcd. However, Jepsen was unable to secure official permission from the client’s legal department to disclose that the client had funded part of the work. This created an unusual state of affairs: the issues, test suite, and reproduction instructions were all public, but per Jepsen’s ethics policy, the analysis itself could not be published. Jepsen shelved that analysis and it remains unpublished. The present analysis is based on entirely new work and verifies a different software system: jetcd, rather than etcd.

mdaniel · 2024-08-08T16:35:46 1723134946

> No one followed up on the jetcd issue, and it was automatically closed as stale.

Another excellent outcome from those GH automations

protosam · 2024-08-08T16:31:18 1723134678

Makes me wonder if the Go v3 client has the same problem. If yes, that would be a major problem for all the Kubernetes systems in production.

mdaniel · 2024-08-08T16:42:18 1723135338

At the very real risk of "talk is cheap," my understanding is that this is part of why Jepsen publishes the test suites (e.g. https://github.com/jepsen-io/etcd ): so it's not "take my word for it" but rather "lein run test-all" and watch the fireworks. So a sufficiently motivated actor, say, one of the deep-pocketed stewards of the Kubernetes project, could run the tests themselves.

Between my indescribable hatred for etcd and my long-held lust for a pluggable KV backend (https://github.com/kubernetes/kubernetes/issues/1957 et al), it'd be awesome if any provable KV safety violations were finally the evidence required for them to take that request seriously.

protosam · 2024-08-08T16:51:32 1723135892

Having looked at the test suite already, I know enough to know that I don't understand it well enough to be the one to do this. For that reason, I'm personally going to pull out the popcorn and see what happens over the next few weeks.

jhgg · 2024-08-08T17:57:11 1723139831

I'm currently working on a Rust v3 client, and have been reading the Go v3 source code, and the code definitely is hard to follow so I would be unsurprised if there were issues lurking.

tjungblut · 2024-08-08T18:40:30 1723142430

Could you be more specific about what's so hard to follow? It's quite literally just the implementation of the gRPC interface [1].

[1] https://github.com/etcd-io/etcd/blob/main/client/v3/kv.go#L3...

silverlyra · 2024-08-08T22:16:27 1723155387

I was curious and dug into the Go client code. You linked to the definition of KV – the easiest way to create one is with NewKV [1], which internally creates a RetryKV [2] wrapper around the Client you give it.

RetryKV implements the KV methods by delegating to the underlying client. But before it delegates an immutable request (e.g., range), it sets the request retry policy to repeatable [3].

Retries are implemented with a gRPC interceptor, which checks the retry policy when deciding whether a request should be retried [4].

The Jepsen writeup says a client can retry a request when “the client can prove the first request could never execute, or that the request is idempotent”. In my (cold) read of the code, the Go client stays within those bounds.

For non-idempotent requests, the Go client only retries when it knows the request was never sent in the first place [5]. For idempotent requests, any response with gRPC status unavailable will be retried [6].

Unlike jetcd, the Go client’s retry behavior is safe.

[1] https://github.com/etcd-io/etcd/blob/main/client/v3/kv.go#L9...

[2] https://github.com/etcd-io/etcd/blob/main/client/v3/retry.go...

[3] https://github.com/etcd-io/etcd/blob/main/client/v3/retry_in...

[4] https://github.com/etcd-io/etcd/blob/main/client/v3/retry_in...

[5] https://github.com/etcd-io/etcd/blob/main/client/v3/retry.go...

[6] https://github.com/etcd-io/etcd/blob/main/client/v3/retry.go...
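The retry rule described in this comment can be sketched roughly as follows. This is a minimal Python illustration of the rule as summarized above, not the Go client's code: `RetryPolicy`, `Failure`, `request_sent`, and `should_retry` are hypothetical names invented for the sketch, and the real client's checks may be more detailed.

```python
from dataclasses import dataclass
from enum import Enum


class RetryPolicy(Enum):
    # Hypothetical stand-ins for the two policies described in the comment.
    NON_REPEATABLE = "non_repeatable"  # mutable requests (put, delete, txn)
    REPEATABLE = "repeatable"          # immutable requests (e.g. range)


@dataclass(frozen=True)
class Failure:
    status: str         # gRPC-style status name, e.g. "UNAVAILABLE"
    request_sent: bool  # assumption: the client can tell whether the request left it


def should_retry(policy: RetryPolicy, failure: Failure) -> bool:
    """Decide whether a failed request may be retried safely."""
    if policy is RetryPolicy.REPEATABLE:
        # Idempotent request: executing it twice is harmless,
        # so any "unavailable" failure can be retried.
        return failure.status == "UNAVAILABLE"
    # Non-idempotent request: retry only when the first attempt
    # provably never reached the server.
    return not failure.request_sent


# A range that failed mid-flight can be retried; a put that may have
# reached the server cannot.
print(should_retry(RetryPolicy.REPEATABLE, Failure("UNAVAILABLE", True)))
print(should_retry(RetryPolicy.NON_REPEATABLE, Failure("UNAVAILABLE", True)))
```

The point of splitting on policy first is that the safe default is "do not retry": a request only becomes retryable when it is marked idempotent or is known never to have been sent.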

protosam · 2024-08-08T23:33:39 1723160019

Just dropping a comment to express my gratitude for sharing a breakdown of your interpretation. :)

jhgg · 2024-08-08T23:56:24 1723161384

Look at the code for the watcher client[1] and lease management[2].

[1]: https://github.com/etcd-io/etcd/blob/main/client/v3/watch.go

[2]: https://github.com/etcd-io/etcd/blob/main/client/v3/lease.go

tjungblut · 2024-08-09T08:49:44 1723193384

The watch is a simple processing loop that receives and sends on a bi-directional gRPC channel. Leases have a similar loop for keep-alive messages; everything else is quite literally delegated.

I get that it's difficult to translate this 1:1 into Rust without channels and select primitives, but saying it's complex is wild. Try the server-side code for leases ;)

protosam · 2024-08-08T15:00:57 1723129257

Your posts are something I have in my bookmarks and reference regularly as I continue to build my own distributed data system. Thanks for continuing to test and report on these issues. These posts have clarified a lot of details about the consistency guarantees of these systems that I really couldn’t discern from their own documentation. The knowledge is invaluable given how developers tend to just trust that the systems they consume are correct.

mjb · 2024-08-08T16:39:51 1723135191

The first bug is a great reminder that even strict serializability doesn't imply idempotency. If you're doing non-idempotent operations like unconditional writes, you've got to think very carefully before you add any retries to a system. Even with conditional writes, you need to think carefully about ABA bugs.
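A minimal sketch of how a blind retry of an unconditional write can go wrong, assuming a toy in-memory store rather than etcd or jetcd. `ToyStore` and the client labels are hypothetical; the lost response is simulated simply by having client A send the same write a second time.

```python
class ToyStore:
    """Hypothetical single-node key-value store that applies writes in arrival order."""

    def __init__(self):
        self.data = {}
        self.applied = []  # log of (client, key, value) in the order applied

    def put(self, client, key, value):
        self.data[key] = value
        self.applied.append((client, key, value))


store = ToyStore()

# Client A writes k=1. The server applies it, but the response is lost,
# so A's client library cannot tell whether the write happened.
store.put("A", "k", 1)

# Client B writes k=2 and receives an acknowledgement.
store.put("B", "k", 2)

# A's client library blindly retries the "failed" write.
store.put("A", "k", 1)

# B's acknowledged write has been silently overwritten by a write that,
# from A's point of view, was issued before B's.
print(store.data["k"])  # 1
print(store.applied)
```

Each individual write is applied correctly by the store; the anomaly comes entirely from the client sending one logical operation twice.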

Both of these bugs are a great reminder that distributed system behavior includes clients. From the application's perspective, bugs like this being introduced by the client aren't practically any different from bugs introduced by the server: the same badness happens. A database needs to consider its properties end-to-end from the application API.

It's also a great reminder that APIs that make it hard for clients to do the right thing will likely lead to bugs like this. Failures happen, and a good API needs to be designed in a way that allows the client to do something sensible following a failure. A great API makes it easy for a client to do something sensible, and hard for a client to do the wrong thing. Perhaps my favorite non-distributed example of an API that makes the wrong thing easy is AES-GCM, the ubiquitous AEAD crypto primitive: one tiny bug (reusing an IV) completely blows up the whole scheme.

And, as always, this is great stuff from Kyle. His Jepsen work has been moving the industry forward for years, and it's great to see him continue it (and continue to put the effort into writing up his findings so clearly).

aphyr · 2024-08-08T16:50:07 1723135807

> strict serializability doesn't imply idempotency

I think we're probably getting at the same thing, but I do want to clarify a bit. A Strict Serializable history, like a Serializable one, requires equivalence to a total order of transactions. That's clearly not true for etcd+jetcd: no possible order of transactions can allow (e.g.) a transaction to read from its own future. It's totally fine to submit non-idempotent transactions against a Serializable system: systems which actually provide Serializable will execute known-committed transactions exactly once.

Plenty of other databases pass this test; etcd+jetcd does not. This system is simply not Serializable.

mjb · 2024-08-08T16:55:37 1723136137

Maybe what I should have said is "you can't just retry transactions against a strict serializable database and expect to still get strict serializability (from the application's perspective)". This is true of distributed system APIs more generally, too.

aphyr · 2024-08-08T17:05:14 1723136714

Yeah, that's a good way of phrasing it! :-)

mikemitchelldev · 2024-08-08T20:11:49 1723147909

What are the minimum resources you'd need to run similar types of tests at the scale that Jepsen does?

aphyr · 2024-08-08T20:15:35 1723148135

For this test, you can do it on pretty much any reasonable Linux machine. Longer histories can churn through more CPU and RAM--some of the more aggressive tests I ran for this work involved 20 GB heaps and 50 cores--but you can tune all that lower.