Neither method works. If the framework records the occurrence of the call before...

jitl · 2024-03-09T06:21:41 1709965301

It’s at-least-once. From https://docs.differential.dev/advanced/compute-recovery/

> If a machine fails to send any heartbeats within an interval (default 90 seconds):

> It is marked as unhealthy, and Differential will not send any new requests to it.

> The functions in progress are marked as failed, and Differential will retry them on a healthy worker.

I would guess that “idempotent” functions in the system also take a lease out on the idempotency key. Perhaps they release the lease on failure since they can observe errors thrown, and they commit the key as consumed after success. ¯\_(ツ)_/¯ the docs are not clear on these semantics!

Salgat · 2024-03-09T06:30:15 1709965815

That's how we handle idempotency. It's basically a mutex on the idempotency key with a timeout (in our case using redis with redlock since it's a distributed system). Once the command finishes, the key is marked as handled and the lock is released. At that point any future or queued requests with the same key immediately return.

CipherThrowaway · 2024-03-09T06:45:09 1709966709

It sounds to me like there are scenarios where calls to, and certainly effects of, "idempotent" functions can take place twice.

I haven't looked at Redlock for a while, but I'm assuming you're all across the historical objections to it from distributed systems researchers: https://martin.kleppmann.com/2016/02/08/how-to-do-distribute...

fweimer · 2024-03-09T12:37:01 1709987821

Is there really consensus that practical systems need to have lock lease timeouts?

jitl · 2024-03-09T20:22:10 1710015730

It’s a choice between at-least-once and at-most-once right? Without a timeout you get at-most-once since the lease holder may die off and never complete the task, and so the task of a completed 0 times.

I’m building a system right now and we’re going to use lease/heartbeats etc etc to elect leaders and whatnot, all this discussion seems very practical to me.

Salgat · 2024-03-09T20:23:25 1710015805

Exactly. It comes down to whether the consequences of these race conditions are acceptable or not. Sometimes (known and understood) race conditions are okay for performance reasons. Shoot, that's the whole philosophy behind eventual consistency. Sometimes they aren't, such as bank transactions. It's all situational.

CipherThrowaway · 2024-03-10T06:34:16 1710052456

Eventual consistency isn't a philosophy, and it's not the idea that race conditions are acceptable. It's a consistency model with specific properties and guarantees.

These races we're talking about don't produce eventual consistency, but inconsistency. Different systems require different levels of correctness, but if you're not familiar with the underlying theory then your trade-offs are going to be uninformed decisions rather than informed ones.

Salgat · 2024-03-11T16:47:51 1710175671

A race condition is the lack of determinism in a sequence of events that can lead to undesirable effects. Eventual consistency is an approach (and consistency model, design philosophy, whatever you want to call it) that accepts this lack of determinism as a compromise because the possible sequences are accounted for, since the system can still act on stale data until an update is fully pushed out (which is still undesirable, but an accepted compromise).

lunarcave · 2024-03-09T18:13:33 1710008013

Ok, I think we could do a better job of explaining the semantics.

The system does not implement idempotency by default. If something that's not marked as "idempotent" fails, it does try again in a different machine.

If the function is marked as idempotent [1] then the system ensure at-most-once semantics by only issuing the call once and only once to a worker who can process the call.

[1] https://docs.differential.dev/advanced/idempotency/