Hacker News new | past | comments | ask | show | jobs | submit login
A simple distributed algorithm for small idempotent information (antirez.com)
59 points by janerik on Feb 21, 2014 | hide | past | favorite | 14 comments



Idempotence by example: Looking up some customer's name and address in a database are typically idempotent (in fact nullipotent), since this will not cause the database to change. Similarly, changing a customer's address is typically idempotent, because the final address will be the same no matter how many times it is submitted. However, placing an order for a car for the customer is typically not idempotent, since running the method/call several times will lead to several orders being placed. Canceling an order is idempotent, because the order remains canceled no matter how many requests are made.

https://en.wikipedia.org/wiki/Idempotence


While the wiki article seems to do a decent job of explaining mathematics behind idempotency it sheds no light on why it's so important in distributed systems. antirez talks about idempotency in sort of mathematical sense, though I could be wrong as I've just skimmed through the article.

Since failures are a norm in distributed systems one would want clients to retry a request without worrying about double updations (placing the order twice in the example above, debiting the money twice etc.,).

Just elaborate this further. Let's say the "place order" request didn't even reach the server. In this case client can just retry the request. However, if the request did reach the server but the response didn't reach the client then client can't safely retry without creating duplicate order. The problem here for the client is that there's no way for it to distinguish between a lost request vs lost response.

In such cases idempotency is achieved by requiring clients to send a unique identifier with each request. Server can now ensure that for a given request identifier the operation is performed just once.


Knowing the Raft[0][1] protocol, I felt this post was disingenuous going on for its entire length, and only mentioning Raft basically as a footnote, w/ no other references.

I think Raft deserves a bit more prominence and credit here, Antirez.

  [0] (pdf) https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf
  [1] I kept reading thinking to myself repeatedly "This is just Raft".


In the very first part of the post: "The algorithm is mostly borrowed from Raft, however because of the premises it uses only a subset of Raft that is trivial to implement."

Maybe you only scanned the article.


:) my apologies -- I started reading your introduction and then moved on to the meat of the article...


np ;-) after all as an heavy HN reader I understand very well the impossibility to read whole posts...


It looks like the protocol is broken; can anyone confirm?

Suppose a handful of the processes trigger an election simultaneously, each with a `currentEpoch` of `n`, so that each process has voted but no proposal has a majority. Now all processes have a `lastVoteEpoch` of `n`.

According to the description, no process will vote more than once in a single epoch. However, a process will can only propose a new value using its `currentEpoch`. Since each proposal must have a `currentEpoch` of `n`, but no process is still allowed to vote in that epoch, no more state transitions are possible and the frequency cannot change.


Check the "retry" part, basically when there is a split-brain during an election, a new process will try again and eventually will succeed. This is copied from Raft.

It works because when some instance retries, it increments again the currentEpoch, so ti is a new voting process from the point of view of the voters.

Btw it is not needed that to retry is the instance that did it the previous time, because even if the voting failed, the currentEpoch of the process that tried to get elected, will be propagated across the system, so every process will get the new epoch, and can try with an updated epoch to get voted.


Right. For the same reason, the Paxos protocol allows multiple prepare votes from each acceptor. It makes this safe by requiring that each acceptor tell the proposer of the value for the highest ballot number it has accepted, and requires the proposer to switch proposal values when receiving such a message. This is, at first glance, one of the strange edges of the Paxos (synod) protocol, but it makes a lot of sense once you see an example where it is needed.

Raft has somewhat similar constructs, but explained slightly differently.


Hey, actually Raft uses randomized retries, since I copied this part (and most of the rest) from Raft. Voting can result into split brain as this condition is recovered, and a retry time that is big enough compared to RTT guarantees liveness.


It looked similar. Can you highlight the differences from Raft? Is it just that state is small so it can accompany the various voting/heartbeat messages?


Because the state is small and does not depend on the previous computation, the problem becomes trivial, and is accomplished just by broadcasting the full information with an attached unique version. No need for a log, rules to apply or not entries from the log to state machine, and so forth.


Hmm -- if a process bumps its `currentEpoch` for every proposal, and the delays are randomized, it looks like this would fail only probabilistically and may eventually succeed. I'll have to take a closer look later today...


Exactly that. It works only as long as the retry time is randomized and much larger than then network RTT.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: