When I first saw @TigerBeetleDB's Simulator and how every possible failure is enumerated and subsequently tested for, I was instantly hooked
Model checking the actual system 🏴‍☠️
github.com/tigerbeetle/tiger…
Jordan’s rules for building distributed systems:
#1 - don’t
#2 - if you have to anyways, at least don’t write the algorithms yourself
#3 - if you have to anyways, at least don’t do it without formal verification
#4 - if you have to anyways, go back to #1
It seems sharing it was worthwhile! I think SDN-based approaches to consensus are really promising, and with some time and effort we could see similar techniques brought to real world systems.
I wish I had the opportunity to work on consensus algorithms more often. I miss it 😌
I was cleaning out my GitHub repos the other day and came across this old gem I’d totally forgotten about. When I was at the Open Networking Foundation, I did some work researching low latency consensus using SDN-enabled clock synchronization protocols.
github.com/kuujo/just-in-tim…
The JIT Paxos protocol itself is largely derived from Viewstamped Replication (leader, views, etc). Requests are sent by the client to all replicas, and consensus is achieved in a single round trip as long as messages arrive in wall clock order.
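A minimal sketch of the "arrives in wall clock order" condition, from a single replica's point of view. This is not taken from the JIT Paxos repo: the `Replica` class, `on_client_request`, and the rule of comparing against the last accepted timestamp are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Hypothetical replica state: the newest client timestamp accepted so far
    # and the ordered list of (timestamp, command) pairs.
    last_timestamp: float = float("-inf")
    log: list = field(default_factory=list)

    def on_client_request(self, timestamp: float, command: str) -> bool:
        """Return True if the request can take the single-round-trip path."""
        if timestamp > self.last_timestamp:
            # Arrived in wall clock order: append directly.
            self.last_timestamp = timestamp
            self.log.append((timestamp, command))
            return True
        # Arrived late (out of order): assumed to need a slower,
        # leader-driven reconciliation path, which is not shown here.
        return False
```

With this sketch, requests stamped 1.0 and 2.0 are accepted in turn, and a request stamped 1.5 that shows up afterwards is rejected from the fast path.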
I’d certainly expect its performance to degrade significantly under high load, at least (although hopefully no more than a traditional consensus algorithm). It’s clearly not ready for the real world. But I thought it would be interesting to share nonetheless… for posterity.
Yes. If a new leader is elected for any reason, that leader can overwrite the entry with an entry from its term. An entry is not guaranteed to be retained until a leader commits it (commit index >= entry index). 1/
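Roughly how that overwrite looks on a follower, as a simplified sketch of the AppendEntries receiver rules from the Raft paper. The `Entry` and `Follower` names and the example log contents are made up for illustration; real implementations also check the leader's term, persist state, and so on.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    term: int
    value: int


@dataclass
class Follower:
    log: list = field(default_factory=list)  # Raft index i lives at log[i - 1]
    commit_index: int = 0

    def append_entries(self, prev_index, prev_term, entries, leader_commit):
        # Reject if we have no entry matching (prev_index, prev_term).
        if prev_index > 0:
            if len(self.log) < prev_index or self.log[prev_index - 1].term != prev_term:
                return False
        for offset, entry in enumerate(entries):
            index = prev_index + 1 + offset
            if len(self.log) >= index:
                if self.log[index - 1].term == entry.term:
                    continue  # already stored
                # Conflict: only uncommitted entries may be dropped.
                assert index > self.commit_index
                del self.log[index - 1:]
            self.log.append(entry)
        if leader_commit > self.commit_index:
            self.commit_index = min(leader_commit, prev_index + len(entries))
        return True


# Hypothetical follower holding an uncommitted entry (value 9) from term 3.
follower = Follower(log=[Entry(1, 7), Entry(2, 8), Entry(3, 9)], commit_index=2)
# A new leader in term 4 sends its own entry for index 3.
accepted = follower.append_entries(prev_index=2, prev_term=2,
                                   entries=[Entry(4, 10)], leader_commit=2)
```

After the call, index 3 holds the term 4 entry and value 9 is gone, which is allowed because index 3 was above the commit index.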
Raft Consensus Challenge II ⛓️
We have a Raft cluster with 3 nodes, each maintaining a replica of the log. Node 2 is the leader of term 3, accepted client request 9, & added 9 to its log.
Can 9 be lost? Why?
x.com/DominikTornow/status/1…
Prior to committing an entry from its term, entries in a leader’s term can be overwritten even if they’re stored on a majority of replicas. See figure 8 in the original Raft paper for a deep dive into how and why that can happen:
raft.github.io/raft.pdf
3/
Of course, in this case the leader has committed an entry from its term, but entry 9 has not been replicated to a majority of nodes, so it can be lost regardless.
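The commit rule behind both points, sketched in a few lines. The function name and the example numbers are hypothetical; the rule itself (majority replication plus an entry from the leader's current term) is the one described in section 5.4.2 of the Raft paper.

```python
def advance_commit_index(log_terms, match_index, current_term, commit_index):
    """Return the new commit index for a leader.

    log_terms[i - 1] is the term of the leader's entry at Raft index i.
    match_index holds, per node (leader included), the highest index
    known to be stored on that node.
    """
    majority = len(match_index) // 2 + 1
    # Scan from the newest entry down to just above the current commit index.
    for n in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index if m >= n)
        # An entry is committed by counting replicas only if it is on a
        # majority AND it was created in the leader's current term.
        if replicated >= majority and log_terms[n - 1] == current_term:
            return n
    return commit_index


# Leader in term 3: entry 5 is only on the leader, entry 4 is on two nodes.
newest_only_on_leader = advance_commit_index(
    log_terms=[1, 2, 2, 3, 3], match_index=[5, 4, 3],
    current_term=3, commit_index=3)

# Leader in term 4: entry 2 (term 2) is on a majority but still not committed.
old_term_on_majority = advance_commit_index(
    log_terms=[1, 2, 4], match_index=[3, 2, 1],
    current_term=4, commit_index=1)
```

In the first call the commit index moves to 4 and entry 5 stays uncommitted, so it can still be lost. In the second call the commit index stays at 1 even though entry 2 is on a majority, which is the Figure 8 situation.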
I think the answer I’m expected to give is node 2… but I’m going to “well, actually” this and point out it could be node 1 OR 2. IIRC there’s nothing in the Raft protocol that says a leader must append to its log before sending AppendEntries RPCs to followers.
1/
Raft Consensus Challenge ⛓️
We have a Raft cluster with 3 nodes, each maintaining a replica of the log.
Which node is the leader for term 3?! What gives it away?!
#ThinkingInDistributedSystems #ConsensusChallenge
But of course, this will only be the case if the leader has committed entries in its term. Until the leader commits an entry from its term (to make a majority of the logs consistent at the start of the term) the commit index will have been set by a prior leader. 4/4
It’s also worth noting that while an entry can be appended to followers before the leader, the leader can still only commit that entry (increment the commit index) after it’s stored in its log. A leader can’t just replicate an entry to a quorum of followers and tell them to commit.