While exploring the internals of log replication in Raft, I found a simple yet very effective optimization for when an AppendEntries call gets rejected.
A rejection happens when the follower has no entry at the previous log index (prevLogIndex) whose term matches the previous log term (prevLogTerm) sent by the leader.
In this case, the leader decrements nextIndex and retries, and keeps going back until the term and log index match.
This is simple and, most importantly, it works in practice. However, there are a couple of small details that make Raft so powerful:
1/ Fast backtracking:
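Naively, the leader goes back one entry at a time. If a follower is 1000 entries behind, the leader may need around 1000 round-trips. This is not optimal.
Instead, the follower sends back ConflictTerm and ConflictIndex with the rejection: the term of its mismatching entry and the first index it holds for that term (or, if its log is simply too short, the index just past its last entry). With these, the leader can skip the entire conflicting term in one jump. O(N) round-trips becomes O(# of terms).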
This is a big part of why a lagging follower can catch up in a handful of round-trips instead of one per entry.
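Here is a minimal sketch of the backtracking idea in Python. It assumes logs are modeled as plain lists of terms with 1-based indices, and the names `Rejection`, `follower_check`, `leader_next_index` and `sync` are made up for illustration; a real implementation would carry these fields inside the AppendEntries reply.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rejection:
    conflict_term: Optional[int]  # term of the follower's mismatching entry, None if its log is too short
    conflict_index: int           # first index the follower holds for that term, or len(log) + 1


def follower_check(log_terms: List[int], prev_index: int, prev_term: int) -> Optional[Rejection]:
    """Consistency check on the follower. log_terms[i] is the term at 1-based index i + 1.
    Returns None when the logs agree at prev_index."""
    if prev_index == 0:
        return None  # nothing precedes the first entry, so there is nothing to mismatch
    if prev_index > len(log_terms):
        return Rejection(None, len(log_terms) + 1)  # log too short: point at where it ends
    term = log_terms[prev_index - 1]
    if term == prev_term:
        return None
    first = prev_index
    while first > 1 and log_terms[first - 2] == term:
        first -= 1  # walk back to the first entry of the conflicting term
    return Rejection(term, first)


def leader_next_index(leader_terms: List[int], next_index: int, rej: Rejection) -> int:
    """Pick the next nextIndex to try, skipping a whole term at once."""
    if rej.conflict_term is not None:
        # If the leader also has entries from conflict_term, resume just past its last one.
        for i in range(next_index - 1, 0, -1):
            if leader_terms[i - 1] == rej.conflict_term:
                return i + 1
    return rej.conflict_index


def sync(leader_terms: List[int], follower_terms: List[int]) -> Tuple[int, int]:
    """Return (final nextIndex, number of round-trips) until the check passes."""
    next_index = len(leader_terms) + 1
    trips = 0
    while True:
        trips += 1
        prev = next_index - 1
        prev_term = leader_terms[prev - 1] if prev > 0 else 0
        rej = follower_check(follower_terms, prev, prev_term)
        if rej is None:
            return next_index, trips
        next_index = leader_next_index(leader_terms, next_index, rej)


leader = [1, 1, 1, 4, 4, 5, 5, 6, 6, 6]
follower = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3]  # diverged in terms 2 and 3
print(sync(leader, follower))  # (4, 3)
```

In this toy run the follower diverges across two terms, so the leader lands on the right nextIndex in 3 round-trips, where stepping back one entry at a time would take 8.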
2/ Commit Rule:
Take the following case:
> Term 1: node1 is leader and appends an entry at index 3 to its own log, then crashes before replicating it.
> Term 2: node3 becomes leader with node2's vote (neither of them has index 3). node3 appends a different entry at index 3 to its own log only, then crashes.
> Term 3: node1 restarts and becomes leader again with node2's vote, and replicates its term 1 entry at index 3 to node2. Index 3 is now on node1 and node2 - that's a majority. Can node1 commit? NO.
If node1 commits the term 1 entry at index 3, then crashes again, node3 could become leader and overwrite it - because node3's last entry is from term 2, its log counts as more up-to-date than node2's, so node2 would vote for it.
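To fix this, a leader never commits an entry from a previous term just by counting replicas. It only commits old entries as a side effect of committing a current-term entry.
Once a current-term entry is committed, all prior entries are implicitly committed, because any node that could win a later election must already hold that entry and everything before it.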
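The rule can be sketched as a leader-side helper. This is an illustration under simplifying assumptions, not code from any particular Raft library: the log is a list of terms with 1-based indices, `match_index` maps each follower to the highest index known to be replicated on it, and the function name `advance_commit_index` is hypothetical.

```python
from typing import Dict, List


def advance_commit_index(
    log_terms: List[int],
    match_index: Dict[str, int],
    current_term: int,
    commit_index: int,
    cluster_size: int,
) -> int:
    """Return the leader's new commit index.
    log_terms[i] is the term of the leader's entry at 1-based index i + 1."""
    majority = cluster_size // 2 + 1
    # Scan from the newest entry down to just above the current commit index.
    for n in range(len(log_terms), commit_index, -1):
        if log_terms[n - 1] != current_term:
            continue  # old-term entry: never commit it by counting replicas
        replicas = 1 + sum(1 for m in match_index.values() if m >= n)  # 1 = the leader itself
        if replicas >= majority:
            return n  # committing n implicitly commits everything before it
    return commit_index


# Leader in term 3; index 3 holds a term 1 entry that is on node1 and node2.
print(advance_commit_index([1, 1, 1], {"node2": 3, "node3": 2}, 3, 2, 3))     # 2
# After a term 3 entry at index 4 reaches node2, indices 3 and 4 commit together.
print(advance_commit_index([1, 1, 1, 3], {"node2": 4, "node3": 2}, 3, 2, 3))  # 4
```

The first call leaves the commit index at 2 even though index 3 sits on a majority; only the current-term entry at index 4 moves it forward.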
These details are tiny but they make Raft powerful and a popular choice for consensus.