ACCORD: The newest consensus protocol in town
By: Sid
At this point, y'all know how I love to yap about the latest tech in town. During one of our paper reading sessions at the office, I wanted to present something along the lines of distributed systems and consensus protocols, and that's how I stumbled on ACCORD, part of CEP-15.
Every time I look at traditional consensus algorithms, I see the same painful pattern: they're either fast but have a single point of failure, or they're resilient but slow.
Today, I'm talking about Accord, a protocol that actually solves this, and honestly? It's kind of a masterpiece in systems design. I took away soooooo much from this paper.
let's begin!
The Problem We've Been Living With
Paxos (1998):
Consensus, but it needs 2 round-trips. Why? Because concurrent proposals from different replicas conflict, and you need another round to resolve them T~T
[fun fact: the original Paxos paper was written as a story about a fictional parliament. it actually solved the consensus problem, but the parliamentary metaphor buried the algorithm, so it was restated in a more readable form in 2001 as "Paxos Made Simple". the original is still considered one of the most confusing reads xD]
Multi-Paxos & Raft (2001, 2014):
they were fans of electing a leader to reach consensus.
that smart move cuts the consensus cycle down to 1 round-trip.
But now:
- Leader failure = service degradation until new leader elected
- Geographically distant clients must talk to a distant leader (latency nightmare lmao)
- Single leader = throughput bottleneck
EPaxos (Egalitarian Paxos):
"What if we make it leaderless again?" Great idea! But under maximum tolerated failures, it forces you into the slow path anyway.(how? ill show you)
Also, it needs identical operation histories across replicas—complex topological sorting required.
Sound familiar? Systems trying to solve one problem by creating another xD
[i suggest y'all read about the other algorithms properly before going ahead, it'll make way more sense ]
Enter Accord: The Protocol That Closes the Gap
Here's the thing about Accord: it's not just incremental. It's the first leaderless consensus protocol that's production-stable. And it does this by co-designing replication AND transactions as a single unified protocol.
Traditional systems do this:
Client → Transaction Layer (2PC) → Replication Layer (Paxos/Raft) → 4 round-trips
[coordination once] [coordination again]
Accord does this:
Client → Accord (handles BOTH transactions + replication) → 1-2 round-trips
[simultaneous coordination]
No redundant coordination.
Innovation 1: Flexible Fast-Path Electorates
Here's where it gets clever.
EPaxos uses fixed quorum rules.
Say you have 5 replicas and can tolerate 2 failures:
- Fast-path quorum: 4 out of 5 nodes, per the fast-path quorum rule:
  |F| = ⌈(|E| + f + 1) / 2⌉, so here |F| = ⌈(5 + 2 + 1) / 2⌉ = ⌈8/2⌉ = 4
- Problem: If 2 nodes actually fail, you only have 3 left. You can't gather 4 votes, so the fast-path is impossible and you're forced onto the slow path. (remember how I told you it forces you onto the slow path? this is it)
2 RTT instead of 1.
Accord's innovation? Dynamic electorates.
Instead of fixed quorum rules, Accord can exclude slow or distant nodes from the voting set without reducing fault tolerance. Same 5 replicas, 2 failures tolerated, but:
- Electorate: 3 fast nodes (the 2 slow or failed ones are excluded)
- Fast-path quorum: 3 out of 3
- Result: Fast-path always succeeds, even under maximum failures (quick sketch below)
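Here's that electorate arithmetic as a quick Python sketch (my own illustrative toy, not code from the Accord codebase; fast_path_quorum is just a name I made up):

    import math

    def fast_path_quorum(electorate_size: int, f: int) -> int:
        # the fast-path quorum rule from above: |F| = ceil((|E| + f + 1) / 2)
        return math.ceil((electorate_size + f + 1) / 2)

    # Fixed rules (EPaxos-style): all 5 replicas vote, tolerating f = 2
    print(fast_path_quorum(5, 2))   # 4 -- needs 4 of 5, unreachable once 2 nodes fail

    # Flexible electorate (Accord): shrink the voting set to the 3 fast nodes
    print(fast_path_quorum(3, 2))   # 3 -- 3 of 3, and all 3 voters are healthy

Same formula both times; shrinking the electorate is what makes the quorum reachable again.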
This is the first protocol to maintain optimal performance (fast-path) under ANY tolerated failures.
EPaxos? Its fast-path only survives f/2 failures, while Accord's survives the full f.
Innovation 2: Timestamp Reorder Buffer
Alright, imagine this scenario:
Transaction A starts in US-East at T0.
Transaction B starts in EU-West at T0 (simultaneously).
Without reordering:
- R1 (US-East) receives A at T5, processes it immediately, sees A first
- R3 (EU-West) receives B at T8, processes it immediately, sees B first
- R1 receives B at T10, but already processed A (sees order: A, B)
- R3 receives A at T12, but already processed B (sees order: B, A)
Different replicas see different orders! Fast-path fails.
Back to slow-path.
Painful T~T
Accord's fix? Intentionally buffer and wait.
Instead of processing immediately, replicas BUFFER incoming transactions and wait until T20 (enough time for all messages to likely arrive). Then they process all buffered messages sorted by timestamp:
T5 R1 receives A → BUFFER
T8 R3 receives B → BUFFER
T10 R1 receives B → BUFFER both
T12 R3 receives A → BUFFER both
T20 ALL replicas sort by timestamp → UNANIMOUS ORDER
Both see the same order. Fast-path succeeds.
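Here's a toy version of that reorder buffer in Python (a sketch under my own assumptions; ReorderBuffer, the 15 ms window, and the tie-breaking by origin are all illustrative, not the paper's actual structures):

    import heapq

    class ReorderBuffer:
        """Toy reorder buffer (my sketch, not the paper's code): hold each
        incoming transaction until its timestamp is old enough that anything
        earlier has almost certainly arrived, then release in timestamp order."""

        def __init__(self, wait):
            self.wait = wait    # network latency bound + clock skew, e.g. 15 ms
            self.heap = []      # min-heap of (timestamp, origin, txn)

        def receive(self, timestamp, origin, txn):
            # origin breaks timestamp ties, like Accord's (time, seq, id) tuples
            heapq.heappush(self.heap, (timestamp, origin, txn))

        def release(self, now):
            ready = []
            while self.heap and self.heap[0][0] <= now - self.wait:
                ready.append(heapq.heappop(self.heap))
            return ready

    # R1 receives A then B; R3 receives B then A -- opposite arrival orders
    r1, r3 = ReorderBuffer(wait=15), ReorderBuffer(wait=15)
    r1.receive(0, "us-east", "A"); r1.receive(0, "eu-west", "B")
    r3.receive(0, "eu-west", "B"); r3.receive(0, "us-east", "A")
    assert r1.release(now=20) == r3.release(now=20)   # unanimous order either way

The point: release order depends only on timestamps, never on arrival order, so every replica that waits long enough emits the exact same sequence.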
there's no way you didn't go "wtf, isn't waiting gonna make it slower??"
so, here's the kicker math:
Cost of waiting: 10-20 milliseconds (typical network latency + clock skew)
Benefit of avoiding slow-path round-trip: 50-100 milliseconds (cross-region RTT)
Net result: you pay 10-20 ms of patience to save 50-100 ms of round-trip, a 3-5x return for waiting!
And no, you don't need atomic clocks or GPS (looking at you, Spanner 👀). Just standard NTP clock sync. Accord's philosophy? Use cheap, standard time sync and just wait a bit.
Still WAY cheaper than an extra network round-trip.
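In code, the trade looks something like this (using the illustrative numbers above, not benchmarks):

    base_rtt = 75      # ms: the one cross-region round-trip you always pay
    buffer_wait = 15   # ms: reorder-buffer delay (latency bound + clock skew)

    fast_path = base_rtt + buffer_wait   # 90 ms: 1 RTT plus a little patience
    slow_path = 2 * base_rtt             # 150 ms: the extra round-trip you avoided
    print(fast_path, slow_path)          # patience wins by ~60 ms per conflict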
How Accord Actually Works: Three Phases
Let me walk through a real scenario:
transferring $100 from Alice's account (Shard A) to Bob's account (Shard B).
Both shards are replicated across 5 replicas.
Phase 1: PreAccept (Propose & Vote)
Step 1: Coordinator proposes
- Picks timestamp t = 120000.000, seq=1, id=Coordinator
- Sends to fast-path electorate (4 nodes): R1, R2, R3, R4
Step 2: Each replica checks for conflicts
- R1: "No conflicts on Alice/Bob accounts" → Accept t, no dependencies
- R2: "No conflicts" → Accept t, no dependencies
- R3: "I saw another transaction on Alice's account at t=120000.005 (higher!)" → Reject, propose t=120000.005, include Transaction 2 as dependency
- R4: "No conflicts" → Accept t, no dependencies
Step 3: Coordinator counts votes
- 3 out of 4 accept t [screw you R3 (•̀⤙•́ )]
- 1 rejects (proposes higher timestamp)
Decision point: Not unanimous? Can't use fast-path. Go to Accept phase (slow-path).
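To make Phase 1 concrete, here's a compressed sketch of the PreAccept round for this scenario (preaccept_reply and the state layout are my own illustrative inventions, not Accord's API):

    def preaccept_reply(replica_state, keys, t):
        """Replica side: accept t if no conflicting transaction on these keys
        carries a higher timestamp; otherwise propose that higher timestamp
        and report the conflicting transactions as dependencies."""
        conflicts = {txid: ts for txid, (ts, ks) in replica_state.items() if ks & keys}
        highest = max(conflicts.values(), default=0.0)
        if highest <= t:
            return t, set()                 # accept t, no dependencies
        return highest, set(conflicts)      # reject: propose higher t plus deps

    t, keys = 120000.000, {"alice", "bob"}
    r3_state = {"TX2": (120000.005, {"alice"})}    # R3 saw TX2 touch Alice's account
    replies = [(t, set()), (t, set()),             # R1, R2: no conflicts
               preaccept_reply(r3_state, keys, t), # R3: (120000.005, {"TX2"})
               (t, set())]                         # R4: no conflicts

    fast_path = all(ts == t for ts, _ in replies)  # False: 3 of 4, not unanimous

Not unanimous, so the coordinator falls through to the Accept phase below.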
Phase 2: Accept (Resolve Conflicts)
Step 1: Coordinator picks final timestamp
- Looks at all responses: max timestamp is 120000.005
- Merge all dependencies: must include Transaction 2
- tfinal = 120000.005
Step 2: Sends Accept to simple majority (3 out of 5)
- R1, R2, R3: "Accept tfinal = 120000.005, dependencies include TX2"
Step 3: All 3 accept → simple majority reached
- Timestamp is now durably decided
- TX2 must execute first
note: This phase only runs if PreAccept didn't get unanimous agreement
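The slow-path merge itself is tiny. Here's what the coordinator computes, as a sketch (again my own illustrative code, re-declaring the PreAccept replies from the sketch above):

    # final timestamp = highest proposal, dependencies = union of all reported deps
    replies = [(120000.000, set()), (120000.000, set()),
               (120000.005, {"TX2"}), (120000.000, set())]

    t_final = max(ts for ts, _ in replies)           # 120000.005
    deps = set().union(*(d for _, d in replies))     # {"TX2"}
    print(t_final, deps)   # this pair is what the Accept round makes durable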
Phase 3: Commit (Broadcast Decision)
Always runs, whether fast-path or slow-path.
Coordinator broadcasts to ALL replicas: "Transaction committed at 120000.005, dependencies resolved."
All replicas record the decision. Execute only after dependencies complete. Update Alice -100, Bob +100.
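On the replica side, the commit-phase bookkeeping boils down to something like this (the txn ids and the deps_of map are illustrative):

    executed = set()
    deps_of = {"TX2": set(), "transfer_100": {"TX2"}}   # txid -> dependencies

    def try_execute(txid):
        if deps_of[txid] <= executed:   # every dependency already applied?
            executed.add(txid)
            return True
        return False

    print(try_execute("transfer_100"))  # False: TX2 hasn't executed yet
    print(try_execute("TX2"))           # True
    print(try_execute("transfer_100"))  # True: now apply Alice -100, Bob +100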

Recovery Protocol: What if the Coordinator Crashes?
This is where Accord's design shines.
When a coordinator goes down mid-transaction:
A new replica volunteers as Recovery Coordinator and contacts a recovery quorum (f+1 replicas). By asking "what do you know about this transaction?", it determines:
- Case A: PreAccept phase succeeded → Commit immediately
- Case B: PreAccept had conflicts → Run Accept phase, then Commit
- Case C: Accept phase done → Send Commit
- Case D: Already committed → Broadcast to remaining replicas
- Case E: Never started → Abort or retry
The recovery coordinator resumes from wherever the protocol left off. No guessing, no rollbacks, no lost transactions.
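Cases A-E collapse into one small decision function. Here's a sketch (the progress labels are my shorthand for what the recovery quorum reports):

    def recover(progress):
        """What the recovery coordinator does after asking a recovery quorum
        (f+1 replicas) what they know about the stalled transaction."""
        if progress == "committed":            # Case D
            return "broadcast commit to remaining replicas"
        if progress == "accepted":             # Case C
            return "send Commit"
        if progress == "preaccept_conflicts":  # Case B
            return "run Accept phase, then Commit"
        if progress == "preaccepted":          # Case A
            return "commit immediately"
        return "abort or retry"                # Case E: never started

    print(recover("accepted"))   # picks up exactly where the protocol stopped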
Real-World Impact: Cassandra Gets ACID Transactions
Here's why this matters for Cassandra users:
Before Accord:
- No cross-partition transactions
- Eventually consistent by default
- Lightweight transactions limited to single partition
- Applications must handle coordination (nightmare)
After Accord (CEP-15, Cassandra 5.0):
- Full ACID transactions across partitions
- Strict serializability (strongest guarantee)
- Geographic distribution without sacrificing consistency
- Database handles coordination automatically
Cassandra's famous for leaderless, linear scalability. Accord lets it keep that and offer ACID transactions, without Spanner's expensive hardware and without Raft/Paxos bottlenecks.
The Technical Achievements
1. Leaderless can be production-stable: previous leaderless attempts (EPaxos, TAPIR, Janus) had critical instability issues. Accord is the first to ship.
2. Matches leader-based performance without the bottleneck: optimal latency, no SPOF, better geographic distribution.
3. Predictable performance under ANY conditions: failures? Fast-path survives. Contention? A guaranteed, bounded slow-path. Transactions always complete.
4. Novel mechanisms:
- Flexible electorates: Dynamic quorum adjustment
- Timestamp reorder buffer: Conflict avoidance through intentional delay
- Co-designed protocol: Unified replication + transactions
It's elegant. It's practical. It uses standard infrastructure (NTP, not atomic clocks). It's shipping in real databases. And it proves that "leaderless" and "production-ready" aren't contradictions.
That's worth talking about ٩(^ᗜ^ )و
TL;DR: Accord is the first production-stable leaderless consensus protocol. It achieves this by:
- Co-designing transactions + replication as one protocol (no redundant coordination)
- Using dynamic electorates to maintain fast-path even under failures
- Buffering transactions by timestamp to ensure consistent ordering across replicas
- Recovering elegantly when coordinators fail
Result: ACID transactions across geographic regions at leaderless scale.
References:
Read the whitepaper here