Why Strong Consistency?

by SchwKatze on 11/26/2025, 10:00 PM with 93 comments

by Kinrany on 11/27/2025, 8:03 PM

I continue to be surprised that in these discussions correctness is treated as some optional, highest possible level of quality rather than as the only reasonable state.

Suppose we're talking about multiplayer game networking, where the central store receives torrents of UDP packets and it is assumed that like half of them will never arrive. It doesn't make sense to view this as "we don't care about the player's actual position". We do. The system just has tolerances for how often the updates must be communicated successfully. Lost packets do not make the system incorrect.

by kukkeliskuu on 11/27/2025, 6:54 PM

I think we should stop calling these systems eventually consistent. They are actually never consistent. If the system is complex enough and there are always incoming changes, there is never a point in time at which these "eventually consistent" systems are in a consistent state. The problem of inconsistency is pushed onto the users of the data.

by mrkeen on 11/27/2025, 9:01 PM

It's wishful thinking. It's like choosing Newtonian physics over relativity because it's simpler or the equations are neater.

If you have strong consistency, then you have at best availability xor partition tolerance.

"Eventual" consistency is the best tradeoff we have for an AP system.

Computation happens at a time and a place. Your frontend is not the same computer as your backend service, or your database, or your cloud providers, or your partners.

So you can insist on full ACID on your DB (which it probably isn't running, btw -- search "READ COMMITTED"), but your DB will only be consistent with itself.

We always talk about multiple bank accounts in these consistency modelling exercises. Do yourself a favour and start thinking about multiple banks.

by wowamit on 11/28/2025, 1:46 AM

Eventual consistency arises from necessity -- a need to prioritise AP more. Not every application needs strong consistency as a primary constraint. Why would you optimise for that, at the cost of availability, when eventual consistency is an acceptable default?

by lmm on 11/28/2025, 3:39 AM

This isn't a reason to have strong consistency and pay the costs, it's a reason to not do read-modify-write. Indeed I'd argue it's actually a reason to prefer eventually consistent systems, as they will nudge you away from adopting this misguided architecture before your system becomes too big to migrate.

Adopt an event streaming/sourcing architecture and all these problems go away, and you are forced to have a sensible dataflow rather than the deadlocky nonsense that strongly-consistent systems nudge you towards.
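
The pattern lmm is describing can be sketched in a few lines (the event names and account example here are illustrative, not from any particular framework): state is never updated in place; immutable events are appended to a log, and the current state is a fold over that log.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deposited:
    account: str
    amount: int

@dataclass(frozen=True)
class Withdrew:
    account: str
    amount: int

def apply_event(balances, event):
    """Pure function: fold one immutable event into the balances dict."""
    out = dict(balances)
    delta = event.amount if isinstance(event, Deposited) else -event.amount
    out[event.account] = out.get(event.account, 0) + delta
    return out

def replay(log):
    """Current state is just a left fold over the append-only log."""
    state = {}
    for event in log:
        state = apply_event(state, event)
    return state

log = [Deposited("alice", 100), Withdrew("alice", 30), Deposited("bob", 50)]
print(replay(log))  # {'alice': 70, 'bob': 50}
```

Because writers only ever append and readers only ever fold, there is no read-modify-write on shared mutable state to contend over.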

by econ on 11/28/2025, 3:24 AM

I've never had the luxury of multiple DBs, but speaking of race conditions in general..

Could you put an almost empty db in front that only records recent changes? Deletes become rows, updates require posting all values of the row. If no record is found forward the query to the read db. If modifications are posted forward the query to the write db.

If correctness is merely nice to have, I always use a "Pain" value that influences the sleep duration. It rarely gets very busy instantaneously; activity usually changes gradually.

by asah on 11/28/2025, 11:49 AM

True serializability doesn't model the real world. IRL, humans observe something, then make decisions and take action, without holding "locks" on the thing they observed. Everything from the stock market to the sitcom industry depends on this behavior.

Other models exist and are more popular than serializability; for practicality, PostgreSQL defaults to MVCC with read consistency, not serializability.

by Tractor8626 on 11/27/2025, 9:52 PM

> read-modify-write is the canonical transactional workload. That applies to explicit transactions (anything that does an UPDATE or SELECT followed by a write in a transaction), but also things that do implicit transactions (like the example above)

Your "implicit transaction" would not be consistent even if there was no replication involved at all. Explicit db transactions exist for a reason - use them.

by nullorempty on 11/27/2025, 8:27 PM

I keep wondering how the recent 15h outage has affected these eventually consistent systems.

I really hope to see a paper on the effects of it.

by jiggawatts on 11/27/2025, 10:08 PM

Blogs like this make me go on the same rant for the n-th time:

Consistency for distributed systems is impossible without APIs returning cookies containing vector clocks.

The idea is simple: every database has a logical sequence number (LSN), which the replicas try to catch up to -- but may be a little bit behind. Every time an API talks to a set of databases (or their replicas) to produce a JSON response (or whatever), it ought to return the LSNs of each database that produced the query in a cookie. Something like "db1:591284;db2:10697438".

Client software must then union this with their existing cookie, and return the result of that to the next API call.

That way if they've just inserted some value into db1 and the read-after-write query ends up going to a read replica that's slightly behind the write master (LSN 591280 instead of 591284) then the replica can either wait until it sees LSN >= 591284, or it can proxy the query back to the write master. A simple "expected latency of waiting vs proxying" heuristic can be used for this decision.

That's (almost entirely) all you need for read-after-write transactional consistency at every layer, even through Redis caches and stateless APIs layers!
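
The client-side half of jiggawatts's scheme is just a per-database max over LSNs. A minimal sketch (the cookie format follows the example above; function names are illustrative):

```python
def parse_cookie(cookie):
    """'db1:591284;db2:10697438' -> {'db1': 591284, 'db2': 10697438}"""
    if not cookie:
        return {}
    return {name: int(lsn) for name, lsn in
            (part.split(":") for part in cookie.split(";"))}

def merge_cookies(old, new):
    """Client-side union: keep the highest LSN seen per database."""
    merged = dict(old)
    for db, lsn in new.items():
        merged[db] = max(merged.get(db, 0), lsn)
    return merged

def format_cookie(lsns):
    return ";".join(f"{db}:{lsn}" for db, lsn in sorted(lsns.items()))

def replica_can_serve(replica_lsns, required_lsns):
    """A replica may answer only if it has caught up to every LSN the
    client has already observed; otherwise wait or proxy to the master."""
    return all(replica_lsns.get(db, 0) >= lsn
               for db, lsn in required_lsns.items())

seen = parse_cookie("db1:591280;db2:10697438")
after_write = parse_cookie("db1:591284")
required = merge_cookies(seen, after_write)
print(format_cookie(required))             # db1:591284;db2:10697438
print(replica_can_serve({"db1": 591280}, required))  # False: wait or proxy
```

The "wait vs. proxy" latency heuristic the comment mentions would sit behind `replica_can_serve` returning `False`.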

by jeffbee on 11/27/2025, 10:21 PM

The argument seems to rely on the point that the replicas are only valuable if you can send reads to them, which I don't think is true. Eventually-consistent replicated databases are valuable on their own terms even if you can only send traffic to the leader.

by arkt8 on 11/28/2025, 3:37 AM

It can sometimes be just an architectural issue...

The first point: you can run the critical query against the RW instance.

The other point is that most of the time, especially on the web, where the amount of concurrent access may be critical, the data doesn't need to be time-critical.

With the advent of reactivity in apps and on the web, things became overcomplex.

Yes, strong consistency will always be an issue, and mitigation should start in the architecture. More often than not, the problem arises from architectural overcomplication. Every case is different.

by generalzod on 11/27/2025, 7:50 PM

in the read-after-write scenario, why not use something like consistency tokens, and redirect to the primary if the secondary detects it has not caught up?

by sgarland on 11/27/2025, 7:30 PM

For the love of all that’s holy, please stop doing read-after-write. In nearly all cases, it isn’t needed. The only cases I can think of are if you need a DB-generated value (so, DATETIME or UUIDv1) from MySQL, or you did a multi-row INSERT in a concurrent environment.

For MySQL, you can get the first auto-incrementing integer created from your INSERT from the cursor. If you only inserted one row, congratulations, there’s your PK. If you inserted multiple rows, you could also get the number of rows inserted and add that to get the range, but there’s no guarantee that it wasn’t interleaved with other statements. Anything else you wrote, you should already have, because you wrote it.

For MariaDB, SQLite, and Postgres, you can just use the RETURNING clause and get back the entire row with your INSERT, or specific columns.
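
Both points can be shown with Python's bundled sqlite3 as a stand-in (a sketch, not MySQL itself): the cursor hands back the auto-incremented PK, so no read-after-write SELECT is needed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

cur = con.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
# The DB-generated PK comes back on the cursor; anything else you
# wrote, you already have, because you wrote it.
print(cur.lastrowid)  # 1
```

On MariaDB, Postgres, or SQLite >= 3.35 you can instead append `RETURNING id` (or `RETURNING *`) to the INSERT and fetch the generated values from the same statement.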

by Animats on 11/27/2025, 7:03 PM

So why isn't the section that needs consistency enclosed in a transaction, with all operations between BEGIN TRANSACTION and COMMIT TRANSACTION? That's the standard way to get strong consistency in SQL. It's fully supported in MySQL, at least for InnoDB. You have to talk to the master, not a read slave, when updating, but that's normal.
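
A minimal sketch of that standard pattern, using Python's bundled sqlite3 (the same shape applies to MySQL/InnoDB; the `transfer` helper is illustrative): the read-modify-write sits inside one transaction, so either both balance changes land or neither does.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INT)")
con.executemany("INSERT INTO accounts VALUES (?, ?)",
                [("alice", 100), ("bob", 0)])
con.commit()

def transfer(con, src, dst, amount):
    """Read-modify-write enclosed in a single transaction."""
    try:
        with con:  # BEGIN ... COMMIT, or ROLLBACK if an exception escapes
            (balance,) = con.execute(
                "SELECT balance FROM accounts WHERE name = ?",
                (src,)).fetchone()
            if balance < amount:
                raise ValueError("insufficient funds")
            con.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            con.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
    except ValueError:
        pass  # transaction rolled back, both balances untouched

transfer(con, "alice", "bob", 60)
transfer(con, "alice", "bob", 60)  # only 40 left: rolled back entirely
print(dict(con.execute("SELECT name, balance FROM accounts")))
# {'alice': 40, 'bob': 60}
```

As the comment says, the writes must go to the master; what the blog is worried about is reads that subsequently land on a lagging replica.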

by rakoo on 11/27/2025, 9:47 PM

I don't understand this article, and it's like the author doesn't really know what they're talking about. They don't want eventual consistency; they want read-your-writes, a consistency level that's stronger than EC yet still not strong.

https://jepsen.io/consistency/models/read-your-writes

Read-your-writes is indeed useful because it makes code easier to write: every process can behave as if it were the only one in the world, and devs can write synchronous code. That's great! But you don't need strong consistency.

I hope developers learn a little bit more about the domain before going to strong consistency.