Ditch your mutex, you deserve better

Comments

By sebstefan 2025-11-189:412 reply

>STM is an optimistic concurrency system. This means that threads never block waiting for locks. Instead, each concurrent operation proceeds, possibly in parallel, on their own independent transaction log. Each transaction tracks which pieces of data it has accessed or mutated and if at commit time it is detected that some other transaction has been committed and altered data which this transaction also accessed, then the latter transaction is rolled back and is simply retried.

I already foresaw (and it gets mentioned later), the problem that if you have many small, frequent operations, they will prevent a big, long operation from happening because they will always change the state and cause conflicts before the big one can finish. You can easily code yourself an app that will softlock forever.

The post doesn't offer a good solution (it talks about one where there's a tradeoff, but you don't need tradeoffs for this)

The way this gets fixed it is to make the lock acquisition (or in this case, the priority of the merge?) preemptive (Wound-Wait)

All transactions have a global, ever incrementing number attached to them, their ID, which we call "seniority", if you try to get a a lock, and the lock is held by a transaction with a lower seniority (=> a higher ID), you kill the transaction, take the lock, and once you're done the transaction you killed is allowed to retry

In the meantime, if a transaction with the lower seniority tries to get the lock, it gets blocked

This insures that your program will always finish.

In the case of "lots of frequent small transactions" + "one big, long report", the report will get killed a few times, until your report inevitably becomes the most senior transaction to ask for this resource, and is allowed to complete.

By nlitened 2025-11-1814:124 reply

I think STM looks pretty cool in toy examples, but in practice it's pretty bad. Very difficult for me to make strong logical argument about this, just based on how it feels.

In Clojure, there are first-class STM primitives with retriable transactions in the standard lib, but every time I tried to use them, the code was slow, difficult to reason about, full of "colored" functions, and non-composable.

Also, unintuitively, the code starts immediately smelling slightly "wrong", as if I tried to put into code my first childish thoughts instead of actually thinking about how the system _should_ work. Like the notorious bank accounts example: yeah, cool, two accounts' balances are TVars — but what if I have 10M accounts? Is the list of accounts also a TVar since sometimes new accounts are created? Are indexes on the list also TVars? How do you persist the lists, how do you synchronize slow persistence and in-memory transactions? How do you synchronize all the deposits with a backup/failover machine? As you continue trying to untangle all this with STM, you start drowning in complexity, and encountering huge contention among transactions touching many parts of your systems, and spewing countless retry logs.

It's not only my own experience — I believe it's widely accepted in Clojure community that nobody actually uses STM, and instead uses simple atomic updates or queues.

By juliangmp 2025-11-188:346 reply

I found that a lot of the problems I had been having with mutexes, stem from the fact that traditionally the mutex and the data it protects are separate. Bolting them together, like Rust's Mutex<T> does, solves a lot these problems. It let's you write normal, synchronous code and leave the locking up to the caller, but without making it a nightmare. You can't even access the data without locking the mutex.

This isn't an attack on the (very well written) article though. Just wanted to add my two cents.

Hacker News

Ditch your mutex, you deserve better

Show article

commandersaki

Comments

By sebstefan 2025-11-189:412 reply

By nlitened 2025-11-1814:124 reply

By juliangmp 2025-11-188:346 reply

HackerNews