I am having hard time to understand Haskell's memory model. I got stuck implementing a simple lock-free hash map and while trying to reproduce the error on a smaller example I realized I have no idea what is going on. I am only targeting x86_64
data TicketLock = TicketLock {cur :: !(IORef Int), next :: !(IORef Int)}
testTicket = do
l <- TicketLock <$> newIORef 0 <*> newIORef 0
mapConcurrently_ (\x -> replicateM_ 500000 (lockunlock l)) [0 .. 3]
print "finished"
lockunlock :: TicketLock -> IO ()
lockunlock (TicketLock {..}) = do
myticket <- atomicModifyIORef' next (\a -> (a + 1, a))
let wait = do
x <- readIORef cur -- atomicModifyIORef' cur (\a -> (a, a)) works correctly
unless (x == myticket) wait
wait
print "lock acquired" -- to observe it gets stuck
atomicModifyIORef' cur (\a -> (a + 1, ())) -- unlock
The test gets stuck after a couple of thousand iterations.
What is going wrong with this code? I kind of expect it to fail because I guess iorefs can be reordered by compiler but i can't figure out what exactly is going on. Is it observable in generated STG code or cmm code?
My real question is why adding barriers does not fix the problem? I tried adding barriers (from atomics-primops
package) after every line of the lockunlock
function and problem still continues.
I also tried using AtomicCounter
from atomic-primops
package, the problem is the same, even though incrCounter is implemented by fetch-add instruction the test somehow gets stuck after couple of thousand iterations, even with barriers all over the code.
data BetterTicketLock = BetterTicketLock {curb :: !(AtomicCounter), nextb :: !(AtomicCounter)}
testTicketb = do
l <- BetterTicketLock <$> newCounter 1 <*> newCounter 0
mapConcurrently_ (\x -> replicateM_ 500000 (lockunlockb l)) [0 .. 3]
print "finished"
lockunlockb :: BetterTicketLock -> IO ()
lockunlockb (BetterTicketLock {..}) = do
myticket <- incrCounter 1 nextb -- it returns new value not old value
let wait = do
x <- incrCounter 0 curb -- I am not even using readCounter which is similar to readIOref
unless (x == myticket) wait
wait
print "lock acquired"
void $ incrCounter 1 curb
Ideally i want a way to reason about haskell programs like in rust using acquire-release semantics how is the following can be converted to Haskell?
struct TicketLock { current: AtomicUsize, next: AtomicUsize}
impl TicketLock {
fn new() -> Self{
TicketLock {current: AtomicUsize::new(0),next: AtomicUsize::new(0)}
}
fn lockunlock(&self) {
let ticket = self.next.fetch_add(1, Ordering::Relaxed);
while self.current.load(Ordering::Acquire) != ticket { }
// critical section
let _ = self.current.fetch_add(1, Ordering::Release);
}
}
Edit: For some reason this works as expected without any performance penalty. Is it because of barriers being ignored in tight loops?
lockunlock (TicketLock {..}) s = do
myticket <- atomicModifyIORef' next (\a -> (a + 1, a))
let wait = do
x <- readIORef cur
unless (x == myticket) $ threadDelay 0 >> wait
wait
modifyIORef' s (+ 1)
atomicModifyIORef' cur (\a -> (a + 1, ())) -- unlock
You're running into a pathological case with the implementation of pre-emptive multitasking in GHC. See the documentation of the Control.Concurrent module:
Your
wait
loop seems to not allocate anything, so it cannot be pre-empted. You can solve this issue by inserting a manualyield
:Alternatively, you can compile with
-fno-omit-yields
.I haven't been able to figure anything more out by trying to debug using the eventlog, because even that seems to be blocked. As K.A. Buhr remarks below he has written a better answer: