Hook into GHC runtime system

I have been looking at how transactional memory is implemented in Haskell, and I am not sure I understand how the STM operations exposed to the programmer hook into the runtime system functions written in C. In ghc/libraries/base/GHC/Conc/Sync.hs of the git repo, I see the following definitions:

-- |A monad supporting atomic memory transactions.
newtype STM a = STM (State# RealWorld -> (# State# RealWorld, a #))
            deriving Typeable

-- |Shared memory locations that support atomic memory transactions.
data TVar a = TVar (TVar# RealWorld a)
          deriving Typeable

-- |Create a new TVar holding a value supplied
newTVar :: a -> STM (TVar a)
newTVar val = STM $ \s1# ->
    case newTVar# val s1# of
         (# s2#, tvar# #) -> (# s2#, TVar tvar# #)

Then in ghc/rts/PrimOps.cmm, I see the following C-- definition:

stg_newTVarzh (P_ init){
  W_ tv;

  ALLOC_PRIM_P (SIZEOF_StgTVar, stg_newTVarzh, init);

  tv = Hp - SIZEOF_StgTVar + WDS(1);
  SET_HDR (tv, stg_TVAR_DIRTY_info, CCCS);

  StgTVar_current_value(tv) = init;
  StgTVar_first_watch_queue_entry(tv) = stg_END_STM_WATCH_QUEUE_closure;
  StgTVar_num_updates(tv) = 0;

  return (tv);
}

My questions:

  1. What do the first and last # mean in (# s2#, TVar tvar# #)? I've read before that putting a # after a variable is just a naming convention indicating that something is unboxed, but what does # mean when it appears by itself?
  2. How do we get from newTVar# to stg_newTVarzh? It seems like I am missing another definition between these two. Does the compiler rewrite newTVar# into a call to the C-- function listed?
  3. What are P_ and W_ in the C-- code?

The only other occurrence of newTVar# I have been able to find is in ghc/compiler/prelude/primops.txt.pp:

primop  NewTVarOp "newTVar#" GenPrimOp
   a
    -> State# s -> (# State# s, TVar# s a #)
 {Create a new {\tt TVar\#} holding a specified initial value.}
with
 out_of_line  = True
 has_side_effects = True

According to https://ghc.haskell.org/trac/ghc/wiki/Commentary/PrimOps, this is how primitives are defined so that the compiler knows about them.

Answer by chi (best answer):

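(# s2#, TVar tvar# #) is an unboxed tuple. The standalone (# and #) are its opening and closing brackets (syntax enabled by the UnboxedTuples extension), just as ( and ) delimit an ordinary tuple. The # suffixes on s2# and tvar# are the naming convention you mention.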
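
To make the syntax concrete, here is a minimal sketch of a function that returns an unboxed tuple and a wrapper that pattern-matches on it, in the same style as newTVar above. The names quotRemUnboxed and quotRemBoxed are hypothetical and exist only for this illustration; quotInt#, remInt#, Int# and I# come from GHC.Exts.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts (Int (I#), Int#, quotInt#, remInt#)

-- Hypothetical helper: returns two results in an unboxed tuple.
-- The (# ... #) brackets build the tuple; no tuple cell is allocated
-- on the heap for it.
quotRemUnboxed :: Int# -> Int# -> (# Int#, Int# #)
quotRemUnboxed n d = (# quotInt# n d, remInt# n d #)

-- Boxed wrapper: unpack the Int arguments, call the unboxed version,
-- and match on the unboxed tuple with a case expression.
quotRemBoxed :: Int -> Int -> (Int, Int)
quotRemBoxed (I# n) (I# d) =
  case quotRemUnboxed n d of
    (# q, r #) -> (I# q, I# r)

main :: IO ()
main = print (quotRemBoxed 17 5)  -- prints (3,2)
```

As in newTVar, the unboxed tuple is consumed immediately by a case expression; it cannot be stored in an ordinary polymorphic data structure.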

The name stg_newTVarzh is built from:

  • The stg_ prefix, which is used throughout the GHC runtime and stands for the Spineless Tagless G-machine, an abstract machine for evaluating functional languages;

  • newTVar which is the first part of newTVar#;

  • the final zh, which is the so-called z-encoding of #: this encoding produces a plain name that the linker and the ABI accept on all platforms, by replacing special characters such as the hash (#).
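
The name construction described above can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions, not GHC's own encoder: zEncodeChar, zEncode and rtsSymbol are hypothetical names, and only a handful of z-encoding escapes are covered (z, Z, #, ., _ and '); any other non-alphanumeric character is rejected.

```haskell
module Main where

import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Hypothetical, partial z-encoder: covers only a few escapes.
zEncodeChar :: Char -> String
zEncodeChar 'z'  = "zz"  -- z itself must be escaped
zEncodeChar 'Z'  = "ZZ"  -- so must Z
zEncodeChar '#'  = "zh"  -- hash
zEncodeChar '.'  = "zi"  -- dot
zEncodeChar '_'  = "zu"  -- underscore
zEncodeChar '\'' = "zq"  -- single quote
zEncodeChar c
  | isAsciiLower c || isAsciiUpper c || isDigit c = [c]
  | otherwise = error ("escape not covered in this sketch: " ++ [c])

zEncode :: String -> String
zEncode = concatMap zEncodeChar

-- Hypothetical helper: the stg_ prefix is prepended as-is,
-- only the primop name is encoded.
rtsSymbol :: String -> String
rtsSymbol primop = "stg_" ++ zEncode primop

main :: IO ()
main = putStrLn (rtsSymbol "newTVar#")  -- prints stg_newTVarzh
```

Escaping z and Z themselves keeps the encoding unambiguous, so a symbol name can be decoded back to the original identifier.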