What differences are there in the handling of Threads vs Fibers in Boehm GC?
Win32 CreateFiber only takes a desired stack size and allocates it without giving the user access to the stack pointer (as far as I can tell). Does Boehm GC recognize the created stacks as roots automatically? If not, how can we make the GC aware of the stacks? Do we use GetCurrentThreadStackLimits?
First, Boehm GC needs to scan the stack of every thread (or fiber) that works with pointers to GC-allocated objects or calls GC functions. For regular threads, there are two ways to get a thread registered: either create it with GC_CreateThread, or call GC_register_my_thread from the thread itself. Once a thread is registered, the garbage collector handles it automatically.
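A minimal sketch of the second option (registering an already running thread), assuming a bdwgc build with thread support. The helper names gc_attach_current_thread and gc_detach_current_thread and the USE_BDWGC macro are made up for illustration; the #else branch holds stand-ins that only let the snippet compile without the library.

```c
#include <assert.h>

#ifdef USE_BDWGC               /* hypothetical macro: build against real bdwgc */
# define GC_THREADS            /* must precede gc.h to get the thread API */
# include <gc.h>
#else
/* Stand-ins (not part of bdwgc) so the sketch compiles on its own. */
enum { GC_SUCCESS = 0, GC_DUPLICATE = 1 };
struct GC_stack_base { void *mem_base; };
static int registered;
static int GC_get_stack_base(struct GC_stack_base *sb) {
    sb->mem_base = (void *)sb;
    return GC_SUCCESS;
}
static int GC_register_my_thread(const struct GC_stack_base *sb) {
    (void)sb;
    return registered++ ? GC_DUPLICATE : GC_SUCCESS;
}
static int GC_unregister_my_thread(void) {
    registered = 0;
    return GC_SUCCESS;
}
#endif

/* Register the calling thread with the collector.
   Returns 1 if this call registered it, 0 if the collector already
   knew the thread, -1 on failure.
   With real bdwgc, GC_allow_register_threads() has to be called once
   beforehand from an already registered thread (e.g. the main one). */
static int gc_attach_current_thread(void) {
    struct GC_stack_base sb;
    int rc;

    if (GC_get_stack_base(&sb) != GC_SUCCESS)
        return -1;
    rc = GC_register_my_thread(&sb);
    if (rc == GC_SUCCESS)
        return 1;
    return rc == GC_DUPLICATE ? 0 : -1;
}

/* Undo a registration made by gc_attach_current_thread(); call it on the
   same thread, before that thread exits. */
static void gc_detach_current_thread(int attached) {
    if (attached == 1) {
        int rc = GC_unregister_my_thread();
        assert(rc == GC_SUCCESS);
        (void)rc;
    }
}
```

A thread would call gc_attach_current_thread() before touching any GC-allocated memory and pass the result to gc_detach_current_thread() just before it exits. Threads created through GC_CreateThread need neither call.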
A coroutine (fiber), however, currently cannot be registered (support could be added by intercepting CreateFiber, DeleteFiber and SwitchToFiber). For now, the only way to make the GC aware of fibers is to update the stack bottom of the current (regular) thread manually, i.e. acquire the GC lock and call GC_set_my_stackbottom when switching to another fiber. See https://github.com/ivmai/bdwgc/issues/274 for the low-level details.
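A rough sketch of that manual update, assuming a bdwgc version that provides GC_get_my_stackbottom and GC_set_my_stackbottom. The fiber_info record, gc_note_fiber_switch and the USE_BDWGC macro are hypothetical names, and the #else branch holds stand-ins that only let the snippet compile without the library. How each fiber's stack bottom is obtained, and exactly where the SwitchToFiber call sits relative to the lock, are not shown here; the linked issue covers that.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_BDWGC               /* hypothetical macro: build against real bdwgc */
# define GC_THREADS
# include <gc.h>
#else
/* Stand-ins (not part of bdwgc) so the sketch compiles on its own;
   they only record the value and take no lock. */
# define GC_CALLBACK
struct GC_stack_base { void *mem_base; };
typedef void *(GC_CALLBACK *GC_fn_type)(void *);
static struct GC_stack_base fake_bottom;
static void *GC_get_my_stackbottom(struct GC_stack_base *sb) {
    *sb = fake_bottom;
    return &fake_bottom;
}
static void GC_set_my_stackbottom(void *h, const struct GC_stack_base *sb) {
    (void)h;
    fake_bottom = *sb;
}
static void *GC_call_with_alloc_lock(GC_fn_type fn, void *arg) {
    return fn(arg);
}
#endif

/* Per-fiber record kept by the application (hypothetical type). */
struct fiber_info {
    struct GC_stack_base bottom;   /* cold end of this fiber's stack */
};

struct switch_args {
    void *thread_handle;                 /* from GC_get_my_stackbottom */
    const struct GC_stack_base *bottom;  /* bottom of the incoming fiber */
};

/* Runs with the GC lock held, as GC_set_my_stackbottom requires. */
static void *GC_CALLBACK set_bottom_locked(void *p) {
    struct switch_args *a = p;
    GC_set_my_stackbottom(a->thread_handle, a->bottom);
    return NULL;
}

/* Save the outgoing fiber's stack bottom, then tell the collector about
   the incoming fiber's one. The fiber switch itself is up to the caller. */
static void gc_note_fiber_switch(struct fiber_info *from,
                                 const struct fiber_info *to) {
    struct switch_args a;

    assert(from != NULL && to != NULL);
    a.thread_handle = GC_get_my_stackbottom(&from->bottom);
    a.bottom = &to->bottom;
    GC_call_with_alloc_lock(set_bottom_locked, &a);
}
```

The idea is that the code which calls SwitchToFiber also calls gc_note_fiber_switch for the same pair of fibers, so the stack bottom the collector uses always belongs to the fiber that is actually running.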
About GetCurrentThreadStackLimits: BDWGC currently uses VirtualQuery and GetThreadContext to determine the boundaries of the stack (its committed region). I don't see right now how GetCurrentThreadStackLimits would improve on that, at least for regular threads.