At which points can a goroutine yield?

I am trying to gain a better understanding of how goroutines are scheduled in Go programs, especially of the points at which they can yield to other goroutines. We know that a goroutine yields on syscalls that would block it, but apparently this is not the whole picture.

This question raises a similar concern, and the top-rated answer says that a goroutine may also switch on function calls, since calling a function invokes the scheduler to check whether the stack needs to grow, but it explicitly says that

If you don't have any function calls, just some math, then yes, goroutine will lock the thread until it exits or hits something that could yield execution to others.

I wrote a simple program to check and prove that:

package main

import "fmt"

var output [30]string      // 3 loops, 10 recorded entries each.
var oi = 0

func main() {
    runtime.GOMAXPROCS(1)   // Or set it through env var GOMAXPROCS.
    chanFinished1 := make(chan bool)
    chanFinished2 := make(chan bool)

    go loop("Goroutine 1", chanFinished1)
    go loop("Goroutine 2", chanFinished2)
    loop("Main", nil)

    <-chanFinished1
    <-chanFinished2

    for _, l := range output {
        fmt.Println(l)
    }
}

func loop(name string, finished chan bool) {
    for i := 0; i < 1000000000; i++ {
        if i % 100000000 == 0 {
            output[oi] = name
            oi++
        }
    }

    if finished != nil {
        finished <- true
    }
}

NOTE: I am aware that writing a value into the array and incrementing oi without synchronization is not quite correct, but I want to keep the code simple and free of anything that could cause a switch. After all, the worst that can happen is writing a value without advancing the index (overwriting an entry), which is not a big deal.
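For completeness: I am deliberately not using any synchronization in the test above, but if I did want that bookkeeping to be race-free, I believe reserving a slot with sync/atomic would do it without introducing a blocking operation such as a mutex. A sketch of only the changed parts (it assumes "sync/atomic" is added to the imports):

var output [30]string
var oi int64 // advanced only via atomic.AddInt64

// ... and inside loop(), replacing the unsynchronized update:
if i%100000000 == 0 {
    idx := atomic.AddInt64(&oi, 1) - 1 // reserve a unique slot
    output[idx] = name                 // each slot is written exactly once
}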

Unlike this answer, I avoided calling any functions (including the built-in append()) from the loop() function that is launched as a goroutine, and I explicitly set GOMAXPROCS=1, which according to the documentation:

limits the number of operating system threads that can execute user-level Go code simultaneously.

Nevertheless, in the output I still see the messages Main/Goroutine 1/Goroutine 2 interleaved, which means one of the following:

  • execution of a goroutine is interrupted and the goroutine gives up control at certain points;
  • GOMAXPROCS does not work as stated in the documentation and spins up more OS threads to schedule goroutines.

Either the answer is not complete, or some things have changed since 2016 (I tested on Go 1.13.5 and 1.15.2).

I am sorry if this has already been answered, but I could not find an explanation of why this particular example yields control, nor of the points where goroutines yield control in general (apart from blocking syscalls).

NOTE: This question is purely theoretical; I am not trying to solve any practical task right now. In general, though, I assume that knowing where a goroutine can and cannot yield lets us avoid redundant use of synchronization primitives.

There are 2 answers

torek (BEST ANSWER)

Go version 1.14 introduced asynchronous preemption:

Goroutines are now asynchronously preemptible. As a result, loops without function calls no longer potentially deadlock the scheduler or significantly delay garbage collection. This is supported on all platforms except windows/arm, darwin/arm, js/wasm, and plan9/*.

As answered in Are channel sends preemption points for goroutine scheduling?, Go's preemption points may change from one release to the next. Asynchronous preemption just adds possible preemption points almost everywhere.
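A minimal way to see the new behaviour is a sketch like the one below: a busy loop with no function calls in its body, plus a second goroutine that only prints. On Go 1.14+ the printing goroutine keeps running because the busy loop is preempted asynchronously; on earlier releases (or with signal-based preemption disabled via GODEBUG=asyncpreemptoff=1) it is starved until the loop finishes, since GOMAXPROCS is 1. This is only a sketch; the exact preemption interval is an implementation detail.

package main

import (
    "fmt"
    "runtime"
    "time"
)

func main() {
    runtime.GOMAXPROCS(1)

    // A goroutine that only reports that it is still being scheduled.
    go func() {
        for {
            fmt.Println("still being scheduled")
            time.Sleep(100 * time.Millisecond)
        }
    }()

    // A busy loop with no function calls in its body ("just some math").
    x := 0
    for i := 0; i < 2000000000; i++ {
        x += i
    }
    fmt.Println("busy loop done:", x)
}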

Your writes to the output array are not synchronized and your oi index is not updated atomically, which means we cannot really be sure what ends up in the output array. Of course, making the updates atomic with a mutex would introduce cooperative scheduling points of its own. The races are not the source of the cooperative scheduling switches (which must be occurring, based on your output), but they do mess with our understanding of the program.

The output array holds strings, and working with strings can invoke the garbage collection system, which can take locks and cause a scheduling switch. So this is the most likely cause of the scheduling switches in pre-Go-1.14 implementations.

AJR

As @torek has pointed out, the most popular Go runtime environments have used pre-emptive scheduling for a few months now (since 1.14). Otherwise, the points at which a goroutine may yield vary depending on the runtime environment and the release, but William Kennedy gives a good summary.

I also recall that an option was added to the compiler a few years ago to insert yield points into long-running loops, but it was experimental and not normally enabled. (Of course, you can do the same manually by calling runtime.Gosched() every now and then in your loop.)
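Roughly like this, as a sketch (fmt.Println is itself a function call, but the explicit runtime.Gosched() makes the hand-over unambiguous):

package main

import (
    "fmt"
    "runtime"
)

// busy is a tight loop that explicitly yields every 100M iterations.
func busy(name string, done chan<- bool) {
    for i := 0; i < 1000000000; i++ {
        if i%100000000 == 0 {
            fmt.Println(name, "at", i)
            runtime.Gosched() // voluntarily hand the thread to another runnable goroutine
        }
    }
    done <- true
}

func main() {
    runtime.GOMAXPROCS(1)
    done := make(chan bool, 2)
    go busy("Goroutine 1", done)
    go busy("Goroutine 2", done)
    <-done
    <-done
}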

As for your test, I am surprised by the result you got when running under Go 1.13.5. The behaviour is not exactly defined because of the data races (I know you avoided synchronisation mechanisms so as not to trigger a yield), but I would not have expected that result. One thing to note is that setting GOMAXPROCS to 1 means only one goroutine executes user-level code at a time, but it does not necessarily mean that when a different goroutine executes it will run on the same core. A different core has a different cache and (without synchronisation) may have a different view of the values of output and oi.

But may I suggest you simply forget about modifying global variables and just log a message before and after the busy loop. This should clearly show (in Go < 1.14) that only one loop runs at a time. (I tried the same experiment as you many years ago and that seemed to work.)
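Something along these lines (again only a sketch): with GOMAXPROCS=1 on Go < 1.14 I would expect "Main end" to appear before the other goroutines get going, whereas on 1.14+ all three "start" lines typically appear before any "end" line.

package main

import (
    "fmt"
    "runtime"
)

// busy logs before and after a tight loop that contains no function calls.
func busy(name string, done chan bool) {
    fmt.Println(name, "start")
    x := 0
    for i := 0; i < 1000000000; i++ {
        x += i
    }
    fmt.Println(name, "end", x)
    if done != nil {
        done <- true
    }
}

func main() {
    runtime.GOMAXPROCS(1)
    done1 := make(chan bool)
    done2 := make(chan bool)
    go busy("Goroutine 1", done1)
    go busy("Goroutine 2", done2)
    busy("Main", nil)
    <-done1
    <-done2
}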