Goroutine scheduling problem when executing an infinite loop


I am studying the MIT 6.824 course, and lab 2 of the course is to implement the basic Raft consensus algorithm. My lab environment is WSL2 on Windows, with Ubuntu 22.04.2 LTS as the Linux distribution, Go version go1.20.1 linux/amd64, and an Intel i7-12650H CPU.

In my code implementation, after a cluster member is elected as Leader, a logSyncer Goroutine is started for each other member in the cluster, to periodically check whether there are new log entries that need to be replicated to that follower, or to send a heartbeat.
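
For reference, the Goroutines are started roughly like this when the instance wins an election (a simplified sketch, not my exact code; rf.peers and rf.me are the standard fields from the lab's Raft skeleton):

// Simplified sketch: once elected Leader, start one logSyncer Goroutine
// per peer (skipping ourselves), passing the term we were elected in.
// Assumes rf.mu is held by the caller.
for peer := range rf.peers {
    if peer == rf.me {
        continue
    }
    go rf.logSyncer(peer, rf.currentTerm)
}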

The code logic of logSyncer is as follows:

func (rf *Raft) logSyncer(server int, expectTerm int) {
    noOpsCnt := -1
    for {
        if rf.killed() {
            return
        }
        rf.mu.Lock()

        if rf.currentTerm != expectTerm {
            rf.mu.Unlock()
            return
        }

        curLastLogIndex := rf.getGlobalIndex(len(rf.log) - 1)

        if curLastLogIndex >= rf.nextIdx[server] || (noOpsCnt+1)%5 == 0 {
            // Enter this block when either:
            // 1. a heartbeat period has passed (5 iterations * 20~30 ms, i.e. 100~150 ms), or
            // 2. there are new log entries that can be sent to this follower.
            rf.syncEntriesToFollower(server, expectTerm)
            noOpsCnt = 0
        } else {
            // no-op iteration, just increment the counter
            noOpsCnt++
        }
        rf.mu.Unlock()

        // before starting the next iteration, sleep 20~30 ms
        randSleepTime := GetRandom(20, 30)
        time.Sleep(time.Duration(randSleepTime) * time.Millisecond)
    }
}
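
GetRandom is not shown above; it just returns a random integer in the given inclusive range, so each iteration sleeps 20~30 ms. A minimal sketch of such a helper (my real one may differ slightly):

// GetRandom returns a random integer in [lo, hi], inclusive.
// Minimal sketch; requires "math/rand".
func GetRandom(lo, hi int) int {
    return lo + rand.Intn(hi-lo+1)
}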

However, when I start 40 sub-processes to run the lab's log replication tests concurrently and repeat them many times, roughly once every 2,500 runs I encounter the following puzzling situation, which ultimately causes the test to fail.

Normally, the leader sends a heartbeat to each follower at a regular interval (I set it to 100~150 ms) so that followers do not time out and start a new election. In the failing case, however, there is a 500 ms ~ 1 s window in which the leader sends no heartbeat, causing a follower to start a new election.

Then I begin the difficult process of debugging what I assume must be bugs in my implementation. My initial suspicion is that rf.mu.Lock() cannot return promptly because the mutex is held across blocking operations (a channel send, an RPC, a sleep, etc.).

I add a check() Goroutine to the code above: if the lock is not acquired within a certain time (I set it to 200 ms), it prints the stack frames of all Goroutines and panics. But to my disappointment, running the failing test case never triggers the panic or prints the Goroutine stacks.

for {
    // ...
    done := make(chan struct{}, 1)
    check := func() {
        ticker := time.NewTicker(200 * time.Millisecond)
        defer ticker.Stop()

        select {
        case <-done:
            return
        case <-ticker.C:
            // print the stacks of all Goroutines
            buf := make([]byte, 24576) // 24 KB
            buf = buf[:runtime.Stack(buf, true)]
            fmt.Println(string(buf))
            panic("spend more than 200ms to acquire rf.mu")
        }
    }
    go check()

    rf.mu.Lock()
    // let check() exit
    done <- struct{}{}
    // ...
}
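
Since the ticker only ever fires once per check, a one-shot timeout would behave the same; an equivalent sketch using time.After:

// Equivalent sketch of check() with a one-shot timeout instead of a ticker.
check := func() {
    select {
    case <-done:
        return
    case <-time.After(200 * time.Millisecond):
        // print the stacks of all Goroutines, then panic
        buf := make([]byte, 24576) // 24 KB
        buf = buf[:runtime.Stack(buf, true)]
        fmt.Println(string(buf))
        panic("spend more than 200ms to acquire rf.mu")
    }
}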

Later, I add more log information:

func (rf *Raft) logSyncer(server int, expectTerm int) {
    noOpsCnt := -1
    tPrevEndLoop := time.Now().UnixMilli()
    for {
        tBeginLoop := time.Now().UnixMilli()

        // 1. print tBeginLoop - tPrevEndLoop

        if rf.killed() {
            return
        }
        tKill := time.Now().UnixMilli()

        // 2. print tKill - tBeginLoop

        // start check()
        // ...
        rf.mu.Lock()
        // exit check()
        // ...

        tLock := time.Now().UnixMilli()

        // 3. print tLock - tKill

        curLastLogIndex := rf.getGlobalIndex(len(rf.log) - 1)

        if curLastLogIndex >= rf.nextIdx[server] || (noOpsCnt+1)%5 == 0 {
            // Enter this block when either:
            // 1. a heartbeat period has passed (5 iterations * 20~30 ms, i.e. 100~150 ms), or
            // 2. there are new log entries that can be sent to this follower.
            rf.syncEntriesToFollower(server, expectTerm)
            noOpsCnt = 0
        } else {
            // no-op iteration, just increment the counter
            noOpsCnt++
        }
        rf.mu.Unlock()

        tBeforeSleep := time.Now().UnixMilli()

        // before starting the next iteration, sleep 20~30 ms
        randSleepTime := GetRandom(20, 30)
        time.Sleep(time.Duration(randSleepTime) * time.Millisecond)

        tAfterSleep := time.Now().UnixMilli()

        // 4. print tAfterSleep - tBeforeSleep
        tPrevEndLoop = tAfterSleep
    }
}

I record multiple timestamps and print the differences between them (a rough sketch of the print appears after the list):

  • tPrevEndLoop: the moment the previous loop iteration ended.
  • tBeginLoop: the moment the current loop iteration started.
  • tKill: the moment just after checking the survival status of the raft instance.
  • tLock: the moment when the logSyncer Goroutine acquires the mutex rf.mu.
  • tBeforeSleep, tAfterSleep: the moments when logSyncer starts and finishes sleeping at the end of an iteration.
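
The actual print statements are elided in the snippet above; roughly, the four deltas are collected and printed in one line per iteration, something like this (the format string below is only a reconstruction guessed from the output further down, not my exact code):

// Rough reconstruction of the elided print, emitted once per iteration
// after tAfterSleep has been taken and before tPrevEndLoop is updated
// (format and field order are assumptions).
fmt.Printf("Server %d Leader Term %d, logSyncer-> Server %d, noOpsCnt %d,\n"+
    " (tBeginLoop - tPrevEndLoop) takes %d ms, (tKill - tBeginLoop) takes %d ms, "+
    "(tLock - tKill) takes %d ms, (tAfterSleep - tBeforeSleep) takes %d ms\n",
    rf.me, expectTerm, server, noOpsCnt,
    tBeginLoop-tPrevEndLoop, tKill-tBeginLoop, tLock-tKill, tAfterSleep-tBeforeSleep)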

What I do not expect is an interval of up to 1 s between the end of one loop iteration and the start of the next. This is incredible for code that performs no network or disk I/O!

017048 TEST Server 0 Leader Term 1, logSyncer-> Server 2, noOpsCnt 3,
 (tBeginLoop - tPrevEndLoop) takes 1038 ms, (tKill- tBeginLoop) takes 0 ms, (tLock- tKill) takes 0 ms, (tAfterSleep - tBeforeSleep) takes 26ms

I would like to ask: what could be happening to cause the situation above?


There are 0 answers