How to execute a custom command when ctest times out on Jenkins

I have a Jenkins that executes ctest which in turn executes several unit tests. A global timeout of 120 minutes for a test run is configured.

One of my test programs sporadically gets stuck and is killed by the configured timeout.

What I would like to have is a core dump of the test program at the moment it hangs. So I'd like to execute a custom command (e.g. gcore XXX) whenever the timeout is reached.

How can I configure that in Jenkins and/or ctest?

Best answer (by fex)

I wrote my own, non-portable script to do the job. Hopefully it serves as help and/or inspiration for others:

#!/usr/bin/env ruby

# Watches the children of ctest.
# Takes a gcore of the child and kills it if its runtime exceeds the configured timeout.

# "make test" will show a line like this if the watch dog killed the test:
#      Start 49: test_logging
#49/86 Test #49: test_logging ......................***Exception: Other     62.33 sec

require "time"

TIMEOUT_SEC = (ENV["TIMEOUT_SEC"] || 23*60).to_i
DIR_CORES = ENV["DIR_CORES"] || "/tmp/corefiles/"
KILL_SIGNAL = ENV["KILL_SIGNAL"] || 9
SLEEP_TIME_SEC = (ENV["SLEEP_TIME_SEC"] || 5).to_i

puts "Started ctest watch dog."
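# Detach from the terminal. By default Process.daemon redirects
# stdin/stdout/stderr to /dev/null, so the puts calls below are only
# visible if you call Process.daemon(false, true) instead.
Process.daemon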

while true do
    pid_ctest = %x(pgrep ctest).strip
    if !pid_ctest.nil? && !pid_ctest.empty?
#       puts "ctest: #{pid_ctest}"
        pid_child = %x(ps -o ppid= -o pid= -A | awk '$1 == #{pid_ctest}{print $2}').strip
        if !pid_child.nil? && !pid_child.empty?
#           puts "child: #{pid_child}"
            runtime_child = %x(ps -o etime= -p #{pid_child}).strip
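            # etime has the form [[dd-]hh:]mm:ss; fold any days into the hours
            days, clock = runtime_child.include?("-") ? runtime_child.split("-", 2) : ["0", runtime_child]
            timeary = clock.split(":")
            hour = min = sec = 0
            if timeary.length > 2
                hour = days.to_i*24 + timeary[0].to_i
                min = timeary[1]
                sec = timeary[2]
            else
                min = timeary[0]
                sec = timeary[1]
            end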

            # Ask ps for the command name instead of parsing pstree output,
            # which breaks on names containing "-" or when no child is shown.
            child_name = %x(ps -o comm= -p #{pid_child}).strip

            t = hour.to_i*60*60 + min.to_i*60 + sec.to_i
            if t > TIMEOUT_SEC
                puts "kill child: #{pid_child} #{runtime_child} #{t.to_i}"

                puts "dumping core to #{DIR_CORES}/#{child_name}"
                %x(gcore -o #{DIR_CORES}/#{child_name} #{pid_child} )
                puts "killing with signal #{KILL_SIGNAL}"
                %x(kill --signal #{KILL_SIGNAL} #{pid_child})
            else
                puts "Letting child live. ctest: #{pid_ctest}, child: #{pid_child}, name: #{child_name}, runtime: #{runtime_child}, in sec: #{t}. Killing in #{TIMEOUT_SEC-t} sec"
            end
        end
    end

    sleep SLEEP_TIME_SEC
end
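
The elapsed-time handling is the part of the script that is easiest to get wrong, because ps prints etime as [[dd-]hh:]mm:ss. As a minimal sketch, it could be pulled out into a small helper that also covers the day prefix. The name etime_to_seconds is hypothetical and not part of the script above:

```ruby
# Hypothetical helper: convert a ps "etime" value ([[dd-]hh:]mm:ss)
# into a number of seconds.
def etime_to_seconds(etime)
  days = 0
  rest = etime.strip
  # An optional "dd-" prefix carries the number of full days.
  if rest.include?("-")
    day_part, rest = rest.split("-", 2)
    days = day_part.to_i
  end
  parts = rest.split(":").map(&:to_i)
  # Pad missing leading fields so we always have hours, minutes, seconds.
  parts.unshift(0) while parts.length < 3
  hours, minutes, seconds = parts
  ((days * 24 + hours) * 60 + minutes) * 60 + seconds
end

puts etime_to_seconds("01:02")       # => 62
puts etime_to_seconds("1-02:03:04")  # => 93784
```

With such a helper, the watch dog loop would only need to compare etime_to_seconds(runtime_child) against TIMEOUT_SEC.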