I call an external program inside a function. Now I would like to time out this function, and not just the external program. But after the function times out, the external program keeps running on my computer (I'm using Debian) until it finishes its computation; after that, its thread still remains in the process table as a subthread of my main program until the main program terminates.
Here are two minimal examples which illustrate what I would like to do. The first uses unsafePerformIO, the second is completely in the IO monad. I don't really depend on unsafePerformIO but would like to keep it if possible. The described problem occurs with and without it.
With unsafePerformIO
module Main where

import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process

main = do
    x <- time $ timeoutP (1 * 1000000) $ mytest 2
    y <- getLine
    putStrLn $ show x ++ y

timeoutP :: Int -> a -> IO (Maybe a)
timeoutP t fun = timeout t $ return $! fun

mytest :: Int -> String
mytest n =
    let
        x = runOnExternalProgram $ n * 1000
    in
        x ++ ". Indeed."

runOnExternalProgram :: Int -> String
runOnExternalProgram n = unsafePerformIO $ do
    -- convert the input to a parameter of the external program
    let x = show $ n + 12
    -- run the external program
    -- (here I use "sleep" to indicate a slow computation)
    answer <- readProcess "sleep" [x] ""
    -- convert the output as needed
    let verboseAnswer = "External program answered: " ++ answer
    return verboseAnswer
Without unsafePerformIO
module Main where

import System.Timeout
import Criterion.Measurement
import System.IO.Unsafe
import System.Process

main = do
    x <- time $ timeout (1 * 1000000) $ mytest 2
    y <- getLine
    putStrLn $ show x ++ y

mytest :: Int -> IO String
mytest n = do
    x <- runOnExternalProgram $ n * 1000
    return $ x ++ ". Indeed."

runOnExternalProgram :: Int -> IO String
runOnExternalProgram n = do
    -- convert the input to a parameter for the external program:
    let x = show $ n + 12
    -- run the external program
    -- (here I use "sleep" to indicate a slow computation):
    answer <- readProcess "sleep" [x] ""
    -- convert the output as needed:
    let verboseAnswer = "External program answered: " ++ answer
    return verboseAnswer
Maybe bracket can be of help here, but I don't really know how.
Edit: I adopted John L's answer. Now I am using the following:
import Control.Concurrent
import Control.Exception
import System.Exit
import System.IO
import System.IO.Error
import System.Posix.Signals
import System.Process
import System.Process.Internals

safeCreateProcess :: String -> [String] -> StdStream -> StdStream -> StdStream
                  -> ( ( Maybe Handle
                       , Maybe Handle
                       , Maybe Handle
                       , ProcessHandle
                       ) -> IO a )
                  -> IO a
safeCreateProcess prog args streamIn streamOut streamErr fun = bracket
    ( do
        h <- createProcess (proc prog args)
                 { std_in  = streamIn
                 , std_out = streamOut
                 , std_err = streamErr
                 , create_group = True }
        return h
    )
    -- "interruptProcessGroupOf" is in the new System.Process. Since some
    -- programs return funny exit codes I implemented a "terminateProcessGroup".
    --    (\(_, _, _, ph) -> interruptProcessGroupOf ph >> waitForProcess ph)
    (\(_, _, _, ph) -> terminateProcessGroup ph >> waitForProcess ph)
    fun
{-# NOINLINE safeCreateProcess #-}
safeReadProcess :: String -> [String] -> String -> IO String
safeReadProcess prog args str =
    safeCreateProcess prog args CreatePipe CreatePipe Inherit
      (\(Just inh, Just outh, _, ph) -> do
        hPutStr inh str
        hClose inh
        -- fork a thread to consume output
        output <- hGetContents outh
        outMVar <- newEmptyMVar
        forkIO $ evaluate (length output) >> putMVar outMVar ()
        -- wait on output
        takeMVar outMVar
        hClose outh
        return output
        -- The following would be great, if some programs did not return funny
        -- exit codes!
        --     ex <- waitForProcess ph
        --     case ex of
        --         ExitSuccess -> return output
        --         ExitFailure r ->
        --             fail ("spawned process " ++ prog ++ " exit: " ++ show r)
      )
terminateProcessGroup :: ProcessHandle -> IO ()
terminateProcessGroup ph = do
    let (ProcessHandle pmvar) = ph
    ph_ <- readMVar pmvar
    case ph_ of
        OpenHandle pid -> do  -- pid is a POSIX pid
            -- sigTERM (signal 15) comes from System.Posix.Signals
            signalProcessGroup sigTERM pid
        _ -> return ()
This solves my problem. It kills all child processes of the spawned process, and does so at the right time.
Kind regards.
Edit: it is possible to get the pid of the spawned process. You can do so with code like the following:
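A minimal sketch of one way to do this, assuming a `process` version of 1.6.3.0 or later, where `getPid` is part of the public `System.Process` API. It is an alternative to unwrapping the `ProcessHandle` by hand through `System.Process.Internals`; as far as I can tell it does the same lookup (reading the handle's MVar and matching on `OpenHandle`). The helper name `describeProcess` is hypothetical and only there for illustration.

```haskell
import System.Process

-- | Hypothetical helper: report the OS-level pid behind a handle.
-- getPid yields Nothing once the handle has been closed, i.e. after the
-- process has been waited for.
describeProcess :: ProcessHandle -> IO String
describeProcess ph = do
  mpid <- getPid ph
  return $ case mpid of
    Just pid -> "running with pid " ++ show pid
    Nothing  -> "already closed"
```

Calling `describeProcess` on a handle returned by `createProcess` or `spawnProcess` gives the first message while the child is alive, and the second one after `waitForProcess` has returned.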
If you kill the process, instead of putting the open ph_ into the mvar you should create an appropriate ClosedHandle and put that back instead. It's important that this code executes masked (bracket will do this for you).
Now that you have a POSIX id you can use system calls or shell out to kill as necessary. Just be careful that your Haskell executable isn't in the same process group if you go that route.
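As a rough sketch of the "system calls" route, assuming a POSIX system, the `unix` package, and a `process` version that provides `getPid` (1.6.3.0 or later): the child is started with `create_group = True` so that it leads its own process group, which keeps the Haskell executable out of the group that gets signalled. The names `spawnInOwnGroup` and `terminateGroupOf` are hypothetical.

```haskell
import System.Posix.Signals (sigTERM, signalProcessGroup)
import System.Process

-- | Hypothetical helper: start a command as the leader of a new process
-- group, so that signalling the group cannot hit the Haskell program.
spawnInOwnGroup :: FilePath -> [String] -> IO ProcessHandle
spawnInOwnGroup prog args = do
  (_, _, _, ph) <- createProcess (proc prog args) { create_group = True }
  return ph

-- | Hypothetical helper: send SIGTERM to the child's whole process group.
-- Assumes the child was started with create_group = True, so that its
-- group id equals its pid.
terminateGroupOf :: ProcessHandle -> IO ()
terminateGroupOf ph = do
  mpid <- getPid ph
  case mpid of
    Just pid -> signalProcessGroup sigTERM pid
    Nothing  -> return ()  -- handle already closed; nothing to signal
```

The caller still has to `waitForProcess` afterwards so that the child is reaped and does not linger in the process table.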
/end edit
This behavior seems sort of sensible. The documentation for timeout claims that it doesn't work at all for non-Haskell code, and indeed I don't see any way that it could generically. What's happening is that readProcess spawns a new process, but then is timed out while waiting for output from that process. It seems that readProcess doesn't terminate the spawned process when it's aborted abnormally. This could be a bug in readProcess, or it could be by design.
As a workaround, I think you'll need to implement some of this yourself. timeout works by raising an asynchronous exception in the thread that runs the timed action. If you wrap your runOnExternalProgram in an exception handler, you'll get the behavior you want.
The key function here is the new runOnExternalProgram, which is a combination of your original function and readProcess. It would be better (more modular, more reusable, more maintainable) to make a new readProcess that kills the spawned process when an exception is raised, but I'll leave that as an exercise.
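One way that exercise could look is sketched here, under a few assumptions: the name `readProcessKilling` is hypothetical, `bracket` plays the role of the exception handler, and the child is stopped with `terminateProcess` (SIGTERM on POSIX), which only reaches the direct child and not any processes it has spawned itself. Treat it as a sketch of the idea rather than a drop-in replacement for readProcess.

```haskell
import Control.Exception (bracket, evaluate)
import System.IO (hClose, hGetContents, hPutStr)
import System.Process
import System.Timeout (timeout)

-- | Hypothetical helper: like readProcess, but the child is terminated and
-- reaped whenever the action is left, including when timeout raises its
-- asynchronous exception.
readProcessKilling :: FilePath -> [String] -> String -> IO String
readProcessKilling prog args input =
  bracket
    (createProcess (proc prog args) { std_in = CreatePipe, std_out = CreatePipe })
    (\(_, _, _, ph) -> terminateProcess ph >> waitForProcess ph)
    consume
  where
    consume (Just inh, Just outh, _, _) = do
      hPutStr inh input
      hClose inh
      output <- hGetContents outh
      -- force the lazy output completely before the cleanup runs
      _ <- evaluate (length output)
      return output
    consume _ = ioError (userError "readProcessKilling: pipes were not created")

-- | The scenario from the question: give up after one second. The "sleep"
-- child is terminated as well instead of running to completion.
example :: IO (Maybe String)
example = timeout 1000000 (readProcessKilling "sleep" ["2012"] "")
```

Because the cleanup action also calls `waitForProcess`, the child is reaped right away and should not stay behind in the process table.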