How to read all remaining output of readInBackgroundAndNotify after NSTask has ended?


I'm invoking various command line tools via NSTask. The tools may run for several seconds, and output text constantly to stdout. Eventually, the tool will terminate on its own. My app reads its output asynchronously with readInBackgroundAndNotify.

If I stop processing the async output as soon as the tool has exited, I will often lose some of its output that hasn't been delivered by then.

So I have to wait a little longer and let the run loop process the pending read notifications. But how do I tell when I've read everything the tool has written to the pipe?

The problem can be reproduced with the code below by removing the line with the runMode: call; the program will then report that zero lines were processed. So it appears that when the process exits, a notification is already queued and waiting to be delivered, and that delivery happens through the runMode: call.

Now, it might appear that calling runMode: once after the tool's exit would be enough, but my testing shows that it isn't: sometimes (with larger amounts of output data), this still processes only part of the remaining data.

Note: A work-around such as making the invoked tool output some end-of-text marker is not a solution I seek. I believe there must be some proper way to do this, whereby the end of the pipe stream is signalled somehow, and that's what I'm looking for in an answer.

Sample Code

The code below can be pasted into a new Xcode project's AppDelegate.m file.

When run, it invokes a tool that generates some longer output and then waits for the tool to terminate with waitUntilExit. If the code removed the outputFileHandleReadCompletionObserver immediately afterwards, most of the tool's output would be missed. With the runMode: loop running for one second, all output from the tool is received. Of course, this timed loop is less than optimal.

I would also like to keep the runTool method synchronous, i.e. it shall not return before it has received all output from the tool. It runs in its own thread in my actual program, if that matters (I saw a comment from Peter Hosey warning that waitUntilExit would block the UI, but that would not be an issue in my case).

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    [self runTool];
}

- (void)runTool
{
    // Retrieve 200 lines of text by invoking `head -n 200 /usr/share/dict/words`
    NSTask *theTask = [[NSTask alloc] init];
    theTask.qualityOfService = NSQualityOfServiceUserInitiated;
    theTask.launchPath = @"/usr/bin/head";
    theTask.arguments = @[@"-n", @"200", @"/usr/share/dict/words"];

    __block int lineCount = 0;

    NSPipe *outputPipe = [NSPipe pipe];
    theTask.standardOutput = outputPipe;
    NSFileHandle *outputFileHandle = outputPipe.fileHandleForReading;
    NSString __block *prevPartialLine = @"";
    id <NSObject> outputFileHandleReadCompletionObserver = [[NSNotificationCenter defaultCenter] addObserverForName:NSFileHandleReadCompletionNotification object:outputFileHandle queue:nil usingBlock:^(NSNotification * _Nonnull note)
    {
        // Read the output from the cmdline tool
        NSData *data = [note.userInfo objectForKey:NSFileHandleNotificationDataItem];
        if (data.length > 0) {
            // go over each line
            NSString *output = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
            NSArray *lines = [[prevPartialLine stringByAppendingString:output] componentsSeparatedByString:@"\n"];
            prevPartialLine = [lines lastObject];
            NSInteger lastIdx = lines.count - 1;
            [lines enumerateObjectsUsingBlock:^(NSString *line, NSUInteger idx, BOOL * _Nonnull stop) {
                if (idx == lastIdx) return; // skip the last (= incomplete) line as it's not terminated by a LF
                // now we can process `line`
                lineCount += 1;
            }];
        }
        [note.object readInBackgroundAndNotify];
    }];

    NSParameterAssert(outputFileHandle);
    [outputFileHandle readInBackgroundAndNotify];

    // Start the task
    [theTask launch];

    // Wait until it is finished
    [theTask waitUntilExit];

    // Wait one more second so that we can process any remaining output from the tool
    NSDate *endDate = [NSDate dateWithTimeIntervalSinceNow:1];
    while ([NSDate.date compare:endDate] == NSOrderedAscending) {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
    }

    [[NSNotificationCenter defaultCenter] removeObserver:outputFileHandleReadCompletionObserver];

    NSLog(@"Lines processed: %d", lineCount);
}

There are 2 answers

Answer by vadian (best answer)

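It's quite simple. A read-completion notification whose data has length 0 signals end of file, i.e. the tool has closed its end of the pipe and nothing more will arrive. In the observer block, when data.length is 0, remove the observer and call terminate.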

The code will continue after the waitUntilExit line.

- (void)runTool
{
    // Retrieve 20000 lines of text by invoking `head -n 20000 /usr/share/dict/words`
    const int expected = 20000;
    NSTask *theTask = [[NSTask alloc] init];
    theTask.qualityOfService = NSQualityOfServiceUserInitiated;
    theTask.launchPath = @"/usr/bin/head";
    theTask.arguments = @[@"-n", [@(expected) stringValue], @"/usr/share/dict/words"];

    __block int lineCount = 0;
    __block bool finished = false;

    NSPipe *outputPipe = [NSPipe pipe];
    theTask.standardOutput = outputPipe;
    NSFileHandle *outputFileHandle = outputPipe.fileHandleForReading;
    NSString __block *prevPartialLine = @"";
    // Keep the token returned by addObserverForName:...; a block-based observer can only be removed via this token
    __block id <NSObject> observer = [[NSNotificationCenter defaultCenter] addObserverForName:NSFileHandleReadCompletionNotification object:outputFileHandle queue:nil usingBlock:^(NSNotification * _Nonnull note)
    {
        // Read the output from the cmdline tool
        NSData *data = [note.userInfo objectForKey:NSFileHandleNotificationDataItem];
        if (data.length > 0) {
            // go over each line
            NSString *output = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
            NSArray *lines = [[prevPartialLine stringByAppendingString:output] componentsSeparatedByString:@"\n"];
            prevPartialLine = [lines lastObject];
            NSInteger lastIdx = lines.count - 1;
            [lines enumerateObjectsUsingBlock:^(NSString *line, NSUInteger idx, BOOL * _Nonnull stop) {
                if (idx == lastIdx) return; // skip the last (= incomplete) line as it's not terminated by a LF
                // now we can process `line`
                lineCount += 1;
            }];
            // More data may follow, so schedule the next background read
            [note.object readInBackgroundAndNotify];
        } else {
            // Zero-length data means end of file: stop observing and do not schedule another read
            [[NSNotificationCenter defaultCenter] removeObserver:observer];
            observer = nil; // break the retain cycle between the block and its token
            [theTask terminate];
            finished = true;
        }
    }];

    NSParameterAssert(outputFileHandle);
    [outputFileHandle readInBackgroundAndNotify];

    // Start the task
    [theTask launch];

    // Wait until it is finished
    [theTask waitUntilExit];

    // Wait until all data from the pipe has been received
    while (!finished) {
        [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.0001]];
    }

    NSLog(@"Lines processed: %d (should be: %d)", lineCount, expected);
}
Answer by l'L'l

The problem with waitUntilExit is that it doesn't always behave the way one might think. The following is mentioned in the documentation:

waitUntilExit does not guarantee that the terminationHandler block has been fully executed before waitUntilExit returns.

It appears this is precisely the problem you are having; it's a race condition. waitUntilExit does not wait long enough, so lineCount is read before all of the task's output has been processed. The solution would likely be to use a semaphore or dispatch_group, although it's unclear if you want to go that route. This does not seem to be an easy problem to resolve.

*I ran into a similar issue months ago that unfortunately still isn't resolved.
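
One way to picture the race described in this answer: a process exiting and its pipe reaching end of file are separate events, so data can still be sitting in the pipe after the wait has returned. The plain C sketch below illustrates this at the POSIX level; it is not the NSTask implementation. `bytes_pending_after_exit` is a hypothetical helper, and it assumes that `nbytes` fits into the kernel's pipe buffer (otherwise the child would block on write and the parent would wait forever).

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: the child writes nbytes and exits; the parent
   waits for the exit FIRST and reads afterwards. Returns the number of
   bytes that were still in the pipe after the exit, or -1 on error. */
static long bytes_pending_after_exit(size_t nbytes)
{
    int fds[2];
    if (pipe(fds) != 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {
        /* Child: fill the pipe with nbytes of filler, then exit. */
        close(fds[0]);
        char chunk[256];
        memset(chunk, 'x', sizeof chunk);
        size_t left = nbytes;
        while (left > 0) {
            size_t want = left < sizeof chunk ? left : sizeof chunk;
            ssize_t put = write(fds[1], chunk, want);
            if (put <= 0) _exit(1);
            left -= (size_t)put;
        }
        _exit(0);
    }

    close(fds[1]);   /* parent keeps only the read end */

    /* Wait for the child to exit before reading anything. */
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) { close(fds[0]); return -1; }
    }

    /* The child is gone, yet its output is still buffered in the pipe. */
    long pending = 0;
    char buf[512];
    for (;;) {
        ssize_t got = read(fds[0], buf, sizeof buf);
        if (got < 0) {
            if (errno == EINTR) continue;
            pending = -1;
            break;
        }
        if (got == 0) break;   /* end of file: now everything has been read */
        pending += got;
    }
    close(fds[0]);
    return pending;
}
```

With a small payload such as `bytes_pending_after_exit(1024)`, all 1024 bytes are read only after the child has exited, which is why stopping at process exit can lose output while stopping at end of file does not.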