Problem with reading an NSPipe->NSFileHandle to end

I'm trying to execute a task in the background and parse the output from the task along the way. However, I get the NSTaskDidTerminateNotification before all the output from the task has been delivered by NSFileHandleReadCompletionNotification, and I am not able to squeeze much more (in some cases a little, but never all the way to the end) out of the file handle after the task exits.

Code is what we all want to look at, so here are the interesting bits.

I never got the full output in threadPipeReader, but then I tried to fetch the data in threadTaskStopped - but that only gives some output. Not all the way to the end either.

I then tried to specify how much data I wanted from myFileHandle in threadTaskStopped, by doing:

NSData *data = [myFileHandle readDataOfLength:262144];

And then got the output:

got some more: <about 262144 characters of data>
output size: 262144

But then the while loop exited because myFileHandle returned zero-length data the next time it was polled. When I don't specify a size, but just ask for availableData, I get chunks of exactly 16K (16384) multiple times, but never an odd size, and I can definitely see that the output gets chopped off (sometimes mid-line). So I am never able to get the last output from the task.

> I'm trying to execute a task in the background and parse the output from the task along the way. However, I get the NSTaskDidTerminateNotification before all the output from the task has been delivered by NSFileHandleReadCompletionNotification

This is completely ordinary. There are two independent inter-process communication mechanisms at work, and there's no guarantee that all of the output data will arrive at your process and be delivered in a notification before the task termination notification is delivered.

> - and I am not able to squeeze much more (but in some cases a little, but never all the way to the end) out of the filehandle after the task exits.

From what you say below, I'm not sure that's accurate.

> - (id) init {
> [super init];

You should be assigning the result from [super init] to self. You should also be checking if it's nil.
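For reference, the conventional shape of an initializer looks roughly like the sketch below. It is not the original poster's class: the class name TaskRunner is made up for illustration, the set-up inside the if block is left empty on purpose, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class name; only the shape of -init matters here.
@interface TaskRunner : NSObject
@end

@implementation TaskRunner

- (instancetype)init {
    // The superclass is allowed to return a different object, or nil on failure,
    // so the result has to be assigned back to self.
    self = [super init];
    if (self) {
        // Instance set-up goes here, and only runs if the superclass succeeded.
    }
    return self;
}

@end
```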

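> [[NSNotificationCenter defaultCenter] addObserver:self
> selector:@selector(threadPipeReader:)
> name:NSFileHandleReadCompletionNotification
> object:nil];
>
> [[NSNotificationCenter defaultCenter] addObserver:self
> selector:@selector(threadTaskStopped:)
> name:NSTaskDidTerminateNotification
> object:nil];

The above lines register for those notifications on _all_ tasks and file handles in the whole process. This is probably not what you want. You should register for those notifications after you've created the pipe (and its file handle) and the task, and you should register on those objects specifically.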

What you need to do is just mark some internal state so you know the task has exited in threadTaskStopped:. Then, return to the run loop so that you can continue to receive the notifications from the background reads. Eventually, you'll receive the end-of-file marker (a zero-length data). After you've received both the end-of-file and the task termination notification, then you can proceed to make final use of the data and clean up the task (and your registrations with the notification center).

The issue you're encountering is probably because there's both a background read in progress and your attempt to synchronously read in the foreground. The background read has probably obtained the "missing" data that you're never seeing from the foreground read. You will never see it if you don't allow the run loop to fire -- for example, if you terminate the thread after getting the task termination notification. Even if you did see it, you'd get it out of order with respect to the synchronous foreground read you're doing. There's no telling which read operation would get any particular chunk of data.

Abandon the foreground reading and the assumption that all data will have arrived by the time you get the task termination notification. Use only background reading and keep running the run loop until you get both end-of-file and task-terminated indicators.
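A rough sketch of that approach follows, under some assumptions: the class name OutputCollector and its property and method names are invented for illustration, ARC is assumed, and it uses the same NSTask calls as this thread (newer SDKs mark launch and setLaunchPath: as deprecated). It only ever reads in the background, records task termination as a flag, and treats the output as complete once both the zero-length read and the termination notification have been seen.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper class; none of these names come from the original code.
@interface OutputCollector : NSObject
@property (strong) NSTask *task;
@property (strong) NSFileHandle *readHandle;
@property (strong) NSMutableData *output;
@property BOOL sawEndOfFile;
@property BOOL taskExited;
- (void)launchPath:(NSString *)path arguments:(NSArray *)arguments;
@end

@implementation OutputCollector

- (void)finishIfDone {
    // Proceed only after BOTH indicators have arrived, in whichever order.
    if (!(self.sawEndOfFile && self.taskExited)) return;
    [[NSNotificationCenter defaultCenter] removeObserver:self];
    // self.output now holds everything the task wrote to standard output.
}

- (void)launchPath:(NSString *)path arguments:(NSArray *)arguments {
    NSPipe *pipe = [NSPipe pipe];
    self.output = [NSMutableData data];
    self.readHandle = [pipe fileHandleForReading];
    self.task = [[NSTask alloc] init];
    [self.task setLaunchPath:path];
    [self.task setArguments:arguments];
    [self.task setStandardOutput:pipe];

    // Observe only this task and this file handle.
    NSNotificationCenter *nc = [NSNotificationCenter defaultCenter];
    [nc addObserver:self selector:@selector(dataArrived:)
               name:NSFileHandleReadCompletionNotification object:self.readHandle];
    [nc addObserver:self selector:@selector(taskStopped:)
               name:NSTaskDidTerminateNotification object:self.task];

    [self.readHandle readInBackgroundAndNotify];
    [self.task launch];
    // Returning from here hands control back to the run loop.
}

- (void)dataArrived:(NSNotification *)note {
    NSData *data = [[note userInfo] objectForKey:NSFileHandleNotificationDataItem];
    if ([data length] > 0) {
        [self.output appendData:data];
        // Each request yields a single notification, so ask again.
        [[note object] readInBackgroundAndNotify];
    } else {
        // Zero-length data is the end-of-file marker.
        self.sawEndOfFile = YES;
        [self finishIfDone];
    }
}

- (void)taskStopped:(NSNotification *)note {
    // Only record the fact; no synchronous reads here.
    self.taskExited = YES;
    [self finishIfDone];
}

@end
```

The thread that calls launchPath:arguments: has to keep running its run loop afterwards, otherwise neither notification is delivered.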

If you still aren't getting all of the output you expect, then your task is probably exiting early, perhaps crashing.

>> [[NSNotificationCenter defaultCenter] addObserver:self
>> selector:@selector(threadPipeReader:)
>> name:NSFileHandleReadCompletionNotification
>> object:nil];
>>
>> [[NSNotificationCenter defaultCenter] addObserver:self
>> selector:@selector(threadTaskStopped:)
>> name:NSTaskDidTerminateNotification
>> object:nil];
>
> The above lines register for those notifications on _all_ tasks and file handles in the whole process. This is probably not what you want. You should register for those notifications after you've created the pipe (and its file handle) and the task, and you should register on those objects specifically.

I launch multiple processes, and I do a check to see which one is the one I'm getting notified for by doing this in threadPipeReader:

if ( [notification object] == myFileHandle )

(and of course I have more NSTasks and NSFileHandles than the myTask and myFileHandle my example shows).

So unless you strongly discourage this method, it works out pretty well, and I am able to distinguish which process is notifying me. The method you suggest forces me to create separate subroutines for each invocation.

My program's sole purpose is to launch processes and look at their output.

The only issue here is that threadPipeReader: does not get called after threadTaskStopped: has been called, even though output is clearly missing.

>> -(void)threadTaskStopped:(NSNotification *)notification {
>>
>> NSData *data = [myFileHandle availableData];
>>
>> while ([data length] > 0) {
>> NSLog(@"got some more: %@", [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding]);
>> NSLog(@"output size: %lu", (unsigned long)[data length]);
>> data = [myFileHandle availableData];
>> }
>>
>> }
>> <code end>
>>
>> I never got the full output in threadPipeReader, but then I tried to fetch the data in threadTaskStopped - but that only gives some output. Not all the way to the end either.
>
> What you need to do is just mark some internal state so you know the task has exited in threadTaskStopped:. Then, return to the run loop so that you can continue to receive the notifications from the background reads. Eventually, you'll receive the end-of-file marker (a zero-length data). After you've received both the end-of-file and the task termination notification, then you can proceed to make final use of the data and clean up the task (and your registrations with the notification center).

This sounds like the thing I need - however I need a more detailed explanation. I don't know what "return to the run loop" means. Can you give a code example?

But threadTaskStopped: does not need to examine data for the process that exited. In fact, I would prefer that threadPipeReader: keep being fed the data even after the process has exited. It's not important that the process exited.

> The issue you're encountering is probably because there's both a background read in progress and your attempt to synchronously read in the foreground.

If you mean the availableData calls in threadTaskStopped:, I only put them there because threadPipeReader: didn't get called after threadTaskStopped: did.

> The background read has probably obtained the "missing" data that you're never seeing from the foreground read.

Hmmm, I suspect some more code is needed. The part of the program that executes these NSTasks is a separate thread:

> On 11/04/2010, at 21.17, Ken Thomases wrote:
>
>> On Apr 8, 2010, at 9:57 AM, Rasmus Skaarup wrote:
>>
>>> [[NSNotificationCenter defaultCenter] addObserver:self
>>> selector:@selector(threadPipeReader:)
>>> name:NSFileHandleReadCompletionNotification
>>> object:nil];
>>>
>>> [[NSNotificationCenter defaultCenter] addObserver:self
>>> selector:@selector(threadTaskStopped:)
>>> name:NSTaskDidTerminateNotification
>>> object:nil];
>>
>> The above lines register for those notifications on _all_ tasks and file handles in the whole process. This is probably not what you want. You should register for those notifications after you've created the pipe (and its file handle) and the task, and you should register on those objects specifically.
>
> I launch multiple processes, and I do a check to see which one is the one I'm getting notified for by doing this in threadPipeReader:
>
> if ( [notification object] == myFileHandle )
>
> (and of course I have more NSTasks and NSFileHandles than the myTask and myFileHandle my example shows).

OK, that's fairly reasonable. Given that you're checking that the notification object is one of the ones you're interested in, it's safe.

> So unless you strongly discourage this method, it works out pretty well, and I am able to distinguish which process is notifying me. The method you suggest forces me to create separate subroutines for each invocation.

That's actually not true. You can observe specific objects while using the same selector for all of the observations.

The distinction between this and what you are apparently doing is that the framework could theoretically be using NSFileHandles or (less likely) NSTask for its own purposes. If you observe indiscriminately, then your method may get called for file handles or tasks that you didn't explicitly create. You then check the notification objects, which filters out the ones you didn't create. If you observe only the specific objects you create, then your method is only invoked for them and not for any framework-created objects. Either way is workable, although I personally prefer to be specific.
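To make the "same selector, specific objects" idea concrete, here is a small sketch. The class HandleWatcher, its watch: method, and the lastIndex property are invented for illustration, and ARC is assumed. Every handle is registered individually, yet all of them funnel into one threadPipeReader: method, which tells them apart through the notification's object.

```objc
#import <Foundation/Foundation.h>

// Hypothetical observer; it records which watched handle notified most recently.
@interface HandleWatcher : NSObject
@property (strong) NSMutableArray *handles;
@property NSUInteger lastIndex;   // NSNotFound until a watched handle notifies
- (void)watch:(NSFileHandle *)handle;
@end

@implementation HandleWatcher

- (instancetype)init {
    self = [super init];
    if (self) {
        _handles = [NSMutableArray array];
        _lastIndex = NSNotFound;
    }
    return self;
}

- (void)watch:(NSFileHandle *)handle {
    [self.handles addObject:handle];
    // Same selector for every registration; only the observed object differs.
    [[NSNotificationCenter defaultCenter] addObserver:self
                                             selector:@selector(threadPipeReader:)
                                                 name:NSFileHandleReadCompletionNotification
                                               object:handle];
}

- (void)threadPipeReader:(NSNotification *)notification {
    // Only handles passed to -watch: can arrive here, so no filtering is needed.
    self.lastIndex = [self.handles indexOfObjectIdenticalTo:[notification object]];
}

- (void)dealloc {
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```

Read-completion notifications for file handles that were never passed to watch:, such as ones a framework creates for itself, never reach threadPipeReader: with this arrangement.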

>> What you need to do is just mark some internal state so you know the task has exited in threadTaskStopped:. Then, return to the run loop so that you can continue to receive the notifications from the background reads. Eventually, you'll receive the end-of-file marker (a zero-length data). After you've received both the end-of-file and the task termination notification, then you can proceed to make final use of the data and clean up the task (and your registrations with the notification center).
>
> This sounds like the thing I need - however I need a more detailed explanation. I don't know what "return to the run loop" means. Can you give a code example?

Um, not really. Returning to the run loop means returning from your methods back to whatever run loop is running. For example, that would be the main event loop if the task launch and the file handle's readInBackgroundAndNotify are issued on the main thread.

Your code is launching the task and initiating the background read from the file handle, and then returning. Then, when something happens, a notification is posted and your registered observer methods are invoked. In between those events, what is your program doing? Generally speaking, it's running the run loop, which actually means it's waiting for external events or data to arrive (or timers to fire).

The above makes no sense. You are launching a background thread, but the only thing the background thread is doing is passing some work to the foreground thread.

You have added nothing but complexity and cost by involving a thread.

There is no need to do anything with threads to get asynchronous launching and monitoring of tasks. The APIs involved (at least, the ones you are using) are inherently asynchronous.

I suspect we're still not seeing the real code. I further expect that your problem is a result of confused use of threads.

When you invoke -readInBackgroundAndNotify, there is a run loop source installed on the run loop associated with the current thread. That is, if you invoke -readInBackgroundAndNotify on a background thread, the run loop source is installed on the background thread's run loop. If you invoke it on the main thread, the source is installed on the main thread's run loop.

This is important because, unless you take specific steps to run a background thread's run loop, it isn't run for you. (The main thread's run loop is run as part of a GUI application's main event loop. If you're writing a command-line tool or daemon instead of an application, then you must run it manually if you want it to be run.)
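For a command-line tool or a background thread, "running it manually" could look something like the sketch below. The helper name RunRunLoopUntil and its parameters are made up for illustration. It spins the current thread's run loop in the default mode until a flag is set by one of your notification handlers, or until a timeout passes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs the current thread's run loop until *done becomes
// YES or `timeout` seconds have passed. Returns the final value of *done.
static BOOL RunRunLoopUntil(BOOL *done, NSTimeInterval timeout) {
    NSDate *deadline = [NSDate dateWithTimeIntervalSinceNow:timeout];
    while (!*done && [deadline timeIntervalSinceNow] > 0) {
        // Wait briefly for input, then come back and re-check the flag.
        BOOL ran = [[NSRunLoop currentRunLoop] runMode:NSDefaultRunLoopMode
                                            beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.1]];
        if (!ran) {
            // Nothing is attached to the run loop in this mode, so the call
            // returned immediately; sleep a little to avoid a busy loop.
            [NSThread sleepForTimeInterval:0.01];
        }
    }
    return *done;
}
```

The flag would typically be set from the handler that sees both end-of-file and task termination. It has to be the same thread that issued readInBackgroundAndNotify, since that is the run loop holding the source.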

If the run loop containing the file handle's background read run loop source is not run in a relevant mode, then you won't get the notification about data that was read.

Try simplifying your app by eliminating the use of background threads. Or, if you feel you must use background threads for long-running computation, use them only for handling the data after receiving it. Do everything involving launching the task and initiating background reads of its output from the main thread (without that bizarre bit about launching a background thread just to have it shunt some work back to the main thread).

> Try simplifying your app by eliminating the use of background threads. Or, if you feel you must use background threads for long-running computation, use them only for handling the data after receiving it. Do everything involving launching the task and initiating background reads of its output from the main thread (without that bizarre bit about launching a background thread just to have it shunt some work back to the main thread).

I thought that when I added the observers from the main init and started the tasks by calling performSelectorOnMainThread: from the background thread, the task would also start on the same thread as the observers, the main thread. But to my surprise, if I add specific observers from startMyTask: (instead of from the main init), everything works perfectly, even though I still initiate the task launch from a background thread.

>> Try simplifying your app by eliminating the use of background threads. Or, if you feel you must use background threads for long-running computation, use them only for handling the data after receiving it. Do everything involving launching the task and initiating background reads of its output from the main thread (without that bizarre bit about launching a background thread just to have it shunt some work back to the main thread).
>
> I thought that when I added the observers from the main init and started the tasks by calling performSelectorOnMainThread: from the background thread, the task would also start on the same thread as the observers, the main thread. But to my surprise, if I add specific observers from startMyTask: (instead of from the main init), everything works perfectly, even though I still initiate the task launch from a background thread.
>
> Thanks for your help Ken!

You're welcome, but the above still demonstrates quite a bit of confusion about threads. The reason I told you to remove the (apparently unnecessary) threading from your app was because, no offense, I don't have confidence that you're getting it right. And getting the threading wrong is very likely to produce symptoms like you described.

For example, the phrase "on the same thread as the observers" makes no sense. Observers don't have or live on a specific thread. Whatever thread a notification is posted on, that's the thread where the observers' methods are invoked. Notification delivery is just a one-step-indirect method invocation. Posting the notification is exactly the same as looping through a list of the observers and just directly invoking their registered selector.
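A small experiment along these lines can make the point tangible. It is only a sketch: the class ThreadProbe, the function DeliveredOnPostingThread, and the notification name "ExamplePing" are all invented for illustration, and ARC is assumed. The observer notes which thread its method runs on, and the function checks that this is the thread that posted, and that the method has already run by the time the post call returns.

```objc
#import <Foundation/Foundation.h>

// Hypothetical observer that records the thread its method was invoked on.
@interface ThreadProbe : NSObject
@property (strong) NSThread *deliveryThread;
@end

@implementation ThreadProbe
- (void)ping:(NSNotification *)note {
    self.deliveryThread = [NSThread currentThread];
}
@end

// Posts a made-up notification and reports whether the observer's method ran,
// synchronously, on the posting thread.
static BOOL DeliveredOnPostingThread(void) {
    ThreadProbe *probe = [[ThreadProbe alloc] init];
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    [center addObserver:probe selector:@selector(ping:) name:@"ExamplePing" object:nil];

    // The observer's method is invoked inside this call, before it returns.
    [center postNotificationName:@"ExamplePing" object:nil];

    [center removeObserver:probe];
    return probe.deliveryThread == [NSThread currentThread];
}
```

The function can be called from any thread; which thread registered the observer plays no part in the result.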

Second, the "main" init (whatever that means) may be, but is not necessarily, invoked on the main thread. It depends on how you wrote things. But it shouldn't matter, in terms of where you register observers of notifications. Registering observers with the default notification center means they are registered with the notification center, period. That's true across all threads. The notification center and registrations with it are not thread-specific.

The fact that neither of us understands why the change you made has "fixed" the problem means I have no confidence in it, and neither should you. You need to actually understand what's going on and why.