Change History (55)

Here's the current plan:
Deferred.__init__ is changed to take an optional no-argument callable as an
argument. If given, the Deferred can be cancelled.
Deferred gains a new method, cancel. If called on a Deferred for which no
canceller has been supplied to __init__, UncancellableDeferred is raised.
Otherwise, if the Deferred has no result, the canceller supplied to __init__ is
called and the Deferred errbacks itself with CancelledDeferred. If the Deferred
does have a result, if the result is not a Deferred, AlreadyCalled is raised.
If the result is a Deferred, cancel is called on it.
Things I don't like about this:
d = defer.Deferred(cancel)
d.addCallback(lambda ign: defer.Deferred())
# d.cancel() - works
d.callback(None)
# d.cancel() - doesn't
Basically the same problem, but an even worse behavior:
d = defer.Deferred(canceller)
d.addCallback(
lambda ign: random.choice([
defer.Deferred(canceller), defer.Deferred()]))
d.callback(None)
# d.cancel() - who knows if it works or not

I think the first issue is not likely to be a problem, any more than
cancellation of timed calls is. The second is a bit more worrisome, to me.
If I'm writing user code that wants to cancel an operation, in many cases I
don't particularly care whether or not the actual operation was cancelled or
not, as long as it pretends to have been. That is: it is inefficient to have a
thread continue blocking on a dns query, but it's still okay, as long as from my
point of view the query was aborted.
This perhaps points to an API which will either actually cancel via the
canceller, or just errback and silently suppress further callbacks if there is
no canceller.
That does seem a bit unclean, but I can't think of anything immediately wrong
with it.

I think this sounds right. Glyph and I had discussed this briefly last night
and came to pretty much the exact same conclusion. The one extra detail we
discussed is that constructing a Deferred without a canceller should log a
deprecation warning. To explicitly construct a Deferred which has no
cancellation behavior (and just has .callback()/.errback() suppressed after it
is cancelled), None should be passed in. This should encourage people to think
more about cancellation, which should improve this situation overall in the future.

Additionally, I think we decided that cancelling an already-fired Deferred is a
no-op. This might bear some further discussion though. As I write this
comment, I am leaning towards thinking it should raise an error, since that
parallels how DelayedCalls work, as you pointed out.

Patch to add .cancel and cancellers to lock/semaphore/queue, and tests for
above. Also makes deferred timeout be the same as cancellation.
This does point out an issue with having a default cancellation behavior. For
lock/semaphore, not providing a canceller function would cause the instance
state to essentially be corrupted if the deferred actually was cancelled, as the
lock would never be released.
That is not necessarily a fatal problem, as you can always just say "don't do
that, then".

So I'm dropping the priority and leaving it around for future discussion.

I'm not sure if that's accurate since there are still calls to setTimeout in twisted.mail and twisted.names, but somehow this ticket seems to have fallen off of everyone's radar. I suspect some more GUI applications where there is a "cancel" button visible in the UI would perhaps drive some discussion here.

We have a protocol that gets a request, then sends it to the "first layer" for approval, then sends it to the "second layer" for approval, and if the second layer also approves, implementing the request. The code looks something like:

Where sendToSecondLayer returns a deferred that is called back when the request was approved,
or erred back if it is not approved. If after sending to the second layer, someone on the
second layer disconnected, the request will never get approved or disapproved. I want to
cancel the deferred, so the rollback will be done (and perhaps the request to be retried
later).

I would prefer that L{Deferred.cancel} on a Deferred that has no canceller raise something like NoCancellerError immediately instead of errbacking with a CancelledError. And get rid of the "swallowing" behavior.

The docstring says it's going to *callback* the deferred, not errback it (I think?) and then the test goes directly on to errback it instead.

I'm also confused about the following because the test cancels *A*, not *B*.

+ def test_cancelNestedDeferred(self):
+ """
+ Verify that a Deferred, A, which is waiting on another Deferred, B,
+ returned from one of its callbacks, will propagate L{CancelledError}
+ when B is cancelled.
+ """

The support for chained deferreds also seems dubious; perhaps it would be best to leave that up to implementations of cancellers.

I am really disgusted by the modification to testLock. Do not add anything more to that test. It should be broken up into smaller unit tests. If you want to leave it alone entirely, then fine, but put the new assertion in a different unit test. Same for the change to testSemaphore.

The swallowing and behavior when there is no canceller is for two reasons:

1) the user of the deferred should not have to care whether the provider provided special behavior for cancellation. The user simply wants to inform that he no longer gives a crap what happens to it.

That would be nice, except then the code that returned the Deferred is going to be dealing with synchronous exceptions suddenly coming out of perfectly valid calls to callback() or errback(). Who knows what impact that's going to have on its state. I actually agree with Chris's review; I don't think that behavior is really acceptable.

2) To sensibly support the existing timeout mechanism.

The existing timeout mechanism is basically a bug. Don't use it. The functionality even being there makes it impossible to use the deferred from the reactor; there would be a circular dependency, since deferreds (as it stands) depend on the reactor for setTimeout.

It should be removed and replaced with a free utility function that can cancel the deferred at a later time rather than an inherent feature of Deferred.

For test_cancelNestedDeferred, that's a typo in the docstring: it meant "when A is cancelled".

The swallowing and behavior when there is no canceller is for two reasons:

1) the user of the deferred should not have to care whether the provider provided special behavior for cancellation. The user simply wants to inform that he no longer gives a crap what happens to it.

That would be nice, except then the code that returned the Deferred is going to be dealing with synchronous exceptions suddenly coming out of perfectly valid calls to callback() or errback(). Who knows what impact that's going to have on its state. I actually agree with Chris's review; I don't think that behavior is really acceptable.

That's not accurate; there is code in there specifically to swallow the *first* call to .callback after Deferred.cancel on a Deferred that has no canceller, without raising an exception.

That's not accurate; there is code in there specifically to swallow the *first* call to .callback after Deferred.cancel on a Deferred that has no canceller, without raising an exception.

Huh. Maybe I should have read this code before putting it into review.

While it's an interesting solution to the superficial problem I mentioned, I think it still has the same problem at a deeper level where all the code expecting the deferred up until that point is still sitting around waiting for it, forever, potentially in some wacky state because of it. The client code might no longer care about its callbacks, but internal callbacks (say, to execute the next command on a command queue) could still need to be invoked.

Ultimately the problem is that client code should not use a public API to randomly change the internal implementation state of some framework it's calling.

While it's an interesting solution to the superficial problem I mentioned, I think it still has the same problem at a deeper level where all the code expecting the deferred up until that point is still sitting around waiting for it, forever, potentially in some wacky state because of it. The client code might no longer care about its callbacks, but internal callbacks (say, to execute the next command on a command queue) could still need to be invoked.

I think you need to read it again, still. :) The callbacks chain won't be waiting forever, because they'll have gotten an errback from before.

It is a bit odd for someone else to be errbacking your Deferred, which is why the Deferred needs to then not raise an exception if you do .callback/.errback yourself. This might have the potential for a problem, but, my feeling is that it's not actually going to be a problem

FWIW, I think I had both glyph and exarkun that it (the cancellation part) was a good idea back when this was first proposed 2 years ago. I'm sure we discussed it. Of course even if that's the case, it's still no _guarantee_ it's the right thing, of course. :)

I had a long chat with radix about this, and I think we agreed on a solution here.

My main premise here is that one of the most important features of Deferreds is that, at a particular point in the callback/errback chain, you can predict (based on API documentation, mostly) what results and error conditions you can expect when you add a callback.

Preserving this feature is important for basically the same reason that raising random asynchronous exceptions in Python is a bad idea. You can't write correct software if you can't predict both success and failure modes of each thing you invoke. In other words, Deferreds should not violate their contracts. Yes, this is not quite as serious as every program potentially raising random exceptions everywhere, but it is still important.

If you can cause any deferred to raise some kind of cancellation exception, then you can never have a trivial deferred which just has a success path; you've always got to take cancellation into account. Even if we thought forcing that in every case was a good idea (I'm not so sure it is), we have five years of Deferred-using software that wasn't written that way to contend with.

So, I think cancellation always has to be advisory. By that, I mean calling cancel() does not mean "stop what you are doing right now", it means, "stop what you are doing as soon as you can reasonably do so while maintaining correctness". The canonical example of this is cancelling some computation happening in a thread. You can't actually cancel it, you can just ignore its result. As far as calling / client code is concerned, by default, I would like the cancellation to be a no-op. It can't stop the code, so it shouldn't. This makes it safe for anyone who wants to stop an operation to call cancel on a deferred at any time. If it can be cancelled, it will stop, and if it can't be, then it will keep going. In any event, at no point can you cause a Deferred to violate its contract.

There is a tricky case when you get to the issue of chaining. Consider the following code:

A and C can be cancelled; B cannot. If cancellation is, by default, a no-op, then what happens when, after A has fired, but B has not yet, the caller of operation calls cancel() on A?

I propose that the correct behavior is to put A into a "cancelled" state, which will cause it to cancel C immediately as it is returned to A's callback chain as a result. That provides consistent behavior, and relay's the user's intention to cancel the aggregate operation of A, without hanging the callback chain halfway through or raising a spurious and unexpected failure for B, which does not provide a canceller.

In more sophisticated cases, where you know what you really want in a particularly interesting cancellation case, you can always write wrapping deferreds which do exactly what your application wants (discard the results of the threaded computation and immediately appear to have cancelled for the user's benefit, for example). We should write documentation for those as those cases arrive.

My first impression is that the comment above basically makes sense, but I'm going to think about it a little more.

One immediate comment, though:

If you can cause any deferred to raise some kind of cancellation exception, then you can never have a trivial deferred which just has a success path; you've always got to take cancellation into account.

You sure can: defer.succeed() is uncancellable (unless it's blocked waiting for the result of a nested deferred of course), as it's been callbacked before anyone else sees it.

What you can't have is a deferred which hasn't been callbacked yet which has only a success path. I'd argue that if you think you can even without cancel, you're probably wrong, but maybe that's not a very good argument.

My first impression is that the comment above basically makes sense, but I'm going to think about it a little more.

Cool. Hopefully we won't have too much more discussion here :).

One immediate comment, though:

If you can cause any deferred to raise some kind of cancellation exception, then you can never have a trivial deferred which just has a success path; you've always got to take cancellation into account.

You sure can: defer.succeed() is uncancellable (unless it's blocked waiting for the result of a nested deferred of course), as it's been callbacked before anyone else sees it.

True, it's possible to implement something that can't raise a cancellation exception, but (provided a default cancel() implementation that errbacks) you can't specify a Deferred interface that can't raise a cancellation exception; as soon as you're doing something actually deferred, you encounter that possibility.

What you can't have is a deferred which hasn't been callbacked yet which has only a success path. I'd argue that if you think you can even without cancel, you're probably wrong, but maybe that's not a very good argument.

You're right, in the sense that in practice, it's pretty uncommon to have a Deferred which actually does something asynchronous but doesn't raise any errors. The "real world" case I'm concerned about is (ostensibly the vast majority of) legacy code which is ostensibly written to deal with specific types of errors, with no catch-all, or a catch-all which really freaks out.

I think it might be possible to have a default-raise canceller exception and still somehow guarantee particular exception types as your own result by trapping them or something. I'm not going to bother figuring it out though, since it's more work for the application programmer, it doesn't deal with the "legacy" case, and I think my solution satisfies this case just fine :).

I've come to think that the "wait around for the next cancellable deferred" behavior is unnecessary and perhaps undesirable.

My proposal is that in the situation glyph describes, the cancel method should simply do nothing (going along with the "advisory" nature of cancellation). Given that the author of 'operation' clearly wants the Deferred that he returns to be cancellable, he can simply wrap B (the threaded Deferred) with something that glyph briefly mentioned in the last paragraph of his second most-recent post.

My problems stemmed from the fact that your proposal will sometimes cause unnecessary work (in non-optimal cases). *my* proposal, on the other hand, causes cancel() to be non-deterministic in buggy situations. I think in the end yours is better, despite ugliness in implementation.

Deferred vs Operation is nicer than Deferred vs CancellableDeferred (represented by internal state and checks in the cancel() method).

It does not add overhead to ordinary deferreds, only to cancellable ones, and then I suspect that it would add less.

Disadvantages:

Changing a Deferred to a cancelable Deferred is backwards compatible, but changing a Deferred to an Operation will break using code. (But, if the code does not need to use cancel(), it can be given operation.deferred anyway)

I don't like the idea of hamstringing Deferred in order to support a C implementation. We made a number of really brain-damaged design decisions very early in the implementation of the reactor (for example, not using Deferreds for the client-connection logic) that were at least partially in support of C implementations which didn't really exist. I think that we can still have a C Deferred implementation that has a 'cancel' method.

Deferred vs Operation is nicer than Deferred vs CancellableDeferred (represented by internal state and checks in the cancel() method).

You're right about this, in the sense that the behavior of cancel() should be consistent. The last round of discussion here, however, implies that it will be consistent, and any internal state checks will be invisible to the caller.

It does not add overhead to ordinary deferreds, only to cancellable ones, and then I suspect that it would add less.

I don't think this is a meaningful metric, because I don't know what "overhead" you're talking about, or how you would compare them.

Disadvantages:

Changing a Deferred to a cancelable Deferred is backwards compatible, but changing a Deferred to an Operation will break using code. (But, if the code does not need to use cancel(), it can be given operation.deferred anyway)

This is basically a dealbreaker. I'm really trying to encourage ​a much stricter interpretation of "compatibility" going forward, and changing the return value of existing functions to something completely incompatible is not going to work. Adding entirely new APIs in every case where some operation can be cancelled is unnecessarily ugly, since I believe that a modified version of cancellable deferreds will address the other issues that you raised.

I am still somewhat suspicious of the proposed behavior of silently "succeeding" even if no cancellation method has been registered. I suggest that the cancel method return a result indicating whether or not it successfully canceled or just pretended to do so, so callers who *do* care have this information available.

To address the earlier concerns about every callback needing to account for cancellation; I think that the type of exception raised by the canceler should be left up to the system which allocated the Deferred in the first place. (In other words, you should have to call errback in your canceler, and be explicit about it.)

I am still somewhat suspicious of the proposed behavior of silently "succeeding" even if no cancellation method has been registered.

I am pretty sure it has to be this way. It's impossible to tell whether a different level of the stack has already canceled a particular Deferred.

I suggest that the cancel method return a result indicating whether or not it successfully canceled or just pretended to do so, so callers who *do* care have this information available.

What indicates a "successful" cancellation, though, I wonder? For example, consider a system (let's call it 'X') which wants to do some asynchronous cleanup. It does a ClientCreator.connectTCP, and in the callback for that, does a long-running callRemote. An application (let's call it 'Y') has an internal timeout and calls cancel on the Deferred returned by X. As a result, X issues another callRemote on the same connection sending the equivalent of SIGINT, but still wants to wait for the response to that call. A plugin for 'Y' (let's call it Z) which is using this API never gets a callback and won't know that internally that Deferred has already been canceled.

So after this internal timeout has occurred, but before the cleanup has completed, Z decides to call cancel() because the user gets bored and hits the cancel button. But it's already been canceled. What should happen? Can you no longer add callbacks to a canceled Deferred? Should Z (which knows nothing about the internal state of X or Y) be allowed to interrupt the clean-up 'callRemote' operation? How do you even interrupt such an operation, other than just dropping the connection? Does it count as a "successful" even though nobody up the stack got their callback or errback called yet?

Still thinking about the implications of this particular example. I'm thinking that maybe addCallbacks should take a canceler, which can override calling the returned Deferred's cancel method, so it won't be immediately canceled. On the other hand you can just make a new Deferred with no canceler and call chainDeferred to sidestep the other system's Deferred being canceled.

In any case, you need the silent "success" behavior, so that multiple systems can call cancel() on the same Deferred without blowing up. Under the covers, the cancellation callback should actually only be called once, though.

This is sounding more and more complicated... Also, shouldn't chainDeferred automatically chain cancellation too?

I was going to say "no" to this, but then I thought about it a bit, and you're probably right. Let's be more specific about what that means, though.

If you do something cancellable and get a deferred A, and then do "B = Deferred(); A.chainDeferred(B)", "B.cancel()" should imply "A.cancel()". This is the case because you aren't supposed to remove callbacks from a Deferred, which means you cannot prevent B.callback from getting called when A fires.

An existing assumption if you're using chainDeferred is that once you have called chainDeferred, it's A's job to call B.callback and therefore nothing else ever should. But this assumption adds something more in the case where cancellation is possible: Bmust not have its own canceller registered.

That's bad, because the canceller would normally be specified in Deferred.__init__, and encouraged, to the point of emitting a deprecation warning if it's not specified. There's no way to tell "oh, I'm going to call chainDeferred on this Deferred in a minute, so it shouldn't have a canceller right now".

All this makes me think that chainDeferred is kind of a crummy API, and maybe what we want is Deferred.fork(); since you always need to construct a Deferred and then immediately call chainDeferred it before doing anything else with it; Y = X.fork() is less code than from twisted.internet.defer import Deferred; X = Deferred(); Y.chainDeferred(X). Also, to support cancellation better, fork() could accept an optional cancellation argument: if unspecified, B.cancel() would call A.cancel(), but if it is specified, it could invoke a callback which could facilitate the removal of B's callbacks from A's callbacks chain. (I am being deliberately vague here because I don't know whether it should be automatic or manual, or what the manual procedure should be.)

In the general case, a callback may arbitrarily transform the type of the Deferred's result, so it's not safe to remove or reorder arbitrary callbacks. This is unfortunately still true in chainDeferred's case, because Deferred.callback (which is what is placed into the callback chain) returns None. However, in the case of passthru callbacks, it would be safe to remove them; fork() could explicitly pass through its input to enable this.

After much discussion, both on IRC and at PyCon 2010 Atlanta, we have mostly agreed that the semantics in the branch were already pretty much correct. The semantics in the code should match the diagram checked into the doc/ directory, and the reviewer should verify that they correspond.

There are two styles of raising AlreadyCalledError: line 394 (with
parens) and line 415 without parens.

Line 394 raises but should do nothing.

Comments on twisted/test/test_defer.py

test_canceller, test_cancellerWithCallback and
test_cancelAlreadyCalled need whitespace around an = sign in the
assignment self.cancellerCalled=True

The cancellation tests should clearly distinguish between calling
defer.cancel() and the calling of the actual cancellation
method. The former can be called multiple times, but the later
should only ever be called once. The tests would be simpler if the
two were clearly separate.

I would set self.cancellerCalled = 0 and then have the cancellers
increment it. That would allow tests that the canceller had been
called the expected number of times. A tearDown method could
(additionally) be added to test that the value is always 0 or 1.

test_cancelNestedDeferred fires the deferred A and then calls
cancel. That is supposed to be a no-op, but the test tests that the
canceller was called.

The diagram (source:doc/core/img/deferred-states.svg) that's
supposed to cover the various states a deferred can be in has many
edges. Some of these are tested (I presume) in the regular deferred
tests, but others involving cancellation are not. For example,
test_noCanceller tests what happens when callback is called twice
when a deferred has been canceled, but doesn't test what happens
when errback is called. Or, for example, there is no test to prove
that a deferred that has been errbacked does nothing when later
canceled (in fact the behavior is likely wrong in that case,
as AlreadyCalledError will be raised). So if we're serious about
testing the changes properly, there should be a systematic effort
(along with evidence of such, which makes a review easier) to show
that the relevant edges in the state diagram have been tested.

if getHostByName() quickly returns a result, cancel() will fail because the deferred object already has a concrete result

if I use the result of the previous function, for example with d.addCallback(self.doConnect).addCallback(self.login), the above cancel() may kill the connection/login attempt.

As presented, cancel() seems to be a kind of a premature call to errback(), which overrides a following call to callback() or errback()without raising AlreadyCalledError. For symmetry (and for the previous example to work), calling cancel() when callback() has been called should be silently discarded, and not raise AlreadyCalledError.

IMO the current implementation of cancel() mixes two things: it triggers the initial run of a deferred, OR forces a paused callback chain to resume with a Failure. This breaks the ability of deferreds to return synchronous results.

A naive implementation of timeouts using cancellation doesn't work, yes. However, it's possible to write a timeout implementation that does work, and we can provide it as a utility function, or even a method:

Thanks for your example. I think it shows that almost all uses of cancel() need to specify where they occur in the callback chain (like: "the deferred must run its callback chain until here in less than xxx seconds").
Maybe this would be better expressed in term of a "cancellation point":

I just attached a diff against cancel-990-3 that addresses most of the comments from the last 24hrs and adds 25 tests. I hope I'll have time to write a more detailed comment later addressing individual concerns above.

A naive implementation of timeouts using cancellation doesn't work, yes. However, it's possible to write a timeout implementation that does work, and we can provide it as a utility function, or even a method:

Sure, but this should be a separate ticket. The current implementation of timeout should not change its behavior, as it's already deprecated. Do you want to go ahead and file it?

It is very important that this functionality be located outside of defer.py and therefore allow us to eventually move defer.py back to twisted/python/, where it belongs, removing all of its twisted.internet dependencies.

(I'm also not completely sure that I agree with the implementation you've proposed, but let's talk about it later.)

Your concerns, as I understand them, were valid. But the code that was originally up for review still
contained something that was supposed to be changed: that calling cancel on a deferred that has fired (and has a synchronous result - either a successful one or a failure), is a no-op. That's been corrected
now. I *think* that means that the things above that you were rightly concerned about are in fact not a problem.

I'm completely happy with the changes committed on trunk.
One minor nit: IMO the three sentences starting with "Note: We do not need to wrap this in a try/except..." don't below to a docstring; they are implementation comments.

One minor nit: IMO the three sentences starting with "Note: We do not need to wrap this in a try/except..." don't below to a docstring; they are implementation comments.

Note that this is a docstring on a private method (its name starts with an underscore), and therefore inherently provided as information for maintainers rather than users. We find it nicer to put things into docstrings where possible, since it makes the information available to a wider variety of tools. Of course, if there's some hairy code in the middle of a public method, we will use a comment to note that particular area of the implementation.