The following program is misbehaving with Python 3.2:
import signal, time

def sighandler(arg1, arg2):
    print("got sigint")
    assert 0

signal.signal(signal.SIGINT, sighandler)

for i in range(1000000):
    print(i)
I'd expect Ctrl-C to terminate the program with an AssertionError, and that's indeed what happens under Python 2.7.
But with Python 3.2a, I get the AssertionError only about 1 time out of 10. The other 9 times, the program locks up (goes to sleep? ps shows the process status as "S"). Once the program locks up, it does not respond to further Ctrl-C presses.
This is on 64-bit Ubuntu 8.04.

Wow. The lock is precisely there so that the buffered object doesn't have to be MT-safe or reentrant. It doesn't seem reasonable to attempt to restore the file to a "stable" state in the middle of an inner routine.
Also, the outer TextIOWrapper (we're talking about sys.stdout here) is not designed to be MT-safe at all and is probably in an inconsistent state itself.
I would rather detect that the lock is already taken by the current thread and raise a RuntimeError. I don't think it's a good idea to do buffered I/O in a signal handler. Unbuffered I/O probably works.
(in a more sophisticated version, we could store pending writes so that they get committed at the end of the currently executing write)
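In pure Python terms, the idea looks roughly like this (a sketch with hypothetical names; the actual change would of course be in the C implementation of the buffered objects):

import threading

class ReentrancyGuard:
    # Guards a non-reentrant section: a reentrant call from the same
    # thread (e.g. from a signal handler) raises instead of blocking
    # forever on the lock it already holds.
    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None      # thread currently inside the guarded section

    def __enter__(self):
        me = threading.current_thread()
        if self._owner is me:
            raise RuntimeError("reentrant call")
        self._lock.acquire()
        self._owner = me

    def __exit__(self, *exc):
        self._owner = None
        self._lock.release()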

Would avoiding PyErr_CheckSignals() while the file object is in an inconsistent state be a reasonable alternative?
I am guessing that it's not that uncommon for a signal handler to need IO (e.g. to log a signal).
If making IO safer is not an option, then I think this limitation needs to be documented (especially given that this seems to be a behavior change from Python 2.x).
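For what it's worth, the only workaround I can see for logging from a handler today is to bypass the buffered layer entirely, e.g. with os.write; a rough sketch:

import os, signal

def sighandler(signum, frame):
    # os.write() goes straight to the file descriptor (2 = stderr),
    # bypassing the buffered/text layers, so it cannot reenter them.
    os.write(2, ("got signal %d\n" % signum).encode())

signal.signal(signal.SIGINT, sighandler)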

> Would avoiding PyErr_CheckSignals() while the file object is in
> an inconsistent state be a reasonable alternative?
No, because we'd like IO operations to be interruptible by the user
(e.g. pressing Ctrl-C) when they would otherwise block indefinitely.
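For example, something like this (assuming stdin is a terminal) would otherwise hang with no way to abort it from the keyboard:

import sys

try:
    # Blocks until EOF; roughly speaking, because the buffered layer
    # checks for pending signals, Ctrl-C surfaces here as
    # KeyboardInterrupt instead of leaving the process stuck in the read.
    data = sys.stdin.buffer.read()
except KeyboardInterrupt:
    sys.stderr.write("read interrupted\n")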
> I am guessing that it's not that uncommon for a signal handler to need
> IO (e.g. to log a signal).
In C, it is recommended that signal handlers be minimal. In Python,
there is no explicit recommendation but, given they execute
semi-asynchronously, I personally wouldn't put too much code in them :)
That said, there's no problem doing IO as long as you're not doing
reentrant calls to the *same* file object. I agree that for logging this
is not always practical...
> If making IO safer is not an option, then I think this limitation
> needs to be documented (especially given that this seems to be a
> behavior change from Python 2.x).
Perhaps the IO documentation needs an "advanced topics" section. I'll
see if I get some time.
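To illustrate the point above about the *same* file object: a handler that writes to sys.stderr while the main loop only ever writes to sys.stdout never reenters a buffered object, so it should be fine (untested sketch):

import signal, sys

def sighandler(signum, frame):
    # sys.stderr is a different buffered object from sys.stdout, so this
    # write does not reenter the object the main loop is in the middle of
    # using (it could still reenter stderr itself if another signal
    # arrived while this handler is writing).
    sys.stderr.write("got signal %d\n" % signum)

signal.signal(signal.SIGINT, sighandler)

for i in range(1000000):
    print(i)        # only ever writes to sys.stdout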

Dummy question: why don't you use KeyboardInterrupt instead of a custom SIGINT handler?
try:
    for i in range(1000000):
        print(i)
except KeyboardInterrupt:
    print("got sigint")
The default Python SIGINT handler raises a KeyboardInterrupt (that handler is written in C, not in Python), which is safe, whereas writing to sys.stdout from a handler doesn't look like a good idea :-)

This issue reminds me of #3618 (opened 2 years ago): I proposed to use RLock instead of Lock, but RLock was implemented in Python and was too slow. Today, we have RLock implemented in C and it may be possible to use it. Would that solve this issue?
--
There are at least two deadlocks, both in _bufferedwriter_flush_unlocked():
- call to _bufferedwriter_raw_write()
- call to PyErr_CheckSignals()
> The lock is precisely there so that the buffered object doesn't
> have to be MT-safe or reentrant. It doesn't seem reasonable
> to attempt to restore the file to a "stable" state in the middle
> of an inner routine.
Oh, so releasing the lock around the calls to _bufferedwriter_raw_write() (around PyObject_CallMethodObjArgs() in _bufferedwriter_raw_write()) and PyErr_CheckSignals() is not a good idea? Or is it just complex because the buffered object has to be in a consistent state?
> (in a more sophisticated version, we could store pending writes
> so that they get committed at the end of the currently
> executing write)
If the pending write fails, who gets the error?

> This issue reminds me of #3618 (opened 2 years ago): I proposed to use
> RLock instead of Lock, but RLock was implemented in Python and was
> too slow. Today, we have RLock implemented in C and it may be possible
> to use it. Would that solve this issue?
I think it's more complicated. If you use an RLock, you can reenter the
routine while the object is in an unknown state, so the behaviour can be
all kinds of wrong.
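A toy illustration of that (this is not how the real buffered objects are implemented): even with an RLock, a signal handler that reenters write() in the same thread runs while the buffer is half-updated:

import threading

class ToyBufferedWriter:
    def __init__(self, raw):
        self._raw = raw                 # any object with a write() method
        self._buf = bytearray()
        self._lock = threading.RLock()  # reentrant: the same thread gets back in

    def write(self, data):
        with self._lock:
            self._buf += data
            if len(self._buf) >= 8192:
                self._raw.write(bytes(self._buf))
                # If a signal handler fires at this point and calls write()
                # again in the same thread, the RLock lets it in while _buf
                # still holds the data that was just flushed, so that data
                # gets written to the raw stream a second time.
                self._buf.clear()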
> > The lock is precisely there so that the buffered object doesn't
> > have to be MT-safe or reentrant. It doesn't seem reasonable
> > to attempt to restore the file to a "stable" state in the middle
> > of an inner routine.
>
> Oh, so releasing the lock around the calls to
> _bufferedwriter_raw_write() (around PyObject_CallMethodObjArgs() in
> _bufferedwriter_raw_write()) and PyErr_CheckSignals() is not a good
> idea? Or is it just complex because the buffered object has to be in
> a consistent state?
Both :)
> > (in a more sophisticated version, we could store pending writes
> > so that they get committed at the end of the currently
> > executing write)
>
> If the pending write fails, who gets the error?
Yes, in the end I think it's not a good idea. flush() couldn't work
properly anyway, because it *has* to flush the buffer before returning.

Ok, so +1 to apply your patch immediately, since it "fixes" the deadlock. If someone is motivated to make the Buffered* classes reentrant, (s)he can remove this exception.
The io and signal documentation should also be improved to indicate that using buffered I/O in a signal handler may raise a RuntimeError on a reentrant call (and maybe give an example to explain the problem?).
About the patch: can't you move "&& (self->owner = PyThread_get_thread_ident(), 1) )" into _enter_buffered_busy()?