When tracing line execution with sys.settrace, a particular code
structure fails to report an executed line. The line is a continue
statement that follows an if statement whose condition is true every
time it is executed.
Attached is a file with two copies of the same code, except in the first
the if condition is always true, and in the second it is sometimes true.
In the first, trace.py reports that the continue is never executed,
even though it is (as evidenced by the values of a, b, and c after
execution).
In the second code, the continue is properly reported.
This bug has been present since version 2.3. 2.2 does not exhibit it
(trace.py didn't exist in 2.2, but coverage.py shows the problem also).
To see the problem, execute "trace.py -c -m continue.py". Then
continue.py.cover will show:
    1: a = b = c = 0
  101: for n in range(100):
  100:     if n % 2:
   50:         if n % 4:
   50:             a += 1
>>>>>>         continue
           else:
   50:         b += 1
   50:     c += 1
    1: assert a == 50 and b == 50 and c == 50
    1: a = b = c = 0
  101: for n in range(100):
  100:     if n % 2:
   50:         if n % 3:
   33:             a += 1
   17:         continue
           else:
   50:         b += 1
   50:     c += 1
    1: assert a == 33 and b == 50 and c == 50
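The first pattern above can be reproduced directly with sys.settrace. The following is a sketch (with the indentation of the coverage listing reconstructed from its counts); on interpreters that perform the jump-to-jump optimization, the continue line never shows up in the traced set even though the loop is restarted 50 times, as the final values of a, b, and c prove:

```python
import sys

def demo():
    a = b = c = 0
    for n in range(100):
        if n % 2:
            if n % 4:
                a += 1
            continue   # runs for every odd n, yet may not be traced
        else:
            b += 1
        c += 1
    return a, b, c

traced = set()

def tracer(frame, event, arg):
    # Record only line events that occur inside demo() itself.
    if event == "line" and frame.f_code is demo.__code__:
        traced.add(frame.f_lineno)
    return tracer

sys.settrace(tracer)
try:
    result = demo()
finally:
    sys.settrace(None)

print(result)          # (50, 50, 50): the continue clearly ran
print(sorted(traced))
```

Whether the continue's line number appears in the traced set depends on the interpreter version, which is exactly the point of this report.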

This is because of a "peephole" optimization of the generated bytecode:
a jump instruction whose target is another jump instruction can be
modified to target the final location.
This saves a few opcodes, but it makes tracing confusing...
Not sure how to fix this, though.
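The optimization is easy to observe by disassembling a function with this shape. This is just a sketch; the exact opcode names vary considerably between CPython versions, but on affected versions the conditional jump for the if is retargeted past the continue's unconditional jump:

```python
import dis

def f(xs):
    total = 0
    for x in xs:
        if x % 2:
            continue   # this jump is a candidate for jump-to-jump threading
        total += x
    return total

# Print each instruction; on versions that thread jump-to-jump, no
# executed path passes through an instruction attributed to the
# "continue" line when the condition is always true.
for ins in dis.get_instructions(f):
    print(ins.offset, ins.opname, ins.argval)
```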

I see that the cause of the problem is the peephole optimizer. That
doesn't mean this isn't a problem.
I am measuring the code coverage of a set of tests, and one of my lines
is being marked as not executed. This is not the fault of the tests,
because in fact, without the optimization, the line would be executed.
Conceptually, the line has been executed (the loop is restarted, rather
than execution continuing).
I don't know what the solution to this is. Some options include fixing
the line tracing code to somehow indicate that the continue was
executed; or providing a way to disable peephole optimization for times
when accurate execution tracing is more important than speed.

On Sat, Mar 29, 2008 at 2:51 PM, Ned Batchelder <report@bugs.python.org> wrote:
>
> Ned Batchelder <nedbat@users.sourceforge.net> added the comment:
>
> I am measuring the code coverage of a set of tests, and one of my lines
> is being marked as not executed. This is not the fault of the tests,
> because in fact, without the optimization, the line would be executed.
> Conceptually, the line has been executed (the loop is restarted, rather
> than execution continuing).
>
...but the continue statement on line 5 is NOT executed in the
x == True case. Note that without optimization, the if statement plus
the continue line translate to
  3          19 LOAD_FAST                0 (x)
             22 JUMP_IF_FALSE            4 (to 29)
             25 POP_TOP
  4          26 JUMP_FORWARD             1 (to 30)
        >>   29 POP_TOP
  5     >>   30 JUMP_ABSOLUTE           13
where the second jump is to the continue statement. The peephole
optimizer recognizes that the jump target is an unconditional jump and
changes the code to jump directly to the final target, bypassing the
continue line. The optimized code is
  3          19 LOAD_FAST                0 (x)
             22 JUMP_IF_FALSE            4 (to 29)
             25 POP_TOP
  4          26 JUMP_ABSOLUTE           13
        >>   29 POP_TOP
  5          30 JUMP_ABSOLUTE           13
If x is true, line five is NOT executed.
> I don't know what the solution to this is. Some options include fixing
> the line tracing code to somehow indicate that the continue was
> executed; or providing a way to disable peephole optimization for times
> when accurate execution tracing is more important than speed.
>
I think it is a good idea to provide a way to disable peephole
optimizer. In fact, I recently proposed exactly that in msg64638. My
only problem is that I would like to follow gcc tradition and make -O
option take an optional numeric argument with 0 meaning no
optimization and increasingly aggressive optimization as the argument
increases. Unfortunately -O0 will be confusingly similar to -OO.
Since -OO is not really optimization, but rather "strip" option, it
should probably migrate to -s or something. In any case, such drastic
changes to command line options are not acceptable for 2.x, but maybe
possible for 3.0.
I can easily implement a -N (no optimization) or -g (debug) option that
will disable the peephole optimizer if there is support for such a
feature.

This has basically almost never been a problem in the real world. No
need to complicate the world further by adding yet another option and
the accompanying implementation-specific knowledge of why you would
ever want to use it.
Also, when the peepholer is moved (after the AST is created, but before
the opcodes), then little oddities like this will go away.
Recommend closing as "won't fix".

I recognize that this is an unusual case, but it did come up in the real
world. I found this while measuring test coverage, and the continue
line was marked as not executed, when it was.
I don't understand "when the peepholer is moved", so maybe you are right
that this will no longer be an issue. But it seems to me to be endemic
to code optimization to lose the one-to-one correspondence between
source lines and ranges of bytecodes. And as the compiler becomes more
complex and performs more optimizations, problems like this will likely
increase, no?
In any case, I'd like to know more about the changes planned for the AST
and compiler...

On Sat, Mar 29, 2008 at 4:58 PM, Raymond Hettinger
<report@bugs.python.org> wrote:
> This has basically almost never been a problem in the real world.
I believe Ned gave an important use case. In coverage testing,
optimized runs can show false gaps in coverage. In addition, a no
optimize option would provide a valuable learning tool. Python has an
excellent simple VM very suitable for a case study in introductory CS
courses. Unfortunately, the inability to disable the peephole optimizer makes
understanding the resulting bytecode more difficult, particularly
given some arbitrary choices made by the optimizer (such as 2*3+1 =>
7, but 1+2*3 => 1+6). Furthermore, as Raymond suggested in another
thread, the peephole optimizer was deliberately kept to a bare minimum out
of concerns about compilation time. Given that most python code is
pre-compiled, I think it is a rare case when code size/speed
improvements would not be worth increased compilation time. In a rare
case when compilation time is an issue, users can consider disabling
optimization. Finally, an easy way to disable the optimizer would
help in developing the optimizer itself by providing an easy way to
measure improvements and to debug.
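The constant-folding asymmetry mentioned above is easy to check with compile(). As a sketch: on modern CPython the folding happens at the AST level and both expressions fold to 7, while the 2.x-era peepholer being discussed folded only the first form:

```python
# Inspect the constants pool to see how far constant folding went.
code1 = compile("x = 2*3+1", "<example>", "exec")
code2 = compile("x = 1+2*3", "<example>", "exec")
print(code1.co_consts)   # contains 7 once folding succeeds
print(code2.co_consts)   # on the old peepholer this kept 1 and 6 separate
```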
> No need to complicate the world further by adding yet another option and
> the accompanying implementation-specific knowledge of why you would
> ever want to use it.
>
This would not really be a new option. Most users expect varying
levels of optimization with a -O option, and Python already has 3
levels: plain, -O, and -OO (Py_OptimizeFlag = 0, 1, and 2). In fact,
Py_OptimizeFlag can be set to an arbitrary positive integer using the
undocumented -OOO.. option. I don't see how anyone would consider
adding say -G with Py_OptimizeFlag = -1 that would disable all
optimization as "complicating the world."
> Also, when the peepholer is moved (after the AST is created, but before
> the opcodes), then little oddities like this will go away.
>
I don't see how moving optimization up the chain will help with this
particular issue. Note that the problem is not with the peepholer emitting
erroneous line number information, but the fact that the continue
statement is optimized away by replacing the if statement's jump to
continue with a direct jump to the start of the loop. As I stated in
my first comment, trace output is correct and as long as the compiler
avoids redundant double jumps, the continue statement will not show up
in trace regardless where in compilation chain it is optimized. The
only way to get correct coverage information is to disable double jump
optimization.

Weigh the cost/benefit carefully before pushing further. I don't doubt
the legitimacy of the use case, but do think it affects far fewer than
one percent of Python programmers. In contrast, introducing new
command line options is a big deal and will cause its own issues
(possibly needing its own buildbot runs to exercise the non-optimized
version, having optimized code possibly have subtle differences from
the code being traced/debugged/profiled, and more importantly the
mental overhead of having to learn what it is, why it's there, and when
to use it).
My feeling is that adding a new compiler option is using a cannon to kill
a mosquito. If you decide to press the case for this one, it should go
to python-dev since command line options affect everyone.
This little buglet has been around since Py2.3. That we're only
hearing about it now is a pretty good indicator that this is a very
minor issue in the Python world and doesn't warrant a heavy-weight solution.
It would be *much* more useful to direct effort toward improving the
mis-reporting of the number of arguments given versus those required
for instance methods:
>>> a.f(1, 2)
TypeError: f() takes exactly 1 argument (3 given)

On Sun, Mar 30, 2008 at 5:01 PM, Raymond Hettinger
<report@bugs.python.org> wrote:
..
> Weigh the cost/benefit carefully before pushing further. I don't doubt
> the legitimacy of the use case, but do think it affects far fewer than
> one percent of Python programmers.
I agree with you, but only because fewer than 1% of Python programmers
have complete test coverage for their code. :-) On the other hand, I
wanted a no-optimize option regardless of the trace issue. Once it is
there, I am sure everyone interested in how python compiler works will
use it. (I am not sure what % of Python programmers would fall into
that category.)
I don't know how big of a deal an extra buildbot is, but I don't think
it will be necessary. It is hard to imagine an optimization that would
fix (mask) errors in non-optimized code. Therefore, a non-optimized
buildbot is unlikely to flag errors that are not present in optimized
runs. On the other hand errors introduced by optimizer will be easier
to diagnose if they disappear when the code runs without optimization.
Mental overhead is important, but I think it will be easier to explain
the effect of a no-optimize option than to explain what -O does in the
current version. As far as I can tell, -O has nothing to do with
peephole optimization and only removes assert statements and replaces
__debug__ with 0. I am sure most python users are not aware of the
fact that peephole optimization is performed without -O option.
> My feeling is that adding a new compiler option using a cannon to kill
> a mosquito. If you decide to press the case for this one, it should go
> to python-dev since command line options affect everyone.
>
As an alternative to the command line option, what would you say to
making sys.flags.optimize writable and disabling the peepholer if
Py_OptimizeFlag < 0? This will allow Python tracing tools to disable
optimization from within python code. The fact that setting
sys.flags.optimize flag will not affect modules that are already
loaded is probably a good thing because tracing code itself will run
optimized. Such tracing tools may also need to use a custom importer
that would ignore precompiled code and effectively set
dont_write_bytecode flag.
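Part of that last point is already expressible today: a tracing tool can opt out of bytecode caching at runtime with the real sys.dont_write_bytecode flag. A minimal sketch (the custom importer that ignores pre-existing .pyc files is left out):

```python
import sys

# A tracing tool can stop new .pyc files from being written while it
# imports the modules under measurement, then restore the old setting.
previous = sys.dont_write_bytecode
sys.dont_write_bytecode = True
try:
    # ... import and trace the modules being measured here ...
    pass
finally:
    sys.dont_write_bytecode = previous
```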
> This little buglet has been around since Py2.3. That we're only
> hearing about it now is a pretty good indicator that this is a very
> minor in the Python world and doesn't warrant a heavy-weight solution.
>
I still maintain that this is not a bug. Not hearing about it before
is probably an indication that users sophisticated enough to try to
achieve full test coverage for their code were able to recognize false
coverage gaps as such.

Marking this one as closed.
Also, rejecting the various ways to disable peephole optimization.
This was discussed with Guido long ago and the decision essentially
recognized that for most practical purposes the output of the peepholer
is the generated code and no good would come from exposing upstream
intermediate steps.
Since then, I believe Neal got Guido's approval for either the -O or -
OO option to generate new optimizations that potentially change
semantics. In that situation, there is a worthwhile reason for the
enable/disable option.

It's hard for me to agree with your assessment that no practical
good would come from disabling the optimizer. Broadly speaking, there
are two types of code execution: the vast majority of the time, you
execute the code so that it can do its work. In this case, speed is
most important, and the peephole optimizer is a good thing. But another
important case is when you need to reason about the code. This second
case includes coverage testing, debugging, and other types of analysis.
Compiled languages have long recognized the need for both types of
compilation, which is why they support disabling optimization entirely.
As Python becomes more complex, and more broadly deployed, the needs of
the two types of execution will diverge more and more. More complex
optimizations will be attempted in order to squeeze out every last drop
of performance. And more complex tools to reason about the code will be
developed to provide rich support to those using Python for complex
development.
I see discussion here of moving the optimizer to the AST level instead
of the bytecode level. This won't change the situation. The optimizer
will still interfere with analysis tools.
As a developer of analysis tools, what should I tell my users when their
code behaves mysteriously?

While I agree with Raymond that the interpreter should be left alone,
this could be reclassified (and reopened) as a doc issue. The current
trace doc (Lib Ref 25.10) says rather tersely "The trace module allows
you to trace program execution, generate annotated statement coverage
listings, print caller/callee relationships and list functions executed
during a program run." This could be augmented with a general statement
that the effect of certain statements may get computed during
compilation and not appear in the runtime trace -- or a more specific
statement about continue, break, and whatever else.
As for continue.py, it seems that the apparent non-execution of a
continue line indicates one of two possible problems.
1. The if statement is equivalent to 'if True:', at least for the
intended domain of input, hence redundant, and hence could/should be
removed.
2. Otherwise, the inputs are incomplete as far as testing the effect of
not taking the if-branch, and hence could/should be augmented.
Either way, it seems to me that the lack of runtime execution of
continue, coupled with better documentation, could usefully point to
possible action.

Since the main argument for not fixing this bug seems to be that it doesn't affect many users, it seems like I should comment here that the issue is affecting me. A recently proposed addition to Twisted gets bitten by this case, resulting in a report of less than full test coverage when in fact the tests do exercise every line and branch of the change.
Perhaps it is too hard to add and maintain a no-optimizations feature for Python (although I agree with Ned that this would be a useful feature for many reasons, not just to fix this bug). There are other possible solutions to the issue of inaccurate coverage reports though.
For example, Python could provide an API for determining which lines have code that might be executed. coverage.py (and the stdlib trace.py) currently use the code object's lnotab to decide which lines might be executable. Maybe that should omit "continue" lines that get jumped over. If the line will never execute, it seems there is no need to have it in the lnotab.
Using the lnotab is something of a hack though, so it might also make sense to leave it alone but introduce an API to get the same information, but corrected for whatever peephole optimizations the interpreter happens to have.
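The lnotab-derived view those tools rely on can be sketched with the dis module. This assumes dis.findlinestarts, which effectively mirrors what trace.py and coverage.py compute from co_lnotab:

```python
import dis

def executable_lines(code):
    """Lines that own at least one bytecode instruction, recursing into
    nested code objects -- the view trace.py derives from co_lnotab."""
    lines = {lineno for _, lineno in dis.findlinestarts(code)
             if lineno is not None}
    for const in code.co_consts:
        if hasattr(const, "co_code"):   # nested function/class code objects
            lines |= executable_lines(const)
    return lines

src = "a = 0\nif a:\n    b = 1\nelse:\n    c = 2\n"
lines = executable_lines(compile(src, "<s>", "exec"))
print(sorted(lines))   # the bare "else:" line owns no bytecode
```

An API along these lines, but corrected for whatever optimizations the interpreter applied, is what the comment above is asking for.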
As far as the "not a bug" arguments go, I don't think it matters much whether you ultimately decide to call it a bug or a feature request. It *is* clearly a useful feature to some people though, and rejecting the requested behavior as "not a bug" doesn't help anyone. So call it a feature request if that makes it more palatable. :)

I think supporters of this feature request should take discussion to python-ideas to try to gather more support. The initial post should summarize reasons for the request, possible implementations, and the counter-arguments of Raymond.

Choose pydev if you want. Discussion there is *usually* (but definitely not always) more focused on implementation of uncontroversial changes. I am pretty much +-0 on the issue, though Jean-Paul's post seems to add to the + side arguments that might be persuasive to others also.

I found this issue just the other day while researching why we were getting false gaps in our test coverage reports (using Ned's coverage module, natch!). I agree that this seems like a fairly minor nuisance, but it's a nuisance that anybody who has tests and measures test coverage will run into sooner or later -- and that's *everybody*, right?
I think some kind of fix ought to be discussed. After all, "it should be possible to have accurate coverage results" is a proposition that seems fairly reasonable to me.

Ned, why is your proposal to turn off ALL peephole transformations with a COMMAND-LINE switch?
* Why not just turn off the jump-to-jump optimization? Do you really need to disable constant folding and the other transformations?
* Have you explored whether the peephole.c code can be changed to indicate the continue-statement was visited?
* Why does this have to be a command-line setting rather than a flag or environment variable settable by coverage.py?
* Is there some less radical way that coverage.py can be taught to mark the continue-statement as visited?
* Are you requesting that optimization constraints be placed on all of the implementations of Python (Jython, PyPy, and IronPython) to make coverage.py perfect?
* Do you want to place limits on what can be done by Victor's proposed AST transformations, which will occur upstream from the peepholer and will make higher-level semantically-neutral transformations *prior* to code generation?
* Have you considered whether the generated PYC files need a different magic number or some other way to indicate that they aren't production code?
* If coverage.py produces a report on different code than the production run, doesn't that undermine some of the confidence in the meaningfulness of the report?
In other words, are you sure that you're making the right request and that it is really worth it? Do we really have to open this can of worms to make coverage.py happy?

> Have you considered whether the generated PYC files need a different magic number or some other way to indicate that they aren't production code?
Would it make sense to use a different sys.implementation.cache_tag? For example, the tag is currently "cpython-35". We could use "cpython-35P" when peephole optimizations are disabled. That way you can have separate .pyc and .pyo files, and disabling peephole optimizations stays compatible with the -O and -OO command line options.
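For reference, this is how the tag feeds into .pyc file names today, using the real sys.implementation.cache_tag and importlib.util.cache_from_source APIs (the "P" suffix itself is only a proposal):

```python
import importlib.util
import sys

# The tag that names bytecode cache files for this interpreter.
print(sys.implementation.cache_tag)          # e.g. "cpython-35"

# How the tag appears in the cached file name for a source file.
cache_path = importlib.util.cache_from_source("foo.py")
print(cache_path)                            # __pycache__/foo.<tag>.pyc

# Optimization levels already get their own suffix (e.g. ".opt-2").
print(importlib.util.cache_from_source("foo.py", optimization=2))
```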

[Victor]
> Oh, another option to solve the .pyc file issue is to *not*
> write .pyc files if the peephole optimizer is disabled.
> If you disable an optimizer, you probably don't care about performance.
That is an inspired idea and would help address one of the possible problems that could be caused by a new on/off switch.

I consider it a bug that peephole optimization is performed when no optimization was requested.
Documentation for -O says it "Turns on basic optimizations". Peephole optimization is a basic optimization, yet it is performed even when no basic optimizations were requested.
No need to add a switch. Just don't optimize if not requested.

I believe the python-ideas thread on this topic came to the conclusion that a -X flag -- e.g., `-X DisableOptimizations` -- would be a good way to turn off all optimizations. The flag could then either blindly set sys.dont_write_bytecode to True, or set sys.flags.optimize to -1, in which case a bytecode file named e.g. foo.cpython-36.opt--1.pyc would be written, which won't lead to any conflicts (I wish we could use False for sys.flags.optimize, but that has the same value as 0, which is the default optimization level).
Does one of those proposals seem acceptable to everyone? Do people like Ned, who asked for this feature, have a preference as to whether the bytecode is or is not written out to a .pyc file?
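A tool could detect such a flag at runtime through the real sys._xoptions mapping; the flag name DisableOptimizations is the proposed one and is otherwise hypothetical:

```python
import sys

# "-X" options appear in this mapping; passing "-X DisableOptimizations"
# on the command line would add the key with the value True.
disable_opts = sys._xoptions.get("DisableOptimizations", False)
print(sys._xoptions, disable_opts)
```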

A few different optimizations work together here. Folding constants at the AST level makes it possible to eliminate the constant expression statement in the code generation stage. This makes 'continue' the first statement in the 'if' body. Boolean expression optimization (performed in the code generation stage now) creates a conditional jump to the start of the 'if' body (which is now 'continue'). If 'continue' is not nested in 'try' or 'with' blocks, it is compiled to an unconditional jump. And finally, the jump optimization in the peepholer retargets the conditional jump from the unconditional jump to the start of the loop.

Serhiy: thanks for the fuller story about the optimizations in place now. I'll repeat my earlier plea: providing a way to disable optimization is beneficial for tooling like coverage.py, and also for helping us to ensure that the optimizer is working correctly.
I doubt anyone commenting here would be happy with a C compiler that optimized with no off-switch.