This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler. Please see the page "What is Nuitka?" for an overview.

This new release of Nuitka has a focus on re-organizing the Nuitka generated
source code and a modest improvement on the performance side.

For a long time now, Nuitka has generated a single C++ file and asked the C++
compiler to translate it to an executable or shared library for CPython to
load. This was done even when embedding many modules into one (the "deep"
compilation mode, option --deep).

This was simple to do and in theory ought to allow the compiler to do the most
optimization. But for large programs, the resulting source code could have
exponential compile time behavior in the C++ compiler. At least for the GNU g++
this was the case, others probably as well. This is of course at the end a
scalability issue of Nuitka, which now has been addressed.

So the major advancement of this release is to make the --deep option
useful. But there have also been performance improvements, which end up giving
us another boost for the "PyStone" benchmark.

Bug fixes

Imports of modules local to packages now work correctly, closing the small
compatibility gap that was there.

Modules with a "-" in their name are allowed in CPython through dynamic
imports. This led to wrong C++ code being generated. (Thanks to Li Xuan Ji for
reporting and submitting a patch to fix it.)
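
To illustrate what "dynamic import" means here, the following is a minimal sketch in plain CPython. The module name "dash-module" and its contents are made up for the example; it only shows that such a module can be imported by string name, even though "import dash-module" would be a SyntaxError.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a module whose file name contains a dash.
# The name "dash-module" is hypothetical and only used for this illustration.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "dash-module.py"), "w") as handle:
    handle.write("value = 42\n")

sys.path.insert(0, module_dir)
try:
    # The import statement cannot spell this name, but dynamic imports can.
    dashed = importlib.import_module("dash-module")
    also_dashed = __import__("dash-module")
finally:
    sys.path.remove(module_dir)

print(dashed.value)
```

A compiler that derives C++ identifiers from module names has to sanitize such names, since "-" is not valid in an identifier.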

There were warnings about the wrong format being used for the Ssize_t type of
CPython. (Again, thanks to Li Xuan Ji for reporting and submitting the patch
to fix it.)

When a wrong exception type is raised, the traceback should still be that of
the original exception.

Set and dict contractions (Python 2.7 features) declared local variables for
global variables used. This went unnoticed, because list contractions don't
generate code for local variables at all, as they cannot have such.

Using the type() built-in to create a new class could attribute it to the
wrong module; this is now corrected.
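
As a small sketch of the expected behavior in CPython: a class created with the three-argument form of type() reports the module of the code that called type(). The names make_class and Dynamic are made up for this example.

```python
def make_class(name):
    # Three-argument type(): class name, tuple of bases, namespace dictionary.
    return type(name, (object,), {"greeting": "hello"})


Dynamic = make_class("Dynamic")

# The class should be attributed to the module that executed the type() call.
EXPECTED_MODULE = __name__
print(Dynamic.__module__ == EXPECTED_MODULE)
```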

New Features

The direct use of __import__() with a constant module name as parameter is
also followed in "deep" mode. With time, non-constants may still become
predictable, right now it must be a real CPython constant string.
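
The difference between a constant and a non-constant module name can be sketched as follows. This is plain CPython behavior; which of these calls a compiler can follow at compile time is the point of the paragraph above, and the variable names are only illustrative.

```python
# A constant module name: the string is visible directly in the source code.
os_module = __import__("os")

# For dotted names, __import__ returns the top-level package unless a
# non-empty "fromlist" is given.
top_level = __import__("os.path")
leaf = __import__("os.path", fromlist=["join"])

# A non-constant module name: it only exists at run time, so a compiler
# cannot know which module is meant without further analysis.
computed_name = "".join(["js", "on"])
json_module = __import__(computed_name)
```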

New Optimization

Added optimization for the built-ins ord() and chr(). These require a
module and built-in module lookup, then parameter parsing. Now these are
really quick with Nuitka.
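
The lookup order mentioned above can be sketched in plain Python: a name like ord is first searched in the module globals and only then in the built-in module. The helper code_point and the shadowing namespace are made up for this illustration.

```python
import builtins


def code_point(character):
    # "ord" is not a local here, so it is looked up in the module globals
    # first and, if missing there, in the built-in module.
    return ord(character)


# Without shadowing, the name resolves to the built-in function.
same_object = builtins.ord is ord

# A module-level "ord" shadows the built-in, which is why the lookup
# cannot simply be skipped without checking.
namespace = {"ord": lambda character: -1}
exec("result = ord('a')", namespace)
```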

Added optimization for the type() built-in with one parameter. As above,
looking it up from the built-in module can be very slow. Now it is instantaneous.

Added optimization for the type() built-in with three parameters. It's
rarely used, but providing our own variant allowed us to fix the bug mentioned
above.

Cleanups

Using scons is a big cleanup for the way C++ compiler related options are
applied. It also makes it easier to re-build without Nuitka, e.g. if you were
using Nuitka in your packages, you can easily build in the same way as
Nuitka does.

Static helpers source code has been moved to ".hpp" and ".cpp" files, instead
of being in ".py" files. This makes C++ compiler messages more readable and
allows us to use C++ mode in Emacs etc., making it easier to write things.

Generated code for each module ends up in a separate file per module or
package.

Constants etc. go to their own file (although not named sensibly yet, likely
going to change too).

Module variables are now created by the CPythonModule node only and are
unique, this is to make optimization of these feasible. This is a pre-step to
module variable optimization.

New Tests

Added "ExtremeClosure" from my Python quiz, it was not covered by existing
tests.

Added test case for program that imports a module with a dash in its name.

Added test case for main program that starts with a dash.

Extended the built-in tests to cover type() as well.

Organizational

There is now a new environment variable NUITKA_SCONS which should point to
the directory with the SingleExe.scons file for Nuitka. The scons file
could be named better, because it is actually one and the same file that builds
extension modules and executables.

There is now a new environment variable NUITKA_CPP which should point to
the directory with the C++ helper code of Nuitka.

The script "create-environment.sh" can now be sourced (if you are in the top
level directory of Nuitka) or be used with eval. In either case it also
reports what it does.

Update

The script has become obsolete now, as the environment variables are no
longer necessary.

To clean up the many "Program.build" directories, there is now a "clean-up.sh"
script for your use. It can be handy, but if you use git, you may prefer its
clean command.

Update

The script has become obsolete now, as Nuitka test executions now by
default delete the build results.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.4:

Pystone(1.1) time for 50000 passes = 0.34
This machine benchmarks at 147059 pystones/second

This pre-release of Nuitka has a focus on re-organizing the Nuitka generated source
code. Please see the page "What is Nuitka?" for clarification of
what it is now and what it wants to be.

For a long time, Nuitka has generated a single C++ file, even when embedding many modules
into one. And it has always shown that the GNU g++ compiler clearly has exponential
compile time behavior when translating these into the executable.

This is no more the case. So this pre-release is mainly about making the "--deep" feature
useful. Before the release, I may look into optimizations for speed again. Right now time
is very short due to day job reasons, so this pre-release is also about allowing people to
use the improvements that I have made and get some feedback about it.

Bug fixes

None at all. Although I am sure that there may be regressions on the options side. The
tests of CPython 2.7 all pass still, but you may find some breakage.

Cleanups

Static helpers source code has been moved to ".hpp" and ".cpp" files, instead of being
in ".py" files.

Generated code for each module is now a separate file.

Constants etc. go to their own file (although not named sensibly yet).

New Features

Uses Scons to make the build.

New Tests

I have added ExtremeClosure from the Python quiz. I feel it was not covered by existing
tests yet.

Organizational

There is now a new environment variable "NUITKA_SCONS" which should point to the directory
with the Scons file for Nuitka.

The create-environment.sh can now be sourced (if you are in the top level directory of
Nuitka) or be used with eval. In either case it also reports what it does.

Numbers

None at this time. It likely didn't change much at all. And I am not yet using the link
time optimization feature of the g++ compiler, so potentially it should be worse than
before at max.

This release will be inside the "git" repository only. Check out latest version
here to get it.

Python3 support was added, and has reached 3.3 in the mean time. The doctests
are extracted by a script indeed. But exception stack correctness is an
ongoing struggle.

My Python compiler Nuitka has come a long way, and currently I have little to no
time to spend on it, due to day job reasons, so it's going to mostly stagnate
for about 2 weeks from my side. But that's coming to an end, and still I would
like to expand what we currently have, with your help.

Note: You can check the page What is Nuitka? for
clarification of what it is now and what it wants to be.

As you will see, covering all the CPython 2.6 and 2.7 language features is
already something. Other projects are far, far away from that. But going ahead,
I want to secure that base. And this is where there are several domains where
you can help:

Python 3.1 or higher

I did some early testing. The C/API changed in many ways, and my current
working state has a couple of fixes for it. I would like somebody else to
devote some time to fixing this up. Please contact me if you can help here,
esp. if you are competent in the C/API changes of Python 3.1. Even if the
CPython 3.1 doesn't matter as much to me, I believe the extended coverage from
the new tests in its test suite would be useful. The improved state is not yet
released. I would make a release to the person(s) that want to work on it.

Doctests support

I have started to extract the doctests from the CPython 2.6 test suite. There
is a script that does it, and you basically only need to expand it with more
of the same. No big issue there, but it could find issues with Nuitka that we
would like to know. Of course, it should also be expanded to CPython 2.7 test
suite and ultimately also CPython 3.1

Exception correctness

I noted some issues with the stacks when developing with the CPython 2.7
tests, or now failing 2.6 tests, after some merge work. But what would be
needed would be tests to cover all the situations, where exceptions could be
raised, and stack traces should ideally be identical for all. This is mostly
only accuracy work and the CPython test suite is bad at covering it.

All these areas would be significant help, and do not necessarily or at all
require any Nuitka inside knowledge. You should also subscribe to the mailing
list if you consider helping, so we can discuss things in the open.

If you choose to help me, before going even further into optimization, in all
likelihood it's only going to make things more solid. The more tests we have,
the fewer wrong paths we can take. This is why I am asking for things, which all
point into that direction.


This release of Nuitka continues the focus on performance. It also cleans up a
few open topics. One is "doctests", these are now extracted from the CPython 2.6
test suite more completely. The other is that the CPython 2.7 test suite is now
passed completely. There is some more work ahead though, to extract all of the
"doctests" and to do that for both versions of the tests.

This means an even higher level of compatibility has been achieved, along with
performance improvements and an ever cleaner structure.

Bug fixes

Generators

Generator functions tracked references to the common and the instance context
independently, now the common context is not released before the instance
contexts are.

Generator functions didn't check the arguments to throw() the way they are
in CPython, now they are.

Generator functions didn't trace exceptions to "stderr" if they occurred while
closing unfinished ones in "del".

Generator functions used slightly different wordings for some error
messages.
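
As an illustration of the throw() argument check, here is a minimal sketch of what CPython does (shown with current Python syntax). The generator counter is made up for the example; a proper exception is delivered into the generator, while a non-exception argument is rejected with a TypeError.

```python
def counter():
    try:
        yield 1
        yield 2
    except ValueError:
        # The exception passed to throw() is raised at the paused yield.
        yield "caught"


generator = counter()
next(generator)
handled = generator.throw(ValueError("boom"))

other = counter()
next(other)
try:
    # 42 is neither an exception class nor an exception instance.
    other.throw(42)
except TypeError:
    rejected = True
else:
    rejected = False
```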

Function Calls

Extended call syntax with ** allows the use of a mapping, and it is now
checked if it really is a mapping and if its contents have string keys.

Similarly, extended call syntax with * allows a sequence, it is now
checked if it really is a sequence.

Error messages for duplicate keyword arguments or too few arguments now
describe the duplicate parameter and the callable the same way CPython does.

The keyword argument list is now checked first, before considering the parameter
counts. This is slower in the error case, but more compatible with CPython.
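
The error cases from this section can be sketched as follows in plain CPython. The function describe and the helper error_type are made up for the example; each call is expected to fail with a TypeError rather than silently misbehave.

```python
def describe(name, count=1):
    return "%s x%d" % (name, count)


def error_type(thunk):
    """Call thunk() and return the name of the TypeError subclass raised, if any."""
    try:
        thunk()
    except TypeError as error:
        return type(error).__name__
    return None


# "*" needs something iterable, "**" needs a mapping with string keys.
star_non_sequence = error_type(lambda: describe(*5))
star_star_non_mapping = error_type(lambda: describe(**[("name", "a")]))
non_string_key = error_type(lambda: describe("a", **{1: 2}))

# Duplicate and missing arguments are reported as well.
duplicate_argument = error_type(lambda: describe("a", name="b"))
too_few_arguments = error_type(lambda: describe())
```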

Classes

The "locals()" built-in when used in the class scope (not in a method) now is
correctly writable and writes to it change the resulting class.

Name mangling for private identifiers was not always done entirely correctly.
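
Both class-related behaviors can be sketched in a few lines of plain CPython. The class name Configured and its attributes are made up for this illustration.

```python
class Configured:
    # In class scope (not inside a method), locals() is the namespace that
    # becomes the class, so writing to it adds a class attribute.
    locals()["injected"] = "from locals()"

    def __init__(self):
        # Private identifiers are mangled with the class name, so this is
        # stored as "_Configured__secret".
        self.__secret = 1


instance = Configured()
print(Configured.injected, instance._Configured__secret)
```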

Others

Exceptions didn't always have the correct stack reported.

The pickling of some tuples showed that "cPickle" can have non-reproducible
results, so "pickle" is now used to stream constants.

New Optimization

Access to instance attributes has become faster by writing specific code for
the case. This is done in a JIT-like way, attempting at run time to optimize
attribute access for instances.

Assignments now often consider what's cheaper for the other side, instead of
taking a reference to a global variable, just to have to release it.

The function call code built argument tuples and dictionaries as constants,
now that is true for every tuple usage.

Cleanups

The static helper classes, and the prelude code needed have been moved to
separate C++ files and are now accessed via "#include". This puts the code inside
C++ files as opposed to a Python string and therefore makes it easier to read
and/or change.

New Features

The generator functions and generator expressions have the attribute
"gi_running" now. These indicate if they are currently running.

New Tests

The script to extract the "doctests" from the CPython test suite has been
rewritten entirely and works with more doctests now. Running the tests it
creates increased the test coverage a lot.

The Python 2.7 test suite has been added.

Organizational

One can now run multiple "compare_with_cpython" instances in parallel, which
enables background test runs.

There is now a new environment variable "NUITKA_INCLUDE" which needs to point
to the directory Nuitka's C++ includes live in. Of course the
"create-environment.sh" script generates that for you easily.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.3:

Pystone(1.1) time for 50000 passes = 0.36
This machine benchmarks at 138889 pystones/second


This release of Nuitka continues the focus on performance. But this release also
revisits the topic of feature parity. Before, feature parity had been reached
"only" with Python 2.6. This is of course a big thing, but you know there is
always more, e.g. Python 2.7.

With the addition of set contractions and dict contractions in this very
release, Nuitka is approaching Python support for 2.7, and then there are some
bug fixes.

Bug fixes

Calling a function with ** and using a non-dict for it was leading to
wrong behavior. Now a mapping is good enough as input for the ** parameter
and it's checked.
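
What "a mapping is good enough" means can be sketched with a minimal non-dict mapping in plain CPython. The class PairMapping and the function collect are made up for the example; CPython only needs keys() and item access for the ** argument.

```python
class PairMapping:
    """A minimal mapping that is not a dict: it offers keys() and item access."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def keys(self):
        return [key for key, _ in self._pairs]

    def __getitem__(self, key):
        for candidate, value in self._pairs:
            if candidate == key:
                return value
        raise KeyError(key)


def collect(**kwargs):
    return kwargs


result = collect(**PairMapping([("a", 1), ("b", 2)]))
print(result)
```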

Deeply nested packages "package.subpackage.module" were not found and gave a
warning from Nuitka, with the consequence that they were not embedded in the
executable. They now are.

Files that ended in a line with a "#" comment but without a new line gave an
error from "ast.parse". As a workaround, a new line is added to the end of the
file if it's "missing".
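
The workaround can be sketched as below. The helper name ensure_trailing_newline is made up and this is not the code Nuitka uses; current CPython versions parse such sources without help, so the sketch only shows the normalization step itself.

```python
import ast


def ensure_trailing_newline(source):
    """Return source with a final new line appended if it lacks one."""
    if not source.endswith("\n"):
        source += "\n"
    return source


# The last line is a comment and the file does not end with a new line.
source = "value = 1\n# trailing comment without a new line"
tree = ast.parse(ensure_trailing_newline(source))
```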

More correct exception locations for complex code lines. I noted that the
current line indication should not only be restored when the call at hand
failed, but in any case. Otherwise sometimes the exception stack would not be
correct. It now is - more often. Right now, this has no systematic test.

Re-raised exceptions didn't appear on the stack if caught inside the same
function, these are now correct.

For exec the globals argument needs to have "__builtins__" added, but the
check was performed with the mapping interface. That is not how CPython does
it, and so e.g. the mapping could use a default value for "__builtins__" which
could lead to incorrect behavior. Clearly a corner case, but one that works
fully compatible now.
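
The corner case can be sketched in plain CPython (shown with the function form of exec). The class DefaultingDict is made up for the example: it pretends to have a value for every key through the mapping interface, yet CPython looks at the real dictionary contents and still adds "__builtins__".

```python
namespace = {}
exec("answer = len('abc')", namespace)
has_builtins = "__builtins__" in namespace


class DefaultingDict(dict):
    # Mapping-style access yields a default for any missing key.
    def __missing__(self, key):
        return "default for %r" % (key,)


custom = DefaultingDict()
exec("pass", custom)

# Check the real dictionary contents, bypassing __missing__.
stored_for_real = dict.__contains__(custom, "__builtins__")
```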

New Optimization

The local and shared local variable C++ classes have a flag "free_value" to
indicate if a "Py_DECREF" needs to be done when releasing the object. But
still the code used "Py_XDECREF" (which allows for "NULL" values to be
ignored) when the releasing of the object was done. Now the inconsistency of
using "NULL" as "object" value with "free_value" set to true was removed.

Tuple constants were copied before using them without a point. They are
immutable anyway.

Cleanups

Improved more of the indentation of the generated C++ which was not very good
for contractions so far. Now it is. Also assignments should be better now.

The generation of code for contractions was made more general and templates
split into multiple parts. This enabled reuse of the code for list
contractions in dictionary and set contractions.

The with statement has its own template now and got cleaned up regarding
indentation.

New Tests

There is now a script to extract the "doctests" from the CPython test suite
and it generates Python source code from them. This can be compiled with
Nuitka and output compared to CPython. Without this, the doctest parts of the
CPython test suite are mostly useless. Solving this improved test coverage,
leading to many small fixes. I will dedicate a later posting to the tool,
maybe it is useful in other contexts as well.
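
To give an idea of what turning doctests into Python source can look like, here is a minimal sketch using the standard library function doctest.script_from_examples. This is not the Nuitka script, and the sample docstring is made up; it only shows the general idea of converting examples into a runnable script.

```python
import doctest

# A made-up docstring containing one doctest.
DOC = """
Adding numbers works as usual.

>>> total = 1 + 2
>>> print(total)
3
"""

# Prose becomes comments, examples become code, and expected output is
# kept as comments after an "# Expected:" marker.
script = doctest.script_from_examples(DOC)
print(script)
```

The resulting script can be run by any Python implementation, and its output compared between implementations.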

Reference count tests have been expanded to cover assignment to multiple
assignment targets, and to attributes.
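
The kind of check meant here can be sketched with sys.getrefcount. This is a CPython-specific illustration and not the actual Nuitka test; the names Holder and reference_deltas are made up. The idea is that every assignment target adds exactly one reference, and releasing the targets restores the original count.

```python
import sys


class Holder:
    pass


def reference_deltas():
    value = object()
    baseline = sys.getrefcount(value)

    # Assignment to multiple targets: two new references.
    first = second = value
    after_targets = sys.getrefcount(value) - baseline

    # Assignment to an attribute: one more reference.
    holder = Holder()
    holder.attribute = value
    after_attribute = sys.getrefcount(value) - baseline

    # Releasing everything should bring the count back to the baseline.
    del first, second, holder
    after_release = sys.getrefcount(value) - baseline

    return after_targets, after_attribute, after_release
```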

The deep program test case now also has a module in a sub-package to cover
this case as well.

Organizational

The gitweb interface might be considered an
alternative to downloading the source if you want to provide a pointer, or
want to take a quick glance at the source code. You can already download with
git, follow the link below to the page explaining it.

The "README.txt" has documented more of the differences and I consequently
updated the Differences page. There is now a distinction between generally
missing functionality and things that don't work in --deep mode, where
Nuitka is supposed to create one executable.

I will make it a priority to remove the (minor) issues of --deep mode in
the next release, as this is only relatively little work, and not a good
difference to have. We want these to be empty, right? But for the time being,
I document the known differences there.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.2:

Pystone(1.1) time for 50000 passes = 0.39
This machine benchmarks at 128205 pystones/second

This is 66% for 0.3.2, slightly up from the 58% of 0.3.1 before. The
optimizations done were somewhat fruitful, but as you can see, they were also
more cleanups, not the big things.
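
The percentages can be reproduced from the pystones/second figures published in the release announcements. The sketch below assumes that the quoted numbers are the speedup over CPython 2.6 with the fraction cut off; the function name speedup_percent is made up.

```python
# Pystone results as published in the release announcements.
CPYTHON_PYSTONES = 76923.1
NUITKA_PYSTONES = {"0.3.1": 121951.0, "0.3.2": 128205.0}


def speedup_percent(nuitka_pystones, baseline=CPYTHON_PYSTONES):
    """Speedup over the baseline in percent, e.g. 2x as fast gives 100.0."""
    return (nuitka_pystones / baseline - 1.0) * 100.0


for version in sorted(NUITKA_PYSTONES):
    print(version, "%.1f%%" % speedup_percent(NUITKA_PYSTONES[version]))
```

This yields roughly 58.5% for 0.3.1 and roughly 66.7% for 0.3.2, in line with the 58% and 66% quoted above.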

My 6-year-old son was hurt when he fell off some stone steps during day care
after school. He broke his arm twice and it was really bad. Things didn't go
optimally with that, and in the clinic, during the surgery, one of the risks
materialized, and so he got content from the stomach into his lungs, which is
really serious. So he ended up in intensive care in the hospital.

This meant he could not breathe without assistance anymore, but fortunately that
went over quite well, so he was only one night in intensive care and then was
sent to the normal kids section of the hospital.

My wife had a 3-day exam all the while, which would have been very problematic
to interrupt. So I drove her to the exams very early in the morning, then went
to the hospital to play with my son, and later picked her up, so she could see
him too. My other son, 3 years old, was meanwhile in Kindergarten, from where I
picked him up to allow the kids to play together.

They play together each day, and they missed one another very badly. The little
one was very concerned with what the big one was doing.

On top of that, we had the final acceptance of the new house yesterday, just
barely after the son was released from hospital. The new house will also consume
a lot of time for a while. I am preparing the move in about 2 weeks.

For Nuitka this meant of course that no time at all was available for it and
that it will not be much in the near future. I missed a week in the day job and
will need to catch up. Family and day job are priorities of course.

Luckily small things already make an improvement and then I had some things done
already, so there still will be some progress. And of course, I do need some
recreation in any case, and Nuitka is a lot of fun to me. I expect to make a
Nuitka release 0.3.2 this weekend, and it will again be a good step ahead.


This release of Nuitka continues the focus on performance and contains only
cleanups and optimization. Most go into the direction of more readable code,
some aim at making the basic things faster, with good results as to performance
as you can see below.

New Optimization

Constants in conditions of conditional expressions (a if cond else d),
if/elif or while are now evaluated to true or false
directly. Before, a temporary Python object would be created from the
constant, which was then checked for its truth value.

All of that is obviously overhead only. And it hurts the typical while
1: infinite loop case badly.

Do not generate code to catch BreakException or ContinueException
unless a break or continue statement inside a try: finally:
block in that loop actually requires this.

Even while uncaught exceptions are cheap, it is still an improvement
worthwhile and it clearly improves the readability for the normal case.

The compiler more aggressively prepares tuples, lists and dicts from the
source code as constants if their contents is "immutable" instead of building
at run time. An example of a "mutable" tuple would be ({},) which is not
safe to share, and therefore will still be built at run time.

For dictionaries and lists, copies will be made, under the assumption that
copying a dictionary will always be faster, than making it from scratch.
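
The distinction between values that can be shared and values that must be copied or rebuilt can be sketched like this. The helper is_shareable and the TEMPLATE dictionary are made up for illustration and do not reflect the actual checks in Nuitka.

```python
IMMUTABLE_SCALARS = (int, float, complex, str, bytes, bool, type(None))


def is_shareable(value):
    """True if value is immutable all the way down and thus safe to share."""
    if isinstance(value, IMMUTABLE_SCALARS):
        return True
    if isinstance(value, (tuple, frozenset)):
        return all(is_shareable(item) for item in value)
    return False


# A prepared dictionary constant is never handed out directly. Each use gets
# a copy, which is assumed to be cheaper than building it from scratch.
TEMPLATE = {"retries": 3, "verbose": False}


def fresh_dict():
    return dict(TEMPLATE)
```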

The parameter parsing code was dynamically building the tuple of argument
names to check if an argument name was allowed by checking the equivalent of
name in argument_names. This was of course wasteful and now a pre-built
constant is used for this, so it should be much faster to call functions with
keyword arguments.
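
The change can be sketched in Python terms: the tuple of allowed names is built once per function instead of once per call. The names make_keyword_checker and check are made up, and the real code is generated C++, so this is only an analogy.

```python
def make_keyword_checker(function_name, argument_names):
    # Built once when the function is set up, not on every call.
    allowed = tuple(argument_names)

    def check(keywords):
        for name in keywords:
            # The equivalent of "name in argument_names".
            if name not in allowed:
                raise TypeError(
                    "%s() got an unexpected keyword argument '%s'"
                    % (function_name, name)
                )

    return check


check = make_keyword_checker("describe", ["name", "count"])
check({"name": "a", "count": 2})
```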

There are new template files and also actual templates now for the while
and for loop code generation. And I started work on having a template for
assignments.

Cleanups

Do not generate code for the else of while and for loops if there is
no such branch. This uncluttered the generated code somewhat.

The indentation of the generated C++ was not very good and whitespace was
often trailing, or e.g. a real tab was used instead of "\t". Some things
didn't play well together here.

Now much of the generated C++ code is much more readable and white space
cleaner. For optimization to be done, the humans need to be able to read the
generated code too. Mind you, the aim is not to produce usable C++, but on the
other hand, it must be possible to understand it.

To the same end of readability, the empty else {} branches are avoided for
if, while and for loops. While the C++ compiler can be expected to
remove these, they seriously cluttered up things.

The constant management code in Context was largely simplified. Now the
code is using the Constant class to find its way around the problem that
dicts, sets, etc. are not hashable, or that complex is not orderable;
this was necessary to allow deeply nested constants, but it is also simpler
code now.
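
How a wrapper class can work around unhashable constants may be sketched as follows. The classes ConstantKey and ConstantRegistry are hypothetical and not the Constant class of Nuitka; the sketch assumes that the type name plus repr() identifies a constant well enough, which is not reliable for sets, whose repr() order may vary.

```python
class ConstantKey:
    """Hashable stand-in for a constant that may itself be unhashable."""

    def __init__(self, value):
        self.value = value
        # The type name keeps 1, 1.0 and True apart, although they compare equal.
        self._key = (type(value).__name__, repr(value))

    def __hash__(self):
        return hash(self._key)

    def __eq__(self, other):
        return isinstance(other, ConstantKey) and self._key == other._key


class ConstantRegistry:
    """Hands out one identifier per distinct constant."""

    def __init__(self):
        self._names = {}

    def name_for(self, value):
        key = ConstantKey(value)
        if key not in self._names:
            self._names[key] = "const_%d" % len(self._names)
        return self._names[key]


registry = ConstantRegistry()
print(registry.name_for({"a": [1, 2]}), registry.name_for(1j))
```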

The C++ code generated for functions now has two entry points, one for Python
calls (arguments as a list and dictionary for parsing) and one where this has
happened successfully. In the future this should allow for faster function
calls avoiding the building of argument tuples and dictionaries altogether.

For every function there was a "traceback adder" which was only used in the
C++ exception handling before exit to CPython to add to the traceback
object. This was now in-lined, as it won't be shared ever.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.1:

Pystone(1.1) time for 50000 passes = 0.41
This machine benchmarks at 121951 pystones/second

This is 58% for 0.3.1, up from the 25% before. So it's getting somewhere. As
always you will find its latest version here.


This release 0.3.0 is the first release to focus on performance. In the 0.2.x
series Nuitka achieved feature parity with CPython 2.6 and that was very
important, but now it is time to make it really useful.

Optimization has been one of the main points, although I was also a bit forward
looking to Python 2.7 language constructs. This release is the first where I
really started to measure things and removed the most important bottlenecks.

New Features

Added the --debug option. With this option the C++ debug
information is present in the file, otherwise it is not. This will give much
smaller ".so" and ".exe" files than before.

Added option --no-optimization to disable all optimization. It enables C++
asserts and compiles with less aggressive C++ compiler optimization, so it can
be used for debugging purposes.

Support for Python 2.7 set literals has been added.

Performance Enhancements

Fast global variables: Reads of global variables were fast already. This was
due to a trick that is now also used to check them and to do a much quicker
update if they are already set.

Fast break/continue statements: To make sure these statements execute
the finally handlers if inside a try, these used C++ exceptions that were
caught by try/finally in while or for loops.

This was very slow. Now it is checked if this is
at all necessary and then it's only done for the rare case where a
break/continue really is inside the tried block. Otherwise it is now
translated to a C++ break/continue which the C++ compiler handles more
efficiently.
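
The rare case that still needs the special handling can be sketched in plain Python. The function name loop_with_finally is made up; the point is that a break inside the tried block must run the finally handler before the loop is left.

```python
def loop_with_finally():
    events = []
    for index in range(3):
        try:
            if index == 1:
                # Leaving the loop from inside "try" still runs "finally".
                break
            events.append(("body", index))
        finally:
            events.append(("finally", index))
    return events


print(loop_with_finally())
```

A break outside of any try/finally has no such obligation, which is why it can become a plain C++ break.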

Added unlikely() compiler hints to all error handling cases to allow the
C++ compiler to generate more efficient branch code.

The for loop code was using an exception handler to make sure the iterated
value was released, using PyObjectTemporary for that instead now, which
should lead to better generated code.

Constant dictionaries are now used and copied from, instead of building them
at run time even when the contents were constant.

New Tests

Merged some bits from the CPython 2.7 test suite that do not harm 2.6, but
generally it's a lot due to some unittest module interface changes.

Added CPython 2.7 tests test_dictcomps.py and test_dictviews.py which
both pass when using Python 2.7.

Added another benchmark extract from "PyStone" which uses a while loop with
break.

Numbers

python 2.6:

Pystone(1.1) time for 50000 passes = 0.65
This machine benchmarks at 76923.1 pystones/second

Nuitka 0.3.0:

Pystone(1.1) time for 50000 passes = 0.52
This machine benchmarks at 96153.8 pystones/second

That's a 25% speedup now and a good start clearly. It's not yet in the range of
where I want it to be, but there is always room for more. And the
break/continue exception was an important performance regression fix.


This release 0.2.4 is likely the last 0.2.x release, as it's the one that
achieved feature parity with CPython 2.6, which was the whole point of the
release series, so time to celebrate. I have stayed away (mostly) from any
optimization, so as to not be premature.

From now on speed optimization is going to be the focus though. Because right
now, frankly, there is not much of a point to use Nuitka yet, with only a minor
run time speed gain in trade for a long compile time. But hopefully we can
change that quickly now.

New Features

The use of exec in a local function now adds local variables to the scope it
is in.

The same applies to from module_name import * which is now compiled
correctly and adds variables to the local variables.

Bug Fixes

Raises UnboundLocalError when deleting a local variable with del
twice.

Raises NameError when deleting a global variable with del twice.

Reads of uninitialized closure variables gave NameError, but
UnboundLocalError is correct and raised now.
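
The first two cases can be sketched in plain CPython as follows. The function names and the global _scratch are made up for the example. Note that UnboundLocalError is a subclass of NameError, so the exact type has to be compared.

```python
def delete_local_twice():
    value = 1
    del value
    del value  # the local is already unbound here


def delete_global_twice():
    global _scratch
    _scratch = 1
    del _scratch
    del _scratch  # the global no longer exists here


def raised_by(function):
    """Return the exact exception type raised by calling function()."""
    try:
        function()
    except NameError as error:
        return type(error)
    return None


print(raised_by(delete_local_twice).__name__)
print(raised_by(delete_global_twice).__name__)
```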

Cleanups

There is now a dedicated pass over the node tree right before code generation
starts, so that some analysis can be done as late as that. Currently this is
used for determining which functions should have a dictionary of locals.

Checking the exported symbols list, fixed all the cases where a static was
missing. This reduces the "module.so" sizes.

With gcc the "visibility=hidden" is used to avoid exporting the helper
classes. Also reduces the "module.so" sizes, because classes cannot be made
static otherwise.

New Tests

Added "DoubleDeletions" to cover behaviour of del. It seems that this is
not part of the CPython test suite.

The "OverflowFunctions" (those with dynamic local variables) now has an
interesting test, exec on a local scope, effectively adding a local variable
while a closure variable is still accessible, and a module variable too. This
is also not in the CPython test suite.

Restored the parts of the CPython test suite that did local star imports or
exec to provide new variables. Previously these had been removed.

Also "test_with.py" which covers PEP 343 has been reactivated, the with
statement works as expected.

Quiz Question

Say you have the following module code:

    a_global = 7

    def deepExec():
        for_closure = 3

        def execFunction():
            code = "f=2"

            # Can fool it to nest
            exec code

            print "Locals now", locals()

            print "Closure was taken", for_closure
            print "Globals still work", a_global

            print "Added local from code", f

        execFunction()

    deepExec()

Can you overcome the SyntaxError this gives in CPython? Normally exec like this is not
allowed for nested functions. Well, think about it, the solution is in the next paragraph.

Solution

The correct answer is that you need to add "in None, None" to the exec and you are
fine. The exec is now allowed and behaves as expected. You can see it in the locals, "f"
was indeed added to it, the closure value is correct, and the global still works.

It seems the "SyntaxError" tries to avoid such code, but on the other hand, exec is not
forbidden when it has parameters, and those imply defaults when they are None.

Now, I had this strange realization when implementing the "exec" behaviour for my Python
compiler Nuitka which in its next version (due later this
week) will be able to handle this type of code as well. :-)