I'll get into the details of how PyRun_File() works in a little bit,
but if you look carefully at Listing 3, you should notice something
interesting. When I call PyRun_File() to execute the files, the
dictionary gets passed in twice. The reason for this is that Python
code actually has two environmental contexts when it is executed.
The first is the global context, which I've already talked about.
The second context is the local context, which contains any locally
defined variables or functions. In this case, those are the same, because
the code being executed is top-level code. On the other hand, if you
were to execute a function dynamically using multiple C-level calls,
you might want to create a local context and use that instead of the
global dictionary. For the most part though, it's generally safe to
pass the global environment for both the global and local parameters.

Manipulating Python Data Structures in C/C++

At this point, I'm sure you've noticed the Py_DECREF() calls that
popped up in the Listing 3 example. Those fun little guys are there
for memory management purposes. Inside the interpreter, Python handles
memory management automatically by keeping track of all references to
an object, transparently to the programmer. As soon as it determines that all
references to a given chunk of memory have been released, it deallocates
the no-longer-needed chunk. This becomes a problem when you start working
on the C side, though. Because C is not a memory-managed language, as soon
as a Python data structure ends up referenced from C, Python loses all
ability to track those references automatically. The C application can make as many
copies of the reference as it wants, and hold on to them indefinitely,
without Python knowing anything about it.

The solution is to have C code that gets a reference to a Python object
handle all of the reference counting manually. Generally, when a Python
call hands an object out to a C program, it increments the reference count
by one. The C code can then do what it likes with the object without worrying
that it will be deleted out from under it. Then when the C program is
done with the object, it is responsible for releasing its reference by
making a call to Py_DECREF().

It's important to remember, though, that when you copy a pointer within
your C program into a location that may outlast the pointer you're copying
from, you need to increment the reference count manually by calling Py_INCREF().
For example, if you make a copy of a PyObject pointer to store inside
an array, you'll probably want to call Py_INCREF() to ensure that
the pointed-to object won't get garbage-collected after the original
PyObject reference is decremented.

Executing Code from a File

Now let's take a look at a slightly more useful example to see how Python
can be embedded into a real program. If you take a look at Listing
4, you'll see a small program that allows the user to specify short
expressions on the command line. The program then calculates the
results of those expressions and displays them in the output. To add a
little spice to the mix, the program also lets users specify a file
of Python code that will be loaded before the expressions are executed.
This way, the user can define functions that will be available to the
command-line expressions.

Two basic Python API functions are used in this program:
PyRun_SimpleString() and PyRun_SimpleFile(). You've seen PyRun_SimpleString()
before. All it does is execute the given Python expression
in the global environment. PyRun_SimpleFile() is similar to the
PyRun_File() function that I discussed earlier, but it runs things in the
global environment by default. Because everything is run in the
global environment, the results of each executed expression or group of
expressions will be available to those that are executed later.

The Python/C API is very low level, verbose, painful to work with, and highly error-prone. With C++ code in particular it just sucks - you'll spend half your time writing const_cast<char*>("blah") to work around "interesting" APIs or writing piles of extern "C" wrapper functions, and the rest of your time writing verbose argument encoding/decoding or reference-counting code. It's immensely frustrating to use plain C to write a Python object that wraps a C++ object.

Do yourself a favour, and once you've got embedding working, expose the Python interface to your program using a higher-level tool. I hear SWIG is pretty good, especially for plain C code, but I haven't used it myself (I work with heavily templated C++ with Qt). SIP (used to make PyQt) from Riverbank Computing has its advantages also, and is good if you want your Qt app's API to integrate cleanly into PyQt. Otherwise, I'd suggest the amazing Boost::Python for C++ users, as its ability to almost transparently wrap your C++ interfaces, mapping them to quite sensible Python semantics, is pretty impressive.

Boost::Python has the added advantage that you can write some very nice C++ code that integrates cleanly with Python. For example, you can iterate over a Python list much like a C++ list, throw exceptions between C++ and Python, build Python lists as easily as (oversimplified example):

The equivalent Python/C API code is longer, filled with dangerous reference count juggling, contains a lot of manual error checking that's often ignored, and is a lot uglier.

With regards to the article above, it's good to see things like this written. I had real trouble getting started with embedding Python, and I think this is a pretty well-written intro. I do take issue with one point, though, and that's duplicating the environment. Cloning the main dict does not provide separate program environments - far from it. It only gives them different global namespaces. Interpreter-wide state changes still affect both programs. For example, if one program imports a module, the other one can see it in sys.modules; if one program changes a setting in a module, the other one is affected. Locale settings come to mind. Most well-designed modules will be fine, but you'll run into the odd one that thinks module-wide globals are a good idea and consequently chokes.

Unfortunately, the alternative is to use sub-interpreters. Sub-interpreters are a less than well documented part of Python's API, and as they rely on thread local storage they're hopeless in single threaded programs. They can be made to work (see Scribus, for example) but it's not overly safe, and will abort if you use a Python debug build.

When you combine this with a GUI toolkit like Qt3 that only permits GUI operations from the main thread (thankfully, the limitation is lifted in Qt4), this becomes very frustrating. If you're not stuck with this limitation, you can just spawn off a thread for your users' scripts, and should consider designing your interface that way right from the start.
