IronPython: Reusing Import Symbols to Avoid Performance Hits

I was struggling with IronPython today as I stumbled over a pretty annoying setback when it came to dynamic compilation of scripts that involved namespace imports.

Let's start with an artificial sample: a snippet that just increments a variable by one. As expected, it executes blazingly fast: once compiled, it runs a few thousand times in under a millisecond.

A second snippet performs basically the same logic (incrementing the ‘value’ variable), but it also contains an import for the System.Xml namespace. The import is not necessary, but it still needs to be compiled. Executing this (compiled!) script 4000 times takes over 5 seconds!
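The two scripts themselves are not preserved on this page. As a rough stand-in, the sketch below reproduces the shape of the experiment in plain CPython rather than through the IronPython hosting API. Both script texts, the `xml.dom` import (a substitute for `System.Xml`, which only exists under IronPython) and the `time_script` helper are assumptions made for illustration. CPython caches imported modules, so the timings will not resemble the numbers quoted above; the point is only the pattern of compiling once and executing many times.

```python
import time

# Assumed script texts; the originals were not preserved.
PLAIN_SCRIPT = "value = value + 1"
IMPORT_SCRIPT = "from xml.dom import *\nvalue = value + 1"


def time_script(source, iterations=4000):
    """Compile `source` once, run it `iterations` times, return (value, seconds)."""
    code = compile(source, "<script>", "exec")  # compiled a single time
    scope = {"value": 0}
    start = time.perf_counter()
    for _ in range(iterations):
        exec(code, scope)  # an import statement in the script runs on every pass
    return scope["value"], time.perf_counter() - start


if __name__ == "__main__":
    for name, source in (("plain", PLAIN_SCRIPT), ("import", IMPORT_SCRIPT)):
        value, seconds = time_script(source)
        print(f"{name}: value={value}, {seconds * 1000:.2f} ms")
```

Even though the script is compiled only once, the import statement is still executed on every iteration, which is where the extra cost would come from.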

However, if you are lucky (as I was), you have the opportunity to separate namespace imports from business logic, so you basically end up with two scripts:

The import script, which is executed only once to get the imported symbols

The statement above executes the import script with an empty SymbolDictionary. If you check this dictionary after execution, you can see that it now contains symbols for all types of the imported namespace:

However, this code is not thread-safe, because it causes different scopes to share not only the import symbols, but all variables, through the symbol dictionary. Look at the code below, which creates two script scope instances that are initialized with individual values:

//create a new scope with the symbols of the import scope
ScriptScope run1 = engine.CreateScope(importSymbols);
//start with a value of 10
run1.SetVariable("value", 10);
workerScript.Execute(run1);
//create a second scope with the symbols of the import scope
ScriptScope run2 = engine.CreateScope(importSymbols);
//start with a value of 20
run2.SetVariable("value", 20);
workerScript.Execute(run2);
//both scopes actually share the same variable
Console.Out.WriteLine(run1.GetVariable<int>("value"));
Console.Out.WriteLine(run2.GetVariable<int>("value"));

The console outputs the same value both times – because both scopes operate on the same variable:

21
21
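The same pitfall can be reproduced outside the hosting API. The sketch below is a plain-CPython analogue, not IronPython hosting code: a single `dict` stands in for the shared symbol dictionary, `xml.dom` stands in for `System.Xml`, and `WORKER`, `import_symbols` and `run_in_shared_scope` are hypothetical names chosen for this example.

```python
# Assumed worker script; it mirrors the increment logic described above.
WORKER = compile("value = value + 1", "<worker>", "exec")

# One dictionary plays the role of the shared symbol dictionary.
import_symbols = {}
exec("from xml.dom import *", import_symbols)  # run the import script once


def run_in_shared_scope(start):
    """Run the worker directly against the shared dictionary."""
    scope = import_symbols   # no copy: every run uses the same object
    scope["value"] = start
    exec(WORKER, scope)
    return scope


run1 = run_in_shared_scope(10)
run2 = run_in_shared_scope(20)

print(run1["value"])  # 21, because run2 overwrote the shared variable
print(run2["value"])  # 21
```

Because `run1` and `run2` are the same object, there is only ever one `value`, and whichever run executes last wins.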

So basically, we have two requirements:

Store import symbols (or any other shared variables) in a reusable cache.

Store worker variables that belong to a given script scope in another dictionary.

We can do this by subclassing the CustomSymbolDictionary class (which derives from BaseSymbolDictionary) in the Microsoft.Scripting.Runtime namespace. It took me ages to find a solution, but the implementation was a breeze:

using System;
using System.Linq;
using Microsoft.Scripting;
using Microsoft.Scripting.Hosting;
using Microsoft.Scripting.Runtime;

/// <summary>
/// A symbol dictionary that provides a fixed set of
/// symbols through a <see cref="SharedScope"/>. As new variables
/// are not added to the <see cref="SharedScope"/>, these cached
/// symbols can be reused across different scopes.
/// </summary>
public class SharedSymbolDictionary : CustomSymbolDictionary
{
    /// <summary>
    /// A script scope that provides a reusable set of symbols.
    /// Any variables that are being created by the <see cref="ScriptScope"/>
    /// that owns this cache are not stored within <see cref="SharedScope"/>,
    /// but the parent scope's own symbol dictionary.
    /// </summary>
    public ScriptScope SharedScope { get; private set; }

    /// <summary>
    /// Creates a new cache instance
    /// </summary>
    /// <param name="sharedScope">A reusable <see cref="ScriptScope"/> that provides
    /// a set of symbols that are supposed to be used across several scopes.</param>
    /// <exception cref="ArgumentNullException">If <paramref name="sharedScope"/>
    /// is a null reference.</exception>
    public SharedSymbolDictionary(ScriptScope sharedScope)
    {
        if (sharedScope == null) throw new ArgumentNullException("sharedScope");
        SharedScope = sharedScope;
    }

    /// <summary>
    /// Invoked if a given variable or symbol is being requested. This method
    /// tries to find the requested item in the underlying <see cref="SharedScope"/>.
    /// </summary>
    /// <param name="key"></param>
    /// <param name="value"></param>
    /// <returns>True if the <see cref="SharedScope"/> provides the requested
    /// symbol.</returns>
    protected override bool TryGetExtraValue(SymbolId key, out object value)
    {
        //return the key from the base scope, if possible.
        lock (SharedScope)
        {
            return SharedScope.TryGetVariable(SymbolTable.IdToString(key), out value);
        }
    }

    /// <summary>
    /// Gets a list of the extra keys that are cached by the optimized
    /// implementation of the module.
    /// </summary>
    public override SymbolId[] GetExtraKeys()
    {
        lock (SharedScope)
        {
            return SharedScope.GetItems().Select(pair =>
                SymbolTable.StringToId(pair.Key)).ToArray();
        }
    }

    /// <summary>
    /// Tries to set the extra value and return true if the specified key
    /// was found in the list of extra values.<br/>
    /// Any attempts to store extra values are being denied, which causes
    /// them to be stored in the scope itself rather than the local
    /// <see cref="SharedScope"/>. This ensures that runtime variables are
    /// not shared between different instances of the cache.
    /// </summary>
    /// <param name="key">The key that is used to store the submitted
    /// value.</param>
    /// <param name="value">Value to be cached.</param>
    /// <returns>Always false because runtime symbols are not supposed
    /// to be stored within the cache. This causes the value to be stored
    /// within the internal dictionary of the base class.</returns>
    protected override bool TrySetExtraValue(SymbolId key, object value)
    {
        return false;
    }
}

With this implementation, we can change our snippet accordingly:

//create a new scope with the symbols of the import scope
SharedSymbolDictionary cache = new SharedSymbolDictionary(importScope);
ScriptScope run1 = engine.CreateScope(cache);
//start with a value of 10
run1.SetVariable("value", 10);
workerScript.Execute(run1);
//create a second scope with the symbols of the import scope
cache = new SharedSymbolDictionary(importScope);
ScriptScope run2 = engine.CreateScope(cache);
//start with a value of 20
run2.SetVariable("value", 20);
workerScript.Execute(run2);
//each scope now holds its own copy of the variable
Console.Out.WriteLine(run1.GetVariable<int>("value"));
Console.Out.WriteLine(run2.GetVariable<int>("value"));

Because the SharedSymbolDictionary class does not store the value variable in the shared importScope, the variables of each worker scope can be set independently, which produces the correct output:

11
21
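The underlying idea (look symbols up in a shared layer, but write variables to a private one) is not specific to the DLR. As a hedged illustration, the plain-CPython sketch below uses the two-dictionary form of `exec`, where names are read from the locals first and then from the globals, while module-level assignments go to the locals only. This is an analogue of the SharedSymbolDictionary, not a port of it; `xml.dom` stands in for `System.Xml`, and `WORKER`, `import_symbols` and `run_in_layered_scope` are hypothetical names.

```python
# Assumed worker script; it mirrors the increment logic described above.
WORKER = compile("value = value + 1", "<worker>", "exec")

# Shared layer: filled once by the import script, then only read.
import_symbols = {}
exec("from xml.dom import *", import_symbols)


def run_in_layered_scope(start):
    """Run the worker with shared imports but private variables."""
    local_vars = {"value": start}             # private layer, one per run
    exec(WORKER, import_symbols, local_vars)  # reads fall through, writes stay local
    return local_vars


run1 = run_in_layered_scope(10)
run2 = run_in_layered_scope(20)

print(run1["value"])  # 11
print(run2["value"])  # 21
```

One caveat of this analogue: functions defined inside the worker script resolve free names through the globals only, so they would not see variables that live in the private dictionary.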

And the performance gain is remarkable: Execution time is once again down to 17 milliseconds for 10’000 iterations 🙂

Have you tried removing the “import *”? In the Python world it's considered bad practice for the performance reasons you found above. Being mostly a Python user who dabbles in C#, I do realize that in essence “import *” is all C# seems to allow you to do, but Python is more flexible.

@Joseph Lisee
(Comment was lost – just picked it up from a backup machine – sorry).

The idea behind the whole thing was the ability to dynamically parse Python scripts, so in order to properly compile them, the import was necessary. I haven't measured whether restricting the script to certain namespaces improves performance, though.

Hi,
this piece of code has been really useful. However, I’ve recently started using extension methods with IPy 2.6.1 and ExtensionType attribute, and while they can be used from IronPython, they’re only present on the first run, when the SharedSymbolDictionary is actually constructed. On later calls, where I load it from cache, all the extension methods disappear.