The PEAK Rules Core Framework

NOTE: This document is for people who are extending the core framework in some
way, e.g. adding custom action types to specialize method combination, or
creating new kinds of engines or conditions. It isn't intended to be user
documentation for the built-in rule facility.

The PEAK-Rules core framework provides a generic API for creating and
manipulating generic functions, with a high degree of extensibility. Almost
any concept implemented by the core can be replaced by a third-party
implementation on a function-by-function basis. In this way, an individual
library or application can provide for its specific needs, without needing to
reinvent the entire spectrum of tools.

The main concepts implemented by the core are:

Generic functions

A function with a "dispatching" add-on, that manages a collection of
methods, where each method has a rule to determine its applicability.
When a generic function is invoked, a combination of the methods that
apply to the invocation (as determined by their rules) is invoked.

Method combination

The ability to compose a set of methods into a single function, with their
precedence determined by the type of method and the logical implication
relationships of their applicability rules.
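
The relationship between these two concepts can be illustrated with a
deliberately naive sketch. None of these names come from PEAK-Rules, and
unlike a real method combination it simply dispatches to the first matching
method instead of combining all applicable ones:

```python
# A toy "generic function": each method carries a rule (here, just a
# callable test), and invocation runs the first method whose rule applies.
class SimpleGeneric:
    def __init__(self):
        self.cases = []          # list of (rule, method) pairs

    def when(self, rule):
        """Register a method that applies when rule(*args) is true."""
        def register(method):
            self.cases.append((rule, method))
            return method
        return register

    def __call__(self, *args):
        for rule, method in self.cases:
            if rule(*args):
                return method(*args)
        raise TypeError("no applicable methods")
```

The real framework goes much further: applicable methods are combined
according to their action types and precedence, rather than picked
first-match-wins.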

The first versions will focus on developing a core framework for extensible
functions that is itself implemented using extensible functions. This
self-bootstrapping core will implement a type-tuple-caching engine using
relatively primitive operations, and will then have a method combination
system built on that. The core will thus be capable of implementing generic
functions with multiple dispatch based on positional argument types, and the
decorator APIs will be built around that.

The next phase of development will add alternative engines that are oriented
towards predicate dispatch and more sophisticated ways of specifying regular
class dispatch (e.g. being able to say things like isinstance(x, Foo) or
isinstance(y, Foo)). To some extent this will be porting the expression
machinery from RuleDispatch to work on the new core, but in a lot of ways it'll
just be redone from scratch. Having type-based multiple dispatch available to
implement the framework should enable a significant reduction in the complexity
of the resulting library.

An additional phase will focus on adding new features not possible with the
RuleDispatch engine, such as "predicate functions" (a kind of dynamic macro
or rule expansion feature), "classifiers" (a way of priority-sequencing a
set of alternative criteria) and others.

Finally, specialty features such as index customization, thread-safety,
event-oriented rulesets, and such will be introduced.

(Note: Criteria, signatures, and predicates are described and tested in detail
by the Criteria.txt document.)

Criterion

A criterion is a symbolic representation of a test that returns a boolean
for a given value, for example by testing its type. The simplest criterion
is just a class or type object, meaning that the value should be of that
type.

Signature

A condition expressed purely in terms of simple tests "and"ed together,
using no "or" operations of any kind. A signature specifies what argument
expressions are tested, and which criteria should be applied to them.
The simplest possible signature is a tuple of criteria, with each criterion
applied to the corresponding argument in an argument tuple. (An empty tuple
matches any possible input.) Signatures are also described in more detail
in the Criteria.txt document.

Predicate

One or more signatures "or"ed together. (Note that this means that
signatures are predicates, but predicates are not necessarily signatures.)
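
To make these two definitions concrete, here is an illustrative sketch (not
the real PEAK-Rules API) of testing a tuple-of-types signature, and a
predicate made of several such signatures, against an argument tuple:

```python
# Illustrative only: a signature as a tuple of type criteria, applied
# positionally, and a predicate as signatures "or"ed together.
def signature_matches(sig, args):
    # each criterion tests the corresponding argument; the empty
    # tuple therefore matches any possible input
    return len(args) >= len(sig) and all(
        isinstance(arg, cls) for cls, arg in zip(sig, args))

def predicate_matches(pred, args):
    # a predicate matches if any of its signatures matches
    return any(signature_matches(sig, args) for sig in pred)
```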

Rule

A combination of a predicate, an action type, and a body (usually a
function.) The existence of a rule implies the existence of one or more
actions of the given action type and body, one for each possible signature
that could match the predicate.

Action Type

A factory that can produce an Action when supplied with a signature, body,
and sequence. (Examples in peak.rules will include the MethodList,
MethodChain, Around, Before, and After types.)

Action

An object representing the behavior of a single invocation of a generic
function. Action objects may be combined (using a generic function of
the form combine_actions(a1,a2)) to create combined methods à la
RuleDispatch. Each action comprises at least a signature and a body, but
actions of more complex types may include other information.

Rule Set

A collection of rules, combined with some policy information (such
as the default action type) and optional optimization hints. A rule
set does not directly implement dispatching. Instead, rule engines
subscribe to rule sets, and the rule set informs them when actions are
added and removed due to changes in the rule set's rules.

This would almost be better named an "action set" than a "rule set",
in that it's (virtually speaking) a collection of actions rather than
rules. However, you do add and remove entries from it by specifying
rules; the actions are merely implied by the rules.

Generic functions will have a __rules__ attribute that points to their
rule set, so that the various decorators can add rules to them. You
will probably be able to subclass the base RuleSet class or create
alternate implementations, as might be useful for supporting persistent or
database-stored rules. (Although you'd probably also need a custom rule
engine for that.)

Rule Engine

An object that manages the dispatching of a given rule set to implement
a specific generic function. Generic functions will have an __engine__
attribute that points to their current engine. Engines will be responsible
for doing any indexing, caching, or code generation that may be required to
implement the resulting generic function.

The default engine will implement simple type-based multiple dispatch with
type-tuple caching. For simple generic functions this is likely to be
faster than almost anything else, even C-assisted RuleDispatch. It also
should have far less definition-time overhead than a RuleDispatch-style
engine would.
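
The type-tuple caching technique mentioned above can be sketched as follows.
This is an illustrative stand-in, not the actual engine: dispatch is keyed on
the tuple of argument types, so the (slow) rule lookup only runs on a cache
miss:

```python
# Sketch of type-tuple caching: repeated calls with the same argument
# types skip the rule lookup entirely.
def type_tuple_cached(lookup):
    cache = {}
    def dispatch(*args):
        key = tuple(type(arg) for arg in args)   # e.g. (int, str)
        try:
            handler = cache[key]
        except KeyError:
            # slow path: consult the rules, then remember the result
            handler = cache[key] = lookup(key)
        return handler(*args)
    return dispatch
```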

Engines will be pluggable, and in fact there will be a mechanism to allow
engines to be switched at runtime when certain conditions are met. For
example, the default engine could switch automatically to a
RuleDispatch-like engine if a rule is added whose conditions can't be
translated to simple type dispatching. There will also be some type of
hint system to allow users to suggest what kind of engine implementation
or special indexing might be appropriate for a particular function.

Method combination is performed using the combine_actions() API function:

>>> from peak.rules.core import combine_actions

combine_actions() takes two arguments: a pair of actions. They are
compared using the overrides() generic function to see if one is more
specific than the other. If so, the more specific action's override()
method is called, passing in the less-specific action. If neither action
can override the other, the first action's merge() method is called,
passing in the other action.

In either case, the result of calling the merge() or override() method
is returned.
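
That algorithm can be sketched as follows. Note that the real
combine_actions() takes only the two actions and consults the overrides()
generic function itself; this self-contained sketch takes the comparison as
an explicit parameter:

```python
# Sketch of the combination algorithm described above; `overrides`
# stands in for the generic function of the same name.
def combine_actions(a1, a2, overrides):
    if overrides(a1, a2):
        return a1.override(a2)       # a1 is more specific
    elif overrides(a2, a1):
        return a2.override(a1)       # a2 is more specific
    return a1.merge(a2)              # neither overrides: merge
```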

So, to define a custom action type for method combination, it needs to
implement merge() and override() methods, and it must be comparable to
other method types via the overrides() generic function.

The implies() function is used to determine the logical implication
relationship between two signatures. A signature s1 implies a
signature s2 if s2 will always match an invocation matched by s1.
(Action implication is based on signature implication; see the Action Types
section below for more details.)

For the simplest signatures (tuples of types), this corresponds to a subclass
relationship between the elements of the tuples:
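
An illustrative stand-in for that tuple case (the real implies() is a
generic function in peak.rules.core, and handles far more than plain type
tuples):

```python
# A tuple signature s1 implies s2 when each of its types is a subclass
# of the corresponding type in s2, and s2 is no longer than s1 -- a
# shorter tuple constrains fewer arguments, so () is implied by anything.
def tuple_implies(s1, s2):
    return len(s2) <= len(s1) and all(
        issubclass(c1, c2) for c1, c2 in zip(s1, s2))
```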

And as a special case of type implication, any classic class implies both
object and InstanceType, but cannot imply any other new-style classes.
This special-casing is used to work around the fact that isinstance() will
say that a classic class instance is an instance of both object and
InstanceType, but issubclass() doesn't agree. PEAK-Rules wants to
conform with isinstance() here:

Type or class objects are used to represent "this class or a subclass", but
istype() objects are used to represent either "this exact type" (using
istype(aType, True)), or "anything but this exact type" (using
istype(aType, False)). So their implication rules are different.

Internally, PEAK-Rules uses istype objects to represent a call signature
being matched, because the argument being tested is of some exact type. Then,
any rule signatures that are implied by the calling signature are considered
"applicable".

So, istype(aType,True) (the default) must always imply the same type or
class, or any parent class thereof:

An exact type will also imply any exclusion of a different exact type:

>>> implies(istype(int), istype(str, False))
True

In other words, if type(x) is int, that implies type(x) is not str.
But of course, that doesn't work the other way around:

>>> implies(istype(str, False), istype(int))
False
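
These rules can be modeled with a small stand-in (the real istype lives in
peak.rules; this namedtuple version is purely illustrative, and covers only
the cases shown above):

```python
from collections import namedtuple

# match=True means "exactly this type"; match=False means
# "anything but this exact type".
istype = namedtuple("istype", "type match")

def istype_implies(c1, c2):
    if not (isinstance(c1, istype) and c1.match):
        return False                      # only exact types imply much
    if isinstance(c2, istype):
        if c2.match:
            return c1.type is c2.type     # same exact type
        return c1.type is not c2.type     # excludes a *different* type
    return issubclass(c1.type, c2)        # c2 a class: the type or a parent
```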

These implication rules are sufficient to bootstrap the basic types-only
rules engine; additional rules for istype behavior are explained in
Criteria.txt to show intersection of criteria such as istype, and other
more-advanced criteria manipulation used in the full predicate rules engine.

The default action type (for rules with no specified action type) is
Method. A Method combines a body, a signature, a definition-order
serial number, and an optional "chained" action that it can fall back to. All
of these values are optional, except for the body:

Around methods are identical to normal Method objects, except that
whenever an Around method and a regular Method are combined, the
Around method overrides the regular one. This forces all the regular
methods to be further down the chain than all of the "around" methods.

You will normally only want to use Around methods with functions that have
a next_method parameter, since their purpose is to wrap "around" the
calling of lower-precedence methods. If you don't do this, then the method
chain will always end at that Around instance:

The simplest possible action type is NoApplicableMethods, meaning that
there is no applicable action. When it's overridden by another method, it
will of course get chained to the other method's tail (if appropriate).

Notice that to create a MethodList with only one method, you must use the
make() classmethod. Method also has this classmethod, but it has the
same signature as the main constructor. The main constructor for
MethodList has a different signature for its internal use.

The combination of before, after, primary, and around methods is as shown:

This is because our method type redefined __call__() but did not include
its own compiled() method.

The compiled() method of a Method subclass takes an Engine as its
argument, and should return a callable to be used in place of directly calling
the method itself. It should pass any objects it plans to call (e.g. its tail
or individual submethods) through compile_method(ob,engine), in order to
ensure that those objects are also compiled:

As you can see, compile_method() invokes our new compiled() method,
which ends up returning the original function. And, if we don't define a
__call__() method of our own, we end up inheriting one from Method
that compiles the method and invokes it for us:

>>> m(1)
compiling
42

However, if we use this method type in a generic function, then the generic
function will cache the compiled version of its methods so they don't have to
be compiled every time they're called:

>>> f = func_with(MyMethod2)
>>> f(1)
compiling
42
>>> f(1)
42

(Note: what caching is done, and when the cache is reset, is heavily
dependent on the specific dispatching engine in use; it can also be the case
that a similar-looking method object will be compiled more than once, because
in each case it has a different tail or match signature.)

Now, Method subclasses do NOT inherit their compiled() method from
their base classes, unless they are also inheriting __call__. This
prevents you from ending up with strangely-broken code in the event
you redefine __call__(), but forget to redefine compiled():

As you can see, the new subclass works, but doesn't get compiled. So, you
can do your initial debugging and development without compilation by defining
__call__(), and then switch over to compiled() once you're happy with
your prototype.

Now, let's define a method type that works like MyMethod3, but is
compiled using a template:

So far, it looks a little like our earlier compilation. We compile the
body like before, but then, what's that apply_template stuff?

The apply_template() method of engine objects takes a "template" function
and one or more arguments representing values that need to be accessible in
our compiled function. Let's go ahead and define noisy_template now:
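
A plausible sketch of such a template, following the @template_function
conventions described below (the exact body here is an assumption, not the
actual PEAK-Rules definition):

```python
def noisy_template(__func, __body):
    # Must return a constant string: $args expands to the generic
    # function's calling signature, and the __-prefixed parameter names
    # avoid colliding with the function's real argument names.
    return """
    print("calling!")
    return __body($args)
    """
```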

Template functions are defined using the conventions of DecoratorTools's
@template_function decorator, only without the decorator. The first
positional argument is the generic function the compiled method is being
used with, and any others are up to you.

Any use of $args is replaced with the correct calling signature for
invoking a method of the corresponding generic function, and you must
name all of your arguments and local variables such that they won't conflict
with any actual argument names. (In practice, this means you want to use
__-prefixed names, which is why we're defining the template outside
the class, to prevent Python from mangling our parameter names and messing up
the template.)

Note, too, that all the other caveats regarding @template_function
functions apply, including the fact that the function cannot actually use any
of its arguments (or any variables from its containing scope) to determine the
return string -- it must simply return a constant string. (It can, however,
refer to globals in its defining module, as long as they're not shadowed by
the generic function's argument names.)

This will only work, however, if all the arguments passed to apply_template
are usable as dictionary keys. So, it's best to use tuples instead of lists,
frozensets instead of sets, etc. (Also, this means you can't pass in keyword
arguments.)

Decorators can accept an entry point string in place of an actual function,
provided that the PEAK "Importing" package (peak.util.imports) is
available. In that case, the registration is deferred until the named module
is imported:

If the named module is already imported, the registration takes place
immediately, otherwise it is deferred until the named module is actually
imported.

This allows you to provide optional integration with modules that might or
might not be used by a given application, without creating a dependency between
your code and that package.

Note, however, that if the named function doesn't exist when the module is
imported, then an attribute error will occur at import time. The syntax of
the target name is lightly checked at call time, however:

(This is just a sanity check, to make sure you didn't accidentally put
some other string first (like the criteria). It won't detect a string
that points to a non-existent module, or various other possible errors, so you
should still verify that your code gets run when the target module is imported
and the relevant conditions apply.)

Rules are currently implemented as 3-item tuples comprising a predicate, a
body, and an action type that will be used as a factory to create the actions
for the rule. At minimum, all a rule needs is a body, so there's a convenience
constructor (Rule) that allows you to create a rule with defaults. The
predicate and action type default to () and None if not specified:

An action type of None (or any false value) means that the ruleset should
decide what action type to use. Actually, it can decide anyway, since the
rule set is always responsible for creating action objects; the rule's action
type is really just advisory to begin with.

Observers can be added with the subscribe() and unsubscribe() methods.
Observers have their actions_changed method called with an "added" set
and a "removed" set of action definitions. (An action definition is a
tuple of the form (actiontype,body,signature,serial), and can thus
be used to create action objects.)

When an observer is first added, it's notified of the current contents of the
RuleSet, if any. As a result, observers don't need to do any special case
handling for their initial setup. Everything can be handled via the normal
operation of the actions_changed() method:
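
For example, a minimal observer (a hypothetical class, following the
interface described above) might simply mirror the current action set:

```python
# Action definitions are (actiontype, body, signature, serial) tuples.
class MirroringObserver:
    def __init__(self):
        self.actions = set()

    def actions_changed(self, added, removed):
        # called once on subscribe with the RuleSet's current contents,
        # and again whenever rules are added or removed afterwards
        self.actions -= removed
        self.actions |= added
```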