The memoize decorator (http://code.activestate.com/recipes/52201/)
caches all the inputs and outputs of a function call in memory. It can
thus avoid running the same function twice, with a very small
overhead. However, it compares the input objects with those in the cache
on each call. As a result, for big objects there is a huge overhead.
Moreover, this approach does not work with numpy arrays, or other
objects subject to non-significant fluctuations. Finally, using memoize
with large objects will consume all the memory, whereas with Memory,
objects are persisted to disk, using a persister optimized for speed and
memory usage (joblib.dump()).

In short, memoize is best suited for functions with “small” input and
output objects, whereas Memory is best suited for functions with complex
input and output objects, and aggressive persistence to disk.

The original motivation behind the Memory context was to be able to
apply a memoize-like pattern to numpy arrays. Memory uses fast
cryptographic hashing of the input arguments to check whether they have
already been computed.
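As a minimal sketch of this pattern (the temporary cache directory and the `square` function are arbitrary choices for this example):

```python
import shutil
import tempfile

import numpy as np
from joblib import Memory

cachedir = tempfile.mkdtemp()  # throwaway cache location for the example
memory = Memory(cachedir, verbose=0)

@memory.cache
def square(x):
    # Runs only on a cache miss; a later call with an array that
    # hashes identically is loaded from disk instead of recomputed.
    return x ** 2

a = np.vander(np.arange(3)).astype(float)
first = square(a)    # computed
second = square(a)   # retrieved from the cache

shutil.rmtree(cachedir)  # clean up the example cache
```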

We need to close the memmap file to avoid file locking on Windows; closing
numpy.memmap objects is done with del, which flushes changes to the disk

>>> del res

Note

If the memory mapping mode used was 'r', as in the above example, the
array will be read-only, and it will be impossible to modify it in
place.

On the other hand, using 'r+' or 'w+' will enable modification of the
array, but will propagate these modifications to the disk, which will
corrupt the cache. If you want to modify the array in memory, we
suggest you use the 'c' mode: copy on write.

Warning

Because in the first run the array is a plain ndarray, while in
subsequent runs it is a memmap, using Memory can have side effects,
especially with mmap_mode='r': the array is writable in the first run,
but not in the following ones.

In some cases, it can be useful to get a reference to the cached
result, instead of having the result itself. A typical example of this
is when a lot of large numpy arrays must be dispatched across several
workers: instead of sending the data themselves over the network, send
a reference to the joblib cache, and let the workers read the data
from a network filesystem, potentially taking advantage of some
system-level caching too.

Getting a reference to the cache can be done using the
call_and_shelve method on the wrapped function:
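For instance, a minimal sketch (the function `g` and the temporary cache directory are example choices):

```python
import shutil
import tempfile

import numpy as np
from joblib import Memory

cachedir = tempfile.mkdtemp()
memory = Memory(cachedir, verbose=0)

@memory.cache
def g(x):
    return x * 2

# Returns a MemorizedResult reference instead of the value itself.
result = g.call_and_shelve(np.arange(4.0))
value = result.get()     # reads the cached value back from disk

shutil.rmtree(cachedir)
```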

Once computed, the output of g is stored on disk, and deleted from
memory. Reading the associated value can then be performed with the
get method:

>>> result.get()
array([ 0.08, 0.77, 0.77, 0.08])

The cache for this particular value can be cleared using the clear
method. Its invocation causes the stored value to be erased from disk.
Any subsequent call to get will cause a KeyError exception to be
raised:
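A sketch of this behaviour, using a throwaway cache and an example function:

```python
import shutil
import tempfile

from joblib import Memory

cachedir = tempfile.mkdtemp()
memory = Memory(cachedir, verbose=0)

@memory.cache
def g(x):
    return x * 2

result = g.call_and_shelve(21)
result.clear()           # erase the stored value from disk
try:
    result.get()         # the value is gone: raises KeyError
    erased = False
except KeyError:
    erased = True

shutil.rmtree(cachedir)
```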

A MemorizedResult instance contains all that is necessary to read
the cached value. It can be pickled for transmission or storage, and
the printed representation can even be copy-pasted to a different
Python interpreter.

Shelving when cache is disabled

In the case where caching is disabled (e.g.
Memory(cachedir=None)), the call_and_shelve method returns a
NotMemorizedResult instance, which stores the full function
output instead of just a reference (since there is nothing to
point to). Everything above remains valid, except for the
copy-pasting feature.
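A minimal sketch of this case (`h` is an example function):

```python
from joblib import Memory

memory = Memory(None, verbose=0)   # caching disabled

@memory.cache
def h(x):
    return x + 1

res = h.call_and_shelve(1)   # a NotMemorizedResult holding the full output
kind = type(res).__name__
value = res.get()
```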

Across sessions, a function's cache is identified by the function's
name. Thus, if you assign the same name to different functions, their
caches will override each other (you have 'name collisions'), and you
will get unwanted re-runs:
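A sketch of such a collision, assuming a throwaway cache directory; the recovery behaviour shown in the comments (invalidate and re-run on a detected code mismatch) is how recent joblib versions handle it:

```python
import shutil
import tempfile

from joblib import Memory

cachedir = tempfile.mkdtemp()
memory = Memory(cachedir, verbose=0)

def func(x):
    return x + 1

plus = memory.cache(func)

def func(x):                 # a different function with the same name
    return x - 1

minus = memory.cache(func)

v1 = plus(1)    # computed and cached under the name 'func'
v2 = minus(1)   # code mismatch detected: the 'func' cache is invalidated
                # (with a collision warning) and the function re-runs

shutil.rmtree(cachedir)
```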

Memory cannot be used on some complex objects, e.g. a callable
object with a __call__ method.

However, it works on numpy ufuncs:

>>> sin = memory.cache(np.sin)
>>> print(sin(0))
0.0

caching methods: you cannot decorate a method at class definition
time, because when the class is instantiated, the first argument (self)
is bound, and no longer accessible to the Memory object. The following
code won't work:
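A common workaround is to cache a module-level pure function and call it from the method; a minimal sketch (all names here are example choices):

```python
import shutil
import tempfile

from joblib import Memory

cachedir = tempfile.mkdtemp()
memory = Memory(cachedir, verbose=0)

# Cache a plain function instead of the method itself: its arguments
# are hashable values, not a bound instance.
@memory.cache
def _compute(x, y):
    return x + y

class Foo:
    def __init__(self, x):
        self.x = x

    def compute(self, y):
        # delegate to the cached pure function
        return _compute(self.x, y)

v = Foo(1).compute(2)

shutil.rmtree(cachedir)
```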

Decorates the given function func to only compute its return
value for input arguments not cached on disk.

Parameters:

func: callable, optional

The function to be decorated

ignore: list of strings

A list of argument names to ignore in the hashing

verbose: integer, optional

The verbosity mode of the function. By default that
of the memory object is used.

mmap_mode: {None, 'r+', 'r', 'w+', 'c'}, optional

The memmapping mode used when loading numpy arrays
from the cache. See numpy.load for the meaning of the
arguments. By default, that of the memory object is used.

Returns:

decorated_func: MemorizedFunc object

The returned object is a MemorizedFunc object, that is
callable (behaves like a function), but offers extra
methods for cache lookup and management. See the
documentation for joblib.memory.MemorizedFunc.
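For instance, a sketch of the ignore parameter, using a throwaway cache (`f` and its `verbose` argument are example choices):

```python
import shutil
import tempfile

from joblib import Memory

cachedir = tempfile.mkdtemp()
memory = Memory(cachedir, verbose=0)

runs = []

@memory.cache(ignore=['verbose'])
def f(x, verbose=False):
    if verbose:
        print('computing f(%s)' % x)
    runs.append(x)   # records actual executions, not cache hits
    return x * x

v1 = f(3, verbose=True)
v2 = f(3, verbose=False)   # 'verbose' is ignored in the hash: cache hit

shutil.rmtree(cachedir)
```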