This recipe presents a way to track the cache keys related to a particular region,
for the purposes of invalidating a series of keys that relate to a particular id.

Three cached functions, user_fn_one(), user_fn_two(), and user_fn_three(),
each perform a different operation based on a user_id integer value. The
region applied to cache them uses a custom key generator which tracks each cache
key generated, pulling out the integer id and replacing it with a template.

When all three functions have been called, the key generator is now aware of
these three keys: user_fn_one_%d, user_fn_two_%d, and
user_fn_three_%d. The invalidate_user_id() function then knows that
for a particular user_id, it needs to hit all three of those keys
in order to invalidate everything having to do with that id.

This recipe presents one technique of optimistically pushing new data
into the cache when an update is sent to a database.

Using SQLAlchemy for database querying, suppose a simple cache-decorated
function returns the results of a database query:

@region.cache_on_arguments()
def get_some_data(argument):
    # query database to get data
    data = Session().query(DBClass).filter(DBClass.argument == argument).all()
    return data

We would like this particular function to be re-queried when the data
has changed. We could call get_some_data.invalidate(argument, hard=False)
at the point at which the data changes, however this only
leads to the invalidation of the old value; a new value is not generated until
the next call, and also means at least one client has to block while the
new value is generated. We could also call
get_some_data.refresh(argument), which would perform the data refresh
at that moment, but then the writer is delayed by the re-query.

A third variant is to instead offload the work of refreshing for this query
into a background thread or process. This can be achieved using
a system such as the CacheRegion.async_creation_runner.
However, an expedient approach for smaller use cases is to link cache refresh
operations to the ORM session’s commit, as below:

from threading import Thread

from sqlalchemy import event
from sqlalchemy.orm import Session

def cache_refresh(session, refresher, *args, **kwargs):
    """
    Refresh the functions cache data in a new thread. Starts refreshing only
    after the session was committed so all database data is available.
    """
    assert isinstance(session, Session), \
        "Need a session, not a sessionmaker or scoped_session"

    @event.listens_for(session, "after_commit")
    def do_refresh(session):
        t = Thread(target=refresher, args=args, kwargs=kwargs)
        t.daemon = True
        t.start()

Within a sequence of data persistence, cache_refresh can be called
given a particular SQLAlchemy Session and a callable to do the work:

def add_new_data(session, argument):
    # add some data
    session.add(something_new(argument))

    # add a hook to refresh after the Session is committed.
    cache_refresh(session, get_some_data.refresh, argument)

Note that the event to refresh the data is associated with the Session
being used for persistence; however, the actual refresh operation is called
with a different Session, typically one that is local to the refresh
operation, either through a thread-local registry or via direct instantiation.

If your redis instance contains other keys besides the ones set by
dogpile.cache, it is a good idea to uniquely prefix all dogpile.cache
keys, to avoid potential collisions with keys set by your own code. This can
easily be done using a key mangler function:

Under the hood, dogpile.cache wraps cached data in an instance of
dogpile.cache.api.CachedValue and then pickles that data for storage
along with some bookkeeping metadata. If you implement a ProxyBackend to
encode/decode data, that transformation will happen on the pre-pickled data;
dogpile does not store the data ‘raw’ and will still pass a pickled payload
to the backend. This behavior can negate the hoped-for improvements of some
encoding schemes.

Since dogpile is managing cached data, you may be concerned with the size of
your payloads. A possible method of helping minimize payloads is to use a
ProxyBackend to recode the data on-the-fly or otherwise transform data as it
enters or leaves persistent storage.

In the example below, we define two classes to implement msgpack encoding. Msgpack
(http://msgpack.org/) is a serialization format that works exceptionally well
with JSON-like data and can serialize nested dicts into a much smaller payload
than Python’s own pickle. _EncodedProxy is our base class
for building data encoders, and inherits from dogpile’s own ProxyBackend
(you could also collapse this into a single class). The base class passes the
four main key/value methods through a configurable decoder and encoder. The
MsgpackProxy class then simply inherits from _EncodedProxy and implements
the necessary value_decode and value_encode functions.

Encoded ProxyBackend Example:

from dogpile.cache import make_region
from dogpile.cache.api import NO_VALUE, CachedValue
from dogpile.cache.proxy import ProxyBackend
import msgpack


class _EncodedProxy(ProxyBackend):
    """base class for building value-mangling proxies"""

    def value_decode(self, value):
        raise NotImplementedError("override me")

    def value_encode(self, value):
        raise NotImplementedError("override me")

    def set(self, k, v):
        v = self.value_encode(v)
        self.proxied.set(k, v)

    def get(self, key):
        v = self.proxied.get(key)
        return self.value_decode(v)

    def set_multi(self, mapping):
        """encode to a new dict to preserve unencoded values in-place
        when called by `get_or_create_multi`
        """
        mapping_set = {}
        for (k, v) in mapping.items():
            mapping_set[k] = self.value_encode(v)
        return self.proxied.set_multi(mapping_set)

    def get_multi(self, keys):
        results = self.proxied.get_multi(keys)
        translated = []
        for record in results:
            translated.append(self.value_decode(record))
        return translated


class MsgpackProxy(_EncodedProxy):
    """custom decode/encode for value mangling"""

    def value_decode(self, v):
        if not v or v is NO_VALUE:
            return NO_VALUE
        # you probably want to specify a custom decoder via `object_hook`
        v = msgpack.unpackb(v, raw=False)
        return CachedValue(*v)

    def value_encode(self, v):
        # you probably want to specify a custom encoder via `default`
        return msgpack.packb(v, use_bin_type=True)


# extend our region configuration from above with a 'wrap'
region = make_region().configure(
    "dogpile.cache.pylibmc",
    expiration_time=3600,
    arguments={"url": ["127.0.0.1"]},
    wrap=[MsgpackProxy],
)