NDB manages caches for you. There are two caching levels: an in-context cache and a gateway to App Engine's standard caching service, memcache. Both caches are enabled by default for all entity types, but can be configured to suit advanced needs. In addition, NDB implements a feature called auto-batching, which tries to group operations together to minimize server round trips.
Introduction
Caching helps most types of applications. NDB automatically caches data that it writes or reads (unless an application configures it not to). Reading from cache is faster than reading from the Datastore.
The automatic caching probably does what you want. The rest of this page provides more detail in case you want to understand or control parts of the caching behavior.
You can alter the caching behavior of many NDB functions by passing Context Options arguments. For example, you might call
key.get(use_cache=False, use_memcache=False)
to bypass caching. You can also change the default caching policy on an NDB context, as described below.
Caution: When you use the Administration Console's Datastore Viewer to modify the Datastore contents, the cached values will not be updated. Thus, your cache may be inconsistent. For the in-context cache this is generally not a problem. For Memcache, we recommend using the Administration Console to flush the cache.
Context Objects
Cache management uses a class named Context: each incoming HTTP request and each transaction is executed in a new context. To access the current context, use the ndb.get_context() function.
Caution: It makes no sense to share Context objects between multiple threads or requests. Don't save the context as a global variable! Storing it in a local variable is fine.
Context objects have methods for setting cache policies and otherwise manipulating the cache.
The In-Context Cache
The in-context cache persists only for the duration of a single incoming HTTP request and is "visible" only to the code that handles that request. It's fast; this cache lives in memory. When an NDB function writes to the Datastore, it also writes to the in-context cache. When an NDB function reads an entity, it checks the in-context cache first. If the entity is found there, no Datastore interaction takes place.
Queries do not look up values in any cache. However, query results are written back to the in-context cache if the cache policy says so (but never to Memcache).
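The read and write paths described above can be pictured with a toy model. This is only an illustrative sketch, not NDB's implementation: ToyContext, its dict-backed "Datastore", and the datastore_reads counter are hypothetical names invented for this example.

```python
class ToyContext(object):
    """Hypothetical stand-in for a per-request context (not NDB code)."""

    def __init__(self, datastore):
        self.datastore = datastore   # plain dict playing the role of the Datastore
        self.cache = {}              # in-context cache, empty for each new context
        self.datastore_reads = 0     # counts simulated Datastore round trips

    def put(self, key, entity):
        # A write goes to the Datastore and to the in-context cache.
        self.datastore[key] = entity
        self.cache[key] = entity

    def get(self, key):
        # A read checks the in-context cache first.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall through to the (toy) Datastore and remember the result.
        self.datastore_reads += 1
        entity = self.datastore.get(key)
        if entity is not None:
            self.cache[key] = entity
        return entity


datastore = {"Account:1": {"name": "alice"}}
ctx = ToyContext(datastore)
first = ctx.get("Account:1")     # miss: served by the toy Datastore
second = ctx.get("Account:1")    # hit: served from the in-context cache
reads_after_two_gets = ctx.datastore_reads

new_ctx = ToyContext(datastore)  # a new request starts with an empty cache
```

In this model the second read never touches the Datastore, and a new context begins with nothing cached, which mirrors the per-request lifetime described above.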
Memcache
Memcache is App Engine's standard caching service, much faster than the Datastore but slower than the in-context cache (milliseconds vs. microseconds).
By default, a nontransactional context caches all entities in memcache. All an application's contexts use the same memcache server and see a consistent set of cached values.
Memcache does not support transactions. Thus, an update meant to be applied to both the Datastore and memcache might be made to only one of the two. To maintain consistency in such cases (possibly at the expense of performance), the updated entity is deleted from memcache and then written to the Datastore. A subsequent read operation will find the entity missing from memcache, retrieve it from the Datastore, and then update it in memcache as a side effect of the read. Also, NDB reads inside transactions ignore memcache.
When entities are written within a transaction, memcache is not used; when the transaction is committed, its context will attempt to delete all such entities from memcache. Note, however, that some failures may prevent these deletions from happening.
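The nontransactional delete-then-write sequence and the read that repopulates the cache can be sketched with two plain dictionaries. This is a simplified illustration under stated assumptions, not NDB's code: toy_put, toy_get, and the dict-based stores are hypothetical.

```python
def toy_put(key, entity, datastore, memcache):
    # Delete the cached copy first, then write to the Datastore
    # (the order described in the text above).
    memcache.pop(key, None)
    datastore[key] = entity


def toy_get(key, datastore, memcache):
    # Try memcache first; on a miss, read the Datastore and
    # repopulate memcache as a side effect of the read.
    if key in memcache:
        return memcache[key]
    entity = datastore.get(key)
    if entity is not None:
        memcache[key] = entity
    return entity


datastore = {"Account:1": "v1"}
memcache = {"Account:1": "v1"}

toy_put("Account:1", "v2", datastore, memcache)
cached_after_put = "Account:1" in memcache     # the stale copy is gone
value = toy_get("Account:1", datastore, memcache)
cached_after_get = memcache.get("Account:1")   # the read refilled the cache
```

The sketch leaves out failures and concurrent requests, which are the cases in which the real deletions may not happen.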
Policy Functions
Automatic caching is convenient for most applications, but you may want to turn it off for some or all entities. You can control the behavior of the caches by setting policy functions. There is a policy function for the in-process cache, set with
ctx.set_cache_policy(func)
and another for memcache, set with
ctx.set_memcache_policy(func)
(where ctx is a Context).
Each policy function accepts a key and returns a Boolean result. If it returns False, the entity identified by that key will not be saved in the corresponding cache. For example, to bypass the in-process cache for all Account entities, you could write
ctx.set_cache_policy(lambda key: key.kind() != 'Account')
(However, keep reading for an easier way to accomplish the same thing.)
As a convenience, you can pass True or False instead of a function that always returns the same value.
The default policies cache all entities.
There is also a Datastore policy function governing which entities are written to the Datastore itself:
ctx.set_datastore_policy(func)
This works like the in-context cache and memcache policy functions: if the Datastore policy function returns False for a given key, the corresponding entity will not be written to the Datastore. (It may be written to the in-process cache or memcache if their policy functions allow it.) This can be useful in cases where you have entity-like data that you would like to cache, but that you don't need to store in the Datastore. Just as for the cache policies, you can pass True or False instead of a function that always returns the same value.
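Because a policy is just a callable from a key to a Boolean, it can be written as a named function and exercised without a running application. The sketch below assumes only that a key exposes kind(); FakeKey and the kind names are hypothetical stand-ins used for illustration.

```python
class FakeKey(object):
    """Hypothetical stand-in exposing only kind(), which the policies use."""

    def __init__(self, kind):
        self._kind = kind

    def kind(self):
        return self._kind


UNCACHED_KINDS = frozenset(["Account", "AuditLog"])   # example kinds
CACHE_ONLY_KINDS = frozenset(["SessionScratch"])      # example kind


def cache_policy(key):
    # False means: do not keep this entity in the in-process cache.
    return key.kind() not in UNCACHED_KINDS


def datastore_policy(key):
    # False means: do not write this entity to the Datastore.
    return key.kind() not in CACHE_ONLY_KINDS


# With a real context these would be registered as:
#   ctx.set_cache_policy(cache_policy)
#   ctx.set_datastore_policy(datastore_policy)
```

Keeping the kinds in a set makes it straightforward to extend a policy to several model classes without rewriting the function.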
Memcache automatically expires items when under memory pressure. You can set a memcache timeout policy function to determine an entity's maximum lifetime in the cache:
ctx.set_memcache_timeout_policy(func)
This function is called with a key argument and should return an integer specifying the maximum lifetime in seconds; 0 or None means indefinite (as long as the memcache server has enough memory). For convenience, you can simply pass an integer constant instead of a function that always returns the same value.
See the memcache documentation for more information about timeouts.
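A timeout policy differs from the Boolean policies only in its return value. The following sketch assumes a per-kind table of lifetimes; FakeKey, TIMEOUTS_BY_KIND, and the specific kinds and values are hypothetical.

```python
class FakeKey(object):
    """Hypothetical stand-in exposing only kind()."""

    def __init__(self, kind):
        self._kind = kind

    def kind(self):
        return self._kind


# Maximum lifetimes in seconds, per kind (example values only).
TIMEOUTS_BY_KIND = {"Session": 30, "Profile": 3600}


def memcache_timeout_policy(key):
    # Return the maximum lifetime in seconds for this key's kind;
    # 0 means no explicit limit for kinds that are not listed.
    return TIMEOUTS_BY_KIND.get(key.kind(), 0)


# With a real context this would be registered as:
#   ctx.set_memcache_timeout_policy(memcache_timeout_policy)
```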
Note: There is no separate lifetime policy for the in-context cache: the cache's lifetime is the same as that of its context, a single incoming HTTP request. However, you can clear the in-process cache by calling
ctx.clear_cache()
A brand-new context starts out with an empty in-process cache.
While policy functions are very flexible, in practice most policies are simple. For example:
- Don't cache entities belonging to a specific model class.
- Set the memcache timeout for entities in this model class to 30 seconds.
- Entities in this model class need not be written to the Datastore.
To save you the work of writing and continually updating trivial policy functions (or worse, overriding the policies for each operation using context options), the default policy functions obtain the model class from the key passed to them and then look in the model class for specific class variables:
Class Variable | Type | Description |
---|---|---|
_use_cache | bool | Specifies whether to store entities in in-process cache; overrides default in-process cache policy. |
_use_memcache | bool | Specifies whether to store entities in memcache; overrides default memcache policy. |
_use_datastore | bool | Specifies whether to store entities in the Datastore; overrides default Datastore policy. |
_memcache_timeout | int | Maximum lifetime for entities in memcache; overrides default memcache timeout policy. |
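A simplified sketch of this lookup may help. It is not NDB's code: KIND_TO_MODEL, the plain classes, and sketch_default_policy are hypothetical stand-ins. In a real application the classes would be NDB model classes and NDB would derive the kind from the key itself. Returning None to mean "no preference" is an assumption of this sketch.

```python
class Account(object):        # would be an NDB model class in a real app
    _use_cache = False        # keep Account entities out of the in-process cache
    _memcache_timeout = 30    # seconds


class Greeting(object):       # declares no overrides
    pass


# Hypothetical registry mapping a kind name to its model class.
KIND_TO_MODEL = {"Account": Account, "Greeting": Greeting}


def sketch_default_policy(kind, variable):
    # Find the model class for the kind, then read the class variable.
    # None means the class (or the kind) expresses no preference.
    model_class = KIND_TO_MODEL.get(kind)
    return getattr(model_class, variable, None)


account_cached = sketch_default_policy("Account", "_use_cache")
account_timeout = sketch_default_policy("Account", "_memcache_timeout")
greeting_cached = sketch_default_policy("Greeting", "_use_cache")
```

The point of the design is that the setting lives next to the model definition, so it does not need to be repeated in a policy function or on every call.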
Note: This is a feature of the default policy function for each policy. If you specify your own policy function but also want to fall back to the default policy, call the default policy functions explicitly as static methods of class Context:
default_cache_policy(key)
default_memcache_policy(key)
default_datastore_policy(key)
default_memcache_timeout_policy(key)
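One way to combine a custom rule with the default is to pass the default policy into a small factory. The sketch below is illustrative only: make_cache_policy, FakeKey, stand_in_default, and the AuditLog kind are hypothetical, and the stand-in default simply returns True so that the example runs outside App Engine.

```python
class FakeKey(object):
    """Hypothetical stand-in exposing only kind()."""

    def __init__(self, kind):
        self._kind = kind

    def kind(self):
        return self._kind


def make_cache_policy(default_policy):
    """Build a policy that special-cases one kind and defers otherwise."""
    def policy(key):
        if key.kind() == "AuditLog":    # example kind handled by the custom rule
            return False
        return default_policy(key)      # everything else: ask the default
    return policy


def stand_in_default(key):
    # Stand-in for Context.default_cache_policy, used only for this demo.
    return True


policy = make_cache_policy(stand_in_default)

# With a real context the default would be passed in instead, for example:
#   ctx.set_cache_policy(make_cache_policy(Context.default_cache_policy))
```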