Objects in the Datastore are known as entities. Each entity is an instance of an ndb.Model class or, more likely, of an application-defined subclass of ndb.Model. Each entity is identified by a key, unique within the application's Datastore. (If you're new to App Engine but not to web application development in general, defining a model class is similar to defining a schema in SQL, except that there is no need to issue CREATE TABLE commands.)
- Creating Entities
- Retrieving Entities from Keys
- Updating Entities
- Deleting Entities
- Operations on Multiple Keys or Entities
- Expando Models
- Model Hooks
- Using Numeric Key IDs
Overview
An entity model class defines one or more properties possessed by entities of that class. For example:
from google.appengine.ext import ndb

class Account(ndb.Model):
    username = ndb.StringProperty()
    userid = ndb.IntegerProperty()
    email = ndb.StringProperty()
Each entity is identified by a key, unique within the application's Datastore. In its simplest form, a key consists of a kind and an identifier. The kind is normally the name of the model class to which the entity belongs ("Account" in the example above), but can be changed to some other string by overriding the class method _get_kind(). The identifier may be either a key name string assigned by the application or an integer numeric ID generated automatically by the Datastore.
An entity's key can designate another key as a parent. As a shorthand for "an entity's key's parent", people usually say "an entity's parent"; depending on context, they might mean either the parent key itself or the entity that has that key. An entity without a parent is a root entity. An entity's parent, its parent's parent, and so on recursively, are its ancestors. The entities in the Datastore thus form a hierarchical key space similar to the hierarchical directory structure of a file system. The sequence of entities beginning with a root entity and proceeding from parent to child, leading to a given entity, constitutes that entity's ancestor path.
The complete key identifying an entity thus consists of a sequence of kind-identifier pairs specifying its ancestor path and terminating with those of the entity itself. The constructor method for class Key accepts such a sequence of kinds and identifiers and returns an object representing the key for the corresponding entity. For example, a revision of a message that "belongs to" an owner might have a key that looks like this:
rev_key = ndb.Key('Account', 'Sandy', 'Message', 'greeting', 'Revision', '2')
Notice that the entity's kind is designated by the last kind in the list ('Revision'). (You might think that the string '2' is a strange key value: why not use a number? You can use numeric IDs, but it's a little tricky. For the details, see Using Numeric Key IDs.)
For a root entity, the ancestor path is empty and the key consists solely of the entity's own kind and identifier:
sandy_key = ndb.Key('Account', 'Sandy')
(Alternatively, you can use the model class object itself, rather than its name, to specify the entity's kind:
sandy_key = ndb.Key(Account, 'Sandy')
The class is converted to its name string in the actual key.)
You can also use the named parameter parent to designate any entity in the ancestor path directly. Thus, the following key specifications are all equivalent:
k1 = ndb.Key('Account', 'Sandy', 'Message', 'greetings', 'Revision', '2')
k2 = ndb.Key(Revision, '2', parent=ndb.Key('Account', 'Sandy', 'Message', 'greetings'))
k3 = ndb.Key(Revision, '2', parent=ndb.Key(Account, 'Sandy', Message, 'greetings'))
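To make the equivalence of these spellings concrete, here is a minimal plain-Python sketch that models an ancestor path as a tuple of (kind, identifier) pairs. It is not the ndb.Key class and does not talk to the Datastore; the helper names pairs_from_flat and child_path are hypothetical and exist only for illustration.

```python
# Plain-Python model of a key's ancestor path; this is NOT the ndb.Key class.

def pairs_from_flat(*flat):
    """Group a flat sequence kind, id, kind, id, ... into (kind, id) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError('expected an even number of kind/identifier values')
    return tuple(zip(flat[0::2], flat[1::2]))


def child_path(parent_pairs, kind, identifier):
    """Append one (kind, id) pair to a parent's path, like the parent= form."""
    return tuple(parent_pairs) + ((kind, identifier),)


full = pairs_from_flat('Account', 'Sandy', 'Message', 'greetings', 'Revision', '2')
parent = pairs_from_flat('Account', 'Sandy', 'Message', 'greetings')
via_parent = child_path(parent, 'Revision', '2')

print(full == via_parent)  # both spellings describe the same path
print(full[-1][0])         # the entity's own kind is the kind of the last pair
print(full[0])             # the first pair identifies the root entity
```

Under this model, a root entity is simply a path with a single pair, and an entity's parent is its path with the last pair removed.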
Creating Entities
You create an entity by calling the constructor method for its model class. You can specify the entity's properties to the constructor with keyword arguments:
sandy = Account(username='Sandy', userid=123, email='sandy@gmail.com')
This creates an object in your program's main memory; it will be gone as soon as the process terminates. To store the object as a persistent entity in the Datastore, use the put() method. This returns a key for retrieving the entity from the Datastore later:
sandy_key = sandy.put()
Alternatively, instead of supplying the property values directly to the constructor, you can set them manually after creation:
sandy = Account()
sandy.username = 'Sandy'
sandy.userid = 123
sandy.email = 'sandy@gmail.com'
The convenience method populate() allows you to set several properties in one operation:
sandy = Account()
sandy.populate(username='Sandy', userid=123, email='sandy@gmail.com')
However you choose to set the entity's properties, the property types (in this case, StringProperty and IntegerProperty) enforce type checking. For example:
bad = Account(username='Sand', userid='not integer')  # Raises an exception
sandy.username = 42  # Raises an exception
Retrieving Entities from Keys
Given an entity's key, you can retrieve the entity from the Datastore:
sandy = sandy_key.get()
The Key methods kind() and id() recover the entity's kind and identifier from the key:
kindString = rev_key.kind()  # returns "Revision"
ident = rev_key.id()  # returns "2"
The parent() method returns a key representing the parent entity:
greeting_key = rev_key.parent()
You can also use an entity's key to obtain an encoded string suitable for embedding in a URL:
urlString = rev_key.urlsafe()
This produces a result like agVoZWxsb3IPCxIHQWNjb3VudBiZiwIM, which can later be used to reconstruct the key and retrieve the original entity:
rev_key = ndb.Key(urlsafe=urlString)
revision = rev_key.get()
Note: The URL-safe string looks cryptic, but it is not encrypted! It can easily be decoded to recover the original entity's kind and identifier:
key = ndb.Key(urlsafe=urlString)
kindString = key.kind()
ident = key.id()
If you use such URL-safe keys, don't use sensitive data such as email addresses as entity identifiers. (A possible solution would be to use the MD5 hash of the sensitive data as the identifier. This stops third parties, who can see the encoded keys, from using them to harvest email addresses, though it doesn't stop them from independently generating their own hash of a known email address and using it to check whether that address is present in the Datastore.)
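Both points in the note can be checked with the Python standard library alone. The sketch below assumes the URL-safe string is URL-safe base64 with any trailing padding stripped (an assumption about the encoding, not something stated above), and it only searches the decoded bytes for readable text rather than parsing them. The helper hashed_identifier is hypothetical.

```python
import base64
import hashlib

# Example URL-safe string quoted in the text above.
url_string = 'agVoZWxsb3IPCxIHQWNjb3VudBiZiwIM'

# Restore any stripped '=' padding, then decode. No secret is required.
padded = url_string + '=' * (-len(url_string) % 4)
raw = base64.urlsafe_b64decode(padded.encode('ascii'))

print(b'Account' in raw)  # the kind name is readable in the decoded bytes


def hashed_identifier(email):
    """Hypothetical helper: derive a key name from an email address."""
    return hashlib.md5(email.encode('utf-8')).hexdigest()


key_name = hashed_identifier('sandy@gmail.com')
print(len(key_name))  # an MD5 hex digest is 32 characters long
```

Because the same address always produces the same digest, anyone who already knows an address can still compute its hash and test for its presence, which is the limitation the note describes.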
Updating Entities
To update an existing entity, just retrieve it from the Datastore, modify its properties, and store it back again:
sandy = sandy_key.get()
sandy.email = 'sandy@gmail.co.uk'
sandy.put()
(You can ignore the value returned by put() in this case, since an entity's key doesn't change when you update it.)
Deleting Entities
When an entity is no longer needed, you can remove it from the Datastore with the key's delete() method:
sandy.key.delete()
Note that this is an operation on the key, not on the entity itself. It always returns None.
Operations on Multiple Keys or Entities
Because each get() or put() operation invokes a separate remote procedure call (RPC), issuing many such calls inside a loop is an inefficient way to process a collection of entities or keys at once. The following methods are faster:
list_of_keys = ndb.put_multi(list_of_entities)
list_of_entities = ndb.get_multi(list_of_keys)
ndb.delete_multi(list_of_keys)
Advanced note: These methods interact correctly with the context and caching; they don't correspond directly to specific RPC calls.
Expando Models
Sometimes you don't want to declare your properties ahead of time. A special model subclass, Expando, changes the behavior of its entities so that any attribute assigned (as long as it doesn't start with an underscore) is saved to the Datastore. For example:
class Mine(ndb.Expando):
    pass

e = Mine()
e.foo = 1
e.bar = 'blah'
e.tags = ['exp', 'and', 'oh']
e.put()
This writes an entity to the Datastore with a foo property with integer value 1, a bar property with string value 'blah', and a repeated tags property with string values 'exp', 'and', and 'oh'. The properties are indexed, and you can inspect them using the entity's _properties attribute:
print(e._properties)
# {'foo': GenericProperty('foo'), 'bar': GenericProperty('bar'),
#  'tags': GenericProperty('tags', repeated=True)}
An Expando created by getting a value from the Datastore has properties for all property values that were saved in the Datastore.
An application can add predefined properties to an Expando subclass:
class FlexEmployee(ndb.Expando):
    name = ndb.StringProperty()
    age = ndb.IntegerProperty()

e = FlexEmployee(name='Sandy', location='SF')
This gives e a name attribute with value 'Sandy', an age attribute with value None, and a dynamic attribute location with value 'SF'.
To create an Expando subclass whose properties are unindexed, set _default_indexed = False in the subclass definition:
class Specialized(ndb.Expando):
    _default_indexed = False

e = Specialized(foo='a', bar=['b'])
print(e._properties)
# {'foo': GenericProperty('foo', indexed=False),
#  'bar': GenericProperty('bar', indexed=False, repeated=True)}
You can also set _default_indexed on an Expando entity. In this case it will affect all properties assigned after it was set.
Another useful technique is querying an Expando kind for a dynamic property. A query like
FlexEmployee.query(FlexEmployee.location == 'SF')
won't work, because the class doesn't have a property object for the location property. Instead, use GenericProperty, the class that Expando uses for dynamic properties:
FlexEmployee.query(ndb.GenericProperty('location') == 'SF')
Model Hooks
NDB offers a lightweight hooking mechanism. By defining a hook, an application can run some code before or after certain types of operations; for example, a Model might run some function before every get(). A hook function runs when using the synchronous, asynchronous, and multi versions of the appropriate method. For example, a "pre-get" hook would apply to all of get(), get_async(), and get_multi(). There are pre-RPC and post-RPC versions of each hook.
Hooks can be useful for
- query caching
- auditing Datastore activity per-user
- mimicking database triggers
The following example shows how to define hook functions:
from google.appengine.ext import ndb

class Friend(ndb.Model):
    name = ndb.StringProperty()

    def _pre_put_hook(self):
        # inform someone they have a new friend
        pass

    @classmethod
    def _post_delete_hook(cls, key, future):
        # inform someone they have lost a friend
        pass

f = Friend()
f.name = 'Carole King'
f.put()  # _pre_put_hook is called
fut = f.key.delete_async()  # _post_delete_hook not yet called
fut.get_result()  # _post_delete_hook is called
If you use post-hooks with asynchronous APIs, the hooks are triggered by calling check_result() or get_result(), or by yielding (inside a tasklet) an async method's future. Post-hooks do not check whether the RPC was successful; the hook runs regardless of failure.
All post-hooks have a Future argument at the end of the call signature. This Future object holds the result of the action. You can call get_result() on this Future to retrieve the result; you can be sure that get_result() won't block, since the Future is complete by the time the hook is called.
Raising an exception during a pre-hook prevents the request from taking place. Although hooks are triggered inside *_async methods, you cannot pre-empt an RPC by raising tasklets.Return in a pre-RPC hook.
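The timing rules above can be pictured with a toy model in plain Python. Nothing here is part of NDB: ToyFuture and ToyStore are hypothetical stand-ins, and the list append stands in for the RPC. The sketch shows only two of the behaviors described: a post-hook that fires when the future's result is first read, and a pre-hook whose exception stops the operation.

```python
# Toy model of pre- and post-hooks; none of these classes are part of NDB.

class ToyFuture(object):
    """Holds a finished result and fires a post-hook the first time it is read."""

    def __init__(self, result, post_hook):
        self._result = result
        self._post_hook = post_hook
        self._hook_fired = False

    def get_result(self):
        if not self._hook_fired:
            # Mark as fired first so the hook itself may call get_result().
            self._hook_fired = True
            self._post_hook(self)
        return self._result


class ToyStore(object):
    def __init__(self):
        self.deleted = []
        self.log = []

    def pre_delete_hook(self, key):
        if key is None:
            raise ValueError('refusing to delete a missing key')
        self.log.append('pre %s' % key)

    def post_delete_hook(self, future):
        self.log.append('post')

    def delete_async(self, key):
        self.pre_delete_hook(key)   # an exception here aborts the operation
        self.deleted.append(key)    # stands in for the RPC
        return ToyFuture(None, self.post_delete_hook)


store = ToyStore()
fut = store.delete_async('k1')
print(store.log)       # only the pre-hook has run so far
fut.get_result()
print(store.log)       # the post-hook ran when the result was read

try:
    store.delete_async(None)
except ValueError:
    pass
print(store.deleted)   # the rejected delete never reached the store
```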
Using Numeric Key IDs
A key is a series of kind-ID pairs. You want to make sure each entity has a key that is unique within its application and namespace. An application can create an entity without specifying an ID; the Datastore automatically generates a numeric ID. If an application picks some numeric IDs "by hand" and also lets the Datastore generate some IDs automatically, the Datastore might choose some IDs that the application has already used. To avoid this, the application should "reserve" the range of numbers it will use to choose IDs (or use string IDs to avoid this issue entirely).
To "reserve" a range of IDs, an application can use a model class's allocate_ids() class method. The method can be used in two ways: to allocate a specified number of IDs, or to reserve a given range of IDs. To allocate 100 IDs for a given model class (say MyModel), use the form:
first, last = MyModel.allocate_ids(100)
To allocate 100 IDs for entities with parent key p:
first, last = MyModel.allocate_ids(100, parent=p)
The returned values, first and last, are the first and last ID (inclusive) allocated. An application can use these to construct keys as follows:
keys = [ndb.Key(MyModel, id) for id in range(first, last+1)]
These keys are guaranteed not to have been returned previously by the Datastore's internal ID generator, nor will they be returned by future calls to the internal ID generator. However, allocate_ids() does not check whether the IDs returned are present in the Datastore; it only interacts with the ID generator.
An alternate form lets you allocate all IDs up to a given maximum value:
first, last = MyModel.allocate_ids(max=N)
This form ensures that all IDs less than or equal to N are considered allocated. The return values, first and last, indicate the range of IDs that were reserved by this operation. It is not an error to try to reserve IDs that were already allocated; in that case, first indicates the first ID not yet allocated and last is the last ID allocated. (In that case, first > last and first > N.)
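The return values of the two calling forms can be illustrated with a toy counter in plain Python. ToyIdAllocator is hypothetical and is not how the Datastore's ID generator works internally (in particular, real IDs need not start at 1); it only mimics the (first, last) semantics described above, including the case where the requested range was already reserved.

```python
class ToyIdAllocator(object):
    """Toy counter that mimics the two calling forms described above."""

    def __init__(self):
        self._next_free = 1

    def allocate(self, size):
        """Reserve `size` new IDs and return the inclusive range."""
        first = self._next_free
        last = first + size - 1
        self._next_free = last + 1
        return first, last

    def allocate_max(self, max_id):
        """Make sure every ID up to and including `max_id` is reserved."""
        first = self._next_free
        if max_id >= first:
            self._next_free = max_id + 1
        last = self._next_free - 1
        return first, last


alloc = ToyIdAllocator()
first, last = alloc.allocate(100)
ids = list(range(first, last + 1))
print(len(ids))        # 100 IDs, because both ends are inclusive

first, last = alloc.allocate_max(150)
print((first, last))   # the newly reserved range

first, last = alloc.allocate_max(120)   # everything up to 120 is already reserved
print(first > last and first > 120)
```

The last call reserves nothing new, so first points past the reserved IDs while last still names the highest reserved ID.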
An application cannot call allocate_ids() in a transaction.