Cloud Datastore best practices

You can use the best practices listed here as a quick reference of what to keep
in mind when building an application that uses Cloud Datastore. If you are
just starting out with Cloud Datastore, this page might not be the best
place to start, because it does not teach you the basics of how to use
Cloud Datastore. If you are a new user, we suggest that you start with
Getting Started with Cloud Datastore.

Note: This page describes system behavior for Cloud Datastore databases that have not yet upgraded to Cloud Firestore in Datastore mode.

General

Always use UTF-8 characters for namespace names, kind names, property names,
and custom key names. Non-UTF-8 characters used in these names can interfere
with Cloud Datastore functionality. For example, a non-UTF-8 character
in a property name can prevent creation of an index that uses the property.

Do not use a forward slash (/) in kind names or custom key names. Forward
slashes in these names could interfere with future functionality.

Avoid storing sensitive information in a Cloud Project ID. A Cloud Project ID
might be retained beyond the life of your project.

API calls

Use batch operations for your reads, writes, and deletes instead of
single operations. Batch operations are more efficient because they perform
multiple operations with the same overhead as a single operation.
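The benefit of batching can be sketched with a toy cost model (this is illustrative only, not real Cloud Datastore client code; the overhead numbers are assumptions): each RPC pays a fixed round-trip overhead, and a single batched call amortizes that overhead across many operations.

```python
# Toy model of why batching wins: each RPC pays a fixed round-trip
# overhead, so one batched call amortizes it across many operations.
# (Illustrative only -- overhead values are assumptions, not measurements.)

RPC_OVERHEAD_MS = 20   # assumed fixed per-call overhead
PER_OP_MS = 1          # assumed per-entity processing cost

def single_ops_cost(n_ops):
    """Cost of issuing n separate single-entity calls."""
    return n_ops * (RPC_OVERHEAD_MS + PER_OP_MS)

def batched_cost(n_ops):
    """Cost of one batched call covering n entities."""
    return RPC_OVERHEAD_MS + n_ops * PER_OP_MS

print(single_ops_cost(100))  # 2100 ms
print(batched_cost(100))     # 120 ms
```

In the Python client library, for example, the batched forms are calls such as put_multi(), get_multi(), and delete_multi() rather than their single-entity counterparts.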

If a transaction fails, ensure that you attempt to roll back the transaction. The
rollback minimizes retry latency for a different request that is contending for the
same resource(s) in a transaction. Note that a rollback itself might fail, so it
should be a best-effort attempt only.

Use asynchronous calls where available instead of synchronous calls.
Asynchronous calls minimize latency impact. For example, consider an application
that needs the result of a synchronous lookup() and the results of a query
before it can render a response. If the lookup() and the query do not have a
data dependency, there is no need to synchronously wait until the lookup()
completes before initiating the query.

Entities

Group highly related data in entity groups. Entity groups enable
ancestor queries, which return strongly consistent results. Ancestor
queries also rapidly scan an entity group with minimal I/O because the entities
in an entity group are stored at physically close places on
Cloud Datastore servers.

Avoid writing to an entity group more than once per second. Writing at a
sustained rate above that limit makes eventually consistent reads more eventual,
leads to timeouts for strongly consistent reads, and results in slower overall
performance of your application. A batch or transactional write to an entity
group counts as only a single write against this limit.

Do not include the same entity (by key) multiple times in the same commit.
Including the same entity multiple times in the same commit could impact
Cloud Datastore latency.

Keys

Key names are autogenerated if not provided at entity creation. They are
allocated so as to be evenly distributed in the keyspace.

For a key that uses a custom name, always use UTF-8 characters except a
forward slash (/). Non-UTF-8 characters interfere with various processes
such as importing a Cloud Datastore backup into Google BigQuery. A
forward slash could interfere with future functionality.

For a key that uses a numeric ID:

Do not use a negative number for the ID. A negative ID could interfere
with sorting.

Do not use the value 0 (zero) for the ID. If you do, you will get an
automatically allocated ID.

If you wish to manually assign your own numeric IDs to the entities you
create, have your application obtain a block of IDs with the allocateIds()
method. This will prevent Cloud Datastore from assigning one of
your manual numeric IDs to another entity.

If you assign your own manual numeric ID or custom name to the entities
you create, do not use monotonically increasing values such as 1, 2, 3, ….

If an application generates a large amount of traffic, such sequential numbering
could lead to hotspots that impact Cloud Datastore latency. To avoid the
issue of sequential numeric IDs, obtain numeric IDs from the allocateIds()
method. The allocateIds() method generates well-distributed sequences of
numeric IDs.
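The difference between sequential and well-distributed IDs can be illustrated with a toy model (the tablet count and hashing scheme here are assumptions for illustration, not how Bigtable actually assigns splits): sequential IDs cluster in one key range, while scattered IDs spread across the keyspace.

```python
# Toy illustration: sequential IDs all land in one key-range "tablet",
# while scattered IDs (as allocateIds() produces) spread the load.
# The tablet model here is a simplification, not actual Bigtable behavior.
import hashlib

NUM_TABLETS = 4
TABLET_WIDTH = 2**64 // NUM_TABLETS

def tablet_for(key_id):
    """Map a 64-bit numeric ID to a contiguous key-range bucket."""
    return key_id // TABLET_WIDTH

sequential = range(1, 1001)
scattered = [int.from_bytes(hashlib.sha256(str(i).encode()).digest()[:8], "big")
             for i in range(1, 1001)]

print({tablet_for(i) for i in sequential})      # {0}: one hot tablet
print(len({tablet_for(i) for i in scattered}))  # 4: spread across all tablets
```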

By specifying a key or storing the generated name, you can later perform a
consistent lookup() on that entity without needing to issue a query to find it.

Do not index properties with monotonically increasing values (such as a
NOW() timestamp). Maintaining such an index could lead to hotspots that impact
Cloud Datastore latency for applications with high read and write
rates. For further guidance on dealing with monotonic properties, see
High read/write rates to a narrow key range below.

Properties

Always use UTF-8 characters for properties of type string. A non-UTF-8
character in a property of type string could interfere with queries. If you need
to save data with non-UTF-8 characters, use a byte string.

Queries

If you need to access only the key from query results, use a
keys-only query. A keys-only query returns results at lower latency and
cost than retrieving entire entities.

If you need to access only specific properties from an entity, use a
projection query. A projection query returns results at lower latency and
cost than retrieving entire entities.

Likewise, if you need to access only the properties that are included in the
query filter (for example, those listed in an order by clause), use a
projection query.

Do not use offsets. Instead use cursors. Using an offset only avoids
returning the skipped entities to your application, but these entities are still
retrieved internally. The skipped entities affect the latency of the query, and
your application is billed for the read operations required to retrieve them.

If you need strong consistency for your queries, use an ancestor query.
(To use ancestor queries, you first need to
structure your data for strong consistency.) An ancestor query returns
strongly consistent results. Note that a non-ancestor keys-only query
followed by a lookup() does not return strongly consistent results, because the
non-ancestor keys-only query could get results from an index that is not
consistent at the time of the query.

Designing for scale

Updates to a single entity group

A single entity group in Cloud Datastore should not be updated too rapidly.

If you are using Cloud Datastore, Google recommends that you design your
application so that it does not need to update an entity group more than once
per second. Remember that an entity with no parent and no children is its own
entity group. If you update an entity group too rapidly, then your
Cloud Datastore writes will have higher latency, timeouts, and other types of
errors. This is known as contention.

Cloud Datastore write rates to a single entity group can sometimes exceed the
one-per-second limit, so load tests might not show this problem. Some suggestions
for designing your application to reduce write rates on entity groups are in
the Cloud Datastore contention article.

High read/write rates to a narrow key range

Avoid high read or write rates to Cloud Datastore keys that are lexicographically
close.

Cloud Datastore is built on top of Google's NoSQL database,
Bigtable, and is subject to Bigtable's performance
characteristics. Bigtable scales by sharding rows onto separate tablets, and
these rows are lexicographically ordered by key.

If you are using Cloud Datastore, you can get slow writes due to a
hot tablet if you have a sudden increase in the write rate to a small
range of keys that exceeds the capacity of a single tablet
server. Bigtable will eventually split the key space to support high
load.

The limit for reads is typically much higher than for writes,
unless you are reading from a single key at a high rate. Bigtable
cannot split a single key onto more than one tablet.

Hot tablets can apply to key ranges used by both entity keys and
indexes.

In some cases, a Cloud Datastore hotspot can have a wider impact on an
application than preventing reads or writes to a small range of
keys. For example, the hot keys might be read or written during instance
startup, causing loading requests to fail.

By default, Cloud Datastore allocates keys using a scattered
algorithm. Thus you will not normally encounter hotspotting on
Cloud Datastore writes if you create new entities at a high write rate using
the default ID allocation policy. There are some corner cases where
you can hit this problem:

If you create new entities at a very high rate using the legacy sequential ID
allocation policy.

If you create new entities at a very high rate and you are allocating your own
IDs which are monotonically increasing.

If you create new entities at a very high rate for a kind which previously had
very few existing entities. Bigtable will start off with all entities on the
same tablet server and will take some time to split the range of keys onto
separate tablet servers.

You will also see this problem if you create new entities at a high rate with
a monotonically increasing indexed property like a timestamp, because these
properties are the keys for rows in the index tables in Bigtable.

Cloud Datastore prepends the namespace and the kind of the root entity group to the
Bigtable row key. You can hit a hotspot if you start to write to a new
namespace or kind without gradually ramping up traffic.

If you do have a key or indexed property that will be monotonically
increasing, then you can prepend a random hash to ensure that the keys
are sharded onto multiple tablets.
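A minimal sketch of this hash-prefix technique follows; the shard count, separator, and function names are assumptions for illustration. A deterministic hash of the value picks a shard token, which is prepended to the key name so that consecutive values land in different parts of the keyspace.

```python
# Sketch (assumed naming): spread monotonically increasing key names across
# NUM_SHARDS ranges by prepending a deterministic hash-based prefix.
import hashlib

NUM_SHARDS = 16

def sharded_key_name(monotonic_value):
    """Prefix a monotonic value with a stable shard token so that
    consecutive values are scattered across the keyspace."""
    digest = hashlib.md5(str(monotonic_value).encode()).hexdigest()
    shard = int(digest, 16) % NUM_SHARDS
    return f"{shard:02d}|{monotonic_value}"

# The same input always maps to the same shard, so lookups stay possible.
print(sharded_key_name(1700000000))
print(sharded_key_name(1700000001))
```

Note that this trades away the ability to do a single range scan over the monotonic values: a query over all values now requires one query per shard prefix.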

Likewise, if you need to query on a monotonically increasing (or decreasing)
property using a sort or filter, you could instead index on a new property, for which you
prefix the monotonic value with a value that has high cardinality across the dataset, but is
common to all the entities in the scope of the query you want to perform.
For instance, if you want to query for entries by timestamp but only need to
return results for a single user at a time, you could prefix the timestamp with the
user id and index that new property instead. This would still permit queries and
ordered results for that user, but the presence of the user id would ensure the
index itself is well sharded.
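The user-id-plus-timestamp example above can be sketched as follows (the helper name and separator are assumptions): the composite property sorts first by user id and then by timestamp, so a per-user query over the new property still returns timestamp-ordered results.

```python
# Hypothetical helper: build a composite indexed property that prefixes a
# monotonic timestamp with a high-cardinality user id, preserving per-user order.

def composite_index_value(user_id, timestamp):
    # Zero-pad the timestamp so lexicographic order matches numeric order.
    return f"{user_id}|{timestamp:020d}"

vals = [composite_index_value("alice", t) for t in (5, 100, 20)]
# Sorting the composite values recovers the timestamp order for one user:
print(sorted(vals))
```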

Ramping up traffic

Gradually ramp up traffic to new Cloud Datastore kinds or portions of the keyspace.

You should ramp up traffic to new Cloud Datastore kinds gradually in
order to give Bigtable sufficient time to split tablets as the traffic
grows. We recommend a maximum of 500 operations per second to a new
Cloud Datastore kind, then increasing traffic by 50% every 5 minutes. In
theory, you can grow to 740K operations per second after 90 minutes
using this ramp up schedule. Be sure that writes are distributed
relatively evenly throughout the key range. Our SREs call this the
"500/50/5" rule.

This pattern of gradual ramp up is particularly important if you
change your code to stop using kind A and instead use kind B. A naive
way to handle this migration is to change your code to read kind B,
and if it does not exist then read kind A. However, this could cause a
sudden increase in traffic to a new kind with a very small portion of
the keyspace. Bigtable might be unable to efficiently split tablets
if the keyspace is sparse.

The same problem can also occur if you migrate your entities to use
a different range of keys within the same kind.

The strategy that you use to migrate entities to a new kind or key
will depend on your data model. Below is an example strategy, known as
"Parallel Reads". You will need to determine whether or not this
strategy is effective for your data. An important consideration will
be the cost impact of parallel operations during the migration.

Read from the old entity or key first. If it is missing, then you
could read from the new entity or key. A high rate of reads of
non-existent entities can lead to hotspotting, so you need to be sure
to increase load gradually. A better strategy is to copy the old
entity to the new one and then delete the old one. Ramp up parallel reads
gradually to ensure that the new key space is well split.

A possible strategy for gradually ramping up reads or writes to a
new kind is to use a deterministic hash of the user ID to get a random
percentage of users that write new entities. Be sure that the result
of the user ID hash is not skewed, either by your random function or by
your user behavior.
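A deterministic cohort check along these lines might look like the following sketch (the function names and bucket scheme are assumptions): hashing the user id into a stable bucket from 0 to 99 lets you dial the rollout percentage up over time without any user flip-flopping between cohorts.

```python
# Sketch: pick a stable percentage of users for the new kind using a
# deterministic hash of the user id. Names here are assumptions.
import hashlib

def in_migrated_cohort(user_id, rollout_percent):
    """True if this user should use the new kind at the given rollout level."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # stable 0..99 per user
    return bucket < rollout_percent

# A user's cohort never changes between requests at a fixed rollout level:
print(in_migrated_cohort("user-123", 50) == in_migrated_cohort("user-123", 50))
```

Raising rollout_percent from, say, 10 to 15 only moves users from the old cohort to the new one, never back, which is what the gradual ramp requires.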

Meanwhile, run a Dataflow job to copy all your data from the old entities or
keys to the new ones. Your batch job should avoid writes to sequential keys in
order to prevent Bigtable hotspots. When the batch job is complete, you can read
only from the new location.

A refinement of this strategy is to migrate small batches of users
at one time. Add a field to the user entity which tracks the migration
status of that user. Select a batch of users to migrate based on a
hash of the user ID. A Mapreduce or Dataflow job will migrate the keys
for that batch of users. The users that have an in-progress migration
will use parallel reads.

Note that you cannot easily roll back unless you do dual writes of
both the old and new entities during the migration phase. This would
increase Cloud Datastore costs incurred.

Deletions

Avoid deleting large numbers of Cloud Datastore entities across a small range of keys.

Bigtable periodically rewrites its tables to remove deleted
entries, and to reorganize your data so that reads and writes are more
efficient. This process is known as a compaction.

If you delete a large number of Cloud Datastore entities across a small
range of keys then queries across this part of the index will be
slower until compaction has completed. In extreme cases, your queries
might time out before returning results.

It is an anti-pattern to use a timestamp value for an indexed
field to represent an entity's expiration time. In order to retrieve
expired entities, you would need to query against this indexed field,
which likely lies in an overlapping part of the keyspace with index
entries for the most recently deleted entities.

You can improve performance with "sharded queries", which prepend a
fixed-length string to the expiration timestamp. The index is sorted
on the full string, so that entities with the same timestamp are
located throughout the key range of the index. You run multiple
queries in parallel to fetch results from each shard.
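The following sketch simulates this with an in-memory sorted list standing in for the index (the shard count, separator, and padding are assumptions): each per-shard query is a range scan bounded by the shard prefix and the cutoff timestamp, and the results of all shards are combined.

```python
# Sketch of "sharded queries": each index entry is prefixed with a fixed-length
# shard token, and expired entries are fetched by one range scan per shard.
# A sorted Python list stands in for the real index here.
NUM_SHARDS = 4

def shard_prefixed(shard, expires_at):
    # Zero-pad the timestamp so lexicographic order matches numeric order.
    return f"{shard}|{expires_at:010d}"

# Simulated index rows: the same timestamps scattered across shard prefixes.
index = sorted(shard_prefixed(s, t) for s in range(NUM_SHARDS)
               for t in (100, 200, 300))

def query_expired(shard, cutoff):
    """One of the parallel per-shard range scans over the simulated index."""
    lo, hi = f"{shard}|", shard_prefixed(shard, cutoff)
    return [row for row in index if lo <= row < hi]

expired = [row for s in range(NUM_SHARDS) for row in query_expired(s, 250)]
print(len(expired))  # 8 rows: timestamps 100 and 200 in each of 4 shards
```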

A fuller solution for the expiration timestamp issue is to use a
"generation number" which is a global counter that is periodically
updated. The generation number is prepended to the expiration
timestamp so that queries are sorted by generation number, then shard,
then timestamp. Deletion of old entities occurs at a prior
generation. Any entity not deleted should have its generation number
incremented. Once the deletion is complete, you move forward to the
next generation. Queries against an older generation will perform
poorly until compaction is complete. You might need to wait for several
generations to complete before querying the index to get the list of
entities to delete, in order to reduce the risk of missing results due
to eventual consistency.

Sharding and replication

Use sharding or replication for hot Cloud Datastore keys.

You can use replication if you need to read a portion of the key
range at a higher rate than Bigtable permits. Using this strategy, you
would store N copies of the same entity allowing N times higher rate
of reads than supported by a single entity.
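A replication sketch might look like this (the key-naming scheme and replica count are assumptions): each write updates all N copies, and each read picks one copy at random, spreading read load across N distinct keys.

```python
# Sketch: replicate a hot entity into N copies and spread reads across them.
# The "#<n>" key-naming scheme is an assumption for illustration.
import random

NUM_REPLICAS = 5

def replica_key_name(base_name, replica):
    return f"{base_name}#{replica}"

def read_key_for(base_name):
    """Pick one replica uniformly at random for each read."""
    return replica_key_name(base_name, random.randrange(NUM_REPLICAS))

# Writes must update every replica to keep the copies consistent:
write_keys = [replica_key_name("hot-config", r) for r in range(NUM_REPLICAS)]
print(write_keys)
```

The trade-off is that every write costs N entity writes, so replication suits read-heavy, rarely updated data.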

You can use sharding if you need to write to a portion of the key
range at a higher rate than Bigtable permits. Sharding breaks up an
entity into smaller pieces. The principles are explained in
the sharding
counters article.
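The sharded-counter idea from that article can be sketched in memory (a plain list stands in for one counter entity per shard): each increment touches only one randomly chosen shard, and a read sums all shards.

```python
# Minimal in-memory sketch of a sharded counter: spread writes across
# N counter shards, then sum the shards on read. A Python list stands in
# for one counter entity per shard.
import random

NUM_SHARDS = 10
shards = [0] * NUM_SHARDS

def increment():
    """Each write hits one random shard, so no single row is hot."""
    shards[random.randrange(NUM_SHARDS)] += 1

def total():
    """Reads aggregate all shards to recover the true count."""
    return sum(shards)

for _ in range(1000):
    increment()
print(total())  # 1000
```

In Cloud Datastore each shard would be its own entity, raising the sustainable write rate roughly in proportion to the number of shards.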

Some common mistakes when sharding include:

Sharding using a time prefix. When the time rolls over to the next prefix then
the new unsplit portion becomes a hotspot. Instead, you should gradually roll
over a portion of your writes to the new prefix.

Sharding just the hottest entities. If you shard a small proportion of the
total number of entities then there might not be sufficient rows between the
hot entities to ensure that they stay on different splits.