Index Configuration

Google Cloud Datastore uses indexes for every query your application makes. These indexes are updated whenever an entity changes, so results can be returned quickly when the app makes a query. To do this, the Datastore needs to know in advance which queries the application will make. You specify which indexes your app needs in a configuration file. If you're using the protocol buffer API, the development server can generate the index configuration automatically as you test your app. If you're using the JSON API, you'll need to do a bit more work. We hope to reach feature parity between these two APIs in an upcoming release.

About datastore-indexes.xml

You specify configuration for datastore indexes in WEB-INF/datastore-indexes.xml, in your dataset directory. This is an XML file whose root element is <datastore-indexes>. It contains zero or more <datastore-index> elements, one for each index that the Datastore should maintain.

As described on the Datastore Indexes page, an index is a table of values for a set of given properties for entities of a given kind. Each column of property values is sorted either in ascending or descending order. Configuration for an index specifies the kind of the entities, and the names of the properties and their sort orders.

The <datastore-indexes> element has an autoGenerate attribute that controls whether this file should be considered along with automatically generated index configuration. See Using Automatic Index Configuration below.

Each <datastore-index> element represents an index. The kind attribute specifies the kind of the entities to index. The ancestor attribute is true if the index supports queries that filter by ancestor-key to constrain results to a single entity group, false otherwise.

The <property> elements in a <datastore-index> represent the entity properties to index. The name attribute is the property name, and the direction attribute is the sort order, either asc for ascending or desc for descending. The order of the property elements specifies the order in the index: rows are sorted by the first property, then the second property, and so on.
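
As an illustration of the elements and attributes described above, here is a minimal sketch of a datastore-indexes.xml file. The kinds (Employee, Comment) and property names are hypothetical and stand in for your own entities.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical example: kind and property names are placeholders. -->
<datastore-indexes autoGenerate="true">

    <!-- Rows are sorted by lastName ascending, then hireDate descending. -->
    <datastore-index kind="Employee" ancestor="false">
        <property name="lastName" direction="asc" />
        <property name="hireDate" direction="desc" />
    </datastore-index>

    <!-- ancestor="true": supports queries that filter by ancestor key. -->
    <datastore-index kind="Comment" ancestor="true">
        <property name="created" direction="desc" />
    </datastore-index>

</datastore-indexes>
```

Because the order of the property elements determines the sort order of the index, swapping the two properties of the Employee index would define a different index.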

Using automatic index configuration

Determining the indexes required by your application's queries manually can be tedious and error-prone. Thankfully, the development server can determine the index configuration for you. To use automatic index configuration, add the attribute autoGenerate="true" to your WEB-INF/datastore-indexes.xml file's <datastore-indexes> element. Automatic index configuration is also used if your dataset does not have a datastore-indexes.xml file.

With automatic index configuration enabled, the development server maintains a file named WEB-INF/appengine-generated/datastore-indexes-auto.xml in your dataset directory. When your app, running against the development server, attempts a datastore query for which there is no corresponding index in either datastore-indexes.xml or datastore-indexes-auto.xml, the server adds the appropriate configuration to datastore-indexes-auto.xml.

If automatic index configuration is enabled when you update your production indexes (see Updating Indexes), the tool uses both datastore-indexes.xml and datastore-indexes-auto.xml to determine which indexes need to be built for your dataset in production.

If autoGenerate="false" is in your datastore-indexes.xml, the development server and the command line tool that updates your indexes in production (see Updating Indexes) ignore the contents of datastore-indexes-auto.xml. If the app running locally performs a query whose index is not specified in datastore-indexes.xml, the development server throws an exception, just as the production Datastore would.

It's a good idea to occasionally move index configuration from datastore-indexes-auto.xml to datastore-indexes.xml, then disable automatic index configuration and test your app against the development server. This makes it easy to maintain indexes without having to manage two files, and ensures that your testing will reproduce errors caused by missing index configuration.
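
A consolidated file after such a move might look like the following sketch. The index shown is hypothetical; in practice you would copy the <datastore-index> elements from your own datastore-indexes-auto.xml.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- autoGenerate="false": datastore-indexes-auto.xml is ignored, and a
     query without a matching index here fails on the development server. -->
<datastore-indexes autoGenerate="false">

    <!-- Hypothetical index moved over from datastore-indexes-auto.xml. -->
    <datastore-index kind="Employee" ancestor="false">
        <property name="lastName" direction="asc" />
        <property name="hireDate" direction="desc" />
    </datastore-index>

</datastore-indexes>
```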

Manual index configuration

The legacy App Engine Datastore viewer allows you to interactively query the
Datastore using a query language called GQL. These interactive queries
succeed if your dataset has the index needed to fulfill the query, and fail
with an error containing the XML definition of the missing index if it does not.
By translating your Google Cloud Datastore queries to GQL, you can use
interactive queries and the detailed error messages they return to determine
which indexes need to be added to your dataset's WEB-INF/datastore-indexes.xml
file. GQL provides a superset of the query functionality available in the
Google Cloud Datastore query API, so this translation should always be possible.

To run a GQL query using the legacy App Engine Datastore viewer:

Go to the [Google Developers Console][1].

In the list of projects, locate your dataset ID.

Click your dataset to make it active for administration.

Click the App Engine icon to display the legacy Admin Console.

Click the Datastore Viewer link on the left-hand side of the page.

Click the +Options link to open up the interactive query form.

Type your query in the form and click Run Query.

If the query succeeds, no further action is necessary: you already have the
indexes you need for that query.

If the query fails, copy the XML index definition contained in the error
message and add it to your WEB-INF/datastore-indexes.xml file.
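
As a sketch of the result, suppose a hypothetical GQL query such as SELECT * FROM Task WHERE done = false ORDER BY priority DESC fails for lack of an index. The Task kind and its properties are assumptions made for illustration. After adding the missing definition, the file might look like this; the definition in your actual error message may differ in detail and should be copied verbatim.

```xml
<?xml version="1.0" encoding="utf-8"?>
<datastore-indexes autoGenerate="false">

    <!-- Hypothetical index for: WHERE done = false ORDER BY priority DESC.
         The equality-filtered property comes first, then the sort property. -->
    <datastore-index kind="Task" ancestor="false">
        <property name="done" direction="asc" />
        <property name="priority" direction="desc" />
    </datastore-index>

</datastore-indexes>
```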

Updating indexes

Google Cloud Datastore provides the gcd command line tool for updating the
indexes that are available to your production dataset. This tool looks at your
dataset index configuration (the datastore-indexes.xml and
appengine-generated/datastore-indexes-auto.xml files), and if the index configuration
defines an index that doesn't exist yet in your production dataset, the
Datastore creates the new index.

Depending on how much existing data belongs in the new index, creating the
index may take a while. If the app performs a query that requires an index
that hasn't finished building, the query raises an exception. To prevent
this, avoid deploying a new version of your app that requires a new index
until that index has finished building.

You can check the status of the dataset's indexes from the Indexes section
of the Datastore console for your project in the [Developers Console][1].

Deleting unused indexes

When you change or remove an index from the index configuration, the original
index is not deleted from the Datastore automatically. This gives you the
opportunity to leave an older version of the app running while new indexes are
being built, or to revert to the older version immediately if a problem is
discovered with a newer version.

When you are sure that old indexes are no longer needed, you can delete them
from the Datastore using the vacuumindexes action. The command to vacuum
indexes is as follows:

This command deletes all indexes for the dataset that are not mentioned in the
local versions of datastore-indexes.xml and
appengine-generated/datastore-indexes-auto.xml.

Passwordless login with OAuth2

If you don't want to enter your login credentials, you can use an OAuth 2.0
token instead. This token gives access to the Datastore, but not to other parts
of your Google account; if your Google account uses
two-factor authentication, you'll find this especially convenient. You can
store this token to permanently log in on this machine.

A page will appear in your web browser prompting you for authorization. If no
browser could be started, then gcd will instead show you a URL to copy/paste
into your browser. Log in if necessary. The page will ask whether you wish to
give the Datastore access. Click OK, then you will be given a token that you
will need to supply to the prompt from gcd.

From now on, when you use the --auth_mode=oauth2 option, gcd uses the saved
credentials.

Command-line arguments

The gcd tool accepts the following options for index management:

--dataset_id_override=...

Dataset ID to use instead of the one in the project directory. If your local
dataset ID and the Cloud Datastore project ID don't match, you need to use
this option in order to update indexes in the Cloud Datastore project.

--auth_mode=oauth2|password

Authentication mode for connecting to Cloud Datastore. oauth2 will take
you through an OAuth2 flow using a web browser. password will prompt you for
a password on the command line.

--no_confirm

Force deletion of indexes without prompting for confirmation. (For vacuumindexes only.)