New in Luminous: RGW Metadata Search

RGW metadata search is a new feature that was added in Ceph Luminous. It enables integration with Elasticsearch to provide a search API to query an object store based on object metadata.

A new zone type

A zone in the RGW multisite system is a set of radosgw daemons serving the same data, backed by the same set of RADOS pools in Ceph. Multiple zones that are placed in the same zonegroup mirror each other's data. In most cases there is one zone per cluster, and multiple Ceph clusters in different data centers or geographies are federated.

As part of this new multisite architecture we introduced a way to create new tiers or zone types. New sync modules can now also send a copy of the data (or metadata) to a different data tier. Enter Elasticsearch, which can now be used to index the metadata of all objects in a zonegroup. A zonegroup can then contain some zones storing copies of all of the objects (and serving them up via radosgw) and some indexing the object metadata in Elasticsearch.

For example, we can create a zonegroup that has three zones: zone A, zone B, and zone M. Zone A and zone B are data zones. Users will create buckets there, upload objects to them, and so on. Zone M will be a metadata search zone. Whenever data is written to zone A or B, the elasticsearch sync module will push a copy of the metadata to Elasticsearch.

One of the main questions we had when designing this feature was whether we should involve RGW with the search queries, or whether that should be left for the users to deal with (presumably by accessing Elasticsearch directly). We concluded it would be much better in terms of user experience if we served as a proxy between users and Elasticsearch and managed the queries ourselves. This allows us to provide a more consistent experience, and solves the authentication and authorization problems. End users do not have access to Elasticsearch, and we make sure that the queries that are sent to Elasticsearch request only data that the users are permitted to read.

Below is a summary of the new APIs and new configurables, followed by a real-world configuration example.

New RESTful APIs

New REST APIs were added to RGW in order to use and control metadata search:

Query metadata

The request needs to be sent to an RGW that belongs to the Elasticsearch tier zone.

Input:

GET /[<bucket>]?query=<expression>

Request params:

max-keys: max number of entries to return

marker: result pagination marker

The query expression takes the form:

[(]<arg> <op> <value> [)][<and|or> …]

where the op can be any of

<, <=, ==, >=, >

For example:

GET /?query=name==foo

This returns all indexed keys that are named ‘foo’ and that the user has permission to read.

The output will be a list of keys in XML that is similar to the S3 list buckets response.
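As a rough illustration of the request format above, the following Python sketch assembles the query URL and percent-encodes the expression. The function name build_search_url and the endpoint http://rgw-es.example.com:8002 are hypothetical. The sketch leaves out request signing, which a real request to RGW would still need.

```python
from urllib.parse import quote, urlencode


def build_search_url(endpoint, expression, bucket=None, max_keys=None, marker=None):
    """Build a metadata search URL of the form GET /[<bucket>]?query=<expression>.

    `endpoint` is a hypothetical RGW address in the Elasticsearch tier zone.
    """
    params = {"query": expression}
    if max_keys is not None:
        params["max-keys"] = str(max_keys)
    if marker is not None:
        params["marker"] = marker
    # Percent-encode parameter values (for example, '=' becomes %3D).
    query_string = urlencode(params, quote_via=quote)
    # The bucket is optional; without it the request targets the root path.
    path = "/" + quote(bucket, safe="") if bucket else "/"
    return endpoint.rstrip("/") + path + "?" + query_string


if __name__ == "__main__":
    print(build_search_url("http://rgw-es.example.com:8002", "name==foo"))
```

Passing bucket="buck" and max_keys=10 would narrow the same search to a single bucket and cap the number of entries in the response.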

Configure custom metadata fields

Define which custom metadata entries should be indexed under the specified bucket, along with the types of the keys. If explicit custom metadata indexing is configured, this is needed so that RGW will index the specified custom metadata values. Otherwise it is needed in cases where the indexed metadata keys are of a type other than string.

Note that this request needs to be sent to the metadata master zone.

Input:

PUT /<bucket>?mdsearch

HTTP headers:

X-Amz-Meta-Search: <key [; type]> [, ...]

Where key is x-amz-meta-<name>, and type is one of the following: string, integer, date.
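To make the header format concrete, here is a small Python sketch that renders a mapping of metadata names to types as an X-Amz-Meta-Search value. The helper build_mdsearch_header and the field names used with it are made up for illustration; only the header layout and the three type names come from the description above.

```python
ALLOWED_TYPES = ("string", "integer", "date")
META_PREFIX = "x-amz-meta-"


def build_mdsearch_header(fields):
    """Render {name: type-or-None} as an X-Amz-Meta-Search header value.

    Hypothetical helper: names without the x-amz-meta- prefix get it added.
    """
    entries = []
    for name, type_name in fields.items():
        key = name.lower()
        if not key.startswith(META_PREFIX):
            key = META_PREFIX + key
        if type_name is None:
            entries.append(key)  # the type part is optional
            continue
        if type_name not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type {type_name!r} for {key}")
        entries.append(f"{key}; {type_name}")
    return ", ".join(entries)


if __name__ == "__main__":
    # Example field names are assumptions, not taken from a real bucket.
    print(build_mdsearch_header({"color": "string", "created": "date"}))
```

The resulting string would be sent as the X-Amz-Meta-Search header of the PUT /<bucket>?mdsearch request.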

Delete custom metadata configuration

Delete custom metadata bucket configuration. This request should be sent to the metadata master zone.

Input:

DELETE /<bucket>?mdsearch

Get custom metadata configuration

Retrieve custom metadata bucket configuration.

Input:

GET /<bucket>?mdsearch

Elasticsearch tier zone configurables

The following configurables are now defined:

endpoint: Specifies the Elasticsearch server endpoint to access

num_shards (integer): The number of shards that Elasticsearch will be configured with on data sync initialization. Note that this cannot be changed after initialization. Any change here requires rebuilding the Elasticsearch index and reinitializing the data sync process.

num_replicas (integer): The number of replicas that Elasticsearch will be configured with on data sync initialization.

explicit_custom_meta (true | false): Specifies whether all user custom metadata will be indexed, or whether users will need to configure (at the bucket level) which custom metadata entries should be indexed. This is false by default.

index_buckets_list (comma separated list of strings): If empty, all buckets will be indexed. Otherwise, only buckets specified here will be indexed. It is possible to provide bucket prefixes (e.g., foo*), or bucket suffixes (e.g., *bar).

approved_owners_list (comma separated list of strings): If empty, buckets of all owners will be indexed (subject to other restrictions). Otherwise, only buckets owned by the specified owners will be indexed. Suffixes and prefixes can also be provided.

override_index_path (string): If not empty, this string will be used as the Elasticsearch index path. Otherwise the index path will be determined and generated on sync initialization.
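The filtering rules described for index_buckets_list and approved_owners_list can be sketched in a few lines of Python. This only illustrates the semantics as stated above (an empty list matches everything, foo* is a prefix, *bar is a suffix); it is not RGW's implementation, and the function names are invented for the example.

```python
def parse_list(value):
    """Split a comma-separated configurable into a list of trimmed entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def matches_filter(name, patterns):
    """Return True if `name` passes a list of exact, prefix, or suffix patterns."""
    if not patterns:
        return True  # an empty list means no restriction
    for pattern in patterns:
        if pattern.endswith("*") and name.startswith(pattern[:-1]):
            return True  # prefix pattern such as foo*
        if pattern.startswith("*") and name.endswith(pattern[1:]):
            return True  # suffix pattern such as *bar
        if name == pattern:
            return True  # exact name
    return False


if __name__ == "__main__":
    patterns = parse_list("foo*, *bar, baz")
    for bucket in ("foo-logs", "crowbar", "baz", "other"):
        print(bucket, matches_filter(bucket, patterns))
```

Under these assumptions a bucket is indexed only if it passes both filters: its name against index_buckets_list and its owner against approved_owners_list.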

Configuration example

Here is a simple configuration in which we create a new realm, with a single zonegroup, and have two zones in that zonegroup: a data zone and a metadata search zone. Both zones will run on the same Ceph cluster.

Naming

For the purposes of this example, we use the following zone information:

Start second RGW

Create a user, upload stuff

$ radosgw-admin user create --uid=yehsad --display-name=yehuda
...

Here we use the obo tool (available at https://github.com/ceph/obo) to create buckets and upload some data. This is just an example; users can upload data using any S3 or Swift compatible client tool. You will need to fill in the access key and secret based on the output of the user create command.

Conclusion

With minimal configuration, RGW can leverage Elasticsearch to query your object store based on basic metadata (like object names). With a bit more effort it can be used to index based on custom metadata fields that may be in use in your environment.

This is implemented on top of the RGW multisite infrastructure, which is proving to be quite flexible. Stay tuned in future releases for sync plugins that replicate data to (or even from) cloud storage services like S3!