Introduction

ViPR provides a mechanism for ingesting file-based data into object storage. The data is not moved, but is accessed from the ingested file system. Ingested object data can be used by object and HDFS storage clients and can also be accessed as files using the ViPR file-access mode. When accessing the ingested object data as HDFS data or using file-access mode, the original directory structure is preserved. This article describes how to perform the ingestion.

Ingestion overview

When ingesting data from a file system, the file system is added as a data store to an object virtual pool.

The data in the file system is associated with a specified destination bucket, which must already exist and be empty. The type of bucket determines the type of access that is allowed (object, HDFS, or both), so it is important to create the correct type of bucket for the ingest operation.

At Version 2.0, object ingestion is supported from file system-based data stores, not from commodity nodes.

The ingest process is represented in the figure below.

Ingesting objects into object storage

The destination bucket will have been created on a file system-based data store within an object virtual pool, and the ingested file system is added as a data store to the same object virtual pool. The data store created from the ingested file system cannot be used for new data. New data written to the ingest bucket will be written to a different data store within the virtual pool. Similarly, the data store will not be used for new buckets.

The object/HDFS storage capacity reported by ViPR, which is the same as the storage allocated against a license, will include the full capacity of the ingested file system. However, where the capacity of the ingested file system exceeds the size of the ingested data, the excess capacity cannot be used.
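The capacity rule above can be illustrated with a short calculation. This is only a sketch: the function name and the figures are hypothetical and are not taken from ViPR.

```python
def unusable_capacity_gb(file_system_capacity_gb: float, ingested_data_gb: float) -> float:
    """Capacity that is reported (and licensed) but cannot hold new data."""
    if ingested_data_gb > file_system_capacity_gb:
        raise ValueError("ingested data cannot exceed the file system capacity")
    return file_system_capacity_gb - ingested_data_gb


# Hypothetical figures: a 10,000 GB file system that holds 6,000 GB of data.
reported_gb = 10_000.0  # the full file system capacity is what ViPR reports
excess_gb = unusable_capacity_gb(reported_gb, 6_000.0)
print(f"reported: {reported_gb} GB, unusable excess: {excess_gb} GB")
```

In this example the whole file system counts against the license, while the difference between its capacity and the ingested data remains unavailable for new objects.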

In data services, a bucket is associated with a project, which is used for metering purposes. Hence, once ingestion is done, all the ingested objects belong to the bucket, and are accounted against the project associated with this bucket for metering purposes.

The following procedures are provided to support Data Services ingestion:

Ingest file system data

Use this procedure to ingest data from a file system into ViPR object storage. Where a step requires access to the ViPR UI, you can use the Online Help associated with the specified page to obtain guidance.

Before you begin

If the file system that holds the data to be ingested is not currently managed by ViPR, it must be brought under ViPR management using the following steps. These steps must be performed by a user with the ViPR System Administrator role:

Ensure that a data services virtual pool exists and has at least one data store associated with it.

Ensure that the file system is not exported. If it is, remove any exports.

If the file system was created in ViPR, and you want to ingest the data on the file system into object storage, you need to remove the file system exports. Currently there is no provision for performing this step from the UI and it must be performed using the API or CLI.

A bucket into which the data can be ingested must already exist and must be empty. The bucket should belong to the user who will own the ingested objects.

To create a bucket from the UI, use User > Service Catalog > Data Services > Create Bucket for Data Services. The online help associated with that topic will provide guidance.

You must have the Tenant Administrator role in ViPR in order to use the Ingest File System service.
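The prerequisites above can be collected into a simple pre-flight check before the service is ordered. The sketch below is illustrative only: the IngestPlan class and its fields are hypothetical, and the values would have to be gathered from the UI, API, or CLI by whatever means suits your environment.

```python
from dataclasses import dataclass


@dataclass
class IngestPlan:
    # All fields are hypothetical inputs, not ViPR API objects.
    virtual_pool_data_stores: int  # data stores already in the data services virtual pool
    file_system_exports: int       # exports still present on the source file system
    bucket_exists: bool            # the destination bucket has been created
    bucket_object_count: int       # objects currently in the destination bucket
    user_roles: frozenset          # roles held by the user ordering the service


def preflight_problems(plan: IngestPlan) -> list:
    """Return a list of unmet prerequisites; an empty list means none were found."""
    problems = []
    if plan.virtual_pool_data_stores < 1:
        problems.append("virtual pool needs at least one data store")
    if plan.file_system_exports > 0:
        problems.append("file system still has exports")
    if not plan.bucket_exists:
        problems.append("destination bucket does not exist")
    elif plan.bucket_object_count > 0:
        problems.append("destination bucket is not empty")
    if "Tenant Administrator" not in plan.user_roles:
        problems.append("user lacks the Tenant Administrator role")
    return problems


plan = IngestPlan(1, 2, True, 0, frozenset({"Tenant Administrator"}))
for problem in preflight_problems(plan):
    print("blocked:", problem)
```

Returning every unmet prerequisite at once, rather than stopping at the first, lets an administrator fix them all before retrying.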

Procedure

Select User > Service Catalog > Data Services > Ingest File Systems.

Enter a name for the data store that will be created from the ingested file system.

The data store will be assigned to the data services virtual pool to which the specified bucket (sometimes referred to as a keypool) belongs.

Enter a description that will be assigned to the data store that is created.

Select the project that owns the file system from which data will be ingested.

Only file systems that belong to the selected project will be offered as the source of the data.

The data that is ingested is associated, for metering and showback/chargeback purposes, with the project that owns the bucket into which it is ingested.

Select the file system from which the data is to be ingested.

The file system is used as the storage that backs the data store that is created.

Enter the name of an existing bucket into which the data will be ingested.

Select Order.

Results

When you ingest data into a bucket, all the files from the file share behave as objects in the bucket. If the bucket was created using the UI, or created using a different mechanism but without specifying an access control list (ACL), the default ACL for the bucket is applied and the owner of the bucket has full control. No other user has access unless the ACL is changed to allow access.

The data store that is created from the ingested file system cannot be used for new data. New data for the bucket into which the ingested data is placed, or new buckets, will be written to other data stores.

Ingest object data using the Controller REST API

The Controller REST API provides the ability to ingest files from a file system and make them available as objects.

The API provides a set of methods that enable ingestion tasks to be initiated and to be monitored. The methods are used by the ViPR UI to support the Ingestion service, and can be used when writing custom clients that want to implement the data ingest function.

Object data ingest methods

Method                                    Description
POST /object/ingestion                    Initiates object data ingestion. The method is asynchronous.
GET /object/ingestion/{id}                Returns the state of a specified ingestion task. As ingestion is asynchronous, this method can be used to determine when it is complete.
GET /object/ingestion                     Lists all ingestion tasks that are in progress or have completed.
POST /object/ingestion/{id}/deactivate    Enables an ingestion task to be deactivated.
GET /object/ingestion/{id}/tasks/{op_id}  Returns the status of an operation associated with an ingestion task.

Details of the payloads and return parameters for these methods can be found in EMC ViPR REST API Reference. The full procedure for performing object data ingestion is described in Ingestion overview.
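Because ingestion is asynchronous, a custom client typically polls GET /object/ingestion/{id} until the task finishes. The following is a minimal sketch of such a loop, not the ViPR client implementation. The state names are placeholders (the real values are documented in the EMC ViPR REST API Reference), and fetch_state is a hypothetical callable that you supply to perform the GET and extract the state.

```python
import time
from typing import Callable, Iterable


def wait_for_ingestion(
    fetch_state: Callable[[], str],
    terminal_states: Iterable[str],
    interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until fetch_state() returns a terminal state, then return that state."""
    terminal = set(terminal_states)
    deadline = time.monotonic() + timeout_s
    while True:
        state = fetch_state()  # e.g. wraps GET /object/ingestion/{id}
        if state in terminal:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"ingestion still '{state}' after {timeout_s} s")
        sleep(interval_s)


# Simulated task: the state strings here are placeholders, not ViPR values.
fake_states = iter(["in_progress", "in_progress", "done"])
final = wait_for_ingestion(
    lambda: next(fake_states),
    terminal_states={"done", "error"},
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print("final state:", final)
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any HTTP library and makes it easy to exercise without a ViPR instance.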

The POST /object/ingestion method is used to initiate an object ingestion operation and the object data ingest parameters are provided in the table below.

Object data ingest parameters

Parameter              Description
datastore name         The name of a data store that will be created from the specified file share. The data store will be assigned to the data services virtual pool that the specified bucket (keypool) belongs to.
datastore description  A description assigned to the data store that will be created.
fileshare id           The identity of the file system on which the data to be ingested is located. The file share is used as the storage that backs the data store that is created.
keypool name           The name of the bucket to which the ingested data will be assigned. This bucket must already have been created.

The operation requires the Tenant Administrator role and the new data that is ingested belongs to the tenant that owns the data services virtual pool.
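The parameters above can be assembled and validated before the request is sent. The sketch below only builds the request; it does not contact ViPR. The JSON key names are assumptions derived from the parameter labels in the table, and the URN in the example is a made-up placeholder. Check the EMC ViPR REST API Reference for the actual payload schema.

```python
import json


def build_ingestion_request(datastore_name, fileshare_id, keypool_name,
                            datastore_description=""):
    """Validate the ingest parameters and return a description of the POST request."""
    required = (
        ("datastore name", datastore_name),
        ("fileshare id", fileshare_id),
        ("keypool name", keypool_name),
    )
    for label, value in required:
        if not value or not value.strip():
            raise ValueError(f"{label} is required")
    if not fileshare_id.startswith("urn:"):
        raise ValueError("fileshare id should be a urn: identifier")

    # Key names are hypothetical; they mirror the table labels only.
    payload = {
        "datastore_name": datastore_name,
        "datastore_description": datastore_description,
        "fileshare_id": fileshare_id,
        "keypool_name": keypool_name,
    }
    return {"method": "POST", "path": "/object/ingestion", "body": json.dumps(payload)}


request = build_ingestion_request(
    datastore_name="ingested-store",
    fileshare_id="urn:example:fileshare:1",  # placeholder, not a real ViPR URN
    keypool_name="ingest-bucket",
    datastore_description="Data store backed by the ingested file system",
)
print(request["method"], request["path"])
print(request["body"])
```

Validating locally catches a missing bucket name or a malformed file system identity before an asynchronous task is started on the server.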

The file system identity can be obtained by using the GET /file/filesystems/bulk method to return the IDs of all ViPR-managed file systems, and then passing that list of IDs to POST /file/filesystems/bulk to obtain the details of each file system.
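The two-step lookup described above might be sketched as follows. The code constructs the requests with the Python standard library but does not send them. The base URL, the X-SDS-AUTH-TOKEN header name, the token value, and the {"id": [...]} body shape are all assumptions made for illustration and should be verified against the EMC ViPR REST API Reference.

```python
import json
import urllib.request

BULK_PATH = "/file/filesystems/bulk"


def list_ids_request(base_url: str, token: str) -> urllib.request.Request:
    """Step 1: GET the IDs of all ViPR-managed file systems."""
    return urllib.request.Request(
        base_url + BULK_PATH,
        headers={"X-SDS-AUTH-TOKEN": token, "Accept": "application/json"},  # assumed header
        method="GET",
    )


def details_request(base_url: str, token: str, ids: list) -> urllib.request.Request:
    """Step 2: POST the ID list back to obtain the details of each file system."""
    body = json.dumps({"id": list(ids)}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        base_url + BULK_PATH,
        data=body,
        headers={
            "X-SDS-AUTH-TOKEN": token,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


base = "https://vipr.example.com:4443"  # hypothetical address
step1 = list_ids_request(base, "example-token")
# In a real client the IDs would come from the response to step 1.
step2 = details_request(base, "example-token", ["urn:example:fileshare:1"])
print(step1.get_method(), step1.full_url)
print(step2.get_method(), step2.full_url, step2.data.decode("utf-8"))
```

Sending either request with urllib.request.urlopen would also require handling authentication and the appliance's TLS certificate, which are outside the scope of this sketch.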

Removing file system exports

You can remove file system exports using the ViPR Controller REST API or using the CLI.

Before you begin

This procedure identifies the steps required to remove file system exports. If you want to perform these steps programmatically, you can use the ViPR Java Client. If you want to perform them manually, you can use a client such as "curl" or a browser-based web client, or you can use the CLI.

Procedure

To unexport a file system using the API, follow the steps below:

Get the identity of the ViPR-managed file system from which you want to remove the exports.

You can find the identity by locating the file system on the User > Resources > File Systems page, expanding the file system, locating the File System field, and obtaining the associated urn: value.

Alternatively, you can:

Get a list of the file system resources managed by ViPR.

GET /file/filesystems/bulk

Returns the identities of all file system resources. In the example below, only one file system exists.