Migrating from Amazon S3 to Cloud Storage

This page describes how to migrate from Amazon Simple Storage Service (Amazon
S3) to Cloud Storage for users sending requests using an API. If you are not
currently using Amazon S3 and you want to send requests using the
Cloud Storage API, then start here: XML API Overview.

If you are new to Cloud Storage and will not be using the API directly, consider
using the Google Cloud Platform Console to set up and manage transfers. The
Google Cloud Platform Console provides a graphical interface to Cloud Storage that
enables you to accomplish many of your storage tasks using just a browser,
including migration of your data from Amazon S3 to Cloud Storage.

Overview of migration

If you are an Amazon S3 user, you can easily migrate your applications that use
Amazon S3 to use Cloud Storage. You have two migration options:

Simple Migration

This is the easiest way to get started with Cloud Storage if you are coming
from Amazon S3 because it requires just a few simple changes to the tools and
libraries you currently use with Amazon S3. For more information, see Simple
Migration.

While a simple migration allows you to get going quickly with
Cloud Storage, it does not allow you to use all the features of Cloud Storage.
To take full advantage of Cloud Storage, follow the steps for a full migration.

Full Migration

A full migration from Amazon S3 to Cloud Storage requires a few more steps than a
simple migration, but the benefit is that you can use all the features of
Cloud Storage, including support for service accounts, multiple projects, and OAuth
2.0 for authentication. For more information, see Full
Migration.

Simple migration

In a simple migration from Amazon S3 to Cloud Storage, you can use the tools and
libraries that already generate authenticated REST requests to Amazon S3 to
send authenticated requests to Cloud Storage as well. The changes you need to make to
your existing tools and libraries are described in this section.

That's it! At this point you can start using your existing tools and libraries
to send keyed-hash message authentication code (HMAC) requests to Cloud Storage.

When you use the Cloud Storage XML API in a simple migration scenario,
specifying the AWS signature identifier in the Authorization header lets
Cloud Storage know to expect x-amz-* headers and Amazon S3 ACL XML syntax in
your request.

Note: A simple migration from Amazon S3 is an easy and quick way to get started
using Cloud Storage, with a minimal investment in time and changes to your existing
code. After you have some experience with a simple migration, you can fully
transition to Cloud Storage to take advantage of all its features. For more
information, see Full Migration.

Setting a default project

To use Cloud Storage in a simple migration scenario, you must choose a
default project. When you choose a default project, you are telling
Cloud Storage the project to use for operations like GET service or PUT
bucket.

If the project is already the default project, you will see
PROJECT-ID is your default project.

This project is now your default project. You can change your default project at
any time by choosing a different project and enabling interoperable access for
it.

Note: Choosing a default project for a simple migration (interoperable access)
does not impact using the XML API for a full migration, where you specify the
x-goog-project-id header with each request.

Managing developer keys for a simple migration

To use the Cloud Storage XML API in a simple migration scenario, you must use
keyed-hash message authentication code (HMAC) authentication with
Cloud Storage developer keys. Developer keys consist of an access key and a
secret. An access key is a 20-character alphanumeric string, which is linked to
your Google account. All authenticated Cloud Storage requests, except those
that use cookie-based authentication, must use an access key in the request so
that the Cloud Storage system knows who is making the request. The following
is an example of an access key:

GOOGTS7C7FUP3AIRVJTE

A secret is a 40-character Base64-encoded string that is linked to a specific
access key. A secret is a preshared key that only you and the Cloud Storage
system know. You must use your secret to sign all requests as part of the
authentication process. The following is an example of a secret:

If you have not set up interoperability before, click Enable
interoperability access.

Click Create a new key.

Security tips for working with developer keys

You can create up to five developer keys. This is useful if you are working on
several different projects and you want to use different developer keys for each
project.

Important: Never let another person use your developer keys. Your developer keys
are linked to your Google account and you should treat them as you would any set
of access credentials.

You can also use the key management tool to delete your developer keys and
create new developer keys. You may want to do this if you think someone else is
using your developer keys or you need to change your keys as part of a key
rotation, which is a security best practice. If you have developer keys and you
want to create new developer keys, we recommend that you first update your code
with the new developer keys before you delete the old keys. When you delete
developer keys, they become immediately invalid and they are not recoverable.

Authenticating in a simple migration scenario

Authorization header

For operations in a simple migration scenario that require authentication, you
will include an Authorization request header just like you do for requests to
Amazon S3. The Authorization header syntax for an Amazon S3 request is:

Authorization: AWS AWS-ACCESS-KEY:signature

In a simple migration scenario, the only change to the header is to replace the
AWS access key with your Google developer access key:

Authorization: AWS GOOG-ACCESS-KEY:signature

The parts of the Authorization header are:

Signature identifier

The signature identifier identifies the signature algorithm and version that
you are using. Using AWS indicates that you intend to send x-amz-* headers.

Access key

The access key identifies the entity that is making and signing the request.
In a simple migration, replace the Amazon Web Services (AWS) access key ID you
use to access Amazon S3 with your Google developer access key. Your Google
developer access key starts with "GOOG".

Signature

The signature is a keyed cryptographic hash of various request headers. The
signature is created by using HMAC-SHA1 as the hash function and your secret as
the cryptographic key. The resulting digest is then Base64 encoded. When
Cloud Storage receives your signed request, it uses the access key to look up
your secret and verify that you created the signature. For more information on
how to obtain an access key and secret, see Managing developer keys for a
simple migration.

In a simple migration, replace the AWS secret access key with your Google
developer key secret as the cryptographic key.

Authentication calculation

This section describes the process of authenticating an XML API request in a
simple migration scenario. While this section can be used to develop your own
code to sign requests, it is mainly intended to be a review if you already have
tools or libraries that sign your requests to Amazon S3. In this case, you will
continue to use these tools to access Cloud Storage using the XML API, with the
changes shown here.

This authentication method provides both identity and strong authentication
without revealing your secret. Providing both identity and authentication in
every request helps ensure that every Cloud Storage request is processed under
a specific user account and with the authority of that user account. This is
possible because only you and the Cloud Storage system know your secret. When
you make a request, the Cloud Storage system uses your secret to calculate the
same signature for the request that you calculated when you made the request. If
the signatures match, then the Cloud Storage system knows that only you could
have made the request.

The following pseudocode shows how to create the signature for the Authorization
header:

To create the signature, you use a cryptographic hash function known as
HMAC-SHA1. HMAC-SHA1 is a hash-based message authentication code (HMAC) and is
described in RFC 2104. It requires two input parameters, both UTF-8
encoded: a key and a message. The key is your Cloud Storage secret and the
message must be constructed by concatenating specific HTTP headers in a specific
order. The following pseudocode shows how to construct the message:

Each of the canonical entities that make up the message represents a
strictly-ordered and strictly-formatted concatenation of various headers and
resources. The following sections describe how to construct each of these
entities.
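As a minimal sketch of the signing step in Python: the function below assumes that the message is the concatenation of CanonicalHeaders, CanonicalExtensionHeaders, and CanonicalResource, in the order in which this page describes them. The function name sign_message and its parameters are illustrative and are not part of any Cloud Storage library.

```python
import base64
import hashlib
import hmac


def sign_message(secret: str,
                 canonical_headers: str,
                 canonical_extension_headers: str,
                 canonical_resource: str) -> str:
    """Return the Base64-encoded HMAC-SHA1 signature of the message.

    Assumption: the message is the three canonical entities joined in
    this order, with both the key and the message UTF-8 encoded.
    """
    message = (canonical_headers
               + canonical_extension_headers
               + canonical_resource)
    digest = hmac.new(secret.encode("utf-8"),
                      message.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

The returned string is the signature part of the Authorization header. Because HMAC-SHA1 produces a 20-byte digest, the Base64 text is always 28 characters long.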

CanonicalHeaders

You construct the CanonicalHeaders portion of MessageToBeSigned by
concatenating several header values and adding a newline (U+000A) after each
header value. The following pseudocode notation shows you how to do this
(newlines are represented by \n):

Do not include the header names in the concatenated string; include only the
header values. If a required header does not exist in a request, substitute an
empty string for the header value and be sure to include the newline after the
empty string. Also, the Date header is required for all authenticated
requests, so this field must be populated with a valid date and time stamp. The
date and time stamp must be within 15 minutes of when Cloud Storage receives
your request.

You can also use the x-amz-date or x-goog-date extension headers to specify
the date and time stamp. If you use the date extension header, Cloud Storage
ignores the Date header in your request. In this case, substitute an empty
string for the Date header.
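The rules above could be sketched in Python roughly as follows. This sketch assumes that the values are the HTTP verb, Content-MD5, Content-Type, and Date, in that order, as in the Amazon S3 signing scheme that these tools already implement. The function name canonical_headers is hypothetical.

```python
from email.utils import formatdate


def canonical_headers(verb: str, headers: dict) -> str:
    """Join the verb and the required header values, one per line.

    Assumption: the order is verb, Content-MD5, Content-Type, Date.
    Missing headers contribute an empty string plus the newline.
    """
    lower = {name.lower(): value for name, value in headers.items()}
    # A date extension header replaces Date, which then becomes empty.
    if "x-amz-date" in lower or "x-goog-date" in lower:
        date = ""
    else:
        date = lower.get("date", "")
    values = [verb.upper(),
              lower.get("content-md5", ""),
              lower.get("content-type", ""),
              date]
    return "".join(value + "\n" for value in values)


# Usage: formatdate(usegmt=True) yields an HTTP-style date for "now".
example = canonical_headers(
    "PUT", {"Content-Type": "image/jpg", "Date": formatdate(usegmt=True)})
```

The sketch does not check that the Date value is within 15 minutes of the current time. A caller that builds requests ahead of time would need to regenerate the date before sending.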

CanonicalExtensionHeaders

You construct the CanonicalExtensionHeaders portion of the message by
concatenating all extension (custom) headers that begin with x-amz- or
x-goog-. However, you cannot perform a simple concatenation. You must
concatenate the headers using the following process:

Make all custom header names lowercase.

Sort all custom headers lexicographically by header name.

Eliminate duplicate header names by creating one header name with a
comma-separated list of values. Be sure there is no whitespace between the
values and be sure that the order of the comma-separated list matches the order
that the headers appear in your request. For more information, see RFC 7230
section 3.2.

Replace any folding whitespace or newlines (CRLF or LF) with a single space.
For more information about folding whitespace, see RFC 2822 section
2.2.3.

Remove any whitespace around the colon that appears after the header name.

Append a newline (U+000A) to each custom header.

Concatenate all custom headers.

It's important to note that you use both the header name and the header value
when you construct the CanonicalExtensionHeaders portion of the message. This
is different than the CanonicalHeaders portion of the message, which used only
header values.
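One way to express the steps above in Python is shown below. This is a sketch under stated assumptions rather than a reference implementation: the function name canonical_extension_headers is hypothetical, and the input is assumed to be a list of (name, value) pairs in the order they appear in the request.

```python
import re
from typing import Iterable, Tuple


def canonical_extension_headers(headers: Iterable[Tuple[str, str]]) -> str:
    """Build the extension-header part of the message to be signed."""
    merged = {}
    for name, value in headers:
        key = name.strip().lower()          # lowercase the header name
        if not key.startswith(("x-amz-", "x-goog-")):
            continue                        # only extension headers count
        # Replace folding whitespace or newlines with a single space.
        value = re.sub(r"[ \t]*\r?\n[ \t]*", " ", value).strip()
        # Duplicate names collapse into one entry, in request order.
        merged.setdefault(key, []).append(value)
    # Sort by header name; emit "name:value1,value2" plus a newline.
    return "".join(f"{key}:{','.join(values)}\n"
                   for key, values in sorted(merged.items()))
```

Stripping the name and value removes any whitespace around the colon, and joining with a bare comma keeps the value list free of whitespace.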

CanonicalResource

You construct the CanonicalResource portion of the message by concatenating
the resource path (bucket, object, and subresource) that the request is acting
on. To do this, you can use the following process:

Begin with an empty string.

If the bucket name appears in the Host header, add a slash (/) and the bucket
name to the string (for example, /travel-maps). If the bucket name appears in
the path portion of the HTTP request, do nothing.

Add the path portion of the HTTP request to the string, excluding any query
string parameters. For example, if the path is /europe/france/paris.jpg?acl and
you already added the bucket travel-maps to the string, then you need to add
/europe/france/paris.jpg to the string.

If the request is scoped to a subresource, such as ?acl, add this subresource
to the string, including the question mark.

Copy the HTTP request path literally: that is, you should include all URL
encoding (percent signs) in the string that you create. Include only query
string parameters that designate subresources (such as acl). You should not
include query string parameters such as ?prefix, ?max-keys, ?marker, and
?delimiter.
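The process above might look like this in Python. The function name canonical_resource is hypothetical, and the default set of subresources contains only acl because that is the only one named on this page. Extend it as needed for your requests.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit


def canonical_resource(request_target: str,
                       host_bucket: Optional[str] = None,
                       subresources: Sequence[str] = ("acl",)) -> str:
    """Build the resource part of the message to be signed.

    request_target is the path and query exactly as sent, for example
    "/europe/france/paris.jpg?acl". host_bucket is the bucket name when
    it appears in the Host header, otherwise None.
    """
    parts = urlsplit(request_target)
    resource = ""
    if host_bucket:
        resource += "/" + host_bucket
    # urlsplit does not decode the path, so URL encoding is preserved.
    resource += parts.path
    for param in parts.query.split("&"):
        if param.split("=", 1)[0] in subresources:
            resource += "?" + param
            break
    return resource
```

For example, canonical_resource("/europe/france/paris.jpg?acl", host_bucket="travel-maps") yields /travel-maps/europe/france/paris.jpg?acl, while parameters such as prefix and max-keys are dropped.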

The following examples show what the signed message looks like for several
different kinds of requests.

Sample authentication request

The following examples upload an object named /europe/france/paris.jpg to a
bucket named my-travel-maps, apply the predefined ACL public-read, and
define a custom metadata header for reviewers. Here is the request to a bucket
in Amazon S3:

This request did not provide a Content-MD5 header, so an empty string is shown
in the message (second line).
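As a rough illustration of a request like the one described above, the following Python sketch builds a message and an Authorization header for a simple migration. The date, the Content-Type, the reviewer address, and the secret are made-up placeholder values, and the access key is the example key shown earlier on this page. No expected signature is given because the secret is not real.

```python
import base64
import hashlib
import hmac

ACCESS_KEY = "GOOGTS7C7FUP3AIRVJTE"   # example access key from this page
SECRET = "placeholder-secret"          # placeholder, not a real secret

# Bucket given in the path, so the resource is /bucket/object.
message = (
    "PUT\n"                                   # HTTP verb
    "\n"                                      # Content-MD5 (not provided)
    "image/jpg\n"                             # Content-Type (illustrative)
    "Mon, 01 Jan 2024 00:00:00 GMT\n"         # Date (illustrative)
    "x-amz-acl:public-read\n"
    "x-amz-meta-reviewer:jane@example.com\n"  # illustrative metadata
    "/my-travel-maps/europe/france/paris.jpg"
)

digest = hmac.new(SECRET.encode("utf-8"),
                  message.encode("utf-8"),
                  hashlib.sha1).digest()
signature = base64.b64encode(digest).decode("ascii")

# Same header shape as for Amazon S3, with the Google access key.
authorization = f"AWS {ACCESS_KEY}:{signature}"
```

The empty second line of the message is where the Content-MD5 value would go if the request supplied one.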

Access control in a simple migration scenario

To support simple migrations, Cloud Storage accepts ACLs produced by Amazon
S3. In a simple migration scenario, you use AWS as your
signature identifier, which tells Cloud Storage to expect ACLs written in
Amazon S3 ACL XML syntax. You should ensure that the Amazon S3 ACLs you use map
to the Cloud Storage ACL model. For example, if your tools and libraries use
Amazon S3's ACL syntax to grant bucket WRITE permission, then they must also
grant bucket READ permission because Cloud Storage permissions are
concentric. You do not need to specify both WRITE and READ
permission when you grant WRITE permission using the Cloud Storage syntax.
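If your tooling builds Amazon S3 bucket grants programmatically, a small helper like the following could add the READ grant that the paragraph above calls for. The function name and the use of a set of permission strings are assumptions made for illustration only.

```python
def with_concentric_read(permissions):
    """Return the bucket permissions to send in Amazon S3 ACL syntax.

    Cloud Storage permissions are concentric, so a WRITE grant written
    in Amazon S3 syntax is paired with a READ grant.
    """
    result = set(permissions)
    if "WRITE" in result:
        result.add("READ")
    return result
```

A grant list that contains only READ, or that already contains both permissions, passes through unchanged.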

Finally, in a simple migration scenario, you can also use the GOOG1 signature
identifier in the Authorization header. In this case, you must use the
Cloud Storage ACL syntax and ensure that all of your headers are Google
headers, x-goog-*. While this is possible, we recommend that you move to a
full migration as described below in order to realize all the benefits of
Cloud Storage.

Full migration

A full migration from Amazon S3 to Cloud Storage enables you to take advantage of
all the features of Cloud Storage including:

Support for service accounts

Service accounts are useful for server-to-server interactions that do not
require end-user involvement. For more information, see Service
Accounts.

Support for multiple projects

Multiple projects allow you to have, in effect, many instances of the
Cloud Storage service. This allows you to separate different functionality or
services of your application or business as needed. For more information, see
Using Projects.

OAuth 2.0 authentication

OAuth 2.0 relies on SSL for security instead of requiring your application to
do cryptographic signing directly, and is easier to implement. With OAuth, your
application can request access to data associated with a user's Google Account,
and access can be scoped to several levels, including read-only, read-write, and
full-control. For more information, see OAuth 2.0 Authentication.

To migrate fully from Amazon S3 to Cloud Storage, you will need to make the
following changes:

Set the x-goog-project-id header in your requests. Note
that in a simple migration scenario, you chose a default project for all
requests. This is not needed in a full migration.

Set up OAuth 2.0 authentication as described in OAuth 2.0
Authentication. The first step is to register the application from which
you will issue requests with Google. Using OAuth 2.0 means that your
Authorization header will look like this:

Authorization: Bearer <oauth2_token>
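Taken together, the two changes amount to sending headers like the ones built by this Python sketch. The function name and the placeholder argument values are illustrative, and obtaining the OAuth 2.0 token itself is outside the scope of the sketch.

```python
def full_migration_headers(oauth2_token: str, project_id: str) -> dict:
    """Return the request headers that a full migration adds or changes."""
    return {
        # Bearer token replaces the HMAC-signed AWS Authorization value.
        "Authorization": f"Bearer {oauth2_token}",
        # Project is named per request instead of via a default project.
        "x-goog-project-id": project_id,
    }


# Usage with placeholder values:
headers = full_migration_headers("<oauth2_token>", "<project_id>")
```

Because the token is sent as is, no message canonicalization or signing is required on the client.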

Access control in a full migration

This section shows a few examples of access control to help you migrate from
Amazon S3 to Cloud Storage. For an overview of access control in Cloud Storage, see
Access Control.

In Cloud Storage, there are several ways to apply ACLs to buckets and objects (see
Specifying Bucket and Object ACLs). Two of the ways you specify ACLs are
analogous to what you do in Amazon S3:

The acl query string parameter lets you apply ACLs for specific scopes.

The x-goog-acl request header lets you apply predefined ACLs, which are
sometimes known as canned ACLs.

Using the acl query string parameter

You can use the acl query string parameter for a Cloud Storage request
exactly the same way you would use it for an Amazon S3 request. The acl
parameter is used in conjunction with the PUT method to apply ACLs to the
following: an existing object, an existing bucket, or a bucket you are creating.
When you use the acl query string parameter in a PUT request, you must attach
an XML document (using Cloud Storage ACL syntax) to the body of your request. The
XML document contains the individual ACL entries that you want to apply to the
bucket or object.

The following example shows a PUT request to Amazon S3 that uses the acl query
string parameter. ACLs are defined in an XML document sent in the request body.
The PUT request changes the ACLs on an object named europe/france/paris.jpg
that is in a bucket named my-travel-maps. The ACL grants jane@gmail.com
FULL_CONTROL permission.

Note that Cloud Storage does not require an <Owner/> element in the
ACL XML document. For more information, see Default Object ACLs.

You can also retrieve bucket and object ACLs by using the acl query string
parameter with the GET method. The ACLs are described in an XML document, which
is attached to the body of the response. You must have FULL_CONTROL permission
to apply or retrieve ACLs on an object or bucket.

Applying ACLs with an extension request header

You can use the x-goog-acl header in a Cloud Storage request to apply
predefined ACLs to buckets and objects exactly the same way you would use the
x-amz-acl header in an Amazon S3 request. You typically use the x-goog-acl
(x-amz-acl) header to apply a predefined ACL to a bucket or object when you
are creating or uploading the bucket or object. The Cloud Storage predefined
ACLs are similar to Amazon S3 Canned ACLs, including private,
public-read, public-read-write, as well as others. For a list of Cloud Storage
predefined ACLs, see Predefined ACLs.

The following example shows a PUT Object request that applies the public-read
ACL to an object named europe/france/paris.jpg that is being uploaded into a
bucket named my-travel-maps in Amazon S3.

You can also use the x-goog-acl header to apply a predefined ACL to an
existing bucket or object. To do this, include the acl query string parameter
in your request but do not include an XML document in your request. Applying a
predefined ACL to an existing object or bucket is useful if you want to change
from one predefined ACL to another, or you want to update custom ACLs to a
predefined ACL. For example, the following PUT Object request applies the
predefined ACL private to an object named europe/france/paris.jpg that is
in a bucket named my-travel-maps.
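A request of this shape could be assembled in Python as sketched below. The endpoint host storage.googleapis.com and the bearer token are assumptions and placeholders that this page does not specify, and the sketch builds the request without sending it.

```python
import urllib.request

# Assumption: path-style URL on the XML API endpoint; token is a placeholder.
url = ("https://storage.googleapis.com/"
       "my-travel-maps/europe/france/paris.jpg?acl")

request = urllib.request.Request(
    url,
    data=None,              # no XML document in the request body
    method="PUT",
    headers={
        "Authorization": "Bearer <oauth2_token>",
        "x-goog-acl": "private",    # predefined ACL to apply
        "Content-Length": "0",
    },
)

# urllib.request.urlopen(request) would send it; omitted in this sketch.
```

The acl query string parameter targets the ACL of the existing object, and the x-goog-acl header supplies the predefined ACL in place of an XML body.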

Migrating from Amazon S3 to Cloud Storage Request Methods

Cloud Storage supports the same standard HTTP request methods for reading and
writing data to your buckets as are supported in Amazon S3. Therefore, the
majority of the tools and libraries that you currently use with Amazon S3
will work as is with Cloud Storage. Cloud Storage supports the following request
methods:

Service request for GET.

Bucket requests, including PUT, GET, DELETE.

Object requests, including GET, POST, PUT, HEAD, and DELETE.

For more information, see XML API Reference Methods. Keep
in mind that when you send requests to Cloud Storage, you will need to change the
request body, when applicable, to use the appropriate Cloud Storage syntax. For
example, when you create a lifecycle configuration for a bucket, use the
Cloud Storage lifecycle XML, which differs from the Amazon S3 lifecycle
XML.

There are a few differences between the Cloud Storage XML API and Amazon S3, which are
summarized below, with suggested Cloud Storage alternatives:

Amazon S3 Functionality

Cloud Storage XML API Functionality

Multipart upload.
POST /<object-name>,
PUT /<object-name>

In the Cloud Storage XML API, you can upload a series of component objects, performing a
separate upload for each component. Then you can
compose the objects into a single composite
object.

Note: While the JSON API offers a
multipart upload feature, this feature is
used for sending metadata along with object data. It is not equivalent to S3's multipart
upload feature.

GET/POST bucket query string parameters:

"policy" - Working with Amazon S3 bucket policies.

"website" - Configuring bucket websites.

"tagging" - Tagging buckets for cost allocation purposes.

"notification" - Notifying for bucket events.

"requestPayment" - Configuring who pays for the request and the data download from a bucket.

Alternatives:

"policy" - Cloud Storage ACLs, project team membership, and the ability to use
multiple projects address many of the scenarios where bucket policies are used.

"requestPayment" - Use multiple projects with different billing profiles to manage
who pays for requests and downloaded data from a bucket. For more about
configuring billing, see Billing in the
Google APIs Console Help documentation.

Migrating from Amazon S3 to Cloud Storage Headers

Cloud Storage uses several standard HTTP headers as well as several custom
(extension) HTTP headers. If you are transitioning from Amazon S3 to Cloud Storage,
you can convert your custom Amazon S3 headers to the equivalent Cloud Storage
custom header or similar functionality as shown in the table below.

Discussion groups and support for XML API compatibility with Amazon S3

The Cloud Storage gs-discussion group, which formerly supported XML
API interoperability and migration issues, is in archive mode. The discussion
forum can now be accessed on Stack Overflow using the tag
google-cloud-storage. See the Resources and
Support page for more information about discussion forums
and subscribing to announcements.