These Amazon Web Services (AWS) Storage Gateway Revisited posts are a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (thus why it’s called revisited). As part of a two-part series, the first post looks at what AWS Storage Gateway is, how it has improved since my last review of AWS Storage Gateway along with deployment options. The second post in the series will look at a sample test drive deployment and use.

As a quick refresher, S3 is the AWS bulk, high-capacity unstructured and object storage service along with its companion deep cold (e.g. inactive) Glacier. There are various S3 storage service classes including standard reduced redundancy storage (RRS) along with infrequent access (IA) that have different availability durability, performance, service level and cost attributes.

Note that S3 IA is not Glacier, as your data always remains online accessible while Glacier data can be off-line. AWS S3 can be accessed via its API, as well as via HTTP rest calls, AWS tools along with those from third-party’s. Third party tools include NAS file access such as S3FS for Linux that I use for my Ubuntu systems to mount S3 buckets and use similar to other mount points. Other tools include Cloudberry, S3 Motion, S3 Browser as well as plug-ins available in most data protection (backup, snapshot, archive) software tools and storage systems today.

AWS S3 Storage Gateway and What’s New

The Storage Gateway is the AWS tool that you can use for accessing S3 buckets and objects via your block volume, NAS file, or tape-based applications. The Storage Gateway is intended to give S3 bucket and object access to on-premise applications and data infrastructures functions including data protection (backup/restore, business continuance (BC), business resiliency (BR), disaster recovery (DR) and archiving), along with storage tiering to cloud.

Some of the things that have evolved with the S3 Storage Gateway include:

Easier, streamlined download, installation, deployment

Enhanced Virtual Tape Library (VTL) and Virtual Tape support

File serving and sharing (not to be confused with Elastic File Services (EFS))

Ability to define your own bucket and associated parameters

Bucket options including Infrequent Access (IA) or standard

Options for AWS EC2 hosted, or on-premise VMware as well as Hyper-V gateways (file only supports VMware and EC2)

AWS Storage Gateway Three Functions

File Gateway (NFS NAS): Files, folders, objects, and other items are stored in AWS S3 with a local cache for low latency access to most recently used data. With this option, you can create folders and subdirectory similar to a regular file system or NAS device as well as configure various security, permissions, access control policies. Data is stored in S3 buckets that you specify policies such as standard or Infrequent Access (IA) among other options. AWS hosted via EC2 as well as VMware Virtual Machine (VM) for on-premise file gateway.

Also, note that AWS cautions on multiple concurrent writers to S3 buckets with Storage Gateway, so check the AWS FAQs which may have changed by the time you read this. Current file share limits (subject to change) include 1 file gateway share per S3 bucket (e.g. a one-to-one mapping between file share and a bucket). There can be 10 file shares per gateway (e.g. multiple shares each with its own bucket per gateway) and a maximum file size of 5TB (same as maximum S3 object size). Note that you might hear about object storage systems supporting unlimited size objects which some may do, however generally there are some constraints either on their API front-end, or what is currently tested. View current AWS Storage Gateway resource and specification limits here.

Volume Gateway (Block iSCSI): Leverages S3 with a point in time backup as an AWS EBS snapshot. Two options exist including Cached volumes with low-latency access to most recently used data (e.g. data is stored in AWS, with a local cache copy on disk or SSD). The other option is Stored Volumes (e.g. non-cached) where the primary copy is local and periodic snapshot backups are sent to AWS. AWS provides EC2 hosted, as well as VMs for VMware and various Hyper-V Windows Server based VMs.

Current Storage Gateway volume limits (subject to change) include maximum size of a cached volume 32TB, the maximum size of a stored volume 16TB. Note that snapshots of cached volumes larger than 16TB can only be restored to a storage gateway volume, they can not be restored as an EBS volume (via EC2). There is a maximum of 32 volumes for a gateway with a total size of all volumes for a gateway (cached) of 1,024TB (e.g. 1PB). The total size of all volumes for a gateway (stored volume) is 512TB. View current AWS Storage Gateway resource and specification limits here.

Storage Gateway limits for tape include minimum size of a virtual tape 100GB, maximum size of a virtual tape 2.5TB, maximum number of virtual tapes for a VTL is 1,500 and total size of all tapes in a VTL is 1PB. Note that the maximum number of virtual tapes in an archive is unlimited and the total size of all tapes in an archive is also unlimited. View current AWS Storage Gateway resource and specification limits here.

What This All Means

As to which gateway function and mode (cached or non-cached for Volumes) depends on what it is that you are trying to do. Likewise choosing between EC2 (cloud hosted) or on-premise Hyper-V and VMware VMs depends on what your data infrastructure support requirements are. Overall I like the progress that AWS has put into evolving the Storage Gateway, granted it might not be applicable for all usage cases.

Ok, nuff said (for now…).

Getting Started with Kubernetes: A Webinar Series brought to you by DigitalOcean