Object storage: The blob creeping from niche to mainstream

Can you be scalable AND floatable?

Storage dull? Dry? Uninteresting? Not a bit. Everybody and everything uses data storage. Without it we'd be lost. And thanks in part to the growth of cloud computing and big data, storage has risen up the agenda.

In the big data universe, things are changing. Our methods of naming, storing and retrieving files need to be reinvented to keep pace with the swelling data volumes that will extend from petabytes into zettabytes and ultimately yottabytes. Could object storage be the answer for these new massive data environments?

Object storage: a definition

Object storage is the discipline or practice of labelling units of data as objects rather than files. An object is made up of data in the same way that a file is, but it does not reside in a hierarchy of any sort for the purposes of classification, whether by name, file size or anything else. Instead, data storage objects “float” in a storage memory pool with a single flat address space.

Object storage sits well with the über-flexible world of cloud. This is because each unit benefits from an extended metadata identifier to allow its retrieval without the user needing to know its real physical location. Suddenly data storage automation sounds a lot easier.
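To make the idea concrete, here is a minimal sketch in Python of a flat, metadata-driven store. It is a toy held in memory, not any vendor's implementation; the class name `FlatObjectStore` and the metadata keys `kind` and `owner` are invented for illustration.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace, no directories."""

    def __init__(self):
        # Maps object ID -> (data bytes, metadata dict).
        self._objects = {}

    def put(self, data, **metadata):
        """Store data with its metadata and return a generated object ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Return the data for an object ID; no path is involved."""
        return self._objects[object_id][0]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, (_, meta) in self._objects.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        ]


store = FlatObjectStore()
photo_id = store.put(b"\x89PNG...", kind="image", owner="alice")
report_id = store.put(b"Q3 figures", kind="document", owner="alice")

print(store.get(report_id))
print(store.find(kind="image") == [photo_id])
```

The caller only ever holds an ID or a metadata query. Where the bytes physically live is the store's business, which is the property that makes automation easier.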

According to OpenStack’s official documentation, object storage provides an API-accessible storage platform that can be integrated directly into applications or used for backup, archiving and data retention.
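As a rough illustration of what “API-accessible” means, the sketch below builds (but does not send) an HTTP request of the kind the OpenStack Object Storage API accepts: a PUT to an account/container/object URL, an `X-Auth-Token` header, and custom metadata in `X-Object-Meta-*` headers. The endpoint, token, container and object names are hypothetical placeholders, and `build_swift_put` is a helper invented for this example.

```python
from urllib.parse import quote
from urllib.request import Request


def build_swift_put(storage_url, container, name, data, token, metadata=None):
    """Build (but do not send) a PUT request that would upload one object."""
    url = f"{storage_url.rstrip('/')}/{quote(container)}/{quote(name)}"
    headers = {"X-Auth-Token": token}
    # Custom object metadata travels as X-Object-Meta-<name> headers.
    for key, value in (metadata or {}).items():
        headers[f"X-Object-Meta-{key}"] = value
    return Request(url, data=data, headers=headers, method="PUT")


request = build_swift_put(
    "https://swift.example.com/v1/AUTH_demo",  # hypothetical storage URL
    "backups",                                 # hypothetical container
    "nightly/db.dump",                         # hypothetical object name
    b"...",
    "hypothetical-token",
    metadata={"Retention": "7y"},
)
print(request.get_method(), request.full_url)
```

Note that the slash in `nightly/db.dump` is simply part of the object's name. It looks like a folder, but the namespace underneath remains flat.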

As there isn’t a notion of RAID, volumes, or aggregates, object storage can be treated as a “pooled capacity” so applications and users can consume the desired amount of storage at any one time. This means that, if the system works, the guesswork of capacity planning is eliminated. Volumes no longer need to be tied to a particular server or application. If application ‘A’ unexpectedly grows by 80 per cent, there is no need to reconfigure and reallocate storage volumes, as application ‘A’ has access to the pooled storage capacity.

Sean Derrington, of cloud storage provider Exablox, adds: “Perhaps more importantly, storage capacity can be increased in any ‘unit’ desired. Since there are no volumes, capacity can be non-disruptively added to the existing pool in near real-time - eliminating the need to purchase and plan nine or 12 months ahead. When storage is added, the file system seen by applications and users doesn’t change. The only thing noticeable is the storage they have access to has increased.”

In terms of industry standards we have OpenStack Swift. This open source object storage system is described by its development team as a highly available, distributed, “eventually consistent” object/blob store. That distributed part is important; Swift helps replicate the objects across multiple servers and locations to make retrieval as easy as possible.
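“Eventually consistent” is easier to grasp with a toy model. The sketch below is not how Swift is built internally; it is a deliberately simplified illustration, with invented names, in which a write lands on one replica straight away and reaches the others only when a background step runs.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: writes land on one replica and propagate later."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self._pending = deque()  # queued (replica index, key, value) updates

    def put(self, key, value):
        """Write to replica 0 now; queue copies for the other replicas."""
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self._pending.append((index, key, value))

    def get(self, key, replica=0):
        """Read from one replica; returns None if it has not caught up."""
        return self.replicas[replica].get(key)

    def replicate(self):
        """Apply all queued updates, as a background process would."""
        while self._pending:
            index, key, value = self._pending.popleft()
            self.replicas[index][key] = value


store = EventuallyConsistentStore()
store.put("video.mp4", b"frames")
stale = store.get("video.mp4", replica=2)  # read before replication runs
store.replicate()
fresh = store.get("video.mp4", replica=2)  # read after replicas converge
print(stale, fresh)
```

A reader who hits a lagging replica briefly sees old or missing data, but once replication completes every copy agrees. That is the trade such systems make in exchange for availability and distribution.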

According to ZFS-storage software supplier Nexenta, the jury is still out on Swift. The firm says that there are well-recognized limitations and some flaws in the Swift design, but still it is gaining increasing popularity. Nexenta asserts that today Swift fills the niche that is not covered by Dropbox et al, so BYOD users will eventually use it. This means Swift could be on track to become the preferred backend, although of course Amazon, Google and Microsoft will compete for that space, leveraging their respective proprietary closed-source technologies.

Simon Robinson is research vice president for storage technologies at 451 Research. In his report, “Object storage looks like a technology whose time has come”, Robinson explains that object storage, on paper at least, seems like an appealing option. “It's radically simpler than traditional SAN and even NAS, it scales much better from a capacity standpoint, and it's especially well-suited for cost-effectively storing the reams of unstructured data – think files, videos, music and images – that are being created in this 'big data' era.”

Software-defined storage prowess

As positive as this sounds, Robinson’s team say that, according to their research, the adoption of object storage remains a “minority sport”. But the analyst points out that growth may yet be spurred by the many cloud service providers who are keenly interested in developing cloud storage services that will help them compete with Amazon Web Services - and object storage represents achievable “software-defined storage prowess” in this regard.

Nexenta also points out that Intel’s continuing work on x86 architecture and instruction sets to accelerate SHA, RAID, CRC and erasure coding is “very timely and promising” just now. “Those are the functions that a storage appliance executes, generating sometimes multiple processor cycles per each stored byte. Local deduplication, for instance, uses cryptographic strong hashing - this may be SHA-256, SHA-512 or SHA-3. In that sense, deduplication definitely requires specific capabilities from the CPUs (or GPUs if available),” according to the company.
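The hash-based deduplication Nexenta mentions can be sketched in a few lines. This is a minimal illustration using SHA-256 from Python's standard `hashlib` module, not Nexenta's implementation; the class name, the 4,096-byte chunk size and the byte counters are assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store that keeps one copy per unique chunk."""

    def __init__(self):
        self._chunks = {}        # SHA-256 hex digest -> chunk bytes
        self.bytes_written = 0   # logical bytes callers asked to store

    def put(self, chunk):
        """Hash the chunk and store it only if the digest is new."""
        digest = hashlib.sha256(chunk).hexdigest()
        self.bytes_written += len(chunk)
        self._chunks.setdefault(digest, chunk)
        return digest

    def get(self, digest):
        """Return the chunk stored under a digest."""
        return self._chunks[digest]

    @property
    def bytes_stored(self):
        """Physical bytes actually held after deduplication."""
        return sum(len(c) for c in self._chunks.values())


dedup = DedupStore()
first = dedup.put(b"A" * 4096)
second = dedup.put(b"A" * 4096)  # identical content, so no new copy is kept
third = dedup.put(b"B" * 4096)
print(first == second, dedup.bytes_written, dedup.bytes_stored)
```

Every byte written has to pass through the hash function, which is why hardware acceleration of SHA matters to storage appliances.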