BiB 060: Focus On Your Data & Not Where It’s Stored With HammerSpace

The following text is a transcript of the audio you can listen to in the player above.

Welcome to Briefings In Brief, an audio digest of IT news and information from the Packet Pushers, including vendor briefings, industry research, and commentary.

I’m Ethan Banks, it’s November 27, 2018, and here’s what’s happening. I had a briefing with HammerSpace last month. HammerSpace is a newly launched company that helps you manage your data in shops that operate both in the public cloud and on-premises.

If you said, “Oh, so HammerSpace is a storage company,” you’d be wrong about that. HammerSpace is about managing your data, no matter where it’s stored. In fact, a big part of the HammerSpace value proposition is that you don’t have to care about where the data is stored anymore. You can just get on with using it.

In this briefing, HammerSpace introduced their company, discussed some use cases, and dove deeply into their technology.

Understanding HammerSpace

For storage admins, the simplest way to describe HammerSpace is that they are an abstraction layer between data and metadata. David Flynn, CEO, described it like this: “We have introduced an abstraction which separates the data consumer side and the storage infrastructure side. And that’s really what’s necessary before we can manage data in the cloud.”

HammerSpace becomes your point of entry to the data in your organization. HammerSpace indexes all the metadata, and presents a file system view via NFS or SMB. HammerSpace also manages access to the data, based on the policy that’s been defined.

They described this approach as sort of like DNS. HammerSpace clients talk to the metadata service, obtaining routing information for the file needed. The client then accesses the file directly. This means that HammerSpace’s metadata control plane is not in the data path. Storage platforms and data consumers no longer have to be tightly coupled in this model, as HammerSpace is the abstraction layer in the middle.
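
To make the DNS comparison concrete, here is a minimal Python sketch of a lookup-then-direct-read flow. It is an illustration only, not HammerSpace's implementation: the names `MetadataService`, `StorageBackend`, `Location`, and `read_file` are all hypothetical, and the storage platforms are stand-in in-memory dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical routing record: which store holds a file, and under what key."""
    store: str
    key: str


class MetadataService:
    """Toy control plane: answers 'where is this file?' and never handles file bytes."""

    def __init__(self):
        self._index = {}
        self.bytes_served = 0  # never incremented: no data passes through this class

    def register(self, path, location):
        self._index[path] = location

    def resolve(self, path):
        # Comparable to a DNS query: return routing information, not content.
        return self._index[path]


class StorageBackend:
    """Stand-in for any storage platform (on-premises array, cloud bucket, ...)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def read_file(path, metadata, backends):
    """Client-side read: ask the metadata service where, then fetch directly."""
    location = metadata.resolve(path)                   # control-plane step
    return backends[location.store].get(location.key)   # data-path step
```

In this sketch the client makes two hops, one to the metadata service and one to the storage backend, and the file contents only ever travel on the second hop. That is the sense in which the control plane stays out of the data path.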

The Parallel NFS Link

If you’re wondering how HammerSpace gets out of the data path, the answer is parallel NFS. Quoting from pnfs.com, “Parallel NFS removes the performance bottleneck in traditional NAS systems by allowing the compute clients to read and write data directly and in parallel, to and from the physical storage devices. The NFS server is used only to control metadata and coordinate access, allowing incredibly fast access to very large data sets from many clients.”
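
The idea in that quote can be sketched as a toy model in Python. This is not the NFSv4.1 wire protocol or any real pNFS client: the helpers `stripe` and `parallel_read`, the chunk naming, and the use of dictionaries as "storage devices" are all assumptions made for illustration. The layout list plays the role of the information a metadata server would hand to the client.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe(data, devices, chunk_size):
    """Split data into chunks, place them round-robin across devices,
    and return the layout: an ordered list of (device_index, chunk_id)."""
    layout = []
    for i, offset in enumerate(range(0, len(data), chunk_size)):
        dev = i % len(devices)
        chunk_id = f"chunk-{i}"
        devices[dev][chunk_id] = data[offset:offset + chunk_size]
        layout.append((dev, chunk_id))
    return layout


def parallel_read(layout, devices):
    """Fetch every chunk named in the layout directly from its device,
    using one worker per device, and reassemble the chunks in layout order."""

    def fetch(entry):
        dev, chunk_id = entry
        return devices[dev][chunk_id]

    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        # pool.map returns results in input order, so the file
        # is rebuilt correctly even if fetches finish out of order.
        return b"".join(pool.map(fetch, layout))
```

The server-side role here is limited to producing the layout. The reads themselves go straight to the devices and can overlap, which is where the quoted performance claim comes from.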

HammerSpace is using this technology, and even has Trond Myklebust, Chief Linux Kernel Maintainer for NFS, on the HammerSpace team as CTO.

For clients without parallel NFS support, HammerSpace offers a Data Service Node to act as a proxy.

The Automated, Policy-Governed Data Mover

All of this technology is lovely, but you might be wondering how this helps with hybrid cloud operations. The other major piece of the HammerSpace puzzle is that HammerSpace can move data around for you, wherever you need it, without operator intervention beyond policy definition.

David Flynn explained it this way. “Once you’re able to present data through the lens of the metadata and manage it through the lens of the metadata, then you can think of it as truly, ‘Data like air. Simply everywhere.’ As a matter of fact, the metadata can be aggressively replicated across the globe at different data centers so that you can have the view of it and when you go to access it, it can get the data to where you need it on an on-demand basis.”
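
The briefing did not cover what a HammerSpace policy actually looks like, so the following Python sketch is purely hypothetical. It assumes a policy is a mapping from a metadata tag to the sites where tagged files must be present, and shows how a mover could derive its work list from metadata alone. The names `FileRecord` and `plan_moves`, the tags, and the site names are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical metadata record for one file."""
    path: str
    tags: set    # descriptive labels, e.g. {"analytics"}
    sites: set   # sites that currently hold a copy of the data


def plan_moves(files, policy):
    """Return the (path, target_site) copies needed to satisfy the policy.

    policy maps a tag to the set of sites where files carrying
    that tag must be available.
    """
    moves = []
    for record in files:
        required = set()
        for tag in record.tags:
            required |= policy.get(tag, set())
        # Only sites that lack a copy generate work; sorted for stable output.
        for site in sorted(required - record.sites):
            moves.append((record.path, site))
    return moves
```

The operator's only input in this sketch is the policy mapping. Everything else is computed from the metadata, which matches the "no operator intervention beyond policy definition" description.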

HammerSpace Use Cases

HammerSpace cited “Cloud Analytics With On-Premises Data” as their main use case currently. The idea is that you have data in-house, and you need to mine that data. However, you want to use a cutting-edge cloud-based tool to do the analysis. Uh-oh. Normally, the data and the tool would not be co-located, so you’d be out of luck. Not with HammerSpace. They can get the data where it needs to be without data consumers having to worry about it.

Additional HammerSpace use cases include unstructured data management in the cloud, cloud bursting, lift-and-shift to cloud, and disaster recovery & service recovery in the cloud.

Minimal Pain Of Acquisition & Installation

The HammerSpace product is a pay-as-you-go service that is priced by volume of data, not by number of objects. The service can be deployed on-premises as a virtual or containerized service. Alternatively, it’s available as a service highly optimized to run efficiently in the public cloud.

Another major selling point is that the HammerSpace service is meant to be non-disruptive to IT operations. Once the product is installed, HammerSpace claims you’ll be getting value from it in less than an hour with no major changes to your storage infrastructure required.

That was Briefings in Brief from the Packet Pushers. For more IT podcasts, blogs and news created for engineers, visit packetpushers.net where you can subscribe for free. And for even more great information, become a member at ignition.packetpushers.net.