
Jul 11

Ceph: Block Storage for the 21st Century

By: Steven J. Vaughan-Nichols

Storage used to be so simple. You had a Single Large Expensive Drive (SLED) and you stored all your data on it.

Then, we moved on to redundant arrays of inexpensive disks (RAID), and things got more complex. But, it was still pretty easy. Unless you were using Small Computer System Interface (SCSI). I still get the heebie-jeebies thinking about chaining SCSI drives together.

But even as hard drives gave way to solid-state drives (SSDs), physical drives couldn’t keep up with modern server data needs, never mind those of clouds and containers. This is where Ceph and software-defined storage (SDS) step in.

Ceph is an open-source SDS system. It’s designed to run on commodity off-the-shelf (COTS) hardware. From the user’s viewpoint, the underlying hardware doesn’t matter, whether it’s cache-enabled hard drives or RAID arrays of SSDs.

There are two ways to build this kind of SDS. One is a distributed file system (DFS), which presents the same files and directories you’ve been using for years. The only real difference from conventional storage is that instead of living on a single drive or array of drives, files are stored across drives on multiple servers.

The other method, which Ceph uses, is an object store: each piece of data lives in a flat, non-hierarchical namespace and is identified by an arbitrary, unique identifier. A file’s details, its metadata, are stored along with the data itself.
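To make the idea concrete, here is a minimal sketch of a flat object store in Python. The class and method names are illustrative, not Ceph’s API; the point is simply that there are no directories, only IDs, and metadata travels with the data.

```python
# A toy object store: a flat namespace keyed by unique IDs,
# with each object's metadata stored alongside its data.
# This illustrates the concept only; it is not Ceph's interface.
import uuid


class FlatObjectStore:
    def __init__(self):
        self._objects = {}  # object_id -> (data, metadata); no hierarchy

    def put(self, data: bytes, metadata: dict) -> str:
        object_id = str(uuid.uuid4())  # arbitrary, unique identifier
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        # Returns both the data and its metadata in one lookup
        return self._objects[object_id]


store = FlatObjectStore()
oid = store.put(b"hello", {"content-type": "text/plain", "owner": "alice"})
data, meta = store.get(oid)
```

Note that the caller never chooses a path or directory; the store hands back an identifier, which is exactly what makes a flat namespace easy to distribute across many machines.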

Within RADOS (the Reliable Autonomic Distributed Object Store, Ceph’s underlying storage layer), an object is the unit of storage. Objects, in turn, are stored in pools. Each pool has a name (e.g., “foo”) and forms a distinct object namespace. Each pool also defines how its objects are stored: it designates a replication level (2x, 3x, etc.) and a mapping rule describing how replicas should be distributed across the storage cluster. (For example, each replica should live in a separate rack.)
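The pool-and-placement idea can be sketched in a few lines. This toy mimics only the intent of Ceph’s placement (deterministic, hash-driven, rack-aware replica selection), not its actual CRUSH algorithm; the OSD names and rack layout are hypothetical.

```python
# Toy placement sketch: hash the pool/object name, then pick one OSD
# per rack so replicas land in separate failure domains. This mimics
# the *intent* of Ceph's CRUSH placement, not the real algorithm.
import hashlib

OSDS = [  # (osd_id, rack) -- a hypothetical 6-OSD cluster in 3 racks
    ("osd.0", "rack-a"), ("osd.1", "rack-a"),
    ("osd.2", "rack-b"), ("osd.3", "rack-b"),
    ("osd.4", "rack-c"), ("osd.5", "rack-c"),
]


def place(pool: str, obj: str, replicas: int):
    """Deterministically pick `replicas` OSDs, at most one per rack."""
    h = int(hashlib.sha256(f"{pool}/{obj}".encode()).hexdigest(), 16)
    chosen, used_racks = [], set()
    # Walk the OSD list starting at a hash-determined offset
    for i in range(len(OSDS)):
        osd, rack = OSDS[(h + i) % len(OSDS)]
        if rack not in used_racks:
            chosen.append(osd)
            used_racks.add(rack)
        if len(chosen) == replicas:
            break
    return chosen


replicas = place("foo", "my-object", 3)  # three replicas, three racks
```

Because placement is computed from a hash rather than looked up in a central table, any client can work out where an object’s replicas live on its own, which is a key reason this style of placement scales.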

Finally, a Ceph storage cluster is composed of object storage daemons (OSDs), one per storage device. A cluster can hold multiple pools, and this design is what makes Ceph so scalable: you can start with little more storage than you have on your desktop and grow to petabytes.

While these things are all neat, what makes most people — and anyone who cares about the bottom-line — excited about Ceph is that it makes storage much more affordable.

You can access Ceph storage multiple ways.

If you’re using object storage, you can use the RADOS Gateway, an object storage interface built on top of librados that provides applications with a RESTful gateway to Ceph storage clusters. (In general, Ceph object storage supports two interfaces: Amazon Simple Storage Service (S3)- and OpenStack Swift-compatible Representational State Transfer (REST) application programming interfaces (APIs). Ceph also has its own native API.)
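To show what S3-compatible access looks like at the HTTP level, here is a small sketch that builds the path-style URL an S3 client would PUT an object to. The gateway endpoint is a placeholder, and a real client (boto3, s3cmd, etc.) would also sign the request with your access keys.

```python
# Sketch of S3 path-style addressing against a RADOS Gateway.
# RGW_ENDPOINT is a hypothetical address; real requests must also
# carry S3 authentication headers, which a client library handles.
from urllib.parse import quote

RGW_ENDPOINT = "http://rgw.example.com"  # placeholder gateway address


def s3_put_url(bucket: str, key: str) -> str:
    """S3 path-style addressing: PUT http://<endpoint>/<bucket>/<key>."""
    return f"{RGW_ENDPOINT}/{quote(bucket)}/{quote(key)}"


url = s3_put_url("my-bucket", "backups/db.dump")
```

Because the gateway speaks the same REST dialect as S3, existing S3 tooling can usually be pointed at a Ceph cluster just by swapping the endpoint.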

You can also mount Ceph as a block device. In this mode, Ceph automatically stripes and replicates the data across the cluster. Ceph’s RADOS Block Device (RBD) also integrates with Linux’s built-in Kernel-based Virtual Machine (KVM) hypervisor. This enables you to deploy Ceph-backed storage to virtual machines running on your Ceph clients.
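The striping step can be sketched simply: a block image is chopped into fixed-size chunks, each of which becomes a separately stored (and separately replicated) object. The chunk size and object-naming scheme below are illustrative, not Ceph’s exact format.

```python
# Toy RBD-style striping: split block-device data into fixed-size
# objects named <image>.<index>. Real RBD uses 4 MiB objects by
# default; 4 bytes here keeps the demonstration readable.
OBJECT_SIZE = 4  # bytes, tiny for demonstration purposes


def stripe(image_name: str, data: bytes):
    """Split block data into per-object chunks keyed by object name."""
    return {
        f"{image_name}.{i // OBJECT_SIZE:08x}": data[i:i + OBJECT_SIZE]
        for i in range(0, len(data), OBJECT_SIZE)
    }


chunks = stripe("vm-disk", b"0123456789ab")
# chunks maps 'vm-disk.00000000' -> b'0123', 'vm-disk.00000001' -> b'4567', ...
```

Since each chunk is an independent object, reads and writes to different parts of the image can hit different OSDs in parallel, which is where the performance benefit of striping comes from.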

Ceph is also very flexible. Whether you want to access storage as blocks, files, or objects, it’s there for you. In today’s world, where we need quick, reliable access to huge data stores, not to mention Big Data, SDS programs like Ceph are as necessary now as RAID was back in the day.

Please feel free to share below any comments or insights about your experience with or questions about using Ceph, block storage or Linode’s block storage beta. And if you found this blog useful, please consider sharing it through social media.

About the blogger: Steven J. Vaughan-Nichols is a veteran IT journalist whose estimable work can be found on a host of channels, including ZDNet.com, PC Magazine, InfoWorld, ComputerWorld, Linux Today and eWEEK. Steven’s IT expertise comes without parallel — he has even been a Jeopardy! clue. And while his views are solely his own and don’t necessarily reflect those of Linode, we are grateful for his contributions. He can be followed on Twitter (@sjvn).