This is the second-to-last chunk of new stuff before Hammer. Big items include additional checksums on OSD objects, proxied reads in the cache tier, image locking in RBD, optimized OSD Transaction and replication messages, and a big pile of RGW and MDS bug fixes.

UPGRADING

The experimental ‘keyvaluestore-dev’ OSD backend has been renamed ‘keyvaluestore’ (for …

It is not always easy to know how to organize your data in the Crushmap, especially when trying to distribute the data geographically while separating different types of disks, e.g. SATA, SAS and SSD.
Let’s see what we can imagine as Crushmap hierarchies.
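
As an illustration, here is a minimal sketch of one possible layout built with the CLI. All bucket, rule and pool names are made up for the example; the idea is to keep one CRUSH root per disk type and repeat the geographical hierarchy under each root.

    # One root per disk type
    ceph osd crush add-bucket ssd root
    ceph osd crush add-bucket sata root

    # The geographical layer (datacenters) is duplicated under each root
    ceph osd crush add-bucket dc1-ssd datacenter
    ceph osd crush add-bucket dc1-sata datacenter
    ceph osd crush move dc1-ssd root=ssd
    ceph osd crush move dc1-sata root=sata

    # Each physical server is split into one logical host bucket per disk type,
    # so its SSD OSDs and its SATA OSDs end up under different roots
    ceph osd crush add-bucket node1-ssd host
    ceph osd crush move node1-ssd datacenter=dc1-ssd
    ceph osd crush set osd.0 1.0 root=ssd datacenter=dc1-ssd host=node1-ssd

    # One rule per root, then point each pool at the matching rule
    ceph osd crush rule create-simple ssd-rule ssd host
    ceph osd crush rule create-simple sata-rule sata host
    ceph osd pool set ssd-pool crush_ruleset 1    # ruleset id from "ceph osd crush rule dump"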

Many years ago I came across a script made by Shawn Moore and Rodney Rymer from Catawba University.
The purpose of this tool is to reconstruct an RBD image.
Imagine your cluster is dead: all the monitors got wiped out and you don’t have any backup (I know, what could possibly happen?).
However, all your objects remain intact.
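
To give an idea of how such a reconstruction can work, here is a heavily simplified sketch (not the actual script). RBD images are striped over fixed-size RADOS objects whose names embed the image’s block name prefix and the chunk index, so as long as the object files are still present on the OSDs you can copy each chunk back to its original offset in a sparse file. The prefix, paths and name parsing below are assumptions for illustration; real FileStore file names carry extra suffixes, and format 2 images use a different naming scheme (rbd_data.<id>.<index>).

    prefix=rb.0.6b8b4567.327b23c6     # block_name_prefix of the lost image (from rbd info, or guessed)
    objsize=$((4 * 1024 * 1024))      # default 4 MB (order 22) objects
    out=recovered.img

    find /var/lib/ceph/osd/*/current -name "${prefix}.*" | while read obj; do
        # the object file name ends with the hexadecimal index of the chunk
        hexidx=$(basename "$obj" | sed "s/^${prefix}\.\([0-9a-f]*\).*/\1/")
        # copy the chunk back to its original offset inside a sparse image file
        dd if="$obj" of="$out" bs=$objsize seek=$((16#$hexidx)) conv=notrunc,sparse 2>/dev/null
    done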

I’ve always wanted to blog about this tool, simply to advocate it and make sure that people can use it.
Hopefully this will be good publicity for it :-).

Backing up RBD images

Before we dive into the recovery process, I’d like to take a few lines to describe what is important to back up and how to back it up.
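
For the image data itself, a common approach is to snapshot the image and export it, or export only the changes between two snapshots. A minimal sketch, assuming a pool named rbd and an image named vm1:

    # Full export of a consistent snapshot
    rbd snap create rbd/vm1@backup-1
    rbd export rbd/vm1@backup-1 /backup/vm1-backup-1.img

    # Later, export only what changed between two snapshots
    rbd snap create rbd/vm1@backup-2
    rbd export-diff --from-snap backup-1 rbd/vm1@backup-2 /backup/vm1-1-to-2.diff

    # The diff can be replayed onto a copy of the image elsewhere
    # (the copy must already contain the backup-1 snapshot)
    rbd import-diff /backup/vm1-1-to-2.diff rbd/vm1-copy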

Ceph makes it easy to run multiple clusters on the same hardware thanks to cluster naming. If you want better isolation you can use LXC, for example to run a different version of Ceph in each cluster.
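
In practice the cluster name is simply the name of the configuration file and the value passed to the --cluster flag. A minimal sketch, assuming a second cluster named backup:

    # One configuration file (and keyring) per cluster
    ls /etc/ceph/
    # ceph.conf  ceph.client.admin.keyring  backup.conf  backup.client.admin.keyring

    # The CLI talks to the default cluster ("ceph") unless told otherwise
    ceph status
    ceph --cluster backup status

    # Daemons take the same flag, so both clusters can run side by side:
    # ceph-osd --cluster backup -i 12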

A space reclamation mechanism for the kernel RBD module.
Having this kind of support is really crucial for operators and eases capacity planning.
RBD images are sparse, thus their size right after creation is 0 MB.
The main issue with sparse images is that they grow over time and eventually reach their full provisioned size.
The thing is, Ceph doesn’t know anything about what is happening on top of that block device, especially if there is a filesystem on it.
You can easily fill the entire filesystem and then delete everything; Ceph will still believe that the block device is fully used and will keep reporting it that way.
However, thanks to discard support on the block device, the filesystem can send discard commands for the blocks it no longer uses.
In the end, the storage will free up the unused blocks.
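
For instance, with a kernel whose rbd module supports discard, you can either mount the filesystem with the discard option or trim it periodically with fstrim. A minimal sketch, assuming an image named rbd/vm1 mapped as /dev/rbd0:

    rbd map rbd/vm1                      # shows up as /dev/rbd0
    mkfs.ext4 /dev/rbd0

    # Either let the filesystem issue discards as files are deleted...
    mount -o discard /dev/rbd0 /mnt

    # ...or mount normally and trim unused blocks periodically
    mount /dev/rbd0 /mnt
    fstrim -v /mnt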

Now let’s check the default behavior when discard is not enabled: we delete our 128 MB file to free up some space on the filesystem.
Unfortunately Ceph doesn’t notice anything and still believes that these 128 MB of data are there.
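
One way to see what Ceph believes is allocated is to sum the extents reported by rbd diff (image name assumed for the example):

    rbd diff rbd/vm1 | awk '{ used += $2 } END { print used/1024/1024 " MB" }'
    # Without discard this keeps reporting roughly 128 MB after the file is
    # deleted; once the filesystem is trimmed (fstrim or the discard mount
    # option) the number drops back down.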

We are quickly approaching the Hammer feature freeze but have a few more dev releases to go before we get there. The headline items are subtree-based quota support in CephFS (ceph-fuse/libcephfs client support only for now), a rewrite of the watch/notify librados API used by RBD and RGW, OSDMap checksums to ensure that maps are …

This is a long-awaited bugfix release for firefly. It has several important (but relatively rare) OSD peering fixes, fixes for performance issues when snapshots are trimmed, several RGW fixes, a paxos corner case fix, and some packaging updates. We recommend that all v0.80.x firefly users upgrade when it is convenient to do so.

NOTABLE CHANGES