2014-08-15

How I added my LVM volumes as OSDs in Ceph

This article expands on how I added an OSD backed by an LVM logical volume to my Ceph cluster. It might be useful to somebody else who is having trouble getting

ceph-deploy osd create ...

or

ceph-deploy osd prepare ...

to work nicely.

Here's how Ceph likes to have its OSDs set up. Each OSD's data volume is mounted by OSD id under

/var/lib/ceph/osd/ceph-*

Within that folder should be a file called journal. The journal can either live on that drive as a regular file or be a symlink. The symlink should normally point to another raw partition (e.g. partition one on an SSD), though it does work with a symlink to a regular file too.
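To see which of those journal layouts an existing OSD directory uses, something like the following sketch may help. The function name journal_kind is made up for illustration; it only inspects the file system and does not talk to Ceph.

```shell
# Report how the journal of one OSD data directory is laid out.
# Usage: journal_kind /var/lib/ceph/osd/ceph-{OSDNUM}
journal_kind() {
    j="$1/journal"
    if [ -L "$j" ]; then
        # -b and -f follow the symlink, so they describe its target
        if [ -b "$j" ]; then
            echo "symlink to block device"
        elif [ -f "$j" ]; then
            echo "symlink to regular file"
        else
            echo "dangling or unexpected symlink"
        fi
    elif [ -f "$j" ]; then
        echo "regular file on the data volume"
    else
        echo "no journal found"
    fi
}
```

Call it with the mount point of the OSD you want to inspect.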

Here's a run-down of the steps that worked for me:

First, mkfs the file system on the intended OSD data volume. I use XFS because BTRFS would add to the strain on my netbook, but YMMV. After the mkfs is complete you'll have a drive with an empty filesystem.

Then issue

ceph osd create

which will return a single number: this is your OSDNUM. Mount your OSD data drive to

/var/lib/ceph/osd/ceph-{OSDNUM}

remembering to substitute in your actual OSDNUM. Update your /etc/fstab to automount the drive to the same folder on reboot. (This is not quite what I do for LVM on USB keys: I have noauto in fstab and a script that mounts the LVM logical volumes later in the boot sequence.)
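The steps so far can be sketched as a small helper that prints the commands instead of running them, so they can be reviewed first. This is only a rough sketch: the device name /dev/vg_ceph/lv_osd0 is a made-up example, the function names are hypothetical, and the OSDNUM argument has to be the number that ceph osd create actually returned on your cluster. The mkdir step is not mentioned above; it is only there because the mount point has to exist.

```shell
# Mount point for a given OSD number.
osd_dir() {
    printf '/var/lib/ceph/osd/ceph-%s\n' "$1"
}

# One /etc/fstab line: device, mount point, fs type, options, dump, fsck pass.
# Usage: fstab_line DEVICE OSDNUM [OPTIONS]   (OPTIONS defaults to "defaults")
fstab_line() {
    printf '%s %s xfs %s 0 0\n' "$1" "$(osd_dir "$2")" "${3:-defaults}"
}

# Print (do not run) the commands for the data volume.
# Usage: osd_prepare_plan DEVICE OSDNUM
osd_prepare_plan() {
    dev="$1"
    dir=$(osd_dir "$2")
    echo "mkfs.xfs $dev"
    echo "mkdir -p $dir"
    echo "mount $dev $dir"
}
```

For example, osd_prepare_plan /dev/vg_ceph/lv_osd0 3 lists the three commands for OSD 3, and fstab_line /dev/vg_ceph/lv_osd0 3 noauto gives the kind of fstab entry I use for the USB key case.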

Now prepare the drive for Ceph with

ceph-osd -i {OSDNUM} --mkfs --mkkey

Once this is done you'll have a newly minted but inactive OSD complete with a shiny new authentication key. There will be a bunch of files in the filesystem. You can now go ahead and symlink the journal if you want. Everything up to this point is somewhat similar to what

ceph-deploy osd prepare ..

does.
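If you do want to symlink the journal at this point, a minimal sketch could look like the one below. The function name link_journal is hypothetical and the target (for example an SSD partition) is whatever you choose. One assumption on my part that is not covered above: after pointing the journal somewhere new, the journal probably has to be re-initialised with ceph-osd -i {OSDNUM} --mkjournal before the OSD is activated, and the OSD must not be running while you do this.

```shell
# Replace the journal in an OSD data directory with a symlink.
# Usage: link_journal OSD_DATA_DIR JOURNAL_TARGET
link_journal() {
    data_dir="$1"
    target="$2"
    # Refuse to create a dangling link; leave the old journal untouched.
    if [ ! -e "$target" ]; then
        echo "journal target $target does not exist" >&2
        return 1
    fi
    rm -f -- "$data_dir/journal"
    ln -s -- "$target" "$data_dir/journal"
}
```

Using a stable name for the target (rather than one that can change between boots) is worth considering, especially with removable drives.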

Doing the next steps manually can be a bit tedious so I use ceph-deploy.

ceph-deploy osd activate hostname:/var/lib/ceph/osd/ceph-{OSDNUM}

There are a few things that might go wrong.

If you've removed OSDs from your cluster then

ceph osd create

might give you an OSDNUM that is free in the CRUSH map but still has an old

ceph auth

entry. That's why you should run

ceph auth del osd.{OSDNUM}

when you delete an OSD. Another useful command is

ceph auth list

so you can see if there are any entries that need cleaning up. The key shown for the OSD by

ceph auth list

should match the key stored in

/var/lib/ceph/osd/ceph-{OSDNUM}

If it doesn't, delete the stale auth entry with

ceph auth del osd.{OSDNUM}

The

ceph-deploy osd activate ...

command will take care of adding correct keys for you but will not overwrite an existing [old] key.
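The key comparison described above can be sketched in shell. This assumes the OSD's key lives in a file called keyring inside the data directory, in the usual "key = ..." form, and that ceph auth get-key is available on your version; the function names are made up for illustration.

```shell
# Extract the secret from a keyring file of the form:
#   [osd.3]
#           key = AQ...==
keyring_key() {
    awk '$1 == "key" && $2 == "=" { print $3; exit }' "$1"
}

# Compare the on-disk key with the cluster's auth entry for one OSD.
# Needs a working ceph CLI with admin rights.
# Returns 0 on a match, 1 on a mismatch, 2 if the auth entry cannot be read.
osd_key_matches() {
    num="$1"
    disk_key=$(keyring_key "/var/lib/ceph/osd/ceph-$num/keyring")
    auth_key=$(ceph auth get-key "osd.$num") || return 2
    [ -n "$disk_key" ] && [ "$disk_key" = "$auth_key" ]
}
```

If osd_key_matches reports a mismatch for your OSDNUM, that is the case where deleting the old auth entry and re-running the activate step should help.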

Check that the new OSD is up and in the CRUSH map using

ceph osd tree

If the OSD is down, try restarting it with

/etc/init.d/ceph restart osd.{OSDNUM}

Also check that the weight and reweight columns are not zero. If they are, get the CRUSHID from