On Wed, May 08, 2013 at 06:05:26PM -0400, Mike Snitzer wrote:
> On Wed, May 08 2013 at 5:48pm -0400,
> Darrick J. Wong <darrick wong oracle com> wrote:
>
> > Hi,
> >
> > So I've been watching the hit/miss counters in dmcache and I've noticed a
> > couple of things that look like errors to me:
> >
> > First, I noticed that if I reboot the system, neither cache_postsuspend nor
> > cache_dtr get called. This might simply be expected behavior, but it means
> > that the in-memory superblock structure doesn't get written out to disk upon
> > reboot. Just to be sure, I put a printk into __commit_transaction. It prints
> > out for 'dmsetup info' and 'dmsetup remove' but nothing at reboot.
>
> We don't have reboot notifiers that auto-magically tear down an
> arbitrary DM stack. Typically the device shutdown includes unmounting
> filesystems, stopping LVM (which tears down DM devices, etc).
>
> So given that we don't have any userspace LVM2 support for dm-cache yet
> I'm not surprised by this. In fact it is expected.

Hmm, I wasn't aware that the lvm2 package had any teardown scripts. It doesn't
seem to have any in RHEL5.8 or Ubuntu...

> > Second, cache_status calls dm_cache_commit, which writes out a superblock to
> > the metadata device. However, there's no call to save_stats to copy the
> > current values of the counters out to the disk's copy prior to calling
> > dm_cache_commit. Therefore, we seem to be writing out stale copies of
> > superblock fields.
> >
> > The second one seems fixable with the attached patch
>
> I'll defer to Joe on this but I think sync_metadata() is pretty heavy to
> be doing every 'dmsetup info'. BTW, with just dm_cache_commit() the
> superblock fields aren't stale; only the on-disk hints are.

How often does dmsetup info run? I admit that it becomes slower with the
patch, but I didn't think it was really in anyone's hot path. But given that
there's a comment just prior that says:

	/* Commit to ensure statistics aren't out-of-date */

it feels like we ought at least to be calling save_stats() so that we update
the on-disk statistics. Though, given that the metadata size should be about
10MB for a 100GB cache device, I don't mind flushing out 10MB of metadata to
get the device info.
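
To make the ordering concern concrete, here is a toy model in Python. It is only a sketch of the behaviour as I understand it, not the dm-cache code: ToyCache, Stats, and the live/sb/disk fields are made-up names, and only save_stats() and commit() are meant to mirror save_stats() and dm_cache_commit() as discussed above.

```python
from dataclasses import dataclass, replace


@dataclass
class Stats:
    """Hypothetical stand-in for the hit/miss counters."""
    read_hits: int = 0
    read_misses: int = 0


class ToyCache:
    """Toy model (assumption): three copies of the counters."""

    def __init__(self):
        self.live = Stats()  # updated on every I/O
        self.sb = Stats()    # in-memory superblock copy
        self.disk = Stats()  # what is on the metadata device

    def save_stats(self):
        # Copy the live counters into the in-memory superblock.
        self.sb = replace(self.live)

    def commit(self):
        # Write the in-memory superblock out, whatever it holds.
        self.disk = replace(self.sb)

    def status(self, save_first):
        # Models the status path: commit, optionally after save_stats().
        if save_first:
            self.save_stats()
        self.commit()
        return self.disk
```

In this model, committing without saving first writes whatever the superblock copy held at the last save, so the on-disk counters lag behind the live ones until save_stats() runs.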

Really the problem is that with both of these complaints active, the superblock
counters and tables /never/ seem to get updated, even across multiple reboots.
(I'm still digging for why I see such weird unreproducible benchmark numbers.)

--D