Re: HAMMER recovery and other questions

From: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 23 Jun 2008 15:28:22 -0700 (PDT)

:1) When I have a nohistory mount and have a, say, power breakdown while
:writing data to it, will the transactions be still re-run after
:rebooting?
The filesystem will be in a consistent state upon re-mounting after
the crash, but if you didn't explicitly sync your operations to
disk HAMMER could unwind up to 30 seconds or so worth of operations
in order to get the fs back into a consistent state.

If you fsync() then all data written to the file object in question
prior to the fsync, plus any related directory infrastructure, is
guaranteed to be recovered after a crash if the fsync() returns before
the crash occurs.
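
As an illustration of the fsync() guarantee, here is a minimal Python
sketch of a write that is only treated as done once fsync() has
returned. The helper name durable_write and the path in the usage line
are made up for this example; os.fsync() is Python's standard wrapper
around the fsync() system call.

```python
import os


def durable_write(path, data):
    """Write bytes to path and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Everything written to this file object before this call is
        # what the guarantee covers; an exception here means the data
        # must not be assumed to be on disk.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    durable_write("example.dat", b"must survive a crash\n")
```

Data written without the fsync() call falls into the window of up to
30 seconds or so that may be unwound after a crash.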

NOTE! There are two side issues, one of which I can fix and one of
which I cannot. The first is that we haven't implemented the hardware
disk flush command through the device driver so if the hard disk lies
about the I/O being complete (and most do), HAMMER's flush sequence
may wind up being imperfect. That is, the drive could write the volume
header before finishing writing the UNDO blocks and cause the crash
recovery code to fail. It's fairly easy to add that feature but I
wonder if someone else could do it :-). FreeBSD did add that feature
and it didn't look too complicated.

The second thing to note is that if you physically pull the plug on a
hard drive which is in the middle of writing something, you can lose
the hard drive... and I'm not talking about just losing one or two
sectors. I mean you can lose several tracks, even data you weren't
writing out but which was simply nearby. You can easily lose the
entire drive. A system crash is one thing, an uncontrolled power-down
is quite another. Drive manufacturers aren't willing to spend the
$0.10 required to put in a big enough capacitor to put the drive into
a safe state on power failure.

Some time last year I was running three raid arrays but didn't have the
UPS's smart status feature hooked into the computing equipment. A power
failure occurred and I lost three drives. Poof, three dead drives. Now
I have the UPS hooked into the computers with apcupsd so the computers
shut down before the UPS does.
:2) If I understand well, I do a synctid and then I do the softlink on
:the transaction ID and (soft)prune after e. g. 2 days -- does this
:mean all my history gets deleted except for the last 2 days. Is this
:correct?
Yes. HAMMER will delete all history prior to that softlink's
transaction id but will retain all history after it. Thus the
history from that transaction id on to 'now' will remain fine-grained.
(If it doesn't do that tell me, because that's how it is supposed
to work).
:3) If I make 3 softlinks, like soft1 (made 4 days ago), soft2
:(made 3 days ago) and soft3 (made 1 day ago), delete soft2 and
:softprune, will _all_ the changes done between soft1 and soft2 also be
:deleted?
Yes. Once you remove soft2 and then run the prune command any
history between soft1 and soft3 will be destroyed, including history
that was previously retained in order to support the soft2 snapshot.
Now that soft2 is gone, that history will be destroyed.

All history prior to soft1 will be destroyed, and any history after
soft3 (from soft3 to current) will be retained and remain
fine-grained.
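
The retention rules from answers 2 and 3 can be sketched as a toy model
in Python. This is not HAMMER's actual pruning code: the Record class,
the integer transaction ids and the prune() function are all invented
for illustration, and the model assumes a record is kept if it is still
live, if it was deleted after the newest snapshot, or if it is visible
at one of the remaining snapshots.

```python
from dataclasses import dataclass

LIVE = 0  # delete_tid of a record that has not been deleted


@dataclass(frozen=True)
class Record:
    name: str
    create_tid: int
    delete_tid: int = LIVE


def visible_at(rec, tid):
    """True if the record exists in the as-of view at transaction id tid."""
    return rec.create_tid <= tid and (rec.delete_tid == LIVE or tid < rec.delete_tid)


def prune(records, snapshot_tids):
    """Return the records that survive a prune against the given snapshots."""
    if not snapshot_tids:
        return list(records)  # case not covered by this toy model
    newest = max(snapshot_tids)
    kept = []
    for rec in records:
        live = rec.delete_tid == LIVE
        after_newest = rec.delete_tid > newest  # fine-grained range up to 'now'
        needed = any(visible_at(rec, s) for s in snapshot_tids)
        if live or after_newest or needed:
            kept.append(rec)
    return kept


soft1, soft2, soft3 = 100, 200, 300
history = [
    Record("a", 50, 80),    # gone before soft1
    Record("b", 90, 150),   # visible at soft1
    Record("c", 120, 180),  # lived and died between soft1 and soft2
    Record("d", 150, 250),  # visible only at soft2
    Record("e", 310, 320),  # created and deleted after soft3
    Record("f", 10),        # still live
]

print([r.name for r in prune(history, [soft1, soft2, soft3])])  # ['b', 'd', 'e', 'f']
print([r.name for r in prune(history, [soft1, soft3])])         # ['b', 'e', 'f']
```

Record d is the interesting one: it survives only while soft2 exists,
which is the "history that was previously retained in order to support
the soft2 snapshot".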
:4) Feature suggestion: I think for a little bit more comfortable
:operation, there should be a command that automatically creates a
:softlink. Like: hammer snap /path/to/softlink which does a synctid and
:creates the softlink in the desired path. That way one would not be
:forced to retrieve the transaction ID and create softlinks manually. Or
:have I missed something and you already have implemented this? :-)
It's a good idea. Go ahead and add it to the hammer utility.
Maybe call it 'hammer snapshot <softlink-directory> [<filesystem>]'
(where the filesystem need only be specified if the softlink
directory is not in the desired filesystem).
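
A rough sketch of what such a command would do, written as a Python
wrapper rather than as a patch to the hammer utility. Several things
here are assumptions: that 'hammer synctid <filesystem>' prints just
the transaction id, that a snapshot is addressed as <filesystem>/@@<tid>,
and the snap-<tid> link name, which is invented. When no filesystem is
given the sketch simply uses the softlink directory itself as the path,
which is a simplification.

```python
import os
import subprocess


def synctid(filesystem):
    """Run 'hammer synctid' and return the transaction id it prints."""
    out = subprocess.run(
        ["hammer", "synctid", filesystem],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.strip()  # assumed output: a single id such as 0x...


def snapshot(softlink_dir, filesystem=None, get_tid=synctid):
    """Sync, then create a softlink in softlink_dir to the as-of view."""
    fs = filesystem if filesystem is not None else softlink_dir
    tid = get_tid(fs)
    link = os.path.join(softlink_dir, "snap-" + tid)  # hypothetical naming
    os.symlink(fs.rstrip("/") + "/@@" + tid, link)
    return link
```

The get_tid parameter exists only so the function can be exercised
without a HAMMER filesystem, by passing a function that returns a fixed
transaction id.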
:5) Bug report: please add the nohistory flag to the chflags man
:page. :-)
(Sascha can you do that for us?)
:6) While we are at nohistory: is it possible to have a fully
:nohistory'd volume with only specific directories for which the user
:would like to retain the history?
Sure. The chflags flag propagates automatically so just start
out by chflagging -R the entire mess nohistory, then chflagging
-R the bits you want history to be retained on 'history'. Once
you've done that any new objects created under directories with
nohistory set will also be nohistory, and any new objects
created under directories which allow history will also allow
history.
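
The inheritance behaviour described in this answer can be pictured with
a small toy model in Python. It does not touch a real filesystem; the
Node class and chflags_r() are invented for illustration and model only
the nohistory flag.

```python
class Node:
    """A file or directory in a toy tree; nohistory is the only flag modelled."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # New objects inherit the flag from the directory they are created in.
        self.nohistory = parent.nohistory if parent is not None else False
        if parent is not None:
            parent.children.append(self)


def chflags_r(node, nohistory):
    """Model of a recursive chflags: apply the flag to node and all below it."""
    node.nohistory = nohistory
    for child in node.children:
        chflags_r(child, nohistory)


root = Node("/")
keep = Node("keep", root)
scratch = Node("scratch", root)

chflags_r(root, True)    # mark the entire tree nohistory
chflags_r(keep, False)   # re-enable history under keep

kept_file = Node("notes.txt", keep)        # inherits history
scratch_file = Node("build.tmp", scratch)  # inherits nohistory
print(kept_file.nohistory, scratch_file.nohistory)  # False True
```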
:TIA for the answers and sorry if some of my questions are related to
:straightforward things, I am writing from a user's POV.
:
:--
:Gergo Szakal MD <bastyaelvtars@gmail.com>
I'm going to add an addendum here with regards to the upcoming mirroring
support. I am not confident that I can make mirroring work well for
files marked 'nohistory'. I will try, but there's a lot of complication
involved due to the lack of history (and thus the lack of B-Tree
elements showing what got deleted). Ultimately such support will be
required, but I don't know if I can fit that level of sophistication
into the 2.0 release.
-Matt
Matthew Dillon
<dillon@backplane.com>