Re: How much are filesystem images trusted?

From: "Dionysus Blazakis" <dion.blazakis@xxxxxxxxx>
Date: Sat, 19 Jul 2008 22:04:27 -0400

On Sat, Jul 19, 2008 at 2:27 PM, Matthew Dillon
<dillon@apollo.backplane.com> wrote:
>
> :I've been looking at the HAMMER code a bit. It seems the mount will
> :hang the kernel at recovery time if the tail of an undo record contains
> :a zero size. I've been told the filesystem is implicitly trusted, but
> :I think a failed assert would be better than the stuck while loop.
> :
> :I have a small disk image to illustrate the hang at:
> :http://leaf.dragonflybsd.org/~dion/hammer.small.bz2
> :
> :This obviously isn't a high priority, but I'm interested in hearing
> :opinions on it (does this kind of bug interest us?).
> :
> :-- Dion
>
> I'm assuming you just poked the bits in the on-media UNDO FIFO to
> create the failure condition and it isn't a bug per se, right?
Definitely. I was just making sure my understanding of the recovery
code was correct.
The disk image was not "organically" constructed.
>
> I think an assertion is fine, or even just have the mount return
> a failure. Would you like to code up your patch suggestion? We
> can commit it after the release.
I have a patch up at:
http://leaf.dragonflybsd.org/~dion/hammer-mount-badundo.patch
It consists of two small changes:
- Check that the reported tail_size is at least the size of a tail
fifo structure (instead of at least 0) -- a bad value now causes an
EIO instead of a loop or panic.
- If an error occurred in hammer_recover, an I/O lock leak caused a
panic. I now skip the (last) flush if an error occurred during mount.
This seems safe -- it doesn't matter too much, since you're screwed at
this point anyway.
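
To illustrate the first change, here is a minimal user-space sketch of a
backwards FIFO scan with the stricter size check. It is not the patch
itself: struct fifo_tail, its fields, and scan_backwards() are
hypothetical stand-ins, not HAMMER's real on-media layout or recovery
code. The point is only that a tail size smaller than the tail structure
(including 0) never moves the offset, so the loop has to reject it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record tail; NOT the real HAMMER on-media structure. */
struct fifo_tail {
    uint32_t tail_signature;
    uint32_t tail_size;     /* size of the whole record, tail included */
};

/*
 * Walk the records in buf backwards, starting from the end.
 * Returns 0 and stores the record count in *countp on success,
 * or EIO if any tail reports a size that cannot be valid.
 */
static int
scan_backwards(const uint8_t *buf, size_t len, unsigned *countp)
{
    struct fifo_tail tail;
    size_t off = len;
    unsigned count = 0;

    assert(buf != NULL && countp != NULL);

    while (off > 0) {
        /* There must be room for a tail before the current offset. */
        if (off < sizeof(tail))
            return EIO;
        memcpy(&tail, buf + off - sizeof(tail), sizeof(tail));

        /*
         * Checking only for "at least 0" would accept a size of 0 and
         * leave off unchanged forever. Require at least a full tail,
         * and no more than the bytes that remain.
         */
        if (tail.tail_size < sizeof(tail) || tail.tail_size > off)
            return EIO;

        off -= tail.tail_size;
        count++;
    }
    *countp = count;
    return 0;
}
```

With a well-formed buffer the scan ends when the offset reaches 0; with
a zeroed tail size it returns EIO on the first iteration instead of
spinning.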
-- Dion
>
> -Matt
> Matthew Dillon
> <dillon@backplane.com>
>