Re: Background fsck

:The journal should at least be spread over the disk. E.g. the seek
:argument is somewhat weak since you need (a) to write the meta-data
:to disk anyway and seek for that and (b) to seek for the normal read
:activity.
There is no reason to spread a meta-data journal all over the disk,
you'll just make it go slower. All you need is a fixed area of whatever
size you choose to use as a circular FIFO. e.g. 10MB of space would
be sufficient.
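For illustration only, a fixed area used as a circular FIFO might be sketched like this in Python. The class and method names are made up for this example and are not taken from any existing journal code; the sketch tracks only space accounting, not the on-disk record format.

```python
class CircularJournal:
    """Fixed-size journal area used as a circular FIFO (illustrative only)."""

    def __init__(self, size):
        self.size = size      # bytes reserved for the journal, e.g. 10MB
        self.head = 0         # total bytes ever appended
        self.tail = 0         # total bytes ever reclaimed
        self.records = []     # (absolute_start, length) of live records, oldest first

    def free_space(self):
        return self.size - (self.head - self.tail)

    def append(self, length):
        """Reserve space for a record and return its offset inside the area.

        A record may wrap past the end of the area back to offset 0.
        """
        if length > self.free_space():
            raise RuntimeError("journal full: flush meta-data in place, then reclaim")
        offset = self.head % self.size
        self.records.append((self.head, length))
        self.head += length
        return offset

    def reclaim_oldest(self):
        """Drop the oldest record once its meta-data has been written in place."""
        start, length = self.records.pop(0)
        self.tail = start + length
        return length


journal = CircularJournal(size=10 * 1024 * 1024)
first = journal.append(4096)
second = journal.append(4096)
```

Space is only reclaimed from the oldest end, so the area never needs to be anywhere but in one contiguous place on the disk.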
The seek argument is paramount. 99.9% of the time a disk spends doing
an I/O is seeking. The actual write takes no time at all. It's the
seek that matters.
Remember, we have a buffer cache... when you write out a file no disk
I/O has to occur immediately. All that has to happen is that when the
buffer cache *does* want to flush meta-data out, it must be sure that
the meta-data has been written to the journal first. But a great deal
of meta-data can be collected by the buffer cache before it starts
writing, and it all still comes down to a single seek to write the
journal.
The result is that the computer would make one seek+write to write
potentially hundreds of meta-data fragments to the journal, and then
would make separate seek+writes to put each meta-data fragment in its
proper place on the disk.
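The ordering described above could be sketched roughly as follows. This is a hypothetical model, not real buffer cache code: `flush` and `disk_log` are names invented for the example, and the "disk" is just a list that records which I/Os were issued and in what order.

```python
def flush(dirty_fragments, disk_log):
    """Flush cached meta-data: journal first, then each fragment in place.

    dirty_fragments: list of (block_number, data) held by the buffer cache.
    disk_log: list that records the I/Os issued, in order.
    """
    if not dirty_fragments:
        return
    # One seek + one sequential write covers every pending fragment.
    disk_log.append(("journal", [blk for blk, _ in dirty_fragments]))
    # Only after the journal write may fragments go to their home blocks.
    for blk, _ in sorted(dirty_fragments, key=lambda frag: frag[0]):
        disk_log.append(("in_place", blk))


log = []
flush([(907, b"inode"), (12, b"bitmap"), (340, b"dir")], log)
```

Here three fragments cost one journal write plus three in-place writes, and the journal write always comes first.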
So let's calculate that... let's say you have a hundred meta-data
fragments which would normally require 25 disk seeks to flush out.
Now add journaling... so now we are doing 26 disk seeks to flush it all
out instead of 25. Each seek takes 5ms, so without the journal flushing
the buffer cache would take 25x5 = 125ms, and with the journal it
would take 26x5 = 130ms.
In other words, a meta-data journal does not slow things down in the
least.
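The same arithmetic as a small Python sketch, using only the numbers from this post (25 in-place seeks, one extra journal seek, 5ms per seek). The function name is made up for the example, and the model counts seek time only, ignoring transfer time.

```python
SEEK_MS = 5  # per-seek cost assumed in the post


def flush_time_ms(in_place_seeks, journal_seeks=0, seek_ms=SEEK_MS):
    """Total flush time when seek time is the only cost that matters."""
    return (in_place_seeks + journal_seeks) * seek_ms


without_journal = flush_time_ms(25)
with_journal = flush_time_ms(25, journal_seeks=1)
overhead_pct = 100 * (with_journal - without_journal) / without_journal
print(without_journal, with_journal, overhead_pct)  # prints "125 130 4.0"
```

The journal's cost is a fixed one seek per flush, so the more meta-data the buffer cache collects before flushing, the smaller the relative overhead.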
:That's the problematic part. For the background fsck there is one
:assumption which must hold true. At one time or another the fsck
:must find any resource. That's what complicates the process a bit.
:Doing it in kernel would simplify things a bit, but the basic problem
:is the same. But anyway could you live with allowing the R/W mount
:of dirty softdep fs? You can run the fsck later, it is only needed to
:recollect some unreferenced inodes and blocks.
I really dislike an R/W mount on anything dirty.
A journal removes the need entirely... recovery would not take very
long at all so you wouldn't have to support background fscking.
-Matt
Matthew Dillon
<dillon@xxxxxxxxxxxxx>
:> So, how can I (a) convince you that a filesystem journal is the way to
:> go and (b) maybe get you to write it ? :-) :-) (!)
:
:I will dig into David's journal code anyway, since this is at least
:interesting. I haven't changed my PoV concerning Journal vs. Softdep yet,
:but we should get some real performance numbers once we have both
:Journaling and Softdeps in tree.
:
:Joerg