On Sun, 2011-10-09 at 01:03 -0700, Gavin Hurlbut wrote:
> Unfortunately, mythtv uses fsync() in the recording path too, but I
> think we have it spaced out to be less abusive of the I/O subsystem,
> but not sure how recently it has been looked at in detail.
I just need to nip this in the bud early. MythTV performance issues
with ext4 have nothing to do with fsync() and everything to do with
database access. Every time you save a row to a DB the disk cache
must be flushed if barriers are enabled. This means if you have a
drive with a 32MB cache, that data must be written to disk. Flushing
the cache may involve seeks, each of which takes about 8 ms. The seeks
and the disk rotation latency can be the real killers once the cache
is flushed.
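
To make the cost of a single DB save concrete, here is a minimal Python sketch of a durable append. The `durable_append` helper and the file name are hypothetical and only for illustration; MySQL and SQLite have their own, more elaborate commit paths. The point is that the call does not return until the kernel reports the data is on the device, which with barriers enabled is where the drive cache flush described above comes in.

```python
import os
import tempfile


def durable_append(path, row):
    """Append one row and block until the kernel reports it has been synced.

    Hypothetical helper: a stand-in for what a database does on commit.
    Returns the file size after the write.
    """
    with open(path, "ab") as f:
        f.write(row)
        f.flush()             # userspace buffer -> kernel page cache
        os.fsync(f.fileno())  # page cache -> storage device
    return os.path.getsize(path)


if __name__ == "__main__":
    # Assumed scratch location; any writable path would do.
    path = os.path.join(tempfile.mkdtemp(), "journal.log")
    for i in range(3):
        size = durable_append(path, b"row %d\n" % i)
        print("synced, file is now", size, "bytes")
```

Each call pays the full sync cost on its own, which is why several small saves in a row hurt far more than one large write.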
MythTV can make several DB writes per second at times. But really
MythTV isn't as heavy a DB user as many modern applications. Turning
off barriers will have much more impact on your web browsing experience
with Firefox (which uses SQLite internally) than with your MythTV
experience.
The reason performance went down appreciably with ext4 shortly after
it was added to the Linux kernel is that the implementation of
barriers was changed. Barriers used to just mean that the data was
flushed across the SATA bus to the disk when crossing a barrier.
Modern disks have a small bit of RAM so that when you write to the
disk it can continue to accept other small writes while it actually
completes the first write. But if the power is cut abruptly there is
no battery or capacitor to allow those last few writes to complete
before the disk powers down. So the kernel was changed so that when
we encounter a barrier we wait for the data to actually make it to
the spinning platters and not just to the disk drive. Some disks and
many RAID controllers have battery backup, but those tend to be
professional drives and it's expected that those machines have a
professional system administrator who can turn off barriers or not
depending on the use case.
To see how barriers might affect performance, think of a web page load:
  Type in address & hit return: http://www.mythtv.org
  Saved to DB: url, then request is made to remote server.
  Remote server returns html with 80 javascript, image, and
    css resources.
  Saved to DB: tab names now that html has reported a new title
  Saved to DB: location of html text in cache
  Browser requests each of the 80 resources
  80 Saves to DB: location of each of the 80 resources
Here we have 83 saves to the DB. Before the barrier changes
these would have all been in the RAM cache of the drive and taken
about 20ms to write to disk. But now as soon as the second write
to DB starts it gets stalled for 10 ms, then the download is stalled.
Let's say you are loading a web page on 10 TCP connections and the
roundtrip delay is 40 ms. Before the changes the web page load would
take 40 + 80/10 * 40 = 360 ms, now it takes 40+10 + 80 * (40+10) =
4050 ms. The key here is that things that would have run in parallel
became serialized because they had to write something to disk before
they could proceed.
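
The back-of-the-envelope model above can be written out as a short Python sketch. The function names are hypothetical, and the model is deliberately crude: it assumes one round trip per resource, a fixed 10 ms stall per DB save, and full serialization once the stalls are in play. Evaluated as written, the two formulas come to 360 ms and 4050 ms.

```python
def load_time_parallel(resources, connections, rtt_ms):
    """One round trip for the HTML, then resources fetched in parallel batches."""
    batches = -(-resources // connections)  # ceiling division
    return rtt_ms + batches * rtt_ms


def load_time_serialized(resources, rtt_ms, flush_ms):
    """Every request waits for a DB save to reach the disk before proceeding."""
    step = rtt_ms + flush_ms
    return step + resources * step


if __name__ == "__main__":
    before = load_time_parallel(resources=80, connections=10, rtt_ms=40)
    after = load_time_serialized(resources=80, rtt_ms=40, flush_ms=10)
    print("before barrier change:", before, "ms")
    print("after barrier change: ", after, "ms")
    print("slowdown: %.1fx" % (after / before))
```

Most of the slowdown comes from losing the 10-way parallelism, not from the 10 ms stall itself.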
It wasn't just ext4 that suffered a performance penalty from the barrier
changes; XFS and JFS did as well. ext4 was just the poster child because
it was corruption of data stored on ext4 that forced the barrier changes, and the
change also happened just after a large number of people began
experimenting with ext4 since it had just made it into the kernel. I
remember SGI hardware guys complaining to me at cocktail parties over a
decade ago about how PCs didn't have any protection against loss of
power which was a major problem for XFS when initially ported to Linux.
As for solutions... If you are not a guru and don't make backups on a
regular basis, use ext3 for the disk that contains your databases (/var
and /home directories) and XFS for the disks that contain your video.
Mount the XFS disk with the nobarrier,noatime options. Then don't worry
about this issue.
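
If you want to check what your mounts are actually using, here is a small Python sketch that scans text in /proc/mounts format for the nobarrier and barrier=0 options. The function name and the sample devices are made up for illustration. Note that a missing option does not prove barriers are on, since the default depends on the filesystem and kernel version.

```python
def barriers_explicitly_disabled(mounts_text):
    """Map mount point -> True if nobarrier or barrier=0 appears in its options.

    Expects text in /proc/mounts format:
        device mountpoint fstype options dump pass
    Only ext3, ext4 and xfs entries are reported.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mountpoint, fstype, options = fields[1], fields[2], fields[3].split(",")
        if fstype not in ("ext3", "ext4", "xfs"):
            continue
        result[mountpoint] = "nobarrier" in options or "barrier=0" in options
    return result


if __name__ == "__main__":
    # Hypothetical sample; on a real system read /proc/mounts instead.
    sample = (
        "/dev/sda1 / ext3 rw,relatime,errors=remount-ro 0 0\n"
        "/dev/sdb1 /video xfs rw,noatime,nobarrier 0 0\n"
        "proc /proc proc rw 0 0\n"
    )
    for mountpoint, disabled in barriers_explicitly_disabled(sample).items():
        print(mountpoint, "barriers explicitly disabled:", disabled)
```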
If you already have ext4 on the partition with /var and /home and don't
want to change it, you can mount with nobarrier or one of the other
suggestions being made here, but you should know that ext4 with
nobarrier is not as safe as ext3 with nobarrier. Like XFS, ext4 uses
delayed allocation, which increases the window of time you are
vulnerable to a loss of power. Ironically this doesn't cause a
problem with MySQL, SQLite or MythTV, which use fsync() appropriately.
-- Daniel