If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

sqlite results explained

With the SQLite test profile to measure how long it takes to perform 12,500 insertions using this lightweight SQL database, EXT3 and NILFS2 were the clear winner. It took 20 seconds for this database test to complete under EXT3, 34 seconds under NILFS2, but 870 seconds for EXT4! XFS was at 1312 seconds and Btrfs was at 1472 seconds! These results are a bit shocking, but the Phoronix Test Suite does run these tests multiple times to ensure accuracy and statistical significance.

It's really not shocking, and it's already been explained on your mailing list:

The difference is whether or not barriers are on during the test, which cause drive cache flushes on fsyncs, which sqlite will do a -lot-

When you test something and get shocking results, it would make sense to contact the developers of the subsystem(s) at that point, to see if it can be explained - if it's expected, a problem in test methodology, a regression, or what. It would let you do a much more informative writeup, and educate your readers. As it is, you run the risk of starting a meme like "ext4 is bad for databases" when in fact it's a change in the default configuration which has caused the change, and it's more of a tuning issue.

I think phoronix could be a very interesting and useful tool for users and developers alike, but investing a bit in communication with the experts in the subsystems you're testing would help a lot.

The difference is whether or not barriers are on during the test, which cause drive cache flushes on fsyncs, which sqlite will do a -lot-

When you test something and get shocking results, it would make sense to contact the developers of the subsystem(s) at that point, to see if it can be explained - if it's expected, a problem in test methodology, a regression, or what. It would let you do a much more informative writeup, and educate your readers. As it is, you run the risk of starting a meme like "ext4 is bad for databases" when in fact it's a change in the default configuration which has caused the change, and it's more of a tuning issue.

I think phoronix could be a very interesting and useful tool for users and developers alike, but investing a bit in communication with the experts in the subsystems you're testing would help a lot.

Thanks,
-Eric

this is along the lines that I expected. ext3 does silently disable barriers, and even fails to do a fsync properly under some conditions (when the journal wraps), but some of the kernel developers have not been willing to change the default to the safer mode due to the performance impact it can have under some workloads.

for database workloads it would be very useful to have two sets of tests.

safe-but-slow (what you would want on a database where the contents are worth money)

fast-but-risky (what you would want on a database where the contents are not worth much, but performance is critical)

MySQL showed that there is a huge market for this second category (they now have the option of the safe-but-slow mode, but when they started they didn't)

some of this is in the application (disabling fsync), but other parts can be in the filesystem as well (disabling barriers, atime, etc)

In addition it would be good to see ext2 (or ext4 with journaling disabled) in these tests. especially for databases where the application takes care of data integrity, there is little data-loss value in using a journal, and the performance difference can be substantial.

I was recently doing some benchmarks on SSDs for a utterly reliable mode of rsyslog, where it sacrafices speed as needed to make sure that when a message is acknoledged it's safe on disk. In doing this test I saw a range of 1700 messages/sec to 8500 messsages/sec on identical hardware with the only change being the filesystem in use (with ext2 being the 8500/sec at ~60% cpu for the busiest thread while the nearest competitor was ~4000/sec at 100% cpu for the busiest thread.

on the postgres performance mailing list I routinely see ext2 recommended as the filesystem of choice for partitions dedicated to the database.

the ext4 developers are saying that ext4 with journaling disabled should be able to replace ext2, but this is a relativly new mode of operation, and they are still finding all the bugs in it (Google is very interested in this mode, they want the large disk and extents capability of ext4 without the overhead of the journal for some uses)

The difference is whether or not barriers are on during the test, which cause drive cache flushes on fsyncs, which sqlite will do a -lot-

When you test something and get shocking results, it would make sense to contact the developers of the subsystem(s) at that point, to see if it can be explained - if it's expected, a problem in test methodology, a regression, or what. It would let you do a much more informative writeup, and educate your readers. As it is, you run the risk of starting a meme like "ext4 is bad for databases" when in fact it's a change in the default configuration which has caused the change, and it's more of a tuning issue.

I think phoronix could be a very interesting and useful tool for users and developers alike, but investing a bit in communication with the experts in the subsystems you're testing would help a lot.

Thanks,
-Eric

To some extent I disagree with this. The upstream developers, lead developers who push to the kernel and the distribution vendors all have a part to play in the cycle of getting things to production.

Phoronix is just reporting on the state of whatever made it into the kernel.

My view is that the question isn't so much what Phoronix has highlighted with 2.6.30 and why didn't he discuss with the upstream developers before publishing, but rather why didn't the maintainers have an awareness of the benefits and deficiencies prior to pushing up into the kernel.

For most people, they will review the general performance, select the filesystem that best performs for their intended use, and then tune only that platform.

I appreciate the position that the developers are in, but PTS is trivial enough for the developers to be self-aware before pushing downstream. It shouldn't take any real effort to test before pushing.