Monday, November 14, 2011

Linux lseek scalability

I don't normally follow Linux kernel development, but I was pleased to hear (via Andres Freund) that the Linux kernel developers have committed a series of patches by Andi Kleen to reduce locking around the lseek() system call. As I blogged about back in August, PostgreSQL calls lseek() quite frequently (to determine the file length, not to actually move the file pointer). Due to the performance enhancements in 9.2devel, it's now much easier to hit the contention problems that can be caused by frequently acquiring and releasing the inode mutex. It looks like this should be fixed in Linux 3.2, which is now at rc1 and therefore on track to be released well before PostgreSQL 9.2.
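For readers who haven't seen lseek() used this way, here is a minimal sketch in Python of asking for a file's length by seeking to its end. This is not PostgreSQL's actual code: os.lseek is just a thin wrapper around the system call, and the names file_length_via_lseek and block_count, along with the 8192-byte BLOCK_SIZE, are assumptions made for illustration.

```python
import os
import tempfile

# Assumed block size for this illustration only.
BLOCK_SIZE = 8192


def file_length_via_lseek(fd):
    """Return the file's length in bytes by seeking to its end.

    lseek() returns the resulting offset, so seeking 0 bytes from
    SEEK_END yields the file size. Note that this leaves the
    descriptor's offset at end of file.
    """
    return os.lseek(fd, 0, os.SEEK_END)


def block_count(fd):
    """Return how many whole BLOCK_SIZE blocks the file contains."""
    return file_length_via_lseek(fd) // BLOCK_SIZE


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * (3 * BLOCK_SIZE))
        print(block_count(fd))  # prints 3
    finally:
        os.close(fd)
        os.remove(path)
```

Every such call goes through the kernel's lseek() path, which is where the inode mutex contention described above would show up when many processes do this at once.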

Meanwhile, we're gearing up for CommitFest #3. Interesting stuff in this CommitFest includes Álvaro Herrera's work on reducing foreign key lock strength and a PostgreSQL foreign data wrapper (pgsql_fdw) by Hanada Shigeru. Reviewers are needed for those and many other patches!

I did test fstat(). At very high client counts, fstat() was a huge win because it doesn't lock the inode mutex, even in existing Linux releases. However, under ordinary circumstances, it's noticeably slower than lseek(), probably because it copies more data from kernel space to user space. So if we went with fstat(), it would really be just a hack to work around a kernel issue that exists only on Linux and that only people with very large machines will notice, and it would come at the expense of everyone else.
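As a rough illustration of the two approaches, here is a small Python sketch that gets the file length both ways and times repeated calls. The function names and the time_calls helper are hypothetical, and this is not the test I actually ran: a single-threaded loop like this is dominated by interpreter overhead and cannot reproduce inode mutex contention, so treat any numbers it prints as a curiosity rather than a benchmark.

```python
import os
import tempfile
import time


def file_length_via_fstat(fd):
    """Return the file's length from the st_size field of fstat()."""
    return os.fstat(fd).st_size


def file_length_via_lseek(fd):
    """Return the file's length by seeking 0 bytes from the end."""
    return os.lseek(fd, 0, os.SEEK_END)


def time_calls(func, fd, iterations=100_000):
    """Return the seconds taken to call func(fd) `iterations` times."""
    start = time.perf_counter()
    for _ in range(iterations):
        func(fd)
    return time.perf_counter() - start


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * 8192)
        # Both methods should agree on the length.
        assert file_length_via_fstat(fd) == file_length_via_lseek(fd)
        for func in (file_length_via_lseek, file_length_via_fstat):
            print(func.__name__, time_calls(func, fd))
    finally:
        os.close(fd)
        os.remove(path)
```

fstat() fills in a whole structure of file metadata just to hand back one field, which fits the guess above about why it is slower in the uncontended case.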