I have a program that is very heavily hitting the file system, reading and writing randomly to a set of working files. The files total several gigabytes in size, but I can spare the RAM to keep them all mostly in memory. The machines this program runs on are typically Ubuntu Linux boxes.

Is there a way to configure the file system to have a very very large cache, and even to cache writes so they hit the disk later? I understand the issues with power loss or such, and am prepared to accept that. Crashing aside, in normal operation the writes should eventually reach the disk!

Or is there a way to create a RAM disk that writes-through to real disk?

7 Answers

Linux by default uses any spare RAM as a file cache, so no configuration is necessary for that.
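To check how much RAM the kernel is currently using for file caching, you can read /proc/meminfo. The following is a minimal sketch, not a polished tool; the helper names `parse_meminfo` and `cache_summary` are made up for illustration. `Cached` and `Buffers` show cached file data, while `Dirty` and `Writeback` show data that has been written by programs but has not yet reached the disk.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def cache_summary(info):
    """Return the fields most relevant to file caching, converted from kB to MiB."""
    fields = ("MemTotal", "MemFree", "Buffers", "Cached", "Dirty", "Writeback")
    return {f: info[f] / 1024 for f in fields if f in info}


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        summary = cache_summary(parse_meminfo(fh.read()))
    for field, mib in summary.items():
        print(f"{field:<10} {mib:10.1f} MiB")
```

If `Cached` is already close to the total size of your working files after the program has been running for a while, the files are effectively being served from memory.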

You may want to consider using ext4 as the filesystem. It uses a number of techniques to speed up disk access, including delayed allocation, which defers block allocation until the data is actually flushed to disk:

This has the effect of batching together allocations into larger runs. Such delayed processing reduces CPU usage, and tends to reduce disk fragmentation, especially for files which grow slowly. It can also help in keeping allocations contiguous when there are several files growing at the same time.

Are you seeing high numbers of IO waits, indicating that the read and write requests are not being satisfied via existing buffers? As others have noted, Linux is very good about giving spare RAM to buffers, so you should check this first.
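One way to check for IO waits without extra tools is to sample the aggregate `cpu` line in /proc/stat twice and look at how much of the elapsed CPU time went to the iowait counter. This is a rough sketch under that assumption; the function names are hypothetical, and tools such as `top` or `vmstat` report the same figure.

```python
import os
import time


def read_cpu_times(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a list of ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Fields: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already counted in user/nice, so only sum the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as fh:
        first = read_cpu_times(fh.read())
    time.sleep(1)  # sampling interval; lengthen it for a steadier reading
    with open("/proc/stat") as fh:
        second = read_cpu_times(fh.read())
    print(f"iowait: {iowait_percent(first, second):.1f}%")
```

A value that stays near zero while the program is busy suggests the requests are being served from the cache rather than waiting on the disk.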

If you're not seeing IO waits, then it's possible that your performance problems (do you even have problems? your question doesn't say) are due to kernel context switches from lots of small program-initiated IO operations. In this case you can gain a significant performance boost by rewriting your application to use memory-mapped files. But that's more of a question for StackOverflow.
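To give an idea of what the memory-mapped approach looks like, here is a minimal sketch in Python. It assumes, purely for illustration, that a working file is an array of fixed-size 64-bit records; the record layout and the helper names are hypothetical and would need to match your real file format. Once the file is mapped, random reads and writes become plain memory accesses, and the kernel writes modified pages back to disk in the background.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: one little-endian unsigned 64-bit integer.
RECORD = struct.Struct("<Q")


def create_working_file(path, num_records):
    """Create a zero-filled file large enough for num_records records."""
    with open(path, "wb") as fh:
        fh.truncate(num_records * RECORD.size)


def update_record(mapped, index, value):
    """Write one record in place through the mapping (no write() system call)."""
    RECORD.pack_into(mapped, index * RECORD.size, value)


def read_record(mapped, index):
    """Read one record through the mapping (no read() system call)."""
    return RECORD.unpack_from(mapped, index * RECORD.size)[0]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "working.dat")
    create_working_file(path, 1024)
    with open(path, "r+b") as fh, mmap.mmap(fh.fileno(), 0) as mapped:
        update_record(mapped, 512, 42)
        print(read_record(mapped, 512))
        # Optional: ask for dirty pages to be written out now rather than
        # whenever the kernel gets around to it.
        mapped.flush()
```

The mapping is shared with the file, so changes persist after the program exits normally; calling `flush()` is only needed when you want the data pushed out at a specific point.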

hmm ... last answer before mine was 3 months ago, OP hasn't been seen since posting the question, no other obvious activity, yet it appeared on the front page ... I guess the system is trying to get answers to questions – Anon, May 20 '10 at 21:01

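See also my answer to "Reserve RAM for cache and buffer". If you want the kernel to hold on to its cached directory and inode entries for longer, lower vfs_cache_pressure: echo 10 > /proc/sys/vm/vfs_cache_pressure (the default value is 100). Note that this tunes the dentry and inode caches, not the page cache that holds file contents. The other setting, echo 8192 > /proc/sys/vm/max_map_count, caps the number of memory-mapped regions a single process may have. It does not directly limit how much RAM an application uses, and 8192 is well below the usual default of 65530, so programs that create many mappings may fail with it.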
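Before changing either setting, it is worth recording the current values so you can restore them later. A small sketch that only reads the tunables (the helper name `read_vm_tunable` and its `base` parameter are made up for illustration):

```python
import os


def read_vm_tunable(name, base="/proc/sys/vm"):
    """Return the integer value of a vm tunable such as vfs_cache_pressure."""
    with open(os.path.join(base, name)) as fh:
        return int(fh.read().strip())


if __name__ == "__main__" and os.path.isdir("/proc/sys/vm"):
    for tunable in ("vfs_cache_pressure", "max_map_count"):
        print(f"vm.{tunable} = {read_vm_tunable(tunable)}")
```

Values written with echo are lost at reboot, so a note of the originals is mainly useful for undoing a change on a running system.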