Josh Berkus wrote:
> Now you can see why other DBMSs don't use the OS disk cache. There's
> other issues as well; for example, as long as we use the OS disk cache, we
> can't eliminate checkpoint spikes, at least on Linux. No matter what we do
> with the bgwriter, fsyncing the OS disk cache causes heavy system activity.
MS SQL Server uses the O/S disk cache...the database is very tightly
integrated with the O/S. Write performance is one of the few things SQL
Server can do better than most other databases despite running on a
mid-grade kernel and a low-grade filesystem...what does that say?
ReadFileScatter() and WriteFileGather() were added to the win32 API
specifically for SQL Server...this is somewhat analogous to
transaction-based writing such as in Reiser4. I'm not arguing MS SQL Server
is better in any way; IIRC they are still using table locks (!).
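For anyone who hasn't run into scatter/gather I/O, the closest POSIX
analogue I know of is readv()/writev(). Below is a minimal sketch in
Python 3 (Unix only), just to show the idea: several separate in-memory
pages go to disk in one call and come back into separate buffers in one
call. PAGE_SIZE and the function names are made up for illustration, and
this says nothing about how SQL Server actually drives the win32 calls.

```python
import os
import tempfile

PAGE_SIZE = 8192  # hypothetical page size, chosen only for illustration


def gather_write(fd, pages):
    """Write several separate buffers to fd with a single writev() call."""
    return os.writev(fd, pages)


def scatter_read(fd, page_count, page_size=PAGE_SIZE):
    """Read a contiguous file region into separate buffers with one readv() call."""
    buffers = [bytearray(page_size) for _ in range(page_count)]
    nread = os.readv(fd, buffers)
    return nread, buffers


def demo():
    # Three distinct "pages" that live in separate buffers in memory.
    pages = [bytes([i]) * PAGE_SIZE for i in range(3)]
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        written = gather_write(fd, pages)
        os.lseek(fd, 0, os.SEEK_SET)  # rewind before reading back
        nread, buffers = scatter_read(fd, len(pages))
    same = [bytes(b) for b in buffers] == pages
    return written, nread, same
```

The point is only that the buffers never have to be copied into one
contiguous block before the write, or split apart after the read.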
> > It seems inevitable that Postgres will eventually eliminate that
> > redundant layer of buffering. Since mmap is not workable, that means
> > using O_DIRECT to read table and index data.
IMO, the O_DIRECT argument makes assumptions about storage and O/S
technology that are moving targets. Not sure about mmap().
Merlin