As an aside, you should have been considering Linux+ReiserFS rather than
Solaris+UFS.
ReiserFS has been the best filesystem for a number of purposes since
about 1999 or 2000, especially for handling very many files, and very
many small files in particular. Not only can you put 100,000 files in
a directory with no problems, but the overhead per small file is
under about 64 bytes on average. (I believe the overhead was 17 bytes plus
the length of the filename.) The filesystem, in its default mode,
combines 'tails' just like a database would. In fact, its use of
B-trees and hashes along with journaling pretty much makes it a
database/filesystem.
In 2000 I benchmarked a 400 MHz system with a single 10,000 RPM drive
which was able to create/write, read, or delete small files (64, 128,
256, 1024, 2048, etc. bytes) at about 1100 per second. For this test, I was
operating on 1 million files in 10 directories of 100,000 each.
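
A test of this kind is easy to reproduce. At about 1100 files per
second, one pass over 1 million files takes roughly 15 minutes. What
follows is only a minimal sketch, not the original benchmark code: the
function name bench_small_files, the file naming, and the scaled-down
defaults are all hypothetical. Note that without an fsync or a cold
cache it mostly measures the filesystem and page cache, not the drive.

```python
import os
import tempfile
import time


def bench_small_files(root, n_dirs=10, files_per_dir=100, size=64):
    """Create, read, then delete n_dirs * files_per_dir files of `size` bytes.

    Returns a dict mapping phase name to files per second.
    Hypothetical helper for illustration only.
    """
    payload = b"x" * size
    paths = []
    # Directory creation is done up front and is not part of the timing.
    for d in range(n_dirs):
        dirpath = os.path.join(root, "d%02d" % d)
        os.mkdir(dirpath)
        for f in range(files_per_dir):
            paths.append(os.path.join(dirpath, "f%06d" % f))

    def timed(op):
        # Apply op to every path and return the rate in files per second.
        start = time.perf_counter()
        for p in paths:
            op(p)
        elapsed = time.perf_counter() - start
        return len(paths) / elapsed if elapsed > 0 else float("inf")

    def create(p):
        with open(p, "wb") as fh:
            fh.write(payload)

    def read(p):
        with open(p, "rb") as fh:
            fh.read()

    rates = {}
    rates["create"] = timed(create)
    rates["read"] = timed(read)
    rates["delete"] = timed(os.remove)
    return rates


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for phase, rate in bench_small_files(root).items():
            print("%-6s %10.0f files/sec" % (phase, rate))
```

To approximate the test described above, call it with
files_per_dir=100000 and repeat for each file size.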
Hans Reiser, Stephen Tweedie (Ext2/ext3 author), and I debated the need
for better multithreaded models for ReiserFS at one of the first
LinuxWorlds. It will be interesting to see how Solaris's new
filesystem compares.
Still (back on the subject), in general it's bad to create that many
files unless you have a good reason. It can't be required in the
processing of a generalized data format.
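
Where the files can't be avoided on a filesystem that chokes on large
directories, the workaround Roger describes below (leaf directories
capped at some maximum number of files) can be sketched roughly as
follows. This assumes each file is identified by an integer index and
uses a single level of bucket directories; the names sharded_path and
write_sharded and the naming scheme are hypothetical, not his code.

```python
import os
import tempfile


def sharded_path(root, index, max_per_dir=1000):
    """Map a file index to root/<bucket>/<name>.

    Each bucket directory holds at most max_per_dir files.
    """
    bucket = index // max_per_dir
    return os.path.join(root, "%06d" % bucket, "f%09d.dat" % index)


def write_sharded(root, index, data, max_per_dir=1000):
    """Write data to the sharded location for index, creating the bucket."""
    path = sharded_path(root, index, max_per_dir)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for i in range(2500):
            write_sharded(root, i, b"x" * 64)
        for bucket in sorted(os.listdir(root)):
            count = len(os.listdir(os.path.join(root, bucket)))
            print(bucket, count)
```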
sdw
Cutler, Roger (RogerCutler) wrote:
>...
>About your specific proposal for handling the seismic data (which is our
>contribution -- including an example dataset), compression aside, I
>still don't know. Is it really reasonable to fling millions of small
>files around? I recall that some operating systems don't like that at
>all. As a specific example, I have experience on Solaris Unix systems
>making directories containing hundreds of thousands of small
>auto-generated files. The OS choked -- really fundamentally choked --
>if you tried to put them all in one directory. I was forced to make
>directory trees with leaf directories that had some max number of files
>in them (I used 1000, if I recall correctly). This necessitated, of
>course, a bunch of pain-in-the-neck logic and code.
>
>This was a while ago, so maybe things have improved -- I throw the
>experience out for what it is worth. But I am dubious and would
>certainly want to see demonstrations before committing to this approach.
>
>
>
...
--
swilliams@hpti.com http://www.hpti.com Per: sdw@lig.net http://sdw.st
Stephen D. Williams 703-724-0118W 703-995-0407Fax 20147-4622 AIM: sdw