On 3/14/07, Pete Kazmier <pete-expires-20070513 at kazmier.com> wrote:
> When using readFile to process a large number of files, I am exceeding
> the resource limits for the maximum number of open file descriptors on
> my system. How can I enhance my program to deal with this situation
> without making significant changes?
I made it work with 20k files with only minor modifications.
> > type Subject = String
> > data Email = Email {from :: From, subject :: Subject} deriving Show
It has been pointed out that parseEmail would work better if it were
strict; the easiest way to accomplish this seems to be to replace the
above line by
data Email = Email {from :: !From, subject :: !Subject} deriving Show
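A minimal sketch of what the strictness annotations change, assuming From is a String synonym like Subject (the quoted code only shows Subject). Note that a bang on a String field only forces it to weak head normal form, i.e. the first cons cell, not the whole string, so how much of the file gets read still depends on parseEmail.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

type From    = String  -- assumption: not shown in the quoted code
type Subject = String

data Email = Email {from :: !From, subject :: !Subject} deriving Show

main :: IO ()
main = do
  -- Evaluating the constructor forces both strict fields, so the
  -- error in the first field is raised here instead of being deferred.
  r <- try (evaluate (Email (error "from was forced") "hello"))
         :: IO (Either ErrorCall Email)
  putStrLn (either (const "fields are strict") (const "fields are lazy") r)
```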
[snip]
> > fileContentsOfDirectory :: FilePath -> IO [String]
> > fileContentsOfDirectory dir =
> > setCurrentDirectory dir >>
> > getDirectoryContents dir >>=
> > filterM doesFileExist >>= -- ignore directories
> > mapM readFile
And here's another culprit - readFile opens the file as soon as the
action runs, before any of its output is used, so mapM readFile opens
every file up front. So I imported System.IO.Unsafe and replaced
the last line above by
mapM (unsafeInterleaveIO . readFile)
With these two changes the program seems to work fine.
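For reference, here is a self-contained sketch of the modified function. It is not the exact program from the thread: I use getDirectoryContents "." after the setCurrentDirectory call (so a relative dir also works), and the main below is only a hypothetical driver. Each file is opened when its contents are first demanded, and the handle is closed once the whole string has been consumed, so the consumer should finish one file before touching the next.

```haskell
import Control.Monad (filterM)
import System.Directory
  (doesFileExist, getDirectoryContents, setCurrentDirectory)
import System.IO.Unsafe (unsafeInterleaveIO)

fileContentsOfDirectory :: FilePath -> IO [String]
fileContentsOfDirectory dir =
  setCurrentDirectory dir >>
  getDirectoryContents "." >>=
  filterM doesFileExist >>=             -- ignore directories
  mapM (unsafeInterleaveIO . readFile)  -- defer opening until demanded

-- Hypothetical driver: length consumes each file completely, so its
-- handle is closed before the next file is opened.
main :: IO ()
main = do
  contents <- fileContentsOfDirectory "."
  mapM_ (print . length) contents
```

The usual caveat with unsafeInterleaveIO applies: the files are read whenever the strings happen to be evaluated, so changes made to them in the meantime will show up in the result.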
HTH,
Bertram