My line of thinking there was to make a note of which subdirectory I had got to, as a checkpoint every so often, and combine that with File::Find::prune to "skip forwards". I suppose I'm not really sure why I'm resisting databases, though.

File::Find lets you set "prune", which enables or disables traversal into the current directory. If you know your last checkpoint is /mnt/myhome/stuff/junk, you can pattern-match each path and turn off traversal until you get a match. (You may have to roll your checkpoint up a little if the target has been deleted in the interim.)
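
A minimal sketch of that restart logic, assuming the checkpoint path above and a /mnt/myhome start directory. $File::Find::prune is set from inside the wanted callback, and preprocess sorts entries so the traversal order is the same on every run:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;

    # Assumed example: the last directory the previous run reached.
    my $checkpoint = '/mnt/myhome/stuff/junk';
    my $resumed    = 0;

    find({
        # Sort each directory's entries so traversal order is repeatable.
        preprocess => sub { sort @_ },
        wanted     => sub {
            my $path = $File::Find::name;
            if (!$resumed) {
                # Skip-forward phase: prune any directory that is neither
                # an ancestor of the checkpoint nor at/below it.
                if (-d $_
                    && index($checkpoint, "$path/") != 0
                    && index("$path/", "$checkpoint/") != 0) {
                    $File::Find::prune = 1;
                    return;
                }
                # Switch to normal processing once at/below the checkpoint.
                $resumed = 1 if index("$path/", "$checkpoint/") == 0;
                return unless $resumed;
            }
            print "$path\n" if -f $_;   # real per-file work goes here
        },
    }, '/mnt/myhome');

If the checkpoint directory has been deleted, $resumed never flips and everything gets pruned, hence the need to roll the checkpoint up to a surviving ancestor.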

That will - hopefully - give me a restartable find. Being able to "skip ahead" in future (and thus distribute processing) may require a first pass, and tracking multiple checkpoints.

Thinking about it, I'd need some sort of start/finish pair per chunk, and some way of compensating for "drift" as files come and go. But one checkpoint every 100k files takes a huge list down to a merely large one. (That still doesn't help with the first pass, though, unless you can take some wild guesses at initial checkpoints and apply the same drift compensation.)
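
A rough sketch of that first pass, harvesting a checkpoint every 100k files (the interval and start path are assumptions; it just records the directory containing every 100,000th file seen):

    use strict;
    use warnings;
    use File::Find;

    my $interval = 100_000;   # assumed: one checkpoint per 100k files
    my $count    = 0;
    my @checkpoints;

    find({
        preprocess => sub { sort @_ },   # same deterministic order as the resume pass
        wanted     => sub {
            return unless -f $_;
            # Note the enclosing directory of every 100,000th file.
            push @checkpoints, $File::Find::dir
                if ++$count % $interval == 0;
        },
    }, '/mnt/myhome');

    print "$_\n" for @checkpoints;

Each adjacent pair of checkpoints then gives a worker a start/finish range: prune until the start, process, and bail out once the finish is reached.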