I have a few directories, some nested three levels deep, that contain mixed file types. I need to rm -rf every subdirectory that does not contain any file of type foo.

Is this achievable with find somehow?
I do know that I can use find like this:

find . ! -name '*.foo' -delete

to delete every file whose name does not match *.foo.
Now, how can I extend this to delete not only the unwanted files, but also every directory and subdirectory that does not contain a *.foo file?

So if you have /a/b, and b has no *.foo files, but /a/b/c has *.foo files, obviously we don't want to run rm -rf /a/b, right? In this case, I don't think rm -rf is the right approach; maybe we need something like rm dir/*; rmdir dir.
– Mikel, Mar 1 '11 at 3:20

If a directory contains some.foo and some.bar, should it be deleted? Your question is not clear in this respect.
– Gilles, Mar 1 '11 at 23:31

3 Answers

(Your question is not clear: if a directory contains some.foo and some.bar, should it be deleted? I interpreted it as requiring such a directory to be kept.)

The following script should work, provided that no file name contains a newline and no directory matches *.foo. The principle is to traverse the directory from the leaves up (-depth), and as *.foo files are encountered, the containing directory and all parents are marked as protected. Any reached file that is not *.foo and not protected is a directory to be deleted. Because of the -depth traversal order, a directory is always reached after the *.foo files that might protect it. Warning: minimally tested, remove the echo at your own risk.
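The principle described above can be sketched roughly as follows. This is a minimal illustration under the same stated assumptions (no newline in any file name, no directory named *.foo), not the answer's original script; the function names list_unprotected and prune_dry_run are hypothetical. The starting directory itself is never listed.

```shell
#!/bin/sh
# list_unprotected ROOT (hypothetical helper): print every directory below ROOT
# that has no *.foo file anywhere beneath it.
# Assumes no path contains a newline and no directory is named *.foo.
list_unprotected() {
    find "$1" -depth \( -type d -o -name '*.foo' \) -print |
    ROOT="$1" awk '
        # A *.foo entry protects its containing directory and all parents.
        /\.foo$/ {
            path = $0
            while (sub("/[^/]*$", "", path) && path != "") {
                protected[path] = 1
            }
            next
        }
        # Thanks to -depth, a directory is seen after everything inside it,
        # so its protection status is already final here.
        $0 != ENVIRON["ROOT"] && !($0 in protected) { print }
    '
}

# prune_dry_run ROOT (hypothetical helper): show the commands that would run.
# Remove the echo at your own risk.
prune_dry_run() {
    list_unprotected "$1" | while IFS= read -r dir; do
        echo rm -rf -- "$dir"
    done
}
```

Calling prune_dry_run . prints one rm -rf line per directory that would be removed, deepest first, which makes it easy to review the list before deleting anything.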

Maybe File::Find would be the right tool for this job.
– Gilles, Mar 1 '11 at 23:27

I get an error on awk script line 5 running that, but it seems OK if I change [\\\011-/] to [\011-].
– Mikel, Mar 2 '11 at 1:28

@Mikel: That's weird, I get no error under Gawk, Mawk or the original awk, and I don't see what could be wrong. Your regexp "[\001-]" (I assume \011 is a typo) matches only \001 and -, which is no good since the point is to protect \\'" and whitespace.
– Gilles, Mar 2 '11 at 7:58

Thanks for the suggestion. You're right, I should short-circuit as soon as a file is found. Seeing as I'm using read already, how about find ... | read file && return 0? I think that is very slightly faster again.
– Mikel, Mar 2 '11 at 1:21
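For illustration, the short-circuit idea from this comment might look like the sketch below. The function name has_foo and the exact find arguments are assumptions, since the comment elides them. Note that find itself only stops when it next tries to write to the closed pipe, so how much time is saved depends on the tree.

```shell
#!/bin/sh
# has_foo DIR (hypothetical helper): succeed if at least one *.foo file
# exists somewhere under DIR.
has_foo() {
    # read succeeds as soon as find emits one line; the pipeline's exit
    # status is that of read, so the function can return immediately.
    find "$1" -type f -name '*.foo' | read -r file && return 0
    return 1
}
```

A caller could then write something like has_foo "$dir" || echo rm -rf -- "$dir" to list candidate directories without deleting them.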