I would like to use rsync to transfer all files from a server (or server-via-ssh) which have a specific ending string, such as a file type extension like tar.gz. I want them all regardless of how deep they might go, now or in the future. But this can be more than just the extension: in one case I want to get all files ending in -server-cloudimg-amd64-root.tar.gz. And equally important, I do not want to get any other files at all, even if they are at shallower paths.

The simple case of --include="**/*-server-cloudimg-amd64-root.tar.gz" does not get them. I know rsync include/exclude is not simple and did not expect that to work. I know there is some need to specify directories to also be transferred. But rsync's logic has been a perpetual mystery to me (not like any path ACL rules I've ever seen) because it also requires matching parent directories separately. I think what is needed is simply an option with the semantics "if this file matches, include whatever directories are necessary to make it transfer, without implying anything else matches", in much the same way that mkdir -p ${DIRNAME} would create the parents of the named directory as needed. I see no such option in rsync. Is there some straightforward way to do this in one pass?

That seems to work (with my pattern in place of the example pattern). But I don't understand why. I'd still want to test it more thoroughly to be sure no "weird" site ends up with things that confuse it.
–
Skaperen Sep 8 '12 at 20:28

OK, I commented too quickly. It is pulling down other directories that do not contain matching files. Though they are empty in the target, they are still being created. On some very large trees that can be a problem, requiring cleanup every time rsync is run.
–
Skaperen Sep 8 '12 at 20:32

I see from the linked answers, specifically the answer beginning with the word "Judging", that the culprit is --include='*/'. It matches all directories. This is not what I want. I only want the directories needed for the files that match.
–
Skaperen Sep 8 '12 at 20:38

The -m option will do the job; I have edited my answer.
–
gorkypl Sep 8 '12 at 20:44

OK, that's definitely looking better. I guess I needed the right combination of -m and --include='*/' to do this. I was looking too hard for one magic option.
–
Skaperen Sep 8 '12 at 21:10
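For reference, the combination the comments above converge on can be sketched like this. The source and destination paths are placeholders (a throwaway local tree, so the sketch can be run as-is); in practice the source would be something like rsync://sitename/module/.

```shell
# Build a small demo tree (placeholder paths, for illustration only).
mkdir -p demo/src/deep/dir demo/src/unrelated
touch demo/src/deep/dir/precise-server-cloudimg-amd64-root.tar.gz
touch demo/src/unrelated/readme.txt

# -m (--prune-empty-dirs) drops directories that would end up empty,
# --include='*/' lets rsync descend into every directory while it
# decides, the next --include selects the wanted files, and the
# trailing --exclude='*' rejects everything else.
rsync -am \
  --include='*/' \
  --include='*-server-cloudimg-amd64-root.tar.gz' \
  --exclude='*' \
  demo/src/ demo/dst/
```

Filter rules are applied in order, so the final --exclude='*' only catches what the earlier includes did not claim; without -m, the --include='*/' rule would still create every directory on the target, which is exactly the problem described above.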

top/level/path is the top-level directory, and the search is performed in all its subdirectories. You can use the -maxdepth or -mindepth options if you want to narrow your search, and use wildcards like ? or * with -name.

You can of course add additional options to rsync, such as rsync -av. The final part, {} +, feeds rsync the list of files found by the find command.

If you want to see the list of files that would be passed to rsync, you can test it by substituting rsync with echo:
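The answer's command was not preserved here, but the echo preview it describes can be sketched as follows, with placeholder paths and a small demo tree so the sketch is runnable:

```shell
# Placeholder tree standing in for top/level/path (illustration only).
mkdir -p top/level/path/a/b
touch top/level/path/a/b/precise-server-cloudimg-amd64-root.tar.gz
touch top/level/path/a/skip-me.txt

# Preview which files find would hand over, with echo in rsync's place.
find top/level/path -type f \
  -name '*-server-cloudimg-amd64-root.tar.gz' -exec echo {} +
```

For the actual transfer, one variant (an assumption on my part, not necessarily the answer's exact command) is to feed the list through rsync's --files-from option, e.g. find top/level/path -type f -name '*-server-cloudimg-amd64-root.tar.gz' -print0 | rsync -a --from0 --files-from=- . user@host:/dest/ — this also creates only the parent directories the matched files need.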

Unfortunately I do not have access on the server side to run something like "find". I am already using "rsync -a" and the like. I can download the whole tree or specific subdirectories OK. I could download a list with "rsync -r rsync://sitename/module/" and have done so, but that is what I want to avoid. Also, executing rsync multiple times is "just wrong" (there could be hundreds or thousands of matched files among a million that do not match).
–
Skaperen Sep 8 '12 at 20:10

But this is a good example showing that find does matching in a simple way that works. The problem is that even though rsync considers the file matched in much the same way, it will refuse to transfer it if no parent directory also got matched somehow.
–
Skaperen Sep 8 '12 at 20:13

Ah, OK, I misunderstood you. I have created a second answer.
–
gorkypl Sep 8 '12 at 20:21