I think there was some mention of this in Knuth (of course); I dimly
remember it being part of an algorithms homework assignment of some
significance many years ago. If the source and destination register
sets are the same (as they would be for parameters), then you can view
the moves as a permutation. Any permutation can be written as a set of
disjoint cycles, and you can do a cycle of length N in N+1 moves using
a single scratch register.

e.g. (numbering the registers 1 to 6; "x <- y" means x receives the
old contents of y)

abcdef => badfce

1 <- 2 <- 1

3 <- 4 <- 6 <- 5 <- 3
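
For what it's worth, here is a minimal sketch in Python of how the cycles
above could be turned into a move sequence. The names (cycle_moves, src_of,
the "tmp" scratch register) are made up for illustration, and it assumes
the mapping really is a permutation and that the scratch register is not
one of the registers being shuffled.

```python
def cycle_moves(src_of, scratch="tmp"):
    """Return a list of (dst, src) moves realizing a register permutation.

    src_of[d] == s means register d must end up with the old value of s.
    Assumption: src_of is a true permutation (same keys as values).
    """
    moves = []
    done = set()
    for start in src_of:
        if start in done:
            continue
        # Walk the cycle start <- src_of[start] <- ... back to start.
        cycle = []
        r = start
        while r not in done:
            done.add(r)
            cycle.append(r)
            r = src_of[r]
        if len(cycle) == 1:
            continue  # fixed point: nothing to move
        # Save the first destination, shift the rest, then close the cycle.
        moves.append((scratch, cycle[0]))
        for dst in cycle[:-1]:
            moves.append((dst, src_of[dst]))
        moves.append((cycle[-1], scratch))
    return moves
```

Called with {1: 2, 2: 1, 3: 4, 4: 6, 5: 3, 6: 5} (the abcdef => badfce
example), this emits 3 moves for the 2-cycle and 5 for the 4-cycle, i.e.
N+1 moves per cycle of length N.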

Finding the cycles takes some time; I believe computing the expected
case for that time is the interesting part of the homework problem.
Of course, in the general case this is not a proper permutation,
since the register sets may not be equal and some of the sources
may be trashed. I think this means that you can save one move per
"cycle" (they are not really cycles then, just chains, and a chain can
be done starting from the end that nobody reads, with no scratch
register).
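
A rough sketch of the general case, again in Python and again with
made-up names (parallel_moves, src_of, "tmp"). It is one common way to
sequence such moves, not necessarily what the homework had in mind: emit
any move whose destination is no longer needed as a source, and fall back
to the scratch register only when what is left is a genuine cycle. It
assumes each destination appears once and the scratch register is not
otherwise in use.

```python
def parallel_moves(src_of, scratch="tmp"):
    """Return (dst, src) moves for a parallel assignment of registers.

    src_of[d] == s means d must receive the old value of s. Sources may
    repeat, and a source need not be a destination (so it may be trashed).
    """
    # Drop no-op moves up front.
    pending = {d: s for d, s in src_of.items() if d != s}
    moves = []
    while pending:
        # A destination that no pending move still reads can be overwritten.
        ready = [d for d in pending if d not in pending.values()]
        if ready:
            for d in ready:
                moves.append((d, pending.pop(d)))
            continue
        # Only true cycles remain: save one register and redirect its readers.
        d = next(iter(pending))
        moves.append((scratch, d))
        for k in pending:
            if pending[k] == d:
                pending[k] = scratch
    return moves
```

A chain such as {1: 2, 2: 3} comes out as two moves with no scratch, while
a real cycle of length N still costs N+1.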