Ayende,
Just to clarify, assuming that the requests are to be made synchronously, and the first request was to srv-4 and posts/1234 was found there, now you want to exclude posts/1234 from the request to srv-backup-4 because you already have it?

A further optimisation to Yan's solution would be to consider the order in which the URLs are hit. For example, any Load containing an id ending in "8" means http://srv-backup-4 MUST be hit. If we do this first we MIGHT also get the ids ending in "4" for free and save a hit to http://srv-4.

Dan,
That's a nice idea. Even without formally specifying that one server should have priority over another, for this example at least you can save the hit to srv-4 by ordering the groups by the number of ids they contain:

idsDict
    .Keys
    // map each id into { Id, Url } pairs, one pair for each candidate url
    .SelectMany(id => GetAppropriateUrls(id).Select(url => new { Id = id, Url = url }))
    // group by the url to get a url -> ids mapping
    .GroupBy(m => m.Url)
    // visit the servers that can satisfy the most ids first
    .OrderByDescending(gr => gr.Count())
    // only hit a server if it still has ids we haven't retrieved;
    // this works thanks to deferred execution, since idsDict is updated between requests
    .Where(gr => gr.Any(m => !idsDict[m.Id]))

There is not enough information to determine the optimal sequence of server calls, because we don't know in advance where each document is located (on the main or the backup server).
So my best bet is a greedy technique: load the documents iteratively, and on each turn call the server that could return the largest number of the still-missing documents.
While this is not guaranteed to find the best possible solution, it will minimize the number of round trips compared to the naive solution.

For better results, it might be a good idea to add some heuristics based on run-time usage data to pick "better" servers.
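The greedy selection described above could be sketched as follows (a Python sketch; `possible_servers` is a hypothetical map from document id to the set of servers that might hold it, standing in for whatever GetAppropriateUrls tells you):

```python
def plan_requests(possible_servers):
    """Greedy plan: repeatedly pick the server that could return the
    largest number of still-missing documents.

    possible_servers: dict mapping doc id -> set of candidate servers.
    Returns the ordered list of servers to call.
    """
    missing = set(possible_servers)
    order = []
    while missing:
        # count how many missing docs each server might satisfy
        counts = {}
        for doc in missing:
            for srv in possible_servers[doc]:
                counts[srv] = counts.get(srv, 0) + 1
        best = max(counts, key=counts.get)
        order.append(best)
        # simplification: assume the call resolves every doc this
        # server could hold; a real loop would re-plan per response
        missing -= {doc for doc in missing if best in possible_servers[doc]}
    return order
```

Note the simplification flagged in the comment: for planning purposes the sketch assumes a call resolves every document the server could hold, whereas the real loop would re-check which ids are still missing after each response.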

Simply sorting by the largest possible result set wouldn't be optimal in some cases. You should pull out any documents with only one possible server and request those first (along with any other documents that may be on those servers), then perform the remaining requests from largest to smallest.
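A sketch of that refinement (same hypothetical `possible_servers` map as above, as a Python dict): servers that are the only candidate for some document get queried first, since they must be hit anyway, and only documents not covered by those forced calls fall through to the largest-first greedy pass:

```python
def plan_with_forced_servers(possible_servers):
    """possible_servers: dict mapping doc id -> set of candidate servers.
    Returns the ordered list of servers to call."""
    # servers that are the ONLY candidate for some document must be hit
    forced = {next(iter(srvs))
              for srvs in possible_servers.values() if len(srvs) == 1}
    order = sorted(forced)  # any order works within the forced set
    # docs the forced calls cannot possibly resolve
    missing = {doc for doc, srvs in possible_servers.items()
               if not (srvs & forced)}
    # largest-first greedy for whatever is left
    while missing:
        counts = {}
        for doc in missing:
            for srv in possible_servers[doc]:
                counts[srv] = counts.get(srv, 0) + 1
        best = max(counts, key=counts.get)
        order.append(best)
        missing -= {doc for doc in missing if best in possible_servers[doc]}
    return order
```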

@Betty
IMO this is a good suggestion; however, it wouldn't help in the general case, and I bet there is a counterexample for your algorithm too.
To come up with the least number of round trips, you'd need to apply some sort of graph search algorithm.
For example, do a http://en.wikipedia.org/wiki/Minimum_spanning_tree ,
encoding documents as nodes and connecting the nodes whose documents can reside on the same server.

The problem is that searching for the optimal solution this way looks combinatorially expensive (n^n as a rough guess), so it might be overkill.

Perhaps I'm being naive, but the usual meta-goal for reducing network round-trips is to optimize for the lowest overall elapsed time due to network latency effects.

That can often be achieved by issuing overlapping asynchronous parallel queries to multiple servers (not that hard to do with .NET parallel Task support these days).

Usually that gives you a total elapsed time slightly greater than the longest running query.

Most of the methods proposed above introduce serial execution dependencies and are 'stateful'; they are therefore not readily executed in parallel, and will likely result in longer elapsed times than a naive 'stateless' parallel algorithm, except for some edge cases where the 'stateful' algorithm will perform better.

A 'naive' general algorithmic approach would be:

Step 1 - coalesce the queries into 'batches', one batch per target server (querying all servers in a group when a target document can be present on any of several servers)... I liked the LINQ approach to that problem.

Step 3 - otherwise, take the output batches from Step 1 and execute the required queries as asynchronous parallel tasks (one per server), with a 'join point' when all queries are complete. Note that the execution threads need to enforce the rule that the existence of a document 'trumps' any 'not-found' results for that document.

Step 4 - post-process the aggregated query results, looking for documents that were not found on any target server, and generate any related exceptions (since your example searches by primary key, not finding a document on any target server is probably invalid (?)).

Step 5 - return aggregated results.

To avoid 'premature optimization', implement the 'naive' algorithm first, then benchmark using a reasonably-sized set of 'random' queries and perform low-level optimizations to fix any bottlenecks identified by the benchmark... rinse and repeat using the same benchmark data.
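The parallel fan-out with a join point and the 'existence trumps not-found' rule can be roughly sketched like this (a Python illustration; `SERVER_DATA` and `fetch_batch` are simulated stand-ins for real HTTP multi-gets against each server):

```python
from concurrent.futures import ThreadPoolExecutor

# Simulated stand-in data: which documents each server actually holds.
SERVER_DATA = {
    "srv-4":        {"posts/1234": "doc-1234"},
    "srv-backup-4": {"posts/1238": "doc-1238"},
}

def fetch_batch(server, ids):
    """Simulated multi-get against one server; a real version would
    issue one HTTP request. Returns {id: document-or-None}."""
    return {i: SERVER_DATA[server].get(i) for i in ids}

def parallel_load(batches):
    """batches: {server: [ids]} from the coalescing step. Issues one
    task per server and joins when all complete."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(batches)) as pool:
        futures = [pool.submit(fetch_batch, srv, ids)
                   for srv, ids in batches.items()]
        for fut in futures:  # .result() blocks: this is the join point
            for doc_id, doc in fut.result().items():
                # existence of a document trumps a 'not-found' (None)
                # result for the same id from another server
                if doc is not None or doc_id not in results:
                    results[doc_id] = doc
    return results
```

A post-processing pass would then raise for any id still mapped to None, per Step 4.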

@Mike
Not exactly; we do know at least something about the distribution of documents across servers.
At first, for example, we know that document A is located on Server 1, 2 or 3, each with probability 33%.
After a failed attempt to fetch the document from Server 1, it would be 50% for each of Servers 2 and 3.
This information can be applied as the entry point for the MST algorithm.
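That update is easy to state in code; a minimal sketch under the uniform-prior assumption used above:

```python
from fractions import Fraction

def update_candidates(candidates, failed_server):
    """After a miss on one server, redistribute the probability
    uniformly over the remaining candidate servers (the uniform
    prior assumed in the comment above)."""
    remaining = [s for s in candidates if s != failed_server]
    return {s: Fraction(1, len(remaining)) for s in remaining}

# initially 1/3 each for Servers 1-3; after a miss on Server 1:
probs = update_candidates(["Server 1", "Server 2", "Server 3"], "Server 1")
# each remaining server now carries probability 1/2
```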