> It seems to me that if you split into two (or more) tiers, and random-load-balance in the front tier (hit first by the customer), and then at the second tier only send requests to unloaded clients [...]

I'm not clear on how introducing a second tier changes things. That tier would still need to track dyno availability, and then you're right back at the same distributed-state problem.

Perhaps you mean if the second tier was smaller, or even a single node? In that case, yes, we did try a variation of that. It had some benefits but also some downsides, one being that the extra network hop added latency overhead. We're continuing to explore this and variations of it, but so far we have no evidence that it would provide a major short-term benefit for RG or anyone else.
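To make the proposal concrete, here's a minimal sketch of the two-tier idea under discussion (all names here are illustrative, not Heroku's actual architecture): the front tier balances randomly and statelessly across second-tier routers, and each second-tier router does "intelligent" routing only over its own small, private pool of dynos, so the busy/idle state it tracks is local rather than shared fleet-wide.

```python
import random

class Dyno:
    """A backend process that can serve one request at a time."""
    def __init__(self, name):
        self.name = name
        self.busy = False

class SecondTierRouter:
    """Tracks busy/idle state for a small, private pool of dynos.
    Because no other router touches these dynos, the state is local."""
    def __init__(self, dynos):
        self.dynos = dynos
        self.queue = []  # requests waiting for an idle dyno

    def route(self, request):
        idle = [d for d in self.dynos if not d.busy]
        if idle:
            dyno = idle[0]
            dyno.busy = True
            return dyno  # served immediately by an unloaded dyno
        # No idle dyno: queue at the router instead of piling onto
        # a loaded dyno (the failure mode of pure random routing).
        self.queue.append(request)
        return None

class FrontTier:
    """Stateless random balancing across second-tier routers."""
    def __init__(self, routers):
        self.routers = routers

    def route(self, request):
        return random.choice(self.routers).route(request)
```

The trade-off mentioned above shows up here too: every request pays for the extra hop through a second-tier router, and the scheme only works while each router's pool stays small enough that tracking its state is cheap.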

> Do you have reason to believe that this doesn't one-shot RapGenius's problem?

As a rule of thumb, I find it's best to avoid one-shots (or "specials"). It's appealing in the short term, but in the medium and long term it creates huge technical debt and almost always results in an upset customer. Products made for, and used by, many people have a level of polish and reliability that will never be matched by one-offs.

So if we're going to invest a bunch of energy into trying to solve the problems of one customer (or a handful of them), a better investment is to get those customers onto the most recent product, using all the best practices (e.g. concurrent backend, CDN, asset compilation at build time). That's a more sustainable, long-term solution.

Sorry, yes, I'm supposing that the second tier serves fewer dynos; sufficiently few that your solutions from 2009 (that motivated you to advertise intelligent routing in the first place) are still usable.

> As a rule of thumb, I find it's best to avoid one-shots (or "specials").

Absolutely, and I would never suggest that. However, it's not just RG that has this problem, right? If I understand correctly, doesn't it affect every single customer who believed your advertising, followed your suggested strategy of using single-threaded Rails, and doesn't want to switch?

So it's not about the short or medium term; it's about letting customers take the latency hit (as you note) in order to get the scaling properties they already paid for.