Thinking about things that recursive DNS (rdns) servers can do unilaterally
to improve forgery resilience...
Do we know what percentage of queries a typical rdns box receives for
sub-domains of in-cache domains (where the sub-domains themselves are not
cached), versus all other queries?
What I'm thinking is: when such a query is seen, even without checking for
TXID/QID mismatches, always require two identical answers at each
(external, non-cached) step of the recursive resolution process, still over
UDP. With ports randomized per query, this effectively doubles the number
of entropy bits an attacker has to guess, albeit at a 2x performance hit,
but only for those non-cached domains underneath cached domains.
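
Rough sketch of the per-step check, in Python, using dnspython purely for
illustration (the function name query_twice, the example server address,
and the mismatch handling are all made up; a real resolver would fold this
into its iterative logic):

    import dns.message
    import dns.query

    def _normalize(response):
        # Sort RRsets and the records inside them so that benign reordering
        # (e.g. round-robin rotation) does not look like a mismatch.
        return sorted(
            (rrset.name.to_text(), rrset.rdtype,
             sorted(rdata.to_text() for rdata in rrset))
            for rrset in response.answer
        )

    def query_twice(qname, rdtype, server, timeout=2.0):
        # Each make_query() gets a fresh random QID; leaving source_port at
        # its default lets the OS pick an ephemeral (randomized) port per query.
        answers = []
        for _ in range(2):
            q = dns.message.make_query(qname, rdtype)
            r = dns.query.udp(q, server, timeout=timeout)
            answers.append(_normalize(r))
        if answers[0] != answers[1]:
            # Treat a mismatch as a suspected forgery: retry, or fall back to TCP.
            raise RuntimeError("answers disagree for %s/%s" % (qname, rdtype))
        return answers[0]

    # e.g. query_twice("www.example.com", "A", "192.0.2.53")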
The question is, what is the *actual* performance penalty? If the
client-side percentage of such queries is low, like 20%, then the 2x
penalty would only add 20% to the load, whilst going a long way towards
making such attacks infeasible.
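
Back-of-the-envelope on the load, just to make the arithmetic explicit (the
20% figure is only an assumed example):

    # If a fraction p of client queries are for uncached names under cached
    # domains, only those queries get doubled, so:
    #   total load = (1 - p)*1 + p*2 = 1 + p
    def load_multiplier(p):
        return 1.0 + p

    print(load_multiplier(0.20))   # 1.2, i.e. a 20% increase in upstream queries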
How infeasible? If a 1x birthday attack (single answer, randomized QID and
source port, ~32 bits) takes 10 hours to poison a cache using GigE
LAN-connected servers, then 2x (requiring two matching answers) should take
2^32 times as long, or roughly 5 million years -- unless the attacker was
*really* *really* *REALLY* lucky, or had some way of reverse-engineering
the PRNG state.
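
Sanity check on those numbers (the 10-hour baseline is just the assumed
figure from above):

    # 16-bit QID + ~16 bits of source-port randomization is ~32 bits per
    # query; requiring two independent matching answers is ~64 bits, i.e.
    # 2**32 times more work for the attacker.
    HOURS_PER_YEAR = 24 * 365.25

    base_hours = 10                       # assumed 1x birthday-attack time
    hardened_hours = base_hours * 2**32   # two matching answers required
    print(hardened_hours / HOURS_PER_YEAR)   # ~4.9 million years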
Now that I put this down on e-paper, this seems pretty compelling. Nowhere
near the performance impact of TCP, and not even that much logic or state
required locally...
Thoughts?
Preaching to the choir?
Brian