Revisiting IPv6 Web adoption in LATAM

I’ve recently revisited the problem of measuring IPv6 adoption from the Web perspective. Just as I did last time, I’m using search engine results to find domains, which I then massage with Unix tools to build hostnames and try to resolve AAAA records for them.

This time I’m using Bing. Bing has rapidly become a search engine of choice for developers thanks to its API and pretty fair usage terms. Yahoo! made some ugly moves with their BOSS platform and Google has always been quite crappy for developers. So this leaves Bing; no wonder the awesome DuckDuckGo search engine is built on top of it (and 50+ other sources, but no Google).

Bing::Search is also now available on CPAN, which makes it easier for me as a Perl-er to write something real quick. So with that and with Domain::PublicSuffix I managed to build a pretty nice tool.

The tricky part is the algorithm. Search engines are, by nature, restricted to a limited number of results per query. Bing returns only the first 1000.

But a single domain can easily fill up those 1000 results, so I’m using a neat trick: I scan the domains on the first page of results and use the - operator (yes, Bing continues to support - and +, even though Google made that awful change to accommodate their Google+ platform) to exclude them from the next query. I iterate until the query is too long for Bing to handle, and then explore the 1000 results of that final query. My next step is to weigh the domains first and exclude only the heavier ones, but I predict this will not make a large difference (most likely about 30% of the Search Engine API limit).
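To make the exclusion loop concrete, here is a minimal sketch in Python rather than the Perl I actually used. It is an illustration, not my tool: the search callable is a hypothetical stand-in for whatever search API client you have (it takes a query string and returns the URLs on the first page), the -site:host form of the exclusion term is an assumption, and MAX_QUERY_LEN is a made-up number, since the real limit has to be found by trial.

```python
from typing import Callable, List, Set, Tuple
from urllib.parse import urlparse

# Assumed limit for illustration only; the real one must be found empirically.
MAX_QUERY_LEN = 1500


def host_of(url: str) -> str:
    """Return the lowercased hostname of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def build_exclusion_query(
    base_query: str,
    search: Callable[[str], List[str]],  # hypothetical: query -> first-page URLs
    max_query_len: int = MAX_QUERY_LEN,
) -> Tuple[str, Set[str]]:
    """Grow the query with exclusion terms for every host already seen.

    Stops when a page brings no new hosts or the next exclusion term
    would push the query past max_query_len. Returns the final query
    and the set of hosts seen along the way.
    """
    seen: Set[str] = set()
    query = base_query
    while True:
        new_hosts = {host_of(u) for u in search(query)} - seen - {""}
        if not new_hosts:
            return query, seen
        seen |= new_hosts
        for host in sorted(new_hosts):
            term = f" -site:{host}"  # assumed form of the exclusion operator
            if len(query) + len(term) > max_query_len:
                return query, seen
            query += term
```

The returned query is the one whose full 1000 results are worth paging through, since the domains that would otherwise crowd them out have already been collected in the returned set.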

This gives me a pretty large number of domains to handle, so I can build AAAA queries.
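The AAAA step can be sketched roughly like this, again in Python and only as an illustration. It asks the system resolver for IPv6 addresses through socket.getaddrinfo instead of sending raw DNS queries, and it assumes that the hostnames worth probing for each domain are the bare domain and its www. name; both choices are assumptions, not a description of my actual tool.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_aaaa(host: str) -> List[str]:
    """Return the IPv6 addresses the system resolver finds for host."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET6, socket.SOCK_STREAM)
    except (socket.gaierror, UnicodeError):
        return []  # no IPv6 address, or the name could not be resolved
    # For AF_INET6 the sockaddr is (address, port, flowinfo, scope_id).
    return sorted({info[4][0] for info in infos})


def candidate_hostnames(domain: str) -> List[str]:
    # Assumption: probe the bare domain and its "www." name.
    return [domain, "www." + domain]


def ipv6_report(
    domains: Iterable[str],
    resolver: Callable[[str], List[str]] = resolve_aaaa,
) -> Dict[str, bool]:
    """Map each domain to True if any candidate hostname has an IPv6 address."""
    return {
        domain: any(resolver(h) for h in candidate_hostnames(domain))
        for domain in domains
    }
```

Calling ipv6_report(domains) on the harvested list yields one boolean per domain, and the share of True values is the adoption figure. The resolver parameter is there so the lookup can be swapped for a real DNS library or a stub.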