On Wed, 31 Oct 2012 00:05:03 +0100, Simon Sapin <simon.sapin@kozea.fr>
wrote:
> Le 30/10/2012 21:58, Florian Rivoal a écrit :
>> I'm in favor of this, with the usual caveats about having to be careful
>> not to break the web.
>
> Is there any way to know if this is gonna break the web? I remember
> reading about searching for usage in a big database of actual web pages,
> but I don’t know who has such data.
As an alternative to a big database, a crafty combination of wget,
grep, sed and friends, working from a list of top web sites such as the one
provided by alexa.com
(http://s3.amazonaws.com/alexa-static/top-1m.csv.zip), can give a sense of
the state of the web.
I have not had time to try it on this topic, though.
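
For what it's worth, here is a rough sketch of the idea in Python rather
than wget and grep. It is only an illustration, not something I have run
against the real list. It assumes the unzipped Alexa file contains
"rank,domain" lines. The function names (parse_top_sites, fetch,
sites_matching) are made up for this example, and the pattern to search
for is left to the caller. It only looks at each site's front page HTML,
so anything living in linked style sheets would be missed.

```python
import csv
import io
import re
import urllib.request


def parse_top_sites(csv_text, limit):
    """Return the first `limit` domains from 'rank,domain' CSV text.

    The 'rank,domain' layout is an assumption about the Alexa file.
    """
    reader = csv.reader(io.StringIO(csv_text))
    domains = [row[1].strip() for row in reader if len(row) >= 2]
    return domains[:limit]


def fetch(domain, timeout=10):
    """Fetch a site's front page as text; return '' if anything goes wrong."""
    try:
        url = "http://" + domain + "/"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        # Dead hosts, timeouts and odd responses are expected in a crawl
        # like this; skip them rather than abort the whole run.
        return ""


def sites_matching(pages, pattern):
    """Given {domain: page text}, return the sorted domains that match."""
    regex = re.compile(pattern)
    return sorted(d for d, text in pages.items() if regex.search(text))
```

To use it, one would read the unzipped CSV, call parse_top_sites() for the
first thousand or so entries, build a dict of domain to fetch(domain), and
pass that to sites_matching() with a regular expression for the construct
in question. The share of matching sites gives a crude estimate of usage.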
- Florian