On Friday 17 October 2008, Michael Ernst wrote:
> Sometimes, a user expects that checklink will produce certain warnings.
> Some reasons include robot exclusion rules, password-protected content, and
> errors in automatically-generated content.
>
> A user would prefer checklink to show only the unexpected warnings, rather
> than hiding them in an avalanche of uninteresting output.
>
> This patch adds flags that suppress certain warnings. These flags
> complement the existing --exclude and --exclude-docs flags. (The patch
> also permits --exclude-docs to be supplied multiple times instead of just
> once.)

Thanks for the patch! Some comments follow. (I don't mind discussing these
things here on the www-validator mailing list, but I think a better-suited
place would be either the public-qa-dev mailing list or W3C Bugzilla.)

Because the patch contains two different things (a modification of the
existing exclude-docs functionality, and the addition of new options), could
you split it into two patches? I hope that's also how it would eventually be
committed to CVS, since changes are easier to track that way. We can, e.g.,
first get the exclude-docs change in, then the rest.

The patch appears to drop precompilation and error reporting of the
exclude-docs regexp. I don't think that's a good idea, for two reasons.
First, by doing the compilation right at the beginning, we get the regexp's
syntax checked right there and can abort immediately with a descriptive
message, instead of running into the problem later during the check (when the
user might no longer be actively watching the link check progress) and barfing
with a more obscure error message. Second, precompiling it only once at the
beginning is good for performance.

The same considerations as above seem to apply to the exclude-redirect-prefix
regexps.
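
To illustrate the fail-fast precompilation argued for above, here is a minimal
sketch. It is written in Python for brevity; checklink itself is Perl, where
qr// would play the role of re.compile. The function names and the wording of
the error message are hypothetical, not taken from checklink or the patch.

```python
import re


def compile_patterns(option_name, raw_patterns):
    """Compile every pattern once, at startup, failing fast on bad syntax."""
    compiled = []
    for raw in raw_patterns:
        try:
            compiled.append(re.compile(raw))
        except re.error as err:
            # Abort before any link is fetched, naming the option and the
            # offending pattern so the user sees the problem immediately.
            raise SystemExit(
                f"Invalid regexp for {option_name}: {raw!r}: {err}"
            )
    return compiled


def is_excluded(url, compiled_patterns):
    """Reuse the precompiled patterns for every URL that gets checked."""
    return any(p.search(url) for p in compiled_patterns)
```

With this shape, a malformed pattern stops the run at option-parsing time, and
each pattern is compiled once no matter how many URLs are checked.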

I think options that can be specified multiple times should be initialized to
an empty array ([]) instead of undef, for cleanliness reasons and because
that way there's no need to check their definedness later on.
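
The same idea can be sketched in Python (again only an analogy to the Perl
code; the parser name and the helper functions are made up for illustration).
A repeatable option that defaults to an empty list can be used directly,
without any "is it defined?" check.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="checklink-sketch")
    # Repeatable option: each occurrence appends to the list. The default is
    # an empty list rather than None, so callers never need a None check.
    parser.add_argument(
        "--exclude-docs", action="append", default=[], metavar="REGEXP"
    )
    return parser


def count_patterns(argv):
    opts = build_parser().parse_args(argv)
    # Works the same whether the option was given zero, one, or many times.
    return len(opts.exclude_docs)
```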

I don't like the wildly varying separator characters in option values (->, :,
#). A single, consistent separator would be better, and we already have the
space character used in --masquerade, so I suggest using a space for the new
options as well.
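
As a rough sketch of what a shared space separator could look like, the helper
below splits a two-part option value at the first run of whitespace. It is
Python rather than Perl, the helper name is hypothetical, and it assumes that
the new options take two-part values, which may not hold for all of them.

```python
def split_pair(value):
    """Split 'LEFT RIGHT' at the first run of whitespace.

    Hypothetical helper: one parsing rule shared by every two-part option,
    instead of a different separator (->, :, #) for each of them.
    """
    parts = value.split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"expected two space-separated parts, got {value!r}")
    left, right = parts
    return left, right.strip()
```

Since URIs cannot contain unescaped spaces, splitting at whitespace does not
clash with characters such as : or # that legitimately occur inside them.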

In addition to the --help output, bin/checklink.pod in CVS needs to be updated
too.