3 Answers

I say, if we have to register in order to view the content that they're pointing us to, then the link is broken, because that's very unlikely to happen. It's bad enough they're already pointing us off-site to view it, but now we have to register too? Don't think so. Those links should be subject to removal just as any broken link should be.

As far as I'm aware, the crawler only puts a comment there saying "this link is broken", right? I'm fairly sure you can exclude Stack Exchange links, especially on meta sites. Any other sites that fall under that edge-case category should still be reviewed anyway, so the comment is helpful. If the link is a valid edge case, just remove the comment and let it continue on. I'd be much more worried about the bad links we'd be letting through because we're trying to avoid flagging the good ones.

This seems to follow the same issue that gets complained about with the "low quality posts" tab. Sure, sometimes it turns up false positives (ok, lots of the time), but it's better than letting the ones that need to be reviewed slip through the cracks.

Interesting, a special case (we allow 404s on Meta and Child Meta for explicit domains) ... I kind of like that ... though I still think the 404s should be subject to review, because someone may have mistyped a link.
– waffles, May 18 '12 at 1:17

Semantically, if a user is not supposed to see a particular resource, 401 or 403 would make more sense (I'm not sure which one applies: 401 means the request lacks valid credentials, so authenticating could work; 403 means the server refuses access even to an authenticated user). But specific HTTP error codes seem to have gone the way of the dinosaur, so, shrug, you might as well return 404.

How would you know whether a link is conditionally broken? If it returns an error status to the crawler, include it in the review list. I expect that we'll eventually determine a list of domains where 404 should be whitelisted. This list will probably be site-specific; for example, http://SITE.stackexchange.com/* should be whitelisted on meta.SITE.stackexchange.com, as it's common to link to deleted questions. Links to scientific papers behind a paywall may be considered legitimate in certain communities; it should be up to each site's community to decide. Links to sites that require registration should be considered on a case-by-case basis; your example with http://stackexchange.com/filters/new can be generalized to many sites (answer to a [facebook] question on Webapps.SE: “go to your user options at …”; answer to a [some-hosting-provider] question on Webmasters.SE: “go to your control panel at …”; etc). Again, I expect some whitelists to emerge.
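To make the idea concrete, here is a rough sketch of how a per-site whitelist could gate the review list. Everything in it is hypothetical: the `WHITELISTS` table, the `needs_review` name, and the rules themselves. It treats any status of 400 or above as an error and lets a whitelisted host skip review whatever the error was; both are simplifying assumptions, not a description of how the real crawler works.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical per-site whitelists: for each posting site, the host
# patterns whose error responses are tolerated without review.
WHITELISTS = {
    "meta.SITE.stackexchange.com": ["SITE.stackexchange.com"],
}


def needs_review(posting_site, url, status):
    """Return True if a crawled link should go on the review list."""
    if status < 400:
        return False  # the crawler saw no error status

    # urlsplit lowercases the hostname, so lowercase the patterns as well.
    host = urlsplit(url).hostname or ""
    patterns = WHITELISTS.get(posting_site, [])
    whitelisted = any(fnmatchcase(host, p.lower()) for p in patterns)
    return not whitelisted
```

With this table, a 404 from http://SITE.stackexchange.com/ posted on the meta site is skipped, while the same link posted on the main site still lands in the review list.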

I hope you aren't considering banning dead links upon posting. A warning would probably be a good thing though: the bad link could be a typo, or it could be a bad idea (e.g. a link behind a paywall); but let the user say “yes, it's legitimate”.
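A warn-but-don't-block check might look something like the sketch below. The function name, the `is_dead` callback, and the `confirmed` set are all made up for illustration; the only point is that the function returns warnings and never rejects the post.

```python
def check_links_on_post(links, is_dead, confirmed=frozenset()):
    """Return warnings for dead-looking links; never reject the post.

    links     -- URLs found in the post being submitted
    is_dead   -- hypothetical callback: True if the URL returned an error
    confirmed -- URLs the user has already marked as legitimate
    """
    return [
        f"{link} looks unreachable. Is it a typo, or is it legitimate?"
        for link in links
        if is_dead(link) and link not in confirmed
    ]
```

Once the user confirms a link, passing it in `confirmed` silences the warning for it, and the post goes through either way.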