Odd Apache 404 behaviour

Oh, the legacy systems. Just lately we’ve been experiencing what can only be described as odd behaviour. Our Apache server is set up to serve virtual hosts; one of these is a remnant of our ye olde static site :) For some reason, though, the following behaviour is being observed.

If you click a link to one of the pages on any site, you are thrown to a 404 error page on our primary site.

However…

If you type the URL in (or do a “copy link”, then “paste into location bar and go”), the page loads without issue; subsequent requests, as the page is now cached, resolve fine.

Suspicions

My main suspicion is that something’s going a bit askew on the redirect front (see: blame mod_rewrite, just because it is powerful and confusing with its regular expressions (but not the sort you hear in bars!))
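One way to test that suspicion is to turn on mod_rewrite’s own logging for the affected virtual host and compare a clicked request with a typed one. This is only a sketch, assuming Apache 2.4 (on 2.2 the equivalent directives are RewriteLog and RewriteLogLevel); the trace level is an arbitrary pick.

```apache
# Assumes Apache 2.4: per-module log levels send rewrite traces to the error log.
# Place inside the affected <VirtualHost> block; trace3 is an arbitrary verbosity.
LogLevel alert rewrite:trace3
```

A clicked link normally carries a Referer header and a typed URL does not, so any difference in which rules fire for the two requests should show up in the trace.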

Detailed setup

We have multiple virtual hosts (say 20 or so), all being served from a primary Apache server; many of these link into our primary http://www.domain (usually with a rewrite/proxy to some location, say http://www.domain/ex/servicename).

Let’s remove these lines and give it a whirl!

Yep, everything seems to work. But these lines were put in place for a reason, weren’t they? (You don’t just go adding things willy-nilly to the old Apache configs.)

A quick ask around reveals no knowledge.

My suspicions are:

1) the site has been effectively decommissioned (this happens, slowly), but direct access has been left open to give people access to specific areas that can’t currently be migrated (online training packages and their ilk)
2) someone fugged up, possibly me.

I do need to do some confirmation, and some hunting for reasons (Should it be doing this? Has the site been decommissioned? Why haven’t the content providers been told? Yadda yadda.)

Solution – long term

Confirm whether or not the site is decommissioned, or whether there was another reason for the rules being put in place. (Contact point: head of Web Team/head of Customer Services)

Solution – short term

Let’s add in some exclusions to try and say “yeah, keep doing that, but not if it’s _these_ pages”.

One of the pages is for our antivirus software, which we allow to be distributed to staff machines (those that avoid our standard network for one reason or another). It is linked from a number of pages, but the important thing is that a request made for it on the http://www.snd site is passed straight to the page.

Currently it’s throwing all requests back to the http://www.domain site. Rewrite rules are processed in the order they appear, so we need to put a new rule in before the existing one.

Something along the lines of:

RewriteRule ^/antivirus(.*)?$ - [L]
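In context, the ordering might look something like the sketch below. Only the antivirus rule comes from this post; the catch-all underneath it is a hypothetical stand-in for the existing rule, which isn’t reproduced here, so treat its pattern and flags as assumptions.

```apache
RewriteEngine On

# Exclusion: leave anything under /antivirus untouched and stop processing.
# "-" means "no substitution"; [L] ends this pass through the rule set.
RewriteRule ^/antivirus - [L]

# HYPOTHETICAL stand-in for the existing rule (the real one is not shown):
# send everything else over to the primary site.
RewriteRule ^/(.*)$ http://www.domain/$1 [R,L]
```

Because the exclusion matches first and carries [L], requests for the antivirus pages never reach the catch-all.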

ta-da :)

Now we just need to find out why it was put there in the first place.

Later!