Full Member

What does the network traffic in the browser show? Do you see a 301 status and "Location" header in the response?

The message you are seeing in the browser looks like the actual response body in the redirect, but the browser is not redirecting for some reason. This could perhaps happen if "something" is intercepting the response and changing the HTTP status code (or removing the "Location" header) before the browser gets it?!

What other directives do you have in your .htaccess file?

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$

Bit of an aside, as I don't think it is related to this issue, but you shouldn't be making this pattern optional (trailing "?") and neither do you need to capture it. So, this should be:
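
Going by that description (drop the trailing "?" and the capturing parentheses), the condition would presumably read as follows. This is a sketch only; www.example.com is the placeholder hostname used throughout the thread:

```apache
# True when the requested host is anything other than exactly
# www.example.com. Note that an empty Host header now also counts
# as a mismatch and would trigger the redirect.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
```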

Senior Member from US

arthur, does this happen with all browsers you've tried? It seems more like a browser behavior than a problem with the site.

you shouldn't be making this pattern optional

This is an ancient usage having to do with user-agents that don't send the "Host:" header. I have personally never met a request missing this header field; even the minimalist three-line block-on-sight robots (the ones that don't send User-Agent or Accept headers) include it. So it does no harm, but has probably outlived its usefulness. And, yeah, those picoseconds add up, especially in htaccess.

It is possible to set browsers to "warn" if sites attempt to redirect.

Generally this only applies to scripted redirects, like the one you get here after posting. Redirects that are sent instead of, not by, the requested page are not affected.

Administrator

joined:Aug 10, 2004
posts:11293
votes: 135

I have personally never met a request missing this header field; even the minimalist three-line block-on-sight robots (the ones that don't send User-Agent or Accept headers) include it. So it does no harm, but has probably outlived its usefulness.

all it takes is one such request to start an infinite redirect chain unless your RewriteCond handles the no header case.

there are several text-based browsers such as w3m and I believe lynx as well that implement the HTTP/1.0 protocol, which does not require a Host header. these user agents are often the front end for web accessibility tools, so consider carefully what ignoring these user agents implies...
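
To make the no-header case explicit, the optional pattern can be split into two conditions. This is only a sketch of an equivalent form, assuming the canonical host is www.example.com and that the HTTPS target shown is the one wanted:

```apache
RewriteEngine On
# Skip the redirect when no Host header was sent (as some HTTP/1.0
# clients do); such a client would repeat the same header-less request
# after the redirect and loop.
RewriteCond %{HTTP_HOST} !^$
# Redirect only when the host is something other than the canonical name.
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```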

Senior Member from US

joined:Apr 9, 2011
posts:14715
votes: 614

:: detour to raw logs ::

Over the years--as recently as last week, in fact--there have been a fair number of requests in the form "GET http://www.example.com/etcetera" where you'd expect the Host: header to be missing because it's part of the request instead. (1.0 vs. 1.1 seems to be independent.) But nope, it's always there.

It did occur to me to wonder if my host adds a Host: header--how else would things on a shared server end up in the right place?--but I asked once and they said they don't.

Full Member

joined:Apr 11, 2015
posts: 308
votes: 21

...user-agents that don't send the "Host:" header.

Thanks lucy24. Funny, I did think about that but somehow tripped on the regex?!

there are several text-based browsers such as w3m and I believe lynx as well that implement the HTTP/1.0 protocol. these user agents are often the front end for web accessibility tools, so consider carefully what ignoring these user agents implies...

...how else would things on a shared server end up in the right place?

Well, exactly.

A request that contains an absolute-URI should override the Host header (in HTTP/1.1 spec). However, it seems that a valid HTTP client (both 1.0 and 1.1) should only use an absolute-URI when making requests to a proxy server. (?) So, it would seem that for a valid HTTP client to make a successful request to a shared server (name based virtual hosts) then it would need to include the Host header?

Senior Member from US

So, it would seem that for a valid HTTP client to make a successful request to a shared server (name based virtual hosts) then it would need to include the Host header?

Maybe the Host-header-less requests simply never reach my site logs in the first place, being deflected at the gate. You don't see me weeping.

:: detour to investigate ::

After some searching, it turns out that (a) the Lynx available to me* (via Terminal in Mac OS) doesn't support HTTPS, which excludes my first-choice site, and (b) it's mildly deficient in one header, leading to a very picturesque image-less version of my 403 page if I don't poke a hole.

* User-Agent: Lynx/2.8.6rel.5 libwww-FM/2.14

arthur? Still with us? Try to ignore the digressions; it's all educational for somebody, somewhere.

New User

joined:Jan 19, 2012
posts: 25
votes: 0

(Forgive me everyone, some of this is whooshing way over my head.)

If I type in http://www.example.com the page displays as https://www.example.com so it does redirect from http to https. But it then still displays the "Moved permanently" message rather than the web page content. This happens in Chrome, Firefox and Edge. If I type https://www.example.com directly into the browser the page loads correctly with no redirects or error messages.

I ran the page through a redirect checker and it shows the http to https 301 redirect perfectly fine, but that's the last line of the output. No 200 response code is shown after the redirect, and the page obviously isn't displaying properly.

The only other thing I have in my htaccess is this to define urls for two error documents:

ErrorDocument 404 /404.shtml
ErrorDocument 410 /410.shtml

Full Member

If I type in http://www.example.com the page displays as https://www.example.com

As you suggest, this does appear to show that the redirect is working correctly.

But it then still displays the "Moved permanently" message rather than the web page content.

However, this suggests the redirect is not working correctly?!

Is it possible there are two redirects: one working and one not?!

Aside:

Maybe the Host-header-less requests simply never reach my site logs in the first place

On a shared host, this is most probable. Unless... you are the (unlucky) owner of the "default" site (ie. first defined site/virtualhost in the server config). Although most hosts avoid this and implement some default/generic landing page.

Senior Member from US

On a shared host, this is most probable.

I got brave and tried requesting a raw IP address belonging to my host. They have an error screen saying "can't find the site" with a cute little graphic, probably intended for exactly this situation. With Live Headers enabled, I could see that the browser sends all the usual headers--but, of course, no Host: header, since I didn't specify one. No numerical response code for the page itself, just for the picture.

New User

Aha! A solution!

I had set up my SSL via Cloudflare's "flexible" option without really understanding the implications...the main one being that an SSL certificate is not actually present on my own web server. So this means:

"There is an encrypted connection between your site visitors and CloudFlare, but not from CloudFlare to your server. The HTTPS condition from the htaccess or PHP will always return as off, as server is still using the http protocol."

So the site shows as https if loaded directly in a browser by a human, but behind the scenes if you asked for the http version and a 301 redirect came into play, the code I had set up in my htaccess meant that the browser was continually asking my server for an https version and my web server was continually saying "Nope, no can do" and the browser was saying "But I NEED it" and my server was saying "tough luck."
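
The actual rules were not posted, but a typical HTTP-to-HTTPS redirect that would behave this way behind Flexible SSL looks something like the sketch below. The hostname and the exact rule are assumptions for illustration:

```apache
RewriteEngine On
# With Flexible SSL, Cloudflare always contacts the origin over plain
# HTTP, so %{HTTPS} is "off" on every request, including those the
# visitor made over HTTPS. The condition never becomes false, so the
# 301 is issued again and again.
RewriteCond %{HTTPS} off
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```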

So. The solution was simply to remove the http to https redirect I had slaved over in my htaccess file and turn on an option within Cloudflare that says "Always use https". Some googling suggests there would be a way to do this in htaccess as well via something like this:

but I think it's best for all concerned if I leave my poor htaccess alone now. I appreciate that flexible SSL is probably not a good idea for sites with credit card numbers etc being inputted but my site literally has no secure info at all so I hope the flexible mode is sufficient.
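
For reference, one commonly cited way to do this in .htaccess behind a proxy such as Cloudflare is to test the scheme the visitor used, as forwarded by the proxy, rather than %{HTTPS}. A minimal sketch, assuming Cloudflare sets the X-Forwarded-Proto request header and that www.example.com is the canonical host:

```apache
RewriteEngine On
# Redirect only when the visitor-to-Cloudflare connection was plain HTTP.
# Requests that reach the origin without this header (for example, ones
# that bypass Cloudflare) are left alone by this rule.
RewriteCond %{HTTP:X-Forwarded-Proto} =http
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```

Because the condition depends on what the visitor requested rather than on how Cloudflare talks to the origin, it stops matching once the visitor is on HTTPS.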

Incidentally, my site now seems to redirect in two jumps, from http://example.com to https://example.com to https://www.example.com. I'm not really bothered by this as I checked a few other sites (including the BBC) and they seem to do the same - am I right not to be concerned?

I also checked webmasterworld.com and they seem to jump from http to https to http://www to https://www which seems a little long-winded. Would these jumps affect anything - SEO, load speed etc?

Senior Member from US

if you asked for the http version and a 301 redirect came into play, the code I had set up in my htaccess meant that the browser was continually asking my server for an https version and my web server was continually saying "Nope, no can do" and the browser was saying "But I NEED it" and my server was saying "tough luck."

Wow. A masterful summary. I am in envy of your prose.

:)

am I right not to be concerned?

A properly coded canonicalization redirect will achieve everything in one step. Sometimes, as you've found, it's impossible. The same applies to any redirect: where possible, do it in one step. If it honestly isn't possible, don't sweat it.
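
As a rough illustration of "everything in one step", a single rule can cover both the scheme and the hostname. This sketch assumes the server terminates TLS itself (so %{HTTPS} is reliable, which is not the case behind Cloudflare's Flexible mode) and that www.example.com is the canonical host:

```apache
RewriteEngine On
# Redirect if the request is not HTTPS, OR the host is not the canonical
# one (the optional non-capturing group lets an empty Host header through).
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^(?:www\.example\.com)?$
# One 301 straight to the final URL, whichever combination was requested.
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]
```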

I don't think the present site's webmaster (yes, he exists, and it's a specific named individual) is around at the moment; someone else may be able to shed light on this site's behavior. In the meantime, file under “Do as we say, not as we do”.

Full Member

I had set up my SSL via Cloudflare's "flexible" option ...

Ah yes, you will indeed get a redirect loop, as you described.

However, what you described earlier didn't sound like a redirect loop? You wouldn't expect to see the "Moved permanently - the document has moved here" message and neither would "[typing] https://www.example.com directly into the browser [load the page] correctly with no redirects or error messages." - If you "typed https://www.example.com directly into the browser" you would expect to see the same "redirect loop"?!

New User

I don't understand - why would you expect the same redirect loop when typing https://www.example.com directly into the browser?

Once the flexible SSL is turned on my site is capable of displaying pages as https and if you requested the https AND the www version directly my htaccess file would be completely ignored as there are no redirects that need applying.

Full Member

why would you expect the same redirect loop when typing https://www.example.com directly into the browser?

Because, typing https://www.example.com directly in the browser is the same as being redirected to https://www.example.com - in both cases the browser is making a request for https://www.example.com. Your site cannot tell the difference.

Hhhmm, I'm wondering, had you already configured a "page rule" at Cloudflare to always redirect to HTTPS at the time you implemented the redirect in .htaccess?

New User

I didn't have a "page rule" set up in Cloudflare.

I understand what you mean about typing in the request for https://www.example.com being the same as being redirected to it. But does that mean that when I typed in https://www.example.com, the code I'd put into htaccess would STILL have asked for a redirect to an https version, is that what you're saying? It still wouldn't have recognised the page as being https and the page should not have displayed correctly?

I think my confusion is because, in this specific circumstance, I have a web server giving out an http page that Cloudflare is able to display in the browser as an https page. So there must be a point where the browser is "fooled" into thinking the page is in https because otherwise it wouldn't show the little padlock. In my head therefore a straight request for https://www. would work correctly because it would ignore the htaccess code completely.

Full Member

No, ...

Yes.

....an SSL certificate is not actually present on my own web server. So this means:

"There is an encrypted connection between your site visitors and CloudFlare, but not from CloudFlare to your server. The HTTPS condition from the htaccess or PHP will always return as off, as server is still using the http protocol."