On Mon, Jun 27, 2016 at 2:17 PM, Luca Toscano <toscano.luca@gmail.com>
wrote:
>
>
> 2016-06-27 13:17 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:
>
>>
>>
>> On Sat, Jun 25, 2016 at 11:00 AM, Luca Toscano <toscano.luca@gmail.com>
>> wrote:
>>
>>>
>>>
>>> 2016-06-24 17:26 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com>:
>>>
>>>>
>>>>
>>>> On Sun, Jun 19, 2016 at 3:17 PM, Luca Toscano <toscano.luca@gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> 2016-06-08 16:14 GMT+02:00 Vacelet, Manuel <manuel.vacelet@enalean.com
>>>>> >:
>>>>>
>>>>>> On Tue, Jun 7, 2016 at 11:02 PM, Luca Toscano <toscano.luca@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2016-06-07 10:55 GMT+02:00 Vacelet, Manuel <
>>>>>>> manuel.vacelet@enalean.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 6, 2016 at 5:32 PM, Vacelet, Manuel <
>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>
>>>>>>>>> dOn Mon, Jun 6, 2016 at 5:00 PM, Vacelet, Manuel <
>>>>>>>>> manuel.vacelet@enalean.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Mon, Jun 6, 2016 at 4:09 PM, Luca Toscano <
>>>>>>>>>> toscano.luca@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I was able to repro building httpd from 2.4.x
branch and
>>>>>>>>>>> following your configuration files on github.
I am almost sure that
>>>>>>>>>>> somewhere httpd sets the Last-Modified header
translating "foo" to the
>>>>>>>>>>> first Jan 1970 date. I realized though that I
didn't recall the real issue,
>>>>>>>>>>> since passing value not following the RFC can
lead to inconsistencies, so I
>>>>>>>>>>> went back and checked the correspondence. Quoting:
>>>>>>>>>>>
>>>>>>>>>>> "Actually I wrote this snippet to highlight the
behaviour (the
>>>>>>>>>>> original code sent the date in iso8601 instead
of rfc1123) because it was
>>>>>>>>>>> more obvious.
>>>>>>>>>>> During my tests (this is extracted from an automated
test
>>>>>>>>>>> suite), even after having converted dates to
rfc1123, I continued to get
>>>>>>>>>>> some sparse errors. What I got is that the value
I sent was sometimes
>>>>>>>>>>> slightly modified (a second or 2) depending on
the machine load."
>>>>>>>>>>>
>>>>>>>>>>> So my understanding is that you would like to
know why a
>>>>>>>>>>> Last-Modified header with a legitimate date/time
set by a PHP app gets
>>>>>>>>>>> "delayed" by a couple of seconds from httpd,
right?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes for sure, this is the primary issue.
>>>>>>>>>> However, the (undocumented) difference of behavior
from one
>>>>>>>>>> version to another (2.2 -> 2.4 and more surprisingly
from between two 2.4
>>>>>>>>>> versions) is also in question here.
>>>>>>>>>> Even more strange, 2.4 built for other distrib doesn't
highlight
>>>>>>>>>> the behaviour !
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I made another series of test and it seems to be linked
to fastcgi.
>>>>>>>>>
>>>>>>>>> I took the stock apache (2.4.6 plus tons of patches)
& php-fpm
>>>>>>>>> (5.4.16 + tons of patches) from RHEL7 and I get the exact
same behaviour
>>>>>>>>> (headers rewritten to EPOCH)
>>>>>>>>> However, if I server the very same php script from mod_php
>>>>>>>>> (instead of fcgi) it "works" (the headers are not modified).
>>>>>>>>>
>>>>>>>>>
>>>>>>>> For the record, I also have the same behaviour (headers rewritten
>>>>>>>> when using php-fpm + fastcgi) on alpine linux 3.4 that ships
apache2-2.4.20.
>>>>>>>> So AFAICT, it doesn't seem distro specific.
>>>>>>>>
>>>>>>>> On the root of the problem, from my point of view:
>>>>>>>> - the difference between mod_php vs. php-fpm + fcgi is
>>>>>>>> understandable (even if not desired and not documented).
>>>>>>>> - the fact that fcgi handler parse & rewrite headers
seems to lead
>>>>>>>> to inconsistencies (I'll try to build a test case for that).
>>>>>>>> - however, even if the headers are wrong, I think apache
default
>>>>>>>> (use EPOCH) is wrong as it leads to very inconsistent behaviour
(the
>>>>>>>> resource will never expire). I would prefer either:
>>>>>>>> -- do not touch the header
>>>>>>>> -- raise a warning and discard the header
>>>>>>>>
>>>>>>>> What do you think ?
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> From my tests the following snippet of code should be responsible
>>>>>>> for the switch from 'foo' to unix epoch:
>>>>>>>
>>>>>>> *https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663
>>>>>>> <https://github.com/apache/httpd/blob/2.4.x/server/util_script.c#L663>*
>>>>>>>
>>>>>>> The function that contains the code,
>>>>>>> ap_scan_script_header_err_core_ex, is wrapped by a lot of other
functions
>>>>>>> eventually called by modules like mod-proxy-fcgi. A more verbose
>>>>>>> description of the function in:
>>>>>>>
>>>>>>> https://github.com/apache/httpd/blob/2.4.x/include/util_script.h#L200
>>>>>>>
>>>>>>> Not sure what would be the best thing to do, but probably we
could
>>>>>>> follow up in a official apache bugzilla task?
>>>>>>> https://bz.apache.org/bugzilla/enter_bug.cgi?product=Apache%20httpd-2
>>>>>>>
>>>>>>>
>>>>>> Wow, thanks for the investigation !
>>>>>>
>>>>>
>>>>> Sorry for the delay! I submitted a patch for trunk with a possible
>>>>> fix, namely dropping (and logging at trace1 level) any non compliant
>>>>> date/time set in a Last-Modified header returned by a FCGI/CGI script:
>>>>> http://svn.apache.org/r1748379
>>>>>
>>>>>
>>>> Cool :)
>>>>
>>>>
>>>>> The fix is also in the list of proposal for backport to the 2.4.x
>>>>> branch, we'll see what other people think about this solution.
>>>>>
>>>>> We should also do a follow up for the other main issue, namely the
>>>>> fact that you see a different/delayed Last-Modified header sometimes
among
>>>>> your FCGI/httpd responses. Can you give me an example of Last-Modified
>>>>> header value before/after the "delay" and a way to repro it?
>>>>>
>>>>
>>>> I wrote a test case in the "time" branch:
>>>> https://github.com/vaceletm/bug-httpd24/tree/time
>>>>
>>>> In my own tests, I get:
>>>> --------------------->8---------------------
>>>> < Date: Fri, 24 Jun 2016 15:21:46 GMT
>>>> < Server: Apache/2.4.18 (Red Hat)
>>>> < X-Powered-By: PHP/5.6.5
>>>> < Last-Modified: Fri, 24 Jun 2016 15:21:48 GMT
>>>> < Transfer-Encoding: chunked
>>>> < Content-Type: text/html; charset=UTF-8
>>>> <
>>>> { [data not shown]
>>>> 0 44 0 44 0 0 21 0 --:--:-- 0:00:02
>>>> --:--:-- 21* Connection #0 to host localhost left intact
>>>>
>>>> * Closing connection #0
>>>> sent value: Fri, 24 Jun 2016 17:21:46 +0200
>>>> --------------------->8---------------------
>>>>
>>>> The value sent doesn't respect RFC1123 (+0200 instead of GMT as time
>>>> zone) but the result is weird as you can see:
>>>> - I sent "Fri, 24 Jun 2016 17:21:46 +0200"
>>>> - but apache decided to send "Fri, 24 Jun 2016 15:21:48 GMT"
>>>>
>>>> Notice the 2 seconds ?
>>>> I put a "sleep(2)" in my php script...
>>>>
>>>> I don't know if your fix also take this into account
>>>>
>>>
>>> Thanks a lot for the precise test! The same code snippet that I modified
>>> is responsible for the behavior that you mentioned. Httpd modifies the
>>> Last-Modified header with the request's modification time if the value sent
>>> from FCGI appears to be in the future (since the HTTP RFC states "An origin
>>> server with a clock MUST NOT send a Last-Modified date that is later than
>>> the server's time of message origination (Date).").
>>>
>>> I modified your PHP code snippet (http://apaste.info/EEz) trying to
>>> compare a GMT date vs a "Europe/Paris" one, already formatted for RFC1123,
>>> and PHP seems to agree with httpd in recognizing the "Europe/Paris" date as
>>> more recent. Moreover, if you generate a GMT date and format it for RFC1123
>>> the header is not modified with the extra two seconds.
>>>
>>> So from what I can see httpd does the correct thing, I don't see a bug
>>> like in the previous case. What do you think? I am far from a PHP expert so
>>> I might have missed something important :)
>>>
>>>
>> Mmm I don't think it' the right way to compare the dates here as you are
>> really comparing the format strings here.
>> I propose a new version of the snippet: http://apaste.info/Aox
>>
>
>> Clearly, just changing the timezone doesn't impact the time comparison
>> (and it's the expected behaviour).
>>
>
> Correct, in general the best way should be the one that you proposed, but
> in this case we are talking about RFC1123 specific date/times, so the
> format string comparison should be relevant imho. A efficient RFC 822/1123
> parser would probably assume the GMT timezone and care only about what
> comes before, this is why Europe/Paris is seen as more recent than GMT. A
> super strict and correct parse would also check the GMT bit and return
> error if missing, but it may be a bit overkill.
>
>
>> To me there is a wrong attempt to comply with RFC in apache here.
>> Either the parser is able to:
>> 1. correctly read the header input
>> 2. normalize to GMT
>> 3. ensure the resulting date is not > to server time (+ probably log
>> somthing to help developers to understand things)
>> or there should be a warning and the header is dropped (like if it's not
>> a date).
>>
>> Here I thing either step 1 ou 2 are no done properly in apache.
>>
>>
> I am seeing things in a different way, namely only point 3 should/could be
> implemented. AFAIU RFC1123 (and related) assume a GMT date/time and since
> the HTTP RFC requires this format for the Last-Modified header, I don't
> believe that httpd should be required to be able to convert multiple
> formats/timezones to RFC1123. This seems to be backed up by the comments of
> the function used by httpd to convert the Last-Modified header value:
>
Ok but current behaviour is not correct either.
If I understood well assume that apache receives a RFC1123 value so it
compares with current server time (and eventually sends the later).
In my example, the date is not a valid RFC1123 value (because it sends
+0200 or Europe/Paris). Here, the most sensible default would be to trash
with value.
It's as invalid as "foo" in my initial example so from my point of view the
behaviour of apache should be the same: discard the header (thanks to your
patch) and raise a warning.