Re: base::mean not consistent about NA/NaN

Thank you for the interesting examples.
I would find it useful to document this behavior in `?mean` as well. The `+`
operator is also affected, while the `sum` function is not.
For mean, NA / NaN could be handled in the loop in summary.c. I assume that
the performance penalty of such a fix is the reason this inconsistency still
exists.
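
To make the idea concrete, here is a minimal R-level sketch of an explicit NA check before averaging. `mean_na_first` is a hypothetical helper name, not part of base R, and this is not how summary.c works; it only illustrates the extra pass over the data that such a check would cost.

```r
# Hypothetical helper (not in base R): return NA whenever a true NA is
# present, regardless of where NaN values sit in the vector.
mean_na_first <- function(x) {
  stopifnot(is.numeric(x))
  # is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN,
  # so this picks out the "real" NA values.
  if (any(is.na(x) & !is.nan(x))) {
    return(NA_real_)
  }
  # No true NA present: defer to the ordinary mean (NaN still propagates).
  mean(x)
}

r1 <- mean_na_first(c(NA, NaN))
r2 <- mean_na_first(c(NaN, NA))
```

With this wrapper both orderings take the same early-return path, so the result does not depend on how the platform combines NA and NaN.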
Jan

Re: base::mean not consistent about NA/NaN

On Tue, Jul 3, 2018 at 10:12 AM, Jan Gorecki <[hidden email]> wrote:
> Thank you for the interesting examples.
> I would find it useful to document this behavior in `?mean` as well. The `+`
> operator is also affected, while the `sum` function is not.

`sum` is "affected" on my system, if you mean:

> sum(c(NA,NaN))
[1] NA
> sum(c(NaN,NA))
[1] NaN

Oh, maybe you mean:

> sum(NaN, NA)
[1] NA
> sum(NA, NaN)
[1] NA

Either way, there is no money-back guarantee. The documentation says:

Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
which of those two is not guaranteed and may depend on the R
platform (since compilers may re-order computations).
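
In practice this means code should not test which of the two values came back. A small sketch of a check that holds on any platform, relying only on the documented behavior of `is.na()` and `is.nan()`:

```r
x <- c(NA, NaN)

# The printed value of this sum may be NA or NaN depending on the platform.
res <- sum(x)

# is.na() is TRUE for both NA and NaN, so this test does not depend on
# which of the two was returned.
missing_or_nan <- is.na(res)

# To tell the two apart on the inputs, combine is.na() with is.nan():
# is.nan() is TRUE only for NaN.
input_is_na  <- is.na(x)
input_is_nan <- is.nan(x)
```

Testing the inputs rather than the result of the arithmetic sidesteps the ordering question entirely.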

Re: base::mean not consistent about NA/NaN

Yes, the performance overhead of fixing this at the R level would be too
large, and it would complicate the code significantly. The result of
binary operations involving NA and NaN is hardware dependent (it comes
down to how the NaN payload is propagated): on some hardware it works the
way we would like and NA is returned, but on other hardware you get NaN,
or sometimes NA and sometimes NaN. There are also C compiler optimizations
that re-order code, as mentioned in ?NaN, and external numerical libraries
that do not distinguish NA from NaN (NA is an R concept). So I am afraid
this is unfixable. The disclaimer mentioned by Duncan is in ?NaN/?NA,
which I think is ok: there are so many numerical functions through which
one might run into these problems that it would be infeasible to document
them all. Some functions will in fact preserve NA, and we would not let NA
turn into NaN unnecessarily, but the disclaimer says this is not something
to depend on.
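
For anyone curious what their own machine does, the following sketch probes the four orderings discussed in this thread. The helper name `classify` is made up for illustration, and no particular output is implied: the whole point is that the answers can differ between platforms.

```r
# Hypothetical helper: label a single numeric value as "NaN", "NA" or "number".
# is.nan() must be checked first because is.na() is TRUE for NaN as well.
classify <- function(v) {
  if (is.nan(v)) "NaN" else if (is.na(v)) "NA" else "number"
}

probe <- c(
  "NA + NaN"       = classify(NA_real_ + NaN),
  "NaN + NA"       = classify(NaN + NA_real_),
  "sum(c(NA,NaN))" = classify(sum(c(NA, NaN))),
  "sum(c(NaN,NA))" = classify(sum(c(NaN, NA)))
)

# Each entry is either "NA" or "NaN"; which one is platform dependent.
print(probe)
```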
