I'm writing a numerical optimization routine, and I'm having a problem with an expression of the form
$$ e^{-t} (1+\mathrm{erf}(t)) $$

The overall shape of the function looks correct, but when $t$ is large and negative, $e^{-t}$ is huge while $(1+\mathrm{erf}(t))$ is tiny, and their product is also small. This leads to horrible floating-point inaccuracies.
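To make the failure concrete, here is a minimal sketch in Python (not my actual code, which lives elsewhere, but the effect is the same in any IEEE-double environment; `math.erfc(-t)` is used only as a reference value for $1+\mathrm{erf}(t)$):

```python
import math

def f_naive(t):
    # Direct evaluation: exp(-t) * (1 + erf(t)).
    # For t <= -6 or so, erf(t) rounds to exactly -1.0 in doubles,
    # so the second factor cancels to 0.0 and the result is lost.
    return math.exp(-t) * (1.0 + math.erf(t))

def f_erfc(t):
    # Reference: 1 + erf(t) equals erfc(-t), computed without cancellation.
    return math.exp(-t) * math.erfc(-t)

t = -7.0
print(f_naive(t))  # 0.0 -- all significance destroyed
print(f_erfc(t))   # ~4.6e-20, the actual value
```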

I know, of course, that there are several things I can do to remedy this. One is to scale my problem so that the values are more reasonably sized. Another is to reformulate the expression to avoid ever computing the huge intermediate values. In the particular example I've given, this is simple:
$$ e^{-t} (1+\mathrm{erf}(t)) \\
= \exp{\left[ \log(e^{-t}) + \log{(1+\mathrm{erf}(t))} \right]} \\
= \exp{\left[ -t + \log{(1+\mathrm{erf}(t))} \right]}
$$
...which is well behaved for all reasonable values of $t$.
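As a sanity check (in Python for concreteness; `math.log1p(x)` computes $\log(1+x)$), the reformulated expression tracks the direct one wherever both are representable, while keeping every intermediate of size $O(|t|)$ rather than $e^{|t|}$:

```python
import math

def f_direct(t):
    # exp(-t) * (1 + erf(t)), evaluated literally
    return math.exp(-t) * (1.0 + math.erf(t))

def f_logform(t):
    # exp(-t + log(1 + erf(t))): the exponent stays O(|t|), so no huge
    # intermediate like exp(-t) is ever formed.  Note this does nothing
    # about the cancellation inside 1 + erf(t): for t <= -6 or so,
    # erf(t) rounds to -1.0 and log1p(-1.0) blows up.
    return math.exp(-t + math.log1p(math.erf(t)))

for t in (-5.0, -1.0, 0.0, 1.0, 5.0):
    a, b = f_direct(t), f_logform(t)
    assert abs(a - b) <= 1e-10 * max(abs(a), 1e-300)
```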

However, in my actual expression there are various parameters with respect to which I take derivatives. The resulting expressions are hideous, and reformulating them by hand is daunting (although tractable).

Is there a way to make Mathematica reformulate an expression while attempting to avoid sub-expressions that will be numerically unstable? I don't expect Mathematica to know automatically which expressions will be problematic, but if I could, for example, simply instruct it to avoid using Exp[] unless it absolutely must, that would be a very useful tool for me (and, I suspect, for other people working on numerical optimization!).

Note: I am not doing the optimization work in Mathematica. I'm only using Mathematica to help derive analytical gradients for my merit function. Therefore, any features of Mathematica that would eliminate the numerical inaccuracy only within Mathematica don't really help me.

BTW: your function is equivalent to $\exp(-t)\mathrm{erfc}(-t)$; maybe that reformulation might be more useful to you.
– Guess who it is.♦ May 25 '12 at 14:47

@J.M.: $t$ can go to negative values, yes. I've actually written my code using $\mathrm{erfc}$, but the resulting values are the same as with the $\mathrm{erf}$ form, so it doesn't really help. The issue is the huge values from $\exp$; they don't survive to the output anyway, so my goal is to reformulate things so that huge numbers are never generated in the first place.
– Colin K May 25 '12 at 14:56

@J.M.: Matlab, however, does have a nice implementation of the scaled complementary error function (I'm assuming you mean $e^{t^2} \mathrm{erfc}(t)$), so a reformulation involving this function would be great. I actually hadn't thought of that yet. Thank you! For this particular problem I think I've got enough suggestions to make a solution, but for future use I'd still be interested in how to use MMA for this more effectively.
– Colin K May 25 '12 at 15:25
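For reference, SciPy exposes the same scaled function as `scipy.special.erfcx`, with $\mathrm{erfcx}(x) = e^{x^2}\mathrm{erfc}(x)$, which makes the reformulation suggested in these comments easy to sketch (assuming SciPy is available):

```python
import math
from scipy.special import erfcx  # erfcx(x) = exp(x**2) * erfc(x)

def f_scaled(t):
    # e^{-t} (1 + erf(t)) = e^{-t} erfc(-t)
    #                     = e^{-t} * e^{-t^2} * erfcx(-t)
    #                     = exp(-t - t^2) * erfcx(-t)
    # For t <= 0 both factors are well scaled: no cancellation and no
    # huge intermediate.
    return math.exp(-t - t * t) * erfcx(-t)

def f_direct(t):
    return math.exp(-t) * math.erfc(-t)

for t in (-5.0, -1.0, 0.0, 2.0):
    assert abs(f_scaled(t) - f_direct(t)) <= 1e-12 * abs(f_direct(t))
```

For $t \ge 0$ the direct form is already safe (and $\mathrm{erfcx}(-t)$ would eventually overflow), so in practice one would branch on the sign of $t$.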

@Jens: I'm sorry, I should have been more clear about this: The function I've described is a fit function, not the merit function itself.
– Colin K May 25 '12 at 15:43

2 Answers

If you at least know in advance the range over which you will later evaluate, you might consider a Taylor series or related approximate form, using Mathematica both to derive such forms and to approximate an error bound. In your example:

func = Exp[-t] (1 + Erf[t]);
approxpoly = Normal[Series[func, {t, 0, 4}]];
N[SeriesCoefficient[func, {t, 0, 5}]]

(* -0.0365428 *)

So the error is around 3/100*t^5 in magnitude near t = 0. This of course assumes the series converges in that region, but in this case at least we know it does (if it did not, your problems would go beyond numerical (in)stability).

At t=1, this error is

In[13]:= func - approxpoly /. t -> 1.
Out[13]= -0.0732347

If you anticipate values of t in that range, you might want to use more terms in the polynomial, or else switch to a different approximation before t=1 (this is assuming an error of 7% is more than you want to allow). Tabulating func - approxpoly over (-1, 1) gives a better idea of the error sizes in that range.

As you would be evaluating outside of Mathematica, you would substitute the approximations for the function of interest. What I indicate above is an idea of how Mathematica might be used to derive them and assess their quality.
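Since the evaluation happens outside Mathematica anyway, the same check can be scripted directly. Here is a Python sketch; the Maclaurin coefficients below were multiplied out by hand from the series for $e^{-t}$ and $1+\mathrm{erf}(t)$ (with $a = 2/\sqrt{\pi}$), so treat them as derived rather than quoted:

```python
import math

A = 2.0 / math.sqrt(math.pi)  # erf'(0)

# Degree-4 Maclaurin coefficients of exp(-t)*(1+erf(t)),
# from multiplying the series of exp(-t) and 1+erf(t) by hand
COEFFS = [1.0, A - 1.0, 0.5 - A, (A - 1.0) / 6.0, 1.0 / 24.0 + A / 6.0]

def func(t):
    return math.exp(-t) * (1.0 + math.erf(t))

def approxpoly(t):
    # Horner evaluation of the degree-4 Taylor polynomial
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * t + c
    return acc

# Leading neglected term: c5 * t^5 with c5 = -1/120 - A/40 ~ -0.0365,
# i.e. "around 3/100 * t^5" as stated above
c5 = -1.0 / 120.0 - A / 40.0

print(func(1.0) - approxpoly(1.0))  # ~ -0.07323, matching Out[13]
```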

I would even go further and suggest that the OP use PadeApproximant[] to derive approximations, as these can usually stay accurate even somewhat far from the expansion point.
– Guess who it is.♦ May 25 '12 at 15:14

One is reminded of similar problems with $\Gamma$: this function is frequently used in combinatorial expressions in which enormous values almost cancel. Taking logarithms is unstable, but it's possible to compute $\log(\Gamma(z))$ directly with an asymptotic expansion, whence Mathematica offers a separate LogGamma function for this purpose.

Why not use asymptotic expansions for large arguments, then? What is needed is an asymptotic expansion of $\log(\text{erfc}(z))$ as $z \to \infty$ (see the comment by @J.M. and apply $\log$). Mathematica's Limit capability quickly lets us obtain as many terms as we might like. I used it to work out the first few below:
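The leading terms come out as $\log(\text{erfc}(z)) \approx -z^2 - \log(z\sqrt{\pi}) - \frac{1}{2z^2} + \frac{5}{8z^4} + \cdots$ (these coefficients are reconstructed here from the standard asymptotic series for $\text{erfc}$, so double-check them before relying on more of them). A quick numerical check in Python:

```python
import math

def log_erfc_asym(z):
    # First terms of the asymptotic expansion of log(erfc(z)) as z -> oo,
    # obtained from erfc(z) ~ exp(-z^2)/(z sqrt(pi)) * (1 - 1/(2 z^2) + ...)
    return (-z * z
            - math.log(z * math.sqrt(math.pi))
            - 1.0 / (2.0 * z * z)
            + 5.0 / (8.0 * z ** 4))

# Where erfc itself is still representable, the truncation already agrees well:
z = 10.0
print(abs(math.log(math.erfc(z)) - log_erfc_asym(z)))  # ~ 1.5e-6

# ...and the expansion keeps working after erfc(z) has underflowed to 0.0:
print(math.erfc(30.0))       # 0.0
print(log_erfc_asym(30.0))   # ~ -903.97
```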

To get more coefficients (which you would scarcely need, but this shows how the first eight coefficients were obtained), compute suitable limits of the difference. For instance, to obtain the ninth coefficient, evaluate

Here is a less complicated way of obtaining a nice asymptotic expansion: Series[Log[Erfc[z]], {z, Infinity, 20}]. One might consider constructing a Padé approximant from the asymptotic series as well, but thanks to a bug within the internal implementation of PadeApproximant[], some trickery is required: Log[1/z] - Log[Pi]/2 - PadeApproximant[Log[1/z] - Log[Pi]/2 - Series[Log[Erfc[z]], {z, Infinity, 10}], {z, Infinity, 6}].
– Guess who it is.♦ May 26 '12 at 2:25
