2 Answers

In the example output from your code, $\sigma$ is huge, i.e. the Gaussian is extremely broad. The variable s you define as the pre-factor for the argument of the corresponding exponential is then only $\approx -1\cdot{}10^{-15}$, which is dangerously close to typical double-precision limits (adding $10^{-16}$ to $1$ in double precision, for example, still gives $1$). Since scipy's quad then has to deal with both huge and tiny numbers, it becomes hard for it to detect, e.g., what is numerically zero.

For numerical integration to be stable, it is important to scale the integration variable appropriately: here, $\sigma$ is the typical length scale of your problem, and one would want the typical numerical scale used for integration to be of the order of $1$ (or $0.1$, or $10$, or anything of reasonable numerical magnitude). A possible substitution of the integration variable could be $u=(x-\mu)/\sigma$ (with ${\rm d}x=\sigma{\rm d}u$). The integral would then become:
$$\int\limits_{-\infty}^{\infty} k\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) {\rm d}x = \int\limits_{-\infty}^{\infty} \sigma k\exp\left(-\frac{u^2}{2}\right) {\rm d}u .$$

def f2(u):
    # substituted integrand: u = (x - mu) / sigma, dx = sigma * du
    return k * numpy.exp(-0.5 * u**2) * sigma

Using the original integrand $f(x)$ (with the same $\sigma$ as in the example output in the question, not another random value), I also find quad to fail, yielding an incorrect integral of $\approx -4\cdot{}10^{-8}$; with $f2(u)$, quad yields $1$.
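For concreteness, here is a minimal self-contained sketch of that comparison; the values of mu and sigma below are illustrative stand-ins (the question's generate() is not reproduced here), chosen only to mimic a very broad, normalized Gaussian, and f2 is repeated so the snippet runs on its own:

import numpy
from scipy.integrate import quad

mu = 3.7        # illustrative shift, not the question's random value
sigma = 2.7e7   # illustrative, very large width
k = 1.0 / (sigma * numpy.sqrt(2.0 * numpy.pi))  # normalization: exact integral is 1

def f(x):
    # original integrand in x
    return k * numpy.exp(-0.5 * (x - mu)**2 / sigma**2)

def f2(u):
    # substituted integrand: u = (x - mu) / sigma, dx = sigma * du
    return k * numpy.exp(-0.5 * u**2) * sigma

print(quad(f, -numpy.inf, numpy.inf)[0])   # typically far off for such a broad Gaussian
print(quad(f2, -numpy.inf, numpy.inf)[0])  # ~1.0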

Suppose the transformation of the integrand should not be as simple as above (which basically leads to a Gaussian with 'normalized' coefficients). One could also define $u=x/\sigma$, i.e. not absorb the shift due to $\mu$ into the transformation (which works here because $\mu \in [-10;10]$; if $\mu$ itself were huge, it would indeed be best to use the previous transformation to avoid numerical instabilities), and use the following integrand:

def f3(u):
    # only the width is scaled out: u = x / sigma, dx = sigma * du
    return k * numpy.exp(-0.5 * (u * sigma - mu)**2 / sigma**2) * sigma

which, despite still having the shift due to $\mu$ in the integrand, accurately yields $1$ with scipy's quad. Choose e.g. $\mu=10^{12}$ and this latter approach will fail, with quad again yielding incorrect results close to $0$.
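As a quick check of that failure mode (continuing the sketch above, i.e. reusing the imports, k, sigma, f2 and the f3 just defined; the numbers remain illustrative):

mu = 1e12  # huge shift: the peak of f3 now sits at u = mu / sigma, far from the origin
print(quad(f3, -numpy.inf, numpy.inf)[0])  # typically close to 0, i.e. wrong
print(quad(f2, -numpy.inf, numpy.inf)[0])  # the shifted-and-scaled form still gives ~1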

A change of variables $u=x\cdot\alpha$ and ${\rm d}u={\rm d}x\cdot\alpha$ can generally be used for quadrature (and other numerical problems) to transform the integral kernel to numerically more tractable magnitudes.
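A small generic sketch of that idea (the helper name integrate_rescaled and the numbers are hypothetical, purely for illustration): given an integrand f and a known typical scale, integrate in the rescaled variable $u = \alpha x$ instead of $x$.

import numpy
from scipy.integrate import quad

def integrate_rescaled(f, alpha):
    # substitute u = alpha * x, so x = u / alpha and dx = du / alpha
    g = lambda u: f(u / alpha) / alpha
    return quad(g, -numpy.inf, numpy.inf)[0]

# example: a deliberately broad Gaussian, normalized so the exact integral is 1
sigma = 2.7e7
k = 1.0 / (sigma * numpy.sqrt(2.0 * numpy.pi))

def broad(x):
    return k * numpy.exp(-0.5 * x**2 / sigma**2)

print(integrate_rescaled(broad, 1.0 / sigma))  # ~1.0

Here $\alpha$ plays the role of $1/\sigma$, collapsing the problem back to the well-scaled case above.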

Could you give more details about the random.seed(0), and how it relates to the result?
– Zophikel Jun 17 '19 at 14:56

The computation uses randomness (in generate()), so whatever the values are there, they will affect the final result. Fixing the random.seed makes the results reproducible.
– Nico Schlömer Jun 17 '19 at 17:21


One thing is that the script will yield different results every time since random.seed is not fixed. The other is that one can test whether scipy.integrate.quad can handle Gaussian input with any randomly chosen width $\sigma$: the function definition with the choice for the pre-factor $k$ indeed uses the fact that there is an analytical result of the integral for normalization. The automatic integrator quad fails if $\sigma$ is too large. The numerical difficulties aren't really the randomness of the input itself, but rather that a random width can be too large for quad to handle.
– v-joe Jun 17 '19 at 18:46


@v-joe Not really. The variance depends on the seed, but it is always too large.
– Nico Schlömer Jun 17 '19 at 20:35


@NicoSchlömer Good point. Looking at it that way (i.e. generating random data and computing the variance of that) there is really no need to employ any notion of a Gaussian = (continuous) normal distribution and any quadrature whatsoever but just take the variance computed using a discrete sum.
– v-joe Jun 17 '19 at 20:56