Let $x_1, x_2, \ldots$ denote the data.
The naïve formula for calculating the sample
variance in one pass,

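\[
v = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
  = \frac{\left(\sum_{i=1}^{n} x_i^2\right) - n\bar{x}^2}{n-1},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\]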

suffers from computational round-off error.
If the mean $\bar{x}$ is large in absolute value,
and $\sum_{i=1}^{n} x_i^2$ is close to $n\bar{x}^2$,
then the subtraction at the end will tend to lose
significant digits in the result.
Also, in rare cases, the sum of squares $\sum_{i=1}^{n} x_i^2$
can overflow on a computer.
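
The loss of significant digits can be seen in a small experiment. The following is a minimal sketch, assuming IEEE double-precision floats; the function names `naive_variance` and `two_pass_variance` and the sample data are hypothetical, chosen only for illustration. The data have a large mean (about $10^9$) and a true sample variance of 30.

```python
def naive_variance(xs):
    """One-pass textbook formula: (sum of squares - n * mean^2) / (n - 1)."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in xs:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    return (total_sq - n * mean * mean) / (n - 1)


def two_pass_variance(xs):
    """Reference value: compute the mean first, then sum squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Illustrative data: large mean, small spread.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print(two_pass_variance(data))  # 30.0
print(naive_variance(data))     # not close to 30 in double precision
```

Both the sum of squares and $n\bar{x}^2$ are of order $4 \times 10^{18}$ here, where adjacent doubles are 512 apart, so their difference cannot resolve the true value of 90.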

A better alternative, though requiring more work per iteration,
is to calculate the running sample mean
and variance instead,
and update these as each datum is processed.
Here we give the derivation of the one-pass algorithm, which
involves nothing more than simple algebraic manipulations.

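Let $a_n = \frac{1}{n}\sum_{i=1}^{n} x_i$ denote the mean of the first $n$ data,
and $s_n = \sum_{i=1}^{n} (x_i - a_n)^2$ the sum of squared deviations from that mean,
so that the sample variance is $v = s_n/(n-1)$.
We want to express $a_{n+1}$ and $s_{n+1}$
in terms of the old values $a_n$ and $s_n$.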

For convenience, let $\delta = x_{n+1} - a_n$
and $\gamma = a_{n+1} - a_n$.
Then we have

\[
a_{n+1} = \frac{n a_n + x_{n+1}}{n+1}
        = \frac{(n+1) a_n + x_{n+1} - a_n}{n+1}
        = a_n + \frac{\delta}{n+1}.
\]

For the variance calculation,
we have

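\[
\begin{aligned}
s_{n+1}
&= \sum_{i=1}^{n} \bigl((x_i - a_n) - \gamma\bigr)^2 + (x_{n+1} - a_{n+1})^2 \\
&= \sum_{i=1}^{n} (x_i - a_n)^2 - 2\gamma \sum_{i=1}^{n} (x_i - a_n) + \sum_{i=1}^{n} \gamma^2 + (x_{n+1} - a_{n+1})^2 \\
&= s_n + 0 + n\gamma^2 + (x_{n+1} - a_{n+1})^2.
\end{aligned}
\]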

Now observe:

\[
\gamma = \frac{\delta}{n+1}, \qquad
x_{n+1} - a_{n+1} = \delta - \gamma = (n+1)\gamma - \gamma = n\gamma;
\]

hence we obtain:

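\[
\begin{aligned}
s_{n+1} &= s_n + n\gamma^2 + n^2\gamma^2 = s_n + n(n+1)\gamma^2 \\
        &= s_n + n\gamma\delta \\
        &= s_n + (x_{n+1} - a_{n+1})\,\delta.
\end{aligned}
\]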
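
As a quick sanity check of these recurrences, consider the hypothetical data $x_1 = 2$, $x_2 = 4$, $x_3 = 9$ (values chosen only for illustration), starting from $a_1 = 2$ and $s_1 = 0$:

\[
\begin{aligned}
n = 1:&\quad \delta = 4 - 2 = 2, \quad \gamma = \tfrac{2}{2} = 1, \quad a_2 = 3, \quad s_2 = 0 + 1 \cdot 1 \cdot 2 = 2, \\
n = 2:&\quad \delta = 9 - 3 = 6, \quad \gamma = \tfrac{6}{3} = 2, \quad a_3 = 5, \quad s_3 = 2 + 2 \cdot 2 \cdot 6 = 26.
\end{aligned}
\]

A direct computation gives $(2-5)^2 + (4-5)^2 + (9-5)^2 = 9 + 1 + 16 = 26$, in agreement, so the sample variance of these three numbers is $26/2 = 13$.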

Note that the number to be added to $s_n$ is never negative,
so no cancellation error will occur from this procedure.
(However, there can still be computational round-off error if $s_{n+1} - s_n$
happens to be very small compared to $s_n$.)
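
The update rules derived above translate almost line for line into code. The following is a minimal sketch rather than a definitive implementation; the function name `running_mean_variance` and the sample data are hypothetical, and the variable names `a`, `s`, and `delta` mirror the symbols used in the derivation.

```python
def running_mean_variance(data):
    """Return (mean, sample variance) of `data` in a single pass.

    Maintains the running mean a_n and the running sum of squared
    deviations s_n, updating both as each datum arrives.
    """
    n = 0    # number of data processed so far
    a = 0.0  # running mean a_n
    s = 0.0  # running sum of squared deviations s_n
    for x in data:
        delta = x - a        # delta = x_{n+1} - a_n
        n += 1               # n now holds the new count n + 1
        a += delta / n       # a_{n+1} = a_n + delta / (n + 1)
        s += (x - a) * delta # s_{n+1} = s_n + (x_{n+1} - a_{n+1}) * delta
    if n < 2:
        # The sample variance is undefined for fewer than two data.
        return a, float("nan")
    return a, s / (n - 1)


# Illustrative data with a large mean and a true sample variance of 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(running_mean_variance(data))  # prints (1000000010.0, 30.0)
```

Because `data` is consumed one element at a time, the function also accepts a generator or any other iterable that cannot be traversed twice.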

The recurrence relation for the
sample covariance of two lists of numbers $x_1, x_2, \ldots$
and $y_1, y_2, \ldots$
can be derived similarly.
If $a_n$ and $b_n$ denote the arithmetic means
of the first $n$ numbers of each of the two lists respectively,
then the sum of adjusted products