With ordinary least squares fitting, all measurements $m_i$ are treated equally. In our GPS fitting scenario, however, older points should contribute less than newer ones. We therefore give each measurement its own weight $w_i$ in the sum of squared residuals:

$ R = \sum_i w_i(f(t_i)-m_i)^2 $

Again, we find the minimum by setting the gradient to zero: $\nabla R=0$.
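To make this condition concrete, here is a minimal Python sketch rather than a definitive implementation. It rests on two assumptions that are not stated in this section: the model is taken to be a quadratic, $f(t) = a + bt + ct^2$ (three parameters), and the weights follow a hypothetical exponential decay with age, controlled by a made-up time constant `tau`. All function names are illustrative. The sketch solves the weighted normal equations and provides a `gradient` function so you can check that $\nabla R$ vanishes at the solution.

```python
import math


def decay_weights(times, t_now, tau):
    """Hypothetical weighting scheme: exponential decay with age.

    The newest point (t == t_now) gets weight 1, older points get less.
    """
    return [math.exp(-(t_now - t) / tau) for t in times]


def weighted_residual_sum(params, times, measurements, weights):
    """R = sum_i w_i * (f(t_i) - m_i)^2 for the assumed quadratic f."""
    a, b, c = params
    return sum(
        w * ((a + b * t + c * t * t) - m) ** 2
        for t, m, w in zip(times, measurements, weights)
    )


def gradient(params, times, measurements, weights):
    """Partial derivatives of R with respect to a, b and c."""
    a, b, c = params
    g = [0.0, 0.0, 0.0]
    for t, m, w in zip(times, measurements, weights):
        residual = (a + b * t + c * t * t) - m
        for j in range(3):
            # d f / d a = 1, d f / d b = t, d f / d c = t^2
            g[j] += 2.0 * w * residual * t ** j
    return g


def fit_weighted_quadratic(times, measurements, weights):
    """Solve the 3x3 linear system that results from grad R = 0."""
    n = 3
    # Row j, column k: sum_i w_i * t_i^(j+k); right-hand side: sum_i w_i * m_i * t_i^j
    rows = [
        [sum(w * t ** (j + k) for t, w in zip(times, weights)) for k in range(n)]
        + [sum(w * m * t ** j for t, m, w in zip(times, measurements, weights))]
        for j in range(n)
    ]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n + 1):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution.
    params = [0.0] * n
    for r in reversed(range(n)):
        known = sum(rows[r][k] * params[k] for k in range(r + 1, n))
        params[r] = (rows[r][n] - known) / rows[r][r]
    return params


# Usage with made-up sample data (not real GPS measurements).
sample_times = [0.0, 1.0, 2.0, 3.0, 4.0]
sample_measurements = [1.1, 3.4, 7.2, 11.3, 17.1]
sample_weights = decay_weights(sample_times, t_now=4.0, tau=2.0)
sample_params = fit_weighted_quadratic(sample_times, sample_measurements, sample_weights)
print("fitted parameters:", sample_params)
print("gradient at solution:", gradient(sample_params, sample_times, sample_measurements, sample_weights))
```

A smaller `tau` makes old points fade faster; setting all weights to 1 reduces the fit to ordinary least squares.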

Calculating the partial derivatives that make up the gradient leads to the following three linear equations: