We will now consider automatic differentiation. It is a technique that
can compute exact derivatives, fast, while requiring about the same
effort from the user as numeric differentiation.

So what does this small change buy us? The following table compares
the time it takes to evaluate the residual and the Jacobian for
Rat43 using various methods.

==========================  =========
CostFunction                Time (ns)
==========================  =========
Rat43Analytic                     255
Rat43AnalyticOptimized             92
Rat43NumericDiffForward           262
Rat43NumericDiffCentral           517
Rat43NumericDiffRidders          3760
Rat43AutomaticDiff                129
==========================  =========

We can get exact derivatives using automatic differentiation
(Rat43AutomaticDiff) with about the same effort as is required to
write the code for numeric differentiation, while being only
\(40\%\) slower than hand-optimized analytic derivatives.

So how does it work? For this we will have to learn about Dual
Numbers and Jets.

Reading this and the next section on implementing Jets is not
necessary to use automatic differentiation in Ceres Solver. But
knowing the basics of how Jets work is useful when debugging and
reasoning about the performance of automatic differentiation.

Dual numbers are an extension of the real numbers analogous to complex
numbers: whereas complex numbers augment the reals by introducing an
imaginary unit \(\iota\) such that \(\iota^2 = -1\), dual
numbers introduce an infinitesimal unit \(\epsilon\) such that
\(\epsilon^2 = 0\). A dual number \(a + v\epsilon\) has two
components, the real component \(a\) and the infinitesimal
component \(v\).
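Since \(\epsilon^2 = 0\), arithmetic on dual numbers follows directly
from the definition; for example (a short derivation added here for
concreteness), sums and products work out to

\[(a + v\epsilon) + (b + w\epsilon) = (a + b) + (v + w)\epsilon\]

\[(a + v\epsilon)(b + w\epsilon) = ab + (aw + vb)\epsilon + vw\epsilon^2 = ab + (aw + vb)\epsilon.\]

Notice that the coefficient of \(\epsilon\) in the product is exactly
the product rule for derivatives.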

Surprisingly, this simple change leads to a convenient method for
computing exact derivatives without needing to manipulate complicated
symbolic expressions.

For example, consider the function \(f(x) = x^2\). Then

\[f(10 + \epsilon) = (10 + \epsilon)^2 = 100 + 20\epsilon + \epsilon^2 = 100 + 20\epsilon.\]

Observe that the coefficient of \(\epsilon\) is \(Df(10) =
20\). Indeed this generalizes to functions which are not
polynomial. Consider an arbitrary differentiable function
\(f(x)\). Then we can evaluate \(f(x + \epsilon)\) by
considering the Taylor expansion of \(f\) near \(x\), which
gives us the infinite series

\[f(x + \epsilon) = f(x) + Df(x)\epsilon + \frac{D^2f(x)}{2!}\epsilon^2 + \frac{D^3f(x)}{3!}\epsilon^3 + \cdots\]

and since \(\epsilon^k = 0\) for all \(k \ge 2\), this collapses to

\[f(x + \epsilon) = f(x) + Df(x)\epsilon.\]

A Jet is an
\(n\)-dimensional dual number, where we augment the real numbers
with \(n\) infinitesimal units \(\epsilon_i,\ i=1,...,n\) with
the property that \(\forall i, j\ :\epsilon_i\epsilon_j = 0\). Then
a Jet consists of a real part \(a\) and an \(n\)-dimensional
infinitesimal part \(\mathbf{v}\), i.e.,

\[x = a + \sum_j v_{j} \epsilon_j\]

The summation notation gets tedious, so we will also just write

\[x = a + \mathbf{v}.\]

where the \(\epsilon_i\)'s are implicit. Then, using the same
Taylor series expansion used above, we can see that

\[f(x) = f(a + \mathbf{v}) = f(a) + Df(a) \cdot \mathbf{v} = f(a) + \sum_j D_j f(a)\, v_j \epsilon_j.\]
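As a worked instance (added for concreteness), evaluate
\(f(x, y) = xy\) on the Jets \(x = a_x + \epsilon_1\) and
\(y = a_y + \epsilon_2\):

\[f(x, y) = (a_x + \epsilon_1)(a_y + \epsilon_2) = a_x a_y + a_y \epsilon_1 + a_x \epsilon_2,\]

since \(\epsilon_1 \epsilon_2 = 0\). The coefficients of
\(\epsilon_1\) and \(\epsilon_2\) are exactly
\(\partial f / \partial x = a_y\) and
\(\partial f / \partial y = a_x\), so a single evaluation yields both
the value and the full gradient.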

In order for the above to work in practice, we will need the ability
to evaluate an arbitrary function \(f\) not just on real numbers
but also on dual numbers. But one does not usually evaluate functions
by evaluating their Taylor expansions.

This is where C++ templates and operator overloading come into
play. The following code fragment has a simple implementation of a
Jet and some operators/functions that operate on them.

template<int N> struct Jet {
  double a;
  Eigen::Matrix<double, 1, N> v;
};

template<int N> Jet<N> operator+(const Jet<N>& f, const Jet<N>& g) {
  return Jet<N>(f.a + g.a, f.v + g.v);
}

template<int N> Jet<N> operator-(const Jet<N>& f, const Jet<N>& g) {
  return Jet<N>(f.a - g.a, f.v - g.v);
}

template<int N> Jet<N> operator*(const Jet<N>& f, const Jet<N>& g) {
  return Jet<N>(f.a * g.a, f.a * g.v + f.v * g.a);
}

template<int N> Jet<N> operator/(const Jet<N>& f, const Jet<N>& g) {
  return Jet<N>(f.a / g.a, f.v / g.a - f.a * g.v / (g.a * g.a));
}

template <int N> Jet<N> exp(const Jet<N>& f) {
  return Jet<N>(exp(f.a), exp(f.a) * f.v);
}

// This is a simple implementation for illustration purposes, the
// actual implementation of pow requires careful handling of a number
// of corner cases.
template <int N> Jet<N> pow(const Jet<N>& f, const Jet<N>& g) {
  return Jet<N>(pow(f.a, g.a),
                g.a * pow(f.a, g.a - 1.0) * f.v +
                    pow(f.a, g.a) * log(f.a) * g.v);
}

With these overloaded functions in hand, we can now call
Rat43CostFunctor with an array of Jets instead of doubles. Putting
that together with appropriately initialized Jets allows us to compute
the Jacobian as follows:

class Rat43Automatic : public ceres::SizedCostFunction<1, 4> {
 public:
  Rat43Automatic(const Rat43CostFunctor* functor) : functor_(functor) {}
  virtual ~Rat43Automatic() {}
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const {
    // Just evaluate the residuals if Jacobians are not required.
    if (!jacobians) return (*functor_)(parameters[0], residuals);

    // Initialize the Jets
    ceres::Jet<4> jets[4];
    for (int i = 0; i < 4; ++i) {
      jets[i].a = parameters[0][i];
      jets[i].v.setZero();
      jets[i].v[i] = 1.0;
    }

    ceres::Jet<4> result;
    (*functor_)(jets, &result);

    // Copy the values out of the Jet.
    residuals[0] = result.a;
    for (int i = 0; i < 4; ++i) {
      jacobians[0][i] = result.v[i];
    }
    return true;
  }

 private:
  std::unique_ptr<const Rat43CostFunctor> functor_;
};

Automatic differentiation frees the user from the burden of computing
and reasoning about the symbolic expressions for the Jacobians, but
this freedom comes at a cost. For example, consider the following
simple functor:

There is no single solution to this problem. In some cases one needs
to reason explicitly about the points where indeterminacy may occur
and use alternate expressions based on L'Hôpital's rule (see for
example some of the conversion routines in rotation.h). In
other cases, one may need to regularize the expressions to eliminate
these points.