the fine Art of coding – Julien Pilet

When developing computer vision systems that need to find plane projections from point-to-point correspondences, a RANdom SAmple Consensus (RANSAC) implementation is necessary. The algorithm rejects wrong correspondences (outliers) and finds the geometric transformation, usually a homography, that explains the inliers. At each iteration, it randomly picks 4 correspondences, computes the corresponding homography, and counts how many correspondences it explains. After a fixed number of iterations, the homography with the best support is chosen.
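As a rough sketch of the loop just described (not OpenCV's implementation): the names `Correspondence`, `Homography`, `countInliers` and `ransac` are invented for illustration, the 4-point solver is passed in by the caller as `fit`, and the sample is drawn with replacement for brevity.

```cpp
#include <array>
#include <cstddef>
#include <random>
#include <vector>

// One point-to-point correspondence: (x, y) in image A matches (u, v) in image B.
struct Correspondence { double x, y, u, v; };

// Row-major 3x3 homography.
using Homography = std::array<double, 9>;

// Count the correspondences that H maps to within `threshold` of their target.
inline int countInliers(const Homography& H,
                        const std::vector<Correspondence>& matches,
                        double threshold) {
    int inliers = 0;
    for (const Correspondence& m : matches) {
        const double w  = H[6] * m.x + H[7] * m.y + H[8];
        const double px = (H[0] * m.x + H[1] * m.y + H[2]) / w;
        const double py = (H[3] * m.x + H[4] * m.y + H[5]) / w;
        const double dx = px - m.u, dy = py - m.v;
        if (dx * dx + dy * dy < threshold * threshold) ++inliers;
    }
    return inliers;
}

struct RansacResult { Homography H; int support; };

// Generic RANSAC loop. `fit` is a caller-supplied function that turns
// 4 correspondences into a homography. Assumes `matches` is non-empty.
// Note: samples are drawn with replacement; a production version would
// draw 4 distinct correspondences.
template <typename Fit>
RansacResult ransac(const std::vector<Correspondence>& matches,
                    int iterations, double threshold, Fit fit,
                    unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, matches.size() - 1);
    RansacResult best{{1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0}, -1};
    for (int i = 0; i < iterations; ++i) {
        std::array<Correspondence, 4> sample;
        for (Correspondence& s : sample) s = matches[pick(rng)];
        const Homography H = fit(sample);
        const int support = countInliers(H, matches, threshold);
        if (support > best.support) best = {H, support};  // keep the best-supported model
    }
    return best;
}
```

The `if` inside `countInliers` is exactly the kind of data-dependent branch that gets in the way once several homographies are evaluated side by side.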

OpenCV offers an implementation in findHomography. It is unfortunately rather slow. A great optimization approach is to leverage SIMD instructions such as SSE or NEON. In theory, SIMD instructions could allow the processor to test 4 homographies in a single pass. However, conditional jumps are forbidden since the execution flow has to be common for the 4 tested homographies.
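To make the "no conditional jumps" constraint concrete, here is a minimal sketch, assuming an x86 target with SSE. Each of the 4 lanes holds the squared reprojection error of one candidate homography for the same correspondence. Instead of an `if`, a comparison mask decides which lanes get their inlier counter incremented. The helper name `accumulateInliers` is hypothetical.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// Adds 1.0f to every lane of `counts` whose squared error is below the
// squared threshold. All 4 lanes follow the same instruction stream.
inline __m128 accumulateInliers(__m128 counts,
                                __m128 squaredError,
                                __m128 squaredThreshold) {
    // Each lane becomes all-ones bits where the comparison holds, zero elsewhere.
    const __m128 mask = _mm_cmplt_ps(squaredError, squaredThreshold);
    // Masking 1.0f yields 1.0f for inlier lanes and 0.0f for the others.
    const __m128 increment = _mm_and_ps(mask, _mm_set1_ps(1.0f));
    return _mm_add_ps(counts, increment);
}
```

Calling this once per correspondence leaves, in each lane, the support of the corresponding homography.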

Computing homographies without conditional jumps

We need a jumpless way of computing a homography from 4 correspondences. Since I’m lazy, I asked Maple to generate the function for me. Maple had trouble analytically inverting the 8 by 8 matrix that solves the general problem. It did manage, though, to compute analytically a homography sending the unit square to arbitrary points.

This little piece of code produces a rather large function that does not use SIMD instructions at all: it uses double. To compute an arbitrary homography, I just call this function twice, invert one result, and multiply the two matrices together. These functions do not need conditional jumps.

Thanks to the power of C++ (operator overloading in particular), replacing “double” with an SSE type simply amounts to defining a class.

Replacing double with fvec4

Compilers give access to SSE instructions through intrinsics that do not look very friendly. The following example is rather hard to read:
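As a rough illustration of the contrast (a sketch assuming x86 with SSE, not the exact code behind the results below), `det2Raw` computes `a*b - c*d` on 4 floats with raw intrinsics, while a small wrapper class lets the generic `det2` template express the same thing with ordinary operators. The members of `fvec4` shown here, including the `lane` accessor, are assumptions made for the example.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// Raw intrinsics: a*b - c*d on 4 floats at once.
inline __m128 det2Raw(__m128 a, __m128 b, __m128 c, __m128 d) {
    return _mm_sub_ps(_mm_mul_ps(a, b), _mm_mul_ps(c, d));
}

// Minimal wrapper around __m128 with overloaded arithmetic operators.
struct fvec4 {
    __m128 v;
    fvec4() : v(_mm_setzero_ps()) {}
    fvec4(float s) : v(_mm_set1_ps(s)) {}  // broadcast a scalar to all 4 lanes
    fvec4(__m128 m) : v(m) {}
    fvec4(float a, float b, float c, float d) : v(_mm_setr_ps(a, b, c, d)) {}
    // Read one lane back (slow; meant for inspecting results only).
    float lane(int i) const {
        float out[4];
        _mm_storeu_ps(out, v);
        return out[i];
    }
};

inline fvec4 operator+(fvec4 a, fvec4 b) { return _mm_add_ps(a.v, b.v); }
inline fvec4 operator-(fvec4 a, fvec4 b) { return _mm_sub_ps(a.v, b.v); }
inline fvec4 operator*(fvec4 a, fvec4 b) { return _mm_mul_ps(a.v, b.v); }
inline fvec4 operator/(fvec4 a, fvec4 b) { return _mm_div_ps(a.v, b.v); }

// Generic code written once: works for double, float, or fvec4.
template <typename T>
T det2(T a, T b, T c, T d) { return a * b - c * d; }
```

With such a class, `det2(3.0, 4.0, 1.0, 2.0)` runs on doubles, and the same template called with four `fvec4` arguments processes 4 independent problems per instruction.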

Results

Using SSE instructions to find homographies significantly speeds up computation. My implementation is much faster than OpenCV’s findHomography function, which is, however, a bit more accurate because it uses 64-bit doubles instead of 32-bit floats.