Monday, October 19, 2009

CGAL: All of Arrangement_2's Children Need an Exact Kernel

This comes up on the CGAL mailing list fairly often: why does the package XXX need an exact kernel? (Or the more common form: why does the package XXX exhibit some strange behavior if I use an inexact kernel?)

Well, if this package is based on the package Arrangement_2, then it inherited a requirement for exact constructions. Two examples include: some forms of Minkowski sum (the sum is built by building a preliminary shape and dumping it into an arrangement where it can be analyzed for self-intersections and winding counts found) and polygon boolean operations (where the arrangement contains each region intersecting any polygon with any other).

So why does arrangement_2 require exact constructions? Well, here's a layman's answer...I write this because I spent a good amount of effort a few years ago trying to write one that would work with floating point, and I hit just about every horrible problem that this "naive" case creates.

First, there is repeatability of insertion. If you want to insert a line segment AB into your arrangement multiple times, you want the subsequent inserts of AB to be no-ops (since AB is already present). Even better, if you have some kind of "notifier" API you want to get back all of the edges in the arrangement that AB was cut into.

But...what if there was a crossing segment CD that intersects AB at a point P such that...P cannot be collinear to AB because we ran out of bits of precision?

If you take the naive approach and insert AP and PB you now have segments that are close to AB but not quite AB.

Go back and insert AB again and you are really in for a world of hurt: AB originates with AP and runs a line almost the same, except for a single bit of floating point precision. I promise you that your attempt to figure out how to relate AB and AP to each other is going to go very, very badly.

The above case gets even worse when you consider that you may be inserting overlapping lines (and not just the same line over and over). If your definition isn't overlapping, you'll simply be inserting lines so close together that a single line crossing them all produces a single point.

Arrangements and planar maps save topology information (e.g. who is on what side of something else). The geometry information defines the map, but the topology information guides operations. In other words, we don't even try to check for an intersection unless the topology information says there couldbe an intersection. (This is pretty important for the speed of the map.)

This is where a lack of stability in floating point calculations becomes a problem. If a floating point geomtry check is close but inexact, equivalent tests may produce inconsistent results.

If the first result of a test is used to store topology information, we may not even try the second, inconsistent test.

My strategy at the time to deal with this was an ugly one: the map was only guaranteed to work for a very, very limited number of cases...the list was something like this:

Operations on horizontal and vertical segments were guaranteed to work in just about every case, because we could do an exact test if at least one segment were horizontal or vertical.

Insertion of segments was guaranteed to work if the original input list didn't come close to the limits of floating point , wasn't degenerate in any way, and we did the inserts by breaking up intersections in a single pass, then inserting.

This makes the API not very useful at all - looking at what I use Arrangement_2 for now, we violate these conditions all the time. Such limits certainly aren't appropriate for a general-purpsoe package like CGAL.

Consider for example, the trivial case of finding the union of two overlapping polygons. If any two edges are degenerate (e.g. nearly parallel, but the difference in slope is close to the limit in floating point precision) my limited API will fail, even for a trivial case where one polygon fully contains the other.

As a final thought: it probably is possible to build an API for an arrangement based on inexact geometry. The API would "leak" all of its geometry problems, e.g. the insertion of a curve could fail due to precision limits, and tests (is the curve in or out) would give inexact errors (E.g. "it's close to here but we're not really sure.")

Such an API would be fast (it'd be only floating point) and possibly even robust, but it would leak a ton of abstraction problems right up to the next level of code. For some limited classes of problems this might be acceptable, but often I think the speed boost of floating point would be lost, since most high performance higher level algorithms (like sweeps) are written assuming robust geometry.