Monday, September 21, 2009

Translation: Ordnance Survey and Google Maps side-by-side for planning the Land's End-John O'Groats walk in the United Kingdom.

My friend Steve is walking the LEJOG in 2010 (I guess that makes him a LEJOGger - as opposed to the JOGLErs who go North to South). Of course he is blogging about it - doesn't everybody?

Even better, he's planning his route using Where's The Path, which is a fantastic side-by-side map site for the U.K. It shows Ordnance Survey topographic maps and Google Maps imagery in a cleverly linked view (similar to the VExGM web site I blogged about earlier this year, but even more useful). Here's a link to the most famous location in Britain (but I suspect it doesn't feature in many LEJOG itineraries, unless you're a real anorak for walking).

WTP also lets you digitize paths and export them to GPX and KML. Steve is finding the KML export particularly useful, since it allows him to locate all the pubs along the route of his walk.

Now, when is someone going to do this for Canada? All those prospective SJV'ers out there would be overjoyed!

Over the years of working with JTS I've read a ton of interesting and sometimes even useful academic papers about aspects of Computational Geometry. I hope to start documenting these on this page, perhaps with some (no doubt opinionated) comments about applicability. A treasure trove for Geometry geeks!

Wednesday, August 19, 2009

The web mapping site 192.com has a very nice map zoom control which dynamically displays a representative image of the map at the prospective zoom level.

This nicely addresses a problem which has occurred in a few applications I've been involved with recently - how to communicate to the user what a given zoom level actually looks like.

At first I thought the control might be truly dynamic - i.e. pulling a live tile from the centre point of the current map view. But it's actually just a set of tile snippets at a fixed point. Still, it provides a very nice effect. (The dynamic version would be simple to implement, but perhaps would be too slow in actual use, with unpredictable latency.)

Friday, May 15, 2009

I find the sequence diagram to be one of the most useful UML diagrams - but it's also one of the most fiddly & annoying to generate in a diagramming tool. There are a few reasons for that:

- there's a large amount of implicit structure which is tedious to maintain
- sequence diagrams quickly get complex for processes with lots of actors and/or actions
- I can never remember all the conventions for representing concepts like alternation, loops and nesting

Luckily, the web to the rescue! The websequencediagrams website has an on-line diagram generator which uses a simple language to describe the diagram. This works well precisely because sequence diagrams are inherently linear in structure, and hence more constrained than other UML diagrams.

The language supports just about everything you might want - signal types, groups, nesting, notes, lifelines, etc. It outputs in images or as PDF - and it even has cool styles! Here's the idea:
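The input language is simple enough to show in a few lines. Here's an illustrative example of the kind of syntax the site accepts (from memory, so the keywords may have shifted since):

```
title Login
Client->Server: login(user, pass)
alt credentials valid
    Server-->Client: session token
else invalid
    Server-->Client: error
end
note over Client: retries are up to the caller
```

Solid arrows (`->`) are calls, dashed arrows (`-->`) are returns, and `alt`/`else`/`end` handles the alternation conventions I can never remember.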

I've blogged about the appeal of specifying UML diagrams using code before. I'm not sure how much traction TextUML is getting - but it's hard to resist the appeal of a web interface, a simple language, and sexy output.

Now, if only someone would develop a force-directed web-based state diagram engine...

Note: there are some other alternatives to the websequencediagrams tool - the author blogs about them here. The most interesting from my point of view is sdedit, which is open-source and Java-based. It's more powerful, but the language looks a mite complex. (Too bad there's no standard language for describing UML - maybe the Three Amigos should skip a siesta and whip one up.)

The JTS implementation scales very well - a Delaunay/Voronoi diagram of 1 million points computes in under 5 minutes. The one downside is memory usage. As usual, Java is quite memory-hungry - a 1 M point Delaunay takes around 700 MB. This seems excessive, even for Java. Hopefully some memory profiling will reveal that this can be reduced by tuning the class structures involved.

Even better, the Delaunay/Voronoi algorithm seems to be quite robust, even though no special attempt has been made to provide high-precision or otherwise robust implementations of some of the key predicates (inCircle and orientation). This may yet prove to be an issue for some inputs, but at this point it seems possible to claim that for non-pathological inputs the algorithms execute correctly.
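For reference, the standard double-precision versions of those two predicates are just small determinants. Here's a self-contained sketch (textbook formulas, not the JTS code itself), which illustrates why robustness is a concern: both are sums of products of differences, so cancellation can flip the sign for nearly-degenerate inputs.

```java
public class Predicates {

    // orientation: > 0 if a -> b -> c turns counter-clockwise,
    // < 0 if clockwise, 0 if collinear (in exact arithmetic)
    public static double orientation(double ax, double ay,
                                     double bx, double by,
                                     double cx, double cy) {
        return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
    }

    // inCircle: > 0 if d lies inside the circumcircle of the
    // counter-clockwise triangle a, b, c; < 0 if outside
    public static double inCircle(double ax, double ay, double bx, double by,
                                  double cx, double cy, double dx, double dy) {
        double adx = ax - dx, ady = ay - dy;
        double bdx = bx - dx, bdy = by - dy;
        double cdx = cx - dx, cdy = cy - dy;
        double ad2 = adx * adx + ady * ady;
        double bd2 = bdx * bdx + bdy * bdy;
        double cd2 = cdx * cdx + cdy * cdy;
        // 3x3 determinant, expanded along the first row
        return adx * (bdy * cd2 - bd2 * cdy)
             - ady * (bdx * cd2 - bd2 * cdx)
             + ad2 * (bdx * cdy - bdy * cdx);
    }
}
```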

This Delaunay implementation arose out of the work I did on automated watershed extraction using a system called WaterBuG a while back. It's taken a while to make its way into JTS because there's quite a bit of work involved in turning a codebase targeted at a very specific project into a clean, user-friendly API. Not to mention writing documentation...

But it's there now, and I'll be interested to see if it's of use to JTSers. There seems to be a sudden surfeit of Delaunay and CDT Java libraries (see here, here, here, here and here), so I make no claims that the JTS one is best-of-breed. But hopefully having a decent implementation easily accessible and embedded in a general-purpose library will meet some needs. Also, my hope is that it will serve as the basis for further algorithms in JTS (such as the much-sought-after Concave Hull!).

Here's some screenshots showing the triangulation API at work:

The input points and constraint segments:
The Delaunay triangulation (notice that triangle edges cross some of the longer constraint edges):
A conforming Delaunay triangulation (the triangulation edges are now faithful to the constraint edges):
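The calls behind output like this are short. Here's a sketch against the JTS triangulation builders (treat it as illustrative rather than tested - it needs the JTS jar on the classpath, and package names are from the com.vividsolutions era):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

import com.vividsolutions.jts.geom.Coordinate;
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.geom.GeometryFactory;
import com.vividsolutions.jts.triangulate.DelaunayTriangulationBuilder;
import com.vividsolutions.jts.triangulate.VoronoiDiagramBuilder;

public class TriangulationDemo {
    public static void main(String[] args) {
        GeometryFactory gf = new GeometryFactory();

        // some random sites
        List<Coordinate> sites = new ArrayList<Coordinate>();
        Random rnd = new Random(13);
        for (int i = 0; i < 1000; i++)
            sites.add(new Coordinate(rnd.nextDouble(), rnd.nextDouble()));

        // Delaunay triangulation of the sites
        DelaunayTriangulationBuilder dtb = new DelaunayTriangulationBuilder();
        dtb.setSites(sites);
        Geometry triangles = dtb.getTriangles(gf);  // a GeometryCollection of triangles

        // Voronoi diagram of the same sites
        VoronoiDiagramBuilder vdb = new VoronoiDiagramBuilder();
        vdb.setSites(sites);
        Geometry cells = vdb.getDiagram(gf);

        System.out.println(triangles.getNumGeometries() + " triangles, "
                + cells.getNumGeometries() + " cells");
    }
}
```

The conforming case shown in the last screenshot uses ConformingDelaunayTriangulationBuilder, which adds a setConstraints call for the constraint segments.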

Wednesday, March 4, 2009

...As they explain in this blog post about using JTS to simplify a polygonal coverage.

The really sweet thing is that not only do the Web Monkeys have laser beams, they have honkin' big computers as well. The post states that they used 7 GB of memory to process 41,000 polygons in a single batch (as is required for the simplification algorithm they're using). I inquired and found out that they're using:

Niice... I've always wanted to experiment with large memory space for JTS processing, for things like large overlays, large TIN building, etc. So far I haven't had a machine large enough to let me try. But it sounds like 64-bit Java "just works", as expected. Another great reason for coding to a JVM - 64-bit memory space support with no recompilation!

Tuesday, March 3, 2009

For my area of interest (Victoria B.C.) my 20-second appraisal is that each has aspects of its rendering which are better than the other's; and VE's imagery coverage is more limited than GM's (probably due to this deal).

Tuesday, January 13, 2009

I just saw this blog post in rebuttal to Martin Fowler's DatabaseThaw article. It's a typically well-thought-out response from Typical Programmer. The short summary is that he rejects Fowler's criticism of using RDBs as the mechanism for application integration. (And don't dismiss him as simply a RDB bigot - he's not shy about nailing relational gurus, as in Relational Database Experts Jump The MapReduce Shark.)

TP's arguments ring true to me, although in my experience it takes a very conscientious DB shop to fully maintain the "encapsulation of data and operations" across many applications, many custodians, and many developers.

It would be great if Fowler's vision of the integration point being HTTP rather than SQL solved all our problems. Loose coupling is a wonderful thing to have. But so far I'm not seeing how this provides equivalent performance for all cases, and how it avoids the problem of data model mismatch.

Friday, January 9, 2009

Martin Fowler has a great bliki™ post on DatabaseThaw. The soundbite is something like "What happens after relational DBs?". He has some interesting links to things like Drizzle.

His opinion of relational DBs is that

their dominance is due less to their role in data management than their role in integration

This makes sense, but I think they also have offered the benefit of providing a fairly powerful, standard way of modelling and querying data. Over the lifetime of this technology this has fought off some powerful challengers, notably object-oriented data modelling. (Although perhaps this battle was only won by becoming more like the "loser"). The latest challenger is semantic data modelling (RDF, SPARQL, etc). Perhaps the world is ready to make this transition - although I'm not rushing to sell my Oracle stock.

He also makes a good point that the focus of integration is shifting from the DB to the Web. Hard to argue with this... the clouds are moving in!

Thursday, January 8, 2009

The GEOS team is working on porting the JTS CascadedPolygonUnion algorithm. Reasonably enough, they asked whether there were unit tests they could use to check the correctness of their code. I was a bit dismayed to realize that in fact there weren't any unit tests for this class (although having used the code extensively I'm pretty confident that it's correct).

So fine, some unit tests are needed. This should be easy, right? Just create or obtain some polygonal datasets, union them using both CascadedPolygonUnion (fast) and Geometry.union (slow), and compare the results.
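The harness itself really is only a few lines. A sketch (JTS names from the operation.union package; this needs the JTS jar, so treat it as illustrative):

```java
import java.util.List;

import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.operation.union.CascadedPolygonUnion;

public class UnionCompare {

    /** Union the polygons both ways; the two results should be "similar". */
    public static Geometry[] unionBothWays(List<Geometry> polys) {
        // fast: cascaded (tree-based) union
        Geometry fast = CascadedPolygonUnion.union(polys);

        // slow: naive incremental union, one polygon at a time
        Geometry slow = polys.get(0);
        for (int i = 1; i < polys.size(); i++)
            slow = slow.union(polys.get(i));

        return new Geometry[] { fast, slow };
    }
}
```

The hard part, as the next paragraphs explain, is the comparison step.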

But there's a catch (in geometric processing there usually is). Due to rounding effects, there's no guarantee that the results of the two algorithms will be vertex-by-vertex identical. Of course, they should be very, very similar... but how to check this?

In geometry processing it's usually much harder to test "similar" than it is to test "exact", and this case is no exception. In fact, there are many different tests that could be used, depending on how you want to define "similar". In the context of unit tests, similarity testing is essentially checking for discrepancies which might be introduced into a theoretically correct result. So the tests to use may depend on the operation(s) being performed, since different kinds of algorithms can produce different kinds of discrepancies. A further issue is detecting gross differences which might result from catastrophic algorithm failure.

So now the task is to write a SimilarityValidator for geometries. For polygons there are various characteristics which can be compared:

- A simple thing to do is to compare basic quantities like the area and perimeter length of the polygons. Paradoxically, these are good for detecting small discrepancies, but not for detecting gross ones (e.g. two identical rectangles which are offset by a large distance).

- One obvious thing to do is to compute the symmetricDifference of the polygons and inspect the area of the result. This has a couple of issues. One is the question of what area tolerance to use. A more serious issue is that the symmetric difference operation is itself an approximation to an exact answer. This is especially true in situations where the inputs are very close - which is exactly the situation we are trying to check! So there may be a certain lack of confidence that this is detecting all discrepancies.

- A more robust test is to compute the Hausdorff distance between the rings of the polygons. This allows expressing the tolerance in terms of distance, which is more intuitive. It's also fairly robust and easy to confirm the answer. There's a challenge in developing an efficient Hausdorff algorithm - but that's a topic for another blog post.

- Other tests might compare structural characteristics, such as the number of rings in the polygons.
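To make the ingredients concrete, here's a self-contained sketch of the area and Hausdorff checks over rings given as arrays of (x, y) points. The class and method names here are mine for illustration, not the JTS API - and note that this discrete Hausdorff distance only samples the vertices, so densifying the rings first gives a tighter bound:

```java
public class SimilarityValidator {

    // unsigned area via the shoelace formula (ring need not repeat its start point)
    public static double area(double[][] ring) {
        double s = 0;
        int n = ring.length;
        for (int i = 0; i < n; i++) {
            double[] p = ring[i];
            double[] q = ring[(i + 1) % n];
            s += p[0] * q[1] - q[0] * p[1];
        }
        return Math.abs(s) / 2;
    }

    // discrete Hausdorff distance: max over both directed distances
    public static double hausdorff(double[][] a, double[][] b) {
        return Math.max(directed(a, b), directed(b, a));
    }

    // directed distance: the farthest any point of a is from its nearest point of b
    private static double directed(double[][] a, double[][] b) {
        double max = 0;
        for (double[] p : a) {
            double min = Double.POSITIVE_INFINITY;
            for (double[] q : b)
                min = Math.min(min, Math.hypot(p[0] - q[0], p[1] - q[1]));
            max = Math.max(max, min);
        }
        return max;
    }

    // the distance-based similarity test: tolerance expressed as a distance
    public static boolean similar(double[][] a, double[][] b, double distTol) {
        return hausdorff(a, b) <= distTol;
    }
}
```

For example, a unit square and the same square shifted by 0.1 have a Hausdorff distance of exactly 0.1, which is easy to confirm by eye - the property that makes this test attractive.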

Hausdorff Distance between two polygons

With this validator written and some test datasets run I can finally confirm that CascadedPolygonUnion actually works!