Monthly Archives: April 2009

Geocode services are useful for spatializing many types of business data. There are numerous services to choose from, but business data is often riddled with mistypes and other issues that cause geocode failures. I was recently faced with a data set that suffered nearly a 30% failure rate from a common geocode batch service and decided to see what I could do with Virtual Earth’s geocode service.

Using web.config for setup parameters

In the process I was faced with passing parameters from my Web.config into the project and down to my Silverlight xaml to be used in the Page.cs code behind. Silverlight has only indirect access to information in the appSettings of web.config. I eventually stumbled across this approach:

Web.config: add a key/value pair to the appSettings

Default.aspx.cs: Use Page_Load to get the parameters and pass to Xaml1.InitParameters

App.cs: use initParams to get the parameters and send them to the Page.cs Page() method

Page.cs: in the Page.cs Page method set class variables with the parameters

1. Web.config
The web.config file can be used to store parameters in the appSettings section. These parameters are then available as application-wide settings. An additional advantage is the ability to change settings on the deployment server without a rebuild and redeploy each time.

2. Default.aspx.cs:
Use the Default.aspx.cs Page_Load(object sender, EventArgs e) method to set up the commonService and get the client token, commonService.GetClientToken(tokenSpec), for accessing the VEGeocodeWebService. Also pull the "geocodeLimit" parameter from the appSettings: ConfigurationManager.AppSettings["geocodeLimit"]. These two parameters, "clientToken" and "limit", are then set up as InitParameters for the asp:Silverlight Xaml page, ID="Xaml1". InitParameters is a string of comma-delimited key=value pairs. The end result is Xaml1.InitParameters = "clientToken=Your Token,limit=10".

At this point my VE token and geocodeLimit are available for use locally in my codebehind for Page.xaml

AddressLookup
Once that is settled I can move on to an address lookup using VEGeocodingService. The basic approach as seen in the initial Page view is a TextBox for input, a TextBox for output, a small menu for selecting a confidence factor, and a button to start the geocode when everything is ready. Underneath these controls is a VirtualEarth.MapControl. Addresses can be copy/pasted from text, csv, or xls files into the input TextBox. After geocoding the output TextBox can be copy/pasted back to a text file.

Once the geocode Click event is activated the confidence selection is added as a new ConfidenceFilter(). Next a geocodeRequest is set up, new GeocodeRequest(), using the input TextBox addresses, which are either tab or comma delimited, one per line. In addition to the address, a table ID code is included to keep track of the address. This id is passed into geocodeService.GeocodeAsync(geocodeRequest, id) so that I can use it in the callback to identify the result.

The callback, geocodeService_GeocodeCompleted(object sender, GeocodeCompletedEventArgs e), should have a GeocodeResponse with information about the matched addresses, any address modifications made by the parser, and the all-important latitude and longitude. GeocodeCompletedEventArgs also holds the address id as a UserState object which I can cast to string.

The result may have several “finds” in its Results array, but I choose to just look at the first “find” returned for each address. By adding an ellipse to my VE MapControl and changing the VEMap.View I can examine the location of the “find” to see if it makes sense. Since I’m somewhat familiar with Denver, the idea is to look at a result and decide if it is junk or not, at least in the more egregious cases. The results are also added to the Output TextBox.

The ToolTipService is a simple way to add rollover labeling to the address points, but perhaps a more complex rollover would be useful. Adding a couple of events, MouseEnter and MouseLeave, to the point shape allows any variety of rollover effect desired. Here is a simple set of events to change the fill brush:

The points are available for viewing as a shape layer on top of the VE MapControl, but what if I can see how the lat,lon location should change to make it more accurate? What I need is a drag capability to pick an address point and drag it to a new location.

This was a little more complicated than the simple rollover event. To start with I add a MouseLeftButtonDown event to the address ellipse. Normally you would also have a MouseMove and MouseLeftButtonUp event to help with the drag. However, VE MapControl complicates this approach masking move events with its pan event.

The solution that I found is to temporarily override VEMap.MousePan and VEMap.MouseLeftButtonUp with methods specific to the address point. The new point_Pan method sets its MapMouseEventArgs “Handled” property to “true”, preventing the event from bubbling up to the VEMap pan. The VEMap.ViewportPointToLocation is used to change the screen x,y point to a lat,lon location as the mouse moves across the map. Once the MouseLeftButtonUp fires we can drop our shape at the new location and handle some bookkeeping in the Output TextBox.

Leaving the fill color “Green” indicates that this point has been relocated. It is also important to remove the temporary Mouse event handlers so that Map pan will work again.

A method of passing Web.Config parameters into the xaml code behind for a Silverlight MapControl project.
This is useful for deployment on multiple servers with differing needs as well as locating important credentials in a single place.

Setting up a VEGeocodingService and making use of resulting locations in a Silverlight MapControl. Geocoding is a common need and the VE geocoding appears to be adequate. I noticed some strange behaviors when compared to Google. I was especially concerned that the parser seems to modify the address until it finds something that works, in fact anything that works, even locations in other states! The only response indicator that something was being modified is the MatchCodes[].

Looking at some event driven interaction with new shape layer objects. Event driven interaction is a big plus, with granular events down to individual geometry shapes a map can become a live form for selection and edits. This is why I find Silverlight so attractive. It duplicates the event driven capability pioneered a decade ago in the SVG spec.

Microsoft SQL Server 2008 introduces some spatial capabilities. Of course anytime Microsoft burps the world listens. It isn’t perhaps as mature as PostGIS, but it isn’t as expensive as Oracle either. Since it’s offered in a free Express version I wanted to give it a try, connecting the Silverlight MapControl CTP with some data in MS SQL Server 2008. Here is the reference for SQL Server 2008 with the new geography and geometry capabilities: Transact-SQL

To start with I needed some test data in .shp format.

The new Koordinates Beta site has a clean interface to some common vector data sources. I signed up for a free account and pulled down a version of the San Francisco Bay Area bus stops, bikeways, and transit stations in Geographic WGS 84 (EPSG:4326) coordinates as .shp format. Koordinates has built a very nice data interface and makes use of Amazon AWS to provide their web services. They include choices for format as well as a complete set of EPSG coordinate systems that make sense for the chosen data source. I selected Geographic WGS 84 (EPSG:4326) because one area SQL Server is still lacking is EPSG coordinate system support. EPSG:4326 is the one supported coordinate system in the Transact SQL geography and there is no transform function.

With data downloaded I can move on to import. Morten Nielsen’s SharpGIS has some easy-to-use tools for importing .shp files into the new MS SQL Server 2008 spatial. The Windows Shape2Sql.exe, like many Windows apps, is easier to use for one-at-a-time file loading with its nice GUI, at least compared to the PostgreSQL/PostGIS shp2pgsql.exe loader. However, batch loading might be a bit painful without cmd line and piping capability. SQL Server Spatial Tools

Fig 3 – Shape2Sql from sharpgis.net

Curiously, SQL Server offers two spatial data types: geography, a lat,long type, and geometry, a planar type. SRIDs can be associated with a record, but currently there is no transform capability. Hopefully this will be addressed in future versions. Looking ahead to some comparisons, I used Shape2SQL to load the Bay Area bus stops as geometry and then again as geography. In both cases I ended up with a field named “geom”, but as different data types. The new geo data types hold the spatial features, and the data load also creates a spatial index on the geom field. Once my data is loaded into MS SQL Server I need to take a look at it.

Using geom.STAsText() to show WKT, I can look inside my geography field, “geom”, and verify the loading:

SQL Server 2008 Management Studio includes a visual display tab as well as the normal row grid. The geography data type offers a choice of 4 global projections and affords some minimalistic zoom, along with a lat,long grid for verifying the data. The geometry data type, however, ignores projection; its spatial result offers a grid based on the x,y extents of the data selection. I didn’t find a way to show more than one table at a time in the spatial view.

Now that we have some spatial data loaded it’s time to see about hooking up to the new Silverlight MapControl CTP. The basic approach is to access the spatial data table through a service on the Web side. A service reference can then be added on the Silverlight side, where it can be picked up in the C# code behind for the MapControl xaml page. This is familiar to anyone using Java to feed a JavaScript client: the database queries happen server side in a servlet with jdbc, and the client uses the asynchronous query result callback to build the display view.

Continuing with the new Silverlight MapControl CTP, I took a look at this tutorial: Johannes Kebeck Blog

However, the new geo data types are not supported through the Linq designer yet. This means I wasn’t able to make use of Linq OR/M because of the geom fields. I then switched to an ADO.NET Data Service model, which appears to work, as it produces the ado web service and allows me to plug my GeoModel.edmx GeoEntities into the new auto-generated GeoDataService.svc like this: GeoDataService : DataService&lt;GeoEntities&gt;

Using the service endpoint call “http://localhost:51326/GeoDataService.svc/bay_area_transit_stations(1)” returns the correct entry, but minus the all-important geom field. Once again I’m thwarted by OR mapping, this time in EF. Examining the auto-generated GeoModel.designer.cs reveals that the geom field class is not generated. Perhaps this will change in future releases.

So neither Linq OR/M nor ADO.NET OR/M has caught up with the SQL Server 2008 data types yet. Apparently Linq is competing internally with ADO.NET EF; see Is LINQ to SQL Dead? I gather that going forward .NET 4.0 will emphasize Entity Framework, but as far as spatial data types go, the auto-generated code isn’t there yet.

Back to manual code. I can add a Silverlight-enabled WCF service to my GeoTest.Web that connects to the database and then pass the results back to Silverlight using a Data Service Reference. From that point I can use the result inside Page.cs to populate a MapLayer on the map interface with Ellipse shapes representing the transit stops or MapPolyline for Bikeways. I know there must be a more sophisticated approach to this process, but for simplicity I’ll just pass results back to the Silverlight page via strings.

Notice that the STIntersects query needs to use a matching SRID in the view bounds Polygon to get a result. I also noticed that using a geometry data type causes an inordinately long query time. In my test with Bus Stops up to 5 sec was required for a query that was only in the sub second range for a geography data type with the same STIntersects query. After looking over the statistics for the two versions, I could see that the geometry version loops through the entire 79801 records to retrieve 32 rows, while the geography version uses only 4487 loops to get the same result. Perhaps I'm missing something but the geometry index doesn't appear to be working!
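A sketch of how the Web-side service might assemble that query from the Silverlight view bounds. This is an illustration in Java, in the spirit of the servlet-with-jdbc pattern mentioned earlier; the table name and SELECT list are hypothetical, while the geography column “geom” and SRID 4326 follow the data loaded above:

```java
// Turn a view bounding box into an STIntersects query against the
// geography column. The WKT polygon must carry the same SRID (4326)
// as the stored data, or STIntersects returns nothing.
static String intersectsQuery(double minLon, double minLat,
                              double maxLon, double maxLat) {
    // Counterclockwise ring, closed back on the first vertex.
    String ring = String.format(java.util.Locale.US,
        "%1$.6f %2$.6f, %3$.6f %2$.6f, %3$.6f %4$.6f, %1$.6f %4$.6f, %1$.6f %2$.6f",
        minLon, minLat, maxLon, maxLat);
    return "SELECT name, geom.STAsText() FROM bus_stops WHERE "
         + "geom.STIntersects(geography::STGeomFromText("
         + "'POLYGON((" + ring + "))', 4326)) = 1";
}
```

Note that for the geography type the ring orientation matters (SQL Server 2008 keeps the interior to the left of the ring direction), which is why the vertices run counterclockwise here.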

This bit of code turned out to be very useful:
private SFBayDataServiceClient GetServiceClient()
{
    Uri uri = new Uri(HtmlPage.Document.DocumentUri,
        "SFBayDataService.svc");
    EndpointAddress address = new EndpointAddress(uri);
    return new SFBayDataServiceClient("*", address);
}

WCF services are easy to create and use in Visual Studio, however, deployment is a bit of a trick. The above code snippet ensures that the endpoint address follows you over to the deployment server. I have still not mastered WCF deployment and spent a good deal of trial and error time getting it working. Deployment turned out to be the most frustrating part of the process.

Now, I have a Silverlight map display with fully event driven geometry similar to SVG. ToolTips give automatic rollover info, but it is also possible to add mouse events to completely customize interaction with individual points, polylines, and polygons. I expect that this will result in more interesting event driven map overlays. Unfortunately, large numbers of features slow down map interactions so a better approach is to use a Geoserver WMS tile cache for zoom levels down to a reasonable level and then switch to shape elements.

Summary

It is still early in the game for SQL Server 2008 spatial. There are a few holes and version 1.0 is not really competitive with PostGIS in capability, but performance is decent using the geography data type. The combination of Silverlight MapControl and SQL Server spatial is good for tightly coupled webapps. However, these days OGC standards implemented by OGC servers make decoupled viewer/data stores very easy to develop. I would choose a decoupled architecture in general, unless there are constraints requiring use of Microsoft products. Future releases will likely add OR/M auto-generated code for geography and geometry data types. Also useful would be a complete multigeometry capability and general EPSG support.

The nice thing is the beauty of a Silverlight MapControl and the ability to completely customize overlays with event driven interaction at the client using C# instead of javascript.

This is aimed at creating a national building inventory rather than a terrain model, but still very interesting. The interior/exterior ‘as built’ aspect has some novelty to it. Again virtual and real worlds appear to be on a collision trajectory and may soon overlap in many interesting ways. How to make use of a vast BIM modeling infrastructure is an interesting question. The evolution of GE and VE moves inexorably toward a browser virtual mirror of the real world. Obviously it is useful to first responders such as fire and SWAT teams, but there may be some additional ramifications. Once this data is available will it be public? If so how could it be adapted to internet use?

Until recently 3D modeling in browsers was fairly limited. Google and Microsoft have both recently provided useful api/sdk libraries embeddable inside a browser. The terrain model in GE and VE is vast but still relatively coarse, and buildings were merely box facades in most cases. However, Johannes Kebeck’s blog points to a recent VE update which re-enabled building model interiors, allowing cameras to float through interior space. Following a link from Kebeck’s blog to Virtual Earth API Release Information, April 2009 reveals these little nuggets:

Digital Elevation Model data – The ability to supply custom DEM data, and have it automatically stitched into existing data.

Terrain Scaling – This works now.

Building Culling Value – Allows control of how many buildings are displayed, based on distance and size of the buildings.

Turn on/off street level navigation – Can turn off some of the special effects that occur when right next to the ground.

Both Google and Microsoft are furnishing modeling tools tied to their versions of virtual online worlds. Virtual Earth 3dvia technology preview appears to be Microsoft’s answer to Google Sketchup.
The race is on and shortly both will be views into a virtual world evolving along the lines of the gaming markets. But, outside of the big two, GE and VE, is there much hope of an Open Virtual World, OVM? Supposing this BIM data is available to the public is there likely to be an Open Street Map equivalent?

Fortunately GE and VE equivalent tools are available and evolving in the same direction. X3D is an interesting open standard scene graph schema awaiting better implementation. WPF is a Microsoft standard with some scene building capability which appears to be on track into Silverlight … eventually. NASA World Wind Java API lets developers build applet views and more complex Java Web Start applications which allow 3D visualizing. Lots of demos here: World Wind Demos. Blender.org may be overkill for BIM, but is an amazing modeling tool and all open source.

LiDAR Terrain

Certainly BIM models will find their way into browsers, but there needs to be a corresponding evolution of terrain modeling. BIM modeling is at sub-foot resolution while terrain is still at multi-meter resolution. I am hopeful that there will also be a national resource of LiDAR terrain data at sub-meter resolution, but this announcement gives no indication of that possibility.

I’m grappling with how to make use of the higher resolution in a web mapping app. GE doesn’t work well since you are only draping over their coarse terrain. Silverlight has no 3D yet. WPF could work for small areas like architectural site renderings, but it requires an xbap download similar to Java Web Start. X3D is interesting, but has no real browser presence. My ideal would be something like an X3D GeoElevationGrid LOD pyramid, which will not be available to browsers for some years yet. The new VE sdk release with its ability to add custom DEM looks very helpful. Of course 2D contours from LiDAR are a practical use in a web map environment until 3D makes more progress in the browser.

A national scale resource coming out of the “Stimulus” package for the US will obviously be absorbed into GE and VE terrain pyramids. But neither offers download access to the backing data, which is what the engineering world currently needs. What would be nice is a national coverage at some submeter resolution with the online tools to build selective areas with choice of projection, contour interval, tin, or raw point data. Besides, Architectural Engineering documents carry a legal burden that probably excludes them from online connectivity.

Not small figures for an Open Source community resource. Perhaps OVM wouldn’t have to actually host the data at community expense, if the government is kind enough to provide WCS exposure, OVM would only need to be another node in the process chain.

Further into the future, online virtual models of the real world appear helpful for robotic navigation. Having a mirrored model available short circuits the need to build a real time model in each autonomous robot. At some point a simple wireless connection provides all the information required for robot navigation in the real world with consequent simplification. This of course adds an unnerving twist to some of the old philosophical questions regarding perception. Does robotic perception based on a virtual model but disconnected from “the real” have something to say to the human experience?

One of the areas of interest using SimpleDB is the ability to add multiple attribute values. Here is the overview from Amazon’s Service Highlights.

“Flexible – With Amazon SimpleDB, it is not necessary to pre-define all of the data formats you will need to store; simply add new attributes to your Amazon SimpleDB data set when needed, and the system will automatically index your data accordingly. The ability to store structured data without first defining a schema provides developers with greater flexibility when building applications, and eliminates the need to re-factor an entire database as those applications evolve.”

As an extension to my previous blog post I decided to try adding alternative language names:

After adding several languages for ‘Albuquerque, New Mexico,’ I am able to display them as GeoRSS and then as tooltip text in the Virtual Earth API viewer.

I had to add an explicit character encoding to my http response like this: response.setCharacterEncoding("UTF-8");
Once that was done I could reliably get UTF-8 character strings in the GeoRSS xml returned to the viewer.
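For illustration, here is the byte-level reason the explicit encoding matters. The servlet plumbing is omitted; this just contrasts the two charsets involved:

```java
// A Latin-1 response decoded as UTF-8 mangles accented names: "Á" is
// one byte in ISO-8859-1 but two bytes in UTF-8, so the byte lengths
// diverge as soon as a name leaves ASCII.
static boolean utf8Differs(String s) {
    byte[] utf8 = s.getBytes(java.nio.charset.StandardCharsets.UTF_8);
    byte[] latin1 = s.getBytes(java.nio.charset.StandardCharsets.ISO_8859_1);
    return utf8.length != latin1.length;
}
```

A name like “Loma El Águila” differs between the two encodings, while its plain-ASCII variant “Loma El Aguila” does not, which is exactly why the accented GeoRSS strings arrived garbled until the response declared UTF-8.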

I am not multilingual, not even bilingual really, so where to go for alternate language translations? I had read about an interesting project over at Google: Language Tools

Here I could simply run a translate on my geoname for whatever languages are offered by Google. I cannot vouch for their accuracy, but I understand that Google has developed a statistically based language translation algorithm that can beat many if not all rule-based algorithms. It was developed by applying statistical pattern processing to very large sets of “Rosetta stone” type documents that had been previously translated. Because it is not rule based it avoids some of the early auto-translation pitfalls such as translating “hydraulic ram” as a “male water sheep.”

SimpleDB, with its free unstructured approach to adding attributes, lets me add any number of additional alternateNames attributes in whatever UTF-8 language character set I wish.

Although this works nicely for point features, more complex spatial features are unsuited to SimpleDB. The limits of 256 attributes per item and 1,024 bytes per attribute preclude arbitrary-length polyline or polygon geometry. Perhaps Amazon SimpleDB 2.0 will let attributes be arbitrary length, which means polyline and polygon geometries could be added along with a bbox for intersect queries.

Still it is an interesting approach for storing and viewing point data.

Amazon’s SimpleDB service is intriguing because it hints at the future of Cloud databases. Cloud databases need to be at least “tolerant of network partitions,” which leads inevitably to Werner Vogels’ “eventually consistent” cloud data. See my previous blog post on Cloud Data. Cloud data is moving toward the scalability horizon discovered by Google. Last week’s announcement on AWS, Elastic Map Reduce, is another indicator of moving down the road toward infinite scalability.

SimpleDB is an early adopter of data in the Cloud and is somewhat unlike the traditional RDBMS. My interest is how the SimpleDB data approach might be used in a GIS setting. Here is my experiment in a nutshell:

GeoNames.org is a creative commons attribution license collection of GNS, GNIS, and other named point resources with over 8 million names. Since SimpleDB beta allows a single domain to grow up to 10 GB, the experiment should fit comfortably even if I later want to extend it to all countries. Calculating a rough estimate for a name item uses this formula: raw byte size (GB) of all item IDs + 45 bytes per item + raw byte size (GB) of all attribute names + 45 bytes per attribute name + raw byte size (GB) of all attribute-value pairs + 45 bytes per attribute-value pair.

I chose a subset of 7 attributes from the GeoNames source <name, alternatenames, latitude, longitude, feature class, feature code, country code> leading to this rough estimate of storage space:

itemid: 7 + 45 = 52

attribute names: 73 + 7 × 45 = 388

attribute values (average): 85 + 7 × 45 = 400

total: 840 bytes per item × 8,000,000 = 6.72 GB
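The arithmetic above can be folded into a quick sketch, where the field sizes are the averages estimated from the GeoNames subset:

```java
// Amazon's estimate formula: each item ID, attribute name, and
// attribute-value pair adds 45 bytes of overhead to its raw size.
static long estimateBytes(int idBytes, int nameBytes, int valueBytes,
                          int attrCount, long itemCount) {
    long perItem = (idBytes + 45)                     // item ID + overhead
                 + (nameBytes + attrCount * 45)       // attribute names
                 + (valueBytes + attrCount * 45);     // attribute values
    return perItem * itemCount;
}
```

With the figures above, estimateBytes(7, 73, 85, 7, 8000000L) reproduces the 6.72 GB total.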

For experimental purposes I used just the Colombia tab-delimited names file. There are 57,714 records in the Colombia names file, CO.txt, which should be less than 50 MB. I chose a Spanish-language country to check that the UTF-8 encoding worked properly. A sample record looks like this:
2593108||Loma El Águila||Loma El Aguila||||5.8011111||7.2833333||T||HLL||CO||||36||||||||0||||151||America/Bogota||2002-02-25

I ran across this very “simple” SimpleDB code: ‘Simple’ SimpleDB code in a single Java file/class (240 lines), which Alan Williamson had enhanced to add Map collections for the Put and Get Attribute commands. I had to make some minor changes to allow multiple duplicate key entries in the HashMap collections. I wanted the capability of using multiple “name” attributes to accommodate alternate names, and eventually alternate translations of names, so Map&lt;String, ArrayList&gt; replaces Map&lt;String, String&gt;.

However, once I got into my experiment a bit I realized the limitations of url-encoded Get calls prevented loading the UTF-8 character set found in Colombia’s Spanish-language names. I ended up reverting to the Java version of Amazon’s SimpleDB sample library. I ran into some problems since the sample library referenced jaxb-api.jar 2.1 while my local version of Tomcat used an older 2.0 version. I tried some of the suggestions for adding jaxb-api.jar to the /lib/endorsed subdirectory, but in the end simply upgrading to the latest version of Tomcat, 6.0.18, fixed my version problems.

One of the more severe limitations of SimpleDB is its single type, “String.” To be of any use in a GIS application I need to do bounding box queries on latitude,longitude. The “String” type limitation carries over to queries, restricting them to lexicographic ordering (see: SimpleDB numeric encoding for lexicographic ordering). In order to do a bounding box query under lexicographic ordering we have to do some work on the latitude and longitude. AmazonSimpleDBUtil includes some useful utilities for dealing with float numbers:
String encodeRealNumberRange(float number, int maxDigitsLeft, int maxDigitsRight, int offsetValue)
float decodeRealNumberRangeFloat(String value, int maxDigitsRight, int offsetValue)

Using maxDigitsLeft 3, maxDigitsRight 7, along with offset 90 for latitude and offset 180 for longitude, encodes the lat,lon pair (1.53952, -72.313633) as (“0915395200”, “1076863670”). Basically this moves a float into positive integer space and zero-fills left and right so the results fit lexicographic ordering.

Now we can use a query that selects by bounding box even with the limitation of lexicographic ordering. For example, Bbox(-76.310031, 3.889343, -76.285419, 3.914497) translates to this query:
Select * From GeoNames Where longitude &gt; “1036899690” and longitude &lt; “1037145810” and latitude &gt; “0938893430” and latitude &lt; “0939144970”
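The encoding and the query builder can be sketched together in a few lines. This is a hypothetical re-implementation of the AmazonSimpleDBUtil behavior described above, using double rather than float:

```java
// Shift into positive space, scale out the fraction, and zero-pad so
// lexicographic string order matches numeric order.
static String encodeRealNumberRange(double number, int maxDigitsLeft,
                                    int maxDigitsRight, int offset) {
    long shifted = Math.round((number + offset) * Math.pow(10, maxDigitsRight));
    return String.format("%0" + (maxDigitsLeft + maxDigitsRight) + "d", shifted);
}

// Reverse the encoding to recover the original coordinate.
static double decodeRealNumberRange(String value, int maxDigitsRight, int offset) {
    return Long.parseLong(value) / Math.pow(10, maxDigitsRight) - offset;
}

// Build the bounding-box select over the encoded latitude/longitude.
static String bboxQuery(double minLon, double minLat, double maxLon, double maxLat) {
    return "Select * From GeoNames Where longitude > \""
        + encodeRealNumberRange(minLon, 3, 7, 180) + "\" and longitude < \""
        + encodeRealNumberRange(maxLon, 3, 7, 180) + "\" and latitude > \""
        + encodeRealNumberRange(minLat, 3, 7, 90) + "\" and latitude < \""
        + encodeRealNumberRange(maxLat, 3, 7, 90) + "\"";
}
```

Calling bboxQuery(-76.310031, 3.889343, -76.285419, 3.914497) reproduces the encoded select shown above.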

Once we can select by an area of interest, what is the best way to make our selection available? GeoRSS is a fairly simple XML feed consumed by a number of map viewers, including VE and OpenLayers. Simple-format point entries look like this: &lt;georss:point&gt;45.256 -71.92&lt;/georss:point&gt;. So we just need an endpoint that will query our GeoNames domain for a bbox and then use the result to create a GeoRSS feed.
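A minimal sketch of the output side of such an endpoint, with the servlet plumbing omitted; element names beyond georss:point are illustrative:

```java
// Build one GeoRSS Simple item; the <georss:point> element is the part
// the map viewers strictly need.
static String toGeoRssItem(String title, double lat, double lon) {
    return "<item><title>" + title + "</title>"
         + "<georss:point>" + lat + " " + lon + "</georss:point></item>";
}

// Wrap the items in a minimal RSS 2.0 channel carrying the georss namespace.
static String geoRssFeed(java.util.List<String> items) {
    StringBuilder sb = new StringBuilder();
    sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>")
      .append("<rss version=\"2.0\" xmlns:georss=\"http://www.georss.org/georss\">")
      .append("<channel><title>GeoNames bbox result</title>");
    for (String item : items) sb.append(item);
    sb.append("</channel></rss>");
    return sb.toString();
}
```

In the servlet this string would be written to the response after setting the UTF-8 character encoding, with each item built from a decoded SimpleDB result.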

There seems to be some confusion about the GeoRSS mime type: application/xml, text/xml, application/rss+xml, and even application/georss+xml all show up in a brief Google search. In the end I used a Virtual Earth api viewer to consume the GeoRSS results, which isn’t exactly known for caring about header content anyway. I worked for a while trying to get the GeoRSS acceptable to OpenLayers.Layer.GeoRSS but never succeeded. It easily accepted static .xml endpoints, but I was never able to get a dynamic servlet endpoint to work. I probably didn’t find the correct mime type.

This example servlet makes use of the NextToken to extend the query results past the 5 second limit. There is also a limit to the number of markers that can be added in the VE sdk. From the Amazon website: “Since Amazon SimpleDB is designed for real-time applications and is optimized for those use cases, query execution time is limited to 5 seconds. However, when using the Select API, SimpleDB will return the partial result set accumulated at the 5 second mark together with a NextToken to restart precisely from the point previously reached, until the full result set has been returned.”

I wonder if the “5 seconds” indicated in the Amazon quote is correct, as none of my queries seemed to take that long even with multiple nextTokens.
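The shape of the NextToken loop can be sketched independently of the client library. Here `select` is a hypothetical stand-in for the SimpleDB SelectRequest call, returning a page of items plus the token, or null when the result set is complete:

```java
// Accumulate a full result set by reissuing the same select, passing
// back each returned NextToken, until no token comes back.
static java.util.List<String> selectAll(
        java.util.function.BiFunction<String, String,
            java.util.Map.Entry<java.util.List<String>, String>> select,
        String query) {
    java.util.List<String> all = new java.util.ArrayList<>();
    String token = null;
    do {
        java.util.Map.Entry<java.util.List<String>, String> page =
            select.apply(query, token);   // token is null on the first call
        all.addAll(page.getKey());        // items in this page
        token = page.getValue();          // NextToken, or null when done
    } while (token != null);
    return all;
}
```

The real client call goes where `select.apply` sits; everything else is just the restart-from-token bookkeeping.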

SimpleDB can be used for bounding box queries. The response times are reasonable even with the restriction of the String-only type and multiple NextToken SelectRequest calls. Of course this is only a 57,000-item domain; I’d be curious to see a plot of domain size vs query response. Obviously at this stage SimpleDB is not a replacement for a geospatial database like PostGIS, but this experiment does illustrate the ability to use SimpleDB for some elementary spatial queries. The approach could be extended to arbitrary geometry by storing a bounding box for lines or polygons stored as SimpleDB items. By adding additional attributes for llx,lly,urx,ury in lexicographically encoded format, an arbitrary bbox selection could return all types of geometry intersecting the selection bbox.