Tuesday, August 4, 2009

I have previously discussed Netezza, who produce data warehousing appliances that provide outstanding performance and simplicity for complex analytics on very large data volumes. I did some consulting work with them last year as they added spatial capabilities to their system. Today they announced a major new architecture, which they say gives a 3-5x performance improvement for typical workloads (more for some operations, less for others), and reduces price per terabyte by a factor of 3. So overall price performance improves by a factor of 10-15. Database guru Curt Monash has a good discussion of the new architecture and pricing implications on his blog.
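As a back-of-envelope check on how the headline number combines (my own sketch, assuming the price/performance multiplier is simply the speedup times the price reduction factor):

```python
# Back-of-envelope check of the combined price/performance claim.
# Assumption: overall price/performance = performance speedup x price reduction.
perf_speedup = (3, 5)   # claimed 3-5x performance improvement
price_reduction = 3     # claimed 3x lower price per terabyte

combined = tuple(s * price_reduction for s in perf_speedup)
print(combined)  # (9, 15) - roughly the 10-15x figure quoted
```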

The new hardware architecture is more flexible than the old one, making it easier to vary the proportions of processor, memory and disk. This will allow them to offer additional product families in future:

High storage (more disk, lower cost per terabyte, lower throughput)

High throughput (higher cost per terabyte but faster)

Entry level models

I think that the entry level and high throughput models will be especially interesting for geospatial applications, many of which could do interesting analytics with a Netezza appliance, but may not have the super large data volumes (multiple terabytes) that Netezza's business intelligence customers have. Another interesting change for the future is that Netezza's parallel processing units (now Snippet blades, or S-blades, formerly snippet processing units, or SPUs) are now running Linux, whereas previously they ran a rather more obscure operating system called Nucleus. In future, this should make it easier to port existing analytic applications to take advantage of Netezza's highly parallel architecture (though this capability is not available yet). The parallel processing units also do floating point operations in hardware rather than software, which should be a significant performance benefit for their spatial capabilities.

I continue to think that Netezza offers some very interesting capabilities for users wanting to do high end geospatial analytic applications on very large data volumes, and that there will be a lot of scope for its use in analyzing historical location data generated by GPS and other location sensors. And I am just impressed by anyone who produces an overnight 10-15x price performance improvement in any product :) !


Peter Batty
