Thursday, June 5. 2014

Imagine you were a Software Developer

Did you ever come across a situation where the data you were working with was wrong? It just does not comply with the specs. It is a complete mess. And the client does not really understand the implications because they only looked at it through some odd software (from the ugly competitor, obviously). The meta of the data sits in the minds of so many people but has never been organized and fixed (to fit into our software). Oh, maybe there are some ugly old UML diagrams that have been glued into a PDF as JPG images after resizing them beyond legibility. And the latest version you have is beta-something from eons ago.

Imagine you were a Data Monger

Did you ever come across a situation where the software just did not want to eat your data? First the encoding is wrong, then it needs a carriage return AND line feed instead of just a \n. Then it imports the core column which contains all the float values you really need as text strings. After some time you find out how to "fix" this. Until several days later you notice that the import routine cut off all the decimals in one data set. Because the data came from a German colleague and the import interpreted the commas as thousands separators instead of decimal marks (darn German encoding). Sigh.
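To make the decimal comma trap concrete, here is a minimal Python sketch of an import step that does not guess. It assumes the sender tells us which decimal mark the file uses. The function names (parse_number, read_lines) are made up for illustration and do not come from any particular import tool.

```python
from decimal import Decimal, InvalidOperation


def parse_number(text: str, decimal_mark: str = ".") -> Decimal:
    """Parse a numeric string whose decimal mark is stated, not guessed."""
    if decimal_mark not in (".", ","):
        raise ValueError("decimal_mark must be '.' or ','")
    # Whichever mark is not the decimal mark is treated as a grouping mark.
    group_mark = "," if decimal_mark == "." else "."
    cleaned = text.strip().replace(group_mark, "").replace(decimal_mark, ".")
    try:
        return Decimal(cleaned)
    except InvalidOperation as exc:
        raise ValueError(f"not a number: {text!r}") from exc


def read_lines(raw: bytes, encoding: str = "utf-8") -> list[str]:
    """Decode with an explicit encoding and accept LF, CRLF or CR line ends."""
    return raw.decode(encoding).splitlines()


if __name__ == "__main__":
    # The same characters, read under two different conventions.
    print(parse_number("1.234,56", decimal_mark=","))
    print(parse_number("1,234.56", decimal_mark="."))
    print(read_lines(b"easting;northing\r\n3,14;2,71\n"))
```

The point of the sketch is that the convention travels with the data as an explicit parameter, so the software adapts to the file and not the other way round.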

Who is Right?

Obviously both are right. But their perspectives are different - and for some reason the software developer often "wins" whilst the data monger should.
For us software engineers data is a nuisance. It never works, stubbornly refuses to "flow" into the software. And because the software is (obviously) right the data must be wrong. The good thing is that data can be "corrected", tweaked, bent and made to work.
For us data mongers the software appears to be wrong. Initially. But then the software engineers tell us otherwise plus the stack was dead expensive, looks awesomely professional and others seem to get by with it. So we start to wonder and then we start to tweak. Make the data fit into the software.

Wrongdoings!

And this is where it all goes wrong! Because software comes and goes. Data stays. Especially geospatial data. It may get old and become a historic document. Which you need badly later on when you want to show how things change over time. When you need to document that change. Software instead gets outdated and falls out of use. Data grows and becomes more complex. This is not nice but it is a reality. Deal with it. Never ever even think that you have to make the data work with your software. The software has to deal with your data. Or die. It is like trying to wear shoes three sizes too small. Well, you can cut off your toes...

Right-doings!

The data is where the money sits. This is a simple reality. Software should never be more than a thin veneer. If the data is a brick then the software is the mortar. Did you ever see a mason cut down every single brick by the thickness of the mortar?
If a tire and the rim are the data then the software is the tire mounting lube. No more. Would you cut up and reweld the rim so that the lube will help the tire slip on?
Software is nothing but a soap bubble on the rim. It must be dead easy to wash off and exchange for another brand. Because Data is important. Software is not. Software comes and goes.

Appeal to us Software Engineers

Please let us collectively get down off our high horses and accept that we are humble servants to the Data. The Data is always right. It cannot be wrong. It is Data. All that is wrong is our attitude.

Have fun,
Eight

Part II of this rant (link forthcoming) will describe some nasty effects resulting from sticking with software that prescribes what you have to do.

Sunday, March 24. 2013

Er,
I am proud to have been awarded the Best Presentation Award for my keynote at the AGI South West Conference which took place in Bristol last Thursday in a hall with rock stars left and right. Me right in the middle – speaking about geospatial openness? How cool is that.

Wrong!

Now, what's wrong with this? Everything. First off the presentations were voted for by the audience. Which is good. But there were a dozen presentations in two tracks and only two keynotes in the plenary. Which means that most presentations were only seen by more or less half the audience. That roughly doubles a keynote's chance of winning.

What else is wrong? Pride is always wrong. Yes, many will disagree but please bear with me and let me explain. Pride in and of itself is a problematic emotion (or shall we better say "state of mind"). Why that? Because pride locks us into what we did and prevents us from learning. It closes our mind to other things. From Wikipedia: "...pride refers to an inflated sense of one's personal status or accomplishments...". Wikipedia hurries on to relativize that pride can also have a positive connotation like a "...satisfied sense of attachment..." which is again wrong because attachment is bad. Attachment means clinging to something. But in a world where everything changes clinging to something will eventually make us suffer because we will lose what we have clung to. Everything changes all the time and that's a fact. So even from an egoistic perspective pride and attachment are no good. They will eventually only hurt us.

What else? Well, to be honest, in the past decade or so I have given twentysomething keynotes, conducted 50 workshops and given 200 regular presentations before audiences of between 10 and 2500 people. This admittedly gives me some expertise. So one reason I was invited to give a keynote probably was that I do that fairly well. (Ugh, can you see the pride rise again? It is a disgustingly intrusive mental state).

Anything else wrong? Oh yes, lots more. For example, please tell me what is "the best"? We live in a culture that has completely focused itself on a monotheistic notion of "there can only be one". The Highlander Principle. Hence all the monopolies. But this is crap. A big mental turd. There are several billion people around. How can any one of those be the best? Monopolies are bad, no matter from where and about what.

Be Real

So let's be real for a change. This was nicely put in words by the authors of a recent mail to the members of GITA. Among other things they list future tasks they will offer and say: "...these will not be 'this is best' types of reviews, but rather analysis for completeness, accuracy and similar...". Aha. Well done. They split "best" up into an analysis of several aspects of "best". Which makes a lot more sense.

Let's check my talk for completeness. It presented three aspects of openness relevant to standards, software and data. How on earth can anybody talking 35 minutes be complete and do justice to all three aspects? It is impossible.

Next is accuracy. To be honest - my talks are typically highly inaccurate (from a scientific one-truth-only point of view). Because there are always many truths. One blatant example: I am known to postulate that only x% of revenue generated in IT comes out of selling proprietary licenses (to make a point why Open Source is perfectly viable from a business perspective). And to prove it I cite myself in a FAO and FIG Commission paper I published years ago, where I cited Bruce Perens (another Open Source rock star) who never ever said that this is a fixed x% of anything. Because this figure is highly opaque. Nobody can tell how much cost can really be attributed to setting up and running an IT system. How high is the cost of women power required to transform data to fit into the newest version of software? How much revenue goes down the drain because the transition into the new system was delayed by x months? Forget it. Does not compute.

David Overton of dbyhundred giving a keynote about SplashMaps: Washable, wearable, waterproof maps for the Real Outdoors.

Still - I invariably replace x by a number. Because otherwise it would look wishy-washy. And I even change that number arbitrarily. In my presentation at the Asia Geospatial Conference in 2011 (which btw also received a "Best" award) I said that the number is 5% because obviously in Asia license revenue is way lower than around here. The street market price for the newest version of Adobe Illustrator is US$ 1.50. After the talk people said that the cost for licenses is much lower in any given system, probably in the range of 1%.
At the last Cambridge conference two years ago I arbitrarily raised the percentage to 20 because the audience consisted of the directors of European National Mapping Agencies (who spend millions each year for Oracle licenses and I did not want to make them feel too stupid). After my talk a friend employed by esri told me that this figure is completely wrong and that his projects typically have a margin of 40% or even 60% coming in from licensing costs. And he is probably right. He is a sales guy and makes a living off selling licenses. Please pray with me that this share will always stay that high because this is how he pays his rent.
And he is a really nice guy.

In summary - I am a liar. Couldn't be worse. Accurate? Yeah, see you On The Highway To Hell. Blamblamm.

Everything is Relative (ask Einstein)

OK. Coming down again, rant mode ends. Why do I say all this? Because I strongly believe that there is a reason for diversity. Nobody is best. Nothing is best. But we all do the best we can (well, yeah, I am a hopeless case of seeing the best in people all the time) and everything is the best FOR A CERTAIN ASPECT. So don't go for the award winning software xyz but look at the problem at hand and then apply the best system suited to solve it. Or - better even - go to that guy you met at the last conference and ask her what she would do.

The AGI South West Conference

Now. The AGI conference organizers are cool. They know all of this. Which is why they had a second best award. Not like in "2nd place winner" but "also best". Which makes so much more sense. This award went to Clare Hubbard from the UK MetOffice presenting the "Lessons learned and challenges ahead" with their Datapoint program. This was a complete (as complete as you can be in 30 minutes) and accurate account of a really great project and if you have anything to do with open data please check it out. I will try to find a link to her presentation and post it here in the comments.

Clare Hubbard from the MetOffice UK gives an introduction to Open Data criteria.

And there was this other presentation by Tony Bush from DEFRA about the Air Quality Datasets which really drove home how important it is to release data openly and how difficult it is at the same time. And I loved Anthony Perkins' update from the Environment Agency Hackathon – an event organized together with Ordnance Survey's Geovation program to enable developer communities to make use of open data. Playful. Easy. We are only starting to understand what we can do. Just check this completely oblique app to get a taste: http://penguin.hodgetastic.com/ This is not meant seriously and is not about accurate data but it is about getting people intrigued and interested in their environment. It is best. No questions asked.

Phew. Towards the end I managed to include some geospatial aspects. This is just to justify that the OSGeo planet continues to reference my little blog.

Generic Geospatial Open Source

Oh, by the way, my presentation is probably the most concise and accurate summary of openness with respect to standards, software and data I ever managed to get across in thirtysomething minutes. Please feel free to reuse and propagate it wherever you want to enlighten people about openness. Downloads are available in my Publications section as editable ODP and PDF, and online at Slideshare. Maybe I should also export some of the diagrams as images for use in other media, they tend to fall apart in different versions of LibreOffice, wasOpenOffice and WeakPoint. The video version unfortunately has bad audio and cuts off after 29:59 due to a limitation of my camera so I did not upload it. Should I anyway? I like to see myself talking about this stuff, it is usually entertaining to watch. Popcorn entertainment for nerds. He.

Yikes. Did you see it? There it was again, Ugly Dr. Pride sitting on my left shoulder and prompting me to promote my ego.

Sunday, January 13. 2013

Recently I talked with the CTO of a major systems provider. At one point I used the term "generic" together with "software development" and wanted to carry on saying that Open Source also does this and so on. But I never got there. He stopped me and said that I should please never use the term "generic" again. As he carried on he could barely contain his anger, not at me but at the rest of the world of software development, sort of. He had heard the promise of generic software so often and it never worked out, that he just got tired of it. He explained that especially in the Web context he explicitly goes down another route these days. He wants a one-off development that will do just exactly what he needs for two or three years. Then he will throw it away and get something new. It is always a one-off, never more. He does not want any documentation, no upgrade plan, no support, he wants a running system. Full stop. That was news to me but it got me thinking.

The Lifetime of Geodata

This is OpenData from the Ordnance Survey. It was collected in the early 70s to create contour lines for topographic maps. The maps were later scanned, digitized and stored as DXF contour lines and as ASCII grid files. This image was created from the ASCII files using GDAL.

Now, especially in the geospatial domain things are a bit different. Geodata has a very very long lifetime. Not all but some geodata almost seems to live forever. Be it satellite images that need to be archived - and then read a decade later, or cadastral data that evolves as people buy and sell land, cut up land lots and join them up. It is almost a living system. In many cases the lifetime of the data we work with survives our employment. The data was there when we started and it will be there when we leave. Some data will outlast our own lifetime. Ugh. Face it.

The Lifetime of Software

So how do these two things go together? The lifetime of (certain) software becomes shorter and shorter while that of data extends practically endlessly? Right now it seems to not go together at all. One of the primary reasons is that software developers tend to perceive data as a nuisance. Something that gets in the way of great software design. It has to be taken into consideration but it is not fun. It slows innovation and takes up resources. Go away.

Software Developers are bad Data Designers

What is the source of this problem? It should be blatantly obvious by now. Typically the people implementing software are software developers. Did you ever meet a data developer? No? Me neither. And if you did, are you sure it really was a data developer? I bet it probably only was a failed software developer. Or someone infected with an architecture design, be it Client/Server, SOA, Web 2.0, RESTful, API or whatever. What do they have in common? They think in terms of the software. The only recent exception may be resource orientation or the ROA (Resource Oriented Architecture). But do you read anything much about this anywhere?

Metadata Anybody?

A piece of metadata. Useless but compliant.

Boohoo! Boring! Dusty! Go away! This is because we lack a data culture. There is a disconnect between the needs of the data and the people who could actually provide for them. Software developers think in terms of their software. Architects think in terms of scalability, security, maintenance - of the software. Application designers think in terms of user experience. And data - is just there. A never ending nuisance full of (real world) errors and never quite finished. Ever evolving. Painful for someone who wants to design a generic software.

This is also the reason why we have no metadata. Anybody really in love with their data would maintain its lineage. It would be beautiful metadata. Not pointy brackets we have to fill with predefined content like INSPIRE wants us to do. We would document where the data came from, what happened to it, who worked on it, what the accuracy is and so on. But nobody cares about this. Because you need a pretty technical background to understand how to do it. And the people with technical background usually come from a software perspective. It is a deadlock situation. Praise Open Data if only to bring out this problem into bright sunlight for everybody to see.
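What such a lineage record could look like, as a minimal Python sketch: where the data came from, what happened to it, who did it, and how accurate it is. All class and field names here are my own illustrative assumptions and do not follow INSPIRE or any other metadata standard. The example values are invented.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LineageStep:
    """One thing that was done to the data."""
    when: date
    who: str
    what: str


@dataclass
class LineageRecord:
    """Where a dataset came from and what has happened to it since."""
    dataset: str
    source: str
    accuracy_note: str
    steps: list[LineageStep] = field(default_factory=list)

    def log(self, when: date, who: str, what: str) -> None:
        """Append a processing step; nothing is ever overwritten."""
        self.steps.append(LineageStep(when, who, what))

    def history(self) -> str:
        """Render the lineage as plain text that a human can read."""
        lines = [f"{self.dataset} (from {self.source}; accuracy: {self.accuracy_note})"]
        lines += [f"  {s.when.isoformat()}  {s.who}: {s.what}" for s in self.steps]
        return "\n".join(lines)


if __name__ == "__main__":
    # Invented example values, for illustration only.
    record = LineageRecord("contours", "scanned paper maps", "not stated")
    record.log(date(2013, 1, 13), "a data monger", "converted DXF lines to an ASCII grid")
    print(record.history())
```

The design choice is that the record is append-only and travels with the data, independent of whichever software touched it.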

How to Proceed?

We developed a few ideas during my recent stint at the Ordnance Survey. Thanks again to everybody who I worked with there for a very educative and insightful time! What we did is take a piece of geospatial data and track it down throughout the organization. Ignore the software. Just follow the data. And then look at how this path can be improved. From the perspective of the data. Amazingly it can be improved by simply removing half the software. And by maintaining a record of what has been done to the data as it travels through the organization. And by talking to the people who did the little dirty tricks to the data to make it work with "their" software. Did you hear that? The things that were done to the data to make it work with the software. This is simply wrong. It should be the other way round. You should never, ever change your data according to the specs of a software. Data must come first, because:

Friday, January 11. 2013

Folks,
this is yet another irrelevant blog from my little life in the UK. Brits, stop reading, you're not gonna like it. US of Americans will not like it either.

I cannot believe how backward this country is wrt identity management. Yes, this UK show-off monarchy thing, Great Britain, whatever you call it. It is just a dreary little island off to the west of Europe but still believes it is the center of the world. Ha, told you that you would not like it.

Banking

My was-to-be future bank let me know that I need a utility bill as proof of address. Please what? Yes. In the UK you prove that you actually exist through a utility bill. Utilities are the folks who give you water and take away your shit, right? Well, makes sense, everybody has to shit and it is not allowed in the woods anymore. So having a bill with them sort of proves that you are somewhere. Sort of. Fine. But I have no utility bill because gas, water, shit removal is graciously part of my rent. But my rental agreement is no go. I have no bank account in the UK because I don't need one. I have a bank. In Germany. And money is mostly bits in a wire anyway. So what the fuck do you want? But even if I wanted to get a bank account in the UK I would not get it because:

I have no bank account to "prove" that I live in the UK.

I have no utility bill to "prove" that I live in the UK.

Identity Management

This appears to be an unresolvable deadlock and yes - it is. So they suggested I prove my address in Germany. Easy. I have an ID card. In Germany we have a registration office where you have to register yourself when you move to another address. You get stuck in prison if you don't. Not nice but effective. It's been like this for around 2000 years - not in Germany but in other developed places for example in Bethlehem. It is part of the legacy of the largest and oldest corporation of the world and documented in its bylaws (the Bible). But no, the UK bank will not accept my German ID card. Please what? You don't trust a government issued ID card but do trust a shit remover?

Excuse my strong language but it is just not acceptable that the UK does not have an official registration. Not at all. Never. We live in 2012 and have had digital identity management for 30(!) years. We live in Europe and there are international registers maintained by insurance companies, the police, the military, the secret service and what not else and here - in the UK - you would rather trust - a utility bill? Can you hear the incredulity in my voice? I will raise a petition to remove the UK from Europe (they'll go bankrupt soon anyway and we have enough to pay for Greece, Portugal, Spain and the likes - but they are at least nice holiday destinations).

But my utilities provider in Germany was very helpful and they sent me a paper bill (I have been doing this online for six years but they do still have a printer somewhere). I get an annual bill, and have for the 15 years that I have lived there. The last one is from April 2012. Which is more than 6 months ago. Therefore the bank does not accept it, it may not be older than 6 months. So I get no bank account in the UK. And by now I don't want one anymore because I am moving on anyway.

Freedom and Independence

Why can't the Brits get around to setting up a registry? Because they declare that they want to be independent and uncontrolled and have their "freedom". They abhor control by the government. Come off it, the Isles are the most paranoid place you can imagine. Control. Everywhere. Every car number plate is identified on every road every day. There are CCTVs everywhere. But still, if it is the government that wants to provide an identity service - it is rejected. I can understand when this happens in the USA where they still believe that shooting each other is a great way of solving problems. And where you identify with a credit card and a driving license. Please! Come on. I thought that Greatish Britain was a bit more advanced than their hunchbacked cousins in the US of A. Wrong. They are their ancestors.

I can see it coming, people will eventually authenticate through Google and Facebook to get a bank account and buy a house or shoot someone. Altogether nice. I'm lovin' it.

Ah, another one now that I am already rambling: My MasterCard works everywhere in the world but in the UK where it will not be accepted in half the shops and fuel stations I have tested so far. And I really mean it when I say that I use it everywhere, from Argentina to Zimbabwe, and yes, I travel all the time.

Bordering Edges

Living on an island is a big hardship, really. Be it the UK or the USA. You all have my compassion. You know what your problem is? You don't have borders. Edges is where things happen. Joining different thinking makes you smart. You know what happens to frogs that sit in a pot and you slowly raise the temperature? They will smugly sit in their bath until they are too hot and weak to hop out. They die. Well then, we will all die eventually, so what's the point? Don't be smug and hop out every now and then. Out there is a great world to explore. And through Facebook you only see the Polaroid version of it. Ugh, I am getting old. Who knows what a Polaroid is...

Wednesday, November 28. 2012

Disclaimer: This is barefaced advertisement. Go away now if you don't like ads. You could go here or here or there instead. You have been warned!

Close up of a Splashmap

The Idea

Splashmaps? So what is this all about? It is a project that a friend of mine started over a year ago and which I followed along with moderate interest until at one point I started to understand that this is actually a cool way of making good use of Open Data. And even making it physical as a printed map! Can you believe it? Me, with a print-out! How last century is this? So anachronistic that it is probably cool again.

Besides being based on Open Data from two very different sources another novelty is that the outcome is printed onto fabric. It is all geared towards the outdoors. We started off conceiving this to be a nice addition for mountain bikers and walkers in the national forests of Great Britain - but this was only a starting point. As we proceeded people came up with more and more ideas. This new way of producing high quality on demand prints turned out to be useful in many contexts.

More Ideas

One of our friends pointed out that this is ideal for horseback riding. Paper maps tend to make noises that will irritate horses and once irritated a horse will not stop bucking until the paper map is gone, typically together with the rider.

But then it becomes more radical. A colleague from the World Bank complained that the maps they distribute in the tropics tend to have a short life time because they simply fall apart in paper-adverse weather conditions. And then there is a need for heavy duty "analog" mapping in emergency situations with pouring rain and storms. Can you imagine relief mapping with the most up-to-date OpenStreetMap data in disaster areas? That wasn't my idea either.

How to get there?

These are all things which we did not consider at the outset and cannot promise will actually work out but they all make a lot of sense to me. So if you think this is interesting and you would like to contribute, please consider helping us get the seed funding at Kickstarter.

We are still short of quite a bit. Ugh - how I don't like doing this... I feel like Wikipedia asking for funds but with way less Moral Authority.

The Geodata Stack

Now, for the geeks among you - what is it all made of? My CEO will probably crucify me for sharing all the details but hey, this is not really bleeding edge. The Open Data is up for grabs by anybody and the tech stack is also a rather boring standard set-up. But I will share some background anyway.

We take OpenSpace data from the Ordnance Survey and apply our own styling. One big feature of the product is the option to style the maps in our very own way. Up to now we have been pretty conservative, but I can already imagine versions with reflecting colors for night shifts and ultraviolet colors for hidden spots in secret maps, and... Yeah, well, we are not quite there yet. But if there is interest we will explore it - once we have got this off the ground.

Next we export foot and bridle paths (and some other secret ingredients) from OpenStreetMap and add them as an overlay. Once you buy a map we will be glad to give you the source data used to create it. Why? Because we believe in sharing and - obviously - as per ODbL you are entitled to get the data anyway. But not the Ordnance Survey Open Data because that comes in a different database and is thus not affected by the ODbL. ((This was just a hint for all those who still believe that you cannot use OSM for commercial undertakings. You can!)). And you can also download the Ordnance Survey Open Data on your own anyway - which is actually pretty cool in itself.

The Geo Tech Stack

We used the C flavor of the OSGeo stack for the set-up: GDAL/OGR, PostgreSQL, PostGIS, MapServer, Mapbender, OpenLayers and some glue to create management interfaces. The whole stack is interoperable because we based it on OGC standards starting from GML and ending with a WMS output. Yes, with high quality prints in 600dpi. This might be noteworthy for those who still believe that web mapping is just little pixely pics. Far from it. With MapServer 6.2 we get some really fine cartography. In between there is some manual tweaking and bending and correcting of geometries and text placement, for this we typically use Quantum GIS. Obviously Proj4 is all over and some imposm and shp2pgsql and you know, the whole shebang...
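For those wondering what "WMS output at 600dpi" means in pixels, here is a rough Python sketch. It is not our production code. The server URL, the layer names, the sheet size and the bounding box are all invented placeholders. Only the GetMap parameter names (from WMS 1.1.1) and EPSG:27700 (British National Grid) are real.

```python
from urllib.parse import urlencode

MM_PER_INCH = 25.4


def mm_to_pixels(length_mm: float, dpi: int) -> int:
    """Convert a printed length in millimetres to pixels at a given dpi."""
    return round(length_mm * dpi / MM_PER_INCH)


def getmap_url(base_url: str, layers: list[str], bbox: tuple[float, float, float, float],
               width_mm: float, height_mm: float, dpi: int = 600) -> str:
    """Build a WMS 1.1.1 GetMap request sized for a print at the given dpi."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",
        "SRS": "EPSG:27700",  # British National Grid
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": mm_to_pixels(width_mm, dpi),
        "HEIGHT": mm_to_pixels(height_mm, dpi),
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical server, layers and extent; a square 254 mm sheet.
    url = getmap_url("https://example.org/wms", ["base", "paths"],
                     (400000.0, 100000.0, 410000.0, 110000.0), 254, 254)
    print(url)
```

A sheet that small already asks for a 6000 by 6000 pixel image, so a real set-up would most likely request the map in tiles and stitch them, and the server has to be configured to allow images that large.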

So yes, this is a full Open Source based architecture, using Open Data and nothing but. No But, no FUD, it just works like a treat.

More of this & Kudos

Splashmaps(tm) Ad-Hoc projection(c) Back off! Patent Pending!

Sounds interesting? Find out more about David Overton who has been investing a lot of time into the whole project:

He had the initial idea and also walked us right through up to the end product. He ran the print files through a dozen print shops all over Europe and beyond to find the right mix of fabric, colors, text sizes and symbols to use. Many thanks David!

And many thanks to all our backers (check them out, you might even know some). And now let's just keep fingers crossed that we make our pledge goal within the next 9 days (Gulp!).