Stop Food Waste - OS Open Data & SQL Azure

Any commercial, profitable idea I can think of for a new IT development project to knock about with while studying is already being done. If you consider non-profitable but worthwhile causes however, the possibilities are virtually endless.

SQL Server has an ever increasing array of methods that can used with geographic and geometric data and the Ordnance Survey Society has great material to use.

The system also providing the winning entry an Ordnance Survey (geographic) Open Data competition

To design and build a community web service for supermarkets and restaurant’s to log the availability of food about to expire on. Homeless shelters and similar organizations could then search for these items in the same geographical area. It's a project inspired by Hugh Fernley Whittingstall's 'Fish Fight' and 'Land Share' campaigns:

The Ordnance Survey data is coordinated by the UK National Grid system as opposed to latitude/longitude. In the map of the UK below, each of the two letter cells below represents 100 KM square. Latitude and Longitude could be could be reverse engineered from OS Grid Reference using this conversion program (writing a SQLCLR code version of this would be a challenging job) - http://www.movable-type.co.uk/scripts/latlong-gridref.html

The OS Data is available in two formats, vector and raster. Each arrives as a collection of approximately 8 CD's containing in excess of 20 GB of data.

Below the Grid data is filtered using STOverlaying of Vector data. The 2 blank grid cells in the middle of the Isle of Wight are due to OverLay only, not Intersect methods being used.

Each 1 KM square cell was calculated based on Raster data coordinates imported from TAB files.

Raster Data
The OS Raster data is effectively a collection of arial/satelite images with accompanying metadata files. Microsoft Reporting Services Maps don't support them unfortunately, they would look great underlying semi-transparent Vector data

It's straight forward importing the images into a SQL File Stream table for future use or for custom coding purposes. Each image file comes with an accompanying 'TAB' file containing coordinate information which can be used to construct an initial set of grid coordinates.

Reports

The report below shows an area on the UK shoreline imported from the Vector data CD's. Underlying this is a section of the grid, constructed from the imported TAB files. It's labelling just the South West Corners of 10KM square cells to avoid cluttering the screen but show the cross referencing.

The grid is useful as the position and area are easy to quantify, irregular polygons and lines are not so easily interpretted, the grid also corresponds to the RASTER satelite images. Food demand and availability information is stored at item level by a grid reference (EG su80sw_R1C1) which represents a defined area of 1 KM within the parent cell. The SQL Azure Sharding/Partitioning key also lies at another, higher level in this grid/cell parent hierarchy.

Stop Food Waste - SQL Azure Partitioned Database Schema
To support and scale the system, the SQL Azure Sharding feature looks good for partitioning supermarkets and homeless shelters in proximity according to an OS National Grid derived key. If the system became popular it would be possible to add new federations (by sub-dividing existing grids) as discussed in this whitepaper - http://social.technet.microsoft.com/wiki/contents/articles/1926.how-to-shard-with-sql-azure.aspx .

The use of 'No Check' constraints on certain 'jct' Junction table foreign key's is in case the rows they reference exist only in another partition/SQL Azure federation table. The sharding policy is by GeoZone, if not participating in the sharding policy the junction tables appear in all Sharded instances. I'm not sure if this is how it works or not to be honest.

3 schema's are used, MDS, Azure and Geo, each with update permissions on owned tables, select permissions on tables owned by other schema's.

Azure - The 'tGeoZone' column Sharding, GUID's used for primary key on tables updated in Azure to ensure uniqueness across federations.

GEO - Imported Ordnance Survey Open Data is physically too large for Azure so Azure clients would need a connection to the local database for maps and other geographic information, via XML / GML

Developer Edition) and involves a lot more XML/GML using the SQL geospatial methods. It's enjoyable but time consuming, progress is slow.

I'm not an expert database designer and am completely unfamiliar with SQL Azure making assumptions based on the whitepaper and how that would be physically implemented.

The database schema above is an old one, the OS Data is best stored in the format supplied on the CD's, no ETL transformation necessary. This schema was intended for SQL Azure partitioning and the effect this might have with foreign key constraints on non partitioned JCT intersect type tables.

The main lesson's learnt ready for Version 2 are:

Don't try and tackle all the OS Data at once, pick a workable area of a few miles.

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands
of articles and SQL scripts, a library of free eBooks, a weekly database news roundup,
a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals
that makes it such a success.