Marble/NaturalEarth

Natural Earth

Marble currently uses the very old and outdated MWDBII dataset for vector map outlines, and we really need to replace it with more up-to-date data. However, MWDBII has two key advantages: it is very compact, which lets Marble ship it by default, and its individual nodes carry a zoom level value, which speeds up drawing.

The Natural Earth data set is a "public domain map dataset available at 1:10m, 1:50m, and 1:110 million scales. Featuring tightly integrated vector and raster data, with Natural Earth you can make a variety of visually pleasing, well-crafted maps with cartography or GIS software." This data set seems ideal as a replacement for the MWDBII.

Advantages:

Free / Public Domain data

Regularly updated

Wide variety of political and geographic features

Available at 3 different scales: 1:10m 1:50m and 1:110m

All feature nodes at same scale are matched

Data attributes such as country code, population, relative magnitude, etc

Disadvantages:

Distributed in shapefile format, which is space inefficient

No per node zoom level attribute

Different scale datasets do not match so cannot efficiently be used together for zooming

The 1:10m dataset seems ideal as the base map in Marble, as it provides a higher level of detail than the current MWDBII. The 1:110m dataset seems ideal for use in a country selector widget in kdelibs. The 1:50m dataset, being less detailed than the MWDBII, is probably less useful to Marble.

However, using the data in the default shapefile format is not considered desirable:

No shapefile format support in Marble (yet); we would have to rely on an external library or write our own parser

Space inefficient (14 MB vs 2.6 MB for MWDBII)

No zoom level attribute or any node level attributes

Vector level attributes are stored in .dbf format which adds complexity to implementing shapefile support

The ideal solution would therefore be to convert the Natural Earth data into a more efficient file format that includes a zoom level attribute.

Ship a minimal dataset with Marble (approx 4-5 MB?) and make the rest either an automated download as soon as the user is online, or available through GHNS.

As you say, there are two approaches that could work for the lightweight default layer:
a) Convert just the required NE datasets to PNT format, either merging the 3 scale levels into a single file with just 3 detail levels, or using Douglas-Peucker on the 1:10m files to create the required detail levels
b) Implement an internal lightweight shapefile parser without dbf support and ship only the required NE datasets.
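To give a feel for the size of option b), here is a minimal sketch of reading only the fixed 100-byte header of a .shp main file, with no dbf handling at all. It is written in Python purely for illustration (Marble itself would do this in C++), and the function name parse_shp_header is hypothetical. The field offsets follow the published ESRI shapefile layout; a real parser would go on to read the variable-length records that follow the header.

```python
import struct

def parse_shp_header(data: bytes) -> dict:
    """Parse the fixed 100-byte header of a .shp main file (illustrative only)."""
    if len(data) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    # Bytes 0-3: file code, big-endian, always 9994.
    (file_code,) = struct.unpack(">i", data[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile main file")
    # Bytes 24-27: total file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", data[24:28])
    # Bytes 28-35: version (1000) and shape type, little-endian.
    version, shape_type = struct.unpack("<ii", data[28:36])
    # Bytes 36-67: bounding box xmin, ymin, xmax, ymax as little-endian doubles.
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    return {
        "length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,  # e.g. 1 = Point, 3 = PolyLine, 5 = Polygon
        "bbox": (xmin, ymin, xmax, ymax),
    }
```

Note the mix of big-endian and little-endian fields in the same header, which is one of the details a hand-written parser has to get right.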

Some pros/cons to consider:

The 1:10m country file is 6.55 MB and contains 533,202 points = 12.28 bytes/point, compared to the PNT file, which is 745 KB and contains 127,246 points = 5.85 bytes/point. This suggests the NE data in PNT format would be about half the size, so 6 MB in total. This could probably be further reduced by a light application of Douglas-Peucker.
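The per-point figures above can be reproduced with a few lines of arithmetic. This sketch only uses the numbers quoted in the text and treats MB and KB as decimal units (10^6 and 10^3 bytes), which is an assumption; the variable names are made up for illustration.

```python
# Figures quoted above; decimal MB/KB are assumed.
shp_bytes = 6.55e6      # 1:10m country shapefile
shp_points = 533_202
pnt_bytes = 745e3       # existing PNT file
pnt_points = 127_246

shp_bytes_per_point = shp_bytes / shp_points
pnt_bytes_per_point = pnt_bytes / pnt_points

# Estimated size of the 1:10m country file if stored at the PNT density.
est_country_pnt_bytes = shp_points * pnt_bytes_per_point

print(round(shp_bytes_per_point, 2),
      round(pnt_bytes_per_point, 2),
      round(est_country_pnt_bytes / 1e6, 2))
```

On these numbers the country file alone would come to a little over 3 MB at the PNT density, so it shrinks to roughly half its shapefile size.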

The NE shapefiles have been carefully processed so that shared borders and overlapping features like rivers match exactly, along with other such niceties. Applying the Douglas-Peucker algorithm might affect that.

A lightweight shapefile parser would allow users/apps to load other
shapefiles.

We would have to reconvert and check the data every time there's a new NE release, which could be a lot of effort, but an automated shp2pnt script could prove useful to allow apps/users to display their own shapefiles in a simple way.

Overall, it seems the best approach for updating the lightweight layer is indeed to convert the shapefiles to PNT format, provided the Douglas-Peucker algorithm can be deployed in a way that marks each point with a detail level rather than just throwing the points away.
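One way to mark points with a detail level instead of discarding them is sketched below. It runs the Douglas-Peucker subdivision without a tolerance, records for each point the distance at which it would be kept, and then buckets those distances into levels. This is only a sketch under several assumptions: the function names are hypothetical, distances are planar on raw coordinates rather than on the sphere, and the tolerances are placeholders that would need tuning against Marble's zoom levels.

```python
import math

def _dist_to_segment(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_significance(points):
    """Per point, the largest tolerance at which Douglas-Peucker still keeps it."""
    n = len(points)
    sig = [0.0] * n
    if n == 0:
        return sig
    sig[0] = sig[-1] = math.inf  # end points are always kept
    stack = [(0, n - 1, math.inf)]
    while stack:
        first, last, cap = stack.pop()
        if last - first < 2:
            continue
        best, best_d = first + 1, -1.0
        for i in range(first + 1, last):
            d = _dist_to_segment(points[i], points[first], points[last])
            if d > best_d:
                best, best_d = i, d
        # A point can only be kept if the split that exposed it was kept too,
        # so its value is capped by its parent's value.
        s = min(best_d, cap)
        sig[best] = s
        stack.append((first, best, s))
        stack.append((best, last, s))
    return sig

def detail_levels(points, tolerances):
    """Level 0 is the coarsest; tolerances must be sorted largest first."""
    levels = []
    for s in point_significance(points):
        level = len(tolerances)  # only needed at the finest level
        for k, tol in enumerate(tolerances):
            if s > tol:
                level = k
                break
        levels.append(level)
    return levels
```

Drawing at level k then means using every point whose stored level is less than or equal to k, which gives the same result as running Douglas-Peucker with the k-th tolerance but without losing any points from the file.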

Roughly 533,202 x 8 bytes = 4.3 MB for the country borders alone, not including the internal border and coastline files

If that's too much to ship, then ship the 1:50m dataset as the default and download the 1:10m dataset once online

Metadata file:

convert / filter dbf into our own format

Rather than the Geonames ID, we could just use the Natural Earth object ID,
then a look-up file/table that matches the NE ID to the ISO / FIPS / whatever
code (NE provides this in the metadata) and Geonames ID (which we would have
to provide). This would allow look-ups via whatever code or ID is available,
and we wouldn't be reliant on Geonames IDs staying constant.
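The look-up table described above could be as simple as one record per Natural Earth object, indexed under every code it carries. The sketch below is an assumption about how that might look, not an existing Marble structure: the class names, field names and all ID values are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PlaceCodes:
    """All known identifiers for one Natural Earth object (hypothetical layout)."""
    ne_id: int
    iso_a2: Optional[str] = None
    fips: Optional[str] = None
    geonames_id: Optional[int] = None

class CodeIndex:
    """Finds a record by any of its identifiers."""
    SCHEMES = ("ne_id", "iso_a2", "fips", "geonames_id")

    def __init__(self, records):
        self._by_key = {}
        for rec in records:
            for scheme in self.SCHEMES:
                value = getattr(rec, scheme)
                if value is not None:
                    self._by_key[(scheme, value)] = rec

    def find(self, scheme, value):
        """Return the matching record, or None if the code is unknown."""
        return self._by_key.get((scheme, value))

# Placeholder values only; real IDs would come from the NE metadata.
index = CodeIndex([
    PlaceCodes(ne_id=1001, iso_a2="XA", fips="AA", geonames_id=900001),
    PlaceCodes(ne_id=1002, iso_a2="XB"),  # Geonames ID not yet assigned
])
```

Because the NE ID is the primary key, a record stays usable even when its Geonames ID is missing or later changes.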

So required work is:
1) Fix GeoPainter so that LinearRings which contain a pole are rendered correctly

- Torsten knows about this. The problem is the fill in the flat map: a polygon needs to be created if the ring is closed and crosses the dateline only once
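As a rough illustration of the dateline condition, the sketch below counts how often a closed ring crosses the dateline and treats an odd count as a sign that the ring encloses a pole. It is not GeoPainter code: the function names are hypothetical, coordinates are assumed to be (lon, lat) in degrees, and consecutive nodes are assumed to be less than 180 degrees of longitude apart.

```python
def dateline_crossings(ring):
    """Count dateline crossings of a closed ring of (lon, lat) points in degrees."""
    crossings = 0
    n = len(ring)
    for i in range(n):
        lon_a = ring[i][0]
        lon_b = ring[(i + 1) % n][0]  # wraps around to close the ring
        # A jump of more than 180 degrees in longitude means the edge
        # takes the short way round, across the dateline.
        if abs(lon_b - lon_a) > 180.0:
            crossings += 1
    return crossings

def encloses_pole(ring):
    """An odd number of crossings means the ring wraps around a pole."""
    return dateline_crossings(ring) % 2 == 1

# Simplified Antarctica-like ring: crosses the dateline exactly once.
polar_ring = [(-170, -70), (-90, -70), (0, -70), (90, -70), (170, -70)]
# Small box straddling the dateline: crosses twice, encloses no pole.
box_ring = [(170, 10), (-170, 10), (-170, -10), (170, -10)]
```

A ring flagged this way needs extra nodes along the map edge and at the pole before it can be filled as a polygon in the flat projection.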

From Kashmir to the Elemi Triangle, Northern Cyprus to Western Sahara.

Core data

Internal boundaries

Core data??

Coastline

ocean coastline, including major islands. Coastline is matched to land and water polygons.

Core data?

First order admin (provinces, departments, states, etc.)

internal boundaries and polygons for all but a few tiny island nations. Includes names attributes and some statistical groupings of the same for smaller countries.

Optional download

Populated places

point symbols with name attributes. Includes capitals, major cities and towns, plus significant smaller towns in sparsely inhabited regions. We favor regional significance over population census in determining rankings.