In order to make the maps perspicuous at a given view, we may require to limit
the number of the features rendered at neighbouring locations which would normally
overlap each other. Currently there’s no such mechanism in MapServer which would
prevent from the symbols to overlap based on their relative locations. In a feasible
solution we should provide rendering the isolated symbols as is, but create new
(clustered) features for those symbols that would overlap in a particular scale.
According to the example at http://trac.osgeo.org/mapserver/attachment/ticket/3700/cluster.png
the number of the features forming the clusters are displayed in the labels for each clustered features.

This functionality will be implemented as a separate layer data provider (implemented in mapcluster.c)
This provider will be used internally (being invoked in msLayerOpen).
The clustering parameters can be specified for each layer type as follows:

LAYERTYPEPOINTCONNECTIONTYPEOGRNAMEclusterCLUSTERMAXDISTANCE20# in pixelsREGIONellipse# can be rectangle or ellipseGROUP(expression)# we can define an expression to create separate groups for each valueFILTER(expression)# we can define a logical expression to specify the grouping conditionEND...END

We can also use multiple classes to display the clustered shapes. The referred example above use the
following layer definition:

In the proposed solution msLayerOpen will call the vtable method of the cluster layer provider
instead of the original vtable method depending on the existence of the
clustering options, something like:

if(layer->cluster.region)returnmsClusterLayerOpen(layer);

In msClusterLayerOpen the data provider will override the vtable functions so that
the subsequent LayerWhichShapes/LayerNextShape/LayerClose (and some further) functions
will be handled by this provider and not by the original data source.
The clustering process itself will be handled in the LayerWhichShapes call. This is the only
place where the features are retrieved from the original data source and then cached
in the local clustering database (stored in layerinfo).

The clustering process itself will be implemented in the following way:

For each feature we create a tentative cluster with some aggregate attributes
(like the feature count, the average position and the variance) the features are added into
a customized quadtree data structure which provides quick access when searching for the
neighboring shapes

For each feature we will retrieve all the neighbouring shapes (that has already been retrieved earlier)
within the specified distance (CLUSTERMAXDISTANCE) and search shape (CLUSTERREGION) by using a quadtree search. We will also inspect the filter and group conditions in each relations,
In the related clusters we update the feature count (n) average positions (avg) and the variance (var)
for each intersecting clusters by using the following recursive formula:

In a second turn we evaluate the tentative clusters based on their feature count and the offset of the
average position related to the initial position and the variance. The best ranking clusters will
be identified by minimizing the position offset and the variance. The individual features (having rank=0)
will be retrieved first in this approach.

The best ranking clusters will be added to the finalization list (in layerinfo) and the finalized
clusters (and the related features) will be removed from the quadtree as well.

Based on the finalized features we update the average position and the variance of the affected
clusters which are still exist in the quadtree.

Repeat from #4 until we have features in the quadtree.

In LayerNextShape the features are served from the finalization list which is preserved until the layer is open.
In LayerClose the vtable of the layer will be restored to the original methods
(by calling msInitializeVirtualTable)

The clustered layer itself will provide the following aggregate attributes:

1) Cluster:FeatureCount - count of the features in the clustered shape
1) Cluster:Group - The group value of the cluster (to which the group expression is evaluated)

These attributes can be used to configure the labels of the features and can also be used in expressions.
The clustered layer will also support to get further attributes from the original data source as
referenced in the LAYER configuration.
The ITEMS processing option can be used to specify additional attributes from the source layer
according to the user preference.

If we retrieve the original attributes then the layer provider will provide only those values which
are equal for each shapes in the cluster. The other values are set to “Cluster:Empty”. In the future
we may probably extend this by implementing aggregate functions to define the attributes, like
min([attributename]) or max([attributename])

We can define the symbology and labelling of the clustered layers in the same way as any other layer by specifying the classes and styles. STYLEITEM AUTO is not considered to be supported at this phase.

In the query operations the clustered features are retrieved as single shapes with the attribute set
as specified in the ITEMS processing option. The clustered features are preserved until the layer
is open, so the single pass query approach is provided by this driver.