Although there are many clustering techniques, no single one is
suitable for all purposes. The initial problem was to create as many
clusters as possible (e.g., thousands) for describing local image
features in a huge amount of video for the TRECVid 2008 evaluation.
These high-dimensional vectors cover the space almost continuously,
and commonly used clustering methods are unable to create enough
classes or to finish in reasonable time.

Therefore, we have developed a new method based on Voronoi tessellation
that needs no more than two passes through the data. It is based on
discovering clusters in higher-density locations. Because the dataset
is large, it is possible to create a larger number of candidate
clusters, select an appropriate number of classes from them (large but
not huge), and assign the remaining data to these classes. The method
has been implemented as a set of SQL functions and queries and tested
on a huge problem with a large number of classes. The experiments
performed have shown that it is significantly faster than common
techniques.
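
As a rough illustration of the two-pass idea, the following Python sketch may help. It is not the SQL implementation described above, and several details are assumptions made only for this example: candidates are found with a simple distance threshold (`radius`), density is approximated by the number of points that fall near each candidate, and the names `nearest`, `two_pass_cluster`, `radius` and `k` are hypothetical.

```python
import math


def nearest(point, centers):
    """Return (index, distance) of the center closest to point.

    Returns (-1, inf) when there are no centers yet.
    """
    best_i, best_d = -1, math.inf
    for i, c in enumerate(centers):
        d = math.dist(point, c)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def two_pass_cluster(points, radius, k):
    """Illustrative two-pass clustering (assumed details, see lead-in)."""
    # Pass 1 (assumption): a point farther than `radius` from every
    # existing candidate becomes a new candidate; otherwise it only
    # increases the count of its nearest candidate.
    candidates, counts = [], []
    for p in points:
        i, d = nearest(p, candidates)
        if d <= radius:
            counts[i] += 1
        else:
            candidates.append(p)
            counts.append(1)

    # Keep the k most populated candidates as the final class centers
    # (count is used here as a crude stand-in for local density).
    ranked = sorted(range(len(candidates)),
                    key=lambda i: counts[i], reverse=True)
    centers = [candidates[i] for i in ranked[:k]]

    # Pass 2: assign every point to its nearest center, i.e. to the
    # Voronoi cell of that center.
    labels = [nearest(p, centers)[0] for p in points]
    return centers, labels


if __name__ == "__main__":
    sample = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (9, 9)]
    centers, labels = two_pass_cluster(sample, radius=1.0, k=2)
    print(centers, labels)
```

The linear scan in `nearest` keeps the sketch short. With thousands of classes, an index or a database-side nearest-neighbour query would likely be needed to keep each pass fast.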