Over a decade ago, Stanford statistician David Donoho predicted that the 21st
century would be the century of data. "We can say with complete confidence
that in the coming century, high-dimensional data analysis will be a very significant
activity, and completely new methods of high-dimensional data analysis will be
developed; we just don't know what they are yet." -- D. Donoho, 2000.

Unprecedented technological advances lead to increasingly high dimensional
data sets in all areas of science, engineering and business. These include
genomics and proteomics, biomedical imaging, signal processing, astrophysics,
finance, web and market basket analysis, among many others. The number of
features in such data is often of the order of thousands or millions, that is,
much larger than the available sample size.

For a number of reasons, classical data analysis methods are inadequate,
questionable, or inefficient at best when faced with high dimensional data
spaces:

2. Phenomena that occur in high dimensional probability spaces, such as the
concentration of measure, are counter-intuitive for the data mining
practitioner. For instance, distance concentration is the phenomenon that the
contrast between pairwise distances may vanish as the dimensionality
increases.

3. Bogus correlations and misleading
estimates may result when trying to fit complex models for which the effective
dimensionality is too large compared to the number of data points available.

4. The accumulation of noise may confound our ability to find low dimensional
intrinsic structure hidden in the high dimensional data.

5. The computational cost of processing high dimensional data or carrying out
optimisation over high dimensional parameter spaces is often prohibitive.
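
To make the distance concentration phenomenon in item 2 concrete, here is a minimal Python sketch. It estimates the relative contrast (d_max - d_min) / d_min between the farthest and the nearest of a set of random points, as seen from a random query point. The setting is an assumption made purely for illustration: points are drawn uniformly from the unit hypercube and distances are Euclidean. The function name relative_contrast, the sample size and the list of dimensions are hypothetical choices, not taken from any particular method.

```python
import math
import random


def relative_contrast(n_points: int, dim: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for Euclidean distances from one random
    query point to n_points points, all drawn uniformly from [0, 1]^dim.

    The uniform distribution and the Euclidean metric are assumptions made
    for this illustration only.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for a few illustrative dimensionalities.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(500, dim), 3))
```

Under these assumptions the printed contrast should shrink markedly as the dimension grows, although the exact values depend on the seed and on the sampling distribution.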

Topics

This workshop aims to promote new advances and research directions to address
the curses, and to uncover and exploit the blessings, of high dimensionality in
data mining. Topics of interest span all aspects of high dimensional data
mining, including the following:

- Systematic studies of how the curse of
dimensionality affects data mining methods

- Data presentation and visualisation methods
for very high dimensional data sets

- Data mining applications to real problems
in science, engineering or business where the data is high dimensional

Paper submission

High-quality original submissions are solicited for oral and poster
presentation at the workshop. Papers should not exceed 8 pages and must follow
the IEEE ICDM format requirements of the main conference. All submissions will
be peer-reviewed, and all accepted workshop papers will be published in the
proceedings by the IEEE Computer Society Press. Submit your paper here.