
Abstract:

A data analyzer/classifier comprises using a preprocessing step, an
energy minimization step, and a postprocessing step to analyze/classify
data.

Claims:

1. A method for determining dimensionality of a network, the
dimensionality corresponding to a number of degrees of freedom in the
network, the method comprising:
using a computer system, sampling data from one or more nodes of the
network;
using a computer system, mapping the data into one or more matrices;
using a computer system, applying individual differences
multidimensional scaling to the one or more matrices to produce a common
space output; and
using a computer system, processing the common space output to determine
the dimensionality of the network.

2. The method for determining dimensionality of a network of claim 1
wherein mapping the data into one or more matrices comprises a processing
step selected from the group consisting of forming one or more symmetric
matrices, forming one or more hollow symmetric matrices, entry-wise
substituting the received data to populate one or more symmetric
matrices, forming one or more proximity matrices, forming one or more
distance matrices, and forming one or more Euclidean distance matrices.

4. A method for reconstructing a network, the method comprising:
using a computer system, sampling data from one or more nodes of the
network;
using a computer system, mapping the data into one or more matrices;
using a computer system, applying individual differences
multidimensional scaling to the one or more matrices to produce a source
space output;
from the source space output, using a computer system, determining the
dimensionality of the network;
using a computer system, using free nodes to recreate and reconstruct
individual nodes through the use of matrices containing missing values;
and
using a computer system, establishing node connectivity through the use
of lowest-energy connections constrained by dimensionality.

5. The method for reconstructing a network of claim 4 wherein mapping the
data into one or more matrices comprises a processing step selected from
the group consisting of forming one or more symmetric matrices, forming
one or more hollow symmetric matrices, entry-wise substituting the
received data to populate one or more symmetric matrices, forming one or
more proximity matrices, forming one or more distance matrices, and
forming one or more Euclidean distance matrices.

7. A method for determining dimensionality of a dynamical system from
partial data, the dimensionality corresponding to a number of degrees of
freedom in the dynamical system, the method comprising:
using a computer system, sampling data from the dynamical system;
using a computer system, mapping the data into one or more matrices;
using a computer system, applying individual differences
multidimensional scaling to the one or more matrices to produce a common
space output; and
using a computer system, processing the common space output to determine
the dimensionality of the dynamical system.

8. The method for determining dimensionality of a dynamical system from
partial data of claim 7 wherein mapping the data into one or more
matrices comprises a processing step selected from the group consisting
of forming one or more symmetric matrices, forming one or more hollow
symmetric matrices, entry-wise substituting the received data to populate
one or more symmetric matrices, forming one or more proximity matrices,
forming one or more distance matrices, and forming one or more Euclidean
distance matrices.

Description:

[0001]The present patent document is a division of application Ser. No.
11/097,783, filed Apr. 1, 2005, issued as U.S. Pat. No. 7,702,155 on Apr.
20, 2010, which is a division of application Ser. No. 09/581,949, filed
Jun. 19, 2000, issued as U.S. Pat. No. 6,993,186 on Jan. 31, 2006, which
claims the benefit of the filing date under 35 U.S.C. §119(e) of
Provisional U.S. Patent Application Ser. No. 60/071,592, filed Dec. 29,
1997. All of the foregoing applications and patents are hereby
incorporated by reference.

COPYRIGHT NOTICE

[0002]A portion of the disclosure of this patent document contains
material which is subject to copyright protection. The copyright owner
has no objection to the facsimile reproduction by anyone of the patent
document or the patent disclosure, as it appears in the Patent and
Trademark Office patent file or records, but otherwise reserves all
copyright rights whatsoever.

REFERENCE TO APPENDIX [CD ROM/SEQUENCE LISTING]

[0003]A computer program listing appendix is included containing computer
program code listings on a CD-ROM pursuant to 37 C.F.R. 1.52(e) and is
hereby incorporated herein by reference in its entirety. The total number
of compact discs is 1 (two duplicate copies are filed herewith). Each
compact disc includes one (1) file entitled SOURCE CODE APPENDIX. Each
compact disc includes 23,167 bytes. The creation date of the compact disc
is Mar. 24, 2005.

BACKGROUND

[0004]The present invention relates to recognition, analysis, and
classification of patterns in data from real world sources, events and
processes. Patterns exist throughout the real world. Patterns also exist
in the data used to represent or convey or store information about real
world objects or events or processes. As information systems process more
real world data, there are mounting requirements to build more
sophisticated, capable and reliable pattern recognition systems.

[0005]Existing pattern recognition systems include statistical, syntactic
and neural systems. Each of these systems has certain strengths which
lends it to specific applications. Each of these systems has problems
which limit its effectiveness.

[0006]Some real world patterns are purely statistical in nature.
Statistical and probabilistic pattern recognition works by expecting data
to exhibit statistical patterns. Pattern recognition by this method alone
is limited. Statistical pattern recognizers cannot see beyond the
expected statistical pattern. Only the expected statistical pattern can
be detected.

[0007]Syntactic pattern recognizers function by expecting data to exhibit
structure. While syntactic pattern recognizers are an improvement over
statistical pattern recognizers, perception is still narrow and the
system cannot perceive beyond the expected structures. While some real
world patterns are structural in nature, the extraction of structure is
unreliable.

[0008]Pattern recognition systems that rely upon neural pattern
recognizers are an improvement over statistical and syntactic
recognizers. Neural recognizers operate by storing training patterns as
synaptic weights. Later stimulation retrieves these patterns and
classifies the data. However, the fixed structure of neural pattern
recognizers limits their scope of recognition. While a neural system can
learn on its own, it can only find the patterns that its fixed structure
allows it to see. The difficulties with this fixed structure are
illustrated by the well-known problem that the number of hidden layers in
a neural network strongly affects its ability to learn and generalize.
Additionally, neural pattern recognition results are often not
reproducible. Neural nets are also sensitive to training order, often
require redundant data for training, can be slow learners and sometimes
never learn. Most importantly, as with statistical and syntactic pattern
recognition systems, neural pattern recognition systems are incapable of
discovering truly new knowledge.

[0009]Accordingly, there is a need for an improved method and apparatus
for pattern recognition, analysis, and classification which is not
encumbered by preconceptions about data or models.

BRIEF SUMMARY

[0010]By way of illustration only, an analyzer/classifier process for data
comprises using energy minimization with one or more input matrices. The
data to be analyzed/classified is processed by an energy minimization
technique such as individual differences multidimensional scaling (IDMDS)
to produce at least a rate of change of stress/energy. Using the rate of
change of stress/energy and possibly other IDMDS output, the data are
analyzed or classified through patterns recognized within the data. The
foregoing discussion of one embodiment has been presented only by way of
introduction. Nothing in this section should be taken as a limitation on
the following claims, which define the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011]FIG. 1 is a diagram illustrating components of an analyzer according
to the first embodiment of the invention; and

[0012]FIG. 2 through FIG. 10 relate to examples illustrating use of an
embodiment of the invention for data classification, pattern recognition,
and signal processing.

DETAILED DESCRIPTION OF THE PRESENTLY PREFERRED EMBODIMENTS

[0013]The method and apparatus in accordance with the present invention
provide an analysis tool with many applications. This tool can be used
for data classification, pattern recognition, signal processing, sensor
fusion, data compression, network reconstruction, and many other
purposes. The invention relates to a general method for data analysis
based on energy minimization and least energy deformations. The invention
uses energy minimization principles to analyze one to many data sets. As
used herein, energy is a convenient descriptor for concepts which are
handled similarly mathematically. Generally, the physical concept of
energy is not intended by use of this term but the more general
mathematical concept. Within multiple data sets, individual data sets are
characterized by their deformation under least energy merging. This is a
contextual characterization which allows the invention to exhibit
integrated unsupervised learning and generalization. A number of methods
for producing energy minimization and least energy merging and extraction
of deformation information have been identified; these include, the
finite element method (FEM), simulated annealing, and individual
differences multidimensional scaling (IDMDS). The presently preferred
embodiment of the invention utilizes individual differences
multidimensional scaling (IDMDS).

[0014]Multidimensional scaling (MDS) is a class of automated, numerical
techniques for converting proximity data into geometric data. IDMDS is a
generalization of MDS, which converts multiple sources of proximity data
into a common geometric configuration space, called the common space, and
an associated vector space called the source space. Elements of the
source space encode deformations of the common space specific to each
source of proximity data. MDS and IDMDS were developed for psychometric
research, but are now standard tools in many statistical software
packages. MDS and IDMDS are often described as data visualization
techniques. This description emphasizes only one aspect of these
algorithms.

[0015]Broadly, the goal of MDS and IDMDS is to represent proximity data in
a low dimensional metric space. This has been expressed mathematically by
others (see, for example, de Leeuw, J. and Heiser, W., "Theory of
multidimensional scaling," in P. R. Krishnaiah and L. N. Kanal, eds.,
Handbook of Statistics, Vol. 2. North-Holland, N.Y., 1982) as follows.
Let S be a nonempty finite set, p a real valued function on S×S,

p:S×S→R.

[0016]p is a measure of proximity between objects in S. Then the goal of
MDS is to construct a mapping f̃ from S into a metric space (X, d),

f̃:S→X,

such that p(i, j)=pij≈d(f̃(i), f̃(j)), that is, such that the proximity
of object i to object j in S is approximated by the distance in X
between f̃(i) and f̃(j). X is usually assumed to be n dimensional
Euclidean space Rn, with n sufficiently small.

[0017]IDMDS generalizes MDS to m sources of proximity data: given
proximities pk on S×S, for k=1, . . . , m, the goal is to construct m
mappings f̃k from S into the metric space (X, d),

f̃k:S→X,

such that pk(i, j)=pijk≈d(f̃k(i), f̃k(j)), for k=1, . . . , m.
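
By way of illustration only, the MDS construction above can be sketched
with classical (Torgerson) scaling, a simple non-iterative relative of the
least squares algorithms discussed below; the function name and the example
distances are hypothetical, and this is not the algorithm used in the
preferred embodiment:

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS: embed pairwise distances D into
    n_components-dimensional Euclidean space so that interpoint
    distances of the returned configuration approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # ascending eigenvalue order
    idx = np.argsort(eigvals)[::-1][:n_components]
    scales = np.sqrt(np.clip(eigvals[idx], 0, None))
    return eigvecs[:, idx] * scales

# Distances between three collinear points 0, 3, 7 embed exactly in 1-D.
D = np.array([[0., 3., 7.],
              [3., 0., 4.],
              [7., 4., 0.]])
X = classical_mds(D, n_components=1)
```

Since the example distances are exactly realizable on a line, the recovered
configuration reproduces them up to translation and reflection.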

[0018]Intuitively, IDMDS is a method for representing many points of view.
The different proximities pk can be viewed as giving the proximity
perceptions of different judges. IDMDS accommodates these different
points of view by finding different maps f̃k for each judge. These
individual maps, or their image configurations, are deformations of a
common configuration space whose interpoint distances represent the
common or merged point of view.

[0019]MDS and IDMDS can equivalently be described in terms of
transformation functions. Let P=(pij) be the matrix defined by the
proximity p on S×S. Then MDS defines a transformation function

f:pij→dij(X),

where dij(X)=d(f̃(i), f̃(j)), with f̃ the mapping from S→X induced by the
transformation function f. Here, by abuse of notation, X=f̃(S) also
denotes the image of S under f̃. The transformation function f should be
optimal in the sense that the distances f(pij) give the best
approximation to the proximities pij. This optimization criterion is
described in more detail below. IDMDS is similarly re-expressed; the
single transformation f is replaced by m transformations fk. Note, these
fk need not be distinct. In the following, the image of Sk under fk will
be written Xk.

[0020]MDS and IDMDS can be further broken down into so-called metric and
nonmetric versions. In metric MDS or IDMDS, the transformations
f(fk) are parametric functions of the proximities
pij(pijk). Nonmetric MDS or IDMDS generalizes the metric
approach by allowing arbitrary admissible transformations f (fk),
where admissible means the association between proximities and
transformed proximities (also called disparities in this context) is
weakly monotone:

pij<pkl implies f(pij)≦f(pkl).

[0021]Beyond the metric-nonmetric distinction, algorithms for MDS and
IDMDS are distinguished by their optimization criteria and numerical
optimization routines. One particularly elegant and publicly available
IDMDS algorithm is PROXSCAL. See Commandeur, J. and Heiser, W.,
"Mathematical derivations in the proximity scaling (PROXSCAL) of
symmetric data matrices," Tech. report no. RR-93-03, Department of Data
Theory, Leiden University, Leiden, The Netherlands. PROXSCAL is a least
squares, constrained majorization algorithm for IDMDS. We now summarize
this algorithm, following closely the above reference.

[0022]PROXSCAL is a least squares approach to IDMDS which minimizes the
objective function

σ(f1, . . . , fm; X1, . . . , Xm)=ΣkΣi<j wijk(fk(pijk)−dij(Xk))2,

where the sum runs over sources k=1, . . . , m and object pairs i<j.
σ is called the stress and measures the goodness-of-fit of the
configuration distances dij(Xk) to the transformed proximities
fk(pijk). This is the most general form for the objective
function. MDS can be interpreted as an energy minimization process and
stress can be interpreted as an energy functional. The wijk are
proximity weights. For simplicity, it is assumed in what follows that
wijk=1 for all i, j, k.
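
By way of illustration only, the stress with unit weights and identity
transformations (metric MDS with fk(p)=p) may be sketched as follows; the
function name and toy data are hypothetical:

```python
import math

def stress(P_list, X_list):
    """Least squares stress: sum over sources k and object pairs i<j of
    (p_ijk - d_ij(X_k))^2, with unit weights w_ijk = 1 and identity
    transformations f_k(p) = p."""
    sigma = 0.0
    for P, X in zip(P_list, X_list):
        n = len(P)
        for i in range(n):
            for j in range(i + 1, n):
                d_ij = math.dist(X[i], X[j])     # configuration distance
                sigma += (P[i][j] - d_ij) ** 2   # squared residual
    return sigma

# One source whose proximities are exactly realized by a configuration
# of three points on a line, so the stress vanishes.
P = [[0, 1, 2],
     [1, 0, 1],
     [2, 1, 0]]
X = [(0.0,), (1.0,), (2.0,)]
print(stress([P], [X]))  # → 0.0
```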

[0023]The PROXSCAL majorization algorithm for MDS with transformations is
summarized as follows.

[0030]7.Go to step 4 if the difference between the current and previous
stress is not less than ε, some previously defined number. Stop
otherwise.

[0031]For multiple sources of proximity data, restrictions are imposed on
the configurations Xk associated to each source of proximity data in
the form of the constraint equation Xk=ZWk.

[0032]This equation defines a common configuration space Z and diagonal
weight matrices Wk. Z represents a merged or common version of the
input sources, while the Wk define the deformation of the common
space required to produce the individual configurations Xk. The
vectors defined by diag(Wk), the diagonal entries of the weight
matrices Wk, form the source space W associated to the common space
Z.
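
By way of illustration only, the constraint equation Xk=ZWk with diagonal
Wk amounts to a per-dimension stretching of the common space; the names and
numbers below are hypothetical:

```python
def apply_source_weights(Z, w_k):
    """X_k = Z W_k with W_k = diag(w_k): each dimension (column) of the
    common space Z is stretched or shrunk by the corresponding weight,
    producing the individual configuration X_k."""
    return [[z * w for z, w in zip(row, w_k)] for row in Z]

# Common space Z: corners of the unit square in R^2.
Z = [[0, 0], [1, 0], [1, 1], [0, 1]]
# Source weight vector diag(W_k): stretch dimension 1, shrink dimension 2.
w_k = [2.0, 0.5]
X_k = apply_source_weights(Z, w_k)
print(X_k)
```

The weight vector [2.0, 0.5] is exactly the source-space signature of this
source with respect to the common space Z.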

[0033]The PROXSCAL constrained majorization algorithm for IDMDS with
transformations is summarized as follows. To simplify the discussion,
so-called unconditional IDMDS is described. This means the m
transformation functions are the same: f1=f2= . . . =fm.

[0041]8. Go to step 4 if the difference between the current and previous
stress is not less than ε, some previously defined number. Stop
otherwise.

[0042]Here, tr(A) and A' denote, respectively, the trace and transpose of
matrix A.

[0043]It should be pointed out that other IDMDS routines do not contain an
explicit constraint condition. For example, ALSCAL (see Takane, Y.,
Young, F., and de Leeuw, J., "Nonmetric individual differences
multidimensional scaling: an alternating least squares method with
optimal scaling features," Psychometrika, Vol. 42, 1977) minimizes a
different energy expression (stress) over transformations,
configurations, and weighted Euclidean metrics. ALSCAL also produces
common and source spaces, but these spaces are computed through
alternating least squares without explicit use of constraints. Either
form of IDMDS can be used in the present invention.

[0044]MDS and IDMDS have proven useful for many kinds of analyses.
However, it is believed that prior utilizations of these techniques have
not extended the use of these techniques to further possible uses for
which MDS and IDMDS have particular utility and provide exceptional
results. Accordingly, one benefit of the present invention is to
incorporate MDS or IDMDS as part of a platform in which aspects of these
techniques are extended. A further benefit is to provide an analysis
technique, part of which uses IDMDS, that has utility as an analytic
engine applicable to problems in classification, pattern recognition,
signal processing, sensor fusion, and data compression, as well as many
other kinds of data analytic applications.

[0045]Referring now to FIG. 1, it illustrates an operational block diagram
of a data analysis/classifier tool 100. The least energy deformation
analyzer/classifier is a three-step process. Step 110 is a front end for
data transformation. Step 120 is a process step implementing energy
minimization and deformation computations--in the presently preferred
embodiment, this process step is implemented through the IDMDS algorithm.
Step 130 is a back end which interprets or decodes the output of the
process step 120. These three steps are illustrated in FIG. 1.

[0046]It is to be understood that the steps forming the tool 100 may be
implemented in a computer usable medium or in a computer system as
computer executable software code. In such an embodiment, step 110 may be
configured as a code, step 120 may be configured as second code and step
120 may be configured as third code, with each code comprising a
plurality of machine readable steps or operations for performing the
specified operations. While step 110, step 120 and step 130 have been
shown as three separate elements, their functionality can be combined
and/or distributed. It is to be further understood that "medium" is
intended to broadly include any suitable medium, including analog or
digital, hardware or software, now in use or developed in the future.

[0047]Step 110 of the tool 100 is the transformation of the data into
matrix form. The only constraint on this transformation for the
illustrated embodiment is that the resulting matrices be square. The type
of transformation used depends on the data to be processed and the goal
of the analysis. In particular, it is not required that the matrices be
proximity matrices in the traditional sense associated with IDMDS. For
example, time series and other sequential data may be transformed into
source matrices through straight substitution into entries of symmetric
matrices of sufficient dimensionality (this transformation will be
discussed in more detail in an example below). Time series or other
signal processing data may also be Fourier or otherwise analyzed and then
transformed to matrix form.

[0048]Step 120 of the tool 100 implements energy minimization and
extraction of deformation information through IDMDS. In the IDMDS
embodiment of the tool 100, the stress function σ defines an energy
functional over configurations and transformations. The configurations
are further restricted to those which satisfy the constraint equations
Xk=ZWk. For each configuration Xk, the weight vectors
diag(Wk) are the contextual signature, with respect to the common
space, of the k-th input source. Interpretation of σ as an energy
functional is fundamental; it greatly expands the applicability of MDS as
an energy minimization engine for data classification and analysis.

[0049]Step 130 consists of both visual and analytic methods for decoding
and interpreting the source space W from step 120. Unlike traditional
applications of IDMDS, tool 100 often produces high dimensional output.
Among other things, this makes visual interpretation and decoding of the
source space problematic. Possible analytic methods for understanding the
high dimensional spaces include, but are not limited to, linear
programming techniques for hyperplane and decision surface estimation,
cluster analysis techniques, and generalized gravitational model
computations. A source space dye-dropping or tracer technique has been
developed for both source space visualization and analytic
postprocessing. Step 130 may also consist in recording stress/energy, or
the rate of change of stress/energy, over multiple dimensions. The graph
of energy (or the rate of change of stress/energy) against dimension can
be used to determine network and dynamical system dimensionality. The
graph of stress/energy against dimensionality is traditionally called a
scree plot. The use and purpose of the scree plot is greatly extended in
the present embodiment of the tool 100.

[0050]Let S={Sk} be a collection of data sets or sources Sk for
k=1, . . . , m. Step 110 of the tool 100 converts each Sk ∈ S to matrix
form M(Sk) where M(Sk) is a p dimensional real hollow symmetric matrix.
Hollow means the diagonal entries of M(Sk) are zero. As indicated above,
M(Sk) need not be symmetric or hollow, but for simplicity of exposition
these additional restrictions are adopted. Note also that the matrix
dimensionality p is a function of the data S and the goal of the
analysis. Since M(Sk) is hollow symmetric, it can be interpreted and
processed in IDMDS as a proximity (dissimilarity) matrix. Step 110 can
be represented by the map

M:S→Hp(R),

Sk→M(Sk)

where Hp(R) is the set of p dimensional hollow real symmetric
matrices. The precise rule for computing M depends on the type of data in
S, and the purpose of the analysis. For example, if S contains time
series data, then M might entail the straightforward entry-wise encoding
mentioned above. If S consists of optical character recognition data, or
some other kind of geometric data, then M(Sk) may be a standard
distance matrix whose ij-th entry is the Euclidean distance between "on"
pixels i and j. M can also be combined with other transformations to form
the composite, (M∘F)(Sk), where F, for example, is a
fast Fourier transform (FFT) on signal data Sk. To make this more
concrete, in the examples below M will be explicitly calculated in a
number of different ways. It should also be pointed out that for certain
data collections S it is possible to analyze the conjugate or transpose
S' of S. For instance, in data mining applications, it is useful to
transpose records (clients) and fields (client attributes) thus allowing
analysis of attributes as well as clients. The mapping M is simply
applied to the transposed data.
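
To make the geometric case concrete, the following minimal sketch
(hypothetical names, assuming the "on" pixels are given as coordinate
pairs) computes M(Sk) as a standard Euclidean distance matrix:

```python
import math

def distance_matrix(points):
    """M(S_k) for geometric data: a hollow symmetric matrix whose ij-th
    entry is the Euclidean distance between 'on' pixels i and j."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical 'on' pixels of a small glyph (a 3-4-5 right triangle):
pixels = [(0, 0), (0, 3), (4, 0)]
M = distance_matrix(pixels)
```

The result is hollow (zero diagonal) and symmetric, so it can be fed
directly to IDMDS as a dissimilarity matrix.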

[0051]Step 120 of the presently preferred embodiment of the tool 100 is
the application of IDMDS to the set of input matrices M(S)={M(Sk)}.
Each M(Sk) ∈ M(S) is an input source for IDMDS. As described above, the
IDMDS output is a common space Z ⊂ Rn and a source space W. The
dimensionality n of these spaces depends on the input data S and the
goal of the analysis. For signal data, it is often useful to set n=p-1
or even n=|Sk| where |Sk| denotes the cardinality of Sk. For data
compression, low dimensional output spaces are essential. In the case of
network reconstruction, system dimensionality is discovered by the
invention itself.

[0052]IDMDS can be thought of as a constrained energy minimization
process. As discussed above, the stress σ is an energy functional
defined over transformations and configurations in Rn; the
constraints are defined by the constraint equation Xk=ZWk.
IDMDS attempts to find the lowest stress or energy configurations Xk
which also satisfy the constraint equation. (MDS is the special case when
each Wk=I, the identity matrix.) Configurations Xk most similar
to the source matrices M(Sk) have the lowest energy. At the same
time, each Xk is required to match the common space Z up to
deformation defined by the weight matrices Wk. The common space
serves as a characteristic, or reference object. Differences between
individual configurations are expressed in terms of this characteristic
object with these differences encoded in the weight matrices Wk. The
deformation information contained in the weight matrices, or,
equivalently, in the weight vectors defined by their diagonal entries,
becomes the signature of the configurations Xk and hence the sources
Sk (through M(Sk)). The source space may be thought of as a
signature classification space.

[0053]The weight space signatures are contextual; they are defined with
respect to the reference object Z. The contextual nature of the source
deformation signature is fundamental. As the polygon classification
example below will show, Z-contextuality of the signature allows the tool
100 to display integrated unsupervised machine learning and
generalization. The analyzer/classifier learns seamlessly and invisibly.
Z-contextuality also allows the tool 100 to operate without a priori data
models. The analyzer/classifier constructs its own model of the data, the
common space Z.

[0054]Step 130, the back end of the tool 100, decodes and interprets the
source or classification space output W from IDMDS. Since this output can
be high dimensional, visualization techniques must be supplemented by
analytic methods of interpretation. A dye-dropping or tracer technique
has been developed for both visual and analytic postprocessing. This
entails differential marking or coloring of source space output. The
specification of the dye-dropping is contingent upon the data and overall
analysis goals. For example, dye-dropping may be two-color or binary
allowing separating hyperplanes to be visually or analytically
determined. For an analytic approach to separating hyperplanes using
binary dye-dropping see Bosch, R. and Smith, J., "Separating hyperplanes
and the authorship of the disputed federalist papers," American
Mathematical Monthly, Vol. 105, 1998. Discrete dye-dropping allows the
definition of generalized gravitational clustering measures of the form

[0055]Here, A denotes a subset of W (indicated by dye-dropping),
χA(x) is the characteristic function on A, d(.,.) is a distance
function, and p ∈ R. Such measures may be useful for
estimating missing values in databases. Dye-dropping can be defined
continuously, as well, producing a kind of height function on W. This
allows the definition of decision surfaces or volumetric discriminators.
The source space W is also analyzable using standard cluster analytic
techniques. The precise clustering metric depends on the specifications
and conditions of the IDMDS analysis in question.

[0056]Finally, as mentioned earlier, the stress/energy and rate of change
of stress/energy can be used as postprocessing tools. Minima or kinks in
a plot of energy, or the rate of change of energy, over dimension can be
used to determine the dimensionality of complex networks and general
dynamical systems for which only partial output information is available.
In fact, this technique often allows dimensionality to be inferred from
only a single stream of observed time series data.
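
By way of illustration only, one way to locate a kink in a plot of
stress/energy against dimension is to look for the sharpest falloff in the
rate of change; the function name and stress values below are hypothetical:

```python
def elbow_dimension(stress_by_dim):
    """Locate the 'kink' in a scree plot: the dimension after which
    adding dimensions stops reducing stress appreciably, taken here as
    the largest drop in the rate of change (second difference)."""
    dims = sorted(stress_by_dim)
    s = [stress_by_dim[d] for d in dims]
    # first differences: stress reduction gained by each added dimension
    drop = [s[i] - s[i + 1] for i in range(len(s) - 1)]
    # the elbow is where the gain falls off most sharply
    falloff = [drop[i] - drop[i + 1] for i in range(len(drop) - 1)]
    return dims[falloff.index(max(falloff)) + 1]

# Hypothetical stress/energy values: large gains up to dimension 3,
# then a plateau, suggesting a 3-dimensional system.
stress = {1: 10.0, 2: 5.0, 3: 1.0, 4: 0.9, 5: 0.85}
print(elbow_dimension(stress))  # → 3
```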

[0057]A number of examples are presented below to illustrate the method
and apparatus in accordance with the present invention. These examples
are illustrative only and in no way limit the scope of the method or
apparatus.

Example A

Classification of Regular Polygons

[0058]The goal of this experiment was to classify a set of regular
polygons. The collection S={S1, . . . , S16} with data sets
S1-S4, equilateral triangles; S5-S8, squares;
S9-S12, pentagons; and S13-S16, hexagons. Within each
subset of distinct polygons, the size of the figures is increasing with
the subscript. The perimeter of each polygon Sk was divided into 60
equal segments with the segment endpoints ordered clockwise from a fixed
initial endpoint. A turtle application was then applied to each polygon
to compute the Euclidean distance from each segment endpoint to every
other segment endpoint (initial endpoint included). Let xski denote the
i-th endpoint of polygon Sk, then the mapping M is defined by

M:S→H60(R),

Sk→[dsk1|dsk2| . . . |dsk60]

where the columns

dski=(d(xski,xsk1),d(xski,xsk2), . . . ,d(xski,xsk60))'.

[0059]The individual column vectors dski have intrinsic
interest. When plotted as functions of arc length they represent a
geometric signal which contains both frequency and spatial information.
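
By way of illustration only, the turtle preprocessing for a regular polygon
can be sketched as follows (names are hypothetical, and the vertex
placement on a circle is an assumption for the sketch):

```python
import math

def polygon_endpoints(n_sides, scale=1.0, n_points=60):
    """Place n_points segment endpoints, equally spaced by arc length
    and ordered clockwise from a fixed initial endpoint, around the
    perimeter of a regular n_sides-gon inscribed in a circle."""
    verts = [(scale * math.cos(-2 * math.pi * k / n_sides),
              scale * math.sin(-2 * math.pi * k / n_sides))
             for k in range(n_sides)]
    side = math.dist(verts[0], verts[1])
    perim = n_sides * side
    points = []
    for k in range(n_points):
        t = perim * k / n_points                 # arc length travelled
        i = min(int(t // side), n_sides - 1)     # current side
        frac = (t - i * side) / side             # fraction along it
        (x0, y0), (x1, y1) = verts[i], verts[(i + 1) % n_sides]
        points.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return points

def perimeter_distance_matrix(points):
    """Hollow symmetric matrix of Euclidean distances from each segment
    endpoint to every other segment endpoint."""
    return [[math.dist(p, q) for q in points] for p in points]

square = polygon_endpoints(4, scale=2.0)   # square on a circle of radius 2
M = perimeter_distance_matrix(square)
```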

[0060]The 16, 60×60 distance matrices were input into a publicly
distributed version of PROXSCAL. PROXSCAL was run with the following
technical specifications: sources--16, objects--60, dimension--4,
model--weighted, initial configuration--Torgerson,
conditionality--unconditional, transformations--numerical, rate of
convergence--0.0, number of iterations--500, and minimum stress--0.0.

[0061]FIG. 2 and FIG. 3 show the four dimensional common and source space
output. The common space configuration appears to be a multifaceted
representation of the original polygons. It forms a simple closed path in
four dimensions which, when viewed from different angles, or, what is
essentially the same thing, when deformed by the weight matrices,
produces a best, in the sense of minimal energy, representation of each
of the two dimensional polygonal figures. The most successful such
representation appears to be that of the triangle projected onto the
plane determined by dimensions 2 and 4.

[0062]In the source space, the different types of polygons are arranged,
and hence, classified, along different radii. Magnitudes within each such
radial classification indicate polygon size or scale with the smaller
polygons located nearer the origin.
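
By way of illustration only, such a radial arrangement can be decoded by
comparing directions of source-space weight vectors; the prototype
directions and vectors below are hypothetical, not actual PROXSCAL output:

```python
import math

def radial_classify(weight_vectors, prototypes):
    """Classify source-space weight vectors by direction (radius): each
    vector is assigned to the prototype direction it most nearly points
    along; its magnitude indicates scale (figure size)."""
    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]
    labels = []
    for w in weight_vectors:
        u = unit(w)
        best = max(prototypes,
                   key=lambda p: sum(a * b for a, b in zip(u, unit(prototypes[p]))))
        labels.append((best, math.sqrt(sum(x * x for x in w))))
    return labels

# Hypothetical radial prototypes and weight vectors in a 2-D source space:
protos = {"triangle": [1.0, 0.2], "square": [0.2, 1.0]}
vecs = [[2.0, 0.4], [4.0, 0.8], [0.3, 1.5]]
result = radial_classify(vecs, protos)
print(result)
```

The two vectors on the "triangle" radius receive the same label, with the
larger magnitude indicating the larger figure.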

[0063]The contextual nature of the polygon classification is embodied in
the common space configuration. Intuitively, this configuration looks
like a single, carefully bent wire loop. When viewed from different
angles, as encoded by the source space vectors, this loop of wire looks
variously like a triangle, a square, a pentagon, or a hexagon.

Example B

Classification of Non-Regular Polygons

[0064]The polygons in Example A were regular. In this example, irregular
polygons S={S1, . . . , S6} are considered, where
S1-S3 are triangles and S4-S6 rectangles. The
perimeter of each figure Sk was divided into 30 equal segments with
the preprocessing transformation M computed as in Example A. This
produced 6, 30×30 source matrices which were input into PROXSCAL
with technical specifications the same as those above except for the
number of sources, 6, and objects, 30.

[0065]FIG. 4 and FIG. 5 show the three dimensional common and source space
outputs. The common space configuration, again, has a "holographic" or
faceted quality; when illuminated from different angles, it represents
each of the polygonal figures. As before, this change of viewpoint is
encoded in the source space weight vectors. While the weight vectors
encoding triangles and rectangles are no longer radially arranged, they
can clearly be separated by a hyperplane and are thus accurately
classified by the analysis tool as presently embodied.

[0066]It is notable that two dimensional IDMDS outputs were not sufficient
to classify these polygons in the sense that source space separating
hyperplanes did not exist in two dimensions.

Example C

Time Series Data

[0067]This example relates to signal processing and demonstrates the
analysis tool's invariance with respect to phase and frequency
modification of time series data. It also demonstrates an entry-wise
approach to computing the preprocessing transformation M.

[0068]The set S={S1, . . . , S12} consisted of sine, square, and
sawtooth waveforms. Four versions of each waveform were included, each
modified for frequency and phase content. Indices 1-4 indicate sine, 5-8
square, and 9-12 sawtooth frequency and phase modified waveforms. All
signals had unit amplitude and were sampled at 32 equal intervals x, for
0≦x≦2π.

[0069]Each time series Sk was mapped into a symmetric matrix as
follows. First, an "empty" nine dimensional, lower triangular matrix
Tk=(tijk)=T(Sk) was created. "Empty" meant that
Tk had unfilled entries below the diagonal and zeros everywhere else.
dimensions were chosen since nine is the smallest positive integer m
satisfying the inequality m(m-1)/2≧32 and m(m-1)/2 is the number
of entries below the diagonal in an m dimensional matrix. The empty
entries in Tk were then filled in, from upper left to lower right,
column by column, by reading in the time series data from Sk.
Explicitly: s1k=t21k, the first sample in Sk was
written in the second row, first column of Tk;
s2k=t31k, the second sample in Sk was written in
the third row, first column of Tk, and so on. Since there were only
32 signal samples for the 36 empty slots in Tk, the four remaining
entries were designated missing by writing -2 in these positions (these
entries are then ignored when calculating the stress). Finally, a hollow
symmetric matrix was defined by setting

M(Sk)=Tk+Tk^t, where t denotes the matrix transpose.
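
The mapping of paragraph [0069] can be sketched in Python as follows. This is a minimal illustration, not the embodiment itself; the function name to_hollow_symmetric and the use of NumPy are assumptions of this sketch:

```python
import numpy as np

def to_hollow_symmetric(samples, m=9, missing=-2.0):
    """Map a 1-D sample sequence into an m x m hollow symmetric matrix.

    The strictly lower triangle is filled column by column, from upper
    left to lower right, with the samples; leftover slots are marked
    with the `missing` sentinel, as in paragraph [0069].
    """
    slots = m * (m - 1) // 2          # entries below the diagonal
    if len(samples) > slots:
        raise ValueError("m is too small for this many samples")
    T = np.zeros((m, m))
    idx = 0
    for j in range(m - 1):            # columns, left to right
        for i in range(j + 1, m):     # rows below the diagonal
            T[i, j] = samples[idx] if idx < len(samples) else missing
            idx += 1
    return T + T.T                    # hollow symmetric: M(Sk) = Tk + Tk^t

# 32 equal-interval samples of a unit-amplitude sine on [0, 2*pi)
signal = np.sin(np.linspace(0, 2 * np.pi, 32, endpoint=False))
M = to_hollow_symmetric(signal)       # 9 x 9, with 4 mirrored missing slots
```

For m=9 the 32 samples fill 32 of the 36 sub-diagonal slots, and the four sentinel entries appear mirrored in both triangles of the symmetrized matrix.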

[0070]This preprocessing produced twelve 9×9 source matrices which were
input to PROXSCAL with the following technical specifications:
sources--12, objects--9, dimension--8, model--weighted, initial
configuration--Torgerson, conditionality--unconditional,
transformations--ordinal, approach to ties--secondary, rate of
convergence--0.0, number of iterations--500, and minimum stress--0.0.
Note that the data, while metric or numeric, was transformed as if it
were ordinal or nonmetric. The use of nonmetric IDMDS has been greatly
extended in the present embodiment of the tool 100.

[0071]FIG. 6 shows the eight dimensional source space output for the time
series data. The projection in dimensions seven and eight, as detailed in
FIG. 7, shows the input signals are separated by hyperplanes into sine,
square, and sawtooth waveform classes independent of the frequency or
phase content of the signals.

Example D

Sequences, Fibonacci, etc.

[0072]The data set S={S1, . . . , S9} in this example consisted
of nine sequences with ten elements each; they are shown in Table 1, FIG.
8. Sequences 1-3 are constant, arithmetic, and Fibonacci sequences
respectively. Sequences 4-6 are these same sequences with some error or
noise introduced. Sequences 7-9 are the same as 1-3, but the negative 1's
indicate that these elements are missing or unknown.

[0073]The nine source matrices M(Sk)=(mijk) were defined by

mijk=|sik-sjk|,

[0074]the absolute value of the difference of the i-th and j-th elements
in sequence Sk. The resulting 10×10 source matrices were
input to PROXSCAL configured as follows: sources--9, objects--10,
dimension--8, model--weighted, initial configuration--simplex,
conditionality--unconditional, transformations--numerical, rate of
convergence--0.0, number of iterations--500, and minimum stress--0.0.
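
The entry-wise preprocessing above, mijk=|sik-sjk|, can be sketched as follows; NumPy is assumed, and PROXSCAL itself (an SPSS routine) is not invoked here. The sequence values are illustrative, since Table 1 is not reproduced in this excerpt:

```python
import numpy as np

def abs_diff_matrix(seq):
    """Source matrix M(Sk) with entries m_ij = |s_i - s_j|.

    The result is a symmetric, hollow (zero-diagonal) proximity matrix
    of the kind consumed by the IDMDS step.
    """
    s = np.asarray(seq, dtype=float)
    return np.abs(s[:, None] - s[None, :])   # pairwise absolute differences

# A ten-element Fibonacci sequence of the kind described for Table 1
fib = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
M = abs_diff_matrix(fib)                     # 10 x 10 source matrix
```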

[0075]FIG. 9 shows dimensions 5 and 6 of the eight dimensional source
space output. The sequences are clustered, hence classified, according to
whether they are constant, arithmetic, or Fibonacci based. Note that in
this projection, the constant sequence and the constant sequence with
missing element coincide, therefore only two versions of the constant
sequence are visible. This result demonstrates that the tool 100 of the
presently preferred embodiment can function on noisy or error containing,
partially known, sequential data sets.

Example E

Missing Value Estimation for Bridges

[0076]This example extends the previous result to demonstrate the
applicability of the analysis tool to missing value estimation on noisy,
real-world data. The data set consisted of nine categories of bridge data
from the National Bridge Inventory (NBI) of the Federal Highway
Administration. One of these categories, bridge material (steel or
concrete), was removed from the database. The goal was to repopulate this
missing category using the technique of the presently preferred
embodiment to estimate the missing values.

[0077]One hundred bridges were arbitrarily chosen from the NBI. Each
bridge defined an eight dimensional vector of data whose components were
the remaining NBI categories. These vectors were preprocessed as in
Example D, creating
one hundred 8×8 source matrices. The matrices were submitted to
PROXSCAL with specifications: sources--100, objects--8, dimension--7,
model--weighted, initial configuration--simplex,
conditionality--unconditional, transformations--numerical, rate of
convergence--0.0, number of iterations--500, and minimum stress--0.00001.

[0078]The seven dimensional source space output was partially labeled by
bridge material--an application of dye-dropping--and analyzed using the
following function

[0079]where p is an empirically determined negative number, d(x,y) is
Euclidean distance on the source space, and χAi is the
characteristic function of the material set Ai, i=1,2, where A1 is
steel and A2 is concrete. (For the bridge data, no two bridges had the
same source space coordinates, hence gp was well-defined.) A bridge
was determined to be steel if gp(A1,x)>gp(A2,x) and concrete if
gp(A1,x)<gp(A2,x). The result was indeterminate in case of equality.
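
The display equation for gp is not reproduced in this excerpt; the sketch below therefore assumes a plausible inverse-distance form, gp(Ai,x) equal to the sum over the points y in Ai of d(x,y)^p with p negative, consistent with the surrounding description. The function names and sample coordinates are illustrative only:

```python
import numpy as np

def g(labeled_points, x, p=-2.0):
    """Dye-dropping score of point x against one labeled class.

    With p negative, nearby labeled points dominate the sum, so the
    score is largest for the class whose members lie closest to x.
    """
    d = np.linalg.norm(labeled_points - x, axis=1)   # Euclidean distances
    return np.sum(d ** p)

def classify(steel_pts, concrete_pts, x, p=-2.0):
    """Assign x to steel or concrete; indeterminate on a tie."""
    gs, gc = g(steel_pts, x, p), g(concrete_pts, x, p)
    if gs > gc:
        return "steel"
    if gc > gs:
        return "concrete"
    return "indeterminate"

# Hypothetical source-space coordinates for labeled bridges
steel = np.array([[0.0, 0.0], [0.2, 0.1]])
concrete = np.array([[5.0, 5.0], [5.2, 5.1]])
label = classify(steel, concrete, np.array([0.1, 0.0]))
```

Because no two bridges shared source space coordinates, the distances d(x,y) are nonzero and the score is well-defined, as noted above.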

Example F

Network Dimensionality

[0081]This example demonstrates the use of stress/energy minima to
determine network dimensionality from partial network output data.
Dimensionality, in this example, means the number of nodes in a network.

[0082]A four-node network was constructed as follows: generator nodes 1 to
3 were defined by the sine functions, sin(2x), sin(2x+π/2), and
sin(2x+4π/3); node 4 was the sum of nodes 1 through 3. The output of
node 4 was sampled at 32 equal intervals between 0 and 2π.

[0083]The data from node 4 was preprocessed in the manner of Example D:
the ij-th entry of the source matrix for node 4 was defined to be the
absolute value of the difference between the i-th and j-th samples of the
node 4 time series. A second, reference, source matrix was defined using
the same preprocessing technique, now applied to thirty two equal
interval samples of the function sin(x) for 0≦x≦2π. The
resulting two 32×32 source matrices were input to PROXSCAL with
technical specification: sources--2, objects--32, dimension--1 to 6,
model--weighted, initial configuration--simplex,
conditionality--conditional, transformations--numerical, rate of
convergence--0.0, number of iterations--500, and minimum stress--0.0. The
dimension specification had a range of values, 1 to 6. The dimension
resulting in the lowest stress/energy is the dimensionality of the
underlying network.

[0084]Table 2, FIG. 10, shows dimension and corresponding stress/energy
values from the analysis by the tool 100 of the 4-node network. The
stress/energy minimum is achieved in dimension 4, hence the tool 100 has
correctly determined network dimensionality. Similar experiments were run
with more sophisticated dynamical systems and networks. Each of these
experiments resulted in the successful determination of system degrees of
freedom or dimensionality. These experiments included the determination
of the dimensionality of a linear feedback shift register. These devices
generate pseudo-random bit streams and are designed to conceal their
dimensionality.

[0085]From the foregoing, it can be seen that the illustrated embodiment
of the present invention provides a method and apparatus for classifying
input data. Input data are received and formed into one or more matrices.
The matrices are processed using IDMDS to produce a stress/energy value,
a rate of change of stress/energy value, a source space, and a common
space. An output or back end process uses analytical or visual methods to
interpret the source space and the common space. The technique in
accordance with the present invention therefore avoids limitations
associated with statistical pattern recognition techniques, which are
limited to detecting only the expected statistical pattern, and
syntactical pattern recognition techniques, which cannot perceive beyond
the expected structures. Further, the tool in accordance with the present
invention is not limited to the fixed structure of neural pattern
recognizers. The technique in accordance with the present invention
locates patterns in data without interference from preconceptions of
models or users about the data. The pattern recognition method in
accordance with the present invention uses energy minimization to allow
data to self-organize, causing structure to emerge. Furthermore, the
technique in accordance with the present invention determines the
dimension of dynamical systems from partial data streams measured on
those systems through calculation of stress/energy or rate of change of
stress/energy across dimensions.

[0086]While a particular embodiment of the present invention has been
shown and described, modifications may be made. For example, PROXSCAL may
be replaced by other IDMDS routines which are commercially available or
are proprietary. It is therefore intended in the appended claims to cover
all such changes and modifications which fall within the true spirit and
scope of the invention.

[0087]It is therefore intended that the foregoing detailed description be
regarded as illustrative rather than limiting, and that it be understood
that it is the following claims, including all equivalents, that are
intended to define the spirit and scope of this invention.