Abstract:

The purpose is to provide a pattern identifying method, a pattern
identifying device and a pattern identifying program, which are able to
correctly identify a pattern even in a case where an outlier exists.
The identifying method includes: reading, as data, an input pattern to be
identified and a learning pattern previously prepared; computing a
probability of a virtually generated virtual pattern existing between
said input pattern and said learning pattern, as a first probability;
computing a non-similarity of said input pattern with respect to said
learning pattern, based on said first probability; and identifying
whether or not said input pattern is consistent with said learning
pattern, based on said non-similarity.

Claims:

1. A pattern identifying method, comprising: reading, as data, an input
pattern to be identified and a learning pattern previously prepared;
computing a probability of a virtually generated virtual pattern existing
between said input pattern and said learning pattern, as a first
probability; computing a non-similarity of said input pattern with
respect to said learning pattern, based on said first probability; and
identifying whether or not said input pattern is consistent with said
learning pattern, based on said non-similarity.

2. The pattern identifying method according to claim 1, wherein said
computing the non-similarity comprises: computing a logarithm of said
first probability as said non-similarity.

3. The pattern identifying method according to claim 1, wherein said
computing the non-similarity comprises: computing said first probability
itself as said non-similarity.

4. The pattern identifying method according to claim 1, wherein each of
said input pattern, said learning pattern and said virtual pattern is a
multidimensional pattern that includes a plurality of components, said
computing the first probability comprises: computing a probability of
said virtual pattern existing between said input pattern and said
learning pattern for each of said plurality of components, as a
probability element; and computing a product of said probability elements
over said plurality of components, as said first probability, and said
computing said probability element comprises: deciding the probability
element corresponding to an i-th component as 1, when said input pattern
or said learning pattern is lost in said i-th component.

5. The pattern identifying method according to claim 4, wherein said
computing said probability element comprises: computing said probability
element, based on a probability density function that is previously
prepared for each of said plurality of components.

6. The pattern identifying method according to claim 5, wherein said
probability density function is a function that indicates a probability
of existence of randomly generated data.

7. The pattern identifying method according to claim 5, wherein said
probability density function is a function that indicates a probability
of existence of data that is generated to be distributed with uniformity.

8. A pattern identifying program for making a computer execute a method
which comprises: reading, as data, an input pattern to be identified and
a learning pattern previously prepared; computing a probability of a
virtually generated virtual pattern existing between said input pattern
and said learning pattern, as a first probability; computing a
non-similarity of said input pattern with respect to said learning
pattern, based on said first probability; and identifying whether or not
said input pattern is consistent with said learning pattern, based on
said non-similarity.

9. The pattern identifying program according to claim 8, wherein said
computing the non-similarity comprises: computing a logarithm of said
first probability as said non-similarity.

10. The pattern identifying program according to claim 8, wherein said
computing the non-similarity comprises: computing said first probability
itself as said non-similarity.

11. The pattern identifying program according to claim 8, wherein each of
said input pattern, said learning pattern and said virtual pattern is a
multidimensional pattern that includes a plurality of components, said
computing the first probability comprises: computing a probability of
said virtual pattern existing between said input pattern and said
learning pattern for each of said plurality of components, as a
probability element; and computing a product of said probability elements
over said plurality of components, as said first probability, and said
computing said probability element comprises: deciding the probability
element corresponding to an i-th component as 1, when said input pattern
or said learning pattern is lost in said i-th component.

12. The pattern identifying program according to claim 11, wherein said
computing said probability element comprises: computing said probability
element, based on a probability density function that is previously
prepared for each of said plurality of components.

13. The pattern identifying program according to claim 12, wherein said
probability density function is a function that indicates a probability
of existence of randomly generated data.

14. The pattern identifying program according to claim 12, wherein said
probability density function is a function that indicates a probability
of existence of data that is generated to be distributed with uniformity.

15. A pattern identifying device, comprising: a data inputting means for
reading, as data, an input pattern to be identified and a learning
pattern previously prepared; a first probability computing means for
computing a probability of a virtually generated virtual pattern existing
between said input pattern and said learning pattern, as a first
probability; a non-similarity computing means for computing a
non-similarity of said input pattern with respect to said learning
pattern, based on said first probability; and an identifying means for
identifying whether or not said input pattern is consistent with said
learning pattern, based on said non-similarity.

16. The pattern identifying device according to claim 15, wherein said
non-similarity computing means is configured to compute a logarithm of
said first probability as said non-similarity.

17. The pattern identifying device according to claim 15, wherein said
non-similarity computing means is configured to compute said first
probability itself as said non-similarity.

18. The pattern identifying device according to claim 15, wherein said
data inputting means is configured to read a multidimensional pattern
that includes a plurality of components, as each of said input pattern,
said learning pattern and said virtual pattern, said first probability
computing means comprises: a probability element computing means for
computing a probability of said virtual pattern existing between said
input pattern and said learning pattern for each of said plurality of
components, as a probability element; and a multiplying means for
computing a product of said probability elements over said plurality of
components, as said first probability, and said probability element
computing means is configured to decide the probability element
corresponding to an i-th component as 1, when said input pattern or said
learning pattern is lost in said i-th component.

19. The pattern identifying device according to claim 18, wherein said
probability element computing means is configured to compute said
probability element, based on a probability density function that is
previously prepared for each of said plurality of components.

20. The pattern identifying device according to claim 19, wherein said
probability density function is a function that indicates a probability
of existence of randomly generated data.

21. The pattern identifying device according to claim 19, wherein said
probability density function is a function that indicates a probability
of existence of data that is generated to be distributed with uniformity.

Description:

TECHNICAL FIELD

[0001] The present invention relates to a pattern identifying method, a
pattern identifying device and a pattern identifying program.

BACKGROUND ART

[0002] Technologies related to identification of a pattern are applied to
wide fields such as image recognition, voice recognition and data mining
fields. When identifying the pattern, the pattern to be identified
(referred to as "input pattern", hereinafter) is compared with a
previously prepared pattern (referred to as "learning pattern",
hereinafter) so as to determine whether or not the input pattern is
consistent with the learning pattern.

[0003] It is desired to improve identifying accuracy in the technologies
for identifying the pattern. However, the input pattern is not always
provided in a complete state. In the input pattern, a part of components
may be a value (outlier) that is not related to an inherent value. For
example, in a case of the image identification, the input pattern may
include an occlusion. The occlusion is an image of a portion which is
inherently not an object to be compared and may cause the outlier. Also,
in a case of the voice identification, a sudden short-time noise may be
superposed to a voice to be identified. Such a short-time noise may
easily cause the outlier.

[0004] With regard to the input pattern, noise removal is usually
performed as preprocessing. However, it is very difficult to address the
outlier only by the noise removal. Therefore, it is desired to provide a
technique for identifying the pattern more accurately. That is, it is
desired to improve a robustness of identification.

[0005] As one technique for improving the robustness, a technique is
proposed, which uses a similarity or non-similarity between the input
pattern and the learning pattern in order to improve an identifying
performance. Patent Literature 1 (Japanese Patent Publication
JP2006-39658A) discloses that identification is performed by using a
sequence relationship corresponding to a non-similarity between partial
images. Moreover, Patent Literature 2 (Japanese Patent Publication
JP2004-341930A) discloses a technique of addressing the outlier by a vote
method using a reciprocal of a distance as a similarity between the same
categories. Moreover, Non-Patent Literature 3 (C. C. Aggarwal, A.
Hinneburg, D. A. Keim; On the Surprising Behavior of Distance Metrics in
High Dimensional Space, Lecture Notes in Computer Science, Vol. 1973,
Springer, 2001) discloses that an L.sub.1/k norm (k is an integer of 2
or more) is used as a distance scale in a D-dimensional space, and
describes that this improves the robustness against noise.

[0006] Meanwhile, regarding the pattern identification, there is also a
problem in a dimension of the pattern. In a case where the technique
relating to pattern identification is applied to the image recognition or
voice recognition and the like, the number of components may increase in
many cases. That is, a dimension of the input pattern may increase in
many cases. If the dimension of the input pattern increases, it is known
that the identifying accuracy of the pattern is lowered due to the
spherical concentration phenomenon (see, for example, Non-Patent
Literatures 1 (K. Beyer, J. Goldstein, R. Ramakrishnan, U. Shaft; When Is
"Nearest Neighbor" Meaningful?, in Proceedings of the 7th
International Conference on Database Theory, Lecture Notes In Computer
Science, vol. 1540, pp. 217-235, Springer-Verlag, London, 1999.) and 2
(Kamishima: A Survey of Recent Clustering Methods for Data Mining (part
2)--Challenges to Conquer Giga Data Sets and The Curse of
Dimensionality--, The Japanese Society of Artificial Intelligence
Official Journal 18, No. 2, pp. 170-176, 2003)).

[0007] In order to accurately identify a pattern even in a case of a
high-dimensional input pattern, a technique is adopted, which reduces a
dimension of the input pattern. As the technique for reducing the
dimension, for example, a principal component analysis and
multidimensional scaling and the like are known. Also, in Non-Patent
Literature 2, a representative method for efficiently reducing a
dimension is described.

[0008] As other related techniques known to the inventor,
Patent Literature 3 (Japanese Patent Publication JP2000-67294A) and
Patent Literature 4 (Japanese Unexamined Patent Application Publication
JP-A-Heisei 11-513152) are listed.

[0018] In order to calculate a non-similarity (similarity) between a
high-dimensional (D-dimensional) input pattern
X.sup.(1)=(x.sup.(1)1, . . . , x.sup.(1)D) and a learning
pattern X.sup.(2)=(x.sup.(2)1, . . . , x.sup.(2)D), it is
considered to use a distance between the input pattern X.sup.(1) and the
learning pattern X.sup.(2). In other words, it is considered that, as the
larger the distance is, the lower the similarity is (i.e., the higher the
non-similarity is).

[0019] As the distance d.sub.2.sup.(D) (X.sup.(1), X.sup.(2))
between the input pattern X.sup.(1) and the learning pattern X.sup.(2),
it is considered to use an L.sub.2 norm represented by the following
expression 1.
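
Expression 1 itself is not reproduced in this text. As a sketch, under
the assumption that the L.sub.2 norm takes the familiar Euclidean form,
the distance can be computed as:

```python
import math

def l2_distance(x1, x2):
    """Euclidean (L2) distance between two D-dimensional patterns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))
```

A single outlier component then dominates the distance:
l2_distance([0, 0, 0], [0.1, 0.1, 5.0]) is about 5.002, almost entirely
due to the third component.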

[0020] However, when using the L.sub.2 norm, in the components of
D-dimensional patterns, an influence exerting on the non-similarity by a
component having a small distance is much smaller compared to an
influence of a component having a large distance. It is assumed that an
outlier is included in either of the input pattern and the learning
pattern. At this time, the distance between the input pattern and the
learning pattern easily becomes large in a component having the outlier.
Therefore, the influence exerting on the non-similarity becomes large in
the component having the outlier, and it becomes difficult to accurately
identify. Moreover, if the dimension D becomes large, a probability of
existence of the outlier becomes high. Therefore, it becomes further
difficult to identify a pattern in the high-dimensional pattern.

[0021] As a method for reducing the influence of the outlier, it is
considered to use an L.sub.1/k norm (k is an integer of 2 or more) that
is represented by the following expression 2, as a distance
d.sub.1/k.sup.(D) (X.sup.(1), X.sup.(2)) between the D-dimensional input
pattern X.sup.(1)=(x.sup.(1)1, . . . , x.sup.(1)D) and the
learning pattern X.sup.(2)=(x.sup.(2)1, . . . , x.sup.(2)D).
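
Expression 2 is likewise not reproduced here. A sketch, assuming the
common fractional-norm form (the sum of the component gaps each raised
to the power 1/k, with the sum then raised to the power k):

```python
def l_frac_distance(x1, x2, k=2):
    """Assumed L_{1/k} distance: (sum_i |x1_i - x2_i| ** (1/k)) ** k."""
    return sum(abs(a - b) ** (1.0 / k) for a, b in zip(x1, x2)) ** k
```

Compared with the L.sub.2 norm, the 1/k-th power compresses a large
component gap before summation, so an outlier component weighs
relatively less in the total distance.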

[0022] In the case where an L.sub.α norm (α is a positive real
number) is used as the distance, the smaller α is, the higher the
robustness becomes, in the identification. This is because, the smaller
α is, the smaller the influence by the component having a large
distance becomes so that the influence by the outlier becomes relatively
small. By using the L.sub.1/k norm as the distance, the influence
exerted on the non-similarity by the outlier is reduced, and it is
considered that accurate identification is facilitated even in the case
of the high-dimensional pattern.

[0023] However, even in the case of using the L.sub.1/k norm, it was still
difficult to completely eliminate the influence of the outlier.

[0024] Therefore, an object of the present invention is to provide a
pattern identifying method, a pattern identifying device and a pattern
identifying program, which are able to accurately identify a pattern even in
the case where the outlier exists.

[0025] A pattern identifying method according to the present invention
includes: reading, as data, an input pattern to be identified and a
learning pattern previously prepared; computing, as a first probability,
a probability of a virtually generated virtual pattern existing between
the input pattern and the learning pattern; computing a non-similarity of
the input pattern with respect to the learning pattern based on the first
probability; and identifying whether or not the input pattern is
consistent with the learning pattern based on the non-similarity.

[0026] A pattern identifying program according to the present invention is
a program for a computer executing the steps of: reading, as data, an
input pattern to be identified and a learning pattern previously
prepared; computing, as a first probability, a probability of a virtually
generated virtual pattern existing between the input pattern and the
learning pattern; computing a non-similarity based on the first
probability; and identifying whether or not the input pattern is
consistent with the learning pattern based on the non-similarity.

[0027] A pattern identifying device according to the present invention
includes: data input means adapted to read, as data, an input pattern to
be identified and a learning pattern previously prepared; first
probability computing means adapted to compute, as a first probability, a
probability of a virtually generated virtual pattern existing between the
input pattern and the learning pattern; non-similarity computing means
adapted to compute a non-similarity based on the first probability; and
identifying means adapted to identify whether or not the input pattern is
consistent with the learning pattern based on the non-similarity.

[0028] According to the present invention, a pattern identifying method, a
pattern identifying device and a pattern identifying program are
provided, which can accurately identify a pattern even in the case where
the outlier exists.

BRIEF DESCRIPTION OF DRAWINGS

[0029] FIG. 1 is a schematic block diagram showing a pattern identifying
device according to a first embodiment;

[0030] FIG. 2 is a flow chart showing a pattern identifying method
according to the first embodiment;

[0031] FIG. 3 is a flow chart showing the pattern identifying method
according to the first embodiment; and

[0032] FIG. 4 is a schematic block diagram showing a pattern identifying
device according to a second embodiment.

DESCRIPTION OF EMBODIMENTS

First Exemplary Embodiment

[0033] FIG. 1 is a schematic block diagram showing a pattern identifying
system according to the present exemplary embodiment. This pattern
identifying system includes a pattern identifying device 10, an external
storage device 20 and an output device 30.

[0034] Input data and a learning data group are stored as data in the
external storage device 20. The input data indicates a target pattern to
be identified. The learning data group indicates learning patterns. The
learning patterns are patterns to be compared to the input pattern as
references of identification. The learning data group includes a
plurality of pieces of learning data in a list. The external storage
device 20 includes, for example, a hard disk and the like.

[0035] The pattern identifying device 10 is provided for identifying a
learning pattern that is consistent with the input pattern. The pattern
identifying device 10 includes an input device 13, a search device 14, a
non-similarity computing device 11, a memory 15 for storing various kinds
of data and an identifying device 12. The input device 13, the search
device 14, the non-similarity computing device 11 and the identifying
device 12 are realized by a pattern identifying program that is stored in
a ROM (Read Only Memory) and the like.

[0036] The input device 13 is provided for reading the input pattern. The
input device 13 extracts a plurality of features (components) based on the
input data. Then, a feature value x of each component is obtained to
generate an input pattern X.sup.(1)=(x.sup.(1)1, . . . ,
x.sup.(1)D). The generated input pattern X.sup.(1) is read into the
pattern identifying device 10. In the input pattern
X.sup.(1)=(x.sup.(1)1, . . . , x.sup.(1)D), x.sup.(1)n (n
is a positive integer) indicates the feature value x of an n-th
component. D indicates the number of the components, namely, indicates
that the dimension of the input pattern X.sup.(1) is D.

[0037] The search device 14 is provided for reading the learning pattern
from the learning pattern group. The search device 14 searches learning
data from the learning data group. Then, the search device 14 extracts a
plurality of features (components) based on the searched learning data,
similarly to the input device 13. Then, the search device 14 obtains a
feature value of each component and generates a D-dimensional learning
pattern X.sup.(2)=(x.sup.(2)1, . . . , x.sup.(2)D). The
generated learning pattern X.sup.(2) is read into the pattern identifying
device 10.

[0038] The non-similarity computing device 11 is provided for computing a
non-similarity between the input pattern X.sup.(1) and the learning
pattern X.sup.(2). The non-similarity computing device 11 includes a
first probability computing part 16 and a non-similarity computing part
17. The first probability computing part 16 includes a probability
element computing part 18 and a multiplying part 19.

[0039] The identifying device 12 is provided for identifying whether or
not the input pattern X.sup.(1) is consistent with the learning pattern
X.sup.(2), based on the non-similarity.

[0040] In the memory 15, probability density function data 15-1 and a
threshold 15-2 for identification are previously stored.

[0041] The probability density function data 15-1 is data that indicates
a probability density function q(x). The probability density function q(x)
is a function of the feature value x, and indicates a probability of
existence of the data when the data is randomly generated within a
domain. The probability density function data 15-1 indicates a
probability density function for each of D pieces of components. That is,
the probability density function data 15-1 indicates probability density
functions q1 (x1), . . . , qD (xD), regarding the D pieces of
components.
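
As an illustrative sketch (the per-component domains below are
hypothetical, and uniform densities are one of the choices the later
claims mention), the D probability density functions can be represented
as ordinary functions:

```python
# Hypothetical per-component domains; the patent only requires that each
# q_i(x) describes data randomly generated within a domain.
domains = [(0.0, 1.0), (0.0, 10.0), (-1.0, 1.0)]

def make_uniform_q(lo, hi):
    """Uniform density on [lo, hi]: constant 1/(hi - lo) inside, 0 outside."""
    def q(x):
        return 1.0 / (hi - lo) if lo <= x <= hi else 0.0
    return q

q_funcs = [make_uniform_q(lo, hi) for lo, hi in domains]
```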

[0042] The threshold 15-2 is data indicating a value that is used as a
reference when identifying whether or not the input pattern is consistent
with the learning pattern.

[0043] The output device 30 is exemplified as a display device having a
display screen or the like. An identified result by the pattern
identifying device 10 is outputted to the output device 30.

[0044] Subsequently, a pattern identifying method according to the present
exemplary embodiment will be explained below.

[0045] FIG. 2 is a flow chart showing the pattern identifying method
according to the present exemplary embodiment.

[0046] Step S10: Reading of Input Pattern

[0047] Initially, the input data stored in the external storage device 20
is read into the pattern identifying device 10 via the input device 13.
The input device 13 extracts a plurality (D pieces) of features
(components) based on the input data. Then, the feature value x of each
component is obtained to generate the input pattern
X.sup.(1)=(x.sup.(1)1, . . . , x.sup.(1)D). The generated input
pattern X.sup.(1) is read into the pattern identifying device 10.

[0048] Step S20: Reading of Learning Pattern

[0049] Next, the search device 14 reads a learning pattern from the
learning data group stored in the external storage device 20 into the
pattern identifying device 10. The search device 14 extracts a plurality
(D pieces) of components based on the learning data, similarly to the
input device 13. Then, the feature value of each component is
obtained to generate the learning pattern X.sup.(2)=(x.sup.(2)1, . .
. , x.sup.(2)D). The generated learning pattern X.sup.(2) is read
into the pattern identifying device 10.

[0050] Step S30: Computation of Non-similarity

[0051] Subsequently, the non-similarity computing device 11 computes a
non-similarity between the input pattern X.sup.(1) and the learning
pattern X.sup.(2). The process in the present step will be described
later.

[0052] Step S40: Is Data Pair Consistent?

[0053] Subsequently, the identifying device 12 compares the non-similarity
with the threshold 15-2 stored in the memory 15. The identifying device
12 determines whether or not the input pattern is consistent with the
learning pattern, based on the comparison result.

[0054] Step S50: Output of Identified Result

[0055] In Step S40, when the input pattern is consistent with the learning
pattern, the identifying device 12 outputs, via the output device 30, the
fact that the input pattern is consistent with the learning pattern.

[0056] Step S60: Are All Learning Patterns Processed?

[0057] Meanwhile, in Step S40, when the input pattern is not consistent
with the learning pattern, a next learning pattern is read from the
learning data group of the external storage device 20 by the search
device 14, and the processes in Step S20 and subsequent steps are
repeated. In a case where all the learning data of the learning data
group have been processed, the identifying device 12 outputs, via the
output device 30, the fact that there is no consistent learning pattern.

[0058] By a series of the processes described above, the learning pattern
is identified that is consistent with the input pattern.
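
The loop of Steps S10 to S60 can be sketched as follows; the function
names and the direction of the threshold comparison (smaller
non-similarity means more similar) are assumptions for illustration:

```python
def identify(input_pattern, learning_patterns, non_similarity, threshold):
    """Return the first learning pattern judged consistent with the input
    pattern (Steps S20-S50), or None when none matches (Step S60)."""
    for learning_pattern in learning_patterns:        # S20, repeated via S60
        e = non_similarity(input_pattern, learning_pattern)  # S30
        if e < threshold:                             # S40
            return learning_pattern                   # S50: consistent
    return None                                       # no consistent pattern
```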

[0059] In the present exemplary embodiment, the process in the step (Step
S30) of computing the non-similarity is devised.

[0060] FIG. 3 is a flow chart specifically showing an operation of Step
S30. In Step S30, the first probability computing part 16 computes a
probability of a virtually generated pattern X.sup.(3)=(x.sup.(3)1,
. . . , x.sup.(3)D) (referred to as "virtual pattern", hereinafter)
existing between the input pattern X.sup.(1) and the learning pattern
X.sup.(2), as the first probability (Steps S31 and S32). Then, the
non-similarity computing part 17 computes a logarithm of the first
probability, as the non-similarity (Step S33). The following describes
the process of each step in more detail.

[0061] Step S31: Computation of Probability Element

[0062] Initially, regarding each of the D-dimensional components, the
probability element computing part 18 computes a probability of the
virtual pattern X.sup.(3) existing between the input pattern X.sup.(1)
and the learning pattern X.sup.(2), as a probability element p
(x.sup.(1)i, x.sup.(2)i). This probability element p
(x.sup.(1)i, x.sup.(2)i) is computed by using the probability
density function qi (xi). That is, regarding an i-th
component xi, the probability element p (x.sup.(1)i,
x.sup.(2)i) is obtained by the following expression 3.
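
Expression 3 itself is not reproduced in this text. Under the assumption
that the probability element is the integral of qi (x) over the interval
between the two feature values, it can be sketched numerically as:

```python
def probability_element(x1, x2, q, steps=10_000):
    """Approximate the integral of the density q between x1 and x2
    (trapezoidal rule); this is the assumed form of expression 3."""
    a, b = min(x1, x2), max(x1, x2)
    if a == b:
        return 0.0
    h = (b - a) / steps
    total = 0.5 * (q(a) + q(b)) + sum(q(a + i * h) for i in range(1, steps))
    return total * h
```

For a uniform density on [0, 1], probability_element(0.2, 0.7, lambda x:
1.0) is 0.5: a random virtual value has a 50% chance of falling between
the two feature values.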

[0064] Subsequently, the multiplying part 19 computes a probability of
all of the D pieces of components of the virtual pattern X.sup.(3)
existing between the input pattern X.sup.(1) and the learning pattern
X.sup.(2), as the first probability P (X.sup.(1), X.sup.(2)). This first
probability P (X.sup.(1), X.sup.(2)) can be computed by obtaining a
product of the probability elements p (x.sup.(1)i, x.sup.(2)i)
obtained in Step S31. That is, the first probability P (X.sup.(1),
X.sup.(2)) can be computed by the following expression 4.
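
A sketch of expression 4, including the rule of claim 4 that a lost
(missing) component contributes a probability element of 1; the
"element" argument stands for any per-component probability-element
function, and None marks a lost value (both are illustrative
assumptions):

```python
def first_probability(x1, x2, q_funcs, element):
    """Product of the per-component probability elements (expression 4)."""
    p = 1.0
    for a, b, q in zip(x1, x2, q_funcs):
        if a is None or b is None:
            p *= 1.0  # lost component: its probability element is fixed to 1
        else:
            p *= element(a, b, q)
    return p
```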

[0065] The obtained first probability P (X.sup.(1), X.sup.(2)) indicates a
probability of the virtual pattern X.sup.(3) randomly given in a domain
of the input pattern X.sup.(1) incidentally existing between the input
pattern X.sup.(1) and the learning pattern X.sup.(2). Hence, it can be
said that the smaller this first probability P is, the smaller the
difference between the input pattern X.sup.(1) and the learning pattern
X.sup.(2) is. In this case, it is concluded that the input pattern
X.sup.(1) and the learning pattern X.sup.(2) are similar patterns.

[0066] Step S33: Computation of Non-similarity

[0067] Next, the non-similarity computing part 17 computes a logarithm of
the first probability P (X.sup.(1), X.sup.(2)) as a non-similarity
E.sup.(D) (X.sup.(1), X.sup.(2)). That is, the non-similarity computing
part 17 computes the non-similarity E.sup.(D)(X.sup.(1), X.sup.(2)) by
the following expression 5.

[Expression 5]

E.sup.(D)(X.sup.(1),X.sup.(2))=ln P(X.sup.(1), X.sup.(2)) (5)

[0068] By the processes of Steps S31 to S33 as described above, the
non-similarity E.sup.(D) (X.sup.(1), X.sup.(2)) between the input pattern
X.sup.(1) and the learning pattern X.sup.(2) is computed. Since the
computed non-similarity is a logarithm of a probability, it becomes a
non-positive value. Also, the larger the first probability P (X.sup.(1),
X.sup.(2)) is, the larger the non-similarity E.sup.(D) (X.sup.(1),
X.sup.(2)) becomes, which represents that the non-similarity is large
(i.e., the similarity is small).
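
Since the first probability is a product of per-component elements,
expression 5 reduces to a sum of logarithms; a minimal sketch:

```python
import math

def non_similarity(prob_elements):
    """Expression 5: E = ln P = sum of ln p_i; non-positive because
    each probability element p_i lies in (0, 1]."""
    return sum(math.log(p) for p in prob_elements)
```

non_similarity([1.0]) is 0.0, the largest possible value, while smaller
probability elements drive E toward minus infinity.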

[0069] Subsequently, an effect of the present exemplary embodiment will be
explained.

[0070] When a distance between the input pattern X.sup.(1) and the
learning pattern X.sup.(2) is small, the non-similarity E.sup.(D)
(X.sup.(1), X.sup.(2)) obtained in the present exemplary embodiment
becomes a small value. In this respect, it is similar to the case where
the non-similarity is calculated based on the L.sub.1/k norm distance
(see expression 2) between the input pattern and the learning pattern.

[0071] However, whereas the L.sub.1/k norm is a non-negative value, the
non-similarity of the present exemplary embodiment is a non-positive
value. In the case where the L.sub.1/k norm is used as the
non-similarity, a penalty is imposed on the similarity in a component
having a large distance, such as the outlier. That is, if k is set to be
a large value, an influence exerted on the similarity (non-similarity) by
an outlier component becomes smaller than that in the case of setting k
to be small. However, among the D pieces of components, the outlier
component still has a large influence on the non-similarity.

[0072] Contrary to this, in the present exemplary embodiment, points are
added to the similarity in a component having a small distance.
Therefore, among the D pieces of components, the outlier component
easily has the smallest influence on the non-similarity. This point is
explained below.

[0073] The contribution of the probability element p (x.sup.(1)i,
x.sup.(2)i) of the i-th component to the non-similarity is defined as Ei
(X.sup.(1), X.sup.(2)). Moreover, it is assumed that the non-similarity
E.sup.(D) (X.sup.(1), X.sup.(2)) can be given as a sum of the
contributions Ei (X.sup.(1), X.sup.(2)) of all the components. That is,
it is assumed that the following expression 6 can be established between
the non-similarity E.sup.(D) (X.sup.(1), X.sup.(2)) and the contribution
Ei (X.sup.(1), X.sup.(2)).
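
Expressions 6 and 7 are not reproduced in this text; from expressions 4,
5 and 8 they are presumably:

```latex
% Expression 6 (assumed): the non-similarity as a sum of contributions
E^{(D)}(X^{(1)}, X^{(2)}) = \sum_{i=1}^{D} E_i(X^{(1)}, X^{(2)}) \qquad (6)

% Expression 7 (assumed): expanding expression 5 with expression 4
E^{(D)}(X^{(1)}, X^{(2)})
  = \ln \prod_{i=1}^{D} p\bigl(x_i^{(1)}, x_i^{(2)}\bigr)
  = \sum_{i=1}^{D} \ln p\bigl(x_i^{(1)}, x_i^{(2)}\bigr) \qquad (7)
```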

[0075] According to the expression 7, the contribution Ei (X.sup.(1),
X.sup.(2)) of the i-th component can be represented by the following
expression 8.

[Expression 8]

Ei(X.sup.(1),X.sup.(2))=ln p(xi.sup.(1),xi.sup.(2)) (8)

[0076] Referring to the expression 8, since the contribution Ei
(X.sup.(1), X.sup.(2)) of the i-th component is a logarithm of a
probability, it is understood that the contribution is always 0 or a
negative value. That is, it is understood that the following expression 9
can be established.

[Expression 9]

Ei(X.sup.(1),X.sup.(2))=ln
p(xi.sup.(1),xi.sup.(2))≦0 (9)

[0077] In the component having the outlier, there is a large difference
between the input pattern X.sup.(1) and the learning pattern X.sup.(2) in
the feature value. Therefore, the probability element p (x.sup.(1)i,
x.sup.(2)i) becomes large. Hence, the contribution Ei (X.sup.(1),
X.sup.(2)) of the component having the outlier becomes large. However,
the contribution Ei (X.sup.(1), X.sup.(2)) is 0 or a negative value
(non-positive value), and the absolute value of Ei (X.sup.(1), X.sup.(2))
becomes small. The fact that the absolute value of the contribution Ei
(X.sup.(1), X.sup.(2)) is small means that its influence on the
non-similarity, that is, the computed result, is small. That is, among
all the components, the component having the outlier easily has the
smallest influence on the non-similarity. In contrast, in the case of a
similar component, the probability element p (x.sup.(1)i, x.sup.(2)i)
becomes small and the absolute value of the contribution Ei (X.sup.(1),
X.sup.(2)) easily becomes large. That is, the influence on the computed
result of the non-similarity easily becomes large.
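
A small numerical illustration (uniform densities on [0, 1] are assumed,
so each probability element is simply the gap between the two feature
values):

```python
import math

gaps = [0.01, 0.02, 0.9]  # two similar components and one outlier component
contributions = [math.log(g) for g in gaps]
# ln 0.01 is about -4.61, ln 0.02 about -3.91, ln 0.9 about -0.11:
# the outlier's contribution has by far the smallest absolute value.
```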

[0078] As described above, according to the present exemplary embodiment,
among the D pieces of components, the component having the outlier has a
small influence on the non-similarity. Thus, a pattern can be identified
even when the pattern is a high-dimensional pattern. By this feature,
even in an image identification involving, e.g., an occlusion, it
becomes possible to reduce the contribution of the occlusion portion
that is essentially not to be compared.

Second Exemplary Embodiment

[0079] Subsequently, a second exemplary embodiment of the present
invention will be explained. FIG. 4 is a schematic block diagram showing
a configuration of a pattern identifying device according to the present
exemplary embodiment. In the present exemplary embodiment, the
non-similarity computing part is removed in comparison with the first
exemplary embodiment. The other points can be the same as those of the
first exemplary embodiment, and the detailed explanation thereof is
omitted here.

[0080] In the present exemplary embodiment, the step (Step S30) of
computing a non-similarity in the first exemplary embodiment is modified.
That is, in the present exemplary embodiment, the first probability
itself is treated as the non-similarity.

[0081] Even if the first probability itself is used, the non-similarity
can reflect the degree of similarity (non-similarity) between the input
pattern X.sup.(1) and the learning pattern X.sup.(2).

[0082] When the first probability itself is used as the non-similarity, it
can be said that the threshold for identification indicates the
probability of the input pattern being determined to be consistent with
the learning pattern although the input pattern is inherently
inconsistent with the learning pattern. Therefore, when determining the
threshold for identification, an expected error rate itself can be used.
For example, in a case where the expected error rate is 0.01%, the
threshold for identification may be set to 0.01%. Thus, according to the
present exemplary embodiment, setting a parameter in the pattern
identifying device is facilitated.
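The decision rule of paragraph [0082] can be sketched as follows. The function name `identify` and the default threshold are illustrative assumptions; the only point carried over from the source is that the threshold can be set directly to the expected error rate (0.01% corresponds to 0.0001).

```python
def identify(first_probability, expected_error_rate=1e-4):
    """Hypothetical decision rule for the second exemplary embodiment.

    The first probability itself serves as the non-similarity, and the
    patterns are judged consistent when it does not exceed the
    threshold, which is simply the expected error rate (0.01% -> 1e-4).
    """
    return first_probability <= expected_error_rate

# A very small first probability means the virtual pattern is unlikely
# to fall between the two patterns, i.e. they are close: consistent.
assert identify(5e-5) is True
# A large first probability means the patterns are far apart.
assert identify(0.3) is False
```

Because the threshold has a direct probabilistic reading, no separate calibration of an abstract distance threshold is needed, which is the parameter-setting convenience the paragraph describes.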

Third Exemplary Embodiment

[0083] Subsequently, a third exemplary embodiment of the present invention
will be explained. In the present exemplary embodiment, the process of
the non-similarity computing device 11 (the process in Step S30 for
computing a non-similarity) is further devised in comparison with the
exemplary embodiments mentioned above. The other points can be the same
as those of the exemplary embodiments mentioned above, and the detailed
explanation thereof will be omitted.

[0084] In a fingerprint identification and the like, data of a part of the
features (components) is lost from the input pattern in many cases. If
the data is lost, it may be difficult to calculate the non-similarity.

[0085] For example, the method of using the L1/k norm (see Expression
2) is unsuitable for a pattern identification when a missing value
exists. It is assumed that a distance d1/k.sup.(D) (X.sup.(1),
X.sup.(2)) between a D-dimensional input pattern
X.sup.(1)=(x.sup.(1)1, . . . , x.sup.(1)D) and a learning
pattern X.sup.(2)=(x.sup.(2)1, . . . , x.sup.(2)D) is obtained
by using the L1/k norm. Also, with respect to a (D-d)-dimensional
input pattern wherein d pieces of components are excluded as missing
values from the D-dimensional input pattern, it is assumed that a
distance d1/k.sup.(D-d) (X.sup.(1)', X.sup.(2)') from the learning
pattern X.sup.(2) is obtained. When the distance d1/k.sup.(D) (X.sup.(1),
X.sup.(2)) and the distance d1/k.sup.(D-d) (X.sup.(1)', X.sup.(2)') are
compared, the result is d1/k.sup.(D-d) (X.sup.(1)',
X.sup.(2)')≦d1/k.sup.(D) (X.sup.(1), X.sup.(2)). That is, in the case
where the missing value exists, the distance between the input pattern
and the learning pattern becomes smaller, and the input pattern is more
easily determined to be similar to the learning pattern.
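The shrinking-distance problem of paragraph [0085] is easy to reproduce numerically. The sketch below assumes one common form of the fractional (L1/k) distance, (Σ|x1_i − x2_i|^(1/k))^k; the exact form of Expression 2 is not reproduced in this section, so the function `d_1_over_k` is an assumption, though the monotonicity argument holds for any sum of non-negative per-component terms.

```python
def d_1_over_k(x1, x2, k=2):
    """Fractional (L1/k) distance, assumed here to be
    (sum_i |x1_i - x2_i|**(1/k))**k; an illustrative stand-in for
    Expression 2 of the source."""
    return sum(abs(a - b) ** (1.0 / k) for a, b in zip(x1, x2)) ** k

x_full_1 = [0.1, 0.7, 0.4, 0.9]
x_full_2 = [0.3, 0.2, 0.8, 0.1]
# Drop the last d = 2 components as "missing values".
x_miss_1, x_miss_2 = x_full_1[:2], x_full_2[:2]

d_full = d_1_over_k(x_full_1, x_full_2)
d_miss = d_1_over_k(x_miss_1, x_miss_2)

# Removing components can only shrink a sum of non-negative terms, so
# the pattern with missing values looks spuriously *more* similar.
assert d_miss <= d_full
```

This is exactly the undesirable property the third exemplary embodiment sets out to avoid.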

[0086] Therefore, the present exemplary embodiment provides a scheme for
handling the missing value.

[0087] In the present exemplary embodiment, when a value of a certain
component is a missing value in the input pattern X.sup.(1) or the
learning pattern X.sup.(2), the probability element computing part 18
computes the probability element p (x.sup.(1)i, x.sup.(2)i) of
the component as 1 (see expression 10 as below).

[Expression 10]

p(xi.sup.(1), xi.sup.(2))=1 (10)

[0088] Thus, the contribution that the probability element of the
missing-value component exerts on the non-similarity becomes zero (see
expression 11 below).

[Expression 11]

Ei(X.sup.(1), X.sup.(2))=0 (11)

[0089] Accordingly, the non-similarity E.sup.(D) (X.sup.(1), X.sup.(2))
between two D-dimensional patterns X.sup.(1) and X.sup.(2) including no
missing value never becomes larger than the non-similarity E.sup.(D-d)
(X.sup.(1)', X.sup.(2)') between the (D-d)-dimensional patterns
X.sup.(1)' and X.sup.(2)' from which d pieces of components are excluded
as missing values. Therefore, the similarity becomes smaller in the case
where the missing value exists. Thus, differently from the case of using
the L1/k norm, the property of E.sup.(D-d) (X.sup.(1)',
X.sup.(2)')≧E.sup.(D) (X.sup.(1), X.sup.(2)) can be imparted to the
non-similarity. For example, even in a case where a feature value of a
part of the input pattern may be lost, such as a fingerprint
identification, it becomes possible to determine that the case having no
data loss is more similar.
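The missing-value handling of Expressions 10 and 11 can be sketched as follows. As before, the per-component probability element is illustratively taken as the gap |x1_i − x2_i| under a uniform [0, 1] density; missing components are represented as `None`, and the function name `non_similarity` is an assumption, not a name from the source.

```python
import math

def non_similarity(x1, x2):
    """Non-similarity E = sum_i ln p_i, with the probability element
    forced to 1 for missing components (Expression 10), so that their
    contribution ln 1 = 0 (Expression 11).

    Illustrative assumption: for present components, p_i is the gap
    |x1_i - x2_i| under a uniform [0, 1] density, clamped away from 0.
    """
    eps = 1e-12
    total = 0.0
    for a, b in zip(x1, x2):
        if a is None or b is None:
            continue  # p_i = 1 -> ln p_i = 0: no influence
        total += math.log(max(abs(a - b), eps))
    return total

x_lr = [0.48, 0.50, 0.05, 0.70]
x_full = [0.50, 0.52, 0.95, 0.60]           # no missing values
x_miss = [0.50, 0.52, None, None]           # d = 2 components missing

# The complete pattern never has a larger non-similarity than the one
# with missing values: E^(D) <= E^(D-d).
assert non_similarity(x_full, x_lr) <= non_similarity(x_miss, x_lr)
```

Because each present component only ever adds a non-positive term, skipping components (contribution 0) can only raise the total, which is precisely the E.sup.(D-d) ≧ E.sup.(D) property claimed in paragraph [0089].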

Fourth Exemplary Embodiment

[0090] Subsequently, a fourth exemplary embodiment of the present
invention will be explained. In the present exemplary embodiment, the
probability density function data 15-1 is modified in comparison with the
exemplary embodiments mentioned above. In the above-mentioned exemplary
embodiments, as the probability density function, a function is provided
that indicates a probability of existence of data that is randomly
generated in a domain. On the other hand, in the present exemplary
embodiment, the probability density function is a function that indicates
a probability of existence of data that is generated so as to be
uniformly distributed within the domain.

[0091] As in the present exemplary embodiment, by using a uniform
distribution function as the probability density function, the same
effects as those of the exemplary embodiments mentioned above can be
obtained.
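Under a uniform density, the probability element of paragraph [0090] has a closed form: the probability that a value drawn uniformly from the domain falls between the two feature values is the interval length divided by the domain length. The function below is an assumed reading of the fourth exemplary embodiment, not a formula given verbatim in the source.

```python
def probability_element_uniform(x1_i, x2_i, lo=0.0, hi=1.0):
    """Probability that a virtual value drawn uniformly from [lo, hi]
    falls between the two feature values (assumed reading of the
    fourth exemplary embodiment)."""
    return abs(x1_i - x2_i) / (hi - lo)

# On the default domain [0, 1], the element is simply the gap.
assert probability_element_uniform(0.25, 0.75) == 0.5
# On a wider domain the same gap yields a smaller probability.
assert probability_element_uniform(2.0, 6.0, lo=0.0, hi=10.0) == 0.4
```

With this element in place, the rest of the pipeline (logarithm, summation, thresholding) is unchanged, which is why the same effects as the earlier embodiments are obtained.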

[0092] This application is based upon and claims the benefit of priority
from Japanese Patent Application No. 2008-152952, filed on Jun. 11, 2008,
the disclosure of which is incorporated herein in its entirety by
reference.