
Diversity among the members of a team of classifiers is deemed to be a key issue in classifier combination. However, measuring diversity is not straightforward because there is no generally accepted formal definition. We have found and studied ten statistics which can measure diversity among binary classifier outputs (correct or incorrect vote for the class label): four averaged pairwise measures (the Q statistic, the correlation, the disagreement and the double fault) and six non-pairwise measures (the entropy of the votes, the difficulty index, the Kohavi-Wolpert variance, the interrater agreement, the generalized diversity, and the coincident failure diversity). Four experiments have been designed to examine the relationship between the accuracy of the team and the measures of diversity, and among the measures themselves. Although there are proven connections between diversity and accuracy in some special cases, our results raise some doubts about the usefulness of diversity measures in building classifier ensembles in real-life pattern recognition problems.
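The four averaged pairwise measures named above are easy to state for binary "oracle" outputs (1 = the classifier voted for the correct label, 0 = it did not). Below is a minimal sketch of the Q statistic, the correlation coefficient, the disagreement measure, and the double-fault measure for one pair of classifiers; in the paper these are averaged over all pairs in the team. Function names are illustrative, not taken from the paper.

```python
import math

def pairwise_counts(a, b):
    """Return (N11, N10, N01, N00): joint correct/incorrect counts
    for two oracle output vectors a and b."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return n11, n10, n01, n00

def q_statistic(a, b):
    """Yule's Q: (N11*N00 - N01*N10) / (N11*N00 + N01*N10)."""
    n11, n10, n01, n00 = pairwise_counts(a, b)
    return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)

def correlation(a, b):
    """Pearson correlation of the two binary outputs."""
    n11, n10, n01, n00 = pairwise_counts(a, b)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n01 * n10) / denom

def disagreement(a, b):
    """Fraction of samples on which exactly one classifier is correct."""
    n11, n10, n01, n00 = pairwise_counts(a, b)
    return (n10 + n01) / (n11 + n10 + n01 + n00)

def double_fault(a, b):
    """Fraction of samples on which both classifiers are wrong."""
    n11, n10, n01, n00 = pairwise_counts(a, b)
    return n00 / (n11 + n10 + n01 + n00)
```

Two identical classifiers give Q = 1 and zero disagreement, while negatively dependent classifiers drive Q below zero; the double fault, by contrast, mixes diversity with accuracy, which is one reason the measures do not agree with one another.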

The combination of different text representations and search strategies has become a standard technique for improving the effectiveness of information retrieval. Combination, for example, has been studied extensively in the TREC evaluations and is the basis of the “meta-search” engines used on the Web. This paper examines the development of this technique, including both experimental results and the retrieval models that have been proposed as formal frameworks for combination. We show that combining approaches for information retrieval can be modeled as combining the outputs of multiple classifiers based on one or more representations, and that this simple model can provide explanations for many of the experimental results. We also show that this view of combination is very similar to the inference net model, and that a new approach to retrieval based on language models supports combination and can be integrated with the inference net model.
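The kind of evidence combination discussed above can be illustrated with a CombSUM-style fusion of ranked retrieval runs: normalize each run's scores so they are comparable, then sum them per document. This is a sketch of the general technique, not the paper's specific model; the run and document names are made up for the example.

```python
def minmax_normalize(scores):
    """Map one run's raw scores to [0, 1] so different runs are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def comb_sum(runs):
    """CombSUM: sum normalized scores over all runs; rank by fused score."""
    fused = {}
    for run in runs:
        for doc, s in minmax_normalize(run).items():
            fused[doc] = fused.get(doc, 0.0) + s
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Two hypothetical runs over the same query with different score scales.
run_a = {"d1": 12.0, "d2": 7.0, "d3": 2.0}
run_b = {"d2": 0.9, "d3": 0.8, "d4": 0.1}
fused_ranking = comb_sum([run_a, run_b])
```

A document ranked moderately well by several independent representations ("d2" here) can overtake one ranked highly by only a single run, which is the intuition the classifier-combination view makes precise.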

In this paper the score distributions of a number of text search engines are modeled. It is shown empirically that the score distributions on a per query basis may be fitted using an exponential distribution for the set of non-relevant documents and a normal distribution for the set of relevant documents. Experiments show that this model fits TREC-3 and TREC-4 data for not only probabilistic search engines like INQUERY but also vector space search engines like SMART for English. We have also used this model to fit the output of other search engines like LSI search engines and search engines indexing other languages like Chinese. It is then shown that given a query for which relevance information is not available, a mixture model consisting of an exponential and a normal distribution can be fitted to the score distribution. These distributions can be used to map the scores of a search engine to probabilities. We also discuss how the shape of the score distributions arises given certain assumptions about word distributions in documents. We hypothesize that all 'good' text search engines operating on any language have similar characteristics. This model has many possible applications. For example, the outputs of different search engines can be combined by averaging the probabilities (optimal if the search engines are independent) or by using the probabilities to select the best engine for each query. Results show that the technique performs as well as the best current combination techniques. This material is based on work supported in part by the National Science Foundation, Library of Congress and Department of Commerce under cooperative agreement number EEC-9209623, in part by the National Science Foundation under grant numbers IRI-9619117 and IIS-9909073, in part by N...
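Once the two components are fitted, the score-to-probability mapping described above is just Bayes' rule on the mixture: non-relevant scores follow an exponential density, relevant scores a normal density, weighted by the (usually small) proportion of relevant documents. The sketch below uses illustrative parameter values, not values fitted to any real engine.

```python
import math

def exp_pdf(x, lam):
    """Exponential density, modeling scores of non-relevant documents."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def norm_pdf(x, mu, sigma):
    """Normal density, modeling scores of relevant documents."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_relevant(score, p_rel, lam, mu, sigma):
    """P(relevant | score) via Bayes' rule on the two-component mixture."""
    rel = p_rel * norm_pdf(score, mu, sigma)
    non = (1 - p_rel) * exp_pdf(score, lam)
    return rel / (rel + non)

# Illustrative parameters: 5% relevant, non-relevant scores decaying fast,
# relevant scores concentrated near 0.7.
lo = p_relevant(0.2, p_rel=0.05, lam=5.0, mu=0.7, sigma=0.1)
hi = p_relevant(0.8, p_rel=0.05, lam=5.0, mu=0.7, sigma=0.1)
```

Mapping every engine's raw scores through its own fitted mixture puts them on a common probability scale, after which averaging across engines (or picking the engine with the sharpest posterior for a query) becomes meaningful.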

Diversity, negative dependence (or independence), orthogonality, and complementarity are intuitively desirable characteristics of a classifier team. It has been proved theoretically that a group of independent classifiers improves upon the single best classifier when majority vote combination is used. A dependent set of classifiers may be either better or worse. It is assumed that this holds for other combination methods as well. It is therefore hoped that using measures of diversity will allow the identification of classifiers which will produce good results on combination. This study looks at the relationships between different methods of classifier combination and measures of diversity. We considered ten combination methods and ten measures of diversity.

...roduces accuracy which is related to the correlation between the classifier outputs. They extended this result to show a similar relationship for combination by order statistics: minimum, maximum, mean [13]. In a previous study, we proved that there is a functional relationship between the Q statistic and the upper and the lower limits of the majority vote accuracy [14]. However, there is no theoretical...

In this paper, a theoretical and experimental analysis of linear combiners for multiple classifier systems is presented. Although linear combiners are the most frequently used combining rules, many important issues related to their operation for pattern classification tasks lack a theoretical basis. After a critical review of the framework developed in works by Tumer and Ghosh, on which our analysis is based, we focus on the simplest and most widely used implementation of linear combiners, which consists in assigning a non-negative weight to each individual classifier. Moreover, we consider the ideal performance of this combining rule, i.e., that achievable when the optimal values of the weights are used. We do not consider the problem of weights estimation, which has been extensively addressed in the literature. Our theoretical analysis shows how the performance of linear combiners, in terms of misclassification probability, depends on the performance of individual classifiers, and on the correlation between their outputs. In particular, we evaluate the ideal performance improvement that can be achieved using the weighted average over the simple average combining rule, and investigate in what way it depends on the individual classifiers. Experimental results on real data sets show that the behaviour of linear combiners agrees with the predictions of our analytical model. Finally, we discuss the contribution to the state of the art and the practical relevance of our theoretical and experimental analysis of linear combiners for multiple classifier systems.
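The combining rule analyzed above has a very small implementation footprint: each classifier emits estimated class posteriors, and the combiner takes a non-negative weighted average before picking the argmax. The sketch below uses hand-chosen weights purely for illustration; the paper studies the *ideal* weights, not their estimation.

```python
def weighted_average_combiner(posteriors, weights):
    """Linear combiner: non-negative weighted average of per-classifier
    class posteriors, followed by argmax over classes.

    posteriors: list of per-classifier posterior lists, one entry per class.
    Returns (predicted_class_index, fused_posteriors)."""
    assert all(w >= 0 for w in weights), "linear combiner uses non-negative weights"
    total = sum(weights)
    n_classes = len(posteriors[0])
    fused = [
        sum(w * p[c] for w, p in zip(weights, posteriors)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: fused[c]), fused

# Two classifiers disagree on a two-class problem; with unequal weights the
# more trusted classifier wins, with equal weights the decision flips.
label, fused = weighted_average_combiner(
    [[0.6, 0.4], [0.3, 0.7]], weights=[3.0, 1.0]
)
```

The gap between this weighted average and the simple average (all weights equal) is exactly the quantity whose ideal value the paper characterizes in terms of the individual error rates and the correlation between the estimation errors.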

...caused by the patterns falling in the interval (x ∗ , xb), which are assigned to class ωi instead of ωj. The 1 In this paper, we will consider the case of a one-dimensional feature space, as in [30], [31]. This is not a limitation, since in [29] it was shown that the same results hold for the case of multi-dimensional feature spaces. 2 Note that this assumption does not hold if the estimates of the po...

Abstract. We derive upper and lower limits on the majority vote accuracy with respect to individual accuracy p, the number of classifiers in the pool (L), and the pairwise dependence between classifiers, measured by Yule’s Q statistic. Independence between individual classifiers is typically viewed as an asset in classifier fusion. We show that the majority vote with dependent classifiers can potentially offer a dramatic improvement both over independent classifiers and over the individual accuracy p. A functional relationship between the limits and the pairwise dependence Q is derived. Two patterns of the joint distribution for classifier outputs (correct/incorrect) are identified to derive the limits: the pattern of success and the pattern of failure. The results support the intuition that negative pairwise dependence is beneficial although not straightforwardly related to the accuracy. The pattern of success showed that for the highest improvement over p, all pairs of classifiers in the pool should have the same negative dependence.
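The "pattern of success" can be made concrete with a tiny hand-built example, assuming oracle outputs (1 = correct vote). Three classifiers each have individual accuracy p = 2/3; when their failures are spread so that exactly two are correct on every sample (negative pairwise dependence), the majority vote is perfect, while three fully dependent copies of the same classifier gain nothing. The vectors below are constructed by hand for illustration only.

```python
def majority_vote_accuracy(oracle_outputs):
    """Fraction of samples on which more than half of the classifiers
    vote correctly. oracle_outputs: list of equal-length 0/1 vectors."""
    n = len(oracle_outputs[0])
    correct = 0
    for i in range(n):
        votes = sum(clf[i] for clf in oracle_outputs)
        if votes > len(oracle_outputs) / 2:
            correct += 1
    return correct / n

# Pattern of success: on every sample exactly two of three are correct,
# so every individual accuracy is 2/3 but the majority is always right.
success = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]

# Fully dependent classifiers: all succeed and fail on the same samples,
# so the majority vote matches the individual accuracy exactly.
dependent = [[1, 1, 0], [1, 1, 0], [1, 1, 0]]
```

This is the boundary case behind the derived limits: for fixed p and L, how far the majority vote can rise above p (or sink below it, in the pattern of failure) is controlled by the pairwise dependence Q.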

...raging, or by an order statistic such as minimum, maximum or median, the classification error rate above the Bayes error (called the added error) depends on the correlation between the estimates (see [25,26]). Positively correlated classifiers only slightly reduce the added error, uncorrelated classifiers reduce the added error by a factor of 1/L, and negatively correlated classifiers reduce the error ev...

Multiple classifier systems based on the combination of outputs of a set of different classifiers have been proposed in the field of pattern recognition as a method for the development of high performance classification systems. Previous work clearly showed that multiple classifier systems are effective only if the classifiers forming them are accurate and make different errors. Therefore, the fundamental need for methods aimed at designing "accurate and diverse" classifiers is currently acknowledged. In this paper, an approach to the automatic design of multiple classifier systems is proposed. Given an initial large set of classifiers, our approach is aimed at selecting the subset made up of the most accurate and diverse classifiers. A proof of the optimality of the proposed design approach is given. Reported results on the classification of multisensor remote-sensing images show that this approach allows the design of effective multiple classifier systems.
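The selection idea above can be sketched with a toy scoring rule: exhaustively score candidate subsets by mean individual accuracy minus mean pairwise double-fault rate, and keep the best. This scoring rule is illustrative only; it is not the criterion whose optimality the paper proves. Oracle outputs (1 = correct) again stand in for real classifiers.

```python
from itertools import combinations

def accuracy(oracle):
    """Individual accuracy of one oracle output vector."""
    return sum(oracle) / len(oracle)

def double_fault(a, b):
    """Fraction of samples on which both classifiers are wrong."""
    return sum(1 for x, y in zip(a, b) if x == 0 and y == 0) / len(a)

def select_subset(pool, k):
    """Return the indices (tuple) of the k-classifier subset maximizing
    mean accuracy minus mean pairwise double-fault rate. Requires k >= 2."""
    def score(idxs):
        accs = [accuracy(pool[i]) for i in idxs]
        dfs = [double_fault(pool[i], pool[j]) for i, j in combinations(idxs, 2)]
        return sum(accs) / len(accs) - sum(dfs) / len(dfs)
    return max(combinations(range(len(pool)), k), key=score)

pool = [
    [1, 1, 1, 0, 0, 0],  # clf 0
    [1, 1, 0, 1, 0, 0],  # clf 1: errs partly on different samples than clf 0
    [1, 1, 1, 0, 0, 0],  # clf 2: exact clone of clf 0
]
best = select_subset(pool, 2)
```

The clone pair (0, 2) has the same mean accuracy as the others but the highest double-fault rate, so it is never selected: exactly the "accurate but not redundant" behaviour the design approach formalizes.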

Independence and dependence of classifier outputs have been debated in the recent literature giving rise to notions such as diversity, complementarity, orthogonality, etc. There seems to be no consensus on the meaning of these notions beyond the intuitive perception. Here we summarize 10 measures of classifier diversity: 4 pairwise and 6 non-pairwise measures. We derive the limits of the measures for 2 classifiers of equal accuracy.

...he most popular measure has been the correlation between d_{i,j} and d_{k,j} for class ω_j. A sum of correlations for ω_j weighted by the prior probabilities P(ω_j), j = 1, …, c, has been used in [34, 35] to derive a relationship between correlation and improvement on the accuracy when the average combination method is used. The supposedly beneficial negative correlation has been the backbone of the negati...

Abstract—The combination of multisource remote sensing and geographic data is believed to offer improved accuracies in land cover classification. For such classification, the conventional parametric statistical classifiers, which have been applied successfully in remote sensing for the last two decades, are not appropriate, since a convenient multivariate statistical model does not exist for the data. In this paper, several single and multiple classifiers that are appropriate for the classification of multisource remote sensing and geographic data are considered. The focus is on multiple classifiers: bagging algorithms, boosting algorithms, and consensus-theoretic classifiers. These multiple classifiers have different characteristics. The performance of the algorithms in terms of accuracies is compared for two multisource remote sensing and geographic datasets. In the experiments, the multiple classifiers outperform the single classifiers in terms of overall accuracies. Index Terms—Bagging, boosting, consensus theory, multiple classifiers, multisource remote sensing data.
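Of the multiple-classifier families named above, bagging is the simplest to sketch: train base classifiers on bootstrap resamples of the training set and combine them by majority vote. The sketch below uses a one-dimensional threshold "stump" as the base learner on synthetic data; it illustrates the resampling-plus-voting pattern only, not the classifiers or datasets used in the paper.

```python
import random

def fit_stump(xs, ys):
    """Pick the threshold (midpoint between adjacent sorted points) that
    minimizes training error for the rule 'predict True iff x > t'."""
    best_t, best_err = None, float("inf")
    pts = sorted(xs)
    for t in [(a + b) / 2 for a, b in zip(pts, pts[1:])]:
        err = sum(1 for x, y in zip(xs, ys) if (x > t) != y)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging(xs, ys, n_rounds, seed=0):
    """Train one stump per bootstrap resample of (xs, ys)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_rounds):
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def bagged_predict(stumps, x):
    """Majority vote over the bagged stumps."""
    votes = sum(1 for t in stumps if x > t)
    return votes > len(stumps) / 2

# Toy linearly separable data; an odd number of rounds avoids tied votes.
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [False, False, False, True, True, True]
stumps = bagging(xs, ys, n_rounds=11)
```

Each resample yields a slightly different threshold, and the vote averages away that variance: the mechanism behind bagging's variance reduction, with boosting instead reweighting the samples each round and AdaBoost paying for it in computation, as noted below.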

...hibit virtually no overfitting when the data are noiseless. Other advantages of boosting include that the algorithm has a tendency to reduce both the variance and the bias of the classification [11], [13]. On the other hand, AdaBoost is computationally more demanding than other simpler methods. Therefore, it is dependent on the classification problem whether it is more valuable to get increased classi...