Abstract

Feature Subset Selection (FSS) is a well-known task in the Machine Learning, Data Mining, Pattern Recognition, and Text Learning paradigms. Genetic Algorithms (GAs) are possibly the most commonly used algorithms for FSS tasks. Although the FSS literature contains many papers, few of them tackle FSS in domains with more than 50 features. In this chapter we present a novel search heuristic paradigm, called Estimation of Distribution Algorithms (EDAs), as an alternative to GAs for performing a population-based, randomized search in datasets of large dimensionality. The EDA paradigm avoids the use of genetic crossover and mutation operators to evolve the populations. In the absence of these operators, evolution is guaranteed by factorizing the probability distribution of the best solutions found in a generation of the search and then simulating this distribution to obtain a new pool of solutions. We present four different probabilistic models to perform this factorization. In a comparison with two types of GAs on natural and artificial datasets of large dimensionality, EDA-based approaches obtain encouraging results with regard to accuracy, and they needed fewer evaluations than the genetic approaches.
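
The select, estimate, and simulate loop described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the chapter's implementation: it uses the simplest possible factorization, in which every feature is an independent Bernoulli variable, and the abstract does not say whether this is one of the chapter's four probabilistic models. The names `eda_feature_selection` and `toy_fitness`, the parameter defaults, and the probability clamp are all hypothetical. In a real FSS setting the fitness function would be an estimate of classifier accuracy on the selected features.

```python
import random


def eda_feature_selection(fitness, n_features, pop_size=50, n_select=25,
                          n_generations=30, seed=0):
    """Univariate EDA sketch: each feature is an independent Bernoulli variable.

    Assumption: this independence model is chosen for brevity only.
    """
    rng = random.Random(seed)
    # Start with every feature equally likely to be included or excluded.
    probs = [0.5] * n_features
    best_mask, best_score = None, float("-inf")
    for _ in range(n_generations):
        # Simulate the current distribution to obtain a new pool of solutions.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Evaluate each candidate subset and rank from best to worst.
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        # Keep the best individuals and re-estimate the per-feature marginals.
        selected = [ind for _, ind in scored[:n_select]]
        for j in range(n_features):
            freq = sum(ind[j] for ind in selected) / n_select
            # Clamp (hypothetical choice) so no feature is fixed permanently.
            probs[j] = min(max(freq, 0.05), 0.95)
    return best_mask, best_score


def toy_fitness(mask, relevant=frozenset({0, 3, 7})):
    """Hypothetical stand-in for a classifier's estimated accuracy.

    Rewards selecting the 'relevant' features and penalizes every extra one.
    """
    hits = sum(1 for j in relevant if mask[j])
    extras = sum(mask) - hits
    return hits - 0.1 * extras


if __name__ == "__main__":
    mask, score = eda_feature_selection(toy_fitness, n_features=20)
    print("selected features:", [j for j, bit in enumerate(mask) if bit])
    print("fitness:", score)
```

No crossover or mutation operator appears anywhere in the loop. New candidates come only from sampling the re-estimated distribution, which is the property that separates EDAs from GAs in the abstract's description.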