Classification algorithms are difficult to apply to sequential examples, such as text or DNA sequences, because a vast number of features are potentially useful for describing each example. Past work on feature selection has focused on searching the space of all subsets of the available features, which is intractable for large feature sets. The authors adapt data mining techniques to act as a preprocessor that selects features for standard classification algorithms such as Naive Bayes and Winnow. They apply their algorithm to a number of data sets and show experimentally that the features it produces improve classification accuracy by up to 20%.
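
The general idea of mining features as a preprocessing step can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes, purely for illustration, that candidate features are contiguous substrings up to a fixed length and that a feature is kept when it is frequent in one class and clearly rarer in the rest of the data. The function names and the thresholds `min_support`, `min_gap`, and `max_len` are hypothetical.

```python
from collections import Counter


def substrings(seq, max_len):
    """Return the set of contiguous substrings of seq with length 1..max_len."""
    return {seq[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(seq) - n + 1)}


def mine_features(examples, labels, min_support=0.5, min_gap=0.5, max_len=3):
    """Select substrings that are frequent in some class and distinctive for it.

    Illustrative criteria (assumptions, not taken from the source):
    a pattern is kept if the fraction of examples of a class containing it
    is at least min_support, and exceeds the corresponding fraction in all
    other examples by at least min_gap.
    """
    classes = sorted(set(labels))
    sizes = Counter(labels)
    counts = {c: Counter() for c in classes}
    for seq, label in zip(examples, labels):
        # Each example contributes at most once per pattern.
        counts[label].update(substrings(seq, max_len))

    selected = set()
    for c in classes:
        rest = len(examples) - sizes[c]
        for pattern, n_in in counts[c].items():
            support_in = n_in / sizes[c]
            n_out = sum(counts[o][pattern] for o in classes if o != c)
            support_out = n_out / rest if rest else 0.0
            if support_in >= min_support and support_in - support_out >= min_gap:
                selected.add(pattern)
    return sorted(selected)


def to_vector(seq, features):
    """Encode a sequence as 0/1 indicators, one per selected feature."""
    return [int(f in seq) for f in features]


if __name__ == "__main__":
    examples = ["ACGTAC", "ACGGAC", "TTGACA", "TTGCCA"]
    labels = ["pos", "pos", "neg", "neg"]
    features = mine_features(examples, labels, min_support=1.0, min_gap=1.0)
    print(features)
    print(to_vector("ACGTAC", features))
```

The resulting 0/1 vectors could then be passed to any standard classifier, so the mining step stays independent of the learning algorithm. Because only patterns that pass the thresholds are kept, the sketch never enumerates subsets of the feature set.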