Description

Classification problems in machine learning involve assigning labels to various kinds of output types, from single-assignment binary and multi-class classification to more complex assignments such as category ranking, sequence identification, and structured-output classification. Traditionally, most machine learning algorithms and theory have been developed for the binary setting. In this dissertation, we provide a framework to unify these problems. Through this framework, many algorithms and much of the theoretical understanding developed in the binary domain are extended to more complex settings.
First, we introduce Constraint Classification, a learning framework that provides a unified view of complex-output problems. Within this framework, each complex-output label is viewed as a set of constraints sufficient to capture the information needed to classify the example. Thus, prediction in the complex-output setting is reduced to determining which constraints, out of a potentially large set, hold for a given example, a task that can be accomplished by the repeated application of a single binary classifier to indicate whether or not each constraint holds. Using this insight, we provide a principled extension of binary learning algorithms, such as the support vector machine and the Perceptron algorithm, to the complex-output domain. We also show that desirable theoretical and experimental properties of the algorithms are maintained in the new setting.
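As an illustration of the reduction described above, the following is a minimal sketch for the multi-class case only. It assumes a Kesler-style expansion in which the constraint "the true class y outranks class j" becomes a single vector in a larger space, so that an ordinary binary Perceptron can be trained on those vectors. The function names (`expand`, `train`, `predict`) and the toy data are hypothetical and are not taken from the dissertation, whose exact construction may differ.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]


def expand(x: Sequence[float], y: int, j: int, k: int) -> List[float]:
    """Embed the constraint 'class y outranks class j' as one vector in R^(k*d).

    x is copied into block y and -x into block j, so that w . z > 0
    is equivalent to w_y . x > w_j . x.
    """
    d = len(x)
    z = [0.0] * (k * d)
    for i, xi in enumerate(x):
        z[y * d + i] = xi
        z[j * d + i] = -xi
    return z


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def train(data: List[Example], k: int, epochs: int = 100) -> List[float]:
    """Binary Perceptron applied to every expanded constraint vector."""
    d = len(data[0][0])
    w = [0.0] * (k * d)
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            for j in range(k):
                if j == y:
                    continue
                z = expand(x, y, j, k)
                if dot(w, z) <= 0:  # constraint violated: additive update
                    w = [wi + zi for wi, zi in zip(w, z)]
                    mistakes += 1
        if mistakes == 0:  # every constraint holds strictly
            break
    return w


def predict(w: Sequence[float], x: Sequence[float], k: int) -> int:
    """Return the class whose weight block scores x highest."""
    d = len(x)
    scores = [dot(w[c * d:(c + 1) * d], x) for c in range(k)]
    return max(range(k), key=scores.__getitem__)


# Hypothetical, linearly separable toy data with three classes.
TOY_DATA: List[Example] = [
    ((1.0, 0.0), 0), ((2.0, 0.5), 0),
    ((0.0, 1.0), 1), ((0.5, 2.0), 1),
    ((-1.0, -1.0), 2), ((-2.0, -1.0), 2),
]

weights = train(TOY_DATA, k=3)
predictions = [predict(weights, x, k=3) for x, _ in TOY_DATA]
```

In this sketch a single weight vector answers every pairwise constraint, and the predicted class is the one that satisfies all of its constraints against the other classes. Richer outputs such as rankings would, under the same assumption, simply contribute a different set of constraints per example.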
Second, we address the structured output problem directly. Structured output labels are collections of variables corresponding to a known structure, such as a tree, graph, or sequence, that can bias or constrain the global output assignment. The traditional approach to learning structured output classifiers, which decomposes a structured output into multiple localized labels that are learned independently, is theoretically sub-optimal. In contrast, recent methods, such as constraint classification, that learn functions to directly classify the global output can achieve optimal performance. Surprisingly, in practice it is unclear which methods achieve state-of-the-art performance. In this work, we study the circumstances under which each method performs best. With enough time, training data, and representational power, the global approaches are better. However, we also show both theoretically and experimentally that learning a suite of local classifiers, even sub-optimal ones, can produce the best results under many real-world settings.
Third, we address an important algorithm in machine learning, the maximum margin classifier. Even with a conceptual understanding of how to extend maximum margin algorithms to more complex settings, and with performance guarantees for large margin classifiers, complex outputs render traditional approaches intractable. We introduce a new algorithm for learning maximum margin classifiers that uses coresets to find a provably approximate solution to the maximum margin linear separating hyperplane. Then, using the constraint classification framework, this algorithm applies directly to all of the previously mentioned complex-output domains. In addition, coresets motivate approximate algorithms for active learning and for learning in the presence of outlier noise, for which we give simple, elegant, and previously unknown proofs of their effectiveness.

You are granted permission for the non-commercial reproduction, distribution, display, and performance of this technical report in any format, BUT this permission is only for a period of 45 (forty-five) days from the most recent time that you verified that this technical report is still available from the University of Illinois at Urbana-Champaign Computer Science Department under terms that include this permission. All other rights are reserved by the author(s).