Academic Commons Search Results
http://academiccommons.columbia.edu/catalog.rss?f%5Bauthor_facet%5D%5B%5D=Sable%2C+Carl&f%5Bdepartment_facet%5D%5B%5D=Computer+Science&f%5Bgenre_facet%5D%5B%5D=Technical+reports&q=&rows=500&sort=record_creation_date+desc

Using Density Estimation to Improve Text Categorization
http://academiccommons.columbia.edu/catalog/ac:109986
Sable, Carl; McKeown, Kathleen; Hatzivassiloglou, Vasileios
http://hdl.handle.net/10022/AC:P:29284
Thu, 21 Apr 2011 00:00:00 +0000

This paper explores the use of a statistical technique known as density estimation to potentially improve the results of text categorization systems which label documents by computing similarities between documents and categories. In addition to potentially improving a system's overall accuracy, density estimation converts similarity scores to probabilities. These probabilities provide confidence measures for a system's predictions which are easily interpretable and could potentially help to combine results of various systems. We discuss the results of three complete experiments on three separate data sets applying density estimation to the results of a TF*IDF/Rocchio system, and we compare these results to those of many competing approaches.

Computer science
krm8
Computer Science
Technical reports
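
The abstract's central idea, turning raw similarity scores into probabilities through density estimation, can be illustrated with a small sketch. This is not the authors' method, which the abstract does not specify. It assumes a Gaussian kernel density estimate over the similarity scores of past correct and incorrect predictions, combined with Bayes' rule. The function names, the bandwidth value, and the toy scores are all hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point.

    Assumption: a fixed-bandwidth Gaussian kernel; the report may use a
    different estimator.
    """
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def score_to_probability(score, correct_scores, incorrect_scores, bandwidth=0.1):
    """Estimate P(prediction is correct | similarity score) via Bayes' rule.

    `correct_scores` and `incorrect_scores` are similarity scores observed on
    held-out documents where the categorizer was right or wrong (hypothetical
    inputs for this sketch).
    """
    p_correct = gaussian_kde(correct_scores, bandwidth)(score)
    p_incorrect = gaussian_kde(incorrect_scores, bandwidth)(score)
    total = len(correct_scores) + len(incorrect_scores)
    prior_correct = len(correct_scores) / total
    prior_incorrect = 1.0 - prior_correct
    numerator = p_correct * prior_correct
    denominator = numerator + p_incorrect * prior_incorrect
    # Far from all training scores both densities underflow to zero;
    # fall back to the prior in that case.
    return numerator / denominator if denominator > 0 else prior_correct


# Hypothetical toy scores, not data from the report.
correct = [0.8, 0.7, 0.9, 0.75]
incorrect = [0.2, 0.3, 0.1, 0.35]
high = score_to_probability(0.85, correct, incorrect)
low = score_to_probability(0.15, correct, incorrect)
```

Here `high` comes out close to 1 and `low` close to 0, so the output can be read as a confidence measure. Because such probabilities share a common scale, they could in principle be compared or combined across systems, which is the use the abstract suggests.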