Abstract

The proliferation of information sources available on the World Wide Web has created a need for database selection tools that locate potentially useful information sources with respect to the user's information need. Current database selection tools treat each database independently, ignoring the implicit, useful associations between distributed databases. To overcome this shortcoming, in this paper we introduce a data-mining approach that assists database selection by extracting potentially interesting association rules between web databases from a collection of previous selection results. Using a topic hierarchy, we exploit intraclass and interclass associations between distributed databases, and use the discovered knowledge about these databases to refine the original selection results. We present experimental results demonstrating that this technique improves the effectiveness of database selection.