Abstract
A new query clustering method based on user-query logs is presented. Traditional clustering techniques focus on queries and click-through logs, which are often sparse, and the resulting average cluster size is often small. In contrast, the user-query log is much denser but also noisier. To reduce the influence of noise and discover similar queries, queries issued by the same user in the same session are assumed to be mostly similar. Based on this assumption, a new similarity measure using query co-occurrence relations is calculated to create a query neighbor vector space, in which each query is represented by a vector consisting of its neighbors. The similarity function for clustering is calculated from these query neighbor vectors. An adjusted version of density-based spatial clustering of applications with noise (DBSCAN) is applied to generate the clusters. Experiments on a real dataset of 95 262 queries show that 79.77% precision and 48.21% recall are achieved, with an average cluster size of 51.
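The pipeline summarized above (session co-occurrence, neighbor vectors, vector similarity, density-based clustering) can be illustrated with a minimal sketch. Several choices here are assumptions and not taken from the method itself: cosine similarity stands in for the similarity function, each query is counted as its own neighbor so that two queries that only co-occur with each other are not orthogonal, plain DBSCAN replaces the adjusted variant, and the function names, the `eps` and `min_pts` values, and the toy sessions are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def neighbor_vectors(sessions):
    """Map each query to sparse co-occurrence counts over the queries in its sessions.

    Assumption: a query counts as its own neighbor, weighted by the number of
    sessions it appears in.
    """
    vectors = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        unique = set(session)
        for q in unique:
            for r in unique:
                vectors[q][r] += 1
    return {q: dict(v) for q, v in vectors.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts (an assumed measure)."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def dbscan(vectors, eps=0.5, min_pts=2):
    """Plain DBSCAN with distance = 1 - cosine similarity; label -1 marks noise."""
    queries = sorted(vectors)

    def region(q):
        # All queries within distance eps of q, including q itself.
        return [r for r in queries if 1.0 - cosine(vectors[q], vectors[r]) <= eps]

    labels = {}
    cluster = -1
    for q in queries:
        if q in labels:
            continue
        seeds = region(q)
        if len(seeds) < min_pts:
            labels[q] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[q] = cluster
        queue = [r for r in seeds if r != q]
        while queue:
            r = queue.pop()
            if labels.get(r) == -1:
                labels[r] = cluster  # noise reclaimed as a border point
            if r in labels:
                continue
            labels[r] = cluster
            reachable = region(r)
            if len(reachable) >= min_pts:  # r is a core point, so keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical toy sessions: each inner list holds one user's queries in one session.
sessions = [
    ["cheap flights", "airline tickets", "flight deals"],
    ["cheap flights", "airline tickets"],
    ["airline tickets", "flight deals"],
    ["python tutorial", "learn python"],
    ["learn python", "python tutorial", "python course"],
]
labels = dbscan(neighbor_vectors(sessions), eps=0.5, min_pts=2)
for query, label in sorted(labels.items()):
    print(label, query)
```

On this toy input the three flight queries share one cluster label and the three Python queries share another. The pairwise region search is quadratic in the number of queries, so a log of the size reported above would need an index or a sparse neighbor lookup instead.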