
Abstract

In the language modeling approach to information retrieval, Dirichlet prior smoothing frequently outperforms Jelinek-Mercer smoothing. Both Dirichlet prior and Jelinek-Mercer are forms of linear interpolated smoothing. The only difference between them is that Dirichlet prior determines the amount of smoothing based on a document's length. Theory suggests that Dirichlet prior's advantage should be the result of better document model estimation, for Dirichlet prior sensibly smooths longer documents less. In contrast, our hypothesis was that Dirichlet prior's performance advantage comes primarily from a penalization of shorter documents' scores. We conducted two experiments to test our hypothesis. In our first experiment, when we transformed the test collections to have a uniform probability of relevance given document length, P(Rel|Len), Dirichlet prior's performance advantage disappeared. If Dirichlet prior's advantage came from better estimation, it should have retained that advantage even with a uniform P(Rel|Len). In our second experiment, we gave the known P(Rel|Len) as a document prior to the retrieval method. With the document prior, Jelinek-Mercer's performance increased to match Dirichlet prior's, and Dirichlet prior showed some degradation in performance. These results confirm our hypothesis. While better estimation was formerly a plausible explanation of Dirichlet prior's performance advantage, we now know that Dirichlet prior smoothing's advantage appears to come from its penalization of shorter documents.
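The contrast between the two smoothing methods can be sketched as follows. This is a minimal illustration of the standard Jelinek-Mercer and Dirichlet prior estimates, not the authors' code; the parameter values (λ = 0.7, μ = 2000) are conventional defaults chosen here for illustration:

```python
def jelinek_mercer(tf, doc_len, p_coll, lam=0.7):
    """Jelinek-Mercer: interpolate the maximum-likelihood document model
    with the collection model using a FIXED weight lam, regardless of
    document length."""
    p_ml = tf / doc_len if doc_len > 0 else 0.0
    return (1 - lam) * p_ml + lam * p_coll

def dirichlet_prior(tf, doc_len, p_coll, mu=2000):
    """Dirichlet prior: equivalent to linear interpolation with a
    length-dependent weight mu / (doc_len + mu), so longer documents
    are smoothed less."""
    return (tf + mu * p_coll) / (doc_len + mu)

# For a query term absent from a document (tf = 0), Jelinek-Mercer
# assigns the same smoothed probability no matter the document length,
# while Dirichlet prior assigns a lower probability to the longer
# document -- the length sensitivity discussed above.
p_coll = 0.001
short_unseen = dirichlet_prior(0, 10, p_coll)
long_unseen = dirichlet_prior(0, 1000, p_coll)
```

Under this formulation, Dirichlet prior's interpolation weight μ/(|d| + μ) shrinks as |d| grows, which is the length dependence whose effect on retrieval scores the experiments above isolate.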