A Practical Sampling Strategy for Efficient Retrieval Evaluation

Javed A. Aslam and Virgil Pavlu

Abstract:

We consider the problem of large-scale retrieval evaluation, with a focus on the considerable effort required to judge tens of thousands of documents using traditional test collection construction methodologies. Recently, two methods based on random sampling were proposed to help alleviate this burden: the first, proposed by Aslam et al., is highly accurate and efficient but also quite complex, while the second, proposed by Yilmaz et al., is relatively simple but significantly less accurate and efficient.

In this work, we propose a new method for large-scale retrieval evaluation based on random sampling which combines the strengths of each of the above methods: it maintains the simplicity of the Yilmaz et al. method while achieving the performance of the Aslam et al. method. Furthermore, we demonstrate that this new sampling method can be adapted to incorporate both randomly sampled and fixed relevance judgments, as were available in the most recent TREC Terabyte track, for example.
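To make the general idea concrete, the sketch below illustrates sampling-based evaluation in its simplest form: judge only a random subset of the document pool and correct for the sampling via inverse-probability (Horvitz-Thompson style) weighting when estimating a measure such as precision at a cutoff. This is only an illustrative sketch of the underlying principle, not the estimator proposed in the paper; the names `pool`, `judge`, and `sample_judgments` are hypothetical.

    import random

    def sample_judgments(pool, sample_size, judge):
        """Uniformly sample documents from the pool and judge only those.
        Returns {doc_id: relevance} for the sampled documents only."""
        sampled = random.sample(pool, sample_size)
        return {doc: judge(doc) for doc in sampled}

    def estimated_precision_at_k(ranked_list, judgments, pool_size, sample_size, k):
        """Inverse-probability estimate of precision@k from sampled judgments.
        Each pool document is sampled with probability p = sample_size / pool_size,
        so a sampled relevant document stands in for 1/p relevant documents."""
        p = sample_size / pool_size
        est_relevant = sum(judgments[doc] / p
                           for doc in ranked_list[:k] if doc in judgments)
        return est_relevant / k

    # Hypothetical usage: a pool of 1000 documents, the first 100 of which are relevant.
    pool = [f"doc{i}" for i in range(1000)]
    judge = lambda doc: 1 if int(doc[3:]) < 100 else 0
    judgments = sample_judgments(pool, sample_size=200, judge=judge)
    run = sorted(pool, key=lambda d: int(d[3:]))   # a "perfect" ranked run
    print(estimated_precision_at_k(run, judgments, len(pool), 200, k=50))

Because the estimate is unbiased under uniform sampling, its expected value equals the true precision@k; the methods discussed in the paper refine this basic idea to reduce variance and to accommodate a mixture of sampled and fixed judgments.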

Full text:

A working draft as of May 22, 2007, may be found below.