My question is: how can I get exactly the same ranking results
from these two queries?

Thanks,
Ying-Hsang Liu

On Jun 29, 2006, at 11:47 PM, Katherine Don wrote:

> Hi
>
> We do a very simplistic version of ranking for cross collection
> searching - we take the ranks from each subcollection at face value. I
> think that the same document in a different collection would receive a
> different rank - it depends on term frequencies within a collection
> as a whole.
> So how you split up the documents into collections could have an
> effect on the ranking.
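
A minimal sketch of the effect described above, assuming a simplified
TF-IDF cosine score. This is not MG/MGPP's exact formula; the function
names (cosine_scores, ranked) and the toy documents are hypothetical and
only illustrate how per-collection statistics can change the merged order.

```python
import math
from collections import Counter


def cosine_scores(query, docs):
    """Score docs (id -> list of terms) against a list of query terms.

    Simplified TF-IDF cosine, for illustration only (an assumption,
    not the formula MG or MGPP actually uses).
    """
    n = len(docs)
    # Document frequency is counted within THIS collection only.
    df = Counter(t for terms in docs.values() for t in set(terms))
    idf = {t: math.log(1 + n / df[t]) for t in df}
    scores = {}
    for doc_id, terms in docs.items():
        # Term weight = raw term frequency * collection-specific idf.
        weights = {t: f * idf[t] for t, f in Counter(terms).items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        dot = sum(weights.get(t, 0.0) * idf.get(t, 0.0) for t in set(query))
        scores[doc_id] = dot / norm if norm else 0.0
    return scores


def ranked(scores):
    """Return doc ids ordered by descending score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: the same four documents, indexed two ways.
sub1 = {"x": ["search"], "w": ["search", "engine"]}
sub2 = {"y": ["search", "music"], "z": ["music", "archive"]}
query = ["search"]

# One big collection: statistics come from all four documents.
big_scores = cosine_scores(query, {**sub1, **sub2})

# Cross-collection search: score each subcollection separately,
# then merge the scores at face value.
merged_scores = {**cosine_scores(query, sub1), **cosine_scores(query, sub2)}

print(ranked(big_scores))
print(ranked(merged_scores))
```

In this sketch, "x" ranks first in the single big collection, while "y"
ranks first after the face-value merge, because "search" is rarer (and so
weighted more heavily) inside sub2 than inside sub1. The two orderings
agree only when the document counts and frequencies are computed over the
same set of documents.
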
>
> Regards,
> Katherine
>
> Ying-Hsang Liu wrote:
>> Hi
>>
>> Thanks for your helpful information regarding the details of ranking.
>>
>> Since Greenstone has a cross-collection search function, which I
>> also use in my collection, I am wondering whether there will be
>> differences in the ranking results. More specifically, will the
>> ranking results differ between one big collection and several
>> sub-collections (given the same data set)?
>>
>> Thanks!
>>
>>
>> On Jun 18, 2006, at 8:00 PM, Katherine Don wrote:
>>
>>> Hi
>>>
>>> MG does either boolean or ranked queries (but not both at once) while
>>> MGPP ranks boolean queries. So the "display results in ranked/natural
>>> order" just switches the ranking on/off.
>>> Only documents which match the boolean query will be included in the
>>> results.
>>>
>>> The ranking is done using a cosine measure (based on term frequency,
>>> document frequency, document weights...) - see the book mentioned below
>>> for more information about this.
>>>
>>> There is a website about MG, at http://www.cs.mu.oz.au/mg/ and there is
>>> a link there to more information about the software. I thought it may
>>> have info about the ranking, but it is down at the moment. I'm not sure
>>> if this is a permanent error, so you may like to check there.
>>>
>>> MGPP is a reimplementation of MG which is written in C++ instead of C,
>>> and uses word level indexing instead of document level. I think that the
>>> compression, indexing and ranking algorithms are pretty much the same as
>>> for MG.
>>>
>>> Regards,
>>> Katherine
>>>
>>