lucene-java-user mailing list archives

Hello Mike,
On 27.07.2010 14:38, Michael McCandless wrote:
> On Tue, Jul 27, 2010 at 7:58 AM, Alexander vom Berg<mail@avomberg.de> wrote:
>
>> Hello Mike,
>>
>> thanks for your answer!
>> I am currently working with Lucene 3.0.1, and all of the file format
>> descriptions are comprehensible except the one for the .tii file.
>> The idea behind the tii/tis file structure is to retrieve the correct
>> terms faster.
>> First I look up the term in memory (tii file) and take the closest
>> entry. With this information I can skip to the correct position in the
>> tis file and scan forward to my final hit. I don't exactly understand how
>> this skipping is realized.
>> Do I have a direct pointer to the position on the hard drive? Or how do I
>> find the term without too much file access? :D
>>
> Yes, you have to seek the tis file handle, then you do .next() until
> the term matches. Maybe you stop there, eg if you're just looking for
> say the docFreq of that term. Or, if you then need to iterate the
> docs/positions, from that term entry you have the long file pointers
> of frq and prx files, which you must seek to and decode.
>
> Btw, what is it that you are doing? You seem to be re-inventing
> Lucene :) You could simply use Lucene's low level APIs to do this...
>
>
This was meant more as a question of whether my assumptions about how
Lucene works are correct. :) Sorry for being unclear.
I don't want to implement it myself!
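
To make the lookup discussed above concrete, here is a minimal sketch in
plain Java. It is not Lucene code: the class TermDictSketch, its fields, and
the index interval of 4 are made up for illustration (Lucene's default
termIndexInterval is 128), and arrays stand in for the in-memory tii data and
the on-disk tis file. It only shows the idea of a binary search over the
sparse in-memory index, followed by a seek and a short forward scan.

```java
import java.util.Arrays;

/** Toy model of a two-level term dictionary; an illustration, not Lucene code. */
class TermDictSketch {
    // Hypothetical small interval to keep the example readable.
    static final int INDEX_INTERVAL = 4;

    final String[] tis;      // stands in for the .tis file: all terms, sorted
    final String[] tiiTerms; // stands in for the .tii data: every INDEX_INTERVAL-th term
    final int[] tiiOffsets;  // position of each indexed term in tis (a file pointer in a real index)

    TermDictSketch(String[] sortedTerms) {
        tis = sortedTerms;
        int n = (sortedTerms.length + INDEX_INTERVAL - 1) / INDEX_INTERVAL;
        tiiTerms = new String[n];
        tiiOffsets = new int[n];
        for (int i = 0; i < n; i++) {
            tiiOffsets[i] = i * INDEX_INTERVAL;
            tiiTerms[i] = sortedTerms[tiiOffsets[i]];
        }
    }

    /** Returns the position of the term in tis, or -1 if it is absent. */
    int lookup(String term) {
        // Step 1: in-memory binary search for the last indexed term <= target.
        int idx = Arrays.binarySearch(tiiTerms, term);
        if (idx < 0) {
            idx = -idx - 2; // insertion point minus one
        }
        if (idx < 0) {
            return -1; // target sorts before the first term
        }
        // Step 2: "seek" to the recorded offset, then scan forward term by term.
        int end = Math.min(tis.length, tiiOffsets[idx] + INDEX_INTERVAL);
        for (int pos = tiiOffsets[idx]; pos < end; pos++) {
            int cmp = tis[pos].compareTo(term);
            if (cmp == 0) {
                return pos; // a real entry would also carry docFreq and the frq/prx pointers
            }
            if (cmp > 0) {
                break; // scanned past the place where the term would be
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        TermDictSketch dict = new TermDictSketch(new String[] {
            "apple", "banana", "cherry", "date", "elder", "fig", "grape", "kiwi", "lemon"});
        System.out.println("grape found at position " + dict.lookup("grape"));
    }
}
```

In this model at most one seek and INDEX_INTERVAL comparisons are needed per
lookup, which is how I understand the trade-off between the size of the
in-memory index and the length of the scan.
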
>> My intention behind this is that I want to run some performance tests on
>> an existing index with different hard drive block sizes.
>> Can I just copy the created index to another drive (with a different
>> block size), or do I have to generate the whole index again?
>>
> Ahhh.
>
> You mean the block size of the underlying filesystem? If so, then
> copying will be fine in that the resulting index will function
> correctly.
>
> However, this may not be a fair performance test since with 'cp'
> presumably the IO system may have optimized how the files are
> allocated to blocks on disk. Ie, you'll get a different allocation
> than had Lucene directly opened these files and written them itself on
> the 2nd file system. You could test both approaches and see if
> there's a difference!
>
>
Do you mean problems with fragmentation here? Or what exactly is the
difference after I copy the index (is it faster because it's defragmented)?
What happens if I use the copy method from
org.apache.lucene.store.Directory?
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
Best regards
Alex