#: Marcel Reutegger changed the world a bit at a time by saying on 10/26/2005 9:14 AM :#
> Hi John,
>
> I haven't tried the bdb persistence manager yet.
>
> but it seems that brian is working with it, maybe he can share his
> experience?
>
> regards
> marcel
>
How does db-persistence (i.e. Derby) store binary content? For example, are uploaded files stored
in the DB as blobs, or on the file system, as Berkeley DB does?
thanks,
./alex
--
.w( the_mindstorm )p.
> js@neasys.com wrote:
>> Hi, Marcel,
>>
>> Thanks a lot for your reply. One more question:
>> how does bdb persistence compare with db persistence?
>> Which one will be able to hold more items?
>>
>> John
>>
>> On Tue, Oct 25, 2005 at 09:08:00AM +0200, Marcel Reutegger wrote:
>>
>>>Hi John,
>>>
>>>js@neasys.com wrote:
>>>
>>>>I have tried jcr/jackrabbit and like it.
>>>>Next I would like to push jackrabbit to its limit:
>>>>load in as many items as possible. I would appreciate help on
>>>>a few configuration/tuning issues:
>>>>(1) which persistence manager to use?
>>>
>>>in a recent test I imported over a million wikipedia articles which
>>>resulted in about 6 million items. no versioning, btw.
>>>
>>>my configuration is:
>>>dell latitude d505
>>>db-persistence using derby
>>>256m heap
>>>
>>>at the beginning the time to add an article was about 5ms.
>>>towards the end of the load the time to add an article was stable at
>>>about 50ms.
>>>
>>>some other figures:
>>>db size: 2 GB
>>>index size: 300 MB
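
To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes the rounded numbers from the mail (1 million articles, 6 million items, 5 ms to 50 ms per article) and decimal units for GB and MB; the class and method names are made up for illustration only.

```java
// Rough estimates derived from the rounded figures quoted in the mail.
public class LoadEstimate {
    /** Total load time in hours if every article takes msPerArticle milliseconds. */
    static double hours(long articles, double msPerArticle) {
        return articles * msPerArticle / 1000.0 / 3600.0;
    }

    /** Average number of bytes stored per item. */
    static double bytesPerItem(long totalBytes, long items) {
        return (double) totalBytes / items;
    }

    public static void main(String[] args) {
        long articles = 1_000_000L;      // "over a million", rounded down
        long items = 6_000_000L;         // "about 6 million items"
        long dbBytes = 2_000_000_000L;   // 2 GB, decimal units (assumption)
        long indexBytes = 300_000_000L;  // 300 MB, decimal units (assumption)

        // Lower bound uses the initial 5 ms, upper bound the final 50 ms per article.
        System.out.printf("load time: %.1f h to %.1f h%n",
                hours(articles, 5), hours(articles, 50));
        System.out.printf("db bytes per item: %.0f%n", bytesPerItem(dbBytes, items));
        System.out.printf("index bytes per item: %.0f%n", bytesPerItem(indexBytes, items));
    }
}
```

With these assumptions the whole load falls somewhere between about 1.4 and 13.9 hours, at roughly 333 bytes of database and 50 bytes of index per item.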
>>>
>>>
>>>>(2) what parameters to tune?
>>>
>>>I can give you some advice on configuring the index: the default config
>>>will cause lucene to create segments of 100 nodes, which will be merged
>>>as soon as 10 segments exist. when doing a bulk load you should set
>>>the parameter minMergeDocs to a higher value, e.g. 1000. this will create
>>>segments of 1000 nodes, and will be more efficient.
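
To see why a larger minMergeDocs helps, here is a simplified model of the merge behaviour described above. It is only a sketch, not Lucene's or Jackrabbit's actual code: it assumes every flushed segment holds exactly minMergeDocs nodes, that 10 segments of the same size are merged into one (the "10 segments" from the mail, called mergeFactor here), and it ignores leftover segments that never fill a group. The class and method names are hypothetical.

```java
// Simplified model of level-based segment merging; not the real Lucene algorithm.
public class MergeSketch {
    /** Number of segments flushed when each one holds minMergeDocs nodes. */
    static long flushedSegments(long nodes, int minMergeDocs) {
        if (minMergeDocs < 1) {
            throw new IllegalArgumentException("minMergeDocs must be >= 1");
        }
        return (nodes + minMergeDocs - 1) / minMergeDocs; // ceiling division
    }

    /** Number of merges if every mergeFactor segments of one level become one segment of the next. */
    static long merges(long nodes, int minMergeDocs, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        long segments = flushedSegments(nodes, minMergeDocs);
        long merges = 0;
        while (segments >= mergeFactor) {
            segments /= mergeFactor; // full groups merged at this level; leftovers ignored
            merges += segments;
        }
        return merges;
    }

    public static void main(String[] args) {
        long nodes = 6_000_000L; // about the size of the wikipedia import
        int mergeFactor = 10;    // "merged as soon as 10 segments exist"
        for (int minMergeDocs : new int[] {100, 1000}) {
            System.out.println("minMergeDocs=" + minMergeDocs
                    + " flushed=" + flushedSegments(nodes, minMergeDocs)
                    + " merges=" + merges(nodes, minMergeDocs, mergeFactor));
        }
    }
}
```

Under this model, 6 million nodes produce 60,000 flushed segments and 6,666 merges with minMergeDocs=100, but only 6,000 segments and 666 merges with minMergeDocs=1000. The price is that more nodes are buffered before each flush.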
>>>
>>>
>>>>(3) will multiple workspaces help?
>>>
>>>IMO this might help, if you run into scalability issues with the
>>>persistence manager you are using.
>>>
>>>
>>>>(4) any other things to watch for?
>>>
>>>use separate disks for the index and workspace data.
>>>
>>>
>>>>My host has 4GB ram and a few TB diskspace.
>>>>
>>>>Also, is there any doc describing all possible elements in repository.xml?
>>>
>>>the sample repository.xml file in src/conf contains an inline dtd that
>>>contains some documentation.
>>>
>>>
>>>>And can SearchIndex be turned off?
>>>
>>>yes, this is possible. you simply omit the SearchIndex element in the
>>>configuration. though, I would be very interested to see how well the
>>>index works with your data.
>>>
>>>regards
>>> marcel
>>>
>>>
>>
>>
>>
>