directory-dev mailing list archives

On Fri, Jan 20, 2012 at 4:48 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
> Just FYI, I will continue to play around the BTree I've pushed in my branch.
> It's funny, and it helps me to understand the concepts without having to go
> deep into books and existing algorithm. Call it an exercise, this is exactly
> what it is.
>
> As far as I can tell, it also seems that JDBM has a few sub-optimal methods
> : to insert N nodes into a BTree, JDBM do 10% more comparison than I do in
> my implementation. We can probably shave those extra 10% from JDBM code
> later (I tried, but it had some bad side effect when done brutally).
>
> Otherwise, I do think that if your modified JDBM already implements the MVCC
> Btree, then we should use it in trunk.
>
> I'm also wondering if there is any added GC in the modified JDBM code ? Just
> because if we don't have one, the JDBM files will grow very quickly.
there is GC in the added code.
>
>
>
> On 1/20/12 10:07 PM, Selcuk AYA wrote:
>>
>> On Fri, Jan 20, 2012 at 9:47 AM, Emmanuel Lecharny<elecharny@gmail.com>
>> wrote:
>>>
>>> Hi guys,
>>>
>>> I was a bit silent last week and this week, let me update you about what
>>> I
>>> was working on.
>>>
>>> First of all, I have had to deal with some familly issues, which ate half
>>> of
>>> my time.
>>>
>>> Regarding the Txn branch I was working on until last week, I stopped
>>> because
>>> I was not able to fix the code without a serious help from Selcuk. As he
>>> is
>>> busy, I preferred to wait for him to be available again, instead of
>>> bullying
>>> into the code and break it seriously. I believe that there are some
>>> improvement since the moment I started to work on the branch, but it's
>>> not
>>> working fully yet.
>>>
>>> So I switched to something we wanted to do a long time ago : designing a
>>> new
>>> version of JDBM. JDBM is a BTree implementation, with locks to protect
>>> concurrent access. The idea was to implement a MVCC solution on top of a
>>> BTree :
>>> - each search can be done concurrently with any other operation, because
>>> it
>>> asks for a specific existing revision from the btree
>>> - each modification is done on a new revision
>>> - two modifications can't be done at the same time (so modifications are
>>> queued and executed one after the other)
>>
>>
>> Exactly what you are saying here is already implemented in JDBM thanks
>> to the latest changes. How is what you are implementing is different?
>>
>>> The consequence is that searches will be very fast. It comes to a price
>>> though : we keep a track on every revision, until it's not used anymore.
>>> This is done by copying every modified pages when applying some
>>> modification.
>>>
>>> As of today, the addition operation and the find operation is working
>>> just
>>> fine. I conducted some benchmark on additions, and it seems that the
>>> system
>>> is pretty decent.
>>>
>>> A *lot* remains to be done :
>>> - deletion must be implemented
>>> - browsing the tree is not yet implemented
>>> - it's all in memory atm
>>> - we must add some semaphore for concurrent modifications
>>> - a GC must be added to discard unused pages
>>>
>>> But most of all, as it's a in-memory btree atm, I must add the disk
>>> layer.
>>> It will be based on Memory Mapped files.
>>>
>>> Once those preliminary works will be done, the idea is to use this
>>> implementation to replace JDBM. That would make the server consistent,
>>> and
>>> we may then use it without the in-memory txn layer.
>>>
>>> Not to say that this txn layer is useless; using a MVCC btree based
>>> backend
>>> is *not* enough : we have no way to guarantee the atomicity of move
>>> operation across partitions.
>>
>> Txn layer enables us to group multiple atomic atomic changes and
>> consistent search into a single atomic unit. For example, when you
>> change an entry, change to the entry and indices happen as an atomic
>> unit. Do you have code to provide this at your new implementation?
>>
>>> This work has been done in my sandbox, where you can follow the work in
>>> progress :
>>> http://svn.apache.org/viewvc/directory/sandbox/elecharny/shared-mvbt
>>>
>>> At the same time, thanks to Pierre-Arnaud, a first milstone of Studio 2.0
>>> has been released, and it exposed some nasty bugs in the LDAP API. Which
>>> is
>>> actually a good thing : we can fix them !
>>>
>>> So keep tuned, a lot of new things are coming soon !
>>>
>>> --
>>> Regards,
>>> Cordialement,
>>> Emmanuel Lécharny
>>> www.iktek.com
>>>
>
>
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>