If I start using a HiLo generator to assign IDs for a table, and then decide to increase or decrease the capacity (i.e. the maximum 'lo' value), will this cause collisions with the already-assigned IDs?

I'm just wondering if I need to put a big red flag around the number saying 'Don't ever change this!'

Note - not NHibernate specific, I'm just curious about the HiLo algorithm in general.

4 Answers

HiLo algorithms in general map two integers to one integer ID. The scheme guarantees that each pair of numbers is unique per database. Typically, the next step is to guarantee that each unique pair of numbers maps to a unique integer ID.

Your High sequence is now 3. Now what would happen if you suddenly changed your l_size to 10? For your next block, your High is incremented to 4, and you'd get 4*10+1 = 41.

Oops. This new value definitely falls within the "reserved block" of 1-100. Someone with a high sequence of 0 would think, "Well, I have the range 1-100 reserved just for me, so I'll just put down one at 41, because I know it's safe."

There is definitely a very, very high chance of collision when lowering your l_max.
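To make the arithmetic concrete, here is a minimal sketch of the lowering case. It assumes the formula implied by the numbers above (first key of a block = high * l_size + 1, with l_size keys per block); `hilo_block` is a hypothetical helper for illustration, not any library's API, and a real implementation may differ.

```python
def hilo_block(high: int, l_size: int) -> range:
    """Keys reserved by one 'high' value, ASSUMING id = high * l_size + lo, lo in 1..l_size."""
    start = high * l_size + 1
    return range(start, start + l_size)

# Blocks handed out while l_size was 100 (high = 0, 1, 2, 3), i.e. keys 1..400.
old_blocks = [hilo_block(h, 100) for h in range(4)]
already_reserved = set().union(*old_blocks)

# l_size is lowered to 10 and the next high value (4) is handed out.
new_block = hilo_block(4, 10)

# Every key in the new block sits inside a range someone else may still be using.
collisions = sorted(already_reserved.intersection(new_block))
print(collisions)
```

Under these assumptions the whole new block (41 through 50) lands inside the range already reserved by high 0.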

What about the opposite case, raising it?

Back to our example, let's raise our l_size to 500, turning the next key into 4*500+1 = 2001, reserving the range 2001-2500.

It looks like collision will be avoided, in this particular implementation of HiLo, when raising your l_max.
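The raising case can be checked the same way. This is only a sketch under the same assumed formula (first key of a block = high * l_size + 1); `hilo_block` is a hypothetical helper, and your generator may compute keys differently.

```python
def hilo_block(high: int, l_size: int) -> range:
    """Keys reserved by one 'high' value, ASSUMING id = high * l_size + lo, lo in 1..l_size."""
    start = high * l_size + 1
    return range(start, start + l_size)

old_size, new_size = 100, 500

# Highest key that could have been issued before the change (high = 3, l_size = 100).
highest_old_key = max(hilo_block(3, old_size))

# First block issued after the change (high = 4, l_size = 500).
new_block = hilo_block(4, new_size)

overlaps = new_block.start <= highest_old_key
skipped = new_block.start - highest_old_key - 1  # keys that will never be used

print(new_block.start, new_block[-1], overlaps, skipped)
```

With these numbers there is no overlap, but the keys between the old blocks and the new one are simply skipped, which is the "huge jump" to look for when testing a real generator.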

Of course, you should run your own tests to make sure that this is the actual implementation, or close to it. One way would be to set l_max to 100 and find the first few keys, then set it to 500 and find the next. If there is a huge jump like the one described here, you might be safe.

However, I am not by any means suggesting that it is best practice to raise your l_max on an existing database.

Use your own discretion; the HiLo algorithm wasn't designed with a varying l_max in mind, and your results may in the end be unpredictable depending on your exact implementation. Maybe someone who has had experience with raising their l_max and running into trouble can prove this point correct.

So in conclusion, even though, in theory, Hibernate's HiLo implementation will most likely avoid collisions when l_max is raised, it probably still isn't good practice. You should code as if l_max were not going to change over time.

Just from experience I'd say: yes, decreasing will cause collisions. When you have a lower max lo, you get lower numbers, independent of the high value in the database (which is handled the same way, e.g. incremented with each session factory instance in the case of NHibernate).

There is a chance that increasing will not cause collisions. But you either need to try it or ask someone who knows better than I do to be sure.

By allocating ranges from the number space & representing the NEXT directly, rather than complicating the logic with high words or multiplied numbers, you can directly see what keys are going to be generated.

Essentially, "Linear Chunk allocator" uses addition rather than multiplication. If the NEXT is 1000 & we've configured range-size of 20, NEXT will advance to 1020 and we'll hold keys 1000-1019 for allocation.

Range size can be tuned or reconfigured at any time, without loss of integrity. There is a direct relationship between the NEXT field of the allocator, the generated keys & MAX(ID) existing in the table.
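As a rough illustration of the addition-based scheme, here is a minimal in-memory sketch. The class name `LinearChunkAllocator` and its fields are hypothetical; in a real system NEXT would live in a database row and be updated transactionally, which is left out here.

```python
class LinearChunkAllocator:
    """Sketch of a linear chunk allocator; next_value stands in for the persisted NEXT field."""

    def __init__(self, next_value: int, range_size: int) -> None:
        self.next_value = next_value
        self.range_size = range_size

    def allocate(self) -> range:
        # Reserve [NEXT, NEXT + range_size) and advance NEXT by addition only.
        start = self.next_value
        self.next_value += self.range_size
        return range(start, start + self.range_size)


alloc = LinearChunkAllocator(next_value=1000, range_size=20)
first = alloc.allocate()    # keys 1000..1019, NEXT becomes 1020

alloc.range_size = 50       # enlarge the range size mid-stream
second = alloc.allocate()   # keys 1020..1069, NEXT becomes 1070

alloc.range_size = 5        # shrink it again
third = alloc.allocate()    # keys 1070..1074, NEXT becomes 1075

print(first, second, third, alloc.next_value)
```

Because each chunk starts exactly where the previous one ended, changing the range size in either direction cannot produce an overlap in this sketch.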

(By comparison, "Hi-Lo" uses multiplication. If the next is 50 & the multiplier is 20, then you're allocating keys around 1000-1019. There is no direct correlation between NEXT, generated keys & MAX(ID) in the table, it is difficult to adjust NEXT safely, and the multiplier can't be changed without disturbing the current allocation point.)

With "Linear Chunk", you can configure how large each range/chunk is -- a size of 1 is equivalent to a traditional table-based "single allocator" & hits the database to generate each key, a size of 10 is 10x faster as it allocates a range of 10 at once, and a size of 50 or 100 is faster still.

A size of 65536 generates ugly-looking keys, wastes vast numbers of keys on server restart, and is equivalent to Scott Ambler's original HI-LO algorithm.

In short, Hi-Lo is an erroneously complex & flawed approach to what should have been conceptually trivially simple -- allocating ranges along a number line.