I was doing some experiments comparing MinTL container performance against STL performance, and MinTL was getting stomped because of the GC. It seemed to be allocating twice as much memory as the STL runs. The allocations I was making were for arrays with lengths a power of two. I looked at gc.d and noticed all the allocations were being made with +1 added to the lengths, which means my nice power of two was being bumped into the next larger bucket and hence using twice the space of the C++ code. Can we get rid of those +1s? I recompiled Phobos without the +1 and it all seemed to work OK. I think it will be common to use power-of-two sizes for objects, assuming those will fit nicely together, when in fact they fit the worst.
-Ben

> I was doing some experiments with comparing the MinTL container performance
> against the STL performance and MinTL was getting stomped because of the GC.
> It seemed to be allocating twice as much memory as the STL runs. The
> allocations I was making was for arrays with lengths a power of two. I
> looked at gc.d and noticed all the allocations were being made with +1 added
> to the lengths, which means my nice power of two was being incremented to
> the next larger bucket and hence using twice the space of the C++ code.
Whew. Thanks for that information! I have to adjust that in my containers as well (by the way, would you mind testing them too? *sweet question*). I think the +1 comes from the string handling: the GC puts a 0 after arrays to help you cope with the native C API...
Ciao
uwe

>> I was doing some experiments with comparing the MinTL container performance
>> against the STL performance and MinTL was getting stomped because of the GC.
>> It seemed to be allocating twice as much memory as the STL runs. The
>> allocations I was making was for arrays with lengths a power of two. I
>> looked at gc.d and noticed all the allocations were being made with +1 added
>> to the lengths, which means my nice power of two was being incremented to
>> the next larger bucket and hence using twice the space of the C++ code.
>> Whew. Thanks for that information! I have to adjust that in my containers as well.
By the way, that improves vector's allocation performance by 15%. Quite a change.
Thanks!
uwe

"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.sq0zk4il6yjbe6@sandmann.maerchenwald.net...
>>> I was doing some experiments with comparing the MinTL container performance
>>> against the STL performance and MinTL was getting stomped because of the GC.
>>> It seemed to be allocating twice as much memory as the STL runs. The
>>> allocations I was making was for arrays with lengths a power of two. I
>>> looked at gc.d and noticed all the allocations were being made with +1 added
>>> to the lengths, which means my nice power of two was being incremented to
>>> the next larger bucket and hence using twice the space of the C++ code.
> Whew. Thanks for that information! I have to adjust that in my containers as well.
How would you adjust your code? I don't see what a container can do.
> By the way, that improves vector's allocating performance by 15%. Quite a change.
For my power-of-two tests it was an order of magnitude. What kinds of tests are you using?

>>> Whew. Thanks for that information! I have to adjust that in my
>>> containers as well.
>> How would you adjust your code? I don't see what a container can do.
The container always allocates more memory than needed. If you simply append() elements, then as soon as it runs out of space it allocates the next-but-one power of two (the power of two after the next one) bytes; later it allocates the next power of two that is larger than 1.5 * the requested size. Thus it often allocates way too much, as you said.
Of course the container can't prevent you from adding exactly enough elements that their total size in bytes is a power of two. This is especially a problem in reserve(). But if you simply don't care and just append() what you have, the problem will not occur.
>> By the way, that improves vector's allocating performance by 15%. Quite a change.
>> For my power-of-two tests it was an order of magnitude.
What do you mean by "order of magnitude"? The dictionary spits out several translations... ;)
> What kinds of tests are you using?
// This tests how fast writing items and
// dynamically allocating space as needed is.
for (int i = 0; i < anz; ++i)
    array.append(i);

// This tests only the writing speed.
array.reserve(anz);
for (int i = 0; i < anz; ++i)
    array.append(i);
I do these tests both for my arrays and the D arrays, and compare. Currently they are equally fast. Note that the first test has this equivalent in D:
array.length = 16;
for (int i = 0; i < anz; ++i)
{
    if (i >= array.length)
        array.length = array.length * 2;
    array[i] = i;
}
Ciao
uwe

>>>> I was doing some experiments with comparing the MinTL container
>>>> performance
>>>> against the STL performance and MinTL was getting stomped because of
>>>> the GC.
>>>> It seemed to be allocating twice as much memory as the STL runs. The
>>>> allocations I was making was for arrays with lengths a power of two. I
>>>> looked at gc.d and noticed all the allocations were being made with +1
>>>> added
>>>> to the lengths, which means my nice power of two was being incremented
>>>> to
>>>> the next larger bucket and hence using twice the space of the C++ code.
What I forgot to ask: does the GC always add 1 to the length of the dynamic array? Even for large elements?

struct Verylarge
{
    dchar[50] x;
}

Verylarge[] dynArray;
dynArray.length = 10;

This would waste 200 bytes then... with no benefit, I think. The GC should only add the 1 for char, wchar and dchar arrays; I cannot see what use it is in other cases. Well, we have to ask Walter then.
ciao
uwe

"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.sq03yu2e6yjbe6@sandmann.maerchenwald.net...
>>>> Whew. Thanks for that information! I have to adjust that in my containers as well.
>>>> How would you adjust your code? I don't see what a container can do.
>> The container always allocates more memory than needed. If you simply append() elements, then as soon as it runs out of space it allocates the next-but-one power of two (the power of two after the next one) bytes; later it allocates the next power of two that is larger than 1.5 * the requested size. Thus it often allocates way too much, as you said.
>> Of course the container can't prevent you from adding exactly enough elements that their total size in bytes is a power of two. This is especially a problem in reserve(). But if you simply don't care and just append() what you have, the problem will not occur.
>>>> By the way, that improves vector's allocating performance by 15%. Quite a change.
>>>> For my power-of-two tests it was an order of magnitude.
>> What do you mean by "order of magnitude"? The dictionary spits out several translations... ;)
Order of magnitude means a factor of 10, or of 2, depending on context. My use above means a factor of 2 (or more, actually). The first time I ran the test it started swapping and I had to kill it; only when I removed the +1 did it fit without swapping, and it ran in approximately the same space as the STL version.
> I do these tests both for my arrays and the D arrays, and compare. Currently they are equally fast. Note that the first test has this equivalent in D:
>
> array.length = 16;
> for (int i = 0; i < anz; ++i)
> {
>     if (i >= array.length)
>         array.length = array.length * 2;
>     array[i] = i;
> }
This is the kind of resizing (powers of two) that you want to avoid, since it results in half the space being wasted even when you fill up the array entirely. For example, say i == 16. Then the line array.length = array.length * 2; will actually request 2*16+1 = 33 bytes, which gets rounded up to 64 even though you will only ever use 32. Once i == 32, the resize will request 2*32+1 = 65 bytes, which doesn't fit in 64, so it will again reallocate and ask for 128. So you are always wasting half the space when you resize by powers of two. It becomes, ironically, the most inefficient way to geometrically grow the array.

> It becomes, ironically, the most
> inefficient way to geometrically grow the array.
This really needs to be changed. You are perfectly right: rounding to powers of two is such a common approach that it should work as efficiently as expected.
Ciao
uwe

"Uwe Salomon" <post@uwesalomon.de> wrote in message news:op.sq06dwv76yjbe6@sandmann.maerchenwald.net...
>>>>> I was doing some experiments with comparing the MinTL container performance
>>>>> against the STL performance and MinTL was getting stomped because of the GC.
>>>>> It seemed to be allocating twice as much memory as the STL runs. The
>>>>> allocations I was making was for arrays with lengths a power of two. I
>>>>> looked at gc.d and noticed all the allocations were being made with +1 added
>>>>> to the lengths, which means my nice power of two was being incremented to
>>>>> the next larger bucket and hence using twice the space of the C++ code.
>> What I forgot to ask: does the GC always add 1 to the length of the dynamic array? Even for large elements?
>>
>> struct Verylarge
>> {
>>     dchar[50] x;
>> }
>>
>> Verylarge[] dynArray;
>> dynArray.length = 10;
>>
>> This would waste 200 bytes then... with no benefit, I think. The GC should only add the 1 for char, wchar and dchar arrays; I cannot see what use it is in other cases. Well, we have to ask Walter then.
>>
>> Ciao
>> uwe
It adds 1 byte to every request.

> Can
> we get rid of those +1? I recompiled phobos without the +1 and it all
> seemed to work ok. I think it will be common to use power-of-two sizes for
> objects assuming those will fit nicely together when in fact those fit the
> worst together.
Note that the +1 seems to have appeared in dmd.122, so it is fairly recent.