Here's something interesting I noticed while writing serialization code.
If you're in a situation where you have to repeatedly append small- to
medium-sized chunks of string to a buffer, and the computation of those
chunks is relatively cheap, it might be faster (and use less memory) to do
it in two passes: first determining the size of the result string, then
allocating it completely and filling it in.
I noticed this while trying to figure out why my serialization code for
YAWR was so atrociously slow. I'm passing the ser method a void
delegate(char[]) that it's supposed to call with the strings it
serializes, in order. So normally, serialization would look like this:
char[] buffer; ser(data, (char[] f) { buffer~=f; });
When I figured out that this was the primary bottleneck causing the delays
while saving, I replaced it with the following code:
size_t len=0;
ser(data, (char[] f) { len+=f.length; });
auto buffer=new char[len]; len=0;
ser(data, (char[] f) { buffer[len..len+f.length]=f; len+=f.length; });
To my surprise, this more than doubled the speed of that code. So if you
have some block of code that does lots of repeated string concats, you
might want to give this a try.
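Here's the whole thing as a self-contained demo in case you want to try
it, with a dummy ser standing in for the real serializer:

import std.stdio;

// Stand-in for the real ser(): hands each chunk to the sink, in order.
void ser(char[][] data, void delegate(char[]) sink)
{
    foreach (chunk; data)
        sink(chunk);
}

void main()
{
    char[][] data = ["foo".dup, "bar".dup, "baz".dup];

    // Pass 1: just measure.
    size_t len = 0;
    ser(data, (char[] f) { len += f.length; });

    // Pass 2: allocate once, then fill.
    auto buffer = new char[len];
    len = 0;
    ser(data, (char[] f) { buffer[len..len+f.length] = f; len += f.length; });

    writefln(buffer); // foobarbaz
}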
--downs

For cases like this, it would be nice to have a VirtualArray type:

class VArray(T)
{
    void opSliceAssign(T[] src, size_t from, size_t to);
    void opSliceAssign(T[] src);
    T[] opSlice(size_t from, size_t to);
    T[] opSlice();
    void offset(size_t by);
    void offsetByLast();
}

This would act as an array that is as big as it needs to be, and where the
beginning can be hidden.
Then this would work:

auto a = new VArray!(char);
foreach (char[] s; arrayOfStrings)
{
    a[] = s;          // copy in, starting at the current "0"
    a.offsetByLast(); // hide what was just copied
}
return a[];
Internally, this would do copies and reallocations, while trying to keep
the number of the latter down.
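Roughly, the inside might look like this - just a sketch, with a guessed
growth policy and one possible meaning for a[]:

class VArray(T)
{
    private T[] data;     // backing store, grown geometrically
    private size_t start; // the hidden beginning: index "0" maps here
    private size_t last;  // size of the most recent copy-in

    // a[] = src: copy src in at the current "0"
    void opSliceAssign(T[] src)
    {
        if (start + src.length > data.length)
            data.length = (start + src.length) * 2; // amortize reallocations
        data[start .. start + src.length] = src;
        last = src.length;
    }

    void offset(size_t by) { start += by; }
    void offsetByLast() { start += last; } // hide what was just copied in

    T[] opSlice() { return data[0 .. start]; } // everything hidden so far
}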

You can do this as well:
void reserve(T)(ref T[] array, size_t amount)
{
    array.length += amount; // force the allocation up front
    array.length = 0;       // keep the memory, reset the length
}
That way you can use char[] like normal.
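For example - this relies on the runtime reusing the allocation when you
append after the length reset, which the current GC does:

char[] buffer;
reserve(buffer, 4096); // one up-front allocation
ser(data, (char[] f) { buffer ~= f; }); // appends now fill the reserved block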

Yes, I use this pattern commonly, whenever it's possible.
I'm going to write data structures similar to cords [1] or ropes [2] in D.
Concatenation in them is very fast, because only a pointer is copied, not
the content.
Other important features:
* easy (and fast, and cheap) undo - you can have multiple cords which
share most of their data, even if you change something in one of them
* easy (and fast, O(1)) append and prepend
* easy (and fast, O(1)) substring slicing
* strings are computed only once
* reduced number of memory allocations
* easy sharing of similar strings
It will be simple to use:
auto r = new cord!(char)(); // or new cord!(char)("aaa");
r ~= "xxx";
r ~= "yyy";
r ~= "zzz";
auto r2 = r.toString(); // will first precompute the length, and then
                        // copy the content of the buffers
foreach (c; r) {
    write(c);
}
writefln("r[3]=%s", r[3]);
foreach (block; r.fastiterator) {
    write(block);
}
auto r3 = r[1..3];  // O(1)
r3.rebalance();     // for faster access
r3[2] = 'c';        // will not change r! O(log n) in 2-3 trees,
                    // O(c*n) in lists, but with a very small c (~0.01)
auto r4 = r ~ r3;   // O(1)
auto r5 = r.remove(4, 9); // O(1)
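The core of the O(1) concatenation, as a rough sketch - just the
concatenation and flattening parts, leaving out opCatAssign, balancing
and so on:

class cord(T)
{
    private T[] leaf;         // set for leaf nodes
    private cord left, right; // set for concatenation nodes
    size_t length;

    this(T[] s) { leaf = s; length = s.length; }
    this(cord a, cord b) { left = a; right = b; length = a.length + b.length; }

    // r ~ r3: allocate one node pointing at both halves, no copying
    cord opCat(cord rhs) { return new cord(this, rhs); }

    // flattening is where the two passes happen implicitly: the length
    // is already known, so a single allocation is enough
    T[] flatten()
    {
        auto buf = new T[length];
        size_t pos = 0;
        void walk(cord c)
        {
            if (c.left is null) {
                buf[pos .. pos + c.leaf.length] = c.leaf;
                pos += c.leaf.length;
            } else {
                walk(c.left);
                walk(c.right);
            }
        }
        walk(this);
        return buf;
    }
}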
Common usages:
* text editors
* servers: appending on receiving, and appending when creating a response
* creating big arrays step by step without knowing their size
I will first implement simple list-based buffers (with small overheads),
and then a second version, probably based on 2-3 trees (for very large
arrays, > 10MB).
I really need these data structures, so it probably won't take long to do
all of this. :)
[1] http://www.sgi.com/tech/stl/Rope.html
[2] http://www.hpl.hp.com/personal/Hans_Boehm/gc/gc_source/cordh.txt
--
Witold Baryluk

Wow, that looks amazing. I personally could benefit a lot from a library
such as this! I do tons and tons of string concatenation at the moment,
and hand-optimizing every single time is annoying.
What can I do to help?
kenny

How do these cords perform in a multi-threaded scenario? At a glance, I
would guess they suffer from much the same problems as COW.

If I had to guess... it looks like it never modifies data, just makes
references to it: copy the reference, replace the reference on write. As
long as you can get the memory back, that should be fast (and lock-free
to boot).
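In sketch form - assuming Witold's cord type, and with the publish step
needing an atomic store or CAS in real code:

cord!(char) current; // the shared, published version

// readers just grab the reference; the nodes behind it never change
cord!(char) snapshot() { return current; }

void update(cord!(char) delegate(cord!(char)) change)
{
    auto next = change(current); // builds new nodes, never mutates old ones
    current = next;              // replace the reference (atomically, really)
}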
