Should we call the constructor for blah 100 times or 200 times in the second instance? What do we do with the content of the first 100 instances of *a - allocate new ones, copy the existing ones or something else?

If you have a need to "resize" your data blocks, why not allocate them in "chunks" so that you get some spare size left over, and then fill that out until you get to the "size" and then do your own "renew" by allocating a new array (or whatever it is you're dealing with) and then copying the old content into the new one.

As for how to pick a "slightly larger size", perhaps picking the next (1 << n) that is bigger than the requested size.
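A minimal sketch of the "next (1 << n)" idea, in case it helps. The name `next_pow2_above` is made up for illustration, and the helper assumes the result still fits in a `std::size_t`:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper: smallest power of two that is strictly bigger
// than the requested size.
std::size_t next_pow2_above(std::size_t requested) {
    std::size_t size = 1;
    while (size <= requested) {
        // Assumption: the caller never asks for a size so large that the
        // next power of two would overflow std::size_t.
        assert(size <= std::numeric_limits<std::size_t>::max() / 2);
        size <<= 1;
    }
    return size;
}
```

Asking for 100 items this way would give you room for 128, so the next 28 additions need no reallocation at all.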

Using a fixed addition works well if you often only add a little bit, but if you don't know what size you're going to grow the block of memory to, then you may find that a "double it each time" is a better way [2].

[1] It's certainly possible to implement "new" by calling "malloc", but not the other way around: the standard library's malloc is not allowed to allocate by calling operator new.

[2] A colleague of mine worked on a project where someone used the method of "adding a fixed size more" (I think 4K) each time the buffer was too small. It started out at 4K and grew to 16MB in his case. Growing that at 4K at a time is PAINFUL (lots of memory copying). By using the "double the size when it's not large enough", he could reduce the runtime of that particular app by a factor of 10 or so.
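To put rough numbers on footnote [2], here is a small back-of-the-envelope model. It assumes the worst case, where every growth step copies the whole old buffer (a real realloc can sometimes grow in place), and the names `GrowthCost`, `fixed_step_cost` and `doubling_cost` are invented for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping type for this sketch.
struct GrowthCost {
    std::uint64_t reallocations;
    std::uint64_t bytes_copied;
};

// Grow from `start` to at least `target` bytes by adding `step` each time.
// Worst-case assumption: each growth copies the entire old buffer.
GrowthCost fixed_step_cost(std::uint64_t start, std::uint64_t target,
                           std::uint64_t step) {
    assert(step > 0);
    GrowthCost cost{0, 0};
    for (std::uint64_t cap = start; cap < target; cap += step) {
        cost.bytes_copied += cap;
        ++cost.reallocations;
    }
    return cost;
}

// Same model, but the capacity doubles at each growth step.
GrowthCost doubling_cost(std::uint64_t start, std::uint64_t target) {
    assert(start > 0);
    GrowthCost cost{0, 0};
    for (std::uint64_t cap = start; cap < target; cap *= 2) {
        cost.bytes_copied += cap;
        ++cost.reallocations;
    }
    return cost;
}
```

With 4K steps from 4K to 16MB (taken as 4096 and 16777216 bytes), this model gives 4095 reallocations and just under 32 GiB copied, against 12 reallocations and just under 16 MiB copied for doubling. The model only counts copying, so it says nothing about the factor of 10 measured in the real application.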

If that is exactly what you're doing in your code (or at least something that is just a plain array of some data with no "class" or "complex content"), then you could actually just use malloc/realloc, unless of course the place where you're using this code FORBIDS use of classic C functionality. Just remember NOT to use "delete" when you de-allocate the memory; use free instead.

As I explained in the previous post, "renew" isn't quite so easy to implement for the more complex scenarios. For trivial "stuff", you could implement something like this:
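The snippet that belonged here is not reproduced above, so what follows is not the original code, only a sketch of what a trivial "renew" might look like. It assumes the elements are trivial types, so realloc may move the bytes around freely, and the name and signature of `renew` are made up:

```cpp
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <type_traits>

// Hypothetical "renew" for trivial element types only. The block must have
// come from malloc/realloc (or be null) and must be released with free,
// never with delete.
template <typename T>
T* renew(T* old_block, std::size_t new_count) {
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_default_constructible<T>::value,
                  "renew only supports trivial element types");
    if (new_count == 0) {
        std::free(old_block);
        return nullptr;
    }
    if (new_count > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
        throw std::bad_alloc();
    }
    // realloc grows in place when it can, otherwise it allocates a new
    // block, copies the old bytes and frees the old block.
    void* grown = std::realloc(old_block, new_count * sizeof(T));
    if (grown == nullptr) {
        throw std::bad_alloc();  // old_block is still valid at this point
    }
    return static_cast<T*>(grown);
}
```

Something like `int* a = renew<int>(nullptr, 100); a = renew(a, 200);` would then keep the first 100 values, with the new 100 left uninitialized. No constructors run, which is exactly why this only works for trivial "stuff".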

Should we call the constructor for blah 100 times or 200 times in the second instance? What do we do with the content of the first 100 instances of *a - allocate new ones, copy the existing ones or something else?

This is exactly why renew would be useful. Renew would only reconstruct the objects that it needs to. This would be like vector, except that resizing the vector beyond capacity does not necessarily require the vector to copy elements if the operating system can spare the extra space after the internal dynamic array. It leaves the decision of whether to copy the existing elements over or not up to the operating system, which knows more about the available memory than the program.

If you have a need to "resize" your data blocks, why not allocate them in "chunks" so that you get some spare size left over, and then fill that out until you get to the "size" and then do your own "renew" by allocating a new array (or whatever it is you're dealing with) and then copying the old content into the new one.

Because sometimes you need sequential memory. Vectors are sometimes preferred over deques. Vector and vector-like objects would benefit from renew.

As for how to pick a "slightly larger size", perhaps picking the next (1 << n) that is bigger than the requested size.

Using a fixed addition works well if you often only add a little bit, but if you don't know what size you're going to grow the block of memory to, then you may find that a "double it each time" is a better way [2].

That's the problem. Increasing the capacity of the vector by a little bit should not require copying every element over most of the time.

This is exactly why renew would be useful. Renew would only reconstruct the objects that it needs to. This would be like vector, except that resizing the vector beyond capacity does not necessarily require the vector to copy elements if the operating system can spare the extra space after the internal dynamic array. It leaves the decision of whether to copy the existing elements over or not up to the operating system, which knows more about the available memory than the program.

Most operating systems don't do realloc natively - it's done in the C-library, and it's done by looking at the internal data in the heap, to see if there's enough space to expand it [e.g. we asked for a block of 4K, but the heap contained one that is bigger, so the "rest" of that block is free - realloc can just change the size of the block. If it's unable to expand "in situ", it does a fresh malloc and copies the data anyways].

Because sometimes you need sequential memory. Vectors are sometimes preferred over deques. Vector and vector-like objects would benefit from renew.

That's the problem. Increasing the capacity of the vector by a little bit should not require copying every element over most of the time.

It's just a case of "where do you solve the problem" as well as "how do you know what needs to be done". I'm not clued up enough about how vector is implemented, but I'm pretty sure it doesn't specifically say that you couldn't allocate "spare area" whenever it needs to grow, and thus only actually grow it when it runs out of spare area - selecting increasingly larger spare area when it's growing larger. Not saying this is how vector is implemented at the moment, but that it COULD BE done. And it certainly can be done without using "renew".

This is exactly why renew would be useful. Renew would only reconstruct the objects that it needs to. This would be like vector, except that resizing the vector beyond capacity does not necessarily require the vector to copy elements if the operating system can spare the extra space after the internal dynamic array. It leaves the decision of whether to copy the existing elements over or not up to the operating system, which knows more about the available memory than the program.

What I am curious about is whether it is possible to write an allocator for vector that tries to grow the vector without copying existing elements to a newly allocated block unless absolutely necessary, keeping in mind that the vector's elements are expected to be contiguous. It does not sound possible to me, but then I am a beginner at allocators and sophisticated memory management in general.

I'm pretty sure it doesn't specifically say that you couldn't allocate "spare area" whenever it needs to grow, and thus only actually grow it when it runs out of spare area - selecting increasingly larger spare area when it's growing larger.

That is precisely what vector is expected to do, hence the member functions capacity(), reserve(), resize() and size(). I believe some existing implementations do double the capacity when it is maxed out.
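To see the spare capacity at work, a small sketch like the following can count how often a vector actually reallocates while it is being filled. The function name `count_reallocations` is invented here, and the exact growth pattern without a reserve() call depends on the library implementation:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: push n ints into a vector that has reserved
// `reserve_first` slots up front, and count how many times the capacity
// changes (each change means a reallocation).
std::size_t count_reallocations(std::size_t n, std::size_t reserve_first) {
    std::vector<int> v;
    v.reserve(reserve_first);
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Reserving 1000 slots before pushing 1000 elements gives zero reallocations. Without the reserve, the count is small compared with the number of elements, because the capacity grows by a proportion each time rather than by one.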

Most operating systems don't do realloc natively - it's done in the C-library, and it's done by looking at the internal data in the heap, to see if there's enough space to expand it [e.g. we asked for a block of 4K, but the heap contained one that is bigger, so the "rest" of that block is free - realloc can just change the size of the block. If it's unable to expand "in situ", it does a fresh malloc and copies the data anyways].

Right, that's what realloc does. A renew function/operation would do the same thing but with constructors.

The OS may allocate 4K to the program. But the C++ code cannot directly access that 4K; it relies on calls to malloc, calloc, new, etc. Thus if you have a size 100 dynamic array, which uses objects with nontrivial constructors, and you want to increase the size by 1, there is currently no way to do it without copying the entire array over and deleting the old array. This is inefficient.

It's just a case of "where do you solve the problem" as well as "how do you know what needs to be done". I'm not clued up enough about how vector is implemented, but I'm pretty sure it doesn't specifically say that you couldn't allocate "spare area" whenever it needs to grow, and thus only actually grow it when it runs out of spare area - selecting increasingly larger spare area when it's growing larger. Not saying this is how vector is implemented at the moment, but that it COULD BE done. And it certainly can be done without using "renew".

Vector itself can be implemented to work like realloc because there is no requirement that vector is written in C++ (as long as it behaves like it's written in C++). More to the point is how to write a more specialized version of vector that uses realloc-like functionality in C++ for our own purposes.

What I am curious about is whether it is possible to write an allocator for vector that tries to grow the vector without copying existing elements to a newly allocated block unless absolutely necessary, keeping in mind that the vector's elements are expected to be contiguous. It does not sound possible to me, but then I am a beginner at allocators and sophisticated memory management in general.

It's not possible. There were proposals to make it possible, but I don't know about their status.

That is precisely what vector is expected to do, hence the member functions capacity(), reserve(), resize() and size(). I believe some existing implementations do double the capacity when it is maxed out.

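In fact, growing the capacity by a proportion (doubling is a common choice) is effectively required behaviour for push_back if the vector is full, because push_back has to run in amortized constant time, and no implementation I know does anything different for other insertion functions.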

Using a fixed addition works well if you often only add a little bit, but if you don't know what size you're going to grow the block of memory to, then you may find that a "double it each time" is a better way [2].

It is not good. Example:

You need to get some numbers from input. Initially you make space for 100 items. But there are 122 items! If you double the allocation, you will have room for 200 items. 200 - 122 = 78, so 78 units of memory are wasted.

...then do your own "renew" by allocating a new array (or whatever it is you're dealing with) and then copying the old content into the new one.

Of course I can do my own renew if I could write the realloc function by myself!

renew should first look at the memory allocated by the last new. If there is more room at its tail, then that space will be allocated, and in this case the pointer does not change. Otherwise it should find a contiguous free space of the size we want, copy the contents of our array to this new location, delete the old array, and return the address of the new location.

There's no guarantee that the "behind the scenes" allocation isn't larger than what you need anyways. It's obviously not necessary to DOUBLE the allocation either - adding 50% or some other proportion would work too. If you grow by 25% at "time to grow", you'd end up with 125 items on the growth, and only waste 3 items of memory if you have 122 items actually in use - but then you get 4 times as many copies if you grow many times.
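The 100-to-122 example can be played through with a small model. It assumes the worst case where every growth copies the whole old block, and the names `GrowthResult` and `grow_until_fits` are invented for this sketch:

```cpp
#include <cstddef>

// Hypothetical result type for this sketch.
struct GrowthResult {
    std::size_t capacity;         // capacity once `needed` items fit
    std::size_t wasted;           // unused slots at that point
    std::size_t elements_copied;  // total items copied while growing
};

// Grow `start` by `percent` percent at a time until `needed` items fit.
// Worst-case assumption: each growth copies the whole old block.
GrowthResult grow_until_fits(std::size_t start, std::size_t needed,
                             std::size_t percent) {
    GrowthResult r{start, 0, 0};
    while (r.capacity < needed) {
        r.elements_copied += r.capacity;
        std::size_t extra = r.capacity * percent / 100;
        r.capacity += (extra > 0 ? extra : 1);  // always grow by at least 1
    }
    r.wasted = (r.capacity > needed) ? r.capacity - needed : 0;
    return r;
}
```

For 122 items starting from 100, growing by 100% leaves 78 slots unused, 50% leaves 28 and 25% leaves 3, each after a single copy of 100 items. Over many growth steps the smaller percentage copies noticeably more in total, which is the other side of the trade.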

It is the usual tradeoff between "memory and processing power" - you either waste memory but save CPU cycles, or use less memory but spend more CPU cycles. If you want to know exactly how many items you need, you could read your file twice, once to figure out how many items you have, and the second time to actually fill it in - that's probably not a good idea, but it could be done.

realloc also copies the data sometimes, and it's only because of "overallocation" in the first place that realloc is successful without copying (or that you ONLY have one set of malloc/realloc, so your entire heap is able to grow whenever you realloc - this is unlikely to be the case in any decent-size C++ project, since C++ is very much oriented towards using dynamic memory).

Also, if you want to, you can implement your own new operator - either for ALL objects or for certain object types. With that, you could perhaps also add functionality to support "renew" for simple cases. The complex cases still need more knowledge of the data than you have available at that point.
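As a rough illustration of a new operator for one particular object type, here is a sketch with a class-specific operator new/delete pair built on malloc/free. The class `Tracked` and its counter are made up, and a "renew" hook for simple cases would have to be added on top of this:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class with its own allocation functions. Only objects of
// this type (created with plain new/delete) go through them.
struct Tracked {
    static std::size_t live_allocations;
    int value = 0;

    static void* operator new(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p == nullptr) {
            throw std::bad_alloc();
        }
        ++live_allocations;
        return p;
    }

    static void operator delete(void* p) noexcept {
        if (p != nullptr) {
            --live_allocations;
            std::free(p);
        }
    }
};

std::size_t Tracked::live_allocations = 0;
```

Because the class controls where its memory comes from, it is the natural place to keep whatever bookkeeping a simple "renew" would need, such as the real size of each block.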