Do pointers returned by malloc/new have to be aligned?

Hi,
I implemented my own memory management in the form of my own allocator functions, which operate on my own data structures ...

What alignment is needed depends mainly on the processor. And "needed" ranges from "if it's wrong you lose a bit of performance" to "if it's wrong, it doesn't work".

Generally, it's considered a good idea to align to at least 8 bytes, preferably 16.

On x86 (older processors) a cache line is 32 bytes, so if you do not want the processors to suffer from "false sharing" [which is where two CPUs are not sharing the actual data, but data from the same cache line is used by two different processors], then aligning to "a cache line" will reduce the problems with this.
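To illustrate the cache-line point, here is a minimal sketch. The 64-byte line size and the names kCacheLine and PaddedCounter are assumptions for illustration only (the 32-byte figure above applies to older x86 parts), so check the value for your target.

```cpp
#include <cstddef>

// Assumed cache-line size. Many current x86 CPUs use 64 bytes,
// older ones 32; adjust for the actual target.
constexpr std::size_t kCacheLine = 64;

// alignas forces each object to start on a cache-line boundary, and
// sizeof is always a multiple of the alignment, so every counter
// occupies a whole line by itself.
struct alignas(kCacheLine) PaddedCounter {
    long value = 0;
};

static_assert(alignof(PaddedCounter) == kCacheLine, "one line per object");
static_assert(sizeof(PaddedCounter) == kCacheLine, "padded to a full line");

// Two counters meant to be updated by two different threads.
// They never share a cache line, so no false sharing between them.
static PaddedCounter g_counters[2];
```

The cost is wasted space: each counter now takes a full line, so this only pays off for data that really is written by different CPUs at the same time.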

It is also good to avoid splitting blocks into TINY fragments, as that is most likely going to end up with a heap that contains a lot of 1-byte blocks that no one can use, which is another reason for having a minimum heap-block-size (e.g. 32 bytes).
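A minimum block size is usually enforced by rounding every request up before the heap is touched. The sketch below assumes a 16-byte alignment and the 32-byte minimum mentioned above; kAlignment, kMinBlock and block_size are made-up names, not part of any real allocator.

```cpp
#include <cstddef>

constexpr std::size_t kAlignment = 16;  // assumed; must be a power of two
constexpr std::size_t kMinBlock  = 32;  // assumed minimum heap-block size

// Round a requested size up to the next multiple of kAlignment,
// and never hand out less than kMinBlock bytes.
inline std::size_t block_size(std::size_t requested) {
    // Adding (kAlignment - 1) and masking off the low bits rounds up
    // to a multiple of kAlignment (works only for powers of two).
    std::size_t rounded = (requested + kAlignment - 1) & ~(kAlignment - 1);
    return rounded < kMinBlock ? kMinBlock : rounded;
}
```

If the heap itself starts on an aligned address and every block size comes out of a function like this, every block the allocator hands out stays aligned as well.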

The language requires that the returned pointer is "of suitable alignment for the largest supported type". Reality says that the alignment is either 8 (for doubles, and pointers on 64-bit) or 16 (so you can use it for SSE algorithms). Common sense says that it should most definitely not be less aligned than what the system default allocator returns.
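One way to check a custom allocator against this is to assert on the pointers it returns. This is only a sketch: is_aligned and is_malloc_compatible are hypothetical helpers, and alignof(std::max_align_t) is used as a stand-in for "what the default allocator guarantees" (it does not cover stricter needs such as cache-line alignment).

```cpp
#include <cstddef>
#include <cstdint>

// True if p is aligned to `alignment` bytes.
// `alignment` must be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// True if p is at least as aligned as the most strictly aligned
// fundamental type on this platform.
inline bool is_malloc_compatible(const void* p) {
    return is_aligned(p, alignof(std::max_align_t));
}
```

Calling is_malloc_compatible on every pointer in a debug build of the allocator should catch misaligned blocks long before a library function crashes on them.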

I've tried different implementations of a custom function with fewer and fewer features. None of them worked (the library functions crashed with a memory access error after a while).

Now I've tried the very simplest version of a custom function I could come up with. It just allocates and returns the memory asked for. It doesn't care about freeing memory, re-allocation, corruption checks, fragmentation or anything else.

Don't just leak like that. From experience, it'll stay that way for about the next five years, at which point it finally comes back to bite you and takes hours to track down. There's no reason for it to be a vector here at all. Why not just new up the right number of bytes straight off?!
Also, you're probably supposed to take ptr and osize into account instead of just ignoring them. Usually you'd multiply the last two values together.

Yes, I know, and in the real program it's implemented in a much better way (I hope), but I thought the chances of getting the right tip would be much higher if I developed and posted the simplest possible solution. (The production function, with debugging code to mark and track correct handling of blocks, is way bigger.) Also, it helped me to exclude other possible errors regarding freeing and re-use of blocks.

OK, in my case the simplest solution was too simple, because it ignored one of realloc's tasks: moving the memory to its new home.
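For anyone finding this later, the missing step looks roughly like this. It is a sketch, not the original code: the signature (ptr, osize, nsize) is guessed from the parameter names mentioned earlier in the thread, my_realloc is a made-up name, and std::malloc/std::free stand in for the custom heap.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical realloc-style hook: ptr/osize describe the old block,
// nsize is the requested new size. std::malloc and std::free are
// placeholders for the custom heap functions.
inline void* my_realloc(void* ptr, std::size_t osize, std::size_t nsize) {
    if (nsize == 0) {              // new size 0 means "free the block"
        std::free(ptr);
        return nullptr;
    }
    void* fresh = std::malloc(nsize);
    if (fresh == nullptr) {
        return nullptr;            // on failure the old block stays valid
    }
    if (ptr != nullptr) {
        // The step the simplest version skipped: carry the old contents
        // over, copying no more than the smaller of the two sizes.
        std::memcpy(fresh, ptr, osize < nsize ? osize : nsize);
        std::free(ptr);
    }
    return fresh;
}
```

Without the memcpy the caller gets a valid but empty block back, which matches the symptom of the library running fine for a while and then crashing on garbage data.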