2018-03-15

memcpy

Having been caught out by this (and yes, I should know better), here is a friendly reminder for those coding in C.

The man page on memcpy is clear.

DESCRIPTION

The memcpy() function copies n bytes from memory area src to memory area dest. The memory areas must not overlap. Use memmove(3) if the memory areas do overlap.

In days gone by, memcpy would be done by a simple loop copying bytes from src to dest until the length ran out, e.g. while(len--)*dst++=*src++; or some such, but probably in assembler.

So the classic case of copying a block of data back a few bytes, e.g. memcpy(data,data+1,len), would be fine, because each byte is read before the loop gets round to overwriting it.
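As a minimal sketch of that old-style loop (naive_copy_forward is a made-up name for illustration, not how any real C library implements memcpy), copying lowest address first looks something like this:

```c
#include <stddef.h>

/* Hypothetical helper: a naive byte-by-byte copy, lowest address first,
   in the style of the old loop. Not a real libc implementation. */
void *naive_copy_forward(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;    /* read one byte, write one byte, move up */

    return dest;
}
```

With char data[] = "abcdef", calling naive_copy_forward(data, data+1, 5) leaves "bcdeff": the block has moved down one byte intact, because the write position always trails the read position.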

Unfortunately the warning that "The memory areas must not overlap" is not to be ignored.

You will get away with ignoring it a lot, and that is the problem! Whether you get away with it depends on a lot of things: the version of the C library and even the version of the compiler, the specific alignment of the pointers you are moving data to and from, the length you are moving, and probably more factors I cannot think of.

So things may work 100% until next recompiled, or simply until run on a new machine. Worse, they may work most of the time, but not quite all.

The reason is that a memcpy can be carefully optimised. For example, on an ARM you can load a whole load of registers in one go and then store a whole load of registers in one go. It may be more efficient for it to start copying from the end and work backwards, for example. The specification of memcpy not permitting overlapping areas allows for all manner of optimisations to be performed in the implementation.
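To see why the direction matters, here is a sketch of a copy that starts at the end and works backwards. The name naive_copy_backward is made up for illustration; a real optimised memcpy is far more involved, and this only shows that a perfectly valid copying order can break the memcpy(data,data+1,len) case.

```c
#include <stddef.h>

/* Hypothetical helper: a byte-by-byte copy, highest address first.
   A legitimate way to copy non-overlapping areas. */
void *naive_copy_backward(void *dest, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dest + n;
    const unsigned char *s = (const unsigned char *)src + n;

    while (n--)
        *--d = *--s;    /* last byte first, working down */

    return dest;
}
```

With char data[] = "abcdef", calling naive_copy_backward(data, data+1, 5) leaves "ffffff" rather than "bcdeff": each byte is overwritten before it has been read, so the last byte gets smeared all the way down.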

On the other hand memmove has to allow for overlapping areas.

DESCRIPTION

The memmove() function copies n bytes from memory area src to memory area dest. The memory areas may overlap: copying takes place as though the bytes in src are first copied into a temporary array that does not overlap src or dest, and the bytes are then copied from the temporary array to dest.

In practice it does not have to copy to somewhere temporarily; it just has to make sure it moves data in the right order if there is an overlap. This means more checks, and code that may not have quite the same optimisations available.
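A rough sketch of that "right order" check follows. The name sketch_memmove is hypothetical, and comparing the pointers as integers assumes a flat address space; a real library version would be more careful and much faster.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: picks a copy direction that is safe for overlap.
   Assumes a flat address space so the uintptr_t comparison is meaningful. */
void *sketch_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* dest is below src: copy lowest address first */
        while (n--)
            *d++ = *s++;
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* dest is above src: copy highest address first */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    /* dest == src: nothing to do */

    return dest;
}
```

With char data[] = "abcdef", sketch_memmove(data, data+1, 5) gives "bcdeff" and sketch_memmove(data+1, data, 5) on a fresh copy gives "aabcde", so both directions of overlap come out right.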

So, always be careful to use memmove if you cannot be sure the memory areas do not overlap.
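As a small example of the sort of code where this bites, here is a sketch of removing one entry from an array by shuffling the rest down. The function name remove_at and its parameters are invented for illustration; the point is simply that source and destination overlap, so it has to be memmove.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove items[index] by moving the tail down one slot.
   The source and destination overlap, so memmove is required here. */
void remove_at(int *items, size_t *count, size_t index)
{
    if (index >= *count)
        return;                       /* nothing to remove */

    memmove(&items[index], &items[index + 1],
            (*count - index - 1) * sizeof items[0]);
    (*count)--;
}
```

Starting with {1, 2, 3, 4, 5} and a count of 5, remove_at(items, &count, 1) leaves {1, 3, 4, 5} in the first four slots and a count of 4.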

11 comments:

Some CPUs have a hardware block copy instruction (e.g. XAP2), so the compiler compiles memcpy() inline to that instruction. memmove(), on the other hand, checks whether the overlap would break the block copy instruction; if not, it uses it, but otherwise it does the copy carefully but more slowly in assembler.

Some platforms have DMA hardware that can do memory to memory moves. Runtime libraries on those platforms can replace memcpy() with a version that for big copies uses DMA but for smaller copies where the DMA setup overhead would dominate calls the original memcpy().

There was (and may still be -- I should check, and fix it if it's still outstanding) a GCC bug whereby structure assignment for large structures could be offloaded to memcpy... even when the assignment was e.g. 'a = a' or the more likely case of '*a = *b' where a and b were pointers that may alias the same structure.

Long gone are the days when you could use overlapping areas to flood fill a string with a pattern. Now compilers and processors think they know better than you. E.g. with a 20 character string you could move "1234" to 1-4, then move 1-16 to 5-20, and you would get "12341234123412341234".
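If you still want that flood fill, one portable way is to write the forward copy out yourself so the order is guaranteed. This is only a sketch: fill_pattern and its parameters are made-up names, and it assumes the pattern is not stored inside the buffer being filled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: repeat pat across buf without relying on how
   memcpy treats overlapping areas. Assumes pat does not overlap buf. */
void fill_pattern(char *buf, size_t buf_len, const char *pat, size_t pat_len)
{
    size_t seed = pat_len < buf_len ? pat_len : buf_len;
    size_t i;

    if (seed == 0)
        return;

    /* Seed the start of the buffer; no overlap, so memcpy is fine. */
    memcpy(buf, pat, seed);

    /* Explicit lowest-address-first copy of the buffer onto itself. */
    for (i = pat_len; i < buf_len; i++)
        buf[i] = buf[i - pat_len];
}
```

Calling fill_pattern(s, 20, "1234", 4) on a 20 character buffer gives "12341234123412341234", the same result the old overlapping move used to produce.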

On the original C compiler for the ARM (Norcroft C for the Acorn Archimedes) memcpy and memmove were actually the same function. memmove guarantees not to break the overlap case, but memcpy doesn't guarantee that it will.

The implementation did indeed use lots of registers, and was (by the standards of the day) blindingly fast. It was part of what let the Archimedes desktop have solid window drags when practically everything else still only let you drag an outline of the window.

* It produced lots of diagnostics when fed invalid C source which the contemporary Microsoft C compiler would process without comment. Somehow accurate diagnostics were seen by some reviewers as a bad thing.

Norcroft C for the ARM is still by far the best C compiler I have ever used, both in how good the code it generated was and the superb warnings and errors from the compiler about the source code. gcc is rubbish in comparison.
