I've been looking on google.com, but there's not much there, so can anybody help?

Cheers

Allan

mikael_aronsson

12-14-2001, 01:56 AM

All ints and floats should be aligned on 4 bytes and all doubles on 8 bytes; that's pretty much it. Whenever you fail to do this, most CPUs get slower, and on Intel CPUs it's pretty expensive to handle unaligned data.
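As a rough sketch of how to check this rule in code: the two helpers below (is_aligned and align_up are made-up names, not from any library) test whether an address sits on a given boundary and round an offset up to the next boundary. They assume the alignment is a power of two, which holds for 4, 8, 16 and 32.

```cpp
#include <cstddef>
#include <cstdint>

// True when address p is a multiple of `alignment`.
// Assumption: `alignment` is a power of two (4, 8, 16, 32, ...).
inline bool is_aligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Rounds `offset` up to the next multiple of `alignment`.
// Assumption: `alignment` is a power of two.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, is_aligned(&someDouble, 8) tells you whether a double sits on an 8-byte boundary, and align_up(13, 8) gives 16, the next place a double could legally start.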

If you want fancier stuff, you should also try to put related data together in tight groups of 16 or 32 bytes, aligned at 16 or 32 bytes. This improves cache behavior a bit.
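One way to get such a group, sketched in newer C++ (the alignas keyword arrived with C++11; older compilers need a vendor-specific attribute instead). The struct name Particle4 and its fields are hypothetical, chosen only to show four floats that are always used together.

```cpp
#include <cstddef>

// Hypothetical example: four floats that are always read together,
// packed into one 16-byte block that starts on a 16-byte boundary.
struct alignas(16) Particle4 {
    float x, y, z, w;
};

// These hold on platforms where float is 4 bytes.
static_assert(sizeof(Particle4) == 16, "one tight 16-byte group");
static_assert(alignof(Particle4) == 16, "each element starts on a 16-byte boundary");
```

Because the size equals the alignment, every element of an array of Particle4 stays on a 16-byte boundary as well.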

If you go to Intel's website you can find a PDF file about optimization on Intel CPUs. It's a lot of assembler stuff, but it also has lots of info about alignment.

Mikael

Iceman

12-14-2001, 07:18 AM

Good question, now that we have someone that obviously knows what he is talking about ... lemme ask a question ...

Suppose I have a vertex structure defined in the following code and that I manage groups of vertices in a simple array of such structs ...

struct Vertex
{
    double x;
    double y;
    double z;
};

The sizeof() operator on that struct will return 24 (sizeof(Vertex) = 3*sizeof(double) = 3*8 = 24). Would it be beneficial to pad it to 32 bytes, as in the following code? I couldn't care less about the additional memory requirements; I have 512 MB of RAM.

struct Vertex
{
    double x;
    double y;
    double z;
    double padding;
};

Also, do you know if the Visual C++ optimizer does this for me already?
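A quick way to see what the compiler actually does is to let it check the sizes itself. This is only a sketch: PaddedVertex is a made-up name for the second struct (the post calls both Vertex), and the asserts assume a platform where double is 8 bytes. A C or C++ compiler only inserts the padding needed to satisfy member alignment, and 24 is already a multiple of 8, so it does not grow the struct to 32 on its own.

```cpp
#include <cstddef>

struct Vertex {
    double x, y, z;
};

struct PaddedVertex {      // hypothetical name for the padded variant
    double x, y, z;
    double padding;        // unused; only rounds the size up to 32 bytes
};

// Assumes sizeof(double) == 8, as on mainstream desktop platforms.
static_assert(sizeof(Vertex) == 24, "three doubles, no hidden padding");
static_assert(sizeof(PaddedVertex) == 32, "manual padding to 32 bytes");
```

If either assert fails to compile on your setup, the compiler or its packing settings are doing something different from what the post assumes.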

[This message has been edited by Iceman (edited 12-14-2001).]

dorbie

12-14-2001, 07:37 AM

This depends on the access pattern. I assume you're worried about arrays of this type. If so, creating the pad isn't going to help; it just uses extra memory.

If you had small groups of floats that you access fairly randomly, then it pays to align. If you have an array and you're just reading through it sequentially, then it wouldn't pay.

If this data were used for indexing, and the triangle vertices were jumping around in it, then it could pay to align. If it were indexed and most primitives got reasonably sequential access, it wouldn't pay to align. That is normally the case with indexed primitives; at least the first hit on each vertex is sequential.
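To put a number on the random-access case, here is a small sketch that counts how many elements of a tightly packed array cross a cache-line boundary. The function names are made up, and it assumes the array itself starts on a line boundary and that lines are 32 bytes (an assumption about the CPU, not something stated in the thread).

```cpp
#include <cstddef>

// True when element i of a tightly packed array of `size`-byte elements
// crosses a cache-line boundary. Assumes the array base is line-aligned.
inline bool straddles_line(std::size_t i, std::size_t size, std::size_t line) {
    const std::size_t start = (i * size) % line;  // offset inside its first line
    return start + size > line;
}

// How many of the first n elements cross a line boundary.
inline std::size_t count_straddling(std::size_t n, std::size_t size, std::size_t line) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (straddles_line(i, size, line)) ++count;
    }
    return count;
}
```

With 24-byte vertices and 32-byte lines, elements 1 and 2 of every group of four straddle two lines, so a random fetch touches two lines half the time. With 32-byte vertices none do. When reading sequentially the second line is needed for the next vertex anyway, which is why the pad doesn't pay there.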

[This message has been edited by dorbie (edited 12-14-2001).]


Korval

12-14-2001, 12:01 PM

Also, to follow up, don't use doubles. As nice as they are, most implementations internally convert them to single-precision floats when doing computations anyway. Not to mention, they're pretty slow as far as floating-point math is concerned.

If you use an extension like NV_VAR or ATI_VAO, where vertex data is stored in AGP or video memory and is accessed directly by the hardware, then the question of padding becomes pretty hardware-specific. I would imagine, however, that since the access is uncached and random, padding wouldn't be necessary or beneficial.

Iceman

12-14-2001, 05:52 PM

Most of the time I use floats for that very reason. In my current project, that is impossible since the numbers are so large (earth reference coordinates ...).
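To illustrate why floats run out of precision here, the sketch below measures the gap between neighbouring representable values. The helper names are made up, and the test value of 6378137 (Earth's equatorial radius in meters under WGS 84) is an assumption about the kind of coordinates involved, not something stated in the post.

```cpp
#include <cmath>
#include <limits>

// Gap between v and the next larger representable float.
inline float float_spacing(float v) {
    return std::nextafter(v, std::numeric_limits<float>::infinity()) - v;
}

// Gap between v and the next larger representable double.
inline double double_spacing(double v) {
    return std::nextafter(v, std::numeric_limits<double>::infinity()) - v;
}
```

At 6378137.0f the float gap is 0.5, so positions could only be stored to the nearest half meter, while the double gap at the same value is below a nanometer.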

Assuming that I am using floats, the question becomes similar ...

struct Vertex
{
    float x;
    float y;
    float z;
    float padding; // ???
};

In another project, I use compiled vertex arrays, as in the code snippet below: