Custom Primitives... faster?

I have looked at code here and there, and I have noticed a few differences in how people program geometric primitives. The particular thing I noticed was custom primitives: structs, arrays, or even classes used to represent paired data like 2D and 3D points.

I'm unsure of the purpose of this, though. Is code using custom geometric primitives written that way for speed, or is it just for organization of code?

If you're talking about structs or classes, those aren't custom primitives, but custom types. Primitives would be things such as int and double. I'd say that custom primitives would be typedefed primitives.

As for using custom data types, I'd say it's more about organization. I doubt there would be much speed difference between accessing a member of a struct and accessing an element of an array; it's more or less the same thing.
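To make the "more or less the same thing" point concrete, here is a minimal sketch. Point3 and the two get_y_* helpers are hypothetical names made up for illustration, and the layout checks assume a mainstream compiler that adds no padding between int members (typical, but not something the standard strictly promises).

```cpp
#include <cstddef>

// Hypothetical type for illustration; not taken from anyone's code in this thread.
struct Point3 {
    int x, y, z;
};

// Assumption: no padding between the int members, so the struct has the
// same size and element offsets as a plain int[3].
static_assert(sizeof(Point3) == sizeof(int[3]), "struct and array have the same size");
static_assert(offsetof(Point3, y) == sizeof(int), "y sits where element [1] would");

// Both reads are a load from a fixed offset off the object's address.
constexpr int get_y_struct(const Point3& p) { return p.y; }
constexpr int get_y_array(const int (&a)[3]) { return a[1]; }
```

With optimisations on, I would expect both helpers to compile down to the same single load, but check the generated assembly for your own compiler before relying on that.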

I do admit that there are many things you can do with those custom types that can speed development along. For example addition, subtraction, dot-products, cross-products, etc. I guess the only reason I have not done them is that I'm worried about speed, since variables like 3D points will be thrown around everywhere in my programs.
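For reference, the kind of operations mentioned above might look roughly like this. It is only a sketch: Vec3, dot and cross are hypothetical names, and the functions are small free functions defined where the compiler can see them, so it is free to inline them.

```cpp
// Hypothetical 3D vector type; names are illustrative only.
struct Vec3 {
    double x, y, z;
};

// Component-wise addition and subtraction.
constexpr Vec3 operator+(const Vec3& a, const Vec3& b) {
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

constexpr Vec3 operator-(const Vec3& a, const Vec3& b) {
    return Vec3{a.x - b.x, a.y - b.y, a.z - b.z};
}

// Dot product: sum of the component-wise products.
constexpr double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Cross product: a vector perpendicular to both inputs (right-handed).
constexpr Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}
```

Usage is then just `Vec3 c = a + b;` or `double d = dot(a, b);`, which reads a lot better than three separate array assignments.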

So I decided to build a simple speed tester to see the speeds for myself, and here are the results I was given.

Each operation was performed 100,000,000 times
Each operation takes three int[3] arrays, adds two of them, and stores the result in the third
EX:
three[0] = one[0] + two[0];
three[1] = one[1] + two[1];
three[2] = one[2] + two[2];
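
A tester along these lines could be sketched as follows. This is not the tester used for the numbers in this thread: add3 and time_array_add are hypothetical names, and the checksum plus the varying input are only my assumption about how to keep an optimiser from throwing the whole loop away (a clever compiler may still simplify it).

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: the same three-line addition as above.
inline void add3(const int (&one)[3], const int (&two)[3], int (&three)[3]) {
    three[0] = one[0] + two[0];
    three[1] = one[1] + two[1];
    three[2] = one[2] + two[2];
}

// Runs add3 `iterations` times, writes the elapsed wall time to `seconds`,
// and returns a checksum so the results are actually used.
inline long long time_array_add(long long iterations, double& seconds) {
    assert(iterations >= 0);

    int one[3] = {1, 2, 3};
    int two[3] = {4, 5, 6};
    int three[3] = {0, 0, 0};
    long long checksum = 0;

    const auto start = std::chrono::steady_clock::now();
    for (long long i = 0; i < iterations; ++i) {
        one[0] = static_cast<int>(i & 0xFF);  // vary the input between iterations
        add3(one, two, three);
        checksum += three[0] + three[1] + three[2];
    }
    const auto stop = std::chrono::steady_clock::now();

    seconds = std::chrono::duration<double>(stop - start).count();
    return checksum;
}
```

Call it with something like `time_array_add(100000000LL, seconds)` and print `seconds`, then swap add3 for the struct or class version you want to compare. Build both variants with the same optimisation flags or the comparison means nothing.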

And turn it into pretty much the equivalent of the C code. IIRC there was some debate a while back at my job about using the constructor like that at the end vs using the implicit operator= ... too arcane for me to remember this early in the morning. But as you can see it's a very nuanced issue.

gcc does fairly well with inlined C++ and C. I usually get very acceptable speeds by making non-virtual, mostly inlined classes for geometric primitives such as matrices and vectors. The only way to get even faster is to use streaming together with AltiVec.

In this example 3D vector class, every method is inlined, except for Angle, MutateBy, MutatedBy and both Prints, which have their code written as usual in the .cpp file. Note that the "inline" keyword is not even necessary.
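
To illustrate that layout, here is a minimal sketch. It is not the class described above: Vector3, Dot and Length are hypothetical names, and only Angle is borrowed from the description as the out-of-line example. Member functions defined inside the class body are implicitly inline, which is why the keyword is not needed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical class for illustration only.
class Vector3 {
public:
    Vector3(double x, double y, double z) : x_(x), y_(y), z_(z) {}

    // Defined inside the class body: implicitly inline, no keyword required.
    double Dot(const Vector3& o) const { return x_ * o.x_ + y_ * o.y_ + z_ * o.z_; }
    double Length() const { return std::sqrt(Dot(*this)); }

    // Declared only; the definition lives out of line.
    double Angle(const Vector3& o) const;

private:
    double x_, y_, z_;
};

// In a real project this definition would sit in the .cpp file.
double Vector3::Angle(const Vector3& o) const {
    assert(Length() > 0.0 && o.Length() > 0.0);  // angle is undefined for zero vectors
    return std::acos(Dot(o) / (Length() * o.Length()));
}
```

The split is a judgment call: tiny accessors and arithmetic go in the header, anything bigger or rarely called goes in the .cpp file.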

Actually, inline in C++ has the same restrictions as inline in C: the function definition must be visible at the call site for it to work. Additionally, virtual functions generally cannot be inlined when called through a base pointer or reference, as the implementation to call is decided at runtime.
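
A small sketch of the difference, with hypothetical names (Shape, Square, AreaDirect, AreaVirtual). Whether a given call actually gets inlined is up to the compiler, and some compilers can still devirtualise a call when they can prove the dynamic type, so treat this as the general case rather than a guarantee.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration only.
struct Shape {
    virtual ~Shape() {}
    virtual double Area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) { assert(s >= 0.0); }
    double Area() const override { return side * side; }
    double side;
};

// Non-virtual: the body is visible and the target is known at compile time,
// so this is a straightforward inlining candidate.
inline double AreaDirect(const Square& s) { return s.side * s.side; }

// Virtual call through a base reference: the function to run is looked up
// at run time, so the compiler usually cannot inline Area() here.
inline double AreaVirtual(const Shape& s) { return s.Area(); }
```

This is the main reason to keep small geometric types like points and vectors non-virtual.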

When I said inlined C++ is just as fast, I was comparing it to inlined C.

Note that inline is just a 'suggestion' to the compiler; it can pick and choose what to do case by case (including ignoring all your suggestions). On top of this, depending on your compiler optimization settings, it may inline functions for speed that you didn't mark as inline. Basically it's all _extremely_ compiler dependent.

Secondly, inlining can make your code slower, because it makes it larger. Why? Your code may grow to the point where it no longer fits nicely into the CPU caches, or it may trigger some boundary-case memory paging.

I think this belongs under...
"Premature optimization is the root of all evil." - Donald Knuth

I'd also add that optimization without first measuring and then making a goal for what metrics you want to hit is pointless. Or to put it another way:

1) Build it as simple as possible so that it is bug free and it works correctly.

2) Measure its performance and determine what needs to be faster and where you can realistically improve it.

3) Make very targeted fixes to improve performance. Remember that design/algorithm changes almost always have _far_ more performance effect than one line 'speed ups'.

Oh and you want to know the magic wand for solving performance problems, schedule problems, man power problems, budget problems... just about every software development problem?:
Cut features.

In my experience this is the most consistently successful solution to a project that is out of control in one way or another.

Unless you tell gcc to do otherwise, and honouring the other requirements for a function to be inlined, inline behaves as expected. Obviously, inline is an optimisation, so optimisations have to be turned on.

Speaking of math primitives, it doesn't take much brains to inline a function that adds a few numbers if things get slow. A function call that adds a few dozen instructions of overhead can take several times longer to execute than the function body itself. The usual optimisation steps apply, e.g. profile first, but this is really one of the prime cases inlining was meant to be used for.