The performance delta comes from parameter passing/data copying.
Yeah, that's obvious; I was just lacking a better word for it. It's perfectly fine that there are no instance methods, I was just referring to the comparison charts. It's a similar problem with the hardware comparisons you find on the web: they always compare the latest-generation hardware against each other, but never against the previous generation as a baseline.

I lean towards implementing the more accurate/general method and leave the other to the user as a possible optimization
It would actually be nice to have both available. The nice thing about the current method is that it lets you set a desired accuracy. This is very useful when trying to optimize a triangle soup (glDrawArrays) into an indexed triangle list (glDrawElements). When examining the soup, two vertices might have the same position/uv but slightly different normals (this can be introduced by either the loader or the modelling app's smoothing). If the difference between the normals is negligible, you can merge the two vertices into one; this won't break lighting and improves vertex-cache optimization. (Of course this is only interesting for low-poly meshes.)
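
The merge described above can be sketched like this. Everything here is hypothetical (the Vertex layout, the Welder class, the tolerance measured as 1 - dot of the unit normals); it's an O(n^2) brute-force pass, which is fine for the low-poly meshes mentioned:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical 6-float vertex: position + normal (uv omitted for brevity).
struct Vertex
{
    public float X, Y, Z, Nx, Ny, Nz;
    public Vertex(float x, float y, float z, float nx, float ny, float nz)
    { X = x; Y = y; Z = z; Nx = nx; Ny = ny; Nz = nz; }
}

static class Welder
{
    // Turns a triangle soup into (unique vertices, index list) suitable
    // for glDrawElements. Two vertices merge when their positions match
    // exactly and the difference between their unit normals, measured as
    // 1 - dot(n1, n2), stays under the tolerance.
    public static int[] Weld(Vertex[] soup, float normalTolerance, List<Vertex> unique)
    {
        int[] indices = new int[soup.Length];
        for (int i = 0; i < soup.Length; i++)
        {
            int match = -1;
            for (int j = 0; j < unique.Count; j++)
            {
                Vertex u = unique[j];
                bool samePos = u.X == soup[i].X && u.Y == soup[i].Y && u.Z == soup[i].Z;
                float dot = u.Nx * soup[i].Nx + u.Ny * soup[i].Ny + u.Nz * soup[i].Nz;
                if (samePos && 1f - dot <= normalTolerance) { match = j; break; }
            }
            if (match < 0) { unique.Add(soup[i]); match = unique.Count - 1; }
            indices[i] = match;
        }
        return indices;
    }
}
```

With a tolerance of zero this degenerates into an exact-duplicate weld; raising it absorbs the near-equal normals introduced by smoothing.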

I don't really know how you want to handle the default value; that's why I asked. It might be best to accept that this is alpha-stage code that will need tweaking at some later point, and leave it as is for now.
Maybe explicitly state somewhere that the programmer should always try to formulate floating-point comparisons with less-than/greater-than whenever possible, treat IsNearlyEqual() as a rather unreliable check that should be avoided, and never build important logic on it.
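
To make the caveat concrete, here is one possible shape such a function could take (this is a sketch, not the actual OpenTK implementation): a relative-epsilon test that falls back to an absolute one near zero. The comment explains why any such function is a compromise:

```csharp
using System;

static class FloatCmp
{
    // One possible IsNearlyEqual: relative epsilon with an absolute
    // floor near zero. Any single epsilon is a compromise -- too tight
    // for large magnitudes, too loose for tiny ones -- which is why a
    // "nearly equal" result shouldn't drive important logic.
    public static bool IsNearlyEqual(float a, float b, float epsilon)
    {
        float diff = Math.Abs(a - b);
        float largest = Math.Max(Math.Abs(a), Math.Abs(b));
        return diff <= epsilon * Math.Max(1f, largest);
    }
}
```

Ordering comparisons (a < b, a > b) don't need a tolerance at all, which is why they are the safer formulation whenever the problem allows it.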

I've done some profiling on the 0.3.13 mathlib and came to the conclusion that there are quite a few tweaks I'd like to make for my own use. Would you mind changing the structs to be partial for the next release, so users can extend them without touching OpenTK.dll?

Hmmm, what other options remain for extending the structs, since you cannot derive a new struct from them? I have to admit that I only declared the struct partial and didn't try to recompile the .dll, since I was only working with the mathlib. You're right, partial cannot extend already-built modules :|

I will share the findings with you, but right now I don't know whether you are still modifying it, so I decided to add my stuff as a partial in a new file. The things I posted in this thread didn't really make it into the trunk, so that appears to be the best path for now.

Basically, the changes are more helper methods and more overloads with ref/out parameters, as both you and objarni noticed in your benchmarks too.
E.g. static void Lerp(ref Vector3 a, ref Vector3 b, ref float blend, out Vector3 result) executes in half the time compared to the currently included method, which passes everything by value. Since the CPU is a good candidate for becoming the bottleneck, trading slightly worse readability for noticeably better performance is a good deal imho.
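
A minimal sketch of the two signatures being compared (this stand-in Vector3 is not the actual OpenTK struct, just enough to show the difference):

```csharp
using System;

struct Vector3
{
    public float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }

    // Pass-by-value: the caller copies two 12-byte structs and a float
    // in, and a third 12-byte struct comes back out.
    public static Vector3 Lerp(Vector3 a, Vector3 b, float blend)
    {
        return new Vector3(a.X + blend * (b.X - a.X),
                           a.Y + blend * (b.Y - a.Y),
                           a.Z + blend * (b.Z - a.Z));
    }

    // ref/out: only addresses cross the call boundary and the result is
    // written in place. Verbose at the call site, but no struct copies.
    public static void Lerp(ref Vector3 a, ref Vector3 b, ref float blend, out Vector3 result)
    {
        result.X = a.X + blend * (b.X - a.X);
        result.Y = a.Y + blend * (b.Y - a.Y);
        result.Z = a.Z + blend * (b.Z - a.Z);
    }
}
```

Both overloads compute the same thing; the difference is purely in how the 12-byte structs travel across the call.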

When adding Vector3 B to Vector3 A, the only method faster than Vector3.Add(ref A, ref B, out A) is the instance method void Add(ref Vector3 b), which is ~15% faster and on par with static void Add(ref Vector3 a, ref Vector3 b) performance-wise.

I really want those non-static members too, not necessarily for speed only. Just take a look:

Don't take this the wrong way, the API is far from final. The only reason more stuff didn't make it in is that it would have pushed the release further back, which would be bad (too many changes in one release without feedback - the original plan was one release every 2-4 weeks). This includes the IEquatable interface, the IsNearlyEqual function and reference parameters.

The reason it would push the release back is that it is difficult to find a solution that is both fast and reasonably convenient. For example, instance methods don't behave intuitively with properties:

// A doesn't change if it is a property, while it does if it is a field. Ugh...
A.Add(ref B);
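
The pitfall can be reproduced with a minimal sketch (Holder is a made-up container type, not anything from OpenTK). A property getter returns a copy of the struct, so the mutating instance method runs on a temporary that is silently thrown away:

```csharp
using System;

struct Vector3
{
    public float X, Y, Z;

    // Mutating instance method of the kind being discussed.
    public void Add(ref Vector3 b) { X += b.X; Y += b.Y; Z += b.Z; }
}

class Holder
{
    public Vector3 Field;                 // direct storage: Add mutates in place
    public Vector3 Property { get; set; } // getter returns a copy: Add mutates the copy
}
```

Calling h.Field.Add(ref v) changes the stored vector; calling h.Property.Add(ref v) compiles without warning but changes nothing, which is exactly the "Ugh" above.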

References on the other hand don't work with properties at all:

// Fails to compile if either A or B is a property.
Vector3.Add(ref A, ref B, out A);

Even worse, overloading on ref alone isn't CLS-compliant - we could do that, but it could potentially break other .Net languages...

The XNA team followed the simplest but least performant road: everything is passed by value, no instance methods. I'm not saying they were wrong (I'm sure they ran their tests before going that way); however, the math toolkit is the single biggest potential CPU bottleneck, and it's a bad idea to release something less than fully optimized (and unlike XNA we don't have the luxury of falling back to C++/CLI).

I'm still trying to find a solution to these problems. If you have any ideas please share - the next release will have a shorter development cycle, so we can improve and test new ideas as they come.

Edit:
It seems that setting the CLSCompliant attribute to false allows overloading on ref parameters, meaning that we can have both:

Vector3.Add(a, b, out c);
Vector3.Add(ref a, ref b, out c);

This solves the problem with properties as parameters; however, it seems that a method marked with CLSCompliant(false) doesn't appear in IntelliSense. I'm not sure why that is (such functions do appear in the GL class), but otherwise it works. Will have to test with braindead languages, though (hope it doesn't break VB.Net).
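
A sketch of the overload pair described above (again a stand-in Vector3, not the shipped one). C# itself accepts overloads that differ in ref vs. by-value; the CLSCompliant(false) attribute only opts the ref version out of CLS rules for other .Net languages:

```csharp
using System;

public struct Vector3
{
    public float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }

    // CLS-compliant overload: inputs are copied by value.
    public static void Add(Vector3 a, Vector3 b, out Vector3 result)
    {
        result.X = a.X + b.X; result.Y = a.Y + b.Y; result.Z = a.Z + b.Z;
    }

    // Overloads that differ only in ref are not CLS-compliant, so this
    // one is explicitly marked as such; C# callers can still pick it by
    // writing ref at the call site.
    [CLSCompliant(false)]
    public static void Add(ref Vector3 a, ref Vector3 b, out Vector3 result)
    {
        result.X = a.X + b.X; result.Y = a.Y + b.Y; result.Z = a.Z + b.Z;
    }
}
```

Overload resolution is driven entirely by whether the caller writes ref, so Vector3.Add(a, b, out c) and Vector3.Add(ref a, ref b, out c) coexist exactly as in the snippet above.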

I really can't think of any solution to the A.Add(ref b); problem, where A is a property.

Maybe this sounds weird, but I'd split the structs of each type in the mathlib into two partial files, something like Vector3Slow.cs and Vector3Fast.cs. The first could contain all operator overloads and pass-by-value methods; the second file would primarily contain ref/out overloads. This would draw a clear line between the options and might be less aggressive than splitting them into separate structs like Vector3 (nice) and Vec3 (fast). Users programming with the fast methods could be made aware that they should avoid data encapsulation via accessors/mutators, but they will most likely have done that already anyway.
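
The proposed split could look like this (file names and members are illustrative; both halves are shown in one listing, but each partial block would live in its own file). Note that, per the earlier exchange, partial only merges declarations at compile time - it cannot extend an already-built OpenTK.dll:

```csharp
using System;

// --- would live in Vector3Slow.cs: operators and pass-by-value methods ---
public partial struct Vector3
{
    public float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }

    public static Vector3 operator +(Vector3 a, Vector3 b)
        => new Vector3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
}

// --- would live in Vector3Fast.cs: ref/out overloads ---
public partial struct Vector3
{
    public static void Add(ref Vector3 a, ref Vector3 b, out Vector3 result)
    {
        result.X = a.X + b.X; result.Y = a.Y + b.Y; result.Z = a.Z + b.Z;
    }
}
```

The compiler merges the two declarations into one struct, so both the convenient operator and the fast ref/out method are available on the same type.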

Providing a dozen overloads for each method, with every possible value/ref parameter combination, will probably still lead to only 1-3 of them being frequently used, because the user cares about either ease of use or speed. I just think every static function should have a ref/out overload; this is what I've been working on.

One version favors readability and ease of use, the other speed. No more overloads are needed than that. To be more precise, don't waste time writing overloads like these: Vector3.Cross(A, B, out C); or C = Vector3.Cross(ref A, ref B);

Regarding XNA, the developers probably had their target clientele in mind too. For a beginner, adding vectors as C = A + B; is much more convenient than Vector3.Add(ref A, ref B, out C); - and those are 90% of their users. Although the latter looks a bit alien at first, it's not that bad once you get used to it.

Are double-precision quaternions really needed? I don't have any plans for that, but I wouldn't turn down some code.
Hmm... Mind if I have a look in the math module?
What are dual quaternions?
1. complex number: a + bi (i^2 = -1; {a, b} real)
2. dual number: a + be (e^2 = 0; {a, b} real)
3. dual quaternion: p + qe (e^2 = 0; {p, q} quaternions)
Can be used to represent general rigid transformations (translation and rotation). Correcting matrix drift is a pain in the arse. Normalizing the rotation part of a dual quaternion is WAY simpler and faster...
They're heavily used in robotics while the CG world has been ignoring them.
Best page I found: http://www.euclideanspace.com/maths/algebra/realNormedAlgebra/other/dual...
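
The e^2 = 0 rule from item 2 can be made concrete with a tiny sketch (names are made up; this is the dual-number arithmetic, not a full dual quaternion). Multiplying two duals drops the b1*b2 term entirely because it carries a factor of e^2:

```csharp
using System;

// Dual number a + b*e with e^2 = 0.
struct Dual
{
    public double A, B;
    public Dual(double a, double b) { A = a; B = b; }

    // (a1 + b1*e)(a2 + b2*e) = a1*a2 + (a1*b2 + b1*a2)*e
    // -- the b1*b2*e^2 term vanishes because e^2 = 0.
    public static Dual operator *(Dual x, Dual y)
        => new Dual(x.A * y.A, x.A * y.B + x.B * y.A);

    public static Dual operator +(Dual x, Dual y)
        => new Dual(x.A + y.A, x.B + y.B);
}
```

The same rule is what makes dual quaternion renormalization cheap: for p + qe, only the real part p contributes quadratic terms to the magnitude, so there is no matrix-drift-style cleanup to do.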

Hmm... Mind if I have a look in the math module?
Nope, not at all. That's why the source is out there in the first place.

This is the first time I've heard of dual numbers, thanks for the link. I'm still trying to grasp their mathematical properties, but they seem very interesting (I've always wondered how quaternions could be used for translations). I'll be doing some reading.