The reduce functions look especially interesting, as they are more general than the dot-product instruction available in SSE. Nothing revolutionary, but all in all it looks like a very nice and useful instruction set, although I was hoping for 8-bit instructions as well (with 8-bit components and RGBA, you could process 4x4 pixels at once -- that would be a real killer for image processing).

[Update] The instructions are 3 operand only, storing the result in the first operand!

Over the last few months, I've written a virtual texture mapping implementation as part of my student research work. Some people have already got a copy to read (you know who you are ;) ); rest assured that I'll continue to work on this stuff. I'm going to post about it on this blog as soon as the work becomes a bit more mature. Currently the framework is in an early alpha stage, and we are working on a better content creation pipeline. Our artist -- although very talented -- had a hard time producing demo content, and hence we (that is, a co-developer and I) have to write some tools to help him.

My solution is basically a reimplementation of Sean Barrett's "Sparse Virtual Textures" (which I already blogged about), this time with DX10, though I didn't use anything DX10-specific. I measured lots of stuff and tweaked based on that, and I still have lots and lots of things to try and measure. The implementation supports 4:1 anisotropic hardware filtering and requires texture space of roughly 5x the framebuffer size (for a framebuffer with 400k pixels, you would need a 2M cache). No special shader tricks are needed; the lookup costs < 10 cycles (most of which is fixed overhead, so the cost per lookup drops as you do more lookups).

Not all is lost, though: you can use the comments to ask specific questions, and I'll try to answer them.

As you probably have noticed (unless you only use a feed-reader for my page ;) ) -- I have a new theme. If you spot any problems (usability or display), please leave a comment, and tell me which browser/OS combination you use. So far, it works fine with Opera/Windows, Firefox/Windows & Linux, Chrome/Windows (except for small issues with the code), Konqueror/Linux (same problems as with Chrome, and some with the title bar text).

[Update] Fixed Gravatar-display inside the comments, my gravatar should be displayed properly now. Made the check for comments created by the admin more reliable.

We'll take a look at the PIMPL (private implementation) pattern today, which is especially useful for larger projects where compile times become a problem. Pimpl lets you decouple the interface from the implementation, to the point where a header can get by with forward declarations for nearly every class it uses. This can reduce compile times dramatically. Another use of Pimpl is to hide large or ugly include files (windows.h, anyone?) from the clients.

So how does it work? The idea is to forward declare an inner class, and always store a pointer to it. Let us take a look at an example:
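A minimal sketch of such an interface, pieced together from the member functions the implementation defines. The `explicit` keyword and the use of `std::shared_ptr` for `impl_` are assumptions on my part; a shared pointer is chosen here because it can be destroyed while `Impl` is still an incomplete type, so the class needs no hand-written destructor.

```cpp
// Interface (header) -- note that <vector> is not included here
#include <cstddef>
#include <memory>

class Container
{
public:
    explicit Container(std::size_t size);
    Container(const Container& other);
    Container& operator=(const Container& other);

    int& operator[](int index);
    const int& operator[](int index) const;

private:
    class Impl;                   // forward declaration only; defined in the source file
    std::shared_ptr<Impl> impl_;  // assumption: the pointer type is a guess, see above
};
```

Clients that include this header only ever see a pointer to an incomplete type, so changes to `Impl` do not force them to recompile.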

This is our public class interface; note that we don't expose our container type. In this case, we'll use a standard vector. Our implementation looks like this:

```cpp
// Implementation
#include <vector>

class Container::Impl
{
public:
    Impl(const size_t size)
    {
        vec.resize(size);
    }

    std::vector<int> vec;
};

Container::Container(const size_t size)
    : impl_(new Impl(size))
{
}

/*
We need those copy constructors, otherwise, we would
share our state. For most classes, it is best to make them
noncopyable anyway.
*/
Container::Container(const Container& other)
    : impl_(new Impl(other.impl_->vec.size()))
{
    impl_->vec = other.impl_->vec;
}

Container& Container::operator=(const Container& other)
{
    impl_->vec = other.impl_->vec;
    return *this;
}

int& Container::operator[](const int index)
{
    return impl_->vec[index];
}

const int& Container::operator[](const int index) const
{
    return impl_->vec[index];
}
```

That's it! We pay with one additional memory allocation (which can be circumvented by semi-portable trickery), but we gain a lot while compiling. A large library which makes extensive use of Pimpl is Qt, and it pays off there: its headers include the bare minimum required to get a compile, and nothing more. For the sake of completeness, a small usage example:
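The following is a single-file sketch rather than a definitive listing: the class is repeated in condensed form so that the snippet compiles on its own, `std::shared_ptr` is again an assumed pointer type, and `usageExample` is a hypothetical helper name. In a real project the three parts would live in a header, a source file, and client code respectively.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// --- Header part: clients see only this, and no <vector> ---
class Container
{
public:
    explicit Container(std::size_t size);
    Container(const Container& other);
    Container& operator=(const Container& other);
    int& operator[](int index);
    const int& operator[](int index) const;

private:
    class Impl;
    std::shared_ptr<Impl> impl_;  // assumption: pointer type chosen for this sketch
};

// --- Source file part: the only place that knows about std::vector ---
class Container::Impl
{
public:
    explicit Impl(std::size_t size) : vec(size) {}
    std::vector<int> vec;
};

Container::Container(std::size_t size) : impl_(new Impl(size)) {}
Container::Container(const Container& other) : impl_(new Impl(*other.impl_)) {}
Container& Container::operator=(const Container& other)
{
    impl_->vec = other.impl_->vec;
    return *this;
}
int& Container::operator[](int index) { return impl_->vec[index]; }
const int& Container::operator[](int index) const { return impl_->vec[index]; }

// --- Client part (hypothetical helper): uses Container like any other class ---
int usageExample()
{
    Container original(4);     // four ints, value-initialized to zero
    original[1] = 42;

    Container copy(original);  // deep copy through our copy constructor
    original[1] = 7;           // changing the original ...

    assert(copy[1] == 42);     // ... must not change the copy: state is not shared
    return copy[1];
}
```

The client code never mentions `Impl` or `std::vector`, which is the whole point: swapping the vector for another container would only touch the source file part.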