VLIW code for NVIDIA is only required for the cheap sm_21 (compute capability 2.1) models. Basically Ti series cards older than the 9xx series, like the 560 Ti. Those cards need it to reach maximum performance. Newer cards don't need it.

Basically these C++ classes just generate code that issues the same instruction multiple times in a row, each time for a different data set. As you said, it's like SIMD.
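
To make that concrete, here is a minimal sketch of the idea in plain C++. It is not the actual classes discussed here: `Lanes`, `rotl` and `mix` are hypothetical names, and `mix` is a toy step, not a real hash round. The point is only that every operator repeats one scalar instruction once per data set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: N independent 32-bit values handled as one operand.
template <std::size_t N>
struct Lanes {
    std::array<std::uint32_t, N> v;

    // Each operator applies the same scalar instruction once per lane.
    // N is a compile-time constant, so a compiler can unroll the loop into
    // N back-to-back instructions that do not depend on each other.
    friend Lanes operator+(const Lanes& a, const Lanes& b) {
        Lanes r{};
        for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] + b.v[i];
        return r;
    }

    friend Lanes operator^(const Lanes& a, const Lanes& b) {
        Lanes r{};
        for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] ^ b.v[i];
        return r;
    }
};

// Rotate every lane left by the same amount s (0 < s < 32).
template <std::size_t N>
Lanes<N> rotl(const Lanes<N>& a, unsigned s) {
    assert(s > 0 && s < 32);
    Lanes<N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r.v[i] = (a.v[i] << s) | (a.v[i] >> (32 - s));
    return r;
}

// Toy mixing step written once, executed for all N data sets.
template <std::size_t N>
Lanes<N> mix(const Lanes<N>& a, const Lanes<N>& b) {
    return rotl(a + b, 7) ^ b;
}
```

Calling `mix` on two `Lanes<2>` values processes two data sets with one piece of source code. With `Lanes<1>` the same code collapses back to the scalar version, which is presumably what you would use on newer cards.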