The vectorized implementation in assembly language makes OptiVec functions, on average, two to three times faster than compiled source code offering the same functionality. In many instances, numerical accuracy is improved as well.
This version is for Embarcadero RAD Studio XE3 (Delphi XE3), for both 32-bit and 64-bit targets.