Saturday, August 21, 2010

Parallel classifier with a GPU

The objective is to speed up the classifier by using a GPU to do the string matching and the computation of probabilities. But there are two big issues, which I listed in my previous post:

- The number of threads is limited to 32.
- The memory bandwidth is 30 times lower than the computation capability.

The straightforward solution is to use the GPU for what it is essentially designed for: "Compute a large array of independent floating-point values with exactly the same code, the same instruction executed at the same time" on many simplified processors. In this way, we will be able to use all 512 cores.

The simplest way is to transform the terms (strings of alphanumeric characters) into numeric values. First step: a hash routine using a parallel algorithm [1] performs this transformation. Next step: the hash values of the terms in the dictionary are compared with the hash value of the searched term. Final step: all the products wi*xi are computed. Both the comparisons and the multiplications are performed in parallel.
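
The three steps can be sketched on the CPU to show the data layout. This is only an illustration under my own assumptions, not the GPU code: FNV-1a stands in for the parallel hash of [1], xi is taken to be 1 when a dictionary term matches the searched term and 0 otherwise, and the names fnv1a_32 and score_term are made up for the example. Each list element corresponds to the work one GPU thread would do, since no element depends on another. Note that comparing hashes instead of strings can give a false match when two terms collide.

```python
# Assumption: FNV-1a (32-bit) is a stand-in for the parallel hash of [1].
FNV_OFFSET = 0x811C9DC5
FNV_PRIME = 0x01000193


def fnv1a_32(term):
    """Turn a term (string) into a 32-bit integer."""
    h = FNV_OFFSET
    for byte in term.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the value on 32 bits
    return h


def score_term(dictionary, weights, searched_term):
    """Return the list of products wi*xi for one searched term."""
    # Step 1: hash every dictionary term and the searched term.
    dict_hashes = [fnv1a_32(t) for t in dictionary]
    target = fnv1a_32(searched_term)

    # Step 2: compare hashes. Assumption: xi = 1.0 on a match, else 0.0.
    x = [1.0 if h == target else 0.0 for h in dict_hashes]

    # Step 3: element-wise products wi*xi, independent of each other.
    return [w * xi for w, xi in zip(weights, x)]


if __name__ == "__main__":
    dictionary = ["gpu", "cuda", "classifier"]
    weights = [0.5, 1.5, 2.0]
    print(score_term(dictionary, weights, "cuda"))
```

On the GPU, the two list comprehensions of steps 2 and 3 would become one kernel where thread i reads one hash and one weight and writes one product, so the same instruction runs on every element at the same time.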