Histogram computation on CUDA

Histogram computation is a common task in image and video processing. Because the histogram algorithm can be parallelized, it can run on a GPU to achieve very high performance. We have implemented CUDA-based histogram kernels, and they are now part of the Fastvideo Image & Video Processing SDK.
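To make the parallel idea concrete, here is a minimal sketch of a 256-bin histogram for 8-bit data. It is not the SDK's implementation: the names histogram256Kernel, histogram256 and kBins, as well as the launch configuration of 64 blocks of 256 threads, are assumptions chosen for illustration. Each thread block accumulates a private histogram in shared memory with atomic increments and then merges it into the global result.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

constexpr int kBins = 256;  // one bin per 8-bit value (illustrative assumption)

// Hypothetical kernel: every block builds a private histogram in shared
// memory, then adds its partial counts to the global histogram.
__global__ void histogram256Kernel(const unsigned char* data, size_t n,
                                   unsigned int* hist)
{
    __shared__ unsigned int local[kBins];

    // Clear the block-private histogram.
    for (int i = threadIdx.x; i < kBins; i += blockDim.x)
        local[i] = 0;
    __syncthreads();

    // Grid-stride loop: each thread counts a strided subset of the input.
    const size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        atomicAdd(&local[data[i]], 1u);
    __syncthreads();

    // Merge the block-private counts into the global histogram.
    for (int i = threadIdx.x; i < kBins; i += blockDim.x)
        atomicAdd(&hist[i], local[i]);
}

// Hypothetical host wrapper: copies the data to the GPU, runs the kernel,
// and copies the 256 counts back. Returns false if any CUDA call fails.
bool histogram256(const std::vector<unsigned char>& pixels,
                  std::vector<unsigned int>& hist)
{
    hist.assign(kBins, 0u);
    if (pixels.empty()) return true;

    unsigned char* dData = nullptr;
    unsigned int* dHist = nullptr;
    const size_t histBytes = kBins * sizeof(unsigned int);

    bool ok = cudaMalloc(&dData, pixels.size()) == cudaSuccess &&
              cudaMalloc(&dHist, histBytes) == cudaSuccess &&
              cudaMemcpy(dData, pixels.data(), pixels.size(),
                         cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemset(dHist, 0, histBytes) == cudaSuccess;
    if (ok) {
        const int threads = 256;  // assumed block size
        const int blocks = 64;    // assumed grid size
        histogram256Kernel<<<blocks, threads>>>(dData, pixels.size(), dHist);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(hist.data(), dHist, histBytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dData);  // cudaFree accepts a null pointer
    cudaFree(dHist);
    return ok;
}
```

Calling histogram256 with a buffer of 8-bit pixel values fills hist with 256 counts that sum to the number of pixels. Keeping the per-block histogram in shared memory limits contention on global memory, since each block issues only 256 global atomic updates regardless of input size.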