Note that, according to the source code, this appears to work only on Windows 7 and later. Either way, the undocumented function is "D3DKMTQueryStatistics" from GDI32.dll, and you'd need to get hold of d3dkmt.h (google it) for the structure definitions.
– Jasper Bekkers, Dec 8 '11 at 15:08

What you want to do is not done in CUDA. To get the utilization of a specific CUDA device as a percentage, you have to ask the GPU driver, as nvidia-smi does.
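
As a rough illustration of asking the driver rather than CUDA, here is a minimal Python sketch that simply shells out to nvidia-smi and reads the per-GPU utilization. It assumes nvidia-smi is installed and on the PATH; the helper names `parse_utilization` and `query_gpu_utilization` are made up for this example and are not part of any NVIDIA tool.

```python
import subprocess


def parse_utilization(csv_text):
    """Turn nvidia-smi CSV output (one value per line) into a list.

    Each entry is an int percentage, or None when the driver reports a
    non-numeric value such as "[N/A]" for that GPU.
    """
    values = []
    for line in csv_text.splitlines():
        field = line.strip()
        if not field:
            continue  # skip blank lines
        try:
            values.append(int(field))
        except ValueError:
            values.append(None)  # utilization not reported for this GPU
    return values


def query_gpu_utilization():
    """Ask nvidia-smi for the utilization of every GPU (assumes it is on PATH)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi exits with an error
    )
    return parse_utilization(result.stdout)
```

Calling `query_gpu_utilization()` on a machine with a working NVIDIA driver should return one entry per GPU, in the order nvidia-smi lists them. Spawning a process per sample is slow, so for frequent polling a driver library binding would probably be a better fit.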

EDIT: After a little Google searching I found open-hardware-monitor. It is written in C# and shows a way to get the utilization of AMD and NVIDIA cards on Windows systems. The implementation for AMD cards works on Linux and Windows.

I've looked through the code a bit. See the NVAPI.cs file for how open-hardware-monitor uses the NVIDIA DLLs on Windows to get all sensor and load data.