Version 1.20 of the Collatz Sieve has been released for testing. (You have to enable beta/test apps in your preferences in order to get collatz_sieve workunits.) The code has been cleaned up since 1.10 and a couple of bugs have been fixed. Both 32- and 64-bit versions are available for Linux and Windows; OS X is 64-bit only.

The Linux applications were compiled on Oracle's version of RHEL 6.5, which is from the CUDA 3.0 era, so I'm hoping the binaries will work on newer Linux distributions. As usual, if they don't, let me know.

The OS X applications were compiled on Mavericks, so I have no idea whether older Macs will be able to run them or not. Supposedly they should work on 10.5 or later, but I'm not holding my breath on that one. I'm also hoping the kernel will build properly on the AMD machines, as I've only tested on Mavericks with an nVidia GPU.

The .config for this WU was the same as for the 1.10 WUs, which were running well.

Lower the sieve size to 28. Version 1.10 subdivided the sieve into 4 parts whereas 1.20 does not, so it now runs 4 times as many items per kernel; using 28 for the sieve size gets you back to the same per-kernel workload. On the flip side, because it now runs one kernel per sieve instead of 4, you should be able to increase the kernels per reduction.
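For example, the change would be something along these lines in the app .config (the parameter names below are just my shorthand for the usual settings and the values are only illustrative, so match them to whatever your existing 1.10 file actually uses):

    # try 28 to get the same per-kernel workload as 1.10's 4-way split
    sieve_size=28
    # one kernel per sieve now, so this can be raised; the value here is just an example
    kernels_per_reduction=48

Raise kernels_per_reduction gradually from whatever you used with 1.10 and see how it runs.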

Then again, you'll have to wait for v1.21, which fixes the result/validator issue that is happening now.

Notice the blank line? That's where the actual line number and error in the kernel are supposed to print. It looks like Apple's/AMD's compiler still has the bug where it just crashes rather than returning the error that occurred. I guess that's what you get when a company is more focused on wrist watches than on operating systems.
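For what it's worth, the error reporting is just the standard OpenCL build-log query, so on a driver that behaves you get the compiler's message, line number and all. Roughly (a sketch of the usual pattern, not the app's exact code):

    /* After clBuildProgram() fails, fetch and print the compiler's build log.
       On a working driver this is where "line NNN: error: ..." would show up;
       on the Apple/AMD driver the compile crashes before it ever gets this far. */
    #ifdef __APPLE__
    #include <OpenCL/opencl.h>
    #else
    #include <CL/cl.h>
    #endif
    #include <stdio.h>
    #include <stdlib.h>

    static void print_build_log(cl_program program, cl_device_id device)
    {
        size_t log_size = 0;
        clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);
        char *log = (char *)malloc(log_size + 1);
        clGetProgramBuildInfo(program, device, CL_PROGRAM_BUILD_LOG, log_size, log, NULL);
        log[log_size] = '\0';
        fprintf(stderr, "%s\n", log);
        free(log);
    }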

There are ZERO differences between the OpenCL kernels used in the Windows, Linux, and OS X applications. Given that the kernel compiles fine on the other platforms, and on OS X when using the nVidia driver, the issue is clearly with the AMD/OS X driver, which means there is nothing I can do other than complain to AMD and Apple that the problem still isn't fixed.

But they are finishing and reporting results, which is much better than not being able to compile the app at all. The issue is either in the output of the app or on the server. Since the output is supposed to be the same and the server code hasn't changed, my guess is that the code cleanup I did so that all platforms could share the exact same code as much as possible caused a problem in the result output.

As far as estimates go, BOINC sucks. It tries to be smart and ends up being really dumb. The original sieve WUs were very large, and BOINC still hasn't figured out that when the WU size changes, its estimates should also change. Instead, it uses past performance to judge future work and slowly adjusts over time. Secondly, the integer operations per second of the CPU are a really poor way to measure GPU performance, so BOINC uses estimated GPU flops via a fancy algorithm that worked for a specific type of GPU in a given model line. As hardware changes, the estimate gets further and further off. Again, K.I.S.S. works really well; the fancier the algorithms get, the worse the estimates have gotten. But it works OK for CPU apps that have variable WU sizes and use floating point computations, so that's what everyone else has to use.
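To put the estimate problem in rough terms (a simplification of what the client does, not actual BOINC source), the predicted run time is basically the WU's declared size divided by a guessed device speed, nudged by history:

    /* Rough sketch of the runtime estimate: declared WU size over guessed device
       speed, scaled by a slowly-adapting correction learned from past WUs. When
       the WU size changes but the correction hasn't caught up, the estimate
       stays badly wrong for a long time. */
    double estimated_runtime(double rsc_fpops_est,   /* server-declared FP ops for the WU */
                             double projected_flops, /* peak-FLOPS-style guess of device speed */
                             double correction)      /* adjusts slowly from past results */
    {
        return rsc_fpops_est / projected_flops * correction;
    }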

Tip: Always ignore the BOINC time estimate. If you want to believe it, I have a bridge to sell you....

There is a bug in the app output that I need to find. According to the validator, the max steps and total steps are showing up as 0, which is why validation is failing. The results shown in the stderr output ARE valid though. It's just that the result file returned to the server is screwed up.
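In other words, the validator compares the parsed fields from the returned result files, so a file that parses as all zeros can never pass no matter what stderr says. An illustrative sketch only (hypothetical field names, not the actual validator source):

    /* Hypothetical parsed result fields and comparison, for illustration only. */
    struct parsed_result {
        unsigned long long max_steps;
        unsigned long long total_steps;
    };

    static int results_match(const struct parsed_result *a, const struct parsed_result *b)
    {
        if (a->max_steps == 0 || a->total_steps == 0)
            return 0;  /* the current bug: fields parse as 0, so validation fails */
        return a->max_steps == b->max_steps && a->total_steps == b->total_steps;
    }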