Help optimizing CPU/GPU

Apathyball

Joined: 23 Feb 16

Posts: 4

Credit: 29461261

RAC: 0

2 Mar 2016 21:12:13 UTC

Topic 198480

Hey all.

I just got started with BOINC last week and want to make sure I'm setting everything up right.

I've got an i7-6700K with a GTX 980 Ti. I'm currently running 2 tasks at a time on the GPU and the load stays around 90%. My issue is that as I keep lowering the CPU contribution, my speed with the GPU keeps increasing.

When I increased my GPU tasks from 1 to 2, each task took more than twice as long to finish. So from reading on the board I learned to decrease my CPU contribution to help "feed" the GPU. Performance increased quite a bit. However, as I keep decreasing the CPU contribution, the GPU performance keeps climbing.

For example, decreasing the CPU contribution from 40% to 10% seems to increase my GPU performance by 23%.

Does this make sense? I want to keep contributing to the CPU tasks but they seem to be neutering my GPU tasks.

... For example, decreasing the CPU contribution from 40% to 10% seems to increase my GPU performance by 23%.

I think you are telling us that you are playing with a setting that reads, "Use at most x% of the CPU time". If that's correct, put it back to 100% and change the other setting, "Use at most x% of the CPUs" instead. If you set that to 87% then only 7 cores will be crunching CPU tasks. If you use 75% it will be 6 cores. This is the best way to provide GPU support if it is needed. I expect the first value will work the best. You could try the second one later to see if that gave further improvement. I would suspect not, but I don't own such powerful GPUs so don't have direct experience :-).
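As a rough illustration of the arithmetic behind those percentages, here is a minimal sketch, assuming a CPU that shows 8 logical cores. The function name `cpus_from_percent` is made up for this example, and it only computes the raw, unrounded product; how the BOINC client rounds a fractional result is not modelled here.

```python
def cpus_from_percent(logical_cpus: int, percent: float) -> float:
    """Return the raw (unrounded) CPU count that `percent` of `logical_cpus` represents.

    Hypothetical helper for illustration only; it does not reproduce
    whatever rounding the BOINC client applies to this value.
    """
    return logical_cpus * percent / 100.0


if __name__ == "__main__":
    # Assumption: 8 logical cores, as on a 4-core CPU with hyper-threading.
    for pct in (100, 87, 75, 25, 20, 12):
        print(f"{pct:>3}% of 8 CPUs = {cpus_from_percent(8, pct):.2f}")
```

75% comes out at exactly 6, while 87% comes out at 6.96, just under 7, so it is worth checking in BOINC Manager how many CPU tasks are actually running after a change.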

As you have more than one host, please realise that if you make these changes on the website (and you're not using different locations (venues) for each host) the settings will apply to ALL hosts. One way to avoid that is to make the change only through BOINC Manager - advanced view where it will be local to the host where it is needed. Have you read the sticky thread at the top of this forum? There are notes there about global and local preferences. Please ask if things are not clear.

When you get 2x running well on your GPU and have stable times, you could try running 3x. I believe you will get a further (but smaller) overall throughput improvement by doing that. Take things steadily, monitor temperatures closely and don't be in too much of a rush :-). Please ask about anything you don't understand. There are plenty of skilled people here willing to give assistance.

Once again, thanks for your contribution to the project and welcome aboard!

... I want to keep contributing to the CPU tasks but they seem to be neutering my GPU tasks.

I tried to get you a quick response in my previous message so I didn't take the time necessary to look in depth. I've now had a chance to do that and to check through your complete list of all tasks assigned to that particular host. Here are some further comments that might help you to improve your machine's performance.

* You will be able to find settings that allow you to crunch plenty of CPU tasks without them affecting GPU performance to any significant degree.

* While you are sorting things out, keep your work cache settings quite small so that you don't feel the need to abort any surplus.

* Another reason to stay small for a bit longer is that you should consider changing the app being used to crunch BRP6 tasks. You're currently using the standard cuda32 version and there is a cuda55 test version that will run ~25% faster. You need to allow test apps in your preferences on the website to achieve this. Once you allow that, your next work request for BRP6 should get the test app and tasks 'branded' as cuda55 for that app. Existing cuda32 tasks will be done by the current app.

* There seems to be a current issue (validate errors) with some BRP4G tasks. If you look at that particular type in your list and click on the 'Invalid' link at the top of the page you will see you have 3 of these errors. The data for these tasks comes from Arecibo and sometimes has RFI (radio frequency interference) in it which causes this. It's very hard to make sure the data is completely clean at all times so these issues can come up randomly. The Devs will try to find and abort the affected tasks. The top item in your list of three has already been canceled by the Devs. BRP6 doesn't have these problems.

* When you are running multiple GPU tasks, it is safer not to try running a mixture of both BRP4G and BRP6. I have no direct experience but I've seen reports from others who have.

* I expect that you will be able to end up with an optimal arrangement where you are crunching at least 2x-3x with the GPU and simultaneously running at least 6 CPU tasks if you wish to. If you stopped all CPU crunching you could keep temperature and power draw at a lower level but you would only get a small reduction in GPU crunch time. You will achieve a higher scientific output by crunching both but at the expense of power and temperature as noted.

I see the backlog of GW tasks has been cleared nicely -- well done on that! I was a bit concerned about some very long run times earlier (well over a day back on Feb 25) but that is all fixed now. The most recent ones were done quite quickly - as low as 25K secs. The FGRP4 and FGRPB1 tasks can now be done and they should go quite well - perhaps only half the time that the GW tasks were taking.

I notice that your latest cuda55 BRP6 tasks are now taking longer than they first were. I'm guessing that you may have switched to running 3x instead of 2x. If that is so, you would seem to be finishing 3 tasks in about 6,200 seconds as opposed to 2 tasks in about 4,700 seconds. This would be a nice further improvement.
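To make that comparison concrete, here is a small sketch of the throughput arithmetic, using only the approximate batch times quoted above (about 4,700 seconds for 2 concurrent tasks, about 6,200 seconds for 3). The function names are invented for this example.

```python
def seconds_per_task(batch_seconds: float, concurrent_tasks: int) -> float:
    """Effective wall-clock seconds per task when tasks run concurrently."""
    return batch_seconds / concurrent_tasks


def throughput_gain_percent(old_per_task: float, new_per_task: float) -> float:
    """Percentage increase in tasks completed per unit time."""
    return (old_per_task / new_per_task - 1.0) * 100.0


if __name__ == "__main__":
    two_x = seconds_per_task(4700, 2)    # approximate figure from the task list
    three_x = seconds_per_task(6200, 3)  # approximate figure from the task list
    print(f"2x: {two_x:.0f} s per task")
    print(f"3x: {three_x:.0f} s per task")
    print(f"Throughput gain: {throughput_gain_percent(two_x, three_x):.1f}%")
```

With those figures the effective time drops from about 2,350 seconds per task at 2x to about 2,067 seconds at 3x, which is roughly 14% more tasks per hour.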

If you are still crunching 2x and not 3x, something seems to be wrong.

I notice you still have the final GW task not yet returned. Even though it's past the deadline, it's still useful and could prevent a 3rd task being issued. Will it be done shortly?

Yes, I did change it to 3x. It appears to be a nice change! My CPU is at 75% now and everything seems good.

I just returned the last GW task on my other computer (laptop); I don't see any others listed on either comp, so I think we're good there?

My laptop has been more finicky than my PC. I've had to reduce the usage a number of times to keep the CPU temp in a healthy range. Also, the cuda55 applications kept getting to around 30% and then they'd error out. Reducing the CPU processes down to 20% and letting the GPU work at 2x seems to be going well. The BRP4G applications seem to finish without issue.

Yes, I did change it to 3x. It appears to be a nice change! My CPU is at 75% now and everything seems good.

And perhaps you've changed it again to 4x? The most recent elapsed times seem to indicate this. Once you get a few more done, it would seem to be showing a further (but perhaps smaller) improvement. I presume you mean that the number of cores BOINC is allowed to run CPU tasks on is 75%?

Quote:

I just returned the last GW task on my other computer (laptop); I don't see any others listed on either comp, so I think we're good there?

Yes, all is very good. I haven't been looking at your other machine at all. The task I did see (it showed in red - past the deadline) was your one in this workunit quorum. It was returned around 1.5h after I posted. Even though it was a deadline miss and the scheduler may send out an extra copy to a 3rd host, the reality of locality scheduling means the scheduler will often be forced to wait until a suitable host that already has the appropriate large data files comes along and requests new work. During this window of opportunity, your task being returned was able to forestall the unnecessary 3rd copy. You can see this in the form of the "Didn't need" status column entry in the above link. So, some otherwise wasted effort was saved :-).

Quote:

My laptop has been more finicky than my PC. I've had to reduce the usage a number of times to keep the CPU temp in a healthy range. Also, the cuda55 applications kept getting to around 30% and then they'd error out. Reducing the CPU processes down to 20% and letting the GPU work at 2x seems to be going well. The BRP4G applications seem to finish without issue.

Laptops are much more difficult to keep cool and you should be very careful not to run too many CPU tasks. The GPU task failures may simply be 'collateral damage' from the excess CPU heat. Don't try to run more than 2x on that GPU. How many actual CPU tasks are running: one or two? BOINC using 25% of cores would run 2 and 12% would run 1, so I'm not quite sure what 20% gives - probably two.

And perhaps you've changed it again to 4x? The most recent elapsed times seem to indicate this. Once you get a few more done, it would seem to be showing a further (but perhaps smaller) improvement. I presume you mean that the number of cores BOINC is allowed to run CPU tasks on is 75%?

Yes, I changed it to 4x. It does appear to have improved slightly again and my GPU load is now a few points higher (92-94%). Temps, times, and load all appear to be good so I probably won't try increasing it again.

Quote:

Laptops are much more difficult to keep cool and you should be very careful not to run too many CPU tasks. The GPU task failures may simply be 'collateral damage' from the excess CPU heat. Don't try to run more than 2x on that GPU. How many actual CPU tasks are running: one or two? BOINC using 25% of cores would run 2 and 12% would run 1, so I'm not quite sure what 20% gives - probably two.

20% is running 1 actual CPU task. With anything more than 1 task at a time, the CPU temps would climb into the 90s. 1 CPU task and 2 GPU tasks keep the temp solidly in the low 70s and don't seem to generate errors.

Thanks again for the help. Things seem to be trucking along quite a bit better than before =)

I use an i7-4790K and a GTX 980.
I now run 4 tasks on the GTX 980 and 6 tasks on my i7-4790K.

Yesterday I changed the setting "Use at most x% of the CPUs" to 75%.
Now a "Gamma-ray pulsar binary search #1" task runs between 5,000 and 12,000 sec faster.
A "Gravitational Wave search O1 all-sky tuning" task runs between 10,000 and 14,000 sec faster.
A "Binary Radio Pulsar Search (Arecibo, GPU)" task runs between 500 and 1,000 sec faster.

I think it's better to use only 6 cores on an 8-core CPU.
CPU usage is now at 90% and GPU usage is at 89-94%.