We would like to test our new multicore CPU application for quantum chemistry tasks ("QC"). Since it’s the first time we have a CPU app out, I’ll test the behavior of GPUGRID with a relatively large batch, which you will see soon. Workunits are named "*QC309big*".

Here are some features of the app, in brief (subject to change):

* Platform: Linux only for now, generic x64.
* Threads: as many as BOINC decides. I guess it depends on your machine, your preferences, and other running tasks, in ways which are obscure to me…
* Run time: about 1 CPU hour per WU (so shorter wall-clock time when multithreading).
* Credit: computed with the default algorithm (tasks are short, so don’t expect much). The bonus mechanism for fast turnaround is still on.
* Known bugs: checkpoints and restarts do not work, so a suspended task restarts from scratch. This should be mitigated by the “keep in memory when suspended” option. Sorry about that; it’s outside of our control.
* Network behavior: the first time you get a WU of this kind it downloads a Python interpreter (miniconda) and then some open-source packages, and installs them in the project directory. The installation is reused whenever possible.
* Disk usage: could reach around 1 GB, perhaps more while tasks are running. Resetting the project should remove everything.
* Memory usage: should be around 1 GB when running.

Depending on the results of this test, we’ll start thinking about other platforms.

The client does not receive WUs, although there are almost a thousand of them and the client meets the requirements (Linux x64). Earlier, this host was able to receive test tasks for QC and Python.

Dear all, all three errors mention a missing "readlink" executable. That is surprising, because it's a fairly basic command, but please check whether you can run "readlink" in a terminal. If it is not installed, it should be in the "coreutils" package.
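A quick way to run that check from a terminal (the install commands are the usual ones for Debian/Ubuntu and Fedora; adjust for your distro):

```shell
# Check that "readlink" (GNU coreutils) is present and where it lives.
if command -v readlink >/dev/null 2>&1; then
    echo "readlink found at: $(command -v readlink)"
    readlink -f /bin/sh   # resolves the symlink chain to the real binary
else
    echo "readlink missing - install the coreutils package, e.g.:"
    echo "  sudo apt install coreutils   # Debian/Ubuntu"
    echo "  sudo dnf install coreutils   # Fedora"
fi
```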

I'll add /bin to the path in the next app update. That may work, unless there is some weird sandboxing thing going on. You shouldn't need to tweak your system: just let them fail (they should fail fast, so no CPU loss).

As to why some hosts are not receiving WUs, it baffles me. It's not a matter of hosts already having GPUs, because my own machine does and it did not get tasks. It may be related to the "reliable hosts" classification.

Two of my computers have received tasks and processed them with no trouble.
Both run Fedora (16 and 21), host ids are 192138 and 189186.
My 8 core (16 thread) computer (running Fedora 25) has yet to receive a task.

Host 192138 is a 6-core computer and host 189186 is a 4-core computer.

The 6-core has shorter run times per task and more CPU time than the 4-core.

This is as expected given the core counts; however, the 4-core computer gets higher credit per task than the 6-core, which does not make sense.

The 6-core gets around 1,500 sec run time, 8,600 sec CPU time, and about 66 credits.

The 4-core gets around 3,200 sec run time, 6,900 sec CPU time, and about 85+ credits.
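Plugging the approximate figures above into a quick sanity check makes the oddity concrete: per unit of CPU work, the slower 4-core host is credited noticeably more.

```shell
# Credit per CPU-hour from the approximate figures quoted above.
awk 'BEGIN {
    printf "6-core: %.1f credits per CPU-hour\n", 66 / (8600 / 3600)
    printf "4-core: %.1f credits per CPU-hour\n", 85 / (6900 / 3600)
}'
```

That works out to roughly 27.6 vs. 44.3 credits per CPU-hour, which is the kind of spread a benchmark-driven credit formula can produce.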

Credit assignment logic has historically been problematic (see here), to the point that I am inclined to think it has no best solution. For the time being, the credit algorithm is the old default one from BOINC. I think it relies heavily on the self-reported benchmark FLOPS, and yes, that seems paradoxical.

Can anybody comment on the suspend/resume behavior under a variety of conditions (ie. with and without "keep in memory")? I expect the calculation to restart from scratch, but not crash.

When I suspended a task with LAIM ("leave applications in memory") on, BOINC Manager showed that it was suspended, but the system monitor showed the task still busy on all the threads allocated to it.

When I suspended a task with LAIM off, BOINC Manager showed the task as suspended, and the task disappeared from the system monitor. When resumed, it restarted from 0 and appears to be running normally.

I just wanted to report back:
My host ID 420971 gets work and finishes the latest version with success!
My host ID 452211 does not get any work. The message is: "There is no work available". This host does not have any GPU and runs from a USB stick.

Working/not-working pairs are indeed useful for debugging (if they have the same preferences, that is). It was suggested that the presence of a GPU was the factor, but there are GPU-less counter-examples, like this one. The scheduler is a software nightmare...

I'll resume tests later this week. In the meantime, there are 1000 more CPU WUs (QC310big).

Today is my lucky day. I just enabled the multicore app, and immediately picked up two of them on my i7-3770 machine running Ubuntu 16.04.3 (Linux 4.10.0.38), and BOINC 7.8.3. They run on 7 cores, with one core reserved for GPU support as set by BOINC preferences, not in the app_config (though I use one for other purposes).

However, suspending them does not shut them down with LAIM enabled, as noted before. I have not tried the non-LAIM case.

If it matters, this machine was attached to GPUGrid earlier, and I had run a few GPU work units on the GTX 980, though I am requesting only the CPU work now. But maybe that has something to do with why I am getting them.

EDIT: Also, I have "Run test applications?" enabled, though I don't know if that is necessary in this case.

On a 1950X it reserves all 32 threads but does not run them near the maximum.
It seems to be switching which cores are active; my System Monitor CPU usage chart looks like a long line of infinity symbols.

If you divide the CPU time by the run time, you'll see an average usage of about seventeen cores. Everything else is going to waste.
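The arithmetic behind that estimate is just total CPU seconds over elapsed wall-clock seconds; the figures below are hypothetical, chosen only to illustrate the ratio on a 32-thread run.

```shell
# Average busy cores = total CPU seconds / elapsed wall-clock seconds.
# Hypothetical figures for a 32-thread run, for illustration only.
cpu_time=25500
run_time=1500
awk -v c="$cpu_time" -v r="$run_time" \
    'BEGIN { printf "average busy cores: %.1f of 32\n", c / r }'
```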

I'm getting a ton of quantum chemistry tasks on my AWS EC2 p2.xlarge instance.
a47-toni_qc310k-0-1-* are the names of the tasks. Are these the new multicore tasks you talked about? The machine takes a task to 66% in 2 seconds and then sits at that percentage for ~10 minutes.

I think the task stops reporting progress at 66%? A bug? I compiled the BOINC client on the EC2 instance myself, so it could definitely be user error as well.

I'm using Ubuntu's bundled system monitor to display the CPU usage graphs. That 66% thing is just a bug in the work-unit time estimation, but my cores really were gradually rising and falling from 0 to 100%. Like a helix on its side, but with 32 lines.

(It's not thermal throttling.)

If at all possible, consider limiting each multicore task to four cores: almost every modern CPU's thread count divides evenly by four, so we can ensure the highest throughput, with no thread going to waste.

Since I had 100% errors (Message 48156 - Posted: 12 Nov 2017 | 2:36:31 UTC) on my first batch of these CPU tasks, I created a symlink as instructed, then deleted the symlink as subsequently instructed, but I have never received a single task since my 12 Nov 2017 post.

Will the app name stay "*QC309big*", or will it change for the real stuff? Then we might make an app_config file; or better still, might you propose an app_config file to limit CPU cores per work unit to X cores?
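In the meantime, for anyone who wants to experiment, the usual BOINC pattern looks something like the sketch below. The app name ("QC") and the --nthreads flag are guesses on my part; check client_state.xml for the real app name, and whether the app honours a thread-count option at all.

```xml
<!-- app_config.xml in the project directory (sketch; names are guesses) -->
<app_config>
  <app_version>
    <app_name>QC</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>          <!-- cores BOINC budgets per task -->
    <cmdline>--nthreads 4</cmdline>   <!-- only if the app accepts it -->
  </app_version>
</app_config>
```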

@PappaLitto: I'm quite happy with a Linux-only app!

It is time to make some Linux USB sticks (16 GB USB 3.0, 10 USD): I run Lubuntu 17.10 on various computers I do not otherwise use, and it works great! Or try the BOINCOS v2.0 beta release, where everything is pre-configured.

Making a windows app will probably need one of the following two solutions. Neither is perfect (by far).

* The "Windows Subsystem for Linux" from Microsoft. It's unfortunately W10 only (as far as I can tell), and probably we'd be the first BOINC project to use it (=headaches).
* A VirtualBox app. Its downsides are known I think.

By the way, question for the gurus: when you run a vbox app, is virtualbox automatically installed on your system?

No. The user has to install it themselves, and usually some VBox extensions are recommended as well.

There are two ways of installing VBox for Windows:

1) Via a combined single-click installer for both VBox and BOINC, available from BOINC. The simplicity is attractive, but there are downsides - there is no control over e.g. installation location, and the version of VBox included is usually several steps behind the current release.

2) Direct from the Oracle VBox site. BOINC will still recognise this - there's no special BOINC code in the combined VBox installer.

Any VBox extensions desired will always have to be downloaded from Oracle. There may be other adjustments required to the host computer, such as enabling virtualisation in the BIOS, which might be unfamiliar to the casual user.
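As a quick pre-check on Linux, the CPU flags tell you whether hardware virtualisation is exposed at all; a count of zero usually means VT-x/AMD-V is disabled in the BIOS/UEFI. A minimal sketch:

```shell
# Count logical CPUs advertising hardware virtualisation support.
# vmx = Intel VT-x, svm = AMD-V; 0 usually means it's off in the BIOS/UEFI.
count=$(grep -Ec 'vmx|svm' /proc/cpuinfo || true)
if [ "$count" -gt 0 ]; then
    echo "virtualisation flags present on $count logical CPUs"
else
    echo "no vmx/svm flags: enable virtualisation in the BIOS/UEFI"
fi
```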

Even when you get the Windows app going, it looks like you'll still be short on crunchers by more than half: the server status page currently shows 821 users crunching long WUs in the last 24 hours, while 34 crunch quantum chemistry (821/34 = 24.15).

So, in order to meet 50x, you will eventually have to create a multi CPU-GPU app.

Quantum chemistry has a long way to go.

In the meantime, you can't make the Windows app too difficult for crunchers to set up; most of us are not computer gurus, and otherwise you will end up with only a few more crunchers.

I am all in favor of GPUs, but as noted on many project forums, they don't work for most problems. But don't write off Linux on the CPU yet; it is just in the startup phase. I have even taken my machines off until the production version is released. Once the word gets around (be sure to post a note on the BOINC forum), you will get lots of help. And CPUs are getting more cores all the time.

This is a big undertaking. Good luck guys!!

Oh, there will be many more if there is consistent work. The inconsistency of GPU work pushes many away. Compare the support for POGS vs. Duchamp: similar projects, but Duchamp requires VBox.

I would say don't mess with a VirtualBox application if it would replace the Linux application: too many headaches. If someone is running Windows, they could easily set up their own VirtualBox VM and run the standard app under Linux inside it. Win-win for everyone that way. It also gives the user more control over the VM. Just my thoughts on it. Also, more and more people are migrating to Windows 10, and it is the direction all new machines are taking, so you might as well prepare for the future.

My guess is that they could leave the Linux application as it is and just add a VirtualBox application for Windows. I have had no particular problems with VBox on either Windows or Linux machines recently; I run LHC, Cosmology and sometimes others on it. I would prefer that they set it up so that I don't have to configure my own machine. All you really have to do first is ensure that running a virtual machine is enabled in your BIOS. A good primer is on the Cosmology site: http://www.cosmologyathome.org/faq.php#vtx

There is actually more to it than that. I have run every VM project as well, and have done so with a whole slew of different hardware and software setups. There is a big reason why those projects do not get much support. LHC only has a large user base now because it merged the projects with the original SixTrack project. Even still, the VirtualBox applications remain less popular. Keep in mind also that these projects have problems with new releases of VirtualBox, as they do right now; if I make my own VM using the latest release, it does not suffer the same. The only advantage of a VBox application is that it lets the scientists compile a single application more easily. That may sound great to them, but the amount of time lost on the end-user side far exceeds their savings.

Also, for reference if it helps: GPUGrid attempted VBox applications back in 2014, with discussion starting in 2013: http://www.gpugrid.net/forum_thread.php?id=3542#33874

I'm actually arguing for keeping the Linux version rather than replacing it. Telling me not to bother because you like them isn't acceptable to me. You make it sound like they run great because you had little trouble with them, but you can scour their forums over and over and find that the average user does not agree. You are right that LHC would not be in its present form; it would still be SixTrack running traditional work, with the others doing it in-house or eventually adapting things differently. Cosmology would just be down one application as well; I don't see how that is relevant. Either way, my vote is to not embrace VirtualBox if it means pulling non-VirtualBox work.

I agree that VBox projects/apps get much less support; that's supported by data. It's even more evident during competitions, when people who do not already have VBox set up to run with BOINC just end up running the non-VBox apps.

LHC may not exist without VBox. Maybe they wanted to keep their stuff a secret, or whatever. They could definitely get more support if the rest of the apps were not VBox.

Running multiple tasks at once is the way we intend to go for QC (consistent with your preferences, of course). The idea is to limit the number of cores per task to 4 and let the BOINC client manage the available capacity.

I wasn't at home to see them run. I just happened to notice I had two tasks on my account page. They definitely used more than 4 cores; it looks like a little more than 8 threads.

Run time (s)   CPU time (s)   Credit
3,745.85       30,947.64      138.80
3,501.03       30,948.79      129.71

They completed an hour apart. Another client was running some other CPU work, so the run time could have been better.