New credit system design

Introduction

We can estimate the peak FLOPS of a given processor.
For CPUs, this is the Whetstone benchmark score.
For GPUs, it's given by a manufacturer-supplied formula.

Applications access memory,
and the speed of a host's memory system is not reflected
in its Whetstone score.
So a given job might take the same amount of CPU time
on a 1 GFLOPS host (call it host A) as on a 10 GFLOPS host (host B).
The "efficiency" of an application running on a given host
is the ratio of actual FLOPS to peak FLOPS.

GPUs typically have a much higher (50-100X) peak speed than CPUs.
However, application efficiency is typically lower
(very roughly, 10% for GPUs, 50% for CPUs).

The first credit system

In the first iteration of the credit system, "claimed credit" was defined as

C1 = H.whetstone * J.cpu_time

There were then various schemes for taking the
average or min of the claimed credit of the
replicas of a job, and using that as the "granted credit".

We call this system "Peak-FLOPS-based" because
it's based on the CPU's peak performance.

The problem with this system is that, for a given app version,
efficiency can vary widely.
In the above example,
host B would claim 10X as much credit,
and its owner would be upset when it was granted
only a tenth of that.

Furthermore, the credit granted to a given host for a
series of identical jobs could vary widely,
depending on the host it was paired with by replication.

So host neutrality was achieved,
but in a way that seemed arbitrary and unfair to users.

The second credit system

To address the problems with host neutrality,
we switched to the philosophy that
credit should be proportional to the number of FLOPs actually performed
by the application.
We added API calls to let applications report this.
We call this approach "Actual-FLOPs-based".

SETI@home had an application that allowed counting of FLOPs,
and they adopted this system.
They added a scaling factor so that the average credit
was about the same as in the first credit system.

Not all projects could count FLOPs, however.
So SETI@home published their average credit per CPU second,
and other projects continued to use benchmark-based credit,
but multiplied it by a scaling factor to match SETI@home's average.

This system had several problems:

It didn't address GPUs.

Projects that couldn't count FLOPs still had the host neutrality problem.

It didn't address single replication.

Goals of the new (third) credit system

Device neutrality: similar jobs should get similar credit
regardless of what processor or GPU they run on.

Limited project neutrality: different projects should grant
about the same amount of credit per CPU hour,
averaged over hosts.
Projects with GPU apps should grant credit in proportion
to the efficiency of the apps.
(This means that projects with efficient GPU apps will
grant more credit on average. That's OK.)

Peak FLOP Count (PFC)

This system uses the Peak-FLOPS-based approach,
but addresses its problems in a new way.

When a job is issued to a host, the scheduler specifies usage(J,D),
J's usage of processing resource D:
how many CPUs, and how many GPUs (possibly fractional).

If the job is finished in elapsed time T,
we define peak_flop_count(J), or PFC(J) as

PFC(J) = T * sum over devices D of (usage(J, D) * peak_flop_rate(D))
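As a concrete sketch of this computation in C++ (the struct and function
names are illustrative, not the actual BOINC server code):

    #include <vector>

    struct DeviceUsage {
        double usage;           // usage(J,D): instances of D used (may be fractional)
        double peak_flop_rate;  // peak_flop_rate(D), in FLOPS
    };

    // PFC(J) = T * sum over devices D of (usage(J,D) * peak_flop_rate(D))
    double peak_flop_count(double elapsed_time, const std::vector<DeviceUsage>& devices) {
        double peak_rate = 0;
        for (const auto& d : devices) {
            peak_rate += d.usage * d.peak_flop_rate;
        }
        return elapsed_time * peak_rate;
    }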

Notes:

We use elapsed time instead of actual device time (e.g., CPU time).
If a job uses a resource inefficiently
(e.g., a CPU job that does lots of disk I/O)
PFC() won't reflect this. That's OK.

usage(J,D) may not be accurate; e.g., a GPU job may take
more or less CPU than the scheduler thinks it will.
Eventually we may switch to a scheme where the client
dynamically determines the CPU usage.
For now, though, we'll just use the scheduler's estimate.

The idea of the system is that granted credit for a job J
is proportional to PFC(J),
but is normalized in the following ways:

Version normalization

If a given application has multiple versions (e.g., CPU and GPU versions),
the average granted credit is the same for each version.
The adjustment is always downwards:
we maintain the average PFC*(V) of PFC() for each app version,
find the minimum X,
then scale each app version's jobs by (X/PFC*(V)).
The result is called NPFC(J).
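A sketch of this scaling, under the same illustrative conventions as above
(app versions keyed by a hypothetical integer ID):

    #include <algorithm>
    #include <map>

    // avg_pfc[v] holds PFC*(v), the recent average of PFC() for app version v.
    // NPFC(J) = PFC(J) * (X / PFC*(V)), where X is the minimum PFC*(v),
    // so the adjustment is always downward. Assumes avg_pfc is non-empty.
    double normalized_pfc(double pfc, int version, const std::map<int, double>& avg_pfc) {
        double x = avg_pfc.begin()->second;
        for (const auto& entry : avg_pfc) {
            x = std::min(x, entry.second);
        }
        return pfc * (x / avg_pfc.at(version));
    }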

Notes:

This mechanism provides device neutrality.

This addresses the common situation
where an app's GPU version is much less efficient than the CPU version
(i.e. the ratio of actual FLOPs to peak FLOPs is much less).
To a certain extent, this mechanism shifts the system
towards the "Actual FLOPs" philosophy,
since credit is granted based on the most efficient app version.
It's not exactly "Actual FLOPs", since the most efficient
version may not be 100% efficient.

Averages are computed as a moving average,
so that the system will respond quickly as job sizes change
or new app versions are deployed.
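One way to maintain such a recent average is an exponential moving average;
the smoothing factor below is an assumption, not a value fixed by the design:

    // Exponentially weighted moving average: recent jobs dominate, so the
    // average responds as job sizes change or new app versions are deployed.
    // alpha is an assumed smoothing factor.
    void update_moving_avg(double& avg, double sample, double alpha = 0.01) {
        avg += alpha * (sample - avg);
    }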

Project normalization

If an application has both CPU and GPU versions,
then the version normalization mechanism uses the CPU
version as a "sanity check" to limit the credit granted for GPU jobs.

Suppose a project has an app with only a GPU version,
so there's no CPU version to act as a sanity check.
If we grant credit based only on GPU peak speed,
the project will grant much more credit per GPU hour than
other projects, violating limited project neutrality.

The solution to this is: if an app has only GPU versions,
then we scale its granted credit by a factor,
obtained from a central BOINC server,
which is based on the average scaling factor
for that GPU type among projects that
do have both CPU and GPU versions.

Notes:

Projects will run a periodic script to update the scaling factors.

Rather than GPU type, we'll actually use plan class,
since e.g. the average efficiency of CUDA 2.3 apps may be different
from that of CUDA 2.1 apps.

Initially we'll obtain scaling factors from large projects
that have both GPU and CPU apps (e.g., SETI@home).
Eventually we'll use an average (weighted by work done) over multiple projects.
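Putting this together, granting credit for a GPU-only app might look like the
following sketch; the table and function names are assumptions, not the real API:

    #include <map>
    #include <string>

    // Hypothetical per-plan-class scaling factors, refreshed by a periodic
    // project script from a central BOINC server.
    std::map<std::string, double> scale_by_plan_class;

    // For an app with only GPU versions, scale its PFC by the cross-project
    // average scaling factor for its plan class.
    double gpu_only_npfc(double pfc, const std::string& plan_class) {
        return pfc * scale_by_plan_class.at(plan_class);
    }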

Host normalization

For a given application, all hosts should get the same average granted credit per job.
To ensure this, for each application A we maintain the average NPFC*(A),
and for each host H we maintain NPFC*(H, A).
The "claimed credit" for a given job J is then

NPFC(J) * (NPFC*(A)/NPFC*(H, A))
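As a minimal sketch (parameter names are illustrative):

    // Claimed credit for job J on host H running app A:
    //   NPFC(J) * (NPFC*(A) / NPFC*(H, A))
    // A host that inflates its claims raises its own NPFC*(H,A),
    // which scales its later claims back down.
    double claimed_credit(double npfc_j, double avg_npfc_app, double avg_npfc_host_app) {
        return npfc_j * (avg_npfc_app / avg_npfc_host_app);
    }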

Notes:

NPFC* is averaged over jobs, not hosts.

Both averages are recent averages, so that they respond to
changes in job sizes and app version characteristics.

This assumes that all hosts are sent the same distribution of jobs.
There are two situations where this is not the case:
a) job-size matching, and b) GPUGrid.net's scheme for sending
some (presumably larger) jobs to GPUs with more processors.
To deal with this, we'll weight the average by workunit.rsc_fpops_est.
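For example, the averages might track credit per estimated FLOP rather than
per job; a sketch, with assumed names:

    // Weight the running average by the workunit's estimated size, so a host
    // that is systematically sent larger jobs isn't treated as over-claiming.
    // alpha is an assumed smoothing factor.
    void update_weighted_avg(double& avg, double npfc, double fpops_est,
                             double alpha = 0.01) {
        avg += alpha * (npfc / fpops_est - avg);  // credit per estimated FLOP
    }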

Replication and cheating

Host normalization mostly eliminates the incentive to cheat
by claiming excessive credit
(i.e., by falsifying benchmark scores or elapsed time).
An exaggerated claim will increase NPFC*(H,A),
causing subsequent claimed credit to be scaled down proportionately.
This means that no special cheat-prevention scheme
is needed for single replications;
granted credit = claimed credit.

For jobs that are replicated, granted credit is
set to the min of the valid results
(min is used instead of average to remove the incentive
for cherry-picking, see below).

However, there are still some possible forms of cheating.

One-time cheats (like claiming 1e304) can be prevented by
capping NPFC(J) at some multiple (say, 10) of NPFC*(A).
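A sketch of this cap (the factor 10 is the example multiple from above;
names are illustrative):

    #include <algorithm>

    // Cap a job's NPFC at a multiple of the app's recent average,
    // defeating one-time claims like 1e304.
    double capped_npfc(double npfc_j, double avg_npfc_app, double max_ratio = 10.0) {
        return std::min(npfc_j, max_ratio * avg_npfc_app);
    }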

Cherry-picking: suppose an application has two types of jobs,
which run for 1 second and 1 hour respectively.

Clients can figure out which is which, e.g. by running a job for 2 seconds
and seeing if it's exited.
Suppose a client systematically refuses the 1 hour jobs
(e.g., by reporting a crash or never reporting them).
Its NPFC*(H, A) will quickly decrease,
and soon it will be getting several thousand times more credit
per unit of actual work than other hosts!
Countermeasure:
whenever a job errors out, times out, or fails to validate,
set the host's error rate back to the initial default,
and set its NPFC*(H, A) to NPFC*(A) for all apps A.
This puts the host in a state where several dozen of its
subsequent jobs will be replicated.
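A sketch of this countermeasure, with hypothetical structures (not the actual
server code):

    #include <map>

    struct HostAppStats {
        double error_rate;  // drives how often this host's jobs are replicated
        double avg_npfc;    // NPFC*(H, A)
    };

    // When a job from this host errors out, times out, or fails validation,
    // reset its stats so its next several dozen jobs are replicated.
    void punish_host(std::map<int, HostAppStats>& host_stats_by_app,
                     const std::map<int, double>& avg_npfc_by_app,
                     double initial_error_rate) {
        for (auto& entry : host_stats_by_app) {
            entry.second.error_rate = initial_error_rate;
            entry.second.avg_npfc = avg_npfc_by_app.at(entry.first);  // NPFC*(H,A) := NPFC*(A)
        }
    }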
