When maui calculates the fairshare component of a queued job's priority,
it compares the actual usage with the total usage to get a
percentage use. When a system has been lightly loaded over the
fairshare interval(s), this leaves consistent users overly
disadvantaged against any (new) users with low or zero usage. E.g., if a
cluster with a 2 day fairshare average is used by only one user for 2
days, but only to 10% of its capacity, then the fairshare calculation
attributes 100% of the cluster usage to that user (rather than 10%). When
another user arrives, they initially get a 100 vs 1 fairshare priority
boost rather than 10 vs 1. I think it is clearly unfair/undesirable to
penalize the existing user just because the system was underutilized.
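To make the arithmetic in that example concrete, here is a small C sketch. It is not maui code: both helper names are made up for illustration, and usage is assumed to be measured in processor-seconds.

```c
/* Illustration only: these helpers are hypothetical, not part of maui. */

/* Stock behaviour as described above: a credential's share is its usage
 * relative to the total usage recorded for all credentials. */
double share_of_total_usage(double cred_usage, double total_usage)
{
    return (total_usage > 0.0) ? cred_usage / total_usage * 100.0 : 0.0;
}

/* Alternative: share relative to what the cluster could have delivered
 * in one fairshare interval (procs * interval_seconds). */
double share_of_capacity(double cred_usage, double procs,
                         double interval_seconds)
{
    double capacity = procs * interval_seconds;
    return (capacity > 0.0) ? cred_usage / capacity * 100.0 : 0.0;
}
```

Assuming a 100 processor cluster and a 2 day (172800 s) interval, a lone user who consumed 1728000 processor-seconds is charged 100% by the first function and 10% by the second.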
I have developed a patch to make maui calculate fairshare based priority
on actual usage rather than usage relative to total utilization.
The usage is compared with a simple estimate of the total number of
cycles available in one fairshare period (total_procs*interval). This
will not give accurate (intermediate) results in terms of percentage
use, but does give useful relative results.
This is probably best used with a very small FSTARGET:
USERCFG[DEFAULT] FSTARGET=0.001
so that the priority contribution is usually negative (it is the
relative value which matters anyway - or rather the priority
difference). Note that a zero FSTARGET turns the component off.
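The effect of a small target can be sketched as follows. The helper below only mirrors the shape of the patched expression (target minus usage divided by procs*interval); its name and the example numbers are assumptions, and it ignores whatever weighting maui applies afterwards.

```c
/* Illustration only: hypothetical helper with the same shape as the
 * patched expression, FSTargetUsage - usage / (procs * interval). */
double fs_component(double fs_target, double cred_usage,
                    double procs, double interval_seconds)
{
    return fs_target - cred_usage / (procs * interval_seconds);
}
```

With a target of 0.001 on an assumed 100 processor cluster and a 172800 s interval, a user with 1728000 processor-seconds of usage gets 0.001 - 0.1 = -0.099, while a new user with no usage gets 0.001. The difference between the two is 0.1 whatever the target is, which is why only the relative value matters.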
It would be possible (but moderately difficult) to improve the accuracy
by taking into account the fairshare decay and the proportion of the
current fairshare interval that has elapsed. This might be important if
varying per credential fairshare targets were to be used. Otherwise, a
consistent 10% usage would give a varying fairshare priority depending
on the current time in the fairshare interval (though relative
priorities would be affected less).
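A rough sketch of that variation, under a deliberate simplification: only the current fairshare window is counted and decayed usage from earlier windows is ignored. The function name is hypothetical.

```c
/* Illustration only: hypothetical helper, current window only, no decay.
 * A credential that steadily uses steady_fraction of the cluster has
 * accumulated steady_fraction * procs * elapsed so far; dividing by
 * procs * interval cancels procs. */
double apparent_share(double steady_fraction, double elapsed_seconds,
                      double interval_seconds)
{
    return steady_fraction * elapsed_seconds / interval_seconds;
}
```

Under these assumptions, a steady 10% user looks like a 2.5% user a quarter of the way through a 172800 s interval and only reaches 10% at the end of it.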
If somebody is motivated to improve on the patch I would be interested
in having the improvements (such as a config parameter to turn it
on/off).
For reference, there was a related item on the moab list a while back.
I am unsure if the development actually produced what I wanted or not.
Well not ... at least because the development was in moab and not maui.
http://www.clusterresources.com/pipermail/moabusers/2006-May/000181.html
Cheers,
Gareth
--- maui-3.2.6p16.bak/src/moab/MPriority.c 2006-05-24 08:20:46.000000000 +1000
+++ maui-3.2.6p16/src/moab/MPriority.c 2006-08-04 16:43:44.000000000 +1000
@@ -871,7 +871,7 @@
{
SFactor[mpsFU] = FSTargetUsage -
(J->Cred.U->F.FSUsage[0] + J->Cred.U->F.FSFactor) /
- (GP->F.FSUsage[0] + GP->F.FSFactor) * 100.0;
+ (GP->CRes.Procs * GP->FSC.FSInterval);
switch(FSMode)
{
@@ -907,7 +907,7 @@
{
SFactor[mpsFG] = FSTargetUsage -
(J->Cred.G->F.FSUsage[0] + J->Cred.G->F.FSFactor) /
- (GP->F.FSUsage[0] + GP->F.FSFactor) * 100.0;
+ (GP->CRes.Procs * GP->FSC.FSInterval);
switch(FSMode)
{
@@ -943,7 +943,7 @@
{
SFactor[mpsFA] = FSTargetUsage -
(J->Cred.A->F.FSUsage[0] + J->Cred.A->F.FSFactor) /
- (GP->F.FSUsage[0] + GP->F.FSFactor) * 100.0;
+ (GP->CRes.Procs * GP->FSC.FSInterval);
}
else
{
@@ -983,7 +983,7 @@
{
SFactor[mpsFC] = FSTargetUsage -
(J->Cred.C->F.FSUsage[0] + J->Cred.C->F.FSFactor) /
- (GP->F.FSUsage[0] + GP->F.FSFactor) * 100.0;
+ (GP->CRes.Procs * GP->FSC.FSInterval);
}
else
{
@@ -1023,7 +1023,7 @@
{
SFactor[mpsFQ] = FSTargetUsage -
(J->Cred.Q->F.FSUsage[0] + J->Cred.Q->F.FSFactor) /
- (GP->F.FSUsage[0] + GP->F.FSFactor) * 100.0;
+ (GP->CRes.Procs * GP->FSC.FSInterval);
}
else
{