I have a job for which checkjob shows the following
(I've reproduced only what seem to be the relevant lines of output):
SubmitTime: Thu May 7 10:59:49
(Time Queued Total: 3:24:50 Eligible: 00:00:16)
......
Holds: Defer
Messages: cannot start job - RM failure, rc: 15031, msg: 'Premature end of message'
PE: 1.00 StartPriority: 10
cannot select job 4669711 for partition DEFAULT (job hold active)
And in my maui.cfg file I have the following:
DEFERTIME 00:05:00
Why is this job still queued more than 3 hours later?
Shouldn't the deferred hold have been released within 5 minutes of
the first failure? Or is there something I'm not understanding about
deferred holds?
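
To put numbers on the mismatch, here is a minimal Python sketch that compares the total queued time reported by checkjob with the DEFERTIME from maui.cfg. It is not part of Maui or Torque; the helper name parse_duration is made up for illustration, and it assumes both values are in [HH:]MM:SS form as shown above.

```python
from datetime import timedelta


def parse_duration(text):
    """Parse an [HH:]MM:SS string such as '3:24:50' into a timedelta.

    Hypothetical helper for this illustration only; not a Maui or Torque API.
    """
    parts = [int(p) for p in text.strip().split(":")]
    # Pad missing leading fields (e.g. 'MM:SS') with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts[-3:]
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Values copied from the checkjob output and maui.cfg quoted above.
queued_total = parse_duration("3:24:50")
defer_time = parse_duration("00:05:00")

# Number of complete DEFERTIME intervals that fit in the queued time.
elapsed_intervals = queued_total // defer_time

print("queued longer than DEFERTIME:", queued_total > defer_time)
print("complete DEFERTIME intervals elapsed:", elapsed_intervals)
```

By this arithmetic the job has been queued for dozens of DEFERTIME intervals, which is why I expected the hold to be gone by now.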
(The RM failure itself has been cleared by restarting Maui. The failure
is caused by a longstanding bug in the communication between Torque and
Maui that we're currently trying to track down and fix.)
Michael Durket