On Fri, Oct 13, 2006 at 04:52:23PM -0400, Neelesh Arora alleged:
> Garrick Staples wrote:
> >On Thu, Oct 12, 2006 at 06:58:09PM -0400, Neelesh Arora alleged:
> >>- There are several jobs in the queue that are in the Q state. When I do
> >>checkjob <jobid>, I get (among other things):
> >>"job can run in partition DEFAULT (63 procs available. 1 procs required)"
> >>but the job remains in the Q state forever. So it is not a case of an
> >>unmet resource requirement (as the message above indicates).
> >
> >That means a reservation is set preventing the jobs from running.
> >
> >>- restarting torque and maui did not help either
> >
> >Look at the reservations preventing the job from running.
> >
> If I do showres, I get the expected reservations for the running jobs.
> By expected, I mean the number/name of nodes assigned to each job are as
> reported by qstat/checkjob. There is only one reservation for an idle job:
> ReservationID  Type  S  Start     End       Duration  N/P   StartTime
> 88655          Job   I  INFINITY  INFINITY  INFINITY  5/10  Mon Nov 12 15:52:32
>
> and,
>
> # showres -n | grep 88655
> node015  Job  88655  Idle  2  INFINITY  INFINITE  Mon Nov 12 15:52:32
> node014  Job  88655  Idle  2  INFINITY  INFINITE  Mon Nov 12 15:52:32
> node010  Job  88655  Idle  2  INFINITY  INFINITE  Mon Nov 12 15:52:32
> node003  Job  88655  Idle  2  INFINITY  INFINITE  Mon Nov 12 15:52:32
> node002  Job  88655  Idle  2  INFINITY  INFINITE  Mon Nov 12 15:52:32
> So, this probably means that no other job can start on these nodes. That
> still leaves 60+ nodes that have no reservations on them. Is there
> something else I am missing here?
You might need to increase RESERVATIONDEPTH; I have mine set to 500.
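For reference, the parameter goes in maui.cfg. I am going from memory of
the Maui docs here, so treat the index and value as illustrative:

```
# maui.cfg -- let Maui create more than the default number of priority
# reservations per scheduling iteration.  The [0] policy index and the
# value 500 are examples; check your Maui documentation for your setup.
RESERVATIONDEPTH[0]  500
```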
> >> An update:
> >> I notice that when these jobs are stuck, one way to get them started is
> >> to set a walltime (using qalter) less than the default walltime. We set
> >> a default_walltime of 9999:00:00 at the server level and require the
> >> users to specify the needed cpu-time.
> >>
> >> This was set a long time ago and has not been causing any issues. But it
> >> seems now that if you have set this default and then a user submits a
> >> job with an explicit -l walltime=<time> specification, then that job
> >> runs while older jobs with default walltime wait.
> >>
> >> Can someone please shed some light on this? I am out of clues here.
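For concreteness, the commands for that workaround look roughly like this
(the job id is the one from this thread; the times are made up):

```
# Lower a queued job's walltime below the 9999:00:00 server default:
qalter -l walltime=48:00:00 88655

# Or have the user request a walltime explicitly at submission:
qsub -l walltime=48:00:00 job.sh
```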
> >
> > Walltime is really important to maui. Smaller walltimes allow jobs to
> > run within backfill windows. If everyone has infinite walltimes, you
> > basically reduce yourself to a simple FIFO scheduler and might as well
> > just use pbs_sched.
> Well, we set default_walltime so high because maui does not care about
> the specified cpu-time (and we want to do job allocation based on
> cpu-time). Maui would take wall-time = cpu-time and kill the job if
> wall-time was exceeded, even if cpu-time was not. Refer to our previous
> discussion on this maui bug at:
>http://www.clusterresources.com/pipermail/torqueusers/2006-June/003729.html
It is a valid workaround, which has obviously served you well for a few
months, but the *optimum* case is to have correct walltimes so that you
can take advantage of backfill.
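To make the backfill point concrete, here is a toy sketch of the check a
backfill scheduler applies before starting a job early. This is my own
illustration under simplified assumptions, not Maui's actual implementation:

```python
# Toy sketch (not Maui's code) of why walltime matters for backfill.

def can_backfill(job_walltime, now, reservation_start):
    """A job may start now only if its requested walltime guarantees
    it finishes before the reserved job needs the resources."""
    return now + job_walltime <= reservation_start

# Suppose a big job holds a reservation starting 10 hours from now.
now, reservation_start = 0, 10

# A job with a realistic walltime fits in the 10-hour gap:
print(can_backfill(8, now, reservation_start))     # -> True
# A job carrying the 9999-hour default walltime never fits:
print(can_backfill(9999, now, reservation_start))  # -> False
```

With everyone on the 9999:00:00 default, every queued job fails this test,
so no job can ever be backfilled ahead of a reservation.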
Maybe give pbs_sched a try. Not that I really recommend it, but it won't try
to be smart about things.