Is there any solution to control the dispatching rate (I mean the number of jobs dispatched per scheduling pass), mainly to avoid a bottleneck from jobs using storage services (i.e. to avoid too many jobs starting at the same time and trying to access files)?

I am not talking about controlling the number of running jobs, which is, of course, possible via complexes. I also know that users can "serialize" their jobs (via hold/release states or job dependencies).

Post by chambon
Is there any solution to control the dispatching rate (I mean the number of jobs dispatched per scheduling pass), mainly to avoid a bottleneck from jobs using storage services?

you could possibly play around with the scheduler configuration, esp. with "job_load_adjustments", "load_adjustment_decay_time", and possibly the "load_formula" (depending on your scenario):

e.g., we made the following:

  job_load_adjustments        np_load_avg=2.00
  load_adjustment_decay_time  0:3:00

resulting in slowly dispatching the jobs (each job increases the load instantly by 2, which is then decayed within 3 minutes). This works for us, as we only have jobs requiring a single CPU/slot.

You could check and adjust the scheduling behaviour with an empty queue when submitting a bunch of jobs (more than slots are available), each making 100% CPU load: e.g., if using Ganglia, you could monitor a slow increase of cluster load (jobs are dispatched slowly) when increasing load_adjustment_decay_time.
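With the settings above, the scheduler sees each host's load as the measured np_load_avg plus, for every recently dispatched job, an adjustment of 2.00 that decays away over the 3-minute window. A small sketch of that arithmetic (the linear decay shape here is my assumption for illustration; SGE's actual decay curve may differ):

```shell
#!/bin/sh
# adjusted_load MEASURED AGE1 [AGE2 ...]
# Sum the measured load plus, per job, an adjustment of 2.00 that decays
# linearly to 0 over 180 s (AGEn = seconds since job n was dispatched).
# The linear shape is an assumption for illustration only.
adjusted_load() {
    measured=$1; shift
    echo "$measured" "$@" | awk '{
        total = $1
        for (i = 2; i <= NF; i++)
            if ($i < 180) total += 2.00 * (180 - $i) / 180
        printf "%.2f\n", total
    }'
}

# Measured load 0.50, two jobs started 0 s and 90 s ago:
adjusted_load 0.50 0 90   # prints 3.50 (0.50 + 2.00 + 1.00)
```

This is why a freshly filled host briefly looks "loaded" to the scheduler and further dispatches to it are delayed until the adjustments decay.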

Post by andre_isml
you could possibly play around with the scheduler configuration, esp. with "job_load_adjustments", "load_adjustment_decay_time", and possibly

Ok, I understand what you mean. The inconvenience is that this rule will apply to ALL jobs (and not only the jobs | users | groups | projects | complexes | etc. using storage services). Moreover, in my case I have already defined a load_formula to take into account the CPU load, disk space and memory space per machine (with the help of load sensors).

What you can do: submit all jobs with an operator or system hold. Then use a cron job which checks that one and only one job is pending without a hold.

The problem is, of course, that you will influence the scheduling, as you are doing it partially on your own. You could try to have a limited number of jobs from each project/group/resource request eligible for scheduling.

-- Reuti
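The hold/release throttle above might be sketched like this (assumptions: jobs were submitted with a user hold via "qsub -h", so held pending jobs show state "hqw" and unheld ones "qw" in qstat output; the qstat/qrls calls need a live SGE cell, so they are shown commented out):

```shell
#!/bin/sh
# Sketch of the hold/release throttle; run from cron every minute.

# Count jobs pending WITHOUT a hold (state column 5 is "qw") in
# "qstat -s p"-style output read from stdin (2 header lines).
count_unheld_pending() {
    awk 'NR > 2 && $5 == "qw"' | wc -l
}

# Print the job id (column 1) of the first held pending job ("hqw").
first_held_job() {
    awk 'NR > 2 && $5 == "hqw" { print $1; exit }'
}

# On a real cluster (requires SGE on the PATH):
#   if [ "$(qstat -s p | count_unheld_pending)" -eq 0 ]; then
#       jid=$(qstat -s p | first_held_job)
#       [ -n "$jid" ] && qrls "$jid"
#   fi
```

Releasing one job per cron pass caps the dispatch rate at one job per minute, whatever the scheduler does internally.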

Post by chambon
Thank you very much for your answer.
Bernard CHAMBON
------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=302476

- Queue is always in disabled state
- Cron job checks for any pending job every minute
- If there are pending jobs, enable ***@compute-node one at a time, sleeping for some seconds before enabling another ***@compute-node
- Check the list of pending jobs while the loop above is running
- Exit from the loop when all pending jobs are cleared
- Disable the queue again.
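The steps above might look roughly like this (the queue name "all.q", the host names, and the 10-second delay are my assumptions; the qmod/qstat calls need a live SGE cell, so the loop itself is shown commented out):

```shell
#!/bin/sh
# Sketch of the trickle-enable loop described above.

# Count pending jobs in "qstat -s p"-style output (2 header lines).
pending_jobs() {
    awk 'NR > 2' | wc -l
}

# On a real cluster (requires SGE on the PATH):
#   qmod -d 'all.q'                  # queue stays disabled by default
#   while [ "$(qstat -s p | pending_jobs)" -gt 0 ]; do
#       for h in node01 node02 node03; do
#           qmod -e "all.q@$h"       # enable one queue instance
#           sleep 10                 # pause before enabling the next
#       done
#   done
#   qmod -d 'all.q'                  # disable again once drained
```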
