Monday, June 22, 2009

Following on from my Batch process runner script (http://paddy3118.blogspot.com/2009/03/batch-process-runner-in-bash-shell.html), I got thinking that a nice capability would be a maximum run time limit for the jobs.

My first thought was to create an at command (http://unixhelp.ed.ac.uk/CGI/man-cgi?at), one for each job, set to execute the required time after the start of the job, that would kill the job.

Nah.

Discussions with a friend led to consideration of using ulimit (http://docs.sun.com/app/docs/doc/816-5165/ulimit-1?l=en&a=view&q=ulimit) in the script to limit the run time, but unfortunately on the Unix box I was testing on, ulimit could only limit the maximum CPU time, not the run time. If the job was stuck in a non-busy wait then it would not be auto-killed.
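To make the CPU time versus run time distinction concrete, here is a minimal sketch, assuming bash on a Unix-like system where `ulimit -t` sets the CPU-seconds limit. The two function names are made up for this illustration and are not from the original post.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical.

# Sleeps for 2 s of wall-clock time under a 1 s CPU limit.
# Sleeping uses almost no CPU, so the limit is never reached.
sleeper_under_cpu_limit() {
    ( ulimit -t 1; sleep 2; echo "sleeper finished" )
}

# Spins in a busy loop under the same 1 s CPU limit.
# The subshell is killed by a signal once it has used about 1 s of CPU.
spinner_under_cpu_limit() {
    ( ulimit -t 1; while :; do :; done )
}
```

Calling `sleeper_under_cpu_limit` should run to completion even though it takes longer than the limit, while `spinner_under_cpu_limit` should be killed with a signal-style exit status (above 128; the exact signal varies by system). That is the problem described above: a job that is waiting rather than computing never trips the limit.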

I decided to try creating a script that runs a time-consuming processing task and also creates a background kill task, so that the script would be self-killing.

The script

Line 1: The KILLAFTER environment variable is set so that the script will be killed after 3 seconds.

Line 5: Store the process ID of the 'bash -c' script for killing later.

Line 8: This is the sleeping background sub-process that wakes after KILLAFTER seconds and kills the script.

Line 9: Store the sub-process PID so that, if the normal script commands finish first, it can itself be cleanly killed.

Line 12: Some dummy command that counts up every second.

Line 13: And save its exit status so it can be restored after removing the background kill process.

Line 16: Terminate the background kill process so we are not left waiting.

Line 18: Exit with the saved exit status.

Line 19: Prints the exit status of the whole 'bash -c' script.

Lines 20-24 show the output of the first run. Notice how, although the for loop in line 12 is set to count to 5, the count gets killed after printing 3, i.e. the three seconds of $KILLAFTER.

Lines 27 onwards show a second run where KILLAFTER is set to more time than is used by the for loop. Notice how the script prints the full count up to 5, and has a return status of zero.
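The steps in the notes above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original listing, so the line numbers in the notes do not map onto it. The function name `self_killing_demo` and the `COUNT` parameter are made up here so that both runs can be tried from one definition.

```shell
#!/usr/bin/env bash
# Sketch only, not the original listing. self_killing_demo is a hypothetical name.
# Usage: self_killing_demo KILLAFTER COUNT

self_killing_demo() {
    KILLAFTER=$1 COUNT=$2 bash -c '
        mainpid=$$                       # PID of this bash -c script

        # Watchdog: sleep in the background, then kill the script.
        # Its output goes to /dev/null so it never holds a caller pipe open.
        ( sleep "$KILLAFTER"; kill "$mainpid" ) >/dev/null 2>&1 &
        watchdog=$!                      # remember the watchdog PID

        # Dummy job: count up once a second.
        for ((i = 1; i <= COUNT; i++)); do echo "$i"; sleep 1; done
        status=$?                        # save the job exit status

        kill "$watchdog" 2>/dev/null     # job finished first: stop the watchdog
        exit "$status"                   # hand back the job exit status
    '
}
```

Running `self_killing_demo 3 5; echo $?` should cut the count short and report 143 (128 plus SIGTERM) in bash, while `self_killing_demo 10 5; echo $?` should print the full count and report 0. One caveat with this sketch: killing the watchdog subshell does not kill the `sleep` it started, so that `sleep` may linger harmlessly until its time is up.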