----------------------------------------------------------------------
   H   H  PPP    CCC          UK NATIONAL HPC SERVICE
   H   H  P  P  C   C         --------------------------
   HHHHH  PPP   C      x  x   provided by
   H   H  P     C   C   xx    EPCC and
   H   H  P      CCC   x  x   CCLRC Daresbury Laboratory
----------------------------------------------------------------------
HPCx User Mailing 028                                 28 November 2003
----------------------------------------------------------------------
Contents
** December meetings - Final Call!!
** HPCx course: Optimisation techniques for the Power4 processor
** HPCx seminar: Towards capability computing
** HPCx Users' group meeting
** 14th Daresbury Machine Evaluation Workshop
** jtmp: Job-temporary scratch space
** Memory allocation under LoadLeveler
** Busy machine
----------------------------------------------------------------------
Greetings--
----------------------------------------------------------------------
DECEMBER MEETINGS - FINAL CALL FOR REGISTRATIONS
A reminder of the meetings which are taking place at Daresbury
Laboratory over 9 - 12 December. They're described in the sections
below.
This is a final call for registrations to these events. We must close
them at lunchtime on Monday, 1 December 2003.
REGISTER NOW!!
Together these events will make up a really important opportunity for
contact between users, vendors and providers of high-performance
computing in the UK. You are most cordially invited to attend, and
HPCx and DL staff look forward to meeting you at Daresbury.
How to get to the Daresbury Laboratory:
http://www.cclrc.ac.uk/Activity/ACTIVITY=DLMaps;
Hotels in the area:
http://www.cse.clrc.ac.uk/disco/mew14/hotels.html
----------------------------------------------------------------------
HPCx COURSE: OPTIMISATION TECHNIQUES FOR THE POWER4 PROCESSOR
Tuesday, 9 December.
This course will focus on tools and techniques for single processor
optimisation on HPCx. The course will cover the architecture of the
processors and memory system, profiling and hardware counter tools,
getting the best from the compilers and Power4-specific tips and
tricks. There will be hands-on sessions in addition to lecture
material.
Registration form:
http://www.hpcx.ac.uk/support/training/form.html
----------------------------------------------------------------------
HPCx SEMINAR: TOWARDS CAPABILITY COMPUTING
Wednesday, 10 December
One of our key challenges is to ensure that the full capability of the
HPCx service is used to the limit. We need to enable applications in
all disciplines to scale effectively right up to the full size of the
system. Speakers will include experienced users and HPCx and IBM
staff.
This will be the first in an annual series of seminars.
http://www.hpcx.ac.uk/about/events/annual2003/
----------------------------------------------------------------------
HPCx USERS' GROUP MEETING
Wednesday, 10 December
This is a chance for users to bring their concerns and problems
directly to the senior management of HPCx, who will be present. We
hope that this will also include issues that have arisen during the
Seminar. Our intention is that in many cases it will be possible to
take decisions on the spot to respond to these.
----------------------------------------------------------------------
14TH DARESBURY MACHINE EVALUATION WORKSHOP
Thursday-Friday, 11-12 December
This well-established and widely-respected annual event aims to
encourage close contact between the research communities and vendors
of distributed high-performance scientific computing. About a dozen
vendors are expected to make presentations. Systems will be available
for benchmarking (starting, we hope, on Monday, 8 December), and
there will be an exhibition. Proceedings will be published and made
available to those registered to attend.
http://www.cse.clrc.ac.uk/disco/mew14.shtml
----------------------------------------------------------------------
jtmp: JOB-TEMPORARY SCRATCH SPACE
This is a way for users to have access to a large (4.5Tb) shared
filespace. Space there is created automatically when your LoadLeveler
job starts, and released automatically as soon as it ends.
You can get the pathname of a temporary directory in this space by
doing this in your job:

    JTMPDIR=`lljtmp`

You can then get there by doing this:

    cd $JTMPDIR

There are no quotas in the jtmp space, so access is unlimited in
principle. In practice, however, you share the space with whoever else
is using it at the time, and there are no guarantees about how much
will be free.
You can do anything you like there. However, the directory and
everything in it are completely wiped at the end of the job and cannot
be retrieved. Anything you want to keep must be copied away before
the job ends.
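
As an illustration only, the steps above could be wrapped in a small
shell function like the one below. Only lljtmp comes from the service;
the function name run_in_jtmp, the results directory and the program
path are placeholders invented for this sketch.

```shell
#!/bin/sh
# Sketch only: run_in_jtmp, $HOME/results and $HOME/bin/my_program are
# hypothetical names, not part of the HPCx setup.

# run_in_jtmp KEEP_DIR COMMAND [ARGS...]
# Run COMMAND inside this job's jtmp directory, then copy everything
# left there into KEEP_DIR before the job ends.
run_in_jtmp() {
    keep_dir=$1
    shift
    JTMPDIR=`lljtmp` || return 1          # pathname of the temporary directory
    [ -d "$JTMPDIR" ] || return 1
    mkdir -p "$keep_dir" || return 1
    status=0
    # The subshell keeps the cd from changing the caller's directory.
    ( cd "$JTMPDIR" && "$@" ) || status=$?
    # Copy even if the command failed: jtmp is wiped when the job ends.
    cp -R "$JTMPDIR"/. "$keep_dir"/ || return 1
    return $status
}

# Possible call from a job script (placeholder names):
#   run_in_jtmp "$HOME/results" "$HOME/bin/my_program"
```

Give the program as a full pathname, because it is started from inside
the jtmp directory rather than from the directory the job began in.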
----------------------------------------------------------------------
MEMORY ALLOCATION UNDER LOADLEVELER
Recently we have had some problems with allocation of memory for jobs
under LoadLeveler, which have caused difficulties for some user
groups.
To cope with these, we have implemented some changes to the software
which scans jobs as they are submitted to LoadLeveler, known as the
'submission filter'. These changes allow users more control over the
use of memory by the system.
In future, the total amount of real memory which can be occupied by a
process will be:
7.2 Gb / tasks_per_node
This is called the RSS - the resident set size. (A node is an LPAR.)
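
To make the formula concrete, here is a small sketch that works out
the RSS for a given number of tasks per node. The function name rss_gb
is made up for this example, and awk is used only because the shell
cannot do decimal arithmetic itself.

```shell
#!/bin/sh
# rss_gb TASKS_PER_NODE
# Print the per-process RSS in Gb, i.e. 7.2 Gb / tasks_per_node.
# (rss_gb is a hypothetical helper, not an HPCx command.)
rss_gb() {
    awk -v tasks="$1" 'BEGIN { printf "%.2f\n", 7.2 / tasks }'
}

rss_gb 8    # 8 tasks per node
rss_gb 4    # 4 tasks per node
```

With 8 tasks per node this gives 0.90 Gb per process; with 4 tasks per
node it gives 1.80 Gb.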
The RSS is divided into two areas: stack and data.
In Fortran terms, stack is used for:
  - subprogram calling information
  - local variables, including arrays, unless they are
    marked SAVE
Data is used for:
  - program code
  - static variables, including COMMON variables and
    variables marked SAVE
  - memory allocated by ALLOCATE - known as 'heap' variables
  - buffers for use by MPI and Fortran IO
A program that runs out of either stack or data space will fail. For
stack space, you will usually get a message like this:
ERROR: 0031-250 task 1: Segmentation fault
(Unfortunately, there are other things which can also cause this.) If
you run out of data space, the message will usually be like this:
1525-108 Error encountered while attempting to allocate a data
object. The program will stop.
With the new filter, you can specify the amount of memory to be
allocated to these two areas, using the keywords stack_limit and
data_limit. For example:
#@ stack_limit = 400mb
will set the stack to 400 Mb. The following rules apply:
  * If stack_limit is not specified, it defaults to 200 Mb.
    (This has been the fixed size until now.)
  * If data_limit is not specified, it defaults to RSS - stack_limit.
  * If stack_limit + data_limit > RSS, the job will fail.
This means that if you specify neither of these, LoadLeveler will
behave as it has up to now.
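
The rules above can be tried out with a rough model like the one
below. It is a sketch of the arithmetic, not the submission filter
itself. The function name check_limits is invented, 1 Gb is assumed to
be 1024 Mb (the rules do not say), and rejecting a stack_limit that is
larger than the whole RSS is also an assumption.

```shell
#!/bin/sh
# check_limits TASKS_PER_NODE [STACK_MB] [DATA_MB]
# Hypothetical helper: model the stack_limit/data_limit rules and
# report the limits a process would get, or fail if they do not fit.
check_limits() {
    awk -v tasks="$1" -v stack="${2:-200}" -v data="${3:-}" 'BEGIN {
        rss = 7.2 * 1024 / tasks             # per-process RSS in Mb (1 Gb = 1024 Mb assumed)
        if (data == "") data = rss - stack   # default data_limit
        # data < 0 means the stack alone exceeds the RSS (our assumption: reject)
        if (data < 0 || stack + data > rss) {
            printf "fail: stack %g Mb + data %g Mb exceeds RSS %g Mb\n", stack, data, rss
            exit 1
        }
        printf "ok: stack %g Mb, data %g Mb, RSS %g Mb\n", stack, data, rss
    }'
}

check_limits 8        # neither keyword given: 200 Mb stack, rest is data
check_limits 8 400    # stack_limit = 400mb, data_limit defaulted
```

Under these assumptions, asking for a 400 Mb stack and 600 Mb of data
with 8 tasks per node (check_limits 8 400 600) is rejected, because
1000 Mb is more than the 921.6 Mb available to each process.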
Notice that the system will no longer allow a process to have more
virtual memory than the RSS. This means that the system should not
normally swap. If you need more memory per process than is allowed by
these rules, you will need to reduce the number of tasks you have per
node.
You might think that we are forcing users to waste AUs, by obliging
them to decrease the number of tasks per node instead of swapping in
this case. But swapping a multiprocessor application has serious
performance implications; your application is likely to waste at
least as much time simply waiting for the swaps.
----------------------------------------------------------------------
BUSY MACHINE
HPCx is very busy at the moment. We appreciate that this can be
frustrating. Here are a couple of points.
  * Submitting large numbers of jobs to LoadLeveler does no harm.
    However, it doesn't help you to get to the top of the queue.
    It's not the case that if you have lots of jobs waiting, your
    jobs are more likely to run. In fact, when it's planning its
    job mix, LoadLeveler only looks at the four oldest jobs you
    have submitted.
  * Please don't use the interactive-parallel region to run jobs
    which don't need to be interactive. This can seriously affect
    people who really need to run interactively.
----------------------------------------------------------------------
Regards
--John
----------------------------------------------------------------------
Earlier mailings: http://www.hpcx.ac.uk/support/notices/index.html
To be removed from the mailing list: log into your website account,
go to the "Update" page, and click the "Opt out of user emails"
field; then click "Commit update".
--
John Fisher j.fisher@epcc.ed.ac.uk
HPCx User Administration and Helpdesk
HPCx: http://www.hpcx.ac.uk Helpdesk: support@hpcx.ac.uk
Phone: +44 131 650 5029 Fax: +44 131 650 6555