Launching Rose commands

Many of the Rose tools can be launched from the various GUIs. For example, you can run or edit suites from rosie go, run suites from rose edit, and view log files from rose suite-gcontrol whilst the suite is running.

When running Rose from the command line, either run from the suite's directory under roses/ or specify the suite name with --name, e.g.

rose suite-shutdown --name=puma-aa015

Be careful, though, because rose suite-run --name=puma-aa015 works differently: it runs the suite in the current directory but names it puma-aa015.
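One way to avoid running the wrong suite is to check the current directory before calling rose suite-run. The sketch below is only an illustration: suite_dir_matches is a hypothetical helper, and it assumes that the working copy lives in a directory named after the suite and contains a rose-suite.conf file.

```shell
# Hypothetical helper (not part of Rose): succeed only if the current
# directory is named after the suite and contains a rose-suite.conf.
# Argument: $1 = suite name, e.g. puma-aa015
suite_dir_matches() {
    [ "$(basename "$PWD")" = "$1" ] && [ -f rose-suite.conf ]
}

# Example use: only run the suite when the check passes.
#   suite_dir_matches puma-aa015 && rose suite-run
```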

Stop archiving of log files

By default, when a suite is run, the log files from the previous run are tarred up. To avoid this, run rose suite-run with the flag --no-log-archive.

Copying suites between repositories

Run rosie copy, specifying the repository for the new suite with the --prefix flag. For example, to copy a suite (puma-aa125) from the puma repository to the MOSRS u repository you would run:

rosie copy --prefix=u puma-aa125

Setting up rose host-select archer

Some suites use the command rose host-select to choose a machine to submit the suite to. This can be used to select the least-loaded server, but for ARCHER we use it to cope with times when some of the ARCHER login nodes are down.

To get rose host-select archer to work there is some setup required. (Note: You can just replace the rose host-select line in your suite.rc or site/archer.rc file with the name of the host, but you won't get the benefits.)

i) If your PUMA and ARCHER usernames are the same, skip to step ii). Otherwise you will need to configure your SSH settings so that SSH knows your ARCHER username. Open the file ~/.ssh/config and add the following lines, replacing <archer-username> with your username:

Host login*.archer.ac.uk
User <archer-username>

To check this is working correctly, try to log in to ARCHER without specifying your username:

ssh login.archer.ac.uk

ii) Next, run the following script, which logs into each of the hosts to add them to your ~/.ssh/known_hosts file. (Otherwise rose host-select will not be able to connect.)

~um/um-training/setup-archer-hosts

Note that if the script can't connect to one of the hosts, for example because it is down, rose host-select won't be able to access it. This shouldn't matter too much if it is just one host, but you can add the host at a later time by re-running the script.

To check this has worked correctly, run the command: rose host-select archer and it should return an active host.
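The two manual edits and checks above could be scripted roughly as follows. This is a minimal sketch, not an official setup tool: the function names add_archer_user and check_archer_host are made up for illustration, and the only Rose command used is rose host-select archer as described above.

```shell
# Append the ARCHER username block to an SSH config file unless a
# matching Host line is already present.
# Arguments: $1 = ARCHER username, $2 = config file (default ~/.ssh/config)
add_archer_user() {
    config="${2:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")"
    touch "$config"
    if grep -qF 'Host login*.archer.ac.uk' "$config"; then
        echo "ARCHER entry already present in $config"
    else
        printf '\nHost login*.archer.ac.uk\nUser %s\n' "$1" >> "$config"
    fi
}

# Report whether rose host-select currently returns an ARCHER host.
check_archer_host() {
    if host=$(rose host-select archer) && [ -n "$host" ]; then
        echo "Selected host: $host"
    else
        echo "rose host-select archer did not return a host" >&2
        return 1
    fi
}
```

Calling add_archer_user twice with the same file leaves a single entry, so it is safe to re-run.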

Mail notifications

You will need to set your email address in your cylc configuration file. Open or create a new file ~/.cylc/global.rc and add the following lines, using your own email address in place of dummy-email:

[task events]
mail to = dummy-email
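If you prefer to script this edit, something like the following would do. It is a sketch only: set_cylc_mail is a hypothetical helper name, and it deliberately refuses to touch a file that already has a [task events] section rather than trying to merge settings.

```shell
# Hypothetical helper: add a [task events] mail setting to a cylc global.rc.
# Arguments: $1 = email address, $2 = file (default ~/.cylc/global.rc)
set_cylc_mail() {
    rc="${2:-$HOME/.cylc/global.rc}"
    mkdir -p "$(dirname "$rc")"
    touch "$rc"
    # Do not duplicate the section if it is already there.
    if grep -qF '[task events]' "$rc"; then
        echo "$rc already has a [task events] section; edit it by hand" >&2
        return 1
    fi
    printf '[task events]\nmail to = %s\n' "$1" >> "$rc"
}
```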

Then add event notifications to your suite's suite.rc file, for example:

[[[events]]]
mail events = succeeded, failed

These can go under [runtime] -> [[root]] or a specific task definition. For a full set of notifications see the documentation pointed to above.

Important: Rose notifications will not work on PUMA and are no longer recommended for use. Rose notifications have the form rose suite-hook, and any instances should be removed from the suite.
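To find any remaining Rose notifications in a suite, a simple search is usually enough. The sketch below assumes the hooks live in files ending in .rc (such as suite.rc or site/archer.rc); find_suite_hooks is a hypothetical helper name.

```shell
# List every line mentioning "rose suite-hook" in the .rc files of a suite.
# Argument: $1 = suite directory (default: current directory)
# /dev/null is passed as an extra file so that grep always prints filenames.
find_suite_hooks() {
    find "${1:-.}" -name '*.rc' -exec grep -n 'rose suite-hook' /dev/null {} +
}
```

Each match is printed as file:line:text, which makes it easy to open the file at the right place and remove the entry.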

Merging in changes from another suite

You may have taken a copy of a suite, but there have been subsequent changes that you wish to include. FCM won't allow you to merge in changes from another suite, but you can do it with a direct svn command. You will need to know the full svn URL for the suite containing the changes and the revision number (use -c) or range (use -r), for example:

Check which suites you have running

You can then click on each of the suites to open the usual cylc suite control GUI.

Troubleshooting common errors

Rosie go asks for "username for u"

By default rosie is set up to load suites from the local puma repository and the Met Office Science Repository Service (MOSRS).
If your MOSRS password isn't cached, Rosie will prompt for it at startup. Clicking 'cancel' then produces an error:

The authenticity of host 'exvmscylc.monsoon-metoffice.co.uk (10.168.64.4)' can't be established.
RSA key fingerprint is 98:c8:5e:b9:b3:d2:2f:c4:9c:89:78:08:d6:78:70:3a.
Are you sure you want to continue connecting (yes/no)?

Type yes.

Now from exvmscylc, log in to exvmsrose using the full path:

ssh exvmsrose.monsoon-metoffice.co.uk

And again type yes at the prompt.

Type exit to get back to the Rose VM, then ssh into exvmsrose again; this should succeed without any interactive prompts.

Now type exit twice to get back to the original Rose terminal, and try re-submitting the Rose suite.

Unable to submit jobs; can't find cylc (MONSooN)

rose suite-run on exvmsrose fails because it is unable to find cylc:

exvmsrose$ rose suite-run
...
[FAIL] WARNING:
[FAIL] This computer is provided for the processing of official information.
[FAIL] Unauthorised access described in Met Office SyOps may constitute a criminal offence.
[FAIL] All activity on the system is liable to monitoring.
[FAIL] bash: line 11: cylc: command not found

No gcylc window

Sometimes the GUI is slow to load. If it does not appear at all, however, check that you have X11 forwarding set up both from your initial location and from the lander.

To do so, ssh with the -Y option or, alternatively, append the following lines to your ~/.ssh/config file:

Host *
ForwardX11 yes
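A quick way to see whether forwarding has reached the machine you are on is to look at the DISPLAY variable, which ssh normally sets when X11 forwarding is active. The helper below is a minimal sketch with a made-up name (check_x11); a set DISPLAY is a good sign but not a guarantee that the GUI will open.

```shell
# Succeed if DISPLAY is set in the current shell; otherwise explain why not.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to $DISPLAY"
    else
        echo "DISPLAY is not set; log in again with ssh -Y" >&2
        return 1
    fi
}
```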

Problems shutting down suites

Types of shutdown

By default, when you try to shut down a suite, cylc will wait for any currently running tasks to finish before stopping, which may not be what you want. You can also tell cylc to kill any active processes, or to ignore running processes and force the suite to shut down anyway. The latter is what you will need to do if the suite has got stuck:

rose suite-shutdown -- --now

To access these options in the cylc GUI, go to "Control" → "Stop Suite".
See also rose help suite-shutdown for further details.
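The three kinds of shutdown can be collected in a small wrapper, sketched below. The name stop_suite is hypothetical. The --now form is the one shown above; the --kill option passed through to cylc is an assumption here, so check rose help suite-shutdown on your system before relying on it.

```shell
# Hypothetical wrapper: stop a suite in one of three ways.
#   wait - let running tasks finish first (the default behaviour)
#   kill - kill active tasks, then stop (assumes cylc accepts --kill)
#   now  - stop straight away, ignoring running tasks
# Arguments: $1 = suite name, $2 = wait|kill|now
stop_suite() {
    case "$2" in
        wait) rose suite-shutdown --name="$1" ;;
        kill) rose suite-shutdown --name="$1" -- --kill ;;
        now)  rose suite-shutdown --name="$1" -- --now ;;
        *)    echo "usage: stop_suite SUITE wait|kill|now" >&2
              return 2 ;;
    esac
}
```

For example, stop_suite puma-aa015 now runs the same command as the stuck-suite case above, with the suite named explicitly.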

Forcing shutdown

Sometimes, after trying to shut down a suite, it will still appear to be running.

First make sure you have used the correct shutdown command and aren't waiting for any unfinished tasks (see above). It can take cylc a little while to shut down everything properly, so be patient and give it a few minutes.

If it still appears to be running (for example you get an error when you try to re-start the suite), you may have to do the following:

Manually kill the active processes:
Get a list of processes associated with the suite. For example, for suite u-ak194 I would run:

Unable to access STASHmaster from branch on ARCHER

Some suites may reference files held in the repository for use at runtime. The most common example of this is the STASHmaster file. To make a change to the STASHmaster file requires editing the file in a branch and setting the path to this in the suite. However the method described in the instructions below will not work on ARCHER:

This is because the job tries to access the repository from the ARCHER queues, which will not work. Note that this does work on the XCS machines, so a suite you are porting may contain something like this.

The solution is to make the suite extract the file on PUMA and then copy it over to ARCHER with the other suite files.