GC3Apps provides a script to drive the execution of multiple gnfs-cmd jobs,
each of them with a different parameter set. Altogether they form a
single crypto simulation of a large parameter space.
The script uses the generic gc3libs.cmdline.SessionBasedScript framework.

The purpose of gcrypto is to execute several concurrent
runs of gnfs-cmd on a parameter set. These runs are performed in
parallel using every available GC3Pie resource; you can of
course control how many runs should be executed and select what output
files you want from each one.

Like in a for-loop, the gcrypto driver script takes as input
three mandatory arguments:

RANGE_START: initial value of the range (e.g., 800000000)

RANGE_END: final value of the range (e.g., 1200000000)

SLICE: extent of the range that will be examined by a single job (e.g., 1000)

For example:

# gcrypto 800000000 1200000000 1000

will produce 400000 jobs; the first job will perform calculations
on the range 800000000 to 800000000+1000, the 2nd one will do the
range 800001000 to 800002000, and so on.
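
The arithmetic behind this example can be checked with a short Python
sketch. The helper name slice_ranges is hypothetical and is not part of
gcrypto; the sketch assumes that RANGE_END is exclusive and clips the last
slice if the range does not divide evenly.

```python
def slice_ranges(range_start, range_end, slice_size):
    """Yield one (start, end) pair per job, covering [range_start, range_end).

    Hypothetical helper for illustration only; not gcrypto's actual code.
    """
    for start in range(range_start, range_end, slice_size):
        # Clip the last slice in case the range does not divide evenly.
        yield start, min(start + slice_size, range_end)


ranges = list(slice_ranges(800000000, 1200000000, 1000))
print(len(ranges))  # number of jobs
print(ranges[0])    # range handled by the first job
print(ranges[1])    # range handled by the second job
```

For the values used above, this gives 400000 slices, the first two being
(800000000, 800001000) and (800001000, 800002000).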

The location of the input file archive (e.g. lfc://lfc.smscg.ch/crypto/lacal/input.tgz)
can be specified with the ‘-i’ option. Otherwise, a file with the default
name ‘input.tgz’ is searched for in the current directory.

Job progress is monitored and, when a job is done, its
output is retrieved to the submitting host into a folder named:
RANGE_START+(SLICE*ACTUAL_STEP)
where ACTUAL_STEP corresponds to the position of the job in the
overall execution.
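
As an illustration of the naming rule, the following Python sketch computes
the folder names for the first few jobs. The function name
output_folder_name is hypothetical, and the sketch assumes that job
positions are counted from 0; the text does not state the starting index.

```python
def output_folder_name(range_start, slice_size, actual_step):
    """Return the folder name for the job at position `actual_step`.

    Assumption (not confirmed by the text): positions start at 0.
    """
    return str(range_start + slice_size * actual_step)


names = [output_folder_name(800000000, 1000, step) for step in range(3)]
print(names)
```

Under that assumption, the first three folders are named 800000000,
800001000 and 800002000.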

The gcrypto command keeps a record of jobs (submitted, executed and
pending) in a session file (set name with the ‘-s’ option); at each
invocation of the command, the status of all recorded jobs is updated,
output from finished jobs is collected, and a summary table of all
known jobs is printed. New jobs are added to the session if new input
files are added to the command line.

Options can specify a maximum number of jobs that should be in
‘SUBMITTED’ or ‘RUNNING’ state; gcrypto will delay submission
of newly-created jobs so that this limit is never exceeded.
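
The throttling rule can be pictured with a minimal Python sketch. It is not
gcrypto's implementation: the function submission_budget is hypothetical,
and the state labels other than SUBMITTED and RUNNING are placeholders
chosen for the example.

```python
from collections import Counter


def submission_budget(job_states, max_running):
    """Return how many more jobs may be submitted without exceeding the limit.

    `job_states` is a list of state labels, one per known job. Only jobs in
    SUBMITTED or RUNNING state count against `max_running`.
    """
    counts = Counter(job_states)
    in_flight = counts["SUBMITTED"] + counts["RUNNING"]
    return max(0, max_running - in_flight)


# Placeholder session: 180 jobs in flight, 40 not yet submitted, 10 finished.
states = (["RUNNING"] * 150 + ["SUBMITTED"] * 30
          + ["NEW"] * 40 + ["TERMINATED"] * 10)
print(submission_budget(states, 200))
```

With a limit of 200 and 180 jobs already in flight, only 20 of the 40
waiting jobs would be submitted in this cycle; the rest are delayed.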

The gcrypto script executes several runs of gnfs-cmd on a
parameter set and collects the generated output. These runs are
performed in parallel, up to a limit that can be configured with the
-J command-line option. You can of course control how
many runs should be executed and select what output files you want
from each one.

In more detail, gcrypto does the following:

1. Reads the session (specified on the command line with the
   --session option) and loads all stored jobs into memory.
   If the session directory does not exist, one will be created with
   empty contents.

2. Divides the initial parameter range, given on the command line,
   into chunks, taking the -J value as a reference. So,
   from a command line like the following:

   $ gcrypto 800000000 1200000000 1000 -J 200

   gcrypto will generate an initial chunk of 200 jobs,
   starting from the initial value 800000000 and incrementing by 1000.
   Each job will run gnfs-cmd on a specific parameter set
   (e.g. 800000000, 800001000, 800002000, ...). gcrypto will keep
   the number of simultaneously running jobs constant, retrieving
   those that have terminated and submitting new ones until the whole
   parameter range has been computed.

3. Updates the state of all existing jobs, collects output from
   finished jobs, and submits new jobs generated in step 2.

4. Finally, a summary table of all known jobs is printed. (To control
   the amount of printed information, see the -l command-line
   option in the Introduction to session-based scripts section.)

5. If the -C command-line option was given (see below), waits
   the specified number of seconds, and then goes back to step 3.

The program gcrypto exits when all jobs have run to
completion, i.e., when the whole parameter range has been
computed.
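
The cycle described above (collect finished jobs, submit new ones up to the
-J limit, repeat until the range is exhausted) can be modelled with a toy
Python simulation. Everything in it is an assumption made for illustration:
the function simulate is hypothetical, jobs "finish" at a fixed rate per
cycle, and no real job is submitted anywhere.

```python
from collections import deque


def simulate(range_start, range_end, slice_size, max_running, finished_per_cycle):
    """Toy model of the update/submit cycle.

    Returns (cycles, peak, done): the number of cycles needed, the highest
    number of jobs in flight at once, and the start values of finished jobs.
    """
    if max_running < 1 or finished_per_cycle < 1:
        raise ValueError("max_running and finished_per_cycle must be >= 1")
    pending = deque(range(range_start, range_end, slice_size))  # one entry per job
    running = deque()
    done = []
    cycles = 0
    peak = 0
    while pending or running:
        # Pretend some running jobs have terminated and collect their output.
        for _ in range(min(finished_per_cycle, len(running))):
            done.append(running.popleft())
        # Submit new jobs until the limit on jobs in flight is reached.
        while pending and len(running) < max_running:
            running.append(pending.popleft())
        peak = max(peak, len(running))
        cycles += 1
    return cycles, peak, done


# Ten slices of 1000, at most 3 jobs in flight, 2 jobs finishing per cycle.
cycles, peak, done = simulate(800000000, 800010000, 1000,
                              max_running=3, finished_per_cycle=2)
print(cycles, peak, len(done))
```

In this model the number of jobs in flight never exceeds max_running, and
the loop ends only when every slice of the range has been processed, which
mirrors the exit condition stated above.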

Execution can be interrupted at any time by pressing Ctrl+C.
If the execution has been interrupted, it can be resumed at a later
stage by calling gcrypto with exactly the same
command-line options.

gcrypto requires a number of default input files common to every
submitted job. This list of input files is automatically fetched by
gcrypto from a default storage repository.
Those files are:

When gcrypto has to be executed with a different set of input
files, the additional command-line argument --input-files can be
used to specify the location of a tar.gz archive containing the
input files that gnfs-cmd will expect. Similarly, when a different
version of the gnfs-cmd command needs to be used, the command-line
argument --gnfs-cmd can be used to specify the location of the
gnfs-cmd to be used.
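
As a rough sketch, a wrapper script could assemble such a command line in
Python as shown here. The function build_gcrypto_argv and both file paths
are hypothetical placeholders; only the option names --input-files and
--gnfs-cmd come from the text above, and the sketch merely builds the
argument list without running gcrypto.

```python
import shlex


def build_gcrypto_argv(range_start, range_end, slice_size,
                       input_files=None, gnfs_cmd=None):
    """Build an argument list for gcrypto (hypothetical helper)."""
    argv = ["gcrypto", str(range_start), str(range_end), str(slice_size)]
    if input_files is not None:
        # Location of the tar.gz archive with the input files.
        argv += ["--input-files", input_files]
    if gnfs_cmd is not None:
        # Location of the alternative gnfs-cmd command.
        argv += ["--gnfs-cmd", gnfs_cmd]
    return argv


argv = build_gcrypto_argv(
    800000000, 1200000000, 1000,
    input_files="/path/to/my-input.tar.gz",  # placeholder path
    gnfs_cmd="/path/to/gnfs-cmd",            # placeholder path
)
print(shlex.join(argv))  # shlex.join requires Python 3.8 or later
```

Omitting both keyword arguments leaves only the three mandatory arguments,
in which case gcrypto falls back to its defaults as described above.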

In this example, job information is stored into session
SAMPLE_SESSION (see the documentation of the --session option
in Introduction to session-based scripts). The command above creates the jobs,
submits them, and finally prints the following status report:

Note that the status report counts the number of jobs in the
session, not the total number of jobs that would correspond to the
whole parameter range. (Feel free to report this as a bug.)

Calling gcrypto over and over again will result in the same jobs
being monitored;

The -C option tells gcrypto to continue running until
all jobs have finished running and the output files have been
correctly retrieved. On successful completion, the command given in
example 2 above would print:

Each job will be named after the parameter range it has computed (e.g.
800001000, 800002000, ...); you can see this by passing
the -l option to gcrypto. Each of these jobs will
create an output directory named after the job.

For each job, the set of output files is automatically retrieved and
placed in the locations described below.

In typical operation, one calls gcrypto with the -C
option and lets it manage a set of jobs until completion.

So, to compute a whole parameter range from 800000000 to 1200000000
with an increment of 1000, submitting 200 jobs simultaneously, each of
them requesting 4 computing cores, 8GB of memory and 4 hours of
wall-clock time, one can use the following
command-line invocation: