ADABAS Multi/Prefetch for Batch Job Performance

In an ADA742 and NAT413 Environment

Originally by Jim Poole Updated and Enhanced by Dieter W. Storr

Changes the database requests from one record per one request to multiple records per request.

Transparent to the application code, defined in the JCL.

(More intro information is at the end of this memo)

Multifetch vs. Prefetch:

The older 'Prefetch' is a subset of Multifetch.

Very similar except for minor changes in the sizing formulas.

Multifetch is the future direction of ADABAS and the default here.

For the rest of this memo, only Multifetch is covered.

Bottom Lines:

Multifetch parms have to be set for each program/situation.

Multifetch can take run times of hours down to minutes.

Multifetch can also take a 30 minute job to 12 hours.

Multifetch jobs can lock up the database, sending CICS to maxtask.

Multifetch, done well, can make dramatic improvements.

Multifetch, done wrong, can cause dramatic problems.

READ (50000) VIEW1 BY ISN with different PREFSBL.
ADABAS Calls w/o Multifetch = 50,000
ADABAS Calls with Multifetch: see figure
Number of I/Os with and w/o Multifetch are approx. the same
but can be different depends on the ADABAS buffer pool

Overall guidelines:

GOAL: MAKE BATCH RUN FASTER BUT DO NOT KILL THE REST OF THE DATABASE!

Multifetch should limit the I/O's per command to 7 to 9.
Ideally limit the number of io's to egual one data storage track.
Otherwise: Overall Database Performance is affected.

The number of concurrent multifetch jobs should be limited.
Otherwise: Overall Database Performance is affected.

Multifetch jobs should not do any updates
(ie. the records being multifetched should not be put on hold)
Otherwise: Major hold contention problems and performance issues can occur.

Multifetch parms should be carefully defined to exclude file/cmds
except the sequence desired to be multifetched.
Otherwise: The job can run significantly longer and database
performance is also impacted.

The recommended sequence for multifetch is read physical (L2).
Otherwise: Other sequences increase the IO per command and risk
longer run times if it is a restarted loop.

Setting up Multifetch:

Basic JCL:

Requires additional ADARUN cards.
These are passed to the ddname DDCARD, for example:

Determine the File/Command Exclusion List
PREFXFIL,PREFXCMD.
You cannot INCLUDE and EXCLUDE files at the same time.

Files/commands not properly excluded could cause slower results.

If NATURAL used then the commands used to access the Natural
system files are Multifetch candidates.
These should be excluded, for example

PREFXFIL=14,14,15,19,22
PREFXCMD=L3/L9/L3/L3/L3

Leave only the desired file/command for multifetching off the
exclusion list. All other files should be on the exclusion list.
To determine all the file/commands:
- Check the program for the views used and commands used.
- Or use program PREFTCH to analyze (see B.1.a above)

The exclusion list is coded in file/command format:
PREFXFIL=fnr1/fnr2/fnr3/fnr4
PREFXFIL=cmd1/cmd2/cmd3/cmd4
This indicates that the exclusions are fnr1/cmd1, fnr2/cmd2
and so on.

Determine the ideal number of records to fetch per call.

The maximum that can be specified depends on the maximum that can be held by the record
and ISN buffers, each of which is assumed to be less than 32 kilobytes.
If no value or zero (the default) is specified, the sizes of the ISN and record buffers determine the number of records that are prefetched. Specifying zero (0) does not disable multifetch operation.

PREFNREC = n records.
SETTING THIS NUMBER TOO HIGH WILL CAUSE PERFORMANCE PROBLEMS OR
POTENTIAL ISSUES FOR OTHER USERS OF THE DATABASE!

PREFNREC SHOULD BE USED TO AVOID LATER PROGRAM CHANGES IMPACTING
PERFORMANCE OF PREFETCH!

Use one of the following:

If in doubt, set to 10.

If the job is a Read Physical (L2 cmd), then use the ADAREP:
- Determine number of Records on file.
- Determine number of used DS RABNS (allocated minus unused).
- Determine number of records per RABN/block.
- Multiply this number times 7 to 9.

Use program $SAMPLE to analyze (see B.1.a above).
For the desired multifetch sequence (file/Lx cmd),
count number of Lx commands it takes for 7 to 9 IO's. Use this.

Note that for Read Physical sequences, PREFNREC can be set higher.

PREFNREC should NOT be set higher than the number of reads per
loop/RC (see notes at bottom). To determine the number of Lx's
per RC, do an extract of all commands and for the Lx CID used,
count the number of Lx's per RC command.

Let:
J = total job time (derived from Epilog or examination)
D = sum of adabas duration for the Lx commands that are
going to be multifetched.
(use SAS History or clog report. due prior to any Multifetch
settings).
F = ( J - D ) / D

If F is less than 1, than any multi/prefetching is going to have
a performance impact on the database. Use the guidelines elsewhere
is this doc on limiting and controlling the jobs.
If F is greater than 1, then PREFNREC can be set towards F, (still
using the 7 to 9 IO rule) without significantly impacting the
database performance. As PREFNREC approaches and/or passes F,
there will be a performance impact on the database.

Determine the size of the fetch buffer, PREFSBL.

PREFSBL must specify at least twice the size of the largest
record that might be prefetched.

PREFSBL = 2* ( ARB * PREFNREC * 4/3 )

ARB = The Actual Record Buffer Length.
The sum of the lengths the fields returned per 1 record.
Use one of the following:
a. Add up the lengths of the fields returned in the program.
b. Or see program PREFTCH to analyze (see B.1.a above),
and use value of REC/LEN/DECMP.

IF AND ONLY IF PREFNREC IS BEING USED (AND IT SHOULD BE), THEN
PREFSBL CAN BE SIZED LARGER THAN THIS FORMULA.

Determine Size of the Total Fetch Buffer, PREFTBL.

If only multifetching one sequential process then PREFTBL=PREFSBL.

Otherwise, PREFTBL= x * PREFSBL
where x = the number of sequential process to multifetch.

General tips.

Go simple, multifetch one seqence, exclude the rest.

Read physical (L2 command is the best candidate).
PREFNREC is easily determine by 3 * number-of-recs-per-block.
PREFSBL is the sum of the field lengths in the program times
PREFNREC * 8/3.

Read logical (L3 command) needs to be monitored. Be careful on
the amount of IO (depends on how distributed the records are
by the descriptor value).

Read by ISN (L1 command) can be good but again depends on
the amount of IO and how distributed the ISN's are amongst
data storage.

By default in ADABAS a separate call (or request or command) is made
for each ADABAS record. If the program needs to read in a 1000 records,
then a 1000 ADABAS calls are issued against the database. Each commit (ET)
or rollback (BT) is a separate call. Each update of a record is a separate
call. Each release cursor (RC or Release Command-ID or end of Natural Loop)
is a separate call. All ADABAS SQL and Natural generate these ADABAS calls.

In a batch job, each call is passed from the application program to a common routine called ADALNK. ADALNK then passes the call to the database address space via cross-memory services. That call is posted on the database command-queue (with all other user requests from other batch jobs as well as CICS and Com-plete users). Much like a mini-operating system, ADABAS selects commands/calls off the command queue and place them into a thread for execution. Selection from the command queue to the thread is basically made from dispatching priority and the age of the request.

The ADABAS threads are a multi-threaded environment where ADABAS processes the command till complete or an IO is required or some other system resource. The time a command spends in a thread is a function of it's CPU and I/O needs. When finished, the command is posted back to the user and returned via cross-memory services.

Thus, the total path of a command is:
Program => ADALNK => Command Queue => Thread

Multifetch/Prefetch.

Multifetch is a possibility for all sequential processing by ADABAS. A
command-id (cursor-id) is required. These sequential processes/candidates
are:

Read Physical (L2 calls), THE BEST CANDIDATE!

Read Logical, Read by descriptor/key (L3).

Histogram (L9).

Read by ISN, 'I' or 'N' option (L1).

In the batch job, multifetch is toggled by a 'ADARUN' parm passed via the DDCARD ddname in the JCL. When the ADABAS call is passed, a check is first made to see if 'fetch' is turned on. If it is, control is passed to a ADAMLF module (with Prefetch the module is ADAPRF).
This is the actual multifetch processing. The path length is now:

Program => ADAMLF => ADALNK => Command Queue => Thread

ADAMLF checks to see if the command is a multifetch candidate. If not, it is directly passed on to ADALNK and processed as described above. If it is a multifetch candidate, ADAMLF checks to see if the file/command is on the exclusion list from Multifetch. If not a buffer is set aside. This buffer size is determined by the ADARUN parm PREFSBL.

If there are multiple parallel sequential processes, then each unique process (determined by the command-id) gets a separate PREFSBL. The limit to the number of parallel processes is determined by ADARUN parm PREFTBL. All PREFSBL's are contained in the PREFTBL. For example, if PREFTBL=20K and PREFSBL=10k, then 2 processes can be 'multifetched'.

IT IS CRITICAL THAT THE MULTIFETCH PROCESS IS LIMITED TO THE ONE OR TWO
PROCESSES THAT NEED MULTIFETCHING!

ADAMLF cuts the PREFSBL size in half, issues a call to the database (with the multifetch option turned on). That command is cross-memory passed to the database command-queue, then selected into the ADABAS thread for processing. ADABAS then fills the buffer with x records and sends the request back. While the user program is processing the records in the buffer, an
asynchronous call is made to ADABAS to fill the other half of the buffer. And so it goes
on, in a 'flip/flop' fashion.

No changes are required to the program, they are simply calling ADAMLF and then ADALNK.

How many records are passed back on each call is a function of the ADARUN parm PREFNREC and the PREFSBL size.

If the PREFNREC or PREFSBL are sized to big, then other users are made to wait and queue up. This will result in bad performance for the rest of the database, eventual max-tasking of CICS and involve SLA outages!

Multifetching for the particular sequential process (eg. the L2 or L3 call identified by the unique command-id (cursor-id)) proceeds until the program reaches:

If the PREFNREC or PREFSBL is too large, then an excessive number of records are read into the buffer, then released when the inner-loop issues the RC command. This will cause batch run times much longer than the original non-multifetch job. Only one of the sequential processes should be multifetched. The other file and command should be on the exclusion list.

A final note: calls to the Natural system files are NOT multifetch candidates. They should always be on the exclusion list or not on the inclusion list.