PM75406: GLOBAL ONLINE CHANGE COMMIT FAILS LEAVING SYSTEMS HUNG IF ONE OR MORE SYSTEMS ARE DOWN AND PREPARE SPECIFIED WITH OPTION(FRCNRML)

A fix is available

APAR status

Closed as program error.

Error description

In a 10-way IMS data sharing environment, 4 systems were down for
hardware upgrades when a Global Online Change (GOLC) was initiated
with OPTION(FRCNRML). GOLC failed to complete on 1 of the 6 active
systems. A QUERY MEMBER command shows a status of OLCCMT1C for
this system, indicating that online change commit phase 1
completed. The other systems completed commit phase 2 (OLCCMT2C)
but are waiting for GOLC to complete on all systems. The OLCSTAT
dataset was corrupted, with one duplicate entry and several
missing entries.

Local fix

Problem summary

****************************************************************
* USERS AFFECTED: IMS V11 users of global online               *
*                 change that use OPTION(FRCNRML) or           *
*                 OPTION(FRCABND)                              *
****************************************************************
* PROBLEM DESCRIPTION: An INIT OLC PHASE(COMMIT) command       *
*                      corrupts the OLCSTAT dataset, if        *
*                      issued when IMSs are down, after        *
*                      INIT OLC PHASE(PREPARE) was specified   *
*                      with OPTION(FRCNRML).                   *
****************************************************************
* RECOMMENDATION: INSTALL CORRECTIVE SERVICE FOR APAR/PTF      *
****************************************************************
An INIT OLC PHASE(PREPARE) TYPE(ACBLIB) OPTION(FRCNRML)
command is issued and executes successfully while 6 IMSs are
active and 4 IMSs are down because they were shut down with a
/CHE FREEZE command.
An INIT OLC PHASE(COMMIT) command is then issued, and commit
phase 1 is performed successfully by all the active IMSs. The
command master IMS then updates the OLCSTAT dataset incorrectly,
corrupting it. The command master IMSID overlays the first
IMSID in the OLCSTAT dataset and the last IMSID is incorrectly
deleted, so the command master IMSID appears in the OLCSTAT
dataset twice.
This problem occurs if the number of IMSs in the OLCSTAT
dataset is 4 or more and at least one of the IMSs is down.
Each OLCSTAT dataset IMS entry is 80 bytes, so 4 entries take
80 x 4 = 320 bytes. This exceeds the maximum length allowed on
a single MVC instruction, so not all of the data is moved.
This is why the commit command master appears in the
new OLCSTAT dataset twice.
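The length arithmetic above can be checked with a short sketch. This is
an illustration only, not IMS code. The 80-byte entry size comes from
this APAR; the 256-byte figure is the architectural limit for one MVC
instruction; the function names are hypothetical. The sketch reports
only a lower bound on the bytes left unmoved, since the APAR does not
state how many bytes the failing MVC actually moved.

```python
ENTRY_LEN = 80      # bytes per OLCSTAT IMS entry (from this APAR)
MVC_MAX_LEN = 256   # most bytes one MVC instruction can move

def olcstat_entries_len(ims_count: int) -> int:
    """Total bytes occupied by ims_count OLCSTAT IMS entries."""
    return ims_count * ENTRY_LEN

def fits_in_one_mvc(ims_count: int) -> bool:
    """True if all entries can be moved by a single MVC."""
    return olcstat_entries_len(ims_count) <= MVC_MAX_LEN

def min_bytes_not_moved(ims_count: int) -> int:
    """Lower bound on the bytes a single MVC leaves unmoved."""
    return max(0, olcstat_entries_len(ims_count) - MVC_MAX_LEN)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, olcstat_entries_len(n), fits_in_one_mvc(n),
              min_bytes_not_moved(n))
```

With 3 entries (240 bytes) a single MVC is sufficient. With 4 entries
(320 bytes) at least 64 bytes cannot be moved, which matches the
"4 or more" condition stated above.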
Bypass:
The IMSs that successfully completed commit phase 2
(QUERY MEMBER TYPE(IMS) SHOW(STATUS) displays a status
of OLCCMT2C), but whose IMSIDs disappeared from the
OLCSTAT dataset (QUERY OLC LIBRARY(OLCSTAT) SHOW(ALL)
doesn't display them in the MbrList field),
can be added back to the OLCSTAT dataset with the
Global Online Change Utility (DFSUOLC0) FUNC=ADD.
The INIT OLC PHASE(COMMIT) command should then be
issued again to complete the Global Online Change
for the IMSs that committed it. This gets those IMSs out
of an online change state, so that they can start doing
work again, can participate in the next Global Online
Change, and don't have to coldstart.
The IMS commit command master whose IMSID is duplicated
in the OLCSTAT dataset can have one of the duplicate IMSIDs
removed with the DFSUOLC0 utility FUNC=DEL. The
INIT OLC PHASE(COMMIT) command can then be attempted again to
complete the Global Online Change on this IMS and any others
that failed to complete the Global Online Change.
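As a planning aid for the bypass, the comparison between the MbrList
field and the IMSs at OLCCMT2C can be sketched as follows. This is a
hypothetical helper, not an IMS utility: it assumes the two lists have
been transcribed by hand from the QUERY OLC LIBRARY(OLCSTAT) SHOW(ALL)
and QUERY MEMBER TYPE(IMS) SHOW(STATUS) output, and the IMSIDs shown
are made up. It only reports which IMSIDs are candidates for DFSUOLC0
FUNC=ADD and FUNC=DEL; it does not run the utility.

```python
from collections import Counter

def plan_olcstat_repair(mbr_list, committed_phase2):
    """Return (to_add, to_delete) for a corrupted OLCSTAT member list.

    mbr_list         -- IMSIDs as displayed in the MbrList field
    committed_phase2 -- IMSIDs whose status is OLCCMT2C
    """
    counts = Counter(mbr_list)
    # IMSs that committed phase 2 but are missing from OLCSTAT.
    to_add = [ims for ims in committed_phase2 if counts[ims] == 0]
    # One deletion for each surplus copy of a duplicated IMSID.
    to_delete = [ims for ims, n in counts.items() for _ in range(n - 1)]
    return to_add, to_delete

if __name__ == "__main__":
    # Made-up IMSIDs: IMS3 is the duplicated command master.
    mbr_list = ["IMS3", "IMS2", "IMS3", "IMS4", "IMS5"]
    committed = ["IMS1", "IMS2", "IMS4", "IMS5", "IMS6"]
    to_add, to_delete = plan_olcstat_repair(mbr_list, committed)
    print("FUNC=ADD candidates:", to_add)
    print("FUNC=DEL candidates:", to_delete)
```

Each candidate should be checked against the actual command output
before DFSUOLC0 is run.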

Problem conclusion

GEN:
KEYWORDS:
*** END IMS KEYWORDS ***
DFSOLC00 is corrected to use the MVCL instruction when
moving the new OLCSTAT dataset contents to the old OLCSTAT
dataset buffer, in case the length of the new OLCSTAT
dataset contents exceeds 255 bytes because the count
of IMSs in the old OLCSTAT dataset is 4 or more.