The return of the relink: Grid Infrastructure and RDBMS relink

Introduction:

This week I have once again been part of the debate: do we or don't we relink when a major activity such as a Linux kernel upgrade is performed? I was asked to do the relink after the RAC cluster had been upgraded on Linux. As always, I thought it wise to make notes during the day as a plan to carry out during the night. In this blog you will find the steps I performed on a two-node RAC cluster with 11.2.0.4 Grid Infrastructure and two Oracle software trees holding the 11.2.0.4 RDBMS and the 11.1 RDBMS.

With regard to relinking, the discussion in the team went like this: 1) we might break things by relinking, and 2) we don't have the resources to do that for every server. My recommendation is to follow Oracle in this and relink the Grid Infrastructure right after the OS has been upgraded. If something breaks during the upgrade and the relink right after it, at least you know where it came from and can deal with it from there. Whereas if you do not relink your software right after such a major change to the OS, you might still be hit in the dark in the upcoming weeks, and you would then have to figure out what caused it.

You can even debate whether it is necessary to stop resources such as listeners and databases gracefully before shutting down the cluster, or whether it is enough to perform a checkpoint in your database and just shut down CRS. I have used both approaches and never had issues so far. But I can imagine that heavily used, busy systems might prefer the graceful shutdown before shutting down GI.

Below you will find my steps. As always, happy reading, and till we meet again,

Mathijs.

Detailed Plan:

mysrvrar / mysrvrbr

Steps 1 to 11 will be performed on both nodes in my cluster, one node after the other with some delay in between, to make sure no cluster panic occurs.

1

crsctl status resource -t>/tmp/BeforeWork.lst

Check your cluster so that you can compare it with what it looks like after the relinking. It is a good idea to write the output to a file. I often end up on clusters that I do not work with on a daily basis, so I tend to make this overview before I start working on the cluster.
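A minimal sketch of how the snapshot could be wrapped in a small helper, so that the "before" and "after" files get distinct, timestamped names. The function name snapshot_cluster, the CRSCTL variable and the file naming are assumptions for illustration, not part of the original plan; the crsctl command itself is the one from step 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: save the cluster resource overview to a timestamped file.
# CRSCTL may point at a full path; it defaults to the crsctl found on PATH.
CRSCTL="${CRSCTL:-crsctl}"

snapshot_cluster() {
    local label="$1"            # e.g. "before" or "after"
    local outdir="${2:-/tmp}"   # directory that receives the snapshot
    local outfile
    outfile="${outdir}/cluster_${label}_$(date +%Y%m%d_%H%M%S).lst"
    # Same command as step 1, only the output file name differs.
    "$CRSCTL" status resource -t > "$outfile" || return 1
    # Print the file name so the caller can keep it for the later comparison.
    echo "$outfile"
}
```

Usage would be something like: BEFORE=$(snapshot_cluster before) at the start of the night, and AFTER=$(snapshot_cluster after) once everything is back up.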

2

cSpfile.ksh

This is a home-made script that performs several activities: it creates an spfile, does a checkpoint and does a log file switch right before shutting down the cluster node.
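The script itself is not shown here, so the following is only a sketch of what its checkpoint and log switch part might look like. The function name pre_shutdown_sql is hypothetical, and the spfile step is left out on purpose because the exact statement the script uses is not given.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for part of cSpfile.ksh: emit the SQL for a checkpoint
# and a log file switch. The spfile creation step is intentionally not included.
pre_shutdown_sql() {
    cat <<'EOF'
WHENEVER SQLERROR EXIT FAILURE
ALTER SYSTEM CHECKPOINT;
ALTER SYSTEM SWITCH LOGFILE;
EXIT
EOF
}

# Usage (as the oracle user, with ORACLE_HOME and ORACLE_SID set):
#   pre_shutdown_sql | sqlplus -s "/ as sysdba"
```

Keeping the SQL in a function that only prints it makes it easy to review what will be run before piping it into sqlplus.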

3

emctl stop agent

4

srvctl stop home -o $ORACLE_HOME -s /tmp/statusRDBMS -n mysrvrar

This will stop all resources that were started from the 11.2.0.4 home and keep a record of them in the file /tmp/statusRDBMS. This will be convenient when starting them again.

5

6

srvctl stop instance -d MYDBCM -i MYDBCM1

This is a shared cluster, so some customers require the 11.2.0.4 software and some the 11.1 software. The 11.1 databases have to be stopped individually.

srvctl stop instance -d MYDBCMAC -i MYDBCMAC1

7

srvctl stop listener -n mysrvrar -l listener_MYDBCM1

It is common to have a listener per database, so I will stop the 11.1 listeners in the proper way as well.

srvctl stop listener -n mysrvrar -l listener_MYDBCMAC1
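With more than a couple of 11.1 databases, steps 6 and 7 could be generated in a loop. The sketch below assumes the naming convention seen above (instance name = database name plus node number, listener name = listener_ plus instance name); the helper names run and stop_111_resources are made up for this example. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: stop the 11.1 instances first, then their listeners.
# Arguments: node name, instance number, then one or more database names.
stop_111_resources() {
    local node="$1" inst_no="$2"
    shift 2
    local db
    for db in "$@"; do
        run srvctl stop instance -d "$db" -i "${db}${inst_no}"
    done
    for db in "$@"; do
        run srvctl stop listener -n "$node" -l "listener_${db}${inst_no}"
    done
}
```

For the first node this would be called as: stop_111_resources mysrvrar 1 MYDBCM MYDBCMAC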

8

As root:

Dealing with the cluster means you have to log on as root, or perform sudo su - from the oracle user to become root, in order to stop the clusterware on the cluster node.

9

cd /opt/crs/product/11204/crs/bin

10

./crsctl disable crs

During this maintenance Linux will be patched and rebooted several times, so I was asked to make sure that the Grid Infrastructure does not start at each reboot until we are ready.

11

./crsctl stop crs

Last step of the preparation for the Linux guys to patch the machines: shutting down the Grid Infrastructure. Time to take a 2-hour sleep.

Time to Relink the software on the two nodes

Starting the relink on the first node, performing steps 12 and following. I will complete all steps needed on the first node and make sure that the Grid Infrastructure is started before moving on to the second node.

12

CHECK IF CRS IS DOWN otherwise REPEAT step 11 (crsctl stop crs)

After returning to the cluster, still check that CRS is down, because it is better to be safe than sorry.
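One way to make this check less dependent on tired eyes is to let a small function judge the output of crsctl check crs. This is a sketch under the assumption that a stopped stack reports CRS-4639 (could not contact Oracle High Availability Services) and that running components report "is online"; the function name crs_looks_down is hypothetical. Anything it cannot classify is treated as "not confirmed down".

```shell
#!/usr/bin/env bash
# Reads the output of "crsctl check crs" on stdin.
# Returns 0 only when the stack looks down; returns 1 when a component
# reports online or when the output is not recognised.
crs_looks_down() {
    local out
    out=$(cat)
    if printf '%s\n' "$out" | grep -q 'is online'; then
        return 1
    fi
    printf '%s\n' "$out" | grep -q 'CRS-4639'
}

# Usage (as root):
#   /opt/crs/product/11204/crs/bin/crsctl check crs 2>&1 | crs_looks_down \
#       && echo "CRS is down" || echo "CRS is NOT confirmed down"
```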

13

As root:

In order to relink the Grid Infra you have to become the root user again.

14

cd /opt/crs/product/11204/crs/bin

as root

15

cd /opt/crs/product/11204/crs/crs/install

16

perl rootcrs.pl -unlock

Earlier this night the GI was shut down for the Linux patching. When you run perl rootcrs.pl -unlock it will try to shut down the GI, so in my case I got a message that the system was not able to stop CRS.

17

As the grid infrastructure for a cluster owner:

This was a bit tricky, because the owner of the Grid Infra in my case is oracle, so don't try this as root. It is better to open a second window as oracle for the steps below.

18

export ORACLE_HOME=/opt/crs/product/11204/crs

As the Oracle user.

cd /opt/crs/product/11204/crs/bin

As the Oracle user.

19

relink

Relink will also write a relink log which you can tail.

20

[Step 1] Log into the UNIX system as the Oracle software owner:

Once the GI software has been relinked it is time to relink the Oracle homes (in my case an 11.1 and an 11.2 software tree). In my case I logged on as the oracle user.

21

[STEP 2] Verify that your $ORACLE_HOME is set correctly:

22

For all Oracle Versions and Platforms, perform this basic environment check first:

If relinking was successful, the make command will eventually return to the OS prompt without an error. There will NOT be a 'Relinking Successful' type message. I performed a tail on the log files in a second window while relink was running and did not see any issues. And as the note says: wait for the prompt to return (with no comments or messages) and you are good to go.
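Since there is no success message, it can help to scan the relink log afterwards instead of relying on the tail alone. The sketch below uses a hypothetical function relink_log_clean and a guessed set of patterns (error, fatal, undefined reference/symbol); it assumes the log is at $ORACLE_HOME/install/relink.log, which is where 11.2 normally writes it, so verify the location on your own system. A match is a reason to look at the log, not proof of a failed relink.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print suspicious lines (with line numbers) from a relink log.
# Returns 0 when nothing suspicious is found, 1 when matches are printed,
# 2 when the log cannot be read.
relink_log_clean() {
    local log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    if grep -n -i -E 'error|fatal|undefined (reference|symbol)' "$log"; then
        return 1
    fi
    return 0
}

# Usage (log location is an assumption, check your own home):
#   relink_log_clean "$ORACLE_HOME/install/relink.log" && echo "no obvious errors"
```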

29

As root again:

Since I am relinking both the GI and the RDBMS, I have moved this step (starting the GI again) to after the RDBMS relinking has finished, because during the relink of the RDBMS the environment (databases, listeners) has to be down.

30

cd /opt/crs/product/11204/crs/crs/install/

31

perl rootcrs.pl -patch

This perl rootcrs.pl -patch will also start the cluster on this node again. NOTE: we had an issue where this was hanging on the first node. It turned out that the second node was up and running after all (my Linux colleague had issued a crsctl disable crs from an old, inactive clusterware home that was still present on the box). So in this specific scenario I stopped CRS again on the second node, after which the script continued on the first node.

32

crsctl enable crs

If you have used crsctl disable crs, enable it again so that the GI will start after a node reboot.
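Given the mix-up with the old clusterware home described above, it may be worth confirming that autostart really is enabled in the active home. A minimal sketch, assuming crsctl config crs reports a line containing "autostart is enabled" when it is; the function name crs_autostart_enabled is hypothetical.

```shell
#!/usr/bin/env bash
# Reads the output of "crsctl config crs" on stdin.
# Returns 0 when it reports that autostart is enabled, 1 otherwise.
crs_autostart_enabled() {
    grep -q 'autostart is enabled'
}

# Usage (as root, from the active 11.2.0.4 GI home):
#   /opt/crs/product/11204/crs/bin/crsctl config crs | crs_autostart_enabled \
#       || echo "WARNING: CRS autostart is not enabled on this node"
```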

33

As Oracle

emctl start agent

Agent was already running so no manual action needed.

34

srvctl start home -o $ORACLE_HOME -s /tmp/statusRDBMS -n mysrvrar

This will start all resources that were started from the 11.2.0.4 home. The resources had been saved previously in the /tmp/statusRDBMS file.

35

srvctl start instance -d MYDBCM -i MYDBCM1

Starting the 11.1 Resources.

srvctl start instance -d MYDBCMAC -i MYDBCMAC1

36

srvctl start listener -n mysrvrar -l listener_MYDBCM1

Starting the 11.1 Resources.

srvctl start listener -n mysrvrar -l listener_MYDBCMAC1

37

As Oracle User on the second node once it is relinked:

38

srvctl start instance -d MYDBCM -i MYDBCM2

Starting the 11.1 Resources.

srvctl start instance -d MYDBCMAC -i MYDBCMAC2

39

srvctl start listener -n mysrvrbr -l listener_REQMOD2

Starting the 11.1 Resources.

srvctl start listener -n mysrvrbr -l listener_MYDBCM2

srvctl start home -o $ORACLE_HOME -s /tmp/statusRDBMS -n mysrvrbr

crsctl status resource -t

Check your cluster again and compare the result with the status before. Hopefully all resources will show ONLINE ONLINE, or at least show the situation as it was before. There might be an extra activity if you are using services that have been relocated during the action. In that case you will have to relocate them back to their original location.
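The comparison with /tmp/BeforeWork.lst from step 1 can be done with a plain diff. The sketch below wraps it in a hypothetical function compare_cluster_status; the file name /tmp/AfterWork.lst in the usage line is likewise an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare the resource overview from before and after the work.
# Returns 0 when both files are identical, 1 when they differ, 2 on a diff error.
compare_cluster_status() {
    local before="$1" after="$2" rc
    diff -u "$before" "$after"
    rc=$?
    case "$rc" in
        0) echo "No differences: resources match the snapshot taken before the work." ;;
        1) echo "Differences found: review the lines marked with - and + above." >&2 ;;
        *) echo "diff failed, check that both files exist." >&2; rc=2 ;;
    esac
    return "$rc"
}

# Usage:
#   crsctl status resource -t > /tmp/AfterWork.lst
#   compare_cluster_status /tmp/BeforeWork.lst /tmp/AfterWork.lst
```

Relocated services will show up as differences in the node column, which is exactly the case where they need to be moved back.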