Posts Tagged ‘RAC’

Okay, so an ASM instance crashes and the database instance on that node goes down with it … expected behavior, and many of us have faced this scenario in production RAC setups.
The answer to this problem is ‘Flex ASM’, which gives us something that was previously unattainable: a configurable number (cardinality) of ASM instances that serve database instances anywhere in the cluster, so a database instance no longer depends on a local ASM instance. You can think of it as what SCAN is to the database in 11gR2.

Flex ASM has been available to 12c users for a while now, so the question is: how do we convert a standard (non-Flex) ASM setup to a Flex-enabled one?
Below is the method. I performed a POC for one of my customers some time back, and here are the steps.
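Before starting, you can confirm the current cluster mode with asmcmd (an illustrative check; dixitdb12v is one of the node names used in this setup):

[oracle@dixitdb12v ~]$ asmcmd showclustermode
ASM cluster : Flex mode disabled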

Okay, now let's do the conversion. We will do it in silent mode, though you can use the ASMCA GUI to achieve the same.
Here I used the 192.168.10.0 network and a free port for the ASM listener; we will use port 1526 for listening to all ASM requests.
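A minimal sketch of the silent conversion call, assuming the Grid Infrastructure home is set in the environment and that eth1 is the interface carrying the 192.168.10.0 network (the interface name is an assumption for illustration):

[oracle@dixitdb12v ~]$ asmca -silent -convertToFlexASM -asmNetworks eth1/192.168.10.0 -asmListenerPort 1526

ASMCA validates the input and then writes out a root script that performs the actual conversion.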

Okay, so the last step generated a script which needs to be executed as root to do the real work. This will bounce all RAC components one by one on each node. By the end of this step we will have a new listener created exclusively for ASM, and both ASM instances (+ASM1, +ASM2) will be registered with it.
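Once the root script has completed on all nodes, the new mode can be verified; a quick check from any node (node names are the ones from this environment):

[oracle@dixitdb12v ~]$ asmcmd showclustermode
ASM cluster : Flex mode enabled

[oracle@dixitdb12v ~]$ srvctl status asm -detail
ASM is running on dixitdb12v,dixitdb13v
ASM is enabled.

srvctl config asm will additionally show the new ASM listener and the configured cardinality.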

Well, this is an expected error because we are running a two-node RAC, while Flex ASM (like SCAN listeners) defaults to a cardinality of three ASM instances, which cannot be satisfied in my case. But I will now kill one ASM instance manually (by killing its PMON) to confirm that the database instance survives.
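A rough sketch of that test, assuming +ASM1 runs on dixitdb12v and the database is TESTRAC (the PID below is illustrative):

[oracle@dixitdb12v ~]$ ps -ef | grep asm_pmon        # note the PMON PID of +ASM1
[oracle@dixitdb12v ~]$ kill -9 <pmon_pid>            # crash the local ASM instance

SQL> select instance_name, status from gv$instance;  -- run against the database: both instances should stay OPEN

With Flex ASM the database instance on dixitdb12v simply reconnects to the surviving ASM instance on the other node instead of crashing with it.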

[oracle@dixitdb12v dixit]$ ./orachk
This version of orachk was released on 09-Oct-2014 and its older than 120 days. No new version of orachk is available in RAT_UPGRADE_LOC. It is highly recommended that you download the latest version of orachk from my oracle support to ensure the highest level of accuracy of the data contained within the report.

Do you want to continue running this version? [y/n][y]y

CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /opt/app/grid/11.2.0/grid?[y/n][y]y

18 of the included audit checks require root privileged data collection . If sudo is not configured or the root password is not available, audit checks which require root privileged data collection can be skipped.

1. Enter 1 if you will enter root password for each host when prompted

2. Enter 2 if you have sudo configured for oracle user to execute root_orachk.sh script

3. Enter 3 to skip the root privileged collections

4. Enter 4 to exit and work with the SA to configure sudo or to arrange for root access and run the tool later.

Please indicate your selection from one of the above options for root access[1-4][1]:- 1

Running orachk in serial mode because expect(/usr/bin/expect) is not available to supply root passwords on remote nodes

NOTICE: Installing the expect utility (/usr/bin/expect) will allow orachk to gather root passwords at the beginning of the process and execute orachk on all nodes in parallel speeding up the entire process. For more info – http://www.nist.gov/el/msid/expect.cfm. Expect is available for all major platforms. See User Guide for more details.

Checking for prompts in /home/oracle/.bash_profile on dixitdb12v for oracle user…

Checking for prompts in /home/oracle/.bash_profile on dixitdb13v for oracle user…

FAIL => Bash is vulnerable to code injection (CVE-2014-6271)
INFO => $CRS_HOME/log/hostname/client directory has too many older log files.
INFO => ORA-00600 errors found in alert log for TESTRAC
INFO => At some times checkpoints are not being completed for TESTRAC
INFO => oracleasm (asmlib) module is NOT loaded
WARNING => Shell limit soft nproc for DB is NOT configured according to recommendation
WARNING => kernel.shmmax parameter is NOT configured according to recommendation
WARNING => Database Parameter memory_target is not set to the recommended value on TESTRAC2 instance
FAIL => Operating system hugepages count does not satisfy total SGA requirements
WARNING => NIC bonding is not configured for interconnect
WARNING => NIC bonding is NOT configured for public network (VIP)
WARNING => OSWatcher is not running as is recommended.
INFO => Jumbo frames (MTU >= 8192) are not configured for interconnect
WARNING => NTP is not running with correct setting
WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for TESTRAC
WARNING => fast_start_mttr_target should be greater than or equal to 300. on TESTRAC2 instance
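Most of these findings map to documented prerequisites and best practices. Purely as an illustration (not part of the report itself), the soft nproc warning above is normally addressed in /etc/security/limits.conf with the shell limit values from the 11gR2 installation guide:

oracle soft nproc 2047
oracle hard nproc 16384

After a fresh login as the oracle user, ulimit -u should then reflect the new soft value.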

Even with a good database upgrade plan, one can face issues related to performance, functionality, backups, etc. Today I'm going to discuss one such case that happened during a production upgrade from 10gR2 to 11gR2.

We had our full database backup scheduled every night with a configuration like: autobackup ON, data transferred directly to SBT media (tapes), optimization ON, and other basic settings.
While performing the status check the next morning, we discovered the backup had failed with the error message:

ORA-00245: control file backup failed; target is likely on a local file system

We had the AUTOBACKUP functionality ON, which backs up critical files like the control file whenever the database structure metadata in the control file changes and whenever a backup record is added.

Reason:
From 11gR2 onwards, due to changes made to the control file backup mechanism in RAC databases, any instance in the cluster may write to the snapshot control file. Because of this, the snapshot control file needs to be visible to all instances: it must be reachable by every node in the RAC environment. If the snapshot control file is not available or not on shared storage, RMAN will throw this error during backup operations.
Documentation Link: http://docs.oracle.com/cd/E11882_01/rac.112/e16795/rman.htm#i455026

Solution:
To avoid such situations, always keep your snapshot control file in a shared location so that it is accessible to all nodes when needed.

using target database control file instead of recovery catalog
new RMAN configuration parameters:
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/disk02/rmaninfo/snapcontrol/snapcf_cesc1.f';
new RMAN configuration parameters are successfully stored.
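To confirm the change afterwards, the setting can be read back from RMAN on any node:

RMAN> SHOW SNAPSHOT CONTROLFILE NAME;

which should echo the CONFIGURE command above with the shared path.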

Okay, now that we have the solution to the problem, let's discuss the snapshot control file and this new 11gR2 change in a bit more detail.
– When RMAN performs any operation that requires a consistent view of the control file (such as a backup), it will first create a copy of the control file. This copy is called the snapshot control file. The snapshot control file will be used for the duration of that operation and will be overwritten by any subsequent operation. Even related operations (say, during a backup database plus archive-log operation that does an archive log backup, a database backup, and then another archive log backup) will use newly created snapshot control files, one for each operation.

Why, and what is the need for, this new change?
From 11gR2 onwards, the control file backup happens without holding the control file enqueue.

Now, what is this ‘control file enqueue‘?
When RMAN needs to back up or resynchronize from the control file, it first creates a snapshot, or consistent image, of the control file.
If one RMAN job is already backing up the control file while another needs to create a new snapshot control file, you may see the error:

RMAN-08512: waiting for snapshot controlfile enqueue for 1900 seconds

A job that must wait for the control file enqueue normally waits for a brief interval and then successfully acquires it. RMAN makes up to five attempts to get the enqueue and then fails the job. So, finally, with Oracle 11gR2 we have a solution for this situation (in a RAC environment): keep the snapshot control file in a shared location.
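If you ever do hit the RMAN-08512 wait, the session holding the snapshot control file enqueue can be identified with a query along the lines of the one Oracle documents for this purpose (run as a DBA user):

SELECT s.sid, s.username, s.program, s.module
  FROM v$session s, v$enqueue_lock l
 WHERE l.sid = s.sid
   AND l.type = 'CF'   -- control file enqueue
   AND l.id1 = 0
   AND l.id2 = 2;      -- snapshot control file

Once that session finishes (or is killed), the enqueue is released and the waiting backup can proceed.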