
Category Archives: Data Guard


Recently I built a Data Guard environment on two Exadata machines with three RAC databases and ran a lot of tests. SHOW CONFIGURATION was probably the command I used most frequently in DG Broker.

When running show configuration from dgmgrl, we usually see the same result no matter where the command is executed, on the primary or on any standby database. During one switchover test, I ran into a weird situation: the show configuration command returned three different results from the one primary database and the two standby databases, just like the image above (the cat changes into a lion in the mirror). Here is the result:

The first thing I checked was whether Data Guard replication was still working. I did a few log file switches on the primary and could see the logs being replicated to both standby databases. I also verified the Data Guard related parameters, tnsnames and listener entries on all databases, and found no issue there. At this point I narrowed the issue down to DG Broker and suspected it was related to the DG Broker configuration. After a few tries, I found a solution to fix this issue:
1. On the primary db (wzxdb), remove the database wzsdb from the DG Broker configuration, then add it back.
2. On the standby db (wzsdb), bounce the database.
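Step 1 above can be sketched in dgmgrl as follows. This is a minimal sketch, not the post's exact session: the connect identifier and the PRESERVE DESTINATIONS clause are assumptions on my part, since the original commands are not shown.

```
-- run from dgmgrl while connected to the primary (wzxdb)
DGMGRL> remove database wzsdb preserve destinations;
DGMGRL> add database wzsdb as connect identifier is wzsdb maintained as physical;
DGMGRL> enable configuration;
DGMGRL> show configuration;
```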

After fixing the issue on the primary database, let's go to the standby database with the issue. It still showed the same error from the show configuration command, so I went ahead and bounced the database:
srvctl stop database -d wzsdb
srvctl start database -d wzsdb

It seems the standby database received commands like REMOVE DATABASE wzsdb and ENABLE CONFIGURATION from the primary's DG Broker, but just could not send the messages back to the primary database. After bouncing the standby database, it returned to normal and could communicate back to the primary database.

Finally, all databases show the SUCCESS status no matter where I run the show configuration command.
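For reference, a healthy output looks roughly like the sketch below. The configuration name and protection mode here are assumptions (the post does not show them); the point is that Configuration Status reads SUCCESS from every database.

```
DGMGRL> show configuration;

Configuration - dg_config

  Protection Mode: MaxPerformance
  Databases:
    wzxdb - Primary database
    wzsdb - Physical standby database

Configuration Status:
SUCCESS
```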

I did some work on an interesting project to keep a standby database in sync with a production primary database manually. This is not a true standby database, as the primary database does not communicate with it. For certain reasons we could not configure Data Guard to replicate data between these two databases, so there is no redo log shipping as in a normal Data Guard environment. By manually, I mean we take the archivelog backup from the previous day, then restore and recover it on this standby database. As this database is a VLDB, the volume of daily archive log files runs to multiple terabytes. We use an Exadata X4 full rack to host this standby database. Even restoring with all db nodes and 200+ channels, the restore alone still takes several hours, and recovering these archive logs takes a similar amount of time, not to mention the time spent copying files between the two data centers. It takes a lot of effort to keep up with the production primary database and reduce the lag between the two.
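The daily cycle described above can be sketched as an RMAN session. This is a hedged illustration, not the actual script: the staging path, sequence numbers, and channel count are hypothetical, and the real run allocates 200+ channels across all db nodes rather than the two shown here.

```
RMAN> catalog start with '/staging/arch/' noprompt;
RMAN> run {
  allocate channel d1 device type disk;
  allocate channel d2 device type disk;
  -- ...in the real run, 200+ channels spread over all db nodes
  restore archivelog from sequence 1001 until sequence 1200 thread 1;
  restore archivelog from sequence 2001 until sequence 2180 thread 2;
}
```

The restored logs are then applied to the standby with a manual recovery pass, which takes roughly as long as the restore itself.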

The benefit of doing this manually is the minimal impact on the current production environment. The only overhead on the production db is when copying files to the Exadata, and that impact is quite low. We scp the rman backup pieces using all 8 db nodes to maximize bandwidth utilization.
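Spreading the copy over all 8 db nodes can be sketched with a simple round-robin loop like the one below. The hostnames, paths, and backup piece pattern are all hypothetical; the real environment presumably uses its own naming and may drive the copy from each node instead.

```
#!/bin/sh
# Distribute rman backup pieces round-robin across 8 Exadata db nodes
# (hostnames exadb01..exadb08 and paths are assumptions for illustration).
i=0
for piece in /backup/arch/*.bkp; do
  node=$(( i % 8 + 1 ))
  scp "$piece" oracle@exadb0${node}:/staging/arch/ &
  i=$(( i + 1 ))
done
wait   # block until all background scp transfers finish
```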

One major task during this restore and recover is to identify the correct restore and recover points from the daily rman backupsets for the archive logs. As for identifying the right recover point, different people might have different opinions. Just like the image below: how many bars can you see, three or four?

There are many blogs and articles discussing how to identify the correct restore and recover points. The majority of people like to use the v$archived_log view to get the recover point. In my scenario that did not work well, as I could get the correct recovery point only after restoring all the archive log files. What I want to know, right after cataloging the rman backup pieces, is the last applied archive log sequence for each thread, and the next recover point for the backup pieces that were just cataloged.

Using both the v$archived_log and v$backup_archivelog_details views, I created a script that answers all the questions I have:
1. The restore commands to use for each thread
2. The last applied archive log sequence for each thread, including the timestamp and next change SCN#
3. The last possible recover point for each thread from the cataloged rman backup pieces
4. The recover command
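The core of such a script can be sketched with two queries like the ones below. This is a simplified illustration of the idea, not the author's actual script: it compares the last applied sequence per thread (from v$archived_log) against the highest sequence available in the freshly cataloged backup pieces (from v$backup_archivelog_details).

```
-- last applied archive log per thread, with timestamp and next change SCN#
select thread#, max(sequence#) last_applied_seq,
       max(next_time)  keep (dense_rank last order by sequence#) next_time,
       max(next_change#) keep (dense_rank last order by sequence#) next_change#
from   v$archived_log
where  applied = 'YES'
group  by thread#;

-- highest archive log per thread available in the cataloged backup pieces;
-- the smaller of these across threads bounds the next possible recover point
select thread#, max(sequence#) max_cataloged_seq, max(next_change#) next_change#
from   v$backup_archivelog_details
group  by thread#;
```

From these two results, the restore command range per thread runs from last_applied_seq + 1 up to max_cataloged_seq.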

Recently I did some Data Guard tests on 11.2.0.3 RAC. The primary and standby databases were on different Exadata Quarter Racks. During one test I may have messed up some Data Guard parameters; when I performed the switchover operation wzsdb->wzpdb, it failed in the middle of the process. This is an interesting scenario I had never run into before. Here is the result from the execution:

The majority of the time, when there is an issue during a switchover using DG Broker, bouncing both the new primary database and the new standby usually resolves it. It didn't work this time. I tried bouncing both databases multiple times and restarted MRP manually; none of it worked. Both databases claimed to be the primary database in DG Broker, just like the two bears above. Here is what the result from DG Broker looks like.

Obviously I should not have two primary databases in Data Guard. The next thing to check was whether the issue was inside the Data Guard Broker. I ran the following queries on both databases.
Database wzpdb (supposed new primary database)
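The original query output is not captured here, but queries along these lines show what each database believes its role to be and what the redo apply processes are doing. This is a sketch of the kind of checks involved, not necessarily the exact queries from the post:

```
-- what role does this database think it has, and can it switch over?
SQL> select database_role, open_mode, switchover_status from v$database;

-- on the standby: are MRP and the RFS processes alive and applying?
SQL> select process, status, thread#, sequence#
     from v$managed_standby
     order by process;
```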

The Data Guard processes on the standby also looked ok. I retried the show configuration command on both databases and got the same errors. At this point it seemed the solution was to recreate the DG Broker, so I went ahead and did the following:
Standby Database (wzsdb)
Step 1. Make sure to stop MRP first
SYS@wzdb1> alter database recover managed standby database cancel;
Database altered.
Step 2. Stop dg broker and remove the files
SYS@wzdb1> alter system set dg_broker_start=false scope=both sid='*';
System altered.
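The remaining steps of a broker recreation typically look like the sketch below. The configuration name and connect identifiers are assumptions here; the broker configuration file locations come from the dg_broker_config_file1/2 parameters and must be removed on each database before restarting the broker.

```
-- find the broker configuration files, then delete them (e.g. via asmcmd rm)
SYS@wzdb1> show parameter dg_broker_config_file

-- restart the broker on both databases once the files are gone
SYS@wzdb1> alter system set dg_broker_start=true scope=both sid='*';

-- recreate the configuration from dgmgrl, connected to the new primary
DGMGRL> create configuration dg_config as
        primary database is wzpdb connect identifier is wzpdb;
DGMGRL> add database wzsdb as connect identifier is wzsdb maintained as physical;
DGMGRL> enable configuration;
```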

We are back in business. Another possible solution is not to recreate the DG Broker files completely, but just to remove the DG Broker configuration from dgmgrl and then recreate the configuration in dgmgrl. Next time I run into a similar issue, I will try that out.