This blog is by a long-time Oracle storage professional who has history with both NetApp and EMC.

July 23, 2007

Oracle Backup: Which Snapshot is best? (Part 1)

In this post, I will begin to get to the heart of the
matter: The use of storage-level instantaneous copy technology with Oracle
backup.

When I talk about storage-level instantaneous copy technology
(which I will refer to from now on with the acronym SLIC), I mean things like
snapshots and BCVs (business continuance volumes): anything that facilitates a
point-in-time, instantaneous copy of a database or file system through some
storage-layer mechanism. SLIC technologies come in two broad types: physical
copy technologies (like BCVs) and virtual copy technologies (like snapshots).

When I first went to work for NetApp back in 1997, this was
my first challenge. Those were heady times. The internet boom was underway. EMC
was pushing their version of SLIC technology, which was largely BCVs. NetApp had
SLIC in the form of snapshots, but Oracle was highly resistant to the concept
of NFS. My first assignment was to validate the use of NetApp’s NFS
implementation with Oracle databases. In the process, I was strongly encouraged
to figure out how to do an Oracle hot backup with snapshots, which I
successfully did in early 1998.

At this point, Oracle 7 was the current version, and hot
backup (or “user managed backup” as Oracle preferred to call it) was the only
way to back up an Oracle database. It would be difficult to over-emphasize the
importance of the hot backup feature in Oracle’s success. I think it was
absolutely the killer feature which allowed Oracle to become the dominant force
on the planet in this space. Let me explain why that was so.

Again, the internet boom was coming into full swing. This
pushed a fundamental change in the way that databases worked in many IT
organizations, especially dotcoms or companies that wanted to sell products in
the online marketplace. The internet was a global phenomenon. That meant 24 x 7
access. Before the internet, DBAs had some downtime each day to back the
database up. No longer. Any downtime was now unacceptable. The database had to
be backed up while online and open. The database software market became a horse race to see which vendor could do a better job backing up an online production database.

The dominant forces in the market were Oracle and Sybase.
Microsoft SQL Server was a Sybase knockoff and basically a toy. IBM DB2 was stuck
in the proprietary mainframe world, and had no real open systems strategy. It
was really between Oracle and Sybase.

Oracle had the feature called hot backup. This allowed you
to use SLIC to make an instantaneous copy of a running Oracle database while it
was in a special mode called “hot backup mode”. I/O to the database was allowed
to continue the entire time. Basically, hot backup mode was, and still is, a
form of controlled or “gated” corruption. Oracle knew that any block written to
the datafiles while they were in this mode was potentially corrupt, so Oracle
simply copied those blocks to the logs as well. During recovery, Oracle ignored
those blocks in the datafiles and pulled them from the logs instead. This meant
that to optimize hot backup, you needed to take the copy as rapidly as possible,
both to minimize the number of ignored blocks and to reduce the impact of the
huge increase in logging which occurred while in hot backup mode. Enter SLIC,
which enabled the DBA to make a copy of the database instantly.
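
To make the order of operations concrete, here is a minimal Python sketch of a snapshot-based hot backup. The `ALTER TABLESPACE ... BEGIN BACKUP` / `END BACKUP` and `ALTER SYSTEM ARCHIVE LOG CURRENT` statements are standard Oracle SQL. The function name, the tablespace names, and the snapshot placeholder are assumptions for illustration, since the real snapshot command depends on the storage array. The sketch only builds the list of steps; it does not connect to a database.

```python
def hot_backup_steps(tablespaces, snapshot_step="<storage snapshot command goes here>"):
    """Return the ordered steps for a snapshot-based hot backup.

    snapshot_step is a hypothetical placeholder: the real command depends
    on the storage array (snapshot, BCV split, and so on).
    """
    steps = []
    # 1. Put every tablespace into hot backup mode.
    for ts in tablespaces:
        steps.append(f"ALTER TABLESPACE {ts} BEGIN BACKUP;")
    # 2. Take the instantaneous copy while the datafiles are in backup mode.
    steps.append(snapshot_step)
    # 3. Leave hot backup mode as soon as the copy exists.
    for ts in tablespaces:
        steps.append(f"ALTER TABLESPACE {ts} END BACKUP;")
    # 4. Force a log switch so the redo covering the backup window is archived.
    steps.append("ALTER SYSTEM ARCHIVE LOG CURRENT;")
    return steps


if __name__ == "__main__":
    # Illustrative tablespace names only.
    for step in hot_backup_steps(["SYSTEM", "USERS"]):
        print(step)
```

The thing to notice is how little sits between BEGIN BACKUP and END BACKUP. With SLIC, step 2 takes seconds rather than hours, so the database spends almost no time in hot backup mode.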

With SLIC, Oracle databases could be backed up very rapidly.
The backup operation had minimal impact on the production database. Most DBAs
could not measure any performance impact at all. This meant that databases
could be backed up much more often. The impact of hot backup on logging was a tiny blip, that's all.
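
A toy model shows why the length of the backup window matters so much for logging. It assumes, as a simplification, that the first change to a block during hot backup mode logs a full block image, and that later changes to the same block add nothing extra. The block size, the update rate, the database size, and the round-robin write pattern are all made-up assumptions, not measurements.

```python
def extra_redo_bytes(window_seconds, updates_per_second, total_blocks,
                     block_size=8192):
    """Estimate extra redo generated by full block images in backup mode.

    Toy assumptions (not measured Oracle behaviour): updates hit blocks
    round-robin, and only the first change to each block during the
    window logs a full block image.
    """
    updates = window_seconds * updates_per_second
    # Each distinct block touched during the window costs one full image.
    distinct_blocks = min(updates, total_blocks)
    return distinct_blocks * block_size


if __name__ == "__main__":
    # Hypothetical workload: 200 block updates per second, 1,000,000 blocks.
    snapshot = extra_redo_bytes(5, 200, 1_000_000)          # 5-second window
    file_copy = extra_redo_bytes(2 * 3600, 200, 1_000_000)  # 2-hour window
    print(snapshot, file_copy)
```

Under these made-up numbers, the five-second window logs roughly 8 MB of block images and the two-hour window roughly 8 GB. The exact figures mean nothing; the ratio is the point.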

SLIC allowed the copy to become a fully writable copy of the
database instantly as well. This meant that the restore time was also
dramatically reduced. Since the database could be backed up more often, this
also reduced the time for recovery, since fewer log files needed to be applied.

In contrast, Sybase did not support any SLIC technology whatsoever.
You were required to use a tool called Backup Server to make a backup of the
database. This tool did lots and lots of I/O. The process seriously affected the production database’s
performance, and the backup operation took hours. Restore and recovery time were
similarly long.

Thus, when combined with SLIC, Oracle had the best online backup
technology going. This allowed Oracle to crush Sybase and become the logical
choice for all internet-facing database applications. This in turn led to
Oracle’s dominance in the database marketplace, which persists to this day.

EMC led the charge in the storage space to provide SLIC
technology to the Oracle database market, initially in the form of BCVs. As I said, I
became involved in 1997 in validating NetApp snapshots, as well as NFS, for
storing and backing up Oracle databases. By that time EMC had also introduced a
set of snapshot SLIC technologies in the form of TimeFinder Snap on the
Symmetrix and SnapView snapshot on the CLARiiON.

In my next post I will discuss the differences between the SLIC
approaches taken by EMC and NetApp and how those decisions have affected the
Oracle database user.

Comments

Hi Jeff

To the best of my knowledge, the Oracle database only writes the full data block to the redo log on the first change after the block is brought into the SGA; thereafter it just writes the normal redo changes.
I think you are simplifying how Oracle treats blocks in hot backup mode. The simplification might not have an impact on your quest, but I still think the prerequisites are important.

Powerlink access is required for this, though. If you are not an EMC customer, you will not be able to see this page unfortunately. You can adapt these scripts to Windows easily by installing cygwin. I always run cygwin on any Windows box I use, including my laptop. cygwin can be found here:

Is it possible for NetApp to just take a snapshot of a part of your database (read only tablespaces on a designated filesystem) while RMAN does the backup of all the read write tablespaces?

----------------------------

Rony:

Not sure what you are thinking here. I rarely if ever deal with read only tablespaces, though. You would tend to back them up once, then forget about them. They are after all read only. That is, they are not going to change. Thus, you would not need any snaps of them.

Not sure where you are coming from here. I spent years working with Sybase up in Alaska building things like the Alaska Group Economic Model. In other words, I am intimately familiar with the quiesce feature in Sybase.

disclaimer: The opinions expressed here are my personal opinions. I am a blogger who works at EMC, not an EMC blogger. This is my blog, and not EMC's. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC.