Updates in a Distributed Environment

Distributed updates add several levels of complexity to a distributed system. The fact that multiple users are sharing and accessing data that exists at many sites rather that at a single site, adds the following considerations when these users attempt to update that data:

replication (update propagation)

concurrency control

transaction recovery management

Replication

In a distributed system, a data object can be represented at many sites. Making sure that updates to any representative at any of the sites are propagated to all other sites is the responsibility of Oracle7's
replication mechanisms.

A snapshot is a full copy of a table, or a subset of a table, that reflects a recent state of a master table (the table on the node that you designate as the master node). A snapshot is defined by a distributed query that references one or more master tables, views, or, with certain limitations, other snapshots.

Read-only snapshots can be used for queries only and are the simplest form of replication. They are typically used for systems in which many sites need to query data that are updated only by one site..

Updatable snapshots can reflect local updates and therefore improve response time by avoiding network traffic. However, there must be a mechanism that ensures that the local updates are not lost when the snapshot is refreshed from the master table.

Oracle's symmetric replication facility provides that mechanism. It allows multiple copies of data to be maintained at different sites in a distributed system.

Concurrency Control

In a distributed system, there is the strong potential for more than one user to be concurrently executing transactions that update the same data and produce incorrect results. Oracle provides locking mechanisms to handle multiuser access to the same data.

In a distributed system, there is another level of complexity. A situation called global deadlock can occur. For example, transaction T1 has a lock on object A in San Francisco and requests a lock on object B in Dallas, and transaction T2 has a lock on object B in Dallas and requests a lock on object A in San Francisco. This series of events would cause both transactions to go into wait states.

In Oracle distributed transactions, local deadlocks are detected by analyzing a "waits for" graph, and global deadlocks are detected by a time-out. Once detected, non-distributed and distributed deadlocks are handled by the database and application in the same way.

Distributed Transaction Management

All participants (nodes) in a distributed transaction should be unanimous as to the action to take on that transaction. That is, they should either all commit or rollback.

Oracle uses a prepare/commit mechanism to ensure that all participants in the distributed transaction are "on the same track".

Briefly, a prepare/commit mechanism works this way (more detailed instructions for implementing prepare/commit are in
following sections):

Phase 1

In a prepare/commit system, the initiating node is called the global coordinator. When this site receives a commit request from an update transaction, it asks all participants in the distributed system to prepare (to promise to commit or rollback the transaction, even if there is a failure). The participating sites will reply that they are either prepared to commit or not.

Phase 2

If the global coordinator receives a "prepared to commit" message from all the participating sites, it broadcasts a commit command to all sites. If even one site has replied that it is not prepared to commit, the global coordinator will abort the transaction and send out a rollback (or undo) command.

The Distributed Transaction Management Mechanism

Oracle7 automatically controls and monitors the commit or rollback of a distributed update transaction and maintains the integrity of the global database (the collection of databases participating in the transaction) using a transaction recovery management mechanism known as prepare/commit). This mechanism is completely transparent. Its use requires no programming on the part of the user or
application developer.

The information in this section explains how this mechanism works, why it is used, and how you can take advantage of its features to better design and configure a distributed system.

Coordinating Distributed Updates

By definition, changes made by all SQL statements in a transaction are either committed or rolled back as a unit. The commit of a non-distributed transaction (one that contains SQL statements that modify data only at a local database) is simple -- all changes are either committed or rolled back as a unit in the non-distributed system.

However, the commit or rollback of a distributed transaction must be coordinated over a network so that the participating nodes either all commit or all roll back the transaction, even if a network failure or a system failure of any number of nodes occur during the process. The prepare/commit mechanism guarantees that the nodes participating in a distributed transaction either all commit or all roll back the transaction, thus maintaining the integrity of the global database.

All implicit database changes performed by integrity constraints, remote procedure calls, and triggers are also protected by Oracle7's distributed transaction management mechanism.

The Prepare and Commit Phases

The processing of a distributed update has two distinct phases:

prepare phase

The global coordinator (initiating node) asks participants to prepare (to promise to commit or rollback the transaction, even if there is a failure).

commit phase

If all participants respond to the coordinator that they are prepared, the coordinator asks all nodes to commit the transaction. If any participants cannot prepare, the coordinator asks all nodes to roll back the transaction.

When a user commits a distributed transaction with a COMMIT statement, both phases are performed automatically. The following sections describe each phase in further detail.

Prepare Phase

The first phase in committing a distributed transaction is the preparephase in which the commit of the transaction is not actually carried out. Instead, all nodes referenced in a distributed transaction (except one, known as the commit point site, described ) are told to prepare (to commit). By preparing, a node records enough information so that it can subsequently either commit or abort the transaction (in which case, a rollback will be performed), regardless of
intervening failures.

When a node responds to its requestor that it has prepared, the prepared node has made a promise to be able to either commit or roll back the transaction later and not to make a unilateral decision on whether to commit or roll back the transaction.

Note: Queries that start after a node has prepared cannot access the associated locked data until all phases are complete (an insignificant amount of time unless a failure occurs).

When a node is told to prepare, it can respond with one of
three responses:

prepared

Data on the node has been modified by a statement in the distributed transaction, and the node has successfully prepared.

read-only

No data on the node has been, or can be, modified (only queried), so no prepare is necessary.

abort

The node cannot successfully prepare.

Prepare Phase Actions by Nodes

To complete the prepare phase, each node performs the
following actions:

The node requests its descendants (nodes subsequently referenced) to prepare.

The node checks to see if the transaction changes data on that node or any of its descendants. If there is no update, the node skips the next steps and replies with a read-only message
(see below).

The node allocates all resources it needs to commit the transaction if data is updated.

The node flushes any entries corresponding to changes made by that transaction to its local redo log.

The node guarantees that locks held for that transaction are able to survive a failure.

The node responds to the node that referenced it in the distributed transaction with a prepared message or, if its prepare or the prepare of one of its descendents was unsuccessful, with an abort message (see below).

These actions guarantee that the transaction can subsequently commit or roll back on that node. The prepared nodes then wait until a COMMIT or ROLLBACK is sent. Once the node(s) are prepared, the transaction is said to be in-doubt.

Read-only Response

When a node is asked to prepare and the SQL statements affecting the database do not change that node's data, the node responds to the node that referenced it with a read-only message. These nodes do not participate in the second phase (the commit phase). For more information about read-only distributed transactions, see "Read-Only Distributed Transactions"

Unsuccessful Prepare

When a node cannot successfully prepare, it performs the
following actions:

That node releases any resources currently held by the transaction and rolls back the local portion of the transaction.

The node responds to the node that referenced it in the distributed transaction with an abort message.

These actions then propagate to the other nodes involved in the distributed transaction to roll back the transaction and guarantee the integrity of the data in the global database.

Again, this enforces the primary rule of a distributed transaction. All nodes involved in the transaction either all commit or all roll back the transaction at the same logical time.

Commit Phase

The second phase in committing a distributed transaction is the commit phase. Before this phase occurs, all nodes referenced in the distributed transaction have guaranteed that they have the necessary resources to commit the transaction. That is, they are all prepared.

Therefore, the commit phase consists of the following steps:

1. The global coordinator send a message to all nodes telling them to commit the transaction.

2. At each node, Oracle7 commits the local portion of the distributed transaction (releasing locks) and records an additional redo entry in the local redo log, indicating that the transaction has committed.

When the commit phase is complete, the data on all nodes of the distributed system are consistent with one another.

A variety of failure cases, caused by network or system failures, are possible during both the prepare phase and the commit phase. For a description of failure situations and how Oracle7 resolves intervening failures during prepare/commit, see "Troubleshooting Distributed Update Problems" .

The Session Tree

As the statements in a distributed transaction are issued, Oracle7 defines a session tree of all nodes participating in the transaction. A session tree is a hierarchical model that describes the relationships between sessions and their roles. All nodes participating in the session tree of a distributed transaction assume one or more roles:

Clients

A node acts as a client when it references information from another node's database. The referenced node is a database server. In the above example, the node SALES.ACME.COM is a client of the nodes (database servers) that serve the WAREHOUSE and
FINANCE databases.

Servers and Database Servers

A server is a node that is directly referenced in a distributed transaction or is requested to participate in a transaction because another node requires data from its database. A node supporting a database is also called a database server.

In Figure 5 - 1, an application at the node holding the SALES database initiates a distributed transaction which accesses data from the nodes holing the WAREHOUSE and FINANCE databases. Therefore, SALES.ACME.COM has the role of client node, and WAREHOUSE and FINANCE are both database servers.

In this example, SALES is a database server and a client because the application is also requesting an update of the SALES database.

Local Coordinators

A node that must reference data on other nodes to complete its part in the distributed transaction is called a local coordinator. In Figure 5 - 1, SALES.ACME.COM, although it happens to be the global coordinator, is also considered a local coordinator because it coordinates the nodes it directly references: WAREHOUSE.ACME.COM and FINANCE.ACME.COM.

A local coordinator is responsible for coordinating the transaction among the nodes it communicates directly with by:

receiving and relaying transaction status information to and from those nodes

passing queries to those node

receiving queries from those nodes and passing them on to other nodes

returning the results of queries to the nodes that initiated them

The Global Coordinator

The node where the distributed transaction originates (to which the database application issuing the distributed transaction is directly connected) is called the global coordinator. This node becomes the parent or root of the session tree. The global coordinator performs the following operations during a distributed transaction:

All of the distributed transaction's SQL statements, remote procedure calls, etc. are sent by the global coordinator to the directly referenced nodes, thus forming the session tree.

For example, in Figure 5 - 1, the transaction issued at the node SALES.ACME.COM references information from the database servers WAREHOUSE.ACME.COM and FINANCE.ACME.COM. Therefore, SALES.ACME.COM is the global coordinator of this distributed transaction.

The global coordinator instructs all directly referenced nodes other than the commit point site (see below) to prepare
the transaction.

If all nodes prepare successfully, the global coordinator instructs the commit point site to initiate the global commit of
the transaction.

If there is one or more abort messages, the global coordinator instructs all nodes to initiate a global rollback of the transaction.

For more information about the global coordinator's role, see "A Simple Example" .

The Commit Point Site

The job of the commit point site is to initiate a commit or roll back as instructed by the global coordinator. The system administrator always designates one node to be the commit point site in the session tree by assigning all nodes a commit point strength (see below). The node selected as commit point site should be that node that stores the most critical data (the data most widely used)

The commit point site is distinct from all other nodes involved in a distributed transaction with respect to the following two issues:

The commit point site never enters the prepared state. This is potentially advantageous because if the commit point site stores the most critical data, this data never remains in-doubt, even if a failure situation occurs. (In failure situations, failed nodes remain in a prepared state, holding necessary locks on data until in-doubt transactions are resolved.)

In effect, the outcome of a distributed transaction at the commit point site determines whether the transaction at all nodes is committed or rolled back. The global coordinator ensures that all nodes complete the transaction the same way that the commit point site does.

A distributed transaction is considered to be committed once all nodes are prepared and the transaction has been committed at the commit point site (even though some participating nodes may still be only in the prepared state and the transaction not yet actually committed).

The commit point site's redo log is updated as soon as the distributed transaction is committed at that node. Likewise, a distributed transaction is considered not committed if it has not been committed at the commit point site.

Commit Point Strength

Every node acting as a database server must be assigned a commit point strength. If a database server is referenced in a distributed transaction, the value of its commit point strength determines what role it plays in the prepare/commit phases. Specifically, the commit point strength determines whether a given node is the commit point site in the distributed transaction.

The commit point site is determined at the beginning of the prepare phase. The commit point site is selected only from the nodes participating in the transaction. Once it has been determined, the global coordinator sends prepare messages to all participating nodes.

Of the nodes directly referenced by the global coordinator, the node with the highest commit point strength is selected. Then, the initially-selected node determines if any of its servers (other nodes that it has to obtain information from for this transaction) has a higher commit point strength.

Either the node with the highest commit point strength directly referenced in the transaction, or one of its servers with a higher commit point strength becomes the commit point site. Figure 5 - 2 shows in a sample session tree the commit point strengths of each node (in parentheses) and shows the node chosen as the commit point site.

Figure 5 - 2. Commit Point Strengths and Determination of the Commit Point Site

The following conditions apply when determining the commit
point site:

A read-only node (a node which will not change its local data for the transaction) cannot be designated as the commit
point site.

If multiple nodes directly referenced by the global coordinator have the same commit point strength, Oracle7 will designate one of these nodes as the commit point site.

If a distributed transaction ends with a rollback, the prepare and commit phases are not needed, consequently a commit point site is never determined. Instead, the global coordinator sends a ROLLBACK statement to all nodes and ends the processing of the distributed transaction.

The commit point strength only determines the commit point site in a distributed transaction. Because the commit point site stores information about the status of the transaction, the commit point site should not be a node that is frequently unreliable or unavailable in case other nodes need information about the transaction's status.

As Figure 5 - 2 illustrates, the commit point site and the global coordinator can be different nodes of the session tree.

The commit point strengths of each nodes is communicated to the coordinator(s) when the initial connections are made. The coordinator(s) retain the commit point strengths of each node they are in direct communication with so that commit point sites can be efficiently selected during prepare/commits. Therefore, it is not necessary for the commit point strength to be exchanged between a coordinator and a node each time a commit occurs.

Specifying the Commit Point Strength of an Instance

Specify a commit point strength for each node that insures that the most critical server will be "non-blocking" if a failure occurs during a prepare/commit phase.

A node's commit point strength should relate to the estimated number of collisions that can result from data locked by in-doubt transactions. For example, mainframe-based database servers will probably have higher commit point strengths than minicomputer-based servers. In turn, minicomputer-based database servers will probably have higher commit point strengths than PC-based database servers. To determine each node's commit point strength, it will be necessary for all administrators of the distributed system to communicate and establish the appropriate values.

A node's commit point strength is set by the initialization parameter COMMIT_POINT_STRENGTH. The range of values is any integer from 0 to 255. For example, to set the commit point strength of a database to 200, include the following line in that database's parameter file:

The Scenario

A company that has separate Oracle7 servers, SALES.ACME.COM and WAREHOUSE.ACME.COM. As sales records are inserted into the SALES database, associated records are being updated at the WAREHOUSE database.

The Process

The following steps are carried out during a distributed update transaction that enters a sales order:

1. An application issues SQL statements.

At the Sales department, a salesperson uses a database application to enter, then commit a sales order. The application issues a number of SQL statements to enter the order into the SALES database and update the inventory in the WAREHOUSE database.

These SQL statements are all part of a single distributed transaction, guaranteeing that all issued SQL statements succeed or fail as a unit. This prevents the possibility of an order being placed but, inventory is not updated to reflect the order. In effect, the transaction guarantees the consistency of data in the
global database.

As each of the SQL statements in the transaction executes, the session tree is defined, as shown in Figure 5 - 3.

An order entry application running with the SALES database initiates the transaction. Therefore, SALES.ACME.COM is the global coordinator for the distributed transaction.

The order entry application inserts a new sales record into the SALES database and updates the inventory at the warehouse. Therefore, the nodes SALES.ACME.COM and WAREHOUSE.ACME.COM are both database servers. Furthermore, because SALES.ACME.COM updates the inventory, it is a client of WAREHOUSE.ACME.COM.

This completes the definition of the session tree for this
distributed transaction.

Remember that each node in the tree has acquired the necessary data locks to execute the SQL statements that reference local data. These locks remain even after the SQL statements have been executed until the prepare/commit phases are completed.

2. The application issues a COMMIT statement.

The final statement in the transaction that enters the sales order is now issued -- a COMMIT statement which begins the prepare/commit phases starting with the prepare phase.

3. The global coordinator determines the commit point site.

The commit point site is determined immediately following the COMMIT statement. SALES.ACME.COM, the global coordinator, is determined to be the commit point site, as shown in Figure 5 - 4.

See "Specifying the Commit Point Strength" for more information about how the commit point site is determined.

After the commit point site is determined, the global coordinator sends the prepare message to all directly referenced nodes of the session tree, excluding the commit point site. In this example, WAREHOUSE.ACME.COM is the only node asked to prepare.

WAREHOUSE.ACME.COM tries to prepare. If a node can guarantee that it can commit the locally dependent part of the transaction and can record the commit information in its local redo log, the node can successfully prepare.

In this example, only WAREHOUSE.ACME.COM receives a prepare message because SALES.ACME.COM is the commit point site (which does not prepare). WAREHOUSE.ACME.COM responds to SALES.ACME.COM with a prepared message.

As each node prepares, it sends a message back to the node that asked it to prepare. Depending on the responses, two things
can happen:

If any of the nodes asked to prepare respond with an abort message to the global coordinator, the global coordinator then tells all nodes to roll back the transaction, and the
process is completed.

If all nodes asked to prepare respond with a prepared or a read-only message to the global coordinator. That is, they have successfully prepared, the global coordinator asks the commit point site to commit the transaction.

SALES.ACME.COM, receiving acknowledgement that WAREHOUSE.ACME.COM is prepared, instructs the commit point site (itself, in this example) to commit the transaction. The commit point site now commits the transaction locally and records this fact in its local redo log.

Even if WAREHOUSE.ACME.COM has not committed yet, the outcome of this transaction is determined, that is, the transaction will be committed at all nodes even if the node's ability to commit is delayed.

6. The commit point site informs the global coordinator
of the commit.

The commit point site now tells the global coordinator that the transaction has committed. In this case, where the commit point site and global coordinator are the same node, no operation is required. The commit point site remembers it has committed the transaction until the global coordinator confirms that the transaction has been committed on all other nodes involved in the distributed transaction.

After the global coordinator has been informed of the commit at the commit point site, it tells all other directly referenced nodes to commit. In turn, any local coordinators instruct their servers to commit, and so on. Each node, including the global coordinator, commits the transaction and records appropriate redo log entries locally. As each node commits, the resource locks that were being held locally for that transaction are released.

Figure 5 - 6 illustrates Step 6 in this example. SALES.ACME.COM, both the commit point site and the global coordinator, has already committed the transaction locally. SALES now instructs WAREHOUSE.ACME.COM to commit the transaction.

Figure 5 - 6. The Global Coordinator and Other Servers Commit the Transaction

7. The global coordinator and commit point site complete the commit.

After all referenced nodes and the global coordinator have committed the transaction, the global coordinator informs the commit point site.

The commit point site, which has been waiting for this message, erases the status information about this distributed transaction and informs the global coordinator that it is finished. In other words, the commit point site forgets about committing the distributed transaction. This is acceptable because all nodes involved in the prepare/commit phases have committed the transaction successfully, and they will never have to determine its status in
the future.

After the commit point site informs the global coordinator that it has forgotten about the transaction, the global coordinator finalizes the transaction by forgetting about the transaction itself.

This completes the COMMIT phase and thus completes the distributed update transaction.

All of the steps described above are accomplished automatically and in a fraction of a second.

Coordination of System Change Numbers

Each committed transaction has an associated system change number (SCN) to uniquely identify the changes made by the SQL statements within that transaction. In a distributed system, the SCNs of communicating nodes are coordinated when:

A connection occurs using the path described by one or more database links.

A distributed SQL statement executes (the execute phase completes).

A distributed transaction commits.

Among other benefits, the coordination of SCNs among the nodes of a distributed system allows global distributed read-consistency at both the statement and transaction level. If necessary, global distributed time-based recovery can also be completed.

During the prepare phase, Oracle7 determines the highest SCN at all nodes involved in the transaction. The transaction then commits with the high SCN at the commit point site. The commit SCN is then sent to all prepared nodes with the commit decision.

Read-Only Distributed Transactions

There are three cases in which all or part of a distributed transaction is read-only:

A distributed transaction can be partially read-only if:

only queries are issued at one or more nodes

updates do not modify any records

updates rolled back due to violations of integrity constraints or triggers being fired

In each of these cases, the read-only nodes recognize this fact when they are asked to prepare. They respond to their respective local coordinators with a read-only message. By doing this, the commit phase completes faster because Oracle eliminates the read-only nodes from subsequent processing.

The distributed transaction can be completely read-only (no data changed at any node) and the transaction is notstarted with the SET TRANSACTION READ ONLY statement.

In this case, all nodes recognize that they are read-only during the prepare phase, and no commit phase is required. However, the global coordinator, not knowing whether all nodes are read-only, must still perform the operations involved in the prepare phase.

The distributed transaction can be completely read-only (all queries at all nodes) and the transaction is started with a
SET TRANSACTION READ ONLY statement. In this case, only queries are allowed in the transaction, and the global coordinator does not have to undertake a prepare/commit. Updates by other transactions do not degrade global transaction-level read consistency, because it is automatically guaranteed by coordination of SCNs at each node of the distributed system.

Limiting the Number of Distributed Transactions Per Node

The initialization parameter DISTRIBUTED_TRANSACTIONS controls the number of possible distributed transactions in which a given instance can concurrently participate, both as a client and a server. If this limit is reached and a subsequent user tries to issue a SQL statement referencing a remote database, the statement is rolled back and the following error message is returned:

ORA-2042: too many global transactions

For example, assume that DISTRIBUTED_TRANSACTIONS is set to 10 for a given instance. In this case, a maximum of ten sessions can concurrently be processing a distributed transaction. If an eleventh session attempts to issue a DML statement requiring distributed access, an error message is returned to the session, and the statement is
rolled back.

The database administrator should consider increasing the value of the initialization parameter DISTRIBUTED_TRANSACTIONS when an instance regularly participates in numerous distributed transactions and the above error message is frequently returned as a result of the current limit. Increasing the limit allows more users to concurrently issue distributed transactions.

If the DISTRIBUTED_TRANSACTIONS initialization parameter is set to zero, no distributed SQL statements can be issued in any session. Also, the RECO background process is not started at startup of the local instance. In-doubt distributed transactions that may be present (from a previous network or system failure) cannot be automatically resolved by Oracle7.

Therefore, only set this initialization parameter to zero to prevent distributed transactions when a new instance is started, and when it is certain that no in-doubt distributed transactions remained after the last instance shut down.

Troubleshooting Distributed Update Problems

A network or system failure can cause the following types of problems:

A prepare/commit being processed when a failure occurs may not be completed at all nodes of the session tree.

If a failure persists (for example, if the network is down for a long time), the data exclusively locked by in-doubt transactions is unavailable to statements of other transactions.

The following sections describe these situations.

Failures that Interrupt Prepare/Commit

The user program that commits a distributed transaction is informed of a problem by one of the following error messages:

ORA-02050: transaction ID rolled back,

some remote dbs may be in-doubt

ORA-02051: transaction ID committed,

some remote dbs may be in-doubt

ORA-02054: transaction ID in-doubt

A robust application should save information about a transaction if it receives any of the above errors. This information can be used later if manual distributed transaction recovery is desired.

Note: The failure cases that prompt these error messages are beyond the scope of this book and are unnecessary to administer the system.

No action is required by the administrator of any node that has one or more in-doubt distributed transactions due to a network or system failure. The automatic recovery features of Oracle7 transparently complete any in-doubt transaction so that the same outcome occurs on all nodes of a session tree (that is, all commit or all roll back) once the network or system failure is resolved.

However, in extended outages, the administrator may wish to force the commit or rollback of a transaction to release any locked data. Applications must account for such possibilities.

Failures that Prevent Access of Data

When a user issues a SQL statement, Oracle7 attempts to lock the required resources to successfully execute the statement. However, if the requested data is currently being held by statements of other uncommitted transactions and continues to remained locked for an excessive amount of time, a time-out occurs. Consider the following two scenarios.

Transaction Time-Out

A DML SQL statement that requires locks on a remote database may be blocked from doing so if another transaction (distributed or non-distributed) currently own locks on the requested data. If these locks continue to block the requesting SQL statement, a time-out occurs, the statement is rolled back, and the following error message is returned to the user:

ORA-02049: time-out: distributed transaction waiting for lock

Because no data has been modified, no actions are necessary as a result of the time-out. Applications should proceed as if a deadlock has been encountered. The user who executed the statement can try to re-execute the statement later. If the lock persists, the user should contact an administrator to report the problem.

The timeout interval in the above situation can be controlled with the initialization parameter DISTRIBUTED_LOCK_TIMEOUT. This interval is in seconds. For example, to set the time-out interval for an instance to 30 seconds, include the following line in the associated parameter file:

DISTRIBUTED_LOCK_TIMEOUT=30

With the above time-out interval, the time-out errors discussed in the previous section occur if a transaction cannot proceed after 30 seconds of waiting for unavailable resources.

Additional Information: For more information about initialization parameters and editing parameter files, see the Oracle7 Server Reference.

Lock From In-Doubt Transaction

A query or DML statement that requires locks on a local database may be blocked from doing so indefinitely due to the locked resources of an in-doubt distributed transaction. In this case, the following error message is immediately returned to the user:

ORA-01591: lock held by in-doubt distributed transaction ID

In this case, the SQL statement is rolled back immediately. The user who executed the statement can try to re-execute the statement later. If the lock persists, the user should contact an administrator to report the problem, including the ID of the in-doubt distributed transaction.

The chances of the above situations occurring are very rare, considering the low probability of failures during the critical portions of the prepare/commit phases. Even if such a failure occurs and assuming quick recovery from a network or system failure, problems are automatically resolved without manual intervention. Thus problems usually resolve before they can be detected by users or
database administrators.

Manually Overriding In-Doubt Transactions

A database administrator can manually force the COMMIT or ROLLBACK of a local in-doubt distributed transaction. However, a specific in-doubt transaction should be manually overridden onlywhen the following situations exist:

The in-doubt transaction locks data that is required by other transactions. This happens if users complain that the ORA-01591 error message interferes with their transactions.

An in-doubt transaction prevents the extents of a rollback segment to be used by other transactions. The first portion of an in-doubt distributed transaction's local transaction ID corresponds to the ID of the rollback segment, as listed by the data dictionary views DBA_2PC_PENDING and DBA_ROLLBACK_SEGS.

The failure that did not allow the prepare/commit phases to complete will not be corrected in an acceptable time period. Examples of such cases might include a telecommunication network that has been damaged or a damaged database that needs a substantial amount of time to complete recovery.

Normally, a decision to locally force an in-doubt distributed transaction should be made in consultation with administrators at other locations. A wrong decision can lead to database inconsistencies which can be difficult to trace and that you must manually correct.

If the conditions above do not apply, always allow the automatic recovery features of Oracle7 to complete the transaction. However, if any of the above criteria are met, the administrator should consider a local override of the in-doubt transaction. If a decision is made to locally force the transaction to complete, the database administrator should analyze available information with the following goals in mind:

Try to find a node that has either committed or rolled back the transaction. If you can find a node that has already resolved the transaction, you can follow the action taken at that node.

See if any information is given in the TRAN_COMMENT column of DBA_2PC_PENDING for the distributed transaction. Comments are included in the COMMENT parameter of the COMMIT command. For example, an in-doubt distributed transaction's comment might indicate the origin of the transaction and what type of transaction it is:

COMMIT COMMENT 'Finance/Accts_pay/Trans_type 10B';

See if any information is given in the ADVICE column of DBA_2PC_PENDING for the distributed transaction. An application can prescribe advice about whether to force the commit or force the rollback of separate parts of a distributed transaction with the ADVISE parameter of the SQL command ALTER SESSION.

The advice sent during the prepare phase to each node is the advice in effect at the time the most recent DML statement executed at that database in the current transaction.

For example, consider a distributed transaction that moves an employee record from the EMP table at one node to the EMP table at another node. The transaction could protect the record (even when administrators independently force the in-doubt transaction at each node) by including the following sequence of SQL statements:

ALTER SESSION ADVISE COMMIT;

INSERT INTO emp@hq ... ; /*advice to commit at HQ */

ALTER SESSION ADVISE ROLLBACK;

DELETE FROM emp@sales ... ; /*advice to roll back at SALES*/

ALTER SESSION ADVISE NOTHING;

If you manually force the in-doubt transaction, the worst that can happen is that each node has a copy of the employee record being moved; the record cannot disappear.

Manual Override Example

The following example shows a failure during the commit of a distributed transaction and how to go about gaining information before manually forcing the commit or rollback of the local portion of an in-doubt distributed transaction. Figure 5 - 7 illustrates the example.

In this failure case, the prepare phase completed. However, during the commit phase, the commit point site's commit message (the message telling the global coordinator that the transaction was committed at the commit point site) never made it back to the global coordinator, even though the commit point site committed the transaction.

You are the WAREHOUSE database administrator. The inventory data locked because of the in-doubt transaction is critical to other transactions. However, the data cannot be accessed because the locks must be held until the in-doubt transaction either commits or rolls back. Furthermore, you understand that the communication link between sales and headquarters cannot be resolved immediately. Therefore, you decide to manually force the local portion of the in-doubt transaction using the following steps:

1. Record user feedback.

2. Query the local DBA_2PC_PENDING view to obtain the global transaction ID and get other information about the
in-doubt transaction.

3. Query the local DBA_2PC_NEIGHBORS view to begin tracing the session tree so that you can find a node that resolved the
in-doubt transaction.

Step 1: Record User Feedback

The users of the local database system that conflict with the locks of the in-doubt transaction get the following error message:

ORA-01591: lock held by in-doubt distributed transaction 1.21.17

Here, 1.21.17 is the local transaction ID of the in-doubt distributed transaction in this example. The local database administrator should request and record this ID number from the users that report problems to identify in-doubt transactions that should be forced.

Step 2: Query DBA_2PC_PENDING

Query the local DBA_2PC_PENDING (see also page 5 - 32) to gain information about the in-doubt transaction:

SELECT * FROM sys.dba_2pc_pending

WHERE local_tran_id = '1.21.17';

For example, when the previous query is issued at WAREHOUSE, the following information is returned.

Column Name Value

---------------------- --------------------------------------

LOCAL_TRAN_ID 1.21.17

GLOBAL_TRAN_ID SALES.ACME.COM.55d1c563.1.93.29

STATE prepared

MIXED no

ADVICE

TRAN_COMMENT Sales/New Order/Trans_type 10B

FAIL_TIME 31-MAY-91

FORCE_TIME

RETRY_TIME 31-MAY-91

OS_USER SWILLIAMS

OS_TERMINAL TWA139:

HOST system1

DB_USER SWILLIAMS

COMMIT#

Figure 5 - 8. Results of Querying DBA_2PC_PENDING

The global transaction ID is the common transaction ID that is the same on every node for a distributed transaction. It is of the form:

global_database_name.hhhhhhhh.local_transaction_id

Here, global_database_name is the database name of the global coordinator (where the transaction originates), hhhhhhhh is an internal database ID at the global coordinator (8 hexadecimal digits), and local_tran_id is the corresponding local transaction ID assigned on the global coordinator. Therefore, the last portion of the global transaction ID and the local transaction ID match at the global coordinator. In the example, you can tell that WAREHOUSE is not the global coordinator because these numbers do not match.

The transaction on this node is in a prepared state. Therefore, WAREHOUSE awaits its coordinator to send either a commit or a rollback message.

The transaction's comment or advice may include information about this transaction. If so, use this comment to your advantage. In this example, the origin (the sales order entry application) and transaction type is in the transaction's comment. This information may reveal something that would help you decide whether to commit or rollback the local portion of the transaction.

If useful comments do not accompany an in-doubt transaction, you must complete some extra administrative work to trace the session tree and find a node that has resolved the transaction.

Step 3: Query DBA_2PC_NEIGHBORS

The purpose of this step is to climb the session tree so that you find coordinators, eventually reaching the global coordinator. Along the way, you might find a coordinator that has resolved the transaction. If not, you can eventually work your way to the commit point site, which will always have resolved the in-doubt transaction.

The DBA_2PC_NEIGHBORS view provides information about connections associated with an in-doubt transaction. Information for each connection is different, based on whether the connection is inbound or outbound:

If the connection is inbound, your node is subordinate (a server of) another node. In this case, the DATABASE column lists the name of the client database that connected to your node, and the DBUSER_OWNER column lists the local account for the database link connection that corresponds to the
in-doubt transaction.

If the connection is outbound, your node is a client of other servers. In this case, the DATABASE column lists the name of the database link that connects to the remote node. The DBUSER_OWNER column lists the owner of the database link for the in-doubt transaction.

Additionally, the INTERFACE column tells whether the local node or a subordinate node is the commit point site.

To trace the session tree, you can query the local DBA_2PC_NEIGHBORS view. In this case, you query this view on the WAREHOUSE database.

SELECT * FROM sys.dba_2pc_neighbors

WHERE local_tran_id = '1.21.17'

ORDER BY sess#, in_out;

Column Name Value

---------------------- --------------------------------------

LOCAL_TRAN_ID 1.21.17

IN_OUT in

DATABASE SALES.ACME.COM

DBUSER_OWNER SWILLIAMS

INTERFACE N

DBID 000003F4

SESS# 1

BRANCH 0100

The columns of particular interest in this view are the IN_OUT, DATABASE, DBUSER_OWNER, and INTERFACE columns. In this example, the IN_OUT column reveals that the WAREHOUSE database is a server for the SALES database, as specified in the DATABASE column. The connection to WAREHOUSE was established through a database link from the SWILLIAMS account, as shown by the DB_OWNER column, and WAREHOUSE, nor any of its descendants, was the commit point site, as shown by the INTERFACE column.

At this point, you can contact the administrator at the located nodes and ask them to repeat Steps 2 and 3, using the global transaction ID.

Note: If you can directly connect to these nodes with another network, you can repeat Steps 2 and 3 yourself.

For example, the following results are returned when Steps 2 and 3 are performed at SALES and HQ, respectively.

Manually Checking the Status of Pending Transactions at SALES.ACME.COM

SELECT * FROM sys.dba_2pc_pending

WHERE global_tran_id = 'SALES.ACME.COM.55d1c563.1.93.29';

Column Name Value

---------------------- --------------------------------------

LOCAL_TRAN_ID 1.93.29

GLOBAL_TRAN_ID SALES.ACME.COM.55d1c563.1.93.29

STATE prepared

MIXED no

ADVICE

TRAN_COMMENT Sales/New Order/Trans_type 10B

FAIL_TIME 31-MAY-91

FORCE_TIME

RETRY_TIME 31-MAY-91

OS_USER SWILLIAMS

OS_TERMINAL TWA139:

HOST system1

DB_USER SWILLIAMS

COMMIT#

SELECT * FROM dba_2pc_neighbors

WHERE global_tran_id = 'SALES.ACME.COM.55d1c563.1.93.29'

ORDER BY sess#, in_out;

At SALES, there are three rows for this transaction (one for the connection to WAREHOUSE, one for the connection to HQ, and one for the connection established by the user). Information corresponding to the rows for the SALES and HQ connections is listed below:

Column Name Value

---------------------- --------------------------------------

LOCAL_TRAN_ID 1.93.29

IN_OUT OUT

DATABASE WAREHOUSE.ACME.COM

DBUSER_OWNER SWILLIAMS

INTERFACE N

DBID 55d1c563

SESS# 1

BRANCH 1

LOCAL_TRAN_ID 1.93.29

IN_OUT OUT

DATABASE HQ.ACME.COM

DBUSER_OWNER ALLEN

INTERFACE C

DBID 00000390

SESS# 1

BRANCH 1

The information from the previous query reveals several facts:

SALES is the global coordinator because the local transaction ID and global transaction ID match. Also, notice that two outbound connections are established from this node, but no inbound links (this node is not a server of another node).

HQ or one of its servers (none in this example) is the commit point site.

Manually Checking the Status of Pending Transactions at HQ.ACME.COM:

SELECT * FROM dba_2pc_pending

WHERE global_tran_id = 'SALES.ACME.COM.55d1c563.1.93.29';

Column Name Value

---------------------- --------------------------------------

LOCAL_TRAN_ID 1.45.13

GLOBAL_TRAN_ID SALES.ACME.COM.55d1c563.1.93.29

STATE COMMIT

MIXED NO

ACTION

TRAN_COMMENT Sales/New Order/Trans_type 10B

FAIL_TIME 31-MAY-91

FORCE_TIME

RETRY_TIME 31-MAY-91

OS_USER SWILLIAMS

OS_TERMINAL TWA139:

HOST SYSTEM1

DB_USER SWILLIAMS

COMMIT# 129314

At this point, you have found a node that resolved the transaction. It has been committed. Therefore, you can force the in-doubt transaction to commit at your local database (see the following section for information on manually committing or rolling back in-doubt transactions). It is a good idea to contact any other administrators you know that could also benefit from your investigation.

Step 4: Check for Mixed Outcome

After you manually force a transaction to commit or roll back, the corresponding row in the pending transaction table remains. The STATE of the transaction is changed to forced commit or forced abort, depending on how you forced the transaction.

Furthermore, once connections between the instances resume, RECO checks the global outcome of the transaction. The MIXED column is changed to yes and the row for the transaction is not deleted if you forced the transaction the wrong way.

If you ever see a transaction forced the wrong way, you should be aware that some global data inconsistency may exist. Eventually, you can purge unnecessary rows from the pending transaction table.

The Pending Transaction Table (DBA_2PC_PENDING)

Every Oracle7 database has a pending transaction table which is a special table that stores information about distributed transactions as they proceed through the prepare/commit phases. You can query a database's pending transaction table by referencing the DBA_2PC_PENDING data dictionary view.

Each transaction with an entry in the pending transaction table is classified in one of the following categories (as indicated in DBA_2PC_PENDING.STATE):

collecting

This category normally applies only to the global coordinator or local coordinators. The node is currently collecting information from other database servers before it can decide whether
it can prepare.

prepared

The node has prepared and may or may not have acknowledged this to its local coordinator with a prepared message. However, no commit message has been received. The node remains prepared, holding any local resource locks necessary for the transaction to commit.

committed

The node (any type) has committed the transaction, but other nodes involved in the transaction may not have done the same. That is, the transaction is still pending at one or more nodes.

forced commit

A pending transaction can be forced to commit at the discretion of a database administrator. This entry occurs if a transaction is manually committed at a local node by a database administrator.

forced abort (rollback)

A pending transaction can be forced to roll back at the discretion of a database administrator. This entry occurs if this transaction is manually rolled back at a local node by a database administrator.

Also of particular interest in the pending transaction table is the mixed outcome flag (as indicated in DBA_2PC_PENDING.MIXED). The database administrator can make the wrong choice if a pending transaction is forced to commit or roll back (for example, the local administrator rolls back the transaction, but the other nodes commit it). Incorrect decisions are detected automatically, and the damage flag for the corresponding pending transaction's record is set (MIXED=yes).

The RECO (Recoverer) background process uses the information in the pending transaction table to finalize the status of in-doubt transactions. The information in the pending transaction table can also be used by a database administrator, who decides to manually override the automatic recovery procedures for pending distributed transactions.

All transactions automatically resolved by RECO are automatically removed from the pending transaction table. Additionally, all information about in-doubt transactions correctly resolved by an administrator (as checked when RECO reestablishes communication) are automatically removed from the pending transaction table. However, all rows resolved by an administrator that result in a mixed outcome across nodes remain in the pending transaction table of all involved nodes until they are manually deleted.

Manually Committing In-Doubt Transactions

The local database administrator has two ways to manually force an in-doubt transaction to commit. The DBA can use the Server Manager Transaction Object List option Force Commit or the SQL command COMMIT with the FORCE option and a text string, indicating either the local or global transaction ID of the in-doubt transaction to commit. Figure 5 - 9 shows the Server Manager Transaction Object List.

Forcing a Commit or Rollback in Server Manager

To commit an in-doubt transaction, select the transaction from the Transaction Object List and choose Force Commit from the Transaction menu.

To roll back an in-doubt transaction, select the transaction from the Transaction Object List and choose Force Rollback from the Transaction menu.

Attention: You cannot roll back an in-doubt transaction to a savepoint.

Manually Committing or Rolling Back In-Doubt Transactions

The following SQL statement is the command equivalent of the action taken in Figure 5 - 9 to commit an in-doubt transaction.

COMMIT FORCE 'transaction_name';

To manually rollback an in-doubt transaction, use the SQL command ROLLBACK with the FORCE option and a text string, indicating either the local or global transaction ID of the in-doubt transaction to rollback. For example, to rollback the in-doubt transaction with the local transaction ID of 2.9.4, use the following statement:

ROLLBACK FORCE '2.9.4';

Attention: You cannot roll back an in-doubt transaction to a savepoint.

To manually force the commit or rollback of an in-doubt transaction issued by yourself, you must have been granted the FORCE TRANSACTION system privilege. To force the commit or rollback of another user's distributed transaction, you must have the FORCE ANY TRANSACTION system privilege. Both privileges can be obtained either explicitly or via a role.

Note: Forcing the commit or rollback of an in-doubt distributed transaction does not affect the status of the operator's current transaction.

Forcing Rollback/Commit on the Local Pending Transaction Table

In all examples, the transaction is committed or rolled back on the local node, and the local pending transaction table records a value of forced commit or forced abort for the STATE column of this transaction's row.

Specifying the SCN

Optionally, you can specify the SCN for the transaction when forcing a transaction to commit. This feature allows you to commit an in-doubt transaction with the SCN assigned when it was committed at other nodes. Thus you maintain the synchronized commit time of the distributed transaction even if there is a failure. Specify an SCN only when you can determine the SCN of the same transaction already committed at another node.

For example, assume you want to manually commit a transaction with the global transaction ID global_id. First, query the DBA_2PC_PENDING view of a remote database also involved with the transaction in question. Note the SCN used for the commit of the transaction at that node. Specify the SCN (a decimal number) when committing the transaction at the local node. For example, if the SCN were 829381993, you would use the command:

COMMIT FORCE 'global_id', 829381993;

Changing Connection Hold Time

If a distributed transaction fails, the connection from the local site to the remote site may not close immediately. Instead, it remains open in case communication can be restored quickly, without having to re-establish the connection. You can set the length of time that the connection remains open with the database parameter DISTRIBUTED_RECOVERY_CONNECTION_HOLD_TIME.

A high value minimizes the cost of reconnecting after failures, but causes the local database to consume more resources. In contrast, a lower value minimizes the cost of resources kept locked during a failure, but increases the cost of reconnecting after failures. The default value of the parameter is 200 seconds. See the Oracle7 Server Reference for more information.

Setting a Limit on Distributed Transactions

The database parameter DISTRIBUTED_TRANSACTIONS sets a maximum on the number of distributed transactions in which a database can participate. You should increase the value of this parameter if your database is part of many distributed transactions. The default value is operating system-specific.

In contrast, if your site is experiencing an abnormally high number of network failures, you can temporarily decrease the value of this parameter. Doing so limits the number of in-doubt transactions in which your site takes part, and thereby limits the amount of locked data at your site, and the number of in-doubt transactions you might have to resolve.

Testing Distributed Transaction Recovery Features

If you like, you can force the failure of a distributed transaction to observe RECO, automatically resolving the local portion of the transaction. Alternatively, you might be interested in forcing a distributed transaction to fail so that you can practice manually resolving in-doubt distributed transactions and observing the results.

The following sections describes the features available and the steps necessary to perform such operations.

Forcing a Distributed Transaction to Fail

Comments can be included in the COMMENT parameter of the COMMIT statement. To intentionally induce a failure during the prepare/commit phases of a distributed transaction, include the following comment in the COMMENT parameter:

COMMIT COMMENT 'ORA-2PC-CRASH-TEST-n';

where n is one of the following integers:

n

Effect

1

Crash commit point site after collect

2

Crash non-commit point site after collect

3

Crash before prepare (non-commit point site)

4

Crash after prepare (non-commit point site)

5

Crash commit point site before commit

6

Crash commit point site after commit

7

Crash non-commit point site before commit

8

Crash non-commit point site after commit

9

Crash commit point site before forget

10

Crash non-commit point site before forget

Table 5 - 1. Failure Values for the Parameter COMMENT

For example, the following statement returns the following messages if the local commit point strength is greater than the remote commit point strength and both nodes are updated:

COMMIT COMMENT 'ORA-2PC-CRASH-TEST-7';

ORA-02054: transaction #.##.## in-doubt

ORA-02059: ORA-CRASH-TEST-n in commit comment

At this point, the in-doubt distributed transaction appears in the DBA_2PC_PENDING view. If enabled, RECO automatically resolves the transaction rather quickly.

Privileges Required to Induce Prepare/Commit Phase Failures

You can induce prepare/commit phase failures via the previous comments only if the local and remote sessions have the FORCE ANY TRANSACTION system privilege. Otherwise, an error is returned if you attempt to issue a COMMIT statement with a crash comment.

The Recoverer (RECO) Background Process

The RECO background process of an Oracle7 instance automatically resolves failures involving distributed transactions. At exponentially growing time intervals, the RECO background process of a node attempts to recover the local portion of an in-doubt
distributed transaction.

RECO can use an existing connection or establish a new connection to other nodes involved in the failed transaction. When a connection is established, RECO automatically resolves all in-doubt transactions. Rows corresponding to any resolved in-doubt transactions are automatically removed from each database's pending transaction table.

Disabling and Enabling RECO

The recoverer background process, RECO, can be enabled and disabled using the ALTER SYSTEM command with the ENABLE/DISABLE DISTRIBUTED RECOVERY options, respectively. For example, you might want to temporarily disable RECO to force the failure of a prepare/commit and manually resolve the in-doubt transaction. The following statement disables RECO:

ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;

Alternatively, the following statement enables RECO so that in-doubt transactions are automatically resolved:

ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY;

Note: Single-process instances (for example, a PC running MS-DOS) have no separate background processes, and therefore no RECO process. Therefore, when a single-process instance that participates in a distributed system is started, distributed recovery must be manually enabled using the statement above.