The Redo Log and the Capture Process

Every Oracle database has a set of two or more redo log files. The redo log files for a database are collectively known as the database's redo log. The primary function of the redo log is to record all changes made to the database.

Redo logs are used to guarantee recoverability in the event of human error or media failure. A capture process is an optional Oracle background process that reads the database redo log to capture DML and DDL changes made to database objects. When a capture process is configured to capture changes from a redo log, the database where the changes were generated is called the source database.

Logical Change Records (LCRs)

A capture process reformats changes captured from the redo log into LCRs. An LCR is an object with a specific format that describes a database change. A capture process captures two types of LCRs: row LCRs and DDL LCRs.

After capturing an LCR, a capture process enqueues an event containing the LCR into a queue. A capture process is always associated with a single SYS.AnyData queue, and it enqueues events into this queue only. You can create multiple queues and associate a different capture process with each queue. Figure 2-1 shows a capture process capturing LCRs.

Note:

A capture process can be associated only with a SYS.AnyData queue, not with a typed queue.

Figure 2-1 The Capture Process

Row LCRs

A row LCR describes a change to the data in a single row or a change to a single LOB column in a row. The change results from a data manipulation language (DML) statement or a piecewise update to a LOB. For example, a DML statement may insert or merge multiple rows into a table, may update multiple rows in a table, or may delete multiple rows from a table. So, a single DML statement can produce multiple row LCRs. That is, a capture process creates an LCR for each row that is changed by the DML statement. Further, the DML statement itself may be part of a transaction that includes many DML statements.

A captured row LCR may also contain transaction control statements. These row LCRs contain directives such as COMMIT and ROLLBACK. These row LCRs are internal and are used by an apply process to maintain transaction consistency between a source database and a destination database.

Each row LCR contains the following information:

The name of the source database where the row change occurred

The type of DML statement that produced the change: INSERT, UPDATE, DELETE, LOB ERASE, LOB WRITE, or LOB TRIM

The schema name that contains the table with the changed row

The name of the table that contains the changed row

A raw tag that can be used to track the LCR

The identifier of the transaction in which the DML statement was run

The system change number (SCN) when the change was written to the redo log

The old values related to the change. If the type of the DML statement is UPDATE or DELETE, then these old values include some or all of the columns in the changed row before the DML statement. If the type of the DML statement is INSERT, then there are no old values.

The new values related to the change. If the type of the DML statement is UPDATE or INSERT, then these new values include some or all of the columns in the changed row after the DML statement. If the type of the DML statement is DELETE, then there are no new values.

DDL LCRs

A DDL LCR describes a data definition language (DDL) change. A DDL statement changes the structure of the database. For example, a DDL statement may create, alter, or drop a database object.

Each DDL LCR contains the following information:

The name of the source database where the DDL change occurred

The type of DDL statement that produced the change, for example ALTER TABLE or CREATE INDEX

The schema name of the user who owns the database object on which the DDL statement was run

The name of the database object on which the DDL statement was run

The type of database object on which the DDL statement was run, for example TABLE or PACKAGE

The text of the DDL statement

The logon user, which is the user whose session executed the DDL statement

The schema that is used if no schema is specified for an object in the DDL text

The base table owner. If the DDL statement is dependent on a table, then the base table owner is the owner of the table on which it is dependent.

The base table name. If the DDL statement is dependent on a table, then the base table name is the name of the table on which it is dependent.

A raw tag that can be used to track the LCR

The identifier of the transaction in which the DDL statement was run

The SCN when the change was written to the redo log

Note:

Both row LCRs and DDL LCRs contain the source database name of the database where a change originated. If captured LCRs will be propagated by a propagation or applied by an apply process, then, to avoid propagation and apply problems, Oracle Corporation recommends that you do not rename the source database after a capture process has started capturing changes.

Capture Rules

A capture process captures changes based on rules that you define. Each rule specifies the database objects for which the capture process captures changes and the types of changes to capture. You can specify capture rules at the following levels:

A table rule captures either DML or DDL changes to a particular table.

A schema rule captures either DML or DDL changes to the database objects in a particular schema.

A global rule captures either all DML or all DDL changes in the database.

The capture process does not capture certain types of changes and changes to certain datatypes in table columns. Also, a capture process never captures changes in the SYS and SYSTEM schemas.

Datatypes Captured

When capturing changes made to tables, a capture process can capture changes made to columns of the following datatypes:

CHAR

VARCHAR2

NCHAR

NVARCHAR2

NUMBER

DATE

CLOB

BLOB

RAW

TIMESTAMP

TIMESTAMP WITH TIME ZONE

TIMESTAMP WITH LOCAL TIME ZONE

INTERVAL YEAR TO MONTH

INTERVAL DAY TO SECOND

A capture process does not capture DML changes in columns of the following datatypes: NCLOB, LONG, LONG RAW, BFILE, ROWID, and UROWID, and user-defined types (including object types, REFs, varrays, and nested tables). If a capture process finds a change that satisfies one of its rules to a table containing a column with a datatype that is not listed above, then the capture process raises an error. When the capture process raises this error, it writes the LCR that caused the error into its trace file, raises an ORA-00902 error, and becomes disabled.

Types of Changes Captured

A capture process can capture only certain types of changes made to a database and its objects. The following sections describe the types of DML and DDL changes that can be captured. A capture process ignores changes that it cannot capture.

Note:

A capture process never captures changes in the SYS and SYSTEM schemas.

If you share a sequence at multiple databases, then sequence values used for individual rows at these databases may vary. Also, changes to actual sequence values are not captured. For example, if a user references the NEXTVAL pseudocolumn of a sequence or alters the sequence, then a capture process does not capture changes resulting from these operations.

Types of DDL Changes Ignored by a Capture Process

A capture process captures the DDL changes that satisfy the rules in the capture process rule set, except for the following types of DDL changes:

ALTER DATABASE

CREATE CONTROLFILE

CREATE DATABASE

CREATE PFILE

CREATE SPFILE

A capture process captures DDL statements that satisfy the rules in the capture process rule set, but not the results of these DDL statements, unless the DDL statement is a CREATE TABLE ... AS SELECT statement. For example, when a capture process captures an ANALYZE statement, it does not capture the statistics generated by the ANALYZE statement. However, when a capture process captures a CREATE TABLE ... AS SELECT statement, it captures the statement itself and all of the rows selected (as INSERT row LCRs).

Some types of DDL changes that are captured by a capture process cannot be applied by an apply process. If an apply process receives a DDL LCR that specifies an operation that cannot be applied, then the apply process ignores the DDL LCR and records information about it in the trace file for the apply process.

Other Types of Changes Ignored by a Capture Process

Online table redefinition using the DBMS_REDEFINITION package is not supported on a table or schema for which a capture process captures changes.

NOLOGGING and UNRECOVERABLE Keywords for SQL Operations

If you use the NOLOGGING or UNRECOVERABLE keyword for a SQL operation, then the changes resulting from the SQL operation cannot be captured by a capture process. Therefore, if the changes resulting from a SQL operation should be captured by a capture process, then do not use these keywords.
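
For example, a NOLOGGING operation along the following lines escapes capture; this is a minimal sketch, and the schema and table names are illustrative:

    -- The rows loaded by the subquery generate minimal redo, so a
    -- capture process cannot capture these changes.
    CREATE TABLE hr.employees_copy NOLOGGING AS
      SELECT * FROM hr.employees;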

If the object for which you are specifying the logging attributes resides in a database or tablespace in FORCE LOGGING mode, then Oracle ignores any NOLOGGING or UNRECOVERABLE setting until the database or tablespace is taken out of FORCE LOGGING mode. You can determine the current logging mode for a database by querying the FORCE_LOGGING column in the V$DATABASE dynamic performance view.

Note:

The UNRECOVERABLE keyword is deprecated and has been replaced with the NOLOGGING keyword in the logging_clause. Although UNRECOVERABLE is supported for backward compatibility, Oracle Corporation strongly recommends that you use the NOLOGGING keyword when appropriate.

See Also:

Oracle9i SQL Reference for more information about the NOLOGGING and UNRECOVERABLE keywords, FORCE LOGGING mode, and the logging_clause

UNRECOVERABLE Clause for Direct Path Loads

If you use the UNRECOVERABLE clause in the SQL*Loader control file for a direct path load, then the changes resulting from the direct path load cannot be captured by a capture process. Therefore, if the changes resulting from a direct path load should be captured by a capture process, then do not use the UNRECOVERABLE clause.

If you perform a direct path load without logging changes at a source database, but you do not perform a similar direct path load at the destination databases of the source database, then apply errors may result at these destination databases when changes are made to the loaded objects at the source database. In this case, a capture process at the source database can capture changes to these objects, and one or more propagations can propagate the changes to the destination databases, but these objects may not exist at the destination databases, or they may exist there without the rows related to these changes.

Therefore, if you use the UNRECOVERABLE clause for a direct path load and a capture process is configured to capture changes to the loaded objects, then make sure any destination databases contain the loaded objects and the loaded data to avoid apply errors. One way to make sure that these objects exist at the destination databases is to perform a direct path load at each of these destination databases that is similar to the direct path load performed at the source database.

If you load objects into a database or tablespace that is in FORCE LOGGING mode, then Oracle ignores any UNRECOVERABLE clause during a direct path load, and the loaded changes are logged. You can determine the current logging mode for a database by querying the FORCE_LOGGING column in the V$DATABASE dynamic performance view.

Supplemental Logging in a Streams Environment

Supplemental logging places additional column data into a redo log whenever an UPDATE operation is performed. Such updates include piecewise updates to LOBs. The capture process captures this additional information and places it in LCRs.

There are two types of supplemental logging: database supplemental logging and table supplemental logging. Database supplemental logging specifies supplemental logging for an entire database, while table supplemental logging enables you to specify log groups for supplemental logging for a particular table. If you use table supplemental logging, then you can choose between unconditional and conditional log groups.

Unconditional log groups log the before images of specified columns any time the table is updated, regardless of whether the update affected any of the specified columns. This is sometimes referred to as an ALWAYS log group. Conditional log groups log the before images of all specified columns only if at least one of the columns in the log group is updated.
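
For example, statements along the following lines create one unconditional and one conditional log group; the log group names are illustrative, and the hr.employees columns are assumed:

    -- Unconditional (ALWAYS) log group: the before image of employee_id
    -- is logged for every update to the table.
    ALTER TABLE hr.employees
      ADD SUPPLEMENTAL LOG GROUP log_group_emp_pk (employee_id) ALWAYS;

    -- Conditional log group: before images of first_name and last_name
    -- are logged only when at least one column in the group is updated.
    ALTER TABLE hr.employees
      ADD SUPPLEMENTAL LOG GROUP log_group_emp_name (first_name, last_name);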

Supplemental logging at the database level, unconditional log groups at the table level, and conditional log groups at the table level together determine which old values are logged for an UPDATE statement or piecewise LOB update.

If you plan to use one or more apply processes to apply LCRs captured by a capture process, then you must enable supplemental logging at the source database for the following types of columns in tables at the destination database:

Any columns at the source database that are used in a primary key in tables for which changes are applied at a destination database must be unconditionally logged in a log group or by database supplemental logging of primary key columns.

If the parallelism of any apply process that will apply the changes is greater than 1, then any unique constraint at a destination database that comes from multiple columns at the source database must be conditionally logged. Supplemental logging need not be specified if the unique constraint comes from a single column at the source database.

If the parallelism of any apply process that will apply the changes is greater than 1, then any foreign key constraint at a destination database that comes from multiple columns at the source database must be conditionally logged. Supplemental logging need not be specified if the foreign key comes from a single column at the source database.

Any columns at the source database that are used in substitute key columns for an apply process at a destination database must be unconditionally logged. You specify substitute key columns for a table using the SET_KEY_COLUMNS procedure in the DBMS_APPLY_ADM package.

The columns specified in a column list for conflict resolution during apply must be conditionally logged if more than one column at the source database is used in the column list at the destination database.

Any columns at the source database that are used by a DML handler or error handler specified for update operations or piecewise updates to LOBs at a destination database must be unconditionally logged.

Any columns at the source database that are used by a rule or a rule-based transformation must be unconditionally logged.

If you specify row subsetting for a table at a destination database, then any columns at the source database that are in the destination table or columns at the source database that are in the subset condition must be unconditionally logged. You specify a row subsetting condition for an apply process using the dml_condition parameter in the ADD_SUBSET_RULES procedure in the DBMS_STREAMS_ADM package.

If you do not use supplemental logging for these types of columns at a source database, then changes involving these columns might not apply properly at a destination database.
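
For example, the primary key requirement in the first item above can be met with database supplemental logging of primary key columns; a minimal sketch:

    -- Unconditionally log the before images of all primary key columns
    -- for every updated row in the database.
    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;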

Note:

LOBs, LONGs, and user-defined types cannot be part of a supplemental log group.

See Also:

"Column Lists" for more information about supplemental logging and column lists

Instantiation

If you plan to use a capture process to replicate changes made to a database object from a source database to a destination database, then the destination database must have a copy of the database object. If the copy does not exist at the destination database, then you must instantiate the object at the destination database before you can replicate changes. To instantiate an object means to create an object physically at a destination database based on an object at a source database. If the object is a table, then the tables at the source and destination databases need not match exactly, but, if some or all of the table data is replicated between the two databases, then the data that is replicated should be consistent when the table is instantiated. Typically, instantiation is done using export/import.

In a Streams environment that shares a database object within a single database or between multiple databases, a source database is the database where changes to the object are generated in the redo log. If a capture process captures or will capture such changes and the changes will be applied locally or propagated to other databases and applied at destination databases, then you may need to instantiate source database objects at destination databases. In any event, you must always prepare the object for instantiation. By preparing an object for instantiation, you are setting the earliest SCN for which changes to the object may need to be applied at the destination database.

The following procedures in the DBMS_CAPTURE_ADM package prepare database objects for instantiation:

PREPARE_TABLE_INSTANTIATION prepares a single table for instantiation.

PREPARE_SCHEMA_INSTANTIATION prepares for instantiation all of the database objects in a schema and all database objects added to the schema in the future.

PREPARE_GLOBAL_INSTANTIATION prepares for instantiation all of the objects in a database and all objects added to the database in the future.

These procedures record the lowest SCN of each object for instantiation. SCNs subsequent to the lowest SCN for an object can be used for instantiating the object. These procedures also populate the Streams data dictionary for the relevant capture processes, propagations, and apply processes that capture, propagate, or apply changes to the table, schema, or database being prepared for instantiation.
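
For example, a call along the following lines prepares a single table for instantiation; the schema and table names are illustrative:

    BEGIN
      DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(
        table_name => 'hr.employees');
    END;
    /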

Whenever you add or modify the condition of a capture or propagation rule for an object, you must run the appropriate procedure to prepare the object for instantiation at the source database if any of the following conditions are met:

One or more rules are added to the rule set for a capture process that instruct the capture process to capture changes made to the object. When you use the DBMS_STREAMS_ADM package to add rules to a rule set for a capture process, the appropriate procedure to prepare for instantiation is run automatically at the source database. When you use the DBMS_RULE_ADM package to add these rules, you must prepare for instantiation manually.

One or more conditions of rules in the rule set for a capture process are modified to instruct the capture process to capture changes made to the object.

One or more rules are added to the rule set for a propagation that instruct the propagation to propagate changes made to the object.

One or more conditions of rules in the rule set for a propagation are modified to instruct the propagation to propagate changes made to the object.

When any of these conditions are met, you must prepare these database objects for instantiation at the source database to populate any relevant Streams data dictionary that requires information about the source object, even if the object already exists at a remote database where the rules were added or changed.

The relevant Streams data dictionaries are populated asynchronously for both the local dictionary and all remote dictionaries. The procedure that prepares for instantiation adds information to the redo log at the source database. The local Streams data dictionary is populated with the information about the object when a capture process captures these redo entries, and any remote Streams data dictionaries are populated when the information is propagated to them.

When you instantiate tables using export/import, any table supplemental log group specifications are retained for the instantiated tables. That is, after instantiation, log group specifications for imported tables at the import database are the same as the log group specifications for these tables at the export database. If you do not want to retain supplemental log group specifications for tables at the import database, then you can drop specific supplemental log groups after import. Database supplemental logging specifications are not retained during export/import, even if you perform a full database export/import.

The Start SCN, Captured SCN, and Applied SCN for a Capture Process

This section describes system change number (SCN) values that are important for a capture process. You can query the DBA_CAPTURE data dictionary view to display these values for one or more capture processes.
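
For example, a query along the following lines displays these SCN values for each capture process:

    SELECT CAPTURE_NAME, START_SCN, CAPTURED_SCN, APPLIED_SCN
      FROM DBA_CAPTURE;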

Start SCN

The start SCN is the SCN from which a capture process begins to capture changes. When you start a capture process for the first time, by default the start SCN corresponds to the SCN when the capture process was created. For example, if a capture process is started two days after it was created, then the capture process begins capturing changes from the point in the redo log that corresponds to its creation time, two days in the past.

You can specify a different start SCN during capture process creation, or you can alter a capture process to set its start SCN. The start SCN value specified must be from a time after the first capture process was created for the database.

Captured SCN

The captured SCN is the SCN that corresponds to the most recent change captured by a capture process.

Applied SCN

The applied SCN for a capture process is the SCN of the most recent event dequeued by the relevant apply processes. All events below this SCN have been dequeued by all apply processes that apply changes captured by this capture process.

Streams Capture Processes and RESTRICTED SESSION

When you enable restricted session during system startup by issuing a STARTUP RESTRICT statement, capture processes do not start, even if they were running when the database shut down. When the restricted session is disabled, each capture process that was running when the database shut down is started.

When the restricted session is enabled in a running database by the SQL statement ALTER SYSTEM with the ENABLE RESTRICTED SESSION clause, it does not affect any running capture processes. These capture processes continue to run and capture changes. If a stopped capture process is started in a restricted session, then the capture process does not start until the restricted session is disabled.

Streams Capture Processes and Oracle Real Application Clusters

You can configure a Streams capture process to capture changes in a Real Application Clusters environment. If you use one or more capture processes and Real Application Clusters in the same environment, then the environment must meet the following requirements:

All archived logs that contain changes to be captured by a capture process must be available to all instances in the Real Application Clusters environment. In a Real Application Clusters environment, a capture process always reads archived redo logs, and it reads changes made by all instances.

Any call to the DBMS_CAPTURE_ADM.START_CAPTURE procedure must be run on the instance that owns the queue that is used by the capture process. Calls to other procedures and functions that operate on a capture process can be performed from any instance.

Any supplemental logging specifications must be made on each running instance. After supplemental logging is specified for each running instance, it does not need to be specified again if an instance is shut down and restarted, and it does not need to be specified for any new instances.

The ARCHIVE_LAG_TARGET initialization parameter should be set to a value greater than zero. This initialization parameter specifies the duration after which the log files are switched automatically. LogMiner orders all LCRs by SCN. To do so, it needs the archived log files from all instances. Setting this parameter to switch the log files automatically ensures that LogMiner does not wait for an inordinately long time if one instance has far fewer transactions than another.
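
For example, the log switch requirement in the last item can be met with a setting along the following lines; the value is illustrative:

    -- Switch the redo log files at least every 1800 seconds (30 minutes).
    ALTER SYSTEM SET ARCHIVE_LAG_TARGET = 1800;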

If the owner instance for a queue table containing a queue used by a capture process becomes unavailable, then queue ownership is transferred automatically to another instance in the cluster. If this happens, then, to restart the capture process, connect to the owner instance for the queue and run the START_CAPTURE procedure. The DBA_QUEUE_TABLES data dictionary view contains information about the owner instance for a queue table. The capture process maintains a persistent start/stop state in a Real Application Clusters environment only if the owner instance for its queue does not change before the database instance owning the queue is restarted.
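
For example, a query along the following lines displays the owner instance for each queue table owned by a hypothetical strmadmin user:

    SELECT QUEUE_TABLE, OWNER_INSTANCE
      FROM DBA_QUEUE_TABLES
      WHERE OWNER = 'STRMADMIN';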

Also, any parallel execution processes used by a single capture process run on a single instance in a Real Application Clusters environment.

Capture Process Architecture

A capture process is an Oracle background process whose process name is cpnn, where nn is a capture process number. Valid capture process names include cp01 through cp99. A capture process captures changes from the redo log by using the infrastructure of LogMiner. Streams configures LogMiner automatically. You can create, alter, start, stop, and drop a capture process, and you can define capture rules that control which changes a capture process captures.

The user who creates a capture process is the user who performs capture rule evaluations and capture rule-based transformations. This user also enqueues captured events into the queue used by the capture process. This user must have the necessary privileges to perform these actions, including execute privilege on the rule set used by the capture process, execute privilege on all transformation functions used in the rule set, and privileges to enqueue events into the capture process queue.

Capture Process Components

The components of a capture process depend on the setting specified for the parallelism capture process parameter. If parallelism is set to a value of 3 or greater, then a capture process uses the following parallel execution servers to capture changes concurrently:

One reader server that reads the redo log to find changes

A number of preparer servers that format changes found by the reader into LCRs. The number of preparer servers equals the number specified for the parallelism capture process parameter minus two.

One builder server that merges the LCRs created by the preparer servers to preserve the SCN order. After merging the LCRs, the builder server enqueues them into the queue associated with the capture process.

For example, if parallelism is set to 5, then a capture process uses a total of five parallel execution servers, assuming five parallel execution servers are available: one reader server, three preparer servers, and one builder server.

If parallelism is set to 2 or lower, then a capture process itself (cpnn) performs all the work without using any parallel execution servers.
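
For example, a call along the following lines sets parallelism to 5 for a capture process; the capture process name is illustrative, and parameter values are passed as strings:

    BEGIN
      DBMS_CAPTURE_ADM.SET_PARAMETER(
        capture_name => 'capture_emp',
        parameter    => 'parallelism',
        value        => '5');
    END;
    /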

LogMiner Configuration

The capture process uses LogMiner to capture changes that are recorded in the redo log. This section describes configuring LogMiner for use by one or more capture processes.

Alternate Tablespace for LogMiner Tables

LogMiner tables include data dictionary tables and temporary tables used by LogMiner. By default, all LogMiner tables are created to use the SYSTEM tablespace, but the SYSTEM tablespace may not have enough space to accommodate the LogMiner tables. Therefore, Oracle Corporation strongly recommends creating an alternate tablespace for the LogMiner tables before you create a capture process at a database. Use the DBMS_LOGMNR_D.SET_TABLESPACE routine to re-create all LogMiner tables in an alternate tablespace.
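
For example, statements along the following lines create an alternate tablespace and move the LogMiner tables into it; the tablespace name, datafile, and size are illustrative:

    CREATE TABLESPACE logmnrts DATAFILE '/u01/oradata/logmnrts.dbf'
      SIZE 25M AUTOEXTEND ON MAXSIZE UNLIMITED;

    BEGIN
      DBMS_LOGMNR_D.SET_TABLESPACE('logmnrts');
    END;
    /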

Multiple Capture Processes in a Single Database

Each capture process uses one LogMiner session, and the LOGMNR_MAX_PERSISTENT_SESSIONS initialization parameter controls the maximum number of active LogMiner sessions allowed in the instance. The default setting for this initialization parameter is 1. Therefore, to run multiple capture processes in a database, set the LOGMNR_MAX_PERSISTENT_SESSIONS initialization parameter to a value equal to or higher than the number of capture processes.

In addition, if you run multiple capture processes on a single database, you might need to increase the System Global Area (SGA) size for each instance. Use the SGA_MAX_SIZE initialization parameter to increase the SGA size. Also, you should increase the size of the shared pool by 10 MB for each capture process on a database.
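
For example, an instance that runs three capture processes might use initialization parameter settings along these lines; the values are illustrative:

    # init.ora entries for an instance running three capture processes
    LOGMNR_MAX_PERSISTENT_SESSIONS=3
    SHARED_POOL_SIZE=130M   # base value plus 10 MB for each capture process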

Note:

Oracle Corporation recommends that each capture process use a separate queue to keep LCRs from different capture processes separate.

Capture Process Creation

You can create a capture process using the DBMS_STREAMS_ADM package or the DBMS_CAPTURE_ADM package. Using the DBMS_STREAMS_ADM package to create a capture process is simpler because defaults are used automatically for some configuration options. In addition, when you use the DBMS_STREAMS_ADM package, a rule set is created for the capture process and rules are added to the rule set automatically. The DBMS_STREAMS_ADM package was designed for use in replication environments. Alternatively, using the DBMS_CAPTURE_ADM package to create a capture process is more flexible, and you create a rule set and rules for the capture process either before or after it is created. You can use the procedures in the DBMS_STREAMS_ADM package or the DBMS_RULE_ADM package to add rules to the rule set for the capture process.
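
For example, a call along the following lines creates a capture process named capture_emp, if it does not already exist, along with a rule set and a DML capture rule for the hr.employees table; the capture process and queue names are illustrative:

    BEGIN
      DBMS_STREAMS_ADM.ADD_TABLE_RULES(
        table_name   => 'hr.employees',
        streams_type => 'capture',
        streams_name => 'capture_emp',
        queue_name   => 'strmadmin.streams_queue',
        include_dml  => true,
        include_ddl  => false);
    END;
    /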

When a capture process is created by a procedure in the DBMS_STREAMS_ADM package, a procedure in the DBMS_CAPTURE_ADM package is run automatically on the tables whose changes will be captured by the capture process. The following table lists which procedure is run in the DBMS_CAPTURE_ADM package when you run a procedure in the DBMS_STREAMS_ADM package.

When you run this procedure in the     This procedure in the DBMS_CAPTURE_ADM
DBMS_STREAMS_ADM package               package is run automatically

ADD_TABLE_RULES                        PREPARE_TABLE_INSTANTIATION
ADD_SCHEMA_RULES                       PREPARE_SCHEMA_INSTANTIATION
ADD_GLOBAL_RULES                       PREPARE_GLOBAL_INSTANTIATION

More than one call to prepare instantiation is allowed. When a capture process is created by the CREATE_CAPTURE procedure in the DBMS_CAPTURE_ADM package, you must run the appropriate procedure manually to prepare each table, schema, or database whose changes will be captured for instantiation, if you plan to instantiate the table, schema, or database at a remote site.

Note:

After creating a capture process at a database, do not change the DBID or DBNAME of the database.

Data Dictionary Duplication During Capture Process Creation

When the first capture process is created for a database, Streams populates a duplicate data dictionary called a Streams data dictionary for use by capture processes and propagations. Initially, the Streams data dictionary is consistent with the primary data dictionary at the time when the capture process was created.

A capture process requires a Streams data dictionary because the information in the primary data dictionary may not apply to the changes being captured from the redo log. These changes may have occurred minutes or hours before they are captured by a capture process. For example, consider the following scenario:

A capture process is configured to capture changes to tables.

A database administrator stops the capture process. When the capture process is stopped, it records the SCN of the change it was currently capturing.

User applications continue to make changes to the tables while the capture process is stopped.

The capture process is restarted three hours after it was stopped.

In this case, to ensure data consistency, the capture process must begin capturing changes in the redo log at the time when it was stopped. The capture process starts at the SCN that it recorded when it was stopped.

The redo log contains raw data. It does not contain database object names and column names in tables. Instead, it uses object numbers and internal column numbers for database objects and columns, respectively. Therefore, when a change is captured, a capture process must reference the data dictionary to determine the details of the change.

The Streams data dictionary is updated when a DDL statement is processed by a capture process, if necessary. If there were any DDL changes to the relevant tables between the time when the captured changes were made and the current time, then the primary data dictionary may not contain the correct information for the captured changes. However, the Streams data dictionary always reflects the correct state for the captured changes because it maintains versions of a subset of the information in the primary data dictionary.

When a capture process evaluates whether to capture a DDL change involving a table, it automatically adds information about the change to the Streams data dictionary. In addition, the capture process determines whether to capture the Streams data dictionary information for the new version of the table. To make these determinations, the capture rule set is evaluated with partial information that includes the name and owner of the table created or altered by the DDL statement. Capturing and propagating Streams data dictionary information makes it available in each destination queue, where it can be used by propagations and apply processes.

If at least one rule in the capture rule set either evaluates to TRUE (true_rules) or could evaluate to TRUE given more information (maybe_rules), then the Streams data dictionary information is captured for the table. This rule can be either a DML rule or a DDL rule. A capture process at a source database performs a similar rule evaluation when a table is prepared for instantiation.

Because the data dictionary is duplicated when the first capture process is created, it might take some time to create the first capture process for a database. The amount of time required depends on the number of database objects in the database.

The data dictionary is duplicated only once for a database. Additional capture processes use the same Streams data dictionary that the first capture process created in the database. Because the Streams data dictionary is multiversioned, each capture process is in sync with the Streams data dictionary.

Scenario Illustrating the Need for a Streams Data Dictionary

Consider a scenario in which a capture process has been configured to capture changes to table t1, which has columns a and b, and the following changes are made to this table at three different points in time:

Time 1: Insert values a=7 and b=15.

Time 2: Add column c.

Time 3: Drop column b.

If for some reason the capture process is capturing changes from an earlier time, then the primary data dictionary and the relevant version in the Streams data dictionary contain different information. Table 2-1 illustrates how the information in the Streams data dictionary is used when the current time is different from the change capturing time.

Table 2-1 Information About Table t1 in the Primary and Capture Data Dictionaries

Current Time   Change Capturing Time   Primary Data Dictionary              Streams Data Dictionary
1              1                       Table t1 has columns a and b.        Table t1 has columns a and b at time 1.
2              1                       Table t1 has columns a, b, and c.    Table t1 has columns a and b at time 1.
3              1                       Table t1 has columns a and c.        Table t1 has columns a and b at time 1.

The capture process captures the change resulting from the insert at time 1 when the actual time is time 3. If the capture process used the primary data dictionary, then it might assume that a value of 7 was inserted into column a and a value of 15 was inserted into column c, because those are the two columns for table t1 at time 3 in the primary data dictionary. However, a value of 15 was actually inserted into column b.

Because the capture process uses the Streams data dictionary, the error is avoided. The Streams data dictionary is synchronized with the capture process and continues to record that table t1 has columns a and b at time 1. So, the captured change specifies that a value of 15 was inserted into column b.

ARCHIVELOG Mode and a Capture Process

A capture process reads online redo logs whenever possible and archived redo logs otherwise. For this reason, the database must be running in ARCHIVELOG mode when a capture process is configured to capture changes. You must keep an archived redo log file available until you are certain that no capture process will ever need that file; that is, make sure redo logs are available until all transactions within them have been applied at all destination databases. You can use the APPLIED_SCN column in the DBA_CAPTURE data dictionary view to determine the SCN of the most recent event dequeued by the relevant apply processes. All changes below this SCN have been dequeued by all apply processes that apply changes captured by the capture process.
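
For example, a query along the following lines sketches one way to identify archived log files whose changes all fall below the applied SCN of a capture process; the capture process name is illustrative, and the query is a starting point rather than a definitive retention check:

    SELECT a.NAME
      FROM V$ARCHIVED_LOG a, DBA_CAPTURE c
     WHERE c.CAPTURE_NAME = 'CAPTURE_EMP'
       AND a.NEXT_CHANGE# < c.APPLIED_SCN;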

When a capture process falls behind, there is a seamless transition from reading an online redo log to reading an archived redo log, and, when a capture process catches up, there is a seamless transition from reading an archived redo log to reading an online redo log.

Capture Process Parameters

After creation, a capture process is disabled so that you can set the capture process parameters for your environment before starting it for the first time. Capture process parameters control the way a capture process operates. For example, the time_limit capture process parameter can be used to specify the amount of time a capture process runs before it is shut down automatically. After you set the capture process parameters, you can start the capture process.

Setting the parallelism parameter to a number higher than the number of available parallel execution servers might disable the capture process. Make sure the PROCESSES and PARALLEL_MAX_SERVERS initialization parameters are set appropriately when you set the parallelism capture process parameter.

Automatic Restart of a Capture Process

You can configure a capture process to stop automatically when it reaches certain limits. The time_limit capture process parameter specifies the amount of time a capture process runs, and the message_limit capture process parameter specifies the number of events a capture process can capture. The capture process stops automatically when it reaches one of these limits.

The disable_on_limit parameter controls whether a capture process becomes disabled or restarts when it reaches a limit. If you set the disable_on_limit parameter to y, then the capture process is disabled when it reaches a limit and does not restart until you restart it explicitly. If, however, you set the disable_on_limit parameter to n, then the capture process stops and restarts automatically when it reaches a limit.
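
For example, calls along the following lines configure a hypothetical capture process named capture_emp to run for one hour at a time and then restart automatically:

    BEGIN
      -- Stop the capture process after 3600 seconds of running.
      DBMS_CAPTURE_ADM.SET_PARAMETER(
        capture_name => 'capture_emp',
        parameter    => 'time_limit',
        value        => '3600');

      -- Restart automatically instead of becoming disabled at the limit.
      DBMS_CAPTURE_ADM.SET_PARAMETER(
        capture_name => 'capture_emp',
        parameter    => 'disable_on_limit',
        value        => 'n');
    END;
    /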

When a capture process is restarted, it starts to capture changes at the point where it last stopped. A restarted capture process gets a new session identifier, and the parallel execution servers associated with the capture process also get new session identifiers. However, the capture process number (cpnn) remains the same.

Capture Process Rule Evaluation

A running capture process completes the following series of actions to capture changes:

Finds changes in the redo log.

Performs prefiltering of the changes in the redo log. During this step, a capture process evaluates rules in its rule set at the object level and schema level to place changes found in the redo log into two categories: changes that should be converted into LCRs and changes that should not be converted into LCRs.

Prefiltering is a safe optimization done with incomplete information. This step identifies relevant changes to be processed subsequently, such that:

A change is converted into an LCR if one or more rules may evaluate to TRUE after conversion.

A change is not converted into an LCR if the capture process can ensure that no rules would evaluate to TRUE after conversion.

Converts into LCRs those changes that, based on the prefiltering, may cause one or more rules to evaluate to TRUE.

Performs LCR filtering. During this step, a capture process evaluates rules regarding information in each LCR to separate the LCRs into two categories: LCRs that should be enqueued and LCRs that should be discarded.

Discards the LCRs that should not be enqueued based on the rules.

Enqueues the remaining captured LCRs into the queue associated with the capture process.

For example, suppose the following rule is defined for a capture process: Capture changes to the hr.employees table where the department_id is 50. No other rules are defined for the capture process, and the parallelism parameter for the capture process is set to 1.

Given this rule, suppose an UPDATE statement on the hr.employees table changes 50 rows in the table. The capture process performs the following series of actions for each row change:

Finds the next change resulting from the UPDATE statement in the redo log.

Determines that the change resulted from an UPDATE statement to the hr.employees table and must be captured. If the change was made to a different table, then the capture process ignores the change.

Captures the change and converts it into an LCR.

Filters the LCR to determine whether it involves a row where the department_id is 50.

Either enqueues the LCR into the queue associated with the capture process if it involves a row where the department_id is 50, or discards the LCR if it involves a row where the department_id is not 50 or is missing.
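
A rule with a condition like the one in this example can be created with the DBMS_RULE_ADM package; the following is a minimal sketch in which the rule name is illustrative and the condition assumes the standard Streams evaluation context:

    BEGIN
      DBMS_RULE_ADM.CREATE_RULE(
        rule_name => 'strmadmin.employees_dept_50',
        condition => ':dml.get_object_owner() = ''HR'' AND ' ||
                     ':dml.get_object_name() = ''EMPLOYEES'' AND ' ||
                     ':dml.get_value(''new'', ''DEPARTMENT_ID'').AccessNumber() = 50');
    END;
    /

The rule would then be added to the rule set used by the capture process, for example with the ADD_RULE procedure in the DBMS_RULE_ADM package.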

Figure 2-2 Flowchart Showing Capture Process Rule Evaluation

The Persistent State of a Capture Process

A capture process maintains a persistent state. That is, the capture process maintains its current state when the database is shut down and restarted. For example, if the capture process is running when the database is shut down, then the capture process automatically starts when the database is restarted, but, if the capture process is stopped when a database is shut down, then the capture process remains stopped when the database is restarted.