All the Views expressed here are my own and do not reflect opinions or views of the anyone else.All the views are tested on my testing environment and kindly test the post before applying anything on production.You can reach to me at neeraj.vishen@gmail.com .

Saturday, January 21, 2012

Oracle transportable tablespaces are the fastest way for moving large volumes of data between two Oracle databases. Starting with Oracle Database 10g, we can transport tablespaces across platforms . Furthermore, like import and export, transportable tablespaces provide a mechanism for transporting metadata in addition to transporting data. Below are the steps to perform the cross platform transport tablespace .

Prerequisites :

There are few points which has to be considered before performing the Transportable tablespaces . The followings are :

1.) The source and target database must use the same character set and national character set .

2.) We cannot transport a tablespace to a target database in which a tablespace with the same name already exists. However, we can rename either the tablespace to be transported or the destination tablespace before the transport operation.

3.) Objects with underlying objects (such as materialized views) or contained objects (such as partitioned tables) are not transportable unless all of the underlying or contained objects are in the tablespace set.

4.) If the owners of tablespace objects does not exist on target database, the usernames need to be created manually before starting the transportable tablespace import.If we use spatial indexes, then:

be aware that TTS across different endian platforms are not supported for spatial indexes in 10gR1 and 10gR2; such a limitation has been released in 11g .

Specific Spatial packages must be run before exporting and after transportation, please see Oracle Spatial documentation.

5.) We cannot transport the SYSTEM tablespace or objects owned by the user SYS .

6.) Opaque Types (such as RAW, BFILE, and the AnyTypes) can be transported, but they are not converted as part of the cross-platform transport operation. Their actual structure is known only to the application, so the application must address any endianness issues after these types are moved to the new platform .

7.) Floating-Point Numbers BINARY_FLOAT and BINARY_DOUBLE types are transportable using Data Pump but not the original export utility, EXP.

If we are migrating a database, then make sure there are no invalid objects in the source database before making the export. Take a full no rows export to recreate objects that won't be transported with TTS or remove all the invalid objects by running utlrp.sql scripts . Keep the source database viable until we have determined all objects are in the target database and there are no issues (i.e. the target database has been thoroughly checked out ).

Here we are performing the cross platform transport tablespace demo .The details about the sorce and target are as below :

Source :

OS = Redhat Linux (32 bit)

Oracle Version = Oracle 10.2.0

Database Name = comcast

Tablespace = TES_TBS (which is to be transported)

Target :

OS = Microsoft Window (32 bit)

Oracle Version = Oracle 11.2.0

Database Name = noida

Step 1 : Determine Platforms Support and Endianness (on source) :

This step is only necessary if we are transporting the tablespace set to a platform different from the source platform. Determine whether cross-platform tablespace transport is supported for both the source and target platforms, and also determine the endianness of each platform . We can query the v$transportable_platform view to see the platforms that are supported, and to determine each platform's endian format (byte ordering).

If the endian formats are different then a conversion is necessary for transporting the tablespace. For example, run the below command to convert the tablespace of Source(Linux 64 bit) to Target(Solaris 64 bit) platform .

i.> Using CONVERT Tablespace FROM PLATFORM on Source host

RMAN> convert tablespace test_user_tbs

2> to platform ‘Solaris[tm] OE (64-bit)'

3> format='/oradata/rman_backups/%N_%f' ;

The data file “test_user_tbs01.dbf” is not touched and a new file will be created for Solaris platform under “/oradata/rman_backups“and copy the file to Target platform. Use RMAN's CONVERT command to convert the datafiles to be transported to the destination host's format. The converted datafiles are stored in “/solaris/oradata”.

ii.> Using CONVERT DATAFILE... FROM PLATFORM on Destination host

RMAN> convert datafile test_user_tbs01.dbf

2> from platform ‘Linux IA (64-bit)'

3> db_file_name_convert ‘/linux/oradata/’ ‘/solaris/oradata’Let's have a demo of the transportable tablespace .

Step 3 : Create a tablespace and user :

Here we will create the Tablespace and User on the source database and this tablespace will be transported further .

There may be logical or physical dependencies between objects in the transportable set and those outside of the set. We can only transport a set of tablespaces that is self-contained . Some examples of self contained tablespace violations are:

An index inside the set of tablespaces is for a table outside of the set of tablespaces.

A partitioned table is partially contained in the set of tablespaces.

A referential integrity constraint points to a table across a set boundary.

The statement below can be used to determine whether Tablespace are self-contained or not :

SQL> conn sys as sysdba

Enter password:

Connected.

SQL> EXECUTE dbms_tts.transport_set_check('TES_TBS', TRUE, TRUE) ;

PL/SQL procedure successfully completed.

The DBMS_TTS package checks if the transportable set is self-contained. All violations are inserted into a temporary table that can be selected from the transport_set_violations view.

Transport both the datafiles and the export file of the tablespaces to a place that is accessible to the target database.

Step 7 : Import the dumpfile in target database

Before importing the dumpfile make sure that tablespace of same doesnot exist in the target database. Also check the user may exist which is having default tablespace i.e, "TES_TBS" . We can also use the remap_schema parameter to restore it into some other schemas.

Determines the optimal access path for the query to take. With the Rule Based Optimizer (RBO) it uses a set of heuristics to determine access path. With the Cost Based Optimizer (CBO) we use statistics to analyze the relative costs of accessing objects.

[6] QEP Generation

QEP = Query Evaluation Plan

[7] QEP Execution

QEP = Query Evaluation Plan

Steps [1]-[6] are handled by the parser. Step [7] is the execution of the statement.

The explain plan is produced by the parser. Once the access path has been decided upon it is stored in the library cache together with the statement itself. We store queries in the library cache based upon a hashed representation of that query. When looking for a statement in the library cache, we first apply a hashing algorithm to the statement and then we look for this hash value in the library cache. This access path will be used until the query is reparsed.

Terminology

Row Source

A set of rows used in a query may be a select from a base object or the result set returned by joining 2 earlier row sources

Predicate

where clause of a query

Tuples

rows

Driving Table

This is the row source that we use to seed the query. If this returns a lot of rows then this can have a negative affect on all subsequent operations

Probed Table

This is the object we lookup data in after we have retrieved relevant key data from the driving table.

How does Oracle access data?

At the physical level Oracle reads blocks of data. The smallest amount of data read is a single Oracle block, the largest is constrained by operating system limits (and multiblock i/o). Logically Oracle finds the data to read by using the following methods:

The rightmost uppermost operation of an explain plan is the first thing that the explain plan will execute. In this case TABLE ACCESS FULL LARGE is the first operation. This statement means we are doing a full table scan of table LARGE. When this operation completes then the resultant row source is passed up to the

next level of the query for processing. In this case it is the SELECT STATEMENT which is the top of the query.

[CHOOSE] is an indication of the optimizer_goal for the query. This DOES NOT necessarily indicate that plan has actually used this goal. The only way to confirm this is to check the

cost= part of the explain plan as well. For example the following query indicates that the CBO has been used because there is a cost in the cost field:

SELECT STATEMENT [CHOOSE] Cost=1234

However the explain plan below indicates the use of the RBO because the cost field is blank:

SELECT STATEMENT [CHOOSE] Cost=

The cost field is a comparative cost that is used internally to determine the best cost for particular plans. The costs of different statements are not really directly comparable.

[:Q65001] indicates that this particular part of the query is being executed in parallel. This number indicates that the operation will be processed by a parallel query slave as opposed to being executed serially.

[ANALYZED] indicates that the object in question has been analyzed and there are currently statistics available for the CBO to use. There is no indication of the 'level' of analysis done.

Access Methods in detail

Full Table Scan (FTS)

In a FTS operation, the whole table is read up to the high water mark (HWM). The HWM marks the last block in the table that has ever had data written to it. If you have deleted all the rows then you will still read up to the HWM. Truncate resets the HWM back to the start of the table. FTS uses multiblock i/o to read the blocks from disk. Multiblock i/o is controlled by the parameter <PARAM:db_block_multi_block_read_count>.

This defaults to:

db_block_buffers / ( (PROCESSES+3) / 4 )

Maximum values are OS dependant

Buffers from FTS operations are placed on the Least Recently Used (LRU) end of the buffer cache so will be quickly aged out. FTS is not recommended for large tables unless you are reading >5-10% of it (or so) or you intend to run in parallel.

Notice the 'TABLE ACCESS BY ROWID' section. This indicates that the table data is not being accessed via a FTS operation but rather by a rowid lookup. In this case the rowid has been produced by looking up values in the index first. The index is being accessed by an 'INDEX UNIQUE SCAN' operation. This is explained below. The index name in this case is EMP_I1. If all the required data resides in the index then a table lookup may be unnecessary and all you will see is an index access with no table access.

In the following example all the columns (empno) are in the index. Notice that no table access takes place:

Because we have forced a FTS the data is unsorted and so we must sort the data

after it has been retrieved.

There are 4 methods of index lookup:

index unique scan

index range scan

index full scan

index fast full scan

Index unique scan

Method for looking up a single key value via a unique index. Always returns a single value You must supply AT LEAST the leading column of the index to access data via the index, However this may return > 1 row as the uniqueness will not be guaranteed.

In certain circumstances it is possible for the whole index to be scanned as opposed to a range scan (i.e. where no constraining predicates are provided for a table). Full index scans are only available in the CBO as otherwise we are unable to determine whether a full scan would be a good idea or not. We choose an index Full Scan when we have statistics that indicate that it is going to be more efficient than a Full table scan and a sort.

For example we may do a Full index scan when we do an unbounded scan of an index and want the data to be ordered in the index order. The optimizer may decide that selecting all the information from the index and not sorting is more efficient than doing a FTS or a Fast Full Index Scan and then sorting.

An Index full scan will perform single block i/o's and so it may prove to be inefficient. Index BE_IX is a concatenated index on big_emp (empno,ename)

SQL> explain plan for select empno,ename
from big_emp order by empno,ename;

Scans all the block in the index Rows are not returned in sorted order Introduced in 7.3 and requires V733_PLANS_ENABLED=TRUE and CBO may be hinted using INDEX_FFS hint uses multiblock i/o can be executed in parallel can be used to access second column of concatenated indexes. This is because we are selecting all of the index.

Note that INDEX FAST FULL SCAN is the mechinism behind fast index create and recreate. Index BE_IX is a concatenated index on big_emp (empno,ename)

A Join is a predicate that attempts to combine 2 row sources We only ever join 2 row sources together Join steps are always performed serially even though underlying row sources may have been accessed in parallel. Join order - order in which joins are performed

The join order makes a significant difference to the way in which the query is executed. By accessing particular row sources first, certain predicates may be satisfied that are not satisfied by with other join orders. This may prevent certain access paths from being taken.

Suppose there is a concatenated index on A(a.col1,a.col2). Note that a.col1 is the leading column. Consider the following query:

We could represent the joins present in the query using the following schematic:

B <---> A <---> C
col3=10 col3=5

There are really only 2 ways we can drive the query: via B.col3 or C.col3. We would have to do a Full scan of A to be able to drive off it. This is unlikely to be efficient with large tables;

If we drive off table B, using predicate B.col3=10 (as a filter or lookup key) then we will retrieve the value for B.col1 and join to A.col1. Because we have now filled the leading column of the concatenated index on table A we can use this index to give us values for A.col2 and join to A.

However if we drive of table c, then we only get a value for a.col2 and since this is a trailing column of a concatenated index and the leading column has not been supplied at this point, we cannot use the index on a to lookup the data.

So it is likely that the best join order will be B A C. The CBO will obviously use costs to establish whether the individual access paths are a good idea or not.

If the CBO does not choose this join order then we can hint it by changing the from

Rows are produced by Row Source 1 and are then sorted Rows from Row Source 2 are then produced and sorted by the same sort key as Row Source 1. Row Source 1 and 2 are NOT accessed concurrently Sorted rows from both sides are then merged together (joined)

MERGE
/ \
SORT SORT
| |
Row Source 1 Row Source 2

If the row sources are already (known to be) sorted then the sort operation is unecessary as long as both 'sides' are sorted using the same key. Presorted row sources include indexed columns and row sources that have already been sorted in earlier steps. Although the merge of the 2 row sources is handled serially, the row sources could be accessed in parallel.

Row source 1 is known as the outer table
Row source 2 is known as the inner table

Accessing row source 2 is known a probing the inner table For nested loops to be efficient it is important that the first row source returns as few rows as possible as this directly controls the number of probes of the second row source. Also it helps if the access method for row source 2 is efficient as this operation is being repeated once for every row returned by row source 1.

New join type introduced in 7.3 More efficient in theory than NL & SMJ Only accessible via the CBO Smallest row source is chosen and used to build a hash table and a bitmap The second row source is hashed and checked against the hash table looking for joins. The bitmap is used as a quick lookup to check if rows are in the hash table and are especially useful when the hash table is too large to fit in memory.

Hash joins are enabled by the parameter HASH_JOIN_ENABLED=TRUE in the init.ora or session. TRUE is the default in 7.3

Cartesian Product

A Cartesian Product is done where they are no join conditions between 2 row sources and there is no alternative method of accessing the data Not really a join as such as there is no join! Typically this is caused by a coding mistake where a join has been left out. It can be useful in some circumstances - Star joins uses cartesian products.

Note that if the row source is already appropriately sorted then no sorting is required. This is now indicated in 7.3:

SORT GROUP BY NOSORT
INDEX FULL SCAN .....

In this case the group by operation simply groups the rows it does not do the sort operation as this has already been completed.

Sorts are expensive operations especially on large tables where the rows do not fit in memory and spill to disk. By default sort blocks are placed into the buffer cache. This may result in aging out of other blocks that may be reread by other processes. To avoid this you can use the parameter <Parameter:SORT_DIRECT_WRITES> which does not place sort blocks into the buffer cache.

Filter

Has a number of different meanings used to indicate partition elimination may also indicate an actual filter step where one row source is filtering another functions such as min may introduce filter steps into query plans

In this example there are 2 filter steps. The first is effectively like a NL except that it stops when it gets something that it doesn't like (i.e. a bounded NL). This is there because of the not in. The second is filtering out the min value:

SQL> explain plan for select * from emp
where empno not in (select min(empno)
from big_emp group by empno);

Query Plan
------------------
SELECT STATEMENT [CHOOSE] Cost=1
FILTER **** This is like a bounded nested loops
TABLE ACCESS FULL EMP [ANALYZED]
FILTER **** This filter is introduced by the min
SORT GROUP BY NOSORT
INDEX FULL SCAN BE_IX

This example is also interesting in that it has a NOSORT function. The group by does not need to sort because the index row source is already pre sorted.

Views

When a view cannot be merged into the main query you will often see a projection view operation. This indicates that the 'view' will be selected from directly as opposed to being broken down into joins on the base tables. A number of constructs make a view non mergeable. Inline views are also non mergeable.

In the following example the select contains an inline view which cannot be merged:

In this case the inline view tmp which contains an aggregate function cannot be merged into the main query. The explain plan shows this as a view step.

Partition Views

Allows a large table to be broken up into a number of smaller partitions which can be queried much more quickly than the table as a whole a union all view is built over the top to provide the original functionality Check constraints or where clauses provide partition elimination capabilities

KBWYV1 is a view on 4 tables KBWYT1-4. KBWYT1-4 contain rows for week 31-34 respectively and are maintained by check constraints. This query should only return rows from partions 2 & 3. The filter operation indicates this. Partitions 1 & 4 are eliminated at execution time. The view line indicates that the view is not merged. The union-all partion information indicates that we have recognised this as a partition view. Note that the tables can be accessed in parallel.

Remote Queries

Only shows remote in the OPERATION column OTHER column shows query executed on remote node OTHER_NODE shows where it is executed Different operational characteristics for RBO & CBO

Checkout the other column for details of what the slaves are executing

v$pq_slave will show any parallel activity

Columns to look in for information

other - contains the query passed to the slaves

other_tag - describes the contents of other

object_node - indicates order of pqo slaves

Parallel Query operates on a producer/consumer basis. When you specify parallel degree 4 oracle tries to allocate 4 producer slaves and 4 consumer slaves. The producers can feed any of the consumers. If there are only 2 slaves available then we use these. If there is only 1 slave available then we go serial If there are none available then we use serial. If parallel_min_percent is set then we error ora 12827 instead of using a lower number of slaves or going serial.

Consumer processes typically perform a sorting function. If there is no requirement for the data to be sorted then the consumer slaves are not produced and we end up with the number of slaves used matching the degree of parallelism as opposed to being 2x the degree.

Parallel Terms

PARALLEL_FROM_SERIAL

This means that source of the data is serial but it is passed to a parallel consumer

PARALLEL_TO_PARALLEL

Both the consumer and the producer are parallel

PARALLEL_COMBINED_WITH_PARENT

This operation has been combined with the parent operator. For example in a sort merge join the sort operations would be shown as PARALLEL_COMBINED_WITH_PARENT because the sort and the merge are handled as 1 operation.

PARALELL_TO_SERIAL

The source of the data is parallel but it is passed to a serial consumer. This typically will happen at the top of the explain plan but could occur anywhere

Examples of parallel queries

Assumptions

OPTIMIZER_MODE = CHOOSE
DEPT is small compared to EMP
DEPT has an index (DEPT_INDX) on deptno column

Below is the graphical representation of the execution plan. If you read the tree; In order to perform Operation 1, you need to perform Operation 2 and 4. Operation 2 comes first; In order to perform 2, you need to perform its Child Operation 3. In order to perform Operation 4, you need to perform Operation 2

Search This Blog

Translate My Page

About Me

I have started this blog to share my knowledge and experience with other Oracle DBA enthusiasts.I have experience on database architecture ,design and administrating ranging from 9i,10g,11g on various platforms.My main interests are high availability and disaster recovery solutions for mission critical 24×7 systems.