You are here

Yasin Baskan

I was trying to drop a tablespace which I know there were no segments in it. A simple query from dba_segments returns no rows which means there are no segments allocated in this tablespace. But strangely I got this:

SQL> drop tablespace psapsr3old including contents and datafiles;drop tablespace psapsr3old including contents and datafiles*ERROR at line 1:ORA-14404: partitioned table contains partitions in a different tablespace

How come I cannot drop a tablespace with no segments in it?

Enter deferred segment creation. Things were simpler on 9i or 10g. The database I get this error on is an 11.2.0.2 database. Starting with 11.2.0.1 things may fool you when it comes to segments in the database. 11.2 brought us the feature called "deferred segment creation" which means no segments are created if you do not insert any data into the tables you created. Have a look at the note "11.2 Database New Feature Deferred Segment Creation [Video] (Doc ID 887962.1)" about this feature. It is there to save disk space in case you have lots of tables without data in them. In 11.2.0.1 it was only for non-partitioned heap tables, starting with 11.2.0.2 it is also used for partitioned tables.

Coming back to my problem, even if there were no segments reported in dba_segments there are tables and partitions created in this tablespace without their segments created yet. If we look at the tables and partitions in that tablespace:

SQL> select segment_created,count(*) from dba_tables 2 where tablespace_name='PSAPSR3OLD' 3 group by segment_created;

SEG COUNT(*)--- ----------NO 13482

SQL> select segment_created,count(*) from dba_tab_partitions 2 where tablespace_name='PSAPSR3OLD' 3 group by segment_created;

SEGM COUNT(*)---- ----------NO 1237

There are thousands of objects in there.

What is the solution then?

Obviously it is to get rid of these objects by moving them to a different tablespace. The standard "alter table move" and "alter table move partition" commands do the job. Then the question becomes; will a move table operation create the segment in the new tablespace? If you are on 11.2.0.1, yes it will, defeating the whole purpose of this feature. If you are on 11.2.0.2 it will not create the segments. This is explained in the note "Bug 8911160 - ALTER TABLE MOVE creates segment with segment creation deferred table (Doc ID 8911160.8)".

After everything is moved you can safely drop the tablespace without this error.

UPDATE: Gokhan Atil made me aware of Randolf Geist's post about the same issue. See that post here.

Direct NFS Clonedb is a feature in 11.2 that you can use to clone your databases. Kevin Closson explains what it is in this post. In his demo videos he is using a perl script to automate the process of generating the necessary scripts. That script is not publicly available as of today but the MOS note 1210656.1 explains how to do the clone manually without the perl script.

Tim Hall also has a step by step guide on how to do the cloning in this post. He also uses the perl script mentioned above.

We have been testing backups and clones on Exadata connected to a 7410 ZFS Storage Appliance, I wanted to share our test on Direct NFS Clonedb. This test is on a quarter rack x2-2 connected to a 7410 storage via Infiniband. A RAC database will be cloned as a single instance database and the clone database will opened in one db node.

Enable Direct NFS on Exadata

For security, as of today default Exadata installation disables some services needed by NFS. To use NFS on Exadata db nodes we enabled those services first.

service portmap start

service nfslock start

chkconfig --level 345 portmap on

chkconfig --level 345 nfslock on

Recent Exadata installations come with Direct NFS (dNFS) enabled, you can check if you have it enabled by looking at the database alert log. When the database is started you can see this line in the alert log if you have dNFS enabled.

You can use OS copies or RMAN image copies to back up the database for use in the cloning process. Here are the commands we used, do not forget to create the target directory before.

sql 'alter database begin backup';

backup as copy database format '/backup/clone_backup/%U';

sql 'alter database end backup';

Prepare the clone db

To start the clone database we need an init.ora file and a create controlfile script. You can back up the source database's control file to a text file and use it. In the source database run this to get the script, this will produce a script under the udump directory (/u01/app/oracle/diag/rdbms/dbm/dbm1/trace in Exadata).

SQL> alter database backup controlfile to trace;

Database altered.

After editing the script this is the one we can use for the clone database.

/u01/app/oradata/clone is a directory on the local disks, you can also use NFS for redo logs if you want to. The DATAFILE section lists the image copies we have just produced using RMAN. You can get this list using this sql, be careful about the completion time because you may have previous image copies in the same directory.

select name,completion_time from V$BACKUP_COPY_DETAILS;

Now we need an init.ora file, we can just copy the source database's file and edit it.

SQL> create pfile='/backup/clone.ora' from spfile;

File created.

Since the source database is a RAC database you need to remove parameters related to RAC (like cluster_database, etc...). You also need to change the paths to reflect the new clone database, like in the parameter control_files. Here is the control_files parameter in this test.

*.control_files='/u01/app/oradata/clone/control.ctl'

I also use a local directory, not NFS, for the control file.

There is one parameter you need to add when cloning a RAC database to a single instance database.

_no_recovery_through_resetlogs=TRUE

If you do not set this parameter you will get an error when you try to open the clone database with resetlogs. MOS note 334899.1 explains why we need to set this. If you do not set this this is the error you will get when opening the database.

The first parameter to dbms_dnfs is the backup image copy name we set in the controlfile script, the second parameter is the target filename which should reside in NFS. You can create this script using the following sql on the source database.

When you enable the database smart flash cache and start using it you will see new wait events related to that. These events help to find out if the problem is about the flash cache or not.

The ones I faced till now are "db flash cache single block physical read", "db flash cache multiblock physical read" and "write complete waits: flash cache". These are from a 11.2.0.1 database using the F5100 flash array as the database smart flash cache.

db flash cache single block physical read

"db flash cache single block physical read" is the flash cache equivalent of "db file sequential read". Read waits from the flash cache are not accounted for in the "db file sequential read" event and have their own wait event. The following is from an AWR report of 30 mins from a database using the database smart flash cache.

There are over 11 million "db flash cache single block physical read" waits which took about 0.44ms on average (AWR reports it as 0ms). "db file sequential read" waits are over 600,000. This means we had a high flash cache hit ratio, most of the reads were coming from the flash cache, not the disks.

This is the wait event histogram from the same AWR report.

% of Waits

Event

Total Waits

<1ms

<2ms

<4ms

<8ms

<16ms

<32ms

<=1s

>1s

db flash cache single block physical read

11.2M

99.0

.9

.1

.0

.0

.0

.0

99% of all flash cache single block reads were under 1ms, none of them are over 4ms.

db flash cache multiblock physical read

"db flash cache multiblock physical read" is the flash cache equivalent of "db file scattered read". It is the event we see when we are reading multiple blocks from the flash cache. The AWR report I am using in this post does not contain many multiblock operations but here is the event from the foreground wait events section.

Event

Waits

%Time -outs

Total Wait Time (s)

Avg wait (ms)

Waits /txn

% DB time

db flash cache multiblock physical read

1,048

0

1

1

0.00

0.00

We had 1048 waits of 1ms on average.

% of Waits

Event

Total Waits

<1ms

<2ms

<4ms

<8ms

<16ms

<32ms

<=1s

>1s

db flash cache multiblock physical read

1171

83.2

12.4

3.2

.9

.3

The wait event histogram shows that again most of the waits are below 1ms with some being up to 16ms.

write complete waits: flash cache

This is the wait event we see when DBWR is trying to write a block from the buffer cache to the flash cache and a session wants to modify that block. The session waits on this event and goes on with the update after DBWR finishes his job. You can find a good explanation and a good example for this in Guy Harrison's post. Here is the event and its histogram in the same AWR report I have.

Event

Waits

%Time -outs

Total Wait Time (s)

Avg wait (ms)

Waits /txn

% DB time

write complete waits: flash cache

345

0

1

4

0.00

0.00

% of Waits

Event

Total Waits

<1ms

<2ms

<4ms

<8ms

<16ms

<32ms

<=1s

>1s

write complete waits: flash cache

345

22.9

23.2

27.2

19.7

5.5

.3

1.2

I had 345 waits in 30 mins with an average time of 4ms. But the important thing is the contribution of this event to DB time was only 1 second in a 30 minute workload.

You can see that this event starts climbing up in some situations, especially if you are having a problem with your flash cache device and writes to it start to take longer.

Here is one case where poor DBWR processes are struggling to write to the flash cache. Database control was showing high waits in the "Configuration" class and those waits were "write complete waits", "free buffer waits" and "write complete waits: flash cache". This was all because my flash cache device was a single conventional HDD, not even an SSD. After changing the db_flash_cache_file parameter to use the F5100 array this picture was not seen anymore.

The documentation about using the database flash cache feature recommends increasing db_cache_size or sga_target or memory_target, whichever you are using, to account for the metadata for the blocks kept in the flash cache. This is because for each block in the flash cache, some metadata is kept in the buffer cache. The recommendation is to add 100 bytes for a single instance database and 200 bytes for RAC multiplied by the number of blocks that can fit in the flash cache. So you need to calculate how many blocks the flash cache can keep (by dividing db_flash_cache_size in bytes with the block size) and increase the buffer cache.

If you do not do this and the buffer cache size is too low, it is automatically increased to cater for this metadata. It also means you will have less space for actual data blocks if you do not increase your buffer cache size.

I happened to learn this adjustment by chance and it gave me a chance to calculate exactly how much space the flash cache metadata needs.

This is on a single instance 11.2.0.1 database on an M4000 server running Solaris 10.

I start with db_cache_size=320m (I am not using sga_target or memory_target because I want to control the size of the buffer cache explicitly) and db_flash_cache_size=20g, the instance starts up without any errors or warnings but the alert log shows:

The value of parameter db_cache_size is below the required minimumThe new value is 4MB multiplied by the number of cpus plus the memory required for the L2 cache.WARNING: More than half the granules are being allocated to the L2 cache when db_cache_size is set to 335544320. Either decrease the size of L2 cache, or increase the pool size to671088640

If you look at the db_cache_size at this point it shows 448m. The database automatically increased 320m to 448m. It also warns me that most of this space will be used for the flash cache metadata. This is a server with 32 CPUs (cores actually) so I multiply this by 4m, it makes 128m which is the space that will be used for actual data blocks. The remaining 320m will be used for the flash cache metadata. I have 20g of flash cache and my block size is 8K, this means 2,621,440 blocks can fit in there. Let's see how much space is needed for the metadata on one block, since I have 320m for the metadata I convert it to bytes and divide by the number of blocks, 320*1024*1024/2621440, which gives me 128 bytes.

The documentation states 100 bytes for a single instance database but it is actually a little bit higher.

Another case to verify. This time I start with db_cache_size=448m and db_flash_cache_size=60g. Similar messages are written to the alert log again.

The value of parameter db_cache_size is below the required minimumThe new value is 4MB multiplied by the number of cpus plus the memory required for the L2 cache.WARNING: More than half the granules are being allocated to the L2 cache when db_cache_size is set to 469762048. Either decrease the size of L2 cache, or increase the pool size to2013265920

When I look at db_cache_size now I see that it is increased to 1088m.

Of the 1088m buffer cache, again 128m will be used for data blocks, the remaining 960m is for the flash cache metadata. 60g of flash can hold 7,864,320 blocks, doing the math again tells me that the metadata for a single block is again 128 bytes.

If you are starting with a small buffer cache, remember to check the alert log and the current size of the buffer cache. If it is already high and you do not see any adjustments be aware that you will use 128 bytes for the metadata for each block. This means you will have less memory for data blocks. It is a good practice to calculate this need beforehand and size the buffer cache accordingly.

We have been testing the F5100 flash array in our humble lab (borrowed that term from a colleague, he knows who he is). There are two ways to use it, one is to place your datafiles on it, the other is to use it as the database flash cache.

The database flash cache feature came with 11.2 and is a way to extend the SGA. It is not the same thing as the flash cache in Exadata, read Kevin Closson's this post to find out what the difference is. F5100 is one of the products you can use as the flash cache, the other is the F20 card.

It is possible and (may be the best option) to use ASM to configure the flash cache. The documentation states that you can use a filename or an ASM diskgroup name for db_flash_cache_file parameter which is the parameter to enable the flash cache feature.

So, how do we do this?

The first step is creating an ASM diskgroup on the flash devices. In my test, for simplicity I use one flash device which is /dev/rdsk/c1t9d0s0. This is just one flash module (of size 24GB) from F5100. The process for creating the diskgroup is no different than creating a diskgroup on conventional disks, just make sure your asm_diskstring includes the flash device path. By using asmca I created a diskgroup named FLASH using external redundancy on this single flash device.

Then following the documentation I set the parameters to enable the flash cache.

SQL> alter system set db_flash_cache_size=20G scope=spfile;

System altered.

SQL> alter system set db_flash_cache_file='+FLASH' scope=spfile;

System altered.

Now time to restart to make the new parameters effective.

SQL> startup force;

ORACLE instance started.

Total System Global Area 2606465024 bytes

Fixed Size 2150840 bytes

Variable Size 2113932872 bytes

Database Buffers 469762048 bytes

Redo Buffers 20619264 bytes

Database mounted.

ORA-03113: end-of-file on communication channel

Process ID: 17560

Session ID: 2471 Serial number: 3

The instance fails to start. Looking at the alert log file we see this.

The instance is up and I can see that the flash cache is enabled and using the correct ASM diskgroup.

The documentation states that we can use the diskgroup name but as we saw it needs some correction.

A problem and a lesson

The path I followed to this point is a little embarrassing and teaches a lesson about setting parameters.

At first I used db_flash_cache_file='FLASH' assuming it would use the ASM diskgroup. After restarting the instance I immediately started a workload on the database to see the effects of the flash cache. Here is what I saw in the Enterprise Manager performance page.

The waits of class "Configuration" were holding back the database. When I clicked on "Configuration" link I saw this.

The system was mostly waiting on "free buffer waits", "write complete waits" and "write complete waits: flash cache". This is because of DBWR. Someone has to write the blocks from the buffer cache to the flash when the buffer cache is full. This process is DBWR. Since I love using the DTrace toolkit I used the iosnoop script in that toolkit to find out what DBWR was doing. iosnoop can show you what file a process reading from or writing to most. So I ran iosnoop for one of the DBWR processes.

The file it is trying to write to is $ORACLE_HOME/dbs/FLASH and since that directory is on a slow local disk it is taking a long time and causing the waits.

If I had looked at the db_flash_cache_file parameter after restarting the database to see if it was alright I would have seen this before starting the workload. So, once more it teaches to check if the parameter was set the way you want it before taking further action.

There is a common misunderstanding among DBAs about the column ISDEFAULT in the view v$parameter. Some think that when this column is TRUE it means the current value of the parameter is the default value. This leads to wrong conclusions and sometimes wrong settings for even production environments.

"Indicates whether the parameter is set to the default value (TRUE) or the parameter value was specified in the parameter file (FALSE)"

This explanation is not a clear one and different people may understand different things from it.

This column is dependent on the setting in the parameter file when the instance is started. It does not show if the current value is the default or not. It only shows if the parameter is set in the parameter file or not. When it is TRUE it means that you did not set this in the parameter file when starting the instance. When it is FALSE it means this parameter was set in the parameter file when starting the instance.

Here is a real life case about this. When changing a few parameters an Exadata DBA accidentally sets the parameter cell_offload_processing to FALSE using an alter system command. When he tries to take back the settings he looks at v$parameter.ISDEFAULT to find out if cell_offload_processing=FALSE is the default setting. He sees that ISDEFAULT=TRUE and arrives at the wrong conclusion that cell_offload_processing=FALSE is the default value and leaves the parameter that way. This causes all Exadata storage offloading to be disabled and may cause query times to go over the roof.

Let's look at this with an example on 11.2.0.2. This is from a database created with default parameters.

After we set it FALSE we see that ISMODIFIED reflected the change, but ISDEFAULT is still TRUE. From this if a DBA concludes that this is the default value and takes action based on that, the result will be wrong. As we did not set this parameter in the parameter file when starting the instance, ISDEFAULT still shows TRUE.

Now we are back to the defaults as the reset command removed the parameter from the parameter file.

So, do not count on the ISDEFAULT column when you are trying to find if the current value of a parameter is the default value or not. The documentation is the most reliable source to find out the default values of the parameters.

In general usage, everyone configures tns listeners on the public interfaces of RAC nodes so that clients can connect through the public network. Conventionally the private network interfaces are used for the interconnect traffic so most people do not open them to the application because the application traffic may interfere with the interconnect messaging.

But what if we have a high-speed and high-bandwidth interconnect network that some applications can also use for fast communication to the nodes? Can we create tns listeners on the private interfaces, if we can how? This high-speed interconnect network is especially true for Exadata where we have an Infiniband private network that is used for RAC interconnect and for the communication between the storage nodes and the database nodes. Since it is a high-speed low-latency network the application servers or ETL servers can connect to that network and use it for fast data transfer.

To do this we need to create tns listeners on the private interfaces so that clients can connect through sqlnet. The following steps will show how I did this. I did this configuration on a half rack Database Machine but since I do not have access to that now the example here is based on a test Virtualbox system (which runs 11.2.0.2 RAC) so Infiniband is not here but the steps are the same for Exadata. Also DNS is not used, all addresses are defined in the /etc/hosts files of the nodes.

I have two nodes, rac1 and rac2. The private network is 192.168.0 and the public network is 192.168.56. After the default installation I have the default listeners created on the public interfaces, there is only one SCAN address as this is a test setup.

Assume I have an ETL server that is connected to the private network which needs to connect to the database through the private interface. What I need is a listener per node that is listening on the private IP addresses.

If you start with netca ( from the grid home because that is where the listeners run on 11.2) and try to create the listeners you will see that you will not be able to select the private network.

It will show only the public network because this selection is based on your virtual IP definitions. Since I do not have any VIPs on the private network I do not see it.

So the first step is to create VIPs on the private interfaces. I start by adding the new VIPs to the /etc/hosts files. These lines should be added to both nodes' /etc/hosts file.

I created a VIP for each node on the eth1 interface which is the private interface and I have specified "-k 2" to indicate that the network number is 2. You can use "srvctl add vip -h" to see what options you have.

At this step if you look at the "ifconfig -a" output you will not see the new VIPs up because we have not started them up yet. Let's start them now.

[root@rac1 bin]# ./srvctl start vip -i rac1-ib

Now you will see the new IP up in the "ifconfig -a" output. This is the related line.

I can change the local_listener parameter and add this listener so that my database registers with it. After that as the default installation sets the SCAN listener as the remote_listener SCAN will be able to direct connections to this listener as well. But, there is a problem with this. What happens if SCAN directs the connection from the ETL server to the public interface instead of the private one? Or vice-versa, what happens if it directs connections from the public network clients to the private interface? Users will get errors because they cannot reach the private network and the ETL server will get errors because it cannot reach the public network.

The correct way to register my database to the listeners is to use the listener_networks parameter. listener_networks is a new 11.2 parameter and serves the purpose of cross-registration when you have listeners on multiple networks. Basically with it you can say, "register my local listeners on the public interfaces to the SCAN listener, register my local listeners on the private interface to each other".

This way clients using SCAN will connect to the public interfaces and clients using the private network will connect to the private interfaces. Let's do it now.

Not to clutter the listener parameters with long tns descriptions let's add the definitions to the tnsnames.ora file and use the names instead. On both nodes's tnsnames.ora file residing in the database home I add these lines. Remember to change the hostnames to rac2 for the ones other ORCL_IB when editing the file on rac2.

ORCL_PUBLIC_LOCAL will be used as the local listener for the public network, ORCL_PRIVATE_LOCAL will be used as the local listener for the private network, ORCL_IB will be used as the remote listener for the private network so that the local private listeners can register to each other and the SCAN address will be used as the remote listener for the public network.

Now time to change the parameters.

SQL> alter system set listener_networks='((name=public_network)(local_listener=orcl_public_local)(remote_listener=rac-scan:1521))','((name=priv_network)(local_listener=orcl_private_local)(remote_listener=orcl_ib))';

Now we can use this tnsnames entry to connect through the private network.

ORCL_ETL =

(DESCRIPTION =

(LOAD_BALANCE=on)

(ADDRESS = (PROTOCOL = TCP)(HOST = rac1-ib)(PORT = 1522))

(ADDRESS = (PROTOCOL = TCP)(HOST = rac2-ib)(PORT = 1522))

(CONNECT_DATA =

(SERVER = DEDICATED)

(SERVICE_NAME = orcl)

)

)

It took quite a while for me to get this in writing, if you use this procedure and see problems or if you think there are points not explained clearly, let me know so I can edit the post, this way it will be more useful to others who may need it.

If you are trying to patch an existing 11.2.0.1 installation on a non-RAC GI (also called single instance HA) system be aware that some steps will be different from the patch readme or the documentation.

If you choose to apply GI PSU1 and follow the README there is a section that tells you to prepare the GI home.

"2.2.5 Prepare GI home for patch

As an owner of Oracle GI home software owner, run the command: %/bin/srvctl stop home -o -s -n "

When I ran this on a single instance system I got:

[oracle@oel5 patches]$ srvctl stop home -o /u01/app/product/11.2.0/grid -s /tmp/stat.txtPRCH-1002 : Failed to stop resources running from crs home /u01/app/product/11.2.0/gridPRCH-1030 : one or more resources failed to stop PRCH-1026 : Failed to stop ASMPRCD-1027 : Failed to retrieve database orclPRCD-1035 : orcl is the database unique name of a single instance database, not a cluster database

This is a problem with srvctl and this step must be skipped for non-RAC GI installations as the next step will stop GI. After this README tells you to unlock the grid home:

"You must invoke this script as root user to open protection on GI software files for patch application #/crs/install/rootcrs.pl -unlock"

rootcrs.pl is the script used for RAC environments. If you have non-RAC GI the script you need to use is roothas.pl. After changing rootcrs.pl with roothas.pl and unlocking the grid home you can continue with applying the patch.

MOS note 1089476.1 explains the steps to patch a non-RAC GI home. Keep this note in mind when you are working on such a system.

The ones except gpgkey are self-explanatory. The parameter gpgkey is used to point to a file that contains the public key for the packages you install so that yum can verify the package's authenticity if needed. The file I use is the key file that contains the public key to verify the oracle-validated rpm.

oracle-validated rpm is used to install the necessary packages for Oracle installations, it also updates the kernel parameters and creates a default oracle user. Using it is an easy way to prepare your server for Oracle installations, the other option is to check the installation prerequisites from the documentation and install the packages, update the kernel parameters and create the user yourself.

MOS Note 579101.1 explains how to install the oracle-validated rpm.

I tried to install this rpm without checking the note and I did not use the gpgkey parameter in /etc/yum.conf initially. This is what you get if you do not set it.

If you are using Oracle Enterprise Linux (OEL) 5, the installation media comes with a yum repository on it. The repository is in the directory /media/Enterprise Linux dvd 20090127/Server/repodata for OEL 5.3 (the location may change).

It is possible to use that repository when installing new packages or components locally without accessing a remote repository or without having the need to copy the rpms to a local directory. If you did a base installation the yum package is already installed, if not you need to install it first. After yum is in place edit /etc/yum.conf to insert lines related to the repository on the installation media.

# PUT YOUR REPOS HERE OR IN separate files named file.repo# in /etc/yum.repos.d[local]name="media"baseurl=file:///media/Enterprise%20Linux%20dvd%2020090127/Serverenabled=1gpgkey=file:///media/Enterprise%20Linux%20dvd%2020090127/RPM-GPG-KEY-oracle

You need to add only the last 5 lines, the other ones are already there.After this you can use yum to add packages or components when needed. An example to install the oracle-validated rpm from this repository will be in my next post.

This seems like the problem is related to the virtual disk I created as the root disk. I am using the SATA interface and in the virtual machine storage settings there is an option named "Use host I/O cache" which is unchecked in my case. Checking it and starting up the vm again resolves the issue.

There is a note in Metalink that explains that on Windows having space characters in your ORACLE_HOME variable, the patch location or JDK location causes an error when running opatch. Yesterday I saw a strange problem that is similar to the above case.

If your opatch directory contains space characters you get a strange error. Even if the above conditions were not present we got an error like this:

C:\Documents and Settings\test\Desktop\OPatch>opatch lsinventoryException in thread "main" java.lang.NoClassDefFoundError: and

OPatch failed with error code = 1

Metalink returns no results for this error. This error is caused by the space characters in "Documents and Settings". When you move the opatch directory to another directory which does not contain space in its name opatch runs without this problem.

Yesterday I attended Kevin Closson's Exadata technical deep dive webcast series part 4. It is now available to download here. In there he talks about DBFS which is a filesystem on top of the Oracle database which can store normal files like text files. DBFS is provided with Exadata and is used to store staging files for the ETL/ELT process. This looks very promising, he sites several tests he conducted and gives performance numbers too. Watch the webcast if you haven't yet.

11G brought interval partitioning which is a new partitioning method to ease the maintenance burden of adding new partitions manually. The interval partition clause in the create table statement has an option to list tablespace names to be used for interval partitioning. The documentation states that the tablespaces in the list you provide are used in a round-robin manner for new partitions:

Interval partitions are created in the provided list of tablespaces in a round-robin manner.

This does not mean that any newly created partition will reside in the tablespace which is next on the list. The tablespaces may be skipped if partitions map to more than one interval. Here is a test case that shows how the list is used.

The "store in" clause lists tablespaces tbs1, tbs2 and tbs3 to be used for interval partitioning. After the above create table command I now have one partition which resides in tbs1. Let's insert a row which needs to be inserted into a new partition and see which tablespace the partition will be created in.

The row I inserted maps to one interval, which is one month, it does not have a date value which is more than one month higher than the current maximum value. So the next tablespace, tbs2, is used for the new partition.

I skipped March and inserted a value for April. The current maximum key becomes May 1st, we do not see a partition with a maximum value of Apr 1st. The next tablespace on the list was tbs1 but we see that the new partition is on tbs2, not tbs1. Tbs1 would be used if I did not skip an interval when inserting rows.

So, the tablespaces on the list are used in a round-robin manner but each is used for only one interval. If you skip intervals the tablespaces related to that interval are skipped too.

This is something to keep in mind if you want to strictly decide which tablespace will hold which partition.

I was trying to find out why a query with the RULE hint produces different plans. I got stuck so I posted the problem to the OTN database forum and Randolf Geist provided a good answer and starting point for it. A second eye on the problem can remove the mind blockage and clear your way.

When changing plsql code in a production system the dependencies between objects can cause some programs to be invalidated. Making sure all programs are valid before opening the system to users is very important for application availability. Oracle automatically compiles invalid programs on their first execution, but a successfull compilation may not be possible because the error may need code correction or the compilation may cause library cache locks when applications are running.

Dependencies between plsql programs residing in the same database are easy to handle. When you compile a program the programs dependent on that one are invalidated right away and we can see the invalid programs and compile or correct them before opening the system to the users.

If you have db links and if your plsql programs are dependent on programs residing in other databases you have a problem handling the invalidations. There is an initialization parameter named remote_dependencies_mode that handles this dependency management. If it is set to TIMESTAMP the timestamp of the local program is compared to that of the remote program. If the remote program's timestamp is more recent than the local program, the local program is invalidated in the first run. If the parameter is set to SIGNATURE the remote program's signature (number and types of parameters, subprogram names, etc...) is checked, if there has been a change the local program is invalidated in the first run.

The problem here is; if you change the remote program's signature or timestamp you cannot see that the local program is invalidated because it will wait for the first execution to be invalidated. If you open the system to the users before correcting this you may face a serious application problem leading to downtime.

The second execution compiles the package and returns the status to valid.

How can we know beforehand that the local program will be invalidated on the first execution? The only way I can think of is to check the dependencies across all databases involved. By collecting all rows from dba_dependencies from all databases we can see that when the remote program is changed the programs on other databases that use this remote program will be invalidated when they are executed. Then we can compile these programs and see if they compile without errors.

Database links may be very annoying sometimes, this case is just one of them.

In his famous index internals presentationRichard Foote mentions a bug in 9i about index block splits when rows are inserted in the order of the index columns. Depending on when you commit your inserts the index size changes dramatically.

While I was trying to find out why a 3-column primary key index takes more space than its table I recalled that bug and it turned out that was the reason of the space issue. The related bug is 3196414 and it is fixed in 10G.

I am trying to insert the rows in the order of the primary key column, so what I expect to see is that when an index block fills there will be a 90-10 split and the index will grow in size. But as the number of leaf block splits show there are 35 block splits and none of them are 90-10 splits meaning all are 50-50 block splits. I have 36 leaf blocks but half of each one is empty.

If we try the same inserts but commit after the loop the result changes.

In this case we see that there have been 18 block splits and all were 90-10 splits as expected. We have 19 leaf blocks and all are nearly full. Depending on where the commit is we can get an index twice the size it has to be. When I ran the same test in 10G it did not matter where the commit was. I got 19 leaf blocks in both cases.

I did not test if this problem happens when several sessions insert a single row and commit just like in an OLTP system but I think it is likely because we have indexes showing this behavior in OLTP systems.

I am regular user of OTN and its forums. Last week I was trying to login to OTN from a public computer and I got the "invalid login" error everytime I tried. I was sure I was typing my password correct but I could not get in anyway. So, I tried to get my password reset and sent to my e-mail address. Then I remembered that the e-mail address I used to register for OTN was from my previous employer meaning I did not have access to it anymore. As OTN does not allow changing the registration e-mail address I was stuck. I send a request from OTN to get my password delivered to my current e-mail address. Here is the reply I got:

Resolution: Oracle's membership management system does not currently supportthe editing of the email address or username in your membership profile.(It will support this capability in a future release.)Please create a new account with the new email address you wish to use. However,it is possible to change the email at which you receiveDiscussion Forum "watch" emails (see "Your Control Panel" when logged in).

They tell me to create a new user and forget about my history, watch list, everything. What a user centric approach this is.

If you are an OTN member do not lose your password and your e-mail account at the same time, you will not find anybody from OTN who is willing to solve your problem and help you to recover your password.

I am used to bad behavior and unwillingness to solve problems in Metalink, now I get the same behavior in OTN. Whatever, just wanted to let you know about it.

Complete refresh of a single materialized view used to do a truncate and insert on the mview table until 10G. Starting with 10G the refresh does a delete and insert on the mview table. This guarantees that the table is never empty in case of an error, the refresh process became an atomic operation.

There is another difference between 9.2 and 10G in the refresh process, which I have realized when trying to find out why a DDL trigger prevented a refresh operation. In 9.2 the refresh process runs an ALTER SUMMARY statement against the mview while in 10G it does not.

We have a DDL trigger in some of the test environments (all 9.2) for preventing some operations. The trigger checks the type of the DDL and allows or disallows it based on some conditions. After this trigger was put in place some developers started to complain about some periodic mview refresh operations failing a few weeks ago. I knew it was not because of the TRUNCATE because the trigger allowed truncate operations.

So I enabled sql tracing and found out what the refresh process was doing. Here is a simplified test case for it in 10G.

YAS@10G>create table master as select * from all_objects where rownum<=5;

*ERROR at line 1:ORA-04045: errors during recompilation/revalidation of YAS.MVIEWORA-20001: DDL NOT ALLOWEDORA-06512: at "SYS.DBMS_SNAPSHOT", line 820ORA-06512: at "SYS.DBMS_SNAPSHOT", line 877ORA-06512: at "SYS.DBMS_SNAPSHOT", line 858ORA-06512: at line 1

After enabling sql trace and running the refresh again, the trace file shows the problem statement.

*ERROR at line 1:ORA-04045: errors during recompilation/revalidation of YAS.MVIEWORA-20001: DDL NOT ALLOWEDORA-06512: at "SYS.DBMS_SNAPSHOT", line 820ORA-06512: at "SYS.DBMS_SNAPSHOT", line 877ORA-06512: at "SYS.DBMS_SNAPSHOT", line 858ORA-06512: at line 1

The last statement in the trace file is this:

ALTER SUMMARY "YAS"."MVIEW" COMPILE

So, in 9.2 refresh process runs an ALTER SUMMARY command against the mview. After changing the trigger to allow DDL operations on SUMMARY objects I could refresh the mview without errors.

Tonguc Yilmaz had a recent post about using 10046 for purposes other than performance tuning, like finding out what statements a procedure runs. I find uses for sql tracing nearly everyday and the above case is one of them. When you are not able to understand why an error happens, it is sometimes very useful to turn on sql tracing and examine the trace file. Errors related to specific bind values, errors raised from a "when others" block (you know we do not like "when others", right?) are a couple of things you can dig and analyze with sql tracing.

I have been looking for a solution to keep track of blog comments, both mine and other people's. Not all blogs have the option to subscribe to the comments, when you comment on a post you need to check later if someone commented further. I want a central repository where I can see all my comments on a post, all comments made by others on the same post and all comments made on my own blog.

I tried Cocomment before but I was not satisfied with it. So I decided to give Disqus a try and enabled it on this blog. Laurent Schneider decided to try it too.

Then Andy made another post about the concerns of some people about Disqus (one being Tim Hall). Their concerns make sense. Who owns the comments one makes, the commenter or the blog owner? Is it sensible to store the comments to your blog not in your blog database, but elsewhere? What if Disqus is not there next year, what if Disqus is inaccessible for some time? Is it possible to export the comments from Disqus and import them back to the blog?

Some of these may be irrelevant if you are using a public blogging service, like Blogger, because it means you are already storing your posts and comments somewhere else. The question "What if Blogger is not there next year?" comes to mind for example.

A solution suggested by Tim Hall is to dual post the comments. The comments will be in the blog and on Disqus also, this need the blogging service to provide a way to do it.

Another solution can be a strong export-import utility in Disqus. That way you can export the comments and put them back to the blog whenever you want. Disqus currently has an export utility but as far as I have read it is not reliable for now.

While I agree with these concerns I liked what Disqus provides. The one-stop page for all comments, the threading of comments, being able to follow other users are primary features I like. So, I will stick with it, at least for now.