The fact that we now officially know that each PDB will have their own local UNDO ? (no, this was easy to guess)

The fact that we will be able to flashback a PDB ? (nope. This, too, was easy to guess)

I’ll give you a clue. The best news was something that sneakily appeared between April and October 2016 in your patch maintenance logs :

[root@myclusterdb02 OPatch]# ./opatchauto apply /patches/OCT2016_bundle_patch/24436624/Database/12.1.0.2.0/12.1.0.2.161018DBBP/24448103 -oh /u01/app/oracle/product/12.1.0.2/dbhome_1 -ocmrf /u01/app/oracle/product/12.1.0.2/dbhome_1/OPatch/ocm/bin/ocm.rsp
System initialization log file is /u01/app/oracle/product/12.1.0.2/dbhome_1/cfgtoollogs/opatchautodb/systemconfig2016-12-11_05-23-36PM.log.
Session log file is /u01/app/oracle/product/12.1.0.2/dbhome_1/cfgtoollogs/opatchauto/opatchauto2016-12-11_05-24-09PM.log
WARNING: the option -ocmrf is deprecated and no longer needed. OPatch no longer checks for OCM configuration. It will be removed in a future release.

The interesting part is in that 4th line :

WARNING: the option -ocmrf is deprecated and no longer needed. OPatch no longer checks for OCM configuration. It will be removed in a future release.

This is awesome! We no longer need this annoying ocm.rsp file !

When I discovered this back in November 2016, I searched for an official feature announcement but found nothing (it looks like an opatch 13c new feature backported in 11g and 12c). This ocm.rsp file was still needed when I applied the April 2016 PSU, but from which opatch version was it implemented?

First, I looked at the 2016 dev comments in opatch 11.2.0.3.15 and 12.2.0.1.7 (which were the last version when I was applying this Exadata October 2016 PSU) and I sadly discovered that no specific comments were talking about this new feature (even if it would obviously deserve some):

I published a blog post called “Investigating IO Performance on Amazon RDS for Oracle” recently, and soon after posting it I received several questions asking if IO worked the same way on EC2 instances. My immediate though was it did, mostly because RDS for Oracle is basically an EC2 instance with Oracle Database on top of it, where the configuration is fully managed by Amazon. But as always, it’s better to test than assume, so here we go!

Although the testing was done by running a workload in an Oracle database, the results will apply to any other type of workload because the performance characteristics purely depend on the type of instance, type of EBS volume and the size of the IO requests, and it doesn’t matter how the data is processed after it’s retrieved from the storage.

The Testing

The testing was done exactly the same was as described in the previous blog post, the only difference was that I had to create an oracle database manually by myself. I used the 12.1.0.2.0 database Enterprise Edition and ASM, and the EBS volume was used as an ASM disk.

Measuring the IO performance

On RDS we had the nice Enhanced Monitoring which I set up with a refresh interval of a few seconds and I used it to collect performance statistics quickly. For EC2 (specifically for EBS volumes), there is no such thing as enhanced monitoring, so I needed to use the standard CloudWatch monitoring with the minimum refresh rate of 5 minutes (very inconvenient, because a single test case would have to be run for 5 to 10 minutes to collect reliable data). This was not acceptable, so I looked for alternatives and found that iostat displayed the same values the monitoring graphs did:

the “tps” showed IO per second, and “kB_read/s”+”kB_wrtn/s” allowed me to calculate the throughput (I actually ended up using just the kB_read/s as my workload was 100% read only and the values in kB_wrtn/s were tiny).iostat is even more convenient to use than the enhanced monitoring, it didn’t take long to see the first benefit of EC2 over RDS!

The Results

It was no surprise, the outcome of testing on EC2 was quite similar to the results from testing on RDS.

Provisioned IOPS Storage on an EC2 Instance

As on RDS, the testing was done on db.m4.4xlarge with 100G of io1 with 1000 provisioned IO/s. Also the results are very very similar, the only notable difference that I could observe (although I can’t explain it, and I’m not sure there is a patter in it or not, as I did’t do too many tests), was the fact that the throughput for 64K-96K IOs didn’t reach the same level as 128K+ IOs.

Provisioned IOPS Throughput (on EC2)

Provisioned IOPS Measured IO/s (on EC2)

These results confirm that (the same as with RDS), there are several sizes of physical IOs: (8), 16, 32, 64 and 128, and starting with 64K, the performance is throughput-bound, but with IOs of smaller size, it’s IOPS-bound.

General Purpose Storage on an EC2 Instance

The testing with General Purpose SSDs (100G with 300 baseline IOPS) didn’t provide any surprises and the results were exactly the same as for RDS.
The only difference in the graphs is the “bust performance” measures for IOs of different sizes that I’ve added to outline how the “bursting” improves both IO/s and Throughput.

General Purpose SSD Throughput (on EC2)

General Purpose SSD Measured IO/s (on EC2)

These results also confirm that (the same as with RDS), there are several sizes of physical IOs: 16, 32, 64 and 128, and starting with 32K, the performance is throughput-bound, but with IOs of smaller size, it’s IOPS-bound.

Additional Flexibility with EC2Using Multiple gp2 Volumes

Opposite to RDS, I can configure my storage and instance more freely, so instead of having just a single gp2 volume attached to it I added five 1G-sized (yes tiny) volumes to the +DATA disk group. the minimum IOPS for a gp2 volume is 100, so my 5 volumes gave cumulative 500 baseline IOPS. As ASM was used, the IOs were +/- evenly distributed between the volumes.

I didn’t do too thorough testing, but still I noticed a few things.
Take a look at these iostat outputs from testing done with 8K reads (this is burst performance):

Bursting performance applies to each volume separately. It should allow getting up to 3000 IOPS per volume, but I reached only ~1200 per volume with cumulative throughput of 48214 KB/s (not even close to the limit). So there’s some other limit or threshold that applies to this configuration (and it’s not the CPU). But look! I’ve got 6024 IO/s burst performance, which is quite remarkable for just 5G.

As I was not hitting the maximum 3000 bursting IOPS per volume, the burst credit was running out much slower. if it lasts normally ~40 minutes at 3000 IOPS, it lasts ~3 times longer at ~1200 IOPS, which would allow running at better performance longer (i.e if one used 5x2G volumes instead of 1x10G volume)

This iostat output is from testing done with 1M reads (this is burst performance):

The cumulative throughput is 245111 KB/s, which is very close to the throughput limit of the instance. I wasn’t able to reach such throughput on a single volume of gp2, where the maximum I observed was just 133824 KB/s, and 163840 KB/s is a throughput limit for a single gp2 volume which was bypassed too. It appears that configuring multiple volumes allows reaching the instance throughput limit that was not possible with a single volume.

I didn’t run any non-burst tests as it required too much time (2 hours of waiting to exhaust the burst credits).

Database with a 32K Block Size

We have observed that starting with 32K block reads the EBS volume become’s throughput-bound, not IOPS-bound. Obviously I wanted to see how it performed if the database was created with a 32K block size.
I ran a few very simple tests using 1 data block sized IOs (32K) on these two configurations:

db.m4.4xlarge with 100G / 1000 PIOPS (io1)

db.m4.4xlarge with 20G / 100 IOPS (gp2)

There were no surprises on the Provisioned IOPS storage and I got the 1000 IOPS that were provisioned (actually it was slightly better – 1020 IO/s), and the throughput was 32576.00 KB/s
On General Purpose SSD, the story was different – we know that starting from 32K-sized IOs, the performance becomes throughput-bound, and it was confirmed here too:

During burst period I measured up to 4180 IO/s at 133779 KB/s, which was 4 times faster than Provisioned SSD.

During non-burst period I measured up to 764 IOs at 24748 KBs/s throughput. Which is somewhat slower than Provisioned SSD. Also 24748 KBs/s, was slower than the throughput I measured on a 100G gp2 volume (we already ow that the non-burst throughput limit for gp2 depends on the size of the disk). If I used a 100G gp2 volume, I’d get 2359 IO/s at 75433 KB/s (this is from the graph above), which is also better that what one can get from a Provisioned SSD volume, and costs less.

Conclusions

Most of the conclusions were already outlined in the previous blog post, and they also apply to the EC2 instances when a single EBS volume is used for storage.

On the other side, the EC2 instance allows System administrators and DBAs (or should I say “Cloud Administrator”) to work around some of the limitations by changing the “variables” that can’t be altered on RDS – like, the block size of the database (which is 8K on RDS), and the number of EBS volumes behind the RDS configuration. Using a 32K block size for a database residing on General Purpose volume allows bypassing the IOPS limitation completely, and only throughput limits stay in effect. However, if 32K block size is not an option (as for Oracle e-Business Suite), then the IOPS and throughput can still be maximized by using a configuration of multiple GP2 volumes.

After having all these tests done, I think the only reason for using RDS instead of EC2 is the database management that is provided by Amazon. If that is something very critical for your requirements, it’s the way to go. If it’s not something you require – the EC2 can be configured to perform better for the same price, but you need to think about it’s maintenance by yourself.

I’ve recently been involved in quite a few database migrations to Oracle RDS. One thing that I had noticed when dealing with post-migration performance issues was related to queries that used TABLE SCAN FULL in their execution. It seemed, that in many cases, it just took a single query to max out the allocated IOPS (IOs per second) or bandwidth, which in turn would caused overall slowness of the RDS instance.

The search in documentation showed that it could have been caused by how IO operations are counted on Amazon RDS, as it’s quite different from what a routine Oracle DBA like me would expect. For multi-block reads the database (depending on storage) would typically issue IOs of size up to 1MB, so if an 8K block size was used the table scans would read up to 128 blocks in a single IO of db file scattered read or direct path read.

Now, pay attention to what the AWS documentation says:While Provisioned IOPS (io1 storage) can work with I/O sizes up to 256 KB, most databases do not typically use such large I/O. An I/O request smaller than 32 KB is handled as one I/O; for example, 1000 16 KB I/O requests are treated the same as 1000 32 KB requests. I/O requests larger than 32 KB consume more than one I/O request; Provisioned IOPS consumption is a linear function of I/O request size above 32 KB. For example, a 48 KB I/O request consumes 1.5 I/O requests of storage capacity; a 64 KB I/O request consumes 2 I/O requests, etc. … Note that I/O size does not affect the IOPS values reported by the metrics, which are based solely on the number of I/Os over time. This means that it is possible to consume all of the IOPS provisioned with fewer I/Os than specified if the I/O sizes are larger than 32 KB. For example, a system provisioned for 5,000 IOPS can attain a maximum of 2,500 IOPS with 64 KB I/O or 1,250 IOPS with 128 KB IO.
… and …I/O requests larger than 32 KB are treated as more than one I/O for the purposes of PIOPS capacity consumption. A 40 KB I/O request will consume 1.25 I/Os, a 48 KB request will consume 1.5 I/Os, a 64 KB request will consume 2 I/Os, and so on. The I/O request is not split into separate I/Os; all I/O requests are presented to the storage device unchanged. For example, if the database submits a 128 KB I/O request, it goes to the storage device as a single 128 KB I/O request, but it will consume the same amount of PIOPS capacity as four 32 KB I/O requests.

Based on the statements above it looked like the large 1M IOs issued by the DB would be accounted as 32 separate IO operations, which would obviously exhaust the allocated IOPS much sooner than expected. The documentation talks only about Provisioned IOPS, but I think this would apply to General Purpose SSDs (gp2 storage) too, for which the IOPS baseline is 3 IOPS/GB (i.e. 300 IOPS if allocated size is 100GB of gp2).

I decided to do some testing to find out how RDS for Oracle handles large IOs.

The Testing

For testing purposes I used the following code to create a 1G table (Thanks Connor McDonald and AskTom):

Basically, I’ll flush the buffer cache, which will force the direct path reads by setting _serial_direct_read to “ALWAYS”, and then, will choose the db_file_multiblock_read_count based on how big IOs I want to issue (note, by default the db_file_multiblock_read_count is not set on RDS, and it resolves to 128, so the maximum size of an IO from the DB is 1 MB), I’ll test with different sizes of IOs, and will Capture the throughput and effective IOPS by using the “Enhanced Monitoring” of the RDS instance.

Side-note: the testing I had to do turned out to be more complex than I had expected before I started. In few cases, I was limited by the instance throughput before I could reach the maximum allocated IOPS, and due to this, the main testing needed to be one on large enough instance (db.m4.4xlarge), that had more of the dedicated EBS-throughput.

The ResultsProvisioned IOPS storage

Testing was done on a db.m4.4xlarge instance that was allocated 100GB of io1 storage of 1000 Provisioned IOPS. The EBS-optimized throughput for such instance is 256000 KB/s.
The tests were completed by using db_file_multiblock_read_count of 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 16, 32, 64 and 128.
For each test the Throughput and IO/s were captured (from RDS CloudWatch graphs), and also the efficient IO size was derived.
The DB instance was idle, but still, there could be few small IO happening during the test.

Provisioned IOPS Measured Throughput

Provisioned IOPS Measured IO/s

From the graphs above the following features that are not documented can be observed:

The RDS instance is dynamically choosing the physical IO size (I’ll call them “physical“, just to differentiate that these are the IOs to storage, while in fact, that’s only what I see in the CLoudWatch graphs, the real physical IO could be some something different) based on the size of the IO request from the database. The possible physical IO sizes appear to be 16K, 32K, 64K, 128K and probably also 8K (this could also be in fact 16K physical IO reading just 8K of data)

The IOPS limit applies only to smaller physical IOs sizes (up to 32K), for larger physical IOs (64K, 128K) the throughput is the limiting factor of the IO capability. The throughput limit appears to be quite close to the maximum throughput that the instance is capable of delivering, but at this point, it’s not clear how the throughput limit for particular situation is calculated.

Throughput Limits for Provisioned IOPS

I ran additional tests on differently sized instances with io1 storage to understand better how the maximum throughput was determined. The graph below represents the throughput achieved on different instances, but all had the same 100G of 1000 PIOPS io1 storage. The throughput was done by using db_file_multiblock_read_count=128:

PIOPS Throughput by Instance Type

it appears that the maximum throughput is indeed limited by the instance type, except for the very largest instance db.m4.10xlarge (For this instance the situation is somewhat weird even in the documentation because the maximum throughput is mentioned as 500 MB/s, but the maximum throughput for a single io1 EBS volume, which should be there underneath the RDS, is just 320 MB/s, and I was unable to reach any of these limits)

General Purpose SSD storage

Testing was done on a db.m4.4xlarge instance that was allocated 100GB of gp2 storage with 300 IOPS baseline. The EBS-optimized throughput for such instance is 256000 KB/s.
The tests were conducted similarly to how they were done for Provisioned IOPS above (note, this is the baseline performance, not burst performance)

General Purpose SSD Measured Throughput

General Purpose SSD Measured IO/s

Similarly to Provisioned IOPS, the General Purpose SSD storage behaves differently from what’s explained in the documentation:

The physical IO size again is calculated dynamically based on the size of the IO request from the database. The possible sizes appear to be the same as for io1: (8K), 16K, 32K, 64K and 128K.

The IOPS limit (to baseline level) appears to apply to IO sizes only up to 16K (compared to 32K in case of Provisioned IOPS), for larger physical IOs starting from 32K, the limit appears to be throughput-driven.

It’s not clear how the throughput limit is determined for the particular instance/storage combination, but in this case, it appeared to be around 30% of the maximum throughput for the instance, however, I didn’t confirm the same ratio for db.m4.large where the maximum achievable throughput depended on the allocated size of the gp2 storage.

Burst Performance

I haven’t collected enough data to derive anything concrete, but during my testing I observed that Burst performance applied to both maximum IOPS and also the maximum throughput. For example, while testing on db.m4.large (max instance throughput of 57600 KB/s) with 30G of 90 IOPS baseline performance, I saw that for small physical IOs it allowed bursting up to 3059 IOPS for short periods of time, while normally it indeed allowed only 300 IOPS. For larger IOs (32K+), the baseline maximum throughput was around 24500 KB/s, but the burst throughput was 55000 KB/s

Throughput Limits for General Purpose SSD storage

I don’t really know how the maximum allowed throughput is calculated for different instance type and storage configuration for gp2 instances, but one thing is clear: that instance size, and size of the allocated gp2 storage are considered in determining the maximum throughput. I was able to achieve the following throughput measures when gp2 storage was used:

75144 KB/s (133776 KB/s burst) on db.m4.4xlarge (100G gp2)

54500 KB/s (same as burst, this is close to the instance limit) on db.m4.large (100G gp2)

24537 KB/s (54872 KB/s burst) on db.m4.large (30G gp2)

29116 KB/s (burst was not measured) on db.m4.large (40G gp2)

37291 KB/s (burst was not measured) on db.m4.large (50G gp2)

Conclusions

The testing provided some insight into how the maximum performance of IO is determined on Amazon RDS for Oracle, based on the instance type, storage type, and volume size. Despite finding some clues I also understood that managing IO performance on RDS is far more difficult than expected for mixed size IO requests that are typically issued by Oracle databases. There are many questions that still need to be answered (i.e. how the maximum throughput is calculated for gp2 storage instances) and it’d take many many hours to find all the answers.

On the other-hand, the testing already revealed a few valuable findings:

Opposite to the documentation that states that all IOs are measured and accounted in 32KB units, we found that an IU reported by amazon can be of size 8K (probably), 16K, 32K, 64K and 128K

For small physical IOs (up to 32K in case of Provisioned IOPS and up to 16K in case of General Purpose SSD) the allocated IOPS is used as the limit for the max performance.

For larger physical IOs (from 64K in case of Provisioned IOPS and from 32K in case of General Purpose SSD) the throughput is used as the limit for the max performance, and the IOPS limit no longer applies.

The Burst performance applies to both IOPS and throughput

P.S. As to my original issue of a single TABLE SCAN FULL severely impacting the overall performance, I found that in many cases we were using small RDS instances db.m3.large or db.m4.large, for which the maximum throughput was ridiculously small, and we were hitting the throughput limitation, not the IOPS limit that actually didn’t apply to the larger physical IOs on gp2 storage.

Sometimes it is necessary to put on your uber-geek hat and start using cryptic bits of code to retrieve information from an Oracle database. Troubleshooting enqueue locking events in Oracle databases is one of the times some advanced SQL may be necessary.

Likely you have used SQL similar to the following when troubleshooting Oracle enqueue’s, probably in connection with row lock contention.

The value for P1 is not very useful as is; Oracle has encoded the type of enqueue and the requested mode into the column. When working with current events such as when selecting from v$session, it is simple to determine the type of lock and the mode requested by querying v$lock, such as in the following example:

Session 68 is waiting on a TX enqueue with requested lock mode of 6. Seasoned Oracle DBA’s will recognize this as classic row lock contention.

Why bother to find out just which type of enqueue this is? There are many types of locks in Oracle, and they occur for differing reasons. The TX lock is interesting as it can occur not only in Mode 6 but Mode 4 as well; Mode 4 refers to locks that involve unique keys, such as when 2 or more sessions try to insert the same value for a primary key. The following example shows just that:

Knowing just which lock mode is requested is vital, as the troubleshooting for TX Mode 4 locks will be different from what is used to troubleshoot Mode 6.

Though we can find the lock name and mode information in v$lock, there is still value in being able to decipher that cryptic P1 column.

The ASH and AWR facilities do not include any historical information for the lock name and mode; the P1 column found in v$active_session_history and dba_hist_active_sess_history does not have a corresponding dba_hist_lock view. Any research done after an event has occurred does require decoding this information.

Deciphering v$session.p1

Oracle Support document 34566.1 is the enqueue reference note that provides information needed to get the lock name and mode from the p1 column. As you will see this information is a bit puzzling.

The rest of this article will focus on TX Mode 6 locks. The value shown for this lock and mode in the P1 column is always 1415053318. Following is the SQL recommended by Oracle:

The bitand function is used to mask all unwanted bits to 0. The number is then divided by value needed to remove all of the now-zeroed-out lower order bytes.

The values being used as bitmasks are -16777216 and 16711680. The use of -16777216 does not seem to make sense. As the intent is to mask all but one byte, I would expect to find an FF surrounded by a number of zeroes. The bit mask of 16711680, however, looks fine.

Now let’s run the Oracle support query again, but modified to show just the integer values rather than converting them to ASCII.

Well, that is interesting. An implicit conversion is taking place with to_char() that is removing the decimal portion of these numbers. Is that being done with trunc(), round(), or something else? I don’t know the answer to that. What seems more important is just doing the math correctly.

There are a couple of things here that can be changed to make this work as expected.

A New BitMask

Let’s modify the first bitmask to something that seems more reasonable than -16777216. Let’s use this instead, as it masks only the single byte we need:

The original divisors are off by 1, which does not completely remove the lower order values.

16777215 = 0xFFFFFF
65535 = 0xFFFF

Increasing each by one has the desired effect.

New Divisor Values

16777216 = 0x1000000
65536 = 0x10000

Conclusion

Those odd bitmasks have been in the back of my mind for some time, and today it seemed like a good idea to dig in and find out more about them. It isn’t too hard to imagine that in some cases the wrong values might be returned, leading to some long and unproductive troubleshooting sessions.

There is a demo script enqueue-bitand.sql containing much of the SQL found in this article. There is also a script awr-top-sqlid-events.sql that incorporates the enqueue lock decoding. This script could be made better than it is, so please issue a pull request if you have some useful modifications.

As an Oracle DBA, one of our daily tasks is to optimize bad SQL statements that are affecting the system and causing performance degradation. First we identify the culprit SQL, then we extract the execution plan, after that, we start the cycle of SQL tuning and optimization as appropriate.

There are many methods to extract the execution plan for a specific SQL statement, however, not all these methods will provide the actual or real plan that optimizer follows, rather than a suggestion or expectation of the execution plan, which is not always accurate and can be misleading.

In the following blog, I will demonstrate various execution plan extraction methods, and create a scenario to test these methods then see which ones provide the actual execution plan that optimizer really follows to execute a specific SQL, and which methods provide a suggestion of the execution plan, that is not necessarily used by Oracle optimizer.

So first things first, what is an “execution plan”?

An execution plan is a set of steps that the optimizer performs when executing a SQL statement and performing an operation.

There are many ways to extract the execution plan for a specific SQL, like:

Enterprise Manager

SQL*Plus AUTOTRACE

EXPLAIN PLAN command

SQL Trace (event 10046) with tkprof

DBMS_XPLAN package to view plans from:

Automatic Workload Repository

V$SQL_PLAN

SQL Tuning Sets

Plan table

Some of these methods will provide the “actual” execution plan, while other methods will provide a “suggestion” of the execution plan, that is, the steps that Oracle expects optimizer to follow in order to execute a SQL statement, which may not always be true.

In this test case, I will create a table with two columns, and insert skewed data into one column, then I will build an index and histogram on that column. I will then query that column using bind variables and see if Oracle will use the index or not.

It should be mentioned that any method that provides suggestion of the execution plan, is just a synonym of EXPLAIN PLAN command, and because this command does not use the feature of Bind Variable Peeking, it will not generate optimal plan for each different value in this test case, and that will be the crucial bit that this test case depends on.

We see clearly that optimizer accessed table using index range scan, which is good, so now are 100% sure that the real execution plan used an INDEX RANGE SCAN, let’s compare this result with other results.

> SQL*Plus AUTOTRACE:

Autotrace is very useful way to get SQL statistics, but will it provide the real execution plan?

Ops!, AUTOTRACE suggests that optimizer will use a full table scan to access the table, which is different from the execution plan provided in previous step, so AUTOTRACE will provide a suggestion of the execution plan, which may not always be true, now let’s continue with the next method.

> EXPLAIN PLAN:

Explain Plan command will provide the execution plan without even running the SQL, so we can derive easily that we will get a suggestion of the execution plan:

So far, we have tested four execution plan methods, AUTOTRACE and EXPLAIN PLAN methods which will provide suggestions of the execution plan, DBMS_XPLAN.DISPLAY_CURSOR which will always provide the real execution plan, and SQL Trace which can provide both suggested and real execution plan.

As we can see, EXPLAIN PLAN is aware of the baseline and showing that optimizer will perform index scan.

Summary:

There are many methods to extract the execution plan, some methods will provide the actual execution plan, while other methods will provide just a suggestion that optimizer may not follow in real time. In this test case we have tested four methods for execution plan extraction (DBMS_XPLAN, AUTOTRACE, EXPLAIN PLAN, and SQL Trace), we have then created an SQL Baseline and performed the test again on some methods to confirm that all methods are always aware of the SQL Baseline.

Some time ago I published a post regarding the Database Migration Service (DMS) on AWS. The service had just appeared at that time and looked very promising. It is still appealing and has a good use and potential to mature and be better. I was able to test and try it in several cases moving data to AWS and between RDS databases inside AWS. During those exercises, I ran into some issues and wanted to make other people aware of some things to keep in mind when starting or planning to use the DMS. Most of my experience with DMS is related to migrations to and from Oracle databases. So the following tips are about Oracle migrations and replications.

Before planning any kind of a logical replication based on transaction logs please check what kind of data types you have and whether you have primary keys on all your replicated tables. The primary key existence in some cases is not only desirable but is required to properly replicate the objects. As an example, if you want to replicate a table with some lob objects using DMS you must have a primary key for the table because if you don’t your lob columns will be excluded from replication and you end up with null values instead of the lobs on the target database.

Check for datatypes used for the tables. All logical replications have some limitations in terms of replicated data types. You may find some of your tables can be replicated with some limitations or not replicated at all.

Let’s speak about DMS. When you plan your DMS replication tasks keep in mind that you can combine replication of several schemas to one task. It may significantly reduce load on the source system. Each migration or replication task may apply a considerable load to the source system. In my experience, we hit 100% CPU and max IO load running 8 replication tasks on the source m3.large system.

Remember you cannot change the created task. You are able to stop, start or resume the task but you cannot alter any rules or parameters for the task. It will maybe change soon but currently, it is not possible yet.

If you choose the truncate option for your migration and replication task it may change your metadata. You may find your indexes and constraints to be renamed and you even can lose some of the indexes. In my case, I got renamed primary key and unique key constraints and couple of indexes were lost. Please be careful. After that case, I tried to choose “do nothing” mode and do everything by myself preparing the target for data migration and replication.

You can use RDS snapshot to instantiate your data if you plan to setup a replication between two RDS instances of the same type. In our case, it was done to minimize downtime when the database was migrated to an encrypted storage. When you use RDS snapshot you can use snapshot creation time as “Custom CDC start time” for your DMS replication task.

If you use one universal user for your endpoints to replicate multiple schemas you will need to use transformation rules because the replication will try to use schema from endpoint as destination target by default. By other words, you set up a target endpoint using user “DMS” and try to replicate schema SCOTT it will use schema “DMS” as a destination by default if you don’t have the transformation rules.

You have to enable minimal supplemental logging on database level and supplemental logging for all columns explicitly for each replicated table. Even you enable supplemental logging for all columns on database level using “alter database” you still need to add it on table level. The DMS task will be aborted without it complaining about the lack of supplemental logging for all column for the table.

If you create a table in one of your replicated schema you need to add a clause for supplemental logging like “CREATE TABLE …ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS; ” . In that case, the table and data will be replicated.

It is better to use AWS cli to get most of the service since all new features and additions will be in cli first. As an example, if you want a debug logging for your replication you have to use AWS cli.

If you have only unique index on a table and it is function based index your data migration and replication task can fail. Also, it may fail if you have more than one unique index on a table.

I hit some issues with monitoring. The idea looks good but it requires some fixing. It looks like it doesn’t work correctly in Firefox and Safari. At least for me, it was not working right.

The status of a task may not tell you everything. Sometimes it shows state “ERROR” but nevertheless, it works and replicates data behind the scenes. So, it can be bit misleading. I look to statistics and monitoring pages for the task to get the full picture.

As a summary, I can say the service deserves attention and can be considered as a valuable option when you plan your migration strategy and AWS DMS team works hard to make it better. Happy migrations!

The other day I was running an upgrade for a client that is using ACLs ( Access Control Lists) from 11.2.0.3 to 12.1.0.2. If you have been doing upgrades to 12c, you know that when running the catctl.pl -n 4 catupgrd.sql it entails 73 steps. So this upgrade failed in step 65 with the following error (I have trimmed the output for reading purposes) :

This was a small post to make you aware that if you are using ACLs , you need to run the patch 20369415 to the 12c binaries so that you don’t have to face a possible database corruption and have a harder time upgrading your database.

Databases or schemas tend to get moved around between different servers or even datacenters for hardware upgrades, consolidations or other migrations. And while the work that needs to be done is pretty straight forward for DBAs, I find the most annoying aspect of that is updating all client connect strings and tns entries used with new IP addresses and – if not using services – also the SID as the instance name might have changed.

That process can be simplified a lot when following a simple good practice of creating an extra service for each application or schema and along with that service also a DNS name for that IP. With that in place, a database can be migrated without the need to touch client connection parameters or tns aliases. All that is needed will be to migrate the database or schema, create the service name on the new instance and update the DNS record to the new machine.

Demo

Here is an example. I am migrating a schema from an 11g single instance on my laptop to a RAC database in the oracle public cloud. I am connecting to that database with blog_demo.pythian.com both as the hostname (faked through /etc/hosts instead of proper DNS for this demo) and the service name. As an application I am connecting to the database with sqlcl and a static connection string. Just remember that the whole, and only point of this demo is to migrate the schema without having to change that connect string.

Updated the DNS or as in this simplified demo my /etc/hosts and now I can connect with the same connection string. Note that the IP, the instance_name and the version have changed without the need to modify the connection string.

When creating services for your applications to connect to a schema, also create a DNS entry for that and use this DNS name and the service for all client and application connections instead of using the hostname and SID. This might initially look like overhead but allows for flexibility when migrating schemas or databases to other systems. Updating DNS and creating a new service on the target machine can be changed in central places and saves updating potentially hundreds of client connect strings or tnsnames across the enterprise.

In the last post of this series about Oracle net services, I talked about how services can help you identify performance issues faster and easier by tagging connections with service names. Today I am introducing you to the idea of temporarily disabling connections during maintenance with the help of services.

During deployments, testing or reorganizations it might be necessary to prevent clients from connecting to the database while still allowing access for DBAs to do their work. Some methods to do this include temporarily locking application user accounts or putting the database in quiesce mode. But with services, you now also have a more tactical approach to this issue.

My example assumes a single instance with two services DEMO_BATCH and DEMO_OLTP. And let’s assume that we need to temporarily disable batch services, maybe just to reduce system load due to those activities or maybe because we are reorganizing the objects used by the batch processes.

To disable a service in a single instance we can either remove it from the SERVICE_NAMES instance parameter or use the DBMS_SERVICE package.

Note that stopping will only affect new connections. Existing sessions that used the DEMO_BATCH service are allowed to continue until they disconnect or you kill them. This gives you the flexibility of a grace period where you just wait for existing sessions to finish their work and disconnect by themselves.

If you are using grid infrastructure and manage services through srvctl this behaviour is basically the same but you get an extra “force” switch to also disconnect existing sessions while stopping a service.

Creating extra services on a database allows you to stop and start them for maintenance which can be used as a convenient way to lock out only certain parts of an application while leaving user accounts unlocked to connect via different services.

I spend most of my time as a DBA in linux terminal and sqlplus. Everybody who works with oracle sqlplus knows about its power, but also about its limitations. For many years I have used the rlwrap utility developed by Hans Lub. It gives me command history, and the ability to edit my SQL Plus commands, and use auto completion if I set it up. In this post I will share some tips about installation and basic usage.

First we need to install the utility, and there are several options for that. We will check a few of them below. Please keep in mind that all of my examples are tested on Oracle Linux 6.

For the first one we need git, yum and automake packages. We will use the latest and greatest source code from the project site on GitHub: https://github.com/hanslub42/rlwrap and your standard Yum repository. Assuming you have connection to GitHub and your Yum repo. Let’s run it step-by-step:

The second way is to compile it from source you have downloaded from http://utopia.knoware.nl/~hlub/uck/rlwrap/ . It may be useful if you don’t have connection to Yum and GitHub. Keep in mind you will need GNU readline and ncurses libraries and headers installed in your system. So, we download the binaries, unpack it, compile and install.

The third way is to copy previously compiled rlwrap execution file and use it on a new system adding it to */bin directory in standard path. It works if you have several similar, binary compatible systems and don’t want to spend time compiling the same binaries on each one.

Of course you may consider to make your own rpm or use EPEL (Extra Packages for Enterprise Linux) yum repository and install it from there. Just keep in mind the version you get from EPEL may be slightly outdated.

Having the rlwrap installed you may find use for it. Here some basic examples how you can use the utility: Create an alias in your bashrc for sqlplus :

vi ~/.bashrc

and add

alias sqlp='rlwrap sqlplus'

The same you can do for rman :

alias rman='rlwrap rman'

For Oracle GoldenGate command line utility

alias ggsci='rlwrap ./ggsci'

In rlwrap you can use ! and TAB to call list of commands or use prefix and CTRL+R to search for certain command in command history. Also you can create your own dictionary and use it for auto completion. Let’s try to build some dictionary for auto-completion I created a file “lsql.lst” with the following contents:

You can see we have all “V$” views and it can be extremely handy when you don’t really have any time to search for a view name and only vaguely remember that you have a view to look up for certain information.

The rlwrap may not be most sophisticated program but it can make your life much easier. There may be a more advanced tool for sqlplus like SQLcl that provides a lot more options. But—the beauty of rlwrap is in its “lightness” and ability to work not only with sqlplus, but with practically any command line tool.

I then found this note on Metalink : EXADATA : What Is Oracle ASM appliance.mode Attribute (Doc ID 1664261.1) which explains that starting from Exadata 11.2.0.3.3, “The Oracle ASM appliance.mode attribute improves disk rebalance completion time when dropping one or more Oracle ASM disks. This means that redundancy is restored faster after a failure.”

Wow, that looks like a pretty cool feature! But it seems that (sadly) we cannot set a 12.1 compatible if this feature is enabled.

Let’s give it a try and deactivate it to set my compatible.asm to 12.1 :

Sadly, no. I hit one of the restrictions; indeed, “The Oracle ASM disk group attribute compatible.asm is set to release 11.2.0.4, 12.1.0.2 and later. (Appliance mode is not valid with 12.1.0.1)” — then I guess that even of the documentation does not say so, Appliance mode is not valid with a compatible.asm set to 12.1.0.0.0 either. Even if it is very weird that Oracle dev “forgot” (?) this cool feature when they released 12c on Exadata, they hopefully “released” it again with 12.1.0.2. I’ll follow up on that when moving this compatible.asm to 12.1.0.2 !

We use the terms DDL and DML in our DBA life with Oracle all the time. The first stands for Data Definition Language and it is about Data Definition changes. The latter is about manipulation with your data itself, and stands for Data Manipulation Language. When we speak about replication of our data by replication tools between Oracle databases, we generally either enable DDL, work only replicating DML, or do it for both together. In general, I would recommend replicating both DML and DDL just in case, to prevent the replication to be broken in case of unexpected structural changes in the replicated schemas. But in some cases you do not want to replicate all DDL or any DDL at all for certain reasons. I will discuss a couple of operations which are handled slightly different from pure DDL/DML changes in GoldenGate.

The first of them is truncate operation. In Oracle it is definitely DDL and you can see that.

It is clear that the object gets the new data_object_id and last_ddl_time shows new time.

There is a general assumption that you need to enable a DDL support to replicate truncates. But is this true? In reality you can replicate truncates (with some limitations) without full DDL support, and I want to show you how it can be done. What you need to do is setup a parameter GETTRUNCATES. Let’s setup it on extract and see how it works.

You can see that we have captured the truncate by our extract. Even our DDL support is disabled. What we need is to set up the same parameter GETTRUNCATES on replicat side. Why do we need to set it up explicitly? Because the default behaviour and parameter for GoldenGate is “IGNORETRUNCATES” for all processes. As result, the truncates will be applied to the target system.

We don’t need full DDL support if we want to replicate truncates only. Sometimes it may help us when we have workflow including truncates, but we don’t want to replicate all DDL commands for some reasons. Just keep in mind that it works with some limitations. You cannot replicate by using “truncate partition” for Oracle. It will require full DDL support.

The second thing I want to discuss in this topic is support for sequences values replication. Sometimes people assume that it requires DDL support, but this is not true. As matter of fact replicating of sequences values doesn’t require you to enable DDL support for your replication. Of course, you need full DDL replication support to replicate CREATE, ALTER, DROP, RENAME for sequences, but the values are replicated as DML.

To enable the replication of sequences you need to create a special user on source and target databases, add the user to the GGSCHEMA parameter to your .GLOBALS file, and run one script to create all necessary procedures in the newly created schema. Let’s have a closer look. I have a user OGG I am using for connection and I plan to use the same user for sequence support.

We can see that the two operations are a bit different from all other standard DDL and DML in Oracle GoldenGate. I hope this small piece of information may help you in your implementation, or help to support your GoldenGate environment.

Query Store is a new feature in SQL Server 2016 which, once enabled, automatically captures a history of queries, execution plans, and runtime statistics, retaining them for your troubleshooting performance problems caused by query plan changes.

When it comes to building database applications and solutions, developers, DBAs, engineers and architects have a lot of new and exciting tools and technologies to play with, especially with the Hadoop and NoSQL environments growing so rapidly.

While it’s easy to geek out about these cool and revolutionary new technologies, at some point in the development cycle you’ll need to stop to consider the real-world business implications of the application you’re proposing. After all, you’re bound to face some tough questions, like:

Why did you choose that particular database for our mission-critical application? Can your team provide 24/7 support for the app? Do you have a plan to train people on this new technology? Do we have the right hardware infrastructure to support the app’s deployment? How are you going to ensure there won’t be any bugs or security vulnerabilities?

If you don’t have a plan for navigating and anticipating these kinds of questions in advance, you’re likely to face difficulty getting approval for and implementing your application.

Any database applications or solutions you build or adopt for your organization must be “enterprise-ready”: secure, stable and scalable with a proven, tested capacity for deployment. They must be easy to administer and easy to support. But how do you make sure that happens?

Open communications: A close working relationship between the development and operations teams goes a long way toward seamless integration of your database applications. By working together (from start to finish, as early on as possible), you can better anticipate the issues to be solved so your app or solution gets up and running faster.

Platform reliability: Open source databases are great for obvious reasons: they’re free and easily customizable. But if your app is revenue-generating or mission-critical, it’s better to use a tested and proven distribution platform like Datastax Enterprise for Cassandra, Cloudera or HortonWorks for Hadoop, and Oracle or Percona for MySQL.

Continuous quality: No matter which technology you use to build your app, make sure it passes rigorous quality assurance, scale and performance testing — both initially and with every new release of the software. Your vendor also needs to be proactive when it comes to releasing patches and fixing bugs.

Suitable infrastructure: Consider whether the infrastructure you’re running is appropriate for the app you’re developing. If the database is set to run on multiple nodes of commodity hardware — to cover your bases in case one fails — but your operations team likes to store everything on an expensive SAN device, you might want to look into other alternatives.

Experienced hosting: You’ll want to find a hosting company that is reliable, cost-effective and meets your company’s security policies. It should also have experience hosting the database technology you plan on using; that way, it knows how to respond when issues or challenges arise.

Expert talent: Bring in a team of experts that can support your entire environment. While your operations team may want to take care of some elements themselves (everything up to the OS level, for instance), you’ll still want to show them that you have 24/7 support coverage available if needed. This team should be committed to continuous training and have enough people skilled with your technology to provide uninterrupted coverage.

Comprehensive skills: Your team should be able to check your configurations against best practices for security, performance and availability — but don’t forget to ensure that they’re also set up for the more mundane things like systems monitoring, responding to alerts and fault finding.

Ongoing costs: When tallying the running cost of your application, keep in mind that you need to incorporate the cost of the distributed version, its hosting, and 24/7 support and optimization.

With all the elements that go into getting an application enterprise-ready, it might be easier to work with a reputable partner who has the experience and expertise to help you deploy the right solution for your organization and ensure its long-term success.

I am using Time Slider on several Solaris 11.x servers and I found the same problem on all of them – it doesn’t create any automatic snapshots for some datasets. For example, it doesn’t create any snapshots for the Solaris root dataset rpool/ROOT/solaris. However it creates snapshots for the leaf dataset rpool/ROOT/solaris/var. The rpool/ROOT dataset also doesn’t have any automatic snapshots, but rpool itself has snapshots, so it’s not easy to understand what is happening.

I searched for this problem and found that other people have noticed it as well.

The original poster (OP) has encountered a problem with a complex pool configuration with many nested datasets having different values for the com.sun:auto-snapshot* properties. He has dug into the Time Slider Python code and has proposed a change, which has been blindly accepted without proper testing and has ended up in Solaris 11. Unfortunately, this change has introduced a serious bug which has destroyed the logic for creating recursive snapshots where they are possible.

Let me quickly explain how this is supposed to work. If a pool has com.sun:auto-snapshot=true for the main dataset and all child datasets inherit this property, Time Slider can create a recursive snapshot for the main dataset and skip all child datasets, because they should already have the same snapshot. However, if any child dataset has com.sun:auto-snapshot=false, Time Slider can no longer do this. In this case the intended logic is to create recursive snapshots for all sub-trees which don’t have any excluded children and then create non-recursive snapshots for the remaining datasets which also have com.sun:auto-snapshot=true. The algorithm is building separate lists of datasets for recursive snapshots and for single snapshots.

Here is an excerpt from /usr/share/time-slider/lib/time_slider/zfs.py:

# Now figure out what can be recursively snapshotted and what
# must be singly snapshotted. Single snapshot restrictions apply
# to those datasets who have a child in the excluded list.
# 'included' is sorted in reverse alphabetical order.
for datasetname in included:
excludedchild = False
idx = bisect_right(everything, datasetname)
children = [name for name in everything[idx:] if \
name.find(datasetname) == 0]
for child in children:
idx = bisect_left(excluded, child)
if idx < len(excluded) and excluded[idx] == child:
excludedchild = True
single.append(datasetname)
break
if excludedchild == False:
# We want recursive list sorted in alphabetical order
# so insert instead of append to the list.
recursive.append (datasetname)

This part is the same in all versions of Solaris 11 (from 11-11 to 11.3, which is currently the latest). If we look at the comment above the last line, it says that it should do “insert instead of append to the list”. This is because the included list is sorted in reverse alphabetical order when it is built. And this is the exact line that has been modified by the OP. When append is used instead of insert the recursive list becomes sorted in reverse alphabetical order as well. The next part of the code is traversing the recursive list and is trying to skip all child datasets which already have their parent marked for recursive snapshot:

This code heavily relies on the sort order and fails to do its job when the list is sorted in reverse order. What happens is that all datasets remain in the list with child datasets being placed before their parents. Then the code tries to create recursive snapshot for each of these datasets. The operation is successful for the leaf datasets, but fails for the parent datasets because their children already have a snapshot with the same name. The snapshots are also successful for the datasets in the single list (ones that have excluded children). The rpool/dump and rpool/swap volumes have com.sun:auto-snapshot=false. That’s why rpool has snapshots.

Luckily, the original code was posted in the same thread so I just reverted the change:

if excludedchild == False:
# We want recursive list sorted in alphabetical order
# so insert instead of append to the list.
recursive.insert(0, datasetname)

After doing this, Time Slider immediately started creating snapshots for all datasets that have com.sun:auto-snapshot=true, including rpool/ROOT and rpool/ROOT/solaris. So far I haven’t found any issue and snapshots work as expected. There may be some issues with very complex structure like the OP had, but his change has completely destroyed the clever algorithm for doing recursive snapshots where they are possible.

Final Thoughts.

It is very strange that Oracle hasn’t paid attention to this bug and has left it hanging for more than 4 years. Maybe they consider Time Slider a non-important Desktop feature. However I think that it’s fairly useful for servers as well.

The solution is simple – a single line change, but it will be much better if this is resolved in a future Solaris 11.3 SRU. Until then I hope that my blog post will be useful for anyone who is trying to figure out why the automatic snapshots are not working as intended.

A while ago, I posted a better “du” for asmcmd . Since then, Oracle 12cR2 beta has been released but it seems that our poor old “du” will not be improved.

I then wrote a better “better du for asmcmd” with some cool new features compared to the previous one which was quite primitive.

In this second version you will find :

No need to set up your environment, asmdu will do it for you

If no parameter is passed to the script, asmdu will show you a summary of all the diskgroups :

If a parameter (a diskgroup) is passed to the script, asmdu will show you a summary of the diskgroup size with its filling rate and the list of the directories it contains with their sizes :Note : you can quickly see in this screenshot that “DB9” consumes the most space in the FRA diskgroup; it is perhaps worth to have a closer look

A list of all running instances on the server is now shown on top of the asmdu output; I found that handy to have that list here

I also put some colored thresholds (red, yellow and green) to be able to quickly see which diskgroup has a space issue; you can modify it easily in the script :

Puppet manifests can hold a lot of sensitive information. Sensitive information like passwords or certificates are used in the configuration of many applications. Exposing them in a puppet manifest is not ideal and may conflict with an organization’s compliance policies. That is why data separation is very important aspect of secure puppet code.

Hiera is a pluggable Hierarchical Database. Hiera can help by keeping data out of puppet manifests. Puppet classes can look for data in hiera and hiera would search hierarchically and provide the first instance of value.

Although Hiera is able to provide data separation, it cannot ensure security of sensitive information. Anyone with access to the Hiera data store will be able to see the data.

Enter Hiera-eyaml. Hiera-eyaml is a backend for Hiera that provides per-value encryption of sensitive data within yaml files to be used by Puppet.

The following puppet module can be used to manage hiera with eyaml support.

Puppetmaster need to be restarted after this as changes to hiera.conf would need a restart to apply.

Using eyaml command line

Eyaml commands need to be used in a directory with keys directory(In this example /etc/puppet). Following command can be used to encrypt a password. The command would give us two options, string and block.

The above examples covers encrypting strings and files, which constitutes most of the sensitive data used in puppet code. Incorporating hiera-eyaml into puppet work-flow will ensure compliance and security of sensitive data.

Awhile ago I found myself analyzing a MySQL fabric installation to understand why a group member was occasionally being marked as FAULTY even when the server was up and running and no failures were observed.

Since I was not clear under which circumstances a server is marked as FAULTY, I decided to review MySQL Fabric code (Python) to better understand the process.

The module responsible for printing this message is failure_detection.py and more specifically, the _run method belonging to FailureDetector class. This method will loop through every server in a group, and attempt a connection to the MySQL instance running on that node. MySQLServer.Is_alive (mysql/fabric/server.py) method is called for this purpose.

Before reviewing the failure detection process, we first need to know that there are four MySQL fabric parameters that will affect when a server is considered unstable or faulty:

DETECTION_TIMEOUT DETECTION_INTERVAL DETECTIONS FAILOVER_INTERVAL

Based on the above variables, the logic followed by FailureDetector._run() to mark a server as FAULTY is the following:

1) Every {DETECTION_INTERVAL/DETECTIONS} seconds, a connection against each server in the group is attempted with a timeout equal to DETECTION_TIMEOUT2) If DETECTION_TIMEOUT is exceeded, the observed message is logged and a counter incremented3) When this counter reaches DETECTIONS, the server is marked as “unstable” and if the last time the master changed was greater than FAILOVER_INTERVAL ago, the server is marked as FAULTY

With a better understanding of the logic followed by MySQL fabric to detect faulty nodes, I went to the configuration file to check the existing values for each of the parameters:

From the values above we can notice that each group will be polled every 2 seconds (DETECTION_INTERVAL/DETECTIONS) and that the monitored server should respond within a second for the test to be considered successful.

On high concurrency nodes, or nodes under heavy load, a high polling frequency combined with tight timeouts could cause the servers to be marked as FAULTY just because the connection attempt would not be completed or processed (in the case of high connection rates or saturated network interfaces) before the timeout occurs.

Also, having FAILOVER_INTERVAL reduced to 0, will cause the server to be marked as FAULTY even if a failover had just occurred.

A less aggressive configuration would be more appropriated for heavy loaded environment:

As with any other database clustering solution that relies on a database connection to test node status, situations where the database server would take longer to respond should also be considered. The polling frequency should be adjusted so the detection window is within an acceptable range, but the rate of monitoring connections generated is also kept to the minimum. Check timeouts should also be adjusted to avoid false positives caused by the server not being able to respond in a timely manner.

All of this takes time. Granted, not a great deal of time, but tuning is an iterative process and so these steps will be performed multiple times. Not only are these steps a productivity killer, but they are repetitive and annoying. No one wants to keep running the same manual command over and over.

This task is ripe for some simple automation.

If both the client and database servers are some form of Unix, automating these tasks is straightforward.

Please note that these scripts require an 11g or later version of the Oracle database. These scripts are dependent on the v$diag_info view to retrieve the tracefile name. While these scripts could be made to work on 10g databases, that is left as an exercise for the reader.

Step by Step

To simplify the process it can be broken down into steps.

1. Reconnect

The first step is to create a new connection each time the SQL is executed. Doing so ensures the database session gets a new tracefile, as we want each execution to be isolated.

-- reconnect.sql
connect jkstill/XXXX@oravm

2. Get the Tracefile hostname, owner and filename

Oracle provides all the information needed.

In addition the script will set the 10046 event, run the SQL of interest and then disable the 10046 event.

This demo shows you how to automate the retrieval of the trace file. But why stop there? The processing of the file can be modified as well.

Really, it isn’t even necessary to copy the script over, as the content can be retrieved and piped to your favorite command. The script mrskew.sql for instance uses ssh to cat the tracefile, and then pipe the contents to the Method R utility, mrskew. Note: mrskew is a commercial utility, not open source software.

Now we can see where all the db time was consumed for this SQL statement, and there was no need to copy the trace file to the current working directory. The same can be done for tkprof and other operations. Please see the plan.sql and tkprof.sql scripts in the Github repository.

Wrapping It Up

A little bit of automation goes a long way. Commands that are repetitive, boring and error prone can easily be automated. Both your productivity and your mood will soar when little steps like these are used to improve your workflow.