Wednesday, October 24, 2012

A new version 2.0 of the XPLAN_ASH utility introduced here is available for download.
You can download the latest version here.
The change log tracks the following changes:
- Access check
- Conditional compilation for different database versions
- Additional activity summary
- Concurrent activity information (what is/was going on at the same time)
- Experimental stuff: Additional I/O summary
- More pretty printing
- Experimental stuff: I/O added to Average Active Session Graph (renamed to Activity Timeline)
- Top Execution Plan Lines and Top Activities added to Activity Timeline
- Activity Timeline is now also shown for serial execution when TIMELINE option is specified
- From 11.2.0.2 on: We get the ACTUAL DOP from the undocumented PX_FLAGS column added to ASH
- All relevant XPLAN_ASH queries are now decorated with comments so it should be easy to identify them in the Library Cache
- More samples are now covered and a kind of "read consistency" across queries on ASH is introduced
- From 11.2.0.2 on: Executions plans are now pulled from the remote RAC instance Library Cache if necessary
- Separate Parallel Slave activity overview (similar to "Parallel" tab of Real-Time SQL Monitoring)
- Support for the limited ASH capabilities of Oracle 10.2
So that's quite a big list of changes. In particular noteworthy is that the script now supports the limited capabilities of 10gR2 ASH. Although the coolest features of the script (Activity and Parallel Distribution information on Execution Plan line level) are not available with 10g, the remaining stuff still can be very useful - in particular the Parallel Slave and Activity Timeline information for Parallel Execution analysis - so that I believe this is a nice addition for those not yet on 11g and one of most powerful ways of analysing Parallel Execution on 10g.
When enabling Rowsource Statistics you can even get actual Rowsource cardinalities, however not all Parallel Execution plans are supported via DBMS_XPLAN.DISPLAY_CURSOR - Parallel Execution plans using multiple Data Flow Operations (DFOs, see the previous post about XPLAN_ASH for a little bit more info about DFOs) will generate multiple child cursors (one per DFO) and DISPLAY_CURSOR cannot "aggregate" the statistics information across multiple child cursors (besides that the "Rowsource Statistics" attribute is not preserved when enabling it on session level and those multiple child cursors get generated).
Functionality-wise I believe the most important stuff is now covered by this version of the script, and therefore the next releases will focus on fixing the numerous bugs that certainly have been introduced by this new version, and in particular on an improved representation of the information.
Very likely a HTML report similar to the HTML version of Real-Time SQL Monitoring would allow getting a much better overview of the information provided rather than the messy output of the current SQL*Plus query results.
Due to the way the queries on the ASH data are now written, it should be possible to come up with a version that also supports S-ASH, the free ASH implementation.

Summary Of Changes

1. It runs a reasonable pre-check and informs if any of the objects potentially required by the script aren't accessible. You'll see an output similar to the following in such cases:

2. Furthermore the script no longer appears to "hang" if privileges are missing or anything else unexpected happens. Actually the script wasn't hanging but usually waited for input with termout set to OFF, since some of the defines populated under regular conditions weren't properly populated when things went wrong. This has been corrected and you should now see the above privilege check output along with regular follow-up error messages, but no hanging script any longer.
3. The output has been re-aligned and further sections have been added. In particular the various "Activity" / "Concurrent Activity" summaries should be helpful to get a better understanding of the overall activity profile of the query and concurrently ongoing database activity. See below for more details about these new sections.
Note that in case of a cross-instance Parallel Execution the summaries will also be provided on instance level in addition.

10.2 Support

Except for a few sections that simply aren't available when running on pre-11g the script shows quite similar information when used with 10.2
The biggest difference when using the script with 10.2 is caused by the fact that 10.2 ASH neither has a clear indicator of a SQL execution instance (the SQL_EXEC_START / SQL_EXEC_ID columns added in 11g) nor is there a relation to the execution plan lines (SQL_PLAN_LINE_ID etc.)
Therefore the SQL_EXEC_START parameter with 10.2 is mandatory, and simply specifies the lower sample time of ASH data to query. Instead of the non-existing SQL_EXEC_ID the script expects a SQL_EXEC_END date input similar to SQL_EXEC_START that obviously determines the upper sample time for ASH data and is also mandatory.
Likewise in 10.2 the script doesn't support searching for the most recent execution like it does on 11g, so - as already outlined - it is up to you to determine what ASH data should be searched for a particular SQL_ID by specifying both SQL_EXEC_START and SQL_EXEC_END.
Note that if the ASH samples at SQL_EXEC_START don't include samples for the SQL_ID in question, then the script uses the "closest" samples to SQL_EXEC_START found within the range SQL_EXEC_START / SQL_EXEC_END to determine the session information that will be used to limit the ASH data.

Short Introduction

Since the initial post what a bit lengthy, allow me to use this opportunity to summarize what the main purpose of the script is:
1. It provides an output that to a significant degree is similar to information provided by the Real-Time SQL Monitoring even in cases where the Real-Time SQL Monitoring report is no longer available.
In particular it allows pulling this information even from the DBA_HIST* views, something that is not possible with Real-Time SQL Monitoring, and therefore allows pulling detailed reports about SQL executions as long as ago as the configured AWR retention.
Note that this is applicable to both serial and Parallel Execution, and the tool can also provide information about currently executing SQL statements.
2. For Parallel Execution, it offers some important clues. Some of them aren't provided elsewhere, even not by Real-Time SQL Monitoring (At least not up to version 11.2):
- It shows information about the Average Active Sessions (AVERAGE_AS column below) with different granularity. This information is crucial in understanding how well the work was distributed among the Parallel Slaves. When looking at the sample report below it becomes immediately obvious that the AVERAGE_AS figures are way below the Parallel degree used.

In particular the "Parallel Slave activity", which corresponds to the "Parallel" tab of the Real-Time SQL Monitoring and the "Activity Timeline", which corresponds to a mixture of the "Activity" and "Metrics" tab of the Real-Time SQL Monitoring, allows spotting any anomalies regarding skew (both temporal and data distribution skew).
Below example shows that one Parallel Slave is sampled much more than the others. The Activity Timeline (scroll to the right to see the relevant columns) shows that only for a short period of time the expected number of slaves where active at the same time, and a long period of only one average active session.

Note that it is important to understand that this information doesn't tell you anything about the efficiency of the execution plan performed, it just tells you how many Parallel Slaves were active concurrently on average. Whether the operation performed is efficient or not is a different question, however even an otherwise efficient execution plan might run much longer than necessary if the work isn't distributed well among the Parallel Slaves.
- Furthermore it allows systematic troubleshooting of data distribution skew by showing the number of samples per process on exection plan line level. This makes it easy to spot those execution plan operations that were affected by data distribution skew. Armed with this knowledge it can be systematically evaluated if changing the data distribution method of the corresponding join operations allows influencing this reasonably.
The "Parallel Distribution ASH" and "Parallel Distribution Graph ASH" columns added to the DBMS_XPLAN.DISPLAY* output provide that information.
The "Parallel Distribution ASH" shows the process names along with the number of samples found (in descending order). For Parallel Slaves, this is Pxxx, for all other process names it is the actual name found in the PROCESS column. If the execution is cross-instance, the name is prefixed with the instance number:

Ideally the number of samples per process should be quite similar, if there is a significant difference, then very likely the underlying cause is an uneven data distribution among the Parallel Slaves.
This is how it should look like ideally:

The "Parallel Distribution Graph ASH" attempts a graphical representation of the information that you see in the "Parallel Distribution ASH" column.
The idea of the "Graph" is that you don't need to work through all the rows of the execution plan in the "Parallel Distribution ASH" column and look at the actual sample numbers, but simply can sweep through the "Graph" in order to easily identify the potential skew candidates.
So, if you have a uniform distribution of work among the Parallel Slaves, you'll see something like this:
0123456789ABCDEF... (it's just 16 repeating hex-characters if you have more than 16 slaves sampled for an execution plan line)
But, if you have a non-uniform distribution of work among the Parallel Slaves, you'll see something like this:
000000000001111223
or in extreme cases
0000000000000
which means that only a few slaves (or only one in the last example) had to do the all of work.
If the graph doesn't correspond to what you think to see in the "Parallel Distribution ASH" column - the distribution of work looks skewed, but the graph doesn't correspond to what I've just explained, then try the "ASH,DISTRIB,TIMELINE" option instead.
In this case even a single sample of a single Parallel Slave will show up as a very wide (so this option might produce false positives):
0000000000000000000000000
which you then need to cross-check with the "Parallel Distribution ASH" column for relevance: If you have only have a few samples per line, then it is probably not relevant since you simply haven't spend significant time there.
But the advantage of this representation is that skew becomes more visible that might not be obvious when using the default DISTRIB_REL option.
The underlying reason is that the default option shows the distribution graph relative to the operation with the greatest number of samples assigned to. So if there are execution plan lines with a much higher sample number than the remaining lines (but these samples are not skewed), the graph might be so compressed that the skew doesn't become obvious.

Similar to the previous version this section shows a general summary of the ASH samples found for this SQL execution. If a cross-instance execution is detected, the summary will also be shown on instance level in addition which allows understanding to what extend the instances where involved in the execution.

This is a new section that takes advantage of the fact that ASH data covers all activity and therefore shows a summary of the ASH samples found during the execution that do not belong to this SQL execution. This can provide important clues why you might see variations in runtime although execution plans and data are similar. Note that for cross-instance executions the concurrent activity covered is limited to the instances participating in the execution. Although in principle the activity on instances not participating can also influence this SQL execution for various reasons I have decided to exclude that information - which can of course easily be changed in the corresponding queries.

This section is more or less unchanged to the previous version and only shows up if Parallel Execution gets used, you're running 11g+ and an execution plan could be found for the SQL statement.
It will show details about the different Data Flow Operations (DFOs) a Parallel Execution could consist of, in particular the activity related to each DFO, the Parallel Degree per DFO (could be different per DFO) and if the DFOs were active at the same time or not. In case of a cross-instance Parallel Execution this information will be broken down per participating instance.
Note that from 11.2.0.2 on the actual Parallel Degree is stored in the newly added PX_FLAGS column of the ASH data, so you get an additional column ACTUAL_DEGREE, whereas the ASSUMED_DEGREE uses the number of sampled slaves and the execution plan operations from the corresponding execution plan section for an educated guess about the degree.

This new section corresponds to the "Parallel" tab of the Real-Time SQL Monitoring and only shows up if Parallel Execution was detected. In particular for 10g it is very helpful for Parallel Execution analysis since the even more helpful analysis on Execution Plan line level is not available.
It will show the activities found in the ASH data per every process sampled as part of the Parallel Execution.
The most useful information in this section is very likely the "Activity Graph" column as it allows easily spotting distribution problems, although it doesn't allow the systematic troubleshooting of such issues that the execution plan line level information does, as explained above.
Depending on the version also the "Top Activities" and "Top Active Plan Lines" will also be shown per process. As usual the N of the Top N can be configured in the configuration section of the script.

This is again a new section that summarizes the activity found for this SQL execution and in principle corresponds to some part of the header information shown in Real-Time SQL Monitoring.
The activity is broken down between CPU and non-CPU, and for non-CPU the different Activity Classes will be shown. In case of a cross-instance Parallel Execution this section will be repeated, this time broken down per instance.

This new section corresponds to the previous one but shows a more granular breakdown of the activity on wait event level rather than wait class. Apart from that the same applies as above, also it will be repeated in case of a cross-instance Parallel Execution on instance level.

This section was previously called "Average Active Session Graph", but since a lot of information has been added it is now called "Activity Timeline" and will also be shown for serial execution if the "TIMELINE" option was specified.
In addition to the already previously available information about the average active sessions and their activity (CPU or non-CPU) plus PGA/TEMP now the top active plan lines (11g+), activities and processes per bucket will be shown. Note that the N of the "Top N sections" can be configured in the "configuration" section of the script.

This section is pretty much unchanged to the previous version and shows the execution plan (if available) along with the activity per execution plan line, as explained in the original post for version 1.0 of the script. Note that the script detects whether an execution plan is available or not and shows the activity per execution plan line in a different format if no execution plan could be found.
A note to RAC users: In versions prior to 11.2.0.2 the execution plan can only be pulled from the local instance - in worst case this execution plan is an incorrect one, since for the same SQL_ID and CHIlD_CURSOR different plans could be in the different Library Caches of the RAC instances.
From 11.2.0.2 on the correct execution plan will be pulled from the corresponding RAC instance by using an (undocumented) function that allows a "remote" instance execution of a query.

Experimental Stuff

There is a global switch _EXPERIMENTAL at the beginning of the configuration section.
By default this is disabled because the stuff shown could be called "unreliable" and potentially "misleading" information.
If you enable it by setting the configuration switch to an empty string, the I/O figures from the ASH data (only from 11.2+ on)
will be shown at various places of the report. Note that this data is unreliable and usually falls short of
the actual activity (I've never seen it reporting more than the actual activities). Since sometimes unreliable
figures can be much better than nothing at all you can enable it in such cases where you want for example get an
idea if the I/O was in the range of MBs or GBs - this is something you should be able to tell from the ASH data.
Likewise the average and median wait times from ASH will be shown at different places of the report if experimental is turned on.
It is important to understand what these wait times are: These are waits that were "in-flight", not completed when the sampling took place.
Doing statistical analysis based on such sampled, in-flight wait times is sometimes called "Bad ASH math", but again, if you know what you are doing
and keep telling yourself what you're looking at, there might be cases where this information could be useful, for example, if you see that
hundreds or thousands of those "in-flight" waits were sampled with a typical wait time of 0.5 secs (for example a multiblock read) where you expect a typical wait time of 0.005 secs.
This might be an indication that something was broken or went wrong and could be worth further investigation.
This is what the affected sections look like with experimental turned on:

Note again that the wait times shown (average and median) are not true wait times since the wait was ongoing when the sample of the session was taken, which means that it wasn't completed yet, so the actual wait time for that wait will have been longer, and most of the waits won't be covered by the sampling anyway, so as already explained, you need to be very catious with that information.

This section, only available from 11.2 on, and only shown when the so called "experimental" mode is enabled (disabled by default, see the configuration section of the script).
From 11.2 on Oracle enriches the ASH data with accumulated statistics related to DB Time, CPU Time and I/O activity. This is a new approach since the data is not truly sampled, but accumulated from sample to sample, so in theory represents exact delta figures.
Note that this also means that the historic ASH data written to AWR (all samples every tenth second) requires these values to be aggregated for all samples in between that are not written to AWR in order to have meaningful information in these columns of the historic ASH data.
However, as it turns out, at least in 11.2, obviously not all I/O is covered by these statistics, for example I/O to the TEMP tablespace is not represented correctly and usually falls short of the actual I/O performed.
Furthermore there seem to be inconsistencies in the data that could indicate that it is not necessarily a reliable data source (the script to some extent attempts to detect such outliers and filter them out). Hence that information can be quite incorrect and therefore it is marked as "experimental". Nevertheless it can be much better than no I/O related information at all, therefore I decided to add it as a separate section.
Note that the I/O layer figures can be more than simply read plus write I/O, because these values include any software mirroring performed by Oracle via the ASM layer. So for example any write to ASM with ASM normal redundancy will be doubled in the I/O layer figures.
Please note furthermore that the Cell Offload Efficiency figure is only relevant to Exadata environments where the smart scan capabilities can reduce the actual amount of I/O. Of course you can easily end up with negative efficiency percentages if a lot of mirroring during writing takes place, or HCC compressed data gets uncompressed in the cells and sent to the compute nodes.

From 11.2 on, and if the "experimental" mode is enabled (as explained above already), the I/O figures per bucket will also be shown.
Although not part of the experimental stuff, please note that in particular the TEMP figures have turned out not to be very reliable, furthermore short peaks in PGA / TEMP usage are not well represented in the ASH data.