13 Using the PL/SQL Hierarchical Profiler

You can use the PL/SQL hierarchical profiler to identify bottlenecks and performance-tuning opportunities in PL/SQL applications.

The profiler reports the dynamic execution profile of a PL/SQL program organized by function calls, and accounts for SQL and PL/SQL execution times separately. No special source or compile-time preparation is required; any PL/SQL program can be profiled.

This chapter describes the PL/SQL hierarchical profiler and explains how to use it to collect and analyze profile data for a PL/SQL program.

13.1 Overview of PL/SQL Hierarchical Profiler

Nonhierarchical (flat) profilers record the time that a program spends within each subprogram—the function time or self time of each subprogram. Function time is helpful, but often inadequate. For example, it is helpful to know that a program spends 40% of its time in the subprogram INSERT_ORDER, but it is more helpful to know which subprograms call INSERT_ORDER often and the total time the program spends under INSERT_ORDER (including its descendant subprograms). Hierarchical profilers provide such information.

Time spent in the subprogram itself and in its descendent subprograms (subtree time)

Detailed parent-children information, for example:

All callers of a given subprogram (parents)

All subprograms that a given subprogram called (children)

How much time was spent in subprogram x when called from y

How many calls to subprogram x were from y

The PL/SQL hierarchical profiler is implemented by the DBMS_HPROF package and has two components:

Data collection

The data collection component is an intrinsic part of the PL/SQL Virtual Machine. The DBMS_HPROF package provides APIs to turn hierarchical profiling on and off and write the raw profiler output to a file.

Analyzer

The analyzer component processes the raw profiler output and stores the results in hierarchical profiler tables.

Note:

To generate simple HTML reports directly from raw profiler output, without using the Analyzer, you can use the plshprof command-line utility.

13.2 Collecting Profile Data

To collect profile data from your PL/SQL program for the PL/SQL hierarchical profiler, follow these steps:

Ensure that you have these privileges:

EXECUTE privilege on the DBMS_HPROF package

WRITE privilege on the directory that you specify when you call DBMS_HPROF.START_PROFILING

Use the DBMS_HPROF.START_PROFILING PL/SQL API to start hierarchical profiler data collection in a session.

Run your PL/SQL program long enough to get adequate code coverage.

To get the most accurate measurements of elapsed time, avoid unrelated activity on the system on which your PL/SQL program is running.

A directory object is an alias for a file system path name. For example, if you are connected to the database ASSYSDBA, this CREATEDIRECTORY statement creates the directory object PLSHPROF_DIR and maps it to the file system directory /private/plshprof/results:

CREATE DIRECTORY PLSHPROF_DIR as '/private/plshprof/results';

To run the SQL script in Example 13-1, you must have READ and WRITE privileges on both PLSHPROF_DIR and the directory to which it is mapped. if you are connected to the database ASSYSDBA, this GRANT statement grants READ and WRITE privileges on PLSHPROF_DIR to HR:

13.3 Understanding Raw Profiler Output

Raw profiler output is intended to be processed by the analyzer component of the PL/SQL hierarchical profiler. However, even without such processing, it provides a simple function-level trace of the program. This topic explains how to understand raw profiler output.

Note:

The raw profiler format shown in this chapter is intended only to illustrate conceptual features of raw profiler output. Format specifics are subject to change at each Oracle Database release.

The SQL script in Example 13-1 wrote this raw profiler output to the file test.trc:

Call events (P#C) and return events (P#R) are properly nested (like matched parentheses). If an unhandled exception causes a called subprogram to exit, the profiler still reports a matching return event.

Each call event (P#C) entry in the raw profiler output includes this information:

This name might be one of the special function names in Section 13.3.2.

Hexadecimal value that represents an MD5 hash of the signature of the called subprogram

The DBMS_HPROF.analyze PL/SQL API (described in Section 13.4) stores the hash value in a hierarchical profiler table, which allows both you and DBMS_HPROF.analyze to distinguish between overloaded subprograms (subprograms with the same name).

Line number at which the called subprogram is defined in the PL/SQL module

PL/SQL hierarchical profiler does not measure time spent at individual lines within modules, but you can use line numbers to identify the source locations of subprograms in the module (as IDE/Tools do) and to distinguish between overloaded subprograms.

For example, consider this entry in the preceding example of raw profiler output:

P#C PLSQL."HR"."TEST"::7."TEST.FOO"#980980e97e42f8ec #4

The components of the preceding entry have these meanings:

Component

Meaning

PLSQL

PLSQL is the namespace to which the called subprogram belongs.

"HR"."TEST"

HR.TEST is the name of the PL/SQL module in which the called subprogram is defined.

7

7 is the internal enumerator for the module type of HR.TEST. Examples of module types are procedure, package, and package body.

"TEST.FOO"

TEST.FOO is the name of the called subprogram.

#980980e97e42f8ec

#980980e97e42f8ec is a hexadecimal value that represents an MD5 hash of the signature of TEST.FOO.

#4

4 is the line number in the PL/SQL module HR.TEST at which TEST.FOO is defined.

Analyzes the profile data in the raw profiler output file test.trc (from Section 13.3), which is in the directory that is mapped to the directory object PLSHPROF_DIR, and stores the data in the hierarchical profiler tables in Table 13-3.

Returns a unique identifier that you can use to query the hierarchical profiler tables in Table 13-3. (By querying these hierarchical profiler tables, you can produce customized reports.)

Elapsed time, in microseconds, for subprogram, including time spent in descendant subprograms.

FUNCTION_ELAPSED_TIME

NUMBER(38)

Elapsed time, in microseconds, for subprogram, excluding time spent in descendant subprograms.

CALLS

NUMBER(38)

Number of calls to subprogram.

Table 13-6 describes the columns of the hierarchical profiler table DBMSHP_PARENT_CHILD_INFO, which contains one row of parent-child information for each unique parent-child subprogram combination profiled in this run of DBMS_HPROF.analyze.

Table 13-6 DBMSHP_PARENT_CHILD_INFO Table Columns

Column Name

Column Data Type

Column Contents

RUNID

NUMBER

References RUNID column of DBMSHP_FUNCTION_INFO table. For a description of that column, see Table 13-5.

PARENTSYMID

NUMBER

Parent symbol ID.

RUNID, PARENTSYMID references DBMSHP_FUNCTION_INFO(RUNID, SYMBOLID).

CHILDSYMID

NUMBER

Child symbol ID.

RUNID, CHILDSYMID references DBMSHP_FUNCTION_INFO(RUNID, SYMBOLID).

SUBTREE_ELAPSED_TIME

NUMBER(38)

Elapsed time, in microseconds, for RUNID, CHILDSYMID when called from RUNID, PARENTSYMID, including time spent in descendant subprograms.

FUNCTION_ELAPSED_TIME

NUMBER(38)

Elapsed time, in microseconds, for RUNID, CHILDSYMID when called from RUNID, PARENTSYMID, excluding time spent in descendant subprograms.

CALLS

NUMBER(38)

Number of calls to RUNID, CHILDSYMID from RUNID, PARENTSYMID.

13.4.2.2 Distinguishing Between Overloaded Subprograms

Suppose that a program declares three subprograms named compute—the first at line 50, the second at line 76, and the third at line 100. In the DBMSHP_FUNCTION_INFO table, each of these subprograms has compute in the FUNCTION column. To distinguish between the three subprograms, use either the LINE# column (which has 50 for the first subprogram, 76 for the second, and 100 for the third) or the HASH column (which has a unique value for each subprogram).

In the profile data for two different runs, the LINE# and HASH values for a function might differ. The LINE# value of a function changes if you insert or delete lines before the function definition. The HASH value changes only if the signature of the function changes; for example, if you change the parameter list.

Consider the third row of the table DBMSHP_PARENT_CHILD_INFO (Example 13-5). The RUNID column shows that this row corresponds to the third run. The columns PARENTSYMID and CHILDSYMID show that the symbol IDs of the parent (caller) and child (called subprogram) are 2 and 1, respectively. The table DBMSHP_FUNCTION_INFO (Example 13-4) shows that for the third run, the symbol IDs 2 and 1 correspond to procedures __plsql_vm and __anonymous_block, respectively. Therefore, the information in this row is about calls from the procedure __plsql_vm to the __anonymous_block (defined within __plsql_vm) in the module HR.TEST. This row shows that, when called from the procedure __plsql_vm, the function time for the procedure __anonymous_block is 44 microseconds, and the time spent in the __anonymous_block subtree (including descendants) is 121 microseconds.

13.4.2.4 Examples of Calls to DBMS_HPROF.analyze with Options

For an example of a call to DBMS_HPROF.analyze without options, see Example 13-2.

Example 13-6 creates a package, creates a procedure that invokes subprograms in the package, profiles the procedure, and uses DBMS_HRPROF.analyze to analyze the raw profiler output file. The raw profiler output file is in the directory corresponding to the PLSHPROF_DIR directory object.

13.5 plshprof Utility

The plshprof command-line utility (located in the directory $ORACLE_HOME/bin/) generates simple HTML reports from either one or two raw profiler output files. (For an example of a raw profiler output file, see test.trc in Section 13.2.)

You can browse the generated HTML reports in any browser. The browser's navigational capabilities, combined with well chosen links, provide a powerful way to analyze performance of large applications, improve application performance, and lower development costs.

Suppose that your raw profiler output file, test.trc, is in the current directory. You want to analyze and generate HTML reports, and you want the root file of the HTML report to be named report.html. Use this command (% is the prompt):

% plshprof -output report test.trc

13.5.2 HTML Report from a Single Raw Profiler Output File

To generate a PL/SQL hierarchical profiler HTML report from a single raw profiler output file, use these commands:

13.5.2.1 First Page of Report

The first page of an HTML report from a single raw profiler output file includes summary information and hyperlinks to other pages of the report.

Sample First Page

PL/SQL Elapsed Time (microsecs) Analysis

2831 microsecs (elapsed time) & 12 function calls

The PL/SQL Hierarchical Profiler produces a collection of reports that present information derived from the profiler output log in a variety of formats. These reports have been found to be the most generally useful as starting points for browsing:

Function Elapsed Time (microsecs) Data sorted by Total Subtree Elapsed Time (microsecs)

Function Elapsed Time (microsecs) Data sorted by Total Function Elapsed Time (microsecs)

Also, the following reports are also available:

Function Elapsed Time (microsecs) Data sorted by Function Name

Function Elapsed Time (microsecs) Data sorted by Total Descendants Elapsed Time (microsecs)

13.5.2.2 Function-Level Reports

The function-level reports provide a flat view of the profile information. Each function-level report includes this information for each function:

Function time (time spent in the function itself, also called "self time")

Descendants time (time spent in the descendants of the function)

Subtree time (time spent in the subtree of the function—function time plus descendants time)

Number of calls to the function

Function name

The function name is hyperlinked to the Parents and Children Report for the function.

Each function-level report is sorted on a particular attribute; for example, function time or subtree time.

This sample report is sorted in descending order of the total subtree elapsed time for the function, which is why information in the Subtree and Ind% columns is in bold type:

Sample Report

Function Elapsed Time (microsecs) Data sorted by Total Subtree Elapsed Time (microsecs)

2831 microsecs (elapsed time) & 12 function calls

Subtree

Ind%

Function

Descendant

Ind%

Calls

Ind%

Function Name

2831

100%

93

2738

96.7%

2

16.7%

__plsq_vm

2738

96.7%

310

2428

85.8%

2

16.7%

__anonymous_block

2428

85.8%

15

2413

85.2%

1

8.3%

HR.TEST.TEST (Line 1)

2413

85.2%

435

1978

69.9%

3

25.0%

HR.TEST.TEST.FOO (Line 3)

1978

69.9%

1978

0

0.0%

3

25.0%

HR.TEST.__static_sql_exec_line5 (Line 5)

0

0.0%

0

0

0.0%

1

8.3%

SYS.DBMS_HPROF.STOP_PROFILING (Line 53)

13.5.2.3 Module-Level Reports

Each module-level report includes this information for each module (for example, package or type):

Module time (time spent in the module—sum of the function times of all functions in the module)

Number of calls to functions in the module

Each module-level report is sorted on a particular attribute; for example, module time or module name.

This sample report is sorted by module time, which is why information in the Module, Ind%, and Cum% columns is in bold type:

Sample Report

Module Elapsed Time (microsecs) Data sorted by Total Function Elapsed Time (microsecs)

166878 microsecs (elapsed time) & 1099 function calls

Module

Ind%

Cum%

Calls

Ind%

Module Name

84932

50.9%

50.9%

6

0.5%

HR.P

67749

40.6%

91.5%

216

19.7%

SYS.DBMS_LOB

13340

8.0%

99.5%

660

60.1%

SYS.UTL_FILE

839

0.5%

100%

214

19.5%

SYS.UTL_RAW

18

0.0%

100%

2

0.2%

HR.UTILS

0

0.0%

100%

1

0.1%

SYS.DBMS_HPROF

13.5.2.4 Namespace-Level Reports

Each namespace-level report includes this information for each namespace:

Namespace time (time spent in the namespace—sum of the function times of all functions in the namespace)

Number of calls to functions in the namespace

Each namespace-level report is sorted on a particular attribute; for example, namespace time or number of calls to functions in the namespace.

This sample report is sorted by function time, which is why information in the Function, Ind%, and Cum% columns is in bold type:

Sample Report

Namespace Elapsed Time (microsecs) Data sorted by Total Function Elapsed Time (microsecs)

166878 microsecs (elapsed time) & 1099 function calls

Function

Ind%

Cum%

Calls

Ind%

Namespace

93659

56.1%

56.1%

1095

99.6%

PLSQL

73219

43.9%

100%

4

0.4%

SQL

13.5.2.5 Parents and Children Report for a Function

For each function tracked by the profiler, the Parents and Children Report provides information about parents (functions that call it) and children (functions that it calls). For each parent, the report gives the function's execution profile (subtree time, function time, descendants time, and number of calls). For each child, the report gives the execution profile for the child when called from this function (but not when called from other functions).

13.5.3.1 Difference Report Conventions

A positive value indicates that the number increased (regressed) from the first run to the second run.

A negative value for a difference indicates that the number decreased (improved) from the first run to the second run.

The symbol # after a function name means that the function was called in only one run.

13.5.3.2 First Page of Difference Report

The first page of an HTML difference report from two raw profiler output files includes summary information and hyperlinks to other pages of the report.

Sample First Page

PL/SQL Elapsed Time (microsecs) Analysis – Summary Page

This analysis finds a net regression of 2709589 microsecs (elapsed time) or 80% (3393719 versus 6103308). Here is a summary of the 7 most important individual function regressions and improvements:

Regressions: 3399382 microsecs (elapsed time)

Function

Rel%

Ind%

Calls

Rel%

Function Name

2075627

+941%

61.1%

0

HR.P.G (Line 35)

1101384

+54.6%

32.4%

5

+55.6%

HR.P.H (Line 18)

222371

6.5%

1

HR.P.J (Line 10)

Improvements: 689793 microsecs (elapsed time)

Function

Rel%

Ind%

Calls

Rel%

Function Name

-467051

-50.0%

67.7%

-2

-50.0%

HR.P.F (Line 25)

-222737

32.3%

-1

HR.P.I (Line 2)#

-5

-21.7%

0.0%

0

HR.P.TEST (Line 46)

The PL/SQL Timing Analyzer produces a collection of reports that present information derived from the profiler's output logs in a variety of formats. The following reports have been found to be the most generally useful as starting points for browsing:

Function Elapsed Time (microsecs) Data for Performance Regressions

Function Elapsed Time (microsecs) Data for Performance Improvements

Also, the following reports are also available:

Function Elapsed Time (microsecs) Data sorted by Function Name

Function Elapsed Time (microsecs) Data sorted by Total Subtree Elapsed Time (microsecs) Delta

Function Elapsed Time (microsecs) Data sorted by Total Function Elapsed Time (microsecs) Delta

Function Elapsed Time (microsecs) Data sorted by Total Descendants Elapsed Time (microsecs) Delta

13.5.3.4 Module-Level Difference Reports

Each module-level report includes, for each module, the change in these values from the first run to the second run:

Module time (time spent in the module—sum of the function times of all functions in the module)

Number of calls to functions in the module

Sample Report

Module Elapsed Time (microsecs) Data sorted by Total Function Elapsed Time (microsecs) Delta

Module

Calls

Module Name

2709589

3

HR.P

0

0

SYS.DBMS_HPROF

13.5.3.5 Namespace-Level Difference Reports

Each namespace-level report includes, for each namespace, the change in these values from the first run to the second run:

Namespace time (time spent in the namespace—sum of the function times of all functions in the namespace)

Number of calls to functions in the namespace

Sample Report

Namespace Elapsed Time (microsecs) Data sorted by Namespace

Function

Calls

Namespace

2709589

3

PLSQL

13.5.3.6 Parents and Children Difference Report for a Function

The Parents and Children Difference Report for a function shows changes in the execution profiles of these from the first run to the second run:

Parents (functions that call the function)

Children (functions that the function calls)

Execution profiles for children include only information from when this function calls them, not for when other functions call them.

The execution profile for a function includes this information:

Function time (time spent in the function itself, also called "self time")

Descendants time (time spent in the descendants of the function)

Subtree time (time spent in the subtree of the function—function time plus descendants time)

Number of calls to the function

Function name

The sample report is a fragment of a Parents and Children Difference Report that corresponds to a function named HR.P.X.

The first row, a summary of the difference between the first and second runs, shows regression: function time increased by 1,094,099 microseconds (probably because the function was called five more times).

The "Parents" rows show that HR.P.G called HR.P.X nine more times in the second run than it did in the first run, while HR.P.F called it four fewer times.

The "Children" rows show that HR.P.X called each child five more times in the second run than it did in the first run.

Sample Report

HR.P.X (Line 11)

Subtree

Function

Descendant

Calls

Function Name

3322196

1094099

2228097

5

HR.P.X (Line 11)

Parents:

6037490

1993169

4044321

9

HR.P.G (Line 38)

-2715294

-899070

-1816224

-4

HR.P.F (Line 28)

Children:

1125489

1125489

0

5

HR.P.J (Line 10)

1102608

1102608

0

5

HR.P.I (Line 2)

The Parents and Children Difference Report for a function is accompanied by a Function Comparison Report, which shows the execution profile of the function for the first and second runs and the difference between them. This example is the Function Comparison Report for the function HR.P.X:

Sample Report

Elapsed Time (microsecs) for HR.P.X (Line 11) (20.1% of total regression)

HR.P.X (Line 11)

First Trace

Ind%

Second Trace

Ind%

Diff

Diff%

Function Elapsed Time (microsecs)

1999509

26.9%

3093608

24.9%

1094099

+54.7%

Descendants Elapsed Time (microsecs)

4095943

55.1%

6324040

50.9%

2228097

+54.4%

Subtree Elapsed Time (microsecs)

6095452

81.9%

9417648

75.7%

3322196

+54.5%

Function Calls

9

25.0%

14

28.6%

5

+55.6%

Mean Function Elapsed Time (microsecs)

222167.7

220972.0

-1195.7

-0.5%

Mean Descendants Elapsed Time (microsecs)

455104.8

451717.1

-3387.6

-0.7%

Mean Subtree Elapsed Time (microsecs)

677272.4

672689.1

-4583.3

-0.7%

Scripting on this page enhances content navigation, but does not change the content in any way.