This is the blog of Robert Catterall, an IBM Db2 for z/OS specialist. The opinions expressed herein are the author's, and should not be construed as reflecting official positions of the IBM Corporation.

Sunday, December 29, 2013

Of the features introduced with DB2 10 for z/OS, the high-performance DBAT capability is one of my favorites. It enabled (finally) DDF-using applications to get the CPU efficiency benefit that comes from combining thread reuse with the RELEASE(DEALLOCATE) package bind option -- a performance tuning action that has long been leveraged for CICS-DB2 workloads. Implementing high-performance DBATs is pretty easy: in a DB2 10 (or 11) environment, when a package bound with RELEASE(DEALLOCATE) is executed by way of a DBAT (i.e., a database access thread -- the kind used for DRDA requesters that connect to DB2 via the distributed data facility), that thread becomes a high-performance DBAT (if it isn't one already). Before jumping into this, however, you should consider some things that are impacted by the use of high-performance DBATs. One of those things is the DBAT pool. That's where the MAXDBAT parameter in ZPARM comes in, and that's what this blog entry is about.

The value of MAXDBAT determines the maximum number of DBATs that can be concurrently active for a DB2 subsystem. The default value is 200, and at many sites that value, or one that's a little larger, has effectively supported a much greater number of DB2 client-server application connections (the default value for CONDBAT in ZPARM -- the maximum number of connections through DDF to a DB2 subsystem -- is 10,000). How so? Well, if your system is set up to allow for inactive connections (CMTSTAT = INACTIVE has been the default in ZPARM since DB2 V8), when a DDF transaction completes the associated connection will go into an inactive state (a very low-overhead transition, as is the transition back to the active state) and the DBAT used for the transaction will go into the DBAT pool, ready to service another transaction. That can happen because a "regular" DBAT is only associated with a particular DB2 connection while it is being used to execute a request from said connection. Because it is common for only a small percentage of DDF connections to a DB2 subsystem to be active (i.e., associated with in-flight transactions) at any given moment, a large ratio of connections to DBATs has historically been no problem at all.

Bring high-performance DBATs into the picture, and things change. In particular, a high-performance DBAT, once instantiated, will remain dedicated to the connection through which it was instantiated until it has been used for 200 units of work (at which point it will be terminated, so as to free up resources allocated to the thread). That high-performance DBAT, therefore, will NOT go into the DBAT pool when a transaction using the thread completes. When a request associated with another connection comes in (i.e., from a connection other than the one through which the high-performance DBAT was instantiated), the high-performance DBAT won't be available to service that request. Some other DBAT will have to be used, and guess what? If that DBAT isn't a high-performance DBAT, it will become one if the package associated with the incoming request (and that could be a DB2 Connect or IBM Data Server Driver package) was bound with RELEASE(DEALLOCATE). The DBAT pool thus becomes progressively smaller as high-performance DBATs are instantiated. Know what else happens? The number of active DBATs goes up -- maybe sharply. Why? Because a "regular" DBAT is active only while it is being used to execute a DDF transaction. A high-performance DBAT, on the other hand, is considered to be active as long as it exists -- that will be 200 units of work, as mentioned previously, and while a high-performance DBAT is waiting to be reused, it's an active DBAT.
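The lifecycle difference described above can be sketched as a toy model. This is my own illustration, not DB2 internals -- the `DbatModel` class, and the simplistic way the MAXDBAT limit is checked, are assumptions made for the sake of the sketch; only the 200-units-of-work limit and the pool-vs-dedicated behavior come from the text:

```python
# Toy model of regular vs. high-performance DBAT behavior (illustration
# only, not DB2 internals). A regular DBAT returns to the pool at
# transaction end; a high-performance DBAT stays bound to its connection
# and is terminated after 200 units of work.

REUSE_LIMIT = 200

class DbatModel:
    def __init__(self, maxdbat):
        self.maxdbat = maxdbat
        self.hiperf = {}   # connection id -> units of work run on its dedicated DBAT
        self.pooled = 0    # regular DBATs waiting in the pool

    def run_transaction(self, conn_id, release_deallocate):
        if conn_id in self.hiperf:
            # This connection already holds a dedicated high-performance DBAT.
            self.hiperf[conn_id] += 1
            if self.hiperf[conn_id] >= REUSE_LIMIT:
                del self.hiperf[conn_id]   # terminated after 200 units of work
            return
        if self.pooled > 0:
            self.pooled -= 1               # reuse a pooled DBAT
        elif len(self.hiperf) + self.pooled >= self.maxdbat:
            raise RuntimeError("MAXDBAT reached; request would be queued")
        if release_deallocate:
            # RELEASE(DEALLOCATE) package: the DBAT becomes high-performance
            # and stays with this connection instead of returning to the pool.
            self.hiperf[conn_id] = 1
        else:
            self.pooled += 1               # regular DBAT goes back into the pool

m = DbatModel(maxdbat=200)
for conn in range(50):
    m.run_transaction(conn, release_deallocate=True)
# 50 connections now each hold a dedicated high-performance DBAT, all of
# which count as active; none of them is available to other connections.
print(len(m.hiperf), m.pooled)   # 50 0
```

Running many connections through RELEASE(DEALLOCATE) packages in this model shows the effect described above: the pool stays empty while the active-DBAT count climbs toward MAXDBAT.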

This last point -- about the number of active DBATs potentially rising sharply when high-performance DBATs are utilized -- is illustrated by some information I recently received from a DB2 professional. At this person's shop, high-performance DBATs were "turned on" for a DB2 subsystem (the PKGREL option of the -MODIFY DDF command can be used as a "switch," telling DB2 to either honor RELEASE(DEALLOCATE) for packages executed via DBATs -- thereby enabling instantiation of high-performance DBATs -- or not), and the number of active DBATs for the subsystem went from the usual 60 or so to about 700. Because the MAXDBAT value for the DB2 subsystem was already at 750, these folks didn't run out of DBATs, but the pool of "regular" DBATs got pretty small. In response to the big increase in active DBATs seen when high-performance DBAT functionality was enabled, the MAXDBAT value for the DB2 system in question was increased to 2000. Was this OK? Yes: When packages are bound or rebound in a DB2 10 for z/OS environment, almost all thread-related virtual storage goes above the 2 GB "bar" in the DBM1 address space, and that allows for a 5- to 10-times increase in the number of threads that can be concurrently active for the DB2 subsystem.

So, if you're thinking about using high-performance DBATs (and you should), check your subsystem's MAXDBAT value, and consider making that value substantially larger than it is now. Additionally, take steps to enable selective use of high-performance DBATs by your network-attached, DB2-accessing applications. For programs that contain embedded SQL statements and, therefore, have their own packages (e.g., DB2 stored procedures -- both external and native), use RELEASE(DEALLOCATE) for the most frequently executed of these packages. For the packages associated with DB2 Connect and/or the IBM Data Server Driver, use two collections: The default NULLID collection, into which you'd bind the DB2 Connect and/or IBM Data Server Driver packages with RELEASE(COMMIT), and another collection (named as you want) into which you'd bind these packages with RELEASE(DEALLOCATE). Then, by way of a data source or connection string specification on the client side, direct DDF-using applications to NULLID or the other collection name, depending on whether or not you want high-performance DBATs to be used for a given application.
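As a sketch of the client-side piece of the two-collection approach, here is one way a connection string could steer an application to the RELEASE(DEALLOCATE) collection. The collection name HIPERF is hypothetical, and the property shown (the CLI-style CurrentPackageSet keyword) should be verified against your driver's documentation -- drivers differ in how the collection is specified:

```python
# Sketch: directing an application to a non-default package collection via
# the connection string. "HIPERF" is a hypothetical collection name, and
# CurrentPackageSet is assumed here to be the driver keyword that selects
# the package collection -- check your IBM Data Server Driver documentation.

def build_conn_string(host, port, database, user, collection=None):
    parts = [
        f"DATABASE={database}",
        f"HOSTNAME={host}",
        f"PORT={port}",
        "PROTOCOL=TCPIP",
        f"UID={user}",
    ]
    if collection:
        # Non-default collection -> packages bound with RELEASE(DEALLOCATE)
        # -> the application's DBATs become high-performance DBATs.
        parts.append(f"CurrentPackageSet={collection}")
    return ";".join(parts) + ";"

# Application that should use high-performance DBATs:
print(build_conn_string("db2host", 446, "DB2LOC", "appuser", collection="HIPERF"))
# Application that should stick with regular DBATs (default NULLID packages):
print(build_conn_string("db2host", 446, "DB2LOC", "appuser"))
```

The point of the sketch: the high-performance vs. regular DBAT decision is made per application, on the client side, without touching the application's SQL.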

To keep an eye on DBAT usage for a DB2 subsystem, periodically issue the command -DISPLAY DDF DETAIL. In the output of that command you'll see a field, labeled QUEDBAT, that shows the number of times (since the DB2 subsystem was last started) that requests were delayed because the MAXDBAT limit had been reached. If the value of this field is non-zero, consider increasing MAXDBAT for the subsystem. You might also want to look at the value of the field DSCDBAT in the output of the -DISPLAY DDF DETAIL command. This value shows you the current number of DBATs in the pool for the subsystem. As I've pointed out, maintaining the "depth" of the DBAT pool as high-performance DBAT functionality is put to use might require increasing MAXDBAT for your DB2 subsystem.
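If you capture the output of -DISPLAY DDF DETAIL (for example, from an automation tool), pulling out those fields is straightforward. The sample text below only approximates the DSNL092I/DSNL093I message layout -- verify the field names and spacing against your own subsystem's output before relying on a parser like this:

```python
import re

# Sketch: extract QUEDBAT and DSCDBAT from captured -DISPLAY DDF DETAIL
# output. The sample below is an approximation of the message layout,
# not verbatim DB2 output.

sample = """
DSNL092I ADBAT= 45 QUEDBAT= 0 INADBAT= 0 CONQUED= 0
DSNL093I DSCDBAT= 12 INACONN= 3842
"""

def field(name, text):
    """Return the integer value of FIELDNAME= nnn in the captured output."""
    m = re.search(rf"{name}=\s*(\d+)", text)
    return int(m.group(1)) if m else None

quedbat = field("QUEDBAT", sample)
dscdbat = field("DSCDBAT", sample)

# A non-zero QUEDBAT means requests have been delayed because MAXDBAT was hit.
if quedbat and quedbat > 0:
    print("Consider raising MAXDBAT")
print(f"Queued-at-MAXDBAT count: {quedbat}, pooled DBATs: {dscdbat}")
```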

DDF activity can also be tracked by way of your DB2 monitor. I particularly like to use a DB2 monitor-generated Statistics Long Report to see if the connection limit for a DB2 subsystem (specified via the CONDBAT parameter in ZPARM) is sufficiently high. In the section of the report under the heading "Global DDF Activity," I'll check the value of the field labeled CONN REJECTED-MAX CONNECTED (or something similar -- fields in reports generated by different DB2 monitors might be labeled somewhat differently). A non-zero value in this field is an indication that the CONDBAT limit has been hit, and in that case you'd probably want to set CONDBAT to a larger number to allow more connections to the DB2 subsystem.

So there you go. Using high-performance DBATs can improve the CPU efficiency of your DB2 for z/OS client-server workload, but if you do leverage high-performance DBAT functionality then you might need to boost the DBAT limit for your DB2 subsystem in order to maintain the depth of your DBAT pool, because as high-performance DBATs increase in number, pooled DBATs decrease in number (unless you've upped your MAXDBAT value to compensate for this effect). Boosting MAXDBAT in a DB2 10 (or 11) environment is OK, as thread-related virtual storage in such an environment is almost entirely above the 2 GB "bar" in the DBM1 address space (assuming that packages have been bound or rebound with DB2 at the Version 10 or 11 level). Of course, you need real storage to back virtual storage, so if you increase the MAXDBAT value keep an eye on the z/OS LPAR's demand paging rate and make sure that this doesn't get out of hand (if the demand paging rate is in the low single digits or less per second, it's not out of hand).

You state that "High Performance DBATs are limited to half of MAXDBATs." On what are you basing this claim? That is NOT how high-performance DBATs were implemented. I know for a fact that it's possible for 100% of the DBATs in a DB2 for z/OS system to be of the high-performance variety.

You also state that restricting "high performance DBATs to high volume, light transaction connections that DISCONNECT when they don't have SQL work" (emphasis added by me) is a "documented recommendation." To what documentation are you referring here? I ask because it is NOT important for a network-attached, DB2-accessing application to DISCONNECT following transaction completion if it's to benefit from the use of high-performance DBATs. This is an important point because it's VERY COMMON for network-attached, DB2-accessing applications to connect to a DB2 for z/OS location and then STAY CONNECTED to that DB2 location for a long time. High-performance DBATs can be used to very-much positive effect with such applications. What's important in such situations is that the application's transactions complete in a "clean" fashion. By that I mean that they complete and commit with no "hanging resources" (such as open WITH HOLD cursors, non-dropped declared temporary tables, etc.). When an application's transactions complete cleanly, it's totally OK to use high-performance DBATs for that application, even if it is characterized by long-duration connections to the DB2 server.

Your point about frequently executed "light" transactions being good candidates for high-performance DBATs is a good one (assuming that by "light" you mean relatively low in-DB2 CPU time, on average, per transaction). For such transactions, the cost of constantly releasing and reacquiring the same resources (such as table space locks and package sections) affected by the RELEASE bind option is proportionately higher than it would be for a "heavy" transaction, so you can get some real bang for your buck in using high-performance DBATs for transactions of that type. For a transaction with a higher average in-DB2 CPU time, high-performance DBATs won't move the server-side CPU consumption needle much, so they should just use "regular" DBATs. Similarly, transactions that are infrequently executed won't use high-performance DBATs to much benefit. They should also use "regular" DBATs.

Robert, thanks for your answers. My claim is based on some IBM presentations, including page 41 of the March 13, 2012, SHARE 2012 Session 10996, "DB2 for z/OS Distributed Access – Best Practices and Updates" by Adrian Burke, which states for DB2 10:

"If # of Hi-Perf DBATs exceed 50% of MAXDBAT threshold

• DBATs will be pooled at commit and package resources copied/allocated as RELEASE(COMMIT)"

I interpreted this, based on the assumption that transactions are committed, to mean that MAXDBAT/2 was the effective cap. Hence my request for elaboration of your scenario. Have I misinterpreted the bullet? Is the actual implementation different?

The information in that presentation bullet is not correct. High-performance DBAT functionality was implemented without a 50%-of-MAXDBAT limitation.

Sometimes, early presentations about a new release of DB2 contain some items of information related to code that is not yet finalized in terms of implementation. Such informational items may describe design ideas that end up not being implemented. It is possible for an informational item related to a not-implemented design idea to persist for a while in various iterations of presentation material. That may be what happened in this case.

Robert, I noticed that page 41 of the Adrian Burke presentation says, "After the Hi-Perf DBAT has been reused 200 times* DBAT will be purged and client connection will then go inactive." Page 18 of the 1Q2013 Technology Summit Session 1, Track 2 presentation says, "Connection turns inactive after 200 times to free up DBAT."

While page 30 of your recent presentation only references the fate of the DBAT, "After being used for 200 units of work, high-performance DBAT will be terminated to free up resources"

What happens to the client connection, what impact does it have and is there a way to monitor its occurrence? Which of the wordings is most complete and accurate? Or does some other wording work better?

A client connection going inactive is not something about which you should be concerned. It happens all the time in a DB2 DDF application environment. Business as usual. No impact on the client side of things -- the client is still effectively connected to the DB2 for z/OS subsystem. On the DB2 server side, there is very little impact, as 1) an inactive connection takes up only a small amount of virtual storage, and 2) the CPU overhead involved in moving a client connection to the inactive state (and back to the active state when a request from the client needs to be processed) is very low.

As stated, an inactive client connection can be very quickly and cheaply restored to an active state, and this will happen when the client next sends a request to the DB2 subsystem. In returning to the active state, the connection will be provided with a DBAT to get the DB2 request processed (and if the request is associated with a package bound with RELEASE(DEALLOCATE), the DBAT will become a high-performance DBAT; otherwise, when the request has been processed the DBAT will go back into the DBAT pool and the client connection will again go into an inactive state, remaining in that state until another request from the client is to be processed).

I think that the wording in Adrian's presentation that you've cited is solid. It's a good, succinct description of what happens.

Robert, thank you for explaining about the presentation. Too bad that it wasn't implemented, I thought it was an excellent idea and safeguard. Do any other things like this come to mind (presented after GA but not implemented or backed out)?

I was also wrong earlier to write "documented recommendations," and wrong not to limit my comments to Connection Pooling, with an alternative being the commit-oriented Connection Concentrator.

Looking again at pages 6-10 of the Adrian Burke presentation that I mentioned previously, I see that I read my own interpretation and recommendations into the text.

"Have higher-volume client-server transactions use that second collection to gain high-performance DBAT performance benefits (collection name can be specified as a data source property on the client side)"

This would work best for well-designed and well-implemented Connection Pooling and Connection Concentrator configurations.

When you wrote:

"Because high-performance DBATs do not go back into the DBAT pool, you may want to increase value of MAXDBAT in ZPARM to compensate"

Your use of "may want to" suggests that you weren't expecting active DBATs to increase from 60 to 700 in most places.

An increase from 60 to 700 is greater than I would expect, but results will be different at different sites due to DDF workload differences. As I get more in the way of data points I may be able to get a better idea as to what a "typical" increase in active DBATs is, once high-performance DBAT support is activated.

Robert - in a CMTSTAT=INACTIVE setting, is there a periodic clean-up of the INACTIVE connections (i.e., if the app servers do not close out connections, or WebSphere settings are not robust enough to clean up the connections in a timely manner)?

With respect to connections in an inactive state, the answer to your question is, "not really." If a connection is active (and, therefore, associated with a DBAT), and it sits idle for X number of seconds (whatever was specified for IDTHTOIN in ZPARM) then DB2 can terminate the DBAT and connection; however, if a connection is in the inactive state then it will remain in that state indefinitely if the connection is not closed or terminated on the client side (or if the application server is not shut down). If a connection is inactive and communication with the client has been disrupted, that situation will be detected via TCP/IP KeepAlive functionality and DB2 will get rid of the connection (if the ZPARM TCPKPALV is set to n seconds then the TCP/IP stack on the z/OS LPAR will attempt to reach out to the client every n seconds, and if it gets no response back it will inform DB2 of this and DB2 will get rid of the connection).

So, if a client connects to DB2 and does not disconnect (and does not shut down, so it continues to respond to TCP/IP KeepAlive probes) then DB2 connections in an inactive state will persist indefinitely. That's usually not going to be a big deal, as one DB2 subsystem can support up to 150,000 connections, and an inactive connection has a very small virtual storage footprint in the DB2 DDF address space.
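The clean-up rules described above can be summarized in a small decision sketch. This is my own illustration of the logic, not DB2 code, and the IDTHTOIN value shown is a hypothetical setting:

```python
# Toy decision model of the idle-connection rules described above
# (illustration only): an ACTIVE connection idle longer than IDTHTOIN
# loses its DBAT and is terminated; an INACTIVE connection persists
# indefinitely unless TCP/IP KeepAlive probing (per TCPKPALV) finds
# the client unreachable.

IDTHTOIN = 120   # idle thread timeout in seconds (hypothetical setting)

def evaluate(conn):
    state, idle_secs, client_reachable = conn
    if state == "ACTIVE" and IDTHTOIN and idle_secs > IDTHTOIN:
        return "terminate DBAT and connection"
    if state == "INACTIVE" and not client_reachable:
        return "terminate connection (KeepAlive failed)"
    return "keep"

print(evaluate(("ACTIVE", 300, True)))     # idle past IDTHTOIN
print(evaluate(("INACTIVE", 86400, True))) # inactive connections persist
print(evaluate(("INACTIVE", 50, False)))   # dead client detected by KeepAlive
```

Note that in this model an inactive connection with a reachable client is always kept, no matter how long it has been idle -- which matches the "not really" answer above.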

I assume that you are talking about a batch program that runs under Linux and accesses a DB2 for z/OS-managed database. If that's the case, consider limiting the application in question to a single connection to DB2 for z/OS and a single DB2 thread (i.e., a single DB2 DBAT) by way of the DB2 profile tables. This capability is described in the DB2 for z/OS Knowledge Center on the Web (see http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.perf/src/tpc/db2z_monitorthreadsconnectionsprofiles.dita).

You may have read that in an IBM "redbook" titled, "DB2 9 for z/OS: Distributed Functions" (downloadable from http://www.redbooks.ibm.com/abstracts/sg246952.html?Open). On page 23 of that document, in the first paragraph, you'll see these words: "Each DDF connection only consumes approximately 7.5 K of memory inside the DDF address space, whereas each active DBAT consumes approximately 200 K of memory at a minimum, depending on SQL activity."
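For a rough sense of scale, here is that per-connection vs. per-DBAT arithmetic applied to the numbers mentioned earlier in this discussion (the 10,000-connection CONDBAT default and the 700 active DBATs cited above). The redbook figures are approximate, so treat the result as order-of-magnitude only:

```python
# Back-of-the-envelope arithmetic with the redbook's approximate figures:
# ~7.5 KB per DDF connection vs. ~200 KB minimum per active DBAT.

CONN_KB = 7.5
DBAT_KB = 200

connections = 10_000   # e.g., the CONDBAT default
active_dbats = 700     # e.g., the active-DBAT count cited earlier

conn_mb = connections * CONN_KB / 1024
dbat_mb = active_dbats * DBAT_KB / 1024
print(f"{conn_mb:.1f} MB for connections, {dbat_mb:.1f} MB for DBATs")
# → 73.2 MB for connections, 136.7 MB for DBATs
```

The takeaway matches the point made above: even a very large number of inactive connections has a modest footprint compared with a much smaller number of active DBATs.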