September 24, 2013

I had the pleasure of attending IDUG's DB2 Tech Conference
in Australia earlier this month. There were numerous presentations on DB2 11,
and since IBM is now making this information public, I'll share with you some
details about the upcoming release.

What follows is based on Terry Purcell's presentation,
"What's Coming from the Optimizer in DB2 11 for z/OS?" (session A10):

DB2 11 optimizer enhancements retrofitted to DB2
10 — Obviously, this means that customers using DB2 10 will be able
to benefit from the latest optimizer features and improvements.

Plan management — I'll focus on this
topic for this post. DB2 10 introduced APREUSE(ERROR), a plan management
option that prompts DB2 to reuse the previous access path for each SQL
statement. One drawback of APREUSE(ERROR) is that if any one SQL statement
fails to reuse its path, the REBIND of the entire plan/package fails. DB2 11
adds a new option, APREUSE(WARN), that operates at the statement level.

Let's compare the two options by summarizing what happens
when you REBIND a package and some of the statements fail to reuse their
access paths:

With APREUSE(WARN):

- Access paths are retained on all statements that took the previous hint.

- A new access path is created for statements on which the hint failed.

- The package is bound successfully.

With APREUSE(ERROR):

- If all statements took the previous hint, access paths are retained for the entire package.

- If a single statement fails, the package REBIND fails.
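As a sketch, the two options are specified the same way on the REBIND subcommand; the collection and package names below are placeholders for your environment:

```
REBIND PACKAGE(COLL1.PKG1) APREUSE(WARN)
REBIND PACKAGE(COLL1.PKG1) APREUSE(ERROR)
```

With the first form, statements that can't reuse their prior access path simply get a new one and the bind completes; with the second, any single reuse failure fails the entire REBIND.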

Keep in mind that some things could still change between now
and DB2 11's general availability. As IBM notes in its standard disclaimer:

"IBM’s
statements regarding its plans, directions, and intent are subject to change or
withdrawal without notice at IBM’s sole discretion. Information regarding
potential future products is intended to outline our general product direction
and it should not be relied on in making a purchasing decision. The information
mentioned regarding potential future products is not a commitment, promise, or
legal obligation to deliver any material, code or functionality. Information
about potential future products may not be incorporated into any contract. The
development, release, and timing of any future features or functionality
described for our products remains at our sole discretion."

Incidentally, if you'd like to find out more about the
recent IDUG conference, in the near future all of the presentations will be
posted here. If you're starting at IDUG.org, just select "Tech
Library," then "Past Conference Presentations."

September 17, 2013

Recently
I wrote about managing IDAA with Data Studio. The marketing message
around IDAA is that you don’t need to be a DBA to tune queries because
there are no indexes and queries are fast. While it's true that you don’t have
indexes or need to resort to SQL coding tricks to force different access
paths, you can still control the physical organization and placement of the
data stored on the Netezza box.

This is
important because while in most cases using IDAA can massively reduce query
times, in select circumstances, a query that took five hours to process might
now take 12 hours. In those cases, you need to modify the physical
organization.

When
enabling a table to be accelerated, you can specify the distribution keys as
well as the organization keys. By default the data will be loaded with random
distribution. This is OK at the outset, or when you're only accessing a single
table. Believe me, when you're doing an initial proof of concept (POC), you'll
be very happy to see what was an 8-hour query complete in just minutes.

The next
step is to examine the SQL that is joining tables and organize the data for
these tables by the columns being used to join on. With IDAA, the time to
conduct these queries can potentially be slashed from minutes to seconds.
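For example, in a join like the following (the table and column names are hypothetical), the join column CUST_ID is a natural candidate for the distribution key on both tables, so that matching rows land on the same worker node and each node can join its slice of the data locally:

```
SELECT O.ORDER_ID, C.CUST_NAME, O.ORDER_TOTAL
FROM   ORDERS O
JOIN   CUSTOMER C
  ON   O.CUST_ID = C.CUST_ID
WHERE  O.ORDER_DATE >= '2013-01-01';
```

A column such as ORDER_DATE, which is commonly used in range predicates, would in turn be a candidate for the organizing key, reducing how much of each node's disk must be scanned.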

The Data
Studio help documentation describing IDAA implementation options is excellent.
Here's a list of some major headings. In some cases I've added detail text:

Distribution keys -- There are two methods for distributing table rows to the accelerator worker nodes: random distribution and distribution using a distribution key.

Random distribution

Use of a distribution key (hash partitioning) -- By default, random distribution is used to distribute table rows to worker nodes for query processing. That is, all tables are evenly distributed among the existing worker nodes.

Impact of the distribution key on query performance

Impact of the distribution key on join performance

Best practices for selecting distribution keys

Choosing an organizing key -- Organizing keys can further speed up accelerated queries by reducing the time necessary to scan the disks belonging to a single worker node.
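For illustration, when tables are added to the accelerator with the SYSPROC.ACCEL_ADD_TABLES stored procedure, the distribution and organizing keys are supplied in an XML table specification roughly along these lines. The schema, table, and column names here are hypothetical, and the exact namespace and element layout can vary by IDAA release, so treat this as a sketch rather than exact syntax:

```
<dwa:tableSpecifications xmlns:dwa="http://www.ibm.com/xmlns/prod/dwa/2011"
                         version="1.0">
  <table schema="SALES" name="ORDERS">
    <distributionKey>
      <column name="CUST_ID"/>
    </distributionKey>
    <organizingKey>
      <column name="ORDER_DATE"/>
    </organizingKey>
  </table>
</dwa:tableSpecifications>
```

Data Studio builds this specification for you when you pick the keys in its dialogs, which is usually the easier route.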

If you
have experience tuning queries for IDAA, please share some of your favorite
tips in the comments.

September 10, 2013

I feel very fortunate to live in an area where I can attend
a DB2 Regional User Group (RUG). Years ago I'd go to Midwest Database User
Group meetings in Chicago. I loved learning more about DB2 and interacting
with fellow group members, who by and large had been around and knew more about
the database than I did then.

Eventually I started presenting before the user group. I
enjoyed sharing my experiences. After that, I joined the board to help run the
meetings. I learned so much and developed so many friendships by being a part
of that group. It greatly enhanced my career.

And yes, new user groups are forming. In fact a colleague of
mine, Philippe Dubost, recently started a new DB2 user group to serve the
Czech Republic and Slovakia. The first meeting is Oct. 8, and some great
speakers from IBM and CA Technologies will be presenting. csDUG is registered
with IDUG and included in the European section of IDUG's user group list.

September 03, 2013

Recently, I downloaded and installed Data Studio 4.1 with the
IBM DB2 Analytics Accelerator (IDAA) plug-in. I was going to write about using
this solution to create a database connection, but it turns out that this topic
is already well covered by David Simpson. Check out this terrific presentation
he gave to the Baltimore/Washington DB2 User Group.

While David doesn't get into the features of IDAA, you'll
learn a lot about the basics of using Data Studio with DB2 for z/OS. Page 14
lists the parameters needed to create the database connection, though there are
actually several ways to find this information. One option is to look at the
DB2 master address started task and do a find on "DDF." You can also
use the DB2 command -DIS DDF.

Output from the -DIS DDF command, with values changed to match
page 14:

DSNL080I  !DA0G DSNLTDDF DISPLAY DDF REPORT FOLLOWS:
DSNL081I  STATUS=STARTD
DSNL082I  LOCATION           LUNAME             GENERICLU
DSNL083I  DA1B               USILDA01.DA1BDB2   USILDA01.DA0GPTIB
DSNL084I  TCPPORT=9500  SECPORT=9501  RESPORT=9502  IPNAME=-NONE
DSNL085I  IPADDR=::140.212.64.35
DSNL086I  SQLDOMAIN=p390.themisinc.com
DSNL086I  RESYNC DOMAIN=p390.themisinc.com
DSNL089I  MEMBER IPADDR=::140.212.64.35
DSNL099I  DSNLTDDF DISPLAY DDF REPORT COMPLETE

Once Data Studio is installed and a database connection to a
subsystem with an accelerator is established, you can start managing the
accelerators as well as DB2 tables that use the accelerator with Data Studio.
Here are some relevant resources:

IBM Systems Magazine is a trademark of International Business Machines Corporation. The editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia under license from International Business Machines Corporation.