
Oracle has acquired mValent, a leading provider of application configuration management solutions. With the acquisition of mValent, Oracle expects to give customers the ability to collect, compare and reconcile deep configuration information from complex systems. The mValent solution ensures consistent configuration across IT environments and also provides agile support for ongoing environment and […]

This week I am taking a bit of a departure. Normally I write
about things I have already done, but this week I want to
speculate a bit on a security model I am thinking of coding
up. Basically I have been asking myself how to create a
security model for database apps that never requires elevated
privileges for code, but still allows for hosts sharing multiple
applications, full table security including row level and
column level security, and structural immunity to SQL injection.

The Functional Requirements

Let's consider a developer who will be hosting multiple
database applications on a server, sometimes instances of the
same application for different customers. The applications
themselves will have different needs, but they all boil down
to this:

Some applications will allow surfers to join the site
and create accounts for themselves, while others will be
private sites where an administrator must make user accounts.

Some applications will not contain sensitive data, and
so the site owner wants to send forgotten passwords in email
-- which means the passwords must be stored in plaintext. Other
site owners will need heightened security that disallows
storing of passwords in plaintext.

In both cases, administrators must of course be able to
manage accounts themselves.

The system should be structurally immune
to SQL injection.

It must be possible to have users with the same user id
("Sheilia", "John", etc.) on multiple applications who are
actually totally different people.

The application code must never need to run at an
elevated privilege level for any reason -- not
even to create accounts on public sites where
users can join up and conduct transactions.

It must be possible for the site owners or their
agents to directly
connect to the database at very least for querying and
possibly to do database writes without going through our
application.

Users with accounts on one app must never be able to
sign on to another app on the same server.

These requirements represent the most flexible possible
combination of demands that I have so far seen in real life.
The question is, can they be met while still providing
security? The model I'd like to speculate on today says
yes.

Informed Paranoia Versus Frightened Ignorance

Even the most naive programmer knows that the internet
is not a safe place, but all too often a lot of security
advice you find is based on frightened ignorance
and takes the form, "never do x, you don't know what might
happen." If we are to create a strong security model,
we have to do better than this.

Much better is to strive to be like a strong system architect,
whose approach is based on informed paranoia.
This hypothetical architect knows everybody is out
to compromise his system, but he seeks a thorough knowledge
of the inner workings of his tools so that he can
engineer the vulnerabilities out as much as possible.
He is not looking to write rules for the programmer
that say "never do this", he is rather looking to make it
impossible for the user or programmer to compromise
the system.

Two Examples

Let us consider a server hosting two applications, which
are called "social" and "finance".

The "social" application is a social networking site with
minimal security needs. Most important is that the site
owners want members of the general public to sign up, and
they want to be able to email forgotten passwords
(and we can't talk them out of it) -- so we
have to store passwords in plaintext.

The "finance" application is a private site used by employees
of a corporation around the world. The general public is
absolutely not welcome. To make matters worse however, the
corporation's IT department demands to be able to directly
connect to the database and write to the database without
going through the web app. This means the server will have
an open port to the database. Sure it will be protected with
SSL and passwords, but we must make sure that only users
of "finance" can connect, and only to their own application.

Dispensing With Single Sign-On

There are two ways to handle connections to a database. One
model is to give users real database accounts, the other is
to use a single account to sign on to the database. Prior to
the web coming along, there were proponents of both models in
the client/server world, but amongst web developers the single
sign-on method is so prevalent that I often wonder if they
know there is any other way to do it.

Nevertheless, we must dispense with the single sign-on method
at the start, regardless of how many people think that Moses
carved it on the third tablet, because it just has too many
problems:

Single Sign-on is the primary architectural flaw that makes
SQL injection possible. As we will see later, using real
database accounts makes your site (almost) completely immune
to SQL injection.

Single Sign-on requires a connection at the maximum privilege
level that any system user might have, where the code then decides
what it will let a particular user do. This is a complete
violation of the requirement that code always run at the lowest
possible privilege level.

Single Sign-on totally prevents the requirement that
authorized agents be allowed to connect to the database and
directly read and write values.

So single sign-on just won't work with the requirements listed.
This leads us to creating real accounts on the database server.

Real Accounts and Basic Security

When you use a real database account, your code connects
to the database using the username and password provided
by the user. Anything he is allowed to do your code will
be allowed to do, and anything he is not allowed to do will
throw an error if your code tries to do it.

This approach meets quite a few of our requirements nicely.
A site owner's IT department can connect with the same
accounts they use on the web interface -- they have
the same privileges in both cases. Also, there is no
need to ever have application code elevate its privilege
level during normal operations, since no regular users should ever be
doing that. This still leaves the issue of how to create
accounts, but we will see that below.

A programmer who thinks of security in terms of what code
can run will have a very hard time wrapping his head around
using real database accounts for public users. The trick to
understanding this approach
is to forget about code for a minute and to
think about tables. The basic fact of database application
security is that all security
resolves to table permissions. In other words, our security
model is all about who can read or write to what tables, it is
not about who can run which program.

If we grant public users real database accounts, and they
connect with those accounts, the security must be handled
within the database itself, and it comes down to:

Defining "groups" as collections of users who share
permissions at the table level.

Deciding which groups are allowed select, insert, update,
and delete privileges on which tables.

Granting and revoking those privileges on the server itself
when the database is built.

At very least row-level security will be required, wherein
a user can only see and manipulate certain rows in a table.
This is how you keep users from using SQL Injection to mess
with each other's order history or member profiles.

Column security is also very nice to finish off the
picture, but we will not be talking about that today as it
does not play into the requirements.
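As a minimal PostgreSQL sketch of the group-and-grant steps above (the table and group names here are invented for illustration):

```sql
-- Deny-by-default: strip anything inherited from PUBLIC first.
REVOKE ALL ON items, orders FROM PUBLIC;

-- A "group" is a collection of users sharing table-level permissions.
CREATE ROLE members NOLOGIN;

-- Decide and grant privileges on the server when the database is built.
GRANT SELECT         ON items  TO members;
GRANT SELECT, INSERT ON orders TO members;
```

Individual user accounts then inherit their table permissions simply by being made members of the group role.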

Now we can spend a moment and see why this approach eliminates
most SQL Injection vulnerabilities. We will imagine a table of
important information called SUPERSECRETS. If somebody could
slip in a SQL injection exploit and wipe out this table we'd all
go to jail, so we absolutely cannot allow this.
Naturally, most users would have no privileges on
this table -- even though they are directly connected to the
database they cannot even see the table exists, let alone
delete from it. So if our hypothetical black hat
somehow slips in ";delete from supersecrets"
and our code fails to trap for it, nothing happens. They have
no privilege on that table. On the other side of things, consider
the user who is privileged to delete from that table. If this
user slips in a ";delete from supersecrets", he is only going to
the trouble of using SQL Injection to do something he is perfectly
welcome to do anyway through the user interface. So much
for SQL injection.

To repeat a point made above: row-level security is a must.
If you grant members of a social site global UPDATE privileges
on the PROFILES table, and you fail to prevent a SQL Injection,
all hell could break loose. Much better is the ability to
limit the user to seeing only his own row in the PROFILES table,
so that once again you have created a structural immunity
to SQL injection.
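Today that limit can be declared directly in PostgreSQL (9.5 and later); when this was written it had to be simulated with views or triggers. A sketch, assuming a hypothetical "owner" column in PROFILES that stores the account name:

```sql
-- Sketch only: assumes PROFILES carries an "owner" column.
ALTER TABLE profiles ENABLE ROW LEVEL SECURITY;

-- current_user is the real database account the member connected with.
CREATE POLICY own_row ON profiles
    FOR ALL
    USING (owner = current_user)          -- rows the member can see
    WITH CHECK (owner = current_user);    -- rows the member can write
```

With the policy in place, even a successful injection against this table can touch only the attacker's own row.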

Anonymous Access

Many public sites allow users to see all kinds of information
when they are not logged on. The most obvious example would
be an eCommerce site that needs read access to the ITEMS table,
among others. Some type of anonymous access must be allowed
by our hypothetical framework.

For our two examples, the "social" site might allow limited
viewing of member profiles, while the "finance" application
must show absolutely nothing to the general public.

If we want a general solution that fits both cases, we opt
for a deny-by-default model and allow each application
to optionally have an anonymous account.

First we consider deny-by-default. This means simply that
our databases are always built so that no group has any
permissions on any tables. The programmer of the "social"
site now has to grant certain permissions to the anonymous
account, while the programmer of the "finance" application
does nothing - he already has a secure system.

But still the "finance" site is not quite so simple. An anonymous
user account with no privileges can still log in, and
that should make any informed paranoid architect nervous.
We should extend
the deny-by-default philosophy so the framework will
not create an anonymous
account unless requested. This way the programmer of the
"finance" application still basically does nothing, while
the programmer of the "social" must flip a flag to create
the anonymous account.
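A sketch of what the flag might emit, with invented names; the framework would generate something like this only for "social" and never for "finance":

```sql
-- Created only when the application explicitly requests anonymous access.
CREATE ROLE social_anonymous LOGIN PASSWORD 'public';

-- Deny-by-default: the role starts with no table privileges at all.
-- The "social" programmer then grants exactly what public browsing needs:
GRANT SELECT ON member_profiles TO social_anonymous;
```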

Virtualizing Users

If we are using real database accounts, there is one small
detail that has to be addressed. If the "social" site has
a user "johnsmith" and the finance application has a user
of the same name, but they are totally different people,
we have to let both accounts exist but be totally separate.

The answer here is to alias the accounts. The database
server would actually have accounts "finance_johnsmith" and
"social_johnsmith". Our login process would simply take
the username provided and append the code in front of it
when authenticating on the server. 'nuf said on that.

Allowing Public Users To Join

The "social" site allows anybody to join up and create
an account. This means that somehow the web application
must be able to create accounts on the database server.
Yet it must do this without allowing the web code to
elevate its privileges, and while preventing the disaster
that would ensue if a user on the "social" site somehow
got himself an account on the "finance" site.

Believe it or not, this is the easy part! Here is how it
works for the "social" site:

Create a table of users. The primary key is the user_id
which prevents duplication.

For the social site, there is a column called
PASSWORD that stores the password in plaintext.

Allow the anonymous account to INSERT into this table!
(Remember though that deny-by-default means that so far
this account has no other privileges).

Put an INSERT trigger on the table that automatically creates
an aliased user account, so that "johnsmith" becomes
"social_johnsmith". The trigger also sets the password.

A DELETE trigger on the table would delete users if
the row is deleted.

An UPDATE trigger on the table would update the password
if the user UPDATES the table.

Row level security is an absolute must.
Users must be able to
SELECT from and UPDATE the table, but only their own row. If your
database server or framework cannot support row-level
security, it's all out the window.
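The whole arrangement can be sketched in PostgreSQL. All names here are invented (the anonymous role is called social_anonymous, the members group "members"), the DELETE and UPDATE triggers described above are omitted for brevity, and the trigger function must be owned by a role with CREATEROLE so the anonymous account itself never needs elevation:

```sql
CREATE TABLE users (
    user_id  text PRIMARY KEY,  -- primary key prevents duplicate accounts
    password text NOT NULL      -- plaintext, per the site owner's demand
);

-- SECURITY DEFINER: the function runs with its owner's rights, so the
-- anonymous account never holds CREATEROLE itself.
CREATE FUNCTION make_db_account() RETURNS trigger AS $$
BEGIN
    -- Alias the account: "johnsmith" becomes "social_johnsmith".
    EXECUTE format('CREATE ROLE %I LOGIN PASSWORD %L IN ROLE members',
                   'social_' || NEW.user_id, NEW.password);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER users_insert AFTER INSERT ON users
    FOR EACH ROW EXECUTE PROCEDURE make_db_account();

-- The anonymous account may INSERT here and nothing else.
GRANT INSERT ON users TO social_anonymous;
```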

This gives us a system that almost gets us where we need
to be: the general public can create accounts,
the web application does not need to elevate its privileges,
users can set and change their passwords, and no user can
see or set anything for any other user. However, this leaves
the issue of password recovery.

In order to recover passwords and email them to members of
the "social" site, it is tempting to think that
the anonymous account must be able to
somehow read the users table, but that is no good because
then we have a structural flaw where a successful
SQL injection would expose user accounts. However, this
also turns out to be easy. There are two options:

Write a stored procedure that the anonymous user is
free to execute, which does not return a password but
actually emails it directly from within the database
server. This requires your database server be able to
send emails. (Postgres can, and I assume SQL Server
can, and I don't really know about MySQL).

Create a table for password requests, allow inserts
to it but nothing else. A trigger sends the email.
In this approach you can track email recovery requests.
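The second option might look like the sketch below. Here send_password_mail is a hypothetical helper; actually sending mail from inside the server needs an extension or an external worker that reads this table, and the anonymous role name is again assumed:

```sql
CREATE TABLE password_requests (
    user_id      text NOT NULL,
    requested_at timestamp NOT NULL DEFAULT now()  -- tracks every request
);

CREATE FUNCTION mail_password() RETURNS trigger AS $$
BEGIN
    PERFORM send_password_mail(NEW.user_id);  -- hypothetical mail routine
    RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER pw_request AFTER INSERT ON password_requests
    FOR EACH ROW EXECUTE PROCEDURE mail_password();

-- Anonymous may file a request but can never read anything back.
GRANT INSERT ON password_requests TO social_anonymous;
```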

For the "finance" application we cannot allow any of this
to happen, so again we go to the deny-by-default idea. All
of the behaviors above will not happen unless the programmer
sets a flag to turn them on when the database is built.

This does leave the detail of how users of the "finance"
application will reset their passwords.
For details on how a secure app can still allow password
resets, see my posting of Sept 7 2008 Secure Password Resets.

One More Detail on Public Users

We still have one more detail to handle for public users.
Presumably a user, having joined up, has more privileges than
the anonymous account. So the web application must be able
to join them into a group without elevating its privileges.
The solution here is the same as for creating the account:
there will be a table that the anonymous user can make
inserts into (but nothing else), and a trigger will join
the user to whatever group is named.

Except for one more detail. We cannot let the user join
whatever group they want, only the special group for members.
This requirement can be met by defining the idea of a "freejoin"
group and also a "solo" group. If the anonymous user inserts
into a user-group table, and the requested group is flagged
as allowing anybody to join, the trigger will allow it, but
for any other group the trigger will reject the insert.
The "solo" idea is similar, it means that if a user is in
the "members" group, and that group is a "solo" group, they
may not join any other groups. This further jails in
members of the general public.
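The freejoin and solo checks can be sketched as a trigger; the GROUPS and USER_GROUPS tables, with boolean freejoin and solo columns, are invented for illustration:

```sql
CREATE FUNCTION check_group_join() RETURNS trigger AS $$
BEGIN
    -- Only groups flagged "freejoin" are open to the public.
    IF NOT EXISTS (SELECT 1 FROM groups
                    WHERE group_id = NEW.group_id AND freejoin) THEN
        RAISE EXCEPTION 'group % is not open for joining', NEW.group_id;
    END IF;
    -- A user already in a "solo" group may not join anything else.
    IF EXISTS (SELECT 1 FROM user_groups ug
                 JOIN groups g USING (group_id)
                WHERE ug.user_id = NEW.user_id AND g.solo) THEN
        RAISE EXCEPTION 'user % may not join further groups', NEW.user_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER user_group_join BEFORE INSERT ON user_groups
    FOR EACH ROW EXECUTE PROCEDURE check_group_join();
```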

Almost Done: User Administration

In the last two sections we saw the idea of a table of users
and a cross-reference of users to groups. This turns out to
solve another issue we will have: letting administrators
manage groups. If we define a group called "user_administrators"
and give them total
power on these tables, and also give them CRUD screens
for them, then we have a user administrator system.
This works for both the "social" and the "finance" application.

The triggers on the table have to be slightly different
for the two cases, but coding them up accordingly is a
small exercise.

Cross-Database Access

Believe it or not, the system outlined above has met all of
our requirements except one. So far we have a system that never
requires the web server to have any elevated privileges within
the database, allows members of the public to join some sites
while barring them from others, is structurally immune from
SQL injection, allows different people on different sites to
have the same user id, and allows administrators
of both sites to directly manage accounts. Moreover, we
can handle both plaintext passwords and more serious
reset-only situations.

This leaves only one very thorny issue: cross-database
access. The specific database server I use most is PostgreSQL,
and this server has a problem (for this scenario):
out of the box, a database account can connect
to any database. This does not mean the account has any
privileges on the database, but we very seriously do not want
this to happen at all. If a member of the "social" site can
connect to the "finance" app, we have a potential vulnerability
even if he has zero privileges in that database. We would be
much happier if he could not connect at all.

In Postgres there is a solution to this, but I've grown to
not like it. In Postgres you can specify that a user can only
connect to a database if they are in a group that has the
same name as the database. This is easy to set up, but it
requires changing the default configuration of Postgres.
However, for the sheer challenge of it I'd like to work out
how to do it without requiring that change. So far I'm
still puzzling this out. I'd also like to know that the
approach would work at very least on MS SQL Server and
MySQL.

Conclusion

Most of what is in this week's essay is not that radical to
any informed database veteran. But to web programmers
who were unfortunate enough to grow up in the world
of relational-databases-must-die nonsense, it is probably
hard or impossible to imagine a system where users are
connecting with real database accounts. The ironic thing
is that the approach described here is far more secure
than any single sign-on system, but it requires the programmer
to shift thinking away from action-based code-centric models
to what is really going on: table-based privileges. Once
that hurdle is past, the rest of it comes easy.

Yeah, well: I never said this blog would be a regular thing, so there!

I'm not sure what is going to happen to the world economy. I keep hearing about these fantastic rescue packages that seemingly no one has to pay for.

Recently, I was reminded of why I am so cynical of modern theories of education and economy. I do have a lot of respect for the engineers who made the moon landings possible.

Finally, fresh news... the apex.oracle.com demo server was upgraded to Application Express 3.2.0.00.21. You can't download it yet, but you can test it online. Very soon we'll have a production release available. You can read all about the Oracle Application Express Release 3.2 new feature details here. Additional info is on Joel Kallman's post about the new release.

We have gotten a lot of questions recently regarding how to gather and maintain optimizer statistics on large partitioned tables. The majority of these questions can be summarized into two topics:

When queries access a single partition with stale or non-existent partition level statistics I get a sub optimal plan due to “Out of Range” values

Global statistics collection is extremely expensive in terms of time and system resources

This article will describe both of these issues and explain how you can address them both in Oracle Database 10gR2 and 11gR1.

Out of Range

Large tables are often decomposed into smaller pieces called partitions in order to improve query performance and ease of data management. The Oracle query optimizer relies on both the statistics of the entire table (global statistics) and the statistics of the individual partitions (partition statistics) to select a good execution plan for a SQL statement. If the query needs to access only a single partition, the optimizer uses only the statistics of the accessed partition. If the query accesses more than one partition, it uses a combination of global and partition statistics.

“Out of Range” means that the value supplied in a where clause predicate is outside the domain of values represented by the [minimum, maximum] column statistics. The optimizer prorates the selectivity based on the distance between the predicate value and the maximum value (assuming the value is higher than the max), that is, the farther the value is from the maximum value, the lower the selectivity will be. This situation occurs most frequently in tables that are range partitioned by a date column, a new partition is added, and then queried while rows are still being loaded in the new partition. The partition statistics will be stale very quickly due to the continuous trickle feed load even if the statistics get refreshed periodically. The maximum value known to the optimizer is not correct leading to the “Out of Range” condition. The under-estimation of selectivity often leads the query optimizer to pick a sub optimal plan. For example, the query optimizer would pick an index access path while a full scan is a better choice.

The "Out of Range" condition can be prevented by using the new copy table statistics procedure available in Oracle Database 10.2.0.4 and 11g. This procedure copies the statistics of the source [sub] partition to the destination [sub] partition. It also copies the statistics of the dependent objects: columns, local (partitioned) indexes, etc. It adjusts the minimum and maximum values of the partitioning column as follows: it uses the high bound partitioning value as the maximum value of the first partitioning column (it is possible to have concatenated partition columns) and the high bound partitioning value of the previous partition as the minimum value of the first partitioning column for a range partitioned table. It can optionally scale some of the other statistics, like the number of blocks, number of rows, etc., of the destination partition.

Assume we have a table called SALES that is range partitioned by quarter on the SALES_DATE column. At the end of every day data is loaded into the latest partition. However, statistics are only gathered at the end of every quarter when the partition is fully loaded. Assuming global and partition level statistics (for all fully loaded partitions) are up to date, use the following steps in order to prevent getting a sub-optimal plan due to "out of range".

Lock the table statistics using the LOCK_TABLE_STATS procedure in DBMS_STATS. This avoids interference from the automatic statistics job.

EXEC DBMS_STATS.LOCK_TABLE_STATS('SH','SALES');

Before beginning the initial load into each new partition (say SALES_Q4_2000) copy the statistics from the previous partition (say SALES_Q3_2000) using COPY_TABLE_STATS. You need to specify FORCE=>TRUE to override the statistics lock.

In a data warehouse environment it is very common to do a bulk load directly into one or more empty partitions. This will make the partition statistics stale and may also make the global statistics stale. Re-gathering statistics for the affected partitions and for the entire table can be very time consuming. Traditionally, statistics collection is done in a two-pass approach:

In the first pass we will scan the table to gather the global statistics

In the second pass we will scan the partitions that have been changed to gather their partition level statistics.

The full scan of the table for global statistics collection can be very expensive depending on the size of the table. Note that the scan of the entire table is done even if we change a small subset of partitions.

In Oracle Database 11g, we avoid scanning the whole table when computing global statistics by deriving the global statistics from the partition statistics. Some of the statistics can be derived easily and accurately from partition statistics. For example, the number of rows at the global level is the sum of the number of rows of the partitions. Even global histograms can be derived from partition histograms. But the number of distinct values (NDV) of a column cannot be derived from partition level NDVs. So, Oracle maintains another structure called a synopsis for each column at the partition level. A synopsis can be considered a sample of distinct values. The NDV can be accurately derived from synopses. We can also merge multiple synopses into one. The global NDV is derived from the synopsis generated by merging all of the partition level synopses. To summarize:

The global statistics are automatically derived from the partition level statistics and global synopses

The incremental maintenance feature is disabled by default. It can be enabled by changing the INCREMENTAL table preference to true. It can also be enabled for a particular schema or at the database level. If you are interested in more details of the incremental maintenance feature, please refer to the following paper presented at SIGMOD 2008 and to our previous blog entry on new NDV gathering in 11g.

Assume we have a table called SALES that is range partitioned by day on the SALES_DATE column. At the end of every day data is loaded into the latest partition and partition statistics are gathered. Global statistics are only gathered at the end of every month because gathering them is very time and resource intensive. Use the following steps in order to maintain global statistics after every load.

Turn on the incremental feature for the table.

EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');

At the end of every load, gather table statistics using the GATHER_TABLE_STATS command. You don't need to specify the partition name. Also, do not specify the granularity parameter. The command will collect statistics for partitions with stale or missing statistics and update the global statistics based on the partition level statistics and synopses.

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');

Note that the incremental maintenance feature was introduced in Oracle Database 11g Release 1. However, we also provide a solution in Oracle Database 10g Release 2 (10.2.0.4) that simulates the same behavior. The 10g solution is a new value, 'APPROX_GLOBAL AND PARTITION', for the GRANULARITY parameter of the GATHER_TABLE_STATS procedures. It behaves the same as the incremental maintenance feature except that we don't update the NDV for non-partitioning columns and the number of distinct keys of the index at the global level. For the partitioning column we update the NDV as the sum of the NDVs at the partition level. Also, we set the NDV of columns of unique indexes to the number of rows of the table. In general, non-partitioning column NDV at the global level becomes stale less often. It may be possible to collect global statistics less frequently than the default (when the table changes 10%) since the APPROX_GLOBAL option maintains most of the global statistics accurately.

Let's take a look at an example to see how you would effectively use the Oracle Database 10g approach.

After the data load is complete, gather statistics using DBMS_STATS.GATHER_TABLE_STATS for the last partition (say SALES_11FEB2009), specify granularity => 'APPROX_GLOBAL AND PARTITION'. It will collect statistics for the specified partition and derive global statistics from partition statistics (except for NDV as described before).

We have gotten a lot of questions recently regarding how to gather and maintain optimizer statistics on large partitioned tables. The majority of these questions can be summarized into two topics:

When queries access a single partition with stale or non-existent partition level statistics I get a sub optimal plan due to “Out of Range” values

Global statistics collection is extremely expensive in terms of time and system resources

This article will describe both of these issues and explain how you can address them both in Oracle Database 10gR2 and 11gR1.

Out of RangeLarge tables are often decomposed into smaller pieces called partitions in order to improve query performance and ease of data management. The Oracle query optimizer relies on both the statistics of the entire table (global statistics) and the statistics of the individual partitions (partition statistics) to select a good execution plan for a SQL statement. If the query needs to access only a single partition, the optimizer uses only the statistics of the accessed partition. If the query access more than one partition, it uses a combination of global and partition statistics.

“Out of Range” means that the value supplied in a where clause predicate is outside the domain of values represented by the [minimum, maximum] column statistics. The optimizer prorates the selectivity based on the distance between the predicate value and the maximum value (assuming the value is higher than the max), that is, the farther the value is from the maximum value, the lower the selectivity will be. This situation occurs most frequently in tables that are range partitioned by a date column, a new partition is added, and then queried while rows are still being loaded in the new partition. The partition statistics will be stale very quickly due to the continuous trickle feed load even if the statistics get refreshed periodically. The maximum value known to the optimizer is not correct leading to the “Out of Range” condition. The under-estimation of selectivity often leads the query optimizer to pick a sub optimal plan. For example, the query optimizer would pick an index access path while a full scan is a better choice.

The "Out of Range" condition can be prevented by using the new copy table statistics procedure available in Oracle Database10.2.0.4 and 11g. This procedure copies the statistics of the source [sub] partition to the destination [sub] partition. It also copies the statistics of the dependent objects: columns, local (partitioned) indexes etc. It adjusts the minimum and maximum values of the partitioning column as follows; it uses the high bound partitioning value as the maximum value of the first partitioning column (it is possible to have concatenated partition columns) and high bound partitioning value of the previous partition as the minimum value of the first partitioning column for range partitioned table. It can optionally scale some of the other statistics like the number of blocks, number of rows etc. of the destination partition.

Assume we have a table called SALES that is ranged partitioned by quarter on the SALES_DATE column. At the end of every day data is loaded into latest partition. However, statistics are only gathered at the end of every quarter when the partition is fully loaded. Assuming global and partition level statistics (for all fully loaded partitions) are up to date, use the following steps in order to prevent getting a sub-optimal plan due to “out of range”.

Lock the table statistics using LOCK_TABLE_STATS procedure in DBMS_STATS. This is to avoid interference from auto statistics job.

EXEC DBMS_STATS.LOCK_TABLE_STATS('SH','SALES');

Before beginning the initial load into each new partition (say SALES_Q4_2000) copy the statistics from the previous partition (say SALES_Q3_2000) using COPY_TABLE_STATS. You need to specify FORCE=>TRUE to override the statistics lock.

In data warehouse environment it is very common to do a bulk load directly into one or more empty partitions. This will make the partition statistics stale and may also make the global statistics stale. Re-gathering statistics for the effected partitions and for the entire table can be very time consuming. Traditionally, statistics collection is done in a two-pass approach:

In the first pass we will scan the table to gather the global statistics

In the second pass we will scan the partitions that have been changed to gather their partition level statistics.

The full scan of the table for global statistics collection can be very expensive depending on the size of the table. Note that the scan of the entire table is done even if we change a small subset of partitions.

In Oracle Database 11g, we avoid scanning the whole table when computing global statistics by deriving them from the partition statistics. Some statistics can be derived easily and accurately from partition statistics; for example, the number of rows at the global level is the sum of the number of rows in the partitions. Even the global histogram can be derived from partition histograms. But the number of distinct values (NDV) of a column cannot be derived from partition-level NDVs. So Oracle maintains another structure, called a synopsis, for each column at the partition level. A synopsis can be thought of as a sample of the distinct values. The NDV can be accurately derived from synopses, and multiple synopses can be merged into one. The global NDV is derived from the synopsis generated by merging all of the partition-level synopses. To summarize:

The global statistics are automatically derived from the partition-level statistics and synopses.

The incremental maintenance feature is disabled by default. It can be enabled by changing the INCREMENTAL table preference to TRUE; it can also be enabled for a particular schema or at the database level. If you are interested in more details of the incremental maintenance feature, please refer to the paper presented at SIGMOD 2008 and to our previous blog entry on the new NDV gathering in 11g.
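As a sketch, the preference can be set at any of the three scopes mentioned above (the schema- and database-level calls assume the 11g SET_SCHEMA_PREFS and SET_DATABASE_PREFS procedures):

```sql
-- Enable incremental statistics maintenance at different scopes:
EXEC DBMS_STATS.SET_TABLE_PREFS('SH', 'SALES', 'INCREMENTAL', 'TRUE');  -- one table
EXEC DBMS_STATS.SET_SCHEMA_PREFS('SH', 'INCREMENTAL', 'TRUE');          -- whole schema
EXEC DBMS_STATS.SET_DATABASE_PREFS('INCREMENTAL', 'TRUE');              -- whole database
```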

Assume we have a table called SALES that is range partitioned by day on the SALES_DATE column. At the end of every day, data is loaded into the latest partition and partition statistics are gathered. Global statistics are only gathered at the end of every month, because gathering them is very time and resource intensive. Use the following steps to maintain global statistics after every load.

Turn on the incremental feature for the table.

EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');

At the end of every load, gather table statistics using the GATHER_TABLE_STATS procedure. You don’t need to specify the partition name; also, do not specify the granularity parameter. The command collects statistics for partitions with stale or missing statistics and updates the global statistics based on the partition-level statistics and synopses.

EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');

Note that the incremental maintenance feature was introduced in Oracle Database 11g Release 1. However, we also provide a solution in Oracle Database 10g Release 2 (10.2.0.4) that simulates the same behavior: a new value, 'APPROX_GLOBAL AND PARTITION', for the GRANULARITY parameter of the GATHER_TABLE_STATS procedures. It behaves the same as the incremental maintenance feature, except that we don’t update the NDV for non-partitioning columns or the number of distinct keys of the index at the global level. For the partitioning column, we update the global NDV as the sum of the NDVs at the partition level. Also, we set the NDV of columns of unique indexes to the number of rows of the table. In general, the non-partitioning column NDV at the global level becomes stale less often. It may be possible to collect global statistics less frequently than the default (when the table changes by 10%), since the APPROX_GLOBAL option maintains most of the global statistics accurately.

Let's take a look at an example to see how you would effectively use the Oracle Database 10g approach.

After the data load is complete, gather statistics for the last partition (say SALES_11FEB2009) using DBMS_STATS.GATHER_TABLE_STATS, specifying granularity => 'APPROX_GLOBAL AND PARTITION'. It will collect statistics for the specified partition and derive the global statistics from the partition statistics (except for NDV, as described before).
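A sketch of that call, using the illustrative owner, table, and partition names from this example:

```sql
-- 10.2.0.4 approach: gather the freshly loaded partition and derive
-- (most of) the global statistics from the partition-level statistics.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'SH',
    tabname     => 'SALES',
    partname    => 'SALES_11FEB2009',
    granularity => 'APPROX_GLOBAL AND PARTITION');
END;
/
```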

ADF Business Components Entity Objects provide a method, isAttributeChanged, that returns true if the attribute has been changed since it was obtained from the database. This method can cause a problem when used in business rules that are triggered at Commit time, or in business rules that only apply when the attribute value has actually changed.

Problem when calling isAttributeChanged at Commit time

From the Javadoc of EntityImpl: Note that if a Posting operation is performed, and the attribute value is sync'd with the database, that subsequent requests will return false, until the next time the attribute is changed.

Another quote from the Javadoc of EntityImpl: Note that this method will return true if the value has been set to the same as the database value since it doesn't perform a value comparison.

This can be a problem for validations that apply only when the value changes: for example, the rule "Department Name cannot be changed if the Department has Employees". You might have implemented the rule in a method validator using isAttributeChanged:

return (!isAttributeChanged(DEPARTMENTNAME) || !hasEmployees());

Suppose a Department has Employees, and you change both the Department Name and the Department Manager. When you apply these changes, you obviously get an error because you are not allowed to change the Department Name. Suppose you now change the Department Name back to its original value, and try again, because you still want to change the Department Manager. You will still get the error, because isAttributeChanged(DEPARTMENTNAME) will remain false until you have posted the data, even though the attribute value is now the same as the value in the database!

To avoid this, you can define a variation on isAttributeChanged in your EntityImpl base class: isAttributeValueChanged (but note that when called at Commit level, it will still return false, as Jan described).

/**
 * Using getPostedAttribute and getAttribute this method retrieves the old and new
 * values and compares whether it has changed. Alternative to isAttributeChanged,
 * which returns true if the value was modified to the same value as the database.
 * @param index index of the attribute.
 * @return true if the value of the attribute at the given index
 *         is different from the one in the database.
 */
public boolean isAttributeValueChanged(int index) {
    Object oldVal = getPostedAttribute(index);
    Object newVal = getAttribute(index);
    boolean valueChanged = true;
    if ((oldVal == null) && (newVal == null)) {
        valueChanged = false;
    } else if (newVal != null) {
        // treat an empty string as equivalent to a database NULL
        if (newVal instanceof String) {
            if (((String) newVal).length() == 0) {
                if (oldVal == null) {
                    valueChanged = false;
                }
            }
        }
        if (valueChanged && newVal.equals(oldVal)) {
            // in new rows, the oldVal is always the same as the newVal,
            // but if the value is not null in a new row, we want to return true
            if (getEntityState() != Entity.STATUS_NEW) {
                valueChanged = false;
            }
        }
    }
    return valueChanged;
}
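With the helper in place, the Department Name rule from earlier can be written against the value-comparing variant. This is a sketch only: the validator method name is hypothetical, and DEPARTMENTNAME and hasEmployees() are the same assumed names used in the original example.

```java
// Hypothetical method validator on the Department entity, assuming its
// EntityImpl base class defines isAttributeValueChanged as shown above.
public boolean validateDepartmentNameUnchanged() {
    // Reverting the name back to its posted value now correctly counts as
    // "unchanged", unlike isAttributeChanged, so the rule no longer fires.
    return (!isAttributeValueChanged(DEPARTMENTNAME) || !hasEmployees());
}
```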

Try searching for "cloud computing" in Google and you get 16,200,000 results.

There is a lot of interest in Cloud Computing and what it means as applied to different aspects of the computing paradigm. It is more than just Software as a Service (SaaS), which relates to hosted, on-demand delivery of applications.

Based on my understanding, Cloud Computing is about applications, platforms, infrastructure, and storage resources being available as services.

There are many interesting discussions about what it is and what should be the definition of "Cloud".

Instead of trying to coin yet another definition of "Cloud", I tried to find a somewhat formal but plausible definition. This is the closest one I found; it is from an article published in ACM SIGCOMM Computer Communication Review:

"Clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically re-configured to adjust to a variable load (scale), allowing also for an optimum resource utilization. This pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the Infrastructure Provider by means of customized SLAs."

As I write this, I am nursing a couple of bruises I incurred skijouring with my dog Thunder. Skijouring is Nordic skiing with your dog hooked up to a lead, so he pulls you. I believe the word “skijour” comes from the Norwegian word for “cheating,” judging by the dirty looks I get from other Nordic skiers when Thunder zooms past them towing me. Mind you, Thunder has two speeds: flat out and conked out. Periodically, he stops in the middle of a run to jump into a snow bank, do a “doggy snow angel” and cool off, or check out some other dog’s calling card. You get used to using the fast-braking snowplow since I know from experience it is no fun to reach the end of the tether at high speed: it leads to a Nordic skiing move known as a “face plant.”

I Nordic ski a lot, hooked up to Thunder most of the time, though I typically do a few kilometers by myself because otherwise he is the only one getting aerobic exercise. The biggest hazard for me in Nordic skiing (which does not normally carry the risks alpine skiing does) is from what I call LLDS – loose, loutish dog syndrome. That is, many people around here ski with their dogs, hardly any of them are on leads, and of those, at least some percentage are big, loutish, untrammeled dogs who run up to Thunder at full tilt, causing him to either “defend” me or otherwise establish his alpha dog-dom. I have to explain to their owners that no, Thunder does not want to play: he is a working dog, pulling a sled or a skier is his “job” and he is single-minded about it. Sometimes but not often, his lunge after another dog bounding up at him pulls me over (hence, the previously alluded-to bruises). Last year, I had a $50 vet bill courtesy of a border collie who bit Thunder to herd him when he most definitely did not want to be herded. The owner, needless to say, did not call his dog despite my repeated requests that he do so. Jerk.

I should add that most people on the Nordic trails around Sun Valley (especially when asked) call their dogs instead of letting them launch themselves at other dogs; most of the dogs are reasonably well behaved, but about 20% are not, and they are inevitably golden retrievers or labs. With apologies to my friends who have those breeds,* I cringe when I see labs or retrievers on the Nordic trails, because I know those dogs are going to be the ones bounding up and creating a problem. Part of this I attribute to rampant cute puppyitis (many people fall in love with an adorable puppy – and both labs and retrievers are very cute puppies – without bothering to find out anything about the breed temperament) and part to the fact that in any group, there are going to be irresponsible pet owners.

My dog, I admit, is hardly perfect, but the rules of the Blaine County Recreation District are that if dogs are off lead, they must be within 15 feet and under voice command, both of which are regularly flouted by the aforementioned irresponsible dog owners. I’ve seen dogs wandering nowhere near visual or audible distance of their owners (labs and retrievers, almost always). Realistically, of course, it is the owner who is at fault and not the dog. I guess a side effect of cute puppyitis is thinking your dog is so adorable, everyone else (and every other dog) will find him just as appealing and you don’t have to bother training him.

I’ve been reminded recently of the difference between (if I may be forgiven) bad dogs and bad owners with the release of the SANS Top 25 Most Dangerous Programming Errors list. The list describes the Top 25 programming errors that contribute to security vulnerabilities. Many people contributed to it so it represents a broad consensus that is, overall, a strength. I should also note that one of my team reviewed and contributed to the list. I further add (hastily) that on the face of it, I don’t have a problem with the list: a list is a list (like a dog, it’s not necessarily a bad dog or a good dog, it’s just a dog). To the extent that a Top 25 Most Dangerous Programming Errors list helps create a starting point for Avoiding Bad Coding, it can be very useful. I also thank various folks who spearheaded the creation of the list (Bob Martin at Mitre was the hard-working project manager) and who scrupulously attempted to get an industry consensus via multiple review cycles and multiple contributors. As I have opined previously, there are a lot of universities that don’t teach secure coding or emphasize it and a lot of developers who are looking for guidance as to where to start paying attention to What Not To Do. To that extent, the SANS list is very useful.

My issue is with the potential mischief this list can and I fear will be put to, which, like my Nordic skiing example is not the fault of the dog (the list), it’s a fault of the “owners,” in this case people who “buy” the cute little list and are enamored of it and think it can do no wrong. I meet dog owners who decide their dog who runs up at Thunder full throttle is just the cutest thing and they do not understand why Thunder does not want to play. Then, it is my fault if there is a problem (uh, my dog is on a lead, yours is not, nor does yours obey voice command, therefore, your dog is by definition out of control and it is your responsibility to control the animal). I love Nordic skiing but the LLDS problem makes me absolutely crazy and ruins my skiing experience. Similarly, I am seeing what started as a good idea – a “don’t do this” list – morph into a big, loutish dog.

For example, I have heard people exclaim that it would be a great thing to demand of commercial software vendors that the solution to all security ills is that vendors be made to attest (as part of a contract) that their code contains no programming errors in the SANS Top 25 Most Dangerous Programming Errors list. Now, even if this could be foisted on the vendor community, just like a large untrammeled Labrador, I believe, the results would be unpredictable and not universally welcome. One wants to know, is the goal to have better software, or to have someone to sue? Because making forced contractual attestations as to the degree to which one’s software is free of a class of defects has some unintended consequences.

I have opined previously on the fact that automated tools help find Dumb Things In Code that could include some or all of the SANS Top 25. But everyone who has used them knows that the available tools on the market do not find all vulnerabilities or even all types of vulnerabilities. And if your code is quadruple digits in length or more – and most commercial software is multiple million lines of code – manual code review is a) time consuming, b) unlikely to succeed, c) subject to the vagaries of the reviewer and therefore not repeatable or consistent, and d) out of date as soon as the code changes, which is daily in many development organizations. More to the point, what happens when a vendor that has done the best it can to find and eradicate the SANS Top 25 (or whatever the List Du Jour is, more on which later) finds a vulnerability on the verboten list after using the best available tools? That is, where the product is already shipping? Suppose that the vulnerability a vendor finds is a really bad one (let’s say, CVSS 10), and so easy to exploit your grandmother can do it. Does the vendor a) sit on the vulnerability and hope nobody else finds it, or b) fix the problem and find themselves in breach of contract? That, dear reader, is what is known as a Catch-22 (or, in the best classic tradition, being caught between Scylla and Charybdis). Back to my original question, are the people rubbing their hands in glee at the prospect of forcing contract provisions that are list-based down vendors’ throats actually interested in better software, or having someone to blame?

I believe there is another problem with enshrining anything like a SANS Top 25 list into a contractual or regulatory straitjacket. First of all, the SANS Top 25 list is a general list, with both the utility and limitations of that. If one’s code is (based on one’s development practice, development framework and deployment scenario of the software) more at risk from another kind of vulnerability that did not make the Top 25 (or, if new threats are discovered, and they are all the time), what should a vendor focus on? The theoretical Top 25 or the issues based on one’s own experience? The very nature of a Top 25 list is that it is likely to be overtaken by events (actually, in the general sense, it would be a very nice problem to have, because it might mean everyone had eradicated the Top 25 issues or at least made a really good dent in them, which we all want). However, that does not mean contractual language would change fast enough to maintain relevancy. It never does.

By way of example, when I was in the Navy, it was (at that time) part of the uniform regulations that females wearing dress uniforms (service dress blue or service dress white) were required to wear or carry white gloves and a purse. (If you carried white gloves instead of wearing them, they had to be folded lengthwise with “fingers out” – in front of your hand. If you care.) In short, white gloves and a purse were “part of the uniform” and if you did not wear or carry gloves and carry a purse that meant you were technically, “out of uniform,” one of the seven deadly sins in the Navy. Did I mention that part of your annual fitness report (think “employee performance review”) is “military bearing” and you get graded on things like how well you wear the uniform and observe military etiquette? Now, as a member of the Civil Engineer Corps, my job as a contracting officer entailed going on construction sites. The safety “rules” of most construction sites are that you need to wear a hardhat and steel-toes shoes. So, in order to comply with construction site safety regulations and Navy uniform regulations, I faced carrying white gloves onto a construction site (where, needless to say, they would not stay white very long). Not to mention, I was supposed to be carrying a purse and wearing a hardhat (instead of my uniform hat). I gave up on the steel-toed shoes since there are no steel-toed women’s pumps that I know of.

The bottom line is that those uniform regulations were set in the day when every woman wore a hat and gloves (my mom still does, to church every Sunday) and did not contemplate that women might expand to jobs that involved going onto construction sites. The regs did not keep up with the times. As a result, there is no way I could be both “in uniform” and “in compliance” on a construction site. Catch-22 – pick which regulation you are going to break, because I could not comply with all of them.

You might note that in the above case of “unattainable” or mutually conflicting regulations I should have just used common sense, and I did. But I was still technically “out of uniform.” And hence, my question, are the people advocating that the otherwise useful list become a contractual commitment (thus overusing if not abusing the utility of it) interested in better software or in building in an inability of vendors to meet all contractual provisions (the software assurance equivalent of demanding steel-toed high heeled pumps)? And if their intentions are to make software better, can they reconcile the likely outcome that vendors will consider their almost certain liability when deciding whether to release a fix and consult their lawyers on how to characterize an issue differently, rather than just getting on with the fix? Making things better should not entail requiring a level of perfection nobody could possibly meet, let alone enshrining it into a contract. Unless of course, one’s real goal is to make money out of others’ imperfections. (I should pause here and reiterate that the list is a good idea, I know the people who spearheaded it had good motives and I thank them for their efforts.)

One of the great privileges of my job has been the opportunity to advise policymakers on the challenges of cybersecurity or to comment on bills and other regulatory language. I note that while “lobbyist” is sometimes a term of opprobrium:

· Policymakers, and the staff who capably assist them, typically want to legislate well

· Nobody can be an expert on every issue and thus policymakers and their staffers rely in part on outside stakeholders advising them of the specifics of an issue

· Such advice, if one is asked to or can offer it, is part of one’s civic responsibility (to help policymakers govern wisely)

· Lobbyists use these opportunities to talk about the scope and context of an issue, what their advice is and why, and what other positions are and why

· Bad lobbyists merely try to frame their side of the issue in a narrow context, and create law or regulation that will disproportionately benefit them personally - or their taskmasters - and to hell with everyone else and the public good

What I see shaping up in cybersecurity are a lot of people with agendas that involve personal enrichment as much as – no, I believe, in many cases more than – actually making things better. Security vendors, for example, whose core markets are drying up and who want to get into the testing business and therefore are lobbying to whomever will listen that outside testing is the answer to all security ills. Dare I say it, “Quis custodiet ipsos custodes?” (Who will watch the watchers?). People who want to come in and “audit” development processes – by regulation (again, who vets the auditors?). Vendors with encryption solutions who try to push data encryption as the legislative solution to all security ills. (Encryption does not solve access control problems and anyway, an alternative solution might be not allowing data to migrate to places it should not exist. Or automatically deleting it from places it should not be.) More to the point, I fear requiring specific technical remedies for a perceived risk runs the risk that the technical standards will rapidly be overtaken by newer technology, and we will be enshrining buggy whip standards in an automobile age. Or wearing white gloves to construction sites.

Now that the problem is clear, instead of recommending a particular “make it better” program like enshrining the SANS Top 25 list into contracts, here is what I would do:

I would like actual governance structures around Intended Do Good Security programs to ensure they are administered well, fairly, do not advantage a single vendor, promote actual and cost effective security improvement and do not demand the impossible overnight. In short, they give vendors both encouragement – pushy encouragement is fine – to improve and credit for doing it. And do actual security good, of course. I would include a recommended list of “good ideas that could make things better” as well as “bad ideas that legislators should stay away from.” (The way the SANS Top 25 list was developed showed good governance in that it was inclusive and transparently arrived at.)

Similarly, I would like vendors to implement some governance structures around their development processes. Nobody can do everything in security but we can all improve, and being able to a) target areas for improvement, b) create reasonable standards of practice in secure development within our own enterprises, and c) develop structures that help measure our internal consistency in depth and breadth of secure development practice is part of growing up. It’s treating security like a business value (diversity, cost containment, ethics, anything else you can and do run as a consistent corporate program). We all need to do that. (Oracle does have governance structures around our secure development programs, which will be the subject of future blogs.) None of us will develop products exactly the same way, with exactly the same APIs or exactly the same development frameworks, exactly the same tools and not every “good security idea” will be equally cost effective or useful for all of us. But we can, and should, figure out a) what we should be doing as individual vendors that is generally considered good practice, and b) make sure we do that.

For my part, and back to my Nordic skiing example, I cannot control what others do with their dogs, but for my part, I have tried to improve what I do with my dog. I started using my “happy voice” with Thunder when another dog comes up, to remind him that he likes labs, border collies, retrievers and so on. I am less tense and defensive so he does not feel he needs to protect me. I can be a better dog owner, and I try to be. Specifically, I figure if I am more open and accepting of loose, loutish dogs, it helps avoid problems and vet bills. I also thank the many responsible dog owners I meet for how smart, well behaved, and handsome their dogs are, and for calling their dogs instead of letting them bound up. Just like working on cybersecurity issues, my Nordic skiing goal is “happy – and safe – trails for all.”

* One of my dear friends just lost her beloved lab Zach. Zach was a lovely, sweet dog and will forever be enshrined in my personal pantheon of utterly lovable labs.

Book of the Week

War Made New, Technology, Warfare and the Course of History: 1500 to Today by Max Boot

A really fascinating look at how technology has contributed to military success. Max Boot analyzes warfare in terms of “ages” – gunpowder, first and second industrial age, and the information age (e.g., as applied in Iraq and Afghanistan). He also looks at potential future combat. The interesting questions in military history are not “who won?” but “why?” and “how?” It’s also fascinating to see how warfare is being transformed through information technology.

Pau hana happiness. If you’ve had a hard week, you can cheer yourself up by checking out Mānoa DNA below doing Ka’a Ahi Kahului. (You can also learn how to say ‘railroad’ in Hawaiian.) I love these guys! http://www.youtube.com/watch?v=xwx231xqAKI

As I write this, I am nursing a couple of bruises I incurred skijouring with my dog Thunder. Skijouring is Nordic skiing with your dog hooked up to a lead, so he pulls you. I believe the word “skijour” comes from the Norwegian word for “cheating,” judging by the dirty looks I get from other Nordic skiers when Thunder zooms past them towing me. Mind you, Thunder has two speeds: flat out and conked out. Periodically, he stops in the middle of a run to jump into a snow bank, do a “doggy snow angel” and cool off, or check out some other dog’s calling card. You get used to using the fast-braking snowplow since I know from experience it is no fun to reach the end of the tether at high speed: it leads to a Nordic skiing move known as a “face plant.”

I Nordic ski a lot, hooked up to Thunder most of the time though I typically do a few kilometers by myself because otherwise he is the only one getting aerobic exercise. The biggest hazard for me in Nordic skiing (which does not normally carry the risks alpine skiing does) is from what I call LLDS – loose, loutish dog syndrome. That is, many people around here ski with their dogs, hardly any of them are on leads, and of those, at least some percentage are big, loutish, untrammeled dogs who run up to Thunder at full tilt, causing him to either “defend” me or otherwise establish his alpha dog-dom. I have to explain to their owners that no, Thunder does not want to play: he is a working dog, pulling a sled or a skier is his “job” and he is single-minded about it. Sometimes but not often, his lunge after another dog bounding up at him pulls me over (hence, the previously alluded-to bruises). Last year, I had a $50 vet bill courtesy of a border collie who bit Thunder to herd him when he did most definitely did not want to be herded. The owner, needless to say, did not call his dog despite my repeated requests that he do so. Jerk.

I should add that most people on the Nordic trails around Sun Valley (especially when asked) call their dogs instead of letting them launch themselves at other dogs, most of the dogs are reasonably well behaved but about 20% are not and they are inevitably golden retrievers or labs. With apologies to my friends who have those breeds,* I cringe when I see labs or retrievers on the Nordic trails, because I know those dogs are going to be the ones bounding up and creating a problem. Part of this I attribute to rampant cute puppyitis (many people fall in love with an adorable puppy – and both labs and retrievers are very cute puppies – without bothering to find out anything about the breed temperament) and part to the fact that in any group, there are going to be irresponsible pet owners.

My dog, I admit, is hardly perfect, but the rules of the Blaine County Recreation District are that if dogs are off lead, they must be within 15 feet and under voice command, both of which are regularly flouted by the aforementioned irresponsible dog owners. I’ve seen dogs wandering not anywhere near visual or audible distance of their owners (labs and retrievers, almost always). Realistically, of course, it is the owner who is at fault and not the dog. I guess a side affect of cute puppyitis is thinking your dog is so adorable, every one else (and every other dog) will find him just as appealing and you don’t have to bother training him.

I’ve been reminded recently of the difference between (if I may be forgiven) bad dogs and bad owners with the release of the SANS Top 25 Most Dangerous Programming Errors list. The list describes the Top 25 programming errors that contribute to security vulnerabilities. Many people contributed to it so it represents a broad consensus that is, overall, a strength. I should also note that one of my team reviewed and contributed to the list. I further add (hastily) that on the face of it, I don’t have a problem with the list: a list is a list (like a dog, it’s not necessarily a bad dog or a good dog, it’s just a dog). To the extent that a Top 25 Most Dangerous Programming Errors list helps create a starting point for Avoiding Bad Coding, it can be very useful. I also thank various folks who spearheaded the creation of the list (Bob Martin at Mitre was the hard-working project manager) and who scrupulously attempted to get an industry consensus via multiple review cycles and multiple contributors. As I have opined previously, there are a lot of universities that don’t teach secure coding or emphasize it and a lot of developers who are looking for guidance as to where to start paying attention to What Not To Do. To that extent, the SANS list is very useful.

My issue is with the potential mischief this list can and I fear will be put to, which, like my Nordic skiing example is not the fault of the dog (the list), it’s a fault of the “owners,” in this case people who “buy” the cute little list and are enamored of it and think it can do no wrong. I meet dog owners who decide their dog who runs up at Thunder full throttle is just the cutest thing and they do not understand why Thunder does not want to play. Then, it is my fault if there is a problem (uh, my dog is on a lead, yours is not, nor does yours obey voice command, therefore, your dog is by definition out of control and it is your responsibility to control the animal). I love Nordic skiing but the LLDS problem makes me absolutely crazy and ruins my skiing experience. Similarly, I am seeing what started as a good idea – a “don’t do this” list – morph into a big, loutish dog.

For example, I have heard people exclaim that it would be a great thing to demand of commercial software vendors that the solution to all security ills is that vendors be made to attest (as part of a contract) that their code contains no programming errors in the SANS Top 25 Most Dangerous Programming Errors list. Now, even if this could be foisted on the vendor community, just like a large untrammeled Labrador, I believe, the results would be unpredictable and not universally welcome. One wants to know, is the goal to have better software, or to have someone to sue? Because making forced contractual attestations as to the degree to which one’s software is free of a class of defects has some unintended consequences.

I have opined previously on the fact that automated tools help find Dumb Things In Code that could include some or all of the SANS Top 25. But everyone who has used them knows that the available tools on the market do not find all vulnerabilities or even all types of vulnerabilities. And if your code is quadruple digits in length or more – and most commercial software is multiple million lines of code – manual code review is a) time consuming, b) unlikely to succeed, c) subject to the vagaries of the reviewer and therefore not repeatable or consistent, and d) out of date as soon as the code changes, which is daily in many development organizations. More to the point, what happens when a vendor that has done the best it can to find and eradicate the SANS Top 25 (or whatever the List Du Jour is, more on which later) finds a vulnerability on the verboten list after using the best available tools? That is, where the product is already shipping? Suppose that the vulnerability a vendor finds is a really bad one (let’s say, CVSS 10), and so easy to exploit your grandmother can do it. Does the vendor a) sit on the vulnerability and hope nobody else finds it, or b) fix the problem and find themselves in breach of contract? That, dear reader, is what is known as a Catch-22 (or, in the best classic tradition, being caught between Scylla and Charybdis). Back to my original question, are the people rubbing their hands in glee at the prospect of forcing contract provisions that are list-based down vendors’ throats actually interested in better software, or having someone to blame?

I believe there is another problem with enshrining anything like a SANS Top 25 list into a contractual or regulatory straitjacket. First of all, the SANS Top 25 list is a general list, with both the utility and the limitations that generality implies. If one's code is (based on one's development practice, development framework and deployment scenario of the software) more at risk from another kind of vulnerability that did not make the Top 25 (or, if new threats are discovered, and they are all the time), what should a vendor focus on? The theoretical Top 25 or the issues based on one's own experience? The very nature of a Top 25 list is that it is likely to be overtaken by events (actually, in the general sense, it would be a very nice problem to have, because it might mean everyone had eradicated the Top 25 issues or at least made a really good dent in them, which we all want). However, that does not mean contractual language would change fast enough to maintain relevancy. It never does.

By way of example, when I was in the Navy, it was (at that time) part of the uniform regulations that females wearing dress uniforms (service dress blue or service dress white) were required to wear or carry white gloves and a purse. (If you carried white gloves instead of wearing them, they had to be folded lengthwise with "fingers out" – in front of your hand. If you care.) In short, white gloves and a purse were "part of the uniform," and if you did not wear or carry gloves and carry a purse, you were technically "out of uniform," one of the seven deadly sins in the Navy. Did I mention that part of your annual fitness report (think "employee performance review") is "military bearing," and you get graded on things like how well you wear the uniform and observe military etiquette? Now, as a member of the Civil Engineer Corps, my job as a contracting officer entailed going on construction sites. The safety "rules" of most construction sites are that you need to wear a hardhat and steel-toed shoes. So, in order to comply with construction site safety regulations and Navy uniform regulations, I faced carrying white gloves onto a construction site (where, needless to say, they would not stay white very long). Not to mention, I was supposed to be carrying a purse and wearing a hardhat (instead of my uniform hat). I gave up on the steel-toed shoes since there are no steel-toed women's pumps that I know of.

The bottom line is that those uniform regulations were set in the day when every woman wore a hat and gloves (my mom still does, to church every Sunday) and did not contemplate that women might expand to jobs that involved going onto construction sites. The regs did not keep up with the times. As a result, there is no way I could be both “in uniform” and “in compliance” on a construction site. Catch-22 – pick which regulation you are going to break, because I could not comply with all of them.

You might note that in the above case of “unattainable” or mutually conflicting regulations I should have just used common sense, and I did. But I was still technically “out of uniform.” And hence, my question, are the people advocating that the otherwise useful list become a contractual commitment (thus overusing if not abusing the utility of it) interested in better software or in building in an inability of vendors to meet all contractual provisions (the software assurance equivalent of demanding steel-toed high heeled pumps)? And if their intentions are to make software better, can they reconcile the likely outcome that vendors will consider their almost certain liability when deciding whether to release a fix and consult their lawyers on how to characterize an issue differently, rather than just getting on with the fix? Making things better should not entail requiring a level of perfection nobody could possibly meet, let alone enshrining it into a contract. Unless of course, one’s real goal is to make money out of others’ imperfections. (I should pause here and reiterate that the list is a good idea, I know the people who spearheaded it had good motives and I thank them for their efforts.)

One of the great privileges of my job has been the opportunity to advise policymakers on the challenges of cybersecurity or to comment on bills and other regulatory language. I note that while “lobbyist” is sometimes a term of opprobrium:

· Policymakers, and the staff who capably assist them, typically want to legislate well
· Nobody can be an expert on every issue and thus policymakers and their staffers rely in part on outside stakeholders advising them of the specifics of an issue
· Such advice, if one is asked to or can offer it, is part of one’s civic responsibility (to help policymakers govern wisely)
· Lobbyists use these opportunities to talk about the scope and context of an issue, what their advice is and why, and what other positions are and why
· Bad lobbyists merely try to frame their side of the issue in a narrow context, and create law or regulation that will disproportionately benefit them personally - or their taskmasters - and to hell with everyone else and the public good

What I see shaping up in cybersecurity are a lot of people with agendas that involve personal enrichment as much as – no, I believe, in many cases more than – actually making things better. Security vendors, for example, whose core markets are drying up and who want to get into the testing business and therefore are lobbying to whomever will listen that outside testing is the answer to all security ills. Dare I say it, "Quis custodiet ipsos custodes?" (Who will watch the watchers?). People who want to come in and "audit" development processes – by regulation (again, who vets the auditors?). Vendors with encryption solutions who try to push data encryption as the legislative answer to all security ills. (Encryption does not solve access control problems, and anyway, an alternative solution might be not allowing data to migrate to places it should not exist. Or automatically deleting it from places it should not be.) More to the point, I fear that mandating specific technical remedies for a perceived risk runs the risk that the technical standards will rapidly be overtaken by newer technology, and we will be enshrining buggy whip standards in an automobile age. Or wearing white gloves to construction sites.

Now that the problem is clear, instead of recommending a particular “make it better” program like enshrining the SANS Top 25 list into contracts, here is what I would do:

I would like actual governance structures around Intended-To-Do-Good security programs to ensure they are administered well and fairly, do not advantage a single vendor, promote actual and cost-effective security improvement, and do not demand the impossible overnight. In short, they should give vendors both encouragement – pushy encouragement is fine – to improve and credit for doing it. And do actual security good, of course. I would include a recommended list of "good ideas that could make things better" as well as "bad ideas that legislators should stay away from." (The way the SANS Top 25 list was developed showed good governance in that it was inclusive and transparently arrived at.)

Similarly, I would like vendors to implement some governance structures around their development processes. Nobody can do everything in security but we can all improve, and being able to a) target areas for improvement, b) create reasonable standards of practice in secure development within our own enterprises, and c) develop structures that help measure our internal consistency in depth and breadth of secure development practice is part of growing up. It’s treating security like a business value (diversity, cost containment, ethics, anything else you can and do run as a consistent corporate program). We all need to do that. (Oracle does have governance structures around our secure development programs, which will be the subject of future blogs.) None of us will develop products exactly the same way, with exactly the same APIs or exactly the same development frameworks, exactly the same tools and not every “good security idea” will be equally cost effective or useful for all of us. But we can, and should, figure out a) what we should be doing as individual vendors that is generally considered good practice, and b) make sure we do that.

For my part, and back to my Nordic skiing example, I cannot control what others do with their dogs, but for my part, I have tried to improve what I do with my dog. I started using my “happy voice” with Thunder when another dog comes up, to remind him that he likes labs, border collies, retrievers and so on. I am less tense and defensive so he does not feel he needs to protect me. I can be a better dog owner, and I try to be. Specifically, I figure if I am more open and accepting of loose, loutish dogs, it helps avoid problems and vet bills. I also thank the many responsible dog owners I meet for how smart, well behaved, and handsome their dogs are, and for calling their dogs instead of letting them bound up. Just like working on cybersecurity issues, my Nordic skiing goal is “happy – and safe – trails for all.”

* One of my dear friends just lost her beloved lab Zach. Zach was a lovely, sweet dog and will forever be enshrined in my personal pantheon of utterly lovable labs.

Book of the Week

War Made New, Technology, Warfare and the Course of History: 1500 to Today by Max Boot

A really fascinating look at how technology has contributed to military success. Max Boot analyzes warfare in terms of “ages” – gunpowder, first and second industrial age, and the information age (e.g., as applied in Iraq and Afghanistan). He also looks at potential future combat. The interesting questions in military history are not “who won?” but “why?” and “how?” It’s also fascinating to see how warfare is being transformed through information technology.

Pau hana happiness. If you’ve had a hard week, you can cheer yourself up by checking out Mānoa DNA below doing Ka’a Ahi Kahului. (You can also learn how to say ‘railroad’ in Hawaiian.) I love these guys!
http://www.youtube.com/watch?v=xwx231xqAKI

Oracle RAC and the Parallel Query Option can be a powerful combination, but if you don't entirely understand how it works, it might not be doing exactly what you think it is. The parallel query option allows some long running operations, such as table scans, to be divided into smaller chunks and run by multiple server processes. Since operations such as table scans are very I/O intensive, this essentially allows other processes to be doing work on your behalf while some of them are simply waiting on the I/O to return. Let's look at an example.

Let's say that you have a table scan that is going to read 1,000,000 blocks. Furthermore, ignoring all disk caching, driver optimizations, etc., let's assume that each physical I/O takes 2 ms. So, in order to read this entire table into the SGA it will take 1,000,000 I/Os x 0.002 sec/I/O = 2,000 sec, which is 33.3 minutes. So, it will take over half an hour (plus overhead) just to read the data into the SGA, and most of that time is spent idle, waiting on the I/O to return.

Now let's turn on parallel query with a parallelism of 10. Oracle will internally divide this 1,000,000 block table into multiple pieces and pass those pieces off to the 10 parallel query processes. If you are using anything other than a single SATA drive for your database, the I/O subsystem can easily handle the parallelism and will now return that same 1,000,000 blocks in 1/10th of the time, or 3.33 minutes.

Note: I prefer to have control over my parallelism, so I will use the parallel hint to invoke parallel query, as such: SELECT /*+ PARALLEL(table_alias, degree) */ * FROM table table_alias;

Taking it up a notch, now let's consider that the operations are consuming all of the CPU in the system. I can now additionally add nodes to use in a RAC cluster for parallel query, thus taking advantage of parallelism and RAC. This can be done by extending the hint to include the number of instances to use in a parallel query operation, as such: SELECT /*+ PARALLEL(table_alias, degree, instances) */ * FROM table table_alias; This will allow me to specify not only the number of query slaves to use, but the number of instances to use as well.
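As a concrete sketch of the two hint forms just described (the table name and alias are hypothetical):

```sql
-- Scan a large table with a degree of 10 on the local instance.
SELECT /*+ PARALLEL(o, 10) */ COUNT(*)
FROM orders o;

-- Same scan, but allow the slaves to be spread across 2 RAC instances.
SELECT /*+ PARALLEL(o, 10, 2) */ COUNT(*)
FROM orders o;
```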

This is only part of the story. Now it is necessary to specify whether we are talking about 10g or 11g. Let's start with Oracle 10g.

Parallel Query and Oracle 10g RAC

With Oracle 10g, the number of RAC nodes that will participate in a parallel query operation depends on a couple of parameters: instance_groups and parallel_instance_group. The instance_groups parameter is used to define the instance groups, and the parallel_instance_group parameter is used to specify which group you belong to.

First, set up the instance groups. In this example I have a four node RAC cluster:

INSTANCE_GROUPS

Assign each node the set of instance groups in which it participates:

node1.instance_groups = node1,first_two,all_nodes

node2.instance_groups = node2,first_two,all_nodes

node3.instance_groups = node3,last_two,all_nodes

node4.instance_groups = node4,last_two,all_nodes

The following graphic illustrates the configuration.

Setting the parallel_instance_group parameter will define which group you belong to, thus defining where your parallel queries can run.

For example, if I set my parallel_instance_group parameter to Node1, my parallel queries will only run on node 1 as shown here:

Setting the parameter parallel_instance_group in the init.ora file will specify the default value for each user who logs into that instance. An alter session command can be used to modify that parameter, as shown here:

SQL> alter session set parallel_instance_group = 'all_nodes';

This will allow the parallel query to span all nodes in the RAC cluster.

Note: You don't have to be on the node that parallel query is configured to run on. If you happen to log onto node 3 and your parallel_instance_group is node1, non-parallel queries will run on node 3, but any parallel queries will run on node 1.
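One way to sanity-check where the slaves actually land is to query the parallel execution views while a parallel query is running; a sketch, assuming you have SELECT privileges on the GV$ views:

```sql
-- Count the active parallel execution (PX) slave sessions per instance
-- while a parallel query is running in another session.
SELECT inst_id, COUNT(*) AS px_sessions
FROM   gv$px_session
GROUP  BY inst_id
ORDER  BY inst_id;
```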

Parallel Query and Oracle 11g RAC

So, what about Oracle 11g? If you are running on Oracle 11g, forget everything I just told you. One of the best features of Oracle 11g is that it now uses services to specify which nodes your parallel queries run on. So, depending on which service you connect to, the preferred nodes for that service are the nodes that parallel queries will use.
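A sketch of how such a service might be defined with srvctl (the database, service, and instance names here are made up for illustration):

```shell
# Create a service whose preferred instances are orcl1 and orcl2,
# with orcl3 and orcl4 available as failover targets, then start it.
srvctl add service -d ORCL -s BATCH_SVC -r orcl1,orcl2 -a orcl3,orcl4
srvctl start service -d ORCL -s BATCH_SVC
```

Parallel queries issued by sessions connected to BATCH_SVC would then run their slaves on the service's preferred instances.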

Conclusions

Depending on the operation that you are doing, you might be able to achieve huge performance gains by parallelizing across multiple nodes in a RAC cluster. Of course your mileage might vary, and if you aren't careful you might increase the load on nodes to a point where you are affecting other users. I have a saying: "If you aren't careful with parallelism, you might paralyze your system."

BPEL Console provides several out-of-the-box features for searching process instances using multiple search criteria. Following are the main search criteria that can be used within the BPEL console:

All the above are very powerful features for process administration.

Enterprise Integrations and Support

Enterprise integrations deal with business data and transactions. These integrations often have interactions between multiple layers. In the case of a failed transaction at any layer, support personnel can spend a great deal of time and energy finding and troubleshooting the issue. I wrote about this in one of my previous posts, "Service Oriented approach for ERP integration".

Now imagine a use case where thousands of messages are being processed every minute. Searching by some of the default criteria can return far too many results to be useful.

Let's take an example.

Custom Search Requirement Example

Let's assume that there is a BPEL process that processes orders in Oracle E-Business Suite. The order details come from an external application used by end users, which publishes them to a JMS queue. The BPEL process picks the message up from the JMS queue and creates an order in EBS with the order number that comes in the message.

In the above example, if the end application does not receive an order confirmation, you can imagine there can be many reasons for that. One of them could be that the BPEL process instance failed for some reason.

When it comes to tracking that order, the end application's support team will most likely call the integration support team and mention that no confirmation was received for this particular order (along with some sort of business key, like the order number).

In a production environment, there may be thousands of process instances. Browsing through them by time alone can take our BPEL support person quite a while, but being able to search for the process instance by a business key can prove extremely useful.

So the question from the business user and administrator becomes: "Can I search by order number?"

The answer is YES. Anyone who has worked with application integration can relate to this. The next question, then, is: "Is there a simple way to do it?"

The answer, again, is YES.

Let's Do It

Let's build one simple BPEL process from scratch and find out how we can achieve this highly desirable functionality.

Step 1

Right Click on your application and select “New Project” option

Select BPEL Process Project

Step 2

Name the process "SearchableUsingCustomKeyProcess"

Step 3

For simplicity we will proceed with the default option

Step 4

So now we have a skeleton asynchronous BPEL process

Step 5

Click on SearchableUsingCustomKeyProcess.xsd

Change the name of the input to orderNumber

Step 6

Drag and drop a Java Embedding activity from the Process Activities palette on the right-hand side

Name it “JavaEmbeddingToSetProcessTitle”

Step 7

Double click Java Embedding you just added

Add the following code
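The code snippet itself did not survive in this copy of the post. A minimal sketch of what the Java Embedding activity might contain, assuming the default client namespace prefix and the title format used later in this post (the XPath, variable names, and element names are illustrative and must match your own WSDL/XSD):

```java
// Read the orderNumber element from the input variable.
// The XPath below assumes the process generated in the earlier steps;
// adjust the namespace prefix and element names to match your schema.
oracle.xml.parser.v2.XMLElement orderEl =
    (oracle.xml.parser.v2.XMLElement) getVariableData(
        "inputVariable", "payload",
        "/client:SearchableUsingCustomKeyProcessProcessRequest/client:orderNumber");
String orderNumber = orderEl.getFirstChild().getNodeValue();

// Set a searchable instance title that embeds the business key.
setTitle("OrderEntryBPELProcess - OrderNumber:" + orderNumber);
```

Note that this snippet runs inside the BPEL engine's execution context (getVariableData and setTitle are provided by the engine), so it is not compilable on its own.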

Step 8

Deploy the process on server

Step 9

Initiate the process from BPEL console with “1001” as input to the orderNumber field

Notice the instance title is different than the process name. This is because of the setTitle statement in our Java Embedding.

Click on Audit Flow and then on the Java activity to make sure that the instance ran without any errors

Step 10

Let's see if we can search by our custom field (the order number) among the many BPEL process instances in the console

Click on Instances

Now put the following into the Title field under Locate Instances and click “Go”

“OrderEntryBPELProcess - OrderNumber:1001”

Notice how we are able to search for our particular order number (1001).

You can search for other order numbers easily by just replacing the end of the string with the order number. For example searching for the following string within the title field will search for order number “2001”

“OrderEntryBPELProcess - OrderNumber:2001”

Summary

Requirements related to searching for a process instance in the BPEL console by a custom field can be addressed by the approach demonstrated in this post.

This example demonstrates that it can be done and it can be done easily.

It’s quite feasible that the same approach could be used for “basic” complex event processing operations, especially those that don’t require history (or much persistence).

Paul is certainly right that CEP and event/data stream processing engines could benefit from the appliance approach. However I think that it is precisely those applications that require history that could most benefit. A high-speed streaming engine with a co-located database (running with much of the data in cache) gives the perfect environment for answering a wider range of "right-time" BI requirements; the engine can control exactly what needs to be retained and for how long to provide the appropriate time-series data to support whatever aggregated outputs are desired. The database acts as a memory extender (allowing infrequently accessed data to be paged out), as a source of replay / recovery data, and as a history for newly defined streaming queries. The latter is most important - if you want a 1 hour moving average, most streaming queries won't give you the right answer until the end of the first hour.

In the first three parts of this article, we have seen how SQL plan baselines are created, used and evolved. In this final installment, we will show some user interfaces, describe the interaction of SPM with other features and answer some of your questions.

DBMS_SPM package

A new package, DBMS_SPM, allows you to manage plan histories. We have already seen in previous examples how you can use it to create and evolve SQL plan baselines. Other management functions include changing attributes (like enabled status and plan name) of plans or dropping plans. You need the ADMINISTER SQL MANAGEMENT OBJECT privilege to execute this package.
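For example, a sketch of disabling a single plan with this package (the SQL handle here is a placeholder; substitute values from DBA_SQL_PLAN_BASELINES):

```sql
-- Disable one plan in a SQL plan baseline; the function returns the
-- number of plans altered.
DECLARE
  n PLS_INTEGER;
BEGIN
  n := DBMS_SPM.ALTER_SQL_PLAN_BASELINE(
         sql_handle      => 'SYS_SQL_593bc74fca8e6738',
         plan_name       => 'SYS_SQL_PLAN_fcc170b0a62d0f4d',
         attribute_name  => 'enabled',
         attribute_value => 'NO');
  DBMS_OUTPUT.PUT_LINE(n || ' plan(s) altered');
END;
/
```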

Viewing the plan history

Regardless of how a plan history is created, you can view details about the various plans in the view DBA_SQL_PLAN_BASELINES. At the end of Part 3 of this blog, we saw that the SQL statement had two accepted plans:

The SQL handle is a unique identifier for each SQL statement that you can use when managing your plan history using the DBMS_SPM package.

Creating an accepted plan by modifying the SQL text

Some of you may be manually tuning SQL statements by adding hints or otherwise modifying the SQL text. If you enable automatic capture of SQL plans and then execute this statement, you will be creating a SQL plan baseline for this modified statement. What you most likely want, however, is to add this plan to the plan history of the original SQL statement. Here's how you can do this using the above SQL statement as an example.

When the optimizer uses an accepted plan for a SQL statement, you can see it in the plan table (for explain) or V$SQL_PLAN (for shared cursors). Let's explain the SQL statement above and display its plan:

Note
-----
   - SQL plan baseline "SYS_SQL_PLAN_fcc170b0a62d0f4d" used for this statement

22 rows selected.

The note at the bottom tells you that the optimizer used an accepted plan.

A plan history might have multiple plans. You can see one of the accepted plans if the optimizer selects it for execution. But what if you want to display some or all of the other plans? You can do this using the display_sql_plan_baseline function in the DBMS_XPLAN package. Using the above example, here's how you can display the plan for all plans in the plan history.
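A sketch of the call (the SQL handle is a placeholder for the one shown in your DBA_SQL_PLAN_BASELINES output):

```sql
-- Display every plan in the plan history for the given SQL handle.
SELECT t.plan_table_output
FROM   TABLE(DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE(
         sql_handle => 'SYS_SQL_593bc74fca8e6738',
         format     => 'basic')) t;
```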

Two parameters allow you to control SPM. The first, optimizer_capture_sql_plan_baselines, which is FALSE by default, allows you to automatically capture plans. SPM will start managing every repeatable SQL statement that is executed and will create a plan history for it. The first plan that is captured will be automatically accepted. Subsequent plans for these statements will not be accepted until they are evolved.

The second parameter, optimizer_use_sql_plan_baselines, is TRUE by default. It allows the SPM aware optimizer to use the SQL plan baseline if available when compiling a SQL statement. If you set this parameter to FALSE, the SPM aware optimizer will be disabled and you will get the regular cost-based optimizer which will select the best plan based on estimated cost.
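As a sketch, both parameters can be changed at the session or system level:

```sql
-- Start capturing plan histories for repeatable statements in this session.
ALTER SESSION SET optimizer_capture_sql_plan_baselines = TRUE;

-- Disable, then re-enable, use of accepted plans system-wide.
ALTER SYSTEM SET optimizer_use_sql_plan_baselines = FALSE;
ALTER SYSTEM SET optimizer_use_sql_plan_baselines = TRUE;
```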

SPM and SQL profiles

A SQL statement can have both a SQL profile and a SQL plan baseline. Such a case was described in Part 3 where we evolved a SQL plan baseline by accepting a SQL profile. In this case, the SPM aware optimizer will use both the SQL profile and the SQL plan baseline. The SQL profile contains additional information that helps the optimizer to accurately cost each accepted plan and select the best one. The SPM aware optimizer may choose a different accepted plan when a SQL profile is present than when it is not.

SPM and stored outlines

It is possible for a SQL statement to have a stored outline as well as a SQL plan baseline. If a stored outline exists for a SQL statement and is enabled for use, then the optimizer will use it, ignoring the SQL plan baseline. In other words, the stored outline trumps a SQL plan baseline. If you are using stored outlines, you can test SPM by creating SQL plan baselines and disabling the stored outlines. If you are satisfied with SPM, you can either drop the stored outlines or leave them disabled. If SPM doesn't work for you (and we would love to know why), you can re-enable the stored outlines.

If you are using stored outlines, be aware of their limitations:

You can only have one stored outline at a time for a given SQL statement. This may be fine in some cases, but a single plan is not necessarily the best when the statement is executed under varying conditions (e.g., bind values).

The second limitation is related to the first. Stored outlines do not allow for evolution. That is, even if a better plan exists, the stored outline will continue to be used, potentially degrading your system's performance. To get the better plan, you have to manually drop the current stored outline and generate a new one.

If an access path (e.g., an index) used in a stored outline is dropped or otherwise becomes unusable, the partial stored outline will continue to be used with the potential of a much worse plan.

One question that readers have is what we plan to do with the stored outlines feature. Here is the official word in Chapter 20 of Oracle's Performance Tuning Guide:

Stored outlines will be desupported in a future release in favor of SQL plan management. In Oracle Database 11g Release 1 (11.1), stored outlines continue to function as in past releases. However, Oracle strongly recommends that you use SQL plan management for new applications. SQL plan management creates SQL plan baselines, which offer superior SQL performance and stability compared with stored outlines.

If you have existing stored outlines, consider migrating them to SQL plan baselines by using the LOAD_PLANS_FROM_CURSOR_CACHE or LOAD_PLANS_FROM_SQLSET procedure of the DBMS_SPM package. When the migration is complete, you should disable or remove the stored outlines.
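A sketch of such a migration using the cursor cache loader (the MODULE filter value is purely illustrative):

```sql
-- Load plans currently in the cursor cache for one application module
-- into SQL plan baselines; the function returns the number of plans loaded.
DECLARE
  n PLS_INTEGER;
BEGIN
  n := DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE(
         attribute_name  => 'MODULE',
         attribute_value => 'ORDER_ENTRY');
  DBMS_OUTPUT.PUT_LINE(n || ' plan(s) loaded');
END;
/
```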

SPM and adaptive cursor sharing

Adaptive cursor sharing (ACS) may generate multiple cursors for a given bind sensitive SQL statement if it is determined that a single plan is not optimal under all conditions. Each cursor is generated by forcing a hard parse of the statement. The optimizer will normally select the plan with the best cost upon each hard parse.

When you have a SQL plan baseline for a statement, the SPM aware optimizer will select the best accepted plan as the optimal plan. This also applies for the hard parse of a bind sensitive statement. There may be multiple accepted plans, each of which is optimal for different bind sets. With SPM and ACS enabled, the SPM aware optimizer will select the best plan for the current bind set.

Thus, if a hard parse occurs, the normal SPM plan selection algorithm is used regardless of whether a statement is bind sensitive.

Enterprise Manager

You can view SQL plan baselines and configure and manage most SPM tasks through the Enterprise Manager. The screenshots below show two of these tasks.

Setting init.ora parameters for SPM

Loading SQL plan baselines from cursor cache

Further Reading

More details about SPM are available in the Oracle documentation, especially Chapter 15 of the Performance Tuning Guide. There is also a whitepaper, and a paper published in the VLDB 2008 conference. The VLDB paper also has experimental results that show how SPM prevents performance regressions while simultaneously allowing better plans to be used.

In the first three parts of this article, we have seen how SQL plan baselines are created, used and evolved. In this final installment, we will show some user interfaces, describe the interaction of SPM with other features and answer some of your questions.

DBMS_SPM package

A new package, DBMS_SPM, allows you to manage plan histories. We have already seen in previous examples how you can use it to create and evolve SQL plan baselines. Other management functions include changing attributes (like enabled status and plan name) of plans or dropping plans. You need the ADMINISTER SQL MANAGEMENT OBJECT privilege to execute this package.

Viewing the plan history

Regardless of how a plan history is created, you can view details about the various plans in the view DBA_SQL_PLAN_BASELINES. At the end of Part 3 of this blog, we saw that the SQL statement had two accepted plans:

The SQL handle is a unique identifier for each SQL statement that you can use when managing your plan history using the DBMS_SPM package.

Creating an accepted plan by modifying the SQL text

Some of you may be manually tuning SQL statements by adding hints or otherwise modifying the SQL text. If you enable automatic capture of SQL plans and then execute this statement, you will be creating a SQL plan baseline for this modified statement. What you most likely want, however, is to add this plan to the plan history of the original SQL statement. Here's how you can do this using the above SQL statement as an example.

When the optimizer uses an accepted plan for a SQL statement, you can see it in the plan table (for explain) or V$SQL_PLAN (for shared cursors). Let's explain the SQL statement above and display its plan:

Note------ SQL plan baseline "SYS_SQL_PLAN_fcc170b0a62d0f4d" used for this statement

22 rows selected.

The note at the bottom tells you that the optimizer used an accepted plan.

A plan history might have multiple plans. You can see one of the accepted plans if the optimizer selects it for execution. But what if you want to display some or all of the other plans? You can do this using the display_sql_plan_baseline function in the DBMS_XPLAN package. Using the above example, here's how you can display the plan for all plans in the plan history.

Two parameters allow you to control SPM. The first, optimizer_capture_sql_plan_baselines, which is FALSE by default, allows you to automatically capture plans. SPM will start managing every repeatable SQL statement that is executed and will create a plan history for it. The first plan that is captured will beautomatically accepted. Subsequent plans for these statements will not be accepted until they are evolved.

The second parameter, optimizer_use_sql_plan_baselines, is TRUE by default. It allows the SPM aware optimizer to use the SQL plan baseline if available when compiling a SQL statement. If you set this parameter to FALSE, the SPM aware optimizer will be disabled and you will get the regular cost-based optimizer which will select the best plan based on estimated cost.

SPM and SQL profiles

A SQL statement can have both a SQL profile and a SQL plan baseline. Such a case was described in Part 3 where we evolved a SQL plan baseline by accepting a SQL profile. In this case, the SPM aware optimizer will use both the SQL profile and the SQL plan baseline. The SQL profile contains additional information that helps the optimizer to accurately cost each accepted plan and select the best one. The SPM aware optimizer may choose a different accepted plan when a SQL profile is present than when it is not.

SPM and stored outlines

It is possible for a SQL statement to have a stored outline as well as a SQL plan baseline. If a stored outline exists for a SQL statement and is enabled for use, then the optimizer will use it, ignoring the SQL plan baseline. In other words, the stored outline trumps a SQL plan baseline. If you are using stored outlines, you can test SPM by creating SQL plan baselines and disabling the stored outlines. If you are satisfied with SPM, you can either drop the stored outlines or leave them disabled. If SPM doesn't work for you (and we would love to know why), you can re-enable the stored outlines.
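Disabling stored outlines for such a test can be done at the session or system level, or per outline (the outline name below is a placeholder):

```sql
-- Stop using stored outlines for this session (or ALTER SYSTEM for all):
ALTER SESSION SET use_stored_outlines = FALSE;

-- Or disable a single outline by name:
ALTER OUTLINE my_outline DISABLE;

-- If you are satisfied with SPM, you can drop it outright:
DROP OUTLINE my_outline;
```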

If you are using stored outlines, be aware of their limitations:

You can only have one stored outline at a time for a given SQL statement. This may be fine in some cases, but a single plan is not necessarily the best when the statement is executed under varying conditions (e.g., bind values).

The second limitation is related to the first. Stored outlines do not allow for evolution. That is, even if a better plan exists, the stored outline will continue to be used, potentially degrading your system's performance. To get the better plan, you have to manually drop the current stored outline and generate a new one.

If an access path (e.g., an index) used in a stored outline is dropped or otherwise becomes unusable, the partial stored outline will continue to be used with the potential of a much worse plan.

One question that readers have is what we plan to do with the stored outlines feature. Here is the official word in Chapter 20 of Oracle's Performance Tuning Guide:

Stored outlines will be desupported in a future release in favor of SQL plan management. In Oracle Database 11g Release 1 (11.1), stored outlines continue to function as in past releases. However, Oracle strongly recommends that you use SQL plan management for new applications. SQL plan management creates SQL plan baselines, which offer superior SQL performance and stability compared with stored outlines.

If you have existing stored outlines, consider migrating them to SQL plan baselines by using the LOAD_PLANS_FROM_CURSOR_CACHE or LOAD_PLANS_FROM_SQLSET procedure of the DBMS_SPM package. When the migration is complete, you should disable or remove the stored outlines.
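As a sketch, loading plans from a SQL tuning set looks like this (the set name and owner are placeholders):

```sql
DECLARE
  cnt PLS_INTEGER;
BEGIN
  cnt := DBMS_SPM.LOAD_PLANS_FROM_SQLSET(
           sqlset_name  => 'my_sts',   -- placeholder tuning set name
           sqlset_owner => 'SYS');     -- placeholder owner
END;
/
```

Plans loaded this way are created as accepted plans, so the statements get SQL plan baselines immediately.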

SPM and adaptive cursor sharing

Adaptive cursor sharing (ACS) may generate multiple cursors for a given bind sensitive SQL statement if it is determined that a single plan is not optimal under all conditions. Each cursor is generated by forcing a hard parse of the statement. The optimizer will normally select the plan with the best cost upon each hard parse.

When you have a SQL plan baseline for a statement, the SPM aware optimizer will select the best accepted plan as the optimal plan. This also applies for the hard parse of a bind sensitive statement. There may be multiple accepted plans, each of which is optimal for different bind sets. With SPM and ACS enabled, the SPM aware optimizer will select the best plan for the current bind set.

Thus, if a hard parse occurs, the normal SPM plan selection algorithm is used regardless of whether a statement is bind sensitive.

Enterprise Manager

You can view SQL plan baselines and configure and manage most SPM tasks through the Enterprise Manager. The screenshots below show two of these tasks.

Setting init.ora parameters for SPM

Loading SQL plan baselines from cursor cache

Further Reading

More details about SPM are available in the Oracle documentation, especially Chapter 15 of the Performance Tuning Guide. There is also a whitepaper, and a paper published in the VLDB 2008 conference. The VLDB paper also has experimental results that show how SPM prevents performance regressions while simultaneously allowing better plans to be used.

By accident, I stumbled upon a JavaScript library for generating mind maps. But instead of a static diagram, this JavaScript-enabled one is dynamic: you can move nodes, and you can zoom in or out to another level of nodes. A very nice graphical way of presenting hierarchical data -- for example, who's managing whom in Oracle's EMP table, as shown below. Come on, grab your mouse and start moving and clicking, it's alive!

Here's the recipe to build such an application.

Start by downloading the source of the Google Code project js-mindmap. In the sources you will find several JavaScript libraries and an example file, js-mindmap.html.

In APEX, upload the JavaScript libraries and add references to them from the page's header section.

Add a PL/SQL region, which generates the hierarchy that will be converted into a diagram:

For example, the function show_emp_tree returns the HTML code for the <div> element.
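The post doesn't include the body of show_emp_tree, but a minimal sketch might look like the following. It assumes the standard EMP columns (EMPNO, MGR, ENAME) and that js-mindmap reads a nested <ul> list inside a <div>; the "mindmap" id is a placeholder for whatever element the library is wired to:

```sql
CREATE OR REPLACE FUNCTION show_emp_tree RETURN VARCHAR2 IS
  l_html VARCHAR2(32767);
  l_prev PLS_INTEGER := 0;
BEGIN
  FOR r IN (SELECT ename, LEVEL lvl
              FROM emp
             START WITH mgr IS NULL
           CONNECT BY PRIOR empno = mgr) LOOP
    IF r.lvl > l_prev THEN
      l_html := l_html || '<ul>';            -- one level deeper
    ELSE
      l_html := l_html || '</li>';           -- close the previous sibling
      FOR i IN 1 .. l_prev - r.lvl LOOP
        l_html := l_html || '</ul></li>';    -- climb back up the tree
      END LOOP;
    END IF;
    l_html := l_html || '<li>' || r.ename;
    l_prev := r.lvl;
  END LOOP;
  -- close whatever is still open
  l_html := l_html || '</li>';
  FOR i IN 1 .. l_prev - 1 LOOP
    l_html := l_html || '</ul></li>';
  END LOOP;
  RETURN '<div id="mindmap">' || l_html || '</ul></div>';
END;
/
```

The PL/SQL region source would then be as simple as `htp.p(show_emp_tree);`.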

Note that when you replace the EMP table with the OEHR_EMPLOYEES table, the animation becomes very slow due to the increased number of nodes and links. Also, the js-mindmap project is still in development. But anyway, it's fun!

With all this talk of SQLstream's recent v2.0 launch, I was interested to read Tim Bass's CEP blog posting on Analytics-as-a-Service. He calls it A3S - and rightly avoids calling it AaaS; apart from the fact that X-as-a-Service is almost as clichéd as XYZ 2.0 (and equally meaningless), careless use of that sequence of As and Ss could cause spam filters round the world to get over-excited.

If we must have the as-a-service tag, I'd like to trademark BI-as-a-service: BIAS. Apart from being a proper acronym, it also gets across that BI often gives you the answers you think you want - not necessarily the ones you really need.

No, it does not. If it did, then your customer/employer
would be
doing something no other human being has ever done, which
is unlikely in the extreme. The application may be
unique in its particulars, but it is almost certainly
extremely common in its patterns. This week we will see
how "unique" needs are in fact nothing but common ordinary
development projects.

Beginning With the Conclusion

I have had this conversation with many programmers over
the past few years, and it always follows the same
patterns. The easy part of the argument is showing the
programmer that what he thinks is special or unique
is in fact common. The much harder part, because it
involves the delicate human ego, is showing the programmer
that he has not seen this because he is ignorant. This
is not fun to do and I myself usually skip it, it's
usually not worth the trouble.

Path 1: Details

Occasionally I speak to a programmer who thinks he has
a unique situation. His claim begins with
the mountain of details he must handle, details which appear
to be contradictory, subtle, and overall perplexing. He
wonders if some new approach is required to handle them.

In answering this claim, we begin with the easy part,
showing that the situation is itself not unique. In short,
all worthwhile projects involve mountains of detail, so
there is nothing special there. When it comes to the
subtleties and the maze of exceptions and special cases,
these are common in mature businesses that have evolved
this complexity in response to business needs over the years.
So again there is nothing unique here; the programmer's
situation is a common one.

At this point we have to ask how the programmer will deal
with this perplexing mountain of detail. If he knows
what he is doing, he will give the general answer that he
is going to break it down as much as possible into
independent smaller problems that can be solved on their
own. Since this is nothing more than how all programmers
solve complex problems, the entire "uniqueness" claim
has completely collapsed. His project is utterly common.

The much harder part of the conversation comes if the
programmer does not know how to break down
the problem. For instance, if the problem is all about
a fiendishly complex pricing system with lots of discounts
and pricing levels, and the programmer does not know that
he needs to begin with the database, and he further does not
want to hear that, well, there is not much I can do for
him. He will end up working a lot harder than he needs
to, and will probably remain convinced he is dealing with
something "unique".

But let's go a little deeper into that example of the
complicated pricing system. Why do I claim that he must
start with the tables, and that he is wasting time
if he does not? Well, a complete answer is much more than
will fit here, and in fact I hit that theme over and over
in these essays, but it comes down to:

He must have an accurate and precise description of
the details that govern the pricing scheme. That is
what tables are for.

In working out the mechanics of the tables, particularly
their primary and foreign keys, he will come to
his most complete understanding of the mechanisms
involved.

When the tables completely reflect the details he
must work with, the code will just about write itself.

Lastly, but probably most importantly, the customer
will expect to control the pricing system by adjusting
the parameters at all levels. Again, that is what tables
are for. The user is in control of the pricing system
if he can edit the tables (because of course he cannot
edit the code).
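As a purely hypothetical sketch of that approach (all table and column names invented for illustration), the discounts and price levels become rows the customer can edit, and the keys enforce the rules:

```sql
-- Each pricing level (retail, wholesale, ...) is a row, not a code branch.
CREATE TABLE price_levels (
  level_code  VARCHAR2(10) PRIMARY KEY,
  description VARCHAR2(60) NOT NULL
);

CREATE TABLE items (
  item_id    NUMBER PRIMARY KEY,
  base_price NUMBER(12,2) NOT NULL
);

-- The composite primary key guarantees at most one price
-- per item per level; the foreign keys keep the details honest.
CREATE TABLE item_prices (
  item_id    NUMBER       REFERENCES items,
  level_code VARCHAR2(10) REFERENCES price_levels,
  price      NUMBER(12,2) NOT NULL,
  PRIMARY KEY (item_id, level_code)
);

-- Quantity discounts are data the user controls by editing rows.
CREATE TABLE discounts (
  item_id NUMBER REFERENCES items,
  min_qty NUMBER NOT NULL,
  pct_off NUMBER(5,2) NOT NULL CHECK (pct_off BETWEEN 0 AND 100),
  PRIMARY KEY (item_id, min_qty)
);
```

Working out keys like these is exactly the exercise that forces the "complete understanding of the mechanisms" described above.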

Path 2: Combinations

Once upon a time we had simple desktop business applications,
games, and then this weird new thing, "the web". Now they
are all mixed together, as we play games on the internet that
are tied into huge databases. Modern applications often
combine technologies that used to be comfortably separate.
On any particular project,
some of the requirements look like they
can be met with an RDBMS, some require management and
delivery of media such as MP3 or video, and he is told as
well he must provide RSS feeds and import data coming in
XML format. Perhaps as well there will be stone tablets
and papyrus scrolls.

This programmer may believe he is in a unique situation
because of this combination of needs. Because no single
toolset out there can cover the entire project, perhaps this
is something never before seen? But this does
not hold up. Just like the argument about complexity,
he must break the problem up correctly, and when he has done
so he will have a perfectly ordinary project. Though I might
add it will also be a very interesting project and
probably a lot of fun.

In The End It Is All About Patterns

I have given two examples above taken from my own experience
where programmers have claimed to me that they faced some
unique situation. There are many other cases, and they always
make perfect sense to the person who thinks he has discovered
something new. The biggest flaw in the programmer's thinking
is failing to distinguish between particulars and
patterns.

My claim in this essay is that the patterns of all problems
are the same. Somebody has seen it before, somebody has done
it before, the answer is out there. The process of analysis
and programming is about slotting your particulars into the
patterns that have already been established.

In the broadest sense all programs process data, and
particular programs break down into broad patterns of data
access and manipulation. Sometimes you have a broad range
of users putting in data with very little massaging
(think twitter) and sometimes you have one group controlling
much of the data while others make use of it (think
Amazon), and sometimes your data is mostly relational
and table based (think any ecommerce or biz app) and
sometimes it's mostly media (think youtube).

Once you have these broad patterns identified, you can then
proceed to make use of established practices within
each particular area. What is the best way to provide
sensitive data on the web and protect it from unauthorized
eyes? Somebody has done it before. What is the best way
to track large amounts of media? Somebody has done it
before. What is the best way to set up a complex pricing
system with lots of discounts and pricing levels? Somebody
has done it before. In all cases, your particulars may
be different, but the patterns will be the same.

Conclusion: Find the Patterns

Whenever I find myself looking at a situation that appears
to be new, I try to tell myself that it may be new to me,
but it is not likely to be new to the human race. If it
does not appear to follow a well-known pattern then I
proceed as if I have not yet recognized the pattern
and continue to analyze and break it apart until the pattern
emerges. So far it always has.