Friday Philosophy – Lead or Lag (When to Upgrade)? January 20, 2012

I was involved in a discussion recently with Debra Lilley about which version of Oracle to use. You can see her blog about it here (and she would love any further feedback from others). Oracle now has a policy that it will release the quarterly PSUs for a given point release for only 12 months once that point release is superseded, i.e. once 11.2.0.3 came out, Oracle will only guarantee to provide PSUs for 11.2.0.2 for 12 months. See “My Oracle Support” note ID 742060.1. However, an older terminal release such as 11.1.0.7 is not superseded and is supported until 2015 – and will get the quarterly PSU updates. This left the customer with an issue. Should they start doing their development on the latest and theoretically greatest version of Oracle and be forced to do a point upgrade “soon” to keep getting the PSUs, or use an older version of Oracle and avoid the need to upgrade?
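As an aside on the numbering itself: the point release is the fourth digit group (the difference between 11.2.0.2 and 11.2.0.3), and that is what the PSU policy keys off. A minimal shell sketch of pulling a version apart, using an illustrative banner string (in a real check the banner would come from `v$version` or `opatch lsinventory`, not a hard-coded variable):

```shell
#!/bin/sh
# Illustrative banner; in practice this would be fetched from the database.
banner="Oracle Database 11g Release 11.2.0.2.0 - 64bit Production"

# Extract the four-part version number (e.g. 11.2.0.2) from the banner.
version=$(echo "$banner" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')

base=${version%.*}     # strip the last component -> 11.2.0
point=${version##*.}   # keep only the last component -> 2

echo "running $version (base release $base, point release $point)"
```

Comparing `point` against the latest available point release for `base` tells you whether you are on a superseded release and therefore inside the 12-month PSU window.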

This is in many ways a special case of the perennial issue: should you use the latest version of Oracle (or in fact any complex software solution) or go with the version you know and trust? Plus, should you patch up to the latest version, which in theory gives you protection against bugs and vulnerabilities (along with the CPUs)? Yes, they are two separate issues, but people tend to sit on the same side of both points, for the same reasons.

The arguments to stay using an older version are that it is working, it is stable, you do not need the new features and upgrading is a lot of work and effort. Plus the new version will have new bugs that come along with the new features you do not need and things might be turned on by default that you could do without (like stats collecting or not creating the actual segments when a new table or partition is created). If you remain on your favourite version long enough, you get another issue which is that the latest version of Oracle might not be compatible with your ancient version of the OS or another package or programming language critical to your system (I got caught in a terrible web with old perl, old O/S and old DB that resulted in a need to upgrade all three together – ouch!).

The arguments for moving forward are that you get access to the latest features, that, overall, older features will have more bugs fixed in newer versions, and that performance will be better {again, overall, exceptions allowing}. Also, if you do hit bugs and problems there are no issues in having to first upgrade to a fully supported version. Plus, fixes are made for current versions first and then back-ported to older ones. Those back-ported fixes can cause real problems when you DO decide to upgrade.

The big sticking points are the effort involved in upgrading and living with the bugs that you find that Oracle Testing didn’t.

I’ve got a few other considerations to throw into the pot.

Firstly, if you are developing something new, it is not a lot more effort to use the latest version. This allows you to learn the new version and eases the transition of older systems to it.

Secondly, Oracle like you if you use the latest version, especially if it is the latest-latest version or even beta. Yeah, the helpdesk will not have a clue about some of your issues but in my experience you get access to those really smart guys and gals in Oracle who do the third-line support or even the development work.

Thirdly, if you are on the latest version and you do decide to freeze on that version for a while, for stability and a quiet life, you have a lot longer before your version (at least at the major level) drops out of support.

Fourthly, dynamic, inquisitive, flexible staff like new things. In my experience, environments that freeze on an old version have a higher percentage of staff who either like it dull and repetitive, or hate it being dull and repetitive – and itch to get out. If I’m in charge, I know which type of staff I like to have more of {NB there are some very good arguments for having some staff who like it dull and repetitive}.

As you can guess, I am in the “be on the latest version” side of the argument. I was ambivalent about it until a few years ago when I noticed a trend:

Sites that like to move forward tend to (a) do it in a controlled manner and (b) have the infrastructure to do proper regression testing.
Sites that like to stay still lack the ability to do regression testing and move forward only when forced – and in a pressured, unplanned and frankly chaotic manner.

That was it, that was the real key thing for me. The further you lag behind, the more likely you are to eventually be forced to upgrade – and it won’t be a nice time doing it. I know, there are exceptions, systems still running Oracle 6 absolutely fine on an old DOS 6.1 box. In the same way you also get the odd 95-year-old life-long smoker – and thousands of 45-year-old smokers with emphysema.

When I have any sway over the situation I now always strive to be on modern versions of Oracle {OS, language, whatever} and to patch small and regular. To support all this, have very good regression testing. I’ve only a couple of times been able to get the regression testing sorted out as well as I would like, but when you do the pain of patching and upgrading, as well as developing and integrating, is so much reduced that not patching seems madness.

So to sum up:

If it is a new development, go for the very latest version, play with the latest features if potentially beneficial and see if you can get Oracle to be interested in your attempts. i.e. (B)lead.

If you have good regression testing, plan and carry out patch and version upgrades as they become available and stay current. i.e. Lead.

If you have a complex solution in place and no/poor regression testing, do not move to a new major release; leave it a while for the worst new bugs to be found and fixed, then move. i.e. Lag.

If your system is old AND critical and all the guys and gals who implemented it are long gone, stay on that version for ever. i.e. Stagnate.

Oh, and if that last one applies to many of your systems – dust off the CV and start reading technical manuals. One day you will need a new job in a hurry.


Hmmmm…. This is still looking at the world through coloured glasses that the database is the driver.
It isn’t and hasn’t been for ages.
What determines if the database software can be upgraded/used at latest version is the application that is using it.
If that app is not compatible with the latest db then, to put it simply, any attempt to drive an upgrade of the db is going to hit the cost of testing/re-training barrier. Guaranteed.
And as soon as cost comes into it, guess what happens?
Ah yes, of course, I forgot: it’s the “costly 1.0 dba” that is the problem…

Hmmm, You seem to be looking at the world through coloured glasses. Without the datastore, your app is nothing – always has been, will be for ages more :-)

It’s all software, Noons – in this case layers in the end result. Apps are usually not supported on later versions of the DB (be it Oracle, SQL*Server, DB2) as they have not been regression tested by whoever built them, be that you or a 3rd party.

It’s a very valid point though that you can’t upgrade the DB if your 3rd party vendor supplier can’t support it yet. And if you have hit a bug that is only fixed in the next version of the DB….

Obviously we live in different worlds. Here, it’s very simple: no db is created/installed/upgraded by itself. Those are processes *always* driven by an application. No app upgrade, no touchy the db.
The notion that database software upgrades are the driver is gone. Databases-first went out in the 90s.
I’m not saying I like it, but I’ve given up rowing against the tide a long time ago.

Fair enough Noons, but just because in your world everything is there to support the app does not move you away from the lag-or-lead situation, surely? When do you migrate your app? At what point do you choose to move to the latest version of your app? When the support for that version is about to run out, or when the latest version of the app is released by the vendor? Do you apply patches they supply (though many app vendors do not seem to release much in the way of patches; it can be a case of “live with it”) or stay on a version until forced to move forward?

It’s far more complex when the app sits on a DB and uses a mid tier and relies on another language etc, but at some point upgrades have to occur and, at that juncture, regression testing is a boon. In one case I had early in my career, the regression test was a fake hospital with fake patients and people pushed data through it every day. It would not be so easy now due to data protection, but all my friends were in that hospital and they had a lot of nasty things wrong with them :-)

Well, that’s the whole shebang right there, isn’t it? When do you touch/upgrade/regress-test an app. Because like it or not, that is the driver. Rarely if ever the db.

Where I work it’s common for apps to not be touched for years, after being commissioned. Not because anyone wants to block it. Simply because the business sees no need for an upgrade, is perfectly happy with things as they are and won’t cost-justify the upgrade and the ancillary regression testing outlays.

No touchy the app, ergo no touchy the db.

Our JDE is running on a version >10 years old and has got no chance of ever being upgraded while it keeps doing everything the business needs. You then tell me I must upgrade? No way it’s gonna happen, no matter what I say or do.
Note: I don’t handle JDE here – it’s on AS400/DB2 and I don’t touch that. Just using it as an example.

It’s all about what the business wants/needs, not about what IT wants.

I – and many other dbas – have to live with the fact that we have to pump data into and out of these antediluvian releases/environments (JDE, Lotus, MSSQL, etc) and it ain’t easy when Oracle make it hard for their products and releases to talk between them, let alone to other makers’.

And believe me, Oracle’s “dba1.0” utter nonsense is totally mis-directed, self-hurting and not helpful at all:
it’s NOT the dba, it’s never been so.
It’s the business and the management!

I think this brings us back to the point I started with. Do you lead or lag? IT people tend to want to lead, but the business does not see the benefit as “it works and upgrading equals effort equals cost – so leave the damn thing alone.”

Which is fine all the way up to the point where an upgrade is forced. At which point it is a massively complex task, probably upgrading several components at once and, because it is a forced situation, it’s done under incredible pressure and with a higher likelihood of failure.

At which point the business gets all wise in retrospect and demands to know why IT did not (a) just cope with this and (b) avoid the situation in the first place.

Pondering more on the whole topic, it might actually be in the best interest of the business to implement a solution, live with it “frozen” as long as possible and in year x when the situation becomes untenable, just replace the whole system. No upgrades, no patches, just an x-yearly replacement.

My opinion? New software versions come too fast, and support for older products is stopped too early. There are still tons of 9i and 10g installations running stable without any problem. Why would you spend lots of money, time and resources upgrading these systems to a higher version, when the end result is just the same? Why not add new features to “old” software? Virtualisation makes it possible to make these “old” legacy systems almost hardware-independent, so there’s also no need anymore to upgrade them when the hardware is being replaced. Same reasoning for consumer software: I really don’t need a new version of MS Windows or Office every 2 years!

I agree with you, it is a pain to have to keep moving forward, but it seems to me that commercial software companies have to keep (a) selling stuff and (b) making potential new customers believe that their product is better – and being “constantly improving” is part of that, as well as matching any feature that another competing product has.
Opensource stuff can actually move even faster as people try and put their new bell or whistle into it.

By doing a small upgrade to the terminal release of an older version you can get support for longer, but as Noons says, if the application vendor does not support the terminal release you could be trapped in a catch-22.

Thinking back to the early 90’s, word-processing and spreadsheet software did everything I needed back then, so why is it that if I tried to run the modern equivalents on a machine with a spec from 2000, it would grind to a halt! Frustrating indeed.

Martin, Whilst I agree with the general slant of your article, i.e. from a purist best practice point-of-view; I think it is not practical for large complex enterprises which especially have robust and meticulous regression processes and systems in place, to constantly be upgrading ALL their databases every 12 months. These are, as per your statement, sites that “tend to (a) do it (patch upgrades) in a controlled manner and (b) have the infrastructure to do proper regression testing”

If they have a farm-type simplified and standardised architecture with few variables and their applications are not that disparate, then the regression testing of their entire application stack before rolling in the upgrades for hundreds or even thousands of databases might still be achievable. Though, I’d still question the cost-benefit to the business in following such a strategy.

However, for those large enterprises that have complex and disparate business systems in their hundreds let alone thousands, implementing the change right across the estate could put them in an ever-upgrade mode. Obviously, these are the “Lag” Customers I’m on about but with one crucial difference from that described in your post – these have very robust regression testing regimes which alone make the agility required for a yearly upgrade impossible due to the scale. Or, one could say they are a mixture of Lag and Lead (closely follow the latest stable release for their estate after other bleading edge customers worldwide have “taken the hit”). Mind you, these are businesses where you and I as end-users would want the risk-averse, secure, stable and robust regression “Lag” stance rather than be anywhere close to “Bleading” edge.

Thank you for the above comment. I agree with you, when you have a very large Oracle estate then the effort of maintaining it all is huge. Especially when you have a wide variety of applications to maintain.

When things get “big”, be they a single massive VLDB, a huge number of DBs or a mixture of both, then as you point out you have issues with sheer volume. Within big and especially with geographically disparate IT departments, you have further issues in the coordination and communication needed for both the regression testing and the migration itself. The cost could well be higher than the benefit and so you are forced to lag even if in theory you are storing up problems for the future.

I used to be the chair for the Management and Infrastructure Special Interest Group of the UKOUG – trying to handle huge estates was a major part of what we discussed. We had some good presentations about managing lots of cookie-cutter systems (where they are all very similar) but never got much of a handle on your third scenario. It’s just very hard work.