I worked at a location that was running a single, world-wide instance of SAP. The whole of their HR, Finance, and some of their manufacturing was running from this single instance, twenty-four hours a day, seven days a week.
Naturally, the process analyst in me (and to some extent, the pragmatist) had a number of questions about this:

- “How do you back-up?” - “We mirror and back-up the mirror”

- “How do you do maintenance?” - “We have a back-up machine at a separate location across town. We transfer the users to that and perform the maintenance then”

The questions went on and on. Each one had a suitable reply. But then I started thinking about this single instance and it didn’t seem to sit right with me. For some reason I could foresee problems with it. Finally, I was able to verbalise the issues I was having: “How do you deal with disaster recovery?”

Most of my readers will know what disaster recovery is, but for those that do not, a quick explanation. When something happens to a computer system that renders it unusable, a business continuity plan (BCP) needs to kick in. This is where processes are enacted that allow the business to continue operating without the damaged/unusable machine. Running in parallel with this is a recovery process that attempts to repair or replace the damaged/unusable machine and reinstate all the data and transactions that were damaged or missing as a result of the disaster. I had concerns that because this was a single instance, and because it was being used globally, BCP and disaster recovery would be a nightmare.

The project team had already thought about this.

They told me that for BCP purposes, every affiliate running the software had processes on-site that would kick in if the system was unavailable. Even in the worst-case scenario, the system would not be offline for more than 48 hours, because they had an agreement with a third-party provider for a hot-site at a remote location which would be brought up at the first sign the redundant back-up didn’t work.

I asked if this had been tested. It hadn’t. But I was assured that it had been implemented in other organisations within 48 hours.

I turned my attention to the redundant back-up. What happens if there is a fire that destroys the main machine? The redundant back-up would kick in, and all systems would be re-routed. How is the rerouting done? Through underground cables that link the two sites. What happens if someone cuts through the underground cables? There are redundant underground cables that route North and South of the city between the two sites. The chances of both being cut are infinitesimally small. What happens if there is a power spike that takes out both systems? Both systems run off separate power supplies. A power spike would only take out one, not both. What happens if a small nuclear device takes out the town, both sites and the power sources? We would have bigger things to worry about than the system, but in that case the hot-site would kick into action.

I grilled the technical operators and designers of this system for almost two days, coming back to them time and time again when a new scenario occurred to me. They had a suitable answer for every single one of my questions. So, at the end of it - despite my reservations - I had nothing concrete to justify them.

Then, a couple of weeks after I left the site they ran a new interface program. It was an HR population interface that filled in specific records on an HR master file. The interface had not been tested properly and it went into a loop. The program ran all day and most of the night. It kept populating the file overnight, making this file larger and larger and larger. It made it so large, in fact, that it completely filled all the disk space on the system.
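In hindsight, even a crude guard in the interface would have contained the damage. Here is a minimal sketch of the idea in Python (the function name, parameters and thresholds are all invented for illustration - the real interface ran inside SAP, not Python):

```python
import shutil

def populate_hr_master(source_records, write_record, path="/data",
                       max_records=1_000_000, min_free_bytes=5 * 1024**3):
    """Copy records into the HR master file, refusing to run away.

    `source_records` is an iterable of records and `write_record`
    persists one of them; both are stand-ins for whatever the real
    interface actually did. The two guards are the point: a hard
    ceiling on records written per run, and a floor on free disk space.
    """
    written = 0
    for record in source_records:
        if written >= max_records:
            raise RuntimeError(
                f"aborting after {written} records - no legitimate run "
                "should write this many")
        if shutil.disk_usage(path).free < min_free_bytes:
            raise RuntimeError("aborting - free disk space critically low")
        write_record(record)
        written += 1
    return written
```

Either check alone would have stopped the loop long before it filled the disk - and, crucially, the same guard would have protected the back-up machine when the job restarted there.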

The main computer, and global SAP instance, shut down.

As designed, it failed over to the redundant back-up across town. The system was on-line, ready to go as expected. It dutifully continued running all the processes it should be running - including the HR population interface. This ran for the rest of the night until it, too, had filled up the disk space on the redundant back-up.

That machine failed.

Plan C was, of course, the hot-site located in another town. A town which was, in fact, almost 500 miles away from where the two main sites had been. The hot-site didn’t work. It took almost 72 hours to get up and on-line, and even then all the information needed to make the system run globally wasn’t there. The comms lines to the global sites were not connected. Anything that could go wrong, did go wrong.

The rest of the story gets a little fuzzy. All I know is that the global SAP system did not come back on line in the host town for almost six weeks. Retrofitting all the missed transactions from within the BCP took another few months.

Nobody lost their job over this. The hot-site provider lost their contract, I believe.
I sat there shaking my head.

But that wasn’t the worst of it. The worst part came when I spoke to people who were affected by the outage. I asked them what their BCP was for when they had no access to the system. (Remember, I had been told by the tech crew that each affiliate had processes for dealing with on-going business when the system was down.) They told me: “We wait.” The BCP for operating a multi-billion dollar global business across numerous affiliates, sites and countries was “Wait”. Wait until the redundant back-up kicks in and, if this doesn’t work, wait 48 hours until the hot-site is ready to run. I wondered how they dealt with a six-week backlog of waiting. Nobody was able to tell me. But I’m reasonably sure it involved manual workarounds.

The moral of the story?

Your redundant back-up isn’t. There is always a scenario where you will find yourself without it. Plan for that scenario. Make sure that you have processes in place to deal with it - even if the chances of it happening are remote. Remember that the Twin Towers of the World Trade Centre were designed to withstand the impact of commercial airliners when they were built. The chances of an airliner destroying more than a few floors were negligible. But the impact of the jets didn’t bring down the towers. It was the thousands of gallons of jet fuel burning away at vital support beams that did it. Once a couple of these had weakened, gravity did the rest.

Several years ago I was involved in a global project to implement an ERP system in a multinational pharmaceutical company.

This project was huge. It was the biggest thing the company had ever done internally (and this is a company that had developed some of the most well-known drugs ever made). At one point we had so many external contractors working on it that we were spending over $100,000 per week just on them. It was unsustainable (this was ten or fifteen years ago, too).

All we were doing was replacing financial systems worldwide along with HR systems.

But here's the kicker: nobody outside the project really cared.

The people who were working in the labs discovering drugs didn't care. The salesmen out on the beat trying to persuade doctors and pharmacists to prescribe the drugs didn't care. The manufacturing folks physically making the end product didn't care.

More importantly, the customer didn't care.

The IT component of the business cared, as did the CFO who was convinced he would be able to get more accurate data about the state of the company quicker than before. HR cared because they would be able to see more accurate information about the number of people employed worldwide, and in what capacity. But in reality this was never going to happen. The financial people were always going to keep their own sets of books locally and only report the figures they had approved to the CFO, rather than the actual day-to-day figures. Excel was the main way of manipulating this data and the system allowed Excel to be used in just this way.

So this hugely expensive project (which had dodgy cost-benefit justification in the first place) was being run purely to give the CFO and head of HR more accurate data - and it wasn't even doing that appropriately. None of the key departments were affected by this project (and the associated process change that came from it), and the customers were blissfully unaware that they were paying a huge amount of money for their pharmaceuticals so that a lot of it could be hived off to pay for this extremely expensive internal project.

So why do it?

Somebody obviously thought there was a cost benefit to doing this project. 'Better information flow' was bandied about as one justification, as was 'Single Worldwide System'. In my capacity as an auditor I got to see some of the documents that were not widely available to others on the project and can tell you that some of the justification and cost-benefit was tenuous to say the least.

But it does bring up the whole question of why we implement ERP systems in the first place. Sure, there are companies with disparate and widespread systems that would benefit from some sort of unification across their enterprise. But, in my opinion, the benefit comes not necessarily from the software itself, but from the unification of the underlying processes and settings. Merging a couple of companies together always benefits from implementing a common chart of accounts and common processes to underlie them, but the implementation of a common (usually expensive) system is not always required.

I was sitting over the weekend watching the guys practice their cricketing skills on the local cricket green. (For those of you who want an in-depth explanation of cricket, I would recommend this; or, for a more in-depth and less humorous look, this.)

The exercise was simple. A batter would lob a ball really high into the air and the fielder would have to catch the high ball, immediately throw it back accurately towards the batter who would knock it along the ground. The fielder would then catch the low ball and return it to the batter who would lob another high ball for the next fielder in the line.

Simple, straightforward and practical.

The problem came when one of the fielders didn't catch the ball. It threw off the rhythm of the practice and the guys would have to regroup and start again.

I got to thinking how this applies to process.

At first glance it’s a simple process with a number of moving parts (or steps). The process is designed so that each step can be executed in a sequence and the output of one step is used as the input of another. The problem comes when the output of one step is not what is expected by the following step. This is the case with the cricket practice. The batter was expecting a ball to come at him from the previous fielder so he could launch that to the next participant in the process. When the ball didn’t come back (because the fielder dropped it) it made the process grind to a halt.

In reality the solution is really simple: have a second ball ready to throw into the process when the previous step fails to deliver. But this is what a lot of process definitions fail to take into account. They are designed to run with an optimum process flow (i.e. they assume that the output from previous steps is valid). Designing some error handling into a process up front is always easier than trying to fix a process when it doesn’t work as designed.
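In code terms, the second ball is a fallback input that keeps the sequence moving when a step fails to deliver. A minimal sketch (the pipeline and step functions are invented for illustration):

```python
def run_pipeline(steps, initial, fallback):
    """Run steps in sequence, feeding each step's output to the next.

    If a step raises (the fielder drops the ball), substitute the
    fallback value (the second ball) and carry on, rather than letting
    the whole practice grind to a halt.
    """
    value = initial
    for step in steps:
        try:
            value = step(value)
        except Exception:
            value = fallback  # throw the second ball into the process
    return value

# The happy path: each step's output is the next step's input.
steps = [lambda b: b + " lobbed",
         lambda b: b + " caught",
         lambda b: b + " returned"]
```

The error handling here is deliberately simple; in a real process definition the fallback might be a default document, a retry, or an escalation step. The principle is the same: decide up front what happens when a step fails to deliver.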

My last post asked Where is the BPM Market going? and opened the way for discussion (mostly on Twitter) about the changing state of the BPM marketplace.

Thanks to Craig, The Process Ninja, we can now look at the latest analysts offerings from Gartner and Forrester in the form of the Forrester Wave for BPM suites and the Gartner Magic Quadrant for Intelligent Business Process Management Software.

Wait! Hold up now. "Intelligent Business Process Management Software", you say? What the hell is that?

Gartner has made changes to the classification of BPM solutions by redefining the marketplace. They are referring to this as "an evolution of the BPMS market" that is "centered on a new IBO use case". IBO in this case means Intelligent Business Operations. It is the same thing they did several years ago when the Magic Quadrant for Pure-Play BPM morphed into the Magic Quadrant for Business Process Management Suites.

In other words, they've moved the goalposts!

Further analysis of the underlying reason for this reveals that Gartner feels that iBPMS represents a maturation of the capability and is used typically at higher levels of BPM maturity.

But the problem is they are pushing the same products as they had in the previous Magic Quadrant for BPMS - which they say cannot be compared with this iBPMS Magic Quadrant.

My recollection of the earlier Magic Quadrant showed a group of vendors in the top right sector of the diagram with a number of other vendors trailing down a diagonal to the bottom left. The new Magic Quadrant shows a wider spread of vendors, many of whom can be found in the lower right quadrant. In Gartner speak this indicates that they are visionaries, but lack the ability to execute on their vision.

So what am I to do if I am a Gartner customer looking to identify which vendor I can pursue to fulfill my needs?

I might previously have gone for a Metastorm offering (for example) as they were highly regarded and in the top right quadrant. They have since been purchased by Open Text and now languish in the lower left quadrant as being niche players with an incomplete vision and a lack of ability to execute. Does this mean the purchase has been a failure? Not at all. But the goalposts have moved.

I worked for American multinationals who would only look at vendors who landed in the top right quadrant of the Gartner grid. As of the current offering, this would reduce the market to three vendors. In many multinationals that's not even enough to put out an Invitation to Tender, as a minimum of four vendors is needed.

But is that a problem?

Well it might be if you are one of the vendors who was in a more elevated position and now find yourself in a less elevated position, but if we look at the Forrester Wave report for BPM suites we find some sobering statistics. In a survey of 520 IT decision makers in Q4 2012, when asked "What are your firm's plans to adopt BPM tools", fully 43% said they were not interested or had no plans and only 27% said they were planning to implement.

When the same group were asked "What are your firm's plans to use Software as a Service (SaaS) to complement or replace your BPM software?", 52% answered that they did not know or had no plans to use SaaS, and only 33% had plans to do so within 2 years.

In a market where Forrester have identified 52 different vendors competing in the broader BPM market, where Gartner have redefined the goalposts about what a good vendor is, and where half the IT decision makers are not looking at using BPM, I sense a serious disconnect.

Who are the 52 vendors marketing themselves at? Can the market sustain this onslaught?

I remember back in the deep, dark mists of time (about ten years ago, actually), the BPM market used to have several players in it. Gartner's Magic Quadrant had a diverse number of players in each of the quadrants, and it was easy to look at and understand the fragmentation. Things were called 'BPM' and everyone knew where they stood with it - although, in reality, very few people could adequately define 'BPM' as a concept.

More recently, though, the market has started to amalgamate. Major companies were purchased by competitors and their products merged together (Metastorm and Provision is one example). The fragmentation of the market decreased suddenly. The Magic Quadrant (and Forrester's Wave) had fewer parts to it. Things looked good for the BPM vendors, but not necessarily good for the market.

People like Gartner then started to split their BPM Magic Quadrant up into different areas. We got BPMS, ACM and the like. Different companies were invited in to join, and, pretty soon, the market seemed to be just as wide-ranging as before.

But is it really? Or have we just moved the goalposts?

Is this a classic reorganisation the likes of which we experience in companies at regular intervals? Movement for the sake of movement.

Is it a way for some of the consulting companies and business integrators to muddy the waters for customers and justify large consulting fees?

Or is the market genuinely in the throes of some major increase in the number of vendors working in a particular niche? Are we on the cusp of an explosion of products that will help customers conquer the BPM beast?

I'm not sure I know the answer myself, but I suspect a number of my readers will have opinions on this. Feel free to share in the comments below.

Not far from where I live, there is a DIY store on an out-of-town trading estate. It is surrounded on all sides by dual carriageway roads and the only real access to it is by vehicular transport. However, on the opposite side of this dual carriageway is a housing estate. I was waiting at the nearby traffic lights yesterday and noticed that, from the fence surrounding the estate, there appeared to be the beginnings of a pathway worn by pedestrians across the central grass reservation of the dual carriageway and into a hole in the fence surrounding the DIY store. As I watched, I saw at least three people take the route from the estate, across the dual carriageway, into the store.

It struck me as being a prime example of people finding a way of doing things that wasn't originally anticipated in the design of the thing.

When they first built the store and surrounded it by roads, nobody imagined that people would actually want to walk to the store. But people found a way. What's more they found the path of least resistance to achieve their goals.

The same happens in processes. You can design a process in whatever way you want, but people (users) will always find the easiest way to achieve their goal, and it may not be by using the process in the way you anticipated. More often than not, this is the way the process should have been designed in the first place.

One thing I have learned since leaving the comfortable role I held in a huge American multinational and starting my own business is that large companies do not have a clue about how to run a responsive organisation.

They are slow (and very resistant) to change, unresponsive and laboured. Their processes are usually large and complicated, and this leaves them unable to make the quick changes that are often required.

The governance process I worked with within the multinational involved tortuous meetings with nineteen interested parties where prospective change agents would plead their case for why their particular affiliate/department/competency was different and required something to be changed. This would all then be discussed, moderated and voted upon. More often than not the stupid, nonsensical ideas would be passed, whereas the ones that would genuinely make a difference to the business were blocked. Sure, we would let the Spanish affiliate have its own, Spanish-language, portal. And if we were doing that then we would also let Italy have an Italian-language one, Portugal have their own, and Germany and France have their own. But would we consolidate all these and sanction a European (or company-wide) portal that was multilingual and customisable? No, not a chance.

Why? Size.

An affiliate that deals with its own portal can budget for its own portal. It can manage its own portal and it can pay for its own portal. If we put something in that is cross European (or global) then money has to be found from other budgets and responsibility for maintenance has to be found too. The fact that the money and resource all come, in effect, from the same big bucket is carefully swept under the table. It becomes a political game.

But in a small company, such as the one that I and millions of other small businessmen run around the world, something like this is a simple, no-brainer. Do we need a portal? Yes. Can we afford it? Yes. Do it! The governance is light and quick and the decision is made pretty much instantly.

The same can be said for process. Every company runs on processes. In smaller companies the process is probably very light and fluid. Checks and balances might not be as necessary as they are in larger companies. But modifying or redesigning a process is a lot easier in smaller companies. We can react to market conditions and change direction/strategy/market a lot quicker.

So if small companies can do this so much quicker, why are they not ruling the world? Well, when they start to rule the world they get bigger, and the ability to react as quickly disappears. Companies like Amazon and Google were once small start-ups. They had small staff numbers, small capital and big ideas. They were responsive to what the market wanted and they could pivot on a sixpence if needed. They could fail quickly and move on. This mentality is now no longer there to the same extent (although a lot of this is cultural: Google’s “Spend 20% of your time on your own projects” ethos has now fallen pretty much by the wayside, for example).

So what are larger companies to do? How can they become leaner and more reactive? The answer is easy (although the implementation is not). To become leaner and quicker you have to... become leaner and quicker. Remove the levels of bureaucracy that slow down changes. Keep the organisational chart shallow enough that you can get the decision makers into a room and make decisions quickly. Put the decision makers at the right place in the organisation. I’ve talked previously about process owners and the need for them to be at the board level. This is an immediate example of why.