Lean Integration is not a one-time effort; you can't just flip a switch and declare the job done. It is a long-term strategy for how an organization approaches the challenges of process and data integration. Lean can and does deliver early benefits, but it doesn't end there. Lean principles such as waste elimination are never-ending activities that result in ongoing benefits. Furthermore, some Lean objectives, such as becoming a team-based learning organization with a sustainable culture of continuous improvement, may take years to achieve because they require changing entrenched bad habits.

Before you start on the Lean journey, therefore, you should be clear about why you are doing so. This chapter, and the rest of the book, will elaborate on the technical merits and business value of Lean Integration and how to implement a program that delivers on the promise. Here is a summary of the reasons to make the journey:

Agility: Take integration off the critical path on projects by using highly automated processes, reusable components, and self-service delivery models. The mass customization case study (Chapter 8) demonstrates key elements of this benefit.

Data quality: Establish one version of the truth by treating data as an asset, establishing effective information models, and engaging business leaders and front-line staff to accept accountability for data quality. The Smith & Nephew case study (Chapter 6) shows how this is possible.

Governance: Measure the business value of integration by establishing metrics that drive continuous improvement, enable benchmarking against market prices, and support regulatory and compliance enforcement. The integration hub case study (Chapter 10) is an excellent example of effective data governance.

Innovation: Enable staff to innovate and test new ideas by using fact-based problem solving and automating "routine" integration tasks to give staff more time for value-added activities. The Wells Fargo business process automation case study (Chapter 9) is a compelling example of automation enabling innovation.

Staff morale: Increase the engagement and motivation of IT staff by empowering cross-functional teams to drive bottom-up improvements. The decentralized enterprise case study (Chapter 12) shows how staff can be engaged and work together across highly independent business units.

Achieving all these benefits will take time, but we hope that after you have finished reading this book, you will agree with us that these benefits are real and achievable. Most important, we hope that you will have learned enough to start the Lean journey with confidence.

Let's start by exploring one of the major challenges in most non-Lean IT organizations: the rapid pace of change and surviving at the edge of chaos.

Constant Rapid Change and Organizational Agility

Much has been written about the accelerating pace of change in the global business environment and the exponential growth in IT systems and data. While rapid change is the modern enterprise reality, the question is how organizations can manage the changes. At one end of the spectrum we find agile data-driven organizations that are able to quickly adapt to market opportunities and regulatory demands, leverage business intelligence for competitive advantage, and regularly invest in simplification to stay ahead of the IT complexity wave. At the other end of the spectrum we find organizations that operate at the edge of chaos, constantly fighting fires and barely in control of a constantly changing environment. You may be somewhere in the middle, but on balance we find more organizations at the edge of chaos than at the agile data-driven end of the spectrum.

Here is a quick test you can perform to determine if your IT organization is at the edge of chaos. Look up a few of the major production incidents that have occurred in the past year and that have been closely analyzed and well documented. If there haven't been any, that might be a sign that you are not on the edge of chaos (unless your organization has a culture of firefighting without postmortems). Assuming you have a few, how many findings are documented for each production incident? Are there one or two issues that contributed to the outage, or are there dozens of findings and follow-up action items?
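If your postmortems are stored in a structured form, the quick test above can be run as a simple tally. The following is a minimal sketch only: the Incident record, the sample incident names, and the threshold of 12 findings are all hypothetical and chosen for illustration, since the test itself distinguishes only "one or two" findings from "dozens."

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """A documented production incident (hypothetical record layout)."""
    name: str
    findings: list[str] = field(default_factory=list)


def findings_per_incident(incidents):
    """Map each incident name to its number of documented findings."""
    return {inc.name: len(inc.findings) for inc in incidents}


def many_findings(incidents, threshold=12):
    """Names of incidents at or above an assumed findings threshold."""
    return [inc.name for inc in incidents if len(inc.findings) >= threshold]


# Made-up sample data: one incident with two findings, one with 24.
incidents = [
    Incident("billing-outage", [f"finding {i}" for i in range(1, 25)]),
    Incident("login-slowdown", ["expired certificate", "missing alert"]),
]

counts = findings_per_incident(incidents)
flagged = many_findings(incidents)
print(counts)
print(flagged)
```

An incident that lands in the flagged list is a hint, not a verdict; the point of the test is to notice how many contributing factors surface once an outage is examined closely.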

We're not talking about the root cause of the incident. As a general rule, an analysis of most production incidents results in identifying a single, and often very simple, failure that caused a chain reaction of events resulting in a major outage. But we also find that for virtually all major outages there is a host of contributing factors that delayed the recovery process or amplified the impact.

Here is a typical example: An air conditioner fails, the backup air conditioner fails as well, the room overheats, the lights-out data center sends an automatic page to the night operator, the pager battery is dead, a disk controller fails when it overheats, the failure shuts down a batch update application, a dependent application is put on hold waiting for the first one to complete, an automatic page to the application owner is sent out once the service level agreement (SLA) for the required completion is missed, the application owner quit a month ago and the new owner's pager has not been updated in the phone list, the chain reaction sets off dozens of alarms, and a major outage is declared which triggers 30 staff members to dial into the recovery bridge line, the volume of alarms creates conflicting information about the cause of the problem which delays problem analysis for several hours, and so on and so on.

Based on our experience with hundreds of similar incidents in banks, retail organizations, manufacturers, telecommunications companies, health care providers, utilities, and government agencies, we have made two key observations: (1) There is never just one thing that contributes to a major outage, and (2) the exact same combination of factors never happens twice. The pattern is that there is no pattern—which is a good definition of chaos. Our conclusion is that at any given point in time, every large IT organization has hundreds or thousands of undiscovered defects, and all it takes is just the right one to begin a chain reaction that results in a severity 1 outage.

So what does this have to do with Lean? Production failures show why every small problem must be detected and dealt with: it is impossible to predict how small problems will line up to create a catastrophe. Three Mile Island is a classic example. Lean organizations relentlessly improve in numerous small steps. One metaphor for how Lean organizations uncover their problems is a lake with a rocky bottom, where the rocks represent the many quality and process problems affecting their ability to build the best products for their customers. Metaphorically, they "lower the water level" (reduce their inventories, reduce their batch sizes, and speed up reconfiguring their assembly lines, among other techniques) in order to expose the rocks on the bottom of the lake. Once the "rocks" are exposed, they can focus on continually improving themselves by fixing these problems. Integration systems benefit from "lowering the water level" as well. Every failure of a system uncovers a lack of knowledge about the process or its connections. Problem solving is a way of learning more deeply about our processes, infrastructure, and information domains.

We are of the opinion that the edge of chaos is the normal state of affairs and cannot be mitigated purely by technology. The very nature of systems-of-systems is that they emerge and evolve without a complete (100 percent) understanding of all dependencies and behaviors. There are literally billions of permutations and combinations of the internal states of each software component in a large enterprise, and they are constantly changing. It is virtually impossible to test all of them or to build systems that can guard against all possible failures. The challenge is stated best in remarks by Fred Brooks in The Mythical Man-Month: "Software entities are more complex for their size than perhaps any other human construct, because no two parts are alike. . . . If they are, we make the two similar parts into one, a subroutine." And "Software systems have orders of magnitude more states than computers do."2
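A rough piece of arithmetic shows how quickly the number of combined states grows. The sketch below assumes, purely for illustration, that every component has only two internal states and that components vary independently. Real systems have far more states per component and many dependencies between them, so treat the numbers as an indication of scale rather than a measurement. The function name combined_states is hypothetical.

```python
import math


def combined_states(states_per_component):
    """Number of joint states if every component varies independently."""
    return math.prod(states_per_component)


# Hypothetical enterprise: each component has just two internal states.
for component_count in (10, 20, 30, 40):
    total = combined_states([2] * component_count)
    print(f"{component_count} components -> {total:,} combined states")
```

Even under these simplified assumptions, 30 two-state components already yield 2^30 = 1,073,741,824 combinations, which is why exhaustive testing is out of reach.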

So what is the solution? The solution is to perform IT practices such as integration, change management, enterprise architecture, and project management in a disciplined fashion. Note that discipline is not simply a matter of people doing what they are supposed to do. Lack of discipline is not their problem; it is the problem of their managers who have not ensured that the work process makes failure obvious or who have not trained people to respond to revealed failures first with immediate containment and then with effective countermeasures using PDCA (Plan, Do, Check, and Act).

To effectively counter the effects of chaos, you need to approach integration as an enterprise strategy and not as an ad hoc or project activity. If you view integration as a series of discrete and separate activities that are not connected, you won't buy into the Lean concept. Since you are reading this book, chances are you are among the majority of IT professionals who understand the need for efficiency and the value of reuse and repeatability. After all, we know what happens when you execute project after project without a standard platform and without an integration strategy; 100 percent of the time the result is an integration hairball. There are no counterexamples. When you allow independent project teams to choose their own tools and to apply their own coding, naming, and documentation standards, you eventually end up with a hairball, every time. The hairball is characterized by an overly complex collection of dependencies between application components that is hard to change, expensive to maintain, and unpredictable in operation.

If for whatever reason you remain fixed in the paradigm that integration is a project process as opposed to an ongoing process, there are many methodologies to choose from. Virtually all large consulting firms have a proprietary methodology that they would be happy to share with you if you hire them, and some of them will even sell it to you. Some integration platform suppliers make their integration methodology available to customers at no cost.

But if you perceive the integration challenge to be more than a project activity (in other words, an ongoing, sustainable discipline), you need another approach. Some alternatives that you may consider are IT service management practices such as ITIL (Information Technology Infrastructure Library), IT governance practices such as COBIT (Control Objectives for Information and Related Technologies), IT architecture practices such as TOGAF (The Open Group Architecture Framework), software engineering practices such as CMM (Capability Maturity Model), or generalized quality management practices such as Six Sigma. All of these are well-established management systems that inherently, because of their holistic enterprise-wide perspective, provide a measure of sustainable integration. That said, none of them provides detailed practices for sustaining solutions to the data quality or integration issues that emerge when information is exchanged between independently managed applications whose incompatible data models evolve independently. In short, these "off the shelf" methods aren't sustainable since they are not your own. Different business contexts, service sets, products, and corporate cultures need different practices. Every enterprise ultimately needs to grow its own methods and practices, drawing from the principles of Lean Integration.

Another frequently considered alternative for fixing the hairball issue is the enterprise resource planning (ERP) architecture, a monolithic integrated application. The rationale for this argument is that you can make the integration problem go away by simply buying all the software from one vendor. In practice this approach doesn't work except in very specific situations, such as an organization that has a narrow business scope and a rigid operating model, is prepared to accept the trade-off of simply "doing without" if the chosen software package doesn't offer a solution, and is resigned to not growing or getting involved in any mergers or acquisitions. This combination of circumstances is rare in the modern business economy. The reality is that the complexity of most enterprises, and the variation in business processes, simply cannot be handled by one software application.

A final alternative that some organizations consider is to outsource the entire IT department. This doesn't actually solve the integration challenges; it simply transfers them to someone else. In some respects outsourcing can make the problem worse since integration is not simply an IT problem; it is a problem of alignment across business functions. In an outsourced business model, the formality of the arrangement between the company and the supplier may handcuff the mutual collaboration that is generally necessary for a sustainable integration scenario. On the other hand, if you outsource your IT function, you may insist (contractually) that the supplier provide a sustainable approach to integration. In this case you may want to ask your supplier to read this book and then write the principles of Lean Integration into the contract.

In summary, Lean transforms integration from an art into a science, a repeatable and teachable methodology that shifts the focus from integration as a point-in-time activity to integration as a sustainable activity that enables organizational agility. This is perhaps the greatest value of Lean Integration—the ability of the business to change rapidly without compromising on IT risk or quality, in other words, transforming the organization from one on the edge of chaos into an agile data-driven enterprise.