Self-healing networks

By Philip Hunter

Published Monday, August 3, 2009

Computer networks are smart - they know when they're having a complexity crisis; but can they be made smart enough to apply solutions to their own problems without the need for human intervention? E&T reports.

As IT becomes ever more complex, sophisticated, and extensive, the task of network management is to keep meeting user expectations against growing technological challenges. To many of those in IT so tasked it is a thankless undertaking, as no sooner has one performance objective been met than new problems arise that discombobulate the original plan.

It is, arguably, an ambition of containment rather than expansion, and even the quest for autonomic management, or self-healing networks, must be seen in that light, against a continuous background of cost constraints. To an extent it has always been possible to guarantee the highest levels of availability, and also defend against the most penetrating threats, given the maximum human and technical resources available at the time.

In practice, such resources have been confined to the highest security defence networks, so for everyone else a compromise has to be reached. The objective of network management is therefore unchanging, summarised in the phrase "faster, cheaper, better", according to Rick Sturm, founder and CEO of network management research firm Enterprise Management Associates (EMA).

But while network management will continue to have the role of maintaining compliance, governance, and performance in the face of ongoing cost constraints - now with the added requirement of hitting energy efficiency targets - true automation does offer a more enticing goal. For the first time, network management will become a source of new applications and even competitive advantage in its own right, Sturm believes.

The key will lie in uniting the whole IT infrastructure within a common management framework - including not just the physical components (such as cables) of the IT network itself, but also power supply and cooling, as well as aspects of the building, such as air conditioning. All of these play a part in the operation and effectiveness of IT services and applications as a whole; indeed, EMA's Sturm extends this further by suggesting that as other electronic components become measurable - and capable of being embraced within an IT management framework - great new opportunities will arise.

"As more physical infrastructure becomes network enabled with RFID [radio frequency identification] tags and Internet-enabled network interfaces, then this obviously presents a fantastic opportunity to change the way in which business is done," Sturm says. "Home meter reading can be done automatically, remotely and securely."

This may seem a premature expectation, but Sturm is suggesting the possibilities should provide added impetus in the drive towards self-healing and automatic network infrastructures: "There will obviously be a need to use techniques developed in autonomic computing that have been used to secure 'traditional' IT infrastructures to secure these new network-enabled systems."

Autonomic computing initiative

The major vendors in network and IT management, notably IBM and Hewlett-Packard (HP), tend to play down the possibilities of autonomic management, perhaps because progress in some areas has been disappointingly slow. IBM launched its Autonomic Computing Initiative with a dedicated team amid much huzzah in 2001, but it has featured little in the media since then.

This, though, is largely because it was a long-term project that would inevitably take time to come together, given that it relies upon extensive integration across the whole IT firmament, not just the networking dimension. "The big challenge was that in order for this to work fully, IBM had to tie together and integrally traverse multiple infrastructure domains simultaneously, such as network, server, storage, database, and client, and like all other technology vendors is still not fully capable of delivering this," says another EMA expert, research director Jim Frey.

Yet IBM's head of autonomic computing Matt Ellis insists that substantial progress has been made and delivered to enterprise customers via the company's network management arm, Tivoli. "For example, we introduced predictive analytics into our Tivoli management software in 2008, and coupled this with our configuration management products," says Ellis. "This has allowed our customers to automatically monitor the state and health of their application or middleware, and then detect when there might be a potential capacity problem. The management system will then automatically configure a server, set up another instance of the middleware and then make sure that workloads are moved to the additional capacity."
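The loop Ellis describes (monitor, predict a capacity shortfall, add an instance) can be sketched in a few lines of Python. This is an illustrative sketch only, not Tivoli's method: the straight-line forecast, the MiddlewarePool class, the check_and_scale function and all the thresholds are assumptions made up for the example.

```python
from dataclasses import dataclass, field


def forecast_linear(samples, steps_ahead):
    """Fit a least-squares straight line to equally spaced load samples
    and extrapolate it steps_ahead intervals past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return mean_y + slope * ((n - 1 + steps_ahead) - mean_x)


@dataclass
class MiddlewarePool:
    """Hypothetical stand-in for a group of middleware instances."""
    instances: int = 1
    log: list = field(default_factory=list)

    def add_instance(self):
        # A real system would configure a server and move workloads here.
        self.instances += 1
        self.log.append(f"provisioned instance {self.instances}")


def check_and_scale(pool, load_samples, capacity_per_instance=100.0,
                    steps_ahead=3, headroom=0.8):
    """Add instances until the predicted load fits within the headroom
    (assumed here to be 80 per cent of total capacity)."""
    predicted = forecast_linear(load_samples, steps_ahead)
    while predicted > headroom * capacity_per_instance * pool.instances:
        pool.add_instance()
    return predicted
```

Given a pool with one instance and load samples of 50, 60, 70 and 80, the forecast three intervals ahead is 110, which exceeds the 80 units of usable capacity, so one further instance is provisioned.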

This indeed is a step in the right direction, but full autonomic computing will be impossible until one most critical function, identity management, is embedded across the whole IT infrastructure, according to Gijs Zantvoort, EMEA product marketing manager at HP ProCurve. "Enterprises are just scratching the surface with respect to the benefits from automation of their network management," says Zantvoort. "The main reason is that if you want to automate this, you need to know who is who and who wants to connect what where. It becomes imperative that you start to manage identities."

Unified communications deployment

While most of the technology is already there, implementation of full-blown identity management has been impeded by internal politics, except within specific domains where tight access control is required, at least according to HP ProCurve's Zantvoort; but salvation may be at hand, for many enterprises are intent on deploying unified communications (UC) based on presence to increase efficiency, particularly those with highly mobile workforces.

"This is a factor that will accelerate the adoption of identity management enormously," avers Zantvoort. "UC will only work if you know the identity, and then if you know the presence. From a networking perspective, the presence is of less relevance, but the management of the identity is offering a real solid foundation to automate your network management accordingly."

This could lead, for example, to Microsoft's Active Directory becoming the focal point both for managing identities and for automating network management.

"That's pretty compelling - one database with all relevant information," Zantvoort adds. "It gets better if you can use one automated management system for both your wired and wireless infrastructure. You don't want to have different islands of management as they will have holes and overlaps and thus are a recipe for error."

While UC may be an enabler of automation, several IT service trends will be the primary drivers, cloud computing in particular. Cloud providers will need fast and efficient provisioning of services to win over enterprise customers, and that in turn relies on growing levels of automation. If cloud computing does indeed live up to its promise, enterprises may be spared the agonies of implementing automation themselves, relying instead on external suppliers to do it for them. More likely, enterprises will still need to make some difficult decisions, and so, at another level, will service providers, as EMA's Jim Frey indicates.

"The challenge is to answer the question of 'how much is enough?'," says Frey. "It is easy to overspend adding capacity and redundancy into a network architecture. The answer is to reconcile capacity with usage requirements, and that has to start with understanding more than just connectivity - it requires an understanding of all the service and applications that the network will be delivering, along with their relative criticality to the served organisation."

Frey is talking here both about the enterprise as a customer of a cloud computing service, and end-users as customers of the enterprise. Either way, IT operations teams will have to embrace the rudiments of automation where possible: auto-discovery, to recognise newly added elements and changes in existing ones; proactive performance trend monitoring that distinguishes genuine changes or exceptions from normal variations in behaviour; automated determination of the root causes of problems to accelerate recovery; and some degree of automated corrective action.

Autonomic computing will become a reality only when automated corrective action has evolved to cope with unforeseen events, including failures in software, and has been extended across the whole IT infrastructure.
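Two of the rudiments of automation listed above, auto-discovery and exception detection against normal variation, are simple enough to sketch in Python. What follows is a minimal illustration under stated assumptions, not any vendor's implementation: the function names, the inventory format (element name mapped to a configuration string) and the three-standard-deviation tolerance are all hypothetical.

```python
from statistics import mean, stdev


def discover_changes(known, scanned):
    """Compare the recorded inventory with the latest scan.
    Both arguments map an element name to its configuration string."""
    known_ids, scanned_ids = set(known), set(scanned)
    return {
        "added": sorted(scanned_ids - known_ids),
        "removed": sorted(known_ids - scanned_ids),
        "changed": sorted(k for k in known_ids & scanned_ids
                          if known[k] != scanned[k]),
    }


def is_exception(baseline, value, tolerance=3.0):
    """Flag a reading that falls outside the normal variation of the
    baseline, taken here to be tolerance sample standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # requires at least two baseline readings
    if sigma == 0:
        return value != mu
    return abs(value - mu) > tolerance * sigma
```

A fixed multiple of the standard deviation is only the crudest way to separate exceptions from normal variation; the point is that the baseline is learned from observed behaviour rather than set by hand.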