IT S&M Pain or Pleasure – Major incident management

In the final part of this three-part blog, we look at the pleasure and pain involved in major incident management.

Major Incident Management

To my mind, the three things, outside of event management, that really improve the way major incidents are handled are communication, learning from the incident and being transparent.

In IT, we are not an island. We are a core part of most organisations, whether they and we like it or not. We have to share what is happening, what we have learnt from the incident and what is going to happen to reduce the chance of it happening again. Otherwise, how can we even hope to be trusted?

The Pain

I have seen IT teams embrace the opportunity to operate during major incidents with heads down in a locked room, keeping information secret and thereby increasing the chance of recurrence.

The trouble with being like this is that the wider organisation starts to accept that this is the way it is. So they don’t complain formally; they just moan to each other.

And in fairness to IT, how often do you see a review of a failed marketing campaign? Or a full and frank review of a failed pay run, other than one that blames IT? Who ever heard finance say that they should have had business continuity plans in place because they accepted, when the system was implemented, that it was not resilient?

The Pleasure

Get it right, however, and the pleasure is shared amongst all concerned parties.

Communicate with your customers and users. Let them know what is going on, even if that communication is just to say that you are still working on it. Set up lines of communication. Make sure that the technicians working on the fix are left alone to get on with it, but have one person who gets updates from them. If you commit to telling users every 30 minutes or every hour, get an update from the technicians 10 minutes before each one is due. Keep telling people what is happening. AND keep the service desk updated.
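To make the timing concrete, here is a minimal sketch of that cadence in Python. It is only an illustration: the function name `update_schedule`, its parameters and the 09:00 start time are assumptions made for the example, not part of any tool or standard.

```python
from datetime import datetime, timedelta


def update_schedule(start, interval_minutes=30, lead_minutes=10, count=3):
    """Return (technician_check_in, user_update) time pairs.

    Hypothetical helper: users are updated every `interval_minutes`,
    and the communications lead collects an update from the
    technicians `lead_minutes` before each user update is due.
    """
    schedule = []
    for n in range(1, count + 1):
        user_update = start + timedelta(minutes=interval_minutes * n)
        check_in = user_update - timedelta(minutes=lead_minutes)
        schedule.append((check_in, user_update))
    return schedule


# Example: incident declared at 09:00, users updated every 30 minutes.
incident_start = datetime(2024, 1, 1, 9, 0)
for check_in, user_update in update_schedule(incident_start):
    print(f"{check_in:%H:%M} get update from technicians, "
          f"{user_update:%H:%M} update users and the service desk")
```

With a 30-minute commitment, the first check-in with the technicians falls at 09:20 and the first update to users at 09:30, so the person handling communications always has fresh information before speaking to the business.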

When the service is restored, carry out a full review. What went well and what didn’t? Keep to the facts. No emotions. Which infrastructure components failed? Can resilience be done better? Should the service be resilient, given that everyone in the rest of the business was jumping up and down? Should processes inside or outside of IT be amended? Set actions. Set timeframes. Follow up on them.

Share the outcome with as much of the rest of the business as possible. If there has been a failure and you have a plan to reduce the chances of it happening again, tell everyone. If you have a piece of work in place and it is being progressed, but the issue recurs before you can finish the remediation, fewer people are going to complain than if you had done nothing.

Use the incident as a driver for good.

Service Management Life Lessons Learnt

Here are some key lessons I have learnt about removing some of the pain from Operations and Service Management.

Don’t be afraid to bring in consultants, but find ones with real-world experience. You probably know what needs to be done, but do you have the time to think it through and implement the change while also doing your day job? Probably not.

Don’t be afraid to take advice and use a mentor. Other people’s thoughts are sometimes the clarity you need when the trees are hiding the woods.

Don’t forget to ask the people on the ground what is wrong or what can improve. They have good ideas; often the best ideas.

Ask a lot of questions before starting.

Don’t try to do too much too quickly. Keep going for the “low-hanging fruit”. It’s always there, and it’s always easier to do that than to try to pick all the fruit in one day. Look up Tipu by Rob England.

As a consultant, remember that you don’t know it all. You can learn from each client. Admit it to them and to yourself. Also, use your peers. Accept that other people can sometimes do things better, and will generally help.

If you missed the earlier parts of this series, catch up with part one and part two.

James is an ITIL accredited Service Management and IT Operations Management consultant with over 20 years in IT and over 10 years’ experience managing, mentoring and leading IT support teams in the UK, India and New Zealand, across Outsource, Utilities, Media & Broadcast, Public Health and Tertiary Education environments.
Recent opportunities have allowed James to use his experience to help organisations improve processes, Service Desks and IT support teams, enabling continuous improvement whilst also delivering a stable operational environment.
James is also an accomplished people manager, with teams ranging from small local teams to large multi-national teams, and is experienced in strategic thinking to drive improvements and change.