I've been reading for a while, have my Foundation certification now, and feel more comfortable with all this. That's good, because I'm in charge of introducing ITIL in a fairly large company.

After my first presentation to a group of support staff, I noticed that more than one of them considered Problem Management to be the "second level support", and I'm pretty sure that's a mistake, probably even a common one.

In that demo, they take a practical approach, explaining ITIL through the software and a practical case. However, the case they describe needs escalation, and they present their 2nd level support as part of Problem Management.

Did I miss something, or does Peregrine equate 2nd level support with Problem Management?
BR,
Fabien Papleux

Wow - I watched the Peregrine presentation, and that's a fantasy world that our organization certainly doesn't inhabit!
I believe that the knowledge base and its management are considered a Problem management function, but at incident time, the focus must be on containing the incident and getting the customer back in service as soon as possible. I think that the service desk 'owns the incident' & should have access to the KB tools.
In my experience as a Problem Analyst, trying to determine root cause during a live incident interferes with the objectives of incident management. I'll take my chances waiting for the dust to settle.
We have established a successful 'communication & escalation' process within incident mgmt that addresses some of these concerns. Based on the service desk agent's assessment of the incident (severity level), we direct the incident to the necessary support groups, and communicate it to the business & IT owners of the affected configuration item, depending on the elapsed time before resolution.
Hope that helps.
/Sharon E

You will find the 'confusion' you described extremely common. Though it doesn't count as confusion if it is a deliberate choice under the 'adopt and adapt' caveat applied to implementing ITIL.

(Editorial: I have no problem with the adaptation of the ITIL framework to specific site requirements, or to accommodate some of the design trade-offs that need to be made when developing software. I do wish that those behind these decisions would not present them as if they weren't adaptations.)

The difference between incidents and problems has nothing whatsoever to do with severity, urgency, priority, impact, SLA breaches, etc.

A problem is simply the root cause of an incident, whether trivial or of business-destroying severity. Regardless of where the boundaries are drawn, you are doing incident management when you are trying to get the business up and running again (not always a service restoration), and you are doing problem management when you are investigating the causes (or potential causes) of incidents and eradicating them.

Simple as that - so keep up the good work. Be practical: you can't always draw a line between the processes as clearly as 'good theory' would require. But understanding the difference puts you on solid ground. It is one thing to 'compromise' intelligently, based on recognition of the facts on the ground, and another thing to do so unwittingly.

After further investigation, I found this in Service Support, p. 76, last paragraph: "Where the underlying cause of the Incident is not identifiable, then it may be appropriate to raise a problem record."

This is the sentence that actually introduces the concept of Problem Mgt, a problem being an unknown underlying cause of one or more incidents.

So ITIL says that if, in the process of doing Incident Mgt, you cannot identify the underlying cause of an incident, it is your responsibility to raise a problem record, which will then be investigated, I assume, by 2nd level support personnel.

How this actually differs from functional escalation is that Incident Mgt could provide a workaround and close the incident, while the problem record is investigated. In escalation, the incident record stays open until a specialist provides an answer.
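As a rough sketch of that difference, the two paths can be modelled like this. The record fields, status values, and function names are all made up for illustration; they do not come from the ITIL books or from any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    """Hypothetical problem record: an unknown underlying cause."""
    description: str
    root_cause: Optional[str] = None  # stays None until investigated


@dataclass
class Incident:
    """Hypothetical incident record."""
    summary: str
    status: str = "open"
    workaround: Optional[str] = None
    assigned_to: str = "service desk"
    problem: Optional[Problem] = None


def resolve_with_workaround(incident: Incident, workaround: str) -> Problem:
    """Incident Mgt path: restore service, close the incident, and hand
    the unknown cause over to a separate problem record."""
    incident.workaround = workaround
    incident.status = "closed"
    incident.problem = Problem(f"Unknown cause of: {incident.summary}")
    return incident.problem


def escalate(incident: Incident, group: str) -> None:
    """Functional escalation path: the incident record itself stays open
    and simply moves to a specialist group."""
    incident.assigned_to = group


a = Incident("Mail server rejects logins")
p = resolve_with_workaround(a, "Restart the auth service")
b = Incident("Printer queue stuck")
escalate(b, "2nd level")
print(a.status, p.root_cause, b.status, b.assigned_to)
# prints: closed None open 2nd level
```

The point of the sketch is only that the first incident is closed while its problem record is still waiting for a root cause, whereas the escalated incident remains open.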

If I am wrong, please don't hesitate to correct me, but I think this is the spirit of the book.
BR,
Fabien Papleux

Well I don't have a crystal ball - but I would bet that line won't be in the V3 refresh.

The critical distinction between incident and problem management is that incident management does not 'seek' the underlying cause and problem management does. In fact, incident and problem management are described as being in potential conflict because of this difference - and the recommendation is not to have a single report responsible for both.

This is because analysing a root cause will often delay resolution. The classic example is that of a crashed service: analyse the logs (takes time, but might stop it from happening again) or blow away the system with a restore (quick restoration, but you lose diagnostic material and root cause analysis will be hindered). Incident management will always hit the 'nuke' button to get the user back up and running.

What the line does reflect is a common sense understanding that in many cases incident resolution activity will either a) discover the root cause anyway - without wasting time, b) be unsuccessful and require deeper investigation, c) encounter something the root cause of which is quite well known.

In these cases Incident Management and Problem Management are more alike than different - and the advice you highlighted intends that the requisite processes still be followed, and the information recorded, without being too 'officious' about where in the process cycle these steps are occurring.

Remember: ITIL is written by a number of committees - the wording will not be consistent and unambiguous in every case.

Fabien, you have inspired me.
In my current world I have specific criteria for determining which incidents to investigate as Problems.

I want to do more. The difficulty is deciding which of the other 4-5000 incidents reported to our various helpdesks in a month I should investigate, as the criteria I have now won't work (that's another story). Naturally, I have no Project...

What do you think of this idea: one of the growing areas of our company has a help desk and a well-established incident management process. They may have something in place for Problem Management, of which I am unaware (but the usual routine is to set up 'special teams' for new products until the obvious defects are fixed, then assume that everything that can possibly happen has already happened and there is a knowledge base article about everything else. Ha. In real life, shit happens)

I am thinking of suggesting that they throw the ugly ones my way. I am hoping that if this help desk's 12 managers get together, they will come up with criteria that do not seem to single out any one group either. I have found it to be very important to keep your reasons for investigating 'beyond the incident itself' objective and defensible. I also don't want to be seen as encroaching on their territory, or in any way criticising what they have already done. Corporate culture is a wonderful thing.

Glad to hear this. It seems to me that you are interested in that growing area and I have really no way of knowing from your info whether your proposition will fly. My gut feeling is that if you'd like to get involved, then you should try to make it happen. The best way usually is to create relationships and try to establish whether there is a need. If you feel that there is one, then make a simple business case and volunteer. That shows willingness and leadership... and that's rarely turned down (that's the good side of the corporate world).

Closer to the topic at hand though, I found that, as rjp pointed out, the books are written by a number of people and sometimes it's confusing. But it is clearly stated that when the root cause of an incident is not known, then the person in charge of the Incident on the Service Desk should either create a problem record, or mark the incident to be investigated later, provided that a workaround is found. If there is no workaround, the incident should be sent to a specialist group (2nd lvl) for further investigation.

And you could also consider, for instance, that all escalated incidents where a workaround has been used, should be reviewed by the problem management team.
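A rough sketch of that routing rule in Python, for illustration only. The names `Action`, `triage`, and `needs_problem_review` are hypothetical, and the branch for a known root cause (simply resolve and close) is an assumption, since the rule as described only covers the unknown-cause case.

```python
from enum import Enum


class Action(Enum):
    RESOLVE = "resolve and close the incident"
    RAISE_PROBLEM = "apply workaround, then raise a problem record"
    ESCALATE = "send to a specialist group (2nd level)"


def triage(root_cause_known: bool, workaround_found: bool) -> Action:
    """Decide what the Service Desk does with an incident."""
    if root_cause_known:
        # Assumption: nothing left to investigate, so just resolve it.
        return Action.RESOLVE
    if workaround_found:
        # Service is restored; the unknown cause is investigated later.
        return Action.RAISE_PROBLEM
    # No known cause and no workaround: functional escalation.
    return Action.ESCALATE


def needs_problem_review(escalated: bool, workaround_used: bool) -> bool:
    """Optional extra rule: escalated incidents that were resolved with a
    workaround get reviewed by the problem management team."""
    return escalated and workaround_used


print(triage(root_cause_known=False, workaround_found=True).value)
# prints: apply workaround, then raise a problem record
```

Keeping the rule as a small pure function makes the criteria easy to show to the people involved and to adjust once they have agreed on them.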

Fabien, mea culpa - I misread your quote from the ITIL books as "Where the underlying cause of the Incident is identifiable, then it may be appropriate to raise a problem record."

What a difference a "not" can make - so my entire post is inside-out.

Strangely, this very issue came up in a project planning meeting today - one of the ITIL specialists was quite strongly of the opinion that IM and PM processes should be kept as distinct as possible. He considered it more effective not to allow incidents to be 'converted' to problems - even where the root cause is known, or discovered as a byproduct of incident resolution activities. His reasons were mainly 'cultural' and certainly supported Sharon's approach.

And I certainly accept my slant would require a high level of training for incident resolvers. So after some reflection - perhaps the caveat should be... "in an ideal world", but watch the risks.

Progress update!
Fabien, you said "create relationships and try to establish whether there is a need. If you feel that there is one, then make a simple business case and volunteer. "
Well, I did exactly that. And so far the feedback seems to be 'great', 'glad to have you'. I am scheduled to make a presentation at one of their meetings. And RJ, thanks for your vote of confidence for my 'culturally-sensitive' approach. I will use all your suggestions to show them alternate criteria they could use to decide which incidents to make into problems.
Now the funny part is: the meeting I am scheduled to present at is mid MARCH - and on May 4 I am leaving for a month in Sweden. Such are the joys of bureaucracy. Life is funny
/Sharon

In reviewing the above comments, let me add one of my own. To best understand the difference between incident and problem management, look at the reason they were separated. In the old days (notice I did not say good old days), the tech would always find the cause of the incident. This meant that the customer had to wait, but the needs of IT came first!

Under ITIL the tech's primary goal is to restore service. Incident management is like Alexander and the Gordian knot. He does not unravel it, but rather cuts it in two with his sword. Do whatever it takes to get the customers back to a productive state.

Problem management seeks the root cause of only those incidents where later incidents can be prevented. Thus scarce technical resources are only applied where they will do the most good.
Hope this helps.
Rob Roy