Can anyone provide me with real/realistic reasons for separating Incident Mgt and Problem Mgt? My client is not interested in what "best practice" / ITIL says, but rather in real, pragmatic reasons or examples of where the benefits of doing this could be realised.

Why can't one investigate a root cause in Incident Mgt?

In my opinion it is important to differentiate between Incident Management and Problem Management because of their different focus areas. Incident Management needs to be solely concerned with returning the service to an operational state, and should not be over-complicated with root cause analysis, which would only slow the process down. The Problem Management process would also benefit, as this would free resources for more proactive work to combat future incidents.

Some people are good at getting people back up and running and should not suffer any distractions. Others, in Problem Management, will analyse the bigger picture: categorise incidents and make recommendations for changes that will prevent recurrence. Cost justifications need to be raised, impact assessed, etc.

Let's say the Service Desk gets a call from a user. She's the secretary of the VP of Finance and she needs some documents printed on her printer STAT for the VP's 5:00 meeting. (Why do we always pick on secretaries?) It's 4:15 and the Service Desk tech goes at it over the phone, connecting remotely to the secretary's workstation. He pokes here, prods there and can't make the computer print. However, the tech knows there's a printer down the hall from where the secretary sits that could probably work. He then configures the workstation to print to that printer down the hall. He makes the connection and it works! The secretary prints her documents and gets them to her boss just in time for his meeting. Of course, the SD tech documents this, recording in his Service Desk tool that, as a workaround, he configured the printer down the hall.

Now, the Incident (the secretary can't print) is resolved, but the Problem is not. The tech proceeds to create a Problem record and pass it to a level 2 tech, who has more time on his hands to do deep troubleshooting. It's a bit more detailed than this, and I won't go into the details of escalation and all the other stuff, but this is roughly what it is!

Thanks for your responses. They confirm what I am saying (professing).

I think the real "problem" I am facing is getting the client to understand the difference between an incident and a problem. In his eyes, if the SD cannot resolve an "incident" then it is a problem.
That goes against the grain of what ITIL/best practice says. I will have to introduce terms like minor, major and significant incidents in order to get around the terminology issue.

One terminology issue: if their primary server (running their core mission-critical application) goes down, which happens from time to time, they see this as a problem and believe it should fall within PM, even though they have a workaround to bring the service back up.

Explain to them that the incident is the loss of the service; hence, when the server is back (rebooted, etc.), the incident is closed. The problem is whatever caused the server to go down in the first place, and it needs to be addressed to stop future recurrences.

I agree to a degree. Incident management is definitely reactive, since the function of IM (through the SD) is to respond to reported incidents, but describing problem management the same way is an oversimplification. Problem management has both a reactive and a proactive side, and a mature problem management process does both. Reactive problem management responds to problems as they arise. Proactive problem management uses trend analysis to identify problems in advance. For example, the problem manager may get reports on incidents from the SD, notice a recurring issue, and then raise a problem record.
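To make the trend-analysis idea concrete, here is a minimal sketch of how a problem manager might scan SD incident reports for recurring issues. The incident IDs, the categories, and the threshold of two are all made up for illustration; a real Service Desk tool would have its own reporting.

```python
from collections import Counter

# Hypothetical incident log exported from the Service Desk tool:
# (incident_id, category) pairs. All values are illustrative.
incidents = [
    ("INC-1", "printer"),
    ("INC-2", "email"),
    ("INC-3", "printer"),
    ("INC-4", "core-server"),
    ("INC-5", "printer"),
    ("INC-6", "core-server"),
]

def candidate_problems(incidents, threshold=2):
    """Return (category, count) pairs whose count reaches the threshold,
    most frequent first. Each one is a candidate for a problem record."""
    counts = Counter(category for _, category in incidents)
    return [(cat, n) for cat, n in counts.most_common() if n >= threshold]

print(candidate_problems(incidents))
```

Anything that recurs often enough gets looked at as a possible problem, whether or not the individual incidents were all closed on time.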

I think the fundamental difference is that Incident Management's goal is to restore service as soon as possible either through a temporary fix or a workaround, or if one is lucky actually resolving the problem. Problem Management in contrast has the goal of finding the root cause.

A business justification for implementing the separate processes is the service level to the customer. If the 'resolving' team is concerned with finding the root cause, it might take a long time to identify and eliminate, and meanwhile the customer is out of service the entire time: an unhappy customer. Incident Management ensures that the service is restored within the agreed time frame (the SLA, if the organization has SLAs). Without Problem Management, the underlying cause is not identified, which increases the chance of the incident happening again. That also leaves an unhappy customer, since the same issue keeps recurring: a degraded service level.

Having the two processes separate is ideal. In a way they balance each other out and promote the highest achievable service level to the customer. ITIL doesn't prescribe how to create the org structure; it merely states the processes that should be occurring. So, depending on the size of the org and the volume of incidents and problems, if the client is against investing the additional resources to have two separate teams, it might be possible to have one team fulfilling both processes. Not knowing the details, it is difficult to say.

I think the fundamental difference is that Incident Management's goal is to restore service as soon as possible either through a temporary fix or a workaround, or if one is lucky actually resolving the problem.

Does the work order remain classified as an incident or a problem if the "problem" is resolved?

In the printer example, the Incident Record would have been closed once the incident was resolved, that is, once the workaround to another printer was in place. The Problem Record goes through investigation and into known error status (we hope). At this point an RFC might be opened if a change is needed to fix the secretary's printer. The bottom line is that the Problem Record is not closed until the customer indicates that the resolution took care of the problem. I'm not 100% sure how your company uses work orders or how it handles their classification, so I am assuming that they are used to dispatch techs. I am also assuming that the work order would not be generated until after the initial incident, once it was known that further investigation was needed and a tech had to be dispatched (remotely or onsite). Given those assumptions, my opinion is that it would be classified as a "problem" and would remain so after closing. The reason is trending and reporting...but you have to do what works for your organization.
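The record lifecycle described above can be sketched roughly as follows. This is only an illustration: the class names, the status strings, and the "corrupt printer driver" root cause are invented for the example and are not taken from any particular Service Desk tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Incident:
    summary: str
    status: str = "open"
    workaround: Optional[str] = None

    def resolve(self, workaround: str) -> None:
        # Service is restored, so the incident can be closed.
        self.workaround = workaround
        self.status = "closed"

@dataclass
class Problem:
    summary: str
    incidents: List[Incident] = field(default_factory=list)
    status: str = "investigating"
    root_cause: Optional[str] = None

    def record_known_error(self, root_cause: str) -> None:
        # Root cause identified; an RFC may be raised from here.
        self.root_cause = root_cause
        self.status = "known error"

    def close(self, customer_confirmed: bool) -> None:
        # The problem stays open until the customer confirms the fix.
        if not customer_confirmed:
            raise ValueError("customer has not confirmed the resolution")
        self.status = "closed"

incident = Incident("Secretary cannot print")
incident.resolve("Mapped workstation to the printer down the hall")

problem = Problem("Secretary's printer does not print", [incident])
print(incident.status, "/", problem.status)  # incident closed, problem still open

problem.record_known_error("corrupt printer driver")  # hypothetical root cause
problem.close(customer_confirmed=True)
```

The point of keeping two records is visible in the middle of the run: the incident is already closed while the linked problem is still under investigation, so each can be reported and trended on its own.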

To get to one of the earlier points about justifying the separation of Incident and Problem: from an ITFM perspective it is beneficial to know the cost of doing business. This means understanding the difference between solving incidents and solving problems, and understanding when it is cost-effective to remove a problem and when it makes sense to live with the workaround. I won't go on about the advantages of keeping them separate from a process perspective; there are many comments earlier in the thread that nail this pretty well.
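As a rough sketch of that cost question, one simple way to frame it is a break-even calculation: how many months of living with the workaround would cost as much as a permanent fix? The function name and all the figures below are hypothetical, and a real ITFM model would include more than these three inputs.

```python
def months_to_break_even(fix_cost, incidents_per_month, cost_per_incident):
    """Months of recurring incidents it takes to equal the one-off fix cost.

    Returns None when the workaround costs nothing, in which case the
    permanent fix never pays for itself on cost alone.
    """
    monthly_workaround_cost = incidents_per_month * cost_per_incident
    if monthly_workaround_cost <= 0:
        return None
    return fix_cost / monthly_workaround_cost

# Illustrative numbers only: a 6000 fix versus 4 incidents a month at 250 each.
print(months_to_break_even(fix_cost=6000, incidents_per_month=4, cost_per_incident=250))
```

If the break-even point is a few months out, removing the problem is probably worth it; if it is many years out, living with the workaround may be the sensible choice. You can only run this kind of calculation if incidents and problems are tracked separately.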

The separation of incidents and problems, and of the processes that deal with them, has been pretty well articulated in the previous posts, and I pretty much agree with what has been said.

Some comments on the realistic operation of incident and problem management and how they work together....

Incident resolvers will frequently discover the root cause of an incident. The most obvious cases will be simple break/fix incidents where a physical component has gone belly up and needs replacing. At other times, more complex root causes may be found while the incident resolvers are getting the service back up.

In short, not every single incident has a workaround. The goal of Incident Management is to get the service restored. Sometimes a move into Problem Management is necessary to get this done.

In these cases the aspects of Incident and Problem management come together: the goal is to get the service restored (Incident Management), and the method is root cause analysis (Problem Management).

However you implement each process you should have clear protocols for managing the overlaps and linkages.

...but often in smaller companies you will not find an explicit problem process, because the same people solve every incident and problem. They do the problem work all the same.

"Who" resolves incidents/problems may be an issue from a staffing perspective. However, from an ITIL perspective, it is the management of the incidents/problems (not the people solving them) that is addressed.

One of the biggest reasons to separate incidents and problems is to provide management a view into the nature of the work their support staffs are doing. Incident solvers and Problem solvers generally have different skill sets (and different pay rates). Knowing the incident/problem workloads allows managers to make better decisions regarding the kind of support personnel on their staffs.

Granted, very small companies (IT departments of <20) won't get much benefit from implementing two separate management queues. Where I work, our IT department is <100 people. By implementing incident and problem management, we were able to outsource our incident staff and use those who were left behind to work on problems. Overall customer satisfaction went up dramatically with little increase in cost.