An effective root cause analysis process can significantly improve production reliability. Yet few organizations have a functioning root cause analysis process in place. This article discusses common problems and suggests solutions for improving root cause problem elimination.

Don
The name "root cause analysis" itself points to the largest and most expensive problem organizations face when implementing problem solving. The desired result of the process is to eliminate the problem, not merely to analyze the failure. To convey that result to the organization, the name should therefore be changed to Root Cause Problem Elimination (RCPE).

An example of RCPE results is plotted over time in figure 1. Had the problem been analyzed but no solution implemented, the result would have been a $1,800 cost. But, as the graph shows, a solution was implemented, and it generates roughly $8,000 per year in avoided costs.

Initially it costs money to identify and analyze the problem: in this case, $1,800 for personnel, testing, and some consumables. It also costs money to prioritize, plan, and schedule the corrective action ($200). The redesign and material cost to implement the solution is $1,800, for a total cost of roughly $4,000 for the implemented solution. The cost avoidance from future problems is estimated at $8,000 per year. The figure shows an RCPE profit of about $20,000 in avoided problems after 3 years. The avoided cost will most likely continue to accumulate in the future.
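The arithmetic behind this example can be checked with a short sketch. The dollar figures come from the paragraph above; the function and variable names are illustrative assumptions, not part of any RCPE tool. Note that the itemized costs add up to $3,800, slightly under the $4,000 total quoted, so the computed three-year result lands a little above $20,000.

```python
# Figures taken from the example above; all names are illustrative.
ANALYSIS_COST = 1_800          # identify and analyze the problem
PLANNING_COST = 200            # prioritize, plan, and schedule the fix
IMPLEMENTATION_COST = 1_800    # redesign and material
ANNUAL_COST_AVOIDANCE = 8_000  # estimated avoided cost per year


def total_cost() -> int:
    """Sum of the itemized costs for an implemented solution."""
    return ANALYSIS_COST + PLANNING_COST + IMPLEMENTATION_COST


def net_result(years: int, implemented: bool = True) -> int:
    """Cumulative net result in dollars after `years` years."""
    if not implemented:
        # The analysis was paid for, but nothing was fixed,
        # so no future cost is avoided.
        return -ANALYSIS_COST
    return years * ANNUAL_COST_AVOIDANCE - total_cost()


def payback_months() -> float:
    """Months of cost avoidance needed to recover the total cost."""
    return 12 * total_cost() / ANNUAL_COST_AVOIDANCE


if __name__ == "__main__":
    for year in range(4):
        print(f"Year {year}: implemented {net_result(year):>7,} | "
              f"analysis only {net_result(year, implemented=False):>7,}")
    print(f"Payback after about {payback_months():.1f} months")
```

Under these assumptions the implemented solution pays for itself in well under a year, while the analysis-only case stays at a $1,800 loss no matter how much time passes.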

If an organization engages in Root Cause Problem Elimination, it is critical to implement the solutions it discovers; otherwise the organization ends up paying for a wasted analysis. An interesting question is why solutions that have been investigated aren’t implemented.

Start with basics before RCPE is engaged
An example of a "successful" Root Cause Failure Analysis (RCFA) that cost a plant $1.1 million was once described to me. To make a long story short, a team of people performed an RCFA for several weeks in order to discover that the root cause was worn-out and missing coupling bolts due to misalignment. The main business process root cause was poor planning and scheduling practices that didn’t allow mechanics time to align properly. This analysis was described as a huge success. But the obvious question has to be asked: Why focus on RCFA in this plant? If the plant had had basic PM in place, the missing bolts would have been found much earlier by looking at the coupling with a stroboscope (yes, guards should have OSHA specified inspection ports).

Organizations should not prioritize Root Cause Problem Elimination if they work in a highly reactive environment. It could be a good idea if an organization is in a somewhat reactive situation, but not in a highly reactive mode. In a highly reactive mode, it may sound like a good plan to start solving problems, but it doesn’t work. The reason is simple: highly reactive organizations don’t need to analyze problems to find solutions, because their problems are already obvious. Common problems are poor foundations, corrosion, broken components that aren’t fixed yet, lack of bills of materials, disorganized spare parts and materials, lack of equipment numbers, an extensive maintenance backlog, and lack of standard operating procedures and training for operators. The list goes on. A highly reactive organization needs to work on basic preventive maintenance and on planning and scheduling before it can free up time for Root Cause Problem Elimination (RCPE). Even if RCPE were engaged in a highly reactive organization, the solutions would point at the obvious problems mentioned above.

The right people should engage in RCPE
A root cause problem elimination process should be designed to involve few people for most problems and engage larger groups only if needed.

Root cause programs are often designed to engage a facilitator and a group of people. The group size is often ten or more people. Engaging a larger group for RCPE can be a great learning experience and can provide great results if it’s used in moderation. But large groups tend to be hard to get together and will usually dissolve over time if meetings become too cumbersome. Day-to-day root cause problem elimination should therefore be managed by the front line (hourly employees and first-line supervision). If they run into a tricky problem, a larger group may be called in.

I truly believe that 80% of all problems in your organization can be solved by the front line using simple problem-solving skills. In order to be effective, they need to be given the right tools and processes. They are in most cases closest to the problem and can therefore collect data and observations better than anyone else in the organization. They usually have the technical knowledge needed to solve the problem. What is often missing is a problem-solving process and the discipline to follow that process.

It is also important to remember that it is management’s responsibility to design or provide a root cause process. Management is also responsible for implementing the root cause business process in the mill, setting responsibilities, and following up to hold people accountable. This is not unique to root cause; it applies to all business processes in the organization.

Change the culture
"Downtime reported by department breaks the first rule of root cause problem elimination;
Ask WHY not WHO."

It is also management’s responsibility to change the plant culture to support root cause problem elimination. Reporting structures and Key Performance Indicators (KPIs) must drive the organization in the right direction. For example, it is still common to see problems classified by department. Downtime is often reported in four or five categories such as Operations, Electrical maintenance, Mechanical maintenance, Instrumentation, and sometimes Process control/Automation. Lost production reported by department breaks the first rule of root cause problem elimination. The rule is to ask why and not who. Many of you will argue that we need to know where to spend the resources, and therefore we ask who. The issue is that you do not know, since you classified the problem before a root cause investigation was completed.

A typical example may be that a motor tripped at 2:00 am. In the morning meeting we ask, what happened? Well, a motor tripped, so it’s not operations, it is a maintenance issue. It will be classified as electrical because it was an electric motor that tripped. The thought process may look something like the picture below. The actual story is that operations overloaded the process, and therefore the motor tripped. The E/I mechanic reset the motor but will never tell anyone what happened, because the mechanic understands the culture in the organization and does not want to put a friend in operations in a bad spot.
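One way to picture a "why, not who" report is a downtime log that records the symptom and leaves the cause open until someone has investigated, instead of assigning a department up front. The sketch below is only an illustration of that idea: the record layout, the function name, and the minutes lost are hypothetical assumptions, and only the motor trip story comes from the example above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class DowntimeEvent:
    symptom: str                       # what was observed
    minutes_lost: int                  # hypothetical figures, for illustration
    root_cause: Optional[str] = None   # filled in only after investigation


def downtime_by_cause(events):
    """Total lost minutes per root cause; uninvestigated events stay visible."""
    totals = defaultdict(int)
    for event in events:
        cause = event.root_cause or "Not yet investigated"
        totals[cause] += event.minutes_lost
    return dict(totals)


# Hypothetical log entries. No department field exists, so nothing can be
# classified as "electrical" or "operations" before the cause is known.
example_events = [
    DowntimeEvent("Motor tripped at 2:00 am", 45, "Process overloaded"),
    DowntimeEvent("Motor tripped", 30, "Process overloaded"),
    DowntimeEvent("Coupling failure", 120),  # cause not yet established
]

if __name__ == "__main__":
    report = downtime_by_cause(example_events)
    for cause, minutes in sorted(report.items(), key=lambda kv: -kv[1]):
        print(f"{cause}: {minutes} min")
```

Grouping by cause rather than department means the report shows what needs to be eliminated and makes visible how much lost production has not been investigated at all.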