How to build a high-performance engineering team

Alcoa commits to reliability.

In 2003, the Alcoa Warrick Smelter was 43 years old and had the second-highest maintenance costs in the corporation’s global smelting system. Asset reliability in the plant continued to suffer, and equipment instability prevented the plant from fully implementing lean manufacturing tools. A formal assessment of the smelter’s repair and maintenance (R&M) efforts found a mostly reactive approach, focused on responding well to emergency breakdowns.

That year, the location’s top management backed a Reliability Excellence (REX) journey, which brought a significant transformation. Ten years later, in 2013, Alcoa Warrick Smelter’s R&M costs are 29% below the 2003 pre-REX base (44% lower after adjusting for inflation), and gains in overall equipment effectiveness (OEE) have matched the R&M savings dollar-for-dollar annually.
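The nominal and inflation-adjusted figures can be cross-checked with a little arithmetic. The sketch below assumes the inflation-adjusted ratio is simply the nominal ratio divided by a cumulative price-level factor; the article does not say which price index was used, so the function name and the implied numbers are illustrative only.

```python
def implied_inflation_factor(nominal_reduction, real_reduction):
    """Cumulative price-level factor implied by a nominal and a real cost reduction.

    Assumption (not stated in the article): real_ratio = nominal_ratio / factor.
    """
    nominal_ratio = 1.0 - nominal_reduction  # 2013 cost / 2003 cost, nominal dollars
    real_ratio = 1.0 - real_reduction        # same ratio in constant dollars
    return nominal_ratio / real_ratio


# Figures quoted in the article: 29% nominal, 44% inflation-adjusted, over 10 years.
factor = implied_inflation_factor(0.29, 0.44)
annual_rate = factor ** (1 / 10) - 1

print(f"implied cumulative inflation factor: {factor:.3f}")
print(f"implied average annual rate: {annual_rate:.1%}")
```

Under that assumption, the two percentages imply a cumulative price increase of roughly 27% over the decade, or about 2.4% a year.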

A formal asset integrity audit performed in 2010 by corporate-level resources confirmed that these cost savings were real; in other words, they weren’t gained by simply deferring R&M. In fact, of all the corporation’s global smelters, the Warrick Smelter had the lowest percentage of corrective actions needing attention in the next five years.

Building a high-performance engineering team contributed to the success of the reliability excellence approach now in use for the smelting business at Alcoa Warrick Operations.

Importance of reliability engineers

Reliability engineering is different from a traditional engineering role in manufacturing. Reliability engineers do not provide the routine, day-to-day support for production centers. Instead, they fill a strategic role: they focus on failure prevention and, most importantly, on helping to determine how to improve reliability and operate the plant’s assets at the lowest cost.

Do you have reliability engineers at your plant? If so, what types of tasks are they doing? Are they managing capital projects? Are they firefighting? Are they in tactical roles? If so, they’re not reliability engineers.

If a problem isn’t solved to root cause, it may keep recurring. If a plant doesn’t know which assets are the most critical, it may be focusing on the wrong things. If a facility isn’t using equipment failure data to direct its resources toward the true bad actors among its equipment, then it is most likely leaving a lot of money on the table.
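One common way to find bad actors in failure data is a Pareto ranking of downtime by asset. The following is a minimal sketch of that idea, not the method used at Warrick; the function name, the 80% cutoff, and the asset names and hours are all hypothetical.

```python
from collections import defaultdict


def rank_bad_actors(failure_events, cutoff=0.8):
    """Return (asset, downtime_hours) pairs, worst first, that together
    account for at least `cutoff` of all recorded downtime."""
    downtime_by_asset = defaultdict(float)
    for asset, hours in failure_events:
        downtime_by_asset[asset] += hours

    total = sum(downtime_by_asset.values())
    ranked = sorted(downtime_by_asset.items(), key=lambda kv: kv[1], reverse=True)

    bad_actors, running = [], 0.0
    for asset, hours in ranked:
        # Stop once the assets already selected cover the cutoff share.
        if total == 0 or running / total >= cutoff:
            break
        bad_actors.append((asset, hours))
        running += hours
    return bad_actors


# Hypothetical failure log: (asset, downtime hours per event).
events = [
    ("crane-2", 12.0), ("pot-line-fan", 3.0), ("crane-2", 20.0),
    ("conveyor-7", 40.0), ("compressor-1", 5.0), ("conveyor-7", 15.0),
    ("pot-line-fan", 5.0),
]

for asset, hours in rank_bad_actors(events):
    print(f"{asset}: {hours:.0f} h")
```

In this made-up log, two of the four assets account for 87 of the 100 downtime hours, which is the kind of concentration that tells a plant where to direct its resources first. Ranking by repair cost or lost production instead of hours would follow the same pattern.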

Reliability engineers help with all of that and lots more.

But we like firefighting

How many times have you experienced a major equipment failure at your plant, and felt relieved when it was over? We all have praised our firefighters; these are the individuals who excel in a crisis and, in many cases, thrive during every minute of it.

We need them. There is no doubt that when a production center is interrupted, we need resources to respond. And when a major downtime event occurs, we need people with strong troubleshooting skills and those who can get our equipment back up and running again. These are the ones who are working hard to reduce mean time to repair (MTTR). They could be engineers, technicians, craftsmen, or others. And when they get the equipment running again, we thank them and feel the weight lifted off our shoulders. How many of us have given lavish praise to these “knights in shining armor” when they swoop in to save the day? We probably all have, and that is not a bad thing, but what we sometimes forget in the heat of the moment is to step back and ask, “How did we get into this mess in the first place? Why did this failure occur?” And most importantly, “What are we going to do to prevent it from happening again?”

This is where the reliability engineer steps in. Reliability engineers are focused not on MTTR but on mean time between failures (MTBF). While others are on the scene, working to do whatever it takes to restore immediate production flow, the reliability engineers should be there investigating what happened. They will talk with the operators, review operational data and trends, take photographs of the scene, pull up the history of similar incidents, review camera footage if available, and try to piece together all the available evidence. This is the detective work that will enable them to lead a root cause analysis (RCA) to determine why the failure occurred.
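The difference between the two metrics is easy to see in numbers. The sketch below uses the conventional definitions (MTBF as operating time divided by number of failures, MTTR as total repair time divided by number of repairs); the function name and the sample figures are hypothetical.

```python
def mtbf_and_mttr(period_hours, repair_hours):
    """Compute (MTBF, MTTR) for one asset over an observation period.

    period_hours: total calendar hours observed
    repair_hours: list with the repair duration of each failure in the period
    """
    failures = len(repair_hours)
    if failures == 0:
        raise ValueError("no failures recorded; MTBF and MTTR are undefined")
    downtime = sum(repair_hours)
    uptime = period_hours - downtime
    return uptime / failures, downtime / failures


# Hypothetical month: 720 hours observed, three failures.
mtbf, mttr = mtbf_and_mttr(720.0, [4.0, 6.0, 2.0])
print(f"MTBF: {mtbf:.0f} h, MTTR: {mttr:.0f} h")  # MTBF: 236 h, MTTR: 4 h
```

Faster repairs shrink the second number. Eliminating a failure mode, which is the reliability engineer's job, raises the first.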

And speaking of RCA, it may seem obvious, but you need proper follow-up mechanisms to ensure the RCA action items are completed. Have your reliability engineers keep these corrective and preventive action items in front of your teams so that the tasks get done. Archive your RCA files so they are easily accessible later. And if the failure returns, retrieve the previous RCA and review it to understand what may have been missed and why the failure recurred.

Free-time reliability engineering

Focus. That is one of the most important parts of our success. If you have a reliability engineer in a hybrid role — doing some maintenance engineering or some project engineering — then you don’t have a true reliability engineer. In a hybrid role, the crisis of the day or a production manager’s pet project can take precedence over working on long-term objectives. If you expect reliability engineers to work on reliability when they can find the time, you aren’t going to make the gains you’re seeking. They must be focused.

Yes, we all are busier than ever these days. We all wear multiple hats. But the reliability engineering role is one where we must discipline ourselves to keep the engineers focused solely on failure elimination and prevention. When you pull a reliability engineer off a proactive task to work on a reactive task, you are losing ground in your reliability efforts. Set yourself up with a maintenance engineer or maintenance professional to handle the tactical production needs so that the reliability engineer can focus on the strategic efforts.