Root cause analysis is well-known as a technique for determining the true underlying cause of an asset failure. This is probably the most common use of root cause analysis (RCA) in maintenance and asset management.
Determining exactly why a particular asset failed can help you to ensure history doesn’t repeat itself. However, RCA can also be used to turn a critical eye on your last shutdown, turnaround, or outage (STO).
The goal is the same as with almost any other application of RCA techniques: finding out what really went wrong so you can ensure it doesn’t happen again. Using RCA after the conclusion of an STO isn’t common, but maybe it should be.
Making root cause analysis part of your STO may require some changes in how your organization conducts its events. Typically, members of the STO team are exhausted after the conclusion of the event.
Key resources may take paid time off (PTO) or move to other projects. By the time everyone is back on-site, it’s time to prepare for the next event. If you want to perform root cause analysis on your STO events, you’ll need buy-in from the STO team.
There seems little doubt that many STO events could be improved. According to data gathered by IDC Technologies and published in “Practical Shutdown and Turnaround Management for Engineers and Managers,” more than 90 percent of the STO events included in the analysis failed to meet overall business goals, specific turnaround goals, or both. This is such a high number that it might be more accurate to say that STOs are rarely a complete success.
There are steps you can take during the planning and execution stages of an STO to give yourself the best chance of success. According to the report above, almost 90 percent of the STOs studied experienced scope growth of between 10 and 50 percent.
Some scope growth is practically inevitable during an STO, but we’ve assembled 9 tactics you can use to control scope creep and give your team a better chance of success. Adaptive planning is also helpful in STOs, infusing agility and mobility into the planning and execution phases.
These are forward-looking techniques that can help bring your STO to a successful conclusion. Unfortunately, no amount of controlling scope creep will tell you precisely why your organization’s last STO event went completely sideways or prevent it from happening again. For that, you need RCA techniques.
In the next section, we’ll briefly discuss some of the more common types of RCA and how you can put them to use when critiquing a previous STO. Some techniques, such as Failure Mode and Effects Analysis (FMEA), aren’t easily applicable to STOs and therefore are not covered here.
The Pareto principle states that approximately 80 percent of consequences stem from approximately 20 percent of the possible causes. In other words, while many things may have gone wrong during your organization’s last STO, a large percentage may have a single root cause.
For example, we can easily see how a single cause such as “no attempt to control scope creep” could lead to execution problems for every work order included in the STO as there simply aren’t enough skilled people available to do the work.
We should mention that the Pareto principle isn’t written in stone. It’s merely a way to guide your thinking. Don’t be surprised if your analysis doesn’t match up with that 80/20 rule.
Pareto charts are a handy way to visualize what’s causing most of your problems and therefore where you should concentrate your efforts. The charts themselves are usually bar charts combined with a line graph. The bars represent types of failure that occurred, such as lack of available contractors or missing parts. The line shows the cumulative percentage of failures as you move left to right.
Building a Pareto chart is a useful exercise for deciding where to concentrate your efforts. It’s also handy for showing other stakeholders exactly where the biggest problems lie.
Creating the chart itself is relatively simple once you have the data. Simply arrange the failures in descending order as bars and then determine what percentage of the whole is occupied by each category.
The first step in building a Pareto chart is collecting your data and determining your categories. The precise categories you use are up to you and will vary depending on the nature of your organization. For example, you may choose to sort "missing parts" and "wrong parts" into separate categories, but someone else may elect to lump both of those together as "parts unavailable."
The failure types are listed from left to right in descending order. If your most common issue was that materials weren’t available when needed, you would list that first. The first point on the line graph simply gives that category’s share of all failures as a percentage. The second point gives the percentages of the first and second failure types added together, and so on until the line reaches 100 percent at the last category.
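The arithmetic behind the chart can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names `pareto_table` and `vital_few`, the category labels, and the counts in `issue_log` are all made up for the example, and the 80 percent threshold is only the rule-of-thumb figure from the Pareto principle.

```python
from collections import Counter

def pareto_table(issues):
    """Return (category, count, share %, cumulative %) rows, largest category first."""
    counts = Counter(issues)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in counts.most_common():  # sorted by count, descending
        running += count
        rows.append((category, count, 100 * count / total, 100 * running / total))
    return rows

def vital_few(rows, threshold=80):
    """Return the leading categories needed to reach the cumulative threshold."""
    selected = []
    for category, _count, _share, cumulative in rows:
        selected.append(category)
        if cumulative >= threshold:
            break
    return selected

# Hypothetical issue log from one STO: one entry per recorded problem.
issue_log = (
    ["materials not available"] * 12
    + ["contractor shortage"] * 5
    + ["permit delays"] * 2
    + ["drawing errors"] * 1
)

rows = pareto_table(issue_log)
for category, count, share, cumulative in rows:
    print(f"{category:<26}{count:>4}{share:>8.1f}%{cumulative:>8.1f}%")
print("Focus on:", vital_few(rows))
```

The bars of the chart come from the count column and the line from the cumulative column. With these made-up counts, the first two categories account for 85 percent of the logged issues, so that is where the effort would go first.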
The 5 Whys technique relies on repeatedly asking “why” until the root cause is discovered. Sakichi Toyoda, founder of Toyota Industries, came up with the idea in the 1930s and it’s still in use today. The process looks something like this:
Q: Why did work order X take longer than scoped?
A: The technician didn’t have parts.
Q: Why didn’t the technician have parts?
A: The vendor didn’t deliver them on time.
Q: Why didn’t the vendor deliver them on time?
…and so on until you find the true cause of the issue, at which point the countermeasure is usually obvious. The 5 Whys type of RCA typically refers to “countermeasures” rather than “solutions.” The idea is that a countermeasure prevents the issue from coming up again at its root, rather than just treating the symptoms.
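If you want to keep a record of a 5 Whys session, the chain of questions and answers can be stored as a simple list. The sketch below is only one possible way to do it: the `Why` class and the helper names are invented for illustration, and the chain stops where the example above stops, with the last question left unanswered.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Why:
    question: str
    answer: Optional[str] = None  # None means the question is still open

def next_question(chain):
    """Return the first unanswered question, or None if every why has an answer."""
    for step in chain:
        if step.answer is None:
            return step.question
    return None

def deepest_answer(chain):
    """Return the last answer recorded so far (the current candidate root cause)."""
    answered = [step.answer for step in chain if step.answer is not None]
    return answered[-1] if answered else None

chain = [
    Why("Why did work order X take longer than scoped?",
        "The technician didn't have parts."),
    Why("Why didn't the technician have parts?",
        "The vendor didn't deliver them on time."),
    Why("Why didn't the vendor deliver them on time?"),  # not yet answered
]

print("Still to answer:", next_question(chain))
print("Deepest cause so far:", deepest_answer(chain))
```

Keeping the open question visible makes it clear when the analysis has stalled at a symptom rather than reached a cause that a countermeasure can address.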
It’s easy to get started on a 5 Whys analysis. Practically anyone can ask questions. Answering them, however, requires expertise. You will likely have to consult with other stakeholders to make full use of this technique.
The big weakness of 5 Whys is that it’s very linear and tends to follow a single track. Using this technique, you may believe that you’ve found the root cause when you’ve only found a root cause. Any failure can have multiple causes, and the 5 Whys may not reveal this.
Cause-and-effect analysis was developed by Professor Kaoru Ishikawa of the University of Tokyo. It is less linear than the 5 Whys, allowing you to assign multiple causes to a single problem. In fact, the method relies on you listing all the possible causes.
An Ishikawa diagram is also called a fishbone diagram, as it resembles the skeleton of a fish. You construct a fishbone diagram by writing down the failure you wish to analyze. This forms the fish’s mouth.
Next, you fill in all the categories of failure coming off the fish’s spine. You can see some of the more common categories in the accompanying diagram, but the categories you use should be tailored to your situation and organization.
You can then fill in all the possible causes of failure under the appropriate category or categories. The final stage is to ask questions, like the 5 Whys, until you arrive at the root cause for each item. This allows you to put countermeasures in place.
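A fishbone diagram is easy to keep as structured data between events, which also makes it simple to extend later. The following is a rough sketch under stated assumptions: the category names (“People,” “Materials,” “Planning”) are placeholders rather than a recommended set, the helper names are hypothetical, and the causes are borrowed from examples elsewhere in this article.

```python
def add_cause(bones, category, cause):
    """Add a cause under a category, creating the category if needed and skipping duplicates."""
    causes = bones.setdefault(category, [])
    if cause not in causes:
        causes.append(cause)

def render_fishbone(effect, bones):
    """Return a plain-text outline: the effect (the fish's mouth), then each category and its causes."""
    lines = [f"Effect: {effect}"]
    for category, causes in bones.items():
        lines.append(f"  {category}")
        for cause in causes:
            lines.append(f"    - {cause}")
    return "\n".join(lines)

# Placeholder categories; tailor these to your own organization.
bones = {
    "People": ["lack of available contractors"],
    "Materials": ["missing parts"],
    "Planning": ["no attempt to control scope creep"],
}

# A cause discovered later can be added without redrawing the diagram.
add_cause(bones, "Materials", "vendor didn't deliver parts on time")
add_cause(bones, "Materials", "missing parts")  # duplicate, ignored

print(render_fishbone("going over time", bones))
```

Each listed cause then becomes the starting question for its own round of whys.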
Pareto charts and the 5 Whys concentrate on examining failures after they’ve happened. In the context of STOs, they are primarily useful for improving on the results of previous shutdowns. One of the advantages of cause-and-effect analysis for STOs is that you can use it to examine all the possible ways your shutdown can go wrong before the event takes place. This is especially useful if you don’t have a large bank of data to draw from.
Fishbone diagrams are most often used in asset management as part of a preventive management strategy. We can apply them to STOs in the same way.
If we take the broadest view, there are three ways STOs can fail: going over budget, going over time, or not completing the agreed-upon scope. We can start with any one of those as our fish’s “mouth” and build out the rest of the diagram accordingly.
It may not be possible for you to run an exhaustive cause-and-effect analysis for a cause of failure as broad as “going over budget.” Assembling a truly complete list of possible root causes requires a tremendous amount of time and effort.
However, you can find value in cause-and-effect analysis just from running it as a thought exercise for your STO team. Getting together and brainstorming the ways the shutdown can fail will highlight some of the countermeasures you need to put in place. You aren’t likely to list every possible root cause, but every potential failure you eliminate increases your chances of success.
Running the STO will reveal more possible points of failure. You can then add these to the diagram, ensuring that they do not happen again.
Regardless of the type of RCA technique you use, it is essential to keep track of the results from shutdown to shutdown and keep adding to them as you discover new ways the STO can go off-track and ways you can prevent that from happening. The biggest wastes of time and money occur when we don’t learn our lessons.
Before moving key resources to the next STO or project, take a few days to do a deep dive and document any failures using Pareto analysis or the 5 Whys. These analyses will prove helpful for the next event!
Prometheus STO Planner helps you to keep track of the often-painful lessons learned during previous STOs with a special section for storing documentation. This helps ensure that what has been learned during previous STOs stays with the organization, regardless of the personnel who planned and executed the shutdown.
For more information on keeping your STO event on track, please see our whitepaper, “Centralizing Your Shutdown, Turnaround, or Outage: Creating Alignment Between Your STO Plan, Process, and Team,” or give us a call today.