Exception detection, also known as anomaly detection or outlier detection, is a term used in data analysis to refer to the identification of rare or unusual events. Exceptional behavior in a system often indicates an underlying problem.
For example, an asset that experiences unexplained spikes in heat may be on the verge of failure. Note that this is distinct from situations where the heat itself may cause failure, such as with temperature-based fuses. In terms of exception detection, we’re looking at the heat as a symptom, not a cause.
We can think of these exceptions as the clues that assets provide before failure. Noticing these indications and correctly interpreting their meaning enables us to prevent failures before they occur.
This is central to the concept of predictive maintenance: we listen to what our assets have to say about their condition and make our maintenance decisions accordingly. This ability goes well beyond what traditional maintenance KPIs have to offer.
Detecting exceptions and acting on them is the key to predictive maintenance, and therefore to preventing failure before it occurs. Continue reading to find out how exception detection can improve the accuracy of your predictive maintenance.
Traditionally, condition-based monitoring (CBM) has been done primarily through physical inspections. Thanks to technological advances, an increasing amount of the legwork is taken up by remote sensors that continuously collect operating data.
Data collected while an asset is running smoothly gives you a baseline. Exceptions to that baseline are the clues you are looking for.
There are three distinct categories of exception detection: unsupervised, supervised, and semi-supervised. All three categories require existing data sets to function, as it’s impossible to determine an outlier if you have nothing to work from. We’ll discuss each one briefly below.
Unsupervised detection finds exceptions by measuring how much each event deviates from the rest of an unlabeled data set. The assumption behind this method is that most instances in any data set will be “normal.” Events that show the least fit with the bulk of that data can be labeled as exceptions.
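One simple way to illustrate the unsupervised idea is a modified z-score built on the median and the median absolute deviation (MAD). The sketch below is only an illustration, not a production detector: the function name `mad_outliers`, the 3.5 cutoff, and the temperature readings are all assumptions made for the example.

```python
from statistics import median

def mad_outliers(readings, threshold=3.5):
    """Return indices of readings whose modified z-score exceeds `threshold`."""
    med = median(readings)
    abs_dev = [abs(x - med) for x in readings]
    mad = median(abs_dev)
    if mad == 0:
        return []  # no spread in the data, so nothing to measure against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]

# Hypothetical, unlabeled temperature readings from one asset
temps = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 88.5, 70.3, 69.7, 70.1]
print(mad_outliers(temps))  # [6]
```

No labels are supplied here. The reading at index 6 is flagged only because it fits the rest of the data least well, which is the assumption this method rests on.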
Supervised detection differs from the unsupervised approach by requiring data that has been deliberately labeled “normal” and “abnormal.” In other words, you’re giving the system clues as to what to look for.
For example, you can label vibration that falls outside a certain frequency range (measured in hertz) as abnormal. This data is then used to train an algorithm to watch for this and similar conditions. In theory, as the system sees more and more data, it gets better at detecting these exceptions.
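The vibration example can be sketched with a very small supervised learner. The code below uses a k-nearest-neighbors vote, chosen only because it fits in a few lines; the article does not prescribe an algorithm. The labeled frequencies, the function name `knn_predict`, and the choice of k = 3 are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train, point, k=3):
    """Classify `point` by majority vote among its k nearest labeled examples."""
    nearest = sorted(train, key=lambda ex: dist(ex[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled history: (vibration frequency in Hz,) paired with a label
labeled = [
    ((48.0,), "normal"), ((50.0,), "normal"), ((52.0,), "normal"),
    ((49.0,), "normal"), ((51.0,), "normal"),
    ((30.0,), "abnormal"), ((35.0,), "abnormal"),
    ((68.0,), "abnormal"), ((72.0,), "abnormal"),
]

print(knn_predict(labeled, (50.5,)))  # normal
print(knn_predict(labeled, (70.0,)))  # abnormal
```

Adding more labeled examples to `labeled` is the code-level equivalent of the system seeing more data: each prediction then draws on a richer history.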
Semi-supervised detection uses a “normal” training data set to create a model representing normal behavior. From there, the algorithm is applied to a large set of unlabeled data. It compares each new observation against the model of normal behavior and uses the results to refine that model.
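A rough sketch of this idea, under stated assumptions: the model of normal behavior is just a mean and standard deviation, unlabeled readings that fall within `k` standard deviations are absorbed into the normal set, and the model is refit. The function name, the readings, and the values of `k` and `rounds` are illustrative and do not come from any particular library.

```python
from statistics import mean, stdev

def semi_supervised_flags(normal, unlabeled, k=3.0, rounds=3):
    """Return the unlabeled readings that still look abnormal after refinement."""
    known_normal = list(normal)
    flagged = list(unlabeled)
    for _ in range(rounds):
        # Model of "normal": mean and standard deviation of trusted readings
        mu, sigma = mean(known_normal), stdev(known_normal)
        inside = [x for x in flagged if abs(x - mu) <= k * sigma]
        if not inside:
            break  # nothing new to learn from, so stop refining
        known_normal.extend(inside)  # refine the model with new normal points
        flagged = [x for x in flagged if abs(x - mu) > k * sigma]
    return flagged

# Hypothetical readings: a small trusted set plus a larger unlabeled batch
normal = [70.0, 70.2, 69.8, 70.1, 69.9]
unlabeled = [70.3, 69.6, 70.4, 75.0, 70.0, 64.2]
print(semi_supervised_flags(normal, unlabeled))  # [75.0, 64.2]
```

One caution with this kind of self-refinement: if abnormal readings are absorbed early, the model of normal can drift, so the cutoff `k` is worth choosing conservatively.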
Machine-learning-enabled predictive maintenance is a natural evolution of previously developed maintenance tactics. The industry started with reactive maintenance, where you fix failures as they come up. Preventive maintenance was developed next and sought to prevent failures before they occurred by performing maintenance on a set schedule determined by time, cycles, or a combination of both.
Predictive maintenance then emerged from the aviation industry following the publication of F. Stanley Nowlan and Howard Heap’s landmark study, Reliability-Centered Maintenance. Among many other findings, Nowlan and Heap determined that preventive maintenance was not always the most effective or efficient method of preventing certain failures. Instead, they argued for a model where every asset receives appropriate maintenance, whether that’s preventive or predictive.
The key to predictive maintenance is to perform maintenance only when it’s really required to keep the asset operating optimally. Preventive maintenance, with its set schedules, may result in “extra” maintenance work being done. Predictive maintenance techniques eliminate this.
Initially, predictive maintenance depended on physical inspections and predefined threshold rules. However, this approach is very time- and labor-intensive. High costs mean it can only ever be applied to the most critical assets. Even in those cases, it’s likely only used when preventive maintenance has been historically insufficient to prevent failure.
As John Soldatos points out in his blog, “Your Predictive Maintenance Capabilities will be Enhanced by Big Data Technologies,” Big Data provides what’s needed to overcome the challenges of implementing predictive maintenance.
Today, predictive maintenance has evolved, replacing inspections with sensors and threshold rules with machine learning algorithms. Predictive maintenance of this type relies on exception detection. In turn, meaningful exception detection relies on having solid data that indicates normal operating thresholds for temperature, lubrication, vibration, etc., and historical data that shows the range of potential failures.
Additionally, you need data on which conditions indicate potential failure. Without this, you won’t be able to train your algorithm to tell which exceptions are just noise and which indicate serious problems.
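To make the idea of normal operating thresholds concrete, here is a minimal sketch of the kind of fixed rule that machine learning is meant to replace or supplement. The sensor names, the ranges, and the `out_of_range` helper are hypothetical placeholders, not values from any real asset.

```python
# Hypothetical normal operating ranges: sensor name -> (low, high)
NORMAL_RANGES = {
    "temperature_c": (40.0, 85.0),
    "vibration_mm_s": (0.0, 4.5),
    "oil_pressure_kpa": (200.0, 450.0),
}

def out_of_range(reading, ranges=NORMAL_RANGES):
    """Return the names of sensors whose values fall outside their normal range."""
    violations = []
    for name, value in reading.items():
        low, high = ranges[name]
        if not low <= value <= high:
            violations.append(name)
    return violations

# One hypothetical snapshot of sensor values for a single asset
reading = {"temperature_c": 91.2, "vibration_mm_s": 3.1, "oil_pressure_kpa": 180.0}
print(out_of_range(reading))  # ['temperature_c', 'oil_pressure_kpa']
```

A rule like this can say that a reading is unusual, but not whether it matters. Separating noise from serious problems is where historical failure data comes in.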
Previous maintenance tactics all involve certain trade-offs. Reactive maintenance is only performed when it’s truly required, but it can interfere with production. Preventive maintenance can prevent failures, but at the cost of performing “extra” maintenance that may not actually be required to prevent a failure.
Traditional predictive maintenance tries to avoid both problems by anticipating failures, but it requires tremendous effort in terms of inspections and relies either on the tribal knowledge of the inspector or on a very rigorous set of rules to determine exactly when maintenance should be performed.
The proliferation of sensors and the increase in computing power in recent years mean that true predictive maintenance is now available to more organizations. Machine learning can be used to refine models and produce more accurate results over time. In turn, this means more efficient maintenance.
Real-time data collection, combined with the application of machine learning algorithms, provides a deeper understanding of how assets behave and enables more accurate prediction of when failures are most likely to occur. This enables maintenance organizations to anticipate their needs and plan accordingly.
When set up properly, your sensors and the data they provide can be very useful in helping you build a solid foundation for predictive maintenance.
Once you find a normal range for your data, you can start looking into the exceptions and the outliers to determine where the abnormalities are occurring and how you can predict them in the future.