A survey of signal processing and decision technologies for CBM

In previous chapters we learned that Condition-based Maintenance recommends actions based on information acquired through observation and analysis. We noted, moreover, that the CBM process itself contains three sub-processes or steps: data acquisition, signal processing, and maintenance decision making.

Figure 12‑1 The three CBM steps

Chapter 5. Case based reasoning (page 70) pointed out, in regard to complex systems, that prognostics are often indistinguishable from diagnostics, where both aim to identify the occurrence of a potential failure.

Hundreds of theoretical and practical research papers on CBM appear every year in scientific journals, conference proceedings and technical reports. In this chapter we provide an overview of recent developments in the diagnostics and prognostics of systems. We will mention a number of models, algorithms, and technologies for signal processing and maintenance decision making. Given the increased use of multiple sensors, we will also discuss various techniques for data fusion. The chapter concludes with a brief discussion of current practices and possible future trends in CBM. The purpose of this survey of advanced methods of signal processing and decision making is not to instruct the reader in the use of these new techniques, but merely to provide the maintenance professional with references to the source material so that he or she can investigate alternatives when encountering various situations where a CBM solution is proposed.

Reliability has always been an important criterion in the selection of industrial equipment. Good equipment design is essential for processes requiring high reliability. However, no amount of design effort will prevent deterioration over time. Machinery and systems operate under stress in an environment that is characterized by randomness. Maintenance is the major way in which we assure the user of the asset a satisfactory level of reliability. Physical asset managers look towards CBM as an efficient form of maintenance which, they expect, will assist them in the avoidance or reduction of risk. That is, they seek to reduce, to an acceptable level, the combined impact of the probability of failure and its consequences. A CBM program, if properly established and effectively implemented, can significantly reduce overall cost by reducing the number and/or extent of unnecessary preventive maintenance operations, while still achieving the desired reliability.

Let us begin by reviewing, briefly, the first CBM step, data acquisition.

Data acquisition

Data acquisition, the essential first step in the CBM task, is a process for collecting and storing useful information that emanates from operating physical assets. Data collected in a CBM program is of two main types: “event” data and condition monitoring (CM) data. Event data tells us what happened, for example, an installation, a breakdown, or an overhaul. Event data also tells us what was done, for example, a minor repair, a preventive maintenance action, an oil change, and so on. CM data consists of observational measurements that we believe are, in some way, related to the deteriorating health or state of the physical asset.

CM data can include vibration data, acoustics data, oil analysis data, temperature, pressure, moisture, humidity, and any other physical observations, including visual clues, that relate to the condition of an operating physical asset in its environment. A variety of sensors (microsensors, ultrasonic sensors, acoustic emission sensors, thermographic imagers, etc.) have been designed to collect different types of data [11,12]. Wireless technologies such as Bluetooth have provided an alternative to more expensive hard-wired data communication. Information systems such as Computerized Maintenance Management Systems (CMMS), Enterprise Resource Planning (ERP) systems, control system historians, and CBM databases have been developed for data storage and handling[13]. With the rapid development of computer and advanced sensor technologies, data acquisition technologies have become more powerful and less expensive, resulting in exponentially growing databases of CM data.

Event data and CM data are equally important in CBM. In practice, however, engineers and managers tend to place more emphasis on the latter and sometimes neglect the former. The habit of overlooking event data may have grown from the mistaken belief that it adds little to fault prediction as long as condition monitoring appears to be working well. We also tend to overlook event data, in part, because we lack the knowledge and methods to use it. Yet event data is at least as helpful as CM data in assessing machine health: it augments our ability to judge the significance of CM data with respect to specific failure modes. The use of event data is further discouraged by the fact that its collection usually implies manual data entry, and once a human is involved, everything becomes more complicated and error-prone. Choosing the “simple” solution, that of removing the human element, is hasty and ill-advised. Rather, as we discovered in Chapter 11. Information Procedures for Optimized CBM Policies (page 155), it is preferable to equip humans with tools and procedures with which to capture event data accurately, in a meaningful format, and in sufficient detail.

Signal processing

Under the topic of signal processing we include a necessary preliminary step - data cleaning. Data, especially event data, particularly when it is entered manually, always contains errors. Data cleaning is meant to ensure that clean (error-free) data is used for subsequent analysis and modeling. Data errors are caused by many factors, including the human factor mentioned previously. Errors in CM data may be caused by sensor faults, which are handled by sensor fault isolation[14]. In general, there is no simple, single method for cleaning data; sometimes manual examination is required, and graphical tools are helpful in finding and removing data errors. Data cleaning is a vast subject area in its own right. In Example 2 Data validation on page 129 (Chapter 10) we touched upon various aspects of data cleaning.
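
A minimal illustration of rule-based data cleaning: a range check that separates plausible readings from sensor error codes and spikes. The limits, readings, and the -999 error code below are invented for the sketch; real validation rules come from the sensor and process specifications.

```python
def clean_readings(readings, lo, hi):
    """Split readings into plausible values and rejects needing review."""
    clean, rejected = [], []
    for i, v in enumerate(readings):
        # Keep the original index so rejects can be traced back
        (clean if lo <= v <= hi else rejected).append((i, v))
    return clean, rejected

# Bearing temperature readings (deg C); -999 is a typical sensor error code
raw = [62.1, 63.0, -999.0, 64.2, 480.5, 63.8]
clean, rejected = clean_readings(raw, lo=-40.0, hi=150.0)
```

In practice such checks are only a first pass; cross-channel consistency checks and graphical review catch errors that single-variable limits cannot.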

The next step in signal processing is data analysis. A variety of models, algorithms and tools are described in the technical literature. Their purpose is to analyze data in order to better understand and interpret it. The choice of which model, algorithm, or tool to use for data analysis depends primarily on the type of data collected. Condition monitoring data falls into three principal types:

Value: Data collected at a specific time epoch as single-valued variables. For example, oil analysis data, temperature, pressure and humidity are all value type data.

Waveform: Data collected at a specific time epoch as a time series of values. For example, vibration data and acoustic data are of the waveform type.

Multi-dimension: Data collected at a specific time epoch as multi-dimensional values. The most common multi-dimensional data is image data, for example infrared thermographs, X-ray images, visual images, etc.

Although we have been using the term more broadly to describe the entire data analysis phase of CBM, “signal processing” usually refers most specifically to waveform and multi-dimension data analysis. A large variety of signal processing techniques have been developed to analyze and interpret these types of data. Their purpose is to extract useful information from the raw signal in order to perform diagnostics and prognostics. The signal processing procedure for extracting information relevant to targeted failure modes is often called “feature extraction”.

There are numerous signal processing techniques and algorithms in the literature for the diagnostics and prognostics of mechanical systems. Case-dependent knowledge and investigation are required to select the appropriate signal processing tools from among a large number of possibilities.

Waveform data analysis

The most common waveform data in condition monitoring are vibration signals and acoustic emissions. Other waveform data include ultrasonic signals, motor current, partial discharge, and others. In the literature, there are three main categories of waveform data analysis: time-domain analysis, frequency-domain analysis and time-frequency analysis.

Time-domain analysis is directly based on the time waveform itself. Traditional time-domain analysis calculates characteristic features from time waveform signals as descriptive statistics, for example: mean, peak, peak-to-peak interval, standard deviation, crest factor, RMS (root mean square), and higher-order statistics such as skewness and kurtosis. These features are usually called time-domain features. A popular time-domain analysis approach is time synchronous averaging (TSA). The idea of TSA is to use the ensemble average of the raw signal over a number of revolutions in an attempt to remove or reduce noise and effects from other sources, so as to enhance the signal components of interest. A brief review of TSA was given by Dalpiaz[15] and some drawbacks of TSA were pointed out by Miller[16]. Most of the references on TSA can be found in [15,16].
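
As an illustration, the time-domain features listed above can be computed directly from a sampled record. The sketch below uses a synthetic sine wave, not data from a real machine:

```python
import math

def time_domain_features(x):
    """Compute common time-domain features of a sampled waveform."""
    n = len(x)
    mean = sum(x) / n
    peak = max(abs(v) for v in x)
    peak_to_peak = max(x) - min(x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    crest_factor = peak / rms                       # impulsiveness indicator
    skewness = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    kurtosis = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    return {"mean": mean, "peak": peak, "peak_to_peak": peak_to_peak,
            "rms": rms, "std": std, "crest_factor": crest_factor,
            "skewness": skewness, "kurtosis": kurtosis}

# Synthetic signal: one full period of a sine wave
signal = [math.sin(2 * math.pi * k / 64) for k in range(64)]
feats = time_domain_features(signal)
```

For a pure sine wave the crest factor is sqrt(2) and the kurtosis is 1.5; a developing impulsive defect (e.g. in a rolling element bearing) typically drives both upward, which is why these features are popular condition indicators.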

More advanced approaches to time-domain analysis apply time series models to waveform data. The main idea of time series modeling is to fit the waveform data to a parametric time series model and extract features based on this parametric model. The popular models used in the literature are the AR (autoregressive) model and the ARMA (autoregressive moving average) model. An ARMA model of order (p, q), denoted by ARMA(p, q), is expressed by

x_t = φ_1 x_{t-1} + … + φ_p x_{t-p} + ε_t + θ_1 ε_{t-1} + … + θ_q ε_{t-q}

where x_t is the waveform signal, the ε_t are independent normally distributed with mean 0 and constant variance σ², and the φ_i and θ_j are model coefficients. An AR model of order p is a special case of ARMA(p, q) with q = 0. Poyhonen et al[17] applied an AR model to vibration signals collected from an induction motor and used the AR model coefficients as extracted features. Baillie and Mathew[18] compared the performance of three autoregressive time series modeling techniques (AR model, back propagation neural networks and radial basis function networks) for bearing fault diagnostics. Garga[19] proposed using AR modeling followed by dimension reduction for machinery fault diagnostics. Recently, Zhan[20] used a state space model representation of an AR model to analyze vibration signals for fault detection.
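
As a sketch of how AR coefficients can serve as features, the snippet below fits an AR(2) model to a synthetic signal by solving the Yule-Walker equations. The signal, its true parameters, and the model order are invented for illustration; real applications select the order by criteria such as AIC.

```python
import random

def autocovariance(x, lag):
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / n

def fit_ar2(x):
    """Fit an AR(2) model by solving the 2x2 Yule-Walker system."""
    r0, r1, r2 = (autocovariance(x, k) for k in (0, 1, 2))
    # Solve [r0 r1; r1 r0] [phi1; phi2] = [r1; r2] by Cramer's rule
    det = r0 * r0 - r1 * r1
    phi1 = (r0 * r1 - r1 * r2) / det
    phi2 = (r0 * r2 - r1 * r1) / det
    return phi1, phi2

# Simulate AR(2): x_t = 0.6 x_{t-1} - 0.2 x_{t-2} + eps_t
random.seed(42)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(0.6 * x[-1] - 0.2 * x[-2] + random.gauss(0.0, 1.0))

phi1, phi2 = fit_ar2(x)   # estimated coefficients, usable as features
```

The estimated (phi1, phi2) pair characterizes the signal's dynamics; shifts in these coefficients between healthy and faulty records are what the feature-based diagnostic methods exploit.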

There are many other time-domain analysis techniques to analyze waveform data for machinery fault diagnostics. Some of them are briefly described as follows. Wang et al [21] introduced three nonlinear diagnostic methods for rotating machine fault diagnosis. These three methods are pseudo-phase portrait, singular spectrum analysis and correlation dimension. Pseudo-phase portrait is simple for computer execution and is sensitive to some fault types. Wang and Lin[22] used a statistical approach known as singular value decomposition to obtain the pseudo-phase portrait. Singular spectrum analysis can reveal the complexity of a signal and reduce the noise. Correlation dimension can provide some intrinsic information of an underlying dynamical system. Koizumi[23] also considered application of correlation dimension to fault diagnosis. Wang et al[24] applied both correlation dimension and bispectrum for rotating machine fault diagnosis. Zhuge and Lu[25] proposed a modified least mean square algorithm to model the non-stationary impulse-like signals for reciprocating machine fault diagnosis. Baydar et al investigated the use of a multivariate statistical technique known as principal component analysis (PCA) in gear fault diagnostics[26].

Frequency-domain analysis is based on the transformed signal in the frequency domain. The advantage of frequency-domain analysis over time-domain analysis is its ability to easily identify and isolate certain frequency components of interest. The most widely used conventional analysis is spectrum analysis by means of the FFT (fast Fourier transform). The main idea of spectrum analysis is to either look at the whole spectrum or look closely at certain frequency components of interest and thus extract features from the signal (see, e.g.[27-29]). The most commonly used tool in spectrum analysis is the power spectrum, defined as S_x(f) = E[X(f) X*(f)], where X(f) (here and throughout this section) is the Fourier transform of the signal x(t), E denotes expectation and “*” denotes complex conjugate. Some useful auxiliary tools for spectrum analysis are graphical presentation of the spectrum, frequency filters, envelope analysis (also called amplitude demodulation)[30-32], side band structure analysis[33], etc. Descriptions of the above mentioned techniques for FFT based spectrum analysis can be found in textbooks such as[34,35] and will not be discussed in detail here. Another useful transform, the Hilbert transform, has also been used for machine fault detection and diagnostics[30,36].
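
The spectrum-analysis idea can be sketched with a naive discrete Fourier transform; a real application would use an FFT library, and the tone placed at bin 5 below is purely illustrative:

```python
import cmath
import math

def power_spectrum(x):
    """Naive one-sided DFT power spectrum |X(k)|^2 (O(n^2); use an FFT in practice)."""
    n = len(x)
    spectrum = []
    for k in range(n // 2):
        X_k = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(X_k) ** 2)
    return spectrum

# A sinusoid occupying frequency bin 5 of a 64-point record
x = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
spec = power_spectrum(x)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
```

Diagnostic features are then read off the spectrum, e.g. the power at a bearing defect frequency or at gear-mesh sidebands, rather than just the overall peak as in this toy case.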

Despite the wide acceptance of the power spectrum, other useful spectra for signal processing have been developed and have been shown to have their own advantages over the FFT spectrum in certain cases. The cepstrum has the capability to detect harmonics and sideband patterns in the power spectrum. There are several versions or definitions of cepstrum[35]. Among them, the power cepstrum, which is defined as the inverse Fourier transform of the logarithmic power spectrum, is the most commonly used. A modified cepstrum analysis was proposed in[37]. A high order spectrum, i.e. bispectrum or trispectrum, can provide more diagnostic information than the power spectrum for non-Gaussian signals. In the literature, the high order spectrum is also called high order statistics[38]. This name comes from the fact that bispectrum and trispectrum are actually the Fourier transforms of the third- and fourth-order statistics of the time waveform, respectively, but it can be confused with the time-domain high order statistics. Bispectrum and trispectrum are defined as

B(f1, f2) = E[X(f1) X(f2) X*(f1 + f2)]

and

T(f1, f2, f3) = E[X(f1) X(f2) X(f3) X*(f1 + f2 + f3)]

respectively. Bispectrum and trispectrum can be normalized to obtain bicoherence and tricoherence as

b(f1, f2) = |B(f1, f2)| / sqrt( E[|X(f1) X(f2)|²] E[|X(f1 + f2)|²] )

and

t(f1, f2, f3) = |T(f1, f2, f3)| / sqrt( E[|X(f1) X(f2) X(f3)|²] E[|X(f1 + f2 + f3)|²] )

respectively. Bispectrum analysis has been shown to have wide application in machinery diagnostics for various mechanical systems such as gears[39], bearings[40], rotating machines[41,42] and induction machines[43,24]. Li[44] investigated the application of the bispectrum diagonal slice to gear fault diagnostics. Yang[40] used bispectrum and bicoherence diagonal slices, the summed bispectrum, and the summed bicoherence for bearing fault diagnostics. Application of both bispectrum and trispectrum to bearing fault diagnostics was discussed in [45]. A new technique called holospectrum was introduced by Qu[46] to integrate all the information of phase, amplitude and frequency of a waveform signal. Application of holospectrum to machine fault diagnostics was studied in[47,48]. A review on holospectrum and its applications was given by Qu[49] (in Chinese).

Generally speaking, there are two classes of approaches for power spectrum estimation. The first covers the non-parametric approaches that estimate the autocorrelation sequence of the signal and subsequently apply a Fourier transform to the estimated autocorrelation sequence. For details, see[50]. The second class includes the parametric approaches that build a parametric model for the signal and then estimate power spectrum based on the fitted model. Among them, AR spectrum[51-53] and ARMA spectrum[54] based on AR model and ARMA model respectively are the two most commonly used parametric spectra for machinery fault diagnostics.

One limitation of frequency-domain analysis is its inability to handle non-stationary waveform signals, which are very common when machinery faults occur. Thus, time-frequency analysis, which investigates waveform signals in both the time and frequency domains, has been developed for non-stationary waveform signals. Traditional time-frequency analysis uses time-frequency distributions, which represent the energy or power of waveform signals as two-dimensional functions of both time and frequency. The short-time Fourier transform (STFT, also called the spectrogram)[55,56] and the Wigner-Ville distribution[57-60] are the most popular time-frequency distributions. Cohen[61] reviewed a class of time-frequency distributions which includes the spectrogram, the Wigner-Ville distribution, Choi-Williams and others. The idea of the spectrogram is to divide the whole waveform signal into segments with a short time window and then apply the Fourier transform to each segment. The spectrogram has some limitations in time-frequency resolution due to signal segmentation, and it can be applied only to non-stationary signals with slow change in their dynamics. Bilinear transforms such as the Wigner-Ville distribution are not based on signal segmentation and thus overcome the time-frequency resolution limitation of the spectrogram. However, bilinear transforms have one main disadvantage: interference terms formed by the transformation itself make interpretation of the estimated distribution difficult[62]. Improved transforms such as the Choi-Williams distribution have been developed to overcome this difficulty. Gu et al[63] applied singular value decomposition to extract features from the time-frequency distribution. Loughlin[64] used a set of conditional time-frequency moments as characteristic features for fault diagnosis.
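
A minimal spectrogram sketch: segment the signal, window each segment, and take its DFT. In practice one would use a library FFT with overlapping windows; the frequency-switching test signal and segment length here are fabricated:

```python
import cmath
import math

def spectrogram(x, seg_len):
    """STFT magnitudes: split x into Hann-windowed segments and DFT each."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len)
              for n in range(seg_len)]
    frames = []
    for start in range(0, len(x) - seg_len + 1, seg_len):
        seg = [x[start + n] * window[n] for n in range(seg_len)]
        mags = []
        for k in range(seg_len // 2):           # one-sided spectrum per frame
            X_k = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / seg_len)
                      for n in range(seg_len))
            mags.append(abs(X_k))
        frames.append(mags)
    return frames                               # frames[time][frequency bin]

# Non-stationary signal: bin-3 tone in the first half, bin-10 in the second
N, L = 256, 32
x = [math.sin(2 * math.pi * 3 * n / L) if n < N // 2
     else math.sin(2 * math.pi * 10 * n / L) for n in range(N)]
frames = spectrogram(x, L)
first_peak = max(range(L // 2), key=frames[0].__getitem__)
last_peak = max(range(L // 2), key=frames[-1].__getitem__)
```

The frequency shift between the early and late frames is exactly the kind of non-stationary behaviour that a single whole-record spectrum would smear together.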

Another transform for time-frequency analysis is the wavelet transform. Wavelet theory has developed rapidly in the past decade and has wide application[65]. A continuous wavelet transform is defined as

W(a, b) = (1/√a) ∫ x(t) ψ*((t − b)/a) dt

where x(t) is the waveform signal, a is the scale parameter, b is the time parameter and ψ is a wavelet, which is a zero-average oscillatory function centered around zero with finite energy, and “*” denotes complex conjugate. Commonly used wavelets are Morlet, Mexican hat, Haar, etc. Similar to the Fourier transform, the wavelet transform has its discrete form, which is obtained by discretizing a and b, and expressing x(t) in discrete form. Similar to the FFT, a fast wavelet transform is likewise available for the calculation.

Wavelet analysis of a waveform signal expresses the signal in a series of oscillatory functions with different frequencies at different times, by dilations via the scale parameter a and translations via the time parameter b. Similar to the power spectrum and the phase spectrum in Fourier analysis, a scalogram, defined as |W(a, b)|², and a wavelet phase spectrum, defined as the phase angle of the complex variable W(a, b), are used to interpret the signal. The wavelet transform has been successfully applied to fault diagnostics of gears[66,67], bearings[68,69] and other mechanical systems[70,71]. Dalpiaz and Rivola[72] assessed and compared the effectiveness and reliability of the wavelet transform with other vibration signal analysis techniques for fault detection and diagnostics. Baydar and Ball[73] applied the wavelet transform to both acoustic signals and vibration signals for gear tooth fault diagnostics. Addison et al[74] investigated the use of low-oscillation complex wavelets, Mexican hat and Morlet wavelets, as feature detection tools. Wavelet analysis using the Haar wavelet was considered in[75,76]. Miller[77] used a wavelet basis as a comb filter to decompose vibration signals for gear fault diagnostics. A graphical tool called wavelet polar maps, to display wavelet amplitude and phase, was proposed in[78] and was applied to gear fault diagnostics in[79]. A wavelet transform combined with a Fourier transform to enhance feature extraction capability was proposed in[80]. A more advanced transform, known as the wavelet packet transform, was studied and applied to machinery fault diagnostics in[81-83]. A new technique known as basis pursuit, based on a general wavelet packet dictionary, was applied to rolling element bearing fault diagnostics in[84]; it was shown that basis pursuit has some advantages over other commonly used wavelet analysis approaches. A recent review with more references on the applications of wavelet transforms in machine condition monitoring and fault diagnostics was given in[85].
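
The continuous wavelet transform can be sketched by direct numerical correlation of the signal with scaled Morlet wavelets. The three scales, the 1 Hz test tone, and the centre frequency choice w0 = 6 are assumptions for this illustration; real implementations use FFT-based convolution over a dense scale grid:

```python
import cmath
import math

OMEGA0 = 6.0   # Morlet centre frequency (a common choice)

def morlet(t):
    """Complex Morlet wavelet exp(i*w0*t) * exp(-t^2/2) (unnormalized)."""
    return cmath.exp(1j * OMEGA0 * t) * math.exp(-t * t / 2.0)

def cwt_coefficient(x, dt, scale, b):
    """W(a, b): correlate the sampled signal with the scaled, shifted wavelet."""
    total = 0.0 + 0.0j
    for n, v in enumerate(x):
        t = (n * dt - b) / scale
        total += v * morlet(t).conjugate() * dt
    return total / math.sqrt(scale)

# Tone at f0 = 1 Hz; the best-matching scale is roughly a = w0 / (2*pi*f0)
dt, f0 = 0.01, 1.0
x = [math.sin(2 * math.pi * f0 * n * dt) for n in range(1000)]
b_mid = 5.0                                    # centre of the 10 s record
scales = [0.3, OMEGA0 / (2 * math.pi * f0), 3.0]
mags = [abs(cwt_coefficient(x, dt, a, b_mid)) for a in scales]
best = scales[max(range(3), key=mags.__getitem__)]
```

Scanning b as well as a yields the scalogram |W(a, b)|²; a transient fault shows up as a localized ridge in that map rather than the steady ridge a stationary tone produces.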

Image processing

Image processing is similar to, but more complicated than, waveform signal processing due to the extra dimension involved. In practice, raw images are usually very complicated and immediate information for fault detection is unavailable. In such cases, image processing techniques must be powerful enough to extract useful features from raw images for fault diagnosis — see[86,87] for descriptions and discussions of image processing tools and algorithms. Image processing may seem unnecessary when raw images provide sufficient and clear information under visual examination to identify patterns and detect faults; even then, however, it can help in extracting features for automatic fault detection. In addition to raw images obtained via data acquisition, some waveform processing techniques such as time-frequency analysis also produce images. In these situations, image processing can be combined with waveform processing to obtain better results.

A few examples of applying image processing techniques in condition monitoring and fault diagnosis and prognosis follow. Wang and McFadden[88] applied image processing techniques to spectrograms for early gear fault detection and diagnostics. Utsumi et al[89] used a wavelet transform to analyze ferrographic images for bearing diagnosis. Heger and Pandit[90] considered a wavelet-based segmentation approach to image processing for the condition monitoring and fault diagnostics of grinding tools. Ellwein et al[91] combined image processing techniques with the waveform power spectral density to identify a region of interest (ROI) for fault discrimination enhancement.

Value type data analysis

Value type data includes both raw data obtained via data acquisition and feature values extracted from raw signals via signal processing. Value type data looks much simpler than waveform and image data; the complexity, however, lies in the correlation structure when the number of variables is large. Multivariate analysis techniques such as PCA and independent component analysis (ICA) are very useful for handling data with a complicated correlation structure. For example, Stellman et al[92] applied PCA to spectroscopic data to monitor the condition of a lubricant in helicopter rotary gearboxes. Allgood and Upadhyaya[93] performed PCA on certain descriptive statistics for DC motor diagnostics and prognostics. ICA is an extension of PCA and will be discussed later. When the number of variables is large, dimension reduction techniques such as PCA and projection pursuit can be used for data reduction. For a review of dimension reduction techniques, see[94]. An example of applying dimension reduction techniques to machine fault diagnostics is given in[19].
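
A sketch of the PCA idea for correlated features: estimate the covariance matrix and extract its dominant eigenvector by power iteration. The two correlated "features" are fabricated, and the 2-D case is chosen only to keep the linear algebra explicit:

```python
import math
import random

def first_principal_component(data):
    """Dominant eigenvector of the 2x2 covariance matrix via power iteration."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    centered = [(p[0] - mx, p[1] - my) for p in data]
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(100):                        # power iteration
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(*w)
        v = (w[0] / norm, w[1] / norm)
    return v

# Two strongly correlated features: the second tracks the first plus noise
random.seed(0)
data = [(t, t + random.gauss(0.0, 0.1))
        for t in [random.gauss(0.0, 1.0) for _ in range(500)]]
pc1 = first_principal_component(data)
```

Projecting each feature vector onto pc1 compresses the two correlated variables into one score with little information loss, which is the dimension-reduction step exploited in the cited diagnostic applications.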

Trend analysis techniques such as regression analysis and time series models are commonly used for analyzing value type data. For example, Grimmelius et al[95] developed a prototype condition monitoring and diagnostics system for compression refrigeration plants using a regression analysis model to predict healthy system behavior. Yang et al[96] established an ARMA model to extract features from on-line data for power equipment diagnosis. Sinha[97] applied both polynomial regression and an ARMA model to predict the trend of vibration peak amplitude for turbine fault diagnostics and prognostics.
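
Trend-based prognosis can be sketched as a least-squares line fitted to a degradation indicator and extrapolated to an alarm threshold. The weekly amplitudes and the 4.5 mm/s threshold below are invented:

```python
def linear_trend(times, values):
    """Ordinary least-squares fit: values ~ a + b * times."""
    n = len(times)
    mt = sum(times) / n
    mv = sum(values) / n
    b = (sum((t - mt) * (v - mv) for t, v in zip(times, values))
         / sum((t - mt) ** 2 for t in times))
    a = mv - b * mt
    return a, b

def time_to_threshold(a, b, threshold):
    """Extrapolate the fitted trend to the time it reaches a threshold."""
    return (threshold - a) / b

# Weekly vibration peak amplitudes (mm/s), trending upward
weeks = [0, 1, 2, 3, 4, 5]
amps = [2.0, 2.2, 2.35, 2.6, 2.8, 3.0]
a, b = linear_trend(weeks, amps)
eta = time_to_threshold(a, b, 4.5)   # predicted week of reaching 4.5 mm/s
```

Linear extrapolation is the crudest possible prognostic model; polynomial or ARMA trends, as in the cited work, capture the accelerating degradation that a straight line misses.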

Data analysis combining event data and condition monitoring data

Data analysis for event data alone is well known as “reliability analysis”: it fits the event data to a time-between-events probability distribution and uses the fitted distribution for further analysis. In condition-based maintenance, however, additional information, namely condition monitoring data, is available, and it is beneficial to analyze event data and condition monitoring data together. This combined data analysis can be accomplished by building a mathematical model that describes the underlying mechanism of a fault or a failure. The model built on both event and condition monitoring data is the basis for maintenance decision support (diagnostics and prognostics), which will be discussed in the next section.

A time-dependent proportional hazards model (PHM) is suitable for analyzing both event and condition monitoring data together. It has a hazard function of the form

h(t; z(t)) = h0(t) exp(γ1 z1(t) + … + γm zm(t))

where h0(t) is a baseline hazard function, the zi(t) are covariates which are functions of time, and the γi are coefficients. The baseline hazard function can be in non-parametric or parametric form. A commonly used parametric baseline hazard function is the Weibull hazard function, which is the hazard function of the Weibull distribution; a PHM with a Weibull baseline hazard function is called a Weibull PHM. Jardine et al[98] proposed using a Weibull PHM to analyze aircraft and marine engine failure data together with the metal concentration measurements of the engine oil. An extension of PHM is the proportional intensity model (PIM), which adopts a stochastic process setting and assumes a similar form for the intensity function of the stochastic process. Vlok et al[99] studied the application of PIM to analyze failure and diagnostic measurement data from bearings.
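
The Weibull PHM can be sketched directly: the hazard is the Weibull baseline scaled by an exponential function of the covariates. The shape, scale, coefficient, and iron-concentration values below are purely illustrative, not fitted parameters:

```python
import math

def weibull_phm_hazard(t, beta, eta, gammas, covariates):
    """h(t) = (beta/eta) * (t/eta)^(beta-1) * exp(sum(gamma_i * z_i(t)))."""
    baseline = (beta / eta) * (t / eta) ** (beta - 1)
    return baseline * math.exp(sum(g * z for g, z in zip(gammas, covariates)))

# Illustrative engine: shape beta = 2.5, scale eta = 1000 h, and one
# covariate, the iron concentration (ppm) in the oil, with coefficient 0.03
h_clean = weibull_phm_hazard(500.0, 2.5, 1000.0, [0.03], [0.0])
h_dirty = weibull_phm_hazard(500.0, 2.5, 1000.0, [0.03], [40.0])
risk_ratio = h_dirty / h_clean   # exp(0.03 * 40), independent of t
```

The ratio shows the model's defining property: at any age, the covariates multiply the risk by exp(γ·z), so a rising metal concentration raises the hazard even before the age-dependent baseline does.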

In reliability centered maintenance (RCM)[100], the concept known as the “P-F interval” is used to describe failure patterns in condition monitoring. A P-F interval is the time interval between a potential failure (P), which is identified by a condition indicator, and a functional failure (F). The P-F interval is a useful concept with which to determine an appropriate interval for periodic condition monitoring; a condition monitoring interval is usually set to the P-F interval divided by an integer. In practice, however, it is usually difficult to quantify the P-F interval (see Chapter 9. The Elusive P-F Curve, page 102). Goode et al[101] assumed Weibull distributions for both the P-F interval and the I-P interval, i.e. the interval from machine installation (I) to a potential failure (P). Using statistical process control (SPC) methods on historical data, they separated each machine life cycle into two zones: a stable zone and a failure zone. They used the stable zone duration times to fit a Weibull distribution for the I-P interval and, similarly, the failure zone duration times to fit the Weibull distribution for the P-F interval. Based on these two fitted distributions combined with the condition monitoring process, machine prognosis was derived.

A hidden Markov model (HMM)[102,103] is another model for analyzing event and condition monitoring data together. An HMM consists of two stochastic processes: a Markov chain with a finite number of states describing an underlying failure mechanism, and an observation process that depends on the hidden state. Bunks et al[104] applied an HMM to analyze Westland helicopter data, which consists of gearbox fault class information and vibration measurements surrounding the occurrence of various faults. The fault classes were treated as states in the hidden Markov chain, whereas the vibration measurements were treated as realizations of the observation process. The HMM trained on lab test data was then applied to fault classification for a data set from an operating gearbox. Dong and He[105] proposed a more general model, the hidden semi-Markov model (HSMM), for hydraulic pump diagnostics. It was shown that the HSMM outperforms the HMM in pump diagnostics.
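
A sketch of HMM-based classification: score an observation sequence under competing models with the forward algorithm and pick the model with the higher likelihood. The two-state models, their probabilities, and the discretized vibration observations are all invented:

```python
def forward_likelihood(obs, pi, A, B):
    """P(obs | model) via the forward algorithm for a discrete HMM."""
    n_states = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [sum(alpha[s] * A[s][t] for s in range(n_states)) * B[t][o]
                 for t in range(n_states)]
    return sum(alpha)

# Observations discretized to 0 = low vibration, 1 = high vibration.
# The "healthy" model rarely emits 1; the "faulty" model often does.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]          # hidden-state transition matrix
healthy_B = [[0.95, 0.05], [0.8, 0.2]]  # emission probabilities per state
faulty_B = [[0.4, 0.6], [0.2, 0.8]]

obs = [1, 1, 0, 1, 1, 1]
diagnosis = ("faulty" if forward_likelihood(obs, pi, A, faulty_B)
             > forward_likelihood(obs, pi, A, healthy_B) else "healthy")
```

In the cited work the competing models are trained per fault class from labelled histories (via Baum-Welch), and a new record is assigned to the class whose model explains it best, exactly as this two-model comparison does in miniature.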

Lin and Makis[106] proposed using a partially observable stochastic model to describe the underlying failure mechanism of a system undergoing condition monitoring. The proposed model is similar to an HMM but has some distinguishing characteristics: one (failure) state is observable, the partially hidden state process is continuous in time, and the observation process is discrete in time. These characteristics are more realistic in relation to actual condition monitoring processes. The model parameters were estimated using both event and condition monitoring data, and the fitted model is used for subsequent diagnostics and prognostics. A fast recursive parameter estimation procedure for a partially observable stochastic model was given in[107].

Other models in the literature that can be used to analyze both event and condition monitoring data are models using the delay time concept[108] and stochastic process models such as a gamma process[109].

Maintenance decision support

The ultimate goal and final step of a CBM program is maintenance decision making. Sufficient and efficient decision support will result in maintenance personnel taking the “right” maintenance actions given the currently known information. Jardine[110] reviewed and compared several commonly used CBM decision strategies, including trend analysis rooted in statistical process control, expert systems, and neural networks. Wang and Sharp[111] discussed the decision aspect of CBM and reviewed recent developments in modeling CBM decision support.

Diagnostics

Machine fault diagnostics is a discovery procedure based on mapping information in the measurement space and/or features in the feature space to machine faults in the fault space. From an RCM perspective, a machine fault may or may not have immediate consequences. If a fault has no immediate consequences, other than those necessary to diagnose and repair it, it is a potential failure. The diagnostic action following the detection of a potential failure will be a proactive activity, often initiated by a condition based maintenance process. A common example is an alarm generated by a “rule” applied to the data in a control system historian. Besides a potential failure, a diagnostic alarm may also expose an otherwise hidden functional failure, usually the failure of a protective or backup device. The failure of a hidden function has the immediate consequence that a “multiple” failure is, from that moment on, highly probable. This topic was developed in Failure Finding Intervals of Chapter 3 (page 38).

The diagnostic mapping process is also called pattern recognition. Traditionally, pattern recognition was a manual exercise, performed with the assistance of graphical tools such as a power spectrum graph, a phase spectrum graph, a cepstrum graph, an AR spectrum graph, a spectrogram, a wavelet scalogram, a wavelet phase graph, and so on. However, manual pattern recognition requires expertise in the specific area of the diagnostic application; it is slow and expensive, requiring highly trained and skilled personnel. Therefore, automatic pattern recognition is highly desirable. This can be achieved by classification of signals based on the information and/or features extracted from them. In the following sections, different machine fault diagnostic approaches are discussed, with emphasis on statistical approaches and artificial intelligence approaches. Machine diagnostics with emphasis on practical issues was discussed in[112]. Various topics in fault diagnosis with emphasis on model-based and artificial intelligence approaches were covered in a recent co-authored book[113].

Statistical approaches

A common method of fault diagnostics is to detect whether a specific fault is present or not based on the available condition monitoring information, without intrusive inspection of the machine. This fault detection problem can be described as a hypothesis test with null hypothesis H0: Fault A is present, against alternative hypothesis H1: Fault A is not present. In a concrete fault diagnostic problem, hypotheses H0 and H1 are expressed in terms of specific models or distributions, or the parameters of a specific model or distribution. Test statistics are then constructed to summarize the condition monitoring information so as to be able to decide whether to accept the null hypothesis H0 or reject it. See[114-116] for some examples of using hypothesis testing for fault diagnosis. Recently, a framework for fault diagnosis, called structured hypothesis tests, was proposed for conveniently handling complicated multiple faults of different types[117].
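
Fault detection by hypothesis testing can be sketched as a z-test on a monitored feature against its healthy baseline. The baseline statistics and the batch of readings are fabricated, and for simplicity the hypotheses are framed the conventional way (null: no change from the healthy baseline); real detectors estimate the reference distribution from history:

```python
import math

def z_test_fault(sample, mu0, sigma0, z_crit=1.96):
    """Flag a fault if the sample mean deviates significantly from the
    healthy baseline mean mu0 with known std sigma0 (two-sided, 5% level)."""
    n = len(sample)
    mean = sum(sample) / n
    z = (mean - mu0) / (sigma0 / math.sqrt(n))
    return abs(z) > z_crit, z

# Healthy RMS baseline: mean 1.0, std 0.2. A new batch of readings:
readings = [1.25, 1.31, 1.18, 1.27, 1.40, 1.22, 1.35, 1.30]
fault, z = z_test_fault(readings, mu0=1.0, sigma0=0.2)
```

The same scheme generalizes to the structured tests cited above by running one test per hypothesized fault mode, each with its own reference model.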

A conventional approach, statistical process control (SPC), originally developed in quality control theory, has been widely used in fault detection and diagnostics. The principle of SPC is to measure the deviation of the current signal from a reference signal representing the normal condition, and to check whether the current signal falls within the control limits. An example of using SPC for damage detection was discussed in[118].
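The SPC idea can be sketched as a simple Shewhart chart: estimate 3-sigma control limits from a reference signal and flag current readings that fall outside them (all signals below are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
# Reference signal from normal operation: used to establish the control limits.
reference = rng.normal(loc=0.5, scale=0.05, size=100)
mean, std = reference.mean(), reference.std()
ucl, lcl = mean + 3 * std, mean - 3 * std   # 3-sigma Shewhart limits

# Current signal: the last readings drift upward, as a degrading machine might.
current = np.concatenate([rng.normal(0.5, 0.05, 20), rng.normal(0.9, 0.05, 5)])

# Indices of readings outside the control limits.
out_of_control = np.flatnonzero((current > ucl) | (current < lcl))
print(out_of_control)
```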

Cluster analysis, as a multivariate statistical analysis method, is a statistical classification approach that groups signals into different fault categories on the basis of the similarity of the characteristics or features they possess. It seeks to minimize within-group variance and maximize between-group variance. The result of cluster analysis is a number of heterogeneous groups with homogeneous contents. There are substantial differences between the groups, but the signals within a single group are similar. Application of cluster analysis in machinery fault diagnosis was discussed in[119,120]. A natural way of signal grouping is based on certain distance measures or similarity measures between two signals. These measures are usually derived from certain discriminant functions in statistical pattern recognition[121]. Commonly used distance measures are Euclidean distance, Mahalanobis distance, Kullback-Leibler distance and Bayesian distance. See[122-125] for some examples of using these distance metrics for fault diagnostics. Ding et al[122] introduced a new distance metric called quotient distance for engine fault diagnosis. Pan et al[126] proposed an extended symmetric Itakura distance for signals in time-frequency representations, for example Wigner-Ville distributions. In addition to distance measures, the feature vector correlation coefficient is a similarity measure commonly used for signal classification in machinery fault diagnosis[125]. Many clustering algorithms are available for distinguishing the signal groups[127]. A commonly used algorithm in machine fault classification is the nearest neighbour algorithm, which fuses the two closest groups into a new group and calculates the distance between two groups as the distance of the nearest neighbour in the two separate groups[128]. The boundary between two adjacent groups is determined by the discriminant function used.
A piecewise linear discriminant function was used, and thus piecewise linear boundaries were obtained, for bearing condition classification in[129]. A technique called the support vector machine (SVM) is usually employed to optimize a boundary curve in the sense that the distance of the closest point to the boundary curve (the margin) is maximized. The support vector machine approach applied to machine fault diagnosis was considered in[17,130].
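The distance-based classification described above can be illustrated with a nearest-centroid rule under the Mahalanobis distance; the feature names, class labels, and data are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic 2-D feature vectors (e.g. RMS and kurtosis) for two fault classes.
class_a = rng.normal([1.0, 3.0], 0.2, size=(30, 2))   # e.g. "imbalance"
class_b = rng.normal([2.5, 5.0], 0.2, size=(30, 2))   # e.g. "bearing wear"

# Pooled covariance (from class-centered data) and its inverse.
centered = np.vstack([class_a - class_a.mean(0), class_b - class_b.mean(0)])
cov_inv = np.linalg.inv(np.cov(centered.T))

def mahalanobis(x, centroid):
    d = x - centroid
    return float(np.sqrt(d @ cov_inv @ d))

# Classify a new signal's feature vector by the nearest class centroid.
new_signal = np.array([2.4, 4.9])
dists = {"imbalance": mahalanobis(new_signal, class_a.mean(0)),
         "bearing wear": mahalanobis(new_signal, class_b.mean(0))}
diagnosis = min(dists, key=dists.get)
print(diagnosis)
```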

The hidden Markov model (HMM) described earlier can also be used for fault classification. Early applications of HMM in fault classification and diagnostics treated the real machine faulty states and the machine normal state as the hidden states of the HMM[104,131]. Two recent applications of HMM in fault classification assumed an HMM with hidden states having no physical meaning for each of two machine conditions (normal and faulty)[132,133]. The trained HMMs are then used to decode an observation for fault classification in a machine whose condition is unknown. Xu and Ge[134] presented an intelligent fault diagnosis system based on a hidden Markov model. Ye et al[135] considered the application of a two-dimensional HMM based on time-frequency analysis for fault diagnosis.
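The HMM classification scheme, scoring an observation sequence under competing models and choosing the most likely, can be sketched with a hand-rolled forward algorithm; the two models below use hand-set parameters standing in for trained ones:

```python
import numpy as np

def log_likelihood(obs, pi, A, B):
    """Forward algorithm with scaling: returns log P(obs | HMM)."""
    alpha = pi * B[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        log_p += np.log(alpha.sum())
        alpha /= alpha.sum()
    return log_p

# Discrete observation symbols: 0 = low vibration, 1 = high vibration.
pi = np.array([0.5, 0.5])                 # initial state probabilities
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])                # hidden-state transition matrix
# "Normal" model emits mostly low-vibration symbols; "faulty" mostly high.
B_normal = np.array([[0.9, 0.1],
                     [0.8, 0.2]])
B_faulty = np.array([[0.2, 0.8],
                     [0.1, 0.9]])

obs = [1, 1, 0, 1, 1, 1]                  # sequence dominated by high vibration
scores = {"normal": log_likelihood(obs, pi, A, B_normal),
          "faulty": log_likelihood(obs, pi, A, B_faulty)}
print(max(scores, key=scores.get))
```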

Artificial intelligence approaches

Artificial intelligence (AI) techniques have been increasingly applied to machine diagnosis and have shown improved performance over conventional approaches. In the literature, two popular AI techniques for machine diagnosis are artificial neural networks (ANN) and expert systems (ES). Other AI techniques include fuzzy logic systems (FLS), fuzzy-neural networks (FNN), neural-fuzzy systems (NFS), and evolutionary algorithms (EA). A review of recent developments in applications of AI techniques for induction machine stator fault diagnostics was given by Siddique et al[136].

An artificial neural network is a computational model that mimics the human brain. It consists of simple processing elements connected together in a complex layer structure. The model approximates a complex nonlinear function with multiple inputs and outputs. One processing element comprises a node and a weight. The artificial neural network learns the unknown function by adjusting its weights with observations of input and output. This process is usually called training of an artificial neural network. There are various neural network models. The feedforward neural network (FFNN) is the most widely used neural network structure in machine fault diagnosis[137-140]. A special FFNN, the multilayer perceptron (MLP) with the back propagation (BP) training algorithm, is the most commonly used neural network model for pattern recognition and classification. Hence it is popular in machine fault diagnostics as well[140,141,142]. The BP neural networks, however, have two main limitations: 1) difficulty in determining the appropriate network structure and the number of nodes; 2) slow convergence of the training process.
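A minimal sketch of an MLP trained by back propagation on synthetic two-class condition data is given below; the architecture, learning rate, and data are illustrative, not a production diagnostic network:

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic features for two machine conditions (normal = 0, faulty = 1).
X = np.vstack([rng.normal(0.0, 0.5, (40, 2)), rng.normal(2.0, 0.5, (40, 2))])
y = np.array([0] * 40 + [1] * 40, dtype=float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 4 nodes; the weights are what BP training adjusts.
W1, b1 = rng.normal(0, 0.5, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 0.5, (4, 1)), np.zeros(1)

losses, lr = [], 0.5
for _ in range(500):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    # Backward pass: propagate the squared-error gradient through the layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out / len(X); b2 -= lr * d_out.mean(0)
    W1 -= lr * X.T @ d_h / len(X);  b1 -= lr * d_h.mean(0)

accuracy = float(np.mean((out > 0.5) == (y > 0.5)))
print(round(losses[-1], 4), accuracy)
```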

A cascade correlation neural network (CCNN) does not require initial determination of the network structure and the number of nodes. CCNN can be used in cases where on-line training is preferable. Spoerre[143] applied CCNN to bearing fault classification and showed that CCNN can achieve satisfactory fault recognition accuracy with a minimal network structure. Other neural network models applied in machine diagnostics are radial basis function neural networks[18], recurrent neural networks[144,145] and counter propagation neural networks (CPNN)[146]. The above ANN models usually use supervised learning algorithms, which require external input such as a priori knowledge about the target or desired output. For example, a common practice in training a neural network model is to use a set of experimental data with known (seeded) faults. This training process is supervised learning. In contrast to supervised learning, unsupervised learning does not require external input. An unsupervised neural network learns by itself using new information available. Wang and Too[38] applied unsupervised neural networks, a self-organizing map (SOM) and learning vector quantization (LVQ), to the detection of rotating machine faults. Tallam et al[147] proposed several self-commissioning and on-line training algorithms for FFNN applied particularly to electric machine fault diagnostics. Sohn et al[116] used an autoassociative neural network to separate the effects of damage on the extracted features from those caused by the environmental and vibration variations of the system. Then a sequential probability ratio test was performed on the normalized features for damage classification.

In contrast to neural networks, which acquire knowledge by training on observed data with known inputs and outputs, expert systems utilize domain expert knowledge in a computer program with an automated inference engine to perform reasoning for problem solving. Three main reasoning methods for ES used in the area of machinery diagnostics are rule-based reasoning[148-150], case-based reasoning[151,152] and model-based reasoning[153]. Another reasoning method, negative reasoning, was introduced to mechanical diagnosis by Hall et al[154]. Stanek et al[155] compared case-based and model-based reasoning and proposed combining them for a lower cost solution to machine condition assessment and diagnosis. Unlike other reasoning methods, negative reasoning deals with negative information, that is, the absence of expected symptoms, from which meaningful inferences can be drawn.
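The rule-based reasoning style can be illustrated with a toy forward-chaining inference engine; the rules and symptoms are invented and far simpler than those of a real diagnostic ES:

```python
from itertools import chain

# Toy forward-chaining inference in the spirit of a rule-based expert system.
# Each rule maps a set of required facts (symptoms) to a concluded fact.
rules = [
    ({"high_vibration", "high_temperature"}, "bearing_wear_suspected"),
    ({"bearing_wear_suspected", "metal_particles_in_oil"}, "replace_bearing"),
    ({"high_noise"}, "check_alignment"),
]

def infer(facts):
    facts = set(facts)
    changed = True
    while changed:                      # keep firing rules until no new facts
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

observed = {"high_vibration", "high_temperature", "metal_particles_in_oil"}
conclusions = infer(observed) - observed
print(sorted(conclusions))
```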

Expert systems and neural networks have known limitations. A significant limitation of rule-based expert systems is combinatorial explosion, the computational problem that arises when the number of rules increases exponentially as the number of variables increases. Another important limitation is consistency maintenance, which refers to the process by which the system decides when some of the variables need to be recomputed in response to changes in other values. Two important limitations of neural networks are the difficulty of giving a physical interpretation of the trained model and the difficulty of the training process. It is natural then to attempt a combination of both techniques in order to combine their respective advantages, thus improving performance in a hybrid system. For instance, Silva et al[156] used two neural networks, SOM and adaptive resonance theory (ART), combined with an expert system based on Taylor's tool life equation to classify tool wear state. DePold and Gass[157] studied the applications of neural networks and expert systems in a modular intelligent and adaptive system for gas turbine diagnostics and prognostics. Yang et al[158] presented an approach for integrating a case-based reasoning ES with an ART-Kohonen neural network to enhance fault diagnosis. It was shown that the proposed approach outperforms the self-organizing feature map (SOFM) based system with respect to classification rate.

In condition monitoring practice, knowledge from domain specific experts is usually inexact. Therefore expert system reasoning on domain knowledge is often imprecise. Measures of the uncertainties in knowledge and reasoning are required in order that an ES may provide more robust problem solving capability. Commonly used uncertainty measures are probability, fuzzy membership functions in fuzzy logic theory, and belief functions in belief network theory. An example of applying fuzzy logic to machine fault classification was given in[159] to classify frequency spectra representing various rolling element bearing faults. A comparison between conventional rule-based expert systems and belief networks applied to machine diagnostics was given in[160]. Du and Yeung[161] introduced an approach called fuzzy transition probability, which combines transition probability (Markov process) with fuzzy sets, to monitor progressive faults. The application of fuzzy logic is usually combined with other techniques such as neural networks and expert systems. For example, Zhang et al[162] developed a fuzzy neural network for fault diagnosis of rotary machines to improve the recognition rate of pattern recognition, especially in the case when sample data are similar. Lou and Loparo[125] employed an adaptive neural-fuzzy inference system as a diagnostic classifier for bearing fault diagnosis. Liu et al[163] applied fuzzy logic and expert systems to build a fuzzy expert system for bearing fault detection. Chang et al[164] built a system for decision making support in a power plant using both a rule-based ES and fuzzy logic.
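A minimal sketch of fuzzy classification uses triangular membership functions over a single condition variable; the fuzzy sets and breakpoints below are invented for illustration:

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b on support [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify(amplitude):
    # Fuzzy sets over vibration amplitude (mm/s); breakpoints are illustrative.
    return {
        "normal":        tri(amplitude, -1.0, 0.0, 3.0),
        "minor fault":   tri(amplitude,  2.0, 4.0, 6.0),
        "serious fault": tri(amplitude,  5.0, 8.0, 11.0),
    }

# A reading of 4.5 mm/s belongs partially to several sets; pick the best grade.
grades = classify(4.5)
verdict = max(grades, key=grades.get)
print(grades, verdict)
```

Unlike a crisp threshold, the membership grades express how strongly the reading supports each condition.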

Neural networks and expert systems have also been combined with other AI techniques to enhance machine diagnostic systems. Garga et al[165] proposed a hybrid reasoning approach combining neural networks, fuzzy logic and expert systems to integrate domain knowledge and test operational data. Evolutionary algorithms[166], which mimic the natural evolution process of a population, have also been shown to have merit when applied to machine diagnostics. Genetic algorithms (GA) are the most widely used type of EA. Sampath et al[167] proposed a GA-based optimization approach to gas turbine diagnostics. Several examples of ANN incorporating GA and other EA algorithms for machine fault classification and diagnostics are[168-170].

Other approaches

Another class of machine fault diagnostic approaches are the model-based approaches[171,172]. These approaches utilize explicit, physics-specific mathematical models of the monitored machine. Based on this explicit model, residual generation methods such as the Kalman filter, parameter estimation (or system identification), and parity relations are used to obtain signals, called residuals, which indicate fault presence in the machine. The residuals are evaluated to detect, isolate and identify the fault(s). This general procedure is illustrated in Figure 12‑2. Model-based approaches can be more effective than other approaches if a correct and accurate model is built. However, explicit mathematical modeling may not be feasible for complex systems.

Figure 12‑2: General flowchart of a model-based approach
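The residual-generation step of this flowchart can be sketched with a scalar Kalman filter tracking a nominally constant signal; an injected sensor bias (the assumed fault) makes the residuals grow. All parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
# Measurements of a nominally constant level; a fault (bias) appears at t = 60.
true_level, q, r = 10.0, 1e-4, 0.04
z = true_level + rng.normal(0.0, np.sqrt(r), 100)
z[60:] += 1.5                       # injected fault: a sensor bias

x, p = z[0], 1.0                    # state estimate and its variance
residuals = []
for zk in z[1:]:
    p += q                          # predict (random-walk state model)
    innov = zk - x                  # residual: measurement minus prediction
    k = p / (p + r)                 # Kalman gain
    x += k * innov                  # update the state estimate
    p *= (1 - k)
    residuals.append(innov)

residuals = np.array(residuals)
# A persistent residual shift after the fault time indicates its presence.
print(np.abs(residuals[:55]).mean(), np.abs(residuals[59:69]).mean())
```

Residual evaluation (the final block of the flowchart) would then threshold or statistically test these residuals.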

Various model-based diagnostic approaches have been applied to fault diagnosis of a variety of mechanical systems such as gearboxes[173,174], bearings[175-177], rotors[178,179], and cutting tools[180]. Bartelmus[181,182] used mathematical modeling and computer simulation to aid signal processing and interpretation. Hansen et al[183] proposed an approach to more robust diagnosis based on the fusion of sensor-based and model-based information. Vania and Pennacchi[184] developed methods to measure the accuracy of the results obtained with model-based techniques aimed at identifying faults in rotating machines. The information provided by these methods was shown to be very helpful for precise fault identification as well as for evaluating the confidence of the diagnostic decision.

Petri nets, as a general purpose graphical tool for describing relations existing between conditions and events[185], have recently been applied to machine fault detection and diagnostics. Propes[186] used a fuzzy Petri net to describe operating mode transitions and to detect a mode change event for fault detection and diagnosis in complex systems. Yang[187] proposed a hybrid Petri-net modeling method coupled with fault-tree analysis and Kalman filtering for early failure detection and fault isolation. Yang et al[188] introduced an approach for integrating case-based reasoning with Petri nets for fault diagnosis of induction motors. The integrated approach was shown to outperform the conventional case-based reasoning expert system.

Prognostics

Compared with diagnostics, the literature on prognostics is much smaller. There are two main prediction types in machine prognostics. The most obvious and widely used is the prediction of how much time is left before a failure (or one or more faults, i.e. “potential failures”) occurs, given the current machine condition and the past (and future) operating profile. The time left before observing a failure is usually called the “remaining useful life” or RUL.

In many situations, especially when a fault or a failure has catastrophic consequences (e.g. a nuclear power plant), it is desirable to predict the chance that a machine operates without a fault or a failure up to some future time (for example, the next inspection), given the machine’s current condition and its past operational profile. In the general maintenance context, the probability that a machine operates without fault until the next inspection is a useful reference in helping to determine whether or not the inspection interval is appropriate.
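This second prediction type can be illustrated with a Weibull lifetime model: given survival to the current age t, the probability of fault-free operation to the next inspection at t + Δ is R(t + Δ)/R(t). The shape and scale parameters below are invented:

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t/eta)^beta) for a Weibull lifetime."""
    return math.exp(-((t / eta) ** beta))

def conditional_reliability(t, delta, beta, eta):
    """P(no failure in (t, t + delta] | survival to t) = R(t + delta) / R(t)."""
    return weibull_reliability(t + delta, beta, eta) / weibull_reliability(t, beta, eta)

# Illustrative wear-out parameters: shape beta = 2.5, scale eta = 1000 h.
p = conditional_reliability(t=600.0, delta=100.0, beta=2.5, eta=1000.0)
print(round(p, 4))
```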

Most of the papers in the literature of machine prognostics discuss only the former type of prognostics, namely RUL estimation. Only a small number of papers address the second type of prognostics[106,189]. In the following sections, we discuss (1) RUL estimation, (2) prognostics that incorporate maintenance actions or policies, and (3) the determination of the appropriate condition monitoring interval.

Remaining useful life

RUL, also called remaining service life, residual life, or remnant life, refers to the time left before observing a failure, given the current machine age, its condition, and the past operation profile. Note here that the definition of failure is crucial to the interpretation of RUL. Although there is some controversy in current industrial practice, a formal definition of failure can be found in many reliability textbooks.

Prognosis requires knowledge (or data) on the fault propagation process as well as knowledge (or data) on the failure mechanism. The fault propagation process is usually tracked by a trending or forecasting model for certain condition variables. There are two ways of describing the failure. The first assumes that failure depends on the condition variables (which reflect the actual fault level) and a predetermined boundary. The most commonly used failure definition in this case is simple: failure occurs when the fault reaches the predetermined level.
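This threshold-crossing view of failure can be sketched by fitting a trend to a monitored condition variable and extrapolating to the predetermined failure level; the linear trend and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
# Monitored fault level (e.g. crack size or vibration amplitude) over time.
t = np.arange(0, 50, dtype=float)                 # hours of operation so far
fault_level = 0.02 * t + 1.0 + rng.normal(0, 0.05, t.size)
failure_threshold = 3.0                            # predetermined failure level

# Fit a linear trend and extrapolate to the threshold crossing.
slope, intercept = np.polyfit(t, fault_level, 1)
t_failure = (failure_threshold - intercept) / slope
rul = t_failure - t[-1]                            # remaining useful life
print(round(rul, 1))
```

In practice the trend model (and its prediction uncertainty) would be chosen to suit the fault propagation physics rather than assumed linear.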

The second builds a model for the failure mechanism using available historical data. Various definitions of failure can be used. A failure can be defined as the event that the machine is operating at an unsatisfactory level (a partial failure); or it can be a total functional failure when the machine cannot perform its intended function at all; or it can be a breakdown when the machine stops operating; or it can be the attainment of a potential failure condition defined in terms of acceptable risk. Similar to diagnosis, the prognostic methods fall into three main categories: statistical approaches, artificial intelligence approaches and model-based approaches.

Goode et al[101] used SPC to separate the whole machine life into two intervals, the I-P (Installation-Potential failure) interval in which the machine is running correctly and the P-F (Potential failure-Functional failure) interval in which the machine is running with a problem. Based on two Weibull distributions assumed for the I-P and P-F time intervals respectively, failure prediction was derived in the two intervals and the RUL was estimated. Yan et al[190] employed a logistic regression model to calculate the probability of failure for given condition variables and an ARMA time series model to trend the condition variables for failure prediction. A predetermined level of failure probability was used to estimate the RUL. Phelps et al[191] proposed to track sensor-level test-failure probability vectors instead of the physical system or sensor parameters for prognostics. A Kalman filter with an associated interacting multiple model (IMM) was used to perform the tracking.

Two statistical models in survival analysis, PHM and PIM, are useful tools for RUL estimation in combination with a trending model for the fault propagation process. Banjevic and Jardine[192] discussed RUL estimation for a Markov failure time process which includes a joint model of PHM and a Markov property for the covariate evolution as a special case. Vlok et al[99] applied PIM with covariate extrapolation to estimate bearing residual life. HMM, a stochastic process model discussed earlier, is also a powerful tool for RUL estimation[193,194]. Lin and Makis[195] introduced a partially observable continuous-discrete stochastic process model to describe the hidden evolution process of the machine state associated with the observation process. RUL estimation, as one of the prediction tasks, was generated by the model. Wang et al[109] proposed a stochastic process, called a “gamma process”, with hazard rate as the residual life prediction criterion. The condition information considered was expert judgment based on vibration analysis. Wang[108] used the residual delay time concept and stochastic filtering theory to derive the residual life distribution.
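The gamma-process idea can be sketched with scipy: take degradation X(t) ~ Gamma(shape a·t, scale b) and define failure as X(t) exceeding a pre-set level L; the survival probability at time t is then the gamma CDF evaluated at L. All parameters below are invented:

```python
import numpy as np
from scipy.stats import gamma

# Gamma degradation process: X(t) ~ Gamma(shape = a*t, scale = b).
a, b = 0.5, 0.2          # illustrative shape rate and scale
L = 2.0                  # pre-set failure level on the degradation scale

def survival(t):
    """P(X(t) <= L): degradation has not yet crossed the failure level."""
    return gamma.cdf(L, a * t, scale=b)

times = np.linspace(1, 60, 200)
surv = np.array([survival(t) for t in times])
# Approximate time at which the crossing probability reaches 50%.
t50 = float(times[np.argmin(np.abs(surv - 0.5))])
print(round(t50, 1))
```

Conditioning this survival curve on the current observed degradation level would give an RUL distribution.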

AI techniques applied to RUL estimation have been considered by some researchers. Zhang and Ganesan[196] used self-organizing neural networks, for multivariable trending of the fault development, to estimate the residual life of a bearing system. Wang and Vachtsevanos[197] applied dynamic wavelet neural networks to predict the fault propagation process and estimate the RUL as the time left before the fault reaches a given value. Yam et al[198] applied a recurrent neural network for predicting the machine condition trend. Dong et al[199] utilized a grey model and a BP neural network to predict machine condition. Wang et al[200] compared the results of applying recurrent neural networks and neural-fuzzy inference systems to predict the fault damage propagation trend. Chinnam and Baruah[201] presented a neural-fuzzy approach to estimating RUL for the situation where neither failure data nor a specific failure definition model is available, but domain experts with strong experiential knowledge are on hand.

Model-based approaches to prognosis require specific failure mechanism knowledge and theory relevant to the monitored machine. Ray and Tangirala[202] used a nonlinear stochastic model of fatigue crack dynamics for real-time computation of the time-dependent damage rate and accumulation in mechanical structures. Li et al[203,204] introduced two defect propagation models via failure mechanism modeling for RUL estimation of bearings. Oppenheimer and Loparo[178] applied a physical model for predicting the machine condition in combination with a fault strengths-to-life model, based on a crack growth law, to estimate RUL. Chelidze and Cusumano[205] proposed a general method for tracking the evolution of a hidden damage process given a situation where a slowly evolving damage process is related to a fast, directly observable dynamic system. Luo et al[206] introduced an integrated prognostic process based on data from model-based simulations under nominal and degraded conditions. Kacprzynski et al[207] proposed fusing the physics of failure modeling with relevant diagnostic information for helicopter gear prognosis.

A different way of applying model-based approaches to prognosis is to derive the explicit relationship between the condition variables and the lifetimes (current lifetime and failure lifetime) via failure mechanism modeling. Two examples of research along this line are[208] for machines considered as energy processors subject to vibration monitoring and[209] for bearings with vibration monitoring. Lesieutre et al[210] developed a hierarchical modeling approach for system simulation to assess RUL. Engel et al[211] discussed some practical issues regarding accuracy, precision and confidence of the RUL estimates.

Prognostics incorporating maintenance policies

The aim of machine prognosis is to provide decision support for maintenance actions. As such, it is natural to include maintenance policies in the consideration of the machine prognostic process. This makes the situation more complicated, since extra effort is needed to describe the nature of maintenance policies. We interest ourselves particularly in policies governing the broad class of maintenance actions that we know as “CBM” and have set out to describe in this review. Compared to conventional maintenance, mathematical models applicable to the CBM scenario are far fewer[212]. See also[213] for more recent references on maintenance modeling.

The main idea of prognostics incorporating maintenance policies is to optimize the maintenance policies according to certain criteria such as risk, cost, reliability and availability. Risk is defined as the combination of failure probability and consequence. Usually, consequence can be measured by cost, in which case the risk criterion is equivalent to the cost criterion. However, there are some cases, for example critical equipment in a power plant, in which consequence cannot be estimated by cost. In these scenarios, a probability or reliability criterion would be more appropriate. Since the cost criterion applies to most situations, it is not surprising that the literature on CBM optimization is dominated by cost-based CBM optimization. The consequence analysis technique discussed in[214] is a general risk evaluation tool for CBM optimization based on various kinds of criteria.

In condition monitoring, no matter what machines are monitored, they fall into two categories: completely observable systems and partially observable systems. For a completely observable system, the machine state can be completely observed or identified. The information collected from such a system is called direct information. For a partially observable system, the machine condition cannot be fully observed or identified. The information obtained from such a system is called indirect information, which is related to, but does not fully reveal, the real machine state. In the text to follow, we discuss various models and methods for evaluating, through modeling, these two types of systems.

First, we consider completely observable systems. Wang[215] developed a CBM model based on a random coefficient growth model where the coefficients of the regression growth model are assumed to follow known distribution functions. The model was used to determine the optimal critical level and inspection interval in CBM in terms of a criterion of interest, which can be cost, downtime or reliability. In a series of works[216-218], a stochastic model, the gamma process, was used to describe the deterioration process; the system was considered failed if its condition jumped above a pre-set failure level; a sequential (or non-periodic) inspection interval was assumed. Grall et al[216] went on to assume a multi-level control-limit rule replacement policy and obtained the optimal thresholds and inspection scheduling by minimizing the expected maintenance cost per unit time. Castanier et al[217] assumed a multi-level control-limit rule repair/replacement policy and obtained optimal thresholds and inspection scheduling based on a cost criterion and an availability criterion as well. Dieulle et al[218] assumed a one-level replacement policy and a sequentially chosen inspection interval using a maintenance scheduling function, and obtained the optimal threshold and inspection scheduling by minimizing the global cost per unit time. Amari and McLaughlin[219] utilized a Markov chain to describe the CBM model for a deteriorating system subject to periodic inspection. The optimal inspection frequency and maintenance threshold were found to maximize the system availability.
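The control-limit optimization logic common to these models can be sketched with a Monte Carlo simulation: gamma-process deterioration is inspected periodically, a preventive replacement is made when a threshold is exceeded, a costlier failure replacement when the failure level is crossed, and the resulting cost rates are compared across candidate thresholds. All parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(6)

def cost_rate(threshold, n_runs=2000):
    """Long-run cost per unit time of a control-limit policy (Monte Carlo)."""
    fail_level, dt = 10.0, 1.0              # failure level, inspection interval
    c_insp, c_prev, c_fail = 1.0, 50.0, 500.0
    total_cost, total_time = 0.0, 0.0
    for _ in range(n_runs):
        x, t = 0.0, 0.0
        while True:
            x += rng.gamma(shape=2.0, scale=0.25)   # deterioration increment
            t += dt
            total_cost += c_insp
            if x >= fail_level:                     # failure replacement
                total_cost += c_fail
                break
            if x >= threshold:                      # preventive replacement
                total_cost += c_prev
                break
        total_time += t
    return total_cost / total_time

rates = {th: cost_rate(th) for th in (4.0, 6.0, 8.0, 9.5)}
best = min(rates, key=rates.get)
print({th: round(r, 2) for th, r in rates.items()}, best)
```

Replacing too early wastes useful life; a threshold too close to the failure level risks the expensive failure replacement, so an intermediate control limit wins.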

Berenguer et al[220] presented a CBM structure for continuously deteriorating multi-component systems, which allows cost savings by performing simultaneous maintenance actions. Barata et al[221] used Monte-Carlo simulation to model continuously monitored deteriorating systems, non-repairable single components or multi-component repairable systems. Then optimal degradation thresholds of maintenance intervention were found to minimize the expected total system cost over a given mission time by a direct search. Marseguerra et al[222] used GA to find the optimal thresholds in the previous work by simultaneously optimizing two typical objectives of interest, profit and availability. Hosseini et al[223] employed generalized stochastic Petri nets to represent a CBM model for a system subject to deterioration failures and Poisson failures. It was assumed that deterioration failures are restored by major repair and Poisson failures are restored by minimal repair. The optimal maintenance policy and inspection interval were then found to maximize system throughput.

We turn now to the consideration of partially observable systems. Ohnishi et al[224] applied a Markov decision process model for a discrete-time deterioration system to find the optimal replacement policy in which minimal repair is used to restore a failure if the decision is not to replace. Hontelez et al[225] formulated the decision process as a discrete Markov decision problem based on a continuous deterioration process to find the optimum maintenance policy with respect to cost. Aven[226] presented a counting process approach to determining the replacement policy minimizing the long run expected cost. Barbera et al[227] proposed a CBM model assuming exponentially distributed failures with a failure rate depending on the condition variables, and fixed inspection intervals. The optimal maintenance action was then found to minimize the long-run average cost of maintenance actions and failures. Barbera et al[228] extended the previous work to the case of two-unit series systems. Christer et al[229] used a state space model and the Kalman filter to predict the erosion condition of the inductors in an induction furnace conditional on the indirect measurements to date. Then a replacement cost model was developed to obtain the optimal replacement policy given all available information. Kumar and Westberg[230] proposed a reliability based approach for estimating the optimal maintenance time interval or the optimal threshold of the maintenance policy to minimize the total cost per unit time. The authors used PHM to identify the importance of monitored variables and a total time on test (TTT) plot to find the optimal solution. Makis and Jardine[231] established a CBM model using a Markov process to describe the evolution process of condition variables and a PHM to describe the failure mechanism which depends both on age and condition variables. This CBM model was further elaborated in[232].
The optimal replacement policy of the hazard control limit type was then determined by minimizing the long-run expected total cost per unit time. Makis et al[233] applied optimal stopping theory to find the replacement policy maximizing the total expected profit during the machine life where no assumption of monotonicity of the signal process is made. Makis and Jiang[234] presented a framework for CBM optimization based on a continuous-discrete stochastic model. The evolution of the hidden machine state was described by a continuous-time Markov process, and the condition monitoring process was described by a discrete-time observation stochastic process which depends on the hidden machine state. Then the optimal replacement policy was found to minimize the long run expected cost per unit time using optimal stopping theory. Wang[235] applied a stochastic recursive control model for CBM optimization based on the assumption that the item monitored follows a two-period failure process, with the first period being normal life and the second a potential-failure period. A stochastic recursive filtering model was used to predict the residual life, and then a decision model was established to recommend the optimal maintenance actions. The optimal condition monitoring intervals were determined by a hybrid of simulation and analytical analysis. Okumura and Okino[236] constructed a generalized condition-based maintenance model, in which residual life loss and replacement preparation lead-time are included. The optimal inspection time vector and warning level of the target maintained system under a constraint on the preventive replacement probability were obtained by minimizing the long-run average incurred cost per unit time. Barros et al[237] considered an optimal CBM policy for a two-unit parallel system of which unit-level monitoring information is imperfect and/or partial.

Condition monitoring interval

There are two broad types of condition monitoring: continuous and periodic. In continuous monitoring, a machine is monitored continuously (usually by mounted sensors) and a warning alarm is triggered whenever something wrong is detected. Two limitations of continuous monitoring are: 1) it is often expensive; 2) the continuous monitoring of raw signals produces large volumes of data, including noise, leading to difficult and inaccurate diagnostics. Periodic monitoring, therefore, is often used because it is more cost effective. Diagnostics from periodic monitoring are often more accurate due to the use of filtered and/or processed data. Of course, the risk of periodic monitoring is the possibility of missing some failure events that occur between successive inspections ([34], p. 131).

An important issue relevant to periodic monitoring is the determination of the condition monitoring interval. Optimal design of the condition monitoring interval (or inspection interval) has been studied together with optimal threshold design in some of the works discussed in the previous section[215-219,223,230,235,236]. The following research works considered condition monitoring interval determination only. Christer and Wang[238] derived a simple model to find the optimal time for the next inspection based upon the wear condition obtained up to the current inspection. The criterion is to minimize the expected cost per unit time over the time interval between the current inspection and the next inspection time. Okumura[239] used a delay-time model to obtain the optimal sequential inspection intervals of a CBM policy for a deteriorating system by minimizing the long-run average cost per unit time. Goode et al[240] used the model developed in[101] to determine the length of the next condition monitoring interval for a given risk level. Wang[241] developed a model for optimal condition monitoring intervals based on the failure delay time concept and the conditional residual time concept. Condition monitoring is assumed to be performed at a fixed interval over the whole life, and at a dynamic interval in the failure delay-time period, recognizing that more frequent monitoring might be needed in this latter period. A hybrid of simulation and analytical procedures was used to find the optimal intervals based on one of five cost criterion functions.
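As a much simplified illustration of the interval tradeoff (not the delay-time model of[241] itself), the fragment below balances inspection cost per unit time against the expected penalty of a defect going undetected until the next inspection:

```python
import numpy as np

# Simplified inspection-interval tradeoff: inspections cost c_i each; defects
# arrive at rate lam per hour and, once present, incur a penalty c_d per hour
# until the next inspection reveals them. All numbers are invented.
c_i, c_d, lam = 100.0, 8.0, 0.05

def cost_rate(T):
    # Inspection cost per hour + expected undetected-defect cost per hour.
    # A defect arriving uniformly within an interval waits T/2 on average.
    return c_i / T + c_d * lam * T / 2.0

T_grid = np.linspace(1, 100, 1000)
T_best = float(T_grid[np.argmin([cost_rate(T) for T in T_grid])])
print(round(T_best, 1), round(cost_rate(T_best), 2))
```

For this toy cost rate the grid minimum agrees with the analytic optimum T* = sqrt(2·c_i/(c_d·λ)).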

Multiple sensor data fusion

For a complex system, a single sensor is limited in its capability to collect enough data for accurate condition monitoring, fault diagnosis and prognosis. Multiple sensors are needed to do a better job. With the rapid development of computer science and advanced sensor technology, there has been an increasing trend towards the use of multiple sensors for condition monitoring, fault diagnosis and prognosis. Data collected from different sensors may contain dissimilar partial information on the same machine’s condition. The problem is how to combine all the partial information obtained from different sensors for accurate machine diagnosis and prognosis. The solution to this problem is the subject of multisensor data fusion.

There are many techniques for multisensor data fusion. They can be grouped into three main approaches: (1) data-level fusion, (2) feature-level fusion, and (3) decision-level fusion. For more discussion on these three approaches, see[242,243]. Heger and Pandit[90] used a data-level fusion approach to fuse images obtained by multidirectional illumination to generate an image with a high degree of relevant information for grinding tool condition monitoring and fault diagnostics. Liu and Wang[244] briefly reviewed some applications of these three multisensor data fusion approaches to machine diagnosis and prognosis, and applied a feature-level fusion approach called the Cascade-Correlation neural network to rotating imbalance diagnosis. Diagnostics based on multisensor data fusion were shown to outperform diagnostics based on a single sensor. Wang and Wang[245] used a decision-level data fusion approach called Dempster-Shafer evidence theory for diesel engine fault diagnosis. Kozlowski et al[246] proposed a model-based approach to battery diagnostics using decision-level data fusion. Byington et al[247] explored methods of fusing non-commensurate oil and vibration features for better gearbox fault diagnostics and prognostics. Mannan et al[248] applied a radial basis function neural network to fuse the features extracted from images of machined surfaces and acoustic signals generated during the machining process. The results were applied to the diagnostics of cutting tools. Hannah et al[249] discussed frameworks in data fusion applications for condition monitoring and diagnostic engineering. Data fusion combined with CBM optimization was studied in[250,251]. Assessment and evaluation of data and information fusion strategies were discussed in[252,253]. Wang and Wang[254] discussed the reliability and self-diagnosis of sensors in a multisensor data fusion diagnostic system.
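As an illustration of decision-level fusion, the following sketch applies Dempster’s rule of combination, the core of Dempster-Shafer evidence theory, to two hypothetical sensor opinions about whether a machine is faulty. The sensor names and the mass values are invented for the example and do not come from the cited studies:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two basic probability
    assignments over a common frame; focal elements are frozensets."""
    combined = {}
    conflict = 0.0
    for (a, p), (b, q) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + p * q
        else:
            conflict += p * q  # mass assigned to contradictory pairs
    # normalize by the non-conflicting mass
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

FAULT, NORMAL = frozenset({"fault"}), frozenset({"normal"})
THETA = FAULT | NORMAL  # the whole frame: total ignorance

# hypothetical evidence from a vibration sensor and an oil-analysis sensor
m_vibration = {FAULT: 0.6, NORMAL: 0.1, THETA: 0.3}
m_oil       = {FAULT: 0.5, NORMAL: 0.2, THETA: 0.3}

fused = dempster_combine(m_vibration, m_oil)
print({tuple(sorted(s)): round(v, 3) for s, v in fused.items()})
```

Note how the fused belief in "fault" (about 0.76 here) exceeds either sensor’s individual mass, while the residual ignorance shrinks: agreeing evidence reinforces itself, which is the attraction of this rule for diagnosis.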

In a mechanical system with multiple sensors installed, data collected from each sensor may be a complicated mixture of data from several sources, but only some of the sources are related to a particular machine condition of interest. The problem is to separate the various sources, by fusing the observed multisensor data, for better machine diagnosis and prognosis. The technique for solving this problem is known as blind source separation (BSS)[255]. Recently, BSS has received increasing attention in the area of machine fault diagnostics and prognostics. The general idea behind BSS is shown in Figure 12‑3. It is assumed that the source signals s(t) = (s1(t), ..., sn(t)), generated from n unknown independent sources, and the noise signals v(t), independent of the source signals, are combined together by an unknown mixing process. The mixed result is observed at the channel output as an m-dimensional (m ≥ n) signal x(t). A formula for the mixing process can be written as

x(t) = f(s(t), v(t)),

where f(·) is generally a non-linear, time-dependent function. A commonly used form of the mixing process separates the signal and noise, i.e., x(t) = f(s(t)) + v(t). The objective of BSS is to find a separating function g(·) that is applied to the observed signals to obtain an estimate ŝ(t) = g(x(t)) of the source signals s(t).

Figure 12‑3: General idea of BSS

In the literature, there are two categories of mixing process: instantaneous and convolutive. A mixing process is instantaneous if f(·) is a time-independent (memoryless) function, and convolutive otherwise. The convolutive mixing process is more common, especially for mechanical systems. The instantaneous mixing model is also called an “independent component analysis” (ICA) model, which is a natural extension of PCA. For a survey of ICA theory and methods, see[256]. Several authors applied ICA together with other signal processing techniques for condition monitoring and machine fault diagnosis[257-260]. Tian et al[261] used ICA in the frequency domain and wavelet filtering for gearbox fault diagnostics. Zhang et al[262] studied ICA for partially blind source separation of diagnostic signals for bearing faults with prior knowledge. For a convolutive mixing process, BSS is more complicated. Gelle et al[263] compared two approaches, namely a temporal approach and a frequency approach, to solving the BSS problem of rotating machine signals for monitoring and diagnosis purposes. They further studied the application of the temporal approach to bearing fault diagnostics[264]. Tse and Zhang[265] applied a BSS method based on second-order statistics to separate aggregated vibration signals generated from a number of mechanical components for machine fault diagnostics. Vilela et al[266] used the temporal de-correlation approach to separate mixed acoustic signals for machine monitoring and fault diagnosis. Serviere et al[267] applied BSS to separate noisy harmonic signals for rotating machine diagnostics on a semi-blind mixing basis.
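The instantaneous (ICA) case can be sketched end to end with NumPy alone. The example below, a minimal FastICA implementation, is not the algorithm of any of the cited papers; the two synthetic sources (a tone and an impacting-type square wave, chosen only as stand-ins for machine signatures), the mixing matrix, and all parameter values are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * np.pi * 5 * t)            # e.g. a shaft-related tone
s2 = np.sign(np.sin(2 * np.pi * 3 * t))   # e.g. an impacting component
S = np.vstack([s1, s2])

A = np.array([[1.0, 0.6], [0.5, 1.0]])    # unknown instantaneous mixing
X = A @ S                                  # observed two-sensor signal x(t)

# center and whiten the observations
X = X - X.mean(axis=1, keepdims=True)
cov = np.cov(X)
d, E = np.linalg.eigh(cov)
Z = E @ np.diag(d ** -0.5) @ E.T @ X

# FastICA fixed-point iteration with a tanh nonlinearity (deflation scheme)
W = np.zeros((2, 2))
for i in range(2):
    w = rng.standard_normal(2)
    for _ in range(200):
        g = np.tanh(w @ Z)
        w_new = (Z * g).mean(axis=1) - (1 - g ** 2).mean() * w
        # decorrelate from previously extracted components
        w_new -= W[:i].T @ (W[:i] @ w_new)
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1) < 1e-9
        w = w_new
        if converged:
            break
    W[i] = w

S_hat = W @ Z  # recovered sources, up to sign and ordering

# each recovered component should match one source up to sign
C = np.abs(np.corrcoef(np.vstack([S_hat, S]))[:2, 2:])
print(C.round(2))
```

The recovered components match the original sources only up to scaling, sign, and permutation, which is the well-known inherent ambiguity of BSS; for convolutive mixtures the separating function must additionally undo filtering, which is why that case is considerably harder.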

Concluding remarks

In this chapter, we have summarized recent research and developments in machinery diagnostics and prognostics used in implementing CBM. Various techniques, models and algorithms were reviewed. Of the three main steps of a CBM program, namely, data acquisition, signal processing, and maintenance decision making, we focused on the latter two. Finally we discussed various techniques for multiple sensor data fusion.

Although advanced maintenance techniques are available in the literature, CBM is under-employed by maintenance departments. Commercial predictive maintenance solution providers have not kept pace with recent advances in signal processing and decision support, despite the many situations, especially where both maintenance and failure are very costly, in which well-developed and well-managed condition-based maintenance is clearly a better choice than current time-based, or inadequate condition-based, maintenance policies. Expert knowledge of both the application field and of reliability and maintenance theory is required for selecting and implementing effective condition-based maintenance policies in each operating context.

Among the reasons that advanced maintenance technologies have not been widely implemented in industry are: 1) lack of data due to incorrect data collection approaches (see Chapter 11, page 155); 2) lack of efficient communication between theory developers and practitioners in the area of reliability and maintenance; 3) lack of efficient validation approaches; and 4) difficulty of communicating the principles of CBM to business policy makers and management executives.

With the rapid development of MEMS (micro-electro-mechanical systems) technology, future trends in CBM research will include the design of intelligent devices capable of continuously monitoring their own health (see, e.g., [268]). Fast and robust on-line signal processing algorithms are crucial to the design of such intelligent devices. This novel technology will, no doubt, stimulate increased research interest in this area. Another trend in CBM research is a growing collaboration among different, yet individually specialised, CBM research groups for the joint development of integrated platforms for enhanced diagnostics and prognostics (see[2] for an application of this idea).
