Predictive maintenance framework for assessing health state of centrifugal pumps

ABSTRACT


INTRODUCTION
Industry 5.0 is shaping technology-driven ecosystems into sustainable, resilient, human-centric, and value-driven ecosystems [1]. Predictive maintenance (PdM) uses input measurements characterized by high velocity, variability, veracity, volume, and value [2] to generate production forecasts, provide crucial information on equipment condition, and facilitate maintenance management [3]. Specifically, PdM can reduce maintenance and overtime costs by 20% while decreasing downtime by 5% [4]. Additionally, according to [5], a PdM solution can predict approximately 70% of failures and reduce scheduled repairs and maintenance costs by up to 12% and 30%, respectively. Therefore, further research on PdM in modern industrial environments is crucial for constructing innovative, sustainable, and resilient manufacturing processes. Several maintenance approaches are applied, namely corrective, preventive, and predictive maintenance [6]. As defined by [7], corrective maintenance, also named run-to-failure, focuses on repairing equipment or individual system components following their malfunction. It is argued in [8] that, in corrective maintenance, replacements or repairs are performed when the critical part is entirely worn out and a failure occurs. Hence, system malfunctioning can lead to unwanted events, jeopardizing the safety of operators and increasing production downtimes. In contrast, preventive maintenance is performed periodically in specific timeframes regardless of the system's health state [9].
PdM increases safety and productivity, decreases downtimes, and reduces operational costs. Historical data containing key input features, statistical outputs, and data-driven algorithms enable the prediction and early detection of malfunctions [10]. PdM tools identify the necessity of a maintenance action based on sensor measurements for condition-based maintenance (CBM) approaches [11] or on the prognosis of the remaining useful life (RUL) of industrial equipment [12]. Specifically, CBM uses historical or real-time data to diagnose the state of critical components and schedule maintenance prior to breakdown [11]. Moreover, the reliability of prognosis and health management systems is based upon diagnosing the degradation state of critical components for RUL prediction [8]. Important indicators of the health state of a component or machine vary across manufacturing sectors. The most common vital measurements are vibrations, temperatures, acoustic emissions, currents, pressures, and rotational speed [13].
In the transition towards Industry 5.0, the advent of AI, namely machine and deep learning algorithms and big data analytics, has provided researchers with valuable tools for predicting the condition of machinery and its linear degradation over time. AI algorithms capture sensor measurements (input features) and produce classification or regression outputs (labels), such as the 'healthy' and 'unhealthy' stage, by employing mathematical equations in the form of activation and loss functions. A predictive algorithm is trained on historical data (input features) to accurately predict the target output, labelling the stage of the machinery to identify an impending failure [14]. Commonly used machine learning (ML) models include random forest (RF), Naïve Bayes, support vector machines (SVM), and extreme gradient boosting (XGBoost) [15].
There are many reports on the successful application of AI models in forecasting health states or upcoming failures in electric induction motors. An anomaly detection approach using the Simulink/MATLAB programming environment was presented in [16] for an electrical motor-driven system connected with a gearbox. Vibration, rotation axis, and current signals were proposed as input features for the artificial neural network (ANN) detection algorithm. Similarly, a deep convolutional neural network (CNN) model for machine state identification of conveyor motors was presented [17]. Key input features were considered, namely vibration, temperature, pressure, acceleration, rotational speed, and torque, while accuracy, precision, and recall were the evaluation metrics. The authors' findings indicated that vibration velocity above 4.5 mm/s constitutes unsatisfactory vibration severity. Focusing on centrifugal pumps, a failure classification approach was presented [18]. Using a context-based multilayered Bayesian algorithm with vibration, temperature, and pressure as input features, the authors classified failure data into multiple classes based on the estimated fault magnitude with an F1-score of 98%. However, most research papers analyzed PdM solutions handling historical measurements of healthy induction motors, where malfunction data is sparse. This poses a challenge for health state prediction: a dataset biased towards healthy state records reduces the overall accuracy of real-time malfunction prediction. To overcome this issue, our study proposes a PdM solution where measurements collected in the healthy and maintenance-prone stages of centrifugal pumps have similar volumes. The novelty of our study in facing this challenge is the collection of multistage health condition data from pumps of the same manufacturer, providing adequate information for the prediction algorithms to recognize and classify health state conditions. Additionally, our research proposes model optimization using Spearman statistical correlation for feature selection, enhancing the overall accuracy of the PdM approach. Overall, this contributes to RUL research, further assisting the safety and productivity of the system.
The main objective of this research is to develop a PdM model based on data collected from two different centrifugal pumps, in healthy and maintenance-prone stages, respectively. Moreover, descriptive statistics and Spearman statistical analysis will be conducted to identify the correlation between input measurements and the predicted health state label. Hence, a generic model will be developed for producing accurate outcomes based on historical input data. Furthermore, typical AI models, namely RF, Naïve Bayes, SVM, and XGBoost, will be evaluated based on "Accuracy", "Precision", "Recall", "F1 score", and "Cohen Kappa score". The prediction model's scope is to maximize the overall system's efficiency, increase reliability and productivity, and reduce maintenance costs and downtimes. Moreover, state-of-the-art AI algorithms are compared to provide meaningful insights regarding the health state of the equipment based on crucial input features extracted from historical data. The training process will utilize two different datasets, depicting the healthy and maintenance-prone stages of centrifugal pumps respectively, overcoming the issue of analyzing measurements biased towards healthy data outputs. Our solution provides the tested algorithms with insightful information assisting in health state recognition and malfunction prediction of centrifugal pumps. The remainder of the paper is structured as follows: i) section 2 presents the methodology applied to this research; ii) section 3 describes the use case and overall results; and iii) section 4 provides the conclusions of this work and future research directions.

METHOD
In the proposed case study, two centrifugal pumps in different health stages, healthy and maintenance-prone, were studied using a PdM approach to compare various AI algorithms and optimize the selection of input features. Using the Mitsubishi Electric FA Smart Condition Monitoring kit provided by UTECO S.A., GX Works 3 software, and Python scripts, the authors collected 5,118 rows of measurements depicting key features: velocity, demodulation, acceleration, and temperature. Data analysis techniques, namely data cleaning and dimensionality reduction, were applied for handling and pre-processing heterogeneous sensor data. In real-world applications, the initial measurements frequently contain inconsistencies such as missing values, duplicated data, outliers, and structural errors. Addressing these inconsistencies before feeding the input features to a predictive algorithm is necessary. Hence, a data analysis procedure that processes raw heterogeneous measurements is essential before any decision-making approach. The initial stage of the analysis procedure consists of data cleaning, which can be defined as detecting and correcting errors [19]. As a first step, researchers and analysts handling raw data inputs should identify missing or not a number (NaN) values and apply any necessary actions, namely replacing or deleting them.
Moreover, a common issue in data analytics procedures is dimensionality reduction. One of the challenges in predictive maintenance applications is the significant difference between the number of records depicting healthy and malfunctioning measurements. In most applications, embedded sensors store measurements of the healthy system, creating a skewed ratio and biasing the predictions towards a healthy system output. The novelty of our research in facing this issue is to handle and merge two different datasets of centrifugal pumps of the same manufacturer. The datasets cover pumps of varying working hours and age, one containing healthy and one maintenance-prone data, to provide adequate information for the training model to recognize and classify health state conditions. Following pre-processing, the topics discussed in this research are feature selection and data mining procedures. For feature selection, Spearman statistical analysis was conducted to assess the collected measurements and select the most appropriate features for model optimization. Regarding data mining, RF, Naïve Bayes, SVM, and XGBoost were evaluated based on "Accuracy", "Precision", "Recall", "F1 score", and "Cohen Kappa score" for health state prediction of centrifugal pumps. Figure 1 proposes a comprehensive framework for PdM in centrifugal pumps.
... maintenance-prone measurements. Specifically, each pump outputted five columns containing key features: temperature and vibration parameters such as velocity_ISO, rms_demodulation, rms_acceleration, and peak-to-peak acceleration. Moreover, the final column included the status characterisation, meaning the output used for the prediction algorithm, indicating whether the centrifugal pump was healthy or not at the exact moment of measurement [20]. Variable velocity_ISO denotes the rotational speed of the centrifugal pumps, measured in mm/s. High velocity values usually indicate imbalance or misalignment in the monitored system. The work in [21] used rotational speed in a deep transfer learning approach for upcoming failure prediction on a rotor kit. Furthermore, demodulation measurements can be vital for early indication of bearing failure. The output frequencies identified in the demodulation spectrum are helpful for damage detection in rolling element bearings. A hybrid PdM approach combining data-driven algorithms and a physics-based model was proposed in [22] to predict the optimal maintenance timeline of a computer numerically controlled (CNC) milling machine. The time domain force X root mean square (RMS) and the frequency domain forces were among the features with the highest correlation, while the hybrid approach outputted an error ratio of 3.17%.
Additionally, acceleration denotes the rate of change of velocity over time, in both speed and direction, and can indicate gear defects. The fast Fourier transform (FFT) converts the time waveform into an acceleration spectrum, followed by mathematical equations that produce the velocity and demodulation outputs. A health state classification approach for conveyor motors [17] used acceleration alongside temperature and rotational speed, outputting 100% correct failure classification.
Finally, this research also considered peak-to-peak acceleration and temperature measurements for the health state classification of centrifugal pumps. Peak-to-peak refers to the maximum distance between the negative and positive peaks of the vibration spectrum. The amplitude indicates the vibration's intensity and depicts the severity of the detected issue. At the same time, temperature (°C) can be a vital indication, as a machine's temperature is reported to rise rapidly before its total malfunction. In their analysis, [23] considered statistical values of vibration measurements for anomaly and RUL prediction in CNC milling machines.
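As an illustration of the vibration statistics described above, the following minimal sketch computes the RMS and peak-to-peak values of a sampled waveform. The sine signal, its amplitude, and the sample count are invented for illustration and are not taken from the pump measurements.

```python
import math


def rms(samples):
    """Root mean square of a sampled waveform."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def peak_to_peak(samples):
    """Distance between the highest positive and the lowest negative peak."""
    return max(samples) - min(samples)


# Synthetic waveform (hypothetical): amplitude 2.0, five full cycles.
N_SAMPLES, CYCLES, AMPLITUDE = 1000, 5, 2.0
waveform = [
    AMPLITUDE * math.sin(2 * math.pi * CYCLES * k / N_SAMPLES)
    for k in range(N_SAMPLES)
]

waveform_rms = rms(waveform)           # close to AMPLITUDE / sqrt(2) for a pure sine
waveform_p2p = peak_to_peak(waveform)  # close to 2 * AMPLITUDE
```

For a pure sine, the RMS is the amplitude divided by the square root of two and the peak-to-peak value is twice the amplitude, which makes the synthetic signal a convenient sanity check.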
The proposed use case denotes a condition-based maintenance approach with a binary classification output predicting the health state of centrifugal pumps. Variables collected from embedded sensors of the Mitsubishi Smart Condition Monitoring kit, namely machine, temperature, and vibration features, were considered for further processing and model optimization. Centrifugal pumps consist of mechanical parts and bearings whose condition state changes due to long-term use and overall strain (Figure 2), making them suitable for PdM applications. Our research focuses on developing and optimizing a predictive model to accurately characterize the condition state of centrifugal pumps based on two specifically selected datasets depicting healthy and maintenance-prone measurements, respectively.

Data analysis of input features
The initial data analysis procedure concerned the handling of missing or NaN values. As mentioned in section 2, the approach to handling missing data is replacing or deleting them, depending on the volume of particular instances in the overall dataset. In our case, the measurements containing NaN values were sparse (three instances), and the approach chosen was removing the entire record. Both healthy and maintenance-prone measurements were combined to compose a complete, unbiased dataset for data mining procedures and health state classification. Furthermore, the Spearman correlation method was used to find the significance of each feature, enabling the selection of the most suitable features to be applied to the prediction algorithm.
Int J Artif Intell, Vol. 13, No. 1, March 2024: 850-862, ISSN: 2252-8938
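The cleaning and merging step described above can be sketched as follows. This is a minimal illustration, not the authors' script: the feature names follow those listed in the text, while the record layout, the `status` key, and all sample values are assumptions made for the example.

```python
import math

# Feature names as listed in the text; the dictionary layout is an assumption.
FEATURES = [
    "velocity_ISO", "rms_demodulation", "rms_acceleration",
    "peak_to_peak_acceleration", "temperature",
]


def clean_and_label(rows, status):
    """Drop every record with a missing/NaN feature and attach the status label."""
    cleaned = []
    for row in rows:
        values = [row.get(name) for name in FEATURES]
        if any(v is None or (isinstance(v, float) and math.isnan(v)) for v in values):
            continue  # remove the entire record, as described in the text
        cleaned.append({**row, "status": status})
    return cleaned


# Invented sample records (not real pump measurements).
healthy_raw = [
    {"velocity_ISO": 0.4, "rms_demodulation": 0.05, "rms_acceleration": 0.10,
     "peak_to_peak_acceleration": 0.9, "temperature": 14.2},
    {"velocity_ISO": 0.5, "rms_demodulation": 0.06, "rms_acceleration": 0.11,
     "peak_to_peak_acceleration": 1.0, "temperature": float("nan")},
    {"velocity_ISO": 0.3, "rms_demodulation": 0.04, "rms_acceleration": 0.09,
     "peak_to_peak_acceleration": 0.8, "temperature": 13.9},
]
prone_raw = [
    {"velocity_ISO": 2.1, "rms_demodulation": 0.31, "rms_acceleration": 0.55,
     "peak_to_peak_acceleration": 3.2, "temperature": 15.0},
    {"velocity_ISO": 2.4, "rms_demodulation": 0.35, "rms_acceleration": 0.60,
     "peak_to_peak_acceleration": 3.6, "temperature": 14.7},
]

# 0 = healthy, 1 = maintenance-prone; merging yields one combined dataset.
merged = clean_and_label(healthy_raw, 0) + clean_and_label(prone_raw, 1)
```

Here the record with a NaN temperature is dropped, leaving two healthy and two maintenance-prone rows in the merged dataset.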
Feature selection focuses on removing non-informative input variables from historical data that do not enhance the efficiency of the prediction model and may cause overfitting issues [24]. Spearman correlation is appropriate when handling continuous parameters, such as time series and sensor measurements, where input features and predicted labels do not express a linear relationship or the predicted output is an ordinal value. Hence, in this research, Spearman is considered the most suitable method. The mathematical expression of the Spearman correlation analysis can be written as (1):

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \qquad (1)$$

where $d_i$ = difference in paired ranks and $n$ = number of cases.
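The rank-difference form of Spearman's coefficient, rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), can be sketched in plain Python as below. The function names and sample values are hypothetical. Note that this closed form is exact only when there are no tied ranks; with many ties, as with a binary health label, library routines such as `scipy.stats.spearmanr` compute the Pearson correlation of the ranks instead.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation via the rank-difference formula (no-ties case)."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Invented values: two pairs of neighbouring ranks are swapped.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

In this example the rank differences are -1, 1, -1, 1, and 0, so the sum of squares is 4 and rho = 1 - 24/120 = 0.8.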

AI algorithm selection
The advent of artificial intelligence (AI), namely ML and deep learning algorithms alongside big data analysis, has provided researchers with valuable tools for predicting the condition of machinery or its RUL. This study examines and compares ML algorithms that efficiently describe the linear degradation of machinery parts (bearings, gearboxes, and many more) on centrifugal pumps. RF, Naïve Bayes, SVM, and XGBoost are state-of-the-art algorithms referred to in the literature [25].
Moreover, a Naïve Bayes algorithm was selected for condition state prediction of machine components in a steel hot rolling mill process, providing a root mean square error (RMSE) of 2.98 [26]. The authors considered steel type, weight, length, temperature, maintenance dates, and thickness parameters as input features. Furthermore, Tutivén et al. [27] proposed an SVM algorithm for bearing failure prediction on wind turbines. It was highlighted that the accuracy of their model was optimal due to the high correlation between mean central shaft temperature and bearing failure.

Random forest
One of the most popular and efficient ML algorithms, RF, can be selected for classification (RF classifier) or regression (RF regressor) outputs. It is a well-suited candidate for health state classification and RUL applications. Its main philosophy is built upon decision trees: a collection of CART-like trees where each individual tree grows while training. Precisely, a 'forest' consists of a certain number of decision trees, each of which outputs a specific prediction. Decision trees resemble the structure of a tree containing roots, branches, and leaves, which correspond to properties, decision rules, and outcomes, respectively. The central concept of the algorithm is the strategic choice (hill climbing) of the independent variable on which each branch will expand. Information entropy in (2), which summarizes the impurity of the sample, is one of the most commonly applied criteria for selecting the independent variable.

$$Entropy(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-} \qquad (2)$$

where $S$ is the training sample of the separation node, $p_{+}$ is the fraction of positive examples of $S$, and $p_{-}$ is the fraction of negative examples of $S$. Hence, the final predicted output expresses the majority of the decision outcomes, a prediction method also called "bagging", decreasing the possibility of error.
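A minimal sketch of the binary entropy criterion and of majority voting over tree outputs is given below. It assumes labels coded as 1 (positive) and 0 (negative); the function names and sample labels are hypothetical and only illustrate the idea, not a full random forest.

```python
import math
from collections import Counter


def binary_entropy(labels):
    """Entropy(S) = -p+ * log2(p+) - p- * log2(p-), with 0 * log2(0) taken as 0."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_pos = sum(1 for label in labels if label == 1) / n
    p_neg = 1 - p_pos
    return -sum(p * math.log2(p) for p in (p_pos, p_neg) if p > 0)


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * binary_entropy(left) \
        + (len(right) / n) * binary_entropy(right)
    return binary_entropy(parent) - weighted


def majority_vote(tree_predictions):
    """Final forest output: the label predicted by most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


mixed_node_entropy = binary_entropy([0, 1, 0, 1])                 # maximally impure node
perfect_split_gain = information_gain([0, 0, 1, 1], [0, 0], [1, 1])
forest_output = majority_vote([1, 0, 1])
```

A node with equal shares of both classes has entropy 1 bit, and a split that separates the classes perfectly recovers all of it as information gain.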

Naïve Bayes
The Naïve Bayes algorithm, based on the Bayes theorem, calculates the probability of a particular event A occurring given that an event B has already happened. In the prediction stage of the algorithm, it is assumed that the selected variables are independent. Thus, each input feature independently contributes to the classification output regardless of any possible correlation among the given variables. In addition, the theorem overcomes the issue of calculating probabilities where valuable information is absent or many parameters must be considered for an accurate output, by calculating conditional probabilities and using estimations instead of event frequencies. The respective probabilities are calculated by (3), called the Bayes classifier.

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \qquad (3)$$

where $P(A)$ and $P(B)$ are the probabilities of events A and B occurring, respectively, $P(A \mid B)$ is the probability of A occurring when B has already occurred, and $P(B \mid A)$ is the probability of B occurring when A has already occurred.
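A small worked example of the Bayes theorem, P(A|B) = P(B|A) P(A) / P(B), is sketched below. All probabilities are invented for illustration: event A is "the pump is maintenance-prone" and event B is "a high vibration reading is observed".

```python
def bayes_posterior(prior_a, likelihood_b_given_a, evidence_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical probabilities (not estimated from the pump dataset).
p_fault = 0.5               # P(A): balanced classes
p_high_given_fault = 0.9    # P(B|A)
p_high_given_healthy = 0.1  # P(B|not A)

# Law of total probability gives the evidence term P(B).
p_high = p_high_given_fault * p_fault + p_high_given_healthy * (1 - p_fault)

p_fault_given_high = bayes_posterior(p_fault, p_high_given_fault, p_high)
```

With these assumed numbers, P(B) = 0.45 + 0.05 = 0.5, so the posterior probability of a maintenance-prone pump given a high reading is 0.45 / 0.5 = 0.9.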
Int J Artif Intell ISSN: 2252-8938  Predictive maintenance framework for assessing health state of centrifugal pumps (Panagiotis Mallioris) 855

Support vector machines
The SVM, developed in [28], [29], follows the concept of classifying input features by creating hyperplanes in a multi-dimensional space with the same number of dimensions as the features. The optimal solution is found when the margin of the separating hyperplane constitutes the maximum distance between the closest data points of the classified features. This distance is called the maximum margin and, in linearly separable problems, is defined by the data points closest to the hyperplane (support vectors). The margin is calculated with constrained quadratic optimization algorithms, such as the hinge loss formulation. An advantage of the SVM algorithm is that, through kernel functions, non-linearly separable problems can be converted into linearly separable ones by transforming the original hypothesis space, and eventually solved. The overall cost function, written here in the standard soft-margin hinge-loss form consistent with the symbols defined below, is (4):

$$J(w) = \frac{1}{2} \sum_{j=1}^{n} w_j^2 + C \sum_{i=1}^{m} \max\left(0,\ 1 - y_i \left(w^{T} x_i + b\right)\right) \qquad (4)$$

where $w$ denotes the vector onto which the sample points are projected, $n$ is the number of features, $m$ is the number of samples, $C$ denotes a scalable number for misclassification control, $x_i$ and $y_i \in \{-1, +1\}$ are the $i$-th sample and its label, and $b$ is the bias term.
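As a hedged illustration of a hinge-loss cost of this kind, the sketch below evaluates the common soft-margin objective, 0.5 * ||w||^2 + C * sum(max(0, 1 - y_i (w . x_i + b))), assuming labels coded as +1 and -1. The weights, samples, and function names are invented, and the sketch only evaluates the cost rather than optimizing it.

```python
def dot(u, v):
    """Inner product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def svm_cost(w, b, samples, labels, c):
    """Soft-margin SVM objective: regularization term plus C-weighted hinge loss."""
    regularization = 0.5 * sum(w_j * w_j for w_j in w)
    hinge = sum(
        max(0.0, 1 - y_i * (dot(w, x_i) + b))
        for x_i, y_i in zip(samples, labels)
    )
    return regularization + c * hinge


# Hypothetical two-feature example with labels in {-1, +1}.
w, b = [1.0, 0.0], 0.0
samples = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
labels = [1, -1, 1]

cost = svm_cost(w, b, samples, labels, c=1.0)
```

The first two samples lie outside the margin and contribute no loss; the third lies inside the margin (functional margin 0.5) and contributes 0.5, so the total cost is 0.5 + 0.5 = 1.0. A larger C penalizes such margin violations more heavily.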

XGBoost
Extreme gradient boosting (XGBoost) implements the gradient boosting algorithm upon decision trees. Boosting refers to creating vectors composed of the function derivatives, calculated on each input feature, thus optimizing weaker branches and adding new branches which predict the residual errors of the previous ones. Furthermore, an objective function is constantly minimized and updated, combining a convex cost function that quantifies the predictive model's accuracy and a penalty term. XGBoost employs the Quantile Sketch algorithm to secure the optimal output among weighted input features and a cross-validation method at each iteration. The mathematical equation of XGBoost can be defined as (5):

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F \qquad (5)$$

where $K$ denotes the number of trees, $f$ is a function in the functional space $F$, and $F$ refers to the set of possible classification and regression trees. Hence, the final predicted output results from the optimization of weaker branches, a prediction method also called "boosting", increasing overall accuracy.
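The additive idea behind boosting, where each new tree predicts the residual errors of the trees before it, can be sketched with one-split regression trees (stumps) and a squared-error loss. This is a simplified illustration under stated assumptions, not XGBoost itself: it omits the regularization term, second-order gradients, and quantile sketch, it expects at least two distinct feature values, and all names and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # splits that keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        sse = (sum((r - left_value) ** 2 for r in left)
               + sum((r - right_value) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_value, right_value)
    _, threshold, left_value, right_value = best
    return lambda x: left_value if x <= threshold else right_value


def boost(xs, ys, n_trees=10, learning_rate=0.5):
    """Return a predictor: base value plus the scaled sum of K residual-fitting stumps."""
    base = mean(ys)
    trees = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)


# Invented one-feature data: low readings are healthy (0), high ones maintenance-prone (1).
predict = boost([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
```

Each round halves the remaining residual in this toy example, so after ten stumps the predictions are within about 0.0005 of the true labels.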

Comparative analysis of the ML algorithms
Several factors must be considered to ensure effective and efficient use of AI algorithms and to highlight their applicability for each domain. Ease of implementation, speed of execution, computational intensity, sensitivity to outliers, sensitivity to specific dataset characteristics, and overall accuracy are some of the critical factors identified in this research study. RF is a versatile algorithm that provides an efficient PdM solution when handling outliers and non-linear input features such as vibration, because all values, including outliers, are treated as positive or negative values outputting the desired result. However, enhancing the overall accuracy of the RF algorithm may require increasing the number of trees and branches, which can increase computational intensity and reduce the speed of execution. On the other hand, although Naïve Bayes is a fast and scalable algorithm capable of handling multiple clusters simultaneously, its estimated probabilities are sensitive to outliers, leading to inaccuracies in classification outcomes. Moreover, despite its implementation simplicity and low computational intensity, the Naïve Bayes algorithm lacks flexibility regarding hyper-parameter fine-tuning. It may also produce inaccurate classification outputs since it assumes independence among all features. Additionally, the main characteristics of the SVM algorithm are its robustness to unprocessed raw data, based on its methodology of determining decision boundaries with support vectors, and its ability to generalize, providing efficient classification outputs on previously unseen input measurements. However, SVM can become computationally and memory intensive when handling high-volume input datasets due to extensive kernel manipulation, and it is highly sensitive to hyper-parameter fine-tuning, namely the regularization parameter and appropriate kernel selection, which can negatively affect overall performance. Similarly, an advantage of the XGBoost methodology is its robustness, reducing the need for extensive fine-tuning of the model parameters and decreasing the possibility of overfitting. Nevertheless, XGBoost can become computationally and memory intensive when handling high-volume datasets [30]. XGBoost is characterized as a highly accurate and efficient state-of-the-art algorithm in classification and regression applications because of its architecture of optimising and learning based on previous outputs and weaker branches.

USE CASE DESCRIPTION
Embedded sensors of the Mitsubishi smart condition monitoring kit collected 5,118 rows of measurements depicting key features, namely velocity_ISO, rms_demodulation, rms_acceleration, and peak-to-peak acceleration, from two centrifugal pumps of the same manufacturer, in a healthy and a maintenance-prone state, respectively. Both machines operate individually in the Alexander Campus of the International Hellenic University facilities; an indicative dataset can be found on Kaggle (https://www.kaggle.com/datasets/panosmallioris/sensor-measurements-of-centrifugal-pumps). It is worth mentioning that the tested pumps work approximately every 5 minutes for a period of 1 minute to fill a water tank, or in extreme cases when a large amount of water supply is requested. Following the removal of NaN values during initial preprocessing, each pump individually outputted a sample of 2,557 rows for further analysis. Another action in handling the raw collected data was removing measurements where the pumps were stationary. The remaining data, namely 857 rows depicting the running states of both pumps, were used for training (80%) and evaluation (20%) of the tested algorithms.
The Python programming language, the Scikit-learn library, and Jupyter Notebook were employed to develop the health state classification ML models. The descriptive statistics of the numerical input features in the training and validation dataset of the RF, Naïve Bayes, SVM, and XGBoost models are summarised in Table 1. The measurements of rms_demodulation and rms_acceleration have relatively low deviations (0.115 and 0.212, respectively), in contrast to velocity_ISO, peak-to-peak acceleration, and temperature, where a high deviation is noticed. These significant deviations in velocity_ISO, peak-to-peak acceleration, and temperature indicate inconsistencies and instability in the output measurements of the tested centrifugal pumps (one in healthy condition and one in maintenance-prone condition, respectively), making them promising candidates for depicting the health state of the machinery and enhancing prediction performance.

AI algorithm selection
In this section, Spearman statistical analysis was conducted using the Python programming language and the stats module from the SciPy library. Spearman correlation values were calculated to pre-determine which of the input features, namely velocity_ISO, rms_demodulation, rms_acceleration, peak-to-peak acceleration, and temperature, are the most informative and further enhance the prediction algorithm's performance. As mentioned in section 2.2, Spearman correlation is selected when the dependent and independent variables do not express a linear relationship, or the predicted label depicts an ordinal value. Therefore, Spearman statistical analysis was considered the most appropriate choice for feature selection and model optimization. The results are presented in Table 2.
The correlation outputs, shown in Table 2, indicated that the velocity_ISO parameter is the most informative for health state classification, with a correlation of 0.865. Moreover, rms_demodulation (0.856), peak-to-peak acceleration (0.849), and rms_acceleration (0.832) also output promising results, making them suitable candidates for feature selection regarding model optimization. However, although temperature is considered one of the most critical inputs for short-term anomaly detection (Figure 2), the high deviation of its measurements arose because the temperature sensor was affected more by the environmental conditions (measurements were conducted around the winter solstice) than by the actual temperature output of the machine. Hence, as calculated by the correlation analysis, the temperature input will have a negligible effect on, or not be relevant to, the health state classification output. Thus, in the specified use case, selecting temperature as an input feature is expected to introduce bias and negatively affect the predictive algorithm for health state classification.

of both types of machinery was split into 80% for training purposes and 20% for validation purposes, respectively. It is worth mentioning that the proposed research handles a balanced dataset with relatively equal amounts of measurements for both the healthy and maintenance-prone states of machinery. If the input dataset were biased, meaning it depicted a specific state more frequently, the ML model could learn to always predict that state, resulting in overall poor performance on new out-of-sample inputs. In such cases, trimming away samples from the high-frequency labels (undersampling) or using class weights, which weight the labels that occur more frequently by a fraction of 1, is preferred. Moreover, this approach set the 'shuffle' parameter to 'True' for the training set. This is an essential aspect of training optimization. The model is more likely to become biased towards a class if the specific label is seen more frequently towards the end of training, even if the dataset is appropriately balanced. This occurs because the model learns that the quickest way to reduce the overall loss is to predict the class seen more frequently in the specific training batches. This leads to spikes in the training loss, and the model will most likely cycle through local minima, outputting the label that is currently being repeated, and never reach the global minimum that represents the optimal model. Therefore, shuffling the training samples together with their targets can enhance the performance of the health state classification model.
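The shuffled 80/20 split and the class-weighting idea described above can be sketched in plain Python as follows. The helper names, the seed, and the toy records are assumptions for illustration; in practice a library routine such as Scikit-learn's `train_test_split` with `shuffle=True` serves the same purpose. The weighting rule shown, n_samples / (n_classes * class_count), is the common "balanced" heuristic and is an assumption, not necessarily the scheme the authors would use.

```python
import random
from collections import Counter


def shuffled_split(records, labels, train_fraction=0.8, seed=42):
    """Shuffle records together with their labels, then split into train/validation."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    cut = int(round(train_fraction * len(indices)))
    train_idx, valid_idx = indices[:cut], indices[cut:]
    return (
        [records[i] for i in train_idx], [labels[i] for i in train_idx],
        [records[i] for i in valid_idx], [labels[i] for i in valid_idx],
    )


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy data: ten record ids, ordered so that all healthy rows (0) come first.
records = list(range(10))
labels = [0] * 5 + [1] * 5

train_x, train_y, valid_x, valid_y = shuffled_split(records, labels)
weights = balanced_class_weights([0, 0, 0, 1])  # imbalanced example
```

Because records and labels are indexed by the same shuffled positions, each sample keeps its own target after shuffling. In the imbalanced example, the frequent class receives a weight below 1 and the rare class a weight above 1.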

Model evaluation
Several evaluation metrics quantify the performance of the prediction outputs of a tested model. The evaluation is performed on the validation dataset, meaning that it uses a part of the overall collected measurements. In our case, health state classification denotes the predicted output. Hence, binary classification evaluation methods will accurately represent the overall system performance. The core of the evaluation methods is the true negative (TN), true positive (TP), false negative (FN), and false positive (FP) values depicting the binary outcome. TN refers to negative instances correctly classified as negative by the model, and TP refers to positive instances correctly classified as positive, while FN refers to positive instances incorrectly classified as negative and FP to negative instances incorrectly classified as positive [31]. Regarding the health state classification outcome, a TN value refers to a healthy pump with a condition output value of 0. In contrast, TP refers to a maintenance-prone centrifugal pump with a condition output value of 1. More specifically, the metrics implemented in our research were accuracy, precision, recall, F1-score, and Cohen Kappa score [32], given in their standard forms in (6)-(10):

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \qquad (6)$$

$$Precision = \frac{TP}{TP + FP} \qquad (7)$$

$$Recall = \frac{TP}{TP + FN} \qquad (8)$$

$$F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \qquad (9)$$

$$\kappa = \frac{p_o - p_e}{1 - p_e} \qquad (10)$$

where $p_o$ refers to the relative observed agreement and $p_e$ refers to the hypothetical probability of chance agreement.
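The five metrics can be computed directly from the TP, TN, FP, and FN counts, as in the minimal sketch below. It uses the standard definitions of accuracy, precision, recall, F1-score, and Cohen's kappa; the function name and the example labels are invented, with 1 denoting maintenance-prone and 0 denoting healthy.

```python
def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics from true and predicted labels (0/1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy flagged as prone
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # prone flagged as healthy
    n = len(pairs)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_o = accuracy
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa}


# Invented example: one healthy pump is wrongly flagged as maintenance-prone (FP).
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 1, 1, 0, 0, 0, 1],
)
```

In this example TP = 4, TN = 3, FP = 1, and FN = 0, giving accuracy 0.875, precision 0.8, recall 1.0, and kappa 0.75: a single false positive lowers precision while recall stays at 100%.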

RESULTS AND DISCUSSION
RF, Naïve Bayes, SVM, and XGBoost models were developed for a CBM approach predicting the health state of centrifugal pumps. Two different approaches were compared, one using velocity_ISO, rms_demodulation, rms_acceleration, peak-to-peak acceleration, and temperature as input features and one removing the temperature parameter, following the Spearman correlation analysis. Insightful metrics (accuracy, precision, recall, F1-score, and Cohen Kappa score) were implemented to evaluate the predictive output. The predicted output denotes a binary system outputting healthy and maintenance-prone conditions. Table 3 presents the results of each model with temperature as an input feature. XGBoost, with 98.83% accuracy and 98.89% F1-score, and RF, with 98.25% accuracy and 98.33% F1-score, outputted the best results, accurately classifying the health state of the validation measurements. Similarly, in Table 4, XGBoost with 99.41% accuracy and 99.31% F1-score and RF with 98.83% accuracy and 98.59% F1-score outputted the best overall results. Moreover, in the second set of experiments, the predictive models outputted more accurate results after removing the temperature parameter as an input feature, verifying the conclusions of the Spearman analysis. Figure 3 presents the overall accuracy performance, depicting the results prior to and following feature selection.
In both experiments, recall reached 100% for every model except SVM with temperature as an input parameter, which produced some FN values. Hence, most misclassifications were FP values (errors in precision), meaning that the predictive model classified the condition of the centrifugal pump as maintenance-prone while the actual dataset label was healthy. Another conclusion worth discussing is that in both experiments, the results were highly accurate (above 96%) without any hyperparameter fine-tuning for each model. Furthermore, trained on a balanced dataset, the machine learning models could recognize the critical inputs differentiating healthy from maintenance-prone measurements and correctly identify the condition of centrifugal pumps for most input measurements. Moreover, the accuracy of the predictions additionally verifies the importance of collecting vibration and temperature measurements in manufacturing processes. Overall, the results confirmed the efficiency of the proposed framework, combining feature selection with the XGBoost algorithm, in industrial applications.
Furthermore, the authors' findings align with previous research in which XGBoost was effectively applied in socioeconomic domains, namely medicine [33], [34], economy [35], cybersecurity [36], language processing [37], and environmental applications [38]. Regarding medical applications, feature selection combined with XGBoost was considered the most effective solution for heart disease classification with 99.6% accuracy [39], improving on the solution of [40], where the proposed decision trees provided 97.75% accuracy. Additionally, a similar framework was implemented in [41] for diabetes prediction, with the presented approach resulting in an area under curve (AUC) of 82%. However, in [42] and [43], which addressed glucose level prediction, XGBoost was not considered the optimal solution, and DNN and RF were the selected algorithms, respectively. Furthermore, the XGBoost algorithm was selected for pregnancy risk monitoring with 96% accuracy [44], whereas an extension of this approach combining CNN and XGBoost was proposed for renal stone diagnosis [45], breast cancer detection [46], and image classification [47] with accuracies of 99.5%. Finally, feature selection combined with ensemble learning was proposed for epileptic seizure detection and classification from electroencephalogram signals with an effectiveness of 96% [48], [49].
Similarly, feature selection and ensemble learning were proposed in the economic sector. A light gradient boosting algorithm for risk analysis [50] and a cost-sensitive XGBoost approach for bankruptcy prediction [51] were developed with highly accurate results (around 95%), whilst [52] improved on those outputs with an accuracy rate of 97%. In addition, principal component analysis (PCA) for feature selection [53] and Bayesian hyperparameter optimization [54] were proposed in conjunction with XGBoost as optimal solutions for crowdfunding and creditworthiness prediction, respectively. In cybersecurity, XGBoost has been extensively applied to denial-of-service attack detection, distinguishing malicious from benign traffic requests. Both [36] and [55] highlighted the effectiveness of combining feature selection and XGBoost, with an overall accuracy of 99%. Furthermore, in [56] a multilayer perceptron (MLP) slightly outperformed XGBoost with 99.3% precision and was suggested by the authors as the optimal solution. Regarding language processing, XGBoost was proposed for sentiment feature selection [37] and text similarity identification [57] with F1-scores of 69% and 89%, respectively, whereas in [58] ANN was selected for human speech recognition with 77% precision. Finally, in terms of environmental applications, [59] combined a grid search algorithm for hyperparameter fine-tuning with an XGBoost model for electricity load prediction. Similarly, [60] showed that ensemble techniques provide an efficient solution for solar radiation forecasting. Additionally, [61], [62] highlighted the effectiveness of ensemble methods in classification, combining XGBoost with ML algorithms for land use and rice leaf disease identification, respectively.
Nevertheless, collecting real-time sensor measurements and transforming the raw data into a machine-comprehensible format can be challenging. The proposed research integrated the proprietary GX Works software tool and custom-built Python scripts to enable an interoperable real-time connection between the framework and Mitsubishi's smart sensor kit. Mapping the appropriate input registers and correctly representing the sensor values in the Jupyter Notebook was a demanding and time-consuming task, which illustrates the complexity of real-time data collection applications. Another challenge is implementing the proposed method and selecting informative features for health state prediction in other industrial applications and machinery besides centrifugal pumps. Hence, through Mitsubishi's smart sensor kit, vibration and temperature measurements were collected from centrifugal pumps depicting the condition and health state of

CONCLUSION AND FUTURE DIRECTIONS
PdM constitutes one of the promising future concepts for a resilient and reliable industrial environment. The rise of digitalization, big data analysis, and AI enhances the implementation of PdM applications assessing the health state of machinery. The proposed approach focuses on developing a condition-based maintenance solution predicting the health state of centrifugal pumps based on vibration and temperature measurements. Our research proposes a robust PdM framework for health state prediction, using feature selection and model optimization, that can be integrated into industrial facility layouts and infrastructures. Several machine learning models, namely Random Forest, Support Vector Machines, Naïve Bayes, and XGBoost, were implemented and evaluated. Additionally, feature selection was performed using Spearman correlation analysis, leading to model optimization. XGBoost outperformed the other models, reaching 99.41% accuracy on the validation sample. This research verified that ML algorithms can efficiently assess the health state of centrifugal pumps, increasing safety and productivity in a sustainable and resilient industrial ecosystem and providing a comprehensive framework for CBM and health state prediction applications. Additionally, the experimental outputs highlighted the importance of feature selection for model optimization and of training on a balanced dataset which, as suggested in previous sections, provides unbiased measurements of both health state conditions. Future researchers and field technicians can benefit from the proposed framework and the presented health state prediction methodology due to its resilience and viability in similar industrial applications. Future work will focus on improving the system performance and determining the RUL of centrifugal pumps. Additionally, the selection of the most appropriate AI algorithms will remain a challenging issue and a topic of interest for future research. DL models and ANN are expected to provide accurate predictions due to their capability of handling high-volume and regression data. Moreover, model hyperparameter fine-tuning is suggested to improve the classification, as mentioned above. The ML algorithms applied (RF, Naïve Bayes, SVM, and XGBoost) were trained based on the Scikit-learn default parameter configurations. Thus, additional testing of various hyperparameter combinations can further enhance the accuracy of the proposed system. Finally, a promising research direction is the automatic selection of optimal hyperparameters using genetic algorithms. A guideline for hyperparameter selection tailored to each application would be a beneficial contribution to PdM approaches and would promote the concept of PdM in an innovative and resilient manufacturing environment.
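Spearman-based feature selection, as summarized above, can be sketched in a few lines. This is an illustrative sketch only: it assumes that features are screened by the absolute Spearman coefficient against the binary health label and kept when they reach a cut-off, which this section does not state explicitly. The function names, the 0.5 threshold, and the toy values are all hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over all entries tied with the current value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def select_features(columns, target, threshold):
    """Keep features whose |Spearman rho| against the target reaches the threshold."""
    rho = {name: spearman(values, target) for name, values in columns.items()}
    kept = [name for name, value in rho.items() if abs(value) >= threshold]
    return kept, rho


# Hypothetical toy measurements; 1 = maintenance-prone, 0 = healthy.
health_state = [0, 0, 0, 1, 1, 1]
features = {
    "rms_acceleration": [0.10, 0.20, 0.15, 0.80, 0.90, 0.70],
    "temperature": [30.0, 31.0, 32.0, 30.5, 31.5, 32.5],
}
kept, rho = select_features(features, health_state, threshold=0.5)
```

With these toy values the vibration feature correlates strongly with the label and the temperature column does not, so only the former is kept. On real data a library routine such as SciPy's `spearmanr` would normally replace the hand-written ranking.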

Int
Predictive maintenance framework for assessing health state of centrifugal pumps (Panagiotis Mallioris) 851

Figure 1. Framework for PdM and health state prediction in centrifugal pumps with feature selection and model optimization (adapted from [20])

Figure 2. Machine condition degradation over time: Machine condition and time dimensions establish a hierarchy of attributes: (i) out-of-order, smoke, temperature, noise, vibrations and state change, and (ii) minutes, days, weeks, and months

the machine. The developed predictive model could accurately predict the machine's condition in future cases when handling out-of-sample measurements. Therefore, it could alert the maintenance personnel in a timely manner and increase the resilience and safety of the overall system.

Figure 3. Overall performance of experimented output

Table 1. Descriptive statistics of numerical input features. The features of velocity, rms_demodulation, rms_acceleration, peak-to-peak acceleration and temperature are highlighted as the most relevant for further examination

Table 2. Spearman statistical analysis results
The distinctive models, namely RF, Naïve Bayes, SVM, and XGBoost, were trained based on the Scikit-learn default parameter configurations. The collected dataset depicting 857 running state measurements

Table 3. Prediction outputs prior to feature selection with velocity_ISO, rms_demodulation, rms_acceleration, peak-to-peak acceleration and temperature as input features