Optimization of artificial neural network topology for membrane bioreactor filtration using response surface methodology

Received Nov 9, 2019; Revised Jan 5, 2020; Accepted Jan 20, 2020

The optimization of artificial neural network (ANN) topology for predicting the permeate flux of palm oil mill effluent (POME) in membrane bioreactor (MBR) filtration was investigated using response surface methodology (RSM). A radial basis function neural network (RBFNN) model, trained by the gradient descent with momentum (GDM) algorithm, was developed to correlate the output (permeate flux) with four exogenous input variables (airflow rate, transmembrane pressure, permeate pump and aeration pump). A second-order polynomial model was fitted to the natural log of the mean square error (ln MSE) obtained from training 50 ANNs and was used to generate 3D response surfaces. The optimum ANN topology had the minimum ln MSE when the number of hidden neurons, spread, momentum coefficient, learning rate and number of epochs were 16, 1.4, 0.28, 0.3 and 1852, respectively. The MSE and regression coefficient of the ANN model were 0.0022 and 0.9906 for the training data set, 0.0052 and 0.9839 for the testing data set and 0.0217 and 0.9707 for the validation data set. These results confirm that combining RSM and ANN gives precise predictions of the permeate flux of POME in an MBR system. This development may have significant potential to improve model accuracy and reduce computational time.


INTRODUCTION
Malaysia is one of the world's leading palm oil producers [1]. As palm oil production capacity increases every year, a large amount of wastewater is also generated. Uncontrolled discharge of untreated palm oil mill effluent (POME) may pollute waterways [2]. Compared with the conventional activated sludge system, membrane systems are preferable for treating POME because of their simple operation, ease of scale-up, lower weight and space requirements and high efficiency [3]. The membrane bioreactor (MBR) has proven to be a reliable technology for treating a wide range of waters, such as wastewater, groundwater and surface water. However, fouling is the main drawback of MBR systems and contributes to high energy consumption and maintenance cost [4]. According to [5][6][7], fouling varies with time during operation, and this variation can be minimized by controlling the fouling variables [8].
Fouling can be controlled and reduced using several hydrodynamic techniques such as air bubble (aeration) control, relaxation, backwashing and chemical cleaning [8][9][10]. Little work has been done on modelling and optimizing the operating conditions of MBR treatment of POME; most studies have focused on the biological reduction of POME using MBR filtration [9][10][11][12]. Modelling a membrane process is not an easy task because a large number of parameters must be considered.
Recently, modelling of membrane processes using neural networks has received considerable attention because of their ability to model and predict complex processes. ANNs have been successfully applied to predictions for oily wastewater [13][14], the permeate flux of bovine serum albumin [15] and palm oil mill wastewater [16][17]. A good understanding of the factors that affect ANN model performance is crucial for choosing the number of iterations, learning rate, momentum coefficient, number of hidden layers and number of hidden neurons. These parameters are varied until their optimal values are determined [18]. Determining the best ANN topology is important because it affects the weights and biases. It is usually done by trial and error [19][20] or one-variable-at-a-time (OVAT) [21][22], both of which are time-consuming and monotonous. According to [23], with three levels for each of five ANN variables, about 243 (= 3^5) different ANN configurations would be required. There is no specific rule for selecting the values of the ANN variables; the choice depends on the complexity of the modelled system. A standard technique for solving the problems associated with ANN development is therefore needed.
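For a sense of scale, the full three-level grid over five ANN variables can be enumerated directly. The following is a minimal sketch; the variable names follow the list in the paragraph above, while the level labels are placeholders and not values from any study.

```python
from itertools import product

# Five ANN design variables named in the text. The level labels are
# placeholders (assumption), not actual settings from the paper.
variables = ["iterations", "learning_rate", "momentum",
             "hidden_layers", "hidden_neurons"]
levels = ("low", "mid", "high")

# Every combination of one level per variable (full three-level factorial).
grid = list(product(levels, repeat=len(variables)))

print(len(grid))                       # 3 ** 5 = 243
print(dict(zip(variables, grid[0])))   # first configuration: all "low"
```

Each of these configurations would need its own training run, which is the cost that a designed experiment tries to avoid.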
Response surface methodology (RSM) is a collection of statistical and mathematical techniques for optimizing objective functions. It is a powerful design tool in many engineering applications and can provide accurate models. RSM has been used to determine the topology of multi-layer feed-forward neural networks trained with backpropagation [23][24]. It has also been used to find the optimum number of neurons in the first and second hidden layers [18]. This paper aims to develop radial basis function neural network (RBFNN) models for predicting the permeate flux during MBR filtration of POME wastewater. RSM is proposed to find the ANN topology that gives the minimum mean square error and thereby improves the performance of the model.

RESEARCH METHOD

Data collection
The experiments were carried out using a membrane bioreactor for palm oil mill effluent (POME) with a working volume of 20 L. The POME sample was taken from Sedenak Palm Oil Mill Sdn. Bhd. in Johor, Malaysia, and the working temperature was 27 ± 1 °C. The POME model has four input variables: transmembrane pressure (TMP), airflow rate, permeate pump and aeration pump. The output variable is the permeate flux. The data were analysed using MATLAB R2014a and Design Expert version 7.1.6 to obtain the response surfaces and contour plots. A total of 1602 data points were collected from the experiment for each parameter (airflow rate, TMP, permeate pump, aeration pump and permeate flux). Figure 1 shows that the flux decreased rapidly after the airflow rate was reduced from 8 SLPM to 5 SLPM.

Model development
In this work, the RBFNN model was used to predict the permeate flux of the POME membrane bioreactor. All data first underwent a pre-processing stage called normalization. Because the input data for this system have different magnitudes and scales, all data were normalized to a minimum of 0 and a maximum of 1. This procedure prevents the transfer function from becoming saturated [25]. The normalization is given by (1):

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$

where $x'$ is the scaled value, $x$ is the sample value, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data. The permeate flux was determined as given in (2):

$$J = \frac{V}{A \, t} \tag{2}$$

where $J$ is the permeate flux (L m$^{-2}$ h$^{-1}$), $V$ is the permeate volume (L), $A$ is the membrane surface area (m$^2$) and $t$ is the time (h).

To investigate the feasibility of the predictive model, the collected data were separated into three data sets. The training set of 651 points included the transition between high and low airflow rate. The testing set of 500 points was taken from the high airflow period, and the validation set of 451 points was taken from the low airflow period. The training data were used to compute the network parameters. The testing data were used to assess the predictive ability of the generated model, while the validation data were subsequently used to ensure the robustness of the network parameters and to avoid over-training [26]. The training set must be at least as large as the testing and validation sets to avoid extrapolation problems [27].
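The two pre-processing formulas above, min-max normalization and permeate flux, can be sketched in a few lines. The function names and the sample numbers are illustrative assumptions; only the 5 and 8 SLPM airflow levels appear in the text.

```python
def min_max_scale(values):
    """Scale a sequence to [0, 1] using x' = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def permeate_flux(volume_l, area_m2, time_h):
    """Flux = volume / (membrane area * time), in L m^-2 h^-1."""
    return volume_l / (area_m2 * time_h)


# Illustrative airflow readings in SLPM (6.5 is a made-up intermediate value).
airflow = [5.0, 6.5, 8.0]
print(min_max_scale(airflow))          # [0.0, 0.5, 1.0]

# Made-up volume, area and time, chosen only to give a round result.
print(permeate_flux(0.5, 0.25, 2.0))   # 1.0
```

In practice the minimum and maximum would normally be taken from the training set and reused for the testing and validation sets, so that all three are scaled consistently.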
In this paper, an RBFNN with three layers (input, hidden and output) was used. The non-linear hyperbolic tangent sigmoid transfer function was used in the hidden layer, and the linear transfer function (purelin) was chosen for the output layer to produce a continuous output. RSM was used to find the optimal value of each learning parameter of the RBFNN model. A central composite design (CCD) of 50 experiments was used for five numerical factors (number of neurons, spread, learning rate, momentum rate and number of epochs) with eight repetitions at the center point. The five numerical factors and their simulation ranges for the RBFNN are shown in Table 1. The experimental results of the CCD were fitted with a second-order polynomial equation by multiple regression. For predicting the optimal point, the quadratic model is expressed by (3):

$$Y = \beta_0 + \sum_{i=1}^{k} \beta_i X_i + \sum_{i=1}^{k} \beta_{ii} X_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} X_i X_j \tag{3}$$

where $Y$ is the response ln(MSE); $\beta_0$, $\beta_i$, $\beta_{ii}$ and $\beta_{ij}$ are the regression coefficients for the intercept, linear, quadratic and interaction terms, respectively; $X_i$ and $X_j$ are independent variables; and $k$ is the number of factors.

All ANN topologies in the RSM design were built and trained. The resulting quadratic equation was solved using the response optimizer of RSM until the condition that minimizes the MSE (response variable) was found. The MSE was transformed with the natural log function (ln(MSE)) with α equal to 1, which brings the distribution of the response variable closer to the normal distribution [24].
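The run count of the design and the size of the quadratic model can be checked with a short sketch. It builds a CCD in coded units from a two-level full factorial, axial points and center replicates. The function names are hypothetical, and the axial distance `alpha=1.0` is an assumption, since the text does not state the axial distance used.

```python
from itertools import product


def central_composite_design(k, n_center, alpha):
    """Coded design points: 2**k factorial, 2*k axial and n_center center runs."""
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for value in (-alpha, alpha):
            point = [0.0] * k
            point[i] = value          # only factor i leaves the center
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def quadratic_term_count(k):
    """Coefficients in the second-order model: intercept + linear + quadratic + interaction."""
    return 1 + k + k + k * (k - 1) // 2


# Five factors and eight center points as in the text; alpha is assumed.
design = central_composite_design(k=5, n_center=8, alpha=1.0)
print(len(design))               # 32 + 10 + 8 = 50
print(quadratic_term_count(5))   # 1 + 5 + 5 + 10 = 21
```

With 50 runs and 21 coefficients, the regression keeps 29 residual degrees of freedom, compared with the 243 trainings a full three-level grid would need.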

Performance evaluation
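The performance of the model was evaluated using the mean square error (MSE), the root mean square error (RMSE) and the coefficient of determination ($R^2$):

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}$$

where $\hat{y}_i$ is the predicted output for observation $i$, $y_i$ is the experimental (actual) output for observation $i$, $\bar{y}$ is the average value of the experimental output and $N$ is the number of data points. Smaller values of MSE and RMSE indicate better model performance. An $R^2$ equal to 1 means that the regression line fits the data perfectly [26].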
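The three performance measures can be computed as follows. This is a minimal sketch using the standard definitions; the sample values are made up for illustration and are not data from the experiment.

```python
import math


def mse(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the MSE, in the same units as the output."""
    return math.sqrt(mse(actual, predicted))


def r_squared(actual, predicted):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.5, 2.5, 3.0, 4.0]
print(mse(actual, predicted))        # 0.625
print(rmse(actual, predicted))
print(r_squared(actual, predicted))  # 0.5
```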

RESULTS AND DISCUSSION
The relationship between the permeate flux and the independent parameters, namely number of neurons ($X_1$), spread ($X_2$), learning rate (…

The fitness of the model was determined by analysis of variance (ANOVA), which consists of the sum of squares (SS), degrees of freedom (df), mean square (MS), F-values and P-values, as shown in Table 2. The significance of each coefficient was determined by the F-test and P-value: a variable becomes more significant as its absolute F-value increases and its P-value decreases. From Table 2, the model gives an F-value of 81.25 and a very low P-value (< 0.0001). P-values < 0.05 indicate that the model terms are significant. The number of neurons had the largest effect on the ln(MSE) response, followed by the spread and the number of epochs. The learning rate and momentum coefficient had no significant effect on the response. The prediction $R^2$ of 0.9825 is in reasonable agreement with the adjusted $R^2$ of 0.9704. The low coefficient of variation (CV = 4.62%), which is below 10%, shows that the experiments were precise and reliable.
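As a consistency check on the reported statistics, the adjusted $R^2$ can be recomputed from an $R^2$ value, the number of runs and the number of model terms. The sketch below rests on two assumptions that the text does not state explicitly: that a full quadratic model in five factors (20 terms excluding the intercept) was fitted to the 50 CCD runs, and that 0.9825 is treated as the ordinary $R^2$ of that fit.

```python
def adjusted_r_squared(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


# Assumptions: 50 CCD runs; 5 linear + 5 quadratic + 10 interaction = 20 terms;
# 0.9825 taken as the ordinary R^2 of the fitted model.
adj = adjusted_r_squared(0.9825, n_runs=50, n_terms=20)
print(round(adj, 4))  # 0.9704
```

Under these assumptions the result matches the adjusted $R^2$ quoted above.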

Response surface plot results
The response surface plots are presented in Figure 2. Each graph shows the combined effect of two factors at a time while all other factors are held at their middle level.
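A response surface of this kind is a two-factor slice of the fitted second-order model with the remaining factors fixed at the coded center (0). The following sketch shows how such a slice can be evaluated. All coefficients are hypothetical placeholders, not the fitted values from Table 2, and the function names are illustrative.

```python
def quadratic_response(x, b0, linear, quadratic, interaction):
    """Second-order model: b0 + sum(b_i x_i) + sum(b_ii x_i^2) + sum(b_ij x_i x_j)."""
    y = b0
    for i, xi in enumerate(x):
        y += linear[i] * xi + quadratic[i] * xi * xi
    for (i, j), bij in interaction.items():
        y += bij * x[i] * x[j]
    return y


def surface_slice(i, j, grid, k, **coeffs):
    """Vary factors i and j over grid; hold the other factors at coded 0."""
    rows = []
    for xi in grid:
        row = []
        for xj in grid:
            x = [0.0] * k
            x[i], x[j] = xi, xj
            row.append(quadratic_response(x, **coeffs))
        rows.append(row)
    return rows


# Hypothetical coefficients in coded units, NOT the fitted values from the paper.
coeffs = dict(
    b0=-5.0,
    linear=[-0.5, 0.25, 0.0, 0.0, -0.25],
    quadratic=[0.5, 0.25, 0.0, 0.0, 0.0],
    interaction={(0, 1): 0.25},
)
grid = [-1.0, 0.0, 1.0]
surface = surface_slice(0, 1, grid, k=5, **coeffs)
print(surface[1][1])  # center point equals the intercept: -5.0
```

A finer grid passed to a 3D plotting routine would give a surface like those in Figure 2, and the minimum of the fitted model over the design region gives the candidate optimum topology.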

Neural network plot results
In this section, the regression plots of the experimental data versus the data computed by the neural network with the optimum ANN topology are presented for each step, including training, testing and validation. The predicted models fitted the experimental data well for all steps, as depicted in Figure 3. The correlation coefficient (R) is 0.9906 for training, 0.9839 for testing and 0.9707 for validation. The comparative values of the coefficient of determination ($R^2$), RMSE and MSE are given in Table 3. The results show that the optimum ANN model is suitable for describing the permeate flux of POME in MBR filtration. The optimal ANN topology found using RSM provided good-quality prediction of the permeate flux from the four exogenous inputs. The results were compared with the conventional RBFNN and showed improved ANN model performance, as shown in Table 3. The RBFNN-RSM approach was superior to and faster than trial-and-error methods in finding the optimum ANN topology.

The accompanying plots show the permeate flux response for training, testing and validation, respectively. For the training data, which cover the transition from high to low airflow rate, the permeate flux decreases slowly from 0.88 to 0.60 L m$^{-2}$ h$^{-1}$. For the testing data, the permeate flux is at high airflow rate and remains at 0.8 L m$^{-2}$ h$^{-1}$. For the validation data, the permeate flux decreases rapidly compared with the permeate flux at high airflow. Good prediction models are obtained for the permeate flux for all data sets.
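The MSE and R values quoted in the text can be converted to RMSE and squared correlation for a quick comparison across the three data sets. This is a sketch only: it assumes RMSE is the square root of MSE and takes the squared correlation coefficient as a stand-in for $R^2$, so the derived numbers may differ from those in Table 3.

```python
import math

# Values reported in the text: (MSE, correlation coefficient R) per data set.
reported = {
    "training": (0.0022, 0.9906),
    "testing": (0.0052, 0.9839),
    "validation": (0.0217, 0.9707),
}

# Derived quantities (assumptions: RMSE = sqrt(MSE), R^2 taken as R squared).
derived = {
    name: {"rmse": math.sqrt(mse), "r2": r ** 2}
    for name, (mse, r) in reported.items()
}

for name, stats in derived.items():
    print(f"{name}: RMSE = {stats['rmse']:.4f}, R^2 = {stats['r2']:.4f}")
```

The error grows from training to testing to validation, which is consistent with the validation set covering the rapidly changing low-airflow regime.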