Application of artificial neural network to predict amount of carried weight of cargo train in rail transportation system

Received Feb 1, 2020; Revised Apr 22, 2020; Accepted May 24, 2020

Derailments of cargo trains have occurred frequently in Malaysian train services during the last decade. Many factors contribute to these incidents, especially the total amount of carried weight. Severe derailments cause loss of life and damage to property every year. If the amount of carried weight of a cargo train could be accurately forecasted in advance, its detrimental effect could be greatly reduced. This paper presents the application of Artificial Neural Network (ANN) to predict the amount of carried weight of cargo trains, with KTMB used as the study case. As KTMB carries many types of cargo, this study focuses only on cement, which is carried on twelve (12) different routes. In this study, ANN is used to develop a predictive model with three (3) different training algorithms: Levenberg-Marquardt (LM), Quick Propagation (QP) and Conjugate Gradient Descent (CGD). The best training algorithm for predicting the amount of carried weight is selected by comparing the error measures of all the training algorithms, namely Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE). The results indicate that the ANN technique is suitable for predicting the amount of carried weight.


INTRODUCTION
Cargo or freight refers to goods or products that are transferred or distributed, generally for commercial gain. Nowadays, cargo can be transported by water, air or land. Road transport is the most widely used mode, and vehicles of different types and load capacities are used to move cargo. Road transport has many advantages: it can provide door-to-door delivery and offers several types of vehicle, such as trucks, buses, lorries and cars. However, bulky items such as sugar, cement and charcoal that need to be moved in large volumes are transported by train.
Besides carrying passengers, trains are also capable of transporting large volumes of items such as water, cement, steel, wood and coal. Generally, train cargo has a direct route to its destination. Under the right conditions, cargo transport by rail is more economical and more productive than road transport, especially when transporting items in large volumes over long distances. The choice of mode of transportation depends very much on carried weight. Carried weight is an important matter in a transport system and needs to be considered. In the logistic transportation system, the amount of weight carried is very important to ensure that all the goods arrive safely at the destination in time. Using trains as a mode of transportation is beneficial to the environment because it limits greenhouse gas emissions, increases fuel efficiency and reduces the carbon footprint [1].

Int J Artif Intell ISSN: 2252-8938  Application of artificial neural network to predict amount of… (Siti Nasuha Zubir) 481

KTMB Freight Service has three types of train services: Train Contena Service, Train Cargo Conventional Service and Train Landbridge Service. In 2017, KTMB experienced three major derailments. On August 21, 2017, a cargo train crashed at Jalan Kucing, causing delays for a few days [2]. On September 23, 2017, a KTMB cargo train snapped an electrical cable between Rawang and Kuang stations, forcing KTMB to close all tracks for two days [3]. On November 23, 2017, another accident occurred when twelve cargo trains travelling southward between the National Bank Station and Kuala Lumpur Station slipped because of the heavy and oversized loads they carried. As a result, KTM and ETS services were disrupted on several routes around the Klang Valley. One of the major causes of this accident was the overloading of the cargo train's wagons [4]. In a more recent accident, on 21 July 2019, a cargo train carrying 30 wagons of cement derailed.
During the derailment, KTMB needed to relocate all the wagons as soon as possible because all of KTMB's services were affected [5]. Derailments happen due to many factors, and one of the most significant is the amount of carried weight. Planning the amount of carried weight to match the track capability can help avoid derailments. Artificial Neural Network (ANN) is a popular method used by previous researchers to predict carried weight. In this study, the carried weight of cargo trains is predicted.
Previous research has demonstrated that ANN is an efficient strategy for prediction [6][7][8][9][10]. This is supported by [11][12], who found that ANN performs better than the Adaptive Neuro-Fuzzy Inference System (ANFIS). That work compared the models with the American Concrete Institute and Iranian Concrete Institute empirical codes, and the ANN predictions were better than those of the ANFIS model. The authors of [13] developed a decision support system that can forecast demand in the electronic retail industry in Turkey using ANN techniques such as Gradient Descent (GD), Conjugate Gradient Descent (CGD), Quick Propagation (QP) and LM methods. However, applications of these artificial intelligence techniques in the multistage supply chain area are still severely lacking.
Further studies have focused on how predictive ability is influenced by the training and testing algorithm. In [14], ANN is used to predict carried weight with three (3) classes of training algorithm: the incremental back propagation algorithm (IBP), the Genetic algorithm (GA) and the Levenberg-Marquardt algorithm (LM). The predictive performance of the three algorithms was compared. The study was applied in an automobile company, Iran Khodro Company (IKCO), in order to provide appropriately for machinery resources, labor and transport system demand. ANN was used to test the weekly data of carried weight based on observations of the number of vehicles and fuel consumption. At the end of the study, IBP proved to be the best training algorithm. As an improvement, [15] used the same variables as the previous research to predict the carried weight. Instead of GA and IBP, Quick Propagation (QP) and Batch Back propagation (BBP) were used, and QP exhibited the better performance. Hence, this paper presents the application of Artificial Neural Network (ANN) to predict the amount of carried weight of cargo trains, using three training algorithms: the Levenberg-Marquardt algorithm (LM), which has performed well in predicting different sets of carried weight data; Conjugate Gradient Descent (CGD), which has performed well in predicting other sets of data; and Quick Propagation (QP), a new algorithm used to predict carried weight.

RESEARCH METHODS
An ANN is a mathematical or computational model that imitates the biological neural system. It is an adaptive system, since it can modify its structure based on the internal or external information that flows through the network [16]. The model is a flexible computing framework and a universal approximator. It can be applied to a wide range of problems, such as time series forecasting, with a high degree of accuracy. ANN replicates the biological neuron structure by creating simple processing units called artificial neurons. The three-dimensional interconnectedness of biological neurons is approximated in ANN by the use of layers. Figure 1 shows an ANN with input nodes, hidden nodes, and one output node. The hidden nodes will be generated using the different built-in algorithms.
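As an illustration of the layered structure described above, the following is a minimal sketch of a forward pass through a network with one hidden layer and one output node. The tanh hidden activation, the linear output, the random initial weights and the function names are assumptions made for the example; the source does not state which activation functions the software uses. The 5-5-1 size matches the architecture reported later in the results.

```python
import numpy as np

def init_network(n_in, n_hidden, rng):
    """Create random weights for an n_in -> n_hidden -> 1 network (illustrative)."""
    return {
        "W1": rng.normal(scale=0.5, size=(n_hidden, n_in)),  # input -> hidden weights
        "b1": np.zeros(n_hidden),                            # hidden biases
        "W2": rng.normal(scale=0.5, size=(1, n_hidden)),     # hidden -> output weights
        "b2": np.zeros(1),                                   # output bias
    }

def forward(params, x):
    """Propagate one input vector through the network and return a scalar."""
    hidden = np.tanh(params["W1"] @ x + params["b1"])   # assumed tanh activation
    return (params["W2"] @ hidden + params["b2"])[0]    # assumed linear output node

def n_parameters(params):
    """Count all trainable weights and biases."""
    return sum(p.size for p in params.values())

rng = np.random.default_rng(0)
net = init_network(n_in=5, n_hidden=5, rng=rng)
y = forward(net, np.ones(5))
print(n_parameters(net), y)
```

A 5-5-1 network of this form has (5 x 5 + 5) + (5 x 1 + 1) = 36 trainable parameters, which is what a training algorithm has to adjust.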

Training algorithm
Three built-in training algorithms are used and compared.
a. Levenberg-Marquardt (LM)
LM is a higher-order adaptive algorithm that minimizes the Mean Square Error of a neural network [17]. It is a variation of Newton's method designed for minimizing functions that are sums of squares of other nonlinear functions, and it provides a numerical solution to the problem of minimizing a nonlinear function. The (non-negative) damping parameter λ is adjusted in every iteration: small values of λ result in a Gauss-Newton update, and large values of λ result in a gradient descent update. The parameter λ is initialized to be large so that the first updates are small steps in the steepest-descent direction. If an iteration leads to a poor approximation, λ is increased, so that the step is taken approximately in the steepest-descent direction. Otherwise, as the solution improves, λ is decreased, the LM method approaches the Gauss-Newton method, and the solution typically accelerates to the local minimum.
b. Conjugate Gradient Descent (CGD)
The CGD method solves systems of linear equations and can also be applied to systems whose matrix is not symmetric, not positive-definite, or even not square [18]. CGD is an advanced method for training multilayer neural networks. In the CGD method, a plane is searched rather than a line. The plane is formed from an arbitrary linear combination of two vectors. For minimizing quadratic functions, the plane search requires only the solution of a two-by-two set of linear equations for α and β. CGD can also be used to solve convex optimization problems.
The gradient descent method tries to find the minimum by computing the gradient of $f(x)$ at the initial guess. The whole process has to be iterated to bring the value of $x$ close to the optimal solution.
c. Quick Propagation (QP)
The Quick Propagation method uses the following updating equation:

$$J_{i+1} = J_i + \frac{\left(\Delta \hat{y}_i - J_i h_i\right) h_i^{T}}{h_i^{T} h_i}$$

where $\hat{y}_i$ is the model response for the $i$th iteration. The approximation of the Jacobian matrix $J_{i+1}$ for the $(i+1)$th iteration is calculated using the Jacobian matrix approximation $J_i$, the parameter perturbation vector $h_i$ and the change in the model response $\Delta \hat{y}_i$ for the $i$th iteration. The updating matrix is a rank-one matrix, so Broyden's method is a rank-one quick propagation method. The algorithm belongs to the group of second-order learning methods, since it follows a quadratic approximation based on the previous gradient step and the current gradient [19].
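The damping rule described for Levenberg-Marquardt in item (a) can be sketched as follows. This is a minimal illustration on a generic least-squares problem, not the implementation used by the software in this study. The function names, the initial value of λ, the factor of 10 used to raise and lower it, and the exponential test model are all assumptions chosen for the example.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, p0, lam=10.0, n_iter=100, tol=1e-12):
    """Minimize 0.5 * ||residual(p)||^2 with a simple damped Gauss-Newton loop."""
    p = np.asarray(p0, dtype=float)
    r = residual(p)
    cost = 0.5 * (r @ r)
    for _ in range(n_iter):
        J = jacobian(p)
        g = J.T @ r                      # gradient of the cost
        H = J.T @ J                      # Gauss-Newton approximation of the Hessian
        # Large lam: short step close to steepest descent.
        # Small lam: step close to the Gauss-Newton step.
        step = np.linalg.solve(H + lam * np.eye(p.size), -g)
        r_new = residual(p + step)
        cost_new = 0.5 * (r_new @ r_new)
        if cost_new < cost:              # improvement: accept and reduce damping
            p, r, cost = p + step, r_new, cost_new
            lam = max(lam / 10.0, 1e-12)
        else:                            # poor step: reject and increase damping
            lam *= 10.0
        if np.linalg.norm(step) < tol:
            break
    return p

# Illustrative problem (assumed): fit y = a * exp(b * x) to noise-free data.
x = np.linspace(0.0, 4.0, 20)
y = 2.0 * np.exp(-0.5 * x)

def exp_residual(p):
    return p[0] * np.exp(p[1] * x) - y

def exp_jacobian(p):
    e = np.exp(p[1] * x)
    return np.column_stack([e, p[0] * x * e])

fitted = levenberg_marquardt(exp_residual, exp_jacobian, p0=[1.0, 0.0])
print(fitted)
```

Starting with a relatively large λ mirrors the text: the first updates are cautious, and the method moves toward Gauss-Newton behaviour only as steps keep reducing the error.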

Error measures
According to [20], forecasting error measures how well a model performs when its output is compared with the past data. Two error measures are used:
a. Root Mean Squared Error (RMSE)
b. Mean Absolute Percentage Error (MAPE)
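For reference, the two error measures can be computed as in the sketch below, which uses their standard textbook definitions: RMSE is the square root of the mean squared difference between actual and forecast values, and MAPE is the mean absolute error expressed as a percentage of the actual values. The function names are illustrative, and MAPE as written assumes that no actual value is zero.

```python
import numpy as np

def rmse(actual, forecast):
    """Root Mean Squared Error: sqrt(mean((actual - forecast)^2))."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

# Toy values, not data from the study.
print(rmse([1, 2, 3], [1, 2, 5]))
print(mape([100, 200], [110, 180]))
```

RMSE is expressed in the units of the carried weight and penalizes large errors more heavily, while MAPE is scale-free, which makes it easier to compare routes that carry different tonnages.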

Model validation
The first stage is called initial data preparation. During this stage, the data series is divided into two parts. The first part, known as the within-sample or fitting part, is used to estimate the forecasting model [21]. The second part, known as the out-of-sample or evaluation part, is used to evaluate the model. In this study, 70% of the data are used for the training part and 30% for the validation part. There are 13,152 observations.
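The 70/30 partition can be sketched as below. The source does not say whether the split is chronological or random; this example assumes a chronological split, which is common for time series, and uses a placeholder sequence in place of the real observations.

```python
def split_series(series, train_fraction=0.7):
    """Split a time-ordered series into a fitting part and an evaluation part."""
    n_train = int(len(series) * train_fraction)
    return series[:n_train], series[n_train:]

# Placeholder for the 13,152 observations (values are not from the study).
observations = list(range(13152))
train, validation = split_series(observations)
print(len(train), len(validation))
```

Under these assumptions, 13,152 observations give 9,206 for training and 3,946 for validation.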
In the second stage, the within-sample data are used to estimate the model using three built-in algorithms: LM, CGD and QP. The best estimation approach is selected by comparing their error measures [22]. For this purpose, RMSE and MAPE are used [23][24]. The training algorithm with the smallest error measures is taken to produce the best-fit model.
Having completed the first and second stages, the last stage is to use the best-fit model to forecast the amount of weight carried by each train per trip, which can help KTMB to plan its future operations.

RESULTS AND DISCUSSION
Predictive modeling using Artificial Neural Network was carried out using Alyuda Neurointelligence software. In the first stage, the data were treated for missing values. Initially, there were 12 routes. Routes with more than 15% missing values [25] were omitted. The remaining two routes, Route 1 and Route 2, were analyzed further and underwent imputation using IBM SPSS Modeler 18.0 software.
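The screening rule above, which omits routes with more than 15% missing values, can be illustrated with a short sketch. The study itself used dedicated software for this step; the function name, the route labels and the toy values below are hypothetical, and missing observations are assumed to be encoded as NaN.

```python
import numpy as np

def routes_to_keep(data, max_missing=0.15):
    """Return the names of routes whose share of missing values is at most max_missing."""
    keep = []
    for name, values in data.items():
        values = np.asarray(values, dtype=float)
        missing_share = np.mean(np.isnan(values))   # fraction of NaN entries
        if missing_share <= max_missing:
            keep.append(name)
    return keep

# Hypothetical toy data: Route A has 10% missing values, Route B has 50%.
toy = {
    "Route A": [1, np.nan, 3, 4, 5, 6, 7, 8, 9, 10],
    "Route B": [np.nan, np.nan, 3, 4],
}
print(routes_to_keep(toy))
```

Only the routes that pass this screen would then go on to imputation of their remaining gaps.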

Designing the network
In order to choose the best training algorithm, the best network architecture is defined first. For Route 1, shown in Table 2, there were 8 iterations in finding the best network architecture. However, the 6 network architectures highlighted in red were removed to avoid overfitting, which can arise when the number of hidden nodes is greater than the number of input nodes. From the results, the best architecture is the [5-5-1] model, since it gives the largest fitness value, the lowest test error and the lowest AIC. Table 3 shows that the best architecture for Route 2 is also the [5-5-1] model. The fitness of the training algorithms was also assessed: Table 4 shows that LM produces the smallest Absolute and Network Error for both Route 1 and Route 2. Table 4 also shows that LM produces the smallest error values (RMSE and MAPE) for both the training and validation parts for Route 1 and Route 2.

Forecasting by using the best training algorithm
As previously discussed, the best training algorithm is used for prediction of carried weight. Hence, the ANN model with LM as the training algorithm is used to predict carried weight for both routes, Route 1 and Route 2.
The amount of carried weight forecasted for year 2019 at Route 1 is illustrated in Figure 2. The grey line represents the forecast values and the dotted orange line represents the trend line of the new forecasted values. Its negative slope indicates that the amount of carried weight for Route 1 decreases slightly over time. The equation of the trend line is y = -0.0489x + 3249.8, so the total tonnage carried decreases by 0.0489 each day. The forecast values show that the amount of carried weight fluctuates over time and decreases by 48.9 kg per day.
The amount of carried weight forecasted for year 2019 at Route 2 is illustrated in Figure 3. The grey line represents the forecast values and the dotted orange line represents the trend line of the new forecasted values. The trend line for Route 2 also has a negative slope, indicating that the amount of carried weight for Route 2 decreases slightly over time. The trend equation is y = -0.1186x + 6079.3, which shows a decrease of 118.6 kg per day. Comparing the trend line of the new forecasted carried weight with the earlier trend in Section 4.5.2, the decrease in the average amount of carried weight for Route 1 changes from 69.1 kg per day to only 48.9 kg per day. The carried weight is therefore still declining, but at a slower rate, which suggests that the cargo business is improving.
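The arithmetic behind the two trend lines can be checked with a short sketch. It assumes that y is measured in tonnes, which is inferred from the text equating a slope of 0.0489 with 48.9 kg per day, and that x counts days starting at 1. The function names are illustrative, and an ordinary least-squares line is used as one plausible way to obtain such a trend line from a forecast series.

```python
import numpy as np

def fit_trend(values):
    """Fit a least-squares straight line to a daily series; returns (slope, intercept)."""
    days = np.arange(1, len(values) + 1)          # assumed: day index starts at 1
    slope, intercept = np.polyfit(days, values, 1)
    return slope, intercept

def daily_change_kg(slope_tonnes_per_day):
    """Convert a slope in tonnes per day to kilograms per day."""
    return slope_tonnes_per_day * 1000.0

def change_over_horizon(slope, days=365):
    """Total change implied by the trend line over the forecast horizon."""
    return slope * days

# Trend coefficients (slope, intercept) as reported in the text.
trends = {"Route 1": (-0.0489, 3249.8), "Route 2": (-0.1186, 6079.3)}
for route, (slope, intercept) in trends.items():
    print(route, daily_change_kg(slope), change_over_horizon(slope))
```

Under these assumptions, the trend lines imply a total decline over 365 days of about 17.8 tonnes for Route 1 and about 43.3 tonnes for Route 2, which is small relative to the intercepts of 3249.8 and 6079.3.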

CONCLUSION
This paper presents the application of Artificial Neural Network (ANN) to predict the amount of carried weight of cargo trains, using three training algorithms: the Levenberg-Marquardt algorithm (LM), which has performed well in predicting different sets of carried weight data; Conjugate Gradient Descent (CGD), which has performed well in predicting other sets of data; and Quick Propagation (QP), a new algorithm used to predict carried weight. The results show the appropriateness of the Artificial Neural Network for predicting the amount of carried weight, based on the correlation, fitness and test error values. Based on the RMSE and MAPE, LM shows the smallest values for Route 1 and Route 2, which carry cement cargo for KTMB customers.
Furthermore, the ANN model based on the best training algorithm found in the first phase of the study is used to forecast the carried weight of cargo trains for both routes in the rail transportation system. The results show that the values of carried weight fluctuate and decline over time for year 2019 (365 days ahead). It is hoped that the results can help KTMB to plan the right amount to be carried by its cargo trains per trip in its effort to prevent further derailments. At the same time, as the amount of carried weight is predicted to decline over time, KTMB can plan a strategic initiative to attract more customers while monitoring the right amount to carry on each trip.