Gait cycle prediction model based on gait kinematic using machine learning technique for assistive rehabilitation device

Received Oct 31, 2020; Revised May 5, 2021; Accepted May 20, 2021

The gait cycle prediction model is critical for controlling assistive rehabilitation equipment such as orthoses. Human gait has recently been modeled with statistical models, but the dynamic properties of human physiology limit that approach. Current human gait cycle prediction models need detailed kinematic and kinetic data of the human body as input parameters, and measuring them requires special instruments, which makes the models difficult to use in real-world applications. In our study, three machine learning algorithms were used to create a human gait model: Gaussian process regression (GPR), support vector machine (SVM), and decision tree (DT). The input parameters used to build the models are height, weight, hip and knee angle, and ground reaction force (GRF). To improve gait cycle prediction, the models were enhanced by incorporating different sliding window data. The best gait cycle prediction model was DT with sliding window data (t−3), which had a root mean square error of 3.3018 and an R-squared (R-Value) of 0.97. The prediction model based on hip and knee angle and GRF is a feasible solution for controlling assistive rehabilitation devices during the gait cycle.


INTRODUCTION
Human gait rehabilitation is the process of regaining patient mobility after muscle paralysis caused by neurological disorders such as stroke and spinal cord injury (SCI) [1]. Techniques in human gait rehabilitation include orthoses [2], [3] and functional electrical stimulation (FES) [4], [5] as assistive devices that support the rehabilitation process. Gait analysis is the biomedical study of the lower limbs during locomotion [6]. Medical practitioners often face the daunting prospect of storing and analyzing the vast volumes of data typically obtained over several gait cycles. For example, Chen et al. [7] highlighted the difficulties doctors face in the quantitative analysis of gait anomalies and therefore developed a gait acquisition and analysis method for an osteoarthritis prediction model using an RGB-D camera. In gait rehabilitation, researchers such as Galli et al. [8] and Vallery et al. [9] have used the data collected during rehabilitation to design exoskeleton controls that suit patient needs. Data collection in gait studies has used techniques such as optical systems, electromyography, goniometric systems, and imaging [10]-[13]. Quantifying walking parameters over the gait cycle is important for a deeper understanding of human locomotion and for the regulation of assistive devices. A gait cycle can be divided into stance and swing phases, separated by initial contact and foot-off events as shown in Figure 1. The stance and swing phases account for 60% and 40% of the gait cycle, respectively. Gait patterns can be recognized from kinetic and kinematic parameters during gait progression.

Figure 1. Gait cycle phases [13]

Research conducted by Farah et al. [14] determined that gait phase detection can be implemented using machine learning techniques. The parameters used in the study were knee angle, thigh angular velocity, and thigh center of gravity acceleration for all planes of gait progression.
The study was done in a laboratory setting and identified gait events with more than 97% accuracy. However, using all three planes of gait progression (sagittal, coronal, and transverse) to detect the gait cycle may make the detection model difficult to use. In contrast, instead of using the above-mentioned parameters, Mahdavian et al. [15] developed a model of lower limb motion using machine learning techniques with the ground reaction force (GRF) as the parameter. The model predicted the user's gait steps with more than 80% accuracy and was then incorporated into a hip exoskeleton control scheme. However, the GRF values detected in all planes of gait progression may not identically represent the gait cycle itself. Susanto et al. [16] investigated the use of an artificial neural network (ANN) to predict the assistive torque required for a lower limb exoskeleton from previous human walking gait cycles and the center of pressure (CoP). The study successfully controlled the lower limb exoskeleton from the ANN prediction at different walking speeds, and suggested that more input variables should be given to the ANN so that better prediction models can be developed. Other research by Young et al. [17] and Kilmartin et al. [18] used dynamic Bayesian networks (DBN), which take previous data and states into account, to determine the current state and predict the next state. Subject-independent algorithms that include generalized gait and transition prediction classification have also shown promise for both able-bodied subjects and users of prostheses. In the laboratory setting, these algorithms detected gait events with greater than 95% accuracy.
Building on the success of the previous work described above, artificial intelligence algorithms have been used to leverage the temporal nature of gait data and to optimize the classification or prediction of gait events. These algorithms require large sensor arrays, i.e., multiple mechanical and neuromuscular sensors, for classification and prediction. These scenarios suggest that predicting the gait cycle during gait progression is vital and that it allows better control schemes to be developed for rehabilitation assistive devices such as orthoses and exoskeletons. The goal of this research work is to predict the gait cycle from selected features, namely the knee angle, hip angle, and GRF in the sagittal plane, using machine learning algorithms. In this work, Gaussian process regression (GPR), support vector machine (SVM), and decision trees (DT) were used and compared for gait cycle prediction.

RESEARCH METHOD
This research developed the gait cycle prediction model using three different machine learning algorithms: GPR, SVM, and DT. To develop the model, the work began with data preparation for training, followed by parameter selection and the sliding window, construction of the gait cycle prediction model, and model performance evaluation using the root mean square error (RMSE) and R-squared (R-Value). Figure 2 shows the flowchart of the work carried out in this research.

Data preparation
A dataset of 24 young adult volunteers (age 27.6 ± 4.4 years, height 171.1 ± 10.5 cm, and mass 68.4 ± 12.2 kg) was retrieved from Fukuchi et al. [19]. The dataset contains kinematics data of the joint angles and moments of the hip, knee, and ankle joints and of the pelvis and foot segments during overground walking at various speeds. Figure 3 shows the hip and knee joint angles and the GRF in the sagittal, frontal, and transverse planes.
All gait trials were performed at three different speeds: a comfortable speed, a speed 30% faster than the comfortable speed, and a speed 30% slower than the comfortable speed. The parameters used in this study are the volunteer information, namely weight and height, the hip and knee joint angles, and the GRF in the sagittal plane. Several techniques have previously been used for gait cycle detection, for instance angle measurement with goniometers at the hip, knee, and ankle joints [20] and ground reaction force measurement with pressure sensors [21].
The input parameters extracted from the dataset for this study are weight (kg), height (cm), hip angle (deg), knee angle (deg), and the sagittal-plane GRF, with one output, the gait cycle (%). Initially, the gait cycle prediction model was developed using only the current step of all the input parameters. The training data at each step were then combined with data from previous steps in order to better capture the progression of the gait cycle in the model and to improve accuracy. This approach is called the sliding window technique: the model is fed not only the input parameters at the current point in the gait cycle but also previous input parameters, so that autocorrelation information is incorporated into the model, as suggested by Khairuddin et al. [22]. In this way the variation in each input can be better accounted for by the models. All input and output parameters are listed in Table 1. These input and output parameters are used as the equation variables for all machine learning techniques in this study.
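The input and output parameters described above can be pictured as one record per time step. The following is only a minimal sketch: the class name `GaitSample`, its field names, and the numeric values in the example are illustrative assumptions, not taken from the dataset, and the GRF unit is left unspecified because the text does not state it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GaitSample:
    """One time step of one walking trial (field names are hypothetical)."""
    weight_kg: float        # subject body mass
    height_cm: float        # subject height
    hip_angle_deg: float    # sagittal-plane hip angle
    knee_angle_deg: float   # sagittal-plane knee angle
    grf: float              # sagittal-plane GRF (unit is an assumption)
    gait_cycle_pct: float   # output: position in the gait cycle, 0 to 100 %

    def features(self) -> List[float]:
        """Return the model inputs in a fixed order."""
        return [self.weight_kg, self.height_cm,
                self.hip_angle_deg, self.knee_angle_deg, self.grf]


# Made-up values, for illustration only.
sample = GaitSample(68.4, 171.1, 20.0, 5.0, 0.1, 0.0)
x, y = sample.features(), sample.gait_cycle_pct
```

Keeping the subject-level values (weight, height) in every record means each row is a complete model input on its own, at the cost of some repetition.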

Machine learning technique
Machine learning (ML) is a subset of artificial intelligence (AI) concerned with the ability of computer systems or machines to improve their performance automatically from the input data during the training process [23]. Different ML algorithms were used to build the different models. Recent studies on ML-based prediction show that Gaussian process regression (GPR), support vector machine (SVM), and decision tree (DT) offer advantages in prediction speed, in generalization performance, and in the case of small samples [24]-[26]. Given these advantages, we used GPR, SVM, and DT in this study to develop the gait cycle prediction model.

Gaussian process regression
GPR is a supervised machine learning method that is completely specified by its mean function and covariance function. As explained by Quinonero-Candela et al. [27], the regression is posed as a Bayesian estimation problem in which the predictive function is inferred from a random process with a Gaussian distribution. Given an input parameter sample $x_*$ and the training set $L$, the best estimate of the associated output value, the gait cycle $y_*$, is the expectation of the output conditioned on $L$ and $x_*$:

$$\hat{y}_* = E[\, y_* \mid L, x_* \,] \tag{1}$$

The gait cycle predictive distribution $p(y_* \mid L, x_*)$ can be shown to be:

$$p(y_* \mid L, x_*) = \mathcal{N}(\mu_*, \sigma_*^2) \tag{2}$$

$$\mu_* = \beta + k_*^T (K + \sigma_n^2 I)^{-1} (y - \beta \mathbf{1}) \tag{3}$$

$$\sigma_*^2 = \sigma_n^2 + k(x_*, x_*) - k_*^T (K + \sigma_n^2 I)^{-1} k_* \tag{4}$$

Here, given a covariance function $k(x, x')$, $K$ and $k_*$ are the covariance matrix of the training samples and the covariance vector between the training samples and the sample $x_*$, respectively, and $y$ is the vector of training outputs. In addition, $\beta$, $\sigma_n^2$, and $I$ are the bias factor, the noise variance, and the identity matrix, respectively. Two critical quantities can be recovered from these expressions: i) the mean $\mu_*$, which is the best estimate of the output value according to (1) for the sample considered; ii) the variance $\sigma_*^2$, which is a confidence measure attached to the model output. The covariance function $k(x, x')$ plays a pivotal role because it embeds the geometrical structure of the training samples. The squared exponential function is a common choice for the covariance function:

$$k(x, x') = \sigma_f^2 \exp\left(-\frac{\lVert x - x' \rVert^2}{2 l^2}\right) \tag{5}$$

where the two hyperparameters $\sigma_f^2$ and $l$ are the process variance and the length scale, respectively.
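To make the predictive mean and variance concrete, the sketch below evaluates them for a toy one-dimensional problem with a squared exponential covariance. It is an illustration under stated assumptions rather than the model used in the study: the function names (`sq_exp_kernel`, `solve`, `gpr_predict`) are hypothetical, the hyperparameters are fixed by hand instead of being learned, the bias factor is treated as a constant mean `beta`, and the training data are made up.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_exp_kernel(a: Vector, b: Vector, sigma_f: float, length: float) -> float:
    """Squared exponential covariance between two input vectors."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return sigma_f ** 2 * math.exp(-d2 / (2.0 * length ** 2))


def solve(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def gpr_predict(X: Sequence[Vector], y: Sequence[float], x_star: Vector,
                sigma_f: float = 1.0, length: float = 1.0,
                sigma_n: float = 0.1, beta: float = 0.0) -> Tuple[float, float]:
    """Return the predictive mean and variance at x_star."""
    n = len(X)
    # Training covariance with the noise variance added on the diagonal.
    K = [[sq_exp_kernel(X[i], X[j], sigma_f, length)
          + (sigma_n ** 2 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    k_star = [sq_exp_kernel(xi, x_star, sigma_f, length) for xi in X]
    alpha = solve(K, [yi - beta for yi in y])  # (K + s^2 I)^-1 (y - beta)
    v = solve(K, k_star)                       # (K + s^2 I)^-1 k_*
    mean = beta + sum(k * a for k, a in zip(k_star, alpha))
    var = (sigma_n ** 2 + sq_exp_kernel(x_star, x_star, sigma_f, length)
           - sum(k * vi for k, vi in zip(k_star, v)))
    return mean, var


# Toy data: one input feature, gait cycle (%) as output.
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 50.0, 100.0]
params = dict(sigma_f=50.0, length=1.0, sigma_n=0.01, beta=50.0)
mean_near, var_near = gpr_predict(X_train, y_train, [1.0], **params)
mean_far, var_far = gpr_predict(X_train, y_train, [100.0], **params)
```

Near a training sample the variance shrinks toward the noise level, while far from the data the mean falls back to the bias and the variance grows to the process variance plus noise. That behavior is what makes the variance usable as a confidence measure.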

Support vector machine (SVM)
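The training data are $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in R^n$ is the n-dimensional sample input and $y_i \in R$ is the sample output. Using a nonlinear map function $\varphi(\cdot)$, the training data are mapped nonlinearly to a high-dimensional feature space (Hilbert space), so that the nonlinear system identification problem becomes a linear function estimation problem in that feature space. The complete formulation and derivation of SVM for regression can be found in [28]. The gait cycle prediction function is assumed to take the form shown in (6).

$$f(x) = w^T \varphi(x) + b \tag{6}$$

In this formula, the input space is mapped by the nonlinear function to a feature space of unspecified dimension. The function estimation problem involves finding the function $f(x)$ that minimizes the following expression, based on Vapnik's structural risk minimization principle:

$$\min \; \frac{1}{2} \lVert w \rVert^2 + C \, R_{emp} \tag{7}$$

where $\lVert w \rVert^2$ describes the complexity of the model function $f(x)$, and the constant $C > 0$ regulates the compromise between the training error and the complexity of the model. The empirical risk is:

$$R_{emp} = \frac{1}{N} \sum_{i=1}^{N} \lvert y_i - f(x_i) \rvert_{\varepsilon} \tag{8}$$

The $\varepsilon$-insensitive loss function $\lvert y - f(x) \rvert_{\varepsilon}$ is defined as (9):

$$\lvert y - f(x) \rvert_{\varepsilon} = \max\left(0, \; \lvert y - f(x) \rvert - \varepsilon\right) \tag{9}$$

Solving (7) then gives the prediction function:

$$f(x) = \sum_{i=1}^{N_{sv}} (\alpha_i - \alpha_i^*) \, K(x_i, x) + b$$

where a sample $x_i$ with $(\alpha_i - \alpha_i^*) \neq 0$ is a support vector, $\alpha_i - \alpha_i^*$ is the support value, and $N_{sv}$ is the number of support vectors. The kernel function $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$ is any symmetric function that satisfies the Mercer condition.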
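The two ingredients of support vector regression described above, the insensitive loss and the kernel expansion over support vectors, can be sketched in a few lines. This is a hedged illustration only: the function names are hypothetical, the radial basis function kernel is an assumption (the text does not name the kernel used), and the support vectors, support values, and bias are supplied by hand rather than obtained from training.

```python
import math
from typing import Sequence


def eps_insensitive_loss(y: float, f_x: float, eps: float) -> float:
    """Zero when the error is inside the eps-tube, linear outside it."""
    return max(0.0, abs(y - f_x) - eps)


def empirical_risk(y_true: Sequence[float], y_pred: Sequence[float],
                   eps: float) -> float:
    """Mean eps-insensitive loss over the samples."""
    losses = [eps_insensitive_loss(y, f, eps) for y, f in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float) -> float:
    """Radial basis function kernel (an assumed kernel choice)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * d2)


def svr_predict(x: Sequence[float],
                support_vectors: Sequence[Sequence[float]],
                support_values: Sequence[float],
                bias: float, gamma: float) -> float:
    """Kernel expansion: bias plus the weighted kernels to each support vector."""
    return bias + sum(v * rbf_kernel(sv, x, gamma)
                      for sv, v in zip(support_vectors, support_values))


# Errors of 2 and 8 gait-cycle points with a tube half-width of 5.
risk = empirical_risk([50.0, 50.0], [52.0, 58.0], eps=5.0)
# One hand-picked support vector located at the query point.
pred = svr_predict([0.0], [[0.0]], [2.0], bias=1.0, gamma=0.5)
```

Only samples with a nonzero support value contribute to the sum, which is why a trained SVM is typically sparse in the training samples.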

Decision trees (DT)
A DT is a recursive binary partition of the feature space into a set of rectangles. Within each rectangle the tree is a simple function of the input, usually a constant. Therefore, following the classification and regression tree (CART) algorithm [29] developed by Breiman in 1984, the input-output map of a DT has the form:

$$f(x) = \sum_{i=1}^{M} c_i \, I(x \in R_i) \tag{12}$$

where $x = (x_1, x_2, \ldots, x_N)$ is the N-dimensional input parameter vector, $M$ is the number of rectangles, and $I$ is the indicator function, which takes the value 1 if its argument is true and 0 otherwise. Whenever the input is in rectangle $R_i$, the tree outputs the value $c_i$. Complex functions can be represented by using a sufficiently fine partition. The output of a DT can be obtained very quickly by performing a few comparisons. Each node of the tree holds a pair $(j, t)$, where $j$ is the index of a variable in the input vector $x$ and $t$ is the threshold. If $x_j < t$, the tree descends to the left; otherwise it descends to the right. Leaf nodes differ from regular nodes in that they store only a constant value. When the descent reaches a leaf node, the output is the constant value $c_i$ stored at that node. Because each comparison involves a single element of the input vector, the sides of the regions are parallel to the coordinate axes.
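The descent through a regression tree described above can be sketched as a short loop. The `Node` class, the `predict` function, and the example tree with its thresholds and leaf values are illustrative assumptions; a trained CART model would learn the splits from data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A regular node holds a split; a leaf holds only a constant value."""
    value: Optional[float] = None      # constant output (leaf only)
    feature: Optional[int] = None      # index of the tested input variable
    threshold: Optional[float] = None  # split threshold
    left: Optional["Node"] = None      # taken when x[feature] < threshold
    right: Optional["Node"] = None     # taken otherwise


def predict(node: Node, x: Sequence[float]) -> float:
    """Descend from the root until a leaf is reached; return its constant."""
    while node.value is None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


# Hand-built tree over x = (hip angle, knee angle); all numbers are made up.
tree = Node(feature=0, threshold=10.0,
            left=Node(value=20.0),
            right=Node(feature=1, threshold=30.0,
                       left=Node(value=50.0),
                       right=Node(value=75.0)))

out_a = predict(tree, [5.0, 40.0])   # hip angle below 10: left leaf
out_b = predict(tree, [15.0, 10.0])  # right, then knee angle below 30
out_c = predict(tree, [15.0, 40.0])  # right, then right
```

Each prediction costs one comparison per level of the tree, which is why evaluating a DT is fast enough for use inside a device control loop.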

Sliding window data (SWD)
To improve the accuracy of gait cycle prediction, an SWD prediction scheme was proposed. SWD is normally used to segment a data sequence [30]; in our study it is applied to the hip angle, knee angle, and ground reaction force input parameters. This study used three different window sizes, reaching back up to 3 previous steps of the gait cycle, for training and validation of the gait cycle prediction models. The SWD is represented in (13).

$$y(t+1) = f\big(x(t-n+1), \, x(t-n+2), \, \ldots, \, x(t)\big) \tag{13}$$

where $x(t)$ is the input data at gait cycle $t$, $y(t)$ is the gait cycle output data, and $n$ is the window size. The data at gait cycle $t + 1$ is the value to be predicted after the training period, based on the current $x(t-n+1), x(t-n+2), \ldots, x(t)$ input update. Figure 4 illustrates the SWD scheme.

Figure 4. SWD
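The windowing step can be sketched as a function that appends previous input rows to each current row. This is a minimal illustration under assumptions: the function name `sliding_window` and the `lags` parameter are hypothetical, the toy inputs are made up, and each window is paired here with the output at the same step as its newest row.

```python
from typing import List, Sequence, Tuple


def sliding_window(inputs: Sequence[Sequence[float]],
                   targets: Sequence[float],
                   lags: int) -> Tuple[List[List[float]], List[float]]:
    """Build rows [x(t-lags), ..., x(t-1), x(t)] paired with the target at t.

    The first `lags` steps are dropped because they lack a full history.
    With lags=0 the data are returned unchanged (the no-SWD case).
    """
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(lags, len(inputs)):
        row: List[float] = []
        for k in range(t - lags, t + 1):
            row.extend(inputs[k])  # oldest step first, current step last
        X.append(row)
        y.append(targets[t])
    return X, y


# Toy sequence: two input channels per step, gait cycle (%) as target.
steps = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
cycle = [0.0, 25.0, 50.0, 75.0]
X1, y1 = sliding_window(steps, cycle, lags=1)
X3, y3 = sliding_window(steps, cycle, lags=3)
```

Predicting one step ahead instead would only change the pairing, using the target at t + 1 and stopping the loop one step earlier. Each extra lag widens the feature vector by one full set of input channels, so larger windows trade model size for temporal context.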

Training and testing
Kumari et al. [31] and Shams-Baboli and Ezoji [32] obtained good accuracy with 70% of the data used for training and 30% for validation or testing. Hence, the dataset was divided into three parts: 70% of the total data was used for training, 15% for validation, and the remaining 15% for testing. Because the sample size is small, a cross-validation method was used to evaluate the performance of the gait cycle prediction model. In the five-fold cross-validation scheme, the results of the 24 young adults were divided into five segments. First, the prediction model was trained and built using four of the five segments, and the remaining one was used for validation. Second, this procedure was repeated five times. Finally, the five prediction results were averaged to give the final result. The training and validation were carried out with four different SWD conditions: i) (t), ii) (t − 1), iii) (t − 2), and iv) (t − 3). The RMSE and R-Value were used to measure the agreement between the gait cycle predicted by each machine learning technique and the actual target values. These two criteria were used as the basis for training and for selecting the best prediction model. The RMSE was calculated from the actual values and those predicted by the GPR, SVM, and DT models, respectively, using:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} \tag{14}$$

and the R-Value can be presented as (15).

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} \tag{15}$$

where $\hat{y}_i$ is the predicted value, $y_i$ is the actual value, $\bar{y}$ is the mean of the actual values, and $N$ is the number of measured data points.
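The evaluation procedure can be sketched as two metric functions plus a helper that deals subjects into validation folds. The function names are hypothetical, the round-robin assignment of subjects to folds is an assumption (the text does not say how the five segments were formed), and the numbers in the example are made up.

```python
import math
from typing import List, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """One minus the ratio of residual to total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def k_folds(subject_ids: Sequence[int], k: int = 5) -> List[List[int]]:
    """Deal subjects round-robin into k validation folds (assumed scheme)."""
    return [list(subject_ids[i::k]) for i in range(k)]


# Made-up gait cycle values (%) for illustration.
actual = [0.0, 50.0, 100.0]
predicted = [10.0, 50.0, 90.0]
error = rmse(actual, predicted)
score = r_squared(actual, predicted)

# 24 subjects split into five folds; each fold is held out once.
folds = k_folds(list(range(1, 25)), k=5)
```

Splitting by subject rather than by individual sample keeps all data from one person in the same fold, so the validation score reflects performance on unseen subjects. The per-fold RMSE and R-Value would then be averaged over the five held-out folds.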

RESULTS AND DISCUSSION
This section discusses the training, validation, and testing of the gait cycle prediction model. As described in the methodology section, three machine learning techniques (GPR, SVM, and DT) are used to develop the gait cycle prediction model. The models are built with and without the SWD enhancement, and the RMSE and R-Value results are compared and discussed. The gait cycle prediction model is then tested, and its RMSE and R-Value results are compared, using data from subjects that were not used during the training and validation process.

Training and validation: Gait cycle prediction model using GPR, SVM and DT
We applied GPR, SVM, and DT to gait cycle prediction using the hip and knee joint angles and the GRF dataset of humans during normal gait, without any additional SWD input parameters. The training and validation results for the three machine learning techniques are summarized in Table 2. Results are expressed in terms of RMSE and R-Value in order to identify the best model. The RMSE and R-Value were 13.297 and 0.78 for GPR, 14.059 and 0.77 for SVM, and 11.535 and 0.84 for DT. Of the three models validated here, the SVM and GPR models had similar performance because both are exemplar-based kernel machines. In addition, studies by Hultquist et al. [33] and Nguyen-Tuong and Peters [34] show that SVM often gives a sparse model fitted from only a subset of the training samples, whereas GPR is a full model based on all the training samples. At this stage, the DT model performed best compared with the GPR and SVM models; the prediction results are shown in Figure 5. The performance of GPR in Figure 5(a) and SVM in Figure 5(b) was lower than that of the DT model in Figure 5(c) because, at 0%−10% and 90%−100% of the gait cycle, the predictions did not fit closely to the accurate prediction level. However, from 10% to 90% of the gait cycle, the predictions fitted the accurate prediction level, indicating that the GPR, SVM, and DT models all perform well over this range.

Table 3.

The results for sliding window (t − 3) for the GPR, SVM, and DT models are shown in Figure 6, and the gait cycle prediction accuracy, in terms of RMSE and R-Value, of each machine learning technique is presented in Table 4. For all the training models, the RMSE decreased and the R-Value increased as the sliding window of input data increased. This is because more previous information is fed into the training model, which enhances the accuracy of the gait cycle prediction.
Among the three models, the DT model gave good results. For the GPR and SVM models, the predicted response during the first 10% and the last 10% (90% to 100%) of the gait cycle improved compared with the results without SWD shown in Figure 5(a) and Figure 5(b), and from 10% to 90% of the gait cycle the result again showed a good prediction level, fitting close to the accurate prediction level. In contrast, for the DT model shown in Figure 6(c), the predicted response fitted evenly along the accurate prediction level. Thus, the best model for gait cycle prediction to be used during testing was the DT model.

Dey et al. [35] used the minimum and maximum angles and moments of the hip, knee, and ankle joints to predict values of the gait pattern at different speeds, using a quadratic regression technique. Eslamy and Schilling [36] reported an R-Value above 0.92 using GPR for ankle kinematic prediction at the trained speed levels. Dey et al. [37] reported an R-Value of 0.98 for prediction of ankle joint angles using SVM for level ground walking at self-selected normal speeds. However, the variation in the range of input features, the difference in data sets, the volume of data used and the different output quantification measures

Testing: Gait cycle prediction model
Each of the models was tested using the testing data from four subjects (subjects 6, 10, 13, and 19). The results are summarized in Table 5, and Figure 7(a) and Figure 7(b) show the complete gait cycle prediction testing results of GPR, SVM, and DT for subjects 6 and 10, respectively. All the developed models reached good accuracy, with low RMSE values of around 3 to 7 and high R-Values of 0.9 and above. Among the results, the DT model showed consistent RMSE and R-Value results of around 3.0 and 0.95, respectively. Figure 7(a) and Figure 7(b) show that the GPR and SVM model predictions did not perform as well, since their curves diverge from the actual line. In contrast, the DT model showed very good results and fitted the actual line, which indicates that gait cycle prediction was best using the DT model. This study demonstrates that ML with the GRF and the hip and knee angles as input parameters can be used to predict the human gait cycle. Unlike the work in [24], [38] on joint moment prediction using an ANN model, this study avoids the use of marker trajectories, which can be time-consuming to collect and require complex equipment. It also effectively reduces the number of input parameters needed to predict the gait cycle. Our SWD method further enhances the accuracy of prediction, giving an R-Value > 0.95. The proposed method is therefore sufficient to be used as a model of the gait cycle in the design of control systems for assistive rehabilitation devices.

Int J Artif Intell ISSN: 2252-8938, Gait cycle prediction model based on gait kinematic using machine… (Che Ani Adi Izhar)

CONCLUSION
This paper presents gait cycle prediction models developed using machine learning techniques. The DT model is shown to be suitable for modeling the gait cycle using the input parameters of height, weight, hip and knee angle, and GRF. The gait cycle prediction model was enhanced further and achieved better accuracy once the sliding window was introduced. Future studies should explore other input parameters, such as joint moment, to predict joint angle. Assistive rehabilitation devices would also benefit from this research on gait cycle prediction.