Explainable ensemble technique for enhancing credit risk prediction

ABSTRACT


INTRODUCTION
Credit risk prediction is a vital task in the financial industry, as it helps institutions make informed lending decisions and maintain financial stability. With increasing data availability and advances in machine learning (ML) algorithms, there has been growing interest in using ML models for credit risk prediction [1], [2]. However, many of these models are complex and difficult to interpret, which can limit their usefulness in practice [3].
To address this challenge, there has been growing interest in developing explainable AI (XAI) techniques that can provide insights into the decision-making process of ML models [4]. One approach to improving performance is the use of ensemble methods, which combine the predictions of multiple models to improve accuracy and reduce overfitting [5]. Ensemble methods have shown promising results in various applications, including credit risk prediction [6], [7].
Traditional credit scoring methods, such as logistic regression and discriminant analysis, are widely used in practice [2]. These models rely on linear relationships between variables and may not capture the complex interactions and nonlinear patterns in credit data. To address these limitations, ML models have been proposed that can handle large, high-dimensional, and heterogeneous data and learn complex, nonlinear relationships between variables [8]. Decision trees, random forests, support vector machines, and neural networks are among the algorithms commonly used in credit risk prediction [1], [9], [10]. Although these models have improved the accuracy of credit risk prediction, their complexity and lack of interpretability make it difficult to explain model outputs to stakeholders. [11] reviewed recent developments in credit risk prediction using ML models, discussing the advantages and limitations of various models. They concluded that deep learning models have better predictive performance than traditional ML models, but that their black-box nature raises the issue of interpretability.
To address this challenge, there has been growing interest in developing explainable ML models for credit risk prediction. These models aim to maintain the accuracy of ML algorithms while providing interpretability through model visualization and feature importance analysis [12], [13]. Some examples of explainable ML models in credit risk prediction include rule-based systems, fuzzy logic models, and ensemble models [14]-[16].
This issue is addressed in the survey of explainable ML by [11], which discusses the different approaches to explainability in ML models. [17] proposed an explainable credit analysis method using deep neural networks, which allows for a better understanding of the decision-making process. [18] developed an explainable and interpretable credit risk evaluation model based on extreme gradient boosting (XGBoost), which provides a clear and concise explanation for its predictions. [19] proposed an explainable ML model for credit risk prediction that incorporates the Shapley additive explanations (SHAP) method to explain the importance of each feature. Nevertheless, lack of interpretability remains one of the main challenges of using ML models for credit risk prediction. This is particularly important in the financial industry, where decisions must be transparent and justified [20]. Interpretable models can help explain the factors that contribute to credit risk, identify potential biases or errors, and provide insights for risk management [2].
To address the need for interpretable credit risk models, there has been growing interest in XAI techniques that can provide insights into the decision-making process of ML models [4]. Ensemble methods have emerged as a popular approach to building interpretable models that can improve prediction accuracy and provide insights into the model's decision-making process [21]. Ensemble methods combine multiple ML models to improve predictive performance and reduce the risk of overfitting. Bagging, boosting, and stacking are the three most common ensemble methods [22]. Bagging trains multiple models on different bootstrap samples of the data and combines their predictions, typically by majority voting. Boosting trains models sequentially, with each new model focusing on the examples that earlier models found most difficult, and combines their predictions using weighted voting. Stacking combines the predictions of multiple models using another model as a meta-classifier.
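The three combination rules described above can be illustrated with a minimal sketch. It is not taken from any of the cited works: the base models are hand-written rules standing in for trained classifiers, and the applicant fields (`debt_ratio`, `late_payments`, `income`), the vote weights, and the meta-classifier coefficients are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Bagging: draw len(rows) rows with replacement for one base model."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Bagging-style combination: the most frequent label wins."""
    return Counter(predictions).most_common(1)[0][0]

def weighted_vote(predictions, weights):
    """Boosting-style combination: each vote counts in proportion to its model weight."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

def stack(base_models, meta_model, x):
    """Stacking: base-model outputs become the input of a meta-classifier."""
    return meta_model([model(x) for model in base_models])

# Hypothetical base models: simple rules standing in for trained classifiers.
base_models = [
    lambda a: int(a["debt_ratio"] > 0.4),
    lambda a: int(a["late_payments"] >= 2),
    lambda a: int(a["income"] < 30000),
]
# Hypothetical meta-classifier; in practice it is trained on held-out base predictions.
meta_model = lambda v: int(0.5 * v[0] + 0.3 * v[1] + 0.2 * v[2] >= 0.5)

applicant = {"debt_ratio": 0.55, "late_payments": 0, "income": 42000}
votes = [model(applicant) for model in base_models]       # [1, 0, 0]
by_majority = majority_vote(votes)                         # 0
by_weight = weighted_vote(votes, [0.6, 0.2, 0.1])          # 1
by_stacking = stack(base_models, meta_model, applicant)    # 1
resample = bootstrap_sample(list(range(10)), random.Random(0))
```

The same three base predictions lead to different final decisions depending on the combination rule, which is why the choice of ensemble technique matters for both accuracy and explanation.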
Ensemble methods have been used for credit risk prediction with promising results. [21] proposed an explainable ensemble model for credit scoring that combined multiple ML algorithms, including logistic regression, decision trees, and neural networks, and used feature importance analysis to identify the most important features for credit risk prediction. The proposed model achieved better performance than other state-of-the-art methods and provided insights into the factors that contribute to credit risk.
Ensemble methods have also been used in other studies of credit risk prediction. [23] proposed a random forest model that combined bagging and feature selection to enhance the accuracy and interpretability of the model. Similarly, [8] evaluated the performance of several ML algorithms, including ensemble methods, for credit risk assessment and found that ensemble methods generally outperformed other approaches.
In addition to ensemble methods, other XAI techniques have been proposed for credit risk prediction. [20] proposed an explainable credit risk assessment framework that used local interpretable model-agnostic explanations (LIME) to generate local explanations for individual credit decisions. [2] proposed a deep learning approach for credit scoring that used a convolutional neural network (CNN) and SHAP values to identify the most important features for credit risk prediction.
Overall, ensemble methods have emerged as a promising approach that can improve the predictive performance of ML models [24]-[28]. By combining multiple algorithms and features, explainable ensemble methods can capture complex patterns and interactions in credit data and provide insights into the factors that contribute to credit risk. However, several challenges and limitations remain. One challenge is the selection of appropriate ML algorithms and ensemble techniques. Different algorithms and techniques have different strengths and weaknesses, and their performance may vary with the data characteristics and the problem domain. Future research could investigate the optimal combination of algorithms and techniques for credit risk prediction. Another challenge is the interpretability of ensemble models. While ensemble methods can provide insights into the decision-making process of ML models, interpreting an ensemble can be more complex than interpreting an individual model. Future research could explore how to provide more transparent and understandable explanations for ensemble models. Furthermore, data quality and bias in credit risk prediction need to be addressed. ML models can amplify the biases and errors in the data, leading to unfair or discriminatory outcomes. XAI techniques can help identify and mitigate these biases, but more research is needed on how to ensure the fairness and accountability of credit risk models.
In summary, explainable ensemble methods have shown promising results for credit risk prediction and can provide insights into the decision-making process of ML models. The present study focuses on addressing the challenges and limitations of these methods and on developing more transparent and fair credit risk models for the financial industry that enhance both the accuracy and the explainability of predictions. The paper proposes the use of explainable ensemble methods for credit risk prediction. We aim to build an ensemble model that is both accurate and interpretable by combining multiple base models that use different ML algorithms and features. Finally, a model interpretation technique is proposed to identify the most important features and visualize the model's decision-making process. The goal of this research is to provide insights into the factors that contribute to credit risk and to promote transparency and trust in financial decision-making.
The remainder of this paper is organized as follows. Section 2 describes the proposed approach for building an explainable ensemble model for credit risk prediction. Section 3 discusses the experimental results on real-world data and compares the proposed approach with other state-of-the-art methods. Finally, Section 4 concludes the paper and discusses future directions for research.

METHOD
The proposed methodology for enhancing credit risk prediction with explainable ensemble methods consists of several steps. The process applied in the research methodology is shown in Figure 1. The proposed methodology offers a holistic approach to credit risk prediction that combines enhanced predictive accuracy with interpretability. Its primary objective is to give financial institutions robust tools for making lending decisions that are not only more accurate but also transparent and comprehensible. As the financial industry faces increasing scrutiny and regulatory requirements, maintaining interpretability and transparency in credit risk prediction is of paramount importance. The use of explainable ensemble methods ensures that the model's predictions are not perceived as black-box decisions but are rooted in a clear understanding of the contributing factors. This transparency fosters trust among stakeholders, including customers, regulators, and internal decision-makers, as they can trace and comprehend how the model arrives at its risk assessments.
ISSN: 2252-8938. Int J Artif Intell, Vol. 13, No. 1, March 2024: 917-924

RESULTS AND DISCUSSION
The performance of several ML algorithms, namely Gaussian NB, Logistic Regression, Extra Trees Classifier, Random Forest Classifier, XGB Classifier, LGBM Classifier, Neural Network, and the proposed explainable ensemble method, is evaluated using precision, recall, F1-score, and accuracy. The proposed algorithm achieves the highest performance, with precision, recall, F1-score, and accuracy of 99.8% on dataset 1 (Figure 2). This indicates that the proposed algorithm identifies both positive and negative instances of credit risk with high accuracy. The high precision indicates that few of the instances predicted as positive are false positives. The high recall indicates that few actual positive instances are missed, i.e., the number of false negatives is low. The XGB Classifier is the second-best-performing algorithm, achieving precision, recall, F1-score, and accuracy of 97.3%, 97.2%, 96.5%, and 97.1%, respectively. The Random Forest Classifier achieves precision, recall, F1-score, and accuracy of 96.2%, 96.3%, 95.4%, and 96.9%, respectively, and is also a promising algorithm for credit risk prediction. The other algorithms, including Gaussian NB, Logistic Regression, Extra Trees Classifier, LGBM Classifier, and Neural Network, also perform reasonably well but are outperformed by the proposed algorithm and by the XGB and Random Forest classifiers.
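For reference, the four metrics used above can be computed from the counts of true and false positives and negatives. The following is a minimal sketch using the standard definitions; the label vectors are made-up toy values, not results from either dataset, and the function name `classification_metrics` is introduced here for illustration only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1-score, and accuracy for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of actual positives that the model finds.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1-score: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Toy labels (1 = default, 0 = no default); purely illustrative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
# tp = 3, fp = 1, fn = 1, tn = 3, so all four metrics equal 0.75
```

Because the F1-score is the harmonic mean of precision and recall for a single class, it always lies between the two; averaged multi-class variants can behave differently.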
Overall, the results show that the proposed explainable ensemble method outperforms individual base models and achieves high accuracy with improved interpretability. The high precision, recall, F1-score, and accuracy scores of the proposed algorithm demonstrate its potential to accurately predict credit risk while providing insights into the factors that contribute to credit risk. This can help financial institutions make more informed lending decisions and promote transparency and trust in financial decision-making.
On dataset 2 (Figure 3), the same ML algorithms are evaluated for credit risk prediction and their results are compared with those of the proposed algorithm, again using precision, recall, F1-score, and accuracy as performance metrics. The results of the experiments are shown in Table 2. The proposed algorithm achieved the highest precision, recall, F1-score, and accuracy, all at 99.9%. The next best performing algorithms were XGB Classifier, LGBM Classifier, and Neural Network, with accuracy scores of 97.1%, 98.0%, and 95.4%, respectively. It is important to note that the proposed algorithm achieved these high scores while maintaining interpretability and transparency, making it a valuable tool for financial institutions to make informed lending decisions. The algorithm achieves this by using an ensemble of different ML algorithms and features, which improves the model's performance and reduces the risk of overfitting. Overall, the results demonstrate that the proposed algorithm can effectively predict credit risk with high accuracy while maintaining interpretability and transparency, which is crucial for financial institutions to build trust with their customers and stakeholders.
Explaining ML models becomes challenging when dealing with correlated data, because commonly used methods ignore these dependencies, which leads to unrealistic settings and misleading explanations. This is a known pitfall of model-agnostic interpretation methods for ML models. To address this issue, we propose a module that considers the dependencies between variables and enables ML engineers to explain their models. This module includes functionality for estimating the importance and contribution of variables by grouping them into aspects. The Model Aspect Importance is calculated using permutation-based variable importance. To obtain explanations for specific groups, the groups are specified using the h cut-off level, which represents the minimum value of dependency between the variables in one aspect. The results are shown in Figures 4 and 5 for datasets 1 and 2, respectively.
The results of our experiments demonstrate that the proposed ensemble method with feature selection and model interpretation techniques can effectively improve credit risk prediction while maintaining model interpretability. Our proposed algorithm achieved significantly higher precision, recall, F1-score, and accuracy than the other base models, with an overall accuracy of 99.9%. The accuracy of the other base models ranged from 91.3% to 98.0%. The high accuracy of the proposed algorithm indicates that it can effectively predict credit risk, which can help financial institutions make informed lending decisions.
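The idea of grouping dependent variables into aspects and permuting each aspect as a whole can be sketched as follows. This is an illustrative simplification under stated assumptions, not the module used in the study: dependency is measured by absolute Pearson correlation, a simple linkage rule (two variables join the same aspect when their correlation is at least h) stands in for hierarchical clustering, and the data, the feature names, and the `model` function are synthetic placeholders.

```python
import random

def pearson(a, b):
    """Pearson correlation between two equally long numeric lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def group_features(columns, h):
    """Put features in one aspect when their absolute correlation is at least h."""
    parent = list(range(len(columns)))
    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            if abs(pearson(columns[i], columns[j])) >= h:
                parent[find(j)] = find(i)
    groups = {}
    for i in range(len(columns)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def aspect_importance(model, rows, labels, group, rng):
    """Accuracy drop when all features of one aspect are permuted together."""
    baseline = accuracy(model, rows, labels)
    order = list(range(len(rows)))
    rng.shuffle(order)
    permuted = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for f in group:  # same row permutation for every feature in the aspect
            new_row[f] = rows[order[i]][f]
        permuted.append(new_row)
    return baseline - accuracy(model, permuted, labels)

# Synthetic data: features 0 and 1 are strongly correlated, feature 2 is independent.
rng = random.Random(0)
income = [rng.gauss(0, 1) for _ in range(200)]
spending = [0.9 * v + rng.gauss(0, 0.1) for v in income]
age = [rng.gauss(0, 1) for _ in range(200)]
rows = [list(r) for r in zip(income, spending, age)]
model = lambda r: int(r[0] + r[1] > 0)   # hypothetical fitted model; ignores feature 2
labels = [model(r) for r in rows]

groups = group_features([income, spending, age], h=0.8)
scores = {tuple(g): aspect_importance(model, rows, labels, g, rng) for g in groups}
```

Permuting the correlated pair jointly keeps each permuted row internally realistic, whereas permuting one of the two variables alone would create combinations that never occur in the data.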

CONCLUSION
In conclusion, the study proposes the use of explainable ensemble methods to enhance credit risk prediction models while maintaining interpretability, which can improve transparency and trust in financial decision-making. The ensemble model is built with a multistage heterogeneous stacking technique that uses different ML algorithms and features, and model interpretation techniques are used to identify the most important features and visualize the model's decision-making process. The experimental results demonstrate that the proposed model outperforms individual base models and achieves high accuracy with improved interpretability. Moreover, the proposed model provides insights into the factors that contribute to credit risk, which can help financial institutions make more informed lending decisions. Overall, our study highlights the potential of explainable ensemble methods in enhancing credit risk prediction models and promoting transparency and trust in financial decision-making.
Based on the findings of this study, several future research directions can be identified. The first is further exploration of the interpretability of ensemble methods. Although interpretation techniques can be applied to ensemble methods, there is still room for improvement in understanding how these methods make decisions, and future research could focus on developing more advanced interpretability techniques for ensemble models. The second is an investigation of the impact of different feature selection methods. Many feature selection methods are available, and they could be compared to determine which is most effective for credit risk prediction. The third is the evaluation of the proposed model on different datasets. While the proposed model showed promising results on the data used in this study, evaluating it on other datasets would help determine its generalizability and robustness. The fourth is the development of hybrid models. In this study, an ensemble of ML models was used to improve credit risk prediction; future research could explore hybrid models that combine ML models with other methods, such as rule-based systems or expert knowledge, to further enhance performance and interpretability. The fifth is the incorporation of non-traditional data sources. Credit risk prediction models typically rely on traditional data sources, such as credit scores and income. With the rise of alternative data sources, such as social media and online behavior, future research could explore the potential of these non-traditional sources to improve credit risk prediction and decision-making.

Explainable ensemble technique for enhancing credit risk prediction... (Pavitha Nooji)
Data Processing: The first step is data processing, which involves data collection and preprocessing. The dataset is cleaned, missing values are imputed, and categorical variables are converted into numerical values. For this study, data were collected from a private bank and a non-banking financial company (NBFC) in India.
D = {x1, x2, ..., xn}: original dataset
Dc = {x1c, x2c, ..., xnc}: cleaned dataset
Dcn = {x1cn, x2cn, ..., xncn}: dataset with categorical variables converted to numerical
Feature Selection: In this step, relevant features are selected for building the models. Feature selection helps to reduce the dimensionality of the data and improves the performance of the models.
F = {f1, f2, ..., fp}: original feature set
Fs = {fs1, fs2, ..., fsk}: selected feature set
Base Model Selection: Multiple base models are selected for building the ensemble model. Different ML algorithms and features are used to create diverse base models, which helps to improve the overall performance of the ensemble model.
M = {M1, M2, ..., Mk}: set of base models
Ensemble Model Construction: The ensemble model is constructed using a multistage heterogeneous stacking ensemble technique.
Em = f(M1, M2, ..., Mk): ensemble model constructed from base models
Model Interpretation: Once the ensemble model is constructed, a model interpretation technique is applied to understand the factors that contribute to credit risk and to provide insights into the model's predictions.
I: model interpretation technique used
Ir: interpretation result obtained from ensemble model
Model Evaluation: The performance of the ensemble model is evaluated using standard evaluation metrics such as accuracy, precision, recall, and F1-score. The proposed model is compared with individual base models to demonstrate its effectiveness in improving credit risk prediction.
E: evaluation metric used to measure model performance
Ep: performance of proposed model
Ei = {Ei1, Ei2, ..., Eik}: performance of individual base models
C = {C1, C2, ..., Ck}: comparison result between proposed model and individual base models based on evaluation metric
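The data processing and feature selection steps above (D to Dc to Dcn, and F to Fs) can be sketched as follows. The concrete choices here are assumptions made for illustration, since the text does not specify them: median imputation for numeric columns, most-frequent-value imputation and integer coding for categorical columns, and a selection rule that simply drops features with a single value. The column names and records are hypothetical.

```python
from collections import Counter
from statistics import median

def clean(dataset, numeric_cols):
    """D -> Dc: impute missing values (median for numeric, mode for categorical)."""
    cleaned = [dict(row) for row in dataset]
    for col in dataset[0]:
        present = [row[col] for row in dataset if row[col] is not None]
        if col in numeric_cols:
            fill = median(present)
        else:
            fill = Counter(present).most_common(1)[0][0]
        for row in cleaned:
            if row[col] is None:
                row[col] = fill
    return cleaned

def encode(dataset, categorical_cols):
    """Dc -> Dcn: replace each category with an integer code (sorted order)."""
    encoded = [dict(row) for row in dataset]
    for col in categorical_cols:
        codes = {v: i for i, v in enumerate(sorted({row[col] for row in dataset}))}
        for row in encoded:
            row[col] = codes[row[col]]
    return encoded

def select_features(dataset, features):
    """F -> Fs: keep features with more than one distinct value.
    A stand-in for whichever selection criterion is actually used."""
    return [f for f in features if len({row[f] for row in dataset}) > 1]

# Hypothetical records; None marks a missing value.
D = [
    {"income": 52000, "employment": "salaried", "branch": "A"},
    {"income": None, "employment": "self-employed", "branch": "A"},
    {"income": 31000, "employment": None, "branch": "A"},
    {"income": 78000, "employment": "salaried", "branch": "A"},
]
Dc = clean(D, numeric_cols={"income"})
Dcn = encode(Dc, categorical_cols=["employment", "branch"])
Fs = select_features(Dcn, ["income", "employment", "branch"])
# Fs == ["income", "employment"]: "branch" is constant and is dropped
```

In a real pipeline the imputation values and category codes would be fitted on the training split only and then reused on the test split, so that no information leaks from the evaluation data.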

Figure 2. Stacking Ensemble model result on data set 1

Figure 3. Stacking Ensemble model result on data set 2

Figure 4. Explanation for data set 1

Figure 5. Explanation for data set 2