A survey of predicting software reliability using machine learning methods

ABSTRACT


INTRODUCTION
Companies build intelligent software to increase software dependability and thus keep failures under control, since software in general raises real concerns about reliability and maintainability. Researchers have used a variety of machine learning algorithms to find controls for the variables that affect most programs [1], [2].
Currently, testing methods are among the most important means of determining the usability of software [3]. Software usability is defined as the ability to use the software to its full potential, without errors, within a predetermined period of time [4]. Various techniques are available for building intelligent programs. Artificial neural networks, fuzzy set theory, rough (approximate) set theory, and artificial intelligence are all examples of techniques used for information retrieval [5]. Errors introduced during error removal, together with errors initially present in the data set, have the potential to cause the entire system to fail [6].
According to [3], the framework shown in Figure 1 applies program usability, algorithm, and architecture to achieve reliable, defect-free operation of software. Intelligent software is built by applying machine learning techniques to a company's defect dataset, which yields reliable models in different dimensions based on an early prediction model [7].
Incidents that escalate into critical failures as a result of program failure cause financial, time, and information losses [8]. For this reason, errors must be handled correctly by the time of release, and they are carefully checked throughout the testing and debugging processes, using historical data about software failures to determine the number of errors found in testing. Based on the failure history of the application, the best-fitting mean value function m(t) and failure intensity function λ(t) are determined for software reliability models [9].
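For reference, in reliability models based on a non-homogeneous Poisson process the two functions m(t) and λ(t) are linked by a standard relation. The symbols N(t), for the cumulative number of failures observed by time t, and x, for a further operating interval, are introduced here only for illustration and do not come from [9].

```latex
m(t) = \mathbb{E}[N(t)] = \int_0^t \lambda(s)\,ds,
\qquad
\lambda(t) = \frac{d\,m(t)}{dt},
\qquad
R(x \mid t) = \exp\bigl(-\,[\,m(t+x) - m(t)\,]\bigr)
```

Here R(x | t) is the probability that no failure occurs in the interval (t, t + x], so a flatter m(t) late in testing corresponds to higher reliability.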
Machine learning is essential to defect detection. It is used in evaluating software reliability to look for subtle variations in how well a product works in actual use, and a variety of machine learning techniques are used to validate a prediction application [10]. The name "deep" refers to the number of processing layers through which the data must pass. Deep learning led to the introduction of neural networks with higher complexity and more effective learning capabilities: the deep learning model produces a statistical model as output after applying step-by-step non-linear transformations to an input, and the model repeats these iterations until the result is sufficiently accurate [11].
Figure 1. Usability framework for programs [3]
Researchers and software engineers are increasingly integrating deep learning into software engineering (SE) processes. Deep learning benefits SE experts in three main ways: understanding requirements from plain language, generating source code, and predicting software errors. Understanding requirements from plain language is facilitated by deep learning models. These models analyze and extract valuable insights from textual descriptions, enabling a better comprehension of stakeholder needs and expectations.
Deep learning also aids in generating source code. By training on vast amounts of existing code, deep learning models learn patterns, structures, and coding conventions. This enables them to generate code snippets or complete programs based on high-level descriptions or specific requirements, accelerating development efforts and improving productivity.
Predicting software errors is another area where deep learning excels. By analyzing large datasets of code, deep learning models identify patterns indicative of potential bugs or vulnerabilities. This proactive approach allows software engineers to address issues before they become critical problems, enhancing software quality and reliability.
The integration of deep learning into SE processes offers vast potential for advancement. However, it is important to remember that deep learning should be seen as a complement to human expertise rather than a replacement. Domain knowledge, experience, and critical thinking are crucial for ensuring accuracy, reliability, and ethical considerations in applying deep learning techniques in software engineering. By combining the power of deep learning with human intelligence, researchers and software engineers can unlock new possibilities and drive innovation in the SE domain.
The paper is organized as follows. The second section explains software reliability models, how reliability is predicted, and their important role in software engineering. The third section covers machine learning and its role in predicting software reliability. The fourth section reviews previous studies and researchers' findings on predicting software reliability using machine learning techniques. The fifth section presents the conclusions drawn from reviewing the work of researchers in this field.

SOFTWARE RELIABILITY MODELS
Over the past few decades, many studies on software reliability estimation and prediction have been presented at conferences; reviewing the development of software reliability prediction models is the main aim of this research. These models are based on data collected during the testing phase of the program. The majority of the models that have been put forward are mathematical functions that express the relationship between the number of errors found and the test effort.
The test effort can be measured by the number of test cases run, the execution time, or the calendar time [12]. Models can be used to predict the reliability of the software as well as the number of defects (or remaining defects) that have not yet been detected. Thus, models can help determine when to stop testing and release the software. Estimating software reliability rests on the basic assumption that as testing continues, more and more defects are found. As a result, reliability improves and the number of remaining defects decreases. Hence these models are known as software reliability growth models (SRGMs) [13].
Non-homogeneous Poisson process (NHPP) models are the most widely used class of software reliability growth models. As software testing progresses, NHPP models look for the non-homogeneous Poisson process that best fits the error detection pattern and use it to estimate program reliability or residual errors. Most of these models include a parameter that represents the expected total number of software errors [14].
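To make the role of the "expected total number of errors" parameter concrete, the following is a minimal sketch that assumes the Goel-Okumoto exponential NHPP model, chosen here only as a familiar example; the text does not name a specific model. The parameter names `a` (expected total faults) and `b` (per-fault detection rate) and the numeric values are illustrative assumptions.

```python
import math

# Goel-Okumoto NHPP model, used here purely as an illustrative example.
# a: expected total number of faults (assumed value below)
# b: fault detection rate per unit of test effort (assumed value below)

def go_mean_value(t, a, b):
    """Expected cumulative number of faults detected by time t."""
    return a * (1.0 - math.exp(-b * t))

def go_intensity(t, a, b):
    """Failure intensity, the derivative of the mean value function."""
    return a * b * math.exp(-b * t)

def go_remaining_faults(t, a, b):
    """Expected number of faults still undetected at time t."""
    return a - go_mean_value(t, a, b)

def go_reliability(x, t, a, b):
    """Probability of no failure in (t, t + x] if testing stops at t."""
    return math.exp(-(go_mean_value(t + x, a, b) - go_mean_value(t, a, b)))

if __name__ == "__main__":
    a, b = 100.0, 0.05  # hypothetical parameters, not fitted to real data
    for t in (0, 20, 40, 80):
        print(t,
              round(go_mean_value(t, a, b), 2),
              round(go_remaining_faults(t, a, b), 2),
              round(go_reliability(1.0, t, a, b), 4))
```

In practice `a` and `b` would be estimated from failure data, and the remaining-fault and reliability estimates would then inform the decision about when to stop testing.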
A survey of predicting software reliability using machine learning methods (Shahbaa I. Khaleel)
Non-homogeneous Poisson process models and stochastic process models are the two main classes into which software reliability growth models fall. These two types of models are the most common and the most frequently used. NHPP models are further classified according to a variety of factors, the most important of which concern how test effort is measured, the shape of the mean value function m(t), and the assumptions made about fault correction. Figure 2 presents a classification of software reliability growth models [15], [16].

MACHINE LEARNING
Neural networks have seen a surge of interest in the past few years and are widely applied across an extraordinary range of problem domains, in disciplines as diverse as finance, medicine, engineering, topography, and materials science. It all started in 1943, when McCulloch and Pitts showed that neurons can have two states and that these states can depend on a certain threshold value, presenting a simulated neuron for the first time. Since then, many updated and newer models have been released. This discovery gave McCulloch and Pitts a foundation for intelligent machines [17].
An artificial neural network (ANN) works similarly to the human brain. The probability is that, at some point, all people will be the same. Each individual may have made the same judgments in each of the cases. Human nerves may cause one or both of them to react similarly in certain situations, which can be the distinguishing element behind a wide range of human differences [18].
Machine learning has also been used to study reliability prediction from data, and artificial intelligence methods have been studied and employed in software engineering. This was done by using particle swarm optimization (PSO) and crow swarm optimization (CSO) to generate optimal test cases automatically for software written in the C language, which allows the organization developing the software to save time and cost while ensuring the quality of the testing process, which is estimated at 50% of the product cost [19].
Bee swarms have likewise been employed to serve software engineering. The artificial bee colony algorithm was used to select test cases automatically for software written in the C++ language, allowing the organization developing the software to save the time, effort, and cost required for the testing phase and for regression testing, which is consistently estimated at 50% of the product cost [20]. Estimation in software is used to predict important future characteristics of a software project, such as the development effort, since project failure is largely due to incorrect project management practices [21].
Consider a topological network whose nodes are connected by directed arrows. Each arrow represents a connection between two neurons and shows the direction of information flow. Each link has a weight, a number that represents the strength of the signal passed between the two neurons. The structure of the neural network is shown in Figure 3.
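The weighted, directed connections described above can be illustrated with a small feedforward computation. This is a minimal sketch in plain Python: the function names, the threshold (step) and sigmoid activations, and all weight values are assumptions made for illustration and are not taken from Figure 3.

```python
import math

def step(z, threshold=0.0):
    """Two-state (threshold) activation: fire only if z reaches the threshold."""
    return 1 if z >= threshold else 0

def sigmoid(z):
    """Smooth activation commonly used in trainable networks."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of incoming signals plus bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def forward(inputs, layers):
    """Propagate signals layer by layer.

    layers is a list of layers; each layer is a list of (weights, bias)
    pairs, one pair per neuron in that layer.
    """
    signal = list(inputs)
    for layer in layers:
        signal = [neuron_output(signal, w, b, sigmoid) for w, b in layer]
    return signal

if __name__ == "__main__":
    # A single threshold neuron acting as a logical AND (hypothetical weights).
    print(neuron_output([1, 1], [1.0, 1.0], -1.5, step))
    # A 2-2-1 network with arbitrary illustrative weights.
    net = [[([0.5, -0.3], 0.1), ([0.8, 0.2], -0.4)], [([1.0, -1.0], 0.0)]]
    print(forward([0.2, 0.7], net))
```

Information flows only in the direction of the connections, from the input list through each layer in turn, and each weight scales the signal carried by one link.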

DEEP LEARNING
Deep learning methods have recently made significant strides in enhancing research workloads in the field of software engineering. Table 1 provides an overview of the most commonly used deep learning techniques in this context. Among the widely adopted models are deep belief networks (DBNs), recurrent neural networks (RNNs), convolutional neural networks (CNNs), and long short-term memory (LSTM). These models have proven effective in a range of software engineering tasks, showcasing the versatility and applicability of deep learning in the field. One of the areas where these deep learning techniques have shown promise is in understanding requirements from plain language. By leveraging DBNs, RNNs, CNNs, and LSTM models, researchers and software engineering experts can analyze textual descriptions and extract meaningful insights to comprehend stakeholder needs and expectations more effectively. This ability to interpret plain language requirements using deep learning methods can greatly improve the accuracy and understanding of project specifications.
Furthermore, these deep learning models have also been successfully applied to generating source code. Through training on vast amounts of existing code, DBNs, RNNs, CNNs, and LSTM models can learn patterns, structures, and coding conventions. This enables them to generate code snippets or even complete programs based on high-level descriptions or specific requirements. The use of deep learning in code generation has the potential to significantly speed up the development process and boost overall productivity for software engineers.
ISSN: 2252-8938
First, traditional neural network models start from randomly selected initial weights, so feature selection for software flaws is unstable. By employing a greedy approach, the deep neural network model is able to capture the subtle and reliable aspects of software flaws. Second, traditional neural network models tend to converge to a local optimum, whereas by using the greedy algorithm the deep neural network model can find the overall best solution. Compared with previous methods, it can also detect features of software errors more accurately. As a result, the deep neural network model outperforms regular neural network models in terms of prediction accuracy [31].

PREVIOUS STUDIES
RNNs have gained significant popularity in addressing problems associated with sequential data.These networks have found extensive application in various domains, including natural language processing, speech recognition, and time series analysis, among others.RNNs excel in handling data sequences due to their ability to retain memory and capture temporal dependencies.
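The way a recurrent network retains memory can be sketched with a single recurrent unit. The code below is a minimal illustration assuming a one-unit Elman-style cell with a tanh activation; the weights `w_in` and `w_rec`, the bias, and the input sequences are hypothetical and not drawn from any of the surveyed studies.

```python
import math

def rnn_forward(sequence, w_in=0.5, w_rec=0.8, bias=0.0):
    """Run a one-unit recurrent cell over a sequence.

    At each step the new hidden state depends on the current input and on
    the previous hidden state, which is how earlier inputs are remembered.
    Returns the list of hidden states, one per time step.
    """
    hidden = 0.0
    states = []
    for x in sequence:
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
        states.append(hidden)
    return states

if __name__ == "__main__":
    # A single early event keeps influencing later states, fading over time.
    print(rnn_forward([1.0, 0.0, 0.0]))
    # With no event at all, the state stays at zero.
    print(rnn_forward([0.0, 0.0, 0.0]))
```

In a reliability setting the inputs might be, for example, failure counts per test interval, so that the prediction for the next interval reflects the recent failure history.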
Bai et al. [32] developed a software prediction model based on Markov Bayesian networks and proposed a technique for solving the network model. The researchers assumed that the current number of defects in the software was normally distributed, mainly because the normal distribution has many attractive properties, such as linear stability. The approach used the AdaBoost algorithm and reached an accuracy of 82.3%.
Hu et al. presented RNNs to describe the interaction between software fault detection and debugging. Comparisons with feedforward neural networks and analytical models were carried out, and the researchers reached a maximum accuracy of 94.62% [33]. Costa et al. presented a method based on genetic programming and also proposed the use of boosting techniques to improve performance. Experiments were carried out with reliability models based on time and on test coverage [34].
The work in [35] first selected several different forms of an SRGM to obtain a self-combination model (ASCM), and then selected several candidate SRGMs to obtain a multi-combination model (AMCM); each form of SRGM was studied. The results show that ASCM is fairly effective and applicable for improving the estimation and prediction performance of the corresponding original SRGM without adding any other factors or assumptions. AMCM is also effective and applicable, and it produces better estimation and prediction ability than the neural-network-based combination model, with an accuracy of 79.63% [35]. Kotaiah and Khan [36] presented various machine learning strategies for examining software reliability: the fuzzy method, the fuzzy neural strategy, the genetic algorithm, the Bayesian classification approach, the SVM approach, and the self-organization method.
Zhang et al. [37] presented the main disadvantages of software reliability models. Based on the basic PSO-SVM evaluation and the properties of software reliability prediction, some enhanced PSO-SVM measures were proposed. The simulation results showed that, compared with the classical models, the improved model has higher prediction accuracy, better generalization ability, and less dependence on the number of samples, and that it is more applicable to predicting software reliability by measuring the unit size, which represents the number of lines of code, and the number of errors, which represents the number of module defects. An accuracy of 97.98% was reached.
The work in [38] describes the inference and statistical prediction of software reliability in the presence of variable information. A Bayesian method was developed using Gaussian processes and the local occupancy grid map (LOGM) algorithm to estimate the number of software errors over specific time intervals, where the software is assumed to have changed after each time period and software metrics data are available after each update.
Also, Amin et al. [39] presented a well-established approach to predicting software reliability based on the autoregressive integrated moving average (ARIMA) model for time series, as an alternative solution that addresses SRGM limitations and provides more accurate reliability prediction. Using real-life datasets on software failures, the accuracy of the proposed approach was evaluated and compared with existing, well-known approaches. This comparison showed that the proposed approach performed better than other ARIMA-based approaches, was consistent in performance, and was less expensive than the SVR approach. An accuracy of 78.80% was reached.
Zhao et al. [40] proposed a positive-feedback support vector regression (PF-SVR) scheme. The proposed scheme determines the parameters of the SVR model using the full sample data while adjusting the parameters dynamically: when additional reliability data are received, the parameters of the SVR model are updated using special equations that involve the SVR training model. The PF-SVR method provides improved prediction performance compared with ordinary SVR because of this parameter adjustment. PF-SVR can capture changes in reliability trends by updating the parameters adaptively, which makes it suitable for software reliability testing. The mean squared error (MSE) was used to measure the accuracy of the algorithm, and the results were 1.1848 and 0.4318, respectively.
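The MSE values quoted above can be read against the usual definition of the metric, the average of the squared differences between observed and predicted values. The following is a minimal sketch in plain Python; the function name and the sample numbers are illustrative assumptions and are not data from [40].

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

if __name__ == "__main__":
    # Hypothetical cumulative failure counts and model predictions.
    observed = [3, 7, 12, 15, 17]
    forecast = [2.5, 7.5, 11.0, 15.5, 18.0]
    print(mean_squared_error(observed, forecast))
```

A lower MSE indicates predictions closer to the observed failure data, which is why two MSE figures can be compared only when they are computed on the same dataset and scale.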
Tyagi and Sharma [41] developed a new model for component-based software systems (CBSS) that accounts for path usage. It was shown that the proposed model, which uses ant colony optimization (ACO), works better than other models; the reliability of the application can be estimated by measuring the time and the probability of error. This model produces heuristic component dependency graphs (HCDGs), which help to estimate CBSS reliability. The HCDGs provide better reliability estimates than other current techniques, with an accuracy of 65.78%.
Roy [42] used algorithms based on different mathematical approaches, such as fuzzy set theory, time-series-based approaches, and the wave packet transmission function, which can accurately predict the occurrence of various frequently occurring web errors. The predictive accuracy of the proposed methods is better than that of a number of current and widely used methods. Moreover, the proposed methods are free from unrealistic assumptions such as the following: the number of errors in the system is limited; once an error is detected, it is completely removed; and the total number of errors detected is proportional to the test time.
Bhuyan et al. [43] proposed a method for predicting software reliability using the fuzzy min-max algorithm combined with a recurrent neural technique. Empirical evidence was presented showing that the fuzzy min-max algorithm with a recurrent approach using backpropagation learning gives accurate results. Software reliability prediction was used to improve software process control and achieve high software reliability.
Many researchers have proposed software reliability prediction models, and some shortcomings of these models were found, as explained in [44]. It was found that deep learning models are very useful in predicting software errors and that RNN-based learning models give better results. The modification in [45] studied the Jelinski-Moranda (J-M) model; the idea of the study was to generalize the proposed hazard rate equation by adding a new structure parameter. The new generalized hazard rate formula is flexible enough to accommodate all kinds of time-dependent behavior, and it can provide a range of SRGMs that can be used with less effort and time in any model selection study.
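As background for the J-M discussion, the hazard rate of the classical Jelinski-Moranda model can be sketched as follows. This shows only the standard textbook form, in which each remaining fault contributes equally to the hazard rate; it is not the generalized equation of [45], whose additional parameter is not specified in the text. The names `total_faults` and `phi` and the numeric values are illustrative assumptions.

```python
def jm_hazard_rate(i, total_faults, phi):
    """Classical J-M hazard rate while waiting for the i-th failure.

    total_faults: initial number of faults N (assumed known here)
    phi: contribution of each remaining fault to the hazard rate
    After i - 1 faults have been removed, N - (i - 1) faults remain.
    """
    if not 1 <= i <= total_faults:
        raise ValueError("i must be between 1 and total_faults")
    return phi * (total_faults - (i - 1))

def jm_expected_time_to_failure(i, total_faults, phi):
    """Mean waiting time for the i-th failure (exponential inter-failure times)."""
    return 1.0 / jm_hazard_rate(i, total_faults, phi)

if __name__ == "__main__":
    n, phi = 10, 0.1  # hypothetical values
    for i in (1, 5, 10):
        print(i, jm_hazard_rate(i, n, phi), jm_expected_time_to_failure(i, n, phi))
```

Because the hazard rate drops by a fixed amount with every fault removed, the expected time between failures grows as testing proceeds, which is the reliability growth that generalizations of the model aim to capture more flexibly.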
Tamura and Yamada [46] proposed a method for selecting the optimal software reliability model based on deep learning. Several numerical examples of software reliability assessment in actual software projects are presented. The optimal release time and the expected total software cost were discussed in terms of model selection based on deep learning, and the proposed deep-learning-based method showed better potential than the one based on a neural network.
Xu et al. [47] used a multi-layered heterogeneous dynamic particle swarm optimization-back propagation (MHPSO-BP) approach for software reliability prediction, based on a BP neural network trained with a more effective multi-layer heterogeneous PSO. This approach uses an attractor to optimize the velocity update equation of the particles and arranges the population of the particle swarm in a hierarchical structure. In this way the particle swarm technique is optimized and the information interaction between particles is improved. The optimized PSO was then applied to optimize the weights and thresholds of the BP neural network. In the experiment, the software reliability prediction test was run using a dataset from the NASA Metrics Data Program (NASA MDP). The results showed that the suggested method has better overall prediction performance than the typical back-propagation neural network, reaching 92%.
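To show what a velocity update equation looks like, the following is a minimal sketch of canonical global-best PSO. It is not the MHPSO-BP method of [47]: it has no attractor term and no hierarchical population, and the inertia and acceleration coefficients, the function name `pso_minimize`, and the test objective are all assumptions chosen for illustration. In the setting of [47], the objective would be the training error of the BP network and each particle position would encode its weights and thresholds.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=20, iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Canonical global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

if __name__ == "__main__":
    # Hypothetical objective: the sphere function, minimized at the origin.
    best, value = pso_minimize(lambda p: sum(x * x for x in p), dim=2, bounds=(-5.0, 5.0))
    print(best, value)
```

Variants such as the one surveyed above modify the velocity update and the way particles share information, while the overall loop of evaluating, updating bests, and moving particles stays the same.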
Wang [48] analyzed the requirements for a software reliability prediction model and evaluation system, describing the general structure of the system, the specific unit functions, and the database design. JavaScript, HyperText Markup Language (HTML), and other technologies were used to complete the design of the software reliability evaluation system and the analysis of the hierarchical structure of classes and the main software packages. The test results show that the software reliability prediction system can meet the business requirements, with an accuracy of 94.01%.
Pattnaik and Ray [49] discussed the reliability of existing software, estimation models at different stages of the software development process, and the metrics used for software reliability at different levels, i.e., the code level and the architectural level. Various models have been presented for reliability analysis, most of them derived analytically from assumptions. The limitations of prediction models as well as architectural models are also discussed. The effect of failure data on software reliability prediction was examined, and it was shown analytically that the exponential distribution plays an important role in reliability because it has a constant failure rate. Finally, some familiar tools for measuring and estimating software reliability are discussed.
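The constant failure rate of the exponential distribution can be checked numerically. The sketch below assumes the standard reliability function R(t) = exp(-λt) and estimates the hazard rate as the negative slope of ln R(t); the function names and the value of the failure rate are illustrative assumptions, not figures from [49].

```python
import math

def exp_reliability(t, failure_rate):
    """Probability of surviving to time t under an exponential lifetime."""
    return math.exp(-failure_rate * t)

def exp_hazard(t, failure_rate, dt=1e-6):
    """Numerical hazard rate h(t) = -d/dt ln R(t), via a forward difference."""
    r_now = exp_reliability(t, failure_rate)
    r_next = exp_reliability(t + dt, failure_rate)
    return -(math.log(r_next) - math.log(r_now)) / dt

def mean_time_to_failure(failure_rate):
    """Mean lifetime of an exponential distribution."""
    return 1.0 / failure_rate

if __name__ == "__main__":
    rate = 0.2  # hypothetical failures per unit time
    for t in (0.0, 5.0, 50.0):
        print(t, exp_reliability(t, rate), exp_hazard(t, rate))
    print(mean_time_to_failure(rate))
```

The estimated hazard is the same at every time point, which is what makes the exponential distribution a natural baseline in reliability analysis.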
Barack and Huang [50] studied mobile application reliability evaluation and prediction using common software reliability growth models (SRGMs); four software reliability models were used to evaluate an open-source mobile application by examining its bug reports.
Sahu and Srivastava [51] studied a number of previously developed reliability growth models (RGMs) and applied them at different stages of development. The study found that no reliable prediction model can be used throughout the software development process. The researchers offer suggestions for developers to develop and describe a reliable prediction model that can be used at every stage of development.
Also, Gandhi et al. [52] presented a high-quality algorithm that can be used to predict software reliability. The proposed algorithm is implemented using a hybrid approach known as a neuro-fuzzy inference system and has also been applied to test data. After testing and training on real-time data, reliability prediction yielded a mean relative error and a mean absolute relative error of 0.0060 and 0.0121, respectively. The results show that the proposed algorithm gives attractive results in terms of both the mean absolute relative error and the mean relative error compared with other existing models, which supports the reliability of the proposed model's predictions. The aim of this new technique is to make the model as simple as possible in order to improve software reliability.
Kushwah and Sharma [53] explored the prediction of software failure by examining the nature of the work in the software process. The research found that the software reliability prediction models put forward by numerous researchers had some flaws and did not work in all test conditions. Additionally, assessing the reliability of software is not yet an exact science. Soft computing techniques, including neural networks, fuzzy logic, genetic algorithms, genetic programming, swarm intelligence, and Bayesian networks, among others, are of the utmost significance, and the use of modern soft computing techniques in software reliability modelling is stressed.
San et al. [54] presented a new technique for software reliability modeling called the deep cross-project software reliability growth model (DC-SRGM). It is a cross-project prediction approach that uses the features of previous projects' data through project similarity. Specifically, the proposed technique applies cluster-based project selection for the training data source and modeling by a deep learning method. Experimental studies involving 15 real datasets from E-Seikatsu and 11 open source software datasets show that DC-SRGM can describe the reliability of ongoing development projects more precisely than existing traditional SRGMs and LSTM models.
Ali et al. [55] presented a reliability prediction model that improves scalability by introducing an algorithmic mechanism, a TypeScript state machine. In addition, the proposed method supports modeling the nature of concurrent applications by adapting the formal statistical distribution to the scenario set. The proposed method was evaluated using sensor-based case studies. The experimental results confirmed the effectiveness of the proposed method in terms of reducing the computational cost compared with similar models; this reduction is the most important factor in improving scalability. In addition, the presented work can allow system developers to determine the load under which their system will remain reliable by observing the reliability value in many operating scenarios.
These studies are summarized in Table 2 (see Appendix). The table shows the database used in each study and whether it consisted of previously stored data or real-time data. It also states the metric used to determine the quality and accuracy of the technique used to predict software reliability.
Table 2 provides evidence that the utilization of machine learning techniques yields satisfactory accuracy when assessing the reliability of software programs. The high accuracy rates achieved can be attributed to the quality of the technology employed, regardless of whether the database is extensive or of moderate size. This implies that machine learning algorithms have the capability to effectively determine the reliability of programs, regardless of the scale of the database being analyzed.

CONCLUSION
Using deep learning is the best solution for ensuring software reliability, according to the preceding discussion. Ensuring software reliability has become a serious concern due to the increasing size and complexity of current software. Anticipating potential code defects in software applications can be considered a useful way to increase software reliability, since it can significantly reduce software maintenance work. A flaw prediction framework that uses deep learning algorithms to automatically generate features from source code while preserving semantic and structural information has the greatest promise. Moreover, our survey confirms the feasibility of deep learning methods for programming and their important role in predicting software reliability.

− Test effort measurement: models are categorized by the number of test cases and the test duration.
− Exponential and S-shaped models, depending on the reliability or the mean value function m(t).
− Models with perfect correction and models without it, based on different types of correction assumptions.
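The distinction between exponential and S-shaped models can be made concrete by writing out one mean value function of each kind. The two forms below are well-known examples (the Goel-Okumoto exponential model and the delayed S-shaped model), chosen here for illustration; the parameters a, the expected total number of faults, and b, the detection rate, are assumed notation that does not appear in the text.

```latex
m_{\mathrm{exp}}(t) = a\left(1 - e^{-bt}\right),
\qquad
\lambda_{\mathrm{exp}}(t) = a\,b\,e^{-bt}

m_{\mathrm{S}}(t) = a\left(1 - (1 + bt)\,e^{-bt}\right),
\qquad
\lambda_{\mathrm{S}}(t) = a\,b^{2}\,t\,e^{-bt}
```

In the exponential case the failure intensity falls from the start of testing, whereas in the S-shaped case it first rises, peaks at t = 1/b, and then falls, which gives the cumulative curve its S shape.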

Figure 3. The structure of the neural network

Table 1. Combined machine learning and deep learning methods to predict software flaws
In the study of mobile application reliability [50], an open-source mobile application was evaluated by examining its bug reports. Experiments showed that it is possible to use SRGMs with fault data obtained from bug reports to evaluate and predict software reliability in mobile applications. The results of the study allow software developers and testers to evaluate and predict the reliability of mobile software applications.

APPENDIX
Table 2. Summary of the relevant works