Fault-type coverage based ant colony optimization algorithm for attaining smaller test suite

Received Feb 3, 2020; Revised Apr 8, 2020; Accepted May 22, 2020
ISSN: 2252-8938, Int J Artif Intell, Vol. 9, No. 3, September 2020: 507-519

In this paper, we propose the Fault-Type Coverage Based Ant Colony Optimization (FTCBACO) technique for test suite optimization. The algorithm starts by initializing the FTCBACO factors using the test cases in the test suite. A separate ant is then assigned to each test case, which is called a vertex. Each ant chooses the best vertices to reach the food source, that is, the objective of the problem, by updating pheromone trails and following the trails with higher probability. This procedure is repeated until the ant reaches the food source. In the FTCBACO algorithm, the minimal set of test cases with the least execution time chosen by an ant to cover all fault types (the objective) is taken as the optimal solution. We measured the performance of FTCBACO against the Greedy approach and the Additional Greedy approach in terms of fault-type coverage, test suite size and execution time. The heuristic Greedy and Additional Greedy approaches required more execution time and produced larger test suites when solving the test suite optimization problem. Statistical investigations were performed to assess the significance of the performance difference between FTCBACO and the other approaches; they indicate that FTCBACO improves the test suite reduction rate and reduces the execution time needed to reduce the test suite.


INTRODUCTION
Software is revised numerous times during its development and afterwards. A software application program must be revised whenever new attributes and functionalities are introduced [1]. After each revision, the program has to be tested again to assure that the system functions properly with the recent modifications. Software testing plays the most significant role in producing an effective software product. It is the procedure of executing the software program in order to discover the faults that cause software failure. Combinatorial testing (CT) is an effective software testing technique used to identify faults arising from pairwise combinations of features in Software Program Applications (SPA). The quality of a software program depends on a correct and adequate test suite, and CT is essential for effective test suite generation. The main objective of software testing is to generate the smallest set of test cases that exposes the most faults in the least time. Hence, the quality of a software program is measured in terms of software metrics such as maximum fault coverage, minimum test suite size and minimum computational time.

The computation time of huge test suites is regarded as a bottleneck when building large software. Therefore, test suite size optimization is essential to reduce the computational cost. Test suite size optimization is classified into test suite minimization, test case selection and test case prioritization [2]. Test suite minimization is the method of choosing test cases that fulfill specified constraints. Additionally, test suite optimization plays an essential role in reducing the testing cost of SPA without degrading their quality factors. This work therefore aims to optimize test suites efficiently in order to increase the capability of SPA.
We selected two heuristic algorithms, the Greedy strategy based algorithm [3] and the Additional Greedy strategy based algorithm [4], as the baseline algorithms for this research. In the Greedy approach [3], test cases are arranged in descending order of their fault coverage ability, and the approach takes test cases in that order, starting with those that cover the highest number of faults, until either all the faults are covered or the test adequacy condition is met. The Additional Greedy approach [4] differs from the Greedy approach in how it chooses test cases for the minimized test suite. Initially, it selects the test case that covers the highest number of faults. It then repeatedly selects the test case that covers the highest number of faults not yet covered by the minimized test suite. The same process is repeated until all the faults are covered. However, we observed that both baseline algorithms require more execution time to optimize the test suite.
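The difference between the two baselines can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from [3] or [4]: the function names `greedy` and `additional_greedy`, the dictionary representation of fault coverage, and the four-test-case example data are all hypothetical, and ties are broken by input order.

```python
from typing import Dict, List, Set


def greedy(coverage: Dict[str, Set[str]]) -> List[str]:
    """Sort once by total faults covered (descending), then take test
    cases in that fixed order until every fault is covered."""
    all_faults = set().union(*coverage.values())
    order = sorted(coverage, key=lambda t: len(coverage[t]), reverse=True)
    selected: List[str] = []
    covered: Set[str] = set()
    for test in order:
        if covered == all_faults:
            break
        selected.append(test)
        covered |= coverage[test]
    return selected


def additional_greedy(coverage: Dict[str, Set[str]]) -> List[str]:
    """At every step pick the test case that covers the most faults
    that are still uncovered by the tests selected so far."""
    all_faults = set().union(*coverage.values())
    selected: List[str] = []
    covered: Set[str] = set()
    while covered != all_faults:
        best = max(coverage, key=lambda t: len(coverage[t] - covered))
        selected.append(best)
        covered |= coverage[best]
    return selected


# Hypothetical coverage data (not Table 1 of the paper).
coverage = {
    "TC1": {"F1", "F2", "F3", "F4"},
    "TC2": {"F1", "F2", "F3"},
    "TC3": {"F5", "F6"},
    "TC4": {"F4", "F5"},
}

print(greedy(coverage))             # ['TC1', 'TC2', 'TC3']
print(additional_greedy(coverage))  # ['TC1', 'TC3']
```

On this data the plain Greedy ordering keeps TC2 even though it adds no new fault, whereas Additional Greedy skips it, which is why the latter tends to return smaller suites.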
To overcome the issues cited above when solving the test suite optimization problem with the Greedy and Additional Greedy approaches, the Fault-Type Coverage Based Ant Colony Optimization (FTCBACO) technique is built. The FTCBACO algorithm is designed around maximum fault-type coverage analysis and produces an efficient optimal solution. The major contributions of the FTCBACO technique include enhancing the ability of test suite reduction for software testing. The remainder of this paper is outlined as follows. Section 2 reviews the related works. In Section 3, we explain the implementation process of FTCBACO for the test suite optimization problem with an illustrative example. Section 4 reports the results analysis and discussion. In Section 5, we conclude the paper.

RELATED WORKS
Chen and Lau [5] presented the GRE methodology for minimization. This methodology is based on three strategies: the essential strategy, the 1-to-1 redundancy strategy and the greedy strategy. Initially, the essential test cases are selected and inserted into the set; after that, 1-to-1 redundant test cases are eliminated repeatedly. The greedy strategy is then applied to the remaining test cases until all the requirements are fulfilled. GRE is reported to create optimal sets. Khalilian and Parsa [6] proposed bi-criteria test suite reduction with cluster analysis of execution profiles. They merged a distribution-based technique with a coverage-based technique to build reduced test suites with full coverage. The coverage-based technique is used to select test cases that reveal faults, and the distribution-based technique is used to cluster the test cases. The two techniques are combined to form a reduced set with full coverage. The resulting reduced test suites had lower fault detection capability. Tallam and Gupta [7] proposed a greedy algorithm inspired by concept analysis for test suite reduction, which is based on the relation between test cases and testing requirements. For reduction, test cases are treated as objects and requirements as their attributes. A context table (the test suite) is built from the association between objects and attributes, and the aim of the algorithm is to reduce the size of this context table. Object reduction rules and attribute reduction rules are used to reduce objects and attributes. The context table is reduced using object reduction, attribute reduction and owner reduction, and is further reduced slightly by eliminating duplicate objects. Yoo and Harman [8] proposed a hybrid algorithm for Pareto-efficient multi-objective test suite minimization. They merged the greedy approach with a genetic algorithm to obtain high-quality Pareto fronts, and reported that the results produced by their method were more efficient.
Chen, Zhang and Xu [9] recommended a degraded ILP approach for test suite reduction. In this approach, a lower bound on the minimum test suite size is produced and a feasible solution is searched for near that lower bound. If the representative set size matches the lower bound, the representative set is considered the best result; if the representative set size is close to the lower bound, the representative set is judged a good result; and if the representative set size is far from the lower bound, Integer Linear Programming or another expensive method was applied. ... [10] proposed a greedy approach for coverage-based test suite reduction. They obtained a reduced test suite size based on code coverage criteria and compared its performance with bi-objective greedy techniques as well as HGS. Chen and Lau [11] offered a divide-and-conquer approach for test suite reduction. They focused on dividing strategies that are complete with respect to the minimal and optimal representative sets. The divide-and-conquer approach splits the original problem into smaller sub-problems, finds optimal results for the sub-problems, and builds a result for the original problem from the results of the sub-problems. They obtained a needed (essential) subset and a repeated (redundant) subset, corresponding to needed test cases and repeated test cases respectively. The needed subset holds the needed test cases. A repeated subset is one whose fulfilled requirements can also be fulfilled by other test cases. Finally, the representative set comprised the needed subset and excluded the repeated subset. Galeebathullah and Indumathi [12] proposed a novel approach for controlling the size of a test suite. A greedy approach and set theory were used to produce reduced sets. They used an intersection function to find the unsatisfied unique elements. The intersection between the elements of the branch coverage criteria was then found using set theory for the set of test cases. Initially, they calculated the intersection between the elements. If any intersecting elements occur, the test case is included in the reduced test suite. This procedure was continued until all requirements were fulfilled. The reduced test suites had a size similar to those of other approaches. You and Lu [13] ... [21] proposed an incremental approach to unit testing during maintenance, which considers testing effort during software maintenance. Lin and Huang [22] proposed an analysis of test suite reduction with enhanced tie-breaking techniques. They combined the HGS and GRE approaches to achieve a greater reduction in test suite size based on fault detection capability. Jeffrey and Gupta [23] proposed test suite reduction with selective redundancy. This technique was used to cut the fault detection cost by removing duplicate test cases; redundant test cases were found using branch coverage data. The technique provided a slightly reduced test suite size with better fault detection effectiveness. Boussai et al. [24] discussed various metaheuristic optimization techniques. Dokeroglu et al. [25] discussed new-generation metaheuristic algorithms. Ilango et al. [26] presented optimization using an Artificial Bee Colony based clustering approach for big data. Vimal et al. [27] proposed energy enhancement using multi-objective ant colony optimization with a double Q-learning algorithm for IoT-based cognitive radio networks.

FAULT-TYPE COVERAGE BASED ANT COLONY OPTIMIZATION (FTCBACO)
The Fault-Type Coverage Based Ant Colony Optimization (FTCBACO) technique is designed to solve the issues of the Greedy approach and the Additional Greedy approach for the test suite reduction problem. FTCBACO is based on the metaheuristic Ant Colony Optimization (ACO) algorithm and aims at a high test suite reduction rate, minimum execution time and maximum fault-type coverage in order to improve the efficiency of regression testing (RT). Swarm intelligence (SI) techniques are used for solving computational problems, and ACO is derived from SI techniques. The ACO algorithm is constructed on graphs and finds an optimal path based on the behavior of ants. With the help of pheromone (a chemical substance) emitted by ants, an ant selects a path from its colony to the food source and returns to the colony. Foragers track the path to the food source by sensing the pheromone trails of other ants. As a result, an optimal path from the colony to the food source is found. Based on the concept of the ACO algorithm, the FTCBACO algorithm is designed to find the optimum test cases in a test suite, maximizing test suite reduction with minimum execution time while covering all fault types. In the FTCBACO algorithm, an ant treats a test case as a vertex and the probability of a test case as the weight of an edge. The ant chooses the test case with the higher probability value (edge) as the best test case in order to cover all fault types, which is the food source (objective). The goal of each ant is to cover all fault types of the problem. The probability value of an edge is calculated using the pheromone value of the test case deposited on the path. The FTCBACO algorithm also requires three control parameters, α, β and ρ, to optimize the problem. The values of the parameters α and β define the relationship between the pheromone value and the heuristic value. The parameter ρ represents the evaporation rate, and its value must lie in the range between 0 and 1. An optimal combination of α, β and ρ helps to evaluate the proposed FTCBACO algorithm effectively.
Combinatorial testing plays a key role in searching for the best combination of α, β and ρ values. The values of α, β and ρ are determined for the problem at hand during the implementation phase. The data flow diagram of the Fault-Type Coverage Based Ant Colony Optimization (FTCBACO) algorithm is presented in Figure 1. If FTCj, where j = 1, 2, …, n, of ant Ai contains all fault types, the algorithm moves to the next ant. Otherwise, the pheromone of ant Ai is updated using equation (1) or (2), depending on the constraints, the probability of ant Ai is updated using (3), and the next test case with the higher probability is selected. The same process is repeated until all ants have covered all fault types. At that point, a test case path that covers all fault types has been obtained for every ant. Afterwards, the execution times of all the ants are compared, and the best ant is the one with the minimum execution time and the smallest number of test cases in its test case path.
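The final step, electing the best ant, can be sketched as follows. This is a minimal illustration, not the authors' code: the `AntResult` record, the `best_ant` function and the sample values are hypothetical, and the sketch assumes that path length is compared first with execution time as the tie-breaker, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AntResult:
    ant_id: int
    path: Tuple[str, ...]   # test cases chosen by this ant, in order
    exec_time_ms: float     # time the ant needed to cover all fault types


def best_ant(results: List[AntResult]) -> AntResult:
    """Pick the ant with the fewest test cases; break ties on time.
    (Assumed ordering of the two criteria.)"""
    return min(results, key=lambda r: (len(r.path), r.exec_time_ms))


# Hypothetical outcomes for three ants, each already covering all fault types.
results = [
    AntResult(1, ("TC1", "TC4", "TC6", "TC7"), 0.020),
    AntResult(2, ("TC2", "TC5", "TC8"), 0.015),
    AntResult(3, ("TC3", "TC5", "TC8"), 0.019),
]

winner = best_ant(results)
print(winner.ant_id, winner.path)  # 2 ('TC2', 'TC5', 'TC8')
```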

Procedure of FTCBACO
The procedure of the FTCBACO algorithm is presented below.

Functions of FTCBACO
FTCBACO has two major functions for selecting the next test case in order to achieve the objective function of the problem: update pheromone and update probability.

Update pheromone
If the ant number is the same as the selected test case number, the pheromone of the test case is evaluated using (1). Otherwise, it is evaluated using (2). Both (1) and (2) are expressed in terms of the amount of pheromone deposited.

Update probability
The probability of a test case is updated using equation (3).
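For orientation, the sketch below shows the textbook Ant System rules that play the same roles as the pheromone and probability updates: evaporation with rate ρ followed by a deposit, and a selection probability proportional to pheromone^α times heuristic^β. It is not a reconstruction of equations (1) to (3) of this paper; the function names, the choice of heuristic value (assumed here to be the number of fault types a test case covers) and the sample numbers are assumptions made for illustration. Only α=2, β=1 and ρ=0.4 are taken from the text.

```python
from typing import Dict


def evaporate_and_deposit(tau: Dict[str, float],
                          deposit: Dict[str, float],
                          rho: float) -> Dict[str, float]:
    """Standard Ant System update: tau <- (1 - rho) * tau + deposit."""
    return {t: (1.0 - rho) * tau[t] + deposit.get(t, 0.0) for t in tau}


def selection_probabilities(tau: Dict[str, float],
                            eta: Dict[str, float],
                            alpha: float,
                            beta: float) -> Dict[str, float]:
    """Standard rule: p(t) = tau^alpha * eta^beta / sum over all candidates."""
    weights = {t: (tau[t] ** alpha) * (eta[t] ** beta) for t in tau}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all candidate weights are zero")
    return {t: w / total for t, w in weights.items()}


# Hypothetical state: equal initial pheromone, heuristic = fault types covered.
tau = {"TC1": 1.0, "TC2": 1.0, "TC3": 1.0}
eta = {"TC1": 4.0, "TC2": 3.0, "TC3": 2.0}

# The ant deposits pheromone only on the test case it selected (assumed).
tau = evaporate_and_deposit(tau, {"TC1": 1.0}, rho=0.4)
probs = selection_probabilities(tau, eta, alpha=2.0, beta=1.0)

next_test = max(probs, key=probs.get)
print(next_test)  # TC1
```

With α larger than β, as in the parameter combination reported in the paper, the accumulated pheromone influences the choice more strongly than the heuristic value does.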

Demonstration with an example
An example is used to demonstrate how the FTCBACO algorithm works in comparison with the existing Greedy and Additional Greedy approaches. In this example, we consider a sample input fault matrix with 8 test cases (rows) and 10 fault types (columns), shown in Table 1. The fault matrix is encoded with the binary values 0 and 1, where 1 indicates that fault type FTj is covered by test case TCi and 0 indicates that the fault type is not covered. Hence, the number of fault types covered by each test case can be determined from the sample input fault matrix. The control parameters α and β are assumed to range between 0 and 10, and ρ between 0 and 1. Mixing the parameters α, β and ρ yielded 179 combinations, and we ran the algorithm 10 times for every combination. We identified α=2, β=1, ρ=0.4 as the optimal combination of parameter values for the performance of the algorithm. The proposed FTCBACO algorithm was then executed along with the existing Greedy and Additional Greedy approaches on the sample input fault matrix of Table 1.

Table 2 shows the initial pheromone distribution of each test case and its probability for the sample input test suite. Tables 3 to 10 show the step-by-step execution of the proposed FTCBACO algorithm with the sample inputs, the process of test case path selection for each ant, and the optimized test path of the test cases in the test suite. Table 3 shows the outcomes of ant 1. The outcome demonstrates the ability of the FTCBACO technique: among the 8 input test cases, the FTCBACO algorithm selected only 3 test cases to attain the objective of the problem. The step-by-step execution outcomes of each existing algorithm are presented in Table 4, and those of the proposed FTCBACO technique in Table 5.

Table 4. Step-by-step execution of existing techniques using the sample input fault matrix described in Table 1 (columns: step no.; process of test case path selection for the Greedy approach and the Additional Greedy approach)

Table 4 shows that the existing Greedy approach needed 4 test cases and 0.038 ms of computational time to cover all fault types, and the Additional Greedy approach needed 3 test cases and 0.021 ms. The proposed FTCBACO needed only 3 test cases and 0.015 ms of computational time to cover all fault types, as shown in Table 5. FTCBACO therefore matches the Additional Greedy approach in test suite size, improves on the Greedy approach, and is faster than both existing techniques. After executing both the existing and the proposed algorithms on the test fault matrix, the function factors are obtained in terms of the number of fault types covered, execution time and minimized test suite size, as shown in Table 6. For both existing techniques, the execution time is the time taken to find the minimized test suite. In the proposed algorithm, each ant has its own execution time for finding a minimized test suite. Among the execution times of all the ants, FTCBACO reports the execution time of the best ant, that is, the ant that generated the minimum test suite size with the minimum execution time. Table 6 also shows the test suite reduction rate as a percentage. The test suite reduction rate is computed using (4).
$$\text{Test suite reduction rate (\%)} = \frac{\text{Total test suite size} - \text{Optimal test suite size}}{\text{Total test suite size}} \times 100 \tag{4}$$

The results in Table 6 show that the proposed FTCBACO algorithm required the minimum execution time to produce the maximum test suite reduction rate. Hence, we conclude that the proposed FTCBACO algorithm outperforms both existing techniques in terms of the execution time needed to find the reduced test suite and the percentage test suite reduction rate.
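As a worked check of (4), the short Python sketch below applies the formula to the suite sizes reported in the text for the sample fault matrix (8 input test cases; 4 selected by Greedy, 3 by Additional Greedy and 3 by FTCBACO). The function name `reduction_rate` is hypothetical, and the printed percentages are computed from those counts rather than copied from Table 6.

```python
def reduction_rate(total_size: int, optimal_size: int) -> float:
    """Test suite reduction rate in percent, following equation (4)."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    return (total_size - optimal_size) / total_size * 100.0


print(reduction_rate(8, 4))  # Greedy:            50.0
print(reduction_rate(8, 3))  # Additional Greedy: 62.5
print(reduction_rate(8, 3))  # FTCBACO:           62.5
```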

EVALUATION OF RESULTS AND DISCUSSION
Numerous experiments were conducted to determine the efficiency of the proposed FTCBACO algorithm. In order to analyze the function factors, both the proposed and the existing algorithms were implemented in Java under the same experimental environment. Fault matrices of 10 subject programs (the benchmark dataset) were retrieved from the Software-artifact Infrastructure Repository (SIR), as shown in Table 7, and used for the experiments. Table 7 shows the subject name, the total number of mutated faults, the number of fault types covered and the test suite size of each subject. A unique subject id was also assigned to each subject program version. In the experiment, we used 2 independent variables and 3 dependent variables. The choice of algorithm (Greedy, Additional Greedy and FTCBACO) and the 10 subject programs (flex v1, flex v2, grep v1, grep v3, grep v4, gzip v1, gzip v2, make v1, make v4, sed v2) are the 2 independent variables. Various combinations of the independent variables were used in the experiments. The three dependent variables (fault-type coverage, optimized test suite size and execution time) are measured for every combination of the independent variables. The total number of test cases in the minimized test suite is taken as the optimum test suite size. The proposed FTCBACO algorithm and the existing techniques are stochastic in nature. Hence, each selected algorithm was executed 10 times for each subject program, in the same environment, to gather the outcomes. Table 8 shows the mean values of the function factors for every algorithm with respect to the subject programs listed in Table 7. From the outcomes in Table 8, the proposed FTCBACO algorithm required the minimum mean execution time (0.0848 ms) and the minimum mean optimized test suite size (12.23). We therefore conclude that the proposed FTCBACO performs better than both existing techniques.

Statistical investigations
Statistical investigations were conducted to reach an accurate conclusion regarding the efficiency of the proposed FTCBACO. Three research questions and their related hypotheses were formulated and are shown in Table 9. H0 denotes the null hypothesis, whereas Ha denotes the alternative hypothesis.

Solution to RQ1:
Table 10 presents the mean values of fault-type coverage for every algorithm with respect to the subject programs. The outcome shows that every algorithm produced the same fault-type coverage on each subject program. From the data in Table 10, all selected algorithms are able to achieve 100% fault-type coverage, so each algorithm has an equal mean value of fault-type coverage for every subject program. Therefore, we accept H0 and conclude that there is no significant difference in the fault-type coverage capability of the algorithms.

Table 10. Mean values of fault-type coverage
Algorithm          SC1  SC2  SC3  SC4  SC5  SC6  SC7  SC8  SC9  SC10
Greedy              16   14    3    7    3    6    3   19    5     5
Additional Greedy   16   14    3    7    3    6    3   19    5     5
FTCBACO             16   14    3    7    3    6    3   19    5     5

Solution to RQ2:
The proposed FTCBACO algorithm and the existing techniques are stochastic in nature. Hence, each algorithm was executed 10 times (total runs = 10) for each subject program, and the optimized test suite size of each run was collected for every algorithm. The mean value of the optimized test suite size for every algorithm was calculated with respect to the subject programs and is shown in Table 11. The results in Table 11 show that the proposed FTCBACO algorithm required the minimum mean test suite size to cover all fault types compared with the other techniques. The results in Table 12 show that the proposed FTCBACO algorithm provided the maximum test suite reduction rate compared with the other techniques. The results of Tables 11 and 12 indicate a significant difference among the algorithms in terms of optimized test suite size. Figure 2 shows the mean values of the optimum test suite size across the subject programs for the three methods, namely the existing Greedy approach, the Additional Greedy approach and the proposed FTCBACO technique. As shown in Figure 2, the proposed FTCBACO technique yields the minimum mean optimum test suite size compared with the existing Greedy and Additional Greedy approaches. Consequently, the test suite reduction rate of the proposed FTCBACO technique is also higher. Therefore, we reject the null hypothesis H0 and accept Ha: there is a significant difference among the algorithms in the optimum test suite size they find.

Solution to RQ3:
Each algorithm was executed 10 times (total runs = 10) for each subject program, and the execution time of each run was collected for every algorithm. The mean value of the execution time for every algorithm was calculated with respect to the subject programs. Table 13 shows the mean values of execution time for every algorithm. The results in Table 13 show that the proposed FTCBACO algorithm required the minimum mean execution time to find an optimized test suite covering all fault types compared with the other techniques. The results of Table 13 indicate a significant difference among the algorithms in terms of the execution time needed to find the optimized test suite. Figure 3 shows the mean values of execution time across the subject programs for the three methods, namely the existing Greedy approach, the Additional Greedy approach and the proposed FTCBACO technique. As shown in Figure 3, the proposed FTCBACO technique yields the minimum mean execution time compared with the existing Greedy and Additional Greedy approaches. Hence, we reject the null hypothesis H0 and accept Ha: there is a significant difference among the algorithms in terms of the execution time needed to find the optimum test suite.

CONCLUSION
The Ant Colony Optimization algorithm is an effective technique for finding the best test cases in a test suite. In this work, the proposed FTCBACO algorithm was designed and implemented in Java, and its results were compared with those of the existing Greedy approach and Additional Greedy approach. The existing techniques needed more execution time to optimize the test suite and produced a lower test suite reduction rate, whereas the proposed FTCBACO algorithm reduces the test suite to a greater extent. From the comparative analysis of the existing and proposed techniques, we found no difference among the algorithms under study with respect to fault-type coverage capability. However, the proposed algorithm required the minimum execution time to produce the maximum test suite reduction rate. In future work, we plan to extend these experiments to more complicated programs with larger test suites and more complex fault intensities.