Reducing the time needed to solve a traveling salesman problem by clustering with a Hierarchy-based algorithm

In this study, we compare a cluster-based whale optimization algorithm (WOA) with the unclustered WOA to find a better solution for the traveling salesman problem (TSP). The main goal is to reduce the time needed to solve a TSP. First, we solve the TSP with the whale optimization algorithm alone; then we solve it with a combined method that first clusters the cities with BIRCH (balanced iterative reducing and clustering using hierarchies). BIRCH builds a clustering feature (CF) tree and then applies a clustering method (for example, K-means) to cluster the data. Experiments on three datasets show that the combined algorithm improves the convergence time. This is an open access article under the CC BY-SA license.


INTRODUCTION
Meta-heuristics are methods that use search agents to approximate exact solutions, and swarm-based algorithms model agents such as bees, krill, and whales. These algorithms are flexible, so they can solve many problems, such as the traveling salesman problem (TSP), in much less time. Researchers have therefore proposed many meta-heuristics for difficult optimization problems. For the TSP, many advanced methods have been used to solve it in a shorter time. The cost function of the TSP is minimized, so the best answer is the shortest tour. In our previous article (2022) [1], a continuous meta-heuristic optimization algorithm called the whale optimization algorithm (WOA) was combined with K-means to solve the TSP; in this study, WOA is combined with the balanced iterative reducing and clustering using hierarchies (BIRCH) algorithm to find the shortest path. This new TSP solver is called WOA-BIRCH. In the TSP, a traveling salesman wishes to visit each of a list of m cities exactly once (where the cost of traveling from city i to city j is C_ij) and then return to the home city at minimum cost [2].
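As a small illustration of the cost function described above, the following Python sketch computes the length of a closed tour. It assumes that the cost C_ij is the Euclidean distance between cities i and j, which the text does not state; the function name tour_cost and the toy coordinates are hypothetical.

```python
import math

def tour_cost(cities, order):
    """Total length of the closed tour that visits `cities` in `order`."""
    total = 0.0
    for k in range(len(order)):
        i = order[k]
        j = order[(k + 1) % len(order)]  # wrap around to return to the home city
        total += math.dist(cities[i], cities[j])  # assumed cost: Euclidean distance
    return total

# Four corners of a unit square (hypothetical toy data).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(tour_cost(square, [0, 1, 2, 3]))  # 4.0, the perimeter tour
print(tour_cost(square, [0, 2, 1, 3]))  # longer, because it uses both diagonals
```

A solver for the TSP searches over the possible values of order for the one with the smallest cost.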
Meta-heuristic algorithms differ in their characteristics because they are inspired by different natural or biological behaviors. They need comprehensive tests on benchmark functions to evaluate their performance. WOA has shown good results compared with other meta-heuristics, so it is used in different fields of engineering [3], such as optimizing the placement of capacitors, building feature selection techniques, solving the economic dispatch problem, enhancing the performance of photovoltaic power systems, efficiently balancing energy production with the load demand [3], finding proper coefficients, cost minimization, and feature selection based on the whale optimization algorithm (FSWOA) [4], which aims to reduce the dimensionality of medical data [5] by selecting a reduced feature set.

ISSN: 2252-8938
In this study, after some explanation of the BIRCH algorithm, we first solve the problem with WOA and then propose the combined WOA-BIRCH algorithm as another solution for the same problem. The whale algorithm has three mathematical phases: encircling prey, spiral bubble-net feeding, and searching for prey [6]. In this algorithm, the agents hunt with a unique method called bubble-net feeding [7], as explained in our previous research. Section 2 presents some theory about the BIRCH algorithm, and section 3 gives the research method with the pseudocode and some equations. Section 4 describes the experimental tests, section 5 discusses the results in tables and figures, and section 6 concludes the research.

COMPREHENSIVE THEORIES
In this section, we explain some theory about the BIRCH algorithm, together with some of the equations that help to understand the steps of the algorithm in the next sections. Clustering is a technique for grouping data into subsets to reveal information that was hidden in the data. It divides objects into subsets based on similarity (closeness) within a cluster and dissimilarity between clusters. The main issues with clustering algorithms are scalability and execution time on big data. Here, we explain the BIRCH algorithm [8], which is an integrated hierarchical clustering algorithm. BIRCH is a multi-stage hierarchical clustering method that addresses two difficulties of hierarchical clustering: scalability and the inability to roll back an earlier step. Hierarchical clustering (HC) either merges the nearest pairs or divides the farthest pairs; these are called the agglomerative and divisive HC approaches, respectively. BIRCH is a hierarchy-based technique used in data mining for solving real-life problems. When data are inserted into BIRCH, a clustering feature (CF) tree is generated in which each node represents a cluster: intermediate nodes represent superclusters, and leaf nodes represent the actual clusters. The CF values summarize information about subclusters instead of storing all points [9]. The branching (Br) parameter gives the maximum number of children of a node. When a new cluster is added and a node ends up with more children than the branching factor allows, the parent is split.
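To make the idea of summarizing a subcluster concrete, here is a minimal Python sketch of a CF entry. It assumes the usual triple of point count, linear sum, and square sum; the class name CF and its method names are illustrative and are not taken from the MATLAB implementation used in this study.

```python
import math
from dataclasses import dataclass

@dataclass
class CF:
    """Clustering feature of a subcluster: (N, LS, SS)."""
    n: int      # number of points in the subcluster
    ls: tuple   # per-dimension linear sum of the points
    ss: float   # sum of the squared norms of the points

    @classmethod
    def from_point(cls, p):
        return cls(1, tuple(p), sum(x * x for x in p))

    def merge(self, other):
        """The CF of a union of two subclusters is the component-wise sum."""
        ls = tuple(a + b for a, b in zip(self.ls, other.ls))
        return CF(self.n + other.n, ls, self.ss + other.ss)

    def centroid(self):
        return tuple(x / self.n for x in self.ls)

    def radius(self):
        """Root-mean-square distance to the centroid, computed from (N, LS, SS) only."""
        c2 = sum(x * x for x in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))

merged = CF.from_point((0.0, 0.0)).merge(CF.from_point((2.0, 0.0)))
print(merged.centroid())  # (1.0, 0.0)
print(merged.radius())    # 1.0
```

The centroid and radius are obtained without revisiting the individual points, which is why a CF tree can stay small even for large datasets.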
A new point walks down the tree recursively, always entering the subcluster with the closest center, until the walk reaches a leaf node. To keep the tree balanced, nodes split recursively. When all data have been assigned, the centers of the leaves are passed to another clustering algorithm such as K-means. This step merges neighboring clusters to improve them. The advantages of this algorithm are its high performance in terms of memory, execution time, cluster quality, stability, and scalability, its support for parallel and concurrent clustering, and its ability to be tuned interactively and dynamically; we therefore combine it with WOA to use it as a TSP solver. In fact, BIRCH handles a dense region of points collectively by storing a compact CF tree. These dense regions are called subclusters. The algorithm operates in the following four steps [10]: i) scanning all data and building the CF tree (loading), ii) condensing the original tree by building a smaller CF tree, iii) applying a clustering algorithm to all leaf entries, and iv) refining the clusters (optional). For N d-dimensional data points [11], BIRCH defines the clustering feature CF, the cluster's centroid x0, the radius R, and the diameter D [12].
The WOA algorithm that we combine with BIRCH is summarized as pseudocode in Algorithm 1 in the research method section, and its parameters are defined in Table 1 [13]. Among these parameters, a, which is a kind of distance parameter, controls the balance between exploration and exploitation: it decreases from 2 to 0 and results in fast convergence.
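The role of the parameter a can be sketched in a few lines of Python. The sketch assumes the linear schedule and the coefficients A = 2·a·r − a and C = 2·r (with r uniform in [0, 1)) that are common in the WOA literature; the text above only states that a decreases from 2 to 0, so the exact schedule and the function names here are assumptions.

```python
import random

def a_value(t, max_iter):
    """Assumed linear decay of a from 2 (t = 0) to 0 (t = max_iter)."""
    return 2.0 * (1.0 - t / max_iter)

def coefficients(a, rng):
    """Draw the WOA coefficients: A lies in [-a, a) and C lies in [0, 2)."""
    A = 2.0 * a * rng.random() - a
    C = 2.0 * rng.random()
    return A, C

rng = random.Random(0)  # fixed seed so that the run is repeatable
for t in (0, 10, 20):
    a = a_value(t, 20)
    A, C = coefficients(a, rng)
    # |A| >= 1 pushes an agent away from the current best (exploration);
    # |A| < 1 pulls it towards the best (exploitation).
    mode = "explore" if abs(A) >= 1 else "exploit"
    print(t, a, mode)
```

Because |A| can never exceed a, exploration becomes impossible once a falls below 1, which is how the decreasing a moves the search from exploration to exploitation.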

RESEARCH METHOD
The partitioning technique is one of the most important parts of cluster analysis, and many algorithms implement it, such as K-means, K-means++, and K-medoids [21]. K-means first selects K initial seeds and then assigns the points to K clusters by minimizing the sum of squared errors (SSE) [22]. We apply K-means in phase 3 of the BIRCH pseudocode [23].
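The K-means procedure described above can be sketched as follows. This is a plain Lloyd-style iteration written for illustration, assuming random seeds drawn from the data and Euclidean distance; the function names kmeans, assign, and sse are hypothetical, and the sketch is not the implementation used in the experiments.

```python
import math
import random

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=50, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # K initial seeds drawn from the data
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                # Move the center to the mean of its members.
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: the centers no longer move
        centers = new_centers
    return centers, assign(points, centers)

def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = kmeans(pts, 2)
print(centers, labels, sse(pts, centers, labels))
```

Each assignment step and each update step can only lower the SSE or leave it unchanged, so the loop stops at a local minimum.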

Pseudocode of BIRCH
The five phases of BIRCH are summarized here:
1) Scan all the data to build a CF tree as in (1).
2) Rebuild the CF tree with a larger threshold T, which gives a smaller tree.
3) Apply K-means or K-modes.
4) Refine the clusters by making additional passes over the data and reassigning points based on their closeness to the centroids.
5) Repeat the steps until K clusters are created.

Equations for the BIRCH algorithm
In the BIRCH definitions, the compactness of a cluster is described by R and D: R [24] is the average distance from the member points to the centroid x0, and D is the average pairwise distance within the cluster. A CF entry is a triple (N, LS, SS) [12], where N is the number of data points, LS is the linear sum of the N points, and SS is the square sum of the N points; together they define a subcluster. Two CFs are merged by adding their components, which is called the additivity theorem [25]. If two clusters have the centroids x0 and x1, one distance measure that can be used between them is the Euclidean distance [26]; other distance measures exist, such as the Manhattan (city block) distance. We have two quality measures, Q1 (the mean of the radius) and Q2 (the mean of the diameter). These quality measures can be used for pre-processing [27].
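For reference, the quantities named in this section are usually written as follows in the BIRCH literature. This is a reconstruction from the standard formulation, not a copy of the numbered equations of this paper; the symbols x_i for the member points, the subscripts 1 and 2 for two merged subclusters, and the name D_0 for the centroid distance are notational assumptions. In this formulation, the averages in R and D are taken in the root-mean-square sense.

```latex
\begin{align*}
CF &= (N, \vec{LS}, SS), \qquad
  \vec{LS} = \sum_{i=1}^{N} \vec{x}_i, \qquad
  SS = \sum_{i=1}^{N} \lVert \vec{x}_i \rVert^2 \\
\vec{x}_0 &= \frac{1}{N} \sum_{i=1}^{N} \vec{x}_i = \frac{\vec{LS}}{N} \\
R &= \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert \vec{x}_i - \vec{x}_0 \rVert^2} \\
D &= \sqrt{\frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j=1}^{N} \lVert \vec{x}_i - \vec{x}_j \rVert^2} \\
CF_1 + CF_2 &= (N_1 + N_2,\ \vec{LS}_1 + \vec{LS}_2,\ SS_1 + SS_2) \\
D_0(\vec{x}_0, \vec{x}_1) &= \sqrt{\sum_{k=1}^{d} (x_{0k} - x_{1k})^2}
\end{align*}
```

Expanding the square in R gives R^2 = SS/N − ‖LS/N‖^2, so the radius of a merged subcluster follows directly from the summed triple without access to the original points.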

EXPERIMENTAL TESTS
Table 2 gives the values of our parameters, and Tables 3 to 6 show the results of the experimental tests for our two algorithms (unclustered WOA and WOA-BIRCH). We performed the experiments on a MacBook with 8 GB of RAM and an 8-core CPU, running MATLAB R2019a for both algorithms. The algorithm runs until a termination criterion is met. Possible termination criteria are a predefined number of iterations (here, 20 iterations), stability of the results, and a time limit. The results shown in the tables are averaged over 20 runs of the proposed model on our datasets, which are ali535, rat783, and pr1002 from the TSPLIB test data [28]. The execution time of both algorithms is measured with the tic and toc commands.
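The three termination criteria and the timing can be illustrated with a short Python sketch. It is only an analogue of the MATLAB setup: time.perf_counter plays the role of tic and toc, and the function name run_until_done, the patience parameter (the number of non-improving iterations that counts as stable), and the default limits are assumptions made for the example.

```python
import time

def run_until_done(step, max_iter=20, patience=5, time_limit=60.0):
    """Call step() until one termination criterion is met.

    step() performs one iteration and returns the best cost found in it.
    """
    start = time.perf_counter()            # analogue of tic
    best, stale, iterations = float("inf"), 0, 0
    while iterations < max_iter and stale < patience:
        if time.perf_counter() - start > time_limit:
            break                          # time limit reached
        cost = step()
        iterations += 1
        if cost < best:
            best, stale = cost, 0          # improvement: reset the stability counter
        else:
            stale += 1                     # no improvement in this iteration
    elapsed = time.perf_counter() - start  # analogue of toc
    return best, iterations, elapsed

# Toy cost sequence (hypothetical): improves once and then stays flat.
costs = iter([5, 4, 4, 4, 4, 4, 4, 4])
print(run_until_done(lambda: next(costs), patience=3))
```

Averaging the returned elapsed time over repeated runs gives a figure comparable to the averaged execution times reported in the tables.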

RESULTS AND DISCUSSIONS
The steps of our algorithm start with setting the parameters. Parameters such as the branching factor can be chosen as in (11), but these values are empirical. For the threshold, we used the standard deviation of our data, whose dimensions are the x and y coordinates of the dataset. Again, this formula can be changed based on experience. The next equations are the variance and the standard deviation, but we used (14) based on experience.
1: Set the parameters as shown in Table 2.
2: Specify K randomly or, for big tours among the cities, with the equation for K in terms of N, where N is the number of cities; N can also be divided by 2 (the standard formula). We use the same technique for the branching factor.
Table 2 lists parameters consisting of the initial population, the number of iterations, and the number of cities. Table 3 and Table 4 give some statistical values of the fitness function (cost function) for the unclustered and clustered WOA. Table 5 and Table 6 show the execution time of the unclustered and clustered WOA, respectively, and the same statistical calculations are reported for all three datasets. Based on the results, the fitness value does not change much, but the average times in Tables 5 and 6 show that the execution time improves with the new method (WOA-BIRCH).

Figures
Figure 1(a) shows the best cost function of the unclustered approach for the ali535 dataset over 20 iterations, and Figure 1(b) shows the connected cities in the unclustered approach for the same dataset. Figure 2(a) shows the cost function for the rat783 dataset, and Figure 2(b) shows how these cities are connected. Figure 3(a) shows the cost function of the unclustered approach for the pr1002 dataset during the iterations, and Figure 3(b) shows all the cities connected by the unclustered approach when finding the best path. In Figure 4(a), WOA-BIRCH is applied, clustering the ali535 dataset with the BIRCH algorithm, and in Figure 4(b) all the subgraphs (before connection) are illustrated for the same dataset. In Figure 5, we cluster our data with the BIRCH algorithm, then solve each part with WOA, and finally connect the clusters to complete a tour. Based on the final results, the TSP solver WOA-BIRCH is faster than the whale algorithm alone, so we conclude that applying the BIRCH algorithm has strengthened our algorithm and made it more applicable. The comparison measurements confirm the improvement of our algorithm.

Int J Artif Intell, Vol. 12, No. 4, December 2023: 1619-1627
3: Apply the BIRCH algorithm to cluster our data.
4: Apply the WOA algorithm to each of the found clusters, i = 1:K.
5: Find the location of the agents.
6: Sort by indexing.
7: Join the clusters by finding the cities that are closer to the centroid of that cluster.
8: Repeat until all the clusters are joined.
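The cluster, solve, and join structure of steps 3 to 8 can be sketched in Python under strong simplifications, all of which are assumptions made for illustration: a flat threshold-based grouping stands in for BIRCH (no CF tree), a greedy nearest-neighbour path stands in for the per-cluster WOA solve, and clusters are joined by always moving to the cluster whose centroid is nearest to the current end of the tour. None of the function names come from the paper.

```python
import math

def centroid(cluster):
    n = len(cluster["members"])
    return (cluster["sum"][0] / n, cluster["sum"][1] / n)

def threshold_clusters(cities, threshold):
    """Stand-in for BIRCH: a city joins the nearest subcluster within `threshold`."""
    clusters = []
    for i, (x, y) in enumerate(cities):
        best, best_d = None, threshold
        for c in clusters:
            d = math.dist((x, y), centroid(c))
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append({"members": [i], "sum": [x, y]})  # start a new subcluster
        else:
            best["members"].append(i)
            best["sum"][0] += x
            best["sum"][1] += y
    return clusters

def nearest_neighbour_path(cities, members, start):
    """Stand-in for the per-cluster WOA solve: greedy path through `members`."""
    left = set(members)
    left.discard(start)
    path = [start]
    while left:
        nxt = min(left, key=lambda j: math.dist(cities[path[-1]], cities[j]))
        left.remove(nxt)
        path.append(nxt)
    return path

def clustered_tour(cities, threshold):
    clusters = threshold_clusters(cities, threshold)
    tour, here = [], cities[0]
    while clusters:
        # Join step: move to the cluster whose centroid is nearest to the tour end.
        nxt = min(clusters, key=lambda c: math.dist(here, centroid(c)))
        clusters.remove(nxt)
        entry = min(nxt["members"], key=lambda j: math.dist(here, cities[j]))
        tour += nearest_neighbour_path(cities, nxt["members"], entry)
        here = cities[tour[-1]]
    return tour  # the tour closes by returning from tour[-1] to tour[0]

cities = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
print(clustered_tour(cities, 3.0))
```

Each sub-problem contains only the cities of one cluster, which is the source of the time saving that clustering is meant to provide; the quality of the final tour depends on how the clusters are joined.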

Table 1. Parameter definition
[20]k search agents, compute the fitness for X_i, and update X* [20].

Table 2. The parameter settings

Table 3. The fitness function for the unclustered whale optimization algorithm

Table 4. The fitness function for the clustered whale optimization algorithm with BIRCH

Table 5. The execution time of the unclustered whale optimization algorithm

Table 6. The execution time of the clustered whale optimization algorithm with BIRCH