Traveling Salesman Problem

<< Click to Display Table of Contents >>

Navigation:  Chapter 7 - Programming Combinatorial Problems with the DLL >

Traveling Salesman Problem

Previous pageReturn to chapter overviewNext page

One of the most popular combinatorial problems is the Traveling Salesman Problem (TSP).  Problems of this type are often encountered in practice, and the TSP has become a benchmark for the different algorithms of combinatorial optimization.  Its solution is computationally difficult though the problem itself is easily expressed.

The Problem

Imagine a salesperson who must make a closed complete tour of a given number of cities.  He wishes to minimize the length of the route.  All cities are connected to each other by roads, and each of the cities should be visited only once.

 

The main obstacle to solving the problem is that the map is not available to the salesman.  The only information that he has is the table of distances between the cities.  These circumstances make the solution of the problem very difficult.  Having a map makes it relatively easy to construct a quasi-optimal route, while shuffling the cities according to the table of distances is a very complicated task.  The total number of possible alternative tours in this problem is (N-1)!/2, where N stands for the number of cities.

 

GeneHunter includes the TSP sample program written in Visual Basic that will be described later in this chapter.  When you installed GeneHunter, it was included in the c:\GENEHUNTER\EXAMPLES\VB6 directory if you installed GeneHunter in the default directory.

The TSP Program

_bm43

 

Fig. 7.1  View of TSP main form

 

 

In this example, the Traveling Salesman Problem is solved by a genetic algorithm optimum search.  Each city has its own ordinal number from 0 to N-1, where N is the number of cities.  The tour length is represented by a sequence of city numbers.  In this example, we use unique representation:  the tour (4-0-2-1-3) goes from city 4 to city 0 to city 2 to city 1 to city 3, then back to city 4.

 

The default GA population consists of 100 individuals.  Each individual contains a single enumerated chromosome.  The fitness function seeks to minimize the total length of the tour.  The partially matched crossover and the mutation operators are used to create new offspring.

 

To reach an optimum solution, the population size has to be large enough relative to N, the number of cities.  If you experiment with the TSP sample program, you will discover that the population size should be several times more than the number of cities.  A small population does not contain sufficient genetic material to evolve successfully.

 

The TSP sample is a more complicated program than the MAXI sample from Chapter 6, but from the point of view of the global structure of any GA program, the TSP sample is very similar to the MAXI one.  You are advised to first study the basic GA structure of the MAXI sample described in Chapter 6.  The description in this chapter will only concern the GeneHunter functions that employ enumerated chromosomes.

Declarations Section

First, let's take a look at the declarations section of TSP.FRM.  The code declares the variables that are needed as global variables.  They include the population number PopN, the generation counter NGener, and the flag GoGener that is used to control the program flow:

 

Dim PopN%, NGener%

Dim GoGener%

The size of the population, the number of cities, and the arrays with X and Y coordinates of the cities are declared in this section as variables:

 

Dim Siz%

Dim CityN%

Dim X!()

Dim Y!()

 

The other main parameters of the evolutionary process (such as crossover and mutation rates, the strategy type, and the generation gap) are not global variables.  The user can change them freely in run time with the help of the graphical interface of the program.

 

The declarations section also contains the definition of a dynamic array of integers for the enumerated chromosome contents:

 

Dim chromosome%()

 

When the optimal search process converges, the array chromosome will contain the sequence of city numbers giving the shortest tour between cities.

Form Load

Initially, the TSP program sets the default city configuration.  This configuration looks like a concave square:  12 cities are located on the perimeter of a concave square with a side length equal to approximately 3 (arbitrary units).  Obviously, the optimal tour simply passes on the square perimeter, the length of which is 12.15843 (arbitrary units).

 

First, we set the default number of cities CityN=12 and we redimension the arrays with the X and Y city coordinates together with the buffer array for the enumerated chromosome:

Sub Form_Load (  )

              CityN=12

              maxcityN=CityN-1

              ReDim X(maxcityN)

              ReDim Y(maxcityN)

              ReDim chromosome(maxcityN)

              initXY

 

After redimension is complete, fill both arrays X and Y with actual default values.  This is done by the subroutine initXY.  All cities are encoded on the map and the search for the shortest route may begin.

The Start Over Button.  Creating An Enumerated Chromosome

The button Start Over is used for both starting and restarting the problem.  The first step is to get a free population number, store it in the global variable PopN, and create a population with this number:

 

i%=GetNextPopulation(PopN%)

If i% Then error_handler I%

i% = MakePopulation(PopN%, Siz%, 1000)

If i% Then error_handler I%

 

You may notice that these GeneHunter functions are quite similar to the ones used in the MAXI example.  More details concerning these functions are available in Chapter 6 and Chapter 8 .

 

Each individual in the TSP search will be constructed with a single enumerated chromosome using unique representation.  This enumerated chromosome is created by the following call:

 

i% = MakeEnumChromosome(PopN%, 0, CityN, CityN, 1)

If i% Then error_handler I%

 

With this call, we create an enumerated chromosome number 0 (the second parameter) with the dynamic range of chromosome genes equal to CityN (the third parameter) and with the length which is equal to the number of cities CityN (the fourth parameter).  This call specifies that the enumerated chromosome will consist of CityN elements.  Each element may have any integer value from the range 0...CityN-1.  This chromosome will have unique representation (the fifth parameter equals 1), so duplicate genes will not be permitted.

 

Though the call to the MakeEnumChromosome function is only made once, all Siz individuals will have one enumerated chromosome after this call.  In the TSP problem, we need only one enumerated chromosome per individual.  Multidimensional problems will require multichromosome individuals.

Fitness Function

Before the optimal search process starts, we must calculate the fitness function value for all individuals in the population and put them into the GeneHunter DLL memory.  This initialization is done in the next loop:

 

For  k% = 0  To  Siz% - 1

              i% = GetEnumChromosome(PopN%, k%, 0,                chromosome(0))

              If i% Then error_handler i%

              . . .

              i%=CalcOrderArrayPath(CityN, chromosome(0),

                                                                                                              X(0), Y(0), fitness!)

              If i% Then error_handler i%

              . . .

              i% = PutFitness(PopN%, k%, -fitness!)

              If i% Then error_handler i%

Next k%

 

In this loop, we get the contents of the enumerated chromosome number 0 of every individual by calling the GetEnumChromosome function, then calculate the tour path length by calling the GeneHunter function CalcOrderArrayPath, which calculates the length and stores it in the variable fitness.  Finally, we put the fitness variable into the DLL memory by calling the function PutFitness.

 

Here, we use an integer array chromosome as the buffer store of the chromosome contents.  Note that individuals are numbered from 0 to Siz%-1.

 

The fitness function is written according to the problem that we are going to solve.  In the TSP example, we are searching for the minimum of the function that is calculated as the sum of the Euclidean distances between cities along the closed tour.  To find the minimum, we use the inversion of the fitness function values, so the GA will be searching for a maximum negative tour length.

 

The fitness function can be placed anywhere in the Visual Basic modules. Here, we use the GeneHunter function CalcOrderArrayPath only because it is much faster in calculating the tour length in the DLL.

Subroutine Set Parameters

This subroutine sets all the parameters of the genetic algorithm optimum search.  Only key parts of the code are shown.  When creating the population, you have to indicate the type of selection, the generation gap, and the ratio parameter for fitness scaling:

 

i% = SetStrategy(PopN, selection%, gener_gap!, 3.0)

If i% Then error_handler i%

 

The selection variable is the type of selection, which is initialized by the constants ELITIST_SELECTION or PURE_SELECTION defined in GALIB32.BAS.  Initialization is made by user choice.  The variable gener_gap defines how many individuals pass into the next generation without crossover and mutation.

 

Next, we can set the parameters for the evolutionary operators:  type of crossover, as well as the crossover and mutation rates:

 

i% = SetEnumOperators(PopN, UNIQUE_GENES_CROSS,

                                                                                                              cross_rate!, mut_rate!)

If i% Then error_handler I%

 

Note that we set a special type of crossover for the enumerated chromosome using the global constant UNIQUE_GENES_CROSS, which is also defined in GALIB32.BAS.  See Appendix A for a detailed description of crossover types.

Subroutine Generate

The subroutine Generate contains all of the genetic algorithm code which searches for a solution to the TSP problem.  This subroutine starts the loop in which we call the Reproduce function, calculates new fitness function values, puts them back into the DLL, and finds the individual with the best tour length:

 

Do

              . . .

              i% = Reproduce(PopN)

              If i% Then error_handler i%

              . . .

              [For the fitness function loop, see above]

              . . .

              i% = CalcMeanFitness(PopN, mean_fitness!)

              If i% Then error_handler i%

              . . .

              i% = FindBest(PopN, best_indiv%, best_fitness!)

              If i% Then error_handler i%

              . . .

              i% = GetEnumChromosome(PopN%, best_indiv%, 0,                                                chromosome(0))

              If i% Then error_handler i%

              . . .

Loop While GoGener

 

In each generation, we find an individual with the best fitness by calling the FindBest function.  This function stores the number of the best individual in the best_indiv variable and stores its fitness value in the best_fitness variable.  The best tour is stored in the buffer array chromosome.

 

Evolution of the best tour length from generation to generation is drawn by the subroutine:

 

Draw_TSPath (best_fitness!)

 

If the value of best_fitness has changed, the program redraws the cities map together using the best tour plan.  This is done by the following subroutine:

 

Draw_Map

Subroutine Error_Handler

The error handler subroutine is a very important part of the TSP program.  Every GeneHunter function returns an error code after it is called.  Zero means that no error was detected.  If an error code is not zero, then the subroutine Error_Handler is called.

Sub Error_Handler (errornum%)

              If errornum% = equal_fitnesses Then

                              cmdStop.Enabled = False

                              MsgBox Ga_Error(errornum%), 48, "Message"

                              GoGener = False

                              Exit Sub

              Else

                              MsgBox Ga_Error(errornum%), 16, "Error"

                              End

              End If

End Sub

 

The module GALIB32.BAS (found in the \ The code is located in the c:\GENEHUNTER\EXAMPLES subdirectory) contains a special function called GA_error, written in Visual Basic, that returns a string error message for the corresponding error code.  This error message is displayed by the MsgBox subroutine call.

 

Important note: Error number 12 (which refers to the definition of the equal_fitnesses constant in GALIB32.BAS) is not a real error, but an indication that the fitnesses of all of the individuals in the population are equal.  This happens when the algorithm has already converged to a solution.  There is no sense in continuing the search when you receive this error.  If you think that the solution is not optimal, change the current evolution conditions:  population size, selection type, generation gap, crossover, and mutation rates.

General Considerations About TSP

The following are general guidelines for examining the Traveling Salesman Problem.  To achieve the optimum path, the population size must be appropriate relative to N, the number of cities.  For N = 12, 100 individuals are sufficient, and for N = 20, 100-200 individuals should suffice.

 

General recommendations for the crossover and mutation rates, and the generation gap are:

 

              Crossover rate = 0.7 - 0.9

              Mutation rate = 0.003-0.01

              Generation gap = 0.95-1.0

 

The selection strategy may be either "pure" or "elitist".  When elitist selection is chosen, several elite individuals with high fitness values pass into the next generation without crossover or mutation.  The number of elite individuals is determined by the formula:

 

number of elite individuals = (1.0 - generation gap) * Population Size

 

 

_bm44

 

        Roulette Wheel                                       Elitist

Figure 7.2  The graphs display evolution of the best tour length in the TSP sample.  The horizontal range displays progress through approximately 250 generations.   Problem parameters include:  20 cities, 200 individuals, crossover rate 0.9, mutation rate 0.01, generation gap 0.99.

The generation gap should be in the range 0.98-0.99 and the mutation rate should be in the range 0.008-0.01 when using elitist selection.  The difference between roulette-wheel random and elitist selection can be seen from Figure 7.2, where the generation-to-generation dynamics of the best tour length are shown over time.  Elitism selection causes the search process to converge much faster, though it requires a significantly higher mutation rate and generation gap.

 

All evolution parameters (except the number of cities and population size) may be altered during program run without interruption.  Changes are effective immediately during the TSP run.

 

The performance of the GA search may be evaluated by "Current shortest length", "Shortest length so far ", and "Average length" values displayed in the "Statistics" window.  "Current shortest length" stands for the length of a best tour in the current generation, and "Shortest length so far " stands for the length of a best tour over the entire evolutionary history.

 

The number of generations required to find an optimum tour depends upon the initial genetic information in the first generation.  This number varies from trial to trial.  It is recommended that several trials (three to five) be made to ensure that an optimal tour has been found.

 

Our TSP sample demonstrates the search for the optimal path for six different sets of city configurations (all distances and tour lengths are given in arbitrary units):

 

1.

      Concave Square: 12 cities are located on the perimeter of a concave square with the side length equal to approximately 3 arbitrary units. Obviously, the optimal tour is simply on the square’s perimeter, the length of which is 12.15843.  The average number of generations needed to find this unique tour is about 20-60, depending upon the initial genetic information in the first generation.  The total number of possible tours is 19,958,400.

 

2.

      Circle: 12 cities are located on the circumference of a circle.  There is a unique optimal tour length of 9.32094.  The average number of generations needed to find this solution is about 40-50.  The total number of possible tours is also 19,958,400.

 

3.

      Random: 12, 20, 30 cities are located randomly on the map.  The optimal tour and its length are unknown.  The population size must be at least 100, and 200 is highly recommended.  The number of generations required to find a near optimum tour is approximately 200, although continuing evolution for 200 to 500 generations results in a better tour. Frequently, several optimal tours exist with equal lengths (fitnesses).  In such cases, the algorithm will be "switching" between these tours.  The total number of possible tours for 20 cities is more than 60 quadrillion (6e+16).  Figure 7.3 illustrates one of the runs of the TSP program.  The best tour length found by the GA search after 350 generations is about 2 times smaller than the initial tour length.

 

4.

      User Selection: The user can designate cities on the map by mouse click.  The maximum number of cities is 20, and the minimum number is 4.  You cannot erase cities, but you can add cities to the existing map.  To restart a map (to begin with a blank map), you must select Random 20 Cities from the Map Menu.

 

_bm45

Figure 7.3  Two views of the best tour across 20 cities located randomly on the map are shown.  At left, the very beginning of the GA optimum search is displayed.  On the right, the tour is shown after approximately 350 generations.  The population consists of 200 individuals.  The final best tour is 2 times shorter than the initial one.