GeneHunter is a powerful software solution for optimization problems which utilizes a state-of-the-art genetic algorithm methodology. GeneHunter includes an Excel Add-In which allows the user to run an optimization problem from
Microsoft Excel, as well as a Dynamic Link Library of genetic algorithm functions that may be called from programming languages such as Microsoft® Visual Basic or C.
What Are Genetic Algorithms?
Genetic algorithms (GAs) seek to solve optimization problems using the methods of evolution, specifically survival of the fittest. In a typical optimization problem, there are a number of variables which control the process, and a formula or algorithm which combines the variables to fully model the process. The problem is then to find the values of the variables which optimize the model in some way. If the model is a formula, then we will usually be seeking the maximum or minimum value of the formula. There are many mathematical methods which can optimize problems of this nature (and very quickly) for fairly "well-behaved" problems. These traditional methods tend to break down when the problem is not so "well-behaved."
Evolution and Genetic Algorithms
Before describing how a GA can be applied to an optimization problem, let us draw the evolutionary parallel. The theory is that a population of a certain species will, after many generations, adapt to live better in its environment.
For example, if the species is an animal that lives mainly in a swampy area, it may eventually evolve with webbed feet. The reason is that the members of this population, which we call individuals, will tend to die if they are poor swimmers which cannot easily get food, and live to reproduce if they are good swimmers. The offspring of two good swimmers will probably be good swimmers because they will usually carry genetic traits of their parents, such as slight webbing between their toes. These genetic traits are carried in the chromosomes of the individuals.
How Does GeneHunter Work?
GeneHunter solves optimization problems in the same way. It will create a population of possible solutions to the problem. The individuals in this population will carry chromosomes which are the values of variables of the problem.
GeneHunter actually solves your problem by allowing the less fit individuals in the population to die, and selectively breeding the most fit individuals. The process is called selection, as in selection of the fittest. GeneHunter takes two individuals and mates them (crossover), the offspring of the mated pair will receive some of the characteristics of the mother and some of the father.
In nature, offspring often have some slight abnormalities, called mutations. Usually, these mutations are disabling and inhibit the ability of the offspring to survive, but once in a while, they improve the fitness of the individual. GeneHunter occasionally causes mutations to occur.
As GeneHunter mates fit individuals and mutates some, the population undergoes a generation change. The population will then consist of offspring plus a few of the older individuals which GeneHunter allows to survive to the next generation. These are the most fit in the population, and we will want to keep them breeding. These most fit individuals are called elite individuals. After dozens or even hundreds of "generations," a population eventually emerges wherein the individuals will solve the problem very well. In fact, the most fit (elite) individual will be an optimum or close to optimum solution.
problem solving model in GeneHunter requires that you enter the
relevant data into a Microsoft Excel spreadsheet and
specify problem solving parameters in a GeneHunter dialog screen.
GeneHunter Dialog Screen
GeneHunter Dialog screen enables you to identify the cells in your
spreadsheet that are involved in solving your problem. You also
list constraints that should be met by the solution.
Function box tells GeneHunter the location of the cell which
contains the formula that measures GeneHunter's success in finding
a solution to your problem. The formula may be created using any
of the Excel functions that are available from the Insert menu,
such as average. You may also use Excel macros or Visual Basic
functions to create a formula that allows you to solve very
complex problems. You may even use a neural net to model the
process if you don't have an appropriate mathematical formula!
If you are
creating a sales application that considers variables such as
advertising, distribution, and research and development budgets,
the goal might be to maximize revenue, and you would want to
maximize the fitness function. In another application, the goal
might be to develop an oil that reaches a specified quality by
adjusting the amounts of additives that go into making the oil.
are the variables whose values are adjusted in order to solve the
problem. Their value is related in some way to the fitness
uses two types of chromosomes to solve the problems.
Chromosomes are used when the adjustable cell can take on a value
that may be within a continuous range, such as the value 1.5 with
the range 0 to 2. Continuous chromosomes may also be integers if
you want to restrict the search space.
Chromosomes are used when the problem involves finding an optimal
combination of tasks, resources, duties, etc.
constraint portion of the GeneHunter dialog box allow you to do
range of values that GeneHunter will search for a solution, thus
limiting the time that GeneHunter will take to find an optimal
solution. This is called hard constraint.
restrictions or sub-goals to the original fitness function. This is
called a soft constraint. GeneHunter attempts to find solutions
that meet the soft constraints, as well as optimize the fitness
includes the following example programs in an Excel workbook.
Trip - Balanced Diet Within a Budget
serves as a tutorial for GeneHunter. A camp leader needs to
purchase the food supplies for an overnight camping trip for a
group of 20 people. He has to minimize the food costs while still
providing the appropriate number of servings of grains, fruits,
vegetables, dairy products, meats, and sweets in order to meet the
camp's recommended dietary standards.
You may have
problems to solve that are similar to this one, such as minimizing
the cost of an advertising budget while reaching a targeted number
of people in different demographic groups by adjusting the dollars
spent on different types of advertising.
goal is to minimize the food cost while maximizing diet quality by
varying the number of servings per person for each food group.
Portfolio example is representative of many problems which require
you to optimally group items with different values into sets (for
example, students into academic competition teams, workers of
different skills into job teams, etc.). In the example, a money
manager has a list of stocks, each with a different value, that he
wants to separate into six groups, based upon the specified
percentages of his entire portfolio.
example, GeneHunter generates rules to predict a rise in the NYSE
index. To learn more about this example and stock market
prediction with GAs, see
the NYSE Using Rules Generated by GeneHunter.
Net - Bob's Deli
learn more about using neural networks and genetic algorithms
together later on. "Bob's Deli" is just one way in which
you can actually use GeneHunter to build and train a neural
network without any other neural network products or algorithms.
decided to train a neural net to predict his sandwich sales so
that he will have enough ingredients and staff available each day
at lunch. He has been keeping records of sales for 16 weeks. He
uses temperature, daily precipitation, the day of the week, and
payday for the company across the street as inputs to his network.
example, we use GeneHunter to train Bob's neural network to
predict the number of sandwiches he should be prepared to sell
each day at lunchtime in his Deli. GeneHunter's genetic algorithm
is used to find the network's weights.
Traveling Salesman Problem is a well-known problem which has
become a comparison benchmark test for different algorithms used
to solve combinatorial optimization problems. A salesperson must
make a complete closed tour of a given number of cities. All
cities are connected by roads, and each city must be visited only
once. GeneHunter solves this problem by minimizing the total tour
length, and by changing the order of the city numbers.
One of the
most important uses of genetic algorithms is their ability to
create optimum schedules for just about any reasonable sized
scheduling problem. There may be quite an opportunity for
entrepreneurs to build GA schedulers for specific job scheduling
environments. You can use our example as a model.
example, we are the manager of a small circuit board assembly
plant that assembles, wires, solders and tests a number of circuit
boards of different types each day. There are a total of five
workstations that each board must go through, but each workstation
may have several specialists on duty (or machines in the case of a
fully automated shop), each of which can independently fully
perform the duties of that workstation.
several types of circuit boards manufactured at this shop, and
each type requires a different amount of time at a particular
workstation. Therefore, there is the potential that some
specialists or machines will have work to do at various times,
because boards at the previous stations require long job times.
GeneHunter creates a schedule for manufacturing the boards in a
minimum amount of time.
Portfolio Optimization example is more representative of a real
world problem than the simple bin packing example
"Portfolio". In true portfolio optimization, a trader or
fund manager would seek to minimize his/her risk while
simultaneously trying to maximize return. The trader does this by
modeling his/her high return portfolio after an existing
diversified portfolio, such as the S&P 500.
picks a number of stocks, for example, that he/she believes will
offer a high rate of return. The GA is used to minimize the risk
by helping to adjust the portfolio so that it behaves like the
more diversified S&P 500 "portfolio".
possible to use GeneHunter to find mathematical models of data,
much like the models that could be built with multiple non-linear
Polynomial Approximation example uses GeneHunter to determine the
coefficients of five independent variables along with the power of
each variable. The example could be extended to using more
complicated mathematical models.
Many users want to include the power of a genetic
algorithm in their applications but prefer to design their own user interface or to speed
up the time that is required by Excel to compute complicated fitness functions. To meet
these needs, GeneHunter also includes a complete genetic algorithm function library that
is available in a Dynamic Link Library called GALIB.DLL.
The library includes functions for creating
populations, evolving the populations by setting parameters such as crossover, mutation,
and diversity (sometimes called "creep" in the literature), measuring the
fitness values of individuals in the population, and updating the population into the next
generation. The user has the option of creating individuals with either continuous or
Crossover, one of GeneHunter's operators,
involves a process in which GeneHunter takes two fit individuals and mates them. The
offspring will receive some characteristics from both parents.
Mutation is an operator that also serves to
continue evolution. Instead of combining traits from two parents, however, mutation
changes a single individual by randomly changing the value in one of the chromosomes. The
diversity operator also changes a single individual, but changes each chromosome in small
increments rather than drastically changing a single chromosome.
Extinction, another operator, allows you to
destroy all but the elite of the population. It is similar to a plague destroying most of
the population, and refreshing it with a large new set of individuals.
Complete Function Reference
The GeneHunter manual includes a comprehensive
explanation of each function, including argument descriptions, sample function calls, and
a listing of related functions.
The manual contains separate chapters describing
how to program applications with continuous and enumerated chromosomes, including detailed
code listings with corresponding explanations.
The GALIB.DLL enables you to create applications
that can simultaneously run up to 128 populations. The function MakeChromosomePool enables
you to quickly create multiple similar chromosomes for applications such as optimizing the
weights in a neural network. GeneHunter even allows you to mix continuous and enumerated
chromosomes in a single population.
Any applications you build with the GeneHunter programming interface may be distributed only within your company or specific government agency. That is, you may not allow anyone outside of your company to use it if your company purchased it (or outside of your agency if your government agency purchased it). If you purchased GeneHunter personally, you may not distribute your applications outside of your immediate family.
If you desire to sell or distribute a GeneHunter application outside of your company, agency, or family, you must negotiate a separate license from Ward Systems Group, Inc.
The GeneHunter installation includes the
source code for three Visual Basic examples (plus C code for one example). Executable
versions of these examples are installed on your computer during GeneHunter setup.
Traders often spend years developing trading
schemes based on complex formulas or rules they either learned in college or on the job.
Using GeneHunter, however, you may be able to produce rules that are more adept at
predicting financial markets.
One of the exciting aspects of genetic algorithms
is the fact that the more imagination you use in creating fitness functions and their
corresponding use of variables (chromosomes), the more you will be able to develop
powerful applications. For example, it is not too hard to use GeneHunter to find
"rules" to process data. One of the most interesting applications of rule
generation may be in the generation of rules to predict various financial markets.
GeneHunter includes a sample application for
creating rules for predicting financial markets.
The formulation of the problem to GeneHunter is
somewhat different than the usual GA formulation. Each individual in the population
represents a rule such as:
If the close 5
days ago is greater than the close 1 day ago and if the low 2 days ago is less than the
high 7 days ago, then the NYSE will rise tomorrow.
The concept of using a genetic algorithm to
create the rules can be expanded to include finding optimum rules to process data,
schedule airline flights or merchandise delivery, etc.
GAs and Neural Networks
Artificial neural networks were originally
designed to model the pattern recognition abilities of the brain. They have been used
extensively for many practical predictive and data classification tasks. There are many
ways neural networks can be trained, and using a genetic algorithm is one of those ways.
The Bob's Deli example shows you how to do this without any neural network products.
The integration of traditionally trained neural
nets and genetic algorithms is a fruitful area of research, and there are already many new
and exciting ways to merge the technologies. If you own one of Ward Systems Group's neural
network programs (e.g. NeuroShell 2 or the Run-Time Option for NeuroShell Predictor or
Classifier), then the integration of the technologies can be fairly easy.
Using Neural Networks As Fitness
Neural networks can be trained to find the
fitness function if there are enough available samples of fitnesses already measured. Both
NeuroShell 2 and the Run-Time Option for NeuroShell Predictor and Classifier have
functions that can be executed from Excel spreadsheets to fire these neural nets so that
they can become the GeneHunter fitness function. Then, use GeneHunter to find an
optimal set of inputs. By this, we mean the best set of inputs; the set that
optimizes the process you are modeling.
optimization problem from a Microsoft Excel spreadsheet.
cells in your spreadsheet that are involved in solving your
genetic algorithm parameters and list constraints that must be
met by the solution.
demonstrate the different types of problems that GeneHunter
may help you solve.
crossover and mutation rates, as well as generation gap.
with parameters and specifications in the example programs in
the Excel interface.
runtime license with restrictions.
Basic source code available for GeneHunter examples. C code
example also provided.
applications with both continuous and enumerated chromosomes.
especially for Microsoft's Visual Basic, Visual Basic for
Applications, or Access Basic.
with other languages which call DLLs under Windows, such as Visual C++
and Visual Basic.
programs with GeneHunter which can run up to 128 populations
simultaneously. Each population may contain up to 2,000
continuous (integer/noninteger) chromosomes with the
MakeChromosome function. You may use up to 5,000 chromosomes
in one individual.*
enumerated chromosomes with the MakeEnumChromosome function.
You may use up to 2,000 enumerated chromosomes in one
chromosome pools with the MakeChromosomePool function. You may
use up to 200 chromosome pools in one population. Each pool
may contain up to 10,000 chromosomes.*
creates a program group from which you can view example programs
created with Visual Basic and Borland C++ that use GALIB.DLL.
mentioned are program limits. Your computer may not contain enough
memory for large problems, and some large problems may not be
solvable even if they are within the limits.
The GeneHunter® Excel Add-In allows you to save
"runner-up solutions" in addition to the best solution.
This feature will allow you to try near optimal combinations of
values for adjustable cells that may in fact be a more practical
solution to your problem. The user may specify how many (N)
runner-up solutions should be kept after evolution is finished.
These N best solutions are stored in a separate worksheet in the
Excel workbook. This option can be turned on/off as needed. The
default setting is off.
Several fitness functions can be set up to
optimize simultaneously. In addition to the main fitness function
the user can enter a list of additional fitness functions to be
optimized in the same run. The goal of each fitness function
(maximize/minimize/search for a specific value) is set
independently. This feature will allow you to control more than
one facet of an optimization problem.
Hard and Soft
Hard constraints (also known as ranges) and soft
constraints (secondary fitness functions) are entered separately.
Ranges (maximum and minimum values of chromosomes) can be set by
referencing cells on the worksheet containing the minimum and
maximum values. You can set these values directly on the worksheet
without going through the GeneHunter dialog box.
The user can add a tolerance and a priority for
each soft constraint entered. The purpose of the tolerance is to
tell GeneHunter when it has done a sufficient job. For example, if
your constraint is that cell B4 must be less than cell B6, then
you may want to instruct GeneHunter that if they are within plus
or minus .5, then the solution is acceptable.
The priority tells GeneHunter that some
constraints are more important than others in case they cannot all
be met. Priorities can be set to "low",
"medium", or "high".
Control The Random
GeneHunter allows the user to control the
evolution of the initial population using the random number
generator. Some users may want to begin with the same
population each time and vary other factors in the optimization
process. This is done by using the same random number seed
each time, which is the default setting. Others may want to
begin with a different population each evolution in order to find
different solutions. You may do this by selecting a
different random number seed.
display a graph of the progress of an optimization run.
GeneHunter can display a graph of the best fitness
function versus generation number. The graph is a regular
Microsoft Excel chart that you can format using Excel. View this
graph to see if the genetic algorithm is progressing in finding a
* CalcOrderArrayPath * FindBest * GetChromosome *
GetChromosomePool * GetEnumChromosome * GetEnumChromosomeParm *
GetMaxPopulNum * GetNextPopulation * GetPopulationParam *
MakeChromosome * MakeChromosomePool * MakeEnumChromosome *
MakePopulation * PutChromosome * PutChromosomePool *
PutFitness * Reproduce * SetDiversity * SetEnumOperators *
SetExtinction * SetGenSeed *
SetOperators * SetStrategy
Reprinted with Permission
GeneHunter: GA Software
by Lisa Lewinson
Reprinted With Permission From PC AI Magazine
category, we present information about new AI products as they hit the marketplace. The
first installment provides an inside view of a genetic algorithm package from Frederick,
MD-based Ward Systems Group.
"Survival of the fittest" is the law of
nature that genetic algorithms attempt to emulate. Survival of the fittest may also be the
law of software marketing. GeneHunter, a new user-friendly fast-running genetic algorithm
software package from Ward Systems Group, hopes to demonstrate that it is very fit indeed.
Introduced this month, GeneHunter works in two
ways. It's callable from Microsoft Excel spreadsheets, and accessible via function calls
in a Dynamic Link Library (DLL). The function calls -- unique to GeneHunter -- are similar
to the neural network function calls in NeuroWindows, Ward System Group's neural net DLL
University of Michigan mathematician/psychologist
John Holland and his colleagues developed genetic algorithms (GA) over twenty years ago.
Their goal was to design a general optimization algorithm that would efficiently find good
solutions to multifaceted problems. Holland et al took two approaches:
An Evolution of Interest
People have known Holland's work for two decades.
But Ward Systems Group founder and President Steven Ward notes that only the last five
years have seen a large number of successful applications of this work to a range of
"A genetic algorithm is an encoding of the
variables of an optimization problem," says Ward. "The essential genetic
algorithm maintains a set of individuals. Think of the set of variables in an individual
as a potential solution to a problem. The evolutionary parallel is that a population of a
certain species will, over many generations, adapt in an increasingly better manner to its
Ward draws what looks like a
dog's paw on a whiteboard. (It turns out to be an illustration of a webbed foot.) "An
example might be an animal that lives mainly in a swampy area. It may develop webbed feet
to better survive in its environment," Ward says. "The offspring of two slightly
web-footed animals are likely to carry the genetic traits of their parents such as
increased webbing between the toes. The individual animal's chromosomes carry these
solves an optimization problem by creating a 'population' of individual solutions to the
problem. The individuals in this population carry chromosomes -- or variables -- which are
the values of the problem. Chromosomes consist of units (hidden from the user) called
"Several types of
'chromosomes' are in GeneHunter -- continuous, integer, and enumerated. A continuous
chromosome consists of either 8, 16, or 32 genes, implemented in the computer as binary
bits. The three sizes of chromosomes are chromosome precision. GeneHunter provides two
types of enumerated chromosomes: repeating gene and unique genes. A continuous chromosome
has two distinct values of a gene, 0 and 1, called alleles.
"So let's say we set up a
population of 100 individuals," Ward continues. "Each one is a potential
solution to a problem. With GeneHunter, you attempt to measure the fitness of an
individual in order to decide if it will survive or not. The measure of how well an
individual solves a problem is called a fitness function."
Regardless of the
problem-type, the genetic algorithm searches for the fittest individuals, and ranks them
by fitness. It "marries" the fittest individuals, producing offspring. A
crossover operator carries out this operation. The genetic algorithm then evaluates the
next generation of the population, again using the fitness function as a criterion,
eliminating unfit individuals while fit individuals marry and produce offspring. The best
individuals in the set undergo crossover and mutation to produce new individuals that in
turn are presumably better solutions to the problem.
"As more generations are
born," Ward explains, "solutions to the problem continue to get better because
the fitness keeps increasing, since marrying means merging the best characteristics of
both individuals. To keep 'inbreeding' down, we program the occurrence of a 'mutation'
every so often. This mirrors nature. Mutated offspring occasionally have some slight
abnormalities. Usually these mutations are disabling and inhibit the ability of the
children to survive, but once in a while they improve the fitness of the individual --
like toes stuck together in a weblike fashion."
After GeneHunter mates fit
individuals and mutates some, the population undergoes a generation change. At this point,
the population consists of offspring plus a few of the older individuals which GeneHunter
allows to survive to the next "generation" because they are the fittest in the
population. These fittest individuals are called elite individuals.
After dozens or even hundreds
of "generations", a population eventually emerges whose individuals solve the
target problem very well. In fact, the fittest (elite) individual will be an optimum or
close to optimum solution.
The processes of selection,
crossover, and mutation are called genetic operators. GeneHunter has several other genetic
operators. One, diversity, allows additional but very slight mutations of the population.
Another, extinction, is like a catastrophe that destroys most of the population (all but
the elite individuals) and refreshes it with a large set of new individuals. GeneHunter
sometimes causes extinction when so much inbreeding takes place that no offspring are
fitter than their parents for many generations. (Extinction is only available in the DLL
form of GeneHunter, not in the Excel interface.)
To work with GeneHunter, an
Excel add-in tool, you enter the relevant data into an Excel spreadsheet and enter a
description of the problem into a GeneHunter dialog box opened from the Excel Tools Menu.
Here's an example of GeneHunter
at work. Suppose you have to buy supplies for an overnight camping trip for 20 people.
Your goal is to minimize food cost while meeting or exceeding recommended dietary
standards. Changing the number of servings of one food group affects the number of
servings from the other food groups, and cost per serving varies substantially from one
food group to the next.
This example is analogous to
business problems like
Minimizing the cost of an
advertising budget while reaching a targeted number of people in different demographic
Maximizing profit for a company
operating within a budget limit by adjusting the amount spent in different departments
Generating work schedules
Figure 2 shows an example of
camping trip data in Excel. Column B holds the six essential food groups. Column C
contains the recommended percentage of total diet for each food group. Column D computes
each food group's actual percentage of the total diet. The entire data set of combinations
of the number of servings for each food group is the population. In this example,
GeneHunter automatically changes the number of servings of food from each food group in
Column E (the chromosomes) in order to minimize the budget constraint in cell G15, the
Fitness Function. At the same time, GeneHunter maximizes diet quality in cell G16.
The Fitness Function cell must
contain a formula which decides how well GeneHunter is performing. (GeneHunter allows the
user to select whether the program will produce a maximum or minimum value in the Fitness
Function Cell, or produce a value that is as close as possible to the value entered in an
edit box on the GeneHunter screen.)
Fitness Function Cell G15 in
Figure 2 calculates the total cost of the camping food budget by summing up the total cost
of servings in each food group in column G. The total cost of servings for each food group
is calculated by multiplying the number of servings in column E by the cost per serving in
column F, and multiplying that value by 20 (the total number of people in the group).
Figure 2 Camping trip in Excel.
Although only one Fitness
Function Cell is allowed, GeneHunter allows the user to work with sub-goals when finding an
optimum solution, in this case, diet quality. Diet quality is a constraint which is set
before running the program (see Figure 3). Constraints allow the user to limit how long
GeneHunter will search for a solution. For example, in this case, constraints for $E8 to
$E13 have been set to <=20 (the number of servings per person).
3 Diet quality is a constraint which is set up before running the program.
Once the parameters are in
Excel, the user presses the Start button and GeneHunter begins its survival of the fittest
calculations -- determining, in this case, the optimum number of servings which will
minimize the budget while increasing diet quality.
A World of
A fitness function doesn't have
to be a mathematical formula. It can be a neural net or the application of logical rules.
Ward feels that this seemingly limitless variety is "one of the most exciting things
about genetic algorithms. You can use a genetic algorithm, for example, to generate the
best rules to decide when to buy or sell stock." (An example of this functionality is
one of GeneHunter's sample applications.)
"The concept of being able
to generate rules to find an optimum way to do something has huge potential. There may be
a time when genetic algorithm packages can program computers. You can build strings of
code with a genetic algorithm.
"When we first introduced
NeuroShell 1, we could only think of four or five applications for it," Ward
concludes, "but our users have since applied NeuroShell to thousands of creative uses
that we never would have thought of. The whole industry grew from people who learned the
technique and applied it to their own domain of expertise. We believe the same thing will
happen with genetic algorithms. People will see what GAs can do and light bulbs will come
on in their minds."
Contributing Editor Lisa Lewinson
is President of Northstar Consulting Services, Inc. which specializes in the development
of corporate data mining and expert systems applications.