This section details the changes between NeuroShell 2 Release 1.5 and Release 2.0.
Genetic Adaptive Training for GRNN and PNN Networks
Ward Systems Group was a pioneer in building neural networks using genetic algorithms (GA). However, we have not previously released any neural networks using a GA, since the traditional methods either were extremely slow or did not enhance the ability of a network to generalize (work well on new data it has not been trained on). We have now developed a GA-based network algorithm that uses the GA directly with Calibration to improve the network's generalization. It is an outgrowth of Specht's work on adaptive GRNN and adaptive PNN networks. It uses a GA to find an appropriate individual smoothing factor for each input as well as an overall smoothing factor. (The input smoothing factor is an adjustment used to modify the overall smoothing factor to provide a new value for each input.) The individual smoothing factors may be used as a sensitivity analysis tool: the larger the factor for a given input, the more important that input is to the model, at least as far as the test set is concerned. Inputs with low smoothing factors are candidates for removal in a later trial, especially if the smoothing factor is set to zero.
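The role of the individual smoothing factors can be illustrated with a minimal sketch of a GRNN prediction. This is not the NeuroShell 2 implementation: the function name grnn_predict and the exact way a factor enters the distance calculation are assumptions made for illustration. Here each input's factor simply scales that input's contribution to the distance, so a factor of zero removes the input from the model.

```python
import math

def grnn_predict(x, train_inputs, train_outputs, overall_sigma, input_factors):
    """Predict one output value with a GRNN (kernel-weighted average).

    Assumption: each input's factor multiplies that input's difference
    before squaring, so larger factors make an input count for more and
    a factor of 0.0 makes the input irrelevant.
    """
    numerator = 0.0
    denominator = 0.0
    for pattern, target in zip(train_inputs, train_outputs):
        # Factor-weighted squared distance to one stored training pattern.
        dist_sq = sum((f * (xi - pi)) ** 2
                      for f, xi, pi in zip(input_factors, x, pattern))
        weight = math.exp(-dist_sq / (2.0 * overall_sigma ** 2))
        numerator += weight * target
        denominator += weight
    # Guard against every kernel weight underflowing to zero.
    return numerator / denominator if denominator > 0.0 else 0.0
```

With input_factors set to [1.0, 0.0], changing the second input has no effect on the prediction, which is why an input whose factor falls to zero is a candidate for removal.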
To use genetic adaptive PNN or genetic adaptive GRNN, select the PNN or GRNN architecture in the design module as usual. Then, in the Training and Stop Training Criteria module (the stoplight), you will be given the option of using genetic adaptive Calibration instead of the iterative method previously available. Remember, you must have a test set to use either type of Calibration, and genetic adaptive only works in the Calibration mode. The genetic adaptive methods will produce networks which work much better on the test set, but they take considerably longer to train. As with Calibration on all of our architectures and paradigms, the key to success is in using a representative test set. If you do not, you will get better answers on the test set, but generalization on future production sets could even be poorer if the future data is unlike anything in the test set. On the other hand, you should not use too large a test set (or too large a training set, for that matter) or the Calibration process will take entirely too long.
Genetic algorithms use a "fitness" measure to determine which of the individuals in the population survive and reproduce. Thus, survival of the fittest causes good solutions to evolve. The fitness for GRNN is the mean squared error of the outputs over the entire test set, just as it is with iterative Calibration. For PNN, the fitness is the number of incorrect answers, which we are obviously trying to minimize. The number of incorrect answers may not be a whole number, however. In some cases we add a fraction, e.g., 13.9, to show that the net almost gets 14 wrong. The fraction is an internal number that allows the genetic algorithm to distinguish between two nets that get the same number of wrong answers. In both GRNN and PNN, the fitness is an error measure, so the GA seeks to minimize it.
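The two fitness measures can be sketched as follows. The GRNN measure is the ordinary mean squared error. The fractional part of the PNN measure is internal to NeuroShell 2 and is not documented here, so the tie-breaker below (the average share of the class scores that went to the wrong classes) is purely a hypothetical stand-in, as are the function names.

```python
def grnn_fitness(predictions, targets):
    """Mean squared error over the test set (lower is better)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

def pnn_fitness(class_scores, true_classes):
    """Number of wrong answers plus a fraction below 1 (lower is better).

    Assumption: the fraction is the mean share of each pattern's class
    scores that did NOT go to the true class. The real NeuroShell 2
    fraction may be computed differently.
    """
    wrong = 0
    doubt = 0.0
    for scores, true_class in zip(class_scores, true_classes):
        predicted = max(range(len(scores)), key=lambda k: scores[k])
        if predicted != true_class:
            wrong += 1
        total = sum(scores)
        share = scores[true_class] / total if total > 0 else 0.0
        doubt += 1.0 - share
    # Cap the fraction so it can never add up to a whole wrong answer.
    fraction = min(doubt / len(true_classes), 0.999)
    return wrong + fraction
```

Two nets that each get one pattern wrong receive different fitness values under this scheme, so the GA can still prefer the one that is closer to getting everything right.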
In the GRNN Genetic Adaptive Learning and PNN Genetic Adaptive Learning modules, you will be given some new options related to genetic adaptive learning:
Genetic Breeding Pool Size: A GA works by selective breeding of a population of "individuals", each of which is a potential solution to the problem. In this case, a potential solution is a set of smoothing factors, and we are seeking to breed an individual that minimizes the error on the test set: the mean squared error for GRNN, or the number of incorrect answers for PNN. (Remember that the training set was used to train the network, so we are not "cheating" by training on the test set.) The larger the breeding pool size, the greater the potential of it producing a better individual. However, the networks produced by every individual must be applied to the test set on every reproductive cycle, so larger breeding pools take longer.
Auto Termination (GRNN only): If this option is turned off, the Calibration process will continue until you decide to stop it. When it is turned on, the learning module will automatically stop the process when 20 successive reproductions (generations) of the whole population have failed to produce an individual that improves the mean squared error by at least 1%. Without this option, the GA could continue for quite some time finding solutions that are only marginally better. However, if the problem is particularly stubborn, you may just want to let it run for a while with the option turned off.
The Auto Termination option is not available with genetic adaptive PNN, which simply terminates when there has been no improvement at all in 20 generations.
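The options above can be tied together in a minimal GA sketch. It is not the NeuroShell 2 algorithm: the selection, crossover, and mutation steps, the function name, and every parameter name are assumptions chosen for brevity. Only the breeding pool size and the "20 generations without a 1% improvement" stopping rule come from the descriptions above, and max_generations merely stands in for the user stopping the run by hand.

```python
import random

def evolve_smoothing_factors(fitness, n_factors, pool_size=20, stall_limit=20,
                             min_gain=0.01, max_generations=500, seed=0):
    """Breed sets of smoothing factors; lower fitness is better.

    Requires pool_size >= 2. Stops after stall_limit successive
    generations without an improvement of at least min_gain (1%).
    """
    rng = random.Random(seed)
    pool = [[rng.uniform(0.0, 3.0) for _ in range(n_factors)]
            for _ in range(pool_size)]
    reference = min(fitness(ind) for ind in pool)
    stalled = 0
    generations = 0
    while generations < max_generations and stalled < stall_limit:
        # Every individual is scored on the test set each generation,
        # which is why larger pools take longer.
        ranked = sorted(pool, key=fitness)
        parents = ranked[:max(2, pool_size // 2)]
        children = []
        while len(children) < pool_size - 1:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover, then mutate one factor (never below 0).
            child = [rng.choice(pair) for pair in zip(mother, father)]
            gene = rng.randrange(n_factors)
            child[gene] = max(0.0, child[gene] + rng.gauss(0.0, 0.1))
            children.append(child)
        pool = [ranked[0]] + children  # keep the best individual unchanged
        generations += 1
        best_now = min(fitness(ind) for ind in pool)
        if best_now < reference * (1.0 - min_gain):
            reference = best_now
            stalled = 0
        else:
            stalled += 1
    best = min(pool, key=fitness)
    return best, fitness(best), generations
```

In practice the fitness argument would apply a network built from the candidate factors to the test set. Setting min_gain to 0.0 approximates the PNN behavior of stopping only when there has been no improvement at all.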
If you want to resume training a network after stopping (even if it was stopped by Calibration because of lack of progress), you will be given that option by dialog box, as with backpropagation. You may also want to start over (usually after selecting a new random seed) but continue with the smoothing factors already found in the last training session. You will be given this option by dialog box as well when you start training a second time.
For details, refer to the following modules:
GRNN Training Criteria
GRNN Genetic Adaptive Learning
PNN Training Criteria
PNN Genetic Adaptive Learning
Test Set Extract
NeuroShell 2 now allows you to automatically extract a production set of data in addition to a training set and test set. The production set may be used as an “out of sample” data set with which to test the trained network. When applying a trained network, many users have that data in a separate file, but others have asked for a method of automatically extracting a production set. For example, you may want to predict the stock market and have your data in a spreadsheet, which you use to train your network. After you have trained the network, you may want to add new data to the bottom of the spreadsheet and apply the network to get results. The extract module allows you to automatically select the new data for inclusion in a production set.
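The idea of the spreadsheet example can be sketched in a few lines. This is only an illustration, not the extraction methods the module actually offers: the function name, the first_new_row parameter, and the rule of sending every Nth known row to the test set are all assumptions.

```python
def extract_sets(rows, first_new_row, test_every=5):
    """Split rows into training, test, and production sets.

    Assumption: rows from index first_new_row onward are the new data
    added to the bottom of the spreadsheet after training, and every
    test_every-th earlier row is held out as the test set.
    """
    known = rows[:first_new_row]
    production = rows[first_new_row:]
    test = known[test_every - 1::test_every]
    training = [row for i, row in enumerate(known)
                if (i + 1) % test_every != 0]
    return training, test, production
```

For twelve rows with first_new_row set to 10, the last two rows form the production set and rows 5 and 10 of the original data form the test set.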
NeuroShell 2 Files - Spreadsheet Format
NeuroShell 2 Release 2.0 can work directly with Microsoft Excel .XLS files from Excel releases up to and including Release 4. Users with Excel Release 5.0 or higher can either save the file in Excel Release 4 Worksheet format (or an earlier format) or use the Spreadsheet File Import module. If your file has label information, you will need to tell NeuroShell 2 which row holds the labels and where the data begins.
Tutorial On Line
NeuroShell 2 now includes an on-line Tutorial which teaches you how to use the NeuroShell 2 modules.
Categories in Backpropagation Networks
When you want to use a backpropagation network to classify data into categories, a new button in the Apply Backpropagation Network module allows you to set the highest output to 1 and all others to 0.
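The effect of this button can be sketched as a simple winner-take-all step. The function name is hypothetical, and how NeuroShell 2 resolves ties is not stated here; this sketch lets the first of the tied outputs win.

```python
def winner_take_all(outputs):
    """Set the highest output to 1 and all others to 0.

    Assumption: on a tie, the first of the tied outputs wins.
    """
    winner = max(range(len(outputs)), key=lambda k: outputs[k])
    return [1 if k == winner else 0 for k in range(len(outputs))]

print(winner_take_all([0.2, 0.7, 0.1]))  # prints [0, 1, 0]
```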
The statistics computed when backpropagation, PNN, and GRNN networks are applied to a file may now be copied to the Windows clipboard for use in other applications. For example, you may want to compare the results of different neural networks. This new option allows you to copy the results to the clipboard and paste them into a spreadsheet for easy comparison.
Spreadsheet File Export
The Spreadsheet Export module now allows you to designate the specific spreadsheet columns into which data from NeuroShell 2 internal files is copied.
When you open a NeuroShell 2 problem, a list of the five problems you used most recently is displayed beneath the File Menu. Click on the problem you wish to open.
NeuroShell 2 is now configured so that you can’t close Windows unless you first close NeuroShell 2. This modification is especially useful if you are training a network in the background and NeuroShell 2 is minimized.