The Learning module for the Beginner's System lets you set network training parameters and train the network.
The Beginner's System uses a three-layer Backpropagation network, a universal architecture that generalizes well on a wide variety of problems. The Beginner's System presets network parameters such as learning rate, momentum, and number of hidden nodes.
1. Select a problem type by clicking on the appropriate button. Selecting a problem type selects default settings for learning rate and momentum, and sets a Pattern Selection method of rotation (in sequence) or random.
Simple -- Use for simple "toy" problems or when you want the network to learn quickly for testing purposes.
Complex -- Use for real world problems.
Complex and very noisy -- Use for real world problems with very noisy data.
Note: These are guidelines only and require some subjective interpretation on your part. In the final analysis, you should use the set of learning rate and momentum factors that work best for your problem.
2. Set Number of Hidden Neurons to Default. You can either type in the number of hidden neurons in the edit box or have NeuroShell 2 compute the number by clicking on the button.
Selecting the number of hidden neurons to use for a particular problem is part of the art of developing a neural network.
Generally speaking, too few hidden neurons will result in not enough distinctive characteristics of the problem being captured in the neural network. Learning may not go below a specified learning threshold, incorrect classifications may occur, or both. Defining too many hidden neurons may not result in incorrect classifications, but will cause much longer learning, thereby increasing the time it takes to develop the model.
At a certain point, increasing the number of hidden neurons does not greatly increase the capability of the neural network to classify, but does increase the learning time. There are occasions, however, when increasing the number of hidden neurons slightly will actually decrease learning time. It is also possible to use so many hidden neurons that the training set is "memorized" rather than "learned," in which case generalization on new cases may be poor.
Again, there is no set formula for the number of hidden neurons to use. Only experimentation will lead you to the correct number. In general, it is better to err on the side of too many hidden neurons.
When using NeuroShell 2 as one might use regression analysis, more hidden neurons give "tighter" data fits.
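The hidden-layer trade-off described above can be illustrated with a generic backpropagation sketch. This is not NeuroShell 2's code; the network, data, and hyperparameters here are illustrative. A network with a single hidden neuron cannot capture the XOR problem no matter how long it trains, while a larger hidden layer learns it easily:

```python
import numpy as np

def train_mlp(X, y, n_hidden, epochs=5000, lr=0.5, momentum=0.9, seed=0):
    """Train a tiny three-layer backprop network (sigmoid activations,
    gradient descent with momentum); return the final training-set MSE.
    Illustrative only -- not NeuroShell 2's implementation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1] + 1, n_hidden))
    W2 = rng.normal(scale=0.5, size=(n_hidden + 1, 1))
    v1, v2 = np.zeros_like(W1), np.zeros_like(W2)
    Xb = np.hstack([X, np.ones((len(X), 1))])        # append bias input
    for _ in range(epochs):
        h = 1 / (1 + np.exp(-Xb @ W1))               # hidden activations
        hb = np.hstack([h, np.ones((len(h), 1))])    # append bias unit
        out = 1 / (1 + np.exp(-hb @ W2))             # network output
        err = out - y
        d_out = err * out * (1 - out)                # backpropagate
        d_h = (d_out @ W2.T)[:, :-1] * h * (1 - h)   # drop bias column
        v2 = momentum * v2 - lr * (hb.T @ d_out)     # momentum update
        v1 = momentum * v1 - lr * (Xb.T @ d_h)
        W2 += v2
        W1 += v1
    return float(np.mean(err ** 2))

# XOR: one hidden neuron is too few; eight is plenty
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)
for n in (1, 8):
    print(n, "hidden:", round(train_mlp(X, y, n), 4))
```

Rerunning with different hidden-layer sizes like this is one practical way to experiment, as suggested above.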
3. Decide when to save the network. You can choose to automatically save the network whenever it reaches a new minimum average error for either the training or test set, or not to automatically save the network. Choose test set when using Calibration.
4. View statistics on how training is progressing. You can observe statistics for both the training and test sets while the network is learning to determine if the network is continuing to make progress.
Learning epochs: This is the number of times the entire set of training patterns (an epoch) has been propagated through the network. If you have a large training set, it will take a while before this number is displayed.
Last average error: This is the network's most recent computation of the difference between the network's predictions and the actual values (the true outputs or classifications) for data in the training set. If there is more than one output, the error is averaged over all of the output values. Error refers to the mean squared error, a standard statistical measure of closeness of fit.
The network computes the mean (average) squared error between the actual and predicted values for all outputs over all patterns. The network first computes the squared error for each output in a pattern and averages those squared errors to get a per-pattern figure. It then averages the per-pattern figures over all patterns in the training set.
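The two-step averaging described above can be sketched in a few lines. This is a generic illustration of the arithmetic, not NeuroShell 2's code; NeuroShell 2 computes its internal figure on scaled output activations:

```python
import numpy as np

def average_error(actual, predicted):
    """Mean squared error: square the error for each output, average
    over the outputs within a pattern, then average over all patterns."""
    sq = (np.asarray(actual) - np.asarray(predicted)) ** 2
    per_pattern = sq.mean(axis=1)      # mean over outputs, one value per pattern
    return float(per_pattern.mean())   # mean over all patterns

# two training patterns, two outputs each (illustrative values)
actual    = [[1.0, 0.0], [0.0, 1.0]]
predicted = [[0.8, 0.1], [0.2, 0.7]]
print(round(average_error(actual, predicted), 6))  # → 0.045
```

Here the first pattern contributes (0.04 + 0.01) / 2 = 0.025 and the second (0.04 + 0.09) / 2 = 0.065, so the average error is 0.045.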
Note: The main training statistic is the internal average "error factor." You cannot calculate or reproduce this average error factor exactly yourself on the training set, nor would it be useful for you to do so. It is the mean, over all patterns, of the squared error of all outputs computed within NeuroShell 2's internal interval, which is either 0 to 1 or -1 to 1, depending upon the output activation function. The number is not useful for its value itself; it is useful during training for seeing whether the network is improving, because it gets lower as the network improves. As the network learns the training set better, the average error for the training set gets lower, and eventually makes very slow downward progress. The test set error also decreases at first, but usually starts increasing after a while: the network is overfitting, continuing to get better on the training set while getting worse on the test set. Calibration saves the network when the average test set error is at its lowest point.
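The save-at-minimum rule that Calibration applies can be sketched as follows. The hook name and weight labels here are hypothetical; NeuroShell 2 performs this bookkeeping internally:

```python
# Keep whichever network has the lowest test-set error seen so far,
# even if later training makes the test-set error rise again.
best_error = float("inf")
best_weights = None

def on_calibration(weights, test_error):
    """Hypothetical hook called after each calibration interval."""
    global best_error, best_weights
    if test_error < best_error:
        best_error = test_error
        best_weights = weights   # this is the network that would be saved

# simulated test-set errors: falling at first, then rising as overfitting begins
for epoch, err in enumerate([0.30, 0.21, 0.15, 0.12, 0.14, 0.19]):
    on_calibration(f"weights@epoch{epoch}", err)

print(best_error)    # → 0.12 (the minimum, not the final error)
print(best_weights)  # → weights@epoch3
```

Even though training continues past epoch 3, the saved network is the one from the test-set error minimum.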
Minimum average error: This displays the lowest value for average error that the network achieved during training for data in the training set.
Epochs since min: This displays the number of epochs that have elapsed since the minimum average error was calculated.
Calibration interval: This number specifies a number of events or training set patterns that are propagated through the network before the average error for the test set is computed.
Last average error: This is the network's most recent computation of the difference between the network's predictions and the actual values for data in the test set. If there is more than one output, the error is summed over all of the output values in each pattern and then averaged over all patterns. Refer to last average error in the Training Set for details.
Minimum average error: This displays the lowest value for average error that the network achieved during training for data in the test set.
Events since min: This displays the number of events that have elapsed since the minimum average error was saved.
FILE NOTE: The procedures in this module will produce two types of files: configuration files (.FIG), which save the network architecture and learning parameters, and network files (.N01), which save the trained network. When you apply this network to a file, the .N01 file is read back in for processing. This module defaults to training on a .TRN file, if it exists, or the .PAT file if there is no .TRN file.