What is Calibration
When you train a neural network, it continually refines its model to fit the training set better and better. It is possible for this learning to stall and make no further progress, especially if your learning rate and momentum are too high or you have used too few hidden neurons. Generally, though, training-set error keeps improving indefinitely, until eventually the forward progress becomes too slow to be practical.
However, a mistake that has been made for many years, and continues to be made by neural network practitioners, is this quest for "convergence". Most people overlearn their problems. To see how this happens, consider that there are two types of problems:
Problem Type 1
The training set consists of all possible sample cases that will be encountered.
Problem Type 2
The number of sample cases that can be encountered is infinite, or at least very large, so the training set is only representative of this huge number of sample cases.
In Problem Type 1 it is perfectly acceptable, in fact desirable, to let your network train until it fails to make any more perceptible progress.
However, in Problem Type 2, the correct course of action is to learn the training set ONLY until the point where the network gives the best answers overall on that huge number of sample cases that you did NOT train with. But how do you know when that is?
Before we answer that question, let's look at what happens when you learn a Type 2 problem. Since the training set is only a small subset of the possible input patterns, the network tries to build a model that correctly interpolates between patterns close to those it is trained on. This is called the network's ability to generalize. However, if you overtrain the model (or use too many hidden neurons), it will "memorize" the patterns instead of learning to interpolate smoothly between them. Since certain kinds of data, e.g., financial data, are inherently noisy, an overtrained network will also tend to learn the noise.
One caveat: if you train with a huge amount of data in your training set, there may be so much conflicting data that the network cannot possibly memorize it or learn the noise, especially if you haven't given it too many hidden neurons. In such a case, the network MAY be very good no matter how long you let it learn.
So how do we know when it has trained enough, or, as we stated above, when it has reached the point where it gives the best results on that huge universe of patterns on which it is NOT being trained? The best we can do is create a set of patterns that represents this universe outside the training set, which we will call the test set. You may use any means you feel is appropriate to create a test set that adequately represents this outside universe. It may even contain a few of the patterns in the training set, but usually the test set is not a subset of the training set. A good way to create the test set is to remove a random extraction of about 10% to 40% of the patterns in your training set before you train (see Size of Test Set below).
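The random extraction described above can be sketched in a few lines. This is an illustration only, not NeuroShell 2's Test Set Extract module; the function name and parameters are our own.

```python
import random

def extract_test_set(patterns, fraction=0.2, seed=42):
    """Randomly withhold `fraction` of the patterns as a test set.

    Returns (training_set, test_set). A fixed seed makes the split
    repeatable; `fraction` would typically be 0.10 to 0.40.
    """
    rng = random.Random(seed)
    shuffled = patterns[:]          # copy, so the original order is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Example: withhold 20% of 100 patterns as the test set.
train, test = extract_test_set(list(range(100)), fraction=0.2)
```

Because the extraction is random rather than, say, taking the last 20% of the file, the test set is more likely to represent the same universe of patterns as the training set.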
Calibration trains on the training set and computes an average error factor for it during training. Every so often, at an interval you specify (see Set Calibration Test Interval below), it reads in the test set and computes an average error for it. What usually happens is that the error for the training set continues to get smaller indefinitely, or at least reaches a point where it is fairly flat. The error for the test set continues to get smaller to a point (the optimal point) and then, for many problems, it slowly begins to get larger! Unfortunately it grows slowly and bounces around a lot, so unless you use Calibration, you may not know this is happening.
If you have specified that the network be saved on the best test set, Calibration saves the network at this optimal point. Furthermore, it shows you how many events have passed since this optimal (lowest) error occurred. We suggest that you specify that training should stop when the number of events since the minimum test set error reaches 20,000 to 40,000 events (much more for recurrent networks). You can also watch the Test Set Average Error Graph until you are sure the error is no longer decreasing. If you don't like the results, there is no point in continuing to train; you are better off restarting with different variables, a different number of hidden neurons, etc.
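The "save on best test set, stop after so many events" logic amounts to early stopping with a patience window. The sketch below shows the idea under our own illustrative names; it is not NeuroShell 2's actual implementation.

```python
class CalibrationMonitor:
    """Track the test-set error during training, remember the best point,
    and report how many training events have passed since it occurred.
    All names here are illustrative, not NeuroShell 2's API.
    """

    def __init__(self, patience_events=20_000):
        self.patience_events = patience_events
        self.best_error = float("inf")
        self.best_event = 0

    def update(self, event_count, test_error):
        """Record one test-set evaluation; return True when training should stop."""
        if test_error < self.best_error:
            self.best_error = test_error
            self.best_event = event_count
            # A real trainer would save the network weights at this point
            # ("save the network on the best test set").
        return event_count - self.best_event >= self.patience_events

# Example: test the network every 200 events with a small patience window.
monitor = CalibrationMonitor(patience_events=600)
test_errors = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.39]
stopped_at = None
for i, err in enumerate(test_errors, start=1):
    if monitor.update(event_count=i * 200, test_error=err):
        stopped_at = i * 200
        break
```

Note that the saved network is the one from the optimal point (here, the lowest error seen), not the one that exists when training finally stops.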
Usually the training set error gets lower than the test set error, but on occasion the reverse is true. When this happens you have built a very good network for your test set, but check to make sure your test set is representative. If it is, don't worry about it.
Now you can see that no average error level should be viewed as a target that must be attained before the net can be used. Further, since Calibration limits overlearning and prevents memorization, the number of hidden neurons you use is not as critical, as long as there are enough. When using Calibration, it may be better to err on the side of more hidden neurons rather than fewer.
Set Calibration Test Interval
This number defines the number of training patterns the network processes before NeuroShell 2 temporarily stops training and computes the error factor for the test set. The default setting for the test set error computation interval is 200. The usual range is between 50 and 500, although there are times when you may want to set it higher.
Note: If you are using TurboProp for weight updates, there is no point in setting the Calibration interval less than the epoch size (the number of patterns in the training set). TurboProp is a batch update technique and the weights are only updated every epoch.
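Since a batch-update trainer such as TurboProp only changes the weights once per epoch, the note above amounts to clamping the test interval to at least the epoch size. A minimal sketch of that rule, with illustrative names of our own:

```python
def effective_interval(requested_interval, epoch_size, batch_update):
    """Return the Calibration test interval that will actually be useful.

    For batch-update trainers (e.g. TurboProp), the weights change only
    once per epoch, so testing more often than that wastes time.
    """
    if batch_update:
        return max(requested_interval, epoch_size)
    return requested_interval
```

For example, with the default interval of 200 and a 500-pattern training set, a batch trainer gains nothing from testing more often than every 500 events.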
Size of Test Set
The test set should be approximately 10 to 40 percent the size of the training set of data. You may use the Test Set Extract module to draw out a test set from your training set data.
When to Stop Training
When using Calibration with a Backpropagation network, it is recommended that you train until the number of events since the minimum error factor exceeds 20,000 to 40,000 (higher for recurrent nets). Don't set any other stopping criteria.
When to Save the Network
When using Calibration with a Backpropagation network, save the network on the best test set.