This module lets you choose which variables will be used as network inputs and outputs, and either specify or compute the minimum and maximum value for each variable. (In the NeuroShell 2 internal file format, variables are columns and patterns are rows.) The module creates an .MMX file, which is required both to train a network and to process a file through the trained network.
1. Define Inputs/Outputs
The module enables you to choose which variables are network inputs and which are actual outputs. Click on the arrow in the variable type list box to display the available column designations. Click on the variable type you wish to use from the following list:
I Input variable - These are the variables used to make the prediction or classification (the independent variables).
A Actual output - The results the network is trying to learn to predict (the dependent variables). If you are classifying data, there will be an output for each possible category. Designate columns with an A only when using a supervised network type (Backpropagation, PNN, GRNN, and GMDH).
(Blank) Unused - The values contained in any column that is left blank will not be used to build the network.
Once you have selected a variable type, click on the cell in the Variable Type row underneath the column that you want to designate. If you want to designate more than one column as the same variable type, drag the mouse to blacken (select) the corresponding cells in the Variable Type row. Click on the row name Variable Type to blacken the entire row.
If you do not designate any column with an I or A, the network will use the whole file. The network then determines the number of inputs and outputs from the number of neurons in the input and output slabs that you define in the Design module. For example, if you have a file of 50 variables and you define the input slab size as 40 and the output slab size as 10, the network assumes the first 40 variables are inputs and the next 10 variables are outputs. For this to work, the number of columns in the file must exactly equal the number of inputs plus outputs designated in the slabs.
If there are I's and A's designated, but they do not match the number of inputs and outputs in the slabs, then training will use the I's and A's that were defined rather than the numbers of neurons in the slabs.
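The precedence rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this section, not NeuroShell 2 code. The function name resolve_columns and its parameters are hypothetical.

```python
def resolve_columns(types, n_in_slab, n_out_slab):
    """Return (input_columns, output_columns) as lists of column indexes.

    types: one entry per column of the pattern file: "I", "A", or "" (unused).
    n_in_slab, n_out_slab: neuron counts of the input and output slabs.
    All names here are hypothetical and used for illustration only.
    """
    inputs = [i for i, t in enumerate(types) if t == "I"]
    outputs = [i for i, t in enumerate(types) if t == "A"]
    if inputs or outputs:
        # Explicit I/A designations take precedence over the slab sizes.
        return inputs, outputs
    # Nothing designated: the whole file is used and split by slab sizes.
    if len(types) != n_in_slab + n_out_slab:
        raise ValueError("column count must equal input slab + output slab size")
    first_output = n_in_slab
    return list(range(first_output)), list(range(first_output, len(types)))


# The 50-variable example: no designations, slabs of 40 and 10 neurons.
ins, outs = resolve_columns([""] * 50, 40, 10)
print(len(ins), len(outs))
```

With four columns typed I, I, blank, A, the same function returns the two I columns as inputs and the single A column as the output, whatever slab sizes are passed.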
2. Set Minimums/Maximums
Since neural networks require variables to be scaled into the range 0 to 1 or -1 to 1, the network needs to know each variable's real value range. Use this module to enter the minimum and maximum value for each variable that is to be included in the network, or to compute the range automatically from your data by selecting options from the Settings Menu.
In general, use a range that is tight around your data. (You may want to specify a minimum below and a maximum above the values in the data file to allow a wider range for future predictions, or you may want to specify a range narrower than the existing values to exclude outliers that may affect the network's precision. Refer to Market Predictions for details.) If the minimum and maximum values are not set tightly around your data, the data will occupy only a small part of the scaled range and the network will lose some of its ability to spot differences in the data.
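To see why a tight range matters, consider a simple linear scaling of a value into a target interval. This sketch assumes plain linear (min-max) scaling. The scaling functions NeuroShell 2 actually applies are not described in this section, and the function name scale is hypothetical.

```python
def scale(value, vmin, vmax, lo=0.0, hi=1.0):
    """Map value linearly from [vmin, vmax] into [lo, hi].

    Assumes plain linear scaling for illustration; values outside
    [vmin, vmax] map outside [lo, hi] (no clipping is applied here).
    """
    if vmax <= vmin:
        raise ValueError("maximum must be greater than minimum")
    return lo + (value - vmin) * (hi - lo) / (vmax - vmin)


# Two data values, scaled with a tight range and with a loose range.
tight = (scale(10, 0, 20), scale(12, 0, 20))
loose = (scale(10, 0, 2000), scale(12, 0, 2000))
print(tight, loose)
```

With the tight range of 0 to 20, the values 10 and 12 scale to about 0.5 and 0.6. With the loose range of 0 to 2000, they scale to about 0.005 and 0.006, so the difference the network sees is one hundred times smaller.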
Note: If you change the network's minimum and maximum values, you must retrain the network.
Use the File Menu to select a different pattern file or to view the current pattern file.
Use the Edit Menu to cut, copy, or paste the values that are displayed in the Define Inputs/Outputs datagrid.
Use the Settings Menu to automatically set the minimum and maximum values for the variables. Selecting the Set Minimums or Set Maximums option allows you to enter the appropriate value in the edit box that is displayed and set all selected cells at once to this value. Selecting the Compute mins/maxes option automatically calculates the minimum and maximum values as well as the mean and standard deviation.
Selecting the Compute min/maxes using Std Dev option will compute minimum and maximum values that are a specified number (N) of standard deviations away from the mean value for the variable. (The standard deviation is the square root of the variance, which is the average squared deviation from the mean. It is a statistical measure of the spread of the data.) A dialog box will be displayed, and you should type in the edit box the number of standard deviations, between 1 and 5, that you want to use to compute the minimum and maximum values. The minimum value will be set to the mean - N * Standard Deviation. The maximum value will be set to the mean + N * Standard Deviation. You can use decimals, such as 1.75, for N.
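The calculation can be reproduced outside the program to check the values it sets. The sketch below follows the definition given above (variance as the average squared deviation from the mean), so it uses the population standard deviation. Whether NeuroShell 2 divides by the number of patterns or by one less is an assumption here, and the function name min_max_from_std is hypothetical.

```python
import statistics


def min_max_from_std(values, n):
    """Return (minimum, maximum) set n standard deviations around the mean.

    Uses the population standard deviation, matching the definition
    "average squared deviation from the mean" (an assumption about the
    program's exact formula).
    """
    if not 1 <= n <= 5:
        raise ValueError("N must be between 1 and 5")
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)
    return mean - n * std_dev, mean + n * std_dev


# Made-up column of data with mean 5 and standard deviation 2.
column = [2, 4, 4, 4, 5, 5, 7, 9]
print(min_max_from_std(column, 1.5))
```

For this made-up column and N = 1.5, the minimum is 5 - 1.5 * 2 = 2 and the maximum is 5 + 1.5 * 2 = 8, so the largest value in the data (9) falls outside the computed range.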
Selecting the Increase Min/Max Range option will display a dialog box that asks you to type in a percentage by which to increase the minimum and maximum values. The minimum and maximum values must be entered in the Define Inputs/Outputs datagrid before you may use this option. Note: It is sometimes advisable to add 5 or 10 percent to the maximum values of the outputs and subtract 5 or 10 percent from their minimum values. This gives the net a little freedom to expand outputs if it needs to, and may also improve accuracy on extreme values. It is an attempt to correct the problem that sometimes occurs when extreme values in the data are slightly off.
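The arithmetic behind the 5 or 10 percent advice can be sketched as follows. This section does not say whether the program takes the percentage of each value or of the width of the range. The sketch assumes it is taken of each value's magnitude, and the function name widen_range is hypothetical.

```python
def widen_range(vmin, vmax, percent):
    """Lower vmin and raise vmax by the given percentage.

    Assumption: the percentage is applied to the magnitude of each
    value, not to the width of the range. The program may differ.
    """
    if percent < 0:
        raise ValueError("percent must not be negative")
    factor = percent / 100.0
    return vmin - abs(vmin) * factor, vmax + abs(vmax) * factor


# An output whose data run from 20 to 200, widened by 10 percent.
print(widen_range(20, 200, 10))
```

Under this assumption, a range of 20 to 200 widened by 10 percent becomes 18 to 220. Using the magnitude of each value keeps the range growing outward when the minimum is negative: a minimum of -50 becomes -55.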
If you blacken (select) a column or a range of columns, the minimum and maximum values will only be changed in those columns. If you wish to change the values in all of the columns, blacken any one of the following cells: Variable Name, Variable Type, Min, Max, Mean, or Standard Deviation.
File Note: Once the inputs and outputs have been defined, along with minimum and maximum values for each variable, this data is stored in an .MMX file which is created when you exit this module. This file is used both for network training and processing.
If you change the .PAT file after creating the .MMX file in the Define Inputs and Outputs module, the next time you open the problem and select the Define module you will receive the following message: "MMX file may not be up to date. You should probably recreate since mins, maxs, etc. may be in the wrong columns. Load anyway? Yes No" Use your judgment to decide whether the .MMX file needs to be recreated. For example, if you make a simple change in the .PAT file, such as changing one value, you do not need to change the .MMX file. If you change the number of network inputs, however, you will need to recreate the .MMX file, and you should answer No to the question. Otherwise, the .MMX file will not match the .PAT file and errors could result.